Top Banner
Lecture 12: Multi-view geometry / Stereo III Tuesday, Oct 23 CS 378/395T Prof. Kristen Grauman Outline Last lecture: stereo reconstruction with calibrated cameras non-geometric correspondence constraints Homogeneous coordinates, projection matrices Camera calibration Weak calibration/self-calibration – Fundamental matrix – 8-point algorithm
25

Lecture 12: Multi-view geometry / Stereo III

Feb 03, 2022

Download

Documents

dariahiddleston
Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Page 1: Lecture 12: Multi-view geometry / Stereo III

Lecture 12: Multi-view geometry / Stereo III

Tuesday, Oct 23

CS 378/395TProf. Kristen Grauman

Outline• Last lecture:

– stereo reconstruction with calibrated cameras– non-geometric correspondence constraints

• Homogeneous coordinates, projection matrices• Camera calibration• Weak calibration/self-calibration

– Fundamental matrix– 8-point algorithm

Page 2: Lecture 12: Multi-view geometry / Stereo III

Review: stereo with calibrated cameras

Camera-centered coordinate systems are related by known rotation R and translation T.

Vector p’ in second coord. sys. has coordinates Rp’in the first one.

Review: the essential matrix

( )[ ] 0=′×⋅ pRTp

[ ][ ]

0

0

=′

==′⋅

Τ pEpRTEpRTp

x

x

E is the essential matrix, which relates corresponding image points in both cameras, given the rotation and translation between their coordinate systems.

Let

Page 3: Lecture 12: Multi-view geometry / Stereo III

Review: stereo with calibrated cameras

• Image pair• Detect some features• Compute E from R and T• Match features using the

epipolar and other constraints

• Triangulate for 3d structure

Review: disparity/depth maps

image I(x,y) image I´(x´,y´)Disparity map D(x,y)

(x´,y´)=(x+D(x,y),y)

Page 4: Lecture 12: Multi-view geometry / Stereo III

Review: correspondence problem

Multiple match hypotheses satisfy epipolarconstraint, but which is correct?

Figure from Gee & Cipolla 1999

• To find matches in the image pair, assume– Most scene points visible from both views– Image regions for the matches are similar in

appearance• Dense or sparse matches• Additional (non-epipolar) constraints:

– Similarity– Uniqueness– Ordering– Figural continuity– Disparity gradient

Review: correspondence problem

Page 5: Lecture 12: Multi-view geometry / Stereo III

Review: correspondence error sources

• Low-contrast / textureless image regions• Occlusions• Camera calibration errors• Poor image resolution• Violations of brightness constancy (specular

reflections)• Large motions

Homogeneous coordinates• Extend Euclidean space: add an extra coordinate• Points are represented by equivalence classes• Why? This will allow us to write process of

perspective projection as a matrix

2d: (x, y)’ (x, y, 1)’

3d: (x, y, z)’ (x, y, z, 1)’

Mapping to homogeneous coordinates

Mapping back from homogeneous coordinates

2d: (x, y, w)’ (x/w, y/w)’

3d: (x, y, z, w)’ (x/w, y/w, z/w)’

Page 6: Lecture 12: Multi-view geometry / Stereo III

Homogeneous coordinates

•Equivalence relation:

(x, y, z, w) is the same as (kx, ky, kz, kw)

Homogeneous coordinates are only defined up to a scale

Perspective projection equations

Scene point Image coordinates

Camera frame

Image plane

Optical axis

Focal length

Page 7: Lecture 12: Multi-view geometry / Stereo III

Projection matrix for perspective projection

From pinhole camera model

Same thing, but written in terms of homogeneous coordinates

From pinhole camera model

ZfXx =

ZfYy =

Projection matrix for perspective projection

Page 8: Lecture 12: Multi-view geometry / Stereo III

Projection matrix for orthographic projection

1Yy =

1

1Xx =

• Extrinsic: location and orientation of camera frame with respect to reference frame

• Intrinsic: how to map pixel coordinates to image plane coordinates

Camera parameters

Camera 1 frame

Reference frame

Page 9: Lecture 12: Multi-view geometry / Stereo III

Rigid transformations

Combinations of rotations and translation• Translation: add values to coordinates• Rotation: matrix multiplication

Rotation about coordinate axes in 3d

Express 3d rotation as series of rotations around coordinate axes by angles γβα ,,

Overall rotation is product of these elementary rotations:

zyx RRRR =

Page 10: Lecture 12: Multi-view geometry / Stereo III

Extrinsic camera parameters

)( TPRP −= wc

Point in worldPoint in camera reference frame

( )ZYXc ,,=P

• Extrinsic: location and orientation of camera frame with respect to reference frame

• Intrinsic: how to map pixel coordinates to image plane coordinates

Camera parameters

Camera 1 frame

Reference frame

Page 11: Lecture 12: Multi-view geometry / Stereo III

Intrinsic camera parameters

• Ignoring any geometric distortions from optics, we can describe them by:

xxim soxx )( −−=

yyim soyy )( −−=

Coordinates of projected point in camera reference

frame

Coordinates of image point in

pixel units

Coordinates of image center in

pixel units

Effective size of a pixel (mm)

Camera parameters• We know that in terms of camera reference frame:

• Substituting previous eqns describing intrinsic and extrinsic parameters, can relate pixels coordinatesto world points:

)()()(

3

1

TPRTPR

−−

=−− Τ

Τ

w

wxxim fsox

)()()(

3

2

TPRTPR

−−

=−− Τ

Τ

w

wyyim fsoy

Ri = Row i of rotation matrix

Page 12: Lecture 12: Multi-view geometry / Stereo III

Linear version of perspective projection equations

• This can be rewritten as a matrix product using homogeneous coordinates:

x1x2x3

= Mint Mext

XwYwZw1

100/0

0/yy

xxosfosf

−−

Mint=TRTRTR

Τ

Τ

Τ

−−−

3333231

2232221

1131211

rrrrrrrrr

Mext=

32

31

//

xxyxxx

im

im

==

point in camera coordinates

Calibrating a camera

• Compute intrinsic and extrinsic parameters using observed camera data

Main idea• Place “calibration object” with

known geometry in the scene• Get correspondences• Solve for mapping from scene

to image: estimate M=MintMext

Page 13: Lecture 12: Multi-view geometry / Stereo III

Linear version of perspective projection equations

• This can be rewritten as a matrix product using homogeneous coordinates:

x1x2x3

= Mint Mext

XwYwZw1

w

wim

w

wim

y

x

PMPMPMPM

⋅⋅

=

⋅⋅

=

3

2

3

1

M

Let Mi be row i of matrix M

product M is single projection matrix encoding both extrinsic and intrinsic parameters

32

31

//

xxyxxx

im

im

==

Pw in homog.

Estimating the projection matrix

w

wim

w

wim

y

x

PMPMPMPM

⋅⋅

=

⋅⋅

=

3

2

3

1wimx PMM ⋅−= )(0 31

wimy PMM ⋅−= )(0 32

Page 14: Lecture 12: Multi-view geometry / Stereo III

Estimating the projection matrixFor a given feature point:

In matrix form:

wimx PMM ⋅−= )(0 31

wimy PMM ⋅−= )(0 32

Twimw

wimw

PyPPxP

−−

ΤΤ

ΤΤΤ

00

3

2

1

MMM

00=

Stack rows of matrix M

Estimating the projection matrix

Expanding this to see the elements:

imwimwimwimwww

imwimwimwimwwwyZyYyXyZYXxZxYxXxZYX

−−−−−−−−

1000000001

00=

Twimw

wimw

PyPPxP

−−

ΤΤ

ΤΤΤ

00

3

2

1

MMM

00=

wimx PMM ⋅−= )(0 31

wimy PMM ⋅−= )(0 32

Page 15: Lecture 12: Multi-view geometry / Stereo III

Estimating the projection matrixThis is true for every feature point, so we can stack up nobserved image features and their associated 3d points in single equation:

Solve for mij’s (the calibration information) with least squares. [F&P Section 3.1]

)1()1()1()1()1()1()1()1()1()1(

)1()1()1()1()1()1()1()1()1()1(

1000000001

imwimwimwwww

imwimwimwimwww

yZyYyXyZYXxZxYxXxZYX

im −−−−−−−−

… … … …

)()()()()()()()()()(

)()()()()()()()()()(

1000000001

nnnnnnnim

nnn

nnnnnnnnnn

imwimwimwwww

imwimwimwimwww

yZyYyXyZYXxZxYxXxZYX

−−−−−−−−

00=…

00=

P m

Pm = 0

Summary: camera calibration

• Associate image points with scene points on object with known geometry

• Use together with perspective projection relationship to estimate projection matrix

• (Can also solve for explicit parameters themselves)

Page 16: Lecture 12: Multi-view geometry / Stereo III

When would we calibrate this way?

• Makes sense when geometry of system is not going to change over time

• …When would it change?

Self-calibration

• Want to estimate world geometry without requiring calibrated cameras– Archival videos– Photos from multiple unrelated users– Dynamic camera system

• We can still reconstruct 3d structure, up to certain ambiguities, if we can find correspondences between points…

Page 17: Lecture 12: Multi-view geometry / Stereo III

Uncalibrated case

)()(1

int, leftleft left pMp −=

)(1

int,)( rightrightright pMp −=Camera coordinates

Internal calibration matrices

Image pixel coordinates

wextPMMp int=

pSo:

Uncalibrated case: fundamental matrix

0)()( =Τleftright Epp From before, the

essential matrix

( ) ( ) 011 =−Τ−leftleftrightright pMEpMDropped subscript, still

internal parameter matrices

( ) 01 =−Τ−Τleftleftrightright pEMMp

0=Τleftright pFp

Fundamental matrix

)()(1

int, leftleft left pMp −=

)(1

int,)( rightrightright pMp −=

Page 18: Lecture 12: Multi-view geometry / Stereo III

Fundamental matrix

• Relates pixel coordinates in the two views• More general form than essential matrix:

we remove need to know intrinsic parameters

• If we estimate fundamental matrix from correspondences in pixel coordinates, can reconstruct epipolar geometry without intrinsic or extrinsic parameters

Computing F from correspondences

• Cameras are uncalibrated: we don’t know E or left or right Mint matrices

• Estimate F from 8+ point correspondences.

0=Τleftright pFp( )1−Τ−= leftrightEMMF

Page 19: Lecture 12: Multi-view geometry / Stereo III

Computing F from correspondences0=Τ

leftright pFpEach point correspondence generates one constraint on F

Collect n of these constraints

Invert and solve for F. Or, if n > 8, least squares solution.

Robust computation

• Find corners• Unguided matching – local search, cross-

correlation to get some seed matches• Compute F and epipolar geometry: find F that is

consistent with many of the seed matches• Now guide matching: using F to restrict search

to epipolar lines

Page 20: Lecture 12: Multi-view geometry / Stereo III

RANSAC application: robust computation

Interest points (Harris corners) in left and right imagesabout 500 pts / image640x480 resolution

Outliers (117)(t=1.25 pixel; 43 iterations)

Final inliers (262)

Hartley & Zisserman p. 126

Putative correspondences (268)(Best match,SSD<20)

Inliers (151)

Figure by Gee and Cipolla, 1999

Page 21: Lecture 12: Multi-view geometry / Stereo III

Need for multi-view geometry and 3d reconstruction

Applications including:• 3d tracking• Depth-based grouping • Image rendering and generating interpolated or

“virtual” viewpoints• Interactive video

Z-keying for virtual reality

• Merge synthetic and real images given depth maps

Kanade et al., CMU, 1995

Page 22: Lecture 12: Multi-view geometry / Stereo III

Z-keying for virtual reality

http://www.cs.cmu.edu/afs/cs/project/stereo-machine/www/z-key.html

Virtualized RealityTM

http://www.cs.cmu.edu/~virtualized-reality/3manbball_new.html

Kanade et al, CMU

Capture 3d shape from multiple views, texture from images

Use them to generate new views on demand

Page 23: Lecture 12: Multi-view geometry / Stereo III

Virtual viewpoint video

C. Zitnick et al, High-quality video view interpolation using a layered representation,SIGGRAPH 2004.

Virtual viewpoint video

http://research.microsoft.com/IVM/VVV/

Page 24: Lecture 12: Multi-view geometry / Stereo III

Noah Snavely, Steven M. Seitz, Richard Szeliski, "Photo tourism: Exploring photo collections in 3D," ACM Transactions on Graphics (SIGGRAPH Proceedings), 25(3), 2006, 835-846.

http://phototour.cs.washington.edu/, http://labs.live.com/photosynth/

Page 25: Lecture 12: Multi-view geometry / Stereo III

Coming up• Tuesday: Local invariant features

– Read Lowe paper on SIFT

• Problem set 3 out next Tuesday, due 11/13

• Graduate students: remember paper reviews and extensions, due 12/6