Peripheral drift illusion
Jan 04, 2016
Multiple views
Hartley and Zisserman
Lowe
stereo vision, structure from motion, optical flow
Why multiple views?
• Structure and depth are inherently ambiguous from single views.
Images from Lana Lazebnik
Why multiple views?
• Structure and depth are inherently ambiguous from single views.
[Figure: points P1 and P2 on the same ray through the optical center project to the same image point, P1′ = P2′.]
Geometry for a simple stereo system
• First, assuming parallel optical axes, known camera parameters (i.e., calibrated cameras):
[Figure: two cameras with parallel optical axes and focal length f; optical centers O_l (left) and O_r (right) separated by baseline T; world point P at depth Z projects to image point p_l (left) and p_r (right).]
Geometry for a simple stereo system
• Assume parallel optical axes, known camera parameters (i.e., calibrated cameras). What is the expression for Z?
Similar triangles (p_l, P, p_r) and (O_l, P, O_r), with image coordinates x_l, x_r measured from each camera's optical center:
$$\frac{T - (x_l - x_r)}{Z - f} = \frac{T}{Z}$$
Solving for Z:
$$Z = f\,\frac{T}{x_l - x_r}$$
where the denominator $x_l - x_r$ is the disparity.
Depth from disparity
[Figure: left image I(x,y), right image I′(x′,y′), and the disparity map D(x,y)]
(x′, y′) = (x + D(x,y), y)
So if we could find the corresponding points in two images, we could estimate relative depth…
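As a minimal MATLAB sketch of this relationship (all numbers are made-up stand-ins, not values from the slides):

% Depth from disparity: Z = f*T ./ D, applied to a whole disparity map
f = 700;                         % focal length in pixels (assumed)
T = 0.12;                        % baseline in meters (assumed)
D = rand(480, 640) * 60 + 4;     % stand-in disparity map, in pixels
Z = f * T ./ D;                  % depth map in meters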
What did we need to know?
• Correspondence for every pixel. Sort of like project 2, but project 2 is “sparse” and we need “dense” correspondence.
• Calibration for the cameras.
Where do we need to search?
How do we calibrate a camera?
[Figure: a calibration object with known 3D coordinates (X, Y, Z) for a set of points, alongside the measured 2D image coordinates (u, v) of the same points.]
World vs Camera coordinates
Image formation
Let's design a camera
• Idea 1: put a piece of film in front of an object
• Do we get a reasonable image?
Slide source: Seitz
Pinhole camera
Idea 2: add a barrier to block off most of the rays
• This reduces blurring
• The opening is known as the aperture
Slide source: Seitz
Projection: world coordinates → image coordinates
[Figure: world point P projects through the camera center (t_x, t_y, t_z) onto the image plane; the image point (u, v) is measured relative to the optical center (u_0, v_0), with focal length f along the optical (Z) axis.]
Homogeneous coordinates
Converting to homogeneous coordinates:
$$(x, y) \Rightarrow \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \;\text{(homogeneous image coordinates)} \qquad (x, y, z) \Rightarrow \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \;\text{(homogeneous scene coordinates)}$$
Converting from homogeneous coordinates:
$$\begin{bmatrix} x \\ y \\ w \end{bmatrix} \Rightarrow (x/w,\; y/w) \qquad \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} \Rightarrow (x/w,\; y/w,\; z/w)$$
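A tiny MATLAB sketch of these conversions (the points used are arbitrary examples):

% To homogeneous coordinates: append a 1
p  = [3; 4];                     % a 2D image point
ph = [p; 1];                     % homogeneous image coordinates [3; 4; 1]

% From homogeneous coordinates: divide by the last entry
q  = [6; 8; 2];                  % a homogeneous image point
qc = q(1:end-1) / q(end);        % back to Cartesian: [3; 4]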
World vs Camera coordinates
Slide Credit: Saverese
Projection matrix
x = K[R t]X
• x: image coordinates: (u, v, 1)
• K: intrinsic matrix (3×3)
• R: rotation (3×3)
• t: translation (3×1)
• X: world coordinates: (X, Y, Z, 1)
[Figure: world coordinate frame (O_w, i_w, j_w, k_w) related to the camera by rotation R and translation T.]
$$\mathbf{x} = K\,[\,I\;\;\mathbf{0}\,]\,\mathbf{X} \qquad w\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$$
Slide Credit: Saverese
Projection matrix
Intrinsic assumptions: unit aspect ratio, optical center at (0,0), no skew
Extrinsic assumptions: no rotation, camera at (0,0,0)
Remove assumption: known optical center
$$\mathbf{x} = K\,[\,I\;\;\mathbf{0}\,]\,\mathbf{X} \qquad w\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f & 0 & u_0 & 0 \\ 0 & f & v_0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$$
Intrinsic assumptions: unit aspect ratio, no skew
Extrinsic assumptions: no rotation, camera at (0,0,0)
Remove assumption: square pixels
$$\mathbf{x} = K\,[\,I\;\;\mathbf{0}\,]\,\mathbf{X} \qquad w\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} \alpha & 0 & u_0 & 0 \\ 0 & \beta & v_0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$$
Intrinsic assumptions: no skew
Extrinsic assumptions: no rotation, camera at (0,0,0)
Remove assumption: non-skewed pixels
$$\mathbf{x} = K\,[\,I\;\;\mathbf{0}\,]\,\mathbf{X} \qquad w\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} \alpha & s & u_0 & 0 \\ 0 & \beta & v_0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$$
Intrinsic assumptions: (none)   Extrinsic assumptions: no rotation, camera at (0,0,0)
Note: different books use different notation for parameters
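A small MATLAB sketch assembling an intrinsic matrix K from these five parameters (the numeric values are assumed for illustration only):

% Assemble the intrinsic matrix K from its five parameters
alpha = 700;  beta = 700;        % focal lengths in pixel units (assumed)
s  = 0;                          % skew (assumed zero)
u0 = 320;  v0 = 240;             % optical center in pixels (assumed)
K  = [alpha, s, u0;
      0,  beta, v0;
      0,     0,  1];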
Oriented and Translated Camera
[Figure: camera frame related to the world frame (O_w, i_w, j_w, k_w) by rotation R and translation t.]
Allow camera translation
$$\mathbf{x} = K\,[\,I\;\;\mathbf{t}\,]\,\mathbf{X} \qquad w\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} \alpha & s & u_0 \\ 0 & \beta & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & t_x \\ 0 & 1 & 0 & t_y \\ 0 & 0 & 1 & t_z \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$$
Intrinsic assumptions: (none)   Extrinsic assumptions: no rotation
3D Rotation of Points
Rotation around the coordinate axes, counter-clockwise:
$$R_x(\alpha) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & -\sin\alpha \\ 0 & \sin\alpha & \cos\alpha \end{bmatrix} \qquad R_y(\beta) = \begin{bmatrix} \cos\beta & 0 & \sin\beta \\ 0 & 1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{bmatrix} \qquad R_z(\gamma) = \begin{bmatrix} \cos\gamma & -\sin\gamma & 0 \\ \sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
[Figure: a point p rotated to p′.]
Slide Credit: Saverese
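A minimal MATLAB sketch of these axis rotations (angles in radians; the handle names Rx, Ry, Rz are just illustrative):

% Axis rotations (counter-clockwise, angles in radians)
Rx = @(a) [1 0 0; 0 cos(a) -sin(a); 0 sin(a) cos(a)];
Ry = @(b) [cos(b) 0 sin(b); 0 1 0; -sin(b) 0 cos(b)];
Rz = @(g) [cos(g) -sin(g) 0; sin(g) cos(g) 0; 0 0 1];

% Compose a full rotation and apply it to a point
R  = Rz(0.1) * Ry(0.2) * Rx(0.3);
p2 = R * [1; 0; 0];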
Allow camera rotation
$$\mathbf{x} = K\,[\,R\;\;\mathbf{t}\,]\,\mathbf{X} \qquad w\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} \alpha & s & u_0 \\ 0 & \beta & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$$
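A short MATLAB sketch of the full projection x = K[R t]X, with assumed intrinsics and pose (none of the numbers come from the slides):

K = [700, 0, 320; 0, 700, 240; 0, 0, 1];      % assumed intrinsics (no skew)
R = [cos(0.1), -sin(0.1), 0; sin(0.1), cos(0.1), 0; 0, 0, 1];  % assumed small rotation about z
t = [0.1; 0; 0];                               % assumed translation
P = K * [R, t];                                % 3x4 projection matrix

Xw = [2; 1; 10; 1];                            % a world point in homogeneous coordinates
x  = P * Xw;                                   % homogeneous image point
uv = x(1:2) / x(3);                            % pixel coordinates (u, v)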
Degrees of freedom
$$\mathbf{x} = K\,[\,R\;\;\mathbf{t}\,]\,\mathbf{X} \qquad w\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \underbrace{\begin{bmatrix} \alpha & s & u_0 \\ 0 & \beta & v_0 \\ 0 & 0 & 1 \end{bmatrix}}_{5\ \text{DOF}} \underbrace{\begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \end{bmatrix}}_{6\ \text{DOF}} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$$
5 intrinsic + 6 extrinsic = 11 degrees of freedom in total.
Beyond Pinholes: Radial Distortion
• Common in wide-angle lenses or for special applications (e.g., security)
• Creates non-linear terms in the projection
• Usually handled by solving for the non-linear terms and then correcting the image
Image from Martin Habbecke
[Figure: image with barrel distortion and the corrected result.]
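As a sketch of one common way to model radial distortion (a polynomial in the squared radius applied to normalized image coordinates; the coefficients k1, k2 below are assumptions, not values from the slides):

% Apply a simple polynomial radial distortion model to a normalized image point
k1 = -0.2;  k2 = 0.05;           % assumed distortion coefficients
xy = [0.3; 0.4];                 % an undistorted point in normalized image coordinates
r2 = sum(xy.^2);                 % squared radius from the image center
xy_distorted = xy * (1 + k1*r2 + k2*r2^2);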
How to calibrate the camera?
$$\mathbf{x} = K\,[\,R\;\;\mathbf{t}\,]\,\mathbf{X} \qquad \begin{bmatrix} su \\ sv \\ s \end{bmatrix} = \begin{bmatrix} * & * & * & * \\ * & * & * & * \\ * & * & * & * \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$
Calibrating the Camera
Use a scene with known geometry
• Correspond image points to 3D points
• Get least squares solution (or non-linear solution)
$$\begin{bmatrix} su \\ sv \\ s \end{bmatrix} = \begin{bmatrix} m_{11} & m_{12} & m_{13} & m_{14} \\ m_{21} & m_{22} & m_{23} & m_{24} \\ m_{31} & m_{32} & m_{33} & m_{34} \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$
How do we calibrate a camera?
[Figure repeated: known 3D point coordinates and their measured 2D image projections.]
Method 1 – homogeneous linear system
• Solve for m's entries using linear least squares (Ax = 0 form):
$$\begin{bmatrix} su \\ sv \\ s \end{bmatrix} = \begin{bmatrix} m_{11} & m_{12} & m_{13} & m_{14} \\ m_{21} & m_{22} & m_{23} & m_{24} \\ m_{31} & m_{32} & m_{33} & m_{34} \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$
$$\begin{bmatrix} X_1 & Y_1 & Z_1 & 1 & 0 & 0 & 0 & 0 & -u_1 X_1 & -u_1 Y_1 & -u_1 Z_1 & -u_1 \\ 0 & 0 & 0 & 0 & X_1 & Y_1 & Z_1 & 1 & -v_1 X_1 & -v_1 Y_1 & -v_1 Z_1 & -v_1 \\ & & & & & & \vdots & & & & & \\ X_n & Y_n & Z_n & 1 & 0 & 0 & 0 & 0 & -u_n X_n & -u_n Y_n & -u_n Z_n & -u_n \\ 0 & 0 & 0 & 0 & X_n & Y_n & Z_n & 1 & -v_n X_n & -v_n Y_n & -v_n Z_n & -v_n \end{bmatrix} \begin{bmatrix} m_{11} \\ m_{12} \\ \vdots \\ m_{33} \\ m_{34} \end{bmatrix} = \mathbf{0}$$
% Solve A m = 0 by taking the right singular vector with the smallest singular value
[U, S, V] = svd(A);
M = V(:, end);
M = reshape(M, [], 3)';     % reshape the 12-vector into the 3x4 projection matrix
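A sketch of how A might be assembled from n 3D-to-2D correspondences before the SVD step above (the arrays XYZ and uv are stand-in data, not from the slides):

% Stand-in correspondences: n x 3 world points and n x 2 image points (assumed data)
XYZ = rand(6, 3) * 10;
uv  = rand(6, 2) * 100;

% Build the 2n x 12 matrix of the A m = 0 system, two rows per correspondence
n = size(XYZ, 1);
A = zeros(2*n, 12);
for i = 1:n
    Xh = [XYZ(i,:), 1];                        % homogeneous world point, 1 x 4
    u = uv(i, 1);  v = uv(i, 2);
    A(2*i-1, :) = [Xh, zeros(1, 4), -u * Xh];  % row from the u equation
    A(2*i,   :) = [zeros(1, 4), Xh, -v * Xh];  % row from the v equation
end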
Method 2 – nonhomogeneous linear system
• Solve for m’s entries using linear least squares
Ax = b form:
$$\begin{bmatrix} su \\ sv \\ s \end{bmatrix} = \begin{bmatrix} m_{11} & m_{12} & m_{13} & m_{14} \\ m_{21} & m_{22} & m_{23} & m_{24} \\ m_{31} & m_{32} & m_{33} & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}$$
$$\begin{bmatrix} X_1 & Y_1 & Z_1 & 1 & 0 & 0 & 0 & 0 & -u_1 X_1 & -u_1 Y_1 & -u_1 Z_1 \\ 0 & 0 & 0 & 0 & X_1 & Y_1 & Z_1 & 1 & -v_1 X_1 & -v_1 Y_1 & -v_1 Z_1 \\ & & & & & \vdots & & & & & \\ X_n & Y_n & Z_n & 1 & 0 & 0 & 0 & 0 & -u_n X_n & -u_n Y_n & -u_n Z_n \\ 0 & 0 & 0 & 0 & X_n & Y_n & Z_n & 1 & -v_n X_n & -v_n Y_n & -v_n Z_n \end{bmatrix} \begin{bmatrix} m_{11} \\ m_{12} \\ \vdots \\ m_{32} \\ m_{33} \end{bmatrix} = \begin{bmatrix} u_1 \\ v_1 \\ \vdots \\ u_n \\ v_n \end{bmatrix}$$
% Solve the overdetermined system A m = Y by least squares (Y stacks the measured u_i, v_i)
M = A \ Y;
M = [M; 1];                 % append m34 = 1
M = reshape(M, [], 3)';     % reshape into the 3x4 projection matrix
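In the same spirit as the Method 1 sketch, the A and Y of this non-homogeneous form might be built as follows (again with stand-in data and illustrative variable names):

% Build the 2n x 11 system A m = Y, with m34 fixed to 1 (stand-in data as before)
XYZ = rand(6, 3) * 10;
uv  = rand(6, 2) * 100;
n = size(XYZ, 1);
A = zeros(2*n, 11);
Y = zeros(2*n, 1);
for i = 1:n
    Xh = [XYZ(i,:), 1];
    u = uv(i, 1);  v = uv(i, 2);
    A(2*i-1, :) = [Xh, zeros(1, 4), -u * XYZ(i,:)];
    A(2*i,   :) = [zeros(1, 4), Xh, -v * XYZ(i,:)];
    Y(2*i-1) = u;
    Y(2*i)   = v;
end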
Calibration with linear method
• Advantages
  – Easy to formulate and solve
  – Provides initialization for non-linear methods
• Disadvantages
  – Doesn't directly give you camera parameters
  – Doesn't model radial distortion
  – Can't impose constraints, such as known focal length
• Non-linear methods are preferred
  – Define error as difference between projected points and measured points
  – Minimize error using Newton's method or other non-linear optimization, as sketched below
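A minimal sketch of that non-linear refinement, assuming the stand-in XYZ, uv, and the linear estimate M from the sketches above are in the workspace; fminsearch (plain Nelder-Mead) stands in for the Newton-style or least-squares solver a real implementation would use:

% Refine the 3x4 projection matrix M by minimizing summed squared reprojection error
Xh = [XYZ, ones(size(XYZ, 1), 1)];         % n x 4 homogeneous world points
m0 = reshape(M', [], 1);                   % 12-vector [m11 ... m34] as the starting point

num = @(m) Xh * reshape(m(1:8), 4, 2);     % n x 2 columns: [s*u, s*v]
den = @(m) Xh * m(9:12);                   % n x 1 column: s
errfun = @(m) sum(sum((bsxfun(@rdivide, num(m), den(m)) - uv).^2));

m_refined = fminsearch(errfun, m0);        % minimize reprojection error
M_refined = reshape(m_refined, 4, 3)';     % back to 3x4 form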