Structure From Motion Sebastian Thrun, Gary Bradski, Daniel Russakoff Stanford CS223B Computer Vision http://robots.stanford.edu/cs223b
Dec 21, 2015
Structure From Motion
Sebastian Thrun, Gary Bradski, Daniel Russakoff
Stanford CS223B Computer Vision
http://robots.stanford.edu/cs223b
Sebastian Thrun Stanford University CS223B Computer Vision
Structure From Motion (1)
[Tomasi & Kanade 92]
Structure From Motion (2)
[Tomasi & Kanade 92]
Structure From Motion (3)
[Tomasi & Kanade 92]
Structure From Motion
Problem 1:
– Given n points pij = (xij, yij) in m images
– Reconstruct structure: 3-D locations Pj = (xj, yj, zj)
– Reconstruct camera positions (extrinsics) Mi = (Ai, bi)
Problem 2:
– Establish correspondence: c(pij)
Orthographic Camera Model
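Limit of Pinhole Model:
Extrinsic Parameters
\[
\begin{pmatrix} p_x \\ p_y \\ p_z \end{pmatrix}
=
\underbrace{\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}}_{\text{Rotation}}
\begin{pmatrix} P_x \\ P_y \\ P_z \end{pmatrix}
+
\begin{pmatrix} b_x \\ b_y \\ b_z \end{pmatrix}
\]
Orthographic Projection: p = A P + b
\[
\begin{pmatrix} p_x \\ p_y \end{pmatrix}
=
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix}
\begin{pmatrix} P_x \\ P_y \\ P_z \end{pmatrix}
+
\begin{pmatrix} b_x \\ b_y \end{pmatrix}
\]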
Orthographic Projection
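Limit of Pinhole Model:
Orthographic Projection
\[
p_{ij} = A_i P_j + b_i \qquad \text{(camera } i\text{, feature } j\text{)}
\]
\[
\begin{pmatrix} p_x \\ p_y \end{pmatrix}
=
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix}
\begin{pmatrix} P_x \\ P_y \\ P_z \end{pmatrix}
+
\begin{pmatrix} b_x \\ b_y \end{pmatrix}
\]
Constraints on the rows a_1 = (a_11, a_12, a_13) and a_2 = (a_21, a_22, a_23) of A:
\[
\|a_1\| = 1, \qquad \|a_2\| = 1, \qquad a_1 \cdot a_2 = 0
\]
where the rotation is
\[
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
\]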
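The orthographic model above can be sketched numerically. This is a minimal illustration, not code from the lecture: the function name `orthographic_project`, the rotation angle, and the sample points are all assumptions chosen for the example. It builds A from the first two rows of a rotation and checks that those rows are orthonormal.

```python
import numpy as np

def orthographic_project(R, t, P):
    """Project 3-D points P (3 x n) to 2-D with p = A P + b.

    A is the first two rows of the rotation R (3 x 3) and b the first two
    entries of the translation t (3,); the depth row is simply dropped.
    """
    A = R[:2, :]
    b = t[:2]
    return A @ P + b[:, None]

# Hypothetical example data: a 30 degree rotation about the z axis.
theta = np.deg2rad(30.0)
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([1.0, -2.0, 5.0])

# Columns are the 3-D points (0,0,0), (1,0,0) and (0,1,3).
P = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 3.0]])

p = orthographic_project(R, t, P)

# The rows a_1, a_2 of A are unit length and orthogonal, so A A^T = I.
A = R[:2, :]
gram = A @ A.T
```

Note that the depth of the third point (z = 3) has no effect on its image, which is what distinguishes this model from a pinhole camera.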
The Affine SFM Problem
Recover {A_i, b_i} and {P_j} from
\[
p_{ij} = A_i P_j + b_i \qquad \text{(camera } i\text{, feature } j\text{)}
\]
Count # Constraints vs #Unknowns
m camera poses
n points
2mn point constraints
8m + 3n unknowns
Suggests: need 2mn ≥ 8m + 3n
But: can we really recover all parameters?
\[
p_{ij} = A_i P_j + b_i \qquad \text{(camera } i\text{, feature } j\text{)}
\]
How Many Parameters Can’t We Recover?
We can recover all but…
0, 3, 6, 8, 9, 10, 12, n, m, nm
Place Your Bet!
The Answer is (at least): 12
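The measurements
\[
p_{ij} = A_i P_j + b_i
\]
are explained equally well by
\[
p_{ij} = A'_i P'_j + b'_i
\]
with
\[
A'_i = A_i C, \qquad b'_i = A_i d + b_i, \qquad P'_j = C^{-1} P_j - C^{-1} d
\]
for any vector d and any non-singular matrix C. The 3 × 3 matrix C has 9 free parameters and d has 3, which gives 12.
Proof:
\[
p_{ij} = (A_i C)(C^{-1} P_j - C^{-1} d) + A_i d + b_i
       = A_i P_j - A_i d + A_i d + b_i
       = A_i P_j + b_i
\]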
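The ambiguity argument can be checked numerically. The sketch below is an illustration under assumed random data (the sizes m and n, the seed, and the helper name `project` are not from the slides). It transforms cameras and points with an arbitrary C and d and confirms that the image measurements do not change.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 4, 10                          # hypothetical numbers of cameras and points

A = rng.normal(size=(m, 2, 3))        # one 2x3 affine camera matrix per view
b = rng.normal(size=(m, 2))           # one 2-D offset per view
P = rng.normal(size=(3, n))           # 3-D points as columns

def project(A, b, P):
    """Return all measurements p_ij = A_i P_j + b_i as an (m, 2, n) array."""
    return A @ P + b[:, :, None]

# Arbitrary ambiguity: C is a perturbed identity (assumed non-singular), d a vector.
C = np.eye(3) + 0.3 * rng.normal(size=(3, 3))
d = rng.normal(size=3)
C_inv = np.linalg.inv(C)

A_new = A @ C                                   # A'_i = A_i C
b_new = b + A @ d                               # b'_i = A_i d + b_i
P_new = C_inv @ P - (C_inv @ d)[:, None]        # P'_j = C^-1 P_j - C^-1 d

p_old = project(A, b, P)
p_new = project(A_new, b_new, P_new)
same = np.allclose(p_old, p_new)
```

Because the two parameter sets produce identical images, no amount of image data can separate them without extra constraints.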
Points for Solving Affine SFM Problem
m camera poses n points
Need to have: 2mn ≥ 8m + 3n − 12
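As a small worked check of this counting bound, the sketch below searches for the smallest n that satisfies it for a given m. The function names are hypothetical. The inequality rearranges to n(2m − 3) ≥ 4(2m − 3), so for any m ≥ 2 it reduces to n ≥ 4. This is only a counting argument, so it is a necessary condition and not a guarantee that a configuration is solvable.

```python
def has_enough_constraints(m, n):
    """Counting check from the slide: 2mn >= 8m + 3n - 12."""
    return 2 * m * n >= 8 * m + 3 * n - 12

def min_points(m, n_max=1000):
    """Smallest n in 1..n_max passing the counting check for m >= 2 cameras."""
    for n in range(1, n_max + 1):
        if has_enough_constraints(m, n):
            return n
    return None

# For every m >= 2 the bound is first met at n = 4.
results = {m: min_points(m) for m in (2, 3, 10)}
```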
Affine SFM
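\[
p_{ij} = A_i P_j + b_i
\]
Fix coordinate system by making p0 = origin: replace p_ij by p_ij − p_i0 and P_j by P_j − P_0, so that b_i drops out:
\[
p_{ij} = A_i P_j
\]
Stack the measurements of point j and the camera matrices over all m cameras:
\[
q_j = \begin{pmatrix} p_{1j} \\ \vdots \\ p_{mj} \end{pmatrix},
\qquad
A = \begin{pmatrix} A_1 \\ \vdots \\ A_m \end{pmatrix}
\]
m cameras:
\[
q_j = A P_j
\]
n points:
\[
Q = A D, \qquad D = \begin{pmatrix} P_1 & \cdots & P_n \end{pmatrix},
\qquad
Q = \begin{pmatrix} p_{11} & \cdots & p_{1n} \\ \vdots & & \vdots \\ p_{m1} & \cdots & p_{mn} \end{pmatrix}
\]
Rank Theorem: Q has rank 3
Proof: A has size 2m × 3 and D has size 3 × n, so their product Q = AD has rank at most 3.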
The Rank Theorem
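\[
\begin{pmatrix}
p^x_{11} & \cdots & p^x_{1n} \\
p^y_{11} & \cdots & p^y_{1n} \\
\vdots & & \vdots \\
p^x_{m1} & \cdots & p^x_{mn} \\
p^y_{m1} & \cdots & p^y_{mn}
\end{pmatrix}
\quad \text{has rank 3}
\]
(2m elements per column, n elements per row)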
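The rank theorem can be illustrated on synthetic data. The sketch below assumes random affine cameras and points (sizes and seed are arbitrary choices) and uses the first point as the origin, following the "p0 = origin" step. It compares the rank of the measurement matrix before and after removing the translation.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 5, 20                           # hypothetical numbers of cameras and points

A = rng.normal(size=(2 * m, 3))        # m stacked 2x3 camera matrices
b = rng.normal(size=(2 * m, 1))        # m stacked 2-D offsets
D = rng.normal(size=(3, n))            # 3-D points as columns

# Raw measurement matrix: every column is the stacked image of one point.
Q_raw = A @ D + b

# Make the first point the origin: subtract its image from all others.
Q = Q_raw[:, 1:] - Q_raw[:, [0]]

rank_raw = np.linalg.matrix_rank(Q_raw, tol=1e-9)   # translation adds one dimension
rank_centered = np.linalg.matrix_rank(Q, tol=1e-9)  # rank 3 as the theorem states
```

With noisy measurements the matrix is full rank in exact arithmetic, so in practice one looks for three dominant singular values instead of an exact rank.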
Tomasi/Kanade 1992
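Singular Value Decomposition
\[
\begin{pmatrix}
p^x_{11} & \cdots & p^x_{1n} \\
p^y_{11} & \cdots & p^y_{1n} \\
\vdots & & \vdots \\
p^x_{m1} & \cdots & p^x_{mn} \\
p^y_{m1} & \cdots & p^y_{mn}
\end{pmatrix}
= U \, W \, V^T
\]
with U of size 2m × 3, W of size 3 × 3, and V^T of size 3 × n.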
Tomasi/Kanade 1992
W V^T: affine structure
U: affine camera positions
Gives also the optimal affine reconstruction under noise
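The factorization step can be sketched with NumPy's SVD. This is a minimal sketch under assumptions, not the authors' implementation: the function name `factorize`, the synthetic data, and the noise level are all made up for the example, and the measurement matrix is assumed to be already centred so that the translations have been removed.

```python
import numpy as np

def factorize(Q):
    """Rank-3 factorization Q ~ A_hat @ D_hat via a truncated SVD.

    A_hat (2m x 3) holds the affine cameras (U), D_hat (3 x n) the affine
    structure (W V^T), both defined only up to an invertible 3x3 matrix C.
    """
    U, w, Vt = np.linalg.svd(Q, full_matrices=False)
    A_hat = U[:, :3]
    D_hat = np.diag(w[:3]) @ Vt[:3, :]
    return A_hat, D_hat

# Hypothetical synthetic data: exact rank-3 measurements plus small noise.
rng = np.random.default_rng(2)
m, n = 6, 30
Q_clean = rng.normal(size=(2 * m, 3)) @ rng.normal(size=(3, n))
Q_noisy = Q_clean + 1e-3 * rng.normal(size=Q_clean.shape)

A_exact, D_exact = factorize(Q_clean)
A_hat, D_hat = factorize(Q_noisy)

# Residual of the rank-3 fit versus the distance to the noise-free matrix.
fit_residual = np.linalg.norm(Q_noisy - A_hat @ D_hat)
noise_norm = np.linalg.norm(Q_noisy - Q_clean)
```

The truncated SVD is the closest rank-3 matrix in the Frobenius norm, which is the sense in which the reconstruction is optimal under noise. The recovered factors generally differ from the true cameras and points by the 3 × 3 ambiguity C.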
Back To Orthographic Projection
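Constraints on the rows a_1, a_2 of each camera matrix A'_i:
\[
\|a_1\| = 1, \qquad \|a_2\| = 1, \qquad a_1 \cdot a_2 = 0
\]
Find C and d for which constraints are met, with
\[
p_{ij} = A'_i P'_j + b'_i
\]
\[
A'_i = A_i C, \qquad b'_i = A_i d + b_i, \qquad P'_j = C^{-1} P_j - C^{-1} d
\]
where d is a vector and C a non-singular matrix.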
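One common way to find such a C is sketched below. It is an assumption of this example rather than something stated on the slide: the constraints are linear in the symmetric matrix L = C Cᵀ, so L can be solved by least squares and C recovered with a Cholesky factor. The names `constraint_row` and `metric_upgrade` and the synthetic cameras are hypothetical. The vector d does not enter these constraints because they only involve A'_i = A_i C.

```python
import numpy as np

def constraint_row(u, v):
    """Coefficients of u L v^T in the 6 unknowns of a symmetric 3x3 matrix L."""
    return np.array([u[0] * v[0],
                     u[0] * v[1] + u[1] * v[0],
                     u[0] * v[2] + u[2] * v[0],
                     u[1] * v[1],
                     u[1] * v[2] + u[2] * v[1],
                     u[2] * v[2]])

def metric_upgrade(A_hat):
    """Given stacked affine cameras A_hat (2m x 3), return C such that every
    2x3 block of A_hat @ C has orthonormal rows (up to a global rotation)."""
    rows, rhs = [], []
    for i in range(A_hat.shape[0] // 2):
        a1, a2 = A_hat[2 * i], A_hat[2 * i + 1]
        rows += [constraint_row(a1, a1), constraint_row(a2, a2), constraint_row(a1, a2)]
        rhs += [1.0, 1.0, 0.0]          # |a1| = 1, |a2| = 1, a1 . a2 = 0
    l, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    L = np.array([[l[0], l[1], l[2]],
                  [l[1], l[3], l[4]],
                  [l[2], l[4], l[5]]])
    # Cholesky gives C with C C^T = L; it raises LinAlgError if L is not
    # positive definite, which can happen with noisy or degenerate data.
    return np.linalg.cholesky(L)

# Synthetic check: orthographic cameras scrambled by a random matrix C0.
rng = np.random.default_rng(3)
m = 6
A_true = np.vstack([np.linalg.qr(rng.normal(size=(3, 3)))[0][:2] for _ in range(m)])
C0 = np.eye(3) + 0.3 * rng.normal(size=(3, 3))
A_hat = A_true @ np.linalg.inv(C0)

C = metric_upgrade(A_hat)
A_metric = A_hat @ C
```

The recovered C need not equal C0: any C multiplied on the right by a rotation satisfies the same constraints, which reflects the freedom to choose the world coordinate frame.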
Back To Projective Geometry
Orthographic (in the limit)
Projective
Projective Camera:
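\[
\sum_{i,j} \left[
\left( x_{ij} - \frac{m_i^1 \cdot P_j}{m_i^3 \cdot P_j} \right)^2
+
\left( y_{ij} - \frac{m_i^2 \cdot P_j}{m_i^3 \cdot P_j} \right)^2
\right] \to 0
\]
where m_i^k is the k-th row of the projection matrix of camera i and P_j is written in homogeneous coordinates.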
Non-Linear Optimization Problem: Bundle Adjustment!
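The objective that bundle adjustment minimizes can be sketched as a reprojection-error function. This is an illustration under assumptions: the function names, the 3 × 4 camera layout, and the synthetic scene are chosen for the example, and the nonlinear solver itself (for instance Levenberg-Marquardt) is deliberately left out.

```python
import numpy as np

def reprojection_residuals(M, P, x, y):
    """Residuals x_ij - (m_i^1 . P_j)/(m_i^3 . P_j), and likewise for y.

    M: (m, 3, 4) projection matrices, P: (3, n) points,
    x, y: (m, n) observed image coordinates.
    """
    P_h = np.vstack([P, np.ones((1, P.shape[1]))])   # homogeneous points (4, n)
    proj = M @ P_h                                   # (m, 3, n)
    rx = x - proj[:, 0, :] / proj[:, 2, :]
    ry = y - proj[:, 1, :] / proj[:, 2, :]
    return rx, ry

def reprojection_error(M, P, x, y):
    """Sum of squared reprojection residuals over all cameras and points."""
    rx, ry = reprojection_residuals(M, P, x, y)
    return float(np.sum(rx ** 2 + ry ** 2))

# Hypothetical scene: three cameras translated along x, points in front of them.
rng = np.random.default_rng(4)
m, n = 3, 8
P = rng.uniform(-1.0, 1.0, size=(3, n))
M = np.stack([np.hstack([np.eye(3), np.array([[0.1 * i], [0.0], [5.0]])])
              for i in range(m)])

# Exact observations generated from the true cameras and points.
P_h = np.vstack([P, np.ones((1, n))])
proj = M @ P_h
x = proj[:, 0, :] / proj[:, 2, :]
y = proj[:, 1, :] / proj[:, 2, :]

err_exact = reprojection_error(M, P, x, y)
err_perturbed = reprojection_error(M, P + 0.05, x, y)
```

The division by the third row is what makes the problem nonlinear in the unknowns, in contrast to the affine case where a single SVD suffices.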
Structure From Motion
Problem 1:
– Given n points pij = (xij, yij) in m images
– Reconstruct structure: 3-D locations Pj = (xj, yj, zj)
– Reconstruct camera positions (extrinsics) Mi = (Ai, bi)
Problem 2:
– Establish correspondence: c(pij)
The Correspondence Problem
View 1, View 2, View 3
Correspondence: Solution 1
Track features (e.g., optical flow)
…but fails when images taken from widely different poses
Correspondence: Solution 2
– Start with random solution A, b, P
– Compute soft correspondence: p(c|A,b,P)
– Plug soft correspondence into SFM
– Reiterate
See Dellaert/Seitz/Thorpe/Thrun 2003
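The "compute soft correspondence" step can be illustrated with a heavily simplified sketch. It is not the method of the cited paper. It assumes isotropic Gaussian image noise with a hypothetical standard deviation `sigma`, a uniform prior over points, and it ignores the constraint that two measurements cannot share a point. The function name and example data are also assumptions.

```python
import numpy as np

def soft_correspondence(A_i, b_i, P, p_obs, sigma=1.0):
    """Return W (k x n) where W[k, j] approximates the probability that
    measurement k in image i was generated by 3-D point j, given A_i, b_i, P.

    Assumes Gaussian noise with standard deviation sigma and a uniform prior.
    """
    pred = A_i @ P + b_i[:, None]                      # predicted images (2, n)
    diff = p_obs[:, :, None] - pred[:, None, :]        # (2, k, n)
    sq_dist = np.sum(diff ** 2, axis=0)                # (k, n)
    log_w = -sq_dist / (2.0 * sigma ** 2)
    log_w -= log_w.max(axis=1, keepdims=True)          # for numerical stability
    W = np.exp(log_w)
    return W / W.sum(axis=1, keepdims=True)            # normalize each row

# Hypothetical example: four points seen by one camera, in shuffled order.
A_i = np.array([[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0]])
b_i = np.zeros(2)
P = np.array([[0.0, 1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0, 1.0],
              [0.0, 0.0, 0.0, 0.0]])
perm = np.array([2, 0, 3, 1])
p_obs = (A_i @ P + b_i[:, None])[:, perm]              # measurement k is point perm[k]

W = soft_correspondence(A_i, b_i, P, p_obs, sigma=0.1)
best = W.argmax(axis=1)
```

In an iterative scheme these weights would replace hard assignments when re-estimating A, b, and P, and the two steps would alternate until the estimates stop changing.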
Example
Results: Cube
Animation
Tomasi’s Benchmark Problem
Reconstruction with EM
3-D Structure