Structure From Motion

Sebastian Thrun, Gary Bradski, Daniel Russakoff
Stanford CS223B Computer Vision


http://robots.stanford.edu/cs223b


Structure From Motion (1)

[Tomasi & Kanade 92]









Problem 1:
– Given n points $p_{ij} = (x_{ij}, y_{ij})$ in m images (camera i, feature j)
– Reconstruct structure: 3-D locations $P_j = (x_j, y_j, z_j)$
– Reconstruct camera positions (extrinsics): $M_i = (A_i, b_i)$

Problem 2:
– Establish correspondence: $c(p_{ij})$


Orthographic Camera Model

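Limit of Pinhole Model:

Extrinsic parameters (rotation $A$, translation $b$):

$$
\begin{pmatrix} p_x \\ p_y \\ p_z \end{pmatrix}
=
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
\begin{pmatrix} P_X \\ P_Y \\ P_Z \end{pmatrix}
+
\begin{pmatrix} b_x \\ b_y \\ b_z \end{pmatrix}
$$

Orthographic projection (the depth row is dropped):

$$
\begin{pmatrix} p_x \\ p_y \end{pmatrix}
=
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix}
\begin{pmatrix} P_X \\ P_Y \\ P_Z \end{pmatrix}
+
\begin{pmatrix} b_x \\ b_y \end{pmatrix},
\qquad \text{i.e.} \quad p = AP + b
$$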
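The orthographic model $p = AP + b$ can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the lecture: the function name `orthographic_project` and the example rotation, translation, and points are hypothetical.

```python
import numpy as np

def orthographic_project(R, t, P):
    """Project 3-D points P (n x 3) with rotation R (3 x 3) and translation t (3,).

    Returns the n x 2 image points p = A P + b, where A is the first two rows
    of R and b the first two entries of t (the depth row is dropped).
    """
    A = R[:2, :]          # 2 x 3 orthographic camera matrix
    b = t[:2]             # 2-vector image offset
    return P @ A.T + b

# Hypothetical example: camera rotated 90 degrees about the z-axis.
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
t = np.array([1.0, 2.0, 5.0])
P = np.array([[1.0, 0.0, 3.0],
              [0.0, 1.0, 7.0]])

p = orthographic_project(R, t, P)
print(p)   # the depth coordinates 3.0 and 7.0 have no effect on p
```

Because the third row of the rotation is discarded, the depth of a point along the viewing direction never enters the image coordinates.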


Orthographic Projection

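Limit of Pinhole Model:

Orthographic projection:

$$
\begin{pmatrix} p_x \\ p_y \end{pmatrix}
=
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix}
\begin{pmatrix} P_X \\ P_Y \\ P_Z \end{pmatrix}
+
\begin{pmatrix} b_x \\ b_y \end{pmatrix},
\qquad \text{i.e.} \quad p = AP + b
$$

The full rotation is

$$
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
$$

so the two rows $a_1$, $a_2$ of $A$ satisfy

$$
a_1 \cdot a_2 = 0, \qquad |a_1|^2 = 1, \qquad |a_2|^2 = 1
$$

For camera $i$ and feature $j$:

$$
p_{ij} = A_i P_j + b_i
$$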


The Affine SFM Problem

Recover $\{A_i, b_i\}$ and $\{P_j\}$ from

$$
p_{ij} = A_i P_j + b_i \qquad \text{(camera } i \text{, feature } j \text{)}
$$


Count # Constraints vs #Unknowns

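m camera poses, n points
2mn point constraints (two image coordinates per point per image)
8m + 3n unknowns (six entries of $A_i$ and two of $b_i$ per camera, three coordinates per point)

Suggests: need $2mn \geq 8m + 3n$
But: can we really recover all parameters?

$$
p_{ij} = A_i P_j + b_i \qquad \text{(camera } i \text{, feature } j \text{)}
$$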


How Many Parameters Can’t We Recover?

We can recover all but…

0    3    6    8    9    10    12    n    m    nm

Place your bet!


The Answer is (at least): 12

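The measurements $p_{ij} = A_i P_j + b_i$ are reproduced exactly by the transformed solution $p_{ij} = A'_i P'_j + b'_i$ with

$$
A'_i = A_i C, \qquad P'_j = C^{-1} P_j - C^{-1} d, \qquad b'_i = A_i d + b_i
$$

for any non-singular $3 \times 3$ matrix $C$ and any 3-vector $d$. That is 9 + 3 = 12 parameters that cannot be recovered.

Proof:

$$
A'_i P'_j + b'_i = (A_i C)(C^{-1} P_j - C^{-1} d) + A_i d + b_i = A_i P_j - A_i d + A_i d + b_i = A_i P_j + b_i
$$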
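The ambiguity can also be checked numerically. The sketch below is an illustration only: the specific values of `C` and `d` are arbitrary choices (any non-singular `C` works), and the variable names are not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# One hypothetical affine camera and one 3-D point.
A_i = rng.standard_normal((2, 3))
b_i = rng.standard_normal(2)
P_j = rng.standard_normal(3)

# Arbitrary non-singular C (determinant 7) and arbitrary offset d.
C = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0]])
d = np.array([1.0, -2.0, 0.5])
C_inv = np.linalg.inv(C)

# Transformed solution: A' = A C, P' = C^-1 P - C^-1 d, b' = A d + b.
A_new = A_i @ C
P_new = C_inv @ P_j - C_inv @ d
b_new = A_i @ d + b_i

p_original = A_i @ P_j + b_i
p_transformed = A_new @ P_new + b_new
print(np.allclose(p_original, p_transformed))
```

Both parameter sets predict the same image point, so the measurements alone cannot distinguish between them.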


Points for Solving Affine SFM Problem

m camera poses, n points

Need to have: $2mn \geq 8m + 3n - 12$
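As a worked instance of this count, the inequality can be rearranged for n. This is only a counting argument (a necessary condition), and it assumes at least two images and points in general position:

$$
2mn \geq 8m + 3n - 12
\;\Longleftrightarrow\;
n\,(2m - 3) \geq 4\,(2m - 3)
\;\Longleftrightarrow\;
n \geq 4 \qquad (m \geq 2)
$$

So under this count, four points are the minimum regardless of how many images (at least two) are available.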


Affine SFM

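$$
p_{ij} = A_i P_j + b_i
$$

Fix the coordinate system by making $p_0$ the origin, which removes the offsets $b_i$:

$$
p_{ij} = A_i P_j
$$

m cameras: stack all views of point $j$

$$
q_j = \begin{pmatrix} p_{1j} \\ \vdots \\ p_{mj} \end{pmatrix} = A P_j,
\qquad
A = \begin{pmatrix} A_1 \\ \vdots \\ A_m \end{pmatrix}
$$

n points: collect the columns $q_1, \dots, q_n$

$$
Q = \begin{pmatrix} p_{11} & \cdots & p_{1n} \\ \vdots & & \vdots \\ p_{m1} & \cdots & p_{mn} \end{pmatrix} = A D,
\qquad
D = \begin{pmatrix} P_1 & \cdots & P_n \end{pmatrix}
$$

Rank Theorem: $Q$ has rank 3

Proof: $A$ has size $2m \times 3$ and $D$ has size $3 \times n$, so the product $Q = AD$ has rank at most 3 (exactly 3 for non-degenerate cameras and points).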
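The step that removes the offsets can be written out explicitly. The sketch assumes that $p_0$ denotes a reference feature $P_0$ visible in every image, with image position $p_{i0}$ in camera $i$; using the centroid of all features as the reference works the same way:

$$
p_{ij} - p_{i0} = (A_i P_j + b_i) - (A_i P_0 + b_i) = A_i (P_j - P_0)
$$

Measuring image points relative to $p_{i0}$ and world points relative to $P_0$ therefore gives $p_{ij} = A_i P_j$ with no translation term.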


The Rank Theorem

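$$
Q =
\begin{pmatrix}
p^x_{11} & \cdots & p^x_{1n} \\
p^y_{11} & \cdots & p^y_{1n} \\
\vdots & & \vdots \\
p^x_{m1} & \cdots & p^x_{mn} \\
p^y_{m1} & \cdots & p^y_{mn}
\end{pmatrix}
\quad \text{has rank } 3
$$

$Q$ has $2m$ rows (an x and a y row per image) and $n$ columns (one per point).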
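The rank theorem is easy to verify on synthetic data. The sketch below uses random affine cameras and points (hypothetical sizes `m = 5`, `n = 12`) and takes the centroid of the points as the origin, which is one possible choice of reference point.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 5, 12                            # hypothetical numbers of cameras and points

A = rng.standard_normal((m, 2, 3))      # affine cameras A_i
b = rng.standard_normal((m, 2))         # offsets b_i
P = rng.standard_normal((3, n))         # points P_j as columns

# p_ij = A_i P_j + b_i, stacked into the 2m x n measurement matrix
# (rows: x of camera 1, y of camera 1, x of camera 2, ...).
Q_raw = (A @ P + b[:, :, None]).reshape(2 * m, n)

# Move the origin to the centroid by subtracting each row's mean.
Q = Q_raw - Q_raw.mean(axis=1, keepdims=True)

rank_raw = np.linalg.matrix_rank(Q_raw)
rank_centered = np.linalg.matrix_rank(Q)
print(rank_raw, rank_centered)
```

Before centering, the offsets $b_i$ contribute one extra dimension; after centering, only the three dimensions of the structure remain.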


Tomasi/Kanade 1992

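Singular Value Decomposition:

$$
Q =
\begin{pmatrix}
p^x_{11} & \cdots & p^x_{1n} \\
p^y_{11} & \cdots & p^y_{1n} \\
\vdots & & \vdots \\
p^x_{m1} & \cdots & p^x_{mn} \\
p^y_{m1} & \cdots & p^y_{mn}
\end{pmatrix}
= U W V^T
$$

with $U$ of size $2m \times 3$, $W$ of size $3 \times 3$, and $V^T$ of size $3 \times n$.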


Tomasi/Kanade 1992

$W V^T$: affine structure

$U$: affine camera positions

Gives also the optimal affine reconstruction under noise
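A minimal sketch of the factorization step with NumPy's SVD follows. The function name `affine_factorization` and the synthetic data are assumptions for illustration; the input is expected to be the already centered $2m \times n$ measurement matrix. Here "optimal" is meant in the least-squares (Frobenius norm) sense: the truncated SVD is the closest rank-3 matrix to the noisy measurements.

```python
import numpy as np

def affine_factorization(Q):
    """Rank-3 factorization Q ~ A_hat @ D_hat via truncated SVD.

    Q is the centered 2m x n measurement matrix. Returns A_hat (2m x 3, affine
    cameras) and D_hat (3 x n, affine structure). Both are defined only up to
    an invertible 3 x 3 matrix C.
    """
    U, w, Vt = np.linalg.svd(Q, full_matrices=False)
    A_hat = U[:, :3]                      # affine camera positions U
    D_hat = np.diag(w[:3]) @ Vt[:3, :]    # affine structure W V^T
    return A_hat, D_hat

# Synthetic, hypothetical data: 4 cameras, 10 points, small measurement noise.
rng = np.random.default_rng(2)
m, n = 4, 10
A_true = rng.standard_normal((2 * m, 3))
D_true = rng.standard_normal((3, n))
Q_clean = A_true @ D_true
Q_noisy = Q_clean + 0.01 * rng.standard_normal(Q_clean.shape)

A_hat, D_hat = affine_factorization(Q_noisy)
residual = np.linalg.norm(Q_noisy - A_hat @ D_hat)
print(residual)
```

The recovered `A_hat` and `D_hat` generally differ from `A_true` and `D_true` by an unknown invertible matrix; only their product is determined.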


Back To Orthographic Projection

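Constraints (on the rows $a_1$, $a_2$ of each camera matrix $A'_i$):

$$
a_1 \cdot a_2 = 0, \qquad |a_1|^2 = 1, \qquad |a_2|^2 = 1
$$

Find $C$ and $d$ for which the constraints are met, with

$$
p_{ij} = A'_i P'_j + b'_i
$$

$$
A'_i = A_i C, \qquad P'_j = C^{-1} P_j - C^{-1} d, \qquad b'_i = A_i d + b_i
$$

where $d$ is a vector and $C$ is a non-singular matrix.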
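One common way to find such a $C$ is to note that the constraints are linear in the symmetric matrix $L = C C^T$: for each affine camera $A_i$ with rows $a_1, a_2$, they read $a_1^T L a_1 = 1$, $a_2^T L a_2 = 1$, $a_1^T L a_2 = 0$. The sketch below solves for $L$ by least squares and then takes a Cholesky factor. It is an illustration under stated assumptions, not necessarily the procedure used in the lecture: the helper names `quad_row` and `metric_upgrade` are hypothetical, the data are synthetic and noise-free, and $L$ is assumed to come out positive definite.

```python
import numpy as np

def quad_row(u, v):
    """Coefficients of u^T L v in the 6 unknowns of a symmetric 3 x 3 matrix L."""
    return np.array([u[0] * v[0],
                     u[0] * v[1] + u[1] * v[0],
                     u[0] * v[2] + u[2] * v[0],
                     u[1] * v[1],
                     u[1] * v[2] + u[2] * v[1],
                     u[2] * v[2]])

def metric_upgrade(A_aff):
    """Find C so that every 2 x 3 block of A_aff @ C has orthonormal rows.

    A_aff has shape (m, 2, 3). Solves linearly for L = C C^T, then returns a
    Cholesky factor. Assumes L is positive definite (low-noise data).
    """
    rows, rhs = [], []
    for a1, a2 in A_aff:
        rows += [quad_row(a1, a1), quad_row(a2, a2), quad_row(a1, a2)]
        rhs += [1.0, 1.0, 0.0]
    l, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    L = np.array([[l[0], l[1], l[2]],
                  [l[1], l[3], l[4]],
                  [l[2], l[4], l[5]]])
    return np.linalg.cholesky(L)          # lower-triangular C with C C^T = L

# Synthetic test: orthographic cameras distorted by an "unknown" matrix G.
rng = np.random.default_rng(3)
m = 5
R = [np.linalg.qr(rng.standard_normal((3, 3)))[0] for _ in range(m)]
A_true = np.stack([Ri[:2, :] for Ri in R])      # rows are orthonormal
G = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0]])
A_aff = A_true @ np.linalg.inv(G)               # affine solution

C = metric_upgrade(A_aff)
A_metric = A_aff @ C
print(A_metric[0] @ A_metric[0].T)              # close to the 2 x 2 identity
```

The recovered $C$ is unique only up to a rotation, which corresponds to the free choice of world coordinate frame; the offset $d$ does not affect these constraints.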


Back To Projective Geometry

Orthographic (in the limit)

Projective


Projective Camera:

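$$
\sum_{i,j} \left[ \left( x_{ij} - \frac{m_i^1 \cdot P_j}{m_i^3 \cdot P_j} \right)^2 + \left( y_{ij} - \frac{m_i^2 \cdot P_j}{m_i^3 \cdot P_j} \right)^2 \right] \to 0
$$

where $m_i^1, m_i^2, m_i^3$ are the rows of the projection matrix of camera $i$ and $P_j$ is point $j$ in homogeneous coordinates.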

Non-Linear Optimization Problem: Bundle Adjustment!
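The quantity that bundle adjustment minimizes is the squared reprojection error under the projective camera model $x_{ij} = (m_i^1 \cdot P_j)/(m_i^3 \cdot P_j)$, $y_{ij} = (m_i^2 \cdot P_j)/(m_i^3 \cdot P_j)$, where $m_i^k$ is the $k$-th row of camera $i$'s projection matrix. The sketch below only evaluates this error; the function name, array layout, and synthetic cameras are assumptions, and an actual bundle adjustment would hand such a function to a non-linear least-squares solver.

```python
import numpy as np

def reprojection_error(M, P_h, x, y):
    """Sum of squared reprojection errors for projective cameras.

    M    : (m, 3, 4) projection matrices with rows m_i^1, m_i^2, m_i^3
    P_h  : (n, 4) points in homogeneous coordinates
    x, y : (m, n) measured image coordinates
    """
    proj = M @ P_h.T                      # (m, 3, n): entry [i, k, j] = m_i^k . P_j
    x_hat = proj[:, 0, :] / proj[:, 2, :]
    y_hat = proj[:, 1, :] / proj[:, 2, :]
    return np.sum((x - x_hat) ** 2 + (y - y_hat) ** 2)

# Synthetic, hypothetical setup: 3 cameras, 6 points in front of all cameras.
rng = np.random.default_rng(4)
m, n = 3, 6
P = rng.uniform(-1.0, 1.0, size=(n, 3))
P_h = np.hstack([P, np.ones((n, 1))])
M = rng.standard_normal((m, 3, 4))
M[:, 2, :] = [0.0, 0.0, 1.0, 5.0]         # depth = Z + 5 > 0 for every point

proj = M @ P_h.T
x = proj[:, 0, :] / proj[:, 2, :]
y = proj[:, 1, :] / proj[:, 2, :]

err_exact = reprojection_error(M, P_h, x, y)          # noise-free measurements
err_shifted = reprojection_error(M, P_h, x + 0.1, y)  # every x off by 0.1
print(err_exact, err_shifted)
```

Unlike the affine case, the division by $m_i^3 \cdot P_j$ makes the error non-linear in the unknowns, which is why no closed-form factorization applies.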



Problem 1:
– Given n points $p_{ij} = (x_{ij}, y_{ij})$ in m images (camera i, feature j)
– Reconstruct structure: 3-D locations $P_j = (x_j, y_j, z_j)$
– Reconstruct camera positions (extrinsics): $M_i = (A_i, b_i)$

Problem 2:
– Establish correspondence: $c(p_{ij})$


The Correspondence Problem

[Figure: View 1, View 2, View 3]


Correspondence: Solution 1

Track features (e.g., optical flow)

…but fails when images taken from widely different poses


Correspondence: Solution 2

– Start with random solution A, b, P
– Compute soft correspondence: p(c | A, b, P)
– Plug soft correspondence into SFM
– Reiterate
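The soft-correspondence step can be illustrated for a single image. The sketch below assumes isotropic Gaussian measurement noise with a hypothetical standard deviation `sigma` and treats each measurement independently; it is a simplification for illustration, not the procedure from the cited paper.

```python
import numpy as np

def soft_correspondence(measured, predicted, sigma=1.0):
    """Posterior p(c | A, b, P) for one image under an assumed Gaussian noise model.

    measured  : (k, 2) detected feature positions
    predicted : (n, 2) projections A P_j + b of the current structure estimate
    Returns a (k, n) matrix; row r holds the probabilities that measurement r
    corresponds to each point j, and every row sums to 1.
    """
    diff = measured[:, None, :] - predicted[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    log_w = -0.5 * sq_dist / sigma ** 2
    log_w -= log_w.max(axis=1, keepdims=True)   # subtract row max for stability
    w = np.exp(log_w)
    return w / w.sum(axis=1, keepdims=True)

# Hypothetical example: three predicted points, two measurements.
predicted = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
measured = np.array([[0.2, -0.1], [9.7, 0.3]])

resp = soft_correspondence(measured, predicted, sigma=1.0)
print(resp.argmax(axis=1))   # most likely point index for each measurement
```

In an EM-style loop, these weights would replace hard assignments when the structure and cameras are re-estimated, and the two steps would alternate until the solution stops changing.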

See Dellaert/Seitz/Thorpe/Thrun 2003


Example


Results: Cube


Animation


Tomasi’s Benchmark Problem


Reconstruction with EM


3-D Structure
