Transcript
Page 1

Multi-View Stereo for Static and Dynamic Scenes

Wolfgang Burgard
Jan 6, 2010

● Main references
● Yasutaka Furukawa and Jean Ponce, Accurate, Dense and Robust Multi-View Stereopsis, 2007
● C.L. Zitnick, S.B. Kang, M. Uyttendaele, S. Winder, and R. Szeliski, High-quality Video View Interpolation using a Layered Representation, 2004

Page 2

Stereo Reconstruction - Static Scene

● Settings
● Two images (2D) of the same scene
● Static: Scene hasn't changed across images
● Acquisition from different viewpoints
● Camera parameters known / estimated (Zhang)

● Goal
● Reconstruct geometry (3D) of objects in the scene

...

Page 3

Reconstruction so far

● Visual hull
● Silhouette based: intersection of cones from silhouette back projection
● Image based: project entrance/exit intervals in reference images onto viewing rays
● Problems: concave surfaces are not recovered; the image-based variant is view-dependent

Page 4

Epipolar Geometry

● Approach
● Identify correspondences between images ( correspondence problem )
● Triangulation: rays through corresponding pixels meet at the scene point

depth = ( b · focal_length ) / disparity, with disparity = x_l − x_r,
where x_l and x_r are the x-coordinates of the corresponding pixels in the left and right image

[Figure: stereo geometry — scene point, image planes, optical centers, epipolar line, baseline of length b, depth]
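A minimal Python sketch of the depth-from-disparity relation above, assuming rectified images so that disparity is purely horizontal; the numeric values in the example call are illustrative only.

```python
def depth_from_disparity(x_left, x_right, baseline, focal_length):
    """Depth of a scene point from the disparity of its two projections.
    Assumes rectified images: corresponding pixels lie on the same row."""
    disparity = x_left - x_right               # in pixels
    return baseline * focal_length / disparity

# Illustrative values: 10 cm baseline, 700 px focal length, 35 px disparity -> 2.0 m
print(depth_from_disparity(412.0, 377.0, baseline=0.10, focal_length=700.0))
```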

Page 5

Static Scene Approach Overview

● Idea:
● Correspondences of pixels within a local area constrain each other (local photometric consistency)
● Geometry as patch set
● Global visibility constraints

● Major steps
● Match features ➔ sparse surface patches
● Expand to nearby pixels ➔ dense set of patches
● Filter out incorrect patches ( visibility )
● Patch model -> mesh

Page 6

Initial Patch Set (Match step) 1/2

● Divide the image into 32x32 pixel cells
● Extract features in each cell
● Blobs ( DoG operator )
● Corners ( Harris operator )
● Uniform coverage: 4 local maxima with the strongest response of each operator (see the sketch below)
● Triangulation with feature pairs (f,f') => 3D points
● Tolerance of 2 pixels from the epipolar line
● Consider matches (f,f') only if of the same type
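A small numpy sketch of the uniform-coverage step above; `response` is assumed to be a precomputed Harris or DoG response map (not given on the slide), and a full implementation would additionally require the selected pixels to be local maxima of that response.

```python
import numpy as np

def strongest_per_cell(response, cell=32, k=4):
    """Return the k pixels with the strongest detector response in every
    cell x cell block, so that features cover the image uniformly."""
    h, w = response.shape
    features = []
    for y0 in range(0, h, cell):
        for x0 in range(0, w, cell):
            block = response[y0:y0 + cell, x0:x0 + cell]
            # flat indices of the k largest responses inside this block
            best = np.argsort(block, axis=None)[::-1][:k]
            for idx in best:
                dy, dx = np.unravel_index(idx, block.shape)
                features.append((y0 + dy, x0 + dx, float(block[dy, dx])))
    return features
```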

Page 7

Initial Patch Set (Match step) 2/2

● Many 3D points for a feature f. Which one is optimal?
The one nearest to the optical center O that is photoconsistent:
● Patch candidate p
– Center c(p): 3D point
– Extent of p: projection into the image as a 5x5 (7x7) axis-aligned square
● Optimization
– Maximal photometric consistency
– Refine parameters: c(p) and n(p)
● p photoconsistent with 2-3 images
➔ Accept, otherwise try the next 3D point (see the sketch below)

[Figure: candidate 3D points along the ray from the optical center O — consistent and inconsistent patches with their normals]
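A sketch of the candidate-selection loop described on this slide; `count_consistent_views` is a hypothetical stand-in for the patch optimization and NCC test of the following slide, not a function defined by the authors.

```python
import math

def first_consistent_candidate(candidates, optical_center, count_consistent_views, min_views=2):
    """Walk the candidate 3D points for a feature in order of increasing
    distance to the optical center O and accept the first candidate that
    is photoconsistent in at least `min_views` images."""
    for point in sorted(candidates, key=lambda p: math.dist(p, optical_center)):
        if count_consistent_views(point) >= min_views:
            return point
    return None   # no candidate survives -> the feature is discarded
```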

Page 8

Photometric Consistency

● Is p photoconsistent with image I ?
● Normalized cross correlation (NCC)
– Similarity measure independent of brightness and contrast
– Similarity of p in I with p in the reference image
● NCC > threshold => photoconsistent (see the sketch below)
● Correspondence problem solved not pointwise but for an area
● Local surface area considered perspectively
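A minimal numpy sketch of the NCC test; the 0.7 threshold is an illustrative assumption, not a value taken from the slides.

```python
import numpy as np

def ncc(patch_a, patch_b):
    """Normalized cross correlation of two equally sized pixel patches.
    Subtracting the mean and dividing by the norms makes the score
    insensitive to brightness and contrast changes."""
    a = np.asarray(patch_a, dtype=float).ravel()
    b = np.asarray(patch_b, dtype=float).ravel()
    a -= a.mean()
    b -= b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b) / denom if denom > 0 else 0.0

def photoconsistent(patch_in_image, patch_in_reference, threshold=0.7):
    return ncc(patch_in_image, patch_in_reference) > threshold
```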


Page 9

Expand 1/2

● Initial patches too sparse
● For every patch p, add patches p' to neighbouring cells
● Conditions for adding
● No visible patch in the cell
● No should-be-visible patch n-adjacent in the cell

[Figure: patch p and p's neighbour — n-adjacency requires close patch centers and similar normals]

Page 10

Expand 2/2

● Initial parameters of the new patch p'
● n(p') = n(p)
● c(p'): intersection of the viewing ray through cell (i',j') with the plane of p (see the sketch below)
● Optimization
● Maximal photometric consistency
● Refines c(p') and n(p')
● Accept if photometrically consistent in 2-3 images

[Figure: the new neighbour patch p' lies on the plane of p]
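A numpy sketch of the initialization above: the new patch inherits the normal of p, and its center is the intersection of the viewing ray through the neighbouring cell with the plane of p. The ray parameterization (origin plus direction) is an assumption made for the illustration.

```python
import numpy as np

def init_expanded_patch(c_p, n_p, ray_origin, ray_dir):
    """Initial parameters of the new patch p': n(p') = n(p), and c(p') is
    the intersection of the viewing ray with the plane of p."""
    c_p, n_p = np.asarray(c_p, float), np.asarray(n_p, float)
    o, d = np.asarray(ray_origin, float), np.asarray(ray_dir, float)
    denom = n_p @ d
    if abs(denom) < 1e-9:            # ray (nearly) parallel to the plane of p
        return None
    t = n_p @ (c_p - o) / denom      # plane of p: n(p) . (X - c(p)) = 0
    c_p_prime = o + t * d
    return c_p_prime, n_p.copy()     # both are refined afterwards by the optimization
```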

Page 11

Filter 1/2

● Remove patches outside of the real surface
● Caused e.g. by obstacles
● Condition to remove p:
– p's photometric consistency < photometric consistency of the patches hidden by p
– Intuition:
● Projected outliers are visible in fewer images than real surface patches

[Figure: an outlier seen in one image conflicts with 2 patches seen in other images]

Page 12

Filter 2/2

● Remove patches inside the real surface
● Caused by the iterative scheme ( expand )
– A newly added patch occludes an "inside" patch
● Not visible in 2-3 images => remove p (see the sketch below)
● Visibility here defined via depth values

[Figure: outlier only visible in 1 image]
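A minimal sketch of this second filter pass; `visible_image_count` stands in for the depth-based visibility test, which is not spelled out on the slide.

```python
def filter_hidden_patches(patches, visible_image_count, min_views=2):
    """Drop patches that, given the current depth maps, are visible in
    fewer than `min_views` images (e.g. patches trapped inside the surface)."""
    return [p for p in patches if visible_image_count(p) >= min_views]
```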

Page 13

Mesh from Patches 1/2

● Match once (initial patches)
● Expand and Filter steps until convergence (dense patches)
● Mesh from patches

[Figure: initial bounding volume mesh → move vertices by forces → remesh]

Page 14

Mesh from Patches 2/2

● Forces moving vertices (see the sketch below)
– Smoothness - regularization ( rigidity of the mesh )
– Photometric consistency
● Initial phase:
– Move towards photoconsistent patches
● Later phases:
– Create a patch at each vertex
– Optimize the patch
– Photoconsistency force: c(p) - c*(p)
– Rim consistency - pull the mesh towards the visual cone
● projected surface silhouette ~ silhouette in the image
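A schematic numpy sketch of a single vertex update driven by the two main forces; rim consistency is omitted and the weights and step size are illustrative assumptions.

```python
import numpy as np

def update_vertex(v, neighbour_vertices, photoconsistent_target,
                  w_smooth=0.5, w_photo=0.5, step=0.2):
    """Move a mesh vertex by a smoothness force (towards the centroid of
    its neighbours, a simple regularizer) and a photoconsistency force
    (towards the photoconsistent position c*(p) found at this vertex)."""
    v = np.asarray(v, dtype=float)
    centroid = np.mean(np.asarray(neighbour_vertices, dtype=float), axis=0)
    f_smooth = centroid - v
    f_photo = np.asarray(photoconsistent_target, dtype=float) - v
    return v + step * (w_smooth * f_smooth + w_photo * f_photo)
```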

Page 15

Results

● Computation time: minutes to hours

Page 16

Multi-View Stereo for Dynamic Scenes

● Input
● Image sequences (video) of a dynamic scene
● Each sequence captures the scene over time
● Goal:
● Variable viewpoints in the video ( using geometric data )
● Additional challenges:
● Object movement
● Object deformation
● Much data ( input and output )

[Figure: a new viewpoint between the recording cameras, looking at the scene]

Page 17

Viewpoint Manipulation Approaches 1/2

● Geometry-less approaches
● Jump between still cameras
– No software interpolation
– Problem: jumping artifacts

[Figure: still cameras along a trajectory around the scene]


Page 20

Viewpoint Manipulation Approaches 2/2

● 'Light-field Rendering' for dynamic scenes
– Problem: requires many videos
● Homography
– Problem:
● No parallax effect, i.e. foreground and background objects move with the same velocity, independent of depth

[Figure: novel view obtained by projecting & blending the camera images]

Page 21

Overview

Offline: synchronized videos + camera parameters → 3D reconstruction with image segmentation → matte extraction ( depth discontinuity artifacts ) → compression

Interactive: rendering using the temporal two-layered compressed representation

Page 22

Recording Setup

● 8 synchronized cameras
● 1024x768 images @ 15 fps
● Possible extension: 2D or 360° setups
– Not trivial, e.g. cameras appear in the image
● Zhang's algorithm to estimate camera parameters

[Figure: basic camera setup (30°) and a 2D camera setup]

Page 23

Stereo Reconstruction 1/2

● Traditional 3D stereo reconstruction
● Errors around disparity discontinuities
=> noticeable visual artifacts at intensity edges

● Color segmentation-based stereo algorithm
● Segment the image
– Similar disparity within each segment => no artifacts
1. Smooth & reduce noise
2. Merge segments (initially: each pixel) if average colors are similar (see the sketch below)
3. Split/merge too large/small segments

[Figure: image divided into color segments]
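A minimal numpy sketch of the merge criterion in step 2: two neighbouring segments are merged when their average colors are close. The Euclidean color distance and the threshold parameter are assumptions for illustration.

```python
import numpy as np

def should_merge(avg_color_a, avg_color_b, threshold):
    """Merge two neighbouring segments when their average colors are similar."""
    diff = np.asarray(avg_color_a, float) - np.asarray(avg_color_b, float)
    return np.linalg.norm(diff) < threshold

def merged_average(avg_a, size_a, avg_b, size_b):
    """Average color of the merged segment, weighted by segment size."""
    return (np.asarray(avg_a, float) * size_a + np.asarray(avg_b, float) * size_b) / (size_a + size_b)
```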

Page 24

Stereo Reconstruction 2/2

● Initial disparity in each segment
– Photometric conditions
– Constant disparity
● Refine disparity
– Relax the constant-disparity assumption
– Average across images
– Average between segments
– Smoothness within each segment

Page 25

Boundary Matting 1/4

● Problem:
At depth discontinuities, foreground pixels contain background color

[Figure: boundary pixels mix foreground and background colors — e.g. hairs of the foreground object pick up blue from the background]

Page 26

Boundary Matting 2/4

[Figure: novel view synthesized between the left and right camera images]

Page 27

Boundary Matting 3/4

● Matting at disparity discontinuities
● Extract foreground and background colors + alpha
● Two-layered representation
● Main layer

[Figure: main layer colors and main layer depth]

Page 28

Boundary Matting 4/4

● Matting at disparity discontinuities
● Extract foreground and background colors + alpha
● Two-layered representation
● Boundary layer (see the sketch below)

[Figure: boundary layer colors, boundary depth and boundary alpha]
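A small sketch of the per-view two-layer representation built up over these two slides; the field names are assumptions, not the authors' notation.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class TwoLayerView:
    """Per camera and per frame: a main layer (color + depth for every
    pixel) and a boundary layer around depth discontinuities
    (matted foreground color, depth and alpha)."""
    main_color: np.ndarray        # H x W x 3
    main_depth: np.ndarray        # H x W
    boundary_color: np.ndarray    # H x W x 3, only meaningful near discontinuities
    boundary_depth: np.ndarray    # H x W
    boundary_alpha: np.ndarray    # H x W, matte in [0, 1]
```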

Page 29

Rendering 1/2

● Steps
1. Select the 2 nearest cameras
2. Render into 2 separate buffers ( 1 for each camera )
● Project main & boundary layer into the view
● Depth map => 3D mesh
● Remove triangles across depth discontinuities from the main layer; use the boundary mesh there instead

[Figure: main mesh and boundary mesh at a depth discontinuity]

Page 30

Rendering 2/2

3. Blend the buffers
– Pixels with different depths => use the frontmost
– Similar depths => average, weighted by each camera's distance to the view (see the sketch below)

[Figure: right camera nearer => more influence of its pixel colors]
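A numpy sketch of the per-pixel blend in step 3, assuming the two warped buffers store color (H x W x 3) and depth (H x W) and that `w_left`, `w_right` are weights derived from each camera's distance to the novel viewpoint; the depth-similarity threshold is an assumption.

```python
import numpy as np

def blend_buffers(color_l, depth_l, color_r, depth_r, w_left, w_right, depth_eps=0.01):
    """Where the two depths differ, keep the frontmost pixel; where they
    are similar, average the colors with the view-distance weights."""
    similar = np.abs(depth_l - depth_r) < depth_eps
    frontmost = np.where((depth_l < depth_r)[..., None], color_l, color_r)
    weighted = (w_left * color_l + w_right * color_r) / (w_left + w_right)
    return np.where(similar[..., None], weighted, frontmost)
```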

Page 31

Results

● Interactive rendering
● 1024x768 @ 5 fps
● 512x384 @ 10 fps
● ATI 9800 PRO
● If all images are in GPU memory: 1024x768 @ 30 fps

Page 32

Summary

● Static scene approach
● Stereo reconstruction with features
● Photoconsistent patches
● Geometry: complete 3D patch & mesh model

● Dynamic scene approach
● Stereo reconstruction with segmented images
● Geometry:
– 2 layers: main layer + boundary layer ( matting )
– 3D mesh for rendering

Page 33

Discussion

