Pixelwise View Selection for Unstructured Multi-View Stereo

Johannes L. Schönberger¹, Enliang Zheng², Marc Pollefeys¹,³, Jan-Michael Frahm²
¹ ETH Zürich, ² UNC Chapel Hill, ³ Microsoft

https://colmap.github.io
Source Code | Documentation | Tutorial | Examples

Overview

This work presents an open source Multi-View Stereo system for robust and efficient dense modeling from unstructured image collections. Experiments on benchmarks and large-scale Internet photo collections demonstrate state-of-the-art performance in terms of accuracy, completeness, and efficiency.

Contributions

• Joint depth-normal-occlusion inference embedded in an improved PatchMatch sampling scheme
• Pixelwise view selection using photometric and geometric priors
• Multi-view geometric consistency for simultaneous refinement and image-based fusion
• Graph-based filtering and fusion of depth and normal maps

Joint Depth-Normal-Occlusion Inference

• Joint likelihood function $P(X, Z, \theta, N)$ over images $X$, occlusion $Z$, depth $\theta$, and normals $N$
• Generalized Expectation Maximization (GEM)
• E-Step: Infer $Z$ using variational inference
• M-Step: Infer $\theta, N$ using PatchMatch sampling

Pixelwise View Selection

[Figure: a reference patch in the reference camera and the corresponding source patches in two source cameras, comparing traditional and pixelwise view selection.]

• Occlusion prior

$$P(X_l^m \mid Z_l^m, \theta_l) = \begin{cases} \dfrac{1}{NA} \exp\left(-\dfrac{\left(1 - \rho_l^m(\theta_l)\right)^2}{2\sigma_\rho^2}\right) & \text{if } Z_l^m = 1 \\[2ex] \dfrac{1}{N}\,\mathcal{U} & \text{if } Z_l^m = 0 \end{cases}$$

[Plot: Occlusion Prior over the NCC score (from -1 to 1) for $\sigma_\rho = 0.3, 0.6, 0.9$.]

• Triangulation prior

$$P(\alpha_l^m) = 1 - \frac{\left(\min(\bar{\alpha}, \alpha_l^m) - \bar{\alpha}\right)^2}{\bar{\alpha}^2}$$

[Plot: Triangulation Prior over the triangulation angle (0° to 30°) for $\bar{\alpha} = 2°, 5°, 10°, 15°$.]

• Resolution prior

$$P(\beta_l^m) = \min\left(\beta_l^m, (\beta_l^m)^{-1}\right)$$

[Plot: Resolution Prior over the relative resolution $b_l^m$ (0 to 5) with $b_l = 1$.]

• Incident prior

$$P(\kappa_l^m) = \exp\left(-\frac{(\kappa_l^m)^2}{2\sigma_\kappa^2}\right)$$

[Plot: Incident Prior over the incident angle (0° to 150°) for $\sigma_\kappa = 15°, 30°, 45°$.]

Multi-View Geometric Consistency

Cost function, combining a photometric term $1 - \rho_l^m$ and a geometric term $\eta \min(\psi_l^m, \psi_{\max})$:

$$\xi_l^m = 1 - \rho_l^m + \eta \min(\psi_l^m, \psi_{\max})$$

Optimization:

$$\operatorname*{argmin}_{\theta_l^*,\, n_l^*} \; \frac{1}{|S|} \sum_{m \in S} \xi_l^m(\theta_l^*, n_l^*)$$

Filtering and Fusion

[Figure: filtering and fusion of depth and normal maps based on $\psi_l^m$.]

Results

[Figure: result panels labeled Zheng et al., Photometric, Photometric + Geometric, Filtered, and Normals.]
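The triangulation, resolution, and incident priors and the photometric-plus-geometric cost function given on this poster can be evaluated directly from their formulas. The following is a minimal Python sketch and not the COLMAP implementation. All function names are hypothetical. The defaults for ᾱ, σκ, and σρ are simply values taken from the plotted curves, and the defaults for η and ψ_max are placeholder assumptions, since the poster does not state them.

```python
import math


def triangulation_prior(alpha_deg, alpha_bar_deg=10.0):
    """P(alpha) = 1 - (min(alpha_bar, alpha) - alpha_bar)^2 / alpha_bar^2."""
    d = min(alpha_bar_deg, alpha_deg) - alpha_bar_deg
    return 1.0 - (d * d) / (alpha_bar_deg * alpha_bar_deg)


def resolution_prior(beta):
    """P(beta) = min(beta, 1 / beta) for a relative resolution beta > 0."""
    return min(beta, 1.0 / beta)


def incident_prior(kappa_deg, sigma_kappa_deg=45.0):
    """P(kappa) = exp(-kappa^2 / (2 * sigma_kappa^2))."""
    return math.exp(-(kappa_deg ** 2) / (2.0 * sigma_kappa_deg ** 2))


def photometric_likelihood(rho, sigma_rho=0.6):
    """Visible case (Z = 1) without the 1 / (N A) normalization factor."""
    return math.exp(-((1.0 - rho) ** 2) / (2.0 * sigma_rho ** 2))


def cost(rho, psi, eta=0.5, psi_max=3.0):
    """xi = 1 - rho + eta * min(psi, psi_max); eta and psi_max are assumed."""
    return 1.0 - rho + eta * min(psi, psi_max)


def mean_cost(views, eta=0.5, psi_max=3.0):
    """Average xi over the selected views S, given as (rho, psi) pairs."""
    return sum(cost(rho, psi, eta, psi_max) for rho, psi in views) / len(views)


def best_hypothesis(candidates, eta=0.5, psi_max=3.0):
    """Return the key of the candidate with the lowest mean cost.

    `candidates` maps a hypothesis label (for example a (depth, normal)
    pair) to the list of (rho, psi) pairs measured under that hypothesis.
    """
    return min(candidates, key=lambda h: mean_cost(candidates[h], eta, psi_max))
```

In this sketch, `best_hypothesis` only illustrates the argmin over candidate depth and normal hypotheses for a single pixel. How candidates are generated and propagated (the PatchMatch sampling) and how the view set S is chosen from the priors are outside its scope.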