Stereo Video


A. Temporally Consistent Disparity Maps from Uncalibrated Stereo Videos
Michael Bleyer and Margrit Gelautz
International Symposium on Image and Signal Processing and Analysis (ISPA) 2009

B. Real-time Spatiotemporal Stereo Matching Using the Dual-Cross-Bilateral Grid
Christian Richardt, Douglas Orr, Ian Davies, Antonio Criminisi, and Neil A. Dodgson
The European Conference on Computer Vision (ECCV) 2010

C. Temporally Consistent Disparity and Optical Flow via Efficient Spatio-temporal Filtering
Asmaa Hosni, Christoph Rhemann, Michael Bleyer, and Margrit Gelautz
The Pacific-Rim Symposium on Image and Video Technology (PSIVT) 2011

D. Efficient Spatio-temporal Local Stereo Matching Using Information Permeability Filtering
Cuong Cao Pham, Vinh Dinh Nguyen, and Jae Wook Jeon
International Conference on Image Processing (ICIP) 2012

Outline
Introduction
Related Works
Methods and Results
  A. Median Filter
  B. Temporal DCB Grid
  C. Spatial-temporal Weighted Smoothing
  D. Three-pass Aggregation
Comparison
Conclusion

Introduction
Conventional stereo matching focuses only on static image pairs.
Conventional methods estimate disparities using spatial and color information.

The main problem when extending stereo matching to video is flickering.
Solution:
  Build on local methods (for real-time performance).
  Enforce temporal consistency (to suppress flickering).


Related Works

Related Works: About Local Methods
The key to a local method lies in the cost aggregation step: aggregate the cost data from neighboring pixels within a finite-size window.
The best-known approaches are edge-preserving algorithms:
  Adaptive support weight
  Geodesic diffusion
  Bilateral filter
  Guided filter

Related Works: Single-frame stereo matching

Related Works: Spatio-temporal stereo matching
The disparity difference between two successive frames is minimized to enforce temporal consistency.

Methods and Results

A. Median filter


Computing one disparity map takes 1 second, but video runs at 30 to 60 frames per second.
=> Cannot achieve real-time performance.
No data or comparison is provided.
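The temporal median idea can be sketched in a few lines. This is a minimal illustration, assuming the disparity maps of frames t-1, t, and t+1 are already aligned pixel to pixel (the motion compensation step is left out); the function name `temporal_median` is hypothetical.

```python
import numpy as np

def temporal_median(disparities, t):
    """Per-pixel median over frames t-1, t, t+1 (clamped at sequence ends).

    Assumes the maps are already aligned, i.e. no motion between frames.
    """
    lo = max(t - 1, 0)
    hi = min(t + 1, len(disparities) - 1)
    window = np.stack(disparities[lo:hi + 1], axis=0)
    return np.median(window, axis=0)

# Three 2x2 disparity maps; the middle frame has one flickering outlier.
frames = [np.full((2, 2), 10.0) for _ in range(3)]
frames[1][0, 0] = 25.0
smoothed = temporal_median(frames, 1)
```

The outlier in the middle frame is replaced by the value its temporal neighbors agree on, which is the flicker suppression the method relies on.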


B. Temporal DCB Grid
Bilateral Grid
It runs faster and uses less memory as σ increases.

Dual-Cross-Bilateral Grid


Dichromatic DCB Grid

Comparison (fps)
200x

Temporal DCB Grid

Last n = 5 frames, each weighted by w_i (i = 0: current frame, i = 1: previous frame).

Weighted Sum
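A minimal sketch of such a weighted sum over the last n frames follows. The Gaussian falloff used for w_i and the value of `sigma` are assumptions for illustration, and the arrays stand in for whatever per-frame data is being blended; the function name is hypothetical.

```python
import numpy as np

def temporal_weighted_sum(history, sigma=2.0):
    """Blend the last n frames; history[0] is the current frame (i = 0),
    history[1] the previous frame (i = 1), and so on."""
    i = np.arange(len(history))
    w = np.exp(-(i ** 2) / (2.0 * sigma ** 2))  # assumed Gaussian falloff
    w /= w.sum()                                # weights sum to one
    stack = np.stack(history, axis=0)           # shape (n, H, W)
    return np.tensordot(w, stack, axes=1)       # weighted sum over frames

history = [np.full((2, 3), 3.0) for _ in range(5)]
blended = temporal_weighted_sum(history)
```

Because the weights are normalized, a static scene is left unchanged, while older frames contribute less as i grows.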


16 fps / 14 fps

Source data

Uses only intensity information.
Only near-real-time.

C. Spatial-temporal Weighted Smoothing
Cost initialization: construct a spatio-temporal cost volume for each disparity d.
Cost aggregation: smooth the cost volume with a spatio-temporal filter (guided filter [1]).
Disparity computation: select the lowest-cost disparity at each pixel (winner-take-all, WTA).
Refinement: weighted median filter.

[1] Rhemann, C., Hosni, A., Bleyer, M., Rother, C., Gelautz, M.: Fast Cost-Volume Filtering for Visual Correspondence and Beyond. In: CVPR (2011) and PAMI (2013)

Cost initialization

Cost aggregation


w_k: window of size w_x × w_y × w_t
ε: smoothness parameter
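For reference, the guided-filter weights and the aggregation step are commonly written as below. The notation is an assumption based on the standard guided-filter formulation, not a copy of the slide: i and j index voxels (x, y, t), I is the guidance color video, μ_k and Σ_k are the mean and covariance of I inside window w_k, |w| is the number of voxels in a window, U is the identity matrix, and ε is the smoothness parameter.

```latex
W_{i,j} = \frac{1}{|w|^{2}} \sum_{k:(i,j)\in w_k}
  \left( 1 + (I_i - \mu_k)^{T} (\Sigma_k + \epsilon U)^{-1} (I_j - \mu_k) \right)

C'_{i,d} = \sum_{j} W_{i,j}(I)\, C_{j,d}
```

A larger ε flattens the weights toward a plain box average, while a small ε keeps the aggregation from crossing color edges.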

The guided filter weights can be implemented by a sequence of linear operations.

All summations are 3D box filters and can be computed in O(N) time.
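The O(N) claim follows from computing each box sum with running (prefix) sums, so the cost per voxel does not depend on the window size. A minimal sketch, assuming a volume laid out as (t, y, x) and windows clamped at the borders; the function names are hypothetical.

```python
import numpy as np

def box_sum_1d(a, r, axis):
    """Sum over a window of radius r along one axis, clamped at the borders."""
    a = np.moveaxis(a, axis, 0)
    n = a.shape[0]
    # Prefix sums with a leading zero, so a window sum is c[hi] - c[lo].
    c = np.concatenate([np.zeros((1,) + a.shape[1:]), np.cumsum(a, axis=0)])
    idx = np.arange(n)
    lo = np.clip(idx - r, 0, n)
    hi = np.clip(idx + r + 1, 0, n)
    return np.moveaxis(c[hi] - c[lo], 0, axis)

def box_sum_3d(volume, rx, ry, rt):
    """Separable 3D box sum over a (t, y, x) volume."""
    out = box_sum_1d(volume, rt, 0)
    out = box_sum_1d(out, ry, 1)
    return box_sum_1d(out, rx, 2)

volume = np.ones((3, 4, 5))
sums = box_sum_3d(volume, 1, 1, 1)
```

On a volume of ones, each output value is simply the number of voxels inside the clamped window, which makes the result easy to check by hand.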

Disparity computation: winner-take-all

Refinement: weighted median filter

=> The refinement only reduces single-frame errors.


Temporal vs. frame-by-frame processing. 2nd row: disparity maps computed by a frame-by-frame implementation show flickering artifacts. 3rd row: the proposed method exploits temporal information and thus removes most artifacts.
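The cost initialization and winner-take-all steps of method C can be sketched for a single frame pair as follows. The absolute-difference cost and the function names are assumptions for illustration; in the full method the per-frame cost volumes are stacked over time and filtered before the WTA step.

```python
import numpy as np

def cost_volume(left, right, max_disp):
    """Absolute-difference matching cost, shape (max_disp + 1, H, W)."""
    h, w = left.shape
    cost = np.full((max_disp + 1, h, w), np.inf)
    for d in range(max_disp + 1):
        # Left pixel x is compared with right pixel x - d.
        cost[d, :, d:] = np.abs(left[:, d:] - right[:, :w - d])
    return cost

def winner_take_all(cost):
    """Pick the disparity with the lowest cost at every pixel."""
    return np.argmin(cost, axis=0)

# Synthetic pair: the left view is the right view shifted by 2 pixels.
right = np.arange(4 * 8, dtype=float).reshape(4, 8)
left = np.roll(right, 2, axis=1)
disparity = winner_take_all(cost_volume(left, right, max_disp=3))
```

Columns x >= 2 of the synthetic pair recover the shift of 2; the first two columns have no valid match because of the wrap-around in `np.roll`.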




D. Three-pass cost aggregation
A three-pass cost aggregation technique based on information permeability (adaptive support-weight [2]).

[2] Yoon, K.J., Kweon, I.S.: Locally Adaptive Support-Weight Approach for Visual Correspondence Search. In: CVPR (2005)


Frames i-1, i, and i+1

Matching cost initialization

v = (x, y, t) represents the spatial and temporal positions of a voxel.

Similarity (weight) function


This shows the effectiveness of using temporal information in addition to spatial information.

Spatial aggregation: horizontal, then vertical


Temporal aggregation: forward and backward
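The three passes (horizontal, vertical, temporal) all apply the same recursive forward and backward update along a different axis. Below is a minimal sketch for one disparity slice stored as (t, y, x). The exponential permeability weight, the value of `sigma`, the use of a grayscale guide, and the way the two directions are combined are assumptions; the function names are hypothetical.

```python
import numpy as np

def permeability_pass(cost, guide, axis, sigma=10.0):
    """Forward and backward recursive aggregation of `cost` along one axis."""
    c = np.moveaxis(cost, axis, 0).astype(float)
    g = np.moveaxis(guide, axis, 0).astype(float)
    n = c.shape[0]
    # mu[i]: permeability between positions i and i + 1 (assumed form).
    mu = np.exp(-np.abs(g[1:] - g[:-1]) / sigma)
    fwd = c.copy()
    for i in range(1, n):
        fwd[i] += mu[i - 1] * fwd[i - 1]   # 1 multiplication, 1 addition
    bwd = c.copy()
    for i in range(n - 2, -1, -1):
        bwd[i] += mu[i] * bwd[i + 1]       # 1 multiplication, 1 addition
    return np.moveaxis(fwd + bwd, 0, axis)  # 1 addition

def three_pass(cost, guide, sigma=10.0):
    """Horizontal, then vertical, then temporal aggregation."""
    out = permeability_pass(cost, guide, axis=2, sigma=sigma)
    out = permeability_pass(out, guide, axis=1, sigma=sigma)
    return permeability_pass(out, guide, axis=0, sigma=sigma)

flat = three_pass(np.ones((2, 3, 4)), np.zeros((2, 3, 4)))
edge_guide = np.array([[[0.0, 0.0, 1000.0, 1000.0]]])
blocked = permeability_pass(np.ones((1, 1, 4)), edge_guide, axis=2)
```

With a constant guide every weight is 1 and cost flows freely, whereas the strong edge in `edge_guide` drives the weight toward 0, so cost is only shared within each side of the edge.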

Disparity computation: WTA
Refinement: consistency check and 3×3 median filter.
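The two refinement steps can be sketched as follows, assuming integer disparity maps for both views and a tolerance of one disparity level. How inconsistent pixels are filled afterwards is not covered here; the function names and the `tol` parameter are hypothetical.

```python
import numpy as np

def consistency_mask(disp_left, disp_right, tol=1):
    """True where left and right integer disparity maps agree within tol."""
    h, w = disp_left.shape
    xs = np.tile(np.arange(w), (h, 1))
    ys = np.tile(np.arange(h)[:, None], (1, w))
    xr = xs - disp_left                   # matching column in the right view
    inside = (xr >= 0) & (xr < w)
    xr = np.clip(xr, 0, w - 1)
    return inside & (np.abs(disp_left - disp_right[ys, xr]) <= tol)

def median3x3(disp):
    """3x3 median filter with edge replication."""
    p = np.pad(disp, 1, mode="edge")
    h, w = disp.shape
    shifts = [p[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    return np.median(np.stack(shifts, axis=0), axis=0)

disp_l = np.full((2, 5), 1)
disp_r = np.full((2, 5), 1)
disp_l[0, 3] = 3                          # one inconsistent pixel
mask = consistency_mask(disp_l, disp_r)

spike = np.zeros((3, 3))
spike[1, 1] = 9.0
cleaned = median3x3(spike)
```

The mask flags both the mismatched pixel and the first column, whose match would fall outside the right image; the median filter removes the isolated spike.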


Computational complexity
Only six multiplications and nine additions per voxel.
It is still more efficient than the adaptive support-weight approach.

Without motion estimation


Comparison

Method  Approach                        Reference frames
A.      Optical flow + median filter    3 frames (-1 to 1)
B.      Weighted last 5 frames          5 frames (-4 to 0)
C.      Guided filter temporally        5 frames (-2 to 2)
D.      Three pass                      3 frames (-1 to 1)

Drawbacks listed: too slow; over-smoothing.

No post-processing
With post-processing: consistency check and 3×3 median filter

Conclusion
All four methods are based on edge-preserving methods.
They extend these concepts to the time dimension.

These methods only handle scenes with slow motion.
They do not perform well on dynamic scenes that contain large object motions.

