Dec 22, 2015
A New Correspondence Algorithm
Jitendra Malik
Computer Science Division
University of California, Berkeley
Joint work with Serge Belongie, Jan Puzicha, Alex Berg
Key contributions: Years 1-4
• The FAÇADE system for semi-automated modeling of architectural scenes
• High dynamic range image acquisition
• Image based lighting
• Inverse global illumination for recovering reflectance and lighting properties
• Segmented objects from range images
Contributors
• Paul Debevec, now at ICT
• George Borshukov, recipient of Technical Achievement Award 2001 with colleagues at Manex visual effects
• Yizhou Yu, Asst. Prof., UIUC
What remains?
• High quality automated correspondence is essential
• 3D structure recovery algorithms need to scale up
• Geometric and reflectance properties need to be modeled for a much larger range of scenes than previously considered
Towards better correspondence
• Humans use contextual information much more effectively than current algorithms.
• Features are not robust to changes in viewpoint.
The solution to the dilemma.
• Large windows capture more context but suffer from increased distortion.
• Goal: Design a similarity measure which can tolerate affine distortion.
• Similarity should decrease linearly with the amount of distortion.
• Cross correlation does not have this property.
Affine Robustness Condition
• The similarity function $s(f, f \circ T)$ should be close to a linear function $L$ of the amount of distortion $\delta(T)$:
  $(1-\epsilon)\,L(\delta(T)) \le s(f, f \circ T) \le (1+\epsilon)\,L(\delta(T))$ for $\delta(T) \le b$
• We can obtain an $s$ that satisfies this condition:
  $s(f, g) = B(f) \cdot B(g)$
• Where B is a bounded distortion blur…
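The condition above can be checked numerically. The following is a minimal sketch, not the authors' procedure: it assumes the similarity is available as a function of the distortion amount alone, and the tolerance `eps`, the limit `bound`, and the example functions are all hypothetical choices made for illustration.

```python
def satisfies_affine_robustness(similarity, linear_model, distortions, eps, bound):
    """Check (1 - eps) * L(d) <= s(d) <= (1 + eps) * L(d) for every d <= bound."""
    for d in distortions:
        if d > bound:
            continue  # the condition only constrains distortions up to the bound
        target = linear_model(d)
        lo, hi = sorted(((1 - eps) * target, (1 + eps) * target))
        if not lo <= similarity(d) <= hi:
            return False
    return True


# Hypothetical example functions (assumptions, not measured data).
linear = lambda d: 1.0 - 0.5 * d                       # the linear target L
nearly_linear = lambda d: 1.0 - 0.5 * d + 0.01 * d * d
peaked = lambda d: 1.0 / (1.0 + 50.0 * d * d)          # drops sharply for small d

distortions = [i / 10 for i in range(11)]
ok = satisfies_affine_robustness(nearly_linear, linear, distortions, eps=0.05, bound=0.5)
bad = satisfies_affine_robustness(peaked, linear, distortions, eps=0.05, bound=0.5)
```

Here `ok` is true because the nearly linear similarity stays inside the band, while `bad` is false because the sharply peaked one leaves it at small distortions.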
Affine Robust Feature
The bounded distortion blur of a signal f is the Affine Robust Feature B(f). Constructively B is a linear mapping with:
[Equation garbled in the source; it involves B, the transformations T, t, t', and the bound b]
And we take
$f(t) = \sum_i c_i\,\delta_i(t)$
[Figure: a signal f(t) and its blur B(f)(t), each plotted over t = 0, 1, 2]
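Because B is linear, it is enough to say what it does to a single impulse and then sum the results. The sketch below illustrates that construction in 1D under a loudly stated assumption: an impulse at index i is spread uniformly over the positions within `b * i` of it, so the blur widens with distance from the origin. That kernel, the inner-product similarity, and all function names are illustrative choices, not taken from the slides.

```python
def blur_impulse(i, n, b):
    """Spread a unit impulse at index i over positions within b * i of it (assumed kernel)."""
    radius = int(b * i)            # blur grows with distance from the origin
    lo, hi = max(0, i - radius), min(n - 1, i + radius)
    weight = 1.0 / (hi - lo + 1)   # keep the total mass of the impulse at 1
    out = [0.0] * n
    for j in range(lo, hi + 1):
        out[j] = weight
    return out


def bounded_distortion_blur(f, b):
    """Apply B by linearity: B(f) = sum_i c_i * B(delta_i)."""
    n = len(f)
    out = [0.0] * n
    for i, c in enumerate(f):
        if c == 0.0:
            continue
        for j, w in enumerate(blur_impulse(i, n, b)):
            out[j] += c * w
    return out


def similarity(f, g, b):
    """Inner product of the blurred signals (assumed form of s)."""
    bf, bg = bounded_distortion_blur(f, b), bounded_distortion_blur(g, b)
    return sum(x * y for x, y in zip(bf, bg))


f = [0.0] * 12
f[0], f[8] = 1.0, 1.0
g = [0.0] * 12
g[0], g[9] = 1.0, 1.0              # f with its second impulse stretched from 8 to 9

blurred = bounded_distortion_blur(f, 0.25)
raw_overlap = sum(x * y for x, y in zip(f, g))
blurred_overlap = similarity(f, g, 0.25)
```

Without the blur only the impulses at the origin overlap. After the blur the stretched impulse still contributes, which is the tolerance to distortion that the feature is meant to provide.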
In 2D
• Six oriented filters, half-wave rectified to provide 12 channels
• Bounded distortion blur applied to each channel
• Similarity is the sum of similarities in each channel computed separately
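The channel construction can be sketched as follows. The slides do not say which oriented filters were used, so the derivative-of-Gaussian kernel, its size, and the zero padding are assumptions. The per-channel blur step is left out to keep the sketch short, and the per-channel similarity is taken to be an inner product.

```python
import math


def oriented_kernel(theta, size=5, sigma=1.0):
    """Derivative-of-Gaussian kernel along direction theta (an assumed filter choice)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            u = x * math.cos(theta) + y * math.sin(theta)   # coordinate along theta
            g = math.exp(-(x * x + y * y) / (2 * sigma * sigma))
            row.append(-u * g)
        kernel.append(row)
    return kernel


def filter_image(image, kernel):
    """Correlate image with kernel, treating out-of-range pixels as zero."""
    h, w, half = len(image), len(image[0]), len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0.0
            for dy in range(-half, half + 1):
                for dx in range(-half, half + 1):
                    rr, cc = r + dy, c + dx
                    if 0 <= rr < h and 0 <= cc < w:
                        total += image[rr][cc] * kernel[dy + half][dx + half]
            out[r][c] = total
    return out


def rectified_channels(image, n_orient=6):
    """Six orientations, each split into positive and negative parts: 12 channels."""
    channels = []
    for k in range(n_orient):
        response = filter_image(image, oriented_kernel(math.pi * k / n_orient))
        channels.append([[max(v, 0.0) for v in row] for row in response])
        channels.append([[max(-v, 0.0) for v in row] for row in response])
    return channels


def channel_similarity(chans_a, chans_b):
    """Sum over channels of the per-channel inner product."""
    return sum(
        a * b
        for ca, cb in zip(chans_a, chans_b)
        for ra, rb in zip(ca, cb)
        for a, b in zip(ra, rb)
    )


# A small image with a vertical step edge.
image = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
channels = rectified_channels(image)
self_similarity = channel_similarity(channels, channels)
```

Half-wave rectification keeps every channel non-negative, so responses of opposite sign cannot cancel when the per-channel similarities are summed.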
Another application: Matching shapes
• Find correspondences between points on shape
• Estimate transformation
• Measure similarity
[Figure labels: model, target, ...]
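The three steps (find correspondences, estimate transformation, measure similarity) can be sketched end to end on a toy example. Everything specific here is an assumption made for illustration: the descriptor (distance to the centroid over the mean such distance), the nearest-descriptor matching, and the restriction to a uniform scale plus translation. The slides do not state which descriptor, matcher, or transformation family was used at this step.

```python
import math


def centroid(points):
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def descriptors(points):
    """Hypothetical descriptor: distance to the centroid over the mean such distance."""
    cx, cy = centroid(points)
    dists = [math.hypot(x - cx, y - cy) for x, y in points]
    mean = sum(dists) / len(dists)
    return [d / mean for d in dists]


def find_correspondences(model, target):
    """Step 1: pair each model point with the target point of closest descriptor."""
    dm, dt = descriptors(model), descriptors(target)
    return [min(range(len(target)), key=lambda j: abs(dt[j] - d)) for d in dm]


def estimate_transform(model, target, match):
    """Step 2: fit a uniform scale and translation (an assumed transformation family)."""
    matched = [target[j] for j in match]
    (mx, my), (tx, ty) = centroid(model), centroid(matched)
    spread_m = math.sqrt(sum((x - mx) ** 2 + (y - my) ** 2 for x, y in model))
    spread_t = math.sqrt(sum((x - tx) ** 2 + (y - ty) ** 2 for x, y in matched))
    scale = spread_t / spread_m
    return scale, (tx - scale * mx, ty - scale * my)


def residual(model, target, match, scale, shift):
    """Step 3: mean distance between transformed model points and their matches."""
    total = 0.0
    for (x, y), j in zip(model, match):
        u, v = target[j]
        total += math.hypot(scale * x + shift[0] - u, scale * y + shift[1] - v)
    return total / len(model)


model = [(0, 0), (3, 0), (5, 1), (1, 4)]
# Target: the model scaled by 2, shifted by (10, -3), and listed in a different order.
target = [(12, 5), (10, -3), (20, -1), (16, -3)]
match = find_correspondences(model, target)
scale, shift = estimate_transform(model, target, match)
error = residual(model, target, match, scale, shift)
```

A small residual means the shapes are similar under the fitted transformation. A richer descriptor would replace `descriptors` without changing the other two steps.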
Shape Context
Count the number of points inside each bin, e.g.:
Count = 4
Count = 10
...
Compact representation of distribution of points relative to each point
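The bin counting can be sketched as a histogram over the positions of all other points relative to one reference point. The slide does not give the bin layout, so the log-polar grid (5 radial by 12 angular bins), the radius limits, and the normalization by mean pairwise distance are assumptions, as is the function name.

```python
import math


def shape_context(points, index, n_radial=5, n_angular=12, r_min=0.125, r_max=2.0):
    """Histogram of the other points' positions relative to points[index].

    Bins are log-spaced in radius and uniform in angle; radii are divided by the
    mean pairwise distance so the counts do not change with overall scale.
    """
    n = len(points)
    pair_sum = sum(
        math.hypot(p[0] - q[0], p[1] - q[1])
        for i, p in enumerate(points)
        for q in points[i + 1:]
    )
    mean_dist = pair_sum / (n * (n - 1) / 2)
    px, py = points[index]
    hist = [[0] * n_angular for _ in range(n_radial)]
    log_span = math.log(r_max / r_min)
    for j, (qx, qy) in enumerate(points):
        if j == index:
            continue
        r = math.hypot(qx - px, qy - py) / mean_dist
        if r >= r_max:
            continue                          # outside the outermost ring
        r_bin = 0 if r <= r_min else int(n_radial * math.log(r / r_min) / log_span)
        angle = math.atan2(qy - py, qx - px) % (2 * math.pi)
        a_bin = int(angle / (2 * math.pi) * n_angular) % n_angular
        hist[min(r_bin, n_radial - 1)][a_bin] += 1
    return hist


# Points on a unit square plus its center; describe the shape as seen from the center.
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5)]
hist = shape_context(square, 4)
total = sum(sum(row) for row in hist)

# The same shape three times larger gives the same histogram.
scaled = [(3 * x, 3 * y) for x, y in square]
hist_scaled = shape_context(scaled, 4)
```

Each point gets its own histogram, so a shape with N sample points is described by N such count tables.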
Hand-written Digit Recognition
Test error rates, grouped by number of training examples.
MNIST 60 000:
• linear: 12.0%
• 40 PCA + quad: 3.3%
• 1000 RBF + linear: 3.6%
• K-NN: 5%
• K-NN (deskewed): 2.4%
• K-NN (tangent dist.): 1.1%
• SVM: 1.1%
• LeNet 5: 0.95%
MNIST 600 000 (distortions):
• LeNet 5: 0.8%
• SVM: 0.8%
• Boosted LeNet 4: 0.7%
MNIST 20 000:
• K-NN, Shape context matching: 0.63%
Conclusion
• A new image descriptor which is robust to affine image deformations
• Preliminary results suggest that this could result in a considerable improvement in quality of correspondence for long baseline multiple view analysis.
Plans for next 6 months
• Combine the use of the affine robust window features with the use of epipolar constraints and probabilistic matching.
• Test technique on stereo and motion imagery.
• Explore this in the context of an end-to-end system for scene reconstruction.