
Graphical Models xxx (2006) xxx–xxx

www.elsevier.com/locate/gmod

SVD-matching using SIFT features

Elisabetta Delponte *, Francesco Isgrò, Francesca Odone, Alessandro Verri

DISI, Università di Genova, Via Dodecaneso 35, Genova I-16146, Italy

Received 31 January 2006; received in revised form 31 March 2006; accepted 7 July 2006

Abstract

The paper tackles the problem of feature point matching between pairs of images of the same scene. This is a key problem in computer vision. The method we discuss here is a version of the SVD-matching proposed by Scott and Longuet-Higgins and later modified by Pilu, which we elaborate in order to cope with large scale variations. To this end we add to the feature detection phase a keypoint descriptor that is robust to large scale and view-point changes. Furthermore, we include this descriptor in the equations of the proximity matrix that is central to the SVD-matching. At the same time we remove from the proximity matrix all the information about the point locations in the image, which is the source of mismatches when the amount of scene variation increases. The main contribution of this work is in showing that this compact and easy algorithm can be used for severe scene variations. We present experimental evidence of the improved performance with respect to the previous versions of the algorithm.
© 2006 Elsevier Inc. All rights reserved.

Keywords: Point matching; Spectral methods; Scale invariant features

1. Introduction

Finding correspondences between feature points is one of the keystones of computer vision, with application to a variety of problems. For this reason it has been tackled since the early days of computer vision research [44,31]. Automatic feature matching is often an initialisation procedure for more complex tasks, such as fundamental matrix estimation, image mosaicing, object recognition, and three-dimensional point cloud registration.


* Corresponding author. Fax: +39 010 353 6699. E-mail addresses: [email protected] (E. Delponte), [email protected] (F. Isgrò), [email protected] (F. Odone), [email protected] (A. Verri).


In this paper we consider the case when the epipolar geometry is not known, and thus the corresponding point can be anywhere in the image. Also, we are interested in dealing with the correspondence problem as the baseline grows.

Classical approaches to point matching with unknown geometry assume a short baseline, and they are usually based on correlation (see, for instance, [8]). It is well known that correlation-based approaches suffer from view-point changes and do not take into account the global structure of the image. In this respect an elegant approach, falling in the family of spectral-based methods, is due to Scott and Longuet-Higgins [36].

Spectral graph theory [5] aims to characterise the global structural properties of graphs using the eigenvalues and eigenvectors of an affinity matrix. Recently, it has been applied to a variety of computer vision and pattern matching problems, including point and shape matching, and image segmentation [43,37,4,35,45]. The point matching method by Scott and Longuet-Higgins is based on computing a proximity matrix that depends on the distance between points belonging to the two images. The method performs well on synthetic images, but it is sensitive to the noise that affects point detection and localisation in real images. More recently, Pilu [29] suggested a modification of the method through a correspondence matrix encoding both proximity and similarity information. The similarity is computed as the normalised correlation between the point neighbourhoods. Experimental evidence shows that the method performs very well on stereo pairs, but performance drops as the baseline grows.

We claim that the reason for this behaviour lies in the feature descriptor adopted, rather than in a limit of the algorithm. In this paper we propose a variant of the SVD-matching that uses scale invariant keypoints to tackle both scale changes and view-point changes.

Scale invariant features (often referred to as SIFT) were first proposed in [23] and attracted the attention of the computer vision community for their tolerance to scale, rotation, and view-point variations. A comparative study of many local image descriptors [26] shows the superiority of SIFT with respect to other feature descriptors for the case of several local transformations.

In our method we first locate keypoints with an affine invariant Harris corner detector [27], and we compute a SIFT description for each of them. We then build a correspondence matrix which is based on the distance between SIFT descriptors, discarding proximity information entirely. We present an extensive experimental analysis, assessing the performance of our approach with respect to the original SVD-matching [29], a matching based on the Euclidean distance between SIFT descriptors [23], and a previous version of this work that used both SIFT similarity and proximity information [7].

The experimental results show that including SIFT point descriptors in the SVD-matching improves the performance with respect to the past versions of this algorithm. In particular it returns good results for scale changes and large view-point variations. The current version still does not cope with wide baselines. These conclusions are supported by an extensive experimental evaluation on different types of image data.

The paper is organised as follows. Section 2 gives an overview of the state of the art on image matching, with reference also to other spectral-based methods. In Section 3 we recall the SVD-matching algorithm, while in Section 4 we describe the modified SVD-matching. Section 5 is devoted to the experimental analysis and to the comparative evaluation. A final discussion, in Section 6, ends the paper.

2. Related work

The state of the art on algorithms for image matching is vast. It is common practice to distinguish between feature-based methods and direct methods. The former rely on first extracting meaningful image features and then matching them, producing sparse matches. Direct methods try to find matches over all image positions. The results are dense disparity maps, less reliable in flat areas. Direct methods usually assume a small view-point change, and they are often applied to stereo and motion estimation. In this paper we focus on feature-based methods: this section reviews the main contributions to this topic, mainly on dealing with viewpoint and scale changes. The section ends with a brief overview of spectral methods applied to matching.

2.1. Local interest points

Early works on matching images with salient features were based on using small amounts of local information to describe meaningful keypoints, such as corners [28,17]. Harris [16] showed that corners were efficient for tracking and estimating structure from motion. Applications to these fields were extended later by Shi and Tomasi [38]. In early works corners were simply represented using correlation windows centred around the keypoint. When matching these keypoints the underlying assumption is that no relevant changes occurred in illumination and scale.

However, every object in an image assumes a different appearance if observed at different scales, or under different illumination conditions. Another source of changes in appearance is view-point variation. These issues have been extensively studied in the last decades. As far as scale invariance is concerned, many contributions have appeared in the past [6,20,39]; in particular, it is worth mentioning the scale-space approach [20]. Scale-space is an effective framework to handle objects at different scales: an image is represented at different resolution levels, and the description obtained is not a simple random suppression of details, but rather a well-defined process that guarantees linearity and scale-space invariance. Detecting local features in a scale-space representation allows us to estimate the keypoint scale as well as its position. A large part of the local keypoints recently proposed follow this approach. A foremost aspect of the scale-space approach is that there are methods [19] that automatically choose the appropriate resolution level, discarding useless information. Thus, scale invariant features can be obtained by applying, for instance, a Harris detector at different scales, and then estimating the most meaningful scale for each keypoint. Alternatively, features can be localised directly on a scale-space structure, searching for local maxima both in space and in scale. The latter approach has been followed by Lowe in designing his Difference of Gaussians (DoG) feature detector [23].
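To make the last point concrete, the following toy sketch locates keypoints as local extrema of a Difference-of-Gaussians scale-space, in the spirit of Lowe's detector mentioned above. It is only an illustrative reading of that strategy, not the detector used in this paper (which relies on the Harris-based detector of [27]); the scale sampling and the contrast threshold are assumptions.

```python
# Illustrative sketch: keypoints as local extrema of a DoG scale-space.
# Scale values and the contrast threshold are assumed, not taken from the paper.
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

def dog_keypoints(image, sigmas=(1.6, 2.3, 3.2, 4.5, 6.4), contrast_thr=0.03):
    image = image.astype(np.float64) / (image.max() + 1e-12)
    # Gaussian pyramid at increasing scales (kept at full resolution for simplicity).
    gauss = np.stack([gaussian_filter(image, s) for s in sigmas])
    dog = gauss[1:] - gauss[:-1]                          # DoG levels
    # A keypoint is a local maximum of |DoG| over space and scale.
    local_max = maximum_filter(np.abs(dog), size=(3, 3, 3))
    mask = (np.abs(dog) == local_max) & (np.abs(dog) > contrast_thr)
    levels, rows, cols = np.nonzero(mask)
    # Return (row, column, scale) triplets; the scale is that of the finer level.
    return [(r, c, sigmas[l]) for l, r, c in zip(levels, rows, cols)]
```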

Among the many recent works populating the literature on keypoint detection, it is worth mentioning the scale and affine invariant interest points recently proposed by Mikolajczyk and Schmid [27], as they appear to be among the most promising keypoint detectors to date. The detection algorithm can be sketched as follows: first, Harris corners are detected at multiple scales; then, points at which a local measure of variation is maximal over scale are selected. This provides a set of distinctive points at the appropriate scale. Finally, an iterative algorithm modifies the location, scale, and neighbourhood of each point and converges to affine invariant points.

In many application domains it has been shown that an efficient keypoint localisation should be associated with a feature description less sensitive to view-point changes than grey levels [46,11], possibly embedding invariance to rotation, scale, or illumination changes. One of the pioneering works in this direction is due to Schmid and Mohr [32]. They showed that local feature matching could be applied effectively to image recognition if a more robust feature description was used: they located the keypoints with a Harris detector, and then used a rotationally invariant descriptor of the local image region centred at the keypoint. From the same research group we get a comparative study on the effectiveness of the various invariant feature descriptors proposed so far [33]. There it is shown that SIFT (scale invariant feature transform) [22] leads to excellent performance compared to other existing approaches. The SIFT description is computed as follows: once a keypoint is located and its scale has been estimated, one or more orientations are assigned to it based on the local image gradient directions around the keypoint. Then, image gradient magnitude and orientation are sampled around the keypoint, using the scale of the keypoint to select the level of Gaussian blur. The gradient orientations obtained are rotated with respect to the keypoint orientation previously computed. Finally, the area around the keypoint is divided into sub-regions, each of which is associated with an orientation histogram weighted by the gradient magnitude. This approach was suggested to the author by a model of biological vision [9].
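As a concrete illustration of the construction just described, the sketch below accumulates gradient orientations, rotated by the keypoint orientation, into per-subregion histograms weighted by gradient magnitude. The 4x4 grid, the 8 bins and the use of a pre-cut square patch are illustrative assumptions; this is not Lowe's reference implementation.

```python
# Illustrative SIFT-like descriptor: per-subregion orientation histograms
# weighted by gradient magnitude; grid size and bin count are assumptions.
import numpy as np

def sift_like_descriptor(patch, keypoint_orientation, grid=4, bins=8):
    """patch: square grey-level window already cut out at the keypoint scale."""
    gy, gx = np.gradient(patch.astype(np.float64))
    magnitude = np.hypot(gx, gy)
    # Orientations expressed relative to the keypoint orientation (rotation invariance).
    orientation = (np.arctan2(gy, gx) - keypoint_orientation) % (2 * np.pi)
    n = patch.shape[0] // grid
    histograms = []
    for i in range(grid):
        for j in range(grid):
            sl = (slice(i * n, (i + 1) * n), slice(j * n, (j + 1) * n))
            hist, _ = np.histogram(orientation[sl], bins=bins,
                                   range=(0, 2 * np.pi), weights=magnitude[sl])
            histograms.append(hist)
    descriptor = np.concatenate(histograms)               # 128-D for a 4x4x8 layout
    return descriptor / (np.linalg.norm(descriptor) + 1e-12)
```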

Other local keypoint descriptors can be found in the recent literature: Baumberg [2] proposes a matching technique based on the Harris corner detector and a description based on the Fourier–Mellin transform to achieve invariance to rotation. Harris corners are also used in [1], where rotation invariance is obtained by a hierarchical sampling that starts from the direction of the gradient. Matas et al. [25] introduce the concept of maximally stable extremal region to be used for robust matching. These regions are connected components of pixels which are brighter or darker than the pixels on the region contour; they are invariant to affine and perspective transforms, and to monotonic transformations of image intensities.

2.2. Matching with large or wide baselines

It is well known that a major source of appearance variation is view-point change. This variation becomes more challenging to model as the distance between observation points (i.e., the baseline) grows. This section reviews some methods considering this issue.

Early applications of local image matching were stereo and short-range motion tracking. Zhang et al. showed that it was possible to match Harris corners over a large image range, with an outlier removal technique based on a robust computation of the fundamental matrix and the elimination of the feature pairs that did not agree with the solution obtained [47]. Later on, the invariant features described above were extensively studied as they guaranteed some degree of flexibility with respect to view-point change. Recently, many works on extending local features to be invariant to affine transformations have been proposed, including a variant of SIFT [3].

Tuytelaars and Van Gool [42] deal with wide-baseline matching by extracting image regions around corners, where edges provide orientation and skew information. They also address scale variation by computing the extrema of a 2D affine invariant function; as a descriptor they use generalised colour moments. The actual matching is done using the Mahalanobis distance. In a more recent work [10] they establish wide-baseline correspondences among unordered multiple images, by first computing pairwise matches, and then integrating them into feature tracks, each representing a local patch of the scene. They exploit the interplay between the tracks to extend matching to multiple views. A method based on the automatic determination of local neighbourhood shapes is presented in [12], but it only works for image areas where stationary texture occurs.

An alternative approach for determining feature correspondences relies on prior knowledge of the observed scene, for instance knowledge of the epipolar geometry of two or more views [34]. Georgis et al. [13] assume that the projections of four corresponding non-coplanar points at arbitrary positions are known. Pritchett and Zisserman [30] use local homographies determined by parallelogram structures or from motion pyramids. Lourakis et al. [21] present a method based on the assumption that the viewed scene contains two planar surfaces and exploit the geometric constraints derived from this assumption. The spatial relation between the features in each image, together with appearance, is used in [40].

Recently a simple ordering constraint that can reduce the computational complexity of wide-baseline matching, for the case of approximately parallel epipolar lines only, has been proposed in [24].

2.3. Spectral analysis for point matching

Spectral graph analysis aims at characterising the global properties of a graph using the eigenvalues and the eigenvectors of the graph adjacency matrix [5]. Recently this subject has found a number of applications to classical computer vision problems, including point matching, segmentation, line grouping, and shape matching [43,37,4,35,45]. In this section we review some works on point matching with spectral analysis.

Most of these contributions are based on the so-called proximity or affinity matrix, which is a continuous representation of the adjacency matrix: rather than being set to 0 or 1, the matrix elements are weights that reflect the strength of a pairwise relation (in terms of proximity or sometimes similarity). Usually the proximity matrix is defined as

G_{ij} = e^{-r_{ij}^2 / 2\sigma^2}    (1)

with σ a free parameter and r_{ij} a distance between points x_i and x_j computed with an appropriate affinity.
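As a minimal sketch of Eq. (1), assuming the features of the two images are given as rows of two arrays and that r_{ij} is the Euclidean distance:

```python
# Minimal sketch of the proximity matrix of Eq. (1); sigma is a free parameter.
import numpy as np

def proximity_matrix(features_a, features_b, sigma):
    """features_a: (m, d) array, features_b: (n, d) array -> (m, n) matrix G."""
    r = np.linalg.norm(features_a[:, None, :] - features_b[None, :, :], axis=2)
    return np.exp(-r ** 2 / (2.0 * sigma ** 2))           # G_ij in (0, 1]
```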

Scott and Longuet-Higgins [36] give one of the most interesting and elegant contributions to this topic, which we describe in Section 3. One of the first applications of spectral analysis to point matching is due to Umeyama [43]. The author presents an SVD method for finding permutations between the adjacency matrices of two graphs. If the graphs have the same size and the same edge structure the method is able to find correspondences between the nodes of the graphs. Shapiro and Brady [37] propose a method that models the content of each image by means of an intra-image point proximity matrix, and then evaluates the similarity between images by comparing the matrices. The proximity matrices are built using a Gaussian weighting function, as in Eq. (1). For each proximity matrix, a modal matrix (a matrix whose columns are eigenvectors of the original matrix) is built. Each row of the modal matrix represents one point of the corresponding image. The authors find the correspondences by comparing the rows of the two modal matrices, using a binary decision function based on the Euclidean distance. Carcassoni and Hancock [4] propose a variant of this approach that changes the original method in three different ways: first, the proximity matrices are based on other weighting functions, including a sigmoidal and a Euclidean weighting function; second, robust methods are used for comparing the modal matrices; third, the correspondence process is embedded within a graph matching EM algorithm. Experiments reported in the paper show that the latter contribution is useful to overcome structural errors, including the deletion or insertion of points. The authors also show that the Gaussian weighting function performs worse than the other weighting functions evaluated.
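A simplified sketch of the modal-matrix comparison just described may help; it is only our reading of the Shapiro–Brady idea, with a crude sign-fixing heuristic, and it does not reproduce the robust variants of [4].

```python
# Simplified Shapiro-Brady-style matching: compare rows of the two modal matrices.
# The sign-fixing heuristic and the truncation to k modes are simplifying assumptions.
import numpy as np

def modal_matrix(points, sigma, k):
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    H = np.exp(-d ** 2 / (2.0 * sigma ** 2))              # intra-image proximity
    _, vecs = np.linalg.eigh(H)                           # eigenvectors, ascending order
    V = vecs[:, ::-1][:, :k]                              # keep the k leading modes
    # Fix each eigenvector's sign so that its largest-magnitude entry is positive.
    signs = np.sign(V[np.argmax(np.abs(V), axis=0), np.arange(k)])
    return V * signs

def shapiro_brady_matches(points_a, points_b, sigma):
    k = min(len(points_a), len(points_b))
    Va = modal_matrix(points_a, sigma, k)
    Vb = modal_matrix(points_b, sigma, k)
    # Each row (one point) is paired with the closest row of the other modal matrix.
    return [(i, int(np.argmin(np.linalg.norm(Vb - row, axis=1))))
            for i, row in enumerate(Va)]
```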


3. SVD-matching

In this section we summarise the algorithms proposed in [36] and [29], upon which we base our matching technique. Scott and Longuet-Higgins [36], drawing some inspiration from structural chemistry, were among the first to use spectral methods for image matching. They show that, in spite of the well-known combinatorial complexity of finding feature correspondences, a reasonably good solution can be achieved through the singular value decomposition of the proximity matrix of Eq. (1), followed by a simple manipulation of the eigenvalues. As pointed out in [29], their algorithm is rooted in the solution of the subspace rotation problem known as the orthogonal Procrustes problem (see [14] for details).

Fig. 1. Examples of features extracted. The ellipse around the feature points represents the support area of the feature.

Fig. 2. Matches determined for stereo pairs of a desk. (a) A reasonable level of scene variation. We could notice only one wrong match between the wall and the corner of the screen. (b) The second image is a synthetic rotation of the first one. No wrong matches have been determined. (c) Scale variation, wrong matches on the edge of the table.

Fig. 3. Matches determined for a large baseline stereo pair. Only 2–3 wrong matches are determined.

Let A and B be two images, containing m and n features respectively (A_i, i = 1,...,m, and B_j, j = 1,...,n). The goal is to determine two subsets of the two sets of points that can be put in a one-to-one correspondence. In the original algorithm proposed by Scott and Longuet-Higgins, the main assumption was that the two images were taken from close points of view, so that corresponding points had similar image coordinates.

Fig. 4. Samples from the sequences used for the experiments presented in this paper. First row: 1st, 3rd and 5th frame of the Boat sequence. Second row: 1st, 3rd and 5th frame of the Graf sequence. Third and fourth rows: left and right views, respectively, of the 1st, 16th and 30th frame of the stereo sequence.

The algorithm consists of three steps:

1. Build a proximity matrix G, where each element is computed according to Eq. (1), with r_{ij} = ||A_i - B_j|| the Euclidean distance between the two points, when considering them in the same reference plane. The parameter σ controls the degree of interaction between features: a small σ enforces local correspondences, while a larger σ allows for more distant interactions. The elements of G range from 0 to 1, with higher values for closer points.

2. Compute the singular value decomposition of G: G = VDU^T.

3. Compute a new correspondence matrix P by converting the diagonal matrix D to a diagonal matrix E in which each element D_{ii} is replaced by 1: P = VEU^T.

Fig. 6. Comparison with other weighting functions: results for the 30-frames stereo sequence. The baseline is fixed for all the stereo pairs, and the correspondences are computed for each stereo frame of the sequence. Top: total number of matches detected. Middle: number of correct matches. Bottom: accuracy of the method.

The algorithm is based on the two principles of proximity and exclusion, that is, corresponding points must be close, and each point can have at most one corresponding point. The idea is to obtain from the similarity matrix G a matrix L such that the entry ij is 1 if i and j are corresponding points, and 0 otherwise. The matrix P computed by the algorithm is orthogonal (in the sense that its rows are mutually orthogonal), as all the singular values are 1, and it is the orthogonal matrix closest to the proximity matrix G. Because of the orthogonality, if the parameter σ is chosen properly, P enhances good pairings, as its entries have properties close to those of the ideal matrix L. Following this idea, the algorithm establishes a correspondence between the points i and j if the entry P_{ij} is the largest element in row i and the largest element in column j, as sketched below.
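The following is a minimal sketch of these three steps together with the row/column dominance test; G can be any proximity matrix, for instance the one of Eq. (1).

```python
# Sketch of the SVD-matching steps: replace the singular values of G by 1 and read
# off correspondences as entries that dominate both their row and their column.
import numpy as np

def svd_matching(G):
    """G: m x n proximity matrix -> list of (i, j) index pairs."""
    U, _, Vt = np.linalg.svd(G, full_matrices=False)
    P = U @ Vt                                             # singular values set to 1
    matches = []
    for i in range(P.shape[0]):
        j = int(np.argmax(P[i]))                           # largest entry in row i
        if int(np.argmax(P[:, j])) == i:                   # ... and in column j
            matches.append((i, j))
    return matches
```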

In the case of real images, point localisation is affected by noise and keypoint detection is unstable: keypoints may be detected or not depending on the viewing angle. The algorithm presented in [36] worked well on synthetic data, but performance started to fall when moving to real images. Pilu [29] argues that this behaviour can be taken care of by evaluating local image similarities. He adapts the proximity matrix in order to take into account image intensity as well as geometric properties. The modified matrix appears as follows:

G_{ij} = \frac{C_{ij} + 1}{2} e^{-r_{ij}^2 / 2\sigma^2}    (2)

where the term C_{ij} is the normalised correlation between image patches centred in the feature points.
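A minimal sketch of Eq. (2), assuming keypoint coordinates and fixed-size grey-level patches around the keypoints are available as arrays; the normalised correlation C_{ij} lies in [-1, 1], so (C_{ij} + 1)/2 lies in [0, 1].

```python
# Sketch of Pilu's combined proximity/similarity matrix of Eq. (2).
import numpy as np

def pilu_matrix(points_a, patches_a, points_b, patches_b, sigma):
    m, n = len(points_a), len(points_b)
    G = np.zeros((m, n))
    for i in range(m):
        pa = (patches_a[i] - patches_a[i].mean()) / (patches_a[i].std() + 1e-12)
        for j in range(n):
            pb = (patches_b[j] - patches_b[j].mean()) / (patches_b[j].std() + 1e-12)
            c = np.mean(pa * pb)                           # normalised correlation C_ij
            r2 = np.sum((points_a[i] - points_b[j]) ** 2)  # squared image distance
            G[i, j] = 0.5 * (c + 1.0) * np.exp(-r2 / (2.0 * sigma ** 2))
    return G
```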

In [29] experimental evidence is given that the proposed algorithm performs well on short baseline stereo pairs. The performance, however, falls when the baseline increases. It is our aim to show that the reason for this behaviour lies in the feature descriptor chosen and is not an intrinsic limit of the algorithm.

Fig. 5. The different weighting functions used. Left: Gaussian. Middle: double-exponential. Right: Lorentzian.


4. SVD-matching using SIFT

In this section we discuss the use of the SIFT descriptor in the SVD-matching algorithm. As mentioned in the previous section, the SVD-matching presented in [29] does not perform well when the baseline starts to increase. The reason for this behaviour is in the feature descriptor adopted. The original algorithm uses the grey level values in a neighbourhood of the keypoint. As pointed out in Section 2, this description is too sensitive to changes in the view-point, and more robust descriptors have since been introduced.

A comparative study of the performance of various feature descriptors [26] showed that the SIFT descriptor is more robust than others with respect to rotation, scale changes, view-point changes, and local affine transformations. The quality of the results decreases in the case of changes in the illumination. In the same work, cross-correlation between the image grey levels returned unstable performance, depending on the kind of transformation considered. The considerations above suggested the use of a SIFT descriptor instead of grey levels. The descriptor is associated with the scale and affine invariant interest points of [27], briefly sketched in Section 2. Some examples of such keypoints are shown in Fig. 1.

Fig. 7. Comparison of different weighting functions: results for the 30-frames stereo sequence. Correct matches between the left (top) and right (bottom) 27th frames. Left: S-SVD. Right: L-SVD. The results for D-SVD are in Fig. 12.

In a previous version of this work [7] we left the matrix G in Eq. (2) unchanged in its form, but C_{ij} was the cross-correlation between SIFT descriptors. This straightforward modification improves the performance of the SVD-matching, and also gives better results, in terms of the number of points correctly matched, with respect to the SIFT distance used for the experiments reported in [26]. However, the matrix terms are still strongly dependent on the distance between feature points on the image plane, causing a large number of mismatches when the distance between points increases. For this reason we decided to switch back to the original form of the G matrix, with

G_{ij} = e^{-r_{ij}^2 / 2\sigma^2}    (3)

where r_{ij} is now the distance between the feature descriptors in the SIFT space.

Fig. 8. Comparison of different weighting functions: results for the Boat sequence. The images are zoomed and rotated with respect to the first frame. Matches are computed between the first frame and each other frame of the sequence. Top: total number of matches detected. Middle: number of correct matches. Bottom: accuracy of the method.

In order to reduce the number of mismatches even further, we also add a constraint on the entry P_{ij} for determining the correspondence between points i and j. Let a_{ij_1} and a_{ij_2} be, respectively, the largest and second largest elements in row i, and b_{i_1 j} and b_{i_2 j} the largest and second largest elements in column j. We say that i and j are corresponding points if

1. j_1 = j and i_1 = i;

2. 0.6 a_{ij_1} ≥ a_{ij_2} and 0.6 b_{i_1 j} ≥ b_{i_2 j}.

In plain words, P_{ij} still needs to be the largest element in row i and in column j, but it must also be the largest by a clear margin; a sketch of the resulting matcher is given below.
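The sketch below is our reading of the resulting matcher: the proximity matrix of Eq. (3) is built from distances between SIFT descriptors (no image coordinates), followed by the SVD step of Section 3 and the two conditions above with the factor 0.6. Descriptors are assumed to be given as rows of two arrays, and the default σ follows the value reported in Section 5.

```python
# Sketch of the proposed matcher: SIFT-space Gaussian proximity (Eq. (3)), SVD step,
# and the two-sided dominance test of conditions 1-2 with the 0.6 factor.
import numpy as np

def sift_svd_matching(desc_a, desc_b, sigma=1000.0, ratio=0.6):
    r = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    G = np.exp(-r ** 2 / (2.0 * sigma ** 2))               # Eq. (3)
    U, _, Vt = np.linalg.svd(G, full_matrices=False)
    P = U @ Vt                                             # singular values set to 1
    matches = []
    for i in range(P.shape[0]):
        j = int(np.argmax(P[i]))
        if int(np.argmax(P[:, j])) != i:                   # condition 1
            continue
        row = np.sort(P[i])[::-1]                          # a_{ij1} >= a_{ij2} >= ...
        col = np.sort(P[:, j])[::-1]                       # b_{i1j} >= b_{i2j} >= ...
        if len(row) > 1 and ratio * row[0] < row[1]:       # condition 2 on row i
            continue
        if len(col) > 1 and ratio * col[0] < col[1]:       # condition 2 on column j
            continue
        matches.append((i, j))
    return matches
```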

5. Experimental results

In this section we report some experiments carried out on different image pairs and sequences. First we show some of the matches returned by our algorithm on a few image pairs. Then we attempt a more quantitative analysis of the performance of our algorithm on short image sequences.

5.1. Experiments on image pairs

The first set of experiments refers to results returned by the algorithm proposed in this paper on image pairs of two different scenes.

In Figs. 2(a) and (b) we show all the matches determined on two pairs of images of a desk scene. The first one presents a reasonable level of scene variation, whereas the latter is a synthetic rotation of the first image. We spotted only one wrong match in Fig. 2(a). The last image pair is relative to a studio scene with scale variation. The result is shown in Fig. 2(c). Our visual inspection of the results determined only a few wrong matches between points on the border of the table.

In Fig. 3 we show the matches determined on a large baseline stereo pair. A visual inspection could spot no more than three wrong matches.

5.2. Comparative experiments

We performed several comparative experiments. The first group of experiments focuses on proximity matrices built in the descriptor space, such as the one given in (3), which uses a Gaussian weighting function. Following [4], we test against the Gaussian the performance of two other weighting functions, drawn from the literature on robust statistics.

The second group of experiments tests the performance of the algorithm using the proximity matrix proposed in (3) against two other matrices proposed in previous works [29,7], and against a SIFT-based point matcher, based on the Euclidean distance between SIFT descriptors, proposed by Lowe in [23] and used in [26] for measuring the SIFT performance.


For evaluating the performance of the point matching methods considered in this work we computed: (a) the total number of matches detected; (b) the number of correct matches; (c) the accuracy, defined as the ratio between the number of correct matches and the total number of matches detected.

The data used are of different kinds. We considered a stereo image sequence taken with a stereo system with a relatively large baseline; in particular we focused our experiments on input sequences for an immersive video-conferencing system [18]. Then we used short sequences with large variations with respect to the first frame: the kinds of variation considered are viewpoint changes and zoom plus rotation.1 Some of these last sequences were used in [26]. The experiments performed on the video sequences compare the first frame of the sequence with all the others. Sample frames from the sequences used are shown in Fig. 4.

Fig. 9. Comparison of different weighting functions: results for the Boat sequence. Correct matches between the left (top) and right (bottom) last frames. Left: S-SVD. Right: L-SVD. The results for D-SVD are in Fig. 14.

The method used for determining the correct matches depends on what geometric information on the camera geometry is available. For sets of data consisting of fixed-camera sequences, or sequences of planar scenes for which the homographies between the different views were available, we say that a pair of corresponding points (p, p') is a correct match if

||p' - Hp|| < 5

where H is the homography between the two images. For the stereo sequence with a fixed baseline the correspondences were computed between the images of each stereo frame. In this case, because the scene is not planar, we compute the fundamental matrix F from the calibrated camera matrices, and a pair of corresponding points (p, p') is a correct match if

(d(p', Fp) + d(p, F^T p')) / 2 < 5

1 Sequences available from http://www.robots.ox.ac.uk/~vgg/research/affine/index.html


where d(p', Fp) is the distance between point p' and the epipolar line corresponding to point p [41].
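A small sketch of these two tests, assuming points in homogeneous pixel coordinates; the 5-pixel threshold is the one quoted above.

```python
# Correctness tests used in the evaluation: reprojection through a homography H,
# and the symmetric distance to the epipolar lines given a fundamental matrix F.
import numpy as np

def correct_under_homography(p, p_prime, H, thr=5.0):
    q = H @ p
    q = q / q[2]
    return np.linalg.norm(p_prime[:2] / p_prime[2] - q[:2]) < thr

def point_line_distance(point, line):
    """Distance between a homogeneous point and a line ax + by + c = 0."""
    point = point / point[2]
    return abs(line @ point) / np.hypot(line[0], line[1])

def correct_under_fundamental(p, p_prime, F, thr=5.0):
    d1 = point_line_distance(p_prime, F @ p)               # d(p', Fp)
    d2 = point_line_distance(p, F.T @ p_prime)             # d(p, F^T p')
    return 0.5 * (d1 + d2) < thr
```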

For all the experiments we set the parameter σ to 1000.

5.2.1. Comparison of different weighting functions

The weighting function models the probability of similarity between the feature points. In previous works the Gaussian weighting function was used. The reason for trying functions different from the Gaussian is that the distance between the feature descriptors of corresponding points increases with the baseline. In this case a function with more prominent tails than the Gaussian can give the chance to detect some more matches. This, as we will see, comes at the price of a sometimes lower accuracy.

In this section we consider a small sample of different weighting functions borrowed from the literature on robust statistics, in particular from the literature on M-estimators [15]. The comparative evaluation of the performance of the different matching methods, whose description is given in Section 4, is based on the following weighting functions:

• S-SVD: a Gaussian weighting function, as used throughout the paper so far;

• D-SVD: a double-exponential weighting function

G_{ij} = e^{-|r_{ij}/(k\sigma)|}    (4)


• L-SVD: a Lorentzian weighting function, defined as

G_{ij} = \frac{1}{1 + \frac{1}{2} \frac{r_{ij}^2}{\sigma^2}}    (5)

The different weighting functions are shown in Fig. 5.
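For reference, the three weighting functions compared here can be written out as follows, for a generic descriptor distance r and scale σ; k is the extra parameter of the double-exponential in Eq. (4), whose value is not fixed in this sketch.

```python
# The three weighting functions compared in this section.
import numpy as np

def gaussian_weight(r, sigma):                             # S-SVD, Eq. (3)
    return np.exp(-r ** 2 / (2.0 * sigma ** 2))

def double_exponential_weight(r, sigma, k=1.0):            # D-SVD, Eq. (4); k assumed
    return np.exp(-np.abs(r / (k * sigma)))

def lorentzian_weight(r, sigma):                           # L-SVD, Eq. (5)
    return 1.0 / (1.0 + 0.5 * r ** 2 / sigma ** 2)
```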

In Figs. 6 and 7 we show the results for the video-conferencing stereo sequence. We see that in terms of the number of matches and of correct matches the double-exponential function returns the best results, while the Gaussian and the Lorentzian have similar performance. The last two report an average accuracy of 0.6. The accuracy returned by the double-exponential is lower, but on average above 0.5, which means that at most 50% of the matches detected are wrong; this is the largest amount of wrong matches that standard robust statistics tools can tolerate.

Fig. 10. Comparison of different weighting functions: results for the Graf sequence. The images present a change in the view-point with respect to the first frame. Top: total number of matches detected. Middle: number of correct matches. Bottom: accuracy of the method.

The results for the Boat sequence are shown in Figs. 8 and 9. Also in this case the D-SVD returns the highest number of correct matches and, except for the last frame, the accuracy reported is above 0.7. The other two functions report an accuracy always well above 0.5.

Fig. 11. Comparison with other algorithms: results for the 30-frames stereo sequence. The baseline is fixed for all the stereo pairs, and the correspondences are computed for each stereo frame of the sequence. Top: total number of matches detected. Middle: number of correct matches. Bottom: accuracy of the method.

For the Graf sequence the results are similar to those seen above (see Fig. 10). The double-exponential still returns the largest number of correct matches, but again the performance drops for the last two frames of the sequence, where the change in the point of view is too large.

We can conclude this evaluation of the weighting functions by saying that the double-exponential performs slightly better than the other two functions considered, but the use of any of these functions does not seem to change the performance of the algorithm dramatically.

The double-exponential weighting function will be used in the following analysis.

Fig. 12. Comparison with other algorithms: results for the 30-frames stereo sequence. Correct matches between the left (top) and right (bottom) 27th frames. (a) D-SVD. (b) C-SVD. (c) P-SVD. (d) S-DIST.

Fig. 13. Comparison with other algorithms: results for the Boat sequence. The images are zoomed and rotated with respect to the first frame. Matches are computed between the first frame and each other frame of the sequence. Top: total number of matches detected. Middle: number of correct matches. Bottom: accuracy of the method.

5.2.2. Comparison with other matching algorithms

The comparative evaluation of the performance of the different matching methods considers the following techniques:

• D-SVD: point matches are established following the SVD-matching algorithm of Section 3 with the proximity matrix G given in (4);

• C-SVD: point matches are established following the algorithm discussed in [7], with

C_{ij} = \sum_t \frac{(S^i_t - mean(S^i)) (S^j_t - mean(S^j))}{stdv(S^i) stdv(S^j)}

where S^i and S^j are the SIFT descriptors;

• P-SVD: point matches are determined as for C-SVD but with

c_{ij} = \sum_t \frac{(I^i_t - mean(I^i)) (I^j_t - mean(I^j))}{stdv(I^i) stdv(I^j)}

where I^i and I^j are the grey levels of the two neighbourhoods;

• S-DIST: point matches are established following the method proposed in [23], that is, two features i and j match (a sketch is given after this list) if

d_{ij} = min(D_i) < 0.6 min(D_i \ {d_{ij}}) and d_{ji} = min(D_j) < 0.6 min(D_j \ {d_{ji}})

where D_i = {d_{ih} = ||S_i - S_h||}.
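The sketch below is our reading of the S-DIST rule as just defined: i and j match when each is the other's nearest neighbour in SIFT space and the nearest distance beats the second nearest by the factor 0.6.

```python
# Sketch of the S-DIST baseline: mutual nearest neighbours in SIFT space with a
# 0.6 ratio test on both sides.
import numpy as np

def s_dist_matches(desc_a, desc_b, ratio=0.6):
    D = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    matches = []
    for i in range(D.shape[0]):
        order_i = np.argsort(D[i])
        j, second_j = int(order_i[0]), int(order_i[1])     # nearest, second nearest for i
        if D[i, j] >= ratio * D[i, second_j]:
            continue
        order_j = np.argsort(D[:, j])
        if int(order_j[0]) != i:                           # i must also be j's nearest
            continue
        if D[i, j] >= ratio * D[int(order_j[1]), j]:
            continue
        matches.append((i, j))
    return matches
```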

In Figs. 11 and 12 we show the results for the video-conferencing stereo sequence. The D-SVD returns the largest number of matches and of correct matches (an average of 50 and 40, respectively, for each stereo frame) with respect to the other three: the C-SVD presents an average of 30 and 20 per stereo frame, while the values returned by the other two methods are much lower.

S-DIST returns the highest accuracy (almost always 1), but a very small number of matches. The accuracy obtained with D-SVD and C-SVD is slightly lower (ranging from 0.7 to 0.5), but it is high enough to use standard robust statistics tools for identifying and discarding wrong matches. As for P-SVD, we notice that the accuracy drops down to 0.4, which is too low for treating outliers with robust statistics.

The results shown in Figs. 13 and 14 are relative to a six-frame sequence where the fixed camera zooms and rotates around the optical centre. In this case D-SVD still gives the largest amount of correct matches. The number of matches decreases considerably, because of the large zoom effect between the first and the last frame, so that the points detected at a finer scale in the first frame cannot be matched. The C-SVD still has acceptable performance, while the other two methods perform poorly on this sequence. In particular P-SVD can only find matches between the first two frames. This is because this method uses correlation between image patches, which is very sensitive to rotation and scale changes.

The performance of the algorithms starts to go down with severe changes in the view-point, as shown in Figs. 15 and 16. In fact, for the last 2 frames the number of matches and the accuracy obtained are too low. The results returned by the S-DIST algorithm, which has been designed for the SIFT descriptor, are even worse, implying that the descriptor cannot cope with too large viewpoint changes. Similar results have been reported in [26] for several descriptors.

Fig. 14. Comparison with other algorithms: results for the Boat sequence. Correct matches between the left (top) and right (bottom) last frames. (a) D-SVD. (b) C-SVD. (c) P-SVD. (d) S-DIST.

In conclusion, we can state that the use of the SIFT descriptors in combination with an SVD-matching algorithm improves the performance with respect to older versions of the algorithm, as already shown in [7]. Moreover, the experiments reported in this paper show that replacing the distance between feature points with the distance between point descriptors in the weighting function used to build the proximity matrix gives better results when large changes in the scene occur. This is particularly noticeable in the case of severe zoom/rotation changes. However, the performance is still not satisfactory for the case of too large a viewpoint change.

Fig. 15. Comparison with other algorithms: results for the Graf sequence. The images present a change in the view-point with respect to the first frame. Top: total number of matches detected. Middle: number of correct matches. Bottom: accuracy of the method.

Fig. 16. Comparison with other algorithms: results for the Graf sequence. Correct matches between the left (top) and right (bottom) last frames. (a) D-SVD. (b) C-SVD. P-SVD and S-DIST did not return any correct match.

6. Conclusions

This paper presented a method for determining the correspondences between sparse feature points in images of the same scene, based on the SVD-matching paradigm, which has been used by different authors in the past, and on a state-of-the-art keypoint descriptor, namely SIFT.

We showed that including SIFT point descriptors in the SVD-matching improves the performance with respect to the past versions of this algorithm. In particular, it returned good results for scale changes, severe zoom and image plane rotations, and large view-point variations. The current version still does not cope with wide baselines. These conclusions are supported by an extensive experimental evaluation on different types of image data.

As for many spectral methods, the SVD-matching algorithm is based on the choice of an appropriate weighting function for building a proximity matrix. The previous SVD-matching algorithms used a Gaussian function. We compared its performance against other functions borrowed from the robust statistics literature: the Lorentzian and the double-exponential function. The results obtained suggest that the choice of the latter function can somewhat improve the quality of the results, but it does not appear to be a crucial issue.

Acknowledgments

The authors thank Maurizio Pilu for useful discussions.

The video-conferencing sequences used in this work have been provided by the EU Framework-V Grant (IST-1999-10044) VIRTUE (VIRtual Team User Environment).

The software used for detecting the keypoints is due to K. Mikolajczyk and is available for download from http://www.robots.ox.ac.uk/~vgg/research/affine/index.html.

This work is partially supported by European FP6 NoE aim@shape grant 506766, the FIRB Project ASTA RBAU01877P.


References

[1] N. Allezard, M. Dhome, F. Jurie, Recognition of 3D textured objects by mixing view-based and model-based representations, in: Proceedings of ICPR, 2000.

[2] A. Baumberg, Reliable feature matching across widely separated views, in: Proceedings of CVPR, 2000, pp. 774–781.

[3] M. Brown, D. Lowe, Invariant features from interest point groups, in: Proceedings of BMVC, 2002, pp. 656–665.

[4] M. Carcassoni, E.R. Hancock, Spectral correspondence for point pattern matching, Pattern Recognition 36 (2003) 193–204.

[5] F.R. Chung, Spectral Graph Theory, American Mathematical Society, vol. 92, 1997.

[6] J.L. Crowley, A.C. Parker, A representation for shape based on peaks and ridges in the difference of low-pass transform, IEEE Transactions on Pattern Analysis and Machine Intelligence 6 (2) (1984) 156–170.

[7] E. Delponte, F. Isgrò, F. Odone, A. Verri, SVD-matching using SIFT features, in: E. Trucco, M. Chantler (Eds.), Proceedings of the International Conference on Vision, Video and Graphics, Edinburgh, UK, 2005, pp. 125–132.

[8] R. Deriche, Z. Zhang, Q.T. Luong, O. Faugeras, Robust recovery of the epipolar geometry from an uncalibrated stereo rig, in: J.O. Eklundh (Ed.), Proceedings of ECCV, 1994, pp. 567–576.

[9] S. Edelman, N. Intrator, T. Poggio, Complex cells and object recognition, unpublished manuscript, Cogprints, 1997.

[10] V. Ferrari, T. Tuytelaars, L.V. Gool, Wide-baseline multiple-view correspondences, in: Proceedings of CVPR, 2003.

[11] W. Freeman, E. Adelson, The design and use of steerable filters, IEEE Transactions on Pattern Analysis and Machine Intelligence 13 (9) (1991) 891–906.

[12] M. Galun, E. Sharon, R. Basri, A. Brandt, Texture segmentation by multiscale aggregation of filter responses and shape elements, in: Proceedings of ICCV, 2003, pp. 716–723.

[13] N. Georgis, M. Petrou, J. Kittler, On the correspondence problem for wide angular separation of non-coplanar points, Image and Vision Computing 16 (1998).

[14] G.H. Golub, C.F. Van Loan, Matrix Computations, Johns Hopkins University Press, 1983.

[15] F.R. Hampel, E.M. Ronchetti, P.J. Rousseeuw, W.A. Stahel, Robust Statistics: The Approach Based on Influence Functions, John Wiley & Sons, 1986.

[16] C. Harris, Geometry from visual motion, in: A. Blake, A. Yuille (Eds.), Active Vision, MIT Press, 1992.

[17] C. Harris, M. Stephens, A combined corner and edge detector, in: Proceedings of the 4th Alvey Vision Conference, 1988, pp. 147–151.

[18] F. Isgrò, E. Trucco, P. Kauff, O. Schreer, Three-dimensional image processing in the future of immersive media, IEEE Transactions on Circuits and Systems for Video Technology 14 (3) (2004) 288–303.

[19] T. Lindeberg, Principles for automatic scale selection, Technical Report ISRN KTH/NA/P-98/14-SE, CVAP, Department of Numerical Analysis and Computing Science, KTH, S-100 44 Stockholm, Sweden, August 1998.


[20] T. Lindeberg, Scale-space theory: a basic tool for analysing structures at different scales, Journal of Applied Statistics 21 (2) (1994) 224–270.

[21] M. Lourakis, S. Tzurbakis, A. Argyros, S. Orphanoudakis, Feature transfer and matching in disparate stereo views through the use of plane homographies, IEEE Transactions on Pattern Analysis and Machine Intelligence 25 (2) (2003).

[22] D.G. Lowe, Distinctive image features from scale-invariant keypoints, International Journal of Computer Vision 60 (2) (2004) 91–110.

[23] D.G. Lowe, Object recognition from local scale-invariant features, in: Proceedings of ICCV, Corfu, Greece, 1999, pp. 1150–1157.

[24] X. Lu, R. Manduchi, Wide-baseline feature matching using the cross-epipolar ordering constraint, in: Proceedings of CVPR, 2004, pp. 16–23.

[25] J. Matas, O. Chum, M. Urban, T. Pajdla, Robust wide baseline stereo from maximally stable extremal regions, in: Proceedings of BMVC, 2002, pp. 384–393.

[26] K. Mikolajczyk, C. Schmid, A performance evaluation of local descriptors, in: Proceedings of CVPR, 2003, pp. 257–263.

[27] K. Mikolajczyk, C. Schmid, Scale and affine invariant interest point detectors, International Journal of Computer Vision 60 (1) (2004) 63–86.

[28] H. Moravec, Rover visual obstacle avoidance, in: Proceedings of the 7th International Joint Conference on Artificial Intelligence, 1981, pp. 785–790.

[29] M. Pilu, A direct method for stereo correspondence based on singular value decomposition, in: Proceedings of CVPR, Puerto Rico, 1997, pp. 261–266.

[30] P. Pritchett, A. Zisserman, Wide baseline stereo matching, in: Proceedings of ICCV, 1998, pp. 754–760.

[31] A. Rosenfeld, G. van der Brug, Coarse-fine template matching, IEEE Transactions on Systems, Man and Cybernetics 7 (1977) 104–107.

[32] C. Schmid, R. Mohr, Local grayvalue invariants for image retrieval, IEEE Transactions on Pattern Analysis and Machine Intelligence 19 (5) (1997) 530–534.

[33] C. Schmid, R. Mohr, C. Bauckhage, Evaluation of interest point detectors, International Journal of Computer Vision 37 (2) (2000) 151–172.

[34] C. Schmid, A. Zisserman, Automatic line matching across views, in: Proceedings of CVPR, 1997, pp. 666–671.

[35] S. Sclaroff, A.P. Pentland, Modal matching for correspondence and recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence 17 (6) (1995) 545–561.

[36] G. Scott, H. Longuet-Higgins, An algorithm for associating the features of two images, Proceedings of the Royal Society of London B 244 (1991) 21–26.

[37] L.S. Shapiro, J.M. Brady, Feature-based correspondence – an eigenvector approach, Image and Vision Computing 10 (1992).

[38] J. Shi, C. Tomasi, Good features to track, in: Proceedings of CVPR, 1994, pp. 593–600.

[39] A. Shokoufandeh, I. Marsic, S.J. Dickinson, View-based object recognition using saliency maps, Image and Vision Computing 17 (1999) 445–460.

[40] D. Tell, S. Carlsson, Combining appearance and topology for wide baseline matching, in: Proceedings of ECCV, 2002, pp. 68–81.

[41] E. Trucco, A. Verri, Introductory Techniques for 3-D Computer Vision, Prentice-Hall, 1998.


[42] T. Tuytelaars, L.V. Gool, Wide baseline stereo matching based on local, affinely invariant regions, in: Proceedings of BMVC, 2000, pp. 412–425.

[43] S. Umeyama, An eigendecomposition approach to weighted graph matching problems, IEEE Transactions on Pattern Analysis and Machine Intelligence 10 (1988).

[44] G. van der Brug, A. Rosenfeld, Two-stage template matching, IEEE Transactions on Computers 26 (4) (1977) 384–393.


[45] Y. Weiss, Segmentation using eigenvectors: a unifying view, in: Proceedings of ICCV, 1999, pp. 975–982.

[46] R. Zabih, J. Woodfill, Non-parametric local transforms for computing visual correspondence, in: Proceedings of ECCV, 1994, pp. 151–158.

[47] Z. Zhang, R. Deriche, O. Faugeras, Q.T. Luong, A robust technique for matching two uncalibrated images through the recovery of the unknown epipolar geometry, Artificial Intelligence 78 (1995) 87–119.
