Flexible Syntactic Matching of Curves and Its Application to Automatic Hierarchical Classification of Silhouettes
Yoram Gdalyahu and Daphna Weinshall, Member, IEEE
Abstract: Curve matching is one instance of the fundamental correspondence problem. Our flexible algorithm is designed to match curves under substantial deformations and arbitrarily large scaling and rigid transformations. A syntactic representation is constructed for both curves, and an edit transformation which maps one curve to the other is found using dynamic programming. We present extensive experiments where we apply the algorithm to silhouette matching. In these experiments, we examine partial occlusion, viewpoint variation, articulation, and class matching (where silhouettes of similar objects are matched). Based on the qualitative syntactic matching, we define a dissimilarity measure and we compute it for every pair of images in a database of 121 images. We use this experiment to objectively evaluate our algorithm: First, we compare our results to those reported by others. Second, we use the dissimilarity values in order to organize the image database into shape categories. The veridical hierarchical organization stands as evidence of the quality of our matching and similarity estimation.
Index Terms: Curve matching, syntactic matching, image database, silhouettes.
1 INTRODUCTION
GIVEN a large collection of images, unraveling its redundancies is an important and challenging task. One could use this knowledge to assist in image querying and to construct more efficient and compact image representations. In order to identify redundancy in the database, we propose the following approach: First, design algorithms to measure the similarity between images. Second, given pairwise image similarity, use similarity-based clustering to reveal the structure in the data by hierarchically dividing the images into distinct clusters. Third, identify redundancy in each cluster and use it to prune the database and pick cluster representatives; this would allow for efficient indexing into the database.
It is important to distinguish between our approach, where the clustering of $N$ images uses only their $N \times N$ similarity matrix, and the more typical approach, where images are first embedded in some $D$-dimensional vector space whose dimension should be significantly reduced using such methods as PCA. Mapping an image into such a space, in effect, requires the identification of $D$ measurements (or "features") that completely describe the image. This has proven to be an elusive task. The task of image comparison, on the other hand, seems more within our reach: Rather than look for an explicit representation of images as vectors, we seek an algorithm (as complex as necessary) which receives as input two images and returns as output the similarity between them.
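As an illustration of this pairwise view, the following Python sketch fills an N × N dissimilarity matrix by calling a comparison function on every unordered pair of items. The names `pairwise_matrix` and `dissimilarity` are hypothetical, and the symmetry of the comparison is an assumption made here to halve the work. For a database of 121 images this amounts to 121 · 120 / 2 = 7,260 comparisons.

```python
from itertools import combinations


def pairwise_matrix(items, dissimilarity):
    """Build a symmetric matrix of pairwise dissimilarities.

    `dissimilarity(a, b)` is any user-supplied comparison function
    (hypothetical here); it is assumed to be symmetric, so each
    unordered pair is evaluated only once.
    """
    n = len(items)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = dissimilarity(items[i], items[j])
    return d


# Toy usage: "images" are plain numbers compared by absolute difference.
matrix = pairwise_matrix([1, 4, 6], lambda a, b: abs(a - b))
```

No vector embedding of the items is needed; the comparison function is the only access to the data.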
In this paper, we focus on the design of similarity measures, limiting ourselves to the shape dimension of similarity and ignoring other dimensions (e.g., color, motion, and context). The similarity is, therefore, defined as similarity between silhouettes. We describe our silhouette matching algorithm and show results of extensive experiments with real images. We then outline a stochastic clustering algorithm (described at length in [14]) and argue that the good clustering results we show give objective evidence of the quality of our silhouette matching algorithm. The issue of database pruning, and how to identify within-cluster redundancies to allow for efficient indexing into the database, is discussed in later work [21].
More specifically, we describe a novel flexible curve matching algorithm to relate feature points that are automatically extracted on the boundaries of objects. Unlike most pattern recognition applications of clustering, we use real images of three-dimensional objects, basing our similarity measure on the shape of their occluding contours. Our algorithm is designed to give a graded similarity value, where low values reflect similarity between weakly similar curves and higher values indicate strong similarity. To illustrate, given two curves describing the shape of two different mammals, we consider our algorithm to be "successful" if it matches their limbs and head correspondingly. The matched pairs of feature points are then aligned using an optimal 2D similarity transformation (translation, rotation, and scale). From the residual distances between corresponding features, we compute a robust dissimilarity measure between silhouettes. The matching algorithm is outlined in Section 3 and the resulting dissimilarity values are compared with those reported in the literature.
According to the general approach adopted in this paper, our next step is to feed the computed dissimilarities into a pairwise clustering algorithm to obtain hierarchical clusters of similar images. A pairwise clustering algorithm exploits only proximity information and is, therefore, suitable when a vectorial representation of images is not available. Instead, the images are represented as nodes in a graph with edges whose weights reflect the similarity between every image pair. In [14], we describe our stochastic clustering algorithm, which is outlined in Section 4. Our algorithm determines the number of clusters in a (true) hierarchical manner and tolerates, to some extent, violations of metric properties (i.e., violation of the triangle inequality). We demonstrate perceptually veridical results using a database of 121 images of 12 different objects, which are hierarchically classified. The useful clustering results illustrate the quality of our matching algorithm and the usefulness of our general approach.
1312 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 21, NO. 12, DECEMBER 1999
The authors are with the Institute of Computer Science, The Hebrew University, 91904 Jerusalem, Israel. E-mail: {yoram, daphna}@cs.huji.ac.il.
Manuscript received 4 Jan. 1998; revised 19 Oct. 1999. Recommended for acceptance by K. Bowyer. For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference IEEECS Log Number 107787.
0162-8828/99/$10.00 © 1999 IEEE
2 CURVE MATCHING: PROBLEM AND RELATED WORK
Contour matching is an important problem in computer vision with a variety of applications, including model-based recognition, depth from stereo, and tracking. In these applications, the two matched curves are usually very similar. For example, a typical application of curve matching to model-based recognition would be to decide whether a model curve and an image curve are the same, up to some scaling or 2D rigid transformation and some permitted level of noise.
In this paper, we are primarily interested in the case where the similarity between the two curves is weak. The organization of silhouettes into shape categories (like tools, cars, etc.) necessitates flexible matching, which can support graded similarity estimation.
While our approach focuses on the silhouette boundary, a dual approach is based on its medial axis. Specifically, a medial axis, together with singularities labeling, forms a shock graph representation, and matching shock graphs is an isomorphism problem. The methods for solving it include semidefinite programming [38], replicator dynamics [32], graduated assignment [36], and syntactic graph matching [40]. In some of these cases, the matching is only structural, while, in others, two levels of matching (structural and metrical) are supported. The methods based on shock graphs succeed in defining a graded similarity measure and may be combined with suitable database indexing [37], [24]. In this paper, we show, however, that our results are of the same quality in spite of using boundary representation, which is inherently less sensitive to occlusion and which does not involve the NP-complete graph isomorphism problem.
To put our method in the context of existing work on boundary matching, we first distinguish between dense matching and feature matching. Dense matching is usually formulated as a parameterization problem, with some cost function to be minimized. The cost might be defined as the "elastic energy" needed to transform one curve to the other [5], [8], [10], but other alternatives exist [2], [12], [15], [30]. The main drawbacks of these methods are their high computational complexity (which is reduced significantly if only key points are matched) and the fact that they are usually not invariant under both 2D rotation and scaling. In addition, the computation of elastic energy (which is defined in terms of curvature) is scale dependent and requires accurate evaluation of second order derivatives.
Feature matching methods may be divided into three groups: proximity matching, spread primitive matching, and syntactic matching. The idea behind proximity matching methods is to search for the best matching while permitting the rotation, translation, and scaling (to be called alignment transformation) of each curve such that the distances between matched key points are minimized [4], [20], [22], [45]. Consequently, these methods are rather slow; moreover, if scaling is permitted, an erroneous shrinking of one feature set may result, followed by the matching of the entire set with a small number of features from the other set. One may avoid these problems by excluding many-to-one matches and by using the order of the points, but then the method becomes syntactic (see below). Moreover, we illustrate in Section 5.7.3 why proximity matching is not adequate for weakly similar curves. As an alternative to the alignment transformation, features may be mapped to an intrinsic invariant coordinate frame [28], [34], [35]; the drawback of this approach is that it is global, as the entire curve is needed to correctly compute the mapping.
Features can be used to divide the curves into shape elements or primitives. If a single curve can be decomposed into shape primitives, the matching algorithm should be constrained to preserve their order. But, in the absence of any ordering information (like in stereo matching of many small fragments of curves), the matching algorithm may be called "spread primitive matching." In this category, we find algorithms that seek isomorphism between attributed relational graphs [6], [9], [26] and algorithms that look for the largest set of mutually compatible matches. Here, compatibility means an agreement on the induced coordinate transformation and a few techniques exist to find the largest set of mutually compatible matches (e.g., clustering in Hough space [39], geometrical hashing [25], and clique finding in an association graph [7], [11], [18], [23]). Note that, at the application level, finding isomorphism between attributed relational graphs is the same problem as finding isomorphism between shock graphs (discussed above), although, in the last case, an additional constraint may apply [32].
For our purpose of matching complex outlines, it is advantageous to use the natural order of primitives. This results in a great simplification and there is no need to solve the difficult graph isomorphism problem. Moreover, the relations encoded by the attributed relational graphs need to be invariant with respect to 2D image transformations and, as a result, they are usually nonlocal.
GDALYAHU AND WEINSHALL: FLEXIBLE SYNTACTIC MATCHING OF CURVES AND ITS APPLICATION TO AUTOMATIC HIERARCHICAL... 1313
A syntactical representation of a curve is an ordered list of shape elements, having attributes like length, orientation, bending angle, etc. Hence, many syntactical matching methods are inspired by efficient and well-known string comparison algorithms, which use edit operations (substitution, deletion, and insertion) to transform one string to the other [17], [29], [46]. The vision problem is different from the string matching problem in two major aspects, however: First, in vision, invariance to certain geometrical transformations is desired; second, a resolution degradation (or smoothing) may create a completely different list of elements in the syntactical representation.
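For reference, the classical string comparison that these methods build on can be sketched in a few lines of Python. This is the textbook edit distance with unit costs, not the curve matching algorithm of this paper; the function name and the default costs are illustrative choices.

```python
def edit_distance(s, t, sub_cost=1, indel_cost=1):
    """Classical string edit distance by dynamic programming.

    d[i][j] is the minimal cost of transforming s[:i] into t[:j]
    using substitution, deletion, and insertion.
    """
    n, m = len(s), len(t)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * indel_cost          # delete the first i symbols of s
    for j in range(1, m + 1):
        d[0][j] = j * indel_cost          # insert the first j symbols of t
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = s[i - 1] == t[j - 1]
            d[i][j] = min(
                d[i - 1][j - 1] + (0 if same else sub_cost),  # substitution
                d[i - 1][j] + indel_cost,                     # deletion
                d[i][j - 1] + indel_cost,                     # insertion
            )
    return d[n][m]
```

Symbols are compared for exact equality here, which is precisely what fails for curves: the attributes of shape elements change under rotation, scaling, and smoothing.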
There are no syntactic algorithms available which satisfactorily solve both of these problems. If invariant attributes are used, the first problem is immediately addressed, but then the resolution problem either remains unsolved [1], [16], [27] or may be addressed by constructing, for each curve, a cascade of representations at different scales [3], [31], [44]. Moreover, invariant attributes are either nonlocal (e.g., length that is measured in units of the total curve length) or they are noninterruptible (see discussion in Section 5.7). Using variant attributes is less efficient, but provides the possibility of defining a merge operator which can handle noise [33], [41], [42] and might be useful (if correctly defined) in handling resolution change. However, the methods using variant attributes could not ensure rotation and scale invariance.
3 FLEXIBLE SYNTACTIC CURVE MATCHING: ALGORITHM
In this section, we present a local syntactic matching method which can cope with both occlusion and irrelevant changes due to image transformation, while using variant attributes. These attributes support a simple smoothing mechanism; hence, we can handle true scale (resolution) changes. The algorithm is outlined in Section 3.1, while the missing details are given in Section 5. We are primarily concerned with the amount of flexibility that our method achieves since we aim to apply it to weakly similar curves. Section 3.2 shows extensive experiments with real images, where excellent matching is obtained between weakly similar shapes. We demonstrate silhouette matching under partial occlusion, under substantial change of viewpoint, and even when the occluding contours describe different (but related) objects, like two different cars or mammals. Our method is efficient and fast, taking only a few seconds to match two curves.
3.1 The Proposed Matching Method
The occluding contours of objects are first extracted from the image and a syntactic representation is constructed, whose primitives are line segments and whose attributes are length and absolute orientation. Our algorithm then uses a variant of the edit matching procedure combined with heuristic search. Thus, we define a novel similarity measure between primitives to assign cost to each edit operation, a novel merge operation, and introduce a penalty for interrupting a contour (in addition to the regular deletion/insertion penalty).
More specifically, let $A$ and $A'$ be two syntactic representations of two contours; $A = \{a_1, a_2, \ldots, a_N\}$ is a cyclically ordered list of $N$ line segments and $A' = \{a'_1, a'_2, \ldots, a'_{N'}\}$ is another cyclic list of $N'$ segments.
Let $a_i$ be a segment of $A$ and $a'_j$ be a segment of $A'$. Matching these segments uniquely determines the relative global rotation and scale (2D alignment transformation) between the curves. We assume that the optimal alignment is well approximated by at least one of the $NN'$ possible selections of $a_i$ and $a'_j$. In fact, we will discuss in Section 5.2 a method to prune many of them, leaving us with a set $\Omega \subseteq A \times A'$ of candidate global alignments, such that usually $|\Omega| \ll NN'$.
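To make the first observation concrete, the following Python sketch derives a rotation and a scale from one pair of segments, each described by its length and absolute orientation. The function name and the (length, orientation) layout are assumptions of this sketch; the pruning of candidate pairs (Section 5.2) is not shown.

```python
import math


def alignment_from_pair(seg, seg_prime):
    """Rotation (radians) and scale that bring seg_prime onto seg.

    Each segment is a (length, absolute_orientation) pair. The scale is
    the ratio of lengths and the rotation is the orientation difference,
    wrapped into the interval [-pi, pi].
    """
    length, theta = seg
    length_p, theta_p = seg_prime
    scale = length / length_p
    diff = theta - theta_p
    rotation = math.atan2(math.sin(diff), math.cos(diff))
    return rotation, scale


# Toy usage: a segment twice as long and turned by a quarter circle.
rotation, scale = alignment_from_pair((2.0, math.pi / 2), (1.0, 0.0))
```

Enumerating this computation over all selections of one segment from each curve yields the N · N' candidate alignments mentioned above.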
A member $\{a_i, a'_j\}$ of $\Omega$ (abbreviated $\{i, j\}$ for convenience) denotes a starting point for our syntactic matching algorithm. The algorithm uses edit operations to extend the correspondence between the remaining unmatched segments, preserving their cyclic order. Total cost is minimized using dynamic programming, where the cost of a match between every two segments depends on their attributes, as well as on the attributes of the initial chosen pair ($a_i$ and $a'_j$). This implicitly takes into account the global alignment operation. The various edit operations and their cost functions are described in Section 5.3; the cost of the edit operations can be either negative or positive (see footnote 1).
We are searching for the member $\{i, j\} \in \Omega$ for which the extended correspondence list has minimal edit cost. Brute force implementation of this search is computationally infeasible. The syntactic matching process is, therefore, interlaced with heuristic search for the best initial pair in $\Omega$. Namely, a single dynamic programming extension step is performed for the best candidate in $\Omega$ (possibly a different candidate in each extension matching step) while maintaining the lowest cost achieved by any of the sequences. When no candidate in $\Omega$ has the potential to achieve a lower cost, the search is stopped (see below).
The pseudocode in Fig. 1 integrates the components of our matching algorithm into a procedure which gets two syntactic representations and returns a segment correspondence and a dissimilarity value. The arrays which support the dynamic programming are not referenced in this pseudocode to increase its readability. We note that the procedure CURVE-MATCHING minimizes the edit cost, which is typically negative; thus, in effect, CURVE-MATCHING is maximizing the "gain" of matching.
The procedure INITIALIZE performs the initial pruning of pairs of starting points and returns the set $\Omega$ of candidates sorted by increasing potential values (see below). A full description is given in Section 5.2. Its implementation performs a few syntactic matching steps for all $NN'$ possible pairs and computes the intermediate edit cost corresponding to this partial matching. The minimal intermediate edit cost achieved by any of the candidates is returned as $\mathit{cost}^*$ and the candidate pair which achieves $\mathit{cost}^*$ is returned as $\{i^*, j^*\}$.
The procedure PICK-CANDIDATE selects a particular member of $\Omega$ to be fed into the syntactic algorithm. To understand its operation, we need to define the concept of potential: For each candidate $\{i, j\}$ that has been extended to a correspondence list of some length, we compute a lower bound on its final edit cost. This bound is based on the intermediate edit cost which has already been achieved and the cost of the best (lowest cost) possible matching of the remaining unmatched segments. We call this bound the potential of the candidate $\{i, j\}$. The procedure PICK-CANDIDATE returns the member of $\Omega$ which has the best (minimal) potential.
1. Negative cost can be interpreted as positive "gain" or "reward." Every matching prefix has a total (accumulated) cost, which should be as negative as possible. A prefix is never extended by a suffix with positive cost since this would increase the cost; hence, partial matching is achieved by leaving the last segments unmatched. Note that if all costs are negative, then the minimal cost must be obtained by matching all the segments of the shorter sequence, and if all costs are positive, then the minimal cost is trivially obtained by leaving all segments unmatched. The average cost value determines the asymptotic matching length for random sequences [13].
Technically, we store $\Omega$ as an ordered list sorted by increasing potential value. Each member of $\Omega$ is a candidate $\{i, j\}$ and its current potential. The procedure PICK-CANDIDATE then returns the first member of $\Omega$. The list is initially sorted by INITIALIZE and its order is maintained by the procedure PREDICT-COST discussed below.
The search for the best candidate $\{i^*, j^*\}$ continues as long as there exists a candidate whose potential is lower than the best cost achieved so far ($\mathit{cost}^*$). It is implemented by the loop, which iterates as long as $\mathit{cost}^* > \mathit{potential}$. Note that $\mathit{cost}^*$ cannot increase and the potential cannot decrease during the search.
The procedure SYNTACTIC-STEP is the core of our algorithm. It is given as input two cyclically ordered sequences $A = \{a_1, \ldots, a_N\}$ and $A' = \{a'_1, \ldots, a'_{N'}\}$, which are partially matched from position $i$ of $A$ and onward and from position $j$ of $A'$ and onward. It uses dynamic programming to extend the edit transformation between $A$ and $A'$ by one step. Since our editing cost operation typically takes negative values, the edit cost of $\{i, j\}$ could become better (lower) than $\mathit{cost}^*$. In this case, $\{i^*, j^*\}$ is set equal to $\{i, j\}$ and $\mathit{cost}^*$ is set equal to the newly achieved edit cost (otherwise, $\{i^*, j^*\}$ and $\mathit{cost}^*$ remain unchanged). Sections 5.3 and 5.4 give the full description of the procedure SYNTACTIC-STEP.
Extending the editing sequence of $\{i, j\}$ is likely to increase its potential, making it a less attractive candidate. This is because the potential of $\{i, j\}$ is partially determined by a lower bound on the final edit distance between the yet unmatched segments and the edit operation just added can only tighten this bound by decreasing the number of unmatched segments. The procedure PREDICT-COST reestimates the final cost and corrects the potential of $\{i, j\}$. Since $\Omega$ is kept as an ordered list, PREDICT-COST pushes the candidate $\{i, j\}$ down to maintain the order of the list. Section 5.5 gives full details of the potential estimation.
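The interplay of PICK-CANDIDATE, SYNTACTIC-STEP, and PREDICT-COST can be illustrated by a small best-first search in Python. This is only a sketch under strong simplifying assumptions and is not the pseudocode of Fig. 1: each candidate is reduced to a hypothetical list of per-step costs that stands in for SYNTACTIC-STEP, the potential uses a crude bound (every step costs at least `min_step`, which must be negative) instead of the estimate of Section 5.5, and the search starts from the empty matching with cost zero rather than from the output of INITIALIZE.

```python
import heapq


def best_first_search(step_costs, min_step):
    """Best-first search over candidate starting pairs.

    step_costs[c] lists the cost of each successive extension step of
    candidate c (a stand-in for SYNTACTIC-STEP). min_step < 0 is a lower
    bound on the cost of any single step, so the potential
        accumulated cost + remaining steps * min_step
    never exceeds the cost the candidate can still reach.
    """
    cost = {c: 0.0 for c in step_costs}     # accumulated cost per candidate
    done = {c: 0 for c in step_costs}       # steps already taken
    heap = [(len(s) * min_step, c) for c, s in step_costs.items()]
    heapq.heapify(heap)                     # ordered by increasing potential
    best_cost, best_cand = 0.0, None        # empty matching costs nothing
    while heap and heap[0][0] < best_cost:  # loop while cost* > potential
        _, c = heapq.heappop(heap)          # PICK-CANDIDATE
        cost[c] += step_costs[c][done[c]]   # SYNTACTIC-STEP (one extension)
        done[c] += 1
        if cost[c] < best_cost:
            best_cost, best_cand = cost[c], c
        remaining = len(step_costs[c]) - done[c]
        if remaining > 0:                   # PREDICT-COST: reinsert in order
            heapq.heappush(heap, (cost[c] + remaining * min_step, c))
    return best_cand, best_cost


# Toy usage with three candidate pairs {i, j}.
toy = {(0, 0): [-1, -1, -1], (1, 2): [-2, 3, 3], (2, 1): [1, 1, 1]}
winner = best_first_search(toy, min_step=-2)
```

In this toy run the candidate (2, 1) is abandoned after a single step, because its potential can no longer beat the best cost already achieved; this is the saving that the heuristic search is meant to provide.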
Assuming that the reader is familiar with conventional dynamic programming implementations, it is sufficient to describe the procedure TRACE as the procedure which reads the lowest cost path from the dynamic programming array (see footnote 2). When this procedure is applied to the array associated with the best candidate $\{i^*, j^*\}$, the lowest cost editing sequence is obtained. In our implementation, in order to keep space complexity low, we keep just the last few rows and columns for each array. Hence, the procedure TRACE needs to repeat the syntactic matching for the best pair $\{i^*, j^*\}$. See Section 5.4 for a description of the dynamic programming implementation.
Finally, using the correspondence we found, we refine the global 2D alignment by minimizing the sum of residual distances between the endpoints of matched segments. The procedure DISSIMILARITY performs the minimization, and uses the residual distances to define a robust measure of dissimilarity between the curves (details in Section 5.6).
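A closed-form alignment of this kind can be sketched with NumPy by treating 2D points as complex numbers. Note the assumptions: the sketch minimizes the sum of squared residual distances (ordinary least squares), which may differ from the criterion of Section 5.6, and it returns the raw residuals without attempting to reproduce the robust dissimilarity measure. The function name is hypothetical.

```python
import numpy as np


def fit_similarity(src, dst):
    """Least-squares 2D similarity transformation mapping src onto dst.

    src and dst are (n, 2) arrays of corresponding points. With points
    written as complex numbers z and w, the model is w = a * z + t, where
    a = scale * exp(1j * angle) and t is the translation.
    """
    z = src[:, 0] + 1j * src[:, 1]
    w = dst[:, 0] + 1j * dst[:, 1]
    zc, wc = z - z.mean(), w - w.mean()
    a = np.vdot(zc, wc) / np.vdot(zc, zc)    # vdot conjugates its first arg
    t = w.mean() - a * z.mean()
    residuals = np.abs(a * z + t - w)        # distance per matched pair
    return a, t, residuals


# Toy usage: dst is src scaled by 2, rotated by 90 degrees, moved by (3, 4).
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
dst = np.array([[3.0, 4.0], [3.0, 6.0], [1.0, 4.0]])
a, t, residuals = fit_similarity(src, dst)
```

The scale and rotation angle can be recovered as `abs(a)` and `np.angle(a)`.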
Our approach thus combines syntactic matching with a proximity measure (in this sense, our method resembles that of [1]). That is, we establish feature correspondence using syntactic matching and then evaluate the dissimilarity according to the residual distances between matched points. We do not use the edit distance as a measure of dissimilarity, mainly due to the fact that this quantity depends on the somewhat arbitrary parameters of the edit operation and segment similarity, whereas, typically, the best matching result is not sensitive to these exact parameters. That is, the same matching is obtained for a range of edit parameter values, although the edit distance may be different. Another advantage of combining syntactic and proximity criteria is that, in many cases, the combination provides a mechanism for outlier removal, as is demonstrated in Section 5.6.
3.2 Matching Results
We now present a few image pairs and triplets together with the matching results, which demonstrate perceptually appealing matching. In Section 3.3 below, we apply our matching algorithm to a database of 31 silhouettes given to us by Sharvit and Kimia and compare our dissimilarity values to those reported in [36]. Additional classification results will be presented in Section 4.2, using the matching of a few thousand image pairs, to provide indirect objective examination of the matching quality.
In all the experiments reported in this paper, we use the same parameter values (defined in Section 5): $w_1 = 1$, $w_2 = 0.8$, $w_3 = 8.0$, and $K = 4$ (with the exception of Fig. 6, where $K = 5$). Each matching of an image pair took only a few seconds (see Section 4.2).
Fig. 2 shows two images of different objects. There is a geometrical similarity between the two silhouettes which has nothing to do with the semantic similarity between them. The geometrical similarity includes five approximately vertical swellings or lumps (which describe the four legs and the tail). In other words, there are many places where the two contours may be considered locally similar. This local similarity is captured by our matching algorithm.
Fig. 1. Pseudocode for CURVE-MATCHING procedure.
2. Note, however, that using both positive and negative costs allows for partial matching; hence, the path can terminate at any entry of the dynamic programming array and not necessarily at the last row or column.
The two occluding contours of the two animals and the feature points were automatically extracted in the preprocessing stage. Corresponding points are marked in Fig. 2 by the same numbers. Hence, the tails and feet are nicely matched, although the two shapes are only weakly similar. The same matching result is obtained under arbitrarily large rotation and scaling of one image relative to the other.
Fig. 3 demonstrates the local nature of our algorithm, namely, that partial matching can be found when objects are occluded. Since our method does not require global image normalization, the difference in length between the silhouette outlines does not impede the essentially perfect matching of the common parts. Moreover, the common parts are not identical (note the distance between the front legs and the number of ears) due to a small difference in viewpoint; this also does not impede the performance of our algorithm.
Fig. 3 also demonstrates outlier pruning using three images. In Fig. 3b, there is a shadow between two of the leaves (pointed to by the arrow) and, as a result, the outline penetrates inward. The feature points along the penetration are (mistakenly) matched with features along the tail in Fig. 3a, since the two parts are locally similar. However, we use the procedure of mapping the points of Fig. 3a to Fig. 3b, then to Fig. 3c and back to Fig. 3a. Only points which are mapped back to themselves are accepted as correct matches; these matches are marked by common numbers in Fig. 3.
Fig. 2. Qualitative matching between pictures of toy models of a horse and a wolf. Note the correct correspondence between the feet of the wolf and those of the horse and the correspondence between the tails. The results are shown without outlier pruning. In this example, all the features to which no number is attached had been merged, e.g., the segment 9-10 on the horse outline was matched with three segments on the wolf outline.
Fig. 3. Dealing with occlusion: partial matching between three images. Points are mapped from (a) to (b) to (c) and back to (a). Only points which are mapped back to themselves are accepted (order is not important). The points on the tail in (a) are matched with the shadow (pointed to by the arrow) in (b), but matching (b) with (c) leaves the shadow unmatched. Hence, the tail is not matched back to itself and the correspondence with the shadow is rejected.
Figs. 4 and 6 show results when matching images taken from very different points of view. In Fig. 4, two different views of the same object are matched and the method of iterative elimination of distances is demonstrated (see Section 5.6). Fig. 6 shows matching between three different cars, viewed from very different viewpoints and subjected to occlusion. Matching under a large perturbation of viewpoint can be successful as long as the silhouettes remain similar "enough." Note that preservation of shape under change of viewpoint is a quality that defines "canonical" or "stable" views. Stable images of 3D objects were proposed as the representative images in an appearance based approach to object representation [47].
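The iterative elimination summarized in the caption of Fig. 4 (four iterations, each removing the 10 percent most distant pairs and realigning the rest) can be sketched as follows. The least-squares alignment and the rounding of the 10 percent downward are assumptions of this sketch; the exact procedure is the one given in Section 5.6. With rounding down, 32 initial pairs lose 3 + 2 + 2 + 2 = 9 pairs, which agrees with the count reported in that caption.

```python
import numpy as np


def fit_similarity(z, w):
    """Least-squares similarity w = a * z + t for complex point arrays."""
    zc, wc = z - z.mean(), w - w.mean()
    a = np.vdot(zc, wc) / np.vdot(zc, zc)
    return a, w.mean() - a * z.mean()


def trim_matches(src, dst, fraction=0.10, iterations=4):
    """Repeatedly align the matched points and drop the most distant pairs.

    src and dst are (n, 2) arrays of matched points. Returns the sorted
    indices of the pairs that survive all iterations.
    """
    z = src[:, 0] + 1j * src[:, 1]
    w = dst[:, 0] + 1j * dst[:, 1]
    keep = np.arange(len(z))
    for _ in range(iterations):
        a, t = fit_similarity(z[keep], w[keep])
        dist = np.abs(a * z[keep] + t - w[keep])
        n_drop = int(fraction * len(keep))   # rounded down (an assumption)
        keep = keep[np.argsort(dist)[: len(keep) - n_drop]]
    return np.sort(keep)


# Toy usage: 32 points on a circle, one of them grossly mismatched.
k = np.arange(32)
src = np.column_stack([np.cos(2 * np.pi * k / 32), np.sin(2 * np.pi * k / 32)])
dst = 2 * src + 1.0
dst[5] += 100.0
kept = trim_matches(src, dst)
```

Realigning after every elimination matters: a gross outlier biases the first alignment, and the remaining pairs can be judged fairly only after it has been removed.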
The last example (Fig. 5) shows results with an articulated object, matching human limbs at different body configurations.
In this section, we use our matching algorithm to computea
dissimilarity value, as will be explained in Section 5.6.Good
matching is essential for correct dissimilarity estima-tion, whose
values we use for quantitative comparisonswith other methods. We
use the image database created bySharvit et al. (see [36]), which
consists of 31 silhouettes ofsix classes of objects (including
fish, airplanes, and tools).
In [36], 25 images were selected out of this database and their pairwise similarities were computed. The measure of quality was the number of instances (out of 25) in which the first, second, and third nearest neighbor of an image was found in its own class. We follow the same procedure and show, in Table 1, the dissimilarity values which we obtain for 25 silhouettes (see footnote 3).
We find that the fraction of times (out of 25) that the first nearest neighbor of an image belongs to its own class is 25/25, namely, it is always the case. For the second and third nearest neighbors, the results are 21/25 and 19/25 (see footnote 4).
In comparison, the results reported in [36] are 23/25, 21/25, and 20/25, respectively, whereas our results for their choice of 25 images are the fractions 25/25, 20/25, and 17/25, respectively. It is to be noted, however, that, in the framework of [36], one can use additional information, specifically, whether the two graphs that represent a pair of shapes have similar topology.
We conclude that the two methods are comparable in quality when isolated silhouettes are matched. This is in spite of our using boundary representation, whereas symmetry representation (shock graph) is used in [36]. The shock graph representation is inherently more sensitive to occlusions, while shock graph matching requires solving the difficult NP-complete graph isomorphism problem (see Section 2). Moreover, our method can easily be adjusted to handle open curves by avoiding the assumption that the syntactic representation is cyclic. On the other hand, shock graphs must distinguish between interior and exterior.
Recently, progress has been made toward computing the edit distance between shock graphs [40] using a polynomial time algorithm that exploits their special structure. So far, however, the algorithm is not capable of dealing with invariance to image transformations and no quantitative measures have been reported.
Fig. 4. Matching two views of an object subjected to large foreshortening. Rejected pairs (in circles) were detected in four iterations of eliminating the 10 percent most distant pairs and realigning the others. Thirty-three and 35 features were extracted on the two outlines; 32 pairs were initially matched and nine pairs (28 percent) were rejected.
Fig. 5. Matching of human limbs at different body configurations. In this case, the outlines were extracted with snakes rather than by gray level clustering (see acknowledgments). Original images are not shown.
Fig. 6. Combination of various sources of difficulty: different models, different viewpoints, and occlusion. The merging utility is used to overcome the different number of feature points around the wheels; gap insertion is utilized to ignore the large irrelevant part.
3. As Table 1 shows, we have at least four images in each class. In [36], the same images were chosen with the exception that one of the classes consisted of only three images, while the fish class contained five images. However, for members of a class consisting of only three images, the three nearest neighbors can no longer be all in the same class. Hence, we slightly modified the choice of selected images.
4. This demonstrates the limitation of syntactic curve matching for shape recognition as, in some cases, objects have similar bounding curves but different interior regions.
4 CLUSTERING OF SILHOUETTES
The next step in our approach involves feeding a graph of image similarity values, computed by the silhouette matching algorithm, to a similarity-based clustering algorithm. By hierarchically dividing the images into distinct clusters, we discover structure in the data. To this end, we developed a stochastic clustering algorithm [14], whose full description is beyond the scope of this paper. Instead, we give below a brief review of the algorithm, showing clustering results which demonstrate the usefulness of our approach and the quality of the matching results.
4.1 Stochastic Clustering Algorithm
We represent the image database as a graph, with nodes
representing the images and similarity values assigning weights
to the edges. Every partition of the nodes into $r$ disjoint sets is
an $r$-way cut in the graph and the edges which connect
different sets of nodes are said to cross the cut. The value $c$ of a
cut is the sum of the weights of all crossing edges.
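The definition of a cut value can be illustrated with a minimal sketch. The edge dictionary, the label list, and the function name `cut_value` are hypothetical choices made for this illustration and are not taken from the clustering algorithm of [14].

```python
def cut_value(weights, labels):
    """Sum the weights of all edges whose two endpoints lie in different sets.

    weights: dict mapping an undirected edge (u, v) to its similarity weight.
    labels:  labels[u] is the index of the set that node u belongs to.
    """
    total = 0.0
    for (u, v), w in weights.items():
        if labels[u] != labels[v]:  # the edge crosses the cut
            total += w
    return total


# Toy graph with four images; each undirected edge is stored once.
toy_weights = {
    (0, 1): 0.9, (2, 3): 0.8,   # strong within-group similarities
    (0, 2): 0.1, (1, 3): 0.2,   # weak between-group similarities
    (0, 3): 0.05, (1, 2): 0.05,
}
two_way_cut = [0, 0, 1, 1]      # nodes {0, 1} versus nodes {2, 3}
print(cut_value(toy_weights, two_way_cut))
```

Placing all nodes in a single set gives a cut value of zero, which corresponds to the trivial one-way cut.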
Our clustering method induces a probability distribution over the
set of all $r$-way cuts in the graph, with a known lower bound on the
probability of the minimal cut. Under this distribution over cuts,
we compute, for every two nodes $u$ and $v$ in the graph, their
probability $p^r_{uv}$ of being in the same component of a random
$r$-way cut.

The partition of the nodes into disjoint sets, which satisfies
$p^r_{uv} < 0.5$ for every crossing edge $(u, v)$, is the output of our
clustering algorithm for scale level $r$ ($r = 1 \ldots N$). At level
$r = 1$, all the nodes must be in one cluster and, as $r$ is increased,
the partitions undergo a series of bifurcations. The "interesting"
bifurcation points, which account for meaningful data clustering,
can be clearly distinguished from the others. This is a major
advantage of our algorithm over deterministic agglomerative
methods.
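One way to read the partition rule is sketched below, assuming the pairwise probabilities have already been estimated for a fixed scale level. The sketch links every pair of nodes whose probability is at least 0.5 and returns the connected components, so every crossing edge has probability below 0.5. Taking connected components is an assumption of this illustration; the dictionary layout and the function name are hypothetical, and the estimation of the probabilities themselves (described in [14]) is not shown.

```python
def partition_from_probabilities(num_nodes, pair_prob, threshold=0.5):
    """Group nodes so that every edge between two groups has probability < threshold.

    pair_prob: dict mapping an edge (u, v) to the probability that u and v
               fall in the same component of a random cut.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (u, v), p in pair_prob.items():
        if p >= threshold:          # u and v must not be separated
            parent[find(u)] = find(v)

    groups = {}
    for node in range(num_nodes):
        groups.setdefault(find(node), []).append(node)
    return sorted(groups.values())


toy_prob = {(0, 1): 0.9, (2, 3): 0.7, (1, 2): 0.2}
print(partition_from_probabilities(4, toy_prob))
```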
Our clustering method is robust and efficient, running in
$O(N \log^2 N)$ time for sparse graphs and $O(N^2 \log N)$ for complete
graphs. It does not require that the similarity values obey metric
relations, hence it is suitable for the present application where the
similarity provided by our matching algorithm does not necessarily
satisfy the triangle inequality.5

1318 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE
INTELLIGENCE, VOL. 21, NO. 12, DECEMBER 1999

TABLE 1
The Dissimilarity Values Computed between 25 Silhouettes
(Multiplied by 1,000 and Rounded)
For each line, the columns that correspond to the three nearest
neighbors (and the self-zero distance) are highlighted. The first,
second, and third nearest neighbors are in the same class in
25/25, 21/25, and 19/25 of the cases, respectively.
4.2 Clustering Results
We now integrate the two steps of similarity estimation and
pairwise clustering into one experiment of image database
categorization. The database contains 121 images of 12 different
objects; 90 of the images were collected by placing six toy models
on a turntable so that the objects could be viewed from different
viewpoints. The other images are the 31 silhouettes discussed in
Section 3.3. For each of the six toy models, we collected 15 images
by rotating them in azimuth
($\vartheta = -20^\circ, -10^\circ, 0^\circ, 10^\circ, 20^\circ$) and
elevation ($\varphi = -10^\circ, 0^\circ, 10^\circ$). We used models of
a cow, wolf, hippopotamus, two different cars, and a child.
The central images ($\vartheta = \varphi = 0$) in each of the three
groups of pictures of animal models are side views (i.e., four legs,
head, and tail are visible). All 15 images of each animal model are
somewhat similar in that the same parts are visible (though, in some
pictures, some parts, such as two legs or a leg and a tail, are
merged into one in the silhouette). Thus, there is weak geometrical
similarity between all the 45 silhouettes of the three mammals and
there is weak geometrical similarity between the 30 different
silhouettes of the two cars. A desirable shape categorization
procedure should reveal this hidden hierarchical structure.

All the images were automatically preprocessed to extract the
silhouettes of the objects and represent them syntactically (see
Section 5.1). The dissimilarities between the silhouettes are
estimated using the algorithm described in Section 3.1. In order to
compare all the image pairs in our database of 121 images, we
performed 7,260 matching assignments; this took about 10 hours on an
INDY R4400 175 MHz workstation (about 5 seconds per image pair, on
average).

The dissimilarity matrix constitutes the input to the clustering
algorithm outlined above. When the scale parameter $r$ is varied, the
hierarchical classification shown in Fig. 7 is obtained. At the
highest level, $r = 1$, all the images belong to a single cluster. As
$r$ is increased, finer structure emerges. Note that related clusters
(like the two car clusters) split at higher $r$ values, which means
that our dissimilarity measure is continuous, assigning low (but
reliable) values to weakly similar shapes.

Since humans can do so, we assume that an ideal shape classifier can
put the images of every object in a different class. It is hard to
test this hypothesis since, as humans, we cannot ignore the semantic
meaning of the shapes. Nevertheless, comparing with the ideal human
perceptual classification, our finest resolution level is almost
perfect,
with only two classification errors (in the boxes marked by
�) and the undesirable split of the fish cluster.
GDALYAHU AND WEINSHALL: FLEXIBLE SYNTACTIC MATCHING OF CURVES
AND ITS APPLICATION TO AUTOMATIC HIERARCHICAL... 1319
5. This would also be the case had we used the generalized
Hausdorff distance or the normalized edit distance [29]. The
violation of metric properties is also known to exist in the
function underlying our human notion of similarity between both
semantic and perceptual stimuli [43].
Fig. 7. The classification tree (dendrogram) obtained for an
image database consisting of 121 images. The finest classification
level is shown by putting each cluster of silhouettes in a box. For
the large clusters representing our own toy models (see text), the
figure shows only five exemplars, but the other 10 are classified
correctly as well. Note that the lower levels of the tree
correspond to meaningful hierarchies, where similar classes
(like the two cars or the three sets of mammals) are grouped
together. The vertical axis is not to scale.
Our categorization is obtained using only intrinsic shape
information. The relative size, orientation, and position of
the silhouettes within each category are arbitrary. Moreover,
global information, like the length of the occluding contour
or the area it encloses, is not used. Hence, we expect that
moderate occlusion will not affect the classification.
5 FLEXIBLE SYNTACTIC MATCHING: DETAILS
In this section, we give the details of the various procedures
and steps involved in the curve matching algorithm outlined in
Section 3.1. Contour representation is discussed in Section 5.1.
The initial pruning of candidate global alignments is discussed in
Section 5.2, where we describe the procedure INITIALIZE. The
syntactic edit operations which are used by SYNTACTIC-STEP, and
their respective costs, are discussed in Section 5.3. The details
of the dynamic programming procedure, which SYNTACTIC-STEP uses to
minimize the edit distance, are given in Section 5.4. In
Section 5.5, we define the potentials which are used to guide the
search for the best starting point and describe the procedure
PREDICT-COST. Finally, the procedure DISSIMILARITY is discussed in
Section 5.6.
5.1 Preprocessing and Contour Representation
In the examples shown in this paper, objects appear on a dark
background and segmentation is successfully accomplished by a
commercial k-means segmentation tool. A syntactic representation of
the occluding contour is then automatically extracted: It is a
polygon whose vertices are either points of extreme curvature or
points which are added to refine the polygonal approximation. Thus,
the primitives of our syntactic representation are line segments and
the attributes are length and absolute orientation. The number of
segments depends on the chosen scale and the shape of the contour,
but typically it is around 50. Coarser scale descriptions may be
obtained using merge operations.
Feature points (vertices) are initially identified at points of
high curvature, according to the following procedure: At every
contour pixel $p$, an angle $\psi$ is computed between two vectors
$u$ and $v$. The vector $u$ is the vectorial sum of $m$ vectors
connecting $p$ to its $m$ neighboring pixels on the left and the
vector $v$ is similarly defined to the right. Points where $\psi$ is
locally minimal are defined as feature points. The polygonal
approximation (obtained by connecting these points by straight
lines) is compared with the original outline. If the distance
between the contour points and the polygonal segments is larger
than a few pixels, more feature points are added to refine the
approximation.
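The curvature test described above can be sketched as follows. This is a minimal illustration under stated assumptions: the contour is a closed list of (x, y) points, "left" and "right" are taken to mean the predecessors and successors along the contour, a strict local minimum is required, and the refinement step that adds extra vertices is omitted. The function names and the default value of `m` are hypothetical.

```python
import math


def corner_angles(contour, m=3):
    """Angle at every contour point between the summed vectors to its
    m predecessors (u) and to its m successors (v) on a closed contour."""
    n = len(contour)
    angles = []
    for i, (px, py) in enumerate(contour):
        ux = sum(contour[(i - k) % n][0] - px for k in range(1, m + 1))
        uy = sum(contour[(i - k) % n][1] - py for k in range(1, m + 1))
        vx = sum(contour[(i + k) % n][0] - px for k in range(1, m + 1))
        vy = sum(contour[(i + k) % n][1] - py for k in range(1, m + 1))
        norm = math.hypot(ux, uy) * math.hypot(vx, vy)
        cos_a = (ux * vx + uy * vy) / norm if norm > 0 else -1.0
        # Clamp against rounding before taking the arc cosine.
        angles.append(math.acos(max(-1.0, min(1.0, cos_a))))
    return angles


def feature_points(contour, m=3):
    """Indices where the angle is a strict local minimum along the cyclic contour."""
    a = corner_angles(contour, m)
    n = len(a)
    return [i for i in range(n) if a[i] < a[i - 1] and a[i] < a[(i + 1) % n]]
```

On a straight stretch of contour the angle is close to 180 degrees, while at a sharp corner it drops, which is why local minima mark candidate vertices.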
5.2 Global Alignment: Pruning the Starting Points
The procedure INITIALIZE receives two cyclic sequences $A$ and $A'$
of lengths $N$ and $N'$, respectively, as defined in Section 3.1.
Initially, there are $NN'$ possible starting points for the syntactic
matching procedure; recall that each starting point corresponds to
the matching of one segment $a_i$ in $A$ to another segment $a'_j$ in
$A'$, thus defining the global 2D alignment between the two curves.
However, the number of successful starting points is much smaller
than $NN'$ and they tend to correspond to similar global 2D alignment
transformations for two reasons: 1) Low cost transformations tend to
be similar since any pair of segments $\{a_i, a'_j\}$, which belongs to
a good correspondence list, is likely to be a good starting point for
the syntactic algorithm. 2) The overall number of good starting
points tends to be small since most of the pairs in $A \times A'$
cannot be extended to a low cost correspondence sequence; the reason
is that it might be possible to find a random match of short length,
but it is very unlikely to successfully match long random sequences.
These observations are used by the procedure INITIALIZE to
significantly reduce the number of candidate starting points. The
procedure uses as a parameter the number $t$ of edit operations which
are performed for every one of the $NN'$ possible starting points (in
our experiments we use $t = 5$ or 10). The pruning proceeds using the
relation between starting points and global 2D alignments, as
follows:

Every starting point $\{a_i, a'_j\}$ is associated with a global 2D
alignment and, in particular, with a certain rotation angle that
maps the direction of $a_i$ to that of $a'_j$. Let $n$ be
$\min\{N, N'\}$, and observe the distribution of the $n$ rotation
angles which achieved the best $n$ edit distances after $t$ steps. If
these angles are distributed sharply enough around some central
value $c$, we conclude that $c$ is a good estimator for the global
rotation. Then, we discard every starting point in $A \times A'$ whose
associated rotation is too far from $c$. The remaining starting
points form the set of candidates (see Fig. 8).
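The pruning idea can be sketched in a few lines. The text does not specify how "sharply enough" and "too far" are measured, so the circular mean, the concentration threshold, and the maximal deviation used below are assumptions of this illustration, as are the tuple layout and the function name.

```python
import math


def prune_by_rotation(candidates, n, min_concentration=0.8,
                      max_dev=math.radians(30)):
    """Keep only starting points whose rotation agrees with the dominant one.

    candidates: list of (i, j, rotation_in_radians, cost_after_t_steps).
    n:          number of lowest-cost candidates used to estimate the rotation.
    """
    best = sorted(candidates, key=lambda cand: cand[3])[:n]
    sx = sum(math.cos(cand[2]) for cand in best)
    sy = sum(math.sin(cand[2]) for cand in best)
    # Mean resultant length: 1.0 when all angles coincide, near 0 when spread out.
    concentration = math.hypot(sx, sy) / len(best)
    if concentration < min_concentration:
        return list(candidates)      # no clear central tendency, keep everything
    center = math.atan2(sy, sx)      # circular mean of the best rotations

    def deviation(angle):
        # Absolute angular difference wrapped into [0, pi].
        return abs((angle - center + math.pi) % (2 * math.pi) - math.pi)

    return [cand for cand in candidates if deviation(cand[2]) <= max_dev]
```

Working with the circular mean rather than the arithmetic mean avoids artifacts when the rotations cluster around the wrap-around point at 180 degrees.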
For each candidate that remains in this set, the procedure
INITIALIZE computes its future potential, as discussed in
Section 5.5, and sorts the list by increasing potential values. The
minimal edit distance (cost) that has been achieved during the first
$t$ steps is returned as $cost^*$ and the candidate possessing this
cost is returned as $\{i^*, j^*\}$.

5.3 Syntactic Operations Which Determine the Edit Distance
The goal of classical string edit algorithms is to find a
sequence of elementary edit operations which transform one string
into another at a minimal cost. The elementary operations are
substitution, deletion, and insertion of string symbols. Converting
the algorithm to the domain of vision, symbol substitution is
interpreted as matching two shape primitives and the substitution
cost is replaced by the dissimilarity between the matched
primitives. The dissimilarity measure is discussed in
Section 5.3.1. Novel operations involving gap insertion and the
merging of primitives are discussed in Sections 5.3.2 and 5.3.3.
5.3.1 Similarity Between Primitives
We now define the similarity between line segments $a_k$ and
$a'_l$. The cost of a substitution operation is this value with a
minus sign. Hence, the more similar the segments are, the lower
their substitution cost is. We denote the attributes of $a_k, a'_l$
(orientation and length) by $(\theta, \ell)$ and $(\theta', \ell')$,
respectively. The ratio between the length attributes is denoted
relative scale $c = \ell/\ell'$.
The term "reference segments" refers to the starting point
segments which implicitly determine the global rotation and scale
that align the two curves (as discussed above). The reference
segments are specified by the argument $\{i, j\}$ in the call to the
procedure SYNTACTIC-STEP and are denoted here $a_0, a'_0$. The segment
similarity also depends on the corresponding attributes
(orientation, length, and relative scale) of the reference
segments: $(\theta_0, \ell_0)$, $(\theta'_0, \ell'_0)$, and
$c_0 = \ell_0/\ell'_0$.
We first define the component of similarity which is determined
by the length (or relative scale) attribute of two matching
segments. We map the two matched pairs of length values
$\{\ell, \ell'\}$ and $\{\ell_0, \ell'_0\}$ to two corresponding
directions in the $(\ell, \ell')$-plane and measure the angle between
these two directions. The cosine of twice this angle is the
length-dependent component of our measure of segment similarity
(Fig. 9). This measure is numerically stable. It is not sensitive to
small scale changes nor does it diverge when $c_0 = \ell_0/\ell'_0$ is
small. It is measured in intrinsic units between $-1$ and 1. The
measure is symmetric so that the labeling of the contours as "first"
and "second" has no effect.
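Let $\delta$ be the angle between the vectors $(\ell, \ell')$ and
$(\ell_0, \ell'_0)$. Our scale similarity measure is:

$$S_\ell(\ell, \ell' \mid \ell_0, \ell'_0) = \cos 2\delta = \frac{4 c c_0 + (c^2 - 1)(c_0^2 - 1)}{(c^2 + 1)(c_0^2 + 1)}. \qquad (1)$$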
Thus, $S_\ell$ depends explicitly on the scale values $c$ and $c_0$
rather than on their ratio, hence, it cannot be computed from the
invariant attributes $\ell/\ell_0$ and $\ell'/\ell'_0$. The irrelevance
of labeling can be readily verified since

$$S_\ell(c, c_0) = S_\ell(c^{-1}, c_0^{-1}).$$

We next define the orientation similarity $S_\theta$ between two
line segments whose attributes are $\theta$ and $\theta'$,
respectively. The relative orientation between them is measured in
the trigonometric direction (denoted $\theta \to \theta'$) and compared
with the reference rotation ($\theta_0 \to \theta'_0$):

$$S_\theta(\theta, \theta' \mid \theta_0, \theta'_0) = \cos[(\theta \to \theta') - (\theta_0 \to \theta'_0)]. \qquad (2)$$

As with the scale similarity measure, the use of the cosine
introduces nonlinearity; we are not interested in fine similarity
measurement when the two segments are close to being parallel or
antiparallel. Our matching algorithm is designed to be flexible in
order to match curves that are only weakly similar; hence, we want
to encourage segment matching even if there is a small discrepancy
between their orientations. Similarly, the degree of dissimilarity
between two nearly opposite directions should not depend too much
on the exact angle between them. On the other hand, the point of
transition from acute to obtuse angle between the two orientations
seems to have a significant effect on the degree of similarity and,
therefore, the derivative of $S_\theta$ is maximal when the line
segments are perpendicular.
Finally, the combined similarity measure is defined as the
weighted sum:

$$S = w_1 S_\ell + S_\theta. \qquad (3)$$

The positive weight $w_1$ (which equals 1 in all our experiments)
controls the coupling of scale and orientation similarity.
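The three similarity definitions in (1), (2), and (3) can be written down directly. The sketch below assumes that each segment is given as an (orientation in radians, length) pair and that the relative orientation in the trigonometric direction is the plain angle difference; the function names are hypothetical.

```python
import math


def scale_similarity(l, lp, l0, lp0):
    """Equation (1): cosine of twice the angle between the directions
    (l, l') and (l0, l0'), expressed through c = l/l' and c0 = l0/l0'."""
    c, c0 = l / lp, l0 / lp0
    num = 4.0 * c * c0 + (c * c - 1.0) * (c0 * c0 - 1.0)
    den = (c * c + 1.0) * (c0 * c0 + 1.0)
    return num / den


def orientation_similarity(th, thp, th0, thp0):
    """Equation (2): cosine of the difference between the current relative
    rotation (th -> th') and the reference rotation (th0 -> th0')."""
    return math.cos((thp - th) - (thp0 - th0))


def similarity(seg, segp, ref, refp, w1=1.0):
    """Equation (3): weighted sum; the substitution cost is minus this value."""
    (th, l), (thp, lp) = seg, segp
    (th0, l0), (thp0, lp0) = ref, refp
    return (w1 * scale_similarity(l, lp, l0, lp0)
            + orientation_similarity(th, thp, th0, thp0))


# A segment pair that reproduces the reference scale and rotation exactly
# attains the maximal similarity 1 + w1.
print(similarity((0.3, 2.0), (0.8, 1.0), (1.0, 4.0), (1.5, 2.0)))
```

Because both components are cosines, the combined value always lies between $-(1 + w_1)$ and $1 + w_1$.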
5.3.2 Gap Opening
In string matching, the null symbol $\lambda$ serves to define the
deletion and insertion operations using $a \to \lambda$ and
$\lambda \to a$, respectively, where $a$ denotes a symbol. In our
case, $a$ denotes a line segment and $\lambda$ is interpreted as a
"gap element." Thus, $a \to \lambda$ means that the second curve is
interrupted and a gap element $\lambda$ is inserted into it to be
matched with $a$. Customarily, we define the same cost for both
operations, making the insertion of $a$ into one sequence equivalent
to its deletion from the other.
The cost of interrupting a contour and inserting $\xi$ connected
gap elements into it (that are matched with $\xi$ consecutive
segments on the other curve) is defined as $w_3 - w_2 \cdot \xi$,
where $w_2, w_3$ are positive parameters. Thus, we assign a penalty
of magnitude $w_3$ for each single interruption, which is discounted
by $w_2$ for every individual element insertion or deletion. This
predefined quantity competes with the lowest cost (or best reward)
that can be achieved by $\xi$ substitutions. A match of $\xi$ segments
whose cost is higher (less negative) than $w_3 - w_2 \cdot \xi$ is
considered to be poor and, in this case, the interruption and gap
insertion is preferred.
In all our experiments, we used $w_2 = 0.8$ and $w_3 = 8.0$. (These
values were determined in an ad hoc fashion and not by systematic
parameter estimation, which is left for future research.) These
numbers make it possible to match a gap with a long sequence of
segments as required when curves are partially occluded. On the
other hand, isolated gaps are discouraged due to the high
interruption cost. Our algorithm, therefore, uses deletions in cases
of occlusion, while, for local mismatches, it uses the merging
operation, which is described in Section 5.3.3.

Fig. 8. An example of initial pruning. Two curves with
$n = \min\{N, N'\} = 50$ are matched syntactically using $t$ edit steps
(see text for details). The 50 candidates which achieve the best
(minimal) edit cost are examined to see whether the distribution of
their associated rotations shows central tendency. Here, we sample
the distribution after 1, 5, and 10 syntactic steps. In this
example, 5-10 steps are sufficient for a reliable estimation of the
global rotation angle $c$. We proceed by eliminating the candidates
whose associated global rotation is too far from $c$.

Fig. 9. Length similarity is measured by comparing the reference
length values $\{\ell_0, \ell'_0\}$ with the corresponding length
values of the current segment $\{\ell, \ell'\}$. Each length pair is
mapped to a direction in the plane and similarity is defined as the
cosine of twice the angle between the two directions. This value is
bounded between $-1$ and 1.
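A short numerical sketch shows how the gap cost competes with substitutions under the parameter values quoted above. The function names are hypothetical, and the comparison assumes $w_1 = 1$, so that a single substitution cost (minus the similarity) lies between $-2$ and 2.

```python
W2, W3 = 0.8, 8.0  # parameter values reported in the text


def gap_cost(num_elements, w2=W2, w3=W3):
    """Cost of one interruption filled with num_elements gap elements."""
    return w3 - w2 * num_elements


def prefer_gap(substitution_costs, w2=W2, w3=W3):
    """True if a single gap spanning these segments is cheaper than
    substituting them one by one (costs are minus the similarities)."""
    return gap_cost(len(substitution_costs), w2, w3) < sum(substitution_costs)


# An isolated gap element costs far more than the worst single substitution,
# whereas a long run of poorly matching segments is cheaper to skip.
print(gap_cost(1), prefer_gap([2.0]))
print(gap_cost(10), prefer_gap([0.5] * 10))
```

With these values the gap cost reaches zero at ten elements, so a gap only pays off for longer runs or for runs whose substitutions are, on balance, penalties rather than rewards.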
5.3.3 Segment Merging
One advantage of using variant attributes (length and absolute
orientation) is that segment merging becomes possible. We use
segment merging as the syntactic homologue of curve smoothing,
accomplishing noise reduction by local resolution degradation.
Segment merging, if defined correctly, should simplify a contour
representation by being able to locally and adaptively change its
scale from fine to coarse.

We define the following merging rule: Two adjacent line segments
are replaced by the line connecting their two furthest endpoints. If
the line segments are viewed as vectors oriented in the direction of
propagation along the contour, then the merging operation of any
number of segments is simply their vectorial sum.6 The cost of a
merge operation is defined as the dissimilarity between the merged
segment and the one it is matched with. A comparison of this rule
with the literature is given in the appendix.
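The merging rule amounts to a vector sum, as the following minimal sketch shows. It assumes segments are given as (orientation in radians, length) pairs listed in order along the contour; the function name is hypothetical.

```python
import math


def merge_segments(segments):
    """Replace consecutive segments by the single segment that connects the
    start of the first to the end of the last (their vector sum).

    Returns the (orientation, length) attributes of the merged segment.
    """
    x = sum(length * math.cos(theta) for theta, length in segments)
    y = sum(length * math.sin(theta) for theta, length in segments)
    return math.atan2(y, x), math.hypot(x, y)


# A 3-unit horizontal step followed by a 4-unit vertical step merges into
# a single segment of length 5.
theta, length = merge_segments([(0.0, 3.0), (math.pi / 2, 4.0)])
print(theta, length)
```

The merged length is never larger than the sum of the original lengths, which is the expected behavior of a smoothing operation.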
5.4 Minimizing Edit Cost via Dynamic Programming
Procedure SYNTACTIC-STEP is given as input two cyclically ordered
sequences $A = \{a_1, \ldots, a_N\}$ and
$A' = \{a'_1, \ldots, a'_{N'}\}$, which are partially matched from
position $i$ of $A$ and onward, and from position $j$ of $A'$ and
onward. It uses dynamic programming to extend the edit
transformation between $A$ and $A'$ by one step. The edit operations
and their cost functions were discussed above in Section 5.3.
The notation used in Section 3.1 hides, for simplicity, the
workspace arrays which are used for path planning. Let us assume
that the procedure SYNTACTIC-STEP can access an array $R$ which
corresponds to the starting point $\{i, j\}$. The entry $R[\mu, \nu]$
holds the minimal cost that can be achieved when the $\mu$ segments
of $A$ following $a_i$ are matched with the $\nu$ segments of $A'$
following $a'_j$. We will see later that it is not necessary to keep
the complete array in memory.

A syntactic step is the operation of getting an array $R$ which is
partially filled, and extending it by filling some of the missing
entries. Our choice of extension is called "block completion." Let
us assume for simplicity that the reference segments $\{i, j\}$ are
the first ones. There is no loss of generality here since the order
in $A, A'$ is cyclic. Initially, the computed portion of the array
$R$ corresponds to a diagonal block whose two corners are $R[1, 1]$
and $R[\mu, \nu]$. When the procedure SYNTACTIC-STEP is applied to
$R$, it extends $R$ by filling in another row (of length $\nu$)
and/or another column (of length $\mu$).

The decision whether a row or a column is to be added depends on
the previous values of $R$ and is related to the potential
computation that is discussed below in Section 5.5. Roughly
speaking, we add a row (column) if the minimum of $R$ is in the last
computed row (column) and we add both if the minimum is obtained in
the last computed corner.
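When extending the block of size $\mu \times \nu$, every new entry
$R[k, l]$ is computed according to the following rule:

$$R[k, l] = \min\{r_1, r_2, r_3\}, \qquad (4)$$

where

$$r_1 = \min_{\alpha, \beta \in \ldots} \{R[\alpha, \beta] - S_{\alpha k, \beta l}\}, \qquad r_2 = \min \ldots$$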
-
We use a tight lower bound on the edit cost estimation,
termed "potential." The optimality of the search is thus
guaranteed. In this section, we define the potential function
and explain how it is used by the procedure PREDICT-COST.

Let us consider an entry $R[k, l]$ in the dynamic programming
array $R$. Without loss of generality, due to the cyclic segment
order, we can assume that the forward path leaving this entry lies
in the rectangular block defined by the corners $R[k, l]$ and
$R[N, N']$. We define $\eta$ and $\zeta$ to be the dimensions of this
rectangle: $\eta = \min\{N - k, N' - l\}$ and
$\zeta = \max\{N - k, N' - l\}$. We also define $\sigma$ to be the
maximal similarity between segments ($\sigma = 1 + w_1$ from (3)).
The lowest cost substitution thus has the negative cost $-\sigma$,
which acts as a reward.

The values of the parameters $w_1, w_2, w_3$ are chosen to
guarantee that a diagonal path consisting of the lowest cost
substitutions has the minimal possible cost.7 Hence, the lowest cost
forward path originating from $R[k, l]$ must contain a diagonal
segment of maximal length, namely of length $\eta$. The cost assigned
with this part is $-\sigma\eta$. In addition to the diagonal part, the
forward path may contain a vertical or horizontal part of length
$\zeta - \eta$. In general, this part can decrease the cost only if
$w_3 - w_2(\zeta - \eta)$ is negative. However, in the case where the
value of $R[k, l]$ is set equal to $r_2$ or $r_3$ in (4), the entry
$[k, l]$ is already considered to be inside a gap. In this case, we
can first extend the gap by $\zeta - \eta$ moves and only then
continue with $\eta$ diagonal steps. Thus, the interruption penalty is
avoided, and the forward path cost becomes
$-\sigma\eta - w_2(\zeta - \eta)$. Combining this bound with the value
of $R[k, l]$, we get a lower bound $f_{kl}$ on the minimal edit cost
which can be achieved passing through $R[k, l]$:

$$f_{kl} = \begin{cases} R[k, l] - \sigma\eta + \min\{0, w_3 - w_2(\zeta - \eta)\} & \text{if } R[k, l] = r_1 \text{ in (4)}, \\ R[k, l] - \sigma\eta - w_2(\zeta - \eta) & \text{if } R[k, l] = r_2, r_3 \text{ in (4)}. \end{cases} \qquad (8)$$
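The lower bound in (8) can be evaluated with a few lines of code. This is a minimal sketch under stated assumptions: the flag `in_gap` stands for the case in which the entry was obtained from the gap alternatives of (4), the default parameter values are those quoted in the text, and the function and argument names are hypothetical.

```python
def potential(r_kl, k, l, n, n_prime, in_gap, w1=1.0, w2=0.8, w3=8.0):
    """Lower bound on the edit cost of any path passing through entry [k, l].

    r_kl:    cost accumulated at the entry.
    in_gap:  True if the entry already lies inside a gap, so extending the
             gap incurs no new interruption penalty.
    """
    short_side = min(n - k, n_prime - l)   # longest possible diagonal run
    long_side = max(n - k, n_prime - l)
    best_reward = 1.0 + w1                 # maximal similarity of one substitution
    extra = long_side - short_side         # steps that cannot be diagonal
    diagonal = -best_reward * short_side
    if in_gap:
        return r_kl + diagonal - w2 * extra
    return r_kl + diagonal + min(0.0, w3 - w2 * extra)


print(potential(-12.0, 10, 10, 50, 50, in_gap=False))
print(potential(-12.0, 10, 10, 50, 70, in_gap=False))
print(potential(-12.0, 10, 10, 50, 70, in_gap=True))
```

Since the bound never exceeds the cost of any actual completion of the path, candidates whose potential is already higher than the best cost found so far can be discarded safely.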
-
the following: Features are matched when the local pieces of
curve around them have similar shape; if, after alignment,
they are also proximal, meaning that they agree with the
global alignment, then the match is likely to be correct.

The other pruning technique can be used when three related images
are available (rather than two). Assume that a feature point $p$ on
contour 1 is matched with point $p'$ on contour 2 and $p'$ is matched
with $p''$ on contour 3. If the matching between 1 and 3 supports the
mapping between $p$ and $p''$, then the correspondence list
($p \leftrightarrow p'$, $p' \leftrightarrow p''$,
$p \leftrightarrow p''$) is accepted.

5.7 Discussion and Comparison to Other Methods
Below we discuss some issues relating to complexity,
invariance, and direct proximity computation. A detailed
comparison of our syntactic operations to other methods
is given in the appendix.
5.7.1 Complexity
The algorithm develops one dynamic programming array per candidate
starting point returned by INITIALIZE and, when it terminates, each
array has been completed up to a block of some size. Let $n^2$ be the
average number of entries in a completed block. The total number of
computed entries is, therefore, $n^2$ times the number of candidates.
Clearly, $n$ is a fraction of $N$. From the considerations discussed
in Section 5.2, it follows that the number of candidates is typically
of the order of $N$ as well. It is possible, although not used here,
to constrain the procedure INITIALIZE to return a set of size
$\min\{N, N'\}$ exactly. The overall number of computed entries is,
therefore, $O(N^3)$.

Every single entry computation is of complexity $O(K^2)$ since
$K(K - 1)/2$ alternative evaluations are required to compute $r_1$ in
(4). Note that $K$ is usually a small constant (we used $K = 4$). An
entry computation is followed by updating the array potential. We
maintain for each row and column the best score achieved there,
hence, the array potential is computed in $O(K)$ time after a block
is completed.
5.7.2 How to Achieve Invariance
Available syntactic matching methods usually achieve scale and
rotation invariance (if at all) by using invariant attributes. The
benefit of using invariant attributes is efficiency. The drawback is
that invariant attributes cannot be smoothed by merging and they are
either nonlocal or noninterruptible. For example, in [27], the
orientation of a line segment is measured with respect to its
successor, hence, the opening of a gap between segments introduces
ambiguity into the representation (see Fig. 11). In [16], the
attributes which describe curve fragments are Fourier coefficients
and, in [1], an attribute called "sphericity" is defined. Both are
invariant attributes but noninterruptible.8

Moreover, it seems to be impossible to find operators on
invariant attributes that are equivalent to smoothing in real
space. Instead, a cascade of different scale representations must be
used [44], where a few fragments may be replaced by a single one
which is their "ancestor" in a scale space description. This
requires massive preprocessing, building a cascade of syntactical
representations for each curve with consistent fragment hierarchy.

In contrast, our algorithm is invariant with respect to scaling,
rotation, and translation without relying on invariant attributes,
while remaining efficient and capable of comparing complex real
image curves in a few seconds. Furthermore, a novel merging
operation was defined which accomplished curve simplification and
helped in noise reduction and resolution change.
5.7.3 Direct Proximity Minimization
The conversion of residual distances into a dissimilarity measure
is explained in Section 5.6. The reader may wonder why we can't
choose a proximity criterion, like the average distance between
matched points or the Hausdorff distance, and minimize it directly.
The Hausdorff distance may appear to be especially attractive since
it is not based on any prior feature pairing [2], [19].

We claim that direct minimization, which is heavily studied in
the literature, is not adequate when the curves are only weakly
similar. The reason is that proximity methods treat curves as two
sets of points and ignore more qualitative "structural"
information.

An example is shown in Fig. 12. In this example, we compare two
weakly similar curves, where a permissible alignment transformation
includes translation, rotation, and uniform scaling. Assuming that
two sets of feature points have been extracted from the two curves,
we investigate the proximity measure which is defined as the average
distance between matched points. This measure is larger for the
alignment shown in Fig. 12c than for the one shown in Fig. 12d.
Hence, a proximity algorithm, which seeks the optimal alignment that
achieves minimal residual distances, will consider Fig. 12d as a
better alignment than Fig. 12c.9
8. The Fourier coefficients are normalized individually, which
means that if every fragment undergoes a different rigid or scaling
transformation, the representation remains unchanged. The sphericity
representation behaves in the same way. The relative size and
orientation information is preserved as long as the sequence is not
interrupted since overlapping fragments are used. Note that, in
spite of this property, the algorithms are applied to partial
matching in the framework of model-based recognition, since the
solution that preserves the correct relative size and orientation
information between primitives remains a valid solution and the
danger of finding an undesired solution (as is demonstrated in
Fig. 11) is small.

9. Note that the lower proximity value in Fig. 12d is not the
result of the use of different global scaling since the shorter
curve appears in Fig. 12c and Fig. 12d at the same size. Moreover,
the lower proximity measure of Fig. 12d is obtained even though
many-to-one matches are avoided and the order of points is kept.
Without these constraints, it is easy to get arbitrarily small
proximity values for an arbitrary correspondence by shrinking one
set of points and matching it with only a few points (or even a
single point) from the other set.
Fig. 11. When the primitive attribute is measured relative to a
preceding primitive, interrupting the sequence creates problems.
Here, for example, the orientation information is lost when the
dotted segment is matched with a gap element. As a result, the two
contours may be matched almost perfectly to each other and
considered as very similar.
We continue with the previous example and investigate instead the
use of the directed Hausdorff distance as the proximity criterion.
The directed Hausdorff distance from a point set $P$ to a point set
$Q$ is defined (with the Euclidean norm $\|\cdot\|$) as
$h(P, Q) = \max_{p \in P} \min_{q \in Q} \|p - q\|$; it is equal to the
largest distance from some point in $P$ to its nearest neighbor in
$Q$. This measure is asymmetric and, therefore, the symmetric
expression $\max\{h(P, Q), h(Q, P)\}$ is often preferred. However,
when there is large image clutter, as in our case, the symmetric
distance is not useful since its value is determined by the
irrelevant part of the curve. Thus, the directed distance, measured
from the shorter curve to the longer one, is larger for the
alignment shown in Fig. 12c than for the one shown in Fig. 12d. A
proximity algorithm that is based on minimizing the directed
Hausdorff distance will consider Fig. 12d as a better alignment than
Fig. 12c.
In contrast, because it uses local structure, our syntactic
matching algorithm provides the correct matching of the
curves in Fig. 12c.
6 DISCUSSION
This paper deals with the inherently ill-posed problem of
matching weakly similar curves for the purpose of similarity
estimation. Our flexible syntactic matching is based on a simple
heuristic, guided by the principle that matched features should lie
on locally similar pieces of curve. Naturally, this principle cannot
guarantee that the results would always agree with our human
intuition for "good" matching, but our examples demonstrate that
satisfactory and intuitive results are usually obtained. In
addition, we successfully passed two more objective tests: 1) In a
database of 31 images, we showed that nearest neighbors, computed
according to our distance, are always of the same type; 2) in a
database of 121 images, clustering based on our similarity results
gave perceptually veridical hierarchical structure. Note that
"successful" matching depends on the application at hand. Our method
is not suitable for recovering depth from stereo, but it is
well-suited for more qualitative tasks, such as the organization of
an image database, the selection of prototypical shapes, and image
morphing for graphics or animation.
In order to achieve large flexibility, we introduced a nonlinear
measure of similarity between line segments which is not sensitive
to either very small or very large differences in their scale and
orientation. Our specific choice of segment similarity, combined
with a novel merging mechanism and an improved interruption
operation, add up to a robust and successful algorithm. The most
important properties of our algorithm, which make it advantageous
over other successful matching algorithms, are its relatively low
complexity, its locality, which allows us to deal with occlusions,
and its invariance to 2D image transformations.
We demonstrated excellent results, matching similar curves under
partial occlusion, matching similar curves where the curves depict
the occluding contours of objects observed from different
viewpoints, and matching different but related curves (like the
silhouettes of different mammals or cars). Furthermore, in a
classification task using our algorithm to precompute image pair
similarity, the clustering algorithm detected all the relevant
partitions. This kind of database structuring could not have emerged
without the reliable estimation of the dissimilarities between
weakly similar images. Hence, the classification tree is indirect
and objective evidence for the quality of our matching method.
Note that, according to the appearance-based approach to object
recognition, an object is represented by a collection of images
which, in some sense, span the image space of that object. Our
method can be used to divide the appearances of an object into
clusters of similar views in order to assist the construction of an
appearance-based representation.
APPENDIX
SYNTACTIC OPERATIONS: COMPARISONS
A How to Measure Similarity
Local and scale invariant matching methods usually use the
normalized length $\ell/\ell_0$. For example, the ratio between
normalized lengths $(\ell/\ell_0)/(\ell'/\ell'_0)$ is used in [26], [27]
(with global normalization, the difference $|\ell/L - \ell'/L'|$ can be used
[41], [44]). The ratio between normalized lengths may be viewed
as the ratio between the relative scale $c = \ell/\ell'$ and the reference
relative scale $c_0 = \ell_0/\ell'_0$. While this scale ratio is invariant,
it is not bounded, unlike our measure, and is thus
less stable.
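The step from "ratio of normalized lengths" to "ratio of scales", and the reason the ratio is unbounded, can be written out in one line. This is a worked restatement rather than a formula from the paper; it assumes $\ell$ and $\ell'$ are the lengths of the two compared segments and $\ell_0$, $\ell'_0$ are the lengths of their respective reference segments.

```latex
\frac{\ell/\ell_0}{\ell'/\ell'_0}
  \;=\; \frac{\ell/\ell'}{\ell_0/\ell'_0}
  \;=\; \frac{c}{c_0},
\qquad
\frac{c}{c_0} \to \infty
\quad \text{as } \ell' \to 0 \text{ with } \ell,\ \ell_0,\ \ell'_0 \text{ fixed.}
```

Under this reading, a single very short segment can drive the ratio arbitrarily high, which is one way to understand the remark that an unbounded measure is less stable.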
Fig. 12. (a), (b) Two weakly similar curves. The points are
extracted automatically from polygonal approximations that do not
depart from the original curves by more than eight pixels. (c) The
desirable matching result, as obtained by our matching algorithm.
The average distance between matched feature points is 25 pixels.
The directed Hausdorff distance is 34 pixels. (d) A nondesirable
pairing of points which yields better proximity value. The average
distance between matched feature points is only 23 pixels. The
directed Hausdorff distance is only 29 pixels.
We are familiar with only one other definition of a symmetric,
bounded, and scale invariant measure for segment length similarity
[26]. However, their matching algorithm is not syntactic and is very
different from ours. In addition, there is an important qualitative
difference between the two definitions (see Fig. 13), where
our measure is more suitable for flexible matching.
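The properties attributed to the measure of [26] can be checked numerically. The sketch below implements only the expression quoted in the caption of Fig. 13, $\exp(-|\log(c/c_0)|/\sigma)$ with width 0.5; the function name and the parameter name `sigma` are illustrative choices, and the authors' own measure (1) is defined outside this section and is not reproduced here.

```python
import math


def li_scale_similarity(c: float, c0: float, sigma: float = 0.5) -> float:
    """Scale similarity exp(-|log(c / c0)| / sigma), as quoted in Fig. 13 for [26].

    c and c0 are positive relative scales; sigma is the width parameter
    (0.5 in the figure). The name `sigma` is an assumption of this sketch.
    """
    if c <= 0 or c0 <= 0:
        raise ValueError("scales must be positive")
    return math.exp(-abs(math.log(c / c0)) / sigma)


if __name__ == "__main__":
    # Similarity to the reference scale c0 = 1 for a few values of c.
    for c in (0.5, 0.95, 1.0, 1.05, 2.0):
        print(f"c = {c:4.2f}   similarity = {li_scale_similarity(c, 1.0):.3f}")
```

The function is symmetric in its two arguments, bounded in (0, 1], and depends only on the ratio $c/c_0$. It has a cusp at $c = c_0$, so even a 5 percent scale change lowers the score noticeably. This is the qualitative contrast with a measure that is flat near $c = c_0$.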
As for our measure of orientation difference, we note that a
linear measure of orientation differences has been widely used by
others [9], [27], [41], [42]. The nonlinear measure used by [26]
differs from ours in exactly the same way as discussed above
concerning length.
Reminiscent of our combined similarity measure (3), in [33] a
coupled measure is used: The segments are superimposed at one end
and their dissimilarity is proportional to the distance between
their other ends. However, this measure is too complicated for our
case, and it has the additional drawback that it is sensitive to the
arbitrary reference scale and orientation (in the character
recognition task of [33], it is assumed that characters are of the
same scale and properly aligned).
B How to Open Gaps
All the syntactical shape matching algorithms that we
are familiar with make use of deletions and insertions as
purely local operations, as in classical string matching. That is,
the cost of inserting a sequence of gaps into a contour is equal to
the cost of spreading the same number of gap elements in different
places along the contour. We distinguish the two cases since the
first typically arises from occlusion or partial matching, while the
second arises typically from curve dissimilarity. In order to make
the distinction, we adopt a technique frequently used in protein
sequence comparison, namely, we assign a cost to any event of
contour interruption, in addition to the (negative) cost
from deletion/insertion of any single element.
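The distinction between one contiguous gap and several scattered gaps can be illustrated with a small dynamic program in the style used for protein sequences (an affine gap penalty). This is a minimal sketch on plain strings, not the authors' algorithm: it ignores cyclic contours and merging, and the names `align_score` and `simple_match` and the numeric costs are assumptions chosen for the example. Scores are maximized, so gap costs are negative.

```python
NEG_INF = float("-inf")


def simple_match(x, y):
    """Toy element similarity: reward equal symbols, penalize different ones."""
    return 2.0 if x == y else -2.0


def align_score(a, b, match_score=simple_match,
                gap_element=-1.0, interruption=-3.0):
    """Best global alignment score with a per-interruption cost.

    Every skipped element costs `gap_element`; every maximal run of skipped
    elements additionally costs `interruption` once.
    """
    n, m = len(a), len(b)
    # M: a[i-1] is matched to b[j-1]
    # X: a[i-1] is skipped (run of deletions from a)
    # Y: b[j-1] is skipped (run of deletions from b)
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0.0
    for i in range(1, n + 1):
        X[i][0] = interruption + i * gap_element
    for j in range(1, m + 1):
        Y[0][j] = interruption + j * gap_element

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            M[i][j] = (max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
                       + match_score(a[i - 1], b[j - 1]))
            # Extend an open run, or pay the interruption cost to open one.
            X[i][j] = max(X[i - 1][j] + gap_element,
                          max(M[i - 1][j], Y[i - 1][j]) + interruption + gap_element)
            Y[i][j] = max(Y[i][j - 1] + gap_element,
                          max(M[i][j - 1], X[i][j - 1]) + interruption + gap_element)
    return max(M[n][m], X[n][m], Y[n][m])


if __name__ == "__main__":
    # Two elements removed in one run versus two elements removed separately.
    print(align_score("ABCDEF", "ABEF"))   # one interruption
    print(align_score("AXBYC", "ABC"))     # two interruptions
```

With `interruption=0.0` the recurrence reduces to the classical purely local cost, where both examples pay the same total gap penalty. With a nonzero interruption cost, the scattered case pays it twice and the contiguous case only once.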
C How to Merge Segments
A similar approach to segment merging (Section 5.3.3) was taken
in [44], but their use of invariant attributes made it impossible to
realize the merge operator as an operation on attributes.
Specifically, there is no analytical relation between the attributes
being merged and the attributes of the new primitive. Instead, a
cascade of alternative representations was used, each one obtained
by a different Gaussian smoothing of the two-dimensional curve; a
primitive sequence is replaced by its "ancestor" in the scale
space description (see footnote 10).
Compare the polygonal approximation after merging with the
polygon that would have been obtained had the curve been first
smoothed and then approximated. The two polygons are not identical
since smoothing may cause displacement of features (vertices).
However, a displaced vertex cannot be too far from some feature at
the finest scale; the location error caused by "freezing" the
feature points is clearly bounded by the length of the
longest fragment in the initial (finest scale) representation.
To ensure good multiscaled feature matching, our suboptimal polygonal
approximation is sufficient and the expensive generation of the
multiscale cascade is not necessary. Instead, the attributes of the
coarse scale representation may be computed directly from the
attributes of the finer scale.
Merging was defined as an operation on attributes by [41], which
also applied the technique to Chinese character recognition [42].
Their algorithm suffers from some drawbacks concerning invariance
and locality (see footnote 11); below we concentrate on their
merging mechanism and compare it to our own.
Assume that two line segments characterized by $(\ell_1, \theta_1)$ and
$(\ell_2, \theta_2)$ are to be merged into one segment $(\ell, \theta)$. In [41],
$\ell = \ell_1 + \ell_2$ and $\theta$ is the weighted average between $\theta_1$ and
$\theta_2$, with weights $\ell_1/(\ell_1 + \ell_2)$ and $\ell_2/(\ell_1 + \ell_2)$ and with the
necessary cyclic correction (see footnote 12). Usually, the polygonal shape
that is obtained using this simple ad hoc merging scheme
cannot approximate the smoothed contour very well.
Fig. 13. Scale similarity: The scale $c = \ell/\ell'$ is compared with
the reference scale $c_0 = \ell_0/\ell'_0$. Left: The binary relation used by Li
[26] to measure scale similarity is $\exp(-|\log(c/c_0)|/\sigma)$ with $\sigma = 0.5$.
Right: Our measure function (1) is not sensitive to small scale
changes, since it is flat near the line $c = c_0$.
Fig. 14. Comparison between merging rules: (a) A polygonal
approximation of a curve with two dotted segments which are to be
merged. (b) Merging result according to our scheme. A coarser
approximation is obtained. (c) Merging according to Tsai and Yu
[41]. The new polygon does not appear to give a good
approximation.
10. The primitive elements used in [44] are convex and
concave fragments, which are bounded by inflection points. The
attributes are the fragment length divided by total curve length (a
nonlocal attribute) and the accumulated tangent angle along the
fragment (a noninterruptible attribute). The algorithm cannot handle
occlusions or partial distortions and massive preprocessing is
required to prepare the cascade of syntactical representations for
each curve with consistent fragment hierarchy.
11. The primitives used by [41] are line segments, the
attributes are relative length (with respect to the total length)
and absolute orientation (with respect to the first segment). The
relative length is, of course, a nonlocal attribute and, in
addition, the algorithm uses the total number of segments, meaning
that the method cannot handle occlusions. The problem of attribute
variance due to rotation remains, in fact, unsolved. The
authors assume that the identity of the first segments is known.
They comment that if this information is missing, one may try to
hypothesize an initial match by labeling the segment that is near
the most salient feature as segment number one.
12. For example, an equal weight average between $0.9\pi$ (almost
"west") and $-0.9\pi$ (almost "west" as well) is $\pi$ ("west") and not zero
("east").
Satisfactory noise reduction is only achieved in one of the
following two extreme cases: Either one segment is dominant (much
longer than the other one) or the two segments have similar
orientation. If two or more segments having large variance are
merged, the resulting curve may be very different from the original
curve (see Fig. 14). Hence, by performing segment merging on the
polygonal approximation at fine scale, one typically does not
obtain an acceptable coarse approximation of the shape.
ACKNOWLEDGMENTS
The authors would like to thank Ben Kimia and Daniel
Sharvit for the 31 silhouette database and Davi Geiger for
the human limbs data. This research is partially funded by
the Israeli Ministry of Science.
REFERENCES
[1] N. Ansari and E. Delp, "Partial Shape Recognition: A Landmark
Based Approach," IEEE Trans. Pattern Analysis and Machine
Intelligence, vol. 12, pp. 470-489, 1990.
[2] E. Arkin, L.P. Chew, D. Huttenlocher, K. Kedem, and
J. Mitchell, "An Efficiently Computable Metric for Comparing Polygonal
Shapes," IEEE Trans. Pattern Analysis and Machine Intelligence, vol.
13, pp. 209-216, 1991.
[3] H. Asada and M. Brady, "The Curvature Primal Sketch," IEEE
Trans. Pattern Analysis and Machine Intelligence, vol. 8, pp.
2-14, 1986.
[4] N. Ayache and O. Faugeras, "HYPER: A New Approach for the
Recognition and Positioning of Two Dimensional Objects," IEEE
Trans. Pattern Analysis and Machine Intelligence, vol. 8, pp.
44-54, 1986.
[5] R. Basri, L. Costa, D. Geiger, and D. Jacobs, "Determining the
Similarity of Deformable Shapes," IEEE Workshop Physics Based
Modeling in Computer Vision, pp. 135-143, 1995.
[6] B. Bhanu and O. Faugeras, "Shape Matching of Two Dimensional
Objects," IEEE Trans. Pattern Analysis and Machine
Intelligence, vol. 6, pp. 137-155, 1984.
[7] R. Bolles and R. Cain, "Recognizing and Locating Partially Visible
Objects: The Focus Feature Method," Int'l J. Robotics
Research, vol. 1, pp. 57-81, 1982.
[8] A. Brint and M. Brady, "Stereo Matching of Curves," Image and
Vision Computing, vol. 8, pp. 50-56, 1990.
[9] W. Christmas, J. Kittler, and M. Petrou, "Structural Matching in
Computer Vision Using Probabilistic Relaxation," IEEE Trans.
Pattern Analysis and Machine Intelligence, vol. 17, pp.
749-764, 1995.
[10] I. Cohen, N. Ayache, and P. Sulger, "Tracking Points on
Deformable Objects Using Curvature Information," Proc. European
Conf. Computer Vision, pp. 458-466, 1992.
[11] L. Davis, "Shape Matching Using Relaxation Techniques," IEEE
Trans. Pattern Analysis and Machine Intelligence, vol. 1, pp.
60-72, 1979.
[12] A. Del Bimbo and P. Pala, "Visual Image Retrieval by Elastic
Matching of User Sketches," IEEE Trans. Pattern Analysis and
Machine Intelligence, vol. 19, pp. 121-132, 1997.
[13] A. Dembo, S. Karlin, and O. Zeitouni, "Critical Phenomena for
Sequence Matching with Scoring," Annals of Probability, vol. 22,
pp. 1,993-2,021, 1994.
[14] Y. Gdalyahu, D. Weinshall, and M. Werman, "Stochastic Image
Segmentation by Typical Cuts," Proc. IEEE Conf. Computer Vision
and Pattern Recognition, Fort Collins, Colo., 1999.
[15] D. Geiger, A. Gupta, L. Costa, and J. Vlontzos, "Dynamic
Programming for Detecting, Tracking and Matching Deformable
Contours," IEEE Trans. Pattern Analysis and Machine
Intelligence, vol. 17, pp. 294-302, 1995.
[16] J. Gorman, O. Mitchell, and F. Kuhl, "Partial Shape Recognition
Using Dynamic Programming," IEEE Trans. Pattern Analysis and
Machine Intelligence, vol. 10, pp. 257-266, 1988.
[17] J. Gregor and M. Thomason, "Dynamic Programming Alignment of
Sequences Representing Cyclic Patterns," IEEE Trans. Pattern
Analysis and Machine Intelligence, vol. 15, pp. 129-135,
1993.
[18] R. Horaud and T. Skordas, "Stereo Correspondence Through
Feature Grouping and Maximal Cliques," IEEE Trans. Pattern
Analysis and Machine Intelligence, vol. 11, pp. 1,168-1,180,
1989.
[19] D. Huttenlocher, G. Klanderman, and W. Rucklidge, "Comparing
Images Using the Hausdorff Distance," IEEE Trans. Pattern
Analysis and Machine Intelligence, vol. 15, pp. 850-863,
1993.
[20] D. Huttenlocher and S. Ullman, "Object Recognition Using
Alignment," Proc. Int'l Conf. Computer Vision, pp.
102-111, London, 1987.
[21] D.W. Jacobs, D. Weinshall, and Y. Gdalyahu, "Condensing Image
Databases when Retrieval Is Based on Nonmetric Distances,"
Proc. Sixth Int'l Conf. Computer Vision, Bombay, 1998.
[22] B. Kamger-Parsi, M. Margalit, and A. Rozenfeld, "Matching
General Polygonal Arcs," Computer Vision, Graphics, and Image
Processing: Image Understanding, vol. 53, pp. 227-234,
1991.
[23] M. Koch and R. Kashyap, "Using Polygons to Recognize and
Locate Partially Occluded Objects," IEEE Trans. Pattern Analysis
and Machine Intelligence, vol. 9, pp. 483-494, 1987.
[24] L.J. Latecki and R. Lakämper, "Shape Similarity Measure for
Image Database of Occluding Contours," Proc. Fourth IEEE Workshop
Applications of Computer Vision, Princeton, N.J., Oct. 1998.
[25] Y. Lamdan, J. Schwartz, and H. Wolfson, "Affine Invariant Model
Based Object Recognition," IEEE Trans. Robotics and
Automation, vol. 6, pp. 578-589, 1990.
[26] S. Li, "Matching: Invariant to Translations, Rotations and Scale
Changes," Pattern Recognition, vol. 25, pp. 583-594, 1992.
[27] H. Liu and M. Srinath, "Partial Shape Classification Using
Contour Matching in Distance Transformation," IEEE Trans.
Pattern Analysis and Machine Intelligence, vol. 12, pp.
1,072-1,079, 1990.
[28] C. Lu and J. Dunham, "Shape Matching Using Polygon
Approximation and Dynamic Alignment," Pattern Recognition
Letters, vol. 14, pp. 945-949, 1993.
[29] A. Marzal and E. Vidal, "Computation of Normalized Edit
Distance and Applications," IEEE Trans. Pattern Analysis and
Machine Intelligence, vol. 15, pp. 926-932, 1993.
[30] R. McConnell, R. Kwok, J. Curlander, W. Kober, and S. Pang,
"ψ-S Correlation and Dynamic Time Warping: Two Methods for Tracking
Ice Floes in SAR Images," IEEE Trans. Geoscience and Remote Sensing,
vol. 29, pp. 1,004-1,012, 1991.
[31] F. Mokhtarian and A. Mackworth, "Scale-Based Description and
Recognition of Planar Curves and Two-Dimensional Shapes," IEEE
Trans. Pattern Analysis and Machine Intelligence, vol. 8, pp.
34-44, 1986.
[32] M. Pelillo, K. Siddiqi, and S. Zucker, "Matching Hierarchical
Structures Using Association Graphs," Proc. European Conf.
Computer Vision, pp. 3-16, 1998.
[33] J. Rocha and T. Pavlidis, "A Shape Analysis Model with
Applications to a Character Recognition System," IEEE Trans.
Pattern Analysis and Machine Intelligence, vol. 16, pp.
393-404, 1994.
[34] S. Sclaroff and A. Pentland, "Modal Matching for Correspondence
and Recognition," IEEE Trans. Pattern Analysis and Machine
Intelligence, vol. 17, pp. 545-561, 1995.
[35] L. Shapiro and M. Brady, "Feature Based Correspondence: An
Eigenvector Approach," Image and Vision Computing, vol. 10, pp.
283-288, 1992.
[36] D. Sharvit, J. Chan, H. Tek, and B. Kimia, "Symmetry Based
Indexing of Image Databases," J. Visual Comm. and Image
Representation, 1998.
[37] A. Shokoufandeh, S. Dickinson, K. Siddiqi, and S. Zucker,
"Indexing Using a Spectral Encoding of Topological Structure,"
Proc. Computer Vision and Pattern Recognition, pp.
491-497, 1999.
[38] K. Siddiqi, A. Shokoufandeh, S. Dickinson, and S. Zucker,
"Shock Graphs and Shape Matching," Proc. Int'l Conf. Computer
Vision, pp. 222-229, 1998.
[39] G. Stockman, S. Kopstein, and S. Benett, "Matching Images to
Models for Registration and Object Detection via Clustering," IEEE
Trans. Pattern Analysis and Machine Intelligence, vol. 4,
pp. 229-241, 1982.
[40] S. Tirthapura, D. Sharvit, P. Klein, and B. Kimia, "Indexing Based
on Edit Distance Matching of Shape Graphs," SPIE Proc.
Multimedia Storage and Archiving Systems III, pp. 25-36,
1998.
[41] W. Tsai and S. Yu, "Attributed String Matching with Merging for
Shape Recognition," IEEE Trans. Pattern Analysis and Machine
Intelligence, vol. 7, pp. 453-462, 1985.
[42] Y. Tsay and W. Tsai, "Attributed String Matching by Split and
Merge for On-Line Chinese Character Recognition," IEEE Trans.
Pattern Analysis and Machine Intelligence, vol. 15, pp.
180-185, 1993.
[43] A. Tversky, "Features of Similarity," Psychological Review,
vol. 84, pp. 327-352, 1977.
[44] N. Ueda and S. Suzuki, "Learning Visual Models from Shape
Contours Using Multiscale Convex/Concave Structure Matching," IEEE
Trans. Pattern Analysis and Machine Intelligence, vol. 15, pp.
337-352, 1993.
[45] S. Umeyama, "Parameterized Point Pattern Matching and Its
Application to Recognition of Object Families," IEEE Trans. Pattern
Analysis and Machine Intelligence, vol. 15, pp. 136-144,
1993.
[46] Y. Wang and T. Pavlidis, "Optimal Correspondence of String
Subsequences," IEEE Trans. Pattern Analysis and Machine
Intelligence, vol. 12, pp. 1,080-1,086, 1990.
[47] D. Weinshall and M. Werman, "On View Likelihood and
Stability," IEEE Trans. Pattern Analysis and Machine
Intelligence, vol. 19, pp. 97-108, 1997.
[48] M. Werman and D. Weinshall, "Similarity and Affine Invariant
Distance Between Point Sets," IEEE Trans. Pattern Analysis and
Machine Intelligence, vol. 17, pp. 810-814, 1995.
Yoram Gdalyahu received the BSc degree in physics and mathematics
from the Hebrew University, Jerusalem, Israel. He received the MSc
degree in physics from the Weizmann Institute of Science, Rehovot,
Israel for his work on resonant tunneling and inelastic scattering
in Gallium Arsenide heterostructures. His PhD thesis was in computer
vision, submitted to the senate of the Hebrew University in October
1999. His research combines computer vision and
machine learning. Upon graduation, he received the Eshkol scholarship
given by the Israeli Ministry of Science to the best PhD
students. He is currently with the IBM Research Laboratory, Haifa,
Israel.
Daphna Weinshall received the BSc degree in mathematics and
computer science from Tel-Aviv University, Tel-Aviv, Israel, in
1982. She received the MSc and PhD degrees in mathematics and
statistics from Tel-Aviv University in 1985 and 1986, respectively,
for her work on models of evolution and population genetics. Between
1987 and 1992, she visited the center for biological information
processing at MIT and the IBM T.J. Watson Research Center. In
1993, she joined the Institute of Computer Science at the Hebrew
University of Jerusalem, where she is now an associate professor.
Her research interests include computer and biological vision,
machine and human learning. She has published papers on learning in
machine and human vision, qualitative vision, visual psychophysics,
Bayesian vision, invariants, multipoint and multiframe geometry,
image and model point-based metrics, motion and structure from
motion. She is a member of the IEEE.
1328 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE
INTELLIGENCE, VOL. 21, NO. 12, DECEMBER 1999