Invariance to Affine-Permutation Distortions

Liang-Yan Gui, Carnegie Mellon University, [email protected]
David A. Sepiashvili, Independent Consulting, [email protected]
José M. F. Moura, Carnegie Mellon University, [email protected]
Abstract

An object imaged from various viewpoints appears very different. Effective shape representation of objects is therefore central in many applications of computer vision. We consider affine and permutation distortions. We derive the affine-permutation shape space that extends the affine-only shape space (the Grassmannian) to include permutation distortions. We compute the affine-permutation shape space metric, the sample mean of multiple shapes, the geodesic defined by two shapes, and a canonical representative for a shape equivalence class. We illustrate our approach in several applications, including clustering and morphing of shapes of different objects along a geodesic path. Experimental results on key benchmark datasets demonstrate the effectiveness of our framework.
1. Introduction

Shape is an important characteristic of an object, and shape analysis has wide applications in medical imaging [13, 12], document analysis [18, 17], neuroscience [10, 1], and many other computer vision problems [5]. A major challenge lies in the large distortions of imaged objects that result from wide variations in viewpoint. In many practical applications, these variations are well approximated by affine distortions [8], for example when a pinhole camera is modeled as an affine camera because the camera center is far from the object and the object is rigid. In this paper, we represent shapes by the collection of landmarks on the boundary and, when available, in the interior of the image of an object. We consider the ordering of the pixels to be unknown, so they can be shuffled or permuted between different configurations of the same object; in other words, the images are obtained under affine and permutation distortions.

Our work develops an affine-permutation shape space that is invariant to both affine and permutation distortions. We define a shape similarity measure that is invariant to these distortions and quantifies the difference between two affine-permutation distorted shapes. We propose how to compute this similarity measure, and we find a canonical representative for each distinct object. We use this canonical representative to solve for the geodesic connecting two shapes and for the Karcher mean of multiple shapes, which may represent different affine-permutation distorted objects.
2. Shape Space: A Group Perspective

We adopt here the group representations in [16, 6, 15].

Configuration Space. As described by Ha and Moura [6], the configuration of a rigid object consists of N landmarks (pixels) {p_k}, k = 1, ..., N, on a 2D plane R^2. Given a reference coordinate system, the location of point p_k is specified by a pair of coordinates (x_k, y_k). We represent the configuration by an N × 2 matrix X, the configuration matrix. The collection of all configurations of N points on a 2D image plane is the configuration space X, here the Euclidean space R^(N×2). We exclude matrices with rank 1, since they correspond to degenerate shapes, i.e., a single point or a straight line.

We consider affine-distorted and affine-permutation-distorted versions of a configuration X. Two configurations that are affine distortions of each other belong to the same equivalence class ⌈X⌋A. The affine shape of X is this equivalence class. The quotient space X/A, where A denotes the affine group action [14], collects all such equivalence classes and defines the affine shape space. Similarly, the equivalence class ⌈X⌋AP defines the affine-permutation shape of X under both affine and permutation distortions, and the quotient space X/AP defines the affine-permutation shape space.

Affine transformations account for unknown translation, reflection, rotation, scaling, and skewing distortions between two configurations. We factor out these different distortions sequentially.

Normalized Configuration Space. We first consider translation, scaling, and skewing distortions and define normalized configurations that are invariant to these distortions.

Given a configuration X, its normalized configuration Y satisfies three conditions: (1) Y and X are in the same equivalence class (for affine distortions, Y ∈ ⌈X⌋A; for affine-permutation distortions, Y ∈ ⌈X⌋AP); (2) the center of mass is at the origin, mean(Y) = 0, where mean(Y) is the 1 × 2 row vector of the column means of Y; (3) Y is an orthonormal matrix, Y^T Y = I, where I is the 2 × 2 identity matrix.

The collection of normalized configurations forms the normalized configuration space, a Stiefel manifold, denoted Gf(2, N) [4]. To find the normalized configuration of X, we first remove translation by centering, Xc ≜ X − 1 ⊗ mean(X), where 1 is an N × 1 vector of ones and ⊗ is the Kronecker product, so that mean(Xc) = 0, where 0 is a 1 × 2 zero vector. We then normalize by computing the compact singular value decomposition (SVD) of Xc, Xc = U S V^T, and defining the normalized configuration Y ≜ U V^T. The normalized configuration Y is invariant to translation, scaling, and skewing distortions; it is an N × 2 orthonormal matrix, mapped to a point on the Stiefel manifold (see the sketch below). Next, we construct the affine shape space and the affine-permutation shape space as quotient spaces of Gf(2, N).
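To make the centering and normalization steps concrete, the following sketch (assuming NumPy; function and variable names are ours) maps a raw configuration matrix X to its normalized configuration Y = U V^T.

```python
import numpy as np

def normalize_configuration(X):
    """Map an N x 2 configuration matrix to its normalized configuration.

    Centering removes translation; the compact SVD removes scaling and
    skewing, leaving only orientation (and, possibly, permutation) ambiguity.
    """
    Xc = X - X.mean(axis=0)                             # center of mass at the origin
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)   # compact SVD, U is N x 2
    return U @ Vt                                       # Y, with Y^T Y = I_2 and mean(Y) = 0

# Example: a random non-degenerate configuration of N = 50 landmarks.
X = np.random.randn(50, 2) @ np.array([[2.0, 0.3], [0.1, 0.5]]) + np.array([5.0, -3.0])
Y = normalize_configuration(X)
assert np.allclose(Y.T @ Y, np.eye(2), atol=1e-10)
assert np.allclose(Y.mean(axis=0), 0.0, atol=1e-10)
```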

Affine Shape Space. If two configurations X and X̃ are affine distortions of each other, their corresponding normalized configurations Y and Ỹ are related by Ỹ = Y V, where V is a 2 × 2 orthogonal matrix. That is, after the centering and normalizing steps, only an orientation ambiguity remains. Let O2 denote the orthogonal group of 2 × 2 orthogonal matrices (rotations and reflections), and define the coset of O2 with respect to Y as

⌈Y⌋O = {Y V : V ∈ O2}. (1)

The equivalence class ⌈Y⌋O is invariant to orientation transformations, since both Y and Ỹ belong to ⌈Y⌋O. The orthogonal equivalence relation, denoted ≡O, partitions the Stiefel manifold Gf(2, N). The resulting quotient space Gf(2, N)/O2 is the affine shape space X/A, which is the Grassmannian manifold Gr(2, N) [7]. Each affine shape ⌈Y⌋O is mapped to a point p on Gr(2, N). Any matrix belonging to the equivalence class ⌈Y⌋O stores the point p numerically and is a matrix representation of p.

Affine-Permutation Shape Space. For affine-permutation distortions, after the centering and normalizing steps, both orientation and permutation ambiguities remain. Let PN be the set of N × N permutation matrices; PN together with matrix multiplication forms the permutation group PN. The double coset of Y by O2 and PN is

⌈Y⌋OP = {P Y V : P ∈ PN, V ∈ O2}. (2)

This double coset is invariant to the affine and permutation distortions; it is the affine-permutation shape ⌈X⌋AP. Any matrix belonging to the equivalence class ⌈Y⌋OP is a matrix representation of this affine-permutation shape. We also present a canonical representative for each affine-permutation shape in Section 3. The quotient of the affine shape space Gr(2, N) by PN is the affine-permutation shape space, which we denote GS(2, N).
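To make the equivalence class concrete, here is a minimal sketch (assuming NumPy; names are ours) that produces another matrix representation of the same affine-permutation shape by applying a random permutation P ∈ PN and a random orthogonal V ∈ O2 to a normalized configuration Y.

```python
import numpy as np

def random_coset_member(Y, seed=None):
    """Return P @ Y @ V for a random permutation P and a random 2 x 2
    orthogonal V: another matrix representation of the same
    affine-permutation shape (the double coset of Y)."""
    rng = np.random.default_rng(seed)
    N = Y.shape[0]
    P = np.eye(N)[rng.permutation(N)]          # random permutation matrix
    theta = rng.uniform(-np.pi, np.pi)
    V = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    if rng.random() < 0.5:                     # half the time, include a reflection
        V = V @ np.diag([1.0, -1.0])
    return P @ Y @ V
```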

3. Affine-Permutation Shape Space

Distance Definition between Two Affine-Permutation Shapes. The distance should reflect the geometry of the affine-permutation shape space and be invariant to both affine and permutation distortions. We first define a distance on the affine-permutation shape space GS(2, N) [15] and then provide an efficient method for its computation.

Definition 1. The distance between two points s0, sτ ∈ GS(2, N), represented by ⌈Y0⌋OP and ⌈Yτ⌋OP, is defined by

dGS(2,N)(⌈Y0⌋OP, ⌈Yτ⌋OP)
  = min_{P0, Pτ ∈ PN} dGr(2,N)(P0 · ⌈Y0⌋O, Pτ · ⌈Yτ⌋O)  (3)
  = min_{P ∈ PN} dGr(2,N)(⌈Y0⌋O, P · ⌈Yτ⌋O),  (4)

where P0 and Pτ are N × N permutation matrices from the permutation set PN. The second equality holds because P0 and Pτ are not independent, so we may fix one of them as the reference without loss of generality.

Distance Computation. The definition of the distance involves a combinatorial minimization over the set of N-dimensional permutations. It can be computed as

dGS(2,N)(⌈Y0⌋OP, ⌈Yτ⌋OP) = min_{P ∈ PN} (√2/π) √(trace(Eτ²)),

where Eτ is the diagonal matrix Eτ = acosm(Cτ) and Cτ is obtained from the SVD Y0ᵀ · (P Yτ) = V0 Cτ Vτᵀ. Let the cost function be J(P) = trace(Eτ²) and consider P = argmin_{P ∈ PN} J(P).
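For a fixed permutation P, the inner quantity is the Grassmannian distance between the affine shapes of Y0 and P · Yτ, computed from the principal angles. A minimal sketch (assuming NumPy; the normalization constant √2/π follows our reading of the formula above):

```python
import numpy as np

def grassmann_distance(Y0, Y1):
    """Distance between the affine shapes of Y0 and Y1 (both N x 2 with
    orthonormal columns), computed from the principal angles between the
    subspaces they span."""
    C = np.linalg.svd(Y0.T @ Y1, compute_uv=False)   # singular values = cos(principal angles)
    theta = np.arccos(np.clip(C, -1.0, 1.0))         # diagonal of E_tau = acosm(C_tau)
    return (np.sqrt(2.0) / np.pi) * np.sqrt(np.sum(theta ** 2))

def distance_for_fixed_permutation(Y0, Ytau, P):
    """d_Gr(2,N) between the affine shapes of Y0 and P @ Ytau for one candidate P."""
    return grassmann_distance(Y0, P @ Ytau)
```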

Since permutation matrices are doubly stochastic and orthogonal, PN is a subset of the set DN of doubly stochastic matrices. By Birkhoff's theorem [9], the optimum of a concave cost function over the doubly stochastic matrices D ∈ DN is attained at a permutation matrix. We therefore relax the constraint to P = argmin_{P ∈ DN} J(P). This problem is still challenging, since the doubly stochastic matrix P affects all three SVD factors, and typical methods, e.g., differentiating the cost, are intractable. However, an approximation to the distance is feasible when the problem is cast with a linear cost function. We approximate the distance by the Frobenius norm:

dGS(2,N)(⌈Y0⌋OP, ⌈Yτ⌋OP) = min_{P ∈ DN, O ∈ O2} ‖Y0 − P · Yτ · O‖F. (5)

Thus, we define a new approximate cost function. Since Y0 and Yτ have orthonormal columns and P and O preserve the Frobenius norm, ‖Y0 − P · Yτ · O‖²F equals a constant minus 2 trace(P · (Yτ · O) · Y0ᵀ), so minimizing it is equivalent to minimizing

JL(P) = −trace(P · (Yτ · O) · Y0ᵀ), (6)

whose gradient is ∇JL(P) = −Y0 · (Yτ · O)ᵀ. To simplify the minimization, we further use the vectorization notation vec(·) to formulate a linear approximate cost function. Letting w = vec(D), Eq. (6) and its gradient become

JL(w) = −vec(Y0 · (Yτ · O)ᵀ)ᵀ · w, (7)
∇JL(w) = −vec(Y0 · (Yτ · O)ᵀ). (8)

As to the constraints, a doubly stochastic matrix is a square matrix with nonnegative entries whose rows and columns each sum to 1, so its vectorized form w must satisfy

(1ᵀ ⊗ I) · w = 1, (I ⊗ 1ᵀ) · w = 1, w ≥ 0, (9)

where ⊗ denotes the Kronecker product, 1 is an N × 1 vector of ones, and I is the N × N identity matrix.
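The relaxed problem in Eqs. (7)-(9) is a linear program over the Birkhoff polytope; with a linear objective, an optimum is attained at a vertex, i.e., at a permutation matrix. A minimal sketch of this linear program (assuming NumPy/SciPy; dense constraint matrices, so not tuned for large N; function and variable names are ours):

```python
import numpy as np
from scipy.optimize import linprog

def relaxed_permutation(Y0, Ytau, O):
    """Minimize J_L(w) = -vec(Y0 (Ytau O)^T)^T w over doubly stochastic
    matrices, Eqs. (7)-(9), for a fixed orientation O.  An optimum of this
    linear program sits at a vertex of the Birkhoff polytope, i.e., at a
    permutation matrix."""
    N = Y0.shape[0]
    M = Y0 @ (Ytau @ O).T                       # N x N matrix in the cost of Eq. (7)
    c = -M.flatten(order='F')                   # -vec(.), column-stacking convention
    I = np.eye(N)
    ones = np.ones((1, N))
    A_eq = np.vstack([np.kron(ones, I),         # (1^T kron I) w = 1  (row sums)
                      np.kron(I, ones)])        # (I kron 1^T) w = 1  (column sums)
    b_eq = np.ones(2 * N)
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0.0, None), method='highs')
    return res.x.reshape((N, N), order='F')     # doubly stochastic; a permutation at the optimum
```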

Since two factors vary, orientation and permutation, the optimization problem can be solved by alternating minimization. We first fix the orientation matrix O at one candidate value, then optimize the objective function (7) over the permutation, obtaining the minimum distance for that orientation. Across the distances obtained for the different orientation matrices, we select the minimum as the desired distance dGS(2,N)(⌈Y0⌋OP, ⌈Yτ⌋OP), together with the corresponding orientation O and permutation vector w, and we recover the permutation matrix P from w. When applying this algorithm to two points in the shape space, we therefore determine the orientation and the permutation between them simultaneously.
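The following sketch implements this scheme under our own simplifications (assuming NumPy/SciPy; names are ours): the orientation is swept over a grid of rotation and reflection angles, and, because the inner problem has a linear objective over the Birkhoff polytope, it is solved directly as a linear assignment with profit matrix Y0 (Yτ O)ᵀ rather than through the LP above.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def best_alignment(Y0, Ytau, n_angles=360):
    """Sweep orientations O (rotations and reflections on an angle grid over
    [-pi, pi]); for each O, pick the permutation P maximizing
    trace(P (Ytau O) Y0^T), i.e., minimizing ||Y0 - P Ytau O||_F.
    Returns the best Frobenius distance, Eq. (5), with its P and O."""
    N = Y0.shape[0]
    reflect = np.diag([1.0, -1.0])
    best_d, best_P, best_O = np.inf, None, None
    for theta in np.linspace(-np.pi, np.pi, n_angles, endpoint=False):
        for O in (rotation(theta), rotation(theta) @ reflect):
            profits = Y0 @ (Ytau @ O).T          # entry (i, j): profit of pairing row i of Y0 with row j of Ytau O
            rows, cols = linear_sum_assignment(profits, maximize=True)
            P = np.zeros((N, N))
            P[rows, cols] = 1.0
            d = np.linalg.norm(Y0 - P @ Ytau @ O, ord='fro')
            if d < best_d:
                best_d, best_P, best_O = d, P, O
    return best_d, best_P, best_O
```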

Canonical Representative in GS(2, N). It is crucial to find a unique canonical representative for each point in the affine-permutation shape space GS(2, N). We start by fixing one element in one equivalence class and define the corresponding element in every other equivalence class as its canonical representative. The procedure is the following: given two points ⌈Y0⌋OP and ⌈Yτ⌋OP with matrix representations Y0 and Yτ, fix one as the reference, e.g., ⋆Y0 = Y0; then apply the above algorithm to find the permutation P and the orientation distortion O between them; and, finally, transform the other point Yτ to its canonical representative

⋆Yτ = P · Yτ · O. (10)

With the canonical representatives of the configurations, we can compute the Karcher mean of multiple affine-permutation shapes and the geodesic between two affine-permutation shapes.
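A small usage check of Eq. (10), reusing the normalize_configuration, rotation, and best_alignment sketches above (all of which are our own illustrative helpers): an affine-permutation-distorted copy of a shape is mapped back onto the reference by its computed canonical representative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((60, 2))
Y0 = normalize_configuration(X)                  # reference, *Y0 = Y0
P_true = np.eye(60)[rng.permutation(60)]
Ytau = P_true @ Y0 @ rotation(np.pi / 4)         # distorted matrix representation of the same shape
d, P, O = best_alignment(Y0, Ytau)               # estimated permutation and orientation
Ytau_star = P @ Ytau @ O                         # Eq. (10): canonical representative of Ytau
print(d, np.allclose(Ytau_star, Y0, atol=1e-6))  # distance ~ 0; the reference is recovered
```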

4. Experimental Results

Datasets. We use three datasets that are widely used in shape analysis for our experiments: the UCI repository [19], MPEG-7 [11], and MCD [20].

Canonical Representatives in Affine-Permutation Shape Space. We find the canonical representative for an affine-permutation shape with the MCD dataset in Fig. 1. Each row illustrates an example. The leftmost column of each row visualizes the matrix representation of a shape ⌈Y0⌋OP that is taken as the reference ⋆Y0. The second column visualizes one possible matrix representation Yτ of a shape ⌈Yτ⌋OP. The third column shows the calculated distance, whose minimum is the affine-permutation distance between the shapes in the first two columns, under rotation or reflection by angles in [−π, π]. For a fixed angle i ∈ [−π, π], the minimum of the distance across all possible permutations is taken as the distance dGS1(i) in Eq. (5) under a rotation transformation R1(i), or as the distance dGS2(i) in Eq. (5) under a reflection transformation R2(i). Over all angles, the minimum of {dGS1(i), dGS2(i) : i ∈ [−π, π]} is the affine-permutation distance dGS(2,N)(⌈Y0⌋OP, ⌈Yτ⌋OP) between the two affine-permutation shapes. The orientation matrix O and the permutation matrix P corresponding to this distance are then used to compute the canonical representative ⋆Yτ with Eq. (10). The fourth column of each row contains the canonical representative ⋆Yτ of the shape in the second column, with the shape in the first column as the reference ⋆Y0.

Figure 1 (panels (a)-(p); plots omitted): Visualization of our canonical representatives. Each row shows a pair of examples. 1st and 2nd columns (e.g., Fig. 1(a), (b)): two representatives Y0, Yτ of two shapes ⌈Y0⌋OP, ⌈Yτ⌋OP ∈ GS(2, N). 3rd column (e.g., Fig. 1(c)): the minimum distance between Y0 and each rotated/reflected Yτ over [−π, π]. 4th column (e.g., Fig. 1(d)): the computed canonical representative of Yτ with respect to Y0.

We now analyze Fig. 1 in detail. The color of the landmarks indicates their scanning order: for example, the red landmark is the first scanned landmark in the landmark sequence and the blue landmark is the last. In the first row, only affine and permutation distortions exist between Fig. 1(a) and (b), since we generated Fig. 1(b) by random affine and permutation transformations of Fig. 1(a). From Fig. 1(c), we observe that, when the rotation angle is −1.04 rad, the distance is 0 under the appropriate permutation. Fig. 1(d) is recovered from Fig. 1(b) by Eq. (10) and is exactly the same as Fig. 1(a). This shows that our method recovers one unique canonical representative for each affine-permutation shape.

In the second row, Fig. 1(e) and (f) are projectively distorted with respect to each other; we generated these two affine-permutation shapes from two samples of the same category "butterfly." From Fig. 1(g), we see that, when the rotation angle is 0.56 rad, the distance achieves its minimum of 0.281, which is the affine-permutation distance between Fig. 1(e) and Fig. 1(f). Fig. 1(h) is the canonical representative of Fig. 1(f) obtained by Eq. (10); it is well aligned and appropriately oriented.

We generated Fig. 1(i), (j), (m), and (n) from two different categories, "fish" and "guitar." Fig. 1(i) and (m) visualize the same configuration of "fish," while Fig. 1(j) and (n) are affine-permutation distorted versions of each other. For Fig. 1(i) and (j), Fig. 1(k) shows that, when the rotation angle is −2.442 rad, the distance achieves its minimum of 0.479; Fig. 1(l) shows the canonical representative of Fig. 1(j) obtained with Eq. (10). For Fig. 1(m) and (n), Fig. 1(o) shows that, when the reflection angle is −0.06 rad, the distance achieves its minimum of 0.479; Fig. 1(p) shows the canonical representative of Fig. 1(n) obtained with Eq. (10). Fig. 1(l) and Fig. 1(p) are exactly the same, which again shows that our method generates a unique canonical representative for an affine-permutation shape given a reference shape. The shapes of "fish" and "guitar" are visually quite different, but the landmarks are consistently permuted, since the colors of Fig. 1(l) are ordered as the colors of Fig. 1(i), and similarly the colors of Fig. 1(p) are ordered as the colors of Fig. 1(m).

These examples show that our procedure for finding canonical representatives is effective.

Application I: Clustering. Clustering is a natural task with which to evaluate the distance and the Karcher mean. In centroid-based clustering, such as K-means or K-means++ [2], clusters are represented by a central vector, and the membership of each data point is determined by its distances to the cluster centers. We first remove the permutation distortions by finding the canonical representatives. We then extend K-means++ from the usual Euclidean space to the affine-permutation shape space by replacing the central vector with the Karcher mean and the Euclidean distance with the distance in the affine-permutation shape space.
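A minimal sketch of this extension (assuming NumPy; names are ours). It reuses grassmann_distance from the earlier sketch, uses a simple random initialization instead of K-means++ seeding, and substitutes an extrinsic Grassmannian mean (averaging the rank-2 projectors Y Yᵀ and re-projecting) for the intrinsic Karcher mean used in the paper.

```python
import numpy as np

def extrinsic_mean(shapes):
    """Stand-in for the Karcher mean: average the projectors Y Y^T and return
    an orthonormal basis of the dominant 2-dimensional subspace."""
    P_bar = sum(Y @ Y.T for Y in shapes) / len(shapes)
    _, V = np.linalg.eigh(P_bar)
    return V[:, -2:]                          # N x 2, orthonormal columns

def shape_kmeans(shapes, k, n_iter=20, seed=0):
    """K-means on the shape space: distances and centers live on the
    Grassmannian rather than in Euclidean space.  Assumes permutation
    distortions were already removed via canonical representatives."""
    rng = np.random.default_rng(seed)
    centers = [shapes[i] for i in rng.choice(len(shapes), size=k, replace=False)]
    labels = np.zeros(len(shapes), dtype=int)
    for _ in range(n_iter):
        for i, Y in enumerate(shapes):
            labels[i] = int(np.argmin([grassmann_distance(Y, C) for C in centers]))
        new_centers = []
        for j in range(k):
            members = [Y for Y, lab in zip(shapes, labels) if lab == j]
            new_centers.append(extrinsic_mean(members) if members else centers[j])
        centers = new_centers
    return labels, centers
```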

We tested the modified K-means++ on the UCI handwritten digit database. Each digit was resampled to N = 100, 200, 300, 400, 500, 600 landmarks. For our test, we assumed that the coordinates of the points are displaced by additive independent Gaussian noise. We arbitrarily selected one configuration from each class 0-9 and generated M = 200, 400, 600, 800, 1,000 affine distortions disturbed by additive noise of different intensities, i.e., 10 × M images from 10 classes in total. We preprocessed the original configurations and used the resulting canonical representatives as the input to K-means++. We use [3] as a baseline; this reference proposed different methods for constructing the affine shape space and computing the Karcher mean.

Figure 2 (panels (a), (b); plots omitted): Algorithm performance. Fig. 2(a) compares the accuracy of our method with that of the algorithm in [3]. Fig. 2(b) compares their computational time. As the number of landmarks increases, the computational time of our method grows more slowly than that of the algorithm in [3].

Figure 3 (images omitted): Shape morphing. From top to bottom: digits "3" to "0", between two tools, and from "chick" to "butterfly".

We compare our method with the algorithm of [3] with respect to accuracy in Fig. 2(a) and computational time in Fig. 2(b). Fig. 2 shows that our method achieves higher accuracy while being faster. This improved performance indicates that our constructed affine-permutation shape space captures the intrinsic characteristics of the digit shapes.

Application II: Shape Morphing on Labeled/Unlabeled Data. Morphing produces a sequence of images that provides a smooth transition from one image (shape) into another, thus connecting different shapes. Given a starting shape and an ending shape, we find the canonical representative of the ending shape with respect to the starting shape with our method, and then use the linear solution to the logarithmic map to morph one shape into the other. Morphing constructs the geodesic that bridges the two endpoints and obtains the intermediate shapes along this geodesic.
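As a sketch of this geodesic step (assuming NumPy; names are ours), the following uses the standard principal-angle construction of the Grassmannian geodesic between two normalized configurations; the paper's linear solution to the logarithmic map may differ in detail, and the ending shape is assumed to have already been replaced by its canonical representative with respect to the starting shape.

```python
import numpy as np

def grassmann_geodesic(Y0, Y1, ts):
    """Configurations along the geodesic from span(Y0) to span(Y1).

    Y0, Y1: N x 2 matrices with orthonormal columns.  Returns one N x 2
    configuration per t in ts; t = 0 and t = 1 recover the endpoints up to
    an orientation within their equivalence classes."""
    V0, C, V1t = np.linalg.svd(Y0.T @ Y1)
    theta = np.arccos(np.clip(C, -1.0, 1.0))     # principal angles
    A = Y0 @ V0                                  # principal vectors in span(Y0)
    B = Y1 @ V1t.T                               # matched principal vectors in span(Y1)
    G = np.zeros_like(A)                         # unit directions orthogonal to span(Y0)
    for i, th in enumerate(theta):
        if th > 1e-12:
            G[:, i] = (B[:, i] - A[:, i] * np.cos(th)) / np.sin(th)
    return [A * np.cos(t * theta) + G * np.sin(t * theta) for t in ts]

# Example: six configurations spaced d/5 apart along the geodesic, as in Fig. 3.
# intermediates = grassmann_geodesic(Y_start, Y_end_canonical, np.linspace(0.0, 1.0, 6))
```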

Specifically, we use in these experiments the UCI handwritten digit database, the MCD dataset, and the MPEG-7 database. With the starting shapes as references, we first permuted and rotated/reflected the ending shapes to obtain their canonical representatives with our method and Eq. (10). We then morphed each starting shape into its ending shape. Fig. 3 shows the results on three sets of shapes; the distance between any two adjacent intermediate shapes is d/5, where d is the distance between the two endpoints. For the top two rows, we plot the contours given by the boundary landmarks for better visualization. From Fig. 3, we see a smooth deformation along the geodesic between each pair of endpoint shapes, which shows that our method and the linear solution are effective.


References

[1] A. Amedi, W. M. Stern, J. A. Camprodon, F. Bermpohl, L. Merabet, S. Rotman, C. Hemond, P. Meijer, and A. Pascual-Leone. Shape conveyed by visual-to-auditory sensory substitution activates the lateral occipital complex. Nature Neuroscience, 10(6):687-689, May 2007.
[2] D. Arthur and S. Vassilvitskii. K-means++: The advantages of careful seeding. In ACM-SIAM Symposium on Discrete Algorithms, pages 1027-1035, PA, US, 2007.
[3] E. Begelfor and M. Werman. Affine invariance revisited. In IEEE Conference on Computer Vision and Pattern Recognition, volume 2, pages 2087-2094, NY, US, June 2006.
[4] W. M. Boothby. An Introduction to Differentiable Manifolds and Riemannian Geometry, volume 120. Gulf Professional Publishing, TX, US, 2003.
[5] L. da Fontoura Costa and R. M. Cesar Jr. Shape Analysis and Classification: Theory and Practice. CRC Press, FL, US, 2nd edition, 2010.
[6] V. H. S. Ha and J. M. F. Moura. Affine-permutation invariance of 2-D shapes. IEEE Transactions on Image Processing, 14(11):1687-1700, November 2005.
[7] J. Harris. Algebraic Geometry: A First Course, volume 133 of Graduate Texts in Mathematics. Springer, NY, US, 1992.
[8] R. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, Cambridge, UK, 2000.
[9] R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press, Cambridge, UK, 2012.
[10] N. Kriegeskorte, W. K. Simmons, P. S. Bellgowan, and C. I. Baker. Circular analysis in systems neuroscience: the dangers of double dipping. Nature Neuroscience, 12(5):535-540, January 2009.
[11] L. J. Latecki, R. Lakamper, and T. Eckhardt. Shape descriptors for non-rigid shapes with a single closed contour. In IEEE Conference on Computer Vision and Pattern Recognition, volume 1, pages 424-429, SC, US, June 2000.
[12] T. McInerney and D. Terzopoulos. Deformable models in medical image analysis: a survey. Medical Image Analysis, 1(2):91-108, June 1996.
[13] A. Meyer-Baese and V. J. Schmid. Pattern Recognition and Signal Analysis in Medical Imaging. Elsevier, MA, US, 2nd edition, 2014.
[14] J. J. Rotman. An Introduction to the Theory of Groups, volume 148. Springer, NY, US, 1995.
[15] D. Sepiashvili. Computations on the Grassmann Manifold and Affine-Permutation Invariance in Shape Representation. PhD thesis, Department of Electrical and Computer Engineering, Carnegie Mellon University, May 2006.
[16] D. Sepiashvili, J. M. F. Moura, and V. H. S. Ha. Affine-permutation symmetry: Invariance and shape space. In IEEE Workshop on Statistical Signal Processing, pages 307-310, St. Louis, MO, US, 2003.
[17] O. R. Terrades, S. Tabbone, and E. Valveny. A review of shape descriptors for document analysis. In International Conference on Document Analysis and Recognition, volume 1, pages 227-231, Parana, Brazil, September 2007.
[18] Ø. D. Trier and T. Taxt. Evaluation of binarization methods for document images. IEEE Transactions on Pattern Analysis and Machine Intelligence, (3):312-315, March 1995.
[19] UCI. Data set. http://archive.ics.uci.edu/ml/.
[20] M. Zuliani, S. Bhagavathy, B. Manjunath, and C. S. Kenney. Affine-invariant curve matching. In IEEE International Conference on Image Processing, volume 5, pages 3041-3044, Singapore, October 2004.