Topological Data Analysis for detecting Hidden Patterns in Data
Susan Holmes
Statistics, Stanford, CA 94305.
Joint work with Persi Diaconis, Mehrdad Shahshahani and
Sharad Goel.
Thanks to Harold Widom, Gunnar Carlsson, John Chakarian,
Leonid Pekelis for discussions, and NSF grant DMS 0241246
for funding.
A la recherche du temps perdu (In Search of Lost Time): Gradients and Ordination
Many popular multivariate methods based on spectral decompositions of distance matrices or transformed distances (Multidimensional Scaling, kernel PCA, correspondence analysis, metric MDS) aim to detect hidden underlying structure of points in high dimensions.
A first type of dependence is a hidden gradient, placing points close to a curve in high-dimensional space. Ecologists and archeologists have long known to look for horseshoes or arches, which are symptomatic of such structure.
We take a political science example with data from 2005
U.S. House of Representatives roll call votes. MDS and
kernel PCA, in this case, output two ‘horseshoes’ that are
characteristic of dimensionality reduction techniques.
PCA: Dimension Reduction
PCA seeks to replace the original (centered) matrix X by a matrix of lower rank; this can be solved by the singular value decomposition of X:
X = USV′, with U′DU = I_n, V′QV = I_p, and S diagonal,
XX′ = US²U′, with U′DU = I_n and S² = Λ.
PCA is a linear nonparametric multivariate method for dimension reduction.
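As a hedged illustration (a minimal numpy sketch, not from the original slides, assuming the unweighted case D = I_n and Q = I_p), rank-r PCA via the SVD:

import numpy as np

def pca_via_svd(X, r):
    """Rank-r PCA: center X, take its SVD, keep the top r singular pairs."""
    Xc = X - X.mean(axis=0)                        # center the columns
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :r] * s[:r]                      # coordinates of the n points
    axes = Vt[:r].T                                # principal axes in R^p
    return scores, axes

# Example: 100 points in 5 dimensions, reduced to rank 2
rng = np.random.default_rng(0)
scores, axes = pca_via_svd(rng.normal(size=(100, 5)), r=2)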
Ordination: Finding Time (Le temps perdu...)
Early studies in archeology have aimed at seriation in time. Guttman, Kendall and Ter Braak have pointed out and studied the arch or horseshoe effect.
Here is a linguistic example where I dated the works of Plato according to their sentence endings, using a particular distance between the books called the chi-square distance:
As an example we take data analysed by Cox and Brandwood [?], who wanted to seriate Plato's works using the proportion of sentence endings in a given book with a given stress pattern. We propose the use of correspondence analysis on the table of frequencies of sentence endings; for a detailed analysis see Charnomordic and Holmes [?].
The first 10 profiles (as percentages) look as follows:
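As a hedged sketch (a generic illustration, not the authors' code; the toy counts below are hypothetical), the chi-square distance between the row profiles of a count table can be computed as follows:

import numpy as np

def chisq_row_distances(N):
    """Pairwise chi-square distances between the row profiles of a count table."""
    N = np.asarray(N, dtype=float)
    profiles = N / N.sum(axis=1, keepdims=True)    # rows as frequency profiles
    col_mass = N.sum(axis=0) / N.sum()             # column weights
    diff = profiles[:, None, :] - profiles[None, :, :]
    return np.sqrt((diff ** 2 / col_mass).sum(axis=2))

# Hypothetical counts: 3 books x 4 sentence-ending types
print(chisq_row_distances([[10, 20, 5, 15], [12, 18, 7, 13], [30, 5, 10, 5]]))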
This step removed, for example, the Speaker of the House Dennis Hastert (R-IL), who by custom votes only when his vote would be decisive, and Robert T. Matsui (D-CA), who passed away at the start of the term.
We define a distance between legislators as
d̂(l_i, l_j) = (1/669) ∑_{k=1}^{669} |v_{ik} − v_{jk}|.
Roughly, d̂(l_i, l_j) is the percentage of roll calls on which legislators l_i and l_j disagreed. This interpretation would be exact if not for the possibility of 'not voting'.
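A minimal sketch of this distance, assuming votes coded 1 ('yea') and 0 ('nay') and ignoring the 'not voting' complication:

import numpy as np

def roll_call_distance(V):
    """d_hat(l_i, l_j): mean disagreement over the m roll calls.
    V is an n x m matrix with entries 1 ('yea') and 0 ('nay')."""
    V = np.asarray(V, dtype=float)
    m = V.shape[1]
    # for 0/1 votes, |v_ik - v_jk| = v_ik(1 - v_jk) + v_jk(1 - v_ik)
    return (V @ (1 - V).T + (1 - V) @ V.T) / m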
Since we now have data points in a metric space, we can
apply the MDS algorithm. The figure shows the results of a
3-dimensional MDS mapping. The most striking feature of
the mapping is that the data separate into ’twin horseshoes’.
In the next figure we have added color to indicate the
political party affiliation of each Representative (blue for
Democrat, red for Republican, and green for the lone
independent–Rep. Bernie Sanders of Vermont). The output
from MDS is qualitatively similar to that obtained from other
dimensionality reduction techniques, such as principal
components analysis applied directly to the voting matrix V .
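A hedged sketch of classical (Torgerson) MDS on a precomputed distance matrix, in the spectral style described above; this is an illustration, not the authors' implementation:

import numpy as np

def classical_mds(D, dim=3):
    """Classical MDS: double-center the squared distances, embed with the
    top eigenvectors of the resulting Gram matrix."""
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n            # centering matrix
    B = -0.5 * H @ (D ** 2) @ H                    # doubly centered squared distances
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:dim]             # largest eigenvalues first
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0))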
We build and analyze a model for the data in an effort to
understand and interpret these pictures. Roughly our theory
predicts that the Democrats, for example, are ordered along
the blue curve in correspondence to their political ideology,
i.e. how far they lean to the left.
We discuss connections between the theory and the data. In
particular, we explain why in the data, legislators at the
political extremes are not quite at the tips of the MDS
curves, but rather are positioned slightly toward the center.
Briefly, this amounts to the fact that there are distinct groups of relatively liberal Republicans, who accordingly exhibit quite different voting patterns.
[Figure: 3-dimensional MDS mapping of legislators based on the 2005 U.S. House of Representatives roll call votes.]
[Figure: 3-dimensional MDS mapping of legislators based on the 2005 U.S. House of Representatives roll call votes. Color has been added to indicate the party affiliation of each representative.]
A Model for the Data
Following the standard paradigm of placing politicians within a left-right spectrum, it is natural to identify legislators l_i, 1 ≤ i ≤ n, with points in the interval I = [0, 1] in correspondence with their political ideologies. We define the distance between legislators to be
d(l_i, l_j) = |l_i − l_j|.
This assumption that legislators can be isometrically mapped
into an interval is key to our analysis.
To apply MDS to the voting data, we defined a distance
between legislators via roll call votes. We now introduce a
‘cut-point model’ for voting that connects our distance d
above to the data-based roll call distance.
The Model: Each bill 1 ≤ k ≤ m on which the legislators vote is represented as a pair
(C_k, P_k) ∈ [0, 1] × {0, 1}.
We can think of P_k as indicating whether the bill is liberal (P_k = 0) or conservative (P_k = 1), and we can take C_k to be the cut-point between legislators that vote 'yea' or 'nay'. Let V_{ik} ∈ {1/2, −1/2} indicate how legislator l_i votes on bill k. Then, in this model,
V_{ik} = 1/2 − P_k if l_i ≤ C_k, and V_{ik} = P_k − 1/2 if l_i > C_k.
As described, the model has n + 2m parameters, one for each legislator and two for each bill. We reduce the number of parameters by assuming that the cut-points are independent random variables uniform on I. Then,
P(V_{ik} ≠ V_{jk}) = d(l_i, l_j)   (1)
since legislators l_i and l_j take opposite sides on a given bill if and only if the cut-point C_k divides them. Observe that the P_k do not affect the probability above.
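A small simulation of this cut-point model (an illustrative sketch with made-up positions, not the authors' code) confirms that the disagreement rate approaches |l_i − l_j|, as in (1):

import numpy as np

rng = np.random.default_rng(1)
n, m = 5, 200_000
l = np.sort(rng.uniform(size=n))                 # legislator positions in I
C = rng.uniform(size=m)                          # uniform cut-points, one per bill
P = rng.integers(0, 2, size=m)                   # bill polarities
V = np.where(l[:, None] <= C[None, :], 0.5 - P, P - 0.5)

# Empirical disagreement rate vs. d(l_0, l_1) = |l_0 - l_1|
print((V[0] != V[1]).mean(), abs(l[0] - l[1]))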
Define the empirical distance between legislators l_i and l_j by
d̂_m(l_i, l_j) = (1/m) ∑_{k=1}^{m} |V_{ik} − V_{jk}| = (1/m) ∑_{k=1}^{m} 1_{V_{ik} ≠ V_{jk}}.
By (1), we can estimate the distance d between legislators by the distance d̂_m, which is computable from the voting record. In particular,
lim_{m→∞} d̂_m(l_i, l_j) = d(l_i, l_j) a.s.
since we assumed the cut-points are independent. More precisely, we have the following result:
Lemma. For m ≥ log(n/√ε)/ε²,
P(|d̂_m(l_i, l_j) − d(l_i, l_j)| ≤ ε for all 1 ≤ i, j ≤ n) ≥ 1 − ε.
Proof. By the Hoeffding inequality, for fixed l_i and l_j,
P(|d̂_m(l_i, l_j) − d(l_i, l_j)| > ε) ≤ 2e^{−2mε²}.
Consequently,
P(⋃_{1≤i<j≤n} {|d̂_m(l_i, l_j) − d(l_i, l_j)| > ε}) ≤ ∑_{1≤i<j≤n} P(|d̂_m(l_i, l_j) − d(l_i, l_j)| > ε)
≤ 2 \binom{n}{2} e^{−2mε²} ≤ n² e^{−2mε²} ≤ ε
for m ≥ log(n/√ε)/ε², and the result follows.
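As a hedged illustration of the lemma's sample-size requirement (the function name and example values are mine):

import numpy as np

def bills_needed(n, eps):
    """Smallest m from the lemma: with m >= log(n/sqrt(eps))/eps^2 roll calls,
    all pairwise distance estimates are within eps with prob. >= 1 - eps."""
    return int(np.ceil(np.log(n / np.sqrt(eps)) / eps ** 2))

print(bills_needed(435, 0.1))   # the House with eps = 0.1 gives 723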
In our model we identified latent variables with points in the
interval I = [0, 1] and accordingly defined the distance
between them to be d(li, lj) = |li − lj|. This general
description seems to be reasonable in a number of
applications. We then built a simple model for the data that
facilitated empirical approximation of this distance. This
second step depends heavily on the application. In the rest
of the paper, we simply assume that the distance d can be
reasonably approximated from the data.
Analysis of the Model
In this section we analyze the MDS algorithm applied to metric models satisfying
d(x_i, x_j) = |i/n − j/n|.
This corresponds to the case in which legislators are uniformly spaced in I: l_i = i/n.
Similarity and Transition Matrices
Given a distance d on a state space X, there are several ways to build a similarity S. Two standard transformations are:
1. S_1(x_i, x_j) = e^{−d(x_i, x_j)}
2. S_2(x_i, x_j) = sup_{z_i, z_j} d(z_i, z_j) − d(x_i, x_j)
Once we have a similarity, we can define a Gram/Kernel matrix K by normalizing the rows. That is,
K(x_i, x_j) = S(x_i, x_j) / ∑_{x_k} S(x_i, x_k).
To ease the analysis, sometimes we instead normalize the similarity matrix by the average row sum
z = (1/n) ∑_{x_i} ∑_{x_j} S(x_i, x_j).
That is, we set K(x_i, x_j) = S(x_i, x_j)/z.
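A minimal sketch of these constructions (an illustration; kernel_from_distance is a hypothetical helper, not from the slides):

import numpy as np

def kernel_from_distance(D, kind="exp"):
    """Build a similarity from a distance matrix D and normalize it:
    S1 = exp(-d) or S2 = max(d) - d, divided by the average row sum z."""
    S = np.exp(-D) if kind == "exp" else D.max() - D
    z = S.sum() / D.shape[0]                       # average row sum
    return S / z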
Eigenvectors and Horseshoes
We find approximate eigenfunctions and eigenvalues for models that satisfy
d(x_i, x_j) = |i/n − j/n|
with Gram matrices that are built with either a linear similarity or an exponential similarity. The eigenfunctions are found by continuizing the discrete Gram matrix, and then solving the corresponding integral equation
∫_0^1 K(x, y) f(y) dy = λ f(x).
Standard matrix perturbation theory can then be applied to recover approximate eigenfunctions for the original, discrete kernel.
The eigenfunctions that we derive are in agreement with
those arising from the voting data, and lend considerable
insight into our data analysis problem and also into general
features of MDS mappings.
Approximate Eigenfunctions
We now state a classical perturbation result that relates two
different notions of an approximate eigenfunction. For more
refined estimates, see Parlett [?].
Theorem. Consider an n × n symmetric matrix A with eigenvalues λ_1 ≤ · · · ≤ λ_n. If for ε > 0
‖Af − λf‖_2 ≤ ε
for some f, λ with ‖f‖_2 = 1, then A has an eigenvalue λ_k such that |λ_k − λ| ≤ ε.
If we further assume that s = min_{i : λ_i ≠ λ_k} |λ_i − λ_k| > ε, then A has an eigenfunction f_k such that A f_k = λ_k f_k and ‖f − f_k‖_2 ≤ ε/(s − ε).
Remark. The second statement of the theorem allows non-simple eigenvalues, but requires that the eigenvalues corresponding to distinct eigenspaces be well-separated.
Remark. The eigenfunction bound of the theorem is asymptotically tight in ε, as the following example illustrates. Consider the matrix
A = [ λ 0 ; 0 λ+s ]
with s > 0. For ε < s define the function
f = ( √(1 − ε²/s²), ε/s )′.
Then ‖f‖_2 = 1 and ‖Af − λf‖_2 = ε. The theorem guarantees that there is an eigenfunction f_k with eigenvalue λ_k such that |λ − λ_k| ≤ ε. Since the eigenvalues of A are λ and λ + s, and since s > ε, we must have λ_k = λ. Let V_k = {f_k : A f_k = λ_k f_k} = {c e_1 : c ∈ R}, where e_1 is the first standard basis vector. Then
min_{f_k ∈ V_k} ‖f − f_k‖_2 = ‖f − (f · e_1) e_1‖ = ε/s.
The bound of the theorem, ε/(s − ε), is only slightly larger.
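A quick numerical check of this example (an illustrative sketch with arbitrary values of λ, s, and ε):

import numpy as np

lam, s, eps = 1.0, 0.5, 0.1
A = np.diag([lam, lam + s])
f = np.array([np.sqrt(1 - (eps / s) ** 2), eps / s])
print(np.linalg.norm(A @ f - lam * f))     # the residual equals eps
print(eps / s, eps / (s - eps))            # achieved error vs. theorem's bound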
Proof of Approximate Eigenfunction Theorem
Proof. First we show that min_i |λ_i − λ| ≤ ε. If min_i |λ_i − λ| = 0 we are done; otherwise A − λI is invertible. Then,
‖f‖_2 ≤ ‖(A − λI)^{−1}‖ · ‖(A − λI)f‖_2 ≤ ε ‖(A − λI)^{−1}‖.
Since the eigenvalues of (A − λI)^{−1} are 1/(λ_1 − λ), . . . , 1/(λ_n − λ), by symmetry
‖(A − λI)^{−1}‖ = 1 / min_i |λ_i − λ|.
The result now follows since ‖f‖_2 = 1.
Set λ_k = argmin_i |λ_i − λ|, and consider an orthonormal basis g_1, . . . , g_m of the associated eigenspace E_{λ_k}. Define f_k to be the projection of f onto E_{λ_k}:
f_k = ⟨f, g_1⟩ g_1 + · · · + ⟨f, g_m⟩ g_m.
Then f_k is an eigenfunction with eigenvalue λ_k. Writing f = f_k + (f − f_k), we have in particular
ε ≥ ‖Af − λf‖_2 ≥ ‖(A − λI)(f − f_k)‖_2.
For λ_i ≠ λ_k, |λ_i − λ| ≥ s − ε. The result now follows since for h ∈ E_{λ_k}^⊥,
‖(A − λI)h‖_2 ≥ (s − ε)‖h‖_2,
and hence ‖f − f_k‖_2 ≤ ε/(s − ε).
Centering Kernel Matrices
If our kernel K is renormalized so that it has row sums 1,
K 1_n = 1_n,
then 1_n is an eigenvector of K with eigenvalue 1.
As a consequence, if we recenter K by applying the centering matrix H = I − (1/n) 1 1′, then for any eigenvector v different from 1_n (so that 1′_n v = 0),
K H v = K v − (1/n) K 1_n 1′_n v = λ v,
and also H K H v = λ H v = λ v.
So we will not bother to recenter the K matrix.
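A numerical sanity check of this observation (an illustrative sketch; a circular distance is used here, an assumption of mine, so that the row sums are exactly constant as the normalization presumes):

import numpy as np

n = 60
i = np.arange(n)
d = np.minimum(np.abs(i[:, None] - i[None, :]), n - np.abs(i[:, None] - i[None, :]))
S = np.exp(-d / n)                        # symmetric with constant row sums
K = S / S.sum(axis=1, keepdims=True)      # row sums 1, so K @ ones = ones
H = np.eye(n) - np.ones((n, n)) / n       # centering matrix

vals, vecs = np.linalg.eigh(K)
v, lam = vecs[:, -2], vals[-2]            # an eigenvector other than 1_n
print(np.allclose(H @ K @ H @ v, lam * v))  # True: centering changes nothing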
Linear Similarity
When we make a continuous version of the discrete Kernel matrix K_n, we get the continuous kernel
K(x, y) = (3/2)[1 − |x − y|].
Once we guess that the solutions to the corresponding
integral equation are trigonometric, verifying this is
straightforward. We start with a simple integral computation.
Lemma. For a ≠ 0,
∫_0^1 cos(ax + b)[1 − |c − x|] dx = (2/a²) cos(ac + b) − (1/a²)[a sin b − ac sin b − ac sin(a + b) + cos b + cos(a + b)].
In particular,
1. For odd integers k,
∫_0^1 sin(kπ(x − 1/2))[1 − |c − x|] dx = (2/(kπ)²) sin(kπ(c − 1/2)).
2. For solutions to (a/2) tan(a/2) = 1,
∫_0^1 cos[a(x − 1/2)][1 − |c − x|] dx = (2/a²) cos[a(c − 1/2)].
Proof. The result follows from a straightforward calculation. Set
f_c(x) = cos(ax + b)[1 − |c − x|].
Then
∫_0^1 f_c(x) dx = (1 − c) ∫_0^c cos(ax + b) dx + ∫_0^c x cos(ax + b) dx + (1 + c) ∫_c^1 cos(ax + b) dx − ∫_c^1 x cos(ax + b) dx.
Integration by parts shows that
∫ x cos(ax + b) dx = (x/a) sin(ax + b) + (1/a²) cos(ax + b).
Substituting into the above, we have
∫_0^1 f_c(x) dx = (1/a²)[a(1 − c) sin(ac + b) − a(1 − c) sin b + a(1 + c) sin(a + b) − a(1 + c) sin(ac + b) + ac sin(ac + b) + cos(ac + b) − cos b − a sin(a + b) − cos(a + b) + ac sin(ac + b) + cos(ac + b)].
At a = kπ and b = 0 for k an odd integer,
a sin b − ac sin b − ac sin(a + b) + cos b + cos(a + b) = 0,
and so
∫_0^1 cos(kπx)[1 − |c − x|] dx = (2/(kπ)²) cos(kπc).
Since for odd k
sin(kπ(x − 1/2)) = cos(kπx − π(k + 1)/2) = (−1)^{(k+1)/2} cos(kπx),
the first part of the lemma follows. At b = −a/2, where a is a solution to (a/2) tan(a/2) = 1,
a sin b − ac sin b − ac sin(a + b) + cos b + cos(a + b) = −a sin(a/2) + 2 cos(a/2) = 0.
Consequently,
∫_0^1 cos(ax − a/2)[1 − |c − x|] dx = (2/a²) cos(ac − a/2)
for a a solution to (a/2) tan(a/2) = 1.
The solutions of (a/2) tan(a/2) = 1 occur at approximately
a = 2kπ for integers k. More precisely, we have the following
result.
Lemma. The positive solutions of (a/2) tan(a/2) = 1 lie in the set
(0, π) ∪ ⋃_{k=1}^∞ (2kπ, 2kπ + 2/(kπ))
with exactly one solution per interval. Furthermore, a is a solution if and only if −a is a solution.
Proof. Let f(θ) = (θ/2) tan(θ/2). Then f is an even function, so a is a solution to f(θ) = 1 if and only if −a is a solution. Since f′(θ) = (1/2) tan(θ/2) + (θ/4) sec²(θ/2), f(θ) is non-negative and increasing in the first and second quadrants, and furthermore
f(2kπ) = 0 < 1 < lim_{θ→(2k+1)π⁻} f(θ) = +∞.
The third and fourth quadrants have no solutions since f(θ) ≤ 0 in those regions. This shows that the solutions to f(θ) = 1 lie in the intervals
⋃_{k=0}^∞ (2kπ, 2kπ + π)
with exactly one solution per interval. Recall the power series expansion of tan θ for |θ| < π/2 is
tan θ = θ + θ³/3 + 2θ⁵/15 + 17θ⁷/315 + . . . .
In particular, for 0 ≤ θ < π/2, tan θ ≥ θ. Finally, for k ∈ Z_{≥1},
f(2kπ + 2/(kπ)) = (kπ + 1/(kπ)) tan(kπ + 1/(kπ)) = (kπ + 1/(kπ)) tan(1/(kπ)) ≥ (kπ + 1/(kπ)) · (1/(kπ)) > 1,
which gives the result.
Remark. The first few positive solutions of (a/2) tan(a/2) = 1 are
1. a = 1.72066717803876 . . .
2. a = 6.85123691896346 . . .
3. a = 12.87459635834389 . . .
4. a = 19.05866881072393 . . .
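These roots can be reproduced numerically, bracketing each one with the intervals from the lemma above (a sketch assuming scipy is available):

import numpy as np
from scipy.optimize import brentq

g = lambda a: (a / 2) * np.tan(a / 2) - 1
roots = [brentq(g, 1e-9, np.pi - 1e-6)]
for k in range(1, 4):                     # brackets (2k*pi, 2k*pi + 2/(k*pi))
    roots.append(brentq(g, 2 * k * np.pi + 1e-9, 2 * k * np.pi + 2 / (k * np.pi)))
print(roots)   # 1.7207..., 6.8512..., 12.8746..., 19.0587...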
Lemma. For 1 ≤ i, j ≤ n, let
K_n(x_i, x_j) = (3/(2n)) (1 − |i − j|/n).
Set f_{n,a}(x_i) = cos(a(i/n − 1/2)), where a is a positive solution to (a/2) tan(a/2) = 1, and set g_{n,k}(x_i) = sin(kπ(i/n − 1/2)) for k ≥ 1 an odd integer. Then
|K_n f_{n,a}(x_i) − (3/a²) f_{n,a}(x_i)| ≤ (a + 1)/n
and
|K_n g_{n,k}(x_i) − (3/(kπ)²) g_{n,k}(x_i)| ≤ (kπ + 1)/n.
That is, f_{n,a} and g_{n,k} are approximate eigenfunctions of K_n with approximate eigenvalues proportional to their squared periods.
Proof. Once we guess that f and g are approximate eigenfunctions of K_n, the proof of this fact follows from the integral computation in the previous Lemma. We have
K_n f_{n,a}(x_i) = (3/(2n)) ∑_{j=1}^n cos(a(j/n − 1/2))[1 − |i/n − j/n|]
= (3/2) ∫_0^1 cos(a(x − 1/2))[1 − |i/n − x|] dx + (3/2) R_n
= (3/a²) f_{n,a}(x_i) + (3/2) R_n by the Lemma,
where the error term satisfies
|R_n| ≤ M/(2n) for M ≥ sup_{0≤x≤1} |(d/dx) cos(a(x − 1/2))[1 − |i/n − x|]|
by the standard right-hand rule error bound. In particular, we can take M = a + 1 independent of i, from which the result for f_{n,a} follows. The case of g_{n,k} is analogous.
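A numerical check of the lemma (an illustrative sketch; n = 400 is arbitrary):

import numpy as np

n, a = 400, 1.72066717803876              # first root of (a/2) tan(a/2) = 1
i = np.arange(1, n + 1)
Kn = 1.5 / n * (1 - np.abs(i[:, None] - i[None, :]) / n)
f = np.cos(a * (i / n - 0.5))
print(np.max(np.abs(Kn @ f - 3 / a ** 2 * f)), (a + 1) / n)  # error vs. bound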
Lemma. For 1 ≤ i, j ≤ n set
K_n(x_i, x_j) = (3/(2n)) (1 − |i − j|/n)
and let λ_1, . . . , λ_n be the eigenvalues of K_n.
1. For positive solutions to (a/2) tan(a/2) = 1,
min_{1≤i≤n} |λ_i − 3/a²| ≤ 2(a + 1)/√n.
2. For odd integers k ≥ 1,
min_{1≤i≤n} |λ_i − 3/(kπ)²| ≤ (kπ + 1)/√n.
Remark. By the previous Remark, the first few values of 3/a² for solutions to (a/2) tan(a/2) = 1 are
1. 1.01327541515878 . . .
2. 0.06391212873818 . . .
3. 0.01809897627265 . . .
4. 0.00825916473010 . . .
and the first few values of 3/(kπ)² for k ≥ 1 an odd integer are
1. 0.30396355092701 . . .
2. 0.03377372788078 . . .
3. 0.01215854203708 . . .
4. 0.00620333777402 . . .
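These values can be checked numerically against the spectrum of K_n (an illustrative sketch; n = 500 is arbitrary):

import numpy as np

n = 500
i = np.arange(1, n + 1)
Kn = 1.5 / n * (1 - np.abs(i[:, None] - i[None, :]) / n)
vals = np.sort(np.linalg.eigvalsh(Kn))[::-1]
print(vals[:4])   # close to 1.0133, 0.3040, 0.0639, 0.0338, interleaving the two lists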
Exponential Transformation of Similarity
The case of exponential similarity is analogous to that of
linear similarity. Continuizing the discrete Gram matrix K_n, we get the kernel
K(x, y) = (e/2) e^{−|x−y|}.
Once again, we find trigonometric solutions to Kf = λf .
Lemma. For constants a, c ∈ R,
∫_0^1 e^{−|x−c|} cos[a(x − 1/2)] dx = 2 cos[a(c − 1/2)]/(1 + a²) + (e^{−c} + e^{c−1})(a sin(a/2) − cos(a/2))/(1 + a²)
and
∫_0^1 e^{−|x−c|} sin[a(x − 1/2)] dx = 2 sin[a(c − 1/2)]/(1 + a²) + (e^{−c} − e^{c−1})(a cos(a/2) + sin(a/2))/(1 + a²).
In particular,
1. For a such that a tan(a/2) = 1,
∫_0^1 e^{−|x−c|} cos[a(x − 1/2)] dx = 2 cos[a(c − 1/2)]/(1 + a²).
2. For a such that a cot(a/2) = −1,
∫_0^1 e^{−|x−c|} sin[a(x − 1/2)] dx = 2 sin[a(c − 1/2)]/(1 + a²).
Proof. The lemma follows from a straightforward integration. First split the integral into two pieces:
∫_0^1 e^{−|x−c|} cos[a(x − 1/2)] dx = ∫_0^c e^{x−c} cos[a(x − 1/2)] dx + ∫_c^1 e^{c−x} cos[a(x − 1/2)] dx.
Evaluating these expressions at the appropriate limits of integration gives the first statement of the lemma. The computation of ∫_0^1 e^{−|x−c|} sin[a(x − 1/2)] dx is analogous.
The solutions of a tan(a/2) = 1 are approximately 2kπ for integers k, and the solutions of a cot(a/2) = −1 are approximately (2k + 1)π.
Lemma.
1. The positive solutions of a tan(a/2) = 1 lie in the set
(0, π) ∪ ⋃_{k=1}^∞ (2kπ, 2kπ + 1/(kπ))
with exactly one solution per interval. Furthermore, a is a solution if and only if −a is a solution.
2. The positive solutions of a cot(a/2) = −1 lie in the set
⋃_{k=0}^∞ ((2k + 1)π, (2k + 1)π + 1/(kπ + π/2))
with exactly one solution per interval. Furthermore, a is a solution if and only if −a is a solution.
Remark. The first few positive solutions of a tan(a/2) = 1 are
1. a = 1.30654237418881 . . .
2. a = 6.58462004256417 . . .
3. a = 12.72324078413133 . . .
4. a = 18.95497141084159 . . .
and the first few positive solutions of a cot(a/2) = −1 are
1. a = 3.67319440630425 . . .
2. a = 9.63168463569187 . . .
3. a = 15.83410536933241 . . .
4. a = 22.08165963594259 . . .
Lemma. For 1 ≤ i, j ≤ n, let
K_n(x_i, x_j) = (e/(2n)) e^{−|i−j|/n}.
Set f_{n,a}(x_i) = cos(a(i/n − 1/2)), where a is a positive solution to a tan(a/2) = 1, and set g_{n,a}(x_i) = sin(a(i/n − 1/2)), where a is a positive solution to a cot(a/2) = −1. Then
|K_n f_{n,a}(x_i) − (e/(1 + a²)) f_{n,a}(x_i)| ≤ 2(a + 1)/n
and
|K_n g_{n,a}(x_i) − (e/(1 + a²)) g_{n,a}(x_i)| ≤ 2(a + 1)/n.
That is, f_{n,a} and g_{n,a} are approximate eigenfunctions of K_n.
Lemma. For 1 ≤ i, j ≤ n set
K_n(x_i, x_j) = (e/(2n)) e^{−|i−j|/n}
and let λ_1, . . . , λ_n be the eigenvalues of K_n.
1. For positive solutions to a tan(a/2) = 1,
min_{1≤i≤n} |λ_i − e/(1 + a²)| ≤ 4(a + 1)/√n.
2. For positive solutions to a cot(a/2) = −1,
min_{1≤i≤n} |λ_i − e/(1 + a²)| ≤ 4(a + 1)/√n.
Remark. The first few values of e/(1 + a²) for solutions to a tan(a/2) = 1 are
1. 1.00414799895293 . . .
2. 0.06128160783626 . . .
3. 0.01668877420197 . . .
4. 0.00754468546867 . . .
The first few values of e/(1 + a²) for solutions to a cot(a/2) = −1 are
1. 0.18756657740212 . . .
2. 0.02898902316936 . . .
3. 0.01079887885138 . . .
4. 0.00556341289490 . . .
Horseshoes and Twin Horseshoes
The 2-dimensional mapping is built out of the second and third eigenfunctions of the Gram matrix. Above we computed several approximate eigenfunctions and eigenvalues for the Gram matrix arising from the voting model. The linear and exponential similarity cases are analogous, and so we only consider the latter here. In this case, we have the approximate eigenfunctions
1. f_{n,1}(x_i) = cos(1.3065(i/n − 1/2)) with eigenvalue λ ≈ 1.004
2. f_{n,2}(x_i) = sin(3.6732(i/n − 1/2)) with eigenvalue λ ≈ 0.1876
3. f_{n,3}(x_i) = cos(6.5846(i/n − 1/2)) with eigenvalue λ ≈ 0.06128.
[Figure: Approximate eigenfunctions f_1, f_2 and f_3.]
[Figure: A horseshoe that results from plotting Λ : x_i ↦ (f_2(x_i), f_3(x_i)).]
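The horseshoe can be regenerated from these approximate eigenfunctions (an illustrative sketch using matplotlib; n = 435 echoes the House size):

import numpy as np
import matplotlib.pyplot as plt

n = 435
x = np.arange(1, n + 1) / n - 0.5
f2 = np.sin(3.6732 * x)                   # second approximate eigenfunction
f3 = np.cos(6.5846 * x)                   # third approximate eigenfunction
plt.plot(f2, f3, ".")                     # Lambda(x_i) = (f2(x_i), f3(x_i))
plt.xlabel("f_2"); plt.ylabel("f_3")
plt.show()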
In particular, from Λ it is possible to deduce the relative order of the representatives in the interval I. Since −f_2 is also an eigenfunction, it is not in general possible to determine the absolute order knowing only that Λ comes from the eigenfunctions.
You need a crib!
Voting Data
With the voting data, we see not one, but two horseshoes. To see how this can happen, consider the two-population state space X = {x_1, . . . , x_{n_1}, y_1, . . . , y_{n_2}} with distance