Dmitri Tymoczko
Princeton University
May 13, 2005
The Geometry of Musical Chords
DRAFT
Musical chords have a geometry that is surprisingly easy to specify. An n-note chord can be represented as a point on the orbifold T^n/S_n (the n-torus modulo the symmetric group S_n) (1). Composers in a wide range of musical styles have exploited the non-Euclidean features of these spaces, typically by utilizing short-distance pathways between structurally-similar chords (Fig. 1). The existence of such pathways depends on a chord’s symmetry, or near-symmetry, under translation, reflection, and permutation. Paradigmatically “consonant” and “dissonant” chords possess different symmetries, thereby suggesting different musical applications.
Western music lies at the intersection of two seemingly independent disciplines:
harmony and counterpoint. Harmony delimits the range of acceptable chords and chord-
sequences. “Chords,” informally, are collections of simultaneously-occurring notes.
Western musical styles typically permit only a limited number of chords (for example,
only the major and minor triads) and successions between them (for example, the triadic
progression C major-F major, but not C major-E♭ major). Counterpoint is the technique
of connecting the individual notes in a series of chords to form simultaneous melodic
lines. To a good first approximation, chords are typically connected so that these lines
(or voices) move independently (not all in the same direction by the same amount) and
efficiently (by short distances, according to some perhaps-implicit notion of musical
“distance”). Such voice-leading simplifies physical performance, engages explicit
aesthetic norms (2-4), and facilitates the auditory streaming necessary for perceiving
music polyphonically (5).
Figure 1 shows independent, efficient voice-leadings in a wide range of musical
styles. In each example, arrows represent individual musical voices. Fig. 1(a) comes
from the classical period and features four major triads forming an archetypal sequence:
I-IV-I-V-I in C major. The voice-leading connects each note in the first chord to its
nearest successor in the second. Fig. 1(b), which is a common contemporary jazz pattern,
is analogous: here the chords are again similar, and the voice-leading connects notes by
short, though not necessarily minimal, paths. The voice-leadings in Fig. 1(c), which are
celebrated examples of nineteenth-century chromaticism, are also efficient, though here
they connect chords traditionally considered to belong to different types. Fig. 1(d)
presents a series of voice-leadings in which voices move independently and efficiently
within an unchanging harmony; such procedures are typical of twentieth-century “avant
garde” composition.
How is it that Western music can satisfy harmonic and contrapuntal constraints at
once? And what determines whether two chords can be connected by efficient voice-
leading? Composers and music theorists have been investigating these questions for
almost three hundred years. The “circle of fifths” (Fig. S1), first published in 1728 (6),
can be interpreted as depicting maximally efficient voice-leadings among the twelve
familiar major scales. The Tonnetz (Fig. S2), originating with Euler in 1739 and
discussed by nineteenth-century music theorists Oettingen and Riemann, depicts efficient
voice-leadings among the twenty-four major and minor triads (3, 7-8, 12). Recent
theoretical work (9-19) has continued this tradition, investigating efficient voice-leading
among other small collections of interesting chords. However, no comprehensive theory
of voice-leading has yet emerged. In this paper I provide such a theory, showing that
chords that can be connected by efficient voice-leading are close in the space of all
possible n-note chords.
Characterizing the geometry of chord-space requires surprisingly recent
mathematics: chord-space is an “orbifold,” a notion introduced by Satake in 1956 (20)
and developed by William Thurston in the 1970s (21). Understanding the orbifold
structure of chord-space permits a unified perspective on musical practices across a very
wide range of styles and time-periods: in particular, it shows that composers have
frequently (and perhaps unwittingly) exploited the special contrapuntal properties of
nearly-symmetrical chords (Fig. 1). More generally, the geometry of chord-space reveals
how the internal structure of a chord, including its degree of acoustic consonance,
determines the kind of efficient voice-leadings it can participate in. Thus, for the first
time, we can precisely specify the way in which harmony and counterpoint are related.
I. Background and definitions
For maximal generality, we will consider voice-leading in a continuous, octave-
free space of “pitch-classes.” Two pitches are instances of the same pitch-class (or
chroma) when they are one or more octaves apart. Music theorists represent pitches
numerically by associating their fundamental frequencies f with real numbers according
to the equation:
p = 69 + 12 log_2(f/440)    (1)
This creates a linear pitch space in which middle C is 60, an octave has size 12, and a
semitone (the distance between adjacent keys on a piano keyboard) has size 1. To create
circular pitch-class space, we identify all points p and p+12, forming the quotient space
R/12Z. (Here R refers to the set of real numbers and Z to the group of integers; the
notation R/12Z refers to the circular quotient space, whose points are the orbits of 12Z as
it acts on R.) This creates numerical equivalents for the familiar pitch-class letter names:
C=0, C♯/D♭=1, D=2, “D quarter-tone sharp”=2.5, and so on. Note that although we will
consider the most general case of a continuous pitch-class space, in musical situations
one is typically concerned with a lattice of discrete, equally-spaced points in this space,
corresponding to the familiar pitch-classes of Western equal-temperament.
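The mapping from frequency to pitch and then to pitch-class can be sketched in a few lines of Python. This is only an illustration of equation (1) and of the quotient R/12Z; the function names `pitch_from_frequency` and `pitch_class` are hypothetical, and pitch-classes are represented here by their residue in the interval [0, 12).

```python
import math


def pitch_from_frequency(f_hz: float) -> float:
    """Equation (1): p = 69 + 12 * log2(f / 440)."""
    return 69 + 12 * math.log2(f_hz / 440.0)


def pitch_class(p: float) -> float:
    """Identify p with p + 12 by reducing to a representative in [0, 12)."""
    return p % 12


if __name__ == "__main__":
    # A440 is pitch 69; the A an octave higher is pitch 81.
    print(pitch_from_frequency(440.0), pitch_from_frequency(880.0))
    # Middle C (pitch 60) has pitch-class 0; pitch 62.5 is "D quarter-tone sharp".
    print(pitch_class(60), pitch_class(62.5))
```

Equal-tempered pitch-classes are the integer values of `pitch_class`; non-integer values fall between the points of that lattice.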
Formally, a chord is a multiset of pitch-classes, i.e. a set in which duplicates are
allowed. We will denote unordered multisets using curly braces: the C major chord is {0,
4, 7}, and the F-major chord is {0, 5, 9}. The musical term transposition is synonymous
with the mathematical term translation, and corresponds to addition in R/12Z. Two
chords are transpositionally equivalent if they are the same up to some translation in pitch-class
space. Thus the C-major chord and F-major chord are transpositionally equivalent,
Figure 1. Efficient voice-leading in the Western tradition. Numbers correspond to pitch-classes, with C = 0, C♯ = 1, etc. The voice-leadings in (a)-(c) are minimal voice-leadings containing no “voice-crossings.” That in (d) is non-minimal, and contains crossings. The four examples exploit three different kinds of near-symmetry: translation in (a) and (b), reflection in (c), and permutation in (d).
a) a common classical upper-voice I-IV-I-V-I pattern
b) a common jazz-piano “left-hand” voice-leading pattern
c) Wagner, Parsifal (simplified) and Debussy, Prelude to the Afternoon of a Faun
d) in the style of Gyorgy Ligeti
(11, 0, 1) → (0, 1, 11) → (11, 1, 0)
Figure 2. The orbifold T^2/S_2, drawn using a Euclidean metric. Labelled points in the space correspond to equal-tempered dyads (00, 01, ..., ee); the symbols “t” and “e” refer to 10 and 11, respectively. The left “edge” is identified, with a half-twist, with the right. The two voice-leadings (0, 1)→(1, 0) and (4, 11)→(5, 10) are shown on the graph; the first of these is reflected off the figure’s mirror boundary.
Number of notes
The equal-tempered chord providing the best approximation to the lowest pitch-classes of the harmonic series
Other chords providing reasonably good approximations to the lowest pitch-classes of the harmonic series
Table 1. Familiar sonorities used in Western music. The sonorities on the left provide the best equal-tempered approximations to the first n pitch-classes of the harmonic series. The commonly-used sonorities on the right also approximate the first n pitch-classes of the harmonic series. All sonorities divide pitch-class space fairly evenly.
MATERIALS AND METHODS
TABLE OF CONTENTS
1. Comparing voice-leadings
2. Minimal voice-leadings and voice-crossings
3. A polynomial-time algorithm for finding a minimum voice-leading between two chords
4. Derivation of the voice-leading orbifolds
5. Efficient voice-leading and symmetry
6. Evenness and transpositional invariance
1. Comparing voice-leadings. Let a be an element of R/12Z. We define the
norm of a, written |a|_12Z, as the smallest real number |x| such that x and a are congruent
mod 12Z. (Here |x| refers to the standard absolute-value function.) The distance between
two pitch-classes a and b is |b – a|_12Z. We define the displacement multiset associated
with a voice-leading A→B as the multiset of distances |b_j – a_i|_12Z for all (a_i, b_j) in A→B.
For example, the displacement multiset associated with the voice-leading (0, 0, 4,
7)→(11, 2, 5, 7) is {1, 2, 1, 0}.
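The definitions above can be checked mechanically. The following is a minimal sketch, assuming a voice-leading is written as two equal-length tuples whose i-th entries form the pairs (a_i, b_i); the names `pc_norm` and `displacement_multiset` are hypothetical, and Python's `Counter` stands in for a multiset.

```python
from collections import Counter


def pc_norm(a, octave=12):
    """|a|_12Z: the smallest |x| such that x is congruent to a mod 12."""
    r = a % octave
    return min(r, octave - r)


def displacement_multiset(source, target, octave=12):
    """Multiset of pitch-class distances |b - a| over the pairs of a voice-leading."""
    if len(source) != len(target):
        raise ValueError("source and target must list the same number of voices")
    return Counter(pc_norm(b - a, octave) for a, b in zip(source, target))


if __name__ == "__main__":
    # The example from the text: (0, 0, 4, 7) -> (11, 2, 5, 7).
    print(sorted(displacement_multiset((0, 0, 4, 7), (11, 2, 5, 7)).elements()))
```

Because a `Counter` ignores order, two voice-leadings that differ only in the order in which their voices are listed produce equal displacement multisets.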
We will require that any method of comparing voice-leadings depend only on
their displacement multisets: for any two displacement multisets X and Y, it tells us
which, if any, is larger. More formally, a method of comparing voice-leading size will be
an asymmetric, negatively transitive relation (a strict weak order) over multisets of non-
negative reals. (A relation “>” is “asymmetric” if A > B implies that not B > A. It is
“negatively transitive” if A > B implies that either A > C or C > B, for all C.) A strict
weak order defines equivalence classes consisting of all non-comparable items: A ≡ B iff
neither A > B nor B > A. Strict weak orders are stronger than partial orders, since they
satisfy the trichotomy axiom: for any two elements in a strict, weakly ordered set, either
A > B, A ≡ B, or B > A. However, a strict weak order is weaker than a total order, since
it does not satisfy the “antisymmetry” condition: in a strict weak order, A ≡ B does not
imply that A and B are the same object.
Let > be a strict weak order of multisets of nonnegative reals. We will say that
the relation > is normlike if and only if it satisfies two constraints.
{x1 + i, x2, …, xn} ≥ {x1, x2 + i, …, xn} ≥ {x1, x2, …, xn}, for x1 > x2, i > 0 (Distribution)
(NB: since multisets are unordered the numerical subscripts do not have ordinal
significance: x1 is no more “first” than x2 or xn.) The recursion constraint mandates a
predictable relationship between the size of a multiset and the size of its sub-multisets.
The distribution constraint’s first inequality requires that if X is an n-element multiset
whose values sum to x, then {x, 0, 0, …, 0} ≥ X ≥ {x/n, x/n, …, x/n}. Thus, x semitones
of motion in a single “voice” yields at least as large a voice-leading as x semitones of
motion distributed over multiple voices. As we will see below, this constraint is closely
related to the triangle inequality. The distribution constraint’s second inequality requires
that reducing the size of an element in a displacement multiset not make that multiset
larger. If a normlike strict weak order strictly satisfies both of the distribution
constraint’s inequalities, we will say that it strictly satisfies the distribution constraint.
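As a numerical illustration of the consequence stated above, {x, 0, …, 0} ≥ X ≥ {x/n, …, x/n}, the sketch below measures displacement multisets with the analogues of the Lp vector norms and checks both bounds. The helper names `lp_size` and `within_distribution_bounds` are hypothetical, and a spot check on sample multisets is of course not a proof.

```python
import math


def lp_size(displacements, p):
    """Size of a displacement multiset under the analogue of the Lp vector norm."""
    xs = [abs(x) for x in displacements]
    if math.isinf(p):
        return max(xs, default=0.0)
    return sum(x ** p for x in xs) ** (1.0 / p)


def within_distribution_bounds(displacements, p, tol=1e-9):
    """Check {x, 0, ..., 0} >= X >= {x/n, ..., x/n}, where x is the sum of X."""
    n = len(displacements)
    total = sum(displacements)
    concentrated = [total] + [0.0] * (n - 1)  # all motion in a single voice
    even = [total / n] * n                    # motion spread evenly over n voices
    size = lp_size(displacements, p)
    return lp_size(concentrated, p) + tol >= size >= lp_size(even, p) - tol


if __name__ == "__main__":
    for p in (1, 2, math.inf):
        print(p, lp_size([1, 2, 1, 0], p), within_distribution_bounds([1, 2, 1, 0], p))
```

For p = 1 the three sizes coincide, which is the non-strict case; for p = 2 the inequalities are strict on this example.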
At present, every music-theoretical method of measuring voice-leading size
produces a normlike strict weak order of multisets of non-negative reals. All but one
strictly satisfy the distribution constraint.
A. “Smoothness.” The size of a voice-leading is the sum of the elements of the displacement multiset (S1, S2, S3). This is sometimes called “taxicab norm.” Smoothness satisfies the distribution constraint non-strictly.
B. Smoothness is analogous to the L1 vector norm, though the components of vectors are ordered whereas the elements of displacement multisets are not. The analogues to Lp vector norms strictly satisfy the distribution constraint for finite p > 1. (The L∞ vector norm also satisfies the distribution constraint, but not strictly.) The L2 vector norm, which has been used by Callender (S4), corresponds to Euclidean norm.
C. “Parsimony.” Parsimony generalizes a notion introduced by Richard Cohn and developed by Jack Douthett and Peter Steinbach (S5, S6). Given two voice-leadings, α and β, α is smaller (or “more parsimonious”) than β iff there exists some real number j such that
1) for all real numbers i > j, i appears the same number of times in the displacement multisets associated with α and β; and
2) j appears fewer times in the displacement multiset of α than β.
D. “Smoothness then parsimony.” This measure represents my own best hypothesis about how classical composers might have thought about voice-leading size. Given two voice-leadings α and β, α is smaller than β iff:
1) α is smoother than β; or
2) α and β are equally smooth, and α is more parsimonious than β.
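Measures C and D are comparisons rather than numerical sizes, so one way to make them concrete is as Boolean predicates on displacement multisets. The sketch below is one possible reading of the definitions, with hypothetical function names; it scans the distinct displacement values from largest to smallest and decides at the first value whose multiplicities differ.

```python
from collections import Counter


def smoothness(displacements):
    """Measure A: the sum of the elements of the displacement multiset."""
    return sum(displacements)


def more_parsimonious(alpha, beta):
    """Measure C: True iff alpha is smaller than beta.

    The largest value j whose multiplicities differ is the only candidate for
    the j in the definition, since every larger value appears equally often.
    """
    count_a, count_b = Counter(alpha), Counter(beta)
    for value in sorted(set(count_a) | set(count_b), reverse=True):
        if count_a[value] != count_b[value]:
            return count_a[value] < count_b[value]
    return False  # identical multisets: neither is smaller


def smaller_by_smoothness_then_parsimony(alpha, beta):
    """Measure D: compare by smoothness first, then break ties by parsimony."""
    size_a, size_b = smoothness(alpha), smoothness(beta)
    if size_a != size_b:
        return size_a < size_b
    return more_parsimonious(alpha, beta)


if __name__ == "__main__":
    # {2, 2} is more parsimonious than {3, 0}, yet {3, 0} is the smoother of the two.
    print(more_parsimonious([2, 2], [3, 0]))
    print(smaller_by_smoothness_then_parsimony([3, 0], [2, 2]))
```

The pair {2, 2} and {3, 0} shows that the two measures can disagree about which voice-leading is smaller.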
Many of these methods of measuring voice-leading size yield mathematical “norms”:
there is some function f from multisets to the real numbers, such that f(X) > f(Y) if and
only if X > Y according to the normlike strict weak order >. Note, however, that neither
“parsimony” nor “smoothness then parsimony” can give rise to such a function f.
Nevertheless, both “parsimony” and “smoothness then parsimony” represent musically
viable ways of thinking about voice-leading size. For this reason, we cannot simply
impose the mathematically-convenient requirement that measurements of voice-leading
size produce “norms” or “metrics.”
However, both “parsimony” and “smoothness then parsimony” are very closely related to traditional norms. “Parsimony” refines the L∞ vector norm, according to which
the size of a voice-leading is given by the largest element in its displacement multiset. Given two voice-leadings α and β, if α < β according to the L∞ norm then α is more
parsimonious than β. However, the converse does not hold: voice-leadings with displacement multisets {3, 3} and
{3, 0} have the same L∞ norm but the first is less parsimonious than the second.
“Parsimony” is therefore closely related to, but slightly more fine-grained than, the L∞
norm. “Smoothness then parsimony” stands in an analogous relation to “smoothness,”
the L1 vector norm. For this reason, we can often reason about “parsimony” and “smoothness then parsimony” using our geometric intuitions about the L∞ and L1 norms.
This point holds more generally. As the name suggests, the notion of a normlike
strict weak order is a weakened analogue to a traditional geometrical “norm.” We can
think of the displacement multiset associated with the voice-leading A→B as a non-real-
valued “norm” of the voice-leading A→B. Likewise, the displacement multiset
associated with the minimal voice-leading A→B is analogous to a non-real-valued
“distance” between A and B. This non-real-valued “distance” has many of the properties
associated with a proper mathematical metric:
1. It is symmetric, since the displacement multiset associated with the minimal voice-leading A→B is the same as the displacement multiset associated with the minimal
voice-leading B→A.
2. The minimal voice-leading A→A has displacement multiset {0, 0, …, 0},
which is at least as small as any other displacement multiset with the same number of
elements. In this sense, the “distance” between A and A is as small as it can be.
3. If the displacement multiset associated with the minimal bijective voice-leading A→B is {0, 0, …, 0} then A = B.
4. Finally, the distribution constraint is closely related to the triangle inequality.
Indeed, as long as we require that the size of a voice-leading depend only on the size of
its displacement multiset, then the two principles are equivalent: any violation of the
distribution constraint generates a violation of the triangle inequality, and vice-versa.
(This is fairly obvious in the case of the metrics associated with the Lp vector norms, and
less than obvious in the general case. I sketch a proof at the end of §2, below.)
Intuitively, both the distribution constraint and the triangle inequality express the
principle that x steps in a single direction take you farther than x total steps in a number
of mutually orthogonal directions.
2. Minimal voice-leadings and voice-crossings. The following theorem shows that
between any two chords there is a minimal voice-leading with no “voice-crossings” in
pitch-class space. Since avoidance of voice-crossings is a feature of traditional Western
musical practice, it helps justify our use of normlike strict weak orders; it furthermore
allows us to generate an efficient algorithm for determining the minimal voice-leading
between two chords.
THEOREM 1. Let A and B be any two chords, and let our measure of voice-leading size be a strict weak order satisfying the distribution constraint. There will exist a minimal voice-leading from A to B, (a1, a2, …, an)→(b1, b2, …, bn), that has no “voice-crossings” in pitch-class space. That is, there will exist a set of continuous functions fn(t) such that fn(0) = an, fn(1) = bn, and fm(t) ≠ fn(t), for all m ≠ n, and all t such that 0 < t < 1. Furthermore, if our order strictly satisfies the distribution constraint, then every minimal voice-leading between A and B will be crossing-free.
The theorem is proved by a simple examination of cases.
Suppose that a voice-leading A→B contains a crossing; we will show that we can
remove the crossing without increasing the size of the voice-leading and without creating
any new crossings. In what follows, we will depict pitch-class space as a circle with
ascending motion in pitch-class space corresponding to clockwise motion around the circumference. It is always assumed that 0 < x, x + m, x + n ≤ 6. Note that although the
following proof is stated in terms of pitch-classes, a precisely analogous result applies to
pitches; here, “chords” are simply multisets of real numbers, and there is always a
minimal voice-leading with no crossings in pitch-space.
Figure S3(a) shows the first geometrical possibility: pitch-class a1 moves n
semitones counterclockwise to b2 while pitch class a2 moves x + m semitones counterclockwise to b1, with 0 ≤ n < m. The uncrossed voice-leading (a1, a2)→(b1, b2) has
displacement multiset {m, x + n}. Since m > n, the distribution constraint implies that {m, x + n} ≤ {x + m, n}. The uncrossed voice-leading is no larger than the voice-leading
with the crossing; if the strict weak order strictly satisfies the distribution constraint, then
the uncrossed voice-leading is smaller.
Figure S3(b) shows a second possibility: a1 moves clockwise by n semitones to b2, while a2 moves counterclockwise by x + m semitones to b1, with m ≥ 0, x > n > 0. The
voice-leading (a1, a2)→(b2, b1) is associated with the displacement multiset {n, x + m};
the voice-leading (a1, a2)→(b1, b2) is associated with {m, x – n}. By the distribution
constraint, {x + m, n} ≥ {x, n + m} ≥ {x – n, m}, so the uncrossed voice-leading is no
larger than the crossed voice-leading. If the strict weak order strictly satisfies the
distribution constraint, the uncrossed voice-leading is smaller.
Figure S3(c) shows a third possibility. m + n > x, since otherwise there would be no crossing. This implies x – m < n and x – n < m. Therefore {m, n} ≥ {x – m, x – n},
and the uncrossed voice-leading is no larger than the crossed voice-leading. Again, if the
strict weak order strictly satisfies the distribution constraint then the uncrossed voice-
leading is smaller.
The remaining cases are closely analogous to those already considered, and are
left for the interested reader to verify. It remains to be shown that we can follow the
above procedures without creating any new voice-crossings. This is readily seen from
Figure S4. Without loss of generality, we can choose points b1 and b2 in Figure S4 to be
adjacent. We connect every note in the source chord to its destination by a path that has
no unnecessary crossings, as in Figure S4. Figure S4(a) features the crossing (a1, a2)→(b2, b1), as well as two additional types of voice-crossing: c1→d1, which crosses the
line a1→b2, and c2→d2, which crosses both a1→b2 and a2→b1. Figure S4(b), which
removes the crossing (a1, a2)→(b2, b1), shows that the remaining crossings c1→d1 and
c2→d2 are unaffected. Removing the crossing therefore reduces the total number of
voice-crossings in the voice-leading. The crossings shown in Figure S4, along with those
that can be obtained from this figure by reflection, exhaust the relevant geometrical
possibilities. We conclude that it is possible to remove a voice-leading’s crossings without
making the voice-leading larger. If our normlike strict weak order strictly obeys the
distribution constraint, then removing voice-crossings will always make the voice-leading
smaller.
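The pitch-space version of this result is easy to spot-check numerically. The sketch below assumes the L1 measure (“smoothness”), ordinary pitches rather than pitch-classes, and bijective voice-leadings only; the function names are hypothetical. It pairs the sorted notes of one chord with the sorted notes of the other, which is a crossing-free voice-leading in pitch space, and compares the result with an exhaustive search over all pairings.

```python
from itertools import permutations


def l1_size(source, target):
    """Smoothness of a bijective voice-leading in (non-circular) pitch space."""
    return sum(abs(b - a) for a, b in zip(source, target))


def uncrossed_voice_leading(chord_a, chord_b):
    """Pair the k-th lowest note of A with the k-th lowest note of B."""
    return tuple(sorted(chord_a)), tuple(sorted(chord_b))


def brute_force_min_size(chord_a, chord_b):
    """Smallest L1 size over every bijective pairing of the two chords."""
    return min(l1_size(chord_a, perm) for perm in permutations(chord_b))


if __name__ == "__main__":
    a, b = (60, 64, 67), (65, 60, 69)  # C major and F major triads as pitches
    print(l1_size(a, b))                            # as listed, with crossings
    print(l1_size(*uncrossed_voice_leading(a, b)))  # after uncrossing
    print(brute_force_min_size(a, b))               # exhaustive minimum
```

The exhaustive search grows factorially with chord size and is included only as a check on small examples.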
Theorem 1 is significant because it ties an important musical notion, “voice-
crossing,” to an important mathematical one: the triangle inequality, as represented by its
close cousin, the distribution constraint. It is widely accepted that avoidance of “voice-
crossings” in pitch space is a feature of traditional Western compositional practice (S7).
Theorem 1, which can easily be adapted to cover the case of voice-leadings in non-
circular pitch space, shows that normlike strict weak orderings are compatible with this
feature of classical practice. Moreover, it is easy to show that if a method of comparing
voice-leading size violates the distribution constraint, then there will be at least one
“crossed” voice-leading (in either pitch or pitch-class space) that is preferred to its
uncrossed alternative. Thus the distribution constraint and the principle of avoiding
voice-crossings are equivalent within the limits of the formalism we have developed.
At the same time, the distribution constraint is closely related to the triangle
inequality. This allows us to use the minimal voice-leading between two chords to define
a “distance” between them, thereby underwriting the geometrical approach of the present
paper. Again, Theorem 1 is interesting precisely because it shows that our reference to
the geometrical concept of “distance” requires that we not prefer crossed voice-leadings
to their uncrossed alternatives. Consequently, were classical composers to have favored
voice-crossings, we would not be able to speak of the “distance” between chords
in the relatively straightforward way that we do here. We would be constrained to talk
only about the affine structure of musical chords—roughly, those non-metric properties
that depend only on the existence of “straight lines” in the space.
We conclude this section with a brief sketch of a proof that the distribution
constraint is equivalent to the triangle inequality. Let A and C be chords. The triangle inequality requires that a bijective voice-leading A→C be no larger than the combined length
of any pair of bijective voice-leadings A→B and B→C that take A to C by way of B in
such a way as to preserve the mappings of the “direct” voice-leading A→C. It is
straightforward to identify the displacement multiset associated with A→B→C when A,
B, and C are collinear: one simply adds the elements of the displacement multisets associated with A→B and B→C so as to be faithful to the musical voices’ motions. The
displacement multiset associated with non-collinear A→B→C, if defined, is simply the
displacement multiset associated with A→B→D, with A, B, and D collinear and B→C
the same size as B→D. A normlike strict weak ordering does not ensure that there is a
displacement multiset associated with all paths A→B→C; but it does ensure that, if there is, it
is no smaller than that associated with the direct voice-leading A→C.
To see why, suppose there is some crossed voice-leading between chords A and C that is preferred to the uncrossed voice-leading A→C. There will be a pair of
voice-leadings A→B→C that has the same combined displacement multiset as the crossed
voice-leading but which preserves the mappings of the “direct” voice-leading A→C.
(Here B is the point where the two voices cross as they move linearly from notes in A to
their counterparts in C.) Since the crossed voice-leading is preferred, the combined voice-leadings A→B→C are smaller than A→C, which violates the triangle inequality.
Conversely, suppose there is a triangle ABC such that the combined voice-leadings A→B→C are smaller than the “direct” voice-leading A→C. There is a voice-leading
A→D with the same displacement multiset as A→B→C. Since A→B→C form two legs
of a triangle, it is easy to show that the preference for A→D over A→C must violate the
distribution constraint.
3. A polynomial-time algorithm for finding a minimum voice-leading between two chords. Given two chords A and B, how do we find a minimal voice-leading between
them? The question is non-trivial, since minimal voice-leadings need not be bijective:
using any of the standard measures of voice-leading size, the minimal voice-leading between {0, 4, 6} and {6, 10, 0} is (0, 0, 4, 6)→(10, 0, 6, 6). The large number of
possibilities here (roughly 2^(mn), where m and n are the cardinalities of the two
chords) makes an exhaustive search impractical, particularly in time-critical applications
such as interactive computer music.
However, Theorem 1 enables us to use the technique of “dynamic programming,”
common in computer science, to provide an efficient, polynomial-time algorithm (order
n^2 m) for determining a minimal voice-leading between arbitrary chords. Define the
ascending distance from pitch-class a to b as the smallest positive real number x such that
a + x is congruent to b, mod 12Z. Let (a1, a2, …, am, am+1 = a1) order the elements of
chord A based on ascending distance from arbitrarily-chosen a1. (Note that we repeat the
first element a1 as the last element of the list.) Similarly, for (b1, b2, …, bn, bn+1 = b1). The notation [a1, …, ai]→[b1, … bj] will refer to all voice-leadings from {a1, a2, …, ai} to {b1,
b2, … bj}, that can be notated so that both chords’ subscripts are in nondecreasing order. Thus [a1, a2]→[b1, b2, b3] includes (a1, a1, a2)→(b1, b2, b3), (a1, a1, a2, a2)→(b1, b2, b3, b3),
and so on.
If a crossing-free voice-leading contains the pair (ai, bj) then it must contain at
least one of the following: (ai-1, bj), (ai, bj-1), or (ai-1, bj-1) (subscript arithmetic modulo the
cardinality of the chords). By the recursion constraint, the smallest voice-leading of the
form [a1, …, ai]→[b1, … bj] will be the voice-leading that adds the pair (ai, bj) to the
smallest voice-leading of the form [a1, …, ai-1]→[b1 … bj], [a1, …, ai]→[b1 … bj-1], or [a1,
…, ai-1]→[b1 … bj-1].
Thus, once we have fixed the pair (a1, b1) we can recursively compute the minimal
voice-leading between A and B that contains that pair. We do this by creating a matrix whose cells e_{i,j} record the size of the minimal voice-leading of the form [a1, …, ai]→[b1,
… bj]. It is trivial to fill in the first row and column of the matrix; from there, we can
proceed to fill in the rest. At each step, we need only consider the voice-leadings in a
cell’s upper, left, and upper-left neighbors.
Figure S5 illustrates the technique, identifying the smallest voice-leading between
the C and E major-seventh chords, {4, 7, 11, 0} and {4, 8, 11, 3}, such that the voice-
leading contains the pair (4, 4). In constructing this matrix we have used “smoothness”
(or “taxicab norm”) to measure the voice-leading size. The voice-leading in the bottom-
right cell is the minimal voice-leading between the two chords that contains (4, 4). To
remove this last restriction, we would need to repeat the calculation three more times,
each time cyclically permuting the order of one of the chords so as to fix a different
initial pair. As it happens, however, the voice-leading shown in Figure S5 is the
minimum voice-leading between the respective chords. This follows from the fact that the voice-leading in the top-left cell (4→4) contributes nothing to the overall size of the
voice-leading; we can therefore add this mapping to any voice-leading without increasing
its size according to the L1 norm.
Figure S5 includes in each cell both the numerical size of the voice-leading and
the voice-leading itself. With the L1 norm (“smoothness”) this is unnecessary: we need to
keep track of the size, but not the voice-leading. To determine the value of cell e_{i,j} we can
simply add the distance between the pair (ai, bj) to the minimum value in the cells e_{i-1,j},
e_{i,j-1}, and e_{i-1,j-1}. (With the Euclidean metric we can calculate squared distance in this
way, taking the square-root just before output.) Having filled in the matrix, we can
recover the minimal voice-leading by “tracing back” all paths that move from the bottom-
right cell to the top left, moving only north, west, and northwest, such that the size of the
voice-leading decreases as much as possible with each step. The cells in boldface
indicate the path that such a traceback algorithm would take.
Due to the circular structure of pitch-class space, the voice-leading in the lower
right-hand corner of the matrix counts the pair (a1, b1) = (am+1, bn+1) twice; this can easily
be corrected prior to output.
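The matrix procedure described in this section can be sketched as follows for the L1 measure (“smoothness”). This is an illustrative reading of the description rather than a reference implementation: the function names are hypothetical, only the size is tracked (no traceback), and the double-counted pair (a1, b1) is corrected by subtracting its distance once at the end.

```python
def pc_distance(a, b, octave=12):
    """Distance between two pitch-classes on the circle R/12Z."""
    d = (b - a) % octave
    return min(d, octave - d)


def min_size_containing_first_pair(a, b, octave=12):
    """L1 size of the smallest voice-leading of the matrix form containing (a[0], b[0]).

    Both lists are assumed to be ordered by ascending distance from their first
    element; the first element is repeated at the end, as in the text.
    """
    ext_a, ext_b = a + a[:1], b + b[:1]
    rows, cols = len(ext_a), len(ext_b)
    e = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if i == 0 and j == 0:
                prev = 0
            elif i == 0:
                prev = e[0][j - 1]          # first row: only a left neighbor
            elif j == 0:
                prev = e[i - 1][0]          # first column: only an upper neighbor
            else:
                prev = min(e[i - 1][j], e[i][j - 1], e[i - 1][j - 1])
            e[i][j] = prev + pc_distance(ext_a[i], ext_b[j], octave)
    # The bottom-right cell counts the pair (a1, b1) twice; remove one copy.
    return e[-1][-1] - pc_distance(a[0], b[0], octave)


def min_voice_leading_size(chord_a, chord_b, octave=12):
    """Repeat the calculation for each cyclic rotation of B and keep the minimum."""
    a = sorted(x % octave for x in chord_a)
    b = sorted(x % octave for x in chord_b)
    return min(min_size_containing_first_pair(a, b[k:] + b[:k], octave)
               for k in range(len(b)))


if __name__ == "__main__":
    print(min_voice_leading_size([0, 4, 6], [6, 10, 0]))
    print(min_size_containing_first_pair([4, 7, 11, 0], [4, 8, 11, 3]))
```

Each matrix has (m + 1)(n + 1) cells and there are n rotations, in line with the order n^2 m bound given above.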
Finally, note that we need only consider n distinct possibilities to find a minimal bijective voice-leading A→B. Let (a0, a1, … , an-1) order the elements of chord A based on
ascending distance from arbitrarily-chosen element a0. Similarly for (b0, b1, … , bn-1). By
Theorem 1, there will be a minimal bijective voice-leading between A and B of the form (a0, a1, … , an-1)→(bc, bc+1, … , bc+n-1), where c is an integer and the subscript arithmetic is
reduced mod n.
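The n-rotation search can be sketched as follows. This is an illustrative sketch rather than a definitive implementation: it assumes the L1 norm, represents pitch classes as numbers mod 12, and uses the hypothetical names pc_dist and min_bijective_vl.

```python
def pc_dist(a, b):
    """Distance between pitch classes a and b on the circle R/12Z."""
    d = abs(a - b) % 12
    return min(d, 12 - d)


def min_bijective_vl(A, B):
    """Try the n voice-leadings a_i -> b_(c+i) and return (size, c, pairs)
    for the smallest one under the L1 norm."""
    if len(A) != len(B):
        raise ValueError("a bijective voice-leading needs chords of equal size")
    n = len(A)
    # Order each chord in ascending circular order from its lowest pitch class.
    a = sorted(p % 12 for p in A)
    b = sorted(p % 12 for p in B)
    best = None
    for c in range(n):
        pairs = [(a[i], b[(c + i) % n]) for i in range(n)]
        size = sum(pc_dist(p, q) for p, q in pairs)
        if best is None or size < best[0]:
            best = (size, c, pairs)
    return best
```

For C major (0, 4, 7) and E minor (4, 7, 11) the search selects c = 2, pairing 0→11, 4→4, 7→7, with size 1.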
4. Derivation of the voice-leading orbifolds. We begin by deriving Figure 2 in the main
text. Figure S6 shows the 2-torus T2, drawn using a Euclidean metric, and representing
ordered 2-note chords. To form a graph of unordered chords we need to identify all
points (x, y) and (y, x). As can be seen from Figure S6, this involves “folding” the 2-
torus along the diagonal line AB. The result is a “triangle” whose two sides are
identified, shown in Figure S7. Although it may not be immediately obvious, this figure
is a Möbius strip. To see why, cut Figure S7 along the line CD. This creates two
detached triangles. Then glue line AC on one triangle to CB on the other. (You will
have to turn one piece of paper over to get the chords to line up.) The result is the main
text’s Figure 2.
We now proceed more abstractly, describing the orbifolds Tn/Sn for arbitrary n.
For simplicity of exposition and ease of visualization, we will assume the Euclidean
metric in what follows. Since pitch-class space is represented by the circle R/12Z we are
interested in the n-torus (R/12Z)n. The quotient space (R/12Z)n/Sn can also be written
Rn/(Sn × 12Zn), where Zn refers to the group of n-tuples of integers, and the Zn action is
by componentwise addition. (The notation 12Zn indicates that the components of each n-tuple
in Zn are to be multiplied by the scalar 12.) We will proceed by deriving a
fundamental domain of Sn × 12Zn in Rn. (A “fundamental domain” of the group Γ in
space S is a region R of S, such that S is the union of the regions gR, for all g ∈ Γ, and such
that the intersection of any two regions gR and hR, for g ≠ h, has no interior.) By
identifying the appropriate boundary points of this fundamental domain, we will obtain
the orbifold Rn/(Sn × 12Zn).
We first describe a fundamental domain of Sn in Rn. In this region, no two
distinct points (x1, x2, …, xn) and (y1, y2, …, yn) have coordinates that are equivalent
under some permutation: that is, there is no σ(n) such that (x1, x2, …, xn) = (yσ(1), yσ(2),…,
yσ(n)), where σ(n) is some permutation of the integers from 1 to n. We can create such a
region simply by requiring that a point’s coordinates be in nondescending order: i.e.
considering all points (x1, x2, …, xn) such that x1 ≤ x2 ≤ … ≤ xn. We can incorporate the
12Zn action by requiring that xn ≤ x1 + 12, and 0 ≤ Σnxn ≤ 12. In Euclidean space, the
resulting fundamental domain is a right hyperprism whose faces are n-1 dimensional
simplexes. To see why, observe that
1. The n inequalities x1 ≤ x2 ≤ … ≤ xn ≤ x1 + 12 define an n-1 simplex in every
plane Σnxn = n.
2. Addition by (c, c, …, c) sends the simplex in the plane Σnxn = n to the simplex in
the plane Σnxn = n + cn.
3. The planes Σnxn = n are perpendicular to the vector (1, 1, …, 1).
The vector (1, 1, …, 1) points in the direction of the “height” coordinate of the prism; the
prism’s “faces” lie in planes perpendicular to the vector (1, 1, …, 1) and therefore contain
chords whose pitch-classes sum to the same value.
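As an illustration of the inequalities that define the fundamental domain, the following sketch maps an arbitrary chord to a representative satisfying x1 ≤ x2 ≤ … ≤ xn ≤ x1 + 12 with coordinate sum in the half-open range [0, 12). The function name fundamental_rep is hypothetical, and the choice of a half-open range (so that boundary points get a single representative) is an assumption made for the example.

```python
def fundamental_rep(chord):
    """Return a representative of the chord with nondescending coordinates,
    x_n <= x_1 + 12, and coordinate sum in [0, 12)."""
    # Reduce mod 12 and sort: now x_1 <= ... <= x_n < x_1 + 12.
    xs = sorted(p % 12 for p in chord)
    # Moving the top note down an octave and placing it first keeps the
    # ordering inequalities intact and lowers the sum by exactly 12.
    while sum(xs) >= 12:
        xs = [xs[-1] - 12] + xs[:-1]
    return xs
```

For example, the orderings (7, 0, 4) and (16, 7, 12) of the C major triad both map to [0, 4, 7], while the E minor triad (11, 4, 7) maps to [-1, 4, 7].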
Our construction of the fundamental domain ensures that no two points on a
single plane Σnxn = n can represent the same chord. However, the planes do contain
chords related by transposition: if (x1, x2, …, xn) satisfies the inequalities x1 ≤ x2 ≤ … ≤ xn
≤ x1 + 12, then so does the transpositionally-related (x2 – 12/n, x3 – 12/n, …, xn – 12/n, x1
+ 12 – 12/n), which has the same sum as (x1, x2, …, xn). Let O refer to the function that
sends (x1, x2, …, xn) to this transpositionally-related chord. By repeatedly
applying O to a chord X, we can obtain n transpositionally equivalent chords X, O(X),
O2(X), … On-1(X) whose pitch-classes all sum to the same value. (If X is invariant under
some transposition, then some of the chords On(X) will be equal.) In Euclidean space, O
is an orthogonal transformation that is an automorphism of the prism: it is a rotation
when the prism has an odd number of dimensions, and a rotation-plus-reflection
otherwise. O acts so as to cyclically permute the vertices of the simplex in each plane
Σnxn = n.
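The action of O is easy to check numerically. The sketch below assumes the formula given above, O(x1, x2, …, xn) = (x2 – 12/n, …, xn – 12/n, x1 + 12 – 12/n), and uses exact rational arithmetic; the Python name O is simply a transcription of the symbol.

```python
from fractions import Fraction


def O(x):
    """Rotate the coordinates and transpose down by 12/n, so that the
    coordinate sum is unchanged."""
    n = len(x)
    step = Fraction(12, n)
    return [xi - step for xi in x[1:]] + [x[0] + 12 - step]
```

Starting from (0, 4, 7), successive applications give (0, 3, 8), (-1, 4, 8), and then (0, 4, 7) again. Each chord is a transposition of the C major triad, each has coordinate sum 11, and n = 3 applications return the starting point.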
It remains to be determined how the two simplicial faces of the prism are to be
identified. We cannot identify them in the obvious way, since this would identify point
(x1, x2, …, xn) on the Σnxn = 0 face of the prism with the transpositionally-distinct chord (x1
+ 12/n, x2 + 12/n, …, xn + 12/n) on the Σnxn = 12 face. Notice, however, that
(x2, x3, …, xn, x1 + 12) represents the same chord as (x1, x2, …, xn). We therefore need to
identify (x1, x2, …, xn) with O(x1 + 12/n, x2 + 12/n, …, xn + 12/n) = (x2, x3, …, xn, x1 +
12). Colloquially, we apply the transformation O to the Σnxn = 12 face before “gluing the
two faces together.” This identification transforms the prism’s “height coordinate” into a
circle: in moving parallel to the vector (1, 1, …, 1) we pass through all and only the
transpositions of a given chord, returning eventually to our starting point. Thus we can
describe the orbifold (R/12Z)n/Sn as the product of an (n-1)-simplex with a circle, modulo
the action that rotates the circle by 360/n degrees while applying the transformation O to
the simplex.
5. Efficient voice-leading and symmetry. Let A be an n-note chord and let (a1, a2, …,
an) be an arbitrary ordering of its elements. The symbol σ(a1, a2, …, an) will refer to the
ordering (aσ(1), aσ(2),…, aσ(n)), where σ(n) is some permutation of the integers from 1 to n.
We will use the notation A→σ(A) to refer to any voice-leading A→A that can be written
(a1, a2, …, an)→(aσ(1), aσ(2),…, aσ(n)) (4)
An arbitrary n-note chord S will be invariant under σ (or σ-invariant) if the chord’s
elements can be labeled so that si = sσ(i) for all i ≤ n.
In what follows, we will use the variable O to refer to a specific permutation σ,
transposition Tx, or inversion Iy. We will say that an n-note chord S is invariant under O
if there is some voice-leading S→O(S) that is trivial. We will generally assume that O
itself is non-trivial: that is, there is at least one chord that is not invariant under O. Thus
we will not be considering the trivial permutation σ(n) = n or the trivial transposition
T0(x) = x.
It is intuitively obvious that the size of a voice-leading A→S, where S is invariant
under some O, sets an upper bound on the size of the minimal voice-leading A→O(A).
This is because we can express the voice-leading A→O(A) as the composition of two
equally-sized voice-leadings A→S and S→O(A). (For any A→S, we can find an equally
large S→O(A), since S is invariant under O and since a normlike strict weak order is
insensitive to the “direction” of the voice-leading.) Write the displacement multiset
corresponding to A→S as {d1, d2, …, dn}. We can conclude that the minimal voice-
leading A→O(A) can have a displacement multiset no larger than {2d1, 2d2, …, 2dn}.
Thus as the size of the voice-leading A→S goes to zero, the minimal voice-leading
A→O(A) must also go to zero.
The converse, however, is less obvious. Suppose we have some bijective voice-
leading A→O(A). Does the size of A→O(A) set an upper bound on the size of the
minimal voice-leading A→S, where S is O-invariant?
Yes, assuming such an S exists. The following theorem uses the size of A→O(A)
to limit the size of A→S, showing that as A→O(A) vanishes so must A→S. Since the
result is proven for any normlike strict weak order, it does not set a very tight (or
interesting) limit on the voice-leading A→S. However, it does establish the general
theoretical point that the size of A→O(A) is dependent on that of A→S.
Lemma 2.1. Let A be a chord with n elements, let x be some element of R/12Z
such that nx is congruent to 0, mod 12Z, and let A→σ(A) be a bijective voice-leading that acts as a cyclical permutation of A’s elements. Label the pitch-classes of A so that the voice-leading A→σ(A) can be written
(a0, a1, …, an-1)→(a1, a2, …, an-1, a0)
There is a voice-leading A→Tx(A) that has the form
(a0, a1, …, an-1)→(x + a1, x + a2, …, x + an-1, x + a0)
with displacement multiset consisting of values di = |x + (ai+1 – ai)|12Z (subscript
arithmetic mod n). There must therefore exist a voice-leading A→S, from A to
some chord S that is invariant under Tx (or σ, if x = 0) and which has
displacement multiset
\[ \left\{ \sum_{i=0}^{m-1} d_i,\ \sum_{i=1}^{m-1} d_i,\ \ldots,\ \sum_{i=m-1}^{m-1} d_i,\ \sum_{i=m}^{m} d_i,\ \sum_{i=m}^{m+1} d_i,\ \ldots,\ \sum_{i=m}^{n-2} d_i,\ 0 \right\} \]
Here m = ⌊n/2⌋, the greatest integer ≤ n/2.
Proof. The value di = |x + (ai+1 – ai)|12Ζ measures how close the interval ai+1 – ai is to –x.
As Figure S8 shows, we need to move at most ⌊n/2⌋ pitch-classes by |x + (ai+1 – ai)|12Ζ
semitones in order to make a given interval ai+1 – ai equal to –x. We can do so,
furthermore, without disturbing any of the other di, for i < n-1. Only the intervals ai+1 – ai
and dn-1 = W need be disturbed. Since the voice-leading acts as a circular permutation,
and since nx is congruent to 0, mod 12Z, we need iterate this procedure only n–1 times in
order to obtain a set that is invariant under Tx or σ (if x = 0): once we set n–1 of the
intervals equal to –x, the final “wraparound” interval—labeled W on Figure S8—will
also be equal to –x. Note that since our choice of a1 is arbitrary, we can choose W so as
to minimize the resulting voice-leading A→S.
Lemma 2.2. Let A→Ix(A) be a crossing-free, bijective voice-leading. Since
A→Ix(A) is crossing-free, it can be written in the form
(a0, a1, …, an-1)→(x – ac-0, x – ac-1, …, x – ac-(n-1))
with displacement multiset {d0, d1, …, dn-1} (subscript arithmetic is mod n). We
can therefore find an S such that S is invariant under Ix and the voice-leading
A→S has displacement multiset no larger than
{d0/2, d1/2, …, dn-1/2}.
Proof. The crossing-free voice-leading A→Ix(A) associates pitch-class ai with x – ac-i
(subscript arithmetic mod n). There are two cases to consider, i = c/2, in which case the
voice-leading associates ac/2 with x – ac/2, and i ≠ c/2, in which case the voice-leading
associates ai with x – ac-i and ac-i with x – ai.
Case 1. i = c/2. Our voice-leading associates ai with x – ai; the distance between
these two points is |x – 2ai|12Ζ. Now consider the two minimal-length linear paths in pitch-
class space: the first from ai to x – ai and the second its retrograde, from x – ai to ai.
These paths are reflection symmetrical under Ix: every point ai + ε along the path ai→(x –
ai) is mapped by Ix to the point x – (ai + ε) along the path (x – ai)→ai. Therefore, the
midpoint af = x – af is fixed by the reflection. Consequently, we can move ai by |x –
2ai|12Ζ/2 semitones to obtain a pitch-class that is invariant under Ix.
Case 2. i ≠ c/2. Let j = c – i. Our voice-leading associates ai with x – aj and aj
with x – ai. Both ai and aj are mapped to pitch-classes |x – (ai + aj)|12Ζ semitones away.
Consider the two minimal linear paths ai→x – aj and aj→x – ai. If we reverse the
direction of the second path, we obtain two equal-length paths ai→x – aj and x – ai→aj
that are reflection-symmetrical under Ix: every point ai + ε along the path from ai→x – aj
is mapped to the point x – (ai + ε) along the path x – ai→aj. The points halfway along
these paths, af and x – af, are related by Ix. Therefore, we can move each pitch-class by
|x – (ai + aj)|12Ζ/2 semitones to obtain a pair that is invariant under Ix.
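The midpoint construction used in both cases of this proof can be illustrated with a short sketch. It assumes integer (or rational) pitch classes, takes the index c of the crossing-free voice-leading ai→x – ac-i as given, and uses the hypothetical names signed_shortest and invariant_chord.

```python
from fractions import Fraction


def signed_shortest(a, b):
    """Signed length, in (-6, 6], of the shortest path from pitch class a to b."""
    d = (b - a) % 12
    return d - 12 if d > 6 else d


def invariant_chord(A, x, c):
    """Move each a_i halfway along the shortest path to x - a_(c-i).
    Inputs are assumed to be integers or Fractions, not floats."""
    n = len(A)
    S = []
    for i in range(n):
        target = x - A[(c - i) % n]
        half = Fraction(signed_shortest(A[i], target), 2)
        S.append((A[i] + half) % 12)
    return S
```

For A = (0, 4, 7), x = 7, and c = 2, the voice-leading A→Ix(A) is (0, 4, 7)→(0, 3, 7), with displacement multiset {0, 1, 0}. The sketch returns S = (0, 7/2, 7), which is invariant under I7 and lies at displacements {0, 1/2, 0} from A.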
THEOREM 2. Let A be an n-note chord, let O be a non-trivial permutation,
transposition, or inversion such that there exists an n-note chord that is invariant
under O. Then, if the displacement multiset associated with A→O(A) is smaller
than the n-element multiset {d, 0, 0, …, 0}, there will be a voice-leading A→S,
with S invariant under O, and displacement multiset less than or equal to
{d, d, 2d, 2d, 3d, 3d, …, ⌊n/2⌋d, 0}.
The term ⌊n/2⌋d appears once for even n, twice for odd n.
Proof. By the distribution constraint, the displacement multiset corresponding to the
voice-leading A→O(A) has no terms greater than or equal to d. There are three cases to
consider, depending on whether O is a permutation σ, a transposition T, or an inversion
I.
Case 1. O is a permutation σ. Since any permutation can be decomposed into
cycles, we simply apply Lemma 2.1 to obtain a voice-leading A→S that is no larger than
{d, d, 2d, 2d, 3d, 3d, …, ⌊n/2⌋d, 0}, with S invariant under σ.
Case 2. O is a nonzero transposition Tx. By Theorem 1, there exists a crossing-free voice-leading A→Tx(A) whose displacement multiset consists of values less than d.
Any crossing-free voice-leading can be decomposed into cycles of the form:
(a0, a1, …, an-1)→(x + a1, x + a2, …, x + an-1, x + a0)
Thus we can again apply Lemma 2.1 to obtain the desired voice-leading.
Case 3. O is an inversion Ix. By Theorem 1, there exists a crossing-free voice-leading A→Ix(A) whose displacement multiset consists of values less than d. By Lemma
2.2, there exists a voice-leading A→S, such that S is invariant under Ix, and with
displacement multiset less than or equal to {d/2, d/2, …, d/2}. By the distribution
constraint, this multiset is less than or equal to {d, d, 2d, 2d, 3d, 3d, …, ⌊n/2⌋d, 0}.
6. Evenness and transpositional invariance. We begin with an informal argument
describing the relation between “evenness” and T-invariance. By Theorem 1, there will
be a minimal bijective voice-leading A→Tx(A) of the form (a0, a1, …, an-1)→(ac + x, ac+1 +
x, …, ac+n-1 + x), where c is some integer and the subscript arithmetic is reduced mod n.
The displacement multiset associated with this voice-leading will consist of the distances
|x + (ac+i – ai)|12Ζ. For a chord that divides the octave nearly-evenly, the values ac+i – ai are
nearly-constant for all c. (This is simply because the distance between the values ac+i – ai
measures how evenly the chord divides c octaves into n equal parts.) Thus, for every c,
there will be an x for which ac+i – ai ≅ –x, for all i. The “cyclical” component of the
voice-leading offsets the “parallel” component. For chords that evenly divide the octave,
the quantities ac+i – ai can be made to approximate –x as closely as is possible for n-note
chords.
We now provide a rigorous proof of this last statement.
THEOREM 3. Let A be any multiset of cardinality n. For all x, the minimal
bijective voice-leading between A and Tx(A) can be no smaller than the minimal
bijective voice-leading between E and Tx(E), where E divides pitch-class space
into n equal parts.
In proving Theorem 3 it is again convenient to work in pitch-space, or Rn. (Note
that we do not assume the Euclidean metric on this space.) We will use the symbol ≡nZ to
mean “congruent mod nZ.” The symbol applies to both scalars and ordered n-tuples:
thus –2.5 ≡12Z 9.5, and (0, 4, 7) ≡12Z (-12, 4, 19). Each chord is represented by an infinite
number of points in Rn, all congruent mod 12Z. A voice-leading between two points X,
Y ∈ Rn will simply be the ordered n-tuple X – Y = (x1 – y1, x2 – y2, … xn – yn). The
displacement multiset associated with this voice-leading will be the multiset {|x1 – y1|, |x2
– y2|, … |xn – yn|}. Clearly, for any voice-leading in the orbifold Rn/(Sn × 12Zn), there will
be an infinite number of equivalent, equally-sized voice-leadings in Rn. Conversely, for
any voice-leading in Rn, with displacement multiset containing only elements ≤ 6, there
is a corresponding voice-leading in Rn/(Sn × 12Zn).
Let En be a chord that divides pitch-class space into n equal parts. Since E is
invariant under transposition by 12/n semitones, there will be a voice-leading between
chords congruent to E and Tx(E) of the form
(e1, e2, …, en)→(e1 + c, e2 + c, … en + c), where c is any real number ≡12Z/n x (5)
(NB: c is congruent to x mod 12Z/n, not mod 12Z.) Choose c so that |c| is as small as
possible. The displacement multiset corresponding to this voice-leading is {|c|, |c|, … ,
|c|}. The sum of the elements of this multiset is n|c|, where n|c| is the smallest positive
real number such that nc ≡12Z nx. By the distribution constraint, this multiset is as small
as any n-note multiset with the same or greater sum.
Now consider any bijective voice-leading between representatives of two n-note
transpositionally-equivalent chords A and Tx(A). Let ΣA refer to the sum of the
components of A. Therefore,
Σ(Tx(A) – A) ≡12Z nx (6)
The real number Σ(Tx(A) – A) is the sum of signed quantities; the sum of the absolute
values of these quantities must therefore be greater than or equal to n|c|, where n|c| is the
smallest positive number such that nc ≡12Z nx. Thus the elements of the displacement
multiset associated with the voice-leading A→Tx(A) sum to at least n|c|. We conclude
that this voice-leading can be no smaller than the minimal voice-leading between En and
Tx(E).
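The bound in Theorem 3 can be checked numerically for small cases. The sketch below specializes to the L1 norm, for which "no smaller" means "has no smaller sum"; the theorem itself is stated for any normlike strict weak order. The names pc_dist, min_bijective_size, and even_chord_size are hypothetical, and the rotation search relies on the form guaranteed by Theorem 1.

```python
from fractions import Fraction


def pc_dist(a, b):
    """Distance between pitch classes a and b on the circle R/12Z."""
    d = abs(a - b) % 12
    return min(d, 12 - d)


def min_bijective_size(A, B):
    """Smallest L1 size among the n voice-leadings a_i -> b_(c+i)."""
    n = len(A)
    a = sorted(p % 12 for p in A)
    b = sorted(p % 12 for p in B)
    return min(
        sum(pc_dist(a[i], b[(c + i) % n]) for i in range(n)) for c in range(n)
    )


def even_chord_size(n, x):
    """n|c|, where c is congruent to x mod 12/n and |c| is as small as possible."""
    step = Fraction(12, n)
    r = Fraction(x) % step
    return n * min(r, step - r)
```

For the C major triad A = (0, 4, 7) and x = 4, min_bijective_size(A, T4(A)) is 2, whereas the augmented triad E = (0, 4, 8) is invariant under T4 and even_chord_size(3, 4) is 0.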
There is a useful corollary to Theorem 3 that applies in the discrete case.
COROLLARY. Let Ek (the “chromatic scale”) divide pitch-class space into k >
n equal parts, let A be any n-note subset of Ek, and let M be the “maximally even”
n-note subset of Ek (S8). Then, for any integer i, the minimal bijective voice-
leading between A and T12i/k(A) can be no smaller than the minimal bijective
voice-leading between M and T12i/k(M).
The proof follows the same basic outlines as the proof of Theorem 3. We rely on the fact
that M divides any number of octaves into nearly even parts: given M = (m0, m1, …, mn-1),
and some constant integer c, the distances |mc+i – mi|12Ζ (subscript arithmetic mod n)
come in “consecutive integer sizes” when measured in units of 12/k (S8). That is, for
every integer c there exists an integer j, such that the distances |mc+i – mi|12Ζ are equal to
12j/k and 12(j+1)/k. This allows us to find a voice-leading M→T12i/k(M) that is as small as
possible for n-note subsets of Ek. As before, we use the “cyclical” component of the
voice-leading mi→mc+i to neutralize the “transpositional” component of the voice-leading
mi→mi + x.
Now for the formalities. By the argument given above, the minimal voice-leading
A→T12i/k(A) has a displacement multiset whose sum is at least n|c|, where n|c| is the
smallest positive number such that nc ≡12Z 12in/k. What needs to be shown is that there is
a voice-leading M→T12i/k(M), with a displacement multiset summing to n|c|, whose
values are as evenly distributed as possible. Since our voice-leadings are required to
connect subsets of Ek, we can establish maximally-even distribution by showing that the
values of the displacement multiset take on just two distinct values: 12r/k and 12(r+1)/k,
where r is some nonnegative integer.
Let (m0, m1, … mn-1) order the elements of M in ascending numerical order; form
the infinite sequence S = {m(j mod n) + 12⌊j/n⌋}∞j=-∞. (Again, “⌊x⌋” refers to the greatest
integer ≤ x.) S consists of all of the elements of R congruent mod 12Z to elements of M.
This sequence is ordered in ascending numerical order and indexed such that S-1 = mn-1 –
12, S0 = m0, S1 = m1, and so on. The voice-leadings
(m0, m1, …, mn-1)→(Sa + x, Sa+1 + x, …, Sa+n-1 + x) (7)
are voice-leadings between chords congruent to M and Tx(M).
The following music-theoretical facts are well known:
1. The (real-valued) sum of the components of (Sa – m0, Sa+1 – m1, … Sa+n-1 – mn-1)
is equal to 12a (S9).
2. The elements of this n-tuple will either be constant, or have two distinct
values: 12r/k and 12(r+1)/k, where r is some integer (S8).
From these two facts, it follows that we can find a voice-leading S→Tx(S)
corresponding to the n-tuple
(Sa + x – m0, Sa+1 + x – m1, …, Sa+n-1 + x – mn-1) (8)
with elements summing to nc, where n|c| is the smallest positive number such that nc ≡12Z
nx. When x and Sa+i – mi are both integer multiples of 12/k, the values of this n-tuple are
either constant or can be expressed in the form 12r/k and 12(r+1)/k, where r is some
integer. These values will either be all nonnegative or all nonpositive. The sum of the
elements of this voice-leading’s displacement multiset will therefore be n|c|. The
displacement multiset will contain just two distinct values, 12|r|/k and 12|r+1|/k. This
implies that the displacement multiset is as evenly-distributed as possible, given the
hypothesis that the voice-leading connects subsets of Ek.
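The “consecutive integer sizes” property of maximally even sets can be illustrated with a short sketch. It assumes one standard construction of a maximally even set, mi = ⌊ki/n⌋, measured in chromatic steps of size 12/k; this particular formula and the names maximally_even and interval_sizes are assumptions of the example rather than notation from the text.

```python
def maximally_even(k, n):
    """One maximally even n-note subset of a k-note chromatic scale,
    given in chromatic steps (each step is 12/k semitones)."""
    return [(k * i) // n for i in range(n)]


def interval_sizes(M, k, c):
    """Distinct ascending sizes, in chromatic steps, of the intervals
    from m_i up to m_(c+i), for 0 < c < n."""
    n = len(M)
    return sorted({(M[(i + c) % n] - M[i]) % k for i in range(n)})
```

With k = 12 and n = 7 the construction yields (0, 1, 3, 5, 6, 8, 10), a diatonic collection. Its intervals come in sizes {1, 2} for c = 1, {3, 4} for c = 2, and {5, 6} for c = 3: two consecutive integers in each case.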
NOTES
S1. D. Lewin, Journal of Music Theory 42, 15 (1998).
S2. R. Cohn, Journal of Music Theory 42, 283 (1998).
S3. J. Straus, Music Theory Spectrum 25, 305 (2003).
S4. C. Callender, Music Theory Online 10.3 (2004).
S5. R. Cohn, Journal of Music Theory 41, 1 (1997).
S6. J. Douthett, P. Steinbach, Journal of Music Theory 42, 241 (1998).
S7. R. Gauldin, Harmonic Practice in Tonal Music (Norton, New York, 1997).
S8. J. Clough, J. Douthett, Journal of Music Theory 35, 93 (1991).
S9. J. Clough, G. Myerson, Journal of Music Theory 29, 249 (1985).
SYMBOL OR TERM
    DEFINITION
multiset
    A set in which duplications are permitted. Like sets, multisets are unordered.
{a, b, c}
    A multiset with elements a, b, c.
(a, b, c)
    An ordered list. (a, b, c) and (b, c, a) are not the same.
⌊x⌋
    The greatest integer ≤ x.
R
    The real numbers.
Z
    The integers.
nZ, where n is a real number
    The set {ni | i ∈ Z}. Thus 12Z is the set {…, -24, -12, 0, 12, 24, …}, whose elements form a group under addition.
mZn, where m is real and n is an integer
    The set of ordered n-tuples (x1, x2, … xn) such that each xi ∈ mZ. This set forms a group under vector addition.
A/G, where G is some group of transformations acting on the elements of A
    The quotient space that identifies all points a and ga, where a ∈ A and g ∈ G.
R/12Z
    The circular quotient space in which all real numbers x and x + 12 have been identified. The group 12Z acts by ordinary addition, so that every point x has orbit {…, x – 36, x – 24, x – 12, x, x + 12, x + 24, x + 36, …}.
a ≡nZ b
    Pitch class a is congruent to b mod nZ. Thus there exists an integer c such that a = b + cn.
|a|12Ζ
    The norm of a pitch-class a. The smallest real number |x| such that x ≡12Z a.
(a1, a2, …, an) ≡12Z (b1, b2, …, bn)
    For all i, ai ≡12Z bi.
Tn
    The n-torus, or product of n circles. Since R/12Z is a circle, Tn can also be written (R/12Z)n.
Sn
    The “symmetric group” consisting of all the distinct permutations of n objects.
Table S1. A glossary of mathematical terms and symbols used in the article.
SYMBOL OR TERM
    DEFINITION
pitch
    Pitch is a fundamental attribute of musical notes. Pitches are typically represented by real numbers such that middle C is 60, the octave has length 12, and semitones have size 1.
pitch-class
    An equivalence class of pitches, consisting of all pitches separated by an integral number of octaves. A220 and A440 are both instances of the same pitch-class A. Pitch-classes can be represented by elements of the quotient space R/12Z.
chord
    A multiset of pitch-classes. It is also possible to consider chords of pitches, which are simply multisets of real numbers.
transposition
    Translation in pitch or pitch-class space. In both pitch and pitch-class space, transposition corresponds to addition by a constant value. If a is a pitch or pitch-class then a + x is the transposition of a by x semitones.
Tx(A)
    The transposition of the chord A by x semitones.
inversion
    Reflection in pitch or pitch-class space. In both pitch and pitch-class space, inversion corresponds to subtraction from a constant value. If a is a pitch or pitch-class, then x – a is an inversion of a. The quantity “x” is called the index number of the inversion.
Ix(A)
    The inversion of chord A with index number x.
voice-leading
    A voice-leading between two multisets {a1, a2, …, am} and {b1, b2, …, bn} is a multiset of ordered pairs (ai, bj), such that every element of each chord is in some pair.
trivial voice-leading
    A trivial voice-leading contains only pairs of the form (x, x).
Table S2. A glossary of musical terms and symbols used in the article.
Figure S1. The circle of fifths can be interpreted as depicting minimal voice-leadings between diatonic collections (major scales). Each diatonic collection can be transformed into its neighbors by voice-leading in which one pitch-class moves by semitone. For example, the C major scale, containing pitch-classes 0, 2, 4, 5, 7, 9, and 11 (= e), can be transformed into the G major scale (containing pitch-classes 0, 2, 4, 6, 7, 9, and 11) by moving the pitch class 5 (F) to 6 (Fs). Here as elsewhere, the letters “t” and “e” refer to the numbers 10 and 11, respectively.
Figure S2. The Tonnetz. Nineteenth-century theorists such as Hostinsky, Oettingen, and Riemann explored a geometrical figure that is the “geometrical dual” of the one shown here. The graph displays efficient voice-leadings among the 24 familiar major and minor triads. Triads connected by horizontal lines share both “root” and “fifth,” and can be connected by voice-leading in which one note moves by one semitone. (For example, the C-major triad can be transformed into a C-minor triad by changing E to Ef.) Triads along the NE/SW diagonal also share two notes and can be connected by single-semitone voice-leading. (For example, the C-major triad can be transformed into an E-minor triad by changing C to B.) Triads along a NW/SE diagonal share two notes and can be connected by voice-leading in which one note moves by two semitones. (For example, the C-major triad can be transformed into an A-minor triad by changing G to A.) Topologically, the figure is a 2-torus.
Figure S3. Three types of voice-crossing
Figure S4. Removing a crossing does not create new crossings
Figure S5. Using dynamic programming to find minimal voice-leading
Figure S6. Ordered dyad-space is a 2-torus. To identify points (a, b) and (b, a), we need to “fold” the torus along the AB diagonal. The result of this operation is shown in Figure S7.
00 01 02 03 04 05 06 07 08 09 0t 0e [00]
11 12 13 14 15 16 17 18 19 1t 1e [10]
22 23 24 25 26 27 28 29 2t 2e [20]
33 34 35 36 37 38 39 3t 3e [30]
44 45 46 47 48 49 4t 4e [40]
55 56 57 58 59 5t 5e [50]
66 67 68 69 6t 6e [60]
77 78 79 7t 7e [70]
88 89 8t 8e [80]
99 9t 9e [90]
tt te [t0]
ee [e0]
[00]
Figure S7. The result of “folding” the 2-torus in Figure S6 along its diagonal AB. The resulting figure is a triangle with two of its sides identified, which is a Möbius strip. To transform Figure S7 into a more familiar representation of a Möbius strip, cut the figure along the line CD and glue AC to CB. (To make this identification in Euclidean 3-space, you will need to turn over one of the pieces of paper.) The result is a “square” with opposite sides identified, as in Figure 2 of the main paper.
Figure S8. The cyclical voice-leading (a0, a1, a2, a3, a4, a5, a6)→(a1, a2, a3, a4, a5, a6, a0) has displacement multiset {d0, d1, d2, d3, d4, d5, d6 = W}. By moving at most three notes by |x – di| semitones, we can make any of the di = x without changing the other dn ≠ W. That is, to change d0, we need only move a0; to change d5 we need only move a6; to change d1 we need only move a0 and a1; and so on. In the case of an arbitrary cyclical voice-leading, we never need to move more than half of a chord’s notes by |x – di| semitones to “fix” any interval.