FRED LERDAHL, Columbia University
CAROL L. KRUMHANSL, Cornell University
THIS STUDY PRESENTS AND TESTS a theory of tonal tension (Lerdahl, 2001).
The model has four components: prolongational structure, a pitch-space
model, a surface-tension model, and an attraction model. These components
combine to predict the rise and fall in tension in the course of listening
to a tonal passage or piece. We first apply the theory to predict tension
patterns in Classical diatonic music and then extend the theory to
chromatic tonal music. In the experimental tasks, listeners record their
experience of tension for the excerpts. Comparisons between predictions
and data point to alternative analyses within the constraints of the
theory. We conclude with a discussion of the underlying perceptual and
cognitive principles engaged by the theory's components.
Received March 16, 2006, accepted October 4, 2006.
Key words: tonal tension, attraction, pitch space, prolongational
structure, multiple regression
THE EBB AND FLOW OF tension is basic to the musical experience and has
long been of interest in music theory and criticism (Berry, 1976;
Hindemith, 1937; Kurth, 1920; Rothfarb, 2002; Schenker, 1935;
Zuckerkandl, 1956). It appears to have a direct link to musical affect
(Krumhansl, 1996, 1997), and it shapes not only the listening experience
but also aspects of musical performance (Palmer, 1996).
Building on the prolongational component in Lerdahl and Jackendoff (1983;
hereafter GTTM), Lerdahl (2001; hereafter TPS) developed a formal model of
tonal tension and the related concept of tonal attraction. The model
generates quantitative predictions of tension and attraction for the
sequence of events in any passage of tonal music. Earlier empirical
studies have shown promising connections between the model's predictions
and participants' responses (Bigand, Parncutt, & Lerdahl, 1996; Cuddy &
Smith, 2000; Krumhansl, 1996; Lerdahl & Krumhansl, 2004; Palmer, 1996;
Smith & Cuddy, 2003). Our purpose here is to provide a comparatively
comprehensive empirical treatment and analysis of the model's predictions
over a range of musical styles.
By "tonal tension" we mean not an inclusive definition of musical tension,
which can be induced by many factors, such as rhythm, tempo, dynamics,
gesture, and textural density, but the specific sense created by melodic
and harmonic motion: a tonic is relaxed and motion to a distant pitch or
chord is tense; the reversal of these motions causes relative relaxation.
Because tonal tension is a uniquely musical phenomenon (unlike such
factors as fluctuations in loudness, speed, or contour), it is perhaps the
most crucial respect in which music tenses and relaxes. This study sets
aside other kinds of musical tension and focuses on tonal tension.
The sense of tonal tension and relaxation can also be expressed as
"stability and instability" or even "consonance and dissonance." These
pairs of terms have somewhat different shades of meaning. "Dissonance"
refers first of all to a sensory property that is studied in the
psychoacoustic literature. In a traditional music-theoretic context, it
refers to intervallic combinations that require particular syntactic
treatment, such as the passing tone and the suspension. Intervals that are
musically dissonant usually correspond to intervals that are
psychoacoustically dissonant. "Instability" has cognitive or conceptual
meaning beyond psychoacoustic effects. Theorists such as Riemann (1893),
Schenker (1935), and Schoenberg (1911) extend musical dissonance from a
surface characteristic to abstract levels. One may speak of a
composed-out passing tone that is harmonized at the surface, or of a
subsidiary tonal region that is conceptually dissonant in relation to the
tonic (Rosen, 1972). Schoenberg (1975) asserts that the goal of a tonal
composition, after its initial destabilization, is to reestablish
stability.
The term "tension," as employed here, refers both to sensory dissonance
and to cognitive dissonance or instability; similarly, "relaxation" refers
to sensory consonance and to cognitive consonance or stability. The
expression "tension and relaxation" also has the advantage of invoking
physical motion and exertion beyond a specifically musical function.
Everyone experiences physical tension and relaxation, and it is common to
extend the terms to mental and emotional terrains as well. Consequently,
it is relatively straightforward to ask experimental participants to
respond to degrees of tension and relaxation and thereby elicit consistent
interpersonal responses (see Krumhansl, 1996).
Modeling Tonal Tension. Music Perception, Volume 24, Issue 4, pp. 329-366,
ISSN 0730-7829, electronic ISSN 1533-8312. © 2007 by The Regents of the
University of California. All rights reserved. Please direct all requests
for permission to photocopy or reproduce article content through the
University of California Press's Rights and Permissions website,
http://www.ucpressjournals.com/reprintinfo.asp.
DOI: 10.1525/mp.2007.24.4.329
The TPS model also develops an attraction component. The term
"attraction" refers to the intuition that melodic or voice-leading pitches
tend toward other pitches in greater or lesser degrees. Bharucha (1984)
refers to "melodic anchoring"; Larson (2004; Larson & VanHandel, 2005)
speaks of "musical forces"; Margulis (2005), Meyer (1956), and Narmour
(1990) couch attraction in terms of melodic expectation or implication.
Attraction can also be seen as a kind of tension: the more attracted a
pitch is to another pitch, such as the leading tone to the tonic, the more
the listener experiences the tension of anticipation. This kind of tension
contrasts with the tension of instability. The leading tone is less stable
than the tonic, but its expectancy-tension (to use Margulis's expression)
is much greater than that of the tonic. That is, the leading tone strongly
wants to resolve to the tonic; but the tonic pitch, being the point of
maximal stability, expresses comparatively little urge to move to the
leading tone or to any other pitch.
To summarize, our concern is with three kinds of tonal tension: the
sensory dissonance of certain intervallic combinations, harmonic and
regional stability/instability in relation to a governing tonic, and
melodic attraction as a projection of expectancy-tension.
Overview of the Tension Model
The four components listed in Figure 1 are required for a quantitative
theory of tonal tension. First, there must be a representation of the
hierarchical event structure in a musical passage. Adapting a traditional
music-theoretic term, we call this component prolongational structure.
Second, there must be a model of tonal pitch space and all distances
within it. Tonal pitch space is the cognitive schema whereby listeners
have tacit long-term knowledge, beyond the patterns within any particular
piece, of the distances of pitches, chords, and tonal regions from one
another. Third, there must be a treatment of surface or sensory
dissonance. This measure is largely psychoacoustic: the interval of a
seventh is more dissonant than a sixth, and so on. Fourth, there must be a
model of melodic or voice-leading attractions. Listeners experience the
relative pull of pitches toward other pitches in a tonal context.
Let us review these four components, starting with prolongational
structure. (This exposition summarizes material in TPS.) GTTM addresses
prolongational organization not as an aesthetic ideal, as in Schenkerian
analysis, but as a psychological phenomenon describable by nested patterns
of tension and relaxation. Tension depends on hierarchical position: a
tonic chord in root position is relaxed; another chord or region is
relatively tense in relation to the tonic; a nonharmonic tone is tense in
relation to its harmonic context. This component assigns prolongational
structure by a cognitively motivated rule system that proceeds from
grouping and meter through time-span segmentation and reduction. These
steps are necessary because prolongational connections depend not only on
degrees of pitch similarity and stability but also on the rhythmic
position of events.
To represent an event hierarchy, the prolongational component employs a
tree notation. Here it will suffice to refer to branchings stripped of the
node types employed in GTTM and TPS. Right branches stand for a tensing
motion (or departure), left branches for a relaxing motion (or return).
The degree of tension or relaxation between two events depends on the
degree of continuity between them. If two events that connect are the same
or similar, there is little change in tension. If they are different,
there is more change in tension. Figure 2 shows an abstract tension
pattern: at a local level, Event 1 tenses into Event 2, Event 3 relaxes
into Event 4, and Event 5 relaxes into Event 6; at larger levels, Event 1
tenses into Event 4, Event 6 relaxes into Event 7, and Event 1 relaxes
into Event 7. Notice that this representation says nothing about the
tension relationship between Events 2 and 3 or Events 4 and 5. More
seriously, it does not quantify the amount of tension or relaxation. It
merely says that if two events are connected, one is relatively tense or
relaxed in relation to the other.
1. A representation of hierarchical (prolongational) event structure.
2. A model of tonal pitch space and all distances within it.
3. A treatment of surface (largely psychoacoustic) dissonance.
4. A model of voice-leading (melodic) attractions.
FIGURE 1. Four components necessary for a quantitative theory of tonal
tension.
FIGURE 2. Tension (t) and relaxation (r) represented by a tree structure.
Further progress in the evaluation of tension depends on the second
component listed in Figure 1, a model of tonal pitch space. A well-known
finding in music psychology is that listeners' judgments about the
distances of pitches, chords, and regions (or keys) from a given tonic
form consistent patterns (Bharucha & Krumhansl, 1983; Krumhansl, 1990,
hereafter CFMP; Krumhansl & Kessler, 1982). These results have been
replicated in several ways, using different input materials, participants
with varied training, and different task instructions. When submitted to
multidimensional scaling, the empirical data are represented as
geometrical structures in which spatial distance corresponds to cognitive
distance. The regular geometry found for regions (Krumhansl & Kessler,
1982) corresponds to musical spaces proposed earlier by music theorists
(Schoenberg, 1954; Weber, 1817-21).
It is striking that listeners share a complex mental schema of the mutual
distances of pitches, chords, and regions. But how is this empirical
result to be accounted for? Several researchers have proposed explanatory
frameworks: CFMP through sensitivity to statistical frequency of tone
onsets or durations; Bharucha (1987) through neural-net modeling; Parncutt
(1989) through psychoacoustic modeling. A fourth approach, which is
complementary to the others, has been to develop a music-theoretic formal
model of tonal pitch space that correlates with the empirical data and
that unifies the treatment of pitches, chords, and keys within a single
framework (Lerdahl, 1988; TPS). The model begins with the basic space in
Figure 3, set to I/C. Regions are designated in boldface type, with
upper-case letters for major keys and lower-case letters for minor keys.
The numbers in familiar pitch-class notation signify either pitches or
pitch classes, depending on context. The basic space is hierarchical in
that if a pitch class is stable at one level, it repeats at the
immediately superordinate level. The diatonic scale is built from members
of the chromatic scale and the triad from members of the diatonic scale.
The triad itself has an internal hierarchy, with the fifth more stable
than the third and the root as the most stable element. The shape of this
structure corresponds to the major-key tone profile in CFMP and can be
viewed as an idealized form of it.
Transformations of the basic space measure the distance from any chord in
any region to any other chord within the region or to any chord in any
other region. The space shifts by means of a diatonic chord distance rule
in which the distance from chord x to chord y equals the sum of three
variables, as shown in the abbreviated statement of the rule in Figure 4.
Computational details aside, the factors involved are the degree of
recurrence of common tones and the number of moves along two cycles of
fifths, one for triads over the diatonic collection and the other for the
diatonic collection over the chromatic collection.
Figure 5 illustrates some basic-space configurations with their distance
calculations from I/C (see Figure 3). The underlined numbers signify new
pitch classes in the new configuration (variable k in the rule). The
distance from I/C to V/C in Figure 5a is accomplished by staying in the
same diatonic collection (i), moving the chord once up the diatonic cycle
of fifths (j), and counting the resultant noncommon tones (k). The
distance from I/C to I/G in Figure 5b is two units greater, even though
the two chords are the same. In the latter, there is also a
cycle-of-fifths move at the scale level (i), causing an extra noncommon
tone at that level (k). Motion between major and minor chords arises not
by a transformation but is a by-product of moves along a cycle of fifths.
Thus the distance from I/C to i/a in Figure 5c is reached by staying in
the same diatonic collection (i) and moving the chord three times up the
diatonic cycle of fifths (j), producing four noncommon tones (k). The
distance from I/C to i/c in Figure 5d, in contrast, involves moving the
scale three times down the chromatic cycle of fifths (i). With no change
of chord root (j), the third of the chord becomes minor. Again there are
four noncommon tones (k).
(a) octave (root) level:  0                                 (0)
(b) fifths level:         0                7                (0)
(c) triadic level:        0        4       7                (0)
(d) diatonic level:       0    2   4  5    7    9     11    (0)
(e) chromatic level:      0 1  2 3 4  5  6 7  8 9  10 11    (0)
FIGURE 3. Diatonic basic space, set to I/C (C = 0, C# = 1, . . . B = 11).
FIGURE 4. The rule for calculating the distance between triads in diatonic
space.
When mapped geometrically, the distances (d) from triad to triad within a
key exhibit a regular pattern, with the diatonic cycle of fifths arrayed
on the vertical axis and the diatonic cycle of minor thirds on the
horizontal axis. Figure 6a displays this pattern along with distances from
the tonic triad to the other triads within the key. Regional space (that
is, distances from a given tonic triad to other tonic triads) shows a
similar pattern, with the chromatic cycle of fifths on the vertical axis
and the minor-third cycle on the horizontal axis. Figure 6b gives a
portion of regional space along with the distance values. If these chordal
and regional patterns are extended, both Figures 6a and 6b form toroidal
structures. (Figure 6b corresponds to a multidimensional solution
developed from empirical data in Krumhansl and Kessler, 1982; also see
CFMP.)
Pitch-space distances are input to prolongational structure via the
principle of the shortest path. The idea is that listeners construe their
understanding of melodies and chords in the most efficient way; in other
words, they interpret events in as stable and compact a space as possible.
For example, if one hears only the melodic progression C → E, the most
stable interpretation is as scale degrees 1 → 3 in C, for C and E are then
in an optimally stable location in diatonic basic space. A slightly less
preferred alternative is as scale degrees 3 → 5 in a; still less preferred
would be scale degrees 4 → 6 in G; and so forth. Similarly, a G major
chord heard in a C context is likely to be heard by the shortest path as
V/C rather than, say, by longer paths to I/G or iii/e.
Figure 7 illustrates the use of the principle of the shortest path in a
derivation of the prolongational structure for the final phrase of the
Bach chorale, Christus, der ist mein Leben. (Later on we discuss the
entire chorale; also see the extensive analysis in TPS, chapter 1.) Let us
assume that the phrasal boundaries and metrical grid have already been
assigned. As a first step, automatic segmentation rules carve the music
into nested rhythmic units so that each event is assigned to a time-span
segment. Second, at the quarter-note time-span level at the bottom of the
graph, nonharmonic tones are reduced out, the cadence (marked c) is
designated, and tonic orientation is established by shortest-path
measurement. The opening F major triad is taken to be the tonic because
the distance to itself is 0 (d[F → F] = 0), whereas distances to other
possible tonics would be greater. All the subsequent events take place
within F. Third, events at the quarter-note level come up for comparison
at the half-note level. In each case, the most stable event is selected
for comparison in the next larger span, where stability is defined in
terms of the distance to another available event. Thus the opening I is
compared within span a to vii°6 and I6, and ii6/5 is compared within span
b to the V. In span a, the opening I is selected over vii°6 because,
unlike vii°6, the distance of I to the tonic is 0; I wins over I6 because
its root is in the bass. In span b, the V is chosen because it is part of
the cadence. In span c, the only choice is the final I. Thus the half-note
time-span level yields I-V-I.
FIGURE 5. Illustrations of d.
(a) Chordal space within a region (d from I in square brackets):
    (iii)      V [5]      vii° [8]
    vi [7]     I          iii [7]
    ii [8]     IV [5]     (vi)
(b) Regional space (d from C in square brackets):
    e [9]      G [7]      (g)
    a [7]      C          c [7]
    d [10]     F [7]      (f)
FIGURE 6. Portions of (a) chordal space within a region; and (b) regional
space, along with values calculated by d.
The time-span hierarchy then forms the input to the prolongational tree,
moving from global to local levels. The distances between available global
events are d(opening I → final I) = 0 and d(V → I) = 5. The first option
wins because its path is shorter: the opening I branches directly to the
closing I, and within that context V branches to the final I. At a more
local level, in the first part of the phrase d(I → I6) = 0 and
d(I → vii°6) = 5 (counting, as is customary, vii°6 as an incomplete
dominant), so I6 attaches to I; within the context of I → I6, vii°6
branches to I6. Finally, ii6/5 lies between I6 and V. d(ii6/5 → I6) = 9
and d(ii6/5 → V) = 7, so ii6/5 attaches to the more proximate V. As a
visual aid, the slurs between events in the music duplicate the
relationships described in the tree.
Supplementary to the principle of the shortest path is a second factor in
the derivation of prolongational reductions, the principle of good form,
which encourages optimal patterns of tension and relaxation. This second
principle breaks down into three conditions. First is the recursion
constraint, in which successive right or left branches are preferable to
unconnected right or left branches. Thus there is pressure to assign the
first instance of Figure 8a rather than the second. Second is the balance
constraint, in which the number of right and left branches approaches
equality. Thus the first instance in Figure 8b is preferred over the
second. Third is normative structure, in which there is a preference for
at least one right branch leading off the structural beginning of the
phrase and for at least one left branch (a pre-dominant) leading into the
phrase's cadence. Finally, there is a third overarching principle, that of
parallelism: parallel passages preferably have parallel structures. GTTM
uses this principle in all of its theoretical components.
The principles of the shortest path, prolongational good form, and
parallelism reinforce one another in the Bach phrase, but in other
passages they might conflict. Although the procedures involving the
shortest path are algorithmic, their interaction with prolongational good
form is not fully specified; and the principle of parallelism is
notoriously difficult to quantify. Hence there is a degree of flexibility
in the assignment of prolongational structure.
FIGURE 7. Derivation of the prolongational structure of the
final phrase of the Bach chorale, Christus, der ist mein Leben.
The chord distance rule calculates not only the distance between two
chords x and y but also the tonal tension between them. Tension can be
computed both sequentially and hierarchically. Sequential tension is
measured simply from one event to the next, as if the listener had no
memory or expectation. Hierarchical tension proceeds through the
prolongational analysis from global to local levels in the tree structure.
It is an empirical question how much listeners hear tension sequentially
and how much hierarchically. No doubt they hear from one event to the
next, but if listening were only sequential there would be little
larger-scale coherence to the musical experience.
We turn now to the third component of tonal tension, surface dissonance.
Nonharmonic tones (tones not belonging to a sounding triad) are less
stable, hence tenser, than harmonic tones. Even when all the sounding
tones are harmonic, the triad is more stable if it is in root position
than if it is in inversion; and, to a lesser extent, it is more stable if
its melodic note is on the root of the triad than if it is on the third or
fifth scale degree. These factors are registered, categorically and
approximately, in the surface tension rule in Figure 9. They are only
approximate because tones within a category in fact differ in their degree
of perceived dissonance, depending on intervallic structure, metrical
position, duration, loudness, timbre, and textural location. An
alternative method would be to quantify surface tension according to an
established measure of sensory dissonance in the psychoacoustic literature
(for instance, Hutchinson & Knopoff, 1978). This method would give rise to
a continuous measure of surface tension. However, surface tension is
perceived categorically to a considerable extent. For example, in a
diatonic 7-6 suspension chain, all the sevenths, major or minor, sound
more or less equally dissonant. Here we take the categorical approach.
The chord distance rule and the surface tension rule combine in two
possible ways to yield an overall tension value for a given event. The
simpler way, stated in Figure 10a, is sequential: calculate the
pitch-space distance from one event to the next and add the value for
surface tension. The more complex way in Figure 10b is hierarchical:
calculate the pitch-space distance from the immediately dominating event
and add the value for surface tension; then add hierarchical values as
inherited down the prolongational tree.
As illustration, consider Figure 11, the Grail theme from Wagner's
Parsifal. (This is also known as the Dresden Amen and is familiar as such
in some Protestant services. Here the theme is transposed from A♭, its
characteristic key, to E♭ so that it can be directly compared later on to
its chromatic version in E♭.) The theoretically preferred analysis,
following the recursion constraint and parallelism for the first four
events, says that the music tenses away from the opening I until the
pre-dominant ii (Event 5) in bar 2. After an elaboration of ii, the
progression relaxes, in observance of normative structure, into the
closing I, which repeats the opening I an octave higher. The dashed branch
to Event 5 signifies an alternative branching that continues to follow the
parallelism of the harmonic sequence but that removes the pre-dominant
left branch required by normative structure. We shall return to this
point.
FIGURE 8. Prolongational good form: (a) recursion constraint;
(b) balance constraint; (c) normative structure.
FIGURE 9. The rule for calculating surface tension.
Included in Figure 11 are numerical values from the application of the
rules in Figures 9 and 10. The first row of numbers between the staves
lists surface dissonance values. The second row lists sequential tension
values, obtained by calculating d from one chord to the next and adding
surface dissonance values. For example, the sequential distance from Event
2 to Event 3 is 7, and the surface dissonance value for Event 3 is 1; so
the sequential tension associated with Event 3 is 7 + 1 = 8. The third row
similarly lists hierarchical tension values, obtained globally by adding
the distance numbers next to the branches of the tree and then adding the
surface dissonance values. Thus the hierarchical distance from Event 2 to
Event 4 is 0 + 7 + 7 = 14, and the surface dissonance value for Event 4 is
1; hence the hierarchical tension associated with Event 4 is 14 + 1 = 15.
The same calculations appear in the tabular format in Figure 12. The
events for T_seq in Figure 12a are listed in sequential order. The table
decomposes the surface-dissonance and pitch-space factors into their
component parts. The values in each row are summed to reach the total
sequential tension for each event. In T_hier in Figure 12b, the target
chords (those to the right of the arrows) are still listed in sequential
order, but the source chords (those to the left of the arrows) are now
listed by the immediately dominating events in the tree. For example, the
notation T_hier(4 → 3) indicates that Event 3, because it branches from
Event 4, derives its tension value from Event 4. In accordance with the
hierarchical tension rule, Figure 12b includes the additional columns of
"local total" and "inherited value." The hierarchical tension for each
event, given in the "global total" column, equals the local total plus the
inherited value.
FIGURE 10. Tension rules: (a) sequential tension plus surface dissonance;
(b) hierarchical tension plus surface dissonance.
FIGURE 11. Grail theme (diatonic version) from Wagner's Parsifal, together
with its theoretically preferred prolongational analysis, surface
dissonance values, sequential tension values, and hierarchical tension
values.
The fourth component of the tension model is the factor of attraction.
That pitches tend strongly or weakly toward other pitches has long been
recognized in music theory (see TPS, pp. 166-167 and 188-192). Bharucha
(1984, 1996) provides a psychological account of this phenomenon through
the notion of anchoring, which is the urge for a less stable pitch to
resolve on a subsequent, proximate, and more stable pitch. This
corresponds to the account offered by Krumhansl (1979) for the effect of
temporal order on tone similarity judgments. Bharucha and Larson (1994,
2004) also equate the attractive urge with melodic expectancy (Meyer,
1956; Narmour, 1990). The TPS attraction model extends Bharucha's approach
to include the attraction of any pitch to any other pitch and to harmonic
progression. It also quantifies the relevant variables and places them
within a larger cognitive theory.
Figure 13a repeats the basic space with the fifths level (level b in
Figure 3) omitted, in order to make attractions to the third and fifth
scale degrees equal. Each level of the space is assigned an anchoring
strength in inverse relation to its depth of embedding. Figure 13b gives
the melodic attraction rule. The two factors in the equation, combined by
multiplication, are the ratio of anchoring strengths of two pitches and
the inverse square of the semitone distance between them. The distance
factor is estimated to behave as in Newton's classical gravitational
equation. The inverse-square factor renders minuscule attractions between
pitches that are more than a major second apart. To convey the behavior of
the rule, Figure 13c lists a few attractions to diatonic neighbors in the
context I/C. The pitch B is highly attracted to C because the two pitches
are a semitone apart and C is more stable. D is less attracted to C
because it is two semitones away. F is more attracted to E than E is to F
because of their inverse anchoring strengths.
The attraction rule applies not only to individual lines but also to each
voice in a progression. As stated in the harmonic attraction rule in
Figure 14, these values are summed and then divided by the value for the
chord distance rule to obtain the overall attraction value from one chord
to the next.
FIGURE 12. Tension tables for Figure 11: (a) sequential tension; (b)
hierarchical tension.
Figure 15 applies the harmonic attraction rule to the first and last
progressions in Wagner's Grail theme. Where a pitch repeats, "null" is
designated because the attraction rule does not apply to repeated pitches.
The values of α are summed to the combined realized voice-leading value
(α_rvl), which is divided by the d value to give the final realized
harmonic attraction value (α_rh). Notice the extreme differences between
the α_rh values for Prog(1 → 2) and Prog(8 → 9). In the former, the
progression I → vi is only moderately strong and includes repeated notes;
in the latter, the progression V7 → I is very strong and resolves by half
step in two voices. Indeed, the strongest harmonic attraction is from a
dominant seventh chord to its tonic, because of the powerful attractions
of the leading tone to the tonic and the fourth to the third scale degree
and because of the short distance from the dominant to the tonic chord.
This is why (aside from statistical frequency) the expectancy for a tonic
chord is so high after a dominant-seventh chord.
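
As a rough sketch of the harmonic attraction rule described above (sum the voice-leading attractions, skip repeated pitches, divide by the chord distance), one might write the following Python. The function name is hypothetical, and the input numbers are illustrative stand-ins rather than the values in Figure 15, which is not reproduced in this text.

```python
def harmonic_attraction(voice_attractions, chord_distance):
    """Realized harmonic attraction: summed voice attractions divided by d.

    `voice_attractions` holds one melodic attraction per voice, with None
    for a repeated pitch ("null"), which contributes nothing to the sum.
    """
    realized_voice_leading = sum(a for a in voice_attractions if a is not None)
    return realized_voice_leading / chord_distance


# Illustrative values only: two half-step resolutions, one repeated pitch,
# one whole-step resolution, and an assumed chord distance of 5.
print(harmonic_attraction([2.0, 1.5, None, 0.5], 5))
```

The division means that, for the same voice leading, a shorter chord distance produces a larger attraction value, which is the point made about the dominant seventh resolving to its tonic.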
Attractions in TPS are computed not only from event to event at the
musical surface but also from event to event at immediately underlying
levels of prolongational reduction. The resulting sets of numbers,
however, are not integrated into a single attraction measure across
reductional levels. Depending in part on tempo, underlying levels
presumably contribute to the overall result in increasingly smaller
amounts as the analysis abstracts away from the surface. (Margulis, 2005,
proposes a mechanism for this step.) In this study we dispense with
underlying levels of attraction.
Figure 16 shows the surface attraction values for the Grail theme. The
numbers appear between events because they apply to relations between
events. Where the harmony does not change, as in events 3-4 and 5-7, a
single attraction value obtains.
FIGURE 13. Melodic attractions: (a) The basic space minus the fifth level
and with anchoring strengths indicated by level; (b) the melodic
attraction rule; (c) some computed attractions between scalar adjacencies
in the context I/C.
FIGURE 14. Harmonic attraction rule.
There is a complementary relationship between tension and attraction
numbers. Where the music tenses away from the tonic, attractions are
realized on less stable pitches and chords. Hence where tension numbers
rise, attraction values tend to be small. But where the music relaxes
toward the tonic, attractions are realized on more stable pitches and
chords; tension numbers decline and attraction values rise. A high
attraction value in effect constitutes a second kind of tension: not the
tension of motion away from stability but the tension of expectation that
the attractor pitch or chord will arrive.
A further general point about tension and attraction concerns numerical
quantification. As Klumpenhouwer (2005) points out, the theory's numbers
measure different entities in the different components: in the dissonance
component, chord inversions and nonharmonic tones; in the distance model,
steps on cycles of fifths and noncommon tones; in the attraction
component, pitch stabilities and distances. As numerical values, then,
these might be considered incommensurate (for example, a 2 for inversion
in the dissonance component is not exactly the same as a 2 for the k
distance between chords). One approach to this issue would be to find
coefficients for the different variables to express the relative strength
of their units of measurement. We have found, however, that coefficients
are not needed for the tension rules; that is, the numbers already express
the relative strength of the variables in question. However, the
attraction rules yield incommensurable output numbers compared to those of
the tension rules. Empirical data suggest that coefficients are needed
when tension combines with attraction. For this, we take a practical
rather than theoretical solution through the mathematical technique of
multiple regression, which weights the two sets of numbers to find the
best fit between the tension and attraction curves.
FIGURE 15. Two applications of the harmonic attraction rule.
FIGURE 16. Attraction values for the Grail theme.
Before proceeding, it should be noted that the melodic attraction
rule (Figure 13b) stands on weaker empirical grounds than does the
chord distance rule (Figure 4). Experimental results guided the
development of the distance rule. (However, the output of the
elaborated form of d, the chord/region distance rule [TPS, p.
70], which employs the pivot-region concept, proves to be
empirically less successful, and we shall not invoke it.) The
attraction rule, in contrast, was developed by a blend of
theoretical and intuitive considerations without much supporting
empirical data. Several aspects of the rule can be criticized.
First, it is unclear that a multiplicative rather than additive
relationship should obtain between the stability (s₂/s₁) and
proximity (1/n²) parts of the equation. Second, as Larson (2004) and
Samplaski (2005) observe, there is arbitrariness in the reduction of
five levels of the basic space (Figure 3) to four when calculating
attractions (Figure 13a). Third, the inverse factor for proximity
eliminates the attraction of a pitch to itself because of the
impossibility of a zero denominator. Pitch repetition may indeed be
a case where intuitions of attraction and expectation
diverge. One may expect a pitch to repeat, but it seems more
natural to think of a pitch as being attracted only to other
pitches. Nevertheless, the exclusion of pitch repetition from the
calculations leaves a gap in the theory. Fourth, the specific form
of the inverse factor, 1/n², appears to create too steep a curve;
that is, the drop from great attraction at the half-step distance to
less attraction at the whole-step distance to very little attraction
at the minor-third distance seems too extreme (Margulis, 2005). The
obvious alternative, 1/n, yields too flat a curve. An intermediate
curve is possible, but the theoretical and empirical bases for such
a solution are unclear. Fifth, and perhaps most importantly from a
theoretical perspective, the measurement of proximity only by
semitones may be too simple a metric. Larson (2004) cites evidence
in Povel (1996) that stepwise arpeggiated intervals (that is, between
adjacent members of a triad or between the fifth and the tonic) yield
greater attractions than predicted by 1/n² or 1/n. Krumhansl (1979)
and CFMP (Table 5.1) also find high relatedness ratings for triad
members. This evidence fits the discussion in Chapter 2 of TPS about
pitch proximity, step motion, and linear completion. It appears that
that discussion, in which stepwise motion is seen as pertaining
to the alphabet in question at a given level of the
basic space, should have informed the formulation of pitch
attractions in Chapter 4.
Despite these reservations, the principles behind the attraction
rule, stability and proximity, remain the central factors in a
treatment of melodic attraction. We have tried the alternatives of
five instead of four stability levels and of proximity by 1/n
instead of 1/n², but the resulting values do not lead to improvements
over those of the original rule with respect to the empirical
data. Nor are there enough instances of voice-leading arpeggiation
in our examples to force a stratified treatment of
melodic proximity. Our project is to test the success or
failure of the TPS theory of tension and attraction, and we leave
theoretical refinements of the attraction rule for future research.
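The shape of the rule under discussion can be illustrated with a small sketch. The Python below assumes the rule is the product of a stability ratio s₂/s₁ and an inverse proximity term 1/n², with n counted in semitones. The four-level anchoring strengths (4 for the tonic, 3 for the other triad members, 2 for the remaining diatonic pitches, 1 for chromatic pitches) are an assumption standing in for Figure 13a, and all function and variable names are hypothetical. The `exponent` parameter allows the 1/n alternative to be compared with 1/n².

```python
# Sketch of a melodic attraction rule of the form (s2 / s1) * (1 / n**exponent).
# ANCHORING is an ASSUMED four-level stability profile for the context I/C,
# standing in for Figure 13a; it is not quoted from the text above.
ANCHORING = {0: 4, 4: 3, 7: 3, 2: 2, 5: 2, 9: 2, 11: 2}  # pitch class -> strength
CHROMATIC_STRENGTH = 1  # assumed strength for non-diatonic pitch classes


def stability(pitch):
    """Return the assumed anchoring strength of a pitch (MIDI-style number)."""
    return ANCHORING.get(pitch % 12, CHROMATIC_STRENGTH)


def melodic_attraction(p1, p2, exponent=2):
    """Attraction of p1 toward p2; undefined for pitch repetition (n = 0)."""
    n = abs(p2 - p1)  # distance in semitones
    if n == 0:
        raise ValueError("attraction of a pitch to itself is undefined (n = 0)")
    return (stability(p2) / stability(p1)) * (1 / n ** exponent)


if __name__ == "__main__":
    # Half step B-C, whole step D-C, minor third E-G, under both proximity forms.
    for p1, p2 in [(71, 72), (62, 60), (64, 67)]:
        print(p1, p2,
              melodic_attraction(p1, p2),
              melodic_attraction(p1, p2, exponent=1))
```

Under these assumptions the half-step, whole-step, and minor-third cases fall off much more quickly with exponent 2 than with exponent 1, which is the steep-versus-flat contrast raised in the fourth criticism. The explicit error for n = 0 mirrors the gap noted for pitch repetition.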
Experimental Approach
The participants in the experiments under discussion were
musically trained students at Cornell University with relatively
little training in music theory compared to the extent of their
instruction on musical instruments or voice. (More details of
music backgrounds and other details of the experimental method can be
found in Appendix A.) They were tested for tension responses for
Wagner's Grail theme from Parsifal in its diatonic and chromatic
versions, a Bach chorale, a
chromatic Chopin prelude, and a passage from Messiaen's Quartet
for the End of Time. The data were compared to the model's
predictions. (A Mozart sonata movement that received a similar
treatment is not discussed in this paper; see Krumhansl, 1996, and
Lerdahl, 1996.)
The tests for the Wagner and Bach excerpts were conducted in two
ways, the stop-tension task and the continuous-tension task. In the
stop-tension task, the first event was sounded, at which point the
participants rated its degree of tension; then the first and second
events were sounded and the participants rated the tension
of the second event; then the first, second, and third events were
sounded, and so on, until the tension associated with each
successive event was recorded. In the continuous-tension task, which
was done for all excerpts, the participants interacted with a
graphic interface that enabled them to move a slider right and left
on the computer screen using a mouse, in correspondence with their
ongoing experience of increasing and decreasing tension. The
advantage of the stop-tension task is that it records the response
precisely for the event that is evaluated. Its disadvantage is that
it is rather artificial and prohibitively time-consuming for long
excerpts. The advantage of the continuous-tension task is that
it encourages a spontaneous response to intuitions of tension in
real time. Its disadvantage is that there is a lag time, for which
an approximated correction must be made, between the sounded
events and the physical response of moving the mouse. Perhaps
surprisingly, the two tasks yielded almost the
same results for the short passages where both tasks were used. For
the longer Chopin and Messiaen selections, however, it was practical
to employ only the continuous-tension task. The participants in the
study by Krumhansl (1996) using this method varied in the extent of
their musical training, but training had little effect on the
tension judgments.
As mentioned, the analyses combine tension and attraction values
to achieve an overall measure of tension. We follow three
conventions in this respect. First, even though an attraction number
does not adhere to a single pitch but represents a relation between
two successive pitches x and y, we assign the number to x, in
effect claiming that it is at x that the experience of attraction
most saliently takes place. In this way, each event has
two numbers associated with it, one for d and the other for a.
Second, the harmonic attraction rule (Figure 14) has d in the
denominator and hence requires that d ≠ 0. This creates a problem when
the voice leading moves but the harmony does not progress. In such
cases, we repeat the value for d from the point at which the harmony
last changed (as in Figure 16, Events 3-4 and 5-7). Third, a
virtual attraction can be computed from Event x to any possible
Event y, and, in particular, to the y with the highest attraction
value. Instead we calculate only the realized attraction, that is,
from x to the y that actually follows. It might be argued,
especially for the stop-tension task, that the strongest virtual
attraction, when it is not the same as the realized attraction,
should be calculated, on the view that the strongest attraction
corresponds to the strongest expectation. Expectations, however,
depend not only on strongest attractions but also on
schematic patterns that lie beyond current formalization. To
calculate to an event that does not occur would be somewhat
speculative in this context. It suffices as a first approximation to
rely on the definiteness of realized attractions.
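The second and third conventions are simple bookkeeping steps, and a minimal sketch may make them concrete. The Python below is illustrative only: the function names and the sample distance values are hypothetical, and a distance of 0 is assumed to mark an event at which the harmony does not change.

```python
def carry_forward_distance(distances):
    """Replace each zero distance with the most recent nonzero distance.

    `distances` holds one chord-distance value d per event. A zero is
    ASSUMED here to mean that the harmony did not change at that event.
    """
    filled = []
    last_nonzero = None
    for d in distances:
        if d != 0:
            last_nonzero = d
        elif last_nonzero is None:
            raise ValueError("no earlier harmonic change to take d from")
        filled.append(last_nonzero)
    return filled


def realized_pairs(events):
    """Pair each event x with the event y that actually follows it."""
    return list(zip(events, events[1:]))


if __name__ == "__main__":
    # Hypothetical d values; zeros are filled from the last harmonic change.
    print(carry_forward_distance([7, 5, 0, 8, 0, 0]))
    print(realized_pairs(["e1", "e2", "e3"]))
```

Filling the zeros before any division keeps the denominator of a harmonic attraction calculation nonzero, and restricting attention to adjacent pairs corresponds to using realized rather than virtual attractions.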
A larger methodological point concerns the interaction between
prediction and data. It is sometimes thought that an experiment
simply tests a preexisting theory. Yet experimental data can give
rise to a theory; this in fact was the case for the construction of
the pitch-space model. In a healthy science, it often happens that a
fruitful exchange develops between theory and experiment. Such is
the case here. If the data suggest that the predictions are faulty,
principled ways are sought within the model to reach predictions
that achieve a better empirical result. These reevaluations are
principled in the sense that they are constrained by the general
assumptions and specific formalisms of the theory. This process can
go back and forth a number of times. One must of course be careful
not to adjust the theory simply in order to fit the data. Rather,
the data can illuminate how listeners construe tension, suggesting
interpretations within the model that are both theoretically
acceptable and more predictive. In this way the theory
can be improved. Furthermore, in our view it is not enough to
achieve a statistically significant overall correlation. What is
wanted, in addition, is an explanatory account of why the model
succeeds or fails at any given point in the analysis.
In this back-and-forth process there are two kinds of flexibility
within the theory. First, sequential or hierarchical tension can
be computed, each with or without attractions. Second, unlike the
tension and attraction rules (all those that incorporate d and a),
which are algorithmic, the derivation of prolongational structure
involves gradient preference rules, which interact with one
another in search of an optimal solution (see the discussions in
GTTM and TPS; also Temperley, 2001).
Preferential conditions arise in three ways. First is the
interaction between the principles of the shortest path,
prolongational good form, and parallelism. Second, when there
is a shift from a right- to a left-branching
pattern, the event where the shift takes place can attach either
way, depending on the shortest path and good form. Third, it is not
always clear where to locate an event in pitch space; that is, there
can be ambiguity about the identity of a chord or the exact moment
of a modulation. As a result of these factors, a passage of music
yields not a single prolongational analysis but a limited range of
preferred analyses. The data can point in any given case toward
which theoretically viable prolongational analysis conforms best to
listeners' responses.
Analyses in a Diatonic Framework
Wagner Theme, Diatonic Version
We begin with the diatonic Grail theme from Wagner's Parsifal,
shown in Figure 11. Figure 17 records the nine events of the excerpt
on the x-axis and tension responses from the stop-tension task on
the y-axis. The dashed line represents the sequential tension values
from Figure 12a, without the inclusion of attraction values,
and the solid line shows the data from the listeners' averaged
responses. The fit is quite poor: R²(1,7) = .08, p = .46,
R²adj = -.049.
FIGURE 17. Sequential tension analysis of the Grail theme from
Parsifal.
Some words of explanation may be helpful. For each correlation,
we present the following information about the statistical test. The
first number, R², is the proportion of variation in the data that is
accounted for by the model. It is associated with two numbers, the
degrees of freedom. The first degree of freedom indicates the number
of predictor variables in the model. In this case there is one
variable, sequential tension. The second degree of freedom is the
number of data points (in this
case 9, one for each event in the music) minus two.
The subtraction of two results from the regression using up two
degrees of freedom for the parameters it determines. The
regression finds the best-fitting linear model predicting the data
from the variable(s). The model determines the optimal values for
the slope of the line (one parameter) and the intercept of the line
(the second parameter). Hence the number two is subtracted
from the number of data points going into the regression to give the
second degree of freedom. If the model can find a perfect fit
between the data and the variable(s), R² would equal 1.0. In
general, the value is less than this, and its significance is
measured by the probability, p, which is the next number given in
the statistical report. By convention, a statistic, such as R², is
considered significant when the p value is less than .05. The
probability depends on both the size of R² and the degrees of
freedom. The last number given is the adjusted R², R²adj. The
R²adj is the R² value adjusted to make it more comparable with
other models for the same data that have different numbers of
degrees of freedom.
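The bookkeeping behind these statistical reports can be sketched in a few lines. The Python below assumes the conventional adjustment, R²adj = 1 - (1 - R²)(n - 1)/(n - k - 1), for n data points and k predictors plus an intercept. The text does not state which adjustment formula was used, so this is an assumption, and the function name is hypothetical.

```python
def regression_summary(r2, n_points, n_predictors):
    """Return (df1, df2, adjusted R^2) for a linear regression with an intercept.

    Uses the conventional adjustment 1 - (1 - R^2) * (n - 1) / (n - k - 1);
    that this is the adjustment behind the reported values is an ASSUMPTION.
    """
    df1 = n_predictors
    # One degree of freedom is used per slope, plus one for the intercept.
    df2 = n_points - n_predictors - 1
    r2_adj = 1 - (1 - r2) * (n_points - 1) / df2
    return df1, df2, r2_adj


if __name__ == "__main__":
    # Nine events with one predictor, then with two predictors.
    print(regression_summary(0.43, 9, 1))
    print(regression_summary(0.35, 9, 2))
```

With nine data points, one predictor gives degrees of freedom (1, 7) and two predictors give (2, 6), matching the form of the reports in this section. Because the adjustment penalizes additional predictors, a two-predictor model must raise R² appreciably to improve on a one-predictor model for the same data.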
Methods such as time-series analysis or functional data analysis
are not appropriate here. Our objective is to determine whether the
judgments fit the quantitative predictions of the model for each
musical event. For this we need a single number for judged tension
for each event.
The conclusion from this statistical test for Figure 17 is that
listeners do not hear this passage in a simple sequential manner.
The R² value is only .08, which means that the sequential tension
variable accounts for only 8% of the variability in the tension
judgments, and the p value of .46 tells us that this is an
unimpressive result. Graphically, this is apparent in Figure 17,
where the two lines do not follow each other closely.
The second analysis is another single-variable model, using the
attraction values displayed in Figure 18. These are the attraction
values from Figure 16, without the inclusion of tension values,
against the listeners' responses. The fit is improved but still not
good: R²(1,7) = .35, p = .09, R²adj = .26.
Figure 19 combines Figures 17 and 18 by adding the attraction
values to the sequential tension values. Multiple regression
weights the two sets of numbers to achieve a best-fit solution, and
assigns a probability to each of the predictor variables. These will
be denoted p(attraction) for the probability of attraction and
p(tension) for the probability of the total tension predictor.
Each of these is shown with a standardized beta value, b. The b
weights are the coefficients in the linear model predicting the data
from the predictor variables
(after they have been standardized to have the same mean and
standard deviation). The picture is better than in Figure 17 but no
better than in Figure 18: R²(2,6) = .35, p = .28, R²adj = .13;
p(attraction) = .17, b = .58; p(tension) = .96, b = .02. The higher
p value for R² is the penalty for using two predictor variables
rather than one, thus increasing the first degree of freedom. Or, to
put it another way, when attraction is included in the model,
adding the sequential tension values does not improve the fit (the p
value for sequential tension in the multiple regression is .96,
which means that adding it has virtually no effect). This analysis
confirms the conclusion that the strict sequential treatment of
tension does not contribute to the fit of the data.
FIGURE 18. Attraction analysis of the Grail theme.
FIGURE 19. Combined sequential + attraction analysis of the
Grail theme.
Let us abandon the sequential-tension approach and consider
instead the theoretically preferred prolongational
analysis in Figure 11 together with its derived hierarchical
tension analysis in Figure 12b. At first we ignore attraction
values. The resultant graph in Figure 20 achieves a better
correlation than the previous analyses: R²(1,7) = .43, p = .056,
R²adj = .35. However, the predicted values are too high for Events
1-4 and too low for Events 5-8.
Figure 21 adds the attraction values in Figure 16 to the tension
values in Figure 12b. Now the correlation is quite good and
statistically significant: R²(2,6) = .75, p = .016, R²adj = .66;
p(attraction) = .03, b = .56; p(tension) = .02, b = .63. The most
notable change in Figure 21 compared to Figure 20 is the raising of
the predicted curve at Event 8 (V7). In Figure 20 the tension
model
correctly assigns relaxation into the cadence, but participants
experience greater tension at the V7 chord than shown there. This
happens because the V7 is highly attracted to the following tonic
resolution, an effect realized in Figure 21 by the inclusion of
attraction values. Discrepancies remain, however. The predictions for
Events 3-4 are still too high and those for Events 5-7 are too
low.
These shortcomings can be overcome through a revision of the
prolongational analysis. In the original analysis in Figure 11,
there is equilibrium between right and left branching (following the
balance constraint), with Event 5 (the ii chord) interpreted as a
pre-dominant to the cadence (following normative structure). The
analysis in effect claims that, beginning at Event 5, the
listener already expects the resolution on Event 9. But it is
harder to anticipate prospectively than it is to remember
retrospectively. Besides, Event 5 continues from the
previous events the harmonic sequence of descending thirds
with a rising melodic second. It is easier to hear
instead the analysis in Figure 22, in which the principle
of parallelism wins over those of branching balance and
pre-dominant function. The only difference is that Event
5 is now a right instead of left branch; Events 6 and 7 attach to
Event 5 as before. This single change leads to alterations in
tension values for Events 5-7, as listed between the staves. In this
interpretation, the tension of the harmonic sequence continues
through the elaboration of ii in Events 6-7 and is released only
at the cadence in Events 8-9. Attractions remain as before. The
result is the almost perfectly matching curves in Figure 23:
R²(2,6) = .97, p < .0001, R²adj = .97; p(attraction) < .0001,
b = .58; p(tension) < .0001, b = .79.
Three broad conclusions can be drawn from this analysis of the
Grail theme. First, attractions must be incorporated into the
predictions. Second, listeners hear tension hierarchically more than
sequentially. Third, unless schematic intuitions are strong,
listeners tend to construe events in a right-branching manner, that
is, in terms of previous rather than following events.
FIGURE 20. Tension graph for the theoretically preferred
hierarchical analysis of the Grail theme.
FIGURE 21. Combined hierarchical (theoretically preferred) +
attraction analysis of the Grail theme.
Are the stop-tension data related to the continuous-tension data?
Krumhansl (1996) found that the discrete predictions of the TPS
model could provide a good fit to the continuous tension judgments
by assuming an integration time of 2.5 seconds. In the present case,
this approach is adapted to ask whether the continuous-tension
data could be predicted by the stop-tension data, assuming the same
integration time. The calculation assumes that the values of past
events are degraded as an inverse exponential function with a
half-life of 0.5 seconds. The continuous data are plotted as a
solid
line in Figure 24 together with the values calculated from the
stop-tension data. A high degree of agreement is reached: R²(1,104)
= .95, p < .0001, R²adj = .95. This is of interest because the
participants performed the stop-tension task before the
continuous-tension task. This means that when they performed the
stop-tension task, they had not heard the music beyond the chord that
they were judging. The extent to which the two tasks converge
suggests that listeners were responding to the sounded events rather
than to events they anticipated because of memory from previous
listening. Although the analyses will not be presented here, the
stop-tension and continuous-tension data are similarly related
for
the other two excerpts for which they are available (the
chromatic version of the Wagner Grail theme and the Bach
chorale).
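One way to picture the integration step is as a decaying weighted average of recent event values. The sketch below is a guess at the general form only: the text gives the 2.5-second integration time and the 0.5-second half-life but not the exact computation, so the normalization as a weighted mean, the treatment of the window edge, and all names are assumptions.

```python
def integrated_tension(onsets, values, t, window=2.5, half_life=0.5):
    """Weighted mean of event values heard in the `window` seconds before t.

    Each event's weight halves for every `half_life` seconds of age.
    Normalizing by the total weight is an ASSUMPTION of this sketch.
    Returns None when no event falls inside the window.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for onset, value in zip(onsets, values):
        age = t - onset
        if 0 <= age <= window:
            weight = 0.5 ** (age / half_life)
            weighted_sum += weight * value
            weight_total += weight
    if weight_total == 0:
        return None
    return weighted_sum / weight_total


if __name__ == "__main__":
    # HYPOTHETICAL onsets (seconds) and stop-tension ratings.
    onsets = [0.0, 1.0]
    ratings = [2.0, 6.0]
    print(integrated_tension(onsets, ratings, t=1.0))
```

Evaluating such a function at regular time steps would turn a short list of per-event ratings into a much longer series of predicted values, which could then be compared point by point with the slider data.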
FIGURE 22. Prolongational analysis of the Grail theme, with
Event 5 reinterpreted as right branching.
FIGURE 23. Combined hierarchical (right-branching) + attraction
analysis of the Grail theme.
FIGURE 24. Comparison of the continuous-tension data (solid
line) with predictions from the stop-tension data for the Grail
theme, after the latter are integrated over 2.5 seconds with an
exponential decay with half-life 0.5 sec.
Bach Chorale
On the basis of the discussion of the Wagner excerpt, the
remaining analyses follow the hierarchical rather than sequential
tension model and incorporate attractions as part of the overall
prediction of tension. We first consider the Bach chorale "Christus,
der ist mein Leben." Its prolongational analysis is divided, for
reasons of space, between Figure 25 and Figure 26. The top
branches, all of which represent the tonic I/F (hence d = 0 in
the tree), should be understood as connecting together. Event 2 in
Figure 25 attaches to Event 41 in Figure 26, and the designation for
Event 19 in Figure 26 refers to Event 19 in Figure 25. The predicted
values of surface dissonance, hierarchical tension, and attraction
appear between the staves. (Incidentally, Event 34
branches differently than does the equivalent first event in
Figure 7. Here it connects not to the final cadence [Event 41] but
back to Event 19, showing the return to
I/F. This happens because a prolongational analysis always makes
the most global connection possible. In Figure 7 the context was a
single phrase; here it is the entire chorale.)
Figure 27 shows the fit of the empirical data with the
predictions in Figures 25-26: R²(2,38) = .79, p < .0001, R²adj
= .78; p(attraction) < .0001, b = .47; p(tension)