Lerdahl-Krumhansl_2006

FRED LERDAHL, Columbia University

CAROL L. KRUMHANSL, Cornell University

THIS STUDY PRESENTS AND TESTS a theory of tonal tension (Lerdahl, 2001). The model has four components: prolongational structure, a pitch-space model, a surface-tension model, and an attraction model. These components combine to predict the rise and fall in tension in the course of listening to a tonal passage or piece. We first apply the theory to predict tension patterns in Classical diatonic music and then extend the theory to chromatic tonal music. In the experimental tasks, listeners record their experience of tension for the excerpts. Comparisons between predictions and data point to alternative analyses within the constraints of the theory. We conclude with a discussion of the underlying perceptual and cognitive principles engaged by the theory's components.

Received March 16, 2006, accepted October 4, 2006.

Key words: tonal tension, attraction, pitch space, prolongational structure, multiple regression

THE EBB AND FLOW OF tension is basic to the musical experience and has long been of interest in music theory and criticism (Berry, 1976; Hindemith, 1937; Kurth, 1920; Rothfarb, 2002; Schenker, 1935; Zuckerkandl, 1956). It appears to have a direct link to musical affect (Krumhansl, 1996, 1997), and it shapes not only the listening experience but also aspects of musical performance (Palmer, 1996).

Building on the prolongational component in Lerdahl and Jackendoff (1983; hereafter GTTM), Lerdahl (2001; hereafter TPS) developed a formal model of tonal tension and the related concept of tonal attraction. The model generates quantitative predictions of tension and attraction for the sequence of events in any passage of tonal music. Earlier empirical studies have shown promising connections between the model's predictions and participants' responses (Bigand, Parncutt, & Lerdahl, 1996; Cuddy & Smith, 2000; Krumhansl, 1996; Lerdahl & Krumhansl, 2004; Palmer, 1996; Smith & Cuddy, 2003). Our purpose here is to provide a comparatively comprehensive empirical treatment and analysis of the model's predictions over a range of musical styles.

By "tonal tension" we mean not an inclusive definition of musical tension, which can be induced by many factors, such as rhythm, tempo, dynamics, gesture, and textural density, but the specific sense created by melodic and harmonic motion: a tonic is relaxed and motion to a distant pitch or chord is tense; the reversal of these motions causes relative relaxation. Because tonal tension is a uniquely musical phenomenon (unlike such factors as fluctuations in loudness, speed, or contour), it is perhaps the most crucial respect in which music tenses and relaxes. This study sets aside other kinds of musical tension and focuses on tonal tension.

The sense of tonal tension and relaxation can also be expressed as "stability and instability" or even "consonance and dissonance." These pairs of terms have somewhat different shades of meaning. "Dissonance" refers first of all to a sensory property that is studied in the psychoacoustic literature. In a traditional music-theoretic context, it refers to intervallic combinations that require particular syntactic treatment, such as the passing tone and the suspension. Intervals that are musically dissonant usually correspond to intervals that are psychoacoustically dissonant. "Instability" has cognitive or conceptual meaning beyond psychoacoustic effects. Theorists such as Riemann (1893), Schenker (1935), and Schoenberg (1911) extend musical dissonance from a surface characteristic to abstract levels. One may speak of a composed-out passing tone that is harmonized at the surface, or of a subsidiary tonal region that is conceptually dissonant in relation to the tonic (Rosen, 1972). Schoenberg (1975) asserts that the goal of a tonal composition, after its initial destabilization, is to reestablish stability.

The term "tension," as employed here, refers both to sensory dissonance and to cognitive dissonance or instability; similarly, "relaxation" refers to sensory consonance and to cognitive consonance or stability. The expression "tension and relaxation" also has the advantage of invoking physical motion and exertion beyond a specifically musical function. Everyone experiences physical tension and relaxation, and it is common to extend the terms to mental and emotional terrains as well. Consequently, it is relatively straightforward to ask experimental participants to respond to degrees of tension and relaxation and thereby elicit consistent interpersonal responses (see Krumhansl, 1996).

Modeling Tonal Tension. Music Perception, Volume 24, Issue 4, pp. 329-366, ISSN 0730-7829, electronic ISSN 1533-8312. © 2007 by The Regents of the University of California. All rights reserved. Please direct all requests for permission to photocopy or reproduce article content through the University of California Press's Rights and Permissions website, HTTP://WWW.UCPRESSJOURNALS.COM/REPRINTINFO.ASP. DOI:10.1525/MP.2007.24.4.329

The TPS model also develops an attraction component. The term "attraction" refers to the intuition that melodic or voice-leading pitches tend toward other pitches in greater or lesser degrees. Bharucha (1984) refers to melodic "anchoring"; Larson (2004; Larson & VanHandel, 2005) speaks of "musical forces"; Margulis (2005), Meyer (1956), and Narmour (1990) couch attraction in terms of melodic expectation or implication. Attraction can also be seen as a kind of tension: the more attracted a pitch is to another pitch, such as the leading tone to the tonic, the more the listener experiences the tension of anticipation. This kind of tension contrasts with the tension of instability. The leading tone is less stable than the tonic, but its "expectancy-tension" (to use Margulis's expression) is much greater than that of the tonic. That is, the leading tone strongly "wants" to resolve to the tonic; but the tonic pitch, being the point of maximal stability, expresses comparatively little urge to move to the leading tone or to any other pitch.

To summarize, our concern is with three kinds of tonal tension: the sensory dissonance of certain intervallic combinations, harmonic and regional stability/instability in relation to a governing tonic, and melodic attraction as a projection of expectancy-tension.

Overview of the Tension Model

The four components listed in Figure 1 are required for a quantitative theory of tonal tension. First, there must be a representation of the hierarchical event structure in a musical passage. Adapting a traditional music-theoretic term, we call this component "prolongational structure." Second, there must be a model of tonal pitch space and all distances within it. Tonal pitch space is the cognitive schema whereby listeners have tacit long-term knowledge, beyond the patterns within any particular piece, of the distances of pitches, chords, and tonal regions from one another. Third, there must be a treatment of surface or sensory dissonance. This measure is largely psychoacoustic: the interval of a seventh is more dissonant than a sixth, and so on. Fourth, there must be a model of melodic or voice-leading attractions. Listeners experience the relative pull of pitches toward other pitches in a tonal context.

Let us review these four components, starting with prolongational structure. (This exposition summarizes material in TPS.) GTTM addresses prolongational organization not as an aesthetic ideal, as in Schenkerian analysis, but as a psychological phenomenon describable by nested patterns of tension and relaxation. Tension depends on hierarchical position: a tonic chord in root position is relaxed; another chord or region is relatively tense in relation to the tonic; a nonharmonic tone is tense in relation to its harmonic context. This component assigns prolongational structure by a cognitively motivated rule system that proceeds from grouping and meter through time-span segmentation and reduction. These steps are necessary because prolongational connections depend not only on degrees of pitch similarity and stability but also on the rhythmic position of events.

To represent an event hierarchy, the prolongational component employs a tree notation. Here it will suffice to refer to branchings stripped of the node types employed in GTTM and TPS. Right branches stand for a tensing motion (or departure), left branches for a relaxing motion (or return). The degree of tension or relaxation between two events depends on the degree of continuity between them. If two events that connect are the same or similar, there is little change in tension. If they are different, there is more change in tension. Figure 2 shows an abstract tension pattern: at a local level, Event 1 tenses into Event 2, Event 3 relaxes into Event 4, and Event 5 relaxes into Event 6; at larger levels, Event 1 tenses into Event 4, Event 6 relaxes into Event 7, and Event 1 relaxes into Event 7. Notice that this representation says nothing about the tension relationship between Events 2 and 3 or Events 4 and 5. More seriously, it does not quantify the amount of tension or relaxation. It merely says that if two events are connected, one is relatively tense or relaxed in relation to the other.

1. A representation of hierarchical (prolongational) event structure.

2. A model of tonal pitch space and all distances within it.

3. A treatment of surface (largely psychoacoustic) dissonance.

4. A model of voice-leading (melodic) attractions.

FIGURE 1. Four components necessary for a quantitative theory of tonal tension.

FIGURE 2. Tension (t) and relaxation (r) represented by a tree structure.

Further progress in the evaluation of tension depends on the second component listed in Figure 1, a model of tonal pitch space. A well-known finding in music psychology is that listeners' judgments about the distances of pitches, chords, and regions (or keys) from a given tonic form consistent patterns (Bharucha & Krumhansl, 1983; Krumhansl, 1990, hereafter CFMP; Krumhansl & Kessler, 1982). These results have been replicated in several ways, using different input materials, participants with varied training, and different task instructions. When submitted to multidimensional scaling, the empirical data are represented as geometrical structures in which spatial distance corresponds to cognitive distance. The regular geometry found for regions (Krumhansl & Kessler, 1982) corresponds to musical spaces proposed earlier by music theorists (Schoenberg, 1954; Weber, 1817-21).

It is striking that listeners share a complex mental schema of the mutual distances of pitches, chords, and regions. But how is this empirical result to be accounted for? Several researchers have proposed explanatory frameworks: CFMP through sensitivity to statistical frequency of tone onsets or durations; Bharucha (1987) through neural-net modeling; Parncutt (1989) through psychoacoustic modeling. A fourth approach, which is complementary to the others, has been to develop a music-theoretic formal model of tonal pitch space that correlates with the empirical data and that unifies the treatment of pitches, chords, and keys within a single framework (Lerdahl, 1988; TPS). The model begins with the basic space in Figure 3, set to I/C. Regions are designated in boldface type, with upper-case letters for major keys and lower-case letters for minor keys. The numbers in familiar pitch-class notation signify either pitches or pitch classes, depending on context. The basic space is hierarchical in that if a pitch class is stable at one level, it repeats at the immediately superordinate level. The diatonic scale is built from members of the chromatic scale and the triad from members of the diatonic scale. The triad itself has an internal hierarchy, with the fifth more stable than the third and the root as the most stable element. The shape of this structure corresponds to the major-key tone profile in CFMP and can be viewed as an idealized form of it.

Transformations of the basic space measure the distance from any chord in any region to any other chord within the region or to any chord in any other region. The space shifts by means of a diatonic chord distance rule in which the distance from chord x to chord y equals the sum of three variables, as shown in the abbreviated statement of the rule in Figure 4. Computational details aside, the factors involved are the degree of recurrence of common tones and the number of moves along two cycles of fifths, one for triads over the diatonic collection and the other for the diatonic collection over the chromatic collection.

Figure 5 illustrates some basic-space configurations with their distance calculations from I/C (see Figure 3). The underlined numbers signify new pitch classes in the new configuration (variable k in the rule). The distance from I/C to V/C in Figure 5a is accomplished by staying in the same diatonic collection (i), moving the chord once up the diatonic cycle of fifths (j), and counting the resultant noncommon tones (k). The distance from I/C to I/G in Figure 5b is two units greater, even though the two chords are the same. In the latter, there is also a cycle-of-fifths move at the scale level (i), causing an extra noncommon tone at that level (k). Motion between major and minor chords arises not by a transformation but is a by-product of moves along a cycle of fifths.


(a) octave (root) level: 0 (0)

(b) fifths level: 0 7 (0)

(c) triadic level: 0 4 7 (0)

(d) diatonic level: 0 2 4 5 7 9 11 (0)

(e) chromatic level: 0 1 2 3 4 5 6 7 8 9 10 11 (0)

FIGURE 3. Diatonic basic space, set to I/C (C = 0, C# = 1, ..., B = 11).

FIGURE 4. The rule for calculating the distance between triads in diatonic space.

Thus the distance from I/C to i/a in Figure 5c is reached by staying in the same diatonic collection (i) and moving the chord three times up the diatonic cycle of fifths (j), producing four noncommon tones (k). The distance from I/C to i/c in Figure 5d, in contrast, involves moving the scale three times down the chromatic cycle of fifths (i). With no change of chord root (j), the third of the chord becomes minor. Again there are four noncommon tones (k).
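The four cases above can be checked with a short Python sketch. It is not the rule as stated in Figure 4 (which is not reproduced in this text). It assumes only what the surrounding paragraphs say: the distance is the sum i + j + k, and k counts the pitch classes at levels (a) through (d) of the target's basic space that are absent from the same level of the source's. The names `Config`, `basic_space`, and `chord_distance` are illustrative; the values of i and j are entered by hand from the prose, with j = 1 for I/G taken from the remark that V/C and I/G are the same chord.

```python
from dataclasses import dataclass

MAJOR_STEPS = (0, 2, 4, 5, 7, 9, 11)


def major_collection(tonic):
    """Pitch classes of the major (diatonic) collection on `tonic`."""
    return frozenset((tonic + step) % 12 for step in MAJOR_STEPS)


@dataclass(frozen=True)
class Config:
    """A chord within a region: chord root, triad, and diatonic collection."""
    root: int
    triad: frozenset
    scale: frozenset


def basic_space(cfg):
    """Levels (a)-(d) of the basic space; assumes a perfect fifth above the root."""
    fifth = (cfg.root + 7) % 12
    return [{cfg.root}, {cfg.root, fifth}, set(cfg.triad), set(cfg.scale)]


def noncommon_tones(x, y):
    """k: pitch classes in y's basic space that are new relative to x, level by level."""
    return sum(len(ly - lx) for lx, ly in zip(basic_space(x), basic_space(y)))


def chord_distance(x, y, i, j):
    """Sum of scale-level moves (i), chord-level moves (j), and noncommon tones (k)."""
    return i + j + noncommon_tones(x, y)


I_C = Config(0, frozenset({0, 4, 7}), major_collection(0))
V_C = Config(7, frozenset({7, 11, 2}), major_collection(0))
I_G = Config(7, frozenset({7, 11, 2}), major_collection(7))
i_a = Config(9, frozenset({9, 0, 4}), major_collection(0))
# c minor is represented here by the three-flat collection (an assumption).
i_c = Config(0, frozenset({0, 3, 7}), major_collection(3))

print(chord_distance(I_C, V_C, i=0, j=1))  # Figure 5a
print(chord_distance(I_C, I_G, i=1, j=1))  # Figure 5b
print(chord_distance(I_C, i_a, i=0, j=3))  # Figure 5c
print(chord_distance(I_C, i_c, i=3, j=0))  # Figure 5d
```

Under these assumptions the sketch gives 5, 7, 7, and 7, which agrees with the prose: I/G lies two units beyond V/C, and both i/a and i/c involve four noncommon tones.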

When mapped geometrically, the distances (d) from triad to triad within a key exhibit a regular pattern, with the diatonic cycle of fifths arrayed on the vertical axis and the diatonic cycle of minor thirds on the horizontal axis. Figure 6a displays this pattern along with distances from the tonic triad to the other triads within the key. Regional space, that is, distances from a given tonic triad to other tonic triads, shows a similar pattern, with the chromatic cycle of fifths on the vertical axis and the minor-third cycle on the horizontal axis. Figure 6b gives a portion of regional space along with the distance values. If these chordal and regional patterns are extended, both Figures 6a and 6b form toroidal structures. (Figure 6b corresponds to a multidimensional solution developed from empirical data in Krumhansl and Kessler, 1982; also see CFMP.)

Pitch-space distances are input to prolongational structure via the principle of the shortest path. The idea is that listeners construe their understanding of melodies and chords in the most efficient way; in other words, they interpret events in as stable and compact a space as possible. For example, if one hears only the melodic progression C → E, the most stable interpretation is as scale degrees 1 → 3 in C, for C and E are then in an optimally stable location in diatonic basic space. A slightly less preferred alternative is as scale degrees 3 → 5 in a; still less preferred would be scale degrees 4 → 6 in G; and so forth. Similarly, a G major chord heard in a C context is likely to be heard by the shortest path as V/C rather than, say, by longer paths to I/G or iii/e.

Figure 7 illustrates the use of the principle of the shortest path in a derivation of the prolongational structure for the final phrase of the Bach chorale, "Christus, der ist mein Leben." (Later on we discuss the entire chorale; also see the extensive analysis in TPS, chapter 1.) Let us assume that the phrasal boundaries and metrical grid have already been assigned. As a first step, automatic segmentation rules carve the music into nested rhythmic units so that each event is assigned to a time-span segment. Second, at the quarter-note time-span level at the bottom of the graph, nonharmonic tones are reduced out, the cadence (marked c) is designated, and tonic orientation is established by shortest-path measurement. The opening F major triad is taken to be the tonic because the distance to itself is 0 (d[F → F] = 0), whereas distances to other possible tonics would be greater. All the subsequent events take place within F. Third, events at the quarter-note level come up for comparison at the half-note level. In each case, the most stable event is selected for comparison in the next larger span, where stability is defined in terms of the distance to another available event. Thus the opening I is compared within span a to vii°6 and I6, and ii6/5 is compared within span b to the V. In span a, the opening I is selected over vii°6 because, unlike vii°6, the distance of I to the tonic is 0; I wins over I6 because its root is in the bass. In span b, the V is chosen because it is part of the cadence. In span c, the only choice is the final I. Thus the half-note time-span level yields I-V-I.

FIGURE 5. Illustrations of d.

(a)
(iii)   V     vii°
 vi     I     iii
 ii     IV    (vi)
Distances from I: V = 5, IV = 5, vi = 7, iii = 7, ii = 8, vii° = 8.

(b)
 e      G     (g)
 a      C      c
 d      F     (f)
Distances from C: G = 7, F = 7, a = 7, c = 7, e = 9, d = 10.

FIGURE 6. Portions of (a) chordal space within a region; and (b) regional space, along with values calculated by d.

The time-span hierarchy then forms the input to the prolongational tree, moving from global to local levels. The distances between available global events are d(opening I → final I) = 0 and d(V → I) = 5. The first option wins because its path is shorter: the opening I branches directly to the closing I, and within that context V branches to the final I. At a more local level, in the first part of the phrase d(I → I6) = 0 and d(I → vii°6) = 5 (counting, as is customary, vii°6 as an incomplete dominant), so I6 attaches to I; within the context of I → I6, vii°6 branches to I6. Finally, ii6/5 lies between I6 and V. d(ii6/5 → I6) = 9 and d(ii6/5 → V) = 7, so ii6/5 attaches to the more proximate V. As a visual aid, the slurs between events in the music duplicate the relationships described in the tree.
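The attachment decisions in this derivation reduce to choosing the candidate with the smallest distance. The following Python sketch shows only that selection step, using the distance values quoted above. The function name `nearest_parent` is hypothetical, and the sketch omits the principles of good form and parallelism, which also bear on the final analysis.

```python
def nearest_parent(candidates):
    """Pick the candidate event with the smallest pitch-space distance.

    `candidates` maps an event label to its distance from the event being
    attached. Ties go to the first label listed (an arbitrary choice made
    for this sketch, not part of the theory).
    """
    target = min(candidates, key=candidates.get)
    return target, candidates[target]


# Distances quoted in the text for the Bach phrase.
global_choice = nearest_parent({"final I": 0, "V": 5})   # for the opening I
local_choice = nearest_parent({"I6": 9, "V": 7})         # for ii6/5

print(global_choice)
print(local_choice)
```

With the values from the text, the opening I connects to the final I and ii6/5 attaches to V, matching the derivation described above.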

Supplementary to the principle of the shortest path is a second factor in the derivation of prolongational reductions, the principle of good form, which encourages optimal patterns of tension and relaxation. This second principle breaks down into three conditions. First is the recursion constraint, in which successive right or left branches are preferable to unconnected right or left branches. Thus there is pressure to assign the first instance of Figure 8a rather than the second. Second is the balance constraint, in which the number of right and left branches approaches equality. Thus the first instance in Figure 8b is preferred over the second. Third is normative structure, in which there is a preference for at least one right branch leading off the structural beginning of the phrase and for at least one left branch (a pre-dominant) leading into the phrase's cadence. Finally, there is a third overarching principle, that of parallelism: parallel passages preferably have parallel structures. GTTM uses this principle in all of its theoretical components.

The principles of the shortest path, prolongational good form, and parallelism reinforce one another in the Bach phrase, but in other passages they might conflict. Although the procedures involving the shortest path are algorithmic, their interaction with prolongational good form is not fully specified; and the principle of parallelism is notoriously difficult to quantify. Hence there is a degree of flexibility in the assignment of prolongational structure.


FIGURE 7. Derivation of the prolongational structure of the final phrase of the Bach chorale, "Christus, der ist mein Leben."

The chord distance rule calculates not only the distance between two chords x and y but also the tonal tension between them. Tension can be computed both sequentially and hierarchically. Sequential tension is measured simply from one event to the next, as if the listener had no memory or expectation. Hierarchical tension proceeds through the prolongational analysis from global to local levels in the tree structure. It is an empirical question how much listeners hear tension sequentially and how much hierarchically. No doubt they hear from one event to the next, but if listening were only sequential there would be little larger-scale coherence to the musical experience.

We turn now to the third component of tonal tension, surface dissonance. Nonharmonic tones (tones not belonging to a sounding triad) are less stable, hence tenser, than harmonic tones. Even when all the sounding tones are harmonic, the triad is more stable if it is in root position than if it is in inversion; and, to a lesser extent, it is more stable if its melodic note is on the root of the triad than if it is on the third or fifth scale degree. These factors are registered, categorically and approximately, in the surface tension rule in Figure 9. They are only approximate because tones within a category in fact differ in their degree of perceived dissonance, depending on intervallic structure, metrical position, duration, loudness, timbre, and textural location. An alternative method would be to quantify surface tension according to an established measure of sensory dissonance in the psychoacoustic literature (for instance, Hutchinson & Knopoff, 1978). This method would give rise to a continuous measure of surface tension. However, surface tension is perceived categorically to a considerable extent. For example, in a diatonic 7-6 suspension chain, all the sevenths, major or minor, sound more or less equally dissonant. Here we take the categorical approach.

The chord distance rule and the surface tension rule combine in two possible ways to yield an overall tension value for a given event. The simpler way, stated in Figure 10a, is sequential: calculate the pitch-space distance from one event to the next and add the value for surface tension. The more complex way in Figure 10b is hierarchical: calculate the pitch-space distance from the immediately dominating event and add the value for surface tension; then add hierarchical values as inherited down the prolongational tree.

As illustration, consider Figure 11, the Grail theme from Wagner's Parsifal. (This is also known as the Dresden Amen and is familiar as such in some Protestant services. Here the theme is transposed from A♭, its characteristic key, to E♭ so that it can be directly compared later on to its chromatic version in E♭.) The theoretically preferred analysis, following the recursion constraint and parallelism for the first four events, says that the music tenses away from the opening I until the pre-dominant ii (Event 5) in bar 2. After an elaboration of ii, the progression relaxes, in observance of normative structure, into the closing I, which repeats the opening I an octave higher. The dashed branch to Event 5 signifies an alternative branching that continues to follow the parallelism of the harmonic sequence but that removes the pre-dominant left branch required by normative structure. We shall return to this point.


FIGURE 8. Prolongational good form: (a) recursion constraint; (b) balance constraint; (c) normative structure.

FIGURE 9. The rule for calculating surface tension.

Included in Figure 11 are numerical values from the application of the rules in Figures 9 and 10. The first row of numbers between the staves lists surface dissonance values. The second row lists sequential tension values, obtained by calculating d from one chord to the next and adding surface dissonance values. For example, the sequential distance from Event 2 to Event 3 is 7, and the surface dissonance value for Event 3 is 1; so the sequential tension associated with Event 3 is 7 + 1 = 8. The third row similarly lists hierarchical tension values, obtained globally by adding the distance numbers next to the branches of the tree and then adding the surface dissonance values. Thus the hierarchical distance from Event 2 to Event 4 is 0 + 7 + 7 = 14, and the surface dissonance value for Event 4 is 1; hence the hierarchical tension associated with Event 4 is 14 + 1 = 15.
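The two worked sums in this paragraph can be written as a pair of small Python functions. This is a sketch of the arithmetic only: the function names are illustrative, and the branch distances and surface dissonance values are passed in by hand because the tree and the rules of Figures 9 and 10 are not reproduced here.

```python
def sequential_tension(distance_from_previous, surface_dissonance):
    """Pitch-space distance from the preceding event plus surface dissonance."""
    return distance_from_previous + surface_dissonance


def hierarchical_tension(branch_distances, surface_dissonance):
    """Sum of the distances along the event's branches in the tree,
    from the global level down, plus its surface dissonance."""
    return sum(branch_distances) + surface_dissonance


# Values quoted in the text for the Grail theme.
event3_sequential = sequential_tension(7, 1)
event4_hierarchical = hierarchical_tension([0, 7, 7], 1)

print(event3_sequential)    # 8
print(event4_hierarchical)  # 15
```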

The same calculations appear in the tabular format in Figure 12. The events for T_seq in Figure 12a are listed in sequential order. The table decomposes the surface-dissonance and pitch-space factors into their component parts. The values in each row are summed to reach the total sequential tension for each event. In T_hier in Figure 12b, the target chords (those to the right of the arrows) are still listed in sequential order, but the source chords (those to the left of the arrows) are now listed by the immediately dominating events in the tree. For example, the notation T_hier(4 → 3) indicates that Event 3, because it branches from Event 4, derives its tension value from Event 4. In accordance with the hierarchical tension rule, Figure 12b includes the additional columns of "local total" and "inherited value." The hierarchical tension for each event, given in the "global total" column, equals the local total plus the inherited value.

FIGURE 10. Tension rules: (a) sequential tension plus surface dissonance; (b) hierarchical tension plus surface dissonance.

FIGURE 11. Grail theme (diatonic version) from Wagner's Parsifal, together with its theoretically preferred prolongational analysis, surface dissonance values, sequential tension values, and hierarchical tension values.

The fourth component of the tension model is the factor of attraction. That pitches tend strongly or weakly toward other pitches has long been recognized in music theory (see TPS, pp. 166-167 and 188-192). Bharucha (1984, 1996) provides a psychological account of this phenomenon through the notion of anchoring, which is the urge for a less stable pitch to resolve on a subsequent, proximate, and more stable pitch. This corresponds to the account offered by Krumhansl (1979) for the effect of temporal order on tone similarity judgments. Bharucha and Larson (1994, 2004) also equate the attractive urge with melodic expectancy (Meyer, 1956; Narmour, 1990). The TPS attraction model extends Bharucha's approach to include the attraction of any pitch to any other pitch and to harmonic progression. It also quantifies the relevant variables and places them within a larger cognitive theory.

Figure 13a repeats the basic space with the fifths level (level b in Figure 3) omitted, in order to make attractions to the third and fifth scale degrees equal. Each level of the space is assigned an anchoring strength in inverse relation to its depth of embedding. Figure 13b gives the melodic attraction rule. The two factors in the equation, combined by multiplication, are the ratio of anchoring strengths of two pitches and the inverse square of the semitone distance between them. The distance factor is estimated to behave as in Newton's classical gravitational equation. The inverse-square factor renders minuscule attractions between pitches that are more than a major second apart. To convey the behavior of the rule, Figure 13c lists a few attractions to diatonic neighbors in the context I/C. The pitch B is highly attracted to C because the two pitches are a semitone apart and C is more stable. D is less attracted to C because it is two semitones away. F is more attracted to E than E is to F because of their inverse anchoring strengths.
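A minimal Python sketch of the rule as described in words above: the ratio of anchoring strengths multiplied by the inverse square of the semitone distance. Figure 13 is not reproduced in this text, so the anchoring strengths used here (4 for the root, 3 for the other triad members, 2 for the remaining diatonic pitches, 1 for chromatic pitches) are an assumption chosen to be inversely related to depth of embedding, and the function names are illustrative.

```python
TRIAD_I_C = {4, 7}            # third and fifth of I/C
DIATONIC_C = {2, 5, 9, 11}    # remaining diatonic pitch classes


def anchoring_strength(pitch):
    """Assumed anchoring strength of a pitch in the context I/C."""
    pc = pitch % 12
    if pc == 0:
        return 4
    if pc in TRIAD_I_C:
        return 3
    if pc in DIATONIC_C:
        return 2
    return 1


def melodic_attraction(p1, p2):
    """Attraction of pitch p1 toward pitch p2: (s2 / s1) * (1 / n**2).

    Pitches are semitone numbers (C = 0, C an octave higher = 12).
    The rule is undefined for a repeated pitch, so None is returned.
    """
    n = abs(p2 - p1)
    if n == 0:
        return None
    return (anchoring_strength(p2) / anchoring_strength(p1)) * (1 / n ** 2)


b_to_c = melodic_attraction(11, 12)
d_to_c = melodic_attraction(2, 0)
f_to_e = melodic_attraction(5, 4)
e_to_f = melodic_attraction(4, 5)

print(b_to_c, d_to_c, f_to_e, e_to_f)
```

With these assumed strengths the ordering matches the prose: B is pulled toward C more strongly than D is, and F is pulled toward E more strongly than E toward F.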

The attraction rule applies not only to individual lines but also to each voice in a progression. As stated in the harmonic attraction rule in Figure 14, these values are summed and then divided by the value for the chord distance rule to obtain the overall attraction value from one chord to the next.

FIGURE 12. Tension tables for Figure 11: (a) sequential tension; (b) hierarchical tension.

Figure 15 applies the harmonic attraction rule to the first and last progressions in Wagner's Grail theme. Where a pitch repeats, "null" is designated because the attraction rule does not apply to repeated pitches. The attraction values (a) are summed to the combined realized voice-leading value (a_rvl), which is divided by the d value to give the final realized harmonic attraction value (a_rh). Notice the extreme differences between the a_rh values for Prog(1 → 2) and Prog(8 → 9). In the former, the progression I → vi is only moderately strong and includes repeated notes; in the latter, the progression V7 → I is very strong and resolves by half step in two voices. Indeed, the strongest harmonic attraction is from a dominant seventh chord to its tonic, because of the powerful attractions of the leading tone to the tonic and the fourth to the third scale degree and because of the short distance from the dominant to the tonic chord. This is why (aside from statistical frequency) the expectancy for a tonic chord is so high after a dominant-seventh chord.
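The harmonic attraction computation described above can be sketched in Python as follows. The function name is illustrative, `None` stands in for the "null" entries of repeated pitches, and the input numbers in the usage lines are hypothetical placeholders, not the values printed in Figure 15.

```python
from typing import Optional, Sequence


def harmonic_attraction(voice_attractions: Sequence[Optional[float]],
                        chord_distance: float) -> float:
    """Sum the per-voice attractions (skipping null entries for repeated
    pitches) and divide by the chord distance between the two chords."""
    if chord_distance == 0:
        raise ValueError("the rule is undefined for a chord distance of 0")
    realized_voice_leading = sum(a for a in voice_attractions if a is not None)
    return realized_voice_leading / chord_distance


# Hypothetical per-voice values for a four-voice progression, one voice repeated.
example = harmonic_attraction([2.0, 1.5, None, 0.5], chord_distance=5)
print(example)
```

The division by chord distance is what makes a progression between close chords, such as dominant to tonic, score higher than one with the same voice-leading pulls between distant chords.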

Attractions in TPS are computed not only from event to event at the musical surface but also from event to event at immediately underlying levels of prolongational reduction. The resulting sets of numbers, however, are not integrated into a single attraction measure across reductional levels. Depending in part on tempo, underlying levels presumably contribute to the overall result in increasingly smaller amounts as the analysis abstracts away from the surface. (Margulis, 2005, proposes a mechanism for this step.) In this study we dispense with underlying levels of attraction.

Figure 16 shows the surface attraction values for the Grail theme. The numbers appear between events because they apply to relations between events. Where the harmony does not change, as in events 3-4 and 5-7, a single attraction value obtains.

FIGURE 13. Melodic attractions: (a) The basic space minus the fifth level and with anchoring strengths indicated by level; (b) the melodic attraction rule; (c) some computed attractions between scalar adjacencies in the context I/C.

FIGURE 14. Harmonic attraction rule.

There is a complementary relationship between tension and attraction numbers. Where the music tenses away from the tonic, attractions are realized on less stable pitches and chords. Hence where tension numbers rise, attraction values tend to be small. But where the music relaxes toward the tonic, attractions are realized on more stable pitches and chords; tension numbers decline and attraction values rise. A high attraction value in effect constitutes a second kind of tension: not the tension of motion away from stability but the tension of expectation that the attractor pitch or chord will arrive.

A further general point about tension and attraction concerns numerical quantification. As Klumpenhouwer (2005) points out, the theory's numbers measure different entities in the different components: in the dissonance component, chord inversions and nonharmonic tones; in the distance model, steps on cycles of fifths and noncommon tones; in the attraction component, pitch stabilities and distances. As numerical values, then, these might be considered incommensurate (for example, a 2 for inversion in the dissonance component is not exactly the same as a 2 for the k distance between chords). One approach to this issue would be to find coefficients for the different variables to express the relative strength of their units of measurement. We have found, however, that coefficients are not needed for the tension rules; that is, the numbers already express the relative strength of the variables in question. However, the attraction rules yield incommensurable output numbers compared to those of the tension rules. Empirical data suggest that coefficients are needed when tension combines with attraction. For this, we take a practical rather than theoretical solution through the mathematical technique of multiple regression, which weights the two sets of numbers to find the best fit between the tension and attraction curves.
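As a rough illustration of how regression weights two incommensurate predictors, the Python sketch below fits an intercept and one coefficient each for tension and attraction by ordinary least squares. It assumes that the quantity being fitted is a series of listener tension ratings. All numbers are synthetic, constructed so that the known weights 2 and 3 can be recovered. The function names are hypothetical and no particular statistics package is implied.

```python
def solve_linear(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting.
    Assumes the system is nonsingular."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_weights(tension, attraction, ratings):
    """Least-squares fit of ratings = b0 + b1 * tension + b2 * attraction,
    solved through the normal equations."""
    rows = [[1.0, t, a] for t, a in zip(tension, attraction)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ratings)) for i in range(3)]
    return solve_linear(xtx, xty)


# Synthetic data: ratings are built as 1 + 2 * tension + 3 * attraction.
tension = [0, 7, 8, 15, 10, 5]
attraction = [0.5, 0.2, 0.1, 0.3, 1.0, 2.4]
ratings = [1 + 2 * t + 3 * a for t, a in zip(tension, attraction)]

weights = fit_weights(tension, attraction, ratings)
print([round(w, 6) for w in weights])
```

The fitted coefficients play the role of the conversion factors mentioned above: they put tension units and attraction units on a common scale without requiring a theoretical argument for their relative size.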

FIGURE 15. Two applications of the harmonic attraction rule.

FIGURE 16. Attraction values for the Grail theme.

Before proceeding, it should be noted that the melodic attraction rule (Figure 13b) stands on weaker empirical grounds than does the chord distance rule (Figure 4). Experimental results guided the development of the distance rule. (However, the output of the elaborated form of d, the chord/region distance rule [TPS, p. 70], which employs the pivot-region concept, proves to be empirically less successful, and we shall not invoke it.) The attraction rule, in contrast, was developed by a blend of theoretical and intuitive considerations without much supporting empirical data. Several aspects of the rule can be criticized. First, it is unclear that a multiplicative rather than additive relationship should obtain between the stability (s2/s1) and proximity (1/n²) parts of the equation. Second, as Larson (2004) and Samplaski (2005) observe, there is arbitrariness in the reduction of five levels of the basic space (Figure 3) to four when calculating attractions (Figure 13a). Third, the inverse factor for proximity eliminates the attraction of a pitch to itself because of the impossibility of a zero denominator. Pitch repetition may indeed be a case where intuitions of attraction and expectation diverge. One may expect a pitch to repeat, but it seems more natural to think of a pitch as being attracted only to other pitches. Nevertheless, the exclusion of pitch repetition from the calculations leaves a gap in the theory. Fourth, the specific form of the inverse factor, 1/n², appears to create too steep a curve; that is, the drop from great attraction at the half-step distance to less attraction at the whole-step distance to very little attraction at the minor-third distance seems too extreme (Margulis, 2005). The obvious alternative, 1/n, yields too flat a curve. An intermediate curve is possible, but the theoretical and empirical bases for such a solution are unclear. Fifth, and perhaps most importantly from a theoretical perspective, the measurement of proximity only by semitones may be too simple a metric. Larson (2004) cites evidence in Povel (1996) that stepwise arpeggiated intervals (that is, between adjacent members of a triad or between the fifth and the tonic) yield greater attractions than predicted by 1/n² or 1/n. Krumhansl (1979) and CFMP (Table 5.1) also find high relatedness ratings for triad members. This evidence fits the discussion in Chapter 2 of TPS about pitch proximity, step motion, and linear completion. It appears that that discussion, in which stepwise motion is seen as pertaining to the alphabet in question at a given level of the basic space, should have informed the formulation of pitch attractions in Chapter 4.

Despite these reservations, the principles behind the attraction rule, stability and proximity, remain the central factors in a treatment of melodic attraction. We have tried the alternatives of five instead of four stability levels and of proximity by 1/n instead of 1/n², but the resulting values do not lead to improvements over those of the original rule with respect to the empirical data. Nor are there enough instances of voice-leading arpeggiation in our examples to force a stratified treatment of melodic proximity. Our project is to test the success or failure of the TPS theory of tension and attraction, and we leave theoretical refinements of the attraction rule for future research.
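The difference between the two proximity factors can be seen in a minimal computational sketch. It is an illustration under stated assumptions, not a reproduction of the rule in Figure 13b: the function name, its parameters, and the stability values in the example (2 for the departing pitch, 4 for the goal pitch) are hypothetical choices made only to show how the stability ratio combines with an inverse proximity factor.

```python
def melodic_attraction(s1: float, s2: float, n: int, exponent: int = 2) -> float:
    """Attraction of pitch p1 to pitch p2 as (s2 / s1) * (1 / n**exponent).

    s1, s2   -- stability (anchoring strength) of p1 and p2; assumed inputs
    n        -- distance between p1 and p2 in semitones (must be >= 1)
    exponent -- 2 for the inverse-square form, 1 for the flatter 1/n form
    """
    if n < 1:
        # The rule excludes pitch repetition: n = 0 would give a zero denominator.
        raise ValueError("semitone distance n must be at least 1")
    return (s2 / s1) * (1.0 / n ** exponent)


if __name__ == "__main__":
    # Hypothetical case: a pitch of stability 2 moving to a pitch of stability 4.
    for n in (1, 2, 3):
        steep = melodic_attraction(2, 4, n, exponent=2)
        flat = melodic_attraction(2, 4, n, exponent=1)
        print(f"n={n}: 1/n^2 -> {steep:.3f}, 1/n -> {flat:.3f}")
```

With these assumed stabilities, the inverse-square form falls from 2 at the half step to 0.5 at the whole step and about 0.22 at the minor third, whereas 1/n gives 2, 1, and about 0.67 for the same distances.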

Experimental Approach

The participants in the experiments under discussion were musically trained students at Cornell University with relatively little training in music theory compared to the extent of their instruction on musical instruments or voice. (More details of music backgrounds and other details of the experimental method can be found in Appendix A.) They were tested for tension responses for Wagner's Grail theme from Parsifal in its diatonic and chromatic versions, a Bach chorale, a chromatic Chopin prelude, and a passage from Messiaen's Quartet for the End of Time. The data were compared to the model's predictions. (A Mozart sonata movement that received a similar treatment is not discussed in this paper; see Krumhansl, 1996, and Lerdahl, 1996.)

The tests for the Wagner and Bach excerpts were conducted in two ways, the stop-tension task and the continuous-tension task. In the stop-tension task, the first event was sounded, at which point the participants rated its degree of tension; then the first and second events were sounded and the participants rated the tension of the second event; then the first, second, and third events were sounded, and so on, until the tension associated with each successive event was recorded. In the continuous-tension task, which was done for all excerpts, the participants interacted with a graphic interface that enabled them to move a slider right and left on the computer screen using a mouse, in correspondence with their ongoing experience of increasing and decreasing tension. The advantage of the stop-tension task is that it records the response precisely for the event that is evaluated. Its disadvantage is that it is rather artificial and prohibitively time-consuming for long excerpts. The advantage of the continuous-tension task is that it encourages a spontaneous response to intuitions of tension in real time. Its disadvantage is that there is a lag time, for which an approximated correction must be made, between the sounded events and the physical response of moving the mouse. Perhaps surprisingly, the two tasks yielded almost the same results for the short passages where both were used. For the longer Chopin and Messiaen selections, however, it was practical to employ only the continuous-tension task. The participants in the study by Krumhansl (1996) using this method varied in the extent of their musical training, but training had little effect on the tension judgments.

As mentioned, the analyses combine tension and attraction values to achieve an overall measure of tension. We follow three conventions in this respect. First, even though an attraction number does not adhere to a single pitch but represents a relation between two successive pitches x and y, we assign the number to x, in effect claiming that it is at x that the experience of attraction most saliently takes place. In this way, each event has two numbers associated with it, one for d and the other for a. Second, the harmonic attraction rule (Figure 14) has d in the denominator and hence requires that d ≠ 0. This creates a problem when the voice leading moves but the harmony does not progress. In such cases, we repeat the value for d from the point at which the harmony last changed (as in Figure 16, Events 3-4 and 5-7). Third, a virtual attraction can be computed from Event x to any possible Event y, and, in particular, to the y with the highest attraction value. Instead we calculate only the realized attraction, that is, from x to the y that actually follows. It might be argued, especially for the stop-tension task, that the strongest virtual attraction, when it is not the same as the realized attraction, should be calculated, on the view that the strongest attraction corresponds to the strongest expectation. Expectations, however, depend not only on strongest attractions but also on schematic patterns that lie beyond current formalization. To calculate to an event that does not occur would be somewhat speculative in this context. It suffices as a first approximation to rely on the definiteness of realized attractions.
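The second convention, repeating the value of d from the point at which the harmony last changed, amounts to a carry-forward operation. The following minimal sketch shows one way to express it; the function name and the distance values in the example are hypothetical and are not taken from Figure 16.

```python
def carry_forward_distance(distances):
    """Replace zero distances with the most recent nonzero distance.

    distances -- one chord-distance value per event; 0 marks an event at
                 which the voice leading moves but the harmony does not change.
    Returns a new list in which every entry is nonzero, so that the value
    can safely appear in a denominator.
    """
    filled = []
    last = None
    for d in distances:
        if d != 0:
            last = d  # the harmony changed here; remember this distance
        if last is None:
            raise ValueError("no earlier harmonic change to carry forward")
        filled.append(last)
    return filled


if __name__ == "__main__":
    # Invented distances: the harmony is unchanged at the third,
    # fifth, and sixth positions.
    print(carry_forward_distance([7, 5, 0, 8, 0, 0]))
```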

A larger methodological point concerns the interaction between prediction and data. It is sometimes thought that an experiment simply tests a preexisting theory. Yet experimental data can give rise to a theory; this in fact was the case for the construction of the pitch-space model. In a healthy science, it often happens that a fruitful exchange develops between theory and experiment. Such is the case here. If the data suggest that the predictions are faulty, principled ways are sought within the model to reach predictions that achieve a better empirical result. These reevaluations are principled in the sense that they are constrained by the general assumptions and specific formalisms of the theory. This process can go back and forth a number of times. One must of course be careful not to adjust the theory simply in order to fit the data. Rather, the data can illuminate how listeners construe tension, suggesting interpretations within the model that are both theoretically acceptable and more predictive. In this way the theory can be improved. Furthermore, in our view it is not enough to achieve a statistically significant overall correlation. What is wanted, in addition, is an explanatory account of why the model succeeds or fails at any given point in the analysis.

In this back-and-forth process there are two kinds of flexibility within the theory. First, sequential or hierarchical tension can be computed, each with or without attractions. Second, unlike the tension and attraction rules (all those that incorporate d and a), which are algorithmic, the derivation of prolongational structure involves gradient preference rules, which interact with one another in search of an optimal solution (see the discussions in GTTM and TPS; also Temperley, 2001).

Preferential conditions arise in three ways. First is the interaction between the principles of the shortest path, prolongational good form, and parallelism. Second, when there is a shift from a right- to a left-branching pattern, the event where the shift takes place can attach either way, depending on the shortest path and good form. Third, it is not always clear where to locate an event in pitch space; that is, there can be ambiguity about the identity of a chord or the exact moment of a modulation. As a result of these factors, a passage of music yields not a single prolongational analysis but a limited range of preferred analyses. The data can point in any given case toward which theoretically viable prolongational analysis conforms best to listeners' responses.

Analyses in a Diatonic Framework

Wagner Theme, Diatonic Version

We begin with the diatonic Grail theme from Wagner's Parsifal, shown in Figure 11. Figure 17 records the nine events of the excerpt on the x-axis and tension responses from the stop-tension task on the y-axis. The dashed line represents the sequential tension values from Figure 12a, without the inclusion of attraction values, and the solid line shows the data from the averaged listeners' responses. The fit is quite poor: R²(1,7) = .08, p = .46, R²adj = -.049.

FIGURE 17. Sequential tension analysis of the Grail theme from Parsifal.

Some words of explanation may be helpful. For each correlation, we present the following information about the statistical test. The first number, R², is the proportion of variation in the data that is accounted for by the model. It is associated with two numbers, the degrees of freedom. The first degree of freedom indicates the number of predictor variables in the model. In this case there is one variable, sequential tension. The second degree of freedom is the number of data points (in this case 9, one for each event in the music) minus two. The subtraction of two results from the regression using up two degrees of freedom for the parameters it determines. The regression finds the best-fitting linear model predicting the data from the variable(s). The model determines the optimal values for the slope of the line (one parameter) and the intercept of the line (the second parameter). Hence the number two is subtracted from the number of data points going into the regression to give the second degree of freedom. (With two predictor variables, as in the multiple regressions reported below, three parameters are determined, and the second degree of freedom is the number of data points minus three.) If the model can find a perfect fit between the data and the variable(s), R² would equal 1.0. In general, the value is less than this, and its significance is measured by the probability, p, which is the next number given in the statistical report. By convention, a statistic, such as R², is considered significant when the p value is less than .05. The probability depends on both the size of R² and the degrees of freedom. The last number given is the adjusted R², R²adj. The R²adj is the R² value adjusted to make it more comparable with other models for the same data that have different numbers of degrees of freedom.
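The relation among R², the degrees of freedom, and the adjusted R² can be illustrated with a short sketch. It uses the standard textbook formulas for the adjusted R² and for the F ratio of a linear regression; the function names are ours, and the p value itself, which comes from the F distribution, is not computed here.

```python
def adjusted_r2(r2: float, n_points: int, n_predictors: int) -> float:
    """Standard adjusted R^2: 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    df_resid = n_points - n_predictors - 1  # the second degree of freedom
    if df_resid <= 0:
        raise ValueError("need more data points than fitted parameters")
    return 1.0 - (1.0 - r2) * (n_points - 1) / df_resid


def f_statistic(r2: float, n_points: int, n_predictors: int) -> float:
    """F ratio of the regression, on (k, n - k - 1) degrees of freedom."""
    df_resid = n_points - n_predictors - 1
    if df_resid <= 0:
        raise ValueError("need more data points than fitted parameters")
    return (r2 / n_predictors) / ((1.0 - r2) / df_resid)


if __name__ == "__main__":
    # Nine data points and one predictor give (1, 7) degrees of freedom.
    print(round(adjusted_r2(0.35, 9, 1), 3))
    print(round(f_statistic(0.35, 9, 1), 3))
```

For R² = .35 with nine data points and one predictor, the sketch gives an adjusted value of about .26 and an F ratio of about 3.77 on (1, 7) degrees of freedom. Because the adjustment penalizes each added predictor, the adjusted value can fall below zero when R² is small.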

Methods such as time-series analysis or functional data analysis are not appropriate here. Our objective is to determine whether the judgments fit the quantitative predictions of the model for each musical event. For this we need a single number for judged tension for each event.

The conclusion from this statistical test for Figure 17 is that listeners do not hear this passage in a simple sequential manner. The R² value is only .08, which means that the sequential tension variable accounts for only 8% of the variability in the tension judgments, and the p value of .46 tells us that this result is far from statistically significant. Graphically, this is apparent in Figure 17, where the two lines do not follow each other closely.

The second analysis is another single-variable model, using the attraction values displayed in Figure 18. These are the attraction values from Figure 16, without the inclusion of tension values, against the listeners' responses. The fit is improved but still not good: R²(1,7) = .35, p = .09, R²adj = .26.

Figure 19 combines Figures 17 and 18 by adding the attraction values to the sequential tension values. Multiple regression weights the two sets of numbers to achieve a best-fit solution, and assigns a probability to each of the predictor variables. These will be denoted p(attraction) for the probability of attraction and p(tension) for the probability of the total tension predictor. Each of these is shown with a standardized beta value, b. The b weights are the coefficients in the linear model predicting the data from the predictor variables (after they have been standardized to have the same mean and standard deviation). The picture is better than in Figure 17 but no better than in Figure 18: R²(2,6) = .35, p = .28, R²adj = .13; p(attraction) = .17, b = .58; p(tension) = .96, b = .02. The higher p value for R² is the penalty for using two predictor variables rather than one, thus increasing the first degree of freedom. Or, to put it another way, when attraction is included in the model, adding the sequential tension values does not improve the fit (the p value for sequential tension in the multiple regression is .96, which means that adding it has virtually no effect). This analysis confirms the conclusion that the strict sequential treatment of tension does not contribute to the fit of the data.
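How standardized b weights arise from two predictors can be sketched with the closed-form solution for a two-predictor regression on standardized variables. This is a generic illustration, not the statistical software used in the study; the function names are ours, and the numbers in the example are invented and are not the experimental data.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a sequence to mean 0 and (population) standard deviation 1."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        raise ValueError("a constant variable cannot be standardized")
    return [(v - m) / s for v in values]


def correlation(x, y):
    """Pearson correlation as the mean product of z-scores."""
    zx, zy = zscores(x), zscores(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(zx)


def standardized_regression(x1, x2, y):
    """Least-squares fit of y on two predictors, all standardized.

    Returns (beta1, beta2, r_squared).
    """
    r1y, r2y, r12 = correlation(x1, y), correlation(x2, y), correlation(x1, x2)
    denom = 1.0 - r12 ** 2
    if denom == 0:
        raise ValueError("predictors are perfectly collinear")
    beta1 = (r1y - r12 * r2y) / denom
    beta2 = (r2y - r12 * r1y) / denom
    return beta1, beta2, beta1 * r1y + beta2 * r2y


if __name__ == "__main__":
    # Invented toy values standing in for attraction, tension, and ratings.
    attraction = [1.0, 2.0, 3.0, 4.0, 5.0]
    tension = [2.0, 1.0, 4.0, 3.0, 6.0]
    ratings = [8.5, 6.5, 18.0, 17.5, 27.5]
    b_attr, b_tens, r2 = standardized_regression(attraction, tension, ratings)
    print(f"b(attraction)={b_attr:.2f}, b(tension)={b_tens:.2f}, R2={r2:.2f}")
```

Because the variables are standardized first, the two b weights are directly comparable even when the raw attraction and tension numbers are on different scales.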

FIGURE 18. Attraction analysis of the Grail theme.

FIGURE 19. Combined sequential + attraction analysis of the Grail theme.

Let us abandon the sequential-tension approach and consider instead the theoretically preferred prolongational analysis in Figure 11 together with its derived hierarchical tension analysis in Figure 12b. At first we ignore attraction values. The resultant graph in Figure 20 achieves a better correlation than the previous analyses: R²(1,7) = .43, p = .056, R²adj = .35. However, the predicted values are too high for Events 1-4 and too low for Events 5-8.

Figure 21 adds the attraction values in Figure 16 to the tension values in Figure 12b. Now the correlation is quite good and statistically significant: R²(2,6) = .75, p = .016, R²adj = .66; p(attraction) = .03, b = .56; p(tension) = .02, b = .63. The most notable change in Figure 21 compared to Figure 20 is the raising of the predicted curve at Event 8 (V7). In Figure 20 the tension model correctly assigns relaxation into the cadence, but participants experience greater tension at the V7 chord than shown there. This happens because the V7 is highly attracted to the following tonic resolution, an effect realized in Figure 21 by the inclusion of attraction values. Discrepancies remain, however. The predictions for Events 3-4 are still too high and those for Events 5-7 are too low.

These shortcomings can be overcome through a revision of the prolongational analysis. In the original analysis in Figure 11, there is equilibrium between right and left branching (following the balance constraint), with Event 5 (the ii chord) interpreted as a pre-dominant to the cadence (following normative structure). The analysis in effect claims that, beginning at Event 5, the listener already expects the resolution on Event 9. But it is harder to anticipate prospectively than it is to remember retrospectively. Besides, Event 5 continues from the previous events the harmonic sequence of descending thirds with a rising melodic second. It is easier to hear instead the analysis in Figure 22, in which the principle of parallelism wins over those of branching balance and pre-dominant function. The only difference is that Event 5 is now a right instead of left branch; Events 6 and 7 attach to Event 5 as before. This single change leads to alterations in tension values for Events 5-7, as listed between the staves. In this interpretation, the tension of the harmonic sequence continues through the elaboration of ii in Events 6-7 and is released only at the cadence in Events 8-9. Attractions remain as before. The result is the almost perfectly matching curves in Figure 23: R²(2,6) = .97, p < .0001, R²adj = .97; p(attraction) < .0001, b = .58; p(tension) < .0001, b = .79.

Three broad conclusions can be drawn from this analysis of the Grail theme. First, attractions must be incorporated into the predictions. Second, listeners hear tension hierarchically more than sequentially. Third, unless schematic intuitions are strong, listeners tend to construe events in a right-branching manner, that is, in terms of previous rather than following events.

FIGURE 20. Tension graph for the theoretically preferred hierarchical analysis of the Grail theme.

FIGURE 21. Combined hierarchical (theoretically preferred) + attraction analysis of the Grail theme.

Are the stop-tension data related to the continuous-tension data? Krumhansl (1996) found that the discrete predictions of the TPS model could provide a good fit to the continuous tension judgments by assuming an integration time of 2.5 seconds. In the present case, this approach is adapted to ask whether the continuous-tension data could be predicted by the stop-tension data, assuming the same integration time. The calculation assumes that the values of past events are degraded as an inverse exponential function with a half-life of 0.5 seconds. The continuous data are plotted as a solid line in Figure 24 together with the values calculated from the stop-tension data. A high degree of agreement is reached: R²(1,104) = .95, p < .0001, R²adj = .95. This is of interest because the participants performed the stop-tension task before the continuous-tension task. This means that when they performed the stop-tension task, they had not heard the music beyond the chord that they were judging. The extent to which the two tasks converge suggests that listeners were responding to the sounded events rather than to events they anticipated because of memory from previous listening. Although the analyses will not be presented here, the stop-tension and continuous-tension data are similarly related for the other two excerpts for which they are available (the chromatic version of the Wagner Grail theme and the Bach chorale).
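One plausible reading of the integration procedure can be sketched as follows. The text specifies only the 2.5-second integration time and the 0.5-second half-life; everything else in the sketch is an assumption made for illustration: each event's stop-tension value is treated as holding from its onset until the next onset, the window is sampled every 0.05 seconds, and the prediction is a weighted average. The function name, parameters, and example values are hypothetical.

```python
def integrate_with_decay(onsets, values, t, window=2.5, half_life=0.5, step=0.05):
    """Exponentially weighted average of event values over [t - window, t].

    onsets -- event onset times in seconds, in ascending order
    values -- one discrete tension value per event
    t      -- time in seconds at which the continuous prediction is wanted
    """
    def value_at(time):
        # Value of the most recent event whose onset is not after `time`.
        current = None
        for onset, value in zip(onsets, values):
            if onset <= time:
                current = value
            else:
                break
        return current

    weighted_sum = 0.0
    weight_total = 0.0
    n_steps = int(round(window / step))
    for i in range(n_steps + 1):
        age = i * step                      # how far back in time this sample lies
        value = value_at(t - age)
        if value is None:
            continue                        # nothing had sounded yet
        weight = 0.5 ** (age / half_life)   # halves every `half_life` seconds
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no event has sounded by time t")
    return weighted_sum / weight_total


if __name__ == "__main__":
    # Invented onsets (seconds) and ratings, not the experimental data.
    onsets = [0.0, 1.0, 2.0, 3.0]
    ratings = [2.0, 5.0, 4.0, 1.0]
    for t in (0.5, 1.5, 2.5, 3.5):
        print(t, round(integrate_with_decay(onsets, ratings, t), 2))
```

Under these assumptions the predicted curve moves toward each new event value gradually rather than jumping at the onset, which is the kind of smoothing needed before discrete values can be compared with a continuous slider trace.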

FIGURE 22. Prolongational analysis of the Grail theme, with Event 5 reinterpreted as right branching.

FIGURE 23. Combined hierarchical (right-branching) + attraction analysis of the Grail theme.

FIGURE 24. Comparison of the continuous-tension data (solid line) with predictions from the stop-tension data for the Grail theme, after the latter are integrated over 2.5 seconds with an exponential decay with half-life 0.5 sec.

Bach Chorale

On the basis of the discussion of the Wagner excerpt, the remaining analyses follow the hierarchical rather than sequential tension model and incorporate attractions as part of the overall prediction of tension. We first consider the Bach chorale "Christus, der ist mein Leben." Its prolongational analysis is divided, for reasons of space, between Figure 25 and Figure 26. The top branches, all of which represent the tonic I/F (hence d = 0 in the tree), should be understood as connecting together. Event 2 in Figure 25 attaches to Event 41 in Figure 26, and the designation for Event 19 in Figure 26 refers to Event 19 in Figure 25. The predicted values of surface dissonance, hierarchical tension, and attraction appear between the staves. (Incidentally, Event 34 branches differently than does the equivalent first event in Figure 7. Here it connects not to the final cadence [Event 41] but back to Event 19, showing the return to I/F. This happens because a prolongational analysis always makes the most global connection possible. In Figure 7 the context was a single phrase; here it is the entire chorale.)

Figure 27 shows the fit of the empirical data with the predictions in Figures 25-26: R²(2,38) = .79, p < .0001, R²adj = .78; p(attraction) < .0001, b = .47; p(tension)
