Oxford University
Honour School of Engineering and Computing Science

Part II Project Report

Contrafactal and Metrical Analysis
of English Folk Songs

(Music and Computers)

Paul J. Walmsley
Trinity College

Trinity Term 1996

Project Performed in Collaboration with:
Douglas Anderson, Pembroke College

Project Supervised by Ian Page

Summary

In this project we aim to investigate the characteristics of English Folk Song which determine whether a particular lyric can be sung to a given tune. Contrafacture is the name given to the ability to sing different words to the same tune, whilst Metrical Analysis is an umbrella term for the analysis of the inherent rhythm of text. This is a field of no apparent previous research, and so we were able to investigate the problem area ourselves and produce original results. We have concentrated on the analysis of a popular seventeenth century ballad called Packington's Pound.

The ultimate goal of our project was to create a suite of software which takes as its input a tune and a set of words and produces a numerical assessment of how easy it would be to sing the given words to that tune. This we achieved, but with the unexpected bonus of adapting the rhythm so that the program also tells the user how to sing the tune to achieve a good fit. We truly surpassed original expectations when we were able to produce a program to find a number of tunes from a database to match an arbitrary lyric.

We were able to split the project into two distinct halves (I concentrated on the lyrics, whilst Douglas dealt with the music), which we were able to integrate towards the end into a single stand-alone program.

In this report I intend to detail the course of my research (which has concentrated on the lyrical side), identifying the problem, developing heuristics for its solution and describing the algorithms and data structures required to implement the solution.

Contents

Summary

Introduction
    Definitions
    The Problem

Characterisation of Lyrical Features
    Metrical Analysis
    Analysis of Metrical Deviance
    Analysis of Caesura

Grammatical Analysis
    The Lexicon
    Grammatical Inference
    Word Binding Order
    Finding the Caesura

Musical Integration
    Primary Word Stress
    Scansion and Musical Stress
    Beat Likelihood Evaluation
    Adaptive Beat Placement
    Super-Rhythm Contraction
    Sub-Rhythm Expansion
    Caesura Redundancy

Project Management
    Division of Labour
    Interaction with Supervisor

Results

Conclusions

Possible Future Work

Bibliography

Electronic Texts and Resources

Appendix A - Packington's Pound

Appendix B - Sample Results
    Packington's Pound
    Tune Searching
    Limericks

Appendix C - Progress Report

Introduction

Definitions

As far as we will be concerned, music consists of two orthogonal concepts: those of rhythm and melody, which are respectively time and frequency domain abstractions. We will define a tune to be melody superimposed upon a rhythm (in order to make the distinction between tune and melody).

However, words also have features analogous to the rhythm and melody of the tune. We shall call these properties metrical rhythm and intonation [1]. Consequently, the relations between these four properties determine how good a particular set of words sounds to a given tune, and it is the purpose of this project to find these relations and to implement software to evaluate this goodness-of-fit.

A song is a composition comprising lyrics and a tune.

The Problem

A singer is familiar with a particular tune; someone gives him/her a new set of words claiming that they can be sung to that tune. Sometimes it is possible to see by inspection of the verse structure whether the words may fit, but more often it is necessary for the singer to try to sing the new words, mentally or aloud, to establish whether or not the claim is justified.

To illustrate, take the first line of a limerick and try singing the second line to the same rhythm [2] (boldface denotes the stressed syllables, which can be emphasised by foot-tapping):

There was a young man from Dum-bar-ton
Baa Baa black sheep have you an-y wool

The fit is hardly satisfying: it is very unnatural, but there is no way of knowing this until you try to say the phrase. Contrast with:

John Smith-son goes down to the shop

which, although it doesn't look like a limerick first line, can be spoken as one quite easily. Our analysis therefore needs to centre on why this latter case fits but the former doesn't.

A related problem also occurs if we rename John to Smithyson:

John Smith-y-son goes down to the shop

The extra syllable inserted by the y pushes the rest of the line to the right and we lose the good fit we once had. However, if we say Smith-y-son a bit faster, we can keep the rest of the line unaltered, and once again we have a good fit, although not quite as good as the original:

John Smith-y-son goes down to the shop

These examples show the two-pronged nature of the problem. The first is indicative of the general word-fit analysis problem, whilst the second acknowledges that the obvious way of singing a line isn't necessarily the best one and, in this case, succeeds in finding the better way. What we would like is a program to which we can feed a tune and a set of lyrics, and which will return an assessment of how easy it would be to sing the words to the given tune, and if possible tell us how we can 'cheat', as in the above example, to improve the quality of the fit.

Most of the examples in this report are taken from ballads written to the tune of Packington's Pound, to which many sets of words have been written [3].

[1] Intonation will not be considered within the scope of this project, although it is closely related to Primary Word Stress (see the section on Primary Word Stress). See Hogg & McCully (1987).

[2] Conveniently, limericks exhibit similar properties to those in song lyrics in terms of word rhythms.

[3] See the Appendix for more information.

Characterisation of Lyrical Features

"The unit of musical form ... is the proportional melody; and that most certainly took shape under the controlling influence of the metrical structure of the words to which it was united"

- Cecil Sharp

In this section I intend to explain how features are extracted from the text to obtain useful information about the rhythmic structure of the words and find out how it may be possible to fit them to any particular tune.

We have seen how intuitive conditions such as the number of words or syllables in a line aren't sufficient to guarantee a good fit, so we must look for a deeper relationship between a tune and its lyric. At first glance the only perceived structure in lyrics is that of verses and refrains (or choruses), and to a great extent a song is defined by its verse structure, such that if it is known then further verses can be written to fit.

At this stage, therefore, a primary goal would be to be able to compare two sets of lyrics: the first is known to conform to the required structure of the song, and we want to know whether the second also conforms to this structure. Nothing else is known or assumed about the tune, so we require a method for determining the likeness of each lyric. Scaling down the problem: given two individual lines, can they be sung to the same tune?

Metrical Analysis

Whilst carrying out the groundwork for the project, I came across Dobson and Harrison's (1979) collection of Mediæval English songs, which confirmed the supposition that generally each syllable corresponds to a note in the tune, and so we can regard the syllable as our atomic unit [4]. It appeared at the time that the sufficient condition for fit is that the number of syllables match the number of notes [5].

[4] Although elision and prolongation of syllables are frequently performed, either for effect or to improve the fit. Elision is the removal of a syllable to join two words or abbreviate one, e.g. I'm, e'er, he's, whereas prolongation is the insertion of an extra syllable to lengthen the word, e.g. air pronounced ay-er.

[5] It wasn't until later that I found that this is neither a sufficient nor even a necessary criterion.

In the following example, taken from the Mediæval song 'I sang of a myden', Dobson concludes that, although printed without music, it was intended to be sung to mensural music, i.e. such that each verse had roughly the same metrical structure, with syllable count 6566.

He cam also stillë          He cam al-so still-ë
þer his moder lay           þer his mo-der lay
asë deaw in April           a-së deaw in A-pril
þat falleþ on þe spray      þat fall-eþ on þe spray

The first three lines alternate stressed and unstressed syllables, whilst the fourth inverts this pattern. The analysis of stress patterns is termed metrical analysis, and is an area of Prosody, the analysis of speech rhythms. Stressed and unstressed syllables can be grouped together into a metrical foot, some of which are shown below (long and short syllables correspond respectively to stressed and unstressed), which in turn are grouped into a metrical rhythm. For example, the first three lines of the above verse are trochaic and the last line iambic.

Dobson asserts that Mediæval song is characterised by strict metre, and metrical rhythms are preserved between verses (and mirrored by the rhythm of the tune). Variations are allowed, such as replacing a trochee with a dactyl, where one unstressed syllable can be replaced with two, since the rhythmic structure is determined predominantly by the stressed syllables. If these characteristics filter through into folk song then this would provide a simple method of assessing the rhythm of the text and would allow for direct comparison of two sets of lyrics.

A mapping is therefore needed between text and syllable length. The property which dictates length appeared to be the vowel in a syllable, producing short syllables (hat, kit, ant) and long syllables (far, thou, try); anything indeterminate we classed as 'medium'. Douglas was initially involved with the mapping algorithm and I took a list of long, short or medium syllables as the input to my algorithms.

Analysis of Metrical Deviance

Given two lists, each element representing the length of a syllable, I had to find a method of comparing them. The method had to take account of the property that a couple of short syllables can be 'absorbed' into one, as described above. I chose to represent these lists numerically, ascribing values to short, medium and long syllables, such that the long syllables carry the stress and have most influence on the structure.

Now that we have two lists of numbers to be compared, the first and most obvious method was simply to find the Integral Squared Error (ISE) of the two lists, normalised by the number of syllables, which I termed metrical deviance, as a measure of likeness of two syllable patterns.
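
The metrical deviance calculation can be sketched as follows. This is a minimal illustration in present-day OCaml, not the project's code: the numeric values given to short, medium and long syllables, the zero-padding of the shorter list, and the names weight, pad_to and metrical_deviance are all assumptions, since the report does not state them.

```ocaml
(* Syllable lengths, as produced by the text-to-length mapping. *)
type length = Short | Medium | Long

(* Assumed numeric values; the report does not state the ones it used. *)
let weight = function
  | Short -> 1.0
  | Medium -> 2.0
  | Long -> 3.0

(* Pad a list with zeros up to length n (assumed handling of unequal lists). *)
let rec pad_to n xs =
  if List.length xs >= n then xs else pad_to n (xs @ [ 0.0 ])

(* Metrical deviance: summed squared error, normalised by syllable count. *)
let metrical_deviance a b =
  let n = max (List.length a) (List.length b) in
  if n = 0 then 0.0
  else
    let xs = pad_to n (List.map weight a) in
    let ys = pad_to n (List.map weight b) in
    let sq_err acc x y =
      let d = x -. y in
      acc +. (d *. d)
    in
    List.fold_left2 sq_err 0.0 xs ys /. float_of_int n

(* Two similar rhythms; the second has one extra short syllable. *)
let reference = [ Long; Short; Short; Long ]
let with_extra = [ Long; Short; Short; Short; Long ]
```

Under these assumed values, metrical_deviance reference reference is zero, while metrical_deviance reference with_extra is large, because every element after the inserted syllable is compared against the wrong partner.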

Graphically this encoding corresponds to a Pulse Height Modulation (PHM) scheme, but this method failed in the cases where an extra short syllable was (legally) inserted (cf. 'Smithyson'), since the lists no longer aligned and an error was introduced. As an alternative I tried Pulse Width Modulation, where temporal information is encoded by the length of the pulse; this gives better results when extra syllables are present but takes no account of giving priority to the stress patterns. As a hybrid approach I tried Pulse Area Modulation, where the length of the syllable is represented by both the height and the length of the pulse. This gave erratic results from which it was impossible to obtain useful information.

[Diagram: encodings of two syllable sequences, (1) and (2). The arrows show the correspondences between each pattern. An extra short syllable occurs in (2). The metrical rhythms are similar, which should be mirrored in the data representations, to minimise the ISE.]

At this stage I noticed that occasionally two lines can be sung to the same tune but can have different syllable counts, and rather than one line having a replaceable trochee somewhere in the middle, as we have been accounting for, it can have a couple of extra syllables at the beginning or at the end. As an example take lines from a couple of Packington's Pound ballads:

(you) free-men and masters, and prentices [mourn]
(But of a) suffering mayor who may pass for a [mar-tyr]

The parenthesised words are the words that would be sung before the start of the first bar, that is, on the anacrusis beat, so that the following word would be sung on the first beat of the first bar. The words in brackets are the words that are sung on the last beat. The text in the middle of all the lines is uniform in having nine syllables. This holds true for many (but not all) other lines taken from these ballads. By now the theory of strict metrical structure in terms of iambic and trochaic patterns looks inappropriate, as there seems to be little correlation between syllable length and the length of the note to which it is to be sung [6].

[6] See the appendix for results of this analysis.

Having found this new apparent structure I sought to modify my initial metrical deviance algorithm. I decided to use the PHM encoding as this didn't allow for the replaceable trochee property, whilst still allowing for emphasis of stress patterns. The new algorithm I wanted was one which could iteratively remove the first few syllables and compare against the reference, and then remove the last few syllables and compare again. The best score in terms of metrical deviance would then suggest the correct number of syllables before and after the middle section. I termed these properties head and tail deviances. This approach gave better results for the phrases with extra head and tail syllables, but was too lenient with bad fits, by compensating for phrases of different lengths.

[7] Interestingly, Cook (1987) describes the Meyerian approach to music analysis in terms of metrical rhythm, grouping downbeats and upbeats into metrical feet, applied hierarchically to reduce music eventually into a single metrical foot.

[8] As a result, many songs have a number of lines which are very bad fits, but which are required for continuity of the story. Scansion is to be sacrificed before content, it seems.

[9] See anacrusis diagram, page 17.
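
The head and tail search described above might look something like the following sketch, written in present-day OCaml. It is an assumption-laden illustration rather than the project's algorithm: it tries every head length in a single pass, takes the tail to be whatever remains once the middle section matches the reference in length, and uses hypothetical names (deviance, best_head_tail) and numeric syllable values.

```ocaml
(* Mean squared difference of two equal-length lists of syllable values. *)
let deviance xs ys =
  let n = List.length xs in
  if n = 0 then 0.0
  else
    let sq_err acc x y =
      let d = x -. y in
      acc +. (d *. d)
    in
    List.fold_left2 sq_err 0.0 xs ys /. float_of_int n

(* Drop the first k elements of a list. *)
let rec drop k xs =
  match xs with
  | _ :: rest when k > 0 -> drop (k - 1) rest
  | _ -> xs

(* Keep the first k elements of a list. *)
let rec take k xs =
  match xs with
  | x :: rest when k > 0 -> x :: take (k - 1) rest
  | _ -> []

(* Try each possible head length and keep the alignment with the lowest
   deviance. Returns Some (head, tail, deviance), or None when the line
   is shorter than the reference. Ties keep the shortest head. *)
let best_head_tail reference line =
  let m = List.length reference and n = List.length line in
  let rec search head best =
    if head > n - m then best
    else
      let d = deviance reference (take m (drop head line)) in
      let best =
        match best with
        | Some (_, _, best_d) when best_d <= d -> best
        | _ -> Some (head, n - m - head, d)
      in
      search (head + 1) best
  in
  search 0 None
```

Because the search always reports its best alignment, a line that fits nowhere still receives a head and tail count, which is one way of seeing why such a scheme can be too lenient with bad fits.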

Unfortunately, in analysing the head and tail sections I had missed the crucial issue: I had surmised that the placement of these extra syllables was arbitrary. How could an extra syllable sometimes be placed at the beginning and sometimes at the end? When singing each phrase to the Packington's Pound tune it seemed intuitively obvious where the extra beats should occur. However, when faced with a phrase of similar syllabic structure there are cases where it is very hard to sing the phrase to the tune without the addition or elision of a number of syllables.

Additionally, I realised that metrical rhythm in terms of long and short syllables was irrelevant in this context [7]: folk music (and virtually all other post-Mediæval music) places importance on lyrical content [8], whereas writing songs with strict metrics is rather an academic discipline perhaps best left to hymn writers.

Analysis of Caesura

One useful concept, again borrowed from Prosody, is the notion of a caesura. A caesura is a pause in the middle of a line in a poem or piece of prose. For most of the songs in our data set there are very definite occurrences of caesura, as denoted by a punctuation mark in the middle of the line. Location of this point should then provide an idea of how many syllables of each half are invariant, and how many are head or tail sections. Motivation for this new theory came with the discovery that the caesura usually occurs before the anacrusis beat in the second half of the first line, providing a halfway point in each line [9].

Unfortunately not every line contains explicit punctuation (probably fewer than half do for the Packington ballads). In speech and singing, however, subconscious pauses are often inserted at specific points, and in these instances the pause can represent the caesura. So we need to determine the location of these notional pauses.
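
For the lines that do carry explicit punctuation, locating the caesura can be sketched as below in present-day OCaml. The set of punctuation marks, the choice of the first mid-line mark, and the names is_pause_mark, words and explicit_caesura are assumptions made for illustration.

```ocaml
(* Punctuation assumed to be able to mark a mid-line pause. *)
let is_pause_mark c = List.mem c [ ','; ';'; ':'; '.'; '!'; '?' ]

(* Split a line on spaces, discarding empty fields. *)
let words line =
  List.filter (fun w -> w <> "") (String.split_on_char ' ' line)

(* Number of words before the caesura, taken to be the first word other
   than the last that ends in a pause mark. None means the line has no
   explicit mid-line punctuation. *)
let explicit_caesura line =
  let ws = words line in
  let last = List.length ws - 1 in
  let rec scan i = function
    | [] -> None
    | w :: rest ->
        if i < last && is_pause_mark w.[String.length w - 1] then Some (i + 1)
        else scan (i + 1) rest
  in
  scan 0 ws
```

A line with only end-of-line punctuation yields None, which corresponds to the case where the notional pause has to be found by other means.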

Grammatical Analysis

Orators and singers introduce pauses into text to break up sentences into more manageable and meaningful chunks, and any sentence may be made up of a number of phrases, each of which may be a sentence in its own right, joined together with conjunctions or other grammatical devices, e.g.

Jack eats an apple and the cat sleeps on the mat if it is cold outside.

Commas could also be inserted before the 'and' or 'if' to assist the speaker in punctuating and phrasing the sentence, but a speaker would normally insert them automatically. A pause may also be inserted for musical or dramatic effect. To understand how sentences may be broken up it is necessary to have information about the grammatical structure of the text being analysed.

The Lexicon

We need to be able to map input text onto a grammatical framework. The first stage of this mapping is to obtain the form of a given word, be it noun, verb, etc. This facility is provided by the lexicon. We were extremely fortunate in finding a usable lexicon in the form of the Oxford Advanced Learner's Dictionary (OALD) [10], which we would like to use as a grammatical look-up table.

[10] The OALD is an ASCII text file which gives useful information about the form of a word, its pronunciation, frequency and syllable count. There are some 70,000 entries, including a large number of English towns and other proper nouns.

[11] The pronunciation turns out to be useful later on for finding the primary word stress.

For reasons of sanity whilst testing I wrote a fast lookup algorithm using an index file based on an enumeration of the initial two letters of the desired word, which positions the file pointer to within about 0.01 to 1% of the correct place in the file, yielding fast results even under MS-DOS. The raw lookup procedure returns the record of the word once it has been found in the file. This is a string looking like the following:

wish     wIS      J1%,M7%  1  2A,3A,6A,7A,9,12A,13A,15A
wished   wISt     Jc%,Jd%  1  2A,3A,6A,7A,9,12A,13A,15A
wishes   'wISIz   Ja%,Mj%  2  2A,3A,6A,7A,9,12A,13A,15A
wishful  'wISf@l  OA%      2
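
The two-letter index idea can be sketched in memory as follows, in present-day OCaml. This is only an illustration under stated assumptions: the real program indexed a file rather than an array, headwords are taken to be sorted and to begin with two lowercase letters, and the names letter_code, key, build_index and lookup are hypothetical. The "bat" and "cat" entries are invented placeholders; the four "wish" entries carry the word-form field from the records above.

```ocaml
(* Code 0..25 for the letter at position i; non-letters and missing
   letters are treated as 'a' (an assumption made for brevity). *)
let letter_code word i =
  if i < String.length word then
    match Char.lowercase_ascii word.[i] with
    | 'a' .. 'z' as c -> Char.code c - Char.code 'a'
    | _ -> 0
  else 0

(* Enumerate the first two letters of a word as a number in 0..675. *)
let key word = (26 * letter_code word 0) + letter_code word 1

(* For each key, record the position of the first entry that has it.
   The entries must be sorted by headword; -1 marks an unused key. *)
let build_index entries =
  let index = Array.make (26 * 26) (-1) in
  Array.iteri
    (fun pos (word, _) ->
      let k = key word in
      if index.(k) < 0 then index.(k) <- pos)
    entries;
  index

(* Jump to the indexed position, then scan forward within that key. *)
let lookup index entries word =
  let k = key word in
  let rec scan pos =
    if pos < 0 || pos >= Array.length entries then None
    else
      let w, record = entries.(pos) in
      if key w <> k then None
      else if w = word then Some record
      else scan (pos + 1)
  in
  scan index.(k)

(* Sample data: "bat" and "cat" are hypothetical placeholder entries. *)
let entries =
  [| ("bat", "(hypothetical record)");
     ("cat", "(hypothetical record)");
     ("wish", "J1%,M7%");
     ("wished", "Jc%,Jd%");
     ("wishes", "Ja%,Mj%");
     ("wishful", "OA%") |]

let index = build_index entries
```

With this data, lookup index entries "wished" scans only the entries beginning "wi", and a word whose first two letters have no entries fails immediately.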

Each record contains information on pronunciation, word form (i.e. noun, verb, etc.), rarity and syllable count. Of all these fields, for the present I am only concerned with the word form [11]. I have found that the following base set of forms provides a sufficient amount of information for further processing:

type WordType = ExtVerb | AnomVerb | Verb | Noun
              | Adjective | Adverb | Pronoun | ProVerb
              | DefArt | InDefArt | Prepos | Conj | Interj
              | Unknown;;

In addition to the usual forms I have acknowledged the irregular behaviour of the verb to be. Conjugations of to be (is, am, was, were) are classed as Existential Verbs. Different behaviour is also exhibited by verbs such as have, will, should, which I have termed Anomalous Verbs. I have extended the usual class of pronouns (he, she, I) to include proper nouns, reflecting the use of phrases such as 'London is nice' and 'Fred has fleas' whilst not permitting the erroneous cases 'I need a Manchester' and 'I hit the Albert'. The class ProVerb contains compound words where an existential or anomalous verb has been contracted with a pronoun (I'm, they've, we'll, he'd).

Despite having a 70,000 word lexicon there are a few cases when a word isn't found. In folk song, local and personal references abound, and due to the age of many of the texts, antiquated words and dialectal phrases often appear. Fortunately most of these cases are detectable from simple prefix/suffix rules as shown in the table below:

Case                  Form                Example

ends in ing           Verb or Adjective   the vicar is running; the running vicar
ends in eth           Verb                the vicar runneth
ends in ed            Verb or Adjective   he disembowelled the dog; the disembowelled dog
ends in ly            Adverb              he went sheepishly
ends in er            Noun                the tanner
ends in 's            Adjective           John's dog
ends in 'd            Verb or Adjective   the mirror crack'd
starts with capital   Pronoun             Ethelred, Ramsbottom
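
The rules in the table can be sketched as a fallback function for words missing from the lexicon, here in present-day OCaml (so type names are lowercase). The report does not say in which order the rules were tried; the precedence below is an assumption, chosen only so that the table's own examples come out as listed, and the names ends_with and guess_forms are hypothetical.

```ocaml
(* The subset of word forms that the table can produce. *)
type word_type = Verb | Noun | Adjective | Adverb | Pronoun | Unknown

(* Does word end with suffix? *)
let ends_with suffix word =
  let n = String.length word and k = String.length suffix in
  n >= k && String.sub word (n - k) k = suffix

(* Guess the possible forms of a word that the lexicon does not contain.
   Assumed precedence: apostrophe endings, then a leading capital, then
   the remaining suffixes. *)
let guess_forms word =
  if word = "" then [ Unknown ]
  else if ends_with "'s" word then [ Adjective ]
  else if ends_with "'d" word then [ Verb; Adjective ]
  else if word.[0] >= 'A' && word.[0] <= 'Z' then [ Pronoun ]
  else if ends_with "ing" word then [ Verb; Adjective ]
  else if ends_with "eth" word then [ Verb ]
  else if ends_with "ed" word then [ Verb; Adjective ]
  else if ends_with "ly" word then [ Adverb ]
  else if ends_with "er" word then [ Noun ]
  else [ Unknown ]
```

Checking the capital letter before the plain suffixes keeps Ethelred a Pronoun even though it ends in ed, while checking 's first keeps John's an Adjective.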

Grammatical Inference

Take the two sentences:

John bats like rice pudding
Hungry bats like rice pudding

It is obvious that ambiguity arises between the cases, as both the words bats and like have two different forms. A lookup in the dictionary would return bats as both a noun and a verb, so this alone is not sufficient to deduce the form of the word. Clearly the part of speech to which the word belongs is context-dependent. So how do we know which form to infer from the sentence? In the above case, looking at the respective words preceding bats we can see that John is (effectively) a pronoun and we know that hungry is an adjective. We could therefore establish rules to infer bats to be respectively verb and noun.

I needed to reduce a sentence into its grammatical components, resolving the indeterminacy between possible cases. I created a new data type declaring definiteness as the inverse of ambiguity:

type Definiteness = Definite of WordType | Probably of WordType
                  | Maybe of WordType list | DefUnknown;;

Following a lookup from the lexicon, if there is only one form then it is returned as Definite, otherwise the list of forms is returned as Maybe. The most desirable situation is to have a list of Definite which has no ambiguity. Unfortunately this rarely happens and most sentences have at least one word with an unknown form. To solve this problem, I adopted a concept of type promotion to reduce indeterminacy, where ambiguous words start as Maybe and are promoted to a Probably if a particular form can be inferred from the surrounding words.
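
The step from lexicon lookup to definiteness can be sketched as below in present-day OCaml, which requires lowercase type names. The function name of_lookup, the reduced set of word forms and the treatment of an empty lookup result as DefUnknown are assumptions.

```ocaml
(* A reduced set of word forms, enough for the example. *)
type word_type = Verb | Noun | Adjective | Pronoun

type definiteness =
  | Definite of word_type
  | Probably of word_type
  | Maybe of word_type list
  | DefUnknown

(* Classify the list of forms returned by a lexicon lookup: a single
   form is Definite, several are Maybe, and none at all is assumed to
   map to DefUnknown. *)
let of_lookup = function
  | [] -> DefUnknown
  | [ form ] -> Definite form
  | forms -> Maybe forms
```

For example, a lookup of bats that returns both Noun and Verb gives Maybe [Noun; Verb], whereas hungry, with the single form Adjective, is Definite from the outset.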

Rather than use a complex single pass Left-Right (LR) inference approach, I used an island-driven strategy based on multiple passes, each looking for different features. The general principle was to grow connected areas of Definite or Probably outwards from where such areas already existed, by using them to infer the types of neighbouring uncertain words. The fundamental function here was called infer:

(* infer : WordType -> Definiteness -> Definiteness *)
(* Promote a Maybe to Probably gtype when gtype is one of its candidate forms *)
let infer gtype = function
    Maybe types -> if (mem gtype types) then Probably gtype else Maybe types
  | x -> x;;

This simple function tries to infer a given type for a given word: if we know that bats can be a noun or a verb, then we can only infer noun or verb. In such an event its definiteness is promoted to a Probably. For the purposes of the rest of the analysis, a Probably is treated in the same way as a Definite (and is deemed to be certain). In total, nine different inference rules are applied to the input in an infer cycle. Each stage applies a particular rule to try to resolve any ambiguous forms. If any words are promoted then on the pass of the next function they assume certain status and can be used to infer other surrounding words.
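
The way promotions made by one rule become available to the next can be sketched as a fold over a list of rules. The version below is in present-day OCaml and is only an illustration: infer_cycle, certain_type and the toy rule verb_after_pronoun are hypothetical names, and the toy rule is not one of the report's nine.

```ocaml
type word_type = Verb | Noun | Adjective | Pronoun

type definiteness =
  | Definite of word_type
  | Probably of word_type
  | Maybe of word_type list
  | DefUnknown

(* Promote a Maybe to Probably when the wanted form is a candidate. *)
let infer gtype = function
  | Maybe types when List.mem gtype types -> Probably gtype
  | other -> other

(* Definite and Probably are both treated as certain by later rules. *)
let certain_type = function
  | Definite t | Probably t -> Some t
  | Maybe _ | DefUnknown -> None

(* A rule maps a sentence to a sentence. A cycle applies each rule in
   turn, so words promoted by one rule count as certain for the next. *)
let infer_cycle rules sentence =
  List.fold_left (fun s rule -> rule s) sentence rules

(* Toy rule, for illustration only: the word directly after a certain
   Pronoun is inferred to be a Verb if that is one of its forms. *)
let rec verb_after_pronoun = function
  | a :: b :: rest when certain_type a = Some Pronoun ->
      a :: verb_after_pronoun (infer Verb b :: rest)
  | a :: rest -> a :: verb_after_pronoun rest
  | [] -> []
```

Applied to the first two words of 'John bats like rice pudding', infer_cycle [verb_after_pronoun] turns [Definite Pronoun; Maybe [Noun; Verb]] into [Definite Pronoun; Probably Verb].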

The effects of some of these functions are shown below. In this notation square brackets mean 'optional' and the asterisk suffix means 'zero or more'. The prefixes represent the definiteness of the word: D is definite, C is certain (definite or probably) and I is inferrable (i.e. of type maybe, one form of which is the desired form). Also shown are examples of where the situation might occur, with the matching sections in italics and the inferred words in bold.

infer_verb_from_anom: I should bite
    (C AnomVerb) (I Verb) -> (C AnomVerb) (C Verb)

infer_verb_phrase_from_pronoun: He almost always runs fast
    (C Pronoun) [C Adverb | I Adverb]* (I Verb) -> (C Pronoun) [C Adverb]* (C Verb)

infer_noun_phrases: The happy green man
    (D DefArt | D InDefArt) [C Adjective | I Adjective]* (I Noun) -> (D DefArt | D InDefArt) [C Adjective]* (C Noun)

infer_adj_from_noun: The happy green man
    [C Adjective | I Adjective]* (C Noun) -> [C Adjective]* (C Noun)

The last two cases differ crucially in the fact that in the former the certain or definite forms occur first, and terms to the right are inferred, whereas in the other the term on the right is known, so words are inferred to the left, requiring the list of words to be presented backwards. For this reason it isn't possible to have an LR evaluation strategy. On average the running time of this algorithm will be slightly worse than O(n), but n is constrained externally to an upper bound of around ten.
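
The difference between the left-to-right and right-to-left rules can be sketched as follows in present-day OCaml. This is an illustration of two of the rules as they are written in the notation above, not the project's code; is_certain and the way the backward pass is organised are assumptions.

```ocaml
type word_type = Verb | Noun | Adjective | Pronoun | AnomVerb | DefArt

type definiteness =
  | Definite of word_type
  | Probably of word_type
  | Maybe of word_type list
  | DefUnknown

(* Promote a Maybe to Probably when the wanted form is a candidate. *)
let infer gtype = function
  | Maybe types when List.mem gtype types -> Probably gtype
  | other -> other

(* Is the word certainly (Definite or Probably) of form t? *)
let is_certain t = function
  | Definite u | Probably u -> u = t
  | Maybe _ | DefUnknown -> false

(* Left to right: (C AnomVerb) (I Verb) -> (C AnomVerb) (C Verb). *)
let rec infer_verb_from_anom = function
  | a :: b :: rest when is_certain AnomVerb a ->
      a :: infer_verb_from_anom (infer Verb b :: rest)
  | a :: rest -> a :: infer_verb_from_anom rest
  | [] -> []

(* Right to left: [C Adjective | I Adjective]* (C Noun).
   The sentence is reversed so that the known noun is met first; the
   words that follow it in the reversed list are promoted to Adjective
   until one is met that cannot be an adjective. *)
let infer_adj_from_noun sentence =
  let rec after_noun = function
    | w :: rest ->
        let w' = infer Adjective w in
        if is_certain Adjective w' then w' :: after_noun rest
        else scan (w :: rest)
    | [] -> []
  and scan = function
    | w :: rest when is_certain Noun w -> w :: after_noun rest
    | w :: rest -> w :: scan rest
    | [] -> []
  in
  List.rev (scan (List.rev sentence))
```

For 'The happy green man', with green ambiguous between adjective, noun and verb, the backward pass starts from man and promotes green to Probably Adjective, then stops at The.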

The actions of these functions are implicitly probabilistic, as they are performed in such an order as to infer the most likely types first, for example giving priority to declarative sentences over interrogative sentences as they occur more often. Of course, although the accuracy of this algorithm was surprisingly good, there will inevitably be anomalous cases, particularly due to antiquated expressions, e.g. "this is the way for to fill up our purse". An alternative approach might have resolved the ambiguity of a single Maybe into a number of certain sentences, one for each inferrable form of the word; however, this produces a list of sentences which may not all be grammatically correct. The idea of the approach used here is that the uncertainty is resolved within grammatical constraints to obtain the most likely grammatical form of the input sentence.

[12] Woods (1984) uses the same concept for his Augmented Transition Networks.

Word Binding Order

Once our grammatical framework is in place, we require a set of rules for finding possible caesura points. We have seen one way of being able to split a sentence into sub-phrases by the use of conjunctions. This is a fairly logical place to start as conjunctions often act as semantic delimiters. Similar behaviour is exhibited by interjections ("Oh London, Oh London", "You people") and so we can mark a possible caesura point before an interjection or conjunction.

In order to evaluate other grammatical cases I introduced a concept of word binding to determine which parts of a phrase are separable and which are strongly bound together. The given line of n words is transformed into a list of (n-1) pairs of words, where each word is paired with its neighbour. The binding is evaluated for each pair of words; each pair is initialised to SpMaybe before being promoted to either Glue or Split, whilst the last pair of words is marked as EndPt.

type SplitBind = Split | SpMaybe | Glue | EndPt;;
type SplitPair == (SplitBind * Definiteness * Definiteness);;

After the simple conjunction/interjection cases I afforded first consideration to the glue operator, as this is often easier to place than the split operator. For example it is rare to split the phrase "John plays the guitar" as it is an atomic sentence, that is, the most fundamental form of declarative sentence which makes grammatical sense. "John plays the" is not a sentence in its own right, as "the" requires a noun phrase following it. As a contrary example, the phrase "John plays the guitar on Saturdays" could be split before the "on", as this is a sentence made up of a sentence and a prepositional phrase. Therefore we might say that the sequence of words "John plays the guitar" has a higher binding order than the words "guitar on".

By evaluating the pairs of words which can be bound, we can group together sequences of words to form phrases, surrounded by splits. I formulated around eighteen rules to determine possible binds and splits, based mostly on analyses of the semantics and phrasing of sentences. By this stage I had investigated some of the work of Chomsky, who has done extensive research into grammatical analysis. In Syntactic Structures he proposes a notation for constituent analysis which is the development of a Context Free Grammar for language [12]. The idea was to be able to have a list of rules or derivations which would group the sentence into cohesive units, e.g. (where S = sentence, NP = noun phrase, etc.):

S -> NP + VP
S -> S + C + S
NP -> T + N
VP -> V + NP
T -> the
N -> man, ball, etc.
V -> hit, took, etc.
C -> and, so, etc.

These rules can generate a set of simple declarative sentences. Initially this appeared to be useful, but the main caveat is that the grouping is rather arbitrary: with the second rule, for example, it is not clear to which sentence the conjunction is bound, whereas in speech the phrasing would almost certainly be:

the man hits the ball, and the woman catches the ball

as opposed to:

the man hits the ball and, the woman catches the ball.

Chomsky's rules make no allowance for this effect [13]. Fortunately a great deal can be done on the fundamental level of considering each pair of words at a time. Below is a summary of the rules I formulated, where the underscore character represents a wildcard. The rules nearest to the top are evaluated first.

[13] This is no doubt due to the fact that this type of notation was used as a transformational grammar, which was more concerned with transforming statements into questions, past tense into present tense, etc. An extension to the notation, however, could perhaps allow for the binding requirements.

Split              Glue
_ Conj             DefArt _
_ Interj           InDefArt _
_ Prepos           AnomVerb Verb
_ Pronoun          Adverb (Verb|Adverb)
_ Proverb          Adjective (Adjective|Noun)
Noun Adjective     Pronoun (Verb|Adverb)
Interj _           Conj _
                   Prepos _
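As a rough illustration of how such a table might be applied to neighbouring pairs, the Python sketch below encodes a subset of the rules as (left, right, outcome) patterns and takes the first match. The interleaving of the split and glue rules, the function name and the omission of word definiteness and of the EndPt marker are all simplifying assumptions.

```python
from typing import List, Optional, Set, Tuple

Rule = Tuple[Optional[Set[str]], Optional[Set[str]], str]

# None stands for the underscore wildcard. Only a subset of the rules is shown,
# and the relative order of split and glue rules is an assumption.
RULES: List[Rule] = [
    (None, {"Conj", "Interj"}, "Split"),             # _ Conj, _ Interj
    ({"DefArt", "InDefArt"}, None, "Glue"),          # DefArt _, InDefArt _
    ({"AnomVerb"}, {"Verb"}, "Glue"),                # AnomVerb Verb
    ({"Adverb"}, {"Verb", "Adverb"}, "Glue"),        # Adverb (Verb|Adverb)
    ({"Adjective"}, {"Adjective", "Noun"}, "Glue"),  # Adjective (Adjective|Noun)
    ({"Pronoun"}, {"Verb", "Adverb"}, "Glue"),       # Pronoun (Verb|Adverb)
    ({"Prepos"}, None, "Glue"),                      # Prepos _
    (None, {"Prepos"}, "Split"),                     # _ Prepos
]

def bind_pairs(types: List[str]) -> List[str]:
    """Return one binding per neighbouring pair of words (n-1 values for n words)."""
    bindings = []
    for left, right in zip(types, types[1:]):
        result = "SpMaybe"  # every pair starts out undecided
        for lefts, rights, outcome in RULES:
            if (lefts is None or left in lefts) and (rights is None or right in rights):
                result = outcome
                break  # rules nearest the top win
        bindings.append(result)
    return bindings

# "John plays the guitar on Saturdays"
types = ["Noun", "Verb", "DefArt", "Noun", "Prepos", "Noun"]
bindings = bind_pairs(types)
```

With these assumed rules the only Split falls between "guitar" and "on", which agrees with the prepositional-phrase example discussed earlier.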

Finding the Caesura

We are now able to map a line of text to a list of words with various permissible split points. Inevitably, though, more than one such point will be indicated, but we only want a single split point, as the vast majority of songs we have encountered can only be split once. A simple solution is to pick the one nearest the middle. Unfortunately there are two different midpoints, one according to the number of words (hence the semantic mid-point) and the other based on the syllable count (the rhythmic mid-point). Which do we pick? Grammatically a caesura point seems to occur at about the halfway point of the number of words. However, in prosody (and more specifically, in lyrics) the caesura point usually appears at around half the syllable count.

[14] See the results section for the other data.

The main motivation for trying to find the split point through grammatical analysis, rather than just looking for halfway points, stems from the anacrusis phenomenon, where a few syllables can occur before the start of the bar, and from extra syllables appended to the ends (cf. head and tail deviance); since both can be applied independently, this distorts the location of the true centre. If we can find the caesura more formally then it gives us the chance to distinguish between extra syllables in each half and hence helps us to find anacrusis beats. My algorithm for caesura location gave very encouraging results with our sample data set:

you freemen and masters - and prentices mourn
for now you are left - with your charter forlorn
oh London oh London - how couldst thou pretend
against thy defender - thy crimes to defend
I sing of no heretic - Turk or of tartar
but of a suffering mayor - who may pass for a martyr
the youth was beloved - in the spring of his life
before his head turned - with place title and wife

The dashes denote the place where the algorithm thought a grammatical caesura should appear; out of a test data set of 41 lines, the algorithm found 38 caesura points correctly whilst the other 3 appeared to be anomalous cases, for which I was unable to find an obvious caesura point myself [14].
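The selection of a single split from several permissible ones can be sketched as below. This is not the project's algorithm, only an illustration of the 'nearest the rhythmic mid-point' idea; the function name and the convention that a candidate index i means 'split before word i' are assumptions.

```python
from typing import List, Optional

def choose_caesura(syllables: List[int], candidates: List[int]) -> Optional[int]:
    """Pick the candidate split nearest the rhythmic (syllable) mid-point.

    syllables  -- syllable count of each word in the line
    candidates -- word indices i meaning 'a split is permitted before word i'
    """
    if not candidates:
        return None
    total = sum(syllables)
    # |2 * before - total| is zero when the split halves the syllable count.
    return min(candidates, key=lambda i: abs(2 * sum(syllables[:i]) - total))

# "you freemen and masters and prentices mourn": a split is allowed before
# each "and" (word indices 2 and 4).
line = [1, 2, 1, 2, 1, 3, 1]
split = choose_caesura(line, [2, 4])
```

For this line the second "and" is chosen, leaving six syllables before the split and five after it, which matches the first result listed above.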


Musical Integration

There is a very fortunate direct musical analogy with caesura which reflects the rhythmic pause between the words. In the same way that a line of text can often be broken in half, the music to go with the words is commonly a complete (and often even) number of bars, and where an anacrusis occurs (i.e. the tune starts on the last beat of the preceding bar), the last bar finishes a beat early. This also often happens halfway through each line, such that the caesura point correlates directly with the anacrusis beat at the musical halfway point. This discovery provided great reassurance as to the integrity of the method.

The next impetus was to find how the anacrusis notes are determined from the lyrics. It seems intuitive when singing a song when to sing one syllable before the bar and when to sing more, for example:

You | freemen and masters c and prentices | mourn

For | now you are left c with your charter for- | lorn

But of a | suffering mayor c who may pass for a | mar-tyr

The '|' character denotes the start of the first bar and the start of the last bar, and 'c' represents the caesura point. The words between the '|' marks have the same number of syllables (9) in each line, but the first two both have a single syllable at each extremity, whilst the last has three and two respectively. We are now interested in which factors determine how many syllables occur at the beginning and at the end, and in particular why the word forlorn is split across the beat when mourn and martyr are not.

Primary Word Stress

To address the latter problem first: the issues dictating the splitting of a single word across a beat evaded me for a while, and I had originally thought that it was dependent on the number of syllables between the first and the last bar (i.e. between the '|' marks). However, it happens that the phenomenon is a property of the word itself and is independent of the surrounding text. Consider the following words:

   1      1     1          1
design ellipsis elephant precipice

The numbers above the vowels denote the position of the primary stress. At first glance the placement of this primary stress seems somewhat arbitrary, but the theory explaining such word stress is only a recent development. A generic rule for finding the primary stress is given (in Hogg & McCully's notation) as:

$$V \rightarrow [1\ \text{stress}] \;/\; \underline{\quad}\; C_0\,(V'\,(C))\,(V'\,C_0)\,]_N$$

In this expression, $V'$ matches a short vowel (e.g. in hit, hat, hut), $C_0$ matches 0 or more consonants, $C$ matches a single consonant and brackets mean 'optional'. The expression is matched to the phonetic transcription of the word (so that diphthongs are recorded as single vowels and endings such as '-ice' are treated as a vowel and a consonant, 'Is'), and then expanded from the right to match as much of the word as possible: in the above examples it expands to $(C_2)$, $(C_2\,V'\,C)$, $(C\,V'\,C\,V'\,C_2)$ and $(C\,V'\,C\,V'\,C)$ respectively. The stress is then taken by the vowel before the matching section.

[15] However, in the OALD, words such as present are recorded as two separate entries, with different pronunciation fields for each, and under time constraints I have not allowed for this fact, and so present is regarded as a noun.

[16] As a counter-example, Reggae places the emphasis on the off beat.

Hence the primary word stress is dictated by the long and short vowels in a word. For example, if we replace each internal vowel of precipice (the phonetic transcription of which is 'presIpIs' in OALD notation) in turn with a long 'ee' vowel, it would become preecipice, preceepice and precipeece, with stress respectively on the first, second and third syllables.

However, this approach only works for nouns (hence the N suffix), as can be seen by the example of the word 'present'. Compare ,pre'sent and 'pre,sent, where the height of the marks denotes stress (so (') is the primary word stress), which represent the verbal and nounal forms of the word respectively. So if we are using present as a verb, the emphasis is on the second syllable, whereas if it is used as a noun, the emphasis is on the first syllable. The key result is that the stressed syllable is the one which is uttered on the on-beat, so a word is split across two bars if the primary word stress doesn't fall on the first syllable.

Fortunately in our lexicon [15], the phonetic transcription gives the position of the primary stresses, and we don't need to use Hogg's technique manually. For words not in the dictionary, the stress point is put arbitrarily at the beginning of the word, which in most cases is fairly reliable.
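A minimal Python sketch of this lookup is given below. The miniature lexicon, its (syllables, stress index) layout and the function names are hypothetical stand-ins for the OALD entries; the default of stressing the first syllable for unknown words follows the text.

```python
from typing import Dict, List, Tuple

# Hypothetical miniature lexicon: word -> (syllables, index of the primary stress).
LEXICON: Dict[str, Tuple[List[str], int]] = {
    "forlorn": (["for", "lorn"], 1),
    "martyr": (["mar", "tyr"], 0),
    "mourn": (["mourn"], 0),
}

def stress_split(word: str) -> Tuple[List[str], List[str]]:
    """Split a word into the syllables before the primary stress and the rest.

    Words missing from the lexicon are treated as one unit stressed at the start.
    """
    syllables, stress = LEXICON.get(word, ([word], 0))
    return syllables[:stress], syllables[stress:]

def crosses_bar(word: str) -> bool:
    """A word straddles the bar line when its stress is not on the first syllable."""
    upbeat, _ = stress_split(word)
    return len(upbeat) > 0

split_forlorn = stress_split("forlorn")
split_martyr = stress_split("martyr")
```

Here forlorn yields one syllable before the bar line, whereas martyr and mourn begin on the beat, as in the earlier examples.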

Scansion and Musical Stress

The generalisation of the problem of superfluous end syllables is to determine how words lie on the beats of the underlying music. Music has stress patterns dictated by the placement of individual note rhythms in the bar. In Western music, both modern and classical, it is usually the case that musical stress is borne by the on-beats [16], their positions in the bar depending on the time signature of the piece. The emphases occur on these beats, and it turns out that there is a direct correlation between this musical stress and the stress patterns of the text to be sung. From observations of a number of examples, once again there seems to be a set of rules which apply to the majority of cases. Expanding the annotation of the previous examples, where the ^ marks denote the on-beats (the beat numbers are also shown for the first line):

3   1      2   3   1     2      3   1      2   3   1
You ^free- men and ^mas- ters c and ^pren- ti- ces ^mourn

For ^now you are ^left c with your ^char- ter for- ^lorn

But of a ^suff- er- ing ^mayor c who may ^pass for a ^mar- tyr

[17] This example is illustrated graphically in the Adaptive Beat Placement chapter, page 23.

The first observation to make is the tendency for the beats to lie on the most important words of the sentence. These are mostly nouns, but can also be verbs; note that forlorn and suffering also lie on the beat. Again these are semantically important words: the emphasis is on the fact that the mayor is suffering, and so both words take the stress. Unfortunately an analysis of the semantics is well beyond the scope of this project, and perhaps often beyond the immediate concerns of the singer.

The art (or indeed, science) of placing the words at the appropriate places in the bar is known as scansion, yet another useful prosodic concept. The approach that I adopted from this point was to find the words which were most likely to carry the stresses, and find out whereabouts in those words the stress occurred. However, the above observations do not exhaustively hold:

Be- ^fore his head ^turned c with place ^ti- tle and ^wife

Initially it may be expected that head should be afforded some degree of emphasis, rather than turned, but if this were so, then it would require the syllables in the first bar to be extended by an extra note and the syllables in the second to be contracted by one note. This would make the first bar sound rushed and the second hesitant, which would sound unnatural, so we must settle for the emphasis on turned instead [17].

Beat Likelihood Evaluation

All words in a phrase compete for the beat, i.e. to be stressed, and so we must assign priorities such that the most important words are the most likely ones to be stressed. From analysing a large number of examples, I was able to establish a number of heuristic rules for these precedences. Fundamentally there seemed to be a direct mapping between individual words and likelihood, such that nouns and verbs have highest precedence, followed by adjectives and adverbs (which are often the key semantic elements) and then all other forms, except for articles and prepositions (e.g. a, the, in) which have lowest likelihood. I created an abstract data type called StressType to represent the above cases:

type StressType = ProbBeat | PossBeat | UnknownBeat | NotBeat;;

The allocation algorithm had to take account of the uncertainty in the word type, for example, when a word was known to be maybe an adverb, a conjunction or an interjection. A simple way around this problem was to count the number of what I have termed the primary (i.e. likely to be stressed) and tertiary (i.e. very unlikely to be stressed) words, and assign the result by majority voting. Other cases could be resolved by trying to infer more than one particular form from a word, e.g. a noun and a verb (such as 'run'). This simple solution was very effective in removing a large degree of uncertainty from the input phrase. Other functions allotted higher precedence to words which were thought to be place names, and hence worthy of emphasis. Some of the results of this algorithm are shown below:

-you ++freemen -and ++masters -and ++prentices ++mourn

-for now -you are +left -with +your ++charter +forlorn

++I ++sing -of -no ++heretic ++Turk -or -of ++tartar

-but -of -a +suffering ++mayor -who -may ++pass -for -a ++martyr

Boldface denotes the positions of the true beats and the -, +, ++ symbols show the stress ascribed to each word, in ascending order, corresponding to the values NotBeat, PossBeat and ProbBeat. All unmarked words are classified as UnknownBeat. The result of this analysis is a list of StressType which can be thought of as the extent to which we are prepared to let the beat lie on this word. Note that at the moment we are not considering the rhythm of the tune; instead we are trying to ascertain the natural word stress pattern of the input phrase.
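The majority-voting step might look something like the Python sketch below. It reads 'primary' and 'tertiary' as properties of the candidate forms of a single ambiguous word, which is an interpretation; the membership of the three groups, the tie-breaking and the function name are likewise assumptions, and the sketch does not reproduce every marking in the results above.

```python
from typing import Set

# Assumed grouping of word types; the report names the groups but not their
# exact membership.
PRIMARY = {"Noun", "Verb"}
SECONDARY = {"Adjective", "Adverb"}
TERTIARY = {"DefArt", "InDefArt", "Prepos"}

def stress_type(forms: Set[str]) -> str:
    """Vote between the primary and tertiary forms that a word may take."""
    primary = len(forms & PRIMARY)
    tertiary = len(forms & TERTIARY)
    if primary > tertiary:
        return "ProbBeat"
    if tertiary > primary:
        return "NotBeat"
    if forms & SECONDARY:
        return "PossBeat"
    return "UnknownBeat"

run_vote = stress_type({"Noun", "Verb"})                 # e.g. 'run'
hedge_vote = stress_type({"Adverb", "Conj", "Interj"})   # the ambiguous case in the text
```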

For convenience, StressType was enumerated to values between 0.0 and 0.6 (0.6 is used for historical reasons). Once in the numerical domain, visualisation was somewhat easier, and processing using numerical algorithms also became possible. The motivation was thus: if we need to compare a word stress pattern with the rhythm of the tune of the song, then we apply a transformation to the rhythm to obtain a similar probabilistic list, only since the rhythm is known a priori this list can be two-state, i.e. crisp, rather than the noisy approximation generated by the word stresses. The difference between these two lists then provided an estimate of the goodness-of-fit. Hence, we can find the Euclidean distance between a sentence and a rhythm.

This seems to be exactly what we require. Unfortunately two major problems arise with this method:

i) The lists of numbers may be of different lengths, which, as we saw in the case of metrical deviance, causes problems, as Euclidean distance isn't defined for vectors of different dimensions. Head and Tail deviance isn't suitable for this case.

ii) If the number of ProbBeat values differs from the number of beats available, the method makes no attempt to decide which word is most worthy of taking the stress.
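The first of these problems is easy to see in a small Python sketch of the comparison. Only the end points 0.0 and 0.6 of the enumeration are stated in the text; the intermediate values and the names used here are assumed.

```python
import math
from typing import List

# Only 0.0 and 0.6 come from the text; the two middle values are assumptions.
STRESS_VALUE = {"NotBeat": 0.0, "UnknownBeat": 0.2, "PossBeat": 0.4, "ProbBeat": 0.6}

def euclidean(stress: List[float], rhythm: List[float]) -> float:
    """Euclidean distance between a stress list and a two-state rhythm list."""
    if len(stress) != len(rhythm):
        raise ValueError("Euclidean distance needs lists of equal length")
    return math.sqrt(sum((s - r) ** 2 for s, r in zip(stress, rhythm)))

words = ["NotBeat", "ProbBeat", "UnknownBeat", "ProbBeat"]
stress = [STRESS_VALUE[w] for w in words]
rhythm = [0.0, 0.6, 0.0, 0.6]
fit = euclidean(stress, rhythm)
```

A stress list with one syllable more or fewer than the rhythm raises an error here, which is exactly the situation the following sections set out to handle.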


[Figure: Example of two phrases to fit the same rhythm; the second has an extra syllable in the first bar.]

[Figure: Peak sharpening example.]

Adaptive Beat Placement

Clearly what is required is not just a method for finding how similar the stress patterns of the words and music are, but a way of resolving conflicts between lists of different lengths, such that the algorithm can decide to place superfluous syllables at the beginning, middle or end, in order to make the list lengths the same, which subsequently can be used for finding the distance between them.

The first apparent solution to this problem is to use a Dynamic Time Warping algorithm to match the best sections. I worked on methods of pre-processing the stress signal by sharpening any peaks, in an attempt to make the signal two-state, as is the rhythm signal, which would be followed by a DTW to check the correspondence of peaks.

Whilst this method would work well with stress patterns that deviate only slightly from the rhythm, the algorithm is more inclined towards the compression and rarefaction between adjacent stresses, whereas we really want to determine where the beats are placed, rather than evaluate how the rhythm can be stretched to where we may think the beats are.

One other characteristic which precludes the use of DTW is that in this problem, there are many cases when there is little correspondence despite the fact that the words may be a good fit (see the example on page 24).


The algorithm required must be adaptive, as it must be capable of altering the signal to improve its goodness-of-fit. However I could see no easy way of doing this in my likelihood list representation. On further analysis, I found that the rhythm could be characterised by removing the on-beats and listing the number of offbeats in each bar. For example, if 0.6 represents an on-beat, [0.0; 0.6; 0.0; 0.0; 0.6; 0.0; 0.0] becomes [1; 2; 2], by counting the number in each group of zeroes. Although the on-beats have been omitted, their positions are now implicit.
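The conversion to an offbeat list can be written in a few lines of Python. The function name is illustrative, and treating any non-zero value as an on-beat assumes that the input is already two-state.

```python
from typing import List

def offbeat_pattern(signal: List[float]) -> List[int]:
    """Count the offbeats (zeroes) in each group separated by on-beats."""
    pattern = [0]
    for value in signal:
        if value > 0.0:
            pattern.append(0)   # an on-beat closes one group and opens the next
        else:
            pattern[-1] += 1    # another offbeat in the current group
    return pattern

rhythm_pattern = offbeat_pattern([0.0, 0.6, 0.0, 0.0, 0.6, 0.0, 0.0])
```

A signal that ends on an on-beat produces a trailing 0 under this convention, which is the form of the five-element patterns used in the later sections.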

Similarly, if the stress signal is pre-processed (e.g. by peak sharpening) such that it is two-state, then it too can be represented as an offbeat list. Here we have achieved one of our goals: to compress a variable length list into a vector of length determined by the number of bars, and hence we can find a measure of the distance between two vectors. This representation allows us to see the effect of some of the phenomena described previously, such as insertion and elision of syllables.

An extra syllable at the beginning of the phrase above will make the pattern [2; 2; 2], in the middle: [1; 3; 2] and at the end: [1; 2; 3]. We know that it is quite permissible to allow an extra syllable at the beginning or end without affecting the scansion of a phrase, but if an extra syllable occurs within a bar (or if one is removed) it reduces the quality of the fit and the singing of the phrase must compensate for the extra or missing syllable. Hence we can compute a distance between the two vectors that will allow extra start or end syllables without penalty, but differences in the intervening bars incur cost.

This distance provides a quantitative assessment of how good the fit is between the rhythm and the stress pattern using qualitative constraints:

f(b) allows a single degree of freedom for b at zero cost, as u() represents the Heaviside step function. d(b) calculates the distance between the word stress and music offbeat patterns (the difference of which is the vector b: $b_0$ is the initial anacrusis bar and $b_{n-1}$ is the last bar). $k_f$, $k_m$ and $k_e$ are the weights for extra beats in the first, middle and end bars respectively.
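Since the defining equations are not reproduced in this text, the Python sketch below gives one possible form that is consistent with the description: the end bars are given one syllable of slack through a Heaviside step, and every discrepancy in the intervening bars is charged. The exact shape of f, the use of absolute differences and the default weights of 1.0 are assumptions.

```python
from typing import List

def heaviside(x: float) -> float:
    """u(x): 1 for positive x, otherwise 0."""
    return 1.0 if x > 0 else 0.0

def f(b: float) -> float:
    """Assumed form of f: one syllable of difference is free, the excess is charged."""
    excess = abs(b) - 1
    return excess * heaviside(excess)

def distance(words: List[int], music: List[int],
             k_f: float = 1.0, k_m: float = 1.0, k_e: float = 1.0) -> float:
    """Assumed form of d(b) for two offbeat patterns of equal length (at least 2)."""
    if len(words) != len(music) or len(words) < 2:
        raise ValueError("patterns must have the same length of at least two")
    b = [w - m for w, m in zip(words, music)]
    middle = sum(abs(x) for x in b[1:-1])
    return k_f * f(b[0]) + k_m * middle + k_e * f(b[-1])

rhythm = [1, 2, 2]
extra_at_start = distance([2, 2, 2], rhythm)
extra_in_middle = distance([1, 3, 2], rhythm)
extra_at_end = distance([1, 2, 3], rhythm)
```

Under these assumptions the extra syllable at either end costs nothing, whilst the one in the middle bar costs $k_m$, in line with the three cases discussed above.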

This important result is overshadowed, however, by the requirement that we must know which words occur on the on-beats, and we have not yet found a method of resolving ambiguity between a number of possible beat allocations. Take again the example 'before his head turned...' to see how the second on-beat could occur on head or turned and the third could occur on title or wife:


The first graph represents the rhythm values of the tune to be matched, and the second shows the associated beat probabilities. Although the list lengths are the same and non-zero values appear where the beat should lie, it is clear that the insertion of an extra syllable as shown in the third graph would shift the last half of the list one place to the right, incorrectly placing the beats on 'place' and 'and', so clearly we can't just look for coincident peaks. Also there is little we can do in terms of peak sharpening, as the peaks are indistinct, and we risk losing information about 'before' as it is in danger of being swamped by nearby 0.6 values.

The above examples show the complexity of the problem, as two similar vectors can be numerically and graphically very different, and so we require a more innovative algorithm to resolve the differences.

Super-Rhythm Contraction

Instead of trying to enhance the peaks as much as possible to obtain a list containing the right number of peaks, I applied peak sharpening to obvious peaks and then, in order to produce a two-state list, I saturated the remaining values, such that all non-zero values became 0.6. This has the benefit of keeping information about the definite stress points, whilst still representing the more uncertain stress points. From this two-state list it is possible to extract the offbeat pattern, which is quite likely to be longer than the pattern of the corresponding rhythm. In the example, the saturated pattern becomes:

Finding the offbeat pattern vectors gives us [1;0;0;0;1;0;2;0], whilst the song rhythm is [1;2;2;2;0], but since these are of different lengths we are unable to proceed as we have done beforehand. This longer vector I term the super-rhythm, as it has too many components. What we would like is to contract this vector to the same length as the rhythm offbeat pattern, such that we obtain the most likely fit.

[18] In contrast with the dynamic time warping algorithm, which increments cost throughout its execution and hence only considers data on its left, this problem requires consideration of both sides at every stage.

If a super-rhythm list of length n is to be decreased by one, then there are (n-1) possible outcomes, since the contraction is effected by adding two neighbouring elements and incrementing the sum by one (since there is also an implicit on-beat between them). If the list length is to be reduced to length m then the total number of outcomes will be:

$$(n-1)(n-2)\cdots m = \frac{(n-1)!}{(m-1)!}$$

This will clearly rise very rapidly as n increases and m decreases, so at first glance this doesn't appear to be a worthwhile method. However, the upper and lower bounds on n and m are the maximum and minimum number of words in a phrase; the number of syllables in a phrase is constrained externally to be around 8 to 14, so the bounds on these numbers will be approximately 12 and 4 in the worst cases. Such a discrepancy between m and n tends to suggest a bad fit anyway, but in practice such a difference isn't encountered very much unless the tune and words quite obviously don't match.

In the above example, of our 210 condensed offbeat patterns, one will match our required template, but what if the fit isn't exact? In this case we simply find the pattern with the best score, i.e. the shortest distance, calculated by our distance function. Herein lies the simplicity/time complexity trade-off with the method: it prevents the use of a dynamic programming strategy. The cost function is not locally optimal, as the whole of the pattern must be considered [18]. The other caveat is that the cost function can only be evaluated when the list lengths are the same, i.e. when the whole list of patterns has been calculated. However, the method is guaranteed to find the global minimum.
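The exhaustive search can be sketched in Python as follows. The cost function is the assumed form used for illustration (one free syllable in each end bar, unit weights), and the function names are hypothetical. The sketch enumerates each set of dropped on-beats once rather than each ordered sequence of merges.

```python
from itertools import combinations
from typing import Iterator, List, Set

def contract(pattern: List[int], dropped: Set[int]) -> List[int]:
    """Merge across each dropped on-beat; beat i lies between pattern[i] and pattern[i+1]."""
    out = [pattern[0]]
    for i in range(len(pattern) - 1):
        if i in dropped:
            out[-1] += pattern[i + 1] + 1  # the dropped on-beat becomes an offbeat
        else:
            out.append(pattern[i + 1])
    return out

def contractions(pattern: List[int], m: int) -> Iterator[List[int]]:
    """Every way of contracting `pattern` to length m."""
    n = len(pattern)
    for dropped in combinations(range(n - 1), n - m):
        yield contract(pattern, set(dropped))

def distance(words: List[int], music: List[int]) -> int:
    """Assumed cost: one free syllable in each end bar, unit cost elsewhere."""
    b = [w - m for w, m in zip(words, music)]
    ends = max(abs(b[0]) - 1, 0) + max(abs(b[-1]) - 1, 0)
    return ends + sum(abs(x) for x in b[1:-1])

def best_fit(super_rhythm: List[int], rhythm: List[int]) -> List[int]:
    """Contraction of the super-rhythm with the smallest distance to the rhythm."""
    return min(contractions(super_rhythm, len(rhythm)),
               key=lambda p: distance(p, rhythm))

super_rhythm = [1, 0, 0, 0, 1, 0, 2, 0]   # 'before his head turned with place title and wife'
rhythm = [1, 2, 2, 2, 0]
fitted = best_fit(super_rhythm, rhythm)
candidates = list(contractions(super_rhythm, len(rhythm)))
```

Because the order in which on-beats are dropped does not change the resulting vector, this enumeration visits 35 distinct patterns for the example; the figure of 210 corresponds to counting the 7 x 6 x 5 ordered sequences of three merges.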


The real gain here is the power this function provides: it considers individual beat probability and the actual rhythm to be matched, and returns the optimal pattern, which not only lets us compute the quality of the fit, but also tells us how to sing these words to the given rhythm. For example, if we were to find the optimal offbeat pattern of the phrase 'Before his head turned about head, title and wife' (rather than 'with'), which we know to have an extra syllable, then this would be [1;2;3;2;0], compared against [1;2;2;2;0]. From the difference of these vectors, [0;0;1;0;0] (which is also used to compute the score), we can tell that the second bar (remember: the first is the anacrusis bar) has an extra syllable, so three syllables must be sung in the time of two. This information is fed back to the user in terms of the rhythm he/she must sing.

The real beauty of this method, however, is that it reduces the problem of fitting words and music to the minimisation of a cost function, an incredibly effective numerical abstraction.

Sub-Rhythm Expansion

Just as problems arise when too many viable words compete for the beat stresses, the opposite situation occurs when too few words are thought to carry stress. For example:

After peak sharpening the signal will only have three peaks and the offbeat pattern will be [5;2;2;0]. In order to find the distance we must compare it with [1;2;2;2;0], i.e. make both vectors the same length. This suggests the inverse of the previous function, but the solution in fact uses the super-rhythm function. The method is to relax all constraints on beat probabilities, requiring only that the beat must lie on a primary word stress; the above example becomes:


So the beats can lie anywhere apart from the first syllable of although. The super-rhythmcontraction function is then used to find the vector of minimum cost; in this case [2;2;2;2;0]which has zero distance, since we allow an extra initial syllable at no extra cost.

One caveat of this approach is that if way were to be replaced with meth-od, then the last two beats would shift by one place and fall on [to, our], giving zero cost and hence suggesting that this fit was good, which it isn't, as we know that [fill, purse] should take precedence.

One way of circumventing this problem is to make the super-rhythm contraction function more critical: rather than just choosing the pattern of minimal cost, it penalises each score by a number proportional to the number of unexpected beat placements, found by comparing each beat-coincidental word with the previously calculated beat probability for that word. So to and our would be penalised, whilst fill and purse wouldn't; combined with the original score this gives a new assessment of the optimal beat placement, using our original qualitative criteria.
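The penalty described above might be sketched as follows (Python, for illustration only). The threshold, the weight, the base costs and every probability value here are invented for the example and are not the project's figures.

```python
def penalised_cost(base_cost, beat_words, beat_probability, threshold=0.5, weight=1.0):
    """Add a penalty proportional to the number of unexpected beat placements.

    A beat counts as unexpected when the word carrying it has a previously
    calculated beat probability below `threshold` (an assumed criterion).
    """
    unexpected = sum(1 for w in beat_words if beat_probability.get(w, 0.0) < threshold)
    return base_cost + weight * unexpected


# Made-up beat probabilities, for illustration only.
beat_probability = {"to": 0.1, "our": 0.2, "fill": 0.8, "purse": 0.9}

shifted = penalised_cost(0, ["to", "our"], beat_probability)       # zero-cost but wrong
expected = penalised_cost(1, ["fill", "purse"], beat_probability)  # assumed base cost of 1
print(shifted, expected)  # 2.0 1.0
```

With these illustrative numbers the placement on [fill, purse] now scores better than the placement on [to, our], even though its unpenalised cost was higher.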

Caesura Redundancy

It should be noted that in the last few sections a phrase was transformed in various ways into a single list of values. This conflicts with previous arguments for the use of a caesura point, which would suggest the use of a pair of lists. The initial principle was to simplify the problem by splitting a single line in the song into two coherent phrases which could be analysed in identical fashion, as the music is usually structured in a similar way. This approach worked well in the majority of our test cases, but some cases highlighted a point of concern:

you freemen and masters | and prentices mourn
for now you are left | with your charter forlorn

After the second beat the caesura (|) has been placed at the point which seems to make grammatical sense in both cases, but it doesn't correspond to the same beat in the bar in both cases. Should the beat be kept in the same place and the words with your compressed into the anacrusis beat, whilst left is expanded to fill the gap? A subjective assessment would say not, a judgement supported by one of the Packington's tunes (see the Appendix) which does not have an anacrusis beat in this place: the syllables [mas, ters, and] all occur on crotchets with no rests between them. Some tunes exhibit such a split point, but many others don't.

The main complication occurs when not enough information is available to detect a possible split point, or the wrong one is chosen, as with the above example. This then necessitates shuffling values between the two lists in order to balance each side at the off-beat pattern stage. Unfortunately, this leads to local minima (and hence bad fits), as the optimisations have already been performed on each of the constituent lists and not over the whole.

A curious feature of the caesura determination algorithm is that whilst it has a high success rate in finding correct split points, it will not reject any phrase and will try regardless to find a split point. A major oversight seemed apparent at this point: not all lyrics have lines long enough to warrant a caesura, and in such cases the algorithm tries to insert one anyway, making each half only three or four syllables long.

Following the discovery by Douglas that the location of a caesura in the music isn't always obvious either, we decided to abandon this bipartite method, and produce only single lists, as have been described previously. At the expense of time and space complexity we acquire the advantages of finding global minima and the ability to deal with a range of line lengths.


Project Management

For the whole duration of the project we have had the rare privilege of being able to provide our own specifications and largely decide our own directions, due to the research-based nature of our work. From the project's inception the problem area was very vague, largely because of a lack of knowledge base and the unavailability of suitable reference material. We didn't know what was possible in this area, and of more immediate concern was the issue of how much was possible in the given time.

To take account of this requirement, our initial approach was to opt for a goal-based strategy, where we would fulfil individual goals as constituent parts of successively larger problems, and hope to integrate them all into a coherent working project by the end. However, due to the nature of the problem, the direction of the next stage was often not known until the completion of the current one and so quite often there was little we could do by way of forward planning.

The period before Christmas was largely taken up with background reading on a large range of subjects, taking in phonetics, linguistics, metrical phonology, Chomskian grammars, metrical analysis, folk music, Mediæval poetry and ethnomusicology amongst others. It was necessary to have a wide intake due to the multi-faceted nature of words: inter-word and intra-word relationships, structure within grammatical frameworks, factors dictating word and sentence stresses, use of language in lyrics and of course how to reconcile these considerations with music. We made the choice of programming in CAML at an early stage, to take advantage of its flexible data types and its support of functional and imperative programming paradigms.

The main part of this report chronicles the development of my part of the project through its meandering early stages, e.g. metrical analysis, into the more focussed later stages, such as grammatical analysis and beat placement. The later work derives little from the initial metrical analysis ideas but, following their unsatisfactory results, sprang from a need to find alternative methods. The learning phase was consolidated on the approach to Christmas and I had found a positive direction with the grammatical analysis (the progress report in the appendix coincides with the start of this stage), mostly completed by the start of Hilary Term, rapidly followed by the beat placement stage which was subject to fine-tuning at the end of Hilary Term. Finally we tied up both halves of the project to try and obtain a stand-alone working program.

Division of Labour

It wasn't until after the initial groundwork that we were able to establish a clear division of labour: I did the words and Douglas did the music. Originally we hadn't thought that we could split the problem so cleanly, but it seemed that there was a lot we could do independently, so we each set about investigating our individual areas. Through daily lab contact we were able to monitor the progress of the project as a whole, consolidated by periodic meetings.

At the start of Hilary Term we were able to define interface specifications through the use of Abstract Data Types, which afforded us both great flexibility and the adoption of data structures suitable for the representation of musical and lyrical data. By manually entering test data I was able to test my own programs before Douglas made the conversion program available, with the intention of applying both halves effectively to other test data in our large database. We also set out to help the software as much as possible by annotating key features by hand, but in the end we were able to use raw data with the only requirement that the text was syllabised by inserting hyphens manually.

Interaction with Supervisor

Throughout this project we have kept in regular contact with our supervisor, Ian Page, who has been instrumental in providing useful references, contacts and feedback, and provided the original inspiration for the project. As he is a folk singer, his input has been invaluable, since there are many elements of the software that we have produced which are dependent on subjective criteria within a subject which we previously knew nothing about. We have enjoyed being able to work independently and to dictate the direction of the project ourselves under the guiding influence of a supervisor.


Results

"Exceptions are the darts which lacerate the soul of the academic" - Cecil Sharp

In this section I intend to describe the results produced by the program in terms of its ability to generalise, its robustness and other caveats. Example output is shown in the Appendix, for brevity.

In the early stages of research I found that English folk music inherits little from Mediæval music lyrically, but modern folk, pop and rock music have a lyrical structure which is remarkably similar to their seventeenth century predecessors in two respects:

• The song structure, in terms of verses interspersed with choruses.
• The heuristic rules of beat allocation.

This is surprisingly little development over three hundred years; if these characteristics remain invariant, then in theory the algorithms should still be effective with newer genres of music.

The grammatical models were originally designed only for simple declarative sentences (I must buy bread and milk on Monday), but accidentally turned out to be reasonably effective for more complicated sentences (No! Where is my dinner?) where it could still latch on to key parts of the phrase and detect the form of some of the words. Since the caesura analysis phase has now been obviated, the only dependent processing stage is the beat probability assignment which also transpires to be reasonably transparent to sentence structure.

Here are some circumstances the program should cope with:
• Alliteration or other lyrical devices which will not alter the underlying rhythm of the text.
• Declarative, exclamatory and interrogative phrases.
• Narratives and reported speech.
• Out-of-vocabulary text (e.g. proper nouns) and anachronistic expressions.
• Any musical genre with the same beat-placement heuristics.
• Lyrical 'padding' (la la, dee-de-dee).
• Irregular punctuation and capitalisation.

It would probably experience some problems with:
• Dramatics, where pauses or stresses are made in unexpected places 'for effect'.
• Lyrical or musical devices which will alter the underlying rhythm of the text.
• Out-of-vocabulary text with primary word stress on a syllable other than the first.
• Awkward phrasing, contrived to get particular words into the text despite the fact that they don't fit (17th century Folk music is particularly guilty of this).
• Musical styles with irregular or syncopated beat stresses.
• Prolongation of syllables (i.e. slurs) over more than two notes (e.g. the last 'Hallelujah' in the Hallelujah Chorus).

We have succeeded in writing algorithms to fulfil our original criteria of assessing the goodness-of-fit between music and words. Rather than just providing a quantitative assessment, e.g. a number between 0 and 1, we have managed to provide a qualitative judgement and been able to adapt the rhythm of the tune to generate the best fit, even with very contrived lyrics. The data structures also contain the processing history of the input phrase, so it is also possible to explain why the rhythm has to be modified.

As a significant extension to the original goals (and something which we have only been able to talk about speculatively before now) we have been able to produce a program to search a database of tunes for the best matches to a given lyric, which could prove to be an invaluable tool for a folk singer who regularly encounters new texts without any indication of a melody. The program proved to be very successful with a sample Packington lyric, which it matched near-perfectly to five tunes, and produced a number of other suggestions from a database of only 110 tunes, including the remarkable result that it can be sung perfectly to Away in a manger (these examples are also shown in the Appendix).


Conclusions

I have only ever encountered the word 'contrafacture' in one text (Dobson (1979)), which would lead me to think that this is the first attempt to evaluate contrafacture computationally. Far from being a hindrance, the lack of useful references has been a benefit, as it has forced a wide-ranging investigation into the nature of words and music in order to thoroughly understand the basis of the problem and the strategies for its solution, without being tempted to merely follow previous lines of research. As a result we have encountered a number of dead-ends along the way to finding usable solutions, for example Metrical Analysis, which we discounted early on as being of little use [19].

Interestingly, the algorithms I have devised seem far removed from any logical chain of thought which is able to produce the same result, which does suggest that there may be a more fundamental procedure than the one I have adopted. Alternatively, perhaps this method is the most direct: when we recognise shapes we don't consciously calculate the second moment of area, yet the process may be capable of producing the same solutions.

From its inception, we knew this project would be challenging and perhaps unrealisable. Above all however, it has been very enjoyable, particularly since we have a working program at the end which copes with most of what we throw at it and which surpasses our original intentions.

I feel the main achievements of the project are:
• to bring the abstract concepts of words and music into the numerical domain
• the implementation of a grammatical inference engine
• identification of the beat placement problem and heuristics for its solution
• reduction of the word fit problem to the minimisation of a cost function
• transformation of variable length data into constant length feature vectors (offbeat patterns), so that a meaningful distance metric can be defined
• the ability to transform both the words and the rhythm to produce the best fit
• the production of a program to retrieve the best fits from a database of tunes for a given set of lyrics

Other achievements include the minimal manual annotation required by the user, the principle of caesura location [20], the finding of global minima, robustness under uncertainty (e.g. out-of-vocabulary text), and the definition of effective data structures.

I have enjoyed being involved with a subject so far removed from computational analysis, taking in many different fields of research and being able to work unconstrained by previous investigations. I hope to have contributed some amount of knowledge to the field of computational musicology, to have brought it closer to that of textual analysis, and most of all to have given a little more insight into how we perceive and interact with music.

[19] However, we were initially looking at metrical analysis in terms of syllable lengths, rather than syllable stresses, which is more closely related to our later work.

[20] Which turns out not to be useful in this context, but could be used if very long lines are to be split, or if lines are merged without obvious start and end points.


Possible Future Work

Since the program achieves most of that which we had originally intended, there doesn't seem to be much scope for future work in this particular direction. One of the few unexplored avenues, however, is that concerned with melody: Douglas has carried out some work relating melody to caesura, but we haven't tried to relate lyrics and melody or lyrical intonation; rather, most of our work has been concerned with uniting the rhythms of music and text.

Most of my recent work has been occupied with 'tweaking' parameters, particularly in the beat allocation algorithms to take account of results on new test data. I had considered the use of Neural Networks to optimise values such as these, but since the problem domain is one of adaptation rather than classification, I concluded that this would be inappropriate.

Possible future enhancements might include:
• optimising algorithms in the light of later knowledge
• rewriting the software using more integrated data types
• detection of rhyming schemes as part of verse structure
• trying to characterise songs by rhythmic structure
• feature extraction from melody
• metrical analysis (iambs, trochees, etc.) using rhythm and stresses rather than syllable length


Bibliography

Bird, Richard & Wadler, Philip (1988). Introduction to Functional Programming.

Chappell, William (reprinted 1968). Old English Popular Music.

Chomsky, Noam (1957). Syntactic Structures.

Clark, Andrew (1907). The Shirburn Ballads.

Cook, Nicholas (1987). Guide to Musical Analysis.

Dobson, E.J. & Harrison, F.Ll. (1979). Medieval English Songs.

Hogg, Richard & McCully, C.B. (1987). Metrical Phonology.

Kaplan, Ronald (1994). The Formal Architecture of Lexical-Functional Grammar.

Kidson, Frank & Neal, Mary (1915). English Folk Song and Dance.

Leroy, Xavier (1995). The Caml Light System.

Lloyd, A.L. (1975). Folk Song in England.

Lomax, Alan (1968). Folk Song Style and Culture.

Mauny, Michel (1995). Functional Programming using Caml Light.

Mitton, Roger (1992). The Computer Usable Version of the Oxford Advanced Learner's Dictionary.

Palmer, Roy (1983). Folk Songs Collected by Ralph Vaughan Williams.

Parsons, Denys (1975). The Directory of Tunes and Musical Themes.

Paulson, Lawrence (1991). ML for the Working Programmer.

Sharp, Cecil (1907). English Folk Song.

Woods, William (1984). Language Processing for Speech Understanding.


Electronic Texts and Resources

Digital Tradition folk song archive - music search:
http://pubweb.parc.xerox.com/digitrad

Digital Tradition down-loadable folk song database:
ftp://parcftp.xerox.com/pub/music/digital_tradition/apr95/
ftp://ftp.uwp.edu/pub/music/folk/digital.tradition/apr95/
http://music.wit.com/uwp/music/folk/digital.tradition/apr95/

The Computer Usable Version of the Oxford Advanced Learner's Dictionary:
ftp://ota.ox.ac.uk/pub/ota/public/dicts/710

CAML light:
http://sunsite.doc.ic.ac.uk/public/computing/programming/languages/ML/Caml-Light

Project Page:
http://sable.ox.ac.uk/~tr96003/project.html
http://www.comlab.ox.ac.uk/oucl/people/ian.page.html

Contact e-mail addresses:[email protected]@[email protected]


Appendix A - Packington's Pound

Throughout the seventeenth century, a popular form of music in England was the Broadside Ballad. In the days when few people could read and news was customarily spread by word-of-mouth, there evolved a class of song which aimed to relate the important topics of the day whilst still being ostensibly a form of entertainment. Songs about the Great Frost of 1683 or the loss of London's Charter are indicative of the subject matter. Ballad writers, mainly in London, wrote of the newsworthy events, and their ballads were then printed in broadsheet (or broadside) form and sold at markets. Literate folk singers would buy these ballads and sing them to their friends and families, and the songs would propagate through the generations in a phenomenon known as the Oral Tradition. However, since even fewer people could read music than could read text, no music was printed with the songs, and so a subtitle such as 'to the tune of Packington's Pound' was given. In order for a ballad writer to sell many of his songs it was necessary to choose a popular tune, of which Packington's Pound was one.

At least three hundred ballads are known to have been written to the tune, mostly in the seventeenth century, and for this reason we have concentrated on this ballad; it typifies the practice of contrafacture (the singing of different words to one tune). I have managed to find three variants. The first appears to be the original, first published in Barley's New Book of Tabliture, 1596. This version also shows the accompaniment (the melody is shown by the upwards pointing note stems in the treble clef), and is in 3/4 in the Dorian mode of G. It is taken from Old English Popular Music by William Chappell.

The second version is in 6/4 in the Dorian mode of D. It has the same opening phrase as the first, and also the start of the 12th bar is similar, but the rest of the tune is very different.

The last version, In Praise of a Dairy from Playford's Musical Companion, is in 6/8 in the Dorian mode of E. It doesn't have the same opening phrase as the others, but the characteristic i, vii#, i turnaround appears at the end of the phrases rather than at the start. Where the tunes may differ, they are all underpinned by fundamentally the same chord progression: I, V, I, V, I, IV, I. The reason for such deviation between each of these forms is attributable to the Oral Tradition where the music was inevitably corrupted in some way en route. The key features such as mode, chords, turnarounds and cadences appear to survive between the different versions.


The following example from London's Lamentation (1683) shows the lyrical form of a Packington's Pound ballad:

1  Oh London! Oh London! how could'st thou pretend
2  Against thy Defender thy crimes to defend?
3  Thy Freedom and Rights from kind Princes did spring
4  And yet in contempt they withstandest thy King;
5  With bold brazen face
6  They pleaded thy Case,
7  In hopes to the Charter the King wou'd give place;
8  Oh London! thou'dst better no Charter at all,
9  Than thus for Rebellion thy Charter shou'd fall.

The melody for lines 1 and 2 is repeated by lines 3 and 4, and similar to that of lines 8 and 9, whilst 5 and 6 have a different melody, and 7 is different again (in some texts it is split into two parts). The rhythms of lines 1 and 2 (and similarly for the others) are almost identical, with the result that both lines can be sung to the same tune. Lines 5 and 6 and each half of line 7 have the same property. I have used this fact to gain examples for my test data set, so that I need only supply the rhythm of the first line and can use four lines from each verse to test for a match.

Here follows a selection of test data from my corpus (there are 41 lines in total), with reasons for choosing them:

From London's Lamentation, a 'standard' Packington fit. (Primary) Word Stress occurs on second syllable in forlorn, against and defend:

let a1 = "You free-men and mas-ters, and pren-ti-ces mourn";;
let a2 = "For now you are left with your char-ter for-lorn";;
let a3 = "Oh, Lon-don!, Oh, Lon-don! How couldst thou pre-tend";;
let a4 = "A-gainst thy de-fen-der thy crimes to de-fend?";;

From North Country Mayor, a5 has an extra tail syllable, as does a6, but also with two extra syllables at the beginning. Mayor is taken to have one syllable as using two doesn't fit:

let a5 = "I sing of no he-re-tic, Turk or of Tar-tar";;
let a6 = "But of a suff-er-ing mayor who may pass for a mar-tyr";;

From A Ballad Called Lamentable Lory, a8 has a number of words which could take stress (see the analysis on page 18):

let a7 = "The youth was be-lov'd in the spring of his life";;
let a8 = "Be-fore his head turned with place, ti-tle and wife";;

From Cold winter's gone and past: controls, to ensure rejection of lines which cannot possibly fit, due to being too short or too long. au also tests for out-of-vocabulary phrases:

let a9 = "Cold win-ter's gone and past";;
let a10 = "Peas-ant sum-mers come at last";;
let a10b = "The trees they do grow high";;
let a11 = "and the leaves they do grow green";;
let au = "Should auld ac-quain-tance be for-got and ne-ver brought to mind?";;

a12 is a syllable short in the middle, requiring run like to become ru-n like or run li-ke, i.e. by elongating the vowels, in order to scan easily:

let a12 = "young wom-en they'll run like hares on the moun-tains";;

From London's wonder - the great frost 1683, a13 and a14 have an extra head syllable. a15 has a compound noun:

let a13 = "Tis some Lap-land ac-quain-tance of Con-jur-or Oates";;
let a14 = "That has tied up your hands, and im-pris-on'd your Boats";;
let a15 = "In Roast-Beef and Bran-dy much mon-ey is spent";;

From In Praise of a Dairy, b3 takes beat on second syllable of without rather than cream, whilst b4 takes beat after without:

let b1 = "a-mongt the rare vir-tues that milk does pro-duce";;
let b2 = "for a thou-sand of dain-ties its daily in use";;
let b3 = "for a dain-ty fine pudd-ing with-out cream or milk";;
let b4 = "is a cit-i-zen's wife with-out sat-in or silk";;
let b5 = "in the vir-tues of milk there is more to be mus-tered";;
let b6 = "the charm-ing de-lights both of cheese-cake and cus-tard";;

From The Clothiers Delight, c1 takes stress on in, rather than England, c2 stressed on is and we:

let c1 = "of all-sorts of call-ings that in Engl-and be";;
let c2 = "there is none that live-th so gall-ant as we";;
let c3 = "our trad-ing main-tains us as brave as a night";;
let c4 = "we live at our plea-sure and take-th de-light";;
let c5 = "we heap-eth up rich-es and treas-ure great store";;

c6 stressed on we rather than get. c7 and c8 both start with although + Pronoun, but are stressed on this, and second syllable of although respectively:

let c6 = "which we get by grip-ing and grind-ing the poor";;
let c7 = "al-though this is the way for to fill up our purse";;
let c8 = "al-though we do get it with ma-ny a curse";;
let c9 = "through-out the whole king-dom in count-ry and town";;

There are a couple of different ways of singing c10: elongate is or drop anacrusis beat. In c14 make is stressed and earn is elongated, since if will is stressed, this requires to to be stressed also. In c15, former has to be stretched over 3 syllables and us'd (used) over 2:

let c10 = "there is no dang-er of our trade go-ing down";;
let c11 = "so long as the comb-er can work with his comb";;
let c12 = "and also the wea-ver weave in his lomb";;
let c13 = "the tuck-er and spinn-er that spins all the year";;
let c14 = "we will make them to earn their wag-es full dear";;
let c15 = "in for-mer a-ges we us'd to give";;

Anacrusis beat must be dropped with c16, i.e. so occurs on the first beat and is stressed:

let c16 = "so that our work-folks like far-mers did live";;
let c17 = "but the times are al-tered we will make them know";;
let c18 = "all we can for to bring them all un-der our bow";;
let c19 = "we will make them to work hard for six pence a day";;
let c20 = "though a shill-ing they de-serve if they had their just pay";;
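As an illustration of how hand-hyphenated lines like those above might be turned into syllable lists, here is a small sketch in Python (not the project's Caml Light code). The function name and the decision to discard all punctuation except apostrophes are assumptions made for the example.

```python
import re


def syllabify(line):
    """Split a hand-hyphenated lyric line into words, each a list of syllables.

    Hyphens mark syllable boundaries; punctuation other than apostrophes
    is discarded (an assumption of this sketch).
    """
    words = []
    for token in line.split():
        syllables = [re.sub(r"[^A-Za-z']", "", part) for part in token.split("-")]
        syllables = [s for s in syllables if s]
        if syllables:
            words.append(syllables)
    return words


a1 = "You free-men and mas-ters, and pren-ti-ces mourn"
words = syllabify(a1)
print(words)
print(sum(len(w) for w in words))  # 11
```

The eleven syllables agree with the offbeat pattern [1;2;2;2;0] used for the standard fit: four beats plus seven offbeat syllables.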


Appendix B - Sample Results

Packington's Pound

The first numerical results I obtained were from the metrical deviance matrix method, which was the integral squared error of two metrical rhythms. The example below uses Pulse Height Modulation, and the weightings adapted to favour similar lengths. The matrix is symmetrical about the leading diagonal (from the back corner to the front), and the leading diagonal values are predictably zero. I have clustered the sample data such that the similar groupings are a1-a4, a5-a6, a7-a8, a9-a10 and a11-a11b, with a1-a8 corresponding to the same overall rhythm, whilst a9-a11b are controls to show rejection. It is clear that this approach has recognised a1-a8 if a threshold of around 0.2 is used. This method however is not very accurate, and was replaced by the beat placement method.
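The general shape of such a deviance matrix can be sketched as follows (Python, illustration only). The Pulse Height Modulation encoding and the length weightings are not reproduced here: the signals below are invented, and zero-padding with normalisation by the longer length is an assumption standing in for the project's own treatment of unequal lengths.

```python
def squared_error(a, b):
    """Discrete stand-in for the integral squared error of two rhythms.

    The shorter signal is zero-padded and the sum is divided by the longer
    length (both are assumptions of this sketch).
    """
    n = max(len(a), len(b))
    a = a + [0.0] * (n - len(a))
    b = b + [0.0] * (n - len(b))
    return sum((x - y) ** 2 for x, y in zip(a, b)) / n


def deviance_matrix(rhythms):
    """Pairwise squared error: symmetric, with zeros on the leading diagonal."""
    return [[squared_error(p, q) for q in rhythms] for p in rhythms]


# Invented pulse-height signals, for illustration only.
rhythms = [
    [0.0, 1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 0.0, 0.5],
    [1.0, 0.0, 1.0],
]
m = deviance_matrix(rhythms)
print(m[0][1] < 0.2, m[0][2] < 0.2)  # True False
```

With a threshold of 0.2 the first two signals would be grouped together and the third rejected, which mirrors how the threshold is used in the paragraph above.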

The following results show the success of the beat placement algorithm over the entire test data set. The algorithm is able to place all of the obvious fits without error; it also produces very good results with the harder examples, most of which require a few attempts to get right when sung unaided. In this notation, the ++ marks denote the stress points, i.e. the syllables which occur on the beats. Extra syllables are in parentheses, and these must be uttered more quickly, for example:

That (has) ++tied ...

In these cases, the words that and has must be said in the time of one, so if a crotchet was used for the first note, it would have to be split into quavers for each syllable. Another situation may occur:


... wom-en they'll ++ru-n like ...

Here, there aren't enough syllables to fit the rhythm, so the program has decided that the word run should be split into two syllables (i.e. the vowel is sustained for an extra beat). The characters at the end of the lines denote the quality rating: VG, Gd, Ok, Bad and Can't Find Fit, which are all self-explanatory.
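To make the notation concrete, the following sketch (in modern OCaml, written for this appendix rather than taken from the project code) reads an annotated line back in and reports which syllables carry the beats and which are extra. The function names stressed_syllables and extra_syllables are hypothetical, and the sketch assumes the annotated line has the plain form shown in the examples, e.g. That (has) ++tied.

```ocaml
(* Characters that may belong to a syllable in the printed output. *)
let is_syllable_char c =
  (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c = '\''

(* Syllables that immediately follow a "++" mark, i.e. those on a beat. *)
let stressed_syllables (s : string) : string list =
  let n = String.length s in
  (* Index just past the run of syllable characters starting at i. *)
  let rec run_end i =
    if i < n && is_syllable_char s.[i] then run_end (i + 1) else i in
  let rec go i acc =
    if i + 1 >= n then List.rev acc
    else if s.[i] = '+' && s.[i + 1] = '+' then
      let j = run_end (i + 2) in
      go j (String.sub s (i + 2) (j - i - 2) :: acc)
    else go (i + 1) acc
  in
  go 0 []

(* Syllables written in parentheses, i.e. those squeezed in between beats. *)
let extra_syllables (s : string) : string list =
  let n = String.length s in
  let rec go i start acc =
    if i >= n then List.rev acc
    else
      match s.[i], start with
      | '(', None -> go (i + 1) (Some (i + 1)) acc
      | ')', Some st ->
          let inner = String.trim (String.sub s st (i - st)) in
          go (i + 1) None (inner :: acc)
      | _ -> go (i + 1) start acc
  in
  go 0 None []
```

For the line you ++free-men and ++mas-ters, and ++pren-ti-ces ++mourn, stressed_syllables yields free, mas, pren and mourn, the four beats of a typical Packington's Pound line.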

I will highlight the salient cases first; the other examples then follow:

The typical Packington's Pound fit.
you ++free-men and ++mas-ters, and ++pren-ti-ces ++mourn --> [VG]

Correctly placed primary word stress on forlorn, pretend, against and defend.
for ++now you are ++left with your ++char-ter for-++lorn --> [VG]
Oh, ++Lon-don! oh, ++Lon-don! How ++couldst thou pre-++tend? --> [VG]
A-++gainst thy de-++fen-der thy ++crimes to de-++fend? --> [VG]

Extra head and tail syllables detected.
I ++sing of no ++he-re-tic ++Turk or of ++tar-(tar) --> [VG]
But (of) (a) ++suff-er-ing ++mayor who may ++pass for a ++mar-(tyr) --> [Ok]

Correctly placed beats on [turned, title] rather than [head, place]. The controls (a9-a11) are rejected.
Be-++fore his head ++turned with place, ++ti-tle and ++wife --> [VG]

Identified need for extra syllable in run.
young ++wom-en they'll ++ru-n like ++hares on the ++moun-(tains) --> [Ok]

Problems due to noun/verb duality of produce, taking primary word stress on first rather than last syllable (see page 19).
a-++mongst the rare ++vir-tues that ++mi-lk does ++pro-(duce) --> [Ok]

Correctly placed beat on second syllable of without rather than cream in the first case, and correctly after without in the second case.
for (a) ++dain-ty fine ++pudd-ing with-++out cream or ++milk --> [Gd]
is (a) ++ci-ti-zen's ++wife with-out ++sat-in or ++silk --> [VG]

Stress correctly placed on in and be rather than England in the first case, and on is and we in the second.
of ++all-sorts of ++call-ings that ++in Engl-and ++be --> [Gd]
there ++is none that ++live-th so ++gall-ant as ++we --> [Ok]

A difficult example; this solution is probably wrong: the stress should be taken by we instead of adding an extra syllable to get.
which (we) ++ge-t by ++grip-ing and ++grind-ing the ++poor --> [Ok]

Correctly placed beats in each case despite the fact that they start with although + Pronoun.
al-(though) ++this is the ++way for to ++fill up our ++purse --> [Gd]
al-++though we do ++get it with ++ma-ny a ++curse --> [Ok]

Decides to stress and elongate the is, which is debatably correct; the alternative would be to drop the anacrusis beat and sing there on the first beat. In any case, the algorithm has decided that the fit it has chosen is bad, and the other option (which will have been evaluated) was deemed to be worse.
there ++i-s no ++dang-er of (our) ++trade go-ing ++down --> [Bad]

Recognised the need to prolong a word with more than one syllable (weaver).
and ++al-so the ++wea--ver ++weave in his ++lomb --> [Ok]

Successfully stressed make and earn, rather than will and to, and to compensate, adds an extra syllable to earn.
we (will) ++make them to ++ea-rn their ++wag-es full ++dear --> [Ok]

In these cases, it has decided to drop the anacrusis beat and place the stress on the first word. The first case is penalised further by prolonging us'd (i.e. used).
++in for-mer ++a-ges we ++u-s'd to ++give --> [Bad]
++so that our ++work-folks like ++far-mers did ++live --> [Ok]

Other examples:
our ++trad-ing main-++tains us as ++brave as a ++knight --> [VG]
we ++live at our ++plea-sure and ++tak-eth del-++ight --> [VG]
we ++heap-eth up ++rich-es and ++treas-ure great ++store --> [VG]
The ++youth was be-++lov'd in the ++spring of his ++life --> [VG]
That (has) ++tied up your ++hands, and im-++pris-on'd your ++Boats --> [VG]
In ++Roast Beef and ++Bran-dy much ++mon-ey is ++spent --> [VG]
for (a) ++thou-sand of ++dain-ties it's ++dail-y in ++use --> [VG]
in (the) ++vir-tues of ++milk there is ++more to be ++mus-(tered) --> [VG]
the ++charm-ing del-++ights both of ++cheese-cake and ++cus-(tard) --> [VG]
through-++out the whole ++king-dom in ++count-ry and ++town --> [VG]
so ++long as the ++comb-er can ++work with his ++comb --> [VG]
the ++tuck-er and ++spinn-er that ++spins all the ++year --> [VG]
but (the) ++time-s are ++al-tered we ++will make them ++know --> [Ok]
all (we) ++can for to ++bring them all ++un-der our ++bow --> [VG]
we (will) ++make them to ++work hard for ++six-pence a ++day --> [VG]
though (a) ++shill-ing they (de-)++serve if they ++had their just ++pay --> [Ok]
Tis (some) ++Lap-land acqu-++aint-ance of ++Con-jur-or ++Oates --> [VG]

Tune Searching

Up to now the only test data we have been considering have been a number of lyrics written to the tune of Packington's Pound. We have also encountered three variants of the tune, which are included in an appendix. We have at our disposal a database of around 5000 (mainly) folk songs, which include the tune (in SongWright format) and a text file containing all the verses, as obtained from the Digital Tradition corpus (see page 37 for details on how to obtain it).

As an application of the software we have written, we wanted to produce a program to search the database for a tune to match a given set of words, something a folk singer may do mentally whilst trying to find a tune with which to sing newly discovered lyrics. We have managed to produce a program to do just that, and for the purposes of familiarity, we have searched the database for tunes to which the following well-worn verse can be sung:

You free-men and mas-ters, and pren-ti-ces mourn
For now you are left with your char-ter for-lorn
Oh, Lon-don! oh, Lon-don! How couldst thou pre-tend?
A-gainst thy de-fen-der thy crimes to de-fend?

We used a subset of around 110 song files for testing, with the above verse in the file pack1.txt. The following function returns a list of all those which fit to any extent. There are two cases: if all lines are found to match, the average quality of fit is given; if some lines fit but others don't, then the non-fitting line numbers are given, together with the number and average quality of fit of the lines that do.

#find_tune_for_words songfiles "pack1.txt";;
[("auldlang", "pack1.txt"), VerseFitQuality FitBad;
 ("awaymngr", "pack1.txt"), VerseFitQuality FitVeryGood;
 ("babwood2", "pack1.txt"), VerseFitQuality FitOk;
 ("babydied", "pack1.txt"), VerseLinesDontFit ([1; 2; 4], (1, FitBad));
 ("paceegg", "pack1.txt"), VerseFitQuality FitGood;
 ("packingt", "pack1.txt"), VerseLinesDontFit ([3; 4], (2, FitVeryGood));
 ("paddycnl", "pack1.txt"), VerseLinesDontFit ([1; 3], (2, FitOk));
 ("pensionr", "pack1.txt"), VerseLinesDontFit ([2], (1, FitVeryGood));
 ("pionsng", "pack1.txt"), VerseFitQuality FitVeryGood;
 ("poorlil", "pack1.txt"), VerseLinesDontFit ([1], (1, FitOk));
 ("poorlone", "pack1.txt"), VerseLinesDontFit ([1], (1, FitOk));
 ("popwobb", "pack1.txt"), VerseLinesDontFit ([3], (2, FitOk));
 ("portjail", "pack1.txt"), VerseLinesDontFit ([2; 4], (2, FitOk));
 ("pratsmal", "pack1.txt"), VerseLinesDontFit ([3], (2, FitOk));
 ("pretbabe", "pack1.txt"), VerseLinesDontFit ([4], (3, FitGood));
 ("pretbaby", "pack1.txt"), VerseLinesDontFit ([1; 2], (1, FitOk));
 ("pretsaro", "pack1.txt"), VerseFitQuality FitOk;
 ("prsnrlif", "pack1.txt"), VerseFitQuality FitGood;
 ("prtjrvs", "pack1.txt"), VerseLinesDontFit ([1; 3], (2, FitVeryBad));
 ("puddywel", "pack1.txt"), VerseLinesDontFit ([1; 3], (2, FitVeryBad));
 ("puffdrgn", "pack1.txt"), VerseLinesDontFit ([2; 3; 4], (1, FitVeryBad))]

[21] Through the perform and play utilities provided with Digital Tradition.

However, this list is rather unwieldy, so further processing is done by imposing quality thresholds. In this case anything less than a FitGood is rejected, and songs for which fewer than 2/3 of the lines fit are also rejected, leaving a more convenient subset. These steps can be tied together to produce a single-line script to search for the best fits:

#show_good_songs_for_words songfiles "pack1.txt";;

Total fits:

Words: pack1.txt Music: awaymngr Quality: --> [VG]
you ++free-men and ++mas-ters, and ++pren-ti-ces ++mourn --> [VG]
for ++now you are ++left with your ++char-ter for-++lorn --> [VG]
Oh, ++Lon-don! oh, ++Lon-don! How ++couldst thou pre-++tend? --> [VG]
A-++gainst thy de-++fen-der thy ++crimes to de-++fend? --> [VG]

Words: pack1.txt Music: pionsng Quality: --> [VG]
you ++free-men and ++mas-ters, and ++pren-ti-ces ++mourn --> [VG]
fo-r ++now you are ++left with your ++char-ter for-++lorn --> [VG]
O-h, ++Lon-don! oh, ++Lon-don! How ++couldst thou pre-++tend? --> [VG]
A--++gainst thy de-++fen-der thy ++crimes to de-++fend? --> [VG]

Words: pack1.txt Music: paceegg Quality: --> [Gd]
you ++free-men and ++mas-ters, and ++pren-ti-ces ++mourn --> [VG]
fo-r ++no-w you ++are left with your ++char-ter for-++lorn --> [Ok]
O-h, ++Lon-don! oh, ++Lon-don! How ++couldst thou pre-++tend? --> [VG]
A--++gainst thy de-++fen-der thy ++crimes to de-++fend? --> [VG]

Words: pack1.txt Music: prsnrlif Quality: --> [Gd]
you- ++free-men and ++mas-ters, and ++pren-ti-ces ++mourn --> [VG]
for ++now you are ++left with your ++char-ter for-++lorn --> [VG]
Oh, ++Lon-don! oh, ++Lon-don! How ++couldst thou pre-++tend? --> [VG]
A-++gainst thy de-++fe-n der thy ++crimes to de-++fend? --> [Ok]

Partial fits:

Words: pack1.txt Music: pretbabe Quality: --> [Gd] Number of non-fitting lines: 1/4
you ++free-men and ++mas-ters, and ++pren-ti-ces ++mourn --> [VG]
fo-r ++now you are ++left with (your) ++char-ter for-++lorn --> [Ok]
O-h, ++Lon-don! oh, ++Lon-don! ++How couldst thou (pre-)++tend? --> [Ok]
Can't find fit

Notice that the program has identified Away in a manger as being a very good fit, which the reader should be able to verify. Similarly, on listening to the other songs[21], the fits were equally satisfying, so in this instance the program has been remarkably successful. Undoubtedly some of the rejected cases would also fit, but lowering the threshold should pick these out quite easily.
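The thresholding step described in this section can be sketched as below, in modern OCaml rather than the Caml Light of the project. The constructor names are those printed in the listing, but the type declarations themselves are reconstructed from that output and are an assumption, as are the helper names rank, at_least, acceptable and good_songs and the choice to measure the 2/3 fraction against the fitted plus non-fitting line counts.

```ocaml
(* Quality of fit, best first. Reconstructed from the printed results. *)
type fit = FitVeryGood | FitGood | FitOk | FitBad | FitVeryBad

(* Either every line fits (with an average quality), or some lines fail:
   the failing line numbers plus (number of fitting lines, their quality). *)
type verse_result =
  | VerseFitQuality of fit
  | VerseLinesDontFit of int list * (int * fit)

(* Smaller rank means a better fit. *)
let rank = function
  | FitVeryGood -> 0 | FitGood -> 1 | FitOk -> 2 | FitBad -> 3
  | FitVeryBad -> 4

let at_least threshold q = rank q <= rank threshold

(* Keep results of quality FitGood or better in which at least 2/3 of
   the lines fit; integer arithmetic avoids rounding in the fraction. *)
let acceptable = function
  | VerseFitQuality q -> at_least FitGood q
  | VerseLinesDontFit (missing, (fitted, q)) ->
      let total = fitted + List.length missing in
      at_least FitGood q && 3 * fitted >= 2 * total

let good_songs results =
  List.filter (fun (_, r) -> acceptable r) results

(* A few entries copied from the listing above. *)
let sample = [
  (("auldlang", "pack1.txt"), VerseFitQuality FitBad);
  (("awaymngr", "pack1.txt"), VerseFitQuality FitVeryGood);
  (("babwood2", "pack1.txt"), VerseFitQuality FitOk);
  (("paceegg", "pack1.txt"), VerseFitQuality FitGood);
  (("packingt", "pack1.txt"), VerseLinesDontFit ([3; 4], (2, FitVeryGood)));
  (("pretbabe", "pack1.txt"), VerseLinesDontFit ([4], (3, FitGood)));
]
```

Applied to sample, good_songs keeps awaymngr, paceegg and pretbabe, which agrees with the total and partial fits reported above; packingt is dropped because only two of its four lines fit.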


Limericks

I found that the qualities that make a limerick scan were remarkably similar to those of song; the same rules of beat placement seem to apply, if the rhythm is written musically, as shown below:

However, it is usually allowable to drop a beat in the first or second bar, which our cost function will penalise. In this sense, the program will insist on a Limerick of the strictly correct form. Here are a couple of examples of my Limerick detection algorithm:

There ++was a young ++lad-y (from) ++Leeds --> [Gd]
Who ++swall-owed a ++pack-et (of) ++seeds --> [Gd]
In ++less than an ++hour, her ++nose was in ++flow-(er) --> [VG]
And (her) ++hea-d was ++cov-ered (in) ++weeds --> [Ok]

A ++jog-ger from ++Jer-sey (called) ++Jack --> [Ok]
Once ++had an imm-++ense-ly (bad) ++back --> [Gd]
He ++built a big ++ship, On ++which he could ++kip --> [VG]
But (it) ++failed by an (in-)++cip-ient ++crack --> [Gd]
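In both examples the printed lines carry 3, 3, 4 and 3 beats, with the two short lines of the limerick printed together as one four-beat line. A simple check of that shape might look like the sketch below (modern OCaml). The 3-3-4-3 pattern is read off these two examples rather than taken from the detection algorithm itself, and count_beats, limerick_shape and has_limerick_shape are hypothetical names.

```ocaml
(* Number of "++" beat marks in an annotated line. *)
let count_beats (s : string) : int =
  let n = String.length s in
  let rec go i acc =
    if i + 1 >= n then acc
    else if s.[i] = '+' && s.[i + 1] = '+' then go (i + 2) (acc + 1)
    else go (i + 1) acc
  in
  go 0 0

(* Beats per printed line, as observed in the two examples above. *)
let limerick_shape = [3; 3; 4; 3]

(* True when the annotated lines have exactly the observed beat counts. *)
let has_limerick_shape (lines : string list) : bool =
  List.map count_beats lines = limerick_shape

let leeds = [
  "There ++was a young ++lad-y (from) ++Leeds";
  "Who ++swall-owed a ++pack-et (of) ++seeds";
  "In ++less than an ++hour, her ++nose was in ++flow-(er)";
  "And (her) ++hea-d was ++cov-ered (in) ++weeds";
]
```

A check of this kind only looks at the beat counts; it says nothing about the quality ratings, which come from the cost function.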


Appendix C - Progress Report

ECS 4th Year Project Progress Report

Contrafactal and Metrical Analysis of English Folk Songs

(Music and Computers)

Michaelmas 1995

Paul Walmsley, Trinity

- In partnership with: Douglas Anderson, Pembroke
- Supervised by: Ian Page

Overview

This project is concerned with characterising the qualities which make lyrics fit to tunes. Once these have been found we should be able to 'match' a set of words to the best-fitting tune in a database of tunes. Contrafacture is the ability to sing multiple sets of words to a single tune.

Plan of Work

The inherently research-based nature of this project makes it difficult to pin down particular goals and schedules. The initial work has concentrated on reading in relevant data sets and preliminary processing; the bulk of the work ahead will be to extract the features of both the words and the music. A rough list of goals might include:

- Establishing a single, unified data structure to encapsulate all areas of the project
- Finding or estimating the caesura (i.e. the subconscious pause) in a line of text
- Evaluating the goodness-of-fit of two metrical rhythms and determining anacrusis beats (i.e. beats before the bar)
- Combining the results for individual lines of a verse to obtain a comparison of two whole sets of verses
- Comparison of rhyming schemes
- Relating words to melodies: use of cadences, key, modes, etc.
- Automatically producing a report on the attempt to match one set of words to a pre-matched song and tune, before finding suitable weightings to give a rating between 0 and 1 of how good the fit is.

Background Work


A significant amount of background work has been done on this project, due mainly to its research bias. This is very much an interdisciplinary project, and the ever-growing bibliography includes books on folk music, Mediæval English songs, ethnomusicology, poetry, linguistics and lexical functional grammars. Most of the analysis we have done has been based on a set of ballads called the 'Broadside Ballads', from the 17th Century, the main characteristic of which is the existence of a small number of tunes and a large number of different words.

Interaction with Supervisor

We have had three meetings this term with Ian Page, who, like ourselves, has been very enthusiastic about the project whilst not knowing what results may be possible. It has proved to be an interesting and challenging project, and he has been very useful in coming up with ideas, references and contacts, which has led to us talking to a student in the Phonetics Lab, himself a useful source of information.

Progress to Date

On an implementational level, we have been working in Caml Light, a functional programming language which also supports an imperative style and is a dialect of ML. Douglas has created preliminary data structures to read in data from a large archive of folk songs and syllabify the lyrics, whilst I have been working on metrical analysis of the syllables, in terms of long, medium and short syllables, to compare metrical rhythms; that is, patterns of, for example, LONG, LONG, SHORT. Mediæval English songs are highly structured in this way but unfortunately later songs are less inhibited by a need for strict metre.

This was evident in the first set of results I produced, in the form of a metrical deviance matrix, which compared two metrical rhythms on a strict basis. Although it did successfully find rhythms which did match, finding the reasons why rhythms didn't match wasn't really possible. In order to get more information on how to match two phrases (e.g. if one requires an extra syllable to be inserted on a passing note), I have begun an analysis to find the caesura of a sentence and so establish the midpoint of a line, which is often clearly marked in the music itself.
