Evolutionary Computation Applied to Melody Generation
Matt D. Johnson
December 5, 2003
Abstract
In recent years, the personal computer has become an integral
component in the typesetting and management of various types of
music. However, the computer is capable of serving as more than
just a typesetting and data management tool. This paper explores
the ability of a computer to generate and arrange four-part vocal
harmony in the style of church hymnody. The research presented here
involves the use of an evolutionary algorithm to generate a melody.
The resulting melody is then arranged into four parts using a
decision tree for assigning chords. The result is an application
that produces unique and pleasing music suitably arranged for
Soprano, Alto, Tenor, and Bass.
Contents
1 Introduction
2 Related Work
    2.1 Interactive Systems
    2.2 Autonomous Systems
    2.3 Rule Based Systems
3 Research Methodology
    3.1 Problem Size
    3.2 Problem Simplification
    3.3 Problem Representation
    3.4 Evolutionary Cycle
        3.4.1 Initialize the Population
        3.4.2 Terminating Condition
        3.4.3 Selection of Parents
        3.4.4 Reproduction
        3.4.5 Mutation
        3.4.6 Rhythm Correction
        3.4.7 Competition
    3.5 Fitness
    3.6 Results
4 Conclusion
Keywords
Evolutionary Computation, Evolutionary Algorithm, Artificial
Intelligence, Music Generation, Melody Generation, Computer
Generated Music, Genetic Algorithm, Fitness Bottleneck
1 Introduction
In recent years, personal computers have become tools used to store
and typeset sheet music. Research is currently underway that may
lead to computer applications capable of generating and arranging
music as well as a human being can. Section 2 of this document
presents a brief overview of a few such research projects.
Section 3 of this paper presents an evolutionary algorithm which
generates a melody in the traditional style of church hymnody. The
resulting melody is in the soprano range. Alto, Tenor, and Bass
parts are generated to go along with the melody using CAVM, a tool
that automatically adds Alto, Tenor, and Bass parts to an existing
melody [4].
2 Related Work
Evolutionary programming is a powerful tool which has been used by
a number of researchers in the field of computer generated music.
Across the board, it seems that the greatest challenge to
researchers in this community is the fitness function for their
evolutionary systems. The authors of [8] categorize computer
generated music research according to the fitness function used in
that particular method. A number of those categories are used here.
2.1 Interactive Systems
The fitness function in an interactive system is a human being.
Every generation created by the evolutionary program must be
painstakingly evaluated by hand. This creates a “fitness
bottleneck” [1]. However, it practically guarantees the patient
user a computer generated melody that is pleasing to that
individual.
A system called “Variations” is presented by Bruce L. Jacob [3].
Jacob chose to conduct his experiments at the level of phrases and
motives instead of notes; typically, computer generated music is
implemented at the note level. This system uses three modules,
namely the ear, composer, and arranger. Each of these modules
either uses a genetic algorithm or was developed with a genetic
algorithm.
Composing with “Variations” requires a human operator to define a
number of motives which will be used as the basis for the musical
composition [3]. Phrases are developed by the composer module,
which performs recombination and variation on the original motives.
The composer module refers to the ear module to determine whether a
given phrase is acceptable. Once a number of accepted phrases have
been created, the arranger module puts the phrases together and
waits for feedback from a human evaluator. The arranger module
continues to work with the human evaluator until the program
terminates.
2.2 Autonomous Systems
In a typical EA, the fitness function is constant, while the
population evolves over time to become more fit. Autonomous systems
are different in that both the population and the fitness function
evolve [8].
One of the most interesting pieces of literature uncovered in this
research is a paper entitled “Frankensteinian Methods for
Evolutionary Music Composition” [2]. In this paper, Gregory begins
by presenting an extensive overview of a number of different music
composition projects. Throughout the overview, references to
Frankenstein are used to illustrate various points. The paper
climaxes in section four, where the author presents his
evolutionary ideas for generating music.
In the “Frankensteinian” approach, the individual and the
environment coevolve. According to Gregory, this relationship is
similar to the relationship between Frankenstein and his monster.
Frankenstein and his monster each contributed to the other's
environment, so they evolved together, each in response to the
other.
Gregory presents two types of individuals in section 4.2 of his
paper, “Coevolving hopeful singers and music critics”. The female
individual represents the evolving environment and chooses the
males, which represent the singers. The female maintains a note
transition table. This table indicates what types of transitions
she expects and with what frequency. The table is initialized with
note transitions collected from simple folk-tune melodies. Over
time, the table can change in response to what the female observes
in the male singers. This creates the changing environment. Males
in the system start out with randomly generated melodies and evolve
based on the environment.
2.3 Rule Based Systems
The rule based system uses a fitness function which encodes a set
of rules. The rules must be built into the system based on the
author's musical knowledge [8].
George Papadopoulos and Geraint Wiggins [7] present a genetic
algorithm for generating jazz melodies based on an input chord
progression. Their algorithm is made distinctive by the following
characteristics:
1. The algorithmic fitness function described in [7] calculates the
weighted sum of a number of distinct characteristics of the
chromosome. This approach avoids the “fitness bottleneck” described
by John A. Biles [1].
2. Problem specific genetic operators allow this system to converge
to a high fitness relatively quickly.
3. The representation of the melody is based on the scale degree of
a note, as opposed to the traditional binary encoding. This allows
for greater readability and more problem specific operators.
The paper concludes by saying that the resulting system frequently
generates “interesting” patterns, and also enumerates some
extensions which could lead to more human-like jazz melodies.
A genetic algorithm for harmonising chorale melodies is presented
in “Evolutionary Methods for Musical Composition” [9]. Note
representation is based on standard western music syntax.
Information such as the key signature and time signature is stored.
For every note, pitch is expressed in terms of scale degree and its
duration is an integer; another integer is used to indicate the
octave the note occurs in. The absolute pitch of the note is not
stored.
The genetic algorithm presented in [9] makes use of several domain
specific operators. One such operator is named “Splice” and is a
traditional one point crossover. A unique operator in this
implementation is the “PhraseEnd” operator. The “PhraseEnd”
operator mutates the end of a phrase such that it ends with a chord
in root position.
Two types of fitness functions are used in this genetic algorithm.
One fitness function evaluates individual voices, and tends to
favor movement in a consonant direction. This fitness function also
penalizes large jumps in the voice. The second fitness function
considers the relationship between voices, and tends to avoid
certain types of parallel motion and voice crossing [9].
The authors of [9] note in their review of this genetic algorithm
that the results are decent, but certainly not optimal. The domain
knowledge encoded in the algorithm allowed them to achieve their
results rather quickly, within 300 generations. They end this
section of their paper by suggesting that a conventional rule based
system working in conjunction with one or more genetic algorithms
would be a better approach to harmonisation.
Figure 1: Voice Ranges. (Diagram showing the ranges of the Bass,
Tenor, Alto, and Soprano voices across four octaves: Great (-1),
Small (0), One-Line (1), and Two-Line (2).)
3 Research Methodology
3.1 Problem Size
The decision to use an evolutionary algorithm to generate a melody
is driven by one main factor: complexity. Consider the following.
An average soprano can sing notes in the range from D1 to G2, or 18
different pitches. (See Figure 1 for an illustration of voice
ranges.) There are 8 note durations typically found in church
hymnody: sixteenth, eighth, quarter, half, whole, dotted eighth,
dotted quarter, and dotted half. The number of notes found in a
typical hymn can range from roughly 20 to 60 (40 on average). Given
this information, the number of potential melodies of average
length can be calculated.
Pitches * Durations = 18 * 8 = 144 = # possible notes
(# possible notes)^(melody length) = 144^40 = 2.16 * 10^86 melodies
The large search space makes an EA well suited to tackling this
problem.
3.2 Problem Simplification
Since the number of possible melodies is so large, reducing the
search space makes the problem more manageable. This is done by
acting on two observations. First, most melodies stay within the
key of the musical piece. Second, the range of most melodies does
not exceed one octave. By constraining the melody to notes within
one key (F) and one octave, the number of pitches drops from 18 to
8. For completeness, a rest is included as a pitch, making the
number of pitches 9. This changes our initial calculation to the
following:
Pitches * Durations = 9 * 8 = 72 = # possible notes
(# possible notes)^(melody length) = 72^40 = 1.96 * 10^74 melodies
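The two search-space figures can be checked with a few lines of arithmetic. The following Python sketch is illustrative only; the function name melody_count is hypothetical and is not part of the implementation described in this paper.

```python
def melody_count(pitches, durations, length):
    """Number of distinct melodies of a fixed length.

    Each note is one of pitches * durations possibilities, and the
    notes are chosen independently.
    """
    return (pitches * durations) ** length


# 18 pitches, 8 durations, 40 notes (section 3.1)
full = melody_count(18, 8, 40)
# 9 pitches (8 scale degrees plus a rest), 8 durations, 40 notes (section 3.2)
reduced = melody_count(9, 8, 40)

print(f"unrestricted: {full:.2e}")
print(f"restricted:   {reduced:.2e}")
# Halving the number of possible notes shrinks the space by 2^40.
print(f"reduction factor: {full // reduced}")
```

Because 144 is exactly twice 72, the restriction reduces the space by a factor of 2^40, or roughly 10^12.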
Note Duration     Integer Used
whole note        0
dotted half       1
half              2
dotted quarter    3
quarter           4
dotted eighth     5
eighth            6
sixteenth         7
Table 1: Note Duration Mapping

Scale Degree    Note Name in Key of F    Integer Used
REST            -                        0
ONE             F                        1
TWO             G                        2
THREE           A                        3
FOUR            B flat                   4
FIVE            C                        5
SIX             D                        6
SEVEN           E                        7
EIGHT           F                        8
Table 2: Note Degree Mapping
This is still a daunting number of melodies, but it is smaller than
the first calculation by a factor of 2^40, or roughly 10^12.
Additionally, the restrictions placed on the melody will
automatically produce a more pleasing sound, because notes outside
the key will not occur.
3.3 Problem Representation
The note is the building block of music. Therefore, the cornerstone
of the representation is a Note structure. The structure consists
of a scale degree and a duration. The duration of a note indicates
how long the note will sound. This value is represented as an
enumerated integer type. Table 1 illustrates the mapping of a note
duration to the underlying integer used in the implementation. The
scale degree of a note indicates its pitch within a given key. For
simplicity, every melody generated by this algorithm is in the key
of F. Table 2 illustrates the mapping between the scale degree, the
letter name of the note in the key of F, and the underlying integer
used in the implementation. A complete melody consists of a vector
of notes.
A class called Individual is responsible for storing the melody. In
addition to the melody, an Individual also contains the following
functions: Initialize, Crossover, GetFitness, ForceBeats,
ChangeOneNoteDegree, and ChangeOneNoteLength. These functions are
used throughout the evolutionary process, and will be explained in
later sections.
The controlling class is named Population. The Population class
directs the evolutionary process and stores all the individuals in
an AVL tree [5] ordered by the fitness of each Individual.
3.4 Evolutionary Cycle
The evolutionary cycle used is as follows:
Initialize the Population;
while (the terminating condition has not been reached)
{
    Select two parents;
    Reproduction;
    Mutate the children;
    Correct the rhythm of the children;
    Competition;
}
Each of these steps will now be explained in detail.
3.4.1 Initialize the Population
The size of the population is encoded in the Population class and
is currently set at fifty individuals. For every member of the
population, Population will instantiate an Individual and call the
Individual::Initialize function. The Individual::Initialize
function will decide the length of the melody (from 20 to 60 notes)
and then generate that number of notes; each note has a randomly
generated scale degree and duration. The Individual::ForceBeats
function (see section 3.4.6 for details) is called after
Individual::Initialize is called.
3.4.2 Terminating Condition
A population will evolve until 100,000 generations have been
reached, or until the best individual in the population has a
fitness of at least 30. See section 3.5 for a complete description
of the fitness function.
3.4.3 Selection of Parents
Since the individuals are stored in an AVL tree based on their
fitness, implementation of rank based selection is straightforward.
The tree is traversed starting with the most fit individual,
proceeding towards the least fit individual. At each point along
the traversal, the current individual has a twenty percent chance
of being selected. Traversal continues through the tree, giving
every individual along the way a twenty percent chance of selection
when it is visited, until one is finally selected. If the traversal
fails to select an individual, the most fit individual in the tree
will be used. Once an individual is selected, the traversal starts
over and a second parent is selected using the same criteria. It is
possible that the same individual will be selected both times.
3.4.4 Reproduction
Reproduction is essentially crossover between the two children, who
at this point are just copies of their parents. One child will call
its Individual::Crossover function. The Crossover function takes as
an argument another Individual, which is the second child. To
facilitate the discussion of reproduction, the following
terminology will be used:
this melody: The melody contained in the Individual whose Crossover
function is currently executing.
in melody: The melody contained in the Individual that was passed
into the Crossover function.
temp melody: The temporary melody that is created inside the
Crossover function.
The Crossover function will randomly select a scale degree and use
it as a crossover point. The length of temp melody is set to the
length of the shorter of the other two melodies. The melody temp
melody is created by copying notes from this melody until the
crossover point is hit. Then, in melody is scanned until the
crossover point is found. Starting with this crossover point in in
melody, notes are copied from in melody to temp melody until
another crossover point is encountered. The algorithm at this point
switches back to this melody for another chunk of notes. Thus, temp
melody is created by adding sets of notes from the other two
melodies until the melody is full. (Refer to Figure 2 for an
example.) At the end of Individual::Crossover, this melody is
reassigned to be the same as temp melody.
3.4.5 Mutation
The only two mutation operators are ChangeOneNoteLength and
ChangeOneNoteDegree, which are used by both children.
ChangeOneNoteLength randomly selects a note in the melody. Then, it
either decreases or increases the integer which represents the note
duration. Table 1 shows the note duration to integer mapping.
ChangeOneNoteDegree operates in exactly the same manner, except it
modifies the scale degree instead of the note duration. Table 2
shows the scale degree to integer mapping.
3.4.6 Rhythm Correction
During the development stages of this project, the observation was
made that the rhythm patterns in the melodies were exceptionally
difficult and unusual. To correct this problem, a deterministic
function called ForceBeats was introduced. ForceBeats works as
follows:
Loop through the whole melody
    Select the next note
    If the note duration is equal to one beat
        Go on to the next note.
    If the note duration is more than a beat
        Ensure that the following note or notes do not
        extend beyond the end of the current count.
    If the note duration is less than one beat
        Ensure that the following note or notes plus the
        current one have a total duration of one count.
End loop
Figure 2: Reproduction Crossover.
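The ForceBeats outline does not say how an offending note is adjusted, so the Python sketch below is an interpretation rather than the original function. It assumes that a quarter note is one beat and that any note starting partway through a beat is shortened, when necessary, so that it ends exactly on the next beat boundary. Durations are measured in sixteenths; the name force_beats and the table SIXTEENTHS are hypothetical.

```python
# Length of each Table 1 duration in sixteenth notes; the list index
# is the integer used in Table 1 (0 = whole note ... 7 = sixteenth).
SIXTEENTHS = [16, 12, 8, 6, 4, 3, 2, 1]
BEAT = 4  # ASSUMPTION: one beat (count) is a quarter note


def force_beats(melody):
    """Return a copy of `melody` in which no note that starts inside a
    beat extends past the end of that beat."""
    fixed = []
    position = 0  # elapsed time in sixteenths
    for degree, duration in melody:
        offset = position % BEAT
        if offset:
            remaining = BEAT - offset  # 1, 2, or 3 sixteenths
            if SIXTEENTHS[duration] > remaining:
                # Shorten the note so it exactly fills the beat.
                duration = SIXTEENTHS.index(remaining)
        fixed.append((degree, duration))
        position += SIXTEENTHS[duration]
    return fixed


# An eighth note followed by a half note: the half note starts in the
# middle of a beat, so it is shortened to an eighth.
print(force_beats([(1, 6), (2, 2), (3, 4)]))
```

This single rule covers both non-trivial cases in the outline: after a dotted quarter the next note cannot run past the end of the current count, and after a short note the following notes are forced to complete the beat.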
3.4.7 Competition
Every generation, two individuals are born and two individuals die.
The two individuals created are stored in the tree. Then, the two
least fit individuals in the tree are removed.
3.5 Fitness
Without a doubt, the fitness function was the most challenging
aspect of this project. The fitness function is a member of the
Individual class. The fitness function uses a “Fitness Loop”, which
cycles through every note in the melody, checking the relationship
of the current note with the note which follows it. As the melody
is evaluated, the fitness function keeps a running total of
“fitness points”.
The following list provides a name for a particular characteristic
within the melody, the fitness points awarded for that
characteristic, and a brief description. The phrase “next note” is
used below to indicate the note which follows the current note in
the “Fitness Loop”.
1. SAME NOTE: Fitness Points: 17. The scale degree of the next note
has not changed.
2. ONE STEP: Fitness Points: 17. The scale degree of the next note
has gone up or down one step.
3. ONE THIRD: Fitness Points: 15. The scale degree of the next note
has gone up or down two steps.
4. ONE FOURTH: Fitness Points: 12. The scale degree of the next
note has gone up or down three steps.
5. ONE FIFTH: Fitness Points: 10. The scale degree of the next note
has gone up or down four steps.
6. OVER FIFTH: Fitness Points: -25. The scale degree of the next
note is greater than four steps away.
7. FOUR SEVEN: Fitness Points: -25. The current note is scale
degree four and the next note is scale degree seven.
8. SIXTEENTH NOTE: Fitness Points: -10. The current note is a
sixteenth note.
9. DRASTIC DURATION CHANGE: Fitness Points: -20. The duration
change between the current note and the next note is more than four
steps in Table 1.
10. BEGIN TONIC: Fitness Points: 50. The melody begins with the
tonic note (scale degree 1).
11. END TONIC: Fitness Points: 50. The melody ends in the tonic
note (scale degree 1).
“Fitness points” are awarded and stored in a local integer variable
as the “Fitness Loop” executes. The function returns the value of
the fitness points divided by the number of notes. In the event
that the number of fitness points happens to be negative, the
function will return -100.
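The scoring rules listed above can be sketched as a single function. This Python sketch makes several assumptions that the paper does not settle: awards and penalties are cumulative (so a move from degree four to degree seven earns the ONE FOURTH points and the FOUR SEVEN penalty), a rest is treated as degree 0 in the interval arithmetic, the SIXTEENTH NOTE penalty applies to every note including the last, and the final division is not truncated to an integer. Notes are (degree, duration) pairs and all names are hypothetical.

```python
# Points for moving 0, 1, 2, 3, or 4 scale steps (rules 1 to 5).
POINTS_BY_INTERVAL = {0: 17, 1: 17, 2: 15, 3: 12, 4: 10}
OVER_FIFTH = -25
FOUR_SEVEN = -25
SIXTEENTH_NOTE = -10
DRASTIC_DURATION_CHANGE = -20
BEGIN_TONIC = 50
END_TONIC = 50
SIXTEENTH = 7  # Table 1 integer for a sixteenth note


def get_fitness(melody):
    """Score a melody given as a list of (degree, duration) pairs."""
    if not melody:
        return -100  # ASSUMPTION: an empty melody is unfit
    points = 0
    # The "Fitness Loop": compare each note with the one after it.
    for (degree, duration), (next_degree, next_duration) in zip(
            melody, melody[1:]):
        interval = abs(next_degree - degree)
        points += POINTS_BY_INTERVAL.get(interval, OVER_FIFTH)
        if degree == 4 and next_degree == 7:
            points += FOUR_SEVEN
        if abs(next_duration - duration) > 4:
            points += DRASTIC_DURATION_CHANGE
    points += SIXTEENTH_NOTE * sum(
        1 for _, duration in melody if duration == SIXTEENTH)
    if melody[0][0] == 1:
        points += BEGIN_TONIC
    if melody[-1][0] == 1:
        points += END_TONIC
    if points < 0:
        return -100
    return points / len(melody)


# F, G, A, F in quarter notes: 17 + 17 + 15 + 50 + 50 = 149 points.
print(get_fitness([(1, 4), (2, 4), (3, 4), (1, 4)]))
```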
When the population is first initialized, the best individual's
fitness is typically around zero. Within 1000 generations, the
fitness of the best individual will usually reach at least 15.
Sometimes, a fitness of 18 or better can be achieved in that time
frame. In other situations, a fitness of 18 is never achieved.
Figure 3 shows the fitness of the best individual in the tree every
1000 generations for a particular run. Figure 4 illustrates the
average fitness of the population every 1000 generations.
Figure 3: Best Fitness. (Plot; horizontal axis 0 to 7, vertical
axis 4 to 18.)
Figure 4: Average Fitness. (Plot; horizontal axis 0 to 7, vertical
axis -50 to 20.)
Figure 5: A Melody with a fitness of 14.
Figure 6: A Melody with a fitness of 15.
Figure 7: A Melody with a fitness of 17.
Figure 8: A Melody with a fitness of 17 arranged into four
parts.
3.6 Results
In the early stages of development, the melodies generated were
quite disappointing. However, after fine tuning the fitness
function, adjusting parameters, and deterministically correcting
rhythm patterns, the resulting melodies are quite nice. The best
melodies seem to be in the fitness range of 14 to 20, depending on
how much excitement is desired in the melody.
Melodies on the low end of this range are unique and interesting.
These melodies are also more difficult to sing, and may not sound
as nice. Figure 5 is a good example of this type of melody. (All
music shown in this document was typeset by Lilypond [6].)
Melodies with a fitness greater than 19 start to exhibit similarity
to each other. The algorithm terminates before the one “perfect”
individual is found, but it does appear that, given infinite time,
that “perfect” individual would be a rather boring melody. Figure 9
is actually one of the more interesting very high fitness melodies.
Other individuals with a fitness of 20 have been less interesting.
Figure 9: A Melody with a fitness of 20.
The best individuals are in the fitness range of 16 to 18. They
exhibit uniqueness, are pleasant to listen to, and tend to be easy
to sing. Figure 7 is an excellent representative of the top notch
individuals produced by the EA. Notice how the part moves around
and changes frequently, but has few erratic jumps. Figure 8 uses
the same melody and presents alto, tenor, and bass parts to
accompany it. The alto, tenor, and bass lines are arranged by CAVM
[4].
4 Conclusion
Without a doubt, a simple evolutionary algorithm is capable of
generating very nice melodies. Further development including
musical modeling would lead to even better melodies. More
sophisticated and problem specific genetic operators would also
likely improve the results.
Computers are inherently good at doing any type of work which
requires crunching numbers or doing logic. Their biggest weakness
lies in areas which involve feelings and emotions, such as art and
music. The research presented here, as well as ongoing research in
the computer generated music community, leads this author to
conclude that there may come a day in the near future in which
computers can do more than just crunch numbers.
Over time, Computer Science methodologies will continue to develop.
Eventually, these methods will converge to mimic the creative
nature of the human brain. Imagine a computer that can compose with
the anger of Wagner, do so in a moment, and make no typos in the
process. Artificial Intelligence will grow until it encapsulates
the nature and production of human feelings into ones and zeros. At
that point, we will have computers that can not only crunch
numbers, but can also express emotions.
References
[1] BILES, J. GenJam: A genetic algorithm for generating jazz
solos, 1994.
[2] GREGORY, P. T. Frankensteinian methods for evolutionary music
composition.
[3] JACOB, B. Composing with genetic algorithms, 1995.
[4] JOHNSON, M. D., AND WILKERSON, R. W. Computerized arrangement
of vocal music. In Intelligent Engineering Systems Through
Artificial Neural Networks Volume II (2001).
[5] KARAS, W. Code: Abstract avl tree template - available in the
public domain.
[6] LILYPOND. http://lilypond.org/web.
[7] PAPADOPOULOS, G., AND WIGGINS, G. A genetic algorithm for the
generation of jazz melodies.
[8] SANTOS, A., ARCAY, B., DORADO, J., ROMERO, J., AND RODRIGUEZ,
J. Evolutionary computation systems for musical composition. In
Proceedings of Acoustics and Music: Theory and Applications (AMTA
2000). vol 1. pp 97-102. ISBN: 960-8052-23-8. (2000).
[9] WIGGINS, G., PAPADOPOULOS, G., PHON-AMNUAISUK, S., AND TUSON,
A. Evolutionary methods for musical composition, 1998.