A Music Composition Model with Genetic Programming
Kyoko Komatsu, Tomomi Yamanaka, Masami Takata, and Kazuki Joe
Graduate School of Humanities and Sciences, Nara Women’s University, Nara-City, Nara, Japan
Abstract: In this paper, we propose a music composition model. The model consists of two parts, chord progression and melody, and both parts adopt Genetic Programming (GP). In the first part, a partial chord progression is generated. Then, suitable melody lines for the partial chord progression are derived by a Hidden Markov Model. In the second part, a partial melody is generated by GP, in which the initial individuals are based on the suitable melody lines. Since GP does not limit phonetic values, every note can be expressed.
Keywords: Genetic Programming, Hidden Markov Model, chord progression, triplet
1. Introduction
Free music material for various themes attracts a lot of attention for building web sites and movie content. Such material can be easily acquired with music composition systems. Recent music composition systems allow novice users to compose a piece of music and use the result on their web pages. Some composition systems generate music from initial parameters alone, such as key, tempo and mood, while others generate music from the user's interactive selection of favorites during the generation process. Since good music composition requires professional knowledge and sufficient experience, it is very difficult for novice users to compose appropriate music. To avoid this problem, automatic music composition systems have been proposed and implemented [1], [2], [3] using genetic algorithms (GA) [4]. In these systems, note lengths are limited. For example, these systems cannot generate triplets or sixty-fourth notes. In addition, interactive operation to reflect users’ opinions places a large burden on the user. Although these systems adopt some rules for music generation based on simplified music theory, the rules do not guarantee that the generated music is as natural as music composed by experts.
Music generation is too complicated to be solved in a single framework. Hence, in this paper, we propose an automatic music composition model using genetic programming (GP) [5], [6], constructed from two independent but cooperating models for chord progression and melody.
The rest of the paper is organized as follows. In section 2, we explain existing music composition systems. In section 3, we propose a new composition model using GP. In section 4, we present some experiments to validate the model.
2. Related works
In this section, we explain existing music composition systems that use GAs. We describe GenJam, which plays jazz improvisation, and an automatic music composition system based on experience networks in subsections 2.1 and 2.2, respectively.
2.1 GenJam
GenJam [1] is a GA-based music composition system that generates jazz improvisation. It consists of three phases: learning, improving and demonstration. The initial measures and phrases for GenJam, which form the initial populations for the GenJam GA, are generated by a music composition system. The learning method is intended to build up fitness values without genetic operators. We do not describe the learning method for lack of space. In the improving phase, several genetic operators such as crossover, mutation and inversion are applied to obtain improved individuals. The genetic operations terminate in the following cases.
• The number of generations reaches a given maximum.
• Individuals that meet a given fitness criterion are generated.
• The population converges to a single individual.
• The computational resources are exhausted.
Selection is applied to each individual according to the fitness values obtained in the learning phase. An individual is expressed as an array whose elements are integers from 0 to 15. There are fourteen different note events (encoded as 1-14), one rest (encoded as 0), and one hold (encoded as 15). Figure 1 presents two kinds of individuals, where the top and the bottom individuals represent a phrase and a bar, respectively. The maximum population sizes for phrases and bars are 48 and 64, respectively. GenJam generates a piece of music by repeating the learning and improving phases.
Since the generated music pieces are converted into MIDI data, users can read and listen to them in the demonstration phase. Because of the GA implementation, the minimum unit of the array is a quaver. A sound longer than a quaver can be expressed by combining multiple array elements, but neither a shorter sound nor a triplet can be expressed. GenJam generates music pieces with a range of only 14 notes, which is not enough for general music.
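The array encoding described above can be illustrated with a minimal Python sketch. It is not GenJam's actual code: the function name `decode_bar` and the handling of a hold at the very start of a bar (treated here as a rest) are assumptions made for illustration. Each gene occupies one quaver, so a hold simply lengthens the preceding event by one quaver.

```python
REST, HOLD = 0, 15  # encodings stated in the text; 1-14 are note events


def decode_bar(genes):
    """Turn a list of 0-15 genes (one per quaver) into (event, quavers) pairs."""
    events = []
    for g in genes:
        if not 0 <= g <= 15:
            raise ValueError(f"gene out of range: {g}")
        if g == HOLD and events:
            events[-1][1] += 1  # extend the previous event by one quaver
        else:
            # Assumption: a hold with nothing before it is treated as a rest.
            events.append([REST if g == HOLD else g, 1])
    return [tuple(e) for e in events]


# Illustrative gene values: three quaver notes, one note held for four quavers, one rest.
print(decode_bar([7, 8, 7, 7, 15, 15, 15, 0]))
```

Because the shortest step is one array element, the sketch also makes the limitation visible: no combination of genes yields a duration shorter than a quaver or a triplet subdivision.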
phrase: 23 -12 57 57 11 38
bars:   11 6 9 7 0 5 7 8 7 5
        38 -4 7 8 7 7 15 15 15 0
        57 22 9 7 0 5 7 15 15 0
Fig. 1: Group of individuals
C F C G7 C E Am C7
Fig. 2: Gene representation of chord progression and wavy data
2.2 Automatic Music Composition by Experience Network
The experience network is a directed graph constructed by connecting nodes, which carry information properties, with directed edges based on the implementer’s experience. More than one edge can be connected to a node. There is an automatic music composition system that uses the experience network to generate chord progressions [2]. The system consists of three parts: an experience data storage system, a system for making chord progressions, and a system for making melodies. In the first part, existing chord progressions are stored as sample data. Based on the experience of the implementer and the properties of the chord progressions, the stored data sets are converted into directed graphs. GA is used as the main engine for the second and third parts. The individuals for chord progression are represented in an array structure, as shown in Figure 2. Each array element represents one measure in 4/4 time, so the chord changes measure by measure. In the system for making chord progressions, an initial individual is generated based on the experience network. The first chord is selected from the experience network, and then the next chord is selected from the chords connected to the first chord by an edge. When the system executes genetic operators, the resultant individuals should retain the properties obtained from the experience network. Consequently, when a crossover operation is performed, two individuals are selected so that they have the same chord at the same position in the array. If the experience network has a deficiency, for example if it is too small, the population cannot be rich in diversity. To avoid this problem, when identical individuals exist, the system keeps just one and deletes all the others. New individuals are then added to compensate.
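The generation and crossover rules described above can be sketched as follows. This is only an illustration under stated assumptions: the small `NETWORK` dictionary is a made-up experience network, and the function names are hypothetical. The point of the sketch is that crossing two arrays at a position where both hold the same chord keeps every chord-to-chord step on an edge of the network.

```python
import random

# Hypothetical experience network: chord -> chords reachable by a directed edge.
NETWORK = {
    "C": ["F", "G7", "Am"],
    "F": ["C", "G7"],
    "G7": ["C"],
    "Am": ["F", "C7"],
    "C7": ["F"],
    "E": ["Am"],
}


def initial_progression(network, length, rng):
    """Walk the network: pick a first chord, then repeatedly follow an edge."""
    chord = rng.choice(sorted(network))
    progression = [chord]
    while len(progression) < length:
        chord = rng.choice(network[chord])
        progression.append(chord)
    return progression


def crossover(a, b, point):
    """Swap tails at `point`, allowed only where both parents hold the same chord."""
    if a[point] != b[point]:
        raise ValueError("parents must share the chord at the crossover point")
    return a[:point] + b[point:], b[:point] + a[point:]


def follows_network(progression, network):
    """True if every consecutive chord pair is an edge of the network."""
    return all(nxt in network[cur] for cur, nxt in zip(progression, progression[1:]))


rng = random.Random(0)
print(initial_progression(NETWORK, 8, rng))
```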
In addition, individuals with low fitness values are removed. The mutation rate is set low because mutation sometimes disturbs the connection between two chords. In the system for making melodies, genetic operators are executed sequentially, chord by chord, from the first element of the array. Consequently, a melody individual represents one measure of melody, as shown in Figure 3. Each array element represents a demiquaver. Suitable properties of the musical scale for each chord are stored in the experience network. The first melody individual is generated by randomly selecting from the notes included in the chord corresponding to the first array element. The next individual is generated by randomly selecting from the previous note and the notes adjacent to it. The generated melody individuals do not always make a sound. As Figure 3 shows, when the property of a melody individual is one, it makes a sound. When it is zero, the previous sound continues. In this way, notes longer than a demiquaver can be represented. However, as with GenJam, shorter sounds and triplets cannot be expressed. Furthermore, since the system generates music from the data accumulated in the experience network, it can generate theoretically correct music, but the music lacks originality.
Fig. 3: Gene representation of melody line and wavy data
Fig. 4: General representation of the model
3. Proposed Models
In this section, we first give an overview of the model. Next, we describe the information held by individuals. Finally, we describe the selection.
3.1 Overview
We propose a new composition model that does not require music composition experts. In general, there are two methods for composing music. One is melody first, in which chords are applied in accord with the melody, and the other is chord first, in which the melody is created using the chords. We use the latter method for our music composition model, provided that tempo and accent are not changeable. Figure 4 illustrates the model overview. Our model consists of two parts: chord progression and melody. To create music that is as unconstrained as possible, both parts make use of genetic operators. To represent notes of any length, such as triplets and dotted notes, within each individual, we adopt GP, which is based on tree structures and can express complicated data, rather than GA, which is based on array structures. Music is expressed as a sequence of bars. The chord progression generating part generates a partial chord progression bar by bar to obtain the best and final individual. The melody generating part generates a partial melody to fit the best individual of the chord progression. The initial chord progression bar is created using a Hidden Markov Model (HMM) [7], and the melody line that fits the initial chord progression bar is created. The resultant chord progression bar and melody line form the set of initial individuals for the GPs in the next step. This initialization strategy helps the GPs generate better individuals. Finally, we obtain an individual as the partial melody. Multi-agent evaluation [8] is adopted for the evaluation phase in both GP parts. Many agents evaluate the partial melody, so the created music chunk is evaluated from various perspectives. The evaluation methods are described in [9]. In the selection phase, we mainly adopt the elite strategy. As is well known, a single selection strategy often leads the solutions to local optima. Therefore, we also use the roulette or tournament strategy together with the elite strategy to obtain diversity in the resultant music. We think the ratio of the three strategies can serve as a parameter for the diversity, namely the characteristics of the generated music. In the improvement phase, genetic operators such as crossover and mutation are executed. When crossover is applied, if the selected partial tree has the same depth level and leaf information as the target partial tree, the genetic operator does not change the individuals at all. In such a case, the selection is canceled and another partial tree with a different depth level and/or different leaf information is searched for.
Fig. 5: Digitalization of notes
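One way to read the idea of mixing the three selection strategies by a ratio is sketched below. This is an assumption-laden illustration, not the procedure used in the experiments: the function name `select`, the `ratio` tuple of (elite, roulette, tournament) shares, the tournament size and the small weight offset are all hypothetical choices.

```python
import random


def select(population, fitness, n, ratio, rng, tournament_size=2):
    """Pick n individuals, splitting the quota between elite, roulette and tournament.

    ratio = (elite_share, roulette_share, tournament_share); the tournament
    quota is whatever remains after the first two shares are rounded.
    """
    elite_share, roulette_share, _ = ratio
    n_elite = min(n, round(n * elite_share))
    n_roulette = min(n - n_elite, round(n * roulette_share))
    n_tournament = n - n_elite - n_roulette

    # Elite: the best individuals are copied unchanged.
    ranked = sorted(population, key=fitness, reverse=True)
    chosen = ranked[:n_elite]

    # Roulette: probability proportional to (non-negative) fitness.
    if n_roulette:
        weights = [max(fitness(ind), 0.0) + 1e-9 for ind in population]
        chosen += rng.choices(population, weights=weights, k=n_roulette)

    # Tournament: the fitter of a few randomly drawn contenders wins.
    for _ in range(n_tournament):
        contenders = rng.sample(population, tournament_size)
        chosen.append(max(contenders, key=fitness))
    return chosen


rng = random.Random(0)
population = list(range(1, 21))  # toy individuals whose fitness is their own value
print(select(population, float, 10, (0.4, 0.3, 0.3), rng))
```

Raising the elite share makes the next generation more uniform, while larger roulette or tournament shares keep weaker individuals in play, which is the diversity trade-off the text refers to.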
3.2 Information of individuals
We explain individuals for chord progression in subsection
3.2.1, and initial individuals for melody in subsection 3.2.2.
3.2.1 Individuals for chord progression
A chord progression is represented by a tree. The information needed for a chord progression is the chords and the length of each chord. Figure 5 gives an example of converting keyboard notes into digital values. Chord roots are classified into 12 types, and chord configurations into 17 types, as shown in Table 1.
Table 1: Kinds of chord configuration
chord configuration                  signage
Major triad                          ¤
Minor triad                          ¤m
Dominant seventh                     ¤7
Minor seventh                        ¤m7
Major seventh                        ¤M7
Minor Major seventh                  ¤mM7
Diminished seventh                   ¤dim
Augmented triad                      ¤aug
Diminished triad                     ¤5
Augmented seventh                    ¤+5
Dominant seventh Flatted fifth       ¤5
Augmented Major seventh              ¤+5
Minor seventh Flatted fifth          ¤5
Major sixth                          ¤6
Minor sixth                          ¤m6
Suspended fourth                     ¤sus4
Dominant seventh suspended fourth    ¤7sus4
Table 2: Example of nodes
node       major triad   minor triad
C          0 4 7         0 3 7
Cis/Des    1 5 8         1 4 8
D          2 6 9         2 5 9
Dis/Es     3 7 10        3 6 10
E          4 8 11        4 7 11
F          5 9 0         5 8 0
Fis/Ges    :             :
G          :             :
Gis/As     :             :
A          :             :
Ais/B      :             :
H          11 3 6        11 2 6
Therefore, a node represents one of 204 (17*12) types of chord. Table 2 gives examples of nodes. To define the structure of individuals, the following conditions must be satisfied.
• The structure must be constructed with the minimum data to reduce the calculation cost.
• The structure can generate all the progression patterns.
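The node values in Table 2 follow from adding fixed interval steps to the root number modulo 12. The sketch below reproduces the two columns of Table 2 only; the names `ROOTS`, `INTERVALS` and `chord_notes` are illustrative, and the remaining 15 configurations of Table 1 are deliberately left out because their interval patterns are not listed in the text.

```python
# Root names in the order of their digital values 0..11 (German note names as in the text).
ROOTS = ["C", "Cis/Des", "D", "Dis/Es", "E", "F",
         "Fis/Ges", "G", "Gis/As", "A", "Ais/B", "H"]

# Semitone steps above the root for the two configurations shown in Table 2.
INTERVALS = {
    "major triad": (0, 4, 7),
    "minor triad": (0, 3, 7),
}


def chord_notes(root, kind):
    """Return the digital values of the chord's notes, wrapping around at 12."""
    base = ROOTS.index(root)
    return tuple((base + step) % 12 for step in INTERVALS[kind])


for root in ROOTS:
    print(root, chord_notes(root, "major triad"), chord_notes(root, "minor triad"))

# 12 roots times 17 configurations gives the 204 node types mentioned in the text.
print(len(ROOTS) * 17)
```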
In the proposed model, a chord is assigned to a node. The depth level of the tree structure represents the length of the chord. The phonetic value of the root node has a constant length so that the depth level of the tree structure corresponds to the length of the chord. In the case of a binary tree, the chord length at one level is twice that at the next lower level. In the case of a non-binary tree, we can assume that the number of sub-nodes at any level is even, because chord progressions with an odd number of chords rarely exist in general. Therefore, an even number of sub-nodes can be replaced with several binary sub-nodes. The proposed method adopts a binary tree for the chord progression structure. A chord progression is represented as the order of nodes visited by depth-first search. The length of a chord is represented by its depth level. In this way, the tree structure represents any chord and its length.
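The relation between depth level and chord length can be made concrete with a small sketch. It assumes, for illustration only, that chords sit on the leaves of a binary tree written as nested tuples, and that the root spans one unit of time; the function name `flatten` is hypothetical. A depth-first traversal then yields the chords in playing order, each with half the length of its parent level.

```python
from fractions import Fraction


def flatten(tree, length=Fraction(1)):
    """Depth-first traversal of a binary chord tree.

    A leaf is a chord name (str); an inner node is a pair (left, right)
    whose children each receive half of the node's length.
    """
    if isinstance(tree, str):
        return [(tree, length)]
    left, right = tree
    half = length / 2
    return flatten(left, half) + flatten(right, half)


# C fills the first half of the root span, F and G share the second half.
progression = flatten(("C", ("F", "G")))
print(progression)
```

Using exact fractions rather than floats keeps the lengths summing to the root length no matter how deep the tree grows.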
Fig. 6: Tree rank represents phonetic value
Fig. 7: Representation of triplet and dotted crotchet
Figure 6 gives an example of the tree structure. The phonetic value of the root node is a semibreve. The phonetic values of the second and the third level nodes represent a half note and a quarter note, respectively. In a chord progression, each phonetic value should have an appropriate length. In the case of Figure 6, a tree structure with depth level 3 is enough. The proposed tree structure for chord progression does not require a large number of depth levels in general. It means that we do not require large memory for small search
3.2.2 Initial individuals for melody
Figure 7 shows an example of a tree structure that represents a melody line. The tree structure for representing melody is defined as in the previous subsection. Namely, a node represents a pitch and its depth level represents the length of the sound. Unlike the chord progression, the tree structure for melody allows ternary sub-trees for triplets as well as binary sub-trees. The information stored in a node is one of 12 notes (C, Cis (Des), D, Dis (Es), E, F, Fis (Ges), G, Gis (As), A, Ais (B), H), a rest (*) or a hold (~), as shown in Figure 5. Note that keyboard notes are not limited to one octave. Consequently, there are 12*(the number of octaves)+2 kinds of nodes. "Hold" means continuation of the adjacent note, and is used to represent dotted notes and notes that extend over into the next measure.
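A short sketch may help to show how ternary sub-trees and hold nodes combine to give triplets and dotted notes. As before, this is an illustration under assumptions rather than the authors' data structure: the tree is written as nested tuples with note names on the leaves, the root is taken to be a semibreve of length 1, and the names `flatten` and `merge_holds` are hypothetical.

```python
from fractions import Fraction

REST, HOLD = "*", "~"  # symbols used in the text for rest and hold


def flatten(tree, length=Fraction(1)):
    """Depth-first traversal; each child gets an equal share of its parent's length."""
    if isinstance(tree, str):
        return [(tree, length)]
    if len(tree) not in (2, 3):
        raise ValueError("only binary and ternary sub-trees are allowed")
    share = length / len(tree)
    events = []
    for child in tree:
        events.extend(flatten(child, share))
    return events


def merge_holds(events):
    """Fold each hold into the note before it, lengthening that note."""
    merged = []
    for pitch, length in events:
        if pitch == HOLD and merged:
            prev_pitch, prev_length = merged[-1]
            merged[-1] = (prev_pitch, prev_length + length)
        else:
            # A hold at the very start is kept: it continues the previous measure.
            merged.append((pitch, length))
    return merged


# Left half: crotchet C tied to a quaver hold (dotted crotchet), then quaver D.
# Right half: a triplet E-F-G spread over a half note.
melody = (("C", (HOLD, "D")), ("E", "F", "G"))
print(merge_holds(flatten(melody)))
```

In this example C ends up with length 3/8 (a dotted crotchet) and each triplet note with length 1/6, two durations that a fixed-grid array encoding cannot express together.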
Table 3: Note behavior
primary triad     I    V(7)  IV
secondary triad   VI   VII   II

Table 4: Fundamental notes in each key
C         : 0 2 4 5 7 9 11
Cis/Des   : 1 3 5 6 8 10 0
D         : 2 4 6 7 9 11 1
Dis/Es    : 3 5 7 8 10 0 2
E         : 4 6 8 9 11 1 3
F         : 5 7 9 10 0 2 4
Fis/Ges   : 6 8 10 11 1 3 5
G         : 7 9 11 0 2 4 6
Gis/As    : 8 10 0 1 3 5 7
A         : 9 11 1 2 4 6 8
Ais/B     : 10 0 2 3 5 7 9
H         : 11 1 3 4 6 8 10
c         : 0 2 3 5 7 8 10
cis/des   : 1 3 4 6 8 9 11
d         : 2 4 5 7 9 10 0
dis/es    : 3 5 6 8 10 11 1
e         : 4 6 7 9 11 0 2
f         : 5 7 8 10 0 1 3
fis/ges   : 6 8 9 11 1 2 4
g         : 7 9 10 0 2 3 5
gis/as    : 8 10 11 1 3 4 6
a         : 9 11 0 2 4 5 7
ais/b     : 10 0 1 3 5 6 8
h         : 11 1 2 4 6 7 9

In GP, the configuration of initial individuals is important because different configurations lead to different convergence speeds and/or values. Since random configurations of initial individuals for melody may generate a large number of individuals unsuited to the given chord progression, the initial individuals for melody should be configured based on a chord progression. To solve the problem, we uniquely generate an initial individual by an HMM. The HMM assumes that the system is a Markov process with unknown parameters, and estimates the unknown parameters from observable information. By using the Baum-Welch algorithm [7], which estimates the parameters of a given model, and the Viterbi algorithm [7], which calculates the maximum likelihood state transitions, the HMM generates the most likely melody line for a partial chord progression generated by the chord progression model. For example, when a partial chord progression (CCFCDCAC) is given, we get a uniquely determined melody line of (DDCE*CEA).
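The decoding step can be illustrated with a compact Viterbi sketch in which chords are the observations and melody notes are the hidden states. Everything numeric here is invented for illustration: the three states, the uniform start and transition probabilities and the emission table are assumptions, and in the model they would instead come from Baum-Welch training on existing music.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            prob, prev = max((best[-1][p][0] * trans_p[p][s], p) for p in states)
            column[s] = (prob * emit_p[s][obs], prev)
        best.append(column)

    # Trace back from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return path[::-1]


# Hypothetical toy model: melody notes (hidden) emitting chords (observed).
notes = ["C", "E", "A"]
start = {n: 1 / 3 for n in notes}
trans = {n: {m: 1 / 3 for m in notes} for n in notes}
emit = {
    "C": {"C": 0.6, "F": 0.4},
    "E": {"C": 0.8, "F": 0.2},
    "A": {"C": 0.1, "F": 0.9},
}
print(viterbi(["C", "C", "F", "C"], notes, start, trans, emit))
```

Since the result is the single most probable path, the same chord progression and the same probabilities always give the same melody line, which matches the "uniquely determined" wording above.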
Next, GP is applied so that we get individuals that contain melody lines similar to the one uniquely determined by the HMM. When the length of the chord progression given to the HMM is L, the number of uniquely determined melody lines is L+1. However, the (L+1)th melody line is regarded as a note suitable for the first note because of the property of the Viterbi algorithm. Consequently, the initial population is generated with reference to the L melody lines. In this regard, when similar melody lines are generated, changing the number of child individuals and the depth level has no adverse effect. In fact, there is no need to fix the number of melody lines to L.
3.3 Selection
Music has minimum basic properties that hold for all categories. The individuals that do not have such properties should be removed to obtain better chord progressions. The properties are as follows [10].
• Tonic (T): basic notes; beginning and ending notes
• Subdominant (S): notes for progression and connection; easy to move to the Dominant
• Dominant (D): notes for returning to the Tonic
As described above, the basic chord progression is T-S-D-T. In Table 3, the degrees of a musical scale are denoted by I to VII. III is not presented because it has none of these properties. Table 4 lists the fundamental notes of I-VII for the keys C to H in major and minor. Major and minor keys are expressed in uppercase characters (C to H) and lowercase characters (c to h), respectively. Across major and minor, some keys share the same fundamental notes (for example, C and a).
In genetic operations, not all individuals have the T-S-D-T chord progression. If the individuals that do not have the T-S-D-T chord progression were completely removed in a generation, the generation would be poor in diversity. So we select, for removal, the individuals that have a leaf with a D-S chord progression. In this case, one third and one fifth of the total individuals are removed when the number of chords is four and five, respectively. In addition, the shortest chord in most chord progressions is a quarter note. For this reason, individuals with deeper levels are removed, too.
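The culling rule can be sketched as a simple filter over scale degrees. Two points are assumptions made for this illustration: the columns of Table 3 are read as Tonic, Dominant and Subdominant in that order (so I and VI are T, V and VII are D, IV and II are S), and the rule is read as removing any progression in which a Dominant chord is followed directly by a Subdominant chord. The names `FUNCTION`, `has_d_to_s` and `cull` are hypothetical.

```python
# Assumed reading of Table 3: columns are Tonic, Dominant, Subdominant.
FUNCTION = {
    "I": "T", "VI": "T",
    "V": "D", "VII": "D",
    "IV": "S", "II": "S",
}


def has_d_to_s(degrees):
    """True if a Dominant chord is immediately followed by a Subdominant chord."""
    functions = [FUNCTION.get(d) for d in degrees]  # III maps to None
    return any(a == "D" and b == "S" for a, b in zip(functions, functions[1:]))


def cull(population):
    """Keep only the progressions without a D-S step."""
    return [p for p in population if not has_d_to_s(p)]


population = [
    ["I", "IV", "V", "I"],   # T-S-D-T, kept
    ["I", "V", "IV", "I"],   # contains D-S, removed
    ["VI", "II", "VII", "I"],
]
print(cull(population))
```

Filtering only on the D-S step, instead of demanding a full T-S-D-T pattern, leaves many non-standard progressions in the population, which is the diversity argument made above.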
4. Experiments
In this section, we first show the experimental results for chord progression generation. Next, we show melody generation experiments based on the chord progression generation results. The root phonetic values of the individuals generated in both models represent four measures.
4.1 Generation of chord progression
We explain experimental environments in subsection 4.1.1
and experimental results in subsection 4.1.2.
4.1.1 Experimental environments
We perform chord progression generation experiments in C major. The parameters we used for the experiments are the number of individuals (np), crossover value (c), mutation rate (m) and maximum generation number (Generation). In this experiment, we use the following parameter choices.
Although we intend to adopt an evaluation model based on multi-agents, the model has not been fully constructed yet. So we use the following values for calculating the fitness value.
i) fraction of chords in Table 5
ii) fraction of chord mutation per individual

Table 5: Basic chords in each key
chord          M     M     M     m     m     m     M
Major/minor    1st   4th   5th   1st   4th   5th   3rd
               T     S     D     T     S     D
B/g♯           B     E     F♯    G♯m   C♯m   D♯m   D♯
E/c♯           E     A     B     C♯m   F♯m   G♯m   G♯
A/f♯           A     D     E     F♯m   Bm    C♯m   C♯
D/b            D     G     A     Bm    Em    F♯m   F♯
G/e            G     C     D     Em    Am    Bm    B
C/a            C     F     G     Am    Dm    Em    E
F/d            F     B♭    C     Dm    Gm    Am    A
B♭/g           B♭    E♭    F     Gm    Cm    Dm    D
E♭/c           E♭    A♭    B♭    Cm    Fm    Gm    G
A♭/f           A♭    D♭    E♭    Fm    B♭m   Cm    C
D♭/b♭          D♭    G♭    A♭    B♭m   E♭m   Fm    F
G♭/e♭          G♭    C♭    D♭    E♭m   A♭m   B♭m   B♭
variation      M7    M7    7     m7    m7    M7    7
               sus4        sus4
Table 5 shows examples of the basic chords in major and minor tonalities. There are rules about the chords frequently used in each tonality. For example, in C major (C) and A minor (a), chord progressions are generated by using 16 chords (C, F, G, Am, Dm, Em, E, CM7, Csus4, FM7, G7, Gsus4, Am7, Dm7, EM7, E7) so that the tonality is kept.
i) is used for keeping the tonality. This experiment is intended to generate chord progressions in C major. The number of leaf nodes that match these 16 chords (sum1) is divided by the number of leaf nodes (j), and the quotient is added to fitness1.
ii) is used for controlling the chord progression. Due to the culling, the depth level (level) of any individual is less than 4. When level is 4, the length is a half note, but it is rare for a chord progression to change every half note. Therefore, the chord progression proceeds at level 3, which is the length of a measure. Let sum2 be the number of leaf nodes at level 3. When sum2 is 1, 1/j is added to fitness2. When sum2 satisfies 2 ≤ sum2 ≤ 4, 1.0 is added to fitness2.
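Using fitness1 and fitness2, fitness is calculated as below and satisfies 0 ≤ fitness ≤ 2.
\[
\mathit{fitness} = \mathit{fitness1} + \mathit{fitness2} =
\begin{cases}
\dfrac{\mathit{sum1} + \mathit{sum2}}{j} & (0 \le \mathit{sum2} \le 1) \\[2ex]
\dfrac{\mathit{sum1}}{j} + 1 & (2 \le \mathit{sum2} \le 4)
\end{cases}
\tag{1}
\]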
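Equation (1) can be checked with a few lines of Python. The sketch below is an illustration only: the function name `chord_fitness` and its arguments are hypothetical, the chord set is the list of 16 chords given in the text for C major, and the caller is assumed to supply sum2, the number of leaf nodes at level 3.

```python
# The 16 chords listed in the text for C major / A minor.
C_MAJOR_CHORDS = {
    "C", "F", "G", "Am", "Dm", "Em", "E", "CM7", "Csus4",
    "FM7", "G7", "Gsus4", "Am7", "Dm7", "EM7", "E7",
}


def chord_fitness(leaves, sum2):
    """Fitness of a chord progression following equation (1).

    leaves: chord names on the leaf nodes; sum2: number of leaves at level 3.
    """
    j = len(leaves)
    if j == 0:
        raise ValueError("an individual needs at least one leaf node")
    if not 0 <= sum2 <= 4:
        raise ValueError("sum2 must lie between 0 and 4")
    sum1 = sum(1 for chord in leaves if chord in C_MAJOR_CHORDS)
    fitness1 = sum1 / j                          # tonality term, at most 1
    fitness2 = sum2 / j if sum2 <= 1 else 1.0    # progression term, at most 1
    return fitness1 + fitness2


print(chord_fitness(["C", "F", "G", "C"], 4))    # every chord in key, one chord per measure
print(chord_fitness(["C", "F#", "G", "C"], 1))
```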
4.1.2 Experiment results
Figure 8 shows the transition of fitness values against the number of individuals. Figure 8-a shows the average values of fitness for np = 5, 10, 50, 60, 70, 80, 90 and 100, where each experiment is performed ten times. Figure 8-b, Figure 8-c and Figure 8-d show the transition of fitness values for np = 50, 60 and 100, respectively. The dotted lines show the average values. The genetic operations converge two and six times out of ten experiments in the case of np = 5 and 10, respectively. In the remaining cases, the genetic operations fall into local solutions without converging by Generation = 50. When the number of individuals is small (np = 5, 10), the genetic operations tend to fall into local solutions because of the large number of chord progression node types (204 types). In the case of np = 50, 60, 70, 80, 90 and 100, all genetic operations converge at Generation = 42, 37, 40, 14, 15 and 20, respectively. Apparently, the larger the number of individuals is, the faster the convergence is. In particular, when the number of individuals exceeds 80, the convergence speed saturates.
Fig. 8: Process of fitness in chord progression generation (panels: a. average, b. np 10, c. np 50, d. np 100)
4.2 Melody generation
Melody generation is based on chord progression arrays
obtained from chord progression generation. We explain
experimental environments in subsection 4.2.1, and exper-
imental results in subsection 4.2.2.
4.2.1 Experimental environments
We use the following parameters as used in chord progres-
sion generation: the number of individuals (np), crossover
value (c), mutation rate (m) and maximum generation num-
ber (Generation). In this experiment, we use the following
parameter choices.
The length of a chord for input is fixed to a half note. As
in the case of chord progression, since the evaluation model
by multi-agent has not been well constructed yet, we use the
following values for calculating the fitness value.
iii) Consistency with each chord of the given chord arrays.
iv) Supplementary value for melody notes.
The consistency is defined as the rate at which the generated melody suits the input chord progression. So, the fitness is calculated from whether the generated notes are included in the input chords. For example, chord C is constructed from C, E and G as shown in Table 5, so the notes C, E and G are better suited to it. The rate is obtained as sum3/j and is added to fitness, where sum3 is the number of suitable leaf nodes and j is the number of leaf nodes.
A melody line that includes overlong notes sometimes gives a boring impression. Therefore, iv) is used for reducing such melody lines. It is calculated as sum4/j and subtracted from fitness, where sum4 is the number of leaf nodes longer than a semibreve.
Using fitness3 and fitness4, fitness is calculated as below and satisfies −1 ≤ fitness ≤ 1.
\[
\mathit{fitness} = \frac{\mathit{sum3} - \mathit{sum4}}{j}
\qquad (0 \le \mathit{sum3},\ \mathit{sum4} \le j,\ 0 < j)
\tag{2}
\]
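Equation (2) can be sketched in the same way. The representation of a leaf as a (note, chord notes, overlong flag) triple and the function name `melody_fitness` are assumptions made for this illustration; notes and chord notes use the digital values of Figure 5 and Table 2.

```python
def melody_fitness(leaves):
    """Fitness of a melody following equation (2).

    leaves: one (note, chord_notes, is_overlong) triple per leaf node, where
    chord_notes holds the notes of the chord sounding under that leaf.
    """
    j = len(leaves)
    if j == 0:
        raise ValueError("an individual needs at least one leaf node")
    sum3 = sum(1 for note, chord_notes, _ in leaves if note in chord_notes)
    sum4 = sum(1 for _, _, is_overlong in leaves if is_overlong)
    return (sum3 - sum4) / j


c_major = {0, 4, 7}  # chord C: notes C, E, G
leaves = [
    (0, c_major, False),  # C fits the chord
    (2, c_major, False),  # D does not fit
    (4, c_major, True),   # E fits but is overlong
    (7, c_major, False),  # G fits
]
print(melody_fitness(leaves))
```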
4.2.2 Experiment results
Using a given chord progression CCFCDCAC, a melody line DDCEA*CD is obtained by the HMM. Figure 9 shows the transition of fitness values against the number of individuals. Figure 9-a shows the average values of fitness for np = 5, 10, 15 and 20, where each experiment is performed ten times. Figure 9-b, Figure 9-c and Figure 9-d show the transition of fitness values for np = 10, 15 and 20, respectively. The dotted lines show the average values. In the case of np = 5, the genetic operations converge six times out of ten experiments at Generation = 43 (average), while in the remaining four experiments they do not converge and fall into local solutions at Generation = 133, 144, 218 and 244. In the case of np = 10, the genetic operations converge seven times out of ten experiments at Generation = 41 (average), while in the remaining three experiments they do not converge and fall into local solutions at Generation = 55, 70 and 153. In the case of np = 15, the genetic operations converge eight times out of ten experiments at Generation = 31 (average), while in the remaining two experiments they do not converge and fall into local solutions at Generation = 41 and 53. In the case of np = 20, all the genetic operations converge by Generation = 51. We observe that the convergence is slow with a small number of individuals. When the number of generations exceeds the average number of generations needed for convergence, we also confirm that the runs fall into local solutions. Note that runs converge for every number of individuals from five to twenty. In addition, we observe that the convergence is fast with a large number of individuals. In the case of a small number of individuals, the convergence is slow even in the best case. It turns out that the larger the number of individuals, the faster the convergence.
Fig. 9: Process of fitness in melody generation (panels: a. average, b. np 10, c. np 15, d. np 20)
5. Conclusions
In this paper, we proposed a music composition model that automatically generates music with GPs. In this model, we apply GP to the chord progression part and apply HMM and GP to the melody part. Both parts generate a chord progression or a melody by GP with appropriate parameter choices. In the melody part, suitable melody lines for a given partial chord progression are provided by the HMM so that good initial individuals are acquired. By regarding the depth levels of a tree structure as phonetic values, it is possible to represent shorter notes. In addition, by using binary and/or ternary trees, it is possible to generate any rhythm pattern, including triplets and dotted notes. If we give other existing music to the HMM to change the state transition probabilities, different music would be generated even with the same parameter choice.
In this paper, we performed some experiments for music generation in C major. It turned out that melody lines with longer phonetic values are generated with fast convergence while melody lines with shorter phonetic values are generated with slow convergence, and that this is not affected by the number of individuals. The evaluation method we used is based on phonetic values. We also use chord selection based on the basic chord table in the chord progression part, and consistency with the given chord arrays in the melody part. However, in this evaluation, the restrictions on chord progression patterns and on notes in melody lines seem to be too heavy.
Improved evaluation methods using multi-agents are our future work. We will improve the parameter choices of our models to obtain original sounds more efficiently. The final system will provide music novices with their favorite music without worrying about copyright or being bothered by complicated system operations. Furthermore, we would like to apply our future system to jazz improvisation.
References
[1] J. A. Biles, “GenJam: A Genetic Algorithm for Generating Jazz Solos,” in Proc. ICMA’94, 1994, p. 3-4.
[2] T. Yamada and H. Shiizuka, “Automatic Composition by Generic Algorithm,” in Proc. IPSJ SIGMUS’98, 1998, paper 27.2, p. 7-14.
[3] T. Tanaka, F. Toyama, and K. Shoji, “Automatic Composition by Combination of Pitch Transition Patterns and Rhythm Using a Genetic Algorithm,” in Proc. IPSJ SIGMUS’01, 2001, paper 41.8, p. 43-48.
[4] J. H. Holland, Adaptation in Natural and Artificial Systems, 2nd ed., Massachusetts, USA: MIT Press, 1992.
[5] J. R. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection, Massachusetts, USA: MIT Press, 1992.
[6] J. R. Koza, Genetic Programming II: Automatic Discovery of Reusable Programs, Massachusetts, USA: MIT Press, 1994.
[7] P. Taylor, Text-to-Speech Synthesis, Cambridge, UK: Cambridge University Press, 2009.
[8] A. M. Uhrmacher and D. Weyns, Computational Analysis, Synthesis, and Design of Dynamic Models Series 4: Multi-Agent Systems: Simulation and Applications, Oxford, UK: CRC Press, 2009.
[9] T. Yamanaka, K. Komatsu, N. Yoshii, M. Takata and K. Joe, “Multi-Regression Analysis of Music Impressions for Music Evaluation,” in Proc. PDPTA’10, 2010, accepted.
[10] W. E. Caplin, Classical Form: A Theory of Formal Functions for the Instrumental Music of Haydn, Mozart, and Beethoven, Oxford, UK: Oxford University Press, 1998.