Article

Machine Learning of Jazz Grammars

Abstract

In the context of an educational software tool that can generate novel jazz solos using a probabilistic grammar (Keller 2007), this article describes the automated learning of such grammars. Learning takes place from a corpus of transcriptions, typically from a single performer, and our methods attempt to improvise solos representative of such a style. In order to capture idiomatic gestures of a specific soloist, we extend an earlier grammar representation (Keller and Morrison 2007) with a technique for representing melodic contour. Representative contours are extracted from a corpus using clustering, and sequencing among contours is done using Markov chains that are encoded into the grammar. This article first defines the basic building blocks for contours of typical jazz solos, which we call slopes, then shows how these slopes may be incorporated into a grammar wherein the notes are chosen according to tonal categories relevant to jazz playing. We show that melodic contours can be accurately portrayed using slopes learned from a corpus. Experimental results, including blind comparisons of solos generated from grammars based on several corpora, are reported.
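The abstract names slopes as the building blocks of a contour without defining them here. As a minimal sketch, assuming a slope can be approximated as a maximal run of notes moving in one direction (an assumption made for illustration, not the authors' definition), a melody given as MIDI pitch numbers could be segmented like this. The function name `split_into_slopes` is hypothetical.

```python
from typing import List, Tuple


def direction(a: int, b: int) -> str:
    """Label the motion from pitch a to pitch b."""
    if b > a:
        return "up"
    if b < a:
        return "down"
    return "flat"


def split_into_slopes(pitches: List[int]) -> List[Tuple[str, List[int]]]:
    """Split MIDI pitches into maximal runs that keep one direction.

    Assumption: a 'slope' is approximated as a monotone run. The turning
    note is shared by the run it ends and the run it starts.
    """
    if not pitches:
        return []
    if len(pitches) == 1:
        return [("flat", list(pitches))]
    slopes = []
    current = direction(pitches[0], pitches[1])
    run = [pitches[0], pitches[1]]
    for prev, cur in zip(pitches[1:], pitches[2:]):
        d = direction(prev, cur)
        if d == current:
            run.append(cur)
        else:
            slopes.append((current, run))
            current, run = d, [prev, cur]
    slopes.append((current, run))
    return slopes
```

For the pitch list `[60, 62, 64, 63, 61, 65]` this yields an ascending run, a descending run, and a final ascending run. Each run could then be summarized by its interval range before clustering.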

No full-text available

... Melody [Pinkerton 1956;Brooks et al. 1957;Moorer 1972;Conklin and Witten 1995;Pachet and Roy 2001;Davismoon and Eccles 2010;Pearce et al. 2010;Gillick et al. 2010;McVicar et al. 2014;Papadopoulos et al. 2014] Harmony [Hiller Jr and Isaacson 1957;Xenakis 1992 ...
... Melody [Keller and Morrison 2007;Gillick et al. 2010;Herremans and Sörensen 2012] Harmony [Hiller Jr and Isaacson 1957;Steedman 1984;Ebcioglu 1988;Cope 1996;Assayag et al. 1999b;Cope 2004;Huang and Chew 2005 Narrative [Casella and Paiva 2001;Farbood et al. 2007;Brown 2012;Nakamura et al. 1994] ...
... The improvisation system (Impro-Visor) designed by Keller and Morrison [2007] uses probabilistic grammars to generate jazz solos. The model successfully learns the style of a composer, as reflected in an experiment described by Gillick et al. [2010], where human listeners correctly matched 95% of solos composed by Impro-Visor in the style of the famous performer Clifford Brown to the original solo. The accuracy was 90% for Miles Davis, and slightly less, 85% for Freddie Hubbard. ...
Preprint
Digital advances have transformed the face of automatic music generation since its beginnings at the dawn of computing. Despite the many breakthroughs, issues such as the musical tasks targeted by different machines and the degree to which they succeed remain open questions. We present a functional taxonomy for music generation systems with reference to existing systems. The taxonomy organizes systems according to the purposes for which they were designed. It also reveals the inter-relatedness amongst the systems. This design-centered approach contrasts with predominant methods-based surveys and facilitates the identification of grand challenges to set the stage for new breakthroughs.
... This robot, named Shimon, plays the marimba alongside humans in a traditional jazz combo based on a conventional understanding of key, harmony, and form, but with a complex machine learning-based model for generating solos. I compare Shimon to a computer program called Impro-Visor (Gillick, Tang, and Keller 2010), which does not perform in real time but which generates solos in a similar style using a different corpus-based machine learning model, and I contrast both of these systems with George Lewis's Voyager, a long-standing project that stems from Lewis's work in free improvisation. ...
... It is worth noting that Shimon's particular style is far from the only possible algorithmic approach to the jazz repertory that the robot is designed to play. To illustrate, I turn to another system developed by Jon Gillick, Kevin Tang, and Robert Keller, which I will refer to as Impro-Visor, as it is incorporated into the software of that name produced by Keller, though that program has other features not discussed here (see Gillick, Tang, and Keller 2010). ...
... 1. number of notes in the abstract melody, 2. location of the first note that starts within the window, 3. total duration of rests, 4. average maximum slope of ascending or descending groups of notes, 5. whether the window starts on or off the beat, 6. order of the contour (how many times it changes direction), and 7. consonance. (Gillick, Tang, and Keller 2010, 60) Gillick, Tang, and Keller (2010) treat the clusters themselves as states in a Markov model, returning to the full corpus of actual melodic fragments and classifying each by its cluster. They then calculate transition likelihoods among the clusters. ...
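The last step described in this excerpt, estimating transition likelihoods among clusters, can be sketched as follows. This is only an illustration: it assumes the clustering has already been done by some method not specified here, and that each solo has been reduced to a sequence of integer cluster labels. The function name `transition_probabilities` and the sample labels are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Sequence


def transition_probabilities(
    label_sequences: Sequence[Sequence[int]],
) -> Dict[int, Dict[int, float]]:
    """Estimate first-order Markov transition probabilities.

    Each inner sequence holds the cluster label of every consecutive
    melodic fragment in one solo (labels are assumed inputs).
    """
    counts: Dict[int, Counter] = defaultdict(Counter)
    for seq in label_sequences:
        # Count each adjacent pair (current cluster -> next cluster).
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    # Normalize each row of counts so it sums to 1.
    return {
        current: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for current, row in counts.items()
    }


# Two hypothetical solos, each already mapped to cluster labels.
probs = transition_probabilities([[0, 1, 0, 1, 2], [0, 0, 1]])
```

Here cluster 0 is followed by cluster 1 in three of four observed cases, so `probs[0][1]` is 0.75. A cluster that only ever ends a solo has no outgoing row, which a generator would need to handle separately.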
Article
Though improvising computer systems are hardly new, jazz has recently become the focus of a number of novel computer music projects aimed at convincingly improvising alongside humans, with a particular focus on the use of machine learning to imitate human styles. The attempt to implement a sort of Turing test for jazz, and interest from organizations like DARPA in the results, raises important questions about the nature of improvisation and musical style, but also about the ways jazz comes popularly to stand for such broad concepts as “conversation” or “democracy.” This essay explores these questions by considering robots that play straight-ahead neoclassical jazz alongside George Lewis’s free-improvising Voyager system, reading the technical details of such projects in terms of the ways they theorize the recognition and production of style, but also in terms of the political implications of human-computer musicking in an age of algorithmic surveillance and big data.
... Toward this goal, we developed two techniques that will enhance coherence. The first is to extend the grammar formalism developed in [5] for Impro-Visor [6] to provide an easy means for exploiting motifs, which are either represented by specific grammar productions or which are captured when a motif is generated dynamically. The second is to provide a mechanism for recognizing motifs during the grammar learning process, so that the exploitation mechanism has a set of motifs to use as a basis. ...
... In the current work, motifs are manipulated using one of the methods by which Impro-Visor improvises: probabilistic generative grammars, as described in [5]. In such a grammar, filling melodic space is governed by grammatical productions. ...
... Here are few examples of how grammars work. We use the notation of [5] for increasing readability of the exposition. However, in the Impro-Visor software, grammars are represented textually using S-expressions. ...
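To make the idea of grammatical productions filling melodic space concrete, here is a minimal sketch of sampling from a probabilistic grammar. It does not reproduce Impro-Visor's S-expression syntax or its actual rules: the grammar below, its symbol names (`Phrase`, `Half`, `Quarter`), its terminals (`C2`, `C4`, `L4`, `R4`), and its weights are all invented for illustration. Every expansion of `Phrase` fills exactly four beats.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical grammar: nonterminal -> list of (probability, right-hand side).
# Terminals are a letter plus a note value: 2 = half note, 4 = quarter note.
GRAMMAR: Dict[str, List[Tuple[float, List[str]]]] = {
    "Phrase": [(0.6, ["Half", "Half"]), (0.4, ["Half", "Quarter", "Quarter"])],
    "Half": [(0.7, ["C2"]), (0.3, ["Quarter", "Quarter"])],
    "Quarter": [(0.5, ["C4"]), (0.3, ["L4"]), (0.2, ["R4"])],
}


def expand(symbol: str, rng: random.Random) -> List[str]:
    """Recursively expand a symbol into a list of terminal symbols."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: nothing left to rewrite
    weights, bodies = zip(*GRAMMAR[symbol])
    # Pick one production at random, weighted by its probability.
    body = rng.choices(bodies, weights=weights, k=1)[0]
    result: List[str] = []
    for part in body:
        result.extend(expand(part, rng))
    return result


def beats(terminal: str) -> float:
    """Duration in beats, assuming a quarter note is one beat."""
    return 4 / int(terminal[1:])


phrase = expand("Phrase", random.Random(0))
```

The terminals say only what kind of event fills each slot, so actual pitches would still have to be chosen against the current chord in a later step.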
Conference Paper
Full-text available
Building on previous work in computer generated jazz solos using probabilistic grammars, this paper describes research extending the capabilities of the current learning process and grammar representation used in the Impro-Visor educational music software with the concepts of motifs and motif patterns. An approach has been developed using clustering, best match search techniques, and probabilistic grammar rules to identify motifs and incorporate them into computer generated solos. The abilities of this technique are further expanded through the use of motif patterns. Motif patterns are used to induce coherence in generated solos by learning the patterns in which motifs were used in a given set of transcriptions. This approach is implemented as a feature of the Impro-Visor software.
... Two fully implemented software packages represent the two theoretical approaches, ImPact (Ramalho, Rolland, & Ganascia, 1999) and Impro-Visor (Gillick, Keller, & Tang, 2010). Ramalho et al. describe how their software, ImPact, creates jazz bass lines by reusing fragments derived from six different recordings of bass lines performed by jazz bassist Ron Carter. ...
... Impro-Visor (Gillick et al., 2010;Keller, 2012) is a full software implementation based on rules or grammars that is able to create solo lines in various styles given a chord progression. Gillick et al. generated grammars inspired by the solos of Charlie Parker, Lester Young, John Coltrane, and others by deriving rules related to contour, rhythm, chord tones, approach tones, and color tones. ...
... Often computer algorithms for improvisation are based on strict vertical relationships between the improvised line and the underlying chords so the line clearly reflects each chord (Band-in-a-Box, 2013; Gillick et al., 2010;Johnson-Laird, 2002;Ramalho et al., 1999). In two of these models, grammars based directly on the underlying chord progression are used to create improvised material, thereby possibly overemphasizing vertical elements in improvisational thinking (Gillick et al., 2010;Johnson-Laird, 2002). ...
Article
Full-text available
Building on previous work, which suggests that jazz improvisers insert patterns stored in procedural memory, a probabilistic model based on patterns from a corpus of Charlie Parker solos was developed and implemented. In previous analysis, patterns were detected in the corpus in significant proportions; however, the results of a parallel control situation showed minimal patterns. The control improvisation was generated by software based on grammars and contours, coincident with the cognitive position that emphasizes learned rule-based procedures in improvisation, as opposed to stored patterns. The present pattern-based improvisations, using our model, have graphs that coincide significantly with the actual human improvisation. Though briefly described earlier (Norgaard, Montiel, & Spencer, 2013), the current article expands the theoretical foundation and adds methods for evaluating our algorithm using interval distributions and alternate corpora. Specifically, we show that the algorithm is capable of generating improvisations in fiddle and classical styles, demonstrating that the pattern-based algorithm is style independent. Our model shows much promise both for future research in the cognitive underpinnings of musical improvisation as well as for the development of software based on a stylistically appropriate concatenation of actual patterns.
... A case study is presented demonstrating a practical application of the SPECS methodology. SPECS is used to evaluate in detail the creativity of four musical improvisation systems: GAmprovising [7], GenJam [8], Impro-Visor [9] and Voyager [10]. The results show that GenJam is perceived as most creative overall. ...
... In discussion at the most recent International Conference in Computational Creativity (ICCC '11), the question of how to evaluate computational creativity was referred to as one of the 'big questions' of this research area. Although some authors have proposed evaluation methodologies for creativity, to some at ICCC '11 it seemed pointless to tackle such questions while they have not yet been dealt with sufficiently in human creativity research, despite decades more investigation. Some members of the steering committee have gone as far as to say that the tackling of creativity evaluation 'probably needs to be deferred until we are substantially more capable in general automated reasoning and knowledge representation' [1, p. 19]. ...
... This has grown to an average of 33 accepted papers and an average of 42 program committee members over the 2010-2012 conferences. 9 Existing evaluation methodologies for computational creativity are examined later in this section of the paper. 10 This paper will return later to these discussions at ICCC '11. ...
Article
Full-text available
Computational creativity is a flourishing research area, with a variety of creative systems being produced and developed. Creativity evaluation has not kept pace with system development with an evident lack of systematic evaluation of the creativity of these systems in the literature. This is partially due to difficulties in defining what it means for a computer to be creative; indeed, there is no consensus on this for human creativity, let alone its computational equivalent. This paper proposes a Standardised Procedure for Evaluating Creative Systems (SPECS). SPECS is a three-step process: stating what it means for a particular computational system to be creative, deriving and performing tests based on these statements. To assist this process, the paper offers a collection of key components of creativity, identified empirically from discussions of human and computational creativity. Using this approach, the SPECS methodology is demonstrated through a comparative case study evaluating computational creativity systems that improvise music. An author's postprint (same content, but before it has been put into journal-specific formatting) is available via my institutional repository at https://kar.kent.ac.uk/cgi/users/home?screen=EPrint::View&eprintid=42379
... Event-Based representations also have a long history in music generation; they have been used in models based on Markov Chains (Ames, 1989;Gillick et al., 2010), Recurrent Neural Networks (Mozer, 1994;Eck and Schmidhuber, 2002;Sturm et al., 2016) and Transformers (Huang and Yang, 2020). In contrast to Fixed-Grid representations, which keep track of an event's temporal position by encoding it relative to a specific point on a timeline, Event-Based representations track the passage of time through a discrete vocabulary of time-shift events, each of which moves a playhead forward by a specific increment. ...
... In contrast to Fixed-Grid representations, which keep track of an event's temporal position by encoding it relative to a specific point on a timeline, Event-Based representations track the passage of time through a discrete vocabulary of time-shift events, each of which moves a playhead forward by a specific increment. These increments can be measured in musical durations like 8th or 16th notes, for example to generate jazz improvisations (Gillick et al., 2010) or folk tunes (Sturm et al., 2016), but of particular interest for this work are a recent series of models of expressive performance that use more fine-grained timespans, with vocabularies allowing time shifts as short as 8 milliseconds. These extended vocabularies of time shifts make room for models to learn directly from data in formats like MIDI without explicitly modeling tempo and beat. ...
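The playhead idea in this excerpt can be illustrated with a small round-trip encoder. This is a sketch under simplifying assumptions: notes are reduced to an onset (counted in grid steps such as 16th notes) and a MIDI pitch, durations and velocities are ignored, and the token names `SHIFT_n` and `NOTE_p` are invented here rather than taken from any cited model. A real vocabulary would also cap the largest shift so that the token set stays finite.

```python
from typing import List, Tuple

Note = Tuple[int, int]  # (onset in grid steps, MIDI pitch)


def encode_events(notes: List[Note]) -> List[str]:
    """Turn notes into a token stream with explicit time-shift events."""
    events: List[str] = []
    playhead = 0
    for onset, pitch in sorted(notes):
        if onset > playhead:
            # Move the playhead forward instead of storing absolute time.
            events.append(f"SHIFT_{onset - playhead}")
            playhead = onset
        events.append(f"NOTE_{pitch}")
    return events


def decode_events(events: List[str]) -> List[Note]:
    """Rebuild (onset, pitch) pairs by replaying the token stream."""
    notes: List[Note] = []
    playhead = 0
    for event in events:
        kind, value = event.split("_")
        if kind == "SHIFT":
            playhead += int(value)
        else:
            notes.append((playhead, int(value)))
    return notes


tokens = encode_events([(0, 60), (2, 62), (2, 65), (6, 67)])
```

Two notes that start together produce consecutive `NOTE` tokens with no shift between them, which is how this kind of encoding represents simultaneity.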
Article
Full-text available
We present a new data representation for music modeling and generation called a Flexible Grid. This representation aims to balance flexibility with structure in order to encode all the musical events (notes or rhythmic onsets) in a dataset without quantizing or discarding any temporal information. In experiments with a dataset of MIDI drum performances, we find that when implemented in a Variational AutoEncoder (VAE) model, Flexible Grid representations can enable detailed generation of music performance data that includes multiple different gestures and articulations.
... Technologically, our Musical and Conversational Artificial Intelligence can imitate the typically human cognitive skills to produce music by using an advanced technique called abstract melody [4]. ...
... As a result, it generates a MIDI file, which is a standard instructional file that illustrates which notes are played, when they are played, and how long and loud each note is. Now, the system uses an advanced technique called abstract melody [4] to extract the progress of the pitch in the time domain starting from the specifications from the MIDI file. The pitch includes information about its variation in time and the presence/absence of a note in every single time instant. ...
... Originality and ownership of the work ... [Pinkerton 1956; Brooks et al. 1957; Moorer 1972; Conklin and Witten 1995; Pachet and Roy 2001; Davismoon and Eccles 2010; Pearce et al. 2010; Gillick et al. 2010; McVicar et al. 2014; Papadopoulos et al. 2014] Harmony [Hiller Jr and Isaacson 1957; Xenakis 1992; Farbood and Schoner 2001; Allan and Williams 2005; Lee and Jang 2004; Yi and Goldsmith 2007; Simon et al. 2008; Eigenfeldt and Pasquier 2009; De Prisco et al. 2010; Chuan and Chew 2011; Bigo and Conklin 2015] Rhythm [Tidemann and Demiris 2008; Marchini and Purwins 2010; Hawryshkewich et al. 2011] Interaction [Thom 2000] Narrative [Prechtl et al. 2014a,b] Harmony [Hiller Jr and Isaacson 1957; Steedman 1984; Ebcioğlu 1988; Cope 1996; Assayag et al. 1999b; Cope 2004; Huang and Chew 2005; Anders 2007; Anders and Miranda 2009; Aguilera et al. 2010; Sörensen 2012, 2013; Tanaka et al. 2016 ... [Todd 1989; Duff 1989; Mozer 1991; Lewis 1991; Toiviainen 1995; Eck and Schmidhuber 2002; Franklin 2006; Agres et al. 2009; Boulanger-Lewandowski et al. 2012] Harmony [Lewis 1991; Hild et al. 1992 ... Character: ' ' ' _ ' '' ' / ' '/3' '/4' '2'; Probability: 0 0 1 0.0731 0 0 0.1378. Character: '3' '4' ',' ''' ' > ' ' < ' ''; Probability: 0.0237 0.0034 0.0026 0.0014 0.0190 0.0012 0. ...
... received the highest score. This approach has been used very little in music, and only one study is listed in Table 2-4, which itself concerns interaction rather than melody generation. Many studies have also been carried out on the basis of musical rules or by creating specific rules for music generation. Work done with rule-based methods: Rule/constraint-based, Melody [Keller and Morrison 2007; Gillick et al. 2010; Herremans and Sörensen 2012] ...
Thesis
Full-text available
With the development of artificial intelligence techniques and deep learning, methods for generating sequences (such as sentences and music) have evolved significantly. However, researchers are still looking for faster and more reliable ways to generate music. Another problem in this area is that there is no specific approach for evaluating machine-generated music. In this thesis, a methodology based on interactive evolutionary optimization is introduced for music generation, in which the generated music is scored by humans. Moreover, human scoring of music is modeled using a bi-LSTM network, and this model is used in the final music generation system, which is based on a genetic algorithm. The results show that the proposed method is able to create pleasing melodies in desired styles and lengths. The proposed method also generates music fairly quickly and is significantly faster than data-based evolutionary systems. Keywords: music generation, melody, bi-LSTM neural network, interactive evolutionary optimization algorithm, genetic algorithm.
... Probabilities for CFGs, matching a particular style of music, can also be learned from existing music [13] and stochastic elements can be introduced. N-grams representing the probabilities of the choice of production rule based on the previous n − 1 choices can then be learned from existing pieces of music [13], further increasing the choice of suitably chosen production rules. Thus, Bayesian reasoning also takes its place in music generation along with related techniques such as entropy calculation in the choice of probabilities [29]. ...
Conference Paper
Full-text available
This paper describes a method for interposing computer-generated melody with tone linked to unique entities within the text of a novel. Background: A recent study describing a piece of software called "TransProse" has already shown that sentiment in the text of a novel can be used to automatically generate simple piano music that reflects the same sentiment as the novel. This study sought to establish a method whereby, after aligning the text with the melody, the sentiment in the words surrounding particular characters as they occur within the novel could produce another melody line for each character, one that reflects the individual character's tone and distinguishes the melodies ascribed to each character from one another. Method: The sentiment in the text of the novel is extracted by looking up the words in a database that groups the words into emotional groups called "Ekman categories". Simplistic relations between aspects of music such as pitch and tempo are chosen based on the two categories that contain the most words. These chosen attributes are then used to generate the first two melody lines. The paragraphs in which the named entities referring to characters are found are manually determined, and the top "Ekman category" of the named entities is obtained through simplistic methods of extraction. Each bar of the melody is aligned with individual paragraphs of text and an additional melody line is generated for each character. Results: Adjusting the fitness function of the genetic algorithm (GA) that was used was not sufficient to link the tone of the characters to the melody. Assigning each character their own short melodic phrase and varying the phrase appropriately achieved the desired outcome but requires additional work to harmonise better with the first two melody lines.
... In a second control analysis, the chords underlying each improvisation in the original corpus were entered into the computer program Impro-Visor (Keller, 2012) to create an alternate improvisation corpus. This program uses probabilistic grammars based on the chords and contour rules to create an improvised output (Gillick, Keller, & Tang, 2010;Keller & Morrison, 2007). ...
... A second analysis was conducted on a computergenerated corpus to investigate whether the patterns simply appeared because the improvisations followed tonal rules and jazz convention. According to the programmers, the computer program Impro-Visor uses algorithms to generate melodic solos based on a given chord progression (Gillick et al., 2010;Keller & Morrison, 2007). Importantly, the program extracts algorithms based on actual solos but does not store and reuse specific patterns. ...
Article
Full-text available
It is well known that jazz improvisations include repeated rhythmic and melodic patterns. What is less understood is how those patterns come to be. One theory posits that entire motor patterns are stored in procedural memory and inserted into an ongoing improvisation. An alternative view is that improvisers use procedures based on the rules of tonal jazz to create an improvised output. This output may contain patterns but these patterns are accidental and not stored in procedural memory for later use. The current study used a novel computer-based technique to analyze a large corpus of 48 improvised solos by the jazz great Charlie Parker. To be able to compare melodic patterns independent of absolute pitch, all pitches were converted to directional intervals listed in half steps. Results showed that 82.6% of the notes played begin a 4-interval pattern and 57.6% begin interval and rhythm patterns. The mean number of times the 4-interval pattern on each note position is repeated in the solos analyzed was 26.3 and patterns up to 49-intervals in length were identified. The sheer ubiquity of patterns and the pairing of pitch and rhythm patterns support the theory that pre-formed structures are inserted during improvisation. The patterns may be encoded both during deliberate practice and through incidental learning processes. These results align well with related processes in both language acquisition and motor learning.
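The conversion to directional intervals described in this abstract can be illustrated with a short sketch. It is not the study's analysis code: it assumes melodies are given as MIDI pitch numbers, ignores rhythm, and simply counts every window of n consecutive intervals. The function names and the sample melody are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def directional_intervals(pitches: List[int]) -> List[int]:
    """Signed distance in half steps between consecutive pitches."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def interval_pattern_counts(pitches: List[int], n: int = 4) -> Counter:
    """Count every run of n consecutive directional intervals."""
    intervals = directional_intervals(pitches)
    windows: List[Tuple[int, ...]] = [
        tuple(intervals[i:i + n]) for i in range(len(intervals) - n + 1)
    ]
    return Counter(windows)


# The same five-note figure played from C (60) and then from D (62).
melody = [60, 62, 64, 65, 67, 62, 64, 66, 67, 69]
counts = interval_pattern_counts(melody, n=4)
```

Because only the distances between notes are kept, the figure starting on 60 and its transposition starting on 62 both map to the pattern `(2, 2, 1, 2)`, which is therefore counted twice.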
... They also deliver a quite comprehensive literature overview regarding previous efforts. (Gillick, Tang, and Keller, 2010) further extend the latter approach by machine-learning the jazz grammars. ...
Preprint
Full-text available
A Note to the Reader: This report was largely written in Spring 2021 but unfortunately never submitted to peer review. In the meantime, two extensive reviews of a similar nature have been published by Civit et al. (peer reviewed) and Zhao et al. (arXiv). We believe that this manuscript still adds value to the scientific community, since it focuses on music generation in an interactive, hence real-time, scenario, and how such systems can be evaluated. Abstract: In recent years, machine learning, and in particular generative adversarial neural networks (GANs) and attention-based neural networks (transformers), have been successfully used to compose and generate music, both melodies and polyphonic pieces. Current research focuses foremost on style replication (e.g., generating a Bach-style chorale) or style transfer (e.g., classical to jazz) based on large amounts of recorded or transcribed music, which in turn also allows for fairly straightforward "performance" evaluation. However, most of these models are not suitable for human-machine co-creation through live interaction, nor is it clear how such models and the resulting creations would be evaluated. This article presents a thorough review of music representation, feature analysis, heuristic algorithms, statistical and parametric modelling, and human and automatic evaluation measures, along with a discussion of which approaches and models seem most suitable for live interaction.
... Harmonization is a fundamental aspect of jazz, where multiple musical voices combine to create a rich and complex sound. In recent years, there has been growing interest in using artificial intelligence (AI) to assist the harmonization of melodies in jazz [11,12]. However, one major challenge in this area is the scarcity of properly annotated datasets [13][14][15], in the case of jazz standards, mainly due to copyright issues with the melodies. ...
Article
Full-text available
This paper presents a methodology for generating cross-harmonizations of jazz standards, i.e., for harmonizing the melody of a jazz standard (Song A) with the harmonic context of another (Song B). Specifically, the melody of Song A, along with the chords that start and end its sections (chord constraints), are used as a basis for generating new harmonizations with chords and chord transitions taken from Song B. This task involves potential incompatibilities between the components drawn from the two songs that take part in the cross-harmonization. In order to tackle such incompatibilities, two methods are introduced that are integrated in the Hidden Markov Model and the Viterbi algorithm. First, a rudimentary approach to chord grouping is presented that allows interchangeable utilization of chords belonging to the same group, depending on melody compatibility. Then, a “supporting” harmonic space of chords and probabilities is employed, which is learned from the entire dataset of the available jazz standards; this space provides local solutions when there are insurmountable conflicts between the melody and constraints of Song A and the harmonic context of Song B. Statistical and expert evaluation allow an analysis of the methodology, providing valuable insight about future steps.
... As a result of this phase, the system generates a MIDI file, a standard instructional file that illustrates which notes are played, when they are played, and how long and loud each note is. The system can finally use the abstract melody [11] to extract from the MIDI file the specifications about the distances between consecutive notes. Boris' melody can be imagined as a broken line, going up and down as the input voice goes up or down, like in Figure 3. ...
... Work on computer music has developed in various directions. Some work has concentrated on the sequential aspects of music, the most common techniques in this direction being Markov models [2] and machine learning [6]. This work has obtained good results, especially in music composition [3,4,9], but has fallen short of capturing the structural aspects of harmony and, in general, of creating a bridge between mathematical formalization and musical theory as developed, for example, in [14]. ...
Preprint
Understanding the structural characteristics of harmony is essential for an effective use of music as a communication medium. Of the three expressive axes of music (melody, rhythm, harmony), harmony is the foundation on which the emotional content is built, and its understanding is important in areas such as multimedia and affective computing. The common tool for studying this kind of structure in computing science is the formal grammar but, in the case of music, grammars run into problems due to the ambiguous nature of some of the concepts defined in music theory. In this paper, we consider one such construct: modulation, that is, the change of key in the middle of a musical piece, an important tool used by many authors to enhance the capacity of music to express emotions. We develop a hybrid method in which an evidence-gathering numerical method detects modulation and then, based on the detected tonalities, a non-ambiguous grammar can be used for analyzing the structure of each tonal component. Experiments with music from the 17th and 18th centuries show that we can detect the precise point of modulation with an error of at most two chords in almost 97% of the cases. Finally, we show examples of complete modulation and structural analysis of musical harmonies.
... More recently, research has focused on guided improvisation where the improvisation system relies both on a generative model representing the musical style and on a prior knowledge of a sequential structure called scenario hereafter. Gillick et al. (2010) used inference of a probabilistic context-free grammar to generate melodies over a given chord progression. Pachet and Roy (2011) used a set of constraints to generate blues chord progressions or to generate melodies using scales specific to a given musical style. ...
Article
This paper focuses on learning the hierarchical structure of a temporal scenario (for instance, a chord progression) to perform automatic improvisation consistently upon several time scales. We first present how to represent a hierarchical structure with a phrase structure grammar. Such a grammar enables us to analyse a scenario upon several levels of organisation creating a multi-level scenario. Then, we propose a method to automatically induce this grammar from a corpus based on sequence selection with mutual information. We applied this method on a corpus of rhythm changes and obtained multi-level scenarios similar to the analysis performed by a professional musician. We then propose new heuristics to exploit the multi-level structure of a scenario to guide the improvisation with anticipatory behaviour in the factor oracle driven improvisation paradigm. This method ensures consistency of the improvisation regarding the global form and opens up possibilities when playing on chords that do not exist in the memory. This system was evaluated by professional improvisers during listening sessions and received very good feedback.
... Some [6] introduce the concept of re-telling to refer to the re-working of a standard, based on a famous recording of a master, stressing the important tension between individual voice and tradition. Others [7] explore machine learning of jazz grammars, using basic building-blocks or "slopes," touching upon the antitheses of abstraction versus vocabulary, and attempting to codify harmonic tension. ...
Poster
"Jazz mapping" is a multi-layered analytical approach to jazz improvisation. It is based on hierarchical segmentation and categorization of segments, or constituents, according to their function in the overall improvisation. The approach aims at identifying higher-level semantics of transcribed and recorded jazz solos. At these initial stages, analytical decisions are rather exploratory and rely on the input of one of the authors, an experienced jazz performer. We apply the method to two well-known solos, by Sonny Rollins and Charlie Parker, and discuss how improvisations resemble story-telling, employing a broad range of structural, expressive and technical tools, usually associated with linguistic production, experience, and meaning. We elucidate the implicit choices of experienced jazz improvisers, who have developed a strong command over the language and can communicate expressive intent, elicit emotional responses, and unfold musical "stories" that are memorable and enjoyable to fellow musicians and listeners. We also comment on potential artificial intelligence applications of this work to music research and performance.
... 3. Discovering RNP from music notation. 3.1 N-gram model. A typical stochastic model used in music for modeling fixed-length subsequences is the N-gram and its varied smoothing techniques (Downie, 1999; Scholz et al., 2009; Unal et al., 2012; Hillewaere et al., 2012; Gillick et al., 2010; Sentürk, 2011). Consider the classical N-gram model as applied to note transcripts of a rāga, A. A graphical model depicting the Markov dependency for a tri-gram (N = 3) is shown in Figure 2. The N-gram model presumes the statistical dependency of a given note on the N − 1 previous notes. ...
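The dependency described in this excerpt (each note conditioned on the N − 1 notes before it) can be illustrated with a minimal sketch. The note symbols, the toy sequence, and the function name `train_ngram` are assumptions made for illustration; the estimate is plain maximum likelihood with no smoothing, which is simpler than the smoothed variants the excerpt mentions.

```python
from collections import Counter, defaultdict

def train_ngram(notes, n=3):
    """Estimate P(note | previous n-1 notes) by maximum likelihood (no smoothing)."""
    counts = defaultdict(Counter)
    for i in range(len(notes) - n + 1):
        context = tuple(notes[i:i + n - 1])      # the n-1 preceding notes
        counts[context][notes[i + n - 1]] += 1   # the note that followed them
    return {
        ctx: {note: c / sum(nxt.values()) for note, c in nxt.items()}
        for ctx, nxt in counts.items()
    }

# Hypothetical toy sequence of note symbols (not data from the cited work).
sequence = ["S", "R", "G", "S", "R", "M", "S", "R", "G"]
model = train_ngram(sequence, n=3)
print(model[("S", "R")])  # distribution over the note following "S", "R"
```

In this toy sequence the context ("S", "R") is followed by "G" twice and "M" once, so the estimated probabilities are 2/3 and 1/3.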
Article
Carnatic music, a form of Indian Art Music, has relied on an oral tradition for transferring knowledge across several generations. Over the last two hundred years, the use of prescriptive notations has been adopted for learning, sight-playing and sight-singing. Prescriptive notations offer generic guidelines for a raga rendition and do not include information about the ornamentations or the gamakas, which are considered to be critical for characterizing a raga. In this paper, we show that prescriptive notations contain raga attributes and that a raga of Carnatic music can be reliably identified from its octave-folded prescriptive notations. We restrict the notations to 7 notes and suppress the finer note position information. A dictionary-based approach captures the statistics of repetitive note patterns within a raga notation. The proposed stochastic models of repetitive note patterns (SMRNP for short), obtained from raga notations of known compositions, outperform the state-of-the-art melody-based raga identification technique on equivalent melodic data corresponding to the same compositions. This in turn shows that for Carnatic music, the note transitions and movements have a greater role in defining the raga structure than the exact note positions.
... Substantial work has been done on creation of music by computation, ranging from grammar-based methods (Keller and Morrison, 2007;Gillick, Tang, and Keller, 2010) or genetic algorithms (Biles, 1994) to neural network approaches, including recurrent models (Eck and Schmidhuber, 2002;Franklin, 2004) and deep belief networks (Bickerman et al., 2010). These approaches have had varying success in creating convincing jazz melodies over specific chord progressions. ...
Conference Paper
We describe a neural network architecture designed to learn the musical structure of jazz melodies over chord progressions, then to create new melodies over arbitrary chord progressions from the resulting connectome (representation of neural network structure). Our architecture consists of two sub-networks, the interval expert and the chord expert, each being LSTM (long short-term memory) recurrent networks. These two sub-networks jointly learn to predict a probability distribution over future notes conditioned on past notes in the melody. We describe a training procedure for the network and an implementation as part of the open-source Impro-Visor (Improvisation Advisor) application, and demonstrate our method by providing improvised melodies based on a variety of training sets.
... Often these systems contain a generative component and an optimizing component. Techniques for the generative component include genetic algorithms (Biles, 1994; Loughran & O'Neill, 2016), feed-forward neural networks (Bickerman, Bosley, Swire, & Keller, 2010), recurrent neural networks ("Google Magenta," n.d.), and grammars (Gillick, Tang, & Keller, 2010; Keller & Morrison, 2007). The optimizing component could be gradient descent for neural network generators (Geoffrey E. Hinton, D. E. Rumelhart, & Ronald J. Williams, 1988), more genetic algorithms (Loughran & O'Neill, 2016), human input (Biles, 1994), or variable neighborhood search (Herremans & Chew, 2016). ...
Conference Paper
We used a recurrent neural network as a fitness function for a genetic algorithm to generate monophonic solos. The genetic algorithm is based on GenJam as described in Biles (1994). We conducted training sessions with human participants in order to compare and quantify some of the differences between human-feedback and RNN fitness functions. We found that the RNNs can effectively play the role of human fitness feedback, but still suffer in many areas. Our results suggest that certain types of recurrent neural networks can address the issues with human feedback, and thus should be explored in future research.

There have been many approaches to automatic composition that combine AI techniques. Often these systems contain a generative component and an optimizing component. Techniques for the generative component include genetic algorithms ... Of the aforementioned approaches, genetic algorithms (GA) have shown promising results and thus have been used in a number of automatic composition systems over the past 25 years (Gibson & Byrne, 1991). Genetic algorithms consist of a population of solutions to a problem and a fitness function that is used to rank those solutions. Solutions are iteratively ranked, crossed with each other to produce a new generation, and then mutated. As this process is repeated, the fitness of the solutions in the population increases.

An important aspect of designing a genetic algorithm is choosing a representation for the genotype and phenotype. Genotype refers to how the data is encoded in the computation space so that it can be manipulated in mutation and crossover, and is usually a simple list of numbers. The phenotype is the expression or decoding of that data in a form that is relevant to the solution-finding process, and in this case is the music that the fitness function receives. Choosing how the genotype is decoded to the phenotype affects what musical factors are fixed and what factors are learned, and it also affects the size of the solution space that is being optimized. Part of the appeal of genetic algorithms for generating music is that they can have a simple genotype that maps to the output music, rather than directly representing all the parametric complexities that the music may contain. Additionally, genetic algorithms can be endowed with domain-specific information about musically relevant attributes and relationships in the form of mutation, crossover, and selection methods. Further, they allow the amount of randomness in the output to be adjusted in the form of a mutation rate. In a musical genetic algorithm, these mutations often result in meaningful manipulations, such as repeating a section, reorganizing a chord, or reversing a sequence of notes.

Musical Fitness Functions

The fitness function is an important part of a musical genetic algorithm, as it determines what solutions are deemed "good". The problem is that in music, "good" solutions are subjective and highly dependent on context. The extent to which a particular sequence of pitches will work is not inherent either in the pitches themselves or their organization; it is also a function of harmonic context, rhythmic configu
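The loop described above (rank, cross, mutate, with a list-of-numbers genotype decoded into music) can be sketched roughly as follows. Everything specific here is an assumption for illustration: the C major mapping in `SCALE`, the population sizes, and above all the toy `fitness` function, which merely rewards stepwise motion and stands in for the human or RNN judge discussed in the text. It is not the GenJam algorithm.

```python
import random

SCALE = [60, 62, 64, 65, 67, 69, 71, 72]  # assumed mapping: C major as MIDI pitches

def decode(genotype):
    """Genotype (list of scale-degree indices) -> phenotype (MIDI pitches)."""
    return [SCALE[g] for g in genotype]

def fitness(genotype):
    """Toy stand-in for a human or RNN judge: smaller melodic leaps score higher."""
    pitches = decode(genotype)
    return -sum(abs(a - b) for a, b in zip(pitches, pitches[1:]))

def crossover(a, b, rng):
    """Single-point crossover of two equal-length genotypes."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(genotype, rate, rng):
    """Replace each gene with a random scale degree with probability `rate`."""
    return [rng.randrange(len(SCALE)) if rng.random() < rate else g for g in genotype]

def evolve(pop_size=20, length=8, generations=30, rate=0.1, seed=0):
    rng = random.Random(seed)
    population = [[rng.randrange(len(SCALE)) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)   # rank solutions
        parents = population[: pop_size // 2]        # keep the fitter half
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rate, rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)

best = evolve()
print(decode(best), fitness(best))
```

The `rate` argument plays the role of the mutation rate mentioned in the text: raising it increases the randomness of the output.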
... Keller and Morrison [9] described the use of probabilistic context-free grammars in generating jazz melodies. Gillick, Tang, and Keller [10] described a method for machine learning of such grammars from solo transcriptions. These methods have been implemented in Impro-Visor [7], a program designed to help musicians learn and improve improvisational skills. ...
Conference Paper
Jazz improvisations can be constructed from common idioms woven over a chord progression fabric. Prior art has shown that probabilistic generative grammars are one effective means of achieving such improvisations. Here we introduce another approach using transformational grammars instead. One advantage that transformational grammars provide is a form of steering from an underlying melodic outline. We demonstrate by showing how idioms can be defined in a transformational grammar and how the placement of idioms conforms to the outline and chord structure. We illustrate how transformational grammars can provide unique and varied improvisations that are suggestive of the outline. We illustrate the application of this approach in an educational software tool.
... These formats are useful to students learning music using computer software, musicologists interested in the computational analysis of musical structure, and music composers of the Western tradition. Previous work in symbolic music modelling has addressed a range of musical tasks such as Jazz and Blues solo improvisation [1], [2], polyphonic music generation [3], [4], chorale harmonization [5], [6] and modelling musical expectation [7], [8]. The reader is referred to [9], [10] for recent reviews on machine learning models used in music composition and music cognition research. ...
Conference Paper
We address the task of modelling sequential information in monophonic music. The goal is to learn a probability distribution over the various possible values of musical pitch of the next note given those leading up to it. For this task, we propose the Recurrent Temporal Discriminative Restricted Boltzmann Machine (RTDRBM). It is obtained by carrying out discriminative learning and inference, as put forward in the Discriminative RBM (DRBM), in a temporal setting by incorporating the recurrent structure of the Recurrent Temporal RBM (RTRBM). The resulting RTDRBM is suitable for labelling sequences where the label is known immediately following the prediction, such as the notes in the melodies considered here. This model is evaluated with respect to the cross entropy of its predictions on a corpus containing 8 datasets of folk and chorale melodies, and compared with n-gram models and other standard connectionist models. We found that the RTDRBM outperforms the rest of the models, including the RTRBM on which discriminative inference is carried out.
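To make the evaluation measure concrete, here is a minimal sketch of cross entropy for a next-note predictor: the mean negative log-probability that the model assigned to the note that actually occurred. The base-2 logarithm (bits per note) and the example probabilities are assumptions; the abstract does not state which base or normalisation was used.

```python
import math

def cross_entropy(predicted_probs):
    """Mean negative log2-probability assigned to the notes that actually occurred."""
    return -sum(math.log2(p) for p in predicted_probs) / len(predicted_probs)

# Hypothetical probabilities a model gave to each true next note in a melody.
probs = [0.5, 0.25, 0.25, 0.5]
print(cross_entropy(probs))  # 1.5 bits per note
```

Lower values mean the model was less surprised by the melody, which is the sense in which one model can outperform another on this measure.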
Chapter
Complex optimization problems are often associated with large search spaces and consequently prohibitive execution times in finding the optimal results. This is especially relevant when dealing with dynamic real problems, such as those in the field of power and energy systems. Solving this type of problem requires new models that are able to find near-optimal solutions in acceptable times, such as metaheuristic optimization algorithms. The performance of these algorithms is, however, hugely dependent on their correct tuning, including their configuration and parametrization. This is an arduous task, usually done through exhaustive experimentation. This paper contributes to overcoming this challenge by proposing the application of sequential model algorithm configuration using Bayesian optimization with Gaussian process and Monte Carlo Markov Chain for the automatic configuration of a genetic algorithm. Results from the application of this model to an electricity market participation optimization problem show that the genetic algorithm automatic configuration enables identifying the ideal tuning of the model, reaching better results when compared to a manual configuration, in similar execution times. Keywords: Automatic algorithm configuration, Electricity markets, Genetic algorithm, Metaheuristic optimization, Portfolio optimization
Chapter
Algorithmic composition (AC) refers to the process of creating music by means of algorithms, either for realising music entirely composed by a computer or with the help of a computer. In this paper, we report on the development of the system PAUL, an algorithmic composer for the automatic creation of short pieces of classical piano music, based on a neural-network architecture. The distinguishing feature of PAUL is that it allows the desired complexity of the output piece to be specified via an input parameter, which is a central aspect of the designated future usage of PAUL as part of a tutoring system teaching piano students how to sight-read music. PAUL employs a long short-term memory (LSTM) neural network to produce the lead track and a sequence-to-sequence neural network for the realisation of the accompanying track. Although PAUL is still work-in-progress, the obtained results are of reasonable to good quality. In a small-scale study, evaluating the specified vs. the perceived complexity of different pieces generated by PAUL, a clear correlation is observable. Keywords: Algorithmic composition, Neural networks, Music education
Chapter
Creating your own musical pieces is one of the most attractive ways to enjoy music. However, many musically untrained people lack the basic musical skills to do so. In this paper, we seek to explore how machine learning algorithms can enable musically untrained users to create their own music. To achieve this, we propose a Neural Hidden Markov Model (NHMM). It is a hybrid of a Hidden Markov model (HMM) and Convolution neural network (CNN) with a Long Short-Term Memory (LSTM) neural network. This model takes users’ original musical ideas in an easy intuitive way and automatically modifies the input to generate musically appropriate melodies as output. We further extend the model to allow users to specify the magnitude of revision, duration of music segment to be revised, choice of music genres, popularity of songs, and co-creation of songs in social settings. These extensions enhance user understanding of music theory, enrich their experience of self-learning, and enable social aspects of music creation. The model is trained using MIDI files of existing songs. We also conduct experiments on melody generation. We also hope to design a mobile application with an intuitive, interactive, and graphical user interface, which is suitable for the elderly and young children. Different from most existing literature focusing on computer music composition itself, our research and application aim at using computers to aid human composition and enriching the music education of musically untrained people.
Article
It is easy to apply artificial intelligence methods and address scientific problems with concrete rules. However, it is often challenging to apply these methods to art creation problems of weak regularity. To automatically generate musical melodies in the prairie songs of northern China, this paper proposes a formal method for melody creation based on theme development and fuzzy inference. First, we analyze the features of mode and scale in the prairie songs and construct an algorithm to generate a theme phrase according to the inner fuzzy relations among the prairie-song phrases to obtain seed materials. Then, concerning the fuzzy relations between two phrases in the prairie songs, this paper adopts fuzzy inference to manage the progression of the phrase relations and generates developmental phrases. Finally, many complete melodies of the prairie songs are generated. Compared with existing rule-based approaches, the proposed method can improve the global structure of music and can make the output compositions more musical and interesting.
Article
This paper builds on writings in psychology and philosophy to offer an “ecological” description of jazz improvisation. The description is grounded in the analogy of navigation through a complex environment, an environment that comprises the harmonic and metrical scheme on which the improvisation is based coupled with broader stylistic norms. The improvising soloist perceives this environment in terms of its “affordances,” that is, the possibilities for action that it offers (Gibson, 1979). While navigating the improvisational environment, the soloist also seeks opportunities for artistic display—motivic development, conspicuous risk-taking, and so on. Errors in improvisation reflect the soloist’s misperception of the environment’s affordances. Learning to improvise is a matter of refining perception through repeated experiences of improvisational success and failure. To bring the description to life, I offer evidence from an exploratory study of improvisational errors. The ecological description leads to new interpretations of the referent (the conceptual frame for a solo), improvisational learning and memory, and temporal coordination between soloist and ensemble. It counterbalances the prevailing computational view of improvisation, oriented around input, processing, and output.
Conference Paper
Trading is a common form of jazz improvisation in which one performer exchanges improvisations with others, usually in four- or eight-bar segments. We describe and demonstrate a new feature of Impro-Visor (short for Improvisation Advisor, a program designed to help musicians develop improvisational skills) called active trading, which significantly extends its former automated, but passive, grammar-based trading capabilities. Because Impro-Visor's active trading can be based on a variety of different response models, it can be viewed as a meta capability, providing for future extensions simply by plugging in code for other trading modules.
Article
This bibliography compiles articles of interest in jazz music scholarship that were published in 2009 or 2010 and appeared in journals not specifically dedicated to jazz study.
Article
To compose happy melodies that have hierarchical structures, this paper proposes an automatic melody composition algorithm based on relations. First, various types of melody structure are formalized and saved into a database, so the melody structure form preferred by a user can be selected by human-computer interaction. Second, some trunk-note sequences and several note-splitting algorithms are constructed by means of the pitch interval features of happy melodies, and the theme phrase of a happy melody is generated by splitting some trunk notes of the trunk-note sequence. Third, several types of operators for developing the theme phrase, which include pitch offsetting, phrase inversion and repeating-developing, are constructed using relationship methods. Finally, under the guidance of the selected melody structure, happy song melodies are produced automatically by the interaction of the theme phrase and these operators. Experimental results demonstrate that this algorithm gives the obtained melodies musically meaningful structures, and that it is not easy to distinguish these machine-generated melodies from human-generated melodies.
Article
Many tools for computer-assisted composition contain built-in music-theoretical assumptions that may constrain the output to particular styles. In contrast, this article presents a new musical representation that contains almost no built-in knowledge, but that allows even musically untrained users to generate polyphonic textures that are derived from the user's own initial compositions. This representation, called functional scaffolding for musical composition (FSMC), exploits a simple yet powerful property of multipart compositions: The pattern of notes and rhythms in different instrumental parts of the same song are functionally related. That is, in principle, one part can be expressed as a function of another. Music in FSMC is represented accordingly as a functional relationship between an existing human composition, or scaffold, and a generated set of one or more additional musical voices. A human user without any musical expertise can then explore how the generated voice (or voices) should relate to the scaffold through an interactive evolutionary process akin to animal breeding. By inheriting from the intrinsic style and texture of the piece provided by the user, this approach can generate additional voices for potentially any style of music without the need for extensive musical expertise.
Thesis
Music composition is a complex, multi-modal human activity, engaging faculties of perception, memory, motor control, and cognition, and drawing on skills in abstract reasoning, problem solving, creativity, and aesthetic evaluation. For centuries musicians, theorists, mathematicians, and more recently computer scientists, have attempted to systematize composition, proposing various formal methods for combining sounds (or symbols representing sounds) into structures that might be considered musical. Many of these systems are grounded in the statistical modelling of existing music, or in the mathematical formalization of the underlying rules of music theory. This thesis presents a different approach, looking at music as a holistic phenomenon, arising from the integration of perceptual and cognitive capacities. The central contribution of this research is an integrated cognitive architecture (ICA) for symbolic music learning and generation called MusiCog. Inspired by previous ICAs, MusiCog features a modular design, implementing functions for perception, working memory, long-term memory, and production/composition. MusiCog's perception and memory modules draw on established experimental research in the field of music psychology, integrating both existing and novel approaches to modelling perceptual phenomena like auditory stream segregation (polyphonic voice-separation) and melodic segmentation, as well as higher-level cognitive phenomena like "chunking" and hierarchical sequence learning. Through the integrated approach, MusiCog constructs a representation of music informed specifically by its perceptual and cognitive limitations. Thus, in a manner similar to human listeners, its knowledge of different musical works or styles is not equal or uniform, but is rather informed by the specific musical structure of the works themselves. MusiCog's production/composition module does not attempt to model explicit knowledge of music theory or composition. Rather, it proposes a "musically naïve" approach to composition, bound by the perceptual phenomena that inform its representation of musical structure, and the cognitive constraints that inform its capacity to articulate its knowledge through novel compositional output. This dissertation outlines the background research and ideas that inform MusiCog's design, presents the model in technical detail, and demonstrates through quantitative testing and practical music theoretical analysis the model's capacity for melodic style imitation when trained on musical corpora in a range of musical styles from the Western tradition. Strengths and limitations, both of the conceptual approach and the specific implementation, are discussed in the context of autonomous melodic generation and computer-assisted composition (CAC), and avenues for future research are presented. The integrated approach is shown to offer a viable path forward for the design and implementation of intelligent musical agents and interactive CAC systems.
Article
We describe an approach to the automatic generation of convincing jazz melodies using probabilistic grammars. Uses of this approach include a software tool for assisting a soloist in the creation of a jazz solo over chord progressions. The method also shows promise as a means of automatically improvising complete solos in real-time. Our approach has been implemented and demonstrated in a free software tool.
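As a rough illustration of generating a melody skeleton from a probabilistic grammar, the sketch below expands a start symbol by choosing among weighted productions until only terminals remain. The grammar itself is hypothetical: the rule names, the probabilities, and the terminal format (a tonal category paired with a duration in beats, with "C" for chord tone, "A" for approach tone and "R" for rest) are invented for this example and are not the rules used by the authors' tool.

```python
import random

# Hypothetical toy grammar. Each nonterminal maps to weighted productions;
# terminals are (tonal category, duration in beats) tuples.
RULES = {
    "BAR":  [(0.6, ["HALF", "HALF"]), (0.4, ["HALF", "BEAT", "BEAT"])],
    "HALF": [(0.5, [("C", 2)]), (0.5, ["BEAT", "BEAT"])],
    "BEAT": [(0.7, [("C", 1)]), (0.2, [("A", 1)]), (0.1, [("R", 1)])],
}

def expand(symbol, rng):
    """Recursively expand a symbol into a flat list of terminals."""
    if isinstance(symbol, tuple):          # terminal: already a note slot
        return [symbol]
    weights = [p for p, _ in RULES[symbol]]
    choices = [rhs for _, rhs in RULES[symbol]]
    rhs = rng.choices(choices, weights=weights, k=1)[0]
    out = []
    for s in rhs:
        out.extend(expand(s, rng))
    return out

rng = random.Random(1)
melody = expand("BAR", rng)
print(melody)  # a list of (category, beats) slots filling one 4-beat bar
```

Every production preserves total duration, so any expansion of `BAR` fills exactly four beats; choosing concrete pitches for each category against the current chord would be a separate step.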
Article
The ability to construct a musical theory from examples presents a great intellectual challenge that, if successfully met, could foster a range of new creative applications. Inspired by this challenge, we sought to apply machine-learning methods to the problem of musical style modeling. Our work so far has produced examples of musical generation and applications to a computer-aided composition system. Machine learning consists of deriving a mathematical model, such as a set of stochastic rules, from a set of musical examples. The act of musical composition involves a highly structured mental process. Although it is complex and difficult to formalize, it is clearly far from being a random activity. Our research seeks to capture some of the regularity apparent in the composition process by using statistical and information theoretic tools to analyze musical pieces. The resulting models can be used for inference and prediction and, to a certain extent, to generate new works that imitate the style of the great masters.
Conference Paper
We propose a mechanism for extracting theme melodies from a song by using a graphical clustering algorithm. In the proposed mechanism, a song is split into a set of motifs, each of which is the minimum meaningful unit. The system then clusters the motifs into groups based on the similarity values calculated between all pairs of motifs, so that motifs within a cluster are more similar to each other than to motifs in other clusters. From each cluster, the system selects a theme melody based on the positions of the motif within the song and the maximum summation of similarity values of edges adjacent to the motif node in the cluster. As an experimental result, we show an example of how the theme melodies of a song can be extracted by using the proposed algorithm.
Conference Paper
An application of Grammatical Inference (GI) in the field of Music Processing is presented, where Regular Grammars are used for modeling musical style. The interest in modeling musical style resides in the use of these models in applications, such as Automatic Composition and Automatic Musical Style Recognition. We have studied three GI Algorithms, which have been previously applied successfully in other fields. In this work, these algorithms have been used to learn a stochastic grammar for each of three different musical styles from examples of melodies. Then, each of the learned grammars was used to stochastically synthesize new melodies (Composition) or to classify test melodies (Style Recognition). Our previous studies in this field showed the need for a proper music coding scheme. Different coding schemes are presented and compared according to results in Composition and Style Recognition. Results from previous studies have been improved.
Conference Paper
The goal of this paper is to describe a new approach to algorithmic music composition that uses pattern extraction techniques to find patterns in a set of existing musical sequences, and then to use these patterns to compose music via a Markov chain. The transition probabilities of the Markov chain are learned from the musical sequences from which the patterns were extracted. These transitions determine which of the extracted patterns can follow other patterns. Our pattern matching phase considers three dimensions: time, pitch, and duration. Patterns of notes are considered to be equivalent under shifts in time, the baseline note of the pattern, and multiplicative changes of duration across all notes in the pattern. We give experimental results using classical music as training sequences to show the viability of our method in composing novel musical sequences.
Article
We describe a system which supports dynamic user interaction with multimedia information using content based hypermedia navigation techniques, specialising in a technique for navigation of musical content. The model combines the principles of open hypermedia, whereby hypermedia link information is maintained by a link service, with content based retrieval techniques in which a database is queried based on a feature of the multimedia content; our approach could be described as 'content based retrieval of hypermedia links'. The experimental system focuses on temporal media and consists of a set of component-based navigational hypermedia tools. We propose the use of melodic pitch contours in this context and we present techniques for storing and querying contours, together with experimental results. Techniques for integrating the contour database with open hypermedia systems are also discussed. This paper describes a project to investigate content based navigation (CBN) of music. The project has adopted an open hypermedia approach, with support for temporal media, navigation based on multimedia content, and specifically navigation based on melodic pitch contours in music. The key innovation of the CBN model is that the user can select an arbitrary part of any multimedia document and use this to query a hypermedia link database; the query is based on information extracted from the selection using a technique specific to the content type. This enables the user to find links even when there are no pre-authored links in the specific document (e.g. no buttons). We believe this to be a powerful technique for dynamic user interaction, to support both multimedia authors and users. Our work extends previous work in open hypermedia by investigating content based navigation of temporal media.
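One simple way to picture storing and querying melodic pitch contours is to reduce each melody to the direction of movement between consecutive notes and then search for a query contour as a substring. This is only a sketch under assumptions: the three-letter U/D/R encoding, the substring match, and the tune names and pitches are illustrative choices, and the abstract does not say which contour representation or matching technique the authors used.

```python
def contour(pitches):
    """Encode a melody as U (up), D (down), R (repeat) between consecutive notes."""
    steps = []
    for prev, cur in zip(pitches, pitches[1:]):
        steps.append("U" if cur > prev else "D" if cur < prev else "R")
    return "".join(steps)

def find_matches(query_pitches, database):
    """Return names of stored melodies whose contour contains the query contour."""
    q = contour(query_pitches)
    return [name for name, pitches in database.items() if q in contour(pitches)]

# Hypothetical contour database keyed by tune name (MIDI pitch numbers).
database = {
    "tune_a": [60, 62, 64, 62, 60],   # contour "UUDD"
    "tune_b": [67, 67, 65, 64, 66],   # contour "RDDU"
}
print(find_matches([50, 55, 53], database))  # query contour "UD"
```

Because only direction is kept, the query matches regardless of key or exact interval size, which is what makes a contour usable as a content-based query on a selected fragment.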
Conference Paper
The author combines a survey of Markov-based efforts in automated composition with a tutorial demonstrating how various theoretical properties associated with Markov processes can be put to practical use. The historical background is traced from A. A. Markov's original formulation through to the present. A digression into Markov-chain theory introduces 'waiting counts' and 'stationary probabilities'. The author's "Demonstration 4" for solo clarinet illustrates how these properties affect the behavior of a melody composed using Markov chains. This simple example becomes a point of departure for increasingly general interpretations of the Markov process. The interpretation of 'states' is reevaluated in the light of recent musical efforts that employ Markov chains of higher-level objects and in the light of other efforts that incorporate relative attributes into the possible interpretations. Other efforts expand Markov's original definition to embrace 'Nth-order' transitions, evolving transition matrices and chains of chains. The remainder of this article contrasts Markov processes with alternative compositional strategies.
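The two Markov-chain properties named in this abstract can be computed in a few lines. The sketch below is not taken from the article. It finds stationary probabilities by power iteration on a small made-up transition matrix, and it reads "waiting count" as the expected number of consecutive steps the chain spends in a state before leaving, which is an assumption about the author's definition.

```python
def stationary(matrix, iterations=200):
    """Approximate the stationary distribution of a row-stochastic matrix.

    Repeatedly applies pi <- pi * P starting from a uniform distribution.
    Assumes the chain is irreducible and aperiodic so the iteration converges.
    """
    size = len(matrix)
    pi = [1.0 / size] * size
    for _ in range(iterations):
        pi = [sum(pi[i] * matrix[i][j] for i in range(size)) for j in range(size)]
    return pi


def waiting_count(matrix, state):
    """Expected run length in `state`: 1 / (1 - p_ii), under the assumed definition."""
    return 1.0 / (1.0 - matrix[state][state])


# Hypothetical two-state melody model: rows are current state, columns are next state.
P = [
    [0.5, 0.5],
    [0.25, 0.75],
]

pi = stationary(P)
runs = [waiting_count(P, s) for s in range(len(P))]
```

For this matrix the chain spends about one third of its time in the first state and two thirds in the second, with expected run lengths of 2 and 4 steps, which is the kind of long-run behavior a composer could tune by adjusting the transition probabilities.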
Article
this paper in LaTeX partly supported by ARPA (ONR) grant N00014-94-1-0775 to Stanford University where John McCarthy has been since 1962. Copied with minor notational changes from CACM, April 1960. If you want the exact typography, look there. Current address, John McCarthy, Computer Science Department, Stanford, CA 94305 (email: jmc@cs.stanford.edu), (URL: http://www-formal.stanford.edu/jmc/) by starting with the class of expressions called S-expressions and the functions called S-functions. In this article, we first describe a formalism for defining functions recursively. We believe this formalism has advantages both as a programming language and as a vehicle for developing a theory of computation. Next, we describe S-expressions and S-functions, give some examples, and then describe the universal S-function apply which plays the theoretical role of a universal Turing machine and the practical role of an interpreter. Then we describe the representation of S-expressions in the memory of the IBM 704 by list structures similar to those used by Newell, Shaw and Simon [2], and the representation of S-functions by program. Then we mention the main features of the LISP programming system for the IBM 704. Next comes another way of describing computations with symbolic expressions, and finally we give a recursive function interpretation of flow charts. We hope to describe some of the symbolic computations for which LISP has been used in another paper, and also to give elsewhere some applications of our recursive function formalism to mathematical logic and to the problem of mechanical theorem proving.