Thesis

Computational Methods for Tonality-Based Style Analysis of Classical Music Audio Recordings

Abstract

With the rapidly growing impact of digital technology, the ways of accessing music have changed fundamentally. Nowadays, streaming services, download platforms, and private archives provide listeners with large numbers of music recordings. In the area of Music Information Retrieval, researchers are developing automatic methods for organizing and browsing such collections. One application scenario is the classification of music recordings according to categories such as musical genres. In this thesis, we approach such classification problems by discriminating subgenres within Western classical music. In particular, we focus on stylistic categories such as historical periods or individual composers. Musicologists usually analyze musical style manually, relying on scores. This thesis contributes computational methods for realizing such analyses on comprehensive corpora of audio recordings. Our style analysis experiments focus on harmony and tonality. In the first step, we use signal processing techniques to compute chroma representations of the audio data. Based on these representations, we model specific concepts from music theory and propose algorithms to measure the occurrence of certain tonal structures in audio recordings. One of the algorithms estimates the global key of a piece by considering the particular role of the final chord. Another method visualizes modulations with respect to diatonic scales as well as scale types over the course of a piece. Furthermore, we propose techniques for estimating the presence of specific interval and chord types and for measuring tonal complexity. On the basis of these novel audio features, we analyze audio recordings regarding musical style. Using unsupervised clustering methods, we investigate the similarity of musical works across composers and composition years. Furthermore, we perform classification experiments according to historical periods ("eras") and composers.
Our results indicate that the tonal features proposed in this thesis capture stylistic properties robustly. In contrast, using standardized timbral features for classification often leads to overfitting and thus to worse performance. This shows that tonal characteristics can be discriminative for style analysis and that such characteristics can be measured directly from audio recordings.
... However, such features are less suitable for discriminating sub-genres or historical periods within Western classical music (consider, e.g., the co-existence of solo piano music composed over several centuries [9]). To address this challenge, harmonic features have shown promising results [10][11][12][13]. Yet, existing harmonic audio features exhibit two main limitations. ...
... Fixed-size segments with equal duration are adopted at multiple time scales or resolutions to capture the piece's different harmonic or tonal dimensions. Following [10][11][12], we consider four time resolutions in our work: 100ms, 500ms, 10s, and global (i.e., the entire piece or excerpt under analysis). Smaller time scales (e.g., 100ms and 500ms) capture finer musical elements such as individual notes, intervals, and chords. ...
... For our experiments, we consider the Cross-Era (https://www.audiolabs-erlangen.de/resources/MIR/cross-era) and Cross-Composer (https://www.audiolabs-erlangen.de/resources/MIR/cross-comp) datasets, which include 1600 and 1100 pieces, respectively. In detail, the Cross-Era dataset has ... Table 3. Balanced subsets obtained from the Cross-Era and Cross-Composer datasets [12]. ...
Conference Paper
The extraction of harmonic information from musical audio is fundamental for several music information retrieval tasks. In this paper, we propose novel harmonic audio features based on the perceptually-inspired tonal interval vector space, computed as the Fourier transform of chroma vectors. Our contribution includes mid-level features for musical dissonance, chromaticity, dyadicity, triadicity, diminished quality, diatonicity, and whole-toneness. Moreover, we quantify the perceptual relationship between short- and long-term harmonic structures, tonal dispersion, harmonic changes, and complexity. Beyond the computation on fixed-size windows, we propose a context-sensitive harmonic segmentation approach. We assess the robustness of the new harmonic features in style classification tasks regarding classical music periods and composers. Our results are in line with, and slightly outperform, those obtained with existing features and suggest that the new features partially capture musical properties other than those covered in the state-of-the-art literature. We discuss the features regarding their musical interpretation and compare the different feature groups regarding their effectiveness for discriminating classical music periods and composers.
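To make the phrase "computed as the Fourier transform of chroma vectors" concrete, the following is a minimal sketch rather than the authors' implementation. It takes the discrete Fourier transform of a 12-bin chroma vector, keeps coefficients 1 to 6, and normalizes by the chroma sum. The function name tonal_interval_vector is hypothetical, and the perceptual weighting of the individual coefficients used in the tonal interval space literature is deliberately left out.

```python
import cmath

def tonal_interval_vector(chroma):
    """Unweighted DFT coefficients k = 1..6 of a 12-bin chroma vector.

    Assumption: coefficients are normalized by the chroma sum; the
    perceptual per-coefficient weights from the literature are omitted.
    """
    if len(chroma) != 12:
        raise ValueError("expected 12 chroma bins")
    total = sum(chroma)
    if total == 0:
        return [0j] * 6  # silent frame: no tonal content
    return [
        sum(c * cmath.exp(-2j * cmath.pi * k * n / 12)
            for n, c in enumerate(chroma)) / total
        for k in range(1, 7)
    ]

# C major triad: pitch classes C (0), E (4), G (7)
c_major_triad = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]
magnitudes = [abs(z) for z in tonal_interval_vector(c_major_triad)]

# Whole-tone collection: every second pitch class
whole_tone = [1, 0] * 6
whole_tone_magnitudes = [abs(z) for z in tonal_interval_vector(whole_tone)]
```

For the whole-tone collection, the magnitude of the sixth coefficient reaches its maximum of 1 while the first coefficient vanishes, which illustrates how coefficient magnitudes can act as indicators of particular pitch-class structures.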
... The unit of study is typically the 'composition', and the analysis compares similarities and differences across audio signals from a series of compositions or from different segments of the same composition (self-similarity). This particular angle has been used, among others, by Foote (1999), Pampalk (2006), and more recently by Weiss (2017) and Weiss et al. (2018) for classical music, and by Mauch et al. (2015) for popular music, where they investigate the evolution of musical diversity and disparity and whether evolution has been gradual or punctuated. Context-based MIR is motivated by the fact that there are aspects not encoded in an audio signal or that cannot be extracted from it, but which are nevertheless important to human perception of music, for example, the cultural background of a composer. ...
... , where a, b, c, d are the count of composers in sets described in Footnote 4. Weiss et al. 2018) typically show that composers tend to cluster in ways that conform to our intuitions about stylistic traditions. In this paper we will compare the clustering performance of our context-based approach versus the content-based approach of Weiss (2017). Beyond grouping composers on a fixed number of clusters, we also investigate, as Weiss (2017), hierarchical clustering, which better highlights evolution and trends in classical music. ...
... Beyond grouping composers on a fixed number of clusters, we also investigate, as Weiss (2017), hierarchical clustering, which better highlights evolution and trends in classical music. Weiss (2017) does it for 70 composers and we will again compare results in the next section. Besides the 70 composers analysed in Weiss, our method makes it possible to compute dendrograms for up to 500 composers representing seven centuries of classical music. ...
Article
This paper applies clustering techniques and multi-dimensional scaling (MDS) analysis to a 500 × 500 composers’ similarity/distance matrix. The objective is to visualize or translate the similarity matrix into dendrograms and maps of classical (European art) music composers. We construct dendrograms and maps for the Baroque, Classical, and Romantic periods, and a map that represents seven centuries of European art music in one single graph. Finally, we also use linear and non-linear canonical correlation analyses to identify variables underlying the dimensions generated by the MDS methodology.
... ). Motivated by [23,24], we prefer the latter arrangement accounting for the similarity of fifth-related scales, which have six out of seven pitch classes in common. We set the diatonic scale corresponding to the piece's global key in the center of the visualization, with upper-fifth-related scales (more sharps) above and lower-fifth-related scales (more flats) below that center scale. ...
... 13–21, and a cadence passage (mm. 21–26), which ends after a G major scale on the single tone G without having modulated (see Figure 4). From m. 27 (0:43) on, this is followed by the melodic motif in G minor mentioned above, which is repeated in D minor (m. 33) and continued towards A minor (m. ...
Conference Paper
The computational analysis of music has traditionally seen a sharp divide between the "audio approach" relying on signal processing and the "symbolic approach" based on scores. Likewise, there has also been an unfortunate gap between any such computational endeavour and more traditional approaches as used in historical musicology. In this paper, we take a step towards ameliorating this situation through the application of a computational method for visualizing local key characteristics in audio recordings. We exploit these visualizations of diatonic scale content by discussing their musicological implications, while being aware of methodological limitations such as those arising for minor keys. As a proof of concept, we use this method for investigating differences between the traditional sonata-form model and selected Beethoven piano sonatas in the context of sonata theory from the end of the 18th century. We consider this scenario an example of a rewarding dialogue between computer science and historical musicology.
... He has published several papers on computational analyses in classical music, which unlike Simonton do not focus on melodic originality or thematic fame. Weiß's original doctoral dissertation was on the topic of "Computational Methods for Tonality-Based Style Analysis" [19] and he has continued research in this field, specifically looking at tonal complexity and tonality-based style analysis [18]. He and his colleagues account for a majority of recent publications in classical music computational analyses [20]. ...
Preprint
In this work, the researcher presents a novel approach to calculating melodic originality based on the research by Simonton (1994). This novel formula is then applied to a dataset of 428 classical music pieces from the Romantic period to analyze the relationship between melodic originality and thematic fame.
... There are several classification algorithms, both using supervised learning (SVM, K-nearest neighbors) and unsupervised learning (K-means clustering). Weiss (2017) and Weiss et al. (2018) apply several of these methods using audio features to characterise and then classify composers. ...
Article
This article illustrates different information visualization techniques applied to a database of classical composers and visualizes both the macrocosm of the Common Practice Period and the microcosms of twentieth century classical music. It uses data on personal (composer-to-composer) musical influences to generate and analyze network graphs. Data on style influences and composers' 'ecological' data are then combined with composer-to-composer musical influences to build a similarity/distance matrix, and a multidimensional scaling analysis is used to locate the relative position of composers on a map while preserving the pairwise distances. Finally, a support-vector machines algorithm is used to generate classification maps. This article falls into the realm of an experiment in music education, not musicology. The ultimate objective is to explore parts of the classical music heritage and stimulate interest in discovering composers. In an age offering either inculcation through lists of prescribed composers and compositions to explore, or music recommendation algorithms that automatically propose works to listen to next, the analysis illustrates an alternative path that might promote the active rather than passive discovery of composers and their music in a less restrictive way than inculcation through prescription.
... There are several classification algorithms, both using supervised learning (SVM, K-nearest neighbors) and unsupervised learning (K-means clustering). Weiss (2017) and Weiss et al. (2018) apply several methods using audio features to characterise and then classify composers. ...
Article
This paper and its companion article (I. An Application of Network Graphs) illustrate different information visualization techniques applied to a database of classical composers. In the first paper we used data on 'personal' (composer-to-composer) musical influences to generate and analyze (social) network graphs. In this second article we derive a composers' similarity matrix on the basis of musical (personal and style) influences and ecological data (features of a composer). We then use the similarity indices to locate composers on multidimensional scaling maps, and use machine-learning classification algorithms (e.g., support-vector machines and K-nearest neighbors) to generate classification maps. Both the macrocosm of the Common Practice Period and the microcosms of 20th century classical music are visualized. The ultimate objective is to enhance basic music education and stimulate interest in discovering composers by proposing graphs and maps tracking the interconnections of composers. In an age offering either inculcation through lists of prescribed composers and compositions to explore, or music recommendation algorithms that automatically propose works to listen to next, the two articles illustrate an alternative path that might promote the active rather than passive discovery of composers and their music in a less restrictive way than inculcation through prescription.
... There are several classification algorithms, both using supervised learning (SVM, K-Nearest Neighbors) and unsupervised learning (K-means clustering). Weiss (2017) and Weiss et al. (2018) apply several methods using audio features to characterise and then classify composers. ...
Conference Paper
This paper illustrates different information visualization techniques (data visualization) applied to a classical composers' database. In particular we present composers network graphs, heat maps and multidimensional scaling maps (the latter two obtained from a composer distance matrix), composers' classification maps using support-vector machine and K-Nearest Neighbors algorithms, and dendrograms. All visualization techniques have been developed using Python programming and libraries. The ultimate objective is to enhance basic music education and interest in classical music by presenting information quickly and clearly, taking advantage of the human visual system's ability to see patterns and trends.
... The basic frequency is described as the minimum frequency of a stationary rhythmic sound signal that can be described as tonal sound. Tonality in audio is a system that arranges musical scale notes based on musical criteria (Weiß 2017; Demirel et al. 2018). ...
Article
Audio signal processing is one of the most challenging fields of the current era for the analysis of audio signals. Audio signal classification (ASC) consists of generating appropriate features from a sound and using these features to identify the class the sound most likely belongs to. Depending on the application's classification domain, the feature extraction and classification/clustering algorithms used may be quite diverse. This paper provides a survey of the state of the art for understanding ASC's general research scope, including different types of audio; representations of audio such as acoustic and spectrogram; audio feature extraction techniques such as physical, perceptual, static, and dynamic; audio pattern matching approaches such as pattern matching, acoustic phonetic, and artificial intelligence; and classification and clustering techniques. The aim of this state-of-the-art paper is to produce a summary and guidelines for using the widely used methods, and to identify the challenges as well as future research directions of acoustic signal processing.
Article
This article is devoted to the study of instrumental sonatas by composers of the Baroque, Classical, and Romantic periods. Within the study, the overall number of notes, the number of repeated, lowered, and raised notes, the maximum interval, and the range of the musical passage are examined. Six subhypotheses are tested to learn the differences in musical works. The analysis is carried out using Student's t-test and the Kruskal–Wallis test. The results show that four of the six analysed parameters depend on the era of composition. Therefore, these parameters can be used to identify the period in which a musical piece was written.
Conference Paper
We evaluate our novel approach to global key extraction from audio recordings, restricting ourselves to the genre Classical only. Especially in this field of music, the musical key is significant information since many works include the key in their title. Our rule-based method relies on pre-extracted chroma features and puts special emphasis on the final chord of the piece to estimate the tonic note. To determine the mode, we analyze the chroma histogram over the complete piece and estimate the underlying diatonic scale. In both steps, we apply a multiplicative procedure to obtain high error robustness. This approach helps to minimize the number of false tonic notes, which is important for further key-related tonality analyses. The algorithm is evaluated on three different datasets containing mainly 18th and 19th century music for orchestra, piano, and mixed instruments. We reach accuracies up to 97 % for correct full key (correct tonic note and mode) classification and up to 100 % for correct tonic note classification.
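The two-step idea described in this abstract (tonic from the final chord, mode from the diatonic scale of the whole piece, both scored multiplicatively) can be illustrated with a heavily simplified sketch. This is not the published rule set: the function names, the triad-based tonic score, the floor value, and the major versus relative-major comparison for the mode are all assumptions made for illustration.

```python
import math

MAJOR_SCALE = (0, 2, 4, 5, 7, 9, 11)  # semitone steps of a diatonic (major) scale

def estimate_tonic(final_chroma, floor=1e-3):
    """Pick the pitch class that best explains the final chord as a triad root.

    Assumed scoring: product of root, fifth, and the stronger of the
    minor/major third; the floor keeps weak bins from zeroing the product.
    """
    c = [max(v, floor) for v in final_chroma]
    scores = [
        c[p] * c[(p + 7) % 12] * max(c[(p + 3) % 12], c[(p + 4) % 12])
        for p in range(12)
    ]
    return max(range(12), key=scores.__getitem__)

def diatonic_scores(histogram, floor=1e-3):
    """Multiplicative score of each of the 12 diatonic-scale transpositions."""
    h = [max(v, floor) for v in histogram]
    return [math.prod(h[(root + step) % 12] for step in MAJOR_SCALE)
            for root in range(12)]

def estimate_key(final_chroma, histogram):
    """Return (tonic pitch class, mode) for a piece."""
    tonic = estimate_tonic(final_chroma)
    scales = diatonic_scores(histogram)
    # Major if the scale built on the tonic fits better than the scale
    # built on the relative major (a minor third above the tonic).
    mode = "major" if scales[tonic] >= scales[(tonic + 3) % 12] else "minor"
    return tonic, mode

# Toy input: a piece using the white-key pitch classes, ending on A-C-E.
white_keys = [1.0 if pc in MAJOR_SCALE else 0.1 for pc in range(12)]
a_minor_chord = [1.0 if pc in (9, 0, 4) else 0.0 for pc in range(12)]
key = estimate_key(a_minor_chord, white_keys)
```

With the toy input, the final chord points to A (pitch class 9) as the tonic, and the global histogram fits the scale on the relative major better than the A major scale, so the sketch reports A minor.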
Conference Paper
We present an algorithm for visualizing the tonal characteristics of classical music audio recordings. The method is inspired by music theory concepts on harmony and intended to assist musicological research. Our system is based on chroma features that are extracted from the audio data in order to represent the dominant local pitch classes. Hence, the approach is largely insensitive to the orchestration. We use a coarse time resolution to account for the overall local pitch content rather than for single melody notes. The first visualization type presented in this paper serves to display the temporal evolution of local keys within a movement. This method is inspired by Gárdonyi's analysis technique regarding diatonic key relationships. We calculate local estimates for the underlying diatonic scales. These scale estimates are arranged according to a perfect fifths series to account for tonal similarity. Visualizing the local results over time provides an overview of the modulation structure of a piece. The second method refers to the general scale type and the symmetries of the local pitch content. This technique is related to scale-based theories of harmony such as the analysis methods by Gárdonyi and Lendvai or the Tonfeld concept by Simon. Scale models such as the whole tone scale, the octatonic scale or the acoustic scale play an important role in impressionistic music or in Messiaen's compositions, among others. With our method, we compute the maximal likelihood for all transpositions of a scale to measure the occurrence of the respective scale type. These estimates are displayed over the course of the piece to show the locally prominent scales. This allows for an analysis of the formal aspects of tonality.
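As a rough illustration of the first visualization type, the sketch below scores all twelve diatonic-scale transpositions for each (coarse) chroma frame and arranges them along a series of perfect fifths around a chosen center scale. The multiplicative score, the floor value, the normalization, and all function names are assumptions for this example; the paper's actual likelihood model and smoothing are not reproduced.

```python
import math

MAJOR_SCALE = (0, 2, 4, 5, 7, 9, 11)  # semitone steps of a diatonic scale

def scale_likelihoods(chroma, floor=1e-3):
    """Normalized multiplicative score of each diatonic transposition."""
    c = [max(v, floor) for v in chroma]
    raw = [math.prod(c[(root + step) % 12] for step in MAJOR_SCALE)
           for root in range(12)]
    total = sum(raw)
    return [r / total for r in raw]

def fifths_order(center=0):
    """Scale roots ordered by fifths: five flat-side scales, center, six sharp-side."""
    return [(center + 7 * k) % 12 for k in range(-5, 7)]

def scale_matrix(frames, center=0):
    """One row per frame, columns ordered along the series of fifths."""
    order = fifths_order(center)
    rows = []
    for chroma in frames:
        likelihoods = scale_likelihoods(chroma)
        rows.append([likelihoods[root] for root in order])
    return rows

# Toy input: a frame in the C major scale followed by one in the G major scale.
c_scale = [1.0 if pc in MAJOR_SCALE else 0.0 for pc in range(12)]
g_scale = [1.0 if (pc - 7) % 12 in MAJOR_SCALE else 0.0 for pc in range(12)]
matrix = scale_matrix([c_scale, g_scale], center=0)
peaks = [row.index(max(row)) for row in matrix]
```

In the toy example the peak moves from the center column to its sharp-side neighbor, which is how a modulation to the upper fifth would show up when such a matrix is plotted over time.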
Conference Paper
In this article, we use a concrete example to outline how computational methods can be applied profitably in musicology and how, conversely, musicological questions give rise to new challenges in computer science. Specifically, we present computer-based tools that make it possible to search, visualize, and analyze large music collections interactively with respect to harmonic structures. The musicological relevance of these concepts is tested on an example scenario from Richard Wagner's Die Walküre. Several questions are relevant here: To what extent can known harmonic structures be reproduced by machine and represented visually? Are there harmonic structures and relationships that have so far remained hidden? We discuss these questions, by way of example, with regard to Wagner's notion of the »dichterisch-musikalische Periode« ("poetic-musical period") and indicate the possibilities of the visualization methods presented. In doing so, we examine whether and how the characteristic aspects of such periods are reflected in the automatically computed harmonic analyses and their visualizations, and whether similar structures can be discovered in other places. This example is intended to show, in a paradigmatic way, possibilities for dialogue between historical musicology and computer science on the basis of their respective prerequisites and methods.
Article
This article is an homage to the late music theorist Steve Larson, who passed away quite unexpectedly last year. Among his many talents (and he had many) was a remarkable ability to create ambigrams, which are constructed using ambiguous figures that can be interpreted in two different ways. An ambigram can be read the same way in two different orientations (often right-side-up and upside-down, but other orientation pairs are sometimes used). This article takes the notion of ambiguity at the heart of the ambigram concept and examines it in music. In addition to paying tribute to Steve Larson's delightful mind, the paper explores how metric and tonal ambiguity affect music analysis and pedagogy, two disciplines that were important to Steve throughout his career.
Conference Paper
We study the automatic identification of Western classical music styles by directly using chroma histograms as classification features. Thereby, we evaluate the benefits of knowing a piece's global key for estimating key-related pitch classes. First, we present four automatic key detection systems. We compare their performance on suitable datasets of classical music and optimize the algorithms' free parameters. Using a second dataset, we evaluate automatic classification into the four style periods Baroque, Classical, Romantic, and Modern. To that end, we calculate global chroma statistics of each audio track. We then split up the tracks according to major and minor keys and circularly shift the chroma histograms with respect to the tonic note. Based on these features, we train two individual classifier models for major and minor keys. We test the efficiency of four chroma extraction algorithms for classification. Furthermore, we evaluate the impact of key detection performance on the classification. Additionally, we compare the key-related chroma features to other chroma-based features. We obtain improved performance when using an efficient key detection method for shifting the chroma histograms.
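The key-related feature described here, a chroma histogram circularly shifted with respect to the tonic note, can be sketched in a few lines. The function name and the sum normalization are assumptions for this example; the paper's chroma extraction and classifier training are not shown.

```python
def key_relative_histogram(histogram, tonic):
    """Rotate a 12-bin chroma histogram so that index 0 is the tonic.

    Assumption: the result is normalized to sum 1 so that pieces of
    different length or loudness become comparable.
    """
    if len(histogram) != 12:
        raise ValueError("expected 12 chroma bins")
    shifted = [histogram[(tonic + i) % 12] for i in range(12)]
    total = sum(shifted)
    return [v / total for v in shifted] if total > 0 else shifted

MAJOR_SCALE = (0, 2, 4, 5, 7, 9, 11)

# A toy piece in D major (tonic pitch class 2) and one in C major (tonic 0):
d_major = [1.0 if (pc - 2) % 12 in MAJOR_SCALE else 0.0 for pc in range(12)]
c_major = [1.0 if pc in MAJOR_SCALE else 0.0 for pc in range(12)]

d_relative = key_relative_histogram(d_major, tonic=2)
c_relative = key_relative_histogram(c_major, tonic=0)
```

After the shift, both toy histograms coincide, so a classifier sees scale degrees rather than absolute pitch classes. This also shows why the quality of the key detection matters: a wrong tonic rotates the histogram to the wrong position.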
Article
The support-vector network is a new learning machine for two-group classification problems. The machine conceptually implements the following idea: input vectors are non-linearly mapped to a very high-dimension feature space. In this feature space a linear decision surface is constructed. Special properties of the decision surface ensure high generalization ability of the learning machine. The idea behind the support-vector network was previously implemented for the restricted case where the training data can be separated without errors. We here extend this result to non-separable training data. High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated. We also compare the performance of the support-vector network to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.