Mark Gotham’s research while affiliated with Durham University and other places


Publications (23)


Figures: overview of the proposed research framework; Jeonggan-like encoding position labels; comparison between encoding schemes; orchestral part generation.
Six Dragons Fly Again: Reviving 15th-Century Korean Court Music with Transformers and Novel Encoding
  • Preprint
  • File available

August 2024 · 37 Reads

Danbinaerin Han · Mark Gotham · Dongmin Kim · [...]

We introduce a project that revives a piece of 15th-century Korean court music, Chihwapyeong and Chwipunghyeong, composed upon the poem Songs of the Dragons Flying to Heaven. One of the earliest examples of Jeongganbo, a Korean musical notation system, the remaining version consists only of a rudimentary melody. Our research team, commissioned by the National Gugak (Korean Traditional Music) Center, aimed to transform this old melody into a performable arrangement for a six-part ensemble. Using Jeongganbo data acquired through bespoke optical music recognition, we trained a BERT-like masked language model and an encoder-decoder transformer model. We also propose an encoding scheme that strictly follows the structure of Jeongganbo and denotes note durations as positions. The resulting machine-transformed versions of Chihwapyeong and Chwipunghyeong were evaluated by experts and performed by the Court Music Orchestra of the National Gugak Center. Our work demonstrates that generative models can successfully be applied to traditional music with limited training data if combined with careful design.
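
A rough illustration of such a position-based encoding is sketched below in Python. The token names, the per-cell position grid, and the pitch labels are illustrative assumptions, not the authors' actual scheme.

```python
# Hypothetical sketch, not the authors' code: tokens name the Jeonggan cell and
# the onset position within it, so note duration is implied by the next onset.
from dataclasses import dataclass

@dataclass
class Note:
    pitch: str      # yulmyeong pitch name, e.g. "hwang", "tae" (illustrative)
    jeonggan: int   # index of the Jeonggan cell (one beat per cell)
    position: str   # onset position label inside the cell, e.g. "0", "1/3", "2/3"

def encode(notes):
    """Flatten a melody into cell, position, and pitch tokens."""
    tokens, current_cell = [], None
    for n in notes:
        if n.jeonggan != current_cell:
            tokens.append(f"<cell:{n.jeonggan}>")
            current_cell = n.jeonggan
        tokens.extend([f"<pos:{n.position}>", n.pitch])
    return tokens

melody = [Note("hwang", 0, "0"), Note("tae", 0, "2/3"), Note("jung", 1, "0")]
print(encode(melody))
# ['<cell:0>', '<pos:0>', 'hwang', '<pos:2/3>', 'tae', '<cell:1>', '<pos:0>', 'jung']
```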



Adaptation and Optimization of AugmentedNet for Roman Numeral Analysis Applied to Audio Signals

March 2024 · 13 Reads

Lecture Notes in Computer Science

Automatic harmonic analysis of music has recently been significantly improved by AugmentedNet, a convolutional recurrent neural network for predicting Roman numeral labels. The original network was trained on a combination of computer encodings of digital scores and human harmonic analyses thereof. Learning from these pairs, the system predicts new harmonic analyses for unseen examples. However, for much music, no symbolic score is available. For this study, we adjusted AugmentedNet for direct application to audio signals (represented either by chromagrams or semitone spectra). We also implemented and compared further modifications to the network architecture: adding a preprocessing block designed to learn pitch spellings, increasing the network size, and adding dropout layers to avoid over-fitting. A thorough statistical analysis helped to identify the best among the proposed configurations and showed that some of the optimization steps significantly increased the classification performance. We find that this adapted AugmentedNet can reach similar accuracy levels when faced with audio features as it achieves with the “cleaner” symbolic data on which it was originally trained.
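
A minimal sketch of this kind of adaptation, assuming librosa for chroma extraction and PyTorch for the model, is given below. The layer sizes, dropout rate, and module names are illustrative, not the paper's configuration.

```python
# Illustrative only: feed chroma features extracted from audio into a small
# convolutional front end with dropout (one of the regularization steps mentioned).
import librosa
import numpy as np
import torch.nn as nn

def audio_to_chroma(path, hop_length=4096):
    """Load an audio file and return a (frames, 12) chromagram."""
    y, sr = librosa.load(path, sr=22050)
    chroma = librosa.feature.chroma_cqt(y=y, sr=sr, hop_length=hop_length)
    return chroma.T.astype(np.float32)

class ChromaFrontEnd(nn.Module):
    """A small 1D-convolutional block over chroma frames, with dropout."""
    def __init__(self, n_bins=12, hidden=64, p_drop=0.3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(n_bins, hidden, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.Dropout(p_drop),   # added to counter over-fitting
            nn.Conv1d(hidden, hidden, kernel_size=7, padding=3),
            nn.ReLU(),
        )

    def forward(self, x):         # x: (batch, frames, n_bins)
        return self.net(x.transpose(1, 2)).transpose(1, 2)
```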





Musical Genre Recognition Based on Deep Descriptors of Harmony, Instrumentation, and Segments

April 2023 · 16 Reads · 1 Citation

Lecture Notes in Computer Science

Deep learning has recently established itself as the cluster of methods of choice for almost all classification tasks in music information retrieval. However, despite very good classification performance, it sometimes brings disadvantages, including long training times and higher energy costs, lower interpretability of classification models, and an increased risk of overfitting when applied to small training sets due to a very large number of trainable parameters. In this paper, we investigate the combination of deep and shallow algorithms for the recognition of musical genres using a transfer learning approach. We train deep classification models once to predict harmonic, instrumental, and segment properties from datasets with the respective annotations. Their predictions for another dataset with annotated genres are then used as features for shallow classification methods. These can be trained over and again for different categories, and are particularly useful when the training sets are small, as in the real-world scenario where listeners define various musical categories by selecting only a few prototype tracks. The experiments show the potential of the proposed approach for genre recognition. In particular, when combined with evolutionary feature selection, which identifies the most relevant deep feature dimensions, the classification errors became significantly lower in almost all cases, compared to a baseline based on MFCCs or to results reported in previous work.

Keywords: Musical genre recognition · Deep neural networks · Transfer learning · Interpretable features · Evolutionary feature selection
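
The general pipeline can be sketched as follows; the model interfaces, the choice of shallow classifier, and the stand-in for evolutionary feature selection are assumptions, not the authors' implementation.

```python
# Hedged sketch, not the authors' pipeline: predictions of pre-trained deep models
# serve as features for a shallow classifier trained on a small genre-labelled set.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def deep_descriptors(tracks, models):
    """Concatenate the predictions of pre-trained deep models (e.g. for harmony,
    instrumentation, and segment properties) into one feature vector per track."""
    return np.hstack([model.predict(tracks) for model in models])

def genre_classifier(features, genres, selected_dims=None):
    """Train a shallow classifier on (optionally pre-selected) deep feature
    dimensions; `selected_dims` stands in for evolutionary feature selection."""
    X = features if selected_dims is None else features[:, selected_dims]
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    mean_accuracy = cross_val_score(clf, X, genres, cv=5).mean()
    return clf.fit(X, genres), mean_accuracy
```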


Fig. 1. Fanny Hensel (née Mendelssohn)'s 5 Lieder, Op.10 No.1: 'Nach Süden' as an example of an OpenScore Lieder Corpus score with aligned harmonic analysis part added.
Connecting the Dots: Engaging Wider Forms of Openness for the Mutual Benefit of Musicians and Musicologists

December 2021 · 112 Reads · 2 Citations

Empirical Musicology Review

While it is encouraging to see renewed attention to 'openness' in academia, that debate (and its interpretation of the F.A.I.R. principles) is often rather narrowly defined. This paper addresses openness in a broad sense, asking not so much whether a project is open, but how open and to whom. I illustrate these ideas through examples of my own ongoing projects, which seek to make the most of a potential symbiosis between academic and wider musical communities. Specifically, I discuss how these communities can both benefit from – and even work together on building – highly accessible and interoperable corpora of scores and analyses when ambitious openness is factored into decision making from the outset.


Figure 1. AugmentedNet. The bass and chroma inputs are processed through independent convolutional blocks and then concatenated. Both convolutional blocks are identical and expanded on the top of the figure. A convolutional block has six 1D convolutional layers. Each layer doubles the length of the convolution window and halves the number of output filters. On the right, the MTL layout with eleven tasks. Each task indicates the number of output classes in parentheses.
AugmentedNet: A Roman Numeral Analysis Network with Synthetic Training Examples and Additional Tonal Tasks

November 2021 · 305 Reads · 13 Citations

AugmentedNet is a new convolutional recurrent neural network for predicting Roman numeral labels. The network architecture is characterized by a separate convolutional block for bass and chromagram inputs. This layout is further enhanced by using synthetic training examples for data augmentation, and a greater number of tonal tasks to solve simultaneously via multitask learning. This paper reports the improved performance achieved by combining these ideas. The additional tonal tasks strengthen the shared representation learned through multitask learning. The synthetic examples, in turn, complement key transposition, which is often the only technique used for data augmentation in similar problems related to tonal music. The name 'AugmentedNet' speaks to the increased number of both training examples and tonal tasks. We report on tests across six relevant and publicly available datasets: ABC, BPS, HaydnSun, TAVERN, When-in-Rome, and WTC. In our tests, our model outperforms recent methods of functional harmony analysis, such as other convolutional neural networks and Transformer-based models. Finally, we show a new method for reconstructing the full Roman numeral label, based on common Roman numeral classes, which leads to better results compared to previous methods.
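
Following the description in Figure 1, a toy version of this layout might look like the sketch below. The kernel sizes, filter counts, recurrent layer, and task list are assumptions for illustration rather than the published configuration.

```python
# Toy sketch of the layout described above (sizes and task heads are assumed):
# two identical 1D-convolutional blocks for bass and chroma inputs, concatenation,
# a recurrent layer, and per-task output heads for multitask learning.
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Six 1D conv layers with (roughly) doubling window lengths and halving filters."""
    def __init__(self, in_ch, base_filters=64, n_layers=6):
        super().__init__()
        layers, ch = [], in_ch
        for i in range(n_layers):
            out_ch = max(base_filters // (2 ** i), 4)
            k = 2 ** (i + 1) + 1            # odd kernels so padding preserves length
            layers += [nn.Conv1d(ch, out_ch, kernel_size=k, padding=k // 2), nn.ReLU()]
            ch = out_ch
        self.block, self.out_channels = nn.Sequential(*layers), ch

    def forward(self, x):                   # x: (batch, channels, frames)
        return self.block(x)

class RomanNumeralNet(nn.Module):
    def __init__(self, tasks=None):
        super().__init__()
        tasks = tasks or {"key": 35, "degree": 21, "quality": 11}  # illustrative task set
        self.bass_block = ConvBlock(in_ch=7)     # e.g. bass over 7 note-letter classes
        self.chroma_block = ConvBlock(in_ch=12)  # 12-dimensional chromagram
        merged = self.bass_block.out_channels + self.chroma_block.out_channels
        self.rnn = nn.GRU(merged, 64, batch_first=True, bidirectional=True)
        self.heads = nn.ModuleDict({t: nn.Linear(128, n) for t, n in tasks.items()})

    def forward(self, bass, chroma):             # inputs: (batch, channels, frames)
        x = torch.cat([self.bass_block(bass), self.chroma_block(chroma)], dim=1)
        h, _ = self.rnn(x.transpose(1, 2))       # (batch, frames, 128)
        return {task: head(h) for task, head in self.heads.items()}
```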



Citations (14)


... The model is trained to classify each token as being part of a root position, first, second, or third inversion chord. We consider the When-in-Rome dataset which includes roman numeral labels (Gotham et al. 2023a) from which only the chord inversion characteristic is extracted. From this dataset, we only kept Bach chorales, from which we assume the melody to be the soprano voice. ...

Reference:

Evaluating Interval-based Tokenization for Pitch Representation in Symbolic Music Analysis
When in Rome: A Meta-corpus of Functional Harmony

Transactions of the International Society for Music Information Retrieval

... Alignment between sources of different types is one of the fundamental, ubiquitous problems not only in this micro-field, but in many corners of the wider fields of MIR and corpus study. WiR's alignment methods are handled by functionality that began life on WiR and is now available as a separate package announced and discussed in Gotham et al. (2023). By way of a brief introduction, some common cases to be relatively reliably resolved by that package include: ...

The ‘Measure Map’: an inter-operable standard for aligning symbolic music
  • Citing Conference Paper
  • November 2023

... Pre-training used an in-house 160K ABC-notation score dataset. To evaluate models' generalization with different tokenization, we fine-tuned on three classical music datasets of different instrumentation: 398 Bach chorales [10], 103 Haydn string quartets [11], 54 Mozart piano sonatas [12]. Additionally, data augmentation on 15 key signatures was done in both pre-training and fine-tuning. ...

The “OpenScore String Quartet” Corpus
  • Citing Conference Paper
  • November 2023

... For further examples how shallow and deep classifiers can be applied to solve diverse music classification tasks, we refer to [48][49][50]. Further studies that make use of AAM include segment detection [51], genre recognition [52], and neural architecture search for instrument recognition [53]. These three provide in-depth examples for possible applications. ...

Musical Genre Recognition Based on Deep Descriptors of Harmony, Instrumentation, and Segments
  • Citing Chapter
  • April 2023

Lecture Notes in Computer Science

... Musical keys and functional harmony have been explored in the field of roman numeral analysis [30][31][32]. The analysis of how modes and tonalities relate to mid-level perceptual features (e.g., dissonance, tonal stability, minorness) and affect the emotional perception of music pieces has also been discovered [22,24]. ...

AugmentedNet: A Roman Numeral Analysis Network with Synthetic Training Examples and Additional Tonal Tasks

... The challenge of supporting interoperability of music content-related data has been the subject of relevant efforts in the last decade, especially supporting their evolution, reuse, and sustainability [18][19][20] according to FAIR data principles 21 and through Semantic Web technologies. The Music Ontology 22 addresses contextual metadata about music pieces, such as when they were recorded or arranged and by whom, providing a basis for interlinking music datasets. ...

Connecting the Dots: Engaging Wider Forms of Openness for the Mutual Benefit of Musicians and Musicologists

Empirical Musicology Review

... Instead of explicitly extracting and numerically evaluating such information, it is often insightful to visualize harmonic structures and leave the final interpretation to a music expert. One such approach is described and applied to Beethoven's sonatas in (Weiß et al., 2020a), where relevant pitch content with respect to the 12 diatonic scales is extracted from an audio recording and visualized in the form of a time-diatonic representation. Figure 4 shows such a visualization of a time-diatonic representation derived from a WK64 recording of the Piano Sonata Op. 14 No 2 in G Major. ...

Discourse Not Dualism: An Interdisciplinary Dialogue on Sonata Form in Beethoven's Early Piano Sonatas

... Voltage type PWM uses capacitors as energy storage components to stabilize voltage output, while current type PWM uses inductors to provide current output. 31 Voltage type PWM has a simple structure, high efficiency, low loss, and fast dynamic response characteristics. Therefore, this study chooses voltage type PWM for waveform control. ...

Not All Roads Lead to Rome: Pitch Representation and Model Architecture for Automatic Harmonic Analysis

Transactions of the International Society for Music Information Retrieval

... Computational musicology is well-equipped to study music empirically with tools such as the Humdrum toolkit [1], the VIS Framework (Antila & Cumming, 2014), and music21 (Cuthbert & Ariza, 2010). The latter is conveniently bundled with its own corpora, and there are also several other high-quality symbolic music corpora specifically curated with musicological analysis in mind, which are available online (Sapp, 2005;Devaney et al., 2015;Neuwirth et al., 2018;Gotham et al., 2018). However, even when combined, the available corpora only cover a very small part of the published musical works. ...

Scores of scores: an openscore project to encode and share sheet music
  • Citing Conference Paper
  • September 2018

... For example, in the case of the off-beatness measure, if n is prime, there are φ(n) = n − 1 strong positions and then the resulting rhythms will have high rhythmic complexity in general, which is not always musically relevant. In chapters like these, we observe certain naïveté in Toussaint's approach to core music theory (this was also pointed out by Gotham 2013). Toussaint himself acknowledges in Chapter 15, page 90, "whether any of these methods actually guided the evolutionary selection process is another matter altogether. ...

Review of Godfried Toussaint, The Geometry of Musical Rhythm: What Makes a “Good” Rhythm Good? (CRC Press, 2013)
  • Citing Article
  • July 2013

Music Theory Online