Meinard Müller
Friedrich-Alexander-Universität Erlangen-Nürnberg

Professor
IEEE Fellow for contributions to Music Signal Processing

About

388 Publications · 221,003 Reads · 12,331 Citations
Introduction
Meinard Müller studied mathematics and computer science at Bonn University, Germany. Since September 2012, he has held a professorship at the International Audio Laboratories Erlangen, a joint institution of the Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and Fraunhofer IIS. His current research interests include music processing, audio signal processing, and music information retrieval. He is the author of the textbook "Fundamentals of Music Processing" (www.music-processing.de).
Additional affiliations
April 2002 - March 2003
Keio University
Position
  • PostDoc Position
March 2012 - August 2012
University of Bonn
Position
  • Professor (W2)
Description
  • Practical Computer Science / Audio Signal Processing
December 2007 - February 2012
Saarland University
Position
  • Senior Researcher
Description
  • Member of the Cluster of Excellence Multimodal Computing and Interaction (MMCI), MPI Informatik and Saarland University, Saarbrücken, Germany
Education
October 1990 - February 1997
University of Bonn
Field of study
  • Mathematics

Publications (388)
Article
In this article, we investigate the notion of model-based deep learning in the realm of music information research (MIR). Loosely speaking, we use the term model-based deep learning to refer to approaches that combine traditional knowledge-based methods with data-driven techniques, especially those based on deep learning, within a differentiable com...
Preprint
In this article, we investigate the notion of model-based deep learning in the realm of music information research (MIR). Loosely speaking, we use the term model-based deep learning to refer to approaches that combine traditional knowledge-based methods with data-driven techniques, especially those based on deep learning, within a differentiable comp...
Article
The availability of digital music data in various modalities provides opportunities both for music enjoyment and music research. Regarding the latter, the computer-assisted analysis of tonal structures is a central topic. For Western classical music, studies typically rely on machine-readable scores, which are tedious to create for large-scale work...
Article
Full-text available
This paper is concerned with how singers of Georgian traditional vocal music interact when singing together. Applying a variety of computational methods from audio signal processing and music information retrieval (MIR), we examine three existing corpora of (field) recordings for manifestations of a high degree of mutual coordination of the singers...
Article
Full-text available
In this work, we address the novel and rarely considered source separation task of decomposing piano concerto recordings into separate piano and orchestral tracks. Being a genre written for a pianist typically accompanied by an ensemble or orchestra, piano concertos often involve an intricate interplay of the piano and the entire orchestra, leading...
Article
Full-text available
Music synthesis aims to generate audio from symbolic music representations, traditionally using techniques like concatenative synthesis and physical modeling. These methods offer good control but often lack expressiveness and realism in timbre. Recent advancements in diffusion-based models have enhanced the realism of synthesized audio, yet these m...
Article
Full-text available
Musicology meets seismology - and opens up new possibilities for computer-assisted recording and analysis of polyphonic singing. Researchers use throat microphones to study traditional Georgian vocal music, which is included in the UNESCO World Heritage list.
Conference Paper
Full-text available
Auditory roughness is a psychoacoustic property that correlates with the perceived “pleasantness” of sounds for Western listeners (Terhardt, 1974; McDermott et al., 2016). It is an integral part of musical expression in terms of changing harmonies and consonances (Vassilakis, 2005; Berezovsky, 2019; Marijeh et al., 2022) and a multitude of models (...
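For illustration, pairwise roughness of sinusoidal partials is often estimated with Sethares' (1993) parameterization of the Plomp–Levelt dissonance curves, which is one of the "multitude of models" the abstract alludes to. The following sketch uses parameter values commonly quoted for that model; they are an assumption here, not figures taken from the cited works:

```python
import numpy as np

def sethares_roughness(freqs, amps):
    """Summed pairwise roughness of a set of partials, following a common
    parameterization of the Plomp-Levelt dissonance curves (Sethares, 1993)."""
    # Commonly quoted model constants (assumed, see lead-in)
    b1, b2, s1, s2, x_star = 3.5, 5.75, 0.0207, 18.96, 0.24
    total = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            f_lo, f_hi = sorted((freqs[i], freqs[j]))
            s = x_star / (s1 * f_lo + s2)  # critical-band scaling at the lower partial
            d = f_hi - f_lo
            total += min(amps[i], amps[j]) * (np.exp(-b1 * s * d) - np.exp(-b2 * s * d))
    return total
```

The curve peaks for partials roughly a quarter of a critical band apart and decays toward zero for unisons and wide intervals.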
Article
Full-text available
The piano concerto is a genre of central importance in Western classical music, often consisting of a virtuoso solo part for piano and an orchestral accompaniment. In this article, we introduce the Piano Concerto Dataset (PCD), which comprises a collection of excerpts with separate piano and orchestral tracks from piano concertos ranging from t...
Preprint
Full-text available
To model the periodicity of beats, state-of-the-art beat tracking systems use "post-processing trackers" (PPTs) that rely on several empirically determined global assumptions for tempo transition, which work well for music with a steady tempo. For expressive classical music, however, these assumptions can be too rigid. With two large datasets of We...
Preprint
Full-text available
Soft dynamic time warping (SDTW) is a differentiable loss function that allows for training neural networks from weakly aligned data. Typically, SDTW is used to iteratively compute and refine soft alignments that compensate for temporal deviations between the training data and its weakly annotated targets. One major problem is that a mismatch betwe...
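The SDTW loss mentioned above replaces the hard minimum of classical dynamic time warping with a smoothed softmin, which makes the accumulated cost differentiable. A minimal NumPy sketch of the standard forward recursion (squared-Euclidean local cost), purely illustrative and not the authors' training code:

```python
import numpy as np

def softmin(values, gamma):
    """Smoothed minimum: -gamma * log(sum(exp(-v / gamma)))."""
    values = np.asarray(values, dtype=float)
    m = values.min()  # subtract the minimum for numerical stability
    return m - gamma * np.log(np.sum(np.exp(-(values - m) / gamma)))

def soft_dtw(X, Y, gamma=1.0):
    """Soft-DTW accumulated cost between feature sequences X (N x d)
    and Y (M x d) with a squared-Euclidean local cost."""
    N, M = len(X), len(Y)
    R = np.full((N + 1, M + 1), np.inf)
    R[0, 0] = 0.0
    for i in range(1, N + 1):
        for j in range(1, M + 1):
            cost = np.sum((X[i - 1] - Y[j - 1]) ** 2)
            R[i, j] = cost + softmin([R[i - 1, j], R[i, j - 1], R[i - 1, j - 1]], gamma)
    return R[N, M]
```

As gamma approaches zero, the recursion recovers the hard DTW cost; larger gamma values yield smoother gradients at the price of a blurrier alignment.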
Preprint
Full-text available
Within the last century, the originally oral transmission of traditional (Georgian) vocal music has changed dramatically. The first mechanical and later electromagnetic field recording capabilities of sound and the associated acoustic reproducibility led to the first immense changes in the transmission mechanisms. The advent of the Internet and tod...
Article
Full-text available
Sound classification algorithms are challenged by the natural variability of everyday sounds, particularly for large sound class taxonomies. In order to be applicable in real-life environments, such algorithms must also be able to handle polyphonic scenarios, where simultaneously occurring and overlapping sound events need to be classified. With th...
Preprint
Many tasks in music information retrieval (MIR) involve weakly aligned data, where exact temporal correspondences are unknown. The connectionist temporal classification (CTC) loss is a standard technique to learn feature representations based on weakly aligned training data. However, CTC is limited to discrete-valued target sequences and can be dif...
Article
Full-text available
Instrument activity detection is a fundamental task in music information retrieval, serving as a basis for many applications, such as music recommendation, music tagging, or remixing. Most published works on this task cover popular music and music for smaller ensembles. In this paper, we embrace orchestral and opera music recordings as a rarely con...
Article
To model the periodicity of beats, state-of-the-art beat tracking systems use “post-processing trackers” (PPTs) that rely on several empirically determined global assumptions for tempo transition, which work well for music with a steady tempo. For expressive classical music, however, these assumptions can be too rigid. With two large datasets of We...
Article
Full-text available
The Multi-Scale Spectral (MSS) loss is commonly used for comparing audio signals, as it provides a good trade-off between temporal and spectral resolution. However, some configuration choices, including window type and size, magnitude compression, as well as the distance between spectrograms, are often set implicitly, even though they can significa...
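To make the configuration choices mentioned in the abstract concrete, here is a minimal sketch of one common MSS-loss variant (Hann window, L1 distance on magnitude and log-magnitude spectrograms, a fixed set of FFT sizes). All parameters here are illustrative assumptions, not the configurations studied in the paper:

```python
import numpy as np

def stft_mag(x, n_fft, hop):
    """Magnitude STFT with a Hann window (no padding, ragged tail discarded)."""
    win = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * win for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=-1))

def mss_loss(x, y, fft_sizes=(2048, 1024, 512), eps=1e-7):
    """Multi-scale spectral loss: L1 distance between magnitude and
    log-magnitude spectrograms, summed over several resolutions."""
    loss = 0.0
    for n_fft in fft_sizes:
        X = stft_mag(x, n_fft, hop=n_fft // 4)
        Y = stft_mag(y, n_fft, hop=n_fft // 4)
        loss += np.mean(np.abs(X - Y))
        loss += np.mean(np.abs(np.log(X + eps) - np.log(Y + eps)))
    return loss
```

Each configuration choice visible here (window type, FFT sizes, hop ratio, magnitude compression via the log term, the distance) is exactly the kind of implicitly set parameter the paper examines.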
Conference Paper
Full-text available
Audio synchronization aims at aligning multiple recordings of the same piece of music. Traditional synchronization approaches are often based on dynamic time warping using chroma features as an input representation. Previous work has shown how one can integrate onset cues into this pipeline for improving the alignment's temporal accuracy. Furthermo...
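The traditional pipeline mentioned above, dynamic time warping over a chroma-based cost matrix, can be sketched as follows. The cosine local cost and the step sizes (1,0), (0,1), (1,1) are common textbook choices, not necessarily the exact configuration used in the paper:

```python
import numpy as np

def cosine_cost(X, Y, eps=1e-9):
    """Pairwise cosine distance between chroma frames X (N x 12) and Y (M x 12)."""
    Xn = X / (np.linalg.norm(X, axis=1, keepdims=True) + eps)
    Yn = Y / (np.linalg.norm(Y, axis=1, keepdims=True) + eps)
    return 1.0 - Xn @ Yn.T

def dtw_align(C):
    """DTW over a cost matrix C (N x M) with steps (1,0), (0,1), (1,1);
    returns the total cost and the optimal warping path."""
    N, M = C.shape
    D = np.full((N + 1, M + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, N + 1):
        for j in range(1, M + 1):
            D[i, j] = C[i - 1, j - 1] + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    # Backtrack from the end of both sequences
    path, i, j = [], N, M
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return D[N, M], path[::-1]
```

The resulting path pairs frame indices of the two recordings and serves as the alignment that onset cues would then refine.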
Article
Full-text available
Given a music recording, music structure analysis aims at identifying important structural elements and segmenting the recording according to these elements. In jazz music, a performance is often structured by repeating harmonic schemata (known as choruses), which lay the foundation for improvisation by soloists. Within the fields of music informat...
Article
Full-text available
Wearing face coverings became an essential tool for preventing virus transmission during the COVID-19 pandemic. In comparison to speaking and breathing, singing emits a much higher amount of aerosol particles. Therefore, there are situations in which singers can perform or rehearse only if they are using protective masks. However, such masks...
Preprint
Full-text available
For expressive music, the tempo may change over time, posing challenges to tracking the beats by an automatic model. The model may first tap to the correct tempo, but then may fail to adapt to a tempo change, or switch between several incorrect but perceptually plausible ones (e.g., half- or double-tempo). Existing evaluation metrics for beat track...
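For context, a standard beat-tracking F-measure, of the kind such evaluation metrics build on, can be sketched as follows. The ±70 ms tolerance window is a conventional choice (used, e.g., in mir_eval), and this simplified greedy matching is illustrative only, not the metric proposed in the paper:

```python
def beat_f_measure(estimated, reference, tolerance=0.07):
    """F-measure for beat tracking: an estimated beat is a hit if it falls
    within +/- tolerance seconds of a not-yet-matched reference beat."""
    ref = sorted(reference)
    used = [False] * len(ref)
    hits = 0
    for e in sorted(estimated):
        for k, r in enumerate(ref):
            if not used[k] and abs(e - r) <= tolerance:
                used[k] = True
                hits += 1
                break
    if not estimated or not reference or hits == 0:
        return 0.0
    precision = hits / len(estimated)
    recall = hits / len(reference)
    return 2 * precision * recall / (precision + recall)
```

A weakness the abstract points at is visible here: a tracker tapping at a perceptually plausible half- or double-tempo scores near zero, the same as random output.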
Article
Full-text available
In this study we examine the tonal organization of the 2016 GVM dataset, a newly-created corpus of high-quality multimedia field recordings of traditional Georgian singing (with focus on Svaneti) which we collected during the summer of 2016. Because of the peculiarities of the performance practice of traditional Svan singing (e.g., exhibiting a cons...
Conference Paper
Appropriate sound effects are an important aspect of immersive virtual experiences. Particularly in mixed reality scenarios it may be desirable to change the acoustic properties of a naturally occurring interaction sound (e.g., the sound of a metal spoon scraping a wooden bowl) to a sound matching the characteristics of the corresponding interactio...
Preprint
Full-text available
In this study we examine the tonal organization of the 2016 GVM dataset, a newly-created corpus of high-quality multimedia field recordings of traditional Georgian singing (with focus on Svaneti) which we collected during the summer of 2016. Because of the peculiarities of the performance practice of traditional Svan singing (e.g., exhibiting a cons...
Presentation
This presentation is concerned with how singers of Georgian traditional vocal music interact when singing together. Applying a variety of computational methods from audio signal processing and music information retrieval (MIR), we examine three existing corpora of (field) recordings for manifestations of a high degree of mutual coordination of the...
Article
Full-text available
Three-voiced funeral songs from Svaneti in North-West Georgia (also referred to as Zär) are believed to represent one of Georgia’s oldest preserved forms of collective music-making. Throughout a Zär performance, the singers often jointly and intentionally drift upwards in pitch. Furthermore, the singers tend to use pitch slides at the beginning and...
Article
Full-text available
Existing acoustic scene classification (ASC) systems often fail to generalize across different recording devices. In this work, we present an unsupervised domain adaptation method for ASC based on data standardization and feature projection. First, log-amplitude spectro-temporal features are standardized in a band-wise fashion over samples and time...
Article
Attention-based Transformer models have been increasingly employed for automatic music generation. To condition the generation process of such a model with a user-specified sequence, a popular approach is to take that conditioning sequence as a priming sequence and ask a Transformer decoder to generate a continuation. However, this prompt-based con...
Article
For expressive music, the tempo may change over time, posing challenges to tracking the beats by an automatic model. The model may first tap to the correct tempo, but then may fail to adapt to a tempo change, or switch between several incorrect but perceptually plausible ones (e.g., half- or double-tempo). Existing evaluation metrics for beat track...
Preprint
Full-text available
Attention-based Transformer models have been increasingly employed for automatic music generation. To condition the generation process of such a model with a user-specified sequence, a popular approach is to take that conditioning sequence as a priming sequence and ask a Transformer decoder to generate a continuation. However, this prompt-based con...
Article
Full-text available
This paper approaches the automatic detection of musical patterns in audio recordings with a particular focus on leitmotifs, which are specific types of patterns associated with certain characters, places, items, or feelings occurring in an opera or movie soundtrack. The detection of such leitmotifs is particularly challenging since their appearanc...
Preprint
Full-text available
The deployment of machine listening algorithms in real-life applications is often impeded by a domain shift caused for instance by different microphone characteristics. In this paper, we propose a novel domain adaptation strategy based on disentanglement learning. The goal is to disentangle task-specific and domain-specific characteristics in the a...
Article
Full-text available
This paper deals with a score–audio music retrieval task where the aim is to find relevant audio recordings of Western classical music, given a short monophonic musical theme in symbolic notation as a query. Strategies for comparing score and audio data are often based on a common mid-level representation, such as chroma features, which capture melo...
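The chroma-type mid-level representation mentioned in the abstract can be computed directly from symbolic notes. A minimal sketch (duration-weighted pitch-class histogram with unit-sum normalization, both illustrative choices):

```python
import numpy as np

def chroma_from_notes(midi_pitches, durations=None):
    """12-dimensional chroma (pitch-class) vector from symbolic notes,
    optionally weighted by note duration, normalized to unit sum."""
    chroma = np.zeros(12)
    if durations is None:
        durations = [1.0] * len(midi_pitches)
    for p, d in zip(midi_pitches, durations):
        chroma[p % 12] += d  # fold MIDI pitch onto its pitch class
    s = chroma.sum()
    return chroma / s if s > 0 else chroma
```

Because chroma discards octave information, a symbolic theme and an audio recording of it can be compared frame-by-frame in this common feature space.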
Preprint
Full-text available
In this article, we show how music may serve as a vehicle to support education in signal processing. Using Fourier analysis as a concrete example, we show how the music domain provides motivating and tangible applications that make learning signal processing an interactive pursuit. Furthermore, we indicate how software tools, originally developed f...
Article
Full-text available
The fields of music, health, and technology have seen significant interactions in recent years in developing music technology for health care and well-being. In an effort to strengthen the collaboration between the involved disciplines, the workshop “Music, Computing, and Health” was held to discuss best practices and state-of-the-art at the inters...
Article
Full-text available
Automatically detecting the presence of singing in music audio recordings is a central task within music information retrieval. While modern machine-learning systems produce high-quality results on this task, the reported experiments are usually limited to popular music and the trained systems often overfit to confounding factors. In this paper, we...
Article
Full-text available
Flexible retrieval systems are required for conveniently browsing through large music collections. In a particular content-based music retrieval scenario, the user provides a query audio snippet, and the retrieval system returns music recordings from the collection that are similar to the query. In this scenario, a fast response from the system is...
Article
Full-text available
In this article, we illustrate how music may serve as a vehicle to support education in signal processing. Using Fourier analysis as a concrete example, we demonstrate how the music domain provides motivating and tangible applications that make learning signal processing an interactive pursuit. Furthermore, we indicate how software tools, original...
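As a tiny example of the kind of tangible Fourier application described here, the following script recovers the frequency of a sampled sinusoid from the peak of its DFT magnitude spectrum (the signal parameters are chosen for illustration):

```python
import numpy as np

# Sample one second of a 440 Hz sinusoid at 8 kHz
fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t)

# With N samples at sampling rate fs, DFT bin k corresponds to k * fs / N Hz;
# here N = fs, so bin k is simply k Hz, and the magnitude peaks at 440.
spectrum = np.abs(np.fft.rfft(x))
peak_hz = np.argmax(spectrum) * fs / len(x)
```

Exercises of this kind (find the pitch, then change the tone's frequency and watch the peak move) make the bin-to-frequency mapping of the DFT immediately audible and visible.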
Article
This article presents a multimodal dataset comprising various representations and annotations of Franz Schubert’s song cycle Winterreise. Schubert’s seminal work constitutes an outstanding example of the Romantic song cycle—a central genre within Western classical music. Our dataset unifies several public sources and annotations carefully created...
Article
Full-text available
This paper provides a guide through the FMP notebooks, a comprehensive collection of educational material for teaching and learning fundamentals of music processing (FMP) with a particular focus on the audio domain. Organized in nine parts that consist of more than 100 individual notebooks, this collection discusses well-established topics in music...
Chapter
Audio signals are typically complex mixtures of different sound sources. The sound sources can be several people talking simultaneously in a room, different instruments playing together, or a speaker talking in the foreground with music being played in the background.
Chapter
One of the attributes distinguishing music from random sound sources is the hierarchical structure in which music is organized. At the lowest level, one has events such as individual notes, which are characterized by the way they sound, their timbre, pitch, and duration.
Chapter
Music can be represented in many different ways and formats. For example, a composer may write down a composition in the form of a musical score. In a score, musical symbols are used to visually encode notes and how these notes are to be played by a musician.
Chapter
In music, harmony refers to the simultaneous sound of different notes that form a cohesive entity in the mind of the listener. The main constituent components of harmony, at least in the Western music tradition, are chords, which are musical constructs that typically consist of three or more notes.
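The chord construction described here can be made concrete with a small sketch mapping interval structures to note names; the triad qualities listed are illustrative examples:

```python
# Pitch classes within an octave, C = 0
NOTE_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']

# Interval structure (in semitones above the root) of some common triads
TRIAD_TYPES = {'major': (0, 4, 7), 'minor': (0, 3, 7), 'diminished': (0, 3, 6)}

def triad(root, quality):
    """Return the note names of a triad built on the given root pitch class."""
    return [NOTE_NAMES[(root + iv) % 12] for iv in TRIAD_TYPES[quality]]
```

For example, a C major triad stacks a major third and a perfect fifth above the root C, yielding the notes C, E, and G.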
Chapter
Temporal and structural regularities are perhaps the most important incentives for people to get involved and to interact with music. It is the beat that drives music forward and provides the temporal framework of a piece of music.
Chapter
Music can be described and represented in many different ways including sheet music, symbolic representations, and audio recordings. For each of these representations, there may exist different versions that correspond to the same musical work.
Chapter
The revolution in music distribution and storage brought about by digital technology has fueled tremendous interest in and attention to the ways that information technology can be applied to this kind of content. The rapidly growing corpus of digitally available music data requires novel technologies that allow users to browse personal collections...
Chapter
As we have seen in the last chapter, music signals are generally complex sound mixtures that consist of a multitude of different sound components. Because of this complexity, the extraction of musically relevant information from a waveform constitutes a difficult problem.
Chapter
The goal of automatic music segmentation is to calculate boundaries between musical parts or sections that are perceived as semantic entities. Such sections are often characterized by specific musical properties such as instrumentation, dynamics, tempo, or rhythm. Recent data-driven approaches often phrase music segmentation as a binary classificat...
Article
Full-text available
In this paper, we adapt a recently proposed U-net deep neural network architecture from melody to bass transcription. We investigate pitch shifting and random equalization as data augmentation techniques. In a parameter importance study, we study the influence of the skip connection strategy between the encoder and decoder layers, the data augmenta...
Preprint
Full-text available
The fields of music, health, and technology have seen significant interactions in recent years in developing music technology for health care and well-being. In an effort to strengthen the collaboration between the involved disciplines, the workshop ‘Music, Computing, and Health’ was held to discuss best practices and state-of-the-art at the inters...
Book
The textbook provides both profound technological knowledge and a comprehensive treatment of essential topics in music processing and music information retrieval (MIR). Including numerous examples, figures, and exercises, this book is suited for students, lecturers, and researchers working in audio engineering, signal processing, computer science,...
Article
Full-text available
While Georgia has a long history of orally transmitted polyphonic singing, there is still an ongoing controversial discussion among ethnomusicologists on the tuning system underlying this type of music. First attempts have been made to analyze tonal properties (e.g., harmonic and melodic intervals) based on fundamental frequency (F0) trajectories....
Book
Full-text available
In this study we examine the tonal organization of a series of recordings of liturgical chants, sung in 1966 by the Georgian master singer Artem Erkomaishvili. This dataset is the oldest corpus of Georgian chants from which the time synchronous F0-trajectories for all three voices have been reliably determined (Müller et al. 2017). It is therefore...
Article
Full-text available
Musical themes are essential elements in Western classical music. In this paper, we present the Musical Theme Dataset (MTD), a multimodal dataset inspired by “A Dictionary of Musical Themes” by Barlow and Morgenstern from 1948. For a subset of 2067 themes of the printed book, we created several digital representations of the musical themes. Beyond...