Diemo Schwarz
Institut de Recherche et Coordination Acoustique/Musique | IRCAM · Sound–Music–Movement Interaction Team

PhD

About

107 Publications
21,863 Reads
1,680 Citations

Publications (107)
Conference Paper
Full-text available
We present a way to make environmental recordings controllable again by the use of continuous annotations of the high-level semantic parameter one wishes to control, e.g. wind strength or crowd excitation level. A partial annotation can be propagated to cover the entire recording via cross-modal analysis between gesture and sound by canonical time...
Conference Paper
Full-text available
Dirty Tangible Interfaces (DIRTI) are a new concept in interface design that forgoes the dogma of repeatability in favor of a richer and more complex experience, constantly evolving, never reversible, and infinitely modifiable. We built a prototype interface realizing the DIRTI principles based on low-cost commodity hardware and kitchenware: A vide...
Conference Paper
Full-text available
The synthesis of sound textures, such as rain, wind, or crowds, is an important application for cinema, multimedia creation, games and installations. However, despite the clearly defined requirements of naturalness and flexibility, no automatic method has yet found widespread use. After clarifying the definition, terminology, and usages of sound te...
Article
Full-text available
Concatenative sound synthesis is a promising method of musical sound synthesis with a steady stream of work and publications for over five years now. This article offers a comparative survey and taxonomy of the many different approaches to concatenative synthesis throughout the history of electronic music, starting in the 1950s, even if they weren'...
Article
Corpus-based concatenative methods for musical sound synthesis have attracted much attention recently. They make use of a variety of sound snippets in a database to assemble a desired sound or phrase according to a target specification given in sound descriptors or by an example sound. With ever-larger sound databases easily available, together w...
Article
Full-text available
Joint actions typically involve a sense of togetherness that has a distinctive phenomenological component. While it has been hypothesized that group size, hierarchical structure, division of labour, and expertise impact agents’ phenomenology during joint actions, the studies conducted so far have mostly involved dyads performing simple actions. We...
Article
Full-text available
We describe an art–science project called “Feral Interactions—The Answer of the Humpback Whale” inspired by humpback whale songs and interactions between individuals based on mutual influences, learning process, or ranking in the dominance hierarchy. The aim was to build new sounds that can be used to initiate acoustic interactions with these whale...
Chapter
DJ techniques are an important part of popular music culture. However, they remain insufficiently investigated by researchers due to the lack of annotated datasets of DJ mixes. This paper aims at filling this gap by introducing novel methods to automatically deconstruct and annotate recorded mixes for which the constituent tracks are know...
Book
extended version of CMMR 2019 conference article hal-02172427
Article
Full-text available
A widespread belief is that large groups engaged in joint actions that require a high level of flexibility are unable to coordinate without the introduction of additional resources such as shared plans or hierarchical organizations. Here, we put this belief to a test, by empirically investigating coordination within a large group of 16 musicians pe...
Article
Full-text available
Adults readily make associations between stimuli perceived consecutively through different sense modalities, such as shapes and sounds. Researchers have only recently begun to investigate such correspondences in infants but only a handful of studies have focused on infants less than a year old. Are infants able to make cross-sensory correspondences...
Conference Paper
Full-text available
We report preliminary results of an ongoing project on automatic recognition and classification of musical “gestures” from audio extracts. We use a machine learning tool designed for motion tracking and recognition, applied to labeled vectors of audio descriptors in order to recognize hypothetical gestures formed by these descriptors. A hypothesis...
Article
Full-text available
This paper presents preliminary works exploring the use of machine learning in computer-aided composition processes. We propose a work direction using motion recognition and audio descriptors to learn abstract musical gestures.
Conference Paper
Full-text available
In perceptual listening tests, subjects have to listen to short sound examples and rate their sound quality. As these tests can be quite long, a serious and practically relevant question is whether participants change their rating behaviour over time, because the prolonged concentration while listening and rating leads to fatigue. This paper presents fi...
Article
Full-text available
Granular methods to synthesise environmental sound textures (e.g. rain, wind, fire, traffic, crowds) preserve the richness and nuances of actual recordings, but need a preselection of timbrally stable source excerpts to avoid unnatural-sounding jumps in sound character. To overcome this limitation, we add a description of the timbral content of e...
Conference Paper
Full-text available
EFFICACe is a research project centred on computer-aided composition (CAC) tools, exploring the relationships between computation, time, and interaction in musical composition processes. We present various works in progress within this project concerning the application of interactive processes to the processing and...
Article
We present the Collective Sound Checks, an exploration of user scenarios based on mobile web applications featuring motion-controlled sound that enable groups of people to engage in spontaneous collaborative sound and music performances. These new forms of musical expression strongly shift the focus of design from human-computer interactions toward...
Article
IRCAM internal reference: Schwarz13b
Conference Paper
Full-text available
We present a way to make environmental recordings controllable again by the use of continuous annotations of the high-level semantic parameter one wishes to control, e.g. wind strength or crowd excitation level. The annotations serve as a descriptor in corpus-based concatenative synthesis. The workflow has been evaluated by a preliminary subject...
Conference Paper
Full-text available
Learning mappings from gesture to sound is today a major research challenge. In previous work, we proposed a hierarchical model for representing temporal structures at different scales. Here, we focus on learning higher-level temporal structures. More specifi...
Article
This article presents AudioGuide, an innovative application for sound synthesis which aims to heighten compositional control of morphology in electronic music. We begin with a discussion of the challenges of managing detail when composing with computers, emphasizing the need for more tools which help the composer address the intricacies of sonic ev...
Book
Full-text available
Topophony literally means a place of sound, in other words sound spaces, which can be real, virtual or augmented (mixed). For example, in real life, sound sources are distributed around us: some are fixed, others are mobile. As listeners, we evolve in a space and constantly mix the sources that surround us. That experience is what we call sound navi...
Article
Full-text available
We present a method, applicable to corpus-based concatenative synthesis and specifically to audio mosaicing, that assists the composer in exploring the relationship between the parameterization of a concatenative algorithm and the resulting similarity between the output sound and the original target soundfile. Rather than focus solely on straight...
Article
Full-text available
The need for fine-tuned microtonal pitch combined with the timbral richness of corpus-based concatenative synthesis has led to the development of a new tool for corpus-based pitch and loudness control in real time with CataRT. Drawing on recent research in feature modulation synthesis (FMS) as well as the bach library for Max/MSP, we have implement...
Conference Paper
Full-text available
Query by example retrieval of environmental sound recordings is a research area with applications to sound design, music composition and automatic suggestion of metadata for the labeling of sound databases. Retrieval problems are usually composed of successive feature extraction (FE) and similarity measurement (SM) steps, in which a set of extracte...
Conference Paper
Full-text available
Corpus-based concatenative synthesis is based on descriptor analysis of any number of existing or live-recorded sounds, and synthesis by selection of sound segments from the database matching given sound characteristics. It is well described in the literature, but has been rarely examined for its capacity as a new interface for musical expression....
Article
IRCAM internal reference: Schwarz11d
Conference Paper
Full-text available
Interactive navigation within geometric, feature-based database representations allows expressive musical performances and installations. Once mapped to the feature space, the user's position in a physical interaction setup (e.g. a multitouch tablet) can be used to select elements or trigger audio events. Hence physical displacements are directly c...
Conference Paper
Full-text available
In the most common approach to corpus-based concatenative synthesis, the unit selection takes place as a content-based similarity match based on a weighted Euclidean distance between the audio descriptors of the database units and the synthesis target. While the simplicity of this method explains the relative success of CBCS for interactive descr...
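The weighted Euclidean unit selection this abstract refers to can be sketched in a few lines. All descriptor values, unit labels, and weights below are invented for illustration, not taken from the paper:

```python
import math

def select_unit(target, units, weights):
    """Pick the database unit whose descriptors are closest to the
    target under a weighted Euclidean distance (illustrative sketch).

    target  -- target descriptor values
    units   -- list of (label, descriptor list) pairs
    weights -- per-descriptor weights
    """
    def dist(desc):
        return math.sqrt(sum(w * (d - t) ** 2
                             for w, d, t in zip(weights, desc, target)))
    return min(units, key=lambda unit: dist(unit[1]))

# Hypothetical corpus: units described by (pitch, loudness, brightness)
units = [("u1", [60.0, -12.0, 0.30]),
         ("u2", [62.0, -10.0, 0.50]),
         ("u3", [59.5, -11.5, 0.35])]

# Weighting brightness more strongly biases selection towards timbre.
best = select_unit([60.0, -11.0, 0.35], units, [1.0, 0.5, 2.0])
```

Changing the weight vector changes which descriptor dominates the match, which is exactly the parameterization a performer or composer tunes interactively.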
Conference Paper
Full-text available
In audio-graphic scenes, visual and audio modalities are synchronized in time and space, and their behaviour is determined by a common process. We present here a novel way of modeling audio-graphic content for interactive 3D scenes with the concept of sound processes and their activation through 2D or 3D profiles. Many 3D applications today support...
Article
Full-text available
The main advances of the R&D Sample Orchestrator project are presented, aiming at the development of innovative functions for the manipulation of sound samples. These features rely on studies on sound description, i.e. the formalization of relevant data structures for characterizing the sounds’ content and organization. This work was applied to aut...
Article
Full-text available
We propose to apply the principle of interactive real-time corpus-based concatenative synthesis to search in effects or instrument sound databases, which becomes content-based navigation in a space of descriptors and categories. This surpasses existing approaches of presenting the sound database first in a hierarchy given by metadata, and then le...
Conference Paper
Full-text available
Existing methods for sound texture synthesis are often concerned with the extension of a given recording, while keeping its overall properties and avoiding artefacts. However, they generally lack controllability of the resulting sound texture. After a review and classification of existing approaches, we propose two methods of statistical modeling o...
Conference Paper
Full-text available
In this paper, the authors describe how they use an electric bass as a subtle, expressive and intuitive interface to browse the rich sample bank available to most laptop owners. This is achieved by audio mosaicing of the live bass performance audio, through corpus-based concatenative synthesis (CBCS) techniques, allowing a mapping of the multi-dime...
Conference Paper
Full-text available
Audio descriptor analysis in real-time or in batch is increasingly important for advanced sound processing, synthesis, and research, often using a relaxed-real-time approach. Existing approaches mostly lack either modularity or flexibility, since the design of an efficient modular descriptor analysis framework for commonly used real-time environ...
Conference Paper
Full-text available
Corpus-based concatenative synthesis presents unique possibilities for the visualization of audio descriptor data. These visualization tools can be applied to sound diffusion in the physical space of the concert hall using current spatialization technologies. Using CataRT and the FTM&Co library for Max/MSP we develop a technique for the organizat...
Article
Full-text available
This paper describes the submission to the MIREX '06 (Music Information Retrieval Evaluation eXchange) first score following task. 1. Overview: Score following is the key to an interaction with a written score/song based on the metaphor of a performer with an accompanist or band. For a historical review of score follower systems we refer the reader to...
Conference Paper
Timbre space is a cognitive model useful to address the problem of structuring timbre in electronic music. The recent concept of corpus-based concatenative sound synthesis is proposed as an approach to timbral control in both real- and deferred-time applications. Using CataRT and related tools in the FTM and Gabor libraries for Max/MSP we describe...
Conference Paper
Full-text available
The article presents methods for sound search in large effects or instrument sound databases by interactive content-based navigation in a space of descriptors and categories, based on the principle of real-time corpus-based concatenative synthesis. We focus on three algorithms: fast similarity-based search by a kD-tree in the high-dimensional descrip...
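The fast kD-tree similarity search mentioned here can be illustrated with a minimal pure-Python sketch; a production implementation (as in the systems described above) would add incremental insertion and weighted distances. The descriptor values and grain labels are made up for the example:

```python
import math

class KDNode:
    """One node of a kD-tree over descriptor vectors."""
    def __init__(self, point, payload, axis, left=None, right=None):
        self.point, self.payload, self.axis = point, payload, axis
        self.left, self.right = left, right

def build(points, depth=0):
    """Recursively build a kD-tree from (descriptor vector, payload) pairs,
    splitting on a cycling axis at the median."""
    if not points:
        return None
    axis = depth % len(points[0][0])
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return KDNode(points[mid][0], points[mid][1], axis,
                  build(points[:mid], depth + 1),
                  build(points[mid + 1:], depth + 1))

def nearest(node, target, best=None):
    """Return (distance, payload) of the stored point nearest to target."""
    if node is None:
        return best
    d = math.dist(node.point, target)
    if best is None or d < best[0]:
        best = (d, node.payload)
    diff = target[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, target, best)
    if abs(diff) < best[0]:  # the far side may still hold a closer point
        best = nearest(far, target, best)
    return best

# Hypothetical corpus of grains in a 2D descriptor space
tree = build([([0.20, 0.70], "grain-a"),
              ([0.90, 0.10], "grain-b"),
              ([0.25, 0.65], "grain-c")])
best_dist, label = nearest(tree, [0.22, 0.68])
```

By pruning subtrees whose splitting plane lies farther away than the current best match, the query avoids scanning the whole corpus, which is what makes interactive navigation in large databases feasible.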
Article
Full-text available
This article reports on developments conducted in the framework of the SampleOrchestrator project. We assembled for this project a set of tools allowing for the interactive real-time synthesis of automatically analysed and annotated audio files. Rather than a specific technique we present a set of components that support a variety of different in...
Article
Performed by Diemo Schwarz (real-time corpus-based concatenative synthesis software CataRT) and Etienne Brunet (bass clarinet). Recorded live at the Placard Headphone Festival, October 2007. Time has passed, time is dead. You have to play live again (and again and again until you're dead) to taste this pleasure that no recording can convey. The prec...
Conference Paper
Full-text available
We present a set of extensions to the Sound Description Interchange Format (SDIF) for the purpose of storage and/or transmission of general audio descriptors. The aim is to allow portability and interoperability between the feature extraction module of an audio information retrieval application and the remaining modules, such as training, cla...
Article
Full-text available
IRCAM internal reference: Grosshauser08a
Conference Paper
Full-text available
Corpus-based concatenative synthesis plays grains from a large corpus of segmented and descriptor-analysed sounds according to proximity to a target position in the descriptor space. This can be seen as a content-based extension to granular synthesis providing direct access to specific sound characteristics. The interactive concatenative sound synt...
Article
Full-text available
Plumage is an interface for interactive 3D audio/graphic scene browsing and design. The interface relies on the notion of tape heads in a sonic and graphic 3D space made of feathers associated with sound micro-samples. The spatial layout of the feathers is defined by sound parameters of the associated samples. The musical play is the outcome of a c...
Chapter
The subject of this chapter is the estimation, representation, modification, and use of spectral envelopes in the context of sinusoidal-additive-plus-residual analysis/synthesis. A spectral envelope is an amplitude-vs-frequency function, which may be obtained from the envelope of a short-time spectrum (Rodet et al., 1987; Schwarz, 1998). [Precise d...
Article
Full-text available
The last decade has seen the development of standards for music notation (MusicXML), audio analysis (SDIF), and sound control (OSC), but there are no widespread standards, nor structured approaches, for handling music-related movement, action and gesture data. This panel will address the needs for such formats and standards in the computer music co...
Article
Full-text available
Although sound visualization and image sonification have been extensively used for scientific and artistic purposes, their combined effect is rarely considered. In this paper, we propose the use of an iterative visualization/sonification approach as a sound generation mechanism. In particular, we visualize sounds using a textural self-similarity re...
Article
Full-text available
Corpus-based concatenative synthesis (CBCS) builds on a large database of segmented and descriptor-analysed sounds that are selected and played according to proximity to a target position in the descriptor space. This can be seen as a content-based extension to granular synthesis providing direct access to specific sound characteristics in real-...
Conference Paper
Full-text available
This article explains evaluation methods for real-time audio to score alignment, or score following, that allow for the quantitative assessment of the robustness and preciseness of an algorithm. The published ground-truth database and the evaluation framework, including file formats for the score and the reference alignments, are presented. Th...
Conference Paper
Full-text available
The concatenative real-time sound synthesis system CataRT plays grains from a large corpus of segmented and descriptor-analysed sounds according to proximity to a target position in the descriptor space. This can be seen as a content-based extension to granular synthesis providing direct access to specific sound characteristics. CataRT is implement...