Baptiste Caramiaux

French National Centre for Scientific Research | CNRS

PhD

About

89
Publications
32,522
Reads
1,918
Citations
Introduction
Research in Interaction Design and Machine Learning. Focus on musical applications and sound perception.
Additional affiliations
June 2015 - present
Mogees Ltd.
Position
  • Senior Researcher
March 2010 - April 2010
McGill University
Position
  • Visiting researcher
July 2012 - July 2015
Goldsmiths, University of London
Position
  • Research Associate


Publications (89)
Conference Paper
Full-text available
Sonic interaction is the continuous relationship between user actions and sound, mediated by some technology. Because interaction with sound may be task oriented or experience-based it is important to understand the nature of action-sound relationships in order to design rich sonic interactions. We propose a participatory approach to sonic interact...
Article
Full-text available
Expressivity is a visceral capacity of the human body. To understand what makes a gesture expressive, we need to consider not only its spatial placement and orientation, but also its dynamics and the mechanisms enacting them. We start by defining gesture and gesture expressivity, and then present fundamental aspects of muscle activity and ways to c...
Article
Full-text available
This article presents a gesture recognition/adaptation system for human-computer interaction applications that goes beyond activity classification and that, as a complement to gesture labeling, characterizes the movement execution. We describe a template-based recognition method that simultaneously aligns the input gesture to the templates using a...
Article
Full-text available
Gesture-to-sound mapping is generally defined as the association between gestural and sound parameters. This article describes an approach that brings forward the perception-action loop as a fundamental design principle for gesture–sound mapping in digital music instruments. Our approach considers the processes of listening as the foundation – and t...
Article
Full-text available
We investigated gesture description of sound stimuli performed during a listening task. Our hypothesis is that the strategies in gestural responses depend on the level of identification of the sound source, and specifically on the identification of the action causing the sound. To validate our hypothesis, we conducted two experiments. In the firs...
Article
Seen both as a resource and an obstacle to clarity, uncertainty is a concept that permeates many areas of design. As the concept gains prominence in HCI, this special issue specifically explores the interplay between uncertainty and prototyping in Research through Design (RTD). We first outline three histories of uncertainty in design, in relation...
Article
Full-text available
One of the challenges of technology-assisted motor learning is how to adapt practice to facilitate learning. Random practice has been shown to promote long-term learning. However, it does not adapt to the learner’s specific learning requirements. Previous attempts to adapt learning considered the skill level of learners from past training sessions....
Article
Full-text available
Background: Movement sonification, the use of real-time auditory feedback linked to movement parameters, has been proposed to support rehabilitation. Although promising results have been reported, the effect of the type of sound used has not been studied systematically. The aim of this study was to investigate in a single session the effe...
Article
Full-text available
This paper combines design, machine learning and social computing to explore generative deep learning as both tool and probe for respiratory care. We first present GANspire, a deep learning tool that generates fine-grained breathing waveforms, which we crafted in collaboration with one respiratory physician, attending to joint materialities of huma...
Conference Paper
Full-text available
For several decades, the NIME community has been appropriating machine learning (ML) for various tasks such as gesture-sound mapping or sound synthesis for digital musical instruments. Recently, the use of ML methods seems to have increased and the objectives have diversified. Despite its increasing use, few contributions have studied wh...
Conference Paper
Full-text available
For several years, the various practices around ML techniques have been increasingly present and diversified. However, the literature associated with these techniques rarely reveals the cultural and political sides of these practices. In order to explore how practitioners in the NIME community engage with ML techniques, we conducted interviews wit...
Article
Full-text available
Analysing movement learning can rely on human evaluation, e.g. annotating video recordings, or on computational means that apply metrics to behavioural data. However, it remains challenging to relate human perception of movement similarity to computational measures that aim at modelling such similarity. In this paper, we propose a metric learning meth...
Preprint
Full-text available
Background: Movement sonification, the use of real-time auditory feedback linked to movement parameters, has been proposed to support rehabilitation. Although promising results have been reported, the effect of the type of sound used has not been studied systematically, and mechanisms involved during movement execution with sonification re...
Article
Alongside recent advances in artificial intelligence (AI), a new art practice has emerged in recent years that borrows and transforms these advances in the production of artworks. The actors of this emergent practice are coming from contemporary art, media and digital arts. These artists have developed an original practice of AI within their creati...
Article
The spread of AI-embedded systems involved in human decision making makes studying human trust in these systems critical. However, empirically investigating trust is challenging. One reason is the lack of standard protocols to design trust experiments. In this paper, we present a survey of existing methods to empirically investigate trust in AI-ass...
Article
Machine learning systems have become pervasive in modern interactive technology but provide users with little, if any, agency with respect to how their models are trained from data. In this paper, we are interested in the way novices handle learning algorithms, what they understand from their behavior and what strategy they may use to "make it work". We...
Article
Full-text available
Preparing a new dance performance involves more than learning individual steps. We are interested in understanding how dancers collaborate as they rehearse a new dance piece, with a particular emphasis on how they use physical and digital artifacts to support this process. We conducted a 12-month longitudinal observational study with a dance compan...
Article
Full-text available
Humans excel at using sounds to make judgements about their immediate environment. In particular, timbre is an auditory attribute that conveys crucial information about the identity of a sound source, especially for music. While timbre has been primarily considered to occupy a multidimensional space, unravelling the acoustic correlates of timbre re...
Article
Software tools for generating digital sound often present users with high-dimensional, parametric interfaces that may not facilitate exploration of diverse sound designs. In this article, we propose to investigate artificial agents using deep reinforcement learning to explore parameter spaces in partnership with users for sound design. We describe...
Conference Paper
Full-text available
This paper introduces GANspire, a deep learning tool that generates expressive breathing waveforms for art and health applications. We describe our ongoing work and contributions, which include the development of a generative model of breathing pressure waveforms, the participatory design of an interface for creative exploration of breathing genera...
Preprint
Full-text available
This article presents a five-year collaboration situated at the intersection of Art practice and Scientific research in Human-Computer Interaction (HCI). At the core of our collaborative work is a hybrid, Art and Science methodology that combines computational learning technology -- Machine Learning (ML) and Artificial Intelligence (AI) -- with int...
Article
Full-text available
Machine learning approaches have seen a considerable number of applications in human movement modeling but remain limited for motor learning. Motor learning requires that motor variability be taken into account and poses new challenges because the algorithms need to be able to differentiate between new movements and variation in known ones. In this...
Preprint
The use of machine learning to model motor learning mechanisms is still limited, although it could help to design novel interactive systems for movement learning or rehabilitation. This approach requires accounting for the motor variability induced by motor learning mechanisms. This poses specific challenges concerning fast adaptability of the co...
Article
Full-text available
Our goal is to understand how dancers learn complex dance phrases. We ran three workshops where dancers learned dance fragments from videos. In workshop 1, we analyzed how dancers structure their learning strategies by decomposing movements. In workshop 2, we introduced MoveOn, a technology probe that lets dancers decompose video into short, repeat...
Preprint
Software tools for generating digital sound often present users with high-dimensional, parametric interfaces that may not facilitate exploration of diverse sound designs. In this paper, we propose to investigate artificial agents using deep reinforcement learning to explore parameter spaces in partnership with users for sound design. We describe a...
Conference Paper
Full-text available
We are interested in supporting motor skill acquisition in highly creative skilled practices such as dance. We conducted semi-structured interviews with 11 professional dancers to better understand how they learn new dance movements. We found that each dancer engages in a set of operations, including imitation, segmentation, marking, or applying mo...
Article
Full-text available
Motor skill acquisition inherently depends on the way one practices the motor task. The amount of motor task variability during practice has been shown to foster transfer of the learned skill to other similar motor tasks. In addition, variability in a learning schedule, in which a task and its variations are interweaved during practice, has been sh...
Article
Musical instrument timbre has been intensively investigated through dissimilarity rating tasks. It is now well known that audio descriptors such as attack time and spectral centroid, among others, account well for the dimensions of the timbre spaces underlying these dissimilarity ratings. Nevertheless, it remains very difficult to reproduce these p...
Article
Full-text available
A set of prominent designers embarked on a research journey to explore aesthetics in movement-based design. Here we unpack one of the design sensitivities unique to our practice: a strong first person perspective—where the movements, somatics and aesthetic sensibilities of the designer, design researcher and user are at the forefront. We present an...
Chapter
Machine learning is the capacity of a computational system to learn structure from data in order to make predictions on new data. This chapter draws on music, machine learning, and human-computer interaction to elucidate an understanding of machine learning algorithms as creative tools for music and the sonic arts. It motivates a new understanding...
Conference Paper
Full-text available
In this paper we present two datasets of instrumental gestures performed with expressive variations: five violinists performing standard pedagogical phrases with variation in dynamics and tempo; and two pianists performing a repertoire piece with variations in tempo, dynamics and articulation. We show the utility of these datasets by highlighting t...
Conference Paper
Full-text available
Expert musicians' performances embed a timing variability pattern that can be used to recognize individual performance. However, it is not clear if such a property of performance variability is a consequence of learning or an intrinsic characteristic of human performance. In addition, little evidence exists about the role of timing and motion in re...
Article
Full-text available
Archaeological data are heterogeneous, making it difficult to correlate and combine different types. Datasheets and pictures, stratigraphic data and 3D models, time and space mixed together: these are only a few of the categories a researcher has to deal with. New technologies may be able to help in this process and trying to solve research relat...
Article
Full-text available
Archaeological data are heterogeneous, making it difficult to correlate and combine different types. Datasheets and pictures, stratigraphic data and 3D models, time and space mixed together: these are only a few of the categories a researcher has to deal with. New technologies may be able to help in this process and trying to solve research relat...
Conference Paper
Full-text available
This paper presents a knowledge-based, data-driven method for using data describing action-sound couplings collected from a group of people to generate multiple complex mappings between the performance movements of a musician and sound synthesis. This is done by using a database of multimodal motion data collected from multiple subjects coupled wi...
Article
Full-text available
Machine learning is the capacity of a computational system to learn structures from datasets in order to make predictions on newly seen data. Such an approach offers a significant advantage in music scenarios in which musicians can teach the system to learn an idiosyncratic style, or can break the rules to explore the system's capacity in unexpecte...
Conference Paper
Full-text available
Archaeological data are heterogeneous (e.g., data-sheets and pictures, stratigraphic data, 3D models), and innovative virtual reconstructions help to visualize and study those data. In this short paper, we describe our work in progress in the design of an innovative way to interact with the complexity of a virtual reconstruction, using natural ges...
Conference Paper
Full-text available
We discuss the notion of movement coarticulation, which has been studied in several fields such as motor control, music performance and animation. In gesture recognition, movement coarticulation is generally viewed as a transition between "gestures" that can be problematic. We propose here to account for movement coarticulation as an informative el...
Conference Paper
Full-text available
This note presents a system that learns expressive and idiosyncratic gesture variations for gesture-based interaction. The system is used as an interaction technique in a music conducting scenario where gesture variations drive music articulation. A simple model based on Gaussian Mixture Modeling is used to allow the user to configure the system by...
Conference Paper
Full-text available
Machine learning is one of the most important and successful techniques in contemporary computer science. It involves the statistical inference of models (such as classifiers) from data. It is often conceived in a very impersonal way, with algorithms working autonomously on passively collected data. However, this viewpoint hides considerable human...
Conference Paper
Full-text available
In our work on computational design of expressive gestural interaction, we experienced various challenges for advanced optimisation methods. Here we want to highlight two of these challenges based on the design and the use of a Bayesian model called Gesture Variation Follower, with the aim to discuss such challenges with a broader community of desi...
Conference Paper
Full-text available
Music as a multimodal phenomenon promises to provide new insights into music cognition. Studied from an embodied perspective, body movements play a major role in our musical experiences. Here we address how motor invariants such as the two-thirds power law relate to music cognition. A sample of 64 musically trained and untrained participants were a...
Article
Full-text available
This article takes a perceptual approach to audio-visual mapping. Clearly perceivable cause and effect relationships can be problematic if one desires the audience to experience the music. Indeed perception would bias those sonic qualities that fit previous concepts of causation, subordinating other sonic qualities, which may form the relations bet...
Conference Paper
Full-text available
We present a way to make environmental recordings controllable again by the use of continuous annotations of the high-level semantic parameter one wishes to control, e.g. wind strength or crowd excitation level. A partial annotation can be propagated to cover the entire recording via cross-modal analysis between gesture and sound by canonical time...
Article
Full-text available
Gesture-to-sound mapping is generally defined as the association between gestural and sound parameters. This article describes an approach that brings forward the perception–action loop as a fundamental design principle for gesture–sound mapping in digital music instruments. Our approach considers the processes of listening as the foundation—and the...
Article
While human-human or human-object interactions involve very rich, complex and nuanced gestures, gestures as they are captured for human-computer interaction remain relatively simplistic. Our approach is to consider the study of variation of motion input as a way of understanding expression and expressivity in human-computer interaction and in order...
Conference Paper
Full-text available
The text reports a study, which draws upon methods from experimental psychology to inform audio-visual instrument design. The study aims at gleaning how an audio-visual mapping can produce a sense of causation, and simultaneously confound the actual cause and effect relationships. We call this a fungible audio-visual mapping. The participants in th...
Conference Paper
Full-text available
We present a study that explores the affordance evoked by sound and sound-gesture mappings. In order to do this, we make use of a sensor system with minimal form factor in a user study that minimizes cultural association. The present study focuses on understanding how participants describe sounds and gestures produced while playing designed sonic i...
Conference Paper
Full-text available
Gesture-based interaction is widespread in touch screen interfaces. The goal of this paper is to tap the richness of expressive variation in gesture to facilitate continuous interaction. We achieve this through novel techniques of adaptation and estimation of gesture characteristics. We describe two experiments. The first aims at understanding whet...
Conference Paper
Full-text available
The Modular Musical Objects (MO) are an ensemble of tangible interfaces and software modules for creating novel musical instruments or for augmenting objects with sound. In particular, the MOs allow for designing action-sound relationships and behaviors based on the interaction with tangible objects or free body movements. Such interaction scenario...
Conference Paper
Full-text available
We present a ludic interactive music performance that allows live recorded sounds to be re-rendered through the users' movements. The interaction design made the control similar to a shaker where the motion energy drives the energy of the played music piece. The instrument has been designed for musicians as well as non-musicians and allows for mult...
Conference Paper
Full-text available
We present the first combined use of the electromyogram (EMG) and mechanomyogram (MMG), two biosignals that result from muscular activity, for interactive music applications. We exploit differences between these two signals, as reported in the biomedical literature, to create bi-modal sonification and sound synthesis mappings that allow performers...
Conference Paper
Full-text available
This paper presents work in progress on applying a Multimodal interaction (MMI) approach to studying interactive music performance. We report on a study where an existing musical work was used to provide a gesture vocabulary. The biophysical sensing already used in the work was used as input modality, and augmented with several other input sensing...
Conference Paper
Full-text available
We present an overview of machine learning (ML) techniques and their application in interactive music and new digital instrument design. We first provide the non-specialist reader with an introduction to two ML tasks, classification and regression, that are particularly relevant for gestural interaction. We then present a review of the literature in c...
Thesis
Full-text available
This thesis presents studies analysing the relationship between gesture and sound, with the aim of supporting the design of digital expressive instruments for musical performance. Studies of these relationships are related to various areas of research and lead to a multidisciplinary approach. We begin the thesis by presenting an explo...
Conference Paper
Full-text available
We propose a hierarchical approach for the design of gesture-to-sound mappings, with the goal to take into account multilevel time structures in both gesture and sound processes. This allows for the integration of temporal mapping strategies, complementing mapping systems based on instantaneous relationships between gesture and sound synthesis para...
Conference Paper
Full-text available
In this paper, we explore the use of movement qualities as interaction modality. The notion of movement qualities is widely used in dance practice and can be understood as how the movement is performed, independently of its specific trajectory in space. We implemented our approach in the context of an artistic installation called A light touch. Thi...
Article
Full-text available
This article presents a segmentation model applied to musician movements, taking into account different time structures. In particular we report on ancillary gestures that are not directly linked to sound production, whilst still being entirely part of the global instrumental gesture. Specifically, we study movements of the clarinet captured with an o...