The Human, the Mechanical, and the Spaces in between:
Explorations in Human-Robotic Musical Improvisation
Scott Barton
Worcester Polytechnic Institute, Worcester, MA
sdbarton@wpi.edu
Abstract
HARMI (Human and Robotic Musical Improvisation) is a
software and hardware system that enables musical robots to
improvise with human performers. The goal of the system is
not to replicate human musicians, but rather to explore the
novel kinds of musical expression that machines can
produce. At the same time, the system seeks to create
spaces where humans and robots can communicate with
each other in a common language. To help achieve the
former, ideas from contemporary compositional practice
and music theory were used to shape the system’s
expressive capabilities. In regard to the latter, research from
the field of cognitive psychology was incorporated to enable
communication, interaction, and understanding between
human and robotic performers. The system was partly
developed in conjunction with a residency at High Concept
Laboratories in Chicago, IL, where a group of human
improvisers performed with the robotic instruments. The
system represents an approach to the question of how
humans and robots can interact and improvise in musical
contexts. This approach purports to highlight the unique
expressive spaces of humans, the unique expressive spaces
of machines, and the shared spaces between the two.
Introduction
HARMI (Human and Robotic Musical Improvisation) is a
software and hardware system that enables musical robots
to improvise with human performers. The physical
instruments were built by the Music, Perception, and
Robotics Lab at WPI and EMMI (Expressive Machines
Musical Instruments), and the software was programmed
by the author. In order to create a robotic musician that
can interact with human performers and at the same time
can create music that is novel and compelling, the design
of this system incorporates ideas from cognitive
psychology, contemporary compositional practice, and
music theory. In regard to the former, as robotic and
human musical improvisers share perceptual and cognitive
capabilities, certain kinds of musical communication are
enabled. This does not mean that a robotic improviser must
replicate a human one; because HARMI does not seek such
replication, it differs from previous
efforts in the field. Instead, HARMI seeks both to find
common ground that facilitates communication, and to
explore and exhibit the interpretive and expressive spaces
that are unique to machines and humans. Ideas from
contemporary compositional practice and music theory are
integrated so as to create new kinds of ideas that are
musically, and not just technologically, valuable.
Prior work
A number of systems have been developed that enable
humans to musically improvise with machines. Efforts in
this field are often included in the category “interactive
music” (see Winkler 1995 and Winkler 2001) that may or
may not include improvisatory capabilities. One category
of interactive systems uses software to interpret the
gestures of human collaborators (these gestures can be
communicated via MIDI or audio data) and then produces
sounds (which can be synthesized, pre-recorded and/or
processed) through loudspeakers. Specific examples
include Cypher (Rowe 1992), Voyager (Lewis 2000), and
the Continuator (Pachet 2003). Another category uses
software to interpret the gestures of human collaborators
but produces sound using physical electro-mechanical
instruments. Compared to loudspeakers, physical
instruments produce sounds that have unique acoustical
qualities, clarify the causality involved in sound
production, are capable of nuanced and idiosyncratic
expression as a function of their physical design, give
visual cues and feedback to human collaborators, and offer
visual interest to an audience in performance (some of
these features are discussed in Weinberg and Driscoll
2007). Because of these capabilities, such systems offer
possibilities for new kinds of musical expression and
interaction. Previous work in this area includes Gil
Weinberg et al.’s Haile (Weinberg and Driscoll 2006,
2007), Shimon (Hoffman and Weinberg 2011), and
Kapur’s MahaDeviBot (Kapur 2011). HARMI purports to
further the work in this field by integrating ideas from
cognitive psychology and contemporary compositional
practice and music theory to allow for humans and
machines to create new kinds of musical expression in
improvisatory contexts.
A Creative Improviser: Musical, Perceptual
and Cognitive Considerations
Listening and Responding
Musical improvisation involves action and reaction;
processing, interpretation and creativity. We can conflate
these ideas into two core phases that a musical improviser
inhabits during an interaction: listening and responding.
We can use these general categories to guide the design of
a robotic musical improviser. In humans, the concept of
listening involves perceptual and cognitive mechanisms.
We must interpret sonic information from the world in
order to understand temporal relationships, spatial location
and source. We understand both individual sonic events,
which we can describe using concepts such as pitch and
timbre, and larger organizations that comprise those
events, such as rhythms, melodies and musical pieces.
After we have listened, we may either let our
collaborators continue to speak or respond with ideas of
our own, which are often derived
from ideas that we have encoded in memory. Experience,
physical capabilities, preferences, and cultural conventions
shape these processes (see Lewis 1999).
In order to incorporate such functionality in an
improvising machine, programmers typically design modes
that carry out particular cognitive or perceptual functions.
For example, in Rowe’s Cypher, the composition section
of the program includes three modes that transform
“heard” material, generate material in a variety of styles,
and play material from a stored library (Rowe 1992). The
modes of Weinberg et al.’s Haile include imitation,
stochastic transformation, simple accompaniment,
perceptual transformation, beat detection and perceptual
accompaniment (Weinberg and Driscoll 2007). HARMI
was designed according to a similar paradigm of modular
functionality, although it differs from previous efforts in
regard to the musical motivations and aesthetic preferences
that shape the kinds of functionality that have been
incorporated into the software.
Ideas from Music Composition and Theory
HARMI’s purpose is to make contemporary music;
therefore, ideas from (contemporary) compositional
practice and music theory, such as multiple simultaneous
beat rates, non-integer-multiple quantization, rhythmic
structures based on frequency-weighted distributions,
isorhythms, gesture transformations, and reiteration
(including pattern matching), were used to guide which
interpretive and choice-making functionality was
integrated into the software.
The Beat and Quantization
The notion of the beat, or the tactus, is considered to be
a primary component of rhythmic experience (Krumhansl
2000, which nicely summarizes the work of Fraisse;
Honing 2012; Kapur 2011; Janata 2012). The tactus usually
comprises one or two rates described as primary,
most salient, “intermediate” and “moderate” relative to
other rates in a rhythmic texture (Lerdahl and Jackendoff
1983, p. 21; Parncutt 1994). This is not to say that the rate
that we can tap our foot to is the only one of significance.
Some theorists, such as Yeston (1976), Krebs (1987) and
London (2004) highlight that we are sensitive to multiple
simultaneous hierarchically-related regular rates in a
rhythmic texture, particularly in the context of meter. The
relationships between these rates shape our experience of
rhythm. Krebs (1987) describes the relationship between
rhythmic levels in terms of metrical consonance and
dissonance, which correspond to the degree of alignment
between rhythmic levels. Rhythmic alignment is a
function of both configuration and the mathematical
relationship between rates. In musical perception and
production, alignment and the mathematical relationship
between rates are limited by the temporal accuracy of the
system interpreting and creating those rates. Computer-
driven robotic performers, which are capable of high
degrees of temporal accuracy, are therefore able to
perceive and perform rhythms in ways that human
musicians cannot. Musical robots thus open the door for
new compositional and improvisational possibilities.
HARMI explores these possibilities by analyzing
rhythmic textures to find multiple beat rates, not just the
rate that we find most salient as human listeners. The
creative process then chooses from these various rates
instead of calculating multiples or divisions of a primary
beat. By approaching rhythm in this way, novel kinds of
rhythmic configurations and relationships can be created.
To illustrate the difference between a system that
interprets multiple beat levels and one that adjusts all
levels relative to a tactus, consider a rhythmic
texture that contains rates at 243 msec, 407 msec and 734
msec.[1] A common approach to automatic rhythmic
interpretation is to quantize the temporal locations of
elements in order to simplify the proportional relationships
between rates.[2] This allows one rate to be specified as the
tactus to which the others are related by small integer
ratios. Thus, by identifying the primary beat and
quantizing the other rhythmic elements relative to that
beat, the relationship between 243, 407 and 734 could
become 1:2:4. While such an interpretation is consonant
with Western notational practice (we now have eighth,
quarter and half notes) and the evidence that humans
perceive durations categorically (Clarke 1987; Schulze,
1989; Sasaki et al. 1998; Hoopen et al. 2006; Desain and
Honing 2003; Krumhansl 2000; London 2004), modifying
durations in this way is problematic for a number of
reasons. First, restricting rhythmic configurations to small
integer ratios produces idealized versions of durations and
temporal relationships that typically inspire notions of an
undesirable kind of “mechanical” production. Second,
these processes filter out (in the case of interpretation) or
prevent (in the case of production) rhythmic richness. This
richness is a vehicle for expressivity and “feel”: some
players speed up or slow down eighth notes relative to
quarter notes, or play “in front of” or “behind” the beat in
order to convey shifts in energy, a certain mood, or define
their individual interpretive styles. More universally, this
richness can help define musical genres via minute timing
conventions, such as “swing” in Jazz, or rubato in
Romantic Western Art Music. This richness allows
complex superimpositions of rates and cross-rhythms. Perhaps
most importantly (for those interested in making new
music), this richness allows composers (and robots
programmed by composers) to voice new kinds of
rhythmic configurations and relationships that can lead to
new kinds of musical styles, conventions and identities. A
rhythmic configuration consisting of the aforementioned
durations (243 msec, 407 msec, 734 msec) projects an
identity distinct from that of its small-integer ratio
counterpart. One imagines the diversity of rhythmic
identities that are possible given the temporal capabilities
of our machines, yet that remain relatively unexplored in
musical practice (importantly, these rhythms can still be
periodic and cyclic, and thus, beat-based). Such
configurations and relationships inspire us to think of
rhythm in new ways: How will we, as human listeners,
experience such mechanically-produced rhythms given our
tendency to perceive temporal relationships categorically?
How can an artificial intelligence interpret and produce
such rhythms while also being sensitive to the temporal
categories and expressive timing that characterize human
perception? How can a robotic improviser explore new
rhythmic territory while simultaneously being able to
communicate with human musicians? By enhancing our
vocabulary beyond that of simple grids, we open the door
to these fantastic rhythmic questions and possibilities.

[1] Here, a “rate” is a single numerical representation of some
distribution of IOIs (inter-onset intervals).
[2] There are a number of different approaches to the problem of
quantization: see Desain 1993 and Desain and Honing 1989 for a
discussion of the topic.
HARMI was designed with these ideas in mind; thus, it
can produce quantized rates that are related by complex
proportions. This process preserves the timing nuance,
complex rhythmic configurations and relationships, and
subtle tempo alterations made possible by multi-rate
analysis, while at the same time giving the compositional
systems a limited set of values from which to find and
produce rhythmic patterns.
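To make the multi-rate idea concrete, the sketch below clusters heard inter-onset intervals (IOIs) into candidate rates and snaps each interval to the nearest detected rate rather than to small-integer multiples of a single tactus. It is a minimal illustration under assumed details; the 10% clustering tolerance and the function names are not taken from HARMI.

```python
def find_rates(iois, tolerance=0.1):
    """Group IOIs (in msec) whose values lie within `tolerance` of each
    other and return the mean of each group as a candidate rate."""
    rates = []
    for ioi in sorted(iois):
        for group in rates:
            if abs(ioi - group[0]) / group[0] <= tolerance:
                group.append(ioi)
                break
        else:
            rates.append([ioi])
    return [sum(g) / len(g) for g in rates]

def quantize(iois, rates):
    """Snap each IOI to the nearest detected rate, preserving the
    complex proportions between the rates themselves."""
    return [min(rates, key=lambda r: abs(r - ioi)) for ioi in iois]

# The texture from the text: rates near 243, 407 and 734 msec survive
# quantization instead of being collapsed onto a 1:2:4 grid.
heard = [240, 245, 410, 731, 248, 405, 737]
rates = find_rates(heard)          # roughly [244.3, 407.5, 734.0]
print(quantize(heard, rates))
```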
Metric Frequencies, Transformation and Reiteration
HARMI also makes use of the idea that we are more
perceptually sensitive to certain metric positions than
others (Palmer and Krumhansl 1990). Some have
connected this sensitivity with frequency distributions in
meter-based musical canons, which both reflect and shape
perceptual and production tendencies.[3] HARMI extends
this idea to duration, so that temporal intervals that the
system chooses depend on those that were heard in the
past. In HARMI, temporal intervals are weighted based on
their frequency of occurrence within a particular grouping.
This frequency distribution becomes a probability
distribution that governs how intervals will be chosen in
the process of creating new rhythms. The order of these
intervals is chosen according to transformational processes.
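A frequency-weighted duration distribution of this kind can be sketched in a few lines. The snippet below, which assumes Python's random.choices for weighted sampling and hypothetical function names, illustrates the idea rather than HARMI's implementation.

```python
import random
from collections import Counter

def duration_distribution(durations):
    """Count how often each (quantized) duration occurs within a
    grouping and normalize the counts into probabilities."""
    counts = Counter(durations)
    total = sum(counts.values())
    return {d: n / total for d, n in counts.items()}

def sample_durations(distribution, k):
    """Draw k new durations, biased toward those heard most often."""
    values = list(distribution.keys())
    weights = list(distribution.values())
    return random.choices(values, weights=weights, k=k)

heard = [244, 244, 408, 244, 734, 408]      # quantized durations (msec)
dist = duration_distribution(heard)         # {244: 0.5, 408: 0.33, 734: 0.17}
print(sample_durations(dist, 8))            # new durations, biased toward 244
```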
Transformation is a core component of musical
compositional practice and is implemented in HARMI in a
number of ways. When we transform a musical idea, we
reiterate some components of the original gesture while at
the same time adding other ideas around those original
components. We can understand a statement A and its
transformation B by considering the number of operations
(such as additions, subtractions, and substitutions) that are
required to turn A into B (which also can describe inter-
entity similarity: see Hahn et al. 2003; Orpen & Huron
1992). We can use these transformational distances (Hahn
et al. 2003) to represent and create new ideas from ones
heard in an interaction. In HARMI, transformations occur
via random and sequential processes. In one mode, a
rhythm is transformed by substituting durations one at a
time: the location of the substitution within the sequence is
chosen at random without repetition, and the duration is chosen
from the probability distribution. In another, the number
of alterations that is to be made is randomly chosen within
a restricted range, the location of those transformations
within the sequence is chosen, and then the durations to be
substituted are chosen from the probability distribution.
The number of alterations is not restricted to the length of
the phrase, so that additions can be made.

[3] David Huron discusses this idea and related research in Sweet
Anticipation.
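The first transformation mode described above could be sketched as follows: positions to substitute are drawn at random without repetition, and the replacement durations come from the weighted distribution. The names and structure are illustrative assumptions, and additions that lengthen the phrase are not shown.

```python
import random

def transform(rhythm, distribution, steps):
    """Substitute `steps` durations at randomly chosen, non-repeating
    positions, drawing replacements from the weighted distribution."""
    values = list(distribution.keys())
    weights = list(distribution.values())
    positions = random.sample(range(len(rhythm)), k=min(steps, len(rhythm)))
    result = list(rhythm)
    for pos in positions:                    # non-repeating random locations
        result[pos] = random.choices(values, weights=weights, k=1)[0]
    return result

original = [244, 408, 244, 734]
dist = {244: 0.5, 408: 0.33, 734: 0.17}
print(transform(original, dist, steps=2))    # e.g. [244, 244, 244, 734]
```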
Reiteration without transformation also plays an
important role in musical improvisation. There are a
number of reasons for this. When one musician repeats the
idea of another, he shows that basic communication is
occurring successfully (if you are really listening to me,
you can repeat exactly what I said). It also motivates an
ensemble towards a shared idea, which can then be the
source of future musical explorations. Thus, the system
has the ability to reiterate the pitch and durational
sequences of human collaborators.
Within the category of reiteration, HARMI has the
ability to match both pitch and rhythmic patterns played by
performers. In one mode, when the system detects a
pattern, it will then repeat that pattern a certain number of
times, after which it will transform the pattern in one of the
ways discussed above. The system can find and express
pitch and rhythmic patterns independently. Given the
temporal variability of human musical performance, multi-
rate quantization, as described earlier, is used in order to find
patterns.
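As an illustration of pattern reiteration on a quantized duration stream, the naive period-finding sketch below returns the shortest repeating cycle if one exists; it is an assumed stand-in, not HARMI's actual pattern matcher.

```python
def find_pattern(durations, min_repeats=2):
    """Return the shortest cycle that, repeated, reproduces the whole
    sequence, or None if no such cycle exists."""
    n = len(durations)
    for period in range(1, n // min_repeats + 1):
        candidate = durations[:period]
        if all(durations[i] == candidate[i % period] for i in range(n)):
            return candidate
    return None

quantized = [244, 408, 244, 408, 244, 408]
pattern = find_pattern(quantized)     # [244, 408]
if pattern is not None:
    response = pattern * 3            # reiterate before transforming
    print(response)
```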
The above shows the extent to which the system is as
much a composition as it is an autonomous agent, and thus
it expresses the values and aesthetic preferences of the
author. At the same time, it interprets and produces ideas
in a way that the author did not necessarily anticipate;
thus, we can say it expresses ideas in its own way, which
provides novel and interesting ingredients to a musical
mix. The design of the system therefore creates spaces for
the human, for the mechanical and for the areas in
between. The character and boundaries of these spaces are
ready to be defined and explored by composers and
performers.
HARMI in Practice; Future Directions
HARMI was partly developed in conjunction with a group
of human improvisers during a residency at High Concept
Laboratories in Chicago, IL in July 2013. The rehearsals
during this residency allowed performers to share their
thoughts about the system and how it could be improved.
These rehearsals revealed the need for feedback when
HARMI was listening. Because of the number of
musicians involved in the improvisations (up to three
humans and two robots), the human improvisers sometimes
found it difficult to determine when and “where” the
robot’s attention was directed. Visual solutions to this
problem include lights, projection screens, or
anthropomorphic electromechanical structures (the latter is
a feature of Weinberg et al.’s Shimon) that illuminate,
display or move to convey the robot’s attention.
Alternatively (or in combination), one could utilize an
auditory feedback system that produces a sound when a
note is heard. An auditory system has a number of
advantages over a visual one in musical contexts. First, an
auditory feedback system requires that an improviser
actively attend to the individual components of a musical
texture in order to distinguish and interpret the auditory
feedback. This is not necessarily the case in a visual
system, which may cue a visually sensitive improviser
whose auditory attention is not focused on the rest of the
musicians, or the music as a whole. The latter is a
problem: careful listening is an essential part of musical
communication. Second, an auditory feedback system
allows the human performers to understand how the robot
“hears” in the same language (one of rhythms, pitches,
phrases, etc.) and modality that they are “speaking”. This
understanding can motivate human musicians to play,
experiment, and create in new ways.
The system was therefore modified so that auditory cues
were given when HARMI heard a tone onset. It became
clear that the machine interpreted some gestures as a
human musician would, and others in its own unique ways.
This bit of functionality proved to be inspirational to the
human musicians, who subsequently experimented with
the ways that HARMI “hears”. As a result, the musicians
learned how to communicate with the machine, which
invited the musicians to try new ways of playing, which
caused the output from the robots to be unexpected and
interesting. As positive surprise provides some of the
greatest moments in music, particularly improvised music,
these results were successful and inspirational.
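The auditory feedback idea (emit a short cue whenever an onset is heard) could be sketched as a simple frame-energy onset detector. The frame size, threshold values, and the play_cue() stub below are assumptions; the paper does not describe the detector actually used.

```python
import numpy as np

FRAME = 512          # samples per analysis frame (assumed)
THRESHOLD = 4.0      # energy ratio treated as an onset (assumed)

def play_cue():
    print("cue")     # stand-in for sending a click or tone to the output

def monitor(signal, frame=FRAME, threshold=THRESHOLD):
    """Scan a numpy array of audio samples and trigger a cue whenever
    the frame energy jumps past the threshold."""
    prev_energy = 1e-9
    for start in range(0, len(signal) - frame, frame):
        energy = float(np.sum(signal[start:start + frame] ** 2))
        if energy / prev_energy > threshold and energy > 1e-4:
            play_cue()                    # onset heard: acknowledge it
        prev_energy = max(energy, 1e-9)
```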
Understanding how HARMI “hears” in unique ways
excites ideas about other ways a robotic improviser could
interpret sonic information. When presented with
microtonal pitch sequences or multiphonic-rich passages,
frequency relationships could be translated into temporal
ones (this kind of functionality is particularly important for
instruments that improvise in contemporary musical
contexts but are restrained by equal temperament). A
musical robot could interpret rhythmic patterns in ways
that a human performer typically would not. For example,
the run and gap principles (Garner and Gottwald 1968;
Rosenbaum and Collyer 1998) describe how human
listeners perceptually organize auditory cyclic temporal
patterns. An artificial musical intelligence does not have to
be governed by the same principles, and thus may choose
pattern beginnings and configurations in unique ways. The
combination of these interpretations with human ones
could produce interesting musical textures.
The rehearsals also motivated questions about memory
and musical form. While human memory privileges
musical information heard in the recent past (Brower 1993;
Snyder 2000), an artificial intelligence need not be
governed by the same sorts of temporal constraints. A
musical robot could reproduce an idea voiced at any point
in an improvisation’s past. These recollections could be
reproduced exactly, or they could be colored and
transformed in a variety of ways depending on what else
had been perceived during the musical interactions, or on
other kinds of knowledge encoded in the memory system.
These sorts of alternative recollective capabilities could
provide structure that would allow human-robot
collaborators to create new improvisational forms.
These results and contemplations reflect the core
approach taken in the design of HARMI, which is the
attempt to find not only the shared perceptual and
production spaces between human and robotic improvisers,
but to also highlight those spaces that are uniquely human
and uniquely robotic. As these spaces are explored in
greater depth through integration of learning, sensitivity to
physical design, higher-level perceptual abilities, aesthetic
preferences and stylistic conventions, new kinds of music
will be created.
References
Brower, C. 1993. Memory and the Perception of Rhythm. Music
Theory Spectrum, 19–35.
Clarke, E. F. 1987. Categorical rhythm perception: An ecological
perspective. Action and perception in rhythm and music, 55, 19–
33.
Desain, P., & Honing, H. 1989. The quantization of musical time:
A connectionist approach. Computer Music Journal, 13(3), 56–
66.
Desain, P. 1993. A connectionist and a traditional AI quantizer,
symbolic versus sub-symbolic models of rhythm perception.
Contemporary Music Review, 9(1-2), 239–254.
Desain, P., & Honing, H. 2003. The formation of rhythmic
categories and metric priming. Perception, 32(3), 341–365.
Garner, W. R., & Gottwald, R. L. 1968. The perception and
learning of temporal patterns. The Quarterly journal of
experimental psychology, 20(2), 97–109.
Hahn, U.; Chater, N.; and Richardson, L. B. 2003. Similarity as
Transformation. Cognition 87.1: 1–32.
Honing, H. 2012. Without It No Music: Beat Induction as a
Fundamental Musical Trait. Annals of the New York Academy of
Sciences 1252.1: 85–91.
Hoopen, G. T., Sasaki, T., and Nakajima, Y. 1998. Categorical
rhythm perception as a result of unilateral assimilation in time-
shrinking. Music Perception, 201–222.
Janata, P.; Tomic, S. T.; and Haberman, J. M. 2012. Sensorimotor
Coupling in Music and the Psychology of the Groove. Journal of
Experimental Psychology: General 141.1: 54–75.
Kapur, A. 2011. Multimodal Techniques for Human/Robot
Interaction. Musical Robots and Interactive Multimodal Systems.
Solis, J. and Ng, K eds. Springer Berlin Heidelberg. 215–232.
Krebs, H. 1987. Some extensions of the concepts of metrical
consonance and dissonance. Journal of Music Theory, 31(1), 99–
120.
Lerdahl, F. A. and Jackendoff, R. S. 1983. A generative theory of
tonal music. The MIT Press.
Lewis, G. E. 1999. Interacting with Latter-day Musical Automata.
Contemporary Music Review 18.3: 99–112.
Lewis, G. E. 2000. Too Many Notes: Computers, Complexity and
Culture in Voyager. Leonardo Music Journal 10: 33–39.
London, J. 2004. Hearing in Time: Psychological Aspects of
Musical Meter. New York: Oxford University Press.
Orpen, K. S., and Huron, D. 1992. Measurement of Similarity in
Music: A Quantitative Approach for Non-parametric
Representations. Computers in music research 4: 1–44.
Pachet, F. 2003. The Continuator: Musical Interaction with Style.
Journal of New Music Research 32.3: 333–341.
Palmer, C., & Krumhansl, C. L. 1990. Mental representations for
musical meter. Journal of Experimental Psychology: Human
Perception and Performance, 16(4), 728–741.
Parncutt, R. 1994. A perceptual model of pulse salience and
metrical accent in musical rhythms. Music Perception, 11, 409–
409.
Rosenbaum, D. A., & Collyer, C. E. 1998. Timing of Behavior:
Neural, Psychological, and Computational Perspectives. The MIT
Press.
Rowe, R. 1992. Machine Listening and Composing with Cypher.
Computer Music Journal 16.1: 43–63.
Sasaki, T., Hoopen, G. T., & Nakajima, Y. 1998. Categorical
rhythm perception as a result of unilateral assimilation in time-
shrinking. Music Perception, 201–222.
Schulze, H.-H. 1989. Categorical perception of rhythmic patterns.
Psychological Research, 51(1), 10–15.
Snyder, B. 2000. Music and memory: an introduction.
Cambridge, Mass.: MIT Press.
Weinberg, G., and Driscoll, S. 2006. Robot-human Interaction
with an Anthropomorphic Percussionist. In Proceedings of the
SIGCHI Conference on Human Factors in Computing Systems,
1229–1232. New York, NY, USA: ACM.
Weinberg, G., and Driscoll, S. 2007. The Interactive Robotic
Percussionist: New Developments in Form, Mechanics,
Perception and Interaction Design. In Proceedings of the
ACM/IEEE International Conference on Human-robot
Interaction, 97–104. New York, NY, USA: ACM.
Winkler, T. 2001. Composing Interactive Music. The MIT Press.
Winkler, T. 1995. Strategies for Interaction: Computer Music,
Performance, and Multimedia. In Proceedings of the 1995
Connecticut College Symposium on Arts and Technology.
Yeston, M. 1976. The stratification of musical rhythm. New
Haven: Yale University Press.