
Motion-sensing Music: Artistic and Technical Challenges In Two Works For Dance



Published in Proceedings of the 1998 International Computer Music Conference
Todd Winkler
MacColl Studio for Electronic Music
Brown University
David Rokeby's Very Nervous System (VNS) offers a sophisticated level of computer control for
accurately detecting the speed and location of dancers on stage. The VNS, coupled with software
written in Opcode’s Max, is used in two recent dance productions in which music is produced by
dancers’ movements. Dark Around the Edges, a collaboration with Walter Ferrero, uses mechanical
motion to create rhythms and musical phrases. In Songs for the Body Electric, choreographer Gerry
Girouard's fluid gestures influence compositional algorithms and signal processing parameters.
Video projections and lighting changes also generate sound in counterpoint with the dance.
1. The Collaborative Process
From Stravinsky and Nijinsky to Cage and Cunningham, history tells the stories of the mutual influence between
composers and choreographers. New technology using motion sensors to trigger computer music ups the ante on
this close collaboration by inviting dancers to become musical performers. This requires many hours of
experimentation and testing for each new work in order to discover music and response mechanisms that feel “right”
to the dancers for specific types of movement. One artistic challenge is to go beyond the novelty of producing music
“out of thin air” by finding links between the body and sound that both convince an audience and
serve the expressive purpose of the dance.
This paper presents two years of research and creative work in interactive dance, culminating in the productions Dark
Around the Edges and Songs for the Body Electric. Both works use the Very Nervous System (VNS), a motion-sensing
device created by David Rokeby, which reports the location and speed of dancers to Max software written by
the author. The software generates a musical response based on an interpretation of the movement data using
compositional algorithms and mapping strategies. MIDI data is then sent to a Kurzweil K2500, a sample-based
synthesizer with 64 MB of memory, and an Ensoniq DP/4 signal processor. Similar systems have been used by the
author for tracking participants in motion-sensing audio and video installations. Technical information, response
techniques and working methods used to create the dances are described here. For a more theoretical discussion on the
connection between movement and music, see "Making Motion Musical," from the 1995 Proceedings of the
International Computer Music Conference (Winkler, 1995).
Dark Around the Edges is a twenty-minute solo with Walter Ferrero, an actor and choreographer living in
Stockholm. It premiered in Rhode Island on April 20, 1997. Since the capabilities and idiosyncrasies of the VNS
were unknown to us, we began with several months of improvisation and experimentation so that our artistic
decisions would naturally evolve out of a spontaneous, physical understanding of the system. Ferrero, who is also
trained in the circus arts, experimented with juggling balls, walking on stilts and using toys to trigger the VNS. The
final production includes theatrical moments characterized by humor and clowning, such as a section showing Ferrero
whipping and snapping long scarves into space to start and stop different layers of music. Other sections are
characterized by precise robotic and repetitive movements creating rhythms with machine and percussive sounds; and
slow, fluid movements producing thick evolving sounds continuously altered by speed. Software developed for this
dance production served as the basis for all of my subsequent VNS projects.
Songs for the Body Electric is a one-hour production featuring two dancers, theatrical lighting, and multiple video
projections. Work on the project began in the summer of 1997 as a commission from the American Composers
Forum to work with Gerry Girouard, a choreographer/dancer with a background in gymnastics. Stephen Rueff, a
lighting and video designer, was invited to collaborate with us early on in the process. Since we were living in
different cities, we developed ideas by exchanging video and audio tapes frequently through the mail. In two separate
week-long work sessions, we came together to test and realize our ideas. Since the choreography had to be rehearsed
without the VNS, I developed response strategies that allow the dancers to move freely without worrying about
hitting an exact location. These techniques include using larger sensing areas on stage and relying on overall speed
and video projections to create sound. Only two out of eleven sections require precise placement to trigger specific
sounds. The work premiered in Minneapolis on December 12, 1997.
2. Movement Sensing and Analysis
The VNS is a SCSI device running on an Apple Macintosh computer, with two video camera inputs (Rokeby,
1997). The view from each video camera can be thought of as the active sensing area used for motion analysis.
However, the VNS doesn't actually measure motion; it measures changes in light. By comparing the amount of light
in one video frame to previous frames, it determines what part of the video image has changed, and by how much.
The VNS captures each video frame as a black and white image with a gray-scale resolution of 6 bits (64 shades of
gray) and an image resolution of 128 (horizontal) by 240 (vertical) pixels. The user specifies a grid size onto which
the video image will be mapped, with each square of the grid defined by a group of pixels. This group becomes an
active “region” corresponding to a location on stage. Any change within a region is reported to Max by subtracting
the total gray-scale value for all pixels in one region from the same region in the previous frame (or frames). If there
is no movement in a particular part of the stage, then Max reports a zero in the corresponding region (no change in
light was detected). Faster movement across regions will yield higher values, since the light values will change more
dramatically from frame to frame. In effect, each region acts similarly to a continuous controller, with numbers
streaming into Max representing the activity in each area. The range of these values and the rate at which they are
reported to Max are variable. Up to 240 regions may be active at a time, although in practice, fewer regions are often
more practical and effective. A region may also be irregularly shaped using on-screen drawing tools to draw a region
directly on the video image as it appears on the computer screen.
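The region measurement described above can be sketched in a few lines of Python. This is only an illustration of the idea as stated in the text (the total gray-scale value of a region compared with the same region in the previous frame), not Rokeby's implementation. The function name region_activity, the use of an absolute difference, and the representation of a frame as nested lists of 6-bit values are assumptions made for the example.

```python
from typing import List

Frame = List[List[int]]  # rows of gray-scale pixel values (0-63 for 6-bit video)


def region_activity(prev: Frame, curr: Frame,
                    grid_rows: int, grid_cols: int) -> List[List[int]]:
    """Return the change in total light for each cell of a rectangular grid.

    Each cell's value is |total(curr) - total(prev)| over the pixels in that
    cell. Taking the absolute value is an assumption of this sketch.
    """
    height, width = len(curr), len(curr[0])
    activity = [[0] * grid_cols for _ in range(grid_rows)]
    for r in range(grid_rows):
        y0 = r * height // grid_rows
        y1 = (r + 1) * height // grid_rows
        for c in range(grid_cols):
            x0 = c * width // grid_cols
            x1 = (c + 1) * width // grid_cols
            prev_total = sum(sum(prev[y][x0:x1]) for y in range(y0, y1))
            curr_total = sum(sum(curr[y][x0:x1]) for y in range(y0, y1))
            activity[r][c] = abs(curr_total - prev_total)
    return activity


# Hypothetical 8 x 8 test image mapped onto a 4 x 4 grid.
previous = [[0] * 8 for _ in range(8)]
current = [row[:] for row in previous]
current[1][1] = 40  # a change of light in the upper-left region
print(region_activity(previous, current, 4, 4))
```

A region with no change in light reports zero, and larger frame-to-frame changes report larger values, which is what lets each region behave like a continuous controller.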
The sheer number of possible configurations for the VNS makes it effective for a wide variety of projects and
lighting conditions. For the dance pieces, we tested various lenses, grid spaces, and camera placements, finally
establishing a basic setup for each work so that the dancers could become familiar with a single spatial environment.
The basic setup for Dark Around the Edges uses a single Panasonic black-and-white camera with a wide angle lens,
placed on the floor in the front of the stage, with a simple 4 x 4 grid, updating every 33 milliseconds. In Songs for
the Body Electric, a second camera, also with a 4 x 4 grid, is added to the right of the first to cover a wider area of the
stage. To help the dancers identify the borders of each region, we marked the floor and back wall with tape during the
development period. In performance, the tape is replaced by small, unobtrusive markers showing crucial locations on
stage. By becoming very familiar with the basic setups, the dancers could perform and improvise with some degree
of expertise.
The sixteen active grid areas (regions) per camera provide plenty of challenges and variety, particularly since the size
of an active area changes as the performer moves upstage and downstage. Furthermore, the function and response of
one or more grid points can change through software at any time. Each region has the ability to trigger a particular
MIDI note or a series of notes, report continuous changes in movement, reconfigure the software, and start or stop a
musical process. Two or more regions can be added to form a single, larger active area, or to define an area that the
system will ignore. (Although the VNS software is capable of combining several grid spaces into single regions, and
blocking out or "masking," various areas, it proved easier to automate and structure these changes via Max). In this
way, further variations of the initial 4 x 4 set-up are created, such as using all of the middle regions of the grid to
determine the overall tempo, while using the outer regions to start and stop specific processes. For some sections,
the space is divided into left/right, or high/low regions.
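As a rough illustration of combining and masking regions in software, the following Python sketch sums the activity of named groups of grid cells and skips any cells marked as ignored. It is not the Max patch used in the productions. The row-by-row numbering of the 4 x 4 grid from 1 to 16, the group names, and the function combine_regions are all assumptions made for the example.

```python
from typing import Dict, Iterable, List


def combine_regions(values: List[int],
                    groups: Dict[str, Iterable[int]],
                    ignored: Iterable[int] = ()) -> Dict[str, int]:
    """Sum activity over named groups of grid cells (numbered from 1).

    Cells listed in `ignored` are masked out and contribute nothing.
    """
    masked = set(ignored)
    return {
        name: sum(values[cell - 1] for cell in cells if cell not in masked)
        for name, cells in groups.items()
    }


# Assumed layout: cells 1-16 numbered row by row, so 6, 7, 10, 11 are the middle.
MIDDLE = [6, 7, 10, 11]
OUTER = [cell for cell in range(1, 17) if cell not in MIDDLE]

# Hypothetical activity values, one per cell.
activity = [0, 0, 0, 0, 0, 12, 30, 0, 0, 8, 0, 0, 0, 0, 0, 5]
print(combine_regions(activity, {"middle": MIDDLE, "outer": OUTER}))
```

In a setup like the one described, the "middle" total could drive tempo while cells in the "outer" group start and stop processes; changing the groups between sections reconfigures the stage without touching the camera.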
3. Software Design
Max-based software designed for the dance projects has three functions: to further analyze, interpret, and scale
movement data; to provide a user interface to facilitate composition, rehearsal and performance; and to generate,
process and mix sound. Raw values are received by Max via the VNS object, an object written by Rokeby to handle
system configurations. From there, changing values representing the grid are displayed graphically, then scaled,
mapped, or otherwise prepared to enter the system's response modules. The response modules are a collection of self-
contained programs that are designed to produce music based on location triggers and continuous motion. They range
from very simple data structures designed to map regions to specific MIDI note numbers, to highly complex
algorithms, such as those using overall movement to continuously change parameters representing tempo, register,
and timbre. Any number of these modules may be active at a given time using on/off toggle switches.
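The simplest kind of response module, a table mapping regions to MIDI note numbers, might look like the Python sketch below. The note assignments (36 to 51), the default velocity, and the function name note_on are hypothetical. Only the three-byte layout of a MIDI note-on message (status byte 0x90 plus channel, then note number, then velocity) comes from the MIDI standard.

```python
# Hypothetical lookup table: regions 1-16 mapped to MIDI notes 36-51.
REGION_TO_NOTE = {region: 35 + region for region in range(1, 17)}


def note_on(region: int, velocity: int = 100, channel: int = 0) -> bytes:
    """Build a raw MIDI note-on message for the note assigned to a region."""
    if not 0 <= channel <= 15:
        raise ValueError("MIDI channel must be 0-15")
    if not 0 <= velocity <= 127:
        raise ValueError("MIDI velocity must be 0-127")
    note = REGION_TO_NOTE[region]
    return bytes([0x90 | channel, note, velocity])


print(note_on(1).hex())
```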
All of the possible software settings are available through a master "preset" object, which is used to automatically
recall and store all system parameters for performances and rehearsals. Individual parameters can be changed in real
time through on-screen graphics, or automated in response to particular movements. Specialized editing modules are
all linked to the master preset object in the front panel. These include modules for mapping triggers, timbre
selection, mixing, and signal processing. During a performance, the presets are advanced from one cue to the next,
with each preset having its own behavior and response. This strategy was especially helpful in our initial working
sessions, since everything from the selection of sounds to the configuration of the VNS could be quickly stored and
recalled at a later time.
Two types of movement data influence all musical responses: continuous data (reported by the VNS), and triggers
(discrete values representing a particular location on stage). A simple threshold strategy sends a trigger whenever
activity within a region goes above a specified level. If a value goes above the threshold, a corresponding grid
number (1 to 16) is sent out. The region can be retriggered only after the values drop below the threshold level, by
either leaving the space or by less motion or stillness within the space. In this way, constant movement in one
region avoids sending repeated triggers, but a short pause, or leaving a region and returning, allows for the next
trigger. The threshold setting proves to be invaluable since it can be changed frequently to avoid errors, acting as a
sensitivity gate to optimize the response for each action. Set high, the system ignores basic lighting changes, while
forcing the dancers to make large gestures to trigger sound. Set very low, a single finger acts as a trigger. (At its
most sensitive setting, the VNS can register the blink of an eye at four feet away from the camera!) Two or more
threshold settings are used in conjunction to identify a range of fast, medium, or slow speeds within regions and
across the stage.
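The threshold strategy can be sketched in Python as a gate that fires once when a region's activity rises above the threshold and re-arms only after the activity falls back. This is a minimal sketch of the behavior described above, not the original Max code. The class RegionTrigger, the function classify_speed, and the choice to re-arm when a value is at or below the threshold are assumptions.

```python
from typing import List


class RegionTrigger:
    """Send a region number once each time its activity crosses the threshold."""

    def __init__(self, threshold: int, num_regions: int = 16) -> None:
        self.threshold = threshold
        self.armed = [True] * num_regions

    def update(self, values: List[int]) -> List[int]:
        """Return the grid numbers (starting at 1) triggered by this frame."""
        fired = []
        for index, value in enumerate(values):
            if value > self.threshold:
                if self.armed[index]:
                    fired.append(index + 1)
                    self.armed[index] = False  # no retrigger while still active
            else:
                self.armed[index] = True  # activity dropped: allow next trigger
        return fired


def classify_speed(value: int, slow_max: int, medium_max: int) -> str:
    """Use two thresholds to sort an activity value into three speed bands."""
    if value <= slow_max:
        return "slow"
    if value <= medium_max:
        return "medium"
    return "fast"


trigger = RegionTrigger(threshold=10, num_regions=4)
for frame in ([0, 20, 0, 0], [0, 25, 0, 0], [0, 5, 0, 0], [0, 30, 0, 0]):
    print(trigger.update(frame))
```

Raising the threshold value makes the gate ignore small changes such as gradual lighting shifts, while lowering it lets small gestures fire a trigger.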
Continuous data entering response modules is scaled to usable values, with a smoothing algorithm applied to reduce
some of the rapid fluctuations in the signal. Since the time frame for averaging is variable, the smoothing algorithm
is also effective in applying more slowly changing functions to parameters that do not always respond well to abrupt
changes, such as tempo and signal processing. These techniques are often used with the total motion parameter, a
value representing the sum of all active regions (all activity within the video field). This type of sensing proves to
be responsive and “error proof,” since the dancers are not required to be in a specific spot on stage, only to move in a
general way. Continuous data is also controlled via a gating function that allows one value to pass every X
milliseconds. This reduces the number of calculations required by the computer, while setting a tempo for the
algorithms that generate music.
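The handling of continuous data can be illustrated with three small pieces of Python: a total-motion sum, a smoothing stage, and a gate that passes one value every X milliseconds. The paper does not specify the smoothing method beyond averaging over a variable time frame, so the simple moving average here is an assumption, as are the names total_motion, MovingAverage, and TimeGate.

```python
from collections import deque
from typing import List, Optional


def total_motion(region_values: List[int]) -> int:
    """Sum of all active regions: overall activity in the video field."""
    return sum(region_values)


class MovingAverage:
    """Average the most recent `window` values to reduce rapid fluctuations."""

    def __init__(self, window: int) -> None:
        self.samples = deque(maxlen=window)

    def update(self, value: float) -> float:
        self.samples.append(value)
        return sum(self.samples) / len(self.samples)


class TimeGate:
    """Let one value pass every `interval_ms` milliseconds, dropping the rest."""

    def __init__(self, interval_ms: int) -> None:
        self.interval_ms = interval_ms
        self.last_pass_ms: Optional[int] = None

    def update(self, now_ms: int, value: float) -> Optional[float]:
        if self.last_pass_ms is None or now_ms - self.last_pass_ms >= self.interval_ms:
            self.last_pass_ms = now_ms
            return value
        return None


smoother = MovingAverage(window=3)
gate = TimeGate(interval_ms=100)
# Hypothetical frames arriving every 33 ms, four regions each.
for step, frame in enumerate([[3, 0, 0, 0], [6, 0, 0, 0], [9, 0, 0, 0], [12, 0, 0, 0]]):
    smoothed = smoother.update(total_motion(frame))
    print(step * 33, gate.update(step * 33, smoothed))
```

A longer averaging window gives the slowly changing control suited to tempo or signal processing, and the gate interval effectively sets the rate at which a generative algorithm receives new values.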
4. Sound Design and Response Mechanisms
The Kurzweil K2500 signal processing architecture offers a sophisticated platform for sound design, since samples
can be altered in real time via continuous controller messages. In several sections of the works, physical gestures
have an obvious and immediate impact on the quality of timbre by mapping VNS values to control DSP functions,
such as filters, low frequency oscillators, and distortion algorithms. Further processing under MIDI control is
available from the Ensoniq DP/4 processor. Several techniques may be combined, such as in “Gazelles” from Songs
for the Body Electric, where location triggers play chords at a maximum rate of two per second, while speed affects a
low-pass filter and pitch bend.
The various sections of each work are characterized by mood, gesture, sound, and response mechanisms. In Dark Around
the Edges we gravitated towards the high-impact sounds of percussion and machines. We associated these sounds with the
force and weight used to move a particular part of the body. Thus, a heavy jump onto one leg has the weight of a
large, low sound. Small head movements or flicks of the hand are lighter with less energy, which seem to fit smaller
sounds like a small wood block. Whipping, breaking, and tearing sounds appear in particularly strenuous moments.
Using speed to crossfade between several related percussive samples adds realism and variety to the resultant sound.
While this approach is too literal and repetitive to be sustained throughout an entire performance, it energizes the
performer to produce sound so closely aligned with his efforts.
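One way to picture a speed-controlled crossfade between related samples is to compute a gain for each sample layer from a normalized speed value. The paper does not say how the crossfade was realized on the synthesizer, so the linear curve, the 0 to 1 speed range, and the function name crossfade_gains in this Python sketch are assumptions for illustration only.

```python
from typing import List


def crossfade_gains(speed: float, num_layers: int) -> List[float]:
    """Return one gain (0.0-1.0) per sample layer for a normalized speed.

    Layers are ordered from the slowest sample to the fastest. At most two
    adjacent layers sound at once, and their gains sum to 1.0 (linear fade).
    """
    if num_layers == 1:
        return [1.0]
    position = min(max(speed, 0.0), 1.0) * (num_layers - 1)
    return [max(0.0, 1.0 - abs(position - layer)) for layer in range(num_layers)]


# Three hypothetical percussive samples: soft, medium, and hard strikes.
for speed in (0.0, 0.25, 0.5, 1.0):
    print(speed, crossfade_gains(speed, 3))
```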
Linking a single sound to a specific area on stage helps the audience enjoy the show by beginning with a clear and
unambiguous example of how the VNS works. The one-to-one approach is extended using prestored sequences and
algorithms to generate a different sound, mix, pitch, or a short musical phrase each time a particular region is
triggered. This is especially successful when using pitched sounds, since a single area can be retriggered to generate
melodies with continuous variation. Variation techniques employ movement data and constrained random values to
influence significant musical parameters. Even though the final production was carefully choreographed, the music
for each performance varied, as did many subtleties of the dancers’ interpretation.
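A constrained random process of the kind mentioned above can be sketched as a melody generator in which each retrigger of a region picks a new pitch close to the previous one. The scale, the step limit, and the function next_pitch are hypothetical, and the actual variation algorithms used in the works are not documented here.

```python
import random
from typing import List


def next_pitch(previous_pitch: int, scale: List[int],
               max_step: int, rng: random.Random) -> int:
    """Pick a scale pitch at most `max_step` scale degrees from the last one."""
    index = scale.index(previous_pitch)
    low = max(0, index - max_step)
    high = min(len(scale) - 1, index + max_step)
    return scale[rng.randint(low, high)]


# Hypothetical C major scale (MIDI note numbers) and starting pitch.
C_MAJOR = [60, 62, 64, 65, 67, 69, 71, 72]
rng = random.Random()
pitch = 64
melody = []
for _ in range(8):  # eight retriggers of the same region
    pitch = next_pitch(pitch, C_MAJOR, max_step=2, rng=rng)
    melody.append(pitch)
print(melody)
```

Movement data could set max_step, for example, so that faster motion permits wider leaps while the random choice keeps each performance slightly different.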
Certain processes, such as transposition, worked well on a continuous basis, with the music moving high and low
corresponding to the amount of movement on stage. Continuous changes in tempo were a bit chaotic, and became
more predictable when divided to create discrete selections of three or four related speeds. Other processes influenced
by overall movement included steadiness of the pulse, amount of dissonance, range of melodies, phrase length,
timbre, note density, selection of intervals, melodic contour, and articulation. A smoothing algorithm was often used
to average these values to avoid abrupt changes. Motion data averaged over a longer period of time would reflect
general types of movement (fast/slow) within a section. This produced interesting delayed processes showing the
accumulative results of movement rather than an immediate response.
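The difference between a continuous mapping (transposition) and a stepped one (tempo) can be shown with two small Python functions. The input range, the tempo values, and the names scale_value and select_tempo are assumptions for this sketch. The paper states only that tempo was divided into three or four related speeds.

```python
from typing import List


def scale_value(value: float, in_max: float,
                out_min: float, out_max: float) -> float:
    """Continuously map an activity value (0..in_max) onto an output range."""
    clamped = min(max(value, 0), in_max)
    return out_min + (out_max - out_min) * clamped / in_max


def select_tempo(value: float, in_max: float, tempos: List[int]) -> int:
    """Divide the input range into equal bands and return one tempo per band."""
    clamped = min(max(value, 0), in_max)
    index = min(int(clamped * len(tempos) / in_max), len(tempos) - 1)
    return tempos[index]


# Hypothetical settings: activity from 0 to 100, four related tempos in BPM.
TEMPOS = [60, 90, 120, 180]
for activity in (0, 30, 55, 100):
    semitones = scale_value(activity, 100, -12, 12)
    print(activity, semitones, select_tempo(activity, 100, TEMPOS))
```

With the stepped mapping, small fluctuations in movement leave the tempo unchanged unless they cross a band boundary, which is why it feels more predictable than a continuously varying tempo.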
5. Making Music with Lighting and Video
The fact that the VNS analyzes movement by reading light values poses some interesting analysis problems and
must be considered when designing sets, costumes, and lighting for dance. Background color, clothing color, lighting
changes, and the proximity of movement to a camera all influence the reading of movement. For example, all other
things being equal, if the VNS analyzed the movement of a person wearing white in front of a black curtain, it
would result in higher values than a person wearing black moving in front of the same curtain, since there would be
greater contrast in light per frame. Quick lighting changes can register as tremendous "activity" on stage. These
high values quickly reset back to zero as the light remains steady. As for proximity, moving closer to a camera
usually results in higher values, since the body occupies more of a region and even small movements cause large
changes in the image. Something as trivial as the color or pattern of a shirt could alter the overall reading of the
movements. It was important to keep these elements the same from one rehearsal to the next, and to be able to
adjust the response software to new spaces and lighting conditions.
In Dark Around the Edges, we avoid lighting issues by using a fixed plot which stays the same throughout the
show. In Songs for the Body Electric, we collaborated with designer Stephen Rueff to create an elaborate series of
lighting and video cues to work specifically with the VNS. In general, threshold settings are set so that gradual
changes in light or color do not send out location triggers. In “Speaking Stick” however, the lighting board plays
music with a group of small spot lights using preprogrammed rhythms. Video projections are also designed to
trigger music along with the dancers. In “The Raft,” the dancers move at the edge of the stage, outside of the view of
the two sensing cameras, while their images (shot from above) are projected onto the back wall along with
silhouetted shadows for the VNS. In “Evolution,” the sensing area is divided into high and low regions. The dancers
slither on the ground in blue light, their speed controlling the distortion and filtering effects of a low rumbling
sound, while above their heads, a black and white projection of processed video triggers a high bowed sound.
6. Conclusion
These works explore the imaginative relationships between sound and movement, suggesting a new paradigm for
dance that links it more intimately than ever with music. The artists involved enjoyed a close collaborative process,
addressing some of the inherent artistic and technical challenges posed by this new technology. The VNS, coupled
with Max software, proves to be a dependable system in which ideas can be rapidly shaped and carefully refined.
7. References
Rokeby, David. Personal website, September 8, 1997.
Winkler, Todd. "Making Motion Musical: Gesture Mapping Strategies for Interactive Computer Music." In
Proceedings of the 1995 International Computer Music Conference. San Francisco, CA: Computer Music
Association, 1995.