PESI: Extending Mobile Music
Instruments with Social Interaction
Abstract
This paper presents the research project The Notion of
Participative and Enacting Sonic Interaction (PESI).
PESI aims to extend the engagement of performers in
collective music practices using embodied approaches
in physical and social interaction. In its design, the
mobile phone functions as a tangible and expressive
musical instrument, together with an extended system.
In this paper, we contextualize the project with
associated themes – physical, spatial and social
interaction – and with related works. We also present
the modular structure of the project, evaluation
methods, initial conclusions and paths for future
developments.
Keywords
Sound, music, collaborative, mobile, physical
interaction, social interaction, spatial interaction, sonic
interaction
ACM Classification Keywords
H.5.5 Information Interfaces and Presentation: Sound
and Music Computing—Systems; H.5.3 Information
Interfaces and Presentation: Group and Organization
Interfaces – Collaborative computing
Copyright is held by the author/owner(s).
TEI 2013, February 10-13, 2013, Barcelona, Spain
ACM
Nuno N. Correia
Department of Media
Aalto University, School of Arts,
Design and Architecture
FI-00076 AALTO, Finland
nuno.correia@aalto.fi
Koray Tahiroğlu
Department of Media
Aalto University, School of Arts,
Design and Architecture
FI-00076 AALTO, Finland
koray.tahiroglu@aalto.fi
Miguel Espada
Universidad Complutense de
Madrid
Departamento de Sistemas
Informáticos y Computación.
Plaza de las Ciencias, 28040,
Madrid, Spain
mvaleroe@pdi.ucm.es
Introduction
Collective music making relies on many interaction cues
that do not directly affect the sound-producing actions
of the individual performer. For instance,
guitarists on stage might come close to each other to
attract the attention of the audience and underline
specific passages of the performance. Likewise,
performers can coordinate their movements with each
other, attentively taking the others' actions into
consideration. These collective movements can give rise
to further bodily interactions that originate social
actions in a collaborative music experience. A more
inclusive view of interaction, one that facilitates
physical and social actions through compelling uses of
computational technology, can support rich and expert
music performances. Such design approaches can
enhance the engagement of the performers in
collaborative music making, transforming it into a more
active and enriched experience.
In the case of our research project The Notion of
Participative and Enacting Sonic Interaction (PESI), we
focus on extending the engagement of performers in
group music practices using embodied approaches in
physical and social interaction, based on empirical
methods. In this context, we account for embodied
interaction as forms of interaction with digital
technologies that are embedded in physical and social
environments [3]. With the PESI project, we have been
developing design strategies where the mobile phone
functions as a tangible and expressive musical
instrument, in parallel with an extended system. The
extended system incorporates the mobile instruments
and motion tracking technology to create an
environment in which performers are not only free to
move and interact with each other but where their
social actions contribute to the sonic outcome.
PESI is an ongoing research project; its core novelty
lies in the field of sonic interaction, as it takes into
account embodiment and multi-user collaboration. In
this paper we introduce the current state and
contextualization of the project, the modular structure
of the extended system and a preliminary evaluation of
the initial design decisions, and we indicate future
developments of the system and its implementation.
Physical, Spatial and Social
The PESI research concerns device interaction (mobile
phone instruments) combined with social interactions
between performers in a free group-improvisation
context. Many design strategies and approaches to
physical, social and spatial interaction in collaborative
art contexts, specifically in music making, have been
proposed and introduced in multidisciplinary research
fields.
SignalPlay is a collective audio installation where users
can control physical objects with embedded computing
devices to collectively modify a soundscape [17]. It is
not designed purely as a musical instrument, but as an
experiment in interface exploration. Focusing on the
way people interact with the space and each other, the
authors emphasize the importance of temporal notions
in interaction, such as pace and rhythm. Some authors
choose to have less focus on the physical space and are
more concerned with collaboration and social
interaction. Ocarina, a musical tool for social
interaction developed for the iPhone, is socially aware:
it allows players to hear other Ocarina players
throughout the world [16]. Similarly, Daisyphone allows
multiple participants to collaborate in the creation of
music loops without being in the same physical space
[2]. Notions explored in Daisyphone, such as mutual
awareness, localization of sounds and spatial
arrangements, have informed the design of PESI.
TOUCHtr4ck is another project that has been presented
as a multi-touch tabletop tool for collaborative music
making [18]. Despite the emphasis on Graphical User
Interface, the authors of TOUCHtr4ck identify three
main interaction factors for democratic collaborative
music making that are important for PESI: awareness
of others’ actions; modifiability of others’ actions; and
the distinction of users’ musical expertise.
As PESI is designed for co-located collaboration, it
invites reflection on space. Spatiality is inherent to
tangible interfaces, since they are part of that space –
they are embedded in it – and users need to move in
space to manipulate them [4, 9]. Manipulation in an
interactive space can mean not only touching or moving
objects but also moving our bodies; when the whole
body can be used, this becomes full-body interaction.
Gestures happen in a space surrounding our bodies, in
communication with others. Spatial interaction has
performative aspects, resulting in performative actions
with others. Sharing a space brings awareness of the
others and their presence: we meet them face-to-face,
and feel their potential resistance to (or collaboration
with) our actions. Therefore, space and social action
are deeply interconnected [17].
The PESI project proposes an adaptive and flexible
interactive environment in which each performer
conducts him/herself autonomously. The intrinsic
characteristics of the system have many similarities
with other natural social systems [1, 10, 12]. Apart
from the absence of a centralized command, it also
relies on a continuous feedback loop to re-organize
relationships among players. In the PESI extended
system, the audio synthesis is determined by social
action parameters extracted from the flow of
movement. Swarm studies show that complex systems
can be studied by decomposing movement patterns
into minimal behavioral rules [1, 8]. Therefore, to
understand patterns in a social interaction
environment, such as in PESI, we can rely on the
analysis of the individual actions and the local relations
between them, instead of trying to decode the overall
flux of the system.
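As a sketch of this decomposition (the neighbourhood radius, rule set and 2D setting are our illustrative assumptions, not the PESI implementation), the local relations of one performer can be reduced to three drive vectors in the spirit of Reynolds' flocking rules [12]:

```python
# Sketch: reduce one agent's local relations to three drive vectors
# (attraction, alignment, avoidance), in the spirit of Reynolds'
# flocking rules [12]. Radius, weights and 2D positions are
# illustrative assumptions, not the PESI implementation.

def drives(positions, velocities, i, radius=2.0):
    """Return the three drive vectors for agent i, computed only
    from neighbours within `radius` (i.e. from local relations)."""
    px, py = positions[i]
    near = [j for j in range(len(positions)) if j != i
            and ((positions[j][0] - px) ** 2
                 + (positions[j][1] - py) ** 2) ** 0.5 <= radius]
    if not near:
        return {"attraction": (0.0, 0.0), "alignment": (0.0, 0.0),
                "avoidance": (0.0, 0.0)}
    n = len(near)
    # Attraction: vector toward the neighbours' centre of mass.
    att = (sum(positions[j][0] for j in near) / n - px,
           sum(positions[j][1] for j in near) / n - py)
    # Alignment: neighbours' mean velocity relative to our own.
    ali = (sum(velocities[j][0] for j in near) / n - velocities[i][0],
           sum(velocities[j][1] for j in near) / n - velocities[i][1])
    # Avoidance: summed vector pointing away from each neighbour.
    avo = (sum(px - positions[j][0] for j in near),
           sum(py - positions[j][1] for j in near))
    return {"attraction": att, "alignment": ali, "avoidance": avo}
```

Analyzing each agent through such local rules, rather than the overall flux, is what makes the pattern analysis tractable.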
Project Description
The PESI research deals with the notion of
participation, embodiment and human-human
interaction. The main direction of our current research
is to understand more about bodily and social
interaction among participants, specifically in relation to
their location and coordination within a group. PESI is
designed for small-scale collaboration with three
performers, since three allow for more complex social
interaction than simply two (figure 1). PESI is
composed of different networked elements: the mobile
phones (three iPhones) running specially developed
software (providing sensory input and audio-haptic
output) and a motion capture "extended system" (two
Kinect sensor bars and two computers) (figure 2).
Both elements rely on Pure Data1 for audio synthesis
and manipulation. Sound output is divided into personal
speakers attached to each performer, and multichannel
1 http://puredata.info/
surround speakers. The mobile instrument connected to
the personal speakers is designed to be a “self-contained
and autonomous sound-producing object” [15].
figure 1. Three performers using PESI
The Mobile Instruments
The mobile instruments are iPhones running custom
software built with Objective-C2 and Libpd3, the
embeddable version of the Pure Data environment. The
software receives input from the accelerometer and
gyroscope, as well as touch interaction. Players can
choose one out of three different sound modules, each
2 http://developer.apple.com/library/mac/#referencelibrary/
GettingStarted/Learning_Objective-C_A_Primer/_index.html
3 http://libpd.cc
with its own sonic character and interaction-sound
mapping settings.
figure 2. A diagram with the different PESI elements
In order to translate actions into sound, the
accelerations and continuous pointing are measured
constantly along all axes, providing a picture of the
performer’s hand as it gestures and moves. The
accelerometer and gyroscope values for the X, Y and Z
axes arrive in Libpd as a continuous stream of floats. At
the same time, the information retrieved from the
accelerometer, gyroscope and touchscreen is sent to
the extended system via the Open Sound Control
(OSC)4 protocol. The sound produced by the mobile
phones is channeled to two portable speakers that the
players wear on their chests (figure 3).
4 http://opensoundcontrol.org/
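As an illustration of this transport layer, a minimal OSC message with float arguments can be encoded as follows (the `/pesi/accel` address and the sensor values are hypothetical; the actual PESI address space is not documented here):

```python
import struct

def osc_message(address, *floats):
    """Encode a minimal OSC message with float32 arguments.
    OSC strings are null-terminated and zero-padded to 4-byte
    boundaries; floats are sent as big-endian 32-bit values."""
    def pad(raw):
        raw += b"\x00"
        return raw + b"\x00" * (-len(raw) % 4)
    msg = pad(address.encode("ascii"))
    msg += pad(("," + "f" * len(floats)).encode("ascii"))
    for value in floats:
        msg += struct.pack(">f", value)
    return msg

# Hypothetical accelerometer packet; the real value ranges and
# address space used by the PESI software may differ.
packet = osc_message("/pesi/accel", 0.1, -0.5, 9.8)
```

In practice such packets are sent over UDP to the extended system, one per sensor update.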
figure 3. A performer using the PESI
mobile instrument
In addition to the sensor input data, the mobile
instrument provides haptic feedback, giving a tactile
response to certain control actions on the mobile phone.
Eyes-free interaction is the core design model of our
mobile instrument, allowing
participants to focus more on their interaction with
other participants and with the environment. The
objective is to increase mutual engagement and
decrease cognitive overload [2]. However, there are
some basic visual cues, such as the background color of
the screen indicating the choice of sound module and
the state of the instrument.
The Extended System
The PESI extended system is highly reconfigurable;
each individual action affects local relations between
players and modifies the sound parameters. The virtual
space is designed as a two-dimensional granular
synthesizer in which parameters vary in real time
depending on the input of the social analysis. Grains
are distributed and triggered according to the relation
among players. As in ant systems, in which
“pheromones” are used to communicate between
individuals, we use sound grains for establishing sound-
based interaction. The continuous reconfiguration leads
players to actively participate in the generation and
exploration of the creative possibilities of the system.
The system design is conditioned by the critical
requirements of the installation: the flexibility and
high dynamicity of the setting demand robust
tracking and an analyzer system that adapts quickly and
reliably to any change in the configuration.
The extended system consists of a multi-Kinect motion
tracking system that constantly computes the position
of the three players and some information about their
relations, such as relative distances and alignment
(figure 4).
figure 4. Motion tracking of three performers using Kinect
Several Kinect infrared cameras are needed for robust
tracking, avoiding the loss of players due to
occlusions. Each device is attached to one computer,
which performs the following tasks:
• Reads the raw point-cloud information and
transforms it into real world coordinates.
• Extracts the information of the users in the
scene, removing the background, floor and
static objects.
• Determines the “center of mass” of each user
by computing the centroid of their mesh.
• Sends the information through OSC to a central
server.
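The background-removal and centroid steps above can be sketched as follows (a toy version over plain coordinate lists; the real pipeline segments a depth-camera point cloud per user, which is far more involved):

```python
def user_centroid(points, background):
    """Toy version of the per-device pipeline: drop points that
    belong to the static background, then take the centroid
    ("center of mass") of the remaining (x, y, z) points. This
    is an illustration, not the OpenNI-based PESI tracker."""
    static = set(background)
    foreground = [p for p in points if p not in static]
    if not foreground:
        return None  # no user visible to this device
    n = len(foreground)
    return tuple(sum(p[axis] for p in foreground) / n
                 for axis in range(3))
```

The resulting centroid is what each device would forward over OSC to the central server.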
The server merges the data coming from the tracking
computers and computes the parameters that feed the
audio synthesizer. The tracking system is built using the
OpenNI5 library with Processing6 and openFrameworks7.
The server also receives the sensor input data from the
mobile phones, and uses that information to manipulate
the audio synthesis. The sound routed to the main
speakers is manipulated using the temporal and spatial
information between players.
As Fencott and Bryan-Kinns state, “designing to
support group musical interaction necessitates a careful
consideration of how audio should be presented” [6].
Therefore, to facilitate identification, each player’s
sound is panned to be situated as close to the source
as possible, reinforcing the sound produced by the
wearable speakers.
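One common way to situate each player's sound close to its source is an equal-power pan law over the player's tracked horizontal position; the sketch below illustrates the idea and is not necessarily the PESI implementation:

```python
import math

def equal_power_pan(x):
    """Map a normalised horizontal position x in [0, 1]
    (0 = hard left, 1 = hard right) to (left, right) gains.
    cos/sin of the same angle keep total power constant, so a
    moving player does not get louder or quieter overall."""
    theta = max(0.0, min(1.0, x)) * math.pi / 2
    return math.cos(theta), math.sin(theta)

left, right = equal_power_pan(0.5)  # a centred player: equal gains
```

Feeding the tracked x position of each player into such a law keeps the panned image aligned with the wearable speakers.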
Audio Synthesis
The audio output is created in the Pure Data environment
and uses a number of different synthesis techniques in
5 http://openni.org/
6 http://processing.org/
7 http://www.openframeworks.cc/
dynamic combination to provide a complex and flexible
array of sonic possibilities. The first sound module in
the mobile instrument is designed with a 4-point
adaptive mapping strategy on a two-dimensional
touchscreen interface [14]. Simply changing the
coordinates of these four points while applying the same
touch gesture produces a variety of outcomes in
waveshaping synthesis and in synthesis based on a
square-wave generator and pulse-width modulation. The
second sound module is based on granular synthesis,
controlling the size and duration of the grains; it is
a modification of the asynchronous granular synthesis
techniques introduced in [5, 13]. The third sound
module is a granular synthesizer that allows for time-
stretching and pitch-shifting, which can be classified
under synchronous synthesis techniques [5, 13].
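To make the grain parameters concrete, a single grain can be sketched as a sine burst shaped by a Hann window; grain size and pitch are the kinds of parameters the granular modules control (the frequency, duration and sample rate below are illustrative, and the actual Pure Data patches differ):

```python
import math

def grain(freq, dur, sr=44100):
    """One grain: a sine burst at `freq` Hz shaped by a Hann
    window, `dur` seconds long at sample rate `sr`. The window
    fades the burst in and out, avoiding clicks at the edges."""
    n = int(dur * sr)
    return [math.sin(2 * math.pi * freq * i / sr)
            * 0.5 * (1 - math.cos(2 * math.pi * i / (n - 1)))
            for i in range(n)]

g = grain(440.0, 0.01)  # a 10 ms grain at 440 Hz
```

Asynchronous granular synthesis scatters many such grains irregularly in time; synchronous techniques space them regularly, which is what enables time-stretching and pitch-shifting.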
The sound-action strategy in the PESI mobile instruments
focuses on tilt as the main action to change the
phone’s state, which determines the control features of
the sound modules. The sonic
characteristics of the mobile instruments have been
developed further based on the comments and results
we gained during our initial user-test [11].
The relative position of each player regarding each
other, and the speed of her/his movement to or from
the others, affects the overall sound. The reactivity to
distance and speed allows for social actions to emerge.
The extended system searches for certain drives (such
as attraction, alignment and avoidance [12]) and
manipulates the sound accordingly.
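The extraction of such relational parameters can be sketched as follows (a minimal example assuming tracked 2D positions at successive frames; the drive detection in PESI itself is more elaborate):

```python
import math

def social_controls(p_self, p_other, p_other_prev, dt):
    """Extract two relational parameters from tracked 2D
    positions: the distance to another player and the approach
    speed (positive while closing in, negative while moving
    apart). These are the kinds of values that could drive the
    sound manipulation; the mapping itself is not shown."""
    dist = math.dist(p_self, p_other)
    prev_dist = math.dist(p_self, p_other_prev)
    approach = (prev_dist - dist) / dt
    return dist, approach

# e.g. another player moved from (6, 8) to (3, 4) in one second:
d, a = social_controls((0, 0), (3, 4), (6, 8), 1.0)
```

A sustained positive approach speed between two players could then be read as an attraction drive, and a negative one as avoidance.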
Methods for Evaluation
Earlier, we conducted an initial evaluation to assess
whether our initial design ideas met the expectations
of the test participants [11]. The test involved 21
participants, divided into groups of three. Each
participant was given a mobile instrument. We tested
relational parameters that might provide an indication
of the group dynamics and social interaction while
playing together, particularly distance among players.
The test was conducted based on two scenarios. In the
first scenario, the sound of the three instruments was
not affected by the distance parameter. In the second
scenario, the sound was affected by distance between
players. A mix of evaluation methods was applied,
including quantitative survey analysis and qualitative
interview data analysis. The qualitative data analysis
showed that our approach for augmenting the social
dimension of music making enriched the playful
interaction aspect of the group activity.
We are in the process of evaluating further
developments of PESI. The evaluation will be done via
two experiments. The first experiment aims to observe
patterns in social interaction, and to test different audio
mappings to these interactions. The mobile instruments
will not be used. Visualization of spatial workspace
organization, video observation and a short group
interview will be used as evaluation measures. The
second experiment will be conducted towards the end
of 2012 with experienced musicians, and aims to test
the expressivity and collaborative aspects of the
project. The musicians will be given the mobile
instruments one week before the experiment, in order
to familiarize themselves with them. The evaluation will
use the different measures proposed by Fencott and
Bryan-Kinns: questionnaires, interaction log analysis,
visualization of spatial workspace organization, video
observation and group interviews [6].
There is a history of developing collaborative musical
tools within the computer music community, but there
is still a scarcity of research regarding the evaluation of
these systems [6]. Fencott and Bryan-Kinns propose a
series of measures to evaluate collaborative music
systems: questionnaires, interaction log analysis,
visualization of spatial workspace organization, video
observation and group interviews. Hattwick and
Wanderley propose a dimension space for evaluating a
particular configuration of a Digital Music Ensemble
(their term for a collective of digital music instrument
performers) [7]. This dimension space is based on
multiple axes proposed by these authors: texture
(relative to the individuality of each component);
equality (the roles of the performers); centralization
(the flow and importance of information and control);
physicality (localization within the same space or not,
communication between performers); synchrony (whether
collaboration is in real time, and whether there is
synchronization or sequencing between players); and
dependence (whether the players depend on each other to
produce sound, and whether they share control of common
elements). These
proposals have been influential for our research.
Conclusions and Future Work
We see potential in using social interaction to
augment embodied and embedded musical
interfaces. PESI proposes an extended system to track
players and augment their actions with a social layer:
the way they interact with other players in space will
affect the sonic output of their instruments. A multi-
device audio delivery approach is adopted. We will
present preliminary results from the two experiments
at the conference.
Regarding future work, we intend to continue
developing PESI based on the results of the evaluation
process. We will release the different components of
the PESI software as a toolkit: the extended system
and the iPhone software, together with tutorials and
architecture design. The software will be released as
open source.
Acknowledgements
This work is supported by the Academy of Finland
(project 137646).
References
[1] Bonabeau, E. et al. 1999. Swarm Intelligence:
From Natural to Artificial Systems. Oxford University
Press, Santa Fe Institute Studies in the Sciences of
Complexity, New York, NY.
[2] Bryan-Kinns, N. 2012. Mutual Engagement in
Social Music. LNICST. 78 (2012), 260–266.
[3] Dourish, P. 2001. Where the action is: the
foundations of embodied interaction. MIT Press.
[4] Eitan, Z. and Granot, R.Y. 2004. Musical
Parameters and Spatio-Kinetic Imagery. Proceedings
of the 8th International Conference on Music Perception
and Cognition (Evanston, IL, 2004), 57–63.
[5] Farnell, A. 2010. Designing Sound. MIT Press.
[6] Fencott, R. and Bryan-Kinns, N. 2012. Audio
delivery and territoriality in collaborative digital musical
interaction. BCS-HCI ’12 Proceedings of the 26th
Annual BCS Interaction Specialist Group Conference on
People and Computers (Swinton, Sep. 2012), 69–78.
[7] Hattwick, I. and Wanderley, M. 2012. A
Dimension Space for Evaluating Collaborative Musical
Performance Systems. Proceedings of NIME 2012 (Ann
Arbor, MI, 2012).
[8] Hoar, R. et al. 2002. Evolutionary Swarm
Traffic. Proceedings of the 2002 Congress on
Evolutionary Computation (CEC ’02), 1910–1915.
[9] Hornecker, E. 2005. Space and Place–setting
the stage for social interaction. Position paper
presented at ECSCW05 workshop “Settings for
Collaboration: the role of place”. (2005).
[10] Jacob, C.J. et al. 2007. SwarmArt: Interactive
Art from Swarm Intelligence. Leonardo. 40, 3 (Jun.
2007), 248–254.
[11] Pugliese, R. et al. 2012. A qualitative
evaluation of augmented human-human interaction in
mobile group improvisation. Proceedings of NIME 2012
(Ann Arbor, MI, 2012).
[12] Reynolds, C.W. 1987. Flocks, herds and
schools: A distributed behavioral model. ACM
SIGGRAPH Computer Graphics. 21, 4 (Aug. 1987), 25–
34.
[13] Roads, C. 2004. Microsound. MIT Press.
[14] Tahiroglu, K. 2011. An Exploration on Mobile
Interfaces with Adaptive Mapping Strategies in Pure
Data. Proceedings of the 4th International Pure Data
Convention (Weimar, 2011).
[15] Tanaka, A. 2009. Sensor-Based Musical
Instruments and Interactive Music. The Oxford
Handbook of Computer Music. R. Dean, ed. Oxford
University Press. 233–257.
[16] Wang, G. et al. 2009. Smule = Sonic Media:
An Intersection of the Mobile, Musical and Social.
Proceedings of ICMC 2009 (Montreal, 2009), 283–286.
[17] Williams, A. et al. 2005. From Interaction to
Participation: Configuring Space Through Embodied
Interaction. Proceedings of UbiComp 2005 (Tokyo,
2005), 287–304.
[18] Xambó, A. et al. 2011. TOUCHtr4ck:
democratic collaborative music. Proceedings of the fifth
international conference on Tangible, Embedded, and
Embodied Interaction (Kingston, 2011), 309–312.