Interactive Spaces: Models and
Algorithms for Reality-Based Music
Applications
Marcella Mandanici
CSC - Sound and Music
Computing Group
Department of Information
Engineering
University of Padova, ITALY
mandanici@dei.unipd.it
Permission to make digital or hard copies of part or all of this work for personal
or classroom use is granted without fee provided that copies are not made or
distributed for profit or commercial advantage and that copies bear this notice
and the full citation on the first page. Copyrights for third-party components
of this work must be honored. For all other uses, contact the Owner/Author.
Copyright is held by the owner/author(s).
ITS ’15, November 15-18, 2015, Funchal, Portugal
ACM 978-1-4503-3899-8/15/11. http://dx.doi.org/10.1145/2817721.2820986
Abstract
Reality-based interfaces have the property of linking the
user’s physical space with the computer digital content,
bringing in intuition, plasticity and expressiveness.
Moreover, applications designed upon motion and gesture
tracking technologies involve a lot of psychological
features, like space cognition and implicit knowledge. All
these elements are the background of three presented
music applications, employing the characteristics of three
different interactive spaces: a user centered three
dimensional space, a floor bi-dimensional camera space,
and a small sensor centered three dimensional space. The
basic idea is to deploy the application’s spatial properties
in order to convey some musical knowledge, allowing the
users to act inside the designed space and to learn
through it in an enactive way.
Author Keywords
Blended interaction; Sound augmented reality; Learning
environments.
ACM Classification Keywords
H.5.1 [Artificial, augmented, and virtual realities]; H.5.2
[User Interfaces]; H.5.5 [Sound and Music Computing]
Motivation and Research Approach
The widespread utilization of cameras, camera software
systems and motion tracking devices like Kinect
(https://www.microsoft.com/en-us/kinectforwindows/)
and the Leap Motion Sensor
(https://www.leapmotion.com/), has brought to the attention of a continuously growing audience of researchers,
designers and practitioners the great potentialities of
newborn interaction styles. Physical environments lying in the range of these sensors allow the users to move freely in everyday spaces, without anything to wear or to hold in their hands, and to produce audio and/or visual feedback from a computer system, connecting in this way the digital contents to their movements or actions.

Figure 1: The floor interface of the “Harmonic Walk” application while being tested by a high school student.

My research focuses on these kinds of spatial interaction, and
on how interfaces belonging to the real world can be
arranged to deploy the uniqueness of these interaction
styles. In particular, my design experiences concern music
expressive interaction, music learning environments (see
Fig.1) and music production (see Fig.2). Many musical
features, like harmony or melodic movements, have been
historically depicted by meaningful spatial representations,
e.g. Euler’s Tonnetz (see Fig.3; literally “web of
tones”, a spatial schema showing the triadic relationships
upon which tonal harmony is based), or the Gregorian
Chant Neumatic Notation and Chironomy, a gestural
system expressing melodic contour variations. The very
notion of musical instrument involves precise spatial rules
and element displacement, whose study has produced a
large body of research about new music production
interfaces in the field of Sound and Music Computing.
Another interesting example of music spatial interaction is
the conductor model, where expressive performance
information is transmitted to performers through
free-hands open-space gestures. This led me to consider
the possibility of employing physical two dimensional
surfaces or 3D spaces to project geometrical
representations of musical concepts, thus enabling users
to enter and to navigate conceptual maps representing
some musical knowledge. The experimental hypothesis
upon which my work is based is that, in this situation, implicit and tacit knowledge may emerge and drive the users to accomplish even very complex tasks, like melody harmonization, without delivering any previous information to them. Implicit knowledge refers to our brain’s ability
to acquire an abstract representation of any environmental
stimulus (like for instance an unknown language), learning
its rules in an unconscious way. Implicit learning is linked
to the idea of enactivism, that is, the ability to learn by
doing. These important cognitive and psychological
concepts are both involved in reality-based learning
environments, which can deploy such powerful tools to
convey information in a faster and more direct way. The
aim of my work is also to study the cognitive content of
musical knowledge and to try to expand the same learning
methodology towards other domains, like the STEM
disciplines (Science, Technology, Engineering and
Mathematics), which very often benefit from being
introduced through creativity and arts-based learning.
Research Outlook
In 2010, while attending the 7th Sound and Music Computing Conference and Summer School in Barcelona, I was introduced to the Stanza Logomotoria [18], a motion
tracking, camera-based application where children are
asked to match a story with sounds positioned on the
mapped surface (see Fig.5). Later, the application was presented in some experimental sessions with music didactics students in conservatories and elementary schools. Nevertheless, my research proper began in 2012, when I graduated in Electronic Music with a master’s thesis on Disembodied Voices, the first of the three case studies
described below. This experience disclosed to me the great potential of reality-based, full-body and free-hand interaction, especially when coupled with a well
established human-to-human communication model, like
the one of the music conductor. During my first year of
PhD at the Department of Information Engineering of the
University of Padova with the Sound and Music
Computing Group (2013), I examined in depth various
theories about the representation of music harmony spaces
and music image schemas. In 2014 I realized my first
camera space music application, the Harmonic Walk, the
second of the three case studies reported below. By the
end of 2015 I will conclude my PhD studies, after having
developed several applications for entertainment, learning
and creative and expressive music interaction. Presently,
together with other members of the CSC research group, I am working on a spatial collaborative game to enhance music listening (Good or Bad) and on a spatial display for an interactive image sonification installation.
Figure 2: A child playing a
Disklavier through gestural input
detected by a Leap Motion
Sensor
Figure 3: Euler’s Tonnetz, 1739.
Related Work
My work is based on two different kinds of background.
Firstly, the analysis of the origins ([4] and [3]) and of the state of the art of interactive spaces, like Google’s Interactive Spaces and Liquid Galaxy (http://www.interactive-spaces.org/), the Aarhus University Research Centre (http://www.interactivespaces.net/), the Blended Interactions Design Studio, Rochester (NY) (http://blendedinteractions.com/), and UCLA REMAP, University of California (http://openptrack.org/), together with blended interaction theory [5], also providing a short survey of interactive spaces definitions, typologies, available platforms and themes. I also pay particular
attention to existing learning environments like
WizeFloor (https://www.wizefloor.com/), Smallab (http://smallablearning.com/) and STEP (http://remap.ucla.edu/research/cultural-civic-computing/791-science-through-technology-enhanced-play-step), as well as
research centres, arts and culture installations and music
production environments. Secondly, all the theoretical
studies concerning space cognition ([11] and [17]), implicit knowledge ([14] and [16]), enactivism [2] and image schemas ([6] and [1]), which form an interconnected interdisciplinary field, very useful for understanding how reality-based applications work. In Fig.4 a tentative conceptual map of the relationships existing among reality-based applications, implicit knowledge and space cognition is proposed, outlining reciprocal interplay and dependencies.
Three Case Studies
In this Section three case studies of music applications are
presented. The three cases employ different motion tracking devices and convey musical concepts through geometrical interpretation and spatial representation. The projections also refer to different spatial models: the first employs a 3D spherical-polar coordinate system, the second 2D Cartesian coordinates on a flat floor surface, and the third a 3D space with x, y Cartesian coordinates plus a z coordinate for depth data.
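As a minimal sketch of how these coordinate systems relate, the conversion from a sensor’s Cartesian output to the spherical-polar form used by the first case study can be written as follows; the axis convention (y vertical, z pointing towards the user) is an assumption for illustration, not necessarily the one used by the actual devices:

```python
import math

def to_spherical(x, y, z):
    """Convert a Cartesian position (e.g. a tracked hand, in metres)
    to spherical-polar coordinates (r, theta, phi): radius, polar
    angle measured from the vertical axis, and azimuth."""
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(y / r) if r > 0 else 0.0  # 0 = straight up
    phi = math.atan2(z, x)                      # azimuth in the ground plane
    return r, theta, phi
```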
Disembodied Voices
[9] is an interactive environment designed for an
expressive, gesture-based musical performance. The
motion sensor Kinect, placed in front of the user, provides
the computer with the 3D polar coordinates of the two hands.

Figure 4: A conceptual map of the main relationships existing among reality-based environments (blue connections), space cognition (green connections) and implicit knowledge (brown connections).

The application is designed according to the
interaction model of the music conductor: the user,
through gestures, is able to run a score and to produce a
real-time expressive interpretation. The conductor moves
her/his arms and hands in the space around her/his torso
and in the direction of the performers. Movement analysis
([10] and [12]) as well as the teaching practice [15]
subdivide the role of the two hands; in general, the right
executes musical cues while the left is devoted to iconics,
metaphorics and dynamics. As can be seen from Fig.6,
the geometrical interpretation of the conductor’s
interaction space is a hemisphere with the center at the
level of the breastbone of the conductor and the diameter corresponding approximately to the span of the two outstretched arms.

Figure 5: The “Stanza Logomotoria” basic configuration, with computer, audio monitors and ceiling mounted camera.

Figure 6: The hemispherical regions for user hands interaction in “Disembodied Voices”. For the left hand there is an inner region and two outer regions, one in front and the other at the side of the user.

Following the conductor’s interaction model, the hemisphere is subdivided into two parts, one for the right hand and the other for the left. For the left hand three different regions are shaped, triggering various digital sound processing effects. (video available at
https://www.youtube.com/watch?v=oyf7GrMMrL8)
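A hedged sketch of how such a region subdivision could be implemented: the hand position is taken as an offset from the breastbone, and the region is decided by radius and azimuth. The radii, angle threshold and region names below are illustrative assumptions, not the values used in “Disembodied Voices”:

```python
import math

ARM_SPAN_RADIUS = 0.8   # metres; hypothetical hemisphere radius
INNER_RADIUS = 0.4      # metres; hypothetical inner-region radius

def classify_left_hand(dx, dy, dz):
    """Classify a left-hand offset (dx, dy, dz) from the breastbone
    into one of three regions: inner, outer front, or outer side."""
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    if r > ARM_SPAN_RADIUS:
        return "outside"                 # beyond the hemisphere
    if r <= INNER_RADIUS:
        return "inner"
    # Outer shell: split by azimuth into front and side regions.
    azimuth = math.degrees(math.atan2(dx, dz))   # 0 = straight ahead
    return "outer_front" if abs(azimuth) < 45 else "outer_side"
```

Each returned label would then be bound to a different digital sound processing effect.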
The Harmonic Walk
[8] is an interactive physical environment designed for
experiencing a novel spatial approach to musical creation.
In particular, the system allows the user to get in touch
with some fundamental tonal music features in a very
simple and readily available way. The application’s
interface consists of a camera placed on the ceiling which
can trace the presence of a user who walks on a flat
surface within the camera’s view. The Harmonic Walk,
through the body movement in space, can provide a live
experience of tonal melody structure, chord progressions,
melody accompaniment and improvisation. Enactive
knowledge and embodied cognition allow the user to build
an inner map of these musical features, which can be enacted by moving on the mapped surface with a simple step. Listeners interpret a tonal melody by grouping the perceived sequence of events according to a metrical and harmonic frame [13]. This produces a segmentation of the
composition into different harmonic regions which, with
the underlying harmonic structure, are the leading features
of a tonal composition. The temporal unfolding of the various musical units is led by the melody, whose metaphoric schema is expressed by the so-called
“source-path-goal” schema [6].

Figure 7: Visual tags of the straight and circular path of the “Harmonic Walk’s” interface.

Following this metaphor, and employing the simplest motion in space a human can do - the walk - it is possible to represent a tonal composition as a sequence of spatial blocks, where each step corresponds to the next musical unit (white crosses in Fig.7), while the harmonic space is represented by a circular ring sliced into six parts, containing the six roots of the tonality’s harmonic space (black crosses). (video available
at https://www.youtube.com/watch?v=OjwXfzq_
CkU&index=1&list=UU1E9xCq8TWqlzessRIzUGxw)
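Mapping a walker’s tracked floor position to one of the six slices of the harmonic ring reduces to an angle computation around the ring’s centre. This is a sketch under assumed conventions (sector 0 starting at the positive x axis, counter-clockwise numbering), not the application’s actual mapping:

```python
import math

def harmonic_sector(x, y, cx, cy, n_sectors=6):
    """Return the index (0..n_sectors-1) of the ring slice that
    contains the floor position (x, y), for a ring centred at
    (cx, cy). Each slice would hold one root of the harmonic space."""
    angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
    return int(angle / (2 * math.pi / n_sectors))
```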
Hand Composer
[7] is a gesture-driven composition system, based on the
analysis of the existing relationships between music generative models and musical composition, against the background of 20th-century music history. The system
framework is based on a number of interactive machines
performing various patterns of music composition and
producing a stream of MIDI data compatible with a Disklavier performance. Hand gestural input, captured by the Leap Motion sensor (see Fig.8), can control some parameters of the music composing machines, interactively changing their musical output. (video available at
https://www.youtube.com/watch?v=mdsn9_5Ig_A&list=
UU1E9xCq8TWqlzessRIzUGxw)
Figure 8: The 3D interaction space of the Hand Composer
application. x, y Cartesian coordinates are employed to map
the two-dimensional vertical plane, while the z coordinate maps the depth data.
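A plausible sketch of such a parameter mapping: the tracked hand position is clamped to an interaction box and normalised to 0–127 values, the natural range for the MIDI control stream sent to the Disklavier. The box bounds (in millimetres, roughly matching the Leap Motion’s field of view) are assumptions, not the system’s actual calibration:

```python
def hand_to_controls(x, y, z,
                     x_range=(-200.0, 200.0),
                     y_range=(50.0, 400.0),
                     z_range=(-150.0, 150.0)):
    """Normalise a hand position to three 0-127 values that the
    composing machines could read as control parameters."""
    def norm(v, lo, hi):
        v = min(max(v, lo), hi)                       # clamp to the box
        return int(round(127 * (v - lo) / (hi - lo)))
    return norm(x, *x_range), norm(y, *y_range), norm(z, *z_range)
```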
Thesis Scope and Expected Contributions
The experimental hypothesis upon which all my research
is based is that abstract knowledge like musical harmony
or tonal music composition structure can be conveyed to
the users through the enactive learning induced by
interactive space environments. Thus, my thesis will focus
on the relationship between space and cognition, and on
the ways knowledge can be spatially represented. As I
have worked with various kinds of interactive spaces
(bi-dimensional floors as well as three-dimensional
volumes), I will try to work out a comprehensive, unifying
framework for all of them and to highlight their respective
properties and differences. If my experimental hypothesis is verified, the design of reality-based learning environments could benefit from very powerful cognitive tools, which could be extended beyond music to other knowledge fields, like science, mathematics and technology.
Acknowledgements
I am grateful to my supervisor Sergio Canazza for his
precious support and help. I also want to express my
gratitude to Antonio Rodà, who followed step by step the development of my work, and to Davide Rocchesso for his advice and presence.
References
[1] Brower, C. A cognitive theory of musical meaning.
Journal of Music Theory (2000), 323–379.
[2] Bruner, J. S. The act of discovery. Harvard
educational review (1961).
[3] Engelbart, D. C. Augmenting human intellect: a
conceptual framework (1962). PACKER, Randall and
JORDAN, Ken. Multimedia. From Wagner to Virtual
Reality. New York: WW Norton & Company (2001),
64–90.
[4] Harrison, S., and Dourish, P. Re-place-ing space: the
roles of place and space in collaborative systems. In
Proc. ACM conference on Computer supported
cooperative work, ACM (1996), 67–76.
[5] Jetter, H.-C., Reiterer, H., and Geyer, F. Blended
interaction: understanding natural human–computer
interaction in post-wimp interactive spaces. Personal
and Ubiquitous Computing 18, 5 (2014), 1139–1158.
[6] Lakoff, G., and Johnson, M. Metaphors we live by.
University of Chicago press, 2008.
[7] Mandanici, M., and Canazza, S. The “hand
composer”: gesture-driven music composition
machines. In Proc. 13th Conf. on Intelligent
Autonomous Systems (2014), 553–560.
[8] Mandanici, M., Rodà, A., and Canazza, S. The
harmonic walk: an interactive educational
environment to discover musical chords. In Proc.
ICMC-SMC Conference (2014).
[9] Mandanici, M., and Sapir, S. Disembodied voices: a
kinect virtual choir conductor. In Proc. 9th Sound
and Music Computing Conference (2012), 271–276.
[10] Marrin, T., and Picard, R. The ‘conductor’s jacket’:
A device for recording expressive musical gestures. In
Proc. International Computer Music Conference
(1998), 215–219.
[11] Montello, D. R. International Encyclopedia of the
Social and Behavioral Sciences. Oxford, Pergamon
Press, 2001, ch. Spatial Cognition, 14771–14775.
[12] Murphy, D., Andersen, T. H., and Jensen, K.
Conducting audio files via computer vision. In
Gesture-based communication in human-computer
interaction. Springer, 2004, 529–540.
[13] Povel, D.-J., and Jansen, E. Harmonic factors in the
perception of tonal melodies. Music Perception 20, 1
(2002), 51–85.
[14] Reber, A. S. Implicit learning and tacit knowledge.
Journal of Experimental Psychology: General 118, 3
(1989), 219.
[15] Rudolf, M., and Stern, M. The grammar of
conducting: A comprehensive guide to baton
technique and interpretation. Schirmer Books, New
York, 1994.
[16] Tillmann, B., Bharucha, J. J., and Bigand, E.
Implicit learning of tonality: a self-organizing
approach. Psychological review 107, 4 (2000), 885.
[17] Tversky, B. Functional Significance of Visuospatial
Representations. Handbook of higher-level visuospatial thinking (2005), 1–34.
[18] Zanolla, S., Canazza, S., Rodà, A., Camurri, A., and
Volpe, G. Entertaining listening by means of the
Stanza Logo-Motoria: an Interactive Multimodal
Environment. Entertainment Computing 4 (2013), 213–220.