Proceedings of the 17th Sound and Music Computing Conference, Torino, June 24th – 26th 2020
Composition as an Embodied Act: a Framework for the Gesture-based
Creation of Augmented Reality Action Scores
Giovanni Santini
Hong Kong Baptist University
Abstract
In a context where Augmented Reality (AR) is rapidly
spreading out as one of the most promising technologies,
there is a great potential for applications addressing mu-
sical practices. This paper presents the development of
a framework for creating AR gesture-based scores in the
context of experimental instrumental composition. The
notation system is made possible by GesturAR, an Aug-
mented Reality software developed by the author: it al-
lows one to draw trajectories of gestures directly on the
real vibrating body. Those trajectories are visualized as
lines moving in real-time with a predetermined speed. The
user can also create an AR score (a sequence of trajecto-
ries) by arranging miniaturized trajectories representations
on a timeline. The timeline is then processed and a set of
events is created. This application paves the way to a new
kind of notation: embodied interactive notation, charac-
terized by a mimetic 4D representation of gesture, where
the act of notation (performed by the composer during the
compositional process) corresponds to the notated act (i.e.,
the action the interpreter is meant to produce during the performance).

1. Introduction

GesturAR is an application that allows one to create, store, re-
call and organize trajectories in 4D (space and time) cre-
ated from performance gestures detected through motion
capture equipment. It has been developed for exploring
a new concept of musical notation: embodied interactive
notation. In such form of notation, the act of notating (re-
alized as a physical gesture in 3D space) corresponds to
the notated act (the gesture that is meant to be performed).
The transition from notation (realized by the composer) to
the interpretation (realized by the performer) is mediated
through an AR system (see section 3). This project relies
on a background mainly including gesture-based notation,
live notation and 3D/AR/VR forms of notation.
Copyright: © 2020 Giovanni Santini et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
1.1 Extended techniques and gesture-based notation
Both as a cause and a consequence of increasing musical
experimentation (especially from the 1950s), Common
Music Notation 1 (CMN) has been pushed beyond its
traditional forms of representation, according to the
aesthetic purposes of different composers. We can call graphic
notation all those forms of notation that do not follow the
traditional rhythm-pitch-loudness definition (as in CMN)
and use graphic solutions not adopted in CMN. In partic-
ular, action scores [2] are scores where a prescriptive in-
dication of the gesture to perform replaces the descriptive
indication of the result to obtain. An example of action
score can be found in Figure 1.
Figure 1. Lachenmann’s Gran Torso, bars 95-97.
Gran Torso by Lachenmann (1971, Figure 1), presents
graphic elements along with freely arranged derivations of
CMN (Common Music Notation). In the example in Fig-
ure 1, it is possible to notice the use of the typical string
clef: it represents the body of the string instruments; lines
drawn inside the staff indicate the bow position. More im-
portantly, in the first bar of Violin I appear two shapes with
an arrowhead. They indicate form and direction of two
different movements (to be performed together). In this
score, as in numerous others that deal with some form of
gesture-based notation, the performative act that generates
the sound is a compositional resource and part of the musi-
cal form. At the same time, the possibility to compose the
gesture (alongside the sound) allows composers to create
a brand-new set of timbral resources that were unthinkable
1“Common music notation (CMN) is the standard music notation sys-
tem originating in Europe in the early seventeenth century.” [1].
and unreachable inside the frame of traditional instrumen-
tal techniques.
1.2 Real-time scores
The increasing processing power of computing devices has
enabled the possibility to include visual processing inside
musical scores. Real-time scores (typically scores visu-
alised on a screen) make use of some forms of animated
notation. As they are essentially represented on screens,
they can also be called screen scores and categorized as
scrolling, permutative (elements of the score are moved,
copied or cancelled in real-time), transformative (elements
of the score are transformed in real-time) or generative
scores (an algorithm generates in real-time the notation or
parts of it) [3].
Such scores require specific software to be visualized. For
example, the Decibel Scoreplayer is a tool developed
specifically for real-time scores synchronized over a
network [4, 5]. It can be used for scrolling scores, or for
changing the transparency of layers of superimposed static
images. For example, in the composition trash vortex
(2016) [6] by L. Vickery, different visual structures (used
as musical notation) are revealed by such changes in
transparency.
Advanced use of graphics and animations can be found
in more recent scores, sometimes more focused on some
idea of dramaturgy than on notation itself, projected to-
wards film-making and gamification. Genni (2018) by P.
Turowski is the staging of a story whose characters are
geometric figures moving in a 3D environment. The score
has to be interpreted by the performers as a form of graphic
notation.
1.3 3D, VR and AR scores
Technical developments during the last decade allowed the
utilization of the three spatial dimensions in real-time ren-
dering, thus fostering the idea of 3D musical notation, as
in [7]. In 64x4x4 by D. Kim-Boyle (Figure 2) for string
quartet, for instance, the score is animated and nodes in-
side a 3D space are mapped to different pitches and playing
techniques. For example, “colored nodes [. . . ] represent
various natural harmonics. The nodes are connected by a
series of thin lines the color of which denote the strings on
which the harmonics are to be performed” [8].
Figure 2. Kim-Boyle’s 64x4x4 string quartet.
More recently, VR and AR applications started being de-
veloped. The concept of immersive score is presented in [8].
The 3D score 5x3x3 (2016), for any three pitched instru-
ments, is translated, in room-scale size, into an AR envi-
ronment. The score becomes a virtual structure superim-
posed on the real world. It can be visualized by performers
wearing Hololens 1.
In [P.O.V.] (2017) by O. Escudero for saxophonist, VR
glasses, electronics and projected video, the performer sees,
in a VR environment, a 2D score and some short anima-
tions used as markers for some musical details, such as
repetitions. Projections require lights to be turned off; the
VR environment reproduces a score in the absence of phys-
ical light.
In Hidden Motive (2018) by A. Brandon, a graphic score
is generated live by the composer and transmitted to a mo-
bile device mounted on a VR/AR headset through wi-fi.
The score is also mirrored to a projector.
LINEAR [9], in Figure 3, constitutes one of the first exper-
iments making use of AR as a resource for live-generated
notation and musical performance in the context of impro-
visation. The application allows one to draw perdurable
gestures in the air. Those gestures, represented as virtual
strokes formed by numerous virtual bodies, are linked to
specific sounds. In order to produce some sound, the per-
former is required to draw and then interact with those vir-
tual objects. Thus, virtual trajectories are both notation (as
they indicate the gesture to perform) and control interface
(as they can be used for producing sound).
Figure 3. Rehearsal using LINEAR.
Other forms of AR music notation have been explored in
the field of music education. Piano learning is by far the
most investigated topic, already presenting a high number
of studies (e.g. [10–12]). Experiments also exist for guitar
(e.g. [13, 14]), violin or viola [15, 16] and non-western
instruments (e.g. [17, 18]). In a good number of studies
on AR-based music education it is possible to find:
- the use of some form of indication of position in 3D
space (where to perform an action);
- the notation of events' timing not in space (e.g., on 2D
paper or screen) but over time (i.e., events are notated
when they are meant to happen, and their duration is
indicated through visual cues that last as long as the
event they refer to).
The piano roll (often used in AR piano education) is a
good example: AR colored blocks come towards the performer
in correspondence of specific piano keys (indication
of position). As long as a block is “in contact” with
the respective key, the player has to keep that key low-
ered (indication of time over time). Notwithstanding the
interest of some solutions, notation in music education is
usually adopted for teaching simple compositions selected
from the traditional repertoire (compositions are notated
in CMN in their original form) for beginners or amateurs.
Evaluations in the studies show an increase in precision by
using AR, a higher motivation and a lowered barrier of en-
try for beginners.
All the experiences above, from graphic scores to AR no-
tation, extend resources and aims of the notation far be-
yond CMN. It is possible to identify a process of expansion
of notation from the two-dimensionality of paper
towards the (interactive) space-
time continuum. AR technology has the potential to further
enhance these possibilities.
2. GesturAR

GesturAR represents a first step towards the exploration
of a new notational concept made possible by the recent
developments of AR: the creation of 3D gesture-based ac-
tion scores (or embodied interactive notation, see section 3).
The concept behind it is that, by notating quite exactly a
movement in 4D space and time (as a trajectory), new aesthetic
and experimental possibilities might arise. Current
notational solutions for extended techniques are, in numer-
ous cases, complete as they are: in fact, indeterminacy in
the identification of the precise action to perform (almost
intrinsic in prescriptive notation) leaves room for the inter-
preter to establish, within certain limits, their own relation-
ship with the instrument and with the notation. However,
the new and basically unexplored possibilities of AR for
gesture-based notation present a promising perspective in
which prescriptive notation reaches a high degree of pre-
dictability in terms of gesture and sound result, especially
where complex actions, difficult to notate on paper, come
into play.
GesturAR allows one to draw, store and play back trajec-
tories of gestures (with the sound they produce) performed
on any acoustic instrument having a vibrating surface. For
example, metal sheets, gongs, toms, bass drums, piano
strings, strings or harp could be suitable instruments. In
fact, the sound of all of the mentioned instruments is mod-
ified by trajectories performed on their surfaces (or along
and across the strings). By contrast, woodwind or brass
players could hardly make any use of
GesturAR, as the sound is almost exclusively controlled
through lip position, blow emission and fingerings. The
application has been developed with musical performance
on acoustic instruments in mind. In GesturAR, trajecto-
ries are recorded with the original speed and internal artic-
ulation of the gesture and will maintain the exact temporal
proportions when played back.
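Since trajectories keep the original speed and internal articulation of the gesture when played back, the underlying mechanism can be sketched as time-stamped positional sampling with interpolation on playback. The following is an illustrative Python stand-in (class and method names are invented), not the paper's Unity/C# implementation:

```python
class Trajectory:
    """A gesture sampled as (timestamp, x, y, z) points, replayed with
    the original temporal proportions. Hypothetical sketch: the paper's
    system records one sample per video frame inside Unity."""

    def __init__(self):
        self.samples = []  # list of (t, x, y, z); t relative to first sample

    def add_sample(self, t, x, y, z):
        # Store time relative to the first sample so playback starts at 0.
        if not self.samples:
            self._t0 = t
        self.samples.append((t - self._t0, x, y, z))

    def duration(self):
        return self.samples[-1][0] if self.samples else 0.0

    def position_at(self, t):
        """Linearly interpolate the fingertip position at playback time t,
        preserving the gesture's internal alternation of velocities."""
        pts = self.samples
        if t <= pts[0][0]:
            return pts[0][1:]
        for (t0, *p0), (t1, *p1) in zip(pts, pts[1:]):
            if t0 <= t <= t1:
                a = (t - t0) / (t1 - t0)
                return tuple(c0 + a * (c1 - c0) for c0, c1 in zip(p0, p1))
        return pts[-1][1:]
```

Because samples are stored with their timestamps rather than as evenly spaced steps, a gesture that accelerates and decelerates is replayed with exactly those accelerations.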
For example, let’s consider a setup formed by a tam-tam
and two spherical magnets (one per side of the tam-tam,
held together across the instrument’s body by the magnetic
force, Figure 4). By moving the magnets on the surface,
the timbre of the instrument is changed according to the
position of the magnet and the perceived pitch(es) (if any)
will be shifted downwards or upwards. The system pro-
vides indication about the position of those magnets in time
(therefore about the result, which is unique given a specific
instrument and a specific position) directly on the instru-
ment itself.
Figure 4. A magnet moved on a tam-tam. The second
magnet is in the back of the instrument.
Another example (Figure 5) can be seen on the piano
played with a rattle-singing magnet (similar effects can
be achieved, in general, with metal objects). In that case,
the position of the magnet determines the production of
two pitches at any given time: the metal object divides the
string in two distinct vibrating parts, each of them produc-
ing a different pitch. In normal notation, the indication of
the precise gesture related to a precise pitch result would
present several issues. In GesturAR, the trajectory and the
position of the magnet at any given time can be delivered
in real-time and in space.
Figure 5. The use of a rattle-singing magnet on piano
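As a rough idealization of why the magnet's position uniquely determines the two pitches: a metal object placed at distance x from the nut divides a string of length L and open-string fundamental f into segments sounding approximately f·L/x and f·L/(L−x). A hedged sketch of this relation (uniform string, stiffness neglected; the function is an illustration, not taken from the paper):

```python
def segment_pitches(f_open, length, x):
    """Idealized pitches of the two vibrating segments created when a
    metal object divides a string of open-string fundamental f_open at
    distance x from the nut. Assumes uniform tension and neglects
    stiffness; hypothetical illustration of the physics, not the
    paper's code."""
    assert 0 < x < length
    # Frequency scales inversely with vibrating length: f = f_open * L / l.
    return f_open * length / x, f_open * length / (length - x)
```

For instance, dividing an A2 string (110 Hz) exactly at its midpoint yields two segments an octave above (220 Hz each), while a division at one quarter of the length produces roughly 440 Hz and 147 Hz.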
The visual representation of gesture shown directly on
the vibrating body (such as strings or metal plates) could
provide a new, intuitive framework for writing, rehearsing
and performing some specific sets of extended techniques.
From the performer’s point of view, it is possible to follow
the trajectories in real-time on the instrument instead of
learning, from a paper score, the precise gesture, its timing
and its approximate timbral result. From the composer’s
point of view, the notational process allows a direct and in-
tuitive way of storing and retrieving information about the
wanted gesture and, at the same time, the intended sound result.
2.1 Technical framework
The framework is composed of:
- software developed in Unity3D/C#, used for motion
capture, data processing (headset and tracker position
and orientation) and graphics rendering, connected
through the OSC protocol to:
- a Max/MSP patch handling audio processing and playing
sound files;
- 1 HTC Vive Pro headset;
- 1 HTC Vive Tracker;
- 1 LeapMotion 2;
- 1 contact microphone.
2.2 Functions
2.2.1 Recording trajectories
The system makes use of LeapMotion for detecting the fin-
gers’ position at each video frame. The detection is per-
formed using the LeapMotion plugin for Unity. There is
also a sound processing unit (in Max/MSP) used for de-
tecting when the instrument is played (details explained in
2.2.4). The position of the instrument (tam-tam, in this
case) in space is detected by using a Vive Tracker. This
way, the user does not need to manually set the position of
the instrument after every reboot of the software.
In GesturAR, a User Interface (UI) is created around the
instrument. It can be used for triggering different func-
tions: Start Recording, Stop Recording, Cancel, Save, Load,
Write Score, Process Score. The interaction with virtual
buttons is enabled by the hands’ position detection pro-
vided by the LeapMotion. The hands have virtual colliders
attached to them, used for detecting collisions with virtual buttons.
When the user’s hand collides with the Start Recording
button, the system waits for the user to start playing (by
using an envelope follower implemented in Max/MSP, see
2.2.4). Then it starts recording the position of the left index
fingertip at each frame (Figure 6) and creates a trajectory
out of it. When the user stops playing (according to the en-
velope follower), the system stops recording the position.
2Device for skeletal hand-tracking, providing position and orientation
for each knuckle, the palm and the wrist.
Figure 6. When the recording is activated and the instru-
ment is played, the software starts recording the position
of the fingertip frame by frame.
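The gated recording behaviour described above can be sketched as a small state machine: the Max/MSP envelope follower sends start/stop messages over OSC, and fingertip positions are appended once per video frame while recording is active. An illustrative Python stand-in (names are assumptions, not the paper's actual C# classes):

```python
class GatedRecorder:
    """Sketch of the recording loop: 'start'/'stop' arrive over OSC from
    the Max/MSP envelope follower; while active, the LeapMotion fingertip
    position is appended once per video frame."""

    def __init__(self):
        self.recording = False
        self.trajectory = []

    def on_osc(self, message):
        # Messages sent by the audio-side envelope follower.
        if message == "start":
            self.trajectory = []
            self.recording = True
        elif message == "stop":
            self.recording = False

    def on_frame(self, fingertip_xyz):
        # Called once per video frame with the tracked fingertip position.
        if self.recording:
            self.trajectory.append(fingertip_xyz)
```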
If the user hits Save, the system stores the trajectory as
a .json file for later use. The trajectory can be played
back by reading consecutive positional data from the .json
file in consecutive video frames. By interacting with the
Cancel button, the user can erase the last recorded trajectory.
2.2.2 Loading trajectories
Figure 7. The user can select different pre-stored trajecto-
ries by “pressing” with the hand the red square containing
its miniaturized shape.
After the user presses the Load button, an additional part
of the UI is rendered near the instrument, presenting all
the trajectories already stored in the system. Each of them
is represented as a miniaturised collection of spheres (one
per sampling point) inscribed inside a red block. The user
can select one trajectory by interacting with the relative
block (Figure 7). The selected trajectory is then rendered
on the instrument, in full scale (Figure 8). The original
gesture can be played back as a line that moves from one
sampling point to the next, preserving the original speed of
the gesture (Figure 9).
Proceedings of the 17th Sound and Music Computing Conference, Torino, June 24th – 26th 2020
Figure 8. The selected trajectory is rendered over the real
instrument as a collection of spheres, indicating the sam-
pling points in space.
Figure 9. Trajectory playback.
2.2.3 Creating the score
By selecting Write Score, a timeline (a transparent white
rectangle) is activated (Figure 10). The user will have the
possibility to select trajectories from a menu (still in minia-
turized form inside red blocks). When a trajectory is se-
lected, a copy of it is created inside the timeline. By using
a grab gesture 3, the user can move trajectories inside the
timeline. At the current stage of development, only one
trajectory at a time can be placed. Inside the timeline, dura-
tions are proportionally represented. The length of blocks
is proportional to the duration of each gesture and the space
between different gestures is proportional to the duration
of the silence (i.e., no gesture rendered on the instrument)
between consecutive gestures. Once the timeline has been
arranged, the user can press the Process button and the AR
action score will be created and played back.
3An interaction gesture consisting in closing the fist and moving it.
This gesture is typically used in AR/VR for moving virtual objects.
Figure 10. The user positioning trajectories on the timeline.
The score consists of a series of lines drawn on the in-
strument, representing the gestures in the given order on
the timeline, with their original speed and with rests be-
tween gestures having a duration proportional to the space
between consecutive red blocks.
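The Process step described above can be sketched as follows: since a block's length is proportional to its gesture's recorded duration, the first block fixes a seconds-per-unit scale, and the gaps between blocks then become proportional silences. An illustrative Python sketch under those assumptions (the data layout is invented, not the paper's):

```python
def schedule_events(blocks, gesture_durations):
    """Turn an arranged timeline into a sequence of (onset_s, traj_id)
    events. `blocks` is an ordered list of (start_unit, end_unit,
    traj_id) in timeline units; `gesture_durations` maps traj_id to its
    recorded duration in seconds. Because block length is proportional
    to gesture duration, the first block fixes the seconds-per-unit
    scale; gaps between blocks become proportional silences."""
    start0, end0, id0 = blocks[0]
    scale = gesture_durations[id0] / (end0 - start0)  # seconds per unit
    events, clock, prev_end = [], 0.0, start0
    for start, end, traj_id in blocks:
        clock += (start - prev_end) * scale   # silence before this gesture
        events.append((clock, traj_id))
        clock += gesture_durations[traj_id]   # gesture plays at its own speed
        prev_end = end
    return events
```

For example, two 4-second gestures drawn as blocks of length 2 with a one-unit gap between them yield onsets at 0 s and 6 s (4 s of gesture plus 2 s of proportional silence).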
2.2.4 Max/MSP
The Max/MSP patch has two functionalities:
- understanding when the instrument is played, for triggering
the recording inside the AR software;
- storing and recalling the recorded sound linked to each
trajectory.
The first function is accomplished by using an envelope
following algorithm based on spectral magnitude values,
averaged over 10 consecutive audio frames. When the values
exceed a minimal threshold, a start message is sent
through OSC (Open Sound Control) to the AR software,
that starts recording the trajectory. When the values fall
back under the threshold, a stop message is sent to the AR
software. The second function is accomplished by stor-
ing the sound produced by each trajectory inside a buffer.
If the user decides to save the trajectory (inside the AR
software) a save message is sent through OSC from the
AR software to Max/MSP and the content of the buffer is
saved to a sound file. When a trajectory in the AR soft-
ware is selected and played back, a play message is sent
from the AR software to Max/MSP and the file associated
to the trajectory is played back.
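The first function can be sketched as a threshold detector over a moving average of spectral magnitude, returning the message that would be sent over OSC. A Python stand-in for the Max/MSP patch (only the 10-frame averaging window comes from the paper; everything else is an assumption):

```python
from collections import deque

class EnvelopeGate:
    """Sketch of the envelope follower: spectral magnitude is averaged
    over the last `window` audio frames; crossing the threshold upward
    yields 'start', falling back below it yields 'stop'."""

    def __init__(self, threshold, window=10):
        self.threshold = threshold
        self.frames = deque(maxlen=window)  # rolling magnitude window
        self.active = False

    def feed(self, magnitude):
        """Feed one frame's spectral magnitude; return 'start', 'stop',
        or None (the message that would go out over OSC)."""
        self.frames.append(magnitude)
        avg = sum(self.frames) / len(self.frames)
        if not self.active and avg > self.threshold:
            self.active = True
            return "start"
        if self.active and avg <= self.threshold:
            self.active = False
            return "stop"
        return None
```

Averaging over several frames smooths out transient spikes, so a single noisy frame neither starts nor stops the recording.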
3. Embodied interactive notation

GesturAR represents the first exploration into a new type
of notation that could be called embodied interactive nota-
tion: the notation is created as a direct consequence of an
embodied act (detected through sensors) and is a 4D rep-
resentation in space and time of the original gesture, in the
form of a trajectory or some other kind of spatial marking.
The gesture that creates the notation and the notated act co-
incide, as the notational moment is also a performative one
(the notation is generated as a consequence of a gesture
that produces sound). Following from this description, an
AR representation of gesture in space has a two-fold inter-
pretation: from one point of view, it is, as said, a new type
of musical notation, with specific possibilities and affor-
dances, still largely unexplored. On the other side, it has a
very specific technical profile, as it allows a precise repre-
sentation of trajectory and timing of complex gestures, in
a way that is precluded to notation on paper.
This particular form of notation implies a peculiar rela-
tionship between composer and interpreter: the notation
keeps a trace of the performative moment from which it was
generated. For notating, the composer has to be a per-
former. For interpreting, the performer has to redo what
the composer did in the moment of creation.
Embodied interactive notation, as realised in GesturAR,
also implies a peculiar approach to rhythm. Musical time
is not notated in a proportional or symbolic way. The sys-
tem is not meant to represent a precise rhythm subdivided
in discrete values of duration (as it happens, for example,
with quavers and semiquavers in CMN). In fact, while in
the usual practice rhythm is notated in space (e.g., on paper
or on a screen), GesturAR notates time over time: informa-
tion about starting moment and duration of events is pro-
vided in the moment in which the event is meant to happen
and as long as the event is meant to last. Such a relation-
ship with time is quite intrinsic in AR forms of notation,
as shown in numerous research papers on music education
that make use of some real-time positional indication (e.g.,
the piano roll described in 1.3). The notation in GesturAR
can portray the inner articulation of different velocities in-
side a gesture: typically, a complex gesture does not have
an even speed for its whole duration but is characterised by an
alternation of different velocities. This peculiar approach
to musical time could be called continuous rhythm, as op-
posed to discrete rhythm (the one indicated by noteheads).
GesturAR also presents an embodied approach to the con-
struction of the musical form. The disposition of gestures
on the timeline, as a result of a bodily physical motion,
poses an interesting perspective, although, at the current
stage of development, the solutions offered are too limited
(see next section).
4. Limitations and future work

The technical quality of motion capture and rendering
constitutes one of the limitations of the current version of
GesturAR. The precision allowed by the LeapMotion is
barely satisfactory. The device has been designed to detect
bare hands, or hands holding a pen or an object of similar
shape. When the finger is pressing against
the instrument (tam-tam, in this case), positional errors
arise (trajectories tend to be subjected to random devia-
tions from the original gesture). Another limit in the use of
LeapMotion consists in the need to always keep the hands
in the limited Field of View (FOV) of the sensor (150 x
135 degrees). This limitation might be solved through the
use of motion capture gloves (not using magnetometers) or
similar equipment. The quality of rendering is limited by
the low resolution of the front facing cameras mounted on
the Vive (420p). Further enhancements would include the
use of a third-party front facing stereo VR camera (such
as Stereolab’s ZED Mini) or the adoption of a see-through
device, such as Microsoft’s HoloLens 2. However, in this
case, the limited FOV (43 x 29 degrees) might make the
device unsuitable for this specific application. Other see-
through devices have a similar or smaller FOV.
Although the representation of trajectories can precisely
resemble speed and position over time of the original ges-
ture, the interpreter cannot follow said trajectory as soon as
it appears, due to the physiological reaction time. Instead,
the performer would wait until a part of the trajectory has
been shown before starting his/her own movement. There-
fore, the notation would be followed with some degree of
approximation. This fact is not necessarily a limit, as it
is due to biological factors and should be perceived as a
characteristic rather than a shortcoming.
Drawbacks can be found in the compositional dimension
of GesturAR. The system so far allows the construction of
very simple musical forms with no possibility of overlap-
ping or superimposing gestures. The long-range temporal
organization is left to an approximate perception of dis-
tance between blocks on the timeline, with no system for
accurately structuring the occurrence of events. The cur-
rent version of GesturAR does not allow the indication of
dynamics, although the speed of the gesture itself might
be considered, in some circumstances, as a dynamic mark.
The notation can only work on instruments that have a sur-
face to play on, like tam-tam, drums, piano strings. Such a
form of notation could not be used for instruments such as
woodwinds or brass.
At the current stage, an evaluation experiment has yet to
be realized. In order to assess the actual usefulness of the
system, different parameters should be evaluated, with per-
formers, composers and non-musicians as participants to
the experiment(s). The main aspects to evaluate would be:
time effectiveness of the system (both for composing and
learning a piece), accuracy of the performance (comparing
an AR system with a traditionally written score) and perceived usefulness.
5. Conclusions

This paper is about GesturAR, a system for the real-time
notation of gesture and embodied composition in AR. The
concept of embodied interactive notation has also been presented.
GesturAR allows the composer/performer to record tra-
jectories in 4D space and time through the use of motion
capture and AR equipment. Such gestures are intended to
be performed on the surface of an instrument (for exam-
ple, a tam-tam, as described in this research) as a collec-
tion of sampling points in space (3D cartesian coordinates
and time values). By using a virtual UI, each saved trajec-
tory is stored in memory and can be recalled, selected and
played back (with a line moving from one sampling point
to the next) on the instrument’s surface. Trajectories can be
adjusted on an AR timeline by using bare-hand gestures.
Once the timeline is processed, the time of occurrence of
events is calculated proportionally (the red blocks’ length
is proportional to the duration of the gesture and the si-
lence is calculated proportionally to the distance between
consecutive blocks).
The concept of embodied interactive notation has been
proposed, defined as that form of notation that is created
as a consequence of a gesture (detected through sensors)
and that is a precise representation of that gesture in space
and time, displayed as a trajectory or some other form of
AR mark. Composer and performer experience the same
performative dimension, although at different times and
with different aims. In this context, the representation of
rhythm shows a peculiar trait. In fact, this form of nota-
tion is not suitable for the indication of discrete time val-
ues (such as quaver or semiquaver) or different time pro-
portions (still conceptually akin to the concept of subdivi-
sion of a longer duration value into shorter ones). Instead,
it is the vehicle of a continuous representation of rhythm,
where time proportions are represented over time and as
different speeds inside a single gesture.
Limitations can be found in the robustness of the hand-
tracking system making use of LeapMotion and in its com-
positional usability. Future developments of the research
will include the use of motion capture gloves for reach-
ing more reliable space sampling and capture data, the de-
sign of an evaluation experiment and improvements in the
functionalities, such as the possibility of superimposition
of different gestures.
Supporting material
Demonstration of the use of GesturAR at:
References
[1] C. Roads and P. Wieneke, “Grammars as Representa-
tions for Music,” Computer Music Journal, 2007.
[2] A. Heathcote, “Liberating sounds: philosophical per-
spectives on the music and writings of Helmut Lachen-
mann,” Ph.D. dissertation, Durham University, 2003.
[3] L. Vickery, “The evolution of notational innovations
from the mobile score to the screen score,” Organised
Sound, 2012.
[4] C. Hope and L. Vickery, “The DECIBEL Scoreplayer -
A Digital Tool for Reading Graphic Notation,” in Pro-
ceedings of the International Conference on Technolo-
gies for Music Notation and Representation, 2015.
[5] C. Hope, A. Wyatt, and L. Vickery, “The Decibel
scoreplayer - a digital tool for reading graphic nota-
tion,” pp. 59–70, 2015.
[6] L. Vickery, “Rhizomatic approaches to screen-based
music notation,” Proceedings of the International Con-
ference on New Interfaces for Musical Expression,
vol. 16, pp. 394–400, 2016.
[7] D. Kim-Boyle, “64x4x4 (2017) for string quartet,”
2017. [Online]. Available: http://www.davidkimboyle.
[8] D. Kim-Boyle and B. Carey, “3D scores on the
HoloLens,” in TENOR 2019 International Conference
on Technologies for Musical Notation and Representa-
tion, Melbourne, 2019.
[9] G. Santini, “LINEAR - Live-generated Interface
and Notation Environment in Augmented Reality,”
in TENOR 2018 International Conference on Tech-
nologies for Musical Notation and Representation,
Montréal, 2018, pp. 33–42.
[10] K. Rogers, A. Röhlig, M. Weing, J. Gugenheimer,
B. Könings, M. Klepsch, F. Schaub, E. Rukzio,
T. Seufert, and M. Weber, “P.I.A.N.O.: Faster Piano
Learning with Interactive Projection,” in Proceedings
of the Ninth ACM International Conference on Inter-
active Tabletops and Surfaces (ITS 2014), 2014.
[11] X. Xiao, P. Aguilera, J. Williams, and H. Ishii, “Mir-
rorFugue iii: conjuring the recorded pianist,” Proceed-
ings of the International Conference on New Interfaces
for Musical Expression, 2013.
[12] X. Xiao, B. Tome, and H. Ishii, “Andante: Walking
Figures on the Piano Keyboard to Visualize Musical
Motion,” in Proceedings of the International Confer-
ence on New Interfaces for Musical Expression (NIME
2014), 2014, pp. 629–632.
[13] C. Kerdvibulvech and H. Saito, “Vision-Based Gui-
tarist Fingering Tracking Using a Bayesian Classifier
and Particle Filters,” in PSIVT’07 Proceedings of the
2nd Pacific Rim conference on Advances in image and
video technology, vol. 14, 2007, pp. 625–638.
[14] J. R. Keebler, T. J. Wiltshire, D. C. Smith, and S. M.
Fiore, “Picking up STEAM: Educational implications
for teaching with an augmented reality guitar learning
system,” in Lecture Notes in Computer Science (includ-
ing subseries Lecture Notes in Artificial Intelligence
and Lecture Notes in Bioinformatics), 2013.
[15] F. D. Sorbier, H. Shiino, and H. Saito, “Violin Peda-
gogy for Finger and Bow Placement using Augmented
Reality,” in Signal & Information Processing Associ-
ation Annual Summit and Conference (APSIPA ASC).
[16] S. Mohan, “Music Instruction in a virtual/augmented
reality environment using The Cave 2 and Microsoft
Hololens,” Unknown, 2016.
[17] Y. Zhang, S. Liu, L. Tao, C. Yu, Y. Shi, and Y. Xu,
“ChinAR: Facilitating Chinese Guqin learning through
interactive projected augmentation,” in ACM Interna-
tional Conference Proceeding Series, 2015.
[18] M. Doi and H. Miyashita, “Koto learning support
method considering articulations,” in Lecture Notes in
Computer Science (including subseries Lecture Notes
in Artificial Intelligence and Lecture Notes in Bioinfor-
matics), 2018.