Composition as an Embodied Act: a Framework for the Gesture-based
Creation of Augmented Reality Action Scores
Giovanni Santini
Hong Kong Baptist University
info@giovannisantini.com
ABSTRACT
In a context where Augmented Reality (AR) is rapidly spreading as one of the most promising technologies, there is great potential for applications addressing musical practices. This paper presents the development of a framework for creating AR gesture-based scores in the context of experimental instrumental composition. The notation system is made possible by GesturAR, an Augmented Reality software application developed by the author: it allows one to draw trajectories of gestures directly on the real vibrating body. Those trajectories are visualized as lines moving in real-time at a predetermined speed. The user can also create an AR score (a sequence of trajectories) by arranging miniaturized trajectory representations on a timeline. The timeline is then processed and a set of events is created. This application paves the way to a new kind of notation: embodied interactive notation, characterized by a mimetic 4D representation of gesture, where the act of notation (performed by the composer during the compositional process) corresponds to the notated act (i.e., the action the interpreter is meant to produce during the performance).
1. INTRODUCTION AND BACKGROUND
GesturAR is an application that allows the user to create, store, recall and organize trajectories in 4D (space and time), created from performance gestures detected through motion-capture equipment. It has been developed for exploring a new concept of musical notation: embodied interactive notation. In such a form of notation, the act of notating (realized as a physical gesture in 3D space) corresponds to the notated act (the gesture that is meant to be performed). The transition from notation (realized by the composer) to interpretation (realized by the performer) is mediated through an AR system (see section 3). This project relies on a background mainly including gesture-based notation, live notation and 3D/AR/VR forms of notation.
Copyright: © 2020 Giovanni Santini et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
1.1 Extended techniques and gesture-based notation
Both as a cause and a consequence of increasing musical experimentation (especially since the 1950s), Common Music Notation 1 (CMN) has been pushed beyond its traditional forms of representation, depending on the aesthetic purposes of different composers. We can call graphic notation all those forms of notation that do not follow the traditional rhythm-pitch-loudness definition of CMN and that use graphic solutions not adopted in CMN. In particular, action scores [2] are scores where a prescriptive indication of the gesture to perform replaces the descriptive indication of the result to obtain. An example of an action score can be found in Figure 1.
Figure 1. Lachenmann’s Gran Torso, bars 95-97.
Gran Torso by Lachenmann (1971, Figure 1) presents graphic elements along with freely arranged derivations of CMN. In the example in Figure 1, it is possible to notice the use of the typical string clef: it represents the body of the string instrument, while lines drawn inside the staff indicate the bow position. More importantly, in the first bar of Violin I two shapes with an arrowhead appear. They indicate the form and direction of two different movements (to be performed together). In this score, as in numerous others that deal with some form of gesture-based notation, the performative act that generates the sound is a compositional resource and part of the musical form. At the same time, the possibility to compose the gesture (alongside the sound) allows composers to create a brand-new set of timbral resources that would have been unthinkable and unreachable within the frame of traditional instrumental techniques.

1 “Common music notation (CMN) is the standard music notation system originating in Europe in the early seventeenth century.” [1]
1.2 Real-time scores
The increasing processing power of computing devices has made it possible to include visual processing in musical scores. Real-time scores (typically scores visualised on a screen) make use of some form of animated notation. As they are essentially represented on screens, they can also be called screen scores and categorized as scrolling, permutative (elements of the score are moved, copied or cancelled in real-time), transformative (elements of the score are transformed in real-time) or generative scores (an algorithm generates the notation, or parts of it, in real-time) [3].
Such scores need specific pieces of software to be visualized. For example, the Decibel Scoreplayer is a tool developed specifically for real-time scores synchronized over a network [4, 5]. It can be used for scrolling scores, or for changing the transparency of layers of superimposed static images. For example, in the composition trash vortex (2016) [6] by L. Vickery, different visual structures (used as musical notation) are revealed by such changes in transparency.
Advanced use of graphics and animations can be found in more recent scores, sometimes more focused on an idea of dramaturgy than on notation itself, and oriented towards film-making and gamification. Genni (2018) by P. Turowski is the staging of a story whose characters are geometric figures moving in a 3D environment. The score has to be interpreted by the performers as a form of graphic notation.
1.3 3D, VR and AR scores
Technical developments during the last decade have allowed the use of the three spatial dimensions in real-time rendering, thus fostering the idea of 3D musical notation, as in [7]. In 64x4x4 by D. Kim-Boyle (Figure 2) for string quartet, for instance, the score is animated and nodes inside a 3D space are mapped to different pitches and playing techniques. For example, “colored nodes [. . . ] represent various natural harmonics. The nodes are connected by a series of thin lines the color of which denote the strings on which the harmonics are to be performed” [8].
Figure 2. Kim-Boyle’s 64x4x4 string quartet.
More recently, VR and AR applications have started to be developed. The concept of an immersive score is presented in [8]. The 3D score 5x3x3 (2016), for any three pitched instruments, is translated, at room scale, into an AR environment. The score becomes a virtual structure superimposed on the real world. It can be visualized by performers wearing HoloLens 1 headsets.
In [P.O.V.] (2017) by O. Escudero for saxophonist, VR
glasses, electronics and projected video, the performer sees,
in a VR environment, a 2D score and some short anima-
tions used as markers for some musical details, such as
repetitions. Projections require lights to be turned off; the
VR environment reproduces a score in the absence of phys-
ical light.
In Hidden Motive (2018) by A. Brandon, a graphic score is generated live by the composer and transmitted over Wi-Fi to a mobile device mounted on a VR/AR headset. The score is also mirrored to a projector.
LINEAR [9], shown in Figure 3, constitutes one of the first experiments making use of AR as a resource for live-generated notation and musical performance in the context of improvisation. The application allows one to draw persistent gestures in the air. Those gestures, represented as virtual strokes formed by numerous virtual bodies, are linked to specific sounds. In order to produce sound, the performer is required to draw and then interact with those virtual objects. Thus, virtual trajectories are both notation (as they indicate the gesture to perform) and control interface (as they can be used for producing sound).
Figure 3. Rehearsal using LINEAR.
Other forms of AR music notation have been explored in the field of music education. Piano learning is by far the most explored topic, with a considerable number of studies already available (e.g. [10–12]). Experiments also exist for guitar (e.g. [13, 14]), violin or viola [15, 16] and non-western instruments (e.g. [17, 18]). In a good number of studies on AR-based music education it is possible to find:
• the use of some form of indication of position in 3D space (where to perform an action);
• a notation of event timing that is not placed in space (e.g., on 2D paper or a screen) but delivered over time (i.e., events are notated when they are meant to happen and their duration is indicated through visual cues that last as long as the event they refer to).
The piano roll (often used in AR piano education) is a good example: AR colored blocks come towards the performer in correspondence with specific piano keys (indication of position). As long as a block is “in contact” with the respective key, the player has to keep that key lowered (indication of time over time). Notwithstanding the interest of some solutions, notation in music education is usually adopted for teaching simple compositions selected from the traditional repertoire (compositions are notated in CMN in their original form) to beginners or amateurs. Evaluations in the studies show increased precision when using AR, higher motivation and a lower barrier to entry for beginners.
All the experiences above, from graphic scores to AR notation, extend the resources and aims of notation far beyond CMN. It is possible to identify a process of expansion of notation from the two-dimensionality of paper towards the (interactive) space-time continuum. AR technology has the potential to further enhance these possibilities.
2. GESTURAR
GesturAR represents a first step towards the exploration of a new notational concept made possible by the recent developments of AR: the creation of 3D gesture-based action scores (or embodied interactive notation, see section 3). The concept behind it is that, by notating a movement quite exactly in 4D space and time (as a trajectory), new aesthetic and experimental possibilities might arise. Current notational solutions for extended techniques are, in numerous cases, complete as they are: in fact, the indeterminacy in the identification of the precise action to perform (almost intrinsic in prescriptive notation) leaves room for the interpreter to establish, within certain limits, their own relationship with the instrument and with the notation. However, the new and basically unexplored possibilities of AR for gesture-based notation present a promising perspective in which prescriptive notation reaches a high degree of predictability in terms of gesture and sound result, especially where complex actions, difficult to notate on paper, come into play.
GesturAR allows one to draw, store and play back trajectories of gestures (with the sound they produce) performed on any acoustic instrument having a vibrating surface. For example, metal sheets, gongs, toms, bass drums, piano strings, string instruments or the harp could be suitable. In fact, the sound of all of the mentioned instruments is modified by trajectories performed on their surfaces (or along and across the strings). On the contrary, woodwind or brass players could hardly make any use of GesturAR, as the sound is almost exclusively controlled through lip position, breath emission and fingerings. The application has been developed with musical performance on acoustic instruments in mind. In GesturAR, trajectories are recorded with the original speed and internal articulation of the gesture and maintain the exact temporal proportions when played back.
For example, let us consider a setup formed by a tam-tam and two spherical magnets (one per side of the tam-tam, held together across the instrument’s body by magnetic force, Figure 4). By moving the magnets on the surface, the timbre of the instrument changes according to the position of the magnets, and the perceived pitch(es) (if any) are shifted downwards or upwards. The system provides an indication of the position of those magnets over time (and therefore of the result, which is unique given a specific instrument and a specific position) directly on the instrument itself.
Figure 4. A magnet moved on a tam-tam. The second magnet is on the back of the instrument.
Another example (Figure 5) can be seen on the piano played with a rattle-singing magnet (similar effects can be achieved, in general, with metal objects). In that case, the position of the magnet determines the production of two pitches at any given time: the metal object divides the string into two distinct vibrating parts, each of them producing a different pitch. In standard notation, the indication of the precise gesture related to a precise pitch result would present several issues. In GesturAR, the trajectory and the position of the magnet at any given time can be delivered in real-time and in space.
Figure 5. The use of a rattle-singing magnet on piano
strings.
The visual representation of gesture shown directly on the vibrating body (such as strings or metal plates) could provide a new, intuitive framework for writing, rehearsing and performing some specific sets of extended techniques. From the performer’s point of view, it is possible to follow the trajectories in real-time on the instrument instead of learning, from a paper score, the precise gesture, its timing and its approximate timbral result. From the composer’s point of view, the notational process allows a direct and intuitive way of storing and retrieving information about the intended gesture and, at the same time, the intended sound result.
2.1 Technical framework
The framework is composed of:
• software developed in Unity3D/C#, used for motion capture, data processing (headset and tracker position and orientation) and graphics rendering, connected through the OSC protocol to:
• a Max/MSP patch handling audio processing and playing sound files;
• 1 HTC Vive Pro headset;
• 1 HTC Vive Tracker;
• 1 LeapMotion 2;
• 1 contact microphone.
2.2 Functions
2.2.1 Recording trajectories
The system makes use of LeapMotion for detecting the fingers’ position at each video frame. The detection is performed using the LeapMotion plugin for Unity. There is also a sound processing unit (in Max/MSP) used for detecting when the instrument is played (details explained in 2.2.4). The position of the instrument (a tam-tam, in this case) in space is detected by using a Vive Tracker. This way, the user does not need to manually set the position of the instrument after every reboot of the software.
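As a rough illustration of how the tracker anchoring might work, the following Unity/C# sketch keeps a root transform aligned with the Vive Tracker every frame; all trajectories and UI elements can then be parented to it. The class name, the offset fields and the way the tracker transform is referenced are assumptions made for this sketch and are not taken from GesturAR's code.

using UnityEngine;

// Keeps the virtual content aligned with the physical instrument by copying
// the pose of the Vive Tracker every frame (hypothetical sketch).
public class InstrumentAnchor : MonoBehaviour
{
    public Transform tracker;                      // transform driven by the SteamVR plugin
    public Vector3 positionOffset = Vector3.zero;  // tracker-to-instrument offset, calibrated once
    public Quaternion rotationOffset = Quaternion.identity;

    void LateUpdate()
    {
        // All trajectories and UI elements are children of this transform,
        // so they follow the instrument wherever it is placed.
        transform.SetPositionAndRotation(
            tracker.position + tracker.rotation * positionOffset,
            tracker.rotation * rotationOffset);
    }
}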
In GesturAR, a User Interface (UI) is created around the instrument. It can be used for triggering different functions: Start Recording, Stop Recording, Cancel, Save, Load, Write Score, Process Score. The interaction with virtual buttons is enabled by the hands’ position detection provided by the LeapMotion. The hands have virtual colliders attached to them, used for detecting collisions with virtual objects.
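A minimal sketch of how such a virtual button could be implemented in Unity/C# is given below: a trigger collider on the button reacts to the colliders attached to the hand model and fires an event. The "Hand" tag and the event wiring are assumptions for illustration only.

using UnityEngine;
using UnityEngine.Events;

// A virtual UI button: pressing it with a tracked hand fires a UnityEvent
// (e.g., wired in the Inspector to Start Recording or Save).
[RequireComponent(typeof(Collider))]  // the button's collider is set as a trigger
public class VirtualButton : MonoBehaviour
{
    public UnityEvent onPressed = new UnityEvent();

    void OnTriggerEnter(Collider other)
    {
        // The hand colliders are assumed to carry the tag "Hand".
        if (other.CompareTag("Hand"))
            onPressed.Invoke();
    }
}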
When the user’s hand collides with the Start Recording button, the system waits for the user to start playing (detected by an envelope follower implemented in Max/MSP, see 2.2.4). Then it starts recording the position of the left index fingertip at each frame (Figure 6) and creates a trajectory out of it. When the user stops playing (according to the envelope follower), the system stops recording the position.
2 Device for skeletal hand-tracking, providing position and orientation for each knuckle, the palm and the wrist.
Figure 6. When the recording is activated and the instrument is played, the software starts recording the position of the fingertip frame by frame.
If the user hits Save, the system stores the trajectory as a .json file for later use. The trajectory can be played back by reading consecutive positional data from the .json file in consecutive video frames. By interacting with the Cancel button, the user can erase the last recorded trajectory.
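The following Unity/C# sketch illustrates one possible implementation of this recording and saving step: fingertip positions are sampled every frame, time-stamped relative to the start of the recording, and serialized to a .json file. Class, field and method names are hypothetical and only reflect the behaviour described above.

using System.Collections.Generic;
using System.IO;
using UnityEngine;

[System.Serializable]
public class TrajectorySample
{
    public Vector3 position;  // fingertip position in the instrument's local space
    public float time;        // seconds since the start of the recording
}

[System.Serializable]
public class Trajectory
{
    public List<TrajectorySample> samples = new List<TrajectorySample>();
}

public class TrajectoryRecorder : MonoBehaviour
{
    public Transform fingertip;   // left index fingertip provided by the hand-tracking plugin
    public Transform instrument;  // anchored to the Vive Tracker on the instrument

    private Trajectory current;
    private float startTime;
    private bool recording;

    public void StartRecording()  // triggered by the OSC start message
    {
        current = new Trajectory();
        startTime = Time.time;
        recording = true;
    }

    public void StopRecording()   // triggered by the OSC stop message
    {
        recording = false;
    }

    void Update()
    {
        if (!recording) return;
        // Store the fingertip position relative to the instrument, so that the
        // trajectory stays attached to the vibrating body if it is moved.
        current.samples.Add(new TrajectorySample
        {
            position = instrument.InverseTransformPoint(fingertip.position),
            time = Time.time - startTime
        });
    }

    public void Save(string path)  // triggered by the Save button
    {
        File.WriteAllText(path, JsonUtility.ToJson(current));
    }
}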
2.2.2 Loading trajectories
Figure 7. The user can select different pre-stored trajectories by “pressing”, with the hand, the red block containing its miniaturized shape.
After the user presses the Load button, an additional part of the UI is rendered near the instrument, presenting all the trajectories already stored in the system. Each of them is represented as a miniaturised collection of spheres (one per sampling point) inscribed inside a red block. The user can select one trajectory by interacting with the corresponding block (Figure 7). The selected trajectory is then rendered on the instrument, at full scale (Figure 8). The original gesture can be played back as a line that moves from one sampling point to the next, preserving the original speed of the gesture (Figure 9).
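A possible playback routine is sketched below in Unity/C#: the stored samples are appended to a LineRenderer as soon as their original time stamps are reached, so that the line advances with the original speed and internal articulation of the gesture. The Trajectory type refers to the recording sketch above; all names are illustrative.

using UnityEngine;

[RequireComponent(typeof(LineRenderer))]
public class TrajectoryPlayer : MonoBehaviour
{
    public Transform instrument;    // the tracker-anchored instrument frame
    private Trajectory trajectory;  // loaded from the .json file
    private LineRenderer line;
    private float playStart;
    private bool playing;

    public void Play(Trajectory t)
    {
        trajectory = t;
        line = GetComponent<LineRenderer>();
        line.positionCount = 0;
        playStart = Time.time;
        playing = true;
    }

    void Update()
    {
        if (!playing) return;
        float elapsed = Time.time - playStart;

        // Append every sample whose original time stamp has been reached, so
        // that the rendered line grows at the gesture's original speed.
        int shown = line.positionCount;
        while (shown < trajectory.samples.Count &&
               trajectory.samples[shown].time <= elapsed)
        {
            Vector3 world = instrument.TransformPoint(trajectory.samples[shown].position);
            line.positionCount = shown + 1;
            line.SetPosition(shown, world);
            shown++;
        }

        if (shown >= trajectory.samples.Count) playing = false;
    }
}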
Figure 8. The selected trajectory is rendered over the real instrument as a collection of spheres, indicating the sampling points in space.
Figure 9. Trajectory playback.
2.2.3 Creating the score
By selecting Write Score, a timeline (a transparent white rectangle) is activated (Figure 10). The user can then select trajectories from a menu (still in miniaturized form inside red blocks). When a trajectory is selected, a copy of it is created inside the timeline. By using a grab gesture 3, the user can move trajectories inside the timeline. At the current stage of development, only one trajectory can be placed at a time. Inside the timeline, durations are proportionally represented. The length of a block is proportional to the duration of its gesture, and the space between different gestures is proportional to the duration of the silence (i.e., no gesture rendered on the instrument) between consecutive gestures. Once the timeline has been arranged, the user can press the Process button and the AR action score is created and played back.
3 An interaction gesture consisting of closing the fist and moving it. This gesture is typically used in AR/VR for moving virtual objects.
Figure 10. The user positioning trajectories on the timeline.
The score consists of a series of lines drawn on the instrument, representing the gestures in the given order on the timeline, with their original speed and with rests between gestures having a duration proportional to the space between consecutive red blocks.
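The proportional mapping from timeline to score could be implemented along the lines of the following Unity/C# sketch, where block positions and lengths along the timeline's local x axis are converted into start times of score events. The scale factor (seconds per timeline unit), the assumption that blocks are pivoted at their centre, and all names are illustrative only.

using System.Collections.Generic;
using System.Linq;
using UnityEngine;

public class TimelineBlock
{
    public Transform block;        // the red block placed on the timeline
    public Trajectory trajectory;  // the gesture it represents
}

public class ScoreEvent
{
    public Trajectory trajectory;
    public float startTime;        // seconds from the beginning of the score
}

public class TimelineProcessor
{
    public float secondsPerUnit = 10f;  // assumed mapping: 1 timeline unit = 10 s

    public List<ScoreEvent> Process(List<TimelineBlock> blocks)
    {
        var events = new List<ScoreEvent>();
        foreach (var b in blocks.OrderBy(x => x.block.localPosition.x))
        {
            // The left edge of a block marks the onset of its gesture; the empty
            // space between consecutive blocks becomes a proportional rest.
            float leftEdge = b.block.localPosition.x - b.block.localScale.x / 2f;
            events.Add(new ScoreEvent
            {
                trajectory = b.trajectory,
                startTime = leftEdge * secondsPerUnit
            });
        }
        return events;
    }
}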
2.2.4 Max/MSP
The Max/MSP patch has two functions:
• detecting when the instrument is being played, in order to trigger the recording inside the AR software;
• storing and recalling the recorded sound linked to each trajectory.
The first function is accomplished by using an envelope-following algorithm based on spectral magnitude values, averaged over 10 consecutive audio frames. When the values pass a minimal threshold, a start message is sent through OSC (Open Sound Control) to the AR software, which starts recording the trajectory. When the values fall back under the threshold, a stop message is sent to the AR software. The second function is accomplished by storing the sound produced by each trajectory inside a buffer. If the user decides to save the trajectory (inside the AR software), a save message is sent through OSC from the AR software to Max/MSP and the content of the buffer is saved to a sound file. When a trajectory in the AR software is selected and played back, a play message is sent from the AR software to Max/MSP and the file associated with the trajectory is played back.
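On the Unity side, the handling of these OSC messages could look like the sketch below. The message addresses and the OnOscMessage entry point stand in for whichever OSC plugin is actually used and are assumptions; only the start/stop/save/play message types reflect the behaviour described above.

using UnityEngine;

public class OscScoreBridge : MonoBehaviour
{
    public TrajectoryRecorder recorder;  // see the recording sketch in 2.2.1

    // Called by the (assumed) OSC receiver for every incoming message.
    public void OnOscMessage(string address)
    {
        switch (address)
        {
            case "/gesturar/start":  // envelope follower passed the threshold
                recorder.StartRecording();
                break;
            case "/gesturar/stop":   // envelope fell back under the threshold
                recorder.StopRecording();
                break;
        }
    }

    // Messages sent from Unity to Max/MSP when the user saves or plays back a
    // trajectory; SendOsc stands for the plugin's send call.
    public void NotifySave(int trajectoryId) { /* SendOsc("/gesturar/save", trajectoryId); */ }
    public void NotifyPlay(int trajectoryId) { /* SendOsc("/gesturar/play", trajectoryId); */ }
}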
3. EMBODIED INTERACTIVE NOTATION
GesturAR represents the first exploration into a new type of notation that could be called embodied interactive notation: the notation is created as a direct consequence of an embodied act (detected through sensors) and is a 4D representation in space and time of the original gesture, in the form of a trajectory or some other kind of spatial marking. The gesture that creates the notation and the notated act coincide, as the notational moment is also a performative one (the notation is generated as a consequence of a gesture that produces sound). Following from this description, an AR representation of gesture in space has a two-fold interpretation: from one point of view, it is, as said, a new type of musical notation, with specific possibilities and affordances, still largely unexplored. On the other hand, it has a very specific technical profile, as it allows a precise representation of the trajectory and timing of complex gestures, in a way that is not attainable with notation on paper.
This particular form of notation implies a peculiar relationship between composer and interpreter: the notation keeps a trace of the performative moment from which it was generated. In order to notate, the composer has to be a performer. In order to interpret, the performer has to redo what the composer did in the moment of creation.
Embodied interactive notation, as realised in GesturAR, also implies a peculiar approach to rhythm. Musical time is not notated in a proportional or symbolic way. The system is not meant to represent a precise rhythm subdivided into discrete duration values (as happens, for example, with quavers and semiquavers in CMN). In fact, while in the usual practice rhythm is notated in space (e.g., on paper or on a screen), GesturAR notates time over time: information about the starting moment and duration of events is provided in the moment in which the event is meant to happen and for as long as the event is meant to last. Such a relationship with time is quite intrinsic to AR forms of notation, as shown in numerous research papers on music education that make use of some real-time positional indication (e.g., the piano roll described in 1.3). The notation in GesturAR can portray the inner articulation of different velocities inside a gesture: typically, a complex gesture does not have an even speed for its whole duration but is characterised by an alternation of different velocities. This peculiar approach to musical time could be called continuous rhythm, as opposed to discrete rhythm (the one indicated by noteheads).
GesturAR also presents an embodied approach to the construction of musical form. The disposition of gestures on the timeline, as the result of a bodily physical motion, offers an interesting perspective, although, at the current stage of development, the solutions offered are still too limited (see the next section).
4. LIMITATIONS AND FUTURE WORK
The technical quality of motion capture and rendering constitutes one of the limitations of the current version of GesturAR. The precision allowed by the use of the LeapMotion is barely satisfactory. The device has been designed to detect bare hands, or hands holding a pen or an object of similar shape. When the finger is pressing against the instrument (the tam-tam, in this case), positional errors arise (trajectories tend to be subject to random deviations from the original gesture). Another limit in the use of the LeapMotion is the need to always keep the hands within the limited Field of View (FOV) of the sensor (150 x 135 degrees). This limitation might be solved through the use of motion-capture gloves (not using magnetometers) or similar equipment. The quality of rendering is limited by the low resolution of the front-facing cameras mounted on the Vive (420p). Further enhancements would include the use of a third-party front-facing stereo VR camera (such as Stereolabs' ZED Mini) or the adoption of a see-through device, such as Microsoft's HoloLens 2. However, in this case, the limited FOV (43 x 29 degrees) might make the device unsuitable for this specific application. Other see-through devices have a similar or smaller FOV.
Although the representation of trajectories can precisely reproduce the speed and position over time of the original gesture, the interpreter cannot follow a trajectory as soon as it appears, due to physiological reaction time. Instead, the performer will wait until a part of the trajectory has been shown before starting his/her own movement. Therefore, the notation will be followed with some degree of approximation. This fact is not necessarily a limit: as it is due to biological factors, it should be perceived as a characteristic rather than a shortcoming.
Drawbacks can be found in the compositional dimension of GesturAR. The system so far allows the construction of only very simple musical forms, with no possibility of overlapping or superimposing gestures. Long-range temporal organization is left to an approximate perception of the distance between blocks on the timeline, with no system for accurately structuring the occurrence of events. The current version of GesturAR does not allow the indication of dynamics, although the speed of the gesture itself might be considered, in some circumstances, as a dynamic mark. The notation can only work on instruments that have a surface to play on, such as tam-tams, drums or piano strings. Such a form of notation could not be used for instruments such as woodwinds or brass.
At the current stage, an evaluation experiment has yet to be realized. In order to assess the actual usefulness of the system, different parameters should be evaluated, with performers, composers and non-musicians as participants in the experiment(s). The main aspects to evaluate would be: the time effectiveness of the system (both for composing and for learning a piece), the accuracy of the performance (comparing the AR system with a conventionally written score) and the perceived usefulness.
5. CONCLUSIONS
This paper has presented GesturAR, a system for the real-time notation of gesture and embodied composition in AR. The concept of embodied interactive notation has also been presented.
GesturAR allows the composer/performer to record trajectories in 4D space and time through the use of motion capture and AR equipment. Such gestures are intended to be performed on the surface of an instrument (for example, a tam-tam, as described in this research) and are stored as collections of sampling points in space (3D Cartesian coordinates and time values). By using a virtual UI, each saved trajectory is stored in memory and can be recalled, selected and played back (as a line moving from one sampling point to the next) on the instrument's surface. Trajectories can be arranged on an AR timeline by using bare-hand gestures. Once the timeline is processed, the time of occurrence of events is calculated proportionally (the red blocks' length is proportional to the duration of the gesture and the silence is proportional to the distance between blocks).
The concept of embodied interactive notation has been proposed, defined as that form of notation that is created as a consequence of a gesture (detected through sensors) and that is a precise representation of that gesture in space and time, displayed as a trajectory or some other form of AR mark. Composer and performer experience the same performative dimension, although at different times and with different aims. In this context, the representation of rhythm shows a peculiar trait. In fact, this form of notation is not suitable for the indication of discrete time values (such as quavers or semiquavers) or of different time proportions (still conceptually akin to the subdivision of a longer duration value into shorter ones). Instead, it is the vehicle of a continuous representation of rhythm, where time proportions are represented over time and as different speeds inside a single gesture.
Limitations can be found in the robustness of the hand-tracking system based on the LeapMotion and in the compositional usability of the system. Future developments of the research will include the use of motion-capture gloves for achieving more reliable spatial sampling and capture data, the design of an evaluation experiment and improvements in functionality, such as the possibility of superimposing different gestures.
Supporting material
Demonstration of the use of GesturAR at:
https://www.giovannisantini.com/gesturar
6. REFERENCES
[1] C. Roads and P. Wieneke, “Grammars as Representa-
tions for Music,” Computer Music Journal, 2007.
[2] A. Heathcote, “Liberating sounds: philosophical perspectives on the music and writings of Helmut Lachenmann,” Ph.D. dissertation, Durham University, 2003.
[3] L. Vickery, “The evolution of notational innovations
from the mobile score to the screen score,” Organised
Sound, 2012.
[4] C. Hope and L. Vickery, “The DECIBEL Scoreplayer -
A Digital Tool for Reading Graphic Notation,” in Pro-
ceedings of the International Conference on Technolo-
gies for Music Notation and Representation, 2015.
[5] C. Hope, A. Wyatt, and L. Vickery, “The Decibel
scoreplayer - a digital tool for reading graphic nota-
tion,” pp. 59–70, 2015.
[6] L. Vickery, “Rhizomatic approaches to screen-based
music notation,” Proceedings of the International Con-
ference on New Interfaces for Musical Expression,
vol. 16, pp. 394–400, 2016.
[7] D. Kim-Boyle, “64x4x4 (2017) for string quartet,” 2017. [Online]. Available: http://www.davidkimboyle.net/64x4x4-2017.html
[8] D. Kim-Boyle and B. Carey, “3D scores on the
HoloLens,” in TENOR 2019 International Conference
on Technologies for Musical Notation and Representa-
tion, Melbourne, 2019.
[9] G. Santini, “LINEAR - Live-generated Interface and Notation Environment in Augmented Reality,” in TENOR 2018 International Conference on Technologies for Musical Notation and Representation, Montréal, 2018, pp. 33–42.
[10] K. Rogers, A. Röhlig, M. Weing, J. Gugenheimer, B. Könings, M. Klepsch, F. Schaub, E. Rukzio, T. Seufert, and M. Weber, “P.I.A.N.O.: Faster Piano Learning with Interactive Projection,” in Proceedings of the Ninth ACM International Conference on Interactive Tabletops and Surfaces (ITS 2014), 2014.
[11] X. Xiao, P. Aguilera, J. Williams, and H. Ishii, “Mir-
rorFugue iii: conjuring the recorded pianist,” Proceed-
ings of the International Conference on New Interfaces
for Musical Expression, 2013.
[12] X. Xiao, B. Tome, and H. Ishii, “Andante: Walking
Figures on the Piano Keyboard to Visualize Musical
Motion,” in Proceedings of the International Confer-
ence on New Interfaces for Musical Expression (NIME
2014), 2014, pp. 629–932.
[13] C. Kerdvibulvech and H. Saito, “Vision-Based Gui-
tarist Fingering Tracking Using a Bayesian Classifier
and Particle Filters,” in PSIVT’07 Proceedings of the
2nd Pacific Rim conference on Advances in image and
video technology, vol. 14, 2007, pp. 625–638.
[14] J. R. Keebler, T. J. Wiltshire, D. C. Smith, and S. M.
Fiore, “Picking up STEAM: Educational implications
for teaching with an augmented reality guitar learning
system,” in Lecture Notes in Computer Science (includ-
ing subseries Lecture Notes in Artificial Intelligence
and Lecture Notes in Bioinformatics), 2013.
[15] F. D. Sorbier, H. Shiino, and H. Saito, “Violin Peda-
gogy for Finger and Bow Placement using Augmented
Reality,” in Signal & Information Processing Associ-
ation Annual Summit and Conference (APSIPA ASC),
2012.
[16] S. Mohan, “Music Instruction in a virtual/augmented
reality environment using The Cave 2 and Microsoft
Hololens,” Unknown, 2016.
[17] Y. Zhang, S. Liu, L. Tao, C. Yu, Y. Shi, and Y. Xu,
“ChinAR: Facilitating Chinese Guqin learning through
interactive projected augmentation,” in ACM Interna-
tional Conference Proceeding Series, 2015.
[18] M. Doi and H. Miyashita, “Koto learning support
method considering articulations,” in Lecture Notes in
Computer Science (including subseries Lecture Notes
in Artificial Intelligence and Lecture Notes in Bioinfor-
matics), 2018.