
Ocarina: Designing the iPhone's Magic Flute

Ocarina for the iPhone is one of the earliest mobile-musical (and social-musical) apps in this modern era of personal mobile computing. Created and released in 2008, it re-envisions the ancient flute-like clay instrument – the 4-hole “English-pendant” ocarina – and transforms it in the kiln of modern technology. It features physical interaction leveraging breath input, multi-touch, and accelerometer, as well as social interaction that allows users to listen in – anonymously and voyeuristically – to users playing this instrument around the world, taking advantage of the iPhone’s GPS location and persistent network connection. To date, the Smule Ocarina and its successor, Ocarina 2, have more than 10 million users worldwide. More than 5 years after its inception at the beginning of a new era of apps on powerful smartphones, this article chronicles Ocarina’s design – both physical and social – as well as user case studies, and reflects on what we have learned so far.
Ge Wang
Center for Computer Research in Music
and Acoustics (CCRMA)
Department of Music
Stanford University
660 Lomita Drive
Stanford, California 94305, USA
ge@ccrma.stanford.edu
Abstract: Ocarina, created in 2008 for the iPhone, is one of the first musical artifacts in the age of pervasive,
app-based mobile computing. It presents a flute-like physical interaction using microphone input, multi-touch, and
accelerometers—and a social dimension that allows users to listen in to each other around the world. This article
chronicles Smule’s Ocarina as a mobile musical experiment for the masses, examining in depth its design, aesthetics,
physical interaction, and social interaction, as well as documenting its inextricable relationship with the rise of mobile
computing as catalyzed by mobile devices such as the iPhone.
Ocarina for the iPhone was one of the earliest
mobile-musical (and social-musical) apps in
this modern era of personal mobile computing.
Created and released in 2008, it re-envisions an
ancient flute-like clay instrument—the four-hole
“English-pendant” ocarina—and transforms it in
the kiln of modern technology (see Figure 1). It
features physical interaction, making use of breath
input, multi-touch, and accelerometer, as well
as social interaction that allows users to listen
in to each other playing this instrument around
the world, anonymously (in a sort of musical
“voyeurism”), by taking advantage of the iPhone’s
Global Positioning System (GPS) location and its
persistent network connection (see Figure 2). To
date, the Smule Ocarina and its successor, Ocarina 2
(released in 2012), have more than ten million users
worldwide, and Ocarina was a first class inductee into
Apple’s App Store Hall of Fame. More than five
years after its inception and the beginning of a
new era of apps on powerful smartphones, we look
in depth at Ocarina’s design—both physical and
social—as well as user case studies, and reflect on
what we have learned so far.
Computer Music Journal, 38:2, pp. 8–21, Summer 2014. doi:10.1162/COMJ_a_00236. © 2014 Massachusetts Institute of Technology.
When the Apple App Store launched in 2008, one
year after the introduction of the first iPhone, few
could have predicted the transformative effect app-mediated
mobile computing would have on the
world, ushering in a new era of personal computing
and new waves of designers, developers, and even
entire companies. In 2012 alone, 715 million
new units of smartphones were sold worldwide.
Meanwhile, in Apple’s App Store, there are now
over one million distinct apps spanning dozens
of categories, including lifestyle, travel, games,
productivity, and music. In the humble, early days
of mobile apps, however, there were far fewer
(on the order of a few thousand) apps. Ocarina
was one of the very first musical apps. It was
designed to be an expressive musical instrument,
and represents perhaps the first mass-adopted,
social-mobile musical instrument.
Origins and Related Works
The ingredients in creating such an artifact can be
traced to interactive computer music software such
as the ChucK programming language (Wang 2008),
which runs in every instance of Ocarina, laptop
orchestras at Princeton University and Stanford
University (Trueman 2007; Wang et al. 2008, 2009a),
and the first mobile phone orchestra (Wang, Essl,
and Penttinen 2008, 2014; Oh et al. 2010), utilizing
research from 2003 until the present. These works
helped lead to the founding of the mobile-music
startup company Smule (Wang et al. 2009b; Wang
2014, 2015), which released its first apps in summer
2008 and, at the time of this writing (in 2013), has
reached over 100 million users.
More broadly, much of this was inspired and
informed by research on mobile music, which
was taking place in computer music and related
communities well before critical mass adoption of
an app-driven mobile device like the iPhone.
Reports on an emerging community of mobile
music and its potential can be traced back to
2004 and 2006 (Tanaka 2004; Gaye et al. 2006).
The first sound synthesis on mobile phones was
documented by projects such as PDa (Geiger 2003),
Pocket Gamelan (Schiemer and Havryliv 2006),
and Mobile STK (Essl and Rohs 2006). The last of
these was a port of Perry Cook and Gary Scavone’s
Synthesis Toolkit to the Symbian OS platform,
and was the first programmable framework for
parametric sound synthesis on mobile devices.
More recently, Georg Essl, the author, and Michael
Rohs outlined a number of developments and
challenges in considering mobile phones as musical
performance platforms (Essl, Wang, and Rohs 2008).
Figure 1. Ocarina for the iPhone. The user blows
into the microphone to articulate the sound,
multi-touch is used to control pitch, and
accelerometers control vibrato.
Figure 2. As counterpoint to the physical instrument,
Ocarina also presents a social interaction that
allows users to listen in, surreptitiously, to others
playing Ocarina around the world, taking advantage
of GPS location and cloud-based networking.
Researchers have explored various sensors on
mobile phones for physical interaction design. It
is important to note that, although Ocarina ex-
plored a number of new elements (physical elements
and social interaction on a mass scale), the con-
cept of blowing into a phone (or laptop) has been
documented in prior work. In the Princeton Lap-
top Orchestra classroom of 2007, Matt Hoffman
created an instrument and piece for “unplugged”
(i.e., without external amplification) laptops, called
Breathalyzer, which required performers to blow
into the microphone to expressively control audio
synthesis (Hoffman 2007). Ananya Misra, with Essl
and Rohs, conducted a series of experiments that
used the microphone for mobile music performance
(including breath input, combined with camera
input; see Misra, Essl, and Rohs 2008). As far as
we know, theirs was the first attempt to make a
breath-mediated, flute-like mobile phone interface.
Furthermore, Essl and Rohs (2007) documented sig-
nificant exploration in combining audio synthesis,
accelerometer, compass, and camera in creating
purely on-device (i.e., no laptop) musical interfaces,
collectively called ShaMus.
Location and global positioning play a significant
role in Ocarina. This notion of “locative media,” a
term used by Atau Tanaka and Lalya Gaye (Tanaka
and Gemeinboeck 2006), has been explored in various
installations, performances, and other projects.
These include Johan Wagenaar’s Kadoum, in which
GPS sensors reported heart-rate information from
24 participants in Australia to an art installation on
a different continent. Gaye, Mazé, and Holmquist
(2003) explored locative media in Sonic City with
location-aware body sensors. Tanaka et al. have
pioneered a number of projects on this topic,
including Malleable Mobile Music and Net Dérive,
the latter making use of a centralized installation
that tracked and interacted with geographically
diverse participants (Tanaka and Gemeinboeck
2008).
Lastly, the notion of using mobile phones for
musical expression in performance can be traced
back to Golan Levin’s Dialtones (Levin 2001),
perhaps the earliest concert concept that used the
audience’s mobile phones as the centerpiece of a
sustained live performance. More recently, the
aforementioned Stanford Mobile Phone Orchestra
was formed in 2007 as the first ensemble of its kind.
The Stanford Mobile Phone Orchestra explored a
more mobile, locative notion of “electronic chamber
music” as pioneered by the Princeton Laptop
Orchestra (Trueman 2007; Smallwood et al. 2008;
Wang et al. 2008) and the Stanford Laptop Orchestra
(Wang et al. 2009a), and also focused on various
forms of audience participation in performance (Oh
and Wang 2011). Since 2008, mobile music has
entered into the curriculum at institutions such
as Stanford University, University of Michigan,
Princeton University, and the California Institute
of the Arts, exploring various combinations of live
performance, instrument design, social interaction,
and mobile software design.
Physical Interaction Design Process
The design of Ocarina took place in the very early
days of mobile apps, and was, by necessity, an
experiment, which explored an intersection of
aesthetics, physical interaction design, and multiple
modalities in sound, graphics, and gesture.
“Inside-Out Design”
Why an ocarina?
If one were to create a musical instrument on a
powerful mobile device such as the iPhone, why not
a harpsichord, violin, piano, drums, or something
else—anything else?
The choice to create an ocarina started with
the iPhone itself—by considering its very form
factor while embracing its inherent capabilities
and limitations. The design aimed to use only the
existing features without hardware add-ons—and to
use these capabilities to their maximum potential.
For one, the iPhone was about the physical size
of a four-hole ocarina. Additionally, the hardware
and software capabilities of the iPhone naturally
seemed to support certain physical interactions that
an ocarina would require: microphone for breath
input, up to 5-point multi-touch (quite enough for a
four-hole instrument), and accelerometers to map to
additional expressive dimensions (e.g., vibrato rate
and depth). Furthermore, additional features on the
device, including GPS location and persistent data
connectivity, beckoned for the exploration of a new
social interaction. Working backwards or “inside-
out” from these features and constraints, the design
suggested the ocarina, which fit the profile in terms
of physical interaction and as a promising candidate
for social experimentation.
Physical Aesthetics
From an aesthetic point of view, the instrument
aspect of Ocarina was rigorously designed as a
physical artifact. The visual presentation consists
only of functional elements (such as animated
finger holes, and breath gauge in Ocarina 2) and
visualization elements (animated waves or ripples
in response to breath). In so doing, the statement
was not “this simulates an ocarina,” but rather
“this is an ocarina.” There are no attempts to adorn
or “skin” the instrument, beyond allowing users
to customize colors, further underscoring that the
physical device is the enclosure for the instrument.
Even the naming of the app reflects this design
thinking, deliberately avoiding the common early
naming convention of prepending app names with
the lowercase letter “i” (e.g., iOcarina). Once again,
it was a statement of what this app is, rather than
what it is trying to emulate.
This design approach also echoed that of a certain
class of laptop orchestra instruments, where the
very form factor of the laptop is used to create
physical instruments, embracing its natural benefits
and limitations (Fiebrink, Wang, and Cook 2007).
This shifted the typical screen-based interaction
to a physical interaction, in our corporeal world,
where the user engages the experience with palpable
dimensions of breath, touch, and tilt.
Physical Interaction
The physical interaction design of Ocarina takes ad-
vantage of three onboard input sensors: microphone
for breath, multi-touch for pitch control, and ac-
celerometers for vibrato. Additionally, Ocarina uses
two output modalities: audio and real-time graph-
ical visualization. The original design schematics
that incorporated these elements can be seen in
Figure 3. The intended playing method of Ocarina
asks the user to “hold the iPhone as one might a
sandwich,” supporting the device with thumbs and
ring fingers, putting the user in position to blow into
the microphone at the bottom of the device, while
also freeing up both index fingers and both middle
fingers to hold down different combinations of the
four onscreen finger-holes.
Figure 3. Initial physical interaction design schematic.
Breath
The user articulates Ocarina literally by blowing
into the phone, specifically into the onboard mi-
crophone. Inside the app, a ChucK program tracks
the amplitude of the incoming microphone signal in
real time, and an initial amplitude envelope is cal-
culated using a leaky integrator, implemented as a
one-pole feedback filter (the actual filter parame-
ter was determined empirically; later versions of
Ocarina actually contained a table of device-specific
gains to further compensate for variation across
device generations). The initial breath signal is con-
ditioned through additional filters tuned to balance
between responsiveness and smoothness, and is
then fed into the Ocarina’s articulator (including
a second envelope generator), which controls the
amplitude of the synthesized Ocarina signal. The
signal resulting from air molecules blown into the
microphone diaphragm has significantly higher en-
ergy than speech and ambient sounds, and naturally
distinguishes between blowing interactions and
other sounds (e.g., typical speech).
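As an illustration of the amplitude tracking described above, the following Python sketch implements a leaky integrator as a one-pole feedback filter over the rectified microphone signal. It is not Ocarina's ChucK code: the names breath_envelope and is_blowing, the pole value 0.995, the gain parameter, and the threshold 0.2 are all assumptions chosen for illustration (the article states only that the actual filter parameter was determined empirically).

```python
def breath_envelope(samples, pole=0.995, gain=1.0):
    """Track signal amplitude with a one-pole leaky integrator.

    y[n] = (1 - pole) * gain * |x[n]| + pole * y[n-1]
    `pole` and `gain` are illustrative values, not Ocarina's.
    """
    env = 0.0
    out = []
    for x in samples:
        # Rectify the input, then let the one-pole filter smooth it.
        env = (1.0 - pole) * gain * abs(x) + pole * env
        out.append(env)
    return out


def is_blowing(envelope_value, threshold=0.2):
    """Hypothetical gate: treat high-energy input as breath."""
    return envelope_value > threshold
```

A pole closer to 1.0 gives a smoother but slower envelope, which mirrors the trade-off between responsiveness and smoothness mentioned in the text.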
Real-Time Graphics
There are two real-time graphical elements that
respond to breath input. Softly glowing ripples
smoothly “wash over” the screen when significant
breath input is being detected, serving both as
visual feedback for breath interaction and as
an aesthetic element of the visual presentation. In
the more recent Ocarina 2, an additional graphical
element visualizes the intensity of the breath input:
Below an internal breath threshold, the visualization
points out the general region to apply breath; above
the threshold, an aurora-like light gauge rises and
falls with the intensity of the breath input.
Multi-Touch Interaction and Animation
Multi-touch is used to detect different combinations
of tone holes held by the user’s fingers. Modeled after
a four-hole English-pendant acoustic ocarina, the
mobile phone instrument provides four independent,
virtual finger holes, resulting in a total of 16 different
fingerings. Four real-time graphical finger holes are
visualized onscreen. They respond to touch gestures
in four quadrants of the screen, maximizing the
effective real estate for touch interaction. The finger
holes respond graphically to touch: They grow and
shrink to reinforce the interaction, and to help
compensate for lack of tactility. Although the touch
screen provides a solid physical object to press
against, there is no additional tactile information
regarding where the four finger holes are. The real-
time visualization aims to mitigate this missing
element by subtly informing the user of the current
fingering. This design also helps first-time users
to learn the basic interaction of the instrument by
simply playing around with it—Ocarina actually
includes a built-in tutorial, but providing more “on-
the-fly” cues to novices seemed useful nonetheless.
A nominal pitch mapping for Ocarina can be seen in
Figure 4, including extended pitch mappings beyond
those found on an acoustic four-hole ocarina.
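The quadrant-based detection of the four finger holes can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names, the hole numbering (0 = top-left through 3 = bottom-right), and the 320 x 480 point screen size are hypothetical and are not taken from the app. Each of the four holes contributes one bit, which yields the 16 possible fingerings mentioned above.

```python
def quadrant(x, y, width, height):
    """Return the finger-hole index (0-3) for a touch point.

    Assumed layout: 0 = top-left, 1 = top-right,
    2 = bottom-left, 3 = bottom-right.
    """
    col = 1 if x >= width / 2 else 0
    row = 1 if y >= height / 2 else 0
    return row * 2 + col


def fingering(touches, width=320, height=480):
    """Fold a list of (x, y) touches into a 4-bit mask of covered holes."""
    mask = 0
    for x, y in touches:
        mask |= 1 << quadrant(x, y, width, height)
    return mask
```

Which pitch each of the 16 masks produces is defined by the mapping in Figure 4 and is deliberately not reproduced here.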
Accelerometers
Accelerometers are mapped to two parameters
of synthesized vibrato. This mapping offers an
additional, independent channel of expressive
control, and further encourages physical movement
with Ocarina. For example, the user can lean
forward to apply vibrato, perhaps inspired by the
visual, performative gestures of brass and woodwind
players when expressing certain passages. The front-
to-back axis of the accelerometer is mapped to
vibrato depth, ranging from no vibrato—when the
device is flat—to significant vibrato when the device
is tilted forward (e.g., the screen is facing away
from the player). A secondary left-to-right mapping
allows the more seasoned player to control vibrato
rate, varying linearly between 2 Hz from one side to
10 Hz on the opposite side (the vibrato is at 6 Hz in its
non-tilted center position). Such interaction offers
“one-order higher” expressive parameters, akin to
expression control found on MIDI keyboards. In
practice, it is straightforward to apply vibrato in
Ocarina to adorn passages, and the mechanics also
allow subtle variation of vibrato for longer notes.
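The tilt-to-vibrato mapping described above can be sketched in Python as follows. The normalization of tilt values (0 to 1 for front-to-back, -1 to 1 for left-to-right), the function names, and the choice of which side corresponds to 2 Hz are assumptions made for illustration; only the 2 Hz, 6 Hz, and 10 Hz endpoints and the linear variation come from the text.

```python
def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def vibrato_depth(forward_tilt, max_depth=1.0):
    """Map front-to-back tilt to vibrato depth.

    forward_tilt: 0.0 when the device is flat, 1.0 when fully tilted
    forward (a normalized value assumed for this sketch).
    """
    return clamp(forward_tilt, 0.0, 1.0) * max_depth


def vibrato_rate_hz(side_tilt):
    """Map left-to-right tilt in [-1, 1] linearly to 2..10 Hz.

    The center position (0.0) gives 6 Hz. Which side maps to 2 Hz
    is an assumption.
    """
    return 6.0 + 4.0 * clamp(side_tilt, -1.0, 1.0)
```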
Sound Synthesis
Audio output in Ocarina is synthesized in real
time in a ChucK program that includes the afore-
mentioned amplitude tracker and articulator. The
synthesis itself is straightforward (the acoustic oca-
rina sound is not complex). The synthesis elements
include a triangle wave, modulated by a second
oscillator (for vibrato), and multiplied against the
amplitude envelope generated by the articulator
situated between Ocarina’s analysis and synthesis
modules. The resulting signal is fed into a reverber-
ator. (A general schematic of the synthesis can be
seen in Figure 5.)
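The core of this signal chain (a triangle-wave carrier, frequency-modulated by a vibrato oscillator and multiplied by the amplitude envelope) can be sketched in plain Python as follows. This is a simplified stand-in, not the ChucK implementation: the reverberator is omitted, the function names are hypothetical, and treating vibrato depth as a fractional frequency deviation is an assumption.

```python
import math


def triangle(phase):
    """Triangle wave in [-1, 1] for a phase measured in cycles."""
    p = phase % 1.0
    return 4.0 * abs(p - 0.5) - 1.0


def render(freq_hz, envelope, vib_rate_hz=6.0, vib_depth=0.0,
           sample_rate=44100):
    """Render one sample per envelope value.

    envelope: per-sample amplitudes in [0, 1] from the articulator.
    vib_depth: fractional frequency deviation (assumed meaning).
    """
    out = []
    phase = 0.0
    for n, amp in enumerate(envelope):
        t = n / sample_rate
        # The low-frequency oscillator bends the carrier frequency.
        lfo = math.sin(2.0 * math.pi * vib_rate_hz * t)
        freq = freq_hz * (1.0 + vib_depth * lfo)
        phase += freq / sample_rate
        # Carrier multiplied against the amplitude envelope.
        out.append(amp * triangle(phase))
    return out
```

Accumulating phase sample by sample, rather than computing it from absolute time, keeps the waveform continuous when the vibrato changes the instantaneous frequency.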
Figure 4. Pitch mappings for C Ionian. Five additional
pitch mappings not possible in traditional four-hole
ocarinas are denoted with dotted outline. (From the
Ocarina 1.0 design specification, October 2008.)
Figure 5. Ocarina’s general sound synthesis scheme as
implemented in ChucK. (Blocks shown: breath input
(articulation), OnePole (rough envelope), OnePole
(low-pass filter), Step (secondary envelope), ADSR
(on/off envelope), multitouch (pitch), accelerometers
(vibrato), SinOsc (LFO for vibrato), TriOsc (carrier
oscillator), a multiplier, NRev (reverberator), and
audio output.)
Figure 6. Initial option screen design, allowing users
to name their instrument (for social interaction),
change key and mode, as well as simple customizations
for the instrument’s appearance. (From the Ocarina 1.0
design specification, October 2008. Options shown:
root (C through B, chromatic), mode (Ionian, Dorian,
Phrygian, Lydian, Mixolydian, Aeolian, Locrian,
Zeldarian), finger and breath colors, echo, breath
sensitivity, and a text field to name the ocarina.)
The acoustic ocarina produces sound as a
Helmholtz resonator, and the sizes of the finger
holes are carefully chosen to affect the amount
of total uncovered area as a ratio to the enclosed
volume and thickness of the ocarina; this relationship
directly affects the resulting frequency. The
pitch range of an acoustic four-hole English-pendant
ocarina is typically one octave, the lowest note
played by covering all four finger holes, and the
highest played by uncovering all finger holes. Some
chromatic pitches are played by partially covering
certain holes. No longer coupled to the physical
parameters, the digital Ocarina offers precise intonation
for all pitches, extended pitch mapping,
and additional expressive elements, such as vibrato
and even portamento in Ocarina 2. The tuning is
not fixed; the player can choose different root keys
and diatonic modes (Ionian, Dorian, Phrygian, etc.),
offering multiple pitch mappings (see Figure 6).
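The choice of root key and diatonic mode can be illustrated with a short Python sketch that derives the pitches of a scale by rotating the whole-step and half-step pattern of the major scale. The names MODES, MAJOR_STEPS, and mode_scale are hypothetical. The sketch covers only the seven diatonic modes; the “Zeldarian” mapping and the assignment of scale degrees to the 16 fingerings are not specified in the text and are left out.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F",
              "F#", "G", "G#", "A", "A#", "B"]
MODES = ["ionian", "dorian", "phrygian", "lydian",
         "mixolydian", "aeolian", "locrian"]
# Semitone steps of the Ionian (major) scale.
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]


def mode_scale(root, mode):
    """Return the note names of one octave of `mode` starting on `root`."""
    start = MODES.index(mode)
    # Each mode is a rotation of the major-scale step pattern.
    steps = MAJOR_STEPS[start:] + MAJOR_STEPS[:start]
    pitch_class = NOTE_NAMES.index(root)
    names = [NOTE_NAMES[pitch_class]]
    for step in steps:
        pitch_class = (pitch_class + step) % 12
        names.append(NOTE_NAMES[pitch_class])
    return names
```

For example, mode_scale("C", "ionian") yields the C-to-C sequence listed for C Ionian in Figure 4.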
The app even contains a newly invented (i.e.,
rather apocryphal) “Zeldarian” mode, where the
pitches are mapped to facilitate the playing of a
single melody: The Legend of Zelda theme song. In
popular culture, the Nintendo 64 video game The
Legend of Zelda: Ocarina of Time (1998) may be
the most prominent and enduring reference to the
acoustic ocarina. In this action-adventure game, the
protagonist, Link, must learn to play songs on an
in-game ocarina with magical powers to teleport
through time. The game is widely considered to be
in the pantheon of greatest video games (Wikipedia
2013), and for that reason continues to endure and
delight, and continues to introduce the ocarina
to new generations of gamers (so effectively that
apparently a portion of the population mistakenly
believes the ocarina is a purely fictional instrument that
exists only in the mythical in-game realm of Hyrule).
In any case, there is a sense of magic associated with
the ocarina, something that the design of Ocarina
aimed to capture. After all, isn’t hinting at magic a
powerful way to hide technology, while encouraging
users to focus on the experience?
Figure 7. A typical tablature on Ocarina’s online
songbook database populated with content from the
user community. (Example shown: Twinkle Twinkle
Little Star, traditional; root: C, mode: Ionian.)
Incorporating Expressive Game-Like Elements
In Ocarina, users learn to play various melodies via a
Web site specially crafted for users to share tablatures
for the iPhone-based instrument (Hamilton, Smith,
and Wang 2011). Each tablature shows a suggested
root key, mode, and sequence of fingerings (see
Figure 7). An editor interface on the Web site allows
users to input and share new tablatures. Through
this site, users are able to search and access over
5,000 user-generated Ocarina tablatures; during
peak usage the site had more than a million hits per
month. Users would often display the tablature on
a second computer (e.g., their laptop), while using
their iPhone to play the music. This is reminiscent
of someone learning to play a recorder while reading
music from a music stand—only here, the physical
instrument is embodied by the mobile phone, and
the computer has become both score and music
stand.
A sequel to Ocarina was created and released in
2012, called Ocarina 2 (abbreviated as O2—alluding
to the oxygen molecule and the breath interaction
needed for the app). Inspired by the success of the
Web-based tablatures, Ocarina 2’s most significant
new core features are (1) a game-like “songbook
mode” that teaches players how to play songs
note by note and (2) a dynamic harmony-rendering
engine that automatically accompanies the player. In
addition, every color, animation, spacing, sound, and
graphical effect was further optimized in Ocarina 2.
Figure 8. Ocarina 2 provides a teaching mode that
shows the next three fingerings for any particular
song (from center and up). This mode also provides
basic harmony accompaniment that follows the user’s
melody playing.
For a given song in Ocarina 2, an onscreen
queue of ocarina fingerings shows the next note to
play, as well as two more fingerings beyond that (see
Figure 8). The player is to hold the right combination
of finger holes onscreen, and articulate the note by
blowing—the Ocarina 2 songbook engine detects
these conditions, and advances to the next note. It is
important to emphasize there are no time or tempo
restrictions in this mode—players are generally free
to hold each note as long as they wish (and apply
dynamics and vibrato as desired), and furthermore
they are encouraged to play at their own pace. In
essence this songbook mode follows the player,
not the other way around. The design aims to both
provide a more natural and less stressful experience
to learn, and also to leave as much space as possible
for open expression. The player is responsible for
tempo and tempo variations, articulation (and co-
articulation of multi-note passages), dynamics, and
vibrato. The player is also responsible for the pitch
by holding the correct fingerings as shown, but is
free to embellish by adding notes and even trills.
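One way to picture a songbook mode that follows the player, with no clock or tempo involved, is the following Python sketch. It is an assumption-laden illustration rather than the Ocarina 2 engine: the class name SongbookFollower, the breath threshold of 0.2, and the latching rule (a held note counts once until the breath stops or the fingering changes) are all hypothetical.

```python
class SongbookFollower:
    """Advance through a song whenever the right note is articulated."""

    def __init__(self, fingerings, breath_threshold=0.2):
        self.fingerings = list(fingerings)  # target 4-bit masks, in order
        self.index = 0
        self.breath_threshold = breath_threshold
        self._held_mask = None  # mask of the note currently being held

    def upcoming(self, count=3):
        """Next fingerings to display (the current one plus two more)."""
        return self.fingerings[self.index:self.index + count]

    def finished(self):
        return self.index >= len(self.fingerings)

    def update(self, mask, breath):
        """Feed one sensor frame; return True if the song advanced."""
        if self.finished():
            return False
        blowing = breath > self.breath_threshold
        # Release the latch when breath stops or the fingering changes,
        # so a sustained note is not counted more than once.
        if not blowing or mask != self._held_mask:
            self._held_mask = None
        if (blowing and self._held_mask is None
                and mask == self.fingerings[self.index]):
            self._held_mask = mask
            self.index += 1
            return True
        return False
```

Because update never consults a timer, the player can hold, pause, or add extra notes between targets without penalty, which matches the stated intent that the mode follows the player.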
There is no game-score reward system in
Ocarina 2, though game-like achievements can
be earned. Progress is accumulated per song, via
“breath points” as a general measurement of how
much a user has blown into his or her phone.
Achievements like “Every Breath You Take” (accu-
mulate 300 breath points) can be earned over time.
Probably the most hard-core achievement in Oca-
rina 2 is one called “Lungevity,” which challenges
the user to accumulate 1,000,000 breath points.
By rough estimation, to get this achievement, one
would need to play 500 songs each 200 times!
Ocarina 2 was an exploration to strike a balance
between an expressive musical artifact (i.e., an
instrument) and a game or toy. The goal is to retain
genuine expressive possibilities while offering game-like
qualities that can drastically reduce the barrier to
entry into the experience. The theory was that
people are much less inhibited and intimidated
by trying something they perceive as a game, in
contrast to a perceived musical instrument—yet,
perhaps the two are not mutually exclusive. It
should be possible to have game-like elements
that draw people in, and even benignly “trick” the
user into being expressive—and, for some, possibly
getting a first-time taste for the joy of making music.
Social Interaction Design
Ocarina is possibly the first-ever massively adopted
instrument that allows its users to hear one another
around the world, accompanied by a visualization
of the world that shows where each musical
snippet originated. After considering the physical
interaction, the design underwent an exercise to use
the additional hardware and software capabilities of
the iPhone to maximum advantage, aimed to enable
a social-musical experience—something that one
could not do with a traditional acoustic ocarina (or
perhaps any instrument). The exercise sought to
limit the design to exactly one social feature, but
then to make that feature as compelling as possible.
(If nothing else, this was to be an interesting and
fun experiment!)
From there, it made sense to consider the device’s
location capabilities—because the phone is, by
definition, mobile and travels in daily life with its
user, and it is always connected to the Internet. The
result was the globe in Ocarina, which allows any
user to anonymously (and surreptitiously) listen in
on potentially any other Ocarina user around the
world (see Figure 9). Users would only be identified
by their location (if they agreed to provide it to the
app), a moniker they could choose for themselves
(e.g., Link123 or ZeldaInRome), and their music
(see Figure 10).
If listeners like what they hear, they can “love”
the snippet by tapping a heart icon. The snippet
being heard is chosen via an algorithm at a central
Ocarina server, and takes into account recency,
popularity (as determined by users via “love”
count), geographic diversity of the snippets, as well
as filter selections by the user. Listeners can choose
to listen to (1) the world, (2) a specific region, (3)
snippets that they have loved, and (4) snippets they
have played. To the author’s knowledge, this type
of social–musical interaction is the first of its kind
and scale, as users have listened to each other over
40 million times on the globe. A map showing the
rough distribution of Ocarina users can be seen in
Figure 11.
How is this social interaction accomplished,
technically? As a user plays Ocarina, an algorithm in
the analysis module decides when to record snippets
as candidates for uploading to a central Ocarina
server, filtering out periods of inactivity, limiting
maximum snippet lengths (this server-controlled
parameter is usually set to 30 seconds), and even
taking into account central server load. When
snippet recording is enabled, the Ocarina engine
rapidly takes snapshots of gestural data, including
current breath-envelope value, finger-hole state,
and tilt from two accelerometers. Through this
process a compact network packet is created,
time-stamped, and geotagged with GPS information,
and uploaded to the central Ocarina server and
database.
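To make the idea of a compact gesture packet concrete, the following Python sketch packs snapshots of breath, finger-hole state, and two tilt values into a small binary record with a timestamp and location header. The byte layout, field widths, quantization, and function names are assumptions for illustration; the article does not describe Ocarina's actual wire format. Only the kinds of data recorded and the usual 30-second limit come from the text.

```python
import struct

MAX_SNIPPET_SECONDS = 30.0  # usual server-controlled limit per the text
# Assumed layout: unix time (float64), latitude and longitude (float32),
# frame count (uint16).
HEADER = struct.Struct("<dffH")
# Assumed layout per frame: ms offset (uint16), breath (uint8),
# finger-hole mask (uint8), two tilt values (int8 each).
FRAME = struct.Struct("<HBBbb")


def _clamp(value, low, high):
    return max(low, min(high, value))


def pack_snippet(frames, timestamp, latitude, longitude):
    """frames: (t_seconds, breath 0..1, mask 0..15, tilt_x, tilt_y -1..1)."""
    kept = [f for f in frames if 0.0 <= f[0] <= MAX_SNIPPET_SECONDS]
    parts = [HEADER.pack(timestamp, latitude, longitude, len(kept))]
    for t, breath, mask, tilt_x, tilt_y in kept:
        parts.append(FRAME.pack(
            int(round(t * 1000)),
            int(round(_clamp(breath, 0.0, 1.0) * 255)),
            mask & 0x0F,
            int(round(_clamp(tilt_x, -1.0, 1.0) * 127)),
            int(round(_clamp(tilt_y, -1.0, 1.0) * 127)),
        ))
    return b"".join(parts)


def unpack_snippet(data):
    """Inverse of pack_snippet (values come back quantized)."""
    timestamp, latitude, longitude, count = HEADER.unpack_from(data, 0)
    frames = []
    for i in range(count):
        offset = HEADER.size + i * FRAME.size
        ms, breath, mask, tx, ty = FRAME.unpack_from(data, offset)
        frames.append((ms / 1000.0, breath / 255.0, mask,
                       tx / 127.0, ty / 127.0))
    return timestamp, latitude, longitude, frames
```

At six bytes per snapshot in this assumed layout, gesture data is far smaller than recorded audio of the same duration, which is consistent with the compactness the design relies on.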
During playback in Ocarina’s globe visualization,
the app requests new snippets from the server
according to listener preference (region, popularity,
and other filters). A server-side algorithm identifies
a set of snippets that most closely matches the
desired criteria, and sends back a snippet selected
at random from this matching set. Note that no
audio recording is ever stored on the server, only
gesture information (which is more compact and
potentially richer). The returned snippet is rendered
by the Ocarina app client, feeding the gesture data
recording into the same synthesis engine used for
the instrument, and rendering it into sound in the
visualized globe. The system design of Ocarina,
from physical interaction to cloud-mediated social
interaction, can be seen in Figure 12.
Figure 9. Social interaction design for Ocarina. The
goal was to utilize GPS location and data connectivity
into a single social feature. (From the Ocarina 1.0
design specification, October 2008: with the user's
permission, his/her GPS/tower location is uploaded to
a central Smule server; the server then sends updates
to the phone, which displays and animates the current
ocarina usage around the world; audio playback plays
back selections of uploaded snippets, or perhaps in
real time.)
Figure 10. Listening to the world in Ocarina.
Figure 11. Distribution of the first 2 billion breath
blows around the world.
Figure 12. Ocarina system design, from physical
interaction to social interaction. (Blocks shown:
breath input (articulation), multitouch (pitch),
accelerometers (vibrato), envelope generator, sound
synthesis, audio output, gesture recorder/player,
network module, Internet, central servers, database,
and anonymous user data.)
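The server-side step of narrowing snippets to a matching set and returning one at random can be sketched in Python as follows. The snippet fields (region, player_id, loved_by), the function names, and the four mode strings are hypothetical; the sketch models only the listener's filter choice and the random pick, and does not attempt to reproduce how the real server weighs recency, popularity, or geographic diversity.

```python
import random


def matches(snippet, mode, listener_id=None, region=None):
    """Check one snippet against the listener's filter selection."""
    if mode == "world":
        return True
    if mode == "region":
        return snippet["region"] == region
    if mode == "loved":
        return listener_id in snippet["loved_by"]
    if mode == "played":
        return snippet["player_id"] == listener_id
    raise ValueError("unknown listening mode: %r" % (mode,))


def pick_snippet(snippets, mode, listener_id=None, region=None, rng=random):
    """Return a random snippet from the matching set, or None if empty."""
    candidates = [s for s in snippets
                  if matches(s, mode, listener_id, region)]
    return rng.choice(candidates) if candidates else None
```

The client would then feed the returned gesture data into the same synthesis engine used for live playing, as described above.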
User Case Studies
Ocarina users have listened in on each other over
40 million times, and somehow created an
unexpected flood of self-expression in their everyday
life. Within a few days of the release of
Ocarina (in November 2008), user-created videos
began surfacing on the Internet in channels such
as YouTube (see Figure 13). Thousands of videos
showcased everyday users performing on their
iPhone Ocarinas, in living rooms, dorm rooms,
kitchens, holiday parties, on the streets, and many
other settings. Performers vary in age from young
children to adults, and seem to come from all over
the globe. They play many types of music, from
Ode to Joy, video game music (e.g., Legend of Zelda,
Super Mario Bros., Tetris), themes from movies
and television shows (e.g., The X-Files, Star Wars,
Star Trek), to pop and rock music, show tunes,
and folk melodies (e.g., Amazing Grace, Kumbaya,
Shenandoah). Many are solo performances; others
are accompanied by acoustic guitars, piano, and
even other iPhone-based musical instruments.
Figure 13. Ocarina users share their performances
via Internet video.
As an example, one user created a series of videos
in which she plays Ocarina by blowing into the
iPhone with her nose (top left in Figure 13). Apparently,
she has a long history of playing nose
flutes, and Ocarina was her latest nasal-musical
experiment. She began with a nose-mediated ren-
dition of Music of the Night and, after this video
gained renown on YouTube, followed up with per-
formances of The Blue Danube (this one played
upside-down to further increase the difficulty), the
Jurassic Park theme, The Imperial March from Star
Wars, and Rick Astley’s Never Gonna Give You Up.
One user braved snowy streets to busk for money
with his iPhone Ocarina and filmed the experience.
Another group of users created a video promoting
tourism in Hungary. Some have crafted video
tutorials to teach Ocarina; others have scripted
and produced original music videos. All of these
represent creative uses of the instrument, some
that even we, its creators, had not anticipated.
There is something about playing Ocarina on one’s
iPhone that seems to overcome the inhibition
of performing, especially in people who are not
normally performers and who don’t typically call
themselves musicians.
It was surprising to see such mass adoption
of Ocarina, in spite of the app’s unusual demand that
users physically handle the iPhone in unconventional ways.
Looking back over the years, one could reasonably surmise that
much of its popularity arose because the sheer novelty
and curiosity of playing a flute-like instrument on
a mobile phone effectively overcame barriers to
trying a new musical instrument. And if the physical
interaction of Ocarina provoked curiosity through
novelty, the social globe interaction provided
something—perhaps a small sense of wonder—that
was not possible without a mobile, location-aware,
networked computer.
Discussion
Is the app a new form of interactive art? Can an app
be considered art? What might the role of technology
be in inspiring or ushering a large population into
exploring musical expression? Although the mobile
app world has evolved with remarkable speed since
2008, the medium is perhaps still too young to fully
answer these questions. We can ponder, nonetheless.
There are definite limitations to the mobile
phone as a platform for crafting musical expression,
especially in creating an app designed to reach
a wide audience. In a sense, we have to work
with what is available on the device, and nothing
more. We might do our best to embrace the capabilities
and limitations, but is that enough? Traditional
instruments are designed and crafted over decades
or even centuries, whereas something like Ocarina
was created in six weeks. Does it even make sense
to compare the two?
On the other hand, alongside limitations lie
possibilities for new interactions—both physical
and social—and new ways to inspire a large
population to be musical. Ocarina affords a sense
of expressiveness. There are moments in Ocarina’s
globe interaction where one might easily forget
the technology, and feel a small, yet nonetheless
visceral, connection with strangers on the other
side of the world. Is that not a worthwhile human
experience, one that was not possible before? The
tendrils of possibility seem to reach out and plant
the seeds for some yet-unknown global community.
Is that not worth exploring?
As a final anecdote, here is a review for Ocarina
(Apple App Store 2008):
This is my peace on earth. I am currently
deployed in Iraq, and hell on earth is an every
day occurrence. The few nights I may have
off I am deeply engaged in this app. The globe
feature that lets you hear everybody else in the
world playing is the most calming art I have
ever been introduced to. It brings the entire
world together without politics or war. It is
the EXACT opposite of my life—Deployed U.S.
Soldier.
Is Ocarina itself a new form of art? Or is it a toy?
Or maybe a bit of both? These are questions for each
person to decide.
Acknowledgments
This work owes much to the collaboration of many
individuals at Smule, Stanford University, CCRMA,
and elsewhere, including Spencer Salazar, Perry
Cook, Jeff Smith, David Zhu, Arnaud Berry, Mattias
Ljungstrom, Jonathan Berger, Rob Hamilton, Georg
Essl, Rebecca Fiebrink, Turner Kirk, Tina Smith,
Chryssie Nanou, and the Ocarina community.
References
Apple App Store. 2008. “Ocarina.” Available online
at itunes.apple.com/us/app/ocarina/id293053479.
Accessed October 2013.
Essl, G., and M. Rohs. 2006. “Mobile STK for Symbian
OS.” In Proceedings of the International Computer
Music Conference, pp. 278–281.
Essl, G., and M. Rohs. 2007. “ShaMus—A Sensor-Based
Integrated Mobile Phone Instrument.” In Proceedings
of the International Computer Music Conference, pp.
200–203.
Essl, G., G. Wang, and M. Rohs. 2008. “Developments and
Challenges Turning Mobile Phones into Generic Music
Performance Platforms.” In Proceedings of Mobile
Music Workshop, pp. 11–14.
Fiebrink, R., G. Wang, and P. R. Cook. 2007. “Don’t
Forget the Laptop: Using Native Input Capabilities
for Expressive Musical Control.” In Proceedings of
the International Conference on New Interfaces for
Musical Expression, pp. 164–167.
Gaye, L., R. Mazé, and L. E. Holmquist. 2003. “Sonic City:
The Urban Environment as a Musical Interface.” In
Proceedings of the International Conference on New
Interfaces for Musical Expression, pp. 109–115.
Gaye, L., et al. 2006. “Mobile Music Technology: Report
on an Emerging Community.” In Proceedings of
the International Conference on New Interfaces for
Musical Expression, pp. 22–25.
Geiger, G. 2003. “PDa: Real Time Signal Processing
and Sound Generation on Handheld Devices.” In
Proceedings of the International Computer Music
Conference, pp. 283–286.
Hamilton, R., J. Smith, and G. Wang. 2011. “Social
Composition: Musical Data Systems for Expressive
Mobile Music.” Leonardo Music Journal 21:
57–64.
Hoffman, Matt. 2007. “Breathalyzer.” Available online at
smelt.cs.princeton.edu/pieces/Breathalyzer. Accessed
October 2013.
Levin, G. 2001. “Dialtones (a Telesymphony).” Available
online at www.flong.com/projects/telesymphony.
Accessed December 2013.
Misra, A., G. Essl, and M. Rohs. 2008. “Microphone as
Sensor in Mobile Phone Performance.” In Proceedings
of the International Conference on New Interfaces for
Musical Expression, pp. 185–188.
Oh, J., and G. Wang. 2011. “Audience–Participation
Techniques Based on Social Mobile Computing.” In
Proceedings of the International Computer Music
Conference, pp. 665–671.
Oh, J., et al. 2010. “Evolving the Mobile Phone Orchestra.”
In Proceedings of the International Conference on New
Interfaces for Musical Expression, pp. 82–87.
Smallwood, S., et al. 2008. “Composing for Laptop
Orchestra.” Computer Music Journal 32(1):9–25.
Schiemer, G., and M. Havryliv. 2006. “Pocket Gamelan:
Tuneable Trajectories for Flying Sources in Mandala 3
and Mandala 4.” In Proceedings of the International
Conference on New Interfaces for Musical Expression,
pp. 37–42.
Tanaka, A. 2004. “Mobile Music Making.” In Proceedings
of the International Conference on New Interfaces for
Musical Expression, pp. 154–156.
Tanaka, A., and P. Gemeinboeck. 2006. “A Framework for
Spatial Interaction in Locative Media.” In Proceedings
of the International Conference on New Interfaces for
Musical Expression, pp. 26–30.
Tanaka, A., and P. Gemeinboeck. 2008. “Net Dérive:
Conceiving and Producing a Locative Media Artwork.” In
G. Goggin and L. Hjorth, eds. Mobile Technologies:
From Telecommunications to Media. London:
Routledge, pp. 174–186.
Trueman, D. 2007. “Why a Laptop Orchestra?” Organised
Sound 12(2):171–179.
Wang, G. 2008. “The ChucK Audio Programming Language:
A Strongly-Timed and On-the-Fly Environ/mentality.”
PhD Thesis, Princeton University.
Wang, G. 2014. “The World Is Your Stage: Making Music
on the iPhone.” In S. Gopinath and J. Stanyek, eds.
Oxford Handbook of Mobile Music Studies, Volume 2.
Oxford: Oxford University Press, pp. 487–504.
Wang, G. 2015. “Improvisation of the Masses: Anytime,
Anywhere Music.” In G. Lewis and B. Piekut, eds.
Oxford Handbook of Improvisation Studies. Oxford:
Oxford University Press.
Wang, G., G. Essl, and H. Penttinen. 2008. “MoPhO:
Do Mobile Phones Dream of Electric Orchestras?”
In Proceedings of the International Computer Music
Conference, pp. 331–337.
Wang, G., G. Essl, and H. Penttinen. 2014. “Mobile Phone
Orchestra.” In S. Gopinath and J. Stanyek, eds. Oxford
Handbook of Mobile Music Studies, Volume 2. Oxford:
Oxford University Press, pp. 453–469.
Wang, G., et al. 2008. “The Laptop Orchestra as Classroom.”
Computer Music Journal 32(1):26–37.
Wang, G., et al. 2009a. “Stanford Laptop Orchestra
(SLOrk).” In Proceedings of the International Computer
Music Conference, pp. 505–508.
Wang, G., et al. 2009b. “Smule = Sonic Media: An
Intersection of the Mobile, Musical, and Social.” In
Proceedings of the International Computer Music
Conference, pp. 283–286.
Wikipedia. 2013. “The Legend of Zelda: Ocarina of Time,”
Wikipedia. Available online at
en.wikipedia.org/wiki/The_Legend_of_Zelda:_Ocarina_of_Time. Accessed
October 2013.
Chapter
Full-text available
A variety of methods for audio quality evaluation are available ranging from classic psychoacoustic methods like alternative forced-choice tests to more recent approaches such as quality taxonomies and plausibility. This chapter introduces methods that are deemed to be relevant for audio evaluation in virtual and augmented reality. It details in how far these methods can directly be used for testing in virtual reality or have to be adapted with respect to specific aspects. In addition, it highlights new areas, for example, quality of experience and presence that arise from audiovisual interactions and the mediation of virtual reality. After briefly introducing 3D audio reproduction approaches for virtual reality, the quality that these approaches can achieve is discussed along with the aspects that influence the quality. The concluding section elaborates on current challenges and hot topics in the field of audio quality evaluation and audio reproduction for virtual reality. To bridge the gap between theory and practice useful resources, software and hardware for 3D audio production and research are pointed out.
Chapter
Full-text available
This chapter examines user experience design for collaborative music making in shared virtual environments (SVEs). Whilst SVEs have been extensively researched for many application domains including education, entertainment, work and training, there is limited research on the creative aspects. This results in many unanswered design questions such as how to design the user experience without being detrimental to the creative output, and how to design spatial configurations to support both individual creativity and collaboration. Here, we explore multi-modal approaches to supporting creativity in collaborative music making in SVEs. We outline an SVE, LeMo, which allows two people to create music collaboratively. We then present two studies; the first explores how free-form visual 3D annotations instead of spoken communication can support collaborative composition processes and human–human interaction. Five classes of use of annotation were identified in the study, three of which are particularly relevant to the future design of sonic interactions in virtual environments. The second study used a modified version of LeMo to test the support for a creative collaboration of two different spatial audio settings, which according to the results, changed participants’ behaviour and affected their collaboration. Finally, design implications for the auditory design of SVEs focusing on supporting creative collaboration are given.
Chapter
Full-text available
The development of Virtual Reality (VR) systems and multimodal simulations presents possibilities in spatial-music mixing, be it in virtual spaces, for ensembles and orchestral compositions or for surround sound in film and music. Traditionally, user interfaces for mixing music have employed the channel-strip metaphor for controlling volume, panning and other audio effects that are aspects that also have grown into the culture of mixing music spatially. Simulated rooms and two-dimensional panning systems are simply implemented on computer screens to facilitate the placement of sound sources within space. In this chapter, we present design aspects for mixing in VR, investigating already existing virtual music mixing products and creating a framework from which a virtual spatial-music mixing tool can be implemented. Finally, the tool will be tested against a similar computer version to examine whether or not the sensory benefits and palpable spatial proportions of a VE can improve the process of mixing 3D sound.
Chapter
Full-text available
As the next generation of active video games (AVG) and virtual reality (VR) systems enter people’s lives, designers may wrongly aim for an experience decoupled from bodies. However, both AVG and VR clearly afford opportunities to bring experiences, technologies, and users’ physical and experiential bodies together, and to study and teach these open-ended relationships of enaction and meaning-making in the framework of embodied interaction. Without such a framework, an aesthetic pleasure, lasting satisfaction, and enjoyment would be impossible to achieve in designing sonic interactions in virtual environments (SIVE). In this chapter, we introduce this framework and focus on design exemplars that come from a soma design ideation workshop and balance rehabilitation. Within the field of physiotherapy, developing new conceptual interventions, with a more patient-centered approach, is still scarce but has huge potential for overcoming some of the challenges facing health care. We indicate how the tactics such as making space, subtle guidance, defamiliarization, and intimate correspondence have informed the exemplars, both in the workshop and also in our ongoing physiotherapy case. Implications for these tactics and design strategies for our design, as well as for general practitioners of SIVE are outlined.
Chapter
Full-text available
The relationships between the listener, physical world, and virtual environment (VE) should not only inspire the design of natural multimodal interfaces but should be discovered to make sense of the mediating action of VR technologies. This chapter aims to transform an archipelago of studies related to sonic interactions in virtual environments (SIVE) into a research field equipped with a first theoretical framework with an inclusive vision of the challenges to come: the egocentric perspective of the auditory digital twin. In a VE with immersive audio technologies implemented, the role of VR simulations must be enacted by a participatory exploration of sense-making in a network of human and non-human agents, called actors. The guardian of such locus of agency is the auditory digital twin that fosters intra-actions between humans and technology, dynamically and fluidly redefining all those configurations that are crucial for an immersive and coherent experience. The idea of entanglement theory is here mainly declined in an egocentric spatial perspective related to emerging knowledge of the listener’s perceptual capabilities. This is an actively transformative relation with the digital twin potentials to create movement, transparency, and provocative activities in VEs. The chapter contains an original theoretical perspective complemented by several bibliographical references and links to the other book chapters that have contributed significantly to the proposal presented here.
Chapter
Full-text available
Real-time auralization is essential in virtual reality (VR), gaming, and architecture to enable an immersive audio-visual experience. The audio rendering must be congruent with visual feedback and respond with minimal delay to interactive events and user motion. The wave nature of sound poses critical challenges for plausible and immersive rendering and leads to enormous computational costs. These costs have only increased as virtual scenes have progressed away from enclosures toward complex, city-scale scenes that mix indoor and outdoor areas. However, hard real-time constraints must be obeyed while supporting numerous dynamic sound sources, frequently within a tightly limited computational budget. In this chapter, we provide a general overview of VR auralization systems and approaches that allow them to meet such stringent requirements. We focus on the mathematical foundation, perceptual considerations, and application-specific design requirements of practical systems today, and the future challenges that remain.
Chapter
Full-text available
Sonic experiences are usually considered as the result of auditory feedback alone. From a psychological standpoint, however, this is true only when a listener is kept isolated from concurrent stimuli targeting the other senses. Such stimuli, in fact, may either interfere with the sonic experience if they distract the listener, or conversely enhance it if they convey sensations coherent with what is being heard. This chapter is concerned with haptic augmentations having effects on auditory perception, for example how different vibrotactile cues provided by an electronic musical instrument may affect its perceived sound quality or the playing experience. Results from different experiments are reviewed showing that the auditory and somatosensory channels together can produce constructive effects resulting in measurable perceptual enhancement. That may affect sonic dimensions ranging from basic auditory parameters, such as the perceived intensity of frequency components, up to more complex perceptions which contribute to forming our ecology of everyday or musical sounds.
Chapter
Full-text available
This chapter addresses the first building block of sonic interactions in virtual environments, i.e., the modeling and synthesis of sound sources. Our main focus is on procedural approaches, which strive to gain recognition in commercial applications and in the overall sound design workflow, firmly grounded in the use of samples and event-based logics. Special emphasis is placed on physics-based sound synthesis methods and their potential for improved interactivity. The chapter starts with a discussion of the categories, functions, and affordances of sounds that we listen to and interact with in real and virtual environments. We then address perceptual and cognitive aspects, with the aim of emphasizing the relevance of sound source modeling with respect to the senses of presence and embodiment of a user in a virtual environment. Next, procedural approaches are presented and compared to sample-based approaches, in terms of models, methods, and computational costs. Finally, we analyze the state of the art in current uses of these approaches for Virtual Reality applications.
Chapter
Full-text available
Immersive virtual musical instruments (IVMIs) lie at the intersection between music technology and virtual reality. Being both digital musical instruments (DMIs) and elements of virtual environments (VEs), IVMIs have the potential to transport the musician into a world of imagination and unprecedented musical expression. But when the final aim is to perform live on stage, the employment of these technologies is anything but straightforward, for sharing the virtual musical experience with the audience gets quite arduous. In this chapter, we assess in detail the several technical and conceptual challenges linked to the composition of IVMI performances on stage, i.e., their scenography , providing a new critical perspective on IVMI performance and design. We first propose a set of dimensions meant to analyse IVMI scenographies, as well as to evaluate their compatibility with different instrument metaphors and performance rationales. Such dimensions are built from the specifics and constraints of DMIs and VEs; they include the level of immersion of musicians and spectators and provide an insight into the interaction techniques afforded by 3D user interfaces in the context of musical expression. We then analyse a number of existing IVMIs and stage setups, and finally suggest new ones, with the aim to facilitate the design of future immersive performances.
Chapter
Full-text available
Due to their mobility, intimacy, and sheer strength in numbers, mobile phones have become much more than “portable miniature computers,” increasingly serving as personal extensions of ourselves. Therein lies immense potential to reshape the way we think and do, especially in how we engage one another socially, creatively, musically. This chapter explores the emergence of the iPhone as a unique platform for creating new expressive and social mediums. In my dual role as an Assistant Professor at Stanford University's CCRMA and the Co-founder of Smule—a startup company devoted to music-making on the iPhone—I chronicle the beginnings of the iPhone and first-hand experience in co-founding Smule in 2008, designing its products, and reflecting on the ramifications (so far). Through three case studies, I examine how Smule's “social musical artifacts” are able to take deep advantage of the iPhone's intersection of technologies (multitouch, powerful mobile CPU and GPU, full audio pipeline, GPS and location, persistent data connection) and its human factors (mobility, ubiquity, and intimacy) to provide experiences that seek to be expressive on a personal level, and social on a global scale. It is my hope that these notes demonstrate a potential for new types of creative communities and look ahead into a possible future of global music-making via personal computing devices.
Chapter
Full-text available
We chronicle our adventures with the Mobile Phone Orchestra (MoPhO), a new repertoire-based ensemble using mobile phones as the primary musical instrument. While mobile phones have been used for artistic expression before, MoPhO is the first (to the best of our knowledge) to approach it from an ensemble/repertoire angle. It employs more than a dozen players and mobile phones that serve as a compositional and performance platform for a expanding and dedicated repertoire. In this sense, it is the first ensemble of its kind. MoPhO was instantiated in Fall 2007 at Stanford University’s Center for Computer Research in Music and Acoustics (CCRMA) and performed its debut concert in January 2008. Mobile phones are growing in sheer number and computational power. Hyper-ubiquitous and deeply entrenched in the lifestyles of people around the world, they transcend nearly every cultural and economic barrier. Computationally, the mobile phones of today offers speed and storage capabilities comparable to desktop computers from less than ten years ago, rendering them suitable for real-time sound synthesis and other musical applications. Like traditional acoustic instruments, the mobile phones are intimate sound producing devices. By comparison to most instruments, they are rather soft and have somewhat limited acoustic bandwidth. However, mobile phones have the advantages of ubiquity, strength in numbers, and ultra-mobility, making it feasible to hold jam sessions, rehearsals, and even performance almost anywhere, anytime. A goal of the Mobile Phone Orchestra is to explore these possibilities as a research and music-making body. We investigate the fusion of technological artifact and human musicianship, and provide a new vehicle for experimenting with new music and music-making. We see the mobile phone orchestra idea matching the idea of a laptop orchestra. The phones as intimate sound sources provide a unique opportunity to explore ”mobile electronic chamber music”. 
The Mobile Phone Orchestra presents a well-defined platform of hardware and software configuration and players, enabling composers to craft mobile instruments and write music tailored to such an ensemble. Furthermore, the combination of technology, aesthetics, and instrument building presents a potentially powerful pedagogical opportunity, which compared to laptop orchestras, gains the added benefit of extreme mobility.
Thesis
Full-text available
The computer has long been considered an extremely attractive tool for creating, manipulating, and analyzing sound. Its precision, possibilities for new timbres, and potential for fantastical automation make it a compelling platform for expression and experimentation - but only to the extent that we are able to express to the computer what to do, and how to do it. To this end, the programming language has perhaps served as the most general, and yet most precise and intimate interface between humans and computers. Furthermore, “domain-specific” languages can bring additional expressiveness, conciseness, and perhaps even different ways of thinking to their users. This thesis argues for the philosophy, design, and development of ChucK, a general-purpose programming language tailored for computer music. The goal is to create a language that is expressive and easy to write and read with respect to time and parallelism, and to provide a platform for precise audio synthesis/analysis and rapid experimentation in computer music. In particular, ChucK provides a syntax for representing information flow, a new time-based concurrent programming model that allows programmers to flexibly and precisely control the flow of time in code (we call this “strongly-timed”), and facilities to develop programs on-the-fly - as they run. A ChucKian approach to live coding as a new musical performance paradigm is also described. In turn, this motivates the Audicle, a specialized graphical environment designed to facilitate on-the-fly programming, to visualize and monitor ChucK programs in real-time, and to provide a platform for building highly customizable user interfaces. In addition to presenting the ChucK programming language, a history of music and programming is provided (Chapter 2), and the various aspects of the ChucK language are evaluated in the context of computer music research, performance, and pedagogy (Chapter 6). 
As part of an extensive case study, the thesis discusses ChucK as a primary teaching and development tool in the Princeton Laptop Or- chestra (PLOrk), which continues to be a powerful platform for deploying ChucK 1) to teach topics ranging from programming to sound synthesis to music com- position, and 2) for crafting new instruments, compositions, and performances for computer-mediated ensembles. Additional applications are also described, including classrooms, live coding arenas, compositions and performances, user studies, and integrations of ChucK into other software systems. The contributions of this work include the following. 1) A time-based pro- gramming mechanism (both language and underlying implementation) for ultra- precise audio synthesis, naturally extensible to real-time audio analysis. 2) A non- preemptive, time/event-based concurrent programming model that provides fun- damental flexibility and readability without incurring many of the difficulties of programming concurrency. 3) A ChucKian approach to writing code and design- ing audio programs on-the-fly. This rapid prototyping mentality has potentially wide ramifications in the way we think about coding audio, in designing/testing software (particular for real-time audio), as well as new paradigms and practices in computer-mediated live performance. 4) The Audicle as a new type of audio programming environment that combines live development with visualizations. 5) Extended case studies of using, teaching, composing, and performing with ChucK, most prominently in the Laptop Orchestra. These show the power of teaching programming via music, and vice versa - and how these two disciplines can reinforce each other.
Conference Paper
Full-text available
There has been an ongoing effort to turn mobile phones into generic platforms or musical expression. By generic we mean useable in a wide range of expressive settings, where the enabling technology has minimal influence on the core artistic expression itself. We describe what has been achieved so far and outline a number of open challenges.
Article
Full-text available
This article explores the role of symbolic score data in the authors' mobile music-making applications, as well as the social sharing and community-based content creation workflows currently in use on their on-line musical network. Web-based notation systems are discussed alongside in-app visual scoring methodologies for the display of pitch, timing and duration data for instrumental and vocal performance. User-generated content and community-driven ecosystems are considered alongside the role of cloud-based services for audio rendering and streaming of performance data.
Conference Paper
Full-text available
In the paper, we chronicle the instantiation and adventures of the Stanford Laptop Orchestra (SLOrk), an ensemble of laptops, humans, hemispherical speaker arrays, interfaces, and, more recently, mobile smart phones. Motivated to deeply explore computer-mediated live performance, SLOrk provides a platform for research, instrument design, sound design, new paradigms for composition, and performance. It also offers a unique classroom combining music, technology, and live performance. Founded in 2008, SLOrk was built from the ground-up by faculty and students at Stanford University's Center for Computer Research in Music and Acoustics (CCRMA). This document describes 1) how SLOrk was built, 2) its initial performances, and 3) the Stanford Laptop Orchestra as a classroom. We chronicle its present, and look to its future.
Thesis
The computer has long been considered an extremely attractive tool for creating, manipulating, and analyzing sound. Its precision, possibilities for new timbres, and potential for fantastical automation make it a compelling platform for expression and experimentation - but only to the extent that we are able to express to the computer what to do, and how to do it. To this end, the programming language has perhaps served as the most general, and yet most precise and intimate interface between humans and computers. Furthermore, domain-specific languages can bring additional expressiveness, conciseness, and perhaps even different ways of thinking to their users. This thesis argues for the philosophy, design, and development of ChucK, a general-purpose programming language tailored for computer music. The goal is to create a language that is expressive and easy to write and read with respect to time and parallelism, and to provide a platform for precise audio synthesis/analysis and rapid experimentation in computer music. In particular, ChucK provides a syntax for representing information flow, a new time-based concurrent programming model that allows programmers to flexibly and precisely control the flow of time in code (we call this strongly-timed), and facilities to develop programs on-the-fly - as they run. A ChucKian approach to live coding as a new musical performance paradigm is also described. In turn, this motivates the Audicle, a specialized graphical environment designed to facilitate on-the-fly programming, to visualize and monitor ChucK programs in real-time, and to provide a platform for building highly customizable user interfaces. In addition to presenting the ChucK programming language, a history of music and programming is provided (Chapter 2), and the various aspects of the ChucK language are evaluated in the context of computer music research, performance, and pedagogy (Chapter 6). 
As part of an extensive case study, the thesis discusses ChucK as a primary teaching and development tool in the Princeton Laptop Orchestra (PLOrk), which continues to be a powerful platform for deploying ChucK 1) to teach topics ranging from programming to sound synthesis to music composition, and 2) for crafting new instruments, compositions, and performances for computer-mediated ensembles. Additional applications are also described, including classrooms, live coding arenas, compositions and performances, user studies, and integrations of ChucK into other software systems. The contributions of this work include the following. 1) A time-based programming mechanism (both language and underlying implementation) for ultra-precise audio synthesis, naturally extensible to real-time audio analysis. 2) A non-preemptive, time/event-based concurrent programming model that provides fundamental flexibility and readability without incurring many of the difficulties of programming concurrency. 3) A ChucKian approach to writing code and designing audio programs on-the-fly. This rapid prototyping mentality has potentially wide ramifications in the way we think about coding audio, in designing/testing software (particular for real-time audio), as well as new paradigms and practices in computer-mediated live performance. 4) The Audicle as a new type of audio programming environment that combines live development with visualizations. 5) Extended case studies of using, teaching, composing, and performing with ChucK, most prominently in the Laptop Orchestra. These show the power of teaching programming via music, and vice versa - and how these two disciplines can reinforce each other.
Chapter
The mass-scale adoption of modern mobile computing technology presents immense potential to reshape the way we engage one another socially, creatively, and musically. This article explores ad hoc music-making and improvisatory performance on a massive scale, leveraging personal interactive mobile instruments (e.g., via iPhones and iPads), location-awareness (via GPS), and the connective social potential of cloud computing. We investigate a new social/musical improvisatory context that doesn't exist in any single location, but as a network and community of anonymous but connected participants around the world. As case studies, we will draw from experiences with the Stanford Mobile Phone Orchestra, as well as the community of Smule social/proto-musical experiences on mobile devices, including Ocarina, I Am T-Pain, and MadPad. We reflect on these experiences in the context of a new type of "anytime, anywhere" music made with mobile devices.
Conference Paper
Mobile devices, such as smartphones and tablets, are becoming indispensable to everyday life, connecting us in a powerful network through services accessible via web browsers or mobile applications. In conjunction with recent explorations of physical interaction techniques for making music on mobile devices, the mobile platform can be regarded as an attractive solution for designing music performances based on audience participation. Using smartphones to enable audience participation not only offers convenience, but also tends to induce an engaging and interactive social experience. In this paper, we first take a look at two separate phenomena of interest to us: the rise of mobile music and the design of the audience-participation performance paradigm. Then we present techniques for enabling audience participation based primarily on smartphones, as explored by the Stanford Mobile Phone Orchestra. We evaluate these techniques and consider the future of social music interactions aided by mobile technology.