Perry R. Cook

Princeton University | PU · Department of Computer Science

PhD, Electrical Engineering, Stanford University

About

Publications: 228
Reads: 75,458
Citations: 9,933


Publications (228)
Preprint
Full-text available
This paper will present observations on the design, artistic, and human factors of creating digital music controllers. Specific projects will be presented, and a set of design principles will be supported from those examples.
Chapter
On-the-fly programming is a style of programming in which the programmer/performer/composer augments and modifies the program while it is running, without stopping or restarting, in order to assert expressive, programmable control at runtime. Because of the fundamental powers of programming languages, we believe the technical and aesthetic aspects...
Chapter
This paper describes the design of an Electronic Sitar controller, a digitally modified version of Saraswati’s (the Hindu Goddess of Music) 19-stringed, pumpkin shelled, traditional North Indian instrument. The ESitar uses sensor technology to extract gestural information from a performer, deducing music information such as pitch, pluck timing, thu...
Chapter
This paper gives a historical overview of the development of alternative sonic display systems at Princeton University; in particular, the design, construction, and use in live performance of a series of spherical and hemispherical speaker systems. We also provide a DIY guide to constructing the latest series of loudspeakers that we are currently u...
Chapter
This paper will present observations on the design, artistic, and human factors of creating digital music controllers. Specific projects will be presented, and a set of design principles will be supported from those examples.
Chapter
We draw on our experiences with the Princeton Laptop Orchestra to discuss novel uses of the laptop’s native physical inputs for flexible and expressive control. We argue that instruments designed using these built-in inputs offer benefits over custom standalone controllers, particularly in certain group performance settings; creatively thinking abo...
Article
The Communications Web site, http://cacm.acm.org, features more than a dozen bloggers in the BLOG@CACM community. In each issue of Communications, we'll publish selected posts or excerpts. Follow us on Twitter at http://twitter.com/blogCACM. http://cacm.acm.org/blogs/blog-cacm. Perry R. Cook considers the career path that led him to STEAM.
Article
Full-text available
ChucK is a programming language designed for computer music. It aims to be expressive and straightforward to read and write with respect to time and concurrency, and to provide a platform for precise audio synthesis and analysis and for rapid experimentation in computer music. In particular, ChucK defines the notion of a strongly timed audio progra...
Article
Full-text available
This paper gives a historical overview of the development of alternative sonic display systems at Princeton University; in particular, the design, construction, and use in live performance of a series of spherical and hemispherical speaker systems. We also provide a DIY guide to constructing the latest series of loudspeakers that we are currently u...
Chapter
This chapter covers algorithms, technologies, computer languages, and systems for computer music. Computer music involves the application of computers and other digital/electronic technologies to music composition, performance, theory, history, and the study of perception. The field combines digital signal processing, computational algorithms, comp...
Article
Banded waveguide (BWG) synthesis is an efficient method for real-time physical modeling of dispersive and multidimensional sounding objects, affording simulation of complex interactions, such as bowing. Current implementations, however, use nonphysical design parameters and produce a range of outputs that do not match equivalently designed modal an...
Article
This paper describes the development of an introductory curriculum in computer science modeled on a traditional Applied Introduction to Programming and Algorithms course sequence, but designed specifically for artists as a means of furthering their creative work. Computer science theory is presented in lecture/demos, with weekly assignments that co...
Article
With advances in algorithms for sound synthesis and processing, combined with inexpensive computational hardware and sensors, we can now easily build new types of musical instruments, and other real-time interactive expressive devices. These new "instruments" can leverage and extend the expertise of virtuoso performers, expand the palette of sounds...
Article
Full-text available
A weekly seminar consisting of seven composers and one computer scientist was convened for the purpose of exploring questions surrounding how technology can support aspects of the computer music composition process. The composers were introduced to an existing interactive software system for creating new musical interfaces and compositions, which t...
Chapter
By Miriam A. Kolar, with John W. Rick, Perry R. Cook, and Jonathan S. Abel. This study of ancient sound-producing instruments within a comprehensive archaeoacoustic investigation is greatly enhanced by an integrative methodology that explores interrelationships among instrumental and environmental acoustic dynamics, and considers their auditory per...
Conference Paper
Model evaluation plays a special role in interactive machine learning (IML) systems in which users rely on their assessment of a model's performance in order to determine how to improve it. A better understanding of what model criteria are important to users can therefore inform the design of user interfaces for model evaluation as well as the choi...
Article
Full-text available
[Editors’ note: We have solicited letters from six of Max Mathews’ colleagues and friends who knew him at different points in his life (and in their lives). Though sampled from very few of the people who knew and worked with Mathews, these letters paint a more detailed picture of how his accomplishments have impacted our community, and they make it...
Article
This paper describes a series of investigations into the use of sustainable methods for powering electronic musical instruments and perhaps ultimately a large ensemble such as the Princeton Laptop Orchestra, a collection of 15-25 metainstruments each consisting of a laptop computer, interfacing equipment and a hemispherical speaker. The research di...
Article
Full-text available
In 2001, a group of twenty Strombus galeatus marine shell trumpets were excavated at the 3000-year-old ceremonial center at Chavín de Huántar, Perú, marking the first documented contextual discovery of intact sound producing instruments at this Formative Period site in the Andean highlands. These playable shells are decorated, use-polished, and add...
Article
Manfred Schroeder died 28 December 2009 at age 83. Born in Germany on 12 July 1926, he combined his interests in mathematics with his boyhood experiments with radio to work on radar systems during World War II. After university, he divided his professional career (mostly simultaneously) between Bell Laboratories and the University of Göttingen, whe...
Article
Full-text available
Traditional musical instruments provide compelling metaphors for human-computer interfacing, both in terms of input (physical, gestural performance activities) and output (sound diffusion). The violin, one of the most refined and expressive of traditional instruments, combines a peculiar physical interface with a rich acoustic diffuser. We have bui...
Conference Paper
Full-text available
Recent research in machine learning has focused on breaking audio spectrograms into separate sources of sound using latent variable decompositions. These methods require that the number of sources be specified in advance, which is not always possible. To address this problem, we develop Gamma Process Nonnegative Matrix Factorization (GaP-NMF), a Ba...
Conference Paper
It is challenging to search a dictionary consisting of thousands of entries in order to select appropriate words for building written communication. This is true for people trying to communicate in a foreign language who have not developed a full vocabulary, for school children learning to write, for authors who wish to be more precise and exp...
Article
Full-text available
Finding words in an assistive communication device can be challenging and time-consuming for individuals with lexical access disorders like those caused by aphasia. These users have persistent difficulties accessing and retrieving words due to impaired semantic links in their mental lexicon. As a result, they can easily get lost in a vocabulary hie...
Conference Paper
Full-text available
Auditory displays have been used in both human-machine and computer interfaces. However, the use of non-speech audio in assistive communication for people with language disabilities, or in other applications that employ visual representations, is still under-investigated. In this paper, we introduce SoundNet, a linguistic database that associates n...
Conference Paper
Full-text available
It is challenging to navigate a dictionary consisting of thousands of entries in order to select appropriate words for building communication. This is particularly true for people with lexical access disorders like those present in aphasia. We make vocabulary navigation and word-finding easier by building a vocabulary network where links between wo...
Article
Full-text available
We propose a demonstration of The Wekinator, our software system that enables the application of machine-learning based music information retrieval techniques to real-time musical performance, and which emphasizes a richer human-computer interaction in the design of machine learning systems.
Article
Full-text available
Existing Augmentative and Alternative Communication vocabularies assign multimodal stimuli to words with multiple meanings. The ambiguity hampers the vocabulary effectiveness when used by people with language disabilities. For example, the noun "a missing letter" may refer to a character or a written message, and each corresponds to a different pic...
Article
Full-text available
This paper presents a digital waveguide woodwind instrument tonehole implementation which, in a single model, characterizes all states of the hole from open to closed. This efficient implementation produces results which agree well with previous acoustical analyses of the tonehole. A similar model is also presented for the register hole. A complete...
Conference Paper
Full-text available
In this paper we study how verbs are visually conveyed in daily communication contexts for both young and old adults. Four visual modes are compared: a single static image, a panel of four static images, an animation, and a video clip. The results reveal age effects, as well as performance differences introduced by lexical verb properties and visua...
Conference Paper
Full-text available
Many songs in large music databases are not labeled with semantic tags that could help users sort out the songs they want to listen to from those they do not. If the words that apply to a song can be predicted from audio, then those predictions can be used both to automatically annotate a song with tags, allowing users to get a sense of what qual-...
Conference Paper
Full-text available
In this paper, we present the design of ViVA, a visual vocabulary for aphasia. Aphasia is an acquired language disorder that causes variability of impairments affecting an individual's ability to speak, comprehend, read and write. Existing communication aids lack flexibility and adequate customization functionality, failing to address this variability...
Conference Paper
TAPESTREA is a sound design and composition framework that facilitates the creation of new sound from existing digital audio recordings, through interactive analysis, transformation and re-synthesis. During analysis, sound templates of different types are extracted using a variety of techniques. Each extracted template is transformed and synthesiz...
Conference Paper
Full-text available
In this paper, we introduce W2ANE, an Online Multimedia Language Assistant for individuals with aphasia, a language disorder that affects millions of people. W2ANE offers a rich online multimedia library (OMLA) supported by an adaptable and adaptive vocabulary scaffold (ViVA). The system, accessible over the Internet, provides a platform for applic...
Conference Paper
TAPESTREA is a sound design and composition framework that facilitates the creation of new sound from existing digital audio recordings, through interactive analysis, transformation and re-synthesis. During analysis, sound templates of different types are extracted using a variety of techniques. Each extracted template is transformed and synthesize...
Conference Paper
Full-text available
The difficulties of navigating vocabulary in an assistive communication device are exacerbated for individuals with lexical access disorders like those due to aphasia. We present the design and implementation of a vocabulary network based on WordNet, a resource that attempts to model human semantic memory, that enables users to find words easily. T...
Conference Paper
Full-text available
People with aphasia, a condition that impairs the ability to understand or generate written or spoken language, are aided by assistive technology that helps them communicate through a vocabulary of icons. These systems are akin to language translation systems, translating icon arrangements into spoken or written language and vice versa. However, th...
Article
Full-text available
We describe our tool for interactively creating musical controller mappings using a "play-along" paradigm, in which a user pretends to play along with a musical score in real-time using an arbitrary input control modality. As the user "performs," a supervised machine learning system builds a training dataset from the user's gestures and the synthes...
Conference Paper
Full-text available
We explore the potential for and implications of musical (or proto-musical) social interaction and collaboration using currently available technologies embedded into mobile phones. The dynamics of this particular brand of social intercourse and the emergence of an associated aesthetic is described. The clichéd concept of a global village is made a...
Conference Paper
Full-text available
Computational support of creativity is a core concern of our daily work, as researchers and musicians working in computer music. We are enthusiastic about the prospect of attending the Computational Creativity Support workshop at CHI 2009, both to share our work on laptop orchestras and real-time machine learning in music performance, and to explor...
Article
Full-text available
Supervised learning methods have long been used to allow musical interface designers to generate new mappings by example. We propose a method for harnessing machine learning algorithms within a radically interactive paradigm, in which the designer may repeatedly generate examples, train a learner, evaluate outcomes, and modify parameters in real-ti...
Article
Full-text available
In this paper, we introduce an audio mosaicing technique based on performing posterior inference on a probabilistic generative model. Whereas previous approaches to concatenative synthesis and audio mosaicing have mostly tried to match higher-level descriptors of audio or individual STFT frames, we try to directly match the magnitude spectrogram...
Article
An experiment examined how listeners learn about the physical parameters of computer-generated sound sources. Across three learning sessions, participants used a computer interface to manipulate three physical parameters - number of colliding objects, system damping and resonant frequency - constraining each of eight sound objects. Kinetic energy w...
Article
We present an overview of digital audio analysis and synthesis methods for sound design and composition. The sonic landscape available to us contains a multitude of sounds, ranging from artificial to natural, purely musical to purely "real-world." To take full advantage of this diversity, it is helpful to have a comprehensive knowledge of the tools...
Conference Paper
Full-text available
Icons and digital images used in augmentative and alternative communication (AAC) are not as effective in illustrating verbs, especially for people with cognitive degeneration or impairment. Realistic videos have possible advantages for conveying verbs, as verified in our studies with young and old adults comparing single image, multiple images, an...
Conference Paper
Full-text available
In this paper, we discuss our recent additions of audio analysis and machine learning infrastructure to the ChucK music programming language, wherein we provide a complementary system prototyping framework for MIR researchers and lower the barriers to applying many MIR algorithms in live music performance. The new language capabilities preserve Chu...
Conference Paper
Full-text available
Machine learning techniques such as classification have proven to be vital tools in both music information retrieval and music performance, where they are useful for leveraging data to learn and model relationships between low-level features and high-level musical concepts. Explicitly supporting feature extraction and classification in a computer m...
Article
Full-text available
Sinusoidal tracks in TAPESTREA have too long been oppressed, unable to choose which other tracks they hang out with, and only able to be modified in fixed groups. They had no say in their own lives, no individuality. That ends today. We advocate FREEDOM in TAPESTREA by way of VTAPS, where every track can be interacted with as a distinct, FREE ent...
Article
Full-text available
The recently formed Princeton Laptop Orchestra (PLOrk) can be said to be the first ever orchestra conducted using a laptop computer. The feat is especially relevant as it developed strategies for control, sound design, spatialization, conductor roles, improvisation and instrument design. PLOrk is an ensemble of laptop-based instrumentalists with loca...
Article
Full-text available
This article chronicles our pedagogical adventures in the Princeton Laptop Orchestra (PLOrk). We introduce the PLOrk classroom as well as new approaches and tools for teaching. In doing so, we explore an integrated, naturally interdisciplinary educational environment for composition, performance, and computer science. In such an environment, the le...
Conference Paper
Full-text available
We develop a method for discovering the latent structure in MFCC feature data using the Hierarchical Dirichlet Process (HDP). Based on this structure, we compute timbral similarity between recorded songs. The HDP is a nonparametric Bayesian model. Like the Gaussian Mixture Model (GMM), it represents each song as a mixture of some number of multiv...
Article
Full-text available
A system to aid composition with analysis, transformation, and resynthesis of natural sounds is described. Sinusoidal analysis is used to isolate and extract deterministic sounds, and transients are also isolated/extracted, leaving the stochastic background sound which is parameterized by wavelet tree analysis. All of these components become templ...
Article
The views of categorisation presented in this paper along with my own are for the purpose of providing background for current taxonomic projects related to electroacoustic music (e.g. EARS: ElectroAcoustic Resource Site). The views might be summarised ...
Article
Full-text available
We present two physics-based analysis, synthesis, and control systems for synthesizing hand clapping sounds. They both rely on the separation of the sound synthesis and event generation, and both are capable of producing individual hand-claps, or mimicking the asynchronous/synchronized applause of a group of clappers. The synthesis models consist o...
Conference Paper
Full-text available
In this paper, we present a new programming model for performing audio analysis, spectral processing, and feature extraction in the ChucK programming language. The solution unifies analysis and synthesis in the same high-level, strongly-timed, and concurrent environment, extending and fully integrating with the existing language framework. In parti...
Conference Paper
Full-text available
We draw on our experiences with the Princeton Laptop Orchestra to discuss novel uses of the laptop's native physical inputs for flexible and expressive control. We argue that instruments designed using these built-in inputs offer benefits over custom standalone controllers, particularly in certain group performance settings; creatively thinking abo...
Article
Full-text available
A crucial set of decisions in digital musical instrument design deals with choosing mappings between parameters controlled by the performer and the synthesis algorithms that actually generate sound. Feature-based synthesis offers a way to parameterize audio synthesis in terms of the quantifiable perceptual characteristics, or features, the performe...
Article
Full-text available
This paper describes the FeatSynth framework, a set of open-source C++ classes intended to make it as easy as possible to integrate feature-based synthesis techniques into audio software. We briefly review the key ideas behind feature-based synthesis, and then discuss the framework's architecture. We emphasize design choices meant to make the frame...
Article
This chapter covers algorithms, technologies, computer languages, and systems for computer music. Computer music involves the application of computers and other digital/electronic technologies to music composition, performance, theory, history, and perception. The field combines digital signal processing, computational algorithms, computer language...
Conference Paper
Carotid stenoses are responsible for many of the strokes occurring each year. We have developed a low-cost-of-use instrument for use in primary-care physicians' offices that can detect these stenoses before they produce a stroke. Non-specialists are to use the instrument with only limited training, so it is desirable to make the Doppler audio it pr...
Article
Full-text available
We present a general framework for synthesizing audio manifesting arbitrary sets of perceptually motivated, quantifiable acoustic features. Much work has been done recently on finding acoustic features that describe perceptually relevant aspects of sound. The ability to synthesize sounds defined by arbitrary feature values would allow perception re...
Article
Advances in algorithms, hardware, and sensors now allow us to build new types of expressive musical devices based around small computer systems. These new “instruments” can leverage and extend the expertise of virtuoso performers, expand the palette of sounds available to composers, and encourage new ideas and participation from the young or the...
Conference Paper
Full-text available
We present a new paradigm and framework for creating high-quality “sound scenes” from a set of recordings. A sound scene is a combination of background and foreground sounds that together evoke the sense of being in a specific environment. The ability to craft and control sound scenes is important in entertainment (movies, TV, games), virtual/augme...
Conference Paper
Full-text available
We present a general framework for performing feature-based synthesis - that is, for producing audio characterized by arbitrarily specified sets of perceptually motivated, quantifiable acoustic features of the sort used in many music information retrieval systems.
Conference Paper
Full-text available
Emergence is the formation of complex patterns from simpler rules or systems. This paper motivates and describes new graphical interfaces for controlling sound designed for strongly-timed, collaborative computer music ensembles. While the interfaces are themselves minimal and often limiting, the overall collaboration can produce results novel beyon...
Conference Paper
Full-text available
In this paper, we describe the networking of multiple Integral Music Controllers (IMCs) to enable an entirely new method for creating music by tapping into the composite gestures and emotions of not just one, but many performers. The concept and operation of an IMC is reviewed as well as its use in a network of IMC controllers. We then introduce a...
Article
Full-text available
This paper describes VFerret, a content-based similarity search tool for continuous archived video. Instead of depending on attributes or annotations to search desired data from long-time archived video, our system allows users to perform content-based similarity search using visual and audio features, and to combine content-based similarity search...
Conference Paper
Full-text available
In this paper we report on the current state of the newly established Princeton Laptop Orchestra (PLOrk), a collection of 15 meta-instruments each consisting of a laptop computer, interfacing equipment, and a hemispherical speaker. Founded in the fall of 2005, PLOrk represents the first laptop ensemble of its size and kind, and brings together many...