Computer Music Journal

Published by MIT Press

Online ISSN: 1531-5169


Print ISSN: 0148-9267


The Lucasfilm Audio Signal Processor
  • Conference Paper

June 1982



The requirements of audio processing for motion pictures present several special problems that make digital processing of audio both very desirable and relatively difficult. The difficulties can be summarized as follows: (1) Large amounts of numerical computation are required, on the order of 2 million integer multiply-adds per second per channel of audio, for some number of channels. (2) The exact processing involved changes in real time but must not interrupt the flow of audio data. (3) Large amounts of input/output capacity are necessary, simultaneous with numerical calculation and changes to the running program, on the order of 1.6 million bits per second per channel of audio. To this end, the digital audio group at Lucasfilm is building a number of audio signal processors whose architecture reflects the special problems of audio.
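As a rough sanity check, the per-channel figures above are mutually consistent if one assumes (hypothetically; the abstract does not state them) a 50-kHz sample rate and 16-bit samples:

```python
SAMPLE_RATE = 50_000     # samples/sec per channel (assumed, not stated above)
BITS_PER_SAMPLE = 16     # assumed sample word size

# 2 million multiply-adds per second per channel works out to a
# per-sample processing budget:
ops_per_sample = 2_000_000 / SAMPLE_RATE   # multiply-adds per sample

# 1.6 million bits/sec per channel is exactly one 16-bit stream in
# plus one 16-bit stream out at this rate:
io_bits = SAMPLE_RATE * BITS_PER_SAMPLE * 2
print(ops_per_sample, io_bits)
```

Under these assumptions each sample leaves a budget of about 40 multiply-adds per channel, roughly a 40-tap filter, which gives a feel for why dedicated hardware was attractive in 1982.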

Synthesis of Timbral Families by Warped Linear Prediction
A central problem in the production of music by computer is how to generate sounds with similar but different timbres; that is, how to synthesize a "family" of instruments that are perceived as distinguishable but similar in timbre. This paper describes a method for synthesizing a family of string-like instruments that proceeds as follows: First, a few actual violin notes are analyzed using linear prediction. Second, a bilinear transformation is applied to the linear prediction model on synthesis, using a recently described algorithm. This yields a computationally efficient way to generate a virtual string orchestra, with violin-, viola-, violoncello-, and bass-like timbres. A realization of a 4.7-minute piece composed by P. Lansky will be played.

Figure 1: Top-N ranking agreement scores from table 3 plotted as a grayscale image.
A Large-Scale Evaluation of Acoustic and Subjective Music Similarity Measures

November 2003



Adam Berenzweig



Daniel P. W. Ellis




Cambridge, MA, U.S.A.
Subjective similarity between musical pieces and artists is an elusive concept, but one that must be pursued in support of applications to provide automatic organization of large music collections. In this paper, we examine both acoustic and subjective approaches for calculating similarity between artists, comparing their performance on a common database of 400 popular artists. Specifically, we evaluate acoustic techniques based on Mel-frequency cepstral coefficients and an intermediate "anchor space" of genre classification, and subjective techniques which use data from The All Music Guide, from a survey, from playlists and personal collections, and from web-text mining.
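As a toy illustration of acoustic similarity (not the paper's actual models, which compare distributions of features in MFCC and anchor space), one can summarize each artist's frame-level features by a centroid and compare centroids with cosine similarity:

```python
import math

def centroid(frames):
    """Mean feature vector over an artist's frames (a toy stand-in
    for the distribution models evaluated in the paper)."""
    n = len(frames)
    dim = len(frames[0])
    return [sum(f[i] for f in frames) / n for i in range(dim)]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical 3-D "cepstral" frames for two artists:
artist_a = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1]]
artist_b = [[0.8, 0.3, 0.0], [1.1, 0.1, 0.2]]
print(cosine_similarity(centroid(artist_a), centroid(artist_b)))
```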

Figure 1. Performance of Csound as a function of block size. Very large blocks do not degrade performance, even though the large blocks do not fit in the cache.
Figure 2. Performance of a table-lookup oscillator as a function of phase increment. Execution time is directly related to the number of fetches per cache line, which in turn is controlled by the phase increment.
Figure 3. Performance gain is plotted as a function of how many simple filters are merged with a table-lookup oscillator.
Figure 5. Benchmark execution time as a function of table size. Access to tables is either "Normal" as necessary to compute a roughly 440Hz tone, or "Sequential," for which the oscillator artificially reads memory words consecutively.
Benchmark execution time under different schemes for handling control signals.
Real-Time Software Synthesis on Superscalar Architectures

September 1997



Advances in processor technology will make it possible to use general-purpose personal computers as real-time signal processors. This will enable highly-integrated "all-software" systems for music processing. To this end, the performance of a present generation superscalar processor running synthesis software is measured and analyzed. A real-time reimplementation of Fugue, now called Nyquist, takes advantage of the superscalar synthesis approach, integrating symbolic and signal processing. Performance of Nyquist is compared to Csound.
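The table-lookup oscillator measured in Figure 2 can be sketched as follows (a minimal Python model, not the Csound or Nyquist implementation). The phase increment determines how far apart successive table fetches land in memory, which is what drives the cache behavior described above:

```python
import math

TABLE_SIZE = 1024
SR = 44_100.0

# Precomputed sine wavetable
table = [math.sin(2 * math.pi * i / TABLE_SIZE) for i in range(TABLE_SIZE)]

def oscillator(freq, n):
    """Truncating table-lookup oscillator. The phase increment
    (freq * TABLE_SIZE / SR) controls the stride of table fetches,
    hence the number of fetches per cache line."""
    phase = 0.0
    incr = freq * TABLE_SIZE / SR
    out = []
    for _ in range(n):
        out.append(table[int(phase) % TABLE_SIZE])
        phase += incr
    return out

samples = oscillator(440.0, 64)
```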

A Brief Survey of Music Representation Issues, Techniques, and Systems

November 1998



Abstractly, there is a mathematical function that maps beat numbers to time and an inverse function that maps time to beats. (Some special mathematical care must be taken to allow for music that momentarily stops, instantaneously skips ahead, or is allowed to jump backwards, but we will ignore these details.) One practical representation of the beat-to-time function is the set of times of individual beats. For example, a performer can tap beats in real time and a computer can record the time of each tap. This produces a finitely sampled representation of the continuous mapping from beats to time, which can be interpolated to obtain intermediate values. MIDI Clocks, which are transmitted at 24 clocks per quarter note, are an example of this approach. The Mockingbird score editor [Maxwell 84] interface also supported this idea. The composer would play a score, producing piano-roll notation. Then, downbeats were indicated by clicking with a mouse at the proper locations in the piano-roll score....
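The tapped-beat representation can be sketched directly (a hypothetical helper, not from the survey): linear interpolation between recorded tap times yields times for fractional beat numbers.

```python
def beat_to_time(tap_times, beat):
    """Map a (possibly fractional) beat number to seconds by linearly
    interpolating between recorded tap times, as in the sampled
    beat-to-time representation described above."""
    i = int(beat)
    if i >= len(tap_times) - 1:
        # extrapolate past the last tap using the final inter-tap interval
        last = tap_times[-1] - tap_times[-2]
        return tap_times[-1] + (beat - (len(tap_times) - 1)) * last
    frac = beat - i
    return tap_times[i] + frac * (tap_times[i + 1] - tap_times[i])

# Taps at slightly uneven times, as a human performer would produce:
taps = [0.0, 0.52, 1.01, 1.55]
print(beat_to_time(taps, 1.5))   # halfway between beats 1 and 2
```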

Figure 1: The curve shows the response of Terhardt's outer and middle ear model. The dotted lines mark the center frequencies of the critical-bands. For our work we use the first 20 bands.
Figure 2: Aligned-SOMs trained with a small animal dataset showing changes in the organization, (a) first layer with weighting ratio 1:0 between appearance and activity features, (b) ratio 3:1, (c) ratio 1:1, (d) ratio 1:3, (e) last layer with ratio 0:1. The shadings represent the density calculated using SDH (n = 2 with bicubic interpolation).
Figure 4: Codebooks depicting the underlying organization. (a) and (b) represent the codebooks of the aligned-SOM organized according to periodicity histograms while (c) and (d) are organized according to the spectrum histograms. The visualization is the same as in Figure 3 with a different color scale.  
Exploring Music Collections by Browsing Different Views

November 2003



The availability of large music collections calls for ways to efficiently access and explore them. We present a new approach which combines descriptors derived from audio analysis with meta-information to create different views of a collection. Such views can have a focus on timbre, rhythm, artist, style or other aspects of music. For each view the pieces of music are organized on a map in such a way that similar pieces are located close to each other. The maps are visualized using an Islands of Music metaphor where islands represent groups of similar pieces. The maps are linked to each other using a new technique to align self-organizing maps. The user is able to browse the collection and explore different aspects by gradually changing focus from one view to another. We demonstrate our approach on a small collection using a meta-information-based view and two views generated from audio analysis, namely, beat periodicity as an aspect of rhythm and spectral information as an aspect of timbre.
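For background, a self-organizing map in its plain form can be sketched as below (a toy 1-D SOM on made-up data; the paper's contribution, aligning several such maps to support gradual changes of view, is not implemented here):

```python
import math
import random

def train_som(data, n_units, epochs=50, seed=0):
    """Minimal 1-D self-organizing map: each unit holds a codebook
    vector; the best-matching unit and its neighbors move toward each
    training sample, with learning rate and neighborhood shrinking
    over time."""
    rng = random.Random(seed)
    dim = len(data[0])
    units = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        lr = 0.5 * (1 - epoch / epochs)                    # decaying rate
        radius = max(1.0, n_units / 2 * (1 - epoch / epochs))
        for x in data:
            # best-matching unit by squared distance
            bmu = min(range(n_units),
                      key=lambda u: sum((units[u][d] - x[d]) ** 2
                                        for d in range(dim)))
            for u in range(n_units):
                h = math.exp(-((u - bmu) ** 2) / (2 * radius ** 2))
                for d in range(dim):
                    units[u][d] += lr * h * (x[d] - units[u][d])
    return units

# Two hypothetical clusters of 2-D "song descriptors":
data = [[0.1, 0.1], [0.15, 0.05], [0.9, 0.9], [0.85, 0.95]]
units = train_som(data, n_units=4)
```

After training, the map's units span the two clusters, so nearby units represent similar pieces, the property the Islands of Music visualization relies on.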

The Canon Score Language

July 1996



Canon is both a notation for musical scores and a programming language. Canon offers a combination of declarative style and a powerful abstraction capability which allows a very high-level notation for sequences of musical events and structures. Transformations are operators that can adjust common parameters such as loudness or duration. Transformations can be nested and time-varying, and their use avoids the problem of having large numbers of explicit parameters. Behavioral abstraction, the concept of making behavior an arbitrary function of the environment, is supported by Canon and extends the usefulness of transformations. A non-real-time implementation of Canon is based on Lisp and produces scores that control MIDI synthesizers. Introduction Canon is a computer language designed to help composers create note-level control information for hardware synthesizers or synthesis software. Canon was motivated by my need for a simple yet powerful language for teaching second-semester stud...

Machine tongues XVIII: a child's garden of sound file formats

March 1995



This article introduces a few of the many ways that sound data can be stored in computer files, and describes several of the file formats that are in common use for this purpose. This text is an expanded and edited version of a "frequently asked questions" (FAQ) document that is updated regularly by one of the authors (van Rossum). Extensive references are given here to printed and network-accessible machine-readable documentation and source code resources. The FAQ document is regularly posted to the USENET electronic news groups alt.binaries.sounds and comp.dsp for maximal coverage of people interested in digital audio, and to comp.answers, for easy reference. It is available by anonymous Internet file transfer. (Draft: to appear in Computer Music Journal 19:1, Spring 1995.)

Combining Event and Signal Processing in the MAX Graphical Programming Environment

February 1970



This paper describes how MAX has been extended on the NeXT to do signal as well as control computations. Since MAX as a control environment has already been described elsewhere, here we will offer only an outline of its control aspect as background for the description of its signal processing extensions.

Figure 2: MODE system software components
Figure 8: TR-Tree editor examples
The Interim DynaPiano: An Integrated Computer Tool and Instrument for Composers

January 1996



(From Computer Music Journal 16(3), Fall 1992.) The MODE software system consists of Smalltalk-80 classes that address five areas: the representation of musical parameters, sampled sounds, events, and event lists; the description of middle-level musical structures; real-time MIDI, sound I/O, and DSP scheduling; a user interface framework and components for building signal, event, and structure processing applications; and several built-in end-user applications. We will address each of these areas in the sections below. Figure 2 is an attempted schematic presentation of the relationships between the class categories and hierarchies that make up the MODE environment. The items in Fig. 2 are the MODE packages, each of which consists of a collection or hierarchy of Smalltalk-80 classes. The items that are displayed in courier font are ...

Problems and Prospects for Intimate Musical Control of Computers

December 2001



In this paper we describe our efforts towards the development of live performance computer-based musical instrumentation. Our design criteria include initial ease of use coupled with a long term potential for virtuosity, minimal and low variance latency, and clear and simple strategies for programming the relationship between gesture and musical result. We present custom controllers and unique adaptations of standard gestural interfaces, a programmable connectivity processor, a communications protocol called Open Sound Control (OSC), and a variety of metaphors for musical control. We further describe applications of our technology to a variety of real musical performances and directions for future research.
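A minimal encoder for an OSC message with float arguments, following the OSC 1.0 binary layout (strings null-terminated and padded to 4-byte boundaries, a type-tag string beginning with ',', big-endian 32-bit floats). The address `/synth/freq` is a made-up example, not one defined by the paper:

```python
import struct

def pad4(b: bytes) -> bytes:
    """OSC strings are null-terminated and padded to 4-byte boundaries."""
    b += b"\x00"
    return b + b"\x00" * (-len(b) % 4)

def osc_message(address: str, *floats: float) -> bytes:
    """Encode a minimal OSC message carrying only float32 arguments:
    address pattern, then type-tag string, then big-endian floats."""
    msg = pad4(address.encode("ascii"))
    msg += pad4(("," + "f" * len(floats)).encode("ascii"))
    for x in floats:
        msg += struct.pack(">f", x)
    return msg

packet = osc_message("/synth/freq", 440.0)
```

In practice such a packet would be sent over UDP to an OSC-aware synthesis server.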

Something Digital

October 1998



...and to its importance to the music-making process. It exists if only because musicians think it exists; that's enough if it changes the way they play music. To make live music with the hypothetical Computer Music Workstation of the first paragraph we must make it work in real time: connect one or more gestural input devices to it, compute each sample only slightly in advance of the time it is needed for conversion by the DACs, and make the sample computation dependent in some musically useful way on what has come in recently from the live input. The closer the rapport between the live input and what comes out of the speaker, the longer the audience will stay awake. This rapport is the crux of live computer music. A part of the quest for a better Computer Music Workstation is to make it easy to establish real-time musical control over the computer. I am against trying to set the computer up as a musical performer. Large software systems which try to instill "musical intelligence" ...

EIN: A Signal Processing Scratchpad

September 1998



... time and frequency domains. The user provides a script in a language that is a superset of C. The EIN system then compiles it and provides a wrapper that calls the script for each time sample from t = 0 up to the specified number of samples, nsamps; computes the next output sample y (for the mono case), or the variables left and right (for the stereo case); and writes the output samples to an output sound file, formatted for the sampling rate of sr samples/sec. The int variables t and nsamps, and the float variables y, left, right, and sr, are reserved by EIN and should not be used for other purposes. The mono/stereo option and the values of sr and nsamps are selected with radio buttons on the interface. The remaining reserved names ...
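The wrapper described above can be modeled in miniature (a Python stand-in; real EIN compiles the user's C-superset script and writes a sound file rather than returning a list):

```python
import math

def run_ein_script(script, nsamps, sr=44100.0):
    """Toy model of the EIN wrapper (mono case): call the user's
    per-sample script for t = 0 .. nsamps-1 and collect each output
    sample y. Names t, y, sr, and nsamps mirror EIN's reserved
    variables."""
    out = []
    for t in range(nsamps):
        y = script(t, sr)
        out.append(y)
    return out

# A user "script": a 1-kHz sine tone, one expression per sample.
samples = run_ein_script(
    lambda t, sr: math.sin(2 * math.pi * 1000.0 * t / sr),
    nsamps=8)
```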

Musical Applications of Electric Field Sensing

January 1998



The Theremin was one of the first electronic musical instruments, yet it provides a degree of expressive real-time control that remains lacking in most modern electronic music interfaces. Underlying the deceptively simple capacitance measurement used by it and its descendants are a number of surprisingly interesting current-transport mechanisms that can be used to inexpensively, unobtrusively, robustly, and remotely detect the position of people and objects. We review the relevant physics, describe appropriate measurement instrumentation, and discuss applications that began with capturing virtuosic performance gesture on traditional stringed instruments and evolved into the design of new musical interfaces.

A Framework For Representing Knowledge About Synthesizer Programming

January 1997



We aim at capturing the know-how of expert patch programmers to build more productive human interfaces for commercial synthesizers. We propose a framework to represent this superficial layer of knowledge that can be used to help musicians program commercial synthesizer patches. Our key idea is to classify sounds according to the transformations experts can apply to them. We propose a dual representation of sounds combining object-oriented programming with classification mechanisms. We illustrate our framework with a prototype system that helps program Korg 05R/W synthesizers. Key words: computer-aided synthesis, description logics, object-oriented programming, Smalltalk. From Computer-Aided Synthesis to Computer-Aided Synthesizer Programming: Our framework stems from the following remark. In an intensive care unit, a typical nurse knows perfectly well how to use an infusion pump. However, the nurse may be an "expert" in infusion pump manipulation without necessarily having any theoretical ...

Figure 1. A Nyquist sound expression and resulting representation.  
Figure 2. Sound representation in Nyquist.  
Figure 3. Sounds may have leading zeros, trailing zeros, and internal gaps.  
Figure 4. Representation for sound termination. Two sounds are shown, each with one more block to read before termination. (defun short-roll () (mult (drum-roll) (drum-envelope))) This just multiplies a drum roll by a finite-length envelope. When the envelope goes to zero, the multiplication suspension will notice that one operand (the envelope) is zero. Therefore, the product is zero, and the suspension can link its output to the terminal (zero) list node. The suspension frees itself and reference counting and garbage collection dispose of the remaining drum roll. This kind of recursive infinite structure might also be used in granular synthesis (Roads 1991). A granular synthesis instrument can generate potentially infinite sounds that need only to be multiplied by an envelope to obtain the desired amplitude and duration.  
Figure 5. Optimization of add when one operand terminates and one remains.
The Implementation of Nyquist, A Sound Synthesis Language

November 1996



Nyquist is a functional language for sound synthesis with an efficient implementation. It is shown how various language features lead to a rather elaborate representation for signals, consisting of a sharable linked list of sample blocks terminated by a suspended computation. The representation supports infinite sounds, allows sound computations to be instantiated dynamically, and dynamically optimizes the sound computation.
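The lazy, terminating behavior described in Figure 4 can be imitated with generators (a Python analogy, not Nyquist's actual shared block-list representation): a finite envelope terminates its product with an infinite sound, so the rest of the infinite operand is never computed.

```python
from itertools import count

def drum_roll():
    """A potentially infinite sound, produced lazily one sample at a
    time (standing in for Nyquist's suspended computations)."""
    for n in count():
        yield 1.0 if n % 4 == 0 else 0.25     # hypothetical waveform

def envelope(length):
    """A finite envelope whose exhaustion signals termination."""
    for n in range(length):
        yield 1.0 - n / length

def mult(a, b):
    """Pointwise product that terminates as soon as either operand
    terminates, like the multiply suspension described above: the
    remainder of the infinite operand is simply never evaluated."""
    for x, y in zip(a, b):
        yield x * y

short_roll = list(mult(drum_roll(), envelope(4)))
```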

Machine Tongues XI: Object-Oriented Software Design

January 1996



The design technique of abstraction can be used to increase the amount of reuse within a set of classes by describing abstract classes that embody some of their shared state and/or behavior. One uses abstraction to construct a hierarchy relating several concrete classes to common abstract classes for better reuse of state and behavior. This emphasizes sharing of both state and behavior; abstract classes often implement algorithms in terms of methods that are overridden by concrete subclasses. The result of abstraction is a more horizontal hierarchy or a new protocol family. A typical abstraction example is building a model of graphical objects for an object-oriented drawing package. The first pass includes several concrete classes ...
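The drawing-package example reads naturally as code (a sketch with hypothetical class names; the article's own examples are in Smalltalk-80):

```python
from abc import ABC, abstractmethod

class GraphicalObject(ABC):
    """Abstract class embodying shared state and behavior; concrete
    subclasses override only the primitive they differ in."""
    def __init__(self, x, y):
        self.x, self.y = x, y            # shared state: position

    def move(self, dx, dy):              # shared behavior
        self.x += dx
        self.y += dy

    @abstractmethod
    def area(self):                      # overridden by concrete classes
        ...

class Circle(GraphicalObject):
    def __init__(self, x, y, r):
        super().__init__(x, y)
        self.r = r
    def area(self):
        return 3.141592653589793 * self.r ** 2

class Square(GraphicalObject):
    def __init__(self, x, y, side):
        super().__init__(x, y)
        self.side = side
    def area(self):
        return self.side ** 2

shapes = [Circle(0, 0, 1.0), Square(2, 2, 3.0)]
```

Code that moves or measures shapes is written once against the abstract protocol, which is the reuse the passage describes.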

Figure 5: Steps of the SWSS "Compilation" Process.

FM Example. The example shown in Fig. 6 is a simple frequency modulation (FM) instrument (Chowning 1973), in which a single modulator wave is used with exponential ADSR (attack-decay-sustain-release) envelopes for the amplitude and modulation index. It uses two line-segment envelope unit generators and sine-wave oscillators for the modulation and carrier waveforms. The parameters of this instrument are: p1 = note command; p2 = instrument name or number; p3 = start time (in seconds); p4 = duration (in seconds); p5 = amplitude (in the range from 0 to 1); p6 = fundamental pitch (in symbolic pitch units); p7 = carrier-to-modulation frequency ratio; p8 = index of modulation; p9 = amplitude attack time (in seconds); p10 = amplitude decay time (in seconds); p11 = modulation index attack time (in seconds); and p12 = modulation index decay time (in seconds).
Figure 6(c) shows the instrument, with the data flowing from the note parameters into the envelope generators, then to the modulator and carrier signal oscillators, then to the output.
Figure 6: Exponential envelope and FM instrument block diagram: (a) exponential-segment envelope; (b) env unit generator; (c) 2-oscillator FM with amplitude and modulation index envelopes. The note command lists the instrument's parameters in order:
p1=note, p2=name, p3=start, p4=dur, p5=ampl, p6=pitch, p7=c:m, p8=index, p9=att, p10=dec, p11=i_att, p12=i_dec
Figure 8: Sound file mixing instrument definition-(a) sound file reader UG; (b) file mixer instrument
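The 2-oscillator FM instrument of Figure 6 can be sketched as follows (a minimal model: simple linear decay envelopes stand in for the exponential ADSR envelopes, and parameter names loosely follow p4-p8 above):

```python
import math

def fm_note(dur, ampl, pitch_hz, cm_ratio, index, sr=44100):
    """Single-modulator FM in the style of Chowning (1973): the
    carrier runs at pitch_hz, the modulator at pitch_hz / cm_ratio,
    and the instantaneous modulation index is index scaled by a toy
    linear decay envelope."""
    n = int(dur * sr)
    fm = pitch_hz / cm_ratio
    out = []
    for i in range(n):
        t = i / sr
        env = 1.0 - i / n                    # toy decay envelope
        phase_mod = index * env * math.sin(2 * math.pi * fm * t)
        out.append(ampl * env *
                   math.sin(2 * math.pi * pitch_hz * t + phase_mod))
    return out

note = fm_note(dur=0.01, ampl=0.5, pitch_hz=440.0, cm_ratio=1.0, index=2.0)
```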
Machine Tongues XV: Three Packages for Software Sound Synthesis

January 1996



This article will discuss the technology of SWSS and then present and compare these three systems. It is divided into three parts: the first introduces SWSS in terms of progressive examples; part two compares the three systems using the same two instrument/score examples written in each of them; the final section presents informal benchmark tests of the systems run on two different hardware platforms (a Sun Microsystems SPARCstation-2 IPX and a NeXT Computer Inc. TurboCube machine) and subjective comments on various features of the languages and programming environments of state-of-the-art SWSS software. This author's connection with this topic is that of extensive experience with several different SWSS systems over the last 15 years, starting with MUS10 and including all three compared here: Csound (in the form of Music-11 initially) at the CMRS studio in Salzburg (Pope 1982); cmusic in the CARL environment on PCS/Cadmus computers in Munich (Pope 1986); and more recently a combination of cmix, Csound, and various vocoder software packages with user interfaces written in Smalltalk-80 at CCRMA, the Center for Computer Research in Music and Acoustics at Stanford University (Pope 1992). (Computer Music Journal 17(2), Summer 1993.)

Stockhausen on Electronics, 2004

December 2008



Licht-Bilder is the final piece of Karlheinz Stockhausen's 29-hour opera cycle Licht, on which he worked for 27 years. The work was complemented by a video production: an elaborate auditorium sound system and a fully redundant set of stage equipment were provided, and the videos were projected onto triangular sails designed for the stage. The piece offers a kind of run-through of all seven days of the opera, and the dense interconnection of its diverse materials is central to Stockhausen's way of thinking. Pitch, rhythm, and duration were left unaltered, while tempi and dynamics were given interpretative leeway. In the acousmatic music, the construction of the sound elements and their sounding result stand in a dialectical relationship.
