The requirements of audio processing for motion pictures present several special problems that make digital processing of audio both very desirable and relatively difficult. The difficulties can be summarized as follows: (1) Large amounts of numerical computation are required, on the order of 2 million integer multiply-adds per second per channel of audio, for some number of channels. (2) The exact processing involved changes in real time but must not interrupt the flow of audio data. (3) A large input/output capacity is necessary, on the order of 1.6 million bits per second per channel of audio, simultaneous with numerical calculation and changes to the running program. To this end, the digital audio group at Lucasfilm is building a number of audio signal processors, the architecture of which reflects the special problems of audio.
A central problem in the production of music by computer is how to generate sounds with similar but different timbres; that is, how to synthesize a "family" of instruments that are perceived as distinguishable but similar in timbre. This paper describes a method for synthesizing a family of string-like instruments that proceeds as follows: First, a few actual violin notes are analyzed using linear prediction. Second, a bilinear transformation is applied to the linear prediction model on synthesis, using a recently described algorithm. This yields a computationally efficient way to generate a virtual string orchestra, with violin-, viola-, violoncello-, and bass-like timbres. A realization of a 4.7-minute piece composed by P. Lansky will be played.
Subjective similarity between musical pieces and artists is an elusive concept, but one that must be pursued in support of applications to provide automatic organization of large music collections. In this paper, we examine both acoustic and subjective approaches for calculating similarity between artists, comparing their performance on a common database of 400 popular artists. Specifically, we evaluate acoustic techniques based on Mel-frequency cepstral coefficients and an intermediate `anchor space' of genre classification, and subjective techniques which use data from The All Music Guide, from a survey, from playlists and personal collections, and from web-text mining.
Advances in processor technology will make it possible to use general-purpose personal computers as real-time signal processors. This will enable highly-integrated "all-software" systems for music processing. To this end, the performance of a present generation superscalar processor running synthesis software is measured and analyzed. A real-time reimplementation of Fugue, now called Nyquist, takes advantage of the superscalar synthesis approach, integrating symbolic and signal processing. Performance of Nyquist is compared to Csound.
ly, there is a mathematical function that maps beat numbers to time and an inverse function that maps time to beats. (Some special mathematical care must be taken to allow for music that momentarily stops, instantaneously skips ahead, or is allowed to jump backwards, but we will ignore these details.) One practical representation of the beat-to-time function is the set of times of individual beats. For example, a performer can tap beats in real time and a computer can record the time of each tap. This produces a finitely sampled representation of the continuous mapping from beats to time, which can be interpolated to obtain intermediate values. MIDI Clocks, which are transmitted at 24 clocks per quarter note, are an example of this approach. The Mockingbird score editor [Maxwell 84] interface also supported this idea. The composer would play a score, producing a piano-roll notation. Then, downbeats were indicated by clicking with a mouse at the proper locations in the piano roll score....
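The tapped-beat representation described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `beat_to_time` and `time_to_beat` are hypothetical, interpolation between taps is linear, and tap times are strictly increasing (so the stops, skips, and backward jumps mentioned in the text are ignored here as well).

```python
from bisect import bisect_right


def beat_to_time(tap_times, beat):
    """Map a (possibly fractional) beat number to seconds.

    tap_times[i] is the recorded time of beat i; values between taps
    are obtained by linear interpolation (an assumption of this sketch).
    """
    if len(tap_times) < 2 or not 0 <= beat <= len(tap_times) - 1:
        raise ValueError("beat outside the tapped range")
    i = min(int(beat), len(tap_times) - 2)  # index of the segment's left tap
    frac = beat - i
    return tap_times[i] + frac * (tap_times[i + 1] - tap_times[i])


def time_to_beat(tap_times, t):
    """Inverse mapping: seconds to (fractional) beat number."""
    if len(tap_times) < 2 or not tap_times[0] <= t <= tap_times[-1]:
        raise ValueError("time outside the tapped range")
    i = min(bisect_right(tap_times, t) - 1, len(tap_times) - 2)
    span = tap_times[i + 1] - tap_times[i]  # > 0 for strictly increasing taps
    return i + (t - tap_times[i]) / span


if __name__ == "__main__":
    taps = [0.0, 0.5, 1.0, 2.0]  # made-up tap times in seconds
    print(beat_to_time(taps, 2.5))
    print(time_to_beat(taps, 1.5))
```

Because both functions use the same piecewise-linear segments, each is the inverse of the other over the tapped range.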
The availability of large music collections calls for ways to efficiently access and explore them. We present a new approach which combines descriptors derived from audio analysis with meta-information to create different views of a collection. Such views can have a focus on timbre, rhythm, artist, style or other aspects of music. For each view the pieces of music are organized on a map in such a way that similar pieces are located close to each other. The maps are visualized using an Islands of Music metaphor where islands represent groups of similar pieces. The maps are linked to each other using a new technique to align self-organizing maps. The user is able to browse the collection and explore different aspects by gradually changing focus from one view to another. We demonstrate our approach on a small collection using a meta-information-based view and two views generated from audio analysis, namely, beat periodicity as an aspect of rhythm and spectral information as an aspect of timbre.
Canon is both a notation for musical scores and a programming language. Canon offers a combination of declarative style and a powerful abstraction capability which allows a very high-level notation for sequences of musical events and structures. Transformations are operators that can adjust common parameters such as loudness or duration. Transformations can be nested and time-varying, and their use avoids the problem of having large numbers of explicit parameters. Behavioral abstraction, the concept of making behavior an arbitrary function of the environment, is supported by Canon and extends the usefulness of transformations. A non-real-time implementation of Canon is based on Lisp and produces scores that control MIDI synthesizers.
Introduction
Canon is a computer language designed to help composers create note-level control information for hardware synthesizers or synthesis software. Canon was motivated by my need for a simple yet powerful language for teaching second-semester stud...
This article introduces a few of the many ways that sound data can be stored in computer files, and describes several of the file formats that are in common use for this purpose. This text is an expanded and edited version of a "frequently asked questions" (FAQ) document that is updated regularly by one of the authors (van Rossum). Extensive references are given here to printed and network-accessible machine-readable documentation and source code resources. The FAQ document is regularly posted to the USENET electronic news groups alt.binaries.sounds and comp.dsp for maximal coverage of people interested in digital audio, and to comp.answers, for easy reference. It is available by anonymous Internet file transfer
A Child's Garden of Sound File Formats DRAFT: To appear in Computer Music Journal 19:1 (Spring 1995) 2
This paper describes how MAX has been extended on the NeXT to do signal as well as control computations. Since MAX as a control environment has already been described elsewhere, here we will offer only an outline of its control aspect as background for the description of its signal processing extensions.
[Figure residue: MODE diagram with labels SampledSounds, sound UI tools, Mixes, Mix editors and browsers, other structures, MIDIVoices, SampleVoices, Kernel]
Pope: IDP 9 CMJ 16(3) - Fall, 1992
MODE: The Musical Object Development Environment
The MODE software system consists of Smalltalk-80 classes that address five areas: the representation of musical parameters, sampled sounds, events and event lists; the description of middle-level musical structures; real-time MIDI, sound I/O and DSP scheduling; a user interface framework and components for building signal, event, and structure processing applications; and several built-in end-user applications. We will address each of these areas in the sections below. Figure 2 is an attempted schematic presentation of the relationships between the class categories and hierarchies that make up the MODE environment. The items in Fig. 2 are the MODE packages, each of which consists of a collection or hierarchy of Smalltalk-80 classes. The items that are displayed in courier font are ...
In this paper we describe our efforts towards the development of live performance computer-based musical instrumentation. Our design criteria include initial ease of use coupled with a long term potential for virtuosity, minimal and low variance latency, and clear and simple strategies for programming the relationship between gesture and musical result. We present custom controllers and unique adaptations of standard gestural interfaces, a programmable connectivity processor, a communications protocol called Open Sound Control (OSC), and a variety of metaphors for musical control. We further describe applications of our technology to a variety of real musical performances and directions for future research.
nd to its importance to the musicmaking process. It exists if only because musicians think it exists; that's enough if it changes the way they play music. To make live music with the hypothetical Computer Music Workstation of the first paragraph we must make it work in real time: connect one or more gestural input devices to it, compute each sample only slightly in advance of the time it is needed for conversion by the DACs, and make the sample computation dependent in some musically useful way on what has come in recently from the live input. The closer the rapport between the live input and what comes out of the speaker, the longer the audience will stay awake. This rapport is the crux of live computer music. A part of the quest for a better Computer Music Workstation is to make it easy to establish real-time musical control over the computer. I am against trying to set the computer up as a musical performer. Large software systems which try to instill "musical intelligence" i
time and frequency domains. The user provides a script in a language that is a superset of C. The EIN system then compiles it and provides a wrapper that calls the script for each time sample from t = 0 up to the specified number of samples, nsamps; computes the next output sample y (for the mono case), or the variables left and right (for the stereo case); and writes the output samples to an output sound file, formatted for the sampling rate of sr samples/sec. The int variables t and nsamps, and the float variables y, left, right, and sr, are reserved by EIN, and should not be used for other purposes. The mono/stereo option, and the values of sr and nsamps are selected with radio buttons on the interface. The remaining reserved names
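The per-sample wrapper described above can be imitated in a few lines. The following is a hedged Python analogue, not EIN itself: EIN scripts are written in a C superset, and the names `render_mono`, `write_wav`, and `sine_script` are hypothetical. Only the roles of t, nsamps, sr, and y follow the text; the 16-bit WAV output format is an assumption of this sketch.

```python
import math
import struct
import wave


def render_mono(script, nsamps, sr):
    """Call script(t, sr) for t = 0 .. nsamps - 1 and collect each output sample y."""
    return [script(t, sr) for t in range(nsamps)]


def write_wav(path, samples, sr):
    """Write float samples in [-1, 1] as 16-bit mono PCM (format chosen for this sketch)."""
    frames = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, y)) * 32767)) for y in samples
    )
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(int(sr))
        w.writeframes(frames)


def sine_script(t, sr):
    """Example user script: a 440 Hz sine, computed one sample at a time."""
    return math.sin(2.0 * math.pi * 440.0 * t / sr)


if __name__ == "__main__":
    y = render_mono(sine_script, nsamps=44100, sr=44100.0)
    write_wav("ein_sketch.wav", y, 44100.0)
```

A stereo variant would have the script return a (left, right) pair and set two channels in the writer.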
The Theremin was one of the first electronic musical instruments, yet it provides a degree of expressive real-time control that remains lacking in most modern electronic music interfaces. Underlying the deceptively simple capacitance measurement used by it and its descendants are a number of surprisingly interesting current transport mechanisms that can be used to inexpensively, unobtrusively, robustly, and remotely detect the position of people and objects. We review the relevant physics, describe appropriate measurement instrumentation, and discuss applications that began with capturing virtuosic performance gesture on traditional stringed instruments and evolved into the design of new musical interfaces.
We aim at capturing the know-how of expert patch programmers to build more productive human interfaces for commercial synthesizers. We propose a framework to represent this superficial layer of knowledge, which can be used to help musicians program commercial synthesizer patches. Our key idea is to classify sounds according to the transformations experts can apply to them. We propose a dual representation of sounds combining object-oriented programming with classification mechanisms. We illustrate our framework with a prototype system that helps program Korg-05R/W synthesizers.
Key Words: Computer-Aided Synthesis, Description Logics, Object-Oriented programming, Smalltalk
From Computer Aided Synthesis to Computer Aided Synthesizer Programming
Our framework stems from the following remark. In an intensive care unit, a typical nurse knows perfectly well how to use an infusion pump. However, the nurse may be an "expert" in infusion pump manipulation without necessarily having any theoretical ...
Nyquist is a functional language for sound synthesis with an efficient implementation. It is shown how various language features lead to a rather elaborate representation for signals, consisting of a sharable linked list of sample blocks terminated by a suspended computation. The representation supports infinite sounds, allows sound computations to be instantiated dynamically, and dynamically optimizes the sound computation.
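The signal representation described above, a linked list of sample blocks ending in a suspended computation, can be illustrated with a small sketch. This is an illustration of the idea only, not Nyquist's implementation: the names `SoundNode`, `ramp`, and `take` are hypothetical, and a Python closure stands in for the suspension.

```python
class SoundNode:
    """One list node: a block of samples plus the rest of the sound.

    The rest is another node, None (end of sound), or a suspension:
    a zero-argument callable that computes the next node on demand.
    """

    def __init__(self, block, rest):
        self.block = block
        self._rest = rest

    def next(self):
        if callable(self._rest):
            # Force the suspension once; every reader of this list then
            # shares the computed node instead of recomputing it.
            self._rest = self._rest()
        return self._rest


def ramp(start=0, block_size=4):
    """An infinite sound 0, 1, 2, ... computed one block at a time."""
    block = list(range(start, start + block_size))
    return SoundNode(block, lambda: ramp(start + block_size, block_size))


def take(node, n):
    """Read the first n samples, forcing only as many blocks as needed."""
    out = []
    while node is not None:
        out.extend(node.block)
        if len(out) >= n:
            break
        node = node.next()
    return out[:n]


if __name__ == "__main__":
    s = ramp()
    print(take(s, 6))
```

Because blocks are computed only when a reader reaches them, an infinite sound costs nothing until it is consumed, and two readers of the same sound share the blocks already computed.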
ion The design technique of abstraction can be used to increase the amount of reuse within a set of classes by describing abstract classes that embody some of their shared state and/or behavior. One uses abstraction to construct a hierarchy relating several concrete classes to common abstract classes for better reuse of state and behavior. This emphasizes sharing of both state and behavior; abstract classes often implement algorithms in terms of methods that are overridden by concrete subclasses. The result of abstraction is a more horizontal hierarchy or a new protocol family. A typical abstraction example is building a model of graphical objects for an object-oriented drawing package. The first pass includes several concrete classes (for the
[Figure residue: class diagram listing superclass names and instance variables, with labels Controller, StorageManager, View, EventListEditorView, displayList, layoutManager, and "manage read/write of list to/from files"; links represent 'has-a' relationships] ...
This article will discuss the technology of SWSS and then present and compare these three systems. It is divided into three parts; the first introduces SWSS in terms of progressive examples. Part two compares the three systems using the same two instrument/score examples written in each of them. The final section presents informal benchmark tests of the systems run on two different hardware platforms---a Sun Microsystems SPARCstation-2 IPX and a NeXT Computer Inc. TurboCube machine---and subjective comments on various features of the languages and programming environments of state-of-the-art SWSS software. This author's connection with this topic is that of extensive experience with several different SWSS systems over the last 15 years, starting with MUS10 and including all three compared here: Csound (in the form of Music-11 initially) at the CMRS studio in Salzburg (Pope 1982); cmusic in the CARL environment at PCS/Cadmus computers in Munich (Pope 1986); and more recently a combination of cmix, Csound, and various vocoder software packages with user interfaces written in Smalltalk-80 at the CCRMA Center for Computer Research in Music and Acoustics at Stanford University (Pope 1992).
Pope: MT XV---Three Systems for SWSS 2 CMJ 17(2)---Summer, 1993 1993.01.16
Licht-Bilder is the last piece of Karlheinz Stockhausen's 29-hour-long opera cycle Licht, on which he worked for 27 years. The work was complemented by a video production, for which both an elaborate auditorium sound system and a complete redundant set of stage equipment were provided. Triangular sails were set up on stage, onto which the videos were projected. The work offers a kind of run-through of all seven days of the opera, and the dense interconnection of its diverse materials is central to Stockhausen's thinking. The pitch, rhythm, and duration were not altered at all, while the tempi and dynamics were given interpretative leeway. In its acousmatic music, the construction of the sound elements and the way they sound are related dialectically.
In music information retrieval, one of the central goals is to automatically recommend music to users based on a query song or query artist. This can be done using expert knowledge (e.g., www.pandora.com), social meta-data (e.g., www.last.fm), collaborative filtering (e.g., www.amazon.com/mp3), or by extracting information directly from the audio (e.g., www.muffin.com). In audio-based music recommendation, a well-known effect is the dominance of songs from the same artist as the query song in recommendation lists.
This effect has been studied mainly in the context of genre-classification experiments. Because no ground truth with respect to music similarity usually exists, genre classification is widely used for evaluation of music similarity. Each song is labelled as belonging to a music genre using, e.g., advice of a music expert. High genre classification results indicate good similarity measures. If, in genre classification experiments, songs from the same artist are allowed in both training and test sets, this can lead to over-optimistic results since usually all songs from an artist have the same genre label. It can be argued that in such a scenario one is doing artist classification rather than genre classification. One could even speculate that the specific sound of an album (mastering and production effects) is being classified. In Pampalk, Flexer, and Widmer (2005) the use of a so-called “artist filter” that ensures that a given artist’s songs are either all in the training set, or all in the test set, is proposed. Those authors found that the use of such an artist filter can lower the classification results quite considerably (as much as from 71 percent down to 27 percent, for one of their music collections). These over-optimistic accuracy results due to not using an artist filter have been confirmed in other studies (Flexer 2006; Pampalk 2006). Other results suggest that the use of an artist filter not only lowers genre classification accuracy but may also erode the differences in accuracies between different techniques (Flexer 2007).
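The artist filter described above amounts to splitting the data by artist rather than by song. The following is a minimal sketch of that idea, not the procedure of the cited authors: the function name `artist_filtered_split`, the `(song_id, artist)` tuple layout, and the choice to hold out a fraction of artists (rather than a fraction of songs) are all assumptions made for illustration.

```python
import random


def artist_filtered_split(songs, test_fraction=0.3, seed=0):
    """Split (song_id, artist) pairs so that no artist appears in both sets.

    The fraction applies to artists, so the share of songs in the test
    set depends on how many songs each held-out artist has.
    """
    artists = sorted({artist for _, artist in songs})
    rng = random.Random(seed)
    rng.shuffle(artists)
    n_test = max(1, round(len(artists) * test_fraction))
    test_artists = set(artists[:n_test])
    train = [s for s in songs if s[1] not in test_artists]
    test = [s for s in songs if s[1] in test_artists]
    return train, test


if __name__ == "__main__":
    # Made-up song identifiers and artist names, for illustration only.
    songs = [(1, "A"), (2, "A"), (3, "B"), (4, "C"), (5, "C"), (6, "D")]
    train, test = artist_filtered_split(songs)
    print(train, test)
```

An album filter could be built the same way by grouping on an album field instead of the artist.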
All these results were achieved on rather small databases (from 700 to 15,000 songs). Often whole albums from an artist were part of the database, perhaps even more than one. These specifics of the databases are often unclear and not properly documented. The present article extends these results by analyzing a very large data set (over 250,000 songs) containing multiple albums from individual artists. We try to answer the following questions:
1. Is there an album and artist effect even in very large databases?
2. Is the album effect larger than the artist effect?
3. What is the influence of database size on music recommendation and classification?
As will be seen, we find that the artist effect does exist in very large databases, and the album effect is bigger than the artist effect.
For our experiments we used a data set D(ALL) of S0 = 254,398 song excerpts (30 seconds each) from a popular Web store selling music. The freely available preview song excerpts were obtained with an automated Web-crawl. All meta-information (artist name, album title, song title, genres) is parsed automatically from the HTML code. The excerpts are from U = 18,386 albums from A = 1,700 artists. From the 280 existing different hierarchical genres, only the G = 22 general ones on top of the hierarchy are being kept for further analysis (e.g., “Pop/General” is kept but not “Pop/Vocal Pop”). The names of the genres plus percentages of songs belonging to each of the genres are given in Table 1. (Each song is allowed to belong to more than one genre, hence the percentages in Table 1 add up to more than 100 percent.) The genre information is identical for all songs on an album. The numbers of genre labels per album are given in Figure 1. Our database was set up so that every artist contributes between 6 and 29 albums (see Figure 2).
Percentages (y-axis) of albums having 1, 2, 3, 4, or 5 to 8 genre labels (x-axis).
Ars Electronica, the annual “Festival for Art, Technology and Society,” took place for the 30th time 3–9 September 2009, in Linz, Austria. The years since my review of the 2005 festival in these pages ( www.computermusicjournal.org/reviews/30-2/shintani-ae.html ) have been characterized by incremental developments as far as technology is concerned; but the events with the greatest ramifications have been the global financial and ecological crises. What differences from 2005 would be in evidence in 2009? Every epoch elicits a greater or lesser reaction from its artists, and this year’s artistic reactions in Linz were as dire as the global situation. But 2009 was also Ars Electronica’s 30th anniversary, and Linz enjoyed the distinction of being a European Cultural Capital as well. These occasions for celebration combined with the global hangover made Ars Electronica 2009 a curious mix of events that at times left us at a loss for words.
The festival’s theme this year was “Human Nature,” presented by festival directors Gerfried Stocker and Christine Schöpf in the program booklet: “We are entering a new age here on earth: the Anthropocene. An age definitively characterized by humankind’s massive and irreversible influence on our home planet. Population explosion, climate change, the poisoning of the environment, and venturing into outer space have been the most striking symbols of this development so far” ( www.facebook.com/notes.php?id=55661199917 ). As they go on to point out, genetic engineering and biotechnology are prime indicators of the transition into this new epoch. Humankind has barely begun to grasp how human life is created, yet it is already modifying entire genomes, cloning, creating, and inventing new life.
As in past years, the festival was spread over many venues, from the concert hall and museums of the ever-expanding art mile along the banks of the Danube (Brucknerhaus Concert Hall and park, Lentos Museum, new Ars Electronica Quarter) to the streets and other galleries scattered throughout the city (the complete program is available at www.aec.at/humannature/en/program-overview ), although outlying locations involving transport, such as the 2006 visit to the St. Florian Monastery, were not on this year’s docket (a few clips of the 2006 festival including the St. Florian visit can be viewed at shintanis.blogspot.com/2006_09_01_archive.html). In keeping with the theme of Linz as European Capital of Culture, sites within the confines of the city limits (easily accessible on foot) were highlighted, and “high” lighted they were, indeed! The six-story, 5,100 m² LED facade of the newly expanded Ars Electronica Center ( aec.at/center_exhibitions_en.php ; www.treusch.at/project.php ) was “played” nightly thanks to the combined efforts of invited sound and light artists; and the show Höhenrausch (high-altitude euphoria), constructed over the rooftops of several Linz edifices (church, museum, etc.), exploited the 360° view of the city and Danube valley as backdrop for commissioned sound and visual sculptures aimed at orienting Linz in a larger geographical and cultural setting. Among works exhibited were such disparate elements as a vast scrap-metal installation on the floor of an attic (Serge Spitzer), an herbal roof garden (Mali Wu), an ingenious “shower of sounds” (Paul DeMarinis), and a Ferris wheel (Maider López), to name some of the more spectacular ( www.ok-centrum.at/english/ausstellungen/hoehenrausch/index.html ). These vertiginous events were surely planned long in advance of the current global crises, and their heady heights made the fall back to earth all the more graphic, metaphorically speaking.
The festival opened with a blackout of Linz aimed at permitting star gazing (Starry, Starry Night), but rain on opening night dampened rather than ignited spirits.
As every year, Ars Electronica featured lectures and seminars, this year on the topics cloud intelligence, sound-image relations in art, the future of retail (!), archiving media art, wearable computers, education in the 21st century, etc. The keynote lectures on the topic “Human Nature” were held in the Brucknerhaus concert hall. Reflecting the apocalyptic mood of the times, the representation of human nature in the lectures included robotics with the festival’s featured artist, Hiroshi Ishiguro...
[Editor's note: Selected reviews are posted on the Web at mitpress2.mit.edu/e-journals/Computer-Music-Journal/Documents/reviews/index.html. In some cases, they are either unpublished in the Journal itself or published in an abbreviated form in the Journal.]
ISMIR 2010, the Eleventh International Society for Music Information Retrieval Conference, took place in Utrecht, the Netherlands, from 9–13 August 2010. It was jointly organized by Utrecht University, the Utrecht School of the Arts, the Meertens Institute, Philips Research, and the University of Oldenburg. Since 2000, ISMIR has become an interdisciplinary forum dedicated to research on musical data, and brings together researchers in areas such as musicology, library and information science, cognitive science, computer science, and others.
I write this review from the perspective of a computer science student and a newcomer to the Music Information Retrieval (MIR) community. As a first-time ISMIR attendee, I found that the conference offered the opportunity to investigate the field and see the possibilities it had to offer. This review outlines the events that transpired in Utrecht over the five days of the conference, and presents a fresh perspective on ISMIR, and on MIR in general.
As a student attending the conference, I was offered affordable accommodations in student housing units. Being able to connect with other students the weekend before the conference began provided opportunities not only to socialize with others, new and experienced, in the community, but also to tour Utrecht and the surrounding cities in an enjoyable excursion prior to the start of the conference.
Before the official opening of ISMIR 2010, introductory tutorials were held on Monday, 9 August, for those interested in gaining background knowledge on some of the subfields within the MIR realm. One such session, entitled "A Tutorial on Pattern Discovery and Search Methods in Symbolic MIR," presented by Ian Knopke (BBC) and Eric Nichols (Indiana University), focused on symbolic music representation and applications, and cognitive approaches to MIR, including an overview of the principles of music cognition and the use of symbolic models for various MIR tasks. As a part of the second round of tutorials Meinard Müller (Saarland University and MPI Informatik) and Anssi Klapuri (Queen Mary, University of London) presented "A Music-orientated Approach to Music Signal Processing," which focused on explaining how music-specific aspects can be exploited for feature representation used in various MIR tasks. The wide range of topics discussed included pitch and harmony, tempo and beat, timbre, and melody. Although I was familiar with most of these topics, the tutorial offered a well-rounded evaluation on the areas that would be focal points throughout the conference. Overall, the tutorials provided an opportunity for those new to the field to gauge their knowledge and prepare them for the upcoming conference. The relaxed environment of the reception following the tutorials also offered an opportunity to get to know people in the community before the conference had formally begun.
The theme for ISMIR 2010 focused on MIR research and applications that modelled the perception and cognition of music, that gave insight to the human musical experience and understanding, or that used creative innovations of MIR research. To underscore this theme, Carol L. Krumhansl of Cornell University, well known for her research on tonal perception, opened the conference on Tuesday with her keynote speech entitled "Music and Cognition: Links at Many Levels." She explored the associations between the objective properties of music and the subjective experience of music. An example of such a link is her study on sensitivity to frequent patterns in sound events to the encoding and remembering of music and the generation of expectations. She showed a direct emotional response in functional magnetic resonance imaging (fMRI) scans of the brain to music containing violations of those expectations. The links between objective properties and subjective experience, although present and important in the MIR community, are becoming more relevant in applied research as well. Through the project Plink: "Thin Slices" of Music, she explored the identification of artist, title, and release date of short excerpts of popular music with results pointing towards a large capacity for details and knowledge of style and emotional content of music in long-term memory...
With Centaur’s release of CDCM (Consortium to Distribute Computer Music) Volume 38, featuring music from Bowling Green State University’s MidAmerican Center for Contemporary Music, we celebrate two notable anniversaries: the approaching 25th birthday of the Consortium itself, and the 80th birthday of CDCM’s founder, Larry Austin. Also worthy of reflection is that soon, 45 years will have passed since the first volume of SOURCE: Music of the Avant Garde was published. Even a cursory, retrospective glance over the history of these two chronicles suggests we have much to be thankful for. It’s easy to view CDCM as a translation, if you will, into the 1980s and beyond, of the vision that founded SOURCE in 1966. Both represent the most prominent chronicles of new music of their day, and both were founded by Mr. Austin. SOURCE’s oeuvre is complete, but CDCM’s is not, and neither are the spirit and vision that founded them both.
Because this music comes from a Center dedicated to contemporary music, it seems fitting to begin such a review by first paying tribute to the composer whose music appears to best “rhyme” with the current date on the calendar: Dan Tramte. He is represented on this compact disc by three short pieces, the longest of which is 1′35″. These works grew out of a request from a percussionist to provide fixed-media pieces to “fill the void” (description used in the liner notes) between percussion works during setup from one piece to another. With some, perhaps poetic, interpretation, this appears to reflect a contemporary condition whereby the ostensibly pragmatic, and potentially perfunctory, though creative opportunism becomes singularly and separately artful in its own right. More obviously, the brevity and succinctness of expression mirrors a contemporary trait, or need, to cater to shorter attention spans as more and more is demanded of our attention from media of various types. From this collection, titled Gluons, the pieces Boson, Graviton, and Electron appear on this compact disc. These three youthful, sprightly, and energetic works spend no more time than needed to say what needs to be said. This latter attribute is a welcome feature in today’s often anachronistically verbose artistic world.
When a composer is given the opportunity to work with or compose for a performer who is internationally recognized for his or her masterful and artful renderings, that composer can rightfully consider herself or himself very fortunate. Here, soprano saxophonist Stephen Duke is this performer, and Elainie Lillios is the fortunate composer. As usual with Mr. Duke’s performances, this recording illustrates his subtle brush strokes and consummate artistry. It is impossible not to be drawn deeply into the piece’s own personal, intimate life. The notes provided on the compact disc for this work’s raison d’être very properly describe the listening experience: “[exploring] the vague continuum between reality and imagination, consciousness and dreaming” (liner notes). Mr. Duke accepted and championed the challenge of performing the once obligatory key-slaps and non-pitched “breathing through the instrument” figures, while mitigating the act of recognition by an experienced listener. Several pieces have been inspired by Wallace Stevens’s Thirteen Ways of Looking at a Blackbird, including works by Lukas Foss and Louise Talma. Ms. Lillios’s Veiled Resonances was also inspired by the Stevens work and won first prize in the 2009 Concours Internationale de Bourges.
Burton Beerman is both composer and performer for Dayscapes, a three-movement work for clarinet and live, interactive electroacoustic sound. Even if no other piece on this compact disc were to be found enjoyable by some listener (very doubtful), this piece alone surely would be found to be worth the “price of admission.” It is with gratitude that one only finds technical information supplied in the liner notes for this piece. The imagination is, thereby, free to roam and join the virtuosic exploration of many spaces, characters, and personalities, free to look about and sample the riches of a tasty, poignant variety such as one might experience while taking in a circus. The performance is indeed virtuosic but never simply for the sake of performance virtuosity. The more important virtuosity here is...
In 2005, the newest version of the Waseda Flutist Robot, the Waseda Flutist Robot Number 4 Refined II (WF-4RII), was successfully implemented. The WF-4RII improved on the design of the humanoid organs involved in flute playing, such as the robotic lips, lungs, arms, neck, tongue, oral cavity, and vibrato system. The musical performance of the robot flutist is facilitated by a computer controller, a software sequencer, and a vision system. An experimental setup was proposed to verify how well the WF-4RII could imitate a human performance. A statistical analysis of the sound from performances by a professional flutist and by the robot was carried out to identify the main differences between them, using the signal features RMS, pitch, spectral centroid, spectral rolloff, number of zero crossings, and sonogram. For a theme from Mozart's flute quartet KV 298, the RMS features of the robot and the flutist exhibit the same intensity. However, the dynamics of the robot were slightly different in some areas. The spectral centroids of both performances were similar; however, in some cases, the professional flutist presented a larger frequency distribution. Regarding the spectral rolloff, the robot's sound still lacks some high-frequency musical components during note transitions. In the zero-crossing feature, the professional flutist's sound presented more dynamic transitions. Regarding the sonogram feature, the professional flutist presents strong accents on the horizontal and vertical lines, whereas the robot presents strong horizontal lines but some distortion on the vertical lines.
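Several of the frame-level features named above have simple textbook definitions. The following is a minimal sketch of four of them in plain Python, not the analysis code used in the study. The function names, the naive DFT, and the 85% rolloff fraction are assumptions chosen for illustration.

```python
import cmath
import math


def rms(frame):
    """Root-mean-square level of one frame of samples."""
    return math.sqrt(sum(v * v for v in frame) / len(frame))


def zero_crossings(frame):
    """Number of sign changes between consecutive samples."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for short frames)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(mags, bin_hz):
    """Magnitude-weighted mean frequency in Hz; bin_hz = sample_rate / frame length."""
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * bin_hz * m for k, m in enumerate(mags)) / total


def spectral_rolloff(mags, bin_hz, fraction=0.85):
    """Lowest frequency below which `fraction` of the spectral energy lies."""
    energy = [m * m for m in mags]
    threshold = fraction * sum(energy)
    running = 0.0
    for k, e in enumerate(energy):
        running += e
        if running >= threshold:
            return k * bin_hz
    return (len(mags) - 1) * bin_hz
```

Computing such features frame by frame for two recordings of the same theme gives time series that can be compared statistically. A practical analysis would likely use an FFT and a window function rather than the naive DFT shown here.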
Electroacoustic music pioneer Francis Dhomont is renowned internationally for his compositions. His scrutiny of Pierre Schaeffer's work and his own experiments in the earliest days of "musique concrète" resulted in a favored place for the development of acousmatic music in Montreal. In this paper, Dhomont talks about the influences of the composers of GRM (Groupe de Recherches Musicales) in Paris on his work. He outlines how his involvement with "musique concrète" started years after the Second World War.
Digital audio workstations (DAWs) such as Digidesign Pro Tools, Apple Logic, and Ableton Live are the cornerstone of composition, recording, editing, and performing activities for producers working in popular music (Théberge 1997). Human-computer interaction (HCI) research has a unique challenge in understanding the activities of professional music producers and in designing DAW user interfaces to support this work. Unlike many other user-interface design domains, in computer-mediated music production the user is principally engaged in the process of building and editing immensely complex digital representations (Lerdahl and Jackendoff 1983; Pope 1986; Dannenberg, Rubine, and Neuendorffer 1991; Dannenberg 1993). In this case, those representations model the intricate structure and synthesis parameters of the musical composition the producer is creating. Determining the right vocabulary of abstract representations to build into the user interface of DAWs is a difficult problem, and these design decisions have a critical impact on the activity of professional producers. The design of these abstraction mechanisms has been primarily informed by the historical origins of DAWs in multitrack tape recorders and mixing desks (Théberge 2004), which together we refer to as the "multitrack-mixing model." Our research has identified many ways in which these user-interface metaphors (Barr 2003) from the past often do not support the activities of professional producers.
The evolutionary reliance of DAWs on the multitrack-mixing model can be contrasted with more radical algorithmic approaches to composition and performance, such as those found in visual or textual programmatic tools that are common in the experimental and avant-garde computer-music traditions. Because these tools have a different set of abstraction mechanisms and corresponding tradeoffs, their procedural rather than declarative nature (Dannenberg 1993) and their focus on "generative" music-making render them less well suited to the work of the participants in this study, and therefore outside the scope of this discussion.
This article outlines findings from our detailed qualitative investigation (Duignan 2008) into the activity of computer-mediated music-making in the popular idiom, and the abstraction mechanisms that professional producers use to manage the complex digital representations of their compositions for studio work and live performance. We present a framework that articulates the key interactions and tensions between professional producers and the abstraction mechanisms provided by the tools on which they so heavily rely. This framework helps us understand and clearly identify issues that need to be resolved in the next generation of DAW user interfaces.
This research was conducted as a collective case study based on Creswell's Qualitative Inquiry and Research Design (1998), and it exploits the emphasis of case-study inquiry on developing an in-depth analysis of a particular activity. We employed comprehensive semi-structured interviews of professional producers and conducted extended observations of them in the field. We focused on five participants for the in-depth case study, with twelve additional participants who added breadth of perspective to better support meaningful triangulation and context for the issues that we discovered.
Music production tools play a vital mediating role between users and their composition.
We drew from two theoretical traditions with associated research tools in creating our semi-structured interview protocol and in analyzing the findings: activity theory and cognitive dimensions of notations. Activity theory (Vygotskii 1978; Nardi 1996) provides us with a useful terminology, a framework for analysis, and a clear catalog of the important components of human activity (Kaptelinin, Nardi, and Macaulay 1999) that we used to develop an interview protocol (Duignan, Noble, and Biddle 2006). Cognitive dimensions of notations (Blackwell and Green 2003) provides a common vocabulary for describing recurring trade-offs in the conceptual models employed by notational systems. Cognitive dimensions have obvious relevance to our inquiry, having been applied to music typesetting and "live-coding" musical systems in the past (Blackwell and Green 2000; Blackwell and Collins 2005). The cognitive dimensions framework provided us with further content for our interview protocol (Blackwell and Green 2000) and a terminology and framework for analyzing the tensions between music production abstractions and the activity of producers.
Activity theory posits that the process of intellectual work, and therefore aspects of consciousness itself, are embedded not...
Mechanized experiments for characterizing the feel of the piano action are presented. The mechanization takes into account both the role of finger/key impedance and the changing contact conditions of the piano action. An experimental apparatus with a source impedance modeled after the human finger was used. The data collected from the coupled dynamics were used to fit linear lumped-parameter models, and a hybrid dynamical system was then built, which accounts for the changing contact conditions. The flexure bearing was configured such that at its home position, the two horizontal lengths of coil were centered within two air gaps of a magnetic circuit driven by four rare-earth permanent magnets. A strain gauge load cell was secured to the armature and configured with an instrumentation amplifier. The recordings of the armature displacement and current command were processed in MATLAB using the tfestimate function to produce a transfer-function estimate for each preparation.
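The kind of estimate that tfestimate produces can be illustrated with a small sketch. MATLAB's function divides the cross power spectral density of input and output by the power spectral density of the input, both averaged over overlapping windowed segments. The Python below follows that idea under stated assumptions: the function names are hypothetical, a Hann window with 50% overlap is used, and the DFT is naive. It is not the processing script used in the experiments, and which recording serves as input (for example the current command) and which as output (for example the armature displacement) is an assumption here.

```python
import cmath
import math


def dft(frame):
    """Naive DFT for bins 0..N/2 (adequate for short segments)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def h1_estimate(x, y, seg_len):
    """Estimate H(f) = Pxy(f) / Pxx(f) from input x and output y.

    Segments of length seg_len overlap by 50% and are Hann-windowed
    (both choices are assumptions for this sketch).
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / seg_len) for i in range(seg_len)]
    bins = seg_len // 2 + 1
    pxx = [0.0] * bins
    pxy = [0j] * bins
    hop = seg_len // 2
    for start in range(0, len(x) - seg_len + 1, hop):
        xs = dft([w * v for w, v in zip(window, x[start:start + seg_len])])
        ys = dft([w * v for w, v in zip(window, y[start:start + seg_len])])
        for k in range(bins):
            pxx[k] += abs(xs[k]) ** 2             # input auto-spectrum
            pxy[k] += xs[k].conjugate() * ys[k]   # input/output cross-spectrum
    return [pxy[k] / pxx[k] for k in range(bins)]
```

Each returned value is a complex number whose magnitude and angle give the estimated gain and phase at that frequency bin. Averaging over segments reduces the influence of noise that is uncorrelated with the input.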
Frequency-modulation (FM) synthesis is widely known as a computationally efficient method for synthesizing musically interesting timbres. However, it has suffered from neglect owing to the difficulty in creating natural-sounding spectra and mapping gestural input to synthesis parameters. Recently, a revival has occurred with the advent of adaptive audio-processing methods, and this work proposes a technique called adaptive FM synthesis. This article derives two novel ways by which an arbitrary input signal can be used to modulate a carrier. We show how phase modulation (PM) can be achieved first by using delay lines and then by heterodyning. By applying these techniques to real-world signals, it is possible to generate transitions between natural-sounding and synthesizer-like sounds. Examples are provided of the spectral consequences of adaptive FM synthesis using inputs of various acoustic instruments and a voice. An assessment of the timbral quality of synthesized sounds demonstrates its effectiveness.
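The delay-line route to phase modulation can be sketched briefly. Reading a sinusoid of frequency f_c through a delay D(t) shifts its phase by 2π f_c D(t), so a sinusoidally varying delay yields a phase-modulated output. The code below is an illustration of that identity rather than the article's derivation. The function name, the linear interpolation, and the delay scaling (chosen so that a sinusoid at carrier_hz sees a peak deviation of `index` radians around a constant phase offset) are assumptions.

```python
import math


def delay_line_pm(x, sample_rate, carrier_hz, mod_hz, index):
    """Phase-modulate x by reading it through a sinusoidally varying delay.

    For a non-sinusoidal input, carrier_hz would stand in for an estimate
    of the fundamental (an assumption of this sketch); the k-th harmonic
    then sees k times the phase deviation.
    """
    # Half of the delay swing, in samples, for a deviation of `index` radians.
    half_swing = index / (2 * math.pi * carrier_hz) * sample_rate
    out = []
    for n in range(len(x)):
        # The delay stays non-negative: it oscillates between 0 and 2 * half_swing.
        delay = half_swing * (1.0 + math.sin(2 * math.pi * mod_hz * n / sample_rate))
        pos = n - delay
        if pos < 0:
            out.append(0.0)  # delay line not yet filled
            continue
        i = int(pos)
        frac = pos - i
        nxt = x[i + 1] if i + 1 < len(x) else x[i]
        out.append((1 - frac) * x[i] + frac * nxt)  # linear interpolation
    return out
```

With a pure sinusoid as input, the output approximates sin(2π f_c t − I(1 + sin(2π f_m t))), that is, classic PM with index I plus a constant phase offset. Feeding an instrument recording instead of a sinusoid is what makes the technique adaptive in spirit.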
The relation between the peak amplitude above the noise floor and the descriptor boundaries for the class of sinusoidal peaks has been investigated. A new adaptive threshold-selection algorithm is presented that can be used for classification of spectral peaks. The limit values for the distributions of sinusoidal peaks in the descriptor domain can be explicitly obtained using the set of peak descriptors and the proposed compact sinusoidal model related to the analysis window. The variations of the limit values are characterized in a deterministic approach through a parameter that is referred to as the peak signal/noise ratio. Through this control parameter, the descriptor limits of the classification algorithm can be adapted intuitively, increasing or decreasing the tolerance of the classifier with respect to noise level and modulation. The approximation accuracy is high for different types of analysis window.
Adaptive concatenative sound synthesis (ACSS) is a recent technique for generating and transforming digital sound. Variations of sounds are synthesized from short segments of others in the manner of collage, based on a measure of similarity. ACSS provides an intuitive way to automate and control the procedure of arranging sound in micromontage, freeing time for experimenting and composing with this flexible sound-synthesis technique. ACSS is in one sense a granular synthesis or multiple-wavetable synthesis, and in another it is sound scrambling, or a sophisticated version of remixing. Since 2000, there has been growing interest in applying concatenative techniques to sound and music synthesis. An impressive implementation of ACSS is Sven König's Scrambled Hackz. This software concatenatively synthesizes audio input in real time from popular music, synchronized to the respective music video frames. ACSS presents numerous possibilities for timbrally transforming any sound. A composer can create a target sound using a variety of vocal sound effects, which provides a stencil on which any sound material may be dabbed.
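The core selection step, replacing each target segment with the most similar corpus segment, can be sketched as follows. This is a deliberately small illustration, not the ACSS implementation described above. The two features (RMS and zero-crossing rate), the fixed non-overlapping frames, and the unweighted Euclidean distance are assumptions.

```python
import math


def frame_features(frame):
    """Describe a frame by (RMS level, zero-crossing rate)."""
    level = math.sqrt(sum(v * v for v in frame) / len(frame))
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return (level, crossings / (len(frame) - 1))


def split_frames(signal, size):
    """Cut a signal into consecutive, non-overlapping frames of `size` samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]


def concatenative_match(target, corpus, size):
    """Rebuild `target` from the corpus frames that are nearest in feature space."""
    corpus_frames = split_frames(corpus, size)
    corpus_feats = [frame_features(f) for f in corpus_frames]
    out = []
    for frame in split_frames(target, size):
        feat = frame_features(frame)
        best = min(
            range(len(corpus_frames)),
            key=lambda i: math.dist(feat, corpus_feats[i]),
        )
        out.extend(corpus_frames[best])
    return out
```

A usable system would normalize or weight the features, overlap and crossfade the selected segments, and draw on a much richer descriptor set. The structure of the loop, with the target acting as a stencil over the corpus, would stay the same.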
Several aspects of a body of sound-synthesis techniques are presented. The ordering of segments using tendency masks was particularly successful. A wide selection of segments would result in a noisy sound structure. Narrow masks led to unstable sounds within a confined frequency region. Masks moving from narrow to wide could produce dramatic transitions between these two extremes. Herbert Brün stresses the ubiquity of technological considerations in musical composition. He also criticizes the view of technology as a mere means to preconceived ends. The composer needs to engage in actively designing artificial systems. He highlights the common ground of art, science, and technology, which he locates in the idea of the system.
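A tendency mask is commonly understood as a pair of time-varying boundaries between which values are drawn at random. The sketch below illustrates that general idea under stated assumptions: the function name, the linear boundary interpolation, and the uniform distribution are choices made here, and the text above does not specify how its masks were defined.

```python
import random


def tendency_mask(n_events, low, high, rng=None):
    """Draw n_events random values between linearly moving bounds.

    low and high are (start, end) pairs for the lower and upper boundary.
    """
    rng = rng or random.Random()
    values = []
    for i in range(n_events):
        t = i / (n_events - 1) if n_events > 1 else 0.0
        lo = low[0] + t * (low[1] - low[0])
        hi = high[0] + t * (high[1] - high[0])
        values.append(rng.uniform(lo, hi))
    return values


# A mask that opens from a narrow band (400..500) to a wide one (100..2000).
selection = tendency_mask(50, low=(400.0, 100.0), high=(500.0, 2000.0))
```

Rounding each value to an integer would turn the result into a sequence of segment indices, so that a narrow mask keeps the selection within a few neighboring segments and a wide mask scatters it.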
A range of algorithmically defined NoiseSong sounds that were clustered perceptually with each other and with unaltered singing was investigated. The NoiseSong sounds were of two primary types: transformed singing and synthetic singing. Transformed singing comprised the AtSg series and some of the NsSg series; synthetic singing, NsSg 6-8, comprised singing formants superimposed stably on white noise. Four formants, including the singer's formant, were used across the frequency range. To determine source derivations, participants were asked whether each sound derived from drums, piano, singing, and/or water, and whether they were confident in each response. They were told they could indicate several different derivations for the same sound, or indicate that none of the possible derivations were related to what they heard. The range observed for the Song samples was at least from -0.59 to 1.03 and average valence ratings per item ranged overall from -1.03 to 1.21.
An imitative multi-agent system (IMAP) approach has been developed to generate expressive performances of music, based on agents' individual parameterized musical rules. Agents in IMAP have two types of expressive actions: changes in tempo and changes in note loudness. Each agent also has a musical evaluation function based on a collection of rules; different agents give different weightings to the rules and use the combination to evaluate performances. The agents' evaluation functions are generated in IMAP by providing agents with rules describing the essential features of an expressive performance. An agent's evaluation function is defined in two stages: the rule level, which involves a series of five rules derived from previous work on generative performance, and the analytics level, which involves a group of musical analysis functions that the agent uses to represent the structure of the musical score.
An orchestration system called Orchidée communicates with computer music environments through a simple client/server architecture. Innovative interfaces in OpenMusic, Max/MSP, and MATLAB are introduced for the specification of the orchestration problem and for the exploration of its potential solutions; they allow a fine comprehension of the links between symbolic and audio data through preferences-inference mechanisms. SoundTarget is an OpenMusic object for specifying the orchestration target; it provides a convenient interface that hides most of the complex underlying sound processing and programming issues while maintaining the efficiency of a programmable system. Another object created in OpenMusic is Orchestra, which comes with its own orchestra editor and with a function box labeled submit-orchestra that formats the orchestra information as a simple message and sends it to the Orchidée server.