Article

Vibrato: Questions And Answers From Musicians And Science

Authors: Timmers & Desain

Abstract

In this paper, we will focus on vibrato as an expressive means within musical performances. In this respect, we assume that vibrato may be used by musicians to stress notes or to convey a certain musical interpretation. It is an area of research that has recently gained interest and is still in an explorative stage (see the contribution of Gleiser & Friberg to these proceedings). We turn to musicians' hypotheses concerning the expressive function of vibrato and compare these to observations made on the relation between music-structural characteristics and vibrato rate and extent in actual performances. The analyses of the performance data are based on the predictions of expressive vibrato behavior (Sundberg, Friberg & Frydén, 1991) and on predictions stemming from piano performance research that attributes expressive behavior to the pianist's interpretation of musical structure (e.g., Clarke, 1988). The comparison aims to show that scientific inquiries can be inspired by hypotheses stemming from musicians and experts who devote their lives to refining their control of musical parameters for expressive means, and to teaching that to students. Vice versa, the scientific results can achieve a musical meaningfulness and value, also for musicians and teachers.

METHOD

Five professional musicians participated in the study: a cellist, an oboist, a tenor, a thereminist, and a violinist. The musicians are all known for their performances in orchestras, chamber ensembles and/or as soloists. Each participant was paid for participation. The study used a notation of the first phrase of 'Le Cygne' by Saint-Saëns (1835-1921) for the musicians to play from. Originally, 'Le Cygne' (translation: 'the swan') is for solo cello with orchestral accompaniment. A piano reduction of the orchestral accompanime...


... The mechanisms of vibrato production differ depending on the instrument considered. Timmers and Desain [TD00] describe some of them. ...
... Sung vibrato is of the "oscillation of the fundamental frequency value" type. Its first characteristic is that it is produced spontaneously by the human voice [TD00]. The origin of this spontaneous vibrato is much debated [AC07]. ...
... The origin of this spontaneous vibrato is much debated [AC07]. Nevertheless, the authors agree that, even though professional singers can control it to some extent (attenuating it or, on the contrary, amplifying it), it is always present when someone sings [TD00, AC07, Sea38]. The second interesting characteristic is the frequency of the oscillations: for instruments, this frequency is chosen by the musician. ...
Article
Full-text available
Currently, the amount of music available, notably via the Internet, grows every day. The collections are too vast for it to be possible to browse them or to search for an excerpt without the help of computer tools. Our work falls within the general framework of automatic music indexing. To situate the context of this work, we first propose a brief review of current work on the automatic description of music for indexing purposes: instrument recognition, determination of key and tempo, classification by genre and emotion, singer identification, and transcription of the melody, the score, the chord sequence and the lyrics. For each of these topics, we take care to define the problem and the technical terms specific to the domain, and we dwell more particularly on the most salient problems. In a second part, we describe the first tool we developed: an automatic distinction between monophonic and polyphonic sounds. We proposed two new parameters, based on the analysis of a confidence indicator. The bivariate distribution of these parameters is modeled with bivariate Weibull distributions. The problem of estimating the parameters of this distribution led us to propose an original estimation method derived from the analysis of the moments of the law. A series of experiments allows us to compare our system with classical approaches and to validate all the steps of our method. In the third part, we propose a method for detecting singing, accompanied or not. This method is based on the detection of vibrato, a parameter defined from the analysis of the fundamental frequency, and defined a priori for monophonic sounds. With the help of two segmentations, we extend this concept to polyphonic sounds by introducing a new parameter: extended vibrato. The performance of this method is comparable to the state of the art. Taking the monophonic/polyphonic pre-processing into account led us to adapt our singing detection method to each of these contexts, which improves the results. After a reflection on the use of music for the description, annotation and automatic indexing of audiovisual documents, we consider the contribution of each of the tools described in this thesis to the problem of indexing music, and of indexing audiovisual documents by their music, and we offer some perspectives.
... This also applies to musical instrument performance. The term vibrato refers to a periodic fluctuation in the pitch of a musical tone [11], [12]. This fluctuation is sometimes sinusoidal. ...
... The enormous digital pop song collection is left unexplored. Besides, the characteristics of vibrato are generally considered to be personal, and remain the same for a singer [11], [12]. Individual singers may possess similar or distinct vibrato characteristics. ...
... The feature extraction scheme below aims to estimate these two parameters. Traditionally, vibrato is often manually analyzed by professionals [12], whereas some of the recent feature extractions are applicable to sinusoidal vibratos only [15]. As vibrato tone may take various forms, e.g. ...
Conference Paper
Full-text available
A pleasant singing voice is often ornamented by vibrato. This pitch fluctuation acts as a distinctive feature of singing and promotes voice quality. Nevertheless, independent pitch processing in singing voice synthesis does not guarantee output quality. The spectral envelope actually varies with pitch during human voice production. This paper proposes a modeling technique for singers' vibratos, followed by joint processing of vibrato and spectral envelope, such that these attributes are consistent. The performance of the proposed processing has been verified by a subjective listening test. The synthetic singing outputs are found to have similar quality to human singing.
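The excerpts above note that vibrato is characterized by two parameters, rate and extent, that traditional analysis was manual, and that many extractors assume sinusoidal vibrato. As a concrete illustration (a minimal sketch, not any cited paper's scheme), the following Python/NumPy function estimates both parameters from the f0 contour of one sustained note; the 3-12 Hz search band and the sinusoid assumption behind the extent formula are assumptions.

```python
import numpy as np

def vibrato_rate_extent(f0_hz, fps):
    """Estimate vibrato rate (Hz) and extent (cents, peak-to-peak) from the
    f0 contour of a single sustained note sampled at `fps` frames/second."""
    n = np.arange(len(f0_hz))
    cents = 1200.0 * np.log2(f0_hz / np.mean(f0_hz))        # deviation in cents
    cents = cents - np.polyval(np.polyfit(n, cents, 1), n)  # remove slow drift
    spec = np.abs(np.fft.rfft(cents * np.hanning(len(cents))))
    freqs = np.fft.rfftfreq(len(cents), d=1.0 / fps)
    band = (freqs >= 3.0) & (freqs <= 12.0)                 # assumed search band
    rate = freqs[np.argmax(np.where(band, spec, 0.0))]      # dominant modulation
    extent = 2.0 * np.sqrt(2.0) * np.std(cents)             # assumes a sinusoid
    return rate, extent
```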
... The vibrato extent ranges between 0.6 and 2 semitones for singers and between 0.2 and 0.35 semitones for string players (see Timmers & Desain, 2000, for a review). Bretos and Sundberg (2003) showed that the vibrato extent and the mean fundamental frequency were correlated with sound level. ...
... Results from similarity ratings indicate that the vibrato rate is perceptually more relevant than the vibrato extent (Järveläinen, 2002). The use of vibrato by performers to convey musical expression was investigated in (Timmers & Desain, 2000). A strong effect of musical structure, particularly metrical stress, was observed on both vibrato rate and extent, yielding a consistent use of vibrato over repetitions. ...
... The shape of the vibrato has received little attention. Horii (1989), quoted by Timmers and Desain (2000), proposed a classification of singer vibrato shapes into sinusoidal, triangular, trapezoidal, and unidentifiable. But the impact of vibrato shape on perceived sound quality remains to be studied. ...
Article
Full-text available
Actes du Colloque interdisciplinaire de musicologie (CIM05), Montréal (Québec), Canada. Abstract: We promote a clearer definition of vibrato (Seashore, 1932), based on a review of various vibrato features. We also propose a generalised vibrato effect generator that includes spectral envelope modulation and a frequency-dependent hysteresis behaviour. We then investigate the influence of spectral envelope modulation on perceived quality with a double-blind randomized AB comparison task. Eight participants listened to 12 pairs of sounds with vibrato matched for loudness. Each pair included one sound with constant average spectral envelope (identical amplitude modulation over all frequencies) and one with modulated spectral envelope (frequency-dependent amplitude modulation). Participants were asked to choose which version sounded the most natural. The statistical analysis revealed a significant preference for sounds with modulated spectral envelope (p < 0.001). Our results highlight the need to consider spectral envelope modulation in vibrato modelling.
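The Horii (1989) shape classes quoted in the excerpts (sinusoidal, triangular, trapezoidal) are easy to reproduce for synthesis or listening experiments. Below is a hedged sketch of a shape-controlled vibrato tone generator; the clipping factor used to approximate the trapezoid is an assumption, not a value from the literature.

```python
import numpy as np

def vibrato_cycle(phase, shape):
    """One vibrato modulation cycle in [-1, 1] for a given phase (radians)."""
    if shape == "sinusoidal":
        return np.sin(phase)
    if shape == "triangular":
        # triangle wave with the same period and peaks as the sine
        return 2.0 / np.pi * np.arcsin(np.sin(phase))
    if shape == "trapezoidal":
        # clipped, rescaled sine: the flat tops approximate a trapezoid
        return np.clip(1.5 * np.sin(phase), -1.0, 1.0)
    raise ValueError(shape)

def render_tone(f0=440.0, rate=6.0, extent_cents=100.0, shape="sinusoidal",
                dur=2.0, sr=44100):
    """Render a sine tone whose frequency is modulated by the chosen shape.

    extent_cents is the peak deviation from the carrier (an assumed convention).
    """
    t = np.arange(int(dur * sr)) / sr
    mod = vibrato_cycle(2 * np.pi * rate * t, shape)
    f_inst = f0 * 2.0 ** (extent_cents * mod / 1200.0)   # cents -> Hz
    phase = 2 * np.pi * np.cumsum(f_inst) / sr           # integrate frequency
    return np.sin(phase)
```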
... Pitch instability refers to the property of the singing voice that its pitch varies considerably over time compared with other pitched instruments. This is mostly due to the fact that vibrato typically exhibits an extent of ±60-200 cents for singing voice and only ±20-35 cents for other instruments [1]. Also, vocalists almost always sing legato, changing pitch smoothly during note attacks and transitions. ...
... In order to assess which TWM pitch estimates are likely to be correct, each estimate is further associated with a reliability measure. This measure is obtained simply by mapping the TWM error [5] linearly to the interval [0, 1]. ...
Article
This paper deals with the transcription of vocal melodies in music recordings. The proposed system relies on two distinct pitch estimators which exploit characteristics of the human singing voice. A Hidden Markov Model (HMM) is used to fuse the pitch estimates and make voicing decisions. The resulting performance is evaluated on the MIREX 2006 Audio Melody Extraction data.
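The excerpt above maps the TWM (two-way mismatch) error linearly onto [0, 1] to obtain a reliability measure per pitch estimate. A minimal sketch of such a mapping follows; the normalization bounds err_min and err_max are assumptions, since the excerpt does not give them.

```python
def twm_reliability(err, err_min=0.0, err_max=10.0):
    """Map a TWM pitch-estimation error linearly onto [0, 1].

    Low error -> reliability near 1; high error -> near 0.
    err_min / err_max are assumed normalization bounds.
    """
    r = (err_max - err) / (err_max - err_min)
    return min(1.0, max(0.0, r))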
... • Rule 1: Note transitions are typically limited to one octave [1]. • Rule 2: The vibrato exhibits an extent of 60-200 cents for singing voice and only 20-30 cents for other instruments [18]. The statistics mentioned above are based on discrete note units, and for them to be used in determining the transition probability of the melody pitch, they must be converted to a continuous unit such as Hertz (Hz). ...
... 3) Note bigram: The note bigram is obtained based on Rule 1: more weight is placed on pitch candidates one octave below and above the previous note. This is represented mathematically in equations (18) and (19), where the rounding operator denotes the nearest integer of its argument. In Fig. 4(c), an example of the note bigram is shown. ...
Article
This paper proposes a melody tracking algorithm based on the state-space equation of the parameters that define melody. The parameters that consist of melody pitch and harmonic amplitudes are assumed to follow two uncoupled first-order Markov processes, and the polyphonic audio is related to the parameters such that the current framed segment of the polyphonic audio is conditionally independent of other framed segments given the parameters. The transition probability of the melody pitch is defined based on a number of statistical characteristics of music that account for small and large variation in melody, and for reasons of mathematical tractability, the transition probability of harmonic amplitude is assumed to be Gaussian. To estimate and track the parameters, the sequential Monte Carlo method is utilized. Experimental results show that the performance of the proposed algorithm is better than or comparable to other well-known melody extraction algorithms in terms of the raw pitch accuracy (RPA) and the raw chroma accuracy (RCA).
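The two rules quoted in the excerpts (octave-limited transitions; 60-200-cent vocal vibrato vs 20-30 cents for instruments) can be folded into a continuous transition weight over pitch candidates. The sketch below is a hedged illustration of that idea, not the paper's transition probability; the Gaussian falloff and its scaling are assumptions.

```python
import numpy as np

def transition_weight(prev_hz, cand_hz, vibrato_extent_cents=200.0):
    """Heuristic transition weight between successive melody-pitch candidates.

    Rule 1: disallow jumps beyond one octave (1200 cents).
    Rule 2: moves within the vocal vibrato extent are the most plausible.
    """
    delta = abs(1200.0 * np.log2(cand_hz / prev_hz))   # interval in cents
    if delta > 1200.0:                                  # Rule 1
        return 0.0
    # Rule 2: Gaussian falloff scaled by the vibrato extent (an assumption)
    return float(np.exp(-0.5 * (delta / vibrato_extent_cents) ** 2))
```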
... As an expressive means, vibrato plays an important role in conveying the musical interpretation to the listener. In recent research, note-by-note relations between averaged vibrato rate and extent and structural aspects such as phrasing, melodic charge and metrical stress were reported [1], [2]. This suggests that the performer varies vibrato rate and extent in a certain structured way. ...
... Thus, the phrasing arches, calculated for sound level, also apply to vibrato extent. Furthermore, rules were formulated for the dependence of vibrato rate and extent on melodic charge, as reported in another study [1]. In this study it was shown that both vibrato rate and extent increased with melodic charge in violin performance. ...
Article
Full-text available
Vibrato is one of the most important expressive parameters that players can control when rendering a piece of music. The simulation of vibrato, in systems for automatic music performance, is still an open problem. A mere regular periodic modulation of pitch generally yields unsatisfactory results, sounding both unnatural and mechanical. An appropriate control of vibrato rate and vibrato extent is a major requirement of a successful vibrato model. The goal of the present work was to develop a generative, rule-based model for expressive violin vibrato. Measurements of vibrato as performed by professional violinists were used for this purpose. The model generates vibrato rate and extent envelopes, which are used to control a sampled violin synthesizer.
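The model described above outputs vibrato rate and extent envelopes that drive a sampled-violin synthesizer. Its actual rules are not given here, so the sketch below only shows the control plumbing such envelopes require: integrating a time-varying rate into a modulation phase and applying a time-varying extent in cents. The example envelopes are placeholders, not the model's output.

```python
import numpy as np

def f0_with_vibrato(f0_hz, rate_env_hz, extent_env_cents, fps):
    """Apply per-frame vibrato rate/extent envelopes to an f0 track."""
    phase = 2 * np.pi * np.cumsum(rate_env_hz) / fps    # integrate the rate
    dev_cents = extent_env_cents * np.sin(phase)
    return f0_hz * 2.0 ** (dev_cents / 1200.0)

# Illustrative envelopes for a 1-second note at 100 frames/s (assumed values):
fps, n = 100, 100
rate_env = np.linspace(5.5, 7.0, n)                     # rate rises over the note
extent_env = 20.0 * np.hanning(n)                       # extent swells and decays
f0 = f0_with_vibrato(np.full(n, 440.0), rate_env, extent_env, fps)
```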
... Vibrato is an expressive ornamentation that is frequently used in music [19,14]. The term vibrato is generally understood to refer to a quasi periodic modulation of the pitch or fundamental frequency. ...
... Accordingly, the vibrato signal has a quasi-periodic variation of the fundamental frequency around a central value. Perceptually, this modulation does not directly affect the perceived pitch of the note [19]. The perceived pitch of vibrato notes has been investigated repeatedly and has been found to be given by a value close to the mean pitch over a vibrato period [20]. ...
Conference Paper
Full-text available
This paper describes research into signal transformation operators that allow the vibrato extent in recorded sound signals to be modified. A number of operators are proposed that deal with the problem at different levels of complexity. The experimental validation shows that the operators are effective in removing existing vibrato from real-world recordings, at least for the idealized case of long notes with properly segmented vibrato sections. It also shows that, for instruments with a significant noise level (flute), independent treatment of the noise and harmonic signal components is required.
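The simplest of the operators alluded to above can be pictured as estimating the slow pitch carrier and blending between it and the original track. The sketch below is a hedged illustration of that idea only; the paper's operators are more elaborate and, for noisy instruments such as the flute, treat noise and harmonic components separately.

```python
import numpy as np

def scale_vibrato(f0_hz, fps, factor=0.0, rate_hz=6.0):
    """Rescale vibrato extent on an f0 track (factor=0 removes it, 1 keeps it).

    The carrier (slow pitch movement) is estimated with a moving average
    whose length is one assumed vibrato period.
    """
    win = max(3, int(round(fps / rate_hz)))             # ~one vibrato period
    kernel = np.ones(win) / win
    pad = win // 2
    padded = np.pad(f0_hz, pad, mode="edge")            # soften edge effects
    carrier = np.convolve(padded, kernel, mode="same")[pad:pad + len(f0_hz)]
    return carrier + factor * (f0_hz - carrier)
```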
... In general, confusion areas tend to be concentrated from the bottom left to the center area of the graph. The extent and rate of the artificial tones that are highly misclassified seem to be around the range of singers' vibratos, which is said to be around 0.6 to 2 semitones with rates around 5.5 to 8 Hz [30]. We also observe a within-system difference, i.e., the presence and the type of formants affect the models. ...
Preprint
Full-text available
Since the vocal component plays a crucial role in popular music, singing voice detection has been an active research topic in music information retrieval. Although several proposed algorithms have shown high performance, we argue that there is still room for improvement in building a more robust singing voice detection system. In order to identify the areas of improvement, we first perform an error analysis of three recent singing voice detection systems. Based on the analysis, we design novel methods to test the systems on multiple sets of internally curated and generated data to further examine the pitfalls, which are not clearly revealed by the current datasets. From the experiment results, we also propose several directions towards building a more robust singing voice detector.
... Unlike the pitch of speech, which has speaker-dependent intonation patterns and pitch levels, the pitch of the singing voice follows the frequency of the musical tone. Vibrato is a periodic, more or less sinusoidal, modulation of the pitch, amplitude, and/or timbre of a musical tone [26]. It adds a special quality to the tone and seems to be driven by a pulsation of subglottal pressure [18]. ...
Article
Vibrato is a slightly tremulous effect imparted to vocal or instrumental tone for added warmth and expressiveness through slight variation in pitch. It corresponds to a periodic fluctuation of the fundamental frequency. It is common for a singer to develop a vibrato function to personalize his/her singing style. In this paper, we explore the acoustic features that reflect vibrato information in order to identify singers of popular music. We start with an enhanced vocal detection method that allows us to select vocal segments with high confidence. From the selected vocal segments, the cepstral coefficients, which reflect the vibrato characteristics, are computed. These coefficients are derived using bandpass filters, such as parabolic and cascaded bandpass filters, spread according to the octave frequency scale. The strategy of our classifier formulation is to utilize high-level musical knowledge of song structure in singer modeling. Singer identification is validated on a database containing 84 popular songs from commercially available CD recordings from 12 singers. We achieve an average error rate of 16.2% in segment-level identification.
... The melody line is characterized by prolonged periods of smoothness, with infrequent sharp changes at note transitions or during vibrato regions. Furthermore, there are two general rules concerning the melody line: 1) the vibrato exhibits an extent of 60-200 cents for singing voice and only 20-30 cents for other instruments [8], and 2) transitions are typically limited to one octave [1]. Therefore, the assumption that v_{ω0,t−1} follows a Gaussian distribution with fixed variance is not appropriate. ...
... Vibrato is a periodic, more or less sinusoidal, modulation of the pitch and amplitude of a musical tone [14]. Vocal vibrato can be seen as a function of the style of singing associated with a particular singer [11]. ...
Conference Paper
Timbre can be defined as the feature of an auditory stimulus that allows us to distinguish sounds which have the same pitch and loudness. In this paper, we explore a timbre-based perceptual feature for singer identification. We start with a vocal detection process to extract the vocal segments from the sound. The cepstral coefficients, which reflect timbre characteristics, are then computed from the vocal segments. The cepstral coefficients of timbre are formulated by combining harmonic information with the dynamic characteristics of the sound, such as vibrato and the attack-decay envelope of the songs. Bandpass filters that spread according to the octave frequency scale are used to extract vibrato and harmonic information from the sounds. The experiments are conducted on a database of 84 popular songs. The results show that the proposed timbre-based perceptual feature is robust and effective. We achieve an average error rate of 12.2% in segment-level singer identification.
... If this expectation is violated, the deviation should sooner or later be resolved. Context-dependent norms were further mentioned by Clarke (1995), who suggested the possibility that a way of performing certain figures, such as the long/short interpretation of equal quarter notes, can become the norm from which later performances might deviate. Timmers and Desain (2000) have found musicians referring to the process of setting the norm and deviating from it within the performance of a single piece. Repp (1998) has suggested the existence of expectations on performance variations based on previous variations, but has rejected this hypothesis on the basis of the findings in his own study. He found that ...
Article
Full-text available
The aim of this study was to show that the quality of an expressive interpretation depends on expressive context. The main hypothesis was that expression is evaluated in relation to preceding expressive variations. Two experiments and a model tested this hypothesis. In the first experiment, 39 listeners rated the quality of the performance of the continuation (second half of the musical stimulus) given the performance of the initiation (first half of the musical stimulus). The results showed a significant effect of continuation on the quality judgements and a significant interaction between continuation and initiation. This interaction was seen as the first confirmation of the hypothesis. In the second experiment, 20 participants rated the quality of the six performances of the initiation and of the continuation separately. The results of this experiment were unable to explain the quality judgements of experiment 1. The low agreement between the judgements was taken as a second confirmation that contextual considerations can overrule general aesthetic preference. A regression model was proposed that predicts the quality rating of experiment 1 from the similarity in rubato extent, key velocity pattern, average articulation, grace note duration and average asynchrony between the two segments. This model was better able to explain the quality judgements of the continuation, providing final confirmation that the quality of the second half was a function of its agreement with the first half.
... In the case where the performance is not given, a set of performance rules is in charge of calculating it. More precisely, these rules determine the characteristics of the vibrato [51], [72], [52], [80] and the exact value of the pitch at any instant. They also compute the precise loudness shape over time of each note and how successive notes are connected or not; this is specified in terms of energy and some attributes of timbre such as spectral slope. ...
Article
The singing voice has been synthesized by computer since as early as the beginning of the 1960s. Since these first experiments, the musical and natural quality of singing voice synthesis has largely improved, and high-quality commercial applications can be envisioned for the near future. This talk gives an overview of synthesis methods, control strategies and research in this field. Future challenges include improvements to synthesizer models, automatic estimation of model parameter values from recordings, learning techniques for automatic rule construction, and gaining a better understanding of the technical, acoustical and interpretive aspects of the singing voice.
... For the voice, the average rate is around 6 Hz and increases exponentially over the duration of a note event [12]. The average extent ranges from 0.6 to 2 semitones for singers and from 0.2 to 0.35 semitones for string players [13]. ...
Article
Full-text available
In this paper we investigate the problem of locating singing voice in music tracks. As opposed to most existing methods for this task, we rely on the extraction of characteristics specific to the singing voice. In our approach we suppose that the singing voice is characterized by harmonicity, formants, vibrato and tremolo. In the present study we deal only with the vibrato and tremolo characteristics. For this, we first extract sinusoidal partials from the musical audio signal. The frequency modulation (vibrato) and amplitude modulation (tremolo) of each partial are then studied to determine whether the partial corresponds to singing voice, in which case the corresponding segment is supposed to contain singing voice. For this we estimate, for each partial, the rate (frequency of the modulations) and the extent (amplitude of the modulations) of both vibrato and tremolo. A partial selection is then operated based on these values. A second criterion based on harmonicity is also introduced. Based on this, each segment can be labelled as singing or non-singing. Post-processing of the segmentation is then applied in order to remove short-duration segments. The proposed method is evaluated on a large manually annotated test set. The results of this evaluation are compared to those obtained with a usual machine learning approach (MFCC and SFM modeling with GMMs). The proposed method achieves results very close to the machine learning approach: 76.8% compared to 77.4% F-measure (frame classification). This result is very promising, since the two approaches are orthogonal and can therefore be combined.
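The partial-selection step of this method can be sketched as follows: for each extracted partial, estimate the rate and extent of its frequency modulation (vibrato) and amplitude modulation (tremolo), then keep partials whose modulations fall in plausible vocal ranges. The function names and thresholds below are assumptions consistent with ranges quoted elsewhere on this page, not the paper's tuned values.

```python
import numpy as np

def modulation_rate_extent(track, fps, lo=2.0, hi=12.0):
    """Dominant modulation rate (Hz) and relative extent of a partial's track."""
    x = track - np.mean(track)
    spec = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)
    band = (freqs >= lo) & (freqs <= hi)
    rate = freqs[np.argmax(np.where(band, spec, 0.0))]
    extent = 2.0 * np.std(x) / np.mean(track)   # relative modulation depth
    return rate, extent

def is_vocal_partial(freq_track, amp_track, fps):
    """Accept a partial as sung if both FM and AM look like vibrato/tremolo."""
    fm_rate, fm_extent = modulation_rate_extent(freq_track, fps)
    am_rate, am_extent = modulation_rate_extent(amp_track, fps)
    return (4.0 <= fm_rate <= 8.0 and fm_extent > 0.01 and   # assumed thresholds
            4.0 <= am_rate <= 8.0 and am_extent > 0.05)
```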
... It should be noted that musical instruments also exhibit considerable vibrato. However, it has been observed that vibrato extent is lower in musical instruments (0.2-0.35 semitones) than in singers (0.6 to 2.0 semitones) [27]. Thus, on key filtering, the attenuation of the musical instruments will be greater than that of the singing voice. ...
Article
We present a framework to detect the regions of singing voice in musical audio signals. This work is oriented towards the development of a robust transcriber of lyrics for karaoke applications. The technique leverages a combination of low-level audio features and higher-level musical knowledge of rhythm and tonality. Musical knowledge of the key is used to create a song-specific filterbank to attenuate the presence of the pitched musical instruments. This is followed by subband processing of the audio to detect the musical octaves in which the vocals are present. Text processing is employed to approximate the duration of the sung passages using freely available lyrics. This is used to obtain a dynamic threshold for vocal/non-vocal segmentation. This pairing of audio and text processing helps create a more accurate system. Experimental evaluation on a small database of popular songs shows the validity of the proposed approach. Holistic and per-component evaluation of the system is conducted and various improvements are discussed.
... Thus, the statement suggests that the violin extent would be no more than 17 to 62.5 cents, with a mean of 30.5 cents. In another study, the vibrato extent for Western string players was shown to be 0.2 to 0.35 semitones [29]. The vibrato extent of violin in the present study is very close to that reported in the literature. ...
Conference Paper
Full-text available
This study compares Chinese and Western instrument playing styles, focussing on vibrato in performances of Chinese music on the erhu and on the violin. The analysis is centered on erhu and violin performances of the same piece, comparing parameters of vibrato extracted from recordings. The parameters studied include vibrato rate, extent, sinusoid similarity and the number of humps in the vibrato f0 envelope. Results show that erhu and violin playing have similar vibrato rates and significantly different vibrato extents, with the erhu exhibiting greater vibrato extent than the violin. Moreover, the vibrato shape of the erhu samples was more similar to a sinusoid than that of the violin vibrato samples. The number of vibrato f0 envelope humps is positively correlated with the number of beats for both erhu and violin, suggesting that erhu and violin players share similar vibrato extent variation strategies.
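One of the parameters compared in this study, sinusoid similarity, can be implemented as the correlation between a note's detrended f0 envelope and the best least-squares sinusoid at the estimated vibrato rate. The study's exact definition may differ; this is a hedged sketch.

```python
import numpy as np

def sinusoid_similarity(cents, rate_hz, fps):
    """Correlation of a detrended f0 envelope (in cents) with a fitted sinusoid."""
    t = np.arange(len(cents)) / fps
    # Least-squares fit of a*sin + b*cos at the vibrato rate
    basis = np.column_stack([np.sin(2 * np.pi * rate_hz * t),
                             np.cos(2 * np.pi * rate_hz * t)])
    coef, *_ = np.linalg.lstsq(basis, cents, rcond=None)
    fitted = basis @ coef
    return float(np.corrcoef(cents, fitted)[0, 1])
```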
... The rate is constant per singer: reported ranges are 5.5-8 Hz [TD00] and 6-7 Hz [DHAT99], within an overall span of 4-12 Hz, and the rate increases towards the end of the note. We now present another feature of the singing voice, also found on the pitch contour, that helps to distinguish the voice from the instrumental background during the transitions between two notes. ...
Article
Full-text available
This dissertation is concerned with the problem of describing the singing voice within the audio signal of a song. This work is motivated by the fact that the lead vocal is the element that attracts the attention of most listeners. For this reason it is common for music listeners to organize and browse music collections using information related to the singing voice such as the singer name. Our research concentrates on the three major problems of music information retrieval: the localization of the source to be described (i.e. the recognition of the elements corresponding to the singing voice in the signal of a mixture of instruments), the search of pertinent features to describe the singing voice, and finally the development of pattern recognition methods based on these features to identify the singer. For this purpose we propose a set of novel features computed on the temporal variations of the fundamental frequency of the sung melody. These features, which aim to describe the vibrato and the portamento, are obtained with the aid of a dedicated model. In practice, these features are computed on the time-varying frequency of partials obtained using the sinusoidal model. In the first experiment we show that partials corresponding to the singing voice can be accurately differentiated from the partials produced by other instruments using decisions based on the parameters of the vibrato and the portamento. Once the partials emitted by the singer are identified, the segments of the song containing singing can be directly localized. To improve the recognition of the partials emitted by the singer we propose to group partials that are related harmonically. Partials are clustered according to their degree of similarity. This similarity is computed using a set of CASA cues including their temporal frequency variations (i.e. the vibrato and the portamento). The clusters of harmonically related partials corresponding to the singing voice are identified using the vocal vibrato and the portamento parameters. Groups of vocal partials can then be re-synthesized to isolate the voice. The result of the partial grouping can also be used to transcribe the sung melody. We then propose to go further with these features and study if the vibrato and portamento characteristics can be considered as a part of the singers' signature. Previous works on singer identification describe audio signals using features extracted on the short-term amplitude spectrum. The latter features aim to characterize the timbre of the sound, which, in the case of singing, is related to the vocal tract of the singer. The features we develop in this document capture long-term information related to the intonation of the singer, which is relevant to the style and the technique of the singer. We propose a method to combine these two complementary descriptions of the singing voice to increase the recognition rate of singer identification. In addition we evaluate the robustness of each type of feature against a set of variations. We show the singing voice is a highly variable instrument. To obtain a representative model of a singer's voice it is thus necessary to build models using a large set of examples covering the full tessitura of a singer. In addition, we show that features extracted directly from the partials are more robust to the presence of an instrumental accompaniment than features derived from the amplitude spectrum.
... Vibrato is a well-known property of the human singing voice [11,12]. In general, vibrato is defined as a periodic oscillation of the fundamental frequency. ...
Article
In this article, we present an improvement of a previous singing voice detector. This new detector works in two steps. First, we distinguish monophonies from polyphonies. This distinction is based on the fact that the pitch estimated in a monophony is more reliable than the one estimated in a polyphony. We study the short-term mean and variance of a confidence indicator; their repartition is modelled with bivariate Weibull distributions. We present a new method to estimate the parameters of these distributions with the moment method. Then, we detect the presence of singing voice. This is done by looking for the presence of vibrato, an oscillation of the fundamental frequency between 4 and 8 Hz. In a monophonic context, we look for vibrato on the pitch. In a polyphonic context, we first perform frequency tracking on the whole spectrogram, and then look for vibrato on each frequency track. Results are promising: from a global error rate of 29.7% (previous method), we fall to a global error rate of 25%. This means that taking the context (monophonic or polyphonic) into account leads to a relative gain of more than 16%.
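The monophonic branch of this detector looks for an oscillation of the fundamental frequency between 4 and 8 Hz. A minimal sketch of that test on a single pitch track follows (the polyphonic branch additionally tracks every spectrogram line and applies the same test per track); the in-band/out-of-band energy ratio threshold is an assumption.

```python
import numpy as np

def has_vibrato(f0_hz, fps, lo=4.0, hi=8.0, ratio=3.0):
    """True if the pitch track oscillates predominantly in the 4-8 Hz band.

    `ratio` (in-band vs out-of-band energy, an assumed threshold) controls
    how pronounced the oscillation must be.
    """
    cents = 1200.0 * np.log2(f0_hz / np.mean(f0_hz))
    spec = np.abs(np.fft.rfft(cents - np.mean(cents))) ** 2
    freqs = np.fft.rfftfreq(len(cents), d=1.0 / fps)
    inband = spec[(freqs >= lo) & (freqs <= hi)].sum()
    # exclude DC and very slow drift from the out-of-band reference
    outband = spec[((freqs > 1.0) & (freqs < lo)) | (freqs > hi)].sum()
    return inband > ratio * max(outband, 1e-12)
```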
... It is commonly parameterized by its rate (the modulation frequency, given in Hertz) and its extent (the modulation's amplitude, given in cents). These parameters have been studied extensively from musicological and psychological perspectives, often in a cumbersome process of manually annotating spectral representations of monophonic music signals, see for example [5,10,18,20,22]. ...
Conference Paper
It is common that a singer develops a vibrato to personalize his/her singing style. In this paper, we explore the acoustic features that reflect vibrato information in order to identify singers of popular music. We start with an enhanced vocal detection method that allows us to select vocal segments with high confidence. From the selected vocal segments, the cepstral coefficients, which reflect the vibrato characteristics, are computed. These coefficients are derived using cascaded bandpass filters spread according to the octave frequency scale. We employ high-level musical knowledge of song structure in singer modeling. Singer identification is validated on a database containing 84 popular songs from commercially available CD recordings from 12 singers. We achieve an average error rate of 16.2% in segment-level identification.
Conference Paper
Timbre is the quality of sound which allows the ear to distinguish between musical sounds. In this paper, we study timbre effects in the identification of singing voice segments in popular songs. Firstly, we discriminate between singing voice and instrumental segments in a song. Then, singing voice segments are further categorized according to their singer identity. Timbre-motivated effects are formulated by fusion of systems that use features from vibrato and harmonic information, as well as other features extracted using Mel and log-frequency scale filter banks. Statistical methods to select singing voice segments with a high confidence measure are proposed for better performance in the singer identification process. The experiments conducted on a database of 214 popular songs show that the proposed approach is effective.
Conference Paper
We propose a co-training algorithm to detect singing voice segments in pop songs. The co-training algorithm leverages compatible and partially uncorrelated information across different features to effectively boost the model from unlabeled data. We adopt this technique to take advantage of abundant unlabeled songs and explore the use of different acoustic features, including vibrato, harmonic, attack-decay and MFCC (mel-frequency cepstral coefficient) features. The proposed algorithm substantially reduces the amount of manual labeling work and the computational cost. The experiments are conducted on a database of 94 pop solo songs. We achieve an average error rate of 17% in segment-level singing voice detection.
Conference Paper
Perceptual features are motivated by human perception of sounds. In this paper, several perceptually-motivated features such as harmonic, vibrato and timbre are studied to detect singing voice segments in a song. In addition, singing formant and attack-decay envelope of the sound are also studied for acoustic feature formulation. The cepstral coefficients which reflect the timbre characteristics are formulated by combining information from harmonic content, vibrato, singing formant and attack-decay envelope of the sound. Bandpass filters that spread according to the octave frequency scale are used to extract vibrato and harmonic information. Several experiments are conducted using a database that includes 84 popular songs from commercially available CD recordings. The experiments show that the proposed feature formulation methods are effective.
Conference Paper
By investigating the production mechanism of vocal vibrato, a vibrato control model based on a nonlinear digital feedback oscillator is proposed. The feedback oscillation behaviour of the laryngeal agonist-antagonist muscles is modeled by a self-oscillating feedback oscillator, in which a nonlinear limiter is introduced to describe the physical limitation of muscle activity. The proposed model provides flexible control over the following dynamic characteristics of vocal vibrato: (1) the attack and release; (2) irregularities of the vibrato rate and extent; (3) the increasing vibrato rate at the end of a note. Psychoacoustic experiments were conducted to evaluate the model and to demonstrate how these characteristics influence singing perception. Experimental results show that the dynamic characteristics affect the quality of vocal vibrato. Moreover, introducing attack and release could significantly improve the naturalness of the synthetic singing voice.
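The three dynamic characteristics this model controls can be illustrated with a much simpler open-loop generator than the paper's feedback oscillator: an attack/release envelope on the extent, a small random-walk irregularity on the rate, and a rate that rises towards the end of the note. The sketch below is only that illustration; all constants are assumptions.

```python
import numpy as np

def dynamic_vibrato(dur=1.5, sr=44100, f0=440.0, extent_cents=80.0,
                    base_rate=5.5, rate_rise=1.0, attack=0.3, release=0.2,
                    jitter=0.03, seed=0):
    """Vibrato tone with attack/release, irregularity, and end-of-note rate rise."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(dur * sr)) / sr
    rate = base_rate + rate_rise * (t / dur) ** 2        # rate rises at note end
    # slight irregularity: a scaled random walk on the rate (assumed amount)
    rate = rate * (1.0 + jitter * rng.standard_normal(len(t)).cumsum()
                   / np.sqrt(len(t)))
    # attack/release envelope on the vibrato extent
    env = np.minimum(1.0, t / attack) * np.minimum(1.0, (dur - t) / release)
    dev = extent_cents * env * np.sin(2 * np.pi * np.cumsum(rate) / sr)
    f_inst = f0 * 2.0 ** (dev / 1200.0)
    return np.sin(2 * np.pi * np.cumsum(f_inst) / sr)
```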
Conference Paper
Quality of singing is a subjective description of an impression received while listening to a singer. The features that can be related to the quality of singing include, among others, intonation, vibrato, tremolo and timbre. This paper presents the results of the analysis of the tremolo feature in singing. The method presented here is developed to estimate tremolo over singing samples. First, the term tremolo is defined, and its differences from and similarities to other musical terms, such as vibrato and wobble, are described. Then the method developed to estimate tremolo over singing samples is presented. Finally, the authors try to answer the question of whether the tremolo feature is useful for evaluating the quality of singing.
Conference Paper
The article presents the results of signal analysis of recorded singing voice samples. For this study, recorded samples of the "a-e-i-o-u" exercise are analysed. Some significant parameters describing the voice are estimated, among them: pitch, calculated using the autocorrelation method; the values of the first five harmonics; a set of parameters containing the first five formants; and the value of a vibrato parameter. The analysis of these parameters made it possible, among other things, to draw conclusions about dependencies between their values for different sung vowels. The study is part of broader research on singing voice signal analysis. The results may contribute to the development of diagnostic tools for the computer analysis of singers' and speakers' voices. The authors try to answer the question of whether the dependencies observed between the parameters may be useful for evaluating the quality of singing. This may be useful when designing computer-based singing assessment systems.
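The pitch parameter mentioned above is calculated with the autocorrelation method. A standard minimal sketch of that method on a single analysis frame follows; the 80-1000 Hz search range is an assumption suited to singing, and the frame must be longer than the longest lag searched.

```python
import numpy as np

def autocorr_pitch(frame, sr, fmin=80.0, fmax=1000.0):
    """Estimate f0 of a short audio frame via the autocorrelation method."""
    x = frame - np.mean(frame)
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]   # non-negative lags only
    lo, hi = int(sr / fmax), int(sr / fmin)             # plausible lag range
    lag = lo + int(np.argmax(ac[lo:hi]))                # strongest periodicity
    return sr / lag
```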
Article
Full-text available
Since communication and expression are central aspects of music performance, it is important to develop a systematic pedagogy of teaching children and teenagers expressiveness. Although research has been growing in this area, a comprehensive literature review that unifies the different approaches to teaching young musicians expressiveness has been lacking. Therefore, the aim of this article is to provide an overview of literature related to teaching and learning of expressiveness from music psychology and music education research in order to build a new theoretical framework for teaching and learning expressive music performance in instrumental music lessons with children and teenagers. The article will start with a brief discussion of interpretation and expression in music performance, before providing an overview of studies that investigated teaching and learning of performance expression in instrumental music education with adults and children. On the foundation of this research, a theoretical framework for dialogic teaching and learning of expressive music performance will be proposed and the rationale explained. Dialogic teaching can be useful for scaffolding young musicians' learning of expressivity, as open questions can stimulate thinking about the interpretation and may serve to connect musical ideas to the embodied experience of the learner. A "toolkit" for teaching and learning of expressiveness will be presented for practical application in music lessons. In addition, a theoretical model will be proposed to further our understanding of teaching and learning of expressive music performance as a multifaceted and interactive process that is embedded in the context of tutors' and learners' experiences and environment. Finally, implications of this framework and suggestions for future research will be discussed.