Archived project

Glissando: A corpus for multidisciplinary prosodic studies in Spanish and Catalan

Goal: The Glissando corpus is a prosodic corpus for Spanish and Catalan that demonstrates the potential of extensive speech corpora for the empirical study of prosody. The corpus comprises two distinct data sets, a news subcorpus and a dialogue subcorpus, the latter containing either unplanned conversations or task dialogues oriented to a specific goal in the domain of information requests (travel information, information about a university exchange course and information about a tourist route). The recordings were made in high acoustic quality by two profiles of speakers: broadcasting and advertising professionals, and native undergraduate students. The twenty-five hours of recordings cover different reading styles (radio, advertising and neutral), registers (news reading, formal and informal dialogue), voices (male and female) and languages (Central Catalan and standard European Spanish). The inclusion of these variables aims to facilitate cross-linguistic, inter-speaker and inter-style analyses.

Date: 1 January 2008 - 31 December 2014


Project log

Lourdes Aguilar
added 4 research items
A review of the literature on prosody reveals a lack of corpora for prosodic studies in Catalan and Spanish. The Glissando corpus is a prosodic corpus for Spanish and Catalan that aims to fill this gap by showing the potential of extensive speech corpora for the empirical study of prosody. With the aid of this corpus, it is possible to analyze differences due to linguistic and sociolinguistic criteria such as genre, speech style or voice register.
Large prosodically labeled corpora are needed to make real progress in understanding the prosodic mechanisms that govern spoken communication between humans. ToBI is a framework for developing community-wide conventions to transcribe the intonation and prosodic structure of spoken utterances in a language variety. This fine-grained labeling system requires well-trained transcribers. Each language therefore needs a freely available manual to teach the system to new transcribers, with many recorded examples of transcribed utterances graded from easy to difficult. The paper describes the content of two websites with the Catalan and Spanish ToBI Training Materials, each containing explanations and interactive exercises with sounds and graphics and, for the first time in the ToBI Training Materials, a basic ear training section.
Previous accounts of prosodic phrasing patterns in Catalan (Frota et al., 2007) have shown various cues that are indicative of phrase boundaries. This paper reports on the distributional analysis of the boundary types in the Festcat corpus. The most relevant results are the following: (1) Phrasing strategies are equivalent in the two voices (male and female) and the data are consistent across the different text styles. (2) As has been shown for other languages, the size of prosodic constituents is balanced with respect to length constraints (Prieto, 2007). (3) It is not the selection of different acoustic correlates that differentiates between the prosodic units. When faced with large sets of data, it is more a matter of finding likelihoods than simple mappings from acoustics to a clear set of prosodic features.
Lourdes Aguilar
added a research item
This study analyzes vowel sequences across word boundaries in Peninsular Spanish, in a corpus of read and spontaneous speech, with respect to three processes: hiatus maintenance, diphthong formation (synalepha, syllabic contraction) and deletion. F1/F2 data for vowels in syllabic nucleus position are used as a reference to specify the articulatory modifications that vowels undergo when they are in contact between two words, and to establish the type of segment that arises in deletion processes. The results indicate that: a) the vowels produced in a hiatus do not differ from the same vowels produced in a consonantal environment; b) although the influence of one vowel on another in the formation of homosyllabic groups can be quantified, the variability is high and the acoustic data do not allow deciding which of the two elements is the nucleus, since both the word-final vowel and the following word-initial vowel undergo frequency modifications to adapt to the syllabic configuration; c) the phonetic result of a deletion process is varied, and the frequency patterns of the resulting segments do not coincide with the reference values assigned to the canonical vowels.
Lourdes Aguilar
added 2 research items
This work provides an overview of the phonological and phonetic processes in Spanish: the former are those phenomena that occur systematically in any word or sequence of words in the language and are shared by the entire linguistic community; the latter are only regular in certain speech situations, affect a restricted number of words, or are used only by a single speaker or a small group of speakers. The phonological processes formalize the changes that segments undergo depending on the context in which they appear (allophonic variation or contextual processes of articulatory modification), but also others that have to do with articulatory force (simplification and strengthening processes). The presentation of phonetic processes is based on a set of experimental studies that focus on the various acoustic realizations of vowels and consonants in Spanish, which are mostly explained by the application of weakening rules with respect to the canonical form (which coincides with the one that usually appears in the isolated pronunciation of the words that contain the segment). The review of the main phonological and phonetic processes in Spanish allows us to establish a set of generalizations relevant to the relationship between phonetics and phonology, as well as to the correspondence between the notions of phonological change and process.
Lourdes Aguilar
added 2 research items
The study deals with the main acoustic cues that differentiate the phonetic realizations of /b d g/ in lenition contexts in spontaneous speech. Duration and intensity data for the approximant consonants, gathered from the spontaneous subcorpus of the Glissando corpus, make it possible to establish non-discrete categories within an allophonic continuum: closed approximants are longer and less intense than open approximants, which in turn are longer and less intense than vocalic approximants. Temporal variations remain independent of prosodic (stressed or unstressed syllable), phonological (simple or complex onset) and phonetic (postvocalic or postconsonantal) conditions. Regarding intensity, new measures are necessary, such as the relative intensity with respect to the adjacent vowels or the difference between maximum and minimum values in the consonant-vowel interval.
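The two intensity measures suggested at the end of this abstract can be sketched as follows. This is only an illustration: the abstract does not define the measures precisely, so the function names, the use of decibel values and the choice of averaging the adjacent vowels are all assumptions.

```python
from statistics import mean


def relative_intensity(consonant_db, adjacent_vowels_db):
    """Consonant intensity minus the mean intensity of its adjacent vowels (dB).

    Assumption: "relative intensity" is taken as a simple difference from the
    average of whichever adjacent vowels exist (one or two).
    """
    if not adjacent_vowels_db:
        raise ValueError("at least one adjacent vowel is required")
    return consonant_db - mean(adjacent_vowels_db)


def intensity_range(cv_contour_db):
    """Difference between maximum and minimum intensity in a CV interval (dB)."""
    if not cv_contour_db:
        raise ValueError("the intensity contour is empty")
    return max(cv_contour_db) - min(cv_contour_db)


# Hypothetical values, not taken from the corpus.
rel = relative_intensity(60.0, [70.0, 72.0])
rng = intensity_range([60.0, 65.5, 71.0])
```

Under these assumptions, a more strongly lenited (more vowel-like) consonant would show a relative intensity closer to zero and a smaller intensity range.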
Lourdes Aguilar
added a research item
The present study examines the patterns in stress, phrasing and intonation found in a Spanish corpus of news read by broadcasters to describe the prosodic strategies that can be considered as genre-distinguishing features. Results indicate that, firstly, the main stress modifications concern the upgrading of unstressed syllables to accented ones, the stress shift to mark word-initial boundaries and the maintenance of adjacent stresses. Secondly, the special features related to phrasing are unexpected pauses, which enhance the prosodic units that offer new information, and the prosodic marking of initial edges of groups with the aim of capturing the listener’s attention. Finally, the most relevant tonal events that identify the typical chanting of broadcasters are a recurrent use of rises whose f0 peak coincides with the stressed syllable, a variety of non-falling pitch movements signalling intermediate phrasing, and the use of rising-falling pitch movements to signal ends. All the described prosodic and tonal strategies contribute to obtaining an emphatic style in news reading and are representative of a prosodically marked genre.
Lourdes Aguilar
added 2 research items
Several studies on the distinction between hiatus and diphthong have shown that a lexical treatment of this phenomenon is the best explanation within the domain of the word. In the present study we ask what happens across word boundaries, by designing an experiment with a twofold aim: to determine the acoustic cues that allow the phonetic outcomes to be categorized, and to identify the variables that influence the resolution of vowel contacts. The results show that the processes operating at word junctures are heterosyllabification, reduction and elision, each of which triggers specific temporal and frequency patterns. As for the factors that predict the outcomes, elision is shown to be very frequent when a function word is involved in the juncture, since a phonological phrase is formed, whereas if the contact occurs between two words with lexical meaning, restructuring into a phonological phrase is optional and, consequently, so are reduction and elision.
Yurena Gutiérrez
added 2 research items
The aim of this paper is to present a phonological description of the boundary tones in final and non-final declarative sentences in Spanish, drawn from a read news corpus and a dialogue corpus. The final clauses tend to finish with L*L% and sometimes L+H*L%. Four different pitch configurations can be found for non-final patterns: a rise (L*H%), a fall-to-mid (H*!H%), a fall-rise (H*LH%) and a sustained tone, which presents different phonetic manifestations depending on the pitch level of the previous accent (H*, !H* or L*). These findings question the validity of the traditional Sp_ToBI convention (HL%) to describe a sustained tone since it cannot account for a level pitch after !H* or L*. A new boundary tone, =%, is proposed whose feature for pitch height is underspecified. For this reason, it can adopt the values of H, !H or L, according to the pitch height of the last accent.
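The underspecified boundary tone described in this abstract can be illustrated with a small lookup. This is a minimal sketch, not the authors' formalization: the function names are hypothetical, only the three preceding accents named in the abstract (H*, !H*, L*) are covered, and the resolved height is printed as a label purely for display.

```python
# Assumed mapping from the last pitch accent to the height adopted by "=%".
SUSTAINED_HEIGHT = {"H*": "H", "!H*": "!H", "L*": "L"}


def resolve_sustained_boundary(last_accent):
    """Return the pitch height that the underspecified tone "=%" adopts."""
    try:
        return SUSTAINED_HEIGHT[last_accent]
    except KeyError:
        raise ValueError(f"no height defined for accent {last_accent!r}") from None


def surface_contour(last_accent, boundary):
    """Spell out a nuclear configuration, resolving "=%" from the last accent."""
    if boundary == "=%":
        return f"{last_accent} {resolve_sustained_boundary(last_accent)}%"
    return f"{last_accent} {boundary}"


contour = surface_contour("!H*", "=%")
```

The point of the sketch is that a single label covers level pitch after any of the three accents, which a fixed HL% label cannot do.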
This paper presents a novel methodology to characterize the style of different speakers or groups of speakers. This methodology uses sequences of prosodic labels (automatic Sp_ToBI labels) to compare and differentiate these speaking styles. A set of metrics based on conditional entropy is used to compute the distance between two speakers or groups of speakers according to their use of sequences of prosodic labels. Additionally, the most contrastive sequences of labels are identified as characteristic patterns of the speaking styles represented in a given corpus. When this methodology is applied to a corpus of radio news items, the most frequent prosodic patterns coincide with those previously characterized in studies of radio style. Finally, a perceptual test verifies that the participants attribute these characteristic patterns to the radio news style.
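To make the idea of an entropy-based distance over label sequences concrete, here is a minimal sketch. It is not the metric set used in the paper, which the abstract does not specify: the bigram model, the add-one smoothing, the symmetrized distance and all function names are assumptions chosen for illustration.

```python
import math
from collections import Counter


def conditional_entropy(labels):
    """H(next | previous) in bits, estimated from adjacent label pairs."""
    pairs = list(zip(labels, labels[1:]))
    if not pairs:
        return 0.0
    pair_counts = Counter(pairs)
    prev_counts = Counter(prev for prev, _ in pairs)
    entropy = 0.0
    for (prev, _), count in pair_counts.items():
        p_pair = count / len(pairs)
        p_next_given_prev = count / prev_counts[prev]
        entropy -= p_pair * math.log2(p_next_given_prev)
    return entropy


def cross_conditional_entropy(test, train, vocab):
    """Average bits needed to code test bigrams with a bigram model of train.

    Add-one smoothing (an assumption) keeps unseen bigrams finite.
    """
    pairs = list(zip(test, test[1:]))
    if not pairs:
        return 0.0
    pair_counts = Counter(zip(train, train[1:]))
    prev_counts = Counter(train[:-1])
    bits = 0.0
    for prev, nxt in pairs:
        p = (pair_counts[(prev, nxt)] + 1) / (prev_counts[prev] + len(vocab))
        bits -= math.log2(p)
    return bits / len(pairs)


def style_distance(a, b):
    """Symmetrized extra coding cost of each speaker under the other's model."""
    vocab = set(a) | set(b)
    cost_a = cross_conditional_entropy(a, b, vocab) - cross_conditional_entropy(a, a, vocab)
    cost_b = cross_conditional_entropy(b, a, vocab) - cross_conditional_entropy(b, b, vocab)
    return 0.5 * (cost_a + cost_b)


# Hypothetical label sequences, not taken from the corpus.
speaker_a = ["L+H*", "L+H*", "H%", "L+H*", "L*", "L%"]
speaker_b = ["H*", "L*", "L%", "H*", "L*", "L%"]
d = style_distance(speaker_a, speaker_b)
```

A sequence compared with itself has distance zero by construction, and the measure is symmetric in its two arguments.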
Lourdes Aguilar
added a research item
This paper describes the conventions proposed in the first prosodic transcription system within the Tones and Break Indices (ToBI) framework for Catalan: Catalan ToBI or Cat_ToBI. The proposal is based on the previous literature on Catalan intonation and a qualitative analysis of a corpus of spoken Catalan that covers several dialects. In the tone tier, Catalan distinguishes among the following accent types: H*, L*, L+H*, L+>H*, L*+H, H+L*, the tritonal L+H*+L (Alguerese), and the upstepped and downstepped variants of H (¡H* and !H*). The model differs from the English ToBI model in that there is no phrase accent category and only one type of boundary tone occurs at the right edge of intermediate and intonational phrases. Boundary tones can be high, mid or low, and can be monotonal (H%, M%, L%) or combine into bitonal (LH%, HH%, MM% and HL%) or tritonal (LHL%) sequences. The main aims of the Cat_ToBI system are to improve our knowledge of Catalan intonation and to provide a tool for the prosodic annotation of oral corpora.
Lourdes Aguilar
added 6 project references
Lourdes Aguilar
added a project goal
Lourdes Aguilar
added 6 research items
This paper reports on the results of a pilot study that was run to assess the labeling consistency of the proposed approach in Sp-ToBI before starting a large-scale production of annotations in the Glissando project. This test should serve to refine the model and to keep the annotation conventions consistent across transcription sites. The Spanish ToBI labeling system has proved to be an effective system for annotating Spanish intonation, although the annotation conventions across transcribers require a broader consensus. This is especially needed for the following distinctions: high pitch accent (H*) versus rising pitch accent (L+H*), downstepped pitch accents versus their non-downstepped counterparts, and mid tones. A related issue is the difficulty of deciding, in a very low pitch range, whether a tone is present or the syllable is unaccented. Moreover, the statistical procedures will shed light on the most confusable tones, suggesting new approaches for the automatic prediction of ToBI labels in a Spanish spoken corpus.
This paper presents an experimental study on how corpus-based automatic prosodic labeling can be transferred from a source language to a different target language. Tonal accent identification models trained for Spanish, using the ESMA corpus, are used to automatically assign tonal accent ToBI labels to the (English) Boston Radio news corpus, and vice versa. Using just local raw prosodic acoustic features, we obtained correct annotation rates of about 75%, which provides a good starting point to speed up the automatic prosodic labeling of new unlabeled corpora. Despite the different ranges and relevance of the acoustic input features across corpora, a comparison of the results with manual labeling profiles indicates the potential of the procedure.