Lori Lamel
French National Centre for Scientific Research | CNRS · LIMSI, SLP group
S.B., S.M., PhD in EECS from MIT
About
416
Publications
83,286
Reads
14,293
Citations
Additional affiliations
January 1996 - present
January 1992 - present
December 2002 - present
Independent Researcher
Position
- Consultant
Education
September 1980 - May 1988
Massachusetts Institute of Technology
Field of study
- EECS
September 1975 - May 1980
Massachusetts Institute of Technology
Field of study
- EECS
Publications (416)
French words with "h aspiré" (aspirated h), also called "disjunctive" words, form a multifactorial phenomenon that is difficult to describe: they are rare in discourse and carry a prescriptive weight that influences speakers tested in the laboratory. This study proposes to examine disjunctivity in large corpora of natural speech, with the help of tools...
This contribution presents a study on the detection of emotions and emotion mixtures in a corpus collected at an emergency call center in Paris (CEMO). Our corpus, recorded "in the wild", is rich in vocal diversity (age, accent, number of speakers) and is annotated with an original scheme that represents up to two emotions per...
Emotion recognition in conversations is essential for ensuring advanced human-machine interactions. However, creating robust and accurate emotion recognition systems in real life is challenging, mainly due to the scarcity of emotion datasets collected in the wild and the inability to take into account the dialogue context. The CEMO dataset, compose...
Emotion detection technology to enhance human decision-making is an important research issue for real-world applications, but real-life emotion datasets are relatively rare and small. The experiments conducted in this paper use the CEMO corpus, which was collected in a French emergency call center. Two pre-trained models based on speech and text were...
This study aims to increase our knowledge of Mandarin lexical tone duration in continuous Mandarin speech. Related variation factors such as the number of syllables in the word, the position of the syllable in the word, its prosodic position and speech style were also explored. Large corpora of casual and journalistic speech (total ∼1000 hours) were used. Mo...
Speech in code-switching (CS) situations is potentially subject to increased variation due to language changes. This paper compares the oral stops of French and Algerian Arabic as produced by bilingual speakers. Arabic and French share only /t, k/ and /b, d/. Nevertheless, bilingualism favors borrowings with [p] and...
This paper builds upon recent work in leveraging the corpora and tools originally used to develop speech technologies for corpus-based linguistic studies. We address the non-canonical realization of consonants in connected speech and we focus on voicing alternation phenomena of stops in 5 standard varieties of Romance languages (French, Italian, Sp...
Automatic speech alignment systems now perform very well at producing automatic segmentations and labels, notably thanks to the variants included in their pronunciation dictionaries, for example by allowing variants with and without a liaison consonant (e.g. est pronounced [e] or [et]). Liaison in French...
What is commonly considered as an epenthetic vowel can actually refer to at least two different realities: phonological epenthesis or phonetic excrescence. French schwa, noted [ə], is a vowel alternating with zero and limited to unstressed syllables that can appear word-internally or word-finally. This paper presents an extensive description of the...
Recognizing a speaker's emotion from their speech can be a key element in emergency call centers. End-to-end deep learning systems for speech emotion recognition now achieve equivalent or even better results than conventional machine learning approaches. In this paper, in order to validate the performance of our neural network architecture for emot...
In this study we examine phonetic variation of discourse markers in French, using for this purpose the 4-hour richly annotated LOCAS-F corpus. Both linguistic factors and stylistic variables are considered: speech style, part-of-speech category, mean phone duration and vowel formant distributions with respect to the word status. The results show th...
French schwa is traditionally referred to as a weak or reduced vowel noted [ə] restricted to unstressed syllables and variably alternating with zero. It can surface word internally as in [səmɛn], semaine, ‘week’, or word-finally as in [katχə], quatre, ‘four’. In Standard French, it is considered a deletable lexical vowel when word-internal, but an...
This study aims to analyse factors that could influence schwa deletion in word-initial syllables of polysyllabic words in continuous French speech. Both phonological and extralinguistic factors were considered: number of consonants, post-lexical context, speech style, sex and profession. Three large corpora covering different speech styles were exp...
Lenition is a well-known phenomenon defined as a process whereby a consonant is “weakened”: “a segment X is said to be weaker than a segment Y if Y goes through an X stage on its way to zero” (Vennemann in Hyman 1975). A refined definition (Szigetvári 2008) distinguishes between “consonantic” lenition, where consonants become more consonant-like whe...
Phonologization is a process whereby phonetic substance becomes phonological structure [1]. The process involves at least two steps: (i) a universal phonetic ('automatic') variation becomes a language-specific ('speaker-controlled') pattern, (ii) the language-specific pattern becomes a phonological ('structured') object. This paper will focus on th...
Schwa is a weak or reduced vowel, noted [ə], that alternates with zero and is restricted to unstressed syllables. In Standard French, it can surface word-internally or word-finally. Here we present a study of word-final schwa exclusively, in particular through the lens of the question of final schwa as a “phonetic lubricant...
This article presents a state of the art on a relatively recent and little-described phenomenon of spoken French in metropolitan France: fricative epithesis, which consists of adding an unpredictable voiceless consonantal sound, often dorso-palatal, at the end of utterance-final words when they end in a vowel...
Automated exploration of large corpora enables a finer analysis of the relationship between synchronic patterns of phonetic variation and lasting diachronic changes: errors in automatic transcriptions are highly informative about contextual variation in continuous speech and about possible systemic mutations...
The present paper aims at providing a first study of lenition- and fortition-type phenomena in coda position in Romanian, a language that can be considered as less-resourced. Our data show that there are two contexts for devoicing in Romanian: before a voiceless obstruent, which means that there is regressive voicelessness assimilation in the langu...
This study investigates the tendency towards word-final devoicing of voiced obstruents in Standard French, and how devoicing is influenced by domain, speech style, manner and place of articulation. Three large corpora with automatic segmentations produced by forced alignment are used: ESTER, ETAPE and NCCFr. A voicing-ratio is established for each...
This study quantifies “final devoicing” (FD) in large-scale corpora of Standard French via automatic alignment with pronunciation variants. We use corpora of different speech styles, ESTER (journalistic speech) and NCCFr (conversation between friends), to compare the rates of devoicing and voicing of word-final fricatives as a function of the follow...
Studies of variation in continuous speech converge towards the conclusion that in everyday speech, words are often produced with reduced variants: some segments are shortened or completely absent. We describe an initiative to automatically exploit spoken corpora, in order to better understand linguistic behavior in spontaneous speech. This study fo...
The French Algerian Code-Switching Triggered corpus (FACST) was created in order to support a variety of studies in phonetics, prosody and natural language processing. The first aim of the FACST corpus is to collect a spontaneous Code-switching speech (CS) corpus. In order to obtain a large quantity of spontaneous CS utterances in natural conversat...
Published 2018
Chitoran, I., I. Vasilescu, L. Lamel, B. Vieru – Connected speech in Romanian: Exploring sound change through an ASR system. In D. Recasens and F. Sánchez Miret (Eds.) Production and perception mechanisms of sound change. München: Lincom Europa. 129-143
The research presented in the paper addresses conversational telephone speech recognition and keyword spotting for the Lithuanian language. Lithuanian can be considered a low e-resourced language as little transcribed audio data, and more generally, only limited linguistic resources are available electronically. Part of this research explores the i...
Most speech and language technologies are trained with massive amounts of speech and text information. However, most of the world's languages do not have such resources or a stable orthography. Systems constructed under these almost zero resource conditions are not only promising for speech technology but also for computational language documentation....
In this paper we aim to enhance keyword search for conversational telephone speech under low-resourced conditions. Two techniques to improve the detection of out-of-vocabulary keywords are assessed in this study: using extra text resources to augment the lexicon and language model, and via subword units for keyword search. Two approaches for data a...
This paper reports on investigations of using two techniques for language model text data augmentation for low-resourced automatic speech recognition and keyword search. Low-resourced languages are characterized by limited training materials, which typically results in high out-of-vocabulary (OOV) rates and poor language model estimates. One t...
The project Breaking the Unwritten Language Barrier (BULB), which brings together linguists and computer scientists, aims at supporting linguists in documenting unwritten languages. In order to achieve this we develop tools tailored to the needs of documentary linguists by building upon technology and expertise from the area of natural language pro...
This paper reports on an experimental work to build a speech transcription system for Lithuanian broadcast data, relying on unsupervised and semi-supervised training methods as well as on other low-knowledge methods to compensate for missing resources. Unsupervised acoustic model training is investigated using 360 hours of untranscribed speech data....
This paper describes a system for emotion recognition and its application to the dataset from the AV+EC 2016 Emotion Recognition Challenge. The realized system was produced and submitted to the AV+EC 2016 evaluation, making use of all three modalities (audio, video, and physiological data). Our work primarily focused on features derived from audio...
http://www.isca-speech.org/archive/Interspeech_2016/pdfs/0762.PDF
This research extends our earlier work on using machine translation (MT) and word-based recurrent neural networks to augment language model training data for keyword search in conversational Cantonese speech. MT-based data augmentation is applied to two language pairs: English-Lithuanian and English-Amharic. Using filtered N-best MT hypotheses for...
Code-switching (CS), i.e. the dynamic switching from one language to another within a given oral or written speech interaction, is a phenomenon resulting from language contact. Arabic-French code-switching is relatively frequent in Maghreb countries, although more typical of Algerian Arabic [1, 3]. French, considered by Algerian speakers a prestigi...
We present work in progress on an intelligent embodied conversation agent in the basic care and healthcare domain. In contrast to most of the existing agents, the presented agent is intended to have the linguistic, cultural, social and emotional competence needed to interact with elderly and migrants. It is composed of an ontology-based and reasoning-drive...
This paper presents a method to improve a language model for a limited-resourced language using statistical machine translation from a related language to generate data for the target language. In this work, the machine translation model is trained on a corpus of parallel Mandarin-Cantonese subtitles and used to translate a large set of Mandarin co...
This paper presents an experimental study on using morphological units for both automatic speech recognition (ASR) and keyword spotting (KWS) for the Kazakh language. Similar to other morphologically rich languages, the words in Kazakh are composed of fixed morphemes, which specify the meaning and may change form depending on the context. This t...
This paper presents a conversational telephone speech recognition system for the low-resourced Lithuanian language, developed in the context of the IARPA Babel program. Phoneme-based systems and grapheme-based systems are compared to establish whether or not it is necessary to use a phonemic lexicon. We explore the impact of using Web data for language mo...
This paper provides a summary of previous efforts made to build an ASR system for Romanian. Thereafter, the data developed within the ASR framework are used to conduct linguistic studies. A first study is dedicated to morpho-phonetic processes in Romanian such as the deletion of masculine definite article -l and the realization of the word final pa...
The aim of this paper is two-fold: 1) to provide a new analysis of a set of phonological processes in Embosi (Bantu C 25), particularly of the dropping of the class prefix consonant, proposing that it involves dissimilatory processes and compensatory lengthenings due to an empty C position, and vowel elision at phonological word boundaries; 2) to...
For many languages, an expert-defined phonetic lexicon may not exist. One popular alternative is the use of a grapheme-based lexicon. However, there may be a significant difference between the orthography and the pronunciation of the language. In our previous work, we proposed a statistical machine translation based approach to improving grapheme-b...
In previous work [1], we showed that model interpolation can be applied for acoustic model adaptation for a specific show. Compared to other approaches, this method has the advantage of being highly flexible, allowing rapid adaptation by simply reassigning the interpolation coefficients. In this work this approach is used for a multi-accented En...
This paper investigates unsupervised training strategies for the Korean language in the context of the DGA RAPID Rapmat project. As with previous studies, we begin with only a small amount of manually transcribed data to build preliminary acoustic models. Using the initial models, a larger set of untranscribed audio data is decoded to produce appro...
Luxembourgish is embedded in a multilingual context on the divide between Romance and Germanic cultures and remains one of Europe’s low-resourced languages. We describe our efforts in building a large vocabulary ASR system for such a “minority” language without resorting to any prior transcribed audio training data. Instead, acoustic models are der...
Luxembourgish, a Germanic-Franconian language, is embedded in a multilingual context on the divide between Romance and Germanic cultures and remains one of Europe’s under-described languages. This paper investigates the similarity between Luxembourgish phone segments with German, French and English via forced speech alignment techniques. Making use...