Jérôme Farinas

Toulouse Institute of Computer Science Research · Structuration, Analysis, MOdeling of Video and Audio Team (SAMoVA)

PhD

About

77
Publications
14,463
Reads
752
Citations

Publications (77)
Article
Full-text available
Perceptual measures, such as intelligibility and speech disorder severity, are widely used in the clinical assessment of speech disorders in patients treated for oral or oropharyngeal cancer. Despite their widespread usage, these measures are known to be subjective and hard to reproduce. Therefore, an M-Health assessment based on an automatic predi...
Article
Reliable fundamental frequency (f0) extraction algorithms are crucial in many fields of speech research. Most studies testing the robustness of different algorithms have focused on healthy speech and/or measurements of sustained vowels. Few studies have tested f0 estimations in the context of pathological speech, and even fewer o...
Article
Full-text available
Background: In head and neck cancer, many tools exist to measure speech impairment, but few evaluate the impact on communication abilities. Some self-administered questionnaires are available to assess general activity limitations including communication. Others are not validated in oncology. These different tools result in scores that do not prov...
Article
Full-text available
Most state-of-the-art speech systems use deep neural networks (DNNs). These systems require a large amount of training data. Hence, training state-of-the-art frameworks on under-resourced speech challenges is a difficult task. One example of such a challenge is the limited amount of data available to model impaired speech. Furthermore, acquiring more d...
Article
Purpose: The constitution of social circles around patients treated for cancer of the upper aerodigestive tract (UADT) has a major influence on factors that affect Quality of Life (QOL) but is poorly assessed, mainly due to a lack of tools. The objective of this study is to develop a questionnaire that assesses the constitution of social circles i...
Article
Background: Speech disorders impact quality of life for patients treated for oral cavity and oropharynx cancers. However, there is a lack of uniform and applicable methods for measuring the impact on speech production after treatment in this tumor location. Objective: The objective of this work is to (1) model an automatic severity index of spe...
Presentation
Full-text available
Context: Speech intelligibility is mostly assessed perceptually in routine clinical practice. This type of assessment has many limitations in terms of inter- and intra-rater reproducibility. However, the recent development of automatic speech recognition (ASR) systems may make it possible to overcome these bi...
Article
Full-text available
Introduction: In head and neck oncology, many perceptual and automatic tools currently exist to measure speech impairment, but few assess the impact of the impairment on communication abilities. A few self-administered questionnaires are available to assess activity limitations and restrictions on participa...
Article
"Rééducation Orthophonique" is a quarterly scientific journal produced by the Fédération Nationale des Orthophonistes. Each issue is devoted to a single theme.
Poster
Cognitive dysfunction is frequently reported by patients with a neurological condition. Among the reported cognitive difficulties are language disorders known as aphasia. Identifying aphasic disorders requires a specific assessment using lexical tests that evaluate access to...
Poster
Full-text available
In head and neck oncology, many perceptual and automatic tools currently exist to measure speech impairment, but few assess the impact of the impairment on communication abilities [1,2]. A few self-administered questionnaires are available to assess activity limitations and participation restrictions, ...
Article
Full-text available
Purpose: To validate the upgraded version of the CHI with two new dimensions (“limitation of neck and/or shoulder movements”, “changes in physical appearance”). To assess the relationship between CHI scores and patient self-reported management needs. Methods: 71 patients treated for cancer with ENT complaints and 36 controls were included. Construct val...
Article
Full-text available
Within the framework of the Carcinologic Speech Severity Index (C2SI) INCa Project, we collected a large database of French speech recordings with the aim of validating Disorder Severity Indexes. Such a database will be useful for measuring the impact of oral and pharyngeal cavity cancer on speech production. It will make it possible to assess patients' Quality of...
Article
Full-text available
Introduction: The first version of the Carcinologic Handicap Index (CHI) assessed the symptoms of patients treated for upper aerodigestive tract cancer and their impact across nine functional dimensions (pain, swallowing, nutrition, breathing, phonation, hearing, vision, smell and taste, ps...
Preprint
Full-text available
Most state-of-the-art speech systems use Deep Neural Networks (DNNs). These systems require a large amount of training data. Hence, training state-of-the-art frameworks on under-resourced speech languages/problems is a difficult task. One such problem is the limited amount of data for impaired speech. Furthermore, acquiring more data and/o...
Poster
Full-text available
Oral and oropharyngeal cancer affects anatomical regions involved in speech production. Alterations of communicative functions have a major impact on patients' quality of life. There is little research on the functional impact of speech disorders, and correlations between quality of life and speech disorder severity scores (assessed perceptually) are moderate. Bu...
Article
Full-text available
Context: Nowadays, clinical tools are available to evaluate the functional impact of speech disorders in neurological conditions, but few are validated in oncology. Because of their location, cancers of the upper aerodigestive tract directly impact patients' communication skills. Two questionnaires exist in French, the Speech Handicap Index (SHI)...
Article
Full-text available
Background: The development of automatic tools based on acoustic analysis makes it possible to overcome the limitations of perceptual assessment for patients with head and neck cancer. The aim of this study is to provide a systematic review of the literature describing the effects of oral and oropharyngeal cancer on speech intelligibility using acoustic analysis....
Conference Paper
Full-text available
Introduction: The decrease in mortality and the lengthening of life span following cancer make managing the sequelae of the disease and its treatments a priority. The quality of life of patients treated for oral cavity or oropharynx cancer can be impaired because this pathology alters patients' communication abilities due to its lo...
Poster
Full-text available
The area of the vowel triangle, constructed by measuring the frequencies of the first two formants of the spoken vowels, is one of the measures used to assess speech intelligibility [1,2]. Plotting the vowels on a two-axis graph, F1 and F2, makes it possible to interpret the values obtained relative to the trian...
Preprint
Full-text available
In this paper, we describe the outcomes of the challenge organized and run by Airbus and partners in 2018. The challenge consisted of two tasks applied to Air Traffic Control (ATC) speech in English: 1) automatic speech-to-text transcription, 2) call sign detection (CSD). The registered participants were provided with 40 hours of speech along with...
Conference Paper
Full-text available
Automatic analysis of word association data from the Evolex psycholinguistic tasks using computational lexical semantic similarity measures. Abstract: This paper is the fruit of a multidisciplinary project gathering researchers in Psycholinguistics, Neuropsychology, Computer Science, Natural Language Processing and Linguistics. It proposes...
Conference Paper
Full-text available
This article presents a comparative study of perceptual and automatic measures for assessing speech intelligibility in degraded conditions. It follows a previous study that proposed a methodology for simulating and automatically assessing the effect of presbycusis on the intelligibility of...
Conference Paper
Full-text available
Within the framework of the Carcinologic Speech Severity Index (C2SI) INCa Project, we collected a large database of French speech recordings with the aim of validating Disorder Severity Indexes. Such a database will be useful for measuring the impact of oral and pharyngeal cavity cancer on speech production. It will make it possible to assess patients' Quality...
Poster
Full-text available
Age-related hearing loss (ARHL), the progressive bilateral decline in hearing sensitivity generally assessed by pure-tone audiometry, affects many people over the age of 60 years. It has important negative consequences, especially in noisy environments, not only on speech perception but also on the socio-psychological well-being of the affected...
Article
Full-text available
Purpose. The purpose of this article is to assess speech processing for listeners with simulated age-related hearing loss (ARHL) and to investigate whether the observed performance can be replicated using an automatic speech recognition (ASR) system. The long-term goal of this research is to develop a system that will assist audiologists/hearing-ai...
Conference Paper
Full-text available
This article presents a new method for analyzing Automatic Speech Recognition (ASR) results at the phonological feature level. To this end, the Levenshtein distance algorithm is refined to take into account the distinctive features opposing substituted phonemes. This method makes it possible to survey feature additions or deletions, providing microsc...
Conference Paper
Full-text available
In this paper, we report automatic pronunciation assessment experiments at phone-level on a read speech corpus in French, collected from 23 Japanese speakers learning French as a foreign language. We compare the standard approach based on Goodness Of Pronunciation (GOP) scores and phone-specific score thresholds to the use of logistic regressions (...
Conference Paper
Full-text available
Automatic speech processing methods are well suited to supporting the assessment of oral performance. Recent advances in automatic speech recognition (ASR), particularly in the field of computer-assisted language learning, have contributed to the development of techniques for iden...
Conference Paper
Full-text available
This research work forms the first part of a long-term project designed to provide a framework for facilitating hearing aid tuning. The present study focuses on setting up automatic measures of speech intelligibility for the recognition of isolated words and sentences. Both materials were degraded in order to simulate presbycusis effects on...
Article
Full-text available
In this article, we report on the use of an automatic technique to assess pronunciation in the context of several types of speech disorders. Even if such tools already exist, they are more widely used in a different context, namely, Computer-Assisted Language Learning, in which the objective is to assess nonnative pronunciation by detecting learner...
Conference Paper
Full-text available
In this paper, we report on a study with the aim of automatically detecting phoneme-level mispronunciations in 32 French speakers suffering from unilateral facial palsy at four different clinical severity grades. We sought to determine if the Goodness of Pronunciation (GOP) algorithm, which is commonly used in Computer-Assisted Language Learning s...
Article
Full-text available
ABSTRACT. This article presents a comparative study of perceptual and automatic measures of speech intelligibility on speech degraded by a simulation of presbycusis. The aim is to answer the question: can a human perceptual measure be approximated using an automatic recognition system...
Article
Full-text available
This study aims at comparing perceptive and automatic measures of speech intelligibility in the case of speech signals simulating the effects of age-related hearing loss (presbycusis). A new corpus especially designed for studying speech intelligibility and perception and the comparison of human speech recognition scores with Automatic Speech Recog...
Conference Paper
Full-text available
In this paper we present an approach designed to map variable size audio sequences into fixed-length vectors, useful to discover contents of audio databases. First, we model standard audio parameters with Gaussian mixture models (GMM). Then, symmetric Kullback-Leibler divergences between models are approximated with a Monte-Carlo method. We use thes...
Article
Full-text available
This paper presents an approach for applying spectral clustering to time series data. We define a novel similarity measure based on Euclidean distance and temporal proximity between vectors. This metric is useful for conditioning matrices needed to perform spectral clustering, and its application leads to the detection of abrupt changes in a s...
Article
Full-text available
Bouts of vocalizations given by seven red deer stags were recorded over the rutting period, and homomorphic analysis and hidden Markov models (two techniques typically used for the automatic recognition of human speech utterances) were used to investigate whether the spectral envelope of the calls was individually distinctive. Bouts of common roars...
Article
Full-text available
This paper deals with an approach to automatic language identification based on rhythmic modelling. Besides phonetics and phonotactics, rhythm is currently one of the most promising features to be considered for language identification, even if its extraction and modelling are not straightforward. In fact, one of the main problems to address...
Article
The aim of this study is to propose a new approach to Automatic Language Identification: it is based on rhythmic modelling and fundamental frequency modelling and does not require any hand labelled data. First we need to investigate how prosodic or rhythmic information can be taken into account for Automatic Language Identification. A new automatic...
Article
This paper develops an automatic method for estimating speaking rate. It is based on an unsupervised vowel detection algorithm and can thus be applied to any language at no additional cost. Validation is carried out on a spontaneous speech subset of the OGI Multilingual Telephone Speech Corpus. The correlation coefficient between the estimated and real speaking rat...
Article
Full-text available
Peer-reviewed conference with proceedings. International.
Conference Paper
This paper deals with an approach to automatic language identification using only prosodic modeling. The current approach to language identification focuses mainly on phonotactics because it gives the best results. We propose here to evaluate the relevance of prosodic information for language identification with read studio recordings and spontaneou...
Conference Paper
This paper deals with an approach to Automatic Language Identification based on rhythmic modeling and vowel system modeling. Experiments are performed on read speech for 5 European languages. They show that rhythm and stress may be automatically extracted and are relevant in language identification: using cross-validation, 78% of correct identifica...
Article
Full-text available
Most Automatic Language Identification systems give great importance to the phonotactic level, using N-gram models and large phonetic dictionaries. However, it is clear that introducing other features (acoustic, phonetic, and prosodic) will improve performance. Recently...
Article
Full-text available
In this article we study some results of the non-linear dimensionality reduction of speech vectors. Spectral clustering, Kernel PCA, Isomap, Laplacian eigenmaps and Locally Linear Embedding are related unsupervised methods that help to discover important characteristics from data such as high-density regions or low-dimensional surfaces (manif...
Article
Full-text available
This paper deals with rhythmic modeling and its application to language identification. Besides phonetics and phonotactics, rhythm is currently one of the most promising features to be considered for language identification, but significant problems in modeling it remain unresolved. In this paper, an algorithm dedicated to rhythmic segmentation is des...
Article
This paper deals with an approach to Automatic Language Identification using only prosodic modeling. The traditional approach for language identification focuses mainly on phonotactics because it gives the best results. Recent studies reveal that humans use different levels of perception to identify a language, in particular prosodic cues. Among...
Article
An automatic method for speaking rate estimation is developed in this paper. It is based on an unsupervised segmentation and vowel detection algorithm and can thus be applied to any language at no additional cost. Validation is carried out on a spontaneous speech subset of the OGI Multilingual Telephone Speech Corpus. Statistics related to the speaking rate bot...
Article
Full-text available
Most Automatic Language Identification systems give great importance to the phonotactic level, using N-gram models and relatively large phone dictionaries. However, it is clear that introducing other features (acoustic, phonetic, prosodic) will improve performance. Recently, we proposed an alternative acoustic-phonetic model whi...
Article
Full-text available
Most Automatic Language Identification systems give great importance to the phonotactic level, using N-gram models and relatively large phone dictionaries. However, it is clear that introducing other features (acoustic, phonetic, prosodic) will improve performance. Recently, we proposed an alternative acoustic-phonetic model whi...
