About
61
Publications
9,787
Reads
120
Citations
Introduction
automatic natural language processing and deep learning
Publications (61)
This paper proposes a multitask learning framework for simultaneous emotion and gender recognition from Arabic speech. Leveraging a hybrid Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) architecture, the model captures both spectral and temporal characteristics of speech. By using Mel Frequency Cepstral Coefficients (MFCCs) as...
This paper presents a Convolutional Neural Network (CNN) model for emotion recognition in speech, using Mel spectrograms as input features. The model classifies seven emotional states (anger, neutral, joy, anxiety, disgust, boredom, and sadness) using the EmoDB database, which contains emotional speech samples recorded by ten actors. The CNN architec...
Speech Emotion Recognition (SER) has garnered increasing attention in recent years due to its wide-ranging applications in human-computer interaction, affective computing, and healthcare. Despite significant progress, developing robust SER systems for Arabic speech remains a challenge, primarily due to the linguistic diversity and distinct acoustic...
A fricative sound is produced by the close proximity of two articulators, resulting in a partially obstructed airstream and turbulent airflow. The frequency spectrum of the majority of fricatives is similar to noise. This peculiarity presents a significant challenge in their numerical processing. Nonetheless, the literature contains a large number...
Because of its morphological and syntactic properties, Arabic is considered a very difficult language to handle in automatic processing, and text-to-speech synthesis systems for Arabic are therefore very few. The aim of our work is to build a speech synthesis system by concatenation (sé...
The present study focuses on the evaluation of the degradation of emotional expression in speech generated by a wireless telephone network. Two assessment approaches emerged: an objective one, deploying convolutional neural networks (CNNs) fed with spectrograms across three scales (Linear, Logarithmic, Mel), and a subjective method grounded in huma...
We have developed an innovative methodology for the multilingual vocalization of Algerian SMS messages, drawing insights from an initial study that highlights the challenges posed by the complex and specific language used in these messages. In the course of our research, we worked with a dataset comprising 20,000 SMS messages, which, after segmenta...
This paper presents a complete methodology for building a corpus of authentic SMS messages in Algerian dialect, transcribed in Latin characters or symbols. The linguistic material consists of 6000 SMS messages from the different geographical regions of Algeria (Middle, East and West) corresponding to 42 administrative and geographica...
This paper aims to design and validate a phonetically balanced speech corpus for the Arabic language. Designing and developing a rich and phonetically balanced corpus in an optimal context is one of the key issues in building high-quality text-to-speech synthesis systems. The corpus is rich in the sense that it must contain all the possible pho...
This study presents an analysis of the influence of the right vowel context on the spectra of fourteen Arabic fricatives. The results were validated through the recognition of these fricatives by a neural network system (multilayer perceptron, MLP). The latter showed that the number of fricatives considered, the vowel context an...
This work contributes to improving the speaker-independent emotion recognition rate for the Arabic language, which is under-resourced in this field. This problem is still relevant, especially for databases created by non-professional speakers (the emotion produced by professionals is standard whereas that produc...
The general objective of this paper is to build a system that automatically recognizes emotion in speech. The linguistic material used is a corpus of phonetically balanced expressive Arabic sentences. Speaker dependence is a recurring problem in this field; in this work we study the influence of this phenomenon on...
This paper presents the methods used and results obtained for the creation of a Festival-compatible pronunciation dictionary of more than 10k words for the Kabyle language. Kabyle is a Berber dialect spoken in Northern Algeria. This dictionary will be useful in the design of text-to-speech and automatic speech recognition systems for the Kabyle languag...
This paper describes the design of an automatic identification system that distinguishes between two common languages in Algeria: MSA (Modern Standard Arabic) and Kabyle, an Algerian Berber dialect. The characteristics used are prosodic (melody and stress) and cepstral (Mel Frequency Cepstral Coefficients) features extrac...
The absence of diacritical marks from modern Arabic text significantly increases its ambiguity, which can cause confusion in the pronunciation of a written word. Although a reader with a certain level of Arabic knowledge can easily recover the missing diacritics by using the context of the words, the...
Because of its morphological and syntactic properties, Arabic is considered a very difficult language to handle in automatic processing and speech synthesis systems, which explains the limited number of such systems. The aim of our work is to contribute to the realization of a speech synthesis system based on conc...
This is a corpus of expressive speech for the Arabic language.
It consists of 13 speakers pronouncing 10 sentences in 4 expressive styles: Neutral, Anger, Joy and Sadness.
Our study analyzes the phenomena involved in the production of speech under constraint. The paradigm used is the variation of speech rate. A simulation study of speech produced at variable rate was performed using the Ishizaka mechanical vocal-fold model (two-mass model) grafted onto the Klatt formant synthesizer (acoustic speech production model). enabl...
Abstract: Our study analyzes the phenomena involved in the production of speech under constraint. The paradigm used is the variation of speech rate. A simulation study of speech produced at variable rate was carried out using the Ishizaka mechanical vocal-fold model (two-mass model) grafted...
In this article we present the methodology employed for the design and evaluation of a Basic Arabic Expressive Speech corpus (BAES-DB). The corpus, which has a total length of approximately 150 minutes, consists of 13 speakers uttering a set of 10 sentences while simulating 3 emotions (joy, anger and sadness) in addition to a neutral utteranc...
In this paper we will present a contribution to the design of an expressive speech synthesis system for the Arabic language. The system uses diphone concatenation as the synthesis method for the generation of 10 phonetically balanced sentences in Arabic. Rules for the orthographic-to-phonetic transcription are detailed, as well as the methodology e...
The prosody of a speech signal is related to many factors: the social and geographical origin of the speaker, his or her emotional state, his or her physiological state (weariness, sickness, …) and the type of the sentence (interrogative, affirmative, etc.). A good synthesis or speech transformation system must account for all of these factors in order to...
The purpose of this study is an acoustic characterization of two "lip-velarized" consonants, /gw/ and /kw/, specific to the Berber language (especially the Algerian Kabyle language). For this, we used locus equations and second-formant trajectories. We have proposed a method that represents the second formant transition by the slope and inte...
The goal of this study is to estimate the instantaneous frequencies corresponding to the formants of the speech signal using the wavelet transform. The developed method is based on an analysis of the phase derivative of the continuous Morlet wavelet transform coefficients. Using synthesized signals produced by a formant model made it possible to adapt this met...
This paper highlights the compensatory strategies developed by six speakers from different geographic regions when they express themselves in a second language very different from their native language. The paradigm used in this study is speech rate as a speaking co...
The purpose of this study is an acoustic characterization of two labiovelarized consonants, /gw/ and /kw/, specific to the Berber language (particularly Algerian Kabyle). For this, we used locus equations and formant trajectories. We have proposed a way of representing the trajectory of the second formant using the slope and...
This paper focuses on the acquisition of the tonal and prosodic structure of affirmative and question sentences in the Berber language. Its main subject is the study of the prosodic differences between these two types of sentences, and the detection and classification of sentence type. We have built a system for segmentati...
We used locus equations to characterize the two Berber "lip-velarized" consonants /gw/ and /kw/. The aim is to show that these two phonemes are consonants distinct from their velar counterparts /g/ and /k/. Second- and third-order locus equations produced appreciable results.
One of the major issues when transforming a voice using the PSOLA algorithm is to be able to accurately find the values for the signal modification parameters (α, β and γ) that allow us to transform the source signal into the target signal. In this paper, we propose a way to determine these parameters on the basis of a study of their influence on s...
The acoustic measurement of articulation place for consonants and the acoustic measurement of the amount of coarticulation are two long-standing problems in phonetics. Previous work on consonant place of articulation, using articulatory-acoustic models, within the acoustic theory of speech production (Stevens, 1998), has found certain acoustic cor...
Our study analyzes Arabic VCVα utterances in a short vocalic context, with Vα and speech rate as variables, in order to observe the impact of the "right" context and of speech rate on coarticulation. We therefore look for invariance in the speech signal that explains the coarticulation phenomenon related to speech rate. So, we have...
This study aims to identify the compensatory strategies that speakers develop when producing speech in a noisy environment. We have developed a method for studying the effect of noise on the acoustic parameters that characterize the speech signal (F0, duration, formants, cepstral and LPC parameters). The results reflect the speakers' attitudes tow...
We used locus equations and formant trajectories to characterize two "lip-velarized" consonants, /gw/ and /kw/. We have proposed a method of representing the formant trajectory by the slope and the intercept of a linear regression. This method provided information that complements locus equations and helped distinguish between " l...
Endowing machines with the ability to understand human behavior: this is the scientific challenge that brings together several research communities (signal processing, natural language processing, artificial intelligence, robotics, human-machine interaction, etc.). One of the frequently used signals is the sign...
The labiovelarization of velar consonants and labials is a very widespread phenomenon. It is attested in all the major northern Berber dialects. Only the Tuareg completely ignores it. But, even within the large Berber-speaking regions of the north, it is very unstable: it may be completely absent in some languages (such as the Bougie region in Kaby...
We propose in this study a method for characterizing speech produced in a stress situation. To this end, we created an artificial disturbance (lip stress) and then analyzed the effects of stress on the acoustic parameters of the signals produced. We have developed a methodology allowing us to analyze the timing, the fundamental frequen...
This study is part of research on adaptive mechanisms in speech production. We were interested in the acoustic variations of the voice message when the speaker is placed in a noisy environment. We propose a methodology to highlight the articulatory strategies adopted by the speaker to adapt to the noisy environment. This methodology consists of two parts:...
We acoustically analyzed the behavior of the speech signal produced under a noise constraint by four speakers, both when the noise is delivered to the speaker through headphones and when it is delivered through a loudspeaker. The goal is to find the acoustic parameters of the speech signal that are most sensitive to noise and the compensatory strategies adopted by the speakers to counter this constraint...
This study examines the effects of speech rate in a second language. Six speakers from different geographical areas produced sentences in written Arabic, containing fricatives (Arabic specifications), at different speech rates. The selected speakers are: two Lebanese (CH and LI), two inhabitants of Algiers (FE and MA) and two Kabyles (SA and...
We present a parametric study of a model of the voice source known as the two-mass model. This model allowed us to characterize the fundamental frequency of the source as a function of lung pressure and vocal-fold tension. The results of such a study contribute to a better understanding of the response of a...
The degree of coarticulation and vocalic reduction (RV) are indices related to good motor control (Gay, 1978). Fowler (1998) explains why the locus equation (LE) is used to characterize both the place of articulation and the degree of coarticulation between consonants and vowels: a steep slope (m=1) indicates maximum coarticulat...
Determining the glottal source parameters is relatively difficult, because it involves measuring convolved parameters of the vocal source. Although an EGG recording gives access to two of the principal analysis parameters, there is currently no reliable method to determine the remaining parameters. Thus,...
Locus equations are linear regressions of the onset of F2 transitions on their offset. These functions are able to characterise consonantal place categories. Building on previous studies in the literature, this experiment explored the information on place of articulation provided by these functions for the two Arabic plosives /q/ (/ﻖ/) and /?/ (/...
Locus equations are linear regressions of the onset of F2 transitions on their offset. These functions are able to characterize consonantal place categories. This experiment explored the accuracy of the information on place of articulation provided by these functions for the two Arabic plosives /q/ (/ﻖ/) and /?/ (/ﺀ/) and their corresponding fri...
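As an illustration of the locus-equation idea described in the abstract above, the following is a minimal sketch, not the authors' implementation. It assumes that each consonant-vowel token provides one F2 value at the transition onset and one at the transition offset, both in Hz, and fits the line F2_onset = slope * F2_offset + intercept by ordinary least squares. The function name `fit_locus_equation` and the sample values are hypothetical.

```python
def fit_locus_equation(f2_offset, f2_onset):
    """Fit F2_onset = slope * F2_offset + intercept by ordinary least squares.

    f2_offset, f2_onset: equal-length sequences of F2 values in Hz,
    one pair per consonant-vowel token (assumed input layout).
    Returns (slope, intercept).
    """
    n = len(f2_offset)
    if n != len(f2_onset) or n < 2:
        raise ValueError("need at least two paired measurements")

    mean_x = sum(f2_offset) / n
    mean_y = sum(f2_onset) / n

    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in f2_offset)
    if sxx == 0:
        raise ValueError("F2 offset values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(f2_offset, f2_onset))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical, made-up F2 values (Hz) for one consonant in three vowel contexts.
offsets = [1000.0, 1500.0, 2000.0]
onsets = [1300.0, 1550.0, 1800.0]
slope, intercept = fit_locus_equation(offsets, onsets)
```

The slope and intercept returned for each consonant are the two numbers that a locus-equation analysis would then compare across place-of-articulation categories.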