Daniel Hirst

French National Centre for Scientific Research (CNRS) & Aix-Marseille University · Laboratoire Parole et Langage

PhD, Dr Hab.

About

168
Publications
110,795
Reads
2,944
Citations
Citations since 2017
24 Research Items
705 Citations
Additional affiliations
October 2011 - April 2020
French National Centre for Scientific Research
Position
  • Professor Emeritus
Description
  • The LPL is a laboratory jointly supported by the CNRS and Aix-Marseille University
September 2011 - October 2014
Tongji University
Position
  • Lecture Professor
October 1977 - September 2011
French National Centre for Scientific Research
Position
  • Researcher
Education
September 1982 - September 1987
Institut de Phonétique, Université de Provence
Field of study
  • Phonetics and Linguistics
September 1970 - September 1974
Institut de Phonétique, Université de Provence
Field of study
  • Phonetics and Linguistics
September 1965 - June 1968
St. David's College, University of Wales
Field of study
  • English Literature

Publications

Publications (168)
Conference Paper
Full-text available
In this study we analyse 18 metrics which were extracted fully automatically from the acoustic signal to describe the melodic characteristics of recordings of English read by L2 Chinese speakers from Shanghai. The metrics were compared to those of native English speakers recording the same material and also to comparable Chinese recordings read by...
Article
Full-text available
In this paper I present an overview of ProZed, an aid for developing prosody rules for speech synthesis using the MOMEL and INTSINT algorithms [19] and interfaced with the MBROLA [12], MBROLIGN [23] and Praat [4] programs. It allows the interactive editing of a symbolic representation of an utterance in any of the twenty languages and dialects for whic...
Article
Full-text available
This paper presents a revised version of an implementation of the Momel and INTSINT algorithms for the automatic modelling and symbolic coding of intonation patterns. The algorithms are implemented as external functions which are seamlessly integrated into the Praat speech manipulation software by means of the recently proposed plugin facility for...
Article
Full-text available
The abstract for this document is available on CSA Illumina. To view the Abstract, click the Abstract button above the document title.
Preprint
Full-text available
This is a preprint of Chapter 7 of my forthcoming book Speech Prosody. From Acoustics to Interpretation. Questions, comments and suggestions are very welcome.
Preprint
Full-text available
This is Chapter 6, Prosodic Structure, from my forthcoming book Speech Prosody. From Acoustics to Interpretation. Questions, comments and corrections are welcome.
Preprint
Full-text available
This is a preprint of chapter 2 of my forthcoming book. Comments and questions are welcome.
Preprint
Full-text available
This is a preprint of chapter 1 of my forthcoming book Speech Prosody: From Acoustics to Interpretation. Comments and questions are welcome.
Preprint
Full-text available
This is an unfinished draft of chapter 5, The Phonology of Speech Prosody, of my forthcoming book Speech Prosody. From Acoustics to Interpretation. Comments, suggestions and questions are welcome.
Preprint
Full-text available
This is a preprint of chapter 4, The Prosody of Words, of my forthcoming book: Speech Prosody. From Acoustics to Interpretation. Comments and questions are welcome.
Preprint
Full-text available
This is a preprint of Chapter 2 - The transcription of prosody - of my forthcoming book Speech Prosody. From Acoustics to Interpretation. Comments, suggestions and questions are welcome.
Preprint
Full-text available
In this chapter, we introduce the reader to the concepts of pitch and fundamental frequency from a functional, physiological and physical perspective. Several issues, including the modelling of intonation, pitch detection and measurement and acoustic scales, described below, are addressed to inform the reader about best practice for teaching and le...
Presentation
Full-text available
It is well known that L2 speakers have particular difficulty with prosody - the rhythm and melody of their speech - and that this is a major factor leading to their speech being difficult to understand for native speakers. This presentation suggests the possibility of providing automatic visual and auditory feedback as an aid to the improvement of...
Presentation
Full-text available
This tutorial explains how to display the prosody of a recording and how to transfer (clone) the prosody of a source recording to a target recording. These tutorials are first drafts and feedback is welcome, in particular if anything is not clear or needs more explanation. Send feedback to <djhirst@me.com>
Presentation
Full-text available
This tutorial shows how to use the Momel-Intsint plugin for the automatic analysis of the prosody of a single recording. These tutorials are first drafts and feedback is welcome, in particular if anything is not clear or needs more explanation. Send feedback to <djhirst@me.com>
Presentation
Full-text available
This tutorial shows how to use the Momel-INTSINT plugin for the automatic analysis of the prosody of a whole batch of recordings. It presupposes familiarity with analysing the prosody of a single recording, as described in Tutorial 1. These tutorials are first drafts and feedback is welcome, in particular if anything is not clear or needs more expl...
Preprint
Full-text available
This is the draft of the references for my forthcoming book Speech Prosody. From Acoustics to Interpretation. Suggestions, comments and corrections are welcome.
Preprint
Full-text available
This is a preprint of chapter 8, Modelling Speech Melody, of my forthcoming book: Speech Prosody. From Acoustics to Interpretation. Questions, comments and suggestions are welcome.
Preprint
Full-text available
In this chapter, we introduce the reader to the concepts of pitch and fundamental frequency from a functional, physiological and physical perspective. Several issues, including the modelling of intonation, pitch detection and measurement and acoustic scales, described below, are addressed to inform the reader about best practice for teaching and le...
Code
Minor updates: replaces the term "target points" with "anchor points". Includes updated Readme file.
Conference Paper
Full-text available
This presentation reports work in progress on an improved and simplified algorithm for coding the output of the Momel algorithm using the INTSINT alphabet, building on recent work which proposed the Octave-Median scale (ome = log2(Hz/Median)) as a natural scale for the representation of pitch. Preliminary results comparing the output of the new alg...
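As a rough illustration of the Octave-Median scale quoted in this abstract (ome = log2(Hz/Median)), the conversion can be sketched in a few lines of Python. This is only a minimal sketch, not the author's implementation: the function name `hz_to_ome` and the convention that a value of 0 marks an unvoiced frame are assumptions made for the example.

```python
import math
from statistics import median


def hz_to_ome(f0_hz, median_hz=None):
    """Convert f0 values in Hz to octaves relative to the median (OME).

    Assumption for this sketch: values <= 0 mark unvoiced frames and are
    returned as None. If median_hz is not given, the median of the voiced
    values is used as the reference.
    """
    voiced = [f for f in f0_hz if f > 0]
    if not voiced:
        raise ValueError("no voiced f0 values to compute a median from")
    ref = median_hz if median_hz is not None else median(voiced)
    # ome = log2(Hz / Median): 0 at the median, +1 one octave above it
    return [math.log2(f / ref) if f > 0 else None for f in f0_hz]
```

For example, `hz_to_ome([100, 200, 400])` gives `[-1.0, 0.0, 1.0]`: the median (200 Hz) maps to 0, and each octave above or below it maps to +1 or -1.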
Conference Paper
Full-text available
Our ideas about prosodic representation are heavily influenced by our knowledge of written language. All writing systems represent utterances as a linear sequence of elements drawn from a finite set of characters. In many languages special characters such as spaces or punctuation marks are used as boundary symbols. There is a general consensus toda...
Conference Paper
Full-text available
Modelling pitch patterns from acoustic data needs to take into account the fact that raw f0 curves are the product of an underlying global pitch pattern and a more local (micromelodic) influence of the individual speech sounds. This suggests the hypothesis that pitch could be modelled using only the f0 detected on sonorant rimes (vowels and sonoran...
Conference Paper
Full-text available
OMProDat is an open multilingual prosodic database, which aims to collect, archive and distribute recordings and annotations of directly comparable data from different languages representing different prosodic typological characteristics. OMProDat contains recordings of 40 five-sentence passages read by 5 male and 5 female speakers of each language...
Conference Paper
Full-text available
Fundamental frequency, the primary acoustic correlate of speech melody, is generally analysed and displayed using a linear scale (Hertz) or a logarithmic one, generally in semitones and usually offset to an arbitrary reference level such as 100 Hz. In this paper we argue that a more natural scale for analysing speech is the OME (Octave-MEdian) scal...
Conference Paper
Full-text available
Based on the Momel algorithm, a set of acoustic parameters was analyzed automatically on Chinese emotional speech. Global prosodic features were calculated on the sentence level, which showed a concordance with the usual pattern reported in the literature. Local constraints were also considered on the syllable layer. An ANOVA showed that there were...
Conference Paper
Full-text available
During Speech Prosody 2012, we presented SPPAS, SPeech Phonetization Alignment and Syllabification, a tool to automatically produce annotations which include utterance, word, syllabic and phonemic segmentations from a recorded speech sound and its transcription. SPPAS is open source software issued under the GNU Public License. SPPAS is multi-pla...
Conference Paper
Full-text available
Current research on speech prosody generally makes use of large quantities of recorded data. In order to provide an open multi-lingual basis for the comparative study of speech prosody, the Laboratoire Parole et Langage has begun the creation of an open database OMProDat containing recordings of 40 five sentence passages, originally taken from the...
Conference Paper
Full-text available
It is more and more standard practice, in speech research, to make publicly available the data used in the research, in particular the speech recordings. This can potentially raise the problem of how to respect the anonymity of the speakers, particularly if the recordings consist of unmonitored conversations, which may contain references to people by...
Conference Paper
Full-text available
In recent years there have been a number of proposals for objective paradigms for establishing prosodic typologies among languages. This paper compares the results of melody metrics calculated on just over two hours of read speech for each of three languages. Pitch movements in Chinese, a lexical tone language, were found to be significantly more a...
Code
This is the Praat script described in my paper Hirst, D.J. 2013 "Anonymising long sounds for prosodic research" in B.Bigi & D.J.Hirst (eds) Proceedings of Tools and Resources for the Analysis of Speech Prosody (TRASP) pp 36-37
Conference Paper
Full-text available
In standard Chinese, a low tone (Tone 3) is usually changed into a rising tone (Tone 2) when it is immediately followed by another third tone, which is known as the third tone sandhi. The 3rd tone sandhi has been widely discussed in Chinese phonology. This paper, however, employs a prosodic corpus we are developing to study the acoustic realization...
Article
Full-text available
Wiktor Jassem's short article on rhythm (‘Indication of speech rhythm in the transcription of educated Southern English’) was published in 1949 in Le Maître Phonétique, the ancestor of today's Journal of the International Phonetic Association. The author was a young man aged 27 at the time, who was working on a longer treatment of the intonation...
Conference Paper
Full-text available
SPPAS, SPeech Phonetization Alignment and Syllabification, is a tool to automatically produce annotations which include utterance, word, syllable and phoneme segmentations from a recorded speech sound and its transcription. SPPAS is currently implemented for French, English, Italian and Chinese and there is a very simple procedure to add other la...
Conference Paper
Full-text available
This paper describes a tool designed to allow linguists to manipulate the prosody of an utterance via a symbolic representation in order to evaluate linguistic models. Prosody is manipulated via a Praat TextGrid which allows the user to modify the rhythm and melody. Rhythm is manipulated by factoring segmental duration into three components: (i)...
Conference Paper
Full-text available
This paper presents a multilingual learners corpus (AixOx) collected in the framework of an ALLIANCE project. Speakers reading forty 1-minute passages in French and in English were recorded. The passages are taken from the EUROM 1 corpus (Chan et al. 1995). The corpus consists of the recordings of these passages read by native speakers and L2 lear...
Book
This collection of studies on phonetics and phonology is cordially dedicated to Professor Wiktor Jassem by his colleagues and friends on the occasion of his 90th birthday, 11th June 2012, in appreciation of his influential and pioneering contributions to the field.
Article
Full-text available
This paper describes the application of the analysis by synthesis paradigm to the melody of speech. A complete chain of processes is described from the acoustic analysis of fundamental frequency (f0), via the phonetic modelling of f0 using the Momel algorithm, to the surface phonological representation of the curves using the INTSINT alphabet. Ea...
Chapter
Full-text available
The term dialect is used here as a taxonomic level of linguistic classification, subordinate to a language. In this sense everyone speaks a dialect, including those who speak the prestigious dialect of a language. A comparison with biological taxonomy brings to light a parallel between the notions of language and of species where, in both cases, t...
Article
Full-text available
The following two texts were published by the predecessor of our journal over half a century ago and illustrate the characteristic nature of its publications at that time.
Conference Paper
Full-text available
We propose in this paper a broad-coverage approach for multimodal annotation of conversational data. Large annotation projects addressing the question of multimodal annotation bring together many different kinds of information from different domains, with different levels of granularity. We present in this paper the first results of the OTIM p...
Conference Paper
Full-text available
Fundamental frequency, the primary acoustic correlate of speech melody, is generally analysed and displayed using a linear scale (in Hertz) or a logarithmic one (usually in semitones), generally offset to an arbitrary reference level. In this paper we argue that a more natural scale for analysing speech is the OME (Octave-MEdian) scale, using the o...
Conference Paper
Full-text available
This study investigates rhythmic parameters in the production of French learners in a dual perspective: (i) to analyse the influence of rhythm of the native language (L1=French) on the target language (L2=English) and, (ii) to provide prosodic evaluative criteria for French speakers' productions. The method used is a comparative analysis of French...
Conference Paper
Full-text available
While current tools for the automatic analysis and modeling of intonation are satisfactory for laboratory or isolated sentences, they appear insufficient for the study of longer stretches of authentic speech, which are in general marked by systematic changes of register. This study shows that implementing automatically detected register changes sig...
Article
Full-text available
Most existing algorithms to identify the primary stressed syllable of accented words for the recognition and synthesis of Arabic prosody are based on the fundamental frequency. In this study, we used both formant values and the acoustic parameter of energy by means of a classification by a discriminant analysis to detect the primary stressed sylla...
Chapter
Full-text available
This paper presents results from the analysis of the rhythmic characteristics of a corpus of five and a half hours of authentic speech of British English. It is shown (as suggested by Wiktor Jassem over 50 years ago) that the most appropriate unit to describe the relative lengthening of phones is the Narrow Rhythm Unit, beginning with the stressed...
Article
Full-text available
This paper proposes an approach for a Classification by Discriminant Analysis of stressed syllables in Standard Arabic. In this study, we exploited the acoustic parameters of fundamental frequency and energy by means of a classification by a discriminant analysis to detect stressed syllables of Standard Arabic words with the structure [CVCVCV] read...
Conference Paper
Full-text available
A promising strategy for the multilingual annotation of speech prosody is to use manual annotation of a small corpus of speech to bootstrap a fully automatic annotation system. We make a systematic distinction between functional annotation and formal annotation. The use of functional prosodic labelling for prosody control in a Finnish speech synthe...
Article
Full-text available
A promising strategy for the multilingual annotation of speech prosody is to use manual annotation of a small corpus of speech to bootstrap a fully automatic annotation system. We make a systematic distinction between functional annotation and formal annotation. The use of functional prosodic labelling for prosody control in a Finnish speech synthe...
Technical Report
Full-text available
This database may be freely distributed and used without any restriction except that it should always be accompanied by this notice. Our only request is that the providers of the database (us) should be informed of any enrichments you or others may make to it and that these enrichments should be made freely available for future distributions.
Article
Full-text available
Problem statement: In the early days of speech synthesis research the obvious focus of attention was intelligibility. But many researchers agree that the major remaining obstacle to fully acceptable synthetic speech is that it continues to be insufficiently natural. Approach: In this study, we exploited microvariations of fundamental frequency (F0)...
Article
Full-text available
Problem Statement: Current algorithms for the recognition and synthesis of Arabic prosody concentrate on identifying the primary stressed syllable of accented words on the basis of fundamental frequency. Generally, the three acoustic parameters used in prosody are: Fundamental frequency, duration and energy. Approach: In this study, we exploited th...
Article
Full-text available
One of the fundamental aims of prosodic analysis is to provide a reliable means of extracting functional information (what prosody contributes to meaning) directly from prosodic form. It has been argued that an explicit model of the mapping from prosodic function to prosodic form could provide an objective way of approaching this task. In this pres...
Article
Full-text available
This paper presents results from ongoing research on the evaluation of the prosody of British English spoken by French learners and native speakers. This pilot study examines two potential rhythmic criteria: the analysis of the anacrusis/narrow rhythm unit and that of the pairwise variability index (PVI). The method used is a comparative analysi...
Article
Full-text available
This paper describes the contents of the Korean prosody corpus (Korean MULTEXT), which is a Korean version of the speech database Eurom1. The corpus consists of about 2 hours of read speech, transcribed primarily in orthography (in Korean alphabet and in a Romanized transcription), in IPA and in SAMPA. Furthermore, it includes the original F0 value...
Conference Paper
Full-text available
This paper describes the contents of the Korean prosody corpus (Korean MULTEXT), which is a Korean version of the speech database Eurom1. The corpus consists of about 2 hours of read speech, transcribed primarily in orthography (in Korean alphabet and in a Romanized transcription), in IPA and in SAMPA. Furthermore, it includes the original F0 value...
Article
Full-text available
The analysis of authentic speech, unlike that of laboratory speech, needs to take into account the fact that the fundamental frequency patterns corresponding to the intonation of utterances can be of two types: local pitch characteristics determined by the surface phonological representation of the intonation and longer term characteristics corresp...
Article
Full-text available
This paper presents a general model for the relation between representations of form and function for speech prosody on a multi-lingual basis. It outlines a procedure for analysing prosody by synthesis applied to English intonation patterns, generating formal representations from a minimal representation of prosodic functions and comparing the outp...