Ignasi Esquerra

Universitat Politècnica de Catalunya | UPC · TALP - Centre for Language and Speech Technologies and Applications

MSc. Engineer

About

20
Publications
1,327
Reads
83
Citations
Citations since 2017
1 Research Item
10 Citations

Publications (20)
Article
Full-text available
This paper presents our work on the FESTCAT project, whose main goal was the development of voices for the Festival suite in Catalan. In the first year, we produced the corpus and the speech data needed to build 10 voices using the Clunits (unit selection) and the HTS (Markov models) methods. The resulting voices are freely available on the we...
Conference Paper
Full-text available
In this paper we describe the design and production of a Catalan database for building synthetic voices. Two speakers have each recorded 10 hours of speech. The speaker selection and the corpus design aim to provide resources for high-quality synthesis. The resources have been used to build voices for the Festival TTS. Both the...
Article
Full-text available
Segre is a rule-based automatic phonetic transcription system for Catalan, jointly developed by the Universitat Politècnica de Catalunya, the Universitat Autònoma de Barcelona and the Universitat de Barcelona in the framework of the Catalan Reference Centre for Language Engineering (CREL, Centre de Referència en Enginyeria Lingüística). The syntax...
Conference Paper
Full-text available
This paper summarizes the text-to-speech system that has been developed in the Speech Group of the Universitat Politècnica de Catalunya (UPC). The system is composed of a core and different interfaces so that it is compatible for research, for telephone applications (either CTI boards or standard ISDN PC cards supporting CAPI), and Windows applicat...
Conference Paper
Full-text available
Knowledge of phonetic unit frequency is essential for developing databases in both concatenative synthesis and continuous speech recognition. In the present work, a large corpus of text was processed and phonetically transcribed to obtain allophone and diphone frequencies for the Catalan language. The corpus was acquired from newspaper article...
Article
Different databases of phonetic units are required in multilingual Text-to-Speech systems based on concatenative synthesis. We are currently developing a TTS system able to convert text in either Catalan or Spanish, with some of the modules shared by the two languages while others are specific to each language. In order to reduce...
Conference Paper
The multiwindow approach is a meaningful framework for nonparametric spectral estimation. It also encompasses several conventional methods, such as WOSA and the frequency-averaged periodogram. Recently, some authors claimed that the Slepian windows of Thomson's method and other related optimal sets of windows show a better performance in terms of resolution,...
Article
The developed tool provides utilities for prosody analysis and labeling of voice signals. It works under Windows 95 and Windows NT environments and uses the Microsoft Win32 application programming interface (API) for audio playing and recording. The application detects the prosody of the speech signal and then the original intonation can be stylized in...
Article
Full-text available
This paper presents the evaluation of Ogmios, the UPC TTS system, carried out within the Blizzard Challenge Initiative, 2007. Ogmios is a unit-selection based system. Prosodic models are used to select the units using acoustic measures in the target cost but the selected units are not modified. Most of the modules of Ogmios rely on data-driven tech...
Article
Full-text available
This paper presents the UPC TTS system named Ogmios. It was used to generate the voices in UK English and Mandarin Chinese for Blizzard Challenge 2008. Ogmios is a system based on unit selection using acoustic and phonetic features in both target and concatenation costs. Most of the modules of Ogmios rely on data-driven techniques. This evaluation...
Article
In this paper, we present the design of a corpus for speech recognition to be used for the recording of a speech database in Catalan. A previous database in Spanish was the reference in setting the specifications for the characteristics of the sentences and the minimum number of units required. An analysis of unit frequencies was carried out...
Article
The topic of emotional speech synthesis has recently received a lot of interest within the research community, as shown by the number of papers presented at conferences and workshops. In the beginning, synthesizers used rules to perform prosodic and voice quality changes in order to produce different styles of speaking to the synthetic voic...
Article
This paper describes Ogmios, the UPC TTS system that was used in the 2010 Albayzin Evaluation. Ogmios is a concatenation system that builds the synthetic sentence from demiphones selected from the training database. In this evaluation round, the database was provided by the organization and it has been phonetically transcribed and segmented automat...
