Doroteo Torre Toledano

Universidad Autónoma de Madrid | UAM · Department of Electronical and Communication Technology

About

224 Publications
30,796 Reads
1,908 Citations
Citations since 2016
8 Research Items
796 Citations
[Chart: citations by year, 2016-2022; vertical axis 0-120]

Publications (224)
Preprint
A new multimodal biometric database, acquired in the framework of the BiosecurID project, is presented together with the description of the acquisition setup and protocol. The database includes eight unimodal biometric traits, namely: speech, iris, face (still images, videos of talking faces), handwritten signature and handwritten text (on-line dyn...
Article
Full-text available
Background: Obstructive sleep apnea (OSA) is a common sleep disorder characterized by frequent cessation of breathing lasting 10 seconds or longer. The diagnosis of OSA is performed through an expensive procedure, which requires an overnight stay at the hospital. This has led to several proposals based on the analysis of patients' facial images an...
Article
Full-text available
Within search-on-speech, Spoken Term Detection (STD) aims to retrieve data from a speech repository given a textual representation of a search term. This paper presents an international open evaluation for search-on-speech based on STD in Spanish and an analysis of the results. The evaluation has been designed carefully so that several analyses of...
Article
Full-text available
Language recognition systems based on bottleneck features have recently become the state of the art in this research field, showing their success in the last Language Recognition Evaluation (LRE 2015) organized by NIST (U.S. National Institute of Standards and Technology). This type of system is based on a deep neural network (DNN) trained to discrim...
Preprint
Full-text available
Background: Obstructive sleep apnea (OSA) is a common sleep disorder characterized by frequent cessation of breathing lasting 10 seconds or longer. The diagnosis of OSA is performed through an expensive procedure, which requires an overnight stay at the hospital. This has led to several proposals based on the analysis of patients’ facial images and...
Article
Full-text available
Background: Sleep apnea (OSA) is a common sleep disorder characterized by recurring breathing pauses during sleep caused by a blockage of the upper airway (UA). The altered UA structure or function in OSA speakers has led researchers to hypothesize that automatic speech analysis could be used for OSA assessment. In this paper we critically review several approaches using...
Article
Full-text available
Long Short Term Memory (LSTM) Recurrent Neural Networks (RNNs) have recently outperformed other state-of-the-art approaches, such as i-vector and Deep Neural Networks (DNNs), in automatic Language Identification (LID), particularly when dealing with very short utterances (∼3s). In this contribution we present an open-source, end-to-end, LSTM RNN sy...
Article
Full-text available
Query-by-example spoken term detection (QbE STD) aims at retrieving data from a speech repository given an acoustic query containing the term of interest as input. Nowadays, it is receiving much interest due to the large volume of multimedia information. This paper presents the systems submitted to the ALBAYZIN QbE STD 2014 evaluation held as a par...
Article
Full-text available
Spoken term detection (STD) aims at retrieving data from a speech repository given a textual representation of the search term. Nowadays, it is receiving much interest due to the large volume of multimedia information. STD differs from automatic speech recognition (ASR) in that ASR is interested in all the terms/words that appear in the speech data...
Article
Full-text available
Obstructive sleep apnea (OSA) is a common sleep disorder characterized by recurring breathing pauses during sleep caused by a blockage of the upper airway (UA). OSA is generally diagnosed through a costly procedure requiring an overnight stay of the patient at the hospital. This has led to proposing less costly procedures based on the analysis of p...
Article
Objectives: We investigated whether differences in formants and their bandwidths, previously reported in comparisons of small samples of healthy individuals and patients with obstructive sleep apnea (OSA), are detected in a larger population representative of a clinical practice scenario. We examine possible indirect or mediated effects of clini...
Book
The Spanish Thematic Network on Speech Technology (RTTH) and the ISCA-Special Interest Group on Iberian Languages (SIG-IL) present the selected papers of IberSpeech 2014, Joint VIII Jornadas en Tecnologías del Habla and IV Iberian SLTech Workshop, held in Las Palmas de Gran Canaria, Spain, November 19-21. The articles are organized into four differ...
Chapter
This paper presents the ATVS-CSLT-HCTLab spoken term detection (STD) system submitted to the NIST 2013 Open Keyword Search evaluation. The evaluation consists of searching a list of query terms in Vietnamese conversational speech data. Our submission involves an automatic speech recognition (ASR) subsystem which converts speech signals into word/ph...
Article
Full-text available
Query-by-Example Spoken Term Detection (QbE STD) aims at retrieving data from a speech data repository given an acoustic query containing the term of interest as input. Nowadays, it has been receiving much interest due to the high volume of information stored in audio or audiovisual format. QbE STD differs from automatic speech recognition (ASR) an...
Article
Obstructive sleep apnoea (OSA) is a highly prevalent disease affecting an estimated 2–4% of the adult male population that is difficult and very costly to diagnose because symptoms can remain unnoticed for years. The reference diagnostic method, Polysomnography (PSG), requires the patient to spend a night at the hospital monitored by specialized eq...
Article
Discriminative confidence based on multi-layer perceptrons (MLPs) and multiple features has shown significant advantage compared to the widely used lattice-based confidence in spoken term detection (STD). Although the MLP-based framework can handle any features derived from a multitude of sources, choosing all possible features may lead to over com...
Article
Full-text available
Nowadays definitive diagnosis of obstructive sleep apnoea (OSA) syndrome is expensive and time-consuming. Previous research on voice characteristics of OSA patients has shown that resonance, phonation and articulation differences arise when compared to healthy subjects. In this contribution we study different speech modeling techniques to detect pa...
Article
Full-text available
The process of human segmentation and labelling of speech can be seen as a two-step process. In the first step humans listen to a speech signal, recognize the word and phoneme sequence, and roughly determine the position of each phonetic boundary. In the second step humans examine several speech signal features (waveform, energy, spectrogram, etc.)...
Conference Paper
We present a novel approach using both sustained vowels and connected speech, to detect obstructive sleep apnea (OSA) cases within a homogeneous group of speakers. The proposed scheme is based on state-of-the-art GMM-based classifiers, and acknowledges specifically the way in which acoustic models are trained on standard databases, as well as the c...
Article
This paper presents the systems submitted by the ATVS Biometric Recognition Group to the 2009 Language Recognition Evaluation (LRE'09), organized by NIST. New challenges included in this LRE edition can be summarized by three main differences with respect to past evaluations. First, the number of languages to be recognized expanded to 23 languages...
Conference Paper
Full-text available
Discriminative confidence estimation along with confidence normalisation have been shown to construct robust decision maker modules in spoken term detection (STD) systems. Discriminative confidence estimation, making use of term-dependent features, has been shown to improve the widely used lattice-based confidence estimation in STD. In this work, w...
Conference Paper
Full-text available
A novel way of managing the compromise between noise reduction and speech distortion in Wiener filters is presented. It is based on adjusting the amount of noise reduced, and therefore the speech distortion introduced, on a phone-by-phone basis. We show empirically that optimal Wiener filters produce different amounts of speech distortion for diffe...
Article
Full-text available
This paper describes the system submitted by ATVS-UAM to the 2010 edition of NIST Speaker Recognition Evaluation (SRE). Instead of focusing on multiple, complex and heavy systems, our submission is based on a fast, light and efficient single system. Sample development results with English SRE08 data (data used in the previous evaluation in 2008) ar...
Article
Full-text available
This paper describes the ATVS-UAM systems submitted to the Audio Segmentation and Speaker Diarization Albayzin 2010 Evaluation. The ATVS-UAM audio segmentation system is based on a 5-GMM-MMI-state HMM model. Testing utterances are aligned with the model by means of the Viterbi algorithm. Spurious changes in the state sequence were removed by mode-f...
Article
Full-text available
This study is part of an ongoing collaborative effort between the medical and the signal processing communities to promote research on applying standard Automatic Speech Recognition (ASR) techniques for the automatic diagnosis of patients with severe obstructive sleep apnoea (OSA). Early detection of severe apnoea cases is important so that patient...
Article
During the last decades the need for fast searching of speech recordings has rapidly developed. One of the crucial technologies for searching and further processing this material is automatic language recognition on spontaneous speech. A brief introduction to this technology is provided and the system submitted by ATVS-UAM to the 2007 International...
Article
Band-limited speech (speech for which parts of the spectrum are completely lost) is a major cause for accuracy degradation of automatic speech recognition (ASR) systems particularly when acoustic models have been trained with data with a different spectral range. In this paper, we present an extensive study of the problem of ASR of band-limited spe...
Conference Paper
The aim of this paper is to study new possibilities of using Automatic Speaker Recognition techniques (ASR) for detection of patients with severe obstructive sleep apnoea (OSA). Early detection of severe apnoea cases can be very useful to give priority to their early treatment optimizing the expensive and time-consuming tests of current diagnosis m...