
José Luis Pérez-Córdoba
University of Granada | UGR · Department of Signal Theory, Telematics and Communications
PhD
52 Publications
5,949 Reads
563 Citations
Publications (52)
In this paper, we propose a novel algorithm called multiview temporal alignment by dependence maximisation in the latent space (TRANSIENCE) for the alignment of time series consisting of sequences of feature vectors that differ in both length and feature dimensionality. The proposed algorithm, which is based on the theory of multiview lea...
Articulatory-to-acoustic (A2A) synthesis refers to the generation of audible speech from captured movement of the speech articulators. This technique has numerous applications, such as restoring oral communication to people who can no longer speak due to illness or injury. Most successful techniques so far adopt a supervised learning framework, in...
This review summarises the status of silent speech interface (SSI) research. SSIs rely on non-acoustic biosignals generated by the human body during speech production to enable communication whenever normal verbal communication is not possible or not desirable. In this review, we focus on the first case and present the latest SSI research aimed at prov...
Morales-Artacho, AJ, García-Ramos, A, Pérez-Castilla, A, Padial, P, Gomez, AM, Peinado, AM, Pérez-Córdoba, JL, and Feriche, B. Muscle activation during power-oriented resistance training: continuous vs. cluster set configurations. J Strength Cond Res XX(X): 000-000, 2018. This study examined performance and electromyography (EMG) changes during a po...
Voice over IP (VoIP) communications are prone to transmission delays and data losses as they are carried out over packet-switched networks which are unable to guarantee real-time packet delivery. Speech codecs used in these channels strongly rely on Packet Loss Concealment (PLC) algorithms, the performance of which can be compromised as frame losse...
One of the well-known problems of Code-Excited Linear Prediction (CELP)-type codecs is their vulnerability to frame erasures. When a frame is erased, the inter-frame dependency introduced by the Long Term Prediction causes a desynchronization of the Adaptive Codebook (ACB), which in turn introduces error propagation through the correctly receiv...
The strong interframe dependency present in Code Excited Linear Prediction (CELP) codecs renders the decoder very vulnerable when the Adaptive Codebook (ACB) is desynchronized. Hence, errors affect not only the concealed frame but also all the subsequent frames. In this paper, we have developed a Forward Error Correction (FEC)-based technique which...
In this paper, we propose an error mitigation scheme which combines two different approaches, a replacement super vector technique which provides replacements to reconstruct both the LPC coefficients and the excitation signal along bursts of lost packets, and a Forward Error Code (FEC) technique in order to minimize the error propagation after the...
In this paper, a new speech database, the so-called Secu-Voice, is described. This database consists of utterances in Spanish of isolated digits recorded with two different smartphones: a mid-range smart-phone and a high-range one. This database is intended for research on biometrics and secure applications that integrate both automatic speech reco...
The importance of packet-based speech transmission has grown because it offers cheaper and more efficient communications. However, frame erasures are a common hurdle in these networks and concealment techniques are necessary to ensure a minimum quality of service. In this paper, we propose a mitigation technique focused on the reconstruction of the linea...
Most of the algorithms used for information extraction and for processing the amino acid chains that make up proteins treat them as symbolic chains. Fewer algorithms exploit signal processing techniques that require a numerical representation of amino acid chains. However, these algorithms are very powerful for extracting regularities that cannot b...
In this paper we present a new mitigation technique for lost speech frames transmitted over loss-prone packet networks. It is based on an MMSE estimation from the last received frame, which provides replacements not only for the LPC coefficients (envelope) but also for the residual signal (excitation). Although the method is codec-independent, it r...
This paper presents a recovery scheme for the error-propagation distortion which frequently appears after a frame erasure in CELP-based speech coders, in particular the AMR codec. The extensive use of predictive filters and parameter encoding allows high-quality speech synthesis in these codecs, but makes them more vulnerable to frame erasures. Th...
This paper presents an ACELP-based speech transmission scheme that is robust to frame erasures. The scheme is based on the steganographic transmission of media-specific FEC codes. These FEC codes are intended to prevent the adaptive codebook desynchronization frequently found in the decoder after a frame erasure. They are based on a multipulse repr...
This paper presents a forward error correction (FEC) technique based on a multipulse representation of the excitation for code-excited linear prediction (CELP) speech transmission under packet loss conditions. In this approach, the encoder sends the position of a pulse that is used for the resynchronization of the adaptive codebook, so that prop...
In this paper, we analyze the performance of network speech recognition (NSR) over IP networks, adapting and proposing new solutions to the packet loss problem for code excited linear prediction (CELP) codecs. NSR has a client-server architecture which places the recognizer at the server side using a standard speech codec for speech transmission. I...
While VoIP (voice over IP) is gaining importance in comparison with other types of telephony, packet loss remains the main source of degradation in VoIP systems. Traditional speech codecs, such as those based on the CELP (code excited linear prediction) paradigm, can achieve low bit-rates at the cost of introducing interframe dependencies. As a...
This paper proposes a method for the remote recognition of speech coded with the iLBC codec, which is employed by a number of VoIP systems. While the usual way of performing recognition of coded speech is to first decode the speech signal and use it as input to the recognition engine, our system directly converts the iLBC parameters into recognitio...
ABSTRACT This paper presents an implementation of a real-time distributed speech recognition system for use in an Internet environment. Developed as a client-server application, the client uses the standard front-end defined by ETSI. It includes a voice detector so that information is sent only when...
Distributed Speech Recognition involves the development of techniques to conceal the degradations that the transmission channel introduces in the speech features. This work proposes a low-complexity high-accuracy error concealment technique compatible with the DSR ETSI standards. This is achieved by combining three different techniques: fast MMSE e...
This work presents a joint source-channel technique based on Channel Optimized Vector Quantization (COVQ) for transmission over bursty channels, applied to the coding of LSP parameters. The bursty channel is modeled as a Finite State Channel (FSC) with two states. We call the resulting quantization technique Bursty COVQ (BCOVQ). The case in which channel...
Noise degrades the performance of Automatic Speech Recognition systems mainly because of the mismatch it introduces between training and recognition conditions. Noise distorts the feature space, and this distortion usually behaves non-linearly. In order to reduce this mismatch, the methods proposed for robust speech recognition try...
This paper proposes a new packet loss concealment technique based on the inclusion in each packet of a few FEC bits, representing data replicas, combined with a minimum mean square error estimation (MMSE). This technique is developed for an Aurora-2 distributed speech recognition system working over an IP network. In addition to the data representi...
The emergence of distributed speech recognition has generated the need to mitigate the degradations that the transmission channel introduces in the speech features used for recognition. This work proposes a hidden Markov model (HMM) framework from which different mitigation techniques oriented to wireless channels can be derived. First, we study th...
A study of combined source and channel coding applied to LSP parameters in wideband speech coding is presented. The traditional approach to protect against channel errors is to increase the bit-rate for channel coding, decreasing the bit-rate of the source coding according to the channel conditions. Joint source-channel coding is an alternative that p...
Distributed Speech Recognition involves the development of techniques to mitigate the degradations that the transmission channel introduces in the speech features. This work proposes an HMM framework from which different mitigation techniques oriented to bursty channels can be derived. In particular, two MMSE-based and a new Viterbi-based mitigat...
This paper studies a progressive image transmission technique over waveform channels. The channel optimized vector quantization codec (COVQ) (Farvardin and Vaishampayan 1991) is applied to the image wavelet coefficients creating a robust progressive image transmission technique that mitigates the effects of a noisy channel on the reconstructed imag...
Recently, the first version of an ETSI standard for Distributed Speech Recognition has been proposed. The main benefit of this approach is the possibility of maintaining a high recognition performance when accessing remote information systems. The use of a digital channel for transmission of the encoded speech parameters implies the introduction of...
Combined source and channel coding is a technique to mitigate channel errors without increasing the bit error rate. Channel optimized vector quantizer (COVQ) performs these objectives in the context of vector quantization. This paper presents a study of channel optimized matrix quantizer (COMQ) applied to quantize the line spectral pair (LSP) param...
We present a study of a channel optimized matrix quantizer (COMQ) applied to quantize LSP (line spectral pair) parameters in a CELP (code excited linear prediction) coder. A modification of the DoD FS-1016 standard (Campbell et al., 1989) is used for this purpose. A Gaussian channel is considered as the channel through which information is sent and...
A channel optimized vector quantizer (COVQ) is studied for the case of transmission over waveform channels. In this work, a number of modulation schemes with multidimensional signal constellations are considered, specifically, results on the binary signalling, M-ary phase-shift keying (MPSK) and M-ary quadrature amplitude modulation (MQAM) performa...
This work presents the STACC, Sistema Telefónico Automático de Consulta de Calificaciones (Automatic Telephone System for Consulting Marks). This system was developed at our laboratory during 1996 and implements a telephone service that allows students to consult their marks by speech after the exams by means of a simple pho...
In this paper a general transform domain formulation of a low-delay frequency domain technique [1],[2] recently developed is presented. Following this formulation a Discrete Cosine Transform scheme is proposed which combined with trellis coded quantization leads to a Transform Trellis Coded Quantization technique. This technique is applied to the c...
The main goal is automatic speech recognition by using artificial neural networks. The authors define a generalized type of neuron that, grouped in a recurrent neural network (an Alphanet), implements a semicontinuous hidden Markov model (SCHMM). The neurons are grouped in a single layer that generates the Alphanet in such a way that some of its in...
In this Ph.D. dissertation the influence of packet losses on speech recognition is analyzed and different solutions to prevent, reduce and conceal their effects are developed. The performance of remote speech recognition will be subject to the robustness of the speech coding scheme used. Conventional speech codecs manage to reduce the bit-rate b...
This paper presents the research project TEC2010-18009/TCM, proposed for funding by the Ministerio de Ciencia e Innovación (MICINN). The main goal of this research project is the development of two groups of techniques for the processing of noisy, damaged or lost information: estimation and uncertainty processing. We will consider two different...
In this paper the robustness of Network Speech Recognition (NSR) systems is analyzed. In NSR the speech signal is transmitted using a conventional speech codec from the client to the server, where the recognition task is carried out. The use of speech codecs degrades the performance of such systems, mainly in the presence of acoustic noise and packet...
This paper describes the implementation of a multipulse synthesizer for a Spanish text-to-speech converter. The sound units from which the spoken message is composed are diphones, since these allow the problems posed by the concatenation of two units to be resolved correctly. The type of coding used is LPC M...