A new phase model for sinusoidal transform coding of speech

Dept. of Electr. Eng., Arizona State Univ., Tempe, AZ
IEEE Transactions on Speech and Audio Processing (Impact Factor: 2.29). 10/1998; DOI: 10.1109/89.709675
Source: IEEE Xplore

ABSTRACT A phase modeling algorithm for sinusoidal analysis-synthesis of
speech is presented, where short-time sinusoidal phases are approximated
using a combination of linear prediction, spectral sampling, delay
compensation, and phase correction techniques. The algorithm differs
from phase compensation methods proposed for source-system LPC in that
it has been tailored to the sinusoidal representation of speech.
Performance analysis on a large speech database reveals an improvement
in temporal and spectral signal matching, as well as in the subjective
quality of reconstructed speech. The method can be applied to enhance
phase matching in low bit rate sinusoidal coders, where underlying sine
wave amplitudes are extracted from an all-pole model. Preliminary
subjective results are presented for a 2.4 kb/s sinusoidal coder.
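
The abstract names the ingredients (linear prediction, spectral sampling, delay compensation, phase correction) without giving formulas. The following is a minimal sketch of the first three only, assuming a conventional all-pole filter 1 / (1 + sum_k a_k z^-k), harmonic sampling at multiples of the pitch frequency, and a pure linear-phase delay term. The function names, the sign convention, and the `onset_delay` parameter are illustrative assumptions, not the authors' algorithm, and the phase correction step is omitted.

```python
import cmath
import math


def allpole_phase(lpc, omega):
    """Phase of H(e^{jw}) = 1 / (1 + sum_k a_k e^{-jwk}) at radian frequency omega.

    lpc holds the assumed predictor coefficients a_1..a_p.
    """
    denom = 1.0 + sum(a * cmath.exp(-1j * omega * (k + 1)) for k, a in enumerate(lpc))
    # The phase of 1/denom is the negative of the phase of denom.
    return -cmath.phase(denom)


def model_phases(lpc, f0_hz, fs_hz, n_harmonics, onset_delay):
    """Sample the all-pole phase at harmonics of f0 and add a delay term.

    onset_delay is a hypothetical delay in samples; it contributes the
    linear phase -omega * onset_delay. Results are wrapped to [-pi, pi].
    """
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    phases = []
    for m in range(1, n_harmonics + 1):
        w = m * w0
        phi = allpole_phase(lpc, w) - w * onset_delay
        phases.append(math.atan2(math.sin(phi), math.cos(phi)))
    return phases
```

For example, `model_phases([-0.5], 100.0, 8000.0, 10, 12)` returns ten wrapped phases, one per harmonic of a 100 Hz pitch at an 8 kHz sampling rate.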

  • ABSTRACT: This paper presents a novel technique for modeling and quantization of the phase information in low-rate harmonic+noise coding. In the proposed phase model, each frequency track is adjusted by a frequency deviation (FD) that reduces the error between measured and predicted phases. By exploiting the intra-frame relationship of the FDs, the phase information is represented more efficiently than with measured phases or phase prediction residuals. An efficient FD quantization scheme based on closed-loop analysis is also developed. In this scheme, the FD of the first harmonic and a vector of the FD differences are quantized by minimizing a perceptually weighted distortion measure between the measured phases and the quantized phases. The proposed technique reproduces the temporal events of the original speech signal and improves the subjective quality of the synthesized speech using only 13 bits per frame.
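
The idea of a frequency deviation can be illustrated with a small sketch. It assumes that the predicted phase is obtained by advancing the previous phase with the average of the previous and current track frequencies over one frame, and that the FD is the constant frequency offset that removes the remaining wrapped phase error. Both assumptions, and all names below, are illustrative; the abstract does not state the exact formulation.

```python
import math


def wrap(phi):
    """Wrap an angle in radians to the interval [-pi, pi]."""
    return math.atan2(math.sin(phi), math.cos(phi))


def frequency_deviation(phase_prev, phase_meas, omega_prev, omega_cur, frame_len):
    """Frequency offset (rad/sample) that aligns predicted and measured phase.

    Assumed predictor: phase_prev plus the mean track frequency times the
    frame length in samples. Adding the returned offset to the track over
    the frame cancels the wrapped prediction error.
    """
    predicted = phase_prev + 0.5 * (omega_prev + omega_cur) * frame_len
    return wrap(phase_meas - predicted) / frame_len
```

Because the error is wrapped before division, the deviation is bounded by pi / frame_len in magnitude, which is one reason such a quantity could be cheaper to quantize than a raw phase.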
  • ABSTRACT: A speech signal can be represented as a combination of sinusoids with infinitely many possible combinations of amplitude, frequency, and phase. In peak-to-peak quantization, the positive and negative peaks of the speech signal are detected, and the time distance between consecutive peaks is quantized. In this paper, we explain a new method for quantizing a speech signal that is segmented from peak to peak, based on sinusoidal modeling. The part of the signal between a positive peak and the following negative peak, or vice versa, is estimated as a half period of a sinusoid. The magnitude between the peaks is double the estimated sine amplitude. The experimental results show that the quality of the synthesized signal is reduced in the high-frequency range. The perceived quality of the synthesized signal is nevertheless good enough, because human perception is less sensitive above 1 kHz.
    Information, Communications & Signal Processing, 2007 6th International Conference on; 01/2007
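
The half-period reconstruction described in this abstract can be sketched as follows. The sketch assumes each segment is rebuilt as half a cosine period running from one peak value to the next, with the amplitude equal to half the peak-to-peak magnitude and the offset equal to the midpoint of the two peaks. The midpoint offset and the function name are assumptions made for illustration.

```python
import math


def half_sine_segment(n0, v0, n1, v1):
    """Reconstruct samples n0 <= n < n1 between two consecutive peaks.

    (n0, v0) and (n1, v1) are the sample index and value of the two peaks.
    The curve is half a cosine period: it equals v0 at n0 and would reach
    v1 at n1, so its amplitude is half the peak-to-peak magnitude.
    """
    mid = 0.5 * (v0 + v1)   # assumed vertical offset of the half period
    amp = 0.5 * (v0 - v1)   # signed amplitude; negative for a rising segment
    span = n1 - n0
    return [mid + amp * math.cos(math.pi * (n - n0) / span) for n in range(n0, n1)]
```

Concatenating the segments for successive peak pairs yields the synthesized signal, so only the peak values and the distances between peaks need to be coded.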
  • ABSTRACT: Segmental Sinusoidal Model for Speech Signal Coding. A periodic signal can be decomposed into sinusoidal components by a Fourier series. Because of this property, it can be modeled in sinusoidal form. With the sinusoidal model, the signal can be quantized in order to encode speech at a lower rate. The existing sinusoidal method is already used in speech coding: a block of the speech signal 20 ms to 30 ms wide is coded based on its Fourier series coefficients. The new method proposed here quantizes and reconstructs the speech signal with a segmental sinusoidal model. A segment is defined as a block of the speech signal from one peak to the next consecutive peak. The segment length is variable, unlike the fixed block of the existing sinusoidal method. The coder consists of an encoder and a decoder. The encoder codes the speech signal at a variable rate, and the coded signal is then transmitted to the receiver. At the receiver, the coded signal is reconstructed so that its quality is close to that of the original signal. The experimental results show that the average segmental SNR is more than 20 dB.
    Keywords: peak, period, quantization, segmental, sinusoidal
    10/2010; DOI:10.7454/mst.v10i2.426
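
For reference, segmental SNR is commonly computed as the mean of per-frame SNR values in decibels. The sketch below uses fixed-length frames and skips frames with zero signal or zero error energy; both choices are assumptions, since the abstract does not say how the frames were chosen or how silent frames were handled.

```python
import math


def segmental_snr_db(reference, decoded, frame_len):
    """Average per-frame SNR in dB between a reference and a decoded signal.

    Frames are non-overlapping blocks of frame_len samples (an assumption).
    Frames with zero signal energy or zero error energy are skipped.
    """
    snrs = []
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal_energy = sum(r * r for r in ref)
        error_energy = sum((r - d) ** 2 for r, d in zip(ref, dec))
        if signal_energy > 0 and error_energy > 0:
            snrs.append(10.0 * math.log10(signal_energy / error_energy))
    return sum(snrs) / len(snrs) if snrs else float("nan")
```

With this definition, a decoded signal whose error amplitude is one tenth of the reference amplitude in every frame scores 20 dB.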