Publications (3)
3.1 Total impact
Article: Spectro-temporal modulation energy based mask for robust speaker identification.
ABSTRACT: Spectro-temporal modulations of speech encode speech structures and speaker characteristics. An algorithm that distinguishes speech from non-speech based on spectro-temporal modulation energies is proposed and evaluated in robust text-independent closed-set speaker identification simulations using the TIMIT and GRID corpora. Simulation results show that the proposed method produces much higher speaker identification rates in all signal-to-noise ratio (SNR) conditions than the baseline system using mel-frequency cepstral coefficients. In addition, the proposed method outperforms the system that uses auditory-based nonnegative tensor cepstral coefficients [Q. Wu and L. Zhang, "Auditory sparse representation for robust speaker recognition based on tensor structure," EURASIP J. Audio, Speech, Music Process. 2008, 578612 (2008)] in low SNR (≤ 10 dB) conditions.
The Journal of the Acoustical Society of America 05/2012; 131(5):EL368-74. · 1.55 Impact Factor
Article: Multiband analysis and synthesis of spectro-temporal modulations of Fourier spectrogram.
ABSTRACT: The two-dimensional spectro-temporal modulation filtering concept of the auditory model [T. Chi, P. Ru, and S. A. Shamma, J. Acoust. Soc. Am. 118(2), 887-906 (2005)] is implemented on the Fourier spectrogram. The Fourier magnitude spectrogram is analyzed in terms of its joint spectro-temporal modulations, which embed the temporal dynamics and spectral structures. Instead of iterative projection methods, the overlap-and-add method is adopted to invert modified Fourier spectrograms back to sounds. The proposed framework not only provides a spectro-temporal analytical process for sounds similar to that of the auditory model but also produces synthesized sounds of better quality in a timely manner, which makes the proposed framework feasible for human speech recognition (HSR) applications as well.
The Journal of the Acoustical Society of America 05/2011; 129(5):EL190-6. · 1.55 Impact Factor
Conference Proceeding: FFT-based spectro-temporal analysis and synthesis of sounds.
Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011, May 22-27, 2011, Prague Congress Center, Prague, Czech Republic; 01/2011
Institutions
- 2011–2012: National Chiao Tung University, Department of Electronics Engineering, Hsinchu, Taiwan