Publications (3) · 6.28 Total impact

  • Tai-Shih Chi, Ting-Han Lin, Chung-Chien Hsu
    ABSTRACT: Spectro-temporal modulations of speech encode speech structures and speaker characteristics. An algorithm that distinguishes speech from non-speech based on spectro-temporal modulation energies is proposed and evaluated in robust text-independent closed-set speaker identification simulations using the TIMIT and GRID corpora. Simulation results show that the proposed method produces much higher speaker identification rates than the baseline system using mel-frequency cepstral coefficients in all signal-to-noise ratio (SNR) conditions. In addition, the proposed method also outperforms a system using auditory-based nonnegative tensor cepstral coefficients [Q. Wu and L. Zhang, "Auditory sparse representation for robust speaker recognition based on tensor structure," EURASIP J. Audio, Speech, Music Process. 2008, 578612 (2008)] in low SNR (≤ 10 dB) conditions. (An illustrative modulation-energy sketch appears after the publication list.)
    The Journal of the Acoustical Society of America, 05/2012; 131(5):EL368-EL374. 1.65 Impact Factor
  • Chung-Chien Hsu, Ting-Han Lin, Tai-Shih Chi
    ABSTRACT: The two-dimensional spectro-temporal modulation filtering of the auditory model [1] is implemented on the FFT spectrogram, which is thereby analyzed in terms of the temporal dynamics and the spectral structures of the sound. The overlap-and-add (OLA) method, which is more convenient and reliable than the iterative-projection method proposed in [1], is used to invert the FFT spectrogram back to sound. The Non-Negative Sparse Coding (NNSC) method is adopted to demonstrate the benefit of our analysis-synthesis procedures in a noise-suppression application. Even without fine-tuning parameters, the proposed analysis-synthesis procedures offer benefits in de-noising, especially under low SNR conditions. (An illustrative filtering-and-resynthesis sketch appears after the publication list.)
    Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2011), Prague Congress Center, Prague, Czech Republic, May 22-27, 2011.
  • Ting-Han Lin, Chung-Chien Hsu, Tai-Shih Chi
    ABSTRACT: The performance of conventional speaker identification systems is severely compromised by interference such as additive or convolutional noise. High-level speaker information provides more robust cues for identifying speakers. This paper proposes an auditory-model based spectro-temporal modulation filtering (STMF) process to capture high-level information for robust speaker identification. Text-independent closed-set speaker identification simulations are conducted on the TIMIT and GRID corpora to evaluate the robustness of Auditory Cepstral Coefficients (ACCs) extracted after the STMF process. Simulation results show that ACCs improve substantially over conventional MFCCs in all SNR conditions. STMF also suppresses noise better than the newly developed Auditory-based Nonnegative Tensor Cepstral Coefficients (ANTCCs) in low SNR conditions. (An illustrative closed-set identification sketch appears after the publication list.)
    01/2010.
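
On the modulation-energy analysis described in the JASA letter above: the following is a minimal sketch, assuming an ordinary FFT spectrogram in place of the paper's auditory-model spectrogram. Modulation energy is read off a 2-D FFT of the log-spectrogram, and "speechiness" is scored by the fraction of energy at syllabic temporal rates; the 2-16 Hz band, frame sizes, and function names are illustrative choices, not values from the paper.

    # Minimal sketch (Python/NumPy/SciPy), not the paper's auditory-model front end.
    import numpy as np
    from scipy.signal import stft

    def modulation_energy(x, fs, nperseg=512, hop=128):
        """2-D (spectral scale x temporal rate) modulation energy of signal x."""
        _, _, Z = stft(x, fs=fs, nperseg=nperseg, noverlap=nperseg - hop)
        logspec = np.log(np.abs(Z) + 1e-8)           # freq bins x time frames
        logspec -= logspec.mean()
        E = np.abs(np.fft.fftshift(np.fft.fft2(logspec))) ** 2
        rates = np.fft.fftshift(np.fft.fftfreq(logspec.shape[1], d=hop / fs))
        return E, rates                               # rates in Hz

    def speech_score(x, fs, lo=2.0, hi=16.0):
        """Fraction of modulation energy at syllabic temporal rates (2-16 Hz)."""
        E, rates = modulation_energy(x, fs)
        band = (np.abs(rates) >= lo) & (np.abs(rates) <= hi)
        return E[:, band].sum() / E.sum()

    if __name__ == "__main__":
        fs = 16000
        t = np.arange(2 * fs) / fs
        rng = np.random.default_rng(0)
        noise = rng.standard_normal(t.size)
        speech_like = (1 + np.sin(2 * np.pi * 4 * t)) * noise   # 4 Hz "syllables"
        print("speech-like:", speech_score(speech_like, fs))
        print("noise:      ", speech_score(noise, fs))

Amplitude-modulated noise concentrates modulation energy near its 4 Hz rate, so it scores higher than unmodulated noise, which spreads energy across all rates.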
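On the analysis-synthesis procedure of the ICASSP 2011 paper above: a minimal sketch, assuming a simple low-pass rate/scale mask stands in for the paper's modulation filters and its NNSC stage. The log-magnitude FFT spectrogram is filtered in the 2-D modulation domain and inverted with SciPy's overlap-and-add ISTFT, reusing the original phase; cut-off values and frame parameters are illustrative.

    # Minimal sketch: 2-D modulation-domain low-pass filtering of an FFT
    # spectrogram, inverted by overlap-and-add (scipy.signal.istft).
    import numpy as np
    from scipy.signal import stft, istft

    def modulation_lowpass(x, fs, nperseg=512, hop=128,
                           max_rate_hz=16.0, max_scale=0.05):
        _, _, Z = stft(x, fs=fs, nperseg=nperseg, noverlap=nperseg - hop)
        mag, phase = np.abs(Z), np.angle(Z)
        logmag = np.log(mag + 1e-8)

        # 2-D FFT: axis 0 -> spectral scales (cycles/bin), axis 1 -> rates (Hz).
        M = np.fft.fft2(logmag)
        scales = np.fft.fftfreq(logmag.shape[0])
        rates = np.fft.fftfreq(logmag.shape[1], d=hop / fs)

        # Keep only slow temporal modulations and coarse spectral structure.
        mask = (np.abs(rates)[None, :] <= max_rate_hz) & \
               (np.abs(scales)[:, None] <= max_scale)
        logmag_f = np.real(np.fft.ifft2(M * mask))

        # Overlap-and-add resynthesis with the original phase.
        Z_f = np.exp(logmag_f) * np.exp(1j * phase)
        _, y = istft(Z_f, fs=fs, nperseg=nperseg, noverlap=nperseg - hop)
        return y

    if __name__ == "__main__":
        fs = 16000
        t = np.arange(fs) / fs
        rng = np.random.default_rng(0)
        noisy = np.sin(2 * np.pi * 220 * t) + 0.3 * rng.standard_normal(t.size)
        cleaned = modulation_lowpass(noisy, fs)
        print(noisy.shape, cleaned.shape)

Reusing the noisy phase is a common simplification; the OLA inversion itself is exact when the spectrogram is left unmodified.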
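On the evaluation protocol shared by the two speaker-identification papers above: a minimal sketch of text-independent, closed-set identification with one Gaussian mixture model per enrolled speaker over frame-level cepstral features. MFCCs stand in for the conventional baseline; the papers' STMF-derived ACC features are not reproduced, and the toy signals merely replace TIMIT/GRID for illustration.

    # Minimal sketch: one GMM per enrolled speaker; the test utterance is
    # assigned to the model with the highest average log-likelihood.
    import numpy as np
    import librosa
    from sklearn.mixture import GaussianMixture

    def mfcc_frames(y, sr, n_mfcc=13):
        """Frame-level MFCCs, shape (n_frames, n_mfcc)."""
        return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T

    def enroll(speaker_waves, sr, n_components=4):
        """Fit one diagonal-covariance GMM per speaker id."""
        models = {}
        for spk, waves in speaker_waves.items():
            feats = np.vstack([mfcc_frames(y, sr) for y in waves])
            models[spk] = GaussianMixture(n_components=n_components,
                                          covariance_type="diag",
                                          random_state=0).fit(feats)
        return models

    def identify(y, sr, models):
        """Return the enrolled speaker whose GMM scores the test features highest."""
        feats = mfcc_frames(y, sr)
        return max(models, key=lambda spk: models[spk].score(feats))

    if __name__ == "__main__":
        sr = 16000
        rng = np.random.default_rng(0)

        def make(f):                      # toy "utterance": tone plus noise
            t = np.arange(2 * sr) / sr
            return np.sin(2 * np.pi * f * t) + 0.1 * rng.standard_normal(t.size)

        enrolled = {"spk_a": [make(120), make(130)],
                    "spk_b": [make(220), make(230)]}
        models = enroll(enrolled, sr)
        print(identify(make(125), sr, models))   # expected: spk_a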