ABSTRACT: Our earlier work on speech synthesis has shown that syllables can produce reasonably natural quality speech. Nevertheless, audible artifacts are present due to discontinuities in pitch, energy, and formant trajectories at the joining point of the units. In this paper, we present some minimal signal modification techniques for reducing these artifacts.
Spoken Language Technology Workshop, 2008. SLT 2008. IEEE; 01/2009
ABSTRACT: In this work we describe a new "syllable-like" speech unit that is suitable for concatenative speech synthesis. These units are automatically generated using a group delay based segmentation algorithm and acoustically correspond to the form C VC (C: consonant, V: vowel). The effectiveness of the unit is demonstrated by synthesizing natural-sounding speech in Tamil, a regional Indian language. Significant quality improvement is obtained if bisyllable units are also used, rather than just monosyllables, with results far superior to the traditional diphone-based approach. An important advantage of this approach is the elimination of prosody rules. Since f0 is part of the target cost, the unit selection procedure chooses the best unit from among the many candidates. The naturalness of the synthesized speech demonstrates the effectiveness of this approach.