Copyright © 2007 The Institute of Electronics, Information and Communication Engineers
Regular Section -- Papers -- Speech and Hearing
A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis
1 The author is with the Graduate School of Information Science, Nara Institute of Science and Technology, Ikoma-shi, 6300192 Japan. E-mail: tomoki{at}is.naist.jp
2 The author is with the Graduate School of Engineering, Nagoya Institute of Technology, Nagoya-shi, 4668555 Japan. E-mail: tokuda{at}ics.nitech.ac.jp
Abstract
This paper describes a novel parameter generation algorithm for HMM-based speech synthesis. The conventional algorithm generates a trajectory of static features that maximizes the likelihood of a given HMM for the parameter sequence consisting of static and dynamic features, under an explicit constraint between the two. Because of this statistical processing, the generated trajectory is often excessively smoothed, and over-smoothed speech parameters usually make the synthetic speech sound muffled. To alleviate the over-smoothing effect, we propose a generation algorithm that considers not only the HMM likelihood maximized in the conventional algorithm but also a likelihood for the global variance (GV) of the generated trajectory. The latter likelihood acts as a penalty on over-smoothing, i.e., on a reduction of the GV of the generated trajectory. A perceptual evaluation demonstrates that the proposed algorithm considerably improves the naturalness of synthetic speech.
Key Words: HMM-based speech synthesis, speech parameter generation, maximum likelihood criterion, over-smoothing effect, global variance
Manuscript received July 11, 2006. Manuscript revised December 11, 2006.
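
The GV penalty described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy sinusoidal trajectory, the moving-average smoother standing in for over-smoothing, the single Gaussian over the GV, and all names and parameter values (`global_variance`, `gv_mean`, `gv_var`) are assumptions made for illustration. The HMM likelihood term is omitted entirely; only the GV term is shown.

```python
import math


def global_variance(traj):
    """Variance of a 1-D static-feature trajectory over the whole utterance."""
    mean = sum(traj) / len(traj)
    return sum((c - mean) ** 2 for c in traj) / len(traj)


def gaussian_log_pdf(x, mean, var):
    """Log density of a univariate Gaussian, used here as a toy GV model."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def moving_average(traj, width=5):
    """Crude stand-in for over-smoothing (windows are truncated at the edges)."""
    half = width // 2
    out = []
    for t in range(len(traj)):
        window = traj[max(0, t - half): t + half + 1]
        out.append(sum(window) / len(window))
    return out


# Toy "natural" trajectory and an over-smoothed version of it.
natural = [math.sin(0.6 * t) for t in range(50)]
smoothed = moving_average(natural)

gv_natural = global_variance(natural)
gv_smoothed = global_variance(smoothed)

# Hypothetical GV model: a Gaussian centred on the natural GV.
gv_mean, gv_var = gv_natural, 0.01
penalty_natural = gaussian_log_pdf(gv_natural, gv_mean, gv_var)
penalty_smoothed = gaussian_log_pdf(gv_smoothed, gv_mean, gv_var)

print(f"GV natural:  {gv_natural:.3f}, GV log-likelihood: {penalty_natural:.2f}")
print(f"GV smoothed: {gv_smoothed:.3f}, GV log-likelihood: {penalty_smoothed:.2f}")
```

In this toy setting the smoothed trajectory has a smaller GV and therefore a lower GV log-likelihood, which is the sense in which the GV term penalizes over-smoothing when it is combined with the HMM likelihood.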