Copyright © 2007 The Institute of Electronics, Information and Communication Engineers
Regular Section -- Letters -- Speech and Hearing
State Duration Modeling for HMM-Based Speech Synthesis
1 The authors are with the Department of Computer Science and Engineering, Nagoya Institute of Technology, Nagoya-shi, 4668555 Japan. E-mail: zen{at}ics.nitech.ac.jp, tokuda{at}ics.nitech.ac.jp, kitamura{at}nitech.ac.jp
2 The authors are with the Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology, Yokohama-shi, 2268502 Japan. E-mail: takao.kobayashi{at}ip.titech.ac.jp
3 Presently, with the Corporate Research & Development Center, Toshiba Corporation.
4 Presently, with Toyota Central R&D Labs., Inc.
Abstract
This paper describes the explicit modeling of the state duration probability density function (PDF) in hidden Markov model (HMM)-based speech synthesis. We redefine, in a statistically correct manner, the probability of staying in a state for a given time interval, which is used to obtain the state duration PDF, and demonstrate improvements in the duration of synthesized speech.
Key Words: duration modeling, speech synthesis, hidden Markov model
Manuscript received July 27, 2006.