Copyright © 2007 The Institute of Electronics, Information and Communication Engineers
Regular Section -- Papers -- Speech and Hearing
Word Error Rate Minimization Using an Integrated Confidence Measure
The authors are with NHK Science and Technical Research Laboratories, Tokyo, 157-8510 Japan. E-mail: kobayashi.a-fs{at}nhk.or.jp
Abstract
This paper describes a new criterion for speech recognition that uses an integrated confidence measure to minimize the word error rate (WER). Conventional criteria for WER minimization obtain the expected WER of a sentence hypothesis merely by comparing it with the other hypotheses in an n-best list. The proposed criterion instead estimates the expected WER by using an integrated confidence measure together with word posterior probabilities for a given acoustic input. The integrated confidence measure, implemented as a classifier based on maximum entropy (ME) modeling or support vector machines (SVMs), yields probabilities reflecting whether each word hypothesis is correct. The classifier combines a variety of confidence measures and can take a temporal sequence of them into account to obtain a more reliable confidence estimate. In transcribing Japanese broadcast news under various conditions, including noisy field environments and spontaneous speech, the proposed criterion achieved a WER of 9.8%, a 3.9% reduction relative to conventional n-best rescoring methods.
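To make the contrast in the abstract concrete, the following is a minimal sketch of the conventional approach: each hypothesis in an n-best list is scored by its posterior-weighted edit distance to the other hypotheses, and the one with the lowest expected error count is selected. All function names, the toy n-best list, and the scores are illustrative assumptions, not taken from the paper. The last function, `expected_errors_from_confidence`, is only one simple, assumed way to turn per-word correctness probabilities (such as an ME or SVM classifier might output) into an expected error count; it is not the paper's formulation.

```python
from math import exp, log


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1]


def nbest_posteriors(log_scores):
    """Normalize log scores into posteriors over the n-best list."""
    m = max(log_scores)
    weights = [exp(s - m) for s in log_scores]
    total = sum(weights)
    return [w / total for w in weights]


def min_expected_error_nbest(hyps, log_scores):
    """Conventional n-best rescoring: pick the hypothesis whose expected
    word error count, measured against the other hypotheses, is lowest."""
    post = nbest_posteriors(log_scores)
    expected = [
        sum(p * edit_distance(other, h) for other, p in zip(hyps, post))
        for h in hyps
    ]
    best = min(range(len(hyps)), key=expected.__getitem__)
    return best, expected


def expected_errors_from_confidence(word_confidences):
    """Assumed illustration only: if each word has an estimated probability
    of being correct, the expected number of wrong words is sum(1 - p)."""
    return sum(1.0 - c for c in word_confidences)


# Toy n-best list (hypothetical data).
hyps = [["a", "b", "c"], ["a", "b", "d"], ["a", "x", "d"]]
log_scores = [log(0.40), log(0.35), log(0.25)]
best, expected = min_expected_error_nbest(hyps, log_scores)
```

In this toy list the second hypothesis (index 1) is selected, with an expected error count of 0.65, even though the first hypothesis has the highest posterior. This shows how a minimum-expected-error criterion can differ from a maximum-posterior choice.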
Key Words: word error rate minimization, maximum entropy, support vector machines, n-best rescoring
Manuscript received June 30, 2006. Manuscript revised October 23, 2006.