IEICE Transactions on Information and Systems 2008 E91-D(3):478-487; doi:10.1093/ietisy/e91-d.3.478
Copyright © 2008 The Institute of Electronics, Information and Communication Engineers

Special Section on Robust Speech Processing in Realistic Environments -- Papers -- Feature Extraction

Linear Discriminant Analysis Using a Generalized Mean of Class Covariances and Its Application to Speech Recognition

Makoto SAKAI (1), Norihide KITAOKA (2) and Seiichi NAKAGAWA (3)

(1) The author is with DENSO CORPORATION, Nisshin-shi, 470–0111 Japan. E-mail: msakai{at}rlab.denso.co.jp
(2) The author is with Nagoya University, Nagoya-shi, 464–8603 Japan. E-mail: kitaoka{at}nagoya-u.jp
(3) The author is with Toyohashi University of Technology, Toyohashi-shi, 441–8580 Japan. E-mail: nakagawa{at}slp.ics.tut.ac.jp

Precisely modeling the time dependency of features is an important issue in speech recognition. Segmental unit input HMMs combined with a dimensionality reduction method have been widely used to address it. Linear discriminant analysis (LDA) and its heteroscedastic extensions, e.g., heteroscedastic linear discriminant analysis (HLDA) and heteroscedastic discriminant analysis (HDA), are popular approaches to reducing dimensionality. However, it is difficult to find a single criterion that reduces dimensionality while preserving discriminative information for every kind of data set. In this paper, we propose a new framework, which we call power linear discriminant analysis (PLDA). With one control parameter, PLDA can describe various criteria, including LDA, HLDA, and HDA. In addition, we provide an efficient selection method for the control parameter that requires neither training HMMs nor testing recognition performance on a development data set. Experimental results show that PLDA is more effective than conventional methods on various data sets.
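
As a rough illustration of the idea named in the title, the sketch below computes a weighted power (generalized) mean of class covariance matrices and uses it as the denominator of an LDA-style determinant ratio. The abstract does not give the PLDA criterion, so the exact form used here, the function names (`spd_power`, `power_mean_cov`, `criterion`) and the exponent symbol `m` are all assumptions made for illustration, not the authors' definition. Under this assumed form, `m = 1` reduces the mean to the weighted arithmetic mean of class covariances, which is the within-class covariance of standard LDA.

```python
import numpy as np


def spd_power(mat, p):
    """Raise a symmetric positive-definite matrix to a real power p
    via its eigendecomposition: V diag(w**p) V^T."""
    w, v = np.linalg.eigh(mat)
    return (v * w**p) @ v.T


def power_mean_cov(covs, weights, m):
    """Weighted power mean of SPD matrices with exponent m (m != 0).
    Assumed form: (sum_k weights[k] * covs[k]**m) ** (1/m)."""
    acc = sum(w * spd_power(c, m) for w, c in zip(weights, covs))
    return spd_power(acc, 1.0 / m)


def criterion(B, between, covs, weights, m):
    """Assumed LDA-style objective for a projection matrix B:
    det(B^T S_b B) divided by the determinant of the power mean
    of the projected class covariances B^T S_k B."""
    projected = [B.T @ c @ B for c in covs]
    within = power_mean_cov(projected, weights, m)
    return np.linalg.det(B.T @ between @ B) / np.linalg.det(within)


# Hypothetical two-class, two-dimensional example (made-up numbers).
covs = [np.array([[2.0, 0.5], [0.5, 1.0]]),
        np.array([[1.0, 0.2], [0.2, 3.0]])]
weights = [0.4, 0.6]
between = np.array([[1.5, 0.3], [0.3, 0.8]])
B = np.array([[1.0], [0.5]])  # project 2-D features onto one direction

for m in (1.0, 0.5, -1.0):
    print(m, criterion(B, between, covs, weights, m))
```

The case `m = 0` is deliberately excluded, since the power mean is then only defined as a limit and would need separate handling.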

Key Words: speech recognition, feature extraction, multidimensional signal processing


Manuscript received July 2, 2007. Manuscript revised September 14, 2007.

