Title: An Empirical Study of Word Error Minimization Approaches for Mandarin Large Vocabulary Continuous Speech Recognition
Journal: International Journal of Computational Linguistics & Chinese Language Processing
Authors: Kuo, Jen-wei; Liu, Shih-hung; Wang, Hsin-min; Chen, Berlin
Publication date: 2006
Volume/Issue: 11(3)
Pages: 201-222
Keywords: Broadcast news; Continuous speech recognition; Discriminative training; Minimum phone error; Word error minimization
This paper presents an empirical study of word error minimization approaches for Mandarin large vocabulary continuous speech recognition (LVCSR). First, the minimum phone error (MPE) criterion, one of the most popular discriminative training criteria, is extensively investigated for both acoustic model training and adaptation in a Mandarin LVCSR system. Second, the word error minimization (WEM) criterion, used to rescore N-best word strings, is appropriately modified for a Mandarin LVCSR system. Finally, a series of speech recognition experiments is conducted on the MATBN Mandarin Chinese broadcast news corpus. The experimental results demonstrate that MPE training reduces the character error rate (CER) by 12% for a system initially trained with the maximum likelihood (ML) approach. Meanwhile, for unsupervised acoustic model adaptation, MPE-based linear regression (MPELR) adaptation outperforms conventional maximum likelihood linear regression (MLLR) in terms of CER reduction. When the WEM decoding approach is used for N-best rescoring, a slight performance gain over the conventional maximum a posteriori (MAP) decoding method is also observed.
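As a rough illustration of the difference between MAP decoding and WEM rescoring of an N-best list, the sketch below picks the hypothesis with the lowest expected Levenshtein distance to the other hypotheses, weighted by their posteriors. It is not the paper's implementation: the function names, the `scale` parameter, the toy hypotheses, and the choice of characters as the error unit are all assumptions made for this example.

```python
from math import exp, log


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (here: character strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def posteriors(log_scores, scale=1.0):
    """Normalize (scaled) log scores over the N-best list into posteriors."""
    top = max(log_scores)
    weights = [exp(scale * (s - top)) for s in log_scores]
    total = sum(weights)
    return [w / total for w in weights]


def map_decode(nbest, log_scores):
    """MAP decoding: return the single highest-scoring hypothesis."""
    best = max(range(len(nbest)), key=lambda i: log_scores[i])
    return nbest[best]


def wem_decode(nbest, log_scores, scale=1.0):
    """WEM rescoring: return the hypothesis with minimum expected error."""
    post = posteriors(log_scores, scale)

    def expected_errors(i):
        return sum(p * edit_distance(other, nbest[i])
                   for other, p in zip(nbest, post))

    return nbest[min(range(len(nbest)), key=expected_errors)]


def cer(ref, hyp):
    """Character error rate of hyp against a non-empty reference."""
    return edit_distance(ref, hyp) / len(ref)


# Toy N-best list (hypothetical): posteriors 0.4, 0.3, 0.3.
nbest = ["abcd", "abxd", "abxe"]
log_scores = [log(0.4), log(0.3), log(0.3)]
map_choice = map_decode(nbest, log_scores)  # "abcd", the top-scoring string
wem_choice = wem_decode(nbest, log_scores)  # "abxd", closest to the others
```

In this toy list the two lower-ranked hypotheses agree on the third character, so the hypothesis that shares it has a lower expected error (0.7) than the top-scoring one (0.9). That is the kind of case in which WEM rescoring and MAP decoding disagree.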
Journal articles
1. Leggetter, C. J.; Woodland, P. C. (1995). Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models. Computer Speech and Language, 9(2), 171-185.
2. Wang, Hsin-min; Chen, Berlin; Kuo, Jen-wei; Cheng, Shih-sian (2005-06). MATBN: A Mandarin Chinese Broadcast News Corpus. International Journal of Computational Linguistics & Chinese Language Processing, 10(2), 219-235.
3. Levenshtein, V. I. (1966). Binary codes capable of correcting deletions, insertions and reversals. Soviet Physics Doklady, 10(8), 707-710.
4. Gales, M. J. F.; Woodland, P. C. (1996). Mean and Variance Adaptation within the MLLR Framework. Computer Speech and Language, 10, 249-264.
5. Ortmanns, S.; Ney, H.; Aubert, X. L. (1997). A Word Graph Algorithm for Large Vocabulary Continuous Speech Recognition. Computer Speech and Language, 11, 43-72.
6. Chen, B.; Kuo, J. W.; Tsai, W. H. (2005). Lightly Supervised and Data-Driven Approaches to Mandarin Broadcast News Transcription. International Journal of Computational Linguistics & Chinese Language Processing, 10(1), 1-18.
7. Goel, Vinod; Byrne, W. (2000). Minimum Bayes-Risk Automatic Speech Recognition. Computer Speech and Language, 14, 115-135.
8. Gopalakrishnan, P. S.; Kanevsky, D.; Nádas, A.; Nahamoo, D. (1991). An Inequality for Rational Functions with Applications to Some Statistical Estimation Problems. IEEE Trans. Information Theory, 37, 107-133.
9. Kaiser, J.; Horvat, B.; Kacic, Z. (2000). Overall Risk Criterion Estimation of Hidden Markov Model Parameters. Speech Communication, 38, 383-398.
10. Mangu, L.; Brill, E.; Stolcke, A. (2000). Finding consensus in speech recognition: word error minimization and other applications of confusion networks. Computer Speech and Language, 14, 373-400.
11. Povey, D.; Woodland, P. C. (2002). Large Scale Discriminative Training of Acoustic Models for Speech Recognition. Computer Speech and Language, 16, 25-47.
Conference papers
1. Chien, J. T.; Huang, C. H.; Shinoda, K.; Furui, S. (2006). Towards Optimal Bayes Decision for Speech Recognition. Toulouse, France.
2. Kuo, J. W.; Chen, B. (2005). Minimum Word Error Based Discriminative Training of Language Models. Lisbon. 1277-1280.
Theses and dissertations
1. Povey, D. (2004). Discriminative Training for Large Vocabulary Speech Recognition. Peterhouse.
2. Kuo, Jen-wei (郭人瑋) (2005). An Initial Study on Minimum Phone Error Discriminative Learning of Acoustic Models for Mandarin Large Vocabulary Continuous Speech Recognition. Taipei.
3. Normandin, Y. (1991). Hidden Markov Models, Maximum Mutual Information Estimation, and the Speech Recognition Problem. Montreal.
Books
1. Duda, R. O.; Hart, P. E.; Stork, D. G. (2000). Pattern Classification. New York.
2. Zheng, J.; Stolcke, A. (2005). Improved Discriminative Training Using Phone Lattices. Proc. INTERSPEECH'05. Lisbon.
Other
1. Doumpiotis, V.; Tsakalidis, S.; Byrne, W. (2003). Discriminative Training for Segmental Minimum Bayes Risk Decoding. Hong Kong.
2. Doumpiotis, V.; Tsakalidis, S.; Byrne, W. (2003). Lattice Segmentation and Minimum Bayes Risk Discriminative Training.
3. Doumpiotis, V.; Byrne, W. (2004). Pinched Lattice Minimum Bayes Risk Discriminative Training for Large Vocabulary Continuous Speech Recognition. Jeju Island, Korea.
4. Kaiser, J.; Horvat, B.; Kacic, Z. (2000). A Novel Loss Function for the Overall Risk Criterion Based Discriminative Training of HMM Models.
5. Na, K.; Jeon, B.; Chang, D.; Chae, S.; Ann, S. (1995). Discriminative Training of Hidden Markov Models using Overall Risk Criterion and Reduced Gradient Method. Madrid.
6. Povey, D.; Woodland, P. C. (2002). Minimum Phone Error and I-smoothing for Improved Discriminative Training.
7. Povey, D.; Kingsbury, B.; Mangu, L.; Saon, G.; Soltau, H.; Zweig, G. (2005). FMPE: Discriminatively Trained Features for Speech Recognition. Philadelphia, USA.
8. Schlüter, R.; Scharrenbach, T.; Steinbiss, V.; Ney, H. (2005). Bayes Risk Minimization using Metric Loss Functions.
9. Schwartz, R.; Chow, Y. L. (1990). The N-best algorithms: an efficient and exact procedure for finding the N most likely sentence hypotheses.
10. Stolcke, A.; Konig, Y.; Weintraub, M. (1997). Explicit Word Error Minimization in N-best List Rescoring.
11. Stolcke, A. (2000). SRI Language Modeling Toolkit.
12. Wang, L.; Woodland, P. C. (2004). MPE-Based Discriminative Linear Transform for Speaker Adaptation. Montreal, Canada.
 
 
 
 