Improved Minimum Phone Error Based Discriminative Training of Acoustic Models for Mandarin Large Vocabulary Continuous Speech Recognition

:::


Title:	Improved Minimum Phone Error Based Discriminative Training of Acoustic Models for Mandarin Large Vocabulary Continuous Speech Recognition
Journal:	International Journal of Computational Linguistics & Chinese Language Processing
Authors:	Liu, Shih-hung; Chu, Fang-hui; Lo, Yueng-tien; Chen, Berlin
Publication date:	2008
Volume/issue:	13:3
Pages:	343-361
Keywords:	Discriminative training; Minimum phone error; Phone accuracy function; Training data selection; Large vocabulary continuous speech recognition

This paper considers minimum phone error (MPE) based discriminative training of acoustic models for Mandarin broadcast news recognition. We present a new phone accuracy function that is based on the frame-level accuracy of hypothesized phone arcs and replaces the raw phone accuracy function of standard MPE training. We also explore a novel data selection approach based on the frame-level normalized entropy of Gaussian posterior probabilities obtained from the word lattice of each training utterance. Its merit is that it makes the training algorithm concentrate on the statistics of those frame samples that lie close to the decision boundary, which improves discrimination. The underlying characteristics of the presented approaches are extensively investigated, and their performance is verified by comparison with the standard MPE training approach as well as other related work. Experiments conducted on broadcast news collected in Taiwan demonstrate that integrating the frame-level phone accuracy calculation with data selection yields slight but consistent improvements over the baseline system.
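
The entropy-based data selection described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact formula, so everything here is an assumption made for illustration rather than the authors' implementation: the entropy is normalized by log N (N being the number of Gaussians with a posterior at that frame), frames are kept when their normalized entropy reaches a threshold, and the names `normalized_entropy`, `select_frames`, and the default threshold of 0.5 are hypothetical.

```python
import math


def normalized_entropy(posteriors):
    """Entropy of one frame's posterior distribution, scaled to [0, 1].

    Assumption: normalization divides by log(N), where N is the number
    of Gaussians that carry a posterior at this frame.
    """
    n = len(posteriors)
    total = sum(posteriors)
    if n < 2 or total <= 0.0:
        # A single candidate (or no probability mass) carries no uncertainty.
        return 0.0
    entropy = 0.0
    for p in posteriors:
        q = p / total  # renormalize so the values sum to one
        if q > 0.0:
            entropy -= q * math.log(q)
    return entropy / math.log(n)


def select_frames(frame_posteriors, threshold=0.5):
    """Return indices of frames whose normalized entropy is >= threshold.

    High entropy means the posterior mass is spread over competing
    Gaussians, i.e. the frame lies close to a decision boundary.
    The threshold value is a hypothetical choice.
    """
    return [
        t
        for t, posteriors in enumerate(frame_posteriors)
        if normalized_entropy(posteriors) >= threshold
    ]
```

With this sketch, a frame whose posteriors are `[0.5, 0.5]` has normalized entropy 1 and is kept, whereas a frame with posteriors `[0.99, 0.01]` is confidently classified and is skipped when accumulating training statistics.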


Journal articles
1.	Rabiner, Lawrence R.(1989)。A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition。Proceedings of the IEEE，77(2)，257-286。
2.	Wang, Hsin-min、Chen, Berlin、Kuo, Jen-wei、Cheng, Shih-sian(20050600)。MATBN: A Mandarin Chinese Broadcast News Corpus。International Journal of Computational Linguistics & Chinese Language Processing，10(2)，219-235。
3.	Aubert, X. L.(2002)。An Overview of Decoding Techniques for Large Vocabulary Continuous Speech Recognition。Computer Speech and Language，16，89-114。
4.	Ortmanns, S.、Ney, H.、Aubert, X. L.(1997)。A Word Graph Algorithm for Large Vocabulary Continuous Speech Recognition。Computer Speech and Language，11，43-72。
5.	Saon, G.、Padmanabhan, M.(2001)。Data-Driven Approach to Designing Compound Words for Continuous Speech Recognition。IEEE transactions on speech and audio processing，9(4)，327-332。
6.	Chen, B.、Kuo, J. W.、Tsai, W. H.(2005)。Lightly Supervised and Data-Driven Approaches to Mandarin Broadcast News Transcription。International Journal of Computational Linguistics & Chinese Language Processing，10(1)，1-18。
7.	Gopalakrishnan, P.S.、Kanevsky, D.、Nahamoo, D.、Nádas, A.(1989)。A Generalization of the Baum Algorithm to Rational Objective Functions。Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing，631-634。
8.	Jiang, H.、Li, X.、Liu, C.(2006)。Large Margin Hidden Markov Models for Speech Recognition。IEEE Transactions on Audio, Speech, and Language Processing，14(5)，1584-1595。
9.	Kuo, J. W.、Liu, S. H.、Wang, H. M.、Chen, B.(2006)。An Empirical Study of Word Error Minimization Approaches for Mandarin Large Vocabulary Speech Recognition。International Journal of Computational Linguistics & Chinese Language Processing，11(3)，201-222。

Conference papers
1.	Gillick, L.、Cox, S. J.(1989)。Some statistical issues in the comparison of speech recognition algorithms。Glasgow。532-535。
2.	Du, J.、Liu, P.、Soong, F. K.、Zhou, J. L.、Wang, R. H.(2006)。Minimum Divergence Based Discriminative Training。Pittsburgh。2410-2413。
3.	Goldberger, J.(2003)。An Efficient Image Similarity Measure Based on Approximations of KL-Divergence between Two Gaussian Mixtures。France。370-377。
4.	Gibson, Matthew、Hain, T.(2006)。Hypothesis Spaces for Minimum Bayes Risk Training in Large Vocabulary Speech Recognition。Pittsburgh。2406-2409。
5.	Heigold, G.、Macherey, W.、Schluter, R.、Ney, H.(2005)。Minimum Exact Word Error Training。Cancun。186-190。
6.	Li, J.、Lee, C. H.、Yuan, M.(2006)。Soft Margin Estimation of Hidden Markov Model Parameters。Pittsburgh。2422-2425。
7.	Liu, S. H.、Chu, F. H.、Chen, B.(2007)。Improved MPE Based Discriminative Training of Acoustic Models for Mandarin Large Vocabulary Continuous Speech Recognition。
8.	Liu, S. H.、Chu, F. H.、Lin, S.-H.、Chen, B.(2007)。Investigating Data Selection for Minimum Phone Error Training of Acoustic Models。Beijing。348-351。
9.	Misra, H.、Bourlard, H.(2005)。Spectral Entropy Feature in Full-Combination Multi-Stream for Robust ASR。Lisbon。2633-2636。
10.	Zheng, J.、Stolcke, A.。Improved Discriminative Training using Phone Lattices。Lisbon。

Theses
1.	Kumar, Nanda(1997)。Investigation of Silicon-Auditory Models and Generalization of Linear Discriminant Analysis for Improved Speech Recognition，Maryland。
2.	Povey, D.(2007)。Discriminative Training for Large Vocabulary Speech Recognition，Peterhouse。

Books
1.	Bahl, L. R.、Brown, Polly、De Souza, P. V.、Mercer, R. L.(1986)。Maximum Mutual Information Estimation of Hidden Markov Model Parameters for Speech Recognition。Proc. IEEE ICASSP - 86。Tokyo。
2.	Stolcke, A.(2000)。SRI Language Modeling Toolkit, version 1.3.3。
3.	Povey, D.、Kingsbury, B.(2007)。Evaluation of Proposed Modifications to MPE for Large Scale Discriminative Training。Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing。Hawaii。
4.	Povey, D.、Woodland, P. C.(2002)。Minimum Phone Error and I-smoothing for Improved Discriminative Training。Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing。Florida。
5.	Saon, G.、Padmanabhan, M.、Gopinath, R.、Chen, Sylvia(2000)。Maximum likelihood discriminant feature spaces。Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing。Istanbul。
6.	Wessel, F.、Schluter, R.、Ney, H.(2001)。Explicit Word Error Minimization Using Word Hypothesis Posterior Probabilities。Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing。Salt Lake City。



:::
