Title: Lightly Supervised and Data-Driven Approaches to Mandarin Broadcast News Transcription
Journal: International Journal of Computational Linguistics & Chinese Language Processing
Authors: Chen, Berlin; Kuo, Jen-wei; Tsai, Wen-hung
Publication date: 2005
Volume/issue: 10:1
Pages: 1-17
Keywords: Acoustic look-ahead; Lightly supervised acoustic model training; Language model adaptation; Mandarin broadcast news
This article investigates the use of several lightly supervised and data-driven approaches to Mandarin broadcast news transcription. With the special structural properties of the Chinese language taken into consideration, a fast acoustic look-ahead technique for estimating the unexplored part of a speech utterance is integrated into lexical tree search to improve search efficiency. This technique is used in conjunction with the conventional language model look-ahead technique. Then, a verification-based method for automatic acoustic training data acquisition is proposed to make use of large amounts of untranscribed speech data. Finally, two alternative strategies for language model adaptation are studied with the goal of achieving accurate language model estimation. With the above approaches, the overall system was found in experiments to yield an 11.88% character error rate when applied to Mandarin broadcast news collected in Taiwan.
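The result above is reported as a character error rate (CER). The abstract does not describe the scoring procedure, so the following is only a minimal sketch of the conventional definition: the character-level edit distance between reference and hypothesis, divided by the number of reference characters. The function names `edit_distance` and `character_error_rate`, the whitespace handling, and the example strings are illustrative assumptions, not taken from the article.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions,
    insertions, and deletions each cost 1)."""
    prev = list(range(len(hyp) + 1))  # distances for the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference symbols to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """CER = edit distance / number of reference characters.
    Assumption: whitespace is ignored, since Chinese text is usually
    scored per character rather than per space-delimited word."""
    ref = [c for c in reference if not c.isspace()]
    hyp = [c for c in hypothesis if not c.isspace()]
    if not ref:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Hypothetical example: one substituted character out of six.
    print(character_error_rate("今天天氣很好", "今天天汽很好"))
```

Under this definition, an 11.88% CER corresponds to roughly 12 character errors per 100 reference characters. Note that CER can exceed 100% when the hypothesis contains many insertions.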
 
 
 
 