Title: Modeling Pronunciation Variation for Bi-Lingual Mandarin/Taiwanese Speech Recognition
Journal: International Journal of Computational Linguistics & Chinese Language Processing
Authors: Lyu, Dau-cheng; Lyu, Ren-yuan; Chiang, Yuang-chin; Hsu, Chun-nan
Publication date: 2005
Volume/issue: 10:3
Pages: 363-380
Keywords: Bi-lingual; One-pass ASR; Pronunciation modeling
Citation statistics:
  • Times cited: journals (0), doctoral dissertations (0), books (0), book chapters (0)
  • Excluding self-citations: 0
  • Co-citations: 3
  • Views: 22
This paper describes a bi-lingual large-vocabulary speech recognition experiment based on modeling pronunciation variation. The two languages under study are Mandarin Chinese and Taiwanese (Min-nan). Although the two languages are largely mutually unintelligible, they share many words that are written with the same Chinese characters and carry the same meanings but are pronounced differently. From observation of the bi-lingual corpus, we identified five types of pronunciation variation for Chinese characters. We developed a one-pass, three-layer recognizer that combines bi-lingual acoustic models, an integrated pronunciation model, and a tree-structured search network. The recognizer's performance was evaluated under three different pronunciation models. The character error rate with the integrated pronunciation model was lower than that with a pronunciation model built using either the knowledge-based or the data-driven approach. The relative frequency ratio was also used as a measure for choosing the number of pronunciation variations for each Chinese character. The best character error rates on the Mandarin and Taiwanese test sets were 16.2% and 15.0%, respectively, obtained when the average number of pronunciations per Chinese character was 3.9.
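The abstract does not define the relative frequency ratio, so the following is only a minimal sketch under one assumed reading: a pronunciation variant of a character is kept when its corpus count, divided by the count of that character's most frequent variant, reaches a threshold. The function names, the threshold value, and the placeholder counts are all hypothetical and are not taken from the paper.

```python
def select_pronunciations(counts, min_ratio):
    """Keep the variants whose count is at least min_ratio times the
    count of the most frequent variant (assumed definition of the ratio)."""
    if not counts:
        return []
    top = max(counts.values())
    kept = [(p, c) for p, c in counts.items() if c / top >= min_ratio]
    # Most frequent first; ties broken alphabetically for a stable order.
    kept.sort(key=lambda pc: (-pc[1], pc[0]))
    return [p for p, _ in kept]


def average_variants(lexicon, min_ratio):
    """Average number of retained pronunciations per character."""
    sizes = [len(select_pronunciations(c, min_ratio)) for c in lexicon.values()]
    return sum(sizes) / len(sizes) if sizes else 0.0


# Hypothetical placeholder data: character id -> {pronunciation label: count}.
lexicon = {
    "C1": {"m_a": 50, "t_a": 30, "t_b": 5},
    "C2": {"m_c": 40, "t_c": 40},
}
selected = {ch: select_pronunciations(c, 0.5) for ch, c in lexicon.items()}
```

Raising the threshold shrinks the lexicon and lowers the average number of pronunciations per character, and lowering it does the opposite. Under this reading, that is the quantity the abstract reports as 3.9 at the best operating point.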
Journal articles
1. Lyu, Ren-yuan; Liang, Min-siong; Chiang, Yuang-chin (Aug. 2004). Toward Constructing a Multilingual Speech Corpus for Taiwanese (Min-nan), Hakka, and Mandarin. International Journal of Computational Linguistics & Chinese Language Processing, 9(2), 1-12.
2. Bacchiani, M.; Ostendorf, Mari (1999). Joint lexicon, acoustic unit inventory and model design. International Journal of Speech Communication, 29(2/4), 99-114.
3. Holter, T.; Svendsen, T. (1999). Maximum likelihood modelling of pronunciation variation. International Journal of Speech Communication, 29, 177-191.
4. Kessens, J. M.; Cucchiarini, C.; Strik, H. (2003). A data-driven method for modeling pronunciation variation. International Journal of Speech Communication, 40, 517-534.
5. Kessens, J. M.; Wester, M.; Strik, H. (1999). Improving the Performance of a Dutch CSR by Modeling Within-word and Cross-word Pronunciation Variation. International Journal of Speech Communication, 29(2/4), 193-207.
6. Lee, T.; Lau, W.; Wong, Y. W.; Ching, P. C. (2002). Using Tone Information in Cantonese Continuous Speech Recognition. ACM Transactions on Asian Language Information Processing, 1, 83-102.
7. Liu, Y.; Fung, P. (2003). Modeling partial pronunciation variations for spontaneous Mandarin speech recognition. International Journal of Computer Speech and Language, 17, 357-379.
8. Liu, Y.; Fung, P. (2004). State-Dependent Phonetic Tied Mixtures with Pronunciation Modeling for Spontaneous Speech Recognition. IEEE Trans. Speech and Audio Proc., 12, 351-364.
9. Riley, M.; Byrne, W.; Finke, M.; Khudanpur, S.; Ljolje, A.; McDonough, J.; Nock, H.; Saraçlar, M.; Wooters, Charles; Zavaliagkos, G. (1999). Stochastic pronunciation modelling from hand-labelled phonetic corpora. International Journal of Speech Communication, 29, 209-224.
10. Singh, R.; Raj, B.; Stern, R. (2002). Automatic generation of subword units for speech recognition systems. IEEE Trans. Speech and Audio Proc., 10, 89-99.
11. Strik, H.; Cucchiarini, C. (1999). Modeling Pronunciation Variation for ASR: Overview and Comparison of Method. International Journal of Speech Communication, 29, 225-246.
12. Wester, M. (2003). Pronunciation Modeling for ASR knowledge-based, Data-driven Methods. International Journal of Computer Speech and Language, 88, 69-85.
Other
1. Liang, M. S.; Lyu, R. Y.; Chiang, Y. C. (2003). An efficient algorithm to select phonetically balanced scripts for constructing corpus. Beijing, China.
2. Aubert, X. L. (1999). One pass cross word decoding for large vocabularies based on a lexical tree search organization. Budapest, Hungary.
3. Chao, Y. R. (1979). Tone contour.
4. Cremelie, N.; Martens, J.-P. (1998). In search of pronunciation rules. Rolduc, Kerkrade.
5. Downey, S.; Wiseman, R. (1998). Dynamic and static improvements to lexical baseforms. Rolduc.
6. Finke, M.; Waibel, A. H. (1997). Speaking mode dependent pronunciation modeling in large vocabulary conversational speech recognition. Rhodos, Greece.
7. Fukada, T.; Sagisaka, Y. (1997). Automatic generation of a pronunciation dictionary based on a pronunciation network. Rhodos, Greece.
8. Fukada, T.; Yoshimura, T.; Sagisaka, Y. (1998). Automatic generation of multiple pronunciations based on neural networks and language statistics. Rolduc, Kerkrade.
9. Huang, C.; Chang, E.; Zhou, J. L.; Lee, K. F. (2000). Accent Modeling Based on Pronunciation Dictionary Adaptation for Large Vocabulary Mandarin Speech Recognition. Beijing.
10. Jurafsky, Daniel; Ward, W.; Zhang, J.; Herold, K.; Yu, X.; Zhang, S. (2001). What kind of pronunciation variation is hard for triphones to model? Salt Lake City, Utah.
11. Kam, P.; Lee, T. (2002). Modeling pronunciation variation for Cantonese speech recognition. Colorado, USA.
12. Kam, P.; Lee, T.; Soong, F. K. (2003). Modeling Cantonese pronunciation variation by acoustic model refinement. Geneva, Switzerland.
13. Kessens, J. M.; Strik, H.; Cucchiarini, C. (2002). Modeling pronunciation variation for ASR: Comparing criteria for rule selection. Estes Park, USA.
14. Kipp, A.; Wesenick, M.-B.; Schiel, F. (1996). Automatic detection and segmentation of pronunciation variants in German speech corpora. Philadelphia, USA.
15. Liang, Po-Yu; Shen, J.-L.; Lee, L. S. (1998). Decision Tree Clustering for Acoustic Modeling in Speaker-Independent Mandarin Telephone Speech Recognition. Singapore.
16. Liao, Y. F.; Wang, N.; Huang, M.; Huang, H.; Seide, F. (2000). Improvements of the Philips 2000 Taiwan Mandarin Benchmark System. Beijing.
17. Liu, Y.; Fung, P. (2003). Partial change accent models for accented Mandarin speech recognition. St. Thomas, U.S. Virgin Islands.
18. Lyu, Dau-Cheng; Yang, B. H.; Liang, M. S.; Lyu, R. Y.; Hsu, Chun-Nan (1992). Speaker Independent Acoustic Modeling for Large Vocabulary Bi-lingual Taiwanese/Mandarin Continuous Speech Recognition. Melbourne, Australia.
19. Lyu, Dau-Cheng; Liang, M. S.; Chiang, Y. C.; Hsu, Chun-Nan; Lyu, R. Y. (2003). Large Vocabulary Taiwanese (Min-nan) Speech Recognition Using Tone Features and Statistical Pronunciation Modeling. Geneva, Switzerland.
20. Lyu, R. Y.; Chen, C. Y.; Chiang, Y. C.; Liang, Min-Siong (2000). Bi-lingual Mandarin/Taiwanese (Min-nan), Large Vocabulary, Continuous Speech Recognition System Based on the Yong-yong Phonetic Alphabet. Beijing.
21. Lyu, R. Y.; Lyu, Dau-Cheng; Liang, M. S.; Wang, M. H.; Chiang, Y. C.; Hsu, Chun-Nan (2004). A Unified Framework for Large Vocabulary Speech Recognition of Mutually Unintelligible Chinese "Regionalects". Jeju Island, Korea.
22. Odell, J. J.; Valtchev, V.; Woodland, P. C.; Young, S. J. (1994). A One Pass Decoder Design for Large Vocabulary Recognition.
23. Peters, S. D.; Stubley, Peter (1998). Visualizing speech trajectories. Rolduc, Kerkrade.
24. Polzin, T. S.; Waibel, A. H. (1998). Pronunciation variations in emotional speech. Rolduc, Kerkrade.
25. Soltau, H.; Metze, F.; Fuegen, C.; Waibel, A. H. (2001). A One-pass decoder based on polymorphic linguistic context assignment. Trento, Italy.
26. Strik, H.; Kessens, J. M.; Wester, M. (1998). Modeling Pronunciation Variation for Automatic Speech Recognition. Rolduc, Kerkrade.
27. Torre, D.; Villarrubia, L.; Hernandez, L.; Elvira, J. M. (1997). Automatic Alternative Transcription Generation and Vocabulary Selection for Flexible Word Recognizers. Munich.
28. Wester, M.; Fosler-Lussier, E. (2000). Proceedings of International Conference on Spoken Language Processing. Beijing, China.
29. Wester, M.; Kessens, J. M.; Strik, H. (2000). Pronunciation Variation in ASR: Which Variation to Model? Beijing.
30. Yang, Q.; Martens, J.-P. (2000). Data driven lexical modeling of pronunciation variation in ASR. Beijing.
31. Zeppenfeld, T.; Finke, M.; Ries, K.; Westphal, M.; Waibel, A. H. (1997). Recognition of conversational speech using the JANUS speech engine. Munich.
 
 
 
 