Title: Data Driven Approaches to Phonetic Transcription with Integration of Automatic Speech Recognition and Grapheme-to-Phoneme for Spoken Buddhist Sutra
Journal: International Journal of Computational Linguistics & Chinese Language Processing
Authors: Liang, Min-siong; Lyu, Ren-yuan; Chiang, Yuang-chin
Publication date: 2008
Volume/issue: 13:2
Pages: 233-253
Subject keywords: Automatic phonetic transcription; Phone recognition; Grapheme-to-phoneme; G2P; Pronunciation variation; Chinese text; Taiwanese; Min-Nan; Dialect; Buddhist Sutra
We propose a new approach for performing phonetic transcription of text that utilizes automatic speech recognition (ASR) to help traditional grapheme-to-phoneme (G2P) techniques. This approach was applied to transcribe Chinese text into Taiwanese phonetic symbols. By augmenting the text with speech and using automatic speech recognition with a sausage searching net constructed from multiple pronunciations of text, we are able to reduce the error rate of phonetic transcription. Using a pronunciation lexicon with multiple pronunciations for each item, a transcription error rate of 12.74% was achieved. Further improvement can be achieved by adapting the pronunciation lexicon with pronunciation variation (PV) rules derived manually from corrected transcription in a speech corpus. The PV rules can be categorized into two kinds: knowledge-based and data-driven rules. By incorporating the PV rules, an error rate of 10.56% could be achieved. Although this technique was developed for Taiwanese speech, it could easily be adapted to other Chinese spoken languages or dialects.
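As a rough illustration of the idea in the abstract, the sketch below builds a sausage-style network from a multi-pronunciation lexicon and picks one pronunciation per slot. It is a minimal sketch under stated assumptions, not the authors' implementation: the lexicon entries, the romanizations, the `score` callback (a stand-in for the recognizer's acoustic score), and all function names are hypothetical. Only the two error rates, 12.74% and 10.56%, come from the abstract.

```python
from typing import Callable, Dict, List

# Hypothetical toy lexicon: each Chinese character maps to several candidate
# pronunciations. The romanizations are illustrative only and are not the
# phonetic symbol set used in the paper.
LEXICON: Dict[str, List[str]] = {
    "大": ["tai", "tua"],
    "學": ["hak", "oh"],
}


def build_sausage(text: str, lexicon: Dict[str, List[str]]) -> List[List[str]]:
    """Build one slot per character; each slot lists that character's alternatives.

    Characters missing from the lexicon fall back to a single-alternative slot.
    """
    return [lexicon.get(ch, [ch]) for ch in text]


def decode(sausage: List[List[str]], score: Callable[[int, str], float]) -> List[str]:
    """Pick the highest-scoring alternative in every slot.

    `score(position, pronunciation)` is an assumed stand-in for the acoustic
    score a speech recognizer would assign to that alternative.
    """
    return [max(slot, key=lambda p, i=i: score(i, p)) for i, slot in enumerate(sausage)]


def error_rate(hyp: List[str], ref: List[str]) -> float:
    """Fraction of slots whose chosen pronunciation differs from the reference.

    The sausage keeps hypothesis and reference the same length, so no
    alignment step is needed in this simplified setting.
    """
    if len(hyp) != len(ref):
        raise ValueError("hypothesis and reference must have the same length")
    return sum(h != r for h, r in zip(hyp, ref)) / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Relative error reduction when moving from `before` to `after`."""
    return (before - after) / before


# Invented log-scores for the toy example.
TOY_SCORES = {(0, "tai"): -1.0, (0, "tua"): -2.5, (1, "hak"): -0.8, (1, "oh"): -3.0}

best = decode(build_sausage("大學", LEXICON), lambda i, p: TOY_SCORES[(i, p)])
print(best)  # ['tai', 'hak']
print(round(100 * relative_reduction(12.74, 10.56), 1))  # 17.1
```

Under these assumptions, going from 12.74% to 10.56% corresponds to a relative error reduction of roughly 17.1%. A real system would score whole paths through the network with acoustic models rather than scoring each slot independently.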
Journal articles
1. Lyu, Ren-yuan; Liang, Min-siong; Chiang, Yuang-chin (2004-08). Toward Constructing a Multilingual Speech Corpus for Taiwanese (Min-nan), Hakka, and Mandarin. International Journal of Computational Linguistics & Chinese Language Processing, 9(2), 1-12.
2. Lamel, L. F.; Gauvain, J. L.; Adda, G. (2002). Lightly Supervised and Unsupervised Acoustic Model Training. Computer Speech and Language, 16(1), 115-229.
3. Cremelie, N.; Martens, J.-P. (1999). In Search of Better Pronunciation Models for Speech Recognition. Speech Communication, 29, 115-136.
4. Hain, T. (2005). Implicit modeling of pronunciation variation in automatic speech recognition. Speech Communication, 46, 171-188.
5. Nanjo, H.; Kawahara, T. (2004). Language Model and Speaking Rate Adaptation for Spontaneous Presentation Speech Recognition. IEEE Transactions on Speech and Audio Processing, 12, 391-400.
6. Saraclar, M.; Khudanpur, S. (2004). Pronunciation change in conversational speech and its implications for automatic speech recognition. Computer Speech and Language, 18, 375-395.
Conference papers
1. Liang, M. S.; Yang, J. C.; Chiang, Y. C.; Lyu, R. Y. (2004). A Taiwanese Text-to-Speech System with Applications to Language Learning.
Research reports
1. (2003). U.S. Department of State's Bureau of International Information Programs, IIP report.
Books
1. Cover, Thomas M.; Thomas, Joy A. (1991). Elements of Information Theory. John Wiley & Sons, Inc.
2. Sik, D. G. (2004). The Four Basic Sutra in Taiwanese. HsinChu, Taiwan.
3. Sik, D. G. (2004). Earth Treasure Sutra in Taiwanese. HsinChu, Taiwan.
4. Young, S.; Evermann, G.; Gales, M.; Hain, T.; Kershaw, D.; Liu, X.; Moore, G.; Odell, J.; Ollason, D.; Povey, D.; Valtchev, V.; Woodland, P. (2008). The HTK Book 3.2.
Other
1. Chen, C. H. (2006). Sutra on the original Vows of Bodhisattva Earth Treasure in English.
2. Evermann, G.; Chan, H. Y.; Gales, M. J. F.; Hain, T.; Liu, X.; Mrva, D.; Wang, L.; Woodland, P. C. (2004). Development of the 2003 CU-HTK Conversational Telephone Speech Transcription System. Montreal, Canada.
3. Haeb-Umbach, R.; Beyerlein, P.; Thelen, E. (1995). Automatic Transcription of Unknown Words in a Speech Recognition System. Detroit.
4. Kanokphara, S.; Tesprasit, V.; Thongprasirt, R. (2003). Pronunciation Variation Speech Recognition without Dictionary Modification on Sparse Database. Hong Kong.
5. Kim, D. Y.; Chan, H. Y.; Evermann, G.; Gales, M. J. F.; Mrva, D.; Sim, K. C.; Woodland, P. C. (2005). Development of the CU-HTK 2004 Broadcast News Transcription Systems. Philadelphia, USA.
6. Liang, M. S.; Lyu, D. C.; Chiang, Y. C.; Lyu, R. Y. (2004). Construct a Multi-Lingual Speech Corpus in Taiwan with Extracting Phonetically Balanced Articles. Jeju Island, Korea.
7. Nouza, J.; Nejedlova, D.; Zdansky, J.; Kolorenc, J. (2004). Very Large Vocabulary Speech Recognition System for Automatic Transcription of Czech Broadcast Programs. Jeju Island, Korea.
8. Raux, A. (2004). Automated Lexical Adaptation and Speaker Clustering based on Pronunciation Habits for Non-Native Speech Recognition. Jeju Island, Korea.
9. Sarada, G. L.; Hemalatha, N.; Nagarajan, T.; Murthy, H. A. (2004). Automatic Transcription of Continuous Speech using Unsupervised and Incremental Training. Jeju Island, Korea.
10. Siohan, O.; Ramabhadran, B.; Zweig, G. (2004). Speech Recognition Error Analysis on the English MALACH Corpus. Jeju Island, Korea.
11. Soltau, H.; Kingsbury, B.; Mangu, L.; Povey, D.; Saon, G.; Zweig, G. (2005). The IBM 2004 Conversational Telephony System for Rich Transcription. Philadelphia, USA.
12. Tripitaka, S. S. (2005). Sutra on the original Vows of Bodhisattva Earth Treasure in Chinese.
13. Tsai, M. Y.; Chou, F. C.; Lee, L. S. (2002). Improved pronunciation modeling by inverse word frequency and pronunciation entropy.
14. Wu, J.; Gupta, V. (1999). Application of Simultaneous Decoding Algorithm to Automatic Transcription of Known and Unknown Words. Phoenix, USA.
 
 
 
 