:::

Title: Acoustic Model Optimization for Multilingual Speech Recognition
Journal: International Journal of Computational Linguistics & Chinese Language Processing
Authors: Lyu, Dau-cheng; Hsu, Chun-nan; Chiang, Yuang-chin; Lyu, Ren-yuan
Publication date: 2008
Volume/Issue: 13(3)
Pages: 363-385
Keywords: Cross-lingual phone set optimization; Speech recognition; Delta-BIC
Because abundant resources are not always available for resource-limited languages, training an acoustic model for multilingual speech recognition from unbalanced training data is an interesting research issue. In this paper, we propose a three-step data-driven phone clustering method to train a multilingual acoustic model. The first step obtains a clustering rule for context-independent phone models, derived from a well-trained acoustic model using a similarity measurement. In the second step, we further cluster the sub-phone units using hierarchical agglomerative clustering with the delta Bayesian information criterion (delta-BIC), according to the clustering rules. In the third step, we apply a parametric modeling technique, model complexity selection, to adjust the number of Gaussian components in a Gaussian mixture so that the acoustic model is optimized for the new phoneme set and the available training data. We used an unbalanced trilingual corpus in which Mandarin, Taiwanese, and Hakka account for about 60%, 30%, and 10% of the training data, respectively. The experimental results show that the proposed sub-phone clustering approach reduced the syllable error rate by a relative 4.5% compared with the best result of the decision-tree-based approach and by a relative 13.5% compared with the best result of the knowledge-based approach.
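The second step of the abstract, agglomerative clustering driven by delta-BIC, can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption: each cluster is modeled by a single full-covariance Gaussian, the merge test is the commonly used delta-BIC between a two-model and a one-model description of the pooled data, and the names `delta_bic`, `agglomerate`, and the penalty weight `lam` are hypothetical. It is not the authors' implementation.

```python
import numpy as np


def log_det_cov(x):
    """Log-determinant of the maximum-likelihood covariance of the rows of x."""
    d = x.shape[1]
    # bias=True gives the ML estimate; a tiny ridge keeps the matrix non-singular.
    cov = np.cov(x, rowvar=False, bias=True).reshape(d, d) + 1e-6 * np.eye(d)
    _, logdet = np.linalg.slogdet(cov)
    return logdet


def delta_bic(x1, x2, lam=1.0):
    """BIC(two separate Gaussians) - BIC(one pooled Gaussian).

    Negative values favor merging the two clusters (assumed sign convention).
    """
    n1, n2 = len(x1), len(x2)
    n = n1 + n2
    d = x1.shape[1]
    pooled = np.vstack([x1, x2])
    # Extra parameters of the second Gaussian: mean plus symmetric covariance.
    n_params = d + d * (d + 1) / 2
    penalty = 0.5 * lam * n_params * np.log(n)
    return (0.5 * n * log_det_cov(pooled)
            - 0.5 * n1 * log_det_cov(x1)
            - 0.5 * n2 * log_det_cov(x2)
            - penalty)


def agglomerate(clusters, lam=1.0):
    """Greedily merge the pair with the lowest delta-BIC until none is negative."""
    clusters = dict(clusters)  # cluster name -> (n_frames, dim) feature array
    while len(clusters) > 1:
        names = list(clusters)
        score, a, b = min(
            ((delta_bic(clusters[p], clusters[q], lam), p, q)
             for i, p in enumerate(names) for q in names[i + 1:]),
            key=lambda t: t[0],
        )
        if score >= 0:  # no remaining pair is better described by one model
            break
        clusters[a + "+" + b] = np.vstack([clusters.pop(a), clusters.pop(b)])
    return clusters
```

Called with a dictionary that maps unit names to feature arrays, `agglomerate` returns a dictionary whose keys record which units were merged. In this sketch, raising `lam` makes merging more aggressive and therefore yields a smaller unit inventory.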
Journal articles
1. Lyu, Ren-yuan; Liang, Min-siong; Chiang, Yuang-chin (2004-08). Toward Constructing a Multilingual Speech Corpus for Taiwanese (Min-nan), Hakka, and Mandarin. International Journal of Computational Linguistics & Chinese Language Processing, 9(2), 1-12.
2. Schwarz, Gideon (1978). Estimating the Dimension of a Model. The Annals of Statistics, 6(2), 461-464.
3. Fowlkes, E. B.; Mallows, C. L. (1986). A Method for Comparing Two Hierarchical Clusterings. Journal of the American Statistical Association, 78(383), 553-584.
4. Kohler, J. (2001). Multi-lingual Phone Model for Vocabulary-Independent Speech Recognition Task. International Journal of Speech Communication, 35, 21-30.
5. Uebler, U. (2001). Multi-lingual Speech Recognition in Seven Languages. International Journal of Speech Communication, 35, 53-69.
Conference papers
1. Anguera, X.; Shinozaki, T.; Wooters, C.; Hernando, J. (2007). Model Complexity Selection and Cross-Validation EM Training for Robust Speaker Diarization. Honolulu.
2. Lyu, D. C.; Lyu, R. Y. (2008). Optimizing the Acoustic Modeling from an Unbalanced Bi-Lingual Corpus. Las Vegas.
3. Mark, B.; Barnard, E. (1996). Phone Clustering Using the Bhattacharyya Distance. Philadelphia. 2005-2008.
4. Marthi, B.; Morgan, J.; Peterek, K.; Picone, J.; Wang, W. (1999). Towards Language Independent Acoustic Modeling. Keystone.
5. Tritschler, A.; Gopinath, R. (1999). Improved Speaker Segmentation and Segments Clustering Using the Bayesian Information Criterion. 672-682.
6. Wu, Chung-Hsien; Chiu, Y. H.; Shia, C. J.; 林君昱 (2006). Phone Set Generation Based On Acoustic and Contextual. Toulouse.
7. Young, S. J.; Odell, J. J.; Woodland, P. C. (1994). Tree-based State Tying for High Accuracy Acoustic Modeling. Berling.
Books
1. Kirchhoff, K.; Schultz, T. (2006). Multilingual Speech Processing.
2. Kumar, C. S.; Mohandas, V. P.; Li, H. Z. (2005). Multi-lingual Speech Recognition - A Unified Approach. Proc. INTERSPEECH'05. Lisbon.
3. Liu, Y.; Fung, P. (2005). Automatic Phone Set Extension with Confidence Measure for Spontaneous Speech. Proc. INTERSPEECH'05. Lisbon.
4. Lyu, D. C.; Yang, B. H.; Liang, M. S.; Lyu, R. Y.; Hsu, C. N. (2002). Speaker Independent Acoustic Modeling for Large Vocabulary Bi-lingual Taiwanese/Mandarin Continuous Speech Recognition. Proc. SST. Melbourne.
5. Mathews, Robert Henry (1975). Chinese-English Dictionary. Caves.
Others
1. Liang, Po-Yu; Shen, J.-L.; Lee, L. S. (1998). Decision Tree Clustering for Acoustic Modeling in Speaker-Independent Mandarin Telephone Speech Recognition. Singapore.
2. Young, S. P.; Evermann, G.; Hain, T.; Kershaw, D.; Moore, G.; Odell, J.; Ollason, D.; Povey, D.; Valtchev, V.; Woodland, P. (2002). The HTK Book, version 3.2.
 
 
 
 