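A Study on Using Concept Information for Mandarin Large Vocabulary Continuous Speech Recognition (original title: 使用概念資訊於中文大詞彙連續語音辨識之研究) | Taiwan Citation Index - Humanities and Social Sciences (臺灣人文及社會科學引文索引資料庫)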

Title:	A Study on Using Concept Information for Mandarin Large Vocabulary Continuous Speech Recognition (使用概念資訊於中文大詞彙連續語音辨識之研究)
Journal:	International Journal of Computational Linguistics & Chinese Language Processing
Authors:	郝柏翰 / 陳思澄 / 陳柏琳
Authors (romanized):	Hao, Po-han / Chen, Ssu-cheng / Chen, Berlin
Publication date:	2014
Volume/Issue:	19:4
Pages:	47-59
Keywords:	Speech recognition; Language model; Concept information; Model adaptation

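A language model is a key component of a speech recognition system. Its main function is usually to use the already-decoded word history to predict which word is most likely to come next, which helps the recognizer pick the most probable result from among many confusable candidate word-sequence hypotheses. This paper aims to develop novel dynamic language model adaptation techniques that assist the conventional N-gram language model and compensate for its shortcomings, and it makes two main contributions. First, we propose the concept language model (CLM), whose main purpose is to approximate the concept that the speaker intends to express, as implied by the word history, and thereby to obtain the word usage distribution under that concept as a source of cues for dynamic language model adaptation. Second, we try different ways of estimating this concept language model, and we incorporate different degrees of proximity information into it to relax its inherent bag-of-words assumption. The paper takes Mandarin large vocabulary continuous speech recognition (LVCSR) as its task in order to compare the proposed language model adaptation techniques with other commonly used techniques. Experimental results show that, measured by character error rate (CER), our language model adaptation techniques yield clear improvements over a baseline speech recognition system that uses only an N-gram language model.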
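
The abstract above reports results in terms of character error rate (CER). As a minimal sketch of how CER is commonly computed (character-level edit distance divided by the reference length), assuming whitespace is ignored and with hypothetical function names that do not come from the paper:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))  # distance from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion of r
                            curr[j - 1] + 1,      # insertion of h
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """CER = character-level edit distance / number of reference characters.

    Assumption: spaces are stripped before scoring; real evaluations may
    normalize text differently.
    """
    ref = list(reference.replace(" ", ""))
    hyp = list(hypothesis.replace(" ", ""))
    if not ref:
        raise ValueError("reference must contain at least one character")
    return edit_distance(ref, hyp) / len(ref)
```

For example, scoring the hypothesis "語音辨系統" against the reference "語音辨識系統" counts one deleted character out of six reference characters.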


Language modeling (LM) is part and parcel of automatic speech recognition (ASR), since it can help ASR constrain the acoustic analysis, guide the search through multiple candidate word strings, and quantify the acceptability of the final output hypothesis given an input utterance. This paper investigates and develops language model adaptation techniques for use in ASR, and its main contributions are twofold. First, we propose a novel concept language modeling (CLM) approach to rendering the relationships between a search history and an upcoming word. Second, the instantiations of CLM are constructed with different levels of lexical granularity, such as words and document clusters. In addition, we explore the incorporation of word proximity cues into the model formulation of CLM, getting around the “bag-of-words” assumption. A series of experiments conducted on a Mandarin large vocabulary continuous speech recognition (LVCSR) task demonstrates that our proposed language models can offer substantial improvements over the baseline N-gram system, and achieve performance competitive with, or better than, some state-of-the-art language model adaptation methods.
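
This record does not give the CLM formulation itself, so the following is only a generic sketch of dynamic language model adaptation in the same spirit: a topic-mixture word distribution inferred from the decoded history is linearly interpolated with a background N-gram probability. The topic tables, the mixture-of-unigrams posterior, the probability floor, and the interpolation weight `lam` are all illustrative assumptions, not the authors' method.

```python
import math


def topic_posterior(history, topic_word, topic_prior, floor=1e-6):
    """P(topic | history) under a bag-of-words, mixture-of-unigrams assumption.

    topic_word:  {topic: {word: P(word | topic)}}  (hypothetical, pre-trained)
    topic_prior: {topic: P(topic)}
    floor:       probability used for words unseen in a topic (assumption)
    """
    log_scores = {}
    for topic, prior in topic_prior.items():
        score = math.log(prior)
        for word in history:
            score += math.log(topic_word[topic].get(word, floor))
        log_scores[topic] = score
    # Normalize in a numerically stable way.
    top = max(log_scores.values())
    unnorm = {t: math.exp(s - top) for t, s in log_scores.items()}
    total = sum(unnorm.values())
    return {t: v / total for t, v in unnorm.items()}


def adapted_prob(word, ngram_prob, history, topic_word, topic_prior,
                 lam=0.3, floor=1e-6):
    """Interpolate a history-dependent topic-mixture probability with the
    background N-gram probability of `word` (passed in as `ngram_prob`)."""
    post = topic_posterior(history, topic_word, topic_prior, floor)
    p_topic = sum(post[t] * topic_word[t].get(word, floor) for t in post)
    return lam * p_topic + (1.0 - lam) * ngram_prob


# Toy usage with made-up numbers.
topic_word = {
    "sports": {"game": 0.5, "team": 0.4, "stock": 0.1},
    "finance": {"game": 0.1, "team": 0.1, "stock": 0.8},
}
topic_prior = {"sports": 0.5, "finance": 0.5}
history = ["team", "game"]
p_game = adapted_prob("game", 0.2, history, topic_word, topic_prior)
p_stock = adapted_prob("stock", 0.2, history, topic_word, topic_prior)
```

With a sports-like history, the adapted probability of "game" rises above that of "stock" even though both start from the same background probability, which is the effect that history-driven adaptation aims for. This sketch keeps the bag-of-words assumption and does not model word proximity.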


Journal articles
1.	Blei, D. M. (2014). Build, compute, critique, repeat: Data analysis with latent variable models. Annual Review of Statistics and Its Application, 1, 203-232.
2.	Furui, S., Deng, L., Gales, M., Ney, H., Tokuda, K. (2012). Fundamental technologies in modern speech recognition. IEEE Signal Processing Magazine, 29(6), 16-17.
3.	O'Shaughnessy, D., Deng, L., Li, H. (2013). Speech information processing: Theory and applications. Proceedings of the IEEE, 101(5), 1034-1037.
4.	Zhai, C. X. (2008). Statistical language models for information retrieval: A critical review. Foundations and Trends in Information Retrieval, 2(3), 137-213.
5.	Kullback, Solomon, Leibler, Richard A. (1951). On Information and Sufficiency. The Annals of Mathematical Statistics, 22(1), 79-86.
6.	Wang, Hsin-min, Chen, Berlin, Kuo, Jen-wei, Cheng, Shih-sian (2005, June). MATBN: A Mandarin Chinese Broadcast News Corpus. International Journal of Computational Linguistics & Chinese Language Processing, 10(2), 219-235.
7.	Dempster, Arthur P., Laird, Nan M., Rubin, Donald B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 39(1), 1-38.
8.	Blei, David M., Ng, Andrew Y., Jordan, Michael I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3(4/5), 993-1022.
9.	Bellegarda, J. R. (2004). Statistical Language Model Adaptation: Review and Perspectives. Speech Communication, 42, 93-108.
10.	Ortmanns, S., Ney, H., Aubert, X. L. (1997). A Word Graph Algorithm for Large Vocabulary Continuous Speech Recognition. Computer Speech and Language, 11, 43-72.
11.	Rosenfeld, R. (2000). Two Decades of Statistical Language Modeling: Where Do We Go from Here. Proceedings of the IEEE, 88(8), 1270-1278.

Conference papers
1.	Chen, K. (2014). Leveraging effective query modeling techniques for speech recognition and summarization. The Conference on Empirical Methods on Natural Language Processing.
2.	Gildea, D., Hofmann, T. (1999). Topic-based language models using EM. The European Conference on Speech Communication and Technology, 2167-2170.
3.	Kim, D. (2013). A variational approximation for topic modeling of hierarchical corpora. The International Conference on Machine Learning.
4.	Kuhn, R. (1988). Speech recognition and the frequency of recently used words: A modified Markov model for natural language. International Conference on Computational Linguistics, 348-350.
5.	Lau, R., Rosenfeld, R., Roukos, S. (1993). Trigger-based language models: a maximum entropy approach. The IEEE International Conference on Acoustics, Speech, Signal Processing, 45-48.
6.	Liu, S. H., Chu, F. H., Lin, S. H., Lee, H. S., Chen, B. (2007). Training data selection for improving discriminative training of acoustic models. IEEE workshop on Automatic Speech Recognition and Understanding, 284-289.
7.	Mikolov, T., Karafiát, M., Burget, L., Černocký, J., Khudanpur, S. (2010). Recurrent neural network based language model. The Eleventh Annual Conference of the International Speech Communication Association 2010, 1045-1048.
8.	Potapenko, A., Vorontsov, K. (2013). Robust PLSA performs better than LDA. The European Conference on Information Retrieval, 784-787.
9.	Tam, Y., Schultz, T. (2005). Dynamic language model adaptation using variational Bayes inference. The Annual Conference of the International Speech Communication Association, 5-8.
10.	Troncoso, C., Kawahara, T. (2005). Trigger-based language model adaptation for automatic meeting transcription. The Annual Conference of the International Speech Communication Association, 1297-1300.
11.	Hofmann, T. (1999). Probabilistic latent semantic indexing. The ACM Special Interest Group on Information Retrieval (conference dates: August 15-19), 50-57.
12.	Chen, B., Kuo, J. W., Tsai, W. H. (2004). Lightly Supervised and Data-Driven Approaches to Mandarin Broadcast News Transcription. The IEEE International Conference on Acoustics, Speech, and Signal Processing 2004, 777-780.

Books
1.	Baeza-Yates, R., Ribeiro-Neto, B. (2011). Modern Information Retrieval: the Concepts and Technology behind Search. Addison-Wesley.

Other
1.	Stolcke, A. (2000). SRI Language Modeling Toolkit, http://www.speech.sri.com/projects/srilm.

Book chapters
1.	Deng, L., Yu, D. (2014). Deep Learning: Methods and Applications. Foundations and Trends in Signal Processing. Now Publishers.
2.	Blei, D., Lafferty, J. (2009). Topic models. Text Mining: Theory and Applications. Taylor and Francis.

Related journal articles

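1.	Spoken Document Summarization Based on End-to-End Modeling Techniques (基於端對端模型化技術之語音文件摘要)
2.	Application of Feature-Granularity-Based Training Strategies to Chinese Spoken Question Answering Systems (基於特徵粒度之訓練策略於中文口語問答系統之應用)
3.	Improvements to Neural Network Architectures and Optimization Methods Combined with Discriminatively Trained Acoustic Models (結合鑑別式訓練聲學模型之類神經網路架構及優化方法的改進)
4.	A Comparison of Contemporary Unsupervised Methods for Extractive Language Summarization (當代非監督式方法之比較於節錄式語言摘要)
5.	A Study on Multi-Task Learning for Neural Network Acoustic Model Training in Meeting Speech Recognition (融合多任務學習類神經網路聲學模型訓練於會議語音辨識之研究)
6.	Extractive Spoken Document Summarization Using Representation Learning Techniques (節錄式語音文件摘要使用表示法學習技術)
7.	A Study on Building a Nursing Record Corpus and Lexicon and a Feasibility Assessment of Their Application to Speech Recognition (護理紀錄語料及辭典建置之研究與應用於語音辨識之可行性評估)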
8.	Improved Minimum Phone Error Based Discriminative Training of Acoustic Models for Mandarin Large Vocabulary Continuous Speech Recognition
9.	An Empirical Study of Word Error Minimization Approaches for Mandarin Large Vocabulary Continuous Speech Recognition
10.	MATBN: A Mandarin Chinese Broadcast News Corpus
