A Review of Research on Text Similarity Calculation Methods | Taiwan Citation Index - Humanities and Social Sciences

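Title:	A Review of Research on Text Similarity Calculation Methods (文本相似度計算方法研究綜述)
Journal:	Data Analysis and Knowledge Discovery (數據分析與知識發現)
Authors:	Chen Erjing (陳二靜) / Jiang Enbo (姜恩波)
Publication date:	2017
Volume/issue:	2017(6)
Pages:	1-11
Subject keywords:	Text similarity; Semantic similarity; Ontology; Bag of words model; Neural network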

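[Objective] To analyze text similarity calculation methods and understand development trends in the field. [Coverage] CNKI and Web of Science were searched with the queries "篇名:文本相似度OR篇名:詞匯相似度OR篇名:語義相似度" (title: text similarity OR title: lexical similarity OR title: semantic similarity) and "TI: 'text similarity' or 'semantic similarity' or 'lexical similarity'" respectively, restricted by document type, yielding 69 key articles. [Methods] Text similarity calculation methods are systematically reviewed; the basic ideas and characteristics of the key methods are analyzed, and future directions are summarized. [Results] A fairly comprehensive classification scheme is established, dividing text similarity calculation methods into four categories: string-based methods, corpus-based methods, world-knowledge-based methods, and other methods. Methods based on neural networks and on world knowledge, together with similarity calculation for cross-domain text, are expected to become the development trends of the field. [Limitations] The discussion centers only on the methods themselves and does not further analyze how they are applied. [Conclusions] The review helps readers gain a comprehensive and in-depth understanding of the current state and future trends of research on text similarity calculation methods.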

[Objective] This paper analyzes popular text similarity measures and discusses their latest developments. [Coverage] We retrieved 69 key articles from the CNKI and Web of Science databases by searching "TI: 'text similarity' or 'semantic similarity' or 'lexical similarity'" in Chinese and English respectively. [Methods] We systematically reviewed the text similarity measures, focusing on their basic concepts, characteristics and future directions. [Results] There were four types of text similarity measures: string-based, corpus-based, knowledge-based and others. Measures based on neural networks, knowledge-based measures and measures for cross-domain text could be the future research directions. [Limitations] We did not discuss the applications of those measures. [Conclusions] This paper is a comprehensive review of text similarity measure research.
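
To make the first two categories named in the abstract more concrete, here is a minimal Python sketch. It is an illustration only, not the authors' implementation: the function names are hypothetical, edit distance is assumed as one representative string-based measure, and a bag-of-words cosine over raw term counts is assumed as a simple stand-in for the corpus-based family (a fuller vector space model would typically weight terms using statistics from a corpus).

```python
from collections import Counter
import math


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (a string-based measure)."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def string_similarity(a: str, b: str) -> float:
    """Edit distance normalized into [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - edit_distance(a, b) / longest


def bow_cosine(a: str, b: str) -> float:
    """Cosine similarity of raw term-count vectors (bag-of-words model).

    Assumption: whitespace tokenization and lowercasing; no TF-IDF weighting.
    """
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text shares no terms with anything
    return dot / (norm_a * norm_b)
```

For example, edit_distance("kitten", "sitting") is 3, so string_similarity gives 1 - 3/7, while bow_cosine("the cat sat", "the cat ran") is about 2/3 because two of the three terms are shared. Neither function captures meaning: "car" and "automobile" score low on both, which is the gap the knowledge-based and neural-network measures mentioned in the abstract aim to close.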
