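A Study on Automatic Classification of Ancient Classics Incorporating Entity Features__Taiwan Citation Index - Humanities and Social Sciences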

:::

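Title:	A Study on Automatic Classification of Ancient Classics Incorporating Entity Features (融入實體特徵的典籍自動分類研究)
Journal:	Data Analysis and Knowledge Discovery (數據分析與知識發現)
Authors:	Qin Heran (秦賀然) / Liu Liu (劉瀏) / Li Bin (李斌) / Wang Dongbo (王東波)
Publication date:	2019
Volume/Issue:	2019(9)
Pages:	68-76
Keywords:	Ancient classics; Text classification; Entity; Support vector machine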

[Objective] Building on traditional statistical feature-word algorithms, this paper adds entity features to classify ten classics from ancient China. [Methods] Using a support vector machine model, we computed feature words with four traditional statistics: TF-IDF, information gain, chi-square test, and mutual information. We then added named entities as a further feature and evaluated the classifier's performance. [Results] With entity features added, the highest accuracy of the classifier reached 98.7%. Under the traditional information gain, TF-IDF, mutual information, and chi-square test feature calculations, classification accuracy improved by 12.4%, 12.4%, 12.3%, and 22.8%, respectively. [Limitations] Transferring entity features to other texts has limitations, because the entities must be annotated and recognized again. [Conclusions] Entities can serve as a class of features in text classification models and have practical value for wider application.
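The abstract does not give implementation details, so the following is only a minimal sketch of the general idea: score candidate feature words with one of the four statistics (chi-square is used here), keep the top-ranked words, and append named-entity counts to each document's feature vector. The toy documents, the labels `book_A` and `book_B`, the entity types `PERSON` and `PLACE`, and all function names are hypothetical illustrations, not the authors' data or code. Training the support vector machine itself is omitted.

```python
from collections import Counter


def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a term/class 2x2 contingency table.

    n11: documents in the class that contain the term
    n10: documents outside the class that contain the term
    n01: documents in the class that lack the term
    n00: documents outside the class that lack the term
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    if denom == 0:
        return 0.0  # term occurs in every document or in none: uninformative
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def select_terms(docs, labels, k):
    """Rank terms by their highest chi-square score over all classes; keep k."""
    vocab = sorted({t for d in docs for t in d})
    classes = sorted(set(labels))
    scores = {}
    for term in vocab:
        best = 0.0
        for c in classes:
            pairs = list(zip(docs, labels))
            n11 = sum(1 for d, y in pairs if y == c and term in d)
            n10 = sum(1 for d, y in pairs if y != c and term in d)
            n01 = sum(1 for d, y in pairs if y == c and term not in d)
            n00 = sum(1 for d, y in pairs if y != c and term not in d)
            best = max(best, chi_square(n11, n10, n01, n00))
        scores[term] = best
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(vocab, key=lambda t: (-scores[t], t))[:k]


def featurize(tokens, entities, terms, entity_types):
    """Counts of the selected terms, followed by counts per entity type."""
    term_counts = Counter(tokens)
    entity_counts = Counter(etype for _, etype in entities)
    return [term_counts[t] for t in terms] + [entity_counts[e] for e in entity_types]


# Invented toy corpus: tokenized passages, their source labels, and
# hypothetical (mention, type) entity annotations for each passage.
docs = [
    ["king", "army", "river"],
    ["king", "envoy", "river"],
    ["sage", "virtue", "river"],
    ["sage", "ritual", "river"],
]
labels = ["book_A", "book_A", "book_B", "book_B"]
entities = [
    [("mention_1", "PERSON"), ("mention_2", "PLACE")],
    [("mention_3", "PERSON")],
    [("mention_4", "PERSON")],
    [],
]
entity_types = ["PERSON", "PLACE"]

terms = select_terms(docs, labels, k=2)
vectors = [featurize(d, e, terms, entity_types) for d, e in zip(docs, entities)]
```

In this sketch, "river" appears in every passage and therefore scores zero, while "king" and "sage" separate the two labels perfectly and are selected. The resulting `vectors` could then be passed to any SVM implementation; swapping `chi_square` for TF-IDF, information gain, or mutual information would follow the same pattern.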


:::

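Related journals

1.	A DDAG-SVM-Based Credibility Classification Model for Online Product Reviews (基於DDAG-SVM的在線商品評論可信度分類模型)

Related papers

1.	Using Data Mining Techniques to Discover Patterns of Discussion Activity in Online Forums (使用資料探勘技術挖掘線上論壇討論活動型態)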
