:::

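Title: A Text Vector Representation Model Merging Multi-Granularity Information (融合多粒度信息的文本向量表示模型)
Journal: Data Analysis and Knowledge Discovery (數據分析與知識發現)
Authors: Nie Weimin (聶維民), Chen Yongzhou (陳永洲), Ma Jing (馬靜)
Publication date: 2019
Volume/issue: 2019(9)
Pages: 45-52
Subject keywords: Text classification; Word vector; Convolutional neural network; Topic model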
[Objective] This paper proposes a model that extracts semantic features from text more comprehensively and improves how well text vectors represent text semantics. [Methods] Convolutional neural networks extract word-granularity, topic-granularity and character-granularity feature vectors. A "merging gate" mechanism then fuses the three feature vectors into the final text vector, which is evaluated in a text classification experiment. [Results] On the Sogou corpus text classification task, the model reached an accuracy of 92.56%, a precision of 92.33%, a recall of 92.07% and an F1 score of 92.20%, which are 2.40%, 2.05%, 1.77% and 1.91% higher, respectively, than the baseline Text-CNN model. [Limitations] The model captures word-order relations only over a short range, so long-distance dependency features still need to be included, and the corpus is small and needs to be expanded. [Conclusions] The proposed model extracts text semantic features more comprehensively, and the resulting text vectors represent text semantics better.
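As a quick consistency check on the reported results, the F1 score can be recomputed as the harmonic mean of the reported precision and recall. The sketch below is illustrative only: the function name `f1_score` is a hypothetical helper, not code from the paper, and it assumes the reported figures are plain percentages for a single averaged precision/recall pair.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both in the same units)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
precision, recall = 92.33, 92.07
f1 = f1_score(precision, recall)
print(f"F1 = {f1:.2f}%")  # prints "F1 = 92.20%"
```

The recomputed value matches the 92.20% F1 score given in the abstract.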
 
 
 
 