:::

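Title: Research on a Deep Learning-Based Method for Classifying the Move Structure of Academic Papers (基於深度學習的學術論文語步結構分類方法研究)
Journal: 數據分析與知識發現 (Data Analysis and Knowledge Discovery)
Authors: 王末 (Wang Mo); 崔運鵬 (Cui Yunpeng); 陳麗 (Chen Li); 李歡 (Li Huan)
Publication date: 2020
Volume/issue: 2020(6)
Pages: 60-68
Subject keywords: Move classification (語步分類); Argumentative zoning; Deep learning; Bidirectional encoder; Neural networks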
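[Objective] To learn sentence representations of research papers with a deep learning language representation model and, on that basis, build a move classification model for papers that improves classification performance. [Methods] The pre-trained deep learning language representation model BERT is adopted, and its input is modified to incorporate each sentence's position in the text. Transfer learning on an annotated dataset yields sentence-level embeddings, which are fed into a neural network classifier to train the classification model and classify the moves of a paper. [Results] Experiments on a public dataset show that, in the 11-class task, overall accuracy improved by 29.7% to reach 81.3%; in the 7-class core move task, accuracy reached 85.5%. [Limitations] Because of constraints in the experimental environment, the pre-trained parameters of the proposed modified-input model come from the original model structure; how well the transferred parameters suit the new model input remains to be explored. [Conclusions] The method substantially outperforms traditional "feature construction + machine learning" classifiers and also improves somewhat on the original BERT model. It requires no manual feature construction and is not tied to a specific language, so it can be applied to move classification of Chinese academic papers and has considerable potential for practical application.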
[Objective] This study aims to develop a new argumentative zoning method, based on a deep learning language representation model, that achieves better performance. [Methods] We adopted the pre-trained deep learning language representation model BERT and improved the model input with a sentence position feature, then conducted transfer learning on training data from biochemistry journals. The learned sentence representations were fed into a neural network classifier to perform argumentative zoning classification. [Results] The experiment indicated that, for the eleven-class task, the method achieved significant improvement for most classes. The accuracy reached 81.3%, improved by 29.7% compared to the best performance from previous studies. For the seven core classes, the model achieved an accuracy of 85.5%. [Limitations] Due to limitations of the experimental environment, our refined model was trained from pre-trained parameters, which could limit its potential classification performance. [Conclusions] The proposed method showed significant improvement over shallow machine learning schemes and the original BERT model, and avoided the tedious work of feature engineering. The method is independent of language and is therefore also suitable for research articles in Chinese.
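The abstract says the model input is improved with a sentence position feature but does not say how the position is encoded. The sketch below shows one possible encoding, as an illustration only: it buckets each sentence's relative position in the document and prepends a marker token to the sentence text. The function names, the `[POSn]` marker format, and the bucket count are hypothetical and are not taken from the paper.

```python
from typing import List


def position_bucket(index: int, total: int, n_buckets: int = 10) -> int:
    """Map a 0-based sentence index to a relative-position bucket.

    Returns an integer in [0, n_buckets - 1]. The bucketing scheme is an
    assumption made for illustration, not the authors' encoding.
    """
    if total <= 0 or not 0 <= index < total:
        raise ValueError("index must satisfy 0 <= index < total")
    return min(index * n_buckets // total, n_buckets - 1)


def add_position_markers(sentences: List[str], n_buckets: int = 10) -> List[str]:
    """Prefix every sentence with a hypothetical position marker token.

    The marked strings could then be tokenized and passed to a sentence
    encoder such as BERT; that step is outside this sketch.
    """
    total = len(sentences)
    return [
        f"[POS{position_bucket(i, total, n_buckets)}] {sentence}"
        for i, sentence in enumerate(sentences)
    ]


if __name__ == "__main__":
    paper = [
        "Argumentative zoning labels each sentence with its rhetorical role.",
        "We fine-tune a pre-trained encoder on annotated sentences.",
        "The classifier improves accuracy on the eleven-class task.",
        "Future work will examine other languages.",
    ]
    for line in add_position_markers(paper, n_buckets=4):
        print(line)
```

A text marker is only one option; a position feature could also be supplied as an extra embedding or concatenated to the sentence vector before the classifier, and the abstract does not indicate which approach the authors used.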
 
 
 
 