Title: News Event Summarization by Computing Sentence Centrality with a Sentence Similarity Network (應用語句關係網路計算語句向心性之新聞事件摘要方法)
Journal: Journal of Information Management (資訊管理學報)
Authors: Yeh, Jen-yuan (葉鎮源); Yang, Wei-pang (楊維邦); Ke, Hao-ren (柯皓仁); Cheng, Pei-cheng (鄭培成)
Publication date: 2014
Volume/issue: 21:3
Pages: 271-304
Keywords: Multidocument summarization; Extraction-based summarization; Sentence similarity network; Network-based sentence centrality; Sentence ranking
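Abstract (translated from the Chinese): The core of extractive summarization is to assess how well each sentence represents the summary, and to rank sentences on that basis for extraction. This study treats sentences as nodes and uses sentence similarity to decide whether a link exists between two nodes, thereby building a sentence similarity network model. It then measures the importance of each node in the network, or its influence on the nodes connected to it, and proposes five node centrality analyses: (1) Degree Centrality, (2) Normalized Similarity-based Degree Centrality, (3) HITS Centrality, (4) PageRank Centrality, and (5) iSpreadRank Centrality. Sentence centrality is taken as the sentence's summary representativeness, which yields the sentence ranking. Finally, CSIS (Cross-Sentence Information Subsumption) is applied to filter redundant information, and sentences are extracted in rank order to compose the summary. Experiments on the DUC 2004 dataset verify the feasibility of the method. Under the ROUGE-1 metric, summarization performance with the different sentence centralities ranks as: iSpreadRank > Normalized Similarity-based Degree > PageRank > Degree > HITS. Overall, the experiments show that computing sentence centrality on a sentence similarity network is a feasible approach to summarization.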
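
The network construction and the simplest of the five measures, Degree Centrality, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the record does not say how sentence similarity is computed or where the link threshold is set, so the cosine similarity over raw term counts, the `threshold=0.3` value, and all function names are assumptions, and no stop-word removal or stemming is applied.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_similarity_network(sentences, threshold=0.3):
    """Return a symmetric weight matrix; a weight of 0.0 means no edge.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    vectors = [Counter(tokenize(s)) for s in sentences]
    n = len(sentences)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sim = cosine_similarity(vectors[i], vectors[j])
            if sim >= threshold:
                weights[i][j] = weights[j][i] = sim
    return weights


def degree_centrality(weights):
    """Score each sentence by the number of sentences it is linked to."""
    return [sum(1 for w in row if w > 0) for row in weights]


# Toy input: three related sentences and one unrelated sentence.
sentences = [
    "The storm hit the coast on Monday.",
    "The storm hit the coast and flooded the town.",
    "Officials said the town was flooded after the storm hit.",
    "Stock markets rose sharply in Asia.",
]
weights = build_similarity_network(sentences)
scores = degree_centrality(weights)
ranking = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
```

In this toy input the three storm sentences link to one another and the unrelated sentence stays isolated, so it falls to the bottom of `ranking`.
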
Purpose: Sentence extraction, a widely adopted summarization paradigm, extracts important sentences and composes them into a summary. Its foundation is assessing the importance of each sentence so that sentences can be ranked for extraction. This paper employs graph-based text analysis to model documents and investigates graph-based centrality measures as indicators of sentence salience in summarization.
Design/methodology/approach: This paper models documents on the same (or related) topic as a sentence similarity network, in which a sentence is regarded as a node and a relationship between two sentences exists only if they are semantically related. Several methods for evaluating the importance of a node (i.e., a sentence) in the network are then proposed, namely: (1) Degree Centrality; (2) Normalized Similarity-based Degree Centrality; (3) HITS Centrality; (4) PageRank Centrality; and (5) iSpreadRank Centrality. All are designed on the idea that the importance of a node is determined not only by the number of nodes to which it connects, but also by the importance of its connected nodes. For summary generation, CSIS (Cross-Sentence Information Subsumption) is employed to reduce redundancy while sentences are extracted according to the ranking produced from sentence centrality.
Findings: The proposed summarization method was evaluated with the ROUGE evaluation suite on the DUC 2004 news stories collection. Under the ROUGE-1 metric, the performance ranking is: iSpreadRank > Normalized Similarity-based Degree > PageRank > Degree > HITS. A second experiment, which combines sentence centrality with surface-level features, also yields results competitive with the best participant in the DUC 2004 evaluation.
Research limitations/implications: Directions for future research are: (1) to move beyond symbolic-level analysis and take semantics into account, such as synonymy, polysemy, and term dependency, when determining whether two sentences are semantically related; (2) to investigate graph-based centrality measures developed in social network analysis for evaluating sentence salience in summarization; (3) to improve the cohesion and coherence of summaries using natural language processing techniques, such as sentence planning and generation.
Practical implications: The proposed summarization method is unsupervised, so no training dataset is required. Since no domain-specific knowledge or deep linguistic analysis is exploited, the method is domain- and language-independent. However, because no deep natural language analysis is performed, no discourse structure is considered, and no domain-specific knowledge is involved in the summarization process, the method may understand the input texts poorly and would probably produce poor summaries.
Originality/value: The contributions of this work are threefold. First, this paper offers a sentence similarity network to model topic-related documents. Second, novel graph-based sentence ranking methods are explored to rank the importance of sentences for extraction. Finally, the proposed method proved successful in a case study on the DUC 2004 benchmark dataset.
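
The idea that a node's importance depends on the importance of its neighbors, followed by rank-order extraction with a redundancy check, can be sketched as below. This is only a sketch under stated assumptions: the generic weighted PageRank with damping 0.85 is the textbook formulation, not necessarily the paper's variant; the greedy similarity cutoff in `select_sentences` is a simplified stand-in for CSIS, which this record does not define; iSpreadRank is omitted for the same reason. The weight matrix, thresholds, and function names are hypothetical.

```python
def pagerank(weights, damping=0.85, iterations=50):
    """Power-iteration PageRank on a weighted, undirected sentence graph."""
    n = len(weights)
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Score held by nodes without edges is spread uniformly over all nodes.
        dangling = sum(scores[j] for j in range(n) if out_sums[j] == 0)
        new_scores = []
        for i in range(n):
            incoming = sum(
                scores[j] * weights[j][i] / out_sums[j]
                for j in range(n)
                if out_sums[j] > 0
            )
            new_scores.append(
                (1 - damping) / n + damping * (incoming + dangling / n)
            )
        scores = new_scores
    return scores


def select_sentences(ranking, weights, max_sentences=2, redundancy_threshold=0.7):
    """Pick sentences in rank order, skipping any that is too similar to a
    sentence already chosen (a simplified stand-in for CSIS)."""
    chosen = []
    for i in ranking:
        if all(weights[i][j] < redundancy_threshold for j in chosen):
            chosen.append(i)
        if len(chosen) == max_sentences:
            break
    return chosen


# Hypothetical 4-sentence network: symmetric similarity weights, 0.0 = no edge.
weights = [
    [0.0, 0.8, 0.4, 0.0],
    [0.8, 0.0, 0.5, 0.0],
    [0.4, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
scores = pagerank(weights)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
summary = select_sentences(ranking, weights)
```

Here sentence 1 ranks first; sentence 0 ranks second but is skipped because its similarity to sentence 1 (0.8) exceeds the cutoff, so sentence 2 is extracted instead.
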