[1] Wang Yongkai, Mao Cunli, Yu Zhengtao, et al. Approach for topical sentence of news events extraction based on graph[J]. Journal of Nanjing University of Science and Technology, 2016, 40(04): 438-443. [doi:10.14177/j.cnki.32-1397n.2016.40.04.010]

Approach for topical sentence of news events extraction based on graph

Journal of Nanjing University of Science and Technology (Natural Science Edition) [ISSN: 1005-9830 / CN: 32-1397/N]

Volume:
40
Issue:
2016, No. 04
Pages:
438
Publication date:
2016-08-29

Article Info

Title:
Approach for topical sentence of news events extraction based on graph
Article ID:
1005-9830(2016)04-0438-06
Author(s):
Wang Yongkai, Mao Cunli, Yu Zhengtao, Guo Jianyi, Hong Xudong, Luo Lin
School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
Keywords:
news events; event topic sentences; trigger word; named entity; event relation; undirected graph; ranking; extraction
Classification number:
TP311
DOI:
10.14177/j.cnki.32-1397n.2016.40.04.010
Abstract:
Topical sentence recognition for news events is a natural language processing task that relies on semantic analysis of text content. To accurately identify the sentences in a news event text that are semantically most relevant to the news topic, this paper proposes a graph-based approach for extracting topical sentences of news events. First, trigger words and named entities, which describe the characteristics of an event, are used to build templates for extracting candidate event sentences. Next, the relations between candidate event sentences are computed and used to construct an undirected event relation graph. Finally, following the idea of the TextRank algorithm, the weight of each vertex in the graph is expressed as the weighted sum of the weights of the vertices connected to it, and the sentences are ranked by weight to extract the event topical sentences. Experimental results show that the proposed approach outperforms the TFIDF-based and title-based methods for event topical sentence extraction, improving the F value by 6.26% and 2%, respectively.
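
The ranking step described in the abstract can be illustrated with a minimal sketch of weighted TextRank on an undirected sentence graph. It is not the authors' implementation. The function names (`jaccard`, `build_graph`, `textrank`) are hypothetical. The Jaccard word overlap is only a stand-in for the paper's event relation measure, which this page does not specify. The damping factor 0.85 is the conventional TextRank default rather than a value reported here, and the toy sentences are invented for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Stand-in relation score between two tokenized sentences (an assumption)."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def build_graph(sentences):
    """Build an undirected weighted graph as a symmetric adjacency dict."""
    graph = {i: {} for i in range(len(sentences))}
    for i, j in combinations(range(len(sentences)), 2):
        w = jaccard(sentences[i], sentences[j])
        if w > 0:
            graph[i][j] = w  # store the edge in both directions
            graph[j][i] = w
    return graph


def textrank(graph, damping=0.85, tol=1e-6, max_iter=100):
    """Iterate weighted TextRank until scores change by less than tol."""
    if not graph:
        return {}
    scores = {v: 1.0 for v in graph}
    # Total edge weight at each vertex, used to normalize its contribution.
    strength = {v: sum(nbrs.values()) for v, nbrs in graph.items()}
    for _ in range(max_iter):
        new = {}
        for v, nbrs in graph.items():
            # Weighted sum of neighbour scores; isolated vertices get 0 here.
            rank = sum(w / strength[u] * scores[u] for u, w in nbrs.items())
            new[v] = (1 - damping) + damping * rank
        delta = max(abs(new[v] - scores[v]) for v in graph)
        scores = new
        if delta < tol:
            break
    return scores


# Toy candidate event sentences, already tokenized (hypothetical data).
sentences = [
    ["earthquake", "hit", "city"],
    ["earthquake", "rescue", "city", "team"],
    ["rescue", "team", "arrive"],
    ["stock", "market", "rise"],
]
graph = build_graph(sentences)
scores = textrank(graph)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked[0], [round(scores[i], 3) for i in range(len(sentences))])
```

In this toy input the second sentence shares words with both the first and the third, so it receives the highest score, while the unrelated fourth sentence keeps only the base score of 1 - damping. Taking the top entries of `ranked` corresponds to selecting the topical sentences.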

Memo:
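Received: 2016-02-05; Revised: 2016-06-30
Funding: National Natural Science Foundation of China (61472168); Key Project of the Science and Technology Department of Yunnan Province (2013FA030); Key Project of the Yunnan Provincial Department of Education Fund (2015Z022); Kunming University of Science and Technology Research Start-up Fund for Introduced Talents (KKSY201503007)
About the authors: Wang Yongkai (1990-), male, master's student, research interests: natural language processing, information retrieval, E-mail: 412301999@qq.com; Corresponding author: Mao Cunli (1977-), male, Ph.D., associate professor, research interests: natural language processing, information retrieval, machine translation, E-mail: maocunli@163.com.
Citation format: Wang Yongkai, Mao Cunli, Yu Zhengtao, et al. Approach for topical sentence of news events extraction based on graph[J]. Journal of Nanjing University of Science and Technology, 2016, 40(4): 438-443.
Submission website: http://zrxuebao.njust.edu.cn
Last Update: 2016-06-30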