Adaptive short text keyword generation model

WANG Yongjian; SUN Yaru; YANG Ying

doi: 10.13700/j.bh.1001-5965.2020.0601

The Third Research Institute of Ministry of Public Security, Shanghai 201204, China

Corresponding author: YANG Ying, E-mail: yangying@mcst.org.cn

CLC number: TP391
Publication history
- Received: 2020-10-23
- Accepted: 2020-11-06
- Published online: 2022-02-20

Abstract:
Keyword extraction strongly affects downstream text processing, and the accuracy and fluency of the extracted keywords are central to the task. Keyword extraction from short text suffers from inaccurate word segmentation, mismatch between keywords and the text topic, and mixed-language content. To alleviate these problems, we propose ADGCN, an adaptive short text keyword generation model based on graph-to-sequence learning with a graph convolutional network. The encoder combines a graph neural network with an attention mechanism to extract text features: the graph neural network handles the irregular structure of short text and the complex relations between words, and self-attention over the positional and contextual features of words captures rich context-dependent information. A linear decoding scheme then generates interpretable keywords. We also collect and release TH, a tag dataset from a social media platform that contains post texts and their topic tags. From the perspective of user needs, we evaluate the relevance, informativeness, and coherence of the generated keywords. The results show that the model generates keywords that match the topic of the short text and effectively mitigates the impact of data perturbation. The model also performs well on the public dataset KP20k, which indicates good portability.
Keywords: keyword extraction; keyword generation; graph neural network; attention mechanism; topic model
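
The abstract describes an encoder that pairs a graph neural network with self-attention. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes the widely used symmetric normalization for graph convolution and plain scaled dot-product self-attention without learned projections. The toy word graph, feature values, weights, and all function names are hypothetical and chosen only for illustration.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 (self-loops added, then symmetric scaling)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_layer(adj_norm, feats, weight):
    """One graph convolution step: ReLU(A_norm @ H @ W)."""
    out = matmul(matmul(adj_norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(h):
    """Scaled dot-product self-attention with Q = K = V = h (an assumption)."""
    d = len(h[0])
    h_t = [list(col) for col in zip(*h)]
    scores = matmul(h, h_t)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return weights, matmul(weights, h)


# Hypothetical toy input: 4 words linked in a chain by co-occurrence edges.
adj = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
feats = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5], [1.0, 1.0, 0.0], [0.5, 0.5, 1.0]]
weight = [[0.2, -0.1], [0.4, 0.3], [-0.3, 0.5]]  # illustrative values, not trained

hidden = gcn_layer(normalize_adjacency(adj), feats, weight)
attn, context = self_attention(hidden)
for word_index, row in enumerate(context):
    print(word_index, [round(v, 3) for v in row])
```

Each row of `context` is a word representation that mixes graph-neighbor information (from the convolution) with information from every other word (from attention). A decoder would consume these vectors to generate keywords; that stage is outside this sketch.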

Figure 1. Schematic diagram of the ADGCN model

Figure 2. Graph construction process

Figure 3. Attention layer

Figure 4. Comparison of F₁ values of different models at different data scales

Figure 5. Proportion of topic tag counts

Figure 6. Evaluation of metric values for different numbers of topic tags

Table 1. Number of TH samples and tags in different topics

Topic	Number of texts	Number of tags
Life	14 044	10
Education	11 181	7
Health	4 868	5

Table 2. KP20k dataset description

KP20k split	Number of samples
Training set	530 809
Validation set	20 000
Test set	20 000

Table 3. Comparison of the three evaluation metrics for the baseline models and the proposed model on the life topic

Model	Relevance	Informativeness	Coherence	Result
TF-IDF	6.75	5.70	8.03	6.83
TextRank	6.31	4.75	8.22	6.43
Maui	5.27	4.09	7.82	5.73
RNN	5.38	3.46	7.93	5.59
CopyRNN	6.52	5.21	8.04	6.59
CovRNN	6.56	5.24	8.09	6.63
ADGCN	8.13	6.21	7.53	7.29

Table 4. Comparison of the three evaluation metrics for the baseline models and the proposed model on the education topic

Model	Relevance	Informativeness	Coherence	Result
TF-IDF	5.01	4.01	7.17	5.40
TextRank	4.93	4.47	7.34	5.58
Maui	4.31	4.39	5.35	4.68
RNN	4.05	4.60	5.67	4.77
CopyRNN	5.31	4.95	6.26	5.51
CovRNN	5.27	5.02	6.24	5.51
ADGCN	7.91	6.33	6.14	6.79

Table 5. Comparison of the three evaluation metrics for the baseline models and the proposed model on the health topic

Model	Relevance	Informativeness	Coherence	Result
TF-IDF	4.65	4.51	6.95	5.37
TextRank	4.74	4.53	7.04	5.44
Maui	3.43	4.57	4.94	4.31
RNN	2.51	5.08	5.26	4.28
CopyRNN	4.87	5.39	5.31	5.19
CovRNN	4.85	5.47	5.43	5.25
ADGCN	6.97	6.36	5.09	6.14
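
The page does not define how the final result column is computed. The reported values in Tables 3 to 5 are consistent with the arithmetic mean of the three metric scores rounded to two decimals; this is an inference from the numbers, not a definition stated here. The sketch below recomputes the column for the health-topic rows; the function name `mean_score` is hypothetical.

```python
# Health-topic rows copied from Table 5:
# (relevance, informativeness, coherence, reported result)
table5 = {
    "TF-IDF": (4.65, 4.51, 6.95, 5.37),
    "TextRank": (4.74, 4.53, 7.04, 5.44),
    "Maui": (3.43, 4.57, 4.94, 4.31),
    "RNN": (2.51, 5.08, 5.26, 4.28),
    "CopyRNN": (4.87, 5.39, 5.31, 5.19),
    "CovRNN": (4.85, 5.47, 5.43, 5.25),
    "ADGCN": (6.97, 6.36, 5.09, 6.14),
}


def mean_score(relevance, informativeness, coherence):
    """Assumed aggregation: unweighted mean of the three metrics, 2 decimals."""
    return round((relevance + informativeness + coherence) / 3, 2)


recomputed = {model: mean_score(*row[:3]) for model, row in table5.items()}
for model, row in table5.items():
    print(f"{model:9s} reported={row[3]:.2f} recomputed={recomputed[model]:.2f}")
```

Under this reading, ADGCN's lead in the result column comes from relevance and informativeness; its coherence score is below that of TF-IDF and TextRank on all three topics.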

Table 6. Comparison of precision (P), recall (R), and F₁ for the baseline models and the proposed model on the KP20k dataset

Model	P	R	F₁
TF-IDF	0.413	0.052	0.093
TextRank	0.309	0.054	0.092
Maui	0.564	0.125	0.205
RNN	0.581	0.126	0.208
CopyRNN	0.652	0.213	0.321
CovRNN	0.683	0.220	0.333
ADGCN	0.735	0.327	0.453
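
F₁ is the harmonic mean of precision and recall, F₁ = 2PR / (P + R). As a quick consistency check, the sketch below recomputes F₁ from the P and R values in Table 6. The recomputed values agree with the reported ones to within about 0.001; the small gaps may come from rounding of the tabulated P and R or from per-document averaging, which is an assumption since the averaging scheme is not stated here. The function name `f1_score` is hypothetical.

```python
# Rows copied from Table 6: (precision, recall, reported F1)
table6 = {
    "TF-IDF": (0.413, 0.052, 0.093),
    "TextRank": (0.309, 0.054, 0.092),
    "Maui": (0.564, 0.125, 0.205),
    "RNN": (0.581, 0.126, 0.208),
    "CopyRNN": (0.652, 0.213, 0.321),
    "CovRNN": (0.683, 0.220, 0.333),
    "ADGCN": (0.735, 0.327, 0.453),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


for model, (p, r, reported) in table6.items():
    computed = f1_score(p, r)
    print(f"{model:9s} reported={reported:.3f} computed={computed:.3f}")
```

Because recall is far lower than precision for every model, F₁ is dominated by recall; ADGCN's gain over CovRNN comes mostly from the recall increase from 0.220 to 0.327.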

Table 7. Ablation of the ADGCN model

Model	P	R	F₁
ADGCN	0.735	0.327	0.453
Without graph construction layer	0.603	0.329	0.426
Without attention layer	0.667	0.299	0.413
Without topic interaction layer	0.661	0.293	0.406
Without attention layer and topic interaction layer	0.565	0.301	0.393
Without dense connection layer	0.682	0.326	0.441
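
One way to read the ablation table is as the F₁ drop relative to the full model for each removed component. The short sketch below computes those drops from the values in Table 7; the English row labels and variable names are illustrative.

```python
# F1 values copied from Table 7
full_f1 = 0.453
ablations = {
    "without graph construction layer": 0.426,
    "without attention layer": 0.413,
    "without topic interaction layer": 0.406,
    "without attention and topic interaction layers": 0.393,
    "without dense connection layer": 0.441,
}

# Absolute F1 drop relative to the full ADGCN model
drops = {name: round(full_f1 - f1, 3) for name, f1 in ablations.items()}
largest = max(drops, key=drops.get)

for name, drop in sorted(drops.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:48s} F1 drop = {drop:.3f}")
```

Among single-component removals, dropping the topic interaction layer costs the most F₁ and dropping the dense connection layer costs the least. Removing the graph construction layer mainly lowers precision (0.735 to 0.603) while recall stays essentially unchanged.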
