Civil aviation short text combined classification method based on enhanced point-wise graph convolutional networks

LIU Xiaolin (刘晓琳); SONG Yingying (宋营营); LI Zhuo (李卓)

doi:10.13700/j.bh.1001-5965.2024.0223

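1. College of Electronic Information and Automation, Civil Aviation University of China, Tianjin 300300, China
2. School of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China

Funding: Natural Science Foundation of Tianjin, China (17JCYBJC18200)

Corresponding author: E-mail: caucyanjiusheng@163.com

CLC number: V221⁺.3; TB553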
Publication history
- Received: 2024-04-16
- Accepted: 2024-07-12
- Published online: 2024-08-13
- Issue publication date: 2026-06-30

Abstract:
Most current short text classification methods mine information insufficiently and pay too little attention to local information, which limits classification accuracy. To address this, a few-shot, semi-supervised combined classification model for civil aviation short texts is proposed: the enhanced semantic-syntactic point-wise graph convolutional network (ESS-PWGCN). First, high-confidence keywords are selected from the training set to enrich and strengthen the expression of key information in civil aviation short texts, which also broadens the model's range of application. Second, point-wise convolution is combined with graph convolutional networks (GCN), and a multi-head attention mechanism is introduced, to learn the semantic-syntactic information of civil aviation short texts while balancing the weights of global and local information in the text graph. Third, a fully connected layer fuses the extracted information and outputs the classification result. Finally, experiments are conducted on a civil aviation dataset and on public datasets from other domains. The results show that, compared with the state-of-the-art self-training text graph convolutional network (ST-TextGCN) model, ESS-PWGCN improves accuracy and F₁ score by 4.59% and 6.53%, respectively, and exhibits better robustness and generalization.

Keywords:
- short text classification
- deep learning
- graph convolutional networks
- point-wise convolution
- attention mechanisms
- long short-term memory networks
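
The abstract names the model's building blocks without giving their equations. As a rough illustration only, the sketch below applies a point-wise (1×1) convolution, which is a single linear map shared by every node, and then one graph convolution using the commonly used symmetric normalization D^{-1/2}(A + I)D^{-1/2}. The toy four-node graph, the feature sizes, the random weights, and all function names are assumptions made for this example and are not taken from the paper; the multi-head attention and keyword screening steps are omitted.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, adding self-loops before normalizing."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def pointwise_conv(x, weight, bias):
    """1x1 convolution: one shared linear map applied to each node's features."""
    return [[v + b for v, b in zip(row, bias)] for row in matmul(x, weight)]


def gcn_layer(a_norm, x, weight):
    """One graph convolution: aggregate over neighbours, transform, apply ReLU."""
    return [[max(v, 0.0) for v in row]
            for row in matmul(matmul(a_norm, x), weight)]


def random_matrix(rows, cols, rng):
    """Hypothetical weight initializer: independent standard normal entries."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)

# Toy text graph (assumption): four nodes connected in a chain 0-1-2-3.
adj = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]

features = random_matrix(4, 8, rng)   # 4 nodes, 8 input features each
w_pw = random_matrix(8, 8, rng)       # point-wise convolution weights
b_pw = [0.0] * 8
w_gcn = random_matrix(8, 3, rng)      # graph convolution weights, 3 outputs

a_norm = normalize_adjacency(adj)
local = pointwise_conv(features, w_pw, b_pw)   # per-node (local) transform
hidden = gcn_layer(a_norm, local, w_gcn)       # neighbourhood aggregation

print(len(hidden), len(hidden[0]))  # 4 3
```

The point-wise step mixes feature channels without looking at neighbours, while the graph convolution mixes information along edges, which is one plausible way to read the local versus global balance described in the abstract.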

Figure 1. Model architecture of ESS-PWGCN
Figure 2. Classification of graph structures
Figure 3. Analysis of syntactic dependency tree relationships
Figure 4. Classification accuracy versus number of iterations
Figure 5. Classification F₁ score versus number of iterations
Figure 6. Category distribution of the training set
Figure 7. Effect of the keyword confidence threshold on classification performance
Figure 8. Effect of the number of attention heads on classification performance
Figure 9. Ablation results for the keyword screening module
Figure 10. Ablation results for the point-wise convolution module
Figure 11. Accuracy of different models on different datasets
Figure 12. F₁ scores of different models on different datasets

Table 1. Dataset information

Dataset	Classes	Total samples	Test set	Training set	Validation set	Average words per text
ASRS	10	8865	7979	798	88	17.4689
Ohsumed	23	7400	6660	666	74	11.9296
Snippets	8	7370	6633	664	73	15.5438
MR	2	8000	7200	720	80	20.9989
Laptop	3	3532	3179	318	35	20.3347
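
The counts in Table 1 imply a split of roughly 90% test, 9% training, and 1% validation for every dataset, which fits the few-shot, semi-supervised setting described in the abstract. The short check below recomputes those shares from the table; the variable and function names are illustrative only.

```python
# (dataset, total, test, train, validation) copied from Table 1
table1 = [
    ("ASRS", 8865, 7979, 798, 88),
    ("Ohsumed", 7400, 6660, 666, 74),
    ("Snippets", 7370, 6633, 664, 73),
    ("MR", 8000, 7200, 720, 80),
    ("Laptop", 3532, 3179, 318, 35),
]


def split_shares(total, test, train, val):
    """Return the test/train/validation shares in percent, rounded to 0.1."""
    # The three subsets should partition the dataset exactly.
    assert test + train + val == total
    return tuple(round(100 * part / total, 1) for part in (test, train, val))


for name, total, test, train, val in table1:
    print(name, split_shares(total, test, train, val))
```

Each row prints `(90.0, 9.0, 1.0)`, so only about one sample in eleven carries a label used for training or validation.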


Table 2. Comparison of experimental results of different models on the ASRS dataset

Columns Equipment_Tooling through Environment-Non_Weather_Related give the classification results per category; A_CC is accuracy.

Model	Equipment_Tooling	Weather	Manuals	Human_Factors	Company_Policy	Procedure	Ambiguous	Airport	Chart_Or_Publication	Environment-Non_Weather_Related	A_CC	F₁	T/s
ESS-PWGCN	273	626	125	496	672	278	419	487	471	438	0.5370	0.5304	18.6174
ST-TextGCN^[23]	0	674	0	420	865	15	197	699	542	116	0.4422	0.3437	1.3928
Mean^[26]	0	740	0	333	212	0	5	222	432	152	0.2627	0.2442	1.1596
Transformer^[27]	105	624	110	401	476	345	231	757	333	160	0.4439	0.3543	23.9254
TextCNN^[28]	50	684	82	351	551	260	89	431	355	181	0.3802	0.3341	5.0154
TextRNN^[29]	29	726	125	328	540	203	109	703	264	70	0.3881	0.3750	9.4696
TextING^[30]	173	662	67	409	464	205	338	757	160	97	0.4209	0.3824	64.5670
TextGCN^[13]	32	618	36	531	626	182	255	620	551	249	0.4637	0.4077	4.8735
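
One way to read the per-category columns of Table 2 is as counts of correctly classified test samples. Under that assumption, which the source does not state explicitly, the accuracy should equal the row sum divided by the ASRS test set size from Table 1 (7979). The sketch below checks this for three rows; the names are illustrative.

```python
TEST_SET_SIZE = 7979  # ASRS test set size, taken from Table 1

# model -> (per-category counts from Table 2, reported A_CC)
rows = {
    "ESS-PWGCN": ([273, 626, 125, 496, 672, 278, 419, 487, 471, 438], 0.5370),
    "ST-TextGCN": ([0, 674, 0, 420, 865, 15, 197, 699, 542, 116], 0.4422),
    "TextGCN": ([32, 618, 36, 531, 626, 182, 255, 620, 551, 249], 0.4637),
}


def accuracy_from_counts(counts, n_test=TEST_SET_SIZE):
    """Accuracy as (sum of per-category counts) / (number of test samples)."""
    return sum(counts) / n_test


for model, (counts, reported) in rows.items():
    computed = accuracy_from_counts(counts)
    print(f"{model}: computed {computed:.4f}, reported {reported:.4f}")
```

For these three rows the computed and reported values agree to four decimal places, which supports the reading of the columns as correct-prediction counts.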


Table 3. Comparison of generalization results of different models on different datasets

Model	A_CC (ASRS)	A_CC (Ohsumed)	A_CC (MR)	A_CC (Laptop)	A_CC (Snippets)	F₁ (ASRS)	F₁ (Ohsumed)	F₁ (MR)	F₁ (Laptop)	F₁ (Snippets)
Mean^[26]	0.2970	0.1797	0.6735	0.5408	0.6555	0.1008	0	0.5589	0.2606	0.5871
Transformer^[27]	0.4909	0.2910	0.7207	0.6499	0.8173	0.3847	0.1000	0.6390	0.4981	0.7598
TextCNN^[28]	0.4334	0.2303	0.7306	0.6150	0.8022	0.2864	0.1000	0.7519	0.3864	0.7446
TextRNN^[29]	0.4348	0.2356	0.7307	0.6266	0.7993	0.2260	0	0.6667	0.4206	0.7551
TextGCN^[13]	0.4651	0.3036	0.6517	0.6172	0.7096	0.4271	0.1489	0.6517	0.4868	0.7032
TextING^[30]	0.4383	0.4096	0.7186	0.6631	0.8415	0.4106	0.2431	0.7282	0.5254	0.8402
ST-TextGCN^[23]	0.4750	0.4209	0.7021	0.6625	0.8777	0.3940	0.2696	0.7021	0.5330	0.8767
ESS-PWGCN	0.5370	0.4658	0.7575	0.7005	0.9071	0.5304	0.3226	0.7559	0.5874	0.9056
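
The 4.59% and 6.53% improvements quoted in the abstract appear to correspond to the mean gain, in percentage points, of ESS-PWGCN over ST-TextGCN across the five datasets in Table 3. This reading is an inference from the table values rather than something the source states; the sketch below reproduces the arithmetic with illustrative names.

```python
# Values copied from Table 3, in the order ASRS, Ohsumed, MR, Laptop, Snippets
acc = {
    "ST-TextGCN": [0.4750, 0.4209, 0.7021, 0.6625, 0.8777],
    "ESS-PWGCN": [0.5370, 0.4658, 0.7575, 0.7005, 0.9071],
}
f1 = {
    "ST-TextGCN": [0.3940, 0.2696, 0.7021, 0.5330, 0.8767],
    "ESS-PWGCN": [0.5304, 0.3226, 0.7559, 0.5874, 0.9056],
}


def mean_gain(baseline, proposed):
    """Average per-dataset difference (proposed minus baseline)."""
    return sum(p - b for p, b in zip(proposed, baseline)) / len(baseline)


acc_gain = 100 * mean_gain(acc["ST-TextGCN"], acc["ESS-PWGCN"])
f1_gain = 100 * mean_gain(f1["ST-TextGCN"], f1["ESS-PWGCN"])

print(f"mean accuracy gain: {acc_gain:.2f} percentage points")  # 4.59
print(f"mean F1 gain: {f1_gain:.2f} percentage points")         # 6.53
```

Read this way, the headline figures are absolute differences averaged over datasets, not relative improvements on any single dataset.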


References (30)

[1]	LI B H, XIANG Y X, FENG D, et al. Short text classification model combining knowledge aware and dual attention[J]. Journal of Software, 2022, 33(10): 3565-3581 (in Chinese).
[2]	ZHANG Y, JIN R, ZHOU Z H. Understanding bag-of-words model: a statistical framework[J]. International Journal of Machine Learning and Cybernetics, 2010, 1(1): 43-52.
[3]	BLEI D M, NG A Y, JORDAN M I. Latent Dirichlet allocation[J]. Journal of Machine Learning Research, 2003, 3: 993-1022.
[4]	JOULIN A, GRAVE E, BOJANOWSKI P, et al. Bag of tricks for efficient text classification[EB/OL]. (2016-07-06)[2024-04-05]. https://arxiv.org/abs/1607.01759.
[5]	LIANG H, SUN X, SUN Y L, et al. Text feature extraction based on deep learning: a review[J]. EURASIP Journal on Wireless Communications and Networking, 2017, 2017(1): 211.
[6]	XU Y, HONG K, TSUJII J, et al. Feature engineering combined with machine learning and rule-based methods for structured information extraction from narrative clinical discharge summaries[J]. Journal of the American Medical Informatics Association, 2012, 19(5): 824-832.
[7]	JIA B H, JIANG F, WANG Y X, et al. Fault diagnosis method based on civil aircraft maintenance text data[J]. Acta Aeronautica et Astronautica Sinica, 2023, 44(5): 253-267 (in Chinese).
[8]	JIANG Y L, WANG Q P, ZHANG X T, et al. Semi-supervised short text classification based on gated double-layer heterogeneous graph attention network[J]. Pattern Recognition and Artificial Intelligence, 2023, 36(7): 602-612 (in Chinese).
[9]	WAN J S, WU Y Z. Review of text classification research based on deep learning[J]. Journal of Tianjin University of Technology, 2021, 37(2): 41-47 (in Chinese).
[10]	RAMACHANDRAN D, PARVATHI R. Enhanced classification of crisis related tweets using deep learning models and word embeddings[J]. International Journal of Web Engineering and Technology, 2021, 16(2): 158-186.
[11]	WU Y, ZHAO S, LI W. Phrase2Vec: phrase embedding based on parsing[J]. Information Sciences, 2020, 517: 100-127.
[12]	HUANG Y R, CHEN J J, ZHENG S M, et al. Hierarchical multi-attention networks for document classification[J]. International Journal of Machine Learning and Cybernetics, 2021, 12(6): 1639-1647.
[13]	YAO L, MAO C S, LUO Y. Graph convolutional networks for text classification[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2019, 33(1): 7370-7377.
[14]	LIU X E, YOU X X, ZHANG X, et al. Tensor graph convolutional networks for text classification[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2020, 34(5): 8409-8416.
[15]	YANG T C, HU L M, SHI C, et al. HGAT: heterogeneous graph attention networks for semi-supervised short text classification[J]. ACM Transactions on Information Systems(TOIS), 2021, 39(3): 1-29.
[16]	WU M Q. Commonsense knowledge powered heterogeneous graph attention networks for semi-supervised short text classification[J]. Expert Systems with Applications, 2023, 232: 120800.
[17]	JI Z Y, KONG D Y, YANG Y Y, et al. ASSL-HGAT: active semi-supervised learning empowered heterogeneous graph attention network[J]. Knowledge-Based Systems, 2024, 290: 111567.
[18]	DEFFERRARD M, BRESSON X, VANDERGHEYNST P. Convolutional neural networks on graphs with fast localized spectral filtering[EB/OL]. (2022-04-04)[2024-04-06]. https://arxiv.org/pdf/1606.09375v1.
[19]	ZHOU J, HUANG J X, HU Q V, et al. SK-GCN: modeling syntax and knowledge via graph convolutional network for aspect-level sentiment classification[J]. Knowledge-Based Systems, 2020, 205: 106292.
[20]	ZHANG C, LI Q C, SONG D W. Aspect-based sentiment classification with aspect-specific graph convolutional networks[EB/OL]. (2019-09-08)[2024-04-06]. https://arxiv.org/abs/1909.03477.
[21]	CHEN P, SUN Z Q, BING L D, et al. Recurrent attention network on memory for aspect sentiment analysis[C]//Proceedings of the Conference on Empirical Methods in Natural Language Processing. Kerrville: Association for Computational Linguistics, 2017: 452-461.
[22]	LI X, BING L D, LAM W, et al. Transformation networks for target-oriented sentiment classification[EB/OL]. (2018-03-03)[2024-04-06]. https://arxiv.org/abs/1805.01086.
[23]	CUI H Y, WANG G K, LI Y X, et al. Self-training method based on GCN for semi-supervised short text classification[J]. Information Sciences, 2022, 611: 18-29.
[24]	PANG B, LEE L. Seeing stars: exploiting class relationships for sentiment categorization with respect to rating scales[C]//Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05). Kerrville: Association for Computational Linguistics, 2005: 115-124.
[25]	LIN Y X, MENG Y X, SUN X F, et al. BertGCN: transductive text classification by combining GCN and BERT[EB/OL]. (2021-03-12)[2024-04-06]. https://arxiv.org/abs/2105.05727.
[26]	SUBAKTI A, MURFI H, HARIADI N. The performance of BERT as data representation of text clustering[J]. Journal of Big Data, 2022, 9(1): 15.
[27]	DAI Z H, YANG Z L, YANG Y M, et al. Transformer-XL: attentive language models beyond a fixed-length context[EB/OL]. (2019-01-09)[2024-04-06]. https://arxiv.org/abs/1901.02860.
[28]	KIM Y. Convolutional neural networks for sentence classification[EB/OL]. (2014-08-25)[2024-04-06]. https://arxiv.org/abs/1408.5882.
[29]	LIU P F, QIU X P, HUANG X J. Recurrent neural network for text classification with multi-task learning[EB/OL]. (2016-03-17)[2024-04-06]. https://arxiv.org/abs/1605.05101.
[30]	ZHANG Y F, YU X L, CUI Z Y, et al. Every document owns its structure: inductive text classification via graph neural networks[EB/OL]. (2020-04-22)[2024-04-06]. https://arxiv.org/abs/2004.13826.