Civil aviation short text combined classification method based on enhanced point-wise graph convolutional networks
Abstract: Most current short-text classification methods mine information inadequately and pay too little attention to local information, which limits classification accuracy. To address this, an enhanced semantic-syntactic point-wise graph convolutional network (ESS-PWGCN) is proposed as a few-shot, semi-supervised combined classification model for civil aviation short texts. First, high-confidence keyword information is selected from the training set to enrich and strengthen the expression of key information in civil aviation short texts and to broaden the model's applicability. Second, point-wise convolution is combined with a graph convolutional network (GCN), and a multi-head attention mechanism is introduced, to learn the semantic-syntactic information of civil aviation short texts while balancing the influence weights of global and local information in the text graph. Then, a fully connected layer fuses the acquired information and outputs the classification result. Finally, experiments are conducted on a civil aviation dataset and on public datasets from other domains. The results show that, compared with the current state-of-the-art self-training text graph convolutional network (ST-TextGCN) model, ESS-PWGCN improves accuracy and F1 score by 4.59% and 6.53%, respectively, and exhibits better robustness and generalizability.
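As a rough illustration of two building blocks named in the abstract, the sketch below implements one standard GCN propagation step (symmetric normalization with self-loops) and a point-wise (1x1) convolution in plain Python. It is not the authors' ESS-PWGCN architecture: how the paper wires these blocks together, the multi-head attention, and the keyword enhancement are not described here, and the toy graph, feature sizes, weights, and function names are all assumptions made for the example.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2, the normalization commonly used in GCNs."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def gcn_layer(adj_norm, features, weight):
    """One graph convolution: aggregate neighbor features, project, apply ReLU."""
    projected = matmul(matmul(adj_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in projected]

def pointwise_conv(features, weight):
    """Point-wise (1x1) convolution: the same channel-mixing map applied to each node alone."""
    return matmul(features, weight)

# Toy text graph (assumed): three nodes connected in a path 0 - 1 - 2.
adjacency = [[0.0, 1.0, 0.0],
             [1.0, 0.0, 1.0],
             [0.0, 1.0, 0.0]]
features = [[1.0, 0.0],
            [0.0, 1.0],
            [1.0, 1.0]]        # 3 nodes x 2 input channels
w_gcn = [[0.5, -0.5],
         [0.25, 0.75]]         # hypothetical weights, 2 -> 2 channels
w_point = [[1.0], [-1.0]]      # hypothetical weights, 2 -> 1 channel

adj_norm = normalize_adjacency(adjacency)
gcn_out = gcn_layer(adj_norm, features, w_gcn)       # mixes information across neighboring nodes
pointwise_out = pointwise_conv(features, w_point)    # uses each node's own features only
print(gcn_out)
print(pointwise_out)
```

The graph convolution draws on a node's neighborhood, while the point-wise convolution transforms each node in isolation; a model that uses both has access to neighborhood-level and node-level signals.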
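Table 1. Dataset information

| Dataset | Number of classes | Total samples | Test set | Training set | Validation set | Average number of words |
|---|---|---|---|---|---|---|
| ASRS | 10 | 8865 | 7979 | 798 | 88 | 17.4689 |
| Ohsumed | 23 | 7400 | 6660 | 666 | 74 | 11.9296 |
| Snippets | 8 | 7370 | 6633 | 664 | 73 | 15.5438 |
| MR | 2 | 8000 | 7200 | 720 | 80 | 20.9989 |
| Laptop | 3 | 3532 | 3179 | 318 | 35 | 20.3347 |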
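The few-sample setting can be read directly off the split sizes in Table 1. The short check below, a sketch with a hypothetical helper name, confirms that each dataset's splits sum to its total and computes the share of data in each split; the counts are copied from the table.

```python
# Rows copied from Table 1: (total, test, training, validation) sample counts.
splits = {
    "ASRS": (8865, 7979, 798, 88),
    "Ohsumed": (7400, 6660, 666, 74),
    "Snippets": (7370, 6633, 664, 73),
    "MR": (8000, 7200, 720, 80),
    "Laptop": (3532, 3179, 318, 35),
}

def split_fractions(total, test, train, val):
    """Return the share of the data in each split after checking the counts add up."""
    if test + train + val != total:
        raise ValueError("split sizes do not sum to the total")
    return {"test": test / total, "train": train / total, "val": val / total}

fractions = {name: split_fractions(*row) for name, row in splits.items()}
for name, frac in fractions.items():
    print(f"{name:9s} train={frac['train']:.3f} val={frac['val']:.3f} test={frac['test']:.3f}")
```

In every dataset roughly 9% of the samples are used for training and about 1% for validation, with the remaining 90% or so held out for testing.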
Table 2. Comparison of experimental results of different models on the ASRS dataset

| Model | Equipment_Tooling | Weather | Manuals | Human_Factors | Company_Policy | Procedure | Ambiguous | Airport | Chart_Or_Publication | Environment-Non_Weather_Related | ACC | F1 | T/s |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ESS-PWGCN | 273 | 626 | 125 | 496 | 672 | 278 | 419 | 487 | 471 | 438 | 0.5370 | 0.5304 | 18.6174 |
| ST-TextGCN [23] | 0 | 674 | 0 | 420 | 865 | 15 | 197 | 699 | 542 | 116 | 0.4422 | 0.3437 | 1.3928 |
| Mean [26] | 0 | 740 | 0 | 333 | 212 | 0 | 5 | 222 | 432 | 152 | 0.2627 | 0.2442 | 1.1596 |
| Transformer [27] | 105 | 624 | 110 | 401 | 476 | 345 | 231 | 757 | 333 | 160 | 0.4439 | 0.3543 | 23.9254 |
| TextCNN [28] | 50 | 684 | 82 | 351 | 551 | 260 | 89 | 431 | 355 | 181 | 0.3802 | 0.3341 | 5.0154 |
| TextRNN [29] | 29 | 726 | 125 | 328 | 540 | 203 | 109 | 703 | 264 | 70 | 0.3881 | 0.3750 | 9.4696 |
| TextING [30] | 173 | 662 | 67 | 409 | 464 | 205 | 338 | 757 | 160 | 97 | 0.4209 | 0.3824 | 64.5670 |
| TextGCN [13] | 32 | 618 | 36 | 531 | 626 | 182 | 255 | 620 | 551 | 249 | 0.4637 | 0.4077 | 4.8735 |

The ten category columns give the classification results per category.
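The per-category columns of Table 2 appear to count correctly classified test samples. This reading is an inference rather than something the table states: for the two rows checked below, the counts sum to the reported ACC multiplied by the ASRS test-set size from Table 1 (7979). The sketch recomputes accuracy under that assumption; the helper name is hypothetical.

```python
TEST_SET_SIZE = 7979  # ASRS test-set size, taken from Table 1

# Per-category counts copied from Table 2 (assumed to be correct predictions).
per_category_counts = {
    "ESS-PWGCN": [273, 626, 125, 496, 672, 278, 419, 487, 471, 438],
    "ST-TextGCN": [0, 674, 0, 420, 865, 15, 197, 699, 542, 116],
}
reported_acc = {"ESS-PWGCN": 0.5370, "ST-TextGCN": 0.4422}

def accuracy_from_counts(counts, n_samples):
    """Accuracy as total correct predictions divided by the number of test samples."""
    return sum(counts) / n_samples

recomputed_acc = {
    model: accuracy_from_counts(counts, TEST_SET_SIZE)
    for model, counts in per_category_counts.items()
}
for model, acc in recomputed_acc.items():
    print(f"{model}: recomputed {acc:.4f}, reported {reported_acc[model]:.4f}")
```

Under this reading, the zeros in the ST-TextGCN row would mean that no Equipment_Tooling or Manuals sample was classified correctly, which is consistent with its much lower F1 than ACC.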
Table 3. Comparison of generalization experiment results of different models on different datasets

| Model | ACC (ASRS) | ACC (Ohsumed) | ACC (MR) | ACC (Laptop) | ACC (Snippets) | F1 (ASRS) | F1 (Ohsumed) | F1 (MR) | F1 (Laptop) | F1 (Snippets) |
|---|---|---|---|---|---|---|---|---|---|---|
| Mean [26] | 0.2970 | 0.1797 | 0.6735 | 0.5408 | 0.6555 | 0.1008 | 0 | 0.5589 | 0.2606 | 0.5871 |
| Transformer [27] | 0.4909 | 0.2910 | 0.7207 | 0.6499 | 0.8173 | 0.3847 | 0.1000 | 0.6390 | 0.4981 | 0.7598 |
| TextCNN [28] | 0.4334 | 0.2303 | 0.7306 | 0.6150 | 0.8022 | 0.2864 | 0.1000 | 0.7519 | 0.3864 | 0.7446 |
| TextRNN [29] | 0.4348 | 0.2356 | 0.7307 | 0.6266 | 0.7993 | 0.2260 | 0 | 0.6667 | 0.4206 | 0.7551 |
| TextGCN [13] | 0.4651 | 0.3036 | 0.6517 | 0.6172 | 0.7096 | 0.4271 | 0.1489 | 0.6517 | 0.4868 | 0.7032 |
| TextING [30] | 0.4383 | 0.4096 | 0.7186 | 0.6631 | 0.8415 | 0.4106 | 0.2431 | 0.7282 | 0.5254 | 0.8402 |
| ST-TextGCN [23] | 0.4750 | 0.4209 | 0.7021 | 0.6625 | 0.8777 | 0.3940 | 0.2696 | 0.7021 | 0.5330 | 0.8767 |
| ESS-PWGCN | 0.5370 | 0.4658 | 0.7575 | 0.7005 | 0.9071 | 0.5304 | 0.3226 | 0.7559 | 0.5874 | 0.9056 |
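The 4.59% and 6.53% gains quoted in the abstract match the mean difference between ESS-PWGCN and ST-TextGCN across the five datasets in Table 3, expressed in percentage points. The source does not say how the figures were derived, so this is an inference from the numbers; the sketch below reproduces the arithmetic with values copied from the table.

```python
# Values copied from Table 3, in the order ASRS, Ohsumed, MR, Laptop, Snippets.
acc = {
    "ESS-PWGCN": [0.5370, 0.4658, 0.7575, 0.7005, 0.9071],
    "ST-TextGCN": [0.4750, 0.4209, 0.7021, 0.6625, 0.8777],
}
f1 = {
    "ESS-PWGCN": [0.5304, 0.3226, 0.7559, 0.5874, 0.9056],
    "ST-TextGCN": [0.3940, 0.2696, 0.7021, 0.5330, 0.8767],
}

def mean_gain_points(scores, model, baseline):
    """Mean per-dataset difference between two models, in percentage points."""
    diffs = [m - b for m, b in zip(scores[model], scores[baseline])]
    return 100.0 * sum(diffs) / len(diffs)

acc_gain = mean_gain_points(acc, "ESS-PWGCN", "ST-TextGCN")
f1_gain = mean_gain_points(f1, "ESS-PWGCN", "ST-TextGCN")
print(f"mean ACC gain: {acc_gain:.2f} points")  # 4.59
print(f"mean F1 gain:  {f1_gain:.2f} points")   # 6.53
```

Read this way, the abstract's percentages are absolute differences in averaged scores, not relative improvements.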
[1] LI B H, XIANG Y X, FENG D, et al. Short text classification model combining knowledge aware and dual attention[J]. Journal of Software, 2022, 33(10): 3565-3581 (in Chinese).
[2] ZHANG Y, JIN R, ZHOU Z H. Understanding bag-of-words model: a statistical framework[J]. International Journal of Machine Learning and Cybernetics, 2010, 1(1): 43-52.
[3] BLEI D M, NG A Y, JORDAN M I. Latent Dirichlet allocation[J]. Journal of Machine Learning Research, 2003, 3: 993-1022.
[4] JOULIN A, GRAVE E, BOJANOWSKI P, et al. Bag of tricks for efficient text classification[EB/OL]. (2016-07-06)[2024-04-05]. https://arxiv.org/abs/1607.01759.
[5] LIANG H, SUN X, SUN Y L, et al. Text feature extraction based on deep learning: a review[J]. EURASIP Journal on Wireless Communications and Networking, 2017, 2017(1): 211.
[6] XU Y, HONG K, TSUJII J, et al. Feature engineering combined with machine learning and rule-based methods for structured information extraction from narrative clinical discharge summaries[J]. Journal of the American Medical Informatics Association, 2012, 19(5): 824-832.
[7] JIA B H, JIANG F, WANG Y X, et al. Fault diagnosis method based on civil aircraft maintenance text data[J]. Acta Aeronautica et Astronautica Sinica, 2023, 44(5): 253-267 (in Chinese).
[8] JIANG Y L, WANG Q P, ZHANG X T, et al. Semi-supervised short text classification based on gated double-layer heterogeneous graph attention network[J]. Pattern Recognition and Artificial Intelligence, 2023, 36(7): 602-612 (in Chinese).
[9] WAN J S, WU Y Z. Review of text classification research based on deep learning[J]. Journal of Tianjin University of Technology, 2021, 37(2): 41-47 (in Chinese).
[10] RAMACHANDRAN D, PARVATHI R. Enhanced classification of crisis related tweets using deep learning models and word embeddings[J]. International Journal of Web Engineering and Technology, 2021, 16(2): 158-186.
[11] WU Y, ZHAO S, LI W. Phrase2Vec: phrase embedding based on parsing[J]. Information Sciences, 2020, 517: 100-127.
[12] HUANG Y R, CHEN J J, ZHENG S M, et al. Hierarchical multi-attention networks for document classification[J]. International Journal of Machine Learning and Cybernetics, 2021, 12(6): 1639-1647.
[13] YAO L, MAO C S, LUO Y. Graph convolutional networks for text classification[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2019, 33(1): 7370-7377.
[14] LIU X E, YOU X X, ZHANG X, et al. Tensor graph convolutional networks for text classification[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2020, 34(5): 8409-8416.
[15] YANG T C, HU L M, SHI C, et al. HGAT: heterogeneous graph attention networks for semi-supervised short text classification[J]. ACM Transactions on Information Systems (TOIS), 2021, 39(3): 1-29.
[16] WU M Q. Commonsense knowledge powered heterogeneous graph attention networks for semi-supervised short text classification[J]. Expert Systems with Applications, 2023, 232: 120800.
[17] JI Z Y, KONG D Y, YANG Y Y, et al. ASSL-HGAT: active semi-supervised learning empowered heterogeneous graph attention network[J]. Knowledge-Based Systems, 2024, 290: 111567.
[18] DEFFERRARD M, BRESSON X, VANDERGHEYNST P. Convolutional neural networks on graphs with fast localized spectral filtering[EB/OL]. (2022-04-04)[2024-04-06]. https://arxiv.org/pdf/1606.09375v1.
[19] ZHOU J, HUANG J X, HU Q V, et al. SK-GCN: modeling syntax and knowledge via graph convolutional network for aspect-level sentiment classification[J]. Knowledge-Based Systems, 2020, 205: 106292.
[20] ZHANG C, LI Q C, SONG D W. Aspect-based sentiment classification with aspect-specific graph convolutional networks[EB/OL]. (2019-09-08)[2024-04-06]. https://arxiv.org/abs/1909.03477.
[21] CHEN P, SUN Z Q, BING L D, et al. Recurrent attention network on memory for aspect sentiment analysis[C]//Proceedings of the Conference on Empirical Methods in Natural Language Processing. Kerrville: Association for Computational Linguistics, 2017: 452-461.
[22] LI X, BING L D, LAM W, et al. Transformation networks for target-oriented sentiment classification[EB/OL]. (2018-03-03)[2024-04-06]. https://arxiv.org/abs/1805.01086.
[23] CUI H Y, WANG G K, LI Y X, et al. Self-training method based on GCN for semi-supervised short text classification[J]. Information Sciences, 2022, 611: 18-29.
[24] PANG B, LEE L. Seeing stars: exploiting class relationships for sentiment categorization with respect to rating scales[C]//Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05). Kerrville: Association for Computational Linguistics, 2005: 115-124.
[25] LIN Y X, MENG Y X, SUN X F, et al. BertGCN: transductive text classification by combining GCN and BERT[EB/OL]. (2021-03-12)[2024-04-06]. https://arxiv.org/abs/2105.05727.
[26] SUBAKTI A, MURFI H, HARIADI N. The performance of BERT as data representation of text clustering[J]. Journal of Big Data, 2022, 9(1): 15.
[27] DAI Z H, YANG Z L, YANG Y M, et al. Transformer-XL: attentive language models beyond a fixed-length context[EB/OL]. (2019-01-09)[2024-04-06]. https://arxiv.org/abs/1901.02860.
[28] KIM Y. Convolutional neural networks for sentence classification[EB/OL]. (2014-08-25)[2024-04-06]. https://arxiv.org/abs/1408.5882.
[29] LIU P F, QIU X P, HUANG X J. Recurrent neural network for text classification with multi-task learning[EB/OL]. (2016-03-17)[2024-04-06]. https://arxiv.org/abs/1605.05101.
[30] ZHANG Y F, YU X L, CUI Z Y, et al. Every document owns its structure: inductive text classification via graph neural networks[EB/OL]. (2020-04-22)[2024-04-06]. https://arxiv.org/abs/2004.13826.

