Abstract: With the rapid development of multimedia, aspect-category sentiment analysis based on text alone can no longer reliably identify the sentiment users express. Existing aspect-category sentiment analysis methods for image-text data consider only the interaction between the image and text modalities, ignoring the inconsistency and the correlation between them. This paper therefore proposes a joint aspect attention interaction network (JAAIN) model for image-text aspect-category sentiment identification. To address the inconsistency and correlation of image-text data, the proposed method fuses aspect information with image and text information at multiple levels, removing text and images unrelated to the given aspect and thereby enhancing the sentiment representation of both modalities for that aspect. The text sentiment representation, the image sentiment representation, and the aspect-category sentiment representation are then concatenated, fused, and passed through a fully connected layer to discriminate the sentiment of the image-text pair. Experiments on the Multi-ZOL dataset show that the proposed model improves the performance of image-text aspect-category sentiment discrimination.
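To make the pipeline concrete, the following is a minimal PyTorch sketch of the kind of aspect-guided attention filtering and concatenation-based fusion head the abstract describes. It is an illustration under assumptions rather than the authors' implementation: the module names (AspectGuidedAttention, JAAINHead), the feature dimensions, and the single-query dot-product attention are hypothetical; only the 10-way output matches the label count reported in Table 2.

```python
# Hypothetical sketch of aspect-guided filtering and concatenation fusion,
# not the authors' released JAAIN code. Dimensions and module names are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AspectGuidedAttention(nn.Module):
    """Scores each text token / image feature against the aspect embedding,
    so features unrelated to the given aspect are down-weighted."""

    def __init__(self, feat_dim: int, aspect_dim: int):
        super().__init__()
        self.proj = nn.Linear(aspect_dim, feat_dim)

    def forward(self, feats: torch.Tensor, aspect: torch.Tensor) -> torch.Tensor:
        # feats: (batch, seq, feat_dim); aspect: (batch, aspect_dim)
        query = self.proj(aspect).unsqueeze(1)            # (batch, 1, feat_dim)
        scores = torch.bmm(query, feats.transpose(1, 2))  # (batch, 1, seq)
        weights = F.softmax(scores / feats.size(-1) ** 0.5, dim=-1)
        return torch.bmm(weights, feats).squeeze(1)       # (batch, feat_dim)


class JAAINHead(nn.Module):
    """Concatenates aspect-filtered text and image representations with the
    aspect embedding and classifies sentiment with a fully connected layer."""

    def __init__(self, text_dim=768, image_dim=2048, aspect_dim=300, num_classes=10):
        super().__init__()
        self.text_att = AspectGuidedAttention(text_dim, aspect_dim)
        self.image_att = AspectGuidedAttention(image_dim, aspect_dim)
        self.classifier = nn.Sequential(
            nn.Linear(text_dim + image_dim + aspect_dim, 512),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(512, num_classes),
        )

    def forward(self, text_feats, image_feats, aspect):
        text_repr = self.text_att(text_feats, aspect)     # aspect-relevant text
        image_repr = self.image_att(image_feats, aspect)  # aspect-relevant images
        fused = torch.cat([text_repr, image_repr, aspect], dim=-1)
        return self.classifier(fused)                     # sentiment logits


if __name__ == "__main__":
    # Random features stand in for BERT token embeddings and CNN image features.
    head = JAAINHead()
    text = torch.randn(2, 64, 768)    # 64 text tokens per review
    images = torch.randn(2, 5, 2048)  # 5 image feature vectors per review
    aspect = torch.randn(2, 300)      # embedding of the given aspect
    print(head(text, images, aspect).shape)  # torch.Size([2, 10])
```

Using the aspect embedding as the attention query for both modalities is one straightforward way to down-weight text tokens and image features unrelated to the given aspect before the representations are concatenated and classified.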
Table 1. Examples of inconsistency and correlation in image-text data

Instance 1 (aspect: camera performance; sentiment label: negative; accompanying images omitted)
Text: The phone is quite good. The black finish is cool, the display is nice, and there is an always-on eye-protection mode. It looks beautiful and runs smoothly. The input method takes some getting used to, and the battery definitely needs charging once a day. It is also annoying that the official Taobao flagship store did not include earphones with the purchase. The small battery capacity is its one real shortcoming.

Instance 2 (aspect: camera performance; sentiment label: positive; accompanying images omitted)
Text: 1) The appearance is understated, steady, and restrained, with a strong business feel, well liked by men; 2) the split screen, display, and color perform well at this price point; 3) it offers pose guidance for photos and has a front-facing soft fill light; 4) battery life is good and comfortably lasts a full day; 5) it supports the audio effects of several well-known earphone brands.

Instance 3 (aspect: camera performance; sentiment label: positive; accompanying images omitted)
Text: High specifications and strong performance. A top-grade camera with excellent image quality. Fast charging is not supported, so charging is slow. The rear camera is a major highlight of the iPhone SE, carrying over the top configuration of the iPhone 6s; although it lacks optical image stabilization, it still excites me.
Table 2. Statistics of the Multi-ZOL dataset
Attribute                    Value
Number of reviews            5228
Number of labels             10
Average words per review     315.11
Maximum words per review     8511
Minimum words per review     5
Average images per review    4.5
Maximum images per review    111
Minimum images per review    1
Table 3. Comparative experimental results
Table 4. Results of the ablation experiments
Model                   Precision (%)   F1 (%)
JAAIN                   74.57           74.48
-JAAIN(image)           68.04           67.80
-JAAIN(text)            39.32           30.36
-DAIMA                  72.21           72.20
-ADIMA                  73.20           73.09
-Transformer-encoder    73.27           73.28