Abstract: Facial expression recognition under natural conditions must cope with occlusion, illumination changes, and pose variation, all of which lower recognition accuracy. This paper proposes a deep learning network model for facial expression recognition. The model uses ResNet as its backbone and integrates a bottleneck attention module and a global second-order pooling layer. The bottleneck attention module focuses on extracting the important expression features, and the global second-order pooling layer measures the correlation among expression features. On this basis, a joint normalization strategy is used to balance and improve the distribution of the feature data, which improves recognition accuracy. The proposed method was tested and validated on two public datasets, FER2013 and CK+, where it reached highest accuracies of 74.227% and 95.8%, respectively. It outperforms many existing mainstream methods, which indicates that the model has good accuracy and robustness.
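The abstract states that the global second-order pooling layer measures the correlation among expression features. The following is a minimal sketch of that idea, assuming the second-order statistic is the channel-wise covariance of a feature map; the abstract does not give the exact formulation, so the function name `covariance_pooling` and the toy values are hypothetical and this should not be read as the authors' implementation.

```python
from typing import List


def covariance_pooling(features: List[List[float]]) -> List[List[float]]:
    """Return the C x C covariance matrix of a feature map.

    `features` holds C channels, each flattened to N spatial positions.
    (Hypothetical helper for illustration only.)
    """
    num_channels = len(features)
    num_positions = len(features[0])

    # Center each channel by subtracting its spatial mean.
    centered = []
    for channel in features:
        mean = sum(channel) / num_positions
        centered.append([value - mean for value in channel])

    # Entry (i, j) is the covariance between channel i and channel j.
    cov = [[0.0] * num_channels for _ in range(num_channels)]
    for i in range(num_channels):
        for j in range(num_channels):
            products = (a * b for a, b in zip(centered[i], centered[j]))
            cov[i][j] = sum(products) / num_positions
    return cov


# Toy feature map: 3 channels, 4 spatial positions (made-up values).
toy = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],  # rises together with channel 0
    [4.0, 3.0, 2.0, 1.0],  # falls as channel 0 rises
]
cov = covariance_pooling(toy)
for row in cov:
    print(row)
```

A first-order pooling layer such as global average pooling would keep only one mean per channel. The covariance matrix additionally records how pairs of channels vary together, which is the kind of inter-feature correlation the abstract refers to.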
Table 1. Accuracy (%) of different models on the FER2013 and CK+ datasets
Model         ResNet18            ResNet34            ResNet50
              FER2013   CK+       FER2013   CK+       FER2013   CK+
Baseline      71.190    89.3      72.304    92.8      72.109    92.0
Cov           72.834    93.5      72.973    93.1      72.527    92.5
Cov-Bam       73.057    94.4      73.224    93.6      73.001    93.0
Cov-Bam-FBN   73.614    94.9      73.671    95.1      73.447    93.5
Cov-Bam-IGN   73.419    95.8      73.502    95.5      73.224    93.1
Cov-Bam-BGN   73.670    94.9      74.227    95.1      73.279    93.1
Table 2. Accuracy comparison between the proposed models and state-of-the-art methods on the CK+ dataset
Table 3. Accuracy comparison between the proposed models and state-of-the-art methods on the FER2013 dataset
Table 4. Accuracy comparison between the proposed models and the joint optimization strategy
Model         Accuracy/%
              CK+       FER2013
FBN           91.9      72.973
IGN           92.3      72.834
BGN           91.7      72.889
Cov-Bam-FBN   95.1      73.671
Cov-Bam-IGN   95.5      73.502
Cov-Bam-BGN   95.1      74.227
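The gain that the Cov and Bam components add on top of each strategy can be read off Table 4 by subtraction. The short sketch below does that arithmetic on the values copied from the table; the variable and function names are illustrative and not taken from the paper.

```python
# Accuracy values (%) copied from Table 4, stored as (CK+, FER2013).
table4 = {
    "FBN": (91.9, 72.973),
    "IGN": (92.3, 72.834),
    "BGN": (91.7, 72.889),
    "Cov-Bam-FBN": (95.1, 73.671),
    "Cov-Bam-IGN": (95.5, 73.502),
    "Cov-Bam-BGN": (95.1, 74.227),
}


def gains(table):
    """Return {strategy: (CK+ gain, FER2013 gain)} in percentage points."""
    result = {}
    for name in ("FBN", "IGN", "BGN"):
        base_ck, base_fer = table[name]
        full_ck, full_fer = table["Cov-Bam-" + name]
        # Round to the precision reported in the table to hide float noise.
        result[name] = (round(full_ck - base_ck, 3),
                        round(full_fer - base_fer, 3))
    return result


improvement = gains(table4)
for name, (ck_gain, fer_gain) in improvement.items():
    print(f"{name}: CK+ +{ck_gain:.1f}, FER2013 +{fer_gain:.3f}")
```

By this subtraction the full models gain 3.2 to 3.4 percentage points on CK+ and 0.668 to 1.338 points on FER2013 over the corresponding strategy alone. The Cov-Bam rows in Table 4 match the ResNet34 columns of Table 1, which suggests the comparison was made with the ResNet34 backbone.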