Multi-scale depthwise separable convolution facial expression recognition embedded in attention mechanism
Abstract: In facial expression recognition, feature extraction with traditional machine learning methods is relatively complex, shallow convolutional neural networks achieve low recognition rates, and deep convolutional neural networks are prone to exploding or vanishing gradients. To address these problems, this paper constructs a multi-scale depthwise separable expression recognition network based on a residual network with an embedded attention mechanism. Expression features at different scales are extracted by stacking multiple layers of multi-scale depthwise separable residual units, and the CBAM attention mechanism is used to screen these features, increasing the weight given to effective expression features and weakening the influence of noise in the training data. The proposed network achieves accuracies of 73.89% and 97.47% on the Fer-2013 and CK+ expression datasets, respectively, which indicates that it has strong generalization ability.
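As a rough illustration of why depthwise separable convolutions are attractive in a stacked residual network, the following sketch compares weight counts for a standard convolution and its depthwise separable counterpart. The kernel sizes used for the multi-scale branches (3 and 5) are assumptions made for this example and are not stated in the abstract; the channel width of 64 matches Basic Block-1 in Table 1. Biases are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical kernel sizes for the multi-scale branches (not from the paper).
KERNEL_SIZES = (3, 5)
# Channel widths of Basic Block-1 as listed in Table 1.
C_IN, C_OUT = 64, 64

for k in KERNEL_SIZES:
    std = standard_conv_params(k, C_IN, C_OUT)
    sep = depthwise_separable_params(k, C_IN, C_OUT)
    print(f"{k}x{k}: standard={std}, separable={sep}, ratio={sep / std:.3f}")
```

Under these assumptions the separable form needs only a small fraction of the weights, and the saving grows with kernel size, which is what makes larger-kernel branches affordable in a multi-scale unit.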
Table 1. Feature parameters of expression recognition network

| Layer | Input | Output |
|---|---|---|
| Conv2d | 44×44×3 | 44×44×64 |
| Basic Block-1 | 44×44×64 | 44×44×64 |
| Basic Block-2 | 22×22×128 | 22×22×128 |
| Basic Block-3 | 11×11×256 | 11×11×256 |
| Basic Block-4 | 6×6×512 | 6×6×512 |
| Global Average Pooling | 6×6×512 | 1×1×512 |
| FC+Softmax | 1×1×512 | 1×1×7 |

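The spatial sizes in Table 1 (44, 22, 11, 6) are consistent with halving the resolution between consecutive blocks. A minimal sketch of that arithmetic follows, assuming each block after the first downsamples with a 3×3, stride-2, padding-1 convolution; the table does not say how the downsampling is performed, so the kernel, stride, and padding values here are illustrative only.

```python
def conv_out_size(n, kernel=3, stride=1, padding=1):
    """Spatial output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1


def trace_shapes(size=44, channels=(64, 128, 256, 512)):
    """Return (height, width, channels) after each basic block."""
    shapes = []
    for i, c in enumerate(channels):
        # Assumption: every block after the first halves the resolution
        # with a stride-2, 3x3, padding-1 convolution.
        stride = 1 if i == 0 else 2
        size = conv_out_size(size, stride=stride)
        shapes.append((size, size, c))
    return shapes


names = ("Basic Block-1", "Basic Block-2", "Basic Block-3", "Basic Block-4")
for name, shape in zip(names, trace_shapes()):
    print(name, shape)
```

Note that 11 maps to 6 rather than 5 because of the padding, which matches the 6×6×512 entry for Basic Block-4.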
Table 2. Comparison of recognition rates between proposed algorithm and other expression recognition algorithms
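Accuracy/% by expression class, with overall accuracy/% in the last column.

| Dataset | Algorithm | Anger | Disgust | Fear | Happiness | Sadness | Surprise | Neutral | Contempt | Overall |
|---|---|---|---|---|---|---|---|---|---|---|
| Fer-2013 | Ref. [15] | 65 | 71 | 50 | 90 | 60 | 79 | 75 | - | 71.52 |
| Fer-2013 | Ref. [16] | 65 | 65 | 54 | 91 | 62 | 82 | 73 | - | 72.67 |
| Fer-2013 | Ref. [17] | - | - | - | - | - | - | - | - | 73.00 |
| Fer-2013 | Proposed | 64 | 69 | 58 | 91 | 64 | 84 | 74 | - | 73.89 |
| CK+ | Ref. [18] | 79 | 86 | 92 | 99 | 95 | 100 | 99 | - | 95.67 |
| CK+ | Ref. [19] | 88 | 97 | 94 | 99 | 88 | 99 | - | 78 | 94.90 |
| CK+ | Ref. [20] | 90 | 100 | 85 | 100 | 95 | 98 | - | 88 | 96.28 |
| CK+ | Proposed | 97 | 98 | 95 | 100 | 92 | 99 | - | 93 | 97.47 |
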
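The overall accuracy in Table 2 is not the plain average of the per-class columns. The sketch below computes the unweighted (macro) mean of the per-class accuracies reported for the proposed method and compares it with the reported overall figure. The gap is presumably due to class imbalance, since an overall accuracy is normally weighted by the number of samples per class; the source does not state this explicitly, so treat that interpretation as an assumption.

```python
# Per-class accuracies (%) of the proposed method, copied from Table 2.
PER_CLASS = {
    "Fer-2013": [64, 69, 58, 91, 64, 84, 74],
    "CK+": [97, 98, 95, 100, 92, 99, 93],
}
# Overall accuracies (%) reported in the same rows.
REPORTED_OVERALL = {"Fer-2013": 73.89, "CK+": 97.47}


def macro_average(accuracies):
    """Unweighted mean of per-class accuracies."""
    return sum(accuracies) / len(accuracies)


for dataset, accs in PER_CLASS.items():
    macro = macro_average(accs)
    gap = REPORTED_OVERALL[dataset] - macro
    print(f"{dataset}: macro={macro:.2f}, overall={REPORTED_OVERALL[dataset]:.2f}, gap={gap:+.2f}")
```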
Table 3. Ablation experiments
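| A | B | C | D | A-CBAM | B-CBAM | C-CBAM | D-CBAM | Fer-2013 accuracy/% | CK+ accuracy/% |
|---|---|---|---|---|---|---|---|---|---|
| √ | | | | | | | | 71.91 | 92.12 |
| √ | √ | | | | | | | 72.53 | 94.74 |
| √ | √ | √ | | | | | | 72.72 | 95.17 |
| √ | √ | √ | √ | | | | | 72.94 | 95.49 |
| √ | √ | √ | √ | √ | | | | 73.06 | 95.96 |
| √ | √ | √ | √ | √ | √ | | | 73.42 | 96.67 |
| √ | √ | √ | √ | √ | √ | √ | | 73.63 | 97.09 |
| √ | √ | √ | √ | √ | √ | √ | √ | 73.89 | 97.47 |
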
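Each row of Table 3 enables one more component than the previous row, so the contribution of each step can be read off as the difference between consecutive accuracies. A short sketch of that arithmetic, using only the values in the table:

```python
# Accuracies (%) from Table 3, in row order (one more component per row).
FER2013 = [71.91, 72.53, 72.72, 72.94, 73.06, 73.42, 73.63, 73.89]
CK_PLUS = [92.12, 94.74, 95.17, 95.49, 95.96, 96.67, 97.09, 97.47]


def stepwise_gains(accuracies):
    """Accuracy gain of each row over the previous one, in percentage points."""
    return [round(b - a, 2) for a, b in zip(accuracies, accuracies[1:])]


def total_gain(accuracies):
    """Gain of the full configuration over the first row, in percentage points."""
    return round(accuracies[-1] - accuracies[0], 2)


for name, accs in (("Fer-2013", FER2013), ("CK+", CK_PLUS)):
    print(name, "step gains:", stepwise_gains(accs), "total:", total_gain(accs))
```

These differences come from single reported runs per configuration, so small steps should be read with caution.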
[1] SHAN C, GONG S, MCOWAN P W. Facial expression recognition based on local binary patterns: A comprehensive study[J]. Image and Vision Computing, 2009, 27(6): 803-816. doi: 10.1016/j.imavis.2008.08.005
[2] LUO Y, ZHANG T, ZHANG Y. A novel fusion method of PCA and LDP for facial expression feature extraction[J]. Optik, 2016, 127(2): 718-721. doi: 10.1016/j.ijleo.2015.10.147
[3] LIU S S, TIAN Y T, WAN C. Facial expression recognition method based on Gabor multiorientation features fusion and block histogram[J]. Acta Automatica Sinica, 2011, 37(12): 1455-1463 (in Chinese). https://www.cnki.com.cn/Article/CJFDTOTAL-MOTO201112006.htm
[4] KUMAR V D A, KUMAR V D A, MALATHI S, et al. Facial recognition system for suspect identification using a surveillance camera[J]. Pattern Recognition and Image Analysis, 2018, 28(3): 410-420. doi: 10.1134/S1054661818030136
[5] ZHOU J, ZHANG S, MEI H, et al. A method of facial expression recognition based on Gabor and NMF[J]. Pattern Recognition and Image Analysis, 2016, 26(1): 119-124. doi: 10.1134/S1054661815040070
[6] HSIEH C C, HSIH M H, JIANG M K, et al. Effective semantic features for facial expressions recognition using SVM[J]. Multimedia Tools and Applications, 2016, 75(11): 6663-6682. doi: 10.1007/s11042-015-2598-1
[7] SUN K, KANG H, PARK H H. Tagging and classifying facial images in cloud environments based on KNN using MapReduce[J]. Optik, 2015, 126(21): 3227-3233. doi: 10.1016/j.ijleo.2015.07.080
[8] LI Y, LIN X Z, JIANG M Y. Facial expression recognition based on cross-connected LeNet-5 network[J]. Acta Automatica Sinica, 2018, 44(1): 176-182 (in Chinese). https://www.cnki.com.cn/Article/CJFDTOTAL-MOTO201801015.htm
[9] MOLLAHOSSEINI A, CHAN D, MAHOOR M H. Going deeper in facial expression recognition using deep neural networks[C]//2016 IEEE Winter Conference on Applications of Computer Vision (WACV). Piscataway: IEEE Press, 2016: 1-10.
[10] JUNG H, LEE S, YIM J, et al. Joint fine-tuning in deep neural networks for facial expression recognition[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2015: 2983-2991.
[11] HE K M, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 770-778.
[12] WOO S, PARK J, LEE J Y, et al. CBAM: Convolutional block attention module[C]//Proceedings of the European Conference on Computer Vision (ECCV). Berlin: Springer, 2018: 3-19.
[13] GOODFELLOW I J, ERHAN D, CARRIER P L, et al. Challenges in representation learning: A report on three machine learning contests[C]//International Conference on Neural Information Processing. Berlin: Springer, 2013: 117-124.
[14] LUCEY P, COHN J F, KANADE T, et al. The extended Cohn-Kanade dataset (CK+): A complete dataset for action unit and emotion-specified expression[C]//2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops. Piscataway: IEEE Press, 2010: 94-101.
[15] SHI C, TAN C, WANG L. A facial expression recognition method based on a multibranch cross-connection convolutional neural network[J]. IEEE Access, 2021, 9: 39255-39274. doi: 10.1109/ACCESS.2021.3063493
[16] XIE W C, SHEN L L, DUAN J M. Adaptive weighting of handcrafted feature losses for facial expression recognition[J]. IEEE Transactions on Cybernetics, 2021, 51(5): 2787-2800. doi: 10.1109/TCYB.2019.2925095
[17] HAYALE W, NEGI P S, MAHOOR M. Deep Siamese neural networks for facial expression recognition in the wild[J/OL]. IEEE Transactions on Affective Computing, 2021(2021-03-04)[2021-03-05]. https://ieeexplore.ieee.org/document/9423550.
[18] SUN X, XIA P P, ZHANG L, et al. A ROI-guided deep architecture for robust facial expressions recognition[J]. Information Sciences, 2020, 522: 35-48. doi: 10.1016/j.ins.2020.02.047
[19] LAN L Q, LI X, LIU Q Y, et al. Facial expression recognition method based on a joint normalization strategy[J]. Journal of Beijing University of Aeronautics and Astronautics, 2020, 46(9): 1797-1806 (in Chinese). doi: 10.13700/j.bh.1001-5965.2020.0073
[20] GAN Y, CHEN J, YANG Z, et al. Multiple attention network for facial expression recognition[J]. IEEE Access, 2020, 8: 7383-7393. doi: 10.1109/ACCESS.2020.2963913