Facial expression recognition method based on a joint normalization strategy

LAN Lingqiang; LI Xin; LIU Qiyuan; LU Shuhua

doi:10.13700/j.bh.1001-5965.2020.0073


College of Police Information Technology and Cyber Security, People's Public Security University of China, Beijing 102600, China

Funds:

National Key R&D Program of China 2016YFC0801005

Fundamental Research Funds for the Central Universities 2019JKF225

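About the authors:
LAN Lingqiang, male, master's student. Main research interest: computer vision

LI Xin, male, Ph.D., associate professor, master's supervisor. Main research interest: information technology

LIU Qiyuan, female, master's student. Main research interest: computer vision

LU Shuhua, male, Ph.D., associate professor, master's supervisor. Main research interest: security and protection technology

Corresponding author:
LU Shuhua, E-mail: lushuhua@ppsuc.edu.cn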

CLC number: TP391
Publication history
- Received: 2020-03-02
- Accepted: 2020-04-09
- Published online: 2020-09-20


Abstract:
Most current facial expression recognition methods rely on end-to-end feature extraction and classification based on deep learning. For this setting, a new optimization method for deep models is proposed. Building on the ResNet18 residual network architecture and the idea of normalization, joint normalization strategies are proposed: filter response normalization with batch normalization, instance normalization with group normalization, and group normalization with batch normalization are each embedded in the network to balance and improve the distribution of feature data, compensate for the shortcomings of any single normalization method, and improve model performance. The strategies were validated and tested on two public datasets, FER2013 and CK+, where the highest accuracies reached 73.558% and 94.9%, respectively. The experimental results indicate that the joint normalization strategies improve the performance of the base network and outperform many recent facial expression recognition methods.

Keywords:
- expression recognition
- joint normalization strategy
- filter response normalization
- batch normalization
- group normalization
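
For reference, the four normalization layers named in the abstract differ mainly in the axes over which they compute statistics. The block below summarizes the standard definitions from refs. [6]-[9]; it is not taken from this paper, and the notation (an activation $x_{nchw}$ with batch size $N$, $C$ channels, spatial size $H \times W$, channel groups $S_g$, constant $\epsilon$, learned $\gamma_c$, $\beta_c$, $\tau_c$) is introduced here only for illustration.

```latex
% Batch normalization (BN): statistics per channel, over batch and space
\mu_c = \frac{1}{NHW}\sum_{n,h,w} x_{nchw}, \qquad
\sigma_c^2 = \frac{1}{NHW}\sum_{n,h,w} (x_{nchw}-\mu_c)^2

% Instance normalization (IN): statistics per sample and channel, over space
\mu_{nc} = \frac{1}{HW}\sum_{h,w} x_{nchw}, \qquad
\sigma_{nc}^2 = \frac{1}{HW}\sum_{h,w} (x_{nchw}-\mu_{nc})^2

% Group normalization (GN): statistics per sample and channel group S_g
\mu_{ng} = \frac{1}{|S_g|HW}\sum_{c\in S_g}\sum_{h,w} x_{nchw}, \qquad
\sigma_{ng}^2 = \frac{1}{|S_g|HW}\sum_{c\in S_g}\sum_{h,w} (x_{nchw}-\mu_{ng})^2

% BN, IN and GN then share the same affine form
y_{nchw} = \gamma_c\,\frac{x_{nchw}-\mu}{\sqrt{\sigma^2+\epsilon}} + \beta_c

% Filter response normalization (FRN) with thresholded linear unit (TLU):
% no mean subtraction, division by the root mean square over space
\nu_{nc}^2 = \frac{1}{HW}\sum_{h,w} x_{nchw}^2, \qquad
y_{nchw} = \gamma_c\,\frac{x_{nchw}}{\sqrt{\nu_{nc}^2+\epsilon}} + \beta_c, \qquad
z_{nchw} = \max(y_{nchw}, \tau_c)
```

Only BN depends on the batch dimension; IN, GN and FRN are computed independently for each sample.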


Figure 1. Samples of each facial expression in the FER2013 dataset and the number of samples per expression

Figure 2. Samples of each facial expression in the CK+ dataset and the number of samples per expression

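Figure 3. Network architecture

(a) is a residual module that uses a single normalization (BN, FRN, GN, or IN); the arrow pointing to (a) indicates that the whole network is built from module (a). (b), (c), and (d) show the three proposed optimization strategies applied in the residual block; the arrows pointing to (b), (c), and (d) indicate residual networks that use (b), (c), and (d), respectively, as the basic module.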
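
As a minimal sketch of two of the single normalizations named in panel (a), the Python below applies instance normalization and filter response normalization (with its thresholded linear unit) to one channel of one sample. It follows the standard definitions in refs. [7] and [9], not code from this paper. The function names, the default `eps`, `gamma`, `beta` and `tau` values, and the sample feature map are illustrative assumptions. How the paper pairs two normalizations inside a block is not specified on this page, so that part is not sketched.

```python
import math


def instance_norm(channel, eps=1e-5):
    """Standardize one channel of one sample: subtract the mean, divide by the std."""
    mean = sum(channel) / len(channel)
    var = sum((v - mean) ** 2 for v in channel) / len(channel)
    return [(v - mean) / math.sqrt(var + eps) for v in channel]


def frn_tlu(channel, gamma=1.0, beta=0.0, tau=0.0, eps=1e-6):
    """Filter response normalization followed by a thresholded linear unit.

    FRN divides by the root mean square of the channel (no mean subtraction);
    TLU then clips the result from below at the threshold tau.
    """
    nu2 = sum(v * v for v in channel) / len(channel)
    scaled = [gamma * v / math.sqrt(nu2 + eps) + beta for v in channel]
    return [max(v, tau) for v in scaled]


# A flattened 2x2 feature map for a single channel (made-up values).
feature_map = [3.0, -4.0, 3.0, -4.0]
print(instance_norm(feature_map))
print(frn_tlu(feature_map))
```

With these inputs, instance normalization re-centres the channel around zero, whereas FRN keeps the sign of each response and the TLU with `tau = 0` zeroes the negative ones.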

Figure 4. Confusion matrices for the FER2013 private and public test sets

Figure 5. Confusion matrix for the CK+ dataset

Table 1. Experimental results of the base framework and of the networks with joint normalization strategies added

Model	Accuracy/%
Ref. [38]	71.190
Model1 (this paper)	73.558
Model2 (this paper)	73.534
Model3 (this paper)	73.031
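
To make the size of the improvement in Table 1 explicit, the short Python sketch below computes each model's gain over the baseline in percentage points. The accuracies are copied from the table; the variable names are illustrative.

```python
# Accuracy (%) copied from Table 1.
baseline = 71.190  # Ref. [38]
models = {"Model1": 73.558, "Model2": 73.534, "Model3": 73.031}

# Absolute gain over the baseline, in percentage points.
gains = {name: round(acc - baseline, 3) for name, acc in models.items()}

for name, gain in gains.items():
    print(f"{name}: +{gain:.3f} percentage points")
```

On these figures the gains are roughly 2.37, 2.34 and 1.84 percentage points for Model1, Model2 and Model3.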

Table 2. Effect of the number of joint normalization modules added to the residual network

Number	Model1 accuracy/%	Model2 accuracy/%	Model3 accuracy/%
0	71.190	71.190	71.190
1	72.555	72.722	72.499
1-2	72.053	72.417	71.691
1-3	73.558	73.530	73.031
1-4	72.778	72.416	72.723
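
The best setting per model in Table 2 can be read off programmatically. The Python sketch below does so with the table's values; the dictionary layout and names are illustrative, and the keys are the labels from the table's "Number" column.

```python
# Accuracy (%) from Table 2, keyed by the "Number" column.
table2 = {
    "0":   {"Model1": 71.190, "Model2": 71.190, "Model3": 71.190},
    "1":   {"Model1": 72.555, "Model2": 72.722, "Model3": 72.499},
    "1-2": {"Model1": 72.053, "Model2": 72.417, "Model3": 71.691},
    "1-3": {"Model1": 73.558, "Model2": 73.530, "Model3": 73.031},
    "1-4": {"Model1": 72.778, "Model2": 72.416, "Model3": 72.723},
}

# For each model, pick the setting with the highest accuracy.
best = {}
for model in ("Model1", "Model2", "Model3"):
    best[model] = max(table2, key=lambda setting: table2[setting][model])

print(best)
```

All three models peak at the "1-3" setting. Accuracy does not rise monotonically with the number: "1-2" scores below "1", and "1-4" below "1-3", for every model.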

Table 3. Comparison between single normalization and joint normalization (used in the first three residual blocks)

Optimization strategy	Accuracy/%
BN	71.190
IN	73.168
GN	73.029
FRN	72.276
Model1	73.558
Model2	73.530
Model3	73.031
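
As a quick check of how the joint strategies in Table 3 compare with the strongest single normalization, the Python sketch below computes each joint model's margin over the best single entry. The values come from the table; the names are illustrative.

```python
# Accuracy (%) from Table 3.
single = {"BN": 71.190, "IN": 73.168, "GN": 73.029, "FRN": 72.276}
joint = {"Model1": 73.558, "Model2": 73.530, "Model3": 73.031}

# Strongest single normalization and each joint model's margin over it.
best_single_name = max(single, key=single.get)
best_single = single[best_single_name]
margins = {name: round(acc - best_single, 3) for name, acc in joint.items()}

print(best_single_name, margins)
```

On these numbers, Model1 and Model2 exceed the best single normalization (IN), while Model3 falls slightly below IN and is roughly level with GN.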

Table 4. Accuracy of the proposed method and recent methods on the FER2013 dataset

Model	Accuracy/%
SHCNN [39]	69.100
Ref. [40]	70.910
IcRL [27]	72.360
Ref. [41]	72.640
Model1 (this paper)	73.558
Model2 (this paper)	73.534
Model3 (this paper)	73.031

Table 5. Accuracy of the proposed method and recent methods on the CK+ dataset

Model	Accuracy/%
3DCNN-DAP [28]	92.4
Inception [26]	93.2
Ref. [1]	93.2
DAM-CNN [42]	95.9
Ref. [38]	89.3
Model1 (this paper)	94.9
Model2 (this paper)	93.6
Model3 (this paper)	94.1
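
To see where the proposed models sit among the methods in Table 5, the Python sketch below sorts the table's entries by accuracy. The values are copied from the table; the labels and variable names are illustrative.

```python
# Accuracy (%) on CK+ from Table 5.
ck_plus = {
    "3DCNN-DAP [28]": 92.4,
    "Inception [26]": 93.2,
    "Ref. [1]": 93.2,
    "DAM-CNN [42]": 95.9,
    "Ref. [38]": 89.3,
    "Model1 (this paper)": 94.9,
    "Model2 (this paper)": 93.6,
    "Model3 (this paper)": 94.1,
}

# Sort from highest to lowest accuracy.
ranking = sorted(ck_plus.items(), key=lambda item: item[1], reverse=True)

for rank, (name, acc) in enumerate(ranking, start=1):
    print(f"{rank}. {name}: {acc}")
```

On CK+, DAM-CNN [42] ranks first and the three proposed models take the next three places, with Model1 second.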

References (42)

[1]	JAIN D K, SHAMSOLMOALI P, SEHDEV P.Extended deep neural network for facial emotion recognition[J].Pattern Recognition Letters, 2019, 120:69-74. doi: 10.1016/j.patrec.2019.01.008
[2]	HU S H, HU Y M, LI J Q, et al.Natural scene facial expression recognition based on differential features[C]//2019 Chinese Automation Congress(CAC).Piscataway: IEEE Press, 2019: 2840-2844.
[3]	LI Y, CAO G T, CAO W M.Stacking-based deep neural network for facial expression recognition[C]//2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM).Piscataway: IEEE Press, 2019: 1338-1342.
[4]	HE J, CAI J F, FANG L Z, et al.A method of facial expression recognition based on LBP fusion of key expressions areas[C]//The 27th Chinese Control and Decision Conference(2015 CCDC).Piscataway: IEEE Press, 2015: 4200-4204.
[5]	OJALA T, PIETIKÄINEN M, MÄENPÄÄ T.Multiresolution gray-scale and rotation invariant texture classification with local binary patterns[J].IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002, 24(7):971-987. doi: 10.1109/TPAMI.2002.1017623
[6]	IOFFE S, SZEGEDY C.Batch normalization: Accelerating deep network training by reducing internal covariate shift[C]//Proceedings of the 32nd International Conference on Machine Learning (ICML).New York: ACM, 2015: 448-456.
[7]	ULYANOV D, VEDALDI A, LEMPITSKY V.Instance normalization: The missing ingredient for fast stylization[EB/OL].(2017-11-06)[2020-02-20].
[8]	WU Y X, HE K M.Group normalization[C]//The European Conference on Computer Vision (ECCV).Berlin: Springer, 2018: 3-19.
[9]	SINGH S, KRISHNAN S.Filter response normalization layer: Eliminating batch dependence in the training of deep neural networks[EB/OL].(2019-11-21)[2020-02-20].
[10]	GOODFELLOW I J, ERHAN D, CARRIER P L, et al.Challenges in representation learning: A report on three machine learning contests[C]//International Conference on Neural Information Processing.Berlin: Springer, 2013: 117-124.
[11]	LUCEY P, COHN J F, KANADE T, et al.The extended Cohn-Kanade dataset (CK+): A complete dataset for action unit and emotion-specified expression[C]//2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition(CVPR).Piscataway: IEEE Press, 2010: 94-101.
[12]	DALAL N, TRIGGS B.Histograms of oriented gradients for human detection[C]//2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR).Piscataway: IEEE Press, 2005, 1: 886-893.
[13]	LOWE D G.Object recognition from local scale-invariant features[C]//Proceedings of the 7th IEEE International Conference on Computer Vision.Piscataway: IEEE Press, 1999, 2: 1150-1157.
[14]	LYONS M J, AKAMATSU S, KAMACHI M, et al.The Japanese female facial expression (JAFFE) database[C]//Proceedings of 3rd International Conference on Automatic Face and Gesture Recognition.Piscataway: IEEE Press, 1998: 14-16.
[15]	ZHAO G, HUANG X, TAINI M, et al.Facial expression recognition from near-infrared videos[J].Image and Vision Computing, 2011, 29(9):607-619. doi: 10.1016/j.imavis.2011.07.002
[16]	ELSAYED A, MAHMOOD A, SOBH T.Effect of super resolution on high dimensional features for unsupervised face recognition in the wild[C]//2017 IEEE Applied Imagery Pattern Recognition Workshop(AIPR).Piscataway: IEEE Press, 2017: 1-5.
[17]	WANG P Y, SU F, ZHAO Z C.Joint multi-feature fusion and attribute relationships for facial attribute prediction[C]//2017 IEEE Visual Communications and Image Processing (VCIP).Piscataway: IEEE Press, 2017: 1-4.
[18]	TAHERKHANI F, NASRABADI N M, DAWSON J.A deep face identification network enhanced by facial attributes prediction[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).Piscataway: IEEE Press, 2018: 553-560.
[19]	KUO C M, LAI S H, SARKIS M.A compact deep learning model for robust facial expression recognition[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).Piscataway: IEEE Press, 2018: 2121-2129.
[20]	YANG J, ZHANG F, CHEN B, et al.Facial expression recognition based on facial action unit[C]//2019 10th International Green and Sustainable Computing Conference (IGSC).Piscataway: IEEE Press, 2019: 1-6.
[21]	YU K, SALZMANN M.Second-order convolution neural networks[J].Clinical Immunology & Immunopathology, 2017, 66(3):230-238.
[22]	GAO Z, XIE J, WANG Q, et al.Global second-order pooling convolutional networks[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR).Piscataway: IEEE Press, 2019: 3024-3033.
[23]	HUANG Z, VAN GOOL L.A Riemannian network for SPD matrix learning[C]//31st AAAI Conference on Artificial Intelligence.San Francisco: AAAI Press, 2017: 2036-2042.
[24]	ACHARYA D, HUANG Z, PAUDEL D P, et al.Covariance pooling for facial expression recognition[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR).Piscataway: IEEE Press, 2018: 367-374.
[25]	HAMESTER D, BARROS P, WERMTER S.Face expression recognition with a 2-channel convolutional neural network[C]//2015 International Joint Conference on Neural Networks (IJCNN).Piscataway: IEEE Press, 2015: 1-8.
[26]	MOLLAHOSSEINI A, CHAN D, MAHOOR M H.Going deeper in facial expression recognition using deep neural networks[C]//2016 IEEE Winter Conference on Applications of Computer Vision (WACV).Piscataway: IEEE Press, 2016: 1-10.
[27]	CHEN Y, HU H.Facial expression recognition by inter-class relational learning[J].IEEE Access, 2019, 7:94106-94117. doi: 10.1109/ACCESS.2019.2928983
[28]	LIU M, LI S, SHAN S, et al.Deeply learning deformable facial action parts model for dynamic expression analysis[C]//Asian Conference on Computer Vision(ACCV).Berlin: Springer, 2014: 143-157.
[29]	NGUYEN D H, KIM S H, LEE G S, et al.Facial expression recognition using a temporal ensemble of multi-level convolutional neural networks[EB/OL].(2019-10-10)[2020-02-20].
[30]	MONTAVON G, ORR G B, MULLER K R.Neural networks:Tricks of the trade[M].Berlin:Springer, 1998:9-50.
[31]	XIE S Y, GIRSHICK R, DOLLAR P, et al.Aggregated residual transformations for deep neural networks[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).Piscataway: IEEE Press, 2017: 1492-1500.
[32]	SZEGEDY C, LIU W, JIA Y, et al.Going deeper with convolutions[C]//2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).Piscataway: IEEE Press, 2015: 1-9.
[33]	HUANG G, LIU Z, WEINBERGER K Q, et al.Densely connected convolutional networks[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).Piscataway: IEEE Press, 2017: 4700-4708.
[34]	ULYANOV D, VEDALDI A, LEMPITSKY V.Improved texture networks: Maximizing quality and diversity in feed-forward stylization and texture synthesis[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).Piscataway: IEEE Press, 2017: 6924-6932.
[35]	PAN X, LUO P, SHI J, et al.Two at once: Enhancing learning and generalization capacities via IBN-net[C]//The European Conference on Computer Vision (ECCV).Berlin: Springer, 2018: 464-479.
[36]	EKMAN P, FRIESEN W V.Facial action coding system:A technique for the measurement of facial movement[M].Palo Alto:Consulting Psychologists Press, 1978:32-96.
[37]	HE K, ZHANG X, REN S, et al.Identity mappings in deep residual networks[C]//The European Conference on Computer Vision.Berlin: Springer, 2016: 630-645.
[38]	QIN Z Y, WU J.Visual saliency maps can apply to facial expression recognition[EB/OL].(2018-11-12)[2020-02-20].
[39]	MIAO S, XU H Y, HAN Z Q, et al.Recognizing facial expressions using a shallow convolutional neural network[J].IEEE Access, 2019, 7:78000-78011. doi: 10.1109/ACCESS.2019.2921220
[40]	ZHOU J C, JIA X, SHEN L L, et al.Improved softmax loss for deep learning-based face and expression recognition[J].Cognitive Computation and Systems, 2019, 1(4):97-102. doi: 10.1049/ccs.2019.0010
[41]	TIAN Y, WEN Z W, XIE W C, et al.Outlier-suppressed triplet loss with adaptive class-aware margins for facial expression recognition[C]//2019 IEEE International Conference on Image Processing (ICIP).Piscataway: IEEE Press, 2019: 46-50.
[42]	XIE S Y, HU H F, WU Y B.Deep multi-path convolutional neural network joint with salient region attention for facial expression recognition[J].Pattern Recognition, 2019, 92:177-191. doi: 10.1016/j.patcog.2019.03.019
