Image super-resolution reconstruction network based on expectation maximization self-attention residual
-
Keywords:
- super-resolution reconstruction /
- attention mechanism /
- expectation maximization /
- feature-enhanced residual block /
- EM self-attention residual block
Abstract: Most deep learning-based image super-resolution (SR) reconstruction methods improve reconstruction quality mainly by increasing model depth, which also raises the computational cost of the model. Many networks adopt attention mechanisms to strengthen feature extraction, yet they still struggle to fully learn the features of different image regions. To address these problems, this paper proposes an SR reconstruction network based on expectation-maximization (EM) self-attention residuals. The network improves the basic residual block into a feature-enhanced residual block that better reuses the features extracted within it. To increase the spatial correlation of the feature information, the EM self-attention mechanism is introduced to construct an EM self-attention residual block, which strengthens the feature extraction capability of each module in the deep model; the feature extraction structure of the entire model is then built by cascading EM self-attention residual blocks. Finally, a reconstructed high-resolution image is obtained through an up-sampling image reconstruction module. Comparison experiments with mainstream methods show that the proposed method achieves better subjective visual quality and better objective evaluation metrics on five widely used SR test datasets.
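The EM self-attention described above follows the expectation-maximization attention idea of reference [31]: a small set of bases is alternately re-estimated (M-step) from responsibility maps computed by a softmax over feature-basis similarities (E-step), and the feature map is then reconstructed from the compact bases, giving a low-rank form of self-attention. The following is a minimal NumPy sketch of that iteration only; the function name, the fixed base count K, the iteration count, and the L2 normalization of the bases are illustrative assumptions, not details taken from the paper:

```python
import numpy as np

def em_attention(X, K=8, iters=3, rng=None):
    """EM attention over N pixel features X of shape (N, C).

    Alternates an E-step (softmax responsibilities of each pixel to K bases)
    with an M-step (bases updated as responsibility-weighted feature means),
    then reconstructs the features from the compact bases.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    N, C = X.shape
    mu = rng.standard_normal((K, C))               # initial bases (assumption: random init)
    mu /= np.linalg.norm(mu, axis=1, keepdims=True)
    for _ in range(iters):
        # E-step: responsibilities z[n, k] via a numerically stable softmax
        logits = X @ mu.T                          # (N, K) similarities
        logits -= logits.max(axis=1, keepdims=True)
        z = np.exp(logits)
        z /= z.sum(axis=1, keepdims=True)
        # M-step: bases become responsibility-weighted means of the features
        mu = (z.T @ X) / (z.sum(axis=0)[:, None] + 1e-6)
        mu /= np.linalg.norm(mu, axis=1, keepdims=True) + 1e-6
    # Reconstruct the feature map from the K bases (low-rank attention output)
    return z @ mu                                  # (N, C)
```

In the full network this output would be added back to the block input through the residual connection; here only the attention computation itself is sketched.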
-
Table 1. Average PSNR and SSIM results of SR reconstructed images obtained by different methods at magnification factors of 2, 3, and 4 on each dataset
Magnification factor 2 (each cell: average PSNR/dB / average SSIM):
Method          Set5          Set14         BSD100        Urban100      Manga109
Bicubic         33.66/0.9299  30.24/0.8688  29.56/0.8431  26.88/0.8403  31.05/0.935
SRCNN[12]       36.66/0.9542  32.45/0.9067  31.36/0.8879  29.50/0.8946  35.72/0.968
VDSR[14]        37.53/0.9587  33.03/0.9124  31.90/0.8960  30.76/0.9140  37.16/0.974
DRCN[15]        37.63/0.9588  33.04/0.9118  31.85/0.8942  30.75/0.9133  37.57/0.973
MS-LapSRN[33]   37.78/0.9600  33.28/0.9150  32.05/0.8980  31.15/0.9190  37.78/0.976
SRMDNF[34]      37.79/0.9601  33.32/0.9159  32.05/0.8985  31.33/0.9204  38.07/0.976
LESRCNN[35]     37.65/0.9586  33.32/0.9148  31.95/0.8964  31.45/0.9206  38.49/0.9777
PAN[36]         38.00/0.9605  33.59/0.9181  32.18/0.8997  32.01/0.9273  38.70/0.9773
SMSR[37]        38.00/0.9601  33.64/0.9179  32.17/0.8990  32.19/0.9284  38.76/0.9771
Ours (proposed) 38.00/0.9606  33.69/0.9202  32.18/0.8997  32.18/0.9293  38.57/0.9774

Magnification factor 3 (each cell: average PSNR/dB / average SSIM):
Method          Set5          Set14         BSD100        Urban100      Manga109
Bicubic         30.39/0.8682  27.55/0.7742  27.21/0.7385  24.46/0.7349  26.95/0.856
SRCNN[12]       32.75/0.9090  29.30/0.8215  28.41/0.7863  26.24/0.7989  30.48/0.912
VDSR[14]        33.66/0.9213  29.77/0.8314  28.82/0.7976  27.14/0.8279  32.01/0.934
DRCN[15]        33.82/0.9226  29.76/0.8311  28.80/0.7963  27.15/0.8276  32.31/0.936
MS-LapSRN[33]   34.06/0.9240  29.97/0.8360  28.93/0.8020  27.47/0.8370  32.68/0.939
SRMDNF[34]      34.12/0.9254  30.04/0.8382  28.97/0.8025  27.57/0.8398  33.00/0.9403
LESRCNN[35]     33.93/0.9231  30.12/0.8380  28.91/0.8005  27.70/0.8415  33.15/0.9433
PAN[36]         34.40/0.9271  30.36/0.8423  29.11/0.8050  28.11/0.8511  33.61/0.9448
SMSR[37]        34.40/0.9270  30.33/0.8412  29.10/0.8050  28.25/0.8536  33.68/0.9445
Ours (proposed) 34.45/0.9278  30.41/0.8442  29.12/0.8060  28.31/0.8554  33.62/0.9450

Magnification factor 4 (each cell: average PSNR/dB / average SSIM):
Method          Set5          Set14         BSD100        Urban100      Manga109
Bicubic         28.42/0.8104  26.00/0.7027  25.96/0.6675  23.14/0.6577  25.15/0.789
SRCNN[12]       30.48/0.8628  27.50/0.7513  26.90/0.7101  24.52/0.7221  27.66/0.858
VDSR[14]        31.35/0.8838  28.01/0.7674  27.29/0.7251  25.18/0.7524  28.82/0.886
DRCN[15]        31.53/0.8854  28.02/0.7670  27.23/0.7233  25.14/0.7510  28.97/0.886
MS-LapSRN[33]   31.74/0.8890  28.26/0.7740  27.43/0.7310  25.51/0.7680  29.54/0.897
SRMDNF[34]      31.96/0.8925  28.35/0.7787  27.49/0.7337  25.68/0.7731  30.09/0.902
LESRCNN[35]     31.88/0.8903  28.44/0.7772  27.45/0.7313  25.77/0.7732  30.49/0.9091
PAN[36]         32.13/0.8948  28.61/0.7822  27.59/0.7363  26.11/0.7854  30.51/0.9095
SMSR[37]        32.12/0.8932  28.55/0.7808  27.55/0.7351  26.11/0.7868  30.54/0.9085
Ours (proposed) 32.25/0.8953  28.69/0.7841  27.59/0.7370  26.20/0.7891  30.50/0.9092
Table 2. Average running time of different methods on Set5 dataset under a magnification factor of 2
Table 3. LPIPS results of different methods on the Set5 dataset under a magnification factor of 4
Table 4. Comparison results of different blocks on the Set5 dataset under a magnification factor of 4
Model      PSNR/dB  SSIM    Params/10^3  FLOPs/10^9
DARN_res   32.20    0.8951  1649         1952
DARN_dr    32.08    0.8940  1436         1756
DARN_ema   31.79    0.8930  467          863
DARN_ca    32.11    0.8942  1445         1757
DARN       32.25    0.8953  1568         1878
Note: FLOPs denotes the number of floating-point operations, measured on a 1280×720 image.
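The PSNR values reported in Tables 1 and 4 (in dB) follow the standard mean-squared-error definition PSNR = 10·log10(peak²/MSE). As a reference, a minimal sketch, assuming 8-bit images with a peak value of 255 (the function name is illustrative):

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

Note that SR papers in this line of work conventionally compute PSNR and SSIM on the luminance (Y) channel after cropping a border equal to the scale factor; that preprocessing is omitted here.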
[1] TAKEDA H, MILANFAR P, PROTTER M, et al. Super-resolution without explicit subpixel motion estimation[J]. IEEE Transactions on Image Processing, 2009, 18(9): 1958-1975. doi: 10.1109/TIP.2009.2023703
[2] ZOU W W W, YUEN P C. Very low resolution face recognition problem[J]. IEEE Transactions on Image Processing, 2012, 21(1): 327-340.
[3] SHI W, CABALLERO J, LEDIG C, et al. Cardiac image super-resolution with global correspondence using multi-atlas PatchMatch[C]//Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention. Berlin: Springer, 2013, 8151: 9-16.
[4] ARUN P V, BUDDHIRAJU K M, PORWAL A, et al. CNN based spectral super-resolution of remote sensing images[J]. Signal Processing, 2020, 169: 107394. doi: 10.1016/j.sigpro.2019.107394
[5] KEYS R. Cubic convolution interpolation for digital image processing[J]. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1981, 29(6): 1153-1160. doi: 10.1109/TASSP.1981.1163711
[6] MARQUINA A, OSHER S J. Image super-resolution by TV-regularization and Bregman iteration[J]. Journal of Scientific Computing, 2008, 37(3): 367-382. doi: 10.1007/s10915-008-9214-8
[7] DONG W, ZHANG L, SHI G, et al. Image deblurring and super-resolution by adaptive sparse domain selection and adaptive regularization[J]. IEEE Transactions on Image Processing, 2011, 20(7): 1838-1857. doi: 10.1109/TIP.2011.2108306
[8] YANG J, WRIGHT J, HUANG T S, et al. Image super-resolution via sparse representation[J]. IEEE Transactions on Image Processing, 2010, 19(11): 2861-2873. doi: 10.1109/TIP.2010.2050625
[9] ZEYDE R, ELAD M, PROTTER M. On single image scale-up using sparse-representations[C]//Proceedings of the International Conference on Curves and Surfaces. Berlin: Springer, 2010: 711-730.
[10] HU Y, WANG N, TAO D, et al. SERF: A simple, effective, robust, and fast image super-resolver from cascaded linear regression[J]. IEEE Transactions on Image Processing, 2016, 25(9): 4091-4102. doi: 10.1109/TIP.2016.2580942
[11] HUANG Y, LI J, GAO X, et al. Single image super-resolution via multiple mixture prior models[J]. IEEE Transactions on Image Processing, 2018, 27(12): 5904-5917. doi: 10.1109/TIP.2018.2860685
[12] DONG C, LOY C C, HE K, et al. Image super-resolution using deep convolutional networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(2): 295-307. doi: 10.1109/TPAMI.2015.2439281
[13] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 770-778.
[14] KIM J, LEE J K, LEE K M. Accurate image super-resolution using very deep convolutional networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 1646-1654.
[15] KIM J, LEE J K, LEE K M. Deeply-recursive convolutional network for image super-resolution[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 1637-1645.
[16] LIM B, SON S, KIM H, et al. Enhanced deep residual networks for single image super-resolution[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 1132-1140.
[17] ZHANG Y L, LI K P, LI K, et al. Image super-resolution using very deep residual channel attention networks[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018, 11211: 286-310.
[18] DAI T, CAI J, ZHANG Y, et al. Second-order attention network for single image super-resolution[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 11057-11066.
[19] HUI Z, WANG X, GAO X. Fast and accurate single image super-resolution via information distillation network[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 723-731.
[20] LI J, FANG F, MEI K, et al. Multi-scale residual network for image super-resolution[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018, 11212: 527-542.
[21] AHN N, KANG B, SOHN K A. Fast, accurate, and lightweight super-resolution with cascading residual network[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018, 11214: 256-272.
[22] LEDIG C, THEIS L, HUSZAR F, et al. Photo-realistic single image super-resolution using a generative adversarial network[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 105-114.
[23] WANG X, YU K, WU S, et al. ESRGAN: Enhanced super-resolution generative adversarial networks[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018, 11133: 63-79.
[24] YADAV S, CHEN C, ROSS A. Relativistic discriminator: A one-class classifier for generalized iris presentation attack detection[C]//Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. Piscataway: IEEE Press, 2020: 2624-2633.
[25] LEI J, LIU C, JIANG D. Fault diagnosis of wind turbine based on long short-term memory networks[J]. Renewable Energy, 2019, 133: 422-432. doi: 10.1016/j.renene.2018.10.031
[26] TAN Z, WANG M, XIE J, et al. Deep semantic role labeling with self-attention[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto: AAAI, 2018.
[27] WANG X, GIRSHICK R, GUPTA A, et al. Non-local neural networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 7794-7803.
[28] WOO S, PARK J, LEE J Y, et al. CBAM: Convolutional block attention module[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018, 11211: 3-19.
[29] SANG H, ZHOU Q, ZHAO Y. PCANet: Pyramid convolutional attention network for semantic segmentation[J]. Image and Vision Computing, 2020, 103: 103997. doi: 10.1016/j.imavis.2020.103997
[30] ZHAO Y L, DUAN D D, HU R M, et al. On the number of components in mixture model based on EM algorithm[J]. Journal of Applied Statistics and Management, 2020, 39(1): 35-50 (in Chinese). doi: 10.13860/j.cnki.sltj.20190808-002
[31] LI X, ZHONG Z, WU J, et al. Expectation-maximization attention networks for semantic segmentation[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 9166-9175.
[32] AGUSTSSON E, TIMOFTE R. NTIRE 2017 challenge on single image super-resolution: Dataset and study[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 1122-1131.
[33] LAI W S, HUANG J B, AHUJA N, et al. Deep Laplacian pyramid networks for fast and accurate super-resolution[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 5835-5843.
[34] ZHANG K, ZOU W, ZHANG L. Learning a single convolutional super-resolution network for multiple degradations[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 3262-3271.
[35] TIAN C, ZHUGE R, WU Z, et al. Lightweight image super-resolution with enhanced CNN[J]. Knowledge-Based Systems, 2020, 205: 106235. doi: 10.1016/j.knosys.2020.106235
[36] ZHAO H, KONG X, HE J, et al. Efficient image super-resolution using pixel attention[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2020, 12537: 56-72.
[37] WANG L G, DONG X Y, WANG Y Q, et al. Exploring sparsity in image super-resolution for efficient inference[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2021: 4915-4924.
[38] BEVILACQUA M, ROUMY A, GUILLEMOT C, et al. Low-complexity single-image super-resolution based on nonnegative neighbor embedding[C]//Proceedings of the British Machine Vision Conference. Berlin: Springer, 2012: 135.1-135.10.
[39] MARTIN D, FOWLKES C, TAL D, et al. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics[C]//Proceedings of the 8th IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2001, 2: 416-423.
[40] HUANG J B, SINGH A, AHUJA N. Single image super-resolution from transformed self-exemplars[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2015: 5197-5206.
[41] MATSUI Y, ITO K, ARAMAKI Y, et al. Sketch-based manga retrieval using Manga109 dataset[J]. Multimedia Tools and Applications, 2017, 76(20): 21811-21838. doi: 10.1007/s11042-016-4020-z