HDR image generation method based on conditional generative adversarial network

BEI Yue; WANG Qi; CHENG Zhipeng; PAN Xinghao; YANG Mohan; DING Dandan

doi:10.13700/j.bh.1001-5965.2020.0518

CLC number: TP391
Publication history
- Received: 2020-09-14
- Accepted: 2021-04-23
- Published online: 2022-01-20

HDR image generation method based on conditional generative adversarial network

1. MIGU Video Co., Ltd., Shanghai 201201, China
2. Beijing Bravo Video Technologies Incorporation, Beijing 100036, China

Funds: Zhejiang Provincial Natural Science Foundation of China (LY20F010013)

Corresponding author: DING Dandan, E-mail: DandanDing@hznu.edu.cn

Abstract:
Compared with low dynamic range (LDR) images, high dynamic range (HDR) images have a wider color gamut and a higher brightness range, and are therefore closer to what the human visual system perceives. However, because most current image acquisition devices are LDR devices, HDR image resources are scarce. An effective way to address this shortage is to map LDR images to HDR images through inverse tone mapping. This paper proposes an inverse tone mapping algorithm based on a conditional generative adversarial network (CGAN) to reconstruct HDR images. To this end, a multi-branch generator network and a discriminator network built from discriminator blocks are designed, and the data generation and feature extraction capabilities of the CGAN are used to map a single LDR image from the BT.709 color gamut to the corresponding BT.2020 color gamut. Experimental results show that the proposed network achieves higher objective and subjective quality than existing methods. In particular, for blurred regions in low-color-gamut content, the proposed method reconstructs clearer textures and details.
Keywords: conditional generative adversarial network (CGAN); convolutional neural network; inverse tone mapping; gamut mapping; feature extraction
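
For orientation, the BT.709 to BT.2020 conversion mentioned in the abstract has a conventional, non-learned counterpart: a fixed 3x3 matrix applied to linear-light RGB, as specified in ITU-R BT.2087. The sketch below shows only that baseline. It is not the CGAN proposed in the paper, and the function names and the list-of-rows image layout are assumptions made for illustration.

```python
# Illustrative baseline only: fixed primaries conversion for linear-light RGB,
# using the BT.709 -> BT.2020 matrix from ITU-R BT.2087. This is NOT the
# learned CGAN mapping described in the abstract.

BT709_TO_BT2020 = (
    (0.6274, 0.3293, 0.0433),
    (0.0691, 0.9195, 0.0114),
    (0.0164, 0.0880, 0.8956),
)


def bt709_to_bt2020_linear(rgb):
    """Convert one linear-light BT.709 RGB triple to BT.2020 primaries."""
    r, g, b = rgb
    return tuple(m[0] * r + m[1] * g + m[2] * b for m in BT709_TO_BT2020)


def convert_image(pixels):
    """Apply the conversion to a list of rows of RGB triples (hypothetical layout)."""
    return [[bt709_to_bt2020_linear(p) for p in row] for row in pixels]
```

A fixed matrix like this only re-expresses the same colors in a wider container. It cannot restore saturation or detail that the LDR capture lost, which is the gap a learned mapping aims to close.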

Figure 1. Basic structure of a GAN

Figure 2. Structure of the proposed generator network

Figure 3. Structure of the proposed discriminator network and its internal discriminator block

Figure 4. The 20 LDR test images used in the experiments

Figure 5. Subjective comparison of HDR images obtained by different methods

Figure 6. HDR image reconstruction in a blurred, low-color-gamut scene

Figure 7. HDR image reconstruction in general scenes

Figure 8. Ablation experiment: subjective image quality obtained by retaining different branches of the network

Figure 9. Feature maps output by different branches

Table 1. Comparison of objective performance among different methods

Method	PSNR	MPSNR	SSIM	MS-SSIM	HDR-VDP-2
DrTMO [2]	22.31	22.44	0.58	0.59	63.79
ExpandNet [5]	23.61	23.79	0.70	0.71	78.02
HDRCNN [3]	25.70	25.95	0.60	0.63	70.93
Proposed method	24.99	25.26	0.71	0.77	78.67
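
As a reading aid for the PSNR column, the following is a minimal sketch of the textbook PSNR definition. The page does not state the evaluation protocol (peak value, color space, or how MPSNR aggregates exposures), so the `peak` parameter and the flat-sequence input format are assumptions, and this is not necessarily how the authors computed their numbers.

```python
import math


def psnr(reference, test, peak=1.0):
    """Textbook PSNR in dB between two equal-length flat sequences of pixel values.

    `peak` is the maximum possible pixel value (assumed 1.0 for normalized data).
    """
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all samples.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(peak ** 2 / mse)
```

Higher values mean a smaller mean squared error relative to the peak value. PSNR does not model perception, which is why the table also reports SSIM-type scores and HDR-VDP-2.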

Table 2. Ablation experiment: comparison of objective performance of different branches of the network

Method	PSNR	MPSNR	SSIM	MS-SSIM	HDR-VDP-2
Proposed method	24.99	25.26	0.71	0.77	78.67
Without mid-frequency branch	22.06	22.46	0.61	0.65	77.73
Mid-frequency branch only	22.36	30.46	0.42	0.45	77.54
High-frequency branch only	22.84	42.44	0.45	0.47	78.72

References (20)

[1]	MA Z X. HDR technology and application on 4K ultra-high-definition TV[J]. Television Technology, 2019, 43(1): 33-39 (in Chinese). doi: 10.3969/j.issn.2096-0751.2019.01.010
[2]	ENDO Y, KANAMORI Y, MITANI J. Deep reverse tone mapping[J]. ACM Transactions on Graphics, 2017, 36(6): 177:1-177:10.
[3]	EILERTSEN G, KRONANDER J, DENES G, et al. HDR image reconstruction from a single exposure using deep CNNs[J]. ACM Transactions on Graphics, 2017, 36(6): 1-15. http://www.repository.cam.ac.uk/bitstream/1810/277485/3/paper-opt.pdf
[4]	XU Y C, SONG L, XIE R, et al. Deep video inverse tone mapping[C]//2019 IEEE Fifth International Conference on Multimedia Big Data (BigMM). Piscataway: IEEE Press, 2019: 142-147.
[5]	MARNERIDES D, BASHFORD-ROGERS T, HATCHETT J, et al. ExpandNet: A deep convolutional neural network for high dynamic range expansion from low dynamic range content[J]. Computer Graphics Forum, 2018, 37(2): 37-49.
[6]	KINOSHITA Y, KIYA H. iTM-Net: Deep inverse tone mapping using novel loss function considering tone mapping operator[J]. IEEE Access, 2019, 7: 73555-73563. doi: 10.1109/ACCESS.2019.2919296
[7]	LEE S, AN G H, KANG S J. Deep chain HDRI: Reconstructing a high dynamic range image from a single low dynamic range image[J]. IEEE Access, 2018, 6: 49913-49924. doi: 10.1109/ACCESS.2018.2868246
[8]	XU Y C, NING S Y, XIE R, et al. GAN based multi-exposure inverse tone mapping[C]//2019 IEEE International Conference on Image Processing (ICIP). Piscataway: IEEE Press, 2019: 1-5.
[9]	NING S Y, XU H T, SONG L, et al. Learning an inverse tone mapping network with a generative adversarial regularizer[C]//2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Piscataway: IEEE Press, 2018: 1383-1387.
[10]	LEE S, AN G H, KANG S J. Deep recursive HDRI: Inverse tone mapping using generative adversarial networks[C]//Proceedings of the European Conference on Computer Vision (ECCV). Berlin: Springer, 2018: 596-611.
[11]	RONNEBERGER O, FISCHER P, BROX T. U-Net: Convolutional networks for biomedical image segmentation[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Berlin: Springer, 2015: 234-241.
[12]	TAKEUCHI M, SAKAMOTO Y, YOKOYAMA R, et al. A gamut-extension method considering color information restoration using convolutional neural networks[C]//2019 IEEE International Conference on Image Processing (ICIP). Piscataway: IEEE Press, 2019: 774-778.
[13]	LEDIG C, THEIS L, HUSZÁR F, et al. Photo-realistic single image super-resolution using a generative adversarial network[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 4681-4690.
[14]	WANG X T, YU K, WU S X, et al. ESRGAN: Enhanced super-resolution generative adversarial networks[C]//Proceedings of the European Conference on Computer Vision (ECCV). Berlin: Springer, 2018: 63-79.
[15]	ISOLA P, ZHU J Y, ZHOU T, et al. Image-to-image translation with conditional adversarial networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 1125-1134.
[16]	ZHU J Y, PARK T, ISOLA P, et al. Unpaired image-to-image translation using cycle-consistent adversarial networks[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 2223-2232.
[17]	SIAROHIN A, SANGINETO E, LATHUILIÈRE S, et al. Deformable GANs for pose-based human image generation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 3408-3416.
[18]	CHAN C, GINOSAR S, ZHOU T H, et al. Everybody dance now[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 5933-5942.
[19]	GOODFELLOW I, POUGET-ABADIE J, MIRZA M, et al. Generative adversarial nets[C]//Advances in Neural Information Processing Systems, 2014: 2672-2680.
[20]	RATLIFF L J, BURDEN S A, SASTRY S S. Characterization and computation of local Nash equilibria in continuous games[C]//2013 51st Annual Allerton Conference on Communication, Control, and Computing. Piscataway: IEEE Press, 2013: 917-924.