Infrared and visible image fusion using improved generative adversarial networks

Min Li^1, Cao Sijian^1, Zhao Huaici^2, Liu Pengfei^2

doi: 10.3788/IRLA20210291

1. School of Mechanical Engineering, Shenyang Jianzhu University, Shenyang 110168, China
2. Key Laboratory of Optical-Electronics Information Processing, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110169, China

Funds: National Key Research and Development Program of China (2018YFB1105300); Equipment Pre-research Key Foundation (JZX7Y2019025049301)

Author biography: Min Li, female, associate professor, master's supervisor, Ph.D. Her research interests are pattern recognition and intelligent systems, and image processing and machine vision.

Corresponding author: Zhao Huaici, male, research fellow, doctoral supervisor, Ph.D. His research interests are image processing; complex system modeling and simulation; and command, control, communication and information processing technologies.

CLC number: TP391

Abstract: Infrared and visible image fusion provides both the thermal radiation information of infrared images and the texture detail of visible images, and is widely used in intelligent surveillance, target detection and tracking. Because the two types of images are based on different imaging principles, the key to fusion is combining the strengths of each while keeping the result free of distortion. Traditional fusion methods merely superimpose image information and ignore the semantic information of the images. To address this problem, an improved generative adversarial network is proposed. The generator has two branches, one for local detail features and one for global semantic features, to capture the detail and semantic information of the source images; a spectral normalization module is introduced into the discriminator to ease the training difficulty of traditional generative adversarial networks and accelerate convergence; and a perceptual loss is introduced to maintain the structural similarity between the fused image and the source images, further improving fusion accuracy. Experimental results show that the proposed method outperforms other representative methods in both subjective evaluation and objective metrics. Compared with the method based on the total variation model, the average gradient and spatial frequency increase by 55.84% and 49.95%, respectively.
Key words:
- image fusion
- generative adversarial network
- semantic information
- spectral normalization

Figure 1. Overall framework of the network structure

Figure 2. Generator network structure

Figure 3. Discriminator network structure

Figure 4. Comparative experimental results on the TNO dataset

Figure 5. Comparative experimental results on the RoadScene dataset. (a) Infrared image; (b) Visible image; (c) DRTV; (d) CNN; (e) FusionGAN; (f) DIDFuse; (g) DDcGAN; (h) Proposed method

Figure 6. Loss function curve

Table 1. Objective evaluation results of the two comparative experiments

Dataset	Methods	AG	SF	${Q^{AB/F}}$	${Q_{CB}}$
TNO	DRTV	3.761	9.639	0.319	0.411
	CNN	4.700	11.489	0.332	0.463
	FusionGAN	4.014	10.006	0.313	0.425
	DIDFuse	4.644	11.771	0.395	0.472
	DDcGAN	5.529	13.044	0.356	0.456
	Proposed method	5.861	14.454	0.401	0.504
RoadScene	DRTV	3.221	8.696	0.368	0.384
	CNN	4.484	10.536	0.398	0.384
	FusionGAN	3.290	8.426	0.278	0.387
	DIDFuse	5.253	14.149	0.469	0.452
	DDcGAN	5.200	13.580	0.423	0.461
	Proposed method	5.855	15.243	0.480	0.494
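
The relative improvements quoted in the abstract can be checked directly against the TNO rows of Table 1. The short sketch below does that arithmetic; the helper name `relative_gain` and the dictionary layout are illustrative choices, not anything taken from the paper.

```python
def relative_gain(baseline: float, improved: float) -> float:
    """Percentage increase of `improved` over `baseline`."""
    return (improved - baseline) / baseline * 100.0


# TNO rows of Table 1: average gradient (AG) and spatial frequency (SF)
# for DRTV (the total-variation-based method) and for the proposed method.
drtv = {"AG": 3.761, "SF": 9.639}
proposed = {"AG": 5.861, "SF": 14.454}

# Relative gain per metric, rounded to two decimals as in the abstract.
gains = {name: round(relative_gain(drtv[name], proposed[name]), 2) for name in drtv}
print(gains)
```

With these table values the computed gains are 55.84% for AG and 49.95% for SF, which matches the figures stated in the abstract for the comparison against the total variation model.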


Table 2. Evaluation results of the ablation experiments

Methods	AG	SF	${Q^{AB/F}}$	${Q_{CB}}$
DDcGAN	5.529	13.044	0.356	0.456
DDcGAN+SN	5.670	13.062	0.368	0.469
DDcGAN+GSFB	5.746	13.841	0.388	0.484
DDcGAN+GSFB+SN	5.861	14.454	0.401	0.504


[1]	Shen Ying, Huang Chunhong, Huang Feng, et al. Infrared and visible image fusion: review of key technologies [J]. Infrared and Laser Engineering, 2021, 50(9): 20200467. (in Chinese)
[2]	Shen Yali. RGBT dual-model Siamese tracking network with feature fusion [J]. Infrared and Laser Engineering, 2021, 50(3): 20200459. (in Chinese)
[3]	Chen J, Wu K, Cheng Z, et al. A saliency-based multiscale approach for infrared and visible image fusion [J]. Signal Processing, 2021, 182(4): 107936.
[4]	Huan Kewei, Li Xiangyang, Cao Yutong, et al. Infrared and visible image fusion with convolutional neural network and NSST [J]. Infrared and Laser Engineering, 2022, 51(3): 20210139. (in Chinese)
[5]	An W B, Wang H M. Infrared and visible image fusion with supervised convolutional neural network [J]. Optik-International Journal for Light and Electron Optics, 2020, 219(17): 165120.
[6]	Pan Y, Pi D, Khan I A, et al. DenseNetFuse: A study of deep unsupervised DenseNet to infrared and visual image fusion [J]. Journal of Ambient Intelligence and Humanized Computing, 2021(3): 02820.
[7]	Goodfellow I J, Pouget-Abadie J, Mirza M, et al. Generative adversarial networks [J]. Advances in Neural Information Processing Systems, 2014, 3: 2672-2680.
[8]	Ma J, Wei Y, Liang P, et al. FusionGAN: A generative adversarial network for infrared and visible image fusion [J]. Information Fusion, 2019, 48: 11-26. doi: 10.1016/j.inffus.2018.09.004
[9]	Ma J, Xu H, Jiang J, et al. DDcGAN: A dual-discriminator conditional generative adversarial network for multi-resolution image fusion [J]. IEEE Transactions on Image Processing, 2020, 29: 4980-4995. doi: 10.1109/TIP.2020.2977573
[10]	Arjovsky M, Chintala S, Bottou L. Wasserstein GAN [J]. arXiv, 2017: 1701.07875v1.
[11]	Miyato T, Kataoka T, Koyama M, et al. Spectral normalization for generative adversarial networks [C]//International Conference on Learning Representations, 2018.
[12]	Jégou S, Drozdzal M, Vazquez D, et al. The one hundred layers tiramisu: Fully convolutional densenets for semantic segmentation [C]//2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), IEEE, 2017.
[13]	Li X, You A, Zhu Z, et al. Semantic flow for fast and accurate scene parsing [J]. arXiv, 2020: 2002.10120.
[14]	Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition [J]. arXiv, 2014: 1409.1556.
[15]	Du Qinglei, Xu Han, Ma Yong, et al. Fusing infrared and visible images of different resolutions via total variation model [J]. Sensors, 2018, 18(11): 3827.
[16]	Li H, Wu X J, Kittler J. Infrared and visible image fusion using a deep learning framework[J]. arXiv, 2018: 1804.06992.
[17]	Zhao Z, Xu S, Zhang C, et al. DIDFuse: Deep image decomposition for infrared and visible image fusion [C]//Twenty-Ninth International Joint Conference on Artificial Intelligence and Seventeenth Pacific Rim International Conference on Artificial Intelligence, 2020.



0. Introduction

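Infrared and visible image fusion, an important branch of image fusion, is widely used in military reconnaissance and civilian surveillance^[1]. Infrared imaging detectors capture targets through the brightness-temperature difference between target and background. They do not depend on a light source as visible-light sensors do, can identify targets at night, and cope with adverse weather, but their images usually have low resolution. Visible-light imaging sensors capture the reflected information of targets; their images suit the human visual system and offer high resolution and rich detail, but are easily affected by illumination and weather. The two kinds of images are therefore naturally complementary^[2], and a fused image can provide both high-brightness target information and high-resolution scene texture detail.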

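Traditional infrared and visible image fusion methods usually embed other techniques, such as saliency detection^[3], into a multi-scale transform framework and build hybrid models that combine the strengths of each technique. This improves fusion performance, but the fusion rules still have to be designed by hand, which makes traditional methods increasingly complex.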

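Convolutional neural networks (CNNs) partition an image with convolution operations and automatically extract features at different levels. In recent years, improvements in spatial exploitation, depth, multi-path design, width, feature-map exploitation, channel boosting and attention mechanisms have markedly increased their learning capacity, and CNNs have been widely applied to infrared and visible image fusion^[4]. For example, An et al.^[5] proposed a CNN-based fusion method that makes the features of the fused image clearer; Pan et al.^[6] built a fusion algorithm on densely connected convolutional networks (DenseNet), making full use of the features extracted by each convolutional layer. However, training a CNN requires a large amount of labeled data, whereas no fusion standard can be defined for infrared and visible image fusion. There is thus no ground truth to guide the training of the fusion framework, which leads to poor CNN fusion performance.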

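Generative adversarial networks (GANs)^[7] have distinctive advantages in image generation and can approximate the distribution of real data arbitrarily closely without supervision. Exploiting this property, Ma et al.^[8] proposed FusionGAN, which sets up an adversarial game between a generator and a discriminator so that the fused image retains richer information; its end-to-end structure no longer requires hand-designed fusion rules. Ma et al.^[9] later improved on this with the dual-discriminator conditional generative adversarial network (DDcGAN), which preserves information from both source images. The two discriminators, however, make the network more complex and the generator and discriminators harder to balance, so artifacts appear in the fused images.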

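GAN-based fusion methods currently concentrate on designing more complex generator structures to capture the thermal radiation intensity of infrared images and the texture detail of visible images, and they apply a single feature-extraction scheme to the source images. As a result, the fused image usually performs well in local regions, but because the rich semantic information in the source images is ignored, its edges are blurred. These methods also share the common weakness of GANs: unstable training.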

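To address these problems, this paper proposes an improved generative adversarial network for infrared and visible image fusion. An adversarial game between the fused image and the source images trains the generator thoroughly and improves the fusion result. First, a local detail feature branch (Part Detail Feature Branch, PDFB) and a global semantic feature branch (Global Semantic Feature Branch, GSFB) are built in the generator to extract the detail and semantic information of the input images simultaneously, giving the fused image clearer textures and edges. Second, a spectral normalization (SN) module is introduced into the discriminator to stabilize training and make the network easier to converge. Finally, a perceptual loss is added to the loss function to increase the semantic similarity between the fused image and the source images.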
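
Spectral normalization [11] rescales a layer's weight matrix by its largest singular value, which bounds the layer's Lipschitz constant by 1. The sketch below illustrates that idea in plain Python on a small matrix. It is not the paper's discriminator code: the function names are hypothetical, and it runs many power-iteration steps on a fixed matrix, whereas training-time implementations typically carry one step per update with a persistent vector.

```python
import math
import random


def matvec(W, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def l2_normalize(v, eps=1e-12):
    """Scale v to unit Euclidean length (eps guards against division by zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]


def spectral_norm(W, n_iters=50, seed=0):
    """Estimate the largest singular value of W by power iteration."""
    rng = random.Random(seed)
    Wt = [list(col) for col in zip(*W)]  # transpose of W
    u = l2_normalize([rng.gauss(0.0, 1.0) for _ in W])
    for _ in range(n_iters):
        v = l2_normalize(matvec(Wt, u))  # right singular vector estimate
        u = l2_normalize(matvec(W, v))   # left singular vector estimate
    # sigma is approximately u^T W v once u and v have converged
    return sum(a * b for a, b in zip(u, matvec(W, v)))


def spectrally_normalize(W, n_iters=50, seed=0):
    """Return W divided by its estimated largest singular value."""
    sigma = spectral_norm(W, n_iters=n_iters, seed=seed)
    return [[w / sigma for w in row] for row in W]


W = [[3.0, 0.0], [0.0, 1.0]]  # toy weights with singular values 3 and 1
sigma = spectral_norm(W)
W_sn = spectrally_normalize(W)
print(sigma, spectral_norm(W_sn))
```

For this diagonal toy matrix the estimate converges to the largest singular value, 3, and the normalized matrix has a largest singular value of about 1, which is the property the discriminator relies on for more stable training.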

4. Conclusion

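This paper proposes a generative adversarial network, improved with semantic information and spectral normalization, that fuses infrared and visible images end to end. First, a local detail feature branch and a global semantic feature branch are built inside the encoder of the generator to extract the detail and semantic information of the input images. Second, spectral normalization is used in the discriminator to stabilize training, and a perceptual loss is introduced to improve the visual quality of the fused image. Finally, comparative and ablation experiments show that the improved method achieves better results both subjectively, in the edges and texture detail of the fused images, and objectively, in fused-image quality metrics. The method also has a shortcoming: processing a test image takes 1.274 s on average, because the method is based on deep learning and places high demands on hardware performance. The next stage of work will therefore focus on lightweight network designs to address this problem, so that the method can be applied to devices such as portable multispectral cameras.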


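Contents

1.1. Generative adversarial networks
1.2. Defects of generative adversarial networks and improvements
2.1. Overall network framework
2.2. Generator network structure
2.3. Discriminator network structure
2.4. Loss function design
3.1. Datasets and experimental procedure
3.2. Comparative experiments
3.3. Ablation experiments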
