Research on multi-band infrared image generation based on improved StarGAN

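  • Abstract: To convert visible-light images efficiently and realistically into infrared images of the same scene in different bands, and thereby build a multi-band image dataset that provides sufficient data for developing and testing multi-band composite detection algorithms, a multi-band infrared image generation method based on an improved StarGAN is proposed. First, the network structure of the original StarGAN model is modified: target-band label information is introduced to constrain the direction of image translation, so that visible-light images are converted to multi-band infrared images in one direction only, which reduces the computational cost of model training. In addition, the reconstruction loss function of the original StarGAN model is modified so that differences are computed at the image-feature level rather than the pixel level, which improves the visual fidelity of the generated infrared images while keeping their features consistent with those of the source visible-light images. Experimental results show that, compared with three typical methods of the same kind (Pix2Pix, CycleGAN, and the original StarGAN), the improved StarGAN model takes substantially less time to train and generates infrared images with richer texture detail. For short-wave infrared images, the structural similarity (SSIM) improves by approximately 21%, 9%, and 10%, respectively, and the learned perceptual image patch similarity (LPIPS) improves by approximately 46%, 32%, and 25%, respectively; for long-wave infrared images, SSIM improves by approximately 19%, 13%, and 8%, and LPIPS by approximately 56%, 49%, and 37%, respectively. These results demonstrate the method's clear advantages in training efficiency and generated image quality, and it has good application value.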
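The role of the target-band label can be illustrated with a minimal sketch. The original StarGAN conditions its generator by replicating a one-hot domain label over the spatial dimensions and concatenating it with the image channels; whether the improved model uses exactly this mechanism is not stated in the abstract, so the code below is an assumption for illustration only. The names `BANDS`, `one_hot`, and `condition_on_band` are hypothetical, and the band list contains only the two bands named in the abstract.

```python
# Assumption: two target bands, as named in the abstract.
BANDS = ("SWIR", "LWIR")


def one_hot(band):
    """Return a one-hot list marking the requested target band."""
    if band not in BANDS:
        raise ValueError(f"unknown band: {band}")
    return [1.0 if b == band else 0.0 for b in BANDS]


def condition_on_band(image, band):
    """Append one constant H x W plane per band to a C x H x W image.

    `image` is a list of channels, each channel a list of rows.
    The plane for the target band is all ones, the others all zeros.
    """
    height, width = len(image[0]), len(image[0][0])
    planes = [[[value] * width for _ in range(height)] for value in one_hot(band)]
    return image + planes


# A tiny 3-channel, 2 x 2 visible-light image (hypothetical values).
rgb = [[[0.2, 0.4], [0.6, 0.8]] for _ in range(3)]
generator_input = condition_on_band(rgb, "LWIR")
print(len(generator_input))  # number of channels fed to the generator
```

With a single conditioned generator, changing the `band` argument is enough to request a different output band from the same visible-light input.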

     

    Abstract:
    Objective Multi-band composite imaging detection exploits the complementary advantages of detection in different bands: it acquires more target information and is more resistant to interference. Intelligent image processing, one of the key technologies of multi-band composite imaging, requires a large number of multi-band images for training, so quickly obtaining many images of the same scene in different bands is a pressing problem. To convert visible-light images efficiently and realistically into infrared images of the same scene in different bands, build a multi-band image dataset, and provide sufficient data for developing and testing multi-band composite detection algorithms, this paper proposes a multi-band infrared image generation method based on an improved StarGAN model.
    Methods The multi-band infrared image generation method studied in this paper builds on the StarGAN model, which can learn the relationships among multiple domains simultaneously and convert between them. To improve training efficiency, the network structure of StarGAN was modified to realize the simultaneous conversion of visible images to multi-band infrared images (Fig.2). The training loss function was also modified: differences in image features replace differences in pixel values, which enhances the fidelity of the generated infrared images.
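The difference between a pixel-level and a feature-level loss can be sketched as follows. The abstract does not say which feature extractor the authors use, so the sketch substitutes a hypothetical hand-written extractor (`gradient_features`, simple horizontal and vertical differences) for a learned feature network; all function names and image values are assumptions made for illustration.

```python
def pixel_l1(a, b):
    """Mean absolute difference between two equally sized grayscale images."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def gradient_features(img):
    """Hypothetical stand-in for a learned feature extractor.

    Returns two feature maps: horizontal and vertical neighbour differences.
    """
    h, w = len(img), len(img[0])
    dx = [[img[y][x + 1] - img[y][x] for x in range(w - 1)] for y in range(h)]
    dy = [[img[y + 1][x] - img[y][x] for x in range(w)] for y in range(h - 1)]
    return [dx, dy]


def feature_l1(a, b):
    """Mean absolute difference between the feature maps of two images."""
    fa, fb = gradient_features(a), gradient_features(b)
    diffs = [
        abs(va - vb)
        for ma, mb in zip(fa, fb)
        for ra, rb in zip(ma, mb)
        for va, vb in zip(ra, rb)
    ]
    return sum(diffs) / len(diffs)


# A 4 x 4 image with a vertical edge, a uniformly brighter copy,
# and a flat image that has lost the edge entirely.
source = [[0.25, 0.25, 0.75, 0.75] for _ in range(4)]
brighter = [[v + 0.125 for v in row] for row in source]
flat = [[0.5] * 4 for _ in range(4)]

print(pixel_l1(source, brighter), feature_l1(source, brighter))
print(pixel_l1(source, flat), feature_l1(source, flat))
```

In this toy setting the pixel-level loss penalizes the uniform brightness change even though the scene structure is intact, while the feature-level loss penalizes only the image that lost the edge. This matters for visible-to-infrared translation, where intensities are expected to change but scene structure should be preserved.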
    Results and Discussions The proposed method converts visible-light images into infrared images of different bands efficiently. The generated infrared images retain the scene features of the original visible images while showing relatively realistic infrared texture. For example, the generated short-wave infrared (SWIR) images reproduce the strong SWIR radiation reflected by white clouds and vegetation, while the generated long-wave infrared (LWIR) images reproduce the differences in infrared radiation emitted by the various objects in the scene, such as the strong emission from the engines and tires of moving vehicles (Fig.6). The proposed method is compared with three typical methods, Pix2Pix, CycleGAN, and StarGAN, on the same dataset and under the same conditions. The experimental results show that the infrared images generated by the proposed method are more similar to real infrared images and have better image quality (Fig.7). In addition, the efficiency of model training is significantly improved (Tab.1).
    Conclusions A multi-band infrared image generation method based on the improved StarGAN model is proposed. The training efficiency of the improved StarGAN model is significantly enhanced, and the generated infrared images have clear scene structure and rich texture details. Compared with Pix2Pix, CycleGAN, and StarGAN, respectively, the structural similarity index measure (SSIM) of SWIR images is improved by approximately 21%, 9%, and 10%, and the learned perceptual image patch similarity (LPIPS) is improved by approximately 46%, 32%, and 25%. For LWIR images, the SSIM is improved by approximately 19%, 13%, and 8%, and the LPIPS is improved by approximately 56%, 49%, and 37%. The proposed method thus offers a significant improvement in model training efficiency and image generation quality over these typical methods and has good application value.
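The following sketch illustrates how such figures can be read. It computes a simplified SSIM over a whole image as a single window (practical SSIM implementations average the same formula over local windows) and a relative change against a baseline score. The abstract does not state how its percentages were computed, so the relative-change convention here is an assumption, and the scores passed to `relative_improvement` are hypothetical values, not results from the paper. Because LPIPS is a distance, a lower value is better, so an improvement corresponds to a decrease.

```python
def global_ssim(x, y, dynamic_range=1.0):
    """Single-window SSIM between two equally sized grayscale images."""
    xs = [v for row in x for v in row]
    ys = [v for row in y for v in row]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((v - mean_x) ** 2 for v in xs) / n
    var_y = sum((v - mean_y) ** 2 for v in ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys)) / n
    # Standard stabilizing constants from the SSIM definition.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def relative_improvement(new, baseline, higher_is_better=True):
    """Relative change of `new` against `baseline`, signed so that
    a positive result always means the new score is better."""
    change = (new - baseline) / baseline
    return change if higher_is_better else -change


real = [[0.25, 0.75], [0.25, 0.75]]
mirrored = [[0.75, 0.25], [0.75, 0.25]]
print(global_ssim(real, real))      # identical images
print(global_ssim(real, mirrored))  # structure reversed

# Hypothetical scores, NOT taken from the paper.
ssim_gain = relative_improvement(0.60, 0.50, higher_is_better=True)
lpips_gain = relative_improvement(0.30, 0.40, higher_is_better=False)
print(f"{ssim_gain:.0%} {lpips_gain:.0%}")
```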

     
