|
|
|
|
|
|
A Multi-Task Convolutional Neural Network for Infrared and Visible Multi-Resolution Image Fusion |
ZHU Wen-qing1, 2, 3, ZHANG Ning1, 2, 3, LI Zheng1, 2, 3*, LIU Peng1, 3, TANG Xin-yi1, 3 |
1. Shanghai Institute of Technical Physics, Chinese Academy of Sciences, Shanghai 200083, China
2. University of Chinese Academy of Sciences, Beijing 100049, China
3. Key Laboratory of Infrared System Detection and Imaging Technology, Chinese Academy of Sciences, Shanghai 200083, China
|
|
|
Abstract Infrared and visible image fusion has long been a research hotspot in the image-processing field. Fusion technology can compensate for the deficiencies of a single sensor and provide a good imaging foundation for image understanding and analysis. Owing to limitations of manufacturing technology and cost, the resolution of infrared detectors is much lower than that of visible detectors, which greatly limits practical use. A multi-task convolutional neural network framework that combines the infrared super-resolution and image fusion tasks is proposed and applied to infrared and visible multi-resolution image fusion. In terms of network structure, firstly, a dual-channel network is designed to extract infrared and visible features separately, so that the proposed algorithm is not limited by the resolution of either source image. Secondly, a feature up-sampling block is proposed: bilinear interpolation increases the number of pixels, and a multilayer perceptron then refines the mapping between the smooth pixel space and the high-frequency space. Infrared images can therefore be reconstructed at arbitrary scales, including scales for which no training task is provided. Furthermore, a linear self-attention mechanism is introduced into the network to learn the nonlinear relationships between feature-space positions, suppress irrelevant information, and enhance the expression of global information. In terms of the loss function, a gradient loss is proposed: at each position, the filter response with the larger absolute value in the infrared and visible images is retained, and the Frobenius norm between this target and the response of the reconstructed fusion image is computed. Fusion images can thus be generated without ideal ground-truth images to supervise network learning. Finally, the fused and high-resolution infrared images are reconstructed simultaneously by optimizing the multi-task model under the combined action of the gradient loss and the pixel loss. 
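The gradient loss described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: simple forward differences stand in for whatever gradient filter the network actually uses, and the function names are hypothetical.

```python
import numpy as np

def gradients(img):
    # Forward differences as a stand-in gradient operator (assumption);
    # edge rows/columns are padded by repetition to keep shapes equal.
    gx = np.diff(img, axis=1, append=img[:, -1:])
    gy = np.diff(img, axis=0, append=img[-1:, :])
    return gx, gy

def gradient_loss(ir, vis, fused):
    # For each direction, keep the source response with the larger
    # absolute value, then measure the Frobenius norm of its
    # difference from the fused image's response.
    total = 0.0
    for g_ir, g_vis, g_f in zip(gradients(ir), gradients(vis), gradients(fused)):
        target = np.where(np.abs(g_ir) >= np.abs(g_vis), g_ir, g_vis)
        total += np.linalg.norm(target - g_f)  # Frobenius norm
    return total
```

Because the target is built from the source images themselves, this loss needs no ideal fused image as ground truth, which is the point made in the abstract.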
The proposed approach is trained on the RoadScene dataset and compared with four related algorithms on the TNO dataset. In terms of subjective performance, the proposed method accepts source images of arbitrary resolution, and the fusion images have prominent infrared targets and rich visible details. Even when the resolutions of the source images differ considerably, the proposed method can still reconstruct high-resolution infrared images with clear features, and it generalizes robustly. Objective performance is excellent on multiple evaluation metrics, such as entropy, the sum of the correlations of differences (SCD), and spatial frequency. The experimental results demonstrate that the fusion images carry a large amount of information, a high information-conversion rate, and high clarity, which verifies the effectiveness of the proposed method.
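Two of the metrics named above have standard closed-form definitions that can be sketched directly; the snippet below is an illustrative NumPy version of those textbook formulas, not the authors' evaluation code.

```python
import numpy as np

def entropy(img, bins=256):
    # Shannon entropy of the grey-level histogram (assumes 8-bit range).
    hist, _ = np.histogram(img, bins=bins, range=(0, 255))
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def spatial_frequency(img):
    # SF = sqrt(RF^2 + CF^2), where RF/CF are the RMS of the
    # horizontal/vertical first differences.
    img = img.astype(np.float64)
    rf = np.sqrt(np.mean(np.diff(img, axis=1) ** 2))  # row frequency
    cf = np.sqrt(np.mean(np.diff(img, axis=0) ** 2))  # column frequency
    return np.sqrt(rf ** 2 + cf ** 2)
```

Higher entropy indicates more information content in the fused image, while higher spatial frequency indicates greater clarity, matching how the abstract interprets these scores.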
|
Received: 2021-12-10
Accepted: 2022-02-23
|
|
Corresponding Authors:
LI Zheng
E-mail: Lizheng_sitp@163.com
|
|
|
|
|
|