J-MSF: A new infrared dim and small target detection algorithm based on multi-channel and multiscale

Wang Guogang; Sun Zhaojin; Liu Yunpeng

doi:10.3788/IRLA20210459

A novel infrared dim and small target detection algorithm, J-MSF, based on multi-channel and multi-scale feature fusion is proposed. It addresses a weakness of classical deep-learning detectors for infrared dim and small targets, which miss targets because the target information disappears in the upper receptive field. First, a new multi-channel Janet structure is proposed and used to design the J-MSF backbone feature extraction framework. Second, a descending threshold feature pyramid pooling structure (DSPP) is introduced, together with a multi-scale fusion detection strategy. Finally, a Gauss loss optimization function is designed. Experiments on "A dataset for infrared detection and tracking of dim-small aircraft targets under ground/air background" show that the proposed algorithm improves the recall rate and the AP value by 9.07% and 9.89% over YOLOv3, and by 1.67% and 3.16% over YOLOv4. The proposed algorithm can be applied effectively to infrared dim and small target detection, shows good robustness and adaptability, and outperforms the state-of-the-art algorithms it was compared with.
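The abstract names a Gauss loss but does not give its formula. As a rough illustration only, the sketch below shows the Gaussian negative log-likelihood used by Gaussian YOLOv3-style detectors, where each box coordinate is predicted as a mean plus an uncertainty. Whether J-MSF uses exactly this form is an assumption; the function name `gaussian_nll` and the `eps` constant are hypothetical.

```python
import math


def gaussian_nll(target, mean, sigma, eps=1e-9):
    """Negative log-likelihood of `target` under N(mean, sigma^2).

    Assumed form (Gaussian YOLOv3 style), not confirmed as the paper's loss.
    `eps` keeps the logarithm finite when the density underflows.
    """
    z = (target - mean) / sigma
    density = math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    return -math.log(density + eps)


# An accurate, confident prediction gives a low loss.
good = gaussian_nll(target=0.5, mean=0.5, sigma=0.1)
# A wrong but confident prediction is penalised heavily.
wrong_confident = gaussian_nll(target=0.5, mean=0.8, sigma=0.1)
# The same error with a larger predicted uncertainty is penalised less.
wrong_uncertain = gaussian_nll(target=0.5, mean=0.8, sigma=0.3)
print(good, wrong_confident, wrong_uncertain)
```

The point of such a loss is that the network is rewarded for reporting high uncertainty on coordinates it localizes poorly, which can help with very small targets.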



Fusion map/layer	Kernel size	Output size	Stride	Channel
Basic-feature map	-	8×8	-	1024
Artery-feature map	-	16×16	-	768
Detection map 1	-	32×32	-	30
Detection map 2	-	64×64	-	30
Detection map 3	-	128×128	-	30
Maxpooling 1	3	32×32	1	128
Maxpooling 2	5	32×32	1	128
Maxpooling 3	7	32×32	1	128
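The three max-pooling rows above (kernel sizes 3, 5 and 7, stride 1, all with a 32×32 output) are consistent with a spatial-pyramid-pooling style block in which padding keeps the spatial size unchanged. The following is a minimal pure-Python sketch of that idea on a single channel. It assumes "same" padding and assumes the pooled maps are stacked with the input, as in common SPP blocks; the "descending threshold" part of DSPP is not described on this page and is not modelled. The function names are hypothetical.

```python
def max_pool_same(fmap, k):
    """Stride-1 max pooling over a k x k window, output size equal to input size.

    Clipping the window at the borders is equivalent to padding with -inf.
    """
    h, w = len(fmap), len(fmap[0])
    r = k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [fmap[y][x]
                      for y in range(max(0, i - r), min(h, i + r + 1))
                      for x in range(max(0, j - r), min(w, j + r + 1))]
            row.append(max(window))
        out.append(row)
    return out


def pyramid_pool(fmap, kernels=(3, 5, 7)):
    """Return the input map followed by one pooled map per kernel size."""
    return [fmap] + [max_pool_same(fmap, k) for k in kernels]


# A single bright pixel, standing in for a dim small target.
fmap = [[0.0] * 8 for _ in range(8)]
fmap[4][4] = 1.0
for channel in pyramid_pool(fmap):
    print(len(channel), len(channel[0]), sum(map(sum, channel)))
```

Each larger kernel spreads the single-pixel response over a wider neighbourhood without shrinking the map, which is why all three pooling rows share the same output size.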

SNR region	3.26 to 3	3 to 2	2 to 1	1 to 0	0 to −1.97	−3 to −20
Data4	0	5	209	379	204	2
Data8	2	39	108	94	101	55
Data12	5	84	407	424	341	238
Data16	5	247	214	15	1	12
Data20	0	12	155	197	29	8
Total	12	387	1093	1109	676	315
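The Total row can be checked directly from the per-sequence counts. The short sketch below recomputes the column totals and the share of targets whose SNR is below 2; the variable names are illustrative, and the bin labels are shorthand for the column headers above.

```python
# Target counts per SNR bin, copied from the table above.
snr_bins = ["3.26 to 3", "3 to 2", "2 to 1", "1 to 0", "0 to -1.97", "-3 to -20"]
counts = {
    "Data4":  [0, 5, 209, 379, 204, 2],
    "Data8":  [2, 39, 108, 94, 101, 55],
    "Data12": [5, 84, 407, 424, 341, 238],
    "Data16": [5, 247, 214, 15, 1, 12],
    "Data20": [0, 12, 155, 197, 29, 8],
}

# Column-wise sums reproduce the Total row.
totals = [sum(col) for col in zip(*counts.values())]
grand_total = sum(totals)

# Bins from index 2 onward all lie below an SNR of 2.
share_below_2 = sum(totals[2:]) / grand_total

print(dict(zip(snr_bins, totals)))
print(grand_total, f"{share_below_2:.1%}")
```

The totals match the table, and roughly 89% of the 3592 targets fall in bins with SNR below 2, which illustrates how dim the targets in these sequences are.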

Model	$X_{FN}$	R	AP
Darknet-53	458	87.2%	86.38%
Darknet-53-JA	343	90.0%	88.43%
J-MSF	217	94.0%	93.13%
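If $X_{FN}$ is read as the number of missed targets (false negatives), recall follows as R = 1 − X_FN / N, where N is the number of ground-truth targets. The sketch below assumes N = 3592, the grand total of the SNR table; that the same set was used for this evaluation is an assumption, and the function name is hypothetical.

```python
def recall_from_misses(false_negatives, total_targets):
    """Recall = TP / (TP + FN) = 1 - FN / (TP + FN)."""
    return 1.0 - false_negatives / total_targets


TOTAL_TARGETS = 3592  # assumption: grand total of the SNR distribution table

for model, fn in [("Darknet-53", 458), ("Darknet-53-JA", 343), ("J-MSF", 217)]:
    print(model, f"{recall_from_misses(fn, TOTAL_TARGETS):.1%}")
```

Under this assumption the Darknet-53 and J-MSF rows reproduce the reported 87.2% and 94.0%. The Darknet-53-JA row comes out near 90.5% against the reported 90.0%, so the exact target count behind the table may differ slightly.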

Darknet-53	J-MSF	Loss	Fusion	Precision	R	AP	FPS
√	-	D	-	86%	87.20%	86.38%	66.3
√	-	D	√	92%	92.20%	92.74%	57.5
√	-	M	-	89%	94.04%	93.88%	71.9
√	-	M	√	82%	95.00%	93.47%	71.6
-	√	D	-	90%	94.00%	93.13%	59.0
-	√	D	√	90%	94.10%	93.46%	73.4
-	√	M	-	86%	95.85%	94.80%	66.8
-	√	M	√	88%	96.27%	96.29%	67.6
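The headline gains can be recomputed from the first and last rows of this table: the Darknet-53 baseline with loss D and no fusion, and J-MSF with loss M and fusion. The sketch below does that subtraction; the dictionary names are illustrative.

```python
# Recall and AP (in percent) copied from the first and last table rows.
baseline = {"recall": 87.20, "ap": 86.38}   # Darknet-53, loss D, no fusion
full_model = {"recall": 96.27, "ap": 96.29}  # J-MSF, loss M, with fusion

# Gain in percentage points, rounded to two decimals.
gains = {k: round(full_model[k] - baseline[k], 2) for k in baseline}
print(gains)
```

The recall gain of 9.07 points matches the figure in the abstract. The AP gain from the table is 9.91 points, slightly different from the 9.89 quoted in the abstract.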


Corresponding author: Chen Bin, bchen63@163.com
