Infrared dim and small target detection based on YOLO-IDSTD algorithm

Jiang Xinhao; Cai Wei; Yang Zhiyong; Xu Peiwei; Jiang Bo

doi:10.3788/IRLA20210106

To address the difficulty of detecting infrared dim and small targets accurately and quickly against complex backgrounds, a lightweight real-time network model, YOLO-IDSTD, is proposed for infrared dim and small target detection. First, to improve detection speed, the network structure of the feature extraction part is redesigned, and a Focus module is placed after the input layer to reduce inference time. Second, to enhance detection ability, a path aggregation network is adopted in the feature fusion part and an improved receptive field block is added. Finally, the target detection part is extended to four detection scales. Compared with the classical lightweight model YOLOv3-tiny on the infrared dim and small target dataset, recall increases by 7.57%, average precision increases by 1.92%, and CPU inference speed increases by 36.1%. The model balances accuracy and speed, with significantly fewer computations and parameters. The model size is compressed to 7.27 MB, which reduces dependence on the computing power of the hardware platform and enables accurate and fast detection of infrared dim and small targets.
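The abstract does not describe the Focus module in detail. The sketch below illustrates the slicing step under the assumption that it follows the YOLOv5-style Focus layer: every 2×2 pixel neighbourhood is spread across four channel groups, so the spatial resolution is halved and the channel count is quadrupled before the first convolution. The function name `focus_slice` and the nested-list image layout are hypothetical and chosen only for illustration.

```python
def focus_slice(image):
    """Space-to-depth rearrangement, assuming a YOLOv5-style Focus layer.

    `image` is a list of channels; each channel is a list of rows with
    even height and width. The result has 4x as many channels at half
    the height and half the width.
    """
    out = []
    # Slice order assumed from YOLOv5: (even row, even col), (odd row, even col),
    # (even row, odd col), (odd row, odd col).
    for row_off, col_off in ((0, 0), (1, 0), (0, 1), (1, 1)):
        for channel in image:
            out.append([row[col_off::2] for row in channel[row_off::2]])
    return out


# A single-channel 4x4 toy image with pixel values 0..15.
toy = [[[0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11],
        [12, 13, 14, 15]]]
sliced = focus_slice(toy)
```

In this toy case `sliced` holds four 2×2 channels. No pixel is discarded, which is why this kind of downsampling can shorten inference without losing the few pixels a dim, small target occupies.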



No.	Name	Parameters	FLOPs
1	Focus, 1, 16	224	33.0×10⁶
2	Conv, 3/1, 16	2336	86.1×10⁶
3	Conv, 3/1, 32	4672	43.1×10⁶
4	Conv, 3/1, 64	18560	42.8×10⁶
5	PDSCP, 128	38016	21.9×10⁶
6	PDSCP, 256	149760	21.6×10⁶
7	PDSCP, 512	594432	21.4×10⁶
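The figures 224, 2336, 4672 and 18560 listed for the first four layers match the weight count of a bias-free convolution followed by batch normalization with one scale and one shift per output channel. That layer composition is an assumption, not something the source states. So are the input channel counts used below: 12 for the Focus layer (3 image channels sliced into 4 groups), and the previous layer's output width for each convolution. The helper name `conv_bn_params` is hypothetical.

```python
def conv_bn_params(c_in, c_out, kernel):
    """Weights of a bias-free kernel x kernel convolution plus BatchNorm scale and shift."""
    return c_in * c_out * kernel * kernel + 2 * c_out


# (layer name, assumed input channels, output channels, kernel size)
layers = [
    ("Focus, 1, 16", 12, 16, 1),
    ("Conv, 3/1, 16", 16, 16, 3),
    ("Conv, 3/1, 32", 16, 32, 3),
    ("Conv, 3/1, 64", 32, 64, 3),
]
counts = {name: conv_bn_params(c_in, c_out, k) for name, c_in, c_out, k in layers}
for name, n in counts.items():
    print(f"{name}: {n} parameters")
```

Under these assumptions the four layers come to 224, 2336, 4672 and 18560 parameters. The PDSCP rows are not covered, because the internal structure of that block is not described here.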

Name	Related configuration
GPU	NVIDIA Quadro GV100
CPU	Intel Xeon Silver 4210 / 128 GB
GPU memory size	32 GB
Operating system	Windows 10
Computing platform	CUDA 11.0
CPU (test)	Intel Core i7-10700 / 16 GB

Size of extension box	Number of datasets	Number of images
5×5 pixels	13	12484
7×7 pixels	2	798
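One plausible reading of the extension box sizes above is that each target is annotated as a single point, which is then expanded into a fixed-size square box for training. The source does not confirm this. The sketch below converts such a point into a normalized YOLO-style label on that assumption. The function name `point_to_yolo_box`, the pixel coordinates, and the 256×256 image size are hypothetical and used only for illustration.

```python
def point_to_yolo_box(cx, cy, box_size, img_w, img_h):
    """Expand a point label (in pixels) into a normalized (x, y, w, h) box."""
    return (cx / img_w, cy / img_h, box_size / img_w, box_size / img_h)


# Hypothetical target centred at pixel (128, 64) in a 256x256 frame,
# extended to a 5x5 pixel box.
label = point_to_yolo_box(128, 64, 5, 256, 256)
```

A 7×7 extension would differ only in the `box_size` argument.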

Parameter	Infrared dim and small target dataset	Thermal Pedestrian Database	FLIR Thermal Dataset
Number of classes	1	1	3
Epochs	500	500	500
Batch size	64	4	64
Image size	384×384	320×320	512×512
Batch size (test)	1	1	1


