During target tracking, correlating candidate samples with the target template yields a similarity response map; the sharper the correlation peak, the more stable the tracking [8]. For a target of fixed size, the texture detail of the target template strongly affects the correlation peak, so texture detail should be considered when selecting parts. When parts are selected manually, the guiding principle is that a part should be highly distinguishable from its surroundings. Based on these two considerations, this paper proposes an automatic part extraction method that fuses visual saliency with edge direction dispersion; the flowchart of the proposed algorithm is shown in Fig. 1. Visual saliency suppresses the weight of repetitive texture in the extraction result, while edge direction dispersion discourages textureless regions, so the extracted parts carry rich texture information while remaining well distinguished from their surroundings.
The human visual system preferentially allocates attention to a small number of visually salient regions. This paper uses the visual attention mechanism to pre-select candidate part regions, suppressing the weight of the target's repetitive texture in the joint suitable-matching confidence map and improving the regional distinguishability of the parts. The spectral residual model proposed by Hou et al. [9] identifies the salient regions of an image, so the spectral residual is adopted here to measure the regional distinguishability of parts.
During tracking, the target template $T\left( {x,y} \right)$ is initialized in the first frame and transformed from the spatial domain to the frequency domain by the two-dimensional discrete Fourier transform. Its spectral residual $R\left( f \right)$ is computed with Eq. (1):

$$R\left( f \right) = L\left( f \right) - A\left( f \right)$$ (1)

where $f$ is the frequency, $L\left( f \right)$ is the $\log$ amplitude spectrum, and $A\left( f \right)$ is the average amplitude spectrum, obtained by mean-filtering the $\log$ amplitude spectrum $L\left( f \right)$:

$$A\left( f \right) = {h_n}\left( f \right) * L\left( f \right)$$ (2)

where ${h_n}\left( f \right)$ is an $n \times n$ mean filter. The saliency map in the spatial domain is reconstructed by the inverse Fourier transform, and smoothing the initial saliency map with a two-dimensional Gaussian filter $g\left( {x,y} \right)$ gives the residual result:

$$H\left( {x,y} \right) = g\left( {x,y} \right) * {F^{ - 1}}{\left[ {\exp \left( {R\left( f \right) + iP\left( f \right)} \right)} \right]^2}$$ (3)

where ${F^{ - 1}}$ denotes the two-dimensional inverse Fourier transform and $P\left( f \right)$ is the phase spectrum of the input image [10]. As Fig. 2 shows, regions where the grey level changes abruptly produce high responses in the saliency map.
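Equations (1)-(3) can be sketched in NumPy as below. The mean-filter size `n`, the Gaussian width `sigma`, and the small epsilon guarding the logarithm are illustrative assumptions, not values taken from the paper:

```python
import numpy as np

def spectral_residual_saliency(template, n=3, sigma=2.5):
    """Spectral-residual saliency map sketch (Hou & Zhang [9]).

    template : 2-D grayscale array T(x, y)
    n        : size of the mean filter h_n applied to the log-amplitude spectrum
    sigma    : std-dev of the Gaussian g(x, y) used to smooth the result
    """
    F = np.fft.fft2(template.astype(np.float64))
    L = np.log(np.abs(F) + 1e-12)           # log-amplitude spectrum L(f)
    P = np.angle(F)                          # phase spectrum P(f)

    # A(f) = h_n(f) * L(f): local mean of the log-amplitude spectrum, Eq. (2)
    pad = n // 2
    Lp = np.pad(L, pad, mode='edge')
    A = np.zeros_like(L)
    for i in range(L.shape[0]):
        for j in range(L.shape[1]):
            A[i, j] = Lp[i:i + n, j:j + n].mean()

    R = L - A                                # spectral residual R(f), Eq. (1)

    # Back to the spatial domain, Eq. (3), then Gaussian smoothing
    sal = np.abs(np.fft.ifft2(np.exp(R + 1j * P))) ** 2

    # separable 1-D Gaussian blur standing in for g(x, y)
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    g = np.exp(-x ** 2 / (2 * sigma ** 2))
    g /= g.sum()
    sal = np.apply_along_axis(lambda v: np.convolve(v, g, mode='same'), 0, sal)
    sal = np.apply_along_axis(lambda v: np.convolve(v, g, mode='same'), 1, sal)
    return sal
```

The slow Python loop computing `A` keeps the sketch dependency-free; a production version would use a box filter from an image library instead.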
Edge direction dispersion characterizes how widely the gradient directions of an image are distributed: the larger its value, the richer the texture and the better suited the region is for matching. The Canny operator is first applied to the target template to extract edges, and the edge direction dispersion is defined by an information entropy:

$$d\left( {x,y} \right) = - \sum\limits_{m = x - r}^{x + r} {\sum\limits_{n = y - c}^{y + c} {E\left( {m,n} \right)} } {P_{\theta \left( {m,n} \right)}}\log {P_{\theta \left( {m,n} \right)}}$$ (4)

where $E\left( {m,n} \right)$ is the binary edge image obtained from the template $T\left( {x,y} \right)$ with the Canny operator, and ${P_{\theta \left( {m,n} \right)}}$ is the probability of each gradient direction, obtained by computing the gradient of $T\left( {x,y} \right)$ and building a histogram of the gradient direction angles of the edge points in the $(2r+1) \times (2c+1)$ neighbourhood centred at $\left( {m,n} \right)$ [8]. In Fig. 3, panels (a)-(d) show the Canny binary edge image, the gradient direction map, the edge direction dispersion map, and its three-dimensional surface, respectively; the richer the texture detail in the template, the higher the edge direction dispersion.
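A minimal sketch of Eq. (4) follows. To stay dependency-free, a simple gradient-magnitude threshold stands in for the paper's Canny detector; the window radii `r` and `c`, the threshold, and the 8-bin orientation histogram are assumptions for illustration:

```python
import numpy as np

def edge_direction_dispersion(template, r=4, c=4, thresh=30.0, bins=8):
    """Edge-direction dispersion d(x, y): entropy of the gradient-direction
    histogram of edge points in a (2r+1) x (2c+1) window."""
    T = template.astype(np.float64)
    gy, gx = np.gradient(T)
    mag = np.hypot(gx, gy)
    theta = np.arctan2(gy, gx)               # gradient direction in [-pi, pi]
    E = (mag > thresh).astype(np.float64)    # binary edge image E(m, n)
    # quantize each direction into one of `bins` histogram bins
    bin_idx = np.minimum(((theta + np.pi) / (2 * np.pi) * bins).astype(int),
                         bins - 1)

    H, W = T.shape
    d = np.zeros_like(T)
    for y in range(H):
        for x in range(W):
            y0, y1 = max(0, y - r), min(H, y + r + 1)
            x0, x1 = max(0, x - c), min(W, x + c + 1)
            e = E[y0:y1, x0:x1]
            if e.sum() == 0:
                continue                     # textureless window: dispersion 0
            # orientation histogram over the edge points in the window
            hist = np.bincount(bin_idx[y0:y1, x0:x1][e > 0], minlength=bins)
            p = hist / hist.sum()
            p = p[p > 0]
            d[y, x] = -(p * np.log(p)).sum() # entropy of the distribution
    return d
```

Flat regions get zero dispersion, which is exactly the behaviour the method relies on to rule out textureless parts.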
This paper combines the visual saliency and the texture detail of the image to jointly measure the suitability of each pixel, defining the joint suitable-matching confidence as

$$M = {M_s} \times {M_d}$$ (5)

where ${M_s}$ is the normalized spectral residual saliency obtained from Eq. (1) and ${M_d}$ is the normalized edge direction dispersion obtained from Eq. (4). Both features are normalized to avoid abnormal samples and to simplify subsequent processing. Comparing Figs. 2 and 3 with Fig. 4, the high-valued regions of the joint suitable-matching confidence map both carry rich texture detail and differ clearly from their surroundings.
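Eq. (5) reduces to an element-wise product of the two normalized maps. A sketch, assuming min-max normalization (the paper does not specify the normalization scheme):

```python
import numpy as np

def joint_confidence(saliency, dispersion):
    """Joint suitable-matching confidence M = M_s * M_d of Eq. (5):
    element-wise product of the min-max-normalized saliency and
    edge-direction-dispersion maps."""
    def minmax(m):
        m = m.astype(np.float64)
        rng = m.max() - m.min()
        # degenerate (constant) map normalizes to zero confidence
        return (m - m.min()) / rng if rng > 0 else np.zeros_like(m)
    return minmax(saliency) * minmax(dispersion)
```

The product form means a pixel scores highly only when it is both salient and richly textured; either factor alone being zero vetoes the pixel.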
After the joint suitable-matching confidence map is obtained, a part selection strategy adapted to the aspect ratio of the target template is proposed; following the strategy in Table 1, the parts are selected automatically. To reduce redundant computation, margins are imposed between parts so that the selected parts do not overlap. In the table, $W$ and $H$ are the width and height of the target template $T\left( {x,y} \right)$, and $ \lceil \cdot \rceil $ denotes rounding up. Based on the principles above, the part extraction procedure is:

(1) Search the joint suitable-matching confidence map for the point with the maximum response, $p_c^k = \left[ {x_c^k,y_c^k} \right]$;

(2) Determine the size and number of parts from the aspect ratio of the template; the $k$-th selected part ${p_k} = \left[ {p_c^k,{p_w},{p_h}} \right]$ is the rectangle centred at $p_c^k$ with width ${p_w}$ and height ${p_h}$;

(3) Check the boundaries of the selected part to ensure it lies on the target template;

(4) Zero out the region: to keep the selected parts non-overlapping and reduce redundant computation, set to zero the region of size $\left( {p_w^k + 2{M_x}} \right) \times \left( {p_h^k + 2{M_y}} \right)$ centred at $p_c^k$;

(5) Repeat steps (1)-(4) until all parts are selected.
Table 1. Principle of adaptive selection of parts

| Aspect ratio | Number of parts | Width of parts (${p_w}$) | Height of parts (${p_h}$) | Horizontal margin (${M_x}$) | Vertical margin (${M_y}$) |
|---|---|---|---|---|---|
| $AR \leqslant \dfrac{2}{3}$ | 3 | $\left\lceil {0.8W} \right\rceil$ | $\left\lceil {0.8 \times \dfrac{H}{3}} \right\rceil$ | $\left\lceil {0.05 \times W} \right\rceil$ | $\left\lceil {0.05 \times H} \right\rceil$ |
| $\dfrac{2}{3} < AR \leqslant \dfrac{3}{2}$ | 4 | $\left\lceil {0.8 \times \dfrac{W}{2}} \right\rceil$ | $\left\lceil {0.8 \times \dfrac{H}{2}} \right\rceil$ | $\left\lceil {0.05 \times W} \right\rceil$ | $\left\lceil {0.05 \times H} \right\rceil$ |
| $AR > \dfrac{3}{2}$ | 3 | $\left\lceil {0.8 \times \dfrac{W}{3}} \right\rceil$ | $\left\lceil {0.8 \times H} \right\rceil$ | $\left\lceil {0.05 \times W} \right\rceil$ | $\left\lceil {0.05 \times H} \right\rceil$ |

Fig. 5 shows the automatic part selection result for a target template $T\left( {x,y} \right)$ of size 62×52, whose aspect ratio falls in the interval $\dfrac{2}{3} < AR \leqslant \dfrac{3}{2}$. According to the adaptive selection strategy of Table 1, the number of parts is 4, the part width ${p_w} = 24$, the part height ${p_h} = 20$, the horizontal margin ${M_x} = 4$, and the vertical margin ${M_y} = 3$. The part regions are generated adaptively by the extraction procedure; parts 3 and 4 exceed the template during the boundary check, so the template boundary is taken as their boundary.
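The strategy of Table 1 and steps (1)-(5) can be sketched as below. The ceiling rounding follows Table 1, although the paper's 62×52 worked example suggests the part sizes themselves may be rounded down; the clamping used for the boundary check is likewise an assumption about how out-of-template parts are snapped back:

```python
import numpy as np
from math import ceil

def part_layout(W, H):
    """Part number/size/margins per Table 1 (rounding is an assumption)."""
    AR = W / H
    if AR <= 2 / 3:
        n, pw, ph = 3, ceil(0.8 * W), ceil(0.8 * H / 3)
    elif AR <= 3 / 2:
        n, pw, ph = 4, ceil(0.8 * W / 2), ceil(0.8 * H / 2)
    else:
        n, pw, ph = 3, ceil(0.8 * W / 3), ceil(0.8 * H)
    return n, pw, ph, ceil(0.05 * W), ceil(0.05 * H)

def select_parts(conf):
    """Greedy part extraction, steps (1)-(5): find the confidence peak,
    clamp the part inside the template, zero an enlarged neighbourhood
    so the next part cannot overlap, repeat."""
    H, W = conf.shape
    n, pw, ph, mx, my = part_layout(W, H)
    conf = conf.copy()
    parts = []
    for _ in range(n):
        yc, xc = np.unravel_index(np.argmax(conf), conf.shape)  # step (1)
        x0 = min(max(xc - pw // 2, 0), W - pw)                  # step (3):
        y0 = min(max(yc - ph // 2, 0), H - ph)                  # keep inside T
        parts.append((x0, y0, pw, ph))                          # step (2)
        # step (4): zero a (pw + 2*Mx) x (ph + 2*My) region around the peak
        conf[max(yc - ph // 2 - my, 0):yc + ph // 2 + my + 1,
             max(xc - pw // 2 - mx, 0):xc + pw // 2 + mx + 1] = 0
    return parts
```

With this rounding, a 62×52 template yields 4 parts with ${M_x}=4$ and ${M_y}=3$, matching the worked example's margins.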
To verify the effectiveness of the proposed part extraction method, subjective analyses and objective tracking comparisons were carried out on a visible-light tracking dataset, an infrared dataset, and self-collected infrared images.
OTB100 is a general object tracking benchmark proposed by Wu et al. [11]. Two typical targets whose aspect ratios fall in the intervals $AR \leqslant \dfrac{2}{3}$ and $AR > \dfrac{3}{2}$ were selected from the dataset for part extraction analysis. In Fig. 6, sub-figures (i)-(iv) show the spectral residual saliency map, the edge direction dispersion map, the joint suitable-matching confidence map, and the automatic part extraction result of the proposed method, respectively. The joint suitable-matching confidence map obtained with Eq. (5) measures both the texture richness and the visual saliency of a region. Subjectively, the parts selected by the proposed algorithm are well distinguished from their surroundings, consistent with the manual selection criterion.
To further verify the algorithm, automatic part extraction was tested on infrared target templates from the FLIR Thermal dataset. As Fig. 7 shows, the parts extracted by the proposed method avoid textureless regions unsuited to tracking and are clearly distinguishable from their surroundings.
Figure 8 shows automatic part extraction results for target templates in air-to-ground imaging guidance. In Fig. 8(a) the missile-target distance is about 13 km and the target spans 31×29 pixels on the detector; in Fig. 8(b) the distance is about 10 km and the target spans 107×41 pixels. The extracted parts have distinct regional features and rich texture, meeting the requirements of air-to-ground imaging guidance.
Figure 8. Experimental results of the proposed method on self-collected infrared sequences. (a) Automatic part selection results on self-collected infrared sequence #1; (b) Automatic part selection results on self-collected infrared sequence #2
To verify how much the automatic part extraction method improves tracking precision, quantitative analyses were performed on the deformation and occlusion sequences of OTB100. The compared algorithms are: (1) the tracker of Ref. [7] with manually selected parts; (2) the tracker of Ref. [7] fed with the parts extracted automatically by the proposed method; (3) KCF [12]; (4) TLD [13]; (5) L1APG [14].

Figure 9 compares the distance precision (DP) and overlap precision (OP) of these algorithms on the 43 deformation sequences and 49 occlusion sequences of OTB100 [15]. In Fig. 9(a), the DP of the proposed method is 0.640, 3.4% higher than the second-ranked manual selection method (MS); in Fig. 9(b), its OP is 0.494, 4.6% higher than MS. In Figs. 9(c) and (d), the DP and OP of the proposed method are 0.637 and 0.455, exceeding MS by 4.5% and 0.9%, respectively. The proposed method therefore effectively improves tracking precision under the deformation and occlusion challenges.
Figure 9. Distance precision and overlap success rate curves of different algorithms under the deformation and occlusion attributes. (a) Distance precision curves, deformation attribute; (b) Overlap success rate curves, deformation attribute; (c) Distance precision curves, occlusion attribute; (d) Overlap success rate curves, occlusion attribute
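The DP and OP scores above can be computed as follows. The 20-pixel center-error threshold and the 0.5 IoU threshold are the customary OTB evaluation settings, assumed here since the paper does not restate them:

```python
import numpy as np

def distance_precision(pred_centers, gt_centers, thresh=20.0):
    """DP: fraction of frames whose center location error is at most
    `thresh` pixels (20 px is the customary OTB setting)."""
    err = np.linalg.norm(np.asarray(pred_centers, float)
                         - np.asarray(gt_centers, float), axis=1)
    return float((err <= thresh).mean())

def overlap_precision(pred_boxes, gt_boxes, thresh=0.5):
    """OP: fraction of frames whose IoU with the ground-truth box exceeds
    `thresh`; boxes are (x, y, w, h)."""
    p = np.asarray(pred_boxes, float)
    g = np.asarray(gt_boxes, float)
    x1 = np.maximum(p[:, 0], g[:, 0])
    y1 = np.maximum(p[:, 1], g[:, 1])
    x2 = np.minimum(p[:, 0] + p[:, 2], g[:, 0] + g[:, 2])
    y2 = np.minimum(p[:, 1] + p[:, 3], g[:, 1] + g[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    union = p[:, 2] * p[:, 3] + g[:, 2] * g[:, 3] - inter
    return float((inter / union > thresh).mean())
```

Sweeping the thresholds instead of fixing them produces the precision and success-rate curves plotted in Fig. 9.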
The center location error (CLE) is used to evaluate the improvement in tracking precision. CLE is defined as

$$CLE = \sqrt {{{\left( {{x_g} - {x_t}} \right)}^2} + {{\left( {{y_g} - {y_t}} \right)}^2}} $$ (6)

where $\left( {{x_g},{y_g}} \right)$ is the manually annotated target center and $\left( {{x_t},{y_t}} \right)$ is the target center predicted by the tracker. The smaller the CLE, the more stable the tracking. Figure 10 shows the CLE curves of the Sylvester, Gym, and Dancer2 sequences under the tracker of Ref. [7], where the red curves are the CLE of MS and the blue curves are the CLE of the automatically selected parts. The proposed automatic part selection method yields smaller center location errors.
Table 2 lists the mean center location error (mCLE) computed with Eq. (7), showing that the proposed method effectively improves the tracking precision of the tracker:

$$mCLE = \frac{1}{N}\sum\limits_{i = 1}^N {CL{E_i}} $$ (7)

where $N$ is the sequence length and $CL{E_i}$ is the center location error of the $i$-th frame of the sequence.

Figure 10. Frame-by-frame center location errors of the parts obtained by the proposed method and by manual selection on different sequences
Table 2. Mean center location error in different sequences

| Sequence | Proposed | Manual selection |
|---|---|---|
| Sylvester | 2.9124 | 4.0036 |
| Gym | 10.888 | 15.816 |
| Dancer2 | 7.5905 | 8.4727 |
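The center-location-error metrics above reduce to a few lines of NumPy:

```python
import numpy as np

def cle(pred_centers, gt_centers):
    """Per-frame center location error: Euclidean distance between the
    predicted and ground-truth target centers."""
    d = np.asarray(pred_centers, float) - np.asarray(gt_centers, float)
    return np.sqrt((d ** 2).sum(axis=1))

def mcle(pred_centers, gt_centers):
    """Mean center location error over an N-frame sequence."""
    return float(cle(pred_centers, gt_centers).mean())
```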
Automatic parts selection method based on multi-feature fusion
Abstract: Deformable parts model (DPM) target tracking has become an active research topic because of its effectiveness against partial occlusion and target deformation. When partial occlusion or deformation occurs, a DPM tracker can keep tracking accurately via the uncovered, reliable parts. Most existing part-based trackers, however, initialize the number and size of the parts manually. In practical tracking systems it is difficult to provide the interaction needed to select parts manually, and manual part selection is affected by subjective factors. To address these problems, an automatic part selection method based on multi-feature fusion is proposed. First, a saliency measure based on the human visual attention mechanism describes the salient regions of the target template. Second, the edge direction dispersion describes the richness of its texture details. The two features are then fused into a joint suitable-matching confidence map, and the number and size of the parts are determined adaptively from the pixel area and aspect ratio of the target. Finally, the parts are selected according to the joint suitable-matching confidence. Experimental results show that, compared with existing deformable parts model trackers that select parts manually, the parts extracted automatically by the proposed method achieve higher tracking precision.
[1] Luo H B, Xu L Y, Hui B, et al. Status and prospect of target tracking based on deep learning [J]. Infrared and Laser Engineering, 2017, 46(5): 0502002. (in Chinese)
[2] Tang Y F, Wang Z J, Zhang Z X. Registration of sand dune images using an improved SIFT and SURF algorithm [J]. Journal of Tsinghua University (Science and Technology), 2021, 61(2): 161-169. (in Chinese)
[3] Rodriguez A, Ehlenberger D B, Hof P R, et al. Three-dimensional neuron tracing by voxel scooping [J]. Journal of Neuroscience Methods, 2009, 184(1): 169-175. doi: 10.1016/j.jneumeth.2009.07.021
[4] Fan H, Xiang J. Robust visual tracking via local-global correlation filter [C]//AAAI Conference on Artificial Intelligence, 2017.
[5] Chen X D, Sheng J, Yang J, et al. Ultrasound image segmentation based on a multi-parameter Gabor filter and multiscale local level set method [J]. Chinese Optics, 2020, 13(5): 1075-1084. (in Chinese) doi: 10.37188/CO.2020-0025
[6] Wang Q, Zhang L, Bertinetto L, et al. Fast online object tracking and segmentation: A unifying approach [C]//IEEE Conference on Computer Vision and Pattern Recognition, 2018.
[7] Ma J K, Luo H B, Chang Z, et al. Visual tracking algorithm based on deformable parts model [J]. Infrared and Laser Engineering, 2017, 46(9): 0928001. (in Chinese)
[8] Luo H B, Chang Z, Yu X R, et al. Automatic suitable-matching area selection method based on multi-feature fusion [J]. Infrared and Laser Engineering, 2011, 40(10): 2037-2041. (in Chinese)
[9] Hou X D, Zhang L P. Saliency detection: A spectral residual approach [C]//IEEE Conference on Computer Vision and Pattern Recognition, 2007: 1-8.
[10] Chen H Y, Xu S, Liu K, et al. Surface defect detection of steel strip based on spectral residual visual saliency [J]. Optics and Precision Engineering, 2016, 24(10): 2572-2580. (in Chinese) doi: 10.3788/OPE.20162410.2572
[11] Wu Y, Lim J, Yang M H. Object tracking benchmark [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1834-1848.
[12] Henriques J F, Caseiro R, Martins P, et al. High-speed tracking with kernelized correlation filters [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(3): 583-596. doi: 10.1109/TPAMI.2014.2345390
[13] Kalal Z, Mikolajczyk K, Matas J. Tracking-learning-detection [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(7): 1409-1422. doi: 10.1109/TPAMI.2011.239
[14] Bao C L, Wu Y, Ling H B, et al. Real time robust L1 tracker using accelerated proximal gradient approach [C]//IEEE Conference on Computer Vision and Pattern Recognition, 2012: 1830-1837.
[15] Chen F L, Ding Q H, Luo H B, et al. Anti-occlusion real time target tracking algorithm employing spatio-temporal context [J]. Infrared and Laser Engineering, 2021, 50(1): 20200105. (in Chinese)