-
Since there are two data sources, a polarization-intensity image and a visible-light image, the system in this paper uses the connected envelope region of the polarization-intensity image as a constraint for optical character recognition, and then performs registration with different weights on this basis, thereby improving the legibility of the optical characters in the visible-light image.
-
The Stokes method [11] is used to solve for the polarization state. When the incident polarization angle can be preset, the intensity values of the vector can be obtained from the power distribution curve of the initial light source. The Stokes vector can be written as:
$$ \boldsymbol{S} = \left[ \begin{array}{cccc} I & Q & U & V \end{array} \right]^{\rm T} $$ (1) where I is the light intensity; Q and U are the linear polarization components; and V is the circular polarization component. On CCD1, the Mueller-matrix relation is:
$$ \boldsymbol{S}_{\rm CCD1} = \boldsymbol{M} \cdot \boldsymbol{S}_t $$ (2) where S_CCD1 is the Stokes vector collected by CCD1; S_t is the Stokes vector initially incident on the target; and M is the Mueller matrix.
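As a concrete illustration, the linear Stokes components can be recovered from intensity measurements behind a linear polarizer at the three fixed angles the system uses (0°, 60°, 120°), via the standard forward model I(θ) = (I + Q cos 2θ + U sin 2θ)/2. The sketch below is illustrative, not the paper's implementation; the circular component V is not observable with linear polarizers alone.

```python
import numpy as np

def stokes_from_three_angles(i0, i60, i120):
    """Recover the linear Stokes parameters I, Q, U from intensities
    measured behind a linear polarizer at 0, 60 and 120 degrees.
    Inverts I(theta) = (I + Q*cos(2*theta) + U*sin(2*theta)) / 2."""
    I = 2.0 / 3.0 * (i0 + i60 + i120)
    Q = 2.0 / 3.0 * (2.0 * i0 - i60 - i120)
    U = 2.0 / np.sqrt(3.0) * (i60 - i120)
    return np.stack([I, Q, U])
```

The three-angle layout is the minimal sampling that makes this inversion well posed: any two of the angles alone leave one of Q, U underdetermined.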
The surface of a region bearing optical characters differs from that of a blank region in two ways: the surface reflectivity may differ because of writing, or the character areas of power labels may be printed with different materials. The test-region image can therefore be partitioned according to the polarization intensity values and the partitions assigned different weights, which improves the extraction of effective information.
Since the power nameplates under test are fairly uniform, the polarization state does not need to be scanned over a wide range; a few fixed polarization angles suffice, and the system in this paper uses 0°, 60°, and 120°. The responses at the different polarization directions can be reconciled through homogenization correction. Taking the 0° response as the reference, the calibration parameters for the other two polarization angles are:
$$ k_{60^\circ} = \frac{V'_{0^\circ}}{V'_{60^\circ}}, \quad k_{120^\circ} = \frac{V'_{0^\circ}}{V'_{120^\circ}} $$ (3) where V'_{0°}, V'_{60°}, and V'_{120°} are the mean response voltages of CCD1 at polarization angles of 0°, 60°, and 120°, respectively. Comparison with the 0° response thus determines the calibration parameters of the two other polarization angles.
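Equation (3) amounts to a per-channel gain correction. A minimal sketch, with the correction applied multiplicatively to bring the 60° and 120° responses onto the 0° scale:

```python
def calibration_params(v0_mean, v60_mean, v120_mean):
    """Equation (3): k60 = V'_0 / V'_60 and k120 = V'_0 / V'_120,
    with the 0-degree response taken as the reference."""
    return v0_mean / v60_mean, v0_mean / v120_mean

def equalize(v60, v120, k60, k120):
    """Scale the 60- and 120-degree responses onto the 0-degree scale."""
    return k60 * v60, k120 * v120
```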
-
The main role of the polarization image in this system is to provide accurate image boundary information, so extracting the feature range of the polarization image with wavelet moment invariants greatly reduces the burden of analyzing the full image and preserves the recognition speed of the test system. Let the image be g(x, y); its origin moment of order s + t can be expressed as:
$$ M_{st} = \iint x^s y^t g\left( x, y \right) {\rm d}x\,{\rm d}y $$ (4) where x and y are the two-dimensional image coordinates, and s and t are the orders in the x and y directions, respectively. Because the nameplates tested vary in category and differ considerably in size, aligning the images in Cartesian coordinates would add redundant computation; converting to polar coordinates improves the efficiency of image fusion. From the relation between Cartesian and polar coordinates:
$$ x = r\cos \phi ,{\text{ }}y = r\sin \phi $$ (5) Equation (4) can be rewritten as:
$$ M_{st} = \iint r^{s + t} \left( \cos \phi \right)^s \left( \sin \phi \right)^t g\left( r, \phi \right) r\,{\rm d}r\,{\rm d}\phi $$ (6) This yields the origin-moment parameters that describe the text region in the polarization image and delimit the feature range. Enhancing the image within this limited range then allows the effective information on the power nameplate to be obtained quickly and efficiently.
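On a discrete image, the integral of Equation (4) becomes a double sum over pixel coordinates. The sketch below assumes the usual convention that the row index plays the role of y and the column index the role of x:

```python
import numpy as np

def origin_moment(g, s, t):
    """Discrete origin moment M_st of Equation (4):
    M_st = sum over pixels of x**s * y**t * g(x, y)."""
    h, w = g.shape
    y, x = np.mgrid[0:h, 0:w]  # row index as y, column index as x
    return float(np.sum(x ** s * y ** t * g))
```

For example, M_00 is the total intensity, and the ratios M_10/M_00 and M_01/M_00 give the intensity centroid used for alignment.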
-
Since the characters on power nameplates and similar labels typically consist of text, digits, and even special symbols, a plain text recognition algorithm cannot be applied directly. The power industry uses many specialized terms, and abbreviated expressions are common; moreover, the length of the effective information and the subjective expression habits of the recording personnel all affect the meaning. The acquired data must therefore be as complete as possible to reduce errors caused by missing information. In summary, the construction of the optical character recognition algorithm must take many situations into account, so the model design must provide a computational method suitable for classifying multiple states.
-
Traditional optical character recognition commonly uses algorithms such as Bi-LSTM-based feature recognition, capsule-network-based cluster analysis, and sliding-window-based predictive recognition. In this system the number of variables to analyze is small but their types are complex, so by comparison the Bi-LSTM algorithm is the more suitable choice. A Bi-LSTM consists mainly of an input module, a forget module, an output module, and a memory module. Let the vector of the current input word be x_t, and let its hidden states at the two adjacent times be h_{t−1} and h_t; then the feature parameters of the Bi-LSTM can be expressed as: $$ \left\{ \begin{gathered} {h_t} = \sigma (A)\tanh \left\{ {\sigma (A)\left[ {\tanh (A) + {c_{t - 1}}} \right]} \right\} \\ A = {W_i}[{x_t}] + {U_i}{h_{t - 1}} + {V_i}{c_{t - 1}} + {b_i} \\ \end{gathered} \right. $$ (7) where σ is the Sigmoid function; h_t is the function output, obtained by element-wise addition of the forward and backward outputs; t is the current time and t−1 the previous time; i indexes the i-th data item; A is an intermediate variable; W_i[x_t] is the test value corresponding to the vector x_t of the i-th data item at time t; U_i and V_i are compensation parameters relative to the test value W_i; c_{t−1} is the polarization compensation parameter at time t−1; and b_i is a correction coefficient.
The expression above unifies the effects of test-value intensity, temporal variation, and polarization compensation into one model, so that during optical character recognition the image intervals satisfying both the polarization boundary and the temporal variation pattern can be selected simultaneously.
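One step of Equation (7) can be sketched as follows. The matrix shapes are assumptions for illustration, and W_i[x_t] is interpreted here as a learned linear projection of the input vector; this is a sketch of the recurrence, not the paper's trained model:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feature_step(x_t, h_prev, c_prev, W, U, V, b):
    """One step of Equation (7):
      A   = W @ x_t + U @ h_prev + V @ c_prev + b
      h_t = sigma(A) * tanh( sigma(A) * (tanh(A) + c_prev) )"""
    A = W @ x_t + U @ h_prev + V @ c_prev + b
    g = sigmoid(A)
    return g * np.tanh(g * (np.tanh(A) + c_prev))
```

Note that because sigmoid and tanh are both bounded, h_t stays inside (−1, 1) regardless of the compensation term, which keeps the iterative region filtering numerically stable.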
-
With the polarization-feature boundary constraints and the optical character recognition model in place, accurate nameplate test information can be obtained by following the parameter-calculation order and the data-analysis flow. The algorithm proceeds as follows:
(1) Process the test data according to type. Using the polarization angles of the polarizer during testing, compute the two polarization-angle calibration parameters from the polarization data, i.e., k_60° and k_120°, and select the boundary range by comparing the ratios at the different polarization angles;
(2) Compute the origin moments of the acquired two-dimensional visible-light image to provide a basis for target recognition. In this process, unify the Cartesian coordinate systems of the differently sized images into a single polar-coordinate representation, avoiding excessive deviation of the test-character range relative to the image origin; solve for M_st to obtain the upper and lower bounds and the alignment angle for image fusion;
(3) Substitute the polarization boundary range as a constraint into the image-solving model, greatly reducing the computational load of image-fusion data processing, and compute the test values W_i[x_t], U_i, and V_i;
(4) Compute the intermediate variable A of optical character recognition from the relations among the test values, and substitute the computed A into Equation (7) to obtain the feature parameters at each time;
(5) Using the feature parameter h_t, iteratively filter the image regions and complete the recognition of the target character strings in all images.
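The steps above can be sketched as one pipeline. Every stage below is a simplified stand-in for illustration only: the boundary threshold is a placeholder, and the recognition stages (3)-(5) are reduced to masking the visible image, not the paper's trained recognizer:

```python
import numpy as np

def recognition_pipeline(v0, v60, v120, visible):
    """Illustrative sketch of algorithm steps (1)-(5)."""
    # (1) polarization-angle calibration parameters, Equation (3)
    k60 = v0.mean() / v60.mean()
    k120 = v0.mean() / v120.mean()
    # boundary mask from the calibrated polarization responses
    pol = np.maximum.reduce([v0, k60 * v60, k120 * v120])
    mask = pol > pol.mean()  # placeholder threshold
    # (2) origin moments of the visible image, Equation (4)
    h, w = visible.shape
    y, x = np.mgrid[0:h, 0:w]
    m00 = visible.sum()
    cx, cy = (x * visible).sum() / m00, (y * visible).sum() / m00
    # (3)-(5) restrict analysis to the polarization boundary; the masked
    # region would then be handed to the character recognizer
    region = np.where(mask, visible, 0.0)
    return {"k60": k60, "k120": k120, "centroid": (cx, cy), "region": region}
```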
The overall recognition flow is shown in Fig. 2.
Research on optical character recognition algorithm based on boundary constrained image fusion
-
Abstract: In order to improve the recognition of optical characters in non-cooperative target areas and to enhance the accuracy of information collection from power nameplates, power-grid texts, and similar sources, an optical character recognition system was designed that simultaneously collects polarized and visible-light images and performs image fusion. By setting polarization angles of 0°, 60°, and 120°, the response voltage was periodically modulated to obtain the connected range of the effective-information region and achieve accurate boundary constraints. By calculating the polarization-angle calibration parameters and setting reasonable thresholds for the boundary conditions, a range standard was provided for image fusion. The experiments measured the response voltage as a function of polarization angle and of test distance; the results show that the slope of the periodic variation with polarization angle is 53.1 mV/(°). In the range of 0.5-3.0 m, the maximum response voltage is 241.7 mV and the minimum is 18.5 mV, and the monotonicity of the three response curves is almost identical. Tests were also carried out on power-nameplate targets with poor image definition. After traditional image filtering and enhancement, the contrast of the blurred original image increased from 0.34 to 1.56; image quality improved to some extent, but some characters still could not be recognized. With the proposed algorithm, the contrast reached 3.23, and some blurred characters could also be recognized effectively. The system is therefore suitable for optical character recognition of non-cooperative targets and offers a strong optimization effect for optical character recognition in low-quality images.
-
Key words: polarization imaging / image fusion / boundary constraints / optical characters
-
[1] Deng C, Pan L, Wang C, et al. Performance analysis of ghost imaging lidar in background light environment [J]. Photonics Research, 2017, 5(5): 431-435. doi: 10.1364/PRJ.5.000431
[2] Yu H, Li E, Gong W, et al. Structured image reconstruction for three-dimensional ghost imaging lidar [J]. Optics Express, 2015, 23(11): 14541-14551. doi: 10.1364/OE.23.014541
[3] Wang X, Shao Y M, Yang B, et al. Target tracking method based on infrared and laser lidar image fusion [J]. Infrared Technology, 2019, 41(10): 947-955. (in Chinese)
[4] Ling Y, Gu G, He W, et al. Adaptive target profile acquiring method for photon counting 3-D imaging lidar [J]. IEEE Photonics Journal, 2016, 8(6): 1-10.
[5] Tian Z, Cui Z, Zhang L, et al. Control and image processing for streak tube imaging lidar based on VB and MATLAB [J]. Chinese Optics Letters, 2014, 12(6): 67-70.
[6] Zhong H, Yuan Y, Wang J F, et al. Anchor box optimization for object detection [EB/OL]. (2018-12-02) [2022-02-14]. https://arxiv.org/pdf/1812.00469.pdf.
[7] Xie L L, Liu Y L, Jin L W, et al. DeRPN: Taking a further step toward more general object detection [C]//Proc of the 33rd AAAI Conference on Artificial Intelligence. Menlo Park, CA: AAAI Press, 2019: 9046-9053.
[8] Yang T, Zhang X Y, Li Z M, et al. MetaAnchor: Learning to detect objects with customized anchors [C]//Proc of the 31st Annual Conference on Neural Information Processing Systems. New York: Curran Associates Press, 2018: 318-328.
[9] Wang J Q, Chen K, Yang S H, et al. Region proposal by guided anchoring [C]//Proc of IEEE Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE Press, 2019: 2965-2974.
[10] Ma K, Liu S H, Bai X, et al. DocUNet: Document image unwarping via a stacked U-Net [C]//Proc of IEEE Computer Vision and Pattern Recognition. Piscataway, NJ: IEEE Press, 2018: 4700-4709.
[11] Wang X Y, Gao T T. Translation optimization based on OCR optical character recognition [J]. Laser Journal, 2020, 41(12): 156-160. (in Chinese)
[12] Matas J, Chum O, Urban M, et al. Robust wide-baseline stereo from maximally stable extremal regions [J]. Image and Vision Computing, 2004, 22(10): 761-767. doi: 10.1016/j.imavis.2004.02.006
[13] Chen W L, Xu W B, Wang S H, et al. Research on coating materials detection and recognition based on infrared spectral polarization degree contrast [J]. Infrared and Laser Engineering, 2020, 49(6): 20190445. (in Chinese) doi: 10.3788/IRLA20190445
[14] Ou Q F, Xiao J B, Xie Q Q, et al. Multi-target detection and recognition for vehicle inspection images based on deep learning [J]. Journal of Applied Sciences, 2021, 39(6): 939-951. (in Chinese) doi: 10.3969/j.issn.0255-8297.2021.06.005
[15] Zhou Y, Zhu Q R, Xie H C, et al. Non-standard ship identification characters detection based on target detection and fuzzy matching [J]. Laser & Infrared, 2021, 51(11): 1526-1530. (in Chinese) doi: 10.3969/j.issn.1001-5078.2021.11.020