Optically realize convolution operation of microlens array

Fei Yuhang; Sui Xiubao; Wang Qingbao; Chen Qian; Gu Guohua

doi:10.3788/IRLA20210887

Volume 51 Issue 2

Feb. 2022

Citation:

Fei Yuhang, Sui Xiubao, Wang Qingbao, Chen Qian, Gu Guohua. Optically realize convolution operation of microlens array[J]. Infrared and Laser Engineering, 2022, 51(2): 20210887. doi: 10.3788/IRLA20210887

School of Electronic and Optical Engineering, Nanjing University of Science and Technology, Nanjing 210094, China

Received Date: 2021-12-20
Rev Recd Date: 2022-01-10
Accepted Date: 2022-02-06

Available Online: 2022-03-04

Publish Date: 2022-02-28

Abstract

As a simple linear, translation-invariant operation, convolution is widely used across image processing, and the convolutional neural networks derived from it have been highly successful in artificial intelligence. Optical neural networks emerged to address the limited computing power of AI inference chips in the post-Moore era, and optical convolutional neural networks, one of the main research hotspots, play an important role in advancing them. An optical convolution system is designed that, based on a light-homogenizing path formed by a microlens array and a lens, performs two-dimensional convolution on the image carried by the light field. The system can complete simple image smoothing and sharpening within the optical path. When spatial light modulators are used to realize the convolution kernel and the input plane, the system can realize the three forms of convolution with various strides, and can also realize multi-channel three-dimensional convolution through multiple projections or tiling, laying a foundation for optical convolutional neural networks that handle complex image processing tasks.
- optical convolution,
- microlens array,
- light-homogenizing system,
- image processing

References

[1] Castleman K R, Zhu Z, Lin X, et al. Digital Image Processing[M]. Beijing: Publishing House of Electronics Industry, 1998: 123-145. (in Chinese)
[2] Goodfellow I J, Bulatov Y, Ibarz J, et al. Multi-digit number recognition from street view imagery using deep convolutional neural networks [J]. arXiv preprint arXiv:1312.6082, 2013.
[3] Xue S, Zhang Z, Lv Q Y, et al. Image recognition method of anti UAV system based on convolutional neural network [J]. Infrared and Laser Engineering, 2020, 49(7): 20200154. (in Chinese) doi: 10.3788/IRLA20200154
[4] Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 39(4): 640-651.
[5] Wang Z Y, Ni X Y, Shang Z D. Autonomous driving semantic segmentation with convolution neural networks [J]. Optics and Precision Engineering, 2019, 27(11): 2429-2438. (in Chinese) doi: 10.3788/OPE.20192711.2429
[6] Dong C, Loy C C, He K, et al. Learning a deep convolutional network for image super-resolution[C]//ECCV, Springer International Publishing, 2014, 8692: 184-199.
[7] Hao J K, Huang W, Liu J, et al. Review of non-blind deconvolution image restoration based on spatially-varying PSF [J]. Chinese Optics, 2016, 9(1): 41-50. (in Chinese) doi: 10.3788/co.20160901.0041
[8] Zhu M, Yang H, He B G, et al. Image motion blurring restoration of joint gradient prediction and guided filter [J]. Chinese Optics, 2013, 6(6): 850-855. (in Chinese)
[9] Zhang X, Yu M X, Zhu L Q, et al. Raman mineral recognition method based on all-optical diffraction deep neural network [J]. Infrared and Laser Engineering, 2020, 49(10): 20200221. (in Chinese)
[10] Guo Y B, Xing P. The design of an all optical signal processing system with fuzzy intelligence networks [J]. Optics and Precision Engineering, 1998, 6(1): 23-30. (in Chinese) doi: 10.3321/j.issn:1004-924X.1998.01.005
[11] Xu S, Wang J, Wang R, et al. High-accuracy optical convolution unit architecture for convolutional neural networks by cascaded acousto-optical modulator arrays [J]. Optics Express, 2019, 27(14): 19778-19787. doi: 10.1364/OE.27.019778
[12] Miscuglio M, Hu Z, Li S, et al. Massively parallel amplitude-only Fourier neural network [J]. Optica, 2020, 7(12): 1812-1819. doi: 10.1364/OPTICA.408659
[13] Wu Q, Fei Y, Liu J, et al. High speed and reconfigurable optronic neural network with digital nonlinear activation [J]. Optik, 2021, 247: 168043. doi: 10.1016/j.ijleo.2021.168043
[14] Gu Z, Gao Y, Liu X. Optronic convolutional neural networks of multi-layers with different functions executed in optics for image classification [J]. Optics Express, 2021, 29(4): 5877-5889. doi: 10.1364/OE.415542
[15] Sadeghzadeh H, Koohi S, Paranj A F. Free-space optical neural network based on optical nonlinearity and pooling operations [J]. IEEE Access, 2021, 9: 146533-146549. doi: 10.1109/ACCESS.2021.3123230

0. Introduction

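Convolution is linear and translation-invariant, and is commonly used in linear spatial filtering systems for image processing tasks. Because of factors such as the randomness of the photon flux, images received by a detector contain noise to varying degrees. In most cases this noise can be suppressed or removed by image smoothing. Gaussian filtering and mean filtering^[1], two common smoothing algorithms, are both implemented by convolution. Image sharpening, which enhances fine edges and contours so that targets are easier to recognize later, is likewise implemented by convolving the input image with a combination of operators.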

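However, in classical image processing tasks such as those above, the convolution kernels are set manually in advance. Because imaging conditions vary, this approach often fails to give good results. The convolutional neural network (CNN) was proposed to solve this problem. In a CNN the kernels are no longer fixed like the operators used for smoothing and sharpening; instead, the most suitable kernels are learned by the computer. CNNs are therefore used for tasks such as image recognition^[2-3], image segmentation^[4-5], and image restoration^[6-8]. In recent years, as Moore's law has reached a bottleneck, the growth in integration density of computer chips has struggled to meet the computing demands of the big-data era. To address this, researchers in optics proposed optical neural networks^[9-10], hoping to use photons, a medium that is fast, highly parallel, and immune to electromagnetic interference, in place of electrons to realize neural networks with low latency, high bandwidth, and low energy consumption. For an optical convolutional neural network, realizing the convolutional layer optically is essential. Xu Shaofu et al.^[11] loaded the input image and the convolution kernel onto two acousto-optic modulator arrays, loading one window of the input image at a time and reusing the two arrays repeatedly to realize one convolutional layer. References [12-13] used the convolution theorem and an optical 4f system to realize an optical convolutional layer elegantly.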

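Besides the convolutional layer of an optical convolutional neural network, convolution can also be used to realize the pooling layer^[14-15], which reduces the number of network parameters. The motion pooling proposed by Sadeghzadeh et al.^[15] is realized by placing a Gaussian mask at the spectral plane of a 4f system; it reduces the parameter count and also improves the translation invariance of the network.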

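This paper designs an optical system and discusses its working principle and feasibility. The system performs convolution on the image information carried by the light field. The kernel values can be arbitrary positive numbers, so image blurring or sharpening can be realized with fixed operators. The system can also serve as the convolution processing unit of an optical convolutional neural network.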

1. Principle of two-dimensional discrete convolution

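Two-dimensional discrete convolution is an operation between two matrices and comes in three types: same, valid, and full. The type determines whether the image edges need to be padded; the kernel then slides over the image with the given stride, and at each position the corresponding elements are multiplied and summed. For example, for the image $x = \left[ \begin{array}{ccc} 6&3&5 \\ 2&7&1 \\ 3&1&2 \end{array} \right]$, the kernel $k = \left[ \begin{array}{cc} 4&2 \\ 1&5 \end{array} \right]$, and a stride of 1, the output $y$ of the valid convolution is: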
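Worked through with the sliding-window rule described above (element-wise multiply, then sum, with no kernel flip), the example can be written out as follows. This is a sketch reconstructed from the stated $x$, $k$, and stride; the index names $i, j, p, q$ are assumptions introduced here for illustration.

```latex
\[
y_{ij} = \sum_{p=1}^{2}\sum_{q=1}^{2} x_{i+p-1,\,j+q-1}\, k_{pq},
\qquad i, j \in \{1, 2\}
\]
\[
y = \left[ \begin{array}{cc}
6\cdot4 + 3\cdot2 + 2\cdot1 + 7\cdot5 & 3\cdot4 + 5\cdot2 + 7\cdot1 + 1\cdot5 \\
2\cdot4 + 7\cdot2 + 3\cdot1 + 1\cdot5 & 7\cdot4 + 1\cdot2 + 1\cdot1 + 2\cdot5
\end{array} \right]
= \left[ \begin{array}{cc} 67 & 34 \\ 30 & 41 \end{array} \right]
\]
```

Each entry of $y$ is one window sum, so a 3×3 input and a 2×2 kernel at stride 1 give a 2×2 valid output.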

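In Eq. (1), each ∑ contains the element-wise multiply-and-sum within one window. The calculation can in fact be viewed as a matrix multiplication of $x'$ and $k'$, where $x'$ stacks the values in each window of the original input $x$ as a column and $k'$ is the kernel flattened into a row. The $y'$ in Eq. (2) is $y$ flattened into a row.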
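The window-stacking view can be checked numerically. The sketch below is a minimal illustration, not the authors' implementation: the helper name `im2col` and the row-major ordering of windows and of elements inside each window are assumptions made here.

```python
import numpy as np

def im2col(x, kh, kw, stride=1):
    """Stack every kh-by-kw window of x as one column.

    Windows are visited left to right, top to bottom, and each window is
    flattened row by row (an ordering assumed for this sketch).
    """
    rows, cols = x.shape
    columns = []
    for i in range(0, rows - kh + 1, stride):
        for j in range(0, cols - kw + 1, stride):
            columns.append(x[i:i + kh, j:j + kw].reshape(-1))
    return np.stack(columns, axis=1)

# Image and kernel from the example in the text.
x = np.array([[6, 3, 5],
              [2, 7, 1],
              [3, 1, 2]])
k = np.array([[4, 2],
              [1, 5]])

x_prime = im2col(x, 2, 2)      # one column per window, shape (4, 4)
k_prime = k.reshape(1, -1)     # kernel flattened into a row, shape (1, 4)
y_prime = k_prime @ x_prime    # valid convolution output as a row, shape (1, 4)
y = y_prime.reshape(2, 2)      # back to the 2-by-2 output layout

print(y_prime)
print(y)
```

Flattening the kernel in the same order as each window is what makes the single matrix product equal to the per-window sums.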

3. Matrix interpretation

A two-dimensional input $[m, n]^{\mathrm{T}}$, after passing through a linear lossless system with transfer matrix $T = \left[ \begin{array}{cc} a&b \\ c&d \end{array} \right]$, gives the output $[h, k]^{\mathrm{T}}$.

Because the system is lossless, the following must hold:

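Eq. (5) must hold for arbitrary $m$ and $n$, so $\left\{ \begin{array}{c} a^2 + b^2 = c^2 + d^2 = 1 \\ ac + bd = 0 \end{array} \right.$, and the transfer matrix $T$ is unitary. The same result follows for inputs of other dimensions. Therefore the transfer matrix of any linear lossless system is unitary.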
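One way to see where these conditions come from is sketched below. It assumes real-valued entries and that "lossless" means the sum of squared components is preserved; both are assumptions made for this illustration rather than statements taken from the original equations.

```latex
\[
\begin{aligned}
h &= am + bn, \qquad k = cm + dn, \\
h^2 + k^2 &= (a^2 + c^2)\,m^2 + (b^2 + d^2)\,n^2 + 2(ab + cd)\,mn .
\end{aligned}
\]
% Requiring h^2 + k^2 = m^2 + n^2 for all m, n gives
\[
a^2 + c^2 = b^2 + d^2 = 1, \qquad ab + cd = 0
\quad\Longleftrightarrow\quad T^{\mathrm{T}} T = I .
\]
% For a square matrix, T^T T = I is equivalent to T T^T = I, i.e.
\[
a^2 + b^2 = c^2 + d^2 = 1, \qquad ac + bd = 0 .
\]
```

The column form falls out of the expansion directly; the row form quoted in the text is equivalent because a square matrix with orthonormal columns also has orthonormal rows.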

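The intensity modulation at P₂ acts as an attenuator, whose transfer matrix is diagonal. By singular value decomposition, any matrix $M$ can be factored into the product of two unitary matrices and a diagonal matrix ($M = U\Sigma V^{+}$, where $V^{+}$ is the conjugate transpose of $V$). Therefore the system formed by L1, L2, and P₂ can realize a transfer matrix with arbitrary positive values.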
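The decomposition can be illustrated numerically with `numpy.linalg.svd`. This is a sketch under stated assumptions: the matrix `M` is an arbitrary positive example chosen here, and dividing the singular values by the largest one, so that the diagonal factor behaves like a passive attenuator with values at most 1, is a modelling choice of this example and not a step given in the text.

```python
import numpy as np

def is_unitary(a, tol=1e-10):
    """Return True if a^H a equals the identity within tol."""
    return np.allclose(a.conj().T @ a, np.eye(a.shape[0]), atol=tol)

# Hypothetical target transfer matrix with positive entries.
M = np.array([[4.0, 2.0],
              [1.0, 5.0]])

# numpy returns U, the singular values s, and Vh (the conjugate transpose of V).
U, s, Vh = np.linalg.svd(M)

# Assumed normalisation: scale so that every diagonal entry is <= 1,
# as a passive attenuator can only reduce intensity.
scale = s.max()
attenuation = s / scale

# Recombine: unitary, diagonal attenuation, unitary, then undo the global scale.
M_rebuilt = scale * (U @ np.diag(attenuation) @ Vh)

print(is_unitary(U), is_unitary(Vh))
print(attenuation)
print(np.allclose(M_rebuilt, M))
```

The global factor `scale` is a single gain applied to the whole output, so it does not change the relative weights in the matrix.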

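In Fig. 1, the microlens array divides the input image into many small blocks that are processed in parallel. Every block passes through the same transfer matrix $M$. With each input block stacked as a column and the transfer matrix flattened into a row, the output in the case shown in Fig. 2 is: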

By Eq. (2) in Section 1, this implements convolution.

6. Conclusion

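This paper designs an optical system that, based on a light-homogenizing system composed of a microlens array and a lens group, performs convolution on the image information carried by the light field. On the one hand, it can apply fixed operators for image blurring or sharpening, serving as preprocessing in an image processing system; on the other hand, it can serve as the convolution unit of an optical convolutional neural network. Compared with reference [11], where one convolution requires loading information into the convolution unit many times, the optical convolution system here needs to load the modulation information only once. However, because the system uses transmittance as the kernel values, only kernels with values in the range 0 to 1 can be realized, which places an upper limit on the network's computing capability. In an active system using SLM projection, the three types of convolution with various strides can be realized by preprocessing the input image. Projecting multiple kernels onto the same input yields multi-channel output; for a single kernel, multi-channel inputs can be tiled on the input plane simultaneously. The system is therefore promising for realizing multi-channel optical convolutional neural networks for complex image processing tasks.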
