Memory-augmented deep autoencoder model for pedestrian abnormal behavior detection in video surveillance

Sun Jingbo, Ji Jie

School of Mathematics and Computer Application Technology, Jining University, Qufu 273155, China

doi: 10.3788/IRLA20210680

About the author: Sun Jingbo, male, lecturer, master's degree; his research covers software engineering, big data technology, and artificial intelligence.

CLC number: TP391.4

Abstract: With the rapid growth of video surveillance data, the demand for automatic anomaly detection in large-scale video data keeps increasing, and detection methods based on the reconstruction error of deep autoencoders have been widely studied. However, an autoencoder sometimes "generalizes" so well that it also reconstructs anomalies well, which leads to missed detections. To address this problem, this paper augments the autoencoder with a memory module; the resulting method is called the memory-augmented autoencoder (Memory AE). Given an input, Memory AE first obtains its encoding from the encoder and then uses it as a query to retrieve the most relevant memory items for reconstruction. In the training phase, the memory content is updated and encouraged to represent prototypical elements of normal data. In the test phase, the learned memory is fixed and the reconstruction is obtained from a few selected memory records of normal data, so the reconstruction tends to be close to a normal sample. As a result, the reconstruction error on anomalies is amplified, which benefits anomaly detection. Experiments on two public video anomaly detection datasets, Avenue and ShanghaiTech, demonstrate the effectiveness of the proposed method.

Keywords: anomalous event detection; video surveillance; auto-encoding network; memory-augmented model; deep learning

Figure 1. Flow chart of the Memory AE based anomaly detection method

Figure 2. Examples of the detection results

Table 1. Comparison with state-of-the-art methods (frame-level AUC, %)

Method	Avenue	ShanghaiTech
MPPCA+SF^[17]	56.2%	-
MDT^[18]	77.4%	-
Conv-AE^[8]	80.0%	60.9%
Conv3D-AE^[19]	80.9%	-
Stacked RNN^[20]	81.7%	68.0%
ConvLSTM-AE^[21]	77.0%	-
MemNormality^[22]	88.5%	70.5%
ClusterAE^[23]	86.0%	73.3%
AbnormalGAN^[24]	-	72.4%
Pred+Recon^[25]	85.1%	73.0%
Proposed method	85.7%	75.3%


Table 2. Influence of the memory module size on the results for the Avenue dataset (frame-level AUC, %)

Size of memory module	500	1000	1500	2000	2500
Result	78.2%	85.7%	85.3%	85.7%	85.8%
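
Tables 1 and 2 report frame-level AUC. As a rough sketch of how such a number can be computed from per-frame anomaly scores and ground-truth labels, the pure-Python function below uses the pairwise-ranking definition of AUC. The function name `frame_level_auc` and the toy labels and scores are illustrative assumptions, not data or code from the paper.

```python
from typing import Sequence


def frame_level_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve for per-frame anomaly scores.

    Computed as the probability that a randomly chosen abnormal frame
    (label 1) receives a higher score than a randomly chosen normal
    frame (label 0); ties count as one half.
    """
    abnormal = [s for y, s in zip(labels, scores) if y == 1]
    normal = [s for y, s in zip(labels, scores) if y == 0]
    if not abnormal or not normal:
        raise ValueError("need at least one normal and one abnormal frame")

    wins = 0.0
    for a in abnormal:
        for n in normal:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(abnormal) * len(normal))


# Toy example: six frames, three of them labeled abnormal (hypothetical values).
labels = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]
auc = frame_level_auc(labels, scores)
```

The double loop is quadratic in the number of frames, which is fine for illustration; a sort-based rank computation would be the usual choice for full test sets.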


[1]	Dong F, Zhang Y, Nie X. Dual discriminator generative adversarial network for video anomaly detection [J]. IEEE Access, 2020, 8: 88170-88176.
[2]	Doshi K, Yilmaz Y. Any-shot sequential anomaly detection in surveillance videos [C]//Proceedings of International Conference on Computer Vision and Pattern Recognition Workshop, 2020: 934-935.
[3]	Doshi K, Yilmaz Y. Continual learning for anomaly detection in surveillance videos [C]//Proceedings of International Conference on Computer Vision and Pattern Recognition Workshop, 2020: 254-255.
[4]	Dutta J K, Banerjee B. Online detection of abnormal events using incremental coding length [C]//AAAI Conference on Artificial Intelligence, 2015: 3755-3761.
[5]	Feng Y, Yuan Y, Lu X. Learning deep event models for crowd anomaly detection [J]. Neurocomputing, 2017, 219: 548-556.
[6]	Gong D, Liu L, Le V, et al. Memorizing normality to detect anomaly: Memory-augmented deep autoencoder for unsupervised anomaly detection [C]//IEEE International Conference on Computer Vision, 2019: 1705-1714.
[7]	Ravanbakhsh M, Nabi M, Sangineto E, et al. Abnormal event detection in videos using generative adversarial nets [C]//IEEE International Conference on Image Processing, 2017: 1577-1581.
[8]	Hasan M, Choi J, Neumann J, et al. Learning temporal regularity in video sequences [C]//IEEE Conference on Computer Vision and Pattern Recognition, 2016: 733-742.
[9]	Liu W, Luo W, Lian D, et al. Future frame prediction for anomaly detection–A new baseline [C]//IEEE Conference on Computer Vision and Pattern Recognition, 2018: 6536-6545.
[10]	Park H, Noh J, Ham B. Learning memory-guided normality for anomaly detection [C]//IEEE Conference on Computer Vision and Pattern Recognition, 2020: 14372-14381.
[11]	Zaheer M Z, Lee J-H, Astrid M, et al. Old is gold: Redefining the adversarially learned one-class classifier training paradigm [C]//IEEE Conference on Computer Vision and Pattern Recognition, 2020: 14183-14193.
[12]	Zhu X, Liu J, Wang J, et al. Anomaly detection in crowded scene via appearance and dynamics joint modeling [C]//IEEE International Conference on Image Processing, 2013: 2705-2708.
[13]	Colque R V M, Júnior C A C, Schwartz W R. Histograms of optical flow orientation and magnitude to detect anomalous events in videos [C]//Sibgrapi Conference on Graphics, Patterns and Images, 2015: 126-133.
[14]	Zong B, Song Q, Min M, et al. Deep autoencoding Gaussian mixture model for unsupervised anomaly detection [C]//International Conference on Learning Representations, 2018.
[15]	Sabokrou M, Fayyaz M, Fathy M, et al. Deep-anomaly: Fully convolutional neural network for fast anomaly detection in crowded scenes [J]. Computer Vision and Image Understanding, 2018, 172: 88-97.
[16]	Li C, Han Z, Ye Q, et al. Visual abnormal behavior detection based on trajectory sparse reconstruction analysis [J]. Neurocomputing, 2013, 119(7): 94-100.
[17]	Mahadevan V, Li W, Bhalodia V, et al. Anomaly detection in crowded scenes [C]//IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010: 1975-1981.
[18]	Li W, Mahadevan V, Vasconcelos N. Anomaly detection and localization in crowded scenes [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(1): 18-32.
[19]	Zhao Y, Deng B, Shen C, et al. Spatio-temporal autoencoder for video anomaly detection [C]//ACM International Conference on Multimedia, 2017: 1933-1941.
[20]	Luo W, Liu W, Gao S. A revisit of sparse coding based anomaly detection in stacked RNN framework [C]//IEEE International Conference on Computer Vision, 2017: 1-8.
[21]	Luo W, Liu W, Gao S. Remembering history with convolutional LSTM for anomaly detection [C]//IEEE International Conference on Multimedia and Expo (ICME), 2017: 439-444.
[22]	Park H, Noh J, Ham B. Learning memory-guided normality for anomaly detection [C]//IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020: 14372-14381.
[23]	Chang Y, Tu Z, Xie W, et al. Clustering driven deep autoencoder for video anomaly detection [C]//European Conference on Computer Vision (ECCV), 2020: 329-345.
[24]	Ravanbakhsh M, Nabi M, Sangineto E, et al. Abnormal event detection in videos using generative adversarial nets [C]//IEEE International Conference on Image Processing (ICIP), 2017: 1577-1581.
[25]	Liu W, Luo W, Lian D, et al. Future frame prediction for anomaly detection–A new baseline [C]//IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018: 6536-6545.


0. Introduction

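As a high-level computer vision task, video anomaly detection refers to automatically detecting abnormal events in a given video sequence, distinguishing abnormal from normal activities and identifying the category of the anomaly. In the past few years, researchers have carried out many studies on anomaly detection^[1-9]. Compared with normal events, events that rarely occur or have a low probability of occurring are usually regarded as anomalies. In practice, however, it is difficult to build an effective anomaly detection model because the event types are unknown and the definition of an anomaly is ambiguous. Most existing anomaly detection methods rest on the assumption that any pattern differing from the learned normal patterns is an anomaly. Under this assumption, the same activity may be labeled normal or abnormal depending on the scene. For example, two people fighting in the street may be considered abnormal, whereas the same two people boxing as a sport are normal; a child running down the street in panic may be considered abnormal, but a child running on a rainy day because he forgot his umbrella may be normal; an animal touching a person may be considered abnormal, whereas a dolphin kissing a person may be normal. In addition, high-dimensional video data contain a large amount of redundant visual information, which makes events in video sequences harder to represent.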

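According to Ref. [6], anomaly detection methods can generally be divided into two types. The first type is built on reconstruction error and focuses on modeling the normal patterns in video sequences^[3-5, 7-8, 10-11]. These methods learn a feature representation model of normal patterns in the training phase and, in the testing phase, use the difference between abnormal and normal samples to determine the final anomaly score of the test data, for example through the reconstruction cost or a specific threshold^[7-14]. Although reconstruction-based methods are good at reconstructing the normal patterns in video sequences, their key problem is a heavy dependence on the training data. The second type treats anomaly detection as a classification problem^[15-18]: features such as the histogram of optical flow (HOF) or dynamic texture (DT) are extracted, and a trained classifier predicts the anomaly score of the video sequence. The performance of these methods also depends heavily on the training samples, and extracting effective, discriminative features is essential for satisfactory results. However, both types usually model the relationships between events in a relatively simple way, for example by capturing only linear relationships, which is insufficient for the complex, highly nonlinear relationships found in many real-world settings.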
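
The reconstruction-based scoring described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the helper names `reconstruction_error` and `normalized_scores`, the mean-squared-error measure, the per-video min-max normalization, and the 0.5 threshold are all assumptions made for the example, and the "reconstructions" are hand-written stand-ins for an autoencoder's output.

```python
from typing import List, Sequence


def reconstruction_error(frame: Sequence[float], recon: Sequence[float]) -> float:
    """Mean squared error between a flattened frame and its reconstruction."""
    if len(frame) != len(recon):
        raise ValueError("frame and reconstruction must have the same length")
    return sum((a - b) ** 2 for a, b in zip(frame, recon)) / len(frame)


def normalized_scores(errors: Sequence[float]) -> List[float]:
    """Min-max normalize the per-frame errors of one video to [0, 1].

    A higher score means the frame is more likely to be anomalous.
    """
    lo, hi = min(errors), max(errors)
    if hi == lo:  # constant errors carry no ranking information
        return [0.0 for _ in errors]
    return [(e - lo) / (hi - lo) for e in errors]


# Toy data: three flattened "frames" and hypothetical reconstructions.
# The last frame deviates from the pattern the model reproduces.
frames = [[0.0, 1.0, 0.0, 1.0], [0.0, 1.0, 0.0, 1.0], [1.0, 1.0, 1.0, 1.0]]
recons = [[0.0, 1.0, 0.0, 1.0], [0.1, 0.9, 0.1, 0.9], [0.0, 1.0, 0.0, 1.0]]

errors = [reconstruction_error(f, r) for f, r in zip(frames, recons)]
scores = normalized_scores(errors)
flags = [s > 0.5 for s in scores]  # 0.5 is an arbitrary illustrative threshold
```

Only the third frame is flagged here, because its reconstruction is pulled toward the normal pattern and therefore differs most from the input.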

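Nearly a decade ago, deep learning based methods were applied to video detection, and they have since made considerable progress. For example, autoencoders (AE) use the reconstruction error to detect anomalies, and a series of methods have improved on this basis. In addition, generative adversarial networks (GAN) and long short-term memory (LSTM) networks have also been applied to anomaly detection. However, an autoencoder may generalize so well that it can also reconstruct abnormal events. Ref. [14] points out that, because no abnormal samples are available for training, the reconstruction of abnormal samples is unpredictable, so anomalies do not necessarily yield a larger reconstruction error. If some anomalies share common compositional patterns with the normal training data (for example, local edges in an image), or if the decoder is "too strong" and decodes some abnormal encodings well, the autoencoder can reconstruct the anomalies well.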

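To overcome this shortcoming of autoencoders, this paper augments the deep autoencoder with a memory module and introduces a memory-augmented autoencoder (Memory AE) method. Given a new test sample, Memory AE does not feed its encoding directly into the decoder; instead, it uses the encoding as a query to retrieve the relevant items in the memory module, which are then aggregated and passed to the decoder. This retrieval is implemented through attention-based addressing. Furthermore, a differentiable shrinkage operator is used to induce sparsity in the memory addressing weights, which encourages the retrieved memory items to be close to the query in feature space. In the training phase, the memory module is updated together with the encoder and decoder to obtain a low average reconstruction error. In the testing phase, the learned memory content is fixed, and the reconstruction uses a small number of normal memory items selected as the neighbors of the input encoding, so the reconstruction error on anomalies becomes pronounced. Experiments on two public benchmark datasets show that the detection performance of Memory AE reaches the level of the state of the art.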
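
The addressing step described above can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the paper's code: it assumes cosine similarity followed by a softmax for the attention weights and a ReLU-style continuous shrinkage for sparsity, as commonly used in memory-augmented autoencoders. The names `cosine`, `address_memory`, and `shrink_threshold`, the toy memory, and the threshold value are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (small constant avoids 0/0)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def address_memory(
    query: Sequence[float],
    memory: Sequence[Sequence[float]],
    shrink_threshold: float,
    eps: float = 1e-12,
) -> Tuple[List[float], List[float]]:
    """Return (reconstructed encoding, sparse addressing weights)."""
    # 1) Attention: softmax over query-to-item similarities.
    sims = [cosine(query, item) for item in memory]
    top = max(sims)
    exps = [math.exp(s - top) for s in sims]
    total = sum(exps)
    weights = [e / total for e in exps]

    # 2) Shrinkage (assumed form): weights at or below the threshold become
    #    exactly 0, weights above it are kept almost unchanged.
    shrunk = [
        max(w - shrink_threshold, 0.0) * w / (abs(w - shrink_threshold) + eps)
        for w in weights
    ]

    # 3) Renormalize so the surviving weights sum to 1; if everything was
    #    shrunk away, fall back to the dense weights.
    norm = sum(shrunk)
    sparse = [w / norm for w in shrunk] if norm > 0 else weights

    # 4) Read: a weighted sum of memory items replaces the raw encoding.
    dim = len(query)
    z_hat = [sum(w * item[d] for w, item in zip(sparse, memory)) for d in range(dim)]
    return z_hat, sparse


# Toy memory with three 2-D items; the query lies close to the first item.
memory = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
z_hat, weights = address_memory([0.9, 0.1], memory, shrink_threshold=0.3)
```

With this toy setup only the first memory item survives the shrinkage, so the decoder would receive an encoding built purely from that stored pattern rather than from the query itself. That is the mechanism by which an abnormal input is pulled toward a normal reconstruction.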

3. Conclusion

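This paper proposes a memory-augmented autoencoder (Memory AE) to improve the performance of autoencoder-based anomaly detection. Given an input, Memory AE first uses the encoder to obtain an encoded representation and then uses the encoding as a query to retrieve the most relevant patterns in the memory module for reconstruction. Because the memory module is trained to record prototypical normal patterns, Memory AE reconstructs normal samples well and enlarges the reconstruction error of anomalies, which strengthens the reconstruction error as a criterion for anomaly detection. Experiments on two datasets demonstrate the generality and effectiveness of the proposed method. Future work will investigate using the addressing weights for anomaly detection. Since the proposed memory module is generic and independent of the structure of the encoder and decoder, it will also be integrated into more sophisticated base models and evaluated on more challenging datasets.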


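Contents

1.1. Encoder and decoder
1.2. Memory module
1.2.1. Attention-based representation
1.2.2. Attention for memory addressing
1.3. Training
1.4. Testing
2.1. Experimental data and evaluation metrics
2.2. Experimental settings
2.3. Experimental results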
