Volume 47 Issue 2
Mar. 2018

Citation: Zhang Guoshan, Zhang Peichong, Wang Xinbo. Visual place recognition based on multi-level feature difference map[J]. Infrared and Laser Engineering, 2018, 47(2): 203004-0203004(9). doi: 10.3788/IRLA201847.0203004

Visual place recognition based on multi-level feature difference map

1. School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China

doi: 10.3788/IRLA201847.0203004
  • Received Date: 2017-10-05
  • Rev Recd Date: 2017-12-11
  • Publish Date: 2018-02-25

Abstract: Perceptual aliasing and perceptual variability caused by drastic appearance changes in a scene pose a great challenge to visual place recognition. Many existing CNN-based visual place recognition methods directly compute the distance between CNN features and apply a threshold to measure the similarity of two images, and they perform poorly when the scene's appearance changes drastically. A novel visual place recognition method based on a multi-level feature difference map was proposed. Firstly, a CNN pretrained on a scene-centric dataset was adopted to extract features both from perceptually different images of the same place and from perceptually aliased images of different places. Then, according to the distinct properties of different CNN layers, a multi-level feature difference map was constructed from the multi-level CNN features to represent the difference between two images. Finally, visual place recognition was treated as a binary classification task: the feature difference maps were used to train a new CNN classification model that determines whether two images come from the same place. Experimental results demonstrated that the feature difference map constructed from multi-level CNN features represents the difference between two images well, and that the proposed method effectively overcomes perceptual aliasing and perceptual variability, achieving better recognition performance under drastic appearance changes.
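
As an illustration of the pipeline described in the abstract, the sketch below pairs a pretrained backbone with a small difference-map classifier. This is a minimal sketch, not the authors' implementation: a torchvision VGG-16 with ImageNet weights stands in for the scene-centric (Places-style) pretrained CNN, and the tapped layer indices, the difference-map resolution, and the classifier head are all illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

# ImageNet VGG-16 stands in for the paper's scene-centric pretrained CNN.
backbone = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()
TAP_LAYERS = {8, 15, 22}  # hypothetical taps: relu2_2 (128 ch), relu3_3 (256 ch), relu4_3 (512 ch)

def multi_level_features(x):
    """Run x through the backbone, collecting the feature map at each tap layer."""
    feats = []
    for i, layer in enumerate(backbone):
        x = layer(x)
        if i in TAP_LAYERS:
            feats.append(x)
    return feats

def difference_map(img_a, img_b, size=(28, 28)):
    """Element-wise |f(a) - f(b)| at each tapped level, resized to a common
    resolution and stacked along the channel axis into one difference map."""
    with torch.no_grad():
        feats_a = multi_level_features(img_a)
        feats_b = multi_level_features(img_b)
    diffs = [F.interpolate(torch.abs(a - b), size=size, mode='bilinear',
                           align_corners=False) for a, b in zip(feats_a, feats_b)]
    return torch.cat(diffs, dim=1)  # (N, 128+256+512, 28, 28)

class SamePlaceClassifier(nn.Module):
    """Small CNN trained on difference maps: same place vs. different place."""
    def __init__(self, in_channels):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 2),  # logits: [different place, same place]
        )

    def forward(self, diff_map):
        return self.net(diff_map)

# Two 224x224 RGB images -> difference map -> same-place logits.
img_a, img_b = torch.randn(1, 3, 224, 224), torch.randn(1, 3, 224, 224)
dmap = difference_map(img_a, img_b)
clf = SamePlaceClassifier(in_channels=dmap.shape[1])
logits = clf(dmap)

In training, labeled image pairs (same place under appearance change, aliased images of different places) would each be reduced to such a difference map and the classifier fit with a standard cross-entropy loss, matching the binary-classification formulation in the abstract.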