Volume 47 Issue 2
Mar.  2018

Yin Yunhua, Li Huifang. RGB-D object recognition based on hybrid convolutional auto-encoder extreme learning machine[J]. Infrared and Laser Engineering, 2018, 47(2): 0203008-0203008(8). doi: 10.3788/IRLA201847.0203008

RGB-D object recognition based on hybrid convolutional auto-encoder extreme learning machine

doi: 10.3788/IRLA201847.0203008
  • Received Date: 2017-08-05
  • Revision Received Date: 2017-10-03
  • Publish Date: 2018-02-25

  • 1. School of Electronics and Information,Northwestern Polytechnical University,Xi'an 710072,China;
  • 2. Science and Technology on Transient Impact Laboratory,Beijing 102202,China

Abstract: Learning rich representations efficiently is crucial to achieving high generalization performance in RGB-D object recognition. To address the long training time of convolutional neural networks, a Hybrid Convolutional Auto-Encoder Extreme Learning Machine (HCAE-ELM) structure is proposed. It combines the representational power of a Convolutional Neural Network (CNN) with the fast training of an Auto-Encoder Extreme Learning Machine (AE-ELM). Convolution and pooling layers first extract lower-level features from the RGB and depth images separately. A shared layer then combines the features from the two modalities and feeds them to an AE-ELM, which extracts higher-level features. The final features are passed to an ELM classifier, which yields better generalization performance with faster learning speed. HCAE-ELM was evaluated on the RGB-D object dataset. Experimental results show that the proposed method achieves higher testing accuracy with significantly shorter training time than deep learning methods and other ELM methods.
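The data flow described in the abstract (per-modality features, a shared layer, an AE-ELM, then an ELM classifier) can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract gives no layer sizes, activation function, or regularization, so the `tanh` activation, the hidden sizes (64 and 128), the ridge term `reg`, and all function names below are hypothetical. The convolution and pooling stage is replaced by synthetic feature vectors, and the encoder reuses the transposed reconstruction weights, which is one common ELM auto-encoder formulation assumed here.

```python
import numpy as np


def random_hidden(n_in, n_hidden, rng):
    """Draw fixed random hidden-layer parameters (never trained in an ELM)."""
    W = rng.standard_normal((n_in, n_hidden)) / np.sqrt(n_in)
    b = rng.standard_normal(n_hidden)
    return W, b


def ridge_solve(H, T, reg):
    """Closed-form ridge solution: beta = (H^T H + reg * I)^-1 H^T T."""
    A = H.T @ H + reg * np.eye(H.shape[1])
    return np.linalg.solve(A, H.T @ T)


def fit_elm_autoencoder(X, n_hidden, reg, rng):
    """ELM auto-encoder: learn beta that reconstructs X from random features."""
    W, b = random_hidden(X.shape[1], n_hidden, rng)
    beta = ridge_solve(np.tanh(X @ W + b), X, reg)  # shape (n_hidden, n_in)
    # Assumption: reuse the reconstruction weights, transposed, as the encoder.
    return lambda Z: np.tanh(Z @ beta.T)


def fit_elm_classifier(X, y, n_classes, n_hidden, reg, rng):
    """ELM classifier: random hidden layer plus ridge-regressed output weights."""
    W, b = random_hidden(X.shape[1], n_hidden, rng)
    targets = np.eye(n_classes)[y]  # one-hot labels
    beta = ridge_solve(np.tanh(X @ W + b), targets, reg)
    return lambda Z: np.argmax(np.tanh(Z @ W + b) @ beta, axis=1)


def fit_pipeline(rgb_feat, depth_feat, y, n_classes, seed=0):
    """Shared layer -> AE-ELM -> ELM classifier, on precomputed features."""
    rng = np.random.default_rng(seed)
    shared = np.hstack([rgb_feat, depth_feat])  # combine both modalities
    encode = fit_elm_autoencoder(shared, n_hidden=64, reg=1e-2, rng=rng)
    classify = fit_elm_classifier(encode(shared), y, n_classes, 128, 1e-2, rng)
    return lambda rgb, depth: classify(encode(np.hstack([rgb, depth])))


# Synthetic stand-ins for the per-modality convolution and pooling outputs.
rng = np.random.default_rng(1)
y = rng.integers(0, 3, size=300)
rgb = (2.0 * rng.standard_normal((3, 20)))[y] + 0.3 * rng.standard_normal((300, 20))
depth = (2.0 * rng.standard_normal((3, 20)))[y] + 0.3 * rng.standard_normal((300, 20))

predict = fit_pipeline(rgb, depth, y, n_classes=3)
print("training accuracy:", np.mean(predict(rgb, depth) == y))
```

Both ELM stages are fitted with a single linear solve and no backpropagation, which is the property the abstract credits for the shorter training time. Only the hidden-layer weights are random; the output weights come from the closed-form ridge solution.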
