3D Object Recognition Based on Voxel Features Reorganization Network (基于体素特征重组网络的三维物体识别)
  • English title: 3D Object Recognition Based on Voxel Features Reorganization Network
  • Authors: 路强 (LU Qiang); 张春元 (ZHANG Chun-yuan); 陈超 (CHEN Chao); 余烨 (YU Ye); YUAN Xiao-hui
  • English authors: LU Qiang; ZHANG Chun-yuan; CHEN Chao; YU Ye; YUAN Xiao-hui. Affiliations: VCC Division, School of Computer and Information, Hefei University of Technology; Anhui Province Key Laboratory of Industry Safety and Emergency Technology (Hefei University of Technology); Department of Computer Science and Engineering, University of North Texas
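  • Keywords: 物体识别 (object recognition); 体素 (voxel); 卷积神经网络 (convolutional neural network); 特征重组 (feature reorganization); 短连接 (short connection)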
  • English keywords: object recognition; voxel; convolutional neural network; feature reorganization; short connection
  • Chinese journal code: GCTX
  • English journal title: Journal of Graphics
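  • Institutions: VCC Division, School of Computer and Information, Hefei University of Technology; Anhui Province Key Laboratory of Industry Safety and Emergency Technology (Hefei University of Technology); Department of Computer Science and Engineering, University of North Texas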
  • Publication date: 2019-04-15
  • Publisher: Journal of Graphics (图学学报)
  • Year: 2019
  • Volume and issue: v.40; No.144
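  • Funding: Natural Science Foundation of Anhui Province (1708085MF158); National Natural Science Foundation of China (61602146); China Scholarship Council Program (201706695044); Key Project for the Transfer and Industrialization of Scientific and Technological Achievements, Intelligent Manufacturing Institute of Hefei University of Technology (IMICZ2017010)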
  • Language: Chinese
  • Record ID: GCTX201902004
  • Number of pages: 8
  • Issue: 02
  • CN: 10-1034/T
  • Pages: 30-37
Abstract
3D object recognition has been a research focus in computer vision in recent years, with important application prospects in autonomous driving, medical image processing, and other areas. For the voxel representation of 3D objects, the voxel features reorganization network (VFRN), a convolutional neural network, uses a short-connection structure that directly connects non-adjacent convolutional layers within the same unit. Through its feature reorganization scheme, the network reuses and fuses multi-dimensional features, which improves its feature representation ability and lets it extract the structural features of objects more fully. The short connections also help gradient information propagate and, together with small convolution kernels and global average pooling, further improve the generalization ability of the network while reducing the number of model parameters and the training difficulty. Experiments on the ModelNet dataset show that VFRN overcomes the low resolution and missing texture of voxel data and achieves higher recognition accuracy than existing methods while using fewer parameters.
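The abstract attributes part of the parameter reduction to small convolution kernels and global average pooling. The arithmetic behind that claim can be illustrated with a minimal sketch. All layer sizes below (32 and 64 channels, a 4 x 4 x 4 final feature volume, 40 output classes) are assumptions chosen for illustration only and are not taken from the VFRN paper; the helper functions are likewise hypothetical.

```python
# Parameter-count arithmetic for 3D CNN building blocks.
# All sizes are hypothetical and are NOT the VFRN configuration.

def conv3d_params(in_ch, out_ch, k):
    """Weights plus biases of one 3D convolution with a k x k x k kernel."""
    return in_ch * out_ch * k ** 3 + out_ch

def fc_params(in_features, out_features):
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features

def flatten_head_params(channels, side, num_classes):
    """Classifier that flattens a channels x side^3 volume into one FC layer."""
    return fc_params(channels * side ** 3, num_classes)

def gap_head_params(channels, num_classes):
    """Classifier that averages each channel to one value before the FC layer."""
    return fc_params(channels, num_classes)

if __name__ == "__main__":
    # Small versus large kernel at the same (assumed) channel width.
    print("3x3x3 conv:", conv3d_params(32, 32, 3))
    print("5x5x5 conv:", conv3d_params(32, 32, 5))
    # Two stacked 3x3x3 layers cover a 5x5x5 receptive field.
    print("two 3x3x3 convs:", 2 * conv3d_params(32, 32, 3))
    # Flatten-based head versus global-average-pooling head.
    print("flatten + FC head:", flatten_head_params(64, 4, 40))
    print("GAP + FC head:", gap_head_params(64, 40))
```

Under these assumed sizes, two stacked 3 x 3 x 3 layers need fewer than half the parameters of a single 5 x 5 x 5 layer with the same receptive field, and the pooled classifier head is smaller than the flattened one by a factor of roughly 63. The actual savings in VFRN depend on its real layer widths, which this record does not state.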