An Algorithm for Affordance Parts Detection of Household Tools Based on Joint Learning

Original title (Chinese): 一种基于联合学习的家庭日常工具功用性部件检测算法
Authors: WU Pei-Liang (吴培良); XI Xiao-Jun (隰晓珺); YANG Xiao (杨霄); KONG Ling-Fu (孔令富); HOU Zeng-Guang (侯增广)
Author affiliations: School of Information Science and Engineering, Yanshan University; State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences; The Key Laboratory for Computer Virtual Technology and System Integration of Hebei Province
Keywords: affordance parts detection; depth geometric features; joint learning; conditional random fields (CRF); sparse coding
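Journal code: MOTO
Journal (English name): Acta Automatica Sinica
Institutions (Chinese names): 燕山大学信息科学与工程学院; 中国科学院自动化研究所复杂系统管理与控制国家重点实验室; 河北省计算机虚拟技术与系统集成重点实验室
Publication date: 2018-04-18 14:23
Publisher: 自动化学报 (Acta Automatica Sinica)
Year: 2019
Volume: 45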
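Funding: National Key Research and Development Program of China (2018YFB1308305); National Natural Science Foundation of China (61305113); China Postdoctoral Science Foundation (2018M631620); Natural Science Foundation of Hebei Province (F2016203358); Doctoral Foundation of Yanshan University (BL18007)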
Language: Chinese
Record ID: MOTO201905015
Page count: 8
Issue: 05
CN: 11-2109/TP
Pages: 159-166

Abstract

Enabling coexisting-cooperative-cognitive (Tri-Co) robots to understand tools and their affordance parts is an important direction for improving robot intelligence. This paper studies the modeling and detection of affordance parts of household tools and proposes a detection algorithm based on joint learning of a conditional random field (CRF) and sparse coding. First, geometric features that characterize the affordance parts are extracted from depth images of the tools. Second, the coupling between the CRF and sparse coding is analyzed and formulated: the sparse codes of the features serve as latent variables for building an initial CRF model, and the sparse dictionary and the CRF are then optimized jointly. On the one hand, the sparse representation of the features acts as the conditioning variable of the CRF and as a selector of its weight parameters; on the other hand, the sparse dictionary is updated under the regulation of the CRF. The model is then decoupled and solved with adaptive moment estimation (Adam). Finally, an offline algorithm for building the affordance part model by joint learning and an online detection method based on that model are given. Experimental results show that, compared with traditional feature extraction and model construction methods, the proposed method improves both the accuracy and the efficiency of affordance part detection and meets the affordance cognition needs of robots with ordinary hardware configurations.
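
As a rough illustration of the alternating scheme the abstract describes (sparse coding of features, followed by an Adam update of the dictionary), the following Python sketch may help. It is a toy under stated assumptions, not the authors' implementation: the CRF is omitted entirely, so the dictionary is updated on plain reconstruction error rather than under CRF regulation. The names `soft_threshold`, `reconstruct`, `sparse_code`, `Adam`, and `train_dictionary`, the choice of ISTA as the sparse solver, and all hyperparameters are hypothetical choices made for this example.

```python
import math


def soft_threshold(v, t):
    """Proximal operator of t*|.|: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def reconstruct(a, D):
    """Return sum_k a[k] * D[k] for a dictionary D given as a list of atoms."""
    return [sum(a[k] * D[k][i] for k in range(len(D))) for i in range(len(D[0]))]


def sparse_code(x, D, lam=0.1, steps=50):
    """ISTA for min_a 0.5*||x - sum_k a[k] D[k]||^2 + lam*||a||_1."""
    # The squared Frobenius norm upper-bounds the gradient's Lipschitz constant.
    L = sum(d * d for atom in D for d in atom) or 1.0
    a = [0.0] * len(D)
    for _ in range(steps):
        resid = [r - xi for r, xi in zip(reconstruct(a, D), x)]
        grad = [sum(ri * di for ri, di in zip(resid, atom)) for atom in D]
        a = [soft_threshold(ak - gk / L, lam / L) for ak, gk in zip(a, grad)]
    return a


class Adam:
    """Plain Adam update on a flat parameter list (Kingma and Ba, 2015)."""

    def __init__(self, size, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.b1, self.b2, self.eps = lr, beta1, beta2, eps
        self.m = [0.0] * size
        self.v = [0.0] * size
        self.t = 0

    def step(self, params, grads):
        self.t += 1
        out = []
        for i, (p, g) in enumerate(zip(params, grads)):
            self.m[i] = self.b1 * self.m[i] + (1 - self.b1) * g
            self.v[i] = self.b2 * self.v[i] + (1 - self.b2) * g * g
            m_hat = self.m[i] / (1 - self.b1 ** self.t)  # bias-corrected mean
            v_hat = self.v[i] / (1 - self.b2 ** self.t)  # bias-corrected variance
            out.append(p - self.lr * m_hat / (math.sqrt(v_hat) + self.eps))
        return out


def train_dictionary(X, D, lam=0.1, epochs=100, lr=0.05):
    """Alternate sparse coding (codes) and one Adam step (dictionary) per epoch.

    ASSUMPTION: the loss is reconstruction error plus an L1 penalty only;
    the CRF term described in the abstract is not modelled here.
    """
    K, n = len(D), len(D[0])
    D = [list(atom) for atom in D]  # work on a copy of the initial dictionary
    opt = Adam(K * n, lr=lr)
    history = []
    for _ in range(epochs):
        grads = [0.0] * (K * n)
        loss = 0.0
        for x in X:
            a = sparse_code(x, D, lam)
            resid = [r - xi for r, xi in zip(reconstruct(a, D), x)]
            loss += 0.5 * sum(r * r for r in resid) + lam * sum(abs(c) for c in a)
            for k in range(K):
                for i in range(n):
                    grads[k * n + i] += resid[i] * a[k]  # gradient with codes held fixed
        flat = opt.step([d for atom in D for d in atom], grads)
        D = [flat[k * n:(k + 1) * n] for k in range(K)]
        history.append(loss / len(X))  # mean loss measured before this epoch's update
    return D, history


if __name__ == "__main__":
    X = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]  # made-up 2-D "features"
    D0 = [[0.5, 0.5], [0.5, -0.5]]            # made-up initial atoms
    D, history = train_dictionary(X, D0)
    print("first/last mean loss:", history[0], history[-1])
```

In the method summarized above, the dictionary gradient would instead come from the CRF objective, with the sparse codes acting as latent variables; the sketch only shows the alternating structure and the Adam update itself. Because the atoms are not normalized here, the loss can also fall simply by the atoms growing in scale, which a practical dictionary learner would usually constrain.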

References

1 Aly A, Griffiths S, Stramandinoli F. Towards intelligent social robots: current advances in cognitive robotics. Cognitive Systems Research, 2017, 43: 153-156
2 Min H Q, Yi C A, Luo R H, Zhu J H, Bi S. Affordance research in developmental robotics: a survey. IEEE Transactions on Cognitive and Developmental Systems, 2016, 8(4): 237-255
3 Lenz I, Lee H, Saxena A. Deep learning for detecting robotic grasps. The International Journal of Robotics Research, 2015, 34(4-5): 705-724
4 Kjellstrom H, Romero J, Kragic D. Visual object-action recognition: inferring object affordances from human demonstration. Computer Vision and Image Understanding, 2011, 115(1): 81-90
5 Grabner H, Gall J, Van Gool L. What makes a chair a chair? In: Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition. Providence, RI: IEEE, 2011. 1529-1536
6 Koppula H S, Gupta R, Saxena A. Learning human activities and object affordances from RGB-D videos. The International Journal of Robotics Research, 2013, 32(8): 951-970
7 Myers A, Teo C L, Fermuller C, Aloimonos Y. Affordance detection of tool parts from geometric features. In: Proceedings of the 2015 IEEE International Conference on Robotics and Automation. Seattle, WA: IEEE, 2015. 1374-1381
8 Lin Yu-Dong, He Hong-Jie, Chen Fan, Yin Zhong-Ke. A rigid object detection model based on geometric sparse representation of profile and its hierarchical detection algorithm. Acta Automatica Sinica, 2015, 41(4): 843-853 (林煜东, 和红杰, 陈帆, 尹忠科. 基于轮廓几何稀疏表示的刚性目标模型及其分级检测算法. 自动化学报, 2015, 41(4): 843-853)
9 Redmon J, Angelova A. Real-time grasp detection using convolutional neural networks. In: Proceedings of the 2015 IEEE International Conference on Robotics and Automation. Seattle, WA: IEEE, 2015. 1316-1322
10 Zhong Xun-Gao, Xu Min, Zhong Xun-Yu, Peng Xia-Fu. Multimodal features deep learning for robotic potential grasp recognition. Acta Automatica Sinica, 2016, 42(7): 1022-1029 (仲训杲, 徐敏, 仲训昱, 彭侠夫. 基于多模特征深度学习的机器人抓取判别方法. 自动化学报, 2016, 42(7): 1022-1029)
11 Myers A O. From form to function: detecting the affordance of tool parts using geometric features and material cues [Ph.D. dissertation], University of Maryland, 2016
12 Nguyen A, Kanoulas D, Caldwell D G, Tsagarakis N G. Detecting object affordances with Convolutional Neural Networks. In: Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems. Daejeon: IEEE, 2016. 2765-2770
13 Wu Pei-Liang, Fu Wei-Xing, Kong Ling-Fu. A fast algorithm for affordance detection of household tool parts based on structured random forest. Acta Optica Sinica, 2017, 37(2): 0215001 (吴培良, 付卫兴, 孔令富. 一种基于结构随机森林的家庭日常工具部件功用性快速检测算法. 光学学报, 2017, 37(2): 0215001)
14 Thogersen M, Escalera S, Gonzalez J, Moeslund T B. Segmentation of RGB-D indoor scenes by stacking random forests and conditional random fields. Pattern Recognition Letters, 2016, 80: 208-215
15 Bao C L, Ji H, Quan Y H, Shen Z W. Dictionary learning for sparse coding: algorithms and convergence analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(7): 1356-1369
16 Yang J M, Yang M H. Top-down visual saliency via joint CRF and dictionary learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(3): 576-588
17 Yang E, Gwak J, Jeon M. Conditional random field (CRF)-boosting: constructing a robust online hybrid boosting multiple object tracker facilitated by CRF learning. Sensors, 2017, 17(3): 617
18 Liu T, Huang X T, Ma J S. Conditional random fields for image labeling. Mathematical Problems in Engineering, 2016, 2016: Article ID 3846125
19 Lv P Y, Zhong Y F, Zhao J, Jiao H Z, Zhang L P. Change detection based on a multifeature probabilistic ensemble conditional random field model for high spatial resolution remote sensing imagery. IEEE Geoscience and Remote Sensing Letters, 2016, 13(12): 1965-1969
20 Qian Sheng, Chen Zong-Hai, Lin Ming-Qiang, Zhang Chen-Bin. Saliency detection based on conditional random field and image segmentation. Acta Automatica Sinica, 2015, 41(4): 711-724 (钱生, 陈宗海, 林名强, 张陈斌. 基于条件随机场和图像分割的显著性检测. 自动化学报, 2015, 41(4): 711-724)
21 Wang Z, Zhu S Q, Li Y H, Cui Z Z. Convolutional neural network based deep conditional random fields for stereo matching. Journal of Visual Communication and Image Representation, 2016, 40: 739-750
22 Szummer M, Kohli P, Hoiem D. Learning CRFs using graph cuts. In: Proceedings of European Conference on Computer Vision, Lecture Notes in Computer Science, vol. 5303. Berlin, Heidelberg: Springer, 2008. 582-595
23 Kolmogorov V, Zabih R. What energy functions can be minimized via graph cuts? IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004, 26(2): 147-159
24 Kingma D P, Ba J. Adam: a method for stochastic optimization. In: Proceedings of the 3rd International Conference for Learning Representations. San Diego, 2015
25 Mairal J, Bach F, Ponce J. Task-driven dictionary learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(4): 791-804