Semi-direct RGB-D SLAM Algorithm for Dynamic Indoor Environments
  • Authors: GAO Chengqiang; ZHANG Yunzhou; WANG Xiaozhe; DENG Yi; JIANG Hao
  • Affiliations: College of Information Science and Engineering, Northeastern University; Faculty of Robot Science and Engineering, Northeastern University
  • Keywords: dynamic environment; visual SLAM (simultaneous localization and mapping); semi-direct algorithm; TSDF (truncated signed distance function) model; dense map
  • Journal: Robot (机器人)
  • CNKI journal code: JQRR
  • Online date: 2018-12-12
  • Year: 2019
  • Volume: v.41
  • Issue: 03
  • Pages: 86-97 (12 pages)
  • CN: 21-1137/TP
  • Funding: National Natural Science Foundation of China (61471110, 61733003); National Key R&D Program of China (2017YFC0805000/5005, 2017YFB1301103); Fundamental Research Funds for the Central Universities (N172608005); Natural Science Foundation of Liaoning Province (20180520040); Science and Technology Project of Liaoning Provincial Education Department (L20150185)
  • Language: Chinese
  • Record ID: JQRR201903010
Abstract
        To solve the accurate positioning problem of mobile robots in dynamic indoor environments, a semi-direct RGB-D visual SLAM (simultaneous localization and mapping) algorithm integrating motion detection is proposed. It consists of three parts: motion detection, camera pose estimation, and dense map building based on a TSDF (truncated signed distance function) model. First, an initial camera pose is estimated with a sparse image alignment algorithm by minimizing the photometric error. Then, the pose estimate from visual odometry is used to compensate for image motion, and a Gaussian model updated in real time over image patches is built; moving objects in the image are segmented according to the variance change of each patch. Local map points projected onto the motion region are discarded, and the camera pose is further optimized by minimizing the reprojection error, improving its estimation accuracy. Finally, a dense TSDF map is built from the camera poses and RGB-D images, and the map is updated in real time in dynamic environments by using the motion detection results and the color changes of voxel blocks in the map. Experimental results show that, in dynamic indoor environments, the proposed algorithm effectively improves the accuracy of camera pose estimation and achieves real-time updating of the dense map, improving both system robustness and scene reconstruction accuracy.
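The patch-wise Gaussian motion model described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the patch size `PATCH`, learning rate `ALPHA`, and threshold `K_SIGMA` are illustrative choices, and the motion compensation step (warping the previous frame with the visual-odometry pose before the update) is omitted.

```python
import numpy as np

PATCH = 16        # patch size in pixels (illustrative choice)
ALPHA = 0.05      # learning rate of the running Gaussian update (assumed)
K_SIGMA = 2.5     # deviation threshold in standard deviations (assumed)

def patch_means(gray, patch=PATCH):
    """Average intensity of each non-overlapping patch."""
    h, w = gray.shape
    h, w = h - h % patch, w - w % patch
    blocks = gray[:h, :w].reshape(h // patch, patch, w // patch, patch)
    return blocks.mean(axis=(1, 3))

class PatchGaussianModel:
    """Running per-patch Gaussian background model.

    Each patch keeps a running mean/variance of its (motion-compensated)
    intensity; a patch whose current intensity deviates from the model
    by more than K_SIGMA standard deviations is flagged as belonging
    to a moving object.
    """
    def __init__(self, first_gray):
        m = patch_means(first_gray.astype(np.float64))
        self.mu = m.copy()
        self.var = np.full_like(m, 15.0 ** 2)  # generous initial variance

    def update_and_segment(self, gray):
        x = patch_means(gray.astype(np.float64))
        moving = np.abs(x - self.mu) > K_SIGMA * np.sqrt(self.var)
        # Update only static patches so moving objects do not corrupt the model.
        upd = ~moving
        d = x - self.mu
        self.mu[upd] += ALPHA * d[upd]
        self.var[upd] = (1 - ALPHA) * self.var[upd] + ALPHA * d[upd] ** 2
        return moving  # boolean mask, one entry per patch
```

In the full pipeline, local map points that project into a flagged patch would be excluded from the reprojection-error optimization.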
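The TSDF mapping side can likewise be reduced to two primitive operations: the standard weighted-average voxel update (Curless-Levoy style, as used in KinectFusion-type systems), plus freeing voxels flagged as dynamic so the newly revealed static background can be re-integrated. This is a hedged sketch of the general technique, not the paper's code; `TRUNC` and `max_weight` are illustrative values, and the paper's color-change test on voxel blocks is only indicated in a comment.

```python
import numpy as np

TRUNC = 0.05  # truncation distance in meters (illustrative)

def integrate(tsdf, weight, sdf_obs, max_weight=50.0):
    """Weighted-average TSDF update for a batch of voxels.

    tsdf, weight: current voxel values; sdf_obs: signed distance of
    each voxel to the surface observed in the new depth frame.
    """
    d = np.clip(sdf_obs / TRUNC, -1.0, 1.0)       # truncate and normalize
    tsdf = (tsdf * weight + d) / (weight + 1.0)   # running weighted mean
    weight = np.minimum(weight + 1.0, max_weight) # cap to stay adaptive
    return tsdf, weight

def reset_dynamic(tsdf, weight, dynamic_mask):
    """Free voxels observed as dynamic (e.g. flagged by motion detection
    or by a large color change of the voxel block), so subsequent frames
    can re-integrate the static background behind a removed object."""
    tsdf[dynamic_mask] = 0.0
    weight[dynamic_mask] = 0.0
    return tsdf, weight
```

Capping the weight keeps old voxels responsive to change, which matters precisely in the dynamic scenes the paper targets.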
References
[1] Kerl C, Sturm J, Cremers D. Robust odometry estimation for RGB-D cameras[C]//IEEE International Conference on Robotics and Automation. Piscataway, USA:IEEE, 2013:3748-3754.
    [2] Mur-Artal R, Tardós J D. ORB-SLAM2:An open-source SLAM system for monocular, stereo, and RGB-D cameras[J]. IEEE Transactions on Robotics, 2017, 33(5):1255-1262.
    [3] Newcombe R A, Izadi S, Hilliges O, et al. KinectFusion:Real-time dense surface mapping and tracking[C]//10th IEEE/ACM International Symposium on Mixed and Augmented Reality. Piscataway, USA:IEEE, 2011:127-136.
    [4] Fu M Y, Lü X W, Liu T, et al. Real-time SLAM algorithm based on RGB-D data[J]. Robot, 2015, 37(6):683-692. (in Chinese)
    [5] Nistér D, Naroditsky O, Bergen J. Visual odometry[C]//IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway, USA:IEEE, 2004:652-659.
    [6] Tan W, Liu H M, Dong Z L, et al. Robust monocular SLAM in dynamic environments[C]//12th IEEE/ACM International Symposium on Mixed and Augmented Reality. Piscataway, USA:IEEE, 2013:209-218.
    [7] Hahnel D, Triebel R, Burgard W, et al. Map building with mobile robots in dynamic environments[C]//IEEE International Conference on Robotics and Automation. Piscataway, USA:IEEE, 2003:1557-1563.
    [8] Bibby C, Reid I. Simultaneous localisation and mapping in dynamic environments(SLAMIDE)with reversible data association[C]//Robotics:Science and Systems Ⅲ. 2007. DOI:10.15607/RSS.2007.III.014.
    [9] Wang Y B, Huang S D. Motion segmentation based robust RGB-D SLAM[C]//11th World Congress on Intelligent Control and Automation. Piscataway, USA:IEEE, 2014:3122-3127.
    [10] Sun Y X, Liu M, Meng M Q H. Improving RGB-D SLAM in dynamic environments:A motion removal approach[J]. Robotics and Autonomous Systems, 2017, 89:110-122.
    [11] Forster C, Pizzoli M, Scaramuzza D. SVO:Fast semi-direct monocular visual odometry[C]//IEEE International Conference on Robotics and Automation. Piscataway, USA:IEEE, 2014:15-22.
    [12] Engel J, Schöps T, Cremers D. LSD-SLAM:Large-scale direct monocular SLAM[C]//13th European Conference on Computer Vision. Cham, Switzerland:Springer, 2014:834-849.
    [13] Wei T, Jin L Y. Obstacle memory localization and denoising algorithm based on binocular ORB-SLAM[J]. Robot, 2018, 40(3):266-272. (in Chinese)
    [14] Newcombe R A, Fox D, Seitz S M. DynamicFusion:Reconstruction and tracking of non-rigid scenes in real-time[C]//IEEE Conference on Computer Vision and Pattern Recognition. Piscataway, USA:IEEE, 2015:343-352.
    [15] Fehr M, Furrer F, Dryanovski I, et al. TSDF-based change detection for consistent long-term dense reconstruction and dynamic object discovery[C]//IEEE International Conference on Robotics and Automation. Piscataway, USA:IEEE, 2017:5237-5244.
    [16] Sturm J, Engelhard N, Endres F, et al. A benchmark for the evaluation of RGB-D SLAM systems[C]//IEEE/RSJ International Conference on Intelligent Robots and Systems. Piscataway, USA:IEEE, 2012:573-580.
    [17] Rublee E, Rabaud V, Konolige K, et al. ORB:An efficient alternative to SIFT or SURF[C]//IEEE International Conference on Computer Vision. Piscataway, USA:IEEE, 2011:2564-2571.
    [18] Baker S, Matthews I. Lucas-Kanade 20 years on:A unifying framework[J]. International Journal of Computer Vision, 2004,56(3):221-255.
    [19] van Droogenbroeck M, Barnich O. ViBe:A disruptive method for background subtraction[M]//Background Modeling and Foreground Detection for Video Surveillance. Boca Raton,USA:CRC, 2014. DOI:10.1201/b17223-10.
    [20] Yi K M, Yun K, Kim S W, et al. Detection of moving objects with non-stationary cameras in 5.8 ms:Bringing motion detection to your mobile device[C]//IEEE Conference on Computer Vision and Pattern Recognition Workshops. Piscataway, USA:IEEE, 2013:27-34.
    [21] Bouguet J Y. Pyramidal implementation of the affine Lucas Kanade feature tracker description of the algorithm[EB/OL].[2018-05-01]. https://pdfs.semanticscholar.org/aa97/2b40c0f8e20b07e02d1fd320bc7ebadfdfc7.pdf.
    [22] Rosten E, Porter R, Drummond T. Faster and better:A machine learning approach to corner detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010, 32(1):105-119.
    [23] Klingensmith M, Dryanovski I, Srinivasa S S, et al. CHISEL:Real time large scale 3D reconstruction onboard a mobile device using spatially hashed signed distance fields[C/OL]//Robotics:Science and Systems. 2015. http://www.roboticsproceedings.org/rss11/p40.pdf.
    [24] Li S, Lee D. RGB-D SLAM in dynamic environments using static point weighting[J]. IEEE Robotics and Automation Letters, 2017, 2(4):2263-2270.
    [25] Kim D H, Kim J H. Effective background model-based RGB-D dense visual odometry in a dynamic environment[J]. IEEE Transactions on Robotics, 2016, 32(6):1565-1573.