
# Stereo visual odometry based on dynamic and static features division

\* Corresponding author: Guangbin Cai

The first author is mainly supported by the National Natural Science Foundation of China under Grant No. 61773387.

Accurate camera pose estimation in dynamic scenes is a key challenge for visual simultaneous localization and mapping, and reducing the effect of moving objects on pose estimation is critical. To tackle this problem, a robust visual odometry approach for dynamic scenes is proposed that precisely distinguishes dynamic from static features. The key to the proposed method is combining the scene flow with the principle that the relative spatial distances between static features are invariant. Moreover, a new threshold is proposed to distinguish dynamic features. The dynamic features are then eliminated after matching with the virtual map points. In addition, a new similarity calculation function is proposed to improve the performance of loop-closure detection. Finally, the camera pose is optimized once a loop closure is obtained. Experiments on the TUM datasets and in real scenes show that the proposed method reduces tracking errors significantly and estimates the camera pose precisely in dynamic scenes.
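The dynamic/static division described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's exact criterion: the function names, the fixed thresholds, and the assumption of motion-compensated 3-D points are all illustrative (the paper combines the scene-flow angle classification of [26] with an adaptive threshold).

```python
import numpy as np

def classify_features(points_prev, points_curr, threshold=0.05):
    """Label each 3-D feature as static or dynamic from its scene flow.

    points_prev, points_curr: (N, 3) arrays of triangulated feature
    positions in the previous and current frames, after compensating
    for the estimated camera motion.
    threshold: scene-flow magnitude (metres) above which a feature is
    treated as dynamic -- an illustrative fixed value.
    """
    flow = points_curr - points_prev          # per-feature scene flow
    magnitude = np.linalg.norm(flow, axis=1)  # ||flow|| per feature
    dynamic = magnitude > threshold
    return ~dynamic, dynamic                  # boolean masks

def distances_consistent(p_prev, p_curr, tol=0.01):
    """Relative-distance invariance check: truly static points keep
    their pairwise distances between frames (cf. Figure 5)."""
    d_prev = np.linalg.norm(p_prev[:, None] - p_prev[None, :], axis=2)
    d_curr = np.linalg.norm(p_curr[:, None] - p_curr[None, :], axis=2)
    return bool(np.all(np.abs(d_prev - d_curr) < tol))
```

The pairwise-distance check is what lets the static set be confirmed without knowing the camera motion exactly: a rigid (static) point set preserves all mutual distances, while a moving point breaks them.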

Mathematics Subject Classification: Primary: 65D19, 65D18; Secondary: 68U10.


Figure 1.  Stereo camera model

Figure 2.  Generation of a visual vocabulary tree

Figure 3.  Overview of the proposed algorithm in dynamic scenes

Figure 4.  Classification of the scene flow based on angles [26]

Figure 5.  Invariance of the relative spatial distance of the static points

Figure 6.  Construction of the virtual map points

Figure 7.  Three static features selected by the algorithm

Figure 8.  Dynamic features obtained by the algorithm

Figure 9.  Experiment scene sets

Figure 10.  Experimental results of ORB-VO in lab scenes

Figure 11.  Experimental results of the proposed method in lab scenes

Figure 12.  Loop-closure detection result of the inverse proportional function

Figure 13.  Loop-closure detection result of the negative exponential power function

Figure 14.  Loop-closure detection result of the negative exponential power function

Figure 15.  Comparisons between estimated trajectories and the ground truth in walking sequences

Figure 16.  Comparisons between estimated trajectories and the ground truth in sitting sequences
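The stereo camera model of Figure 1 underlies the 3-D feature positions used throughout. A minimal sketch of the standard rectified-stereo depth relation Z = f·b/d (the focal length, baseline, and disparity values below are illustrative, not taken from the paper):

```python
def stereo_depth(u_left, u_right, focal_px, baseline_m):
    """Depth of a point from a rectified stereo pair: Z = f * b / d,
    where d = u_left - u_right is the disparity in pixels,
    f the focal length in pixels, b the baseline in metres."""
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("non-positive disparity: point at or beyond infinity")
    return focal_px * baseline_m / disparity

# e.g. f = 700 px, b = 0.12 m, d = 14 px  ->  Z = 6.0 m
```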

Table 1.  Translational and rotational drift of VO methods on the TUM dataset

RMSE of translational drift [m/s]:

| Sequence | DVO | BaMVO | SPW-VO | Our Method |
|---|---|---|---|---|
| sitting-static | 0.0157 | 0.0248 | 0.0231 | 0.0112 |
| sitting-xyz | 0.0453 | 0.0482 | 0.0219 | 0.0132 |
| sitting-rpy | 0.1735 | 0.1872 | 0.0843 | 0.0280 |
| sitting-halfsphere | 0.1005 | 0.0589 | 0.0389 | 0.0151 |
| walking-static | 0.3818 | 0.1339 | 0.0327 | 0.0293 |
| walking-xyz | 0.4360 | 0.2326 | 0.0651 | 0.1034 |
| walking-rpy | 0.4038 | 0.3584 | 0.2252 | 0.2143 |
| walking-halfsphere | 0.2628 | 0.1738 | 0.0527 | 0.1061 |

RMSE of rotational drift [$^{\circ}$/s]:

| Sequence | DVO | BaMVO | SPW-VO | Our Method |
|---|---|---|---|---|
| sitting-static | 0.6084 | 0.6977 | 0.7228 | 0.3356 |
| sitting-xyz | 1.4980 | 1.3885 | 0.8466 | 0.5753 |
| sitting-rpy | 6.0164 | 5.9834 | 5.6258 | 0.6811 |
| sitting-halfsphere | 4.6490 | 2.8804 | 1.8836 | 0.6103 |
| walking-static | 6.3502 | 2.0833 | 0.8085 | 0.5500 |
| walking-xyz | 7.6669 | 4.3911 | 1.6442 | 2.3273 |
| walking-rpy | 7.0662 | 6.3898 | 5.6902 | 3.9555 |
| walking-halfsphere | 5.2179 | 4.2863 | 2.4048 | 2.2983 |

Table 2.  RMSE of the ATE of camera pose estimation (m)

| Sequence | ORB-SLAM2 | MR-SLAM | SPW-SLAM | SF-SLAM | Our Method |
|---|---|---|---|---|---|
| sitting-static | 0.0082 | – | – | 0.0081 | 0.0073 |
| sitting-xyz | 0.0094 | 0.0482 | 0.0397 | 0.0101 | 0.0090 |
| sitting-rpy | 0.0197 | – | – | 0.0180 | 0.0162 |
| sitting-halfsphere | 0.0211 | 0.0470 | 0.0432 | 0.0239 | 0.0164 |
| walking-static | 0.1028 | 0.0656 | 0.0261 | 0.0120 | 0.0108 |
| walking-xyz | 0.4278 | 0.0932 | 0.0601 | 0.2251 | 0.0884 |
| walking-rpy | 0.7407 | 0.1333 | 0.1791 | 0.1961 | 0.3620 |
| walking-halfsphere | 0.4939 | 0.1252 | 0.0489 | 0.0423 | 0.0411 |
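The ATE RMSE figures in Table 2 follow the standard TUM RGB-D benchmark definition. A minimal sketch of the computation, assuming the estimated and ground-truth trajectories have already been associated in time and rigidly aligned (the function name is illustrative):

```python
import numpy as np

def ate_rmse(est_positions, gt_positions):
    """Root-mean-square absolute trajectory error over the
    translational components of two time-aligned trajectories,
    each given as an (N, 3) array of positions in metres."""
    err = np.asarray(est_positions) - np.asarray(gt_positions)
    per_frame = np.linalg.norm(err, axis=1)       # error per timestamp
    return float(np.sqrt(np.mean(per_frame ** 2)))  # RMSE in metres
```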
[1] P. F. Alcantarilla, J. J. Yebes, J. Almazán, et al., On combining visual SLAM and dense scene flow to increase the robustness of localization and mapping in dynamic environments, 2012 IEEE International Conference on Robotics and Automation, Saint Paul, Minnesota, USA, IEEE, 2012.

[2] Y. An, B. Li, L. Wang, et al., Calibration of a 3D laser rangefinder and a camera based on optimization solution, J. Ind. Manag. Optim., 17 (2021), 427-445. doi: 10.3934/jimo.2019119.

[3] A. Angeli, D. Filliat, S. Doncieux, et al., Fast and incremental method for loop-closure detection using bags of visual words, IEEE Transactions on Robotics, 24 (2008), 1027-1037.

[4] C. Bibby and I. Reid, Simultaneous localisation and mapping in dynamic environments (SLAMIDE) with reversible data association, Robotics: Science and Systems, Atlanta, Georgia, USA, 2007.

[5] L. Bose and A. Richards, Fast depth edge detection and edge based RGB-D SLAM, IEEE International Conference on Robotics and Automation, Stockholm, Sweden, IEEE, 2016.

[6] C. Cadena, L. Carlone, H. Carrillo, et al., Simultaneous localization and mapping: Present, future, and the robust-perception age, IEEE Transactions on Robotics, 32 (2016), 1309-1332.

[7] C. Choi, A. J. Trevor and H. I. Christensen, RGB-D edge detection and edge-based registration, 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, Tokyo, Japan, IEEE, 2013.

[8] A. J. Davison, I. D. Reid, N. D. Molton, et al., MonoSLAM: Real-time single camera SLAM, IEEE Transactions on Pattern Analysis & Machine Intelligence, 29 (2007), 1052-1067.

[9] J. Engel, V. Koltun and D. Cremers, Direct sparse odometry, IEEE Transactions on Pattern Analysis & Machine Intelligence, 40 (2018), 611-625.

[10] J. Engel, T. Schöps and D. Cremers, LSD-SLAM: Large-scale direct monocular SLAM, European Conference on Computer Vision, Springer, Zürich, Switzerland, 2014.

[11] J. Fan, On the Levenberg-Marquardt methods for convex constrained nonlinear equations, J. Ind. Manag. Optim., 9 (2013), 227-241. doi: 10.3934/jimo.2013.9.227.

[12] C. Forster, M. Pizzoli and D. Scaramuzza, SVO: Fast semi-direct monocular visual odometry, IEEE International Conference on Robotics and Automation, Hong Kong, China, IEEE, 2014.

[13] C. Forster, Z. Zhang, M. Gassner, et al., SVO: Semidirect visual odometry for monocular and multicamera systems, IEEE Transactions on Robotics, 33 (2017), 249-265.

[14] J. Fuentes-Pacheco, J. Ruiz-Ascencio and J. M. Rendón-Mancha, Visual simultaneous localization and mapping: A survey, Artificial Intelligence Review, 43 (2015), 55-81.

[15] D.-K. Gu, G.-P. Liu and G.-R. Duan, Robust stability of uncertain second-order linear time-varying systems, J. Franklin Inst., 356 (2019), 9881-9906. doi: 10.1016/j.jfranklin.2019.09.014.

[16] D.-K. Gu and D.-W. Zhang, Parametric control to second-order linear time-varying systems based on dynamic compensator and multi-objective optimization, Appl. Math. Comput., 365 (2020), 124681, 25 pp. doi: 10.1016/j.amc.2019.124681.

[17] D.-K. Gu and D.-W. Zhang, A parametric method to design dynamic compensator for high-order quasi-linear systems, Nonlinear Dynamics, 100 (2020), 1379-1400.

[18] C. Kerl, J. Sturm and D. Cremers, Dense visual SLAM for RGB-D cameras, 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, Tokyo, Japan, IEEE, 2013.

[19] D. H. Kim and J. H. Kim, Image-based ICP algorithm for visual odometry using an RGB-D sensor in a dynamic environment, Robot Intelligence Technology and Applications, Gwangju, Korea, Springer, 2013.

[20] D. H. Kim, S. B. Han and J. H. Kim, Visual odometry algorithm using an RGB-D sensor and IMU in a highly dynamic environment, Robot Intelligence Technology and Applications, Beijing, China, Springer, 2015.

[21] D. H. Kim and J. H. Kim, Effective background model-based RGB-D dense visual odometry in a dynamic environment, IEEE Transactions on Robotics, 32 (2016), 1565-1573.

[22] M. Labbe and F. Michaud, Appearance-based loop closure detection for online large-scale and long-term operation, IEEE Transactions on Robotics, 29 (2013), 734-745.

[23] S. Li and D. Lee, RGB-D SLAM in dynamic environments using static point weighting, IEEE Robotics and Automation Letters, 2 (2017), 2263-2270.

[24] B. Li, D. Yang and L. Deng, Visual vocabulary tree with pyramid TF-IDF scoring match scheme for loop closure detection, Acta Automatica Sinica, 37 (2011), 665-673.

[25] Y. Li, G. Zhang, F. Wang, et al., An improved loop closure detection algorithm based on historical model set, Robot, 37 (2015), 663-673.

[26] Z. L. Lin, G. L. Zhang, E. Yao, et al., Stereo visual odometry based on motion object detection in the dynamic scene, Acta Optica Sinica, 37 (2017), 187-195.

[27] M. Lourakis and X. Zabulis, Model-based pose estimation for rigid objects, International Conference on Computer Vision Systems, St. Petersburg, Russia, Springer, 2013.

[28] R. Mur-Artal, J. M. M. Montiel and J. D. Tardós, ORB-SLAM: A versatile and accurate monocular SLAM system, IEEE Transactions on Robotics, 31 (2015), 1147-1163.

[29] R. Mur-Artal and J. D. Tardós, ORB-SLAM2: An open-source SLAM system for monocular, stereo, and RGB-D cameras, IEEE Transactions on Robotics, 33 (2017), 1255-1262.

[30] D. Nistér, O. Naroditsky and J. Bergen, Visual odometry for ground vehicle applications, Journal of Field Robotics, 23 (2006), 3-20.

[31] Z. Peng, Research on Vision-Based Ego-Motion Estimation and Environment Modeling in Dynamic Environment, Ph.D. dissertation, Zhejiang University, Hangzhou, China, 2013.

[32] D. Scaramuzza and F. Fraundorfer, Visual odometry, IEEE Robotics & Automation Magazine, 18 (2011), 80-92.

[33] J. Sturm, N. Engelhard, F. Endres, et al., A benchmark for the evaluation of RGB-D SLAM systems, IEEE International Conference on Intelligent Robots and Systems, Vilamoura, Portugal, IEEE, 2012.

[34] Y. Sun, M. Liu and M. Q. H. Meng, Improving RGB-D SLAM in dynamic environments: A motion removal approach, Robotics and Autonomous Systems, 89 (2017), 110-122.

[35] W. Tan, H. Liu, Z. Dong, et al., Robust monocular SLAM in dynamic environments, IEEE International Symposium on Mixed and Augmented Reality, Adelaide, Australia, IEEE, 2013.

[36] G. Younes, D. Asmar, E. Shammas, et al., Keyframe-based monocular SLAM: Design, survey, and future directions, Robotics and Autonomous Systems, 98 (2017), 67-88.
