| ICP | Besl & McKay 1992 | — |
| DTAM | Newcombe 2011 | — |
| KinectFusion | Newcombe 2011 | GPGPU, Tracking (project depth → 3D, surface normal, coarse-to-fine ICP), Mapping (volumetric integration, TSDF), Robust to small scene changes, Cannot model deformation, Map growth cubic, Room-size only |
| Double Window Optimisation | Strasdat 2011 | — |
| Kintinuous | Whelan 2012 | Volume shift, Geometric, Photometric, DBoW + SURF, Optimisation, Loop closure |
| RGBD-SLAM-V2 | Endres 2013 | Tracking (colour image, visual features, depth image, point cloud, transformation), Mapping (OctoMap 2013) |
| SLAM++ | Salas-Moreno 2013 | Object-oriented SLAM |
| DVO | Kerl 2013 | Keyframe, Depth, Direct method, Optimisation, Loop closure |
| RTAB-Map | Labbé 2014 | Loop closure, Map merge, Multi-session memory management |
| MRS-Map | Stückler 2014 | — |
| ElasticFusion | Whelan 2015 | Active: frame-to-model tracking (photometric + geometric), joint optimisation, fused surfel-based model reconstruction · Inactive: local loop closure (model-to-model local surface, submodel separation), global loop closure (randomised fern encoding, non-rigid space deformation) |
| DynamicFusion | Newcombe 2015 | 6D motion field, Deformable scene |
| ORB-SLAM2 | Mur-Artal 2016 | Bundle adjustment, Sparse reconstruction |
| BundleFusion | Dai 2016 | Local-to-global optimisation, Sparse RGB feature, Coarse global pose estimation, Fine pose refinement (geometric + photometric) |
| SemanticFusion | McCormac 2016 | Deep Learning CNN, Deep Semantic SLAM |
| InfiniTAM v3 | Prisacariu 2017 | Tracking (scene raycast, depth image, RGB image), Relocalisation (random ferns), Mapping (TSDF reconstruction, voxel hashing, surfel reconstruction) |
| Fusion++ | McCormac & Clark 2018 | Deep Learning CNN, Mask-RCNN instance segmentation, Object-level SLAM, No prior, Object-level TSDF reconstruction |
| PointFusion / DenseFusion | Xu 2018 / Wang 2019 | RGB-D object pose estimation, Tracking, Relocalisation, Loop closure detection |
| BAD SLAM | Schöps 2019 | Direct bundle adjustment, Deep Semantic SLAM |
| RTAB-Map v2 | Labbé 2019 | RGB-D/LiDAR, Light-source detection (2016) |
| MoreFusion | Wada & Sucar 2020 | DL instance segmentation, Object-level volumetric fusion, Volumetric pose prediction, 3D scene reconstruction, Collision-based refinement, Semantic SLAM, Object pose estimation, CAD object fitting |
| NodeSLAM | Sucar & Wada 2020 | Occupancy VAE, Object-level SLAM (→ also in Level 5 Latent Representation) |
| Kimera / 3D Dynamic Scene Graph | Rosinol 2020 | Kimera-VIO, Kimera-Mesher, Kimera-PGMO, Kimera-Semantics, Kimera-DSG |
| DSP-SLAM | Wang (UCL) 2021 | DeepSDF shape prior + ORB-SLAM2, object-level dense reconstruction (mono/stereo/LiDAR) |
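
Several entries above (KinectFusion, InfiniTAM v3, Fusion++) list TSDF-based volumetric integration as their mapping step. As a rough illustration of that idea, the sketch below fuses depth measurements into voxels along a single camera ray using a weighted running average of truncated signed distances, then recovers the surface at the zero crossing. This is a deliberately simplified 1-D version under stated assumptions: real systems update a 3-D voxel grid on the GPU, and the names `Voxel`, `integrate_depth`, `extract_surface`, `trunc`, and `max_weight` are hypothetical, not taken from any of the listed codebases.

```python
from dataclasses import dataclass


@dataclass
class Voxel:
    tsdf: float = 1.0    # truncated signed distance, normalised to [-1, 1]
    weight: float = 0.0  # accumulated observation weight (0 = never observed)


def integrate_depth(voxels, voxel_depths, measured_depth,
                    trunc=0.1, max_weight=64.0):
    """Fuse one depth measurement into the voxels along a single camera ray.

    voxels:        list of Voxel, ordered from near to far
    voxel_depths:  distance of each voxel centre from the camera (metres)
    trunc:         truncation band around the surface (metres, assumed value)
    """
    for voxel, z in zip(voxels, voxel_depths):
        sdf = measured_depth - z           # positive in front of the surface
        if sdf < -trunc:
            continue                       # far behind the surface: occluded, skip
        d = min(1.0, sdf / trunc)          # truncate and normalise
        w_new = 1.0                        # constant per-observation weight
        # Weighted running average of the signed distance.
        voxel.tsdf = (voxel.weight * voxel.tsdf + w_new * d) / (voxel.weight + w_new)
        voxel.weight = min(voxel.weight + w_new, max_weight)


def extract_surface(voxels, voxel_depths):
    """Return the depth of the first front-to-back zero crossing, or None."""
    for i in range(len(voxels) - 1):
        a, b = voxels[i], voxels[i + 1]
        if a.weight > 0 and b.weight > 0 and a.tsdf > 0 >= b.tsdf:
            t = a.tsdf / (a.tsdf - b.tsdf)  # linear interpolation factor
            return voxel_depths[i] + t * (voxel_depths[i + 1] - voxel_depths[i])
    return None


if __name__ == "__main__":
    depths = [i * 0.05 for i in range(41)]  # voxel centres from 0 m to 2 m
    ray = [Voxel() for _ in depths]
    for measurement in (1.02, 0.98):        # two noisy readings of one surface
        integrate_depth(ray, depths, measurement)
    print(extract_surface(ray, depths))
```

Averaging in the volume rather than averaging the depth images directly is what lets noisy frames from different viewpoints converge on one surface. The cap on `weight` keeps the model responsive to small scene changes, and the dense grid is also why map memory grows cubically with scene size.
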

📘 Study Roadmap: Visual-SLAM (Beginner → Master)

- Level 1: Beginner
  - Programming
  - Mathematics
  - Projective Geometry
  - Camera Device
  - Image Data
- Level 2: Getting Familiar with SLAM
  - Programming
  - Image Processing
  - Local Feature Matching
  - Global Feature Matching
  - Feature Tracking
  - Multiple View Geometry
  - Outlier Rejection
  - Least Squares Optimisation
  - Motion Model
  - Observation Model
  - Factor Graph Optimisation
  - Mapping
  - Sensors
  - Evaluation
  - Next Levels
    - Monocular SLAM · VIO/VINS · Stereo SLAM · Visual-LiDAR Fusion · RGB-D SLAM · Collaborative SLAM · Deep SLAM/Localization
- Level 3: Monocular Visual-SLAM
  - Key Concepts
  - Feature-based SLAM
  - Direct SLAM
  - Hybrid (Feature + Direct)
  - Learning-based SLAM
  - Foundation Model SLAM
  - SfM Tools
  - Neural Representation SLAM
    - NeRF-based
    - 3DGS-based
  - Semantic / Language-Grounded SLAM
- Level 4: RGB-D Visual-SLAM
  - RGB-D Camera Devices
  - GPGPU Programming
  - Systems
- Level 5: Applying Deep Learning
  - A. Deep Frontend — Perception
    - Feature Detection & Matching
    - Depth Estimation
    - Optical Flow & Scene Flow
    - Camera Pose Regression & Relocalization
    - Object Detection & Segmentation for SLAM
  - B. Deep Backend — Optimization
    - Differentiable Bundle Adjustment
    - Certifiably Optimal Algorithms
    - Gaussian Belief Propagation & Graph Processors
  - C. End-to-End Deep VO / SLAM Systems
    - Self-supervised & Learned VO
    - Latent Representation SLAM
    - Neural Rendering (reference)
  - D. Scene Understanding
    - Benchmarks & Foundations
    - 3D Scene Graph
- Level 6: VIO / VINS
  - Key Concepts
  - Foundations
  - Filter-based
  - Optimization-based
- Level 7: World Models & Spatial AI
  - World Models
  - Generative 3D
  - Vision-Language Models (VLM)
  - Vision-Language-Action Models (VLA)
  - Resources
- Level 8: Stereo SLAM
  - Key Concepts
  - Systems
- Level 9: Collaborative / Multi-Robot SLAM
  - Key Concepts
  - Systems
- Level 10: LiDAR & Visual-LiDAR Fusion SLAM
  - Key Concepts
  - LiDAR / LiDAR-Inertial SLAM
  - Visual-LiDAR Fusion SLAM
  - Resources
- Level 11: Event Camera SLAM
  - Key Concepts
  - Foundations
  - Systems