Autonomous Mobile Robots Autonomous Systems Lab Zürich SLAM: Simultaneous Localization and Mapping "Position" Global Map Perception Motion Control Cognition Real World Environment Localization Path Environment Model Local Map
Autonomous Mobile Robots
Autonomous Systems LabZürich
SLAM: Simultaneous Localization and Mapping
"Position"
Global Map
Perception Motion Control
Cognition
Real WorldEnvironment
Localization
PathEnvironment Model
Local Map
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Today‟s lecture
Section 5.8 + some extras…
One of the fundamental problems in robotics: SLAM
EKF SLAM
Particle filter SLAM
GraphSLAM
Look into MonoSLAM
Lec.122
12 – Simultaneous Localization And Mapping
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Simultaneous Localization and Mapping
The SLAM problem:
How can a body navigate in a previously unknown environment while constantly building and updating a map of its workspace using
on board sensors only?
When is SLAM necessary?
When a robot must be truly autonomous (no human input)
When there is no prior knowledge about the environment
When we cannot place beacons (also in GPS-denied environments)
When the robot needs to know where it is
12 – Simultaneous Localization And Mapping
Lec.123
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Simultaneous Localization and Mapping
SLAM is significantly more difficult than all robotics problems discussed so far:
More difficult than pure localization: the map is unknown and has to be estimated along the way.
More difficult than mapping with known poses: the poses are unknown and have to be estimated along the way.
SLAM: one of the greatest challenges in probabilistic robotics
12 – Simultaneous Localization And Mapping
Lec.124
12 3.5
By hand: hard & costly(e.g. large environment) Automatic Map Building:
More challenging, but:
Automatic
The robot learns its environment
Can adapt to dynamic changes
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Features for SLAM
Can we track the motion of a camera/robot while it is moving?
Pick natural scene features to serve as landmarks (in most modern SLAM systems)
Range sensing (laser/sonar): line segments, 3D planes, corners
Vision: point features, lines, textured surfaces.
Key: features must be distinctive & recognizable from different viewpoints
Lec.125
12 – Simultaneous Localization And Mapping
Courtesy of Andrew J. Davison
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Map Building
1. Map Maintenance: Keeping track of
changes in the environment
e.g. disappearing
cupboard
- e.g. measure of belief of each
environment feature
- Generally assume static environment
2. Representing and Propagating Uncertainty
position of robot -> position of wall
position of wall -> position of robot
probability densities for feature positions
Map can become inconsistent due to
erroneous measurements / motion drift
?
12 – Simultaneous Localization And Mapping
Lec.126
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
How to do SLAMLec.12
7
12 – Simultaneous Localization And Mapping
Use internal representations for
the positions of landmarks (: map)
the camera parameters
Assumption: Robot‟s uncertainty at starting position is zero
A
B C
Start: robot has zero uncertainty
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Lec.128
12 – Simultaneous Localization And Mapping
On every frame:
Predict how the robot has moved
Measure
Update the internal representations
Start: camera has zero uncertaintyFirst measurement of feature A
How to do SLAM
A
B C
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Lec.129
12 – Simultaneous Localization And Mapping
On every frame:
Predict how the robot has moved
Measure
Update the internal representations
The robot observes a feature which is mapped with an uncertainty related to the exteroceptive sensor error model(aka measurement model)
How to do SLAM
A
B C
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Lec.1210
12 – Simultaneous Localization And Mapping
Robot moves forwards: uncertainty grows
On every frame:
Predict how the robot has moved
Measure
Update the internal representations
As the robot moves, its pose uncertainty increases (obeying the robot‟s motion model)
How to do SLAM
A
B C
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Lec.1211
12 – Simultaneous Localization And Mapping
Robot makes first measurements of B & C
On every frame:
Predict how the robot has moved
Measure
Update the internal representations
Robot observes two new features.
How to do SLAM
A
B C
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Lec.1212
12 – Simultaneous Localization And Mapping
On every frame:
Predict how the robot has moved
Measure
Update the internal representations
Their position uncertainty results from the combination of the measurement error with the robotpose uncertainty.
map becomes correlated with the robot position estimate.
How to do SLAM
Robot makes first measurements of B & C
A
B C
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Lec.1213
12 – Simultaneous Localization And Mapping
Robot moves again: uncertainty grows more
On every frame:
Predict how the robot has moved
Measure
Update the internal representations
How to do SLAM
Robot moves again and its uncertainty increases (motionmodel)
A
B C
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Robot moves again: uncertainty grows more
A
B C
Lec.1214
12 – Simultaneous Localization And Mapping
Robot re-measures A: “loop closure”
On every frame:
Predict how the robot has moved
Measure
Update the internal representations
Robot re-observes an old feature Loop closure detection
How to do SLAM
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Lec.1215
12 – Simultaneous Localization And Mapping
Robot re-measures A: “loop closure”uncertainty shrinks
On every frame:
Predict how the robot has moved
Measure
Update the internal representations
Robot updates its position: the resulting position estimate becomes correlated with the feature location estimates.
Robot‟s uncertainty shrinks and so does the uncertainty in the rest of the map
How to do SLAM
A
B C
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
SLAM: Probabilistic Formulation
Robot pose at time t : xt Robot path up to this time: {x0, x1,…, xt}
Robot motion between time t-1 and t : ut (control inputs / propriocept. sens.
readings) Sequence of robot relative motions: {u0, u1,…, ut}
The true map of the environment: {m0, m1,…, mN}
At each time t the robot makes measurements zi
Set of all measurements (observations): {z0, z1,…, zk}
The Full SLAM problem: estimate the posterior
p(x0:t ,m0:n | z0:k ,u0:t )
The Online SLAM problem: estimate the posterior
p(xt ,m0:n | z0:k ,u0:t )
Lec.1216
12 – Simultaneous Localization And Mapping
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
z13z0 z1 z2 z3 z4 z5 z6 z7 z8 z9z12
z15z14 z16z10 z11
SLAM: graphical representationLec.12
17
12 – Simultaneous Localization And Mapping
m0 m1 m2 m3 m4 m5 m6 m7 m8
x0 x1 x2 x3 . . .
u0 u1 u2 u3
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Approaches SLAM
Full graph optimization (bundle adjustment)
Eliminate observations & control-input nodes and solve for the constraints between poses and landmarks.
Globally consistent solution, but infeasible for large-scale SLAM
If real-time is a requirement, we need to sparsify this graph
Lec.1218
12 – Simultaneous Localization And Mapping
m0 m1 m2 m3 m4 m5 m6 m7 m8
x0 x1 x2 x3
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Approaches SLAM
Filtering
Eliminate all past poses: „summarize‟ all experience with respect to the last pose, using a state vector and the associated covariance matrix
We‟re going to look at filtering in more detail…
Lec.1219
12 – Simultaneous Localization And Mapping
m0 m1 m2 m3 m4 m5 m6 m7 m8
x3x0 x1 x2
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Approaches SLAM
Key-frames
Retain the most „representative‟ poses (key-frames) and their dependency links optimize the resulting graph
Example: PTAM [Klein & Murray, ISMAR 2007]
Lec.1220
12 – Simultaneous Localization And Mapping
m0 m1 m2 m3 m4 m5 m6 m7 m8
x0 x1 x2 x3
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
PTAM: Parallel Tracking and MappingLec.12
21
12 – Simultaneous Localization And Mapping
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
The Three SLAM paradigms
Some of the most important approaches to SLAM:
Extended Kalman Filter SLAM (EKF SLAM)
Particle Filter SLAM (FastSLAM)
GraphSLAM
12 – Simultaneous Localization And Mapping
Lec.1222
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
EKF SLAM: overview
Like the standard EKF for robot localization
EKF SLAM summarizes all past experience in an extended state vector yt
comprising of the robot pose xt and the position of all the features mi in the map, and an associated covariance matrix Pyt:
If we sense 2D line-landmarks, the size of yt is 3+2n
(and size of Pt : (3+2n)(3+2n) )
3 variables to represent the robot pose and
2n variables for the n line-landmarks with state components
Hence,
As the robot moves and makes measurements, yt and Pyt are updated using the standard EKF equations
11111
11111
11
..
........
..
..
,...
1
1
nnnn
n
n
t
mmmmxm
mmmmxm
xmxmxx
y
n
t
t
PPP
PPP
PPP
P
m
m
x
y
),( ii rT
nntttt rrYXy ],,...,,,,,[ 1100
12 – Simultaneous Localization And Mapping
Lec.1223
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
The predicted robot pose at time-stamp t is computed using the estimated pose at time-stamp t-1 and the odometric control input
During this step, the position of the features remains unchanged.
EKF Prediction Equations:
EKF SLAM: Prediction
b
SSb
SSSSb
SSSS
Y
X
Y
X
uxfx
lr
lrt
lr
lrt
lr
t
t
t
t
t
t
ttt )2
sin(2
)2
cos(2
ˆ
ˆ
ˆ
),(ˆ1
1
1
1
1
1
0
0
...
0
0
)2
sin(2
)2
cos(2
ˆ
...
ˆ
ˆ
...
ˆ
ˆ
ˆ
ˆ
ˆ
ˆ
1
1
1
1
0
0
1
1
0
0 b
SSb
SSSSb
SSSS
r
r
Y
X
r
r
Y
X
y
lr
lrt
lr
lrt
lr
n
n
t
t
t
n
n
t
t
t
t
T
utu
T
yyyy FQFFPFPtt
1
ˆ
12 – Simultaneous Localization And Mapping
Lec.1224
tx̂
1tx },{ rlt SSu
Based on the example of Section 5.2.4
: distance travelled by the left and right wheels resp.
: distance between the two robot wheels
rl SS ;
bOdometryfunction
Jacobians of f
Covariance at previous time-stamp
Covariance of noise associated to the motion
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
EKF SLAM: Comparison with EKF localization
EKF LOCALIZATION
The state is ONLY the robot
configuration:
The prediction function is:
),(ˆ1 ttt uxfx
b
SSb
SSSSb
SSSS
Y
X
Y
X
lr
lrt
lr
lrt
lr
t
t
t
t
t
t
)2
sin(2
)2
cos(2
ˆ
ˆ
ˆ
1
1
1
1
1
EKF SLAM
The state comprises of the robot
configuration and that of each feature :
The prediction function is:
T
tttt YXx ],,[
),(ˆ1 ttt uyfy
0
0
...
0
0
)2
sin(2
)2
cos(2
ˆ
...
ˆ
ˆ
...
ˆ
ˆ
ˆ
ˆ
ˆ
ˆ
1
1
1
1
0
0
1
1
0
0 b
SSb
SSSSb
SSSS
r
r
Y
X
r
r
Y
X
y
lr
lrt
lr
lrt
lr
n
n
t
t
t
n
n
t
t
t
t
T
utu
T
xxxx FQFFPFPtt
1
ˆ T
utu
T
yyyy FQFFPFPtt
1
ˆ
T
nntttt rrYXy ],,...,,,,,[ 1100
12 – Simultaneous Localization And Mapping
Lec.1225
txtx
tyim
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
EKF SLAM: Measurement prediction & Update
The application of the measurement model is the same as in EKF localization. The predicted observation of each feature is:
After obtaining the set of actual observations z0:n-1 the EKF state gets
updated:
where
),ˆ(ˆ
ˆˆ
iti
i
i
i mxhr
z
T
tINtyy
ntnnttt
KKPP
mxhzKyy
tt
ˆ
)),ˆ((ˆ1:01:01:0
1)(ˆ
ˆ
INyt
T
yIN
HPK
RHPH
t
t
12 – Simultaneous Localization And Mapping
Lec.1226
im
We need the predicted new pose to predict where each feature lies in measurement space
Jacobian of h Measurement noise
Kalman Gain Innovation Covariance
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
EKF SLAM: Correlations
At start, when the robot makes the first measurements, the covariance matrix is populated by assuming that these (initial) features are uncorrelated off-diagonal elements are zero.
When the robot starts moving & making new measurements, both the robot pose and features start becoming correlated.
Accordingly, the covariance matrix becomes dense.
11
22
11
00
0...000
0...000
..................
00...00
00...00
00...00
0
nn
nn
mm
mm
mm
mm
xx
P
P
P
P
P
P
12 – Simultaneous Localization And Mapping
Lec.1227
T
utu
T
yyyy FQFFPFPtt
1
ˆ
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
EKF SLAM: Correlations
Correlations arise as
the uncertainty in the robot pose is used to obtain the uncertainty of the observed features.
the feature measurements are used to update the robot pose.
Regularly covisible features become correlated and when their motion is coherent, their correlation is even stronger
Correlations very important for convergence: The more observations are made, the more the correlations between the features will grow, the better the solution to SLAM.
12 – Simultaneous Localization And Mapping
Lec.1228
Chli & Davison, ICRA 2009
Features moving coherently
Features moving incoherently
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Feature Based SLAM (ETH - ASL)
3D laser on board
corner features extracted from lines and corners attached to structures
12 – Simultaneous Localization And Mapping
Lec.1229
[Weingarten & Siegwart, IROS 2006]
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Feature Based SLAM (ETH - ASL)
12 – Simultaneous Localization And Mapping
Lec.1230
[Weingarten & Siegwart, IROS 2006]
3D laser on board
corner features extracted from lines and corners attached to structures
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Drawbacks of EKF SLAM
The state vector in EKF SLAM is much larger than the state vector in EKF localization
Newly observed features are added to the state vector The covariance matrix grows quadratically with the no. features computationally expensive for large-scale SLAM.
Approach to attack this: sparsify the structure of the covariance matrix (via approximations)
12 – Simultaneous Localization And Mapping
Lec.1231
Chli & Davison, ICRA 2009
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Particle Filter SLAM
Particle filters: mathematical models that represent probability distributions as a set of discrete particles which occupy the state space.
Particle = a point estimate of the state with an associated weight(all weights should add up to 1)
Steps in Particle Filtering:
Predict:Apply motion prediction to each particle
Make measurements
Update: for each particle:
• Compare the particle‟s predictions of measurements with the actual measurements
• Assign weights such that particles with good predictions have higher weight
Normalize weights of particles to add up to 1
Resample: generate a new set of M particles which all have equal weights 1/M reflecting the probability density of the last particle set.
12 – Simultaneous Localization And Mapping
Lec.1232
},{ iti wypi
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Particle Filter SLAM
FastSLAM approach [Montemerlo et al., 2002]
• Solve state posterior using a Rao-Blackwellized Particle Filter
• Each landmark estimate is represented by a 2x2 EKF.
• Each particle is “independent” (due the factorization) from the others and maintains the estimate of M landmark positions.
12 – Simultaneous Localization And Mapping
Lec.1233
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
FastSLAM example
12 – Simultaneous Localization And Mapping
Lec.1234
[Montemerlo et al., 2002]
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
GraphSLAM [Thrun and Montemerlo, IJRR 2006]
12 – Simultaneous Localization And Mapping
Lec.1235
GraphSLAM basic idea: SLAM can be interpreted as a sparse graph of nodes and constraints between nodes.
nodes: robot locations and map-feature locations
edges: constraints between
consecutive robot poses (given by the odometry input u) and
robot poses and the features observed from these poses.
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Key property: constraints are not to be thought as rigid constraints but as soft constraints constraints acting like springs
Solve full SLAM by relaxing these constraints get the best estimate of the robot path and
the environment map by computing the state of minimal energy of this network
12 – Simultaneous Localization And Mapping
Lec.1236 GraphSLAM [Thrun and Montemerlo, IJRR 2006]
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
In GraphSLAM style techniques
the update-time is constant and
the required memory is linear with the no. features
… while in EKF SLAM, both the update-time and the storage of the covariance matrix scales quadratically with the no. features
However, the final graph optimization can be computationally very costly if the robot path is long.
12 – Simultaneous Localization And Mapping
Lec.1237 GraphSLAM vs. EKF SLAM
Autonomous Mobile Robots
Autonomous Systems LabZürich
Example of EKF SLAM
MonoSLAM: Real-Time Single Camera SLAM
[Davison, Reid, Molton and Stasse. PAMI 2007]
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Single Camera SLAMLec.12
39
12 – Simultaneous Localization And Mapping
• Images = information-rich snapshots of a scene
• Compactness + affordability of cameras
• HW advances
Visionfor SLAM
SLAM using a single, handheld camera:
• Hard but … (e.g. cannot recover depth from 1 image)
• very applicable, compact, affordable, …
Image Courtesy of G. Klein
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
From SFM to SLAM
12 – Simultaneous Localization And Mapping
Lec.1240
P1
P2
P3
Structure from Motion (SFM):
Take some images of the object/scene to reconstruct
Features (points, lines, …) are extracted from all frames and matched among them
Process all images simultaneously
Optimisation to recover both: camera motion and 3D structureup to a scale factor (similar to GraphSLAM)
Not real-time
[Agrawal et al., ICCV 2009]
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
MonoSLAMLec.12
41
12 – Simultaneous Localization And Mapping
camera view internal SLAM map
Can we track the motion of a hand-held camera while it is moving?i.e. online
SLAM using a single camera, grabbing frames at 30Hz
Ellipses (in camera view) and Bubbles (in map view) represent uncertainty
Courtesy of Andrew J. Davison
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Representation of the world
The belief about the state of the world x is approximated with a single,
multivariate Gaussian distribution:
Lec.1242
12 – Simultaneous Localization And Mapping
Mean(state vector)
Covariance matrix
Camera state
: Position [3 dim.]
: Orientation using quaternions [4 dim.]
: Linear velocity [3 dim.]
: Angular velocity [3 dim.]
Landmark’s statee.g. 3D position for
point-features
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Motion & Probabilistic Prediction
How has the camera moved from frame t-1 to frame t?
The camera is hand-held no odometry or control input data
Use a motion model
But how can we model the unknown intentions of a human carrier?
Davison et al. use a constant velocity motion model:
Lec.1243
12 – Simultaneous Localization And Mapping
“on average we expect undetermined accelerations occur with a Gaussian profile”
),(ˆ1 ttt uxfx
T
utu
T
ytyt FQFFPFP 1ˆ
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Constant Velocity Model
The constant velocity motion model, imposes a certain smoothness on the camera
motion expected.
In each time step, the unknown linear and angular accelerations (characterized by
zero-mean Gausian distr.) cause an impulse of velocity:
12 – Simultaneous Localization And Mapping
Lec.1244
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Motion & Probabilistic Prediction
Based on the predicted new camera pose predict which known features will be visible and where they will lie in the image
Use measurement model h to identify the predicted location of each
feature and an associated search region (defined in the corresponding diagonal block of )
Essentially: project the 3D ellipsoids in image space
Lec.1245
12 – Simultaneous Localization And Mapping
),ˆ(ˆitii yxhz
RHPH T
tIN ˆ
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Measurement & EKF update
Search for the known feature-patches inside their corresponding search regions to get the set of all observations
Update the state using the EKF equations
where:
Lec.1246
12 – Simultaneous Localization And Mapping
T
tINttt
ntnnttt
KKPP
yxhzKxx
ˆ
)),ˆ((ˆ1:01:01:0
1)(ˆ
ˆ
INtt
T
tIN
HPK
RHPH
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Application: HPR-2 Humanoid at JRL, AIST, Japan
12 – Simultaneous Localization And Mapping
Lec.1247
• Small circular loop within a large room
• No re-observation of „old‟ features until closing of large loop
Courtesy of Andrew J. Davison
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Real-time Single Camera SLAM systemsLec.12
48
12 – Simultaneous Localization And Mapping
MonoSLAM is computationally expensive with increasing no. features
not ready yet to leave the lab & perform everyday tasks
MonoSLAM[Davison et al. 2007]
PTAM[Klein, Murray 2008]
Graph-SLAM[Eade, Drummond 2007]
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
SLAM challengesLec.12
49
12 – Simultaneous Localization And Mapping
• Handle larger amounts of data more effectively
• Competing goals:
PRECISIONEFFICIENCY
• Fast motion
• Large scales
• Robustness
• Rich maps
• Low computation for embedded apps
Ishikawa Komuro Lab.
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Essential components of a scalable SLAM systemLec.12
50
12 – Simultaneous Localization And Mapping
Map management &optimisation
1
2 3
Mapping & loop-closure
detection
Robust local motion estimation
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Live Dense ReconstructionLec.12
51
12 – Simultaneous Localization And Mapping
During live camera tracking, perform dense per-pixel surface reconstruction
Relies heavily on GPU processing for dense image matching
[Newcombe, Davison CVPR 2010 ]