Introduction to Simultaneous Locomotion and Mapping
Dec 22, 2015
Introduction to Simultaneous Locomotion and Mapping
Overview
Definition of SLAM Why it's a hard problem Introduction to Kalman Filters Applying Kalman Filters to Solve SLAM
Problems Visualizations of Solutions A Nod to Other Approaches
Definition – Localization and Mapping
Simultaneous Localization and Mapping (SLAM) asks Where am I in the world reference frame? What are the obstacles and features of the world?
These problems are the Localization and Mapping problems
Localization : given a map, observations, and actions, determine the location of the robot
Mapping: given a trajectory and observations, determine a map of the environment
Odometry
The method of determining location solely from the internal sensors of the robot – no map!
Integrate actions to get new state.
Single-Disk Shaft EncoderA perforated disk is mounted on the shaftand placed between the emitter–detectorpair. As the shaft rotates, the holes in thedisk chop the light beam.
&
Uncertainty from Noise
However, our calculated trajectory is inaccurate due to noise factors Due to calibration, vibration, slippage, geometric
differences, The uncertainty grows with time.
Localization With Landmarks
We can use observations to reduce the uncertainty in our pose estimations.
Mapping
Complement problem to localization: We know our position, x[k], our history of positions, X[k-1], and noisy observations of the environment, Z[k], and we want to make a map of
the environment, M.
What is a map?
Set of features with relationships. Features
Occupancy grid Lines Anything identifiable by the robot
Relations Rigid-body transformation Topological
SLAM Combines Both
Unknown Map
Neither the map nor the location is known.
Formal Problem Statement
What is
P(x[k], m|Z[0:k], U[0:k], x[0])?
Problem Quantitiesx[k]: location of vehicle at time ku[k]: a control vector applied at k-1 to drive the vehicle from x[k-1] to x[k]m[i]: location of ith landmarkz[k]: observation of a landmark taken at time kX[0:k] : history of states {x[1],x[2], x[3], ..., x[k]}U[0:k] : history of control inputs {u[1], u[2], u[3], ..., u[k]}m: set of all landmarksZ[0:k] : history of all observations {z[1], z[2], ..., z[k]}
P(x[k], m|Z[0:k], U[0:k], x[0])
m[i]
m[i]
The “Holy Grail”
With SLAM, robots gain a major step toward autonomy. They can be placed at an unknown location, with unknown surroundings, and they can work out what's around them and where they are.
Difficulties
All our knowledge is probabilistic We have a “chicken and egg” problem:
If we knew the map, we could calculate our location If we knew our location, we could calculate the map
Kalman Filter: Overview
The Kalman Filter (KF) is a recursive algorithm to estimate the state of a process based on A model of how the process changes, A model of what observations we expect, and Discrete, noisy observations of the process.
Predictions of the state of the processare made based on a model.
Observations are made and our estimate is updated given thenew observation.
Foundations of Kalman Filter
The purpose of the KF is to calculate the distribution of the state at time k, given noisy observations.
However, filtering continuous distributions generates state distributions whose representation grows without bound.
An exception to this “rule” is the Gaussian distribution, which only needs a mean and variance to define the distribution.
Linear Gaussian Distributions
Under the continuous operations needed for KFs Conditioning
Bayes' rule
If our prior distribution is a Gaussian, and our transition models are linear Gaussian, then our calculated distribution is also Gaussian.
Conditioning
We can find P(Y) = Sum of P(Y,z) over all z. P(y,z) = 0.4, P(y, not z) = 0.2, :. P(y) = 0.6
And P(Y,z) = P(Y|z)P(z) from the product rule. So P(Y) = Sum of P(Y|z)P(z) For continuous distributions, this becomes
P(Y) = Integral of P(Y|z)P(z) over all zwhich is where we get
..but if our P(x) is only true given e[0:k], so is our result..
Bayes' Rule
We know that P(a and b) = P(a|b)P(b) =P(b and a) = P(b|a)P(a)
Therefore, P(b|a) = P(a|b)P(b)/P(a) With additional evidence, this becomes
P(b|a, e) = P(a|b,e)P(b|e)/P(a|e) Notice that this is a probability distribution, and
all values will sum to one – P(a|e) just acts as a normalizer.
This yields
KF Recursion
We want , and we have . Note that by using conditioning, we can calculate
as long as we know , which is referred to as the motion model.
Also, by the generalized Bayes' rule, we have
assuming we know , the sensor model.
Process Motion Model
The model of how the process changes is referred to as the motion model. A new state of the process is assumed to be a linear function of the previous state and the control input with Gaussian noise, p(w) ~ N(0,Q), added.
x[k] = Ax[k - 1] + Bu[k – 1] + w[k – 1]
where A and B are matrices capturing our process model.
Process Sensor Model
The sensor model is also required to be a linear Gaussian
z[k] = Hx[k] + v[k]
where H is a matrix that relates the state to the observation and p(v) ~ N(0,R).
Covariance
The multi-dimensional analog of variance:expected squared difference between elements and the mean.
It quantifies the uncertainty
of our estimate.
Kalman Filter Algorithm
The Kalman Filter operates on Gaussian distributions, and so only needs to calculate a mean, x, and a covariance, C.
Prediction stepx[k+1|k] = Ax[k] + Bu[k]C[k+1|k] = AC[k]AT + Q = cov(Ax[k],Ax[k]) + Q
Updatex[k+1|k+1] = x[k+1|k] + K(z[k+1] – Hx[k+1|k])C[k+1] = (I – KH)C[K+1|k]
K, Kalman Gain Factor
K is a measure of how to weight the prediction and the observation
A high value of K results in the measurement being trusted
A low value of K results in the prediction being trusted.
K is based on the covariancesK = P[k+1|k]HT(HP[k+1|k]HT + R)-1
Limit of K as R -> 0 = H-1
Limit of K as P[k+1|k] -> 0 = 0
Initial State
With the previous formulas, we can calculate an update. What's our initial state?
We need x[0], C[0], R, Q. X[0] is our initial guess at the process state C[0] is how certain we our of our guess R is the noise in the sensor model (measured) Q is the noise in the process model (estimated)
Quick Example
Estimate a slightly variable 1D signal by noisy observationx[k+1] = x[k] + wz[k+1] = x[k] + v
Therefore, A = 1, B = 0, H = 1. Let's assume Q = 0.00001 and R = 0.01
and x[0] = 0 and C[0] = 1
*Q and R often need to be tuned for performance
Result
Probabalistic SLAM
Is SLAM a Kalman Filter problem? We want , and we can assume
we know the previous state, . And with conditioning ...
gets us the predicted next state. And the generalized Bayes' rule integrates z[k].
So we need and .
Kalman Formulation for SLAM
looks exactly like the motion model from the Kalman Filter.
And if we incorporate the landmark locations into our system state, x[k], then is the same as the sensor model.
We assume landmarks don't move, sop[k+1] = p[k].
Our observations are of landmarks,z[k] = H
pp-H
rr[k] + w
where Hpp-H
rr[k] = Hx[k] and r is the robot pose.
SLAM Example
Suppose we have a robot moving in 1D that moves a certain velocity every timestep. Also assume the velocity has noise added,
Also assume the landmarks don't move,
This provides our motion model.
SLAM Example
Further, the observations we expect are
Thus we have A and H, respectively below, x[0], R, Q, and C[0] must be given, and B is [0].
Adding Landmarks
As more landmarks are discovered, they are added to the state.
All the problem matrices are updated and their dimension is increased.
Why this Works
Kalman Filters have been proven to solve the SLAM problem by Dissanyake, et. al.
This is due to the fact that the correlation between landmarks monotonically increases
Thus, relative positions of the landmarks become well known, and thus a map of the landmarks relative to eachother becomes well known.
Thus, the solution converges.
Limitations of Kalman Filter
Unimodal distribution cannot account for “choice.”
Computation is O(n2) where n is the size of the state vector.
Landmark association and loop closure. Requires a linear process model for the system.
This final weakness is addressed by the Extended Kalman Filter (EKM).
The assumption of linear motion models and sensor models is a serious limitation.
What if we let either of these models be non-linear?
x[k+1] = f(x[k], u[k], w[k])z[k] = h(x[k], v[k])
where f and h can be non-linear and v and w are zero mean Gaussian noise.
Extended Kalman Filter
EKM – Mean Calculation
Like the Kalman Filter, the EKM works by calculating a new distribution mean and covariance given the models and observations.
The means are propagated by the models, without the noise,
x[k + 1] = f(x[k], u[k], 0)z[k] = h(x[k], 0)
EKM – Covariances
The covariance estimates are based on a linearization of the models using Jacobians.
C[k + 1] = F[k]C[k]F[k] + W[k]Q[k]W[k]
where C is the covariance matrix, F[k] is the Jacobian of f with respect to the state variables and W[k] is the Jacobian of f with respect to the noise parameters, and Q is the covariance of the motion model's noise.
EKM – Gain
Finally, we need a new form for K
K[k] = C[k]J[k]T(J[k]P[k]J[k]T + V[k]R[k]V[k]T)-1
where C is our covariance, J is the Jacobian of h, the sensor model, with respect to the state variables, V is the Jacobian of the sensor model with respect to its noise, and R is the covariance of the noise in the sensor model.
EKM Example
A car model requires a non-linear motion model
dx = v * cos(theta)
dy = v * sin(theta)
dtheta = (v * tan(phi)) / L
Example Motion Model
Fixed steering angle,fixed velocity
Example Observation Model
Let's assume our observations are taken from an overhead camera.
z[k+1] = z[k] + v[k]
where v[k] is noise.
Prediction Step
Prediction of the mean is calculated from motion model.
x[k + 1|k] = f(x[k], u[k], 0) Jacobians needed to calculate the new
covariance.
Update Step
Predicted measurement given by the sensor model
z[k+1|k] = h(x[k], 0) Final estimate is
x[k+1] = x[k+1|k] + K(z[k] – z[k+1|k] This requires the gain,
K[k] = C[k]J[k]T(J[k]P[k]J[k]T + V[k]R[k]V[k]T)-1
Update Step cont.
The Gain requires two JacobiansK[k] = C[k]J[k]T(J[k]P[k]J[k]T + V[k]R[k]V[k]T)-1
All that's left is the final covariance, C[k+1]
C[k+1] = (I - K[k]J[k])C[k+1|k]
Result
With blue as the true trajectory, red as the observation, and green as the estimate.
EKM Issues
Landmark association and loop closure. Unimodal distribution cannot account for
“choice.” Computation is O(n2) where n is the size of the
state vector.
Landmark Association
How can the robot tell if it's looking at landmark m[i] or m[j]?
What if your landmarks are just laser ranges?
Constructing lines from Scans
Scan Matching
The robot scans, moves, and scans again. Short term motion noise results in features
having to be recognized.
Iterated Closest Point
Inputs: two point clouds Output: refined transformation. Method:
1. Associate points by the nearest neighbor criteria.
2. Estimate the parameters using a mean square cost function.
3. Transform the points using the estimated parameters.
4. Iterate (re-associate the points and so on).
Loops
ICP can handle small offsets, though it has high computational cost and is subject to local minima.
What about large offsets?
EKM Issues
Landmark association and loop closure. Unimodal distribution cannot account for
“choice.” Computation is O(n2) where n is the size of the
state vector.
Computation Cost
A common approach to attempt to cut down the computation cost, which is O(n2) where n is the size of the state space (and thus landmarks), is to only consider a local map.
These need to be stitchedtogether to achieve global navigation.
Particle Filter – Another Approach
The idea behind a particle filter is to use a number of samples to approximate the distributions.
Advantage: you don't need to assume linear Gaussian processes models, and could be faster
Idea one: generate particles uniformly with values [0, 1]; throw out any particle at x if value(x) > P(x)
Idea two: generate particles from a proposal distribution and assign each a weight.
Rao-Blackwellization
For SLAM problems, the state space is too large for particle filtering.
However, if our joint state P(x[1], m[1]) can be partitioned with the product rule P(m[1]|x[1])P(x[1]), and if P(m[1]|x[1]) can be calculated analytically, only P(x[1]) needs to be sampled.
This is called Rao-Blackwellization, and allows particle filtering for SLAM problems.
FastSLAM
FastSLAM is a particle filter approach to SLAM using Rao-Blackwellization.
Note that the joint SLAM problem can be factored
P(X[0:k], m|Z[0:k],U[0:k],x[0])* = P(m|X[0:k],Z[0:k)P(X[0:k]|Z[0:k],U[0:k],x[0])
*When conditioned on the trajectory, m is independent of U
FastSLAM
Our joint distribution at time k is {w[k]i,X[0:k]i,P(m|X[0:k]i, Z[0:k])}n
where X is a sample trajectory, w is the calculated weight, and P(m|X,Z) is our analytically calculated distribution of the landmarks.
FastSLAM: Recursive Calculations
For recursive step, assume prior estimates{w[k-1]i,X[0:k-1]i,P(m|X[0:k-1]i, Z[0:k-1])}n
1) A new pose is sampled from a proposal distribution. This could simply be the motion model.
x[k]i ~ P(x[k]|x[k-1]i, u[k]) 2) Assign each sample a weight
3) Use EKM on each particle to solve for maps.