Sensor Fusion, 2014 Lecture 5: 1
Lecture 5:

Whiteboard:
- Derivation framework for KF, EKF, UKF

Slides:
- Kalman filter summary: main equations, robustness, sensitivity, divergence monitoring, user aspects
- Nonlinear transforms revisited
- Application to derivation of EKF and UKF
- User guidelines and interpretations
Sensor Fusion, 2014 Lecture 5: 2
Lecture 4: Summary
- Detection problems as hypothesis tests:
  H0: y = e,
  H1: y = x + e = h(x) + e.
- Neyman-Pearson's lemma: T(y) = p_e(y - h(x0)) / p_e(y) maximizes PD for a given PFA (best ROC curve).
- Intuitive work flow of the nonlinear filter:
  - MU: estimation from y_k = h(x_k) + e_k and fusion with xhat_{k|k-1}
  - TU: nonlinear transformation z = f(x_k) and diffusion from the motion model

z = simulate(m, 20);
xhat1 = kalman(m, z, 'alg', 1);  % stationary KF
xhat2 = kalman(m, z, 'alg', 4);  % smoother
xplot2(z, xhat1, xhat2, 'conf', 90, [1 2])
Sensor Fusion, 2014 Lecture 5: 7
Covariance illustrated as confidence ellipsoids in 2D plots or confidence bands in 1D plots.

xplot(z, xhat1, xhat2, 'conf', 99)
Sensor Fusion, 2014 Lecture 5: 8
Tuning the KF
- The SNR ratio ||Q||/||R|| is the most crucial; it sets the filter speed. Note the difference between the real system and the model used in the KF.
- Recommendation: fix R according to the sensor specification/performance, and tune Q (motion models are anyway subjective approximations of reality).
- High SNR in the model gives a fast filter that is quick to adapt to changes/maneuvers, but with larger uncertainty (small bias, large variance).
- Conversely, low SNR in the model gives a slow filter that is slow to adapt to changes/maneuvers, but with small uncertainty (large bias, small variance).
- P0 reflects the belief in the prior x_1 ∈ N(xhat_{1|0}, P0). It is possible to choose P0 very large (and xhat_{1|0} arbitrary) if no prior information exists.
- Tune the covariances in large steps (orders of magnitude)!
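The speed/variance trade-off above can be illustrated with a minimal scalar sketch (a hypothetical random-walk model, not from the slides): iterating the Riccati recursion to its fixed point shows that a larger Q/R ratio gives a larger steady-state Kalman gain, i.e. a faster filter.

```python
import numpy as np

# Hypothetical scalar model: x[k+1] = x[k] + v[k], y[k] = x[k] + e[k],
# with Var(v) = Q and Var(e) = R.
def steady_state_gain(Q, R, iters=500):
    """Iterate the scalar Riccati recursion to its fixed point."""
    P = 1.0
    for _ in range(iters):
        P = P - P**2 / (P + R) + Q   # measurement update, then time update
    return P / (P + R)               # steady-state Kalman gain K = P/(P+R)

K_fast = steady_state_gain(Q=1.0,  R=1.0)   # high SNR: large gain, fast filter
K_slow = steady_state_gain(Q=0.01, R=1.0)   # low SNR: small gain, slow filter
```

Only the ratio Q/R matters for the gain, which is why the slide recommends fixing R and tuning Q.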
Sensor Fusion, 2014 Lecture 5: 9
Optimality properties
- For a linear model, the KF provides the WLS solution.
- The KF is the best linear unbiased estimator (BLUE).
- It is the Bayes optimal filter for a linear model when x_0, v_k, e_k are Gaussian variables:

  x_{k+1} | y_{1:k} ∈ N(xhat_{k+1|k}, P_{k+1|k})
  x_k | y_{1:k} ∈ N(xhat_{k|k}, P_{k|k})
  ε_k ∈ N(0, S_k)
Sensor Fusion, 2014 Lecture 5: 10
Robustness and sensitivity
The following concepts are relevant for all filtering applications, but they are most explicit for the KF.
- Observability.
- Divergence tests: monitor performance measures and restart the filter after divergence.
- Outlier rejection: monitor sensor observations.
- Bias error: an incorrect model gives bias in the estimates.
- Sensitivity analysis: an uncertain model contributes to the total covariance.
- Numerical issues: may give complex estimates.
Sensor Fusion, 2014 Lecture 5: 11
Observability
1. Snapshot observability if H_k has full rank. WLS can be applied to estimate x.
2. Classical observability for the time-invariant and time-varying cases:

   O = [H; HF; HF^2; ...; HF^{n-1}],

   O_k = [H_{k-n+1};
          H_{k-n+2} F_{k-n+1};
          H_{k-n+3} F_{k-n+2} F_{k-n+1};
          ...;
          H_k F_{k-1} ... F_{k-n+1}].

3. The covariance matrix P_{k|k} extends the observability condition by weighting with the measurement noise and forgetting old information according to the process noise. Thus, (the condition number of) P_{k|k} is the natural indicator of observability!
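As a sketch of point 2, the stacked observability matrix can be built and rank-tested numerically (a hypothetical constant-velocity model with measured position is assumed here):

```python
import numpy as np

# Hypothetical constant-velocity model: x = [position; velocity],
# only position is measured.
F = np.array([[1.0, 1.0],
              [0.0, 1.0]])
H = np.array([[1.0, 0.0]])

# Classical observability matrix O = [H; HF; ...; HF^{n-1}].
n = F.shape[0]
O = np.vstack([H @ np.linalg.matrix_power(F, i) for i in range(n)])
rank = np.linalg.matrix_rank(O)   # full rank => the pair (F, H) is observable
```

Here the rank equals n = 2, so the velocity is observable through the position measurements.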
Sensor Fusion, 2014 Lecture 5: 12
Divergence tests
When is ε_k ε_k^T significantly larger than its computed expected value S_k = E(ε_k ε_k^T) (note that ε_k ∈ N(0, S_k))?

Principal reasons:
- Model errors.
- Sensor model errors: offsets, drifts, incorrect covariances, scaling factor in all covariances.
- Sensor errors: outliers, missing data.
- Numerical issues.

In the first two cases, the filter has to be redesigned. In the last two cases, the filter has to be restarted.
Sensor Fusion, 2014 Lecture 5: 13
Outlier rejection
Let H0: ε_k ∈ N(0, S_k). Then

T(y_k) = ε_k^T S_k^{-1} ε_k ∼ χ²(dim(y_k))

if everything works fine and there is no outlier. If T(y_k) > h_PFA, this is an indication of an outlier, and the measurement update can be omitted.

In the case of several sensors, each sensor i should be monitored for outliers:

T(y_k^i) = (ε_k^i)^T S_k^{-1} ε_k^i ∼ χ²(dim(y_k^i)).
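A minimal sketch of the gating test, with the χ² threshold hard-coded for dim(y) = 2 (9.21 is the 99% quantile of χ²(2)); the function name is ours, not from any toolbox:

```python
import numpy as np

def is_outlier(eps, S, h):
    """Chi-square gating: T(y) = eps^T S^-1 eps > h flags an outlier,
    in which case the measurement update can be skipped."""
    T = float(eps @ np.linalg.solve(S, eps))
    return T > h

S = np.eye(2)
h = 9.21                                              # 99% quantile of chi2(2)
flag_small = is_outlier(np.array([1.0, 1.0]), S, h)   # T = 2: accept
flag_large = is_outlier(np.array([4.0, 4.0]), S, h)   # T = 32: reject
```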
Sensor Fusion, 2014 Lecture 5: 14
Divergence monitoring
Related to outlier detection, but performance is monitored over a longer time horizon.

One way to modify the chi-square test into a Gaussian test, using the central limit theorem:

T = (1/N) Σ_{k=1}^N (1/dim(y_k)) ε_k^T S_k^{-1} ε_k ∼ N(1, 2 / Σ_{k=1}^N dim(y_k)).

If

(T - 1) sqrt(Σ_{k=1}^N dim(y_k) / 2) > h_PFA,

filter divergence can be concluded, and the filter must be restarted. Instead of all data, a long sliding window or an exponential window (forgetting factor) can be used.
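A sketch of the statistic on simulated, consistent innovations (all numbers are hypothetical): for a healthy filter, T should be close to 1 and the normalized statistic approximately N(0, 1).

```python
import numpy as np

# Simulated innovations from a consistent filter: eps_k ~ N(0, S),
# so eps_k^2 / S averages to 1.
rng = np.random.default_rng(0)
N, d = 2000, 1                 # N innovations, each of dimension d
S = 1.0
eps = rng.normal(0.0, np.sqrt(S), N)

T = np.mean(eps**2 / S) / d                  # normalized test statistic
stat = (T - 1.0) * np.sqrt(N * d / 2.0)      # approx N(0, 1) under H0
```

Comparing `stat` against a Gaussian threshold h_PFA implements the divergence test above.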
Sensor Fusion, 2014 Lecture 5: 15
Sensitivity analysis: parameter uncertainty
Sensitivity analysis can be done with respect to uncertainparameters with known covariance matrix using for instance Gaussapproximation formula.Assume F (θ),G (θ),H(θ),Q(θ),R(θ) have uncertain parameters θwith E(θ) = θ and Cov(θ) = Pθ.The state estimate xk is a stochastic variable with four stochasticsources, vk , ek , x1 at one hand, and θ on the other hand. . The lawof total variance (Var(X ) = EVar(X |Y ) + VarE(X |Y )) and Gaussapproximation formula (Var(h(Y )) ≈ h′Y (Y )Var(Y )(h′Y (Y ))T )gives
Cov(xk|k) ≈ Pk|k +dxk|k
dθPθ
(dxk|k
dθ
)T
.
The gradient dxk|k/dθ can be computed numerically bysimulations.
Sensor Fusion, 2014 Lecture 5: 16
Numerical issues
Some simple fixes if problems occur:
- Assure that the covariance matrix is symmetric: P = 0.5*P + 0.5*P'.
- Assure that the covariance matrix is positive definite by setting negative eigenvalues in P to zero or small positive values.
- Avoid singular R = 0, even for constraints.
- Dithering: increase Q and R if needed. This can account for all kinds of model errors.
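The first two fixes can be sketched in a few lines (the eigenvalue floor 1e-9 is an arbitrary choice):

```python
import numpy as np

def repair_covariance(P, floor=1e-9):
    """Symmetrize P and clip negative eigenvalues to a small positive floor."""
    P = 0.5 * (P + P.T)              # enforce symmetry
    w, V = np.linalg.eigh(P)         # eigendecomposition of the symmetric part
    w = np.maximum(w, floor)         # clip eigenvalues from below
    return V @ np.diag(w) @ V.T

P_bad = np.array([[1.0, 2.0],
                  [2.0, 1.0]])       # indefinite: eigenvalues 3 and -1
P_ok = repair_covariance(P_bad)
```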
Sensor Fusion, 2014 Lecture 5: 17
EKF and UKF
Theory from Chapter 8:
- Nonlinear transformations.
- Details of the EKF algorithms.
- Numerical methods to compute the Jacobian and Hessian in the Taylor expansion.
- An alternative EKF version without the Riccati equation.
- The unscented Kalman filter (UKF).

Applications from Chapter 16:
- Automotive applications: yaw rate, roll angle, friction estimation.
- Rocket application: integration of IMU and GPS.
Sensor Fusion, 2014 Lecture 5: 18
Non-linear transformations
z = g(x) = g(xbar) + g'(xbar)(x - xbar) + (1/2)(x - xbar)^T g''(ξ)(x - xbar),

where the last term is the rest term r(x; xbar, g''(ξ)).

The rest term is negligible and the EKF works fine if
- the model is almost linear,
- or the SNR is high, so that ||x - xbar|| can be considered small.

The size of the rest term can be approximated a priori. Note: the size may depend on the choice of state coordinates!

If the rest term is large, use either of
- the second order compensated EKF, which compensates for the mean and covariance of r(x; xbar, g''(ξ)) ≈ r(x; xbar, g''(xbar)),
- the unscented KF (UKF).
Sensor Fusion, 2014 Lecture 5: 19
TT1: first order Taylor approximation
Gauss' approximation formula:

x ∈ N(xbar, P) → N(g(xbar), [g'_i(xbar) P (g'_j(xbar))^T]_{ij}) = N(g(xbar), g'(xbar) P (g'(xbar))^T).

Here [A]_{ij} means element i, j of the matrix A. This is used in EKF1 (EKF with first order Taylor expansion), and leads to a KF where nonlinear functions are approximated by their Jacobians.

Compare with the linear transformation rule:

z = Gx, x ∈ N(xbar, P) → z ∈ N(G xbar, G P G^T).

Note that G P G^T can be written [G_i P G_j^T]_{ij}, where G_i denotes row i of G.
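A sketch of TT1 with a central-difference Jacobian (the function names are ours); for a linear g(x) = Gx the rule is exact, which gives a convenient sanity check:

```python
import numpy as np

def jacobian(g, x, h=1e-6):
    """Central-difference Jacobian of g at x."""
    x = np.asarray(x, dtype=float)
    cols = []
    for i in range(x.size):
        dx = np.zeros_like(x)
        dx[i] = h
        cols.append((g(x + dx) - g(x - dx)) / (2 * h))
    return np.column_stack(cols)

def tt1(g, xbar, P):
    """First order Taylor (Gauss approximation): N(g(xbar), g' P g'^T)."""
    G = jacobian(g, xbar)
    return g(xbar), G @ P @ G.T

# Linear sanity check: for g(x) = Gx, TT1 must reproduce N(G xbar, G P G^T).
G = np.array([[1.0, 2.0], [0.0, 1.0]])
mu, Pz = tt1(lambda x: G @ x, np.array([1.0, 1.0]), np.eye(2))
```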
Sensor Fusion, 2014 Lecture 5: 20
TT2: second order Taylor approximation
x ∈ N(xbar, P) → N(g(xbar) + (1/2)[tr(g''_i(xbar) P)]_i, [g'_i(xbar) P (g'_j(xbar))^T + (1/2) tr(P g''_i(xbar) P g''_j(xbar))]_{ij})

This is used in EKF2 (EKF with second order Taylor expansion), and leads to a KF where nonlinear functions are approximated by their Jacobians and Hessians.

The UKF tries to do this approximation numerically, without forming the Hessian g''(xbar) explicitly. This reduces the n_x^5 complexity of [tr(P g''_i(xbar) P g''_j(xbar))]_{ij} to n_x^3 complexity.
Sensor Fusion, 2014 Lecture 5: 21
MC: Monte Carlo sampling
Generate N samples, transform them, and fit a Gaussian distribution:

x^(i) ∈ N(xbar, P),
z^(i) = g(x^(i)),
μ_z = (1/N) Σ_{i=1}^N z^(i),
P_z = (1/(N-1)) Σ_{i=1}^N (z^(i) - μ_z)(z^(i) - μ_z)^T.

Not commonly used in nonlinear filtering, but a valid and solid approach!
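A sketch of the MC transform on a hypothetical polar-to-Cartesian example (the distribution parameters are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

def mc_transform(g, xbar, P, N=100_000):
    """Monte Carlo transform: sample, push through g, fit a Gaussian."""
    x = rng.multivariate_normal(xbar, P, size=N)   # x^(i) ~ N(xbar, P)
    z = np.apply_along_axis(g, 1, x)               # z^(i) = g(x^(i))
    mu = z.mean(axis=0)
    Pz = np.cov(z, rowvar=False)                   # 1/(N-1) normalization
    return mu, Pz

# Hypothetical range/bearing to Cartesian transformation:
g = lambda x: np.array([x[0] * np.cos(x[1]), x[0] * np.sin(x[1])])
mu, Pz = mc_transform(g, np.array([10.0, 0.0]), np.diag([0.1, 0.01]))
```

Note that the MC mean falls slightly short of 10 in the first component: the curvature of the transformation biases the mean, which is exactly what the rest term in TT2 accounts for.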
Sensor Fusion, 2014 Lecture 5: 22
UT: the unscented transform

At first sight, similar to MC: generate 2 n_x + 1 so-called sigma points, transform these, and fit a Gaussian distribution:

x^0 = xbar,
x^{±i} = xbar ± sqrt(n_x + λ) P^{1/2}_{:,i}, i = 1, 2, ..., n_x,
z^i = g(x^i),

E(z) ≈ (λ/(n_x + λ)) z^0 + Σ_{i=±1}^{±n_x} (1/(2(n_x + λ))) z^i,

Cov(z) ≈ (λ/(n_x + λ) + (1 - α² + β)) (z^0 - E(z))(z^0 - E(z))^T
         + Σ_{i=±1}^{±n_x} (1/(2(n_x + λ))) (z^i - E(z))(z^i - E(z))^T.
Sensor Fusion, 2014 Lecture 5: 23
UT: design parameters
- λ is defined by λ = α²(n_x + κ) - n_x.
- α controls the spread of the sigma points and is suggested to be chosen around 10^{-3}.
- β compensates for the distribution, and should be chosen as β = 2 for Gaussian distributions.
- κ is usually chosen as zero.

Note that
- n_x + λ = α² n_x when κ = 0.
- The weights sum to one for the mean, but sum to 2 - α² + β ≈ 4 for the covariance. Note also that the weights are not in [0, 1].
- The mean has a large negative weight!
- If n_x + λ → 0, then UT and TT2 (and hence UKF and EKF2) are identical for n_x = 1, and otherwise closely related!
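The transform and its design parameters can be sketched as follows (a sketch under the parameter choices above; for a linear g the UT reproduces the mean and covariance exactly, which is used as a check):

```python
import numpy as np

def unscented_transform(g, xbar, P, alpha=1e-3, beta=2.0, kappa=0.0):
    """Sketch of the UT with the design parameters from the slide."""
    n = xbar.size
    lam = alpha**2 * (n + kappa) - n
    L = np.linalg.cholesky((n + lam) * P)        # columns: sqrt(n+lam) P^{1/2}_{:,i}
    sigmas = [xbar] + [xbar + L[:, i] for i in range(n)] \
                    + [xbar - L[:, i] for i in range(n)]
    Wm = np.full(2 * n + 1, 1.0 / (2.0 * (n + lam)))
    Wc = Wm.copy()
    Wm[0] = lam / (n + lam)                      # the large negative mean weight
    Wc[0] = lam / (n + lam) + (1.0 - alpha**2 + beta)
    Z = np.array([g(s) for s in sigmas])
    mu = Wm @ Z
    d = Z - mu
    Pz = sum(Wc[i] * np.outer(d[i], d[i]) for i in range(2 * n + 1))
    return mu, Pz

# Linear sanity check: for g(x) = Gx the UT is exact.
G = np.array([[1.0, 1.0], [0.0, 1.0]])
mu, Pz = unscented_transform(lambda x: G @ x, np.array([1.0, 2.0]), np.eye(2))
```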
Sensor Fusion, 2014 Lecture 5: 24
Example 1: squared norm
z = g(x) = x^T x, x ∈ N(0, I_n) ⇒ z ∈ χ²(n).

The theoretical distribution is χ²(n), with mean n and variance 2n. The mean and variance are summarized below as a Gaussian distribution. The number of Monte Carlo simulations is 10 000.
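The χ²(n) moments can be checked directly by simulation (we use n = 3 and more samples than the slide's 10 000 to tighten the estimate):

```python
import numpy as np

rng = np.random.default_rng(2)
n, N = 3, 100_000
x = rng.standard_normal((N, n))
z = np.sum(x**2, axis=1)     # z = x^T x ~ chi2(n)
zm = z.mean()                # should be close to n
zv = z.var()                 # should be close to 2n
```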
DOA example: g(x) = arctan2(x1, x2), x ∈ N([3; 0], [10, 0; 0, 1]):

  TT1: N(0, 0.111)
  TT2: N(0, 0.235)
  UT2: N(0.524, 1.46)
  MCT: N(0.0702, 1.6)
Sensor Fusion, 2014 Lecture 5: 27
EKF1 and EKF2 principle
Apply TT1 and TT2, respectively, to the dynamic and observation models. For instance,

x_{k+1} = f(x_k) + v_k = f(xbar) + f'(xbar)(x_k - xbar) + (1/2)(x_k - xbar)^T f''(ξ)(x_k - xbar) + v_k.

- EKF1 neglects the rest term.
- EKF2 compensates with the mean and covariance of the rest term, using ξ = xbar.
Sensor Fusion, 2014 Lecture 5: 28
EKF1 and EKF2 algorithm
S_k = h'_x(xhat_{k|k-1}) P_{k|k-1} (h'_x(xhat_{k|k-1}))^T + h'_e(xhat_{k|k-1}) R_k (h'_e(xhat_{k|k-1}))^T
      + (1/2) [tr(h''_{i,x}(xhat_{k|k-1}) P_{k|k-1} h''_{j,x}(xhat_{k|k-1}) P_{k|k-1})]_{ij}

K_k = P_{k|k-1} (h'_x(xhat_{k|k-1}))^T S_k^{-1}

ε_k = y_k - h(xhat_{k|k-1}, 0) - (1/2) [tr(h''_{i,x}(xhat_{k|k-1}) P_{k|k-1})]_i

xhat_{k|k} = xhat_{k|k-1} + K_k ε_k

P_{k|k} = P_{k|k-1} - P_{k|k-1} (h'_x(xhat_{k|k-1}))^T S_k^{-1} h'_x(xhat_{k|k-1}) P_{k|k-1}

xhat_{k+1|k} = f(xhat_{k|k}, 0)

P_{k+1|k} = f'_x(xhat_{k|k}) P_{k|k} (f'_x(xhat_{k|k}))^T + f'_v(xhat_{k|k}) Q_k (f'_v(xhat_{k|k}))^T
            + (1/2) [tr(f''_{i,x}(xhat_{k|k}) P_{k|k} f''_{j,x}(xhat_{k|k}) P_{k|k})]_{ij}.
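A sketch of the measurement update for EKF1, i.e. with the Hessian terms of EKF2 dropped (the function name and the linear test case are ours):

```python
import numpy as np

def ekf1_measurement_update(xhat, P, y, h, H, R):
    """EKF1 measurement update: the Hessian terms of EKF2 set to zero.
    h is the measurement function, H its Jacobian at xhat (additive noise)."""
    S = H @ P @ H.T + R               # innovation covariance S_k
    K = P @ H.T @ np.linalg.inv(S)    # gain K_k = P H^T S^-1
    eps = y - h(xhat)                 # innovation eps_k
    xnew = xhat + K @ eps
    Pnew = P - K @ S @ K.T            # equals P - P H^T S^-1 H P
    return xnew, Pnew

# Linear check: position measured; the velocity estimate is updated only via P.
H = np.array([[1.0, 0.0]])
xnew, Pnew = ekf1_measurement_update(
    np.array([0.0, 0.0]), np.eye(2), np.array([2.0]),
    lambda x: H @ x, H, np.eye(1))
```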
Sensor Fusion, 2014 Lecture 5: 29
Comments
- EKF1, using the TT1 transformation, is obtained by letting both Hessians f''_x and h''_x be zero.
- Analytic Jacobians and Hessians are needed. If not available, use numerical approximations (done by default in the Signal and Systems Lab!).
- The complexity of EKF1 is n_x^3, as in the KF, due to the F P F^T operation.
- The complexity of EKF2 is n_x^5, due to the F_i P F_j^T operations for i, j = 1, ..., n_x.
- Dithering is good! That is, increase Q and R from the simulated values to account for the approximation errors.
Sensor Fusion, 2014 Lecture 5: 30
EKF variants
- The standard EKF linearizes around the current state estimate.
- The linearized Kalman filter linearizes around some reference trajectory.
- The error state Kalman filter, also known as the complementary Kalman filter, estimates the state error x~_k = x_k - xbar_k with respect to some approximate or reference trajectory. Feedforward or feedback configurations.

linearized Kalman filter = feedforward error state Kalman filter
EKF = feedback error state Kalman filter
Sensor Fusion, 2014 Lecture 5: 31
Derivative-free algorithms
Numeric derivatives are preferred in the following cases:
- The nonlinear function is too complex.
- The derivatives are too complex functions.
- A user-friendly algorithm is desired, with as few user inputs as possible.

This can be achieved with either numerical approximation or using sigma points!
Sensor Fusion, 2014 Lecture 5: 32
KF, EKF and UKF in one framework
First, recall the lemma

(X; Y) ∈ N((μ_x; μ_y), (P_xx, P_xy; P_yx, P_yy)).

Then the conditional distribution of X, given the observed Y = y, is Gaussian:

X | Y = y ∈ N(μ_x + P_xy P_yy^{-1}(y - μ_y), P_xx - P_xy P_yy^{-1} P_yx).
Time update: the transformation approximation (UT, MC, TT1, TT2), applied to z = f(x_k, u_k, v_k) with the stacked variable x = (x_k; v_k), gives

z ∼ N(xhat_{k+1|k}, P_{k+1|k}).
Sensor Fusion, 2014 Lecture 5: 34
Measurement update: let

x = (x_k; e_k) ∈ N((xhat_{k|k-1}; 0), (P_{k|k-1}, 0; 0, R_k)),

z = (x_k; y_k) = (x_k; h(x_k, u_k, e_k)) = g(x).

The transformation approximation (UT, MC, TT1, TT2) gives

z ∼ N((xhat_{k|k-1}; yhat_{k|k-1}), (P^xx_{k|k-1}, P^xy_{k|k-1}; P^yx_{k|k-1}, P^yy_{k|k-1})).

The measurement update is now

K_k = P^xy_{k|k-1} (P^yy_{k|k-1})^{-1},
xhat_{k|k} = xhat_{k|k-1} + K_k (y_k - yhat_{k|k-1}).
Sensor Fusion, 2014 Lecture 5: 35
Linear case
Time update: f(x_k, u_k, v_k) = A x_k + v_k gives

z ∼ N(A xhat_{k|k}, A P_{k|k} A^T + Q_k) = N(xhat_{k+1|k}, P_{k+1|k}).

This is the KF time update.

Measurement update: h(x_k, e_k) = H x_k + e_k gives

z ∼ N((xhat_{k|k-1}; H xhat_{k|k-1}), (P_{k|k-1}, P_{k|k-1} H^T; H P_{k|k-1}, H P_{k|k-1} H^T + R)).

This gives the KF measurement update!
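The block structure above can be verified numerically: stacking (x; e) and pushing it through the linear map g(x, e) = (x, Hx + e) reproduces exactly the joint covariance on the slide (the numbers are arbitrary):

```python
import numpy as np

P = np.array([[2.0, 0.5], [0.5, 1.0]])   # P_{k|k-1}
H = np.array([[1.0, 0.0]])
R = np.array([[0.5]])
n, m = 2, 1

# Linear map of the stacked variable (x; e): z = (x; Hx + e) = G (x; e).
G = np.block([[np.eye(n), np.zeros((n, m))],
              [H,         np.eye(m)      ]])
Pjoint = G @ np.block([[P, np.zeros((n, m))],
                       [np.zeros((m, n)), R]]) @ G.T

Pxy = Pjoint[:n, n:]          # should equal P H^T
Pyy = Pjoint[n:, n:]          # should equal H P H^T + R
K = Pxy @ np.linalg.inv(Pyy)  # gain K = P^xy (P^yy)^-1
```

The extracted gain coincides with the classical KF gain P H^T (H P H^T + R)^{-1}, which is the point of the one-framework view.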
Sensor Fusion, 2014 Lecture 5: 36
Comments
- The filter obtained using TT1 is equivalent to the standard EKF1.
- The filter obtained using TT2 is equivalent to EKF2.
- The filter obtained using UT is equivalent to the UKF.
- The Monte Carlo approach should be the most accurate, since it asymptotically computes the correct first and second order moments.
- There is freedom to mix transform approximations in the time and measurement updates.
Sensor Fusion, 2014 Lecture 5: 37
Choice of nonlinear filter

- Depends mainly on (i) the SNR, (ii) the degree of nonlinearity, and (iii) the degree of non-Gaussian noise, in particular whether any distribution is multi-modal (has several local maxima).
- SNR and the degree of nonlinearity are connected through the rest term, whose expected value is E r(x; xbar, g''(ξ)) ≈ (1/2)[tr(g''_i(xbar) P)]_i. A small rest term requires either high SNR (small P) or almost linear functions (small f'' and h'').
- If the rest term is small, use EKF1.
- If the rest term is large and the nonlinearities are essentially quadratic (e.g. x^T x), use EKF2.
- If the rest term is large and the nonlinearities are not essentially quadratic, try the UKF.
- If the functions are severely nonlinear or any distribution is multi-modal, consider filter banks or the particle filter.
Sensor Fusion, 2014 Lecture 5: 38
Course estimation in cars
A car with a course gyro and wheel speed sensors. The gyro has a drift. The wheel speeds can be converted to a course angular rate, but with a (speed-dependent) drift. The EKF can correct the drifts!
http://youtu.be/d9rzCCIBS9I

Applications: ABS, traction control, headlight control.
Sensors: many possible configurations of gyroscopes and accelerometers.
http://youtu.be/hT6S1FgHxOc