8/19/2019 Realtime Estimation of Driving Resistance_Author Shi Jieqing
1 Introduction
With recent advancements in vehicle automation, advanced driver assistance systems have
emerged as an important tool to facilitate energy efficient driving. For the driver assistance
systems to function, an accurate model of the vehicle’s longitudinal dynamics is needed, for
which vehicle parameters such as the mass and the driving resistances are required. In general,
a straightforward approach is to use sensors to measure these parameters. This, however, is
not always feasible and can be costly [1]. Furthermore, vehicle parameters such as the
mass and the driving resistances can vary depending on the load, the attachment of trailers,
road conditions etc. In this respect, an efficient alternative to a sensor-based approach is to
estimate these parameters adaptively using a model-based approach [22]. For this, methods
from data fusion can be applied on available vehicle data retrieved from existing sensors and,
using a mathematical model of the vehicle dynamics, the unknown parameters can then be
reconstructed. The estimation algorithm is developed as a new software function based on various
rapid prototyping development tools and can be continuously validated on existing vehicle data.
To guarantee reliable system performance, the estimation algorithm has to be robust and accurate.
Many estimators have been proposed in literature amongst which the recursive least squares
(RLS) estimator is one of the most popular algorithms. However, in situations where the system
excitation is poor (e.g. on highways where the vehicle travels with nearly constant speed), a
problem called the estimator windup can occur in the RLS estimator. As a result, the estimator
is unable to produce accurate estimates of the unknown parameters which can severely inhibit
the system’s performance. This is why the RLS algorithm is not well-suited for the
estimation of vehicle parameters.
The objective of this work is to find modifications of and alternatives to the RLS estimator
which show better performance regarding the estimation of vehicle parameters
during periods of poor excitation. More specifically, the following tasks are targeted in this
work:
• Analysis of estimator windup including its manifestation and consequences
• Research of alternative/modified estimators which target the problem of estimator windup
• Selection of suitable estimator candidates for the estimation of parameters in the vehicle
dynamics model
• Implementation of estimators in MATLAB/Simulink
• Validation and evaluation of algorithms based on data obtained from various test drives
The remainder of this work is organized as follows: chapter 2 discusses the basics of real time
parameter identification; the algorithm of the RLS estimator is introduced and the problem of
estimator windup is analyzed. Chapter 3 presents modified and alternative estimation techniques
for parameter identification under lack of excitation.
2 Real time parameter identification
2.1 Basics of parameter identification
In many adaptive systems the direct measurement of the unknown parameters is not possible
because the application of extra sensors is either impractical or not viable [1]. As a reasonable
alternative, the unknown parameters can be estimated using available data. One of the most
popular estimation schemes is the so-called recursive least squares method, which is described in the following.
2.1.1 Recursive least squares estimation
The least squares method for parameter estimation is based on a linear, mathematical model
that can be formulated in the so-called regressor form [2]:
y(k) = ϕ1(k)θ1 + ϕ2(k)θ2 + . . . + ϕn(k)θn + e(k) (2.1)
where y(k) denotes the output/observed variable, ϕT (k) = [ϕ1(k) ϕ2(k) . . . ϕn(k)] denotes
the vector of regressors and θ = [θ1 θ2 . . . θn]T represents the vector of n parameters to be
estimated. Denoting ŷ(k) as the output of the estimation model of the above process and θ̂ as
the estimation of the unknown parameters θ , the error between predicted output ŷ(k) and the
actual output y(k) can be formulated as
e(k) = y(k) − ŷ(k) = y(k) − ϕT (k)θ̂
The objective of the least squares estimation is to choose the parameters θ̂ such that the estimated
output ŷ(k) follows the real output y(k) as closely as possible. This can be expressed as estimating the parameters θ̂ such that a loss function, which describes the deviation between predicted and
measured output, is minimized. Introducing the following matrix notations [2],
Y (k) = [y(1) y(2) . . . y(k)]T
Φ(k) = [ϕ(1) ϕ(2) . . . ϕ(k)]T (i.e. the matrix whose i-th row is ϕT (i))
E (k) = [e(1) e(2) . . . e(k)]T = Y (k) − Φ(k)θ̂
with Y ∈ RN ×1, E ∈ RN ×1 and Φ ∈ RN ×n (N denoting the number of observations) the
least-squares loss function can be formulated as
V (θ̂, k) = (1/2) Σ_{i=1}^{k} (y(i) − ϕT (i)θ̂)2 = (1/2) (Y (k) − Φ(k)θ̂)T (Y (k) − Φ(k)θ̂) = (1/2) E T (k)E (k) (2.2)
Choosing the θ that minimizes the loss-function, i.e.
θ̂ = arg min {V (θ, k)}
thus describes an unconstrained optimization problem with a quadratic loss function. Additionally,
taking into account that y is linear in the parameters, an analytic solution can be obtained
simply by computing the gradient ∂V /∂θ̂ and setting it to 0. This leads to the following unique
solution for θ̂ if the matrix ΦT Φ is regular [2]:
ΦT Φθ̂ = ΦT Y
⇒ θ̂ = (ΦT Φ)−1ΦT Y (2.3)
The matrix ΦT Φ is often denoted as the information matrix R , its inverse (ΦT Φ)−1 is called
the covariance matrix P [2].
P (k) = (ΦT Φ)−1 = ( Σ_{i=1}^{k} ϕ(i)ϕT (i) )−1 (2.4)
A geometric interpretation of the least squares estimate can be obtained when considering a two-dimensional case where two parameters θ1 and θ2 are estimated (see fig. 2.1). With the
regression variables spanning a subspace in which the predicted output Ŷ lies, the least squares
estimation can be seen as finding the parameters θ such that the distance between the real
output Y and its best approximation Ŷ is minimal. The minimal distance is only achieved when
[2]
E = Y − Ŷ ⊥ span(ϕ1, ϕ2 . . . ϕn)
In order to enable online parameter estimation, the least-squares algorithm can also be formulated
in a recursive manner, i.e. the results obtained until time instance k − 1 are used to determine
the estimates at current time instance k. For this purpose, it is assumed that the information
matrix R (k) = ΦT Φ is regular for all k. Using the fact that R (k) can be decomposed as
R (k) = P −1(k) = ΦT (k)Φ(k) = Σ_{i=1}^{k−1} ϕ(i)ϕT (i) + ϕ(k)ϕT (k) = P −1(k − 1) + ϕ(k)ϕT (k) (2.5)
Figure 2.1 – Geometric interpretation of the least squares estimate [2]
the least squares estimate (2.3) becomes
θ̂(k) = P (k)ΦT (k)Y (k) = P (k) Σ_{i=1}^{k} ϕ(i)y(i) = P (k) [ Σ_{i=1}^{k−1} ϕ(i)y(i) + ϕ(k)y(k) ] (2.6)
where
Σ_{i=1}^{k−1} ϕ(i)y(i) = P −1(k − 1)θ̂(k − 1)
From (2.5) it can be deduced that P −1(k − 1) = P −1(k) − ϕ(k)ϕT (k). Plugging this expression
into (2.6) yields the following formula for θ̂(k) [2]:
θ̂(k) = θ̂(k − 1) + K (k)ε(k)
K (k) = P (k)ϕ(k)
ε(k) = y(k) − ϕT (k)θ̂(k − 1)
After some algebraic reformulations using the matrix inversion lemma (A + BCD)−1 = A−1 −
A−1B(C −1 + DA−1B)−1DA−1 on P (k) = [P −1(k − 1) + ϕ(k)ϕT (k)]−1 one obtains the
recursive least squares (RLS) algorithm [2]:
θ̂(k) = θ̂(k − 1) + K (k)ε(k) (2.7)
K (k) = P (k − 1)ϕ(k)[1 + ϕT (k)P (k − 1)ϕ(k)]−1 (2.8)
P (k) = (I − K (k)ϕT (k))P (k − 1) (2.9)
ε(k) = y(k) − ϕT (k)θ̂(k − 1) (2.10)
with K (k) = P (k)ϕ(k) denoting a correction factor. Interpreting ε(k) as the error which occurs
when predicting y(k) one step ahead based on θ̂(k − 1), the estimate θ̂(k) at time k is derived by
adding a correction term K (k)ε(k) to the previous estimate θ̂(k − 1). The correction term is
proportional to the difference between the measured output and the predicted output based on
the previous estimate [2].
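For concreteness, the recursion (2.7)–(2.10) can be sketched in a few lines of Python/NumPy. This is an illustrative sketch, not the MATLAB/Simulink implementation developed in this work; the function name and the toy data are assumptions:

```python
import numpy as np

def rls_update(theta, P, phi, y):
    """One RLS step following (2.7)-(2.10): theta is the previous estimate,
    P the covariance matrix, phi the current regressor, y the measured output."""
    eps = y - phi @ theta                             # prediction error (2.10)
    K = P @ phi / (1.0 + phi @ P @ phi)               # correction gain (2.8)
    theta = theta + K * eps                           # parameter update (2.7)
    P = (np.eye(len(phi)) - np.outer(K, phi)) @ P     # covariance update (2.9)
    return theta, P

# Toy example: recover theta = [2, -1] from noise-free observations.
rng = np.random.default_rng(0)
theta_true = np.array([2.0, -1.0])
theta, P = np.zeros(2), 1e3 * np.eye(2)               # large P(0): low confidence
for _ in range(50):
    phi = rng.standard_normal(2)
    theta, P = rls_update(theta, P, phi, phi @ theta_true)
```

Choosing a large initial covariance P (0) reflects low confidence in the initial guess θ̂(0), so that the first measurements dominate the estimate.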
In order to initialize the RLS algorithm, initial values for θ̂(0) and P (0) must be known. These
initial values can be obtained in a variety of ways. For instance, if a priori knowledge is available
regarding the parameters and their covariances, these values can be used to instantiate the initial
estimates. Furthermore, it is possible to start with an off-line estimation to obtain the initial
estimates of θ̂(0) and P (0) or to simply assume appropriate initial values [14].
An interesting observation is the similarity between the RLS and the standard Kalman filter
recursive algorithm. The Kalman filter algorithm is usually associated with a random walk
parameter variation model and a linear regression model that can be described by [8], [20]:
θ(k) = θ(k − 1) + w(k) (2.11)
y(k) = ϕT (k)θ(k) + v(k) (2.12)
with w(k) and v(k) denoting the sequence of random vectors which are responsible for the
parameters’ change and the measurement noise, respectively. Generally, it is assumed that
w(k), v(k) are Gaussian processes with zero mean and the variances E[w(k)wT (k)] = Q(k) and
E[v(k)2] = r(k). Applying the standard Kalman filter to said model yields [20]:
θ̂(k) = θ̂(k − 1) + K (k)ε(k)
K (k) = P (k − 1)ϕ(k)[r(k) + ϕT (k)P (k − 1)ϕ(k)]−1 (2.13)
P (k) = (I − K (k)ϕT (k))P (k − 1) + Q(k) (2.14)
ε(k) = y(k) − ϕT (k)θ̂(k − 1)
The similarity between the standard RLS and the Kalman filter estimator becomes apparent when
comparing the algorithm equations. In fact, it can be shown that the RLS estimator is a special
case of the Kalman filter if specific assumptions about Q(k) and r(k) are made [8].
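For comparison, the Kalman-filter recursion (2.13)–(2.14) differs from the RLS step only in the gain denominator and the additive Q(k) term. The following is again an illustrative Python sketch with assumed toy values:

```python
import numpy as np

def kf_param_update(theta, P, phi, y, Q, r):
    """Kalman-filter step (2.13)-(2.14) for the random-walk parameter
    model (2.11)-(2.12) with process noise Q and measurement noise variance r."""
    eps = y - phi @ theta
    K = P @ phi / (r + phi @ P @ phi)                   # gain (2.13)
    theta = theta + K * eps
    P = (np.eye(len(phi)) - np.outer(K, phi)) @ P + Q   # covariance (2.14)
    return theta, P

# With Q = 0 and r = 1 the recursion reduces to the standard RLS.
rng = np.random.default_rng(2)
theta_true = np.array([0.5, 3.0])
theta, P = np.zeros(2), 100.0 * np.eye(2)
for _ in range(80):
    phi = rng.standard_normal(2)
    theta, P = kf_param_update(theta, P, phi, phi @ theta_true,
                               Q=np.zeros((2, 2)), r=1.0)
```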
2.1.2 Exponential forgetting
In many real-world systems, the parameters of the system do not remain constant but can vary
with time. In that case, the proposed RLS algorithm is not suitable to estimate these time-varying
parameters. The reason is as follows: since the exponential convergence of the RLS algorithm (as
proven in various studies such as [13], [14]) implies that the covariance matrix converges to 0 with increasing time horizon k, it follows from (2.7) that θ̂(k) = θ̂(k − 1) as the correction gain
K (k) vanishes [2]. Therefore, if the parameters are time-varying, the standard RLS algorithm
is not able to track parameter changes. In this case, one intuitive approach is to modify the
algorithm such that older data is continuously discarded, i.e. is assigned less weight while newer
incoming data is considered with higher weight. This is achieved by modifying the least-squares
loss-function in (2.2) [2]:
V (θ̂, k) = (1/2) Σ_{i=1}^{k} λk−i(y(i) − ϕT (i)θ̂)2 (2.15)
The constant 0 ≤ λ ≤ 1 is called the forgetting factor. Evidently, the modified loss function assigns exponentially less weight to data that is far away from the current time instance k while
new incoming data is considered with more weight. This way parameter estimation using a
forgetting factor can be simply interpreted as averaging the data over a certain amount of data
points while the forgetting factor sets the memory length of the algorithm [3], [20]. Repeating the
calculations in the previous sub-section for the modified loss-function leads to the RLS algorithm
with exponential forgetting [2].
θ̂(k) = θ̂(k − 1) + K (k)ε(k) (2.16)
K (k) = P (k − 1)ϕ(k)[λ + ϕT (k)P (k − 1)ϕ(k)]−1 (2.17)
P (k) = (I − K (k)ϕT (k))P (k − 1) · λ−1 (2.18)
ε(k) = y(k) − ϕT (k)θ̂(k − 1) (2.19)
In some literature the RLS estimator with forgetting is formulated using the information matrix instead
of the covariance matrix. In that case
R (k) = λR (k − 1) + ϕ(k)ϕT (k) (2.20)
is used instead of (2.18) [6]. However, from the perspective of implementation, it is often
computationally more efficient to use the covariance matrix in the update equations so as to avoid a
matrix inversion operation at each update.
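A covariance-form sketch of (2.16)–(2.19) in Python (illustrative; the jump scenario and the value λ = 0.95 are assumed) shows how forgetting enables tracking of a parameter change:

```python
import numpy as np

def rls_forget_update(theta, P, phi, y, lam):
    """One RLS step with exponential forgetting, covariance form (2.16)-(2.19)."""
    eps = y - phi @ theta                                 # (2.19)
    K = P @ phi / (lam + phi @ P @ phi)                   # (2.17)
    theta = theta + K * eps                               # (2.16)
    P = (np.eye(len(phi)) - np.outer(K, phi)) @ P / lam   # (2.18)
    return theta, P

# The estimator follows a parameter jump at k = 150 thanks to forgetting.
rng = np.random.default_rng(3)
theta, P = np.zeros(2), 100.0 * np.eye(2)
for k in range(300):
    theta_true = np.array([1.0, -2.0]) if k < 150 else np.array([3.0, 1.0])
    phi = rng.standard_normal(2)
    theta, P = rls_forget_update(theta, P, phi, phi @ theta_true, lam=0.95)
```

With λ = 1 the same code reduces to the standard RLS and would not recover from the jump.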
The quality of the estimates depends directly on the choice of the forgetting factor. Choosing
λ ≈ 1 leads to robust, smooth trajectories, however the algorithm loses its capability to track
parameter changes since old data is discarded at a relatively slow rate. Conversely, a small
value for λ enables fast tracking of parameter changes, however the measurements become more
sensitive to noise, ultimately causing fluctuating trajectories and decreased robustness. Therefore,
the trade-off between adaption rate and robustness needs to be taken into account when deciding
on the forgetting factor [14]. This relationship is illustrated in fig. 2.2. The left figure shows the
estimation of four parameters using λ = 0.9 while the right figure displays the same estimation
using λ = 0.95. It is apparent that a smaller value of λ causes more noise sensitive estimates and
larger measurement variances, however changes in the parameter can be tracked relatively fast.
In comparison, a larger λ leads to smoother, less noise sensitive estimates at the cost of slower
parameter tracking [14].
Figure 2.2 – Comparison of different forgetting factors λ = 0.9 (left) and λ = 0.95 (right) in the RLS estimator with exponential forgetting (true values: dashed lines, estimated values: solid lines) [14]
2.2 Lack of excitation
The RLS algorithm with exponential forgetting is a widely used technique for parameter estimation
in adaptive systems. However, certain circumstances can cause the RLS estimator to produce
inaccurate estimates. These situations arise when the input is insufficiently excited, which leads to a phenomenon denoted as estimator windup.
2.2.1 Estimator windup
For illustration of the windup phenomenon, we consider the extreme case of a process with no
excitation, i.e. ϕ(k) = 0 for a long period of time. In this case, one can observe from (2.18) that
applying an estimation technique with exponential forgetting leads to [14]
P (k) = (I − K (k)ϕT (k))P (k − 1) · λ−1 = (1/λ) P (k − 1)
since K (k) = 0 when ϕ(k) = 0.
Figure 2.3 – Effects of estimator windup caused by a constant regressor: control variable (top left), covariance matrix element (top right), trajectories of estimates (bottom) [2]

As λ < 1, the covariance matrix thus grows exponentially during the period of missing excitation, so that
the robustness of the algorithm cannot be guaranteed. Thus, as estimator windup causes an
unbounded increase of the covariance matrix, the estimates become unreliable [7]. This is why
estimation algorithms relying on constant forgetting factors are only suitable for persistently
excited processes [22].
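The unbounded growth of P (k) is easy to reproduce numerically: with ϕ(k) = 0 the gain vanishes and the trace of P is multiplied by 1/λ at every step (illustrative Python sketch with assumed values):

```python
import numpy as np

lam = 0.95
P = np.eye(2)
trace_history = [np.trace(P)]
for _ in range(100):
    phi = np.zeros(2)                                # no excitation at all
    K = P @ phi / (lam + phi @ P @ phi)              # K = 0 since phi = 0
    P = (np.eye(2) - np.outer(K, phi)) @ P / lam     # P(k) = P(k-1) / lam
    trace_history.append(np.trace(P))
# After 100 steps the trace has grown from 2 to 2 / lam**100 (about 338).
```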
The concept of persistent excitation can be defined by the following expression. A sequence of
regressors is called persistently exciting in m steps if there exist constants c, C and m such that
[9]
cI ≤ Σ_{i=k+1}^{k+m} ϕ(i)ϕT (i) ≤ C I   ∀k (2.21)
for any m > n. This condition implies that if ϕ(k) is persistently exciting, the entire Rn space
can be spanned by ϕ(k) uniformly in m steps. On the contrary, if the input is not sufficiently
exciting, only a subspace with a dimension smaller than n is spanned [9].
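Condition (2.21) can be checked numerically by forming the windowed sum of outer products and inspecting its smallest eigenvalue (an illustrative Python sketch; the window length and the tolerance c are assumed values):

```python
import numpy as np

def is_persistently_exciting(phis, m, c=1e-6):
    """Check (2.21): every window of m consecutive regressors must yield a
    Gramian sum whose smallest eigenvalue is at least c (i.e. spans R^n)."""
    phis = np.asarray(phis)
    for k in range(len(phis) - m + 1):
        gram = sum(np.outer(p, p) for p in phis[k:k + m])
        if np.linalg.eigvalsh(gram)[0] < c:
            return False
    return True

# Random regressors span R^2; a constant regressor spans only a 1-D subspace.
rng = np.random.default_rng(1)
rich = rng.standard_normal((20, 2))
constant = np.tile(np.array([1.0, 0.0]), (20, 1))
```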
To decide if a signal is exciting or not, it can be useful to exploit the fact that a lack of excitation is reflected in the behavior of the covariance matrix in several ways. Since the covariance matrix is
a symmetrical, real-valued, positive semidefinite matrix which possesses orthogonal eigenvectors,
it can be diagonalized using eigenvalue decomposition. In that way, a covariance matrix can be
transformed into canonical form, i.e. factorized as
P = U ΣU −1
where U is a square matrix whose i-th column is the eigenvector qi of P and Σ is a diagonal
matrix with the eigenvalues of P as its diagonal elements. As a result, each covariance matrix
can be fully represented in terms of its eigenvalues and eigenvectors.
Adopting a statistical interpretation, the eigenvalues of a covariance matrix represent the
magnitude of data spread in the direction of the respective eigenvectors. This implies that the
largest eigenvector of a covariance matrix points into the direction of the largest variance of the
data, i.e. the direction in which the data has the largest uncertainty, while the second largest
eigenvector points into the direction of the second largest variance, which is orthogonal to the
largest eigenvector, and so on. The length of the eigenvectors is represented by the magnitude
of the respective eigenvalues. This way, covariance matrices can be represented as ellipsoids in the Rn space with n as the number of parameters [20]. For instance, considering a process
with two parameters, the covariance matrix can be represented in two-dimensional space as an
ellipse (see fig 2.4). In this case, the two perpendicular axes of the ellipse point into the direction
of eigenvectors of the P matrix while the length of the axes is determined by the eigenvalues.
Assuming that the first eigenvalue at time instance k is larger than the second, the ellipse is
more elongated in the direction of eigenvector 1 [20].
Figure 2.4 – Snapshot representation of a 2 × 2 covariance matrix as an ellipse at time k (axes a and b point along eigenvectors EV 1 and EV 2)

As already established, estimator windup is characterized by a blowup of the covariance
matrix and is caused by a lack of excitation in the input signal i.e. no new information is
incoming regarding the parameters or more precisely, the incoming data does not contain enough
information all along the parameter space [5]. Since an estimator such as the RLS discards old
information, the ’uncertainty’ grows. Because each covariance matrix can be represented in terms
of its eigenvalues and eigenvectors, the windup phenomenon caused by insufficient excitation can
therefore be detected by an unbounded increase of its eigenvalues. Referring to fig. 2.4, a lack of excitation would be represented by an increase of the eigenvalues, thus changing the shape of the
ellipse during the period of poor excitation [20]. Furthermore, using the fact that the trace of a
matrix is defined as the sum of its eigenvalues, another possibility to detect a lack of excitation
is by simply measuring the trace of the P matrix. Thus, an increase of the covariance matrix
eigenvalues would be represented by an increase of the matrix trace.
Similarly, another method of detection is by interpreting the windup phenomenon from the
perspective of the information matrix. As seen from (2.20), poor excitation will lead to [6]
R (k) = λR (k − 1)
As λ < 1, the information matrix then decays exponentially toward zero during the period of poor excitation, so that its
inverse can become ill-conditioned, i.e. P −1(k) = ΦT (k)Φ(k) can become non-invertible. In this
respect, poor excitation is also reflected by an increase of the condition number of P .
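These detection criteria, the eigenvalues, the trace and the condition number of P, can be combined into one small monitoring routine (illustrative Python sketch with assumed example matrices):

```python
import numpy as np

def windup_indicators(P):
    """Return the eigenvalues (ascending), the trace (sum of eigenvalues) and
    the condition number of a symmetric covariance matrix P."""
    eigvals = np.linalg.eigvalsh(P)
    return eigvals, float(np.sum(eigvals)), float(eigvals[-1] / eigvals[0])

# A healthy P versus one inflated along a single eigen-direction (windup):
P_ok = np.diag([1.0, 2.0])
P_windup = np.diag([1.0, 1e6])
_, trace_ok, cond_ok = windup_indicators(P_ok)
_, trace_bad, cond_bad = windup_indicators(P_windup)
```

In practice, a threshold on the trace or on the condition number can trigger a countermeasure such as freezing the covariance update.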
2.2.3 Parameter identification of vehicle longitudinal dynamics
In today’s driver assistance systems accurate models of the vehicle’s longitudinal dynamics are
required to enable automated control schemes for e.g. fuel efficient driving, vehicle following or
range prediction of electric vehicles. Since the longitudinal dynamics is mainly characterized by
the mass of the vehicle as well as rolling resistance and air resistance, obtaining an accurate
model depends mostly on good estimates of these parameters. Since a sensor-based measurement
is often not possible, an adaptive, real-time estimation of these parameters can be considered as
a rational alternative.
In order to derive a system for online parameter identification, a physical model of a vehicle’s
longitudinal dynamics is required. The dynamics can be modeled as [1], [22]:
(m + mrot) v̇ = F A − (1/2) ρair A cw v2 − mg sin(α) − mg cos(α) f R (2.22)
In the above equation m is the vehicle mass, mrot is the equivalent mass of rotating components,
v̇ the vehicle acceleration. F A is the driving force, computed as F A = T wheel / rdyn, i.e. the
wheel torque divided by the dynamic rolling radius of the tire. The resisting forces consist of the
aerodynamic resistance, slope resistance and rolling resistance. The air resistance is defined as
F air = (1/2) ρair A cw v2 with ρair denoting the air density, A being the frontal area of the vehicle, cw
being the drag coefficient and v representing the vehicle speed. The slope resistance is defined
as F S = mg sin(α) with g denoting the gravitational constant and α being the slope of the
road where α ≷ 0 corresponds to uphill and downhill grades respectively and α = 0 means no
inclination. Lastly, the rolling resistance is computed as F R = mg cos(α)f R with f R denoting the
rolling resistance coefficient of the road [1], [22].
The unknown parameters in the dynamic model to be estimated are the vehicle mass, the rolling
resistance coefficient and the drag coefficient. For the estimation of these parameters (2.22) can
be further simplified. Using the small angle approximation for small road slopes cos(α) ≈ 1, the
rolling resistance can be approximated as F R ≈ mgf R. Furthermore, in order to obtain a linear
representation, the rolling resistance coefficient is to be estimated together with the vehicle mass
(mf R). Likewise, the drag resistance is to be estimated in combination with the frontal surface
area of the vehicle (Acw). Lastly, the vehicle acceleration is combined with g sin(α) of the slope
resistance, so that v̇ + g sin(α) = ax is used. With these simplifications, (2.22) can be expressed
using the regressor notation [1]:
F A − mrot v̇  =  [ ax   g   (1/2) ρair v2 ] · [ m   mf R   Acw ]T (2.23)
     (ŷ)              (ϕT )                       (θ̂)
where F A − mrot v̇ describes the system output and [ax g (1/2) ρair v2] represents the system input.
Thus, the parameter vector θ̂ to be estimated is [m mf R Acw]T .
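Assembling output and regressor of (2.23) from the vehicle signals can be sketched as follows. This is an illustrative Python sketch; the function name, the constant values for ρair and g and all signal values are assumptions, and the output F A − mrot v̇ follows the simplification steps above:

```python
import numpy as np

RHO_AIR = 1.2     # assumed air density in kg/m^3
G = 9.81          # gravitational constant in m/s^2

def regressor(F_A, m_rot, v, v_dot, alpha):
    """Build output y and regressor phi of the linear model (2.23).
    F_A: driving force, m_rot: equivalent rotating mass, v: speed,
    v_dot: acceleration, alpha: road slope in rad.
    The associated parameter vector is theta = [m, m*f_R, A*c_w]."""
    a_x = v_dot + G * np.sin(alpha)                 # combined acceleration term
    y = F_A - m_rot * v_dot                         # system output
    phi = np.array([a_x, G, 0.5 * RHO_AIR * v**2])  # regressor vector
    return y, phi

# Consistency check with an assumed parameter vector theta = [m, m*f_R, A*c_w]:
theta_true = np.array([1500.0, 15.0, 0.7])
v, v_dot, alpha, m_rot = 20.0, 0.5, 0.02, 80.0
phi_ref = np.array([v_dot + G * np.sin(alpha), G, 0.5 * RHO_AIR * v**2])
F_A = phi_ref @ theta_true + m_rot * v_dot          # force consistent with theta
y, phi = regressor(F_A, m_rot, v, v_dot, alpha)
```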
It must be noted that the longitudinal dynamic model described in (2.23) is only valid under
specific circumstances. In fact, the following conditions must be fulfilled so that the parameters
can be estimated reliably [1]:
• The vehicle must move at a minimum speed v ≥ vmin and must have a minimum acceleration
a ≥ amin
• The vehicle is moving straight forward.
• The driver or the driving assistance system is not performing a braking maneuver.
• The power train is either fully open or fully closed, since the wheel torque cannot be
calculated when the clutch is slipping.
If these validity conditions are not satisfied, the estimation algorithm is not activated.
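The validity conditions translate directly into a simple activation gate for the estimator (an illustrative Python sketch; the threshold values vmin and amin are assumptions):

```python
def estimation_active(v, a, straight, braking, clutch_slipping,
                      v_min=5.0, a_min=0.2):
    """Return True only if all validity conditions of the longitudinal model
    hold: minimum speed, minimum acceleration, straight driving, no braking
    and a non-slipping clutch (thresholds are assumed example values)."""
    return (v >= v_min and a >= a_min and straight
            and not braking and not clutch_slipping)
```

Each incoming sample would only be passed to the recursive estimator when this gate returns True.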
3 Techniques for parameter identification under lack of excitation
With the notation defined in (2.23), the parameters θ can be identified in real-time using
estimation algorithms such as the RLS with exponential forgetting. However, as already described,
windup can occur in the RLS estimator during periods of poor excitation. Ultimately, this leads
to inaccurate and unreliable estimates and can result in the algorithm becoming unstable. This
is a serious drawback since for the parameter estimation in many adaptive systems, reliability
and robustness are particularly important. As a result, the RLS algorithm with exponential
forgetting is not the most suited estimator candidate.
Various researchers have identified the problem related to RLS estimators and have proposed a
wide variety of modifications and alternatives to the algorithm in order to target the windup
problem. These can be broadly categorized based on their utilized technique:
• Regularization of the least squares problem
• Variation of the forgetting factor
• Manipulation of the covariance matrix
• Limitation or scaling of the trace of the covariance matrix
In the following, some of the proposed estimation algorithms from literature are illustrated.
3.1 Estimation algorithms based on regularization mechanisms
The concept of the estimators which rely on a regularization technique is based on the notion
of well-posed/ill-posed problems. A problem is defined as well-posed if it is solvable and has a unique solution which depends continuously on the system parameters, i.e. small changes of the data
cannot cause arbitrarily large changes of the solution. Conversely, when these conditions are not
fulfilled, a problem is called ill-posed [21], [19].
Applying the concept to recursive parameter estimation, one recalls that the least squares
estimation problem is described as Ŷ = Φθ̂, which is solved by finding the solution that
minimizes the euclidean norm [10]
(Y − Φθ̂)T (Y − Φθ̂) = ‖Y − Φθ̂‖22 (3.1)
This yields the so-called normal equation
ΦT Φθ̂ = ΦT Y
from which the solution θ̂ can be obtained (see (2.3)).
However, when windup occurs, the information matrix R = ΦT Φ can become ill-conditioned. As
a result, R can become singular and the covariance matrix P as the inverse of the information
matrix does not exist. Hence, the regularization based estimators aim at preventing the poor
conditioning of the information matrix.
3.1.1 Tikhonov regularization
One possibility to prevent the ill-conditioning of ΦT Φ is based on the regularization technique
proposed by Tikhonov. This so-called Tikhonov Regularization method proposes to add a positive
term to (3.1). Consequently, the optimization problem becomes [10]
min ‖Y − Φθ̂‖22 + αT R ‖Lθ̂‖22 (3.2)
with αT R denoting an adjustable parameter and L being a weighting matrix.
In the above equation αT R ‖Lθ̂‖22 can be seen as a penalty term, with which the optimization
problem can be ensured to stay well-posed at the price of biasing the obtained solution. Thus,
the regularization parameter αT R has to be chosen considering the trade-off that a small value
does not effectively prevent the ill-conditioning of the problem while a large value leads to a
larger bias of the obtained solution. Ideally, αT R can be chosen such that the residual is small
and the penalty is moderate [10].
Since in many cases L is chosen as I (so-called standard form), the solution of (3.2) is obtained as:
θ̂ = (ΦT Φ + αT R I )−1ΦT Y (3.3)
Similar to the standard RLS, the derivation of the solution can be formulated in recursive form.
Thus, one obtains the Tikhonov regularization estimator (TR) [23]:
θ̂(k) = θ̂(k − 1) + K (k)ε(k) + P (k)(1 − λ)αT R θ̂(k − 1) (3.4)
K (k) = P (k)ϕ(k) (3.5)
R (k) = λR (k − 1) + ϕ(k)ϕT (k) + (1 − λ)αT R I (3.6)
ε(k) = y(k) − ϕT (k)θ̂(k − 1) (3.7)
Note that while with the RLS estimator it is possible to formulate the algorithm entirely without
the notion of the information matrix, due to the additional term of αT RI it is not possible to
apply the matrix inversion lemma to obtain a recursive expression for P (k). Hence, the algorithm
is formulated using R (k) = P −1(k).
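Since (3.4)–(3.7) propagate the information matrix, each step involves one matrix inversion; a direct Python sketch (illustrative, with assumed values for λ and αT R) reads:

```python
import numpy as np

def tr_update(theta, R, phi, y, lam=0.98, alpha_tr=0.01):
    """One Tikhonov-regularization (TR) step following (3.4)-(3.7).
    R is the information matrix; alpha_tr and lam are assumed tuning values."""
    eps = y - phi @ theta                                           # (3.7)
    R = (lam * R + np.outer(phi, phi)
         + (1 - lam) * alpha_tr * np.eye(len(phi)))                 # (3.6)
    P = np.linalg.inv(R)
    K = P @ phi                                                     # (3.5)
    theta = theta + K * eps + (1 - lam) * alpha_tr * (P @ theta)    # (3.4)
    return theta, R

rng = np.random.default_rng(4)
theta_true = np.array([2.0, -1.0])
theta, R = np.zeros(2), 0.1 * np.eye(2)
for _ in range(200):
    phi = rng.standard_normal(2)
    theta, R = tr_update(theta, R, phi, phi @ theta_true)
```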
3.1.2 Levenberg-Marquardt regularization
Another regularization based estimator called the Levenberg-Marquardt regularization has similar
properties to the Tikhonov regularization. Here, the ill-conditioning of the information matrix is
prevented simply by adding a positive definite matrix [11]. This way, it can be ensured that the
information matrix is always invertible. Hence, the Levenberg-Marquardt regularization estimator
(LMR) can be obtained by e.g. adding a scaled identity matrix αLMRI to the update equation of
the information matrix [23].
θ̂(k) = θ̂(k − 1) + K (k)ε(k) (3.8)
K (k) = P (k)ϕ(k) (3.9)
R (k) = λR (k − 1) + ϕ(k)ϕT (k) + (1 − λ)αLM R I (3.10)
ε(k) = y(k) − ϕT (k)θ̂(k − 1) (3.11)
The LMR is formally very similar to the TR algorithm. The only difference is that unlike the
LMR, the TR algorithm includes the covariance matrix in the parameter update equation. Similar
to TR, the adjustable parameter in the LMR is the parameter αLMR which is to be chosen as a
positive constant while considering the same trade-off as in the TR estimator.
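The LMR step can be sketched analogously; feeding it a long stretch of zero regressors illustrates that R stays bounded away from singularity instead of decaying to zero (illustrative Python sketch with assumed tuning values):

```python
import numpy as np

def lmr_update(theta, R, phi, y, lam=0.98, alpha_lmr=0.01):
    """One Levenberg-Marquardt-regularization (LMR) step following (3.8)-(3.11)."""
    eps = y - phi @ theta                                           # (3.11)
    R = (lam * R + np.outer(phi, phi)
         + (1 - lam) * alpha_lmr * np.eye(len(phi)))                # (3.10)
    K = np.linalg.solve(R, phi)                                     # K = P phi (3.9)
    theta = theta + K * eps                                         # (3.8)
    return theta, R

rng = np.random.default_rng(5)
theta_true = np.array([1.0, 2.0])
theta, R = np.zeros(2), 0.1 * np.eye(2)
for _ in range(200):                       # rich excitation: estimates converge
    phi = rng.standard_normal(2)
    theta, R = lmr_update(theta, R, phi, phi @ theta_true)
for _ in range(500):                       # no excitation: R tends to alpha_lmr * I
    theta, R = lmr_update(theta, R, np.zeros(2), 0.0)
```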
3.2 Estimation algorithms based on variation of the forgetting factor
A different category of estimators targets the windup problem from the perspective of the forgetting
factor. Since the standard RLS uses a constant, time-invariant forgetting factor, old data is
discarded uniformly in each iteration step. This means that the same forgetting factor is applied to the covariance matrix at all times, regardless of the level of excitation of the input or the
variation rate of the parameters. Therefore, the windup problem can be targeted by designing an
estimator which uses a variable forgetting factor.
3.2.1 RLS with variable forgetting factor
An estimation technique which is based on a variable forgetting factor has been proposed by Fortescue et al. (1981). The basic idea of the method is to enforce time-variant forgetting where
the forgetting factor is chosen approximately as 1 when the process is not properly excited to
avoid windup and to decrease the forgetting factor during periods of rich excitation in order to
enable parameter tracking. This way, during periods of poor excitation the large forgetting factor
ensures that old data is not discarded and conversely, when the excitation is rich λ can be varied
[14].
In the proposed algorithm, the variable forgetting factor is computed as a function of the noise
variance level and the current estimation error. The idea is to choose λ(k) such that the a
posteriori error remains constant in time (E (k) = E (k − 1) = E (0)). In [14] this condition is achieved by using
λ(k) = 1 − (1/E (0)) [1 − ϕT (k)K (k − 1)] ε2(k) (3.12)
where the initial a posteriori error E (0) is defined as a function of the noise variance σ2n:
E (0) = σ2n · N 0,   N 0 = 1/(1 − λ0)
with λ0 being an adjustable parameter.
To achieve better performance, a modification of the algorithm has been proposed in subsequent
studies where the noise variance σ2n(k) is calculated recursively as the weighted sum of the previous
noise variance value and the current prediction error ε(k) [14]. Furthermore, two thresholds are
defined to detect if the parameters have changed. In case the thresholds are exceeded (i.e. a
change has occurred), E (0) is chosen as a small value, which according to (3.12) leads to a smaller
forgetting factor that enables good parameter tracking. Otherwise E (0) is chosen as a large value
which leads to a larger λ that ensures robust estimates during poor excitation. The so-called
RLS with variable forgetting algorithm (VF) is obtained by adding the following equations to the
standard RLS given by (2.16) - (2.19):
N 02 = 1/(1 − λ0) (3.13)
N 01 = 10 N 02 (3.14)
σ2n(k) = κ σ2n(k − 1) + (1 − κ) ε2(k) (3.15)
E (0) = σ2n(k) N 02   if σ2n(k) ≥ σ2n(0) ∧ |σ2n(k) − σ2n(k − 1)| ≥ ∆σ2n
E (0) = σ2n(k) N 01   else (3.16)
λ(k) = 1 − (1/E (0)) [1 − ϕT (k)K (k − 1)] ε2(k)
In the above algorithm, σ_n²(0) and Δσ_n² are the threshold parameters which need to be chosen.
Equation (3.16) states that if the noise variance exceeds its initial level and if there is a
significant increase or decrease of the noise variance level from the previous sample time, it
can be deduced that the process is sufficiently exciting. Thus, to enable good adaptation to
parameter changes, λ can be decreased, which is achieved by choosing the smaller N_02 to compute
the forgetting factor. Otherwise, when the two thresholds are not exceeded, λ should be chosen
as a large value in order to restrict the effects of estimator windup. In this case, the forgetting
factor is computed using the larger N_01 so that λ ≈ 1. Another parameter to be tuned is κ, which
determines how the previous noise variance level and the current estimation error should be
weighted in the recursive computation of σ_n²(k) [14].
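The switching logic above can be condensed into a short sketch. The following Python function is an illustrative reimplementation of (3.13)-(3.16) and the λ(k) update; the thesis implements its estimators as MATLAB functions, so the function name and default values here are assumptions, not the original code.

```python
import numpy as np

def vf_forgetting(eps, sigma2_prev, sigma2_0, delta_sigma2,
                  phi, K_prev, lam0=0.999, kappa=0.88):
    """One step of the VF switching logic: returns the updated noise
    variance sigma_n^2(k) and the forgetting factor lambda(k)."""
    N02 = 1.0 / (1.0 - lam0)                                # (3.13)
    N01 = 10.0 * N02                                        # (3.14)
    sigma2 = kappa * sigma2_prev + (1.0 - kappa) * eps**2   # (3.15)
    # threshold test (3.16): a detected change gives the smaller E(0),
    # hence a smaller lambda and faster tracking
    if sigma2 >= sigma2_0 and abs(sigma2 - sigma2_prev) >= delta_sigma2:
        E0 = sigma2 * N02
    else:
        E0 = sigma2 * N01
    lam = 1.0 - (1.0 - phi @ K_prev) * eps**2 / E0
    return sigma2, lam
```

A large prediction error that also trips both thresholds lowers λ; a small error leaves λ close to 1.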
3.2.2 RLS with multiple forgetting factors
A different approach to an estimator based on variable forgetting is proposed by Vahidi et al
in [22]. The key principle of the algorithm is based on the observation that in many adaptive
systems, parameters often vary with different rates. For instance, regarding the parameters in
the longitudinal dynamics model of vehicles, the vehicle mass is a parameter that changes rather
abruptly (e.g. when passengers get into or out of the vehicle) while the air resistance would vary
barely. Since the standard RLS assumes that all parameters have the same variation rate, windup
can occur when multiple parameters with different variation rates are estimated [22].
A solution to this problem can be obtained when one considers not a single, but multiple forgetting
factors that are individually customized to each parameter. For this, a new loss function is defined
which separates the error caused by each parameter. E.g. for an estimation of two parameters,
the error function can be stated as
V(θ̂_1(k), θ̂_2(k)) = (1/2) Σ_{i=1}^{k} λ_1^{k−i} [y(i) − ϕ_1ᵀ(i)θ̂_1(k) − ϕ_2ᵀ(i)θ̂_2(i)]²
                   + (1/2) Σ_{i=1}^{k} λ_2^{k−i} [y(i) − ϕ_1ᵀ(i)θ̂_1(i) − ϕ_2ᵀ(i)θ̂_2(k)]²
With the above expression, the error term can distinguish between the error caused by the first
estimate and the second estimate. Similar to the standard RLS, the least squares estimates can be
calculated using the individual gradients ∂V/∂θ̂_i(k), i = 1, 2. Therefore, repeating the calculations
for the standard RLS, the update equations for each individual parameter can be obtained (here
presented for two parameters i = 1, 2):
θ̂_i(k) = θ̂_i(k − 1) + k_i(k) ε_i(k)

k_i(k) = p_i(k − 1)ϕ_i(k) [λ_i + ϕ_iᵀ(k) p_i(k − 1)ϕ_i(k)]⁻¹

p_i(k) = [I − k_i(k)ϕ_iᵀ(k)] p_i(k − 1) (1/λ_i)

ε_i(k) = { y(k) − ϕ_1ᵀ(k)θ̂_1(k − 1) − ϕ_2ᵀ(k)θ̂_2(k)   for i = 1
         { y(k) − ϕ_1ᵀ(k)θ̂_1(k) − ϕ_2ᵀ(k)θ̂_2(k − 1)   for i = 2
By applying some additional algebraic rearrangements, the equations can be obtained in vector form
and the Multiple forgetting algorithm (MF) can be stated as [22]:
θ̂(k) = θ̂(k − 1) + k_new(k) ε(k)    (3.17)

ε(k) = y(k) − ϕᵀ(k)θ̂(k − 1)

k_new(k) = k̄ [λ_1⁻¹ p_1(k − 1)ϕ_1(k),  λ_2⁻¹ p_2(k − 1)ϕ_2(k)]ᵀ    (3.18)

k̄ = 1 / (1 + λ_1⁻¹ p_1(k − 1)ϕ_1²(k) + λ_2⁻¹ p_2(k − 1)ϕ_2²(k))    (3.19)

p_i(k) = [1 − k_new,i(k)ϕ_i(k)] p_i(k − 1) (1/λ_i),  ∀i = 1, 2    (3.20)

P(k) = diag(p_1(k), p_2(k))    (3.21)
with k_new,i denoting the i-th element of k_new. In this case, the covariance matrix is a diagonal matrix
with p_1 and p_2 as the diagonal elements. Thus, for each diagonal element an individual forgetting
factor is applied and each element of the covariance matrix is also updated separately.
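For two parameters with scalar regressor entries, eqs. (3.17)-(3.21) reduce to element-wise operations. The following Python sketch is a hypothetical illustration of one MF update step, not the thesis implementation; the initialization values in the usage below are likewise invented.

```python
import numpy as np

def mf_step(theta, p, phi, y, lam):
    """One update of the two-parameter multiple-forgetting (MF) RLS,
    eqs. (3.17)-(3.21). theta, phi, lam are length-2 arrays; p holds the
    scalar diagonal entries p1, p2 of the covariance matrix."""
    eps = y - phi @ theta                        # prediction error
    denom = 1.0 + np.sum(p * phi**2 / lam)       # (3.19)
    k_new = (p * phi / lam) / denom              # (3.18)
    theta = theta + k_new * eps                  # (3.17)
    p = (1.0 - k_new * phi) * p / lam            # (3.20), per-element forgetting
    return theta, p, eps
```

With a slow-varying parameter one would choose its λ_i closer to 1 than that of a fast-varying parameter.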
3.3 Estimation algorithms based on covariance manipulation
While the above mentioned algorithms such as the VF estimator rely on non-uniform forgetting
in time, there are alternative estimation techniques which are based on non-uniform forgetting in
the parameter space. The basic idea comes from the observation that the incoming data is often not uniformly distributed in the parameter space [7]. As a matter of fact, the standard RLS
with constant forgetting is based on the misconception that old data can be forgotten uniformly
since it is assumed to be obsolete. However, old data should only be forgotten when the new
incoming data contains enough new information about the parameters [6]. This is often not the
case since the incoming data is not distributed uniformly in the parameter space [7]. This way,
when data is discarded using a constant forgetting factor, information related to non-excited
directions will be lost, causing the unlimited growth of some elements of the covariance matrix.
Therefore, a variety of algorithms have been proposed where forgetting is applied only in
certain directions. This type of technique, termed 'directional forgetting', aims at discarding
data only in those directions where the incoming information is sufficiently exciting and at
restricting the dismissal of data in the non-excited directions [17], [6].
3.3.1 Directional Forgetting (Bittanti)
There exist many variations of directional forgetting based algorithms. For instance, Bittanti et
al in [4] proposed a version which is henceforth denoted as Directional Forgetting by Bittanti
(DFB). The algorithm can be described by modifying the standard RLS as follows [12]:
θ̂(k) = θ̂(k − 1) + K(k) ε(k)

ε(k) = y(k) − ϕᵀ(k)θ̂(k − 1)

a(k) = ϕᵀ(k)P(k − 1)ϕ(k)    (3.22)

K(k) = P(k − 1)ϕ(k) (1 + a(k))⁻¹    (3.23)

β(k) = { λ − (1 − λ)/a(k)   if a(k) > 0
       { 1                  if a(k) = 0    (3.24)

P(k) = P(k − 1) − P(k − 1)ϕ(k)ϕᵀ(k)P(k − 1) / (β⁻¹(k) + a(k))    (3.25)
In case windup is caused by a regression vector of 0, the term a(k) becomes 0 as well and the
equation for the correction gain (3.23) is the same as the corresponding equation in the standard
RLS without forgetting (2.8). Furthermore, the update equation of the covariance matrix is
almost the same as in (2.9) except for the difference in sign of the denominator expression.
Consequently, the influence of the forgetting factor λ is eliminated from the update equations
and the effects of estimator windup can be limited. On the other hand, if the regression vector is
constant but different from 0, a(k) > 0 applies and β(k) is set to λ − (1 − λ)/a(k), which can be either
positive or negative. Thus, the denominator β⁻¹(k) + a(k) in the update equation of P(k) can
change sign, depending on the chosen forgetting factor and the magnitude of a(k). As a result,
when windup occurs due to a constant regression vector the covariance matrix can either increase
or decrease (i.e. P(k) stays bounded). This is in contrast to the RLS with constant forgetting
where P(k) is unbounded and can thus only increase [12].
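One DFB update, following (3.22)-(3.25), can be sketched as below; the function name and the numerical values in the usage are illustrative assumptions rather than the thesis code.

```python
import numpy as np

def dfb_step(theta, P, phi, y, lam=0.98):
    """One update of directional forgetting by Bittanti (DFB),
    eqs. (3.22)-(3.25)."""
    eps = y - phi @ theta                              # prediction error
    a = phi @ P @ phi                                  # (3.22), excitation measure
    K = P @ phi / (1.0 + a)                            # (3.23)
    beta = lam - (1.0 - lam) / a if a > 0 else 1.0     # (3.24)
    # (3.25): the denominator beta^-1 + a may change sign, so P can
    # increase or decrease but stays bounded
    P = P - np.outer(P @ phi, P @ phi) / (1.0 / beta + a)
    theta = theta + K * eps
    return theta, P, eps
```

For a(k) = 0 the gain reduces to the RLS gain without forgetting, as discussed above.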
3.3.2 Directional Forgetting (Cao)
An alternative directional forgetting based algorithm has been proposed in [7], [6]. The fundamental
idea can be explained by examining the update equation of the information matrix of the
standard RLS in situations of poor excitation, R(k) = λR(k − 1). It can be observed that in this
case, the entire matrix R (k) will tend to 0 because information is forgotten uniformly. However,
a better performance can be achieved when the information content of the regression vector ϕ(k)
is taken into account, i.e. forgetting is only applied to the specific part of R (k), which is affected
by the new information [7].
This leads to the modification of the information matrix update equation from (2.20) to a more
generalized form
R(k) = F(k)R(k − 1) + ϕ(k)ϕᵀ(k)    (3.26)
where F(k) denotes the 'forgetting matrix'. As stated in [6], the forgetting matrix should be
designed to apply forgetting only on the excited subspace of the parameter space.¹ By introducing

R(k) = R̄(k − 1) + ϕ(k)ϕᵀ(k)    (3.27)

R̄(k − 1) = F(k)R(k − 1)    (3.28)

F(k) is to be chosen such that R̄(k − 1) is positive definite and R̄(k − 1) ≤ R(k − 1). This means
that R (k) can be bounded from below, thus preventing R (k) from becoming zero which causes
the windup phenomenon [6].
It is open to debate how F(k) should be chosen. In the algorithm proposed in [6], the forgetting matrix
is derived based on an orthogonal decomposition of R(k) along the excitation direction. Namely,
the information matrix is decomposed into two parts
R(k − 1) = R_1(k − 1) + R_2(k − 1)    (3.29)

where R_2(k − 1) is the part to which forgetting is applied. This way it can be stated that

R_1(k − 1)ϕ(k) = 0    (3.30)

which establishes an orthogonal relationship between R_1(k − 1) and ϕ(k), and (3.29) becomes

R(k − 1)ϕ(k) = R_2(k − 1)ϕ(k)    (3.31)
¹ As one can see, by setting F(k) = λI the update equation of the standard RLS is obtained.
Specifying that the new incoming data ϕ(k)ϕᵀ(k) has rank one and that R_2(k − 1) must have
the same rank while R_1(k − 1) must have rank n − 1, a unique solution is given by

R_2(k − 1) = α(k) R(k − 1)ϕ(k)ϕᵀ(k)R(k − 1)    (3.32)

α(k) = { 1 / (ϕᵀ(k)R(k − 1)ϕ(k))   if |ϕ(k)| > δ
       { 0                          if |ϕ(k)| ≤ δ    (3.33)

R_1(k − 1) = R(k − 1) − α(k)R(k − 1)ϕ(k)ϕᵀ(k)R(k − 1)    (3.34)

where a dead-zone for ϕ(k) is introduced in which the decomposition is not performed (i.e.
R_2(k − 1) = 0, R_1(k − 1) = R(k − 1)) [6].
If forgetting is only applied to R_2(k − 1), the recursive update equation of the information matrix
can be expressed as

R(k) = R_1(k − 1) + λR_2(k − 1) + ϕ(k)ϕᵀ(k)    (3.35)

where R_1(k − 1) refers to the part that is orthogonal to the regression vector, which carries
information not to be discarded, R_2(k − 1) is the part of the information matrix to which
forgetting is applied and ϕ(k)ϕᵀ(k) denotes the new incoming information.
After some reformulations and application of the matrix inversion lemma the estimation algorithm
which is henceforth referred to as Directional Forgetting by Cao (DFC) can be formulated [7],
[6]:
θ̂(k) = θ̂(k − 1) + K(k) ε(k)

ε(k) = y(k) − ϕᵀ(k)θ̂(k − 1)

K(k) = P(k)ϕ(k) = P̄(k − 1)ϕ(k) / (1 + ϕᵀ(k)P̄(k − 1)ϕ(k))    (3.36)

P̄(k − 1) = { P(k − 1) + ((1 − λ)/λ) ϕ(k)ϕᵀ(k) / (ϕᵀ(k)R(k − 1)ϕ(k))   if |ϕ(k)| > δ
           { P(k − 1)                                                   if |ϕ(k)| ≤ δ    (3.37)

P(k) = P̄(k − 1) − P̄(k − 1)ϕ(k)ϕᵀ(k)P̄(k − 1) / (1 + ϕᵀ(k)P̄(k − 1)ϕ(k))    (3.38)
where δ is the threshold parameter for the dead-zone of the covariance matrix update, which
needs to be adjusted. Therefore, when the excitation is poor (i.e. the threshold is not exceeded),
the covariance matrix is not updated, i.e. P(k) = P(k − 1), thus preventing the blowup of the
covariance matrix. In this case the update equations are exactly the same as in the RLS without
forgetting and the effects of windup can be restricted since old data is not discarded [6].
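Since the algorithm needs both the covariance matrix P and the information matrix R, a sketch of one DFC update has to carry both along. The following Python function is an illustrative reconstruction of (3.35)-(3.38); names, default values and the dead-zone threshold are assumptions.

```python
import numpy as np

def dfc_step(theta, P, R, phi, y, lam=0.98, delta=1e-6):
    """One update of directional forgetting by Cao (DFC), eqs. (3.35)-(3.38).
    R is the information matrix, the inverse of P."""
    eps = y - phi @ theta
    if np.linalg.norm(phi) > delta:
        # forget only along the excited direction, eq. (3.37)
        Pbar = P + (1.0 - lam) / lam * np.outer(phi, phi) / (phi @ R @ phi)
        # corresponding information-matrix part: R1 + lam*R2, cf. (3.35)
        R = R - (1.0 - lam) * np.outer(R @ phi, R @ phi) / (phi @ R @ phi)
    else:
        Pbar = P.copy()                      # dead-zone: no forgetting
    denom = 1.0 + phi @ Pbar @ phi
    K = Pbar @ phi / denom                   # (3.36)
    P = Pbar - np.outer(Pbar @ phi, Pbar @ phi) / denom   # (3.38)
    R = R + np.outer(phi, phi)               # add new information, (3.35)
    theta = theta + K * eps
    return theta, P, R, eps
```

Because every step performs exact rank-one updates, P and R remain inverses of each other.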
3.3.3 Kalman filter based algorithm (I)
In a subsequent study by Cao and Schwartz, a further modification of the DFC algorithm has
been proposed. The proposal of a modified version is motivated by the fact that in the DFC
algorithm, the update of the covariance matrix requires both the value of the covariance matrix
P(k − 1) of the previous sample as well as its inverse R(k − 1). In order to improve computational
efficiency, a simplified version of the algorithm has been designed where ϕᵀ(k)R(k − 1)ϕ(k) is
replaced by an expression which does not contain R(k − 1). The simplified version can be expressed
as follows [9]:
θ̂(k) = θ̂(k − 1) + K(k) ε(k)

ε(k) = y(k) − ϕᵀ(k)θ̂(k − 1)

K(k) = P(k − 1)ϕ(k) / (r + ϕᵀ(k)P(k − 1)ϕ(k))

P(k) = P(k − 1) − P(k − 1)ϕ(k)ϕᵀ(k)P(k − 1) / (r + ϕᵀ(k)P(k − 1)ϕ(k)) + ρϕ(k)ϕᵀ(k) / (γ + ϕᵀ(k)ϕ(k))    (3.39)
The estimator is denoted by the authors as the Kalman filter based algorithm (KFB-I) since it
has very similar properties to a standard Kalman filter (see (2.7), (2.10), (2.13), (2.14)). Here
r, ρ and γ are adjustable parameters. For instance, ρ is a parameter that determines the tracking
speed of the algorithm while γ can often be chosen as a very small value to ensure that the
covariance matrix is well-defined. Interpreting the algorithm as a modification of the standard
Kalman filter, r represents the variance of the measurement noise, which can e.g. be
assumed as Gaussian and known, i.e. r(k) = r [8].
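A minimal sketch of one KFB-I update per (3.39) follows; the tuning values r, ρ and γ below are illustrative assumptions, not the values used in the thesis.

```python
import numpy as np

def kfb1_step(theta, P, phi, y, r=1.0, rho=0.01, gamma=1e-8):
    """One update of the Kalman-filter based algorithm KFB-I, eq. (3.39)."""
    eps = y - phi @ theta
    denom = r + phi @ P @ phi
    K = P @ phi / denom
    # covariance update: Kalman-type contraction plus a bounded
    # excitation-dependent injection that keeps P from collapsing
    P = P - np.outer(P @ phi, P @ phi) / denom \
        + rho * np.outer(phi, phi) / (gamma + phi @ phi)
    theta = theta + K * eps
    return theta, P, eps
```

The injected term plays the role of the process-noise covariance Q(k) in a Kalman filter and vanishes when ϕ(k) = 0, so the covariance stays bounded during poor excitation.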
3.3.4 Kalman filter based algorithm (II)
Another modified version of the above algorithm is developed in [8]. The derivation is motivated
by the similar properties between the described Kalman filter based algorithm KFB-I and the
standard Kalman filter for parameter estimation. In the standard Kalman filter, the covariance
matrix update is given by [8] (see (2.14))

P(k) = P(k − 1) − P(k − 1)ϕ(k)ϕᵀ(k)P(k − 1) / (r(k) + ϕᵀ(k)P(k − 1)ϕ(k)) + Q(k)
where Q(k) is the covariance matrix of the random walk sequence vector w(k). Since in real
applications Q(k) is never known exactly, it is possible to compute Q(k) recursively. Thus, the
so-called Modified Kalman Filter based algorithm (KFB-II) can be obtained by simply modifying
(3.39) as [8]
P(k) = P(k − 1) − P(k − 1)ϕ(k)ϕᵀ(k)P(k − 1) / (r + ϕᵀ(k)P(k − 1)ϕ(k)) + Q(k)    (3.40)

Q(k) = λQ(k − 1) + ρϕ(k)ϕᵀ(k) / (γ + ϕᵀ(k)ϕ(k))    (3.41)
It can be observed that the modified version differs from the original version simply in the
choice of Q(k). That is, in the KFB-I estimator the variance matrix is chosen directly as
Q(k) = ρϕ(k)ϕᵀ(k) / (γ + ϕᵀ(k)ϕ(k)), while in the modified version Q(k) is obtained recursively as the weighted
sum of the previous value at k − 1 and ρϕ(k)ϕᵀ(k) / (γ + ϕᵀ(k)ϕ(k)).
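The recursive choice of Q(k) turns KFB-I into KFB-II with a one-line change, as the following hedged sketch shows (names and tuning values are again illustrative):

```python
import numpy as np

def kfb2_step(theta, P, Q, phi, y, r=1.0, rho=0.01, gamma=1e-8, lam=0.98):
    """One update of the modified algorithm KFB-II, eqs. (3.40)-(3.41):
    identical to KFB-I except that Q(k) is filtered recursively."""
    eps = y - phi @ theta
    Q = lam * Q + rho * np.outer(phi, phi) / (gamma + phi @ phi)   # (3.41)
    denom = r + phi @ P @ phi
    K = P @ phi / denom
    P = P - np.outer(P @ phi, P @ phi) / denom + Q                 # (3.40)
    theta = theta + K * eps
    return theta, P, Q, eps
```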
In summary, it can be proven that the properties of both Kalman filter based algorithms as well
as DFC ensure that the covariance matrix is bounded both from below and from above [9], [8]. This
represents a desirable property of any estimation algorithm since the boundedness from below
ensures good tracking abilities (since P(k) does not tend to zero), while the boundedness
from above restricts the effects of estimator windup, indicating that the covariance matrix cannot
increase infinitely. In contrast, DFB only shows upper boundedness of the covariance matrix [6].
Hence, although all directional forgetting based algorithms should be able to restrict the effects
of windup, one should expect that the latter three DF based algorithms provide better tracking
abilities than the first algorithm.
3.3.5 Stenlund-Gustafsson Anti-Windup algorithm
Another directional forgetting based algorithm with properties similar to a Kalman filter is
proposed by Stenlund and Gustafsson. The starting point of the algorithm is the covariance update
equation (3.40) of a Kalman filter for parameter estimation. In [20], it is proposed to choose the
variance matrix Q(k) as:

Q(k) = P_d ϕ(k)ϕᵀ(k) P_d / (r + ϕᵀ(k)P_d ϕ(k))    (3.42)

where r denotes the error covariance and P_d ∈ ℝ^{n×n} is a matrix to be adjusted. This way, by
adding Q(k) to the update equation, P_d becomes the matrix to which P(k) converges in periods
of poor excitation [1]. This indicates that the covariance matrix stays bounded. As a result, the
algorithm, which is henceforth denoted as Stenlund Gustafsson Anti-windup (SG), can be obtained
by adding (3.42) to the standard Kalman filter estimator given by (2.7), (2.10), (2.13) [20].
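The defining property of the SG choice of Q(k) is that P = P_d is a fixed point of the covariance update. This can be checked with a small sketch (function name and values are illustrative assumptions):

```python
import numpy as np

def sg_step(theta, P, phi, y, Pd, r=1.0):
    """One update of the Stenlund-Gustafsson (SG) anti-windup estimator:
    a Kalman-filter parameter update with Q(k) chosen as in (3.42)."""
    eps = y - phi @ theta
    Q = np.outer(Pd @ phi, Pd @ phi) / (r + phi @ Pd @ phi)   # (3.42)
    denom = r + phi @ P @ phi
    K = P @ phi / denom
    # when P == Pd the subtracted and injected terms cancel exactly,
    # so P converges to Pd in periods of poor excitation
    P = P - np.outer(P @ phi, P @ phi) / denom + Q
    theta = theta + K * eps
    return theta, P, eps
```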
3.4 Estimation algorithms based on limiting or scaling the covariance
matrix trace
Various studies have observed that in order to design a well-behaved estimator which is capable of
avoiding windup, the boundedness of the covariance matrix from above is of particular importance
[8]. While the above mentioned directional forgetting estimators bound the covariance matrix
indirectly through the application of non-uniform forgetting, there exist a variety of algorithms
which are based on the direct bounding of the covariance matrix through limiting or scaling the
matrix trace.
3.4.1 Constant trace algorithm
The idea of the constant trace algorithm is to scale the P matrix in each iteration such that
its trace remains constant. This way the eigenvalues of the covariance matrix cannot increase
infinitely as the trace is kept at a constant value. The algorithm can be obtained by introducing a
24
8/19/2019 Realtime Estimation of Driving Resistance_Author Shi Jieqing
25/54
3.4 Estimation algorithms based on limiting or scaling the covariance matrix trace
recursively calculated matrix P̄(k) and by calculating P(k) as a function of P̄(k). The Constant
trace algorithm (CT) can be described by the following equations [2]:
θ̂(k) = θ̂(k − 1) + K(k) ε(k)

ε(k) = y(k) − ϕᵀ(k)θ̂(k − 1)

K(k) = P(k − 1)ϕ(k) [λ + ϕᵀ(k)P(k − 1)ϕ(k)]⁻¹

P̄(k) = (1/λ) [P̄(k − 1) − P̄(k − 1)ϕ(k)ϕᵀ(k)P̄(k − 1) / (1 + ϕᵀ(k)P̄(k − 1)ϕ(k))]    (3.43)

P(k) = c_1 P̄(k) / tr{P̄(k)} + c_2 I    (3.44)
The key principle of the estimator can be explained through (3.43) and (3.44). Assuming that
excitation is poor, ϕ(k) = 0 leads to the exponential increase of P̄(k) due to P̄(k) = (1/λ)P̄(k − 1).
However, by dividing the matrix by its trace, the result P̄(k)/tr{P̄(k)} is scaled such that its trace
remains constant, no matter how large P̄(k) becomes. Therefore, the covariance matrix P(k) stays
bounded even in periods of poor excitation. The optional term c_2 I is added as a regularization
mechanism [2], [18].
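The trace-scaling step can be sketched as follows; the function and the choices c_1 = 1, c_2 = 0 in the usage are illustrative assumptions.

```python
import numpy as np

def ct_step(theta, Pbar, phi, y, lam=0.98, c1=1.0, c2=0.0):
    """One update of the constant-trace (CT) algorithm, eqs. (3.43)-(3.44).
    The scaled covariance P is rebuilt from the internal matrix Pbar."""
    # (3.44): scaling fixes tr(P) = c1 + c2*n regardless of how large Pbar grows
    P = c1 * Pbar / np.trace(Pbar) + c2 * np.eye(len(theta))
    eps = y - phi @ theta
    K = P @ phi / (lam + phi @ P @ phi)
    # (3.43): recursive update of the unscaled matrix
    Pbar = (Pbar - np.outer(Pbar @ phi, Pbar @ phi)
            / (1.0 + phi @ Pbar @ phi)) / lam
    theta = theta + K * eps
    return theta, Pbar, eps
```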
3.4.2 Maximum trace algorithm
Another method to achieve boundedness of the covariance matrix is to limit its trace to a
maximum value. In [16], this is achieved by modifying the forgetting factor according to:

λ(k) = 1 − (1 − λ_0) [1 − tr{P(k)} / tr{P_max}]    (3.45)
Thus, the Maximum trace algorithm (MT) is given by substituting the constant forgetting
factor of the standard RLS algorithm by (3.45). Using said expression, λ tends to 1 once the
trace of matrix P approaches the predefined maximum value tr{P_max} since 1 − tr{P(k)}/tr{P_max} = 0.
Conversely, when the covariance matrix converges to 0, λ tends to the specified lower bound λ_0,
which ensures the algorithm's adaptability to parameter changes.
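Equation (3.45) is a simple interpolation between λ_0 and 1, which a one-line sketch makes explicit (λ_0 = 0.95 below is an illustrative value):

```python
import numpy as np

def mt_lambda(P, trace_P_max, lam0=0.95):
    """Forgetting factor of the maximum-trace (MT) algorithm, eq. (3.45):
    lambda -> 1 as tr(P) approaches the limit, lambda -> lam0 as P -> 0."""
    return 1.0 - (1.0 - lam0) * (1.0 - np.trace(P) / trace_P_max)
```

This λ(k) simply replaces the constant forgetting factor in the standard RLS recursion.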
4 Implementation and comparison of parameter identification algorithms
In the following the estimation algorithms described in chapter 3 shall be implemented for the
estimation of vehicle mass and driving resistances as described by the longitudinal dynamics model
in 2.2.3. The algorithms are subsequently evaluated in terms of performance using numerical
data obtained from various test drives. A simulation model implemented in Simulink is used to
test the estimators which are subsequently compared in terms of various quality criteria.
4.1 Experimental setup
The starting point for the evaluation of the proposed estimators is a model of a vehicle’s
longitudinal dynamics (see (2.23)) implemented in MATLAB/Simulink . The experimental data
is obtained from 'recorded' test drives which have been performed under different circumstances,
such as different vehicle types and/or different loads etc. In general, the test drives reflect
the environment of different landscapes. For instance, the vehicles travel through both flat
landscapes such as highways, where the velocity is often constant and the acceleration is minimal, as
well as city traffic, which is characterized by varying velocity/acceleration. Thus, the performance
of the estimators can be evaluated under different circumstances. The relevant signals from the
test drives are transferred from a vehicle bus system to the simulation model [1].
For the evaluation of the proposed estimators, some preliminary preparations are made to improve
the estimators’ performances. For instance, all algorithms need a ’learning phase’ at the beginning.
This means that before they start to converge, the estimates can take on values which do not make
sense physically (i.e. too large or too small). Consequently, upon computation the parameters
are limited to stay within a physically meaningful range [θ̂_min, θ̂_max]. Furthermore,
signals needed to formulate the longitudinal dynamics model such as the driving force F_A and the
longitudinal acceleration a_x can be noisy. This is why, upon obtaining these signals from the bus
network system, a PT1 filter is used to reduce the noise. Finally, the parameters to be estimated
have different scales which can lead to numerical problems during computation. Therefore, the
estimated variables are scaled to the same magnitude using a transformation
θ̃ = T θ̂ (4.1)
where the transformation matrix T is chosen so that the transformed variables θ̃ lie within the
range of −1 to 1.
In order to compare and evaluate the estimators the reference values of the vehicle mass and
driving resistances are required. For this purpose, the first parameter m, i.e. the mass of the
vehicle including its load, is measured before a test drive using a scale. The reference values of
the coefficients needed for the estimation of the second and third parameter, mf_R and Ac_w, are
determined through a specific experimental setup. During these experiments, the vehicle speed
is measured and the coefficients are computed through non-linear optimization such that the
estimated speed using these coefficients is close to the actual measured speed. For the sake of
simplicity, the second and third parameter are henceforth denoted as f_0 = mf_R and f_2 = Ac_w,
respectively.
4.2 Simulation model
A simulation model has been implemented in Simulink and is shown in fig. 4.1.

Figure 4.1 – Simulink Model of test study

In the model, a block named 'Parameters' contains all parameters needed for the adjustment or initialization of
the algorithms, such as values of forgetting factors or initial values for estimates or covariance
matrices. Another block ’CAN2input’ processes all data obtained from the vehicle bus system.
The various data are subsequently sent to the orange colored estimator blocks as well as another
block called ’A/B Generator 3P’ which contains the model for the longitudinal dynamics. Finally,
an ’Evaluation’ block aggregates the simulation results regarding some quality measures and
sends the data to the MATLAB workspace for further processing.
All algorithm blocks have the same structure which is displayed in fig. 4.2 for the RLS algorithm.
The estimators are implemented as enabled subsystems since the estimators should only compute
when the conditions stated in 2.2.3 are fulfilled. The fulfillment of the conditions is evaluated in
a specific subsystem and the evaluation result is transmitted to each algorithm block using a
Goto-Block 'Valid'. This way, the estimators are activated only if Valid = 1. Each algorithm is
implemented as a MATLAB function. Upon simulation, the results are sent to the workspace
for further processing. Furthermore, a number of quality measures are used to evaluate the
27
8/19/2019 Realtime Estimation of Driving Resistance_Author Shi Jieqing
28/54
4.3 Quality measures for estimator evaluation
Figure 4.2 – Structure of algorithm blocks
algorithms' performance.
4.3 Quality measures for estimator evaluation
For the comparison of the various estimators, different quality measures are used in the simulation,
which all involve the computation of the quadratic error between the estimated and the actual
values. The quality measures used for the evaluation are:
• RMSE: the root mean square error between the measured output y and the estimated
output ϕᵀθ̂.
• RMSE_m: the root mean square error between the actual vehicle mass m_ref and the
estimated mass m.
• pRMSE_m and nRMSE_m: the positive and negative RMSE_m values, respectively.
• Δm_max: the maximum deviation of the estimated mass from its true value.
• RMSE_v: the root mean square error between the predicted speed and the measured speed
from the vehicle bus. Basically, the obtained parameter estimates θ̂ are used for an ahead-
prediction of the vehicle speed. The prediction model is implemented for each algorithm
(see fig. 4.2). The obtained result is compared to the measured vehicle speed from the test
drives and the root mean square error is computed.
• pRMSE_v and nRMSE_v: the positive and negative values of RMSE_v, respectively.
All quality measures are calculated recursively. If the update conditions are fulfilled and the
starting phase of an algorithm has passed, the residual sum of squares (RSoS) is calculated as:

k = k̄ + 1    (4.2)

RSoS(k) = (1/k) [k̄ · RSoS(k̄) + E²]    (4.3)

with k and k̄ denoting the current and previous sample count, respectively, and E denoting
the error between estimated value and actual/reference value. Otherwise, when the update conditions
are not fulfilled, k = k̄ and RSoS(k) = RSoS(k̄). The root mean square error is subsequently
computed as the square root of the RSoS value.
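The recursion above is a running mean of squared errors over the valid samples only. A minimal, hypothetical sketch (the class name and interface are assumptions):

```python
import math

class RecursiveRMSE:
    """Recursive residual-sum-of-squares / RMSE tracker, cf. (4.2)-(4.3).
    Samples are only counted while the update conditions hold."""
    def __init__(self):
        self.k = 0          # number of valid samples so far
        self.rsos = 0.0     # running mean of squared errors

    def update(self, error, valid=True):
        if valid:
            k_prev = self.k
            self.k = k_prev + 1                                    # (4.2)
            self.rsos = (k_prev * self.rsos + error**2) / self.k   # (4.3)
        # invalid samples leave k and RSoS unchanged
        return math.sqrt(self.rsos)
```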
Regarding the calculation of the quality measures, a timer is used to count the time instances
during which the validity conditions of 2.2.3 are fulfilled and stops when the requirements
are violated. When the duration of the valid time instances passes a predefined threshold, the
calculation of the quality measures is activated. The reason for setting such a threshold is because
in the beginning of a test drive, an estimator is still in its learning phase. Thus, the obtained
estimates may fluctuate drastically in the beginning and may also show large deviations from
the actual values. This would lead to large error values which should not be counted in the
evaluation of the estimators. Therefore by setting a duration threshold it is ensured that the
quality measures are only calculated when the learning phase is passed. However, the duration
threshold must be set at an appropriate value since a threshold too large can cause the error
calculation to not be activated at all while a threshold too small can lead to biased results for
the evaluation.
4.4 Discussion of simulation results
First, the implementation as well as the general performance of the algorithms introduced in chapter
3 shall be discussed. Subsequently, the estimation quality is studied in terms of the achieved
quality measures. For the sake of simplicity, it is assumed that the true/reference values of the
parameters remain constant. All estimators are initialized with the same covariance matrix and
the same parameter vector.
4.4.1 Standard Recursive Least Squares
The standard RLS with constant forgetting is implemented according to (2.16) - (2.19). For
initialization of the algorithm, the vectors

θ̂(0) = 1_{3×1}
P(0) = 10⁻⁵ · I_{3×3}

are chosen, where θ̂ is multiplied with T = [2100 20 0.6] for scaling. Moreover, a forgetting
factor of λ = 0.9999 is used. The algorithms are evaluated based on the data obtained from
a total of 30 test drives. During some of the test drives the vehicle travels mostly through
inner cities. Thus, due to the acceleration and braking maneuvers in inner city traffic, the input
signals from ϕᵀ = [a_x, g, ½ρ_air v²] can be seen as persistently exciting and consequently, the
RLS estimator performs reasonably well. However, during test drives on highways or freeways,
where the vehicle speed and acceleration remain constant for long periods of time, the insufficient
excitation in the regressor can lead to noticeable estimator windup (see fig. 4.3). For instance,
Figure 4.3 – Estimator windup during t = 100 s − 800 s in the RLS algorithm (test drive 9); velocity v; acceleration a_x
test drive no. 9 happens mostly on a free- or highway, which can be deduced from the vehicle
speeds v > 100 km/h. Due to the poorly exciting signals of vehicle speed v and acceleration a_x,
which is most noticeable during t = 100 s − 800 s, a significant drift in the parameter estimates can
be observed. This is especially noticeable in the estimates of f_0, whose values reach the bottom
thresholds defined by the saturation limits. However, once the signals are properly exciting
again (t > 800 s), the estimates converge again to their reference values. This is also reflected in
the eigenvalues as well as the trace of the covariance matrix. Fig. 4.4 shows the covariance matrix
eigenvalues and the trace in logarithmic scale. The observed peaks in the eigenvalues (such as
around t ≈ 800 s) happen when the eigenvalues switch orders, i.e. the largest eigenvalue becomes
the second largest or smallest. Evidently, the inaccurate estimates of the parameters correspond
to the relatively large eigenvalue/trace values of the covariance matrix during t = 100 s − 800 s.
Once the input is sufficiently exciting again, the eigenvalues and thus the trace decrease and the
estimates become more accurate (see fig. 4.3, 4.4).
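The windup mechanism can be reproduced in a small numerical experiment: a generic RLS step with constant forgetting (in the spirit of (2.16)-(2.19), which are not shown in this excerpt) is fed a constant regressor, so only one direction is excited and the covariance grows without bound in the other. All values below are illustrative.

```python
import numpy as np

def rls_step(theta, P, phi, y, lam=0.999):
    """Standard RLS with a constant forgetting factor."""
    eps = y - phi @ theta
    K = P @ phi / (lam + phi @ P @ phi)
    theta = theta + K * eps
    P = (P - np.outer(P @ phi, P @ phi) / (lam + phi @ P @ phi)) / lam
    return theta, P

# windup demo: the second regressor entry is always zero,
# so P grows by a factor 1/lam per step in that direction
theta = np.zeros(2)
P = np.eye(2)
phi = np.array([1.0, 0.0])          # poor excitation
tr0 = np.trace(P)
for _ in range(20000):
    theta, P = rls_step(theta, P, phi, phi @ np.array([1.0, 1.0]))
```

The excited component converges, the non-excited one is never updated, and the covariance trace explodes, mirroring fig. 4.3 and 4.4.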
4.4.2 Regularization based algorithms
Aside from the initialization of the information matrix R(0), the two regularization based
algorithms have two adjustable parameters λ and α_TR/α_LMR. In the model, λ is chosen as the
Figure 4.4 – Eigenvalues and trace of the covariance matrix for RLS in logarithmic scale (test drive 9)
same value as for the RLS estimator, while α_TR is chosen as 2 · 10⁻⁶ and α_LMR = 10⁻⁸.
As already illustrated, the regularization based algorithms TR and LMR aim at making the
least squares problem well-conditioned so as to prevent the information matrix R from becoming
singular. As a direct consequence, it is to be expected that the covariance matrices of the
regularization based algorithms have better condition numbers than the regular RLS. This is
shown in fig. 4.5 for test drive no. 6, where one can see that the condition numbers of TR and,
especially, LMR are significantly lower than those of RLS. This is also reflected in the parameter
estimates for test drive no. 6, which are shown in fig. 4.6. One can observe that during said test
drive poor excitation occurs noticeably during t = 100 s − 200 s as well as t = 300 s − 400 s. During
this time, the TR and especially LMR estimators manage to keep the condition number lower,
which is directly reflected in the more robust and accurate estimates as compared to the regular
RLS.
4.4.3 Variable forgetting based algorithms
The implementations of the VF and MF algorithms are described in 3.2.1 and 3.2.2. As already
stated, the VF algorithm is basically a modified RLS with a noise variance based detection
mechanism for windup, based on which λ is varied. In the described model, the algorithm has
Figure 4.5 – Condition numbers of TR and LMR in logarithmic scale (test drive 6)
Figure 4.6 – Estimates of TR and LMR; velocity v; acceleration a_x (test drive 6)
32
8/19/2019 Realtime Estimation of Driving Resistance_Author Shi Jieqing
33/54
4.4 Discussion of simulation results
4 adjustable parameters (not counting the initial covariance matrix). Through trial and error,
these have been chosen as

λ_0 = 0.999
κ = 0.88
σ_n²(0) = 5 · 10⁵
Δσ_n² = 10⁶

Furthermore, a lower bound for λ has been set at 0.99 below which λ is not allowed to fall.
The key concept of the VF estimator is best illustrated in fig. 4.7 for test drive no. 9 where one
can clearly see the variation of λ.
Figure 4.7 – Variation of the forgetting factor in the VF estimator (test drive 9)
Upon further examination of the simulation results, it can be concluded that with the given
parameters, the VF estimator behaves very similarly to the RLS estimator. In fact, for almost all
test drives the estimation signals of the VF and RLS estimators overlap almost completely. A
small difference can be observed when using a more detailed scale of the signals (see fig. 4.8).
Regarding the MF estimator, the algorithm adopted from [22] only discusses the case of estimating two parameters. Therefore, in order to estimate three parameters, the algorithm has been adapted accordingly to feature an additional forgetting factor.
Figure 4.8 – Excerpt of the parameter estimates of VF; velocity v; acceleration ax (test drive 9)
Thus, there are three parameters which need to be adjusted. These have been determined as λm = 0.9999, λf0 = 0.9999 and λf2 = 0.99999, respectively.
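The per-parameter forgetting idea can be sketched as follows. The diagonal scaling used here is one simple way to generalize a scalar forgetting factor to one factor per parameter; it is only an illustration of the concept, not necessarily the exact three-parameter adaptation used in the thesis:

```python
import numpy as np

def mf_rls_step(theta, P, phi, y, lams=(0.9999, 0.9999, 0.99999)):
    """RLS step with one forgetting factor per parameter (illustrative).

    Instead of dividing the whole covariance matrix by a single lambda,
    a diagonal matrix diag(lams) discounts each parameter direction
    individually; the symmetric scaling keeps P symmetric."""
    lam_inv_sqrt = np.diag(1.0 / np.sqrt(np.asarray(lams)))
    e = y - phi @ theta                       # prediction error
    K = P @ phi / (1.0 + phi @ P @ phi)       # gain vector
    theta = theta + K * e                     # parameter update
    P = lam_inv_sqrt @ (P - np.outer(K, phi @ P)) @ lam_inv_sqrt
    return theta, P
```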
The performance of the MF algorithm with the given parameters is somewhat unique, as it is capable of providing accurate estimates for some test drives while for others the estimates are noise sensitive. This is best illustrated in fig. 4.9, which shows the MF estimator's performance for test drives no. 13, 9 and 6 (from left to right), all three of which are cases with a noticeable lack of excitation. One can deduce that for test drive 13 the MF estimator performs better than the RLS with regard to the f0 estimation but slightly worse with regard to the m estimation (larger deviations during t = 500 s − 1000 s). For test drive 9, MF provides significantly more accurate estimates for all three parameters. For test drive 6, however, the MF estimates of f2 are very noisy, although the estimates for m and f0 are fairly accurate.
4.4.4 Covariance matrix manipulation based algorithms
The algorithms DFB, DFC, KFB-I, KFB-II and SG are all based on manipulating the covariance matrix so that forgetting is only applied in certain directions of the parameter space.
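One common formulation of directional forgetting, working on the information matrix R = P⁻¹, can be sketched as below; the DFB/DFC variants used in the thesis differ in their details, so this is only an illustration of the principle:

```python
import numpy as np

def df_rls_step(theta, P, phi, y, lam=0.9999, eps=1e-12):
    """RLS step with directional forgetting (one common formulation).

    Old information is discounted only along the current regressor
    direction: R = inv(P) keeps its value in directions that receive no
    excitation, so the covariance cannot wind up there."""
    R = np.linalg.inv(P)
    denom = phi @ R @ phi
    if denom > eps:  # discount old information only in the excited direction
        R = R - (1.0 - lam) * np.outer(R @ phi, R @ phi) / denom
    R = R + np.outer(phi, phi)      # add the new information
    P = np.linalg.inv(R)
    e = y - phi @ theta             # prediction error
    theta = theta + P @ phi * e     # parameter update
    return theta, P
```

In the unexcited directions the covariance stays bounded (for φ = [1, 0, 0]ᵀ the entry P[1, 1] is left untouched), which is exactly the anti-windup property discussed in this section.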
Just like the RLS estimator, DFB only has λ as an adjustable parameter, which is chosen as λ = 0.9999 in the simulations. Taking e.g. test drive 13, the DFB estimator performs better than the standard RLS during the period of poor excitation (from 600 s to 1000 s); however, the difference is insignificant. Only with an increasing horizon (from t = 1000 s onwards) do the DFB estimates show better convergence to the reference values than the RLS estimates. From the trace of the covariance matrix, one can clearly see that while the trace of the RLS covariance matrix
Figure 4.9 – Estimates of MF; velocity v; acceleration ax: left = test drive 13, middle = test drive 9, right = test drive 6
grows during the period of poor excitation, the trace of the DFB covariance matrix actually decreases (see fig. 4.10). In fact, starting from t = 600 s (where the poor excitation begins), the trace of the DFB covariance matrix is always smaller than the trace of the RLS covariance matrix. This indicates that the windup effects are indeed limited in the DFB estimator, which also explains why, for the given test drive, the parameter estimates are more accurate than those of the RLS estimator.
Figure 4.10 – Trace of the covariance matrix of the DFB estimator in logarithmic scale (test drive 13)
Next, the DFC and KFB-I algorithms are analyzed. The DFC algorithm has two adjustable
parameters, while the KFB-I has three. For the simulation these are determined as:
DFC:
λ = 0.9999
δ = 1000

KFB-I:
ρ = 10⁻¹³
γ = 10⁻¹⁰
r = 1
Figure 4.11 – Estimates of DFB; velocity v; acceleration ax (test drive 13)
Figure 4.12 – Estimates of DFC and KFB-I; velocity v; acceleration ax (test drive 13)
Figure 4.13 – Trace of the covariance matrix of DFC and KFB-I in logarithmic scale (test drive 13)
With the given parameters it has been found that DFC and KFB-I behave very similarly. For instance, in test drive no. 13 the signals of both estimators overlap almost entirely (see for instance the f0 estimates in fig. 4.12); a small difference can only be seen on a more detailed scale of the estimates (see the m estimates in fig. 4.12). Furthermore, analyzing the trace of the P matrix for the given test drive, it can be deduced that both DFC and KFB-I have similar properties to the DFB algorithm, as both estimators restrict the growth of the covariance matrix during phases of insufficient excitation. In fact, a general observation is that both the DFC and KFB-I estimators behave very similarly to DFB for almost all test drives.
Subsequently, the KFB-II estimator is examined, for which the four adjustable parameters are set as

λ = 0.9999
ρ = 10⁻¹⁵
γ = 10⁻¹⁰
r = 1

It has been found that with the given parameters the KFB-II estimator always produces sensible estimates for f0 and f2; the estimates of m, however, can be very noise sensitive. In fact, the sensitivity varies between test drives. For instance, fig. 4.14, which displays the KFB-II estimates for test drive no. 10, shows that the mass estimates for this test drive are very noisy while the estimates of the other two parameters are reasonably smooth. In general, it has been found that the accuracy of the algorithm, especially regarding the estimates of m, depends on the properties of the considered test drive.
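The Kalman-filter-based idea behind the KFB estimators can be sketched as a random-walk parameter Kalman filter, where a small process-noise term replaces exponential forgetting. The exact roles of ρ, γ and r in KFB-I/II follow section 3.2 and may differ from this simplified illustration:

```python
import numpy as np

def kfb_rls_step(theta, P, phi, y, rho=1e-15, r=1.0):
    """Kalman-filter-based parameter estimator step (illustrative).

    Instead of exponential forgetting, a small process-noise term rho*I
    keeps the estimator adaptive; r is the measurement-noise variance."""
    e = y - phi @ theta                   # innovation (prediction error)
    S = phi @ P @ phi + r                 # innovation variance
    K = P @ phi / S                       # Kalman gain
    theta = theta + K * e                 # parameter update
    P = P - np.outer(K, phi @ P) + rho * np.eye(len(theta))
    return theta, P
```

Because the additive term ρ·I is tiny and excitation-independent, the covariance stays bounded during poor excitation instead of being inflated by division through λ < 1.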
Figure 4.14 – Estimates of the KFB-II algorithm; velocity v; acceleration ax (test drive 10)
Lastly, in the category of directional forgetting based algorithms, the behavior of the SG anti-windup estimator is discussed. For this algorithm only the convergence matrix Pd needs to be specified, which is chosen as Pd = diag(10⁻¹⁰, 10⁻¹², 10⁻¹²). In summary, it has been observed that the SG estimator behaves similarly to the above-mentioned directional forgetting based algorithms (with the exception of KFB-II). Using the example of test drive no. 13, the SG estimator achieves estimates for the given parameters similar to those of the other directional forgetting based estimators. This is illustrated in fig. 4.15, where the estimates of DFB, DFC, KFB1 and SG are displayed. Evidently, the similarity of the estimates results in an almost complete overlap of the signals, which is why only the estimates of SG can be seen.

Figure 4.15 – Estimates of DFB, DFC, KFB1 and SG; velocity v; acceleration ax (test drive 13)
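The SG mechanism can be sketched as a Kalman-type update whose data-dependent process noise makes the chosen matrix Pd a stationary point of the covariance recursion. This follows one published formulation of the Stenlund–Gustafsson algorithm and is a sketch rather than the thesis' exact implementation:

```python
import numpy as np

def sg_rls_step(theta, P, phi, y, Pd, r=1.0):
    """Stenlund-Gustafsson anti-windup step (one published formulation).

    The process-noise term Q is constructed so that the desired matrix Pd
    is a fixed point of the covariance recursion: without excitation P
    stays put, and with excitation it is driven toward Pd."""
    e = y - phi @ theta                   # prediction error
    S = r + phi @ P @ phi                 # innovation variance
    K = P @ phi / S                       # gain vector
    theta = theta + K * e                 # parameter update
    Q = np.outer(Pd @ phi, Pd @ phi) / (r + phi @ Pd @ phi)
    P = P - np.outer(K, phi @ P) + Q      # covariance update
    return theta, P
```

Starting exactly at P = Pd, the subtracted and added terms cancel and the covariance is unchanged by the update, so no windup can occur regardless of the excitation.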
4.4.5 Trace manipulation based algorithms
Finally, the CT and MT algorithms, which are based on a limitation or scaling of the covariance matrix, are evaluated.
The core concept of the CT algorithm is to keep the trace of the covariance matrix constant
at all times. This is illustrated in