Topic #22
16.31 Feedback Control
Deterministic LQR
• Optimal control and the Riccati equation
• Lagrange multipliers
• The Hamiltonian matrix and the symmetric root locus
Factoids: for symmetric R,
∂(uᵀRu)/∂u = 2uᵀR
∂(Ru)/∂u = R
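• As a quick numerical sanity check (not part of the notes), the factoid can be verified with finite differences; the matrix and vector below are arbitrary test data:

```python
import numpy as np

# Check that d(u'Ru)/du = 2u'R for symmetric R, via central differences.
rng = np.random.default_rng(0)
M = rng.standard_normal((3, 3))
R = M + M.T                       # symmetric R
u = rng.standard_normal(3)

f = lambda v: v @ R @ v           # scalar v'Rv
eps = 1e-6
grad_fd = np.array([(f(u + eps * e) - f(u - eps * e)) / (2 * eps)
                    for e in np.eye(3)])
print(np.allclose(grad_fd, 2 * R @ u, atol=1e-5))   # → True
```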
Copyright 2001 by Jonathan How.
Fall 2001 16.31 22—1
Linear Quadratic Regulator (LQR)
• We have seen the solutions to the LQR problem using the symmetric root locus which defines the location of the closed-loop poles.
— Linear full-state feedback control.
— Would like to demonstrate from first principles that this is the optimal form of the control.
• Deterministic Linear Quadratic Regulator
Plant:
ẋ(t) = A(t)x(t) + Bu(t)u(t), x(t0) = x0
z(t) = Cz(t)x(t)
Cost:
JLQR = ∫_{t0}^{tf} [ zᵀ(t)Rzz(t)z(t) + uᵀ(t)Ruu(t)u(t) ] dt + xᵀ(tf)Ptf x(tf)
— Where Ptf ≥ 0, Rzz(t) > 0 and Ruu(t) > 0
— Define Rxx = CzᵀRzzCz ≥ 0
— A(t) is a continuous function of time.
— Bu(t), Cz(t), Rzz(t), Ruu(t) are piecewise continuous functions of time, and all are bounded.
• Problem Statement: Find the input u(t) ∀t ∈ [t0, tf] to minimize JLQR.
• Note that this is the most general form of the LQR problem — we rarely need this level of generality and often suppress the time dependence of the matrices.
— Aircraft landing problem.
• To optimize the cost, we follow the procedure of augmenting the constraints in the problem (the system dynamics) to the cost (integrand) to form the Hamiltonian:
H = ½ ( xᵀ(t)Rxx x(t) + uᵀ(t)Ruu u(t) ) + λᵀ(t)( Ax(t) + Bu u(t) )
— λ(t) ∈ Rn×1 is called the Adjoint variable or Costate
— It is the Lagrange multiplier in the problem.
• From Stengel (pg. 427), the necessary and sufficient conditions for optimality are that:
1. λ̇(t) = −∂H/∂x = −Rxx x(t) − Aᵀλ(t)
2. λ(tf) = Ptf x(tf)
3. ∂H/∂u = 0 ⇒ Ruu u + Buᵀλ(t) = 0, so uopt = −Ruu⁻¹Buᵀλ(t)
4. ∂²H/∂u² ≥ 0 (need to check that Ruu ≥ 0)
• This control design problem is a constrained optimization, with the constraints being the dynamics of the system.
• The standard way of handling the constraints in an optimization is to add them to the cost using a Lagrange multiplier
— Results in an unconstrained optimization.
• Example: min f (x, y) = x2 + y2 subject to the constraint that c(x, y) = x + y + 2 = 0
Figure 1: Optimization results (plotted in the (x, y) plane for x, y ∈ [−2, 2])
• Clearly the unconstrained minimum is at x = y = 0
Fall 2001 16.31 Optimization-2
• To find the constrained minimum, form the augmented cost function
L ≜ f(x, y) + λc(x, y) = x² + y² + λ(x + y + 2)
— Where λ is the Lagrange multiplier
• Note that if the constraint is satisfied, then L ≡ f
• The solution approach without constraints is to find the stationary point of f(x, y): ∂f/∂x = ∂f/∂y = 0
— With constraints we find the stationary points of L:
∂L/∂x = ∂L/∂y = ∂L/∂λ = 0
which gives
∂L/∂x = 2x + λ = 0
∂L/∂y = 2y + λ = 0
∂L/∂λ = x + y + 2 = 0
• This gives 3 equations in 3 unknowns; solve to find x⋆ = y⋆ = −1
• The key point here is that due to the constraint, the selection of x and y during the minimization is not independent
— The Lagrange multiplier captures this dependency.
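• A minimal numerical check (not part of the notes): the three stationarity equations are linear in (x, y, λ), so they can be solved directly:

```python
import numpy as np

# Stationary point of L = x^2 + y^2 + lam*(x + y + 2):
#   2x + lam = 0,  2y + lam = 0,  x + y + 2 = 0
A = np.array([[2.0, 0.0, 1.0],
              [0.0, 2.0, 1.0],
              [1.0, 1.0, 0.0]])
b = np.array([0.0, 0.0, -2.0])
x, y, lam = np.linalg.solve(A, b)
print(x, y, lam)   # x = y = -1, lam = 2
```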
• The LQR optimization follows the same path as this, but it is complicated by the fact that the cost involves an integration over time.
• Note that we now have:
ẋ(t) = Ax(t) + Bu uopt(t) = Ax(t) − Bu Ruu⁻¹ Buᵀ λ(t)
with x(t0) = x0
• So combine with the equation for the adjoint variable
λ̇(t) = −Rxx x(t) − Aᵀλ(t) = −CzᵀRzzCz x(t) − Aᵀλ(t)
to get:
[ ẋ(t) ]   [ A           −Bu Ruu⁻¹ Buᵀ ] [ x(t) ]
[ λ̇(t) ] = [ −CzᵀRzzCz   −Aᵀ           ] [ λ(t) ]
which of course is the Hamiltonian Matrix again.
• Note that the dynamics of x(t) and λ(t) are coupled, but x(t) is known initially and λ(t) is known at the terminal time, since λ(tf ) = Ptf x(tf )
— This is a two point boundary value problem that is very hard to solve in general.
• However, in this case, we can introduce a new matrix variable P (t) and show that:
1. λ(t) = P(t)x(t)
2. It is relatively easy to find P(t).
• How to proceed?
1. For the 2n-dimensional system
[ ẋ(t) ]   [ A           −Bu Ruu⁻¹ Buᵀ ] [ x(t) ]
[ λ̇(t) ] = [ −CzᵀRzzCz   −Aᵀ           ] [ λ(t) ]
define a transition matrix
F(t1, t0) = [ F11(t1, t0)  F12(t1, t0) ]
            [ F21(t1, t0)  F22(t1, t0) ]
and use this to relate x(t) and λ(t) to x(tf) and λ(tf):
[ x(t) ]   [ F11(t, tf)  F12(t, tf) ] [ x(tf) ]
[ λ(t) ] = [ F21(t, tf)  F22(t, tf) ] [ λ(tf) ]
so
x(t) = F11(t, tf)x(tf) + F12(t, tf)λ(tf) = [ F11(t, tf) + F12(t, tf)Ptf ] x(tf)
2. Now find λ(t) in terms of x(tf):
λ(t) = [ F21(t, tf) + F22(t, tf)Ptf ] x(tf)
3. Eliminate x(tf) to get:
λ(t) = [ F21(t, tf) + F22(t, tf)Ptf ] [ F11(t, tf) + F12(t, tf)Ptf ]⁻¹ x(t) ≜ P(t)x(t)
4. Now, since λ(t) = P(t)x(t), then
λ̇(t) = Ṗ(t)x(t) + P(t)ẋ(t)
⇒ −CzᵀRzzCz x(t) − Aᵀλ(t) = Ṗ(t)x(t) + P(t)ẋ(t)
so that
−Ṗ(t)x(t) = CzᵀRzzCz x(t) + Aᵀλ(t) + P(t)ẋ(t)
= CzᵀRzzCz x(t) + Aᵀλ(t) + P(t)( Ax(t) − Bu Ruu⁻¹ Buᵀ λ(t) )
= ( CzᵀRzzCz + P(t)A )x(t) + ( Aᵀ − P(t)Bu Ruu⁻¹ Buᵀ )λ(t)
= ( CzᵀRzzCz + P(t)A )x(t) + ( Aᵀ − P(t)Bu Ruu⁻¹ Buᵀ )P(t)x(t)
= [ AᵀP(t) + P(t)A + CzᵀRzzCz − P(t)Bu Ruu⁻¹ Buᵀ P(t) ] x(t)
• This must be true for arbitrary x(t), so P(t) must satisfy
−Ṗ(t) = AᵀP(t) + P(t)A + CzᵀRzzCz − P(t)Bu Ruu⁻¹ Buᵀ P(t)
— Which is a matrix differential Riccati Equation.
• The optimal value of P(t) is found by solving this equation backwards in time from tf with P(tf) = Ptf
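• As a sketch (not from the notes), the differential Riccati equation can be integrated backwards numerically. Here I assume the example system used later in these notes with q = 1, r = 1, and take Ptf = 0 for simplicity (the terminal weight does not affect the gain far from tf):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Integrate -Pdot = A'P + PA + Cz'RzzCz - P Bu Ruu^{-1} Bu' P
# backwards from P(tf) = Ptf.
A   = np.array([[0.0, 1.0], [0.0, -1.0]])
Bu  = np.array([[0.0], [1.0]])
Rxx = np.array([[1.0, 0.0], [0.0, 0.0]])   # Cz'RzzCz with q = 1
Ruu = np.array([[1.0]])                    # r = 1
Ptf = np.zeros((2, 2))                     # assumed terminal weight
tf  = 10.0

def riccati_rhs(t, p):
    P = p.reshape(2, 2)
    Pdot = -(A.T @ P + P @ A + Rxx
             - P @ Bu @ np.linalg.solve(Ruu, Bu.T) @ P)
    return Pdot.ravel()

# solve_ivp accepts a decreasing time span, so integrate from tf to 0.
sol = solve_ivp(riccati_rhs, (tf, 0.0), Ptf.ravel(), rtol=1e-8, atol=1e-8)
P0 = sol.y[:, -1].reshape(2, 2)
K0 = np.linalg.solve(Ruu, Bu.T @ P0)       # gain K(0) = Ruu^{-1} Bu' P(0)
print(np.round(K0, 3))                     # ≈ steady-state gain [1, 0.732]
```

By t = 0 the backward solution has converged, so K(0) matches the steady-state gain.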
• The control gains are then
uopt = −Ruu⁻¹ Buᵀ λ(t) = −Ruu⁻¹ Buᵀ P(t)x(t) = −K(t)x(t)
— Where K(t) ≜ Ruu⁻¹ Buᵀ P(t)
• Note that x(t) and λ(t) together define the closed-loop dynamics for the system (and its adjoint), but we can eliminate λ(t) from the solution by introducing P (t) which solves a Riccati Equation.
• The optimal control inputs are in fact a linear full-state feedback control
• Note that normally we are interested in problems with t0 = 0 and tf = ∞, in which case we can just use the steady-state value of P that solves (assumes that (A, Bu) is stabilizable)
AᵀP + PA + CzᵀRzzCz − P Bu Ruu⁻¹ Buᵀ P = 0
which is the Algebraic Riccati Equation.
— If we use the steady-state value of P , then K is constant.
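• For the infinite-horizon case the constant gain can be computed directly from the ARE; a minimal SciPy sketch, using an assumed double-integrator example (not the system from these notes):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Solve A'P + PA + Cz'RzzCz - P Bu Ruu^{-1} Bu' P = 0,
# then K = Ruu^{-1} Bu' P.
A   = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator (assumed)
Bu  = np.array([[0.0], [1.0]])
Rxx = np.eye(2)                            # Cz'RzzCz
Ruu = np.array([[1.0]])

P = solve_continuous_are(A, Bu, Rxx, Ruu)
K = np.linalg.solve(Ruu, Bu.T @ P)
print(K)                                   # here K = [1, sqrt(3)]

# The closed-loop poles A - Bu K lie in the LHP:
print(np.linalg.eigvals(A - Bu @ K))
```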
• Example: simple system with t0 = 0 and tf = 10 sec.
ẋ = [ 0  1 ] x + [ 0 ] u
    [ 0 −1 ]     [ 1 ]
J = xᵀ(10) [ 0 0 ] x(10) + ∫₀¹⁰ ( xᵀ(t) [ q 0 ] x(t) + r u²(t) ) dt
           [ 0 h ]                      [ 0 0 ]
• Compute gains using both time-varying P(t) and steady-state value.
• Find state solution with x(0) = [1 1]ᵀ using both sets of gains; q = 1, r = 1, h = 5
[Figure: gain histories K1(t), K2(t) against the static values K1, K2, and the state responses x1, x2 vs. time (sec), comparing the dynamic gains with the static gains.]
Figure 2: Set q = 1, r = 1, h = 10, Klqr = [1 0.73]
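• The steady-state gain quoted in the caption can be reproduced numerically; a sketch using SciPy (not part of the original notes):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Steady-state gain for the example: xdot = [0 1; 0 -1]x + [0; 1]u,
# q = 1, r = 1 (the terminal weight h does not affect the static gain).
A   = np.array([[0.0, 1.0], [0.0, -1.0]])
Bu  = np.array([[0.0], [1.0]])
Rxx = np.array([[1.0, 0.0], [0.0, 0.0]])   # q = 1
Ruu = np.array([[1.0]])                    # r = 1

P = solve_continuous_are(A, Bu, Rxx, Ruu)
Klqr = np.linalg.solve(Ruu, Bu.T @ P)
print(np.round(Klqr, 2))   # matches Klqr = [1 0.73] from the caption
```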
• As noted, the closed-loop dynamics couple x(t) and λ(t) and are given by
[ ẋ(t) ]   [ A           −Bu Ruu⁻¹ Buᵀ ] [ x(t) ]
[ λ̇(t) ] = [ −CzᵀRzzCz   −Aᵀ           ] [ λ(t) ]
with the appropriate boundary conditions.
• OK, so where are the closed-loop poles of the system?
— They must be the eigenvalues of
H ≜ [ A           −Bu Ruu⁻¹ Buᵀ ]
    [ −CzᵀRzzCz   −Aᵀ           ]
• When we analyzed this before for a SISO system, we found that the closed-loop poles could be related to a SRL for the transfer function
Gzu(s) = Cz(sI − A)⁻¹Bu = b(s)/a(s)
and, in fact, the closed-loop poles were given by the LHP roots of
a(s)a(−s) + (Rzz/Ruu) b(s)b(−s) = 0
where we previously had Rzz/Ruu ≡ 1/r
• We now know enough to show that this is true.
Derivation of the SRL
• The closed-loop poles are given by the eigenvalues of
H ≜ [ A           −Bu Ruu⁻¹ Buᵀ ]
    [ −CzᵀRzzCz   −Aᵀ           ]
so solve det(sI − H) = 0
• If A is invertible:
det [ A  B ] = det(A) det(D − C A⁻¹ B)
    [ C  D ]
⇒ det(sI − H) = det(sI − A) det[ (sI + Aᵀ) − CzᵀRzzCz (sI − A)⁻¹ Bu Ruu⁻¹ Buᵀ ]
= det(sI − A) det(sI + Aᵀ) det[ I − CzᵀRzzCz (sI − A)⁻¹ Bu Ruu⁻¹ Buᵀ (sI + Aᵀ)⁻¹ ]
• Note that det(I + ABC) = det(I + CAB), and if a(s) = det(sI − A), then a(−s) = det(−sI − Aᵀ) = (−1)ⁿ det(sI + Aᵀ), so
det(sI − H) = (−1)ⁿ a(s)a(−s) det[ I + Ruu⁻¹ Buᵀ (−sI − Aᵀ)⁻¹ CzᵀRzzCz (sI − A)⁻¹ Bu ]
• If Gzu(s) = Cz(sI − A)⁻¹Bu, then Gzuᵀ(−s) = Buᵀ(−sI − Aᵀ)⁻¹Czᵀ, so for SISO systems
det(sI − H) = (−1)ⁿ a(s)a(−s) det[ I + Ruu⁻¹ Gzuᵀ(−s) Rzz Gzu(s) ]
= (−1)ⁿ a(s)a(−s) [ 1 + (Rzz/Ruu) Gzu(−s)Gzu(s) ]
= (−1)ⁿ [ a(s)a(−s) + (Rzz/Ruu) b(s)b(−s) ] = 0
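• This equivalence is easy to check numerically; a sketch (not in the notes), using the assumed example Gzu(s) = 1/(s(s + 1)) with Rzz = Ruu = 1, so a(s) = s² + s and b(s) = 1:

```python
import numpy as np

# Compare the eigenvalues of the Hamiltonian matrix H with the roots of
# a(s)a(-s) + (Rzz/Ruu) b(s)b(-s) = s^4 - s^2 + 1.
A   = np.array([[0.0, 1.0], [0.0, -1.0]])
Bu  = np.array([[0.0], [1.0]])
Cz  = np.array([[1.0, 0.0]])
Rzz = 1.0
Ruu = 1.0

H = np.block([[A,                  -Bu @ Bu.T / Ruu],
              [-Cz.T @ Cz * Rzz,   -A.T            ]])

srl_roots = np.roots([1.0, 0.0, -1.0, 0.0, 1.0])   # s^4 - s^2 + 1
eigs = np.linalg.eigvals(H)
print(np.allclose(np.sort_complex(eigs), np.sort_complex(srl_roots)))  # → True
```

The LHP half of these roots are the closed-loop poles.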
• Simple example from before: a scalar system with
ẋ = ax + bu
with cost (Rxx > 0 and Ruu > 0)
J = ∫₀^∞ ( Rxx x²(t) + Ruu u²(t) ) dt
• Then the steady-state P solves
2aP + Rxx − P²b²/Ruu = 0
which gives that
P = Ruu ( a + √(a² + b²Rxx/Ruu) ) / b² > 0
• Then u(t) = −Kx(t) where
K = Ruu⁻¹ b P = ( a + √(a² + b²Rxx/Ruu) ) / b
• The closed-loop dynamics are
ẋ = (a − bK)x = ( a − (b/b)( a + √(a² + b²Rxx/Ruu) ) ) x = −√(a² + b²Rxx/Ruu) x = Acl x(t)
• Note that as Rxx/Ruu → ∞, Acl ≈ −|b|√(Rxx/Ruu)
• And as Rxx/Ruu → 0, K ≈ (a + |a|)/b
— If a < 0 (open-loop stable), K ≈ 0 and Acl = a − bK ≈ a
— If a > 0 (OL unstable), K ≈ 2a/b and Acl = a − bK ≈ −a
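• These closed-form scalar results are easy to verify numerically; the values a = b = 1, Rxx = 3, Ruu = 1 below are arbitrary test data:

```python
import numpy as np

# Scalar LQR closed forms: P = Ruu (a + sqrt(a^2 + b^2 Rxx/Ruu)) / b^2,
# K = b P / Ruu, and Acl = a - b K = -sqrt(a^2 + b^2 Rxx/Ruu).
a, b, Rxx, Ruu = 1.0, 1.0, 3.0, 1.0

P = Ruu * (a + np.sqrt(a**2 + b**2 * Rxx / Ruu)) / b**2
K = b * P / Ruu
Acl = a - b * K

# P satisfies the scalar ARE: 2aP + Rxx - P^2 b^2 / Ruu = 0
print(2*a*P + Rxx - P**2 * b**2 / Ruu)   # → 0.0
print(P, K, Acl)                         # → 3.0 3.0 -2.0
```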
Summary
• Can find the optimal feedback gains u = −Kx using the Matlab command
K = lqr(A,B,Rxx, Ruu)
• Similar derivation for the optimal estimation problem (Linear Quadratic Estimator)
— Full treatment requires detailed knowledge of advanced topics (e.g. stochastic processes and Itô calculus) — better left to a second course.
— But, by duality, can compute optimal Kalman filter gains from
Ke = lqr(Aᵀ, Cyᵀ, Bw Rw Bwᵀ, Rv), L = Keᵀ
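• A sketch of the same dual computation in Python (SciPy rather than Matlab; all numerical values below are illustrative assumptions, not from the notes):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# By duality, the Kalman filter gain follows from the LQR Riccati equation
# with (A, Bu, Rxx, Ruu) -> (A', Cy', Bw Rw Bw', Rv).
A  = np.array([[0.0, 1.0], [0.0, -1.0]])   # assumed plant
Cy = np.array([[1.0, 0.0]])                # measurement matrix
Bw = np.array([[0.0], [1.0]])              # process-noise input
Rw = np.array([[1.0]])                     # process-noise intensity
Rv = np.array([[1.0]])                     # sensor-noise intensity

Q = solve_continuous_are(A.T, Cy.T, Bw @ Rw @ Bw.T, Rv)
L = Q @ Cy.T @ np.linalg.inv(Rv)           # estimator gain (Ke' above)

# The estimator error dynamics A - L Cy are stable:
print(np.linalg.eigvals(A - L @ Cy))
```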