
Topic #22

16.31 Feedback Control

Deterministic LQR

• Optimal control and the Riccati equation

• Lagrange multipliers

• The Hamiltonian matrix and the symmetric root locus

Factoids: for symmetric R,

\frac{\partial}{\partial u}\left(u^T R u\right) = 2\,u^T R, \qquad \frac{\partial}{\partial u}\left(R u\right) = R

Copyright 2001 by Jonathan How.



Fall 2001 16.31 22—1

Linear Quadratic Regulator (LQR)

• We have seen the solutions to the LQR problem using the symmetric root locus which defines the location of the closed-loop poles.

— Linear full-state feedback control.

— Would like to demonstrate from first principles that this is the optimal form of the control.

• Deterministic Linear Quadratic Regulator

Plant:

\dot{x}(t) = A(t)\,x(t) + B_u(t)\,u(t), \qquad x(t_0) = x_0

z(t) = C_z(t)\,x(t)

Cost:

J_{LQR} = \int_{t_0}^{t_f} \left[\, z^T(t) R_{zz}(t) z(t) + u^T(t) R_{uu}(t) u(t) \,\right] dt \;+\; x^T(t_f)\, P_{t_f}\, x(t_f)

— Where Ptf ≥ 0, Rzz(t) > 0 and Ruu(t) > 0

— Define R_xx ≜ C_z^T R_zz C_z ≥ 0

— A(t) is a continuous function of time.

— Bu(t), Cz (t), Rzz(t), Ruu(t) are piecewise continuous functions of time, and all are bounded.

• Problem Statement: Find the input u(t) ∀ t ∈ [t_0, t_f] to minimize J_LQR.


Fall 2001 16.31 22—2

• Note that this is the most general form of the LQR problem — we rarely need this level of generality and often suppress the time dependence of the matrices.

— Example of a problem that does need the general (time-varying, finite-horizon) form: the aircraft landing problem.

• To optimize the cost, we follow the procedure of adjoining the constraints in the problem (the system dynamics) to the cost (integrand) to form the Hamiltonian:

H = \frac{1}{2}\left( x^T(t) R_{xx}\, x(t) + u^T(t) R_{uu}\, u(t) \right) + \lambda^T(t)\left( A\, x(t) + B_u\, u(t) \right)

— λ(t) ∈ R^{n×1} is called the adjoint variable or costate

— It is the Lagrange multiplier in the problem.

• From Stengel (pg427), the necessary and sufficient conditions for optimality are that:

1. \dot{\lambda}(t) = -\left(\frac{\partial H}{\partial x}\right)^T = -R_{xx}\, x(t) - A^T \lambda(t)

2. \lambda(t_f) = P_{t_f}\, x(t_f)

3. \frac{\partial H}{\partial u} = 0 \;\Rightarrow\; R_{uu}\, u + B_u^T \lambda(t) = 0, so u_{opt} = -R_{uu}^{-1} B_u^T \lambda(t) (algebra sketched below)

4. \frac{\partial^2 H}{\partial u^2} \geq 0 (need to check that R_{uu} \geq 0)
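
As a quick check on item 3, here is a sketch of the algebra, using the derivative factoids from the title page and the symmetry of R_uu:

\frac{\partial H}{\partial u}
= \tfrac{1}{2}\bigl(2\,u^T(t) R_{uu}\bigr) + \lambda^T(t) B_u
= u^T(t) R_{uu} + \lambda^T(t) B_u = 0
\;\Longrightarrow\;
R_{uu}\, u(t) + B_u^T \lambda(t) = 0
\;\Longrightarrow\;
u_{opt}(t) = -R_{uu}^{-1} B_u^T \lambda(t)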


Fall 2001 16.31 Optimization-1

• This control design problem is a constrained optimization, with the constraints being the dynamics of the system.

• The standard way of handling the constraints in an optimization is to add them to the cost using a Lagrange multiplier

— Results in an unconstrained optimization.

• Example: min f(x, y) = x² + y² subject to the constraint that c(x, y) = x + y + 2 = 0

[Figure 1: Optimization results (plot in the (x, y) plane).]

• Clearly the unconstrained minimum is at x = y = 0


Fall 2001 16.31 Optimization-2

• To find the constrained minimum, form the augmented cost function

L ≜ f(x, y) + λ c(x, y) = x² + y² + λ(x + y + 2)

— Where λ is the Lagrange multiplier

• Note that if the constraint is satisfied, then L ≡ f

• The solution approach without constraints is to find the stationary point of f (x, y) (∂f/∂x = ∂f/∂y = 0)

— With constraints we find the stationary points of L:

\frac{\partial L}{\partial x} = \frac{\partial L}{\partial y} = \frac{\partial L}{\partial \lambda} = 0

which gives

\frac{\partial L}{\partial x} = 2x + \lambda = 0

\frac{\partial L}{\partial y} = 2y + \lambda = 0

\frac{\partial L}{\partial \lambda} = x + y + 2 = 0

• This gives 3 equations in 3 unknowns, solve to find x* = y* = −1
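
A quick symbolic check of this solution (a minimal sketch; requires the Symbolic Math Toolbox, and the variable names are illustrative):

% Stationary point of the augmented cost L = f + lambda*c
syms x y lambda real
L = x^2 + y^2 + lambda*(x + y + 2);                     % augmented cost
eqns = [diff(L, x) == 0, diff(L, y) == 0, diff(L, lambda) == 0];
S = solve(eqns, [x, y, lambda]);
[S.x, S.y, S.lambda]                                    % returns -1, -1, 2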

• The key point here is that due to the constraint, the selection of x and y during the minimization are not independent

— The Lagrange multiplier captures this dependency.

• The LQR optimization follows the same path as this, but it is complicated by the fact that the cost involves an integration over time.


Fall 2001 16.31 22—3

• Note that we now have:

\dot{x}(t) = A\, x(t) + B_u\, u_{opt}(t) = A\, x(t) - B_u R_{uu}^{-1} B_u^T \lambda(t)

with x(t_0) = x_0

• So combine with equation for the adjoint variable

\dot{\lambda}(t) = -R_{xx}\, x(t) - A^T \lambda(t) = -C_z^T R_{zz} C_z\, x(t) - A^T \lambda(t)

to get:

\begin{bmatrix} \dot{x}(t) \\ \dot{\lambda}(t) \end{bmatrix}
=
\begin{bmatrix} A & -B_u R_{uu}^{-1} B_u^T \\ -C_z^T R_{zz} C_z & -A^T \end{bmatrix}
\begin{bmatrix} x(t) \\ \lambda(t) \end{bmatrix}

which of course is the Hamiltonian matrix again.

• Note that the dynamics of x(t) and λ(t) are coupled, but x(t) is known initially and λ(t) is known at the terminal time, since λ(t_f) = P_{t_f} x(t_f)

— This is a two point boundary value problem that is very hard to solve in general.

• However, in this case, we can introduce a new matrix variable P (t) and show that:

1. λ(t) = P(t)x(t)

2. It is relatively easy to find P(t).


Fall 2001 16.31 22—4

• How to proceed?

1. For the 2n-dimensional system

\begin{bmatrix} \dot{x}(t) \\ \dot{\lambda}(t) \end{bmatrix}
=
\begin{bmatrix} A & -B_u R_{uu}^{-1} B_u^T \\ -C_z^T R_{zz} C_z & -A^T \end{bmatrix}
\begin{bmatrix} x(t) \\ \lambda(t) \end{bmatrix}

define a transition matrix

F(t_1, t_0) = \begin{bmatrix} F_{11}(t_1, t_0) & F_{12}(t_1, t_0) \\ F_{21}(t_1, t_0) & F_{22}(t_1, t_0) \end{bmatrix}

and use this to relate x(t) and λ(t) to x(t_f) and λ(t_f):

\begin{bmatrix} x(t) \\ \lambda(t) \end{bmatrix}
=
\begin{bmatrix} F_{11}(t, t_f) & F_{12}(t, t_f) \\ F_{21}(t, t_f) & F_{22}(t, t_f) \end{bmatrix}
\begin{bmatrix} x(t_f) \\ \lambda(t_f) \end{bmatrix}

so

x(t) = F_{11}(t, t_f)\, x(t_f) + F_{12}(t, t_f)\, \lambda(t_f) = \left[ F_{11}(t, t_f) + F_{12}(t, t_f) P_{t_f} \right] x(t_f)

2. Now find λ(t) in terms of x(t_f):

\lambda(t) = \left[ F_{21}(t, t_f) + F_{22}(t, t_f) P_{t_f} \right] x(t_f)

3. Eliminate x(t_f) to get:

\lambda(t) = \left[ F_{21}(t, t_f) + F_{22}(t, t_f) P_{t_f} \right] \left[ F_{11}(t, t_f) + F_{12}(t, t_f) P_{t_f} \right]^{-1} x(t) \;\triangleq\; P(t)\, x(t)


Fall 2001 16.31 22—5

4. Now, since λ(t) = P(t)x(t), then

\dot{\lambda}(t) = \dot{P}(t)\, x(t) + P(t)\, \dot{x}(t)
\;\Rightarrow\;
-C_z^T R_{zz} C_z\, x(t) - A^T \lambda(t) = \dot{P}(t)\, x(t) + P(t)\, \dot{x}(t)

so

\begin{aligned}
-\dot{P}(t)\, x(t) &= C_z^T R_{zz} C_z\, x(t) + A^T \lambda(t) + P(t)\, \dot{x}(t) \\
&= C_z^T R_{zz} C_z\, x(t) + A^T \lambda(t) + P(t)\left( A\, x(t) - B_u R_{uu}^{-1} B_u^T \lambda(t) \right) \\
&= \left( C_z^T R_{zz} C_z + P(t) A \right) x(t) + \left( A^T - P(t) B_u R_{uu}^{-1} B_u^T \right) \lambda(t) \\
&= \left( C_z^T R_{zz} C_z + P(t) A \right) x(t) + \left( A^T - P(t) B_u R_{uu}^{-1} B_u^T \right) P(t)\, x(t) \\
&= \left[ A^T P(t) + P(t) A + C_z^T R_{zz} C_z - P(t) B_u R_{uu}^{-1} B_u^T P(t) \right] x(t)
\end{aligned}

• This must be true for arbitrary x(t), so P (t) must satisfy

-\dot{P}(t) = A^T P(t) + P(t) A + C_z^T R_{zz} C_z - P(t) B_u R_{uu}^{-1} B_u^T P(t)

— Which is a matrix differential Riccati Equation.

• The optimal value of P(t) is found by solving this equation backwards in time from t_f with P(t_f) = P_{t_f}
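
A minimal numerical sketch of this backward sweep (the function name riccati_backward and its interface are illustrative, not part of the notes; ode45 accepts a decreasing time span, so the integration can run from t_f down to t_0 directly):

% riccati_backward.m -- integrate the matrix differential Riccati equation
%   -Pdot = A'P + PA + Cz'*Rzz*Cz - P*Bu*inv(Ruu)*Bu'*P,   P(tf) = Ptf,
% backwards from tf to t0 by giving ode45 a decreasing time span.
function [tP, Pvec] = riccati_backward(A, Bu, Cz, Rzz, Ruu, Ptf, t0, tf)
    n   = size(A, 1);
    P   = @(p) reshape(p, n, n);                 % unpack the vectorized P
    rhs = @(t, p) reshape( -(A'*P(p) + P(p)*A + Cz'*Rzz*Cz ...
                             - P(p)*Bu*(Ruu\(Bu'*P(p)))), n*n, 1);
    [tP, Pvec] = ode45(rhs, [tf t0], reshape(Ptf, n*n, 1));
    % Row k of Pvec is the vectorized P(tP(k)); reshape(Pvec(k,:), n, n) recovers it.
end

The gain history then follows as K(t) = R_uu^{-1} B_u^T P(t) at each stored time point.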


Fall 2001 16.31 22—6

• The control gains are then

u_{opt}(t) = -R_{uu}^{-1} B_u^T \lambda(t) = -R_{uu}^{-1} B_u^T P(t)\, x(t) = -K(t)\, x(t)

— Where K(t) \triangleq R_{uu}^{-1} B_u^T P(t)

• Note that x(t) and λ(t) together define the closed-loop dynamics for the system (and its adjoint), but we can eliminate λ(t) from the solution by introducing P (t) which solves a Riccati Equation.

• The optimal control inputs are in fact a linear full-state feedback control

• Note that normally we are interested in problems with t_0 = 0 and t_f = ∞, in which case we can just use the steady-state value of P that solves (assuming that (A, B_u) is stabilizable)

A^T P + P A + C_z^T R_{zz} C_z - P B_u R_{uu}^{-1} B_u^T P = 0

which is the Algebraic Riccati Equation.

— If we use the steady-state value of P , then K is constant.
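
A sketch of computing the steady-state gain in MATLAB (the numerical matrices are taken from the example on the next page and are otherwise illustrative):

% Steady-state LQR gain from the algebraic Riccati equation (sketch).
A   = [0 1; 0 -1];                      % example system from the next page
Bu  = [0; 1];
Cz  = [1 0];
Rzz = 1;   Ruu = 1;
[K, P] = lqr(A, Bu, Cz'*Rzz*Cz, Ruu);   % constant gain, K = inv(Ruu)*Bu'*P
% Equivalently, solve the ARE directly and build the gain by hand:
P2 = care(A, Bu, Cz'*Rzz*Cz, Ruu);
K2 = Ruu \ (Bu'*P2);                    % expect K approximately [1 0.73], as in Figure 2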


Fall 2001 16.31 22—7

• Example: simple system with t_0 = 0 and t_f = 10 sec.

\dot{x} = \begin{bmatrix} 0 & 1 \\ 0 & -1 \end{bmatrix} x + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u

J = x^T(10) \begin{bmatrix} 0 & 0 \\ 0 & h \end{bmatrix} x(10) + \int_0^{10} \left( x^T(t) \begin{bmatrix} q & 0 \\ 0 & 0 \end{bmatrix} x(t) + r\, u^2(t) \right) dt

• Compute gains using both the time-varying P(t) and the steady-state value.

• Find the state solution from x(0) = [1 1]^T using both sets of gains, with q = 1, r = 1, h = 5.

[Figure 2: Set q = 1, r = 1, h = 10, Klqr = [1 0.73]. The panels show the time-varying gains K1(t), K2(t) against the static gains K1, K2, and the state responses x1, x2 under the dynamic and static gains; axes are Gains and States versus Time (sec), 0 to 10.]
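
A self-contained MATLAB sketch that reproduces this comparison (parameter values as listed above; the helper names and the interpolation of the stored sweep are implementation choices, not from the notes):

% Time-varying vs. steady-state LQR gains for the example above (sketch).
A  = [0 1; 0 -1];   Bu = [0; 1];
q  = 1;   r = 1;   h = 5;   tf = 10;
Rxx = [q 0; 0 0];   Ruu = r;   Ptf = [0 0; 0 h];

% Backward Riccati sweep (same equation as before), stored on a time grid.
n   = 2;
P   = @(p) reshape(p, n, n);
rhs = @(t, p) reshape(-(A'*P(p) + P(p)*A + Rxx - P(p)*Bu*(Ruu\(Bu'*P(p)))), n*n, 1);
[tP, Pvec] = ode45(rhs, [tf 0], reshape(Ptf, n*n, 1));
[tP, idx]  = sort(tP);   Pvec = Pvec(idx, :);       % increasing time for interp1

% Gains: time-varying K(t) = inv(Ruu)*Bu'*P(t), and the constant LQR gain.
Kt  = @(t) Ruu \ (Bu' * reshape(interp1(tP, Pvec, t), n, n));
Kss = lqr(A, Bu, Rxx, Ruu);                         % roughly [1 0.73]

% Closed-loop state responses from x(0) = [1 1]' with both sets of gains.
x0 = [1; 1];
[td, Xdyn] = ode45(@(t, x) (A - Bu*Kt(t))*x, [0 tf], x0);
[ts, Xsta] = ode45(@(t, x) (A - Bu*Kss)*x,   [0 tf], x0);

Plotting Kt over td, the constant Kss, and the two state histories should reproduce the qualitative behavior in Figure 2: the time-varying gains stay near the steady-state values over most of the interval and deviate only near the terminal time.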


Fall 2001 16.31 22—8

• As noted, the closed-loop dynamics couple x(t) and λ(t) and are given by

\begin{bmatrix} \dot{x}(t) \\ \dot{\lambda}(t) \end{bmatrix}
=
\begin{bmatrix} A & -B_u R_{uu}^{-1} B_u^T \\ -C_z^T R_{zz} C_z & -A^T \end{bmatrix}
\begin{bmatrix} x(t) \\ \lambda(t) \end{bmatrix}

with the appropriate boundary conditions.

• OK, so where are the closed-loop poles of the system?

— They must be the eigenvalues of

H \triangleq \begin{bmatrix} A & -B_u R_{uu}^{-1} B_u^T \\ -C_z^T R_{zz} C_z & -A^T \end{bmatrix}

• When we analyzed this before for a SISO system, we found that the closed-loop poles could be related to a SRL for the transfer function

G_{zu}(s) = C_z (sI - A)^{-1} B_u = \frac{b(s)}{a(s)}

and, in fact, the closed-loop poles were given by the LHP roots of

a(s)\, a(-s) + \frac{R_{zz}}{R_{uu}}\, b(s)\, b(-s) = 0

where we previously had R_zz/R_uu ≡ 1/r

• We now know enough to show that this is true.


Fall 2001 16.31 22—9

Derivation of the SRL

• The closed-loop poles are given by the eigenvalues of

H \triangleq \begin{bmatrix} A & -B_u R_{uu}^{-1} B_u^T \\ -C_z^T R_{zz} C_z & -A^T \end{bmatrix}

so solve det(sI − H) = 0.

• Use the block-determinant identity (valid when the A-block is invertible)

\det \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \det(A)\, \det(D - C A^{-1} B)

Applied to sI − H, whose blocks are sI − A, B_u R_uu^{-1} B_u^T, C_z^T R_zz C_z, and sI + A^T, this gives

\det(sI - H) = \det(sI - A)\, \det\!\left[ (sI + A^T) - C_z^T R_{zz} C_z (sI - A)^{-1} B_u R_{uu}^{-1} B_u^T \right]

= \det(sI - A)\, \det(sI + A^T)\, \det\!\left[ I - C_z^T R_{zz} C_z (sI - A)^{-1} B_u R_{uu}^{-1} B_u^T (sI + A^T)^{-1} \right]

• Note that det(I + ABC) = det(I + CAB), and if a(s) = det(sI − A), then a(−s) = det(−sI − A^T) = (−1)^n det(sI + A^T), so

\det(sI - H) = (-1)^n\, a(s)\, a(-s)\, \det\!\left[ I + R_{uu}^{-1} B_u^T (-sI - A^T)^{-1} C_z^T R_{zz} C_z (sI - A)^{-1} B_u \right]

• If G_{zu}(s) = C_z (sI - A)^{-1} B_u, then G_{zu}^T(-s) = B_u^T (-sI - A^T)^{-1} C_z^T, so for SISO systems

\det(sI - H) = (-1)^n\, a(s)\, a(-s)\, \det\!\left[ I + R_{uu}^{-1} G_{zu}^T(-s) R_{zz} G_{zu}(s) \right]
= (-1)^n\, a(s)\, a(-s) \left[ 1 + \frac{R_{zz}}{R_{uu}}\, G_{zu}(-s)\, G_{zu}(s) \right]
= (-1)^n \left[ a(s)\, a(-s) + \frac{R_{zz}}{R_{uu}}\, b(s)\, b(-s) \right]

so the closed-loop poles satisfy a(s)a(−s) + (R_zz/R_uu) b(s)b(−s) = 0, as claimed.
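
A numerical sanity check of this equivalence for the SISO example used earlier (a sketch; the coefficient bookkeeping for a(−s), b(−s) is the only non-obvious step):

% Check: eigenvalues of the Hamiltonian matrix vs. roots of the SRL polynomial.
A   = [0 1; 0 -1];   Bu = [0; 1];   Cz = [1 0];
Rzz = 1;   Ruu = 1;

H = [A, -Bu*(Ruu\Bu');  -Cz'*Rzz*Cz, -A'];
eigH = eig(H);                          % symmetric about the imaginary axis

[b, a] = ss2tf(A, Bu, Cz, 0);           % Gzu(s) = b(s)/a(s)
nn  = length(a) - 1;
sgn = (-1).^(nn:-1:0);                  % coefficient flips give a(-s), b(-s)
srl = conv(a, a.*sgn) + (Rzz/Ruu)*conv(b, b.*sgn);
rootsSRL = roots(srl);                  % same set as eigH (up to numerics)

% The closed-loop poles are the stable half of either set:
clPoles = eigH(real(eigH) < 0);
% These should also match eig(A - Bu*lqr(A, Bu, Cz'*Rzz*Cz, Ruu)).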


Fall 2001 16.31 22—10

• Simple example from before: a scalar system with

\dot{x} = a x + b u

with cost (R_xx > 0 and R_uu > 0)

J = \int_0^{\infty} \left( R_{xx}\, x^2(t) + R_{uu}\, u^2(t) \right) dt

• Then the steady-state P solves

2aP + R_{xx} - P^2 b^2 / R_{uu} = 0

which gives that

P = \frac{a + \sqrt{a^2 + b^2 R_{xx}/R_{uu}}}{R_{uu}^{-1} b^2} > 0

• Then u(t) = −Kx(t), where

K = R_{uu}^{-1}\, b\, P = \frac{a + \sqrt{a^2 + b^2 R_{xx}/R_{uu}}}{b}

• The closed-loop dynamics are

\dot{x} = (a - bK)\, x = \left( a - \frac{b}{b}\left( a + \sqrt{a^2 + b^2 R_{xx}/R_{uu}} \right) \right) x = -\sqrt{a^2 + b^2 R_{xx}/R_{uu}}\; x = A_{cl}\, x(t)

• Note that as R_xx/R_uu → ∞, A_cl ≈ −|b| \sqrt{R_{xx}/R_{uu}}

• And as R_xx/R_uu → 0, K ≈ (a + |a|)/b

— If a < 0 (open-loop stable), K ≈ 0 and A_cl = a − bK ≈ a

— If a > 0 (OL unstable), K ≈ 2a/b and A_cl = a − bK ≈ −a
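
A quick numerical spot-check of the scalar formulas against lqr (the values of a, b, R_xx, R_uu below are illustrative):

% Spot-check of the scalar LQR expressions (illustrative numbers).
a = -2;   b = 3;   Rxx = 4;   Ruu = 1;
K_formula = (a + sqrt(a^2 + b^2*Rxx/Ruu)) / b;
K_matlab  = lqr(a, b, Rxx, Ruu);                  % should match K_formula
Acl       = a - b*K_formula;                      % equals -sqrt(a^2 + b^2*Rxx/Ruu)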


Fall 2001 16.31 22—11

Summary

• Can find the optimal feedback gains u = −Kx using the Matlab command

K = lqr(A,B,Rxx, Ruu)

• Similar derivation for the optimal estimation problem (Linear Quadratic Estimator)

— Full treatment requires detailed knowledge of advanced topics (e.g. stochastic processes and Ito calculus) — better left to a second course.

— But, by duality, can compute the optimal Kalman filter gains from

K_e = lqr(A^T, C_y^T, B_w R_w B_w^T, R_v), \qquad L = K_e^T
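
A sketch of this duality in MATLAB (the measurement matrix Cy and the noise intensities Bw, Rw, Rv below are illustrative placeholders, not values from the notes):

% Kalman filter gain via LQR duality (sketch).
A  = [0 1; 0 -1];   Cy = [1 0];          % measurement y = Cy*x
Bw = [0; 1];        Rw = 1;   Rv = 0.1;  % process / sensor noise intensities
Ke = lqr(A', Cy', Bw*Rw*Bw', Rv);        % dual LQR problem
L  = Ke';                                % estimator gain, used as L*(y - Cy*xhat) in the observer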

MATLAB is a trademark of The MathWorks, Inc.