Numerical method for HJB equations. Optimal control problems and differential games
(lecture 3/3)
Maurizio Falcone (La Sapienza) & Hasnaa Zidani (ENSTA)
ANOC, 23–27 April 2012
M. Falcone & H. Zidani () HJB approach for optimal control problems ANOC, 23–27 April 2012 1 / 50
Outline
1 Introduction
2 Motion planning, reachability analysis
3 Hamilton-Jacobi approach: level set method
4 Differential games under state-constraints
Consider the controlled system:
\[ \dot{y}_x(s) = f(y_x(s), \alpha(s)), \quad s \in (0,+\infty), \qquad y_x(0) = x. \]
The HJB variational inequality (HJB-VI) is posed for $x \in \mathbb{R}^d$, $t > 0$, with the initial condition
\[ V_\infty(x,0) = \Phi(x) \vee g(x), \]
where $H(x,q) := \max_{a \in A} \bigl(-f(x,a) \cdot q\bigr)$.
• To each control problem is associated an adequate Hamilton-Jacobi equation.
• A very large class of control problems can be considered within the HJ framework (state-constrained control problems, infinite horizon control problems, hybrid systems, impulsive control, ...).
• The viscosity notion provides a very convenient framework for the theoretical and numerical studies of the value function.
• When the (exact) value function is known, the feedback controller can be defined as the minimizer of the DPP.
• This feedback can be shown to be an optimal control law.
Van der Pol problem:
\[ \dot{y}_1(t) = y_2, \qquad \dot{y}_2(t) = -y_1 + y_2\,(1 - y_1^2) + a(t), \qquad a(t) \in [-1,1]. \]
[Figure: Van der Pol problem. Scheme: ENO2-RK1. Legend: Target. Both axes range from −2 to 2.]
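The feedback construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function `V_guess` is a hypothetical stand-in for a computed value function (here simply the distance to the origin), the explicit Euler step, the step size and the sampling of the control set $[-1,1]$ are choices made for the example, and none of the function names come from the lecture.

```python
import math

def van_der_pol_rhs(y, a):
    """Right-hand side of the controlled Van der Pol system."""
    y1, y2 = y
    return (y2, -y1 + y2 * (1.0 - y1 ** 2) + a)

def euler_step(y, a, h):
    """One explicit Euler step of size h with constant control a."""
    f1, f2 = van_der_pol_rhs(y, a)
    return (y[0] + h * f1, y[1] + h * f2)

def dpp_feedback(V, y, h, controls):
    """Discrete DPP: pick the control minimizing V at the successor state."""
    return min(controls, key=lambda a: V(euler_step(y, a, h)))

def closed_loop(V, y0, h, n_steps, controls):
    """Apply the feedback repeatedly and return the list of visited states."""
    trajectory = [y0]
    y = y0
    for _ in range(n_steps):
        a = dpp_feedback(V, y, h, controls)
        y = euler_step(y, a, h)
        trajectory.append(y)
    return trajectory

def V_guess(y):
    """Hypothetical placeholder for an approximate value function."""
    return math.hypot(y[0], y[1])

# Uniform sampling of the control set [-1, 1] (an assumption of this sketch).
CONTROLS = [-1.0 + 0.1 * k for k in range(21)]
H_STEP = 0.01
trajectory = closed_loop(V_guess, (0.5, 0.5), H_STEP, 200, CONTROLS)
```

In practice `V_guess` would be replaced by an interpolation of the numerically computed value function on the grid.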
[Figure: Van der Pol problem, plotted in the (x1, x2) plane. Both axes range from −1.5 to 1.5.]
Open problem
In general, only an approximation of the value function can be computed. It is not clear how the feedback control behaves with respect to a small perturbation of the value function.
Numerical methods
Semi-Lagrangian methods: based on the DPP (Falcone, Ferretti, Jakobsen, Grüne, Kushner-Dupuis, ...)
  PROS: no CFL condition for stability (⇒ adaptive schemes)
  CONS: non-local
Finite difference methods: approximation of the gradient by FD (Crandall/Lions, Barles, Souganidis)
  CONS: needs a CFL condition for stability
  PROS: local method ⇒ can be parallelized
  PROS: non-monotone variants have been proposed to get numerically "high-order" schemes
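To make the CFL restriction of finite difference methods concrete, here is a minimal Python sketch of a first-order monotone upwind scheme. It is a toy case chosen for this example, not the ENO2-RK1 scheme used in the lecture's figures: the dynamics are assumed to be $f(x,a) = a$ with $a \in [-1,1]$ in one dimension, so that $H(x,q) = |q|$, and the equation is assumed to be $\partial_t V + |\partial_x V| = 0$ with $V(x,0) = |x| - 0.5$. The names `upwind_step` and `solve` are illustrative.

```python
import math

def upwind_step(V, dx, dt):
    """One explicit step of a monotone upwind scheme for V_t + |V_x| = 0."""
    n = len(V)
    new = [0.0] * n
    for i in range(n):
        left = V[i - 1] if i > 0 else V[i]        # replicate the edge value
        right = V[i + 1] if i < n - 1 else V[i]
        d_minus = (V[i] - left) / dx              # backward difference
        d_plus = (right - V[i]) / dx              # forward difference
        new[i] = V[i] - dt * max(d_minus, -d_plus, 0.0)
    return new

def solve(phi, x_min, x_max, n_points, t_final, cfl=0.5):
    """Advance the initial datum phi up to t_final with dt <= cfl * dx."""
    dx = (x_max - x_min) / (n_points - 1)
    xs = [x_min + i * dx for i in range(n_points)]
    V = [phi(x) for x in xs]
    n_steps = max(1, math.ceil(t_final / (cfl * dx)))
    dt = t_final / n_steps                        # satisfies the CFL bound
    for _ in range(n_steps):
        V = upwind_step(V, dx, dt)
    return xs, V

RADIUS = 0.5
xs, V_num = solve(lambda x: abs(x) - RADIUS, -2.0, 2.0, 201, 0.5)
# For this toy problem the exact solution is max(|x| - t, 0) - RADIUS.
V_exact = [max(abs(x) - 0.5, 0.0) - RADIUS for x in xs]
max_error = max(abs(u - v) for u, v in zip(V_num, V_exact))
```

Halving `dx` forces `dt` to be halved as well, which is the cost of the CFL condition that the semi-Lagrangian approach avoids.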
Let $C$ be a closed target set (in our examples, $C$ is safe).
Capture Basin (or backward reachable set)
• The Capture Basin $\mathrm{Capt}_C^t$, at time $t$, is the set of all initial positions $x$ from which a trajectory $y_x \in S_{[0,t]}(x)$ can reach the target $C$.
• Does there exist a trajectory leading from a state in the initial set X to a state in the target C, during some finite time horizon?
• Once an obstacle has been detected by suitable sensors (e.g. radar, pursuer), can a collision be avoided?
• Sometimes we have no control over the input signal (noise, actions of other agents, unknown system parameters, ...): it is safest to consider the worst case.
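Different approaches for computing the reachable sets
• For every $t \ge 0$, $\mathrm{Capt}_C^t = \{x \in \mathbb{R}^d ;\ V(x,t) \le 0\}$.
• The minimum time function $T : \mathbb{R}^d \to \mathbb{R}_+ \cup \{+\infty\}$ is lower semicontinuous (lsc). Moreover, we have:
\[ T(x) = \inf\{t \ge 0 ;\ x \in \mathrm{Capt}_C^t\} = \min\{t \ge 0 ;\ V(x,t) \le 0\}. \]
• $\Phi$ can be any function satisfying
\[ \Phi(x) \le 0 \iff x \in C. \]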
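The two characterizations above translate directly into code once a value function is available. The sketch below assumes a one-dimensional toy model chosen for this example: dynamics $\dot y = a$ with $|a| \le 1$, target $C = [-0.5, 0.5]$ and $\Phi(x) = |x| - 0.5$, for which $V(x,t) = \max(|x| - t, 0) - 0.5$. The function names and the time sampling are assumptions of the sketch.

```python
import math

def toy_value(x, t, radius=0.5):
    """Value function of the toy model: V(x,t) = max(|x| - t, 0) - radius."""
    return max(abs(x) - t, 0.0) - radius

def capture_basin(V, xs, t):
    """Grid points x with V(x,t) <= 0, i.e. the sampled capture basin at time t."""
    return [x for x in xs if V(x, t) <= 0.0]

def minimum_time(V, x, times):
    """Smallest sampled time t with V(x,t) <= 0, or +inf if there is none.

    `times` is assumed to be sorted in increasing order.
    """
    for t in times:
        if V(x, t) <= 0.0:
            return t
    return math.inf

times = [0.01 * k for k in range(301)]           # samples of [0, 3]
basin = capture_basin(toy_value, [-2.0, -1.0, 0.0, 1.0, 2.0], 0.5)
T_inside = minimum_time(toy_value, 0.2, times)   # point already in the target
T_outside = minimum_time(toy_value, 1.5, times)  # exact minimum time is 1.0
T_far = minimum_time(toy_value, 10.0, times)     # not reached before t = 3
```

Note that the minimum time is read off from the zero sublevel set of $V$, so no direct computation of $T$ is needed.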
• The level set approach can be used even when the minimum time function is discontinuous!
  The value function V is Lipschitz continuous!
• The level set approach can be extended to more general situations: differential games, avoidance of obstacles, moving targets and/or obstacles, ...
General setting: differential games under state constraints
Let $\alpha \in \mathcal{A}$ be a controlled input, and $\beta \in \mathcal{B}$ an uncontrolled input (perturbation). Consider the trajectory:
\[ \dot{y}_x(s) = f(y_x(s), \alpha(s), \beta(s)), \quad s \in (0,1), \qquad y_x(0) = x. \]
Let $(K_\theta)_{\theta \ge 0}$ be a family of closed sets (of constraints). Consider a game involving two players.
• The first player wants to steer the system from the initial position $x$ to the target $C$ while staying in $K$ (using his/her input $\alpha(t) \in A$),
• while the second player tries to steer the system away from $C$ or from $K$ (with his/her input $\beta(t) \in B$).
Assume $\theta \mapsto K_\theta$ is upper semicontinuous (usc).
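Non-anticipative strategies
We define the set of non-anticipative strategies for the first player as follows:
\[ \Gamma := \Bigl\{ a : \mathcal{B} \to \mathcal{A},\ \ \forall \beta, \tilde\beta \in \mathcal{B} \text{ and } \forall s \in [0,\infty),\ \ \bigl(\beta(\theta) = \tilde\beta(\theta) \text{ a.e. } \theta \in [0,s]\bigr) \Rightarrow \bigl(a[\beta](\theta) = a[\tilde\beta](\theta) \text{ a.e. } \theta \in [0,s]\bigr) \Bigr\}. \]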
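A discrete-time analogue may help to read this definition. In the sketch below, which is an illustration and not part of the lecture, inputs are finite sequences over a small alphabet and a strategy maps an input sequence to an output sequence of the same length. The brute-force check tests the implication in the definition for every pair of inputs and every prefix length. The names `is_nonanticipative`, `causal` and `clairvoyant` are hypothetical.

```python
from itertools import product

def is_nonanticipative(strategy, horizon, symbols=(0, 1)):
    """Check: inputs agreeing up to step s give outputs agreeing up to step s."""
    inputs = list(product(symbols, repeat=horizon))
    for b1 in inputs:
        for b2 in inputs:
            for s in range(1, horizon + 1):
                same_past = b1[:s] == b2[:s]
                if same_past and strategy(b1)[:s] != strategy(b2)[:s]:
                    return False
    return True

def causal(beta):
    """Responds to the current value of the opponent's input only."""
    return tuple(1 - b for b in beta)

def clairvoyant(beta):
    """Uses the opponent's next value, so it anticipates the future."""
    return tuple(beta[1:]) + (0,)

causal_ok = is_nonanticipative(causal, horizon=4)
clairvoyant_ok = is_nonanticipative(clairvoyant, horizon=4)
```

The first strategy passes the check and the second fails it, since two inputs that share their first value but differ at the second step already produce different first outputs.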
• For $\tau \ge 0$,
\[ \mathrm{Capt}_K(\tau) := \bigl\{ x \in \mathbb{R}^d \ \big|\ \exists a \in \Gamma,\ \forall \beta \in \mathcal{B},\ y_x^{a[\beta],\beta}(s) \in K_s \text{ for } s \in [0,\tau], \text{ and } y_x^{a[\beta],\beta}(\tau) \in C \bigr\}. \]
• Again, let $\Phi(x) = d_C(x)$ and consider the control problem:
\[ \vartheta(x,\tau) := \min_{a \in \Gamma}\ \max_{\beta \,:\, y_x^{a[\beta],\beta}(t) \in K_t} \Phi\bigl(y_x^{a[\beta],\beta}(\tau)\bigr). \]
Then
\[ \mathrm{Capt}_K(\tau) = \{x \in \mathbb{R}^d,\ \vartheta(x,\tau) \le 0\}, \qquad T(x) = \min\{\tau \ge 0,\ \vartheta(x,\tau) \le 0\}. \]
• For controlled systems lacking controllability assumptions, the characterization of $\vartheta$ by means of HJB equations is not an easy task! (Refs: Soner '86, Ishii-Koike '91, Frankowska '91, Altarovici-Bokanowski-HZ '12)
Another formulation: exact penalisation
• Let $g(x,\theta) = d_{K_\theta}(x)$ for any $\theta \ge 0$ and $x \in \mathbb{R}^d$. Then, consider:
Example 1: One player, fixed obstacles
Example 2: Zermelo problem with obstacles
\[ x' = V_{\mathrm{boat}} \cos(\theta) + V_{\mathrm{current}} - a\,y^2, \qquad y' = V_{\mathrm{boat}} \sin(\theta). \]
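The Zermelo dynamics above are easy to simulate for a given heading. The following Python sketch uses an explicit Euler step and a constant heading. The numerical values of the boat speed, the current and the coefficient `a` are hypothetical (the slides do not give them), and the function names are illustrative.

```python
import math

def zermelo_rhs(x, y, theta, v_boat, v_current, a):
    """Velocity of the boat: the current is v_current - a*y^2 along x."""
    dx = v_boat * math.cos(theta) + v_current - a * y ** 2
    dy = v_boat * math.sin(theta)
    return (dx, dy)

def simulate(x0, y0, headings, dt, v_boat, v_current, a):
    """Explicit Euler integration for a given sequence of heading angles."""
    path = [(x0, y0)]
    x, y = x0, y0
    for theta in headings:
        dx, dy = zermelo_rhs(x, y, theta, v_boat, v_current, a)
        x, y = x + dt * dx, y + dt * dy
        path.append((x, y))
    return path

# Hypothetical parameter values, for illustration only.
V_BOAT, V_CURRENT, A_COEF = 1.0, 2.0, 0.5
path = simulate(-2.0, 0.0, [0.0] * 10, 0.1, V_BOAT, V_CURRENT, A_COEF)
```

With the heading fixed at zero and $y = 0$, the boat simply drifts along the $x$-axis at speed $V_{\mathrm{boat}} + V_{\mathrm{current}}$. A feedback law would instead choose `theta` at each step from the computed value function.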
Zermelo problem with obstacles: feedback control law
\[ x' = V_{\mathrm{boat}} \cos(\theta) + V_{\mathrm{current}} - a\,y^2, \qquad y' = V_{\mathrm{boat}} \sin(\theta). \]
[Figure: feedback control law for the Zermelo problem. Scheme: ENO2-RK1. Legend: Obstacle, Target. Horizontal axis from −3 to 2, vertical axis from −2 to 2.]
Example: Ariane V
Objective: minimize the propellant (ergol) consumption to steer the (given) payload MCU to the GTO (or GEO).
Collaboration with CNES (OPALE project, 2007-2010)
The physical model involves 7 state variables: the position $\overrightarrow{OG}$ of the rocket in 3D space, its velocity $\vec{v}$ and its mass $m$.
[Figure: reference frames. Labels: $O$, $G$, $L$, $\ell$, $e_I$, $e_J$, $e_K$, $e_r$, $e_L$, $e_\ell$, $\vec{v}$, $\chi$, $\gamma$. Caption: Projection of $\vec{v}$ on the frame $(e_r, e_L, e_\ell)$.]
The forces acting on the rocket are: gravity $\vec{P}$, drag $\vec{F}_D$, thrust $\vec{F}_T$, and Coriolis $\vec{\Omega}$.
Newton's law:
\[ m \frac{d\vec{v}}{dt} = \vec{P} + \vec{F}_D + \vec{F}_T - 2m\,\vec{\Omega} \wedge \vec{v} - m\,\vec{\Omega} \wedge \bigl(\vec{\Omega} \wedge \overrightarrow{OG}\bigr). \]
[Figure: mission Kourou-GEO. Labels: GTO, GEO, masses $m_1$, $m_2$, $m_3$, Phase A (HJB), Phase B, Phase C (HJB), (transport).]
The related equation
State variables:
  $r$ = altitude
  $v$ = modulus of the velocity
  $\gamma$ = angle between the Earth-rocket direction and the direction of the rocket's velocity
  $L$ = latitude
  $\ell$ = longitude
  $\chi$ = azimuth
  $m$ = mass of the engine
Control:
  $\alpha$ = angle between the thrust direction and the direction of the rocket's velocity
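\[
\begin{aligned}
\dot{r} &= v \cos\gamma, \\
\dot{v} &= -g(r)\cos\gamma - \frac{F_D(r,v)}{m} + \frac{F_T(r,v,a)}{m}\cos\alpha + \Omega^2 r \cos\ell\,(\cos\gamma\cos\ell - \sin\gamma\sin\ell\sin\chi), \\
\dot{\gamma} &= \sin\gamma\left(\frac{g(r)}{v} - \frac{v}{r}\right) - \frac{F_T(r,v,a)}{v\,m}\sin\alpha - 2\Omega\cos\ell\cos\chi - \Omega^2\,\frac{r}{v}\cos\ell\,(\sin\gamma\cos\ell - \cos\gamma\sin\ell\sin\chi), \\
\dot{L} &= \frac{v}{r}\,\frac{\sin\gamma\cos\chi}{\cos\ell}, \\
\dot{\ell} &= \frac{v}{r}\sin\gamma\sin\chi, \\
\dot{\chi} &= -\frac{v}{r}\sin\gamma\tan\ell\cos\chi - 2\Omega\,(\sin\ell - \operatorname{cotan}\gamma\,\cos\ell\sin\chi) + \Omega^2\,\frac{r}{v}\,\frac{\sin\ell\cos\ell\cos\chi}{\sin\gamma}.
\end{aligned}
\]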
• The plane of motion is the equatorial plane: $\ell \equiv 0$ and $\chi \equiv 0$.
\[
\begin{aligned}
\dot{r} &= v \cos\gamma, \\
\dot{v} &= -g(r)\cos\gamma - \frac{F_D(r,v)}{m} + \frac{F_T(r,v,a)}{m}\cos\alpha + \Omega^2 r \cos\gamma, \\
\dot{\gamma} &= \sin\gamma\left(\frac{g(r)}{v} - \frac{v}{r}\right) - \frac{F_T(r,v,a)}{v\,m}\sin\alpha - 2\Omega - \Omega^2\,\frac{r}{v}\sin\gamma, \\
\dot{L} &= \frac{v}{r}\sin\gamma, \\
\dot{m} &= -b(m(t)).
\end{aligned}
\]
The rocket's mass
• The evolution of the mass is described in terms of $\beta_{EAP}$, $\beta_{E1}$ and $\beta_{E2}$, the mass flow rates for the boosters, the first stage and the second stage.
• At the changes of phase, there is a (non-negligible) discontinuity in the rocket's mass.
The control problem can be formulated as (for a fixed payload):
\[
\begin{aligned}
&\text{Minimize } t_f \text{ subject to:} \\
&(r, v, \gamma, m, \alpha) \text{ satisfy the state equation}, \\
&\alpha(t) \in [0, \pi/2] \ \text{ a.e. } t \in (0, t_f), \\
&(r(t_f), v(t_f), \gamma(t_f)) \in C, \\
&Q(r(t), v(t))\,\alpha(t) \le C_s \ \text{ for } t \in (0, t_f), \\
&m(t_f) = M_p,
\end{aligned}
\]
where the target $C$ corresponds to the GTO orbit, and the function $Q$ is the dynamic pressure.
• The Capture Basin is wide
  → We introduce "physical" state constraints to define the computational domain
• Due to the CFL condition, the time step is very small
  → Adaptive time discretization
• "Different scales" for the state variables
  → Change of variable:
\[ r = r_0\,(e^{x} - 1) + r_T, \qquad v = v_0\,(e^{y} - 1) + v_T. \]
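The change of variable can be checked with a short Python sketch. The inverse map, $x = \ln\bigl(1 + (r - r_T)/r_0\bigr)$ and similarly for $y$, follows directly from the formulas above. The numerical values of $r_0$, $r_T$, $v_0$, $v_T$ below are hypothetical placeholders, not the values used in the study, and the function names are illustrative.

```python
import math

def to_physical(x, y, r0, r_t, v0, v_t):
    """Map computational variables (x, y) to physical (r, v)."""
    r = r0 * math.expm1(x) + r_t      # r0 * (e^x - 1) + rT
    v = v0 * math.expm1(y) + v_t      # v0 * (e^y - 1) + vT
    return (r, v)

def to_computational(r, v, r0, r_t, v0, v_t):
    """Inverse map: x = ln(1 + (r - rT)/r0), y = ln(1 + (v - vT)/v0)."""
    x = math.log1p((r - r_t) / r0)
    y = math.log1p((v - v_t) / v0)
    return (x, y)

# Hypothetical scales, chosen only to exercise the formulas.
R0, R_T, V0, V_T = 10.0, 200.0, 100.0, 8000.0

# A uniform grid in x gives a non-uniform grid in r, finest near r = rT.
x_grid = [-1.0 + 0.5 * k for k in range(5)]
r_grid = [to_physical(x, 0.0, R0, R_T, V0, V_T)[0] for x in x_grid]
r_spacings = [r_grid[k + 1] - r_grid[k] for k in range(len(r_grid) - 1)]
```

The spacing in `r` grows geometrically with `x`, which is one way to handle state variables that live on very different scales with a single uniform computational grid.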
GTO target
[Figure: four panels plotted against time (sec), from 0 to 1000: altitude (km), speed (m/s), gamma (rad), mass (ton).]
Figure: Full trajectory using the HJB minimal time value function
Reference trajectory, final mass: $m_T = 21.57$ (t)
HJB trajectory, final mass (after reconstruction): $m_T = 22.50$ (t)
"Collision analysis for an UAV". E. Crück, A. Desilles and H. Zidani. AIAA Guidance, Navigation, and Control, 2012.
"A general Hamilton-Jacobi framework for nonlinear state-constrained control problems". A. Altarovici, O. Bokanowski and H. Zidani. ESAIM: COCV, 2012.
"Minimal time problems with moving targets and obstacles". O. Bokanowski and H. Zidani. 18th IFAC World Congress, Milan, 2011.
"Deterministic state constrained optimal control problems without controllability assumptions". O. Bokanowski, N. Forcadel and H. Zidani. ESAIM: COCV, 17(04), pp. 975–994, 2011.
"An efficient data structure and accurate scheme to solve front propagation problems". O. Bokanowski, E. Cristiani and H. Zidani. J. of Scientific Computing, 42(2), pp. 251–273, 2010.
"Reachability and minimal times for state constrained nonlinear problems without any controllability assumption". O. Bokanowski, N. Forcadel and H. Zidani. SIAM J. Control and Optimization, 48(7), pp. 4292–4316, 2010.
"Convergence of a non-monotone scheme for Hamilton-Jacobi-Bellman equations with discontinuous initial data". O. Bokanowski, N. Megdich and H. Zidani. Numerische Mathematik, 115(1), pp. 1–44, 2010.
"An anti-diffusive scheme for viability problems". O. Bokanowski, S. Martin, R. Munos and H. Zidani. Applied Num. Mathematics, 56(9), pp. 1147–1162, 2006.
... many thanks for your attention!