A Graph Search Algorithm for Optimal Control of Hybrid Systems Olaf Stursberg University of Dortmund Germany Work financially supported by the European.

A Graph Search Algorithm for Optimal Control of Hybrid Systems

Olaf Stursberg

University of Dortmund, Germany

Work financially supported by the European Union within the project AMETIST (Advanced Methods for Timed Systems), IST 2001-35304


Motivation

Given: a hybrid dynamic system with
  nonlinear continuous dynamics,
  continuous and discrete inputs,
  transitions with state resets.

Specifications:
  transfer from an initial state to a goal set,
  safety restriction (exclusion of unsafe states),
  maximized performance / minimized costs.

[industrial relevance: start-up, shut-down or change-over of processing systems]

Objective:

Determine input trajectories such that the specifications are met!

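[Figure: hybrid automaton with locations z1, z2, z3 over the (x1, x2) plane, showing the initialization, the goal, a reset, the unsafe set, a trajectory x(t), and the inputs u and v.]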


Why is this difficult?

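Standard approaches to continuous optimal control (OC):

$\min_{u(t),\, t_f} J$ with: $J = h(x(t_f), t_f) + \int_{t_0}^{t_f} \psi(x, u)\, d\tau$, $\dot{x} = f(x, u)$, $x(t_0) = x_0$, $x(t_f) = x_{tar}$, $x \in X$, $u \in U$

Variational Calculus / Maximum Principle

Necessary conditions:

Hamiltonian: $H(x, u, \lambda, t) = \psi(x, u) + \lambda^T f(x, u)$

adjoint DE: $\dot{\lambda}^T = -\frac{\partial \psi}{\partial x} - \lambda^T \frac{\partial f}{\partial x}$, $\lambda^T(t_f) = \frac{\partial h}{\partial x}\big|_{t_f}$

minimum condition: $H(x^*, u^*, \lambda^*, t) \le H(x^*, u, \lambda^*, t)$, ...

required: $\frac{\partial H}{\partial u}$, $\frac{\partial h}{\partial x}$, $\frac{\partial f}{\partial x}$, $\frac{\partial \psi}{\partial x}$

Bellman Principle / Dynamic Programming

A total strategy is only optimal if the remaining partial strategy is optimal.

Value function: $V(t, x) = \min_u J(t, x, u)$

HJB equation (sufficient condition): $-\frac{\partial V}{\partial t} = \min_u \left[ \psi(x, u) + \frac{\partial V}{\partial x} f(x, u) \right]$, $V(t_f, x(t_f)) = h(x(t_f), t_f)$

Hybrid systems: $H$, $f$, $\psi$, $V$ are not differentiable; in addition, the discrete inputs $v$ are required.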


Related Work

Among others:

M. S. Branicky, V. S. Borkar, S. K. Mitter: “A unified framework for hybrid control: Model and optimal control theory”. IEEE Trans. Automatic Control, vol. 43, no. 1, 31–45, 1998.

H. Sussmann: “A maximum principle for hybrid optimal control problems,” Proc. 38th IEEE Conf. Decision and Control, 1999, 425–430.

M. Broucke, M. Di Benedetto, S. Di Gennaro, A. Sangiovanni-Vincentelli: “Optimal control using bisimulations”. Hybrid Systems: Comp. and Control, LNCS 2034, 2001, 175–188.

M. Shaikh, P. Caines: “On the optimal control of hybrid systems”. Hybrid Systems: Comp. and Control, LNCS 2623, 2003, 466–481.

B. De Schutter: “Optimal control of a class of linear hybrid systems with saturation”. Proc. 38th IEEE Conf. Decision and Control, 1999, 3978–3983.

B. Lincoln, A. Rantzer: “Optimizing linear system switching”. Proc. 40th IEEE Conf. Decision and Control, 2001, 2063–2068.

X. Xu, P. Antsaklis: “Quadratic optimal control problems for hybrid linear autonomous systems with state jumps”. Proc. American Control Conf., 2003, 3393–3398.

A. Bemporad, M. Morari: “Control of systems integrating logic, dynamics, and constraints,” Automatica, vol. 35, no. 3, pp. 407–427, 1999.

O. Stursberg, S. Engell: “Optimal control of switched continuous systems using mixed-integer programming”. Proc. 15th IFAC World Congr. Automatic Control, vol. Th-A06-4, 2002.


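Hybrid Model - Syntax

Hybrid automaton: $HA = (X, U, V, Z, inv, \Theta, g, r, f)$

continuous states: $x \in X \subseteq R^{n_x}$

continuous inputs: $u \in U = [u_1^-, u_1^+] \times \ldots \times [u_{n_u}^-, u_{n_u}^+]$

finite set of discrete inputs: $v \in V = \{v_1, \ldots, v_{n_d}\}$, $v_j \in R^{n_v}$

finite set of locations: $Z = \{z_1, \ldots, z_{n_z}\}$

invariants: $inv: Z \to 2^X$, polyhedral for all $z$

transitions: $\Theta \subseteq Z \times Z$, $\theta = (z_1, z_2)$

guards: $g: \Theta \to 2^X$, polyhedral, disjoint for each $z$

resets: $r: \Theta \times X \to X$

flow functions: $f: Z \times X \times U \times V \to R^{n_x}$ s.t. $\dot{x} = f(z, x, u, v)$ defines a continuous vector field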
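The components of the hybrid automaton can be collected in a small data structure. The following is a minimal sketch only: the class name `HybridAutomaton`, its field names, and the representation of polyhedral sets as boolean predicates are illustrative assumptions, not part of the original model definition.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

State = Sequence[float]        # continuous state x
Transition = Tuple[str, str]   # (source location, target location)


@dataclass
class HybridAutomaton:
    """Illustrative container for the model components (names are assumptions)."""
    locations: List[str]                                  # finite set of locations
    input_bounds: List[Tuple[float, float]]               # box of continuous inputs
    discrete_inputs: List[Tuple[float, ...]]              # finite set of discrete inputs
    invariant: Dict[str, Callable[[State], bool]]         # membership test per location
    guard: Dict[Transition, Callable[[State], bool]]      # membership test per transition
    reset: Dict[Transition, Callable[[State], List[float]]]
    flow: Callable[[str, State, Sequence[float], Sequence[float]], List[float]]

    def enabled(self, z: str, x: State) -> List[Transition]:
        """Transitions leaving location z whose guard contains x."""
        return [th for th, g in self.guard.items() if th[0] == z and g(x)]


# Hypothetical two-location example, chosen only to exercise the structure.
ha = HybridAutomaton(
    locations=["z1", "z2"],
    input_bounds=[(0.0, 1.0)],
    discrete_inputs=[(0.0,), (1.0,)],
    invariant={"z1": lambda x: x[0] <= 1.0, "z2": lambda x: x[0] >= 1.0},
    guard={("z1", "z2"): lambda x: x[0] >= 1.0},
    reset={("z1", "z2"): lambda x: [x[0], 0.0]},
    flow=lambda z, x, u, v: [u[0] + v[0], 1.0],
)
```

Representing invariants and guards as predicates keeps the sketch short; a polyhedral representation (matrix inequalities) would be closer to the stated model.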


Hybrid Model - Semantics

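Time set: $T = \{t_0, t_1, t_2, \ldots\}$ contains event times

Hybrid state: $\sigma_k = (z_k, x_k)$ with $x_k = x(t_k)$, $z_k = z(t_k)$

Input trajectories: $\phi_u = (u_0, u_1, \ldots) \in \Phi_u$, $\phi_v = (v_0, v_1, \ldots) \in \Phi_v$, with $u_k$, $v_k$ constant for $t \in [t_k, t_{k+1}[$

Feasible run of HA for given $\sigma_0$, $\phi_u$ and $\phi_v$: $\phi_\sigma = (\sigma_0, \sigma_1, \sigma_2, \ldots)$ with $\sigma_{k+1}$ from $\sigma_k$:

(i) continuous: $\chi(0) = x_k$ and $\chi(t)$ is the unique solution to the flow function for $t \in [0, \tau]$; $\chi(t) \in inv(z_k)$ but $\chi(t) \notin g((z_k, \cdot))$ for $t < \tau$

(ii) transition: $(z_k, z_{k+1}) \in \Theta$, $\chi(\tau) \in g((z_k, z_{k+1}))$, and $x_{k+1} = r((z_k, z_{k+1}), \chi(\tau)) \in inv(z_{k+1})$

Note: all feasible runs are deterministic.

[Figure: trajectory x(t) in the (x1, x2) plane taking a transition between the locations z2 and z3.]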
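The two phases of a feasible run (continuous evolution inside the invariant, then a guarded transition with reset) can be sketched as a simulation routine. This is a hedged illustration: the function name `simulate_until_transition`, the forward-Euler integration, the fixed step `dt`, and the time limit `t_max` are assumptions made for brevity and are not taken from the slides.

```python
def simulate_until_transition(z, x, u, v, flow, guards, resets, invariants,
                              dt=0.01, t_max=10.0):
    """Evolve from (z, x) under constant inputs u, v until a guard is reached.

    Returns (next_location, next_state, elapsed_time), or None if the
    invariant is violated or no guard is reached within t_max.
    Forward Euler is used only for illustration (an assumption).
    """
    x, t = list(x), 0.0
    while t < t_max:
        # (ii) transition: take the first enabled transition leaving z
        for (src, dst), guard in guards.items():
            if src == z and guard(x):
                x_next = resets[(src, dst)](x)
                # the reset state must lie in the invariant of the new location
                return (dst, x_next, t) if invariants[dst](x_next) else None
        # (i) continuous evolution must stay inside the invariant
        if not invariants[z](x):
            return None
        dx = flow(z, x, u, v)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
        t += dt
    return None
```

Because the guards are assumed disjoint for each location, at most one transition is enabled at a time, which is what makes the run deterministic.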


Problem Statement

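Target region: $(z_{tar}, x_{tar}) \in \Sigma_{tar}$, with one $z_{tar} \in Z$, $x_{tar} \in inv(z_{tar})$

Forbidden sets: $F = \{F_1, \ldots, F_{n_F}\}$, with each $F_j$ a set of hybrid states, continuous sets polyhedral

Assume: time set $T = \{t_0, t_1, \ldots, t_f\}$ is finite

Optimal control task:

determine $\phi_u^*$, $\phi_v^*$ such that $\phi_\sigma^*$ is the solution to:

$\min_{\phi_u \in \Phi_u,\, \phi_v \in \Phi_v} \Omega(t_f, \phi_\sigma, \phi_u, \phi_v)$

subject to: $\sigma_0 = (z_0, x_0)$, $\sigma_f \in \Sigma_{tar}$, $\sigma_k \notin F_j$ for all $k$, $j$

Chosen cost function $\Omega$: $t_f$ in combination with weighted distances of $\sigma_k$ to $\Sigma_{tar}$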
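One possible reading of the chosen cost function (final time combined with weighted distances of the visited states to the target) is sketched below. The function name `run_cost`, the Euclidean distance, the single scalar weight, and the use of a target point instead of a target region are all assumptions; the slides do not specify the exact form.

```python
import math


def run_cost(t_f, states, x_tar, weight=1.0):
    """Final time plus the weighted sum of distances of the visited
    continuous states to a target point (illustrative form only)."""
    return t_f + weight * sum(math.dist(x, x_tar) for x in states)
```

For example, a run that ends at time 2.0 and visits the states (0, 0) and (3, 4) with target (3, 4) and weight 0.5 is charged 2.0 + 0.5 * (5.0 + 0.0).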


Previously: MILP-based Approach

Characteristics:

point-wise linearized hybrid dynamics

reformulation as purely algebraic optimization problem

solution by mixed-integer linear programming (MILP)

applied iteratively in a moving horizon scheme

Problems:

complexity grows as a high-order polynomial in the prediction horizon

applicability to larger systems restricted

Reasons:

large number of auxiliary variables and constraints required to express the transition dynamics algebraically

relaxations between different dynamics often inefficient (e.g. $x_{k+1} = b_1 f_1(x_k, u_k, v_k) + b_2 f_2(x_k, u_k, v_k)$, $b_1 + b_2 = 1$, $b_i \in \{0, 1\}$)
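A small numeric illustration of why relaxing the binary selection between two dynamics can be uninformative: with binary weights the successor state follows exactly one dynamics, while a fractional weight produces a blend that neither dynamics generates. The functions `f1`, `f2` and the helper `blended_step` are hypothetical and chosen only for this example.

```python
def blended_step(x, b1, f1, f2):
    """Successor state b1*f1(x) + b2*f2(x) with b2 = 1 - b1."""
    b2 = 1.0 - b1
    return b1 * f1(x) + b2 * f2(x)


f1 = lambda x: 2.0 * x   # hypothetical dynamics 1
f2 = lambda x: x - 1.0   # hypothetical dynamics 2

binary_1 = blended_step(1.0, 1.0, f1, f2)   # follows dynamics 1 only
binary_2 = blended_step(1.0, 0.0, f1, f2)   # follows dynamics 2 only
relaxed = blended_step(1.0, 0.5, f1, f2)    # fractional weight: a blend
```

Here the relaxed successor lies between the two admissible successors and corresponds to no behavior of the switched system, so bounds obtained from such relaxations may be weak.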


Decomposition Approach

Principle:

separate the optimization of continuous and discrete degrees of freedom (DOF):

(i) high level: search tree encoding the discrete DOF v(t)

(ii) low level: embedded NLP for the continuous DOF u(t)

branch & bound and heuristics to prune the search tree efficiently

cost function evaluated by hybrid simulation

[Figure: block diagram. Inputs: the hybrid automaton HA, the specification ($\sigma_0$, $\Sigma_{tar}$, $F$), and the prediction horizon p. Blocks: Graph Search Algorithm, Embedded Nonlinear Programming, Hybrid Simulation. Exchanged signals: node n and $v_k$; $x_k$, $\hat{\phi}_u$, $\hat{\phi}_v$; $u_k$, $x_{k+1}$, $t_p$, $\hat{\phi}_\sigma$, $\hat{\phi}_u$, $\hat{\phi}_v$; neighborhood info with $\phi_u$, $\phi_v$, $\phi_\sigma$.]


Graph Search (1)

acyclic graph encoding the possible $\phi_v$-trajectories (finite length)

node: $n = (\sigma_k, \phi_u, \phi_v, c_a, c_p, \pi)$

with: $c_a$ – accumulated costs up to $\sigma_k$

$c_p$ – predicted costs for $\hat{\sigma}_{k+1}, \ldots, \hat{\sigma}_{k+p}$

$\pi$ – priority for selection

‘shortest path’ search [figure annotation: costs too high compared to best solution]


Graph Search (2)

Search strategy: (1.) best-first / depth-first until 1st solution is found

(2.) breadth-first (or continue with best-first)

Selection criterion: priorities (combine $c_a$ and $c_p$)

Pruning: node does not belong to a feasible run

hybrid state within $F_j$

costs indicate that no optimal solution is possible ($c_a + c_p \ge c_a^*$)

target reached

hybrid state in a neighborhood of another (and lower priority)
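The search and pruning rules listed above can be sketched as a best-first branch-and-bound loop over discrete input choices. This is a simplified illustration, not the algorithm from the talk: it uses best-first selection throughout, omits the neighborhood-based pruning, and replaces the embedded optimization and hybrid simulation by user-supplied callbacks. All names (`best_first`, `step`, `lower_bound`, ...) are assumptions.

```python
import heapq
import itertools


def best_first(start, inputs, step, is_target, is_forbidden, lower_bound):
    """Best-first search with cost-based pruning.

    step(s, v) returns (next_state, step_cost) or None if infeasible;
    lower_bound(s) underestimates the remaining cost from s.
    Returns (best_cost, best_input_sequence).
    """
    best_cost, best_path = float("inf"), None
    counter = itertools.count()              # tie-breaker for the heap
    live = [(lower_bound(start), next(counter), 0.0, start, [])]
    while live:
        prio, _, c_a, s, path = heapq.heappop(live)
        if prio >= best_cost:                # c_a + c_p cannot beat the incumbent
            continue
        for v in inputs:
            nxt = step(s, v)
            if nxt is None:                  # not part of a feasible run
                continue
            s2, cost = nxt
            if is_forbidden(s2):             # state inside a forbidden set
                continue
            c2 = c_a + cost
            if is_target(s2):                # target reached: update incumbent
                if c2 < best_cost:
                    best_cost, best_path = c2, path + [v]
                continue
            bound = c2 + lower_bound(s2)
            if bound < best_cost:
                heapq.heappush(live, (bound, next(counter), c2, s2, path + [v]))
    return best_cost, best_path
```

A toy usage: integer states starting at 0, inputs 1 or 2 added to the state at unit cost, target state 4, forbidden state 2, and lower bound `(4 - s + 1) // 2`. The search must step around the forbidden state, so the cheapest feasible input sequence has three steps.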


Adjacency and Similarity

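Observation: usually many ‘similar’ runs $\phi_\sigma$ are investigated

Similarity: for two different input trajectories, $\phi_\sigma$ and $\phi_\sigma'$ contain intermediate states $\sigma_i$ and $\sigma_{i'}$ with $z_i = z_{i'}$ and $\| x_i - x_{i'} \| \le \varepsilon$

$\sigma_i$ and $\sigma_{i'}$ are called adjacent

[assumption: remaining optimal paths incur the same costs]

Priorities: if $\sigma_i$ and $\sigma_{i'}$ are adjacent, then for accumulated costs $c_a$, $c_a'$ and predicted costs $c_p$, $c_p'$, the priorities are:

$\pi(\sigma_{i'}) > \pi(\sigma_i)$ iff:

(1.) $c_a(\sigma_{i'}) < c_a(\sigma_i)$

(2.) $c_p(\sigma_{i'}) < c_p(\sigma_i)$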
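The adjacency test and the resulting priority comparison might be coded as follows. This is a sketch under assumptions: the Euclidean norm, the threshold name `eps`, and the dictionary layout of a node are illustrative choices that the slides do not fix.

```python
import math


def adjacent(sigma_a, sigma_b, eps):
    """Two hybrid states (z, x) are adjacent if they share the location
    and their continuous states are within eps of each other."""
    (z_a, x_a), (z_b, x_b) = sigma_a, sigma_b
    return z_a == z_b and math.dist(x_a, x_b) <= eps


def has_lower_priority(node, other, eps):
    """True if `other` is adjacent to `node` and has strictly lower
    accumulated AND predicted costs, so `node` gets the lower priority."""
    return (adjacent(node["sigma"], other["sigma"], eps)
            and other["c_a"] < node["c_a"]
            and other["c_p"] < node["c_p"])
```

Depending on the configuration, a node for which `has_lower_priority` holds could either be pruned or merely moved back in the list of live nodes.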


Embedded Nonlinear Optimization

Time set for the prediction: $T_p = \{t_k, \ldots, t_{k+p}\} \subseteq T$

Optimization: $\min_{\hat{\phi}_u, \hat{\phi}_v} \Omega(t_p, \hat{\phi}_\sigma, \hat{\phi}_u, \hat{\phi}_v)$

for $\hat{\phi}_v$: each component of $v$ is relaxed to its range of values

solution by nonlinear programming (NLP)

Evaluation of the cost function:

hybrid simulation of HA for the input trajectories $\hat{\phi}_u$, $\hat{\phi}_v$ (involves evaluating the continuous dynamics, detecting the guard satisfaction, and executing the resets)

Result of the embedded optimization:

continuous input $u_k$

accumulated costs $c_a$ (from $\sigma_0$ to $\sigma_{k+1}$)

predicted cost $c_p$ (underestimation, i.e. a lower bound)
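The idea of an embedded optimization whose cost is evaluated by simulation over a short prediction horizon can be illustrated with a deliberately simple stand-in. Instead of a general NLP solver, the sketch uses a golden-section search over a single bounded continuous input, and instead of a hybrid simulation it uses a toy single-integrator model; both substitutions, and all names (`golden_section`, `predicted_cost`), are assumptions made to keep the example self-contained.

```python
import math


def golden_section(cost, lo, hi, tol=1e-6):
    """Minimize a unimodal scalar function on [lo, hi] (stand-in for an NLP solver)."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    while b - a > tol:
        if cost(c) < cost(d):
            b, d = d, c                      # minimum lies in [a, d]
            c = b - inv_phi * (b - a)
        else:
            a, c = c, d                      # minimum lies in [c, b]
            d = a + inv_phi * (b - a)
    return 0.5 * (a + b)


def predicted_cost(u, x_k=0.0, x_tar=1.0, p=2, dt=0.5):
    """Toy replacement for the hybrid simulation: integrate x' = u for p
    steps and sum the distances of the predicted states to the target."""
    x, total = x_k, 0.0
    for _ in range(p):
        x += dt * u
        total += abs(x - x_tar)
    return total


u_opt = golden_section(predicted_cost, 0.0, 2.0)   # input bounds [0, 2] are assumed
c_p = predicted_cost(u_opt)
```

In this toy setting the predicted cost is |0.5u - 1| + |u - 1|, which is minimized at u = 1 with value 0.5; in the actual approach the cost would come from simulating the hybrid automaton with the relaxed discrete inputs.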


Algorithm


Example - Description

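Tank reactor with 2nd order reaction

[Figure: tank reactor with the labels M, F1, F2, sH, FC, F3 and the state variables V, T, cA, cB.]

Variables: discrete inputs: F1, F2, sH

continuous inputs: FC, F3

state variables: V, T, cA, cB

Discrete dynamics: two locations, “low level” and “high level”, distinguished by comparing V with 0.8

Continuous dynamics:

[Equations: differential equations for dV/dt (in F1, F2, F3), dT/dt, dcA/dt and dcB/dt with the parameters k1 to k16; some terms apply only for “high level”; auxiliary functions $f_{c,v}$, $f_{h,v}$ and $f_r$ are defined under “with:”, the reaction term $f_r$ involving cA, cB and an exponential in T.]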


Example – Results (1)

Objectives: reach nominal reaction (target) from an initially empty reactor

time optimality

avoid overflow and critical temperatures.

Configurations:

select: best-first search (throughout)

determinebest: pruning based on adjacency after 1st solution is found

prediction horizon: p = 2

Results: termination after 959 nodes,

721 nodes fathomed due to adjacency, the remainder due to costs

[theoretical number of nodes for the encountered path length: $3 \cdot 10^{14}$]

computation time: 484 CPU-sec (P4-1.5 GHz)

(approx. one order of magnitude smaller than for the MILP solution)


Example – Results (2)

Projection into the (VR, TR, cA) space:

red: fathomed

blue: not fathomed

green: target, best found solution

Corresponding trajectory $\phi_v$


Conclusions

Separation of continuous and discrete DOFs:

evaluates the original hybrid dynamics (not linearized models)

no algebraic encoding of the transition dynamics required

tree search only for true degrees of freedom (not for discrete auxiliary variables)

no relaxations between different continuous dynamics

unlike MPC: the search does not discard all but the best solution in each iteration

Adjacency:

avoids exploration of almost identical evolutions

used for pruning or only for sorting the list of live nodes

result is in general only suboptimal


Current Work

improve state space coverage: neighborhoods determined by the progress in X

determine suitable choices for the diameters of neighborhoods

evaluate the performance for more complex autonomous discrete dynamics (includes costs for resets)

[Figure: 3D plot of explored nodes over the axes x1, x2, x3. Title: Searchmode: [3 1], mincost = 870.5501, CPU-Time = 23.3594, Explored Nodes: 200]
