CONTROL THEORY AND TOPICS IN FUNCTIONAL ANALYSIS

LECTURES PRESENTED AT AN INTERNATIONAL SEMINAR COURSE, TRIESTE, 11 SEPTEMBER - 29 NOVEMBER 1974, ORGANIZED BY THE INTERNATIONAL CENTRE FOR THEORETICAL PHYSICS, TRIESTE

INTERNATIONAL ATOMIC ENERGY AGENCY, VIENNA, 1976


In three volumes

VOL. I

INTERNATIONAL ATOMIC ENERGY AGENCY VIENNA, 1976

THE INTERNATIONAL CENTRE FOR THEORETICAL PHYSICS (ICTP) in Trieste was

established by the International Atomic Energy Agency (IAEA) in 1964 under an agreement

with the Italian Government, and with the assistance of the City and University of Trieste.

The IAEA and the United Nations Educational, Scientific and Cultural Organization (UNESCO)

subsequently agreed to operate the Centre jointly from 1 January 1970.

Member States of both organizations participate in the work of the Centre, the main purpose of

which is to foster, through training and research, the advancement of theoretical physics, with

special regard to the needs of developing countries.

CONTROL THEORY AND TOPICS IN FUNCTIONAL ANALYSIS IAEA, VIENNA, 1976

STI/PUB/415
ISBN 92-0-130076-X

Printed by the IAEA in Austria

March 1976

FOREWORD

The International Centre for Theoretical Physics has maintained an interdisciplinary

character in its research and training programmes in different branches of theoretical physics

and related applied mathematics. In pursuance of this objective, the Centre has - since 1964 -

organized extended research courses in various disciplines; most of the Proceedings of these

courses have been published by the International Atomic Energy Agency.

The present three volumes record the Proceedings of the 1974 Autumn Course on Control

Theory and Topics in Functional Analysis held from 11 September to 29 November 1974.

The first volume consists of fundamental courses on differential systems, functional analysis

and optimization in theory and applications; the second contains lectures on control theory

and optimal control of ordinary differential systems; the third volume deals with infinite

dimensional (hereditary, stochastic and partial differential) systems. The programme of lectures

was organized by Professors R. Conti (Florence, Italy), L. Markus (Warwick, United Kingdom)

and C. Olech (Warsaw, Poland).

Abdus Salam

CONTENTS OF VOL. I

Basic concepts of control theory (IAEA-SMR-17/6)
    L. Markus
Concepts of stability and control (IAEA-SMR-17/1)
    P.C. Parks
Foundations of functional analysis theory (IAEA-SMR-17/2)
    Ruth F. Curtain
Control theory and applications (IAEA-SMR-17/3)
    A.J. Pritchard
Finite-dimensional optimization (IAEA-SMR-17/4)
    D.Q. Mayne
Reachability of sets and time- and norm-minimal controls (IAEA-SMR-17/37)
    A. Marzollo
Existence theory in optimal control (IAEA-SMR-17/61)
    C. Olech
Asymptotic control (IAEA-SMR-17/48)
    R. Conti
Controllability of non-linear control dynamical systems (IAEA-SMR-17/71)
    C. Lobry
Introduction to convex analysis (IAEA-SMR-17/63)
    J.P. Cecconi
An introduction to probability theory (IAEA-SMR-17/29)
    J. Zabczyk
Secretariat of Seminar

IAEA-SMR-17/6

BASIC CONCEPTS OF CONTROL THEORY

L. MARKUS

School of Mathematics, University of Minnesota, Minneapolis, Minnesota, United States of America

Abstract

BASIC CONCEPTS OF CONTROL THEORY.
After a philosophical introduction on control theory and its position among various branches of science, mathematical control theory and its connection with functional analysis are discussed. A chapter on system theory concepts follows. After a summary of results and notations in the general theory of ordinary differential equations, a qualitative theory of control dynamical systems and chapters on the topological dynamics, and the controllability of linear systems are presented. As examples of autonomous linear systems, the switching locus for the synthesis of optimal controllers and linear dynamics with quadratic cost optimization are considered.

I. WHAT IS CONTROL THEORY - AND WHY?

Just what is control theory? Who or what is to be controlled and by whom or by what, and why is it to be controlled? In a nutshell, control theory, sometimes called automation, cybernetics or systems theory, is a branch of applied mathematics that deals with the design of machinery and other engineering systems so that these systems work, and work better than before.

As an example, consider the problem of controlling the temperature in a cold lecture hall. This is a standard engineering problem familiar to us all. The thermal system consists of the furnace as the heating source, and the room thermometer as the record of the temperature of the hall. The external environment we assume fixed and not belonging to the thermodynamic system under analysis. The basic heating source is the furnace, but the control of the furnace is through a thermostat. The thermostat device usually contains a thermometer to measure the current room temperature and a dial on which we set the desired room temperature. The control aspect of the thermostat is that it compares the actual and the desired temperatures at each moment and then it sends an electric signal or control command to the furnace to turn the fire intensity up or down. In this case, the job of the control engineer is to invent or design an effective thermostat.
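
The thermostat's compare-and-command cycle can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the lecture: the first-order heat-loss model of the hall, the on/off furnace, the switching band and every numerical value (`loss_rate`, `furnace_power`, `band` and so on) are assumptions chosen for the illustration.

```python
def thermostat_command(actual, desired, furnace_on, band=0.5):
    """Compare actual and desired temperature; return the new furnace state."""
    if actual < desired - band:
        return True            # too cold: turn the fire up
    if actual > desired + band:
        return False           # too warm: turn the fire down
    return furnace_on          # inside the band: leave the furnace as it is


def simulate_hall(hours=12.0, dt=0.01, desired=20.0, outside=5.0,
                  loss_rate=0.5, furnace_power=15.0, start=8.0):
    """Euler integration of the assumed model
    dT/dt = -loss_rate * (T - outside) + furnace_power * (furnace on ? 1 : 0).
    All parameter values are hypothetical."""
    temperature, furnace_on = start, False
    history = []
    for _ in range(int(hours / dt)):
        furnace_on = thermostat_command(temperature, desired, furnace_on)
        heating = furnace_power if furnace_on else 0.0
        temperature += dt * (-loss_rate * (temperature - outside) + heating)
        history.append(temperature)
    return history


history = simulate_hall()
```

Under these assumed values the simulated hall warms up from its cold start and then cycles inside a narrow band around the dial setting, which is the behaviour the thermostat is designed to produce.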

Let us next look at a control problem from biology. Parts of the world are being overrun by an increasing population of rats. Here the system consists of the living population of rats and the environmental parameters that affect that population. The natural growth of the rat population is to be controlled towards some desired number, say, zero. Here the job of the control engineer is to build a better mouse-trap.

From this viewpoint control theory does not appear too sinister. On the other hand, it does not seem too profound. So let me elaborate on the structure of control theory to indicate the reasons why UNESCO and the ICTP believe this subject is important. To organize these ideas, I shall discuss control theory from four viewpoints:

(i) as an intellectual discipline within science and the philosophy of science;
(ii) as a part of engineering, with industrial applications;
(iii) as a part of the educational curriculum at university;
(iv) as a force in the world related to technological, economic and social problems of the present and the future.

First consider the philosophical position of the discipline of control theory. Within the framework of metaphysics, control theory is a teleological science. That is, the concepts of control theory involve ideas such as purpose, goal-seeking and ideal or desirable norms. These are terms of nineteenth century biology and psychology, terms of volition, will and motivation such as were introduced by Aristotle to explain the foundations of physics, but then carefully exorcized by Newton when he constructed an unhuman geometric mechanics. So control theory represents a synthesis of the philosophies of Aristotle and Newton showing that inanimate deterministic mechanisms can function as purposeful self-regulating organisms. Recall how the inanimate thermostat regulates the room temperature towards the agreed ideal.

Another philosophical aspect of control theory is that it avoids the concepts of energy but, instead, deals with the phenomenon of information in physical systems. If we compare the furnace with the thermostat we note a great disparity of size and weight. The powerful furnace supplies quantities of energy: a concept of classical physics. However, the tiny but ingenious thermostat deals with information - an aspect of modern statistical physics and mathematics. Thus control theory rests on a new category of physical reality, namely information, which is distinct from energy or matter. Possibly, this affords a new approach to the conundrum of mind versus matter, concerning which the philosophical journal Punch once remarked,

"What is matter? - Never mind.
What is mind? - No matter."

But what are the problems, methods and results of control theory as they are interpreted in modern mathematical physics or engineering? In this sense control theory deals with the inverse problem of dynamical systems. That is, suppose we have a dynamical system, for example many vibrating masses interconnected by elastic springs. Such a dynamical system is described mathematically by an array of ordinary differential equations that predict the evolution of the vibrations according to Newton's laws of motion. The customary or direct problem of dynamics is the analysis of the physical system (to obtain the array of mathematical differential equations) and then the analysis of the differential equations to compute the solutions describing the vibrations. For example, we might try to locate all equilibrium states of the dynamical system and to compute which of these are stable.

The inverse problem, that is, control theory applied to the vibrating system, concerns the question of synthesis as the inverse of analysis. Here we specify a goal and seek to modify the physical and mathematical systems to incorporate this goal. For instance, we might pick one of the known equilibrium states and insist that this must be stable so that, whatever the current state of the vibrating system, it must tend towards the desired equilibrium. In terms of the physics this means that we seek to synthesize new forces (for instance, frictional damping) into the system to achieve the goal. In terms of mathematics we seek to add or synthesize new coefficients (of some specified type) into the differential equations to produce the required type of stability.

In the simplest case the vibrating system consists of a single mass on a linear spring. If the displacement from equilibrium is $x$ at time $t$, then Newton's law asserts that the acceleration $d^2x/dt^2$ is equal to the spring force $-x$ (using units in which the mass is unity) and we obtain the differential equation of motion

$$\frac{d^2 x}{dt^2} = -x .$$

Can we introduce a control force $u$, depending on $x$ and $dx/dt$, so that every solution of

$$\frac{d^2 x}{dt^2} = -x + u$$

returns to the rest equilibrium state $x = 0$, $dx/dt = 0$, after a finite duration? This is the problem of controllability. Can we find a control law $u^*(x, dx/dt)$ such that the time of return to rest is of minimal duration, as compared with the corresponding duration for all control forces $u$ with magnitudes limited by a prescribed bound? This is the problem of optimal control.
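
As a numerical illustration of synthesizing a new force, the sketch below integrates the controlled spring equation x'' = -x + u with a frictional damping law u = -c dx/dt. The damping law, the coefficient c = 1, the step size and the function names are assumptions made here for illustration. Note that linear damping drives the state towards rest only asymptotically; it does not settle the finite-duration question, since a non-zero solution of a linear equation never reaches the rest state exactly.

```python
def simulate(control, x0=1.0, v0=0.0, t_end=20.0, dt=0.001):
    """Integrate x'' = -x + control(x, v) with the classical RK4 scheme.
    Returns the final state (x, v), where v = dx/dt."""
    def rhs(x, v):
        return v, -x + control(x, v)

    x, v = x0, v0
    for _ in range(int(round(t_end / dt))):
        k1x, k1v = rhs(x, v)
        k2x, k2v = rhs(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x, k3v = rhs(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x, k4v = rhs(x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return x, v


def energy(x, v):
    """Mechanical energy of the unit-mass spring: (x^2 + v^2) / 2."""
    return 0.5 * (x * x + v * v)


# No control: the mass oscillates for ever and the energy is conserved.
free_state = simulate(lambda x, v: 0.0)

# Assumed damping law u = -c * v with c = 1: since dE/dt = v * u = -v^2 <= 0,
# the energy decays and the state tends towards the rest equilibrium.
damped_state = simulate(lambda x, v: -1.0 * v)
```

Comparing `energy(*free_state)` with `energy(*damped_state)` shows the effect of the synthesized force: the first stays at its initial value while the second has decayed almost to zero.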

Thus the mathematical problem of control theory concerns the modification of differential equations, within prescribed limitations, so that the solutions behave in some desired specified manner. Moreover, control dynamics can refer to ordinary differential equations as in the case of the finite set of vibrating masses, or partial differential equations as in the case of temperature variations throughout a lecture hall. In fact, more complicated types of functional equations, linear or non-linear, deterministic or stochastic, finite- or infinite-dimensional can arise in control theory. The common feature of all these problems within control theory is that we prescribe the desired behaviour of the solutions and then we seek to modify the coefficients of the dynamical equations so as to induce this behaviour. Hence control theory takes its format from mathematics but also it gives to mathematics new types of stimulating questions and challenging problems.

In classical mathematical physics, we know all about the physical laws guiding the development of the phenomenon we may be studying; we know all the rules of the game and we want to predict the outcome. That is physics. In control theory we do not know the rules, in fact we can change them within certain limitations as we proceed, but we do know exactly how we want the game to end. Thus the mathematical problems of control theory are inverse to the usual problems of mathematical physics. We assume only a limited knowledge of our differential equations, which are partially subject to our control, but we specify definitively the development of the system from the present state to the future goal.

Now let us turn to the engineering side of control theory. Control theory is concerned with doing whereas classical physics is primarily interested in understanding. Naturally, the best procedure would involve comprehension first and action second, but we cannot always hope for the best.

Often we must act to guide some process when we do not understand it thoroughly. For instance, we do not have a complete knowledge of the love-life of rats, yet we can hope to control the rat population effectively by a cleverly chosen system of traps and poisons. If one method is not effective, we can always try another until we meet success. You know the old saying, "If at first you don't succeed, try, try again - and then give up. There's no use in making a fool of yourself". But mathematical control theoreticians do not mind being fools provided their successive trials gradually converge to an effective functional control synthesis. The process of measuring just how far short of perfection each method falls, and then correcting the next method accordingly, is called the principle of feedback, a repulsive word for an important concept.

Scientists did not invent the feedback control process; it is a phenomenon of nature. Almost all biological processes involve self-regulation through some biochemical feedback controls. In physiology this concept is called homeostasis.

Using feedback control we can solve problems that we do not understand. This circumstance is what makes control theory such a powerful tool for technological problems in engineering and for economic and social problems in urban planning.

Suppose that an engineer wishes to supervise an industrial plant making glass, or paper or steel. In practice, this is the main application of control theory, in controlling the quality of the product of a manufacturing process.

Consider the process of making paper. Scientists know very little about the basic chemistry of wood pulp or fibre sludge, yet we demand that the final paper sheeting should roll out with a very precisely controlled thickness. So we monitor the thickness as the paper rolls out and, if there is an error, we can make successive corrections to the temperature of the pulp tank and the speed of the rollers, until the correct thickness is achieved. Since very rapid adjustments of the operating conditions are required, automatic measurements and computer regulation of the plant must replace inaccurate and awkward human beings. Thus automatic factories arise involving no labourers - but, instead, hordes of computer repairmen.
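
The successive-correction idea can be sketched as follows. The controller below never sees the plant's formula; it only measures the output and nudges one operating setting in proportion to the error. The plant function, the single scalar setting, the gain and the target are all hypothetical, and the sketch assumes one qualitative fact about the plant: the measured thickness increases with the setting.

```python
def regulate(measure_output, target, setting=0.0, gain=0.5, steps=200):
    """Feedback by successive correction: measure, compare, adjust, repeat."""
    for _ in range(steps):
        error = target - measure_output(setting)  # how far short of the goal
        setting += gain * error                   # correct the next trial
    return setting


# Hypothetical plant: thickness (in mm) produced at a given operating setting.
# The controller never inspects this formula; it only calls it to "measure".
def hidden_plant(setting):
    return 0.1 + 0.5 * setting + 0.05 * setting ** 2


final_setting = regulate(hidden_plant, target=1.0)
```

The gain matters: in this sketch the trials converge because each correction is smaller than the error it responds to. Too large a gain, or a wrong guess about the direction in which the plant responds, would make the successive trials diverge instead.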

How can universities teach this science of trial-and-error, of input-output comparison and of control action upon the pragmatic principle of feedback rather than upon the rational methods of analyses? In the first place, we usually have some general or qualitative ideas concerning the dynamics of the plant we plan to control. Thus the basic engineering and mathematical techniques that must be taught are found within the qualitative theory of dynamical systems.

Dynamical-system theory, or the study of changing physical quantities such as the positions of a system of vibrating masses, has been a central part of mathematical science since Newton. From the purely mathematical viewpoint, control theory falls under the domain of dynamical-system theory, with heavy overtones of statistics and probability and with some aspect of optimization, linear programming and the calculus of variations. Naturally, there is also a close relation to computer science since engineering control processes often involve computer machines as part of the feedback loop through which information and control commands flow.

Finally, I shall discuss some applications of control methods to socio-economic problems. Let us take first an example from economics where control methods are well accepted and are used as an effective practical procedure, namely in the Keynesian macro-economics of a nation state. In the Keynesian fiscal theory all data, significant for the economic description of a nation, are aggregated into a few categories. These economic levels are such things as gross national product, unemployment, investment of heavy industry, production of consumer goods, and their numerical measures define the current economic state of the nation. The government seeks to control the economic state of the nation by regulating the level of taxation and the prime interest rate on money. If the economy sags, then this information is fed back to the government who then adjusts the controls of taxation and interest to revive and improve the economic state.

From the economic viewpoint the Keynesian control theory seems to work, but there are political and moral decisions that must be resolved. What is a desirable economic state? If there is an economic squeeze who should suffer during the re-adjustment? Should the burden be placed on poor unemployed young people, or on old-age pensioners crushed by inflation? This political or social decision overrides the economic programme. Nevertheless, the engineering control theory approach to macro-economics, as outlined by Keynes, gives us a method of attacking national economic problems.

In the micro-economic theory of a business company or firm we encounter other control methods and concepts involving personnel organization, inventory, distribution and production. In fact, the interaction between manufacturing firms and the national government will become more and more important as we require pollution control to become a part of the production process.

Keynesian fiscal theory and the organizational theory of the firm are relative success stories for control methods within economic systems. But now we come to a tragic failure.

Lake Erie is one of the Great Lakes of the United States. It is also one of the great sinks of industrial and human wastes. A major governmental plan to clean up Lake Erie was initiated several years ago, but the plan failed. The technological difficulties were overcome with new types of industrial and sanitary purification plants, the economic problems were met with adequate million dollar budgets from national and state agencies, but the political problems were unsolvable. Lake Erie is surrounded by a variety of countries, districts, towns and cities each with a different political structure. Co-operation and compromise were necessary for the salvation of Lake Erie, but co-operation proved politically difficult. What is needed to clean up Lake Erie is a new commission with appropriate political connections and authority. Just how this commission should be constituted is not clear. This is a problem for control theoreticians of political science.

New towns are now being planned in England. For each town the technological structure (road, transport, sewage etc.) is carefully planned. The economic viability (industrial plants, railroads etc.) is considered. But is the political organization of the town and regional government analysed as an integral segment of the over-all system design problem? Such political design questions sometimes are faced, but in a rather unsystematic way.

In summary, the influence of control theory on significant social problems proceeds through several levels indicated as follows:

    POLITICS         Feedback synthesis with logical decisions
    ECONOMICS        Optimal control with cost criteria
    ENGINEERING      Control input-output design
    BASIC SCIENCE    Dynamical systems

On the most fundamental or bottom level of the four layers encountered in the control problem, we require basic science. This level tells us what is possible. For instance, we know the physical or biological or economic laws describing the dynamical systems we seek to control. The more we understand about the world the better we can control our environment.

The next level involves engineering theory and practice as a method of improving technological processes by controlling production. This level tells us what is feasible.

In the third level of activity the economics of production must be analysed to find the best or optimal control, or at least a satisfactory control. This level tells us what is efficient. But we must still decide on the goal of our control system. The optimal must be defined in terms of a cost criterion that rests on human judgements and political and moral principles. But more than this, the implementation of the economic-engineering-scientific control design depends on the decision-making capability of the commission (its logical powers of analysis) and on the political strength of its authority.

Each person has his own limitations and his own interests. No one can range over the totality of the topics in the four levels of problem solving described above. Each has his own narrow expertise. Yet each one can appreciate how his efforts fit into a wider pattern, a continuum of knowledge. This interdisciplinary approach may offer a means of effectively attaching the behavioural sciences to the natural sciences with the glue of control theory. We can only try and wait and hope.

II. MATHEMATICAL CONTROL THEORY AND FUNCTIONAL ANALYSIS

1. CONTROL OF ORDINARY DIFFERENTIAL SYSTEMS

The mathematical formulation of control theory, within the framework of dynamical systems, involves an inverse problem. In the standard theory of differential equations, we are given the coefficients and we seek to compute the solutions; in control theory the solutions are prescribed (or some aspect of their behaviour is prescribed) and we seek the corresponding restrictions on the coefficients. From this general viewpoint, control theory resembles a boundary value problem rather than an initial value problem.

The physical foundation for this inverse nature of control theory is that the descriptions of science according to Newton, Maxwell, and others, are posed in terms of differential equations. The state of the physical system (mechanical displacements and velocities, or electrodynamical field strengths, etc.) is computed upon integrating differential equations whose coefficients contain physical quantities such as force and charge. In other words, we must invert the differential operators to find the state produced by the controlling forces or electric charges.

A different approach is often encountered in engineering, biological,
economic, and social systems where the plant dynamics are unknown and no
equations of Newton or Maxwell are available. In this case, we take a
phenomenological approach and seek to investigate and control the system
after preliminary experimentation or calibration. That is, we put in
various control signals or forces, and then we measure or observe the
state or some function of the state that is put out. By comparing the input
with the output, we try to build up a model of the internal plant dynamics,
and to determine the appropriate control. This is the approach of
identification theory and adaptive control. When we further seek the best
control for some purpose, we encounter the theory of optimal control. It
will be a principal goal of this course to compare these two approaches to
control theory:

    dynamical systems and optimal control

    input-output analysis and adaptive control

The physical or engineering problem of control design is delimited only
by experience, tradition, practicability, and ingenuity. However, in
mathematical control theory we must clarify the class of admissible
controllers a priori. We can seek controllers only within certain selected
classes such as linear, or piecewise linear, which are often suggested by
engineering examples and motivations.

As remarked above, control theory, as interpreted within the framework
of dynamical systems or differential equations, leads to problems that are
the inverses of the classical mathematical investigations. The classical
theory of differential equations deals with analysis whereas control theory
deals with synthesis. In the classical approach to dynamical systems we
are given the differential equations of motion, and then we try to analyse the
behaviour of the resulting motions or solutions. In control theory we
prescribe the desired behaviour of the solutions, and then we try to
synthesize the differential equations to yield these motions. Of course,
the procedure of synthesis means, in mathematical terms, that the basic
form of the underlying differential equations can be modified by adjustment
of certain control parameters or functional coefficients which are selected
from certain admissible classes; whereas the synthesis means, in engineering
terms, that the primary machine or plant can be modified by the adjustment
of gains in feedback loops or the insertion of auxiliary devices of certain
practical types.

Hence, to each part of the classical theory of differential equations, say
stability or oscillation theory, there corresponds a field of control theory
with inverse problems.

For instance, consider the classical stability analysis of the damped
linear oscillator

$$\ddot{x} + 2b\dot{x} + k^2 x = 0$$

with constant coefficients. This oscillator is asymptotically stable (in the
sense that all solutions approach $x = \dot{x} = 0$ as $t \to +\infty$) if and
only if $b > 0$ and $k^2 > 0$. As an inverse problem, assume $k > 0$ fixed and
try to choose $b > 0$ so that the solutions are damped at the maximal rate.
That is, define the cost or efficiency of the control parameter $b$ to be

$$C(b) = \max\{\operatorname{Re}\lambda_1, \operatorname{Re}\lambda_2\}$$

where $\lambda_1, \lambda_2$ are the eigenvalues satisfying
$\lambda^2 + 2b\lambda + k^2 = 0$.

Then we seek to select $b$ to minimize $C(b)$.

An easy calculation shows that the optimal control $b^*$ minimizing $C(b)$ is
$b^* = k$. It is interesting to note that this is the standard value for
critical damping, and hence we see that this familiar physical adjustment is
explained as an elementary result in control theory.
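The "easy calculation" can be checked numerically. The following is a minimal sketch, not part of the original lecture: the function names `damping_cost` and `best_damping`, the value k = 2 and the search grid are all assumptions chosen for illustration. It evaluates C(b) from the roots of the characteristic polynomial and searches a grid of candidate values of b.

```python
import cmath


def damping_cost(b, k):
    """C(b): largest real part among the roots of lam^2 + 2*b*lam + k^2 = 0."""
    root = cmath.sqrt(b * b - k * k)  # purely imaginary when b < k
    return max((-b + root).real, (-b - root).real)


def best_damping(k, candidates):
    """Return the candidate b with the smallest cost C(b)."""
    return min(candidates, key=lambda b: damping_cost(b, k))


# Illustrative values (assumed): k = 2, b searched on a grid of step 0.01.
k = 2.0
candidates = [i / 100 for i in range(1, 501)]  # b = 0.01, 0.02, ..., 5.00
b_star = best_damping(k, candidates)
print(b_star, damping_cost(b_star, k))
```

For b < k the roots are complex with real part -b, so the cost decreases as b grows; for b > k one real root moves back toward zero, so the cost increases again. The grid search should therefore settle on b = k, in agreement with the critical-damping value.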

As another illustration consider the forced oscillator

$$\ddot{x} + 2b\dot{x} + k^2 x = \sin \omega t$$

for constants $b > 0$, $k > 0$, $\omega > 0$. Classical analysis shows that
there is a unique periodic solution

$$x = A \sin(\omega t + \phi)$$

with amplitude

$$A = [(k^2 - \omega^2)^2 + 4b^2\omega^2]^{-1/2}$$

For the inverse problem, fix $k > \sqrt{2}$ and $b > 0$, and try to choose the
frequency $\omega > 0$ of the control input $\sin \omega t$ so as to maximize
the amplitude $A(\omega)$ of the response output. An easy calculation shows
that the optimal control $\omega^*$ maximizing
$A(\omega) = [(k^2 - \omega^2)^2 + 4b^2\omega^2]^{-1/2}$ is
$\omega^* = (k^2 - 2b^2)^{1/2}$, which is assumed positive. Again we find this
value familiar since $\omega^*$ is the resonating frequency, and hence this
basic engineering tuning is explained as an elementary result in control
theory.
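The resonance formula can be checked in the same spirit. This is a sketch under assumed values (b = 1, k = 3, so that k² - 2b² > 0); the names `amplitude` and `resonant_frequency` are illustrative and not from the text. It compares the closed-form ω* with a brute-force maximization of A(ω) on a grid.

```python
import math


def amplitude(w, b, k):
    """A(w) = [(k^2 - w^2)^2 + 4 b^2 w^2]^(-1/2)."""
    return ((k * k - w * w) ** 2 + 4 * b * b * w * w) ** -0.5


def resonant_frequency(b, k):
    """Closed-form maximizer w* = sqrt(k^2 - 2 b^2); requires k^2 > 2 b^2."""
    return math.sqrt(k * k - 2 * b * b)


# Illustrative values (assumed): b = 1, k = 3.
b, k = 1.0, 3.0
grid = [i / 1000 for i in range(1, 6001)]  # w = 0.001, ..., 6.000
w_grid = max(grid, key=lambda w: amplitude(w, b, k))
w_star = resonant_frequency(b, k)
print(w_grid, w_star)
```

The closed form follows from setting the derivative of (k² - ω²)² + 4b²ω² with respect to ω to zero, which gives ω² = k² - 2b².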

These control-theoretic results are interesting in that they illuminate
well-known physical and engineering practice. Yet, they are not typical
of the modern theory of control. In the next section, we comment on a
standard formulation of modern control theory, as interpreted within the
framework of ordinary differential equations. Later we examine other
approaches through partial differential equations and other functional
equations.

In control theory we consider a process or plant or dynamical system
described by a differential system,

$$\dot{x} = f(x, u)$$

where $x$ is the real state n-vector at time $t$, and the coefficient $f$ is an
n-vector function of the present state $x$ and the control m-vector $u$. For
simplicity we assume the process is autonomous (time-independent) and
that $f$ is continuous with continuous first derivatives for all $x \in R^n$ and
$u \in R^m$, that is

$$f: R^n \times R^m \to R^n$$

is in class $C^1$, so that there exists a unique response $x(t)$ from an
arbitrary initial state $x_0$ for each controller $u(t)$ on $0 \le t \le T$.

We might seek to control $x(t)$ between given initial and final states in
some fixed duration $0 \le t \le T$,

$$x(0) = x_0, \qquad x(T) = x_1$$

by choosing a control function $u(t)$ from some admissible function class (say
$u \in L_\infty[0, T]$, that is, $u(t)$ is a bounded measurable function on
$0 \le t \le T$). Hence $x(t)$ is a solution of the two-point boundary value
problem, with separated end conditions,

$$\dot{x} = f(x, u(t)), \qquad x(0) = x_0, \qquad x(T) = x_1$$

This constitutes the basic problem of controllability in control theory.

Among all solutions $x(t)$ to this boundary value problem, that is for
all admissible control functions, we might try to select and describe the
optimal solution $x^*(t)$ for the optimal controller $u^*(t)$ which minimizes
some given cost or performance functional $C(u)$. This leads to the central
problem of optimal control theory, for which there is a vast literature.

Next let us examine these general concepts for more specific problems
and particular examples.

Example 1. Consider a linear vibrator, say, an airplane wing whose
vertical vibrations are to be controlled by aerodynamical wing-tabs (Fig. 1).
We select suitable units ($m = k = 1$) so the free-dynamics system is
$\ddot{x} + x = 0$ and the controlled system is $\ddot{x} + x = u$. This is
equivalent to the differential system

$$\dot{x} = y$$
$$\dot{y} = -x + u$$

where $x$ is the displacement, $y$ is the velocity, and $u$ is the control force.
Here the state $\begin{pmatrix} x \\ y \end{pmatrix}$ is a vector in the real
number plane $R^2$, and the control $u$ is a real scalar in $R^1 = R$ at each
time $t \in R$. In vector notation, we write $x = x^1$, $y = x^2$,

$$\dot{x}^1 = x^2 = f^1(x^1, x^2, u)$$
$$\dot{x}^2 = -x^1 + u = f^2(x^1, x^2, u)$$

and the matrix notation for this linear system is

$$\begin{pmatrix} \dot{x} \\ \dot{y} \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} 0 \\ 1 \end{pmatrix} u$$

FIG. 1. Airplane wings as linear vibrator.

For each initial state $\begin{pmatrix} x_0 \\ y_0 \end{pmatrix}$ in $R^2$ at
time $t = 0$, and each controller $u(t)$ on $t \ge 0$, there is a unique solution
or response $x(t, x_0, y_0)$, $y(t, x_0, y_0)$. The choice of $u$ as a function
of $t$, i.e. a pre-programmed control strategy, corresponds to open-loop
control.

A more important physical approach, although more difficult mathematically,
is to use closed-loop control where $u(x, y)$ depends only on the state at
each moment. In this case, the control depends on the feedback loop
carrying the current state back to influence the future evolution of the
differential system, and the physical system is self-correcting in case
unforeseen disturbances arise from time to time.

[Block diagrams. Open-loop control: control input $u(t)$, plant or process
$\ddot{x} + x = u$, response output $x$. Closed-loop control: the response
output is fed back through the control $u(x, \dot{x})$ to the plant input.]

FIG. 2. Closed-loop and open-loop control.

Let us examine the simple regulator problem of controlling an initial
state $(x_0, y_0) \in R^2$ to the origin $(0, 0)$. Using a closed-loop feedback
control, say $u = -y$, we can make the origin asymptotically stable for

$$\ddot{x} + x = -\dot{x}$$

or

$$\dot{x} = y$$
$$\dot{y} = -x - y$$

That is, for each solution $x(t)$, $y(t) = \dot{x}(t)$ the curve $x(t), y(t)$
approaches the origin in $R^2$ as $t \to +\infty$. Yet the origin is not attained
in any finite time. Moreover, using pure feedback control $u(x, y)$, say
$u(0, 0) = 0$ and $u(x, y) \in C^1$, it is impossible to steer $(x_0, y_0)$ to the
origin in a finite time (since the origin is then an equilibrium or critical
point for the autonomous differential system $\ddot{x} + x = u(x, \dot{x})$).
A more difficult mathematical analysis provides a discontinuous feedback
controller that regulates all solutions to the origin in finite duration, but
we defer this discussion until later in the course (Fig. 2).

FIG. 3. Phase-plane system with u = 0, u = +1 and u = -1.

Now, try to regulate the linear oscillator by an open-loop controller $u(t)$,
which can be discontinuous if necessary. If we sketch the phase-plane
system with $u = 0$, we find a family of periodic orbits encircling the origin.
For $u = +1$ or $u = -1$ we find similar portraits but merely displaced one unit
toward the right or left, respectively (Fig. 3).

With a simple scheme of switching between the values $u = +1$ and
$u = -1$ each initial point can be steered to the origin in finite time.
A more involved study shows that each $(x_0, y_0)$ in $R^2$ can be steered to
any required final state $(x_1, y_1)$ in $R^2$. In this sense the system is
controllable in the plane.
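The switching idea can be made concrete for one particular initial state. The sketch below is illustrative only: the names `flow` and `steer`, the initial state (4, 0) and the switching schedule are assumptions chosen by hand, not a general synthesis. It uses the fact that, for constant u, the solutions of ẋ = y, ẏ = -x + u rotate clockwise at unit angular speed about the displaced centre (u, 0).

```python
import math


def flow(state, u, t):
    """Exact solution of x' = y, y' = -x + u after time t, for constant u."""
    x, y = state
    c, s = math.cos(t), math.sin(t)
    # Clockwise rotation about the centre (u, 0) of the displaced portrait.
    return (u + (x - u) * c + y * s, -(x - u) * s + y * c)


def steer(state, schedule):
    """Apply a piecewise-constant open-loop control given as (u, duration) pairs."""
    for u, duration in schedule:
        state = flow(state, u, duration)
    return state


# Hand-picked schedule (assumed): u = +1 for time pi, then u = -1 for time pi.
final = steer((4.0, 0.0), [(+1.0, math.pi), (-1.0, math.pi)])
print(final)
```

With u = +1 the point (4, 0) travels half a circle of radius 3 about (1, 0) and arrives at (-2, 0); with u = -1 it then travels half a circle of radius 1 about (-1, 0) and arrives at the origin after total time 2π.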

Example 2. Consider the double integrator describing a free accelerator
according to the linear control system

$$\ddot{x} = u \quad\text{or}\quad \begin{aligned}\dot{x} &= y\\ \dot{y} &= u\end{aligned} \quad\text{or}\quad \begin{pmatrix} \dot{x} \\ \dot{y} \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} 0 \\ 1 \end{pmatrix} u$$

If $u \equiv 0$, then the velocity $y$ is constant and the displacement $x$
increases linearly with time $t$. It is easy to verify that this system is also
controllable in the $(x, y)$-plane and that each initial state $(x_0, y_0)$ can be
regulated to the origin in finite time by some open-loop controller.

Now, consider the control $u$ applied directly to the velocity rather than
the acceleration according to the system (Fig. 4):

$$\dot{x} = y + u$$
$$\dot{y} = 0$$

This system is not too easily interpreted in physical terms but it sometimes
is used to describe the motion of a particle along a moving platform, where
$y$ is the velocity relative to the platform and $u$ is the velocity of the
platform along the ground.

FIG. 4. Control applied to velocity.

It is easy to see that this system is not controllable since $y \equiv y_0$
regardless of the control $u$. However, suppose that only the displacement $x$
is to be controlled, and the velocity $y$ ignored. That is, there are
observations $z$

$$z = x$$

Then the observed output $z$ can be controlled by $u$.

Once again let us look at these control problems from a more general
viewpoint. Consider the control system

$$\dot{x}^i = f^i(t, x^1, \ldots, x^n, u^1, \ldots, u^m), \qquad i = 1, \ldots, n$$

or, in vector language, $\dot{x} = f(t, x, u)$. We often concentrate on the
autonomous case where $f$ does not depend explicitly on $t$ so

$$\dot{x} = f(x, u)$$

where the smooth (highly differentiable) coefficients $f(x, u)$ define the basic
dynamics of a physical process or plant. The state of the process is the
real n-vector $x$, and the control is the real m-vector $u$, at each time $t$.

Consider the regulator problem of choosing a control $u(x, t)$ so that
every solution of

$$\dot{x} = f(x, u(x, t))$$

tends to the origin, $x(t) \to 0$ as $t \to +\infty$. If $u = u(t)$, the control
is called open-loop and corresponds to a pre-programmed strategy, which usually
depends on knowledge of the initial state. If $u = u(x)$ this is closed-loop
feedback control and is sometimes called a synthesis for a stabilizing
control.

In case the control system is linear, we write

$$\dot{x} = Ax + Bu$$

according to

$$\dot{x} = f(x, u)$$

for $x \in R^n$, $u \in R^m$, and the real coefficient matrices $A(t)$ and $B(t)$
are known. Let us discuss here the autonomous linear control problem where $A$
and $B$ are real constant matrices. The feedback regulator problem consists in
finding some real constant matrix $K$ (often called the feedback gain) such
that $u = Kx$ stabilizes the linear system, i.e.

$$\dot{x} = Ax + BKx = (A + BK)x$$

has all solutions $x(t) \to 0$ in $R^n$ as $t \to +\infty$. It is well known that
a necessary and sufficient condition for such stability is that all eigenvalues
$\lambda$ of $(A + BK)$ have negative real parts (lie in the left half of the
complex number plane). Later we shall find conditions on the matrix pair
$(A, B)$ that guarantee the existence of some stabilizing gain matrix $K$.
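The eigenvalue criterion is easy to test in the planar case. The sketch below is restricted by assumption to a 2×2 matrix A, a single control (B a column with two entries) and a row gain K; the helper names are hypothetical. It checks the gain K = (0, -1), which corresponds to the feedback u = -y used for the linear vibrator of Example 1.

```python
import cmath


def closed_loop(A, B, K):
    """A + B K for a 2x2 matrix A, column B = [b1, b2] and row K = [k1, k2]."""
    return [[A[i][j] + B[i] * K[j] for j in range(2)] for i in range(2)]


def eigenvalues_2x2(M):
    """Roots of lam^2 - trace(M) lam + det(M) = 0."""
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return ((tr + root) / 2, (tr - root) / 2)


def is_stabilizing(A, B, K):
    """True when every eigenvalue of A + B K has negative real part."""
    return all(lam.real < 0 for lam in eigenvalues_2x2(closed_loop(A, B, K)))


# Linear vibrator of Example 1 with feedback u = -y, i.e. K = (0, -1).
A = [[0.0, 1.0], [-1.0, 0.0]]
B = [0.0, 1.0]
print(is_stabilizing(A, B, [0.0, -1.0]))  # damping feedback
print(is_stabilizing(A, B, [0.0, 0.0]))   # no feedback: eigenvalues +i, -i
```

Without feedback the eigenvalues lie on the imaginary axis, so the periodic orbits persist and the origin is not asymptotically stable; the velocity feedback moves both eigenvalues into the left half-plane.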

Suppose the linear autonomous control system

$$\dot{x} = Ax + Bu$$

is accompanied by a given observation relation

$$z = Cx$$

for the observed output $z$ specified by the given real constant matrix $C$.
Let us examine this total system from the input-output analysis of
experimental investigations.

For each input $u(t)$, in some appropriate set of control functions, we
generate a state response $x(t)$ and then an observed output $z(t) = Cx(t)$.
The input-output relation or transfer function is described by

$$u(t) \to z(t)$$

To be more explicit, let us restrict $u(t)$ to periodic functions, say complex
exponentials, with amplitude $U(\omega)$ for frequency $\omega$

$$u = U e^{i\omega t}$$

and take the corresponding steady-state periodic response

$$x = X e^{i\omega t}, \qquad z = Z e^{i\omega t}$$

Here the differential system becomes

$$(i\omega) X = AX + BU$$

or the complex response amplitude, for $U = 1$, is

$$X = (i\omega I - A)^{-1} B, \qquad Z = C (i\omega I - A)^{-1} B.$$

This is the amplitude-frequency response relation.

If we allow the frequency to be any complex number $p$, then

$$X = (pI - A)^{-1} B, \qquad Z = C (pI - A)^{-1} B$$

which is the transfer function (or matrix). Note that

$$T(p) = C (pI - A)^{-1} B$$

is a rational function with poles at the eigenvalues of $A$. The basic problem
of system identification concerns the computation of the matrices $A$, $B$, $C$
from the transfer function $T(p)$. In particular, if for all complex $p$

$$C_1 (pI - A_1)^{-1} B_1 = C (pI - A)^{-1} B$$

when can we conclude that $A = A_1$, $B = B_1$, $C = C_1$?
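A small computation illustrates both the formula for T(p) and why the identification question is not trivial. The sketch below assumes a 2×2 matrix A, a single input and a single output; the function name `transfer_function` and the numerical example are illustrative. It evaluates T(p) for the linear vibrator of Example 1 observed through z = x, and then for a second triple (A, 2B, C/2) obtained by rescaling the state.

```python
def transfer_function(A, B, C, p):
    """T(p) = C (pI - A)^{-1} B for 2x2 A, column B and row C (scalar input/output)."""
    m00, m01 = p - A[0][0], -A[0][1]
    m10, m11 = -A[1][0], p - A[1][1]
    det = m00 * m11 - m01 * m10
    # Solve (pI - A) X = B with the adjugate formula for a 2x2 matrix.
    x0 = (m11 * B[0] - m01 * B[1]) / det
    x1 = (-m10 * B[0] + m00 * B[1]) / det
    return C[0] * x0 + C[1] * x1


# Vibrator x'' + x = u observed through z = x: expected T(p) = 1 / (p^2 + 1).
A = [[0.0, 1.0], [-1.0, 0.0]]
B = [0.0, 1.0]
C = [1.0, 0.0]
p = 2.0 + 1.0j
print(transfer_function(A, B, C, p), 1 / (p * p + 1))

# A rescaled state (twice the original) gives a different triple, same T(p).
B_scaled = [0.0, 2.0]
C_scaled = [0.5, 0.0]
print(transfer_function(A, B_scaled, C_scaled, p))
```

Since the two triples differ but produce the same transfer function, the matrices can at best be recovered from T(p) up to such changes of state coordinates.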

II. FUNCTIONAL ANALYSIS APPLIED TO CONTROL THEORY

Functional analysis is a method of treating analytical problems in
differential and integral equations by means of geometric methods. While
the same calculations might be performed without using the language of
functional analysis, the geometric concepts prove very useful in suggesting
various possible approaches to the problems.

For example, consider the collection of all real-valued continuous
functions of a real variable on the unit interval, say $f(t)$ on $0 \le t \le 1$.
That is, we consider the space $C[0, 1]$ of all continuous maps of $[0, 1]$ into
$R$. By classical analysis, $C[0, 1]$ is a real vector space. If we define the
distance between two members $f$ and $g$ of the space $C[0, 1]$ by

$$\|f - g\| = \max_{0 \le t \le 1} |f(t) - g(t)|$$

then $C[0, 1]$ becomes a complete metric space, in fact, a Banach space
with the norm $\|f\| = \|f - 0\|$. The Weierstrass approximation theorem
asserts that the collection $P$ of real polynomials on $0 \le t \le 1$ is dense in
$C[0, 1]$. Thus the language of functional analysis permits us to interpret
this classical uniform approximation result as the assertion that $P$ is a
dense linear subspace of $C[0, 1]$. Note that this geometric approach is
convenient but must be followed with care, since in $R^n$ a dense linear
subspace must coincide with all of $R^n$.

Functional analysis is particularly useful in control theory since we
must often deal with a space $\mathcal{U}$ of control functions $u(t)$ and a
space $\mathcal{X}$ of response functions $x(t)$. The transfer function is then
some map

$$\mathcal{T}: u(\cdot) \to x(\cdot)$$

which can only be studied precisely when the detailed properties of the
spaces $\mathcal{U}$ and $\mathcal{X}$ are specified. For instance, since
discontinuous controllers are often employed the space $\mathcal{U}$ might be
$L_1[0, 1]$ (integrable functions on $0 \le t \le 1$) or perhaps some other class.

Let us briefly consider the famous linear-quadratic problem of optimal
control theory in order to appreciate the connections with functional analysis.
Consider the linear autonomous control system

$$\dot{x} = Ax + Bu$$

for the state $x \in R^n$ and control $u \in R^m$ at each time $t$. We seek an
open-loop controller $u(t)$ on $0 \le t \le 1$ to steer an initial state
$x(0) = x_0$ to the origin $x(1) = 0$. Moreover, we judge the performance of
each such controller, steering $x_0$ to $0$, by a cost functional

$$C(u) = \int_0^1 |u(t)|^2 \, dt$$

which depends on a quadratic integrand $|u|^2 = \sum_{j=1}^{m} |u^j|^2$.
The optimal controller $u^*(t)$ is required to minimize this cost among all
competing controllers.

Here, $u(t)$ ranges over some subset $S$ of the linear space $L_2[0, 1]$ of
all m-vector functions for which $\int_0^1 |u(t)|^2 \, dt < \infty$. For each
$u(t)$ in $S$ we have the real-valued cost $C(u)$, and it seems reasonable to
expect the optimal controller $u^*(t)$ to satisfy the necessary condition

$$\frac{dC}{du} = 0 \quad \text{at } u = u^*(t)$$

But how is this functional derivative to be defined or computed? We shall
answer these questions later in the paper.

The response $x(t)$ to any controller in $S$ is

$$x(t) = e^{At} x_0 + e^{At} \int_0^t e^{-As} B u(s) \, ds$$

and the end condition $x(1) = 0$ requires

$$0 = e^{A} x_0 + e^{A} \int_0^1 e^{-As} B u(s) \, ds$$

If we define the linear operator $L$ carrying $L_2[0, 1]$ to $R^n$,

$$Lu = \int_0^1 e^{-As} B u(s) \, ds$$

then, since $e^{A}$ is invertible, our problem is to solve the linear system

$$Lu = -x_0$$

while minimizing the norm $\|u\| = \left[ \int_0^1 |u(t)|^2 \, dt \right]^{1/2}$.

Since the set of all possible solutions $u \in L_2[0, 1]$ of the linear system
$Lu = -x_0$ is some linear sub-manifold $S$ (if not empty), we must find the
point $u^* \in S$ nearest to the origin of $L_2[0, 1]$. That is, $u^*$ is the
point at which $S$ is tangent to the sphere of radius $\|u^*\|$ about the
origin. This geometric approach will prove useful in solving this important
problem later in the course.
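The nearest-point picture can be tried out on the double integrator of Example 2. The sketch below does not reproduce the derivation deferred to later in the course: it assumes the standard minimum-norm formula u*(s) = -Bᵀ exp(-Aᵀs) W⁻¹ x₀ with W = ∫₀¹ exp(-As) B Bᵀ exp(-Aᵀs) ds, which for the double integrator works out by hand to the affine control u*(s) = (12x₀ + 6y₀)s - (6x₀ + 4y₀). The function names and the test state are illustrative.

```python
def min_norm_control(x0, y0):
    """Assumed minimum-norm steering control for x' = y, y' = u on [0, 1]."""
    a = 12.0 * x0 + 6.0 * y0
    c = 6.0 * x0 + 4.0 * y0
    return lambda s: a * s - c


def simulate(x0, y0, u, steps=1000):
    """RK4 integration of x' = y, y' = u(t) from t = 0 to t = 1."""
    h = 1.0 / steps
    x, y, t = x0, y0, 0.0
    for _ in range(steps):
        k1x, k1y = y, u(t)
        k2x, k2y = y + 0.5 * h * k1y, u(t + 0.5 * h)
        k3x, k3y = y + 0.5 * h * k2y, u(t + 0.5 * h)
        k4x, k4y = y + h * k3y, u(t + h)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        t += h
    return x, y


def cost(u, steps=2000):
    """Midpoint-rule approximation of C(u), the integral of u(t)^2 over [0, 1]."""
    h = 1.0 / steps
    return sum(u((i + 0.5) * h) ** 2 for i in range(steps)) * h


u_star = min_norm_control(1.0, -2.0)

# v lies in the null space of L: its integrals against 1 and s both vanish,
# so u_star + v also steers to the origin, but from farther out in L2.
def u_other(s):
    return u_star(s) + (6 * s * s - 6 * s + 1)

print(simulate(1.0, -2.0, u_star), cost(u_star))
print(simulate(1.0, -2.0, u_other), cost(u_other))
```

Both controls lie on the sub-manifold S of steering controls, and the second should cost more, since it differs from u* by a direction lying along S.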

III. SYSTEM THEORY CONCEPTS

The basic stages in the construction and application of any theory of
control systems are:

(i) Modelling

(ii) Identification

(iii) Controllability and Stability Analysis

(iv) Optimization for Optimal Control

(v) Implementation.

In the first stage (i) we model the physical process by a mathematical
dynamical or control system and in the last stage (v) we reinterpret our
mathematical results in terms of engineering hardware. Thus these opening
and closing stages are not strictly within the domain of mathematics and we
concentrate on the middle three stages, identification, control and stability
analysis and optimization, in this course.

(i) Naturally the physically significant variables and relations affect the
choice of the mathematical model and all the further control analysis. For
instance, should a certain process be described by a linear autonomous
control system in $R^n$

$$\dot{x} = Ax + Bu, \qquad 0 \le t < \infty$$

or, perhaps, would a discrete-time difference scheme

$$x_{t+1} = A x_t + B u_t, \qquad t = 0, 1, 2, \ldots$$

be more appropriate?

In the modelling process the structure of the particular type of
dynamical system is selected, say, ordinary or partial differential system,
linear or non-linear, deterministic or stochastic. For example, the
vibrations of a string can be studied by the wave partial differential equation

$$\frac{\partial^2 w}{\partial t^2} = \frac{\partial^2 w}{\partial x^2}$$

where $w(x, t)$ is the displacement of the string at position $0 \le x \le 1$ at
time $t \ge 0$. Suppose one end is fixed, $w(0, t) = 0$, and the initial state is
given

$$w(x, 0) = w_0(x), \qquad \frac{\partial w}{\partial t}(x, 0) = v_0(x) \quad \text{on } 0 \le x \le 1.$$

Then we can apply the control $u(t)$ at the endpoint $x = 1$ by, say,

$$w(1, t) = u(t)$$

and seek to bring the vibrating string to the rest state $w(x, T) = 0$,
$(\partial w / \partial t)(x, T) = 0$ at a later time $T > 0$. This is a form of
boundary control, as distinct from distributed control where we modify the
partial differential equation according to

$$\frac{\partial^2 w}{\partial t^2} = \frac{\partial^2 w}{\partial x^2} + u(x, t)$$

Another approach would be to consider the string as a chain with a finite
number of rigid links, and to treat the dynamics as a high-order system of
ordinary differential equations. The choice of the mathematical model
depends on the physical system, and on some decision as to what are the
significant features and the negligible aspects.
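The choice between the continuous-time system ẋ = Ax + Bu and the difference scheme x_{t+1} = A x_t + B u_t mentioned in stage (i) can be explored with a forward-Euler discretization. This is only one of several possible discretizations and is an assumption of the sketch, as are the helper names, the scalar control and the step size h.

```python
def euler_discretize(A, B, h):
    """Forward-Euler difference scheme: A_d = I + h A, B_d = h B (scalar control)."""
    n = len(A)
    Ad = [[(1.0 if i == j else 0.0) + h * A[i][j] for j in range(n)] for i in range(n)]
    Bd = [h * B[i] for i in range(n)]
    return Ad, Bd


def step(Ad, Bd, x, u):
    """One step of x_{t+1} = A_d x_t + B_d u_t."""
    n = len(x)
    return [sum(Ad[i][j] * x[j] for j in range(n)) + Bd[i] * u for i in range(n)]


# Double integrator with constant control u = 1 on 0 <= t <= 1.
A = [[0.0, 1.0], [0.0, 0.0]]
B = [0.0, 1.0]
h = 0.001
Ad, Bd = euler_discretize(A, B, h)
x = [0.0, 0.0]
for _ in range(1000):
    x = step(Ad, Bd, x, 1.0)
print(x)  # the continuous-time solution at t = 1 is (0.5, 1.0)
```

The difference scheme tracks the continuous model with an error of order h here; whether that error is negligible is exactly the kind of modelling decision described above.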

In another type of problem, say, the population dynamics of a community,
we might decide that the population $P(t)$ grows according to the law

$$\dot{P}(t) = a P(t), \qquad P(0) = P_0$$

where $a > 0$ is the net birth-rate. A control might be introduced by an
emigration rate $u(t)$ so

$$\dot{P} = aP - u(t)$$

However, let us suppose that the birth-rate is not constant, but depends on
the size of the previous generation, $a = \alpha - \beta P(t-1)$, for constants
$\alpha > 0$, $\beta > 0$. Then we have a differential-delay equation

$$\dot{P}(t) = [\alpha - \beta P(t-1)] P(t) - u(t)$$

A still more complicated assumption on the birth-rate leads to other
differential-functional equations.
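The differential-delay model can be simulated once a history of P on the interval -1 ≤ t ≤ 0 is specified. The sketch below makes several assumptions that are not in the text: a constant initial history, a forward-Euler step, and the illustrative parameter values α = β = 1 with no emigration. The function name is hypothetical.

```python
def simulate_delay_population(alpha, beta, p0, u=lambda t: 0.0,
                              t_end=10.0, h=0.01, delay=1.0):
    """Forward-Euler integration of P'(t) = [alpha - beta P(t - delay)] P(t) - u(t).

    The history on [-delay, 0] is assumed constant and equal to p0.
    """
    lag = round(delay / h)
    history = [p0] * (lag + 1)  # samples of P on [-delay, 0]
    steps = round(t_end / h)
    for n in range(steps):
        t = n * h
        p_now = history[-1]         # P(t)
        p_lag = history[-1 - lag]   # P(t - delay)
        history.append(p_now + h * ((alpha - beta * p_lag) * p_now - u(t)))
    return history[-1]


# Illustrative run (assumed values): alpha = beta = 1, no emigration.
print(simulate_delay_population(1.0, 1.0, 0.5, t_end=60.0))
```

The state that must be stored at each step is the whole segment of the past unit of time, not a single number. With β = 0 and u ≡ 0 the scheme reduces to the pure growth law Ṗ = aP.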

Let us note that partial differential equations, such as the wave equation
or heat equation, and differential-delay or differential-functional equations
can often be analysed as dynamical systems in appropriate infinite-dimensional
spaces. For example, the initial state $W_0 = (w_0(x), v_0(x))$ of the wave
equation

$$\frac{\partial^2 w}{\partial t^2} = \frac{\partial^2 w}{\partial x^2}$$

becomes after time $t > 0$ the state
$W_t = (w(x, t), (\partial w / \partial t)(x, t))$, a pair of real-valued
functions on $0 \le x \le 1$.

Hence, the map $T_t$ of the state space is defined by

$$T_t : W_0 \to W_t$$

and the basic flow condition is satisfied

$$T_{t_2} \circ T_{t_1} = T_{t_2 + t_1} \quad \text{for } t_1 \ge 0, \; t_2 \ge 0$$

For the differential-delay equation in $R$

$$\dot{x}(t) = f(x(t), x(t-1))$$

the initial state is a given continuous function $x_0(\theta)$ on
$-1 \le \theta \le 0$. Then the future solution $x(t)$ is defined for $t \ge 0$
and the state $x_t$ at time $t > 0$ is the past unit-time segment of $x(t)$, that
is, $x_t(\theta) = x(t + \theta)$ on $-1 \le \theta \le 0$. Hence the solution
$x(t)$ in $R$ can be defined by a flow $T_t$ from $x_0(\theta)$ to $x_t(\theta)$,
say elements of the function space $C[-1, 0]$. That is, consider

$$T_t : x_0(\cdot) \to x_t(\cdot)$$

again satisfying the basic flow condition for future times $t \ge 0$.

Of course, the precise specification of the function space of states and
the appropriate metric or topology is important in any serious study of these
flows.

(ii) From input-output experiments (perhaps analysed statistically) we
identify the control system, within some accepted class of systems. For
instance, from a rational transfer function matrix $T(p)$ can we compute the
matrices $(A, B, C)$ such that

$$T(p) = C (pI - A)^{-1} B \; ?$$

Note that two different systems can yield the same transfer function $T(p)$,
one with a scalar state and the other with a state space of higher dimension.
Yet, the first is a minimal model, for a state space $x \in R$, and seems
preferable. It turns out that the minimal model, for a given transfer function,
is characterized by the properties of being controllable and observable.
Roughly speaking, the system is controllable in case the origin can be
controlled to every point of the state space. Also it is observable in case
no two different state trajectories $x_1(t) \neq x_2(t)$, for fixed control, can
yield the same observation $z(t)$.
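To illustrate how a non-minimal model can hide behind a transfer function, the sketch below compares a scalar system with a planar one. Both systems are hypothetical examples chosen for this illustration and are not the pair displayed in the original lecture. The rank tests used for controllability and observability are the standard Kalman criteria, which the text has not yet stated, so they are an assumption here.

```python
def transfer_1d(a, b, c, p):
    """T(p) = c (p - a)^{-1} b for the scalar system x' = a x + b u, z = c x."""
    return c * b / (p - a)


def transfer_2d(A, B, C, p):
    """T(p) = C (pI - A)^{-1} B for 2x2 A, column B and row C."""
    m00, m01 = p - A[0][0], -A[0][1]
    m10, m11 = -A[1][0], p - A[1][1]
    det = m00 * m11 - m01 * m10
    x0 = (m11 * B[0] - m01 * B[1]) / det
    x1 = (-m10 * B[0] + m00 * B[1]) / det
    return C[0] * x0 + C[1] * x1


def is_controllable_2d(A, B):
    """Kalman rank test (assumed): the matrix [B, AB] has non-zero determinant."""
    AB = [A[0][0] * B[0] + A[0][1] * B[1], A[1][0] * B[0] + A[1][1] * B[1]]
    return B[0] * AB[1] - AB[0] * B[1] != 0


def is_observable_2d(A, C):
    """Kalman rank test (assumed): the matrix [C; CA] has non-zero determinant."""
    CA = [C[0] * A[0][0] + C[1] * A[1][0], C[0] * A[0][1] + C[1] * A[1][1]]
    return C[0] * CA[1] - C[1] * CA[0] != 0


# Hypothetical pair: x' = u, z = x   versus   x1' = u, x2' = -x2, z = x1 + x2.
A2 = [[0.0, 0.0], [0.0, -1.0]]
B2 = [1.0, 0.0]
C2 = [1.0, 1.0]
p = 2.0 + 1.0j
print(transfer_1d(0.0, 1.0, 1.0, p), transfer_2d(A2, B2, C2, p))
print(is_controllable_2d(A2, B2), is_observable_2d(A2, C2))
```

Both systems give T(p) = 1/p, but the planar one carries a state component that the control cannot influence, so it fails the controllability test and is not minimal.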

(iii) and (iv). We omit any further discussion of these topics here since
they have been illustrated by some earlier examples, and they will form the
central part of the development of the course.

IV. GENERAL THEORY OF ORDINARY DIFFERENTIAL EQUATIONS,
SUMMARY OF RESULTS AND NOTATIONS

We shall primarily be interested in first-order ordinary differential
systems

$$\frac{dx^i}{dt} = f^i(t, x^1, x^2, \ldots, x^n), \qquad i = 1, \ldots, n$$

or, in vector notation,

$$\dot{x} = f(t, x)$$

Here, $x$ is a real n-vector at each time $t$ and we seek a solution
$x(t) = \varphi(t, t_0, x_0)$, with initial data $\varphi(t_0, t_0, x_0) = x_0$ in
$R^n$, satisfying the system

$$\frac{\partial \varphi}{\partial t}(t, t_0, x_0) = f(t, \varphi(t, t_0, x_0))$$

This is the Cauchy or initial-value problem, which has a unique solution
$x(t)$ under very general conditions on the coefficient n-vector $f(t, x)$, as
stated later.

If the system is autonomous, so the coefficient vector $f = f(x)$ is
independent of the time $t$, then we usually denote the solution initiating at
$x_0$ when $t_0 = 0$ by $x = \varphi(t, x_0)$. Then $\varphi(t - t_0, x_0)$ is the
solution of the autonomous system

$$\dot{x} = f(x)$$

satisfying the initial data $x_0$ when $t = t_0$.

In the special case when $f(t, x)$ is linear in $x$ we can use matrix notation

$$\dot{x}^i = a^i_j(t) \, x^j + b^i(t) \qquad \text{(sum on repeated index } j\text{)}$$

or

$$\dot{x} = A(t) \, x + b(t)$$

When $b(t) \equiv 0$ we have a homogeneous linear system. If $A$ and $b$ are
constant matrices, the linear system is autonomous, and the solution is

$$x(t) = e^{At} x_0 + e^{At} \int_0^t e^{-As} \, b \, ds$$

Let us note that a scalar higher-order ordinary differential equation
can be reduced to the study of a first-order vector system by a suitable
change of notation. That is, the scalar equation (with the real-valued
function $F$)

$$x^{(n)} = F(t, x, \dot{x}, \ldots, x^{(n-1)})$$

can be written in terms of the variables $x = x^1$ and

$$x^2 = \dot{x}, \quad x^3 = \ddot{x}, \quad \ldots, \quad x^n = x^{(n-1)}$$

(or $\dot{x} = x^2$, $\ddot{x} = \dot{x}^2 = x^3$, etc.)

Then the n-vector

$$\underline{x} = \begin{pmatrix} x^1 \\ x^2 \\ \vdots \\ x^n \end{pmatrix}$$

has a time-derivative denoted by the n-vector

$$\underline{\dot{x}} = \begin{pmatrix} \dot{x}^1 \\ \dot{x}^2 \\ \vdots \\ \dot{x}^n \end{pmatrix}$$

which we can write as

$$\underline{\dot{x}} = \underline{f}(t, \underline{x}), \qquad \underline{f}(t, x) = \begin{pmatrix} f^1(t, x) \\ f^2(t, x) \\ \vdots \\ f^n(t, x) \end{pmatrix}$$

(and we usually omit the underlining of the vectors).

Note that a real-valued function $x(t)$ whose n-th derivative exists and
satisfies the scalar equation always determines an n-vector $x(t)$ whose first
derivative exists satisfying the vector system, and vice versa. Also note
that the degree of differentiability of the coefficients and the properties of
being autonomous or linear are transferred straightforwardly from the
n-th-order scalar ordinary differential equation to the vector system.
Because of the greater symmetry and generality of the vector system, we
make this format the principal mathematical object for study.

In any careful analysis of ordinary differential equations we must declare the hypotheses of differentiability on the coefficient vector f(t, x), and the domain in which the solutions are defined. For instance, consider the non-linear differential equation in $R^1$

$\dot{x} = x^2$

with initial condition $x = x_0 > 0$ at $t = 0$. Note that the coefficient is a quadratic polynomial, hence differentiable to all orders everywhere in $R^1$. Yet the solution is given by

$-\frac{1}{x} + \frac{1}{x_0} = t$, so $x = \frac{1}{\frac{1}{x_0} - t}$ for $t < \frac{1}{x_0}$

(upon separating the variables, as in $dx/x^2 = dt$). Hence the solution x(t) exists only on a time duration $\tau_- < t < \tau_+$ where $\tau_- = -\infty$ and $\tau_+ = 1/x_0$.
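The finite escape time in this example can be observed numerically. The following is a minimal sketch, assuming a fixed-step classical Runge-Kutta integrator; the function names `exact_solution` and `rk4` and the value $x_0 = 2$ are illustrative choices, not part of the lectures. It compares the numerical solution with the closed form $x = x_0/(1 - t x_0)$, which is the same expression as the one derived above.

```python
def exact_solution(t, x0):
    """Closed-form solution of dx/dt = x**2 with x(0) = x0 > 0, valid for t < 1/x0."""
    if t >= 1.0 / x0:
        raise ValueError("the solution has escaped to infinity by t = 1/x0")
    return x0 / (1.0 - t * x0)


def rk4(f, x0, t_end, steps):
    """Classical fourth-order Runge-Kutta for a scalar autonomous equation."""
    h = t_end / steps
    x = x0
    for _ in range(steps):
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x


x0 = 2.0  # illustrative initial value; the escape time is 1/x0 = 0.5
for t in (0.1, 0.3, 0.45, 0.49):
    numeric = rk4(lambda x: x * x, x0, t, 2000)
    print(t, numeric, exact_solution(t, x0))
```

As t approaches 0.5 the printed values grow without bound, even though the right-hand side is a polynomial.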

This elementary example shows that we can only expect local solutions, even when the coefficients are very smooth. In control theory, where the coefficients may be discontinuous in t, other complications arise. We first state a "smooth" existence theorem, and then indicate the modifications that must be made when incorporating control dynamics. The proofs of these basic theorems are found in advanced texts.

We first consider a vector autonomous differential system

$\dot{x} = f(x)$

with f(x) in class $C^k$ in an open set $\mathcal{O} \subset R^n$, for some differentiability order $k = 1, 2, \ldots, \infty$. Here a vector (or a matrix) is in class $C^k$ in case each of its components is continuous in $\mathcal{O}$, and there has continuous partial derivatives of all orders $\le k$ (or all orders if we take $k = \infty$). Of course, a real analytic function (written $f \in C^\omega$), specified in each locality in $\mathcal{O}$ by an absolutely convergent power series, is necessarily in class $C^\infty$.

Theorem. Consider the vector ordinary differential equation in an open set $\mathcal{O} \subset R^n$,

$\mathcal{S}$) $\dot{x} = f(x)$

with f(x) in $C^k(\mathcal{O})$ for some fixed $k = 1, 2, 3, \ldots, \infty, \omega$. Let $x_0$ be a given initial point in $\mathcal{O}$. Then there exists a solution $x = \varphi(t, x_0)$ of $\mathcal{S}$ in $\mathcal{O}$, initiating at $\varphi(0, x_0) = x_0$, and defined on some duration $\tau_- < t < \tau_+$, which we can take maximal. Furthermore, this solution $\varphi(t, x_0)$ is then unique, and is itself in class $C^k$ in an open neighbourhood of $t = 0$ within the space $R^1 \times \mathcal{O} \subset R^{1+n}$.

Also the $n \times n$ matrix $Z(t) = (\partial \varphi / \partial x_0)(t, x_0)$, treated as a function of t, satisfies the linear variational system

$\dot{Z}(t) = \frac{\partial f}{\partial x}(\varphi(t, x_0))\, Z(t), \qquad Z(0) = I$
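The variational system can be checked on a scalar example. The sketch below is only an illustration under stated assumptions: it takes $f(x) = x^2$ (so $\partial f/\partial x = 2x$), integrates the pair $(x, Z)$ together with a hand-written Runge-Kutta routine, and compares $Z(t)$ with a central finite difference of the solution in $x_0$. All function names are hypothetical. For this f the exact sensitivity is $1/(1 - t x_0)^2$.

```python
def rk4_system(f, state, t_end, steps):
    """Classical RK4 for an autonomous system given as a list of floats."""
    h = t_end / steps
    s = list(state)
    for _ in range(steps):
        k1 = f(s)
        k2 = f([a + 0.5 * h * b for a, b in zip(s, k1)])
        k3 = f([a + 0.5 * h * b for a, b in zip(s, k2)])
        k4 = f([a + h * b for a, b in zip(s, k3)])
        s = [a + h * (p + 2.0 * q + 2.0 * r + w) / 6.0
             for a, p, q, r, w in zip(s, k1, k2, k3, k4)]
    return s


def augmented(s):
    """Right-hand side for (x, Z): dx/dt = x**2 and dZ/dt = (df/dx) Z = 2 x Z."""
    x, z = s
    return [x * x, 2.0 * x * z]


def sensitivity(x0, t, steps=2000):
    """Z(t) obtained from the variational equation with Z(0) = 1."""
    return rk4_system(augmented, [x0, 1.0], t, steps)[1]


def finite_difference(x0, t, eps=1e-6, steps=2000):
    """Central difference of the solution with respect to the initial point."""
    plus = rk4_system(augmented, [x0 + eps, 1.0], t, steps)[0]
    minus = rk4_system(augmented, [x0 - eps, 1.0], t, steps)[0]
    return (plus - minus) / (2.0 * eps)


print(sensitivity(1.0, 0.5), finite_difference(1.0, 0.5))
```

Both printed numbers should agree closely with the exact value $1/(1 - 0.5)^2 = 4$.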

Remarks on global solutions. Let $x = \varphi(t, x_0)$ on $\tau_- < t < \tau_+$ be a solution of $\mathcal{S}$ in $\mathcal{O} \subset R^n$. If $\tau_+ < +\infty$, then x(t) must eventually leave and stay out of any prescribed compact set $K \subset \mathcal{O}$ as $t \to \tau_+$.

In particular, when $\mathcal{O} = R^n$ and x(t) remains bounded for $t > 0$, then we can conclude that $\tau_+ = +\infty$, and a similar condition enforces $\tau_- = -\infty$. Thus if $|f(x)|$ is bounded (or even grows linearly with $|x|$) then every solution x(t) is defined for all times $t \in R^1$ in $\mathcal{O} = R^n$.

Remarks on dependence on parameters

Next consider the differential system in $R^n$

$\dot{x} = f(t, x, \lambda)$

for a constant parameter $\lambda$ in $R^m$. We reduce this non-autonomous system to an autonomous differential system in a space of $1 + n + m$ dimensions. That is, let $(t, x, \lambda)$ be the co-ordinates of a vector in $R^{1+n+m}$ and consider

$\frac{dx}{d\tau} = f(t, x, \lambda), \qquad \frac{dt}{d\tau} = 1, \qquad \frac{d\lambda}{d\tau} = 0$

for $f(s, x, \lambda) \in C^k$ in an open set, say $R^{1+n+m}$. Here we prescribe the initial data $t = t_0$, $x = x_0$, $\lambda = \lambda_0$ at $\tau = 0$, and denote the unique solution by

$x = \Phi(\tau, t_0, x_0, \lambda_0), \qquad t = \tau + t_0, \qquad \lambda = \lambda_0$

Then $x = \Phi(t - t_0, t_0, x_0, \lambda_0)$ is the required solution of

$\dot{x} = f(t, x, \lambda_0)$

passing through $x_0$ when $t = t_0$. Note that this vector function of $(t, t_0, x_0, \lambda_0)$ is in class $C^k$ in an open subset of $R^{1+1+n+m}$. It is this theorem which justifies differentiation of the solution with respect to initial data and parameter components.

However, for even the simplest control systems we encounter more technical problems. Consider in $R^n$

$\dot{x} = f(x, u)$

where the n-vector f(x, u) is in class $C^k$ for the state n-vector x and the control m-vector u. Now if u(t) is a controller in $L_\infty[0, T]$ (bounded measurable on the interval $0 \le t \le T$) then f(x, u(t)) is not necessarily continuous in t. However f(x, u(t)) and $(\partial f/\partial x)(x, u(t))$ lie in $L_\infty[0, T]$ for each fixed $x \in R^n$ (since $|f(x, u)|$ and $|(\partial f/\partial x)(x, u)|$ are bounded when $|x|$ and $|u|$ are bounded) and f(x, u(t)) is in $C^k$ in x for each fixed t. These conditions are enough to ensure the existence of an absolutely continuous solution x(t), whose derivative exists everywhere on $0 \le t \le \tau_+ \le T$ excepting a set of times of measure zero, and which then satisfies the differential system. We emphasize that x(t) is a continuous curve in $R^n$ and the possibility of corners, or discontinuities of $\dot{x}(t)$, causes no special difficulties.

We state the fundamental existence and uniqueness theorem needed in control theory; refinements can be found in the text of Lee-Markus.

Theorem. Consider the n-vector differential system

$\mathcal{S}$) $\dot{x}^i = f^i(t, x^1, x^2, \ldots, x^n), \quad i = 1, \ldots, n$

where f(t, x) is defined for t in an open interval $\mathcal{I} \subset R^1$ and x in an open set $\mathcal{O} \subset R^n$. Assume

(a) For each fixed $t \in \mathcal{I}$ the functions $f^i(t, x)$ are in $C^1(\mathcal{O})$,
(b) For each fixed $x \in \mathcal{O}$ the functions $f^i(t, x)$ are measurable in t in $\mathcal{I}$,
(c) Given any compact sets $\mathcal{I}_c \subset \mathcal{I}$ and $K \subset \mathcal{O}$ there exists an integrable function m(t) on $\mathcal{I}_c$ such that

$|f(t, x)| \le m(t)$ and $\left| \frac{\partial f}{\partial x}(t, x) \right| \le m(t)$

for all $(t, x) \in \mathcal{I}_c \times K$ (note $\int_{\mathcal{I}_c} m(t)\, dt < \infty$).

Then for each initial point $(t_0, x_0)$ in $\mathcal{I} \times \mathcal{O}$ there exists a unique solution of $\mathcal{S}$:

$x = \varphi(t, t_0, x_0)$ with $\varphi(t_0, t_0, x_0) = x_0$

defined for a maximal time duration in $\mathcal{I}$

$\tau_-(t_0, x_0) < t < \tau_+(t_0, x_0)$

Moreover, the n-vector function $\varphi(t, t_0, x_0)$ is then defined and continuous in an open set $D \subset R^{1+1+n}$. For each fixed $(t_0, x_0)$ the function $\varphi(t, t_0, x_0)$ is absolutely continuous in t and satisfies the vector differential equation

$\frac{\partial \varphi}{\partial t} = f(t, \varphi(t, t_0, x_0))$

almost everywhere on $\tau_- < t < \tau_+$, and hence the integral equation

$\varphi(t) = x_0 + \int_{t_0}^{t} f(s, \varphi(s))\, ds$

for all $\tau_- < t < \tau_+$. For each fixed $(t, t_0)$ the function $\varphi(t, t_0, x_0)$ is in class $C^1$ in $x_0$, and the vector $(\partial \varphi/\partial x_0^j)(t, t_0, x_0)$, for each $j = 1, \ldots, n$, is absolutely continuous in t and satisfies the linear differential system

$\frac{d}{dt}\left( \frac{\partial \varphi^i}{\partial x_0^j} \right) = \sum_{\ell=1}^{n} \frac{\partial f^i}{\partial x^\ell}(t, \varphi(t, t_0, x_0))\, \frac{\partial \varphi^\ell}{\partial x_0^j}$

Example. Control of non-linear vibrator with one degree of freedom

Consider a particle with displacement x along a straight track at time t. Newton's equations of motion for the particle of mass m assert

$m \ddot{x} = F$

We assume that the total force F is the superposition of an elastic spring force $F_s = -g(x)$, a frictional force $F_f = -f(x, \dot{x})$, and an external control force $F_c = u(t)$. Then, taking units where m = 1 for convenience, the control dynamical system becomes

$\ddot{x} + f(x, \dot{x}) + g(x) = u(t)$

or, introducing the velocity y,

$\dot{x} = y$

$\dot{y} = -f(x, y) - g(x) + u(t)$

We impose the following physically motivated assumptions:

(i) f(x, y) and g(x) are in $C^1$ in the (x, y)-phase plane $R^2$ and u(t) is bounded and measurable on $0 \le t < \infty$,

(ii) $x\, g(x) > 0$ for all $x \ne 0$ (restoring-force condition) and $\lim_{|x| \to \infty} G(x) = \infty$ where $G(x) = \int_0^x g(s)\, ds$,

(iii) $y\, f(x, y) \ge 0$ for all (x, y) (frictional-damping condition).

By assumption (i) there exists a unique solution $x(t, x_0, y_0)$, $y(t, x_0, y_0)$ initiating at $(x_0, y_0)$ in $R^2$, and defined for some maximal future time $\tau_+ > 0$. We shall next show that conditions (ii) and (iii) guarantee the global existence of all such future trajectories, that is, $\tau_+ = +\infty$. For this proof we pick a finite time T > 0 and show that the trajectory must remain within a certain compact subset $K \subset R^2$ on the duration $0 \le t \le \min(\tau_+, T)$. From this it follows that $\tau_+ > T$, according to the general theory summarized above.

In the (x, y) phase-plane define the (energy or Liapunov) function

$V(x, y) = \frac{y^2}{2} + G(x)$

Then each level curve V(x, y) = c (where c is a positive constant) is a simple closed curve encircling the origin in $R^2$. We compute the time derivative of V(x, y) along a solution curve x(t), y(t) by

$\frac{dV}{dt} = \frac{\partial V}{\partial x} \dot{x} + \frac{\partial V}{\partial y} \dot{y} = g(x)\, y + y\,[-f - g + u(t)]$

Hence

$\dot{V} = -y f(x, y) + y\, u(t)$

Note that the inequality $\dot{V} \le 0$, which holds for the free motion where $u(t) \equiv 0$, already shows that each free trajectory lies within the level curve $V(x, y) = V(x_0, y_0)$ and so is defined for all future times. But we must allow controlled motion with $|u(t)| \le u_0$ (the essential least upper bound on $0 \le t < \infty$).

For any controlled trajectory, since $y f(x, y) \ge 0$ and $G(x) \ge 0$,

$\dot{V} \le |y|\, u_0 \le \left( \frac{y^2}{2} + 1 \right) u_0 \le (V + 1)\, u_0$

Then along the controlled trajectory

$\frac{d}{dt}(V + 1) \le (V + 1)\, u_0$

so by the usual inequalities we find, for all times in $0 \le t \le T$,

$V + 1 \le [V(x_0, y_0) + 1]\, e^{u_0 T} = c_1$

Hence

$V(x(t), y(t)) \le [V(x_0, y_0) + 1]\, e^{u_0 T} = c_1$

and so the future trajectory lies within the simple closed level curve $V(x, y) = c_1$ and must be defined on the full interval $0 \le t \le T$. Thus $\tau_+ > T$ for each T > 0 and so $\tau_+ = +\infty$.
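The energy estimate of this example can be tested numerically on one concrete vibrator. The sketch below is an illustration under assumptions that are not in the lectures: friction $f(x, y) = y$, spring force $g(x) = x + x^3$, a bang-bang controller of amplitude $u_0$, and a fixed-step Runge-Kutta integrator. All function names are hypothetical. It tracks the ratio $(V + 1)/((V_0 + 1) e^{u_0 t})$, which the estimate predicts never exceeds 1.

```python
import math


def spring_potential(x):
    """G(x), the integral of g(s) = s + s**3 from 0 to x."""
    return 0.5 * x * x + 0.25 * x ** 4


def energy(x, y):
    """Liapunov function V(x, y) = y**2 / 2 + G(x)."""
    return 0.5 * y * y + spring_potential(x)


def control(t, u0):
    """Bang-bang controller: bounded and measurable, discontinuous in t."""
    return u0 if math.sin(3.0 * t) >= 0.0 else -u0


def rhs(t, x, y, u0):
    """Vector field with friction f(x, y) = y and spring g(x) = x + x**3."""
    return y, -y - (x + x ** 3) + control(t, u0)


def worst_bound_ratio(x0, y0, u0, T, steps=20000):
    """Integrate by RK4; return the largest grid value of
    (V + 1) / ((V0 + 1) * exp(u0 * t)) on 0 <= t <= T."""
    h = T / steps
    x, y, t = x0, y0, 0.0
    v0 = energy(x0, y0)
    worst = 1.0  # the ratio equals 1 at t = 0
    for _ in range(steps):
        k1 = rhs(t, x, y, u0)
        k2 = rhs(t + 0.5 * h, x + 0.5 * h * k1[0], y + 0.5 * h * k1[1], u0)
        k3 = rhs(t + 0.5 * h, x + 0.5 * h * k2[0], y + 0.5 * h * k2[1], u0)
        k4 = rhs(t + h, x + h * k3[0], y + h * k3[1], u0)
        x += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        y += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        t += h
        ratio = (energy(x, y) + 1.0) / ((v0 + 1.0) * math.exp(u0 * t))
        worst = max(worst, ratio)
    return worst


print(worst_bound_ratio(1.0, 2.0, 1.0, 10.0))
```

The exponential bound is far from sharp for this damped example; its role in the argument is only to confine the trajectory to a compact set on each finite interval.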

V. QUALITATIVE THEORY OF CONTROL DYNAMICAL SYSTEMS

Let us return to the study of autonomous differential systems in $R^n$,

$\mathcal{S}$) $\dot{x} = f(x)$

For simplicity of exposition we assume throughout this section that $f(x) \in C^1$ in all $R^n$, and that each solution $x(t) = \varphi(t, x_0)$, initiating at $x_0 \in R^n$ when t = 0, is defined for all times $t \in R$.

In the state space $R^n$, we picture the dynamical system $\mathcal{S}$ as a vector field. That is, at each point x we attach the vector f(x). Then a solution $x(t) = \varphi(t, x_0)$ initiating at $x_0$ can be pictured as a time-parametrized curve in $R^n$ whose tangent vector $\dot{x}(t)$ coincides with the given vector field f(x) at each point on the curve x(t). We call the solution curve x(t) the trajectory or the orbit through $x_0$ (sometimes these terms are used more stringently to distinguish between the map $t \to x(t)$ and the point set $\bigcup_{t \in R} x(t)$, but we shall assume the meaning is clear from the context).

There are basically three types of orbits: a single point $x_0$ (a critical point where $f(x_0) = 0$), a simple closed curve (a periodic orbit), and a one-to-one differentiable image of a line R in $R^n$. The proof of this assertion, and other statements of this section, will be given later in our discussion of topological dynamics. In this section, we merely introduce concepts and present plausible arguments to motivate the later abstract reasoning.

Let us consider the map of $R^n$ into $R^n$ defined by following each trajectory of $\mathcal{S}$ for a specified time duration t,

$\Phi_t : R^n \to R^n : x_0 \to x_t = \varphi(t, x_0)$

By the uniqueness and regularity theorems of the previous section we know that $\Phi_t$ is a bijective map of $R^n$ onto $R^n$, and both $\Phi_t$ and its inverse map $\Phi_{-t}$ are differentiable of class $C^1$. Thus $\Phi_t$ is a $C^1$-diffeomorphism of $R^n$ onto $R^n$ for each given real number t (assuming that each trajectory is defined for all times).

Moreover, because the system $\mathcal{S}$ is autonomous we can conclude that

$\varphi(t_2, \varphi(t_1, x_0)) = \varphi(t_2 + t_1, x_0)$

or the composition of maps satisfies $\Phi_{t_2} \circ \Phi_{t_1} = \Phi_{t_2 + t_1}$ for all $t_1, t_2 \in R$. This is known as the "flow condition" or "group homomorphism condition" since it asserts that $t \to \Phi_t$ is a homomorphism of the additive group R into the group $\mathrm{Diff}(R^n)$ of all $C^1$-diffeomorphisms of $R^n$ onto itself.
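The flow condition can be observed numerically for a specific autonomous system. The sketch below assumes the pendulum field $\dot{x} = y$, $\dot{y} = -\sin x$ (an illustrative choice, globally defined because the field grows at most linearly) and approximates $\Phi_t$ by a fixed-step Runge-Kutta scheme. The names `pendulum` and `flow` are hypothetical.

```python
import math


def pendulum(p):
    """Autonomous vector field f(x, y) = (y, -sin x)."""
    x, y = p
    return (y, -math.sin(x))


def flow(p, t, h=1e-3):
    """Approximate Phi_t(p) by RK4; a negative t integrates backwards in time."""
    n = max(1, round(abs(t) / h))
    step = t / n
    x, y = p
    for _ in range(n):
        k1 = pendulum((x, y))
        k2 = pendulum((x + 0.5 * step * k1[0], y + 0.5 * step * k1[1]))
        k3 = pendulum((x + 0.5 * step * k2[0], y + 0.5 * step * k2[1]))
        k4 = pendulum((x + step * k3[0], y + step * k3[1]))
        x += step * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        y += step * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return (x, y)


p0 = (1.0, 0.5)
composed = flow(flow(p0, 0.7), 1.3)   # Phi_{1.3} applied after Phi_{0.7}
direct = flow(p0, 2.0)                # Phi_{2.0}
returned = flow(direct, -2.0)         # Phi_{-2.0} should undo Phi_{2.0}
print(composed, direct, returned)
```

Up to discretization error, `composed` agrees with `direct` and `returned` agrees with `p0`, which illustrates $\Phi_{t_2} \circ \Phi_{t_1} = \Phi_{t_2 + t_1}$ and $\Phi_{-t} = (\Phi_t)^{-1}$.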

Incidentally, let us note that for periodic non-autonomous systems

$\dot{x} = f(t, x)$ with $f(t + P, x) \equiv f(t, x)$

for some period P > 0, the map

$\Phi_P : x_0 \to \varphi(P, x_0)$

is a $C^1$-diffeomorphism of $R^n$. Also the correspondence

$k \to \Phi_{kP} = (\Phi_P)^k$ for $k = 0, \pm 1, \pm 2, \ldots$

is a homomorphism of the integer group Z into $\mathrm{Diff}(R^n)$.

By means of the geometry of the flow $\Phi_t$ (or the discrete flow $(\Phi_P)^k$ for periodic systems) we can analyse the qualitative properties of the solutions of the autonomous system $\mathcal{S}$ in $R^n$. For example, let us consider two kinds of stability for $\mathcal{S}$: Liapunov and structural stability, as discussed next.

Let us study the behaviour of the solutions of

$\mathcal{S}$) $\dot{x} = f(x)$ (f(x) in $C^1$ in $R^n$)

near a critical (or equilibrium) point where f vanishes. For simplicity take the critical point to be the origin, so $f(0) = 0$ and $\varphi(t, 0) = 0$. We shall call the origin stable (future Liapunov stable) in case $|\varphi(t, x_0)|$ stays small forever on $t \ge 0$ provided $|x_0|$ is sufficiently small; that is, for each $\epsilon > 0$ there is a $\delta > 0$ such that $|x_0| < \delta$ implies $|\varphi(t, x_0)| < \epsilon$ for all $t \ge 0$. If also $\varphi(t, x_0) \to 0$ as $t \to +\infty$, then the origin is asymptotically stable. Thus Liapunov stability refers to the tendency of the solution to remain near to the equilibrium state at the origin provided the initial state $x_0$ suffers a suitably small perturbation away from the origin.

To analyse the possible Liapunov stability of the origin, we construct the linear variational equation for small deviations $v = \varphi(t, x_0)$ with $x_0$ near 0. That is,

$\dot{v} = f(\varphi(t, x_0)) = f(0) + \frac{\partial f}{\partial x}(0)\, v + \ldots,$

so the linearized system is defined by

$\dot{v}^i = \frac{\partial f^i}{\partial x^j}(0)\, v^j, \quad i = 1, \ldots, n$

or in matrix notation

$\dot{v} = A v$ where $A = \frac{\partial f}{\partial x}(0)$

It seems plausible that if all linearized solutions $v(t) \to 0$, then the non-linear system will be asymptotically stable about the origin. This conjecture is true (we return to the details later) and hence we conclude that $\mathcal{S}$ is asymptotically stable at the origin provided all eigenvalues $\lambda_j$ of the matrix A lie in the left-half complex plane, that is $\mathrm{Re}\, \lambda_j < 0$.
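For planar systems the eigenvalue test is easy to carry out by hand or by machine. The following sketch is an illustration only: it approximates $A = (\partial f/\partial x)(0)$ by central differences and obtains the two eigenvalues from the trace and determinant. The function names and the damped-pendulum test field are assumptions introduced here, not taken from the lectures.

```python
import cmath
import math


def jacobian_at_origin(f, eps=1e-6):
    """Central-difference approximation of A = (df/dx)(0) for f: R^2 -> R^2."""
    columns = []
    for j in range(2):
        e = [0.0, 0.0]
        e[j] = eps
        plus = f(e[0], e[1])
        minus = f(-e[0], -e[1])
        columns.append([(plus[i] - minus[i]) / (2.0 * eps) for i in range(2)])
    # columns[j][i] holds d f^i / d x^j; rearrange into rows of A
    return [[columns[0][0], columns[1][0]],
            [columns[0][1], columns[1][1]]]


def eigenvalues_2x2(A):
    """Roots of the characteristic polynomial via trace and determinant."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return ((tr + disc) / 2.0, (tr - disc) / 2.0)


def passes_linear_test(f):
    """True when every eigenvalue of the linearization has negative real part."""
    return all(lam.real < 0.0 for lam in eigenvalues_2x2(jacobian_at_origin(f)))


def damped_pendulum(x, y):
    return (y, -math.sin(x) - y)


def saddle(x, y):
    return (y, x)


print(eigenvalues_2x2(jacobian_at_origin(damped_pendulum)))
print(passes_linear_test(damped_pendulum), passes_linear_test(saddle))
```

A result of `False` does not by itself prove instability: when some eigenvalue has zero real part (as for an undamped oscillator) the linear test is inconclusive.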

Next let us contrast Liapunov stability with structural stability of a system $\mathcal{S}$ in $R^n$. The first concept deals with stability of some equilibrium point (or an invariant set) under perturbations of the initial data, and the second refers to stability of the family of all solutions of $\mathcal{S}$ under perturbations of the coefficients of the differential system.

Let us compare the two differential systems in the (x, y) plane

$\mathcal{S}_1$) $\dot{x} = y, \quad \dot{y} = -x$ and $\mathcal{S}_2$) $\dot{x} = y, \quad \dot{y} = -x - y$

The first is the conservative harmonic oscillator $\ddot{x} + x = 0$, and the second is a damped oscillator $\ddot{x} + \dot{x} + x = 0$. Both are stable about the critical point at the origin; in fact the damped oscillator is globally asymptotically stable. Yet the stability of $\mathcal{S}_1$ can be destroyed by an arbitrarily small perturbation, say

$\dot{x} = y + \epsilon x^3$

$\dot{y} = -x + \epsilon y^3, \quad \epsilon > 0$

where the polar radius $r = (x^2 + y^2)^{1/2}$ increases away from the origin,

$\frac{d}{dt}\left( \frac{r^2}{2} \right) = x\dot{x} + y\dot{y} = xy + \epsilon x^4 - xy + \epsilon y^4 = \epsilon (x^4 + y^4) > 0$

But consider a small $C^1$-perturbation of $\mathcal{S}_2$, say,

$\dot{x} = y + \epsilon_1(x, y)$

$\dot{y} = -x - y + \epsilon_2(x, y)$

where $\sum_{j=1}^{2} \left( |\epsilon_j(x, y)| + |\partial \epsilon_j/\partial x| + |\partial \epsilon_j/\partial y| \right) < \epsilon(x, y)$ for a suitable continuous bound $\epsilon(x, y) > 0$ in $R^2$. It can be shown that when $\epsilon(x, y)$ is suitably small, every such perturbation of $\mathcal{S}_2$ still has the same qualitative form for its solution curve family in $R^2$; namely, a single critical point towards which all solution curves tend as t increases.

To summarize the above comparisons, we define $\mathcal{S}$ to be structurally stable in $R^n$ if its solution family is topologically unchanged whenever $\mathcal{S}$ is modified by a suitably small $C^1$-perturbation. Then $\mathcal{S}_2$ is structurally stable in $R^2$, but $\mathcal{S}_1$ is not structurally stable.

Finally, let us consider a control system in $R^n$

$\dot{x} = f(x, u)$

for state $x \in R^n$ and control $u \in R^m$ at each time t. We assume f(x, u) in $C^1$ in all $R^{n+m}$ and use controllers u(t) in $L_\infty[0, T]$ for some finite duration $0 \le t \le T$. We shall define the control system to be (completely) controllable on [0, T] in case: for each pair of points $x_0$ and $x_1$ in $R^n$ there exists a controller u(t) for which the response initiating at $x_0$ terminates at $x_1$. That is, the solution $x(t) = \varphi(t, 0, x_0)$ of $\dot{x} = f(x, u(t))$ satisfies the two endpoint conditions

$x(0) = x_0, \quad x(T) = x_1$

so u(t) steers or controls x(t) from $x_0$ to $x_1$.

We shall show later that a linear autonomous control system in $R^n$

$\dot{x} = Ax + Bu$

is controllable if and only if

$\mathrm{rank}\, [B, AB, A^2 B, A^3 B, \ldots, A^{n-1} B] = n$

For general non-linear systems in $R^n$ no such useful controllability criterion is known.

Example. Consider the system in the (x, y) plane $R^2$

$\dot{x} = y$

$\dot{y} = x$ (corresponding to $\ddot{x} - x = 0$)

The origin is the unique critical point in $R^2$. An elementary sketch of the solutions shows that the origin is a saddle point and hence it is not Liapunov stable.

It seems plausible that this linear system has its qualitative appearance unchanged when the coefficients are slightly perturbed. This assertion is correct (although a difficult theorem) and so the linear saddle system is structurally stable in $R^2$.

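Next consider the control system

$\ddot{x} - x = u$

or

$\dot{x} = y, \quad \dot{y} = x + u$

Here $A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, and

$\mathrm{rank}\, [B, AB] = \mathrm{rank} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = 2$

Thus the linear system is controllable in $R^2$.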
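The rank criterion for $\dot{x} = Ax + Bu$ is mechanical to evaluate. The sketch below is one possible implementation under stated assumptions: matrices are plain lists of rows, arithmetic is done exactly with `fractions.Fraction`, and the rank is found by Gaussian elimination. The helper names are illustrative and do not come from the lectures.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def rank(M):
    """Rank by Gauss-Jordan elimination in exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def controllability_matrix(A, B):
    """Block matrix [B, AB, A^2 B, ..., A^(n-1) B] as a list of n rows."""
    n = len(A)
    blocks, block = [], B
    for _ in range(n):
        blocks.append(block)
        block = matmul(A, block)
    return [sum((blk[i] for blk in blocks), []) for i in range(n)]


def is_controllable(A, B):
    return rank(controllability_matrix(A, B)) == len(A)


A = [[0, 1], [1, 0]]   # the system x'' - x = u written as a first-order system
B = [[0], [1]]
print(controllability_matrix(A, B), is_controllable(A, B))
```

For contrast, `is_controllable([[1, 0], [0, 1]], [[1], [1]])` returns `False`, since then $AB = B$ and the block matrix has rank 1.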

VI. TOPOLOGICAL DYNAMICS. FLOWS IN METRIC SPACES

To construct a general theory of qualitative dynamics applying to ordinary, delay, and partial differential equations we allow the state space M to be an arbitrary metric space on which the dynamics evolves continuously with the time $t \in R$. Thus, there is a distance metric in M satisfying the usual axioms.

Definition. A topological flow $\Phi$ in a metric space M is a map (continuous in two arguments):

$\Phi : R \times M \to M : (t, x_0) \to x_t = \varphi(t, x_0)$

such that: for each $t \in R$ the restriction to M

$\Phi_t : x_0 \to x_t$

is a topological map of M onto M and further

$\Phi_0 = \text{identity}, \quad \Phi_{-t} = (\Phi_t)^{-1}$

and more generally

$\Phi_{t_1} \circ \Phi_{t_2} = \Phi_{t_1 + t_2}$ (group property)

Remark. $t \to \Phi_t$ is a homomorphism of R into the group Top(M) of all topological maps of M onto itself.

(a) If this homomorphism is defined only on the semi-group $R_+$ ($t \ge 0$) into the group Cont(M), then this defines a (future) semi-flow in M.

(b) If $\Phi$ is defined only on a neighbourhood of the slice t = 0 in the product space $R \times M$, then $\Phi$ is a local flow. Note that a local flow $\Phi$ is defined for $|t| < \epsilon$ uniformly for all $x_0$ in a prescribed compact subset $K \subset M$. Of course, the concept of a future local flow is immediate.

Examples.

1) Consider the autonomous ordinary differential system in $R^n$

$\dot{x} = f(x)$

with f(x) in $C^1$ in $R^n$. Then the solutions

$\Phi_t : x_0 \to x_t = \varphi(t, x_0)$

define a local flow. If $|f(x)| < k|x|$ in all $R^n$, then the flow is globally defined for all $t \in R$ (and this also happens in important cases which do not satisfy a linear growth hypothesis).

2) The non-autonomous ordinary differential system in $R^n$

$\dot{x} = f(t, x)$

with f(t, x) in $C^1$ in $R^{1+n}$ defines an autonomous system and flow in $R^{1+n}$. Namely, write

$\frac{dx}{d\tau} = f(t, x), \qquad \frac{dt}{d\tau} = 1$

and then the solutions yield a local flow in $R^{1+n}$:

$\Phi_\tau : (t_0, x_0) \to (\tau + t_0, \varphi(\tau + t_0, t_0, x_0))$

If $f(t + P, x) \equiv f(t, x)$ is periodic in t, then the flow can be defined in the product of the circle $S^1$ with $R^n$.

3) A differentiable n-manifold M is a metric space which is covered by a family of local coordinate charts, each of which maps an open set of M topologically onto an open set in $R^n$, with the additional requirement that the coordinate transformations defined between overlapping charts should be in class $C^\infty$. Thus differential systems can be defined globally on M by defining them in each coordinate chart $(x^1, \ldots, x^n)$ as

$\dot{x}^i = f^i(x), \quad i = 1, \ldots, n$

with the usual (contravariant or tangent) vector transformation rule in overlapping charts $(\bar{x}^1, \ldots, \bar{x}^n)$ to yield

$\dot{\bar{x}}^i = \bar{f}^i(\bar{x}), \quad i = 1, \ldots, n$

with

$\bar{f}^i(\bar{x}) = \frac{\partial \bar{x}^i}{\partial x^j}\, f^j(x)$ (sum on repeated index j)

We can always picture an n-manifold as an n-surface differentiably embedded in some Euclidean space (actually $R^{2n}$), and then an ordinary differential system on M is merely a $C^1$-vector field f everywhere tangent to M. Then the solution trajectories of f define a local flow on M. If M is a compact manifold (say a sphere $S^n$ or a torus $T^n$), then each vector field f defines a global flow on M.

4) Consider the ordinary differential-delay system in $R^n$

$\dot{x}(t) = f(x(t), x(t-1))$

with the n-vector f(x, y) in $C^1$ in $R^{2n}$. For each initial state $x_0(\theta) \in C_n[-1, 0]$ (the space of continuous n-vectors on $-1 \le \theta \le 0$) let x(t) be the corresponding solution in $R^n$ for some maximal duration $0 \le t < \tau_+$. Define the future states by

$x_t(\theta) = x(t + \theta)$ in $C_n[-1, 0]$

Take the Banach space $C_n[-1, 0]$ as the state space M and then the solutions define a future local flow in $C_n[-1, 0]$,

$\Phi : R_+ \times C_n[-1, 0] \to C_n[-1, 0] : (t, x_0) \to x_t$

If $|f(x, y)| \le k(|x| + |y|)$ is a linear growth condition in $R^{2n}$, then $\Phi$ is a global semi-flow, for instance in the case of a linear differential-delay system

$\dot{x}(t) = A x(t) + B x(t-1)$
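The role of the function-space state $x_t$ becomes concrete when the delay system is integrated step by step. The sketch below treats only the scalar case $\dot{x}(t) = a\,x(t) + b\,x(t-1)$ with a forward Euler scheme whose step divides the delay exactly. The scheme, the function names, and the test data are assumptions made for illustration, not the method of the lectures.

```python
def solve_delay(a, b, history, t_end, n_per_unit=1000):
    """Forward Euler for x'(t) = a x(t) + b x(t - 1), scalar case.

    `history` gives the initial state on -1 <= theta <= 0.  The returned list
    xs satisfies xs[n_per_unit + k] ~ x(k * h) with step h = 1 / n_per_unit.
    """
    h = 1.0 / n_per_unit
    xs = [history(-1.0 + k * h) for k in range(n_per_unit + 1)]
    for _ in range(round(t_end * n_per_unit)):
        x_now = xs[-1]
        x_delayed = xs[-1 - n_per_unit]      # value one time unit earlier
        xs.append(x_now + h * (a * x_now + b * x_delayed))
    return xs


def state_segment(xs, t, n_per_unit=1000):
    """Samples of the state x_t(theta) = x(t + theta) for -1 <= theta <= 0."""
    k = n_per_unit + round(t * n_per_unit)
    return xs[k - n_per_unit:k + 1]


# x'(t) = -x(t - 1) with constant initial state 1 on [-1, 0]
xs = solve_delay(0.0, -1.0, lambda theta: 1.0, 2.0)
print(xs[2000], xs[3000])        # approximations to x(1) and x(2)
print(len(state_segment(xs, 2.0)))
```

For this test problem the method of steps gives $x(t) = 1 - t$ on $[0, 1]$ and $x(t) = (t-2)^2/2 - 1/2$ on $[1, 2]$, so the printed values should be close to 0 and to $-1/2$. Advancing the solution requires the whole segment returned by `state_segment`, not just the current value x(t).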

A more general hereditary process in $R^n$ is defined by a differential-functional system

$\dot{x}(t) = f(x_t)$ for $t \ge 0$

where

$f : C_n[-1, 0] \to R^n$

satisfies a Lipschitz condition. Again there is a future local flow (global if f satisfies a linear growth condition) in the state space $C_n[-1, 0]$.

5) Consider the linear parabolic partial differential equation

$\frac{\partial w}{\partial t} = a^{ij}(x) \frac{\partial^2 w}{\partial x^i \partial x^j} + b^i(x) \frac{\partial w}{\partial x^i} + c(x)\, w + f(x)$

with real-valued solutions w(x, t), for x in a bounded open domain $D \subset R^n$ (with smooth boundary $\partial D$) and $t \ge 0$. Assume the coefficients $a^{ij}(x)$, $b^i(x)$, c(x), f(x) are $C^\infty$ in $\bar{D}$ and that $a^{ij}(x) = a^{ji}(x)$ is strictly positive definite in $\bar{D}$. Then a continuous solution w(x, t) in $\bar{D} \times R_+$ will be determined by suitable Cauchy data:

$w(x, 0) = w_0(x)$, initial function in D vanishing on $\partial D$

$w(x, t) = 0$ for $x \in \partial D$ and all $t \ge 0$

In more detail, if $w_0(x) \in C_0(\bar{D})$ (continuous on $\bar{D}$ but vanishing on $\partial D$), then $w(x, t) \in C^\infty(D \times (t > 0))$ is the unique classical solution. In particular, there is a future semi-flow in the state space $C_0(\bar{D})$ (Banach space with max norm)

$\Phi_t : w_0 \to w_t$ (where $w_t = w(x, t)$) for $t \ge 0$

Take the same parabolic partial differential equation but use the state space $H_0^1(D)$, the Hilbert space arising as the completion of $C_0^\infty(D)$, the $C^\infty$ functions on D vanishing in a neighbourhood of $\partial D$, under the Sobolev norm

$\|w\|_1^2 = \int_D \left( w^2 + \sum_{i=1}^{n} \left( \frac{\partial w}{\partial x^i} \right)^2 \right) dx$

Then each initial state $w_0$ determines a unique weak solution $w_t \in H_0^1(D)$ and a corresponding semi-flow

$\Phi_t : w_0 \to w_t$ in $H_0^1(D)$ for $t \ge 0$

Let us return to the general theory of a flow $\Phi$ in a metric space M

$\Phi : R \times M \to M, \quad \Phi_t : M \to M$

Define the future trajectory or orbit of a point $P_0 \in M$ to be the map $R_+ \to M : t \to P_t = \varphi(t, P_0)$ or the point set $\bigcup_{t \ge 0} P_t$. The past trajectory is similar, and the full trajectory of $P_0$ consists of both past and future trajectories of $P_0$.

Theorem. The orbit of $P_0 \in M$ is one of the three types

i) a single point $P_0$, that is a critical or stationary or equilibrium point where $\varphi(t, P_0) = P_0$,

ii) a simple closed curve (topological circle), that is $\varphi(t + \tau, P_0) = \varphi(t, P_0)$ with smallest positive period $\tau > 0$,

iii) an injective continuous image of a line R.

Proof

Consider the set $T \subset R$ of all times for which $\varphi(t, P_0) = P_0$. Then T is a closed subgroup of R and so either

i) $T = R$

ii) $T = \{k\tau\}$ for $k \in Z$ and some positive $\tau$

iii) $T = \{0\}$

In case i) $\varphi(t, P_0) = P_0$ for all t, so $P_0$ is a critical point. In case ii) $\varphi(\tau, P_0) = P_0$ but $\varphi(s, P_0) \ne P_0$ for $0 < s < \tau$. Then the map of the compact interval $0 \le t \le \tau$ is one-to-one, except for the endpoints, and is a topological map of $S^1$ into M. Clearly, each point on the orbit of $P_0$ has the same period $\tau$.

In case iii) $P_t \ne P_0$ for $t \ne 0$ so the map $t \to P_t$ is a one-to-one continuous map of R into M. (Note this map might not have a continuous inverse, as is the case for the Kronecker irrational flow on a torus.) Q.E.D.

Definition. A set $S \subset M$ is invariant under the flow $\Phi_t$ in case: $P_0 \in S$ implies the orbit $P_t \in S$ for all t. That is, an invariant set S is the union of complete orbits. Usually we study closed invariant sets, since the closure $\bar{S}$ of an invariant set S is invariant.

We next describe the various types of (Liapunov, future) stability that might hold for an invariant set.

Definition. Let $S \subset M$ be an invariant set for a flow $\Phi_t$. Then S is stable in case: for each neighbourhood U of S there exists a smaller neighbourhood V of S, so $S \subset V \subset U$, such that $P_t \in U$ for all $t \ge 0$ provided $P_0 \in V$. If, further, $\lim_{t \to \infty} \mathrm{dist}(P_t, S) = 0$, then S is asymptotically stable. If, still further, $Q_t \to S$ regardless of the initial point $Q_0 \in M$, then S is globally asymptotically stable.

Example. In $R^2$ the systems

$\dot{x} = y, \quad \dot{y} = -x$ (centre), and $\dot{x} = y, \quad \dot{y} = -x - y$ (spiral focus)

are stable about the critical point S = (0, 0), but

$\dot{x} = y, \quad \dot{y} = x$ (saddle)

is not stable about S.

In polar co-ordinates in $R^2$, consider the system

$\dot{r} = (1 - r), \quad \dot{\theta} = -1$

Then the periodic orbit $r(t) = 1$, $\theta(t) = -t$ is a stable invariant set.
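The stability of the periodic orbit in the polar example can be read off from the explicit solution $r(t) = 1 + (r_0 - 1)e^{-t}$, $\theta(t) = \theta_0 - t$. The short sketch below evaluates this closed form and measures the distance to the unit circle; the function names and sample initial points are illustrative assumptions.

```python
import math


def trajectory_point(r0, theta0, t):
    """Closed-form solution of r' = 1 - r, theta' = -1 in Cartesian form."""
    r = 1.0 + (r0 - 1.0) * math.exp(-t)
    theta = theta0 - t
    return (r * math.cos(theta), r * math.sin(theta))


def distance_to_unit_circle(p):
    """Distance from the point p to the periodic orbit r = 1."""
    return abs(math.hypot(p[0], p[1]) - 1.0)


for r0 in (0.5, 1.5):
    gaps = [distance_to_unit_circle(trajectory_point(r0, 0.0, t))
            for t in (0.0, 1.0, 2.0, 4.0)]
    print(r0, gaps)
```

The distance equals $|r_0 - 1| e^{-t}$, so it never exceeds its initial value and tends to zero; starting in a small annulus around the circle keeps the trajectory in that annulus for all $t \ge 0$.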


Definition. Let $\Phi_t$ define a flow in M. The future (or positive or $\omega$) limit set of a point $P_0$ (or the orbit $P_t$) is the set

$\omega(P_0) = \bigcap_{\tau > 0} \overline{\bigcup_{t \ge \tau} P_t}$

and the past (or negative or $\alpha$) limit set is

$\alpha(P_0) = \bigcap_{\tau < 0} \overline{\bigcup_{t \le \tau} P_t}$

It is easy to show that $\omega(P_0)$ can be characterized as the set of all points $Q \in M$ for which some sequence of points on the future orbit $P_t$ approaches Q. That is, $Q \in \omega(P_0)$ just in case:

$\exists\, t_k \to +\infty$ for which $\lim_{k \to \infty} P_{t_k} = Q$

From the definitions it is easy to verify that $\omega(P_0)$ and $\alpha(P_0)$ are connected, closed, invariant sets. In case the future orbit of $P_0$ lies in some compact subset of M (say M itself is compact), then $\omega(P_0)$ is a nonempty compact set.

Definition. $P_0 \in M$ is future-recurrent (or Poisson stable) in case $P_0 \in \omega(P_0)$. That is, $P_0$ is future-recurrent in case the future orbit $P_t$ recurs back to an arbitrarily small neighbourhood of $P_0$ at arbitrarily late times. Past recurrence is similar, and recurrence means both past and future recurrence.

The set of all periodic points of $\Phi_t$ is an invariant set, and is included in the invariant set of all recurrent points. But these invariant sets may not be closed in $M$. We next define a larger invariant set that is closed in $M$.

Definition. Let $\Phi_t$ define a flow in $M$. The non-wandering (or regionally recurrent) set $\Omega \subset M$ consists of all points $P_0 \in M$ such that each neighbourhood $N$ of $P_0$ in $M$ recurs to meet itself at arbitrarily large times. That is, $P_0 \in \Omega$ just in case: for each neighbourhood $N$ of $P_0$ there exist arbitrarily large times $t_k \to +\infty$ when $\Phi_{t_k} N$ meets $N$.

It is easy to prove that $\Omega$ is a closed invariant subset of $M$ which contains all recurrent points. While we require only future regional recurrence for $P_0 \in \Omega$, this automatically implies past regional recurrence.

VII. TOPOLOGICAL DYNAMICS. MINIMAL SETS AND STABILITY THEORY

In the study of a flow on a metric space $M$ we often search for an invariant set $S \subset M$. Then $\Phi_t$ defines a flow on $S$ which might be rather simple and which helps to understand the total flow on $M$.

Example. In the space $R^{2n}$ with co-ordinates $(x^1, x^2, \ldots, x^n, y^1, y^2, \ldots, y^n)$ we take any real $C^2$ function $H(x, y)$, and then determine a dynamical system of Hamiltonian form

\[ \dot{x}^i = \frac{\partial H}{\partial y^i}(x, y), \qquad \dot{y}^i = -\frac{\partial H}{\partial x^i}(x, y) \]

Then, for each constant $C$, the level hypersurface

\[ H(x, y) = C \]

is an invariant subset of $R^{2n}$. This is just the assertion of classical mechanics that a Hamiltonian dynamical system preserves the energy $H$. Namely,

\[ \frac{dH}{dt} = \frac{\partial H}{\partial x}\,\dot{x} + \frac{\partial H}{\partial y}\,\dot{y} = H_x H_y - H_y H_x = 0 \]
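The energy identity above can be checked numerically. The following is a minimal sketch, assuming one degree of freedom and the illustrative Hamiltonian $H = (x^2 + y^2)/2$ (the harmonic oscillator, chosen here only as an example); the helper names `hamiltonian_field`, `rk4_step` and `energy_drift` are hypothetical and do not come from the lectures.

```python
def hamiltonian_field(dH_dx, dH_dy):
    """Return the vector field (x, y) -> (dH/dy, -dH/dx) of Hamiltonian form."""
    def field(x, y):
        return dH_dy(x, y), -dH_dx(x, y)
    return field


def rk4_step(field, x, y, dt):
    """Advance (x, y) by one classical Runge-Kutta step of size dt."""
    k1x, k1y = field(x, y)
    k2x, k2y = field(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
    k3x, k3y = field(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
    k4x, k4y = field(x + dt * k3x, y + dt * k3y)
    return (x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6,
            y + dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6)


def H(x, y):
    """Assumed example Hamiltonian: H = (x^2 + y^2) / 2."""
    return 0.5 * (x * x + y * y)


def energy_drift(x0, y0, dt=1e-3, steps=1000):
    """Return |H(end) - H(start)| along the numerically integrated orbit."""
    field = hamiltonian_field(lambda x, y: x, lambda x, y: y)
    x, y = x0, y0
    for _ in range(steps):
        x, y = rk4_step(field, x, y, dt)
    return abs(H(x, y) - H(x0, y0))
```

The orbit stays on its level curve $H = C$ up to the discretization error of the integrator, which is what invariance of the level hypersurface predicts for the exact flow.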

Definition. Let $\Phi_t$ be a flow on a metric space $M$. A non-empty compact invariant set $E$ of $\Phi_t$ is called minimal in case $E$ contains no proper compact invariant subset.

Theorem. Let $S$ be a non-empty compact invariant set for a flow $\Phi_t$ in a metric space $M$. Then $S$ contains some minimal set $E$ for $\Phi_t$.

Proof

Partially order the non-empty compact invariant subsets of $S$ by the usual relation of inclusion. Use Zorn's lemma to select a maximal linearly ordered chain of such sets, and then $E$ is the intersection of the members of this chain. By compactness this intersection is non-empty, and by maximality of the chain it contains no proper compact invariant subset. Thus this result is proved by Zorn's lemma or by use of the axiom of choice. Q.E.D.

Theorem. Let $E$ be a minimal set for a flow $\Phi_t$ in a metric space $M$. Then the flow in $E$ is such that:

1) each point $P_0 \in E$ is topologically transitive in $E$, that is, $\alpha(P_0) = \omega(P_0) = E$.

2) each point $P_0 \in E$ is recurrent in $E$, and with bounded time gaps between recurrences to any prescribed neighbourhood.

Proof

Take $P_0 \in E$ and then $\omega(P_0)$ is a non-empty compact invariant set in $E$. Since $E$ is assumed minimal, $\omega(P_0) = E$. A similar argument proves $\alpha(P_0) = E$.

Next take any neighbourhood $N$ of $P_0$ in $E$ and suppose $P_t$ lies outside $N$ for a sequence of future time intervals $L_k$ with lengths tending toward infinity. Let $t_k$ be the mid-time of $L_k$ and consider the sequence $P_{t_k}$ in the compact set $E$. Select a subsequence $t_{k_i}$ such that $P_{t_{k_i}} \to Q$ in $E$. But $\omega(Q) = E$ so the orbit through $Q$ meets $N$ after some finite duration $L$. By continuity, when $P_{t_{k_i}}$ is sufficiently near $Q$, the orbit of $P_{t_{k_i}}$ must also meet $N$ after duration $L$. But this contradicts the supposition that the lengths of $L_{k_i}$ tend to $\infty$. Q.E.D.


Now we return to the study of ordinary differential systems in $R^n$ in order to make some applications of our abstract theory of topological dynamics.

Theorem. Consider a differential system in $R^n$

\[ \mathcal{S}) \quad \dot{x} = f(x), \qquad f(x) \in C^1 \ \text{in} \ R^n \]

with a critical point, say at the origin, $f(0) = 0$. Assume there exists a real function $V(x) \in C^1(R^n)$, called a Liapunov function, satisfying

i) $V(x) > 0$ in $R^n$ except $V(0) = 0$

ii) $\dfrac{\partial V}{\partial x} f(x) \le 0$ in $R^n$.

Then the origin is (future, Liapunov) stable.

Furthermore assume that the set

\[ Z = \left\{ x : \frac{\partial V}{\partial x} f(x) = 0 \right\} \]

in $R^n$ is such that, in some neighbourhood $N_0$ of the origin, $Z \cap N_0$ contains no minimal set other than the origin. Then the origin is asymptotically stable.

Moreover, if we can take $N_0 = R^n$ and also assume $V(x) \to \infty$ as $|x| \to \infty$, then the origin is globally asymptotically stable.

Proof

Take a neighbourhood $N$ of the origin, say with compact closure $\bar{N}$, and define

\[ \epsilon = \min_{x \in \partial N} V(x) > 0 \]

Take a subneighbourhood $U$ wherein $V(x) < \epsilon/2$. Then for $x_0 \in U$ the solution curve of $\mathcal{S}$ never meets $\partial N$, since along $x(t, x_0)$ we compute

\[ \frac{dV}{dt} = \frac{\partial V}{\partial x} f(x) \le 0, \qquad \text{so} \quad V(x(t, x_0)) < \epsilon/2 \ \text{for} \ t \ge 0 \]

Thus the origin is stable.

Now assume $Z \cap N_0$ contains no minimal set other than the origin. Take a subneighbourhood $U_0 \subset N_0$ so $x_0 \in U_0$ has a future trajectory in $N_0$, and suppose $\omega(x_0)$ contains a point $Q \ne 0$. Note $V(x) = V(Q) > 0$ must be constant on $\omega(x_0)$, since the monotonic function $V(x(t, x_0)) \to V(Q)$. So $\omega(x_0)$ does not contain the origin.

Hence $\omega(x_0)$ contains some minimal set $K$, not lying in $Z$. Then we choose a point $y \in K - Z$ so $V(y(t)) < V(y(0)) = V(Q)$, for small $t > 0$. But this contradicts the constancy of $V(x)$ on $\omega(x_0)$. Thus we must conclude that $\omega(x_0) = \{0\}$, and so $\mathcal{S}$ is asymptotically stable.

Finally, assume $Z$ contains no minimal set other than the origin and also $V(x) \to \infty$ as $|x| \to \infty$. Then each level set $V(x) = \text{constant}$ encloses an invariant region which can play the role of $N_0$ above. Hence $x_0 \in N_0$ has a future trajectory that remains forever in the compact set $\bar{N}_0$, and $\omega(x_0) = \{0\}$. Q.E.D.

Example. Consider an autonomous system of non-linear mechanics with 1 degree of freedom

\[ \ddot{x} + f(x)\dot{x} + g(x) = 0 \]

with $f(x)$ and $g(x)$ in $C^1$ for $x$ in $R$, and we assume

\[ f(x) > 0, \quad \text{and} \quad x g(x) > 0 \ \text{for} \ x \ne 0 \]

Then the system in the $(x, y)$ phase plane $R^2$ is

\[ \dot{x} = y, \qquad \dot{y} = -g(x) - y f(x) \]

which we show to be asymptotically stable about the origin. If also $G(x) = \int_0^x g(s)\,ds \to \infty$ as $|x| \to \infty$, then the system can be seen to be globally asymptotically stable.

To verify these assertions, consider the Liapunov function

\[ V(x, y) = \frac{y^2}{2} + G(x) \]

Note $V > 0$ except at the origin. Also

\[ \dot{V} = g(x) y + y\,[-g(x) - y f(x)] = -y^2 f(x) \le 0 \]

Thus the origin is stable.

Next consider the set in $R^2$

\[ Z = \{ (x, y) : -y^2 f(x) = 0 \} \]

so $Z$ is just the $x$-axis where $y = 0$. On the set $Z$ we observe the dynamics $\dot{y} = -g(x)$, so the only invariant subset of $Z$ is the origin. By the above theorem the origin is then asymptotically stable.

If, further, $G(x) \to \infty$ as $|x| \to \infty$ then the level curves $V(x, y) = \text{const}$ are closed curves encircling the origin. The theorem then asserts that the system is globally asymptotically stable about the origin in $R^2$.
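The decrease of $V$ along solutions can be observed numerically. The sketch below assumes the particular choice $f(x) = 1$, $g(x) = x$, so $G(x) = x^2/2$; these functions and the helper names `lienard_step` and `liapunov_values` are illustrative assumptions and are not taken from the lectures.

```python
def lienard_step(f, g, x, y, dt):
    """One Runge-Kutta step for x' = y, y' = -g(x) - y f(x)."""
    def field(x, y):
        return y, -g(x) - y * f(x)
    k1x, k1y = field(x, y)
    k2x, k2y = field(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
    k3x, k3y = field(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
    k4x, k4y = field(x + dt * k3x, y + dt * k3y)
    return (x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6,
            y + dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6)


def liapunov_values(f, g, G, x0, y0, dt=0.01, steps=2000):
    """Return V = y^2/2 + G(x) sampled along the numerical trajectory."""
    x, y = x0, y0
    values = [0.5 * y * y + G(x)]
    for _ in range(steps):
        x, y = lienard_step(f, g, x, y, dt)
        values.append(0.5 * y * y + G(x))
    return values


# Assumed example data: f = 1 (positive damping), g(x) = x, G(x) = x^2 / 2.
example_values = liapunov_values(lambda x: 1.0, lambda x: x,
                                 lambda x: 0.5 * x * x, 1.0, 0.0)
```

For this choice the sampled values of $V$ never increase and tend to zero, in line with asymptotic stability of the origin.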

Note that the asymptotic stability is a consequence of the positive damping $f(x) > 0$. If instead we have a conservative system with $f(x) \equiv 0$, then every solution lies on some level curve of $V(x, y)$ and so the origin is stable. If $V \to \infty$ as $|x| + |y| \to \infty$, then each level curve of $V$ is a closed curve encircling the origin, so each solution is periodic.


For an autonomous linear differential equation in $R^n$,

\[ \dot{x} = Ax \]

the origin is an asymptotically stable critical point if and only if all the eigenvalues $\lambda_j$ of $A$ lie in the left-hand complex plane, that is, $\operatorname{Re} \lambda_j < 0$. To prove this make a linear co-ordinate transformation in $R^n$, $y = Px$, and compute the linear system

\[ \dot{y} = P\dot{x} = PAx = (PAP^{-1})\,y = \Lambda y \]

Choose $P$ so that $PAP^{-1} = \Lambda$ is in suitable Jordan canonical form, and for simplicity of exposition we assume

\[ \Lambda = \operatorname{diag}\{\lambda_1, \lambda_2, \ldots, \lambda_n\} \quad \text{with real} \ \lambda_j < 0 \]

Then the solution initiating at $y_0 = (y_0^1, \ldots, y_0^n)$ has the components

\[ y^j(t) = e^{\lambda_j t}\, y_0^j, \qquad j = 1, \ldots, n \]

Thus $x(t) = P^{-1} y(t) \to 0$ exponentially as $t \to +\infty$, so the origin is globally asymptotically stable.

Even for the non-linear differential equation in $R^n$,

\[ \dot{x} = f(x) = Ax + h(x) \]

where $h(x) \in C^1$ is of higher order near $x = 0$, the condition $\operatorname{Re} \lambda_j < 0$ implies (local) asymptotic stability. The proof again uses the linear co-ordinates $y = Px$ and the Liapunov function

\[ V(y) = -y' \Lambda y > 0, \quad \text{for} \ y \ne 0 \]

Then

\[ \dot{V} = -2\, y' \Lambda\, [\Lambda y + P h(P^{-1} y)] \]

and the negative-definite quadratic form $-2\,y' \Lambda^2 y$ dominates the higher-order terms near $y = 0$.

However, note that the solution $x(t) = P^{-1} y(t)$ merely tends asymptotically towards the origin as $t \to +\infty$, and we might wish to steer an initial state $x_0$ to $x = 0$ in a finite time duration by applying a suitable controller $u(t)$.

VIII. CONTROLLABILITY OF LINEAR SYSTEMS

Definition. The autonomous linear control system

\[ \mathcal{L}) \quad \dot{x} = Ax + Bu, \qquad x \in R^n, \ u \in R^m \]

is (completely) controllable in $R^n$ on the duration $0 \le t \le T$ in case: for each pair of states $x_0$ and $x_1$ in $R^n$ there exists a piecewise continuous controller $u(t)$ on $0 \le t \le T$ steering the response $x(t)$ from $x(0) = x_0$ to $x(T) = x_1$.


If the duration $T < \infty$ is allowed to vary with the choice of points $x_0$, $x_1$, then $\mathcal{L}$ is controllable on finite durations.

Theorem

\[ \mathcal{L}) \quad \dot{x} = Ax + Bu, \qquad x \in R^n, \ u \in R^m \]

is controllable on a given finite duration $0 \le t \le T$ if and only if

\[ \operatorname{rank}\,[B, AB, A^2 B, \ldots, A^{n-1} B] = n \]

Proof

Fix $x_0$ and choose a controller $u(t)$ to get a response

\[ x(t) = e^{At} x_0 + e^{At} \int_0^t e^{-As} B u(s)\, ds \]

Then $x(T) = e^{AT} x_0 + \{L\}_T$, for a linear subspace $\{L\}_T \subset R^n$.

Assume that $\operatorname{rank}\,[B, AB, A^2 B, \ldots, A^{n-1} B] = n$ so this $n \times nm$ matrix has $n$ independent rows. Then a row $n$-vector $\eta$ annihilates $[B, AB, A^2 B, \ldots, A^{n-1} B]$ if and only if $\eta = 0$. That is, $\eta B = 0$, $\eta AB = 0$, $\eta A^2 B = 0$, ..., $\eta A^{n-1} B = 0$ hold if and only if $\eta = 0$. Hence

\[ \eta\, e^{-As} B = \eta \left[ I - As + \frac{s^2}{2!} A^2 - \ldots \right] B = 0 \quad \text{on} \ 0 \le s \le T \]

implies $\eta = 0$.

Suppose $\dim \{L\}_T < n$ so there exists $\eta \ne 0$ for which $\eta \{L\}_T = 0$ or

\[ \eta\, e^{AT} \int_0^T e^{-As} B u(s)\, ds = 0, \quad \text{for all} \ u(s) \]

Then $(\eta e^{AT})\, e^{-As} B = 0$ so $\eta e^{AT} = 0$. This contradicts the supposition $\eta \ne 0$ so we conclude $\{L\}_T = R^n$ and $\mathcal{L}$ is controllable on $0 \le t \le T$.

Conversely, assume $\mathcal{L}$ is controllable, or only the weaker assertion that the origin can be steered to all points of $R^n$ in various finite durations. Let $\{L\}_\infty = \bigcup_{k > 0} \{L\}_k = R^n$. Since the linear spaces $\{L\}_k$ are nested increasing for $k > 0$ (insert a zero control for duration $\sigma$ before a controller on $0 \le t \le k$ to construct a controller on $0 \le t \le k + \sigma$), the reachable set $\{L\}_\infty$ is a linear space with the same dimension as $\{L\}_N$, for some large $N$. That is, $\dim \{L\}_N = n$ and so only the zero vector is orthogonal to $\{L\}_N$.

Thus, for any row $n$-vector $\eta$,

\[ \eta\, e^{AN} \int_0^N e^{-As} B u(s)\, ds = 0 \ \text{for all} \ u(s) \quad \text{implies} \quad \eta = 0 \]

Hence, writing $\eta_1 = \eta e^{AN}$,

\[ \eta_1\, e^{-As} B = 0 \ \text{on} \ 0 \le s \le N \quad \text{implies} \quad \eta_1 = 0 \]

Now suppose $\operatorname{rank}\,[B, AB, A^2 B, \ldots, A^{n-1} B] < n$ so there exists $\eta_1 \ne 0$ with

\[ \eta_1 B = 0, \quad \eta_1 AB = 0, \quad \eta_1 A^2 B = 0, \ \ldots, \ \eta_1 A^{n-1} B = 0 \]

By the Cayley-Hamilton theorem $A^n$ is a real linear combination of $(I, A, A^2, \ldots, A^{n-1})$ so $\eta_1 A^n B = 0$. Continuing we have $\eta_1 e^{-As} B = 0$ with $\eta_1 \ne 0$. This contradicts the assertion that $\{L\}_N = R^n$, so we conclude $\operatorname{rank}\,[B, AB, A^2 B, \ldots, A^{n-1} B] = n$, as required. Q.E.D.
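The rank criterion is easy to evaluate mechanically. The following is a minimal sketch using exact rational arithmetic from the Python standard library; the function names `mat_mul`, `rank`, `controllability_matrix` and `is_controllable` are hypothetical helpers introduced only for illustration.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def rank(M):
    """Rank by Gauss-Jordan elimination over the rationals."""
    M = [[Fraction(v) for v in row] for row in M]
    rows, cols, r = len(M), len(M[0]), 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(rows):
            if i != r and M[i][c] != 0:
                factor = M[i][c] / M[r][c]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r


def controllability_matrix(A, B):
    """Return the n x nm matrix [B, AB, ..., A^(n-1) B]."""
    n = len(A)
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(mat_mul(A, blocks[-1]))
    return [sum((blk[i] for blk in blocks), []) for i in range(n)]


def is_controllable(A, B):
    """True when rank [B, AB, ..., A^(n-1) B] equals n."""
    return rank(controllability_matrix(A, B)) == len(A)
```

For instance, the undamped oscillator $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, $b = (0, 1)'$ passes the test, while $A = I$ with $b = (1, 0)'$ does not.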

Remarks. $\mathcal{L}$ is controllable on $[0, T]$ if and only if $\mathcal{L}$ is controllable on the unit duration $[0, 1]$. Also the proof of the theorem shows that $\{L\}_\infty = R^n$ implies that $\mathcal{L}$ is controllable on $[0, T]$. Hence controllability on variable finite durations is the same requirement as controllability on a fixed duration for autonomous linear systems.

This last remark is not valid for time-dependent linear systems. However, the same methods as above (Hermes-LaSalle) prove that a linear system with sufficiently smooth coefficients

\[ \dot{x} = A(t)\,x + B(t)\,u \]

is controllable in $R^n$ on any fixed duration $0 \le t \le T$ provided

\[ \operatorname{rank}\,[B, \Gamma B, \Gamma^2 B, \ldots, \Gamma^{k-1} B]_{t=0} = n \]

Here $k \ge 1$ is any integer and the differential operator $\Gamma$ is given by

\[ \Gamma B(t) = -A(t)\,B(t) + \dot{B}(t) \]

Remark. Note that the class of controllers $u(t)$ on $0 \le t \le T$ is not particularly significant in the controllability of $\mathcal{L}$. In fact, we could take $u(t)$ belonging to $L_1[0, T]$, or any dense linear subspace thereof.

Corollary. For scalar controllers, so $m = 1$ and $B = b$ is a column vector, $\mathcal{L}$ is controllable in $R^n$ if and only if

\[ \det\,[\,b, Ab, A^2 b, \ldots, A^{n-1} b\,] \ne 0 \]

Theorem. An autonomous linear scalar process

\[ \mathcal{S}) \quad x^{(n)} + a_1 x^{(n-1)} + \ldots + a_n x = u \]

or the corresponding control system in $R^n$

\[ \dot{x}^1 = x^2, \quad \dot{x}^2 = x^3, \ \ldots, \ \dot{x}^{n-1} = x^n, \quad \dot{x}^n = -a_n x^1 - a_{n-1} x^2 - \ldots - a_1 x^n + u \]

is controllable in $R^n$ with $u \in R^1$.

Moreover, every autonomous linear process in $R^n$

\[ \mathcal{L}) \quad \dot{x} = Ax + bu, \qquad u \in R^1 \]

which is controllable, is linearly equivalent to some scalar control process $\mathcal{S}$.

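Proof

For the matrices

\[ A_1 = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -a_n & -a_{n-1} & -a_{n-2} & \cdots & -a_1 \end{pmatrix}, \qquad b_1 = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix} \]

of the scalar process $\mathcal{S}$, the controllability criterion is easily verified by computation.

Next consider the controllable process $\mathcal{L}$ in $R^n$ and consider the non-singular $n \times n$ matrix

\[ P = [A^{n-1} b, A^{n-2} b, \ldots, A^2 b, Ab, b] \]

Introduce new linear co-ordinates in $R^n$ by $\bar{x} = P^{-1} x$ so

\[ \dot{\bar{x}} = P^{-1} A P\, \bar{x} + P^{-1} b\, u \]

By direct matrix multiplication verify

\[ P^{-1} b = \bar{b} = (0, 0, \ldots, 0, 1)' \]

Also $AP = PN$ or $P^{-1} A P = N$ where we define

\[ N = \begin{pmatrix} \alpha_1 & 1 & 0 & \cdots & 0 \\ \alpha_2 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ \alpha_{n-1} & 0 & 0 & \cdots & 1 \\ \alpha_n & 0 & 0 & \cdots & 0 \end{pmatrix} \]

Here the real constants $\alpha_1, \alpha_2, \ldots, \alpha_n$ are uniquely specified by the characteristic equation for $A$,

\[ A^n = \alpha_1 A^{n-1} + \alpha_2 A^{n-2} + \ldots + \alpha_n I \]

Thus $\mathcal{L}$ is linearly equivalent to a system in $R^n$

\[ \dot{\bar{x}} = N \bar{x} + \bar{b}\, u \]

But if we begin with a scalar process $\mathcal{S}$ with matrix $A_1$, and corresponding characteristic equation

\[ A_1^n = -a_1 A_1^{n-1} - a_2 A_1^{n-2} - \ldots - a_n I \]

then $\mathcal{S}$ is linearly equivalent to the same system $\dot{\bar{x}} = N \bar{x} + \bar{b}\,u$ provided we take

\[ -a_1 = \alpha_1, \quad -a_2 = \alpha_2, \ \ldots, \ -a_n = \alpha_n \]

Since linear equivalence is a transitive relation, $\mathcal{L}$ is linearly equivalent to the scalar system $\mathcal{S}$, as required. Q.E.D.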

Theorem. Consider a controllable linear autonomous process

\[ \mathcal{L}) \quad \dot{x} = Ax + bu, \qquad x \in R^n, \ u \in R^1 \]

Then there exists a linear feedback $u = kx$, for a constant row vector $k$, such that

\[ \dot{x} = (A + bk)\,x \]

is asymptotically stable towards the origin.

Proof

It is easy to see that the property of possessing a linear feedback which yields asymptotic stability is invariant under the relation of linear equivalence, or change of linear co-ordinates $\bar{x} = P^{-1} x$ in $R^n$.

But $\mathcal{L}$ is linearly equivalent to a scalar process

\[ \mathcal{S}) \quad x^{(n)} + a_1 x^{(n-1)} + \ldots + a_n x = u \]

and we can stabilize $\mathcal{S}$ by taking

\[ u = (a_n - 1)\,x + (a_{n-1} - n)\,\dot{x} + \ldots + (a_1 - n)\,x^{(n-1)} \]

where the subtracted constants are the binomial coefficients of $(\lambda + 1)^n$, to obtain

\[ x^{(n)} + n\,x^{(n-1)} + \ldots + n\,\dot{x} + x = 0 \]

But this has the characteristic equation

\[ \lambda^n + n \lambda^{n-1} + \ldots + n \lambda + 1 = (\lambda + 1)^n \]

with all eigenvalues $\lambda = -1 < 0$. Q.E.D.
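The construction in the proof can be traced on a concrete scalar process. The sketch below assumes the state vector $(x, \dot{x}, \ldots, x^{(n-1)})$ and computes the characteristic polynomial by the Faddeev-LeVerrier recursion, a technique swapped in here for checking and not used in the lectures; the helper names `companion`, `stabilizing_gain`, `closed_loop` and `char_poly` are hypothetical.

```python
from fractions import Fraction
from math import comb


def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def companion(a):
    """Matrix A and vector b of x^(n) + a_1 x^(n-1) + ... + a_n x = u, a = [a_1..a_n]."""
    n = len(a)
    A = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n - 1):
        A[i][i + 1] = Fraction(1)
    A[n - 1] = [Fraction(-a[n - 1 - j]) for j in range(n)]  # -a_n, ..., -a_1
    b = [Fraction(0)] * (n - 1) + [Fraction(1)]
    return A, b


def stabilizing_gain(a):
    """Row vector k with k_j = a_(n-j) - C(n, j), multiplying the j-th derivative of x."""
    n = len(a)
    return [Fraction(a[n - 1 - j]) - comb(n, j) for j in range(n)]


def closed_loop(a):
    """Return A + b k for the feedback u = k x."""
    A, b = companion(a)
    k = stabilizing_gain(a)
    n = len(a)
    return [[A[i][j] + b[i] * k[j] for j in range(n)] for i in range(n)]


def char_poly(M):
    """Coefficients [c_0, ..., c_n] of det(lambda I - M) by Faddeev-LeVerrier."""
    n = len(M)
    c = [Fraction(0)] * (n + 1)
    c[n] = Fraction(1)
    Mk = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        AM = mat_mul(M, Mk)
        Mk = [[AM[i][j] + (c[n - k + 1] if i == j else 0) for j in range(n)]
              for i in range(n)]
        AMk = mat_mul(M, Mk)
        c[n - k] = -sum(AMk[i][i] for i in range(n)) / k
    return c
```

With the assumed coefficients $a = (2, -3, 5)$, the closed-loop characteristic polynomial has coefficients $1, 3, 3, 1$, that is $(\lambda + 1)^3$.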

IX. LINEAR SYSTEMS WITH CONTROL RESTRAINTS

Consider an autonomous linear control process in $R^n$

\[ \mathcal{L}) \quad \dot{x} = Ax + Bu \]

where the controllers are in $L_2[0, T]$ with value $u(t)$ restrained to some set $\Omega \subset R^m$. We shall always assume that $\Omega$ is a convex compact set which contains the origin of $R^m$ in its interior. For instance $\Omega$ could be the cube $|u^j| \le 1$ for $j = 1, 2, \ldots, m$.

Definition. Consider an autonomous linear process in $R^n$

\[ \mathcal{L}) \quad \dot{x} = Ax + Bu \]

with $u(t) \subset \Omega$, a convex compact neighbourhood of the origin in $R^m$. The set of attainability from $x_0$ at time $T > 0$ is

\[ K_{x_0}(T) = \left\{ x(T) = e^{AT} x_0 + e^{AT} \int_0^T e^{-As} B u(s)\, ds \ \Big| \ \text{all admissible} \ u(s) \subset \Omega \right\} \]

Theorem. $K_{x_0}(T)$ is a compact convex subset of $R^n$ which varies continuously with $T > 0$. Also $K_{x_0}(T)$ has a non-empty interior if and only if $\mathcal{L}$ is controllable.

Proof

The subset of $L_2[0, T]$ of admissible controllers is a convex set which is weakly compact (note: a closed ball in $L_2$ is weakly compact; also the weak limit of a non-negative function is non-negative and this can be used to prove that the weak limit of admissible controllers is itself admissible). Since integration is a linear compact operator from $L_2$ to $R^n$, we find that $K_{x_0}(T)$ is convex and compact. The continuity (Hausdorff metric) is evident from the boundedness of $\dot{x}(t)$.

The controllability of $\mathcal{L}$ yields an interior of $K_{x_0}(T)$ just as is proved in earlier theorems where there is no restraint on $u$. For details see Lee-Markus.

Definition. The set of null-controllability for

\[ \mathcal{L}) \quad \dot{x} = Ax + Bu, \qquad u(t) \subset \Omega \subset R^m \]

is the subset of $R^n$ consisting of all states that can be steered to the origin in finite times.

Theorem. Consider as above,

\[ \mathcal{L}) \quad \dot{x} = Ax + Bu, \qquad x \in R^n, \ u \subset \Omega \subset R^m \]

Then the set of null-controllability is all $R^n$ provided both

i) All eigenvalues of $A$ have negative real parts

ii) $\operatorname{rank}\,[B, AB, A^2 B, \ldots, A^{n-1} B] = n$.

Example. Consider the damped linear oscillator with control

\[ \ddot{x} + 2\beta \dot{x} + k^2 x = u(t), \qquad \beta > 0, \ k > 0, \ |u(t)| \le 1 \]

Then every initial state $(x_0, y_0 = \dot{x}_0)$ lies in the set of null-controllability, which is all of $R^2$.
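Both hypotheses of the theorem can be checked directly for this oscillator. The sketch below assumes the phase-plane form with state $(x, y = \dot{x})$, so that $A = \begin{pmatrix} 0 & 1 \\ -k^2 & -2\beta \end{pmatrix}$ and $b = (0, 1)'$; the function name `oscillator_null_controllable` is a hypothetical helper.

```python
import cmath


def oscillator_null_controllable(beta, k):
    """Check hypotheses i) and ii) for x'' + 2 beta x' + k^2 x = u, |u| <= 1."""
    # i) Eigenvalues solve lambda^2 + 2 beta lambda + k^2 = 0.
    root = cmath.sqrt(beta * beta - k * k)
    eigenvalues = (-beta + root, -beta - root)
    stable = all(lam.real < 0 for lam in eigenvalues)
    # ii) [b, Ab] = [[0, 1], [1, -2 beta]] has determinant -1 for every beta.
    det = 0 * (-2 * beta) - 1 * 1
    return stable and det != 0
```

For $\beta > 0$ and $k > 0$ both conditions hold, while a negative damping coefficient violates condition i).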

X. TIME-OPTIMAL CONTROL OF AUTONOMOUS LINEAR SYSTEMS

Again we consider an autonomous linear control process

\[ \mathcal{L}) \quad \dot{x} = Ax + Bu, \qquad x \in R^n, \ u \in \Omega \]

where $\Omega$ is a convex compact neighbourhood of the origin in $R^m$. Let an initial state $x_0$ lie in the null-controllability set so $x_0$ can be steered to the origin in finite time by some admissible controller $u(t) \subset \Omega$ on $0 \le t \le t_1$.

Definition. Among all admissible controllers $u(t) \in \Omega$ on various $[0, t_1]$ steering $x_0$ to $x_1 = 0$, a time-optimal controller $u^*(t)$ on $[0, t^*]$ is defined by the requirement $t^* = \inf t_1$.

Theorem. Consider the autonomous linear process

\[ \mathcal{L}) \quad \dot{x} = Ax + Bu, \qquad x \in R^n, \ u \in \Omega \]

with initial state $x_0$ in the set of null-controllability. Then there exists a time-optimal controller $u^*(t)$ on $0 \le t \le t^*$ steering $x_0$ to the origin in minimal time $t^*$.

Proof

Take $x_0 \ne 0$ and consider the moving set of attainability $K_{x_0}(t)$ for $t > 0$. Since $K_{x_0}(t)$ is always a compact set in $R^n$, and it moves continuously with $t$, we find that $K_{x_0}(t)$ meets $x_1 = 0$ for a first time $t^* > 0$. The corresponding controller $u^*(t)$ which yields this point $x_1 \in K_{x_0}(t^*)$ is the required optimal controller. Q.E.D.

If $\Omega$ is a segment $|u| \le 1$, and $\mathcal{L}$ is controllable, then $u^*(t)$ is the unique (almost everywhere) controller steering $x_0$ to $x_1 = 0$ in minimal time $t^*$. This will follow from the characterization of $u^*(t)$ by means of the maximal principle of Pontryagin, as proved below.

Theorem. (Max. Principle). Consider the autonomous linear control system in $R^n$

\[ \mathcal{L}) \quad \dot{x} = Ax + Bu \]

with controllers $u(t) \subset \Omega$, a convex compact neighbourhood of the origin in $R^m$. Let $x_0$ lie in the set of null-controllability so that $x_0$ can be steered to the origin $x_1 = 0$ in finite time, and so there exists an optimal controller $u^*(t)$ with response $x^*(t)$ for minimal time $0 \le t \le t^*$.

Then there exists a nonzero solution $\eta^*(t)$ of the adjoint linear system

\[ \dot{\eta} = -\eta A \]

for which

\[ \eta^*(t)\, B\, u^*(t) = \max_{u \in \Omega} \eta^*(t)\, B\, u \quad \text{almost everywhere on} \ 0 \le t \le t^* \]

Thus if we define the Hamiltonian function

\[ H(\eta, x, u) = \eta\,[Ax + Bu] \]

then $\eta^*(t)$, $x^*(t)$ satisfy

\[ \dot{x} = \frac{\partial H^*}{\partial \eta}, \qquad \dot{\eta} = -\frac{\partial H^*}{\partial x}\,(\eta, x, t) \]

where

\[ H^*(\eta, x, t) = H(\eta, x, u^*(t)) \]

and

\[ H(\eta^*(t), x, u^*(t)) = \max_{u \in \Omega} H(\eta^*(t), x, u) \]

Proof

We need only show that

\[ \eta^*(t)\, B\, u^*(t) = \max_{u \in \Omega} \eta^*(t)\, B\, u \]

almost everywhere, for a non-trivial adjoint row vector $\eta^*(t)$, since the other assertions concerning the Hamiltonian are merely re-statements. This maximal principle will be an analytical statement of the geometric condition that $x_1 = 0$ necessarily lies on the boundary $\partial K$ of the convex compact set of attainability $K_{x_0}(t^*)$.

Since $K_{x_0}(t^*)$ is convex, choose an outward unit normal vector $\eta_1$ (normal to a supporting hyperplane) at $x_1 \in \partial K$. Then $x_1$ is the point of $K_{x_0}(t^*)$ farthest in the direction of $\eta_1$, that is,

\[ \eta_1 x_1 \ge \eta_1 x \quad \text{for all} \ x \in K_{x_0}(t^*) \]

or

\[ \eta_1 e^{At^*} x_0 + \eta_1 e^{At^*} \int_0^{t^*} e^{-As} B u^*(s)\, ds \ \ge \ \eta_1 e^{At^*} x_0 + \eta_1 e^{At^*} \int_0^{t^*} e^{-As} B u(s)\, ds \]

for all admissible controllers $u(s)$ on $0 \le s \le t^*$.

Let

\[ \eta^*(t) = (\eta_1 e^{At^*})\, e^{-At}, \quad \text{so} \ \eta^*(t^*) = \eta_1 \]

be a non-trivial solution of the adjoint differential system $\dot{\eta} = -\eta A$, for a row $n$-vector $\eta$. Then we obtain the maximal principle:

\[ \int_0^{t^*} \eta^*(s)\, B\, u^*(s)\, ds \ \ge \ \int_0^{t^*} \eta^*(s)\, B\, u(s)\, ds \]

But this yields the pointwise result:

\[ \eta^*(s)\, B\, u^*(s) = \max_{u \in \Omega} \eta^*(s)\, B\, u \]

For if such an equality failed on some set $S$ of positive duration, we could define a new controller

\[ \hat{u}(s) = \begin{cases} \text{a maximizer of} \ \eta^*(s)\, B\, u \ \text{over} \ u \in \Omega & \text{for} \ s \in S \\ u^*(s) & \text{otherwise} \end{cases} \]

Then $\hat{u}(s)$ (which is an admissible controller) would contradict the above integral inequality for the maximal principle. Q.E.D.

Corollary. For the case of a scalar controller where $m = 1$ so $\Omega$ is a segment, say $|u| \le 1$, the maximal principle asserts

\[ u^*(t) = \operatorname{sgn} \eta^*(t)\, b \]

Hence $u^*(t)$ is a bang-bang controller with a finite number of switches, and $u^*(t)$ is uniquely specified by the maximal principle, provided

\[ \det\,[\,b, Ab, A^2 b, \ldots, A^{n-1} b\,] \ne 0 \]

Proof

For a time-optimal controller $u^*(t)$ on $0 \le t \le t^*$ we must have

\[ \eta^*(t)\, b\, u^*(t) = \max_{u \in \Omega} \eta^*(t)\, b\, u = |\eta^*(t)\, b| \]

Thus

\[ u^*(t) = \operatorname{sgn} \eta^*(t)\, b \]

Now assume that $\mathcal{L}$ is controllable so

\[ \det\,[\,b, Ab, A^2 b, \ldots, A^{n-1} b\,] \ne 0 \]

Let $u_1(t)$ be any admissible controller steering $x_0$ to $x_1 = 0$ on $0 \le t \le t^*$. Then $u_1(t)$ also satisfies the maximal principle so

\[ \eta^*(t)\, b\, [u^*(t) - u_1(t)] = 0 \quad \text{almost everywhere} \]

If $u^*(t) \ne u_1(t)$ on some set $\Sigma$ of positive duration, then

\[ \eta^*(t)\, b = 0 \quad \text{for} \ t \in \Sigma \]

Since $\eta^*(t) = (\eta_1 e^{At^*})\, e^{-At}$ is a real analytic vector, and since $\Sigma$ contains an infinite set of points,

\[ \eta^* e^{-At} b = 0 \quad \text{for all} \ 0 \le t \le t^* \]

where we set $\eta^* = \eta_1 e^{At^*}$ as a nonzero row vector. Then at $t = 0$ we have $\eta^* b = 0$. Differentiate and set $t = 0$ to get $\eta^* A b = 0$ and continue to find

\[ \eta^*\, [\,b, Ab, A^2 b, \ldots, A^{n-1} b\,] = 0 \]

But this contradicts the controllability assumption for $\eta^* \ne 0$. Thus

\[ u_1(t) = u^*(t) \quad \text{almost everywhere} \]

In other words, there is a unique optimal controller for the minimal time $t^*$. Moreover any controller $u_1(t)$ on $0 \le t \le t^*$ which satisfies the maximal principle must be this optimal $u^*(t)$. Q.E.D.

More general bang-bang theorems are available for the restraint set $\Omega$ as a polyhedron in $R^m$; see the text of Lee-Markus.

XI. SWITCHING LOCUS FOR THE SYNTHESIS OF OPTIMAL CONTROLLERS

Consider an autonomous linear system in $R^n$

\[ \mathcal{L}) \quad \dot{x} = Ax + Bu \]

for $u \subset \Omega$, a convex compact neighbourhood of the origin in $R^m$. For any initial state $x_0$ in the set of null-controllability in $R^n$, there exists an optimal controller $u^*(t)$ steering $x^*(t)$ from $x_0$ to $x_1 = 0$ in minimal time $t^*$. Also $u^*(t)$ satisfies the maximal principle

\[ \eta^*(t)\, B\, u^*(t) = \max_{u \in \Omega} \eta^*(t)\, B\, u \quad \text{almost everywhere on} \ 0 \le t \le t^* \]

for some non-trivial adjoint vector $\eta^*(t)$ solving

\[ \dot{\eta} = -\eta A \]

In the case when $B = b$ is a vector, so $m = 1$ for scalar controllers $|u(t)| \le 1$, and when $\mathcal{L}$ is controllable, the optimal controller is characterized by the maximal principle as a bang-bang controller with switches only at the finite set of zeros of $\eta^*(t)\, b$. That is,

\[ u^*(t) = \operatorname{sgn} \eta^*(t)\, b \]

We illustrate this theory by constructing the switching locus, and synthesizing the optimal controller as a feedback control, for the case of the linear oscillator

\[ \ddot{x} + x = u \]

or the system in $R^2$

\[ \mathcal{L}) \quad \dot{x} = y, \qquad \dot{y} = -x + u \]

with $\Omega : |u| \le 1$.

Here

\[ \begin{pmatrix} \dot{x} \\ \dot{y} \end{pmatrix} = A \begin{pmatrix} x \\ y \end{pmatrix} + b\,u, \qquad A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \quad b = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \]

Thus the adjoint equation for $\eta = (\eta_1, \eta_2)$ is

\[ \dot{\eta} = -\eta A \quad \text{or} \quad (\dot{\eta}_1, \dot{\eta}_2) = -(\eta_1, \eta_2) \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \]

Thus

\[ \dot{\eta}_1 = \eta_2, \quad \dot{\eta}_2 = -\eta_1, \quad \text{so} \quad \ddot{\eta}_2 + \eta_2 = 0 \]

Hence $\eta_2(t)$ is a sinusoid with a constant duration of $\pi$ between successive zeros.

The maximal principle now asserts that

\[ u^*(t) = \operatorname{sgn}\, (\eta_1^*(t), \eta_2^*(t))\, b = \operatorname{sgn} \eta_2^*(t) \]

Hence the optimal controller $u^*(t)$ switches between the extreme values $+1$ and $-1$ every $\pi$ units of time. The main difficulty in finding $u^*(t)$, steering a given point $(x_0, y_0)$ to $(0, 0)$ in minimal time $t^*$, is that we do not have the initial (or terminal) data for $\eta_2^*(t)$. But we do know that each extremal controller (satisfying the maximal principle) acts as the unique optimal controller for some initial point in $R^2$. Thus, we proceed to construct all possible extremal controllers, and observe the corresponding extremal responses, to find the correct optimal controller for $(x_0, y_0)$.

T he o p t im a l re sp o n se f r o m ( u ) to the o r ig in m u s t fo llo w a r c s of\Уо /

the s o lu tio n s of the e x tre m a l d if fe r e n t ia l s y s te m s

x = у x = у

5Í) and 5J)

у = -x - 1 y = - x + 1

S ince the e x tre m a l d if fe r e n t ia l s y s te m s Sf. and 5^ a re au to no m o us , we can

c o n s tru c t e x tre m a l re sp o n se s th a t te rm in a te at the o r ig in by a p ro ce ss of

48 MARKUS

b ack in g out o f the o r ig in as -t in c re a s e s . T hat is , we s ta r t the e x tre m a l

re spo nse a t t = 0 a t the o r ig in and fo llo w the a p p ro p r ia te so lu tio n c u rves o f

5? and b a ck w a rd s in t im e (sw itch ing eve ry n u n its o f t im e ) to re a ch the

p o in t (^y0^) a t som e ne ga tive v a lu e o f t = - t*. Then r e v e rs e the t im e sense

and s ta r t f r o m (^y°^) a t t = 0 to a r r iv e at the o r ig in a t t = t* , th us o b ta in in g

the o p t im a l re sponse

u *(t).

I f we s ta r t a t the o r ig in a t t = 0 w ith г)г = 1 , rj2 = 0, th en rj2 (t) = - s in t and,

on the in te r v a l -it < t < 0 we have sgn r¡2 (t) = +1. The c o rre s p o n d in g e x tre m a l

re sponse t r a c e s out the s o lu t io n curve of th ro ug h the o r ig in , th a t is ,

X = - COS t +1 ОП -7Г < t < 0

Г :+у = s in t

o r

(х - l )2 + y2 = 1, y < О

O n the o the r hand i f we s ta r t w ith rj1 = -1 , rj2 = 0, then n 2 (t) = s in t and , on

-7Г < t < 0, we have sgn rj2 (t) = -1. The c o rre s p o n d in g e x tre m a l response

curve of SC is

p _ X = COS t -1 ОП -7Г < t < 0

у = - s in t

o r

(x + 1 ) 2 + y2 = 1 , у > 0

H o w e ver , fo r any o th e r cho ice o f ( r^ , rj2) a t t = 0 w ith r)2 > 0, the

e x tre m a l re spo nse t r a c e s b ack f r o m the o r ig in a lo ng Г+ u n t i l rj2 (t) = 0. A t

th is p o in t the e x tre m a l re sponse sw itches to a s o lu t io n o f SC_ w h ich i t

fo llo w s fo r a d u ra t io n of le ng th ir be fo re sw itc h in g a g a in to a so lu tio n of

A s im i la r p ro ce s s o c cu rs fo r in i t ia l d a ta w ith r]2 {0) < 0 but he re the e x tre m a l

re spo n se s ta r ts b ack f r o m the o r ig in a lo n g Г . .

T he sw itc h in g lo cu s W., w h ich c o n s is ts of a l l the po in ts w here a l l the

above e x tre m a l re spo n ses sw itch betw een the fa m i l ie s SC_ and SC is not

d if f ic u lt to d e s c r ib e in th is e x am p le . In fa c t, 'W is m ade up of the a rc s

and П and th e ir su c ce s s iv e tr a n s p o r ta t io n s a lo n g the a p p ro p r ia te so lu tio n

f a m i l ie s o f SC and S C fo r d u ra t io n s of le n g th ir. N ote th a t such a t r a n s p o r ta

t io n is ju s t a r ig id ro ta t io n o f the phase p la n e th ro u g h ir r a d ia n s abou t the

c o r re s p o n d in g cen tre (1, 0) o r (-1, 0). T h is p ro ce s s of su c ce ss iv e sw itc h in g

le a d s to a f in a l c o n s tru c t io n o f 'Ж 'as a co n tin uo us cu rve over the e n tire

x-ax is , co m po sed of s e m i- c ir c le s of r a d iu s one, as p ic tu re d in F ig . 5.

D e f in e the sy n th es is fo r r e a l (x, y) f (0, 0) by

Г -1 i f (x, y) l ie s above W o r on Г .

Ф (х<у) = j 0 i f (x, y) l ie s on 0Г o the rw ise

[ +1 i f (x, y) l ie s be low 9Г o r on Г +

x * ( "t )j , the o p t im a l t im e t* , and the o p t im a l c o n t ro lle r

IA E A -S M R -1 7 /6 49

+ = -1У

\

Ц/=+1

FIG.5. Switching locusOS^for x + x = u.

T hen the o p t im a l c o n tro l is g iven by the feedback

u = Ф (х , у)

and each o p t im a l response is a s o lu t io n of

x + x = Ф (х , y)
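The feedback synthesis can be tried out numerically. The following is only a sketch under stated assumptions: the function names, the fixed-step Runge-Kutta integration and the start point (1, 1) are illustrative choices, and on the locus itself the sketch uses the sign of the final arc rather than the value 0.

```python
import math

def switching_locus(x):
    """Height of the switching curve W over the x-axis: unit semicircles
    below the axis for x > 0 and above the axis for x < 0."""
    r = math.fmod(abs(x), 2.0) - 1.0        # offset from the centre of the arc
    h = math.sqrt(max(0.0, 1.0 - r * r))
    return -h if x > 0 else h

def psi(x, y):
    """Synthesis: -1 above W, +1 below W.  Exactly on W the sketch picks the
    sign of the final arc (an assumption made for the simulation only)."""
    w = switching_locus(x)
    if y > w:
        return -1.0
    if y < w:
        return 1.0
    return 1.0 if x > 0 else -1.0

def closest_approach(x, y, t_end=3.0, dt=1e-3):
    """Integrate x' = y, y' = -x + psi(x, y) with RK4 steps (control held
    constant over each step) and return the smallest distance to the origin."""
    def f(px, py, u):
        return py, -px + u
    best = math.hypot(x, y)
    for _ in range(int(t_end / dt)):
        u = psi(x, y)
        k1 = f(x, y, u)
        k2 = f(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1], u)
        k3 = f(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1], u)
        k4 = f(x + dt * k3[0], y + dt * k3[1], u)
        x += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        y += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        best = min(best, math.hypot(x, y))
    return best
```

Starting from (1, 1) the response first follows a circle about (-1, 0), meets the arc (x - 1)^2 + y^2 = 1 at (1, -1), and then follows that arc into the origin, so `closest_approach(1.0, 1.0)` should be close to zero, limited by the step size.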

XII. LINEAR DYNAMICS WITH QUADRATIC COST OPTIMIZATION

Consider an autonomous linear control system in $R^n$

\[ \mathcal{L})\qquad \dot{x} = Ax + Bu \]

with admissible controllers $u(t)$ as m-vectors which are square-integrable on a given finite duration $0 \le t \le T$. That is, the only restraint on $u(t)$ is that

\[ \int_0^T \|u(t)\|^2 \, dt < \infty \]

where $\|u\|^2 = u'u$ in $R^m$. Then for a given initial state $x_0 \in R^n$ there exists a response $x(t)$ on $0 \le t \le T$ and we define the cost of the controller $u(t)$ to be

\[ C(u) = \int_0^T \left( \|x(t)\|_W^2 + \|u(t)\|_U^2 \right) dt \]

where $W = W' > 0$ and $U = U' > 0$ are positive-definite constant matrices used to define the norms $\|x\|_W^2 = x'Wx$, $\|u\|_U^2 = u'Uu$. We seek an optimal controller $u^*(t)$ minimizing the cost or performance functional $C(u)$ so

\[ C(u^*) = \inf C(u) . \]

By techniques similar to those of the minimal time-optimal problem it can be shown (see Lee-Markus) that there exists a unique optimal controller $u^*(t)$ and that it is characterized by the maximal principle:

\[ u^*(t) = U^{-1} B' \eta^*(t)' \quad \text{almost everywhere on } 0 \le t \le T \]

where $\eta^*(t)$ is a row n-vector solution of

\[ \dot{\eta} = x'W - \eta A, \qquad \eta(T) = 0 \]

with

\[ \dot{x} = Ax + B U^{-1} B' \eta', \qquad x(0) = x_0 . \]

We can express this optimal control through a feedback gain matrix $E^*(t)$, independent of $x_0$, so

\[ u^*(t) = E^*(t)\, x^*(t) . \]

Here (see Lee-Markus for details) it can be established that

\[ E^*(t) = U^{-1} B' E(t) \]

where $E(t)$ is the solution of the matrix Riccati differential equation

\[ \dot{E} = W - A'E - EA - E B U^{-1} B' E \]

with the terminal data $E(T) = 0$. Using this gain matrix $E^*(t) = U^{-1} B' E(t)$, which can be computed in advance of the control programme, we obtain the optimal response $x^*(t)$ as a solution of

\[ \dot{x} = Ax + B\,[\,U^{-1} B' E(t)\, x\,], \qquad x(0) = x_0 . \]

In the limiting case where $T = +\infty$ the Riccati differential equation is replaced by the quadratic equation

\[ A'E + EA + E B U^{-1} B' E = W \]

which has a unique symmetric negative definite solution $E$. Then a unique optimal controller $u^*(t)$ exists (here we must assume that $\mathcal{L}$ is controllable in $R^n$) and can be determined by a feedback synthesis

\[ u^*(t) = U^{-1} B' E\, x^*(t) \]

so $x^*(t)$ satisfies

\[ \dot{x} = (A + B U^{-1} B' E)\, x, \qquad x(0) = x_0 . \]

The minimal cost for the initial state $x_0$ is just

\[ C(u^*) = -x_0' E x_0 \]

which is a positive definite quadratic form in $x_0$.
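As an illustration of how the gain can be "computed in advance", here is a minimal sketch for the scalar case n = m = 1. The function name and the choice A = 0, B = W = U = 1 in the usage note are assumptions made for the example; in that case the Riccati equation reads dE/dt = 1 - E^2 with E(T) = 0, whose solution is E(t) = -tanh(T - t).

```python
import math

def riccati_backward(a, b, w, u, T, steps=20000):
    """Integrate the scalar Riccati equation
        dE/dt = w - 2*a*E - (b*b/u)*E*E,   E(T) = 0,
    backwards from t = T to t = 0 with RK4 and return E(0)."""
    def rhs(E):
        return w - 2.0 * a * E - (b * b / u) * E * E
    h = T / steps
    E = 0.0                                  # terminal data E(T) = 0
    for _ in range(steps):
        # In the reversed time s = T - t the derivative is dE/ds = -rhs(E).
        k1 = -rhs(E)
        k2 = -rhs(E + 0.5 * h * k1)
        k3 = -rhs(E + 0.5 * h * k2)
        k4 = -rhs(E + h * k3)
        E += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return E
```

For A = 0, B = W = U = 1 the value `riccati_backward(0.0, 1.0, 1.0, 1.0, T)` approaches -1 as T grows, which is the negative root of the quadratic equation E^2 = 1; the feedback is then u = -x and the minimal cost -x_0 E x_0 tends to x_0^2.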

It is interesting to derive this result by the intuitive methods of dynamic programming. Let us call the minimal cost from $x_0$ on $[t_0, T]$ to be

\[ V(x_0, t_0) = \min_{u \in L_2[t_0, T]} \int_{t_0}^{T} \left( \|x\|_W^2 + \|u\|_U^2 \right) dt . \]

Then each controller $u(t)$ on $[t_0, T]$ with corresponding response $x_u(t)$ initiating at $x_0$ at $t = t_0$ has cost

\[ \int_{t_0}^{t_0 + \delta} \left( \|x_u\|_W^2 + \|u\|_U^2 \right) dt + \int_{t_0 + \delta}^{T} \left( \|x_u\|_W^2 + \|u\|_U^2 \right) dt \]

where $\delta > 0$ is an arbitrarily small number. By modifying $u(t)$ to become optimal on $[t_0 + \delta, T]$ we obtain the cost

\[ \int_{t_0}^{t_0 + \delta} \left( \|x_u\|_W^2 + \|u\|_U^2 \right) dt + V(x_u(t_0 + \delta),\, t_0 + \delta) . \]

Thus

\[ V(x_0, t_0) = \min_{u(t) \in L_2[t_0, T]} \left\{ \int_{t_0}^{t_0 + \delta} \left( \|x_u\|_W^2 + \|u\|_U^2 \right) dt + V(x_u(t_0 + \delta),\, t_0 + \delta) \right\} . \]

This equation illustrates the idea of dynamic programming in that we think of the optimal-control program decomposed into the sum of two programs, on $[t_0, t_0 + \delta]$ and on $[t_0 + \delta, T]$, with the dynamics indicated by the possibility of varying $\delta$. We ignore problems of differentiability and convergence and proceed formally.

Using a Taylor series expansion in terms of the small number $\delta > 0$ we obtain

\[ V(x_0, t_0) = \min_{u(t)} \left\{ \left[ \|x_0\|_W^2 + \|u(t_0)\|_U^2 \right] \delta + V(x_0, t_0) + \frac{\partial V}{\partial x}(x_0, t_0)\, \frac{dx_u}{dt}(t_0)\, \delta + \frac{\partial V}{\partial t}(x_0, t_0)\, \delta + o(\delta) \right\} . \]

Write

\[ \frac{dx_u}{dt}(t_0) = A x_0 + B u(t_0) \]

and let $\delta \to 0$ to obtain

\[ -\frac{\partial V}{\partial t} = \min_{u} \left\{ \|x\|_W^2 + \|u\|_U^2 + \frac{\partial V}{\partial x}\,[Ax + Bu] \right\} \]

where we have written the generic point $(x_0, t_0)$ as $(x, t)$.

Thus we must compute the minimum of the real function

\[ h(u) = \|u\|_U^2 + \frac{\partial V}{\partial x}\,[Ax + Bu] \]

for each fixed $(x, t)$. But

\[ \frac{\partial h}{\partial u} = 2u'U + \frac{\partial V}{\partial x} B \]

so take

\[ u = -\tfrac{1}{2}\, U^{-1} B' \left( \frac{\partial V}{\partial x} \right)' . \]

Hence $V(x, t)$ must be the solution of the non-linear partial differential equation

\[ \frac{\partial V}{\partial t} + \|x\|_W^2 + \frac{\partial V}{\partial x} A x - \tfrac{1}{4}\, \frac{\partial V}{\partial x}\, B U^{-1} B' \left( \frac{\partial V}{\partial x} \right)' = 0 \]

with terminal data

\[ V(x, T) = 0 . \]

In the case $T = +\infty$ we can verify that

\[ V(x, t) = -x'Ex \]

is a solution of this Bellman dynamic programming partial differential equation just in case $E$ satisfies

\[ W = E B U^{-1} B' E + A'E + EA \]

as specified earlier.

CONCEPTS OF STABILITY AND CONTROL

P.C. PARKS
Control Theory Centre,
University of Warwick,
Coventry, United Kingdom

IAEA-SMR-17/1

Abstract

CONCEPTS OF STABILITY AND CONTROL.
The first part of the paper deals with the role transfer functions play in control problems (closed loop, Nyquist stability criterion, sampled data systems and z-transforms; the "hog cycle"; spring oscillations; Lyapunov functions; the Zubov method; positive-real functions and the Popov criterion; the circle criterion; linear time-delay systems; equations with periodic coefficients; stability of repeated processes). In the second part the author considers the control of systems which are described by partial differential equations (heat-conduction equation; wave equation; control of the heat and the wave equations; parasitic oscillations; noise in linear systems; discrete noise processes). Many examples are given and briefly discussed.

1. CONCEPTS OF STABILITY AND CONTROL; TRANSFER FUNCTIONS

The mathematical theory of control has always been prompted by practical considerations and much of the development of theory has been carried out by electrical and mechanical engineers, who have introduced a number of concepts and words which are not always familiar to the mathematician brought up on traditional courses in pure and applied mathematics. The mathematician has made important contributions at critical stages in the development of control theory, for example Maxwell (stability, 1868), Hurwitz (stability, 1895), Wiener (random processes, 1941) and Pontryagin (optimal control, 1950), but these developments have depended on a good liaison between the engineer and the mathematician, for example between the turbine engineer Stodola and the mathematician Hurwitz. It is important therefore for mathematicians interested in control theory to familiarize themselves with these control engineering concepts.

One such concept which appears in any book or paper on control engineering is that of the transfer function.

Consider the tower crane shown in Fig. 1.1. The load B is suspended freely by a cable of length $\ell$ from a horizontally moving trolley A. Considering movements in the plane of the tower and jib we wish to set up a mathematical relationship between the "input" to the system which is the distance CA = x of the trolley from the datum point C, and the horizontal movement of the load B measured as DB = y as shown. Considering small motions of the load (so that $y - x \ll \ell$) we obtain the equation of motion

\[ m \frac{d^2 y}{dt^2} + mc \frac{dy}{dt} + T\, \frac{y - x}{\ell} = 0 \]

where $m$ is the mass of the load, $T = mg$ is the tension in the cable and $mc$ is a damping constant representing aerodynamic damping of the motion (probably rather small).

FIG. 1.2. Transfer function (1.2): block diagram with input x(t), block $K/(aD^2 + cD + K)$ and output y(t).

FIG. 1.3. Frequency response of (1.1).

We obtain

\[ \frac{d^2 y}{dt^2} + c \frac{dy}{dt} + \frac{g}{\ell}\, y = \frac{g}{\ell}\, x \qquad (1.1) \]

as the second order differential equation relating the "output" y(t) to the "input" x(t). Given x(t) and initial conditions on y(t) and dy(t)/dt, Eq. (1.1) may be solved by classical methods.

Now the control engineer has developed the useful concept of the transfer function to represent this input/output relationship. Writing D for d/dt we can say that

\[ y = \frac{K}{aD^2 + cD + K}\, x \qquad (1.2) \]

where $K = g/\ell$, $a = 1$, and the expression $K/(aD^2 + cD + K)$ is the transfer function between y and x. It is also usual to draw a "block diagram" of this relationship as shown in Fig. 1.2. It is only necessary to cross-multiply Eq. (1.2) to recover the differential equation (1.1).

The transfer function plays two other useful roles. If D is replaced by $i\omega$ we obtain the system frequency response which relates the output to input when the input is a sine-wave at frequency $\omega$ (rad/s). This is shown plotted on the Argand diagram in Fig. 1.3. For a given $\omega$ the radius OP gives the ratio of output amplitude to input amplitude and the angle AOP represents the phase lag angle of the output sinusoid relative to the input sinusoid.
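The frequency response can be evaluated directly from the transfer function. The short sketch below substitutes i*omega for D in K/(aD^2 + cD + K); the function name and the numerical values used in the usage note are illustrative assumptions.

```python
import cmath

def frequency_response(omega, K, c, a=1.0):
    """Return (amplitude ratio, phase lag in radians) of K/(a D^2 + c D + K)
    with D replaced by i*omega."""
    s = 1j * omega
    G = K / (a * s * s + c * s + K)
    return abs(G), -cmath.phase(G)
```

With K = 4 and c = 0.5, the call `frequency_response(2.0, 4.0, 0.5)` gives an amplitude ratio of 4 and a phase lag of a quarter cycle, which is the resonance of the lightly damped crane load at omega = sqrt(K).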

FIG. 1.4. Feedback system.

FIG. 1.5. Unit feedback.

If D is replaced by s as used in Laplace transform notation, the transfer function then represents the Laplace transform of the unit impulse response of the system, that is the Laplace transform of y(t) when x(t) is a unit impulse at t = 0+, the system starting with zero initial conditions on y(t) and dy/dt. The unit impulse response h(t) (t > 0) is a useful concept since it may be used to write down the solution y(t) for a very general input x(t) in the form of a "convolution integral"

\[ y(t) = \int_{\tau = 0}^{t} h(\tau)\, x(t - \tau)\, d\tau . \]

It often happens that we build up a feedback system involving a number of transfer functions $G_1(D)$, $G_2(D)$, etc., where each $G_i(D)$ is a ratio of polynomials in D as in Fig. 1.4. Here we may operate by the ordinary rules of algebra to obtain the "closed-loop transfer function" relating y(t) to x(t) which is, in this case,

\[ \frac{y}{x} = \frac{G_2(D)\, G_1(D)}{1 + G_2(D)\, G_1(D)\, G_3(D)} . \]

(The circle $\otimes$ represents a subtraction to form the "error signal" which is $x(t) - G_3(D)\,y(t)$, hence $y(t) = G_2(D)\, G_1(D)\, [\,x(t) - G_3(D)\, y(t)\,]$.)

A special case is shown in Fig. 1.5 where we call G(D) the open-loop transfer function and G(D)/(1 + G(D)) is the closed-loop transfer function.

Returning to the differential equation (1.1) and thinking about the classical method of solution where y(t) = particular integral + complementary function, we observe the complementary functions, or transient solutions in the language of control engineers, determined by

\[ \frac{d^2 y}{dt^2} + c \frac{dy}{dt} + \frac{g}{\ell}\, y = 0 . \]

This is satisfied by $y = A e^{\lambda t}$ where

\[ \lambda^2 + c\lambda + \frac{g}{\ell} = 0 \]

or, looking at the transfer function

\[ \frac{g/\ell}{D^2 + cD + g/\ell} \]

by the denominator of the transfer function replacing D by $\lambda$ and equating to zero. Replacing D by s and regarding s as a complex number the roots in $\lambda$ are poles of the transfer function

\[ \frac{g/\ell}{s^2 + cs + g/\ell} . \]

If the transient solutions are to die away as time increases, then the roots in $\lambda$ must be real and negative, or complex with negative real parts. Alternatively "the poles must lie in the left-hand half of the complex plane". This is a general requirement for stability of the transfer function, and especially when it is the transfer function of a closed-loop system.
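The pole condition is easy to check for the crane example. The sketch below solves the quadratic for the poles and tests whether they lie in the left-hand half-plane; the function names and the default values of g and the cable length are illustrative assumptions.

```python
import cmath

def crane_poles(c, g=9.81, length=10.0):
    """Roots of lambda^2 + c*lambda + g/length = 0, i.e. the poles of
    (g/length) / (s^2 + c*s + g/length)."""
    k = g / length
    disc = cmath.sqrt(c * c - 4.0 * k)
    return (-c + disc) / 2.0, (-c - disc) / 2.0

def transients_decay(poles):
    """True when every pole has a strictly negative real part."""
    return all(p.real < 0.0 for p in poles)
```

For small positive damping c the poles form a complex pair just to the left of the imaginary axis, so the transients are oscillatory but decay; with c = 0 the poles sit on the axis and the oscillation persists.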

The behaviour of the system also depends on the transient solutions; for example, these may be very oscillatory even though eventually damped out. A useful test, theoretically and practically, is the unit-step response of the system. The unit-step responses of the second-order system

\[ (D^2 + 2\zeta\omega_n D + \omega_n^2)\, y = \omega_n^2\, x \]

(a standard form for second-order systems) for various values of $\zeta$ are shown in Fig. 1.6.

FIG. 1.6. Step responses of the second-order system $(D^2 + 2\zeta\omega_n D + \omega_n^2)\, y = \omega_n^2\, x$.

A classical stability problem solved by Hermite (1854), Routh (1877) and Hurwitz (1895) was to find necessary and sufficient conditions on the coefficients of the linear differential equation

\[ \frac{d^n y}{dt^n} + a_1 \frac{d^{n-1} y}{dt^{n-1}} + a_2 \frac{d^{n-2} y}{dt^{n-2}} + \dots + a_{n-1} \frac{dy}{dt} + a_n y = 0 \]

for damped or stable transients.

Hermite's solution, which deserves to be better known, is that the $n \times n$ matrix

\[ H = \begin{bmatrix}
a_n a_{n-1} & 0 & a_n a_{n-3} & \cdots & & \\
0 & \ddots & & & & \\
\vdots & & \ddots & & & \\
 & & & a_5 - a_1 a_4 + a_2 a_3 & 0 & a_3 \\
 & & & 0 & a_1 a_2 - a_3 & 0 \\
 & & & a_3 & 0 & a_1
\end{bmatrix} \]

be positive definite (that is the matrix of a positive-definite quadratic form).
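For n = 3 the Hermite condition can be checked by hand or with a few lines of code. The sketch below assumes that the 3 x 3 Hermite matrix for y''' + a1 y'' + a2 y' + a3 y = 0 is [[a2 a3, 0, a3], [0, a1 a2 - a3, 0], [a3, 0, a1]] (a3 a2 in the top-left corner and a1 in the bottom-right corner, following the pattern of the general matrix) and tests positive definiteness through the leading principal minors; the function names are illustrative.

```python
def hermite_matrix_3(a1, a2, a3):
    """Assumed Hermite matrix for y''' + a1*y'' + a2*y' + a3*y = 0."""
    return [[a2 * a3, 0.0, a3],
            [0.0, a1 * a2 - a3, 0.0],
            [a3, 0.0, a1]]

def leading_minors_3(H):
    """Leading principal minors of a 3 x 3 matrix."""
    m1 = H[0][0]
    m2 = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    m3 = (H[0][0] * (H[1][1] * H[2][2] - H[1][2] * H[2][1])
          - H[0][1] * (H[1][0] * H[2][2] - H[1][2] * H[2][0])
          + H[0][2] * (H[1][0] * H[2][1] - H[1][1] * H[2][0]))
    return m1, m2, m3

def is_positive_definite_3(H):
    """Sylvester's criterion: all leading principal minors positive."""
    return all(m > 0 for m in leading_minors_3(H))
```

The minors are a2 a3, a2 a3 (a1 a2 - a3) and a3 (a1 a2 - a3)^2, so positive definiteness reduces to a1 > 0, a3 > 0 and a1 a2 > a3, the familiar Routh-Hurwitz conditions for a cubic. For example, (a1, a2, a3) = (6, 11, 6), whose roots are -1, -2, -3, passes, while (1, 1, 2) fails.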

FIG. 1.7. Contour C.

FIG. 1.8. Indented contour.

In the 1930s another technique was developed to investigate stability of closed-loop systems from a knowledge of the open-loop frequency response. This is the celebrated Nyquist stability criterion. If G(D) is the open-loop transfer function then the closed-loop transfer function is G(D)/(1 + G(D)) and we are thus investigating roots in s of the equation 1 + G(s) = 0. Now G(s) = q(s)/p(s), where p(s) and q(s) are polynomials, and so we are interested in the roots of p(s) + q(s) = 0. We now consider the contour C on the Argand diagram shown in Fig. 1.7. Applying "the principle of the argument" to the function 1 + G(s) of the complex variable s gives

\[ [\arg(1 + G(s))]_C = 2\pi (N - P) \]

where N = number of zeros of 1 + G(s) and P = number of poles of 1 + G(s) within C. Here we have assumed no zeros or poles actually lie on C itself. If the degree of p(s) is greater than that of q(s), then the contribution to $[\arg(1 + G(s))]$ on the arc of radius R is zero as $R \to \infty$ and we have to consider the change in argument of $1 + G(i\omega)$ as $\omega$ varies from $+\infty$ through 0 to $-\infty$, that is 1 + G(s) for $s = i\omega$ on the imaginary axis. If there are no poles of 1 + G(s), that is zeros of p(s), in the right-hand half-plane so that P = 0, and furthermore we require N = 0 for stability (so that no roots of 1 + G(s) = 0 have positive real parts), then $1 + G(i\omega)$ must not encircle the origin as $\omega$ varies from $+\infty$ through 0 to $-\infty$. Alternatively, $G(i\omega)$ must not encircle the point (-1, 0). This is the Nyquist stability criterion in its simplest form.
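The encirclement count can be estimated numerically. The sketch below is an illustration under assumptions not taken from the text: the open-loop transfer function G(s) = K/((s + 1)(s + 2)(s + 3)) is a hypothetical example with P = 0, and the frequency axis is sampled through omega = tan(theta) so that large frequencies are covered with a finite number of points. By the Routh-Hurwitz conditions the corresponding closed loop s^3 + 6s^2 + 11s + 6 + K = 0 is stable for 0 < K < 60.

```python
import cmath
import math

def open_loop(s, K):
    """Hypothetical open-loop transfer function with no right half-plane poles."""
    return K / ((s + 1.0) * (s + 2.0) * (s + 3.0))

def encirclements(K, samples=20000):
    """Net change of arg[1 + G(i*omega)] divided by 2*pi as omega runs from
    large positive values through 0 to large negative values."""
    total = 0.0
    prev = None
    for j in range(1, samples):
        theta = math.pi / 2.0 - math.pi * j / samples
        omega = math.tan(theta)              # decreasing from about +6000 to -6000
        z = 1.0 + open_loop(1j * omega, K)
        if prev is not None:
            total += cmath.phase(z / prev)   # small phase increment between samples
        prev = z
    return round(total / (2.0 * math.pi))
```

With P = 0 the returned value is N. For K = 10 the locus of G(i*omega) crosses the negative real axis at -K/60, to the right of -1, and there is no encirclement; for K = 100 it crosses at -5/3 and the count is 2, matching the two closed-loop roots with positive real parts.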

A modification that is quite often necessary occurs when p(s) has a zero at the origin so that G(s) has a pole there. It is then necessary to "indent" the contour C by a small semicircle as shown in Fig. 1.8. The behaviour of G(s) at infinity is determined by what happens as s describes the small semi-circle surrounding the origin. An example is shown in Fig. 1.9.

2. TRANSFER FUNCTIONS AND STABILITY

2.1. Sampled data systems and z-transforms

Discussion hitherto has concerned ordinary differential equations and continuous functions of time t. An important class of control systems uses discrete time inputs and outputs: these are known as sampled data control systems and a theory analogous to the Laplace transform for continuous systems has been built up based on the so-called z-transform. This theory may be applied also to difference equations which occur directly in some applications, for example, economic models.

FIG. 1.9. Nyquist criterion with an open-loop pole at s = 0, for an open-loop transfer function with denominator s(s + a). The four panels, each evaluated within the indented contour using $[\arg(1 + G(s))]_C = 2\pi(N - P)$, show: closed-loop system stable, P = 0, N = 0; closed-loop system unstable, P = 0, N = 1; closed-loop system unstable, P = 1, N = 2; closed-loop system unstable, P = 1, N = 1.

Difference equations describe the behaviour between input and output sequences, for example

\[ y_{n+1} + \tfrac{1}{2}\, y_n = x_n ; \qquad n = 0, 1, 2, 3, \dots \]

where the initial condition is $y_0 = 1$ and $x_n$ is the sequence (1, 1, 1, 1, ...). By a step-by-step process we obtain the solution as

\[ y_1 = \tfrac{1}{2}, \quad y_2 = \tfrac{3}{4}, \quad y_3 = \tfrac{5}{8}, \quad y_4 = \tfrac{11}{16} . \]
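The step-by-step process is a direct recursion, y_{n+1} = x_n - y_n/2. A minimal sketch using exact fractions follows; the function name is an illustrative assumption.

```python
from fractions import Fraction

def solve_step_by_step(y0, x, coeff=Fraction(1, 2)):
    """Iterate y[n+1] = x[n] - coeff * y[n] and return [y0, y1, ..., y_len(x)]."""
    y = [Fraction(y0)]
    for xn in x:
        y.append(Fraction(xn) - coeff * y[-1])
    return y
```

Calling `solve_step_by_step(1, [1, 1, 1, 1])` reproduces the values 1, 1/2, 3/4, 5/8, 11/16.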

The z-transform of a sequence $\{a_r\} = \{a_0, a_1, a_2, \dots\}$ is by definition the series

\[ \sum_{r=0}^{\infty} a_r z^{-r} = \hat{A}(z) . \]

The z-transform of a shifted sequence $\{a_{r+1}\} = \{a_1, a_2, a_3, \dots\}$ is

\[ \sum_{r=0}^{\infty} a_{r+1} z^{-r} = (\hat{A}(z) - a_0) \times z . \]

The difference equation may be solved by "taking sequences" and then "taking z-transforms" to obtain

\[ (Y(z) - y_0)\, z + \tfrac{1}{2}\, Y(z) = 1 + z^{-1} + z^{-2} + \dots = \frac{1}{1 - z^{-1}} = \frac{z}{z - 1} . \]

Hence

\[ Y(z) = \frac{y_0\, z}{z + \tfrac{1}{2}} + \frac{z}{(z - 1)(z + \tfrac{1}{2})} = \left( y_0 - \tfrac{2}{3} \right) \frac{z}{z + \tfrac{1}{2}} + \tfrac{2}{3}\, \frac{z}{z - 1} . \]

These two terms correspond to the "complementary function" and "particular integral" terms that we encounter in both differential and difference equations.

Now the z-transform of the geometric sequence

\[ \{1, a, a^2, a^3, \dots\} \quad \text{is} \quad 1 + a z^{-1} + a^2 z^{-2} + \dots = \frac{1}{1 - a z^{-1}} = \frac{z}{z - a} \]

so we may write the inverse transform of Y(z) to give the solution

\[ \{y_n\} = \left( y_0 - \tfrac{2}{3} \right) \{1, -\tfrac{1}{2}, \tfrac{1}{4}, -\tfrac{1}{8}, \dots\} + \tfrac{2}{3}\, \{1, 1, 1, 1, \dots\} \]

or when

\[ y_0 = 1, \qquad \{y_n\} = \{1, \tfrac{1}{2}, \tfrac{3}{4}, \tfrac{5}{8}, \dots\} . \]

(Note that $\lim_{n \to \infty} y_n = \tfrac{2}{3}$.)
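The inverse transform gives the closed form y_n = (y_0 - 2/3)(-1/2)^n + 2/3, which can be checked against the difference equation itself. The function name below is an illustrative assumption.

```python
from fractions import Fraction

def closed_form(n, y0=Fraction(1)):
    """y_n = (y0 - 2/3) * (-1/2)**n + 2/3, read off from the two terms of Y(z)."""
    return (y0 - Fraction(2, 3)) * Fraction(-1, 2) ** n + Fraction(2, 3)
```

The first term is the complementary function, which alternates in sign and decays, and the second is the particular integral, which is the limit 2/3.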

2.2. Sampled data systems

Consider the system shown in Fig. 2.1. A continuous input x(t) is sampled by the switch $S_1$ and the samples modelled as impulses are fed to the first-order system with the transfer function K/(s + a). The switch $S_1$ closes for a short time $\Delta$ at equally spaced times h apart. An impulse of magnitude $x(nh)\Delta$ is thus fed to the first-order system. The sampling may be organized so that $\Delta = 1$. The response of the first-order system to this impulse is $x(nh)\,\Delta\, k(t - nh)$ for t > nh, where k(t) for t > 0 is the unit impulse response of the first-order system.

FIG. 2.1. System with sampled input.

FIG. 2.2. Second-order continuous linear closed-loop system (a) stable for all K > 0, (b) stable for 0 < K < 2a coth(ah/2), with a > 0.

If we consider the output sampled by a second synchronized switch $S_2$ then we find that

\[ y(nh) = x(nh)\,\Delta\, k(0) + x((n-1)h)\,\Delta\, k(h) + x((n-2)h)\,\Delta\, k(2h) + \dots + x(0)\,\Delta\, k(nh) . \]
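The sum for y(nh) is a discrete convolution of the input samples with the sampled impulse response. The sketch below assumes k(t) = K exp(-at), the unit impulse response of K/(s + a); the function names are illustrative.

```python
import math

def impulse_response(t, K, a):
    """Unit impulse response of the first-order system K/(s + a) for t >= 0."""
    return K * math.exp(-a * t)

def sampled_output(x_samples, h, K, a, delta=1.0):
    """y(nh) = sum over j of x((n - j)h) * delta * k(jh), for n = 0, 1, 2, ..."""
    y = []
    for n in range(len(x_samples)):
        total = 0.0
        for j in range(n + 1):
            total += x_samples[n - j] * delta * impulse_response(j * h, K, a)
        y.append(total)
    return y
```

For a constant input the output samples are partial sums of a geometric series in exp(-ah), which is what the z-transform product in the next step expresses in closed form.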

Considering now the z-transform of {y(nh)} we find that

z-transform of sampled output = (z-transform of sampled input) × (z-transform of the sampled unit impulse response) × $\Delta$

Another arrangement is the "sample-and-hold" device which has $\Delta = 1$ followed by the transfer function $(1 - e^{-sh})/s$ which multiplies any subsequent transfer function such as K/(s + a) above.

We note that sampling can change the character of the system; for example, the second-order continuous linear closed-loop system shown in Fig. 2.2a is stable for all K > 0, whereas that of Fig. 2.2b, with $\Delta = 1$ and sampling interval h, is stable for 0 < K < 2a coth(ah/2) only.

This requires some analysis as follows:

For Fig. 2.2b:

(z-transform of output) = (z-transform of unit impulse response of K/s(s + a)) × (z-transform of input − z-transform of output)

and we are interested in the "closed-loop z-transfer function" H(z)/[1 + H(z)] where H(z) is the z-transform of the unit impulse response. Now the (continuous) unit impulse response of K/s(s + a) is the inverse Laplace transform of this function of s which is $(K/a)(1 - e^{-at})$. The function H(z) is given by

\[ H(z) = \frac{K}{a} \left\{ \frac{z}{z - 1} - \frac{z}{z - e^{-ah}} \right\} . \]

Stability of the closed loop depends on the zeros in z of 1 + H(z) = 0 which must lie inside the unit circle.

We have

\[ z^2 - z\,(1 + e^{-ah}) + e^{-ah} + \frac{K}{a}\,(1 - e^{-ah})\, z = 0 . \]

FIG. 2.3. Unit triangle.

Now the roots of $z^2 + a_1 z + a_2 = 0$ lie inside the unit circle if $(a_1, a_2)$ lies inside the triangle shown in Fig. 2.3. We have $a_2 = e^{-ah}$ and

\[ a_1 = \frac{K}{a}\,(1 - e^{-ah}) - (1 + e^{-ah}) \]

and so

\[ -1 - e^{-ah} < \frac{K}{a}\,(1 - e^{-ah}) - (1 + e^{-ah}) < 1 + e^{-ah} \]

or

\[ 0 < \frac{K}{a} < \frac{2\,(1 + e^{-ah})}{1 - e^{-ah}} = 2 \coth(ah/2) . \]
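The gain limit can be confirmed by computing the roots of the quadratic in z directly. The sketch below uses illustrative function names and, in the usage note, the assumed values a = 1 and h = 1.

```python
import cmath
import math

def closed_loop_roots(K, a, h):
    """Roots of z^2 + a1*z + a2 = 0 with a2 = exp(-a*h) and
    a1 = (K/a)*(1 - exp(-a*h)) - (1 + exp(-a*h))."""
    e = math.exp(-a * h)
    a1 = (K / a) * (1.0 - e) - (1.0 + e)
    a2 = e
    disc = cmath.sqrt(a1 * a1 - 4.0 * a2)
    return (-a1 + disc) / 2.0, (-a1 - disc) / 2.0

def is_stable(K, a, h):
    """True when both roots lie strictly inside the unit circle."""
    return all(abs(z) < 1.0 for z in closed_loop_roots(K, a, h))

def gain_limit(a, h):
    """Upper limit 2*a*coth(a*h/2) on K."""
    return 2.0 * a / math.tanh(a * h / 2.0)
```

With a = 1 and h = 1 the limit is about 4.33; a gain just below it keeps both roots inside the unit circle, while a gain just above it pushes one root out through z = -1.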

The general stability criterion for roots of polynomial equations in z to lie within the unit circle on the complex plane is associated with the names of Schur, Cohn and Jury. One version is given below (Jury):

Let the polynomial be

\[ F(z) = a_0 + a_1 z + a_2 z^2 + \dots + a_n z^n, \qquad a_n > 0 . \]

We construct the following table of coefficients and 2 × 2 determinants formed by cross-multiplication:

\[ \begin{array}{r|llllll}
 & a_n & a_{n-1} & a_{n-2} & \cdots & a_1 & a_0 \\
 & a_0 & a_1 & a_2 & \cdots & a_{n-1} & a_n \\
\Delta_1 = & b_0 & b_1 & b_2 & \cdots & b_{n-1} & \\
 & b_{n-1} & b_{n-2} & b_{n-3} & \cdots & b_0 & \\
\Delta_2 = & c_0 & c_1 & c_2 & \cdots & c_{n-2} & \\
 & c_{n-2} & c_{n-3} & c_{n-4} & \cdots & c_0 & \\
 & \vdots & & & & & \\
\Delta_{n-1} = & t_0 & t_1 & & & &
\end{array} \]

where

\[ b_k = \begin{vmatrix} a_n & a_k \\ a_0 & a_{n-k} \end{vmatrix}, \qquad c_k = \begin{vmatrix} b_0 & b_{n-k-1} \\ b_{n-1} & b_k \end{vmatrix}, \qquad d_k = \begin{vmatrix} c_0 & c_{n-k-2} \\ c_{n-2} & c_k \end{vmatrix}, \quad \text{etc.;} \]

then the conditions for all the roots of F(z) = 0 to lie within the unit circle are $F(1) > 0$, $(-1)^n F(-1) > 0$, $\Delta_1 > 0$, $\Delta_2 > 0$, ..., $\Delta_{n-1} > 0$.

2.3. Example of dynamical systems described by difference equations: the hog cycle

Difference equation models occur in various fields of applications. A famous model appearing in many textbooks of mathematical economics is the "hog cycle" ("hog" (USA) = "pig" (UK)).

We suppose there are two curves describing "demand" D and "production" P versus price p as in Fig. 2.4. We suppose that farmers base their production on the previous year's price and that the price adjusts itself so that supply equals demand. We have:

\[ g(p_n) = D(n) = P(n) = f(p_{n-1}) . \]

If we approximate the curves of f and g by straight lines so that

\[ P = ap + b, \qquad D = -cp + d; \qquad a, c > 0 \]

then

\[ -c\, p_n + d = a\, p_{n-1} + b \]

or

\[ p_n = -\frac{a}{c}\, p_{n-1} + \frac{d - b}{c} ; \]

the hog price cycle will be oscillatory as $-a/c < 0$, and stable or unstable depending on whether $a/c < 1$ or $> 1$ respectively.

If the cycle is stable it tends to the equilibrium point where the production and demand curves intersect. The diagram in Fig. 2.4 which traces the movement of prices is known as a "cob-web diagram".
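The price recursion -c p_n + d = a p_{n-1} + b is simple to iterate. The sketch below uses illustrative function names and, in the usage note, assumed numerical values for the slopes and intercepts.

```python
def hog_prices(p0, a, b, c, d, years):
    """Iterate -c*p[n] + d = a*p[n-1] + b, i.e. p[n] = (d - b - a*p[n-1]) / c."""
    prices = [p0]
    for _ in range(years):
        prices.append((d - b - a * prices[-1]) / c)
    return prices

def equilibrium_price(a, b, c, d):
    """Price at which the production and demand lines intersect."""
    return (d - b) / (a + c)
```

With a = 1, b = 0, c = 2, d = 6 the equilibrium price is 2 and successive prices alternate about it with deviations halving each year (a/c = 1/2). Swapping the slopes to a = 2, c = 1 doubles the deviation each year, so the cobweb spirals outwards.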

FIG. 2.4. The hog cycle: demand D and production P plotted against price p.

2.4. Stability

In the material considered so far stability has involved the decay of the complementary functions as time tends to infinity. This has involved roots in the left-hand half of the complex plane or roots inside the unit circle for the continuous time and discrete time systems, respectively.

More precise definitions of stability can be given when the systems are expressed in phase space or in state space form.

Consider the unforced second-order differential equation

\[ \frac{d^2 y}{dt^2} + a_1 \frac{dy}{dt} + a_2 y = 0 . \qquad \text{(a)} \]

By introducing the phase-plane co-ordinates $x_1 = y$ and $x_2 = dy/dt$ we may replace the single second-order equation by two first-order equations:

\[ \frac{dx_1}{dt} = x_2, \qquad \frac{dx_2}{dt} = -a_2 x_1 - a_1 x_2 \qquad \text{(b)} \]

or in matrix form

\[ \begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -a_2 & -a_1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} . \]

Instead of plotting the solutions of (a), i.e. y(t) or dy/dt with respect to time, we consider plotting the phase-plane diagram of (b) in which we plot $x_2$ (= dy/dt) against $x_1$ (= y) where, for varying t, the point $(x_1, x_2)$ describes a curve or trajectory on the phase-plane. Figure 2.5 shows some of the possibilities for various values of $a_1$ and $a_2$, where the arrows indicate increasing time. The origin of the phase-plane is itself a solution and trajectory and is a singular point or equilibrium point. The stability of an equilibrium point may be defined by considering the behaviour of trajectories in its neighbourhood. If from all starting points in a small spherical neighbourhood of the equilibrium point the trajectories subsequently stay close to the equilibrium point, then the equilibrium point is stable. If, in addition, the trajectories tend to the origin as time tends to infinity then we say the equilibrium point is asymptotically stable. Thus the origin is asymptotically stable in Figs 2.5a and 2.5c but stable only in Fig. 2.5b.

FIG. 2.5. Phase-plane diagrams: (a) $a_1, a_2 > 0$, $a_1^2 < 4a_2$; (b) $a_1 = 0$; (c) $a_1, a_2 > 0$, $a_1^2 > 4a_2$.
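For the linear system (b) the behaviour near the origin is governed by the eigenvalues of the matrix [[0, 1], [-a2, -a1]], which are the roots of lambda^2 + a1*lambda + a2 = 0. The sketch below computes them; the function names are illustrative assumptions.

```python
import cmath

def eigenvalues(a1, a2):
    """Eigenvalues of [[0, 1], [-a2, -a1]]: roots of lam^2 + a1*lam + a2 = 0."""
    disc = cmath.sqrt(a1 * a1 - 4.0 * a2)
    return (-a1 + disc) / 2.0, (-a1 - disc) / 2.0

def asymptotically_stable(a1, a2):
    """True when both eigenvalues have strictly negative real parts."""
    return all(lam.real < 0.0 for lam in eigenvalues(a1, a2))
```

A complex pair with negative real part gives inward spirals, two negative real eigenvalues give trajectories that approach the origin without circling it, and with a1 = 0 and a2 > 0 the eigenvalues are purely imaginary, so the trajectories are closed curves and the origin is stable but not asymptotically stable.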

A more formal mathematical definition of the local stability properties of an equilibrium point represented by the vector $x_0$ is as follows:

If given any $\varepsilon > 0$ there exists a $\delta = \delta(\varepsilon) > 0$ such that for all $\|x(t_0) - x_0\| < \delta$, $\|x(t) - x_0\| < \varepsilon$ for all $t \ge t_0$, then the equilibrium point $x_0$ is stable.

If, in addition, $\lim_{t \to \infty} \|x(t) - x_0\| = 0$ then $x_0$ is asymptotically stable.

In this definition $\| \cdot \|$ represents the Euclidean norm.

A number of more sophisticated definitions such as "uniform asymptotic stability" may be developed from examination of the limiting process $\|x(t) - x_0\| \to 0$ as $t \to \infty$.

So far we have considered as an example a linear system only (Eqs (a), (b) and Fig. 2.5). A non-linear system may have a number of equilibrium points as shown in Fig. 2.6 for soft-spring oscillations (Fig. 2.6a):

\[ m\ddot{x} + c\dot{x} + k_1 x - k_3 x^3 = 0 \]

\[ \dot{x}_1 = x_2, \qquad \dot{x}_2 = -\frac{k_1}{m}\, x_1 + \frac{k_3}{m}\, x_1^3 - \frac{c}{m}\, x_2 \]

\[ m, c, k_1, k_3 > 0 \]

FIG. 2.6. Equilibrium points of simple non-linear systems: (a) soft spring, (b) asymmetric spring.

and an asymmetric spring (LaSalle) (Fig. 2.6b):

x + | x + 2 x + x 2 = 0

¿ 2 = ' 2xl ' x i•2 . 1
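For the soft spring, the equilibrium points are found by setting both derivatives to zero, which gives x_2 = 0 and x_1 (k_1 - k_3 x_1^2) = 0. The sketch below lists them and classifies each one from the linearized equations; the classification by linearization and the function names are additions for illustration and are not part of the original discussion.

```python
import math

def equilibria(k1, k3):
    """x1-coordinates of the equilibrium points (x2 = 0 at each of them)."""
    r = math.sqrt(k1 / k3)
    return [-r, 0.0, r]

def jacobian(x1, m, c, k1, k3):
    """Linearization of x1' = x2, x2' = -(k1/m)x1 + (k3/m)x1^3 - (c/m)x2."""
    return [[0.0, 1.0], [(-k1 + 3.0 * k3 * x1 * x1) / m, -c / m]]

def kind(x1, m, c, k1, k3):
    """Classify an equilibrium from the determinant and trace of the Jacobian."""
    J = jacobian(x1, m, c, k1, k3)
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    trace = J[0][0] + J[1][1]
    if det < 0.0:
        return "saddle (unstable)"
    if det > 0.0 and trace < 0.0:
        return "asymptotically stable"
    return "undetermined by linearization"
```

With all constants positive the origin comes out asymptotically stable, while the two outer points at x_1 = ±sqrt(k_1/k_3) are saddles, so trajectories starting far enough from the origin escape.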

3. STABILITY. I.

3.1. Liapunov functions

An important idea for investigating the stability of equilibrium points of linear and non-linear systems is the concept of the Liapunov function (Liapunov, 1892). This is illustrated in Fig. 3.1 where the origin of the system shown is surrounded by a nest of closed curves given by

\[ V(x_1, x_2) = b_{11} x_1^2 + 2 b_{12} x_1 x_2 + b_{22} x_2^2 = \text{constant} > 0 \]

where

\[ \dot{x}_1 = x_2, \qquad \dot{x}_2 = -a_2 x_1 - a_1 x_2 \]

are the system equations discussed in Section 2.4, with $a_1 > 0$, $a_2 > 0$.

We examine the rate of change of $V$ following the trajectories of the system equations. This quantity, sometimes called the Eulerian derivative of $V$, is given by

$$\frac{dV}{dt} = \frac{\partial V}{\partial x_1}\frac{dx_1}{dt} + \frac{\partial V}{\partial x_2}\frac{dx_2}{dt} = (2b_{11}x_1 + 2b_{12}x_2)\,x_2 + (2b_{12}x_1 + 2b_{22}x_2)(-a_2x_1 - a_1x_2)$$

on differentiation by parts and substitution for $\dot{x}_1$ and $\dot{x}_2$ from the system equations. We now try to choose $b_{11}$, $b_{12}$ and $b_{22}$ so that (i) the contours of $V$ form closed curves surrounding the origin, (ii) $dV/dt \le 0$, but $\ne 0$ except at $x = 0$. If this is possible we can deduce that the origin is an asymptotically stable equilibrium point, for $V$ will decrease continually along any trajectory, which must be moving inwards through the contours of $V$ and thus tending to the origin as $t \to \infty$.

In the case of a linear system such as that given in Fig. 3.1 we can make $dV/dt \le 0$ by equating it identically to a negative-definite quadratic form (e.g. $-2x_1^2 - 2x_2^2$) which will in general give rise to 3 equations for the 3 unknowns $b_{11}$, $b_{12}$, $b_{22}$. We may then examine $V$ for positive definiteness, for which necessary and sufficient conditions are $b_{11} > 0$, $b_{11}b_{22} > b_{12}^2$. These conditions will also be necessary and sufficient for asymptotic stability of the origin.
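The coefficient matching described above can be sketched in a few lines of Python. This is only an illustration, assuming the target form $dV/dt = -2x_1^2 - 2x_2^2$ from the text; the function names and the numerical values of $a_1$, $a_2$ are hypothetical choices, not part of the lectures.

```python
def liapunov_coefficients(a1, a2):
    """Coefficients of V = b11*x1^2 + 2*b12*x1*x2 + b22*x2^2 such that
    dV/dt = -2*x1^2 - 2*x2^2 along x1' = x2, x2' = -a2*x1 - a1*x2."""
    # Matching the coefficients of x1^2, x2^2 and x1*x2 in dV/dt gives:
    b12 = 1.0 / a2               # x1^2 :  -2*a2*b12                   = -2
    b22 = (1.0 + b12) / a1       # x2^2 :   2*b12 - 2*a1*b22           = -2
    b11 = a1 * b12 + a2 * b22    # x1*x2:   2*b11 - 2*a1*b12 - 2*a2*b22 = 0
    return b11, b12, b22


def v_dot(b, a1, a2, x1, x2):
    """dV/dt at the point (x1, x2), written exactly as in the text."""
    b11, b12, b22 = b
    return (2*b11*x1 + 2*b12*x2) * x2 + (2*b12*x1 + 2*b22*x2) * (-a2*x1 - a1*x2)


def is_positive_definite(b):
    """Test b11 > 0 and b11*b22 > b12^2."""
    b11, b12, b22 = b
    return b11 > 0 and b11 * b22 > b12 ** 2


# Illustrative values only (a1 = 3, a2 = 2 are assumptions).
b = liapunov_coefficients(a1=3.0, a2=2.0)
print(b, is_positive_definite(b))
```

With a negative damping coefficient (for instance `a1 = -1.0`) the same matching still produces a quadratic form, but `is_positive_definite` then fails, which mirrors the statement that the conditions are also necessary.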

We note that $V \to 0$ as $t \to \infty$ and since $V$ majorizes the norm $\|x\| = \sqrt{x_1^2 + x_2^2}$, this proves that $\|x\| \to 0$, which is what we mean by asymptotic stability of the point $x = 0$. This idea may be generalized into general situations, for example to discuss the stability of systems governed by partial differential equations.

With non-linear systems we may still be able to find Liapunov functions and deduce stability properties, even though we cannot find any explicit solutions of the non-linear differential equations themselves. This makes the technique particularly attractive. Certain systems, e.g. model-reference adaptive control systems, may be synthesized as stable systems by choosing $V$ positive definite and making $dV/dt$ negative by suitable feedback arrangements.

3.2. Construction of Liapunov functions

Several methods for constructing Liapunov functions have been devised, some of which are briefly listed below:

The Liapunov matrix equation for linear systems

Given the linear constant system $\dot{x} = Ax$ and the quadratic form $V = x^TPx$, then $dV/dt = x^T(PA + A^TP)x$ and, by equating $dV/dt$ to $-x^TQx$, we obtain the Liapunov matrix equation

$$PA + A^TP = -Q$$
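As a rough illustration of how the Liapunov matrix equation can be solved numerically, the sketch below treats the $n^2$ entries of $P$ as unknowns of a linear system and solves it by Gaussian elimination. The helper names `solve_linear` and `solve_liapunov` and the example matrices are assumptions made for this sketch; in practice a library routine would normally be preferred.

```python
def solve_linear(M, rhs):
    """Solve M z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: no unique solution P")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - s) / aug[i][i]
    return z


def solve_liapunov(A, Q):
    """Return P with P A + A^T P = -Q; the n*n entries of P are the unknowns."""
    n = len(A)
    M = [[0.0] * (n * n) for _ in range(n * n)]
    rhs = [0.0] * (n * n)
    for i in range(n):
        for j in range(n):
            row = i * n + j
            for k in range(n):
                M[row][i * n + k] += A[k][j]   # (P A)_ij   = sum_k P_ik A_kj
                M[row][k * n + j] += A[k][i]   # (A^T P)_ij = sum_k A_ki P_kj
            rhs[row] = -Q[i][j]
    z = solve_linear(M, rhs)
    return [z[i * n:(i + 1) * n] for i in range(n)]


# Illustrative system x1' = x2, x2' = -2*x1 - 3*x2 with Q = 2I (assumed values).
A = [[0.0, 1.0], [-2.0, -3.0]]
Q = [[2.0, 0.0], [0.0, 2.0]]
print(solve_liapunov(A, Q))
```

For this example the result is close to $P = \begin{bmatrix} 2.5 & 0.5 \\ 0.5 & 0.5 \end{bmatrix}$, i.e. $b_{11} = 2.5$, $b_{12} = b_{22} = 0.5$ in the notation of Section 3.1.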

The variable-gradient method

Given the system of equations $\dot{x} = f(x)$, we consider the gradient vector

$$\operatorname{grad} V = \left(\frac{\partial V}{\partial x_1}, \frac{\partial V}{\partial x_2}, \ldots, \frac{\partial V}{\partial x_n}\right)^T$$

and try to construct its elements so that

(i) $(\operatorname{grad} V)^T \cdot f \le 0$

(ii) $\operatorname{grad} V$ is in fact the gradient of a scalar.

If the vector to be found is $g = (g_1, \ldots, g_n)$ then

(i) $g^T \cdot f \le 0$

(ii) $\operatorname{curl} g = 0$

where the $n \times n$ matrix $\operatorname{curl} g$ is defined as a matrix with its $(i, j)$th element $g_{ij}$ given by

$$g_{ij} = \frac{\partial g_j}{\partial x_i} - \frac{\partial g_i}{\partial x_j}$$

The curl matrix is antisymmetric by definition, and so there are $n(n-1)/2$ conditions on the $g_i$ to be satisfied, namely

$$\frac{\partial g_j}{\partial x_i} = \frac{\partial g_i}{\partial x_j}; \qquad i, j = 1, 2, \ldots, n$$

A certain arbitrariness is present in this technique.

The Zubov method

This is of particular interest when the $f_i$ (elements of $f$) are algebraic forms with linear, quadratic, cubic, ... parts. The Liapunov function $V$ is taken as a sum of $V_2$ (quadratic), $V_3$ (cubic), $V_4$ (quartic), ... forms. First the linear part of $f$ is used to construct $V_2$ with a negative quadratic form as derivative, using the technique for linear systems. Then $V_3$ is found using $V_2$ and the linear and quadratic parts of $f$ to keep $dV/dt$ unchanged. $V_4$, ... are found similarly in succession in a logical but increasingly complex process. There may be problems of convergence and sign-definiteness of the resulting function $V$.

Integration by parts

This is an important idea which has been exploited by Brockett to provide an important connection between Liapunov functions and frequency response methods. The following is a basic theorem:

Theorem (Brockett, 1964)

Suppose

$$p(D) = D^n + a_1D^{n-1} + a_2D^{n-2} + \ldots + a_n$$

where $D = d/dt$ and the $a_i$ are real.

Let $q(D)$ be another polynomial in $D$ of degree less than or equal to that of $p(D)$ and such that $q(z)/p(z)$ is a positive real function. If $q(D)$, $p(D)$ and $[\mathrm{Ev.}\, q(D)p(-D)]^{-}$ do not have a common factor then

$$V(x) = \int \left( q(D)y\; p(D)y - \{[\mathrm{Ev.}\, q(D)p(-D)]^{-}y\}^2 \right) dt$$

is a positive definite Liapunov function for $p(D)y = 0$ with a non-positive derivative, where $x$ is the phase-space vector

$$x = [y, Dy, \ldots, D^{n-1}y]^T$$

The theorem requires some explanatory notes:

(i) A positive real function $\phi(z)$ is such that $\phi(z)$ is real if $z$ is real and $\operatorname{Re}\phi(z) \ge 0$ for $\operatorname{Re} z \ge 0$. In particular $\operatorname{Re}\phi(i\omega) \ge 0$ for real $\omega$, which enables the important connection with frequency response to be made.

(ii) The notation $\mathrm{Ev.}(\cdot)$ means the "even part of", e.g.

$$\mathrm{Ev.}(z^3 + z^2 + z - 1) = z^2 - 1.$$

(iii) The notation $[\cdot]^{-}$ means that part of an even function which has zeros in the left-hand half of the z-plane, e.g.

$$[z^2 - 1]^{-} = [(z + 1)(z - 1)]^{-} = z + 1.$$

(iv) $V(x)$ is in fact a quadratic form in the phase-space variables

$$x_1 = y,\ x_2 = Dy,\ \ldots,\ x_n = D^{n-1}y.$$

An example of this technique is given below:

$$p(D) = (D + 2)(D + 3) = D^2 + 5D + 6, \qquad q(D) = D + 1$$

We note that $q(z)/p(z)$ is positive real (see Fig. 3.2).

$$\mathrm{Ev.}\, q(D)p(-D) = \mathrm{Ev.}\,(D + 1)(D^2 - 5D + 6) = \mathrm{Ev.}\,(D^3 - 4D^2 + D + 6) = -4D^2 + 6$$

$$[\mathrm{Ev.}\, q(D)p(-D)]^{-} = [-4D^2 + 6]^{-} = [(2D + \sqrt{6})(-2D + \sqrt{6})]^{-} = 2D + \sqrt{6}$$

Figure 3.2 shows a plot of

$$\frac{q(i\omega)}{p(i\omega)} = \frac{6 + 4\omega^2 + i(\omega - \omega^3)}{(6 - \omega^2)^2 + 25\omega^2}$$

FIG. 3.2. Plot of $q(i\omega)/p(i\omega)$.

$$V = \int \left( \dot{y}\ddot{y} + y\ddot{y} + 5\dot{y}^2 + 5y\dot{y} + 6y\dot{y} + 6y^2 - 4\dot{y}^2 - 4\sqrt{6}\,y\dot{y} - 6y^2 \right) dt$$

$$= \tfrac{1}{2}\dot{y}^2 + y\dot{y} + \tfrac{1}{2}(11 - 4\sqrt{6})\,y^2$$

$$= \begin{bmatrix} y & \dot{y} \end{bmatrix} \begin{bmatrix} \tfrac{1}{2}(11 - 4\sqrt{6}) & \tfrac{1}{2} \\ \tfrac{1}{2} & \tfrac{1}{2} \end{bmatrix} \begin{bmatrix} y \\ \dot{y} \end{bmatrix}$$

which is positive-definite.

$$\frac{dV}{dt} = -(2\dot{y} + \sqrt{6}\,y)^2$$

which is non-positive (negative semi-definite), as the theorem requires.
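The example can be checked numerically. The following sketch, with function names chosen here purely for illustration, evaluates $dV/dt$ by the chain rule with $\ddot{y}$ eliminated through $p(D)y = 0$, compares it with $-(2\dot{y} + \sqrt{6}\,y)^2$ at a few sample points, and tests the two leading minors of the matrix of $V$.

```python
import math

SQRT6 = math.sqrt(6.0)
C = 11.0 - 4.0 * SQRT6          # V contains the term C*y^2/2


def V(y, ydot):
    """V = ydot^2/2 + y*ydot + (11 - 4*sqrt(6))*y^2/2."""
    return 0.5 * ydot**2 + y * ydot + 0.5 * C * y**2


def V_dot(y, ydot):
    """dV/dt by the chain rule, using yddot = -5*ydot - 6*y from p(D)y = 0."""
    yddot = -5.0 * ydot - 6.0 * y
    dV_dy = ydot + C * y
    dV_dydot = ydot + y
    return dV_dy * ydot + dV_dydot * yddot


def claimed_V_dot(y, ydot):
    """The closed form quoted in the text."""
    return -(2.0 * ydot + SQRT6 * y) ** 2


# Leading minors of the matrix [[C/2, 1/2], [1/2, 1/2]] of the quadratic form V.
leading_minor = 0.5 * C
determinant = 0.5 * C * 0.5 - 0.25
print(leading_minor > 0 and determinant > 0)

for y, ydot in [(1.0, 0.0), (0.0, 1.0), (1.0, -1.0), (-2.0, 3.0)]:
    print(abs(V_dot(y, ydot) - claimed_V_dot(y, ydot)) < 1e-9)
```

Note that the determinant is small (about 0.05), so $V$ is positive definite but only by a narrow margin for this choice of $q(D)$.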

4. STABILITY. II

4.1. Positive-real functions and the Popov criterion

The positive-real function idea may be extended to the non-linear feedback arrangement shown in Fig. 4.1a to give the following result:

Theorem (Brockett): "The zero solution of Fig. 4.1a is asymptotically stable for all admissible $f(y)$ if there exists an $\alpha > 0$ such that $(1 + \alpha s)G(s)$ is a positive real function."

By an admissible non-linearity we mean (i) $f(y)$ defined, continuous and single-valued for all $y$, (ii) $f(0) = 0$, $yf(y) > 0$, $y \ne 0$.

The proof of this result uses the Liapunov function formed by the integration-by-parts technique explained in Section 3. We consider the variable $x$ in Fig. 4.1b which satisfies the equation

$$p(D)x = -f\{q(D)x\}$$

We multiply by $(1 + \alpha D)q(D)x$ and form the new equation

$$(1 + \alpha D)q(D)x\; p(D)x + \alpha Dq(D)x\; f\{q(D)x\} = -q(D)x\; f\{q(D)x\}$$

FIG. 4.1. Non-linear feedback system.

FIG. 4.2. Modified polar plot.

FIG. 4.3. Shifted Popov line in modified polar plot.

FIG. 4.4. Time-varying feedback.

We take $V(x)$ as the time integral of the left-hand side having first subtracted $\{[\mathrm{Ev.}\,(1 + \alpha D)q(D)p(-D)]^{-}x\}^2$ from both sides.

The left-hand side integrates into a positive-definite quadratic form in the phase-space variables by the result for the linear system $p(D)y = 0$, given in Section 3, plus

$$\alpha \int_0^{q(D)x} f(\sigma)\, d\sigma$$

which is also non-negative for $\alpha > 0$. The derivative is

$$\frac{dV}{dt} = -q(D)x\; f\{q(D)x\} - \{[\mathrm{Ev.}\ldots]^{-}x\}^2 \le 0$$

The condition of the theorem that $(1 + \alpha s)G(s)$ is positive-real has an important graphical interpretation if we consider $s = i\omega$, for then the plot of $\omega \operatorname{Im} G(i\omega)$ against $\operatorname{Re} G(i\omega)$ must lie to the right of the straight line through the origin with slope $1/\alpha$ (see Fig. 4.2). The diagram is called a "modified polar plot" and the straight line the "Popov line". Of course the multiplication of $\operatorname{Im} G(i\omega)$ by $\omega$ considerably changes the diagram from the more familiar frequency response diagram where $\operatorname{Im} G(i\omega)$ is plotted against $\operatorname{Re} G(i\omega)$.
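The graphical condition can be tried out on a sampled frequency grid. The sketch below is a minimal illustration only: the plant $G(s) = 1/((s+1)(s+2))$ is a hypothetical example that does not appear in the text, and checking a finite grid on the imaginary axis is an approximation, not a proof of positive realness.

```python
def G(s):
    """Hypothetical example plant (not from the text): G(s) = 1/((s+1)(s+2))."""
    return 1.0 / ((s + 1.0) * (s + 2.0))


def popov_margin(alpha, omega):
    """Re[(1 + i*alpha*omega) G(i*omega)] = Re G - alpha*omega*Im G."""
    g = G(1j * omega)
    return g.real - alpha * omega * g.imag


def popov_condition_holds(alpha, omegas):
    """True if the modified polar plot (Re G, omega*Im G) stays to the right of
    the Popov line of slope 1/alpha at every sampled frequency."""
    return all(popov_margin(alpha, w) >= 0.0 for w in omegas)


omegas = [0.01 * n for n in range(0, 100001)]   # 0 .. 1000 rad/s
print(popov_condition_holds(0.5, omegas))
print(popov_condition_holds(0.1, omegas))
```

For this particular plant the margin works out to $[2 + (3\alpha - 1)\omega^2]$ divided by a positive denominator, so the condition holds for $\alpha \ge 1/3$ and fails for smaller $\alpha$; the two calls above illustrate both cases.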

FIG. 4.5. Circle theorem disc.

A useful extension to this result in which further restrictions on the non-linearity are traded in for relaxations on the transfer function is embodied in the following theorem:

"The system of Fig. 4.1a has an asymptotically stable null solution for all admissible $f(y)$ such that $0 \le yf(y) \le ky^2$ if there exists an $\alpha > 0$ such that $(1 + \alpha s)G(s) + 1/k$ is positive real."

This yields a new Popov line as shown in Fig. 4.3, the modified polar plot of $G(i\omega)$ having to lie to the right of this line.

4.2. The circle criterion

This technique can be extended to the time-varying system shown in Fig. 4.4, when the "circle theorem" is obtained: that is, the loop of Fig. 4.4 is asymptotically stable provided the Nyquist locus $G(i\omega) = q(i\omega)/p(i\omega)$ does not encircle or enter the circular disc with diameter $(-1/\beta, 0)$, $(-1/\alpha, 0)$ as shown in Fig. 4.5, where

$$0 \le \beta < f(t) < \alpha < \infty$$

(Notice that here we are using the usual Nyquist diagram, not the modified polar plot of the previous discussion.)

The proof is indirect, using first the following theorem: "If $q(s)/p(s) + 1/k$ is positive real, the loop of Fig. 4.4 is asymptotically stable for $0 < f(t) < k$."

Adding and subtracting $\beta q(D)x$ in the basic equation

$$p(D)x + f(t)\,q(D)x = 0$$

we obtain

$$p(D)x + \beta q(D)x + (f(t) - \beta)\,q(D)x = 0$$

from which we can deduce that the loop is asymptotically stable for $0 < f(t) - \beta < \alpha - \beta$ if $[\alpha q(s) + p(s)]/[\beta q(s) + p(s)]$ is positive-real. The bilinear mapping $W = (\alpha z + 1)/(\beta z + 1)$, which maps the disc of Fig. 4.5 into the left-hand half plane and $q(s)/p(s)$ into $[\alpha q(s) + p(s)]/[\beta q(s) + p(s)]$, completes the proof.

4.3. Linear time-delay systems

A number of control processes involve pure time delays in the forward loop or in the feedback loop. An example is given in Fig. 4.6. (The transfer function of a time delay $T$ is $e^{-sT}$.)

FIG. 4.6. Time delay system for $K > 0$, $a > 0$.

The stability of the loop is conveniently handled by the Nyquist criterion, bearing in mind that the frequency response function of the time delay $T$ is $e^{-i\omega T}$, a unit circle in the complex plane described in a clockwise direction as $\omega$ increases from 0 to $+\infty$. The open-loop frequency response is sketched in Fig. 4.7. The closed loop will be unstable for sufficiently large $K/a$, since the $(-1, 0)$ point will be encircled.

Conditions on the characteristic equation (which, for the example of Fig. 4.6, is $s + a + Ke^{-sT} = 0$) for stability and instability have been given by Pontryagin (1942).
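The gain at which the Nyquist locus first passes through $(-1, 0)$ can be estimated numerically. The sketch below assumes, from the characteristic equation quoted above, that the open-loop transfer function is $Ke^{-sT}/(s + a)$; the function name and the sample values $a = 1$, $T = 1$ are illustrative assumptions. On the boundary $s = i\omega$ the phase condition is $\omega T + \arctan(\omega/a) = \pi$, which is solved by bisection, and the critical gain is then $\sqrt{a^2 + \omega^2}$.

```python
import cmath
import math


def critical_gain(a, T, iterations=200):
    """Gain K and frequency omega at which K*exp(-i*omega*T)/(i*omega + a) = -1.

    The phase lag omega*T + atan(omega/a) increases monotonically with omega,
    so its first crossing of pi is bracketed by (0, pi/T) and found by bisection.
    """
    lo, hi = 0.0, math.pi / T
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid * T + math.atan(mid / a) < math.pi:
            lo = mid
        else:
            hi = mid
    omega = 0.5 * (lo + hi)
    return math.hypot(a, omega), omega


K_crit, omega_crit = critical_gain(a=1.0, T=1.0)
# Residual of the characteristic equation s + a + K*exp(-s*T) at s = i*omega.
residual = abs(1j * omega_crit + 1.0 + K_crit * cmath.exp(-1j * omega_crit * 1.0))
print(K_crit, omega_crit, residual)
```

For gains above the value returned here the $(-1, 0)$ point is encircled, in line with the remark that the loop becomes unstable for sufficiently large $K/a$.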

4.4. Equations with periodic coefficients

These occur naturally in a number of problems, for example the whirling of unsymmetrical shafts in unsymmetrical bearings or in the analysis of helicopter blade dynamics, but more especially when examining the stability of non-linear oscillations, since the linearized equation for small disturbances occurring on top of the "steady-state" oscillation will be a linear differential equation with periodic coefficients. The calculation of stability boundaries of these equations is usually difficult and the boundaries are complicated. The Mathieu equation

$$\ddot{x} + (\delta + \varepsilon \cos t)\,x = 0$$

is a famous example with a stability diagram in terms of $\delta$ and $\varepsilon$ shown in Fig. 4.8.

Use of Liapunov functions or the circle theorem can sometimes give useful sufficient (but not necessary) stability criteria.

FIG. 4.9. Growing disturbance in repeated process (ploughing).

FIG. 4.10. Mathematical representation of process of Fig. 4.9.

4.5. Stability of repeated processes

A topic of recent interest is the stability of repeated processes, for example coal-cutting, agricultural ploughing, machining of metals, and control of strings of urban transport vehicles by automatic means. If a slight irregularity occurs on one run of the tractor ploughing a field, will the disturbance be magnified subsequently? (See Fig. 4.9, which depicts a series of plough furrows following a step disturbance in the first furrow.)

Mathematically the process is that of Fig. 4.10, where an initial wave form is fed through the transfer function $G(s)$ many times.

The stability criterion is

(i) that $G(s)$ itself is stable (that is, with all its poles in the left-hand half plane),

(ii) that $|G(i\omega)| < 1$ for all real $\omega$.

The Fourier transform of the output after $n$ stages will be $(G(i\omega))^n\, U(i\omega)$, where $U(i\omega)$ is the Fourier transform of the input disturbance, and so the Fourier transform tends to zero as $n \to \infty$ for all $\omega$ if $|G(i\omega)| < 1$.
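A short numerical sketch makes the role of condition (ii) concrete. The two first-order transfer functions below are hypothetical examples chosen for illustration; both are stable in the sense of condition (i), but only the first has gain below one at every frequency.

```python
def gain(G, omega):
    """|G(i*omega)| for a transfer function given as a Python callable."""
    return abs(G(1j * omega))


def satisfies_gain_criterion(G, omegas):
    """Condition (ii) checked on a finite frequency grid: |G(i*omega)| < 1."""
    return all(gain(G, w) < 1.0 for w in omegas)


def amplitude_after(G, omega, n):
    """Factor multiplying |U(i*omega)| after n passes: |G(i*omega)|^n."""
    return gain(G, omega) ** n


def contracting(s):
    """Hypothetical example: G(s) = 0.9/(s + 1), peak gain 0.9."""
    return 0.9 / (s + 1.0)


def amplifying(s):
    """Hypothetical example: G(s) = 2/(s + 1), gain 2 at omega = 0."""
    return 2.0 / (s + 1.0)


omegas = [0.1 * n for n in range(0, 1001)]
print(satisfies_gain_criterion(contracting, omegas))
print(satisfies_gain_criterion(amplifying, omegas))
print(amplitude_after(contracting, 0.0, 50), amplitude_after(amplifying, 0.0, 10))
```

In the second case the low-frequency content of a disturbance doubles on every pass even though $G(s)$ has its only pole in the left-hand half plane, which is why both conditions are needed.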

5. CONTROL OF SYSTEMS DESCRIBED BY PARTIAL DIFFERENTIAL EQUATIONS

In recent years there has been a growth in the theory of control of "distributed parameter systems", that is systems governed by partial differential equations, as opposed to "lumped parameter systems" governed by ordinary differential equations. Before discussing control of distributed parameter systems it is necessary to discuss first two particularly important examples of the governing equations: the heat conduction equation and the wave equation.

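5.1. The heat conduction equation

Consider the three-dimensional Cartesian frame and a small cubic element of a conducting solid as shown in Fig. 5.1. The heat flow across the face ADEF is $-k\,(\partial T/\partial x)\,\delta y\,\delta z$ into the element, where $k$ is the thermal conductivity and $T = T(x, y, z, t)$ is the temperature. The flow out across BCHG is

$$-\left[k\frac{\partial T}{\partial x} + \frac{\partial}{\partial x}\left(k\frac{\partial T}{\partial x}\right)\delta x\right]\delta y\,\delta z$$

FIG. 5.1. Cube for which heat conduction is considered.

Considering the other two pairs of faces the total heat input into the element is

$$\delta x\,\delta y\,\delta z\left[\frac{\partial}{\partial x}\left(k\frac{\partial T}{\partial x}\right) + \frac{\partial}{\partial y}\left(k\frac{\partial T}{\partial y}\right) + \frac{\partial}{\partial z}\left(k\frac{\partial T}{\partial z}\right)\right]$$

or, if $k$ is constant,

$$k\,\delta x\,\delta y\,\delta z\left(\frac{\partial^2 T}{\partial x^2} + \frac{\partial^2 T}{\partial y^2} + \frac{\partial^2 T}{\partial z^2}\right)$$

This heat flow has increased the temperature so that, if $\rho$ is the density and $c$ the specific heat of the material,

$$k\left(\frac{\partial^2 T}{\partial x^2} + \frac{\partial^2 T}{\partial y^2} + \frac{\partial^2 T}{\partial z^2}\right) = \rho c\,\frac{\partial T}{\partial t}$$

on cancelling out $\delta x\,\delta y\,\delta z$ on both sides. This is the highly important heat conduction equation. In the steady state, when $\partial T/\partial t = 0$, we obtain Laplace's equation

$$\frac{\partial^2 T}{\partial x^2} + \frac{\partial^2 T}{\partial y^2} + \frac{\partial^2 T}{\partial z^2} = 0$$

which also has other important applications, for example in potential theory and fluid-flow problems.

It is usual to introduce a new constant $K = k/\rho c$ called the "thermal diffusivity" so that the heat conduction equation can be written neatly as

$$K\,\nabla^2 T = \frac{\partial T}{\partial t}$$

$\nabla^2$ being known as the Laplace operator

$$\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$$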

Some classical solutions of the heat equation may be found by separation of the variables: for example, a thin rod lying on the x-axis gives rise to the one-dimensional heat equation

$$K\frac{\partial^2 T}{\partial x^2} = \frac{\partial T}{\partial t}$$

If we assume $T(x, t) = X^*(x)\,T^*(t)$, where $X^*$ is a function of $x$ only and $T^*$ a function of $t$ only, then

$$\frac{1}{X^*}\frac{d^2X^*}{dx^2} = \frac{1}{K\,T^*}\frac{dT^*}{dt} = -p^2$$

giving solutions in $X^*$ and $T^*$ so that

$$T(x, t) = (A\cos px + B\sin px)\,e^{-Kp^2t}, \qquad p \ne 0$$

The boundary and initial conditions determine admissible values for $p$ and the constants $A$ and $B$ for each $p$; the original equation being linear, we may add different solutions together.

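For example: "The rod is of length $\ell$ and at time $t = 0$ is at a uniform temperature $T_0$. One end $x = 0$ is reduced to zero temperature, the other end $x = \ell$ is held at $T_0$. Find the general temperature $T(x, t)$."

We look at solutions $T(x, t)$ above, together with the solution $T = A + Bx$ for $p = 0$. When $x = 0$ then $T(0, t) = 0$ for all $t$, so $A = 0$. When $x = \ell$, $T(\ell, t) = T_0$ and so we may have as a possible solution

$$T(x, t) = \frac{T_0\,x}{\ell} + \sum_{r=1}^{\infty} B_r \sin\frac{r\pi x}{\ell}\; e^{-Kr^2\pi^2t/\ell^2}$$

where the possible values of $p$ are $r\pi/\ell$ ($r = 1, 2, 3, \ldots$). When $t = 0$ we have

$$T_0 = \frac{T_0\,x}{\ell} + \sum_{r=1}^{\infty} B_r \sin\frac{r\pi x}{\ell}$$

FIG. 5.2. Diagram of solution to heat-conduction equation.

and so a Fourier analysis yields

$$B_r = \frac{2}{\ell}\int_0^{\ell} T_0\left(1 - \frac{x}{\ell}\right)\sin\frac{r\pi x}{\ell}\,dx = \frac{2T_0}{r\pi}$$

and

$$T(x, t) = \frac{T_0\,x}{\ell} + \frac{2T_0}{\pi}\sum_{r=1}^{\infty} \frac{1}{r}\sin\frac{r\pi x}{\ell}\; e^{-Kr^2\pi^2t/\ell^2}$$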
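The series solution of this example is easy to evaluate by truncation. The sketch below assumes the coefficients $B_r = 2T_0/(r\pi)$ obtained from the Fourier analysis of the initial condition; the function name, the default parameter values and the number of terms are illustrative choices.

```python
import math


def rod_temperature(x, t, T0=1.0, length=1.0, K=1.0, terms=2000):
    """Truncated series for a rod initially at T0, with T(0, t) = 0 and
    T(length, t) = T0: steady-state ramp plus decaying sine modes."""
    steady = T0 * x / length
    transient = 0.0
    for r in range(1, terms + 1):
        B_r = 2.0 * T0 / (r * math.pi)        # Fourier coefficient of T0*(1 - x/length)
        p = r * math.pi / length              # admissible separation constant
        transient += B_r * math.sin(p * x) * math.exp(-K * p * p * t)
    return steady + transient


# Mid-point temperature at the start, shortly afterwards, and at large time.
print(rod_temperature(0.5, 0.0), rod_temperature(0.5, 0.05), rod_temperature(0.5, 10.0))
```

At $t = 0$ the truncated series converges only slowly towards $T_0$ (the error falls off like the reciprocal of the number of terms), whereas for any $t > 0$ the exponential factors make a handful of terms sufficient.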

The solution is sketched in Fig. 5.2.

An alternative boundary condition to a given temperature on the boundary is a given heat input. This corresponds to specifying $\partial T/\partial x$ rather than $T$ itself in the one-dimensional case. In particular, if one end of the rod is insulated, $\partial T/\partial x = 0$ there.

The heat equation may be expressed in other co-ordinates depending on the problem in hand, for example cylindrical polar co-ordinates and spherical polar co-ordinates.

Besides describing heat flow the equation

$$K\,\nabla^2 T = \frac{\partial T}{\partial t}$$

represents the "consolidation equation" in soil mechanics, the diffusion of chemicals ("Fick's law"), the "skin-effect equation" in electrical field theory, the behaviour of an electrical cable with capacitance and resistance but with negligible inductance, and incompressible viscous flow.

5.2. The wave equation

Waves occur all around us: in the air as sound waves, on the surface of water, in space as radio and light waves and in elastic bodies as mechanical vibrations.

The simplest mathematical model is the one-dimensional wave equation which may be used to describe planar vibrations of a string under tension (Fig. 5.3). If the tension is $T$ and the mass per unit length is $m$, we obtain by resolving perpendicularly to the x-axis

$$m\,\delta x\,\frac{\partial^2 y}{\partial t^2} = T\,\frac{\partial^2 y}{\partial x^2}\,\delta x$$

where $y(x, t)$ is the displacement of the string and $\partial y/\partial x$ is assumed to be small. We obtain the one-dimensional wave equation

$$\frac{\partial^2 y}{\partial t^2} = c^2\,\frac{\partial^2 y}{\partial x^2}, \qquad c^2 = \frac{T}{m}$$

($c$ is found to be the velocity of waves travelling along the string). The most general solution of the equation is

$$y(x, t) = f(x - ct) + g(x + ct)$$

Here $f$ and $g$ are arbitrary continuous functions. As $t$ increases, the wave form $f$ moves to the right at a velocity $c$ (Fig. 5.4) (and similarly the wave form $g$ moves to the left with velocity $c$).

FIG. 5.3. Equation of motion of a string.

FIG. 5.4. Waves travelling through a string.

FIG. 5.5. Motion in standing wave.

Of particular importance are progressive harmonic waves

$$y(x, t) = A\cos\frac{2\pi}{\lambda}(x - ct) = A\cos 2\pi\left(\frac{x}{\lambda} - ft\right)$$

where $c = f\lambda$. Here $A$ is the amplitude, $\lambda$ the wavelength, $f$ the frequency in Hz, $t$ is the time (s) and $c$ the wave velocity. The period of the wave is $1/f$ and the wave number $2\pi/\lambda$.

The harmonic progressive waves travelling in opposite directions may be combined to form a standing or stationary wave:

$$y(x, t) = A\cos\frac{2\pi}{\lambda}(x - ct) + A\cos\frac{2\pi}{\lambda}(x + ct) = 2A\cos\frac{2\pi x}{\lambda}\cos\frac{2\pi}{\lambda}ct = 2A\cos\frac{2\pi x}{\lambda}\cos 2\pi ft$$

Figure 5.5 illustrates the motion. The points $x = \lambda/4 \pm r\lambda/2$ ($r = 0, 1, 2, \ldots$) are all called nodes and the points $x = r\lambda/2$ ($r = 0, 1, 2, \ldots$) antinodes.

Example: A string of length $\ell$ is fixed at each end. It is plucked at its midpoint which is displaced through a distance $h$ and released. Determine the subsequent motion.

FIG. 5.6. Motion of string with amplitude $h$.

FIG. 5.7. Domains resulting from characteristics in the one-dimensional case.

The solution may be expressed as a sum of harmonic standing waves

$$y(x, t) = \sum_{r=1}^{\infty} B_r \sin\frac{r\pi x}{\ell}\cos\frac{r\pi ct}{\ell}$$

for which $\partial y/\partial t\,(x, 0) = 0$. The $B_r$ must be found by Fourier analysis of the initial wave form to give

$$B_r = \frac{8h}{r^2\pi^2}\sin\frac{r\pi}{2} \quad (r \text{ odd}), \qquad B_r = 0 \quad (r \text{ even})$$
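The standing-wave series for the plucked string can be summed numerically. The sketch below assumes the standard Fourier coefficients of a triangular initial shape, $B_r = (8h/r^2\pi^2)\sin(r\pi/2)$ for odd $r$ and zero for even $r$; the function name and default values are illustrative.

```python
import math


def plucked_string(x, t, h=1.0, length=1.0, c=1.0, terms=2000):
    """Truncated standing-wave series for a string fixed at both ends,
    plucked to height h at its midpoint and released from rest."""
    y = 0.0
    for r in range(1, terms + 1, 2):          # even-r coefficients vanish
        B_r = 8.0 * h / (r * r * math.pi ** 2) * math.sin(r * math.pi / 2.0)
        y += (B_r * math.sin(r * math.pi * x / length)
                  * math.cos(r * math.pi * c * t / length))
    return y


# Initial triangular shape at the midpoint and quarter point,
# and the midpoint half a period later (t = length / c).
print(plucked_string(0.5, 0.0), plucked_string(0.25, 0.0), plucked_string(0.5, 1.0))
```

Half a period after release every odd mode has changed sign, so the string passes through the mirror image of its initial shape.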

The motion is illustrated in Fig. 5.6.

A useful property of the wave equation is the existence of characteristics. In the case of the one-dimensional wave equation with solution $y(x, t)$ we can set up a plane with co-ordinates $x$ and $t$. In this plane $c\,(\partial y/\partial x) + \partial y/\partial t$ is constant along straight lines with equation $ct + x = \text{constant}$, and $c\,(\partial y/\partial x) - \partial y/\partial t$ is constant along lines $ct - x = \text{constant}$. These lines are called "characteristics" and the constancy of the two quantities $c\,(\partial y/\partial x) + \partial y/\partial t$ and $c\,(\partial y/\partial x) - \partial y/\partial t$ on these lines is useful in constructing general solutions, especially when boundary controls are acting.

Consideration of the characteristics gives rise to the concepts of domains of determinacy and dependence illustrated in Fig. 5.7. The motion at the point $(x_1, t_1)$ can be found from a knowledge of $y(x, t_0)$ and $\partial y/\partial t\,(x, t_0)$ for $x_1 - c(t_1 - t_0) < x < x_1 + c(t_1 - t_0)$ on the line $t = t_0$, which is a "domain of dependence". On the other hand, knowledge of $y(x, t_0)$ and $\partial y/\partial t\,(x, t_0)$ on this interval determines $y(x, t)$ for all points $(x, t)$ inside the shaded triangle, which is the "domain of determinacy" of the interval $x_1 - c(t_1 - t_0) < x < x_1 + c(t_1 - t_0)$ on $t = t_0$.

6. CONTROL CONCEPTS FOR PARTIAL DIFFERENTIAL EQUATIONS. I

6.1. Control of the heat equation

Figure 6.1 shows a feedback control arrangement for heating a metal bar. The behaviour of temperature $T(x, t)$ in the bar may be described by the heat equation

$$K\frac{\partial^2 T}{\partial x^2} = \frac{\partial T}{\partial t}$$

FIG. 6.1. Feedback control for heating a metallic bar.

and the heat input $h(t)$ is given by the boundary condition

$$-k\left.\frac{\partial T}{\partial x}\right|_{x=0} = h(t)$$

If the bar is insulated as shown, there is a second boundary condition at $x = \ell$:

$$\left.\frac{\partial T}{\partial x}\right|_{x=\ell} = 0$$

The temperature is measured by a sensor at $x = a$.

We observe that the control is a "boundary control" and that we have one pointwise sensor. (In general, we might have a distributed heat input, and in theory we might be able to measure the temperature at many points simultaneously.)

Taking the Laplace transform of the heat equation we obtain

$$K\frac{d^2\bar{T}}{dx^2}(x, s) = s\,\bar{T}(x, s) - T(x, 0) \qquad (6.1)$$

where $\bar{T}(x, s)$ is the Laplace transform of $T(x, t)$, given by

$$\bar{T}(x, s) = \int_{t=0}^{\infty} T(x, t)\,e^{-st}\,dt$$

We have also the transformed boundary conditions

$$\frac{d\bar{T}}{dx}(\ell, s) = 0$$

$$-k\frac{d\bar{T}}{dx}(0, s) = H(s) = G(s)\,\bigl(\bar{T}_i(s) - \bar{T}(a, s)\bigr)$$

where $H(s)$ is the Laplace transform of $h(t)$, and $\bar{T}_i(s)$ that of $T_i(t)$.

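The second-order ordinary differential equation (6.1) for $\bar{T}(x, s)$ may be solved by the "method of variation of parameters" in which we assume

$$\bar{T}(x, s) = A(x)\cosh\sqrt{s/K}\,x + B(x)\sinh\sqrt{s/K}\,x$$

so that

$$\frac{d\bar{T}}{dx} = \sqrt{s/K}\,\bigl(A(x)\sinh\sqrt{s/K}\,x + B(x)\cosh\sqrt{s/K}\,x\bigr)$$

provided

$$\frac{dA}{dx}\cosh\sqrt{s/K}\,x + \frac{dB}{dx}\sinh\sqrt{s/K}\,x = 0 \qquad (6.2)$$

Now

$$\frac{d^2\bar{T}(x, s)}{dx^2} = \frac{s}{K}\bigl(A(x)\cosh\sqrt{s/K}\,x + B(x)\sinh\sqrt{s/K}\,x\bigr) + \sqrt{s/K}\left(\frac{dA}{dx}\sinh\sqrt{s/K}\,x + \frac{dB}{dx}\cosh\sqrt{s/K}\,x\right)$$

or, substituting into (6.1),

$$\sqrt{sK}\left(\frac{dA}{dx}\sinh\sqrt{s/K}\,x + \frac{dB}{dx}\cosh\sqrt{s/K}\,x\right) = -T(x, 0) \qquad (6.3)$$

Equations (6.2) and (6.3) enable us to get solutions for $dA(x)/dx$ and $dB(x)/dx$, giving

$$\frac{dA}{dx} = \frac{T(x, 0)}{\sqrt{sK}}\sinh\sqrt{s/K}\,x, \qquad \frac{dB}{dx} = -\frac{T(x, 0)}{\sqrt{sK}}\cosh\sqrt{s/K}\,x$$

At $x = 0$

$$-k\sqrt{s/K}\,B(0) = H(s)$$

and at $x = \ell$

$$A(\ell)\sinh\sqrt{s/K}\,\ell + B(\ell)\cosh\sqrt{s/K}\,\ell = 0$$

so that

$$B(x) = -\frac{H(s)}{k\sqrt{s/K}} - \int_0^x \frac{T(\xi, 0)}{\sqrt{sK}}\cosh\sqrt{s/K}\,\xi\,d\xi$$

$$A(x) = \int_0^x \frac{T(\xi, 0)}{\sqrt{sK}}\sinh\sqrt{s/K}\,\xi\,d\xi + C$$

where

$$\left(\int_0^{\ell} \frac{T(\xi, 0)}{\sqrt{sK}}\sinh\sqrt{s/K}\,\xi\,d\xi + C\right)\sinh\sqrt{s/K}\,\ell - \left(\frac{H(s)}{k\sqrt{s/K}} + \int_0^{\ell} \frac{T(\xi, 0)}{\sqrt{sK}}\cosh\sqrt{s/K}\,\xi\,d\xi\right)\cosh\sqrt{s/K}\,\ell = 0$$

that is

$$C\sinh\sqrt{s/K}\,\ell = \frac{H(s)}{k\sqrt{s/K}}\cosh\sqrt{s/K}\,\ell + \int_0^{\ell} \frac{T(\xi, 0)}{\sqrt{sK}}\cosh\sqrt{s/K}\,(\ell - \xi)\,d\xi$$

giving, finally,

$$\bar{T}(x, s) = \cosh\sqrt{s/K}\,x\left\{\int_0^x \frac{T(\xi, 0)}{\sqrt{sK}}\sinh\sqrt{s/K}\,\xi\,d\xi + \frac{1}{\sinh\sqrt{s/K}\,\ell}\int_0^{\ell} \frac{T(\xi, 0)}{\sqrt{sK}}\cosh\sqrt{s/K}\,(\ell - \xi)\,d\xi + \frac{H(s)\cosh\sqrt{s/K}\,\ell}{k\sqrt{s/K}\,\sinh\sqrt{s/K}\,\ell}\right\}$$

$$\qquad - \sinh\sqrt{s/K}\,x\left\{\int_0^x \frac{T(\xi, 0)}{\sqrt{sK}}\cosh\sqrt{s/K}\,\xi\,d\xi + \frac{H(s)}{k\sqrt{s/K}}\right\}$$

We notice that this is made up of two parts: one depending on the initial temperature distribution $T(x, 0)$ and one depending on $H(s)$, the heat input.

The transfer function relating $T(a, t)$ to $h(t)$ or $\bar{T}(a, s)$ to $H(s)$ is therefore

$$\frac{\cosh\sqrt{s/K}\,\ell\,\cosh\sqrt{s/K}\,a}{k\sqrt{s/K}\,\sinh\sqrt{s/K}\,\ell} - \frac{\sinh\sqrt{s/K}\,a}{k\sqrt{s/K}} = \frac{\cosh\sqrt{s/K}\,(\ell - a)}{k\sqrt{s/K}\,\sinh\sqrt{s/K}\,\ell} \qquad (6.4)$$

If $a = 0$ this reduces to

$$\frac{\cosh\sqrt{s/K}\,\ell}{k\sqrt{s/K}\,\sinh\sqrt{s/K}\,\ell} \qquad (6.5)$$

and if $a = \ell$ we have the transfer function

$$\frac{1}{k\sqrt{s/K}\,\sinh\sqrt{s/K}\,\ell} \qquad (6.6)$$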

We may analyse the stability of the closed loop using the Nyquist criterion, in which case we are interested in the frequency response of Eqs (6.4)-(6.6). The frequency responses of (6.5) and (6.6), putting $s = i\omega$, are shown in Fig. 6.2a and b.

We notice that if the controller $G(s)$ is a constant gain then the open-loop frequency response will encircle the $(-1, 0)$ point for sufficiently large gain unless $a = 0$.

(From a practical point of view this problem is somewhat artificial as it is unlikely that $h(t)$ can go negative, unless heating and cooling arrangements are available at $x = 0$.)

FIG. 6.2. Frequency responses of Eq. (6.5) (a) and Eq. (6.6) (b).

Another way of relating $T(x, t)$ to $h(t)$ is by assuming a Fourier series for $T(x, t)$ in the form

$$T(x, t) = \sum_{r=0}^{\infty} A_r(t)\cos\frac{r\pi x}{\ell}$$

which satisfies the boundary condition at $x = \ell$. However, to satisfy the boundary condition at $x = 0$ we actually arrange for it to be satisfied at $x = 0+$ by considering the modified differential equation

$$K\frac{\partial^2 T}{\partial x^2} + \frac{K}{k}\,\delta(x)\,h(t) = \frac{\partial T}{\partial t}$$

which contains a Dirac delta function at $x = 0$.

On substitution of the Fourier series for $T(x, t)$, multiplication by $\cos(r\pi x/\ell)$ and integration from $x = 0$ to $x = \ell$ we obtain

$$\dot{A}_r + \frac{Kr^2\pi^2}{\ell^2}A_r = \frac{2K}{k\ell}\,h(t), \qquad r = 1, 2, 3, \ldots$$

$$\dot{A}_0 = \frac{K}{k\ell}\,h(t), \qquad r = 0$$

If $h(t)$ is a unit impulse we obtain

$$T(x, t) = \frac{2K}{k\ell}\sum_{r=1}^{\infty} e^{-r^2\pi^2Kt/\ell^2}\cos\frac{r\pi x}{\ell} + \frac{K}{k\ell}$$

(notice that this is not convergent for $t = 0$).

Taking the Laplace transform

$$\bar{T}(x, s) = \frac{2K}{k\ell}\sum_{r=1}^{\infty} \frac{\cos(r\pi x/\ell)}{s + r^2\pi^2K/\ell^2} + \frac{K}{k\ell s}$$

It is not immediately obvious that this is identical to Eq. (6.4) obtained earlier, which was (putting $a = x$)

$$\bar{T}(x, s) = \frac{\cosh\sqrt{s/K}\,(\ell - x)}{k\sqrt{s/K}\,\sinh\sqrt{s/K}\,\ell}$$

The first expression may be regarded as a partial fraction expansion of the second, using the relationship

$$\sinh z = z\prod_{r=1}^{\infty}\left(1 + \frac{z^2}{r^2\pi^2}\right)$$
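The agreement between the two forms of the transfer function can be checked numerically for real positive $s$. The sketch below compares the closed form with a truncated version of the series; the function names, the parameter values and the truncation length are illustrative assumptions, and the restriction to real $s$ is made only to keep the example within the standard `math` module.

```python
import math


def closed_form(x, s, K=1.0, k=1.0, length=1.0):
    """cosh(sqrt(s/K)*(length - x)) / (k*sqrt(s/K)*sinh(sqrt(s/K)*length)), real s > 0."""
    mu = math.sqrt(s / K)
    return math.cosh(mu * (length - x)) / (k * mu * math.sinh(mu * length))


def series_form(x, s, K=1.0, k=1.0, length=1.0, terms=20000):
    """Truncated partial fraction (modal) expansion of the same transfer function."""
    total = K / (k * length * s)                              # r = 0 term
    for r in range(1, terms + 1):
        pole = r * r * math.pi ** 2 * K / length ** 2         # pole at s = -pole
        total += (2.0 * K / (k * length)) * math.cos(r * math.pi * x / length) / (s + pole)
    return total


for x in (0.0, 0.3, 1.0):
    print(x, closed_form(x, 2.0), series_form(x, 2.0))
```

The series converges only like the reciprocal of the number of terms (slowest at $x = 0$), which is one concrete instance of the convergence difficulties mentioned in this section.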

We notice that extension of feedback control ideas to partial differential equations involves a number of difficulties: more complex solutions, more elaborate Laplace transforms, convergence of series, and use of Dirac delta functions. A comprehensive account of control of distributed parameter systems still has to be written.

7. CONTROL CONCEPTS FOR PARTIAL DIFFERENTIAL EQUATIONS. II

7.1. Control of the wave equation

Figure 7.1 illustrates an angular position control of a uniform flexible shaft. The torsion of the shaft gives rise to the one-dimensional wave equation

$$I\frac{\partial^2\theta}{\partial t^2} = GJ\frac{\partial^2\theta}{\partial x^2} \quad\text{or}\quad \frac{\partial^2\theta}{\partial t^2} = c^2\frac{\partial^2\theta}{\partial x^2}, \quad\text{where } c^2 = \frac{GJ}{I}$$

FIG. 7.1. Angular position control of flexible shaft.

$I$ is the moment of inertia per unit length in the x-direction and $GJ$ is the torsional stiffness of the shaft (in conventional notation). The boundary conditions are

$$-GJ\frac{\partial\theta}{\partial x}(0, t) = C^*(t), \qquad \frac{\partial\theta}{\partial x}(\ell, t) = 0$$

where $C^*$ is the torque produced by the motor.

While we could develop a transfer function approach which would be quite similar to the procedure in Section 6, we shall consider here a different problem, that of "controllability". In particular, we shall consider what torque $C^*$ should be applied to bring the shaft to rest from any given initial condition, which involves specification of $\theta(x, 0)$ and $\partial\theta/\partial t\,(x, 0)$ for each $x$, $0 \le x \le \ell$. Let us suppose we require $\partial\theta/\partial x\,(x, T) = 0$ and $\partial\theta/\partial t\,(x, T) = 0$ for all $0 \le x \le \ell$ and $T$ to be as small as possible.

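In this problem the characteristics in the $(x, t)$-plane are very useful. We know from Section 5 that

$$\frac{\partial\theta}{\partial t} + c\frac{\partial\theta}{\partial x} \text{ is constant on lines } x + ct = \text{constant}$$

and

$$\frac{\partial\theta}{\partial t} - c\frac{\partial\theta}{\partial x} \text{ is constant on lines } x - ct = \text{constant.}$$

Figure 7.2 shows the characteristics diagram when $T = 2\ell/c$. Let us suppose that the objectives have been achieved at $t = T = 2\ell/c$. Then at P, both $\partial\theta/\partial t$ and $\partial\theta/\partial x$ are zero, and so $\partial\theta/\partial t + c\,\partial\theta/\partial x$ is zero along PA and $\partial\theta/\partial t - c\,\partial\theta/\partial x$ is zero along PC.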

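FIG. 7.2. Characteristics diagram for $T = 2\ell/c$.

At C the torque $C^*(t)$ is given by

$$-GJ\frac{\partial\theta}{\partial x} = C^*$$

and so

$$\frac{\partial\theta}{\partial t} = c\frac{\partial\theta}{\partial x} = -\frac{c}{GJ}\,C^*\!\left(\frac{2\ell - x}{c}\right)$$

Along CD

$$\frac{\partial\theta}{\partial t} + c\frac{\partial\theta}{\partial x} = \text{constant} = -\frac{2c}{GJ}\,C^*\!\left(\frac{2\ell - x}{c}\right)$$

At D

$$\frac{\partial\theta}{\partial x} = 0, \qquad \frac{\partial\theta}{\partial t} = -\frac{2c}{GJ}\,C^*\!\left(\frac{2\ell - x}{c}\right)$$

and along DP' and at P'

$$\frac{\partial\theta}{\partial t} - c\frac{\partial\theta}{\partial x} = \text{constant} = -\frac{2c}{GJ}\,C^*\!\left(\frac{2\ell - x}{c}\right) \qquad (7.1)$$

Similarly, following the path PABP', we obtain that at P'

$$\frac{\partial\theta}{\partial t} + c\frac{\partial\theta}{\partial x} = -\frac{2c}{GJ}\,C^*\!\left(\frac{x}{c}\right) \qquad (7.2)$$

Thus the torque $C^*(t)$ is uniquely determined from (7.1) for $\ell/c \le t \le 2\ell/c$ and from (7.2) for $0 \le t \le \ell/c$, given the values of $\partial\theta/\partial t$ and $\partial\theta/\partial x$ at $t = 0$ and for $0 \le x \le \ell$.

The angle turned through at $x = 0$ may be found by taking the integral of $\partial\theta/\partial t\,(0, t)$, which is equal to

$$-\frac{c}{GJ}\,C^*\!\left(\frac{x}{c}\right) \quad\text{and}\quad -\frac{c}{GJ}\,C^*\!\left(\frac{2\ell - x}{c}\right)$$

at B and C, respectively. Hence, from (7.1) and (7.2),

$$\theta(0, T) = \theta(0, 0) + \frac{1}{c}\int_0^{\ell} \frac{\partial\theta}{\partial t}(x, 0)\,dx$$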

A more general problem is how to move the shaft from an undeformed position with $\theta(x, 0) = 0$, $\partial\theta/\partial t\,(x, 0) = 0$, $0 \le x \le \ell$, to a new position with $\theta(x, T) = \Theta$, $\partial\theta/\partial t\,(x, T) = 0$, $0 \le x \le \ell$.

We may apply a pulse torque of magnitude $C^*$ constant over the time interval $(-h, 0)$, giving initial conditions

$$\frac{\partial\theta}{\partial t} = \frac{cC^*}{GJ}, \qquad \frac{\partial\theta}{\partial x} = -\frac{C^*}{GJ} \qquad \text{for } 0 \le x \le ch \text{ at } t = 0$$

To bring this to rest we require a torque of magnitude $-C^*$ for $(2\ell/c) - h < t < 2\ell/c$. The total angle turned by the shaft is $2C^*h/Ic$, the total time $(2\ell/c) + h$. By increasing $C^*$ and decreasing $h$ we can achieve a given angle in time $2\ell/c$, by use of two impulses at $t = 0$ and $t = 2\ell/c$.

We have concentrated on the $T = 2\ell/c$ case since an examination of the characteristics diagram corresponding to Fig. 7.2 reveals that in general the problem of bringing the shaft to rest in a time less than $2\ell/c$ is impossible, no matter what size the torque $C^*(t)$ is permitted. This is in marked contrast to controllability of lumped systems, which may be brought to rest instantaneously if large enough controls are applied (using, of course, Dirac delta functions and their derivatives if need be).

7.2. Parasitic oscillations

Many flexible devices having stiffness and inertia and obeying the wave equation or similar equations (the beam equation, for example) exhibit unstable parasitic oscillations when they are included in a feedback loop, that is when they form a link between an actuator and a feedback instrument. These parasitic instabilities may be analysed by use of the Nyquist diagram and may sometimes be avoided by careful siting of the feedback instruments. In particular, if it is possible to site the feedback instrument close to the actuator, the time delay involved in waves travelling from one to the other is avoided. It is really this time delay which causes the instability.

8. NOISE IN LINEAR SYSTEMS

An important part of linear system theory is concerned with the treatment of random inputs and outputs. Given the "power spectral density" of the input signal, the power spectral density of the output signal is easily calculated, and other quantities such as the mean square output may then be found. Examples of random inputs to systems include the motion of a motor car on its suspension when running over a rough road, and glint noise entering an automatic tracking radar scanner or a homing missile guidance system.

Given a collection or ensemble of records of a random signal plotted as functions of time (Fig. 8.1), we distinguish two kinds of average of the properties of the records:
(i) a time average taken along a particular record,
(ii) an ensemble average taken at a particular time across the collection of records.

If the ensemble averages of various properties are constant with time, the records are said to be "stationary".

FIG. 8.1. Ensemble and time averages.

FIG. 8.2. Amplitude probability distributions: (a) Gaussian noise, (b) random square wave, $p(y) = \tfrac{1}{2}\delta(y + A) + \tfrac{1}{2}\delta(y - A)$, (c) sine wave.

If, in addition, the time average equals the ensemble average, then the records satisfy the "ergodic hypothesis". We thus have the set-theoretical formulation:

non-stationary signals ⊃ stationary signals ⊃ ergodic signals

The "ergodic hypothesis" is often made in engineering problems in the absence of evidence to the contrary.

The most important properties of a random signal are
(i) its amplitude probability distribution,
(ii) its frequency content or power spectral density.

The amplitude probability distribution $p(y)$ may be measured as an ensemble average, or on a particular record by considering the proportion of time $p(y)\,\delta y$ spent by the signal in a small interval $(y, y + \delta y)$ as $\delta y \to 0$. Some examples are given in Fig. 8.2.

The frequency content or power spectral density $S^*(\omega)$ may be regarded as the power or mean square output $S^*(\omega)\,\delta\omega$ from an ideal filter with centre frequency $\omega$ and small bandwidth $\delta\omega$ as $\delta\omega \to 0$. Mathematically it is usual to distribute the power spectral density equally over positive and negative $\omega$, taking $S(\omega) = \tfrac{1}{2}S^*(\omega)$. While there are spectrum analysers which work in this way using analogue or recorded signals, another approach to spectral analysis is via the autocorrelation function of the signal $y(t)$.

The autocorrelation function $R(\tau)$ is the time average $R(\tau) = \overline{y(t)\,y(t - \tau)}$. Note that $R(0)$ is the mean square of the signal and that $R(\tau) = R(-\tau) \le R(0)$. The autocorrelation function $R(\tau)$ and the power spectral density function $S(\omega)$ are Fourier transforms one of the other:

$$S(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty} R(\tau)\, e^{-i\omega\tau}\, d\tau, \qquad R(\tau) = \int_{-\infty}^{\infty} S(\omega)\, e^{i\omega\tau}\, d\omega$$

Since $S(\omega)$ and $R(\tau)$ are both even functions we may rewrite these relationships as

$$S^*(\omega) = \frac{2}{\pi}\int_{0}^{\infty} R(\tau)\cos\omega\tau\, d\tau, \qquad R(\tau) = \int_{0}^{\infty} S^*(\omega)\cos\omega\tau\, d\omega$$

If we put $\omega = 2\pi f$, where $f$ is in cycles per second (Hz) ($\omega$ being in radians per second), the integrals become

$$S^*(f) = 4\int_{0}^{\infty} R(\tau)\cos 2\pi f\tau\, d\tau, \qquad R(\tau) = \int_{0}^{\infty} S^*(f)\cos 2\pi f\tau\, df$$

where $S^*(f) = 2\pi S^*(\omega)$ is measured in power units per cycle per second. These are known as the Wiener-Khintchine relations.

This relationship may be demonstrated by taking a finite length $2T$ of the random signal, carrying out a Fourier analysis (regarding it as periodic with period $2T$), calculating the power from the Fourier coefficients and forming an estimate of the power spectral density by distributing this power over the frequency interval $2\pi/2T$. The resulting expression may be written as a double integral. This integral may be evaluated in a different way, giving an expression involving $\overline{y(t)\,y(t + \tau)}$ with $-T < \tau < T$. By letting $T \to \infty$ the first of the Fourier transform pair is obtained.

Some examples of the Fourier-transform relationship are given in Fig. 8.3.

FIG. 8.3. Autocorrelation functions and power spectral density functions: (a) sine wave, (b) wide-band noise, (c) narrow-band noise.

The reciprocal relationship between $R(\tau)$ and $S(\omega)$ should be noted: a "narrow" $R(\tau)$ corresponds to a "broad" $S(\omega)$, and vice versa. Extreme cases of this are shown in Fig. 8.4, where we have $y(t)$ as "direct current" and "white noise" signals. While "direct current" is a realistic signal, "white noise" is a useful mathematical concept which cannot exist in reality, since its mean square or power is infinite.

FIG. 8.4. "Direct current" ($y(t) = A$, constant) and "white noise".

We now have to consider the response of a stable linear system with transfer function $H(s)$ and impulse response $h(t)$ to a random input of known power spectral density $G_{11}(\omega)$, say. There are two ways of thinking about this. First, considering the effect of the transfer function on sine waves, we can deduce heuristically that the power spectral density of the output $G_{00}(\omega)$ is given by

$$G_{00}(\omega) = |H(i\omega)|^2\, G_{11}(\omega)$$

A more sophisticated approach is via the cross-correlation function $R_{01}(\tau) = \overline{y_0(t)\,y_1(t - \tau)}$ and its Fourier transform, the cross-power spectrum $G_{01}(\omega)$. Now

$$y_0(t) = \int_{\tau = 0}^{\infty} h(\tau)\, y_1(t - \tau)\, d\tau$$

from which it follows, on multiplying through by $y_1(t - \tau_1)$ and taking time averages, that

$$R_{01}(\tau_1) = \int_{\tau = 0}^{\infty} h(\tau)\, R_{11}(\tau_1 - \tau)\, d\tau$$

where $R_{11}(\tau)$ is the autocorrelation function of $y_1(t)$. Taking Fourier transforms yields

$$G_{01}(\omega) = H(i\omega)\, G_{11}(\omega)$$

Multiplying by $y_0(t - \tau_1)$ and time-averaging yields

$$R_{00}(\tau_1) = \int_{\tau = 0}^{\infty} h(\tau)\, R_{10}(\tau_1 - \tau)\, d\tau$$

Hence

$$G_{00}(\omega) = H(i\omega)\, G_{10}(\omega) = H(i\omega)\, G_{01}(-\omega) = H(i\omega)\, H(-i\omega)\, G_{11}(\omega) = |H(i\omega)|^2\, G_{11}(\omega)$$

which is the result obtained above.

The mean square output, $\sigma^2$, may then be calculated as

$$\sigma^2 = \int_{-\infty}^{\infty} G_{00}(\omega)\, d\omega = \int_{-\infty}^{\infty} |H(i\omega)|^2\, G_{11}(\omega)\, d\omega = \int_{-\infty}^{\infty} H(i\omega)\, H(-i\omega)\, G_{11}(\omega)\, d\omega$$

This integral may be calculated by the residue theorems of the complex integral calculus, and standard forms for polynomial fractions in $\omega$ are available up to degree 9.

For example: if $G_{11} = G$, constant ("white noise" input), and

$$H(i\omega) = \frac{K}{(i\omega + a)(i\omega + b)}, \qquad K, a, b > 0$$

then we can consider the integral

$$\oint_C \frac{K^2 G\, dz}{(z + a)(z + b)(-z + a)(-z + b)}$$

where $C$ is a large semicircle in the left-hand half-plane and its diameter (the imaginary axis). The integrand has simple poles at $z = -a$ and $z = -b$ inside $C$ and so by the residue theorem

$$\oint_C \frac{K^2 G\, dz}{(z + a)(z + b)(-z + a)(-z + b)} = \int_{\omega = -\infty}^{\infty} \frac{K^2 G\, i\, d\omega}{(i\omega + a)(i\omega + b)(-i\omega + a)(-i\omega + b)} = i\sigma^2$$

$$= 2\pi i\ (\text{sum of residues at poles within } C)$$

$$= 2\pi i \left[ \lim_{z \to -a} \frac{K^2 G}{(z + b)(-z + a)(-z + b)} + \lim_{z \to -b} \frac{K^2 G}{(z + a)(-z + a)(-z + b)} \right]$$

$$= 2\pi i\, K^2 G \left[ \frac{1}{(b - a)\, 2a\, (a + b)} + \frac{1}{(a - b)(b + a)\, 2b} \right]$$

$$= \frac{2\pi i\, K^2 G}{(a + b)\, 2ab}$$

Hence it follows that

$$\sigma^2 = \frac{\pi K^2 G}{(a + b)\, ab}$$
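The closed form above can be checked numerically. The following is a minimal sketch, not part of the original lecture: it integrates $|H(i\omega)|^2 G$ directly by the trapezoidal rule over a large but finite frequency range and compares the result with $\pi K^2 G / ((a+b)ab)$. The function names, the truncation limit `w_max` and the parameter values are illustrative assumptions.

```python
import math

def mean_square_output(K, G, a, b, w_max=400.0, steps=400_000):
    """Trapezoidal estimate of the integral of |H(iw)|^2 G over (-w_max, w_max)."""
    def integrand(w):
        # |H(iw)|^2 = K^2 / ((w^2 + a^2)(w^2 + b^2)) for H(iw) = K / ((iw + a)(iw + b))
        return K * K * G / ((w * w + a * a) * (w * w + b * b))
    dw = 2.0 * w_max / steps
    total = 0.5 * (integrand(-w_max) + integrand(w_max))
    for n in range(1, steps):
        total += integrand(-w_max + n * dw)
    return total * dw

def closed_form(K, G, a, b):
    """Residue-theorem result: sigma^2 = pi K^2 G / ((a + b) a b)."""
    return math.pi * K * K * G / ((a + b) * a * b)

# Illustrative parameter values (assumed, not from the text)
numeric = mean_square_output(K=2.0, G=0.5, a=1.0, b=3.0)
exact = closed_form(K=2.0, G=0.5, a=1.0, b=3.0)
print(numeric, exact)
```

The integrand decays like $1/\omega^4$, so the part of the integral neglected beyond `w_max` is very small for the values chosen here.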

An analogous theory exists for discrete time systems described by difference equations. The output sequence $\{y_0(t)\}$ is related to the input sequence $\{y_1(t)\}$ (where $t$ takes discrete values $\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots$) by the relationship

$$y_0(t) = \sum_{n = 0}^{\infty} h(n)\, y_1(t - n)$$

where the sequence $\{h(n)\}$ corresponds to the unit impulse response $h(t)$ in the continuous case described above. With a stationary random input with the autocorrelation function $R_{11}(\tau) = \overline{y_1(t)\,y_1(t - \tau)}$, defined for integer values of $\tau$, we obtain

$$R_{00}(\tau) = \sum_{k = 0}^{\infty} \sum_{\ell = 0}^{\infty} h(k)\, h(\ell)\, R_{11}(\tau + \ell - k)$$

The spectral density function is defined as

$$S(\omega) = \frac{1}{2\pi} \sum_{n = -\infty}^{\infty} R(n)\, e^{in\omega}$$

where

$$R(n) = \int_{-\pi}^{\pi} S(\omega)\, e^{-in\omega}\, d\omega$$

(The $R(n)$ defined for integer $n$ are the Fourier coefficients of the periodic function $S(\omega)$ defined on the basic interval $(-\pi, \pi)$.)

Now the pulse transfer function using z-transform notation is

$$H(z) = \sum_{n = 0}^{\infty} h(n)\, z^{-n}$$

and so taking the transform of $R_{00}(\tau)$ we obtain

$$S_{00}(\omega) = \frac{1}{2\pi} \sum_{n = -\infty}^{\infty} \sum_{k = 0}^{\infty} \sum_{\ell = 0}^{\infty} e^{ik\omega} h(k)\, e^{-i\ell\omega} h(\ell)\, e^{i(n + \ell - k)\omega}\, R_{11}(n + \ell - k) = H(e^{-i\omega})\, H(e^{i\omega})\, S_{11}(\omega)$$

To find the mean square output $R_{00}(0)$ we need to evaluate

$$\int_{-\pi}^{\pi} H(e^{-i\omega})\, H(e^{i\omega})\, S_{11}(\omega)\, d\omega$$

Such integrals can often be evaluated by considering a contour integral around the unit circle (where $z = e^{i\omega}$, $dz = iz\, d\omega$) and evaluating this by use of the residue theorem. If the discrete system is stable, which is a necessary requirement, then $H(z)$ will have all its poles inside the unit circle.
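As a concrete instance of the discrete-time formulas, the sketch below takes a first-order system with $h(n) = a^n$ and an uncorrelated input with $R_{11}(0) = \sigma^2$ and $R_{11}(n) = 0$ otherwise, so that $S_{11}(\omega) = \sigma^2/2\pi$. The choice of system, the function names and the parameter values are assumptions made for illustration only. The mean square output is computed both from the frequency-domain integral and from the double sum, which for this input reduces to $\sigma^2 \sum h(n)^2 = \sigma^2/(1 - a^2)$.

```python
import cmath
import math

def pulse_transfer(z, a):
    """H(z) = sum_{n>=0} a^n z^(-n) = 1 / (1 - a/z), valid for |a| < |z|."""
    return 1.0 / (1.0 - a / z)

def mean_square_by_integral(a, sigma2, steps=20_000):
    """Midpoint-rule estimate of the integral of H(e^-iw) H(e^iw) S11(w) over (-pi, pi)."""
    s11 = sigma2 / (2.0 * math.pi)  # spectral density of the uncorrelated input
    dw = 2.0 * math.pi / steps
    total = 0.0
    for n in range(steps):
        w = -math.pi + (n + 0.5) * dw
        z = cmath.exp(1j * w)
        # H(e^-iw) is H evaluated at 1/z when z = e^iw
        total += (pulse_transfer(1.0 / z, a) * pulse_transfer(z, a)).real * s11 * dw
    return total

def mean_square_by_sum(a, sigma2, terms=2_000):
    """R00(0) = sigma2 * sum of h(n)^2 when the input is uncorrelated."""
    return sigma2 * sum((a ** n) ** 2 for n in range(terms))

print(mean_square_by_integral(0.5, 2.0), mean_square_by_sum(0.5, 2.0))
```

The single pole of this $H(z)$ lies at $z = a$, inside the unit circle when $|a| < 1$, which is the stability requirement stated above.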


9. DISCRETE NOISE PROCESSES

To simulate noisy systems on a digital or analogue computer it is useful to be able to make up random signals with prescribed amplitude probability distributions and with prescribed spectral density properties. Such signals are also useful for system identification using cross-correlation techniques. We shall discuss a number of examples.

9.1. The random square wave

Consider time divided up into a sequence of intervals each of length $h$. The random square wave $y(t)$ takes the constant value $+A$ or $-A$ for $nh \le t < (n + 1)h$ for $n = \{\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots\}$ so that (i) there is an equal probability of $+A$ or of $-A$ in each interval and (ii) there is no correlation between the values $+A$ and $-A$ in different intervals (Fig. 9.1).

The amplitude probability distribution function, $p(a)$, is clearly two Dirac delta functions at $\pm A$ (Fig. 9.2), each of magnitude $\tfrac{1}{2}$.

FIG. 9.2. Amplitude probability distribution of Fig. 9.1.

The spectral density of the random square wave is obtained from its autocorrelation function. This may be calculated by considering pairs of points, a distance $\tau$ apart, with the left-hand point placed at random on the t-axis. When $\tau = 0$ we shall obtain the mean square, which is clearly $A^2$. When $\tau > h$, the mean value of $y(t)\,y(t + \tau)$ calculated from lots of pairs of points is zero, because of the zero correlation in different intervals. When $0 < \tau < h$ the two points of one pair sometimes lie in the same interval and sometimes in adjacent intervals, in proportion $1 - \tau/h$ to $\tau/h$. From this we deduce the autocorrelation function $R(\tau)$ shown in Fig. 9.3.

With this the power spectral density function $S(\omega)$ is obtained from the Fourier transform of $R(\tau)$ as

$$S(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty} R(\tau)\, e^{-i\omega\tau}\, d\tau = \frac{A^2 h}{2\pi}\, \frac{\sin^2(\omega h/2)}{(\omega h/2)^2}$$

sketched in Fig. 9.4.

FIG. 9.3. Autocorrelation function of the random square wave.

FIG. 9.4. Power spectral density of the random square wave.

A modification of this random square wave is to replace the values of $\pm A$ by a sequence of random numbers with a zero mean and given variance $\sigma^2$. This will produce a random stepping signal with a prescribed amplitude probability distribution and (by similar arguments) a power spectral density

$$S(\omega) = \frac{\sigma^2 h}{2\pi}\, \frac{\sin^2(\omega h/2)}{(\omega h/2)^2}$$

FIG. 9.5. Random telegraph signal.
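The Fourier-transform step for the random square wave can be verified numerically. This is a small sketch under stated assumptions: the triangular autocorrelation $R(\tau) = A^2(1 - |\tau|/h)$ for $|\tau| < h$ is integrated against $\cos\omega\tau$ (the sine part vanishes because $R$ is even) and compared with the closed form. Function names and test values are illustrative.

```python
import math

def autocorrelation(tau, A, h):
    """Triangular autocorrelation of the random square wave."""
    return A * A * (1.0 - abs(tau) / h) if abs(tau) < h else 0.0

def psd_closed_form(w, A, h):
    """S(w) = (A^2 h / 2 pi) sin^2(wh/2) / (wh/2)^2, with the limit A^2 h / 2 pi at w = 0."""
    x = w * h / 2.0
    if x == 0.0:
        return A * A * h / (2.0 * math.pi)
    return A * A * h / (2.0 * math.pi) * (math.sin(x) / x) ** 2

def psd_numeric(w, A, h, steps=20_000):
    """(1 / 2 pi) times the integral of R(tau) cos(w tau) over (-h, h), midpoint rule."""
    dt = 2.0 * h / steps
    total = 0.0
    for n in range(steps):
        tau = -h + (n + 0.5) * dt
        total += autocorrelation(tau, A, h) * math.cos(w * tau) * dt
    return total / (2.0 * math.pi)

for w in (0.0, 1.0, 5.0):
    print(w, psd_numeric(w, A=2.0, h=0.5), psd_closed_form(w, A=2.0, h=0.5))
```

Replacing $A^2$ by $\sigma^2$ gives the corresponding check for the random stepping signal.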

9.2. The random telegraph signal

Random telegraph signal is the name given to a signal taking the two values $+A$ and $-A$ with equal probability, where the switching points form a Poisson process in time with an average frequency of $\nu$ switches per unit time (Fig. 9.5). Its amplitude probability distribution is clearly the same as that shown in Fig. 9.2, but its autocorrelation function and power spectral density are quite different from those of the random square wave. If we consider many pairs of points, the two points of each pair being a distance $\tau$ apart, and then suppose $\tau$ to be increased to $\tau + \delta\tau$ then, by considering the number of switching points which will be included in the many $\delta\tau$ intervals, we deduce that

$$R(\tau + \delta\tau) = -R(\tau)\,\nu\,\delta\tau + R(\tau)\,(1 - \nu\,\delta\tau)$$

giving the differential equation for $R(\tau)$

$$\frac{dR}{d\tau} = -2\nu R(\tau)$$

Now $R(0) = A^2$, so $R(\tau) = A^2 e^{-2\nu|\tau|}$. By Fourier transformation

$$S(\omega) = \frac{2\nu A^2}{\pi(4\nu^2 + \omega^2)}$$

See Figs 9.6 and 9.7.
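The exponential autocorrelation can also be reached by a route that is not the one used in the text: the product $y(t)\,y(t + \tau)$ equals $+A^2$ when an even number of switches falls in the interval of length $|\tau|$ and $-A^2$ when the number is odd, and the number of switches is Poisson-distributed with mean $\nu|\tau|$. The sketch below sums this series and compares it with $A^2 e^{-2\nu|\tau|}$; the function name and the values used are illustrative assumptions.

```python
import math

def telegraph_autocorrelation(tau, A, nu, terms=200):
    """A^2 * sum over k of (-1)^k P(k switches in |tau|), P being Poisson with mean nu*|tau|."""
    mean = nu * abs(tau)
    prob = math.exp(-mean)  # P(0 switches)
    total = 0.0
    for k in range(terms):
        total += (-1) ** k * prob
        prob *= mean / (k + 1)  # Poisson recurrence: P(k+1) = P(k) * mean / (k + 1)
    return A * A * total

for tau in (0.0, 0.3, -1.2):
    print(tau, telegraph_autocorrelation(tau, A=2.0, nu=1.5),
          4.0 * math.exp(-2.0 * 1.5 * abs(tau)))
```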

FIG. 9.6. Autocorrelation function for the random telegraph signal.

FIG. 9.7. Power spectral density function for the random telegraph signal.

FIG. 9.8. Autocorrelation function for the differenced random square wave.

FIG. 9.9. Power spectral density of Fig. 9.8.

9.3. "Differentiated" random sequences

One may "tailor" power spectral densities by differencing given random sequences. For example, if we consider the random square wave or the random stepping wave formed from the random sequence $\{y_n\}$ say, with mean 0 and variance $\sigma^2$, then we can form a new sequence $\{z_n\}$ given by $z_n = y_n - y_{n-1}$ and a random stepping wave $z(t)$ by putting $z(t) = z_n$ for $nh \le t < (n + 1)h$.

The new waveform has the autocorrelation function shown in Fig. 9.8. From the Fourier transform, after some calculation, we find

$$S(\omega) = \frac{2\sigma^2 h}{\pi}\, \frac{\sin^4(\omega h/2)}{(\omega h/2)^2}$$

See Fig. 9.9.

The fact that $S(0) = 0$ could be deduced directly from the form of $R(\tau)$, since

$$S(0) = \frac{1}{2\pi}\int_{-\infty}^{\infty} R(\tau)\, d\tau = 0$$

from Fig. 9.8.

9.4. An approximation to "white noise"

If we examine the random stepping wave with an autocorrelation function

$$R(\tau) = \sigma^2 (1 - |\tau|/h) \quad \text{for } 0 \le |\tau| \le h, \qquad R(\tau) = 0, \quad |\tau| > h$$

and a power spectral density

$$S(\omega) = \frac{\sigma^2 h}{2\pi}\, \frac{\sin^2(\omega h/2)}{(\omega h/2)^2}$$

we may consider letting $\sigma^2 \to \infty$ and $h \to 0$ in such a way that $\sigma^2 h = \text{constant} = 2\pi G$, say. This means that the autocorrelation function becomes a delta function at $\tau = 0$ of magnitude $\sigma^2 h = 2\pi G$, and the power spectral density becomes a constant ($G$) at all frequencies.

In practice we would make $h$ small compared with system time constants (or $2\pi/h$ large compared with the system pass-band), and $\sigma^2$ large so that $\sigma^2 h/2\pi = G$ (given).

9.5. Pseudo-random binary sequences

These signals, which are in fact periodic, are particularly useful for system identification using cross-correlation of output with input. There are two important classes: quadratic residue sequences and shift register or M-sequences.

Quadratic residue sequences are based on work in the theory of numbers due to Gauss. We take a prime number $N = 4k - 1$, where $k$ is an integer, and calculate the numbers $1^2, 2^2, 3^2, \ldots, \{(4k - 2)/2\}^2$ modulo $N$. We form a signal $y(t)$ which is $+A$ in intervals corresponding to these $(N - 1)/2$ numbers and is $-A$ in the other intervals, making up $N$ intervals in all. The signal is then repeated to form a periodic signal of period $Nh$, where $h$ is the length of the basic interval.

For example: when $N = 7$ ($k = 2$) we calculate $1^2, 2^2, 3^2$ (modulo 7) $= 1, 4, 2$ and thus form the signal shown in Fig. 9.10. This signal has the autocorrelation function shown in Fig. 9.11. This gives a line spectrum (Fig. 9.12) with a frequency separation of lines equal to $2\pi/Nh$ rad/s. The envelope is of the form $\left(\dfrac{\sin(\omega h/2)}{\omega h/2}\right)^2$.

FIG. 9.12. Line spectrum of Fig. 9.10.

FIG. 9.13. Quadratic residue binary sequence, N = 1019.

A longer quadratic residue sequence with N = 1019 and A = 1 is given in Fig. 9.13.
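The construction of a quadratic residue sequence is easy to reproduce. The sketch below builds one period as a list of +1 and -1 values (taking $A = 1$) and evaluates the periodic autocorrelation sum. It assumes that the $N$ intervals are numbered 0 to $N - 1$ and that interval $n$ carries $+A$ exactly when $n$ is one of the squares modulo $N$; the function names are hypothetical.

```python
def quadratic_residue_sequence(N):
    """One period of the +1/-1 pattern for a prime N = 4k - 1."""
    residues = {(i * i) % N for i in range(1, (N - 1) // 2 + 1)}
    return [1 if n in residues else -1 for n in range(N)]

def periodic_autocorrelation(seq, shift):
    """Sum of seq[n] * seq[n + shift] over one period, indices taken modulo N."""
    N = len(seq)
    return sum(seq[n] * seq[(n + shift) % N] for n in range(N))

seq7 = quadratic_residue_sequence(7)
print(seq7)
print([periodic_autocorrelation(seq7, s) for s in range(7)])
```

For $N = 7$ the sum is 7 at zero shift and -1 at every other shift, which after division by $N$ gives the two-level autocorrelation described in the text.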

FIG. 9.14. Two-level shift register with binary adder.

Another form of pseudo-random binary sequence is the "M-sequence", which can be generated with a 2-level shift register having appropriate feedback arrangements and a clock with period $h$. Consider the system shown in Fig. 9.14, where a, b and c are registers and d is a binary adder. Let us initially set the binary numbers 1 in a, 0 in b and 0 in c. At each clock pulse the digits are shifted one to the right, whilst the vacant place in a is filled with the modulo-2 sum of the digits in b and c before the shift, this sum being calculated by d. The sequence of operations is as follows:

Time    Register
        a   b   c
0       1   0   0
h       0   1   0
2h      1   0   1
3h      1   1   0
4h      1   1   1
5h      0   1   1
6h      0   0   1
7h      1   0   0

This pattern is then repeated. If we tap the output of the register a and convert this into a square wave where 0 becomes $+A$ and 1 becomes $-A$ we obtain the signal shown in Fig. 9.15. (In this case this becomes equivalent to the waveform of Fig. 9.10 shifted by $2h$ to the left, and the autocorrelation function of Fig. 9.15 is identical to Fig. 9.11.)

Any sequence of length $N = 2^k - 1$ generated by a $k$-stage shift register is called a maximal length sequence or "M-sequence". In such a sequence the registers go through all possible states except the all-zero state, and it follows from this that there is one more 1 than 0 in such a sequence.
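The register table can be regenerated in a few lines. This is a minimal sketch of the three-stage register of Fig. 9.14 as described in the text (new a = b + c modulo 2, then shift right); the function name and the tuple representation of the register contents are assumptions made for illustration.

```python
def shift_register_states(steps, start=(1, 0, 0)):
    """Successive (a, b, c) contents; the new a is (b + c) mod 2 taken before the shift."""
    a, b, c = start
    states = []
    for _ in range(steps):
        states.append((a, b, c))
        a, b, c = (b + c) % 2, a, b
    return states

states = shift_register_states(8)
for n, state in enumerate(states):
    print(f"{n}h", *state)

# One period of the sequence tapped from register a
output = [a for a, _, _ in states[:7]]
print(output)
```

The seven states before the pattern repeats are all distinct and exclude (0, 0, 0), so the tapped sequence contains four 1s and three 0s.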

An important property of such sequences is that if a sequence is added modulo 2 to a delayed version of itself, then the original sequence with a new delay is formed; for example

Original sequence (register a)    1 0 1 1 1 0 0 1 0 1 1 1
Single delay of h (register b)    0 1 0 1 1 1 0 0 1 0 1 1
Addition modulo 2                 1 1 1 0 0 1 0 1 1 1 0 0

(The resulting delay is $5h$.)

Modulo 2 addition of the sequence is equivalent to autocorrelation of the signal $y(t)$ in Fig. 9.15, e.g.

t               0+     h+     2h+    3h+        0+   h+   2h+  3h+
y(t)            -A     +A     -A     -A         1    0    1    1
y(t + 2h)       -A     -A     -A     +A         1    1    1    0
y(t) y(t + 2h)  +A^2   -A^2   +A^2   -A^2       0    1    0    1

It follows that the autocorrelation function has one more $-A^2$ than $+A^2$ adding over one period $Nh$. Hence $R(\tau) = -A^2/N$ for $h \le \tau \le (N - 1)h$. This yields the now familiar form shown in Fig. 9.16.
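Both properties, shift-and-add and the two-level autocorrelation, can be confirmed for the seven-step sequence. The sketch below regenerates the sequence from the three-stage register, finds the delay of the modulo-2 sum, and averages $y(t)\,y(t + \tau)$ over one period at whole multiples of $h$. Function names are hypothetical and $A = 1$ is assumed by default.

```python
def m_sequence():
    """One period of the register-a output of the three-stage register (start 1, 0, 0)."""
    a, b, c = 1, 0, 0
    out = []
    for _ in range(7):
        out.append(a)
        a, b, c = (b + c) % 2, a, b
    return out

def delayed(seq, d):
    """Sequence delayed cyclically by d clock periods."""
    N = len(seq)
    return [seq[(n - d) % N] for n in range(N)]

def shift_and_add_delay(seq, d):
    """Delay of the sequence obtained by adding seq, modulo 2, to its d-step delay."""
    summed = [(x + y) % 2 for x, y in zip(seq, delayed(seq, d))]
    for e in range(len(seq)):
        if delayed(seq, e) == summed:
            return e
    return None

def autocorrelation(seq, shift, A=1.0):
    """Average of y(t) y(t + shift*h) over one period, with 0 -> +A and 1 -> -A."""
    N = len(seq)
    y = [A if bit == 0 else -A for bit in seq]
    return sum(y[n] * y[(n + shift) % N] for n in range(N)) / N

seq = m_sequence()
print(shift_and_add_delay(seq, 1))
print([autocorrelation(seq, s) for s in range(7)])
```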

FIG. 9.15. Square-wave signal for register pattern.

FIG. 9.16. Autocorrelation function.

FIG. 9.17. Corresponding shift register as in Fig. 9.10.

The shift register operation may be described by a polynomial in the delay operator $D$. Figure 9.17 shows the corresponding shift register diagram.

$$x = D^3 x \oplus D^2 x$$

or

$$D^3 x \oplus D^2 x \oplus x = 0$$

For example

x = 0    Dx = 1    D^2x = 0    D^3x = 0
x = 1    Dx = 0    D^2x = 1    D^3x = 0
x = 1    Dx = 1    D^2x = 0    D^3x = 1

looking at the table of contents of the registers a, b, c regarded as $Dx$, $D^2x$ and $D^3x$.

The polynomial involved, $D^3 \oplus D^2 \oplus 1$, is known as the "generating polynomial". To generate an M-sequence, this polynomial must be

(a) irreducible, that is, it must have no factors; for example

$D^4 \oplus D^3 \oplus D^2 \oplus 1 = (D \oplus 1)(D^3 \oplus D \oplus 1)$ is not irreducible;

(b) primitive, that is, it must not divide exactly into any polynomial of the form $D^n \oplus 1$ for any $n$ less than $2^k - 1$, where $k$ is the degree of the original polynomial. For example

$D^4 \oplus D^3 \oplus D^2 \oplus D \oplus 1 = \dfrac{D^5 \oplus 1}{D \oplus 1}$ and $5 < 2^4 - 1$

so $D^4 \oplus D^3 \oplus D^2 \oplus D \oplus 1$ is not primitive.
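The two worked examples can be checked with polynomial arithmetic over GF(2). In the sketch below a polynomial is stored as an integer bit mask (bit $i$ is the coefficient of $D^i$), which is a representation chosen here for convenience and not taken from the text; all function names are hypothetical. `order_of_D` returns the smallest $n$ for which the polynomial divides $D^n \oplus 1$ and assumes the polynomial has constant term 1 (otherwise no such $n$ exists and the loop would not terminate).

```python
def gf2_multiply(p, q):
    """Multiply two GF(2) polynomials stored as bit masks (bit i = coefficient of D^i)."""
    result = 0
    while q:
        if q & 1:
            result ^= p
        p <<= 1
        q >>= 1
    return result

def gf2_mod(p, m):
    """Remainder of p on division by m over GF(2)."""
    deg_m = m.bit_length() - 1
    while p.bit_length() - 1 >= deg_m:
        p ^= m << (p.bit_length() - 1 - deg_m)
    return p

def order_of_D(m):
    """Smallest n >= 1 with D^n = 1 modulo m, i.e. the smallest n such that m divides D^n + 1."""
    power = gf2_mod(0b10, m)
    n = 1
    while power != 1:
        power = gf2_mod(power << 1, m)
        n += 1
    return n

# (D + 1)(D^3 + D + 1) = D^4 + D^3 + D^2 + 1
print(bin(gf2_multiply(0b11, 0b1011)))
# D^4 + D^3 + D^2 + D + 1 divides D^5 + 1, and 5 < 2^4 - 1, so it is not primitive
print(order_of_D(0b11111))
# D^3 + D^2 + 1 first divides D^n + 1 at n = 7 = 2^3 - 1
print(order_of_D(0b1101))
```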

Primitive polynomials up to $k = 34$ are given in W.W. Peterson "Error correcting codes" (Wiley, 1965); S.W. Golomb "Shift register sequences" (Holden-Day, 1967) shows that $D^{127} \oplus D \oplus 1$ is a suitable generating polynomial; it generates a sequence of length $(2^{127} - 1)h \approx 10^{37} h$! For example, if the clock frequency is $10^6$ pulses a second the sequence will repeat itself after $3 \times 10^{24}$ years.¹

¹ We note that all irreducible generating polynomials of degree $k$ divide $D^{2^k - 1} + 1$, which means the period must be a factor of $2^k - 1$. If $2^k - 1$ is prime the period of the irreducible polynomial must be $2^k - 1$ and so it is also primitive. Such prime numbers are called Mersenne primes (Mersenne, 1644). $2^k - 1$ is known to be prime for $k$ = 1, 2, 3, 5, 7, 13, 17, 19, 31, 61, 89, 107, 127 and 11213. Given a Mersenne prime $p = 2^k - 1$, then $D^k \oplus D \oplus 1$ is a primitive polynomial.

BIBLIOGRAPHY

KUO, B.C., Automatic Control Systems, Prentice-Hall (1962).

LEFSCHETZ, S., La SALLE, J.P., Stability by Liapunov's Direct Method with Applications, Academic Press (1961).

BROCKETT, R.W., Finite-dimensional Linear Systems, Wiley (1970).

OGATA, K., State Space Analysis of Control Systems, Prentice-Hall (1967).

BARNETT, S., STOREY, C., Matrix Methods in Stability Theory, Nelson (1970).

WILLEMS, J.L., Stability Theory of Dynamical Systems, Nelson (1970).

La SALLE, J.P., LEFSCHETZ, S., Stability by Liapunov's Direct Method with Applications, Academic Press (1961).

CESARI, L., Asymptotic Behaviour and Stability Problems in Ordinary Differential Equations, Springer-Verlag (1959).

PORTER, B., Stability Criteria for Linear Dynamical Systems, Oliver and Boyd (1967).

MINORSKI, N., Non-linear Oscillations, Van Nostrand (1962).

WANG, P.K.C., "Control of distributed parameter systems", Advances in Control Systems (LEONDES, C.T., Ed.), Academic Press (1964).

BUTKOVSKII, A.G., Distributed Control Systems, American Elsevier (1969).

BUTKOVSKII, A.G., Control of Systems with Distributed Parameters, Nauka, Moscow (1975).

LIONS, J.L., Optimal Control of Systems Governed by Partial Differential Equations, Springer-Verlag (1971).

WANG, P.K.C., Theory of stability and control for distributed parameter systems, Int. J. Control 7 (1968) 101.

HAMZA, M.H. (Ed.), Proc. IFAC Conference on Control of Distributed Parameter Systems, Banff, Canada, June 1971, Vols 1 and 2.

PARKS, P.C., "On how to shake a piece of string to a standstill", Recent Mathematical Developments in Control (BELL, D.J., Ed.), Academic Press (1973) 267.

JAMES, H.M., NICHOLS, N.B., PHILLIPS, R.S., Theory of Servomechanisms, McGraw-Hill (1947).

ÅSTRÖM, K., Introduction to Stochastic Control Theory, Academic Press (1970).

HOFFMANN de VISME, G., Binary Sequences, English Universities Press (1971).

PETERSON, W.W., Error Correcting Codes, Wiley (1961).

GOLOMB, S.W., Shift Register Sequences, Holden-Day (1967).

IAEA-SMR-17/2

FOUNDATIONS OF FUNCTIONAL ANALYSIS THEORY

Ruth F. CURTAIN
Control Theory Centre, University of Warwick, Coventry, United Kingdom

Abstract

FOUNDATIONS OF FUNCTIONAL ANALYSIS THEORY.
This paper provides the basic analytical background for applications in optimization, stability and control theory. Proofs are generally omitted. The topics of the sections are the following: 1. Normed linear spaces; 2. Metric spaces; 3. Measure theory and Lebesgue integration; 4. Hilbert spaces; 5. Linear functionals, weak convergence and weak compactness; 6. Linear operators; 7. Spectral theory; 8. Probability measures; 9. Calculus in Banach spaces.

Some mathematical symbols

∃      there exists              A ⊂ B    A is contained in B
∈      in, belongs to            A ⊃ B    A contains B
∀      for all                   E'       complement of E
∋      such that                 f ∨ g    minimum of f and g
ℝ      real numbers              f ∧ g    maximum of f and g
ℂ      complex numbers           ⊥        orthogonal to
⇒      implies                   Ā        closure of A
iff    if and only if

1. NORMED LINEAR SPACES

Definition 1. Linear vector space

A linear vector space is a set $\mathcal{V} = \{x, y, z, \ldots\}$ of elements with an operation $\oplus$ between any two elements such that

1. $x \oplus y = y \oplus x$ (commutative property)
2. $\exists\, e \in \mathcal{V} \ni x \oplus e = x$, $\forall x \in \mathcal{V}$ (existence of the identity)
3. $\exists\, -x \ni x \oplus (-x) = e$ (existence of an inverse)
4. $x \oplus (y \oplus z) = (x \oplus y) \oplus z$ (associative property)

(i.e. $\mathcal{V}$ is a commutative group under the operation $\oplus$) and there is an associated scalar multiplication by the real numbers $\mathbb{R}$ or the complex numbers $\mathbb{C}$ such that $\alpha x$ is an element of $\mathcal{V}$ for $x \in \mathcal{V}$ and

1. $\alpha(x \oplus y) = \alpha x \oplus \alpha y$
2. $(\alpha + \beta)x = \alpha x \oplus \beta x$
3. $(\alpha\beta)x = \alpha(\beta x)$
4. $1 \cdot x = x$

where $\alpha$, $\beta$ are scalars.

This concept is best understood from some examples:

Example 1. Take $\mathcal{V} = \mathbb{R}$ with $\oplus$ as ordinary addition and scalar multiplication as ordinary multiplication.

Example 2. Take $\mathcal{V}$ to be the set of all polynomials of degree at most $n$ with real coefficients and scalar multiplication by $\mathbb{R}$; this is a real vector space, but if we consider complex coefficients and multiplication by $\mathbb{C}$, it is a complex vector space.

Example 3. $\mathbb{R}^n$ = set of all n-tuples, $x = \{x_1, x_2, \ldots, x_n\}$

with $x + y = \{x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n\}$

and $\alpha x = \{\alpha x_1, \alpha x_2, \ldots, \alpha x_n\}$

Check that $\mathbb{R}^n$ is a vector space under scalar multiplication by $\mathbb{R}$. What if we take scalar multiplication by $\mathbb{C}$?

Example 4. $\mathcal{V}$ = set of all $m \times n$ matrices with real entries and scalar multiplication by $\mathbb{R}$.

Example 5. $\mathcal{V}$ = set of all scalar-valued functions $f : S \to \mathbb{R}$, where $S$ is any non-empty set. For any $s \in S$, $f(s) \in \mathbb{R}$ we define

$(f + g)(s) = f(s) + g(s)$, for all $s \in S$

$(\alpha f)(s) = \alpha f(s)$, for all $s \in S$, $\alpha \in \mathbb{R}$.

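Example 6. $\mathcal{V}$ = set of real-valued functions $f : [0, 1] \to \mathbb{R}$ such that

$$\int_0^1 |f(s)|^2\, ds < \infty$$

Define addition and scalar multiplication as in Example 5. We must verify that if $f, g \in \mathcal{V}$, then $f + g$ and $\alpha f \in \mathcal{V}$, i.e.

$$\int_0^1 |f(s) + g(s)|^2\, ds < \infty \quad \text{and} \quad \int_0^1 |\alpha f(s)|^2\, ds < \infty$$

The last inequality is trivial and for the first we have

$$\int_0^1 |f + g|^2\, ds \le \int_0^1 |f|^2\, ds + 2\int_0^1 |f|\,|g|\, ds + \int_0^1 |g|^2\, ds \le \int_0^1 |f|^2\, ds + 2\left(\int_0^1 |f|^2\, ds\right)^{1/2}\left(\int_0^1 |g|^2\, ds\right)^{1/2} + \int_0^1 |g|^2\, ds < \infty$$

where we have used Schwarz's inequality

$$\int_0^1 |f(s)\, g(s)|\, ds \le \left(\int_0^1 |f(s)|^2\, ds\right)^{1/2}\left(\int_0^1 |g(s)|^2\, ds\right)^{1/2}$$

provided both sides exist.

This last example is a linear subspace of Example 5 with $S = [0, 1]$.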

Definition 2. Linear subspace

If $\mathcal{V}$ is a linear vector space, then a subset $S$ of $\mathcal{V}$ is a linear subspace if $x, y \in S \Rightarrow \alpha x + \beta y \in S$, i.e. $S$ is closed under addition and scalar multiplication.

Other examples of linear subspaces are

Example 7. In Example 3, let $S$ be the set of n-tuples of the form $x = \{x_1, x_2, 0, \ldots, 0\}$.

Example 8. In Example 4, let $S$ be the set of matrices with certain blocks zero.

Example 9. In Example 2, let $S$ be the set of all rth-order polynomials, where $r < n$.

Linear subspaces have the special property that they contain the zero element. A 'translated' subspace is given by

Definition 3. Affine subset

If $\mathcal{V}$ is a linear vector space, then an affine subset has the form

$M = \{x : x = c + x_0, \text{ where } x_0 \in S \text{ and } c \text{ is fixed}\}$ for some $c \in \mathcal{V}$ and some linear subspace $S$ of $\mathcal{V}$.

Example 7a. In Example 3, let $M$ be the set of n-tuples of the form $x = \{x_1, x_2, 1, \ldots, 1\}$.

Example 8a. In Example 4, let $M$ be the set of matrices with certain blocks of 1's.

Another very important type of subset of a vector space is a convex set.

Definition 4. A subset $A$ of $\mathcal{V}$ is convex if $x, y \in A$ implies that $\lambda x + (1 - \lambda)y \in A$ for all $\lambda$ with $0 \le \lambda \le 1$.

We now introduce the concept of the dimension of a vector space.

Definition 5. If $x_1, \ldots, x_n \in \mathcal{V}$ and there are scalars $\alpha_1, \ldots, \alpha_n$, not all zero, such that $\alpha_1 x_1 + \alpha_2 x_2 + \ldots + \alpha_n x_n = 0$, then we say this is a linearly dependent set. If no such scalars $\alpha_1, \ldots, \alpha_n$ exist, then $x_1, x_2, \ldots, x_n$ is a linearly independent set.

For example, $1, x, x^2, \ldots, x^n$ is a linearly independent set of nth-order polynomials, as is $1 + x$, $\tfrac{1}{2} + 3x$; however, $1 + x$, $\tfrac{1}{2} + 3x$, $2x$ are linearly dependent.

Definition 6. If $x_1, \ldots, x_n$ is a linearly independent set in $\mathcal{V}$ then we say that $S = \mathrm{Sp}\{x_1, \ldots, x_n\}$, the set of all linear combinations of $x_1, \ldots, x_n$, has dimension $n$. If $\mathcal{V} = \mathrm{Sp}\{x_1, \ldots, x_k\}$ for some finite set of linearly independent elements, then $\mathcal{V}$ is of dimension $k$. If there exists no such set, then $\mathcal{V}$ is infinite-dimensional.

For example, the dimension of Example 1 is 1, Example 2 is $(n + 1)$, Example 3 is $n$, Example 4 is $mn$ and Examples 5 and 6 are infinite-dimensional.

Definition 7. If $\mathcal{V} = \mathrm{Sp}\{x_1, \ldots, x_n\}$ with $x_1, \ldots, x_n$ linearly independent, the set $\{x_1, \ldots, x_n\}$ is called a (Hamel) basis for $\mathcal{V}$. This basis is not unique, although the dimension of $\mathcal{V}$ is unique.

Example 10. A basis for Example 2 is $\{1, x, \ldots, x^n\}$ or, equivalently, the Legendre polynomials.

A useful fact to remember is that all real vector spaces of dimension $n$ are algebraically identical (or isomorphic to $\mathbb{R}^n$).

Definition 8. Vector spaces $\mathcal{V}$ and $\mathcal{W}$ are isomorphic if there is a bijective linear map $T : \mathcal{V} \to \mathcal{W}$ such that $T(\alpha x + \beta y) = \alpha Tx + \beta Ty$ for all $x, y \in \mathcal{V}$ and $\alpha$, $\beta$ scalars.

Another example of isomorphic spaces are all n-dimensional vector spaces over the complex numbers, which are all isomorphic to $\mathbb{C}^n$, the space of complex n-tuples.

Definition 9. A hyperplane, $H$, of a vector space $\mathcal{V}$ is a maximal proper affine subset of $\mathcal{V}$, i.e. the complement of $H$ has dimension 1 (i.e. $\mathcal{V} = H + S$, where $H$ and $S$ are disjoint and $S$ has dimension 1). For example, hyperplanes in $\mathbb{R}^n$ have dimension $(n - 1)$.

So far we have only considered algebraic properties of sets. In order to develop mathematical concepts for "nearness" or distance, we need some topology, namely metric spaces.

Definition 10. A metric space $X$ is a set of elements $\{x, y, \ldots\}$ and a distance function $d(x, y)$ with the following properties:

1. $d(x, y) \ge 0$ for all $x, y \in X$
2. $d(x, y) = 0$ if and only if $x = y$
3. $d(x, y) = d(y, x)$
4. $d(x, y) \le d(x, z) + d(z, y)$ for all $x, y, z \in X$

We call $d(\cdot, \cdot)$ a metric on $X$.

This is essentially a generalization of distance in the Euclidean plane.

Example 11. Let $X$ be the set of 2-tuples $x = \{x_1, x_2\}$ or Cartesian co-ordinates and $d(x, y) = [(x_1 - y_1)^2 + (x_2 - y_2)^2]^{1/2}$. Then $d$ satisfies all the properties 1-4 and property 4 is the familiar triangle inequality.

Example 12. Let $X$ be as in Ex. 11, but $d(x, y) = |x_1 - y_1| + |x_2 - y_2|$ or, more generally, $d_p(x, y) = (|x_1 - y_1|^p + |x_2 - y_2|^p)^{1/p}$, $1 \le p < \infty$, and

$d_\infty(x, y) = \max\{|x_1 - y_1|, |x_2 - y_2|\}$.
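To make the family of metrics in Examples 11 and 12 concrete, the sketch below evaluates $d_p$ for a few values of $p$ on the same pair of points and spot-checks the triangle inequality. The function name `d_p` and the sample points are illustrative assumptions; a numerical spot check is of course not a proof of property 4.

```python
def d_p(x, y, p):
    """p-metric on the plane; p = float('inf') gives the max metric."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if p == float("inf"):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)

x, y, z = (0.0, 0.0), (3.0, 4.0), (1.0, -2.0)
for p in (1, 2, 3, float("inf")):
    # Triangle inequality d(x, y) <= d(x, z) + d(z, y) for each choice of p
    assert d_p(x, y, p) <= d_p(x, z, p) + d_p(z, y, p) + 1e-12
    print(p, d_p(x, y, p))
```

The same two points are at distance 7 under $d_1$, 5 under $d_2$ and 4 under $d_\infty$.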

So the same set X can generate different metric spaces.

Example 13. $X = \mathcal{C}[a, b]$, the set of continuous functions on $[a, b]$, and

$d(x, y) = \max\{|x(t) - y(t)|;\ a \le t \le b\}$

or

$$d_p(x, y) = \left(\int_a^b |x(t) - y(t)|^p\, dt\right)^{1/p}; \quad p \ge 1.$$

We sometimes use a pseudometric, which satisfies conditions 1, 3, 4 of Definition 10 but, instead of 2, we have only $d(x, x) = 0$; $d(x, y) = 0$ does not necessarily imply that $x = y$.

So far we have introduced the algebraic structure of a linear vector space which enables us to consider linear combinations of elements and then the topological concept of a metric space which enables us to measure nearness or distance and hence to consider the tools of analysis, such as open sets, closed sets, convergence and continuity. We now combine these two notions in a normed linear space.

Definition 11. A normed linear space $X$ is a linear vector space with a norm on each element, i.e. to each $x \in X$ corresponds a non-negative number $\|x\|$, such that

1. $\|x\| = 0$ iff $x = 0$
2. $\|\alpha x\| = |\alpha|\, \|x\|$, for all scalars $\alpha$
3. $\|x + y\| \le \|x\| + \|y\|$

If 1 is not necessarily true, we call it a seminorm. We note that if we define $d(x, y) = \|x - y\|$, then $d$ is a metric on $X$.

Example 14. Consider $\mathbb{R}^n$ again. This is already a vector space (see Ex. 3). We can define several norms on $\mathbb{R}^n$ by

$$\|x\|_p = \left(\sum_{i = 1}^{n} |x_i|^p\right)^{1/p}$$

where $1 \le p < \infty$, $p$ fixed, the so-called 'p-norm'. We denote the normed linear space with the p-norm by $\ell_p^n$ and

$$\ell_\infty^n = \left\{x \in \mathbb{R}^n \ni \|x\|_\infty = \max_{1 \le i \le n} |x_i|\right\}$$

Example 15. Consider the space of infinite sequences, $x = \{x_1, x_2, \ldots\}$. Then this forms a vector space in a similar manner to $\mathbb{R}^n$ and we can again define a p-norm:

$$\|x\|_p = \left(\sum_{i = 1}^{\infty} |x_i|^p\right)^{1/p} \quad \text{for } 1 \le p < \infty.$$

Note that $\|x\|_p$ is not finite for all infinite sequences, so to define a normed linear subspace we take the subset with finite p-norm, i.e.

$\ell_p = \{\infty\text{-tuples } x = \{x_1, x_2, \ldots\}, \text{ with } \|x\|_p < \infty\}$.

This contrasts with the $\ell_p^n$ normed linear spaces, where the sets of elements for each $p$ are identical although the norm is different. Now for $\ell_p$ the sets of elements for each $p$ are different. The actual proofs that $\|\cdot\|_p$ is a norm rely on

Minkowski's inequality

$$\left(\sum_{i = 1}^{n} |x_i + y_i|^p\right)^{1/p} \le \left(\sum_{i = 1}^{n} |x_i|^p\right)^{1/p} + \left(\sum_{i = 1}^{n} |y_i|^p\right)^{1/p}$$

which holds for $n$ finite or infinite.

Finally

$$\ell_\infty = \left\{\infty\text{-tuples } x, \text{ with } \|x\|_\infty = \sup_{1 \le i < \infty} |x_i| < \infty\right\}$$

We have the following inequalities:

$$\|x\|_1 \ge \|x\|_2 \ge \ldots \ge \|x\|_\infty$$

and so

$$\ell_1 \subset \ell_2 \subset \ldots \subset \ell_\infty$$
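That the sets of elements really differ with $p$ in the infinite-dimensional case can be seen from the sequence $x_i = 1/i$, an example chosen here for illustration and not taken from the text: its 2-norm is finite (the partial sums approach $\pi/\sqrt{6}$), while its 1-norm is the harmonic series and grows without bound. The sketch prints partial norms for increasing truncation lengths; the function name is hypothetical.

```python
import math

def partial_p_norm(terms, p):
    """p-norm of the first `terms` entries of the sequence x_i = 1/i, i = 1, 2, ..."""
    return sum((1.0 / i) ** p for i in range(1, terms + 1)) ** (1.0 / p)

for terms in (10, 1_000, 100_000):
    # The 1-norm keeps growing (roughly like log(terms)); the 2-norm settles down
    print(terms, partial_p_norm(terms, 1), partial_p_norm(terms, 2))

print(math.pi / math.sqrt(6.0))  # limit of the 2-norm partial sums
```

So this sequence lies in $\ell_2$ but not in $\ell_1$, consistent with the inclusion $\ell_1 \subset \ell_2$ being strict.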

Example 16. $X = \mathcal{C}[a, b]$, the space of real continuous functions on $[a, b]$ with norm

$$\|x(\cdot)\| = \max_{a \le t \le b} |x(t)|$$

This is called the uniform or sup norm.

Example 17. $X = \mathcal{C}[a, b]$ under the p-norm

$$\|x\|_p = \left(\int_a^b |x(t)|^p\, dt\right)^{1/p}$$

That this satisfies the properties of a norm depends on the integral form of the Minkowski inequality:

$$\left(\int_a^b |x(t) + y(t)|^p\, dt\right)^{1/p} \le \left(\int_a^b |x(t)|^p\, dt\right)^{1/p} + \left(\int_a^b |y(t)|^p\, dt\right)^{1/p}$$

Equality holds only if $x(t) = k\,y(t)$ almost everywhere on $[a, b]$.

We note that although $X = \mathcal{C}[a, b]$ is the same linear vector space in Exs 16 and 17, by defining two different norms we obtain two distinct normed linear spaces.

2. METRIC SPACES

We return to study the properties of metric spaces in more detail.

There are two common ways of creating new metric spaces from known ones: subspaces and product spaces.

Definition 12. Let $(X, d)$ be a metric space and $A$ a subset of $X$; then we can consider $(A, d)$ as a metric space in its own right. Then $(A, d)$ is a subspace of $(X, d)$.

There are, in general, several metrics we can put on $A$, but $(A, d_1)$ is a subspace of $(X, d)$ only when the metrics $d_1$ and $d$ coincide on $A$.

Definition 13. Let $(X, d_x)$, $(Y, d_y)$ be two metric spaces; then the product set of ordered pairs $X \times Y = \{(x, y) : x \in X, y \in Y\}$ may be defined to be a metric space, the product space, in several ways.

1. $d_1(u_1, u_2) = d_x(x_1, x_2) + d_y(y_1, y_2)$, where $u_i = (x_i, y_i)$, $i = 1, 2$.
2. $d_2(u_1, u_2) = (d_x^2(x_1, x_2) + d_y^2(y_1, y_2))^{1/2}$
3. $d_p(u_1, u_2) = (d_x^p(x_1, x_2) + d_y^p(y_1, y_2))^{1/p}$; $1 \le p < \infty$
4. $d_\infty(u_1, u_2) = \max\{d_x(x_1, x_2), d_y(y_1, y_2)\}$

Under any of these metrics $(X \times Y, d)$ is called the product space of $(X, d_x)$ and $(Y, d_y)$. (There are infinitely many choices of metrics for $X \times Y$.)

You can verify that if $X$ and $Y$ are normed linear spaces, then $(X, \|\cdot\|_x) \times (Y, \|\cdot\|_y)$ becomes a normed linear space using any of the product metrics derived from $\|\cdot\|_x$ and $\|\cdot\|_y$.

Continuity in metric spaces

The introduction of the distance function $d(\cdot,\cdot)$ allows us to generalize the definition of continuous functions to metric spaces.

Definition 14. Let $f : X \to Y$ be a map from the metric space $(X, d_x)$ to the metric space $(Y, d_y)$. f is continuous at $x_0$ in X if, given $\varepsilon > 0$, $\exists$ a real number $\delta > 0$ such that $d_y(f(x), f(x_0)) < \varepsilon$ whenever $d_x(x, x_0) < \delta$. f is continuous if it is continuous at each point in its domain.

Definition 15. A map $f : X \to Y$ is uniformly continuous if for each $\varepsilon > 0$, $\exists\ \delta = \delta(\varepsilon) > 0$ (independent of $x_0$) such that for any $x_0$, $d_y(f(x_0), f(x)) < \varepsilon$ whenever $d_x(x, x_0) < \delta$.

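Example 18. $X = \mathbb{R}^n$ under $d_2(u, v) = \big( \sum_{i=1}^{n} |u_i - v_i|^2 \big)^{1/2}$,

$Y = \mathbb{R}^m$ under $d_2(w, z) = \big( \sum_{i=1}^{m} |w_i - z_i|^2 \big)^{1/2}$,

and $f : X \to Y$ is represented by the matrix $F = (f_{ij})$, i.e. $y = Fx$. Let $x_0$ be fixed in X and $y_0 = Fx_0$. Let x be an arbitrary point in X and $y = Fx$; then

$$d(y, y_0)^2 = \sum_{i=1}^{m} \Big| \sum_{j=1}^{n} f_{ij}(x_j - x_{0j}) \Big|^2 \le \sum_{i=1}^{m} \Big( \sum_{j=1}^{n} |f_{ij}|^2 \Big) \Big( \sum_{j=1}^{n} |x_j - x_{0j}|^2 \Big)$$

by the Schwarz inequality

$$\le c^2\, d(x, x_0)^2, \quad \text{where } c^2 = \sum_{i,j} |f_{ij}|^2.$$

So if we are given $\varepsilon > 0$ we may choose $\delta = \varepsilon / c$, provided $c \ne 0$ (if $c = 0$ the map is identically zero and any $\delta$ will do). So the map f is uniformly continuous.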
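The Lipschitz bound $d(Fx, Fx_0) \le c\, d(x, x_0)$, with $c$ the square root of the sum of squared matrix entries, can be checked numerically. The sketch below uses a randomly generated matrix and random points; these, and the helper names `matvec` and `d2`, are assumptions for illustration.

```python
import math
import random

def matvec(F, x):
    """Matrix-vector product y = F x for a matrix stored as a list of rows."""
    return [sum(f * t for f, t in zip(row, x)) for row in F]

def d2(u, v):
    """Euclidean metric d_2(u, v)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

random.seed(0)
m, n = 3, 4
F = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(m)]

# c^2 is the sum of the squared entries of F, as in the Schwarz estimate.
c = math.sqrt(sum(f * f for row in F for f in row))

# Largest observed ratio d(Fx, Fx0) / d(x, x0) over random pairs of points.
worst_ratio = 0.0
for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in range(n)]
    x0 = [random.uniform(-5, 5) for _ in range(n)]
    worst_ratio = max(worst_ratio, d2(matvec(F, x), matvec(F, x0)) / d2(x, x0))
```

The observed ratio never exceeds `c`, so $\delta = \varepsilon / c$ works at every point, which is what uniform continuity requires.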

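Example 19. $X$ = space of square-integrable functions on $[0,T]$ with the metric

$$d(x, y) = \Big( \int_0^T [x(t) - y(t)]^2\,dt \Big)^{1/2}.$$

Define $f : X \to X$ by $fx = y$, where $y(t) = \int_0^t x(s)\,ds$. Then, writing $y_0 = f x_0$,

$$|y(t) - y_0(t)| = \Big| \int_0^t (x(s) - x_0(s))\,ds \Big| \le \sqrt{t}\, \Big( \int_0^t |x(s) - x_0(s)|^2\,ds \Big)^{1/2} \le \sqrt{T}\, d(x, x_0)$$

by the Schwarz inequality, and hence

$$d(y, y_0) = \Big( \int_0^T |y(t) - y_0(t)|^2\,dt \Big)^{1/2} \le T\, d(x, x_0),$$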

so f is uniformly continuous. If, however, we consider the interval $(-\infty, \infty)$, the map $f : X \to X$ is not continuous, where $fx = y$ is given by $y(t) = \int_0^t x(s)\,ds$.

For let $x_0 \in X$ be fixed; we seek an $\varepsilon > 0$ such that, for any choice of $\delta > 0$, there is an $x \in X$ with $d(x, x_0) < \delta$ and $d(f(x), f(x_0)) \ge \varepsilon$.

Let $y_0 = f(x_0)$, $y = f(x)$. Then $y(t) - y_0(t) = \int_0^t (x(s) - x_0(s))\,ds$.

Choose x such that

$$x(t) - x_0(t) = \begin{cases} c, & 0 \le t \le 3T^2 \\ -c, & 3T^2 < t \le 6T^2 \\ 0 & \text{otherwise.} \end{cases}$$

Then

$$d(x, x_0) = \sqrt{6}\, T c$$

and

$$y(t) - y_0(t) = \begin{cases} ct, & 0 \le t \le 3T^2 \\ c(6T^2 - t), & 3T^2 < t \le 6T^2 \\ 0 & \text{otherwise} \end{cases}$$

and

$$d(y, y_0) = \sqrt{18}\, T^3 c.$$

Let $\delta > 0$ be given and choose c, T such that $\sqrt{18}\, T^3 c = 1$, $\sqrt{6}\, T c < \delta$, i.e. $d(x, x_0) < \delta$ and yet $d(y, y_0) = 1$ for this particular x. So f is not continuous at $x_0$.
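The choice of c and T in this counterexample can be made explicit. With $c = 1/(\sqrt{18}\,T^3)$ one gets $d(x, x_0) = 1/(\sqrt{3}\,T^2)$, so any sufficiently large T works. The sketch below picks one such T (the factor 2 in `witness` is an arbitrary choice made here) and confirms the closed-form value of $d(y, y_0)$ with a midpoint-rule quadrature; the function names are hypothetical.

```python
import math

def distances(c, T):
    """Closed-form distances from the example: d(x, x0) and d(y, y0)."""
    return math.sqrt(6.0) * T * c, math.sqrt(18.0) * T ** 3 * c

def witness(delta):
    """Pick c, T with d(y, y0) = 1 and d(x, x0) = delta / 2 < delta."""
    T = math.sqrt(2.0 / (math.sqrt(3.0) * delta))
    c = 1.0 / (math.sqrt(18.0) * T ** 3)
    return c, T

def numeric_d_y(c, T, steps=100000):
    """Midpoint-rule value of (integral of (y - y0)^2)^(1/2) over [0, 6 T^2]."""
    a = 3.0 * T * T
    h = 2.0 * a / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        v = c * t if t <= a else c * (2.0 * a - t)  # the triangular profile y - y0
        total += v * v * h
    return math.sqrt(total)
```

For every $\delta$ tried, `witness(delta)` yields a perturbation closer to $x_0$ than $\delta$ whose image stays at distance 1 from $y_0$, so no $\delta$ can serve for $\varepsilon = 1$.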

A fundamental concept in analysis is of course convergence and we now define convergence in metric spaces.

Definition 16. A sequence $\{x_n\}$ in a metric space $(X, d)$ converges to $x_0$ in $(X, d)$ if $d(x_n, x_0) \to 0$ as $n \to \infty$.

Continuity and convergence are related concepts as is clearly seen from the following result:


Let $f : (X, d_x) \to (Y, d_y)$ be a map between two metric spaces and $x_0$ a point in X. Then the following two statements are equivalent:

(a) f is continuous at $x_0$.

(b) $\lim_{n \to \infty} f(x_n) = f(\lim_{n \to \infty} x_n)$ for every convergent sequence $x_n \to x_0$,

i.e. a map is continuous iff it preserves convergent sequences.

We now define Cauchy sequences.

Definition 17. A sequence $\{x_n\}$ of elements in a metric space $(X, d)$ is a Cauchy sequence if $d(x_n, x_m) \to 0$ as $m, n \to \infty$.

Metric spaces have the property that Cauchy sequences can have at most one limit, but they need not have a limit point in the metric space, as is seen from the following example.

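Example 20. Consider $C[0,1]$ under the 2-norm and take the sequence $\{x_n\}$ where we define

$$x_n(t) = \begin{cases} 0 & \text{for } 0 \le t \le 1/2 - 1/n \\ (n/2)t - n/4 + 1/2 & \text{for } 1/2 - 1/n \le t \le 1/2 + 1/n \\ 1 & \text{for } 1/2 + 1/n \le t \le 1. \end{cases}$$

[Figure: graph of $x_n$, equal to 0 up to $1/2 - 1/n$, rising linearly to 1 at $1/2 + 1/n$.]

The area between the graphs of $x_n$ and $x_m$ is twice the difference in the area of two triangles, i.e. $2\,|1/4n - 1/4m|$, and $|x_n(t) - x_m(t)| \le 1/2$, so that

$$\|x_n - x_m\|_2^2 \le |1/4n - 1/4m| \to 0 \quad \text{as } m, n \to \infty,$$

so $\{x_n\}$ is a Cauchy sequence under the 2-norm. We easily see that the pointwise limit of $x_n$ is

$$x(t) = \begin{cases} 0, & 0 \le t < 1/2 \\ 1/2, & t = 1/2 \\ 1, & 1/2 < t \le 1. \end{cases}$$

However, this function does not belong to $C[0,1]$ because of its discontinuity at 1/2. It is rather awkward to use spaces which have Cauchy sequences whose limits do not belong to the space and so we define a class of metric spaces which always contain limit points of Cauchy sequences.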
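The behaviour of this ramp sequence can be explored numerically. The sketch below approximates the 2-norm distance by a midpoint rule and the sup-norm distance on a grid; the step counts and function names are assumptions chosen for the illustration. A direct integration (not in the lectures) gives $\|x_n - x_m\|_2 = (m - n)/(m\sqrt{6n})$ for $m > n$, which the test values use as a cross-check.

```python
import math

def x_n(n, t):
    """Piecewise-linear ramp of Example 20 (requires n >= 2)."""
    if t <= 0.5 - 1.0 / n:
        return 0.0
    if t >= 0.5 + 1.0 / n:
        return 1.0
    return (n / 2.0) * t - n / 4.0 + 0.5

def dist2(n, m, steps=200000):
    """Midpoint-rule approximation of the 2-norm distance on [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += (x_n(n, t) - x_n(m, t)) ** 2 * h
    return math.sqrt(total)

def sup_dist(n, m, steps=20000):
    """Grid approximation of the sup-norm distance max |x_n - x_m|."""
    return max(abs(x_n(n, k / steps) - x_n(m, k / steps)) for k in range(steps + 1))
```

The 2-norm distance between $x_n$ and $x_{2n}$ shrinks as n grows, while the sup-norm distance stays at 1/4. The sequence is therefore Cauchy in the 2-norm but not in the sup norm, which is consistent with $C[0,1]$ being complete under the sup norm.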

Definition 18. A metric space (X, d) is said to be complete if each Cauchy sequence in (X, d) is a convergent sequence in (X, d).

This concept of completeness is so important that it gives rise to the definition of new types of spaces, as we shall see from the examples to follow.

First let us state two major results about complete metric spaces:

1. If $(X, d)$ is a complete metric space and $(Y, d)$ is a subspace of $(X, d)$, then $(Y, d)$ is complete iff Y is a closed set in $(X, d)$.

Definition 19. A set A in X is closed if all convergent sequences in A have their limits also in A.

2. Every metric space has a completion which is unique up to an isometry.

This last property allows us to define an important concept in normed linear spaces.

Definition 20. A Banach space is a complete normed linear space (B-space).

Some examples of Banach spaces are $C[a,b]$ under the sup norm (since uniformly convergent sequences of continuous functions have continuous limit functions), $\mathbb{R}^n$ under the p-norm or the sup norm, $\ell_p$ and $\ell_\infty$.

However, from Ex. 20 we see that $C[a,b]$ is not complete under the 2-norm. This is an important norm used in analysis for defining 'mean square convergence of functions':

$$\int_a^b (f_n(t) - f(t))^2\,dt \to 0, \quad \text{as } n \to \infty$$

and so we would like to identify what space we get when we complete $C[a,b]$ under the p-norm.

At first guess, one would look at all Riemann integrable functions with finite p-norm, i.e. $x(\cdot)$ such that

$$\Big( \int_a^b |x(t)|^p\,dt \Big)^{1/p} < \infty.$$

Unfortunately, a convergent sequence of Riemann integrable functions does not always converge to a Riemann integrable function. However, it happens that if you define integration in a more sophisticated way, called Lebesgue integration, you do get this property. We shall pursue this idea later on, but for the present we may consider Lebesgue integration as a generalization of Riemann integration and just think of Lebesgue integrable functions as Riemann integrable functions and some "mavericks". $L_p[a,b]$ denotes the space of Lebesgue integrable functions with finite p-norm, i.e.

$$\|x\|_p = \Big( \int_a^b |x(t)|^p\,dt \Big)^{1/p} < \infty$$

and it is a B-space. It may also be considered as the completion of $C[a,b]$ under the p-norm.

The following examples of normed linear spaces are important in the theory of partial differential equations. Examples 21 and 23 are not complete; however, their completion under the $\|\cdot\|_{n,p}$ norm gives rise to the important Sobolev spaces, which are used in distribution theory for studying partial differential equations.

Example 21. Let $C^\infty[a,b]$ be the space of infinitely differentiable functions on $[a,b]$. Then the following are well-defined norms:

$$\|x\|_{n,p} = \Big( \int_a^b \sum_{i=0}^{n} |D^i x(t)|^p\,dt \Big)^{1/p}, \quad 1 \le p < \infty$$

where $D^i$ denotes the ith derivative. So we can define infinitely many normed spaces on $C^\infty[a,b]$. Since $\|x\|_{n+1,p} \ge \|x\|_{n,p}$, we have that $C^\infty[a,b]$ under the $(n+1, p)$ norm $\subset$ $C^\infty[a,b]$ with the $(n, p)$ norm.

Example 22. Consider $C^k(\Omega)$, the space of real functions of n variables on $\Omega$ which are continuously differentiable up to order k. $\Omega$ is an open subset of $\mathbb{R}^n$. Let $\alpha = (\alpha_1, \dots, \alpha_n)$ be a vector with non-negative integer entries and define $|\alpha| = \sum_{i=1}^{n} \alpha_i$. Then for $u \in C^k(\Omega)$, the following derivative exists and is continuous:

$$D^\alpha u = \frac{\partial^{|\alpha|} u}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}} \quad \text{for } |\alpha| \le k.$$

$C^k(\Omega)$ is of course a linear vector space and we may take the sup norm or, alternatively, the norm

$$|||u||| = \max_{0 \le |\alpha| \le k} \{\|D^\alpha u\|\}$$

where $\|D^\alpha u\|$ is the usual sup norm. $C^k(\Omega)$ is a B-space under the $|||\cdot|||$-norm.

Example 23. Let $C^\infty(\Omega)$ be the space of infinitely differentiable functions on $\Omega$, an open subset of $\mathbb{R}^n$. Let $\alpha = (\alpha_1, \dots, \alpha_n)$ be as in Ex. 22 and, as before, define the differential operator $D^\alpha u$. Then we may define the following norms:

$$\|u\|_{k,p} = \Big( \int_\Omega \sum_{|\alpha| \le k} |D^\alpha u(x)|^p\,dx \Big)^{1/p}.$$

$\|\cdot\|_{k,p}$ is also a norm on $C^k(\Omega)$.

Although the spaces in Exs 21 and 23 are not complete, they do have the important property that they are dense in some underlying space, the relevant Sobolev space.

Definition 21. A linear subspace S of a metric space X is dense in X if the closure of S with respect to the metric contains X. This means that any element $x \in X$ can be approximated by some element $s \in S$ as closely as we like, i.e. for every $\varepsilon > 0$ there is an $s \in S$ with $d(s, x) < \varepsilon$.

Example 24. Consider $\ell_2$ and let

$$S = \{x \in \ell_2 : x = \{x_1, x_2, \dots, x_k, 0, \dots, 0, \dots\},\ k < \infty\}.$$

Then S is a dense subspace of $\ell_2$, but it is not a B-space itself, since the sequence $\{1, 0, \dots\}$, $\{1, 1/2, 0, \dots\}$, $\{1, 1/2, 1/2^2, 0, \dots\}$, ..., $\{1, 1/2, 1/2^2, \dots, 1/2^n, 0, \dots\}$ converges to $\{1, 1/2, 1/2^2, \dots\}$, which is not in S.
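The convergence claimed in Example 24 can be quantified: the $\ell_2$ distance between $\{1, 1/2, 1/2^2, \dots\}$ and its truncation after the entry $1/2^n$ is the norm of the tail, $\big(\sum_{i>n} 4^{-i}\big)^{1/2} = 2^{-n}/\sqrt{3}$. A minimal sketch, in which the cut-off `terms` and the function name are assumptions of the illustration:

```python
import math

def tail_distance(n, terms=200):
    """l_2 distance between x = {1, 1/2, 1/4, ...} and its truncation after 1/2^n.

    The infinite tail is cut off after `terms` entries; this is harmless
    because the entries decay geometrically.
    """
    return math.sqrt(sum(0.25 ** i for i in range(n + 1, terms)))

# Distances for a few truncation lengths; they halve each time n grows by one.
distances = {n: tail_distance(n) for n in (0, 5, 20)}
```

Each truncation lies in S, the distances tend to zero, and yet the limit has infinitely many non-zero entries, so S is dense but not closed.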

Example 25. $L_2[0,1] = \overline{C[0,1]}$ (closure under the $\|\cdot\|_2$ norm) and

$$S = \{x(\cdot) \in L_2[0,1] \text{ with } x(0) = 0 \text{ and } \dot{x}(\cdot) \in L_2[0,1]\}.$$

This is not a closed subspace, since it is easy to construct sequences of functions in S which converge to a function that is square integrable but whose derivative is not. However, it can be shown that it is a dense subspace of $L_2[0,1]$.

If we take a new norm

$$|||x(\cdot)||| = \Big( \int_0^1 \big( |x(t)|^2 + |\dot{x}(t)|^2 \big)\,dt \Big)^{1/2}$$

for S, then S is a B-space with respect to this norm.

We also define the concept of connectedness.


Definition 22. A metric space (X, d) is disconnected if it is the union of two open, non-empty disjoint subsets. Otherwise (X, d) is said to be connected.

Example 26. $X = [0, 1] \cup [5, 6]$ with the Euclidean metric is disconnected.

Contractions in metric spaces

This is an important concept which gives us the basic tool for proving existence and uniqueness of solutions of differential equations.

Definition 23. Let $(X, d)$ be a metric space and $f : X \to X$. Then f is a contraction (mapping) if there is a real number K, $0 \le K < 1$, such that

$$d(f(x), f(y)) \le K\, d(x, y) \quad \forall x, y \in X.$$

This implies that f is uniformly continuous.

Contraction mapping theorem

Let $(X, d)$ be a complete metric space and $f : X \to X$ a contraction mapping; then there is a unique $x_0 \in X$ such that $f(x_0) = x_0$. Here $x_0$ is called the fixed point of f.

Moreover, if x is any point in X and we define the sequence $\{x_n\}$ by

$$x_1 = f(x),\ x_2 = f(x_1),\ \dots,\ x_n = f(x_{n-1}),$$

then $x_n \to x_0$ as $n \to \infty$.

Corollary

Let $(X, d)$ be a complete metric space and f such that $f^p$ is a contraction for some $p > 0$. Then f has a fixed point.
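The iteration $x_n = f(x_{n-1})$ from the contraction mapping theorem is easy to try out. The sketch below uses $f(x) = \cos x$ on the complete space $[0, 1]$ as a hypothetical example; the stopping tolerance and the function name `fixed_point` are assumptions of the illustration, not part of the theorem.

```python
import math

def fixed_point(f, x, tol=1e-12, max_iter=1000):
    """Iterate x_{n+1} = f(x_n) until successive iterates differ by at most tol."""
    for n in range(1, max_iter + 1):
        x_next = f(x)
        if abs(x_next - x) <= tol:
            return x_next, n
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")

# Hypothetical example: f(x) = cos(x) maps [0, 1] into itself and, since
# |f'(x)| = |sin(x)| <= sin(1) < 1 there, it is a contraction with K = sin(1).
x_star, iterations = fixed_point(math.cos, 0.5)
```

Starting from any other point of $[0, 1]$ leads to the same limit, in line with the uniqueness part of the theorem.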

Example 27. Existence and uniqueness theorem for solutions of ordinary differential equations

Consider

$$\frac{dy}{dt} = f(y, t), \qquad y(0) = 0 \qquad (1)$$

where f is real-valued, continuous on $\mathbb{R} \times \mathbb{R}$. Then (1) is equivalent to the solution of the integral equation

$$y(t) = \int_0^t f(y(s), s)\,ds$$

and this may be thought of as $z = F(y)$, where

$$z(t) = \int_0^t f(y(s), s)\,ds$$

and F acts on the space of real-valued continuous functions defined on $[0 - A, 0 + A]$.

Now y(t) is a solution of (1) iff $y = Fy$, i.e. iff y is a fixed point of the map F.

We now show that under appropriate conditions on f, F is a contraction.

Assumptions on f:

(a) $|f(y, t)| \le M$ for $-1 \le y \le 1$, $-1 \le t \le 1$.

(b) Lipschitz condition: $|f(y, t) - f(x, t)| \le K\,|y - x|$ for $(y, t)$, $(x, t)$ in $[-1, 1] \times [-1, 1]$.

Let X = space of continuous real-valued functions $\phi(t)$ such that $|\phi(t)| \le M|t|$ on $[-T, T]$, where $0 \le T \le 1$, $MT \le 1$, $KT < 1$.

X is a subspace of $C[-T, T]$ and is closed under the sup metric $d_\infty$, i.e. $(X, d_\infty)$ is a complete metric space.

(i) $F : X \to X$, since

$$|F(\phi)(t)| = \Big| \int_0^t f(\phi(s), s)\,ds \Big| \le M|t|.$$

(ii) F is a contraction, since for $t \ge 0$

$$|F(x)(t) - F(y)(t)| = \Big| \int_0^t [f(x(s), s) - f(y(s), s)]\,ds \Big| \le \int_0^t K\,|x(s) - y(s)|\,ds \le K t\, d_\infty(x, y),$$

for $t \le 0$, $|F(x)(t) - F(y)(t)| \le K|t|\, d_\infty(x, y)$,

and so for $|t| \le T$, $d_\infty(F(x), F(y)) \le KT\, d_\infty(x, y)$.

F has a unique fixed point which is the solution of (1).
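The fixed-point argument suggests a constructive scheme: start from $y_0 = 0$ and apply F repeatedly. The following sketch does this on a grid over $[0, T]$, with the integral replaced by the trapezoidal rule. The right-hand side $f(y, t) = (y + t)/2$, the grid size and the iteration count are assumptions chosen for the illustration; for this f the exact solution of (1) is $y(t) = 2e^{t/2} - t - 2$.

```python
import math

def picard(f, T, steps=2000, iterations=30):
    """Apply F(y)(t) = integral_0^t f(y(s), s) ds repeatedly on a grid over [0, T].

    The integral is approximated by the trapezoidal rule, so this is a
    discretised sketch of the iteration y_{k+1} = F(y_k) starting from y_0 = 0.
    """
    h = T / steps
    ts = [k * h for k in range(steps + 1)]
    y = [0.0] * (steps + 1)
    for _ in range(iterations):
        g = [f(yk, tk) for yk, tk in zip(y, ts)]
        new_y = [0.0]
        for k in range(1, steps + 1):
            new_y.append(new_y[-1] + 0.5 * h * (g[k - 1] + g[k]))
        y = new_y
    return ts, y

# Hypothetical right-hand side: |f| <= M = 1 and K = 1/2 on [-1, 1] x [-1, 1],
# so T = 1 satisfies MT <= 1 and KT < 1.
f = lambda y, t: 0.5 * (y + t)
ts, y = picard(f, T=1.0)
exact = [2.0 * math.exp(0.5 * t) - t - 2.0 for t in ts]
sup_error = max(abs(a - b) for a, b in zip(y, exact))
```

Only the half-interval $t \ge 0$ is treated here; the case $t \le 0$ is symmetric. The remaining error comes mainly from the quadrature, not from the fixed-point iteration, which converges at least as fast as $(KT)^k$.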

Compactness

Compactness is a very important concept in analysis, although it does not arise in finite-dimensional spaces as such. For finite-dimensional spaces all closed and bounded sets are compact. This fails in infinite-dimensional normed spaces (the closed unit ball, for instance, is never compact) and we here define compactness for metric spaces.

Definition 24. A set A in a metric space $(X, d)$ is compact if every sequence in A contains a convergent subsequence with a limit point in A.

We recall that in $\mathbb{R}^n$ every sequence in a closed and bounded set contains a convergent subsequence.

For more general topological spaces there are several different kinds of compactness, the definition above corresponding to sequential compactness. However, for metric spaces all types of compactness are equivalent. Still we can define a weaker form of compactness, relative compactness.

Definition 25. A set $A \subset X$ is relatively compact (or conditionally compact) if its closure $\bar{A}$ is compact.

This means that every sequence in A contains a convergent subsequence which converges to a point not necessarily in A.

We also note the following properties of compact sets:

1. If $(X, d)$ is compact, it is a complete metric space.
2. A compact set is closed and bounded.
3. If A is compact, every infinite subset in A has at least one point of accumulation (the Bolzano-Weierstrass property). (This is also an equivalent definition of compactness.)
4. If $f : X \to Y$ is continuous and $A \subset X$ is compact, f(A) is compact.
5. A continuous function $f : A \to \mathbb{R}$ achieves its minimum if A is compact.

Finally we shall state the Arzelà-Ascoli theorem which will be used in the application sections.

Let $(X, d_1)$ be a compact metric space and $(Y, d_2)$ a complete metric space and $C(X, Y)$ the space of continuous functions on X with range in Y.

$(C(X, Y), \rho)$ is a metric space, where

$$\rho(f, g) = \sup\{d_2(f(x), g(x)),\ x \in X\}$$

and in fact $(C(X, Y), \rho)$ is a complete metric space.

Definition 26. $A \subset C(X, Y)$ is equicontinuous at $x_0 \in X$ if, given $\varepsilon > 0$, $\exists\ \delta > 0$ such that $d_2(f(x), f(x_0)) < \varepsilon$ $\forall f \in A$ whenever $d_1(x, x_0) < \delta$ and $x \in X$.

A is equicontinuous on X if it is equicontinuous at all points in X.

Arzelà-Ascoli theorem

$A \subset C(X, Y)$ is relatively compact in $(C(X, Y), \rho)$ iff A is equicontinuous on X and for each $x \in X$, $A(x) = \{f(x) \text{ where } f \in A\}$ is relatively compact.


Deeper topological notions in metric spaces

We have already mentioned that metric spaces are a special class of topological spaces and here we shall explore some of the fundamental topological concepts for metric spaces.

First we show how the metric space may be characterized in terms of neighbourhoods.

Definition 27. Let $(X, d)$ be a metric space and $x_0$ an arbitrary point in $(X, d)$.

$B_r(x_0) = \{x \in X : d(x, x_0) < r\}$, $0 < r < \infty$, is the open ball of radius r centred at $x_0$.

$B_r[x_0] = \{x \in X : d(x, x_0) \le r\}$ is the closed ball of radius r centred at $x_0$.

$S_r[x_0] = \{x \in X : d(x, x_0) = r\}$ is the sphere of radius r centred at $x_0$.

Definition 28. Let $x_0$ be an arbitrary point in $(X, d)$; then a subset N of $(X, d)$ is a local neighbourhood of $x_0$ if $N = B_r(x_0)$ or $B_r[x_0]$ for some $r > 0$. The $B_r(x_0)$ are called open local neighbourhoods and the $B_r[x_0]$ closed local neighbourhoods.

The local neighbourhood system of $x_0$ is $N(x_0) = \{B_r(x_0), B_r[x_0]\}$, $r > 0$.

It is now possible to redefine continuity in terms of local neighbourhood systems using the following theorem:

Theorem

A function $f : (X, d_x) \to (Y, d_y)$ is continuous at $x_0$ in $(X, d_x)$ iff the inverse image under f of any local neighbourhood of $f(x_0)$ contains a local neighbourhood of $x_0$.

Similarly we can redefine convergence in terms of local neighbourhood systems.

[Figure: M, a local neighbourhood; N, an arbitrary local neighbourhood.]

Theorem

A sequence $\{x_n\}$ in a metric space $(X, d)$ converges to $x_0$ iff, for each local neighbourhood N of $x_0$, $x_n \in N$ for all $n \ge m(N)$, a number depending on N.

To some extent continuity and convergence may be defined independently of the metric, but we can even characterize continuity and convergence in terms of open sets. Topological spaces are defined in terms of open sets and this family is called "the topology" on the space. The topology or open sets of a metric space are always definable in terms of the metric, but often several different metrics generate the same topology. In fact, equivalent metric spaces need not have the same local neighbourhood systems.

Definition 29. Let $(X, d_1)$, $(X, d_2)$ be two metric spaces with the same underlying set. Then the metrics $d_1$, $d_2$ are equivalent if

(a) $f : (X, d_1) \to (Y, d_y)$, an arbitrary metric space, is continuous iff $f : (X, d_2) \to (Y, d_y)$ is continuous, and

(b) a sequence $\{x_n\}$ converges to $x_0$ in $(X, d_1)$ iff $\{x_n\}$ converges to $x_0$ in $(X, d_2)$.

In fact, either of (a) or (b) ensures the equivalence of $d_1$ and $d_2$, or even the condition:

(c) $I : (X, d_1) \to (X, d_2)$ and $I^{-1} : (X, d_2) \to (X, d_1)$ are continuous (I is the identity map).

A more fundamental way of expressing equivalence of metric spaces is in terms of open sets.

Definition 30. A set A in a metric space $(X, d)$ is open if A contains a local neighbourhood of each one of its points. Note that $\emptyset$ and X are always open sets.

Definition 31. The class of all open sets in $(X, d)$ is referred to as the topology (generated by the metric d).

We now state the basic result on equivalence of metric spaces.

Theorem

Let $(X, d_1)$ and $(X, d_2)$ be two metric spaces with the same underlying set X. Then $d_1$ and $d_2$ are equivalent iff they generate the same topology, i.e. iff they generate the same class of open sets.

We now restate the definitions of continuity and convergence in terms of open sets.

Definition 32. A map $f : (X, d_1) \to (Y, d_2)$ is continuous if the inverse image of each open set in $(Y, d_2)$ is an open set in $(X, d_1)$.

Definition 33. A sequence $\{x_n\}$ in a metric space $(X, d)$ converges to $x_0$ in X iff $x_n$ is in every open set containing $x_0$ for sufficiently large n.

These are the usual definitions one uses for more general topological spaces.

Previously we defined closed sets as those containing all their limit points, but this property may also be defined in terms of open sets.

Definition 34. Let $(X, d)$ be a metric space. A subset $A \subset X$ is closed if its complement $A' = X - A$ is an open set.

Open sets and closed sets have the properties:

1. $\emptyset$ and X are closed and open.
2. If the $A_i$ are closed, then $\bigcap_i A_i$ is closed, but only a finite union $\bigcup_{i=1}^{n} A_i$ is guaranteed to be closed.
3. If the $A_i$ are open, then $\bigcup_i A_i$ is open, but only a finite intersection $\bigcap_{i=1}^{n} A_i$ is guaranteed to be open.

Finally, we define a separable metric space.

Definition 35. A metric space (X, d) is separable if it contains a countable subset A which is dense in X .

In the applications considered, most spaces will be separable Banach spaces, but in partial differential equation theory you often need Fréchet spaces, which may be defined as complete metric spaces. Although they are not normed linear spaces, they do have similar properties and can be defined in terms of seminorms on locally convex topological spaces. Probably the best way of thinking of Fréchet spaces is as the inductive limit of normed linear spaces, since a more detailed explanation of these spaces is beyond the scope of this paper.

3. MEASURE AND INTEGRATION THEORY

We recall that $C[0,1]$, the space of continuous functions, was not complete under the $\|\cdot\|_2$ norm

$$\|x\|_2 = \Big( \int_0^1 |x(t)|^2\,dt \Big)^{1/2}.$$

If we extend our class of functions to the Riemann integrable functions, then it is still not complete under this norm. Really, what we need is an integral which has the property that if $f_n$ is square integrable and $f_n \to f$ in mean square, then f is also square integrable. The integral which does have this property is the Lebesgue integral. With our background of metric spaces and normed linear spaces we could define Lebesgue square integrable functions to be elements of the completion of $C[0,1]$ under the $\|\cdot\|_2$ norm, and similarly for general $p \ge 1$. However, we shall sketch the construction of the Lebesgue integral starting from basic measure theory.

Measure spaces

Definition 36. A measurable space is a couple $(X, \mathcal{B})$ consisting of a set X and a $\sigma$-algebra $\mathcal{B}$ of subsets of X. A subset of X is called measurable if it is in $\mathcal{B}$. ($\sigma$-algebra means closed under all countable set operations.)

Definition 37. A measure $\mu$ on a measurable space $(X, \mathcal{B})$ is a non-negative set function defined for all sets in $\mathcal{B}$ with the properties

$$\mu(\emptyset) = 0$$

$$\mu\Big( \bigcup_{i=1}^{\infty} E_i \Big) = \sum_{i=1}^{\infty} \mu(E_i) \quad \text{(countably additive)}$$

where the $E_i$ are disjoint sets in $\mathcal{B}$.

The triple $(X, \mathcal{B}, \mu)$ is then a measure space.

The Lebesgue measure space

Example 28. $(\mathbb{R}, \mathcal{L}, m)$, where m is the outer measure defined by

$$m A = \inf_{\bigcup I_n \supset A} \sum \text{length}(I_n)$$

where $\{I_n\}$ is a countable collection of intervals covering the set A.

The outer measure is not countably additive over all possible subsets of $\mathbb{R}$, so we define a subcollection $\mathcal{L}$, the set of Lebesgue measurable sets, by

$$E \in \mathcal{L} \ \text{ if } \ mA = m(A \cap E) + m(A \cap E') \quad \forall A \subset \mathbb{R}.$$

Then you can show that $\mathcal{L}$ is a $\sigma$-algebra of sets and it also contains all subsets of $\mathbb{R}$ with measure zero, i.e. $(\mathbb{R}, \mathcal{L}, m)$ is a well-defined measure space.

We remark that all intervals are in $\mathcal{L}$, including $(a, \infty)$. Further, all Borel sets are in $\mathcal{L}$ (this includes all open and closed sets). ($\mathcal{B}$, the set of Borel sets, is the smallest $\sigma$-algebra containing all the open sets of $\mathbb{R}$.)
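The definition of outer measure as an infimum over countable covers by intervals has a well-known consequence: every countable set has outer measure zero, because its nth point can be covered by an interval of length $\varepsilon/2^n$. The sketch below computes the total length of such a cover exactly with rational arithmetic. The sample set (rationals in $[0, 1]$ with denominator at most 30) and the function name are assumptions of the illustration.

```python
from fractions import Fraction

def cover_length(points, eps):
    """Total length of a cover in which the nth point (n = 1, 2, ...) is
    covered by an interval of length eps / 2^n centred at that point.

    The total is eps * (1 - 2^(-N)) for N points, hence below eps however
    many points are covered.
    """
    return sum(Fraction(eps) / 2 ** n for n, _ in enumerate(points, start=1))

# Hypothetical sample: the rationals p/q in [0, 1] with denominator at most 30.
rationals = sorted({Fraction(p, q) for q in range(1, 31) for p in range(q + 1)})
total = cover_length(rationals, Fraction(1, 100))
```

Since $\varepsilon$ is arbitrary, the infimum over covers is zero. The same argument applies to the full set of rationals, which is countable.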


Example 29. Another example is $([0, 1], \mathcal{L}, m)$, where $\mathcal{L}$ is the Lebesgue measurable sets in $[0, 1]$. Also $(\mathbb{R}, \mathcal{B}, m)$.

We say that $\mu$ is a finite measure if $\mu(X) < \infty$.

Example: $([0, 1], \mathcal{L}, m)$ with $m([0, 1]) = 1$; all probability measures.

Definition 38. A measure space $(X, \mathcal{B}, \mu)$ is complete if $\mathcal{B}$ contains all the subsets of sets of measure 0. (The Lebesgue measure is complete.) All measure spaces can be completed.

Definition 39. Measurable functions

Let $f : X \to \mathbb{R} \cup \{\infty\}$; then f is measurable if $\{x : f(x) < a\} \in \mathcal{B}$ for each a. If f, g are measurable, so are $f + g$, $cf$, $f + c$, $f \cdot g$, $f \vee g$, where c is a constant. If $\{f_n\}$ is a sequence of measurable functions then so are $\sup f_n$, $\inf f_n$, $\limsup f_n$, $\liminf f_n$, i.e. the set of measurable functions is closed under limiting operations.

A special type of measurable function is a simple function, which we use to construct the Lebesgue integral.

Definition 40. A simple function is $g(x) = \sum_{i=1}^{n} c_i \chi_{E_i}(x)$, where the $c_i$ are constants and $\chi_{E_i}$ is the characteristic function of $E_i \in \mathcal{B}$:

$$\chi_{E_i}(x) = \begin{cases} 1 & \text{if } x \in E_i \\ 0 & \text{if } x \notin E_i. \end{cases}$$

All non-negative measurable functions f can be written $f = \lim g_n$, where $g_n$ is a monotonic increasing sequence of simple functions.

Example 30. $f : A \to \mathbb{R} \cup \{\infty\}$ is Lebesgue measurable if $A \in \mathcal{L}$ and for each a, $\{x : f(x) < a\} \in \mathcal{L}$.

All continuous functions and piecewise continuous functions are Lebesgue measurable. All Riemann integrable functions are measurable.

We now build up our definition of integration on $(X, \mathcal{B}, \mu)$ by defining it just for non-negative simple functions:

$$\int_E g\,d\mu = \sum_{i=1}^{n} c_i\, \mu(E_i \cap E)$$

where $E \in \mathcal{B}$. This integral is independent of the representation of g and has the usual properties of integrals:

$$\int_E (a g + b \phi)\,d\mu = a \int_E g\,d\mu + b \int_E \phi\,d\mu$$

for $a, b > 0$.
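As a sketch of how simple functions approximate a non-negative function from below (the idea used next to define the integral of a general non-negative measurable function), take $f(x) = x^2$ on $[0, 1]$ with Lebesgue measure. The dyadic simple function $g_n$ takes the value $k/2^n$ on the level set $E_k = \{x : k/2^n \le x^2 < (k+1)/2^n\}$, which here is an interval whose measure is its length. The choice of f and the function name are assumptions of the illustration.

```python
import math

def simple_integral(n):
    """Integral over [0, 1] of the simple function g_n <= f for f(x) = x^2.

    g_n equals k / 2^n on E_k = [sqrt(k / 2^n), sqrt((k + 1) / 2^n)), so the
    integral is the sum over k of (k / 2^n) * m(E_k).
    """
    N = 2 ** n
    return sum((k / N) * (math.sqrt((k + 1) / N) - math.sqrt(k / N)) for k in range(N))

# The integrals increase with n and approach 1/3 from below.
approximations = [simple_integral(n) for n in range(1, 13)]
```

Note that the partition is taken over the range of f, not over its domain, which is the characteristic difference from Riemann sums.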


For non-negative measurable f on $(X, \mathcal{B}, \mu)$, define

$$\int_E f\,d\mu = \sup \int_E g\,d\mu$$

where g ranges over all simple functions with $0 \le g \le f$. This is a well-defined integral for non-negative measurable functions with the properties

1. $\int_E (af + bg)\,d\mu = a \int_E f\,d\mu + b \int_E g\,d\mu$ for $a, b \ge 0$.

2. $\int_E f\,d\mu \ge 0$ with equality iff $f = 0$ 'almost everywhere', i.e. except on a set with measure zero.

3. If $f \ge g$ a.e., then $\int_E f\,d\mu \ge \int_E g\,d\mu$.

4. If $\{f_n\}$ is a sequence of non-negative measurable functions, then

$$\int_E \sum_{n=1}^{\infty} f_n\,d\mu = \sum_{n=1}^{\infty} \int_E f_n\,d\mu.$$

This last property is an important advantage over the Riemann integral. To extend the integral to arbitrary measurable functions we define:

Definition 41. A non-negative measurable function is integrable on E if

$$\int_E f\,d\mu < \infty.$$

An arbitrary measurable function is integrable on E if $f^+$ and $f^-$ are both integrable and we define

$$\int_E f\,d\mu = \int_E f^+\,d\mu - \int_E f^-\,d\mu.$$

$f^+$ and $f^-$ are both non-negative functions defined by

$$f^+ = \tfrac{1}{2}(|f| + f), \qquad f^- = \tfrac{1}{2}(|f| - f).$$

This integral has the properties 1-4 and, in addition, the very important Lebesgue dominated convergence property.


5. Let $\{f_n(x)\}$ be a sequence of measurable functions on E such that $f_n(x) \to f(x)$ almost everywhere on E and $|f_n(x)| \le g(x)$, an integrable function on E; then

$$\int_E f\,d\mu = \lim_{n \to \infty} \int_E f_n\,d\mu$$

and

6. $\Big| \int_E f\,d\mu \Big| \le \int_E |f|\,d\mu.$

We have now defined integration over an arbitrary measure space $(X, \mathcal{B}, \mu)$ and some important examples are

Example 31. Lebesgue integral on $(\mathbb{R}, \mathcal{B}, m)$ or $((a, b), \mathcal{L}, m)$

We note that if f is Riemann integrable on $(a, b)$, then it is also Lebesgue integrable and the integrals agree.

Example 32. Probability measures $(X, \mathcal{B}, \mu)$, where $\mu(X) = 1$. In this case the integral on $(X, \mathcal{B}, \mu)$ is the expectation of a random variable. (These spaces will be discussed in more detail later on.)

Example 33. Multiple integrals on $(\mathbb{R}^n, \mathcal{L}^n, m)$ or $(I_1 \times \dots \times I_n, \mathcal{L}^n, m)$. If $X = I_1 \times \dots \times I_n$, measurable sets have the form $E_1 \times \dots \times E_n$, where $E_i$ is a Lebesgue measurable set on $I_i$, an interval of $\mathbb{R}$.

Then

$$\int_{E_1 \times \dots \times E_n} f(x)\,d\mu(x) = \int_{E_1} \dots \int_{E_n} f(x_1, \dots, x_n)\,dx_1 \dots dx_n$$

has all the usual properties of multiple integrals.

Differentiation and Lebesgue integration

Consider

$$F(t) = F(a) + \int_a^t f(s)\,ds \qquad (1)$$

where the integral is the Lebesgue integral on $((a, b), \mathcal{L}, m)$. We already know from Riemann integration theory that if f is continuous, $dF/dt = f$ on $[a, b]$ and, conversely, if F is continuously differentiable ($C^1$) with $dF/dt = f$, then F satisfies (1). With the Lebesgue theory we can extend this fundamental theorem of calculus using the concept of absolute continuity, which is slightly weaker than continuous differentiability.


Definition 42. Let I be a compact (closed bounded) interval in $\mathbb{R}$; then F is absolutely continuous on I if for every $\varepsilon > 0$ there is a $\delta > 0$ such that whenever $I_k = [a_k, b_k]$ are non-overlapping intervals in I with

$$\sum_{k=1}^{n} |b_k - a_k| \le \delta, \quad \text{we have} \quad \sum_{k=1}^{n} |F(b_k) - F(a_k)| \le \varepsilon.$$

F is absolutely continuous on an arbitrary interval if it is absolutely continuous on every compact subinterval.

For example, all $C^1$ functions are absolutely continuous, as are all functions which satisfy a Lipschitz condition on I:

$$|F(t) - F(s)| \le K\,|t - s| \quad \text{for } s, t \in I.$$

Then the extended fundamental theorem of calculus is:

$F : I \to \mathbb{R}$ is absolutely continuous iff F satisfies (1) for some integrable f on $[a, b]$. Then $F'$ exists almost everywhere and $F' = f$ almost everywhere.

Two other extensions of the Riemann theory are:

1. Differentiation under the integral

If g(t) is Lebesgue integrable on $(c, \infty)$, $f(x, t)$ is a measurable function of t for all x in $(a, b)$ and $\partial f/\partial x\,(x, t)$ exists for all t in $(c, \infty)$ with f, $\partial f/\partial x$ bounded on $(a, b) \times (c, \infty)$, then $F(x) = \int_{(c,\infty)} f(x, t)\, g(t)\,dt$ is differentiable in $(a, b)$ with

$$F'(x) = \int_{(c,\infty)} \frac{\partial f}{\partial x}(x, t)\, g(t)\,dt.$$

2. Multiple integrals and change of order of integration

We have already noted that multiple integrals fall into our general theory of integration on $(X, \mathcal{B}, \mu)$ and have the usual properties of integrals. The relation between these multiple integrals on $\mathbb{R}^n$ and successive Lebesgue integrals on $\mathbb{R}$ is stated in Fubini's theorem:

(a) If $f(x, y)$ is a measurable non-negative function on $\mathbb{R} \times \mathbb{R}$, then

$$\int_E dx \int_F f(x, y)\,dy = \iint_{E \times F} f(x, y)\,dx\,dy = \int_F dy \int_E f(x, y)\,dx$$

where E, F are Lebesgue measurable sets.

(b) If $f(x, y)$ is a measurable function on $\mathbb{R} \times \mathbb{R}$ and any of the integrals is absolutely convergent, then the three integrals in (a) are again equal. ($\int_E f\,d\mu$ is absolutely convergent if $\int_E |f|\,d\mu$ exists.)

(Similar results hold for integrals of functions of n variables.)

The Lebesgue-Banach spaces

Our discussion of the Lebesgue integral was motivated by the fact that $C[0, 1]$ was not complete under the $\|\cdot\|_2$ norm. We now define spaces $L_p(E)$ which overcome this problem.

Define, for $1 \le p < \infty$,

$$\mathcal{L}_p(E) = \Big\{ f : E \to \mathbb{R}, \text{ where E is a finite Lebesgue measurable set in } \mathbb{R} \text{ and } \int_E |f|^p\,dt < \infty \Big\}$$

$$\|f\|_p = \Big( \int_E |f(t)|^p\,dt \Big)^{1/p}.$$

Then by Minkowski's inequality

$$\Big( \int_E |f + g|^p\,dt \Big)^{1/p} \le \Big( \int_E |f|^p\,dt \Big)^{1/p} + \Big( \int_E |g|^p\,dt \Big)^{1/p}$$

we see that $\|\cdot\|_p$ is a seminorm. It is not a norm, since $\|f - g\|_p = 0$ does not imply that $f(t) = g(t)$ at all points in E, but only "almost everywhere" (i.e. they can differ on a set of points of measure zero).

To define a normed linear space we need to consider equivalence classes as elements in $\mathcal{L}_p(E)$: $f = g$ iff $f(x) = g(x)$ for all x, except in a set of measure zero.

If we consider these equivalence classes as elements of a new space $L_p(E)$, then $\|\cdot\|_p$ is a norm, where equality in $L_p(E)$ means equal almost everywhere.

$L_p(E)$ is a Banach space (i.e. complete).

We also define $L_\infty(E)$, which is a generalization of bounded functions:

$$L_\infty(E) = \{ f : E \to \mathbb{R}, \text{ where f is a Lebesgue measurable function on E which is bounded almost everywhere} \}.$$

If we again consider equivalence classes of elements, then $L_\infty(E)$ becomes a normed linear space under the norm

$$\|f\|_\infty = \operatorname{ess\,sup}_{t \in E} |f(t)|$$

where the ess sup is the infimum of $\sup g(t)$, where g ranges over all elements in the equivalence class for f, i.e.

$$\operatorname{ess\,sup} f(t) = \inf\{K : m\{t : f(t) > K\} = 0\}.$$

[Figure: graph of a function f on $[-1, 1]$.]

Then $\sup_{[-1,1]} f(x) = 2$, but $\operatorname{ess\,sup}_{[-1,1]} f(x) = 1$;

$L_\infty(E)$ is again a Banach space.

The Lebesgue spaces $L_p(E)$ are building blocks in analysis and so we summarize some useful properties:

1. $L_1(E) \supset L_2(E) \supset \dots \supset L_\infty(E)$.

2. $C_0^\infty$ is dense in $L_p(E)$, $1 \le p < \infty$, where $C_0^\infty$ is the space of infinitely differentiable functions with compact support. (f has compact support if it is zero outside a compact set.)

3. Riesz representation theorem

The dual space of $L_p(E)$ is $L_q(E)$; $1/p + 1/q = 1$.

All continuous linear functionals G on $L_p(E)$ have the form

$$G(f) = \int_E f(x)\, g(x)\,dx$$

where $g \in L_q(E)$ and $\|G\| = \|g\|_q$.

(This is essentially a consequence of the Hölder inequality

$$\int_E |fg|\,dx \le \Big( \int_E |f|^p\,dx \Big)^{1/p} \Big( \int_E |g|^q\,dx \Big)^{1/q}; \quad 1/p + 1/q = 1;\ 1 \le p < \infty.)$$

(Note, however, that the dual of $L_\infty(E)$ is not $L_1(E)$.)

4. $L_2(E)$ is a Hilbert space.
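The Hölder inequality behind the Riesz representation theorem can be checked numerically on $E = [0, 1]$. In the sketch below the integrals are replaced by midpoint sums on a common grid, for which the discrete form of the inequality holds exactly. The two test functions, the exponent $p = 3$ and the helper names are assumptions of the illustration.

```python
import math

def lp_norm(f, p, steps=10000):
    """Midpoint-rule approximation of (integral_0^1 |f(t)|^p dt)^(1/p)."""
    h = 1.0 / steps
    return sum(abs(f((k + 0.5) * h)) ** p * h for k in range(steps)) ** (1.0 / p)

def integral_abs_product(f, g, steps=10000):
    """Midpoint-rule approximation of integral_0^1 |f(t) g(t)| dt."""
    h = 1.0 / steps
    return sum(abs(f((k + 0.5) * h) * g((k + 0.5) * h)) * h for k in range(steps))

# Hypothetical test functions on E = [0, 1].
f = lambda t: math.sin(7.0 * t) + t
g = lambda t: math.exp(-t) - 0.5

p = 3.0
q = p / (p - 1.0)   # conjugate exponent, 1/p + 1/q = 1
lhs = integral_abs_product(f, g)
rhs = lp_norm(f, p) * lp_norm(g, q)
```

The bound `lhs <= rhs` is what makes $f \mapsto \int_E f g\,dx$ a bounded linear functional on $L_p(E)$ with norm at most $\|g\|_q$.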


Some extensions

1. We may similarly define $L_p(E)$, where E is a finite Lebesgue measurable set in $\mathbb{R}^n$ and f is a measurable real-valued function of n variables.

$L_p(E)$ is a Banach space under the norm

$$\|f\|_p = \Big( \int_E |f(x_1, \dots, x_n)|^p\,dx_1 \dots dx_n \Big)^{1/p}; \quad 1 \le p < \infty$$

and

$$\|f\|_\infty = \operatorname{ess\,sup}_E |f(x_1, \dots, x_n)|.$$

2. We may also consider $L_p(E; \mathbb{R}^k)$, the space of vector-valued functions of n variables $f : E \subset \mathbb{R}^n \to \mathbb{R}^k$ under the norm

$$\|f\|_p = \Big( \int_E |f(x_1, \dots, x_n)|_k^p\,dx_1 \dots dx_n \Big)^{1/p}$$

where $|\cdot|_k$ is the Euclidean norm on $\mathbb{R}^k$.

3. A final example is the space of second-order stochastic processes on $[0, T]$, which is a Hilbert space under the inner product

$$\langle x, y \rangle = \int_0^T E\{x(t)\, y(t)\}\,dt = \int_0^T \int_\Omega x(t, \omega)\, y(t, \omega)\,d\mu\,dt.$$

4. HILBERT SPACES

Since a normed linear space is characterized by its norm and its algebraic structure, we often find that supposedly different spaces are equivalent both algebraically and topologically.

Definition 43. Two normed linear spaces X and Y are topologically isomorphic if there exists a $T \in \mathscr{L}(X, Y)$ such that $T^{-1}$ exists and is continuous. T is called a topological isomorphism and we write $X \cong Y$.

Definition 44. If further, $\|Tx\|_Y = \|x\|_X$ for all $x \in X$, we say that they are isometrically isomorphic.


A useful necessary and sufficient condition for a linear map T of X onto Y to be a topological isomorphism is that there exist m, M > 0 such that

$$m\,\|x\|_X \le \|Tx\|_Y \le M\,\|x\|_X, \quad \forall x \in X$$

Example 35. Let $X = \ell_\infty^2$ and

$$Z = \{z : [0,1] \to \mathbb{R},\ z(t) = a_0 + a_1 t;\ a_0, a_1 \in \mathbb{R}\}$$

under the norm

$$\|z\| = \max\{|a_0|, |a_1|\}.$$

Then $Tx = x_1 + x_2 t$ for $x = (x_1, x_2)$ is a linear map from X to Z, $\|Tx\| = \max\{|x_1|, |x_2|\} = \|x\|_\infty$ and T is bounded. Also $T^{-1} : Z \to X$ is given by $T^{-1}z = \{z(0),\ dz/dt|_{t=0}\}$. $T^{-1}$ is linear and $\|T^{-1}z\| = \|z\|$.

So $X \cong Z$ and the two spaces are even isometrically isomorphic. This is an example of a very important result, namely:

All real n-dimensional normed linear spaces are topologically isomorphic, i.e. all are equivalent to $\mathbb{R}^n$.

So we can find out all about real finite-dimensional normed linear spaces from $\mathbb{R}^n$.

Properties of $\mathbb{R}^n$

1. $e_i = \{0, \dots, 0, 1, 0, \dots\}$ (with the 1 in the i-th position) forms a basis for $\mathbb{R}^n$, i.e. $x = \{x_1, \dots, x_n\} = \sum_{i=1}^n x_i e_i$.

2. All norms are equivalent.

3. $\mathbb{R}^n$ is complete (i.e. a B-space) and all its subspaces are closed.

4. All linear maps $L : \mathbb{R}^n \to Y$ are continuous for any normed linear space Y.

5. A corollary of 4 is that the algebraic and topological duals of $\mathbb{R}^n$ coincide.

6. All linear transformations $T : \mathbb{R}^n \to \mathbb{R}^m$ may be represented as m×n matrices. This representation is not unique, but depends on the bases of $\mathbb{R}^n$ and $\mathbb{R}^m$. In fact T is represented by a class of equivalent matrices (similar matrices when m = n and the same basis is used in both spaces).

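Example 36. Let X be the space of polynomials of degree ≤ 3,

i.e. $x(t) = \alpha_1 + \alpha_2 t + \alpha_3 t^2 + \alpha_4 t^3$

Let Y be the space of polynomials of degree ≤ 2. Let D be the derivative operator on X. The range of D is Y and

$D : X \to Y$ is linear.

If we take $\{1, t, t^2, t^3\} = \{x_1, x_2, x_3, x_4\}$ as a basis for X and $\{1, t, t^2\} = \{y_1, y_2, y_3\}$ as a basis for Y, then

$$Dx_1 = 0, \quad Dx_2 = y_1, \quad Dx_3 = 2y_2, \quad Dx_4 = 3y_3$$

and

$$\begin{bmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \alpha_4 \end{bmatrix}$$

if $Dx = \beta_1 + \beta_2 t + \beta_3 t^2$,

and relative to these bases D has the representation

$$\begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{bmatrix}$$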

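If we change the basis of X to $\{1 + t,\ t + t^2,\ t^2 + t^3,\ 1 + t^2\}$, then $D(1+t) = y_1$, $D(t+t^2) = y_1 + 2y_2$, $D(t^2+t^3) = 2y_2 + 3y_3$ and $D(1+t^2) = 2y_2$, so D has the new representation

$$\begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 2 & 2 & 2 \\ 0 & 0 & 3 & 0 \end{bmatrix}$$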
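The matrix of D relative to a chosen basis can be checked mechanically: column j holds the coordinates of D applied to the j-th basis element. The following is a minimal sketch, assuming polynomials are stored as coefficient lists and that Y keeps the monomial basis $\{1, t, t^2\}$; the names `derivative` and `matrix_of_D` are illustrative only.

```python
def derivative(coeffs):
    """Differentiate a polynomial stored as [c0, c1, c2, ...] (c_k multiplies t**k)."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def matrix_of_D(basis_X, dim_Y):
    """Matrix of D relative to basis_X and the monomial basis {1, t, ..., t**(dim_Y-1)} of Y.

    Column j holds the coordinates of D(basis_X[j]).
    """
    columns = []
    for p in basis_X:
        dp = derivative(p)
        dp = dp + [0] * (dim_Y - len(dp))  # pad with zeros up to length dim_Y
        columns.append(dp[:dim_Y])
    # Transpose the list of columns into a list of rows.
    return [[col[i] for col in columns] for i in range(dim_Y)]


# Monomial basis 1, t, t^2, t^3 of X, written as coefficient lists.
monomials = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
M = matrix_of_D(monomials, 3)

# Apply the matrix to the coordinates of x(t) = 5 + 4t + 3t^2 + 2t^3.
alpha = [5, 4, 3, 2]
beta = [sum(m * a for m, a in zip(row, alpha)) for row in M]
```

Passing a different list of basis polynomials to `matrix_of_D` gives the representation of the same operator D in that basis.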

When we consider infinite-dimensional spaces, the situation is not as simple. There is, however, an interesting subclass of infinite-dimensional spaces, like $\ell_2$, which can be thought of as a generalization of $\mathbb{R}^n$ to $n = \infty$.

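Definition 45. An inner product on a linear space X is a function $\langle \cdot, \cdot \rangle : X \times X \to \mathbb{C}$ such that

1. $\langle \alpha x + \beta y, z \rangle = \alpha \langle x, z \rangle + \beta \langle y, z \rangle$ for scalars α, β.

2. $\overline{\langle x, y \rangle} = \langle y, x \rangle$

3. $\langle x, x \rangle \ge 0$ and $\langle x, x \rangle = 0$ iff x = 0.

A linear space X with an inner product $\langle \cdot, \cdot \rangle$ is called an inner product space.

Example 37. $\mathbb{R}^n$ with $\langle x, y \rangle = \sum_{i=1}^n x_i y_i$

where $x = \{x_1, \dots, x_n\}$, $y = \{y_1, \dots, y_n\}$.

Example 38. $L_2[a,b]$ with $\langle x, y \rangle = \int_a^b x(t)\,y(t)\,dt$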

Other properties of the inner product

1. $\langle x, y \rangle = 0\ \ \forall x \in X \Rightarrow y = 0$.

2. $\langle x, \alpha y \rangle = \bar{\alpha} \langle x, y \rangle$

3. X can be made a normed linear space by defining $\|x\| = \sqrt{\langle x, x \rangle}$

4. Parallelogram law

$$\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2$$

where $\|\cdot\|$ is the norm induced by the inner product.

5. Schwarz inequality $|\langle x, y \rangle| \le \|x\|\,\|y\|$.

Definition 46. A Hilbert space is an inner product space which is complete as a normed linear space under the induced norm, i.e. a Hilbert space is a special case of a B-space.

When is a B-space a Hilbert space? If its norm obeys the parallelogram law, then we can define an inner product

$$\langle x, y \rangle = \tfrac{1}{4}\{\|x + y\|^2 - \|x - y\|^2 + i\|x + iy\|^2 - i\|x - iy\|^2\}$$

Example 52. $L_2[a,b]$ is a Hilbert space with inner product

$$\langle x, y \rangle = \int_a^b x(t)\,y(t)\,dt$$

and norm

$$\|x\| = \Big(\int_a^b |x(t)|^2\,dt\Big)^{1/2}$$

as before.

Let $V = \{y \in L_2[a,b] : y' \in L_2[a,b] \text{ and } y(a) = 0 = y(b)\}$. Then V is a linear subspace of $L_2[a,b]$ and in fact is a dense subspace. On V we can define a new inner product

$$\langle y, z \rangle_V = \langle y, z \rangle + \langle y', z' \rangle$$

V is a Hilbert space under $\langle \cdot, \cdot \rangle_V$, but it is not closed under $\langle \cdot, \cdot \rangle$.

Example 53. Let Ω be a closed subset in $\mathbb{R}^3$ and $X = \mathscr{C}^2(\Omega)$, the space of complex-valued functions with continuous second partials in Ω. Define

$$\nabla u = \Big(\frac{\partial u}{\partial x_1}, \frac{\partial u}{\partial x_2}, \frac{\partial u}{\partial x_3}\Big)$$

and

$$\langle u, v \rangle = \int_\Omega \Big(u\bar{v} + \frac{\partial u}{\partial x_1}\frac{\partial \bar{v}}{\partial x_1} + \frac{\partial u}{\partial x_2}\frac{\partial \bar{v}}{\partial x_2} + \frac{\partial u}{\partial x_3}\frac{\partial \bar{v}}{\partial x_3}\Big)\,dx_1\,dx_2\,dx_3$$

where $x = (x_1, x_2, x_3)$.

Then $\langle \cdot, \cdot \rangle$ is an inner product on X and it induces the norm

$$\|u\| = \Big(\int_\Omega \big(|u|^2 + |\nabla u|^2\big)\,dx_1\,dx_2\,dx_3\Big)^{1/2}$$

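Example 54. Let Ω be an open subset of $\mathbb{R}^k$ and $u \in \mathscr{C}^n(\Omega)$. As in Ex.22 define the differential operator $D^\alpha u$:

$$D^\alpha u = \frac{\partial^{|\alpha|} u}{\partial x_1^{\alpha_1} \cdots \partial x_k^{\alpha_k}}, \quad |\alpha| = \alpha_1 + \dots + \alpha_k$$

and the inner product $\langle \cdot, \cdot \rangle_n$ on $\mathscr{C}^n(\Omega)$:

$$\langle u, v \rangle_n = \int_\Omega \sum_{|\alpha| \le n} D^\alpha u(x)\,\overline{D^\alpha v(x)}\,dx$$

$\langle \cdot, \cdot \rangle_n$ induces the norm

$$\|u\|_{n,2} = \Big(\int_\Omega \sum_{|\alpha| \le n} |D^\alpha u(x)|^2\,dx\Big)^{1/2}$$

The completion of $\mathscr{C}^n(\Omega)$ with respect to this norm is a Hilbert space, denoted $H^n(\Omega)$, a Sobolev space, which is used in distribution theory in partial differential equations.

Another important Sobolev space is $H_0^n(\Omega)$, which is the completion of $\mathscr{C}_0^\infty(\Omega)$ under this (n, 2) norm. ($\mathscr{C}_0^\infty(\Omega)$ is the space of infinitely differentiable functions with compact support.)

We note that the other (n, p) norms cannot be obtained from inner products (see Ex.22).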

An important finite-dimensional concept which generalizes to Hilbert spaces is orthogonality:

Definition 47. x is orthogonal (perpendicular) to y, $x \perp y$, iff $\langle x, y \rangle = 0$.

If $x \perp y$, then the parallelogram law reduces to

$$\|x + y\|^2 = \|x\|^2 + \|y\|^2$$

which certainly looks familiar.

Definition 48. If M is a subspace of a Hilbert space H, then the orthogonal complement is

$$M^\perp = \{x \in H : \langle x, y \rangle = 0\ \ \forall y \in M\}$$

$M^\perp$ is a closed linear subspace of H and, when M is closed, H can be decomposed uniquely as a direct sum

$$H = M \oplus M^\perp$$

and also $M^{\perp\perp} = M$. If $y \in H$, then $y = y_M + y_{M^\perp}$, where $y_M \in M$, $y_{M^\perp} \in M^\perp$ and $\|y\|^2 = \|y_M\|^2 + \|y_{M^\perp}\|^2$. M induces an orthogonal projection π on H, $\pi : H \to M$, where $\pi y = y_M$.

π is linear and bounded and $\|\pi\| = 1$.

Example 39. $H = L_2[-a, a]$, M = {set of all even functions: x(t) = x(-t)}. M is a linear subspace of H.

Let $P : H \to H$ be $y = Px$, where $y(t) = \tfrac{1}{2}(x(t) + x(-t))$. Then P is a projection of H onto M.
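The projection onto the even functions can be tried out on sampled data. The sketch below assumes a uniform grid on [-a, a] that is symmetric about 0, so that reversing the list of samples corresponds to $t \mapsto -t$, and it replaces the $L_2$ inner product by a Riemann sum; the test function $x(t) = e^t$ and the names `even_part` and `inner` are illustrative choices.

```python
import math


def even_part(samples):
    """Apply (Px)(t) = (x(t) + x(-t)) / 2 to samples on a grid symmetric about 0."""
    flipped = samples[::-1]  # samples of t -> x(-t)
    return [0.5 * (u + v) for u, v in zip(samples, flipped)]


def inner(u, v, h):
    """Riemann-sum approximation of the L2 inner product with grid step h."""
    return h * sum(p * q for p, q in zip(u, v))


a, n = 1.0, 200                      # grid on [-a, a] with n subintervals
h = 2 * a / n
grid = [-a + i * h for i in range(n + 1)]
x = [math.exp(t) for t in grid]      # x(t) = e^t is neither even nor odd

Px = even_part(x)                            # approximately cosh(t)
residual = [u - v for u, v in zip(x, Px)]    # approximately sinh(t), an odd function
PPx = even_part(Px)                          # applying P twice changes nothing
```

Here `Px` and `residual` are the components $y_M$ and $y_{M^\perp}$ of the decomposition in Definition 48, up to discretization.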

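Definition 49. An orthonormal set in a Hilbert space H is a non-empty subset $\{\phi_n\}$ of H such that $\langle \phi_n, \phi_m \rangle = \delta_{nm}$, i.e. $\{\phi_n\}$ are mutually orthogonal unit vectors. Of course any mutually orthogonal set $\{u_n\}$ of non-zero vectors may be normalized by $\phi_n = u_n / \|u_n\|$.

Example 40. In $\mathbb{R}^n$ the set $\phi_k = \{0, \dots, 0, 1, 0, \dots\}$ (with the 1 in the k-th position) is an orthonormal set. Similarly for $\ell_2$.

Example 41. $\{\sin \pi n t\}$ is an orthonormal set in $L_2[0, 2]$.

Example 42. Any linearly independent set $\{x_1, x_2, \dots\}$ in H can generate an orthonormal set $\{\phi_n\}$ by the Gram-Schmidt orthogonalization process.

Let $\phi_1 = x_1 / \|x_1\|$ and

$$\phi_2 = \frac{x_2 - \langle x_2, \phi_1 \rangle \phi_1}{\|x_2 - \langle x_2, \phi_1 \rangle \phi_1\|}$$

so $\langle \phi_2, \phi_1 \rangle = 0$.

Continue in the following way:

$$\phi_n = \frac{x_n - \langle x_n, \phi_1 \rangle \phi_1 - \langle x_n, \phi_2 \rangle \phi_2 - \dots - \langle x_n, \phi_{n-1} \rangle \phi_{n-1}}{\|x_n - \langle x_n, \phi_1 \rangle \phi_1 - \dots - \langle x_n, \phi_{n-1} \rangle \phi_{n-1}\|}$$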
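The Gram-Schmidt recursion of Example 42 can be sketched in a few lines for vectors in $\mathbb{R}^n$ with the usual dot product. This is only an illustration under the assumption that the input vectors are linearly independent; the function names are not taken from the text.

```python
import math


def inner(u, v):
    """Euclidean inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors):
    """Return an orthonormal list spanning the same subspace as `vectors`.

    Assumes the input vectors are linearly independent.
    """
    basis = []
    for x in vectors:
        # Subtract from x its components along the vectors already constructed.
        r = list(x)
        for phi in basis:
            c = inner(x, phi)
            r = [ri - c * pi for ri, pi in zip(r, phi)]
        norm = math.sqrt(inner(r, r))
        basis.append([ri / norm for ri in r])
    return basis


phis = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
# Gram matrix of the result; it should be close to the identity.
gram = [[inner(p, q) for q in phis] for p in phis]
```

The same recursion applies in any inner product space once `inner` is replaced by the appropriate inner product, for instance a quadrature approximation of the $L_2$ inner product.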

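Orthonormal sets satisfy an important inequality: Bessel's inequality.

If $\{\phi_i\}_{i=1}^n$ is a finite orthonormal set in H and $x \in H$ then

$$\sum_{i=1}^n |\langle x, \phi_i \rangle|^2 \le \|x\|^2$$

and

$$\Big(x - \sum_{i=1}^n \langle x, \phi_i \rangle \phi_i\Big) \perp \phi_j, \quad j = 1, \dots, n.$$

Proof

$$0 \le \Big\|x - \sum_{i=1}^n \langle x, \phi_i \rangle \phi_i\Big\|^2 = \Big\langle x - \sum_{i=1}^n \langle x, \phi_i \rangle \phi_i,\ x - \sum_{i=1}^n \langle x, \phi_i \rangle \phi_i \Big\rangle$$

$$= \langle x, x \rangle - 2\sum_{i=1}^n \langle x, \phi_i \rangle \overline{\langle x, \phi_i \rangle} + \sum_{i,j=1}^n \langle x, \phi_i \rangle \overline{\langle x, \phi_j \rangle}\,\delta_{ij}$$

$$= \|x\|^2 - \sum_{i=1}^n |\langle x, \phi_i \rangle|^2$$

and

$$\Big\langle x - \sum_{i=1}^n \langle x, \phi_i \rangle \phi_i,\ \phi_j \Big\rangle = \langle x, \phi_j \rangle - \sum_{i=1}^n \langle x, \phi_i \rangle\,\delta_{ij} = 0$$

This has a useful geometric interpretation: $\sum_{i=1}^n \langle x, \phi_i \rangle \phi_i$ is the projection of x onto the subspace spanned by $\{\phi_1, \dots, \phi_n\}$ and $x - \sum_{i=1}^n \langle x, \phi_i \rangle \phi_i$ is orthogonal to this subspace.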
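Bessel's inequality and its geometric interpretation can be checked numerically on a small example. The sketch below uses $\mathbb{R}^4$ with the usual dot product and an orthonormal pair chosen purely for illustration.

```python
import math


def inner(u, v):
    """Euclidean inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))


def project(x, orthonormal_set):
    """Return the coefficients <x, phi_i> and the projection sum_i <x, phi_i> phi_i."""
    coeffs = [inner(x, phi) for phi in orthonormal_set]
    proj = [sum(c * phi[k] for c, phi in zip(coeffs, orthonormal_set))
            for k in range(len(x))]
    return coeffs, proj


s = 1 / math.sqrt(2)
phis = [[1.0, 0.0, 0.0, 0.0], [0.0, s, s, 0.0]]  # an orthonormal set in R^4
x = [3.0, 1.0, 2.0, 5.0]

coeffs, proj = project(x, phis)
residual = [a - b for a, b in zip(x, proj)]      # x minus its projection

bessel_lhs = sum(c * c for c in coeffs)          # sum of |<x, phi_i>|^2
norm_sq = inner(x, x)                            # ||x||^2
```

The gap `norm_sq - bessel_lhs` equals the squared norm of `residual`, which is exactly the identity obtained in the proof.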

Bessel's inequality is also valid for n = ∞.

Earlier on we said that a Hilbert space can be thought of as "$\mathbb{R}^\infty$" in some sense. Just as for $\mathbb{R}^n$, where we can always express any element as a linear combination of n basis elements, we can do a similar thing for Hilbert spaces, except of course we need infinitely many basis elements and we need enough of them to include all the elements, i.e. a complete set.

Definition 50. An orthonormal set $\{\phi_n\}$ in a Hilbert space H is complete or maximal if $H = \overline{\mathrm{Sp}}\{\phi_n\}$.

The following are equivalent conditions for $\{\phi_i\}$ to be complete:

1. $x \perp \phi_i\ \ \forall i \Rightarrow x = 0$

2. $x = \sum_i \langle x, \phi_i \rangle \phi_i\ \ \forall x \in H$

This is called the Fourier expansion for x;

$\langle x, \phi_i \rangle$ are called the Fourier coefficients.

3. Parseval's equation holds: $\|x\|^2 = \sum_i |\langle x, \phi_i \rangle|^2$

4. There is no vector $y \in H$ such that $\{\phi_i,\ y/\|y\|\}$ is an orthonormal set larger than $\{\phi_i\}$.

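Example 43. Let $H = L_2[0, 2\pi]$. Then $\{\phi_n(x) = e^{inx}/\sqrt{2\pi}\}$ is an orthonormal set in $L_2[0, 2\pi]$, and

$$c_n = \langle f, \phi_n \rangle = \frac{1}{\sqrt{2\pi}} \int_0^{2\pi} f(x)\,e^{-inx}\,dx$$

are (up to the normalizing factor) the usual complex Fourier coefficients of f. Bessel's inequality is

$$\sum_{n=-\infty}^{\infty} |c_n|^2 \le \int_0^{2\pi} |f(x)|^2\,dx$$

One can show that this set is complete, so that equality holds and we can write

$$f(x) = \sum_{n=-\infty}^{\infty} c_n \frac{e^{inx}}{\sqrt{2\pi}}$$

where equality is of course in the $L_2$-sense, not pointwise in general. Since $H = \overline{\mathrm{Sp}}\{\phi_i\}$, we can see why the Riesz-Fischer theorem holds:

If $\{c_n\}$ is a complex sequence such that $\sum |c_n|^2 < \infty$, then there is an $f \in L_2[0, 2\pi]$ whose Fourier coefficients are the $c_n$'s.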

Example 44. $H = L_2[0, 2\pi]$ over the real numbers. This is similar to the previous example, except the complete orthonormal basis is

$$\Big\{\frac{1}{\sqrt{2\pi}},\ \frac{1}{\sqrt{\pi}} \sin nt,\ \frac{1}{\sqrt{\pi}} \cos nt\Big\}$$

Example 45. $H = L_2[-1, 1]$. The Legendre polynomials are an orthogonal set, where

$$P_n(x) = \frac{1}{2^n n!} \frac{d^n}{dx^n} (x^2 - 1)^n$$

We have

$$\int_{-1}^{1} P_n(x)\,P_m(x)\,dx = \frac{2}{2n+1}\,\delta_{nm}$$

Also $\{P_n(x)\}$ is complete in $L_2[-1, 1]$ and so an arbitrary $f \in L_2[-1, 1]$ has the expansion

$$f(x) = \sum_{n=0}^{\infty} a_n P_n(x)$$

where each polynomial $P_n(x)$ is of degree n and the partial sums converge to f in the $L_2$-norm, i.e.

$$\sum_{n=0}^{k} a_n P_n(x)$$

approximates f(x) in the mean square sense.

Example 46. $H = L_2(0, \infty)$. The Laguerre polynomials $L_n(x)$ are often defined as the polynomial solutions of $xy'' + (1 - x)y' + ny = 0$. Then $\{e^{-x/2} L_n(x)\}$ forms a complete orthonormal set for H.

These last two examples are two of many such ways of expressing a function f as an expansion in terms of orthonormal functions. Often these arise as eigenfunction expansions of solutions of partial differential equations after using separation of variables. These also give us good approximations for f as a finite sum of terms, which are invaluable for numerical solutions. The advantage of using expansions in terms of orthonormal functions, as opposed to, say, finite polynomials, $f(x) = \sum_{n=0}^{k} a_n x^n$, is that, if you wish to increase your accuracy for f in the orthonormal expansion, you simply calculate one extra coefficient and add it on and repeat until the accuracy is sufficient. However, with approximation by polynomials, you need to recalculate all the coefficients for each new approximation.
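The remark about adding one coefficient at a time can be illustrated with the real trigonometric basis on $[0, 2\pi]$. The sketch below is only an illustration: the function f(t) = t, the midpoint quadrature rule and the names `integrate` and `basis` are assumptions made here. Each new coefficient is computed independently of the earlier ones, and the squared mean-square error $\|f\|^2 - \sum_k c_k^2$ can only decrease.

```python
import math


def integrate(g, a, b, n=4000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def basis(k):
    """k-th function of the real orthonormal trigonometric set on [0, 2*pi].

    k = 0 is the constant; odd k gives sines, even k > 0 gives cosines.
    """
    if k == 0:
        return lambda t: 1 / math.sqrt(2 * math.pi)
    n = (k + 1) // 2
    if k % 2 == 1:
        return lambda t: math.sin(n * t) / math.sqrt(math.pi)
    return lambda t: math.cos(n * t) / math.sqrt(math.pi)


f = lambda t: t                      # function to expand (an arbitrary choice)
a, b = 0.0, 2 * math.pi
norm_sq = integrate(lambda t: f(t) ** 2, a, b)

coeffs, errors = [], []
for k in range(9):
    phi = basis(k)
    # One new coefficient per step; the earlier coefficients are left untouched.
    coeffs.append(integrate(lambda t: f(t) * phi(t), a, b))
    # Squared L2 error of the partial sum, using orthonormality of the basis.
    errors.append(norm_sq - sum(c * c for c in coeffs))
```

With a monomial basis $1, x, x^2, \dots$ the best coefficients are coupled through a linear system, which is why they all change when the degree is raised.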

5. LINEAR FUNCTIONALS, WEAK CONVERGENCE, WEAK COMPACTNESS

Definition 51. A linear functional on a normed linear space X is a linear map $f : X \to \mathbb{R}$ (or $\mathbb{C}$ if X is a complex vector space).

Definition 52. The algebraic dual of X, denoted $X^a$, is the linear vector space of all linear functionals on X.

If we consider topological properties as well, we can look at the space of all continuous linear functionals (or equivalently all bounded linear functionals).

Definition 53. The topological dual of X or the conjugate space X* is the normed linear space of all bounded linear functionals on X with the norm

$$\|f\| = \sup_{x \in X} \{|f(x)| : \|x\| = 1\}$$

We remark that X* is always a B-space, even when X is not.

We illustrate this useful concept with the following examples:

Example 47. Consider $X = \ell_2^n$ and define $f_a(x) = a_1 x_1 + a_2 x_2 + \dots + a_n x_n$ for a fixed $a \in \ell_2^n$, $a \ne 0$.

$f_a : X \to \mathbb{R}$ and is linear. It is also bounded since

$$|f_a(x)| = |a_1 x_1 + a_2 x_2 + \dots + a_n x_n| \le \|a\|_2\,\|x\|_2$$

applying the useful Hölder inequality

$$\sum_{i=1}^n |a_i x_i| \le \Big(\sum_{i=1}^n |a_i|^p\Big)^{1/p} \Big(\sum_{i=1}^n |x_i|^q\Big)^{1/q}$$

where $1/p + 1/q = 1$, valid for n finite or infinite and $1 < p < \infty$. We also have that

$$\|f_a\| = \sup\{|f_a(x)| : \|x\|_2 = 1\} \le \|a\|_2$$

Letting $x = a/\|a\|_2$, we get $f_a(a/\|a\|_2) = \|a\|_2^2/\|a\|_2 = \|a\|_2$ and hence we have the equality $\|f_a\| = \|a\|_2$. So every $a \in X$ defines an element $f_a$ in the dual space; similarly for $\ell_2$.
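The equality $\|f_a\| = \|a\|_2$ can be checked numerically: the value $\|a\|_2$ is attained at $x = a/\|a\|_2$, and no unit vector gives a larger value. The sketch below uses an arbitrary vector a and a random search over unit vectors; both are illustrative assumptions.

```python
import math
import random


def f_a(a, x):
    """The functional f_a(x) = a_1 x_1 + ... + a_n x_n."""
    return sum(ai * xi for ai, xi in zip(a, x))


def norm2(v):
    """Euclidean (l_2) norm."""
    return math.sqrt(sum(t * t for t in v))


a = [3.0, -4.0, 12.0]                  # ||a||_2 = 13
x_star = [ai / norm2(a) for ai in a]   # the unit vector a / ||a||_2
attained = f_a(a, x_star)

# Random unit vectors never give a value larger than ||a||_2.
random.seed(0)
best = 0.0
for _ in range(1000):
    v = [random.gauss(0.0, 1.0) for _ in a]
    u = [t / norm2(v) for t in v]
    best = max(best, abs(f_a(a, u)))
```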

Example 48. $X = \mathscr{C}[a,b]$.

Then $I(x) = \int_a^b x(t)\,dt$ is a linear functional and $|I(x)| \le \sup|x(t)| \cdot (b - a)$, with equality when x ≡ constant. So I is a continuous linear functional with the norm (b - a).

Example 49. $X = \mathscr{C}[a,b]$ and $I_y(x) = \int_a^b y(t)\,x(t)\,dt$; $I_y$ is a linear functional on X for integrable y.

$$|I_y(x)| \le \max|x(t)| \int_a^b |y(t)|\,dt = \|x\| \int_a^b |y(t)|\,dt$$

So $I_y$ is continuous and in fact $\|I_y\| = \int_a^b |y(t)|\,dt$.

Example 50. $X = \mathscr{C}[a,b]$ and $\delta_{t_0} x = x(t_0)$.

$\delta_{t_0}$ is a linear functional on X with

$$|\delta_{t_0} x| = |x(t_0)| \le \|x\|$$

and we have equality when x is a constant. So our familiar "delta function", which is often represented as $\int_a^b x(t)\,\delta(t - t_0)\,dt$, is not a function, but a continuous linear functional on $\mathscr{C}[a,b]$ with norm 1.

It is all very well to define the dual of a space and to give some simple examples, but we would like to know more in the general case. The following famous Hahn-Banach theorem assures us of the existence of lots of continuous linear functionals.

Hahn-Banach theorem I

Every continuous linear functional $f : M \to \mathbb{R}$ defined on a linear subspace M of X, a normed linear space, can be extended to a continuous linear functional F on all of X with preservation of norm.

If we take the particular subspace to be $\{\alpha x_0\}$ for $x_0 \ne 0$ and α any scalar, then a linear functional is $f_0(y) = \alpha\|x_0\|$, where $y = \alpha x_0$, and $f_0$ is bounded with norm 1. The Hahn-Banach theorem says that there is an $F_0$ in X* with norm 1 which extends $f_0$, so that $F_0(x_0) = \|x_0\|$.

The Hahn-Banach theorem also has the following geometric interpretation, which is useful in proving the existence of solutions to optimal control problems.

Hahn-Banach theorem II

Let X be a normed linear space, M a manifold of X and A a non-empty convex, open subset of X not intersecting M. Then there exists a closed hyperplane in X containing M and not intersecting A.

For many spaces there is a nice relationship between X and its conjugate space X*.

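Example 51. Consider $\ell_p^n$ and define $f \in (\ell_p^n)^*$ by $f(x) = \sum_{i=1}^n f_i x_i$; $1 < p < \infty$. f is clearly linear, and

$$|f(x)| \le \sum_{i=1}^n |f_i x_i| \le \Big(\sum_{i=1}^n |f_i|^q\Big)^{1/q} \Big(\sum_{i=1}^n |x_i|^p\Big)^{1/p} = \|f\|_q\,\|x\|_p$$

where $1/p + 1/q = 1$, using Hölder's inequality.

If we let

$$x_i = \begin{cases} |f_i|^q / f_i & \text{if } f_i \ne 0, \\ 0 & \text{if } f_i = 0, \end{cases}$$

then

$$\|x\|_p = \Big(\sum_{i=1}^n |f_i|^{(q-1)p}\Big)^{1/p} = \Big(\sum_{i=1}^n |f_i|^q\Big)^{1/p}$$

and

$$|f(x)| = \sum_{i=1}^n |f_i|^q = \|f\|_q\,\|x\|_p$$

since $1/p + 1/q = 1$.

So we actually have

$$\|f\| = \Big(\sum_{i=1}^n |f_i|^q\Big)^{1/q} = \|f\|_q$$

In fact all continuous linear functionals have this form and the dual of $\ell_p^n$ is $\ell_q^n$, where $1/p + 1/q = 1$.

Similarly $\ell_p^* = \ell_q$.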

Example 52. Consider $\ell_\infty^n$ with $\|x\|_\infty = \max_{1 \le i \le n} |x_i|$.

Let $f(x) = \sum_{i=1}^n f_i x_i$. Then f is a linear functional and

$$|f(x)| \le \sum_{i=1}^n |f_i|\,|x_i| \le \|x\|_\infty \sum_{i=1}^n |f_i| = \|x\|_\infty\,\|f\|_1$$

For $x_i = \operatorname{sgn} f_i$, we get equality and so $\|f\| = \|f\|_1$. We can show that all continuous linear functionals must have this form, i.e. $(\ell_\infty^n)^* = \ell_1^n$. Similarly $(\ell_1^n)^* = \ell_\infty^n$ and $\ell_1^* = \ell_\infty$. However, $\ell_\infty^* \ne \ell_1$.

Example 53. There is an analogous result for the Lebesgue spaces:

$L_p[a,b]^* = L_q[a,b]$, where $1/p + 1/q = 1$; $1 < p < \infty$

i.e. linear functionals $F_g$ on $L_p[a,b]$ have the form:

$$F_g(f) = \int_a^b f(t)\,g(t)\,dt, \quad \text{where } g \in L_q[a,b].$$

It is easily verified that $F_g \in L_p[a,b]^*$ since it is linear and Hölder's inequality for integrals is

$$\Big|\int_a^b f(t)\,g(t)\,dt\Big| \le \Big(\int_a^b |f(t)|^p\,dt\Big)^{1/p} \Big(\int_a^b |g(t)|^q\,dt\Big)^{1/q} = \|f\|_p\,\|g\|_q, \quad \text{where } 1/p + 1/q = 1$$

i.e. $F_g$ is bounded and $\|F_g\| \le \|g\|_q$. In fact $\|F_g\| = \|g\|_q$. These duality results do not hold for p = ∞.

Since $1/p + 1/q = 1$, taking the second dual of $L_p[a,b]$ returns the original space: $(L_p[a,b]^*)^* = L_q[a,b]^* = L_p[a,b]$. In general, for a normed linear space, one has $X^{**} \supset X$ and if $X = X^{**}$ we call X reflexive.

Definition 54. X is reflexive if its second conjugate is itself, i.e. $X^{**} \cong X$.

This is a rather special property of spaces like $L_p[a,b]$, $\ell_p^n$ and $\ell_p$ ($1 < p < \infty$) and is not shared by many common spaces; for example $\mathscr{C}[a,b]$ is not reflexive. However, all Hilbert spaces are reflexive and in fact we can say more: they are self-dual spaces.

Suppose $y \in H$ and define $f_y : H \to \mathbb{C}$ by

$$f_y(x) = \langle x, y \rangle \quad \forall x \in H$$

$f_y$ is linear and $|f_y(x)| \le \|x\|\,\|y\|$. So $f_y$ is bounded and $\|f_y\| \le \|y\|$. In fact we can show that $\|f_y\| = \|y\|$, and that every linear functional $f \in H^*$ corresponds to some $y \in H$ in this way, i.e. there is a 1-1 correspondence between $y \in H$ and $f_y \in H^*$ and $\|f_y\| = \|y\|$. H* can be considered as a Hilbert space too, by defining $\langle f_x, f_y \rangle = \langle y, x \rangle$. So there is an isometric isomorphism between H and H*; usually we identify them as the same space. It follows that Hilbert spaces are necessarily reflexive.

As convergence is a key concept in analysis, we shall now have a careful look at what one means by $x_n \to x$. In fact, depending on the context we can mean many very different things.

Example 54. Let

$$x_n(t) = \begin{cases} 1/\sqrt{t} & \text{on } [1/n, 1], \\ 0 & \text{elsewhere.} \end{cases}$$

Then $x_n(\cdot) \in L_p[0,1]$ for all finite p, since $\int_0^1 |x_n(t)|^p\,dt < \infty$. Clearly $x_n(t) \to 1/\sqrt{t}$ pointwise. Now $1/\sqrt{t} \in L_1[0,1]$, but $1/\sqrt{t} \notin L_2[0,1]$. So $x_n$ converges in the space $L_1[0,1]$, but not in $L_2[0,1]$.

Example 55. If f(t) is integrable on (0, 1), we can form its Fourier series

$$a_0 + \sum_{n=1}^{\infty} (a_n \cos 2\pi n t + b_n \sin 2\pi n t)$$

where

$$a_0 = \int_0^1 f(t)\,dt, \quad a_n = 2\int_0^1 f(t) \cos 2\pi n t\,dt, \quad b_n = 2\int_0^1 f(t) \sin 2\pi n t\,dt$$

Now, in general, the Fourier series of f does not converge to f pointwise, even if f is continuous. However, the Fourier series always converges in $L_2[0,1]$, i.e.

$$\int_0^1 \Big|a_0 + \sum_{n=1}^{k} (a_n \cos 2\pi n t + b_n \sin 2\pi n t) - f(t)\Big|^2 dt \to 0 \text{ as } k \to \infty$$

$\forall f \in L_2[0,1]$.

This type of convergence we have been discussing is often called:

Definition 55. Convergence in norm (or strong convergence)

$x_n \to x$ in X means $\|x - x_n\| \to 0$ as $n \to \infty$.

Sometimes a sequence or series does not converge in this sense, but it does tend to something in a weaker sense, and we find the following concept useful.

Definition 56. $x_n \to x$ weakly in X if $f(x_n) \to f(x)$ as $n \to \infty$ for all $f \in X^*$.

Convergence in norm implies weak convergence, because if $x_n \to x$, then $f(x_n) \to f(x)$ for all continuous functionals, i.e. for all $f \in X^*$.

Example 57. $\ell_p^n$. Consider a sequence $\{x^k\}$, where $x^k = \{x_1^k, \dots, x_n^k\}$. What does weak convergence mean?

A functional on $\ell_p^n$ is $f_i(x) = x_i$, corresponding to $f_i = (0, \dots, 1, 0, \dots)$. So

$$f_i(x^k) = x_i^k \to x_i \text{ as } k \to \infty$$

i.e. each component of $x^k$ tends to the corresponding component of x. But

$$\|x^k - x\|_p = \Big(\sum_{i=1}^n |x_i^k - x_i|^p\Big)^{1/p} \to 0 \text{ as } k \to \infty, \text{ since n is finite,}$$

i.e. in this case, $x^k \to x$ in norm also.

This is a particular property of finite-dimensional spaces: weak convergence and strong convergence are equivalent. However, for infinite-dimensional spaces, this is certainly not the case.

Example 58. Consider $\ell_p$ ($1 < p < \infty$) and the sequence $y_k = (0, \dots, 1, 0, \dots)$ with a 1 in the k-th position.

Then for $f \in \ell_p^* = \ell_q$, $f(y_k) = f_k \to 0$ as $k \to \infty$, since

$$\sum_{k=1}^{\infty} |f_k|^q < \infty$$

So $y_k$ converges weakly to the zero element. But

$$\|y_k - 0\|_p = 1 \not\to 0 \text{ as } k \to \infty.$$
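The contrast in Example 58 can be made concrete for one particular functional. The sketch below fixes $f = (1, 1/2, 1/3, \dots) \in \ell_2$ (an illustrative choice; weak convergence requires the same behaviour for every $f \in \ell_q$) and represents the finitely supported sequences $y_k$ as dictionaries from index to value.

```python
def f_component(j):
    """j-th entry (1-based) of a fixed f in l_2; here f_j = 1/j, which is square-summable."""
    return 1.0 / j


def apply_f(y):
    """Evaluate f(y) = sum_j f_j y_j for a finitely supported y given as {index: value}."""
    return sum(f_component(j) * v for j, v in y.items())


def norm_p(y, p=2.0):
    """l_p norm of a finitely supported sequence."""
    return sum(abs(v) ** p for v in y.values()) ** (1.0 / p)


def unit_vector(k):
    """The sequence y_k with a 1 in the k-th position and zeros elsewhere."""
    return {k: 1.0}


ks = (1, 10, 100, 1000)
values = [apply_f(unit_vector(k)) for k in ks]   # f(y_k) = f_k, tending to 0
norms = [norm_p(unit_vector(k)) for k in ks]     # ||y_k|| stays equal to 1
```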

Example 59. $\mathscr{C}[a,b]$ with the sup norm.

Here convergence in norm is exactly uniform convergence:

$$\|x_n - x\| \to 0 \text{ as } n \to \infty \text{ iff } \max_{a \le t \le b} |x_n(t) - x(t)| \to 0 \text{ as } n \to \infty.$$

Weak convergence of $x_n$ to x is equivalent to

$x_n(t) \to x(t)$ for each t and

$|x_n(t)| \le k$ uniformly in n and t.

Weak convergence in a Hilbert space becomes a particularly simple idea. We recall that $x_n \to x$ weakly in H iff $f(x_n) \to f(x)\ \ \forall f \in H^*$

i.e. iff $\langle x_n, y \rangle \to \langle x, y \rangle\ \ \forall y \in H$

i.e. iff $\langle x_n - x, y \rangle \to 0$ as $n \to \infty\ \ \forall y \in H$

Example 60. In $L_2[0,1]$, $x_n(t) \to x(t)$ weakly means that

$$\int_0^1 (x_n(t) - x(t))\,y(t)\,dt \to 0 \text{ as } n \to \infty \quad \forall y \in L_2[0,1].$$

There is yet another type of convergence in X: weak* convergence.

We need to identify X as the dual of some space, say X = Y*.

Definition 57. A sequence $\{y_n^*\}$ in Y* is weak* convergent to y* if

$$y_n^*(y) \to y^*(y) \quad \forall y \in Y.$$

We recall that $Y \subset Y^{**}$ and so elements of Y define linear functionals on Y*, and so weak* convergence is like weak convergence, except you only use a subset of all possible linear functionals. Of course, if Y is reflexive, $Y = Y^{**}$, then weak* convergence on Y* is just weak convergence on Y*. (This is true for Hilbert spaces.)

Just as it is often useful to use weaker types of convergence in a space, it is useful to have weak concepts of compactness. For example, in optimization problems one often seeks to maximize a linear functional over some set. A fundamental question is whether or not the given functional attains its maximum on the set, and the main result is that a continuous functional on a compact set K of a normed linear space X achieves its maximum on K. For finite-dimensional spaces K is compact iff it is closed and bounded, but unfortunately this is not true for infinite-dimensional spaces. In fact the unit ball $\{x \in X : \|x\| \le 1\}$ is compact iff X is finite dimensional. However, one can prove that the unit ball of a dual space X = Y* is "weak* compact". We recall that strong convergence or convergence in norm refers to the strong topology or norm topology of X, i.e. X considered as a metric space under the metric induced by the norm. Similarly, weak convergence induces the weak topology on X and weak* convergence the weak* topology, and one has two new concepts of compactness.

Definition 58. A set A in X is weakly compact if for all sequences $\{x_n\} \subset A$ there is a weakly convergent subsequence with limit point in A.

Definition 59. A set A in X = Y* is weak* compact if for all sequences $\{x_n\} \subset A$ there is a weak* convergent subsequence with limit point in A.

Weak compactness and weak* compactness are equivalent for reflexive spaces.

6. LINEAR OPERATORS

We are often concerned with transformations between spaces. A very large class is the class of those transformations between linear spaces which preserve the algebraic structure.

Definition 60. A linear transformation T of a linear space X to a linear space Y is a map such that $T(\alpha x + \beta y) = \alpha Tx + \beta Ty$ for all $x, y \in X$ and for all scalars α, β.

Example 61. Consider the spring-mass system

[Figure: spring-mass system.]

If we assume that the friction between the mass and the surface is -b(dx/dt), the applied force is f(t) and the combined restoring force of the springs is -kx, then if initially x = 0, dx/dt = 0, we have the usual equation of motion

$$f = m\frac{d^2x}{dt^2} + b\frac{dx}{dt} + kx \qquad \text{(i)}$$

with solution

$$x(t) = \int_0^t h(t - s)\,f(s)\,ds \qquad \text{(ii)}$$

where

$$h(r) = \frac{1}{m(\lambda_1 - \lambda_2)}\big(e^{\lambda_1 r} - e^{\lambda_2 r}\big)$$

and $\lambda_1$, $\lambda_2$ are the roots of $m\lambda^2 + b\lambda + k = 0$. Now if $f(\cdot) \in \mathscr{C}[0, \infty]$, i.e. if it is a real continuous function, then so is x and so (ii) may be written x = Lf, where L is a linear transformation from $\mathscr{C}[0, \infty]$ to itself. Strictly speaking, L is an operator since X = Y. Similarly, (i) may be written f = Tx, where T again is a linear operator on $\mathscr{C}[0, \infty]$. L is an example of an integral operator and T a differential operator. Actually, since (i) and (ii) are equivalent, $L = T^{-1}$, i.e. L is the inverse of T.
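The integral operator L of Example 61 can be evaluated numerically and compared with a case that can be solved by hand. The sketch below assumes distinct real roots ($b^2 > 4mk$), uses the trapezoidal rule for the convolution integral, and takes m = 1, b = 3, k = 2 with the constant force f = 1 purely as an illustration; for these values the roots are -1 and -2 and formula (ii) gives $x(t) = \tfrac{1}{2} - e^{-t} + \tfrac{1}{2}e^{-2t}$.

```python
import math


def impulse_response(m, b, k):
    """Return h(r) = (exp(l1*r) - exp(l2*r)) / (m*(l1 - l2)) for distinct real roots l1, l2."""
    disc = b * b - 4 * m * k
    if disc <= 0:
        raise ValueError("this sketch assumes distinct real roots (b^2 > 4mk)")
    l1 = (-b + math.sqrt(disc)) / (2 * m)
    l2 = (-b - math.sqrt(disc)) / (2 * m)
    return lambda r: (math.exp(l1 * r) - math.exp(l2 * r)) / (m * (l1 - l2))


def apply_L(h, f, t, n=2000):
    """Approximate (Lf)(t) = integral over [0, t] of h(t - s) f(s) ds by the trapezoidal rule."""
    if t == 0:
        return 0.0
    ds = t / n
    total = 0.5 * (h(t) * f(0.0) + h(0.0) * f(t))      # end points s = 0 and s = t
    total += sum(h(t - i * ds) * f(i * ds) for i in range(1, n))
    return ds * total


h = impulse_response(m=1.0, b=3.0, k=2.0)       # roots -1 and -2
x_at_1 = apply_L(h, lambda s: 1.0, 1.0)         # response at t = 1 to the force f = 1
exact = 0.5 - math.exp(-1.0) + 0.5 * math.exp(-2.0)
doubled = apply_L(h, lambda s: 2.0, 1.0)        # linearity: L(2f) = 2 L(f)
```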

Definition 61. A map $F : X \to Y$ is invertible if there is a map $G : Y \to X$ such that GF and FG are the identity maps. G is called the inverse of F.

We know that the necessary and sufficient condition for F to be invertible is that F is 1 to 1 and FX = Y.

For linear maps, the 1-to-1 condition reduces to the condition that Fx = 0 only for x = 0. From our Example 61 we see that the problem of solving differential equations is one of finding inverses of certain transformations.


So far we have only considered algebraic properties of transformations, i.e. the preservation of the linear vector space structure. If X and Y are normed linear spaces, we may ask if a transformation preserves topological properties, for example continuity.

Definition 62. A transformation $T : X \to Y$ is continuous at $x_0 \in X$ if for every ε > 0 there exists a δ > 0 such that

$$\|x - x_0\| < \delta \text{ implies that } \|Tx - Tx_0\| < \varepsilon$$

(Note that this definition applies whether T is linear or not.)

An alternative form of the above definition is

Definition 62a. $T : X \to Y$, a transformation between two normed spaces, is continuous if $\{x_n\}$ being a Cauchy sequence in X implies that $\{Tx_n\}$ is Cauchy in Y.

Definition 63. $T : X \to Y$ is said to be bounded if $\|Tx\|_Y \le k\,\|x\|_X$ for some constant k > 0 and for all $x \in X$.

If T is bounded and linear then $\sup\{\|Tx\|_Y : \|x\|_X = 1\}$ exists and is finite. We define this to be the norm of T, $\|T\|$. It satisfies all the properties of a norm, and we note the useful property:

$$\|Tx\|_Y \le \|T\|\,\|x\|_X \text{ for all } x \in X.$$

For linear transformations you can show that the properties of continuity and boundedness are equivalent, and so we define

Definition 64. Let X, Y be normed linear spaces and $\mathscr{L}(X, Y) = \{T : X \to Y \text{ where T is bounded and linear}\}$. Under the norm defined above, $\mathscr{L}(X, Y)$ is a normed linear space.

For the special case where X = Y, we write $\mathscr{L}(X)$ for the space of bounded linear operators on X. $\mathscr{L}(X)$ is not only closed under addition and scalar multiplication, but it is closed under operator multiplication or composition, i.e. $\mathscr{L}(X)$ is an algebra of operators.

For if $T_i \in \mathscr{L}(X)$, i.e. $T_i : X \to X$ and is linear, it is readily seen that $T_1 T_2 : X \to X$ and is also linear. The boundedness follows since

$$\|T_1 T_2 x\| \le \|T_1\|\,\|T_2 x\| \le \|T_1\|\,\|T_2\|\,\|x\|$$

This also yields the important result $\|T_1 T_2\| \le \|T_1\|\,\|T_2\|$.

Of course not all linear operators are bounded; for example in Ex.61, L is bounded, whereas T is unbounded.

Example 62. Let $X = L_1[a,b]$, $Y = \mathscr{C}[a,b]$ and $T : X \to Y$ be given by

$$Tx(s) = \int_a^b k(s,t)\,x(t)\,dt$$

where $k(\cdot,\cdot) : [a,b] \times [a,b] \to \mathbb{R}$ is continuous in s and t. Then T is an integral operator, clearly linear, and

$$|Tx(s)| \le \max_{a \le s,t \le b} |k(s,t)| \int_a^b |x(t)|\,dt$$

so T is bounded.

Example 63. $X = L_2[0,1]$ and $T : X \to X$ is given by Tf = df/dt.

T is linear, but cannot be bounded, as it is not even defined on all of X, but only on the subspace $\mathscr{D} = \{f \in X : f'(\cdot) \in X\}$, called the domain of T. T is actually an example of a closed operator, which we shall define in the next section.

If, however, we take $Y = \mathscr{C}^1[0,1]$, the space of all continuous functions with continuous first derivatives (with its usual norm $\max|f| + \max|f'|$), then T is defined on Y and $T : Y \to X$ is bounded. This emphasizes the dependence of continuity and boundedness on the particular norm you are considering.

An interesting property of continuous linear transformations is used implicitly in the principle of superposition, a handy tool in differential equations. This depends on the mathematical theorem:

$L : X \to Y$ is a continuous linear transformation iff

$$L\Big(\sum_{n=1}^{\infty} \alpha_n x_n\Big) = \sum_{n=1}^{\infty} \alpha_n L x_n$$

for every convergent series $\sum \alpha_n x_n$ in X. For example, a convergent series may be integrated term by term:

$$\int_0^t \sum_{n=1}^{\infty} \frac{1}{n} \sin ns\,ds = \sum_{n=1}^{\infty} \frac{1}{n} \int_0^t \sin ns\,ds = -\sum_{n=1}^{\infty} \frac{1}{n^2} (\cos nt - 1)$$

Again we emphasize that the continuity of L depends on the norms you choose for X and Y.

Here we shall primarily be concerned with operators on Hilbert spaces as this case is similar to the theory of matrices.

Definition 65. Let $K \in \mathscr{L}(H)$, then the adjoint K* is defined by

$$\langle Kx, y \rangle = \langle x, K^*y \rangle \quad \forall x, y \in H$$

That K* always exists is easily seen by considering $\langle Kx, y \rangle = f(x)$; f is a linear functional on H and

$$|f(x)| \le \|Kx\|\,\|y\| \le \|K\|\,\|x\|\,\|y\| \le \text{const}\,\|x\| \text{ for fixed } y$$

So $f \in H^*$ and so there is a $y^* \in H$ such that $f(x) = \langle x, y^* \rangle$, i.e. K induces a map $y \to y^*\ \ \forall y \in H$ and we call this $y^* = K^*y$. The adjoint has the following easily verifiable properties:

1. I* = I; 0* = 0

2. (S + K)* = S* + K*

3. $(\alpha T)^* = \bar{\alpha} T^*$

4. (ST)* = T*S*

5. $\|T^*\| = \|T\|$

We shall just prove the last property:

$$\|T^*y\|^2 = \langle T^*y, T^*y \rangle = \langle TT^*y, y \rangle \le \|TT^*y\|\,\|y\| \le \|T\|\,\|T^*y\|\,\|y\|$$

so that

$$\|T^*y\| \le \|T\|\,\|y\|$$

i.e. $\|T^*\| \le \|T\|$ and similarly $\|T\| \le \|T^*\|$.

Example 65. $H = \ell_2^n$. Then $A \in \mathscr{L}(H)$ is representable by a matrix, and A* is its transpose for the real scalars and the conjugate transpose for the complex case.

Example 66. $H = L_2[a,b]$ and $Tx(\cdot) = \int_a^b k(\cdot, s)\,x(s)\,ds$

$$\langle Tx, y \rangle = \int_a^b \int_a^b k(t,s)\,x(s)\,ds\ y(t)\,dt = \int_a^b x(s) \int_a^b k(t,s)\,y(t)\,dt\,ds$$

and so

$$T^*y(\cdot) = \int_a^b k(t, \cdot)\,y(t)\,dt$$
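The defining relation $\langle Ax, y \rangle = \langle x, A^*y \rangle$ is easy to test in the finite-dimensional case of Example 65. The sketch below works in $\mathbb{C}^2$ with the inner product linear in the first argument, and takes A* to be the conjugate transpose; the particular matrix and vectors are arbitrary illustrations. The kernel case of Example 66 is analogous, with the transposed kernel k(t, ·) playing the role of the transposed matrix.

```python
def inner(u, v):
    """Inner product on C^n, linear in the first argument: sum of u_i * conj(v_i)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))


def mat_vec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def adjoint(A):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[A[i][j].conjugate() for i in range(len(A))] for j in range(len(A[0]))]


A = [[1 + 2j, 3 + 0j], [0 - 1j, 4 + 1j]]
x = [1 + 1j, 2 - 1j]
y = [0 + 3j, 1 + 0j]

lhs = inner(mat_vec(A, x), y)            # <Ax, y>
rhs = inner(x, mat_vec(adjoint(A), y))   # <x, A*y>
```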

Example 69. Let $H = L_2([0,T]; \mathbb{R}^m)$, i.e. the space of functions $u : [0,T] \to \mathbb{R}^m$ with inner product

$$\langle u, v \rangle = \int_0^T u'(s)\,v(s)\,ds$$

(u(s) is a column vector and the prime denotes the row vector.)

If B(t) is a positive symmetric m×m matrix, then we define the operator B by

$$(Bu)(t) = B(t)\,u(t) \quad \forall t \in [0,T]$$

Verify that B* = B.

Note also that

$$\langle Bu, u \rangle = \int_0^T (B(t)u(t))'\,u(t)\,dt = \int_0^T u(t)'\,B(t)\,u(t)\,dt \ge 0$$

since B(t) is a positive matrix.

This leads us to two further definitions:

Definition 66. $A \in \mathscr{L}(H)$ is self-adjoint if A* = A. (A self-adjoint operator has the property that $\langle Ax, x \rangle$ is always real.)

Definition 67. A positive operator $A \in \mathscr{L}(H)$ is a self-adjoint operator A such that $\langle Ax, x \rangle \ge 0\ \ \forall x \in H$.

It is called strictly positive if $\langle Ax, x \rangle = 0$ only if x = 0. Of course the self-adjoint operators are generalizations of real symmetric matrices or Hermitian matrices, and positive operators correspond to the positive semidefinite matrices with the property $x'Ax \ge 0$.

Self-adjoint and positive operators occur frequently in applications and they have several special properties which we shall list here as they are important in applications.

1. $\|A\| = \sup_{\|x\| = 1} |\langle Ax, x \rangle|$

or equivalently the smallest M such that $|\langle Ax, x \rangle| \le M\|x\|^2$

2. A finer inequality is $m\|x\|^2 \le \langle Ax, x \rangle \le M\|x\|^2$

If A is positive then m ≥ 0, and if A is strictly positive, then m > 0.

In the strictly positive case, the following also holds:

$$\frac{1}{M}\|x\|^2 \le \langle A^{-1}x, x \rangle \le \frac{1}{m}\|x\|^2 \quad \forall x \in H$$

i.e. $A^{-1}$ exists and is strictly positive.

3. Every strictly positive operator A has a unique strictly positive square root $A^{1/2}$.

4. An orthogonal projection is self-adjoint, since

$$\langle Px, x \rangle = \langle x_M, x_M + x_{M^\perp} \rangle = \langle x_M, x_M \rangle$$

and

$$\langle x, Px \rangle = \langle x_M + x_{M^\perp}, x_M \rangle = \langle x_M, x_M \rangle$$

Example 70. Consider the same space as Ex.69 (often denoted by $L_2([0,T]; \mathbb{R}^m)$) and let $W(t, \tau)$ be an r×m matrix-valued function on [0,T]×[0,T].

Then the operator $\mathscr{W}$,

$$(\mathscr{W}u)(t) = \int_0^t W(t, \tau)\,u(\tau)\,d\tau$$

maps

$$H \to Y = L_2([0,T]; \mathbb{R}^r).$$

$$\langle \mathscr{W}u, v \rangle_Y = \int_0^T \Big(\int_0^t W(t, \tau)\,u(\tau)\,d\tau\Big)' v(t)\,dt$$

$$= \int_0^T \int_0^t u(\tau)'\,W(t, \tau)'\,v(t)\,d\tau\,dt$$

$$= \int_0^T u(\tau)' \Big(\int_\tau^T W(t, \tau)'\,v(t)\,dt\Big)\,d\tau$$

changing the order of integration

$$= \langle u, \mathscr{W}^*v \rangle$$

where

$$(\mathscr{W}^*v)(t) = \int_t^T W(\tau, t)'\,v(\tau)\,d\tau$$

We should note that although for simplicity we have limited the definitions of adjoint, self-adjointness and positivity to bounded linear operators, all these definitions can be extended to closed linear operators on a Banach space.

7. SPECTRAL THEORY

Most of the problems of linear algebraic equations, ordinary differential equations, integral equations and partial differential equations can be formulated as linear operator problems.

Example 67. The Fredholm integral equation is

$$\lambda f(t) - \int_a^b k(t,s)\,f(s)\,ds = g(t)$$

to be solved for the unknown function f(t), where $k(\cdot,\cdot) : [a,b] \times [a,b] \to \mathbb{R}$ and $\int_a^b \int_a^b |k(t,s)|^2\,dt\,ds < \infty$. If we let $X = L_2[a,b]$ and define $K : X \to X$ by

$$Kf(t) = \int_a^b k(t,s)\,f(s)\,ds$$

then this is really an operator problem $(\lambda I - K)f = g$, where I is the identity operator, and $f = (\lambda I - K)^{-1}g$ if $(\lambda I - K)^{-1}$ exists.

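Example 68.

$$\frac{\partial^2 u}{\partial x^2} = \frac{\partial u}{\partial t}, \qquad u(0,t) = 0 = u(1,t), \qquad u(x,0) = u_0(x)$$

This is the familiar heat equation. If we let $X = L_2[0,1]$, then we can define

$$A : X \to X \ \text{ by } \ Af = \frac{\partial^2 f}{\partial x^2} \ \text{ for } f \in \mathscr{D}(A) = \Big\{ g : g, \frac{\partial^2 g}{\partial x^2} \in L_2[0,1] \Big\}$$

A is a closed linear operator on X with domain $\mathscr{D}(A)$ and the heat equation may be written

$$\frac{du}{dt} = Au(t)$$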

In Ex. 67, K is a bounded linear operator, but in Ex. 68, A is linear but unbounded. In fact most differential operators are unbounded, although fortunately they do form a class with nice properties: the class of closed operators.

Definition 68. A linear operator $T : X \to X$ is closed if for every sequence $\{x_n\}$ in the domain of T, $\mathscr{D}(T)$, with $x_n \to x$ and $Tx_n \to y$, we have $x \in \mathscr{D}(T)$ and $y = Tx$. Another way of saying this is to consider the product space $\mathscr{D}(T) \times R(T)$ and the graph of T, the set of its elements of the form $(x, Tx)$. Then T is closed if this graph is closed in $X \times X$. ($R(T)$ is the range of T.)

A familiar example of a closed operator which occurs in physical problems is the Laplacian.

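Example 69. Let $\Omega$ be the unit disc in $\mathbb{R}^2$, $\Omega = \{(x,y) : x^2 + y^2 \le 1\}$, and consider the Laplace operator

$$\Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} \quad \text{on } L_2(\Omega),$$

the space of Lebesgue integrable functions of two variables with finite norm

$$\|u\| = \Big(\iint_\Omega u(x,y)^2\,dx\,dy\Big)^{1/2}$$

Let

$$\mathscr{D}(\Delta) = \{ u \in L_2(\Omega) : u \text{ is … and } \Delta u \in L_2(\Omega);\ u = 0 \text{ on } \partial\Omega \}$$

($\partial\Omega$ is the boundary of the disc).

Then $\Delta$ is a closed operator on $L_2(\Omega)$ with the domain $\mathscr{D}(\Delta)$. The classical partial differential equation $\Delta u = f$; $u = 0$ on $\partial\Omega$ has the solution

$$u(x,y) = \iint_\Omega G(x,y;\xi,\eta)\,f(\xi,\eta)\,d\xi\,d\eta$$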

where the Green's function is given by


where the CTj are the distances shown in the following figure:

[Figure: the distances $\sigma_j$ used in the Green's function for the unit disc.]

i.e. $\Delta$ has an inverse and $u = \Delta^{-1}f$. By inspection, $\Delta^{-1}$ is an integral operator and is linear and bounded on $L_2(\Omega)$. In a similar manner, other differential equations can be rephrased as problems of finding the inverse of a closed operator. Note that for a true solution to the problem, we need the inverse to be bounded. For example, if $\Delta u = f$ is to have a solution for all $f \in L_2(\Omega)$, we need $\Delta^{-1}$ linear and bounded, i.e. defined on all of $L_2(\Omega)$. Much of functional analysis is concerned with finding inverses of operators, but for now we shall just state the important result concerning linear bounded operators:

If X, Y are B-spaces, $L \in \mathscr{L}(X,Y)$ and $L^{-1}$ exists in an algebraic sense, then it is bounded and linear.

Now in finite dimensions the solution of linear equations depends on the eigenvalues of the matrices and, although the alternative theorem is probably very familiar to you, we shall formulate the results in a Hilbert space context, under the heading

"Spectral theory in finite dimensions"

Let $X = \mathbb{R}^n$. Then $T \in \mathscr{L}(\mathbb{R}^n)$ has a matrix representation, which is not unique as it depends on the basis used for $\mathbb{R}^n$. However, each T is uniquely represented by an equivalence class of similar matrices (A and B are similar matrices if there exists a non-singular C such that $A = C^{-1}BC$). Similar matrices have the same characteristic equations, and hence the same eigenvalues.

So for each $T \in \mathscr{L}(\mathbb{R}^n)$, we can define

1. The spectrum of T, $\sigma(T) = \{\lambda_i\}$, the set of eigenvalues of T.

2. The eigenvectors of T corresponding to $\lambda_i$, i.e. the $x_i$ with $(T - \lambda_i I)x_i = 0$.

3. The eigenspace of T corresponding to $\lambda_i$, $M_i = \mathrm{Sp}\{x : Tx = \lambda_i x\}$.

4. The projection operator $P_i$ corresponding to $M_i$.

Using these definitions, we can decompose T (when T is symmetric, i.e. self-adjoint) as the sum

$$T = \sum_{i=1}^m \lambda_i P_i$$

where the $\lambda_i$ are the distinct eigenvalues and the $P_i$ are their projection operators. It can be shown that the $P_i$ are pairwise orthogonal, i.e. $P_iP_j = 0$, $i \ne j$. (You may remember this as the fact that eigenvectors of distinct eigenvalues are perpendicular.) So if we take powers of T, we have

$$T^k = \sum_{i=1}^m \lambda_i^k P_i$$

an easy calculation. Similarly, if f(t) is a polynomial in t, then

$$f(T) = \sum_{i=1}^m f(\lambda_i)\,P_i$$
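As a small check of the decomposition above, the following sketch verifies it on a 2×2 symmetric matrix. The matrix, its eigenvalues and its projections are illustrative choices made here, not taken from the lectures, and the helper names `matmul` and `combine` are hypothetical.

```python
# Worked 2x2 instance of T = sum_i lambda_i P_i for a symmetric matrix.
# The matrix T and the helper names are illustrative assumptions.

def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(coeffs, mats):
    """Linear combination sum_i coeffs[i] * mats[i]."""
    n = len(mats[0])
    return [[sum(c * M[i][j] for c, M in zip(coeffs, mats)) for j in range(n)]
            for i in range(n)]

T = [[2.0, 1.0], [1.0, 2.0]]          # symmetric, eigenvalues 3 and 1
eigenvalues = [3.0, 1.0]
P1 = [[0.5, 0.5], [0.5, 0.5]]         # projection onto span{(1, 1)}
P2 = [[0.5, -0.5], [-0.5, 0.5]]       # projection onto span{(1, -1)}

decomposition = combine(eigenvalues, [P1, P2])       # lambda_1 P1 + lambda_2 P2
T_cubed = matmul(T, matmul(T, T))                     # T^3 computed directly
T_cubed_spectral = combine([lam ** 3 for lam in eigenvalues], [P1, P2])
```

Here `decomposition` reproduces `T`, the product `matmul(P1, P2)` is the zero matrix, and the two ways of computing the cube agree, which is the "easy calculation" for powers.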

The study of eigenvalues in matrices has particular significance for the algebraic linear equation

$$y = (\lambda I - T)x \quad \text{on } \mathbb{R}^n$$

The 'alternative' theorem states that it has a unique solution

$$x = (\lambda I - T)^{-1}y$$

(i.e. $(\lambda I - T)^{-1}$ exists as a matrix) provided $\lambda$ is not an eigenvalue, i.e. provided $\lambda \notin \sigma(T)$.

This property, that the existence of a solution of $y = (\lambda I - T)x$ depends on the spectrum of T, generalizes to the infinite-dimensional case.

Definition 69. Let T be a linear operator on a Banach space X, then

1. The resolvent set of T, $\rho(T) = \{\lambda \in \mathbb{C} : (\lambda I - T)^{-1} \in \mathscr{L}(X)\}$

2. The spectrum of T, $\sigma(T) = \mathbb{C} - \rho(T)$

That is, for $\lambda \in \rho(T)$, the linear equation $y = (\lambda I - T)x$ has a unique solution $x = (\lambda I - T)^{-1}y$ for all $y \in X$. Note that $(\lambda I - T)x = y$ may now be an integral equation or a differential equation (see Exs 67 and 68).

An important subset of $\sigma(T)$ is the point spectrum:

$$P\sigma(T) = \{\lambda : (\lambda I - T) \text{ is not one to one}\}$$

In the finite-dimensional case, $\sigma(T) = P\sigma(T)$ and is a finite set, but in infinite dimensions, things are slightly more complicated. Certain operators, however, have similar properties to finite-dimensional operators (i.e. matrices). These are compact operators, a subclass of linear bounded operators, which have the special property that their spectrum (apart possibly from the point 0) is a point spectrum and has countably many elements,

i.e. $\sigma(T) = P\sigma(T) = \{\lambda_i;\ i = 1, 2, \dots\}$

This means then that $(\lambda I - T)x = y$ has a unique solution except for countably many $\lambda_i$, the $\lambda_i \in P\sigma(T)$.

Definition 70. A compact operator T is an operator in $\mathscr{L}(X)$ which maps every bounded set A of X into a precompact set TA (i.e. TA has compact closure $\overline{TA}$).

An equivalent definition is: a linear operator T such that, for any bounded sequence $\{x_n\}$, $\{Tx_n\}$ has a convergent subsequence. All finite-dimensional operators are compact, as are bounded linear operators with finite-dimensional range, since in a finite-dimensional space every closed bounded set is compact. Another example of compact operators are integral operators; K of Ex. 67 is compact. Unfortunately, not all linear operators are compact and $\sigma(T)$ may be very complicated.

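Example 70. $X = \mathscr{C}[a,b]$ and $T : x(t) \mapsto \mu(t)\,x(t)$, where $\mu(t)$ is continuous. Consider

$$(T - \lambda I)\,x(t) = y(t)$$

$$(\mu(t) - \lambda)\,x(t) = y(t)$$

$$x(t) = \frac{y(t)}{\mu(t) - \lambda}, \quad \text{provided } \mu(t) - \lambda \ne 0$$

and

$$\sigma(T) = \{\lambda : \mu(t) - \lambda = 0 \text{ for some } a \le t \le b\}$$

This is an example of a continuous spectrum.

Example 71. $X = \ell_2$, $Tx = \{x_1, \tfrac{x_2}{2}, \dots, \tfrac{x_n}{n}, \dots\}$. Consider

$$(\lambda I - T)x = y$$

Now

$$\Big(\lambda - \frac{1}{n}\Big)x_n = y_n, \qquad n = 1, 2, \dots$$

So $(\lambda I - T)$ is not 1-1 if $\lambda = 1/n$; $n = 1, 2, \dots$

We can show that $0 \in \sigma(T)$ also.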

Example 72. Consider the bounded sequence $\cos x, \cos 2x, \cos 3x, \dots$ and the integral map $T : Tx(t) = \int_0^t x(s)\,ds$, producing the sequence $\sin x, \tfrac{1}{2}\sin 2x, \dots, \tfrac{1}{n}\sin nx, \dots$ Then this derived sequence is convergent, because T is a compact operator on $L_2(0,1)$.

Since a knowledge of the spectrum tells us when we can hope for a solution of $(\lambda I - T)x = y$, it is an important study in analysis. One very useful result for self-adjoint compact operators is that $\sigma(T) \subset [-\|T\|, \|T\|]$, i.e. for $|\lambda| > \|T\|$, $(\lambda I - T)^{-1} \in \mathscr{L}(X)$ or, equivalently, $(\lambda I - T)x = y$ has the unique solution $x = (\lambda I - T)^{-1}y$. The proof is easy and an instructive exercise. First we consider the $\lambda = 1$ case: $(I - T)x = y$, $\|T\| < 1$, and let

$$B_k = \sum_{i=0}^k T^i$$

$\{B_k\}$ is Cauchy since

$$\|B_k - B_n\| \le \sum_{i=k+1}^n \|T\|^i \to 0 \quad \text{as } n, k \to \infty$$

because $\|T\| < 1$. $\mathscr{L}(X)$ is complete and so $B = \sum_{i=0}^\infty T^i \in \mathscr{L}(X)$. But

$$B(I - T) = \sum_{i=0}^\infty T^i - \sum_{i=1}^\infty T^i = I$$

$$\therefore\ B = (I - T)^{-1} \in \mathscr{L}(X)$$

For $\lambda \ne 1$ we just consider $(\lambda I - T)^{-1} = \tfrac{1}{\lambda}\big(I - \tfrac{1}{\lambda}T\big)^{-1}$, and $\|\tfrac{1}{\lambda}T\| < 1$ is equivalent to $\|T\| < |\lambda|$.
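The series argument above can be tried out numerically in finite dimensions. The sketch below sums the partial sums $B_k y$ for a small matrix with norm less than one; the matrix, the right-hand side and the function names are assumptions chosen only for illustration.

```python
# Partial sums of the series sum_i T^i y as an approximation to (I - T)^{-1} y.
# The 2x2 matrix T and the vector y are illustrative assumptions.

def mat_vec(T, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(t * xj for t, xj in zip(row, x)) for row in T]

def neumann_solve(T, y, terms=200):
    """Return B_k y = sum_{i=0}^{k} T^i y with k = terms."""
    x = list(y)            # the i = 0 term
    term = list(y)
    for _ in range(terms):
        term = mat_vec(T, term)                  # next power T^i y
        x = [a + b for a, b in zip(x, term)]
    return x

T = [[0.2, 0.1], [0.0, 0.3]]   # operator norm below 1, so the series converges
y = [1.0, 2.0]
x = neumann_solve(T, y)

# Residual of (I - T)x = y; it should be at rounding level.
residual = [xi - ti - yi for xi, ti, yi in zip(x, mat_vec(T, x), y)]
```

For $\lambda \ne 1$ the same routine can be applied to $T/\lambda$ and $y/\lambda$, mirroring the last step of the proof.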

For certain classes of linear operators you can obtain a spectral decomposition for T. For example, if T is a compact, self-adjoint operator on a Hilbert space H, then

$$Tx = \sum_{k=1}^\infty \lambda_k \langle x, e_k\rangle\, e_k$$

where $\lambda_k \in \sigma(T)$ and the $e_k$ are the corresponding eigenvectors. This looks very similar to the matrix case; however, for more general T, the best we can hope for is an integral decomposition of the type $Tx = \int \lambda\,dE(\lambda)x$.

Application to Fredholm integral equations

Consider

$$\lambda f(t) - \int_a^b k(t,s)\,f(s)\,ds = g(t)$$

the so-called Fredholm integral equation. Let K, X be as in Ex. 67 and write it as

$$(\lambda I - K)f = g$$

It can be shown that if k(t,s) is integrable on $[a,b] \times [a,b]$ and

$$\int_a^b\!\!\int_a^b |k(t,s)|^2\,dt\,ds < \infty$$

then $K \in \mathscr{L}(X)$ and is compact. Now

$$\|K\| = \sup\Big\{ \Big(\int_a^b \Big|\int_a^b k(t,s)\,f(s)\,ds\Big|^2 dt\Big)^{1/2} : \int_a^b |f(s)|^2\,ds = 1 \Big\}$$

We know that if $\|K\| < |\lambda|$, then the equation has the unique solution

$$f = (\lambda I - K)^{-1}g = \frac{1}{\lambda}\sum_{i=0}^\infty \Big(\frac{K}{\lambda}\Big)^i g$$

One can verify that

$$K^i g(t) = \int_a^b k_i(t,s)\,g(s)\,ds$$

where

$$k_i(t,s) = \int_a^b k(t,\tau)\,k_{i-1}(\tau,s)\,d\tau, \qquad k_1 = k$$

which gives an iterative method for finding the solution. As the series solution for f converges, a finite approximation will suffice.

If $\|K\| > |\lambda|$, we know that, except for countably many $\lambda_i$, the equation again has a unique solution. There are procedures for getting approximate solutions based on approximations of the kernel function k(t,s), but we shall only work out the case $k(t,s) = k(s,t)$, i.e. K is self-adjoint. In this case, K has the decomposition

$$Kf = \sum_{k=1}^\infty \lambda_k \langle f, x_k\rangle\, x_k$$

where the $\lambda_k$ are its eigenvalues and the $x_k$ the corresponding eigenvectors.

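We can now rewrite $\lambda f - Kf = g$ as

$$\lambda f - \sum_{i=1}^\infty \lambda_i \langle f, x_i\rangle\, x_i = g$$

Taking the inner product with $x_j$,

$$\lambda\langle f, x_j\rangle - \lambda_j\langle f, x_j\rangle = \langle g, x_j\rangle, \quad \text{since } \langle x_i, x_j\rangle = \delta_{ij}$$

$$\therefore\ \langle f, x_j\rangle = \frac{\langle g, x_j\rangle}{\lambda - \lambda_j} \quad \text{for } \lambda \ne \lambda_j$$

$$\therefore\ \lambda f = Kf + g = \sum_{i=1}^\infty \lambda_i \langle f, x_i\rangle\, x_i + g = \sum_{i=1}^\infty \frac{\lambda_i}{\lambda - \lambda_i}\langle g, x_i\rangle\, x_i + g$$

which solves the equation for f if $\lambda \notin \sigma(K)$. Here

$$\langle g, x_i\rangle = \int_a^b g(t)\,x_i(t)\,dt$$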
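The iterative method for the case $\|K\| < |\lambda|$ can be sketched numerically. The code below is a minimal illustration, not the author's procedure: it assumes a midpoint-rule discretization of the integral, and the kernel $k(t,s) = ts$ on $[0,1]$ with $g(t) = t$ and $\lambda = 1$ is a hypothetical test case, chosen because the exact solution $f(t) = \tfrac{3}{2}t$ is easy to verify by substitution.

```python
# Fixed-point iteration f <- (K f + g) / lam for the Fredholm equation
# lam*f(t) - integral_a^b k(t, s) f(s) ds = g(t), discretized by the midpoint rule.
# The discretization and the test problem are illustrative assumptions.

def solve_fredholm(kernel, g, lam, a=0.0, b=1.0, n=200, iterations=60):
    """Return (grid, values) approximating the solution f on a midpoint grid."""
    h = (b - a) / n
    grid = [a + (i + 0.5) * h for i in range(n)]
    g_vals = [g(t) for t in grid]
    K = [[kernel(t, s) * h for s in grid] for t in grid]    # quadrature weights folded in
    f = [gv / lam for gv in g_vals]                          # first term of the series
    for _ in range(iterations):
        Kf = [sum(kij * fj for kij, fj in zip(row, f)) for row in K]
        f = [(kf + gv) / lam for kf, gv in zip(Kf, g_vals)]
    return grid, f

grid, f = solve_fredholm(lambda t, s: t * s, lambda t: t, lam=1.0)
max_error = max(abs(fv - 1.5 * t) for t, fv in zip(grid, f))
```

Each pass adds one more term of the series, so stopping after finitely many iterations corresponds to the finite approximation mentioned in the text.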

Unfortunately, in practice many equations are represented by operators which are not compact or even bounded. These are the ordinary differential equations and partial differential equations, which in linear cases can be represented by closed operators.

Although closed operators are not as well behaved as compact or bounded operators, they also have nice properties and a knowledge of their spectrum also gives useful information. For example, if A is a closed linear operator on a Hilbert space H and for some $\lambda$, $(\lambda I - A)^{-1} \in \mathscr{L}(H)$ and is compact and self-adjoint, then A has a special eigenfunction decomposition:

$$Ax = \sum_{n=1}^\infty \mu_n \langle x, x_n\rangle\, x_n$$

for every x in the domain of A, with $\{x_1, x_2, \dots\}$ the eigenvectors of A and $\mu_n$ the eigenvalues of A.

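Example 73. The heat equation of Ex. 68

Assume that $u(x,t) = U(x)\,V(t)$. Substituting into the heat equation and dividing by UV gives

$$\frac{1}{V}\frac{dV}{dt} = \frac{1}{U}\,AU$$

Since the right-hand side depends only on x and the left-hand side only on t, they must both equal a constant, $\lambda$ say:

$$AU = \lambda U; \qquad \frac{dV}{dt} = \lambda V$$

We can show that A has the property that $(\lambda_0 I - A)^{-1}$ is compact and self-adjoint for some $\lambda_0$,

i.e. $A = \sum_n \mu_n \langle \cdot, x_n\rangle\, x_n$, and $Ax_n = \mu_n x_n$

So a solution is $u(x,t) = e^{\mu_n t}x_n(x)$ and the general solution is $u(x,t) = \sum_n d_n e^{\mu_n t}x_n(x)$, where the $d_n$ are determined by

$$u_0(x) = \sum_n d_n x_n(x)$$

i.e. $d_n = \langle u_0, x_n\rangle$

$$\therefore\ u(x,t) = \sum_{n=1}^\infty \langle u_0, x_n\rangle\, e^{\mu_n t}\, x_n(x)$$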

Unfortunately, we cannot linger on this very important and vast area of application of functional analysis to differential equations, but merely refer the reader to Naylor and Sell or Dunford and Schwartz (see Bibliography).

8. CALCULUS IN B-SPACES

In this section, we shall be mainly concerned with developing a differential calculus for operators. The fundamental idea involved is the local approximation of operators by linear operators. Unfortunately, in order to understand this concept it will be necessary to "unlearn" the interpretation of the derivative of a real-valued function of a real variable. First of all, recall that for

$$f : \mathbb{R} \to \mathbb{R},$$

$$f'(x_0) = \lim_{t\to 0} \frac{f(x_0 + t) - f(x_0)}{t}$$

and that a good approximation to f(x) for x near $x_0$ is

$$L(x) = f(x_0) + f'(x_0)(x - x_0)$$

If we consider $f : \mathbb{R}^2 \to \mathbb{R}$, then the above expression for $f'$ no longer makes sense because we cannot add the vector $x_0$ to the scalar t. However, since $f(x) = f(x_1, x_2)$, we can consider the partial derivatives

$$\frac{\partial f}{\partial x_1} = \lim_{t\to 0} \frac{f(x_1 + t, x_2) - f(x_1, x_2)}{t}$$

and

$$\frac{\partial f}{\partial x_2} = \lim_{t\to 0} \frac{f(x_1, x_2 + t) - f(x_1, x_2)}{t}$$

However, it is still not obvious how we should interpret

$$f'(x_0)(x - x_0) \quad \text{when } x - x_0 \in \mathbb{R}^2$$

For $\mathbb{R}^1$, we have

$$f'(x_0)(x - x_0) = f'(x_0) \times (x - x_0)$$

and since the generalization of scalar multiplication is the inner product, we could take

$$f'(x_0)(x - x_0) = \langle \nabla f(x_0), x - x_0\rangle$$

where

$$\nabla f = \Big(\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}\Big)'$$

This is usually written

$$df = \frac{\partial f}{\partial x_1}\,dx_1 + \frac{\partial f}{\partial x_2}\,dx_2$$

But note that this definition depends on

(1) the inner product in $\mathbb{R}^2$

(2) a natural basis in $\mathbb{R}^2$

(3) the fact that f is real-valued.

Hence this approach will only generalize to functionals on $\mathbb{R}^n$. Let us now return to $f : \mathbb{R} \to \mathbb{R}$ and

$$f'(x_0)(x - x_0) = \lim_{t\to 0} \frac{f(x_0 + t) - f(x_0)}{t}\,(x - x_0)$$

Write $x - x_0 = \eta$ and, replacing t by $t\eta$, we have

$$f'(x_0)\,\eta = \lim_{t\to 0} \frac{f(x_0 + t\eta) - f(x_0)}{t}$$

This latter interpretation makes sense for $f : X \to Y$, where X need only be a vector space and the space Y possesses some topological structure (so that the limit operation makes sense).

Definition 71. Gateaux derivative

Given x and $\eta$ in X, if

$$Df(x)\eta = \lim_{t\to 0} \frac{f(x + t\eta) - f(x)}{t}$$

exists, then f is called Gateaux differentiable at x in the direction of $\eta$. We say f is Gateaux differentiable if it is Gateaux differentiable in every direction, and in this case the operator $Df(x) : X \to Y$ which assigns to each $\eta \in X$ the vector $Df(x)(\eta) \in Y$ is called the Gateaux derivative at x.

Example 74. Consider $f : \mathbb{R}^n \to \mathbb{R}$ and the basis $e_i = (0, \dots, 0, 1, 0, \dots)$; then …

Example 75. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be given by …, $x \ne 0$; $f(0) = 0$. Then … exists iff

$$\eta = (\eta_1, 0) \ \text{ or } \ (0, \eta_2)$$

This example shows that the existence of the partial derivatives is not a sufficient condition for the Gateaux derivative to exist.

Example 76. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be given by … Then …

This example shows that the Gateaux derivative is not a linear operator.

Theorem: If the functional $f : X \to \mathbb{R}$ has a minimum or a maximum at $x \in X$ and Df(x) exists, then Df(x) = 0.

Proof: If $\eta \in X$ is such that $Df(x)\eta > 0$, then for t sufficiently small $\tfrac{1}{t}\big(f(x + t\eta) - f(x)\big) > 0$. Consequently $f(x + t\eta) > f(x)$ if $t > 0$ and $f(x + t\eta) < f(x)$ if $t < 0$, so x can be neither a minimum nor a maximum. A similar argument can be used if $Df(x)\eta < 0$.

Example 77. Let $\mathscr{C}[0,1]$ be the vector space of real-valued functions which are continuous on [0,1]. Consider $T : \mathscr{C}[0,1] \to \mathbb{R}$ defined by

$$T(y) = \int_0^1 \big[\tfrac{1}{2}(x + 1)\,y(x)^2 - y(x)\big]\,dx$$

Then

$$DT(y)(\eta) = \int_0^1 \big[(x + 1)\,y(x) - 1\big]\,\eta(x)\,dx$$

Hence for a minimum

$$DT(y)(\eta) = 0 \quad \forall \eta$$

Let

$$\eta(x) = (x + 1)\,y(x) - 1;$$

then we see

$$(x + 1)\,y(x) - 1 = 0$$

$$\therefore\ y(x) = \frac{1}{x + 1}.$$
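The formula for $DT(y)(\eta)$ in Example 77 can be checked against the defining limit. The sketch below is an illustration under stated assumptions: the integrals are approximated by a midpoint rule, the limit by a central difference quotient, and the test functions `y` and `eta` are arbitrary choices made here.

```python
# Numerical check of the Gateaux derivative in Example 77.
# Quadrature rule, step size t and the test functions are illustrative assumptions.

def integrate(func, n=1000):
    """Midpoint rule on [0, 1]."""
    h = 1.0 / n
    return h * sum(func((i + 0.5) * h) for i in range(n))

def T(y):
    """The functional T(y) = integral of (1/2)(x+1) y(x)^2 - y(x) over [0, 1]."""
    return integrate(lambda x: 0.5 * (x + 1.0) * y(x) ** 2 - y(x))

def DT(y, eta):
    """Gateaux derivative from the closed-form expression in the example."""
    return integrate(lambda x: ((x + 1.0) * y(x) - 1.0) * eta(x))

def difference_quotient(y, eta, t=1e-4):
    """Central difference (T(y + t*eta) - T(y - t*eta)) / (2t)."""
    plus = T(lambda x: y(x) + t * eta(x))
    minus = T(lambda x: y(x) - t * eta(x))
    return (plus - minus) / (2.0 * t)

y = lambda x: x * x                  # arbitrary continuous test function
eta = lambda x: 1.0 + x              # arbitrary direction
y_star = lambda x: 1.0 / (x + 1.0)   # the minimizer found in the example
```

Comparing `difference_quotient(y, eta)` with `DT(y, eta)` should give agreement up to rounding, and `DT(y_star, eta)` should vanish for any direction.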

The concept of the Gateaux derivative did not require any topology on the domain space; this can lead to "unusual" properties. For example, consider $f : \mathbb{R}^2 \to \mathbb{R}$, where $f(x) = \dots$ for $x \ne 0$; $f(0) = 0$. Then $Df(0)(\eta) = 0$ $\forall \eta \in X$.

Hence Df(0) exists and is a continuous linear operator. But f is not continuous at 0. In order to make sure that differentiable functions are continuous we now introduce the concept of a Fréchet derivative.

Definition 72. Fréchet derivative

Consider $f : X \to Y$, where both X and Y are normed linear spaces. Given $x \in X$, if a continuous linear operator df(x) exists such that

$$\lim_{\|h\|\to 0} \frac{\|f(x + h) - f(x) - df(x)(h)\|}{\|h\|} = 0$$

then f is said to be Fréchet differentiable at x and df(x)(h) is said to be the Fréchet differential of f at x with increment h.

It is easy to see that if the Fréchet differential exists, then the Gateaux differential exists and the two are equal. Moreover, if f has a Fréchet differential at x, then f is continuous at x.

Example 78. Suppose $f : \mathbb{R}^n \to \mathbb{R}^m$ is Fréchet differentiable at x, then

$$df(x)(\eta) = \begin{pmatrix} \dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\ \vdots & & \vdots \\ \dfrac{\partial f_m}{\partial x_1} & \cdots & \dfrac{\partial f_m}{\partial x_n} \end{pmatrix} \eta$$

If x is near $x_0$, i.e. $\|x - x_0\|$ is small, then $\|f(x) - f(x_0) - df(x_0)(x - x_0)\|$ is near zero. Hence a good approximation to f(x) is

$$f(x) \approx f(x_0) + df(x_0)(x - x_0)$$

so that the Fréchet derivative is essentially a linear approximation to $f(x + \Delta x) - f(x)$ for $\|\Delta x\|$ small.

We shall now see how these concepts can be applied to problems of approximation and optimization.

Newton's method

Consider the non-linear operator $P : X \to X$ and suppose we wish to find $x \in X$ such that P(x) = 0. Given $x_0 \in X$, let $\eta = x - x_0$. Then we require $\eta$ so that $P(x_0 + \eta) = 0$, i.e. $P(x_0 + \eta) - P(x_0) = -P(x_0)$. But if $x_0$ is near x, a good linear approximation is

$$P(x_0 + \eta) - P(x_0) \approx dP(x_0)(\eta)$$

Hence $dP(x_0)(\eta) = -P(x_0)$ and, if $dP(x_0)$ is invertible, $\eta = -(dP(x_0))^{-1}P(x_0)$, so that $x \approx x_0 - (dP(x_0))^{-1}P(x_0)$. This is readily recognized as Newton's method, so that the method makes use of the Fréchet derivative to replace a non-linear problem by a sequence of linear ones.
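The update above, repeated from each new point, can be sketched for $X = \mathbb{R}^2$, where $dP(x_0)$ is the 2×2 Jacobian matrix. The test problem (a circle of radius 2 intersected with the line $x_1 = x_2$) and the function names are assumptions for illustration only.

```python
# Newton's method in R^2: solve dP(x) eta = -P(x), then set x <- x + eta.
# The example system P and its Jacobian dP are illustrative assumptions.

def newton(P, dP, x0, tol=1e-12, max_iter=50):
    """Iterate the Newton step until the residual is below tol."""
    x = list(x0)
    for _ in range(max_iter):
        p = P(x)
        if max(abs(v) for v in p) < tol:
            break
        (a, b), (c, d) = dP(x)
        det = a * d - b * c
        # Cramer's rule for the linear system dP(x) eta = -P(x).
        eta = [(-p[0] * d + b * p[1]) / det,
               (-a * p[1] + c * p[0]) / det]
        x = [x[0] + eta[0], x[1] + eta[1]]
    return x

def P(x):
    return [x[0] ** 2 + x[1] ** 2 - 4.0, x[0] - x[1]]

def dP(x):
    return [[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]]

root = newton(P, dP, [1.0, 2.0])   # converges towards (sqrt(2), sqrt(2))
```

Each pass solves one linear problem, which is the sense in which the non-linear problem is replaced by a sequence of linear ones.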

Euler-Lagrange equation

A classical problem in the calculus of variations is that of finding a function x on the interval [a,b], minimizing an integral of the form

$$J = \int_a^b f(x(t), \dot{x}(t), t)\,dt$$

To specify this problem we must agree on the class of functions within which to seek the extremum. We assume that f is continuous in x, $\dot{x}$ and t and has continuous partial derivatives with respect to x and $\dot{x}$, and we seek a solution in the subspace of $\mathscr{C}^1[a,b]$ for which x(a) = 0 = x(b).

A necessary condition for an extremum is that for all admissible h

$$DJ(x)(h) = 0$$

Now

$$DJ(x)(h) = \frac{d}{d\alpha}\int_a^b f(x + \alpha h, \dot{x} + \alpha\dot{h}, t)\,dt\,\Big|_{\alpha = 0}$$

$$DJ(x)(h) = \int_a^b \frac{\partial f}{\partial x}(x, \dot{x}, t)\,h(t)\,dt + \int_a^b \frac{\partial f}{\partial \dot{x}}(x, \dot{x}, t)\,\dot{h}(t)\,dt$$

and it is easily verified that this differential is actually Fréchet. If we assume that $\partial f/\partial\dot{x}$ has a continuous derivative with respect to t, by integrating by parts, we can write

$$DJ(x)(h) = \int_a^b \Big[\frac{\partial f}{\partial x} - \frac{d}{dt}\frac{\partial f}{\partial \dot{x}}\Big]h\,dt + \Big[\frac{\partial f}{\partial \dot{x}}\,h\Big]_a^b$$

Since h(a) = 0 = h(b), for an extremum

$$\int_a^b \Big[\frac{\partial f}{\partial x} - \frac{d}{dt}\frac{\partial f}{\partial \dot{x}}\Big]h\,dt = 0 \quad \forall h \in \mathscr{C}^1[a,b] \text{ with } h(a) = 0 = h(b)$$

It can be shown that this implies

$$\frac{\partial f}{\partial x} - \frac{d}{dt}\frac{\partial f}{\partial \dot{x}} = 0$$

the Euler-Lagrange equation.

Example 79. What is the lifetime plan of investment and expenditure that maximizes total enjoyment for a man having a fixed quantity of savings? We assume the man has no other income except that obtained through investment. His rate of enjoyment at a given time is a certain function V of r, his rate of expenditure. Thus we assume it is desired to maximize

$$\int_0^T e^{-\beta t}\,V(r(t))\,dt$$

where the $e^{-\beta t}$ term reflects the notion that future enjoyment is counted less today.

If x(t) is the total capital at time t, then

$$\dot{x}(t) = \alpha x(t) - r(t)$$

where $\alpha$ is the interest rate. Thus we maximize

$$\int_0^T e^{-\beta t}\,V(\alpha x(t) - \dot{x}(t))\,dt$$

subject to x(0) = s, x(T) = 0. The Euler-Lagrange equation is

$$\alpha e^{-\beta t}\,V'(\alpha x - \dot{x}) + \frac{d}{dt}\big[e^{-\beta t}\,V'(\alpha x - \dot{x})\big] = 0$$

which integrates to

$$V'(r(t)) = V'(r(0))\,e^{(\beta - \alpha)t}$$

If $V(r) = 2\sqrt{r}$, then r(t) turns out to be

$$r(t) = r(0)\,e^{2(\alpha - \beta)t}$$

and

$$x(t) = \Big[x(0) - \frac{r(0)}{2\beta - \alpha}\Big]e^{\alpha t} + \frac{r(0)}{2\beta - \alpha}\,e^{2(\alpha - \beta)t}$$

If $\alpha > \beta > \alpha/2$, from x(T) = 0, we have

$$r(0) = \frac{(2\beta - \alpha)\,x(0)}{1 - e^{-(2\beta - \alpha)T}}$$

The total capital grows initially and then decreases to zero.

High-order derivatives

Just as in ordinary calculus, it is possible to define higher-order derivatives by induction. Let us first take an example.

Example 80. For a particular case of $f : \mathbb{R}^2 \to \mathbb{R}$, take

$$f(x) = x_1^2 + x_1x_2^2 + x_2^4$$

Then

$$\frac{f(x_0 + t\eta) - f(x_0)}{t} = 2x_{01}\eta_1 + \eta_1x_{02}^2 + 2x_{01}x_{02}\eta_2 + 4x_{02}^3\eta_2 + t\eta_1^2 + tx_{01}\eta_2^2 + 2t\eta_1x_{02}\eta_2 + 6tx_{02}^2\eta_2^2 + O(t^2)$$

where $x_0 = (x_{01}, x_{02})'$ and $\eta = (\eta_1, \eta_2)'$.

Now let $x_0 + t\eta = x$, i.e. $t\eta_1 = x_1 - x_{01}$, $t\eta_2 = x_2 - x_{02}$.

Then up to quadratic terms in $(x - x_0)$

$$f(x) - f(x_0) = 2x_{01}(x_1 - x_{01}) + (x_1 - x_{01})x_{02}^2 + 2x_{01}x_{02}(x_2 - x_{02}) + 4x_{02}^3(x_2 - x_{02})$$

$$\qquad + (x_1 - x_{01})^2 + x_{01}(x_2 - x_{02})^2 + 2x_{02}(x_1 - x_{01})(x_2 - x_{02}) + 6x_{02}^2(x_2 - x_{02})^2$$

$$= (x - x_0)'\,2b + (x - x_0)'\,Q\,(x - x_0)$$

where

$$2b = \begin{pmatrix} 2x_{01} + x_{02}^2 \\ 2x_{01}x_{02} + 4x_{02}^3 \end{pmatrix}$$

and

$$Q = \frac{1}{2}\begin{pmatrix} 2 & 2x_{02} \\ 2x_{02} & 2x_{01} + 12x_{02}^2 \end{pmatrix}$$

If $f : \mathbb{R}^2 \to \mathbb{R}$ is twice Fréchet differentiable, then its first Fréchet derivative $\nabla f$ is a vector, $\nabla f = 2b$, say, and its second Fréchet derivative is a matrix, say Q, and if x is near $x_0$, a good approximation to f(x) is

$$f(x) \approx f(x_0) + \langle 2b, x - x_0\rangle + \langle x - x_0, Q(x - x_0)\rangle$$

where the inner product is just the scalar product of vectors. In the more general case $f : H \to \mathbb{R}$, where H is a Hilbert space, the same approximation is valid, where $2b \in H$ and $Q \in \mathscr{L}(H)$, which has application in

Iterative methods

If a functional is twice Fréchet differentiable on a Hilbert space H and has, say, a minimum at $x_0$, then near $x_0$ the behaviour of f(x) must be of the form

$$f(x) = f(x_0) + \langle 2b, x - x_0\rangle + \langle x - x_0, Q(x - x_0)\rangle$$

so that many minimization problems can be examined by looking at the minimization of the functional

$$f(x) = \langle x, Qx\rangle - 2\langle x, b\rangle$$

where Q is a positive-definite self-adjoint operator on H. Let us examine two methods of minimizing this functional.

1. Steepest descent

It is easy to see that the minimum is given by $x_0$, where $Qx_0 = b$. Write $r = b - Qx$ and note that 2r is the negative gradient of f at the point x. Now consider the following iterative programme:

$$x_{n+1} = x_n + \alpha_n r_n$$

where $r_n = b - Qx_n$ and $\alpha_n$ is to be chosen to minimize $f(x_{n+1})$. Now

$$f(x_{n+1}) = \langle x_n + \alpha_n r_n, Q(x_n + \alpha_n r_n)\rangle - 2\langle x_n + \alpha_n r_n, b\rangle$$

$$= \alpha_n^2\langle r_n, Qr_n\rangle - 2\alpha_n\|r_n\|^2 + \langle x_n, Qx_n\rangle - 2\langle x_n, b\rangle$$

which is minimized by

$$\alpha_n = \frac{\|r_n\|^2}{\langle r_n, Qr_n\rangle}$$

Hence steepest descent is

$$x_{n+1} = x_n + \frac{\|r_n\|^2}{\langle r_n, Qr_n\rangle}\,r_n$$

where $r_n = b - Qx_n$.
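The steepest descent iteration can be tried on a small positive-definite matrix. The following is a minimal sketch in $\mathbb{R}^2$; the matrix `Q`, the vector `b`, the stopping rule and the helper names are assumptions made for illustration.

```python
# Steepest descent for f(x) = <x, Qx> - 2<x, b> with Q symmetric positive definite.
# Step length alpha_n = ||r_n||^2 / <r_n, Q r_n>, residual r_n = b - Q x_n.
# Q, b and the tolerance are illustrative assumptions.

def dot(u, v):
    return sum(a * c for a, c in zip(u, v))

def mat_vec(Q, x):
    return [dot(row, x) for row in Q]

def steepest_descent(Q, b, x0, tol=1e-12, max_iter=10000):
    x = list(x0)
    for _ in range(max_iter):
        r = [bi - qi for bi, qi in zip(b, mat_vec(Q, x))]   # r_n = b - Q x_n
        rr = dot(r, r)
        if rr < tol * tol:                                   # residual small enough
            break
        alpha = rr / dot(r, mat_vec(Q, r))                   # optimal step along r_n
        x = [xi + alpha * ri for xi, ri in zip(x, r)]
    return x

Q = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x_min = steepest_descent(Q, b, [0.0, 0.0])   # solves Q x = b, here x = (1/11, 7/11)
```

The loop stops once the residual is negligible, since a zero residual means $Qx_n = b$ and hence the minimum has been reached.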

2. Conjugate gradient

In this method we try to reformulate the problem as a minimum norm problem. Again consider

$$f(x) = \langle x, Qx\rangle - 2\langle x, b\rangle$$

and introduce a new inner product

$$\langle x, y\rangle_Q = \langle x, Qy\rangle.$$

Since $Qx_0 = b$, the problem is equivalent to minimizing

$$\langle x - x_0, Q(x - x_0)\rangle = \|x - x_0\|_Q^2$$

Suppose we generate a sequence of vectors that are orthogonal with respect to $\langle\cdot,\cdot\rangle_Q$. These are usually called "conjugate directions". Let $x_0$ be expanded in a Fourier series with respect to this sequence. If $x_n$ is the nth partial sum of this expansion, then by the theory developed in section 4, $\|x_n - x_0\|_Q$ is minimized over the subspace spanned by the first n of the conjugate directions (say $p_1, p_2, \dots, p_n$).

In order to compute the Fourier series of $x_0$, we have to compute

$$\langle x_0, p_i\rangle_Q = \langle p_i, Qx_0\rangle = \langle p_i, b\rangle$$

i.e. no knowledge of $x_0$ is necessary. The conjugate gradient method is given by the sequence

$$x_{n+1} = x_n + \alpha_n p_n, \qquad \alpha_n = \frac{\langle p_n, r_n\rangle}{\langle p_n, Qp_n\rangle}, \qquad r_n = b - Qx_n$$

To see this, write $y_n = x_n - x_1$ and $y_0 = x_0 - x_1$, where $x_1$ is the starting point. Then we have

$$y_{n+1} = y_n + \frac{\langle p_n, b - Qx_1 - Qy_n\rangle}{\langle p_n, Qp_n\rangle}\,p_n$$

$$= y_n + \frac{\langle p_n, y_0 - y_n\rangle_Q}{\langle p_n, Qp_n\rangle}\,p_n$$

Since $y_n$ is in the subspace spanned by $p_1, \dots, p_{n-1}$ and since the $p_i$'s are orthogonal with respect to $\langle\cdot,\cdot\rangle_Q$,

$$\langle p_n, y_n\rangle_Q = 0$$

and

$$y_{n+1} = y_n + \frac{\langle p_n, y_0\rangle_Q}{\|p_n\|_Q^2}\,p_n$$

Thus

$$y_{n+1} = \sum_{k=1}^n \frac{\langle p_k, y_0\rangle_Q}{\|p_k\|_Q^2}\,p_k$$

which is the nth partial sum of the Fourier expansion of $y_0$. It follows that $y_n \to y_0$ or $x_n \to x_0$. The orthogonality relation $\langle r_n, p_i\rangle = 0$ for $i < n$ follows from the fact that $y_n - y_0 = x_n - x_0$ is Q-orthogonal to the subspace generated by $p_1, \dots, p_{n-1}$.
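The text leaves open how the conjugate directions $p_n$ are produced. The sketch below assumes the usual choice: start with $p_1 = r_1$ and make each new direction Q-orthogonal to the previous one by a single Gram-Schmidt step on the new residual. That update rule, the test matrix and the helper names are assumptions, not taken from the lectures.

```python
# Conjugate gradient for Q x = b, Q symmetric positive definite.
# Step: x_{n+1} = x_n + alpha_n p_n with alpha_n = <p_n, r_n> / <p_n, Q p_n>.
# The direction update (beta) is the standard choice and an assumption here.

def dot(u, v):
    return sum(a * c for a, c in zip(u, v))

def mat_vec(Q, x):
    return [dot(row, x) for row in Q]

def conjugate_gradient(Q, b, x1, tol=1e-12):
    x = list(x1)
    r = [bi - qi for bi, qi in zip(b, mat_vec(Q, x))]   # r_1 = b - Q x_1
    p = list(r)                                          # first direction
    for _ in range(len(b)):                              # at most dim(H) steps
        if dot(r, r) < tol * tol:
            break
        Qp = mat_vec(Q, p)
        pQp = dot(p, Qp)
        alpha = dot(p, r) / pQp
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, Qp)]   # new residual b - Q x
        beta = -dot(r, Qp) / pQp                         # enforces <p_new, Q p> = 0
        p = [ri + beta * pi for ri, pi in zip(r, p)]
    return x

Q = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x_cg = conjugate_gradient(Q, b, [0.0, 0.0, 0.0])
cg_residual = [bi - qi for bi, qi in zip(b, mat_vec(Q, x_cg))]
```

In three dimensions the Fourier expansion has at most three terms, so the loop runs at most three times, in contrast to the open-ended iteration of steepest descent.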

9. PROBABILITY SPACES AND STOCHASTIC PROCESSES

Probability theory and statistics can mean very different things to the pure mathematician or the statistician. To the pure mathematician probability theory is a very special case of a measure space $(\Omega, P, \mu)$ where $\mu(\Omega) = 1$, but to the statistician it means likelihoods of events, normal distribution, $\chi^2$-distribution or ways of dealing with random phenomena. Here we shall provide the mathematical framework of probability theory as it fits in nicely with our functional analytic approach.

Definition 73. A probability space is a measure space $(\Omega, P, \mu)$, where $\mu(\Omega) = 1$. Sets in P we call events, and $\mu$ a probability measure.

It is useful to compare the different terminology used in probability theory and in measure theory for essentially the same things:

Probability theory                                      Measure theory
probability space $(\Omega, P, \mu)$                    measure space
sample point $\omega \in \Omega$                        element in $\Omega$
event $A \in P$                                         measurable set
sure event $\Omega$                                     whole space
impossible event $\emptyset$                            empty set
almost surely (a.s.), with probability one (w.p.1)      almost everywhere
random variable                                         measurable function
expectation                                             integral

Let us consider some simple illustrative examples.

Example 81. Let $\Omega$ be the possible outcomes of tossing a die three times. $\Omega$ clearly has $6^3$ sample points of the form (2, 1, 6) and examples of measurable sets in P, or events, are

A = {a 6 is turned up in at least one of the tosses}
B = {a 3 is turned up 3 times running}

Then you can see that $\mu(A) = 1 - (5/6)^3 = 91/216$ and $\mu(B) = 1/6^3$.
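The two probabilities can be confirmed by listing the sample space directly. The sketch below assumes the uniform measure on the $6^3$ outcomes (a fair die); the names `outcomes` and `probability` are illustrative.

```python
# Enumerate the sample space of three die tosses and measure the events A and B.
# A fair die (uniform measure) is assumed.
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=3))     # the 6^3 sample points

def probability(event):
    """mu(event) under the uniform measure on the sample space."""
    favourable = sum(1 for w in outcomes if event(w))
    return Fraction(favourable, len(outcomes))

prob_A = probability(lambda w: 6 in w)              # a 6 in at least one toss
prob_B = probability(lambda w: w == (3, 3, 3))      # a 3 three times running
```

Exact fractions are used so that the results can be compared with $91/216$ and $1/216$ without rounding.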

Example 82. Let $\Omega$ be the collection of all possible outcomes of flipping a coin 50 times. A typical sample point in $\Omega$ is $\omega = (H, H, T, \dots, H, T)$, a 50-tuple. Let p be the probability of getting H on a toss (allowing for 'unfair' coins). Then q = 1 - p is the probability of getting T. Then it is well known that the probability that you get n heads and 50 - n tails is

$$\frac{50!}{(50 - n)!\,n!}\,p^n q^{50 - n}$$

This is the binomial distribution.
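As a quick check that these numbers define a probability measure on the possible head counts, the sketch below evaluates the formula and sums it over n = 0, ..., 50. The value p = 0.3 is an arbitrary choice and the function name is hypothetical.

```python
# Binomial probabilities for n heads in 50 tosses of a coin with P(H) = p.
from math import comb

def binomial_pmf(n, trials=50, p=0.5):
    """Probability of exactly n heads in the given number of tosses."""
    q = 1.0 - p
    return comb(trials, n) * p ** n * q ** (trials - n)

total = sum(binomial_pmf(n, p=0.3) for n in range(51))   # should be 1 up to rounding
fair_25 = binomial_pmf(25)                               # 25 heads with a fair coin
```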

Example 83. Let

$$f(x) = \begin{cases} \alpha e^{-\alpha x} & \text{for } x > 0 \\ 0 & \text{otherwise} \end{cases}$$

If $\Omega = \mathbb{R}$, then $\mu(A) = \int_A f(x)\,dx$ is a probability measure on $\mathbb{R}$. This is called the exponential distribution.

We have already said that a random variable is a measurable function $x : \Omega \to \mathbb{R}$. In applications you usually think of it as some quantity whose value you can never predict exactly, but you can predict the probability that it will have a certain value. In Example 81, for instance, the die question, one random variable is the number turned up on the second throw; you cannot say what it will be, but you do know that it will be any of the numbers 1 to 6 with equal probability 1/6.

Definition 74. A convenient way of expressing this information is by a probability distribution function of x, denoted by F(t) and defined by

$$F(t) = \mu\{\omega : x(\omega) \le t\}$$

i.e. F(t) is a real-valued monotonically increasing function of one variable, with $F(-\infty) = 0$, $F(\infty) = 1$, $0 \le F(t) \le 1$.

Example 84. Let $\Omega$ be the 36 possible outcomes of rolling 2 dice. Let x be the random variable which assigns to each outcome the total points of the 2 dice. Its distribution function is shown in the following figure:

[Figure: distribution function F(t) of the total of two dice.]

We note that

$$\mu\{t_1 < x(\omega) \le t_2\} = F(t_2) - F(t_1).$$
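The distribution function of Example 84 can be tabulated by counting outcomes. This is a small sketch assuming fair dice; the function name `F` follows the notation of Definition 74 and `sample_space` is an illustrative name.

```python
# Distribution function of the total of two fair dice.
from fractions import Fraction
from itertools import product

sample_space = list(product(range(1, 7), repeat=2))   # the 36 outcomes

def F(t):
    """F(t) = mu{omega : x(omega) <= t}, x being the total of the two dice."""
    count = sum(1 for (d1, d2) in sample_space if d1 + d2 <= t)
    return Fraction(count, len(sample_space))

prob_7_to_9 = F(9) - F(6)    # probability that the total is 7, 8 or 9
```

The difference `F(9) - F(6)` counts the totals strictly above 6 and up to 9, which illustrates the relation noted after the figure.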

Definition 75. If F is sufficiently smooth, there exists a function f(s), the probability density function, such that

$$F(t) = \int_{-\infty}^t f(s)\,ds$$

where we may need to interpret the integration as an infinite summation in the case where x is a discrete random variable.

Discrete random variable: x takes on countably many distinct values, for example in the die experiment.

Continuous random variable: the possible values of x are some interval in $\mathbb{R}$.

Define the random variable $x(\omega)$ … Then

$$F(t) = \begin{cases} 1 - e^{-\alpha t} & \text{for } t > 0 \\ 0 & \text{otherwise} \end{cases}$$

is the exponential distribution function, and the exponential probability density function is

$$f(t) = \begin{cases} \alpha e^{-\alpha t} & \text{for } t > 0 \\ 0 & \text{otherwise} \end{cases}$$

Joint distribution functions

If x and y are two random variables on the same probability space $(\Omega, P, \mu)$, we define the joint probability distribution function

$$F(t, s) = \mu\{\omega : x(\omega) \le t,\ y(\omega) \le s\}$$

Similarly you can define the joint distribution function for n random variables

$$F(t_1, \dots, t_n) = \mu\{\omega : x_i(\omega) \le t_i;\ i = 1, \dots, n\}$$

The joint probability density function f, if it exists, is given by

$$F(t_1, \dots, t_n) = \int_{-\infty}^{t_1}\!\!\cdots\int_{-\infty}^{t_n} f(s_1, \dots, s_n)\,ds_1 \dots ds_n$$

Definition 76. Expectation

This mathematical definition is motivated by the notion of the mean or average value of a random event repeated infinitely many times.

$$E\{x\} = \int_\Omega x(\omega)\,d\mu(\omega)$$

provided of course $x(\cdot)$ is integrable or $x(\cdot) \in L_1(\Omega, P, \mu)$, the space of equivalence classes of measurable functions $x : \Omega \to \mathbb{R}$ with

$$\int_\Omega |x(\omega)|\,d\mu(\omega) < \infty$$

We then say that x has finite expectation. If $E\{|x|^2\} < \infty$ or equivalently $x(\cdot) \in L_2(\Omega, P, \mu)$, we say that x has finite second moment $\int_\Omega x(\omega)^2\,d\mu(\omega)$, or that x is a second-order random variable.

$L_2(\Omega, P, \mu)$ is the space of second-order random variables.

Definition 77.

If $x \in L_2(\Omega, P, \mu)$ we can define the variance $\sigma^2(x) = E\{|x - Ex|^2\}$, which is a measure of the dispersion or spread of the values around the average value. $\sigma(x)$ is called the standard deviation of x.

Example 87. The normal distribution has the probability density function

$$f(t) = \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(t - a)^2}{2\sigma^2}}$$

and

$$E\{x\} = a, \qquad \sigma^2(x) = \sigma^2$$

F(t) = probability that the outcome is $\le t$.

This space of second-order random variables $L_2(\Omega, P, \mu)$ is a Hilbert space with the inner product $\langle x, y\rangle = E\{xy\} = \int_\Omega x(\omega)\,y(\omega)\,d\mu(\omega)$.

Definition 78. We also define the covariance of $x, y \in L_2(\Omega, P, \mu)$

$$\mathrm{cov}(x, y) = E\{(x - Ex)(y - Ey)\}$$

and the correlation coefficient

$$\rho(x, y) = \frac{\mathrm{cov}(x, y)}{\sigma(x)\,\sigma(y)}$$

Stochastic independence

Intuitively this means that the outcomes of two events are not related. For example, if you toss a die twice, the outcomes of the two tosses are independent. However, if you consider the event A = {the outcome of the first toss} and B = {the sum of the outcomes of the two tosses}, then clearly these are dependent events. This is formalized mathematically as follows:

Definition 79.

x and y are independent random variables if

$$\mu\{x(\omega) \le \alpha,\ y(\omega) \le \beta\} = \mu\{x(\omega) \le \alpha\}\,\mu\{y(\omega) \le \beta\}$$

Similarly, $x_1, x_2, \dots, x_n$ are mutually independent if

$$\mu\{x_i(\omega) \le \alpha_i;\ i = 1, \dots, n\} = \prod_{i=1}^n \mu\{x_i(\omega) \le \alpha_i\}$$

An important result is that if $x, y \in L_2(\Omega, P, \mu)$ are independent, then $E\{xy\} = E\{x\}E\{y\}$; the converse does not hold in general.

Definition 80. Stochastic processes

A stochastic process is a family of random variables $x_t$, t being a parameter.

Discrete stochastic process: $x_t$; $t = 0, \pm 1, \pm 2, \dots$

Continuous stochastic process: $\{x_t\}$, $t \in$ an interval in $\mathbb{R}$.

For each t, $x_t$ has a well-defined distribution function. An important class of random variables in applications are Gaussian random variables.

Definition 81. A random variable x is Gaussian if its probability density function is of the form

$$\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(t - a)^2}{2\sigma^2}}$$

i.e. it has a normal distribution.

Important properties of Gaussian random variables

1. If x and y are Gaussian, so is $\alpha x + \beta y$.

2. The probability properties of x are completely determined by its expectation and its standard deviation (i.e. by its first two moments only).

3. A weighted average of a sequence of n independent identically distributed random variables in $L_2(\Omega, P, \mu)$ tends to behave like a Gaussian random variable as $n \to \infty$ (Limit theorem), i.e. we can approximate a large number of independent random factors by a Gaussian law.

Definition 82. A Gaussian stochastic process is one for which each random variable $x_t$ has a Gaussian distribution. The most general Gaussian process is completely specified by parameter functions $\mu(t)$ and r(s,t), where $\mu(t)$ is the expectation function, $\mu(t) = E\{x_t\}$, and r(s,t) is the covariance function, $r(s,t) = E\{x_t x_s\} - \mu(s)\mu(t)$. (r(s,t) = r(t,s) and the matrix $[r(t_i, t_j)]$ is non-negative definite.)

The joint probability density function of $x_{t_1}, \dots, x_{t_n}$ is

$$f(s_1, \dots, s_n) = \frac{|a_{ij}|^{1/2}}{(2\pi)^{n/2}} \exp\Big\{-\frac{1}{2}\sum_{i,j} a_{ij}\,(s_i - \mu(t_i))(s_j - \mu(t_j))\Big\}$$

where

$(a_{ij}) = (r(t_i, t_j))^{-1}$ and its determinant is $|a_{ij}|$.

A Gaussian stochastic process is a special case of second-order processes, for which $x_t \in L_2(\Omega, P, \mu)$ for each t. For second-order processes we define $\mu(t) = E\{x_t\}$ and $r(s,t) = E\{x_t x_s\} - \mu(t)\mu(s)$, which, although they do not completely specify the process, do provide very useful information.


An important subclass are wide-sense stationary processes where r(t, s) = r(t - s), which are extensively used in engineering applications.

DEFINITIONS

In reading new material and in particular reading these rather concise notes it may be useful to have a list of the major definitions.

Definition 1. Linear vector space: A set of elements with a binary operation under which it forms a group and an associated scalar multiplication by the real or complex numbers, which is associative.

Definition 2. Linear subspace: if V is a linear vector space, then a subset S of V is a linear subspace if $x, y \in S \Rightarrow \alpha x + \beta y \in S$.

Definition 3. Affine subset: is the set $M = \{x : x = x_0 + c$, where $x_0 \in S$, some linear subspace, and c is a fixed element of X$\}$.

Definition 4. Convexity: A subset A of V is convex if $x, y \in A \Rightarrow \lambda x + (1 - \lambda)y \in A$, $\forall \lambda$, $0 \le \lambda \le 1$.

Definition 5. Linear dependence: if $x_1, x_2, \dots, x_n \in X$ and $\exists\, \alpha_1, \dots, \alpha_n$ not all zero such that $\alpha_1x_1 + \alpha_2x_2 + \dots + \alpha_nx_n = 0$, then $x_1, \dots, x_n$ are linearly dependent.

Linear independence: No such $\alpha_i$ exist.

Definition 6. Dimension: If $x_1, \dots, x_n$ are linearly independent and any vector in X can be represented as a linear combination of $x_1, \dots, x_n$, then X is said to be of dimension n.

Definition 7. Basis: The set $x_1, \dots, x_n$ of Definition 6 is called a basis for X.

Definition 8. Isomorphic: Vector spaces V and W are isomorphic if $\exists$ a bijective linear map $T : V \to W$.

Definition 9. A hyperplane is a maximal proper affine subset.

Definition 10. Metric space: A set $X$ of elements $\{x, y, \dots\}$ and a distance function $d(x,y)$ with the properties

(i) $d(x,y) \ge 0$

(ii) $d(x,y) = 0$ iff $x = y$

(iii) $d(x,y) = d(y,x)$

(iv) $d(x,y) \le d(x,z) + d(z,y)$

$d(x,y)$ is a metric on $X$.

Definition 10a. Pseudo metric: Condition (ii) is replaced by $d(x,x) = 0$ ($d(x,y) = 0$ does not necessarily imply $x = y$).

Definition 11. Normed linear space: $X$ is a linear vector space with a norm on each element, i.e. to each $x \in X$ corresponds a non-negative number $\|x\|$ such that

(i) $\|x\| = 0$ iff $x = 0$

(ii) $\|\alpha x\| = |\alpha|\,\|x\|$ for all scalars $\alpha$

(iii) $\|x + y\| \le \|x\| + \|y\|$

Definition 11a. Seminorm: $\|x\| = 0$ need not imply $x = 0$.

Definition 12. A subspace of a metric space $(X,d)$ is $(A,d)$ where $A \subset X$.

Definition 13. Product metric space: $X \times Y = \{(x,y) : x \in X,\ y \in Y\}$, where $(X,d_x)$, $(Y,d_y)$ are metric spaces and the metric on $X \times Y$ is a suitable function of $d_x$, $d_y$, e.g. $d((x_1,y_1),(x_2,y_2)) = d_x(x_1,x_2) + d_y(y_1,y_2)$.

Definition 14. Continuity: $f : X \to Y$, a map between metric spaces $(X,d_x)$ and $(Y,d_y)$, is continuous at $x_0 \in X$ if given $\epsilon > 0$, $\exists$ a real number $\delta > 0$ such that $d_y(f(x), f(x_0)) < \epsilon$ whenever $d_x(x,x_0) < \delta$.

Definition 15. Uniform continuity: $f$ is uniformly continuous if the above $\delta = \delta(\epsilon)$ is independent of the point $x_0$.

Definition 16. Convergence: A sequence $\{x_n\}$ in a metric space $(X,d)$ converges to $x_0$ in $(X,d)$ if $d(x_n, x_0) \to 0$ as $n \to \infty$.

Definition 17. Cauchy sequence: $\{x_n\}$ is Cauchy if $d(x_n, x_m) \to 0$ as $m, n \to \infty$.

Definition 18. Completeness: A metric space (X, d) is complete if each Cauchy sequence converges to a point in X.

Definition 19. Closed set: A set $A$ in a metric space $(X,d)$ is closed if it contains all its limit points.

Definition 20. Banach space: A complete normed linear space.

Definition 21. Dense: A linear subspace $S$ of a metric space $X$ is dense in $X$ if its closure with respect to the metric $\supseteq X$.

Definition 22. Connected: A metric space is disconnected if it is the union of two open, non-empty, disjoint subsets. Otherwise it is connected.

Definition 23. Contraction mapping: $f : X \to X$, where $(X,d)$ is a metric space, is a contraction mapping if $\exists\, k$, $0 \le k < 1$, such that

$$d(f(x), f(y)) \le k\,d(x,y) \quad \forall\, x, y \in X$$
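As an illustration of Definition 23, the following minimal sketch iterates a contraction until successive iterates agree. The choice $f = \cos$ on $[0,1]$ (a contraction there with $k = \sin 1 < 1$) and the helper name `fixed_point` are assumptions made for this example only; they do not come from the notes.

```python
import math

def fixed_point(f, x0, tol=1e-12, max_iter=1000):
    """Iterate x_{n+1} = f(x_n) until successive iterates differ by less than tol."""
    x = x0
    for _ in range(max_iter):
        x_next = f(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")

# cos maps [0, 1] into itself and |cos x - cos y| <= sin(1) |x - y| there,
# so it is a contraction on [0, 1] with constant k = sin(1) < 1.
x_star = fixed_point(math.cos, 0.5)
```

The limit `x_star` satisfies $\cos x = x$ to within the tolerance; the distance to the limit shrinks by at least the factor $k$ at every step.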

Definition 24. Compact: A set A in a metric space is compact if every sequence in A contains a convergent subsequence with limit point in A.

Definition 25. Relatively compact: A set $A \subset X$ is relatively compact if its closure is compact.

Definition 26. Equicontinuity: A set $A \subset \mathcal{C}(X,Y)$, the space of continuous functions from $(X,d_x)$, a compact metric space, to $(Y,d_y)$, a complete metric space, is equicontinuous at $x_0 \in X$ if given $\epsilon > 0$, $\exists\, \delta > 0$ such that $|f(x) - f(x_0)| < \epsilon$ $\forall f \in A$, whenever $|x - x_0| < \delta$ and $x \in X$.

Definition 27. Open ball of a metric space $(X,d)$:

$B_r(x_0) = \{x \in X : d(x,x_0) < r\}$, the open ball at $x_0$ of radius $r$. (Closed ball and sphere are similarly defined with $\le r$ and $= r$, respectively.)

Definition 28. Local neighbourhood of a metric space $(X,d)$: A subset $N$ of $X$ such that $N = B_r(x_0)$ for some $r > 0$.

Definition 29. Equivalence of metrics: Metrics $d_1$ and $d_2$ on the space $X$ are equivalent if

(a) $f : (X,d_1) \to (Y,d_3)$, an arbitrary metric space, is continuous iff $f : (X,d_2) \to (Y,d_3)$ is continuous

and

(b) A sequence $\{x_n\}$ converges to $x_0$ in $(X,d_1)$ iff $\{x_n\}$ converges to $x_0$ in $(X,d_2)$.

Definition 30. Open set: A set $A \subset (X,d)$ is open if it contains a local neighbourhood of each of its points.

Definition 31. Topology: The class of all open sets of $(X,d)$ is the topology of $(X,d)$.

Definition 32. Continuity: A map $f : (X,d_1) \to (Y,d_2)$ is continuous if the inverse image of each open set in $(Y,d_2)$ is open in $(X,d_1)$ (see Definition 14).

Definition 33. Convergence: A sequence $\{x_n\}$ in a metric space $(X,d)$ converges to $x_0$ in $X$ iff $x_n$ is in every open set containing $x_0$ for sufficiently large $n$ (see Definition 16).

Definition 34. Closed set: $A \subset X$ is closed if its complement is open (see Definition 19).

Definition 35. Separable metric space: (X,d) is separable if it contains a countable subset which is dense in X.

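Definition 36. Measurable space is a couple $(X,\mathcal{B})$ consisting of a set $X$ and a $\sigma$-algebra $\mathcal{B}$ of subsets of $X$. (A subset of $X$ is measurable if it is in $\mathcal{B}$.)

Definition 37. Measure $\mu$ on a measurable space $(X,\mathcal{B})$ is a non-negative set function defined for all sets in $\mathcal{B}$ with the properties

$$\mu(\emptyset) = 0$$

$$\mu\Big(\bigcup_{i=1}^{\infty} E_i\Big) = \sum_{i=1}^{\infty} \mu(E_i)$$

where $E_i$ are disjoint sets in $\mathcal{B}$. $(X,\mathcal{B},\mu)$ is then a measure space.

Definition 38. Complete measure space $(X,\mathcal{B},\mu)$ is one which contains all subsets of sets of measure 0.

Definition 39. Measurable function: $f : X \to \mathbb{R} \cup \{\infty\}$ is measurable if $\{x : f(x) < \alpha\} \in \mathcal{B}$ for all real $\alpha$.

Definition 40. Simple function: $g(x) = \sum_{i=1}^{n} c_i \chi_{E_i}(x)$, where $c_i$ are constants and $\chi_{E_i}$ is the characteristic function of $E_i \in \mathcal{B}$.

Definition 41. Integrable function: $f : X \to \mathbb{R} \cup \{\infty\}$ is integrable if $\int_X f\,d\mu < \infty$.

Definition 42. Absolute continuity on $I$ (compact interval in $\mathbb{R}$): $F$ is absolutely continuous on $I$ if for every $\epsilon > 0$, $\exists\, \delta > 0$ such that

$$\sum_{k=1}^{n} |F(t_k') - F(t_k)| < \epsilon \quad \text{whenever} \quad \sum_{k=1}^{n} |t_k' - t_k| < \delta$$

for non-overlapping subintervals $(t_k, t_k')$ of $I$.

For $I$ an arbitrary interval, $F$ must be absolutely continuous on every compact subinterval.

Definition 43. Topological isomorphism of two normed linear spaces $X$, $Y$: $\exists\, T \in \mathcal{L}(X,Y)$ such that $T^{-1} \in \mathcal{L}(Y,X)$; $T$ is a topological isomorphism.

Definition 44. Isometric isomorphism:

$$\|Tx\|_Y = \|x\|_X \quad \forall x \in X$$

Definition 45. Inner product is a bilinear function $\langle \cdot, \cdot \rangle : X \times X \to \mathbb{C}$ such that

1. $\langle \alpha x + \beta y, z\rangle = \alpha\langle x, z\rangle + \beta\langle y, z\rangle$ for scalars $\alpha$, $\beta$.

2. $\langle x, y\rangle = \overline{\langle y, x\rangle}$

3. $\langle x, x\rangle \ge 0$, and $\langle x, x\rangle = 0$ iff $x = 0$

Definition 46. A Hilbert space is a complete normed linear space under the inner product norm $\|x\| = \langle x, x\rangle^{1/2}$.

Definition 47. Orthogonal: $x$ is orthogonal to $y$ iff $\langle x, y\rangle = 0$.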


Definition 48. Orthogonal complement of $M \subset$ Hilbert space $H$ is

$$M^{\perp} = \{y \in H : \langle x, y\rangle = 0 \ \ \forall x \in M\}$$

Definition 49. Orthonormal set in a Hilbert space $H$ is a set $\{\phi_n\}$ such that $\langle \phi_n, \phi_m\rangle = \delta_{nm}$.

Definition 50. Complete orthonormal set: when $H = \overline{\mathrm{Sp}}\{\phi_n\}$.

Definition 51. Linear functional on a normed linear space $X$ is a linear map $f : X \to \mathbb{R}$.

Definition 52. Algebraic dual of $X$, $X^a$, is the linear vector space of all linear functionals on $X$.

Definition 53. Topological dual of $X$, $X^*$, is the normed linear space of all bounded linear functionals on $X$.

Definition 54. Reflexive space: $X^{**} = X$.

Definition 55. Convergence in norm: $x_n \to x$ in norm if

$$\|x_n - x\| \to 0 \quad \text{as } n \to \infty.$$

Definition 56. Weak convergence: $f(x_n) \to f(x)$ as $n \to \infty$ $\forall f \in X^*$.

Definition 57. Weak* convergence in $X^*$: $x_n^*(x) \to x^*(x)$ as $n \to \infty$, $\forall x \in X$.

Definition 58. Weak compactness: A set $A$ in $X$ is weakly compact if $\forall \{x_n\} \subset A$, there is a weakly convergent subsequence with limit point in $A$.

Definition 59. Weak* compactness: Replace weakly by weak* in Definition 58.

Definition 60. Linear transformation $T : X \to Y$, where $X$, $Y$ are linear vector spaces, is such that $T(\alpha x + \beta y) = \alpha Tx + \beta Ty$. $T$ is an operator if $X = Y$.

Definition 61. Invertible transformation: $T : X \to Y$ is invertible if $\exists\, G : Y \to X$ such that $TG$ and $GT$ are identity maps. $G$ is the inverse of $T$.

Definition 62. Continuous transformation: $T : X \to Y$ is continuous at $x_0$ if for every $\epsilon > 0$, $\exists\, \delta > 0$ such that $\|Tx - Tx_0\| < \epsilon$ whenever $\|x - x_0\| < \delta$.

Definition 63. Bounded transformation: $T : X \to Y$ is bounded if $\|Tx\|_Y \le K\|x\|_X$ for some $K > 0$ and $\forall x \in X$.

Definition 64. $\mathcal{L}(X,Y) = \{T : X \to Y$, where $T$ is bounded and linear$\}$

Definition 65. Adjoint $T^*$ of $T \in \mathcal{L}(H)$, $H$ a Hilbert space, is given by

$$\langle Tx, y\rangle = \langle x, T^*y\rangle \quad \forall x, y \in H$$

Definition 66. Self-adjoint operator: $A \in \mathcal{L}(H)$, when $A = A^*$.

Definition 67. Positive operator: $A \in \mathcal{L}(H)$ if $\langle Ax, x\rangle \ge 0$. Strictly positive if $\langle Ax, x\rangle = 0$ only if $x = 0$.

Definition 68. Closed operator $T$ on a Banach space $X$: if for all sequences $\{x_n\}$ in the domain of $T$, $\mathcal{D}(T)$, with $x_n \to x$ and $Tx_n \to y$, then $x \in \mathcal{D}(T)$ and $y = Tx$.

Definition 69. Resolvent set $\rho(T)$, spectrum $\sigma(T)$ of a linear operator $T$ on a Banach space $X$:

$$\rho(T) = \{\lambda \in \mathbb{C} : (\lambda I - T)^{-1} \in \mathcal{L}(X)\}$$

$$\sigma(T) = \mathbb{C} - \rho(T)$$

Definition 70. Compact operator: $T \in \mathcal{L}(X)$ and $T$ maps bounded sequences into sequences with convergent subsequences.

Definition 71. Gateaux derivative: Given $x$ and $\eta$ in $X$, if

$$Df(x)\eta = \lim_{\epsilon \to 0} \frac{f(x + \epsilon\eta) - f(x)}{\epsilon}$$

exists, then $Df(x)\eta$ is called the Gateaux derivative at $x$ in the direction of $\eta$. If it exists in all directions $\eta$, $Df(x) : X \to Y$ is called the Gateaux derivative at $x$.

Definition 72. Fréchet derivative: Consider $f : X \to Y$; $X$, $Y$ normed linear spaces.

If $\exists\, df(x) \in \mathcal{L}(X,Y)$ such that

$$\lim_{\|h\| \to 0} \left\{\frac{\|f(x+h) - f(x) - df(x)h\|}{\|h\|}\right\} = 0$$

then $f$ is Fréchet differentiable at $x$ and $df(x)(h)$ is the Fréchet differential at $x$ with increment $h$.
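The directional limit in Definition 71 can be checked numerically. The sketch below is illustrative only: the map `f`, the point `x` and the direction `eta` are arbitrary choices made for this example, and the difference quotient is taken in its symmetric (central) form. For a smooth map the Gateaux derivative in the direction $\eta$ is the Jacobian applied to $\eta$, which is what the second helper computes.

```python
def f(x):
    """A smooth map from R^2 to R^2 used only for illustration."""
    return [x[0] ** 2 + x[1], x[0] * x[1]]

def gateaux(f, x, eta, eps=1e-6):
    """Central-difference estimate of lim_{e->0} [f(x + e*eta) - f(x)] / e."""
    plus = f([xi + eps * ei for xi, ei in zip(x, eta)])
    minus = f([xi - eps * ei for xi, ei in zip(x, eta)])
    return [(p - m) / (2.0 * eps) for p, m in zip(plus, minus)]

def jacobian_times(x, eta):
    """Exact derivative: the Jacobian [[2*x0, 1], [x1, x0]] applied to eta."""
    return [2.0 * x[0] * eta[0] + eta[1], x[1] * eta[0] + x[0] * eta[1]]

x = [2.0, 1.0]
eta = [1.0, -1.0]
numeric = gateaux(f, x, eta)
exact = jacobian_times(x, eta)
```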

Definition 73. Probability space is a finite measure space $(\Omega, P, \mu)$ with $\mu(\Omega) = 1$.

Definition 74. Probability distribution function $F$ of a random variable $x$ is

$$F(t) = \mu\{\omega : x(\omega) \le t\}$$

Definition 75. Probability density function $f$ (if it exists) is given by

$$F(t) = \int_{-\infty}^{t} f(s)\,ds$$

Definition 76. Expectation of a random variable $x$ in $L_1(\Omega, P, \mu)$ is

$$E\{x\} = \int_{\Omega} x(\omega)\,d\mu(\omega)$$

Definition 77. Variance of a random variable $x \in L_2(\Omega, P, \mu)$ is

$$\sigma^2(x) = E\{|x - E\{x\}|^2\}$$

Definition 78. Covariance function of two random variables $x, y \in L_2(\Omega, P, \mu)$ is

$$\mathrm{cov}(x,y) = E\{(x - E\{x\})(y - E\{y\})\}$$

Definition 79. Independent random variables $x$, $y$ if

$$\mu\{x(\omega) \le \alpha,\ y(\omega) \le \beta\} = \mu\{x(\omega) \le \alpha\}\,\mu\{y(\omega) \le \beta\}$$

Definition 80. Stochastic process is a family of random variables.

Definition 81. A Gaussian random variable has a probability density function

$$f(t) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(t-a)^2/2\sigma^2}$$

Definition 82. A Gaussian stochastic process $\{x_t\}$ is one for which $x_t$ is a Gaussian random variable.


NAYLOR, A.W., SELL, G.R., Linear Operator Theory in Engineering and Science, Holt, Rinehart and Winston (1971).

TAYLOR, A.E., Introduction to Functional Analysis, Wiley, New York (1967).

SIMMONS, G.F., Introduction to Topology and Modern Analysis, McGraw-Hill (1963).

BACHMAN, G., NARICI, L., Functional Analysis, Academic Press, New York (1966).

YOSIDA, K., Functional Analysis, Springer, Berlin (1966).

KANTOROVICH, L.V., AKILOV, G.P., Functional Analysis in Normed Spaces, Moscow (1955).

DUNFORD, N., SCHWARTZ, J., Linear Operators I and II, Interscience Publ. (1963).

CONTROL THEORY AND APPLICATIONS

A.J. PRITCHARD, Control Theory Centre, University of Warwick, Coventry, United Kingdom

IAEA-SMR-17/3

Abstract

CONTROL THEORY AND APPLICATIONS. Most of the theories in control have been developed assuming a mathematical model in terms of differential equations. This part of the course will examine the existence, uniqueness, and regularity of solutions of both ordinary and partial, linear and non-linear differential equations.

A. ORDINARY DIFFERENTIAL EQUATIONS

1. Linear autonomous systems

The simplest systems in control theory are those described by equations of the form

$$\dot{x} = Ax, \quad x(0) = \bar{x} \qquad (1.1)$$

where $x : [0,t] \to R^n$, $A \in \mathcal{L}(R^n, R^n)$ is represented by an $n \times n$ matrix, and $x$ is assumed to be in $C^1[0,t]$ so that the above equation makes sense.

Very many linear differential equations can be formulated in this manner. For instance, consider the damped harmonic oscillator

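$$\ddot{x} + k\dot{x} + \omega^2 x = 0, \quad x(0) = x_0, \quad \dot{x}(0) = x_1$$

Introducing $\dot{x} = y$, we find

$$\dot{x} = y, \qquad \dot{y} = -ky - \omega^2 x$$

or

$$\begin{bmatrix}\dot{x}\\ \dot{y}\end{bmatrix} = \begin{bmatrix}0 & 1\\ -\omega^2 & -k\end{bmatrix}\begin{bmatrix}x\\ y\end{bmatrix}, \qquad \begin{bmatrix}x(0)\\ y(0)\end{bmatrix} = \begin{bmatrix}x_0\\ x_1\end{bmatrix}$$

It is very useful to have an explicit representation of the solution of Eq.(1.1). To do this, we first introduce the normed linear space of $n \times n$ matrices.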

Definition

If $A$ is a matrix of numbers $(a_{ij})$ with $n$ rows and $n$ columns we define the norm of $A$, $\|A\|$, by

$$\|A\| = \sup_{x \ne 0} \frac{\|Ax\|_{R^n}}{\|x\|_{R^n}}$$

where $\|\cdot\|_{R^n}$ is any norm on $R^n$. It is easy to show that

b) $\|A + B\| \le \|A\| + \|B\|$

c) $\|AB\| \le \|A\|\,\|B\|$

We have, in fact, introduced many different spaces depending on the definition of $\|\cdot\|_{R^n}$. However, it is easy to see that all these spaces are topologically isomorphic (because of the equivalence of norms on $R^n$).

Definition: The exponential

We define $e^A$ by

$$e^A = I + A + \frac{A^2}{2!} + \dots + \frac{A^n}{n!} + \dots$$

where $I$ is the identity matrix and where we assume $\|A\| \le k$, say. Then $e^A$ is well defined since

$$\|e^A\| \le 1 + k + \frac{k^2}{2!} + \dots = e^k$$

and it is easy to show the following properties

a) $\dfrac{d}{dt} e^{At} = A e^{At} = e^{At} A$

c) $\det e^A = e^{\operatorname{tr} A}$, where $\operatorname{tr} A$ is the trace of $A$

d) However, $e^{A+B} \ne e^A e^B$ unless $AB = BA$
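Properties c) and d) can be checked on small matrices. The following is a minimal sketch that sums the power series $I + A + A^2/2! + \dots$ directly; the helper names (`mat_mul`, `mat_add`, `mat_exp`, `det2`) and the test matrices are assumptions made for this example, and a truncated series is only adequate when $\|A\|$ is small.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_exp(A, terms=40):
    """Truncated exponential series I + A + A^2/2! + ... with `terms` terms."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, A)]   # A^k / k!
        result = mat_add(result, term)
    return result

def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

# Two matrices that do not commute: AB != BA.
A = [[0.0, 1.0], [0.0, 0.0]]
B = [[0.0, 0.0], [1.0, 0.0]]
exp_sum = mat_exp(mat_add(A, B))            # e^{A+B}
exp_prod = mat_mul(mat_exp(A), mat_exp(B))  # e^A e^B

# Property c): det e^C = e^{tr C}.
C = [[1.0, 2.0], [3.0, -0.5]]
det_exp_C = det2(mat_exp(C))
exp_trace_C = math.exp(C[0][0] + C[1][1])
```

Here $A^2 = B^2 = 0$, so $e^A = I + A$ and $e^B = I + B$ exactly, while $e^{A+B}$ has entries $\cosh 1$ and $\sinh 1$; the two results differ, as property d) predicts.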

Solution of Eq.(1.1)

From property (a) above it is obvious that

$$x(t) = e^{At}\bar{x}$$

is the required solution.

Let us illustrate this by considering the following example:

Example

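$$\ddot{x} + x = 0, \quad x(0) = x_0, \quad \dot{x}(0) = x_1$$

Then, with $y = \dot{x}$,

$$\begin{bmatrix}\dot{x}\\ \dot{y}\end{bmatrix} = \begin{bmatrix}0 & 1\\ -1 & 0\end{bmatrix}\begin{bmatrix}x\\ y\end{bmatrix}, \qquad \begin{bmatrix}x(0)\\ y(0)\end{bmatrix} = \begin{bmatrix}x_0\\ x_1\end{bmatrix}$$

To calculate $e^{At}$ it is sometimes useful to use the characteristic equation

$$\det(\lambda I - A) = 0$$

In our case this becomes

$$\lambda^2 + 1 = 0$$

But we know that every matrix satisfies its own characteristic equation, so that

$$A^2 + I = 0$$

From this we obtain

$$A^{2n} = (-1)^n I$$

Hence

$$e^{At} = I\,[1 - t^2/2! + t^4/4! - \cdots] + A\,[t - t^3/3! + t^5/5! - \cdots]$$

$$= I\cos t + A\sin t$$

Then

$$\begin{bmatrix}x(t)\\ y(t)\end{bmatrix} = \begin{bmatrix}\cos t & \sin t\\ -\sin t & \cos t\end{bmatrix}\begin{bmatrix}x_0\\ x_1\end{bmatrix}$$

as required.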
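The closed form $e^{At} = I\cos t + A\sin t$ for this example can be compared with a direct summation of the exponential series. This is a sketch under stated assumptions: the helper names, the time $t = 0.8$ and the initial data are chosen for illustration only.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_exp(A, terms=40):
    """Truncated exponential series I + A + A^2/2! + ... with `terms` terms."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, A)]
        result = [[r + s for r, s in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

t = 0.8
series = mat_exp([[0.0, t], [-t, 0.0]])      # e^{At} with A = [[0, 1], [-1, 0]]
closed = [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]

x0, x1 = 1.0, 0.5                            # initial displacement and velocity
x_t = series[0][0] * x0 + series[0][1] * x1  # first component of e^{At} [x0, x1]
```

The first component reproduces the familiar solution $x(t) = x_0\cos t + x_1\sin t$ of $\ddot{x} + x = 0$.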

The inhomogeneous equation

Now consider

$$\dot{x} = Ax + f(t), \quad x(0) = \bar{x} \qquad (1.2)$$

where for the moment we will assume that $f$ is continuous. Then the solution is

$$x(t) = e^{At}\bar{x} + \int_0^t e^{A(t-s)} f(s)\,ds$$

We note that this solution is well defined, and verify that it is a solution of Eq.(1.2) by the direct calculation:

$$\dot{x}(t) = Ae^{At}\bar{x} + e^{A(t-t)} f(t) + \int_0^t A e^{A(t-s)} f(s)\,ds$$

$$= Ax(t) + f(t)$$

In particular, if $f(t) = Bu(t)$, where $u \in R^m$ is to be thought of as a control and $B$ is an $n \times m$ matrix, then

$$x(t) = e^{At}\bar{x} + \int_0^t e^{A(t-s)} B u(s)\,ds$$

is the solution of

$$\dot{x} = Ax + Bu, \quad x(0) = \bar{x}$$

Of course, this requires that u(t) be continuous, and this is usually too strong a requirement for control problems. However, before we try to overcome this difficulty we really need to examine what we mean by a solution to a differential equation. This will be carried out in the next section, and then we shall return to the above problem.
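For a continuous control the variation-of-constants formula $x(t) = e^{At}\bar{x} + \int_0^t e^{A(t-s)}Bu(s)\,ds$ can be evaluated by numerical quadrature. The sketch below assumes the oscillator matrix $A = \begin{bmatrix}0 & 1\\ -1 & 0\end{bmatrix}$ from the earlier example (so that $e^{Ar} = I\cos r + A\sin r$), a single input with $B = (0, 1)^T$, the constant control $u \equiv 1$, and zero initial state; the function names and the Simpson rule are choices made for this illustration.

```python
import math

def exp_At(r):
    """e^{Ar} for A = [[0, 1], [-1, 0]], using the closed form I cos r + A sin r."""
    return [[math.cos(r), math.sin(r)], [-math.sin(r), math.cos(r)]]

def simpson(g, a, b, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def solve(t, x_bar, B, u):
    """x(t) = e^{At} x_bar + integral_0^t e^{A(t-s)} B u(s) ds, componentwise."""
    E = exp_At(t)
    free = [E[i][0] * x_bar[0] + E[i][1] * x_bar[1] for i in range(2)]

    def integrand(i):
        return lambda s: sum(exp_At(t - s)[i][j] * B[j] for j in range(2)) * u(s)

    forced = [simpson(integrand(i), 0.0, t) for i in range(2)]
    return [free[i] + forced[i] for i in range(2)]

# x'' + x = 1 with x(0) = x'(0) = 0 has the solution x = 1 - cos t, x' = sin t.
state = solve(2.0, [0.0, 0.0], [0.0, 1.0], lambda s: 1.0)
```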

2. Existence of solutions

Consider the system of non-linear ordinary differential equations

$$\dot{x} = f(t,x), \quad x(\tau) = \bar{x} \qquad (2.1)$$

where $x$ is a vector-valued function defined on an interval $I = [\tau, \tau + a]$ and $f(t,x)$ is also vector-valued and defined on $I \times B = D$, where

$$B = \{x \in R^n : \|x - \bar{x}\| \le b\}$$

We will assume that $f \in C^0(D)$ and say that $x(t)$ is a solution if

a) $(t, x(t)) \in D$, $t \in I$

b) $\dot{x}(t) = f(t, x(t))$, $x(\tau) = \bar{x}$

If $x$ is a solution on $I$ then clearly $x \in C^1(I)$. Integrating Eq.(2.1) gives

$$x(t) = \bar{x} + \int_\tau^t f(s, x(s))\,ds \qquad (2.2)$$

Obviously (because $f \in C^0(D)$), Eq.(2.1) will have a solution if and only if Eq.(2.2) has a solution. There are many different kinds of existence and uniqueness theorems for Eq.(2.2). We shall describe the Picard-Lindelöf theorem because of its historical importance, and also because the proof enables the solution to be constructed.

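Picard-Lindelöf theorem

Let $f \in C^0(D)$ and satisfy a Lipschitz condition

$$\|f(t,x) - f(t,y)\| \le K\|x - y\|, \quad t \in I, \quad x, y \in B$$

Assume $\|f(t,x)\| \le m$ for $(t,x) \in D$ and set $c = \min(a, b/m)$. Then for $\tau \le t \le \tau + c$ there exists a unique solution with

$$x(t) \in B$$

Proof

Consider the sequence of successive approximations

$$x_0(t) = \bar{x}$$

$$x_{k+1}(t) = \bar{x} + \int_\tau^t f(s, x_k(s))\,ds \qquad (2.3)$$

We will show by induction that $x_k$ exists on $[\tau, \tau + c]$, $x_k \in C^1$ and

$$\|x_k(t) - \bar{x}\| \le m(t - \tau), \quad k = 0, 1, 2, \dots$$

Obviously, $x_0$ satisfies these conditions. Assume $x_k$ does the same; then, since $m(t - \tau) \le mc \le b$, $f(t, x_k(t))$ is defined and continuous on $[\tau, \tau + c]$. Hence from Eq.(2.3) $x_{k+1}$ exists on $[\tau, \tau + c]$, $x_{k+1} \in C^1$, and obviously

$$\|x_{k+1}(t) - \bar{x}\| \le \int_\tau^t \|f(s, x_k(s))\|\,ds \le m(t - \tau)$$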

It now remains to show the convergence of $x_k$. Set

$$\Delta_k(t) = \|x_{k+1}(t) - x_k(t)\|, \quad t \in [\tau, \tau + c]$$

Then

$$\Delta_{k+1}(t) = \Big\|\int_\tau^t [f(s, x_{k+1}(s)) - f(s, x_k(s))]\,ds\Big\|$$

Hence

$$\Delta_{k+1}(t) \le K\int_\tau^t \Delta_k(s)\,ds$$

But

$$\Delta_0(t) = \|x_1(t) - \bar{x}\| \le m(t - \tau)$$

So, by induction,

$$\Delta_k(t) \le mK^k (t - \tau)^{k+1}/(k+1)!$$

This shows that the series

$$\sum_{k=0}^{\infty} \Delta_k(t)$$

is majorized by the series $(m/K)(e^{Kc} - 1)$ and hence the series is uniformly convergent on $[\tau, \tau + c]$. Thus the series

$$x_0(t) + \sum_{k=0}^{\infty} (x_{k+1}(t) - x_k(t))$$

is uniformly convergent and the partial sums

$$x_n(t) = x_0(t) + \sum_{k=0}^{n-1} (x_{k+1}(t) - x_k(t))$$

tend uniformly on $[\tau, \tau + c]$ to a continuous limit function $x$. We now need to show that $x$ satisfies Eq.(2.2). Clearly,

$$\Big\|\int_\tau^t [f(s, x(s)) - f(s, x_k(s))]\,ds\Big\| \le \int_\tau^t \|f(s, x(s)) - f(s, x_k(s))\|\,ds \le K\int_\tau^t \|x(s) - x_k(s)\|\,ds$$

Now $\|x(s) - x_k(s)\| \to 0$ as $k \to \infty$ uniformly on $[\tau, \tau + c]$ and hence $x(s)$ satisfies Eq.(2.2). To show the uniqueness of this solution, let us suppose there are two solutions $x_1$, $x_2$ on $[\tau, \tau + c]$. Then

$$x_1(t) - x_2(t) = \int_\tau^t [f(s, x_1(s)) - f(s, x_2(s))]\,ds$$

Hence

$$\|x_1(t) - x_2(t)\| \le K\int_\tau^t \|x_1(s) - x_2(s)\|\,ds$$

by the Lipschitz condition. We conclude the proof by an application of the following lemma:

Gronwall's lemma

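Let $a \in L^1(\tau, \tau + c)$, $a(t) \ge 0$, $\Delta \in L^\infty(\tau, \tau + c)$ and assume that $b$ is absolutely continuous on $[\tau, \tau + c]$. If

$$\Delta(t) \le b(t) + \int_\tau^t a(s)\Delta(s)\,ds$$

then

$$\Delta(t) \le b(\tau)\exp\Big(\int_\tau^t a(s)\,ds\Big) + \int_\tau^t \dot{b}(s)\exp\Big(\int_s^t a(\xi)\,d\xi\Big)ds$$

Proof

Set

$$H(t) = \int_\tau^t a(s)\Delta(s)\,ds$$

Then

$$\Delta(t) \le b(t) + H(t)$$

and

$$\dot{H}(t) = a(t)\Delta(t) \quad \text{almost everywhere,}$$

so that $\dot{H}(t) - a(t)H(t) \le a(t)b(t)$ almost everywhere. Multiplying by the integrating factor

$$\exp\Big(-\int_\tau^t a(s)\,ds\Big)$$

we find

$$\frac{d}{dt}\Big[H(t)\exp\Big(-\int_\tau^t a(s)\,ds\Big)\Big] \le a(t)b(t)\exp\Big(-\int_\tau^t a(s)\,ds\Big)$$

Integrating from $\tau$ to $t$, since $H(\tau) = 0$, we obtain

$$H(t)\exp\Big(-\int_\tau^t a(s)\,ds\Big) \le -\int_\tau^t b(s)\,\frac{d}{ds}\exp\Big(-\int_\tau^s a(\xi)\,d\xi\Big)ds$$

$$= b(\tau) - b(t)\exp\Big(-\int_\tau^t a(s)\,ds\Big) + \int_\tau^t \dot{b}(s)\exp\Big(-\int_\tau^s a(\xi)\,d\xi\Big)ds$$

Multiplying through by $\exp\big(\int_\tau^t a(s)\,ds\big)$ and using $\Delta(t) \le b(t) + H(t)$ gives the stated bound. Hence the lemma is proved.

To apply the lemma we set

$$\Delta(t) = \|x_1(t) - x_2(t)\|, \quad b(t) = 0, \quad a(t) = K$$

Hence

$$\|x_1(t) - x_2(t)\| \le 0$$

and so

$$x_1(t) = x_2(t)$$

This concludes the proof of the theorem.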
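The successive approximations used in the proof can be carried out exactly for a simple right-hand side. The sketch below assumes the scalar problem $\dot{x} = x$, $x(0) = 1$ (so $f(t,x) = x$, $\tau = 0$, $K = 1$) and stores each iterate as a list of polynomial coefficients; the function names are hypothetical and the example is not taken from the notes.

```python
from fractions import Fraction
from math import factorial

def picard_step(coeffs, x_bar):
    """One Picard step for x' = x: x_{k+1}(t) = x_bar + integral_0^t x_k(s) ds.

    Polynomials are stored as coefficient lists [c0, c1, ...] in powers of t.
    """
    integrated = [Fraction(0)] + [c / (j + 1) for j, c in enumerate(coeffs)]
    integrated[0] = Fraction(x_bar)   # the constant of integration is x_bar
    return integrated

def picard(x_bar, steps):
    """Return the coefficients of the iterate x_steps, starting from x_0(t) = x_bar."""
    x_k = [Fraction(x_bar)]
    for _ in range(steps):
        x_k = picard_step(x_k, x_bar)
    return x_k

x_4 = picard(1, 4)   # coefficients of 1 + t + t^2/2! + t^3/3! + t^4/4!
```

Each iterate is a Taylor partial sum of $e^t$, and the difference $x_{k+1}(t) - x_k(t) = t^{k+1}/(k+1)!$ has the same factorial decay as the estimate for $\Delta_k(t)$ in the proof.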

3. Extension of the idea of a solution

The discussion in Section 2 required that f be continuous in the (t,x)-domain. This is very restrictive for control problems since if we consider the system

$$\dot{x} = f(t, u(t), x)$$

where $u$ is a control, we do not wish to consider only continuous controls. In proving the existence and uniqueness we wrote the differential equation

$$\dot{x} = f(t,x), \quad x(\tau) = \bar{x}$$

in an equivalent integral form

$$x(t) = \bar{x} + \int_\tau^t f(s, x(s))\,ds \qquad (3.1)$$

Now, Eq.(3.1) will make sense for a wider class of $f$ than those in $C^0(D)$. In particular, we may ask whether or not an absolutely continuous function $x$ defined on $I$ satisfies the above integral equation. In this case, the differential equation

$$\dot{x} = f(t,x)$$

will only be satisfied almost everywhere (i.e. except on a set of Lebesgue measure zero). Defining the solution in this way, Carathéodory proved the following theorem:

Theorem

Let $f$ be measurable in $t$ for each fixed $x$ and continuous in $x$ for each fixed $t$. If there exists a Lebesgue integrable function $m$ on the interval $[\tau, \tau + a]$ such that

$$\|f(t,x)\| \le m(t), \quad (t,x) \in D$$

then there exists a solution $x(t)$ on some interval $[\tau, \tau + \beta]$ satisfying $x(\tau) = \bar{x}$.

Proof

Define $M$ by

$$M(t) = 0, \quad t < \tau$$

$$M(t) = \int_\tau^t m(s)\,ds, \quad t \in [\tau, \tau + a]$$

Then $M$ is continuous, non-decreasing and $M(\tau) = 0$. Hence there exists $\beta$ such that $(t,x) \in D$ for $t \in [\tau, \tau + \beta]$ if $\|x - \bar{x}\| \le M(t)$. The following iteration scheme is now introduced. Set

$$x_j(t) = \bar{x}, \quad t \in [\tau, \tau + \beta/j]$$

$$x_j(t) = \bar{x} + \int_\tau^{t - \beta/j} f(s, x_j(s))\,ds, \quad t \in (\tau + \beta/j, \tau + \beta]$$

We now show that this scheme defines $x_j(t)$ as a continuous function on $[\tau, \tau + \beta]$. Clearly, $x_1(t) = \bar{x}$ is well defined. For any $j \ge 1$, the first formula defines $x_j$ on $[\tau, \tau + \beta/j]$ and since $(t, \bar{x}) \in D$ for $t \in [\tau, \tau + \beta/j]$ the second formula defines $x_j$ as a continuous function on $(\tau + \beta/j, \tau + 2\beta/j]$. Furthermore, on this interval

$$\|x_j(t) - \bar{x}\| \le M(t - \beta/j)$$

We can now define $x_j$ on $(\tau + 2\beta/j, \tau + 3\beta/j]$ by the second formula, and so on. Note that for any two points $t_1$, $t_2$ we have

$$\|x_j(t_1) - x_j(t_2)\| \le |M(t_1 - \beta/j) - M(t_2 - \beta/j)|$$

This implies that the set $\{x_j(t)\}$ is an equicontinuous, uniformly bounded set on $[\tau, \tau + \beta]$. Hence, by Ascoli's theorem, $\exists$ a subsequence $x_{j_k}$ which converges uniformly on $[\tau, \tau + \beta]$ to a continuous limit function $x$ as $k \to \infty$.

We now show that this limit function is a required solution. To do this, we shall apply the dominated-convergence theorem of Lebesgue. First note that

$$\|f(t, x_j(t))\| \le m(t), \quad t \in [\tau, \tau + \beta]$$

and since $f$ is continuous in $x$ for fixed $t$:

$$f(t, x_{j_k}(t)) \to f(t, x(t)) \quad \text{as } k \to \infty$$

Then by the dominated-convergence theorem

$$\lim_{k \to \infty}\int_\tau^t f(s, x_{j_k}(s))\,ds = \int_\tau^t f(s, x(s))\,ds$$

But

$$x_{j_k}(t) = \bar{x} + \int_\tau^t f(s, x_{j_k}(s))\,ds - \int_{t - \beta/j_k}^{t} f(s, x_{j_k}(s))\,ds$$

and we have

$$\Big\|\int_{t - \beta/j_k}^{t} f(s, x_{j_k}(s))\,ds\Big\| \to 0 \quad \text{as } k \to \infty$$

Hence

$$x(t) = \bar{x} + \int_\tau^t f(s, x(s))\,ds$$

The conditions of the theorem only guarantee the existence of a solution. To obtain unique solutions, it is necessary to impose further conditions, for example Lipschitz conditions.

Another important problem is that of the continuation of solutions beyond $\tau + \beta$. We shall not consider this problem but refer those interested to the bibliography.
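The delayed iteration scheme in the proof of Carathéodory's theorem, $x_j(t) = \bar{x}$ on $[\tau, \tau + \beta/j]$ and $x_j(t) = \bar{x} + \int_\tau^{t-\beta/j} f(s, x_j(s))\,ds$ afterwards, is explicit and can be run on a grid. The following is only a rough sketch under assumptions made for this example: $\tau = 0$, $\beta = 1$, $\bar{x} = 1$, the right-hand side $f(t,x) = -x + u(t)$ with a control $u$ that jumps from 0 to 1 at $t = 1/2$ (so $f$ is discontinuous in $t$), and a trapezoidal rule for the integral. The function names and the grid parameter `r` are hypothetical.

```python
import math

def delayed_scheme(f, x_bar, beta, j, r=5):
    """Grid values of x_j on [0, beta]; the delay beta/j spans r grid steps."""
    n = j * r
    h = beta / n
    x = [x_bar] * (n + 1)
    integral = [0.0] * (n + 1)        # integral[i] ~ int_0^{t_i} f(s, x_j(s)) ds
    for i in range(1, n + 1):
        if i > r:                     # t_i > beta/j: use the delayed integral
            x[i] = x_bar + integral[i - r]
        integral[i] = integral[i - 1] + 0.5 * h * (
            f((i - 1) * h, x[i - 1]) + f(i * h, x[i])
        )
    return x

def u(t):
    """A discontinuous (but measurable and bounded) control."""
    return 1.0 if t >= 0.5 else 0.0

def f(t, x):
    return -x + u(t)

def exact(t):
    """Absolutely continuous solution of x' = -x + u(t), x(0) = 1."""
    if t <= 0.5:
        return math.exp(-t)
    return 1.0 + (math.exp(-0.5) - 1.0) * math.exp(-(t - 0.5))

err_10 = abs(delayed_scheme(f, 1.0, 1.0, 10)[-1] - exact(1.0))
err_200 = abs(delayed_scheme(f, 1.0, 1.0, 200)[-1] - exact(1.0))
```

In this example the error at $t = 1$ shrinks as $j$ grows. The theorem itself only guarantees a convergent subsequence; here the whole sequence converges because the solution is unique.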

4. Linear systems

We now return to the problem

$$\dot{x} = Ax + Bu, \quad x(0) = \bar{x}$$

where $u$ is measurable on $I$ and $\|u(t)\| \le m(t)$, where $m$ is Lebesgue integrable on $I$. We see that $f(t,x) = Ax + Bu$ is continuous in $x$ for each $t$ and measurable in $t$ for each $x$. Moreover, for $(t,x) \in D$

$$\|f(t,x)\| \le \|A\|\,\|x\| + \|B\|\,m(t)$$

$f(t,x)$ satisfies a Lipschitz condition since

$$\|f(t,x_1) - f(t,x_2)\| = \|A(x_1 - x_2)\| \le \|A\|\,\|x_1 - x_2\|$$

Hence all the requirements of Carathéodory's theorem are satisfied and $\exists$ a unique, absolutely continuous solution characterized by

$$x(t) = e^{At}\bar{x} + \int_0^t e^{A(t-s)} B u(s)\,ds$$

Non-autonomous systems

We can generalize the problem to consider equations of the form

$$\dot{x} = A(t)x, \quad x(\tau) = \bar{x}$$

where we assume that $A$ is measurable on $I$ and

$$\|A(t)\| \le m(t), \quad t \in I$$

where $m$ is Lebesgue integrable, and uniformly bounded. Then by Carathéodory there exists a unique solution.

Now let us consider the problem for which $\bar{x} = e_i$, $i = 1, 2, \dots, n$, where the $e_i$ are a basis for $R^n$. We might as well choose $e_i$ to be the vector with zeros everywhere except in the $i$-th place where there is a one. We denote the solution of this equation by $\phi_i(t,\tau)$. For any $\bar{x} \in R^n$ there is a unique expansion

$$\bar{x} = \sum_{i=1}^{n} \alpha_i e_i$$

and by linearity the corresponding solution will be

$$x(t) = \sum_{i=1}^{n} \alpha_i \phi_i(t,\tau)$$

If we construct the matrix $\Phi(t,\tau)$ for which each column is the vector $\phi_i(t,\tau)$ then

$$\dot{\Phi}(t,\tau) = A(t)\Phi(t,\tau)$$

Moreover, since

$$\phi_i(\tau,\tau) = e_i, \quad \Phi(\tau,\tau) = I$$

We call $\Phi(t,\tau)$ the fundamental matrix. The solution is given by

$$x(t) = \sum_{i=1}^{n} \alpha_i \phi_i(t,\tau) = \Phi(t,\tau)\,\bar{x}$$

Note 1

In the case $A(t) = A$ it is easy to verify that

$$\Phi(t,\tau) = e^{A(t-\tau)}$$

Note 2

In general,

$$\Phi(t,\tau) \ne \exp\int_\tau^t A(s)\,ds$$
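Note 2 can be illustrated with a piecewise-constant $A(t)$ whose two values do not commute. The sketch assumes $A(t) = A_1$ on $[0,1)$ and $A(t) = A_2$ on $[1,2]$, applies Note 1 on each piece, and uses the composition rule $\Phi(2,0) = \Phi(2,1)\Phi(1,0)$, which is not stated in the text but follows from uniqueness of solutions. The matrices and helper names are choices made for this example.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_exp(A, terms=30):
    """Truncated exponential series I + A + A^2/2! + ... with `terms` terms."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, A)]
        result = mat_add(result, term)
    return result

A1 = [[0.0, 1.0], [0.0, 0.0]]   # A(t) for 0 <= t < 1
A2 = [[0.0, 0.0], [1.0, 0.0]]   # A(t) for 1 <= t <= 2

# Fundamental matrix: Phi(2, 0) = Phi(2, 1) Phi(1, 0) = e^{A2} e^{A1}.
phi_20 = mat_mul(mat_exp(A2), mat_exp(A1))

# Exponential of the integral of A over [0, 2], which equals A1 + A2.
exp_integral = mat_exp(mat_add(A1, A2))
```

Here $\Phi(2,0) = \begin{bmatrix}1 & 1\\ 1 & 2\end{bmatrix}$, whereas $\exp\int_0^2 A(s)\,ds$ has diagonal entries $\cosh 1$ and off-diagonal entries $\sinh 1$.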

Non-homogeneous system

For the controlled system

$$\dot{x} = A(t)x + Bu, \quad x(\tau) = \bar{x}$$

where

$$u \in L^2(I, R^m), \quad B \in \mathcal{L}(R^m, R^n)$$

it is easy to show that the solution is

$$x(t) = \Phi(t,\tau)\,\bar{x} + \int_\tau^t \Phi(t,s)\,B u(s)\,ds$$

B. PARTIAL DIFFERENTIAL EQUATIONS

1. Introduction

The next section will be concerned with partial differential equations. There is a vast literature on this subject and it is obviously impossible to give anything more than a flavour of the subject in this paper. We shall, first of all, motivate the introduction of distributions and weak solutions, then go on to semigroups and mild solutions, and finally use monotone-operator theory to deduce results for non-linear partial differential equations.

A partial differential equation for a scalar function $u$ is a relation of the form

$$F(x, y, \dots, u, u_x, u_y, u_{xx}, u_{xy}, \dots) = 0$$

where

$$u_x = \frac{\partial u}{\partial x}, \quad u_{xy} = \frac{\partial^2 u}{\partial x\,\partial y}, \quad \text{etc.}$$

It may happen that this equation is supplemented by constraints on $u$ and its partial derivatives on the boundary $\Gamma$ of the region $\Omega$ throughout which the independent variables $x, y, \dots$ vary. These constraints are called boundary conditions, and if one of the variables is identified as time the constraint associated with that variable is called an initial condition. The order of the partial differential equation is the order of the highest derivative occurring in $F$. We shall use, throughout this paper, three examples of partial differential equations which are of important physical significance. These examples are representatives of three classes of partial differential equations: parabolic, elliptic, and hyperbolic, and we shall use them to illustrate the general results.

The equation governing the conduction of heat: parabolic

A good approximation to modelling the variation of the temperature $T$ in a rod of length $\ell$ is given by

$$T_{t'} = k\,T_{x'x'}$$

This can be simplified by introducing $x'/\ell = x$, $t' = (\ell^2/k)\,t$, when

$$T_t = T_{xx}$$

The boundary conditions may take a variety of forms, e.g.

a) $T(0,t) = T(1,t) = h_1(t)$

b) $T_x|_{x=0} = T_x|_{x=1} = h_2(t)$

where $h_1$, $h_2$ are given functions of time. The initial condition could be of the form

$$T(x,0) = T_0(x)$$

for some given function $T_0(x)$.

Laplace equation: elliptic

In R2 this takes the form

$$\nabla^2\phi = \phi_{xx} + \phi_{yy} = 0$$

The equation represents many different phenomena ranging from the potential of some electric field to the stream function of a fluid flow. A variety of boundary conditions can be imposed depending on the particular physical situation. If, for example, the boundary is a circle $C$ we could have

$$\phi|_C = f_1(C)$$

or

$$\phi_n|_C = f_2(C)$$

where $\phi_n$ is the derivative of $\phi$ in an outward direction normal to $C$ and $f_1$, $f_2$ are two functions defined on $C$.

The wave equation: hyperbolic

Transverse vibrations of a taut string are governed by the wave equation

$$z_{tt} = z_{xx}$$

where $z$ is the displacement from the $x$-axis. If the ends of the string are fixed, then the boundary conditions are

$$z(0,t) = z(1,t) = 0$$

The initial conditions could be

$$z(x,0) = g_1(x)$$

$$z_t(x,0) = g_2(x)$$

where $g_1$, $g_2$ are given functions.

A classical or strict solution to a partial differential equation (p.d.e.) is defined to be a function $u(x,y,\dots)$ such that all the derivatives which appear in the p.d.e. exist, are continuous, and such that the p.d.e., boundary conditions, and initial conditions are all satisfied. It may be thought that with this definition of a solution the main emphasis should be focussed on deriving methods by which a solution can be obtained. However, the story is not quite so simple. If we consider the wave equation

$$z_{tt} = z_{xx}, \quad z(0,t) = z(1,t) = 0$$

$$z(x,0) = z_0(x)$$

$$z_t(x,0) = 0$$

then it is easily verified that the solution is

$$z(x,t) = \tfrac{1}{2}[z_0(x+t) + z_0(x-t)]$$

However, if $z_0(x)$ is given by

$$z_0(x) = x, \quad 0 \le x \le \tfrac{1}{2}$$

$$z_0(x) = 1 - x, \quad \tfrac{1}{2} \le x \le 1$$

then the above solution cannot be a classical solution since $z_0(x)$ is not differentiable at $x = \tfrac{1}{2}$. So what do we mean by a solution to this problem? It is obvious that we must widen the concept of a solution if we are going to allow initial conditions of the form given above.

Let us consider another example. An obvious computational method for determining a solution to the heat conduction equation is to approximate the equation by

$$\frac{T(x, t + \Delta t) - T(x,t)}{\Delta t} = T_{xx}(x,t)$$

If we assume that $T(x,0)$ is a function of the form $z_0(x)$ (as given above), then the computation cannot even start because it is not possible to evaluate

$$T_{xx}(x,0) \ \text{at} \ x = 1/2$$
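The difficulty at $x = 1/2$ can be seen numerically. The sketch below replaces $T_{xx}$ by the usual central second difference, which is an assumption of this illustration (the notes do not specify a spatial discretization), and evaluates it on the tent-shaped profile for decreasing step sizes.

```python
def z0(x):
    """Tent-shaped initial profile: x on [0, 1/2], 1 - x on [1/2, 1]."""
    return x if x <= 0.5 else 1.0 - x

def second_difference(g, x, h):
    """Central difference approximation to g''(x) with step h."""
    return (g(x + h) - 2.0 * g(x) + g(x - h)) / (h * h)

steps = [0.1, 0.01, 0.001]
at_kink = [second_difference(z0, 0.5, h) for h in steps]   # behaves like -2/h
away = [second_difference(z0, 0.25, h) for h in steps]     # essentially zero
```

Away from the kink the second difference vanishes up to rounding, while at $x = 1/2$ it equals $-2/h$ and grows without bound as $h \to 0$, so no finite value of $T_{xx}(1/2, 0)$ is being approximated.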

It is obvious, therefore, that reasonable definitions of solutions must allow for these eccentricities, and must include a link between the spaces in which the initial and boundary conditions lie, and the space in which the solution is sought.

If we transform the wave equation by setting

$$y = x - t$$

$$p = x + t$$

we find

$$z_{py} = 0 \quad \text{or} \quad z_{yp} = 0$$

The equation $z_{py} = 0$ is satisfied by any function of $y$ only, but the expression $z_{yp}$ need not make sense for every such function. This is most peculiar and indicates the need for some generalized concept of a function.

We shall now show how the unnatural results of the above equation can be resolved by formulating a different concept of a solution, and this will lead to particular generalized functions — distributions.

If $z \in C^2(\Omega)$ (the space of twice continuously differentiable functions on $\Omega$), where $\Omega$ is some bounded domain in $R^2$, and $f \in C^0(\Omega)$, then

$$z_{py} = f \qquad (1.1)$$

makes sense. Integrating by parts twice, we find

$$\iint_\Omega z\,\phi_{yp}\,dy\,dp = \iint_\Omega f\,\phi\,dy\,dp \qquad (1.2)$$

for all

$$\phi \in C_0^\infty(\Omega)$$

We define a weak solution to be a $z$ such that formula (1.2) holds. If $f$ is $C^0(\Omega)$ and $z \in C^2(\Omega)$ then the weak solution is a classical solution.

However, the concept of a weak solution allows us to consider a larger class of $f$. Note that since

$$\phi_{py} = \phi_{yp}$$

we have

$$\iint_\Omega z\,\phi_{py}\,dy\,dp = \iint_\Omega z\,\phi_{yp}\,dy\,dp = \iint_\Omega f\,\phi\,dy\,dp$$

so that the weak solutions of

$$z_{py} = f \quad \text{and} \quad z_{yp} = f$$

are the same.

One way of looking at the weak solution is to think of $z_{py}$ as being represented by the linear form

$$\phi \mapsto \iint_\Omega z\,\phi_{yp}\,dy\,dp$$

Then the study of differential operators leads to the study of continuous linear functionals on $C_0^\infty(\Omega)$.

2. Distribution theory

There are many different ways in which distributions can be defined. We shall choose to define them as elements of the dual of a certain space $D(\Omega)$, i.e. $D'(\Omega)$. So we shall first of all define and give some properties of the space.

The space $D(\Omega)$

Let $K$ be any compact subset of $\Omega \subset R^m$ and let $D_K(\Omega)$ be the set of all functions $\phi \in C^\infty(\Omega)$ such that the supports of the $\phi$'s are in $K$. We define a family of seminorms on $D_K(\Omega)$ by

$$p_{K,n}(\phi) = \sup |D^\alpha\phi(x)|$$

where the sup is taken over all $|\alpha| \le n < \infty$ and all $x \in K$, and

$$D^\alpha\phi(x) = \frac{\partial^{\alpha_1 + \alpha_2 + \dots + \alpha_m}\phi(x)}{\partial x_1^{\alpha_1}\,\partial x_2^{\alpha_2} \cdots \partial x_m^{\alpha_m}}$$

Then $D_K(\Omega)$ is a locally convex topological space whose open sets are determined via the fundamental system of neighbourhoods

$$V(K, 1/s, n), \quad n = 0, 1, 2, \dots, \quad s = 1, 2, \dots$$

$$V(K, \epsilon, n) = \{\phi \in D_K : p_{K,n}(\phi) \le \epsilon\}$$

If $K_1 \subset K_2$ then the topology of $D_{K_1}(\Omega)$ is the relative topology of $D_{K_1}(\Omega)$ as a subset of $D_{K_2}(\Omega)$. We define $D(\Omega)$ to be the "inductive limit" of the $D_K(\Omega)$'s as $K$ ranges over all compact subsets of $\Omega$.

Inductive limit

We say that a convex, balanced, and absorbing set $V \subset D(\Omega)$ is a neighbourhood of zero in $D(\Omega)$ if and only if the intersection $V \cap D_K(\Omega)$ is an open set of $D_K(\Omega)$ containing the zero vector of $D_K(\Omega)$ for all $K$. The topology defined in this way is the "inductive limit" of the $D_K(\Omega)$.

A better grasp of these ideas can be obtained by understanding what is meant by convergence in $D(\Omega)$. It can be shown that

$$\lim_{r \to \infty}\phi_r = \phi$$

means that both of the following conditions hold:

a) There exists a compact $K \subset \Omega$ such that the support of $\phi_r$, $r = 1, 2, \dots$, is in $K$.

b) For all $D^\alpha$, $D^\alpha\phi_r(x) \to D^\alpha\phi(x)$ as $r \to \infty$ uniformly on $K$.

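Before we look at distributions in more detail, there are a number of useful technical devices which, although we shall not use them, are worth mentioning because of their importance in a thorough treatment of the subject.

First, we define

$$\phi(x) = c\,f(\|x\|^2 - 1)$$

where

$$f(t) = e^{1/t}, \ t < 0; \qquad f(t) = 0, \ t \ge 0$$

and $c$ is a constant chosen so that

$$\int_{R^n} \phi\,dx = 1$$

Then it is easy to show that $\phi \in C_0^\infty(R^n)$.

Regularization of $z$: $z_\epsilon$

If $z$ is an arbitrary integrable function, we define its regularization $z_\epsilon$ by

$$z_\epsilon(x) = \int_{R^n} z(x - \epsilon y)\,\phi(y)\,dy$$

Substituting $\epsilon y = y'$ we find

$$z_\epsilon(x) = \epsilon^{-n}\int_{R^n} z(x - y')\,\phi(y'/\epsilon)\,dy'$$

Theorem

Let $z$ be integrable and vanish outside a compact subset $K$ of $\Omega$. Then $z_\epsilon \in C_0^\infty(\Omega)$ if $\epsilon$ is smaller than the distance $\delta$ from $K$ to $\Omega'$ (complement of $\Omega$). As $\epsilon \to 0$, $z_\epsilon \to z$ in $L^p$, $1 \le p < \infty$, and $z_\epsilon \to z$ uniformly if $z \in C_0^0(\Omega)$.

The proof of this theorem is immediate from the definition of $z_\epsilon$ and the representation

$$z_\epsilon(x) - z(x) = \int_{R^n} [z(x - \epsilon y) - z(x)]\,\phi(y)\,dy$$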
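Regularization can be tried out in one dimension. The sketch below is hedged accordingly: it assumes the standard bump profile $f(t) = e^{1/t}$ for $t < 0$ and $f(t) = 0$ for $t \ge 0$, the convolution form $z_\epsilon(x) = \int z(x - \epsilon y)\phi(y)\,dy$, a midpoint quadrature rule, and a tent function as the integrable $z$; all function names are hypothetical.

```python
import math

def bump_raw(x):
    """f(x^2 - 1) with f(t) = exp(1/t) for t < 0 and f(t) = 0 for t >= 0."""
    t = x * x - 1.0
    return math.exp(1.0 / t) if t < 0.0 else 0.0

def midpoint(g, a, b, n=2000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

C = 1.0 / midpoint(bump_raw, -1.0, 1.0)   # normalizing constant c

def phi(x):
    """Mollifier: smooth, supported in [-1, 1], with unit integral."""
    return C * bump_raw(x)

def regularize(z, eps):
    """Return z_eps, where z_eps(x) = integral of z(x - eps*y) phi(y) over |y| <= 1."""
    return lambda x: midpoint(lambda y: z(x - eps * y) * phi(y), -1.0, 1.0)

def tent(x):
    """x on [0, 1/2], 1 - x on [1/2, 1], zero elsewhere."""
    if 0.0 <= x <= 0.5:
        return x
    if 0.5 < x <= 1.0:
        return 1.0 - x
    return 0.0

tent_eps = regularize(tent, 0.05)
smooth_at_quarter = tent_eps(0.25)   # tent is linear near 0.25, so this stays close to 0.25
smooth_at_kink = tent_eps(0.5)       # the corner at 0.5 is rounded off slightly
```

Where $z$ is linear the regularization reproduces it, because $\phi$ is even; at the kink the value is lowered by an amount of order $\epsilon$.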

A direct result of this theorem is the so-called partition of unity.

Partition of unity

Let $\{K_i\}$, $i = 1, 2, \dots$, be open sets such that $\{K_i\}$ covers an open set $\Omega \subset R^n$. Then there exist functions $\alpha_i$ such that

a) $\alpha_i \ge 0$, $\sum_{i=1}^{\infty} \alpha_i = 1$

b) $\alpha_i \in C^\infty$ and its support lies in some $K_i$.

c) Every compact set in $\Omega$ intersects only a finite number of the supports of the $\alpha_i$.

Definition: Distributions

A distribution on $\Omega$ is an element $T$ of $D'(\Omega)$, and will usually be denoted by $T(\phi)$, $\phi \in D(\Omega)$. Let us give some examples:

Example 1

Let $f(x)$ be locally integrable, i.e.

$$\int_K |f(x)|\,dx < \infty \quad \text{for any compact } K \subset R^n$$

Then $T(\phi) = \int f(x)\phi(x)\,dx$, $\phi \in D(\Omega)$, defines a distribution on $\Omega$ which we usually denote by $T_f$.

Note

Two distributions $T_{f_1}$, $T_{f_2}$ are equal if and only if $f_1 = f_2$ almost everywhere.

Example 2. The Dirac delta function

Consider the distribution

$$T(\phi) = \phi(0), \quad \phi \in D(\Omega)$$

We shall call this the Dirac delta function concentrated at the origin and write $\delta(\phi)$. More loosely, this is written as $\delta(x)$. If instead we consider the distribution

$$T(\phi) = \phi(a), \quad \phi \in D(\Omega)$$

we write

$$\delta_a(\phi) \quad \text{or} \quad \delta(x - a)$$

Example 3

Both of the above examples are special cases of the following distributions. Let $d\mu$ be a measure on $\Omega$, then

$$T(\varphi) = \int_\Omega \varphi\, d\mu$$

is a distribution. If for example

$$d\mu = f\, dx, \quad f \in L^1_{loc}(\Omega)$$

then we obtain the distribution constructed in Example 1.

Example 4

If $T(\varphi)$ is a distribution, so is $T(f\varphi)$ for $f \in C^\infty(\Omega)$.

Example 5

If $T(\varphi)$ is a distribution, then so is $T(D^\alpha\varphi)$.

A useful characterization of a distribution is: a linear functional $T$ defined on $C_0^\infty(\Omega)$ is a distribution if and only if for every compact subset $K$ of $\Omega$ there corresponds a positive constant $c$ and integer $k$, so that

$$|T(\varphi)| \le c\, \|\varphi\|_k \quad \text{whenever } \varphi \in D_K(\Omega)$$

We now make use of Example 5 to define the differentiation of a distribution.

Differentiation of a distribution

We define $(D^\alpha T)(\varphi)$ by

$$(D^\alpha T)(\varphi) = (-1)^{|\alpha|}\, T(D^\alpha\varphi), \quad \varphi \in D(\Omega)$$

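Example 1

Consider the Heaviside function

$$H(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

Then

$$H'(\varphi) = -H(\varphi') = -\int_0^\infty \varphi'(x)\, dx = \varphi(0)$$

since $\varphi$ has compact support. So symbolically

$$H'(x) = \delta(x)$$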
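As a quick numerical illustration of the identity $H'(\varphi) = \varphi(0)$, the sketch below evaluates $-\int_0^\infty \varphi'(x)\,dx$ for one particular test function. The choice of test function and the helper names `phi`, `dphi` and `heaviside_derivative_on` are assumptions made for this example only.

```python
import math

def phi(x):
    """A test function in D(R): exp(1/(x^2 - 1)) on (-1, 1), zero elsewhere."""
    return math.exp(1.0 / (x * x - 1.0)) if abs(x) < 1.0 else 0.0

def dphi(x):
    """Classical derivative of phi."""
    if abs(x) >= 1.0:
        return 0.0
    return phi(x) * (-2.0 * x) / (x * x - 1.0) ** 2

def heaviside_derivative_on(test_fn_derivative, upper=1.0, n=20000):
    """Approximate H'(phi) = -integral_0^upper phi'(x) dx by the midpoint rule.

    `upper` only needs to cover the support of phi on the positive axis.
    """
    h = upper / n
    return -h * sum(test_fn_derivative((i + 0.5) * h) for i in range(n))

value = heaviside_derivative_on(dphi)

if __name__ == "__main__":
    print(value, phi(0.0))
```

The two printed numbers should agree up to quadrature error, since the integral of $\varphi'$ over $(0, 1)$ telescopes to $-\varphi(0)$.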

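Example 2

We can define the product of a function $f \in C^\infty$ and a distribution $T$ as in Example 4:

$$fT(\varphi) = T(f\varphi), \quad \varphi \in D(\Omega)$$

Then the differentiation $(D(fT))(\varphi)$ is defined by

$$(D(fT))(\varphi) = -fT(D\varphi) = -T(fD\varphi)$$

also

$$(Df)T(\varphi) = T(\varphi\, Df)$$

and

$$fDT(\varphi) = DT(f\varphi) = -T(D[f\varphi])$$

Now

$$D[f\varphi] = fD\varphi + \varphi Df$$

Therefore

$$(D(fT))(\varphi) = (Df)T(\varphi) + fDT(\varphi)$$

Note. It is not possible to define the multiplication of two arbitrary distributions.

3. Sobolev spaces

Definition. $H^m(\Omega)$, where $m$ is a positive integer, is the space of distributions $T(\varphi)$ such that $D^\alpha T \in L^2(\Omega)$ for all $\alpha$, $|\alpha| \le m$, provided with the norm

$$\|T\|_m = \Big( \sum_{|\alpha| \le m} \|D^\alpha T\|^2_{L^2(\Omega)} \Big)^{1/2}$$

With the inner product

$$\langle T_1, T_2 \rangle_m = \sum_{|\alpha| \le m} \langle D^\alpha T_1, D^\alpha T_2 \rangle_{L^2(\Omega)}$$

$H^m(\Omega)$ is a Hilbert space.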

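Remark I

If $M > m$

$$H^M(\Omega) \subset H^m(\Omega) \subset L^2(\Omega) \stackrel{\text{def}}{=} H^0(\Omega)$$

Remark II

The delta function does not belong to any of the spaces $H^m(\Omega)$, $m \ge 0$. To see this we note

$$\|\delta\|_{L^2(\Omega)} = \sup_{\varphi} \left\{ \frac{|\delta(\varphi)|}{\|\varphi\|_{L^2(\Omega)}} \right\} = \sup_{\varphi} \left\{ \frac{|\varphi(0)|}{\|\varphi\|_{L^2(\Omega)}} \right\}$$

It is always possible to choose a $\varphi \in D(\Omega)$ such that $\varphi(0) \ge k\,\|\varphi\|_{L^2(\Omega)}$ for any given $k$. Hence $\delta \notin H^0(\Omega)$ and so by Remark I cannot belong to $H^m(\Omega)$ for $m \ge 0$.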
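The claim that $\varphi(0)/\|\varphi\|_{L^2}$ can be made arbitrarily large is easy to see with a family of shrinking bumps. The sketch below is illustrative only: it works on the real line, uses the hypothetical family $\varphi_\varepsilon(x) = \varphi(x/\varepsilon)$, and approximates the $L^2$ norm with a midpoint rule.

```python
import math

def bump(x):
    """Smooth function supported in [-1, 1]."""
    return math.exp(1.0 / (x * x - 1.0)) if abs(x) < 1.0 else 0.0

def l2_norm(g, a, b, n=4000):
    """Midpoint approximation of the L2(a, b) norm of g."""
    h = (b - a) / n
    return math.sqrt(h * sum(g(a + (i + 0.5) * h) ** 2 for i in range(n)))

def delta_ratio(eps):
    """|delta(phi_eps)| / ||phi_eps||_{L2} for phi_eps(x) = bump(x / eps)."""
    phi_eps = lambda x: bump(x / eps)
    return abs(phi_eps(0.0)) / l2_norm(phi_eps, -eps, eps)

ratios = [delta_ratio(eps) for eps in (1.0, 1e-2, 1e-4)]

if __name__ == "__main__":
    print(ratios)
```

Since $\|\varphi_\varepsilon\|_{L^2} = \varepsilon^{1/2}\|\varphi\|_{L^2}$ in one dimension while $\varphi_\varepsilon(0)$ is fixed, the ratio grows like $\varepsilon^{-1/2}$, so no constant $k$ can bound it.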

This remark indicates the need to consider more general spaces of distributions.

Tempered distributions

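Let us consider the special case of $\Omega = R^n$. We define the Fourier transform of $\varphi \in L^2(R^n)$ by

$$\hat{\varphi}(y) = (2\pi)^{-n/2} \int_{R^n} \exp(-i\langle x, y\rangle)\, \varphi(x)\, dx, \quad \langle x, y\rangle = x_1 y_1 + x_2 y_2 + \ldots + x_n y_n$$

Then $\varphi \to \hat{\varphi}$ is an isomorphism of $L^2(R^n)$ onto $L^2(R^n)$.

Definition: Tempered distributions

Let

$$S = \{\varphi : x^\alpha D^\beta \varphi \in L^2(R^n) \text{ for all } \alpha, \beta\}$$

Then with the seminorms

$$p_{\alpha\beta}(\varphi) = \|x^\alpha D^\beta \varphi\|_{L^2(R^n)}$$

$S$ is a Fréchet space. We define the tempered distributions as elements of

$$S' = \text{dual of } S \text{ with the strong topology}$$

Note

For all $\varphi \in S$ and for all $\alpha$

$$F(D^\alpha \varphi) = (iy)^\alpha F\varphi$$

and we can define the Fourier transform of $u \in S'$ by $\hat{u} = Fu$ where

$$\langle Fu, \varphi\rangle = \langle u, F\varphi\rangle \quad \forall \varphi \in S$$

where $\langle\ ,\ \rangle$ denotes the duality between $S'$ and $S$.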

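Theorem

If $\Omega = R^n$, $H^m(R^n)$ can be defined by

$$H^m(R^n) = \{u : u \in S' \text{ and } (1 + \|y\|^2)^{m/2}\, \hat{u} \in L^2(R^n)\}$$

with

$$\|u\|_m = \|(1 + \|y\|^2)^{m/2}\, \hat{u}\|_{L^2(R^n)}$$

Proof

From Plancherel's theorem

$$\|D^\alpha u\|_{L^2(R^n)} = \|(iy)^\alpha \hat{u}\|_{L^2(R^n)}$$

Hence

$$\sum_{|\alpha| \le m} \|D^\alpha u\|^2_{L^2(R^n)} = \int_{R^n} \sum_{|\alpha| \le m} |y^\alpha|^2\, |\hat{u}(y)|^2\, dy$$

But there exist constants $c_1$, $c_2$ such that

$$c_1 (1 + \|y\|^2)^m \le \sum_{|\alpha| \le m} |y^\alpha|^2 \le c_2 (1 + \|y\|^2)^m$$

which proves the theorem.

We note that the above definition of $H^m(R^n)$ does not require $m$ to be a positive integer, and use the above to define $H^m(R^n)$ for all $m$ positive and negative, $m$ integer or not. It is then obvious that

$$H^{-s}(R^n) \supset H^0(R^n) \supset H^m(R^n), \quad -s < 0 < m$$

We can also show that

a) If $H^0(R^n)$ is identified with its dual

$$\{H^s(R^n)\}' = H^{-s}(R^n)$$

b) $D(R^n)$ is dense in $H^s(R^n)$ for all $s$.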

The definition of the space $H^s(\Omega)$ is a far more complicated matter and does not fit into the above pattern in the sense that a) and b) are not true for general $s$. Although we will have cause to use results for these spaces, the development is outside the scope of these lectures, and the interested reader is referred to the literature (e.g. the book by Lions and Magenes, see bibliography).

There is a particularly simple representation of the spaces $H^s(\Gamma)$ where $\Gamma$ is the boundary of $\Omega$ and is assumed to be a $C^\infty$-manifold.

Let $\psi_j$ be the eigenfunctions of the Laplace operator on $\Gamma$ and $-\lambda_j$ the corresponding eigenvalues, so that

$$\Delta\psi_j + \lambda_j \psi_j = 0 \quad \text{on } \Gamma$$

where the $\psi_j$ are assumed to be orthonormalized in $H^0(\Gamma)$. If $u$ is a distribution on $\Gamma$ with $u_j$ its Fourier coefficient relative to $\{\psi_j\}$,

$$u = \sum_{j=1}^{\infty} u_j \psi_j$$

then for all $s \in R$

$$H^s(\Gamma) = \Big\{u : u \in D'(\Gamma),\ \sum_{j=1}^{\infty} \lambda_j^{s}\, |u_j|^2 < \infty \Big\}$$

Regularity

It is very important to know whether a weak solution to a partial differential equation is a classical solution. This quite often involves comparing $H^s(\Omega)$ spaces with $L^2(\Omega)$ and $C^k(\Omega)$ spaces. The important results in this area are contained in the following

Sobolev embedding theorem

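We shall write $X \hookrightarrow Y$ to denote the continuous embedding of $X$ in $Y$, i.e.

$$\|u\|_Y \le k\, \|u\|_X \quad \text{for all } u \in X$$

If

$$s > n/2 + k, \quad \Omega \subset R^n$$

then

$$H^s(\Omega) \hookrightarrow C^k(\Omega)$$

We shall not prove this theorem but, by a simple application of the Schwarz inequality, indicate its validity.

For any $\varphi \in D(\Omega)$, $\Omega = (0,1)$,

$$\varphi^2(x) = \Big( \int_0^x \varphi_y(y)\, dy \Big)^2 \le \int_0^x 1^2\, dy \int_0^x \varphi_y^2\, dy$$

Hence

$$\sup_{x \in \Omega} |\varphi(x)|^2 \le \|\varphi\|^2_{H^1(\Omega)}$$

So by continuity

$$\sup_{x \in \Omega} |u(x)|^2 \le \|u\|^2_{H^1(\Omega)}$$

for all $u$ in the closure of $D(\Omega)$ in the $H^1$ norm. We shall denote this space by $H_0^1(\Omega)$ and we see

$$H_0^1(\Omega) \hookrightarrow C^0(\Omega), \quad \Omega \subset R^1$$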
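The one-dimensional inequality can be checked on simple functions that vanish at both ends of $(0,1)$. The sketch below is an illustration under stated assumptions: the two sample functions and the helper `check` are chosen for this example, they lie in the closure $H_0^1(0,1)$ rather than in $D(0,1)$ itself, and the quantities are approximated on a midpoint grid. It compares $\sup|\varphi|^2$ with $\int_0^1 \varphi_x^2\,dx$, which is no larger than the squared $H^1$ norm.

```python
import math

def check(phi, dphi, n=10000):
    """Return (sup of phi^2, integral of phi'^2) over (0, 1) on a midpoint grid."""
    h = 1.0 / n
    xs = [(i + 0.5) * h for i in range(n)]
    sup_sq = max(phi(x) ** 2 for x in xs)
    grad_sq = h * sum(dphi(x) ** 2 for x in xs)
    return sup_sq, grad_sq

# Sample functions vanishing at x = 0 and x = 1 (assumed examples).
examples = {
    "x(1-x)": (lambda x: x * (1.0 - x), lambda x: 1.0 - 2.0 * x),
    "sin(3 pi x)": (
        lambda x: math.sin(3.0 * math.pi * x),
        lambda x: 3.0 * math.pi * math.cos(3.0 * math.pi * x),
    ),
}

results = {name: check(f, df) for name, (f, df) in examples.items()}

if __name__ == "__main__":
    for name, (sup_sq, grad_sq) in results.items():
        print(name, sup_sq, grad_sq)
```

In both cases the first number should not exceed the second, consistent with the embedding of $H_0^1(0,1)$ in the continuous functions.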

An alternative definition of $H_0^s(\Omega)$ is the following

Definition $H_0^s(\Omega)$

Let $\Omega$ be a bounded open set $\subset R^n$ with a suitably smooth boundary $\Gamma$. Then $u \in H_0^s(\Omega)$ if and only if

a) $u \in H^s(\Omega)$

b) $\dfrac{\partial^j u}{\partial n^j} = 0$ on $\Gamma$, $0 \le j < s - 1/2$

where $\partial^j/\partial n^j$ is the derivative of order $j$ along the normal to $\Gamma$. It can be shown that

(i) $D(\Omega)$ is not dense in $H^s(\Omega)$, $s > 1/2$

(ii) $H_0^s(\Omega)$ = closure of $D(\Omega)$ in $H^s(\Omega)$

(iii) $\{H_0^s(\Omega)\}' = H^{-s}(\Omega)$.

4. Application to partial differential equations

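We shall begin by considering a simple class of elliptic equations

$$\sum_{i,j=1}^{n} D^j(a_{ij} D^i u) = -f \ \text{ in } \Omega, \qquad u = 0 \ \text{ on } \Gamma \qquad (4.1)$$

where

$$D^j = \frac{\partial}{\partial x_j}, \quad a_{ij} \in L^\infty(\Omega)$$

and we assume

$$\sum_{i,j=1}^{n} a_{ij}\, \xi_i \xi_j \ge K \sum_{i=1}^{n} \xi_i^2, \quad K > 0$$

We define

$$a(u, \varphi) = \sum_{i,j=1}^{n} \int_\Omega a_{ij}\, D^i u\, D^j \varphi\, dx, \qquad \langle f, \varphi\rangle = \int_\Omega f \varphi\, dx$$

for $\varphi \in D(\Omega)$. A weak solution of Eq.(4.1) is defined to be a solution of

$$a(u, \varphi) = \langle f, \varphi\rangle \qquad (4.2)$$

Because of the imposed conditions we obtain

$$a(\varphi, \varphi) \ge K\, \|\varphi\|^2_{H^1(\Omega)}$$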

The proof of the existence now follows as a direct result of the Lax-Milgram theorem which states:

If $a(u,v)$ is a bounded bilinear form on a real Hilbert space $X$ such that

$$a(v,v) \ge K\, \|v\|^2, \quad v \in X$$

then for each $f \in X'$ there exists a unique $u \in X$ so that

$$a(u,v) = \langle f, v\rangle \quad \text{for all } v \in X$$

Of course, this reduces to the Riesz representation theorem when $a(u,v) = a(v,u)$ and so is an inner product on $X$.

We see therefore that for $f \in H^{-1}(\Omega)$, since $D(\Omega)$ is dense in $H_0^1(\Omega)$, there is a unique solution $u$ of Eq.(4.2): $u \in H_0^1(\Omega)$. If $f \in H^s(\Omega)$ it is possible to show that there exists $u \in H^{s+2}(\Omega) \cap H_0^1(\Omega)$ such that

$$\nabla^2 u = f$$

In particular, if $\Omega \subset R^2$, $s = 1 + \varepsilon$, $\varepsilon > 0$, $u \in H^{3+\varepsilon}(\Omega) \cap H_0^1(\Omega)$ and so by the Sobolev embedding theorem $u \in C^2(\Omega)$ and is therefore a classical solution. This indicates how it is possible to obtain classical solutions via weak solutions.
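The weak formulation (4.2) also suggests a way of computing solutions. The following is a minimal sketch, assuming the simplest one-dimensional case $\Omega = (0,1)$, $a_{ij} = 1$ and a constant right-hand side, so that (4.2) reads $\int_0^1 u'v'\,dx = \int_0^1 f v\,dx$. The piecewise-linear ("hat function") subspace and the function names are choices made for this illustration, not something taken from the lectures.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (no pivoting).

    lower[i] multiplies x[i-1] and upper[i] multiplies x[i+1] in row i;
    lower[0] and upper[-1] are ignored.
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def galerkin_poisson(f_const, n_elements):
    """Solve a(u, v) = <f, v> on span{hat functions} in H_0^1(0, 1).

    a(u, v) = integral of u'v'; for hat functions on a uniform mesh of width h
    the stiffness matrix is tridiagonal with 2/h on the diagonal and -1/h
    off the diagonal, and <f, e_i> = f_const * h.
    """
    h = 1.0 / n_elements
    m = n_elements - 1                      # number of interior nodes
    diag = [2.0 / h] * m
    off = [-1.0 / h] * m
    rhs = [f_const * h] * m
    u = solve_tridiagonal(off, diag, off, rhs)
    nodes = [(i + 1) * h for i in range(m)]
    return nodes, u

if __name__ == "__main__":
    nodes, u = galerkin_poisson(1.0, 10)
    for x, ux in zip(nodes, u):
        print(x, ux, 0.5 * x * (1.0 - x))
```

For $f = 1$ the classical solution of $-u'' = f$, $u(0) = u(1) = 0$ is $u(x) = x(1-x)/2$, and the computed nodal values should match it closely. The unique solvability of the discrete system is the finite-dimensional counterpart of the Lax-Milgram statement, since the stiffness matrix inherits the coercivity of $a$.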


5. Evolution equations

We shall again consider a space $X \subset H$ where $H$ is a real Hilbert space; then, identifying $H$ with its dual $H'$, we have

$$X \subset H \subset X'$$

where $X'$ is the dual of $X$. We shall seek a solution of

$$\frac{dx}{dt} = A(t)x + f, \qquad x(0) = x_0, \quad x_0 \in H$$

The precise spaces in which $f$ and the solutions lie will be explained later. We define the bilinear form

$$a(t; \varphi, \psi) = -\langle A(t)\varphi, \psi\rangle$$

where $A(t)\varphi \in X'$ and $\langle\ ,\ \rangle$ denotes the duality between $X$ and $X'$. We assume

a) $a(t; \varphi, \psi)$ is measurable on $(0, \tau)$ and

$$|a(t; \varphi, \psi)| \le c\, \|\varphi\|_X \|\psi\|_X, \quad \varphi, \psi \in X$$

where $\|\cdot\|_X$ denotes the norm on $X$.

b) $a(t; \varphi, \varphi) \ge K\|\varphi\|_X^2$, $K > 0$, for all $\varphi \in X$, $t \in (0, \tau)$

so that

$$A(t) \in \mathscr{L}(L^2(0,\tau; X);\ L^2(0,\tau; X'))$$

i.e. if $f \in L^2(0,\tau; X)$, $A(t)f$ is the function

$$t \to A(t)f(t) \in X'$$

We now show how to define $df/dt$ for $f \in L^2(0,\tau; X)$. To do this we first define $D'(0,\tau; X)$ by

$$D'(0,\tau; X) = \mathscr{L}(D(0,\tau); X)$$

the space of distributions on $(0,\tau)$ with values in $X$. Then if $f \in D'(0,\tau; X)$, $f(\varphi) \in X$ for all $\varphi \in D(0,\tau)$ and $\varphi \to f(\varphi)$ is a continuous map of $D(0,\tau) \to X$.

For $f \in L^2(0,\tau; X)$ we shall write

$$f(\varphi) = \int_0^\tau f(t)\varphi(t)\, dt$$

and identify $f \in L^2(0,\tau; X)$ with $f(\varphi) \in D'(0,\tau; X)$. The derivative $df/dt$ is then defined by

$$\frac{df}{dt}(\varphi) = -f\Big(\frac{d\varphi}{dt}\Big)$$

Hence

$$\frac{df}{dt} \in D'(0,\tau; X)$$

We now introduce the space $W(0,\tau)$

$$W(0,\tau) = \Big\{f : f \in L^2(0,\tau; X),\ \frac{df}{dt} \in L^2(0,\tau; X')\Big\}$$

endowed with the norm

$$\|f\|_W = \Big( \int_0^\tau \|f(t)\|_X^2\, dt + \int_0^\tau \Big\|\frac{df}{dt}(t)\Big\|_{X'}^2 dt \Big)^{1/2}$$

$W(0,\tau)$ is a Hilbert space. For more details the reader is referred to Lions and Magenes (see bibliography).

We seek a solution $x \in W(0,\tau)$ to the evolution equation

$$\frac{dx}{dt} = A(t)x + f, \qquad x(0) = x_0 \in H, \quad f \in L^2(0,\tau; X') \qquad (5.1)$$

Uniqueness

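Consider Eq.(5.1) with $f = 0$, $x_0 = 0$.

Then

$$a(t; x(t), x(t)) + \Big\langle \frac{dx}{dt}(t), x(t)\Big\rangle = 0$$

But on integrating by parts

$$\int_0^\tau \Big\langle \frac{dx}{dt}(t), x(t)\Big\rangle dt = \tfrac{1}{2}\|x(\tau)\|_H^2 - \tfrac{1}{2}\|x(0)\|_H^2 = \tfrac{1}{2}\|x(\tau)\|_H^2$$

Hence

$$\tfrac{1}{2}\|x(\tau)\|_H^2 + \int_0^\tau a(t; x(t), x(t))\, dt = 0$$

Both terms are non-negative by assumption b), and so

$$x = 0$$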


Existence

We shall assume that $X$ is separable so that there exists a countable basis $e_1, e_2, \ldots, e_n, \ldots$ such that $e_1, \ldots, e_n$ are linearly independent for all $n$ and finite combinations are dense in $X$. Set

$$x_n(t) = \sum_{i=1}^{n} \alpha_{in}(t)\, e_i$$

and choose $\alpha_{in}$ so that

$$\Big\langle \frac{d}{dt}x_n(t), e_j\Big\rangle + a(t; x_n(t), e_j) = \langle f(t), e_j\rangle, \quad 1 \le j \le n \qquad (5.2)$$

with $\alpha_{in}(0) = \alpha_{in0}$ and

$$\sum_{i=1}^{n} \alpha_{in0}\, e_i \to x_0 \ \text{ in } H \text{ as } n \to \infty$$

The differential equations (5.2) are of the form

$$B_n \frac{dy_n}{dt} + A_n(t)\, y_n = f_n, \quad y_n(0) = \{\alpha_{in0}\}$$

where $B_n$ is the matrix with elements $\langle e_i, e_j\rangle$, $A_n(t)$ is the matrix with elements $a(t; e_i, e_j)$, $y_n$ is the vector with elements $\alpha_{in}(t)$ and $f_n(t)$ is the vector with elements $\langle f(t), e_j\rangle$. Since the $e_i$ are linearly independent

$$\det B_n \ne 0$$

and so the above equations admit a unique solution. We shall show that $x_n \to x$ (a solution of the original equation) as $n \to \infty$. Multiplying Eq.(5.2) by $\alpha_{jn}(t)$ and summing over $j$, an easy computation

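yields

$$\frac{1}{2}\frac{d}{dt}\|x_n(t)\|_H^2 + a(t; x_n(t), x_n(t)) = \langle f(t), x_n(t)\rangle$$

Then using the lower bound on $a(t; x_n(t), x_n(t))$ and integrating

$$\|x_n(\tau)\|_H^2 + 2K\int_0^\tau \|x_n(t)\|_X^2\, dt \le \|x_n(0)\|_H^2 + 2\int_0^\tau |\langle f(t), x_n(t)\rangle|\, dt \le \|x_n(0)\|_H^2 + 2\int_0^\tau \|f(t)\|_{X'} \|x_n(t)\|_X\, dt$$

by the Schwarz inequality. But

$$2\int_0^\tau \|f(t)\|_{X'} \|x_n(t)\|_X\, dt \le \beta \int_0^\tau \|x_n(t)\|_X^2\, dt + \frac{1}{\beta}\int_0^\tau \|f(t)\|_{X'}^2\, dt$$

and $\|x_n(0)\|_H \le \alpha\|x_0\|_H$ for some $\alpha, \beta > 0$.

Hence

$$\|x_n(\tau)\|_H^2 + \int_0^\tau \|x_n(t)\|_X^2\, dt \le c\Big[\|x_0\|_H^2 + \int_0^\tau \|f(t)\|_{X'}^2\, dt\Big]$$

for some $c$. Thus $x_n$ ranges over a bounded set in $L^2(0,\tau; X)$ and we may extract a subsequence so that

$$x_p \to z \ \text{ weakly in } L^2(0,\tau; X)$$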

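We shall now show that $z$ is the desired solution. Let $j$ be fixed but arbitrary and let $p > j$; then multiplying Eq.(5.2) by $\varphi(t)$, where $n = p$ and $\varphi(t) \in C^1(0,\tau)$, $\varphi(\tau) = 0$, integrating on $(0,\tau)$ and setting $\varphi_j(t) = \varphi(t)e_j$, gives

$$\int_0^\tau \big[ -\langle x_p(t), \varphi_j'(t)\rangle + a(t; x_p(t), \varphi_j(t)) \big]\, dt = \int_0^\tau \langle f(t), \varphi_j(t)\rangle\, dt + \langle x_p(0), \varphi_j(0)\rangle$$

Since $x_p \to z$ weakly, we have

$$\int_0^\tau \big[ -\langle z, \varphi_j'\rangle + a(t; z, \varphi_j) \big]\, dt = \int_0^\tau \langle f, \varphi_j\rangle\, dt + \langle x_0, \varphi_j(0)\rangle$$

Since the above is true for $\varphi \in C^1(0,\tau)$, $\varphi(\tau) = 0$, we may take $\varphi \in D(0,\tau)$. Then in the sense of distributions in $D'(0,\tau)$ we have

$$\frac{d}{dt}\langle z(t), e_j\rangle + a(t; z(t), e_j) = \langle f(t), e_j\rangle$$

But the $e_j$ are dense in $X$ so that

$$\frac{dz}{dt} = A(t)z + f$$

Hence

$$\frac{dz}{dt} \in L^2(0,\tau; X') \ \text{ and so } z \in W(0,\tau)$$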


It is easy to verify that z satisfies the initial condition, so we have obtained the desired solution.
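The Galerkin construction and the energy estimate can be followed on a small example. The sketch below is an illustration under explicit assumptions: it takes the heat equation $u_t = u_{xx}$ on $(0,\pi)$ with zero boundary values and $f = 0$, uses the basis $e_j = \sin(jx)$ (for which $B_n$ and $A_n$ are diagonal), and integrates the resulting ordinary differential equations with the backward Euler scheme. The function name and the time-stepping scheme are choices made here, not part of the lectures.

```python
import math

def galerkin_heat(alpha0, tau, steps):
    """Galerkin approximation of u_t = u_xx on (0, pi), u = 0 at both ends.

    With e_j = sin(j x), B_n = (pi/2) I and A_n = (pi/2) diag(j^2), so the
    system (5.2) with f = 0 decouples into alpha_j' + j^2 alpha_j = 0.
    Returns (||x_n(0)||_H^2, ||x_n(tau)||_H^2, approx. integral of a(x_n, x_n)).
    """
    dt = tau / steps
    alpha = list(alpha0)

    def h_norm_sq(a):
        # Squared L2(0, pi) norm of sum_j a_j sin(j x).
        return (math.pi / 2.0) * sum(c * c for c in a)

    def a_form(a):
        # a(x, x) = integral of (x_x)^2 over (0, pi).
        return (math.pi / 2.0) * sum((j + 1) ** 2 * c * c for j, c in enumerate(a))

    start = h_norm_sq(alpha)
    dissipated = 0.0
    for _ in range(steps):
        # Backward Euler step for each decoupled coefficient.
        alpha = [c / (1.0 + dt * (j + 1) ** 2) for j, c in enumerate(alpha)]
        dissipated += dt * a_form(alpha)
    return start, h_norm_sq(alpha), dissipated

if __name__ == "__main__":
    start, end, dissipated = galerkin_heat([1.0, 0.0, 0.5], tau=1.0, steps=200)
    print(start, end + 2.0 * dissipated)
```

With $K = 1$ and $f = 0$, the discrete quantities should satisfy $\|x_n(\tau)\|_H^2 + 2\int_0^\tau a(x_n, x_n)\,dt \le \|x_n(0)\|_H^2$, which is the bound used above to extract a weakly convergent subsequence.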

Example 1

Let $Q = \Omega \times (0,\tau)$ where $\Omega$ is an open bounded set in $R^n$ with a smooth boundary $\Gamma$. Consider

$$\frac{\partial u}{\partial t} = \sum_{i,j=1}^{n} D^j(a_{ij} D^i u) + f \ \text{ in } Q$$

where $u = 0$ on $\Gamma$, $u(x,0) = u_0(x)$ and

$$\sum_{i,j=1}^{n} a_{ij}\, \xi_i \xi_j \ge K \sum_{i=1}^{n} \xi_i^2, \quad K > 0$$

We take

$$X = H_0^1(\Omega), \ \text{ so that } X' = H^{-1}(\Omega)$$

$$H = H^0(\Omega) = L^2(\Omega)$$

Then for $\varphi, \psi \in H_0^1(\Omega)$

$$a(t; \varphi, \psi) = \sum_{i,j=1}^{n} \int_\Omega a_{ij}\, D^i\varphi\, D^j\psi\, dx$$

so that all the conditions for the uniqueness and existence theorem are satisfied. Hence, for $f \in L^2(0,\tau; H^{-1}(\Omega))$, $u_0 \in L^2(\Omega)$ there exists a unique solution $u \in L^2(0,\tau; H_0^1(\Omega))$.

Example 2

Consider the above example where $f$ is to be thought of as a control. Let us associate with the problem a performance index $C(f)$ and try to find the control which minimizes $C(f)$. Before attempting to find criteria for optimality, it is necessary to consider whether or not the problem is well posed, and the existence results will play an important role. For example, if $C(f)$ involves $u$ only through its norm in $L^2(0,\tau; H_0^1(\Omega))$, then we know the problem is well posed since $u \in L^2(0,\tau; H_0^1(\Omega))$. However, if $C(f)$ also contains a term in the $H_0^1(\Omega)$ norm of $u(\tau)$, the last two terms are well defined but it is not the case that $u(\tau)$ lies in $H_0^1(\Omega)$ and the problem is not well posed. This kind of analysis becomes even more important when the control forces lie on the boundary of the system; for example, see Lions and Magenes (see bibliography).

C. STRONGLY CONTINUOUS SEMI-GROUPS

1. Introduction

We have seen (Section A) that if $A \in \mathscr{L}(X)$, where $X$ is a finite-dimensional Euclidean space, the solution of

$$\dot{x} = Ax + f, \quad x(0) = x_0 \qquad (1.1)$$

is

$$x(t) = e^{At}x_0 + \int_0^t e^{A(t-s)} f(s)\, ds$$

for any integrable vector-valued function $f$. Here,

$$e^{At} = \sum_{n=0}^{\infty} \frac{A^n t^n}{n!}$$

It is natural to attempt to define $e^{At}$ for $A$ unbounded on $X$, and then derive a similar expression for the solution of the evolution equation (1.1). The above definition is difficult to use for $A$ unbounded since $D(A^n) \supset D(A^{n+1})$ and the domain becomes smaller as we consider higher powers of $A$. Moreover, there is the problem of convergence. Instead, we choose a different representation of the exponential function, motivated by the scalar formula

$$e^{at} = \lim_{n \to \infty} \Big(1 - \frac{at}{n}\Big)^{-n}$$

Of course, it is necessary to make sure that the limit exists, and for this we shall assume

a) $A$ is a closed operator such that $D(A)$ is dense in $X$.

b) $(\lambda I - A)^{-1}$ exists for $\lambda > 0$ and $\|(\lambda I - A)^{-1}\| \le 1/\lambda$, $\lambda > 0$.

Note that (b) implies

$$\|(I - \rho A)^{-1}\| \le 1, \quad \rho \ge 0$$

so that $(I - (t/n)A)^{-1}$ is bounded and may be iterated. Now set

$$V_n(t) = \Big(I - \frac{t}{n}A\Big)^{-n}$$

Then

$$\|V_n(t)\| \le 1$$

and $V_n(t)$ is holomorphic in $t > 0$ since we know the resolvent $(\lambda I - A)^{-1}$ is holomorphic for $\lambda > 0$. Hence

$$\frac{dV_n(t)}{dt} = A\Big(I - \frac{t}{n}A\Big)^{-(n+1)}$$

$V_n(t)$ is not necessarily holomorphic at $t = 0$, but it is strongly continuous at $t = 0$. To see this, note that

$$\|V_1(t)x - x\| = t\, \|(I - tA)^{-1}Ax\| \le t\, \|Ax\|$$

Hence $V_1(t)x \to x$ as $t \downarrow 0$ for all $x \in D(A)$. $V_1(t)$ is uniformly bounded, and $D(A)$ is dense in $X$, and hence $V_1(t)x \to x$ for all $x \in X$. It is easy to show that the same must be true for the iterates $V_n(t)$, so that

$$V_n(t)x \to V_n(0)x = x, \quad t \downarrow 0$$

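Existence of the strong limit $\lim_{n \to \infty} V_n(t)$

Write, for $x \in D(A^2)$,

$$V_n(t)x - V_m(t)x = \lim_{\varepsilon \downarrow 0} \int_\varepsilon^{t-\varepsilon} \frac{d}{ds}\big[V_m(t-s)V_n(s)\big]x\, ds$$

$$= \lim_{\varepsilon \downarrow 0} \int_\varepsilon^{t-\varepsilon} \Big[ -A\Big(I - \frac{t-s}{m}A\Big)^{-1} V_m(t-s)V_n(s)x + A\Big(I - \frac{s}{n}A\Big)^{-1} V_m(t-s)V_n(s)x \Big]\, ds$$

$$= \lim_{\varepsilon \downarrow 0} \int_\varepsilon^{t-\varepsilon} \Big(\frac{s}{n} - \frac{t-s}{m}\Big) \Big(I - \frac{t-s}{m}A\Big)^{-1}\Big(I - \frac{s}{n}A\Big)^{-1} V_m(t-s)V_n(s)A^2x\, ds$$

Hence

$$\|(V_n(t) - V_m(t))x\| \le \|A^2x\| \int_0^t \Big(\frac{s}{n} + \frac{t-s}{m}\Big)\, ds = \frac{t^2}{2}\Big(\frac{1}{n} + \frac{1}{m}\Big)\|A^2x\|$$

Hence $V_n(t)x$ converges uniformly in $t$ on any finite interval for all $x \in D(A^2)$. Since $D(A^2)$ is dense in $X$, and $V_n(t)$ is uniformly bounded, $\lim_{n \to \infty} V_n(t)x$ exists for all $x \in X$, and we shall denote this limit by $T_t x$.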
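The construction can be tried out in the simplest possible setting. The sketch below is illustrative only: it replaces the operator $A$ by a scalar $a \le 0$ (which satisfies the resolvent bound in (b)), so that $V_n(t) = (1 - ta/n)^{-n}$ and the limit should be $e^{at}$. The names `V` and `cauchy_bound` are chosen for this example.

```python
import math

def V(n, t, a):
    """Scalar analogue of V_n(t) = (I - (t/n) A)^{-n} with A = a <= 0."""
    return (1.0 - t * a / n) ** (-n)

def cauchy_bound(n, m, t, a, x=1.0):
    """Right-hand side (t^2 / 2)(1/n + 1/m) |A^2 x| of the estimate in the text."""
    return 0.5 * t * t * (1.0 / n + 1.0 / m) * abs(a * a * x)

a, t = -2.0, 1.0
errors = [abs(V(n, t, a) - math.exp(a * t)) for n in (10, 100, 1000)]

if __name__ == "__main__":
    print(errors)
    print(abs(V(10, t, a) - V(20, t, a)), cauchy_bound(10, 20, t, a))
```

The errors should decrease roughly like $1/n$, and the difference between two iterates should stay below the bound $\tfrac{t^2}{2}(\tfrac{1}{n} + \tfrac{1}{m})\|A^2x\|$ obtained above.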

It is now necessary to show that $T_t$ has the properties of the exponential function.

Properties of $T_t$

It is obvious that $T_t$ is strongly continuous, and

$$\|T_t\| \le 1, \quad T_0 = I \qquad (1.2)$$

Now

$$\frac{d}{dt}V_n(t) = V_n(t)\, A\Big(I - \frac{t}{n}A\Big)^{-1} = A\, V_n(t)\Big(I - \frac{t}{n}A\Big)^{-1} \qquad (1.3)$$

But

$$A\Big(I - \frac{t}{n}A\Big)^{-1}x = \Big(I - \frac{t}{n}A\Big)^{-1}Ax \to Ax$$

as $n \to \infty$ for all $x \in D(A)$. Therefore

$$V_n(t)\, A\Big(I - \frac{t}{n}A\Big)^{-1}x \to T_t Ax, \quad x \in D(A)$$

and since $A$ is closed

$$A\, V_n(t)\Big(I - \frac{t}{n}A\Big)^{-1}x \to A T_t x$$

so that $A$ commutes with $T_t$ or $AT_t \supset T_tA$. Integrating Eq.(1.3),

$$V_n(t)x - V_n(0)x = V_n(t)x - x = \int_0^t \Big(I - \frac{s}{n}A\Big)^{-(n+1)} Ax\, ds, \quad x \in D(A)$$

But

$$\Big(I - \frac{s}{n}A\Big)^{-(n+1)} Ax \to T_s Ax$$

uniformly for $s$ in each finite interval. Hence

$$T_t x - x = \int_0^t T_s Ax\, ds, \quad x \in D(A)$$

so that

$$\frac{d}{dt}(T_t x) = T_t Ax = A T_t x, \quad x \in D(A)$$

Thus $x(t) = T_t x_0$ is the solution of

$$\dot{x} = Ax, \quad x(0) = x_0$$

if $x_0 \in D(A)$.

Utilizing this fact, it is not difficult to show that

$$T_{t+s} = T_t T_s \qquad (1.4)$$

Such an operator $T_t$ with the properties (1.2), (1.4) is called a strongly continuous contraction semi-group. It is possible to obtain other semi-groups; for example, if we replace (b) by

(i) $\|(\lambda I - A)^{-k}\| \le M\lambda^{-k}$, $\lambda > 0$, $k = 1, 2, \ldots$

then it is possible to show that

$$\|T_t\| \le M$$

and if we replace (b) by

(ii) $\|(\lambda I - A)^{-k}\| \le M(\lambda - \beta)^{-k}$, $\lambda > \beta$, $k = 1, 2, \ldots$

then

$$\|T_t\| \le Me^{\beta t}$$

In these cases we shall refer to strongly continuous semi-groups.

2. Solution of the inhomogeneous equation

We have seen that the solution of the homogeneous equation

$$\dot{x} = Ax, \quad x(0) = x_0$$

is $x(t) = T_t x_0$ for $x_0 \in D(A)$.

Consider now the equation

$$\dot{x} = Ax + f, \quad x(0) = x_0 \qquad (2.1)$$

where $f$ is assumed to be strongly continuous with values in $X$, and $A$ generates a strongly continuous semi-group $T_t$.

Suppose that $x(t)$ is a solution, then

$$\frac{d}{ds}\big[T_{t-s}x(s)\big] = -AT_{t-s}x(s) + T_{t-s}\dot{x}(s)$$

Therefore

$$\frac{d}{ds}\big[T_{t-s}x(s)\big] = T_{t-s}f(s)$$

Integrating on $(0,t)$ we obtain

$$x(t) = T_t x_0 + \int_0^t T_{t-s}f(s)\, ds$$

Now one would hope that this expression is always the solution of Eq.(2.1) but this is not in general true. However, we can prove the following theorem:

Theorem

If $A$ generates a strongly continuous semi-group $T_t$, and

a) $f(t)$ is continuously differentiable for $t \ge 0$

b) $x_0 \in D(A)$

then

$$x(t) = T_t x_0 + \int_0^t T_{t-s}f(s)\, ds$$

is continuously differentiable and satisfies Eq.(2.1).

Proof

We need only show that

$$V(t) = \int_0^t T_{t-s}f(s)\, ds$$

satisfies the differential equation and has initial value 0. Let

$$f(s) = f(0) + \int_0^s f'(r)\, dr$$

Then

$$V(t) = \int_0^t T_{t-s}f(0)\, ds + \int_0^t \Big[\int_r^t T_{t-s}\, ds\Big] f'(r)\, dr \qquad (2.2)$$

Now for $x \in D(A)$

$$\frac{d}{ds}[T_s x] = AT_s x$$

Hence

$$T_t x - T_r x = \int_r^t AT_s x\, ds$$

Since $A$ is closed and $D(A)$ is dense in $X$, we have

$$A\int_r^t T_s\, ds = T_t - T_r$$

$$A\int_r^t T_{t-s}\, ds = T_{t-r} - I$$

Using this result in the expression (2.2) ensures that $V(t) \in D(A)$ and

$$AV(t) = (T_t - I)f(0) + \int_0^t (T_{t-r} - I)f'(r)\, dr = -f(t) + T_t f(0) + \int_0^t T_{t-r}f'(r)\, dr$$

Now

$$V(t) = \int_0^t T_s f(t-s)\, ds$$

Hence

$$\frac{dV(t)}{dt} = T_t f(0) + \int_0^t T_s f'(t-s)\, ds$$

Therefore

$$\frac{dV(t)}{dt} = AV(t) + f(t)$$
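The variation-of-constants formula can be checked against a case with a known answer. The sketch below is a scalar illustration, assuming $A = a = -1$ (so $T_t = e^{at}$), $f(t) = \sin t$ and $x_0 = 1$; these data, the midpoint quadrature and the function names are choices made for the example.

```python
import math

def mild_solution(a, x0, f, t, n=2000):
    """x(t) = T_t x0 + integral_0^t T_{t-s} f(s) ds with T_t = exp(a t)."""
    h = t / n
    integral = h * sum(
        math.exp(a * (t - (i + 0.5) * h)) * f((i + 0.5) * h) for i in range(n)
    )
    return math.exp(a * t) * x0 + integral

def closed_form(x0, t):
    """Exact solution of x' = -x + sin t, x(0) = x0."""
    return (x0 + 0.5) * math.exp(-t) + 0.5 * (math.sin(t) - math.cos(t))

if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0):
        print(t, mild_solution(-1.0, 1.0, math.sin, t), closed_form(1.0, t))
```

Here $f$ is continuously differentiable and every $x_0$ lies in the domain of the bounded operator $a$, so the hypotheses of the theorem hold and the two columns should agree up to quadrature error.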

Other results of a different form more useful for control applications have appeared in the literature (see Balakrishnan).

3. Mild solution

If the conditions on $f$ and $x_0$ are not satisfied, but sufficient conditions are imposed so that

$$x(t) = T_t x_0 + \int_0^t T_{t-s}f(s)\, ds \qquad (3.1)$$

is well defined and strongly continuous, then the solution is said to be a "mild" solution (e.g. $f \in L^2(0,T; X)$). This concept of a solution is particularly important in control applications since, if $f$ is regarded as a controller, we may require only that it be piecewise continuous.

There is a theorem which gives a necessary and sufficient condition for an operator $A$ to generate a strongly continuous semi-group:

Theorem

If $X$ is a Banach space and $A$ is an operator on $X$ such that

a) $D(A)$ is dense in $X$

b) The resolvent $(I - n^{-1}A)^{-1}$ exists such that

$$\|(I - n^{-1}A)^{-m}\| \le C, \quad n = 1, 2, \ldots, \quad m = 1, 2, \ldots$$

then $A$ generates a strongly continuous semi-group $T_t$. Unfortunately, these conditions are not easily verified. However, there are a variety of results which give sufficient conditions (Hille and Phillips, and Yosida, see bibliography).

D. NON-LINEAR SYSTEMS

There is a growing literature on these systems (Browder, Kato, see bibliography), and it is not intended to attempt to survey the main ideas in this paper. Instead we shall take a simple example which will illustrate the kind of approach which is frequently used. We shall follow closely the kind of analysis which led to the solution of the linear elliptic problem. First, we state an abstract theorem:

Theorem

Let $T$ be a mapping (possibly non-linear) of the reflexive Banach space $X$ into its dual $X'$, which satisfies

a) $T$ is continuous from lines in $X$ to the weak topology in $X'$.

b) There exists $c$ on $R^1$ with $\lim_{r \to \infty} c(r) = +\infty$ such that for all $x \in X$

$$\langle Tx, x\rangle \ge c(\|x\|)\, \|x\|$$

where $\langle\ ,\ \rangle$ denotes the duality between $X$, $X'$ and $\|\cdot\|$ is the norm in $X$.

c) $T$ is monotone, i.e. for all $x, y \in X$

$$\langle Tx - Ty, x - y\rangle \ge 0$$

Then $T$ maps $X$ onto $X'$.

We shall not prove this theorem but show how a concrete example can be formulated in such a way that the theorem can be used.

Example

Consider

$$Au = \sum_{|\alpha| \le m} D^\alpha A_\alpha(x, u, Du, \ldots, D^m u)$$

where $A$ is a non-linear partial differential operator on an open bounded set $\Omega$. Assume

$$A_\alpha : \Omega \times R^m \to R^1$$

is continuous on $R^m$ for fixed $x \in \Omega$ and measurable in $x$ for fixed $\xi \in R^m$. We assume

$$|A_\alpha(x, \xi)| \le c\,[1 + \|\xi\|_{R^m}]$$

and consider the non-linear equation

$$Au = f \ \text{ on } \Omega, \qquad D^\beta u = 0 \ \text{ on } \Gamma, \quad |\beta| \le m - 1$$

Set

$$a(u, \psi) = \sum_{|\alpha| \le m} \langle (-1)^{|\alpha|} A_\alpha(x, u, Du, \ldots, D^m u),\ D^\alpha \psi\rangle$$

and consider

$$a(u, \psi) = \langle f, \psi\rangle, \quad \psi \in D(\Omega)$$

with

$$u \in H_0^m(\Omega) = X$$

From the assumptions we have

$$|a(u, \psi)| \le c(\|u\|_X)\, \|\psi\|_X, \quad \psi \in D(\Omega)$$

This inequality will also hold for $\psi \in H_0^m(\Omega)$ since $D(\Omega)$ is dense in $H_0^m(\Omega)$, so that we may restate the problem

$$a(u, v) = \langle f, v\rangle, \quad v \in X$$

$$u \in X$$

We also note that $a(u,v)$ is linear in $v$ on $X$. Hence $a(u,v)$ is a bounded linear functional on $X$, and so there exists a unique element $Tu \in X'$, such that

$$\langle Tu, v\rangle = a(u, v), \quad v \in X$$

Similarly, we can impose conditions on $f$ so that there exists a unique $\omega \in X'$ such that

$$\langle \omega, v\rangle = \langle f, v\rangle, \quad v \in X$$

Then the problem becomes

$$\langle Tu, v\rangle = \langle \omega, v\rangle, \quad v \in X$$

i.e.

$$Tu = \omega$$

Now we may use the theorem to determine a solution. For more details see Browder (bibliography).

D. FUNCTIONAL DIFFERENTIAL EQUATIONS

1. General theory

In the previous sections, we have considered ordinary and partial differential equations which are examples of systems whose future behaviour depends on the present state and not on the past. However, there are many applications in control, mathematical biology, economics, etc., where the past does influence the future significantly. One class of such systems is described by differential delay equations or, more generally, by functional differential equations. For example, if we model a population by saying that the growth (or decay) is proportional to the number of people in the population between the ages of 15 and 45, then one model could be

$$\dot{N}(t) = k[N(t - 15) - N(t - 45)]$$

where $N(t)$ is the number in the population at time $t$, and $k$ is a constant. We see here that the current rate of change of $N(t)$ depends on the values of $N$ 15 and 45 units of time earlier.
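To see how such an equation is advanced in time from a prescribed history, the following sketch integrates the population model by the forward Euler method. It is an illustration under assumptions: the step size, the value of $k$, the linear initial history and the function name `simulate_delay` are all chosen for this example.

```python
def simulate_delay(k, history, t_end, dt=0.5, d1=15.0, d2=45.0):
    """Forward-Euler integration of N'(t) = k [N(t - d1) - N(t - d2)].

    `history` gives N on [-d2, 0]; dt is assumed to divide d1, d2 and t_end.
    Returns the list N(0), N(dt), ..., N(t_end).
    """
    s1, s2 = round(d1 / dt), round(d2 / dt)
    n_steps = round(t_end / dt)
    # values[i] holds N at time -d2 + i*dt, so index s2 corresponds to t = 0.
    values = [history(-d2 + i * dt) for i in range(s2 + 1)]
    for step in range(n_steps):
        i = s2 + step                       # index of the current time
        rate = k * (values[i - s1] - values[i - s2])
        values.append(values[i] + dt * rate)
    return values[s2:]

# Assumed initial history: N(t) = 100 + t on [-45, 0].
trajectory = simulate_delay(k=0.01, history=lambda t: 100.0 + t, t_end=15.0)

if __name__ == "__main__":
    print(trajectory[0], trajectory[-1])
```

With this history the delayed difference $N(t-15) - N(t-45)$ equals 30 throughout $[0, 15]$, so $N$ grows linearly from 100 to $100 + 450k = 104.5$. The example also shows why a whole function on $[-45, 0]$, and not a single number, is needed to start the computation.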

A predator-prey model studied by Volterra is

$$\dot{N}_1(t) = \Big[\varepsilon_1 - \gamma_1 N_2(t) - \int_{-r}^{0} F_1(-\theta)\, N_2(t + \theta)\, d\theta\Big] N_1(t)$$

$$\dot{N}_2(t) = \Big[-\varepsilon_2 + \gamma_2 N_1(t) + \int_{-r}^{0} F_2(-\theta)\, N_1(t + \theta)\, d\theta\Big] N_2(t)$$

where $N_1(t)$ is the number of prey in the population at time $t$, and $N_2(t)$ the number of predators at time $t$, $F_1$, $F_2$ are given functions, and $\varepsilon_1$, $\varepsilon_2$, $\gamma_1$, $\gamma_2$ are constants. Here, we see that the current rates of change depend on the past history through the integral terms. Several more examples may be found in the book of Hale.

The theory of existence and uniqueness of solutions of functional differential equations is very similar to that for ordinary differential equations except that it is necessary to consider solution segments

$$x(t + \theta), \quad -b \le \theta \le 0$$

over a time interval rather than just $x(t)$. The following results are developed in much more detail in Hale's book (see bibliography).

Consider the functional differential equation of retarded type

$$\dot{x}(t) = f(t, x_t) \qquad (1)$$

where $f : R \times C([-b,0]; R^n) \to R^n$, and

$C([-b,0]; R^n)$ = all continuous maps $\varphi : [-b,0] \to R^n$ with norm

$$\|\varphi\| = \sup_{\theta \in [-b,0]} |\varphi(\theta)|$$

If $\sigma \in R$, $a \ge 0$, and $x(\cdot) \in C([\sigma - b, \sigma + a]; R^n)$, then for any $t \in [\sigma, \sigma + a]$ we let $x_t \in C([-b,0]; R^n)$ be defined by $x_t(\theta) = x(t + \theta)$, $-b \le \theta \le 0$. So $x_t(\theta)$ is the segment of the curve $x(\cdot)$ over $[t - b, t]$.

A function $x(\cdot)$ is said to be a solution of Eq.(1) if there exist $\sigma \in R$, $a > 0$ such that $x \in C([\sigma - b, \sigma + a); R^n)$ and $x(t)$ satisfies Eq.(1) on $[\sigma - b, \sigma + a)$. For a given $\sigma \in R$ and $h \in C([-b,0]; R^n)$ we say $x = x(\sigma, h)$ is a solution of Eq.(1) with initial value $h$ at $\sigma$ if there is an $a > 0$ such that $x(\sigma, h)$ is a solution of Eq.(1) on $[\sigma - b, \sigma + a)$ and $x_\sigma(\sigma, h) = h$.

Note

For the initial condition it is necessary to prescribe a function on $[-b, 0]$.

One can prove the following existence and uniqueness theorem using methods similar to those used in Section A:

Theorem

a) Suppose $D$ is an open set in $R \times C([-b,0]; R^n)$ and $f : D \to R^n$ is continuous. Then if $(\sigma, h) \in D$ there exists a solution of Eq.(1) through $(\sigma, h)$.

b) If $f$ is also Lipschitzian in $h$ on all compact subsets of $D$ then the solution is unique.

Now let us see the implication of this theorem for the linear system

$$\dot{x}(t) = Ax(t) + Bx(t + \theta_1) + \int_{-b}^{0} C(\theta)\, x(t + \theta)\, d\theta \qquad (2)$$

$$x_0 = h$$

Clearly, all the conditions of the theorem are satisfied and so there is a unique solution on $[0, \infty)$.

If we let $x_t(h)$ be the solution of Eq.(2) considered as a differential equation on $C([-b,0]; R^n)$ and write

$$x_t(h) = T(t)h$$

then $T(t)$, $t \ge 0$, is the strongly continuous semi-group on $C([-b,0]; R^n)$ with infinitesimal generator $\mathscr{A}$ given by

$$\mathscr{A}\varphi(\theta) = \begin{cases} A\varphi(0) + B\varphi(\theta_1) + \displaystyle\int_{-b}^{0} C(\theta)\varphi(\theta)\, d\theta, & \theta = 0 \\[2mm] \dfrac{d\varphi}{d\theta}(\theta), & \theta \ne 0 \end{cases}$$

with domain $D(\mathscr{A}) = \{\varphi : \varphi' \in C([-b,0]; R^n)\}$. Hence the equation may be written as an abstract evolution equation on $C([-b,0]; R^n)$:

$$\dot{x}_t = \mathscr{A}x_t$$

$$x_0 = h$$

If we consider the inhomogeneous equation

$$\dot{x}_t = \mathscr{A}x_t + \tilde{f}(t)$$

$$x_0 = h$$

where $\tilde{f}$ is defined by

$$\tilde{f}(t)(\theta) = \begin{cases} f(t), & \theta = 0 \\ 0, & \theta \ne 0 \end{cases}$$

then by analogy with Sections A and C, we may express the solution as

$$x_t = T(t)h + \int_0^t T(t - s)\tilde{f}(s)\, ds \qquad (3)$$

at least for $f(t)$ continuous.

2. Affine hereditary differential equations in M spaces

It turns out that the abstract evolution equation formulation for linear delay systems is not the most appropriate for many control applications since $C([-b,0]; R^n)$ is not a Hilbert space. However, it is possible to reformulate the system on a Hilbert space $M^2([-b,0]; R^n)$ using the construction of Delfour and Mitter. Let us consider the system

$$\frac{dx}{dt}(t) = Ax(t) + B \begin{cases} x(t + \theta_1), & t + \theta_1 \ge 0 \\ h(t + \theta_1), & t + \theta_1 < 0 \end{cases} + \int_{-b}^{0} C(\theta) \begin{cases} x(t + \theta), & t + \theta \ge 0 \\ h(t + \theta), & t + \theta < 0 \end{cases} d\theta$$

where $0 \le t \le T$, $A$, $B$ are $n \times n$ matrices and $C \in L^\infty([-b,0]; \mathscr{L}(R^n))$.

Definition $M^2$ space

Consider the space of functions

$$\varphi : [-b,0] \to R^n$$

with seminorm

$$\|\varphi\|_{M^2} = \Big( |\varphi(0)|^2 + \int_{-b}^{0} |\varphi(\theta)|^2\, d\theta \Big)^{1/2}$$

where $|\cdot|$ is the $R^n$ norm. We now define $M^2([-b,0]; R^n)$ to be the quotient space generated by the equivalence classes under $\|\varphi_1 - \varphi_2\|_{M^2} = 0$, i.e. we say $\varphi_1 = \varphi_2$ in $M^2$ if

$$\|\varphi_1 - \varphi_2\|_{M^2} = 0$$

Thus $M^2$ is a function space of a point at $\theta = 0$ plus a curve on $[-b, 0)$. The point $\varphi(0)$ is uniquely specified, but the rest of the curve is only specified almost everywhere.

$M^2([-b,0]; R^n)$ is a Hilbert space with inner product

$$\langle \varphi, \psi\rangle = \varphi'(0)\psi(0) + \int_{-b}^{0} \varphi'(s)\psi(s)\, ds$$

where the prime denotes the transpose.

Note. In Hale's formulation, $\varphi$ was continuous on $[-b, 0]$ and so was uniquely specified at all points.

Another way of looking at $M^2([-b,0]; R^n)$ is to say that it is isometrically isomorphic to $R^n \times L^2(-b,0; R^n)$.

We now define another space $AC^2(t_0, t_1; R^n)$ to be the space of absolutely continuous maps $[t_0, t_1] \to R^n$ with derivative in $L^2(t_0, t_1; R^n)$ and norm

$$\|\varphi\|_{AC^2} = \Big( |\varphi(t_0)|^2 + \int_{t_0}^{t_1} |\dot{\varphi}(s)|^2\, ds \Big)^{1/2}$$

Then we can write the homogeneous equation

$$\dot{\phi}(t) = \mathscr{A}\phi(t), \qquad \phi(0) = h \qquad (4)$$

$\mathscr{A}$ is a closed linear operator on $M^2$ with domain $AC^2$ and is defined by

$$(\mathscr{A}h)(\theta) = \begin{cases} Ah(0) + Bh(\theta_1) + \displaystyle\int_{-b}^{0} C(\theta)h(\theta)\, d\theta, & \theta = 0 \\[2mm] \dfrac{dh}{d\theta}(\theta), & \theta \ne 0 \end{cases} \qquad (5)$$

$\mathscr{A}$ is the infinitesimal generator of a strongly continuous semi-group $\Phi(t)$ on $M^2$ and Eq.(4) has the unique solution

$$\phi(t) = \Phi(t)h \quad \text{for } h \in M^2$$

Consider now the inhomogeneous equation on $M^2$

$$\dot{\phi}(t) = \mathscr{A}\phi(t) + \tilde{f}(t) \qquad (6)$$

$$\phi(0) = h$$

where $\mathscr{A}$ is defined by expression (5), and $f \in L^2(0,\infty; R^n)$. Then Eq.(6) has the unique solution in $M^2$

$$\phi(t) = \Phi(t)h + \int_0^t \Phi(t - s)\tilde{f}(s)\, ds$$

Thus we are able to allow for discontinuous inputs which are particularly important in control applications. Moreover, since $M^2$ is a Hilbert space we are able to obtain simple optimization results for the linear quadratic problem.

BIBLIOGRAPHY

YOSIDA, K., Functional Analysis, Springer Verlag (1968).

LIONS, J.L., MAGENES, E., Non-Homogeneous Boundary Value Problems, Springer Verlag (1972).

KATO, T., Perturbation Theory for Linear Operators, Springer Verlag (1966).

CARROLL, R.W., Abstract Methods in Partial Differential Equations, Harper and Row (1969).

CODDINGTON, E.A., LEVINSON, N., Theory of Ordinary Differential Equations, McGraw-Hill (1955).

LIONS, J.L., Optimal Control Systems Governed by Partial Differential Equations, Springer Verlag (1972).

HILLE, E., PHILLIPS, R., Functional Analysis and Semigroups, Am. Math. Soc. Colloq. 31 (1957).

BROWDER, F., Existence and Uniqueness Theorems for Solutions of Non-Linear Boundary Value Problems, Proc. Symp. Appl. Maths. A.M.S. (1965).

KATO, T., Non-Linear Evolution Equations in Banach Space, ibid.

HALE, J.K., Functional Differential Equations, Springer Verlag (1971).

DELFOUR, M.C., MITTER, S.K., Hereditary Differential Systems with Constant Delays I. General Case, J. Diff. Eqns 12 2 (1972) 213-235.

IAEA-SMR-17/4

FINITE-DIMENSIONAL OPTIMIZATION

D.Q. MAYNE
Department of Computing and Control,
Imperial College of Science and Technology,
London, United Kingdom

Abstract

FINITE-DIMENSIONAL OPTIMIZATION.
Finite-dimensional optimization problems of the following types are considered in the paper: unconstrained optimization problem, inequality-constrained optimization problem, equality-constrained optimization problem, non-linear programming problem and optimization problems with special structures.

0. PRELIMINARIES

Optimization Problem

The finite-dimensional optimization problems discussed in these lectures have three ingredients: an objective function $f^0 : R^n \to R$, an inequality constraint set $\Omega$ which is a subset of $R^n$, and an equality constraint set $\{x \in R^n \mid r(x) = 0\}$, where $r$ maps $R^n$ into $R^k$. The problems considered are:

P1. (Unconstrained optimization problem)

$$\min\{f^0(x) \mid x \in R^n\}$$

which is shorthand for: find an $\hat{x}$ in the set $R^n$ such that for all $x \in R^n$, $f^0(\hat{x}) \le f^0(x)$.

P2. (Inequality-constrained optimization problem)

$$\min\{f^0(x) \mid x \in \Omega\}$$

P3. (Equality-constrained optimization problem)

$$\min\{f^0(x) \mid x \in R^n,\ r(x) = 0\}$$

P4. (Non-linear programming problem)

$$\min\{f^0(x) \mid x \in \Omega,\ r(x) = 0\}$$

$\Omega$ will generally be defined in terms of a function $f : R^n \to R^m$, that is:

1. $\Omega \triangleq \{x \in R^n \mid f(x) \le 0\}$

(the notation $f(x) \le z$, $f : R^n \to R^m$, $z \in R^m$, denotes $f^i(x) \le z^i$, $i = 1, 2, \ldots, m$, where $f^i$, $z^i$ denote the $i$-th component of $f$, $z$, respectively).

Optimization problems with special structure (for example, $f^0$, $f$, $r$ linear or quadratic), and, therefore, simpler to solve, will also be considered.

Conventions

The following conventions and notation will be used: $R^n$ denotes the Euclidean space of ordered $n$-tuples of real numbers ($R$ denotes $R^1$). $x \in R^n$ has components $x^1, x^2, \ldots, x^n \in R$ ($x = (x^1, x^2, \ldots, x^n)$). $x$ is treated as a column vector ($n \times 1$ matrix) in matrix operations. $\|\cdot\|$ denotes the norm in $R^n$, defined by $\|x\| = \langle x, x\rangle^{1/2}$. $f$ or $f(\cdot)$ denotes a function; $f : A \to B$ denotes that the domain of $f$ is $A$, and its codomain is $B$. Given a function $f : R^n \to R^m$, $f_x : R^n \to R^{m \times n}$ denotes the Jacobian matrix function whose $ij$-th element is $\partial f^i/\partial x^j$, $R^{m \times n}$ being the space of $m \times n$ real matrices. If $A \in R^{m \times n}$, $\|A\| \triangleq \max\{\|Ax\| \mid \|x\| \le 1\}$. $A^{-1}$ denotes the inverse of a matrix $A$, and $A^T$ its transpose.
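The definition $\|A\| \triangleq \max\{\|Ax\| \mid \|x\| \le 1\}$ can be made concrete for a small matrix. The sketch below is illustrative: the $2 \times 2$ matrix and the function names are assumptions for this example. It compares a brute-force maximum over unit vectors (the maximum over the unit ball is attained on its boundary) with the square root of the largest eigenvalue of $A^T A$, a standard characterization of this norm.

```python
import math

def operator_norm_sampled(A, samples=100000):
    """max ||Ax|| over unit vectors x = (cos th, sin th), by brute-force sampling."""
    best = 0.0
    for k in range(samples):
        th = 2.0 * math.pi * k / samples
        x = (math.cos(th), math.sin(th))
        y0 = A[0][0] * x[0] + A[0][1] * x[1]
        y1 = A[1][0] * x[0] + A[1][1] * x[1]
        best = max(best, math.hypot(y0, y1))
    return best

def operator_norm_exact(A):
    """Square root of the largest eigenvalue of A^T A for a 2x2 matrix."""
    p = A[0][0] ** 2 + A[1][0] ** 2                # (A^T A)_11
    q = A[0][0] * A[0][1] + A[1][0] * A[1][1]      # (A^T A)_12
    r = A[0][1] ** 2 + A[1][1] ** 2                # (A^T A)_22
    lam_max = 0.5 * (p + r + math.sqrt((p - r) ** 2 + 4.0 * q * q))
    return math.sqrt(lam_max)

A = [[3.0, 0.0], [4.0, 5.0]]                       # assumed example matrix

if __name__ == "__main__":
    print(operator_norm_sampled(A), operator_norm_exact(A))
```

For this matrix $A^T A$ has eigenvalues 45 and 5, so both numbers should be close to $\sqrt{45}$.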

Symbols:

$\forall$  for all
$\exists$  there exists; $\nexists$ there does not exist
$\Rightarrow$  implies; $\nRightarrow$ does not imply
$\Leftrightarrow$  if and only if, is equivalent to
$\triangleq$  is defined by
$\ni$  such that
$\{x \mid P\}$  set of points having property $P$
$A + x$  $\{x' \mid x' = a + x,\ a \in A\}$
$\cup$, $\cap$  union, intersection
$A \subset B$  $A$ is contained in $B$
$z \in A$ ($z \notin A$)  $z$ belongs (does not belong) to $A$
$\emptyset$  the empty set
$Z = \{0, 1, 2, 3, \ldots\}$  the set of non-negative integers
$Z^+ = \{1, 2, 3, \ldots\}$  the set of positive integers
$J_m = \{1, 2, \ldots, m\}$, $J_m^0 = \{0, 1, 2, \ldots, m\}$

Topology of Rn

Let B e(x) 4 {x I II x - x II < e } . x is an interior point of A С Rn , if 3 e > 0, Э В e(x) С A, and a closure point if, V e> 0 , В£ (х )П А /0 . A is open if every point of A is an interior point, and closed if every closure point of A is in A. The closure К of A is the set of all closure points of A.The interior Â of A is the set of all interior points of A. (a,b) denotes the open interval {xG R | a< x< b}. [a,b] denotes the closed interval {x é r | aSx áb].

Sequences

{ x j G R n } denotes a sequence x i , x 2 , x 3 ... in R n . x is an accumulation point of {xi} if VE> 0, VnGI, Э п 'ёп эх11' £ B 6 (x). x° is a limit of the

IA EA -SMR-17/4 225

sequence { x j (or the sequence { x j converges to a limit x °, written x i -> x °) if V£ > 0, 3n° Э хп G Bf(x°), Vn S n°. If x is an accumulative point of {x ¡}, Э a subsequence of {Xj} which converges to x, that is, 3K C Z su ch that x¡ -* x for i e K. If x is a closure point of A С Rn , 3 a sequence { x, G A} having x as an accumulation point.

$A \subset R^n$ is bounded if $\exists \alpha \in R \ni \|x\| \le \alpha$, $\forall x \in A$. $A \subset R^n$ is compact if it is closed and bounded.

2. Theorem

$A \subset R^n$ is compact $\Leftrightarrow$ every sequence $\{x_i \in A\}$ has an accumulation point in A.

3. Theorem (Cauchy convergence criterion)

$\{x_i \in R^n\}$ is convergent $\Leftrightarrow$ $\{x_i\}$ is a Cauchy sequence, that is, $\forall \epsilon > 0$, $\exists n^0 \in Z \ni \|x_m - x_n\| < \epsilon$, $\forall m, n \ge n^0$.

Continuity

$f : R^n \to R^m$ is continuous at $x_0$ if $\forall \delta > 0$, $\exists \epsilon > 0 \ni x \in B_\epsilon(x_0) \Rightarrow f(x) \in B_\delta(f(x_0))$ (note $B_\epsilon$ is a ball in $R^n$, $B_\delta$ a ball in $R^m$). $f$ is continuous on $A \subset R^n$ if it is continuous at all $x \in A$. $f : R^n \to R$ is lower semi-continuous at $x_0$ if $\forall \delta > 0$, $\exists \epsilon > 0 \ni x \in B_\epsilon(x_0) \Rightarrow f(x) > f(x_0) - \delta$. The definitions for lower semi-continuity on A, and upper semi-continuity (at $x_0$ or on A) are similar.

Infimum and minimum

$f : A \to R$ is bounded from below if $\exists \alpha \in R$ ($\alpha$ is a lower bound for $f$ on A) $\ni f(x) \ge \alpha$, $\forall x \in A$. A lower bound $\hat{\alpha}$ (for $f$ on A) is called the infimum of $f$ on A ($\hat{\alpha} = \inf\{f(x) \mid x \in A\}$) if $\forall \epsilon > 0$, $\exists x \in A \ni f(x) < \hat{\alpha} + \epsilon$. Upper bound and supremum are similarly defined. If there exists an $\hat{x} \in A \ni f(\hat{x}) \le f(x)$ $\forall x \in A$, then $f(\hat{x})$ is called the minimum of $f$ on A, and $\hat{x}$ the minimizer of $f$ on A. Maximum is similarly defined.

Existence of minima

4. Theorem (existence)

If $f : A \to R$ is lower semi-continuous and $A \subset R^n$ is compact, then $\exists$ a solution $\hat{x} \in A$ to the problem $\min\{f(x) \mid x \in A\}$.

(i) $f$ not continuous. $f : R \to R$, $f(0) = 1$, $f(x) = x$, $\forall x > 0$. $\inf\{f(x) \mid x \in [0, 1]\} = 0$ but there does not exist an $x \in [0, 1]$ such that $f(x) = 0$.

(ii) A not closed. $f : R \to R$, $x \mapsto x$; $A = (0, 1]$. $\forall x \in A$, $\exists x' \in A \ni f(x') < f(x)$ (for example, $x' = x/2$). Again $\inf\{f(x) \mid x \in A\} = 0$, but $x$ satisfying $f(x) = 0$ does not lie in A.

5.


(iii) A not bounded. $f : R \to R$, $x \mapsto x$; $A = R$. $\forall x \in A$, $\exists x' \in A \ni f(x') < f(x)$.

Differentiability

$f : R^n \to R$ is differentiable at $\hat{x}$ if $\exists$ a linear function $\hat{f} : R^n \to R \ni \forall \delta > 0$, $\exists \epsilon > 0 \ni$:

6. $\|f(x) - \hat{f}(x)\| \le \delta \|x - \hat{x}\|$, $\forall x \in B_\epsilon(\hat{x})$

(equivalently $\|f(x) - \hat{f}(x)\| = o(\|x - \hat{x}\|)$, where $o(\alpha)/\alpha \to 0$ as $\alpha \to 0$). Because $\hat{f}$ is linear it may be expressed as:

$\hat{f}(x) = f(\hat{x}) + \langle \nabla f(\hat{x}), x - \hat{x} \rangle$

The function $\nabla f : R^n \to R^n$ is called the gradient of $f$, and is regarded as a column vector in matrix operations.

7. Theorem

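$f : R^n \to R$ is differentiable at $\hat{x}$ if all the partial derivatives $\partial f / \partial x^i$, $i = 1, \ldots, n$, of $f$ exist and are continuous at $\hat{x}$, in which case $\nabla f = f_x^T$, $f_x \triangleq (\partial f / \partial x^1, \ldots, \partial f / \partial x^n)$.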

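$f : R^n \to R$ is twice differentiable at $\hat{x}$ if $\exists$ a quadratic function $\hat{f} : R^n \to R$, $x \mapsto f(\hat{x}) + \langle \nabla f(\hat{x}), x - \hat{x} \rangle + \tfrac{1}{2} \langle x - \hat{x}, \nabla^2 f(\hat{x})(x - \hat{x}) \rangle \ni \forall \delta > 0$ $\exists \epsilon > 0 \ni$:

$\|f(x) - \hat{f}(x)\| \le \delta \|x - \hat{x}\|^2$, $\forall x \in B_\epsilon(\hat{x})$

$f$ is twice differentiable if the partial derivatives $\partial^2 f / \partial x^i \partial x^j$, $i, j = 1, 2, \ldots, n$, exist and are continuous at $\hat{x}$, in which case $\nabla f = f_x^T$ and $\nabla^2 f = f_{xx}$, where $f_{xx}(x)$ is an $n \times n$ matrix, the Hessian of $f$ at $x$, whose ij-th element is $[f_{xx}(x)]_{ij} = \partial^2 f(x) / \partial x^i \partial x^j$, $i, j = 1, 2, \ldots, n$.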

If $A \in R^{m \times n}$, $\|A\| \triangleq \max\{\|Ax\| \mid \|x\| \le 1\}$. $f : R^n \to R^m$ is differentiable at $\hat{x}$ if $\exists \hat{f} : R^n \to R^m$ satisfying (6). If the partial derivatives $\partial f^i / \partial x^j$, $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, n$, exist and are continuous at $\hat{x}$, then $\hat{f} : R^n \to R^m$ satisfying (6) exists and is defined by $\hat{f}(x) = f(\hat{x}) + f_x(\hat{x})(x - \hat{x})$ (the ij-th component of the Jacobian $f_x : R^n \to R^{m \times n}$ is $\partial f^i / \partial x^j$).

Mean-value theorems

7. Theorem

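Let $f : R^n \to R$ be continuously differentiable ($\nabla f$ exists and is continuous at all $x \in R^n$). Then $\forall x, h \in R^n$, $\forall \lambda \in R$, $\exists \xi \in [x, x + \lambda h]$, the line segment joining $x$ and $x + \lambda h$, $\ni$:

$f(x + \lambda h) = f(x) + \lambda \langle \nabla f(\xi), h \rangle$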

8 . Theorem

If $f : R^n \to R^m$ is continuously differentiable, then $\forall x, h \in R^n$, $\forall \lambda \in R$:

$f(x + \lambda h) = f(x) + \left[ \int_0^1 f_x(x + t \lambda h)\, dt \right] \lambda h$


Convex Sets

The set of points joining $x_1, x_2 \in R^n$ is denoted $[x_1, x_2] \triangleq \{x \in R^n \mid x = x_1 + \lambda(x_2 - x_1),\ \lambda \in [0, 1]\}$, and is called a line segment. $(x_1, x_2) \triangleq \{x \in R^n \mid x = x_1 + \lambda(x_2 - x_1),\ \lambda \in (0, 1)\}$.

A set $A \subset R^n$ is convex if, $\forall x_1, x_2 \in A$, the line segment $[x_1, x_2] \subset A$. If $x \in A$, A convex, and $\exists$ no distinct $x_1, x_2 \in A \ni x \in (x_1, x_2)$, then $x$ is an extreme point of A. The intersection of a finite or infinite family of convex sets in $R^n$ is convex. If $x_1, x_2, \ldots, x_m \in R^n$, then the convex hull of $x_1, x_2, \ldots, x_m$, denoted $co(x_1, x_2, \ldots, x_m)$, is the set

$\{x \in R^n \mid x = \sum_{i=1}^{m} \lambda^i x_i,\ \lambda^i \ge 0 \text{ for } i = 1, 2, \ldots, m,\ \sum_{i=1}^{m} \lambda^i = 1\}$

If $x_0, x_1, \ldots, x_n \in R^n$ are such that $(x_1 - x_0), (x_2 - x_0), \ldots, (x_n - x_0)$ are linearly independent, then $co(x_0, x_1, \ldots, x_n)$ is a simplex in $R^n$. A set C is a cone with vertex $x_0$ if $x \in C \Rightarrow x_0 + \lambda(x - x_0) \in C$, $\forall \lambda \ge 0$. The dimension of a convex set $C \subset R^n$ is the dimension of the smallest subspace $M \ni C \subset x + M \triangleq L$ for some $x \in R^n$. The relative interior of such a set is $rint\, C \triangleq \{x \in L \mid \exists \epsilon > 0 \ni B_\epsilon(x) \cap L \subset C\}$. For example, if $a, b \in R^n$, $int[a, b] = \emptyset$, $rint[a, b] = (a, b)$.

Two sets are disjoint if they have no points in common. Two sets $A, B \subset R^n$ are separable (strictly separable) by a hyperplane $H_c(\alpha) \triangleq \{x \in R^n \mid \langle c, x \rangle = \alpha\}$ if $\exists c \in R^n$, $c \ne 0$, $\alpha \in R \ni$:

$x \in A \Rightarrow \langle c, x \rangle \le \alpha$ ($\langle c, x \rangle < \alpha$)

$x \in B \Rightarrow \langle c, x \rangle \ge \alpha$ ($\langle c, x \rangle > \alpha$)

$H_c(\alpha)$ is said to separate A and B. Two sets may be separable but not disjoint, or disjoint but not separable. However:

9. Separation theorem

Let $A, B \subset R^n$ be convex and let A have dimension n. Then: A and B are separable $\Leftrightarrow$ $int\, A \cap rint\, B = \emptyset$. Note, if $A, B \subset R^n$ are convex and disjoint, then A, B are separable.

The separation theorem may be used to prove a result, Farkas Lemma, much used in the sequel.

10. Farkas lemma

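Let $A \in R^{m \times n}$, $b \in R^n$, and let the columns of $A^T$ be $a_1, a_2, \ldots, a_m \in R^n$. Then the following two statements are equivalent:

(i) $\{b\}$ and $S \triangleq \{x \in R^n \mid x = A^T y,\ y \ge 0\}$ are not disjoint ($\exists y \in R^m$, $y \ge 0 \ni A^T y = \sum_{i=1}^{m} a_i y^i = b$)

(ii) $\{b\}$ and S are not separable ($\langle b, h \rangle \le 0$, $\forall h \in R^n \ni \langle a_i, h \rangle \le 0$, $i = 1, 2, \ldots, m$)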


Convex functions

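Let $X \subset R^n$ be convex. $f : X \to R$ is convex if $x_1, x_2 \in X$, $\lambda \in [0, 1] \Rightarrow$

$f(x_1 + \lambda(x_2 - x_1)) \le f(x_1) + \lambda(f(x_2) - f(x_1))$.

$f$ is concave if $-f$ is convex.

$\hat{x}$ is a (strict) local minimizer of $f : R^n \to R$ on $X \subset R^n$ if $\exists \epsilon > 0 \ni f(\hat{x}) \le f(x)$ ($f(\hat{x}) < f(x)$), $\forall x \in B_\epsilon(\hat{x}) \cap X$ ($\forall x \in B_\epsilon(\hat{x}) \cap (X \setminus \{\hat{x}\})$). ($X \setminus Y \triangleq \{x \in X \mid x \notin Y\}$)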

11. Theorem

If $f : X \to R$ is a convex function on a convex set $X \subset R^n$, then each local minimum of $f$ on X is also the (global) minimum of $f$ on X.

1. CONDITIONS OF OPTIMALITY

Assumption

In the sequel, $f^0 : R^n \to R$, $f : R^n \to R^m$ and $r : R^n \to R^k$ will be assumed to be continuously differentiable.

1.1. Unconstrained optimization

1. Theorem

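If $f^0(\hat{x})$ is a minimum (or local minimum) of $f^0$ on X, where X is an open subset of $R^n$, then:

$\nabla f^0(\hat{x}) = 0$

Proof

$\exists \epsilon > 0 \ni B_\epsilon(\hat{x}) \subset X$. Suppose $\nabla f^0(\hat{x}) \ne 0$ and let $h \triangleq \nabla f^0(\hat{x}) / \|\nabla f^0(\hat{x})\|$. By local optimality $\exists \epsilon_1 \in (0, \epsilon] \ni (\hat{x} - \alpha h) \in X$ and $f^0(\hat{x} - \alpha h) \ge f^0(\hat{x})$ $\forall \alpha \in [0, \epsilon_1]$. But:

$f^0(\hat{x} - \alpha h) = f^0(\hat{x}) - \alpha \langle \nabla f^0(\hat{x}), h \rangle + o(\alpha)$

$= f^0(\hat{x}) - \alpha [c + o(\alpha)/\alpha]$, $c \triangleq \|\nabla f^0(\hat{x})\| > 0$

so that $\exists \alpha \in (0, \epsilon_1] \ni f^0(\hat{x} - \alpha h) \le f^0(\hat{x}) - \alpha c/2$, a contradiction. Hence $\nabla f^0(\hat{x}) = 0$.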

1.2. Inequality constraints

Consider P2 : $\min\{f^0(x) \mid x \in \Omega \subset R^n\}$ where $\Omega \triangleq \{x \in R^n \mid f(x) \le 0\}$, where $f : R^n \to R^m$. $\forall x \in R^n$ let $I_0(x) \triangleq \{i \in \{1, 2, \ldots, m\} \mid f^i(x) = 0\}$. The following condition of optimality is due to Zoutendijk.


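1. Theorem

If $\hat{x} \in \Omega \subset R^n$ is a solution to P2, then:

$\theta(\hat{x}) \triangleq \min_{h \in S} \max\{\langle \nabla f^0(\hat{x}), h \rangle;\ \langle \nabla f^i(\hat{x}), h \rangle,\ i \in I_0(\hat{x})\} = 0$

where S is any subset of $R^n$ containing the origin in its interior.

Proof

Since $0 \in S$, $\theta(\hat{x}) \le 0$. If $\theta(\hat{x}) = -\delta < 0$, then $\exists h \in S \ni$:

$\langle \nabla f^i(\hat{x}), h \rangle \le -\delta$, $\forall i \in \{0\} \cup I_0(\hat{x})$

Hence $\exists \alpha > 0$ such that:

$f^0(\hat{x} + \alpha h) \le f^0(\hat{x}) - \alpha \delta / 2$

$f^i(\hat{x} + \alpha h) \le -\alpha \delta / 2$, $\forall i \in I_0(\hat{x})$

$f^i(\hat{x} + \alpha h) \le 0$, $\forall i \in \bar{I}_0(\hat{x})$

where $\bar{I}_0(\hat{x})$ denotes the complement of $I_0(\hat{x})$ in $J_m \triangleq \{1, 2, \ldots, m\}$, contradicting the optimality of $\hat{x}$.

To proceed further, we have to introduce the concept of a 'linear approximation' to $\Omega$ at $\hat{x}$. A convex cone C is a linearization of $(\Omega - \hat{x}) \triangleq \{h \in R^n \mid h = x - \hat{x},\ x \in \Omega\}$ if, $\forall$ sets $\{h_1, h_2, \ldots, h_j\}$ of linearly independent vectors in C, $\exists \epsilon > 0 \ni co\{0, \epsilon h_1, \epsilon h_2, \ldots, \epsilon h_j\} \subset (\Omega - \hat{x})$. If $h \in C$, $\exists \epsilon > 0 \ni \epsilon' h \in (\Omega - \hat{x})$, $\forall \epsilon' \in [0, \epsilon]$.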

2. Theorem

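If either

A(i) $\{\nabla f^i(\hat{x}),\ i \in I_0(\hat{x})\}$ is a set of linearly independent vectors

or

A(ii) $\exists h \in R^n \ni \langle \nabla f^i(\hat{x}), h \rangle < 0$, $\forall i \in I_0(\hat{x})$

then $C \triangleq \{h \in R^n \mid \langle \nabla f^i(\hat{x}), h \rangle < 0,\ i \in I_0(\hat{x})\}$ is a linearization of $(\Omega - \hat{x})$, if $C \ne \emptyset$.

In fact, A(i) $\Rightarrow$ A(ii).

3. Example

$f^1(x) = (x^1 - 1)^3 + x^2$, $f^2(x) = -x^2$, $\hat{x} = (1, 0)$, $I_0(\hat{x}) = \{1, 2\}$,

$\nabla f^1(\hat{x}) = (0, 1)$, $\nabla f^2(\hat{x}) = (0, -1)$. Here A(ii) fails and $C = \emptyset$; the corresponding closed cone $\{h \mid \langle \nabla f^i(\hat{x}), h \rangle \le 0,\ i \in I_0(\hat{x})\} = \{h \mid h^2 = 0\}$ is not a linearization of $(\Omega - \hat{x})$.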


4. Theorem

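Suppose $\hat{x} \in \Omega$ is a solution to P2, and C is a linearization of $(\Omega - \hat{x})$. Then:

$\langle \nabla f^0(\hat{x}), h \rangle \ge 0$, $\forall h \in C$

Proof

For, if not, $\exists h \in C$, $\delta > 0 \ni$

$\langle \nabla f^0(\hat{x}), h \rangle \le -\delta$

Now, $\exists \epsilon_1 > 0 \ni \hat{x} + \epsilon h \in \Omega$, $\forall \epsilon \in [0, \epsilon_1]$, and $\exists \epsilon_2 \in (0, \epsilon_1] \ni f^0(\hat{x} + \epsilon h) < f^0(\hat{x})$, $\forall \epsilon \in (0, \epsilon_2]$. Thus $\forall \epsilon \in (0, \epsilon_2]$, $\hat{x} + \epsilon h \in \Omega$ and $f^0(\hat{x} + \epsilon h) < f^0(\hat{x})$, contradicting the optimality of $\hat{x}$.

4a. Corollary. $\langle \nabla f^0(\hat{x}), h \rangle \ge 0$, $\forall h \in \bar{C}$.

Proof

This follows from Theorem 4 and the continuity of the function $\langle \nabla f^0(\hat{x}), \cdot \rangle : R^n \to R$.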

5. Kuhn-Tucker Theorem

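Suppose $\hat{x} \in \Omega$ is a solution to P2, and A(ii) of (1.2.2) holds. Then $\exists \{\lambda^i \ge 0,\ i \in I_0(\hat{x})\} \ni$:

$\nabla f^0(\hat{x}) + \sum_{i \in I_0(\hat{x})} \lambda^i \nabla f^i(\hat{x}) = 0$

Proof

Since A(ii) holds, the cone C of (2) is a non-empty linearization, and (4a) applies with $\bar{C} = \{h \in R^n \mid \langle \nabla f^i(\hat{x}), h \rangle \le 0,\ i \in I_0(\hat{x})\}$, so that:

$-\langle \nabla f^0(\hat{x}), h \rangle \le 0$, $\forall h \in R^n \ni \langle \nabla f^i(\hat{x}), h \rangle \le 0$, $\forall i \in I_0(\hat{x})$

By Farkas' Lemma (0.10), $\exists \{\lambda^i \ge 0 \mid i \in I_0(\hat{x})\} \ni$:

$-\nabla f^0(\hat{x}) = \sum_{i \in I_0(\hat{x})} \lambda^i \nabla f^i(\hat{x})$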
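As a small illustration of the Kuhn-Tucker condition, the following worked example is not from the lectures; the problem data (a linear cost on a disc) are chosen here purely for convenience, and the computation is a sketch under that assumption.

```latex
% Hypothetical example: minimise a linear cost over a disc of radius sqrt(2).
\[
  f^0(x) = x^1 + x^2, \qquad f^1(x) = (x^1)^2 + (x^2)^2 - 2 \le 0 .
\]
% Candidate solution and active set:
\[
  \hat{x} = (-1, -1), \qquad f^1(\hat{x}) = 0, \qquad I_0(\hat{x}) = \{1\} .
\]
% Gradients at the candidate:
\[
  \nabla f^0(\hat{x}) = (1, 1), \qquad \nabla f^1(\hat{x}) = (-2, -2) .
\]
% A(ii) holds, e.g. with h = (1, 1):
\[
  \langle \nabla f^1(\hat{x}), h \rangle = -4 < 0 .
\]
% Kuhn-Tucker condition with a non-negative multiplier:
\[
  \nabla f^0(\hat{x}) + \lambda^1 \nabla f^1(\hat{x})
  = (1, 1) + \tfrac{1}{2}(-2, -2) = 0, \qquad \lambda^1 = \tfrac{1}{2} \ge 0 .
\]
```

Because $f^0$ and $f^1$ are convex in this example, the condition is also sufficient here, although the theorem above only asserts necessity.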

1.3. Equality and inequality constraints

Consider P3 : $\min\{f^0(x) \mid f(x) \le 0,\ r(x) = 0\}$. Let $g : R^n \to R^{k+1}$ be defined by:

$g(x) = (f^0(x), r(x))$

$g$ is continuously differentiable, its Jacobian being $g_x$. We consider first the relatively simple case when $r$ is affine.

1. Theorem

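If $\hat{x}$ is an optimal solution of P3, C is a linearization of $(\Omega - \hat{x})$ and $r$ is affine, then $\exists \lambda = (\lambda^0, \lambda^1, \ldots, \lambda^k) \in R^{k+1}$, $\lambda^0 \le 0$, $\lambda \ne 0$, $\ni$:

$\langle \lambda, g_x(\hat{x}) h \rangle \le 0$, $\forall h \in \bar{C}$

Proof

Let $C_z \triangleq g_x(\hat{x}) C$

$S \triangleq \{z \in R^{k+1} \mid z = -\beta(1, 0, \ldots, 0),\ \beta > 0\}$

$C_z$ and S are convex cones. We first prove that $C_z$ and S are linearly separable. For, if they are not, $C_z$ has dimension $k+1$, and all points of S are in the interior of $C_z$. Let $z^* \in S$; since $z^* \in int\, C_z$, $\exists h^* \in C \ni z^* = g_x(\hat{x}) h^*$, that is, $(f^0_x(\hat{x}) h^*, r_x(\hat{x}) h^*) = -\beta^*(1, 0, \ldots, 0)$ for some $\beta^* > 0$, so that:

$f^0_x(\hat{x}) h^* = -\beta^* < 0$, $r_x(\hat{x}) h^* = 0$

Since $f^0_x(\hat{x}) h^* = -\beta^*$ and C is a linearization, $\exists \epsilon > 0 \ni f^0(\hat{x} + \epsilon h^*) < f^0(\hat{x})$, $(\hat{x} + \epsilon h^*) \in \Omega$, and $r_x(\hat{x}) \epsilon h^* = 0$. Since $r(\hat{x}) = 0$, and $r$ is affine, the latter condition implies $r(\hat{x} + \epsilon h^*) = 0$. Hence $\hat{x}$ is not optimal, a contradiction, so that $C_z$ and S are linearly separated.

Hence $\exists \lambda \in R^{k+1}$, $\lambda \ne 0$, $\ni$:

$\langle \lambda, z \rangle \le 0$, $\forall z \in C_z$

$\langle \lambda, z \rangle \ge 0$, $\forall z \in S$

that is

$\langle \lambda, g_x(\hat{x}) h \rangle \le 0$, $\forall h \in C$

$-\lambda^0 \ge 0$, that is $\lambda^0 \le 0$

Since $\langle \lambda, \cdot \rangle$ is a continuous function, C may be replaced by $\bar{C}$, yielding the desired result.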

We now show that the latter result can be expressed in the usual form:

2. Theorem

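If $\hat{x}$ is an optimal solution of P3, A(ii) of (1.2.2) holds and $r$ is affine, then $\exists \lambda \in R^{k+1}$, $\lambda \ne 0$, $\lambda^0 \le 0$, and scalars $\mu^i \le 0$, $i \in I_0(\hat{x})$, $\ni$:

$\lambda^0 \nabla f^0(\hat{x}) + \sum_{i=1}^{k} \lambda^i \nabla r^i(\hat{x}) + \sum_{i \in I_0(\hat{x})} \mu^i \nabla f^i(\hat{x}) = 0$

Proof

From Theorem 1.3.1, with $\bar{C} = \{h \in R^n \mid \langle \nabla f^i(\hat{x}), h \rangle \le 0,\ i \in I_0(\hat{x})\}$:

$\langle g_x(\hat{x})^T \lambda, h \rangle \le 0$, $\forall h \in \bar{C}$

$\Rightarrow \langle \lambda^0 \nabla f^0(\hat{x}) + \sum_{i=1}^{k} \lambda^i \nabla r^i(\hat{x}), h \rangle \le 0$, $\forall h \in R^n \ni \langle \nabla f^i(\hat{x}), h \rangle \le 0$, $\forall i \in I_0(\hat{x})$

Applying Farkas Lemma, $\exists \{\mu^i \le 0 \mid i \in I_0(\hat{x})\} \ni$

$\lambda^0 \nabla f^0(\hat{x}) + \sum_{i=1}^{k} \lambda^i \nabla r^i(\hat{x}) = -\sum_{i \in I_0(\hat{x})} \mu^i \nabla f^i(\hat{x})$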

The major task in the sequel is to remove the restriction 'r affine' in Theorems (1) and (2). This complicates the proof considerably, and use has to be made of a fixed-point theorem.

3. Theorem ¹

If for some $\epsilon > 0$, $\hat{x} \in R^n$, $\gamma$ is a continuous map from $B_\epsilon(\hat{x})$ into $B_\epsilon(\hat{x})$, then $\gamma$ has a fixed point, i.e. $\exists x \in B_\epsilon(\hat{x}) \ni$

$\gamma(x) = x$

4. Theorem

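If $\hat{x}$ is an optimal solution of P3, C is a linearization of $(\Omega - \hat{x})$, then $\exists \lambda = (\lambda^0, \lambda^1, \ldots, \lambda^k) \in R^{k+1}$, $\lambda^0 \le 0$, $\lambda \ne 0$, $\ni$:

$\langle \lambda, g_x(\hat{x}) h \rangle \le 0$, $\forall h \in \bar{C}$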

Proof

Firstly we show that $C_z$ and S are linearly separable. For if $C_z$ and S are not linearly separable, then S lies in the interior of $C_z$ $\Rightarrow \exists z^0 \in S$ ($z^0 = -\beta^0(1, 0, \ldots, 0)$, $\beta^0 > 0$) $\ni z^0 \in int\, C_z$, and $\exists$ simplex $\Sigma = co\{0, z^1, z^2, \ldots, z^{k+1}\} \ni z^0 \in int\, \Sigma$ and $\Sigma \subset C_z$. Since $C_z$ has dimension $k+1$, $g_x(\hat{x})$ has maximal rank. $\exists \{h^i \in C \mid i = 1, 2, \ldots, k+1\} \ni$

5. $z^i = g_x(\hat{x}) h^i$, $i = 1, 2, \ldots, k+1$

Also because C is a linearization of $(\Omega - \hat{x})$, $z^i$ and, hence, $h^i$ can be chosen so that

6. $co\{0, h^1, \ldots, h^{k+1}\} \subset (\Omega - \hat{x})$

Now, $\exists r > 0 \ni B_r(z^0) \subset \Sigma$,

also $B_{\alpha r}(\alpha z^0) \subset \Sigma$, $\forall \alpha \in [0, 1]$.

¹ In the following, $B_\epsilon(\hat{x}) = \{x \mid \|x - \hat{x}\| \le \epsilon\}$ denotes a closed ball.

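Consider the map $\gamma_\alpha : B_{\alpha r}(\alpha z^0) \to R^{k+1}$ defined by:

7. $\gamma_\alpha(z) = z + \alpha z^0 - [g(\hat{x} + H Z^{-1} z) - g(\hat{x})]$

where the i-th column of $H \in R^{n \times (k+1)}$ is $h^i$, and the i-th column of $Z \in R^{(k+1) \times (k+1)}$ is $z^i$. (Z is invertible since $\Sigma$ is a simplex, and $Z = g_x(\hat{x}) H$.) If $h = H Z^{-1} z$, and $z \in B_{\alpha r}(\alpha z^0) \subset \Sigma \subset C_z$, then $h \in (\Omega - \hat{x})$, and $g_x(\hat{x}) h = z$. Because $g$ is continuously differentiable and $\|z\| \le \alpha c$, for some $c < \infty$, $\forall z \in B_{\alpha r}(\alpha z^0)$, it follows that $\exists \alpha^0 \in (0, 1] \ni$

$\|[g(\hat{x} + H Z^{-1} z) - g(\hat{x})] - g_x(\hat{x}) H Z^{-1} z\|$

$= \|[g(\hat{x} + H Z^{-1} z) - g(\hat{x})] - z\| \le \alpha^0 r / 2$, $\forall z \in B_{\alpha^0 r}(\alpha^0 z^0)$.

Applying this to (7) yields:

$\|\gamma_{\alpha^0}(z) - \alpha^0 z^0\| \le \alpha^0 r / 2$, $\forall z \in B_{\alpha^0 r}(\alpha^0 z^0)$,

so that $\gamma_{\alpha^0}$ maps $B_{\alpha^0 r}(\alpha^0 z^0)$ into $B_{\alpha^0 r}(\alpha^0 z^0)$. Since $\gamma_{\alpha^0}$ is continuous, by Brouwer's fixed-point theorem, $\exists \tilde{z} \in B_{\alpha^0 r}(\alpha^0 z^0) \ni$:

$\gamma_{\alpha^0}(\tilde{z}) = \tilde{z}$

that is

$g(\hat{x} + \tilde{y}) = g(\hat{x}) + \alpha^0 z^0$

where

$\tilde{y} = H Z^{-1} \tilde{z} \in (\Omega - \hat{x})$

Hence

$\tilde{x} \triangleq \hat{x} + \tilde{y} \in \Omega$

and

$f^0(\tilde{x}) = f^0(\hat{x}) - \alpha^0 \beta^0$, $\alpha^0, \beta^0 > 0$

$r(\tilde{x}) = 0$

contradicting the optimality of $\hat{x}$. Hence $C_z$ and S are linearly separated. The rest of the proof is now the same as in the proof of Theorem 1.3.1.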


5. Theorem

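If $\hat{x}$ is an optimal solution of P3, and A(ii) of (1.2.2) holds, then $\exists \lambda \in R^{k+1}$, $\lambda^0 \le 0$, $\lambda \ne 0$, and scalars $\mu^i \le 0$, $i \in I_0(\hat{x})$, $\ni$:

$\lambda^0 \nabla f^0(\hat{x}) + \sum_{i=1}^{k} \lambda^i \nabla r^i(\hat{x}) + \sum_{i \in I_0(\hat{x})} \mu^i \nabla f^i(\hat{x}) = 0$

Proof

The proof is the same as that for Theorem (1.3.2), using the strengthened result Theorem 1.3.4 in place of Theorem 1.3.1.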

6 . Corollary

If $\nabla r^i(\hat{x})$, $i = 1, 2, \ldots, k$, $\nabla f^i(\hat{x})$, $i \in I_0(\hat{x})$, are linearly independent, then $\lambda^0 < 0$.

Proof

Assume $\lambda^0 = 0$. Since $\nabla r^i(\hat{x})$, $i = 1, 2, \ldots, k$, $\nabla f^i(\hat{x})$, $i \in I_0(\hat{x})$, are linearly independent, $\exists h \in R^n$, $h \ne 0$, $\ni \langle \nabla r^i(\hat{x}), h \rangle = 0$, $i = 1, \ldots, k$, $\langle \nabla f^i(\hat{x}), h \rangle < 0$, $\forall i \in I_0(\hat{x})$. From Theorem 1.3.5,

$\sum_{i=1}^{k} \lambda^i \langle \nabla r^i(\hat{x}), h \rangle + \sum_{i \in I_0(\hat{x})} \mu^i \langle \nabla f^i(\hat{x}), h \rangle = 0$

$\Rightarrow \mu^i = 0$, $\forall i \in I_0(\hat{x})$

But then $\sum_{i=1}^{k} \lambda^i \nabla r^i(\hat{x}) = 0$, and the linear independence of $\nabla r^i(\hat{x})$, $i = 1, \ldots, k$, implies in turn that $\lambda^i = 0$, $i = 1, \ldots, k$, that is, $\lambda = (\lambda^0, \lambda^1, \ldots, \lambda^k) = 0$, a contradiction.

2. ALGORITHM MODELS - CONVERGENCE

Polak [1] gives three reasons for using simple algorithm models:

i) classification of computational methods;
ii) elucidation of essential features of algorithms guaranteeing convergence;
iii) providing a simple procedure for obtaining 'implementable' algorithms from 'conceptual' prototypes.

Each iteration of a conceptual algorithm may require an arbitrary number of arithmetical operations and function evaluations, while an implementable algorithm must require only a finite number. For example, many algorithms choose, at each iteration, a 'search direction' and then search along this direction; an algorithm which minimizes a function along this direction would be conceptual. We consider algorithms for determining points in $\Omega \subset R^n$ which are desirable. Conceptually, a point is desirable if it solves the optimization problem; more realistically, a point is desirable if it satisfies some necessary condition of optimality. The simplest algorithms for finding desirable points in a closed subset $\Omega$ of $R^n$ employ two functions, $a : \Omega \to \Omega$ for generating new points and $c : \Omega \to R$ for testing the desirability of a point.

1. Algorithm Model $a : \Omega \to \Omega$, $c : \Omega \to R$

Step 0. Compute an $x_0 \in \Omega$. Set $i = 0$.

Step 1. Set $x_{i+1} = a(x_i)$.

Step 2. If $c(x_{i+1}) \ge c(x_i)$ stop.

Else set $i = i + 1$ and go to step 1.
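The loop structure of this model can be sketched in a few lines of Python. This is only an illustration: the function name `run_algorithm_model`, the iteration cap `max_iter`, and the integer-valued demonstration functions `a_demo` and `c_demo` are assumptions introduced here, not part of the model.

```python
def run_algorithm_model(a, c, x0, max_iter=1000):
    """Algorithm Model 1: x_{i+1} = a(x_i), stop when c(x_{i+1}) >= c(x_i).

    Returns (points, stopped). max_iter is a safeguard added for this sketch;
    the model itself has no iteration cap.
    """
    points = [x0]                              # Step 0
    for _ in range(max_iter):
        points.append(a(points[-1]))           # Step 1
        if c(points[-1]) >= c(points[-2]):     # Step 2: no decrease, so stop
            return points, True
    return points, False


# Hypothetical example on the integers: c(x) = (x - 3)^2, and a moves one
# unit towards 3 (leaving 3 fixed), so that 3 is the only desirable point.
def a_demo(x):
    if x < 3:
        return x + 1
    if x > 3:
        return x - 1
    return x


def c_demo(x):
    return (x - 3) ** 2


points, stopped = run_algorithm_model(a_demo, c_demo, 0)
```

When the loop stops, it is the next-to-last entry of `points` that passes the implicit desirability test, in line with the theorem that follows.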

Under what conditions will this algorithm determine desirable points? The following theorem, due to Polak [1], shows that the essential condition is a kind of semi-continuity property of a.

2. Theorem

Suppose that:

(i) $c$ is either continuous at all non-desirable points in $\Omega$, or else $c$ is bounded from below in $\Omega$ ($\exists \hat{c} \in R \ni c(x) \ge \hat{c}$, $\forall x \in \Omega$)

(ii) $\forall x \in \Omega$ which is not desirable, $\exists \epsilon > 0$, $\delta > 0$ (possibly depending on $x$) $\ni$

$c(a(x')) - c(x') \le -\delta < 0$, $\forall x' \in B_\epsilon(x) \triangleq \{x' \in \Omega \mid \|x' - x\| \le \epsilon\}$

Then, either the sequence constructed by algorithm (1) is finite, and its next to last point is desirable, or it is infinite, and every accumulation point of $\{x_i\}$ is desirable.

Comment

There is an implicit test for desirability in algorithm (1). If $c(a(x)) \ge c(x)$ then $x$ is desirable.

Proof

If the sequence is finite, then, by step 2, $\exists k \ni c(x_{k+1}) \ge c(x_k)$, so that $x_k$ is desirable. If the sequence is infinite and has an accumulation point $x^*$, it has a subsequence converging to $x^*$ in $\Omega$ (that is, $\exists K \subset Z \ni x_i \to x^*$ for $i \in K$). Assume that $x^*$ is not desirable. Hence $\exists \epsilon > 0$, $\delta > 0$ and a $k \in K \ni$:

3. $x_i \in B_\epsilon(x^*)$, $c(x_{i+1}) - c(x_i) \le -\delta$, $\forall i \ge k$, $i \in K$.

Hence, for any two consecutive points $x_i$, $x_{i+j}$ of the subsequence K, $i \ge k$, we have:

$c(x_{i+j}) - c(x_i) = [c(x_{i+j}) - c(x_{i+j-1})] + \ldots + [c(x_{i+1}) - c(x_i)]$

$\le [c(x_{i+1}) - c(x_i)]$

$\le -\delta$

Hence, $\{c(x_i) \mid i \in K\}$ is not a Cauchy sequence, and cannot converge. But, since $c$ is either continuous, or bounded from below and $\{c(x_i) \mid i \in K\}$ is monotonically decreasing for $i \in K$, the sequence $\{c(x_i) \mid i \in K\}$ does converge, a contradiction. Hence $x^*$ is desirable.

Comment

Note that the theorem does not state that accumulation points exist, only that if they do exist they are desirable. However, if $\Omega$ is compact, or if the set $\Omega' \triangleq \{x \in \Omega \mid c(x) \le c(x_0)\}$ is compact (if $x_0 \in \Omega'$ then so is $x_i$, $\forall i \ge 0$), then accumulation points in $\Omega$ do exist.

An algorithm cannot always be expressed in terms of a function $a$. For example, once a search direction is chosen, the algorithm may select any one of many points satisfying some criterion. In such cases the function $a$ should be replaced by a set-valued mapping $A : \Omega \to 2^\Omega$, where $2^\Omega$ is the set of all subsets of $\Omega$.

3. Algorithm Model $A : \Omega \to 2^\Omega$, $c : \Omega \to R$

Step 0. Compute an $x_0 \in \Omega$. Set $i = 0$.

Step 1. Compute a point $y \in A(x_i)$. Set $x_{i+1} = y$.

Step 2. If $c(x_{i+1}) \ge c(x_i)$ stop. Else set $i = i + 1$ and go to step 1.

4. Theorem

Suppose that:

(i) $c$ is either continuous or bounded from below in $\Omega$.

(ii) $\forall x \in \Omega$, $x$ not desirable, $\exists \epsilon > 0$, $\delta > 0 \ni$:

$c(x'') - c(x') \le -\delta$, $\forall x' \in B_\epsilon(x)$, $\forall x'' \in A(x')$

Then the next to last point of any finite sequence, and every accumulation point of any infinite sequence generated by the algorithm is desirable.

These models, and this is one of their great virtues, can relatively easily be modified to cope with the approximations needed to make conceptual algorithms implementable. One method of modelling algorithms with approximations is to employ a set-valued mapping $A : R^+ \times \Omega \to 2^\Omega$, $(\epsilon', x) \mapsto y \in A(\epsilon', x)$, where $\epsilon'$ indicates the degree of approximation; as $\epsilon' \to 0$, $A(\epsilon', \cdot)$ tends, in some sense, to a 'conceptual' A satisfying Theorem (4). These comments are spelt out more precisely in Algorithm (5) and Theorem (6).


5. Algorithm Model $A : R^+ \times \Omega \to 2^\Omega$, $c : \Omega \to R$, $\epsilon_0 > 0$

Step 0. Compute $x_0 \in \Omega$. Set $i = 0$.

Step 1. Set $\epsilon = \epsilon_0$.

Step 2. Compute a $y \in A(\epsilon, x_i)$.

Step 3. If $c(y) - c(x_i) \le -\epsilon$, set $x_{i+1} = y$, set $i = i + 1$ and go to step 1. Else set $\epsilon = \epsilon / 2$ and go to step 2.
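The halving loop of this model can be sketched as follows. The caps `n_accept` and `max_halvings`, and the demonstration functions `A_demo` and `c_sq`, are assumptions made for this sketch: the model as stated has no stop rule, and the map used here is simply an inexact contraction whose error shrinks with $\epsilon$.

```python
def adaptive_model(A, c, x0, eps0, n_accept=10, max_halvings=60):
    """Algorithm Model 5 with two safeguards added for this sketch.

    A(eps, x) returns one point of the set A(eps, x). The loop accepts a
    point only if it lowers c by at least eps; otherwise eps is halved.
    Returns (path, jammed), where jammed reports that max_halvings was hit.
    """
    x = x0                                   # Step 0
    path = [x0]
    for _ in range(n_accept):
        eps = eps0                           # Step 1
        for _ in range(max_halvings):
            y = A(eps, x)                    # Step 2
            if c(y) - c(x) <= -eps:          # Step 3: sufficient decrease
                x = y
                path.append(x)
                break
            eps /= 2.0                       # Step 3, else-branch
        else:
            return path, True                # treated as 'jammed up' at x
    return path, False


# Hypothetical example: c(x) = x^2 and an inexact map whose error is
# proportional to the approximation parameter eps.
def c_sq(x):
    return x * x


def A_demo(eps, x):
    return 0.5 * x + 0.01 * eps


path, jammed = adaptive_model(A_demo, c_sq, 1.0, 1.0)
```

In this example every accepted point lowers `c_sq`, and the accepted $\epsilon$ shrinks as the iterates approach the desirable point 0, which is the behaviour the model is designed to capture.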

Comment

For simplicity the algorithm statement does not include a stop statement; for a version which does, see Polak [1], 1.3.26. Hence the algorithm either produces an infinite sequence $\{x_i\}$, or 'jams up' at some $x_s \in \Omega$, cycling between steps 2 and 3, halving $\epsilon$ at each cycle.

6 . Theorem

Suppose that:

(i) $c$ is either continuous or bounded from below in $\Omega$.

(ii) $\forall x \in \Omega$, $x$ non-desirable, $\exists \epsilon, \delta, \gamma > 0 \ni$

7. $c(x'') - c(x') \le -\delta$, $\forall x' \in B_\epsilon(x)$, $\forall x'' \in A(\epsilon', x')$, $\forall \epsilon' \in [0, \gamma]$

Then either the sequence $\{x_i\}$ jams up at $x_s$, where $x_s$ is desirable ($x$ is desirable if $c(x') - c(x) \ge 0$, $\forall x' \in A(0, x)$), or the sequence is infinite, and all its accumulation points are desirable.

Proof

(i) First we show that the sequence cannot jam up at a non-desirable $x_s$. For if so, the algorithm generates a sequence $y_j \in A(\epsilon_0 / 2^j, x_s) \ni$ (from step 3) $c(y_j) - c(x_s) > -\epsilon_0 / 2^j$. However, $\exists$ integer $J \ni \epsilon_0 / 2^j \le \delta$ and $\epsilon_0 / 2^j \le \gamma$ (where $\delta = \delta(x_s) > 0$ and $\gamma = \gamma(x_s) > 0$ exist by assumption (ii)) $\forall j \ge J$. Hence, from assumption (ii), $c(y_j) - c(x_s) \le -\delta \le -\epsilon_0 / 2^j$, $\forall j \ge J$, which contradicts the statement above that $c(y_j) - c(x_s) > -\epsilon_0 / 2^j$, $\forall j \ge 0$.

(ii) We next show that all accumulation points of an infinite sequence $\{x_i\}$ generated by the algorithm are desirable. For if $x^*$ is a non-desirable accumulation point of $\{x_i\}$, then $\exists K \subset \{0, 1, 2, \ldots\} \ni x_i \to x^*$, for $i \in K$, and $\exists \epsilon, \delta, \gamma > 0$ satisfying (7) with $x = x^*$. Hence $\exists J_1 \ni x_i \in B_\epsilon(x^*)$, $\forall i \in K$, $i \ge J_1$.

Also, $\exists J \ni \epsilon_0 / 2^J \le \delta$, and $\epsilon_0 / 2^J \le \gamma$. From assumption (ii)

$c(x_{i+1}) - c(x_i) \le -\delta \le -\epsilon_0 / 2^J$

so that the test $c(x_{i+1}) - c(x_i) \le -\epsilon$ in step 3 of the algorithm is satisfied with $-\epsilon \le -\epsilon_0 / 2^J$. Hence, any two consecutive points $x_i$, $x_{i+j}$, $i \ge J_1$, in the convergent subsequence $\{x_i \mid i \in K\}$ satisfy

$c(x_{i+j}) - c(x_i) = [c(x_{i+j}) - c(x_{i+j-1})] + \ldots + [c(x_{i+1}) - c(x_i)]$

$\le -\epsilon_0 / 2^J$

Hence, the sequence $\{c(x_i) \mid i \in K\}$ is not Cauchy, but from assumption (i), since $\{x_i \mid i \in K\}$ converges, so does $\{c(x_i) \mid i \in K\}$, a contradiction.

3. UNCONSTRAINED OPTIMIZATION

In this section we shall describe several algorithms for P1, and examine their convergence properties using the algorithm models of Section 2. Since many algorithms choose at each iteration a search direction $s_i$, and then minimize, or approximately minimize, $f^0$ along $s_i$ (that is, minimize the function $\theta : R^+ \to R$ defined by $\theta(\lambda) = f^0(x_i + \lambda s_i)$) we commence by describing several procedures for approximately minimizing a function $\theta : R^+ \to R$.

3.1. Algorithms for approximate minimization of $\theta : R^+ \to R$

1. Golden Section Search

Let $\theta : R^+ \to R$ be convex. The golden-section algorithm consists of two parts. The first part determines an interval $[a_0, b_0]$ which brackets the minimum. The second part reduces the size of the interval bracketing the minimum to a pre-assigned value $\epsilon$. Both parts rely on the convexity of $\theta$. The first part follows. For a given $\rho > 0$, a sequence of points is calculated according to:

$x_0 = 0$

$x_{i+1} = x_i + \rho$

until the first point $x_i$ is reached satisfying:

$\theta(x_{i+1}) \ge \theta(x_i)$

The minimum of $\theta$ will lie in the interval

$I_0 = [a_0, b_0] \triangleq [x_{i-1}, x_{i+1}]$

The second (golden-section) part is defined below ($\forall j \ge 0$, $\ell_j \triangleq (b_j - a_j)$, $I_j \triangleq [a_j, b_j]$):

2. Golden Section Algorithm: $[a_0, b_0]$, $\epsilon > 0$, $F_1 = (3 - \sqrt{5})/2$, $F_2 = 1 - F_1$

Step 0. Set $i = 0$.

Step 1. If $\ell_i \le \epsilon$, set $\bar{x} = (a_i + b_i)/2$ and stop. Else proceed.

Step 2. Set $v_i = a_i + F_1 \ell_i$, $w_i = a_i + F_2 \ell_i$.

Step 3. If $\theta(v_i) < \theta(w_i)$, set $I_{i+1} \triangleq [a_{i+1}, b_{i+1}] = [a_i, w_i]$.

If $\theta(v_i) \ge \theta(w_i)$, set $I_{i+1} \triangleq [a_{i+1}, b_{i+1}] = [v_i, b_i]$.

Set $i = i + 1$ and go to step 1.

Note that $\ell_i = F_2^i \ell_0$. It can be shown that if $I_{i+1} \triangleq [a_{i+1}, b_{i+1}] = [a_i, w_i]$, then $w_{i+1} = v_i$, and if $I_{i+1} = [v_i, b_i]$, then $v_{i+1} = w_i$, reducing the computation required. If $x^*$ denotes the solution to $\min\{\theta(x) \mid x \in R^+\}$, then $|\bar{x} - x^*| \le \epsilon / 2$.
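Both parts of the golden-section search can be sketched in Python as below. The function names `bracket` and `golden_section` and the quadratic test function `theta_demo` are assumptions made for illustration; for brevity the sketch re-evaluates $\theta$ at both interior points in every pass rather than reusing one value as the note above permits, and it assumes $\theta$ is convex with a finite minimizer so that the bracketing loop terminates.

```python
import math

F1 = (3.0 - math.sqrt(5.0)) / 2.0   # about 0.382
F2 = 1.0 - F1                       # about 0.618


def bracket(theta, rho):
    """First part: step by rho from 0 until theta stops decreasing.

    Returns [x_{i-1}, x_{i+1}]; when the very first step fails, the left
    end is taken as 0.
    """
    x_prev, x = 0.0, 0.0
    while theta(x + rho) < theta(x):
        x_prev, x = x, x + rho
    return x_prev, x + rho


def golden_section(theta, a, b, eps):
    """Second part: shrink [a, b] by the factor F2 until its length <= eps."""
    while b - a > eps:
        length = b - a
        v = a + F1 * length
        w = a + F2 * length
        if theta(v) < theta(w):
            b = w                   # keep [a, w]
        else:
            a = v                   # keep [v, b]
    return 0.5 * (a + b)


# Hypothetical convex test function with minimizer 2.3.
def theta_demo(lam):
    return (lam - 2.3) ** 2


a0, b0 = bracket(theta_demo, 1.0)
lam_bar = golden_section(theta_demo, a0, b0, 1e-6)
```

For this test function the bracketing part returns the interval [1, 3], and the midpoint returned by the second part lies within $\epsilon/2$ of the minimizer.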

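Recognizing the use of the algorithms presented in this section in the sequel, we now define $\theta : R^n \times R^+ \to R$ by:

3. $\theta(x, \lambda) \triangleq f^0(x + \lambda h(x)) - f^0(x)$

and a linear approximation $\hat{\theta} : R^n \times R^+ \to R$ to $\theta$ by:

4. $\hat{\theta}(x, \lambda) \triangleq \lambda \langle \nabla f^0(x), h(x) \rangle = \lambda \nabla_\lambda \theta(x, 0)$

It is assumed that

5. $\nabla_\lambda \theta(x, 0) = \langle \nabla f^0(x), h(x) \rangle < 0$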

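6. Armijo's algorithm: $\beta \in (0, 1)$, $\rho > 0$

Step 0. Set $i = 0$, $\lambda_0 = \rho$.

Step 1. If $\theta(x, \lambda_i) \le \tfrac{1}{2} \hat{\theta}(x, \lambda_i)$, stop.

Else set $\lambda_{i+1} = \beta \lambda_i$, set $i = i + 1$ and go to step 1.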
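A minimal Python sketch of this step-length rule follows. The function name `armijo_step`, the cap `max_backtracks`, and the quadratic used in the example are assumptions introduced here; the direction `h` is passed as a fixed vector satisfying the descent assumption (5).

```python
def armijo_step(f0, grad_f0, x, h, beta=0.5, rho=1.0, max_backtracks=60):
    """Return the first lam in {rho, beta*rho, beta^2*rho, ...} such that
    f0(x + lam*h) - f0(x) <= 0.5 * lam * <grad f0(x), h>.

    max_backtracks is a safeguard added for this sketch.
    """
    slope = sum(g * d for g, d in zip(grad_f0(x), h))   # <grad f0(x), h>
    if slope >= 0.0:
        raise ValueError("h is not a descent direction at x")
    fx = f0(x)
    lam = rho                                           # Step 0
    for _ in range(max_backtracks):
        trial = [xi + lam * hi for xi, hi in zip(x, h)]
        if f0(trial) - fx <= 0.5 * lam * slope:         # Step 1 test
            return lam
        lam *= beta                                     # shrink the step
    raise RuntimeError("no acceptable step found")


# Hypothetical example: f0(x) = (x1)^2 + 10 (x2)^2 with h = -grad f0(x).
def f0_demo(x):
    return x[0] ** 2 + 10.0 * x[1] ** 2


def grad_demo(x):
    return [2.0 * x[0], 20.0 * x[1]]


x_demo = [1.0, 1.0]
h_demo = [-g for g in grad_demo(x_demo)]
lam_demo = armijo_step(f0_demo, grad_demo, x_demo, h_demo)
```

With $\beta = 1/2$ and $\rho = 1$, the rule rejects the steps 1, 1/2, 1/4, 1/8 and 1/16 for this example and accepts $\lambda = 1/32$.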

7. Comment

Figure 1 illustrates the situation.

8 . Theorem

Let $\nabla f^0$ and $h$ be continuous, and $\langle \nabla f^0(x), h(x) \rangle = -\gamma < 0$. $\forall x \in R^n$ let $\lambda(x)$ denote the $\lambda$ satisfying the stop condition in step 1, that is, $\lambda(x)$ is the largest $\lambda$ of the form $\lambda = \beta^k \rho$ satisfying:

$\theta(x, \lambda) \le \tfrac{1}{2} \hat{\theta}(x, \lambda)$

Then, $\exists \epsilon > 0$, $\delta > 0$ ($\delta = \gamma / 2$) $\ni$:

$\theta(x', \lambda(x')) \le -\delta$, $\forall x' \in B_\epsilon(x)$

FIG. 1. Illustration of algorithm of approximate minimization.

The proof is left to the reader as an exercise.

3.2. Algorithms for unconstrained minimization

Let us now return to the problem P1 : $\min\{f^0(x) \mid x \in R^n\}$. We assume that $f^0$ is continuously differentiable. The simplest algorithm is:

1. (Conceptual) Steepest Descent Algorithm

Step 0. Select an $x_0 \in R^n$. Set $i = 0$.

Step 1. Set $h(x_i) = -\nabla f^0(x_i)$. If $h(x_i) = 0$ stop.

Step 2. Compute the smallest scalar $\lambda(x_i)$ which solves

$f^0(x_i + \lambda(x_i) h(x_i)) = \min\{f^0(x_i + \lambda h(x_i)) \mid \lambda \ge 0\}$

Step 3. Set $x_{i+1} = x_i + \lambda(x_i) h(x_i)$, set $i = i + 1$ and go to step 1.

2. Theorem

Let $C(x_0) \triangleq \{x \mid f^0(x) \le f^0(x_0)\}$ be bounded.

(i) If $\{x_i\}$ is finite, terminating at $x_k$, then $x_k$ is desirable ($\nabla f^0(x_k) = 0$).

(ii) If $\{x_i\}$ is infinite, every accumulation point $x^*$ is desirable.

Proof

(ii) Let $a : R^n \to R^n$ be defined by step 3 of (1); let $c \triangleq f^0$. $c$ is continuous. Let $\theta$, $\hat{\theta}$ be defined by (3.1.3), (3.1.4). $x^*$ not desirable $\Rightarrow \nabla_\lambda \theta(x^*, 0) = \langle \nabla f^0(x^*), h(x^*) \rangle = -\|\nabla f^0(x^*)\|^2 = -\gamma < 0$. Hence $\exists \lambda_1 > 0 \ni \theta(x^*, \lambda) \le -\lambda \gamma / 2$, $\forall \lambda \in [0, \lambda_1]$, so that $\lambda(x^*)$ (see step 2 of (1)) satisfies $\theta(x^*, \lambda(x^*)) \le -\lambda_1 \gamma / 2$. Let $\phi : R^n \to R$ be defined by:

$\phi(x') \triangleq f^0(x' + \lambda(x^*) h(x')) - f^0(x')$

Clearly, $\phi$ is continuous, and satisfies:

$\phi(x^*) = \theta(x^*, \lambda(x^*)) \le -\lambda_1 \gamma / 2$

Hence $\exists \epsilon > 0 \ni$

$\phi(x') \le -\lambda_1 \gamma / 4$, $\forall x' \in B_\epsilon(x^*)$

But, from the definition of $\phi$ and $\lambda(\cdot)$:

$\theta(x', \lambda(x')) \le \phi(x') \le -\lambda_1 \gamma / 4$, $\forall x' \in B_\epsilon(x^*)$

But

$\theta(x', \lambda(x')) = f^0(a(x')) - f^0(x')$

Hence $f^0$ satisfies assumption (ii) of Theorem (2.2) (with $\delta = \lambda_1 \gamma / 4$), thus proving the result.

Algorithm 1 is conceptual since each iteration involves an optimization problem (step 2) requiring an infinite number of steps. One method of obtaining an implementable algorithm is to use the golden section search of (3.1.1), (3.1.2) to replace step 2, together with a loop to reduce $\epsilon$.

3. (Golden section) Steepest descent: $\epsilon_0 > 0$, $\rho > 0$

Step 0. Select $x_0 \in R^n$. Set $i = 0$.

Step 1. Set $\epsilon = \epsilon_0$.

Step 2. Let $h(x_i) = -\nabla f^0(x_i)$. If $h(x_i) = 0$ stop.

Step 3. Compute $\bar{\lambda}(x_i) \ni |\bar{\lambda}(x_i) - \lambda(x_i)| \le \epsilon / 2$, $\lambda(\cdot)$ defined in step 2 of (1), using (3.1.1), (3.1.2) on the function $\theta(x_i, \cdot)$.

Step 4. If $\theta(x_i, \bar{\lambda}(x_i)) \le -\epsilon$, set $x_{i+1} = x_i + \bar{\lambda}(x_i) h(x_i)$, set $i = i + 1$ and go to step 2. Else set $\epsilon = \epsilon / 2$ and go to step 3.

Polak [1] shows that if $f^0$ is convex, $C(x_0)$ bounded, then any infinite sequence $\{x_i\}$ generated by (3) satisfies $f^0(x_i) \to \min\{f^0(x) \mid x \in R^n\}$.

4. (Armijo) Steepest descent: $\beta \in (0, 1)$, $\rho > 0$.

Step 0. Select $x_0 \in R^n$, set $i = 0$.

Step 1. Set $h(x_i) = -\nabla f^0(x_i)$. If $h(x_i) = 0$, stop.

Step 2. Compute $\lambda(x_i)$ using Armijo's algorithm (3.1.6).

Step 3. Set $x_{i+1} = x_i + \lambda(x_i) h(x_i)$. Set $i = i + 1$, and go to step 1.

That any accumulation point $x^*$ of an infinite sequence generated by (4) is desirable ($h(x^*) = 0$) can be easily established, if $C(x_0)$ is bounded, using Theorem (3.1.8), identifying $c$ with $f^0$ and defining $a : R^n \to R^n$ by step 3.
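The Armijo steepest descent iteration can be sketched in Python as follows. The stopping tolerance `tol`, the caps `max_iter` and `max_backtracks`, and the quadratic test problem are assumptions made for this sketch; in particular the exact test $h(x_i) = 0$ is replaced by a small-gradient test, as any floating-point version must do.

```python
def steepest_descent_armijo(f0, grad_f0, x0, beta=0.5, rho=1.0,
                            tol=1e-6, max_iter=10000, max_backtracks=60):
    """Steepest descent with Armijo step lengths (a sketch of algorithm (4)).

    Stops when ||grad f0(x)|| <= tol, a practical stand-in for h(x) = 0.
    """
    x = list(x0)                                        # Step 0
    for _ in range(max_iter):
        g = grad_f0(x)
        h = [-gi for gi in g]                           # Step 1
        slope = -sum(gi * gi for gi in g)               # <grad f0(x), h>
        if -slope <= tol * tol:
            return x
        fx = f0(x)
        lam = rho                                       # Step 2 (Armijo)
        trial = x
        for _ in range(max_backtracks):
            trial = [xi + lam * hi for xi, hi in zip(x, h)]
            if f0(trial) - fx <= 0.5 * lam * slope:
                break
            lam *= beta
        else:
            return x                                    # no acceptable step
        x = trial                                       # Step 3
    return x


# Hypothetical test problem with minimizer (1, -2).
def f0_quad(x):
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2


def grad_quad(x):
    return [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]


x_min = steepest_descent_armijo(f0_quad, grad_quad, [0.0, 0.0])
```

The cost never increases along the iterates, so they remain in $C(x_0)$, which is the boundedness assumption used in the convergence argument above.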

The convergence properties of algorithms (1), (3) and (4) are unaffected if $h$ is defined by

$h(x) = -D(x) \nabla f^0(x)$

where $D : R^n \to R^{n \times n}$ is positive definite and continuous and $C(x_0)$ is bounded. If

$D(x) = [f^0_{xx}(x)]^{-1}$

then we have Newton-Raphson-type algorithms, for which $\rho$ should be set equal to 1 in (3). If $f^0$ is quadratic, the Newton-Raphson version of algorithm (1) converges in 1 iteration.

3.3. Conjugate direction and conjugate gradient algorithms

To motivate the sequel we note that $f^0$ behaves like a quadratic function in the neighbourhood of any local (unconstrained) minimum $x^*$, in the sense that if $f^0$ is twice differentiable:

$f^0(x^* + \delta x) = f^0(x^*) + \tfrac{1}{2} \langle \delta x, f^0_{xx}(x^*) \delta x \rangle + o(\|\delta x\|^2)$

Hence, if an algorithm has poor behaviour when applied to a quadratic $f^0$, it will behave poorly when used for a general $f^0$. In this connection we note that the steepest descent algorithm, when $f^0$ is quadratic and positive definite, that is:

1. $f^0(x) = a + \langle b, x \rangle + \tfrac{1}{2} \langle x, Cx \rangle$

where C is symmetric and positive definite, has the following property:

2. $f^0(x_i) - f^0(x^*) \le \alpha^i [f^0(x_0) - f^0(x^*)]$

where

3. $\alpha \triangleq 1 - (\lambda_{min} / \lambda_{max})$

where $\lambda_{min}$ and $\lambda_{max}$ are the minimum and maximum eigenvalues of C ($\alpha \to 1$ as the 'condition number' $(\lambda_{min} / \lambda_{max}) \to 0$). Note, in passing, that if $f^0$ satisfies 1, $f^0_x(x)^T = Cx + b$, $f^0_{xx}(x) = C$ and $x^* = -C^{-1} b$ is the unique local minimum (if C is p.d.). Also note that

$f^0(x + \delta x) = f^0(x) + f^0_x(x) \delta x + \tfrac{1}{2} \langle \delta x, f^0_{xx}(x) \delta x \rangle$

The Newton-Raphson algorithm satisfies

$x_{i+1} = x_i - \lambda(x_i) [f^0_{xx}(x_i)]^{-1} \nabla f^0(x_i)$

$= x_i - \lambda(x_i) C^{-1} (C x_i + b)$

$= -C^{-1} b = x^*$, if $\lambda(x_i) = 1$

and so behaves excellently when $f^0$ is quadratic. However, the computation of $f^0_{xx}$ is expensive. The algorithm described here tries to preserve the desirable 'second order' features of the Newton-Raphson algorithm at less expense. We assume, for simplicity, in the sequel that $f^0$ is quadratic and p.d.

4. Conjugate direction algorithm

This is of the form of the steepest descent algorithm (3.2.1) with the restriction $\lambda \ge 0$ removed from step 2 and step 1 replaced by:

Step 1'. Set $h(x_i) = h_i$ where $\{h_i,\ i = 0, 1, 2, \ldots, n-1\}$ are C-conjugate, that is:

5. $\langle h_i, C h_j \rangle = 0$ $\forall i \ne j$, $i, j \in \{0, 1, 2, \ldots, n-1\}$

which implies that $h_0, h_1, \ldots, h_{n-1}$ are l.i. (linearly independent). Let, $\forall j = 1, 2, \ldots, n$

6. $M_j \triangleq \{x \mid x = \sum_{i=0}^{j-1} \alpha^i h_i,\ \alpha^i \in R\}$

and

7. $L_j \triangleq M_j + x_0$

8. Fact. $\hat{x}$ solves $\min\{f^0(x) \mid x \in L_j\} \Leftrightarrow$

$\nabla f^0(\hat{x}) \perp M_j$ ($\langle \nabla f^0(\hat{x}), h_i \rangle = 0$, $\forall i \in \{0, 1, 2, \ldots, j-1\}$)

Proof

Left as an exercise to the reader.

9. Theorem

If $\{x_i \mid i = 1, 2, \ldots, n\}$ is generated by the C.D. algorithm (and $f^0$ is quadratic and p.d.), then $x_i$ minimizes $f^0$ on $L_i$ (and hence $x_n$ minimizes $f^0$ on $R^n$).

Proof.

$x_1$ minimizes $f^0$ on $L_1$. Assume $x_j$ minimizes $f^0$ on $L_j$, that is $\langle \nabla f^0(x_j), h_i \rangle = 0$ for $i = 0, 1, \ldots, j-1$. Now:

$\nabla f^0(x_{j+1}) = \nabla f^0(x_j) + C(x_{j+1} - x_j)$

$= \nabla f^0(x_j) + \lambda_j C h_j$

By virtue of the C-conjugate property:

$\langle \nabla f^0(x_{j+1}), h_i \rangle = \lambda_j \langle h_i, C h_j \rangle$

$= 0$, for $i = 0, 1, \ldots, j-1$

Also since $x_{j+1}$ minimizes $f^0$ on $\{x_j + \lambda h_j \mid \lambda \in R\}$

$\langle \nabla f^0(x_{j+1}), h_j \rangle = 0$

that is, $x_{j+1}$ minimizes $f^0$ on $L_{j+1}$, thus completing the proof by induction.

The following algorithm automatically generates C-conjugate directions ($g_i \triangleq \nabla f^0(x_i)$, $h_i \triangleq h(x_i)$, $\forall i \in Z$):

10. Conjugate gradient algorithm

Step 0. Select $x_0 \in R^n$, set $i = 0$, set $h_0 = -g_0$.

Step 1. If $g_i = 0$, stop.

Step 2. Compute the smallest scalar $\lambda_i$ which solves

$f^0(x_i + \lambda_i h_i) = \min\{f^0(x_i + \lambda h_i) \mid \lambda \ge 0\}$

Step 3. Set $x_{i+1} = x_i + \lambda_i h_i$.

Set $\beta_{i+1} = \langle g_{i+1}, g_{i+1} \rangle / \langle g_i, g_i \rangle$

Set $h_{i+1} = -g_{i+1} + \beta_{i+1} h_i$

Set $i = i + 1$ and go to step 1.
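For a quadratic $f^0(x) = a + \langle b, x \rangle + \tfrac{1}{2} \langle x, Cx \rangle$ the line minimization in step 2 has the closed form $\lambda_i = -\langle g_i, h_i \rangle / \langle h_i, C h_i \rangle$, which allows a short Python sketch of the iteration. The helper names, the tolerance `tol`, and the 2x2 test data are assumptions made for this illustration; the sketch applies to the quadratic case only.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


def conjugate_gradient_quadratic(C, b, x0, tol=1e-24):
    """Conjugate gradient iteration for f0(x) = <b, x> + 0.5 <x, C x>,
    with C symmetric positive definite. Returns (x, iterations)."""
    x = list(x0)
    g = [ci + bi for ci, bi in zip(matvec(C, x), b)]    # g_0 = C x_0 + b
    h = [-gi for gi in g]                               # h_0 = -g_0
    iterations = 0
    for _ in range(len(b)):                             # at most n steps
        gg = dot(g, g)
        if gg <= tol:                                   # Step 1 (g_i = 0)
            break
        Ch = matvec(C, h)
        lam = -dot(g, h) / dot(h, Ch)                   # Step 2, exact
        x = [xi + lam * hi for xi, hi in zip(x, h)]     # Step 3
        g_new = [gi + lam * chi for gi, chi in zip(g, Ch)]
        beta = dot(g_new, g_new) / gg
        h = [-gn + beta * hi for gn, hi in zip(g_new, h)]
        g = g_new
        iterations += 1
    return x, iterations


# Hypothetical 2x2 example; the minimizer solves C x = -b, i.e. x = (1/11, 7/11).
C_demo = [[4.0, 1.0], [1.0, 3.0]]
b_demo = [-1.0, -2.0]
x_cg, n_cg = conjugate_gradient_quadratic(C_demo, b_demo, [2.0, 1.0])
```

The loop runs at most $n$ times, matching the finite-termination property established for quadratic $f^0$ in Theorem 12 below.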

11. Lemma

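Let $f^0$ be p.d. and quadratic, and assume that $h_0, h_1, \ldots, h_{j-1}$ are C-conjugate. Then:

(i) $\langle g_j, g_i \rangle = 0$, $i = 0, 1, \ldots, j-1$

(ii) $g_j \ne 0 \Rightarrow h_0, h_1, \ldots, h_j$ are C-conjugate.

Proof

(i) $(h_0, h_1, \ldots, h_{j-1})$ are C-conjugate $\Rightarrow$ (Theorem 9) $x_j$ minimizes $f^0$ on $L_j$ $\Rightarrow \langle g_j, h_i \rangle = 0$, $i = 0, 1, \ldots, j-1$.

But for $i = 0, 1, \ldots, j-1$:

$\langle g_j, h_i \rangle = \langle g_j, -g_i + \beta_i h_{i-1} \rangle$

$= -\langle g_j, g_i \rangle$

(ii) $\langle h_j, C h_i \rangle = \langle -g_j + \beta_j h_{j-1}, C h_i \rangle$

$= \langle -g_j, C h_i \rangle$, $i = 0, 1, \ldots, j-2$

But $g_i \ne 0 \Rightarrow \lambda_i \ne 0$ (prove) $\Rightarrow C h_i = [g_{i+1} - g_i] / \lambda_i \Rightarrow$ (using (i)) $\langle h_j, C h_i \rangle = 0$, $i = 0, 1, \ldots, j-2$.

$\langle h_j, C h_{j-1} \rangle = \langle -g_j + \beta_j h_{j-1}, (g_j - g_{j-1}) / \lambda_{j-1} \rangle = 0$

as can be shown using (i), the definition of $\beta_j$, and the fact that $\lambda_{j-1}$ satisfies:

$0 = \langle g_j, h_{j-1} \rangle = \langle g_{j-1} + C \lambda_{j-1} h_{j-1}, h_{j-1} \rangle$

$\Rightarrow \lambda_{j-1} = -\langle g_{j-1}, h_{j-1} \rangle / \langle h_{j-1}, C h_{j-1} \rangle$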

12. Theorem

If f° is p.d. and quadratic, then algorithm (10) minimizes f° in at most n iterations.


Proof

$h_0$ and $h_1$ are C-conjugate (prove). By induction, from Lemma 11, $h_0, h_1, \ldots, h_{n-1}$ are C-conjugate. The result follows from Theorem 9.

Polak and Ribière have established [2] that a modified version of the algorithm ($\beta_{i+1} = \langle (g_{i+1} - g_i), g_{i+1} \rangle / \langle g_i, g_i \rangle$), which has the same behaviour when applied to quadratic functions (why?), has the property, when $f^0$ is strictly convex and twice continuously differentiable, that accumulation points of infinite sequences are desirable. This follows from the fact that $\exists \rho > 0 \ni -\langle g_i, h_i \rangle \ge \rho \|g_i\| \|h_i\|$, $\forall i$.

3.4. Pseudo-Newton-Raphson algorithms

There exists a class of algorithms, variously called pseudo- or quasi-Newton-Raphson, Variable Metric, or Secant, which have the same or similar properties as the conjugate gradient algorithm when applied to quadratic functions, and in addition produce an estimate of the Hessian $\nabla^2 f^0$ or its inverse. The essential property of these methods is that, at iteration $j$, the estimate $H_j$ of $(\nabla^2 f^0)^{-1}$ satisfies:

1. $\Delta x_i = H_j \Delta g_i$, $i = 0, \ldots, j-1$

where $\Delta x_i \triangleq x_{i+1} - x_i$, $\Delta g_i \triangleq g_{i+1} - g_i$, $\forall i$. The search directions $\{h_j\}$ are generated according to:

2. $h_j = -H_j g_j$

Since, if $f^0$ is quadratic and p.d., $\Delta x_i = C^{-1} \Delta g_i$, $i = 0, \ldots, j-1$, it can be seen that the restriction of $H_j$ to $CM_j$ (the linear subspace spanned by $\{\Delta g_0, \ldots, \Delta g_{j-1}\}$ or $\{C\Delta x_0, \ldots, C\Delta x_{j-1}\}$) is equal to the restriction of $C^{-1}$ to $CM_j$ (we shall say $H_j = C^{-1}$ on $CM_j$). Assume $H_j$ is non-singular. Then, if $h_j \in M_j$, from (2) and the non-singularity of $H_j$, $g_j \in CM_j$, so that $H_j g_j = C^{-1} g_j$. Since $x^* = x_j - C^{-1} g_j$, $x^* = x_j + \lambda h_j$ when $\lambda = 1$. Consequently, if $h_0, h_1, \ldots, h_{n-1}$ are linearly independent, $H_n = C^{-1}$, and $x_{n+1} = x_n + \lambda h_n = x^*$ for $\lambda = 1$, where $x^*$ minimizes $f^0$ in $R^n$. Finally, it can be shown that if $f^0$ is quadratic and p.d., and $H_i$ is symmetric $\forall i$, the algorithm generates C-conjugate search directions, and hence minimizes $f^0$ in $R^n$ in at most $n$ iterations.

The most used algorithm for calculating $\{H_i\}$ is the Davidon-Fletcher-Powell (DFP) formula [3]. $H_0$ is chosen to be symmetric, p.d. (for example, $H_0 = I$) and $\{H_i\}$ calculated according to:

3. $H_{j+1} = H_j + \dfrac{\Delta x_j \Delta x_j^T}{\langle \Delta x_j, \Delta g_j \rangle} - \dfrac{(H_j \Delta g_j)(H_j \Delta g_j)^T}{\langle \Delta g_j, H_j \Delta g_j \rangle}$

It can be shown that if $f^0$ is minimized exactly in each search direction, then, $\forall j \ge 0$

( 1 ) HjAxj = 0 , i = 0 , 1 , 2 , . . . j - 1

(2) Hj+1 is p.d. symmetric.


This formula seems to be robust. Convergence of the resultant algorithm when applied to f° , strictly convex and twice continuously differentiable, has been established by Powell (see Polak [1]).
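As an illustration of the DFP update, the following sketch applies it to a quadratic with exact line searches and then compares the final estimate with the inverse Hessian. It is a minimal example under stated assumptions, not the author's implementation: $f^0(x) = \langle x, Cx \rangle / 2 - \langle d, x \rangle$ with a made-up p.d. matrix `C`, $H_0 = I$, and the update uses the standard denominators $\langle \Delta x_j, \Delta g_j \rangle$ and $\langle \Delta g_j, H_j \Delta g_j \rangle$.

```python
def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def outer(u, v):
    return [[a * b for b in v] for a in u]

def dfp_quadratic(C, d, x0):
    """DFP with exact line search on <x, Cx>/2 - <d, x>; returns (x, H)."""
    n = len(x0)
    H = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]  # H_0 = I
    x = list(x0)
    g = [a - b for a, b in zip(matvec(C, x), d)]
    for _ in range(n):
        if dot(g, g) < 1e-24:
            break
        h = [-v for v in matvec(H, g)]               # h_j = -H_j g_j
        lam = -dot(g, h) / dot(h, matvec(C, h))      # exact minimization along h_j
        dx = [lam * hi for hi in h]
        x = [xi + di for xi, di in zip(x, dx)]
        g_new = [a - b for a, b in zip(matvec(C, x), d)]
        dg = [a - b for a, b in zip(g_new, g)]
        Hdg = matvec(H, dg)
        A, B = outer(dx, dx), outer(Hdg, Hdg)
        sa, sb = dot(dx, dg), dot(dg, Hdg)
        # DFP rank-two update of the inverse-Hessian estimate
        H = [[H[r][c] + A[r][c] / sa - B[r][c] / sb for c in range(n)]
             for r in range(n)]
        g = g_new
    return x, H

C = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]   # hypothetical p.d. Hessian
d = [1.0, 2.0, 3.0]
x_min, H_final = dfp_quadratic(C, d, [0.0, 0.0, 0.0])

# Deviation of H_final * C from the identity: small if H_final approximates C^{-1}.
HC = [[sum(H_final[r][k] * C[k][c] for k in range(3)) for c in range(3)]
      for r in range(3)]
err = max(abs(HC[r][c] - (1.0 if r == c else 0.0))
          for r in range(3) for c in range(3))
```

After $n = 3$ updates `err` is at rounding level, consistent with the statement that $H_n = C^{-1}$ when the search directions are linearly independent.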

Another similar formula is the rank 1 formula due to Sargent (see, e.g. Luenberger [3]):

4. $H_{j+1} = H_j + \dfrac{(\Delta x_j - H_j \Delta g_j)(\Delta x_j - H_j \Delta g_j)^T}{\langle \Delta x_j - H_j \Delta g_j, \Delta g_j \rangle}$

if $\Delta x_j \ne H_j \Delta g_j$; otherwise $H_{j+1} = H_j$. Unlike the DFP formula, this formula does not require exact minimizations of $f^0$ along search directions, and, therefore, lends itself to 'implementable' versions using, for example, Armijo's method. If we assume that $H_j$ satisfies (1), that is, $H_j \Delta g_i = \Delta x_i$, $i = 0, 1, \ldots, j-1$, then $H_{j+1} \Delta g_i = \Delta x_i$, $i = 0, 1, \ldots, j$. If the algorithm does minimize $f^0$ along each search direction, and $f^0$ is quadratic and p.d., convergence in at most $n$ iterations can be established.

4. RATE OF CONVERGENCE

Suppose $\{x_k\}$ is a sequence converging to $x^*$, and, for all non-negative integers $k$, let $\eta_k^p$ denote $\|x_{k+1} - x^*\| / \|x_k - x^*\|^p$. $\{x_k\}$ is said to converge linearly if $\exists \beta \in (0, 1)$ such that:

1. $\limsup_{k \to \infty} \eta_k^1 \le \beta$

i.e. if $\exists k_1$ such that $\|x_{k+1} - x^*\| \le \beta \|x_k - x^*\|$, $\forall k \ge k_1$. If $\{x_k\}$ converges linearly, $\exists c \in R$ such that $\|x_k - x^*\| \le c \beta^k$, $\forall k \ge k_1$, i.e. $\{x_k\}$ converges at least as fast as a geometric progression. $\beta$ is called the convergence ratio. If $\beta = 0$ the convergence is said to be superlinear. The order of convergence is the supremum of $p \in R$ such that $\limsup \eta_k^p < \infty$. ($\limsup \eta_k^p = \lim_{k \to \infty} s_k$, where $s_k = \sup\{\eta_j^p \mid j \ge k\}$.) Convergence of any order greater than 1 is superlinear; however, convergence of order 1 may be superlinear (if $\beta = 0$).

As an example consider $\{x_k = a^k\}$, $a \in (0, 1)$. Then $\eta_k^1 = a$, $\forall k \in Z$, and in fact $\{x_k\}$ converges linearly with convergence ratio $\beta = a$. If $x_k = a^{2^k}$, $\forall k \in Z$, $a \in (0, 1)$, then $\eta_k^1 = a^{2^k}$, $\forall k \in Z$, and convergence is superlinear ($\beta = 0$). Moreover $\eta_k^2 = 1$, $\forall k \in Z$, and thus $\{x_k\}$ has order of convergence 2.
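The two example sequences can be checked numerically. The short sketch below (with $x^* = 0$ and the arbitrary choice $a = 0.5$, neither fixed by the text) evaluates the ratios $\eta_k^p = x_{k+1} / x_k^p$ for both sequences.

```python
a = 0.5  # any a in (0, 1) would do; 0.5 keeps the arithmetic exact in binary

linear_seq = [a ** k for k in range(8)]            # x_k = a^k
quadratic_seq = [a ** (2 ** k) for k in range(6)]  # x_k = a^(2^k)

def ratios(seq, p):
    """eta_k^p = |x_{k+1} - x*| / |x_k - x*|^p with x* = 0."""
    return [seq[k + 1] / seq[k] ** p for k in range(len(seq) - 1)]

eta1_linear = ratios(linear_seq, 1)        # constant, equal to a
eta1_quadratic = ratios(quadratic_seq, 1)  # equal to a^(2^k), tends to 0
eta2_quadratic = ratios(quadratic_seq, 2)  # constant, equal to 1
```

The first list stays at $a$ (linear convergence with ratio $a$), the second decreases to zero (superlinear), and the third stays at 1 (order two).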

These concepts enable us to compare the rates of convergence of algorithms. Until now we have been merely concerned with whether they converge or not. To illustrate the type of calculations required we consider two simple conceptual algorithms, steepest descent and Newton-Raphson, i.e. we shall consider algorithms of the form:

2. $a(x) = x + \lambda(x) h(x)$

where $\lambda(x)$ is the minimizer of $f^0(x + \lambda h(x))$, $\lambda \in R^+$. As in section 3, $\theta(x, \lambda)$ denotes $f^0(x + \lambda h(x)) - f^0(x)$.


The steepest descent algorithm

For this algorithm $h(x) = -\nabla f^0(x)$. We assume that $f^0$ is twice continuously differentiable, and strictly convex in the neighbourhood of $x^*$, which is assumed to be the limit point of a convergent sequence $\{x_i\}$. Hence $\exists \varepsilon > 0$, $m > 0$, $M > 0$, $m \le M$ such that:

3. $\langle h, \nabla^2 f^0(x) h \rangle \in [m \|h\|^2, M \|h\|^2]$

$\forall h \in R^n$, $\forall x \in B_\varepsilon(x^*)$, and $\exists k_1$ such that $x_i \in B_\varepsilon(x^*)$, $\forall i \ge k_1$. As a consequence of (3) we have:

$\theta(x, \lambda) \le \lambda \langle \nabla f^0(x), h(x) \rangle + M \|h(x)\|^2 \lambda^2 / 2$

$\forall x, \lambda$ such that $x + \lambda h(x) \in B_\varepsilon(x^*)$. Hence $\forall i \ge k_1$

4. $\theta(x_i, \lambda(x_i)) \le -\|\nabla f^0(x_i)\|^2 / 2M$.

But we also have, $\forall i \ge k_1$:

5. $f^0(x_i) - f^0(x^*) \in [m \|x_i - x^*\|^2 / 2, M \|x_i - x^*\|^2 / 2]$

and

6. $\|\nabla f^0(x_i)\| \ge m \|x_i - x^*\|$

so that

$f^0(x_{i+1}) - f^0(x_i) = \theta(x_i, \lambda(x_i))$

$\le -m^2 \|x_i - x^*\|^2 / 2M$

$\le -(m/M)^2 [f^0(x_i) - f^0(x^*)]$

Hence

7. $f^0(x_{i+1}) - f^0(x^*) \le [1 - (m/M)^2] [f^0(x_i) - f^0(x^*)]$,

which establishes the linear convergence of $\{f^0(x_i)\}$ with convergence ratio $\beta = [1 - (m/M)^2]$. Convergence is possibly very slow as $\beta \to 1$ ($(m/M) \to 0$), since $f^0(x_i) - f^0(x^*) \le \beta^i [f^0(x_{k_1}) - f^0(x^*)] / \beta^{k_1}$. Also $\forall i \ge k_1$

$\|x_i - x^*\|^2 \le (2/m) [f^0(x_i) - f^0(x^*)]$

$\le (2/m) [f^0(x_{k_1}) - f^0(x^*)] \beta^i / \beta^{k_1}$

so that $x_i \to x^*$ linearly with convergence ratio $\sqrt{\beta}$.

The above discussion can be generalized to deal with any algorithm of the type in (2), where $h(x)$ satisfies, for some $\rho \in (0, 1]$:

8. $\langle \nabla f^0(x), -h(x) \rangle \ge \rho \|\nabla f^0(x)\| \|h(x)\|$

$\forall x \in R^n$. (8) is clearly satisfied for the steepest descent algorithm ($-h(x) = \nabla f^0(x)$).
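Bound (7) can be observed on a small example. The sketch below is an illustration only: it assumes the quadratic $f^0(x) = (m x_1^2 + M x_2^2)/2$ with the made-up values $m = 1$, $M = 10$ (so that $x^* = 0$ and $f^0(x^*) = 0$), runs steepest descent with exact line search, and records the ratios $f^0(x_{i+1}) / f^0(x_i)$.

```python
m, M = 1.0, 10.0          # smallest and largest Hessian eigenvalues (assumed)

def f(x):
    return 0.5 * (m * x[0] ** 2 + M * x[1] ** 2)

def grad(x):
    return [m * x[0], M * x[1]]

x = [10.0, 1.0]           # arbitrary starting point
values = [f(x)]
for _ in range(20):
    g = grad(x)
    gg = g[0] ** 2 + g[1] ** 2
    gCg = m * g[0] ** 2 + M * g[1] ** 2
    lam = gg / gCg        # exact minimizer of f(x - lam * g) for a quadratic
    x = [x[0] - lam * g[0], x[1] - lam * g[1]]
    values.append(f(x))

beta = 1.0 - (m / M) ** 2                              # ratio from (7)
observed = [values[i + 1] / values[i] for i in range(20)]
```

Every entry of `observed` lies below `beta`; the bound in (7) is conservative, but both it and the observed ratio approach 1 as $m/M \to 0$.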


The Newton-Raphson algorithm

As before, we assume that $\{x_i\}$ converges to $x^*$. Let $H = \nabla^2 f^0$ be the Hessian of $f^0$. $H$ and $H^{-1}$ are assumed to exist and be continuous. Hence $\exists \varepsilon, m, M \in R^+$ such that:

9. $\|H(x) - H(y)\| \le M \|x - y\|$, $\|H^{-1}(x)\| \le 1/m$

$\forall x, y \in B_\varepsilon(x^*)$. $\exists k_1$ such that $x_i \in B_\varepsilon(x^*)$, $\forall i \ge k_1$. For the Newton-Raphson algorithm, $\forall x \in R^n$:

10. $a(x) = x - H^{-1}(x) g(x)$

and:

11. $\lambda(x) = 1$

where $g : R^n \to R^n$ denotes the gradient $\nabla f^0$. Hence, $\forall i \ge 0$, $\{x_i\}$ satisfies:

12. $g(x_i) + H(x_i) [x_{i+1} - x_i] = 0$

Hence:

$H(x_i) [x_{i+1} - x_i] = -g(x_i) + H(x_{i-1}) [x_i - x_{i-1}] + g(x_{i-1})$

so that:

$x_{i+1} - x_i = H^{-1}(x_i) \int_0^1 [H(x_{i-1}) - H(x_{i-1} + s(x_i - x_{i-1}))] \, ds \, [x_i - x_{i-1}]$

Hence, $\forall i \ge k_1$:

13. $\|x_{i+1} - x_i\| \le (M/2m) \|x_i - x_{i-1}\|^2 \triangleq \alpha \|x_i - x_{i-1}\|^2$

It follows from (13) that:

$\lim \|x_{i+1} - x^*\| / \|x_i - x^*\|^2 = \alpha$

so that the order of convergence is two (and superlinear).

Superlinear convergence can also be established for the implementable version of this algorithm (employing Armijo), as well as for the conjugate gradient and quasi-Newton algorithms.
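The second-order behaviour is easy to observe in one dimension. The sketch below uses the made-up function $f^0(x) = e^x - 2x$, whose gradient is $g(x) = e^x - 2$, Hessian $H(x) = e^x$ and minimizer $x^* = \ln 2$; it applies iteration (10) with $\lambda = 1$ and records $\|x_{i+1} - x^*\| / \|x_i - x^*\|^2$.

```python
import math

def g(x):
    """Gradient of the illustrative function f0(x) = exp(x) - 2x."""
    return math.exp(x) - 2.0

def H(x):
    """Hessian (second derivative) of the same function."""
    return math.exp(x)

x_star = math.log(2.0)     # exact minimizer, where g vanishes
x = 1.0                    # arbitrary starting point near x*
errors = [abs(x - x_star)]
for _ in range(6):
    x = x - g(x) / H(x)    # Newton-Raphson step, lambda(x) = 1
    errors.append(abs(x - x_star))

# Ratios e_{i+1} / e_i^2 for the first few iterations, before rounding dominates.
ratios = [errors[k + 1] / errors[k] ** 2 for k in range(3)]
```

The ratios stay bounded (here they settle near one half) while the error itself collapses to rounding level within a handful of iterations, which is the signature of order-two convergence.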


5. FEASIBLE DIRECTION ALGORITHMS FOR CONSTRAINED OPTIMIZATION PROBLEMS

We consider here the problem:

1. $\min\{ f^0(x) \mid f^i(x) \le 0, \; i = 1, 2, \ldots, m \}$

The feasible set $\Omega$ is defined by

2. $\Omega = \{ x \mid f^i(x) \le 0, \; i = 1, 2, \ldots, m \}$

$f^i : R^n \to R$, $i = 0, 1, \ldots, m$, are assumed to be continuously differentiable.

An algorithm is said to be a method of feasible directions if, at each nondesirable point $x \in \Omega$, it generates a new point $x' \in \{ x + \lambda h(x) \mid \lambda \in R^+ \}$ by choosing $\lambda = \lambda(x)$ such that $x' = x + \lambda(x) h(x) \in \Omega$ and $f^0(x') < f^0(x)$. Hence such a method can only be employed if $\Omega$ has an interior (or relative interior if $\Omega$ is a subset of a linear manifold).

3. Definition. $\forall x_0, x \in \Omega$:

$\mathcal{L}(x_0) \triangleq \{ x \in \Omega \mid f^0(x) \le f^0(x_0) \}$

$B'_\varepsilon(x) \triangleq \{ x' \in \Omega \mid \|x' - x\| \le \varepsilon \}$

One of the simplest feasible direction algorithms is that due to Topkis and Veinott. Let $S \triangleq \{ h \in R^n \mid |h^i| \le 1, \; i = 1, 2, \ldots, n \}$, $J_m \triangleq \{ 1, 2, \ldots, m \}$, $J_m^0 \triangleq \{ 0, 1, 2, \ldots, m \}$.

4. Feasible directions algorithm 1.

Step 0. Compute $x_0 \in \Omega$, set $i = 0$.

Step 1. Set $x = x_i$; compute $h(x)$ which solves:

$\theta(x) = \min_{h \in S} \max \{ \langle \nabla f^0(x), h \rangle ; \; f^i(x) + \langle \nabla f^i(x), h \rangle, \; i \in J_m \}$

If $\theta(x) = 0$, stop.

Step 2. Compute $\lambda(x)$, the smallest $\lambda$ which solves:

$f^0(x + \lambda(x) h(x)) = \min\{ f^0(x + \lambda h(x)) \mid \lambda \in R^+, \; x + \lambda h(x) \in \Omega \}$.

Step 3. Set $x_{i+1} = x + \lambda(x) h(x)$. Set $i = i + 1$.

Go to Step 1.

5. Comment.

h(x) can be determined in Step 1 by solving the following linear program:

6. $\theta(x) = \min\{ \sigma \mid -\sigma + \langle \nabla f^0(x), h \rangle \le 0; \; -\sigma + f^i(x) + \langle \nabla f^i(x), h \rangle \le 0, \; i = 1, 2, \ldots, m; \; |h^i| \le 1, \; i = 1, 2, \ldots, n \}$

7. Theorem. Suppose that $\mathcal{L}(x_0)$ is compact and has an interior. Let $\{x_i\}$ be generated by algorithm (4). Then either $\{x_i\}$ is finite, ending at $x_{k+1} = x_k$ and $\theta(x_k) = 0$, or it is infinite, with every accumulation point $x^*$ of $\{x_i\}$ satisfying $\theta(x^*) = 0$.

Proof. The finite case is trivially true. Since $h(x)$ is not necessarily unique, suppose algorithm (4) defines a map $A : R^n \to 2^{R^n}$ ($x' \in A(x)$ if $x' = x + \lambda(x) h(x)$ for some $h(x)$ solving (6)). We note that $S$ is compact, $\theta$ and $f^i$, $i = 0, 1, \ldots, m$, are continuous and $\mathcal{L}(x_0)$ is compact. Assume $x$ is not desirable, i.e. $\theta(x) = -\nu < 0$. The following results can then be established:

8. $\exists \varepsilon_1 > 0$ such that $\theta(x') \le -\nu/2$, $\forall x' \in B_{\varepsilon_1}(x)$

9. $\lambda \langle \nabla f^0(x'), h(x') \rangle \le -\lambda \nu / 2$ and

$f^i(x') + \langle \nabla f^i(x'), h(x') \rangle \le -\nu/2$, $\forall x' \in B_{\varepsilon_1}(x)$, $\forall i \in J_m$

Also $\exists \varepsilon \in (0, \varepsilon_1]$ and $\exists \lambda_1 > 0$ such that

10. $| \langle \nabla f^i(x' + \lambda h), h \rangle - \langle \nabla f^i(x'), h \rangle | \le \nu/4$, $\forall i \in J_m^0$,

$\forall x' \in B_\varepsilon(x)$, $\forall \lambda \in [0, \lambda_1]$, $\forall h \in S$.

Hence, using the mean-value theorem:

11. $f^0(x' + \lambda h(x')) - f^0(x') \le -\lambda \nu / 4$, and $f^i(x' + \lambda h(x')) \le -\lambda \nu / 4$, $\forall i \in J_m$,

$\forall x' \in B_\varepsilon(x)$, $\forall \lambda \in [0, \lambda_1]$

Since $x' \in B'_\varepsilon(x) \Rightarrow f^i(x') \le 0$, $\forall i \in J_m$, it follows from (10) that $\lambda(x')$ (defined in Step 2 of (4)) satisfies:

12. $f^0(x'') - f^0(x') \le -\delta$

13. $f^i(x'') \le 0$, $\forall i \in J_m$

$\forall x' \in B'_\varepsilon(x)$, $\delta \triangleq \lambda_1 \nu / 4$, $\forall x'' \in A(x')$.

If we define $c \triangleq f^0$, we see that $c$ is continuous, and that $c$, $A$ satisfy the hypotheses of Theorem (2.2). Hence, any accumulation point $x^*$ of an infinite sequence generated by (4) is desirable ($\theta(x^*) = 0$). Note, from Section 1, $\theta(x^*) = 0$ is a necessary condition of optimality of $x^*$.

Apart from the conceptual nature of Step 2 (which can be replaced by, e.g., Armijo's algorithm), the above algorithm suffers from the disadvantage of taking into account all the constraints all the time, thus increasing the dimensionality of the linear program (6). One might consider performing Step 1 (i.e. solving (6)) considering only the active constraints, those for which $f^j(x_i) = 0$. However, this strategy can lead to jamming, as shown by the example given by Wolfe [6]. Essentially, removal of constraint $f^j$ from the linear problem because $f^j(x_i) \ne 0$ may result in $\langle \nabla f^j(x_i), h(x_i) \rangle$ being positive, and, if $|f^j(x_i)|$ is small, the resultant step size $\lambda(x_i)$ may become arbitrarily small, leading to jamming: convergence of $\{x_i\}$ to $x^*$ such that $\theta(x^*) \ne 0$. Zoutendijk overcame this phenomenon by taking into account $\varepsilon$-active constraints in choosing the search direction $h(x)$.

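14. Definition. The $\varepsilon$-active constraint set at $x$ is:

$I_\varepsilon(x) = \{ i \in J_m \mid f^i(x) \ge -\varepsilon \}$

and

$I_\varepsilon^0(x) = \{0\} \cup I_\varepsilon(x)$

15. $\forall \varepsilon \in R^+$, $\theta_\varepsilon : R^n \to R$ and $h_\varepsilon : R^n \to R^n$ are defined by:

$\theta_\varepsilon(x) = \min_{h \in S} \max_{i \in I_\varepsilon^0(x)} \{ \langle \nabla f^i(x), h \rangle \} = \max_{i \in I_\varepsilon^0(x)} \{ \langle \nabla f^i(x), h_\varepsilon(x) \rangle \}$

Clearly, $\theta$ as defined in Step 1 of (4) satisfies $\theta = \theta_0$, and $h_0 = h$. The minimization problem in (15) can be easily written in the form of the linear program (6). The following results can be established fairly simply:

16. $\theta_\varepsilon(x) \le 0$, $\forall x \in \Omega$, $\forall \varepsilon \in R^+$

17. $I_\varepsilon(x) \subset I_{\varepsilon'}(x)$, $\forall x \in \Omega$, $\forall \varepsilon, \varepsilon' \in R^+$ such that $\varepsilon < \varepsilon'$

18. $\theta_\varepsilon(x) \le \theta_{\varepsilon'}(x)$, $\forall x \in \Omega$, $\forall \varepsilon, \varepsilon' \in R^+$ such that $\varepsilon < \varepsilon'$

19. $\forall \varepsilon \in R^+$, $\forall x \in \Omega$, $\exists \varepsilon_1 > 0$ such that $I_\varepsilon(x') \subset I_\varepsilon(x)$, $\forall x' \in B_{\varepsilon_1}(x)$

20. $\forall \varepsilon \in R^+$, $\forall x \in \Omega$, $\exists \varepsilon_1 > 0$ such that $I_{\varepsilon'}(x) = I_\varepsilon(x)$, $\forall \varepsilon' \in [0, \varepsilon_1]$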

The following algorithm is a modification by Polak [1] to an algorithm due to Zoutendijk.

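21. Feasible directions algorithm 2: $\varepsilon_0 > 0$, $\rho > 0$

Step 0. Compute an $x_0 \in \Omega$. Set $i = 0$.

Step 1. Set $\varepsilon = \varepsilon_0$.

Step 2. Compute an $h_\varepsilon(x_i)$ which solves:

$\theta_\varepsilon(x_i) = \min_{h \in S} \max_{j \in I_\varepsilon^0(x_i)} \{ \langle \nabla f^j(x_i), h \rangle \}$

Step 3. If $\theta_\varepsilon(x_i) > -\varepsilon$, set $\varepsilon = \varepsilon/2$ and go to Step 2.

Step 4. (Armijo) Compute the smallest integer $k$ such that

$f^0(x_i + \rho h_\varepsilon(x_i)/2^k) - f^0(x_i) \le [\theta_\varepsilon(x_i) \rho / 2^k] / 2$

$f^j(x_i + \rho h_\varepsilon(x_i)/2^k) \le 0$, $\forall j \in J_m$

Step 5. Set $\lambda(x_i) = \rho / 2^k$

Set $x_{i+1} = x_i + \lambda(x_i) h_\varepsilon(x_i)$

Set $i = i + 1$

Go to Step 1.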

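22. Comment. For the purpose of analysis, no stop statement has been included. Hence, the algorithm either jams up at $x_s$, cycling between Steps 2 and 3, or produces an infinite sequence $\{x_i\}$.

23. Lemma. $\forall$ non-desirable $x \in \Omega$ ($\theta_0(x) \ne 0$), $\exists j(x)$, $\varepsilon(x) = \varepsilon_0 / 2^{j(x)} > 0$, such that $\theta_{\varepsilon(x)}(x) \le -\varepsilon(x)$.

Proof. $\exists \varepsilon_1$ such that $I_\varepsilon(x) = I_{\varepsilon_1}(x)$, $\forall \varepsilon \in [0, \varepsilon_1]$.

Hence, if $\theta_0(x) < 0$, then

$\theta_\varepsilon(x) = \theta_0(x) < 0$, $\forall \varepsilon \in [0, \varepsilon_1]$

Then $j(x)$ = smallest integer such that $\varepsilon(x) = \varepsilon_0 / 2^{j(x)}$ satisfies

$\varepsilon(x) \le \varepsilon_1$ and $\varepsilon(x) \le -\theta_0(x)$

so that

$\theta_{\varepsilon(x)}(x) = \theta_0(x) \le -\varepsilon(x)$

Comment. Hence the map $A : \Omega \to 2^\Omega$, with $D \triangleq \{ x \in \Omega \mid \theta_0(x) = 0 \}$, is well defined by

$A(x) = \{ x' \mid x' = x + \lambda(x) h_{\varepsilon(x)}(x) \}$, $x \notin D$

$A(x) = x$, if $x \in D$.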

24. Theorem

Let $\{x_i\}$ denote a sequence produced by (21).

(a) If $\{x_i\}$ jams up at $x_s$, then $\theta_0(x_s) = 0$.

(b) If $\{x_i\}$ is infinite then every accumulation point $x^*$ satisfies $\theta_0(x^*) = 0$.

Outline of Proof

(a) is established in (23).

(b) Suppose that $x \in \Omega$ is non-desirable ($\theta_0(x) < 0$) so that $\theta_{\varepsilon(x)}(x) \le -\varepsilon(x) < 0$. From (19) $\exists \varepsilon_1 > 0$ such that:

25. $I_{\varepsilon(x)}(x') \subset I_{\varepsilon(x)}(x)$, $\forall x' \in B_{\varepsilon_1}(x)$

Let $\psi : R^n \to R$ be defined by:

26. $\psi(x') = \min_{h \in S} \max_{i \in I_{\varepsilon(x)}^0(x)} \langle \nabla f^i(x'), h \rangle$

Clearly $\psi$ is continuous, $\psi(x) = \theta_{\varepsilon(x)}(x)$ and $\theta_{\varepsilon(x)}(x') \le \psi(x')$. Hence $\exists \varepsilon_2 \in (0, \varepsilon_1]$ such that:

27. $\theta_{\varepsilon(x)/2}(x') \le \theta_{\varepsilon(x)}(x') \le \psi(x') \le -\varepsilon(x)/2$, $\forall x' \in B_{\varepsilon_2}(x)$

(27) implies:

28. $\varepsilon(x') \ge \varepsilon(x)/2$, $\forall x' \in B_{\varepsilon_2}(x)$

The rest of the proof proceeds as in the proof of 7, except for a minor modification due to use of the Armijo rule in Step 4.

For simplicity, we have restricted ourselves to the more direct feasible direction algorithms. For a discussion of a dual method of feasible directions, due to Pironneau and Polak, see Ref.[1].

6. PENALTY AND BARRIER FUNCTION METHODS

The basic idea behind penalty function methods is to convert constrained optimization problems into unconstrained problems, by discarding the constraints, and adding to $f^0$ a penalty function which penalizes constraint deviation. More precisely, the problem:

P1. $\min\{ f^0(x) \mid x \in \Omega \}$

where $\Omega \subset R^n$ is closed (and may even have no interior if $\Omega$ represents equality constraints) is replaced by the following sequence of problems:

P2. $\min\{ f^0(x) + p_i(x) \mid x \in R^n \}$, $i = 0, 1, 2, 3, \ldots$

whose solutions $x_i \to \hat{x}$, the solution of P1, where $\forall i \in Z$, $p_i(x) = 0$ if and only if $x \in \Omega$. Note that $x_i$ is not necessarily feasible, $\forall i \in Z$.

In contrast, the Barrier Function method also replaces P1 by P2 but now, $\forall i \in Z$, $p_i$ is defined only on the interior $\Omega^\circ$ of $\Omega$ (assumed to be nonempty), $p_i(x) \ge 0$, $\forall x \in \Omega^\circ$, and $p_i(x) \to \infty$ as $x \to \delta(\Omega)$, the boundary of $\Omega$.

6.1. (Exterior) penalty function methods

1. Example

If $\Omega = \{ x \mid f^j(x) \le 0, \; j \in J_m \}$, then a useful sequence of penalty functions is given by:

2. $p_i(x) = i P(x) = i \sum_{j=1}^{m} [\max\{0, f^j(x)\}]^2$

Note, for each $x \notin \Omega$, $p_j(x) > p_i(x)$ if $j > i$. In fact the defining properties of a sequence $\{p_i\}$ of penalty functions are:

3. For $i = 0, 1, 2, \ldots$: $p_i : R^n \to R$ continuous

$p_i(x) = 0$, $\forall x \in \Omega$

$p_i(x) > 0$, $\forall x \notin \Omega$

$p_{i+1}(x) > p_i(x)$, $\forall x \notin \Omega$

$p_i(x) \to \infty$ as $i \to \infty$, $\forall x \notin \Omega$

The sequence $\{p_i\}$ defined in (2) satisfies (3) if $\{ f^j \mid j \in J_m \}$ are continuous. Let $\phi_i \triangleq f^0 + p_i : R^n \to R$ denote the cost function for P2, and let $x_i$ minimize $\phi_i$ on $R^n$, and $x^*$ minimize $f^0$ on $\Omega$. If $\exists x'$ such that $\{ x \mid f^0(x) \le f^0(x') \} \triangleq \mathcal{L}(x')$ is compact, then $x^*$ and $\{x_i\}$ all exist and lie in $\mathcal{L}(x')$. The following result is a set of interesting and useful inequalities:

4. Lemma

$\phi_0(x_0) \le \phi_1(x_1) \le \ldots \le f^0(x^*)$

Proof

Since $p_i(x) \le p_{i+1}(x)$, $\forall x \in R^n$

$\phi_i(x_i) \le \phi_i(x_{i+1}) \le \phi_{i+1}(x_{i+1}) \le \phi_{i+1}(x^*)$, $\forall i$

Also since $x^* \in \Omega$, $p_i(x^*) = 0$, and

$\phi_{i+1}(x^*) = f^0(x^*)$, $\forall i$

5. Theorem

Any accumulation point of $\{x_i\}$ is a solution of P1.

Proof (for the case when $\{p_i\}$ satisfy (2)).

Let $\{x_i\}$ now denote a subsequence, converging to $\hat{x}$. Then $f^0(x_i) \to f^0(\hat{x})$. Let $f^0(x^*)$ denote the minimum value of $f^0$ on $\Omega$ ($x^*$ is a minimizer of $f^0$ on $\Omega$). By Lemma (4) $\{\phi_i(x_i)\}$ is a non-decreasing sequence bounded above by $f^0(x^*)$, so that $\phi_i(x_i) \to \phi^*$, say, where $\phi^* \le f^0(x^*)$. Hence

$f^0(x_i) \to f^0(\hat{x})$

$\phi_i(x_i) \to \phi^* \ge f^0(\hat{x})$

$\phi_i(x_i) = f^0(x_i) + i P(x_i)$

Hence, $i P(x_i) \to \phi^* - f^0(\hat{x}) \ge 0$

Since $i \to \infty$, and $P(x) \ge 0$, $\forall x \in R^n$, this implies that $P(x_i) \to 0 \Rightarrow P(\hat{x}) = 0 \Rightarrow \hat{x} \in \Omega$, that is, $\hat{x}$ is feasible.

Also: $f^0(\hat{x}) = \lim f^0(x_i) \le f^0(x^*)$

Hence, $\hat{x}$ is optimal for P1.
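A one-dimensional example makes the behaviour of the exterior penalty sequence concrete. The sketch below is an illustration under assumptions not taken from the notes: the problem is $\min\{x^2 \mid 1 - x \le 0\}$ (so $x^* = 1$, $f^0(x^*) = 1$), the penalty is $p_i(x) = i [\max\{0, 1 - x\}]^2$, and each $\phi_i$ is minimized by a simple ternary search (valid here because $\phi_i$ is convex). The helper name `argmin_convex` is hypothetical.

```python
def f0(x):
    return x * x

def P(x):
    """Squared violation of the single constraint 1 - x <= 0."""
    return max(0.0, 1.0 - x) ** 2

def phi(i, x):
    return f0(x) + i * P(x)

def argmin_convex(func, lo, hi, iters=200):
    """Ternary search for the minimizer of a convex function on [lo, hi]."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if func(a) < func(b):
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)

indices = [1, 10, 100, 1000]
xs = [argmin_convex(lambda x, i=i: phi(i, x), -2.0, 3.0) for i in indices]
costs = [phi(i, x) for i, x in zip(indices, xs)]
```

For this problem the minimizers are $x_i = i/(1+i)$: every $x_i$ is infeasible, the $x_i$ approach $x^* = 1$ from outside $\Omega$, and the values $\phi_i(x_i) = i/(1+i)$ increase towards $f^0(x^*) = 1$, as Lemma 4 predicts.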

6.2. (Interior) barrier function methods

With the barrier function method, $\Omega$ must have an interior; problems with equality constraints cannot be handled. More precisely, $\Omega$ must be robust, that is, the closure of the interior of $\Omega$ is equal to $\Omega$. This rules out equality constraints, 'whiskers', isolated points, etc. Barrier function methods establish a barrier which prevents points generated by the algorithm leaving $\Omega$:

1. Example

If $\Omega = \{ x \mid f^j(x) \le 0, \; j \in J_m \}$, then a useful sequence of barrier functions is given by:

2. $p_i(x) = P(x)/i = \sum_{j=1}^{m} (-1/f^j(x)) / i$

The defining properties of a sequence $\{p_i\}$ of barrier functions are:

3. $0 < p_i(x) \to \infty$ as $x \to \delta(\Omega)$, $\forall i$

As before, $\forall i \in Z$, let $\phi_i \triangleq f^0 + p_i$, let $x_i$ denote a minimizer of $\phi_i$ on $R^n$, and $x^*$ a minimizer of $f^0$ on $\Omega$ (because of the nature of $p_i$ and $\phi_i$, $x_i$ is also the minimizer of $\phi_i$ on $\Omega$). It can be shown that:

4. $\phi_0(x_0) \ge \phi_1(x_1) \ge \ldots \ge \phi_i(x_i) \ge \phi_{i+1}(x_{i+1}) \ge \ldots \ge f^0(x^*)$

5. Theorem

Any accumulation point $\hat{x}$ of $\{x_i\}$ is a solution of P1.

There also exist mixed penalty-barrier function methods.

Penalty methods may be started at non-feasible $x_0$, and may be used for problems with equality and inequality constraints. The sequence $\{x_i\}$ is, on the other hand, not necessarily feasible. Barrier methods, which require that $x_0$ lie in the interior of $\Omega$, cannot be used for problems with equality constraints but produce feasible sequences. Both are conceptual, requiring minimization, $\forall i \in Z$, of $\phi_i$. However, implementable versions have been developed (see Polak [1]). Both methods have the disadvantage that $\phi_i$ becomes 'nasty' as $i \to \infty$, in the sense that its Hessian becomes extremely ill-conditioned. For an interesting elaboration of this theme see Luenberger [3].
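The contrast with the exterior method can be seen on a comparable one-dimensional example. The sketch below is again only an illustration with made-up data: the problem is $\min\{x \mid 1 - x \le 0\}$ (so $x^* = 1$), the barrier is $p_i(x) = (-1/f^1(x))/i = 1/(i(x - 1))$ for $x > 1$, and each $\phi_i$ is minimized over the interior by a ternary search; the helper `argmin_convex` is a hypothetical name.

```python
def f0(x):
    return x

def barrier(i, x):
    """p_i(x) = (-1 / f1(x)) / i with f1(x) = 1 - x, defined for x > 1 only."""
    return 1.0 / (i * (x - 1.0))

def phi(i, x):
    return f0(x) + barrier(i, x)

def argmin_convex(func, lo, hi, iters=200):
    """Ternary search for the minimizer of a convex function on [lo, hi]."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if func(a) < func(b):
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)

indices = [1, 100, 10000]
# Search strictly inside the feasible set, just to the right of the boundary x = 1.
xs = [argmin_convex(lambda x, i=i: phi(i, x), 1.0 + 1e-9, 5.0) for i in indices]
costs = [phi(i, x) for i, x in zip(indices, xs)]
```

Here $x_i = 1 + 1/\sqrt{i}$: every iterate is strictly feasible, and the values $\phi_i(x_i) = 1 + 2/\sqrt{i}$ decrease towards $f^0(x^*) = 1$, in line with (4).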

7. DUALITY

Consider the primal problem:

P : $\min\{ f^0(x) \mid x \in \Omega \}$

where $\Omega \triangleq \{ x \in R^n \mid f(x) \le 0 \}$, which we can imbed in the family of problems ($\forall y \in R^m$):

P(y): $\min\{ f^0(x) \mid x \in \Omega(y) \}$

where

$\Omega(y) \triangleq \{ x \in R^n \mid f(x) \le y \}$

Let

$\Gamma \triangleq \{ y \in R^m \mid \Omega(y) \ne \emptyset \}$

$\Gamma \subset R^m$ is convex provided $f : R^n \to R^m$ is convex.

For, if $y_1, y_2 \in \Gamma$, then $\Omega(y_1)$ and $\Omega(y_2)$ are not empty. If $x_1 \in \Omega(y_1)$ and $x_2 \in \Omega(y_2)$ and $\theta \in [0, 1]$, then:

$f(\theta x_1 + (1 - \theta) x_2) \le \theta f(x_1) + (1 - \theta) f(x_2)$

$\le \theta y_1 + (1 - \theta) y_2$

$\Rightarrow \theta x_1 + (1 - \theta) x_2 \in \Omega(\theta y_1 + (1 - \theta) y_2) \ne \emptyset$

$\Rightarrow \theta y_1 + (1 - \theta) y_2 \in \Gamma$

We define $\omega : \Gamma \to R$ by:

1. Def. $\omega(y) \triangleq \inf\{ f^0(x) \mid x \in \Omega(y) \}$

and the subsets $A$, $B$ of $R^{m+1}$ by:

2. Def. $A \triangleq \{ (y^0, y) \in R \times R^m \mid \exists x \in R^n$ such that $y^0 \ge f^0(x), \; y \ge f(x) \}$

= region above the graph of $\omega$

3. Def. $\hat{y}^0 \triangleq \omega(0) = \inf\{ f^0(x) \mid x \in \Omega \}$

4. Def. $B \triangleq \{ (y^0, y) \in R^{m+1} \mid y^0 \le \hat{y}^0, \; y \le 0 \}$

These definitions are illustrated in Fig.2; note, if $\hat{x}$ is a solution of P, $f^0(\hat{x}) = \hat{y}^0$.

5. Lemma

(a) $\omega$ is non-increasing ($y_1 \le y_2 \Rightarrow \omega(y_1) \ge \omega(y_2)$)

(b) If $f^0$, $f$ are convex functions, $\omega$ is a convex function and $A$ is a convex set.

The Lagrangian $L : R^n \times R^m \to R$ is defined by:

6. Def. $L(x, \lambda) \triangleq f^0(x) + \langle \lambda, f(x) \rangle$

Associated with $L$ is the unconstrained problem:

$P_\lambda$ : $\min\{ L(x, \lambda) \mid x \in R^n \}$

The dual function $\phi : R^m \to R$ is defined by:

7. Def. $\phi(\lambda) \triangleq \inf\{ L(x, \lambda) \mid x \in R^n \}$

and the dual problem is:

DP. $\max\{ \phi(\lambda) \mid \lambda \ge 0 \}$

8. Def. $\hat{\phi} \triangleq \sup\{ \phi(\lambda) \mid \lambda \ge 0 \}$

Lemmas 9 and 10 are true whether or not $f^0$, $f$ are convex.

FIG.2. Definitions in the duality problem.

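9. Lemma

$\phi : R^m_+ \to R \cup \{-\infty\}$ is a concave function ($R^m_+ \triangleq \{ \lambda \in R^m \mid \lambda \ge 0 \}$)

10. Lemma (weak duality)

If $x \in \Omega$, $\lambda \ge 0$, then

$f^0(x) \ge \omega(0) \ge \hat{\phi} \ge \phi(\lambda)$

Proof

$f(x) \le 0$, $\lambda \ge 0 \Rightarrow \langle \lambda, f(x) \rangle \le 0$, so

$f^0(x) \ge L(x, \lambda)$, $\forall x \in \Omega$, $\lambda \ge 0$

Hence

$f^0(x) \ge \inf\{ f^0(x) \mid x \in \Omega \} = \omega(0)$

$\ge \inf\{ L(x, \lambda) \mid x \in \Omega \}$

$\ge \inf\{ L(x, \lambda) \mid x \in R^n \} = \phi(\lambda)$

that is

$f^0(x) \ge \omega(0) \ge \phi(\lambda)$, $\forall x \in \Omega$, $\lambda \ge 0$

Taking the supremum w.r.t. $\lambda \ge 0$ yields the desired result.

The basic problem in duality theory is to determine conditions under which the solutions to the primal and dual problems are equal, that is $\omega(0) = \hat{\phi}$.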
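Weak duality, and the absence of a duality gap in a convex case, can be checked on a scalar example. The sketch below assumes the made-up problem $f^0(x) = x^2$, $f(x) = 1 - x \le 0$, for which $x^* = 1$ and $\omega(0) = 1$. The Lagrangian $L(x, \lambda) = x^2 + \lambda(1 - x)$ is minimized over $x$ at $x(\lambda) = \lambda/2$, giving $\phi(\lambda) = \lambda - \lambda^2/4$.

```python
def f0(x):
    return x * x

def f1(x):
    return 1.0 - x                     # constraint f1(x) <= 0

def dual(lam):
    """phi(lam) = inf_x L(x, lam); the minimizer solves 2x - lam = 0."""
    x = lam / 2.0
    return f0(x) + lam * f1(x)

primal_value = f0(1.0)                 # omega(0): the constrained minimum is at x = 1

lams = [0.1 * k for k in range(51)]    # grid of multipliers in [0, 5]
dual_values = [dual(lam) for lam in lams]
best_dual = max(dual_values)
best_lam = lams[dual_values.index(best_dual)]
```

Every entry of `dual_values` is bounded above by `primal_value` (weak duality), and the bound is attained at $\lambda = 2$, where also $\lambda f(x^*) = 0$.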

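11. Theorem (sufficiency)

Let $(\hat{x}, \hat{\lambda})$ satisfy the following optimality conditions:

(1) $\hat{x}$ is a solution of $P_{\hat{\lambda}}$

(2) $\hat{x} \in \Omega$, $\hat{\lambda} \ge 0$

(3) $\hat{\lambda}^i = 0$ if $f^i(\hat{x}) < 0$ (i.e. $\langle \hat{\lambda}, f(\hat{x}) \rangle = 0$).

Then:

(i) $\hat{x}$ is a solution of P ($\omega(0) = f^0(\hat{x})$)

(ii) $\hat{\lambda}$ is a solution of DP ($\hat{\phi} = \phi(\hat{\lambda})$)

(iii) $\omega(0) = \hat{\phi}$ ($f^0(\hat{x}) = \phi(\hat{\lambda})$).

Proof

If $x \in \Omega = \Omega(0)$, then $\langle \hat{\lambda}, f(x) \rangle \le 0$, and:

$f^0(x) \ge f^0(x) + \langle \hat{\lambda}, f(x) \rangle = L(x, \hat{\lambda})$

$\ge \inf\{ L(x, \hat{\lambda}) \mid x \in R^n \}$

$= L(\hat{x}, \hat{\lambda})$ (by (1))

$= f^0(\hat{x})$ (by (3))

i.e. $\hat{x}$ is a solution of P, and $f^0(\hat{x}) = \omega(0)$. Also $\phi(\hat{\lambda}) = L(\hat{x}, \hat{\lambda})$

$= f^0(\hat{x})$ (by (3))

$= \omega(0)$

$\Rightarrow$ (by weak duality) $\omega(0) = \hat{\phi} = \phi(\hat{\lambda})$

i.e. $\hat{\lambda}$ is a solution of DP.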

12. Theorem (strong duality)

Let $f^0$, $f$ be differentiable convex functions, $\Omega$ convex, $0 \in \Gamma^\circ$ (i.e. $\exists x \in R^n$ such that $f(x) < 0$), $\omega(0) > -\infty$; then $\exists$ an optimal solution $\hat{\lambda}$ to DP and

$\omega(0) = \phi(\hat{\lambda})$

(If $\hat{x}$ is an optimal solution to P, $\omega(0) = f^0(\hat{x})$.) Also, if $\hat{\lambda}$ is a solution for DP, then $\hat{x}$ is a solution for P $\Leftrightarrow$ $(\hat{x}, \hat{\lambda})$ satisfy the optimality conditions of Theorem 11.

Hence, the solution $\hat{x}$ to P can be determined by finding $x : R^m_+ \to R^n$ such that $x(\lambda) \in R^n$ minimizes $L(x, \lambda)$, and then finding $\lambda^*$ which maximizes $\phi(\lambda) = L(x(\lambda), \lambda)$ in $R^m_+$; then $\hat{x} = x(\lambda^*)$. However, this algorithm requires the (global) convexity of $\Omega$, $f^0$ and $f$. It is possible, though, to obtain similar results of a local nature (local duality). Under suitable conditions (including $f^0$, $f$ twice continuously differentiable), if P has a local minimum at $\hat{x}$, with associated Lagrange multiplier $\hat{\lambda}$, and if $\nabla^2 L(\hat{x}, \hat{\lambda})$ is positive definite (local convexity), then $\hat{\lambda}$ is a local maximum of the unconstrained dual problem DP, and $f^0(\hat{x}) = \phi(\hat{\lambda})$.

The above discussion holds, with minor modifications, for the case when $\Omega = \{ x \in R^n \mid r(x) = 0 \}$. For this case, observe that the two problems:

P1. $\min\{ f^0(x) \mid r(x) = 0 \}$

P2. $\min\{ f^0(x) + c \|r(x)\|^2 \mid r(x) = 0 \}$

are equivalent, for any $c > 0$. Also, that for $c$ large enough:

$\nabla^2 [L(x, \lambda) + c \|r(x)\|^2]$

is positive definite. It can be shown that for every compact subset of $R^n$, there exists a $c < \infty$ such that if $x$ is a local minimum of

$\phi_c(x) = L(x, \lambda(x)) + c \|r(x)\|^2$

where

$\lambda(x) = (g_x(x) g_x^T(x))^{-1} g_x(x) \nabla f^0(x)$,

then it is a local minimum for P1, so that algorithms can be constructed with a rule for automatically increasing $c$, producing sequences whose accumulation points satisfy necessary conditions of optimality of the original problem.

ACKNOWLEDGEMENTS

These notes rely, in the main, on Polak [1], particularly in the exposition of the simple but powerful algorithm models of Chapter 2, and their application for establishing convergence of algorithms for unconstrained optimization (Chapter 3) and of algorithms of the feasible-directions type for constrained optimization. The discussion of conjugate-gradient and secant-type algorithms draws on the lecture notes of a colleague, Dr. J.C. Allwright. The discussion of conditions of optimality is based on Ref.[4], and the regrettably short treatment of duality is based on the treatment by Varaiya in Ref.[5].

REFERENCES

[1] POLAK, E., Computational Methods in Optimisation, Academic Press (1971).
[2] POLAK, E., RIBIÈRE, G., Note sur la convergence de méthodes de directions conjuguées, Rev. Fr. Inform. Rech. Operation (16-R1) (1969) 35-43.
[3] LUENBERGER, D.G., Introduction to Linear and Nonlinear Programming, Addison-Wesley (1973).
[4] CANNON, M.D., CULLUM, C.D., POLAK, E., Theory of Optimal Control and Mathematical Programming, McGraw-Hill (1970).
[5] VARAIYA, P.P., Notes on Optimization, Van Nostrand Reinhold: Notes on System Sciences (1972).
[6] WOLFE, P., "On the Convergence of Gradient Methods under Constraints", IBM Research Report, R.C. 1752, Yorktown Heights, New York (1967).

IA E A -S M R -1 7 /3 7

REACHABILITY OF SETS AND TIME- AND NORM-MINIMAL CONTROLS

A. MARZOLLO
Electrical Engineering Department,
University of Trieste
and
International Centre for Mechanical Sciences,
Udine, Italy

Abstract

REACHABILITY OF SETS AND TIME- AND NORM-MINIMAL CONTROLS.
Standard continuous linear control systems as well as discrete time systems with "controller" and "anti-controller" representing disturbances are treated.

INTRODUCTION

This paper is divided into three parts; the first two parts concern standard continuous linear control systems, whereas the third part deals with discrete time systems whose state evolution is influenced both by a "controller" and an "anti-controller", which may represent disturbances (worst-case approach).

In Part I, functional-analysis methods are used to give necessary and sufficient conditions for the reachability of a given convex compact set in state space with norm-bounded controls, for different types of norms. In Part II, which again uses functional-analysis methods, these conditions are applied to obtaining both time-minimal and norm-minimal open loop controls, and a special case of Pontryagin's maximum principle is derived. These results are illustrated by an exercise at the end of Part II. Part III deals with the problem of finding the set in state space starting from which an initial state can be steered into a given target set, in the presence of an "anti-controller"; both the cases of bounded and unbounded controls and anti-controls are considered, and interesting relations are shown among possible situations depending on different information structures available to the controller.

Parts I and II of these notes are extensions of ideas developed in Ref.[1].

PART I. CONTROLLABILITY WITH RESPECT TO GIVEN SETS, TIME INTERVAL AND BOUNDS ON THE NORM OF THE CONTROL

Let us consider a linear control system described by a linear differential equation of the type

$\dot{x}(t) = A(t) x(t) + B(t) u(t)$    I(1)

where the state vector $x(t)$ is in $R^n$, the control vector $u(t)$ is in $R^m$, and $A(t)$, $B(t)$ are matrices of appropriate dimensions. Under the usual hypotheses on the functions in $A(t)$, $B(t)$, and on $u(t)$, if $x(t_0) = x_0$, the system I(1) has a unique solution

$v_u(t) = V(t) x_0 + \int_{t_0}^{t} W(t, \tau) u(\tau) \, d\tau$, $t \ge t_0$

where $V(t)$ is the solution of the matrix equation

$\dot{V}(t) = A(t) V(t)$, $V(t_0) = I$

and

$W(t, \tau) = V(t) V^{-1}(\tau) B(\tau)$

Let us equip $u$ with a norm over the interval $K = [t_0, t_1]$, which, for the moment, will be an $L_{2,2}$ norm, for the sake of simplicity:

$\|u\| = \left( \int_{t_0}^{t_1} \|u(\tau)\|^2 \, d\tau \right)^{1/2}$, $\|u(\tau)\| = \left( \sum_{i=1}^{m} |u_i(\tau)|^2 \right)^{1/2}$

(later on, we shall extend the considerations to more general $L_{r,p}$ norms, $1 \le r \le \infty$, $1 < p \le \infty$). Let $U_\rho$ be the class of admissible controls:

$U_\rho = \{ u : \|u\| \le \rho, \; \rho \ge 0 \}$

The problem we shall now consider is the following:

Problem I,1

Given a non-empty convex compact set $A$ in $R^n$, does an admissible control exist which "steers" $x_0$ at time $t_0$ into $A$ at time $t_1$?

In other words, does a $u \in U_\rho$ exist such that $v_u(t_1) \in A$, or is the system I(1) "controllable with respect to $x_0$, $A$, $U_\rho$, $K$"?

To solve Problem I,1 let us define the linear operator $\Lambda$ from $U_\rho$ into $R^n$:

$\Lambda u = \int_{t_0}^{t_1} W(t_1, \tau) u(\tau) \, d\tau$

and the "reachable set" $R$ at time $t_1$

$R = \{ x : x = V(t_1) x_0 + \Lambda u, \; u \in U_\rho \}$    I(2)

Problem I,1 clearly has a solution if and only if

$R \cap A \ne \emptyset$    I(3)


The most natural way of verifying whether I(3) holds or not is to apply the separation theorem for convex compact sets, or, more precisely, its following corollary

Corollary I,1

Two convex compact sets $A$ and $R$ of $R^n$ have a non-empty intersection if and only if

$\min_{r \in R} \langle x, r \rangle \le \max_{a \in A} \langle x, a \rangle$, $\forall x \in R^n$

From the convexity of $U_\rho$, the convexity of $\Lambda U_\rho$, hence of $R$, follows trivially. Its compactness may be obtained from classical theorems of functional analysis (see, e.g. Ref.[2]) by using the (weak) compactness of $U_\rho$ and the (weak) continuity of $\Lambda$.

Using Corollary I,1 and recalling I(2), we see that Problem I,1 has a solution if and only if

$\min_{u \in U_\rho} ( \langle x, V(t_1) x_0 \rangle + \langle x, \Lambda u \rangle ) \le \max_{a \in A} \langle x, a \rangle$, $\forall x \in R^n$

or

$\langle x, V(t_1) x_0 \rangle - \max_{\|u\| \le \rho} \langle x, \Lambda u \rangle \le \max_{a \in A} \langle x, a \rangle$, $\forall x \in R^n$

Defining $\Lambda^*$ to be the adjoint operator of $\Lambda$, which takes $R^n$ into $U^*$, we have

$\langle x, V(t_1) x_0 \rangle - \rho \max_{\|u\| \le 1} \langle \Lambda^* x, u \rangle \le \max_{a \in A} \langle x, a \rangle$, $\forall x \in R^n$

or

$\langle x, V(t_1) x_0 \rangle - \rho \|\Lambda^* x\| \le \max_{a \in A} \langle x, a \rangle$, $\forall x \in R^n$    I(4)

If $A$ is the Euclidean sphere around $x_1$, of radius $\varepsilon$:

$A = \{ x : x = x_1 + y, \; \|y\| \le \varepsilon \}$

I(4) becomes

$\langle x, V(t_1) x_0 - x_1 \rangle \le \rho \|\Lambda^* x\| + \varepsilon \|x\|$, $\forall x \in R^n$

which is clearly equivalent to

$\langle x, V(t_1) x_0 - x_1 \rangle \le \rho \|\Lambda^* x\| + \varepsilon$, $\forall x \in R^n$, $\|x\| = 1$    I(5)

Representing $\|\Lambda^* x\|$ and noticing that, because of the arbitrariness of $x$, the first term of I(5) may be substituted by its absolute value, we have the following


Theorem I,1

The system I(1) is "controllable with respect to $x_0$, $A = \{ x : x = x_1 + y, \; \|y\| \le \varepsilon \}$, $U_\rho$, $K$" if and only if

$| \langle x, V(t_1) x_0 - x_1 \rangle | \le \rho \left( \int_{t_0}^{t_1} \|W^*(t_1, \tau) x\|^2 \, d\tau \right)^{1/2} + \varepsilon$, $\forall x$, $\|x\| = 1$    I(6)

If $\varepsilon = 0$, i.e. if $A = \{x_1\}$, condition I(6) may be easily verified by using standard techniques of linear algebra. Indeed, defining the vector $z$ of components $z_i$ as follows:

$z = V(t_1) x_0 - x_1$

and taking the square of both terms of I(6), this inequality is equivalent to

$\langle x, z \rangle \langle z, x \rangle \le \rho^2 \left\langle x, \int_{t_0}^{t_1} W(t_1, \tau) W^*(t_1, \tau) \, d\tau \; x \right\rangle$, $\forall x \in R^n$    I(7)

and defining the $n \times n$ matrix $Z = z z^T$ with elements $z_i z_j$, the last inequality is equivalent to the following one:

$\langle x, Z x \rangle \le \rho^2 \langle x, Y(t_1) x \rangle$, $\forall x \in R^n$

where

$Y(t_1) = \int_{t_0}^{t_1} W(t_1, \tau) W^*(t_1, \tau) \, d\tau$    I(8)

is the usual "controllability matrix" of system I(1). Defining the $n \times n$ matrix $C(t_1)$:

$C(t_1) = Y(t_1) - \frac{1}{\rho^2} Z$

from I(7) we have the following

Corollary I,2

System I(1) is "controllable with respect to $x_0$, $x_1$, $U_\rho$, $K$" if and only if the matrix $C(t_1)$ is positive semi-definite.

The previous condition depends naturally on the system (through $Y(t_1)$), on $x_0$, $x_1$ (through $Z$), on $\rho$ and on $K$ (through $Y(t_1)$). It is easy to verify that letting $x_0$, $x_1$, $\rho$, $K$ be arbitrary, we have the usual controllability condition:

Condition I,1

The state of the system I(1) may be carried from any $x_0$ at time $t_0$ into any $x_1$ at time $t_1$ by controls $u$ of appropriately large norm if and only if the matrix $Y(t_1)$ defined in I(8) is positive definite.
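The test of Corollary I,2 can be carried out by hand for a small system. The sketch below is an illustration with assumptions not found in the text: the system is the double integrator $\dot{x}_1 = x_2$, $\dot{x}_2 = u$ with $t_0 = 0$, $t_1 = T$, for which $V(T) = [[1, T], [0, 1]]$, $W(T, \tau) = (T - \tau, 1)^T$ and hence $Y(T) = [[T^3/3, T^2/2], [T^2/2, T]]$. The function names are hypothetical, and the threshold $\rho^2 \ge \langle z, Y^{-1} z \rangle$ used for comparison is the standard Schur-complement restatement of "$C$ is positive semi-definite" when $Y$ is positive definite.

```python
def gramian(T):
    """Controllability matrix Y(T) of the double integrator on [0, T]."""
    return [[T ** 3 / 3.0, T ** 2 / 2.0], [T ** 2 / 2.0, T]]

def offset(x0, x1, T):
    """z = V(T) x0 - x1 with V(T) = [[1, T], [0, 1]]."""
    return [x0[0] + T * x0[1] - x1[0], x0[1] - x1[1]]

def is_psd_2x2(C, tol=1e-12):
    """A symmetric 2x2 matrix is p.s.d. iff its diagonal and determinant are >= 0."""
    det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
    return C[0][0] >= -tol and C[1][1] >= -tol and det >= -tol

def controllable(x0, x1, rho, T):
    """Corollary I,2: check that C(T) = Y(T) - Z / rho^2 is positive semi-definite."""
    z = offset(x0, x1, T)
    Y = gramian(T)
    C = [[Y[r][c] - z[r] * z[c] / rho ** 2 for c in range(2)] for r in range(2)]
    return is_psd_2x2(C)

def minimal_norm(x0, x1, T):
    """sqrt(<z, Y^{-1} z>), the smallest rho passing the test (Y assumed p.d.)."""
    z = offset(x0, x1, T)
    Y = gramian(T)
    det = Y[0][0] * Y[1][1] - Y[0][1] ** 2
    q = (Y[1][1] * z[0] ** 2 - 2.0 * Y[0][1] * z[0] * z[1] + Y[0][0] * z[1] ** 2) / det
    return q ** 0.5

x0, x1, T = [1.0, 0.0], [0.0, 0.0], 1.0
rho_min = minimal_norm(x0, x1, T)
```

For this data `rho_min` equals $\sqrt{12}$, so `controllable(x0, x1, 3.5, T)` holds while `controllable(x0, x1, 3.4, T)` fails.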


Since the sufficiency is obvious, let us verify the necessity. Let С(^) be positive sem i-definite, i.e . let 1(7) hold with arbitrary x0, x£ and appropriately large p. Since

' I

( x , Y(tx )x )> = У ^ < W*{t1 , t ) x , W * ^ , r)x^> d

toti

= J ' IW*(t 1( r)x||2 dr ê 0, Vx € R

Y(ti) is, at least, positive sem i-definite. If it were singular, i.e . for some x' f 0, Y(t)x' = 0, from 1(7) we would have (<(x', z > ) 2 S 0, that is <^x', z У - 0 which is impossible for arbitrary x 0, x£, hence for arbitrary z.

Until now, we considered Up equipped with an L 22_norm. The previous results, appropriately adapted, would hold for more general norms, i.e. for u G 4 p :

IIu llr. p = (f l|u(T)||rP d x ) 1 г )

The admissible class of controls would be

U?, p = {u : Il u II r> p § p} 1(9)

Instead of Theorem 1,1 we would then have

Theorem 1 ,11

The system 1(1) is "controllable with respect to x 0, A, UfiP , Кц if and only if

/ p \!/q| < x ,x ! - V í t ^ X g ) ! S pl^J ||w*(t1( t)x||s d tJ + e, Vx £ Rn 1(10)

where

F or r / 2 , p f 2 , the verification of condition 1(9) may not be reduced to the previous simple algebraic condition of Corollary 1,2 but the following inequality should be verified

266 M ARZO LLO

$$\max_{\|x\| = 1}\left[\langle x, x_1 - V(t_1)x_0\rangle - \rho\left(\int_{t_0}^{t_1} \|W^*(t_1,\tau)x\|_s^q\, d\tau\right)^{1/q}\right] \le \varepsilon$$

which implies a maximization over the unit sphere $\|x\| = 1$ (or over the unit convex ball $\|x\| \le 1$).

PART II. TIME- AND NORM-MINIMAL CONTROLS

In this part, we shall rely heavily on the results of Part I concerning the reachability of a given point $x_1$ at time $t_1$ from $x_0$ at time $t_0$ under controls $u$ belonging to an admissible class $U^{\rho}_{r,p}$ defined in I(9), or $(x_0, x_1, U^{\rho}_{r,p}, K)$ controllability, i.e. we shall use I(10), which we rewrite for convenience in the case $\varepsilon = 0$:

$$|\langle x, x_1 - V(t_1)x_0\rangle| \le \rho\left(\int_{t_0}^{t_1} \|W^*(t_1,\tau)x\|_s^q\, d\tau\right)^{1/q}, \quad \forall x \in R^n \tag{II(1)}$$

We shall obtain results concerning the existence and the form of time optimal controls as well as of norm optimal controls, and shall derive a special case of Pontryagin's maximum principle. Throughout this chapter, we shall suppose $x_1 \neq V(t_1)x_0$, that is, the function $u = 0$ is not supposed to transfer $x_0$ into $x_1$. Let us first consider time optimal controls.

Theorem II, 1

If the system I(1) is controllable relative to points $x_0$ and $x_1$, time interval $K = [t_0, t_1]$ and the class $U^{\rho}_{r,p}$ of admissible controls, then there exists a least interval $\bar K = [t_0, \bar t]$, $(t_0 < \bar t \le t_1)$, in which it is controllable. In other words, if a system is $(x_0, x_1, U^{\rho}_{r,p}, K)$ controllable, then there exists a time optimal control.

Proof

Let us consider the set $H = \{t,\ t > t_0\}$ of final times $t$ for which the system is $(x_0, x_1, U^{\rho}_{r,p}, [t_0, t])$ controllable. This set is clearly non-empty. Let us define

$$\bar t = \inf H$$

We need to prove that the system I(1) is controllable relative to $x_0$, $x_1$, $U^{\rho}_{r,p}$ and $\bar K = [t_0, \bar t]$, i.e. that the interval $\bar K$ is the minimal interval over which the system I(1) is controllable, which is equivalent to saying that $\bar t$ belongs to $H$. Indeed, suppose that $\bar t$ does not belong to $H$. Then, by Theorem I,1', there would exist a vector $\bar x$ in $R^n$ such that

$$|\langle \bar x, x_1 - V(\bar t)x_0\rangle| > \rho\left(\int_{t_0}^{\bar t} \|W^*(\bar t,\tau)\bar x\|_s^q\, d\tau\right)^{1/q} \tag{II(2)}$$

By the definition of $\bar t$, there exists a sequence $\{t_n\}$ with $\lim_{n\to\infty} t_n = \bar t$ and such that for each interval $K_n = [t_0, t_n]$ the system is controllable; and, therefore,

$$|\langle \bar x, x_1 - V(t_n)x_0\rangle| \le \rho\left(\int_{t_0}^{t_n} \|W^*(t_n,\tau)\bar x\|_s^q\, d\tau\right)^{1/q} \tag{II(3)}$$

or

$$\gamma(t_n) \le \rho\,\delta(t_n)$$

with obvious meaning of $\gamma(t_n)$ and $\delta(t_n)$. Since $\gamma(t)$ and $\delta(t)$ are continuous functions, we have

$$\lim_{n\to\infty} \gamma(t_n) = \gamma(\bar t) = \bar\gamma = |\langle \bar x, x_1 - V(\bar t)x_0\rangle|$$

$$\lim_{n\to\infty} \rho\,\delta(t_n) = \rho\,\delta(\bar t) = \rho\,\bar\delta = \rho\left(\int_{t_0}^{\bar t} \|W^*(\bar t,\tau)\bar x\|_s^q\, d\tau\right)^{1/q}$$

From inequality II(3),

$$\lim_{n\to\infty} \gamma(t_n) \le \rho \lim_{n\to\infty} \delta(t_n)$$

that is

$$\bar\gamma \le \rho\,\bar\delta$$

which would contradict II(2).

Theorem II, 2

If $\bar K = [t_0, \bar t]$ is the minimal time interval over which the system I(1) can be transferred from $x_0$ to $x_1$ with an admissible control belonging to $U^{\rho}_{r,p}$, then there exists a vector $x' \neq 0$, $x' \in R^n$ such that

$$|\langle x', x_1 - V(\bar t)x_0\rangle| = \rho\left(\int_{t_0}^{\bar t} \|W^*(\bar t,\tau)x'\|_s^q\, d\tau\right)^{1/q} \tag{II(4)}$$

Proof

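Let the sequence $\{t_n\}$ converge to $\bar t$, with $t_n < \bar t$. For each $t_n$, there exists a vector $x_n$ such that

$$|\langle x_n, x_1 - V(t_n)x_0\rangle| > \rho\left(\int_{t_0}^{t_n} \|W^*(t_n,\tau)x_n\|_s^q\, d\tau\right)^{1/q} \tag{II(5)}$$

The inequality II(5) clearly holds if we replace $x_n$ by $\lambda x_n = x'_n$, where $\|x'_n\| = 1$. In this way, all $x'_n$ belong to the compact unit sphere of $R^n$, and, therefore, from the sequence $\{x'_n\}$ it is possible to extract a convergent subsequence $\{x'_{n_k}\}$ such that $\lim_{k\to\infty} x'_{n_k} = x'$, with $\|x'\| = 1$. Since both terms of inequality II(5) (with $x'_n$ instead of $x_n$) depend continuously on $t_n$ and on $x'_n$, we have

$$\lim_{k\to\infty} |\langle x'_{n_k}, x_1 - V(t_{n_k})x_0\rangle| = |\langle x', x_1 - V(\bar t)x_0\rangle|$$

and

$$\lim_{k\to\infty} \left(\int_{t_0}^{t_{n_k}} \|W^*(t_{n_k},\tau)x'_{n_k}\|_s^q\, d\tau\right)^{1/q} = \left(\int_{t_0}^{\bar t} \|W^*(\bar t,\tau)x'\|_s^q\, d\tau\right)^{1/q}$$

Recalling inequality II(5), we have

$$|\langle x', x_1 - V(\bar t)x_0\rangle| \ge \rho\left(\int_{t_0}^{\bar t} \|W^*(\bar t,\tau)x'\|_s^q\, d\tau\right)^{1/q}$$

Since the system is $(x_0, x_1, U^{\rho}_{r,p}, \bar K)$ controllable, II(1) also holds for $x = x'$ (with $t_1 = \bar t$), which gives the opposite inequality; hence equality holds and the theorem is proved.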

Let us now turn to the problem of the existence of the so-called "minimal norm" control: given the system I(1), which is supposed to be $(x_0, x_1, U^{\rho}_{r,p}, K)$ controllable, does a control $\hat u$ exist such that $\|\hat u\|_{r,p} = \min_{u \in U^{\rho}_{r,p}} \|u\|_{r,p}$ among the controls which transfer $x_0$ to $x_1$ within $K$?

The answer to this question is "yes", and the control $\hat u$ is called the "minimal-norm" control. We are going to prove this fact in the following

Theorem II, 3

Let the system I(1) be $(x_0, x_1, U^{\rho}_{r,p}, K = [t_0, t_1])$ controllable and let $\bar U \subset U^{\rho}_{r,p}$ be the set of functions $u$ which effect the transfer from $x_0$ to $x_1$ at time $t_1$, i.e. such that

$$\int_{t_0}^{t_1} W(t_1,\tau)\,u(\tau)\, d\tau = x_1 - V(t_1)x_0$$

Then there exists a function $\hat u$ such that

$$\|\hat u\|_{r,p} = \inf_{u \in \bar U} \|u\|_{r,p} = \bar\rho \tag{II(6)}$$

Proof

Let us define $\bar\rho$ to be $\inf_{u \in \bar U} \|u\|_{r,p}$. There exists, therefore, a sequence $\{\rho_n\}$, with $\rho_n > \bar\rho$ and such that $\lim_{n\to\infty} \rho_n = \bar\rho$. For each $\rho_n$ the system is $(x_0, x_1, U^{\rho_n}_{r,p}, K)$ controllable; therefore

$$|\langle x, x_1 - V(t_1)x_0\rangle| \le \rho_n\left(\int_{t_0}^{t_1} \|W^*(t_1,\tau)x\|_s^q\, d\tau\right)^{1/q}, \quad \forall x \in R^n$$

This inequality is conserved as $n$ tends to infinity, so we have

$$|\langle x, x_1 - V(t_1)x_0\rangle| \le \bar\rho\left(\int_{t_0}^{t_1} \|W^*(t_1,\tau)x\|_s^q\, d\tau\right)^{1/q}, \quad \forall x \in R^n \tag{II(7)}$$

and, by Theorem I,1', there exists a control function $\hat u \in \bar U$ with $\|\hat u\|_{r,p} \le \bar\rho$. Since $\hat u \in \bar U$ and $\inf_{u \in \bar U} \|u\|_{r,p} = \bar\rho$, $\|\hat u\|_{r,p}$ must be equal to $\bar\rho$, and the infimum in II(6) is assumed.

We now want to give an expression for $\bar\rho$ as a function of the parameters of the system I(1), and of $x_0$, $x_1$, $t_0$ and $t_1$. This expression is given in the following

Theorem II,4

Using the definition of Theorem II,3 for $\bar U$, we have

$$\bar\rho = \inf_{u \in \bar U} \|u\|_{r,p} = \sup_{\left(\int_{t_0}^{t_1} \|W^*(t_1,\tau)x\|_s^q\, d\tau\right)^{1/q} = 1} \langle x, x_1 - V(t_1)x_0\rangle \tag{II(8)}$$

Proof

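Since II(7) is satisfied, we have

$$\bar\rho \ge \sup_{x \in R^n} \frac{|\langle x, x_1 - V(t_1)x_0\rangle|}{\left(\int_{t_0}^{t_1} \|W^*(t_1,\tau)x\|_s^q\, d\tau\right)^{1/q}} = \sup_{\left(\int_{t_0}^{t_1} \|W^*(t_1,\tau)x\|_s^q\, d\tau\right)^{1/q} = 1} \langle x, x_1 - V(t_1)x_0\rangle = \rho' \tag{II(9)}$$

Therefore, we only need to show that $\bar\rho$ cannot be larger than $\rho'$. Let us suppose the contrary. Then there exists a positive constant $\sigma$ such that $\rho' < \sigma < \bar\rho$. Since $\sigma < \bar\rho$, by II(6), the system is not $(x_0, x_1, U^{\sigma}_{r,p}, K)$ controllable, and there exists a vector $x' \in R^n$ such that

$$|\langle x', x_1 - V(t_1)x_0\rangle| > \sigma\left(\int_{t_0}^{t_1} \|W^*(t_1,\tau)x'\|_s^q\, d\tau\right)^{1/q} \tag{II(10)}$$

Since $\sigma > 0$, $|\langle x', x_1 - V(t_1)x_0\rangle|$ is positive and, therefore, since $\hat u \in \bar U$, hence $\bar\rho > 0$, by II(7) also $\left(\int_{t_0}^{t_1} \|W^*(t_1,\tau)x'\|_s^q\, d\tau\right)^{1/q}$ is positive, and we may therefore consider the quantity

$$\eta' = \frac{x'}{\left(\int_{t_0}^{t_1} \|W^*(t_1,\tau)x'\|_s^q\, d\tau\right)^{1/q}}$$

and by II(10) we have

$$|\langle \eta', x_1 - V(t_1)x_0\rangle| > \sigma > \rho'$$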

which contradicts II(9).

Going back to consider the time optimal problem, let us suppose that $\bar K = [t_0, \bar t]$ is the minimal time interval over which the system I(1) can be transferred from $x_0$ at time $t_0$ to $x_1$ at time $\bar t$ with controls whose norm does not exceed a given $\rho$. On the other hand, we know from Theorem II,4 and Corollary II,2 that, for the interval $\bar K$ considered given, the minimal norm controls have norm $\bar\rho$:

$$\bar\rho = \sup_{\left(\int_{t_0}^{\bar t} \|W^*(\bar t,\tau)x\|_s^q\, d\tau\right)^{1/q} = 1} |\langle x, x_1 - V(\bar t)x_0\rangle| \tag{II(11)}$$

We may raise the following question: with reference to the same system, the same points, to $\rho$ and hence $U^{\rho}_{r,p}$, can $\bar\rho$ be less than $\rho$?

We shall give a partial answer to this question in the following

Corollary II,3

If the system I(1) is controllable relative to any pair of points of $R^n$, to the class $U^{\rho}_{r,p}$ of admissible controls and to the time interval $\bar K = [t_0, \bar t]$, where $\bar K$ is the minimal time interval over which the system is controllable relative to given $x_0$, $x_1$ and to $U^{\rho}_{r,p}$, with $\rho$ given, then $\bar\rho = \rho$, where $\bar\rho$ is the minimal value of the norm of controls which effect the given transfer from $x_0$ to $x_1$ in $\bar K$.

Proof

Since $\bar K = [t_0, \bar t]$ is a time optimal interval, by Theorem II,2 there exists a vector $x'$ such that II(4) holds. Since the system is controllable relative to any $x_0$, $x_1$, to $U^{\rho}_{r,p}$ and to $\bar K = [t_0, \bar t]$, by the already quoted Condition I,1, $Y(t_0, \bar t) = \int_{t_0}^{\bar t} W(\bar t,\tau)W^*(\bar t,\tau)\,d\tau$ is positive definite and, therefore, the integral $\left(\int_{t_0}^{\bar t} \|W^*(\bar t,\tau)x'\|_s^q\, d\tau\right)^{1/q}$ is positive. We may, therefore, write

$$\rho = \frac{|\langle x', x_1 - V(\bar t)x_0\rangle|}{\left(\int_{t_0}^{\bar t} \|W^*(\bar t,\tau)x'\|_s^q\, d\tau\right)^{1/q}}$$

and, recalling II(11), $\bar\rho \ge \rho$. Recalling that obviously $\bar\rho \le \rho$, the corollary is proved.

In what follows, we shall refer to time intervals $\bar K = [t_0, \bar t]$ which are minimal with respect to $x_0$, $x_1$ and to classes $U^{\rho}_{r,p}$ of admissible controls, $\rho$ being also the minimal norm of controls transferring $x_0$ into $x_1$ in $\bar K$. Beside II(8), we shall also use

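$$\rho = \min_{u \in \bar U} \|u\|_{r,p} = \max_{\left(\int_{t_0}^{\bar t} \|W^*(\bar t,\tau)x\|_s^q\, d\tau\right)^{1/q} = 1} \langle x, x_1 - V(\bar t)x_0\rangle = \max_{x} \frac{\langle x, x_1 - V(\bar t)x_0\rangle}{\left(\int_{t_0}^{\bar t} \|W^*(\bar t,\tau)x\|_s^q\, d\tau\right)^{1/q}} \tag{II(8')}$$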

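We shall now see that the derivation of optimal controls (optimal in the sense just specified of being both time-minimal and norm-minimal) follows from the previous results and from a repeated use of Hölder's inequality. Let $\hat u$ be optimal with respect to $x_0$, $x_1$, $\rho$, $\bar K = [t_0, \bar t]$. Since $\hat u$ transfers $x_0$ into $x_1$ in the interval $\bar K = [t_0, \bar t]$, we have

$$x_1 - V(\bar t)x_0 = \int_{t_0}^{\bar t} W(\bar t,\tau)\,\hat u(\tau)\, d\tau \tag{II(12)}$$

and since $\bar K$ is the minimal time interval, II(4) holds for some $x$. Putting together II(4) and II(12) and taking the adjoint $W^*(\bar t,\tau)$ of $W(\bar t,\tau)$, we have

$$\rho\left(\int_{t_0}^{\bar t} \|W^*(\bar t,\tau)x\|_s^q\, d\tau\right)^{1/q} = \left|\int_{t_0}^{\bar t} \langle W^*(\bar t,\tau)x,\ \hat u(\tau)\rangle\, d\tau\right|$$

Using Hölder's inequality both in its finite-dimensional form (for sums) and in its infinite-dimensional form (for integrals), we may write

$$\rho\left(\int_{t_0}^{\bar t} \|W^*(\bar t,\tau)x\|_s^q\, d\tau\right)^{1/q} = \left|\int_{t_0}^{\bar t} \langle W^*(\bar t,\tau)x,\ \hat u(\tau)\rangle\, d\tau\right| \le \int_{t_0}^{\bar t} |\langle W^*(\bar t,\tau)x,\ \hat u(\tau)\rangle|\, d\tau \le \int_{t_0}^{\bar t} \|W^*(\bar t,\tau)x\|_s\, \|\hat u(\tau)\|_r\, d\tau$$

$$\le \left(\int_{t_0}^{\bar t} \|\hat u(\tau)\|_r^p\, d\tau\right)^{1/p} \left(\int_{t_0}^{\bar t} \|W^*(\bar t,\tau)x\|_s^q\, d\tau\right)^{1/q} = \rho\left(\int_{t_0}^{\bar t} \|W^*(\bar t,\tau)x\|_s^q\, d\tau\right)^{1/q} = \rho\,\|W^*x\|_{s,q} \tag{II(13)}$$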

Since in II(13) the first and last terms are equal, we must have equality at each step. Recalling the conditions for Hölder's inequality to hold as an equality, we have almost everywhere in $\bar K$:

$$\|\hat u(t)\|_r = C\,\|W^*(\bar t,t)x\|_s^{q/p} \tag{II(14)}$$

$$|\hat u_i(t)| = K(t)\,|[W^*(\bar t,t)x]_i|^{s/r} \tag{II(15)}$$

and

$$\operatorname{sign} \hat u_i(t) = \operatorname{sign}\,[W^*(\bar t,t)x]_i \tag{II(16)}$$

From II(8') and II(14) we may determine the constant $C$. Indeed,

$$\rho = \|\hat u\|_{r,p} = C\left(\int_{t_0}^{\bar t} \|W^*(\bar t,\tau)x\|_s^q\, d\tau\right)^{1/p} = C\,(\|W^*x\|_{s,q})^{q/p}$$

Hence,

$$C = \frac{\rho}{(\|W^*x\|_{s,q})^{q/p}} = \frac{\rho}{(\|W^*x\|_{s,q})^{q-1}}$$

Since (see II(4) or II(8')) the vector $x$ is determined up to a multiplicative real constant, and since we suppose, as always, $x_1 - V(\bar t)x_0 \neq 0$, we may determine $x$ such that the scalar product $|\langle x, x_1 - V(\bar t)x_0\rangle|$ is equal to one. In correspondence to such $x$, we have from II(8')

$$\rho = \frac{1}{\|W^*x\|_{s,q}}$$

and, therefore,

$$C = (\|W^*x\|_{s,q})^{-q} \tag{II(17)}$$

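To determine the function $K(t)$ let us take the finite-dimensional norm of $\hat u(t)$, using II(15),

$$\|\hat u(t)\|_r = K(t)\left(\sum_{i=1}^{m} |[W^*(\bar t,t)x]_i|^{s}\right)^{1/r} = K(t)\,\|W^*(\bar t,t)x\|_s^{s/r} \tag{II(18)}$$

From II(14), II(17) and II(18) we have

$$K(t) = \|W^*x\|_{s,q}^{-q}\ \|W^*(\bar t,t)x\|_s^{\frac{q}{p} - \frac{s}{r}} \tag{II(19)}$$

From II(15), II(16), II(19), we eventually have

$$\hat u_i(t) = \|W^*x\|_{s,q}^{-q}\ \|W^*(\bar t,t)x\|_s^{\frac{q}{p} - \frac{s}{r}}\ |[W^*(\bar t,t)x]_i|^{s/r}\ \operatorname{sign}\,[W^*(\bar t,t)x]_i \tag{II(20)}$$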

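which is the general expression for the optimal control when the norm of $u$ is the general norm

$$\|u\|_{r,p} = \left(\int_{t_0}^{\bar t} \left[\sum_{i=1}^{m} |u_i(\tau)|^r\right]^{p/r} d\tau\right)^{1/p}$$

with $\frac{1}{r} + \frac{1}{s} = 1$, $\frac{1}{p} + \frac{1}{q} = 1$, and $x$ is given by II(8), II(8') with the additional constraint $|\langle x, x_1 - V(\bar t)x_0\rangle| = 1$. Many particular cases of the general formula II(20) are interesting. For example, if $u \in L_{2,2}$ and its norm becomes

$$\|u\|_{2,2} = \left(\int_{t_0}^{\bar t} \sum_{i=1}^{m} |u_i(\tau)|^2\, d\tau\right)^{1/2}$$

so as to have the intuitive meaning of "energy", the optimal control function is the following:

$$\hat u_i(t) = \frac{[W^*(\bar t,t)x]_i}{\int_{t_0}^{\bar t} \sum_{j=1}^{m} |[W^*(\bar t,\tau)x]_j|^2\, d\tau}$$

If $u \in L_{\infty,\infty}$ and its norm is

$$\|u\|_{\infty,\infty} = \operatorname*{ess\,sup}_{t \in \bar K}\ \max_{i = 1, \ldots, m} |u_i(t)|$$

the optimal control function is

$$\hat u_i(t) = \frac{\operatorname{sign}\,[W^*(\bar t,t)x]_i}{\int_{t_0}^{\bar t} \sum_{j=1}^{m} |[W^*(\bar t,\tau)x]_j|\, d\tau}$$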

In general, the type of norm to be chosen is given by physical considerations. Let us consider the case $u \in L_{r,\infty}$, or

$$\|u\|_{r,\infty} = \operatorname*{ess\,sup}_{t \in \bar K} \left(\sum_{i=1}^{m} |u_i(t)|^r\right)^{1/r}, \quad (1 \le r \le \infty)$$

and let us see how it is now possible to derive the well-known Pontryagin's maximum principle for this case. Let us consider the system I(1) with $u = 0$:

$$\dot x = A(t)x \tag{II(21)}$$

and its adjoint equation

$$\dot\psi = -A^*(t)\psi \tag{II(22)}$$

Let the fundamental matrix solution $V(t)$ of II(21) be non-singular at $\bar t$, hence invertible almost everywhere. Calling $\Psi(t)$ the fundamental matrix solution of II(22) with $\Psi^*(\bar t) = V^{-1}(\bar t)$, we immediately see that

$$\Psi^*(t) = V^{-1}(t)$$

(Indeed, $(d/dt)[\Psi^*(t)V(t)] = -[A^*(t)\Psi(t)]^*V(t) + \Psi^*(t)A(t)V(t) = 0$, hence $\Psi^*(t)V(t) = I$, where $I$ is the identity matrix). As a consequence, we may write the general vector solution of II(22) in the form

$$w(t) = \Psi(t)\Psi^{-1}(\bar t)x = V^{-1*}(t)V^*(\bar t)x \tag{II(23)}$$

where $x = w(\bar t)$. The optimal control law $\hat u$ makes all inequalities of II(13) equalities, as we have seen. In particular,

$$|\langle W^*(\bar t,t)x,\ \hat u(t)\rangle| = \|W^*(\bar t,t)x\|_s\, \|\hat u(t)\|_r$$

a.e. in $\bar K$ or, equivalently,

$$|\langle V^{-1*}(t)V^*(\bar t)x,\ B(t)\hat u(t)\rangle| = |\langle B^*(t)V^{-1*}(t)V^*(\bar t)x,\ \hat u(t)\rangle| = \|B^*(t)V^{-1*}(t)V^*(\bar t)x\|_s\, \|\hat u(t)\|_r$$

and since for every $u$ such that

$$\|u\|_{r,\infty} = \operatorname*{ess\,sup}_{t \in \bar K} \left(\sum_{i=1}^{m} |u_i(t)|^r\right)^{1/r} \le \rho$$

we have

$$|\langle V^{-1*}(t)V^*(\bar t)x,\ B(t)u(t)\rangle| \le \|B^*(t)V^{-1*}(t)V^*(\bar t)x\|_s\, \|u(t)\|_r \tag{II(24a)}$$

we may conclude that for every $t$ the function $\hat u(t)$ makes the quantity $|\langle V^{-1*}(t)V^*(\bar t)x,\ B(t)u(t)\rangle|$ maximal with respect to all $u(t)$ such that $\|u\|_{r,\infty} \le \rho$, i.e. to all admissible controls. Since, by II(23), $w(t) = V^{-1*}(t)V^*(\bar t)x$ is the solution of the adjoint equation $\dot w(t) = -A^*(t)w(t)$ satisfying $w(\bar t) = x$, we may say that the optimal control $\hat u(t)$ maximizes for every $t$, a.e. and with respect to all admissible controls, the scalar product $|\langle w(t),\ B(t)u(t)\rangle|$, where $w(t)$ is an appropriate solution of the quoted adjoint vector equation. This is precisely Pontryagin's maximum principle for this case.

Let us now go back to the general optimal control II(20) and notice that the only quantity whose computation is not explicit in it is the vector $x$, which corresponds to

$$\rho = \max_{|\langle x, x_1 - V(\bar t)x_0\rangle| = 1} \left(\int_{t_0}^{\bar t} \|W^*(\bar t,\tau)x\|_s^q\, d\tau\right)^{-1/q}, \qquad \frac{1}{p} + \frac{1}{q} = 1;\ \frac{1}{r} + \frac{1}{s} = 1 \tag{II(24b)}$$

For a general type of norm, this computation is a rather difficult problem which we are not going to treat here.

We shall only show how the computation of $x$ becomes an easy task when $u \in L_{2,2}$, hence $u$ belongs to a Hilbert space. From II(20) we see that in this case the form of the optimal control is

$$\hat u(t) = \frac{W^*(\bar t,t)x}{\int_{t_0}^{\bar t} \langle W^*(\bar t,\tau)x,\ W^*(\bar t,\tau)x\rangle\, d\tau} = \rho^2\, W^*(\bar t,t)x \tag{II(25)}$$

Since $\hat u$ transfers $x_0$ into $x_1$ at $\bar t$, we have, putting again $z(\bar t) = x_1 - V(\bar t)x_0$,

$$\int_{t_0}^{\bar t} W(\bar t,\tau)\,\hat u(\tau)\, d\tau = z(\bar t)$$

which, by II(25), becomes

$$\rho^2 \int_{t_0}^{\bar t} W(\bar t,\tau)W^*(\bar t,\tau)x\, d\tau = z(\bar t)$$

or, with the usual position for $Y(\bar t)$, the determination of $x$ reduces to the solution of the linear algebraic equation

$$\rho^2\, Y(\bar t)x = z(\bar t) \tag{II(26)}$$

with (from II(24b))

$$\frac{1}{\rho^2} = \langle x,\ Y(\bar t)x\rangle \tag{II(27)}$$

Let us distinguish two cases: in the first one, the system I(1) is controllable at time $\bar t$ relative to any pair of points of $R^n$. In this case, $Y(\bar t)$ is nonsingular, therefore invertible, and Eq. II(26) has the unique solution

$$x = \frac{1}{\rho^2}\, Y^{-1}(\bar t)\, z(\bar t)$$

with

$$\rho^2 = \langle Y^{-1}(\bar t)z(\bar t),\ z(\bar t)\rangle$$

and from II(25), the expression of the optimal control law $\hat u$ is

$$\hat u(t) = B^*(t)\,V^{-1*}(t)\,V^*(\bar t)\,Y^{-1}(\bar t)\,z(\bar t) \quad \text{for } t \in [t_0, \bar t] \tag{II(28)}$$
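Formula II(28) can be checked numerically on a very small case. The sketch below assumes a scalar, time-invariant system $\dot x = ax + bu$ with $a \neq 0$ and $t_0 = 0$, so that $V(t) = e^{at}$, $W(\bar t,\tau) = e^{a(\bar t - \tau)}b$ and $Y(\bar t) = b^2(e^{2a\bar t} - 1)/(2a)$. The system, the numerical values and the function names are illustrative assumptions, not part of the lectures.

```python
import math


def min_energy_control(a, b, x0, x1, t_bar):
    """Return (u_hat, rho_sq) for dx/dt = a*x + b*u on [0, t_bar], a != 0."""
    y = b * b * (math.exp(2.0 * a * t_bar) - 1.0) / (2.0 * a)  # Y(t_bar)
    z = x1 - math.exp(a * t_bar) * x0                          # z = x1 - V(t_bar) x0

    def u_hat(t):
        # scalar form of II(28): W^*(t_bar, t) Y^{-1} z
        return b * math.exp(a * (t_bar - t)) * z / y

    return u_hat, z * z / y                                    # rho^2 = <Y^{-1} z, z>


def simulate(a, b, x0, u, t_bar, steps=4000):
    """RK4 integration of the state; Simpson accumulation of int u(t)^2 dt."""
    h = t_bar / steps
    x, energy = x0, 0.0
    for k in range(steps):
        t = k * h
        k1 = a * x + b * u(t)
        k2 = a * (x + 0.5 * h * k1) + b * u(t + 0.5 * h)
        k3 = a * (x + 0.5 * h * k2) + b * u(t + 0.5 * h)
        k4 = a * (x + h * k3) + b * u(t + h)
        x += h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        energy += h / 6.0 * (u(t) ** 2 + 4.0 * u(t + 0.5 * h) ** 2 + u(t + h) ** 2)
    return x, energy


u_hat, rho_sq = min_energy_control(a=-1.0, b=1.0, x0=0.0, x1=1.0, t_bar=1.0)
x_end, energy = simulate(-1.0, 1.0, 0.0, u_hat, 1.0)
print(x_end, energy, rho_sq)
```

Under these assumptions the simulated final state should agree with $x_1$ and the accumulated energy with $\rho^2 = \langle Y^{-1}z, z\rangle$, up to integration error.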

In the second case, the matrix $Y(\bar t)$ is singular, but the system is still controllable with respect to $x_0$, $x_1$, $\bar K = [t_0, \bar t]$, $\rho$, and therefore the matrix $C(\bar t) = Y(\bar t) - (1/\rho^2)Z(\bar t)$ is positive semi-definite. We shall see that it is still possible to find an explicit expression of the optimal control law. To do that, we need the following

Theorem II,5

Let $Ra[Y(\bar t)]$ be the range of the matrix $Y(\bar t)$. Then, if $C(\bar t)$ is positive semi-definite, the vector $z(\bar t)$ belongs to $Ra[Y(\bar t)]$.

Proof

Let $N[Y(\bar t)]$ be the null space of $Y(\bar t)$, that is the set of vectors $x$ such that $Y(\bar t)x = 0$. (The dimension of the null space $N[Y(\bar t)]$ is equal to the multiplicity of 0 as an eigenvalue of $Y(\bar t)$.) Since $Y(\bar t)$ is real and symmetric, its eigenvectors span the space $R^n$, and

$$R^n = Ra[Y(\bar t)] \oplus N[Y(\bar t)]$$

Let $P_N$ be the projection operator of $R^n$ onto $N[Y(\bar t)]$, and consider the vector $P_N z(\bar t)$. Since $C(\bar t)$ is positive semi-definite, we have

$$\langle C(\bar t)P_N z(\bar t),\ P_N z(\bar t)\rangle \ge 0$$

and, by definition of $N[Y(\bar t)]$,

$$\langle C(\bar t)P_N z(\bar t),\ P_N z(\bar t)\rangle = -\frac{1}{\rho^2}\langle P_N z(\bar t),\ z(\bar t)\rangle^2 \le 0$$

and therefore $P_N z(\bar t)$ is orthogonal to $z(\bar t)$, which means that $z(\bar t)$ does not have components in $N[Y(\bar t)]$ and belongs to $Ra[Y(\bar t)]$, as we wanted to prove.

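Let now $\ell$ be the dimension of $N[Y(\bar t)]$, so that $n - \ell$ is the dimension of $Ra[Y(\bar t)]$. The vector $z(\bar t)$ can then be written as $z(\bar t) = \sum_{i=\ell+1}^{n} \zeta_i d_i$, where $d_i$ are the eigenvectors of $Y(\bar t)$ corresponding to the non-null eigenvalues of $Y(\bar t)$. Since $x$ can be written as $x = \sum_{i=1}^{n} \xi_i d_i$, where $d_1, \ldots, d_\ell$ are the vectors spanning $N[Y(\bar t)]$, equation II(26) becomes

$$\rho^2\, Y(\bar t) \sum_{i=1}^{n} \xi_i d_i = \sum_{i=\ell+1}^{n} \zeta_i d_i$$

that is

$$\rho^2 \sum_{i=\ell+1}^{n} \xi_i \lambda_i d_i = \sum_{i=\ell+1}^{n} \zeta_i d_i$$

where $\lambda_i$ are the non-null (and real, since $Y(\bar t)$ is real and symmetric) eigenvalues of $Y(\bar t)$. The solution of II(26) is, therefore,

$$x = \sum_{i=1}^{\ell} \xi_i d_i + \frac{1}{\rho^2} \sum_{i=\ell+1}^{n} \frac{\zeta_i}{\lambda_i}\, d_i$$

and, from II(25), we see that the optimal control law is

$$\hat u(t) = \rho^2\, W^*(\bar t,t)\left[\sum_{i=1}^{\ell} \xi_i d_i + \frac{1}{\rho^2} \sum_{i=\ell+1}^{n} \frac{\zeta_i}{\lambda_i}\, d_i\right] \tag{II(29)}$$

where $\xi_1, \ldots, \xi_\ell$ are arbitrary constants. We may check that the values of $\xi_1, \ldots, \xi_\ell$ do not affect the norm $\rho$ simply by computing $\rho$; indeed (taking the eigenvectors $d_i$ orthonormal),

$$1 = |\langle x, z(\bar t)\rangle| = \left|\left\langle \sum_{i=1}^{\ell} \xi_i d_i + \frac{1}{\rho^2} \sum_{i=\ell+1}^{n} \frac{\zeta_i}{\lambda_i}\, d_i,\ \sum_{i=\ell+1}^{n} \zeta_i d_i\right\rangle\right| = \frac{1}{\rho^2} \sum_{i=\ell+1}^{n} \frac{\zeta_i^2}{\lambda_i}$$

from which

$$\rho = \left(\sum_{i=\ell+1}^{n} \frac{\zeta_i^2}{\lambda_i}\right)^{1/2} \tag{II(30)}$$

Formula II(30), with $\ell = 0$, is, of course, valid also when $Y(\bar t)$ is non-singular.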

Exercise II, 1

Given the system

$$\dot x = -\frac{1}{4}\begin{pmatrix} 2 & 20 \\ 5 & 2 \end{pmatrix} x + \begin{pmatrix} 2 & 4 \\ 1 & 2 \end{pmatrix} u(t), \qquad x_0 = \begin{pmatrix} 2 \\ -3 \end{pmatrix}, \quad x_1 = \begin{pmatrix} 8 \\ -4 \end{pmatrix} \tag{E(1)}$$

and admissible controls $u$ such that

$$\int_0^T \|u(s)\|^2\, ds \le 3$$

where $\|u(t)\|$ is the Euclidean norm of $u(t)$:

a) prove that it is controllable relative to $x_0$, $x_1$, to the interval $K = [0, T]$ for some $T > 0$, and to the given class of admissible controls;

b) determine the minimal time $\bar t$ such that the system is controllable relative to the same quantities and construct a corresponding time optimal control;

c) verify whether there exists only one time-optimal control;

d) verify whether there exist admissible controls which transfer $x_0$ into $x_1$, and which are time-optimal but not norm-minimal;

e) if the class of admissible controls is $U = \{u : \sup_{t \in [0,T]} \|u(t)\| \le 3\}$, with respect to which $T$ is the system controllable?

Solution

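The eigenvalues of the matrix $A = -\frac{1}{4}\begin{pmatrix} 2 & 20 \\ 5 & 2 \end{pmatrix}$ are $\lambda_1 = 2$, $\lambda_2 = -3$.

The diagonal matrix $A' = SAS^{-1}$, with $S = \frac{1}{4}\begin{pmatrix} 1 & -2 \\ 1 & 2 \end{pmatrix}$ and $S^{-1} = \begin{pmatrix} 2 & 2 \\ -1 & 1 \end{pmatrix}$, is, therefore, $A' = \begin{pmatrix} 2 & 0 \\ 0 & -3 \end{pmatrix}$, and the fundamental matrix solution $V(t)$ of E(1) with $V(0) = I$ is

$$V(t) = \frac{1}{4}\begin{pmatrix} 2 & 2 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} e^{2t} & 0 \\ 0 & e^{-3t} \end{pmatrix}\begin{pmatrix} 1 & -2 \\ 1 & 2 \end{pmatrix}$$

Therefore

$$W(T,t) = \frac{1}{4}\begin{pmatrix} 2 & 2 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} e^{2(T-t)} & 0 \\ 0 & e^{-3(T-t)} \end{pmatrix}\begin{pmatrix} 1 & -2 \\ 1 & 2 \end{pmatrix}\begin{pmatrix} 2 & 4 \\ 1 & 2 \end{pmatrix} = e^{-3(T-t)}\begin{pmatrix} 2 & 2 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 1 & 2 \end{pmatrix} \tag{E(2)}$$

and the matrix product $W(T,t)W^*(T,t)$ is

$$W(T,t)W^*(T,t) = e^{-6(T-t)}\begin{pmatrix} 2 & 2 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 0 & 5 \end{pmatrix}\begin{pmatrix} 2 & -1 \\ 2 & 1 \end{pmatrix} \tag{E(3)}$$

and, therefore,

$$Y(T) = \int_0^T W(T,s)W^*(T,s)\, ds = \frac{1 - e^{-6T}}{6}\begin{pmatrix} 2 & 2 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 0 & 5 \end{pmatrix}\begin{pmatrix} 2 & -1 \\ 2 & 1 \end{pmatrix} = \frac{5(1 - e^{-6T})}{6}\begin{pmatrix} 2 & 2 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 2 & -1 \\ 2 & 1 \end{pmatrix} \tag{E(4)}$$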

We notice that since $Y(T)$ has 0 as an eigenvalue it is not positive definite; hence the system is not controllable relative to every pair of points $x_0$, $x_1$.

From the previous position $z(T) = x_1 - V(T)x_0$ we have

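$$z(T) = \begin{pmatrix} 8 \\ -4 \end{pmatrix} - \frac{1}{4}\begin{pmatrix} 2 & 2 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} e^{2T} & 0 \\ 0 & e^{-3T} \end{pmatrix}\begin{pmatrix} 1 & -2 \\ 1 & 2 \end{pmatrix}\begin{pmatrix} 2 \\ -3 \end{pmatrix} = \begin{pmatrix} 2 & 2 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} 4 - 2e^{2T} \\ e^{-3T} \end{pmatrix} \tag{E(5)}$$

and, therefore,

$$Z(T) = z(T)z^*(T) = \begin{pmatrix} 2 & 2 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} 4 - 2e^{2T} \\ e^{-3T} \end{pmatrix}\begin{pmatrix} 4 - 2e^{2T} & e^{-3T} \end{pmatrix}\begin{pmatrix} 2 & -1 \\ 2 & 1 \end{pmatrix}$$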

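From the definition of $C(t)$ and from E(4), E(5) we have, with $\rho^2 = 3$,

$$C(T) = Y(T) - \frac{1}{3}Z(T) = \begin{pmatrix} 2 & 2 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} -\frac{1}{48}(16 - 8e^{2T})^2 & c_1(T) \\ c_1(T) & c_2(T) \end{pmatrix}\begin{pmatrix} 2 & -1 \\ 2 & 1 \end{pmatrix} \tag{E(6)}$$

where $c_1(T)$ and $c_2(T)$ are functions obtainable from E(4), E(5), but which are not relevant for our purposes.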

From E(6) we see that $C(T)$ is congruent to the matrix

$$R(T) = \begin{pmatrix} -\frac{1}{48}(16 - 8e^{2T})^2 & c_1(T) \\ c_1(T) & c_2(T) \end{pmatrix}$$

and, therefore, $C(T)$ has the same character as $R(T)$ as far as positive or negative semi-definiteness is concerned.

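By inspecting $R(T)$, we easily see that a necessary condition for it to be positive semi-definite is that $16 - 8e^{2T}$ be zero, and this is verified only for $T = \frac{1}{2}\ln 2 = t^*$. So this is the only candidate time for the system to be controllable. It turns out that the system is controllable relative to the given $x_0$, $x_1$, $\rho = \sqrt{3}$ and the time interval $[0, t^*]$. To prove that, let us first see that the equation

$$Y(t^*)x = z(t^*) \tag{E(7)}$$

has a solution. From E(4), E(5), Eq. E(7) with $t^* = \frac{1}{2}\ln 2$ is

$$\frac{35}{48}\begin{pmatrix} 2 & 2 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 2 & -1 \\ 2 & 1 \end{pmatrix} x = \begin{pmatrix} 2 & 2 \\ -1 & 1 \end{pmatrix}\begin{pmatrix} 0 \\ 2^{-3/2} \end{pmatrix} \tag{E(8)}$$

Since $\begin{pmatrix} 2 & 2 \\ -1 & 1 \end{pmatrix}$ is non-singular, we may pre-multiply both sides of E(8) by its inverse (and multiply through by 4), obtaining

$$\frac{35}{12}\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 2 & -1 \\ 2 & 1 \end{pmatrix} x = \begin{pmatrix} 0 \\ 4 \cdot 2^{-3/2} \end{pmatrix}$$

or, writing $x = (x_1, x_2)^*$ for the components of $x$,

$$(2x_1 + x_2) = \frac{12}{35} \cdot 4 \cdot 2^{-3/2} \tag{E(9)}$$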

We know that any solution $x$ of E(9) is such that the control $\hat u(t) = B^*(t)V^{-1*}(t)V^*(t^*)x$ takes $x_0$ into $x_1$ at time $t^*$. Actually, E(9) has infinitely many solutions, all with this property and all corresponding to norm optimal

controls. Among these solutions let us take one of the form x= X *i 9 _ о /л

that from E(9) we have X = — 2о D


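The corresponding norm-minimal control is

$$\hat u(t) = \frac{6}{35}\, e^{3t}\begin{pmatrix} 1 \\ 2 \end{pmatrix}, \qquad 0 \le t \le \tfrac{1}{2}\ln 2 \tag{E(10)}$$

and its squared norm is

$$\bar\rho^2 = \int_0^{\frac{1}{2}\ln 2} \|\hat u(s)\|^2\, ds = \frac{36}{245}\int_0^{\frac{1}{2}\ln 2} e^{6s}\, ds = \frac{6}{35} \tag{E(11)}$$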

Since $\bar\rho^2 = \frac{6}{35} < 3$, the control $\hat u$ is admissible and we have proved that the given system is controllable relative to $x_0 = \begin{pmatrix} 2 \\ -3 \end{pmatrix}$, $x_1 = \begin{pmatrix} 8 \\ -4 \end{pmatrix}$, time interval $[0, \frac{1}{2}\ln 2]$ and $\rho = \sqrt{3}$.

Of course, since $t^* = \frac{1}{2}\ln 2$ is the only time for which the system is controllable, then it is also the minimal time, and the control $\hat u(t)$ given by E(10) is not only norm-minimal, but also time-optimal.
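The result of the exercise up to this point can be checked by direct simulation. The following sketch integrates $\dot x = Ax + Bu$ with a fourth-order Runge-Kutta scheme, using $A = -\frac{1}{4}\begin{pmatrix} 2 & 20 \\ 5 & 2 \end{pmatrix}$, $B = \begin{pmatrix} 2 & 4 \\ 1 & 2 \end{pmatrix}$, $x_0 = (2, -3)$ and the control $\hat u(t) = \frac{6}{35}e^{3t}(1, 2)$ on $[0, \frac{1}{2}\ln 2]$. These matrices and this control are read from the garbled printed exercise, so they should be treated as an assumption; the integrator, step count and function names are illustrative choices.

```python
import math

A = [[-0.5, -5.0], [-1.25, -0.5]]   # -(1/4) [[2, 20], [5, 2]]
B = [[2.0, 4.0], [1.0, 2.0]]
X0 = (2.0, -3.0)
T_STAR = 0.5 * math.log(2.0)


def u_hat(t):
    """Candidate norm-minimal control, Eq. E(10)."""
    c = 6.0 / 35.0 * math.exp(3.0 * t)
    return (c, 2.0 * c)


def rhs(t, x):
    """Right-hand side A x + B u_hat(t)."""
    u = u_hat(t)
    return (
        A[0][0] * x[0] + A[0][1] * x[1] + B[0][0] * u[0] + B[0][1] * u[1],
        A[1][0] * x[0] + A[1][1] * x[1] + B[1][0] * u[0] + B[1][1] * u[1],
    )


def simulate(steps=4000):
    """RK4 for the state, Simpson's rule for the integral of ||u||^2."""
    h = T_STAR / steps
    x, energy = X0, 0.0
    sq = lambda t: u_hat(t)[0] ** 2 + u_hat(t)[1] ** 2
    for k in range(steps):
        t = k * h
        k1 = rhs(t, x)
        k2 = rhs(t + h / 2, (x[0] + h / 2 * k1[0], x[1] + h / 2 * k1[1]))
        k3 = rhs(t + h / 2, (x[0] + h / 2 * k2[0], x[1] + h / 2 * k2[1]))
        k4 = rhs(t + h, (x[0] + h * k3[0], x[1] + h * k3[1]))
        x = (
            x[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            x[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]),
        )
        energy += h / 6 * (sq(t) + 4 * sq(t + h / 2) + sq(t + h))
    return x, energy


x_final, energy = simulate()
print(x_final, energy)
```

Under the stated assumptions, the final state should be close to $x_1 = (8, -4)$ and the accumulated energy close to $6/35$, well below the bound 3.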

Since in E(11) we saw that $\bar\rho < \rho = \sqrt{3}$, we expect that, besides the norm-minimal controls which effect the same transfer in the same time, there are other controls which do the same job, being also admissible but not norm-minimal. Indeed, let us add to the norm-minimal control $\hat u(t)$ given by E(10) any other control $u_1(t)$ such that

$$\int_0^{\frac{1}{2}\ln 2} V(\tfrac{1}{2}\ln 2)\,V^{-1}(s)\,B(s)\,u_1(s)\, ds = 0 \tag{E(12)}$$

Taking $u_1(s) = \begin{pmatrix} K_1 \\ K_2 \end{pmatrix}$, with $K_1$ and $K_2$ constants, we have from E(2)

$$\int_0^{\frac{1}{2}\ln 2} V(\tfrac{1}{2}\ln 2)\,V^{-1}(s)\,B(s)\, ds \begin{pmatrix} K_1 \\ K_2 \end{pmatrix} = C\begin{pmatrix} 2 & 4 \\ 1 & 2 \end{pmatrix}\begin{pmatrix} K_1 \\ K_2 \end{pmatrix} = C\begin{pmatrix} 2K_1 + 4K_2 \\ K_1 + 2K_2 \end{pmatrix}$$

with $C$ constant, and therefore all controls of the type $u_1(t) = \begin{pmatrix} K_1 \\ -\frac{1}{2}K_1 \end{pmatrix}$ satisfy E(12). Therefore, all infinitely many controls $u$ of the type

$$u(t) = \hat u(t) + \begin{pmatrix} K_1 \\ -\frac{1}{2}K_1 \end{pmatrix}$$

with $|K_1| > 0$ limited by the constraint $\int_0^{\frac{1}{2}\ln 2} \|u(s)\|^2\, ds < 3$ are admissible and transfer $x_0$ into $x_1$ in the time interval $[0, \frac{1}{2}\ln 2]$, and are therefore time optimal but not norm minimal.

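Passing now to the last question e), let us verify the general controllability condition I(11) in this specific case.

For the system to be controllable at time $T$, we must have

$$|\langle x, z(T)\rangle| \le \sqrt{3}\int_0^T \|W^*(T,s)x\|\, ds$$

for every $x \in R^n$. From E(2), E(5) we must therefore have

$$\left|\left\langle \begin{pmatrix} 2 & -1 \\ 2 & 1 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix},\ \begin{pmatrix} 4 - 2e^{2T} \\ e^{-3T} \end{pmatrix}\right\rangle\right| \le \sqrt{3}\int_0^T e^{-3(T-s)}\left\|\begin{pmatrix} 0 & 1 \\ 0 & 2 \end{pmatrix}\begin{pmatrix} 2 & -1 \\ 2 & 1 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}\right\| ds, \quad \forall x_1, \forall x_2 \tag{E(13)}$$

Since the matrix $\begin{pmatrix} 2 & -1 \\ 2 & 1 \end{pmatrix}$ is non-singular, the vector

$$\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} 2 & -1 \\ 2 & 1 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$$

spans the space $R^2$ as well as $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ and, therefore, we may write

$$\left|\left\langle \begin{pmatrix} y_1 \\ y_2 \end{pmatrix},\ \begin{pmatrix} 4 - 2e^{2T} \\ e^{-3T} \end{pmatrix}\right\rangle\right| \le \sqrt{3}\int_0^T e^{-3(T-s)}\left\|\begin{pmatrix} 0 & 1 \\ 0 & 2 \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}\right\| ds, \quad \forall y_1, \forall y_2$$

that is

$$|(4 - 2e^{2T})\,y_1 + e^{-3T}\,y_2| \le \frac{\sqrt{5}}{\sqrt{3}}\,(1 - e^{-3T})\,|y_2|, \quad \forall y_1, \forall y_2 \tag{E(14)}$$

as a condition of controllability instead of E(13).

A necessary condition for E(14) to hold is that the coefficient of $y_1$ be zero, i.e. $T = \frac{1}{2}\ln 2$. For such a value of $T$, E(14) becomes

$$2^{-3/2}\,|y_2| \le \frac{\sqrt{5}}{\sqrt{3}}\,(1 - 2^{-3/2})\,|y_2|, \quad \forall y_2$$

which is satisfied.

Therefore we may conclude that the system E(1) is controllable with respect to $x_0$, $x_1$, class of admissible controls $U = \{u : \sup_{t \in [0, \frac{1}{2}\ln 2]} \|u(t)\| \le 3\}$, only for the time interval $K = [0, \frac{1}{2}\ln 2]$.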


PART III. REACHABILITY OF SETS IN THE PRESENCE OF DISTURBANCES

Given a control system subject to partially unknown perturbations, a problem which arises naturally is to find the set $X_0$ of initial states $x_0$ which may be transferred "for sure" into a given target set $X_N$ by an admissible control.

Many practical examples of such a problem may be found in various fields of system sciences, of operational research and management science, when a "worst case" or conservative philosophy is to be adopted, that is when the aim is to reach a goal for sure, not to maximize the expectation of a given event: an intuitive case is, for example, the problem of determining the zone starting from which a given vehicle can reach a target for sure in a noisy environment. This kind of problem, as well as the determination of the best control in the worst possible conditions, has been formally treated only recently, either in itself or in the framework of differential game theory, in which it may be embedded if the partially unknown disturbances or environmental situations are conservatively treated as the adversary of game theory.

The algorithms which are presently available for the solution of the problem above, even for linear systems with admissible controls and disturbances belonging to convex compact sets, are generally quite difficult to implement, and it seems that further research is needed in order to furnish flexible algorithms for facing practical situations.

The purpose here is to sketch the key points of the methods of solution proposed until now, for different information structures available both to the controller and to the adversary disturbances, and to show some interesting simplifications which are the consequence of taking a subspace as a target set and any control and disturbance as admissible; for this case, it is possible to state some interesting results in a compact form, as we shall do.

As an example of the problem mentioned, let us consider the following linear discrete time system

$$x_{k+1} = Ax_k + Bu_k + Cw_k \quad (k = 0, 1, \ldots) \tag{III(1)}$$

where the states $x_k$ belong to $R^n$, controls $u_k$ and disturbances $w_k$ belong to closed compact sets $U$ of $R^m$ and $W$ of $R^p$ respectively, and the target is a given closed compact set $X_N$ of $R^n$ (many types of extensions, like the one to time-varying systems with sets $U$ and $W$ varying also with time instants, are possible but inessential to our purposes).

We formulate the problem of reachability of $X_N$ from $X_0$ under disturbances by giving two among the main possible information structures.

Problem III, 1

Determine the set $X_0^1$ of initial states $x_0$ which may be transferred into $X_N$ at the time instant $N$ by an admissible sequence $u_0(x_0), u_1(x_1), \ldots, u_{N-1}(x_{N-1})$ of controls, for any admissible sequence $(w_0, w_1, \ldots, w_{N-1})$ of disturbances. Clearly, for each of its "moves" $u_i$ the controller may take into account his perfect information about the present state $x_i$, but is ignorant about the "move" $w_i$ of the disturbances.

Problem III, 2

Determine the set $X_0^2$ of initial states $x_0$ which may be transferred into $X_N$ at time instant $N$ by an admissible sequence $u_0(x_0, w_0), \ldots, u_{N-1}(x_{N-1}, w_{N-1})$ of controls for any admissible sequence $w_0, \ldots, w_{N-1}$ of disturbances. Obviously, the information available to the controller is in this case "larger" than in the previous case, since in each step he knows the "move" of the disturbances.

Even for these simple examples of the general problem, with the clearly defined information structures mentioned, the methods used for finding $X_0^1$ or $X_0^2$ have required the use of the so-called operation of geometric difference of sets¹ and either the use of separation theorems for convex compact sets [3], or the use of support functions to describe sets and set-inclusion [4, 5], or some ellipsoid-type or polyhedral-type approximation of sets [6], which have the advantage of describing sets with a finite number of parameters but can give only sufficient conditions for a point $x_0$ to be transferable into $X_N$, that is, can give only subsets of $X_0^1$ or $X_0^2$.

When the instant of time at which reachability occurs is of interest, game-theoretical approaches to the above kind of problems are interesting, but computational results are rather involved [7, 8].

Referring to Problem III,1 and using Eq. III(1), we see that at the $(N-1)$-th step the state $x_{N-1}$ is "transferable" into $X_N$ if there exists $u_{N-1}(x_{N-1}) \in U$ such that

$$Ax_{N-1} + Bu_{N-1} + Cw_{N-1} \in X_N$$

for every $w_{N-1} \in W$, that is iff there exists $u_{N-1}(x_{N-1}) \in U$ such that

$$Ax_{N-1} + Bu_{N-1} + CW \subset X_N$$

or

$$Ax_{N-1} + Bu_{N-1} \in (X_N \overset{*}{-} CW)$$

that is iff

$$Ax_{N-1} \in (X_N \overset{*}{-} CW) - BU$$

Therefore the set $X^1_{N-1}$ of states $x_{N-1}$ which may be transferred into $X_N$ in one step satisfies the following equation:

$$Y^1_{N-1} = AX^1_{N-1} = (X_N \overset{*}{-} CW) - BU$$

and, similarly, the set $X^1_{N-i}$ $(i = 1, \ldots, N)$ of states which may be transferred into $X_N$ in $i$ steps satisfies

$$Y^1_{N-i} = A^i X^1_{N-i} = (Y^1_{N-i+1} \overset{*}{-} A^{i-1}CW) - A^{i-1}BU \quad (i = 1, \ldots, N) \tag{III(2)}$$

Equation III(2), together with the obvious equality

$$Y^1_N = X_N \tag{III(3)}$$

is a recursive algorithm for building $Y^1_{N-1}, \ldots, Y^1_0$ from $U$, $W$, $X_N$ and, therefore, $X^1_0$, which is characterized by

$$A^N X^1_0 = Y^1_0 \tag{III(4)}$$

¹ Given two sets $S$ and $T$, the geometric difference $Z = S \overset{*}{-} T$ is defined as $Z = \{z : z + T \subset S\}$. For its properties, see Ref. [9].

Proceeding in an analogous manner for Problem III,2, and taking into account the different information structure, we have that in this case $x_{N-1}$ may be transferred into $X_N$ in one step iff for every $w_{N-1} \in W$ there exists $u_{N-1}(x_{N-1}, w_{N-1}) \in U$ such that

$$Ax_{N-1} + Bu_{N-1} + Cw_{N-1} \in X_N$$

that is iff

$$Ax_{N-1} + CW \subset X_N - BU$$

Therefore, the set $X^2_{N-1}$ of states $x_{N-1}$ of this type is given by

$$Y^2_{N-1} = AX^2_{N-1} = (X_N - BU) \overset{*}{-} CW$$

and similarly the sets $X^2_{N-i}$ $(i = 1, \ldots, N)$ of states which may be transferred into $X_N$ in $i$ steps by admissible controls and for every disturbance satisfy

$$Y^2_{N-i} = A^i X^2_{N-i} = (Y^2_{N-i+1} - A^{i-1}BU) \overset{*}{-} A^{i-1}CW \quad (i = 1, \ldots, N) \tag{III(2')}$$

We have again found a recursive algorithm which, starting again backward from

$$Y^2_N = X_N \tag{III(3')}$$

gives the following characterization of $X^2_0$:

$$A^N X^2_0 = Y^2_0 \tag{III(4')}$$

As is apparent from III(2), (3), (4) or from III(2'), (3'), (4'), the determination of $X^1_0$ or of $X^2_0$ essentially involves the computation of $N$ geometric differences and $N$ additions of sets. As we saw in Part I, the analogue of Problem III,1 and Problem III,2 for continuous systems, considering only open loop controls, would lead to a similar computation for only one step, but the matrices $A^{i-1}B$ and $A^{i-1}C$ $(i = 0, \ldots, N-1)$ would be substituted by linear integral operators from sets $U$ and $W$ of control and disturbance functions into $R^n$.

Going back to consider Eqs III(2), (3), (4) or III(2'), (3'), (4'), we see that the key difficulty is the description of sets resulting from operations on sets.
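In one dimension the description of the sets is trivial, since every set involved is an interval, and the two recursions can be written out directly. The sketch below assumes a scalar system $x_{k+1} = ax_k + bu_k + cw_k$ with $a \neq 0$ and interval sets $U$, $W$, $X_N$; the interval representation, the function names and the numerical values are illustrative assumptions, not part of the lectures.

```python
from typing import Optional, Tuple

Interval = Tuple[float, float]  # (lower end, upper end)


def scale(k: float, s: Interval) -> Interval:
    """Image k*S of an interval under multiplication by a scalar."""
    lo, hi = k * s[0], k * s[1]
    return (min(lo, hi), max(lo, hi))


def minkowski_sum(s: Interval, t: Interval) -> Interval:
    return (s[0] + t[0], s[1] + t[1])


def geometric_difference(s: Interval, t: Interval) -> Optional[Interval]:
    """{z : z + T is contained in S}; None stands for the empty set."""
    lo, hi = s[0] - t[0], s[1] - t[1]
    return (lo, hi) if lo <= hi else None


def initial_set(a, b, c, u_set, w_set, target, n_steps, disturbance_known=False):
    """Backward recursion III(2) (default) or III(2') (disturbance_known=True)."""
    y = target                                   # Y_N = X_N
    for i in range(1, n_steps + 1):
        minus_bu = scale(-(a ** (i - 1)) * b, u_set)   # the set -A^{i-1} B U
        cw = scale((a ** (i - 1)) * c, w_set)          # the set  A^{i-1} C W
        if disturbance_known:
            y = geometric_difference(minkowski_sum(y, minus_bu), cw)
        else:
            d = geometric_difference(y, cw)
            y = None if d is None else minkowski_sum(d, minus_bu)
        if y is None:
            return None
    return scale(1.0 / a ** n_steps, y)          # solve A^N X_0 = Y_0


U, W = (-1.0, 1.0), (-0.5, 0.5)
wide = initial_set(1.0, 1.0, 1.0, U, W, (-1.0, 1.0), 2)
narrow_1 = initial_set(1.0, 1.0, 1.0, U, W, (-0.2, 0.2), 1)
narrow_2 = initial_set(1.0, 1.0, 1.0, U, W, (-0.2, 0.2), 1, disturbance_known=True)
print(wide, narrow_1, narrow_2)
```

With the wide target both information structures give the same initial set. With the narrow target, which is smaller than the disturbance set, the first recursion returns the empty set, while the second (where the controller sees the disturbance) still returns a non-empty interval.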


A natural tool to be used for this purpose is the one given by support functions (see, e.g. [5]), when the sets $X_N$, $U$, $W$, and therefore also all other sets involved, are convex and closed. Support functions are linear with respect to the addition of such sets; unfortunately, the operation of geometric difference of sets does not possess such a property: if $h_X$ is the support function of the set $X$, $h_X(p) = \sup_{x \in X} \langle p, x\rangle$, $\forall p \in X^*$, where $X^*$ is the dual of the space $X$, then for every pair of sets $S$, $T$, we have

$$h_{S \overset{*}{-} T}(p) \le h_S(p) - h_T(p) \tag{III(5)}$$

It is therefore important to find conditions on the sets involved in III(2), (3), (4) (or III(2'), (3'), (4')) for III(5) to hold as an equality (we suppose, for simplicity, that the geometric differences are never the empty set). In such a case, the construction of the sets $Y^1_{N-1}, \ldots, Y^1_0$ from $Y^1_N$ and of $Y^2_{N-1}, \ldots, Y^2_0$ from $Y^2_N$ would be straightforward and we would also have the interesting consequence that these two sequences of sets would coincide if $Y^1_N = Y^2_N$; therefore $X^1_0$ would also coincide with $X^2_0$ and the difference in the information structures of Problems 1 and 2 would not have any effect.

Defining $S_T = (S \overset{*}{-} T) + T$ as the "regular part" of $S$ with respect to $T$, we define the set $S$ to be regular with respect to $T$ iff $S_T = S$. It is easy to prove that III(5) holds as an equality iff $S$ is regular with respect to $T$; indeed, from the regularity it follows that

$$h_{S \overset{*}{-} T}(p) + h_T(p) = h_S(p), \quad \forall p$$

that is

$$h_{S \overset{*}{-} T}(p) = h_S(p) - h_T(p), \quad \forall p \tag{III(6)}$$

and the regularity follows from III(6) for any couple of sets $S$ and $T$ completely described by the support functions.
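The gap between the two sides of III(5) can be seen on a small planar example. The sketch below assumes that $S$ is an axis-aligned box and that $T$ is the convex hull of finitely many points, in which case $S \overset{*}{-} T$ is again a box computed coordinate by coordinate. This restricted setting, the function names and the test sets are illustrative assumptions, not part of the lectures.

```python
from typing import Optional, Sequence, Tuple

Vec = Tuple[float, float]
Box = Tuple[Vec, Vec]  # (lower corner, upper corner)


def support_box(box: Box, p: Vec) -> float:
    """h_S(p) = sup over the box of <p, x>."""
    lo, hi = box
    return sum(max(p[k] * lo[k], p[k] * hi[k]) for k in range(2))


def support_hull(points: Sequence[Vec], p: Vec) -> float:
    """h_T(p) for T = convex hull of the given points."""
    return max(p[0] * t[0] + p[1] * t[1] for t in points)


def box_minus_hull(box: Box, points: Sequence[Vec]) -> Optional[Box]:
    """Geometric difference {z : z + T inside S}; None if it is empty.

    Since S is convex, z + T lies in S iff z + t lies in S for every vertex t.
    """
    lo, hi = box
    new_lo = tuple(lo[k] - min(t[k] for t in points) for k in range(2))
    new_hi = tuple(hi[k] - max(t[k] for t in points) for k in range(2))
    if any(new_lo[k] > new_hi[k] for k in range(2)):
        return None
    return (new_lo, new_hi)


def gap(box: Box, points: Sequence[Vec], p: Vec) -> float:
    """h_S(p) - h_T(p) - h_{S -* T}(p); non-negative by III(5)."""
    diff = box_minus_hull(box, points)
    assert diff is not None
    return support_box(box, p) - support_hull(points, p) - support_box(diff, p)


unit_square: Box = ((0.0, 0.0), (1.0, 1.0))
diagonal = [(0.0, 0.0), (1.0, 1.0)]      # S is not regular with respect to this T
half_edge = [(0.0, 0.0), (0.5, 0.0)]     # S is regular with respect to this T

directions = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
print([gap(unit_square, diagonal, p) for p in directions])
print([gap(unit_square, half_edge, p) for p in directions])
```

For the diagonal segment the difference reduces to the single point $(0, 0)$, so $(S \overset{*}{-} T) + T$ is the segment itself rather than the square, and the gap is strictly positive in the direction $(1, -1)$. For the half edge the gap vanishes in every tested direction, as III(6) requires for a regular pair.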

These last remarks seem an important reason for research work on conditions easy to be expressed on couples of sets (representing, respectively, target sets and reachable regions) in order that the first one be regular with respect to the second one (see, e.g. Ref. [9]).

A case in which the regularity conditions are trivially satisfied is when the sets under consideration are linear subspaces. Sets of this kind have, furthermore, the following absorption property:

$$S \underline{*} T = \begin{cases} S & \text{if } S \supset T \\ \emptyset & \text{if } S \not\supset T \end{cases} \tag{III(7)}$$

An interesting consequence of this property arises for our Problems III,1 and III,2 when $U$ and $W$ are $R^m$ and $R^r$, respectively, and the target set $X_N$ is a linear subspace $M$ of $R^n$. In this case we can define the orthogonal complement $N$ of $M$ in $R^n$: $R^n = N \oplus M$, and the projection operator $\pi$ of $R^n$ onto $N$. We have

$$\pi x \in N, \ \forall x \in R^n; \qquad \pi x = 0, \ \forall x \in M$$

The target set $X_N$ is therefore characterized by

$$M_N = \pi X_N = \pi M = \{0\} \tag{III(8)}$$

Referring to Problem III,1 and using Eq. III(1) we have therefore, for $x_{N-1}$ to be transferable into $M$ in one step,

$$\pi(A x_{N-1} + B u_{N-1} + C w_{N-1}) \in M_N = \{0\}, \quad \forall w_{N-1} \in W$$

that is

$$\pi A X^1_{N-1} = (M_N \underline{*} \pi C W) - \pi B U = M^1_{N-1}$$

and, using Eq. III(1) repeatedly,

$$M^1_{N-i} = \pi A^i X^1_{N-i} = (M^1_{N-i+1} \underline{*} \pi A^{i-1} C W) - \pi A^{i-1} B U \quad (i = 1, \dots, N)$$

which gives an iterative algorithm for the construction of the sets $X^1_{N-i}$ of states which may be transferred into $X_N$ in $i$ steps, until

$$M^1_0 = \pi A^N X^1_0 = (M^1_1 \underline{*} \pi A^{N-1} C W) - \pi A^{N-1} B U$$

which characterizes $X^1_0$. Recalling $U = R^m$ and $W = R^r$, and defining

$$P_i = \pi(\operatorname{span} A^{i-1} B), \qquad Q_i = \pi(\operatorname{span} A^{i-1} C) \tag{III(9)}$$

we have

$$M^1_{N-i} = \pi A^i X^1_{N-i} = (M^1_{N-i+1} \underline{*} Q_i) + P_i \quad (i = 1, \dots, N) \tag{III(10)}$$

that is, using the absorption property of the geometric difference for subspaces,

$$M^1_{N-i} = \pi A^i X^1_{N-i} = \begin{cases} M^1_{N-i+1} + P_i & \text{if } M^1_{N-i+1} \supset Q_i \\ \emptyset & \text{if } M^1_{N-i+1} \not\supset Q_i \end{cases} \tag{III(11)}$$

From III(11) we see that $M^1_{N-i}$ is not empty iff

$$Q_1 = \{0\}; \quad P_1 \supset Q_2, \quad (P_1 + P_2) \supset Q_3, \ \dots, \ \sum_{j=1}^{i-1} P_j \supset Q_i \tag{III(12)}$$

Proceeding in an analogous way for Problem III,2 in this case, we have again III(8) and

$$M^2_{N-i} = \pi A^i X^2_{N-i} = (M^2_{N-i+1} + P_i) \underline{*} Q_i \tag{III(10')}$$


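until

$$M^2_0 = \pi A^N X^2_0 = (M^2_1 + P_N) \underline{*} Q_N$$

Using again the absorption property we have

$$M^2_{N-i} = \begin{cases} M^2_{N-i+1} + P_i & \text{if } M^2_{N-i+1} + P_i \supset Q_i \\ \emptyset & \text{if } M^2_{N-i+1} + P_i \not\supset Q_i \end{cases} \tag{III(11')}$$

From III(11') we see that $M^2_{N-i}$ is not empty iff

$$P_1 \supset Q_1, \quad P_1 + P_2 \supset Q_2, \ \dots, \ \sum_{j=1}^{i} P_j \supset Q_i \tag{III(12')}$$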

As is intuitively comprehensible, condition III(12) for the existence of some $x_0$ which can be brought into $X_N$ in $i$ steps (for example, in $i = N$ steps) is stricter than condition III(12'), which corresponds to an information structure more favourable for the controller. Nevertheless, it is very interesting to notice that when III(12) is satisfied, then

$$M^1_0 = \sum_{j=1}^{N} P_j = M^2_0$$

We may summarize the preceding results in the following

Theorem III,1

The set $X^1_0$, solution of Problem III,1 for $U = R^m$, $W = R^r$ and $X_N$ given by a subspace $M$ of $R^n$, is not empty iff Eqs III(12) are satisfied, and is characterized by

$$\pi A^N X^1_0 = M^1_0$$

where $M^1_0$ is given by the recursive algorithm III(11) starting from $M^1_N = \{0\}$. Analogously, the set $X^2_0$ is not empty iff Eqs III(12') are satisfied, and is characterized by

$$\pi A^N X^2_0 = M^2_0$$

where $M^2_0$ is given by the recursive algorithm III(11') starting from $M^2_N = \{0\}$. Furthermore, if $X^1_0$ is not empty, then $X^1_0 = X^2_0$, with

$$M^1_0 = M^2_0 = \sum_{j=1}^{N} P_j$$
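The two recursions III(11) and III(11') only involve sums and inclusions of subspaces, so they can be sketched with exact rank computations. The code below is an illustration under assumptions that are not part of the text: subspaces are stored as lists of spanning vectors, the target is $M = \{0\}$ (so $\pi$ is the identity), and the matrices form a hypothetical two-dimensional example in which the disturbance enters through the same channel as the control.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def contains(S, T):
    """True iff span(S) contains span(T)."""
    return rank(S + T) == rank(S)

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def columns(X):
    return [list(col) for col in zip(*X)]

def projected_spans(A, B, C, proj, N):
    """Spanning vectors of P_i and Q_i (i = 1..N) as defined in III(9)."""
    P, Q, AB, AC = [], [], B, C
    for _ in range(N):
        P.append(columns(matmul(proj, AB)))
        Q.append(columns(matmul(proj, AC)))
        AB, AC = matmul(A, AB), matmul(A, AC)
    return P, Q

def problem1(P, Q):
    """III(11): M <- M + P_i if M contains Q_i; None stands for the empty set."""
    M = []                                   # the zero subspace M_N = {0}
    for Pi, Qi in zip(P, Q):
        if not contains(M, Qi):
            return None
        M = M + Pi
    return M

def problem2(P, Q):
    """III(11'): M <- M + P_i if M + P_i contains Q_i; None stands for the empty set."""
    M = []
    for Pi, Qi in zip(P, Q):
        if not contains(M + Pi, Qi):
            return None
        M = M + Pi
    return M

# Hypothetical data: x_{k+1} = A x_k + B u_k + C w_k with C = B, target M = {0}.
A = [[1, 1], [0, 1]]
B = [[0], [1]]
identity = [[1, 0], [0, 1]]
P, Q = projected_spans(A, B, B, identity, N=2)
_, Q_zero = projected_spans(A, B, [[0], [0]], identity, N=2)   # no disturbance
```

With $C = B$ the first recursion stops at once because $Q_1 \ne \{0\}$, while the second one succeeds, which is the effect of the more favourable information structure. With no disturbance both recursions return the same subspace $\sum_j P_j$.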

For the particular system

$$x_{k+1} = A x_k + B u_k + w_k \quad (k = 0, 1, \dots) \tag{III(13)}$$

we have the following

Corollary

Considering Problem III,2, if $X^2_0$ is not empty for some $N$, then
(a) any $x_0 \in X^2_0$ can be brought into $M$ in an arbitrary number of steps,
(b) $X^2_0 = R^n$.

Proof

Since III(12') is satisfied with $Q_1 = \pi(\operatorname{span} I) = L$, we have $P_1 \supset Q_1 = L$; on the other hand

$$\sum_{j=1}^{i} P_j \supset \sum_{j=1}^{i-1} P_j \supset \dots \supset P_1$$

therefore, since all $P_i$ are in $L$, $\sum_{j=1}^{i} P_j = L$ for any $i$, that is $M^2_0 = M^2_1 = \dots = M^2_{N-1} = L$. Recalling $M^2_{N-i} = \pi A^i X^2_{N-i}$, part (a) is proved. Part (b) follows from the characterization of $X^2_0$:

$$X^2_0 = \{x : \pi A^N x \in M^2_0\}$$

from the definition of $\pi$ and from $M^2_0 = L$.

We may observe that in our case, since $C = I$,

$$\sum_{j=1}^{i} P_j \supset \dots \supset P_1 \supset Q_1 = \pi R^n$$

that is $\pi \operatorname{span}[B, AB, \dots, A^{N-1} B] \supset \pi R^n$. This means that the system was controllable (in the classical sense), at least in the subspace $L$.

REFERENCES

[1] ANTOSIEWICZ, H., Linear Control Systems, Arch. Rat. Mech. Anal. 12 (1963) 313-24.
[2] DUNFORD, SCHWARZ, J.T., Linear Operators, Part I, Interscience (1964).
[3] MARZOLLO, A., An algorithm for the determination of ε-controllability conditions in the presence of noise with unknown statistics, Automation and Remote Control (February 1972).
[4] WITSENHAUSEN, H., "A Min Max Control Problem for Sampled Linear Systems", IEEE AC (February 1968).
[5] WITSENHAUSEN, H., "Sets of possible states given perturbed observations", IEEE AC (October 1968).
[6] GLOVER, J.D., SCHWEPPE, F.C., "Control of Linear Dynamic Systems with Set-constrained Disturbances", IEEE AC (October 1971).
[7] PONTRYAGIN, L., Linear Differential Games, I, II, Soviet Math. Doklady 8 3 (1967); 4 (1967).
[8] BORGEST, W., VARAIYA, P., "Target Function Approach to Linear Pursuit Problems", IEEE AC (October 1971).
[9] MARZOLLO, A., PASCOLETTI, A., "Computational Procedures and Properties of the Geometrical Difference of Sets", J. Math. Anal. Appl. (to appear).

IAEA-SMR-17/61

EXISTENCE THEORY IN OPTIMAL CONTROL

C. OLECH
Institute of Mathematics, Polish Academy of Sciences, Warsaw, Poland

Abstract

EXISTENCE THEORY IN OPTIMAL CONTROL.
This paper treats the existence problem in two main cases. One case is that of linear systems, where existence is based on closedness or compactness of the reachable set; the other, non-linear case refers to a situation where closedness of the set of admissible solutions is needed for the existence of optimal solutions. Some results from convex analysis are included in the paper.

INTRODUCTION

Perron paradox. A necessary condition for $N$ to be the largest positive integer is that $N = 1$. Indeed, if $N \ne 1$ then $N^2 > N$. So $N$ is not the largest integer, contrary to the definition; thus, $N = 1$.

We have proved the theorem: "If N is the largest integer, then N = 1." There is nothing wrong with this statement. It is correct, except that the assumption is never satisfied. From such an assumption everything follows. This example, though it looks like a joke, has an important implication, i.e. necessary conditions in optimization may be useless if we do not know that the solution we are talking about exists, since, if it does not exist, one may derive a wrong conclusion from a correct necessary condition.

A simple example of such a situation is the following functional:

$$\int_0^1 (1 + \dot{x}^2)^{1/4}\, dt \tag{0.1}$$

if we want to minimize it over all functions of class $C^1$ on $[0, 1]$ satisfying the boundary conditions $x(0) = 0$, $x(1) = 1$. This is a simple problem of the calculus of variations. It is well known that the solution of the problem, if it does exist, has to satisfy the Euler equation, which in this case is simply

$$\dot{x}(1 + \dot{x}^2)^{-3/4} = \text{constant}$$

Therefore, any solution of the Euler equation of class $C^1$ has to be linear ($\dot{x}(t)$ = constant) and the one which satisfies the boundary conditions is $x(t) = t$. The value of the functional (0.1) for $x(t) = t$ is $2^{1/4}$ and it is easy to see that it is not minimal. The infimum in this case is equal to 1 and the optimal solution does not exist. Without checking the existence of the minimum we seek, we may be led to a wrong conclusion from the necessary condition.
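That $x(t) = t$ is not minimal can be seen on a one-parameter family of competitors. The sketch below assumes a particular family chosen for illustration only: $x_\delta$ is zero on $[0, 1-\delta]$ and rises with slope $1/\delta$ on $[1-\delta, 1]$. These functions have a corner, so they are not of class $C^1$; the working assumption is that rounding the corner changes the value of (0.1) by an arbitrarily small amount.

```python
def functional_value(delta):
    """Value of (0.1) for x_delta: slope 0 on [0, 1 - delta], slope 1/delta after."""
    flat_part = 1.0 - delta                            # integrand equals 1 where the slope is 0
    steep_part = delta * (1.0 + delta ** -2) ** 0.25   # integrand is constant on the ramp
    return flat_part + steep_part

linear_value = 2 ** 0.25                  # value of (0.1) at the Euler solution x(t) = t
deltas = (1.0, 0.01, 0.0001)
values = [functional_value(d) for d in deltas]
```

For `delta = 1.0` the family reproduces $x(t) = t$; for small `delta` the value drops below $2^{1/4}$ and approaches 1 without reaching it, since the integrand is never smaller than 1.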

Consider another example. In the set of all functions $x(t)$ of class $C^1$ on $[-1, 1]$ such that

$$x(-1) = x(1) = 0 \quad \text{and} \quad |\dot{x}(t)| \le 1$$

we wish to minimize the functional

$$I(x) = \int_{-1}^{+1} t\,\dot{x}(t)\, dt$$

The solution of this problem as stated does not exist. Indeed, it is clear that the integrand is bounded from below by $-|t|$. Thus $I(x) \ge -1$, with equality iff $\dot{x}(t) = 1$ for $t \le 0$ and $\dot{x}(t) = -1$ for $t \ge 0$, while for any $x(t)$ of class $C^1$ the set $\{t : |\dot{x}(t)| < 1\}$ has positive measure and, therefore, $I(x) > -1$.

The example shows, and this is rather typical, that, to ensure the existence of the minimum, we sometimes have to enlarge or complete the set on which we wish to minimize the functional in question. In fact, the above problem has no solution in the class $C^1$, but it has a solution in the class of absolutely continuous functions.
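A concrete minimizing sequence makes the non-attainment visible. The sketch assumes the smooth slopes $\dot{x}_n(t) = -\tanh(nt)$, an illustrative choice that is not in the text; they are odd, so $x_n(-1) = x_n(1) = 0$ can be arranged, and they satisfy $|\dot{x}_n| < 1$. The functional is evaluated with a plain midpoint rule.

```python
import math

def cost(n, steps=20000):
    """Midpoint-rule value of I(x_n), the integral of t * x_n'(t) over [-1, 1],
    for the assumed slope x_n'(t) = -tanh(n t)."""
    h = 2.0 / steps
    total = 0.0
    for k in range(steps):
        t = -1.0 + (k + 0.5) * h
        total += t * (-math.tanh(n * t)) * h
    return total

costs = [cost(n) for n in (1, 10, 100)]
```

The computed values decrease towards $-1$ as `n` grows but stay above it, in agreement with $I(x) > -1$ for every admissible $x$ of class $C^1$.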

Hilbert said: "Every problem of the calculus of variations has a solution provided the word 'solution' is suitably understood". The meaning of this statement is as follows: If we look for a minimum of a functional and the infimum is finite, then we always have a sequence approaching the infimum, the so-called minimizing sequence. The question of the existence of a minimum reduces to the problem of when we can produce the optimal solution out of the minimizing sequence. Often this is possible if we are able to prove that the minimizing sequence converges. But, even if this is possible, the limit does not always belong to the set on which we originally wished to minimize the functional. The last example is such a case. Indeed, one can prove that if $x_n(t)$ satisfies the boundary condition, is continuously differentiable and such that $I(x_n) \to -1$ as $n \to \infty$, then $x_n(t)$ converges uniformly on $[-1, 1]$ to $x_0(t) = t + 1$ for $t \le 0$ and $-t + 1$ for $t \ge 0$, while

$$\int_{-1}^{+1} |\dot{x}_n(t) - \dot{x}_0(t)|\, dt$$

converges to zero but the $\dot{x}_n(t)$ do not converge uniformly to $\dot{x}_0(t)$. Hence, $x_n$ is not convergent in the $C^1$ space but is convergent in the space AC of absolutely continuous functions. It is, therefore, natural to adjust the notion of "solution" and to treat this problem in the space of absolutely continuous functions rather than in $C^1$. This situation is typical and, for this reason, in control problems the control function is usually assumed to be measurable. Even in the case where the optimal solution is more regular, it is convenient to introduce this enlargement of the space in order to prove that the optimal solution exists. For a similar reason of completion, the Sobolev spaces were invented and became so successful in the theory of partial differential equations (weak solutions).

In this paper, we discuss the existence of optimal solutions first to linear problems and then to a general non-linear problem. In the first case, compactness and convexity of the so-called reachable set is instrumental, while in the second case the closedness of the set of admissible solutions is important. In both cases, convex analysis is used extensively.


In Section 1, an existence theorem for time-optimal linear problems with constant coefficients is stated. Section 2 contains a detailed discussion of the integral of set-valued functions. Also, several propositions concerning convex sets are stated and proved in this section. Section 3, besides a proof of the theorem stated in Section 1, also contains some other applications of the integration of set-valued functions. Among these, we discuss the so-called "bang-bang" principle.

The existence theorem given in Section 4 is a typical example of an extension of the direct method, well known from the calculus of variations, to optimal control problems. The direct method originated by Hilbert was developed by Tonelli, McShane and Nagumo for problems in the calculus of variations. Roughly speaking, it consists in establishing that the minimizing sequence converges (compactness), that the limit is in the set in which we minimize the functional in question (closedness or completeness) and that the value of the functional at the limit is not greater than the infimum (lower semicontinuity). For optimal control problems, it has been extended in the last decade, mainly by Cesari. Theorem 4.1, stated and proved in Section 4, is an example of possible existence theorems along these lines.

The important difference (and perhaps the only one, too) between classical results from the calculus of variations and those concerned with the existence of a solution of optimal control problems lies in the regularity assumptions. To cover problems of interest, one is led to consider problems with regularity assumptions as weak as possible. For this purpose, a theory of measurable set-valued mappings, selector theorems, etc. is also used in an essential way. Mathematical control theory, among others, has given an impetus to the recent development of this theory. These questions will also be briefly discussed in Sections 4 and 5.

In Section 5, we discuss the necessity of the convexity assumption in the existence theorem given in Section 4. Actually, convexity is essential for lower semicontinuity or lower closure, and in Section 5 we give a result (Theorem 5.1), which shows that, in certain special cases, convexity is necessary and sufficient for weak lower semicontinuity of an integral functional. Also, as an example, another result on lower semicontinuity is stated there, but without detailed proof.

It would be impossible to cover the whole subject in such a short paper, which is, therefore, far from complete. The choice of material presented was highly influenced by the author's own contribution to the subject.

A selected list of references is included, and, at the end of the paper, we also give some comments concerning the literature.

The author wishes to acknowledge the help of Miss Barbara Kaskosz in preparing this paper.

1. TIME-OPTIMAL LINEAR CONTROL PROBLEM

Consider the linear control system

$$\dot{x} = Ax + Bu \tag{1.1}$$

where $x \in R^n$, $u \in R^m$, $A$ is an $n \times n$ matrix and $B$ is an $n \times m$ matrix. Let $U \subset R^m$ be a given fixed set. Let $\Omega$ be a set of functions $u$ taking values from $U$, called admissible control functions. In this section, we take as admissible the control functions which are piecewise constant.

A solution of Eq. (1.1) is of the form

$$x(t) = X(t, t_0)\Big(x_0 + \int_{t_0}^{t} X^{-1}(\tau, t_0) B u(\tau)\, d\tau\Big) \tag{1.2}$$

where $X(t, t_0)$ is the so-called fundamental matrix of the homogeneous linear differential equation corresponding to Eq. (1.1). That is, we have

$$\frac{d}{dt} X(t, t_0) = A X(t, t_0) \quad \text{and} \quad X(t_0, t_0) = E \tag{1.3}$$

where $E$ is the identity $n \times n$ matrix. In the case of $A$ constant, $X(t, t_0) = \exp(A(t - t_0))$.

A solution of Eq. (1.1) is unique if the initial value $x_0$ at time $t = t_0$ and the control function $u = u(t)$ are fixed. Thus, we shall use the notation $x(t; t_0, x_0, u)$ for solution (1.2) of Eq. (1.1) to indicate this relation.

Time-optimal control problem. Given an initial state $x_0$ at time $t_0$, we wish to transfer it by means of an admissible control function to a target point $x_1$ and we want to do this in the shortest time possible, i.e. we seek an admissible $u^*$ and a time $t^*$ such that $x(t^*; t_0, x_0, u^*) = x_1$ while $x(t; t_0, x_0, u) \ne x_1$ for each $u \in \Omega$ if $t < t^*$. We shall call $t^*$ and $u^*$ optimal time and optimal control, respectively.

For the above optimal problem, the following existence theorem holds (due to LaSalle):

Theorem 1.1. If $A$, $B$ in Eq. (1.1) are constant, $U$ is a compact polyhedron, $\Omega$ the set of piecewise constant control functions taking values in $U$ and $x_1 = x(t; t_0, x_0, u)$ for some $u \in \Omega$ and $t > t_0$, then the time-optimal solution exists.

In Section 3, we shall present a result which contains this result as a special case. Now, we wish to call the reader's attention to the fact that Theorem 1.1 is, in general, not true if $A$ and $B$ are not constant and $U$ is not a polyhedron. In other words, the class of piecewise constant control functions is, in general, too narrow for the optimum to be attained.

For the existence of a time-optimal solution, the properties of the reachable set are decisive. By the reachable set of a control system we mean the set of all states which can be reached in time $t$ by means of admissible control functions from a fixed initial condition. In the case of Eq. (1.1), it is given by

$$\mathcal{R}(t) = \{x(t; t_0, x_0, u) \mid u \in \Omega\} \tag{1.4}$$

The properties of $\mathcal{R}(t)$ which make the above theorem true are

$$\mathcal{R}(t) \text{ is convex and compact for each } t \tag{1.5}$$

$$\mathcal{R}(t) \text{ is Hausdorff continuous in } t \tag{1.6}$$

The latter means that $h(\mathcal{R}(t), \mathcal{R}(s))$ tends to 0 if $|s - t|$ tends to 0, where

$$h(P, S) = \max\Big(\sup_{p \in P} d(p, S),\ \sup_{s \in S} d(s, P)\Big)$$

is the Hausdorff distance between two sets.

For the existence of an optimal solution only compactness in (1.5) is needed. Convexity is decisive in deriving necessary and sufficient conditions for optimality. In the next section a proof of (1.5) and (1.6) will be given in a more general setting.
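The solution formula with the fundamental matrix can be checked against direct integration of the differential equation. The sketch below assumes a specific system that is not in the text: the double integrator $\dot{x}_1 = x_2$, $\dot{x}_2 = u$ with $U = [-1, 1]$, for which $X(t, 0)$ has rows $(1, t)$ and $(0, 1)$ and $X^{-1}(\tau, 0)B = (-\tau, 1)$. The control, the step counts and the function names are illustrative.

```python
def u(t):
    """A piecewise-constant admissible control: +1 on [0, 1), then -1."""
    return 1.0 if t < 1.0 else -1.0

def by_formula(t_end, x0, steps=2000):
    """x(t) = X(t,0) (x0 + integral of X^{-1}(tau,0) B u(tau) d tau), midpoint rule."""
    h = t_end / steps
    s = [x0[0], x0[1]]
    for k in range(steps):
        tau = (k + 0.5) * h
        s[0] += -tau * u(tau) * h      # first component of X^{-1}(tau,0) B u(tau)
        s[1] += u(tau) * h             # second component
    return (s[0] + t_end * s[1], s[1])  # multiply by X(t_end, 0)

def by_stepping(t_end, x0, steps=2000):
    """Direct step-by-step integration of x1' = x2, x2' = u(t)."""
    h = t_end / steps
    x1, x2 = x0
    for k in range(steps):
        uk = u((k + 0.5) * h)              # control value on this step
        x1 += h * (x2 + 0.5 * h * uk)      # explicit midpoint update, uses the old x2
        x2 += h * uk
    return (x1, x2)

x_formula = by_formula(2.0, (0.0, 0.0))
x_stepped = by_stepping(2.0, (0.0, 0.0))
```

Both computations give the state $(1, 0)$ at time 2 up to rounding: the control accelerates for one time unit and brakes for one, so the point $(1, 0)$ belongs to the reachable set of this example at $t = 2$.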

2. REACHABLE SET AND INTEGRATION OF SET-VALUED FUNCTIONS

It follows from Eqs (1.2) and (1.4) that

$$\mathcal{R}(t) = X(t, t_0)(x_0 + S(t))$$

where

$$S(t) = \Big\{ \int_{t_0}^{t} X^{-1}(\tau, t_0) B u(\tau)\, d\tau \ \Big|\ u(\tau) \in U \Big\}$$

Putting

$$P(t) = \{p \mid p = X^{-1}(t, t_0) B u,\ u \in U\}$$

we have

$$S(t) = \Big\{ \int_{t_0}^{t} v(\tau)\, d\tau \ \Big|\ v(\tau) \in P(\tau) \Big\} \tag{2.1}$$

The above set, if $v$ is an arbitrary but integrable function, is called the integral of the set-valued function $P(t)$. It is clear that properties of $\mathcal{R}(t)$ like closedness, convexity and continuity hold iff the same properties hold for $S(t)$ given by (2.1). Thus, in what follows we shall discuss the basic properties of the integral of the set-valued mapping $P$. For simplicity, we shall consider $P$ to be defined on $[0, 1]$ with values $P(t)$ being subsets of $R^n$.

A point-valued function $v$ from $[0, 1]$ into $R^n$ is a selection of $P$ (or a selector) if $v(t) \in P(t)$. We shall require the latter to hold almost everywhere (a.e.) since we shall deal with measurable or integrable selections of $P$. By $K_P$ we denote the set of all integrable selections of $P$. Thus the integral of $P$ is the set

$$I(P) = \int_0^1 P(\tau)\, d\tau = \Big\{ \int_0^1 v(t)\, dt \ \Big|\ v(t) \in P(t) \text{ a.e. in } [0, 1] \Big\} = \Big\{ \int_0^1 v(t)\, dt \ \Big|\ v \in K_P \Big\} \tag{2.2}$$

The aim of this section is to prove that $I(P)$ is convex and to prove some other properties of $I(P)$. The main results are stated at the end of the section.
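The convexity of the integral does not require the values $P(t)$ to be convex. A minimal illustration, assuming the constant two-point map $P(t) = \{-1, +1\}$ on $[0, 1]$ (an example chosen here, not taken from the text): every number $c$ in $[-1, 1]$ is the integral of a selection that switches once, so the integral is the whole interval although each $P(t)$ consists of two points.

```python
def integrate_selection(breaks, values):
    """Integral over [0, 1] of the selection equal to values[i] on [breaks[i], breaks[i+1])."""
    return sum(v * (b1 - b0) for v, b0, b1 in zip(values, breaks, breaks[1:]))

def selection_hitting(c):
    """A selection of P(t) = {-1, +1} with integral c, for -1 <= c <= 1:
    +1 on [0, s) and -1 on [s, 1], where s = (c + 1) / 2."""
    s = (c + 1.0) / 2.0
    return [0.0, s, 1.0], [1.0, -1.0]

targets = [-1.0, -0.5, 0.0, 0.25, 1.0]
reached = [integrate_selection(*selection_hitting(c)) for c in targets]
```

This only illustrates the statement for one simple map; the general proof is the subject of the section.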


First, we recall some elementary facts concerning the extremal structure of convex sets.

Let $S$ be a convex set. A convex subset $E \subset S$ is called an extremal face of $S$ if for each $s, p \in S$ and any $0 < \lambda < 1$, the condition $\lambda s + (1 - \lambda) p \in E$ implies that both $s$ and $p$ belong to $E$. If an extremal face reduces to a single point $\{e\}$ then $e$ is called an extreme point of $S$. The set of all extreme points of $S$ is called the profile of $S$ and we denote it by ext($S$).

Our nearest goal is to give a characterization of an extremal face through some order relations. By $C$ we denote a convex cone in $R^n$ and we shall write $x \le_C y$ iff $y - x \in C$. Because of the convexity of $C$, this relation is transitive. Below we shall only deal with convex cones which satisfy the relation

$$C \cup (-C) = R^n \tag{2.3}$$

This implies that any two points $x$, $y$ of $R^n$ are comparable, that is either $x \le_C y$ or $y \le_C x$. Both inequalities hold iff $x - y \in C \cap (-C) = M_C$. Clearly, $M_C$ is a subspace of $R^n$. We shall write $y = \max_C S$ iff $y \in S$ and $x \le_C y$ for each $x \in S$. If $S$ is closed and bounded then $\max_C S$ exists and it is unique if $M_C = \{0\}$.

The two following propositions about cones satisfying (2.3) will be useful:

(α) For each convex cone $C$ satisfying (2.3) there is an orthogonal sequence $a_1, \dots, a_k$ such that $M_C = \{x \mid \langle x, a_i \rangle = 0,\ i = 1, \dots, k\}$ and $x \in C$ iff the first non-zero member of the sequence $\{\langle x, a_i \rangle\}$ is positive (or all members vanish, i.e. $x \in M_C$).

Indeed, suppose $z = (x + y)/2 \in M$ where $x, y \in C$ and $M = M_C$. But $-x = -2z + y$ and $-2z \in C$, thus $-x \in C$. Similarly, $-y \in C$. Therefore both $x$ and $y$ belong to $M$. This shows that $C \setminus M$ is convex ($M$ is an extremal face of $C$). Thus $-C$ and $C \setminus M$ are convex and disjoint. We can separate them. Therefore, there is $a_1 \ne 0$ such that $\langle x, a_1 \rangle \ge 0$ for each $x \in C$ and equal to zero if $x \in M$. Putting $C_1 = \{x \in C \mid \langle x, a_1 \rangle = 0\}$, we see that $C_1 \cup (-C_1) = \{x \mid \langle x, a_1 \rangle = 0\}$ and either $C_1 = M$, then $k = 1$ and we are finished, or there is $a_2 \perp a_1$, $a_2 \ne 0$, such that $\langle x, a_2 \rangle \ge 0$ for each $x \in C_1$ and $\langle x, a_2 \rangle = 0$ if $x \in M$. An easy induction argument proves (α).

The dimension of a convex set $S$ is the dimension of the smallest linear manifold containing $S$ (the linear manifold spanned by $S$). We say that a cone $C$ satisfying (2.3) is spanned by $S$ if $S \subset C$ and if for each $C_1$ satisfying (2.3) and containing $S$ the inclusion $C_1 \subset C$ implies that $C_1 = C$. In general, there is more than one cone spanned by a set $S$. We shall now prove the following:

(β) For any set $S$ there exists a cone $C$ satisfying (2.3) and spanned by $S$.

Indeed, let $C_\alpha$, $\alpha \in A$, be a family of cones satisfying (2.3) and linearly ordered by inclusion; $S \subset C_\alpha$ for each $\alpha$. Put $C = \bigcap_\alpha C_\alpha$. Manifestly $S \subset C$. If there were $x \in R^n$ and $\alpha, \beta \in A$ such that $x \notin C_\alpha$, $x \notin -C_\beta$, then $C_\alpha \subset C_\beta$ or $C_\beta \subset C_\alpha$ and $x \notin C_\alpha \cup (-C_\alpha)$ or $x \notin C_\beta \cup (-C_\beta)$. Thus $(-C) \cup C = R^n$. The Kuratowski-Zorn lemma completes the argument.

Notice that if $C$ is spanned by a convex set $S$ then $M_C$ is the linear span of $S_1 = \{x \in S \mid \text{there is } \alpha > 0 \text{ such that } -\alpha x \in S\}$. $S_1$ is the extremal face of $S$ containing zero in its relative interior.

One more notation: If $a \in R^n$ then we put

$$S_a = \{x \in S \mid \langle x, a \rangle = \max_{s \in S} \langle s, a \rangle\}$$

and, inductively,

$$S_{a_1, \dots, a_k} = (S_{a_1, \dots, a_{k-1}})_{a_k}$$

We are now able to state the announced characterization of extremal faces of a convex set.

(γ) The following conditions are equivalent for a subset $E$ of a convex set $S$:

(i) $E$ is an extremal face of $S$.
(ii) There is a convex cone $C$ satisfying (2.3) such that $E = \{y \mid y = \max_C S\}$.
(iii) There is an orthogonal sequence $a_1, \dots, a_k$ such that $E = S_{a_1, \dots, a_k}$, where $n - k \ge \dim E$.

Proof. If (i) holds then take as $C$ a cone spanned by $x_0 - S$ (see item (β)), where $x_0$ is any point from the relative interior of $E$. By definition of $C$, $x_0 = \max_C S$ and since $x_0 \in$ relative int $E$, $E \subset x_0 + M_C$; thus for each $y \in E$, $y = \max_C S$. On the other hand, if $y = \max_C S$ then $y \in x_0 + M_C$ and the interior of the segment $[y, x_0]$ has non-empty intersection with $E$. But $E$ is an extremal face, thus $y \in E$. Therefore (i) implies (ii).

Assume now (ii). Let $a_1, \dots, a_k$ be the orthogonal sequence corresponding by (α) to the cone $C$. Notice that from (ii) it follows that $E = S \cap (x_0 + M_C)$, where $x_0$ is any point from $E$. This together with (α) applied to $x_0 - S$ shows that $E = S_{a_1, \dots, a_k}$. Manifestly $\dim E \le \dim M_C = n - k$. Thus (iii) follows from (ii).

Assume now (iii) and let $z = \lambda x + (1 - \lambda) y \in E = S_{a_1, \dots, a_k}$ while $x, y \in S_{a_1, \dots, a_i}$, $i < k$, and $0 < \lambda < 1$. Then $\langle z, a_j \rangle = \langle x, a_j \rangle = \langle y, a_j \rangle$ for $j = 1, \dots, i$ and $\langle z, a_{i+1} \rangle \ge \max(\langle x, a_{i+1} \rangle, \langle y, a_{i+1} \rangle)$. But since $0 < \lambda < 1$ and $\langle z, a_{i+1} \rangle = \lambda \langle x, a_{i+1} \rangle + (1 - \lambda) \langle y, a_{i+1} \rangle$, the latter inequality implies that $\langle z, a_{i+1} \rangle = \langle x, a_{i+1} \rangle = \langle y, a_{i+1} \rangle$, which means that $x, y \in S_{a_1, \dots, a_{i+1}}$. Thus, the induction argument implies that $x, y \in S_{a_1, \dots, a_k} = E$, hence we have proved that $E$ is an extremal face of $S$. Therefore (iii) ⇒ (i), which completes the proof of (γ).

A special case of interest of (γ) is when $E = \{e\}$ and $e$ is an extreme point of $S$. Then $C$ in (ii) can be chosen so that $C \cap (-C) = \{0\}$. In this case, the order $\le_C$ is called lexicographical. We have the following corollary of (γ):

(γ¹) The following conditions are equivalent for a point $e$ of a convex set $S$:

(i) $e$ is an extreme point of $S$.
(ii) There is a lexicographical order $\le$ in $R^n$ such that $x \le e$ for each $x \in S$.
(iii) There is an orthogonal basis $a_1, \dots, a_n$ in $R^n$ such that $\{e\} = S_{a_1, \dots, a_n}$.
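Condition (iii) is easy to try out on a polygon. The sketch below makes an assumption that goes beyond the text: the convex set is the convex hull of a finite list of points, and the iterated faces are computed on that list only, which suffices because a linear functional attains its maximum over a polytope at some of the listed points. The point list and the bases are illustrative.

```python
def dot(x, a):
    return sum(xi * ai for xi, ai in zip(x, a))

def face(points, a):
    """Listed points of S_a: those maximizing <x, a>."""
    best = max(dot(x, a) for x in points)
    return [x for x in points if dot(x, a) == best]

def iterated_face(points, basis):
    """S_{a_1,...,a_k} = (S_{a_1,...,a_{k-1}})_{a_k}, restricted to the listed points."""
    for a in basis:
        points = face(points, a)
    return points

# A square with its centre and two edge midpoints added (integer data, exact comparisons).
points = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 2), (2, 1)]

top_then_right = iterated_face(points, [(0, 1), (1, 0)])
left_then_down = iterated_face(points, [(-1, 0), (0, -1)])
diagonal_first = iterated_face(points, [(1, 1), (1, -1)])
```

With the basis $(0,1), (1,0)$ the first step keeps the three listed points of the top edge and the second step singles out the corner $(2, 2)$; the other two bases select $(0, 0)$ and $(2, 2)$. The non-extreme points of the list are never selected.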

Example: Consider the set

$$\Delta = \Big\{ x \ \Big|\ \sum_{i=1}^{n} x_i = 1,\ x_i \ge 0 \Big\}$$

i.e. an $(n-1)$-dimensional simplex in $R^n$. It is convex and compact; $e \in \Delta$ is an extreme point of $\Delta$ if at least one co-ordinate of $e$ is equal to one (then the remaining ones have to be equal to zero); extremal faces are subsets of all points in $\Delta$ with certain fixed co-ordinates equal to zero. Notice that each $x \in \Delta$ can be written as a sum

$$x = \sum_{i=1}^{n} x_i e_i, \qquad \sum_{i=1}^{n} x_i = 1$$

where $\{e_1, \dots, e_n\}$ is the profile of $\Delta$.

Let $P$ be a subset of $R^n$. The smallest convex set containing $P$ is called the convex hull of $P$ and it is denoted by co $P$. Correspondingly, the closed convex hull of $P$, denoted by clco $P$, is the smallest closed convex set containing $P$. Since the intersection of convex and/or closed sets is convex and/or closed, both co $P$ and clco $P$ are well defined and clco $P$ is the closure of co $P$.

The basic result concerning convex hulls is the following:

(δ) Carathéodory theorem: If $P \subset R^n$ then

$$\operatorname{co} P = \bigcup_{\{p_0, \dots, p_n\} \subset P} \Big\{ \sum_{i=0}^{n} \lambda_i p_i \ \Big|\ \lambda_i \ge 0,\ \sum_{i=0}^{n} \lambda_i = 1 \Big\}$$

The proof of this theorem can be found in many textbooks. However, for the convenience of the reader we shall also prove it here. Denoting the right-hand side of the last relation by $D$, we clearly see that $D \subset \operatorname{co} P$, thus only convexity of $D$ should be established to prove (δ). Let $x$, $y$ be two points of $D$, that is

$$x = \sum_{i=0}^{n} \lambda_i p_i, \quad y = \sum_{i=0}^{n} \mu_i q_i, \quad \lambda_i \ge 0,\ \mu_i \ge 0,\ \sum_{i=0}^{n} \lambda_i = \sum_{i=0}^{n} \mu_i = 1,\ p_i \in P,\ q_i \in P$$

Let $z = \alpha x + (1 - \alpha) y$, $0 < \alpha < 1$. Clearly, the set

$$B = \Big\{ \sum_{i=0}^{n} \gamma_i p_i + \sum_{i=n+1}^{2n+1} \gamma_i q_{i-n-1} \ \Big|\ \sum_{i=0}^{2n+1} \gamma_i = 1,\ \gamma_i \ge 0 \Big\}$$

is convex and contains $z$. Let

$$A = \Big\{ (\gamma_i) \ \Big|\ \sum_{i=0}^{n} \gamma_i p_i + \sum_{i=n+1}^{2n+1} \gamma_i q_{i-n-1} = z,\ \sum_{i=0}^{2n+1} \gamma_i = 1,\ \gamma_i \ge 0 \Big\}$$

It is clear that $A$ is a convex, closed subset of the $(2n+1)$-dimensional simplex $\Delta_{2n+1}$ and its dimension is at least $n+1$. $A = M \cap \Delta_{2n+1}$ where $M$ is the linear manifold spanned by $A$. Let $e$ be an extreme point of $A$; $e = \{\gamma_i(e)\}_{0 \le i \le 2n+1}$. We claim that at least $n+1$ out of the $2n+2$ co-ordinates of $e$ are zeros. In fact, let $F$ be the smallest extremal face of $\Delta_{2n+1}$ containing $e$. Then $F \cap M = \{e\}$. Therefore, the dimension of $F$ is at most $n$, thus $\gamma_i(e) = 0$ for at least $n+1$ indices $i$. That implies that

$$z = \sum_{i=0}^{n} \gamma_i(e) p_i + \sum_{i=n+1}^{2n+1} \gamma_i(e) q_{i-n-1}$$

is, in fact, a convex combination of $(n+1)$ points from $P$, thus $z \in D$, which completes the proof of (δ).
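The theorem also has a constructive side: a convex combination of more than $n+1$ points can be shortened step by step. The sketch below does not follow the extreme-point argument of the proof above; it uses the usual elimination argument instead (find an affine dependence among the points and move the weights along it until one of them vanishes). Exact rational arithmetic is used, and the function names and the sample data are illustrative assumptions.

```python
from fractions import Fraction

def affine_dependence(points):
    """Non-zero c with sum c_i p_i = 0 and sum c_i = 0 (exists if len(points) > n + 1)."""
    m = len(points)
    # One equation per co-ordinate plus the equation sum c_i = 0.
    rows = [[Fraction(p[k]) for p in points] for k in range(len(points[0]))]
    rows.append([Fraction(1)] * m)
    pivots, r = [], 0
    for c in range(m):
        if r == len(rows):
            break
        piv = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        rows[r] = [x / rows[r][c] for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append((r, c))
        r += 1
    pivot_cols = {c for _, c in pivots}
    free = next(c for c in range(m) if c not in pivot_cols)
    sol = [Fraction(0)] * m
    sol[free] = Fraction(1)                 # set one free unknown to 1, the others to 0
    for i, c in pivots:
        sol[c] = -rows[i][free]
    return sol

def caratheodory(points, weights):
    """Rewrite sum w_i p_i as a convex combination of at most n + 1 of the points."""
    pts = [tuple(p) for p in points]
    w = [Fraction(x) for x in weights]
    n = len(pts[0])
    while len(pts) > n + 1:
        c = affine_dependence(pts)
        # Largest step keeping all weights non-negative; at least one weight becomes zero.
        t = min(wi / ci for wi, ci in zip(w, c) if ci > 0)
        w = [wi - t * ci for wi, ci in zip(w, c)]
        keep = [i for i, wi in enumerate(w) if wi != 0]
        pts, w = [pts[i] for i in keep], [w[i] for i in keep]
    return pts, w

square_and_centre = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2)]
pts, w = caratheodory(square_and_centre, [Fraction(1, 5)] * 5)
z = tuple(sum(wi * p[k] for wi, p in zip(w, pts)) for k in range(2))
```

Here the point $z = (2, 2)$, given as the average of five points of the plane, is rewritten with at most three of them; the weights remain non-negative and sum to one.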

The next results will give an answer to the question of how small a subset of a convex closed set can be in order that the original set can be reconstructed by taking a convex hull. A reasonable result of this sort can only be obtained for closed convex subsets.

(ε) Assume that $S \subset R^n$ is closed, convex and does not contain a line and let $P$ be the subset of $S$ composed of all extreme points and all extremal rays of $S$. Then $S = \operatorname{co} P$. On the other hand, if $S = \operatorname{co} P$ then $P$ contains the profile of $S$ and for each extremal ray $E$ of $S$ the intersection $E \cap P$ is unbounded.

Proof by induction with respect to the dimension of $S$. If $\dim S = 1$ then (ε) clearly holds. Assume that (ε) holds if the dimension of the set in question is smaller than $n$ and assume $\dim S = n$. Each point $x$ of the boundary of $S$ is in co $P$ because it belongs to an extremal face $E$ of $S$ with $\dim E < n$. But both the profile and the extremal rays of $E$ are contained in $P$, thus we may apply (ε). Since $S$ does not contain a line, there is a hyperplane $H$ with the property that $(x + H) \cap S$ is bounded for each $x$. Thus if $x \in \operatorname{int} S$ then it can be represented as a convex combination of points from the boundary of $S$, thus also $x \in \operatorname{co} P$. The second part follows from an easy observation that $S \setminus E$ is convex if $E$ is an extremal face. Thus if $E$ is an extremal ray or an extreme point and $E$ is not equal to $\operatorname{co}(E \cap P)$, where $P \subset S$, then co $P$ is not equal to $S$.

Thus if $\operatorname{co} P = S$ and $E = \{e\}$, $e \in \operatorname{ext}(S)$, then $e \in P$, and if $E$ is an extremal ray then $\operatorname{co}(E \cap P) = E$, which is the case only if $E \cap P$ is unbounded and contains the extreme point of $E$.

As an immediate corollary of (ε) we have the following result.

(ε¹) If $S$ is convex and either compact or closed and without extremal rays, then $S = \operatorname{co}(\operatorname{ext} S)$. If $S = \operatorname{co} P$ then $\operatorname{ext} S \subset P$.

The next result will concern the closed convex hull of arbitrary sets (not necessarily bounded or closed).

(ζ) If $P \subset R^n$ and $S = \operatorname{clco} P$, then the profile ext $S$ of $S$ is contained in cl $P$.

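From the inclusion $P \subset \operatorname{cl} P \subset \operatorname{clco} P = S$ and (δ) we obtain the equality $\operatorname{clco} P = \operatorname{clco}(\operatorname{cl} P)$. Thus for bounded $P$, (ζ) follows from (ε¹). Notice also that if $P$ is compact then $S = \operatorname{co} P$. Suppose now that $P$ is closed and that $e \in \operatorname{ext}(\operatorname{clco} P)$. By (δ) we have

$$e = \lim_{k \to \infty} \sum_{j=0}^{n} \lambda_j^k x_j^k, \quad \text{where } \lambda_j^k \ge 0,\ \sum_{j=0}^{n} \lambda_j^k = 1$$

for each $k$ and $x_j^k \in P$. If $e$ were not in $P$ then there would exist $\varepsilon > 0$ such that $|e - x_j^k| \ge \varepsilon > 0$ for each $j$ and $k$. In this case we could represent $e$ as a limit of

$$\sum_{j=0}^{n} \mu_j^k z_j^k, \quad \text{where } \mu_j^k \ge 0,\ \sum_{j=0}^{n} \mu_j^k = 1,\ z_j^k \in \operatorname{co} P$$

and $|e - z_j^k| = \varepsilon$. Without any loss of generality, we may assume that both $\mu_j^k \to \mu_j$ and $z_j^k \to z_j \in \operatorname{clco} P$. Hence $e = \sum_{j=0}^{n} \mu_j z_j$ and $|e - z_j| = \varepsilon > 0$, which contradicts the assumption that $e$ is an extreme point of $S$. Thus $e \in P$ and (ζ) is proved.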


We begin the study of the properties of the integrals (2.2) with the following:

Lemma 1. Let $S = \operatorname{clco} I(P)$ and let $e \in S$ be an extreme point of $S$. Then to each $\varepsilon > 0$ there is $\delta = \delta(\varepsilon) > 0$ such that if $|I(u) - e| < \delta$ and $|I(v) - e| < \delta$ for any pair $u, v \in K_P$ then

$$\|u - v\| = \int_0^1 |v(t) - u(t)|\, dt \le \varepsilon$$

Proof. Let $\varepsilon > 0$ be arbitrary. Put $\eta = \varepsilon / (4\sqrt{n})$ and let $B(e, \eta)$ be the closed ball centred at $e$ and of radius $\eta$. There is an open halfspace $H$ such that $e \in H \cap S \subset B(e, \eta)$. Indeed, consider the set $Q$ being the intersection of $S$ with the boundary of $B(e, \eta)$. $Q$ is compact, hence also $\operatorname{co} Q \subset B(e, \eta) \cap S$ is compact. Manifestly $e$ does not belong to co $Q$. Thus we can separate $e$ from co $Q$; that is, there is an open halfspace $H$ containing $e$ such that $H \cap (S \setminus B(e, \eta)) = \emptyset$, because $Q$ is equal to $S \cap \partial B(e, \eta)$. Now take $\delta$ small enough that

$$S \cap B(e, \delta) \subset H \cap S \subset B(e, \eta) \tag{2.4}$$

and assume $u, v \in K_P$ are such that $|I(u) - e| < \delta$ and $|I(v) - e| < \delta$. Let $A \subset [0, 1]$ be an arbitrary measurable set and put $w_1 = u + \chi_A (v - u)$ and $w_2 = v + \chi_A (u - v)$, where $\chi_A$ is the characteristic function of the set $A$. Of course both $w_1$ and $w_2$ belong to $K_P$ and we have

$$I(w_1) = I(u) + \int_A (v(\tau) - u(\tau))\, d\tau \quad \text{and} \quad I(w_2) = I(v) - \int_A (v(\tau) - u(\tau))\, d\tau$$

This implies that at least one of those two points belongs to $H \cap S$ and therefore by (2.4)

$$\Big| \int_A (v(\tau) - u(\tau))\, d\tau \Big| \le 2\eta$$

The latter holds for each $A \subset [0, 1]$, therefore we have

$$\int_0^1 |v_i(\tau) - u_i(\tau)|\, d\tau \le 4\eta$$

where $v_i$ is the $i$-th co-ordinate of $v$ and consequently,

$$\|u - v\| = \int_0^1 |v(\tau) - u(\tau)|\, d\tau \le 4\sqrt{n}\,\eta = \varepsilon$$

which completes the proof.

As a consequence of Lemma 1, we shall state two corollaries.

Corollary 1. Assume $P(t)$ is closed for each $t$. Let $C$ be a convex cone such that $C \cup (-C) = R^n$ and let $F = \{y \mid y = \max_C \operatorname{clco} I(P)\}$ be a compact extremal face of clco $I(P)$. Then

(i) The set $K_P^F = \{v \in K_P \mid I(v) \in F\}$ is not empty.
(ii) $F = \operatorname{co} I(K_P^F) = \operatorname{co} \big\{ \int_0^1 v(\tau)\, d\tau \mid v \in K_P^F \big\}$.
(iii) $v \in K_P^F$ iff $v \in K_P$ and $u(t) \le_C v(t)$ a.e. in $[0, 1]$ for each $u \in K_P$.

Proof. Since $F$ is compact there is an extreme point $e$ of clco $I(P)$ belonging to $F$. Thus by (ζ) there is $\{u_i\} \subset K_P$ such that $I(u_i) \to e$ and by Lemma 1 $\{u_i\}$ is convergent in the $L_1$ norm to a function $v$; hence, without any loss of generality, we may assume that $u_i(t) \to v(t)$ pointwise. Thus $v \in K_P$ and of course $I(v) = e \in F$, hence (i) holds and $I(K_P^F)$ contains the profile of $F$, therefore (ii) holds as well.

To prove (iii) let us fix $v \in K_P^F$ and let $u \in K_P$ be arbitrary. Put $w(t) = \max_C(u(t), v(t))$. We have the inequality $v(t) \le_C w(t)$ and therefore $I(v) \le_C I(w)$. Therefore $I(w) \in F$. That means that $w(t) - v(t) \in C$ and

$$\int_0^1 (w(t) - v(t))\, dt \in C \cap (-C)$$

which is the case only if $w(t) - v(t) \in C \cap (-C)$ a.e. in $[0, 1]$. The latter set is a subspace, thus $v(t) - w(t) \in C \cap (-C)$ also. Hence $u(t) \le_C v(t)$ a.e. in $[0, 1]$. On the other hand, if the latter inequality holds for each $u \in K_P$ and $v \in K_P$ then $I(v) = \max_C \{I(u) \mid u \in K_P\}$. By (i) there is $w \in K_P^F$ such that $I(w) \in F$. Then $I(w) \le_C I(v)$ as well as $I(v) \le_C I(w)$. Hence $I(w) - I(v) \in C \cap (-C)$, which implies that $I(v) \in F$ and completes the proof.

The following is a specification of Corollary 1 to the case $F = \{e\}$.

Corollary 2. Under the assumption of Corollary 1, for each extreme point $e$ of clco $I(P)$ and each convex cone $C$ such that $C \cap (-C) = \{0\}$, $C \cup (-C) = R^n$ and $\operatorname{clco} I(P) \subset e - C$ there exists a unique $v \in K_P$ such that $I(v) = e$ and, for each $u \in K_P$, $u(t) \le_C v(t)$ a.e. in $[0, 1]$.

Lemma 1 as well as both corollaries hold if we integrate on $[0, t]$ instead of $[0, 1]$.
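Part (iii) of Corollary 1 says that the selections whose integrals lie on the face $F$ are the pointwise maximal ones. For a halfspace cone $C = \{x \mid \langle x, a \rangle \ge 0\}$, which satisfies $C \cup (-C) = R^n$, this reduces to maximizing $\langle a, v(t) \rangle$ at almost every $t$. The sketch below tries this on an assumed example, not taken from the text: $P(t) = \{b(t)u \mid |u| \le 1\}$ with $b(t) = (-t, 1)$, i.e. the map $X^{-1}(t, 0)B\,U$ of a double integrator with $U = [-1, 1]$. The direction `a`, the random comparison controls and the quadrature are illustrative choices.

```python
import random

def b(t):
    """Assumed example: X^{-1}(t, 0) B for the double integrator."""
    return (-t, 1.0)

def integral(control, steps=1000):
    """Midpoint-rule value of the integral over [0, 1] of b(t) * control(t)."""
    h = 1.0 / steps
    total = [0.0, 0.0]
    for k in range(steps):
        t = (k + 0.5) * h
        bt, ut = b(t), control(t)
        total[0] += bt[0] * ut * h
        total[1] += bt[1] * ut * h
    return total

def maximal_control(a):
    """Pointwise maximizer of <a, b(t) u> over |u| <= 1 (a bang-bang control)."""
    def control(t):
        bt = b(t)
        return 1.0 if a[0] * bt[0] + a[1] * bt[1] >= 0 else -1.0
    return control

def random_control(rng, pieces=10):
    """A piecewise-constant control with random levels in [-1, 1]."""
    levels = [rng.uniform(-1.0, 1.0) for _ in range(pieces)]
    return lambda t: levels[min(int(t * pieces), pieces - 1)]

a = (1.0, 0.5)
rng = random.Random(0)
best = integral(maximal_control(a))
best_value = a[0] * best[0] + a[1] * best[1]
other_values = []
for _ in range(200):
    y = integral(random_control(rng))
    other_values.append(a[0] * y[0] + a[1] * y[1])
```

Here $\langle a, b(t) \rangle = 0.5 - t$, so the maximal selection switches once, at $t = 0.5$, and the value $\langle a, I(v) \rangle$ equals the integral of $|0.5 - t|$, that is 0.25; none of the sampled selections exceeds it.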

Put

$$I_t(P) = \int_0^t P(\tau)\, d\tau = \Big\{ \int_0^t v(\tau)\, d\tau \ \Big|\ v \in K_P \Big\}$$

and

$$F_t = \operatorname{co} \Big\{ \int_0^t v(\tau)\, d\tau \ \Big|\ v \in K_P^F \Big\}$$

We shall prove now the following

Lemma 2. Let $P(t)$ be closed and $F$, $C$ and $K_P^F$ be as in Corollary 1. Then

(i) $F_t$ is Hausdorff continuous in $t$.
(ii) $F_t$ is a compact extremal face of clco $I_t(P)$ and $F_t = \{y \mid y = \max_C \operatorname{clco} I_t(P)\}$.
(iii) $F_t$ can be represented as a sum $\varphi(t) + F_t'$, where $\varphi$ is continuous and $F_t' \subset M_C = C \cap (-C)$ for each $t$.


Proof. From Corollary 1 (iii) it follows that if $u, v \in K_P^C$ then both $u(t) - v(t) \in C$ and $v(t) - u(t) \in C$. Thus if we fix $u_0 \in K_P^C$ then each $w \in K_P^C$ is of the form $w = u_0 + v$ where $v(t) \in M_C = C \cap (-C)$ a.e. in [0, 1]. This proves (iii) of Lemma 2.

Let $\tilde F = \{y \mid y = \max_C \operatorname{clco} I_t(P)\}$. It is clear that if $s_n \in \operatorname{co} I_t(P)$ then

$$s_n + \int_t^1 u_0(\tau)\,d\tau \in \operatorname{co} I(P)$$

and if the first sequence tends to a point of $\tilde F$ then the second tends to a point of F. This shows that $\tilde F$ is compact because F is compact. Therefore, by Corollary 1,

$$\tilde F = \operatorname{co}\left\{\int_0^t v(\tau)\,d\tau \mid u(\tau) \le_C v(\tau),\ 0 \le \tau \le t, \text{ for each } u \in K_P\right\}$$

It is clear that each v in the set above is a truncation of an element of $K_P^C$. Thus,

$$\tilde F = \operatorname{co}\left\{\int_0^t v(\tau)\,d\tau \mid v \in K_P^C\right\} = F_t$$

Hence (ii) is also proved. Part (i) follows from the inequality $|v(t)| \le \lambda(t)$ for each $v \in K_P^C$, where $\lambda \in L_1$. Such a $\lambda$ exists. Indeed, take any $b \in M_C$ different from zero. Apply Corollary 1 to $\tilde C = (C \setminus M_C) \cup \{a \in M_C \mid \langle a, b\rangle \ge 0\}$; then $K_P^{\tilde C} \subset K_P^C$ and $K_P^{\tilde C} = \{v \in K_P^C \mid \langle v(t), b\rangle = \beta(t) \ge \langle u(t), b\rangle \text{ for each } u \in K_P^C\}$. Note that $\beta$ is integrable. This shows that for each $b \ne 0$, $b \in M_C$, $\langle v(t), b\rangle$ is uniformly bounded by an integrable function if $v \in K_P^C$, which together with (iii) proves the existence of a uniform integrable bound for $K_P^C$ and completes the proof of Lemma 2.

Definition: We shall call $v \in K_P$ an extremal element of $K_P$, or an extremal selector of P, if there is a lexicographical order, that is a cone $C$ such that $C \cap (-C) = \{0\}$ and $C \cup (-C) = R^n$, such that for each $u \in K_P$

$$u(t) \le_C v(t) \quad \text{a.e. in } t \tag{2.5}$$

$v \in K_P$ is piecewise extremal if there is a partition $t_0 = 0 < t_1 < \dots < t_k = 1$ of the interval [0, 1] such that v(t) is equal on each of the subintervals $[t_i, t_{i+1}]$ to an extremal element of $K_P$.

Lemma 3. If P(t) is closed and F is a compact extremal face of $\operatorname{clco} I(P)$, then for each $x \in F$ there is a piecewise extremal selector $v_x$ of P such that $x = \int_0^1 v_x(t)\,dt$. Moreover, the number of subintervals on which $v_x$ is extremal can be made not greater than the dimension of F plus one. In particular, $F \subset I(P)$.


Proof. We shall prove Lemma 3 by induction with respect to the dimension of F. If dim F = 0, that is F = {e}, where e is an extreme point of $\operatorname{clco} I(P)$, then by Corollary 2 there is an extremal v such that I(v) = e. Suppose now that Lemma 3 holds for dim F < k and let dim F = k. Let $x_0 \in F$ be arbitrary. Take any extremal selector $v_0$ in $K_P^C$ ($C$ is the cone corresponding to F) and consider the function

$$x(t) = x_0 - \int_t^1 v_0(\tau)\,d\tau \tag{2.6}$$

By Lemma 2 (iii), $x(t) - \varphi(t) \in M_C$ for each t; therefore $x(t) \in \varphi(t) + M_C \supset F_t$. But $x(1) \in F$, and by Lemma 2 (i) $F_t$ is continuous; thus there is $0 \le s \le 1$ such that $x(t) \in F_t$ if $s \le t \le 1$ and, if $s > 0$, then for a sequence $0 \le t_i < s$, $t_i \to s$, we have $x(t_i) \notin F_{t_i}$. Hence x(s) belongs to the relative boundary of $F_s$, thus to an extremal face of dimension m < k. Therefore $x(s) = \int_0^s v_1(\tau)\,d\tau$, where $v_1$ is piecewise extremal with m+1 pieces at the most. Putting $v(t) = v_1(t)$ if $t \le s$ and $v(t) = v_0(t)$ if $t \ge s$, we see that v is piecewise extremal with k+1 pieces at the most and by (2.6)

$$x_0 = \int_0^s v_1(\tau)\,d\tau + \int_s^1 v_0(\tau)\,d\tau = \int_0^1 v(\tau)\,d\tau$$

which was to be proved.

Remark 1. Notice that if $x_0$ is not an extreme point of F then we have a choice of $v_0$ in the proof above and, therefore, such $x_0$ can be reached by at least two different piecewise extremal elements.

From Lemma 2 we have also the following

Corollary 3. The integral I(P) of P is convex.

Indeed, if $v_1, v_2 \in K_P$ then $P_1(t) = \{v_1(t), v_2(t)\} \subset P(t)$ is closed and $\operatorname{clco} I(P_1)$ is compact; thus by Lemma 2, $I(v_i) \in \operatorname{clco} I(P_1) \subset I(P_1) \subset I(P)$, i = 1, 2, which proves convexity of I(P).

Remark 2. Notice the very mild assumption for P(t) in obtaining these results. Actually, what was relevant thus far was the following property of the set $K \subset L_1$:

(P) if $u, v \in K$ and A is measurable then $u\chi_A + v\chi_{[0,1]\setminus A} \in K$.

Manifestly, if $K = K_P$, that is, K is the set of measurable selections of a set-valued map, then property (P) holds. Only this property was needed in the proof of Lemma 1. For the remaining lemmas, we needed the closedness of $K_P$ in the $L_1$-strong topology, which is the case if P(t) is closed for each t.

We are in a position to state the basic results of this section:

Theorem 2.1. If P is a map from [0, 1] into closed subsets of $R^n$ and $\int_0^1 P(t)\,dt$ is bounded, then

$$S(t) = \int_0^t P(\tau)\,d\tau, \quad 0 < t \le 1$$

is convex and compact, and it is Hausdorff continuous in t. Moreover, if $K_P^*$ is the class of piecewise extremal selectors of P(t), then

$$S(t) = \left\{\int_0^t v(\tau)\,d\tau \mid v \in K_P^*\right\} \tag{2.7}$$

Proof. $S = \operatorname{clco} \int_0^1 P(t)\,dt$ is compact; thus Lemmas 2 and 3 applied to S give the Theorem.

Theorem 2.2. (the unbounded case) If P is a map from [0, 1] into closed subsets of $R^n$, then $\int_0^1 P(t)\,dt$ is convex and each compact extremal face of $\operatorname{cl} \int_0^1 P(t)\,dt$ is contained in $\int_0^1 P(t)\,dt$. In particular, if $\operatorname{cl} \int_0^1 P(t)\,dt$ does not contain any extremal ray, then $\int_0^1 P(t)\,dt$ is closed. In the latter case, (2.7) also holds.

Proof. Convexity of $\int_0^1 P(t)\,dt$ is given by Corollary 3. The remaining parts follow from Lemma 3 and proposition (s').

Remark 3. Theorem 2.1 is an extension of the well-known Liapunov theorem concerning the range of a vector-valued measure, stating that the range of a finite and non-atomic vector measure is closed and convex. A special case of this is when the measure is given by $\mu(A) = \int_A f(t)\,dt$, and the general case essentially can be reduced to this case. Without any loss of generality, we may assume that the measure $\mu$ is given on the interval [0, 1]; then we see that

$$\{\mu(A) \mid A \text{ measurable subset of } [0, 1]\} = \int_0^1 P(t)\,dt$$

where P(t) is composed of two points: f(t) and 0. On the other hand, the Liapunov theorem could be used to prove the convexity part of Theorem 2.1 and to some extent also the compactness part. However, the second part of Theorem 2.1 cannot be obtained from the Liapunov theorem. This part of Theorem 2.1 is the reason for the rather detailed analysis of the integral of set-valued functions presented here and perhaps for the rather long proof of Theorem 2.1.
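The two-point case in Remark 3 can be checked numerically on a small instance. The sketch below is only an illustration under assumptions that are not in the text: the density f(t) = (1, 2t) and the set A = [1/4, 3/4] are hypothetical choices, and the integrals are approximated by a midpoint rule. It compares the integral of the "relaxed" selection v(t) = f(t)/2, which takes values in co P(t) but not in P(t) = {0, f(t)}, with the value μ(A) produced by a genuine selection of P(t).

```python
import numpy as np

def integrate(values, dt):
    """Midpoint-rule integral of a sampled vector-valued function."""
    return values.sum(axis=0) * dt

# Midpoints of a uniform grid on [0, 1]; 1/4 and 3/4 fall on cell boundaries.
n = 4000
dt = 1.0 / n
t = (np.arange(n) + 0.5) * dt

# Hypothetical density f(t) = (1, 2t), so that P(t) = {0, f(t)}.
f = np.stack([np.ones_like(t), 2.0 * t], axis=1)

# Relaxed selection: v(t) = f(t)/2 lies in co P(t) but not in P(t).
relaxed_point = integrate(0.5 * f, dt)

# Genuine selection of P(t): f(t) on A = [1/4, 3/4], and 0 elsewhere.
indicator_A = ((t > 0.25) & (t < 0.75)).astype(float)
selection_point = integrate(indicator_A[:, None] * f, dt)

print(relaxed_point, selection_point)
```

Both integrals give the same point of the plane, which is what convexity of the range of the measure predicts for this instance.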

3. SOME APPLICATIONS; EXISTENCE; BANG-BANG PRINCIPLE

As a first application of the integral of a set-valued function we shall give a proof of Theorem 1.1. For the sake of simplicity, we assume that $t_0 = 0$ in this theorem, and we shall write X(t) for X(t; 0). Put

$$P(t) = \{p \mid p = X^{-1}(t)Bu,\ u \in U\} \tag{3.1}$$

$$S(t) = \int_0^t P(\tau)\,d\tau \tag{3.2}$$

$$\mathcal{R}(t) = X(t)\,(x_0 + S(t)) \tag{3.3}$$

It is clear that the reachable set for (1.1) defined by (1.4) is contained in $\mathcal{R}(t)$. By Theorem 2.1, however, we have

$$\mathcal{R}(t) = \left\{X(t)\left(x_0 + \int_0^t v(\tau)\,d\tau\right) \mid v \in K_P \text{ piecewise extremal}\right\} \tag{3.4}$$

By definition, v is an extremal selector of P(t) if there is a lexicographical order $\le_C$ such that (2.5) holds. Take $w(t) = \max_C P(t)$. It is easy to see that $w(t) = \max_C \{X^{-1}(t)Bu_i \mid 1 \le i \le k\}$, where $u_i$ are the vertices of U.

Each of the functions $X^{-1}(t)Bu_i$ is analytic. If we restrict ourselves to only those vertices of U for which $Bu_i \ne Bu_j$ if $i \ne j$, then the function u(t) taking values from $\{u_1, \dots, u_k\}$ and such that $X^{-1}(t)Bu \le_C X^{-1}(t)Bu(t)$ for each t > 0 and $u \in U$ is uniquely defined and piecewise constant. Indeed, notice that $C \cap (-C) = \{0\}$ and, therefore, by (a), there is an orthogonal sequence $a_1, \dots, a_n$ such that $x \in C$ iff the first non-zero member of the sequence $\langle a_i, x\rangle$ is positive. Thus for each t, $u(t) = u_{j(t)}$, $1 \le j(t) \le k$, and

$$\langle a_1, X^{-1}(t)Bu_{j(t)}\rangle = \max\{\langle a_1, X^{-1}(t)Bu_j\rangle \mid 1 \le j \le k\}$$

This condition either defines u(t) uniquely everywhere except for a finite number of values of t, or the maximum is attained at $u_{j_1}, \dots, u_{j_s}$ on an interval. The latter, because of the analyticity of $X^{-1}(t)B$, is only the case where $\langle a_1, X^{-1}(t)Bu_{j_1}\rangle = \langle a_1, X^{-1}(t)Bu_{j_2}\rangle = \dots = \langle a_1, X^{-1}(t)Bu_{j_s}\rangle$ is equal to $\max_j \langle a_1, X^{-1}(t)Bu_j\rangle$ on an interval $[\alpha, \beta]$, $\alpha < \beta$. If this is the case, then

$$\langle a_2, X^{-1}(t)Bu_{j(t)}\rangle = \max_{1 \le k \le s} \langle a_2, X^{-1}(t)Bu_{j_k}\rangle$$

Applying the same argument again, we can prove, by induction, that u(t) is piecewise constant. Therefore, if v(t) is a piecewise extremal selector of P(t) given by (3.1), then there exists a u(t) which is piecewise constant and takes values from the set of vertices of U such that $v(t) = X^{-1}(t)Bu(t)$. Thus, by Theorem 2.1, the set $\mathcal{R}(t)$ defined by (3.3) coincides with the reachable set corresponding to piecewise constant control functions, and $\mathcal{R}(t)$ is convex, compact and continuous in t. The existence of a time-optimal solution follows from the fact that the set
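The lexicographic selection of a vertex described above can be sketched in a few lines. The example is hypothetical and not taken from the text: it assumes the double integrator $\dot x_1 = x_2$, $\dot x_2 = u$ with U = [-1, 1], for which $X^{-1}(t)B = (-t, 1)$, and the orthogonal sequence $a_1 = (1, 1)$, $a_2 = (1, -1)$. The helper names are illustrative only.

```python
import numpy as np

def inv_fundamental_times_B(t):
    """X^{-1}(t) B for the assumed double integrator x1' = x2, x2' = u."""
    return np.array([-t, 1.0])

def extremal_control(t, a_seq, vertices, tol=1e-12):
    """Vertex u maximizing X^{-1}(t) B u in the lexicographic order given by a_seq."""
    candidates = list(vertices)
    g = inv_fundamental_times_B(t)
    for a in a_seq:
        scores = [float(np.dot(a, g * u)) for u in candidates]
        best = max(scores)
        # Keep only the vertices attaining the maximum for this a_i;
        # ties are passed on to the next vector of the sequence.
        candidates = [u for u, s in zip(candidates, scores) if s >= best - tol]
        if len(candidates) == 1:
            break
    return candidates[0]

a_seq = [np.array([1.0, 1.0]), np.array([1.0, -1.0])]  # orthogonal sequence
vertices = [-1.0, 1.0]                                  # vertices of U = [-1, 1]

controls = [extremal_control(t, a_seq, vertices) for t in (0.5, 1.0, 1.5)]
print(controls)
```

In this instance $\langle a_1, X^{-1}(t)Bu\rangle = (1 - t)u$, so the selected vertex switches once, at t = 1, where the tie is resolved by $a_2$; the resulting control is piecewise constant, in line with the argument above.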

$$\{t \mid x_1 \in \mathcal{R}(t)\} \tag{3.5}$$

is closed and contained in $[0, +\infty)$. Indeed, by continuity of $\mathcal{R}$ the set $\{(x, t) \mid x \in \mathcal{R}(t)\} = \operatorname{graph} \mathcal{R}$ is closed. Hence the set (3.5) is closed and there is $t^* = \min\{t \mid x_1 \in \mathcal{R}(t)\}$, which completes the proof of Theorem 1.1.

Notice that continuity was used only through the relation that the graph of $\mathcal{R}$ is closed. The latter property is weaker than continuity and is called upper semicontinuity in the Kuratowski sense of a set-valued function. There is another definition of upper semicontinuity (u.s.c.) of set-valued maps: a set-valued map P into subsets of $R^m$ is u.s.c. if

$$P^{-}G = \{t \mid P(t) \cap G \ne \emptyset\}$$

is closed for each closed subset G of $R^m$.

In the case of P(t) compact and bounded uniformly (i.e. $P(t) \subset P_0$ for each t, $P_0$ bounded), the above two definitions are equivalent.

Notice also that we did not use the convexity of $\mathcal{R}(t)$. This property and continuity allow the conclusion that if $t^*$ is the optimal time then $x_1 \in \partial\mathcal{R}(t^*)$, which in turn leads to necessary conditions for optimal control.

Theorem 1.1 can be easily extended to a more general situation, namely to the case when the control system is of the form

$$\dot x = A(t)x + f(t, u) \tag{3.6}$$

where A(t) is an integrable matrix-valued function and f satisfies the Carathéodory conditions:

(C) f(t, u) is measurable in t for each fixed u, continuous in u for each fixed t.

However, in this case the admissible control functions u will be measurable selections of a given set-valued map U from $[t_0, T]$ into subsets of $R^m$ and such that f(t, u(t)) is integrable. Under those assumptions, for each admissible u the solution of (3.6) satisfying the given initial condition $x(t_0) = x_0$ is uniquely defined and the reachable set is contained in

$$\mathcal{R}(t) = X(t)\left(x_0 + \int_{t_0}^t P(\tau)\,d\tau\right) \tag{3.7}$$

where $P(t) = X^{-1}(t)f(t, U(t))$. To have the opposite inclusion, we need to check that if $v(t) \in P(t)$ and v is integrable then there is an admissible u such that $v(t) = X^{-1}(t)f(t, u(t))$ on the interval in question. A positive answer to this question requires some properties of U. For this purpose we recall the notion of a measurable set-valued mapping.

Definition. We say that a set-valued map P from [0, 1] into subsets of $R^n$ is measurable if the set

$$P^{-}G = \{t \mid P(t) \cap G \ne \emptyset\}$$

is measurable for each closed $G \subset R^n$.

Notice that upper semicontinuity implies measurability.

The following result is often referred to in the literature as the Filippov lemma and gives a condition under which (3.7) holds.

Proposition 3.1. If f satisfies the Carathéodory condition (C) and U(t) is closed and measurable in t, then for any measurable v such that $v(t) \in f(t, U(t))$ there is a measurable selector u of U(t) such that v(t) = f(t, u(t)).

We omit the proof of this proposition and only explain the two main steps in it. Firstly, one proves that the set-valued map $V(t) = \{u \in U(t) \mid f(t, u) = v(t)\}$ is measurable. Secondly, one applies a selection theorem which states that a measurable closed set-valued mapping admits a measurable selection.

The next proposition suggests one way in which a selection theorem can be obtained:

Proposition 3.2. Let P(t) be closed and measurable in t on [0, 1] and let the cone $C$ induce a lexicographical order in $R^n$. Then $v(t, C) = \max_C P(t)$ is finite on a measurable set, on which it is measurable, and $v(t, C)$ is integrable on [0, 1] if and only if $\max_C \operatorname{cl} \int_0^1 P(t)\,dt$ is finite.

This proposition gives both the existence of measurable selectors and a characterization of the extremal elements of $K_P$ if P is measurable. The proof of Proposition 3.2 is by an induction argument. In fact, it is enough to prove that $P_a(t)$ for fixed $a \in R^n$ is measurable if P(t) is measurable.

To state an analogue of Theorem 1.1 for system (3.6) we would like to introduce the notion of an extremal solution of (3.6). Namely, x(t) on $[t_0, t_1]$ is an extremal solution of (3.6) if $x(t_1)$ is an extreme point of the reachable set $\mathcal{R}(t_1)$ corresponding to the initial condition $x(t_0) = x_0$. Now Proposition 3.1 and Theorem 2.1 imply:

Theorem 3.1. If f satisfies assumption (C), f(t, U(t)) and U(t) are closed, U is measurable in t and we assume that for each measurable $u(t) \in U(t)$, f(t, u(t)) is integrable, then:

(a) the reachable set (3.7) for system (3.6) is compact and convex and depends continuously on t;

(b) the problem of passing from $x_0$ at time $t_0$ into a closed subset Z of $[t_0, +\infty) \times R^n$ in a minimal time has an optimal solution $x_*(t)$. Moreover, there is also a time-optimal solution of the above problem which is piecewise extremal. $x_*(t)$ is unique iff $x_*(t_*)$ ($t_*$ the optimal time) is an extreme point of $\mathcal{R}(t_*)$.

Notice the following characterization of extremal solutions: x(t) is an extremal solution of (3.6) on the interval [a, b] iff for each $t_1, t_2 \in [a, b]$, $t_1 < t_2$, and any other solution $\tilde x(t)$ of (3.6) the following condition holds:

$$\tilde x(t_i) = x(t_i),\ i = 1, 2, \text{ implies that } \tilde x(t) = x(t) \text{ for } t \in [t_1, t_2]$$

For system (1.1) and $U = \{(u_1, \dots, u_m) \mid |u_i| \le 1,\ i = 1, \dots, m\}$ the time-optimal control function $u^*$ has the property that $|u_i^*(t)| = 1$ for each t. This is so because the extremal control functions have that property. Thus the equality of the two reachable sets, which we have proved, holds also if for the admissible controls Ω we take all piecewise constant u(t) such that $|u_i(t)| = 1$ for each i. This can, in other words, be expressed as follows: each point which can be reached in time t from the initial state $x_0$ at time $t_0$ by means of a measurable control function $(u_1(t), \dots, u_m(t))$, $|u_i(t)| \le 1$ a.e. in $[t_0, t]$, can also be reached, in the same time, by a solution corresponding to a control function u(t) whose co-ordinates only take on the values +1 or -1. This phenomenon of jumping from one extremity to another has been given the colorful name of the "bang-bang" principle. A generalization of this principle follows from Theorem 2.1 and will be stated in the next theorem. Before that, we introduce the following property (compare with property (P)): we shall say that a class K of functions of a real variable is closed with respect to the "piecewise" operation if

(P*) for each $u_1, \dots, u_k \in K$ and $t_0 = 0 < t_1 < \dots < t_k = 1$ the function

$$u = \sum_{i=1}^{k} \chi_{[t_{i-1}, t_i)}\, u_i \in K$$
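As a concrete instance of the bang-bang statement above, the following sketch uses a hypothetical example that is not in the text: the double integrator $\dot x_1 = x_2$, $\dot x_2 = u$, $|u| \le 1$, started at the origin. The interior control u ≡ 0 on [0, 1] and a control taking only the values +1 and -1 are integrated exactly, piece by piece, and reach the same point at time 1.

```python
def simulate(pieces, x0=(0.0, 0.0)):
    """Integrate x1' = x2, x2' = u exactly for a piecewise constant control.

    `pieces` is a list of (duration, u) pairs applied in order.
    """
    x1, x2 = x0
    for duration, u in pieces:
        # Exact update over an interval on which u is constant.
        x1 += x2 * duration + 0.5 * u * duration ** 2
        x2 += u * duration
    return x1, x2

# Interior (non-extremal) control: u = 0 on [0, 1].
interior = simulate([(1.0, 0.0)])

# Bang-bang control: +1 on [0, 1/4), -1 on [1/4, 3/4), +1 on [3/4, 1].
bang_bang = simulate([(0.25, 1.0), (0.5, -1.0), (0.25, 1.0)])

print(interior, bang_bang)
```

The switching times 1/4 and 3/4 were chosen by hand for this example; the theorem asserts only that some such piecewise extremal control exists.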

Theorem 3.2. Under the assumptions of Theorem 3.1, any point which can be reached from a fixed initial condition in a fixed time by an admissible solution of (3.6) can also be reached, and in the same time, by a piecewise extremal solution of (3.6). Moreover, if K denotes the class of all admissible solutions of (3.6) with fixed initial condition $(t_0, x_0)$, $K^*$ is the class of all piecewise extremal solutions of (3.6), and $\tilde K \subset K$ has the property (P*) and for each $t_1 > t_0$ and any $x \in K$ there is $\tilde x \in \tilde K$ such that $\tilde x(t_1) = x(t_1)$, then $K^* \subset \tilde K$.

Proof. The first part follows from Theorem 2.1. To prove the second part, we notice that any subset $\tilde K \subset K$ with the property that for each $x \in K$ there is $\tilde x \in \tilde K$ such that $\tilde x(t_1) = x(t_1)$ has to contain an extremal solution of (3.6) because, if $e = x(t_1)$ is an extreme point of the reachable set $\mathcal{R}(t_1)$, then x(t) is the only solution of (3.6) by which e can be reached at time $t_1$. This, together with (P*), implies $K^* \subset \tilde K$.

Theorem 3.2 is, in a sense, the best version possible of the "bang-bang" principle. Notice that uniqueness of admissible solutions leading to extreme points of $\mathcal{R}(t)$ does not necessarily imply uniqueness of the corresponding control functions. Notice also that extremal control functions u(t) are such that f(t, u(t)) takes values from the profile of $\operatorname{clco} f(t, U(t))$, similarly to the linear case discussed above, where extremal (and also optimal, in some cases) control functions take on values from the set of vertices of the cube U, thus from the profile of U. However, even in this simple case it may happen that to an extremal solution there correspond two different control functions.

As another application of Theorems 2.1 and 2.2, let us consider the following more general problem: Suppose that, in the class of admissible solutions of (3.6) with fixed initial condition, we wish to find a solution such that $(t_1, x(t_1; x_0, t_0, u)) \in Z$ for some $t_1 > t_0$ and that the functional

$$\int_{t_0}^{t_1} \left(\langle a^0(t), x(t)\rangle + f^0(t, u(t))\right) dt + \varphi(t_1, x(t_1))$$

attains its minimum. This is a Bolza type of problem. Consider the reachable set $\mathcal{R}(t)$ for the extended system

$$\dot y = \tilde A(t)\,y + F(t, u), \quad y(t_0) = (x_0, 0)$$

where $y = (x, x^0) \in R^{n+1}$, $\tilde A(t)$ is the (n+1) x (n+1) matrix of the form

$$\tilde A(t) = \begin{pmatrix} A(t) & 0 \\ a^0(t) & 0 \end{pmatrix} \tag{3.8}$$

and $F(t, u) = (f(t, u), f^0(t, u))$.

Under similar assumptions as in Theorem 3.1, we can prove that the reachable set is compact and continuous. The problem reduces to finding a minimum of the function $\Phi(t, y) = \varphi(t, x) + x^0$ on the intersection of the graph of the reachable set $\mathcal{R}(t)$ with Z, and one sees that a solution exists if $\varphi$ is such that $\varphi(t, x) \to +\infty$ if $t \to +\infty$, or if we, in addition, require in the problem that $t_1$ be bounded from above by a constant. We leave to the reader the exact statement of an appropriate theorem and its proof.

So far, we considered the bounded case, i.e. the case where the reachable set of the corresponding integral of a set-valued function is bounded and, thus, compact. This is not the case in the example mentioned in the introduction, i.e. the problem: minimize $\int_0^1 (1 + \dot x^2)^{1/4}\,dt$ in the class of all absolutely continuous functions satisfying the boundary condition x(0) = 0, x(1) = 1. This problem is also a special case of the above control problem with A = 0, $a^0 = 0$, f(t, u) = u and $f^0(t, u) = (1 + u^2)^{1/4}$. The reachable set is simply the integral

$$\mathcal{R}(t) = \int_0^t P(\tau)\,d\tau$$

where P(t) = const = graph $(1 + u^2)^{1/4}$. This integral is unbounded and not closed. In fact, one can check that $\int_0^t P(\tau)\,d\tau = t \operatorname{co} P$ if P is constant and, therefore, in our case $\mathcal{R}(t)$ is the halfspace $x^0 > t$ with one point of the boundary, (0, t), included. Thus, the solution of the problem exists only if the end condition is the same as the initial condition.
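The failure of existence in this example can be made tangible with a short computation. The sketch below evaluates the cost $\int_0^1 (1 + \dot x^2)^{1/4}\,dt$ on a family of admissible paths of our own choosing (an assumption, not the author's construction): x rises with slope N on [0, 1/N] and stays at 1 afterwards. The costs decrease towards 1 but always stay above it.

```python
def cost(slope):
    """Cost of the path with x' = slope on [0, 1/slope] and x' = 0 afterwards."""
    steep_time = 1.0 / slope
    # Integrand equals (1 + slope^2)^(1/4) on the steep part and 1 on the flat part.
    return steep_time * (1.0 + slope ** 2) ** 0.25 + (1.0 - steep_time) * 1.0

costs = [cost(n) for n in (10.0, 100.0, 1e4, 1e6)]
print(costs)
```

The value 1 itself would require $\dot x = 0$ almost everywhere, which is incompatible with x(0) = 0, x(1) = 1; this matches the description of $\mathcal{R}(t)$ as an open halfspace plus a single boundary point.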

Let us dwell a little more on this kind of problem, i.e. the problem of minimizing the functional

$$\int_0^1 f(t, \dot x(t))\,dt \tag{3.9}$$

in the class of absolutely continuous functions on [0, 1] satisfying the boundary condition x(0) = a, x(1) = b. This is a special case of the classical Lagrange problem from the calculus of variations (the integrand does not depend on x(t)). Put

$$P(t) = \operatorname{graph} f(t, \cdot) = \{(x, y) \mid y = f(t, x),\ x \in R^n\}$$

and

$$Q(t) = \operatorname{epigraph} f(t, \cdot) = \{(x, y) \mid y \ge f(t, x),\ x \in R^n\}$$

As to f, we make the assumption that both P(t) and Q(t) are closed and measurable in t. The solution of the problem exists if and only if the intersection of $\int_0^1 Q(t)\,dt$ with the line $\{(x, y) \mid x = b,\ y \text{ arbitrary}\}$ is closed from below. A sufficient condition for the existence of a solution of this problem for any finite value b can be deduced from Theorem 2.2. A sufficient condition is that either the closure of the integral does not have proper unbounded extremal faces or that it does not contain a ray different from that parallel to the positive y-axis. Indeed, consider the intersection

$$\left(\operatorname{cl} \int_0^1 Q(t)\,dt\right) \cap \{(x, y) \mid x = b\} = \{(x, y) \mid x = b,\ y \ge \beta\}$$

(only the case of finite $\beta$ is of interest). The optimal solution exists iff $(b, \beta) \in \int_0^1 Q(t)\,dt$. But $(b, \beta)$ belongs to an extremal face F of $\operatorname{cl} \int_0^1 Q(t)\,dt$ which is transversal to the direction of the y-axis. Thus if F were not contained in $\int_0^1 Q(t)\,dt$, then F as well as $\operatorname{cl} \int_0^1 Q(t)\,dt$ would contain an extremal ray not parallel to the positive y-axis, which contradicts the assumption.

For both of these cases, we shall give analytic conditions in terms of the so-called conjugate function to f.

First, however, we shall discuss the support function of a convex set.

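If $S \subset R^n$, then

$$\varphi_S(p) = \sup_{s \in S} \langle p, s\rangle$$

is called the support function of S. Note that $\varphi_S$ has the following properties:

(i) (convexity) $\varphi_S(\lambda p_1 + (1 - \lambda)p_2) \le \lambda\varphi_S(p_1) + (1 - \lambda)\varphi_S(p_2)$ if $\lambda \in [0, 1]$

(ii) (homogeneity) $\varphi_S(\lambda p) = \lambda\varphi_S(p)$ if $\lambda > 0$

(iii) (l.s.c.) $\liminf_{p \to p_0} \varphi_S(p) \ge \varphi_S(p_0)$

Denote by H(p, a) the half space $\{s \mid \langle s, p\rangle \le a\}$. $H(p, \varphi_S(p))$ is called the support halfspace of S and we have:

$$\bigcap_p H(p, \varphi_S(p)) = \operatorname{clco} S$$

On the other hand, if $\varphi(p)$ satisfies (i), (ii) and (iii) then $\varphi = \varphi_S$ where $S = \bigcap_p H(p, \varphi(p))$.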
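For a finite set S the support function is a maximum of finitely many linear forms, which makes the properties above easy to test numerically. The sketch below is an illustration only: the set S (the vertices of the unit square), the sampled directions and the two test points are hypothetical choices, and a finite sample of directions p can of course only approximate the intersection of all support halfspaces.

```python
import numpy as np

def support(points, p):
    """Support function of a finite set S: max over s in S of <p, s>."""
    return float(np.max(points @ p))

# Hypothetical set S: the four vertices of the unit square.
S = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

rng = np.random.default_rng(0)
directions = rng.normal(size=(200, 2))

def in_all_halfspaces(x):
    """Check <x, p> <= phi_S(p) for every sampled direction p."""
    return all(np.dot(x, p) <= support(S, p) + 1e-12 for p in directions)

inside = in_all_halfspaces(np.array([0.5, 0.5]))    # a point of co S
outside = in_all_halfspaces(np.array([1.5, 0.5]))   # a point outside co S

# Spot checks of convexity (i) and positive homogeneity (ii).
p1, p2, lam = directions[0], directions[1], 0.3
convex_ok = (support(S, lam * p1 + (1 - lam) * p2)
             <= lam * support(S, p1) + (1 - lam) * support(S, p2) + 1e-12)
homogeneous_ok = abs(support(S, 2.5 * p1) - 2.5 * support(S, p1)) < 1e-12

print(inside, outside, convex_ok, homogeneous_ok)
```

The point (0.5, 0.5) passes every halfspace test, while (1.5, 0.5) is cut off by some sampled support halfspace, consistent with the representation of clco S as an intersection of support halfspaces.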

An important property of the integral $I(P) = \int_0^1 P(t)\,dt$ is the following relation:

$$\varphi_{I(P)}(p) = \int_0^1 \varphi_{P(t)}(p)\,dt \tag{3.10}$$

This is a consequence of Lemma 1 of the previous section. In the case when we integrate the epigraph Q(t), the integral $S = \int_0^1 Q(t)\,dt$ has the property that if $(x, y) \in S$ then $(x, z) \in S$ for each z > y. But S is convex, thus for each x we can define

$$g(x) = \inf\{y \mid (x, y) \in S\}$$


if the latter set is not empty and $+\infty$ otherwise. One can check that g(x) is convex and, if Q is closed, then g is l.s.c. In this case, the support function $\varphi_S(p)$ of S is $+\infty$ if the last co-ordinate of p is positive and

$$\varphi_S(p, -1) = \sup_x \left(-g(x) + \langle p, x\rangle\right) = g^*(p)$$

where $g^*$ is the so-called conjugate function of g. It is clear that $g^*$ is convex and l.s.c. We notice also that $g^{**} = \operatorname{cl} g$, where cl g denotes the so-called closure of g, that is the largest convex and l.s.c. function bounded from above by g. From (3.10), we have

$$g^*(p) = \int_0^1 f^*(t, p)\,dt \tag{3.11}$$

where $f^*$ is the conjugate function to $f(t, \cdot)$.

By the effective domain of $g^*$ we mean the set

$$A = \{p \mid g^*(p) < +\infty\}$$

The set A is convex and we have the following proposition:

(η) Let Q be the epigraph of a convex and l.s.c. function $g: R^n \to R \cup \{+\infty\}$. Then p belongs to the interior of the effective domain of the conjugate function $g^*$ iff the set

$$G(p) = \{(x, g(x)) \mid -g(x) + \langle x, p\rangle = g^*(p)\}$$

is non-empty and compact.

Proof. G(p) is convex and closed because g is l.s.c. If G(p) is unbounded then it contains a half-line, i.e. there exist $x_0, y_0 \in R^n$, $y_0 \ne 0$, such that

$$-g(x_0 + \lambda y_0) + \langle x_0 + \lambda y_0, p\rangle = g^*(p) \quad \text{for each } \lambda \ge 0$$

It is clear that for q such that $\langle q, y_0\rangle > 0$ we have

$$\sup_{\lambda \ge 0} \left(-g(x_0 + \lambda y_0) + \langle x_0 + \lambda y_0, p + \varepsilon q\rangle\right) = +\infty$$

if only $\varepsilon > 0$. Thus $g^*(p + \varepsilon q) = +\infty$ if $\varepsilon > 0$, hence p does not belong to the interior of A. On the other hand, suppose that $G(p_0)$ is compact for some fixed $p_0$ and let r > 0 be such that

$$F = \{(x, y) \mid |x|^2 + |y|^2 = r,\ -y + \langle x, p_0\rangle = g^*(p_0)\}$$

is disjoint from $G(p_0)$ as well as from Q. Let $\varepsilon = \min_{z \in F} d(z, Q)$ and let $z_0 = (x_0, g(x_0))$ be a point from $G(p_0)$. There is $\delta > 0$ such that $d(z, D_p) \le \varepsilon/2$ for each $z \in F$ and any p such that $|p - p_0| < \delta$, where $D_p = \{(x, y) \mid -y + \langle x, p\rangle = -g(x_0) + \langle p, x_0\rangle\}$. If $D_p \cap Q$ were unbounded, then because of the convexity of $D_p \cap Q$ we would have a contradiction with the definition of r in F. Thus, for each p such that $|p - p_0| < \delta$, the set $S = \{(q, r) \in Q \mid -r + \langle q, p\rangle \ge -g(x_0) + \langle p, x_0\rangle\}$ is compact, and since

$$G(p) = \left\{(x, y) \in S \mid -y + \langle x, p\rangle = \max_{(q, r) \in S} \left(-r + \langle q, p\rangle\right)\right\}$$

therefore G(p) is not empty and compact.

We may now summarize what we have proved in the following

Theorem 3.3. Suppose that $f: [0, 1] \times R^n \to R \cup \{+\infty\}$ is measurable in t and lower semicontinuous in u. If the set

$$\{p \mid f^*(t, p) \text{ is integrable}\}$$

is open, then the integral $\int_0^1 P(t)\,dt$ is closed, where

$$P(t) = \operatorname{epigraph} f(t, \cdot)$$

In particular, there exists $\min \int_0^1 f(t, \dot x(t))\,dt$ over all absolutely continuous $x: [0, 1] \to R^n$ such that x(0) = a, x(1) = b.

A particular case of this situation is that where $f^*(t, p)$ is integrable for each p. In this case, we say that f satisfies the growth condition. Notice that this assumption means that

$$-f(t, u) + \langle p, u\rangle \le \psi_p(t) \tag{3.12}$$

where $\psi_p \in L_1$, or equivalently that to each $\varepsilon > 0$ there is an integrable function $\psi_\varepsilon(t)$ such that

$$\varepsilon f(t, u) \ge |u| + \psi_\varepsilon(t) \tag{3.13}$$

From (3.13) it follows that, roughly speaking, f grows, with respect to u, faster than any linear function. A special case of the growth condition is that where $f(t, u) \ge \Phi(|u|)$ and $\Phi(s)/s \to +\infty$ as $s \to \infty$. This corresponds to (3.12) or (3.13) with the function $\psi_p$ or $\psi_\varepsilon$ bounded or simply constant. Thus, it is a uniform growth condition. If the growth condition holds, then the only ray contained in the integral $\int_0^1 P(t)\,dt$ is that parallel to the positive y-axis. This is the direction in which all P(t) are unbounded.
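Two scalar integrands, chosen here only for illustration and independent of t, show what the condition on $f^*$ distinguishes. For the quadratic integrand the conjugate is finite everywhere, so the growth condition holds; for the integrand of the non-existence example the conjugate is finite only at p = 0, so the set of p for which $f^*$ is integrable is not open.

```latex
\begin{align*}
f(u) &= u^2: &
f^*(p) &= \sup_u \,(pu - u^2) = \frac{p^2}{4} < +\infty \quad \text{for every } p, \\
& & \varepsilon u^2 &\ge |u| - \frac{1}{4\varepsilon}
   \quad \text{since } \Bigl(\sqrt{\varepsilon}\,|u| - \frac{1}{2\sqrt{\varepsilon}}\Bigr)^2 \ge 0, \\[4pt]
f(u) &= (1+u^2)^{1/4}: &
f^*(0) &= \sup_u \bigl(-(1+u^2)^{1/4}\bigr) = -1, \\
& & f^*(p) &= \sup_u \bigl(pu - (1+u^2)^{1/4}\bigr) = +\infty \quad \text{for } p \ne 0 .
\end{align*}
```

In the first case (3.13) holds with the constant $\psi_\varepsilon = -1/(4\varepsilon)$, which is the uniform growth condition with $\Phi(s) = s^2$. In the second case f grows like $|u|^{1/2}$, slower than any non-zero linear function, and the effective domain of $f^*$ reduces to the single point {0}.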

The last theorem can be generalized to a more complicated situation when we wish to minimize

$$I(x, u) = \int_0^1 \left(\langle a^0(t), x(t)\rangle + f^0(t, u(t))\right) dt$$

under the conditions

$$\dot x = A(t)x + f(t, u), \quad u \in U(t)$$

$$x(0) = a, \quad x(1) = b$$

The solution of this problem exists if f and $f^0$ satisfy a Carathéodory condition, i.e. f, $f^0$ are measurable in t and continuous in u, U(t) is closed and measurable in t, F(t, U(t)) is closed and for each $\varepsilon > 0$ there is $\psi_\varepsilon \in L_1$ such that

$$\varepsilon f^0(t, u) \ge |f(t, u)| + \psi_\varepsilon(t) \quad \text{if } u \in U(t)$$

F, as before, stands for $(f, f^0)$.

4. A NON-LINEAR OPTIMAL PROBLEM

In contrast to the case considered so far, for a non-linear control system the reachable set may fail to be closed, even if the set of admissible values of the control parameter is compact. Let us consider the following example:

$$\dot x = 1/(1 + y^2), \quad \dot y = u/(1 + y^2), \quad u = 1 \text{ or } u = -1 \tag{4.1}$$

Notice that dy/dx = u, and it is clear what the trajectories of Eq. (4.1) look like. However, the reachable set of Eq. (4.1) is not closed and the time-optimal problem of passing from (0, 0) to (1, 0) has no solution. Indeed, the infimum is 1 because for each $\varepsilon > 0$ there is a solution of Eq. (4.1) such that $|y(t)| \le \varepsilon$, $x(t_\varepsilon) = 1$, $y(t_\varepsilon) = 0$. But we have:

$$\int_0^{t_\varepsilon} \dot x(t)\,dt = 1 = \int_0^{t_\varepsilon} \frac{dt}{1 + y^2} \ge \frac{t_\varepsilon}{1 + \varepsilon^2}, \quad \text{thus } t_\varepsilon \le 1 + \varepsilon^2$$

On the other hand, x travels with a speed less than or equal to one, thus the optimal time is bounded from below by 1. Therefore, the optimal time, if it exists, has to be equal to one. But that would be possible if and only if $\dot x(t) = 1$, hence if y(t) = 0, which is impossible.

Another example of the sort is the following Lagrange problem: Minimize

$$I(x) = \int_0^1 (1 + x^2)\left(|\dot x^2 - 1| + 1\right) dt \tag{4.2}$$

over the class of all absolutely continuous functions satisfying the boundary condition:

$$x(0) = 0, \quad x(1) = 0 \tag{4.3}$$

The infimum of I(x) over x satisfying (4.3) is 1. Indeed, taking a polygonal line $x_n$ with slope +1 or -1 we see that

$$I(x_n) = \int_0^1 (1 + x_n^2(t))\,dt$$

and it is clear that we can do it so that $|x_n| \le \varepsilon$ for arbitrary $\varepsilon > 0$. On the other hand $I(x) \ge 1$ for each x. But the value 1 is not attained because that would be possible only if $(1 + x^2)(|1 - \dot x^2| + 1) = 1$, thus only if x(t) = 0 and $\dot x^2(t) = 1$. A contradiction.
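Both examples lead to the same computation, which the following sketch carries out numerically. For (4.1), dy/dx = u = ±1 and dt = (1 + y²) dx, so the travel time along a saw-tooth path y(x) is $\int_0^1 (1 + y(x)^2)\,dx$; for (4.2), a polygonal line with slopes ±1 gives $I(x_n) = \int_0^1 (1 + x_n^2)\,dt$. The saw-tooth with k equal teeth and the midpoint-rule integration are our own illustrative choices, not part of the text.

```python
import numpy as np

def sawtooth(x, teeth):
    """Polygonal line with slopes +1/-1, `teeth` equal teeth on [0, 1], zero at both ends."""
    period = 1.0 / teeth
    phase = np.mod(x, period)
    return np.minimum(phase, period - phase)

def one_plus_mean_square(teeth, cells_per_tooth=2000):
    """Midpoint-rule value of the integral of 1 + y(x)^2 over [0, 1]."""
    n = teeth * cells_per_tooth
    x = (np.arange(n) + 0.5) / n
    return float(np.mean(1.0 + sawtooth(x, teeth) ** 2))

values = [one_plus_mean_square(k) for k in (1, 2, 4, 8)]
print(values)
```

For k teeth the peak height is 1/(2k) and the exact value is 1 + 1/(12k²), so the values decrease to the infimum 1 without reaching it, while the paths converge uniformly to the zero function.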

In both those cases, the lack of convexity is the reason for the nonexistence of an optimal solution. If, in the first case, we convexify the right-hand side, i.e. we take the convex hull of the set $\{(1/(1 + y^2),\ u/(1 + y^2)) \mid |u| = 1\}$, which comes to allowing u to satisfy $|u| \le 1$, or, in the second case, replace $f(x, u) = (1 + x^2)(|1 - u^2| + 1)$ by $\operatorname{co} f(x, u) = (1 + x^2)(\max(0, u^2 - 1) + 1)$, which is the largest function convex in u and bounded from above by f, then the optimal solution would exist.

Notice that in both cases we could define sequences $x_n(t)$ of admissible trajectories such that the value of the cost function on $x_n$ tends to the infimum. Such a sequence is called a minimizing sequence. In both cases, the minimizing sequence is uniformly convergent, but in the first case the limit function is not an admissible trajectory and in the second the functional evaluated at the limit is greater than the infimum. In the following theorem we give a condition guaranteeing that neither of the above effects holds.

Problem 1. Minimize the functional

$$I(x, u) = \int_0^{t_1} f^0(t, x(t), u(t))\,dt \tag{4.4}$$

over the class Ω of admissible pairs (x, u) such that x is an absolutely continuous function, u is measurable and

$$\dot x(t) = f(t, x(t), u(t)) \text{ a.e. in } [0, t_1], \quad x(0) = a, \quad x(t_1) \in \Phi(t_1), \quad u(t) \in U(t) \tag{4.5}$$

where U is a given set-valued function, Φ is a continuous function from [0, T] into $R^n$, and $t_1$ is not fixed but $t_1 \le T$.

Theorem 4.1. (existence theorem) Assume that $f^0$, f are continuous in x, u for fixed t, measurable in t for x, u fixed; U(t) is closed and measurable in t; the set

$$Q(t, x) = \{(q, r) \mid q = f(t, x, u),\ r \ge f^0(t, x, u),\ u \in U(t)\} \tag{4.6}$$

is convex and closed for each (t, x); for each fixed t, Q is u.s.c. in x in the Kuratowski sense; and the growth condition holds, that is

$$\sup_{(q, r) \in Q(t, x)} \left(-r + \langle p, q\rangle\right) = \varphi(t, p, x) \le \psi(t, p) \quad \text{for each } x \tag{4.7}$$

where $\psi$ is integrable in t, for each $p \in R^n$.

Under these assumptions there exists an optimal solution of the problem (4.4) - (4.5); that is, there are $x_*$, $u_*$, $t_*$ which satisfy (4.5) and such that

$$\int_0^{t_*} f^0(t, x_*(t), u_*(t))\,dt \le \int_0^{t_1} f^0(t, x(t), u(t))\,dt$$

for any other admissible pair (x, u).

Proof. The proof of this theorem will be split into a few steps. First we notice that because of (4.7) the infimum of I(x, u) is finite. Therefore we have a sequence $x_n(t)$, $u_n(t)$, $t_1^n$ of admissible pairs such that

$$I(x_n, u_n) = \int_0^{t_1^n} f^0(t, x_n(t), u_n(t))\,dt \to \alpha = \inf_{(x, u) \in \Omega} I(x, u)$$

Put $q_n(t) = \dot x_n(t) = f(t, x_n(t), u_n(t))$, $r_n(t) = f^0(t, x_n(t), u_n(t))$. By (4.6) we have

$$(q_n(t), r_n(t)) \in Q(t, x_n(t)) \quad \text{and} \quad \int_0^{t_1^n} r_n(t)\,dt \to \alpha \tag{4.8}$$

We assume that $x_n$ and $u_n$ are defined on the interval [0, T] and thus also that (4.8) holds on [0, T]. This is not much of a restriction. It will allow us to avoid certain technical difficulties. By (4.8) we get

$$\int_0^T r_n(t)\,dt \le M < +\infty \tag{4.9}$$

We shall prove first that both $q_n$ and $r_n$ are bounded in $L_1$-norm. $r_n(t)$ is bounded because of (4.9) and the inequality below obtained from (4.9) and (4.7) for p = 0: $-r_n(t) \le \psi(t, 0)$. Similarly, by (4.7) and (4.8)

$$-r_n(t) + \langle q_n(t), p\rangle \le \psi(t, p)$$

which holds for each p and which implies that $|q_n(t)| \le K(r_n(t) + \psi(t))$, where K is a constant and $\psi$ is integrable and independent of n. Thus there is M such that $\|q_n\|_{L_1} + \|r_n\|_{L_1} \le M$ for each n. This implies that $\{(q_n, r_n)\}$ is precompact in the weak* topology of the conjugate space C* of the space C of continuous functions from [0, T] into $R^{n+1}$. Without any loss of generality we may assume that $(q_n, r_n)$ is convergent in that sense. Thus there is a measure $\mu$ taking values in $R^n$ and a scalar measure $\nu$ such that

$$\int_0^T \langle \xi(t), q_n(t)\rangle\,dt + \int_0^T r_n(t)\,\eta(t)\,dt \to \int_0^T \langle \xi(t), d\mu(t)\rangle + \int_0^T \eta(t)\,d\nu(t) \tag{4.10}$$

for each continuous $\xi: [0, T] \to R^n$ and $\eta: [0, T] \to R$.

From (4.8) and (4.7) we have the inequality

$$\langle p, q_n(t)\rangle - r_n(t) - \psi(t, p) \le 0 \tag{4.11}$$


Denote by $\mu_a$, $\mu_s$ and $\nu_a$, $\nu_s$ the absolutely continuous and singular parts of $\mu$ and $\nu$, respectively. The left-hand side of (4.11) is also converging weak*, and the limit is, because of the inequality (4.11), a non-positive measure. Thus both the absolutely continuous part and the singular part of the limit are non-positive, too. Therefore from (4.10) and (4.11) we obtain the inequalities:

$$\Bigl\langle p, \frac{d\mu_a}{dt}(t)\Bigr\rangle - \frac{d\nu_a}{dt}(t) - \psi(t, p) \le 0, \tag{4.12}$$

$$\langle p, \mu_s(A)\rangle - \nu_s(A) \le 0 \quad \text{for each measurable } A. \tag{4.13}$$

Since (4.13) holds for each $p$, we have $\mu_s(A) = 0$ for each $A$. Therefore

$$\int_0^T \langle \xi(t), q_n(t)\rangle\,dt \to \int_0^T \langle \xi(t), q_0(t)\rangle\,dt,$$

where $q_0(t) = \frac{d\mu_a}{dt}(t)$, for each continuous $\xi$, which implies:

$$x_n(t) = a + \int_0^t q_n(\tau)\,d\tau \to a + \int_0^t q_0(\tau)\,d\tau = x_0(t). \tag{4.14}$$

We shall now prove that

$$\limsup_n \varphi(t, p, x_n(t)) \le \varphi(t, p, x_0(t)) \quad \text{a.e. in } [0, T]. \tag{4.15}$$

Without any loss of generality, we may assume that in (4.7) $\psi(t, p) = \sup_x \varphi(t, p, x)$. Then $\psi(t, p)$ is convex in $p$ and there is a set $N \subset [0, T]$ of measure zero such that if $t \notin N$ then $\psi(t, p)$ is finite for each $p$. Inequality (4.15) holds for $t \in [0, T] \setminus N$. Indeed, fix $t \in [0, T] \setminus N$; then $Q(t, x) \subset P(t)$, where $P(t)$ is the epigraph of the conjugate function $\psi^*(t, \cdot)$ of $\psi(t, \cdot)$. We have the inclusion

$$Q(t, x) \cap \{(q, r) \mid -r + \langle q, p\rangle \ge \varphi(t, p, x_0(t))\} \subset P(t) \cap \{(q, r) \mid -r + \langle q, p\rangle \ge \varphi(t, p, x_0(t))\}$$

and by {ri) the latter set is compact. Thus if there is $(q_n, r_n) \in Q(t, x_n(t))$ such that $\varphi(t, p, x_n(t)) = -r_n + \langle q_n, p\rangle \ge \varphi(t, p, x_0(t))$, then $(q_n, r_n)$ contains a convergent subsequence and we may assume as well that $(q_n, r_n)$ converges. Then the limit $(q_0, r_0)$, since $x_n(t) \to x_0(t)$, belongs to $Q(t, x_0(t))$ because of the u.s.c. of $Q$ with respect to $x$. Thus we have (4.15).

It is clear from (4.15) and (4.7) that, putting

$$\psi_n(t) = \max\Bigl(\varphi(t, p, x_0(t)),\ \sup_{i \ge n} \varphi(t, p, x_i(t))\Bigr),$$

we have $\lim \psi_n(t) = \varphi(t, p, x_0(t))$ and $\psi_n(t) \le \psi(t, p)$, $n = 1, 2, \ldots$ This leads us to the possibility of replacing, in inequality (4.12), $\psi(t, p)$ by $\psi_n$ for arbitrary $n$, hence also by $\varphi(t, p, x_0(t))$. With $r_0(t) = \frac{d\nu_a}{dt}(t)$ we then have

$$\langle p, q_0(t)\rangle - r_0(t) \le \varphi(t, p, x_0(t)), \quad t \in [0, T] \setminus N(p). \tag{4.16}$$


It follows from (4.16) that for each denumerable set $\{p_j\}$ there is a set $N$ of measure zero such that

$$-r_0(t) + \langle q_0(t), p\rangle \le \varphi(t, p, x_0(t)) \quad \text{for } p \in \{p_j\} \text{ and } t \in [0, T] \setminus N. \tag{4.16'}$$

Since both sides of (4.16') are continuous in $p$, if $\{p_j\}$ is dense in $R^n$, then (4.16') holds for each $p$ and $t \in [0, T] \setminus N$. Hence we conclude that

$$(q_0(t), r_0(t)) \in Q(t, x_0(t)) \quad \text{a.e. in } [0, T]. \tag{4.17}$$

It follows from (4.11) that for each $\varepsilon > 0$ there is $\delta$ such that $\int_A |q_n(t)|\,dt < \varepsilon$ if the measure of $A$ is less than $\delta$, which implies, in particular, that the convergence in (4.14) is uniform and that $q_n \to q_0$ weakly in $L_1$. From this it follows that there is $t^*$, equal to the limit of $t^n$ or of a subsequence of it, such that $x_0(t^*) = \Phi(t^*)$. The regularity assumptions on $f$ and $f^0$ are enough to conclude that there is a measurable $u_0(t)$ (Proposition 3.1) such that

$$q_0(t) = \dot{x}_0(t) = f(t, x_0(t), u_0(t)) \quad \text{and} \quad r_0(t) \ge f^0(t, x_0(t), u_0(t))$$

and $u_0(t) \in U(t)$. Obviously, $x_0(0) = a$ and, therefore, $x_0(t)$, $u_0(t)$ satisfies (4.5), hence is admissible. Now by (4.10) and (4.13), for each $\varepsilon > 0$ there is $\delta > 0$ such that if $\phi: [0, T] \to [0, 1]$ is continuous and $\phi(t) = 1$ if $t \le t^* + \delta/2$ and $\phi(t) = 0$ if $t > t^* + \delta$, then

$$\ldots < \varepsilon/4,$$

$$\ldots < \varepsilon/4 \quad \text{for } n \ge N(\varepsilon),$$

$$\ldots < \varepsilon/4 \quad \text{for } n \ge N(\varepsilon).$$

The above inequalities imply

$$\int_0^{t^*} r_0(t)\,dt \le a + \varepsilon.$$

But $\varepsilon$ is arbitrary, therefore we have

$$I(x_0, u_0) = \int_0^{t^*} f^0(t, x_0(t), u_0(t))\,dt \le \int_0^{t^*} r_0(t)\,dt \le a,$$


which shows that $x_0$, $u_0$, $t^*$ is an optimal solution and completes the proof of the theorem.

Theorem 4.1 can be extended, without much additional difficulty, to the case of multiple integrals.

Problem 2. Suppose we wish to minimize the functional

$$I(x, u) = \int_G f^0(t, x(t), u(t))\,dt \quad \text{for } (x, u) \in \Omega,$$

where $G$ is a bounded domain in $R^m$, $x: G \to R^k$ and $u: G \to R^n$. The set $\Omega$ of admissible pairs $(x, u)$ is defined by the following conditions: $x$ belongs to the Sobolev space $H_1(G, R^k)$, $u$ is measurable and $u(t) \in U(t)$ a.e. in $G$, where $U$ is a given set-valued mapping,

$$\nabla x(t) = f(t, x(t), u(t)) \quad \text{a.e. in } G,$$

and the boundary values of $x$ are fixed in the sense that $x \in x_0 + H_{10}(G, R^k)$, where $x_0 \in H_1$ is fixed and $H_{10}$ is the subspace of $H_1$ which is obtained by closing in $H_1$ the set of $C^\infty$-functions with compact support contained in $G$.

Theorem 4.2. Assume that $f^0$, $f$ are continuous in $x$, $u$ for fixed $t$ and measurable in $t$ for fixed $x$, $u$; that $U(t)$ is closed and measurable in $t$; that the set

$$Q(t, x) = \{(q, r) \mid q = f(t, x, u),\ r \ge f^0(t, x, u),\ u \in U(t)\}$$

is convex and closed for each $(t, x)$ and, for each fixed $t$, $Q$ is u.s.c. in $x$ in the Kuratowski sense; and that

$$\sup_{(q,r)\in Q(t,x)} \bigl(-r + \langle p, q\rangle\bigr) = \varphi(t, p, x) \le \psi(t, p) \quad \text{for each } x,$$

where $\psi$ is integrable in $t$ for each fixed $p \in R^{km}$. Under these assumptions there exists an optimal solution $(x_0, u_0)$ of Problem 2.

The proof is analogous and we just point out those places where there is a difference.

As in the previous case we put

$$q_n(t) = \nabla x_n(t) = f(t, x_n(t), u_n(t)), \qquad r_n(t) = f^0(t, x_n(t), u_n(t)),$$

where $(x_n, u_n)$ is the minimizing sequence. In exactly the same manner we can prove that there is a subsequence of $(q_n, r_n)$ (for simplicity still denoted by $(q_n, r_n)$) such that $q_n \to q_0$ weakly in $L_1$ and $\int_G r_n(t)\,dt \to a \ge \int_G r_0(t)\,dt$. The space $H_{10}$ has the property that if $x_n \in H_{10}$ and $\nabla x_n$ converges weakly in $L_1$, then $x_n$ converges strongly to a function $x_0$ in $H_{10}$ and $\nabla x_n$ converges weakly to $\nabla x_0$. So we have the limit function, we see that $q_0 = \nabla x_0$, and we conclude that

$$(q_0(t), r_0(t)) \in Q(t, x_0(t)).$$

The remaining part goes in exactly the same way, or even more simply, since here the integration is always over a fixed set $G$, while in the previous case we allowed the right end of the integration interval to be variable.

5. LOWER CLOSURE AND LOWER SEMICONTINUITY

The optimal problem which we discussed in the previous section can have the following "control-free" formulation:

Let $Q$ be a set-valued map defined on $[0, T] \times R^n$ into subsets of $R^{n+1}$. We assume that $Q(t, x)$ has the property that if $(q, q_0) \in Q(t, x)$ ($q_0$ is the last co-ordinate of a point from $Q$) and $r > q_0$, then $(q, r) \in Q(t, x)$. Let $\Omega$ be a certain set of pairs $(x, w)$, called admissible pairs, both being integrable functions from $[0, T]$ into $R^n$.

Problem 3: minimize

$$\int_0^T v(t)\,dt$$

over all $(x, w) \in \Omega$ and integrable $v$ such that $(w(t), v(t)) \in Q(t, x(t))$.

This problem, under the assumptions of Theorem 4.1 concerning $f^0$, $f$ and the set-valued map $U$, is equivalent to the optimal-control problem considered in the previous section in the sense that if we define $Q(t, x)$ by (4.6) and an integrable $(w(t), v(t)) \in Q(t, x(t))$ is given, then there is a measurable $u(t)$ such that $u(t) \in U(t)$, $w(t) = f(t, x(t), u(t))$ and $v(t) \ge f^0(t, x(t), u(t))$. On the other hand, the above problem can be equivalently expressed as the Lagrange problem if we consider $Q(t, x)$ to be the epigraph of a function $g(t, x, \cdot)$. More precisely, put

$$g(t, x, u) = \inf\{v \mid (u, v) \in Q(t, x)\} \tag{5.1}$$

if the latter set is non-empty and $+\infty$ otherwise. If $Q(t, x)$ is closed then it is the epigraph of $g(t, x, \cdot)$ defined by (5.1).

Problem 4. Minimize

$$\int_0^T g(t, x(t), u(t))\,dt$$

in a given class $\Omega$ of admissible pairs $(x(t), u(t))$.

Under some mild regularity condition concerning $Q$, Problem 4 is equivalent to Problem 3 in the sense that a solution of Problem 4 is a solution of Problem 3 with the same $\Omega$, and vice versa. The assumption sufficient for this is that $g(t, x(t), u(t))$ is measurable for each measurable $x$, $u$ and that the infimum in (5.1) is attained. The latter is the case if $Q(t, x)$ is closed, while the first assumption holds if $Q$ is measurable in the sense that the graph of $Q$ is an $\mathcal{L} \times \mathcal{B}$-measurable subset of $[0, T] \times R^n \times R^{n+1}$, where $\mathcal{L}$ is the Lebesgue $\sigma$-field on $[0, T]$ and $\mathcal{B}$ is the Borel $\sigma$-field in $R^n \times R^{n+1}$.
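Formula (5.1) can be illustrated numerically. The sketch below assumes a hypothetical set-valued map, not taken from the text, given by a membership test: $Q(t, x) = \{(u, v) \mid |u| \le 1,\ v \ge |u| + x^2\}$. It recovers $g(t, x, u)$ as the least value $v$ on a grid with $(u, v) \in Q(t, x)$ and returns $+\infty$ when no such $v$ exists. The names `in_Q`, `g` and `v_grid` are illustrative assumptions.

```python
import math


def in_Q(t, x, u, v):
    # Hypothetical map: Q(t, x) = {(u, v) : |u| <= 1, v >= |u| + x**2}.
    return abs(u) <= 1.0 and v >= abs(u) + x * x


def g(t, x, u, v_grid):
    """Grid version of (5.1): least v in v_grid with (u, v) in Q(t, x),
    or +inf if there is no such v."""
    feasible = [v for v in v_grid if in_Q(t, x, u, v)]
    return min(feasible) if feasible else math.inf


# Grid for v in [0, 10] with step 0.01 (an arbitrary choice for the sketch).
v_grid = [k / 100.0 for k in range(0, 1001)]

print(g(0.0, 1.0, 0.5, v_grid))  # least grid value v with v >= 0.5 + 1.0
print(g(0.0, 1.0, 2.0, v_grid))  # no admissible v because |u| > 1
```

Since this $Q(t, x)$ is closed and has the property that $(u, r) \in Q$ whenever $(u, v) \in Q$ and $r > v$, it coincides with the epigraph of $g(t, x, \cdot)$, as stated after (5.1).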

For the existence of an optimal solution the following two conditions are usually established:

($C_1$) The set

$$D = \Bigl\{(x, u, a) \Bigm| (u(t), v(t)) \in Q(t, x(t)) \text{ and } a = \int_0^T v(t)\,dt\Bigr\}$$

is closed.

($C_2$) There is $a$ such that the set

$$\Omega_a = \Bigl\{(x, u) \in \Omega \Bigm| \text{there is } v \text{ such that } (u(t), v(t)) \in Q(t, x(t)) \text{ on } [0, T] \text{ and } \int_0^T v(t)\,dt \le a\Bigr\}$$

is non-empty and compact.

Both conditions were established in the course of the proof of Theorem 4.1, with strong sequential convergence for $x$ and weak sequential convergence for $u$. It should be emphasized that the topology here is not given by the problem, and any topology is good provided both conditions $C_1$ and $C_2$ are established. In fact, different topologies have been used. This also shows that one can have a great variety of existence theorems, even for the same type of problem. Condition $C_1$ is equivalent to the so-called lower closure property of the orientor field $(u(t), v(t)) \in Q(t, x(t))$; that is, the property that if $u_n \to u_0$, $x_n \to x_0$, $\int_0^T v_n(t)\,dt \to a$, and $(u_n(t), v_n(t)) \in Q(t, x_n(t))$, then there is $v_0$ such that $\int_0^T v_0(t)\,dt \le a$ and $(u_0(t), v_0(t)) \in Q(t, x_0(t))$. Condition $C_1$, when translated in terms of the function $g$, means lower semicontinuity of the functional $I(x, u) = \int_0^T g(t, x(t), u(t))\,dt$. Indeed, if a sequence or a generalized sequence $(x_\alpha, u_\alpha)$ converges to $(x_0, u_0)$, then by $C_1$ $(x_0, u_0, \liminf I(x_\alpha, u_\alpha)) \in D$. But because of (5.1) the infimum of $\{a \mid (x_0, u_0, a) \in D\}$ equals $\int_0^T g(t, x_0(t), u_0(t))\,dt = I(x_0, u_0)$, thus $I(x_0, u_0) \le \liminf I(x_\alpha, u_\alpha)$, and vice versa. Condition $C_2$ gives us compactness of the minimizing sequence. The growth condition is responsible for this condition in Theorem 4.1. Both conditions $C_1$ and $C_2$, plus closedness of $\Omega$, imply existence of an optimal solution to Problem 3.

Of course, in a specific existence theorem, a topology or a mode of convergence in $x$-space and $u$-space has to be chosen and conditions $C_1$ and $C_2$ have to be established with respect to those topologies. In choosing topologies in $x$-space and $u$-space, we are faced with a conflict between $C_1$ and $C_2$. The weaker the topology we choose, the more probable it is that $C_2$ would hold, while for condition $C_1$ the situation is the opposite. As in the previous section, we shall here consider the strong $L_1$-topology for $x$ and $L_1$-weak sequential convergence for $u$. There are two types of existence theorems following the above pattern. One is valid when condition $C_2$ holds for reasons independent of the structure of $Q$. For example, if we know beforehand the topology in which $\Omega$ is compact, then the only assumptions we need to impose on $Q$ should be such that $C_1$ holds with respect to this particular topology. The other type is such that the properties of $Q$ which we seek should imply both $C_1$ and $C_2$. Theorems 4.1 and 4.2 are of this type. There, the growth condition gives $C_2$, while the convexity assumption is essential (and the growth condition is very helpful) to obtain $C_1$.

We shall prove that convexity of $Q(t, x)$ is, in a sense, necessary for $C_1$ to hold. To be more precise, we have the following:

Theorem 5.1. Let $Q(t)$ be a set-valued map defined on the interval $[0, 1]$ with values being subsets of $R^{n+1}$, with the property that if $(q, q_0) \in Q(t)$ and $r > q_0$ then $(q, r) \in Q(t)$, too ($q_0$ is the last co-ordinate of a point in $R^{n+1}$). Assume that the graph of $Q$, that is the set $\{(t, v) \mid t \in [0, 1],\ v \in Q(t)\}$, is an $\mathcal{L} \times \mathcal{B}$-measurable subset of $[0, 1] \times R^{n+1}$. Then the following conditions are equivalent:

(i) $\inf\bigl\{\int_0^1 v(t)\,dt \bigm| (u_0(t), v(t)) \in Q(t)\bigr\} > -\infty$ for each $u_0$, and $(u(t), v(t)) \in Q(t)$ has the l.c.p. with respect to weak convergence in $L_1$.

(ii) The set $K = \{(u, v) \mid u \in L_1([0, 1], R^n),\ v \in L_1([0, 1], R),\ (u(t), v(t)) \in Q(t)\}$ is convex and closed in the weak topology of $L_1$, and there are $\varphi \in L_\infty$ and $\psi \in L_1$ such that for each $(u, v) \in K$

$$-v(t) + \langle \varphi(t), u(t)\rangle \le \psi(t) \quad \text{a.e. in } [0, 1]. \tag{5.2}$$

(iii) $Q(t)$ is closed and convex a.e. in $[0, 1]$ and

$$Q(t) = \bigcap_{\varphi \in \Phi} \{(q, q_0) \mid -q_0 + \langle q, \varphi(t)\rangle \le \psi_\varphi(t)\} \quad \text{a.e. in } [0, 1], \tag{5.3}$$

where $\Phi \subset L_\infty([0, 1], R^n)$ is not empty and denumerable and $\psi_\varphi \in L_1$ for each $\varphi$.

(iv) $Q(t) = \{(q, q_0) \mid q_0 \ge f(t, q)\}$, where $f: [0, 1] \times R^n \to R \cup \{+\infty\}$ is $\mathcal{L} \times \mathcal{B}$-measurable, lower semicontinuous and convex in its second argument for each fixed $t$, and the functional

$$I(u) = \int_0^1 f(t, u(t))\,dt$$

is well defined on $L_1$ with values from $R \cup \{+\infty\}$ and weakly lower semicontinuous.

Proof. (i) ⇒ (ii). The l.c.p. of $(u(t), v(t)) \in Q(t)$ with respect to the weak topology implies, in particular, that the set $K$ in (ii) is weakly closed. In fact, if $\{(u_\alpha, v_\alpha)\}$ converges weakly to $(u_0, v_0)$, $(u_\alpha, v_\alpha) \in K$, and $(u_0, v_0)$ did not belong to $K$, then without any loss of generality we may assume that

$$\int_0^1 v_0(t)\,dt < a = \inf\Bigl\{\int_0^1 v(t)\,dt \Bigm| (u_0, v) \in K\Bigr\}. \tag{5.4}$$

Indeed, let $A = \{t \mid (u_0(t), v_0(t)) \in Q(t)\}$ and let $v_1$ be such that $(u_0, v_1) \in K$ and $\int_0^1 v_1(t)\,dt - a < \varepsilon$. Then, putting $(\tilde u_\alpha(t), \tilde v_\alpha(t)) = (u_0(t), v_1(t))$ if $t \in A$ and $(\tilde u_\alpha(t), \tilde v_\alpha(t)) = (u_\alpha(t), v_\alpha(t))$ otherwise, we obtain a sequence weakly convergent to $(u_0, v_\varepsilon)$, where $v_\varepsilon(t) = v_1(t)$ if $t \in A$ and $v_\varepsilon(t) = v_0(t)$ if $t \notin A$. If $\varepsilon$ is small enough, (5.4) holds for $v_\varepsilon$ because $\int_{[0,1] \setminus A} (v_1(t) - v_0(t))\,dt > 0$. But (5.4) contradicts the lower closure property.

The closedness of $K$, together with the fact that $K$ is the set of integrable selectors of a set-valued function, implies that $K$ is convex. Indeed, let $w_1, w_2 \in K$ and put $w_\lambda = \lambda w_1 + (1 - \lambda) w_2$, $\lambda \in (0, 1)$. Take any sequence $\varphi_1, \ldots, \varphi_k$ of $L_\infty$ functions and consider the set

$$B = \Bigl\{\Bigl(\int_0^1 \langle w, \varphi_1\rangle\,dt, \ldots, \int_0^1 \langle w, \varphi_k\rangle\,dt\Bigr) \Bigm| w = w_1 \chi_A + w_2 \chi_{[0,1] \setminus A};\ A \text{ measurable}\Bigr\}.$$

By Theorem 2.1 it is compact and convex, and it does contain the points

$$a_i = \Bigl(\int_0^1 \langle w_i, \varphi_1\rangle\,dt, \ldots, \int_0^1 \langle w_i, \varphi_k\rangle\,dt\Bigr), \quad i = 1, 2.$$

Therefore, it also contains

$$\lambda a_1 + (1 - \lambda) a_2 = \Bigl(\int_0^1 \langle w_\lambda, \varphi_1\rangle\,dt, \ldots, \int_0^1 \langle w_\lambda, \varphi_k\rangle\,dt\Bigr),$$

which proves that in any weak neighbourhood of $w_\lambda$ there is $w = w_1 \chi_A + w_2 \chi_{[0,1] \setminus A} \in K$. This implies that $w_\lambda \in K$ if $K$ is weakly closed. If $K$ is only weakly sequentially closed, then $w_\lambda \in K$ also, because $w_\lambda$ belongs to the closure of $\{w_1 \chi_A + w_2 \chi_{[0,1] \setminus A} \mid A \subset [0, 1] \text{ measurable}\}$ and the latter set is bounded, so that its closure is also bounded and convex and thus also weakly sequentially closed. Therefore the set

$$D = \Bigl\{(u, a) \Bigm| (u, v) \in K,\ a = \int_0^1 v\,dt\Bigr\}$$

is closed and convex, and it follows from (i) that it is bounded from below; thus there is $\varphi \in L_\infty$ such that

$$\int_0^1 \langle \varphi(t), u(t)\rangle\,dt - a \le \text{const} < +\infty$$

for each $(u, a) \in D$. The latter inequality and Lemma 1 of Section 2 imply (5.2), where

$$\psi(t) = \operatorname*{ess\,sup}_{(u, v) \in K} \bigl(-v(t) + \langle \varphi(t), u(t)\rangle\bigr).$$
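The role played by Theorem 2.1 in the convexity argument above can be seen in a small numerical sketch. It assumes two hypothetical scalar selectors $w_1(t) = 1$ and $w_2(t) = -1$ on $[0, 1]$, one test function $\varphi(t) = \cos 3t$, and the fine-grained sets $A_n = \{t \mid \operatorname{frac}(nt) < \lambda\}$; all of these choices and names are ours, not the text's. The mixtures $w_1 \chi_{A_n} + w_2 \chi_{[0,1] \setminus A_n}$ take only the values of $w_1$ and $w_2$, yet their integrals against $\varphi$ approach that of the convex combination $w_\lambda = \lambda w_1 + (1 - \lambda) w_2$ as $n$ grows.

```python
import math


def integrate(h, m=20000):
    """Midpoint rule for the integral of h over [0, 1]."""
    return sum(h((k + 0.5) / m) for k in range(m)) / m


def mixture(w1, w2, lam, n):
    """Return w1 on A_n = {t : frac(n*t) < lam} and w2 on the complement."""
    def w(t):
        return w1(t) if (n * t) % 1.0 < lam else w2(t)
    return w


def w1(t):
    return 1.0


def w2(t):
    return -1.0


def phi(t):
    return math.cos(3.0 * t)  # one bounded test function


LAM = 0.25


def gap(n):
    """Absolute difference between the pairings of phi with the mixture
    built on A_n and with the convex combination LAM*w1 + (1-LAM)*w2."""
    w = mixture(w1, w2, LAM, n)
    mixed = integrate(lambda t: w(t) * phi(t))
    convex = integrate(lambda t: (LAM * w1(t) + (1.0 - LAM) * w2(t)) * phi(t))
    return abs(mixed - convex)


for n in (1, 10, 100):
    print(n, gap(n))
```

The printed gaps shrink as $n$ increases, which is the numerical counterpart of the statement that every weak neighbourhood of $w_\lambda$ contains a function of the form $w_1 \chi_A + w_2 \chi_{[0,1] \setminus A}$. The sketch only approximates the pairing; Theorem 2.1 gives an exact set $A$ for finitely many test functions.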

(ii) ⇒ (iii). There is a denumerable subset $\{w_\alpha\}$ dense in $K$ in the $L_1$-norm topology. Put $\tilde{Q}(t) = \operatorname{cl\,co}\{w_\alpha(t)\}$. Obviously, if $w \in K$ then $w(t) \in \tilde{Q}(t)$ a.e. in $[0, 1]$, and vice versa, if $w(t) \in \tilde{Q}(t)$ a.e. in $[0, 1]$ then $w \in \operatorname{cl}\{w_\alpha\} = K$, where the closure is in the $L_1$-norm topology. If the latter were not true, then we should be able to show that $w(t)$ is outside $\tilde{Q}(t)$ on a set of positive measure. Now, two measurable set-valued functions have to be equal a.e. if they have the same set of integrable selectors. Representation (5.3) follows from (5.2) (which shows that $\Phi$ is non-empty) and from the following facts. Let $A_N(t)$ be defined by

$$A_N(t) = \Bigl\{p \Bigm| \sup_{(q, q_0) \in Q(t)} \bigl(-q_0 + \langle q, p\rangle\bigr) \le N + \psi(t),\ |p - \varphi(t)| \le N\Bigr\},$$

where $N$ is an integer and $\varphi$, $\psi$ are from (5.2). $A_N(t)$ is closed, convex and measurable in $t$; therefore there is a sequence of measurable functions $p_{N,i}$ such that $A_N(t) = \operatorname{cl}\{p_{N,i}(t)\}$ for each $t$ and fixed $N$. Put $\Phi = \bigcup_N \{p_{N,i}\}$. Then $\Phi$ is the desired set for which (5.3) holds.

(iii) ⇒ (iv). There is $f(t, u)$ such that $Q(t)$ is the epigraph of $f(t, \cdot)$ for each fixed $t$. It is obvious that $f$ has the desired properties. From (5.3), $I(u) > -\infty$, thus $I(u)$ is well defined and either finite or $+\infty$. Let $u_n \to u_0$ weakly; then for each $\varphi \in \Phi \subset L_\infty$

$$-f(t, u_n(t)) + \langle \varphi(t), u_n(t)\rangle \le \psi_\varphi(t).$$

Similarly, as in the proof of Theorem 4.1, we deduce that $v_n = f(t, u_n(t))$ contains a subsequence converging to a measure $\nu$, and $v_0 = d\nu_a/dt$ satisfies the inequality

$$-v_0(t) + \langle \varphi(t), u_0(t)\rangle \le \psi_\varphi(t), \tag{5.5}$$

while the singular part $\nu_s$ is non-negative. In particular, if $I(u_n) \to a$ then

$$\int_0^1 v_0(t)\,dt \le a. \tag{5.6}$$

Now, (5.5) together with (5.3) implies that $(u_0(t), v_0(t)) \in Q(t)$ a.e. in $[0, 1]$, therefore $v_0(t) \ge f(t, u_0(t))$. Hence, by (5.6), $I(u_0) \le \liminf I(u_n)$, which was to be proved.

(iv) ⇒ (i). The infimum in (i) is equal to $I(u_0)$ if $Q$ is the epigraph of $f$, as is the case if (iv) holds. From the l.s.c. of $I(u)$ it follows that the set

$$D = \{(u, a) \mid u \in L_1,\ a \ge I(u)\}$$

is closed. But $D = \bigl\{(u, a) \bigm| a = \int_0^1 v(t)\,dt,\ v(t) \ge f(t, u(t)) \text{ a.e. in } [0, 1]\bigr\}$. Therefore, let $(u_n(t), v_n(t)) \in Q(t)$ a.e. in $[0, 1]$ for each $n$, and let $u_n \to u_0$ weakly and $\int_0^1 v_n(t)\,dt \to a_0$; then $I(u_0) \le \liminf I(u_n) \le a_0$. Hence $v_0 = f(t, u_0(t))$ is such that $\int_0^1 v_0(t)\,dt \le a_0$ and $(u_0(t), v_0(t)) \in Q(t)$ a.e. in $[0, 1]$. Hence the lower closure property holds and the proof of Theorem 5.1 is completed.


It is clear that convexity of $Q(t, x)$ is also a necessary condition for the lower closure property of the orientor field

$$(u(t), v(t)) \in Q(t, x(t)) \tag{5.7}$$

when the convergence in $u$-space is $L_1$-weak.
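The necessity of convexity can be seen on a standard scalar example, which is our illustration rather than part of the text. Take $u_n(t) = \operatorname{sign}(\sin 2\pi n t)$ on $[0, 1]$; these functions converge weakly in $L_1$ to $u_0 = 0$. For the non-convex integrand $f(u) = (u^2 - 1)^2$ one has $I(u_n) = 0$ for every $n$ while $I(u_0) = 1$, so $I$ is not weakly lower semicontinuous; for the convex integrand $f(u) = u^2$ the inequality $I(u_0) \le \liminf I(u_n)$ holds. The sketch below checks this numerically with a midpoint rule; all function names are illustrative assumptions.

```python
import math


def integrate(h, m=20000):
    """Midpoint rule for the integral of h over [0, 1]."""
    return sum(h((k + 0.5) / m) for k in range(m)) / m


def u_n(n):
    """Rapidly oscillating function taking only the values +1 and -1."""
    return lambda t: 1.0 if math.sin(2.0 * math.pi * n * t) >= 0.0 else -1.0


def cost(f, u):
    """I(u) = integral over [0, 1] of f(u(t)) dt."""
    return integrate(lambda t: f(u(t)))


def nonconvex(u):
    return (u * u - 1.0) ** 2


def convex(u):
    return u * u


def u_0(t):
    return 0.0  # weak limit of the sequence u_n


u = u_n(50)

# Weak convergence is only sampled here, against one bounded test function.
pairing = integrate(lambda t: u(t) * math.exp(t))
print("pairing of u_50 with exp:", pairing)
print("non-convex: I(u_50) =", cost(nonconvex, u), " I(u_0) =", cost(nonconvex, u_0))
print("convex:     I(u_50) =", cost(convex, u), " I(u_0) =", cost(convex, u_0))
```

In terms of the orientor field, the non-convex case corresponds to $Q$ being the epigraph of $(u^2 - 1)^2$, which is not convex, and the pairs $(u_n, 0)$ show that the lower closure property fails for it.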

In the proof of Theorem 5.1 (part (ii) ⇒ (iii)), we used the following results concerning measurable set-valued maps without proving them. The closed set-valued map $Q$ is measurable if and only if there is a denumerable sequence of measurable selectors $\{p_i\}$ such that $Q(t) = \operatorname{cl}\{p_i(t)\}$. In Theorem 5.1, we assumed that $Q$ has an $\mathcal{L} \times \mathcal{B}$-measurable graph. This implies measurability of $Q$ in the sense of Definition 3.1 and is equivalent to it if the values of $Q(t)$ are closed. We have concluded in the proof above that $\tilde{Q}(t) = Q(t)$ a.e. in $[0, 1]$ from the fact that the sets of measurable selections are the same. There we used the fact that $Q(t) \setminus \tilde{Q}(t)$ or $\tilde{Q}(t) \setminus Q(t)$ have $\mathcal{L} \times \mathcal{B}$-measurable graphs. Thus $A = \{t \mid Q(t) \setminus \tilde{Q}(t) \ne \emptyset\}$ is Lebesgue measurable, and if the measure of this set were not zero then there would be a selection $w$ of $Q$ such that $w(t) \notin \tilde{Q}(t)$ on $A$. There we made use of the following selection theorem: if $Q: [0, 1] \to$ subsets of $R^n$ has an $\mathcal{L} \times \mathcal{B}$-measurable graph, then there is a measurable selection of $Q$.

For $Q$ measurable in the sense of Definition 3.1, such a selection theorem holds provided the values of $Q$ are closed.

Theorem 5.2. Assume that the map $Q(t, x)$ of $[0, 1] \times R^n$ into closed and convex subsets of $R^{n+1}$ has the properties: (i) if $(q, q_0) \in Q(t, x)$ and $r > q_0$ then $(q, r) \in Q(t, x)$; (ii) the graph of $Q$ is an $\mathcal{L} \times \mathcal{B}$-measurable subset of $[0, 1] \times R^n \times R^{n+1}$; (iii) for each fixed $t$ the graph of $Q(t, \cdot)$ is closed. Assume further that there are an integrable $\psi: [0, 1] \to R$ and constants $M$, $M_1$ such that for each integrable $x: [0, 1] \to R^n$ there is a measurable $p: [0, 1] \to R^n$ such that $|p(t)| \le M$ and

$$-r + \langle q, p(t)\rangle \le \psi(t) + M_1 |x(t)| \quad \text{for each } (q, r) \in Q(t, x(t)).$$

Then (5.7) has the lower closure property with respect to strong convergence in $L_1$ for $x$ and weak sequential convergence in $L_1$ for $u$. If $M_1 = 0$ then the conclusion also holds for pointwise convergence for $x$.

In this paper we shall not include the detailed proof of the above result but only mention the main steps.

Suppose $x_n(t) \to x_0(t)$ pointwise and in the $L_1$-norm, $u_n \to u_0$ weakly in $L_1$, $\int_0^1 v_n(t)\,dt \to a$ and $(u_n(t), v_n(t)) \in Q(t, x_n(t))$ for each $n$. To prove Theorem 5.2 we need to show that there is $v_0 \in L_1$ such that $\int_0^1 v_0(t)\,dt \le a$ and $(u_0(t), v_0(t)) \in Q(t, x_0(t))$. To the latter set-valued function we apply Theorem 5.1, parts (ii) and (iii). In particular, for $Q(t, x_0(t))$, (5.3) holds. Let $\varphi \in \Phi$ be fixed. Using the assumption of lower semicontinuity of $Q(t, x)$, one can show that there are $\varphi_n \in L_\infty$ and $\psi_n, \bar\psi \in L_1$ such that

$$-v_n(t) + \langle \varphi_n(t), u_n(t)\rangle \le \psi_n(t) \le \bar\psi(t),$$

$$\limsup \psi_n(t) = \psi_\varphi(t),$$

and $\varphi_n \to \varphi$, so that $-v_0(t) + \langle \varphi(t), u_0(t)\rangle \le \psi_\varphi(t)$ a.e. in $[0, 1]$, which together with (5.3) proves the theorem. Thus, the main difficulties in the proof lie in the construction of $\varphi_n$ and $\psi_n$.

An equivalent formulation of Theorem 5.2 is obtained if one notices that $Q(t, x)$ is the epigraph of $f(t, x, \cdot)$ and that the l.c.p. of (5.7) is equivalent to lower semicontinuity of the functional

$$I(x, u) = \int_0^1 f(t, x(t), u(t))\,dt.$$

6. COMMENTS

The existence theorem 1.1 can be found in the by now classical paper [19] of LaSalle for $U$ being a cube and also in book [35]. Earlier results are due to R.V. Gamkrelidze for $U$ being one-dimensional and to Bushaw for two-dimensional systems.

The exposition of Section 2 follows the author's papers on the subject [22-27]. Lemma 1 is an extension of a result obtained by Blackwell [7]. It can also be found in a different form in the paper by Borges [8]. The main difference between our method of obtaining Theorems 2.1 and 2.2 and that of other authors is contained in Lemma 3. The simple idea of the proof of this lemma is also responsible for the form of Theorem 2.1, which is more general than that usually to be seen in the literature; by this we mean the second part of it, which states that we may restrict ourselves to piecewise extremal selectors without changing the integral $S(t)$. To prove this lemma, we use the fact that the Lebesgue integral on the interval $[0, t]$ is continuous with respect to $t$. As has been noted in one of our first publications on this subject [23], working with a Lebesgue measure on an interval is not much of a restriction. What is really needed in the case of an abstract measure space $(T, \mu)$ is a one-parameter family of sets $T_t$ such that $T_t \subset T_s$ if $t < s$ and such that, for each integrable function $v$, the integral $\int_{T_t} v(\tau)\,d\mu(\tau)$ is a continuous function of $t$. In fact, if $(T, \mu)$ does not admit atoms and the measure is finite, then such a family does exist.

The proofs of the lemmas in Section 2 are different from those in the author's previous papers. They are more "co-ordinate-free". This seems to make them simpler. In particular, the proof of Lemma 1 is simpler. Notice that, there, we essentially prove the well-known result that exposed points of a convex closed set are dense in the profile of this set. For propositions concerning convex sets and their extremal structure, we refer to book [37] and also to a paper by Klee [17]. In particular, proposition (e) is due to Klee.

There are reasons why the set of integrals of integrable selectors of a set-valued function is called the integral of the latter: one can define an integral of a set-valued function by imitating the definitions of the integral of a single-valued function. Many such definitions have been given, and it appears that, in general, they are equivalent to the one we used here, which is sometimes referred to in the literature as the Aumann integral [2]. For further references and discussion of the integral of set-valued mappings, we refer the reader to paper [27].

Concerning measurable set-valued maps, selection theorems and related questions, the literature is quite extensive. The interested reader may look up the lecture notes by Parthasarathy [33]. The paper of Rockafellar [38] on the subject is well written and is a good source of information. In particular, the characterization of a closed set-valued map used in one proof of Theorem 5.1 (part (ii) ⇒ (iii)) (see the comments following the proof) has been taken therefrom. In the same place, we have used a recent selection theorem found by Aumann [3]. Proposition 3.1, often called the Filippov lemma [14], has also been proved and used by Ważewski [44]. Proposition 3.2 for compact set-valued maps is proved in Ref. [24].

One of the first generalizations of the existence theorem for linear systems of form (1.1) to systems of form (3.6), non-linear in the control parameter, is due to Neustadt [21]. Theorem 3.1, in that generality, is essentially due to the author and can be found in Ref. [25]. In the same paper, the extension of the "bang-bang" principle stated in Theorem 3.2 can be found. This theorem can be used to obtain a generalization of Theorem 1.1 to systems of form (3.6). By this we mean formulating and proving the existence and/or the "bang-bang" principle in the class of piecewise continuous control functions. Such a result was given by Levinson [20] and Halkin and Hendricks [15]. For Theorem 3.3 and similar results and applications, we refer to Refs [26, 27]. The latter paper contains some more references and discussions and could be consulted in connection with the material contained in Sections 2 and 3. Among more recent publications on this subject we would like to mention Berliocchi and Lasry [6], Wagner and Stone [41], Artstein [1], and Cesari [13]. A general reference for time-optimal control problems is a very nicely written monograph [16] by Hermes and LaSalle.

The first existence theorem for non-linear control problems is due to Filippov [14]. It is concerned with the time-optimal problem and deals with the case where the reachable set is compact. Similar cases were also treated by Roxin [40]. These results are connected with (or follow from) the theory of contingent equations developed in the thirties by Zaremba and Marchaud. It was Ważewski who noticed and explored this connection, giving a foundation for, as he called them himself, orientor fields, which are also known under other names such as differential equations with multi-valued right-hand side, differential inclusions and some others. In Ref. [43], the interested reader can find the basic theory of orientor fields of Ważewski and references to his papers.

Cesari was the first to investigate the unbounded case. He connected the Tonelli-McShane-Nagumo existence theory with optimal control problems and obtained a series of quite general results. His contribution to existence theory cannot be overestimated. A selection of the multitude of papers which he published on the subject in the last ten years is included in the list of references. In particular, existence results of the type of Theorem 4.1 can be found in Ref. [9] with uniform growth condition. This theorem also appears in the author's paper [28] and the proof is taken from that paper. For Theorem 4.2 we refer to Ref. [29]. In the papers quoted the reader will also find some other existence theorems. For various formulations of optimal control and its relation with classical variational problems, see Rockafellar's work, in particular Ref. [39]. Among others, integral functionals with the integrand assuming also the value $+\infty$ are treated systematically in an elegant and convincing way in this paper.

Theorem 5.1 was stated in Ref. [31]; it generalizes some results contained in Poljak's paper [34]. In the latter paper, the necessary and sufficient condition for $L_1$-weak sequential lower semicontinuity (equivalence between (ii) and (iv)) is obtained only for $f(t, q)$ continuous in $q$, while here we are able to prove it assuming only that $f$ is Borel-measurable. The proof of Theorem 5.1 appears here for the first time. For related results which are, however, concerned with a different weak topology, see the author's papers [30, 31]. Theorem 5.2 was also stated in Ref. [31], and the detailed proof will appear in Ref. [32]. This theorem is essentially due to Berkovitz [5] and Cesari [10]. Our formulation is slightly more general, and the proof is different.

In this contribution, we were not able to cover all results connected with existence theory. The two existence theorems in Section 4 and the two results concerning lower closure in Section 5 are only examples. Moreover, the list of references is also far from being complete. The important theory of generalized solutions of Young, or sliding regimes of Gamcrelidze, or relaxed controls of Warga, which is very closely related with the existence of optimal solutions, has not been mentioned at all. We refer the reader to the monographs by Young [45] and Warga [42]. The idea of starting this paper with the Perron paradox was borrowed from Ref. [45]. Finally, we would also like to mention a recent book by Ioffe and Tichomirov [36], where the existence of optimal solutions is treated in some detail. Many more references to this subject can be found there.

REFERENCES

[1] ARTSTEIN, Z., On a variational problem, J. Math. Anal. Appl. 45 (1974) 404-415.
[2] AUMANN, R.J., Integrals of set-valued functions, J. Math. Anal. Appl. 12 (1965) 1-12.
[3] AUMANN, R.J., Measurable utility and the measurable choice theorem, Proc. Int. Coll. C.N.R.S. "La Décision", Paris (1969) 15-26.
[4] BERKOVITZ, L.D., Existence and lower closure theorems for abstract control problems, SIAM J. on Control 12 (1974) 27-42.
[5] BERKOVITZ, L.D., Lower semicontinuity of integral functionals, to appear in Trans. A.M.S.
[6] BERLIOCCHI, H., LASRY, J.M., Intégrandes normales et mesures paramétrées en calcul des variations, Bull. Soc. Math. France 101 (1973).
[7] BLACKWELL, D., The range of certain vector integrals, Proc. Am. Math. Soc. 2 (1951) 390-395.
[8] BORGES, R., Ecken des Wertebereiches von Vektorintegralen, Math. Annalen 175 (1967) 53-58.
[9] CESARI, L., Existence theorems for weak and usual optimal solutions in Lagrange problems with unilateral constraints, I and II, Trans. A.M.S. 124 (1966) 396-412 and 413-470.
[10] CESARI, L., Lower semicontinuity and lower closure theorems without seminormality conditions, Ann. Mat. Pura Appl. 98 (1974).
[11] CESARI, L., Closure theorems for orientor fields and weak convergence, Arch. Ration. Mech. Anal., to appear.
[12] CESARI, L., LA PALM, J.R., SANCHEZ, D.A., An existence theorem for Lagrange problems with unbounded controls and a slender set of exceptional points, SIAM J. Control 9 (1971) 590-605.
[13] CESARI, L., An existence theorem without convexity conditions, SIAM J. on Control 12 (1974) 319-331.
[14] FILIPPOV, A.F., On certain questions in the theory of optimal control, Vestnik Moskov. Univ., Ser. Math. Astron. 2 (1959) 25-32.
[15] HALKIN, H., HENDRICKS, E.C., Sub-integrals of set-valued functions with semianalytic graphs, Proc. Nation. Acad. Sc. 59 (1968) 365-367.
[16] HERMES, H., LASALLE, J.P., Functional Analysis and Time Optimal Control, Academic Press (1969).
[17] KLEE, V., Extremal structure of convex sets, Arch. Math. 8 (1957) 234-240.
[18] KURATOWSKI, K., RYLL-NARDZEWSKI, C., A general theorem on selectors, Bull. Acad. Pol. Sci., Ser. Sci. Math. Astron. Phys. 13 (1965) 397-403.
[19] LASALLE, J.P., The time optimal control problem, Contr. to the theory of nonlinear oscillations, 5, Princeton Univ. Press, Princeton (1960) 1-24.
[20] LEVINSON, N., Minimax, Liapunov and "bang-bang", J. Diff. Eq. 2 (1966) 218-241.
[21] NEUSTADT, L.W., The existence of optimal controls in the absence of convexity conditions, J. Math. Anal. Appl. 7 (1963) 110-117.
[22] OLECH, C., A contribution to the time optimal control problem, Abhandlungen der Deutschen Akademie der Wissenschaften zu Berlin, Kl. Physik und Technik 2 (1965) 438-446.
[23] OLECH, C., Extremal solution of a control system, J. Diff. Eqs 2 (1966) 74-101.
[24] OLECH, C., A note concerning set-valued measurable functions, Bull. Acad. Pol. Sci., Ser. Sci. Math. Astron. Phys. 13 (1965) 317-321.
[25] OLECH, C., "Lexicographical order, range of integrals and bang-bang principle", Math. Theory of Control (BALAKRISHNAN, A.V., NEUSTADT, L.W., Eds), Academic Press, New York (1967) 35-37.
[26] OLECH, C., "Integrals of set-valued functions and linear control problems", IFAC Congress Warsaw, 1969 on Optimal Control, Technical Session 7, 22-35.
[27] OLECH, C., Integrals of set-valued functions and linear optimal control problems, Colloque sur la Théorie Mathématique du Contrôle Optimal, C.B.R.M., Vander Louvain (1970) 109-125.
[28] OLECH, C., Existence theorems for optimal problems with vector valued cost function, Trans. Am. Math. Soc. 136 (1969) 159-179.
[29] OLECH, C., Existence theorems for optimal control problems involving multiple integrals, J. Diff. Eqs 5 (1966) 512-526.
[30] OLECH, C., The characterization of the weak closure of certain sets of integrable functions, SIAM J. Control 12 (1974) 311-318.
[31] OLECH, C., Existence theory in optimal control problems - the underlying ideas, to appear in the proceedings of a conference held at the University of Southern California, Los Angeles, September 1974.
[32] OLECH, C., Weak lower semicontinuity of integral functionals, to appear in J. Optim. Theor. Appl.
[33] PARTHASARATHY, T., Selection theorems and their applications, Lecture Notes in Mathematics 263, Springer-Verlag (1972).
[34] POLJAK, B.T., Semicontinuity of integral functionals and existence theorems for extremal problems, Mat. Sbor. 28 (1969) 65-84 (in Russian).
[35] PONTRYAGIN, L.S., BOLTYANSKI, V.G., GAMCRELIDZE, R.V., MISHCHENKO, E.F., The Mathematical Theory of Optimal Control, Moscow (1961) (in Russian). English translation: Interscience, New York (1962).
[36] IOFFE, A.D., TICHOMIROV, W.M., Theory of extremal problems (in Russian), Moscow (1974).
[37] ROCKAFELLAR, R.T., Convex Analysis, Princeton Univ. Press (1969).
[38] ROCKAFELLAR, R.T., Measurable dependence of convex sets and functions on parameters, J. Math. Anal. Appl. 28 (1967) 4-25.
[39] ROCKAFELLAR, R.T., Existence theorems for general control problems of Bolza and Lagrange, to appear in Advances in Math.
[40] ROXIN, E., The existence of optimal controls, Michigan Math. J. 9 (1962) 109-119.
[41] WAGNER, D.H., STONE, L.D., Necessity and existence results on constrained optimization of separable functionals by a multiplier rule, SIAM J. Control 12 (1974) 356-372.
[42] WARGA, J., Optimal control of differential and functional equations, Academic Press, New York (1972).
[43] WAZEWSKI, T., On an optimal control problem, Proc. Conf. Diff. Equations and their Applications, Prague (1964) 229-242.
[44] WAZEWSKI, T., Sur une condition d'existence des fonctions implicites mesurables, Bull. Acad. Pol. Sci., Ser. Sci. Math. Astron. Phys. 9 (1961) 861-863.
[45] YOUNG, L.C., Lectures on the Calculus of Variations and Optimal Control Theory, W.B. Saunders Company, Philadelphia-London-Toronto (1969).

IAEA-SMR-17/48

ASYMPTOTIC CONTROL

R. CONTI
Istituto Matematico "Ulisse Dini",
Università degli Studi,
Florence, Italy

Abstract

ASYMPTOTIC CONTROL.
Asymptotic control is discussed in the framework of general control theory, special emphasis being placed on stability (including bounded-input bounded-state stability), affine control systems and stabilization problems.

1. PRELIMINARIES

We shall first recall a few more or less well-known facts about "linear" ordinary differential equations, so as to render this paper as self-contained as possible.

Let us denote by:

$J = \,]\alpha, \omega[$ an open interval of the real line $\mathbb{R}$, with $-\infty \le \alpha < \omega \le +\infty$;

$A : t \to A(t)$ an $n \times n$ matrix function of $t \in J$, Lebesgue measurable and locally integrable on $J$;

$\chi : t \to \chi(t)$ an $n$-vector function of $t \in J$, continuous on $J$.

Given any $\theta \in J$, the Volterra integral equation

$$x(t) = \chi(t) + \int_\theta^t A(s)\, x(s)\, ds$$

has a single solution $x : t \to x(t)$ defined on $J$ by

$$x(t) = \lim_{k \to \infty} \Big[ \chi(t) + \int_\theta^t A(t_1)\, \chi(t_1)\, dt_1 + \dots + \int_\theta^t \int_\theta^{t_1} \cdots \int_\theta^{t_{k-1}} A(t_1) \cdots A(t_k)\, \chi(t_k)\, dt_k \cdots dt_1 \Big]$$

When $\chi(t) = x$, a constant $n$-vector, this can be written

$$x(t) = \lim_{k \to \infty} \Big[ I + \int_\theta^t A(t_1)\, dt_1 + \dots + \int_\theta^t \int_\theta^{t_1} \cdots \int_\theta^{t_{k-1}} A(t_1) \cdots A(t_k)\, dt_k \cdots dt_1 \Big]\, x \tag{1.1}$$

where $I$ denotes the $n \times n$ unit matrix.

The function $t \to x(t)$ defined by (1.1) satisfies the condition

$$x(\theta) = x \tag{C}$$

It is locally absolutely continuous on $J$ and such that

$$\frac{dx(t)}{dt} - A(t)\, x(t) = 0, \quad \text{a.e. } t \in J$$

Therefore, we shall call it the (Carathéodory) solution of the linear ordinary differential equation

$$\dot{x} - A(t)\, x = 0 \tag{$E_0$}$$

satisfying (C).

Since the limit appearing in Eq. (1.1) exists for an arbitrary $\theta$ we can define the $n \times n$ matrix

$$G(t, \theta) = \lim_{k \to \infty} \Big[ I + \int_\theta^t A(t_1)\, dt_1 + \dots + \int_\theta^t \int_\theta^{t_1} \cdots \int_\theta^{t_{k-1}} A(t_1) \cdots A(t_k)\, dt_k \cdots dt_1 \Big] \tag{G}$$

which is called the transition matrix of $(E_0)$.

Then we can replace (1.1) by the more compact formula

$$x(t) = G(t, \theta)\, x(\theta), \quad t, \theta \in J \tag{1.2}$$

From (1.2) it is easy to derive the algebraic properties of $G$:

$$G(t, \theta)\, G(\theta, \tau) = G(t, \tau), \quad \theta, t, \tau \in J \tag{1.3}$$

$$G(t, t) = I, \quad t \in J \tag{1.4}$$

$$G^{-1}(t, \theta) = G(\theta, t), \quad \theta, t \in J \tag{1.5}$$

and the differential properties

$$\frac{\partial G(t, \theta)}{\partial t} - A(t)\, G(t, \theta) = 0, \quad \theta \in J, \ \text{a.e. } t \in J \tag{1.6}$$

$$\frac{\partial G(t, \theta)}{\partial \theta} + G(t, \theta)\, A(\theta) = 0, \quad t \in J, \ \text{a.e. } \theta \in J \tag{1.7}$$

The norm of the matrix $G(t, \theta)$ is the number

$$|G(t, \theta)| = \sup \{ |G(t, \theta)\, x|_2 : |x|_2 \le 1 \} \tag{1.8}$$

where $|\cdot|_2$ is the Euclidean norm in $\mathbb{R}^n$.

It is easy to show that $(t, \theta) \to G(t, \theta)$ is a continuous function on $J \times J$ with respect to the norm.

By virtue of (1.2), the number (1.8) is equivalent to

$$|G(t, \theta)| = \sup \{ |x(t)|_2 : x \in \mathcal{E}_0, \ |x(\theta)|_2 \le 1 \} \tag{1.9}$$

where $\mathcal{E}_0$ denotes the set of solutions of $(E_0)$.

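Let us now recall some inequalities to be used later on.

Let us denote by $\lambda_H(t)$ and $\mu_H(t)$ the least and the greatest eigenvalue, respectively, of the Hermitian matrix

$$H(t) = \tfrac{1}{2} A(t) + \tfrac{1}{2} A^*(t) \tag{1.10}$$

where $A^*$ is the transpose of $A$.

Then it can be shown that $t \to \lambda_H(t)$ and $t \to \mu_H(t)$ are measurable and locally integrable on $J$ and we have

$$\begin{aligned}
\exp\Big(\int_\theta^t \lambda_H(s)\, ds\Big) &\le |G(t, \theta)| \le \exp\Big(\int_\theta^t \mu_H(s)\, ds\Big), \quad \theta \le t \\
\exp\Big(-\int_\theta^t \mu_H(s)\, ds\Big) &\le |G(\theta, t)| \le \exp\Big(-\int_\theta^t \lambda_H(s)\, ds\Big), \quad \theta \le t
\end{aligned} \tag{1.11}$$

From these inequalities it follows that both $|G(t, \theta)|$ and $|G(\theta, t)|$ satisfy

$$\exp\Big(-\int_\theta^t |H(s)|\, ds\Big) \le |G(t, \theta)|,\ |G(\theta, t)| \le \exp\Big(\int_\theta^t |H(s)|\, ds\Big), \quad \theta \le t \tag{1.12}$$

and

$$\exp\Big(-\int_\theta^t |A(s)|\, ds\Big) \le |G(t, \theta)|,\ |G(\theta, t)| \le \exp\Big(\int_\theta^t |A(s)|\, ds\Big), \quad \theta \le t \tag{1.13}$$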

For a fixed $\theta$, (1.2) represents an isomorphism between $\mathbb{R}^n$ and the set $\mathcal{E}_0$ of solutions of $(E_0)$, so that $\mathcal{E}_0$ is a (real) vector space of dimension $n$.

Let us denote by $X : t \to X(t)$ any $n \times n$ matrix function whose columns $x^1, \dots, x^n$ are solutions of $(E_0)$. From (1.2) we have

$$X(t) = G(t, \theta)\, X(\theta), \quad t, \theta \in J$$

hence

$$\det X(t) = \det G(t, \theta)\, \det X(\theta)$$

Since $\det G(t, \theta) \ne 0$ we have that either $X(t)$ is non-singular for every $t \in J$ or it is singular for every $t \in J$. In the first case, we can write

$$G(t, \theta) = X(t)\, X^{-1}(\theta), \quad t, \theta \in J \tag{1.14}$$

This yields easily

$$\det G(t, \theta) = \exp\Big(\int_\theta^t \operatorname{tr} A(s)\, ds\Big), \quad t, \theta \in J \tag{1.15}$$

The equation

$$\dot{z} + A^*(t)\, z = 0 \tag{$E_0^*$}$$

is the adjoint to $(E_0)$. Using (1.14) it is easy to show that the transition matrix $\Gamma$ of $(E_0^*)$ is given by

$$\Gamma(t, \theta) = G^*(\theta, t) = G^{*-1}(t, \theta), \quad t, \theta \in J \tag{1.16}$$

Equation $(E_0)$ is said to be autonomous when $A$ is independent of $t$. In this case $J = \mathbb{R}$ and

$$G(t, \theta) = e^{(t - \theta) A}, \quad t, \theta \in \mathbb{R} \tag{1.17}$$

It follows that the solutions of $(E_0)$ are analytic functions of $t \in \mathbb{R}$. More precisely, let $\lambda_1, \dots, \lambda_k$ denote the distinct eigenvalues of $A$, and $n_1, \dots, n_k$ their respective multiplicities. Then for every $x \in \mathbb{C}^n$ there are $x_1, \dots, x_k \in \mathbb{C}^n$ such that

$$e^{tA} x = \sum_{j=1}^{k} e^{\lambda_j t} \sum_{r=0}^{n_j - 1} \frac{t^r}{r!}\, (A - \lambda_j I)^r x_j \tag{1.18}$$
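
As a numerical illustration of the autonomous case, the following minimal sketch sums the series (G) for a constant matrix, where the $k$-fold iterated integral reduces to $((t - \theta)A)^k / k!$, and checks properties (1.3), (1.4), (1.5) and (1.15). The matrix `A`, the sample times and all function names are assumptions chosen for illustration only; they do not come from the paper.

```python
import math

def mat_mul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def transition(A, t, theta, terms=60):
    """Partial sum of the series (G) for a constant matrix A.

    For constant A the k-fold iterated integral equals ((t - theta) A)^k / k!,
    so the series is the exponential series of (1.17).
    """
    n = len(A)
    h = t - theta
    G = identity(n)
    term = identity(n)
    for k in range(1, terms):
        # After this step, term equals ((t - theta) A)^k / k!
        term = mat_mul(term, [[h * a / k for a in row] for row in A])
        G = [[G[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return G

def max_abs_diff(P, Q):
    return max(abs(P[i][j] - Q[i][j])
               for i in range(len(P)) for j in range(len(P[0])))

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A = [[-1.0, 1.0], [0.0, -2.0]]   # illustrative constant matrix (an assumption)
t, theta, tau = 1.5, 0.4, -0.3   # arbitrary sample times

# (1.3): G(t, theta) G(theta, tau) = G(t, tau)
err_13 = max_abs_diff(mat_mul(transition(A, t, theta), transition(A, theta, tau)),
                      transition(A, t, tau))
# (1.4): G(t, t) = I
err_14 = max_abs_diff(transition(A, t, t), identity(2))
# (1.5): G(t, theta) G(theta, t) = I
err_15 = max_abs_diff(mat_mul(transition(A, t, theta), transition(A, theta, t)),
                      identity(2))
# (1.15): det G(t, theta) = exp((t - theta) tr A)
err_115 = abs(det2(transition(A, t, theta))
              - math.exp((t - theta) * (A[0][0] + A[1][1])))
print(err_13, err_14, err_15, err_115)
```

The four printed residuals should be at the level of rounding error. For a time-dependent $A$ the iterated integrals no longer collapse to powers, so this shortcut applies only to the autonomous case.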

2. STABILITY, I

From now on, we shall assume $\omega = +\infty$, i.e. $J = \,]\alpha, +\infty[$, $-\infty \le \alpha$, and we shall study the behaviour of the solutions of

$$\dot{x} - A(t)\, x = 0 \tag{$E_0$}$$

as $t \to +\infty$.

Because of

$$x(t) = G(t, \theta)\, x(\theta) \tag{1.2}$$

this amounts to studying the behaviour of $G : (t, \theta) \to G(t, \theta)$ as $t \to +\infty$.

As a matter of notation, we shall write $x(t, \theta, x)$ instead of $x(t)$ to denote the solution of $(E_0)$ such that $x(\theta) = x$, so that (1.2) will be replaced by

$$x(t, \theta, x) = G(t, \theta)\, x \tag{2.1}$$

It is well known that the zero solution of $(E_0)$ is stable (according to Liapunov, as $t \to +\infty$) iff for every $\varepsilon > 0$ and $\tau > \alpha$ there exists $\delta = \delta(\tau, \varepsilon) > 0$ such that

$$|x|_2 < \delta, \quad \alpha < \tau \le t \ \Rightarrow\ |x(t, \tau, x)|_2 < \varepsilon \tag{2.2}$$

It is easy to see that this holds iff $G$ has the property S: for every $\tau > \alpha$ there exists $\gamma(\tau) > 0$ such that

$$\alpha < \tau \le t \ \Rightarrow\ |G(t, \tau)| < \gamma(\tau) \tag{2.3}$$

In fact, if (2.3) holds, for any $\varepsilon > 0$, $\tau > \alpha$ we have

$$|x(t, \tau, x)|_2 = |G(t, \tau)\, x|_2 \le \gamma(\tau)\, |x|_2 < \varepsilon, \quad \alpha < \tau \le t$$

if $|x|_2 < \delta(\tau, \varepsilon) = \gamma^{-1}(\tau)\, \varepsilon$. Conversely, if (2.2) holds, for $\varepsilon = 1$, $\tau > \alpha$, there exists $\delta = \delta(\tau) > 0$ such that

$$|x|_2 < \delta, \quad \alpha < \tau \le t \ \Rightarrow\ |G(t, \tau)\, x|_2 < 1$$

whence

$$\alpha < \tau \le t \ \Rightarrow\ |G(t, \tau)| \le \delta^{-1}(\tau)$$

so that (2.3) holds with any $\gamma(\tau) > \delta^{-1}(\tau)$.

Property S means that for every fixed $\tau > \alpha$ the function $t \to |G(t, \tau)|$ is bounded for $\tau \le t$. A more restrictive condition is satisfied when for every $\theta > \alpha$ the function $(t, \tau) \to |G(t, \tau)|$ is bounded for $\theta \le \tau \le t$. This can be expressed by saying that $G$ has the property US: for every $\theta > \alpha$ there exists $\gamma(\theta) > 0$ such that

$$\theta \le \tau \le t \ \Rightarrow\ |G(t, \tau)| < \gamma(\theta) \tag{2.4}$$

It is readily seen (by the same arguments used to prove the equivalence between stability and property S) that property US is equivalent to the uniform stability of the zero solution of $(E_0)$ as $t \to +\infty$. This means that for every $\varepsilon > 0$, $\theta > \alpha$, there exists $\delta = \delta(\theta, \varepsilon) > 0$ such that

$$|x|_2 < \delta, \quad \theta \le \tau \le t \ \Rightarrow\ |x(t, \tau, x)|_2 < \varepsilon \tag{2.5}$$

Another property of $G$ more restrictive than property S is property AS: for every $\tau > \alpha$ we have

$$\lim_{t \to +\infty} |G(t, \tau)| = 0 \tag{2.6}$$

Clearly, this is equivalent to

$$\tau > \alpha, \ x \in \mathbb{R}^n \ \Rightarrow\ \lim_{t \to +\infty} |x(t, \tau, x)|_2 = 0 \tag{2.7}$$

i.e. to the asymptotic stability of the zero solution of $(E_0)$ as $t \to +\infty$.

The zero solution of $(E_0)$ is said to be exponentially asymptotically stable as $t \to +\infty$, when (2.7) is reinforced by

$$\alpha < \theta \le \tau \le t, \ x \in \mathbb{R}^n \ \Rightarrow\ |x(t, \tau, x)|_2 \le \gamma(\theta)\, e^{-\mu(\theta)(t - \tau)}\, |x|_2 \tag{2.8}$$

for some $\gamma(\theta) > 0$ and $\mu(\theta) > 0$.

This is clearly equivalent to property EAS of $G$: for every $\theta > \alpha$ there exist $\gamma(\theta) > 0$, $\mu(\theta) > 0$ such that

$$\alpha < \theta \le \tau \le t \ \Rightarrow\ |G(t, \tau)| < \gamma(\theta)\, e^{-\mu(\theta)(t - \tau)} \tag{2.9}$$

Looking at the definitions we see immediately that the following implications among the four properties S, US, AS, EAS of $G$ are valid:

$$\mathrm{EAS} \Rightarrow \mathrm{US} \Rightarrow \mathrm{S}, \qquad \mathrm{EAS} \Rightarrow \mathrm{AS} \Rightarrow \mathrm{S}$$

None of these are reversible and the two properties US, AS are independent of each other, as examples show. To this effect, let us consider the scalar ($n = 1$) equation

$$\dot{x} - \frac{\dot{f}(t)}{f(t)}\, x = 0 \tag{2.10}$$

where $f : t \to f(t)$ is a function defined for $t > \alpha$, positive and locally absolutely continuous. We immediately see that $G$ is given by

$$G(t, \tau) = f(t) / f(\tau)$$

Example 2.1

Let $\alpha = 0$ and let $f(t) = 1/t$ for $0 < t \le 2$, while for $t \ge 2$ the graph of $f$ is the polygonal line with successive vertices at $(2, 1/2)$, $(3, 1)$, $(4, 1/4)$, $(5, 1)$, ... Such an $f$ is bounded on every half-line $[\tau, +\infty[$, $\tau > 0$, but it does not tend to zero, so that $G$ has the property S but not AS. Nor has it property US, since $G(2k + 1, 2k) = 2k$, $k = 1, 2, \dots$

Example 2.2

Let $\alpha = -\infty$, $f(t) = 1 + e^{-t}$. Then $G$ has property US, but not AS.

Example 2.3

Let $\alpha = 0$, $f(t) = t^{-2 + \cos t}$. $G$ has the property AS but not US, since

$$G(2k\pi, (2k - 1)\pi) = \pi^2 (2k - 1)^3 / (2k), \quad k = 1, 2, \dots$$

FIG. 1. Relationship of properties S, US, AS and EAS. (Diagram labels: Ex. 2.1, Ex. 2.2, Ex. 2.3, Ex. 2.4, Ex. 2.5.)

Example 2.4

Let $\alpha = 0$, $f(t) = 1/t$. Then $G$ has both properties US and AS but not EAS.

Example 2.5

Let $\alpha = -\infty$, $f(t) = e^{-t}$. $G$ has the property EAS.

What we have said can be visualized by a scheme showing the implications among the properties S, US, AS, EAS, as is shown in Fig. 1.
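
The values quoted in Examples 2.1 to 2.5 can be checked numerically from $G(t, \tau) = f(t)/f(\tau)$. The sketch below is only an illustration; the function names `f21`, ..., `f25` and `G` are hypothetical labels introduced here, and the piecewise-linear interpolation in `f21` is one way of realizing the polygonal graph described in Example 2.1.

```python
import math

def f21(t):
    """Example 2.1: 1/t on ]0, 2], then the polygonal line through
    the vertices (2k, 1/(2k)) and (2k + 1, 1)."""
    if t <= 2:
        return 1.0 / t
    n = math.floor(t)

    def vertex(m):
        # Value of f at the integer abscissa m >= 2
        return 1.0 / m if m % 2 == 0 else 1.0

    return vertex(n) + (t - n) * (vertex(n + 1) - vertex(n))

def f22(t):
    return 1.0 + math.exp(-t)            # Example 2.2

def f23(t):
    return t ** (-2.0 + math.cos(t))     # Example 2.3

def f24(t):
    return 1.0 / t                       # Example 2.4

def f25(t):
    return math.exp(-t)                  # Example 2.5

def G(f, t, tau):
    """Transition 'matrix' of the scalar equation (2.10)."""
    return f(t) / f(tau)

for k in (1, 2, 5, 10):
    # Example 2.1: G(2k+1, 2k) = 2k, unbounded in k, so US fails
    print(k, G(f21, 2 * k + 1, 2 * k))
    # Example 2.3: G(2k pi, (2k-1) pi) = pi^2 (2k-1)^3 / (2k)
    print(k, G(f23, 2 * k * math.pi, (2 * k - 1) * math.pi),
          math.pi ** 2 * (2 * k - 1) ** 3 / (2 * k))

# Example 2.2: G(t, tau) stays below 1 for tau <= t but tends to 1/(1 + e^{-tau}) != 0
print(G(f22, 50.0, 0.0))
# Example 2.4: G(t, tau) = tau / t, bounded by 1 and tending to 0, but not exponentially
print(G(f24, 1000.0, 1.0))
# Example 2.5: G(t, tau) = e^{-(t - tau)}
print(G(f25, 3.0, 1.0), math.exp(-2.0))
```

A finite table of values cannot prove or disprove an asymptotic property; the script only confirms the closed-form values on which the arguments in the examples rest.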

When $(E_0)$ is autonomous, then it can be shown that properties S and US coincide and are characterized by: a) all the eigenvalues of $A$ have real parts $\le 0$, and b) those eigenvalues $\lambda_j$ which have zero real part (if any) have a multiplicity $n_j$ equal to $n - \operatorname{rank}(A - \lambda_j I)$.

Furthermore, properties AS and EAS coincide and are equivalent to: c) all the eigenvalues of $A$ have real parts $< 0$ (strictly).

3. STABILITY, II

Let us now consider a property of $G$ of integral type, namely property IS: for every $\theta > \alpha$ let there exist $k(\theta) > 0$ such that

$$\theta \le t \ \Rightarrow\ \int_\theta^t |G(t, s)|\, ds < k(\theta) \tag{3.1}$$

This property is independent of property US.

In fact, the $G$ of Example 2.4 is defined by $G(t, s) = s/t$, so that it has property US but not IS since

$$\int_\theta^t |G(t, s)|\, ds = \frac{t^2 - \theta^2}{2t} \to +\infty \quad \text{as } t \to +\infty$$

On the other hand, the $G$ of the next example has the property IS but does not have property US.

Example 3.1

Let $\alpha = -\infty$ and let $\lambda : t \to \lambda(t)$ be a locally absolutely continuous function of $t \in \mathbb{R}$, equal to 1 everywhere except on the intervals $J_k = [k - 2^{-4k}, k + 2^{-4k}]$, $k = 1, 2, \dots$, where $1 \le \lambda(t) \le 2^{2k}$ and $\lambda(k) = 2^{2k}$. If we take Eq. (2.10) with $f(t) = e^{-t}/\lambda(t)$ we have $G(k + 2^{-4k}, k) = 2^{2k} e^{-2^{-4k}} \to +\infty$ as $k \to +\infty$, so that US does not hold.

On the other hand, for $\theta < 1 - 2^{-4}$ we have

$$\int_\theta^t |G(t, s)|\, ds = \lambda^{-1}(t)\, e^{-t} \int_\theta^t \lambda(s)\, e^{s}\, ds \le e^{-t} \int_\theta^t e^{s}\, ds + \sum_{k=1}^{[t+1]} \int_{J_k} \lambda(s)\, ds \le e^{-t}(e^{t} - e^{\theta}) + 2 \sum_{k=1}^{\infty} 2^{2k}\, 2^{-4k} \le 1 + 2/3$$

so that $G$ has the property IS.

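Example 2.3 shows that

$$\mathrm{AS} \not\Rightarrow \mathrm{IS} \tag{3.2}$$

In fact, since $\cos s \le 1$ we have $s^{2 - \cos s} \ge s$ for $s \ge 1$, hence for $t \ge \theta \ge 1$

$$\int_\theta^t |G(t, s)|\, ds = t^{-2 + \cos t} \int_\theta^t s^{2 - \cos s}\, ds \ge t^{-2 + \cos t} \int_\theta^t s\, ds = t^{\cos t}\, \big(1 - (\theta/t)^2\big) / 2$$

so that

$$\int_\theta^{2k\pi} |G(2k\pi, s)|\, ds \ge k\pi\, \big(1 - (\theta / 2k\pi)^2\big) \to +\infty \quad \text{as } k \to +\infty$$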

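On the other hand, we have

$$\mathrm{IS} \Rightarrow \mathrm{AS} \tag{3.3}$$

To prove this, let $\theta > \alpha$ and let $\theta' \in \,]\alpha, \theta[$ be chosen as a function of $\theta$ alone (for instance $\theta' = (\alpha + \theta)/2$ if $\alpha > -\infty$). Let, further,

$$\varphi(t) = |G(t, \tau)|^{-1}, \quad \tau \le t \tag{3.4}$$

$$\psi(t) = \int_{\theta'}^t \varphi(s)\, ds, \quad \theta' \le t \tag{3.5}$$

so that $\varphi(t) > 0$, $\psi(t) > 0$, and, in particular, $\psi(\theta) > 0$. Then we have

$$\begin{aligned}
\psi(t)\, \varphi^{-1}(t) &= \psi(t)\, |G(t, \tau)| = |\psi(t)\, G(t, \tau)| \\
&= \Big| \Big( \int_{\theta'}^t \varphi(s)\, ds \Big) G(t, \tau) \Big| = \Big| \int_{\theta'}^t \varphi(s)\, G(t, \tau)\, ds \Big| \\
&= \Big| \int_{\theta'}^t \varphi(s)\, G(t, s)\, G(s, \tau)\, ds \Big| \\
&\le \int_{\theta'}^t \varphi(s)\, |G(t, s)|\, |G(s, \tau)|\, ds \\
&= \int_{\theta'}^t |G(t, s)|\, ds < k(\theta')
\end{aligned}$$

by virtue of property IS. Owing to the definition of $\theta'$, we can replace $k(\theta')$ by $k(\theta)$ and write

$$\psi(t)\, \varphi^{-1}(t) \le k(\theta), \quad \theta \le t \tag{3.6}$$

From this follows $(d/dt)\big[\psi(t)\, e^{-k^{-1}(\theta)(t - \theta)}\big] \ge 0$, $\theta \le \tau \le t$, hence, integrating between $\theta$ and $t$,

$$\psi(t)\, e^{-k^{-1}(\theta)(t - \theta)} \ge \psi(\theta)$$

that is

$$\psi^{-1}(t) \le \psi^{-1}(\theta)\, e^{-k^{-1}(\theta)(t - \theta)} \le \psi^{-1}(\theta)\, e^{-k^{-1}(\theta)(t - \tau)}, \quad \theta \le \tau \le t$$

Since (3.4), (3.6) give

$$|G(t, \tau)| = \varphi^{-1}(t) \le k(\theta)\, \psi^{-1}(t)$$

we have

$$|G(t, \tau)| \le k(\theta)\, \psi^{-1}(\theta)\, e^{-k^{-1}(\theta)(t - \tau)} \tag{3.7}$$

from which property AS follows.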

It should be noted that (3.7) does not imply property EAS since $\psi^{-1}(\theta)$, by definition, depends on $\tau$. However, if $G$ has also the property US then for every $\theta > \alpha$ there exists $\gamma_0(\theta) > 0$ such that

$$\theta \le \tau \le t \ \Rightarrow\ |G(t, \tau)| < \gamma_0(\theta)$$

whence

$$\theta' \le \tau \le t \ \Rightarrow\ \varphi(t) = |G(t, \tau)|^{-1} > \gamma_0^{-1}(\theta')$$

so that

$$\psi(t) = \int_{\theta'}^t \varphi(s)\, ds > \gamma_0^{-1}(\theta')\, (t - \theta')$$

In particular, $\psi(\theta) > \gamma_0^{-1}(\theta')\, (\theta - \theta')$, that is $\psi^{-1}(\theta) < \gamma_0(\theta') / (\theta - \theta')$, and by definition of $\theta'$

$$\psi^{-1}(\theta) < \gamma_1(\theta) \tag{3.8}$$

where $\gamma_1(\theta) > 0$ is independent of $\tau$.

From (3.8) and (3.7) it follows

$$\theta \le \tau \le t \ \Rightarrow\ |G(t, \tau)| < k(\theta)\, \gamma_1(\theta)\, e^{-k^{-1}(\theta)(t - \tau)}$$

that is, property EAS holds with $\gamma(\theta) = k(\theta)\, \gamma_1(\theta)$, $\mu(\theta) = k^{-1}(\theta)$. We have thus proved

$$\mathrm{IS} \text{ plus } \mathrm{US} \ \Rightarrow\ \mathrm{EAS} \tag{3.9}$$

The converse is also true. In fact, $\mathrm{EAS} \Rightarrow \mathrm{US}$ is obvious. Also, if EAS holds we have

$$\int_\theta^t |G(t, s)|\, ds < \gamma(\theta) \int_\theta^t e^{-\mu(\theta)(t - s)}\, ds = \gamma(\theta)\, \mu^{-1}(\theta)\, \big[1 - e^{-\mu(\theta)(t - \theta)}\big]$$

that is, IS holds with $k(\theta) = \gamma(\theta)\, \mu^{-1}(\theta)$.

Therefore

$$\mathrm{IS} \text{ plus } \mathrm{US} \ \Leftrightarrow\ \mathrm{EAS} \tag{3.10}$$

and the scheme of Section 2 is now completed as is shown in Fig. 2.
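
The contrast between US without IS and EAS can be seen numerically on the scalar examples of Section 2. The sketch below approximates the integral in (3.1) by the midpoint rule; the helper name `integral_of_G`, the step count and the sample times are assumptions made for illustration.

```python
import math

def integral_of_G(G, theta, t, steps=20000):
    """Midpoint-rule approximation of the integral of |G(t, s)| for s in [theta, t]."""
    h = (t - theta) / steps
    return h * sum(abs(G(t, theta + (i + 0.5) * h)) for i in range(steps))

def G24(t, s):
    return s / t                   # Example 2.4: f(t) = 1/t

def G25(t, s):
    return math.exp(-(t - s))      # Example 2.5: f(t) = e^{-t}

theta = 1.0
for t in (10.0, 100.0, 1000.0):
    # Example 2.4: the integral equals (t^2 - theta^2) / (2 t) and grows without bound
    # Example 2.5: the integral equals 1 - e^{-(t - theta)} < 1 = gamma / mu
    print(t,
          integral_of_G(G24, theta, t), (t * t - theta * theta) / (2 * t),
          integral_of_G(G25, theta, t))
```

For Example 2.5 one has $\gamma = \mu = 1$, so the bound $k(\theta) = \gamma(\theta)\mu^{-1}(\theta) = 1$ obtained above is consistent with the computed values.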

FIG. 2. Relationship of properties S, US, AS, IS and EAS.

4. STABILITY, III

We shall now try to characterize property EAS, i.e. the exponential asymptotic stability of the zero solution of

$$\dot{x} - A(t)\, x = 0 \tag{$E_0$}$$

by means of a "Liapunov function".

We start with the case of a constant $A$.

Definition 4.1

We say that the constant matrix $A$ admits a Liapunov function iff there are two $n \times n$ matrices $\Lambda$, $Q$ which are positive definite:

$$\Lambda = \Lambda^* > 0 \tag{4.1}$$

$$Q = Q^* > 0 \tag{4.2}$$

and such that

$$A^* \Lambda + \Lambda A = -Q \tag{$\Lambda$}$$

When this happens the quadratic form

$$x \to x^* \Lambda x$$

is a Liapunov function of $A$ and $(\Lambda)$ is a Liapunov matrix equation associated with $A$.

We want to prove

Theorem 4.1

If $A$ has a Liapunov function, then $G$ has the property EAS.

Proof. Let

$$y(t) = e^{t(A + \mu I)}\, x \tag{4.3}$$

with $\mu > 0$, so that $dy(t)/dt - (A + \mu I)\, y(t) = 0$. From $(\Lambda)$ it follows

$$\frac{d}{dt}\big[y^*(t)\, \Lambda\, y(t)\big] = y^*(t)\, [-Q + 2\mu\Lambda]\, y(t)$$

If we denote by $\mu_\Lambda$ the greatest eigenvalue of $\Lambda$, from (4.1) we have

$$x^* \Lambda x \le \mu_\Lambda\, |x|_2^2$$

Also, denoting by $\lambda_Q$ the least eigenvalue of $Q$, from (4.2) we have

$$-x^* Q x \le -\lambda_Q\, |x|_2^2$$

Therefore,

$$y^*(t)\, [-Q + 2\mu\Lambda]\, y(t) \le [-\lambda_Q + 2\mu\mu_\Lambda]\, |y(t)|_2^2$$

and if

$$\mu < \frac{\lambda_Q}{2\mu_\Lambda} \tag{4.4}$$

we have $d[y^*(t)\, \Lambda\, y(t)]/dt \le 0$, whence, integrating between 0 and $t$,

$$y^*(t)\, \Lambda\, y(t) \le x^* \Lambda x, \quad t \ge 0$$

But again from (4.1) we have

$$\lambda_\Lambda\, |y(t)|_2^2 \le y^*(t)\, \Lambda\, y(t)$$

where $\lambda_\Lambda$ is the least eigenvalue of $\Lambda$, and

$$x^* \Lambda x \le \mu_\Lambda\, |x|_2^2$$

Therefore

$$\lambda_\Lambda\, |y(t)|_2^2 \le \mu_\Lambda\, |x|_2^2$$

and (4.3) gives

$$|e^{tA}| \le \Big(\frac{\mu_\Lambda}{\lambda_\Lambda}\Big)^{1/2} e^{-\mu t}, \quad t \ge 0$$

for $\mu$ satisfying (4.4), so that property EAS holds with any $\gamma > (\mu_\Lambda / \lambda_\Lambda)^{1/2}$.

Remark 4.1

Inequality (4.4) says that the solutions of $(E_0)$ decay at least as fast as $t \to e^{-\mu t}$ for every $\mu < \lambda_Q / (2\mu_\Lambda)$.

The converse of Theorem 4.1 is also valid.

Theorem 4.2

If $G$ has the property EAS then $A$ has a Liapunov function.

Proof. By assumption

$$|e^{tA}| < \gamma\, e^{-\mu t}, \quad t \ge 0 \tag{4.5}$$

for some $\gamma > 0$ and $\mu > 0$.

On the other hand, for $x \in \mathbb{R}^n$ and any $Q = Q^* > 0$

$$\begin{aligned}
x^* \Big( \int_0^{+\infty} e^{sA^*} Q\, e^{sA}\, ds \Big) x &= \int_0^{+\infty} (e^{sA} x)^*\, Q\, (e^{sA} x)\, ds \\
&\le \mu_Q \int_0^{+\infty} (e^{sA} x)^* (e^{sA} x)\, ds = \mu_Q \int_0^{+\infty} |e^{sA} x|_2^2\, ds \\
&\le \mu_Q \Big( \int_0^{+\infty} |e^{sA}|^2\, ds \Big) |x|_2^2
\end{aligned}$$

where $\mu_Q$ is the greatest eigenvalue of $Q$.

Therefore, by virtue of (4.5),

$$x^* \Big( \int_0^{+\infty} e^{sA^*} Q\, e^{sA}\, ds \Big) x \le \frac{\mu_Q\, \gamma^2}{2\mu}\, |x|_2^2$$

This means that we can define

$$\Lambda = \int_0^{+\infty} e^{sA^*} Q\, e^{sA}\, ds \tag{4.6}$$

and we have $\Lambda = \Lambda^* > 0$ by virtue of (4.2) and

$$A^* \Lambda + \Lambda A = \int_0^{+\infty} \frac{d}{ds}\big[ e^{sA^*} Q\, e^{sA} \big]\, ds = -Q$$

i.e. $\Lambda$ is a solution of $(\Lambda)$.
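
For a concrete feel of Definition 4.1 and of the bound (4.4), the following sketch solves the Liapunov matrix equation $(\Lambda)$ for a real $2 \times 2$ matrix by writing it as a $3 \times 3$ linear system in the entries of the symmetric unknown. The matrix `A`, the choice $Q = I$ and the function names are assumptions for illustration; solving by Cramer's rule is a convenience of this sketch, not the construction (4.6) used in the proof.

```python
import math

def det3(M):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def solve_liapunov_2x2(A, Q):
    """Solve A^T L + L A = -Q for a symmetric L = [[a, b], [b, c]] (real 2 x 2 case)."""
    (a11, a12), (a21, a22) = A
    # Entries (1,1), (1,2), (2,2) of A^T L + L A as linear forms in (a, b, c)
    M = [[2 * a11, 2 * a21, 0.0],
         [a12, a11 + a22, a21],
         [0.0, 2 * a12, 2 * a22]]
    rhs = [-Q[0][0], -Q[0][1], -Q[1][1]]
    d = det3(M)
    sol = []
    for col in range(3):
        # Cramer's rule: replace one column of M by the right-hand side
        Mc = [[rhs[i] if j == col else M[i][j] for j in range(3)] for i in range(3)]
        sol.append(det3(Mc) / d)
    a, b, c = sol
    return [[a, b], [b, c]]

def sym_eigenvalues(L):
    """Least and greatest eigenvalue of a symmetric 2 x 2 matrix."""
    tr = L[0][0] + L[1][1]
    disc = math.sqrt((L[0][0] - L[1][1]) ** 2 + 4 * L[0][1] ** 2)
    return (tr - disc) / 2, (tr + disc) / 2

A = [[-1.0, 1.0], [0.0, -2.0]]   # illustrative matrix with eigenvalues -1 and -2
Q = [[1.0, 0.0], [0.0, 1.0]]     # Q = I, so lambda_Q = 1

Lam = solve_liapunov_2x2(A, Q)
lam_min, lam_max = sym_eigenvalues(Lam)
rate_bound = 1.0 / (2.0 * lam_max)   # right-hand side of (4.4) with lambda_Q = 1
print(Lam, lam_min, lam_max, rate_bound)
```

With these data the solution is $\Lambda = \begin{pmatrix} 1/2 & 1/6 \\ 1/6 & 1/3 \end{pmatrix}$, which is positive definite, and (4.4) guarantees decay rates up to $6/(5 + \sqrt{5}) \approx 0.83$, slightly below the true rate 1 given by the eigenvalue of $A$ closest to the imaginary axis.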

When $A$ does depend on $t$, it is no longer reasonable to assume that the $\Lambda$ and the $Q$ in the definition of a Liapunov function are constant. Therefore we have

Definition 4.2

We say that $t \to A(t)$ has a Liapunov function iff for every $\theta > \alpha$ there are two $n \times n$ matrix-valued functions

$$t \to \Lambda_\theta(t), \quad t \to Q_\theta(t)$$

such that

$$\Lambda_\theta(t) = \Lambda_\theta^*(t) \tag{4.7}$$

$$\alpha(\theta)\, I \le \Lambda_\theta(t) \le \beta(\theta)\, I, \quad \theta \le t \tag{4.8}$$

for some $\alpha(\theta) > 0$, $\beta(\theta) \ge \alpha(\theta)$;

$$Q_\theta(t) = Q_\theta^*(t) \tag{4.9}$$

$$\gamma(\theta)\, I \le Q_\theta(t) \le \delta(\theta)\, I, \quad \theta \le t \tag{4.10}$$

for some $\gamma(\theta) > 0$, $\delta(\theta) \ge \gamma(\theta)$, and

$$\dot{\Lambda}_\theta(t) + A^*(t)\, \Lambda_\theta(t) + \Lambda_\theta(t)\, A(t) = -Q_\theta(t), \quad \theta \le t \tag{$\Lambda_\theta$}$$

Along the same lines followed to prove Theorem 4.1 this can be extended into

Theorem 4.3

If $A : t \to A(t)$ has a Liapunov function $x \to x^* \Lambda_\theta(t)\, x$ then $G$ has the property EAS.

The proof shows that the rate of exponential decay of $|G|$ is any $\mu(\theta)$ such that

$$\mu(\theta) < \frac{\gamma(\theta)}{2\beta(\theta)} \tag{4.11}$$

A partial converse of Theorem 4.3 can also be proved to extend Theorem 4.2, namely

Theorem 4.4

If $A$ is bounded and $G$ has the property EAS then $A$ admits a Liapunov function with

(4.12)

5. AFFINE CONTROL SYSTEMS. BIBS STABILITY

Let us denote by $t \to f(t)$ an $n$-vector function of $t \in J$, measurable and locally integrable on $J$.

For each $\theta \in J$, $x \in \mathbb{R}^n$, there is a unique (Carathéodory) solution of the affine ordinary differential equation

$$\dot{x} - A(t)\, x = f(t) \tag{$E_f$}$$

such that

$$x(\theta) = x \tag{C}$$

i.e. there is a unique $n$-vector function $t \to x(t)$, satisfying (C), locally absolutely continuous on $J$ and such that

$$dx(t)/dt - A(t)\, x(t) = f(t), \quad \text{a.e. } t \in J$$

This solution is represented by the Lagrange formula

$$x(t) = G(t, \theta)\, x + \int_\theta^t G(t, s)\, f(s)\, ds \tag{L}$$

Let us now consider a family of affine ordinary differential equations, namely

$$\dot{x} - A(t)\, x = B(t)\, u(t) \tag{U}$$

where $t \to u(t)$ is an $m$-vector function of $t \in J$ belonging to a given set $U \subset L^\infty_{\mathrm{loc}}(J)$. This means that each $u \in U$ is measurable and (essentially) bounded on every finite interval $\subset J$. Consequently, $B(t)$ is an $n \times m$ matrix and we shall assume that $B \in L^1_{\mathrm{loc}}(J)$.

The family (U) of affine ordinary differential equations, depending on the index $u \in U$, is an affine control system.

A solution of (U) depends on $u$, as well as on $t$; it also depends on $\theta$ and $x$, and using the Lagrange formula we shall write

$$x(t, \theta, x, u) = G(t, \theta)\, x + \int_\theta^t G(t, s)\, B(s)\, u(s)\, ds \tag{X}$$

Using the terminology of Control Theory the $n$-vector $x(t, \theta, x, u)$ is the state of the system, the $m$-vector function $t \to u(t)$ is the control (or steering) function or input, the $n$-vector function $t \to x(t, \theta, x, u)$ is the response to $u$. The variable $t$ is usually interpreted as time.

If $u$ is bounded as $t \to +\infty$ the corresponding response need not be bounded. We want to characterize such control systems for which bounded inputs yield bounded states.

Definition 5.1

The control system (U) is bounded-input bounded-state (BIBS) stable as $t \to +\infty$ iff: a) $G$ has the property US, i.e. for every $\theta > \alpha$ there is $\gamma(\theta) > 0$ such that

$$\theta \le \tau \le t \ \Rightarrow\ |G(t, \tau)| < \gamma(\theta) \tag{5.1}$$

b) for each $u$ measurable and bounded on $[\theta, +\infty[$ there is some $k(\theta, u) > 0$ such that

$$\theta \le t \ \Rightarrow\ \Big| \int_\theta^t G(t, s)\, B(s)\, u(s)\, ds \Big|_2 < k(\theta, u) \tag{5.2}$$

From (5.1), (5.2) and the Lagrange formula (X) it follows

$$\theta \le t \ \Rightarrow\ |x(t, \theta, x, u)|_2 < \gamma(\theta)\, |x|_2 + k(\theta, u)$$

i.e. to a bounded input $u$ there corresponds a bounded state $x$.

It is easy to prove:

Theorem 5.1

The control system (U) is BIBS stable if: a) $G$ has the property US; b') for every $\theta > \alpha$ there exists $k_B(\theta) > 0$ such that

$$\theta \le t \;\Rightarrow\; \int_\theta^t |G(t,s)\,B(s)|\,ds < k_B(\theta) \tag{5.3}$$

Proof. Since b') clearly implies b), a) + b') implies BIBS stability.

Since

$$\int_\theta^t |G(t,s)\,B(s)|\,ds \le \int_\theta^t |G(t,s)|\,|B(s)|\,ds$$

it follows that if $B$ is bounded on $[\theta, +\infty[$ for every $\theta > \alpha$ then the property US plus the property IS are sufficient for BIBS stability. On the other hand, the simultaneous validity of US and IS is equivalent to property EAS. Therefore

Theorem 5.2

Let $B$ be bounded on $[\theta, +\infty[$ for each $\theta > \alpha$. Then (U) is BIBS stable if $G$ has the property EAS.

The converse is not true as is shown by taking $A = 0$, $B = 0$.
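As a hedged illustration (the scalar data below are our own choice, not taken from the lecture), condition (5.3) can be checked by hand for $A = -1$, $B = 1$ and compared with the counterexample $A = 0$, $B = 0$ mentioned above:

```latex
% Illustrative scalar case (assumed data): A = -1, B = 1, so G(t,s) = e^{-(t-s)}.
\[
  |G(t,s)| = e^{-(t-s)} \le 1 \quad (\theta \le s \le t),
  \qquad
  \int_\theta^t |G(t,s)\,B(s)|\,ds = 1 - e^{-(t-\theta)} < 1 ,
\]
% so (5.1) holds with \gamma(\theta) = 2 and (5.3) holds with k_B(\theta) = 1.
% Counterexample to the converse of Theorem 5.2: A = 0, B = 0 gives G(t,s) = 1 and
\[
  \int_\theta^t |G(t,s)\,B(s)|\,ds = 0 ,
\]
% so the system is BIBS stable although G does not decay, i.e. EAS fails.
```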


6. STABILIZABILITY, I

We start with

Definition 6.1

We say that the control system

$$\dot{x} - A(t)\,x = B(t)\,u(t) \tag{U}$$

is stabilizable (to zero) iff there is an $m \times n$ matrix-valued function $F : t \mapsto F(t)$ of $t \in J$, such that $BF$ is measurable and locally integrable on $J$ and such that

$$\dot{y} - (A(t) + B(t)\,F(t))\,y = 0 \tag{F}$$

is asymptotically stable.

If (F) is exponentially asymptotically stable we say that (U) is exponentially stabilizable.


To explain the meaning of this Definition let us denote by $y_x$ the solution of (F) such that

$$y_x(\theta) = x$$

for a given $\theta \in J$ and a given $x \in \mathbb{R}^n$. Then we have $y_x(t) \to 0$ as $t \to +\infty$, but since

$$y_x(t) = G(t,\theta)\,x + \int_\theta^t G(t,s)\,B(s)\,F(s)\,y_x(s)\,ds$$

this means that also the solution of (U) corresponding to $u = F y_x$, i.e. $t \mapsto x(t,\theta,x,F y_x)$, tends to zero as $t \to +\infty$. In other words, if (U) is stabilizable it is possible to bring the state $x$ of the system from any initial position $x$ to zero in an infinite time.

In what follows we shall give conditions for stabilizability.

We shall presently start with the case of constant $A$ and $B$. Then it makes sense to look for constant $F$ such that (F) is (exponentially) stable.

A condition for exponential stabilizability arises from the solution of the so-called "regulator problem". For each initial state $x$ one looks for a control $u$ such that the "cost" function

$$C(u) = \int_0^{+\infty} \left[ x^* L x + x^* M u + u^* M^* x + u^* N u \right] dt$$

is minimum. Here $x = x(t,0,x,u)$ while $L$, $M$, $N$ are three matrices, respectively of types $n \times n$, $n \times m$ and $m \times m$, such that

$$L = L^*, \qquad N = N^* \tag{6.1}$$

The current assumption is

$$\begin{pmatrix} L & M \\ M^* & N \end{pmatrix} > 0 \tag{6.2}$$

which means that for $x \in \mathbb{R}^n$, $u \in \mathbb{R}^m$

$$x^* L x + x^* M u + u^* M^* x + u^* N u > a\,(x^* x + u^* u)$$

for some $a > 0$. From this follows ($u = 0$)

$$L > 0$$

and ($x = 0$)

$$N > 0$$

so that $N^{-1}$ exists, and also that ($u = -N^{-1} M^* x$)

$$L - M N^{-1} M^* > 0$$

For a given triplet $L$, $M$, $N$ satisfying (6.1) and (6.2) we associate with the control system (U) the Kalman matrix equation

$$(L - M N^{-1} M^*) + (A - B N^{-1} M^*)^* \Lambda + \Lambda (A - B N^{-1} M^*) - \Lambda (B N^{-1} B^*) \Lambda = 0 \tag{6.3}$$

When $B = 0$, (U) reduces to ($E_0$) and (6.3) reduces to the Liapunov equation ($\Lambda$) with

$$Q = L - M N^{-1} M^* > 0 \tag{6.4}$$

Assume that (6.3) has a solution

$$\Lambda_\infty = \Lambda_\infty^* > 0$$

If we take

(6.5)

(6.3) with $\Lambda = \Lambda_\infty$ can be written

$$Q + (A + BF)^* \Lambda_\infty + \Lambda_\infty (A + BF) = 0 \tag{6.6}$$

so that $\Lambda_\infty$ is also a solution of the Liapunov equation associated to $A + BF$, which means (Theorem 4.1) that (F) is exponentially stable. Therefore we have

Theorem 6.1

Let $A$ and $B$ be constant. Then (U) is exponentially stabilizable if for some triplet $L$, $M$, $N$ of matrices satisfying (6.1), (6.2) the Kalman equation (6.3) has a solution $\Lambda_\infty = \Lambda_\infty^* > 0$. In this case a stabilizing $F$ is defined by (6.5).

Remark 6.1

According to Remark 4.1 the exponential decay of the solutions of (F) with $F$ defined by (6.5) is not faster than

$$e^{\,\cdots}$$

where $\lambda_Q$ is the least eigenvalue of $Q = L - M N^{-1} M^*$ and $\mu_\Lambda$ is the greatest eigenvalue of $\Lambda_\infty$.

Remark 6.2

The last assertion of Theorem 6.1 needs a comment. In fact, if $\Lambda_\infty = \Lambda_\infty^* > 0$ is a solution of (6.3) we have

$$Q_\sigma = L - M N^{-1} M^* + \sigma\,\Lambda_\infty (B N^{-1} B^*) \Lambda_\infty = Q_\sigma^* > 0 \tag{6.7}$$

for all $\sigma \ge 0$. Therefore, replacing (6.5) by

(6.8)

and (6.6) by

$$Q_\sigma + (A + B F_\sigma)^* \Lambda_\infty + \Lambda_\infty (A + B F_\sigma) = 0$$

we still have a Liapunov equation, and we conclude that not only $F = F_0$, but also $F_\sigma$, $\sigma \ge 0$, are stabilizing matrices.

Remark 6.3

Theorem 6.1 has an inverse (D.L. Lukes) in the sense that if (U) is stabilizable then for every triplet $L$, $M$, $N$ satisfying (6.1) and (6.2), the Kalman equation (6.3) must have a unique solution $\Lambda = \Lambda^* > 0$.

The stabilizability criterion represented by Theorem 6.1 can be extended to the case of time-dependent $A$ and $B$ by using Theorem 4.3 instead of Theorem 4.1.

In this case, $L$, $M$, $N$ are also time-dependent, and the Kalman matrix equation is replaced by a Riccati matrix differential equation

$$\dot{\Lambda} + \left(L(t) - M(t) N^{-1}(t) M^*(t)\right) + \left(A(t) - B(t) N^{-1}(t) M^*(t)\right)^* \Lambda + \Lambda \left(A(t) - B(t) N^{-1}(t) M^*(t)\right) - \Lambda \left(B(t) N^{-1}(t) B^*(t)\right) \Lambda = 0 \tag{6.9}$$

and $F$ is replaced by

(6.10)

where $\Lambda$ is an appropriate solution of (6.9).

7. STABILIZABILITY, II

Let us now consider another criterion of stabilizability for the case of constant $A$ and $B$ (D.L. Lukes, W.A. Coppel).

Theorem 7.1

Let $A$ and $B$ be constant. Let there exist some $T > 0$ such that

$$\Omega = \int_0^T e^{-tA}\, B B^*\, e^{-tA^*}\, dt > 0 \tag{7.1}$$

Then if

$$F = -B^* \Omega^{-1} \tag{7.2}$$

the equation

$$\dot{x} - (A + BF)\,x = 0 \tag{F}$$

is exponentially stable.

Proof. From (7.1) we have

$$A\Omega + \Omega A^* = -\int_0^T \frac{d}{dt}\left( e^{-tA} B B^* e^{-tA^*} \right) dt = -e^{-TA} B B^* e^{-TA^*} + B B^*$$

hence, by (7.2),

$$(A + BF)\,\Omega + \Omega\,(A + BF)^* = -\left( e^{-TA} B B^* e^{-TA^*} + B B^* \right) \le 0$$

Let $\lambda$ be an eigenvalue of $(A + BF)^*$ and $\xi$ a corresponding eigenvector,

$$(A + BF)^* \xi = \lambda \xi$$

We have

$$(\lambda + \bar{\lambda})\, \xi^* \Omega \xi = \xi^* \left[ (A + BF)\,\Omega + \Omega\,(A + BF)^* \right] \xi = -\xi^* \left( e^{-TA} B B^* e^{-TA^*} + B B^* \right) \xi \le 0$$

Since $\xi^* \Omega \xi > 0$ it follows $\mathrm{Re}\,\lambda \le 0$. Moreover, if the equality holds then $\xi^* B B^* \xi = 0$, i.e. $|B^* \xi| = 0$, i.e. $B^* \xi = 0$, hence $(A + BF)^* \xi = A^* \xi = \lambda \xi$ and therefore

$$e^{-tA^*} \xi = e^{-\lambda t}\, \xi$$

which gives

$$\Omega \xi = \int_0^T e^{-tA} B B^* \xi\, e^{-\lambda t}\, dt = 0$$

and this contradicts (7.1) since $\xi \ne 0$.
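To make Theorem 7.1 concrete, here is a minimal Python sketch of the scalar case $n = m = 1$. The function names and the numerical values are illustrative assumptions, not part of the lecture. For scalars $a$, $b$ the integral (7.1) has the closed form $\Omega = b^2 (1 - e^{-2aT})/(2a)$, and the closed-loop coefficient $a + bF$ with $F = -b/\Omega$ simplifies to $-a \coth(aT)$, which is negative whether $a$ is positive or negative.

```python
import math

def omega(a, b, T):
    """Scalar form of (7.1): integral over [0, T] of exp(-t*a) * b * b * exp(-t*a)."""
    if a == 0.0:
        return b * b * T
    return b * b * (1.0 - math.exp(-2.0 * a * T)) / (2.0 * a)

def closed_loop_pole(a, b, T):
    """Return a + b*F with F = -b / omega, the scalar form of (7.2)."""
    F = -b / omega(a, b, T)
    return a + b * F

if __name__ == "__main__":
    # Hypothetical test values: an unstable, a stable and a neutral open loop.
    for a in (1.0, -0.5, 0.0):
        print(a, closed_loop_pole(a, b=2.0, T=1.0))
```

In this scalar setting the result does not depend on $b \ne 0$, and every $T > 0$ gives a stabilizing gain.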

Remark 7.1

If (7.1) holds for some $T > 0$ then it holds for every $T > 0$. In fact, let there exist some $T > 0$, $\xi \ne 0$ such that

$$0 = \xi^* \left( \int_0^T e^{-tA} B B^* e^{-tA^*}\, dt \right) \xi = \int_0^T \left| B^* e^{-tA^*} \xi \right|^2 dt$$

This means

$$B^* e^{-tA^*} \xi = 0, \qquad t \in [0, T]$$

and, since $t \mapsto B^* e^{-tA^*} \xi$ is an analytic function, we have

$$B^* e^{-tA^*} \xi = 0, \qquad t \in \mathbb{R}$$

It follows that if (7.1) holds for some $T > 0$ then we can define $\Omega = \Omega(T)$, hence $F = F(T)$ for every $T > 0$, so that we have a family, depending on $T > 0$, of stabilizing matrices.

It can be shown that (7.1) is, in fact, equivalent to the existence of a matrix $F$ such that $A + BF$ has any prescribed set of eigenvalues (C. Langenhop, W.M. Wonham).

In other words, (7.1) is equivalent to the property that the "closed-loop transfer matrix"

$$p \mapsto \left[ (A + BF) - pI \right]^{-1} B$$

can be assigned an arbitrary set of poles by a suitable choice of the "feedback gain matrix" $F$.

On the other hand, (7.1) is equivalent to the complete controllability of the control system (U). This means that (7.1) is a necessary and sufficient condition in order that the state $x$ of the system can be transferred from any initial position at $t = 0$ to any final position at $t = T$.

Remark 7.1 means that if this is possible for some $T > 0$ then it is also possible for any $T > 0$.

This remark leads to the notion of "uniform complete controllability" for time-dependent control systems as a sufficient condition to ensure a very strong kind of stabilizability for such systems.

8. UNIFORM COMPLETE CONTROLLABILITY

To define the notion of uniform complete controllability we start by considering the $n \times n$ matrix

$$H(\tau, T) = \int_\tau^T G(\tau, t)\, B(t)\, B^*(t)\, G^*(\tau, t)\, dt \tag{H}$$

which reduces to $\Omega$ of Section 7 when $\tau = 0$ and $A$ and $B$ are constant. Of course, integrability of $B$ is not enough for the existence of the above integral, so we have to assume that $B \in L^2_{\mathrm{loc}}(J)$, i.e. the elements of $B$ are locally square integrable functions.

We shall also consider the $n \times n$ matrix

$$K(\tau, T) = \int_\tau^T G(T, t)\, B(t)\, B^*(t)\, G^*(T, t)\, dt$$

Clearly,

$$H(\tau, T) = H^*(\tau, T) \ge 0, \qquad K(\tau, T) = K^*(\tau, T) \ge 0$$

These inequalities are (both) strict if and only if the control system (U) is completely controllable on $[\tau, T]$, i.e. iff for every pair $v, w \in \mathbb{R}^n$, there is a control $u_{vw} \in L^\infty([\tau, T])$ such that

$$x(\tau, \tau, v, u_{vw}) = v, \qquad x(T, \tau, v, u_{vw}) = w \tag{8.1}$$

It is readily verified that we can take, for instance

$$u_{vw}(t) = B^*(t)\, G^*(\tau, t)\, H^{-1}(\tau, T)\, [G(\tau, T)\, w - v] = B^*(t)\, G^*(T, t)\, K^{-1}(\tau, T)\, [w - G(T, \tau)\, v] \tag{8.2}$$

Among the control functions effecting the transfer from $v$ to $w$ the one defined by (8.2) has an important property. In fact, if $u$ is another control function satisfying (8.1) we have

$$\int_\tau^T G(T, t)\, B(t)\, [u(t) - u_{vw}(t)]\, dt = 0$$

hence

$$\int_\tau^T u_{vw}^*(t)\, [u(t) - u_{vw}(t)]\, dt = 0$$

Therefore

$$\int_\tau^T |u(t)|^2\, dt = \int_\tau^T |u_{vw}(t)|^2\, dt + \int_\tau^T |u(t) - u_{vw}(t)|^2\, dt \ge \int_\tau^T |u_{vw}(t)|^2\, dt$$

This means that $u_{vw}$ transfers the state of the system from $v$ at $t = \tau$ to $w$ at $t = T$ at the expense of a minimum amount of energy, namely

$$\int_\tau^T |u_{vw}(t)|^2\, dt = [G(\tau, T)\, w - v]^*\, H^{-1}(\tau, T)\, [G(\tau, T)\, w - v] = [w - G(T, \tau)\, v]^*\, K^{-1}(\tau, T)\, [w - G(T, \tau)\, v] \tag{8.3}$$

Let us now consider the following example. Let the scalar control system be defined by

$$\dot{x} + t\,x = \sqrt{2}\,(t - 1)^{1/2}\, e^{-t + 1/2}\, u(t)$$

with $t \in J = \,]1, +\infty[$. It is easy to see that

$$G(t, s) = e^{s^2/2 - t^2/2}$$

$$H(\tau, \tau + \sigma) = e^{2(\sigma - 1)\tau + (\sigma - 1)^2} - e^{-2\tau + 1}$$

$$K(\tau, \tau + \sigma) = e^{-2\tau - 2\sigma + 1} - e^{-2(\sigma + 1)\tau + 1 - \sigma^2}$$

According to (8.3) we see that the transfer from $v$ at $t = \tau$ to $w = 0$ at $t = \tau + \sigma$ takes a minimum amount of energy which $\to +\infty$ as $\tau \to +\infty$ if $\sigma < 1$, while $\to 0$ as $\tau \to +\infty$ if $\sigma > 1$. The transfer from $v = 0$ at $t = \tau$ to $w$ at $t = \tau + \sigma$ takes a minimum amount of energy which $\to +\infty$ as $\tau \to +\infty$, no matter what the length $\sigma$ of the time interval is.
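The closed forms of $H$ and $K$ in this example can be cross-checked numerically. The sketch below is only an illustration: the helper names, the Simpson quadrature and the sample values of $\tau$ and $\sigma$ are our own assumptions. It evaluates the defining integrals of $H(\tau, \tau+\sigma)$ and $K(\tau, \tau+\sigma)$ directly and compares them with the stated expressions.

```python
import math

def G(t, s):
    """Evolution operator of x' = -t*x: G(t, s) = exp(s^2/2 - t^2/2)."""
    return math.exp(0.5 * s * s - 0.5 * t * t)

def b(t):
    """Input coefficient of the example, defined for t > 1."""
    return math.sqrt(2.0 * (t - 1.0)) * math.exp(-t + 0.5)

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0

def H_numeric(tau, sigma):
    """Integral of (G(tau, t) * b(t))^2 over [tau, tau + sigma]."""
    return simpson(lambda t: (G(tau, t) * b(t)) ** 2, tau, tau + sigma)

def K_numeric(tau, sigma):
    """Integral of (G(tau + sigma, t) * b(t))^2 over [tau, tau + sigma]."""
    T = tau + sigma
    return simpson(lambda t: (G(T, t) * b(t)) ** 2, tau, T)

def H_closed(tau, sigma):
    return math.exp(2 * (sigma - 1) * tau + (sigma - 1) ** 2) - math.exp(-2 * tau + 1)

def K_closed(tau, sigma):
    return math.exp(-2 * tau - 2 * sigma + 1) - math.exp(-2 * (sigma + 1) * tau + 1 - sigma ** 2)

if __name__ == "__main__":
    for tau, sigma in ((2.0, 0.5), (2.0, 2.0), (5.0, 0.5)):
        print(tau, sigma, H_numeric(tau, sigma), H_closed(tau, sigma),
              K_numeric(tau, sigma), K_closed(tau, sigma))
```

Since the minimum energies in (8.3) are $v^2/H$ and $w^2/K$ in the scalar case, the trends described above can be read off from how `H_closed` and `K_closed` vary with `tau`.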

To avoid such unpleasant circumstances we need a special class of control systems, namely those which are called uniformly completely controllable, according to R. Kalman.


Definition 8.1

The control system (U) is said to be uniformly completely controllable (u.c.c.) iff there exist

$$\sigma > 0, \qquad 0 < h_1 \le h_2, \qquad 0 < k_1 \le k_2$$

such that for every $\tau > \alpha$

$$h_1 I \le H(\tau, \tau + \sigma) \le h_2 I \tag{8.4}$$

$$k_1 I \le K(\tau, \tau + \sigma) \le k_2 I \tag{8.5}$$

These conditions imply, respectively

$$h_2^{-1} I \le H^{-1}(\tau, \tau + \sigma) \le h_1^{-1} I$$

$$k_2^{-1} I \le K^{-1}(\tau, \tau + \sigma) \le k_1^{-1} I$$

so that, according to (8.3), the amount of energy spent for the transfer of $v$ to zero (or from zero to $w$) in a time interval $[\tau, \tau + \sigma]$ admits bounds that do not depend on the initial time $\tau$.

Remark 8.1

It can be shown that if (8.4), (8.5) hold for some $\sigma_0 > 0$, then they are valid for every $\sigma > \sigma_0$.

Remark 8.2

It can also be shown that if (8.4), (8.5) hold, then there exist

$$\rho \mapsto g_1(\rho), \qquad \rho \mapsto g_2(\rho), \qquad 0 < g_1(\rho) \le g_2(\rho)$$

such that

$$0 < g_1(|t - s|) \le |G(t, s)| \le g_2(|t - s|) \tag{8.6}$$

for $t, s > \alpha$, $|t - s| > \sigma$.

Since

$$K(\tau, \tau + \sigma) = G(\tau + \sigma, \tau)\, H(\tau, \tau + \sigma)\, G^*(\tau + \sigma, \tau), \qquad H(\tau, \tau + \sigma) = G(\tau, \tau + \sigma)\, K(\tau, \tau + \sigma)\, G^*(\tau, \tau + \sigma) \tag{8.7}$$

it follows that if any two of the relations (8.4), (8.5), (8.6) hold, the remaining relation is also true.

Remark 8.3

(8.4) can be written ($x \in \mathbb{R}^n$)

$$0 < h_1 |x|_2^2 \le x^* H(\tau, \tau + \sigma)\, x \le h_2 |x|_2^2$$

so that if we take $x = G^*(\tau + \sigma, \tau)\, \eta$ we have

$$h_1 |x|_2^2 \le \eta^* K(\tau, \tau + \sigma)\, \eta \le h_2 |x|_2^2$$

On the other hand

$$|x|_2^2 \le |G(\tau + \sigma, \tau)|^2\, |\eta|_2^2$$

and since $\eta = G^*(\tau, \tau + \sigma)\, x$ we also have

$$|G(\tau, \tau + \sigma)|^{-2}\, |\eta|_2^2 \le |x|_2^2$$

Therefore from (8.4) we have

$$h_1\, |G(\tau, \tau + \sigma)|^{-2}\, |\eta|_2^2 \le \eta^* K(\tau, \tau + \sigma)\, \eta \le h_2\, |G(\tau + \sigma, \tau)|^2\, |\eta|_2^2 \tag{8.8}$$

which, in general, does not imply (8.5) because the bounds, left and right, will depend on $\tau$.

However, if $A$ is bounded, i.e., if

$$|A(t)| < a, \qquad \alpha < t \tag{8.9}$$

for some $a > 0$, then, recalling (1.13), we have

$$|G(\tau + \sigma, \tau)| \le e^{a\sigma}, \qquad |G(\tau, \tau + \sigma)| \le e^{a\sigma}$$

and from (8.8) we have (8.5) with $h_1$, $h_2$ replaced by $h_1 e^{-2a\sigma}$, $h_2 e^{2a\sigma}$, respectively.

Therefore when $A$ is bounded we can suppress (8.5) from the definition of u.c.c.

We have further

$$x^* H(\tau, \tau + \sigma)\, x = \int_\tau^{\tau + \sigma} |B^*(t)\, G^*(\tau, t)\, x|_2^2\, dt \le |x|_2^2 \int_\tau^{\tau + \sigma} |G(\tau, t)|^2\, |B(t)|^2\, dt$$

and since, from (1.13) again:

$$|G(\tau, t)| \le e^{a\sigma}, \qquad \tau \le t \le \tau + \sigma$$

it follows, when $A$ is bounded

$$x^* H(\tau, \tau + \sigma)\, x \le e^{2a\sigma}\, |x|_2^2 \int_\tau^{\tau + \sigma} |B(t)|^2\, dt$$

Therefore, if not only $A$ but also $B$ is bounded on $J$, the second inequality of (8.4) can be suppressed and we conclude that when both $A$ and $B$ are bounded the control system (U) is u.c.c. if and only if there are some $\sigma > 0$, $h_1 > 0$ such that

$$h_1 I \le H(\tau, \tau + \sigma) \quad \text{for all } \tau \tag{8.10}$$

In particular, when $A$ and $B$ are constant, since $G(t, s) = e^{(t - s)A}$ we have $H(\tau, \tau + \sigma) = H(0, \sigma)$ so that (8.10) will reduce simply to

$$H(0, \sigma) > 0$$

i.e. to complete controllability.

9. UNIFORM COMPLETE STABILIZABILITY

M. Ikeda, H. Maeda and S. Kodama, continuing Kalman's work, have recently introduced the notion of uniformly completely stabilizable control systems.

Definition 9.1

The control system (U) is uniformly completely stabilizable (u.c.s.) iff for every $\mu > 0$ there is a matrix $F_\mu$ such that

$$\dot{x} - (A(t) + B(t)\, F_\mu(t))\, x = 0 \tag{$F_\mu$}$$

is exponentially stable with exponent $> \mu$.

It can be proved

Theorem 9.1

If (U) is u.c.c. then it is u.c.s.

u.c.c. serves to prove the existence for every $\mu > 0$ of a solution $\Lambda_\mu$ of the Riccati equation

$$\dot{\Lambda} + (A(t) + \mu I)^* \Lambda + \Lambda (A(t) + \mu I) - \Lambda B(t) B^*(t) \Lambda = -I$$

such that

$$\Lambda_\mu(t) = \Lambda_\mu^*(t)$$

$$\alpha I \le \Lambda_\mu(t) \le \beta I, \qquad t \in J$$

for some $\alpha > 0$, $\beta \ge \alpha$, independent of $t$.

Then, according to Definition 4.2, $\Lambda_\mu$ is a Liapunov function for

$$\dot{x} - (A(t) + \mu I + B(t)\, F_\mu(t))\, x = 0$$

where

$$F_\mu(t) = -\tfrac{1}{2}\, B^*(t)\, \Lambda_\mu(t)$$

From this follows the exponential stability of this equation, hence that of ($F_\mu$) with an exponent $> \mu$.

Theorem 9.1 has a partial converse, i.e.:

Theorem 9.2

Let $A$ and $B$ be bounded. Then (U) is u.c.c. if it is u.c.s. by means of bounded $F_\mu$.

Proof. According to Remark 8.3, since $A$ and $B$ are bounded, if (U) is not u.c.c. then for every $\sigma > 0$ and every $\rho > 0$ there will exist some vector $\bar{x} = \bar{x}(\rho, \sigma) \ne 0$ and some $\bar{\tau} > \alpha$ such that

$$\bar{x}^* H(\bar{\tau}, \bar{\tau} + \sigma)\, \bar{x} < \rho\, |\bar{x}|_2^2$$

i.e. such that

$$\int_{\bar{\tau}}^{\bar{\tau} + \sigma} |B^*(s)\, G^*(\bar{\tau}, s)\, \bar{x}|_2^2\, ds < \rho\, |\bar{x}|_2^2$$

Hence, using the Schwarz inequality

$$\int_{\bar{\tau}}^{\bar{\tau} + \sigma} |B^*(s)\, G^*(\bar{\tau}, s)\, \bar{x}|_2\, ds \le \sqrt{\sigma \rho}\, |\bar{x}|_2 \tag{9.1}$$

On the other hand, for every $\tau \in J$, $x \in \mathbb{R}^n$, the solution $x_\mu$ of ($F_\mu$) such that $x_\mu(\tau) = x$ is given by

$$x_\mu(t) = G(t, \tau)\, x + \int_\tau^t G(t, s)\, B(s)\, F_\mu(s)\, x_\mu(s)\, ds$$

hence, denoting by $G_\mu$ the evolution matrix of $A + B F_\mu$, we have

$$G_\mu(t, \tau)\, x = G(t, \tau)\, x + \int_\tau^t G(t, s)\, B(s)\, F_\mu(s)\, G_\mu(s, \tau)\, x\, ds$$

Since $x$ is an arbitrary vector it follows

$$G_\mu(t, \tau) = G(t, \tau) + \int_\tau^t G(t, s)\, B(s)\, F_\mu(s)\, G_\mu(s, \tau)\, ds$$

whence

$$G_\mu^*(t, \tau)\, G^*(\tau, t) = I + \int_\tau^t G_\mu^*(s, \tau)\, F_\mu^*(s)\, B^*(s)\, G^*(\tau, s)\, ds$$

If we apply this to $\bar{x}$ for $\tau = \bar{\tau}$ we obtain

$$G_\mu^*(t, \bar{\tau})\, G^*(\bar{\tau}, t)\, \bar{x} = \bar{x} + \int_{\bar{\tau}}^t G_\mu^*(s, \bar{\tau})\, F_\mu^*(s)\, B^*(s)\, G^*(\bar{\tau}, s)\, \bar{x}\, ds$$

whence

$$|G_\mu^*(t, \bar{\tau})|\, |G^*(\bar{\tau}, t)|\, |\bar{x}|_2 \ge |\bar{x}|_2 - \int_{\bar{\tau}}^t |G_\mu^*(s, \bar{\tau})|\, |F_\mu(s)|\, |B^*(s)\, G^*(\bar{\tau}, s)\, \bar{x}|_2\, ds$$

and using (9.1), for $t = \bar{\tau} + \sigma$

$$|G_\mu(\bar{\tau} + \sigma, \bar{\tau})|\, |G(\bar{\tau}, \bar{\tau} + \sigma)| \ge 1 - \sqrt{\sigma \rho}\, \sup_{\bar{\tau} \le s \le \bar{\tau} + \sigma} |G_\mu(s, \bar{\tau})|\, |F_\mu(s)|$$

Now,

$$|G_\mu(\bar{\tau} + \sigma, \bar{\tau})| \le \gamma\, e^{-\mu \sigma}$$

$$|G(\bar{\tau}, \bar{\tau} + \sigma)| \le e^{a \sigma}$$

$$|F_\mu| \le f_\mu \quad \text{for some } f_\mu > 0$$

so that

$$\gamma\, e^{(-\mu + a)\sigma} \ge 1 - \sqrt{\sigma \rho}\, \gamma f_\mu$$

and if we take

$$\mu = 2a, \qquad \rho = (9 \gamma^2 f_\mu^2 \sigma)^{-1}$$

we have

$$\gamma\, e^{-a \sigma} \ge 2/3$$

which is a contradiction since we can take $\sigma$ arbitrarily large.

10. B*IBS STABILITY

In Section 5 we considered BIBS stability and we proved (Theorem 5.2) that

$$\text{property EAS} \;\Rightarrow\; \text{BIBS stability} \tag{10.1}$$

provided that $B$ is bounded on $[\theta, +\infty[$, i.e.

$$|B(t)| < \beta, \qquad \theta \le t \tag{10.2}$$

for some $\beta > 0$.

It is easy to see that the validity of (10.1) can be established under a condition different from (10.2), like

$$\int_\theta^{+\infty} |B(s)|^2\, ds < \beta(\theta) \tag{10.3}$$

for some $\beta(\theta) > 0$.

In fact, to prove (10.1) we have to prove

$$\theta \le t \;\Rightarrow\; \int_\theta^t |G(t, s)\, B(s)|\, ds < k_B(\theta) \tag{5.3}$$

Since EAS is equivalent to US plus IS we have

$$\theta \le t \;\Rightarrow\; \int_\theta^t |G(t, s)|^2\, ds \le \gamma(\theta) \int_\theta^t |G(t, s)|\, ds < \gamma(\theta)\, k(\theta)$$

hence

$$\int_\theta^t |G(t, s)\, B(s)|\, ds \le \left( \int_\theta^t |G(t, s)|^2\, ds \right)^{1/2} \left( \int_\theta^t |B(s)|^2\, ds \right)^{1/2} \le \left[ \gamma(\theta)\, k(\theta)\, \beta(\theta) \right]^{1/2}$$

B.D.O. Anderson and J.B. Moore replaced BIBS stability by B*IBS stability. This is obtained by requiring that (5.2) in Definition 5.1 is satisfied not only for all bounded $u$, but also for all the bounded* $u$, which means that there are

$$\sigma_u > 0, \qquad \omega_u > 0$$

such that

$$\int_\tau^{\tau + \sigma_u} |u(t)|^2\, dt \le \omega_u \quad \text{for every } \tau \tag{10.4}$$

Therefore

$$\text{B*IBS stability} \;\Rightarrow\; \text{BIBS stability}$$

but the converse is not true.

Anderson and Moore have studied the relationship between B*IBS stability and the property EAS and found that they are equivalent provided that (U) is u.c.c. We shall now prove the first half of this equivalence, namely

Theorem 10.1

Let (U) be u.c.c. Then if $G$ has the property EAS, (U) is B*IBS stable.

Proof. Let $\theta \le t$ and take an integer $k = k(t, \theta)$ such that

$$k \le \frac{t - \theta}{\sigma} < k + 1 \tag{10.5}$$

where $\sigma$ is the one appearing in the Definition of u.c.c. Then we have

$$\begin{aligned}
\int_\theta^t G(t, s)\, B(s)\, u(s)\, ds ={}& G(t, \theta + \sigma) \int_\theta^{\theta + \sigma} G(\theta + \sigma, s)\, B(s)\, u(s)\, ds + \ldots \\
& \ldots + G(t, \theta + j\sigma) \int_{\theta + (j-1)\sigma}^{\theta + j\sigma} G(\theta + j\sigma, s)\, B(s)\, u(s)\, ds + \ldots \\
& \ldots + G(t, \theta + k\sigma) \int_{\theta + (k-1)\sigma}^{\theta + k\sigma} G(\theta + k\sigma, s)\, B(s)\, u(s)\, ds + \int_{\theta + k\sigma}^t G(t, s)\, B(s)\, u(s)\, ds
\end{aligned} \tag{10.6}$$

If $u$ is bounded* then (10.4) is valid also with $\sigma$ replacing $\sigma_u$ and we have for $j = 1, \ldots, k$

$$\left| \int_{\theta + (j-1)\sigma}^{\theta + j\sigma} G(\theta + j\sigma, s)\, B(s)\, u(s)\, ds \right|_2 \le \int_{\theta + (j-1)\sigma}^{\theta + j\sigma} |G(\theta + j\sigma, s)\, B(s)|\, |u(s)|_2\, ds \le \left( \int_{\theta + (j-1)\sigma}^{\theta + j\sigma} |G(\theta + j\sigma, s)\, B(s)|^2\, ds \right)^{1/2} \left( \int_{\theta + (j-1)\sigma}^{\theta + j\sigma} |u(s)|_2^2\, ds \right)^{1/2} \tag{10.7}$$

Also, because of (10.5),

$$\left| \int_{\theta + k\sigma}^t G(t, s)\, B(s)\, u(s)\, ds \right|_2 \le \int_{\theta + k\sigma}^t |G(t, s)\, B(s)|\, |u(s)|_2\, ds \le \int_{t - \sigma}^t |G(t, s)\, B(s)|\, |u(s)|_2\, ds \le \left( \int_{t - \sigma}^t |G(t, s)\, B(s)|^2\, ds \right)^{1/2} \left( \int_{t - \sigma}^t |u(s)|_2^2\, ds \right)^{1/2} \tag{10.8}$$

From the second half of (8.5) we have

$$\int_\tau^{\tau + \sigma} |G(\tau + \sigma, s)\, B(s)|^2\, ds \le n k_2 \tag{10.9}$$

From this and (10.7), (10.8) we have, respectively

$$\left| \int_{\theta + (j-1)\sigma}^{\theta + j\sigma} G(\theta + j\sigma, s)\, B(s)\, u(s)\, ds \right|_2 \le (n k_2 \omega_u)^{1/2}$$

$$\left| \int_{\theta + k\sigma}^t G(t, s)\, B(s)\, u(s)\, ds \right|_2 \le (n k_2 \omega_u)^{1/2}$$

From (10.6), for a bounded* $u$ it follows

$$\left| \int_\theta^t G(t, s)\, B(s)\, u(s)\, ds \right|_2 \le (n k_2 \omega_u)^{1/2} \left[ 1 + |G(t, \theta + k\sigma)| + \ldots + |G(t, \theta + j\sigma)| + \ldots + |G(t, \theta + \sigma)| \right] \tag{10.10}$$

From property EAS we have ($\gamma = \gamma(\theta) > 0$, $\mu = \mu(\theta) > 0$)

$$|G(t, \theta + j\sigma)| = |G(t, \theta + k\sigma)\, G(\theta + k\sigma, \theta + j\sigma)| \le |G(t, \theta + k\sigma)|\, |G(\theta + k\sigma, \theta + j\sigma)| \le \gamma\, |G(t, \theta + k\sigma)|\, (e^{-\mu\sigma})^{k - j} \le \gamma^2\, (e^{-\mu\sigma})^{k - j}, \qquad j = 1, \ldots, k$$

Hence

$$\left| \int_\theta^t G(t, s)\, B(s)\, u(s)\, ds \right|_2 \le (n k_2 \omega_u)^{1/2} \left[ 1 + \gamma^2 \sum_{j=1}^k (e^{-\mu\sigma})^{k - j} \right] < (n k_2 \omega_u)^{1/2} \left[ 1 + \gamma^2\, \frac{1}{1 - e^{-\mu\sigma}} \right]$$

i.e. (5.2).

Note that we used only a part of u.c.c.

BIBLIOGRAPHY

ANDERSON, B.D.O., MOORE, J.B., New results in linear system stability, SIAM J. Control 7 (1969) 398.

COPPEL, W.A., Matrix quadratic equations, Bull. Austral. Math. Soc. 10 (1974) 377.

IKEDA, M., MAEDA, H., KODAMA, S., Stabilization of linear systems, SIAM J. Control 10 (1972) 716.

KALMAN, R.E., Contributions to the theory of optimal control, Bol. Soc. Mat. Mexicana (2) 5 (1960) 102.

LUKES, D.L., Stabilizability and optimal control, Funkc. Ekv. 11 (1968) 39.

SILVERMAN, L.M., ANDERSON, B.D.O., Controllability, observability and stability of linear systems, SIAM J. Control 6 (1968) 121.

WONHAM, W.M., On pole assignment in multi-input controllable linear systems, IEEE Trans. Autom. Control AC-12 (1967) 600.

IAEA-SMR-17/71

CONTROLLABILITY OF NON-LINEAR CONTROL DYNAMICAL SYSTEMS

C . LOBRY

Université de Bordeaux I,

France

Abstract

CONTROLLABILITY OF NON-LINEAR CONTROL DYNAMICAL SYSTEMS.

In this paper, the results obtained in the problem of controllability of non-linear systems from 1970 to 1974 are reviewed. After introduction of the idea of a control dynamical system, the abstract theory and applications are treated in a concise way, references to the original sources replacing detailed proofs to a great extent.

INTRODUCTION

This is a review paper on the results obtained in the problem of controllability of non-linear systems from 1970 to 1974. Let us just remark that the ideas developed during the last five years were initiated by Hermann [18], Hermes [19], Kucera [36, 37] and Markus [48] and probably some other people. The idea of this survey is not to write the history of the subject but to present a set of references available to the reader.

In Chapter 1, we introduce the idea of control dynamical systems, which is a new formulation of what is currently called a non-linear control system. Chapter 2 is devoted to the abstract theory and Chapter 3 to applications. At the end, references to other topics of control theory where the use of these geometric tools seems to be useful are given.

1. CONTROL DYNAMIC SYSTEMS

Here, we introduce the concept of a control dynamical system. We hope that Examples 1.3 and 1.4 given at the end of the section will convince the reader that control dynamical systems are standard objects in nature.

1.1. Vector fields on a manifold M

A manifold is something like a "surface" but of a dimension not necessarily equal to 2 (see Milnor [50] for a rapid introduction to the subject). A vector field is a mapping

$$x \mapsto X(x)$$

which associates with $x$ a vector in the tangent space $T_x M$ to $M$ at the point $x$. Under very reasonable assumptions the Cauchy problem

$$\frac{dx}{dt} = X(x), \qquad x(0) = x_0$$

has a unique solution whose value at time $t$ is denoted by $X_t(x_0)$. When this value is defined for every $t$ we say that the vector field is complete. The smooth mapping

$$(t, x) \mapsto X_t(x)$$

defines a group action:

$$X_0(x) = x$$

$$X_{t_1 + t_2}(x) = X_{t_1} \circ X_{t_2}(x)$$

1.2. Control dynamical systems

Let us consider a family $\mathcal{D}$ of complete vector fields on the manifold $M$. If $\mathcal{D}$ reduces to one element we have a group action of $\mathbb{R}$ on $M$ which is defined by the integration of the differential equation

$$\frac{dx}{dt} = X(x)$$

Consider the collection of symbols of the form

$$(t_1, X^1)(t_2, X^2) \ldots (t_p, X^p); \qquad p \in \mathbb{N}, \; X^i \in \mathcal{D}, \; t_i \in \mathbb{R}$$

Take concatenation as the law of composition and adopt the two simplification rules:

i) if $X^i = X^{i+1}$ then replace $(t_i, X^i)(t_{i+1}, X^{i+1})$ by $(t_i + t_{i+1}, X^i)$

ii) suppress terms of the form $(0, X)$.

The set of all irreducible sequences is called the control group associated to $\mathcal{D}$ and denoted by $G(\mathcal{D})$. We define the group action by

$$(t_1, X^1)(t_2, X^2) \ldots (t_i, X^i) \ldots (t_p, X^p)\, x = X^1_{t_1} \circ X^2_{t_2} \circ \ldots \circ X^i_{t_i} \circ \ldots \circ X^p_{t_p}(x)$$

where the $X^i_{t_i}$ are defined in section 1.1. The verification that $G(\mathcal{D})$ is a group and that the above action is a group action is trivial (see Refs [42, 46]).
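The two simplification rules can be sketched in a few lines of Python. This is only an illustrative model under our own assumptions: a word is a list of `(time, label)` pairs, vector fields are represented by arbitrary hashable labels, and times are exact numbers (integers or fractions) so that the test for zero is reliable. The names `reduce_word`, `compose` and `inverse` are hypothetical.

```python
def reduce_word(word):
    """Reduce a word [(t_1, X1), ..., (t_p, Xp)] to its irreducible form.

    Rule i): adjacent terms with the same field are merged by adding times.
    Rule ii): terms with zero time are suppressed.
    A stack is used so that cancellations can cascade.
    """
    stack = []
    for t, field in word:
        if t == 0:
            continue  # rule ii)
        if stack and stack[-1][1] == field:
            total = stack[-1][0] + t  # rule i)
            stack.pop()
            if total != 0:
                stack.append((total, field))
        else:
            stack.append((t, field))
    return stack

def compose(w1, w2):
    """Group law: concatenation followed by reduction."""
    return reduce_word(list(w1) + list(w2))

def inverse(word):
    """Inverse word: reversed order with opposite times."""
    return [(-t, field) for t, field in reversed(word)]

if __name__ == "__main__":
    w = [(1, "X"), (2, "X"), (0, "Y"), (3, "X")]
    print(reduce_word(w))
    print(compose([(1, "X"), (2, "Y")], inverse([(1, "X"), (2, "Y")])))
```

The empty list plays the role of the identity element, and `compose(w, inverse(w))` reduces to it, which is the part of the group verification that the text calls trivial.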

1.2.1. Definition: A family $\mathcal{D}$ of vector fields on $M$ is called a "Control Dynamical System"; the (non-commutative) group $G(\mathcal{D})$ defined above is the associated control group. In $G(\mathcal{D})$ we consider the subset $G^+(\mathcal{D})$ of those sequences:

$$(t_1, X^1)(t_2, X^2) \ldots (t_i, X^i) \ldots (t_p, X^p)$$

for which the reals $t_1, t_2, \ldots, t_p$ are positive. The orbit $G(\mathcal{D}).x$ of a point is the set:

$$G(\mathcal{D}).x = \{ X^1_{t_1} \circ X^2_{t_2} \circ \ldots \circ X^i_{t_i} \circ \ldots \circ X^p_{t_p}(x); \; X^i \in \mathcal{D}, \; t_i \in \mathbb{R}, \; p \in \mathbb{N} \}$$

The positive orbit is the set

$$G^+(\mathcal{D}).x = \{ X^1_{t_1} \circ X^2_{t_2} \circ \ldots \circ X^i_{t_i} \circ \ldots \circ X^p_{t_p}(x); \; X^i \in \mathcal{D}, \; t_i \in \mathbb{R}^+, \; p \in \mathbb{N} \}$$

The positive orbit at time $t$ is

$$G_t(\mathcal{D}).x = \Big\{ X^1_{t_1} \circ X^2_{t_2} \circ \ldots \circ X^i_{t_i} \circ \ldots \circ X^p_{t_p}(x); \; X^i \in \mathcal{D}, \; t_i \in \mathbb{R}^+, \; \sum_{i=1}^p t_i = t, \; p \in \mathbb{N} \Big\}$$

The questions which we now want to answer are:

What is the structure of $G(\mathcal{D}).x$?

What is the structure of $G^+(\mathcal{D}).x$?

As we shall see, most of the essential questions are solved except two:

When is $G^+(\mathcal{D}).x$ or $G_t(\mathcal{D}).x$ closed?

When is $G^+(\mathcal{D}).x$ the whole manifold $M$?

By an answer to these questions we mean a "computable algorithm" in terms of the known data which gives the answer. Unfortunately (or fortunately), the two last questions are the most pertinent ones for control purposes.

FIG. 1. Electrical network as control dynamical system.

1.3. First example of a control dynamical system (taken from Ref. [3])

Consider the electrical network shown in Fig. 1. Here the switch is closed either on the right or on the left. What are the equations of the motion?

Now assume, for simplicity, that the constants of the network ($c_1$, $c_2$ and the inductances) are equal to 1; denoting $V_1$ by $x_1$, $V_2$ by $x_2$ and $i$ by $x_3$ we have:

For each position of the switch the velocity is orthogonal to the position vector, thus all the motions, whatever we are doing with the switch, are restricted to a sphere whose radius is the norm of the initial state. Thus, the network defines a family $\mathcal{D}$ of two vector fields on a sphere $S$. This is a control dynamical system. If $x^0 = (x_1^0, x_2^0, x_3^0)$ is an initial state then $G^+(\mathcal{D}).x^0$ is the set of all possible states reachable from $x^0$ when we control the switch from right to left and vice versa without any restriction.

It is also noted in Ref. [3] that in the problem of the rotation of a rigid body around its centre of mass we are dealing with a control dynamical system on the tangent bundle to the set of orthogonal matrices.
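The invariance of the sphere under arbitrary switching can be illustrated with a small Python sketch. The two vector fields used here are assumptions of ours, chosen only because they are linear and have velocity orthogonal to position as stated above: $\dot{x} = (x_3, 0, -x_1)$ for one switch position and $\dot{x} = (0, x_3, -x_2)$ for the other. They are not claimed to be the exact equations of the network in Fig. 1. Their flows are plane rotations, so they can be written in closed form.

```python
import math

def flow_left(t, x):
    """Flow of the assumed field x' = (x3, 0, -x1): a rotation in the (x1, x3) plane."""
    c, s = math.cos(t), math.sin(t)
    return (c * x[0] + s * x[2], x[1], -s * x[0] + c * x[2])

def flow_right(t, x):
    """Flow of the assumed field x' = (0, x3, -x2): a rotation in the (x2, x3) plane."""
    c, s = math.cos(t), math.sin(t)
    return (x[0], c * x[1] + s * x[2], -s * x[1] + c * x[2])

def apply_switching(schedule, x):
    """Apply the flows in chronological order; schedule is a list of (duration, side)."""
    for duration, side in schedule:
        x = flow_left(duration, x) if side == "left" else flow_right(duration, x)
    return x

def norm(x):
    return math.sqrt(sum(c * c for c in x))

if __name__ == "__main__":
    x0 = (1.0, 2.0, 2.0)
    schedule = [(0.7, "left"), (1.3, "right"), (0.2, "left"), (2.5, "right")]
    x1 = apply_switching(schedule, x0)
    print(norm(x0), norm(x1))
```

Every point produced by `apply_switching` with positive durations belongs to the positive orbit of `x0` for this assumed family, and its norm equals that of `x0` up to rounding.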

1.4. Second example of a control dynamical system

a) The sw itc h is on the le ft: b) The sw itch is on the r ig h t:

d V i _ 1 .d t Cx 1

dV2dt

d i _ _1__dt 2

dx

d T0

o rdt

dx.

dtx,2

Let $X^i$ ($i = 1, 2, \ldots, p$) and $X^0$ be $p + 1$ vector fields in $\mathbb{R}^n$ and consider the control system

$$\frac{dx}{dt} = X^0(x) + \sum_{i=1}^p u_i\, X^i(x)$$

and look for bang-bang (B.B.) controls, that is piecewise constant controls $t \mapsto u_i(t)$ with value $+1$ or $-1$. Denote by $A(t, x_0)$ the accessible set at time $t$ in the usual meaning of control theory and by $A(x_0) = \bigcup_t A(t, x_0)$ the whole accessible set. Then we have

$$A(t, x_0) = G_t(\mathcal{D}).x_0, \qquad A(x_0) = G^+(\mathcal{D}).x_0$$

where the family $\mathcal{D}$ is the family

$$\Big\{ X^0 + \sum_{i=1}^p \varepsilon_i X^i; \; \varepsilon_i = \pm 1 \Big\}$$

Actually, any system defined by

$$\frac{dx}{dt} = f(x, u), \qquad x \in \mathbb{R}^n, \; u \in \mathbb{R}^p$$

where the controls are piecewise smooth can be considered to be a control dynamical system [42].

2. GENERAL STRUCTURE OF ORBITS OF CONTROL DYNAMICAL SYSTEMS

In this section, we consider a control dynamical system defined by a collection of vector fields on an $n$-dimensional, connected, para-compact, smooth manifold $M$. For the sake of simplicity, they are supposed to be complete.

2.1. The rank of a system

If $X$ and $Y$ are two vector fields on a manifold $M$, we denote by $[X, Y]$ their bracket. We recall that in a local co-ordinate system one has

$$[X, Y](x) = DY(x)\, X(x) - DX(x)\, Y(x)$$

where $DX(x)$ and $DY(x)$ denote the Jacobian matrices of $X$ and $Y$ at point $x$, which reduces to:

$$[X, Y] = \frac{\partial Y}{\partial x}$$

if the vector field $X$ is $\partial / \partial x$. If $\mathcal{D}$ is a family of vector fields, then $[\mathcal{D}]^\infty$ denotes the smallest family, closed under bracket operation, which contains $\mathcal{D}$.
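The coordinate formula for the bracket can be tried out numerically. The sketch below is an illustration under our own assumptions: vector fields are Python functions returning lists, Jacobians are approximated by central differences (the step `h` is an arbitrary choice), and the helper names `jacobian` and `bracket` are hypothetical. It evaluates $[X, Y](x) = DY(x) X(x) - DX(x) Y(x)$ at a point.

```python
def jacobian(field, x, h=1e-6):
    """Central-difference approximation of the Jacobian matrix of `field` at x.

    Entry [i][j] approximates d field_i / d x_j.
    """
    n = len(x)
    columns = []
    for j in range(n):
        xp = list(x)
        xm = list(x)
        xp[j] += h
        xm[j] -= h
        fp = field(xp)
        fm = field(xm)
        columns.append([(fp[i] - fm[i]) / (2.0 * h) for i in range(n)])
    return [[columns[j][i] for j in range(n)] for i in range(n)]

def bracket(X, Y, x):
    """Evaluate [X, Y](x) = DY(x) X(x) - DX(x) Y(x)."""
    DX = jacobian(X, x)
    DY = jacobian(Y, x)
    Xx = X(x)
    Yx = Y(x)
    n = len(x)
    return [sum(DY[i][j] * Xx[j] - DX[i][j] * Yx[j] for j in range(n))
            for i in range(n)]

if __name__ == "__main__":
    # Illustrative fields on R^2: X = d/dx1 and Y = x1 * d/dx2, so [X, Y] = d/dx2.
    X = lambda x: [1.0, 0.0]
    Y = lambda x: [0.0, x[0]]
    print(bracket(X, Y, [0.3, -0.7]))
```

With $X = \partial/\partial x_1$ constant, the term $DX$ vanishes and the bracket is the derivative of $Y$ in the $x_1$ direction, which is the special case mentioned in the text.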

366 LOBRY

2 .1 .1 . D e f in it io n : T he ran k of a fa m ily o f v e c to r f ie ld s at p o in t x is the

d im e n s io n o f the l in e a r h u ll of the set of v e c to rs V(x) in T M w hen V ranges

o ve r W e denote i t by r {x ). W e have:

r ^ ( x ) = d im (á f({V (x ); V G 1 & > T } ) ) .

The importance of Jacobi brackets and of the rank of a system for the controllability of systems seems to have been noticed first by Hermann [17] and Hermes [19]. The systematic use of this tool started in 1970 (see Haynes-Hermes [16], Lobry [40]).

Notice that the rank is unchanged if we replace $[\mathcal{D}]^\infty$ by the algebra generated by $\mathcal{D}$.

2.1.2. A few remarks:

The rank $r_{\mathcal{D}}(x)$, when $\mathcal{D}$ has just one element, is always 0 or 1. As soon as we have more than one element in $\mathcal{D}$, the rank may be very large. For instance, in $\mathbb{R}^n$, the rank at 0 of the system defined by the two vector fields

$$X = \frac{\partial}{\partial x_1}, \qquad Y = \sum_{i=1}^{n} x_1^{\,i}\, \frac{\partial}{\partial x_i}$$

is equal to $n$, because one clearly has

$$\underbrace{[X, [X, \ldots, [X}_{p\ \text{times}}, Y]] \ldots ](0) = p!\, e_p$$

where $\{e_p\}$, $p = 1, \ldots, n$, is the canonical basis.
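The identity above can be verified by exact arithmetic. The sketch below assumes the pair of fields $X = \partial/\partial x_1$ and $Y = \sum_{i=1}^{n} x_1^i\, \partial/\partial x_i$ (one choice consistent with the stated result; the function names are illustrative). Because $X$ is constant and $Y$ depends on $x_1$ only, bracketing with $X$ amounts to differentiating each component of $Y$ with respect to $x_1$.

```python
from math import factorial
from typing import List

# A polynomial in x1 is stored as its coefficient list: poly[k] multiplies x1**k.
Poly = List[int]


def derivative(poly: Poly) -> Poly:
    """Derivative with respect to x1 of a polynomial given by its coefficients."""
    return [k * c for k, c in enumerate(poly)][1:] or [0]


def ad_X(Y: List[Poly]) -> List[Poly]:
    """[X, Y] for X = d/dx1 when every component of Y is a polynomial in x1 alone."""
    return [derivative(component) for component in Y]


def value_at_zero(Y: List[Poly]) -> List[int]:
    """Evaluate the vector field at the origin (constant terms of the components)."""
    return [component[0] for component in Y]


def example_field(n: int) -> List[Poly]:
    """Assumed field Y: its i-th component (1-based) is x1**i."""
    return [[0] * i + [1] for i in range(1, n + 1)]


def iterated_brackets_at_zero(n: int) -> List[List[int]]:
    """Values at 0 of the p-fold brackets [X, [X, ... [X, Y]]], for p = 1..n."""
    Y = example_field(n)
    values = []
    for _ in range(n):
        Y = ad_X(Y)
        values.append(value_at_zero(Y))
    return values


brackets = iterated_brackets_at_zero(4)
```

Each entry `brackets[p - 1]` is $p!$ times the $p$-th canonical basis vector, so the $n$ iterated brackets are linearly independent at the origin and the rank there is $n$.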

2.2. Structure of orbits of $C^\infty$ systems

We are first concerned with the structure of $G(\mathcal{D}).x$. The classical Frobenius theorem tells us [40]:

2.2.1. Proposition: Assume that for every point $x$ in $M$ the dimension of $\mathcal{D}$ (the dimension of $\mathcal{L}(\{X(x);\ X \in \mathcal{D}\})$) is equal to the rank and is constant, equal to $p$. Such a control system is said to be involutive and of constant dimension. Then there exists a unique manifold structure on $G(\mathcal{D}).x$ such that the mappings

$$(t_1, \ldots, t_i, \ldots, t_p) \mapsto X^1_{t_1} \circ \cdots \circ X^i_{t_i} \circ \cdots \circ X^p_{t_p}(x), \qquad X^i \in \mathcal{D},\ p \in \mathbb{N},\ t_i \in \mathbb{R}$$

are differentiable. With this structure $G(\mathcal{D}).x$ is $p$-dimensional.

This theorem is quite inadequate for control purposes: there is no reason for the rank to be equal to the dimension. Chow's theorem [7,30,40] gives us a considerable improvement of this result, which is the following proposition:


2.2.2. Proposition (Chow-Frobenius): Assume that the rank of the system $\mathcal{D}$ is constant and equal to $p$. Then $G(\mathcal{D}).x$ has a unique manifold structure for which the mappings

$$(t_1, \ldots, t_i, \ldots, t_p) \mapsto X^1_{t_1} \circ \cdots \circ X^i_{t_i} \circ \cdots \circ X^p_{t_p}(x), \qquad X^i \in \mathcal{D},\ p \in \mathbb{N},\ t_i \in \mathbb{R}$$

are differentiable. With this structure $G(\mathcal{D}).x$ is $p$-dimensional.

Proof. Take the family $[\mathcal{D}]^\infty$; this control dynamical system satisfies the assumption of Proposition 2.2.1; thus, $G([\mathcal{D}]^\infty).x$ has a manifold structure of dimension $p$. It is connected; by a standard argument, the conclusion derives from Theorem 2.2.3 below applied in $G([\mathcal{D}]^\infty).x$, which locally looks like $\mathbb{R}^p$.

2.2.3. Theorem (Chow [7]). In $\mathbb{R}^n$, consider a control dynamical system $\mathcal{D}$ of rank $n$ at the origin; then:

i) in every neighbourhood of 0 the set $G^+(\mathcal{D}).0$ has interior points,

ii) the set $G(\mathcal{D}).0$ is a neighbourhood of 0.

Proof. Actually, from Chow's original treatment it is not possible to obtain i), but just ii). Moreover, it does not work directly for families of vector fields but for algebras of vector fields. There are many proofs of Theorem 2.2.3 in the control theory literature: Hermann [18], Lobry [40,42,46], Sussmann-Jurdjevic [65], Stefan [55,56]. Below we give Krener's proof, which is very simple (see Ref.[30]).

2.2.4. Krener's proof of 2.2.3:

Let $\mathcal{D}$ be a control dynamical system in $\mathbb{R}^n$, assuming that the rank of $\mathcal{D}$ is $n$ at point 0. Then $G^+(\mathcal{D}).0$ has interior points in every neighbourhood of 0.

Proof. At least one vector field in $\mathcal{D}$ is not zero at the origin, for, if not, by the formula $[X, Y](0) = DY(0)X(0) - DX(0)Y(0)$ every bracket would vanish at 0 and then the rank would be 0. Take $X^1$ in $\mathcal{D}$ with $X^1(0) \neq 0$. For $\varepsilon_1$ small enough, the set

$$S^1 = \{ X^1_{t_1}(0) \ ;\ 0 < t_1 < \varepsilon_1 \}$$

is a smooth sub-manifold of $\mathbb{R}^n$. There exists a point $p_1 = X^1_{t_1}(0)$ in $S^1$, and a vector field $X^2$ in $\mathcal{D}$, such that $X^2(p_1)$ and $X^1(p_1)$ are independent, for, if not, this would imply that all the vector fields of $\mathcal{D}$ are tangent to $S^1$ and thus the rank would be 1.

For $\varepsilon_2$ small enough, the set

$$S^2 = \{ X^2_{t_2} \circ X^1_{t_1}(0) \ ;\ t_1 \text{ close to the parameter of } p_1,\ 0 < t_2 < \varepsilon_2 \}$$

is a sub-manifold of $\mathbb{R}^n$. If 2 is smaller than $n$, there exists a point $p_2 = X^2_{t_2} \circ X^1_{t_1}(0)$ in $S^2$ and a vector field $X^3$ (which may be equal to $X^1$) such that the vectors $X^1(p_2)$, $X^2(p_2)$, $X^3(p_2)$ are linearly independent, for the same reason as before; one defines $S^3$ in the same way, etc., up to $S^n$, which is an open


subset of $\mathbb{R}^n$ and is contained by construction in $G^+(\mathcal{D}).0$. Since the $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ can be chosen arbitrarily small, everything works in any neighbourhood of 0. Theorem 2.2.3 is proved. Actually, Theorem 2.2.3 tells us much more than 2.2.2. It says:

2.2.5. Theorem: If the rank at $x$ of the control dynamical system defined by $\mathcal{D}$ is equal to $n$, then interior points of $G^+(\mathcal{D}).x$ are dense in $G^+(\mathcal{D}).x$:

$$G^+(\mathcal{D}).x \subset \mathrm{Clos}\big(\mathrm{Int}(G^+(\mathcal{D}).x)\big)$$

Proof: Trivial from 2.2.3.

This last result, in view of the stability question (Section 4), seems to be very significant in control theory despite the curious way in which it is stated. Moreover, this property is generic because the rank condition, as the next theorem states, has the following property:

2.2.6. Theorem: Let $V(M)$ denote the set of all vector fields on $M$; in the $C^k$-topology of Whitney on $(V(M))^p$, for $p \ge 2$ and $k$ large enough (it depends on $p$ and the dimension of $M$), the set of those families $\mathcal{D}$ of $p$ vector fields which satisfy the rank condition $r_{\mathcal{D}}(x) = n$ everywhere in $M$ contains an open dense subset.

Proof: See Ref.[41] or Ref.[46]. The proof is carried out by means of Thom's Lemma and the standard techniques in differential topology.

2.2.7. Generic dynamical control systems are "trivial". By "trivial" we mean that for every $x$ in $M$, $G(\mathcal{D}).x = M$.

Proof: For a generic control dynamical system, $r_{\mathcal{D}}(x)$ is $n$ at every point (2.2.6); then $G(\mathcal{D}).x$ is a neighbourhood of $x$ (2.2.3) and thus every orbit is open. Since the orbits partition the connected manifold $M$, each orbit must be the full space.

Thus, we know the structure of the orbits in the generic case (trivial) and in the case of constant rank (Chow + Frobenius); there remains the non-constant case, which has been solved recently by Sussmann [59].

2.2.8. Theorem (Sussmann): Take a family $\mathcal{D}$ of vector fields on $M$; without any assumption on $\mathcal{D}$ (except that the vector fields are smooth) there exists a unique manifold structure on $G(\mathcal{D}).x$ for which the mappings:

$$(t_1, \ldots, t_i, \ldots, t_p) \mapsto X^1_{t_1} \circ \cdots \circ X^i_{t_i} \circ \cdots \circ X^p_{t_p}(x), \qquad X^i \in \mathcal{D},\ t_i \in \mathbb{R},\ p \in \mathbb{N}$$

are differentiable.

Proof: This proof is completely "by hand" in the sense that it does not refer to any other classical result. Thus, it is a true generalization of the Frobenius theorem. It is too technical to be exposed here; see Ref.[59] or Ref.[46].

As a conclusion, we remark that the theory of the action of control dynamical systems, as considered from the point of view of orbits, is now complete and actually trivial. Fortunately, many problems on the positive orbits are not yet solved.


2.2.9. Examples. Consider in $\mathbb{R}^2$ the control dynamical systems defined by (Fig.2):

a), b), c): three pairs of vector fields $X^1$, $X^2$ in the $(x, y)$ plane, as pictured in Fig.2 (the defining formulas are not legible in this copy).


FIG. 3. Proof of Proposition 2.2.10.

In case a) $G^+(\mathcal{D}).x$ is a closed set without interior, in case b) it has an interior and is closed, and in case c) it has an interior but is not closed. The question of the interior is easily solved by the following:

2.2.10. Proposition: The set $G_t^+(\mathcal{D}).x$ has a non-empty interior for every $t$ provided that the rank at point $(x, 0)$ of the system $\tilde{\mathcal{D}}$ in $M \times \mathbb{R}$ defined below is equal to $n + 1$.

$$\tilde{\mathcal{D}} = \Big\{ X \oplus \frac{\partial}{\partial t} \ ;\ X \in \mathcal{D} \Big\}, \qquad \Big(X \oplus \frac{\partial}{\partial t}\Big)(x, t) = (X(x), 1)$$

Proof: It is trivial; see Fig.3. Details can be found in Ref.[65].

The question of closedness is not yet solved in this setting. Actually, it is very closely related to questions of bang-bang controllability or singular arcs, and we now have a good knowledge of this problem (see Section 4). Thus it seems that we are going to obtain a reasonable answer in the near future (if it has not yet been obtained by now).

We conclude this section by a sort of inverse question: "How many vector fields do we need to obtain $G^+(\mathcal{D}).x = M$?" The answer is two. See Sussmann [64] and Sussmann-Levitt [39].

2.3. Structure of orbits of analytic systems

In this section, we shall assume that the control dynamical system $\mathcal{D}$ is composed of analytic vector fields on an analytic manifold $M$. In this case, one can expect "global results" from the knowledge of the "coefficients" of the system at point $x$. This is actually the case. The following theorem was proved by Nagano [51] in a completely different framework. The proof given in Ref.[40] is false. This was pointed out by P. Stefan [54].


2.3.1. Theorem (Nagano): Let $\mathcal{D}$ be an analytic control system. Let $p$ be the rank of the system at point $x$. Then the manifold structure defined in 2.2.8 has dimension $p$.

Proof: See Refs [51,54].

We now turn to the study of $G^+(\mathcal{D}).x$ and $G_t^+(\mathcal{D}).x$.

Suppose now that the rank of the system at point $x$ is equal to $n$. This is not a restriction because, from 2.3.1, this assumption is satisfied in the restriction to $G(\mathcal{D}).x$. Consider now in $M \times \mathbb{R}$ the system $\tilde{\mathcal{D}}$ defined by $\tilde{\mathcal{D}} = \{X \oplus \partial/\partial t,\ X \in \mathcal{D}\}$ and compute its rank at point $(x, 0)$. A simple computation shows that

$$\Big[ X \oplus \frac{\partial}{\partial t},\ Y \oplus \frac{\partial}{\partial t} \Big] = [X, Y] \oplus 0$$

and, inductively, all the brackets are of the same form. More precisely, the following holds:

$$[\tilde{\mathcal{D}}]^\infty = \tilde{\mathcal{D}} \cup \big( [\mathcal{D}, [\mathcal{D}]^\infty] \oplus 0 \big)$$

and we are interested in the dimension of $\mathcal{L}([\tilde{\mathcal{D}}]^\infty(x, 0))$. Clearly (each vector in $\tilde{\mathcal{D}}(x, 0)$ has a last component equal to 1), this linear subspace is not contained in the subspace $(\xi_1, \xi_2, \ldots, \xi_n, 0)$; from this one easily deduces the following

2.3.3. Lemma: The rank of $\tilde{\mathcal{D}}$ is $n + 1$ at point $(x, 0)$ if and only if the linear space generated by the vectors of the form:

$$\sum_{i=1}^{p} \lambda_i X^i(x) + Y(x); \qquad \sum_{i=1}^{p} \lambda_i = 0,\ X^i \in \mathcal{D},\ Y \in [\mathcal{D}, [\mathcal{D}]^\infty]$$

is of dimension $n$.

From this lemma, from 2.2.10, and from 2.3.1 we deduce immediately:

2.3.4. Proposition: Assume that the system $\mathcal{D}$ has rank $n$ at point $x$. Then $G_t^+(\mathcal{D}).x$ has a non-empty interior (actually it has dense interior points) for every $t$ if and only if the linear subspace generated by the vectors of the form:

$$\sum_{i=1}^{p} \lambda_i X^i(x) + Y(x); \qquad \sum_{i=1}^{p} \lambda_i = 0,\ X^i \in \mathcal{D},\ Y \in [\mathcal{D}, [\mathcal{D}]^\infty]$$

has dimension $n$.

2.3.5. We can say a little bit more about $G_t^+(\mathcal{D}).x$ when the dimension is not $n$. Let us denote by $\mathcal{D}_0$ the collection of vectors of the form

$$\sum_{i=1}^{p} \lambda_i X^i(x) + Y(x); \qquad \sum_{i=1}^{p} \lambda_i = 0,\ X^i \in \mathcal{D},\ Y \in [\mathcal{D}, [\mathcal{D}]^\infty]$$


Consider in $M \times \mathbb{R}$ the section at time $t$, denoted by $M_t$:

$$M_t = M \times \{t\}$$

From the considerations before Lemma 2.3.3, we know that the sub-manifold $G(\tilde{\mathcal{D}}).(x, 0)$ of $M \times \mathbb{R}$ meets each $M_t$ transversally. Hence, the intersection

$$S_t = M_t \cap G(\tilde{\mathcal{D}}).(x, 0)$$

can be considered to be a sub-manifold of $M$. From this construction, it is clear that for any $X$ in $\mathcal{D}$ one has

$$X_t(S_0) = S_t$$

and, on the other hand, $S_t$ contains $G_t^+(\mathcal{D}).x$. Moreover, the system $\tilde{\mathcal{D}}$ restricted to $G(\tilde{\mathcal{D}}).(x, 0)$ is of maximum rank; thus, in the topology of $G(\tilde{\mathcal{D}}).(x, 0)$, the set $G^+(\tilde{\mathcal{D}}).(x, 0)$ has interior points everywhere dense and, because it meets $S_t$ transversally, $G_t^+(\mathcal{D}).x$ has an everywhere dense interior in the topology of $S_t$. To conclude, we know that at every point $x$ in $M$ the family $\mathcal{D}_0$ is tangent to $S_0$ and of rank $n - 1$. This proves that $S_0 = G(\mathcal{D}_0).x$. We have proved (see Sussmann-Jurdjevic [65] if more details are needed) the following theorem.

2.3.6. Theorem (Sussmann-Jurdjevic): Let $\mathcal{D}$ be an analytic control system. Define a new control system $\mathcal{D}_0$ as in 2.3.5.

i) If the rank of $\mathcal{D}_0$ at point $x$ is $n$, then interior points of $G_t^+(\mathcal{D}).x$ are dense in $G_t^+(\mathcal{D}).x$.

ii) If the rank of $\mathcal{D}_0$ at point $x$ is less than $n$, then it is $n - 1$. The sub-manifold $G(\mathcal{D}_0).x$ has codimension 1. The set $G_t^+(\mathcal{D}).x$ is contained in $X_t(G(\mathcal{D}_0).x)$ for any $X$ in $\mathcal{D}$ and, in the topology of the sub-manifold $X_t(G(\mathcal{D}_0).x)$, interior points of $G_t^+(\mathcal{D}).x$ are dense in $G_t^+(\mathcal{D}).x$.

There is another interesting point in Sussmann-Jurdjevic [65]. We just state it without comments on the proof:

2.3.7. Proposition: Let $M$ be a manifold whose fundamental group has no elements of infinite order; then

$G^+(\mathcal{D}).x = M \ \Rightarrow\ G_t^+(\mathcal{D}).x$ has a non-empty interior for some $t$.

2.3.8. Proposition: Let $M$ be a manifold whose universal covering is compact; then

$G^+(\mathcal{D}).x$ has a non-empty interior $\Rightarrow\ G_t^+(\mathcal{D}).x$ has a non-empty interior for some $t$.

3. CONTROLLABILITY OF NON-LINEAR SYSTEMS

In this section we restrict our attention to systems of the following form:

$$\frac{dx}{dt} = X^0(x) + \sum_{i=1}^{p} u_i X^i(x), \qquad x \in M,\ (u_1, u_2, \ldots, u_p) \in \mathbb{R}^p \qquad (1)$$


There is some arbitrariness in choosing such systems, where the control is a $p$-dimensional input which enters into the equation linearly. But it is well known that the essential difficulties are in the space non-linearities and, moreover, this form is a good balance between reasonable generality, in order to obtain interesting cases, and a simple exposition.

All the controls which we consider are locally bounded measurable controls and, following Jurdjevic-Sussmann [29], we denote by:

$U_u$ the set of controls which take arbitrary values in $\mathbb{R}^p$: the set of "unrestricted controls".

$U_r$ the set of controls $t \mapsto (u_1(t), u_2(t), \ldots, u_i(t), \ldots, u_p(t))$ such that $|u_i(t)| \le 1$: the set of "restricted controls".

$U_b$ the set of piecewise constant controls which take just the values +1 or -1: the set of "bang-bang controls".

The accessible set at time $t$ from some initial condition $x$ is denoted by:

$$A_u(x, t), \qquad A_r(x, t), \qquad A_b(x, t)$$

depending on whether the admissible controls are "unrestricted", "restricted" or "bang-bang". And we introduce:

$$A_u(x, [0, t]) = \bigcup_{s \in [0, t]} A_u(x, s)$$

$$A_r(x, [0, t]) = \bigcup_{s \in [0, t]} A_r(x, s)$$

$$A_b(x, [0, t]) = \bigcup_{s \in [0, t]} A_b(x, s)$$

We shall study the properties of these sets when $M$ is an arbitrary manifold, smooth or analytic; when $M$ is a compact Riemannian manifold and the control dynamical system is conservative; when $M$ is a Lie group and the control dynamical system is right-invariant; and finally when $M$ is $\mathbb{R}^n$.

3.1. The case of an arbitrary manifold

We consider a system (1) on a manifold $M$. When $M$ is an analytic manifold and the vector fields are analytic we say that we are in the analytic case.

3.1.1. Notation: Consider the system (1):

$$\frac{dx}{dt} = X^0(x) + \sum_{i=1}^{p} u_i X^i(x)$$

We denote by $L$ the Lie algebra generated by the family $(X^0, X^1, \ldots, X^p)$, by $L_0$ the ideal generated in $L$ by $(X^1, X^2, \ldots, X^p)$ and by $\ell$ the Lie algebra generated by $(X^1, X^2, \ldots, X^p)$.


3.1.2. Proposition: If (and only if in the analytic case) the dimension of $L$ at point $x$ is equal to the dimension $n$ of the manifold $M$, one has:

For every $t > 0$, interior points of $A_\alpha(x, [0, t])$ are dense in $A_\alpha(x, [0, t])$, where the symbol $\alpha$ can take the values $u$, $r$, or $b$.

Proof: This follows from Theorems 2.2.3 and 2.3.1 as soon as we have realized, by a simple computation, that the rank of the family

$$\mathcal{D} = \Big\{ X^0 + \sum_{i=1}^{p} u_i X^i \ ;\ (u_1, u_2, \ldots, u_p) \in U_b \Big\}$$

at point $x$ (see Definition 2.1.1) is exactly the dimension of $L$ at point $x$. Details are left to the reader. Remark that in the analytic case, by Theorem 2.3.1, the same conclusion holds in the topology of the manifold $G(\mathcal{D}).x$. Now we turn our attention to the reachable set at time $t$.

3.1.3. Proposition: If (and only if in the analytic case) the dimension of $L_0$ at point $x$ is equal to $n$ we have:

For every $t > 0$, interior points of $A_\alpha(x, t)$ are dense in $A_\alpha(x, t)$, where the symbol $\alpha$ can take the values $u$, $r$ or $b$. Moreover, if the dimension of $L$ at $x$ is $n$, the dimension of $L_0$ at $x$ is either $n$ or $n - 1$ and, in this last case, if the system is analytic, the same conclusion holds in restriction to some sub-manifold of dimension $n - 1$ for the case $\alpha = r$ or $\alpha = b$.

Proof: It follows from the fact that the dimension of $L_0$ is the same as the dimension of the family described in Lemma 2.3.3 (with respect to $\mathcal{D}$).

We turn now to controllability results, which are not very strong in this context.

3.1.4. Proposition: Assume that the system is homogeneous, i.e. $X^0 = 0$. Then if (and only if in the analytic case) the dimension of $\ell$ is equal to $n$ at every point of $M$ (which is assumed to be connected), then:

For every $x$ in $M$ one has $A_r(x, [0, \infty)) = A_b(x, [0, \infty)) = M$.

For every $x$ in $M$ and $t > 0$ one has $A_u(x, t) = M$.

Proof: In this case we can "reverse time": the trajectory corresponding to some constant control $(u_1, \ldots, u_p)$ can be followed backward using the control $(-u_1, -u_2, \ldots, -u_p)$. The result then comes directly from Theorem 2.2.3 (see 2.2.4).

3.1.4. Proposition: Assume that the rank of $\ell$ is $n$ everywhere; then:

For every $x$ in $M$ one has $A_u(x, [0, \infty)) = M$.

Proof: Take a point $y$; a neighbourhood of this point is accessible at time $t$, for arbitrarily small $t$, from $x$ under the homogeneous system (see above). During a very short time the introduction of the non-homogeneous term $X^0$ introduces just a "small error". One concludes by a fixed-point theorem. The correct proof of this last result can be found in Brunovsky-Lobry [6], where this idea is used to prove various results of controllability.


FIG. 4. Torus illustrating Proposition 3 .1 .5 .

FIG. 5. Knowledge of $\Sigma^+$ and $\Sigma^-$ does not determine controllability. Controllability of A depends on whether the trajectory is above or below line $\pi$.

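We conclude this paragraph by some comments taken from the work of Gerbier [11, 12]. We look for the controllability of a very simple system, i.e. the system:

$$\frac{dx}{dt} = X^0(x) + u X^1(x), \qquad x = (x_1, x_2) \in \mathbb{R}^2 \qquad (2)$$

$$X^0 = \tfrac{1}{2}\big(1 + f(x_1, x_2)\big) \frac{\partial}{\partial x_1} + \tfrac{1}{2}\, g(x_1, x_2) \frac{\partial}{\partial x_2}$$

$$X^1 = \tfrac{1}{2}\big(1 - f(x_1, x_2)\big) \frac{\partial}{\partial x_1} - \tfrac{1}{2}\, g(x_1, x_2) \frac{\partial}{\partial x_2}$$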


Notice that for this system the control $u = +1$ leads to the vector field $\partial/\partial x_1$ and the control $u = -1$ leads to the vector field $f(x_1, x_2)\,\partial/\partial x_1 + g(x_1, x_2)\,\partial/\partial x_2$; thus, this class of systems is parametrized by the set of vector fields in the plane. Moreover, we restrict ourselves to non-vanishing vector fields: $f^2(x_1, x_2) + g^2(x_1, x_2) > 0$. It turns out that the two sets

$$\Sigma^+ = \{ (x_1, x_2);\ g(x_1, x_2) = 0 \text{ and } f(x_1, x_2) > 0 \}$$

$$\Sigma^- = \{ (x_1, x_2);\ g(x_1, x_2) = 0 \text{ and } f(x_1, x_2) < 0 \}$$

play an important, but non-exclusive, role for controllability, i.e. we can prove:

3.1.5. Proposition: If the system (2) is controllable ($A_r(x, [0, \infty)) = \mathbb{R}^2$ for every $x$) then the set $\Sigma^-$ is not empty.

Notice that this depends on the Euler-Poincaré characteristic because it is false on the torus shown in Fig.4.
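The parametrization of system (2) by a planar vector field $(f, g)$ can be illustrated with a few lines of exact arithmetic. This is only a sketch under stated assumptions: it takes $X^0 = \tfrac12(1+f)\,\partial/\partial x_1 + \tfrac12 g\,\partial/\partial x_2$ and $X^1 = \tfrac12(1-f)\,\partial/\partial x_1 - \tfrac12 g\,\partial/\partial x_2$ (the decomposition consistent with the remark that $u = +1$ gives $\partial/\partial x_1$ and $u = -1$ gives $f\,\partial/\partial x_1 + g\,\partial/\partial x_2$), and the sample functions `f`, `g` and all helper names are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def make_fields(f, g):
    """Build X0 and X1 of system (2) from the planar vector field (f, g)."""
    def X0(x1, x2):
        return (HALF * (1 + f(x1, x2)), HALF * g(x1, x2))

    def X1(x1, x2):
        return (HALF * (1 - f(x1, x2)), -HALF * g(x1, x2))

    return X0, X1


def controlled_field(X0, X1, u):
    """Vector field X0 + u * X1 obtained for the constant control value u."""
    def field(x1, x2):
        a, b = X0(x1, x2), X1(x1, x2)
        return (a[0] + u * b[0], a[1] + u * b[1])

    return field


def sigma_sign(f, g, x1, x2):
    """Return '+' on Sigma+, '-' on Sigma-, and None off both sets."""
    if g(x1, x2) != 0:
        return None
    value = f(x1, x2)
    if value > 0:
        return "+"
    if value < 0:
        return "-"
    return None


# Hypothetical non-vanishing planar field: f = x1, g = 1 - x1**2.
def f(x1, x2):
    return x1


def g(x1, x2):
    return 1 - x1 * x1


X0, X1 = make_fields(f, g)
plus = controlled_field(X0, X1, 1)(Fraction(3), Fraction(2))    # equals (1, 0)
minus = controlled_field(X0, X1, -1)(Fraction(3), Fraction(2))  # equals (f, g) = (3, -8)
```

For this sample field, $\Sigma^+$ is the line $x_1 = 1$ and $\Sigma^-$ is the line $x_1 = -1$; `sigma_sign` reports membership point by point.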

Now, the following example shows how knowledge of $\Sigma^+$ and $\Sigma^-$ is not enough to determine controllability.

3.1.6. Example (see Fig.5). For details, see Ref.[11].

3.2. The case of a compact manifold with a measure

Suppose that we are on a compact manifold $M$ with a measure $\mu$. We consider the system:

$$\frac{dx}{dt} = X^0(x) + \sum_{i=1}^{p} u_i X^i(x), \qquad x \in M \qquad (1)$$

and we assume that the system is conservative; this means that the flows of the vector fields $X^0 \pm X^i$, $i = 1, \ldots, p$, preserve the measure $\mu$:

$$\mu\big((X^0 \pm X^i)_t(B)\big) = \mu(B)$$

We denote by $L$ the Lie algebra generated by $(X^0, X^1, \ldots, X^p)$.

3.2.1. Proposition: If (and only if in the analytic case) the dimension of $L$ at point $x$ is $n$ (the dimension of $M$), then one has:

$A_r(x, T) = A_b(x, T) = M$ for some sufficiently large $T$.

Proof: It uses the fact that "Poisson-stable" points are dense for conservative dynamics. See Lobry [44].

Compactness plus maximum rank is not sufficient for controllability, as is shown by the following example:

3.2.2. Example: Consider, on the sphere, the system described by Fig.6.


FIG. 6. Compactness plus maximum rank not enough for controllability.

3.3. The case of a Lie group

The case where the state space is a Lie group $G$ has been extensively studied by Brockett [2,4], Jurdjevic [28], Jurdjevic-Sussmann [29], and Sussmann [58]. The technique consists in the use of general results from Section 3.1, together with some advantages due to the group structure. The systems considered are right-invariant:

$$\frac{dx}{dt} = X^0(x) + \sum_{i=1}^{p} u_i X^i(x), \qquad x \in G \qquad (1)$$

which means that the vector fields $X^i$, $i = 0, 1, \ldots, p$, are right-invariant vector fields. The Lie algebras $L$, $L_0$ and $\ell$, with the notation of 3.1.1, determine Lie sub-groups $S$, $S_0$ and $s$. In this case, the question of controllability reduces to a knowledge of whether or not $A(e, [0, \infty))$ (where $e$ is the neutral element of $G$) is a sub-group. An answer is given by the following theorem:

3.3.1. Theorem: If $G$ is compact, the system (1) is completely controllable (i.e. $A_\alpha(e, [0, \infty)) = G$; $\alpha = u, r, b$) if and only if the Lie algebra $L$ is the Lie algebra of $G$.

Proof: Because of the compactness of $G$, one can prove, in this case, that $A_\alpha(e, [0, \infty))$ is a sub-group [29]. Another way of proving this is to remark that, for flows generated by right-invariant vector fields on a compact Lie group, all points are Poisson-stable and then the result of Section 3.2 can be applied. Actually, in this compact case, one can say even more:

3.3.2. Theorem: Assume that $G$ is compact and the Lie algebra $L$ of (1) is the Lie algebra of $G$. Then for each class of controls $U_\alpha$ ($\alpha = u, r, b$) there exists a $T > 0$ such that for every $g$ in $G$, $A_\alpha(g, [0, T]) = G$. If, moreover, $G$ is semi-simple, the same statement holds with $A_\alpha(g, T)$ in place of $A_\alpha(g, [0, T])$.

Proof: see Ref.[29].

3.3.3. The group $GL(n, \mathbb{R})$ of invertible matrices turns out to be a Lie group. In this case, the tangent space at each point is identified with the set $M(n, \mathbb{R})$ of $n$-by-$n$ matrices. A right-invariant vector field is of the form:

$$X \mapsto AX, \qquad X \in GL(n, \mathbb{R});\ A \in M(n, \mathbb{R})$$


Thus for every sub-group $G$ of $GL(n, \mathbb{R})$ the "matrix differential system":

$$\frac{dX}{dt} = \Big[ A^0 + \sum_{i=1}^{p} u_i A^i \Big] X, \qquad X \in G;\ A^i \in L(G),\ i = 0, 1, \ldots, p$$

is a particular case of right-invariant systems on Lie groups. In this case, the bracket of the two vector fields

$$X \mapsto A^i X$$

$$X \mapsto A^j X$$

is the vector field

$$X \mapsto (A^i A^j - A^j A^i) X$$

or

$$X \mapsto [A^i, A^j] X$$

with the usual notation:

$$[A, B] = AB - BA.$$

3.3.4. We conclude this paragraph by the following remark due to Hirshorn [24,25]. By a theorem of Palais [52], if the Lie algebra generated by a family of vector fields is finite-dimensional, then it can be realized as the Lie algebra of a Lie group of diffeomorphisms of the manifold. Then our problem reduces to a control problem on a Lie group.

3.4. The manifold is $\mathbb{R}^n$

We show how classical results on controllability in $\mathbb{R}^n$ are consequences of the previous results. But first we state a result by Brunovsky whose proof is quite delicate and does not reduce to 3.1, 3.2, or 3.3.

3.4.1. Proposition: The control system (1)

$$\frac{dx}{dt} = X^0(x) + \sum_{i=1}^{p} u_i X^i(x)$$

is said to be "odd" in case we have $X^0(-x) = -X^0(x)$. Then the set $A(0, [0, T])$ is a neighbourhood of the origin for every positive $T$ if (and only if in the analytic case) the dimension of the Lie algebra $L$ generated by $(X^0, X^1, \ldots, X^p)$ is $n$.

Proof: see Ref.[5].

Let us look now at "bilinear systems":


3.4.2. Definition: A bilinear system on $\mathbb{R}^n$ is a system of the form:

$$\frac{dx}{dt} = \Big( A^0 + \sum_{i=1}^{p} u_i A^i \Big) x \qquad (2)$$

It is clear that the origin is a rest point for systems (2); thus we look for controllability in $\mathbb{R}^n \setminus \{0\}$. Because the solution at time $t$ of system (2), from initial condition $x_0$, is given by $X(t).x_0$, where $X(t)$ is the solution of the matrix equation:

$$\frac{dX}{dt} = \Big( A^0 + \sum_{i=1}^{p} u_i A^i \Big) X, \qquad X(0) = I$$

the problem of controllability reduces to the problem of controllability on Lie groups of matrices (see paragraph 3.3).

We conclude by the celebrated Kalman criterion:

3.4.3. Theorem: The system:

$$\frac{dx}{dt} = Ax + Bu, \qquad x \in \mathbb{R}^n,\ u \in \mathbb{R}^p$$

is completely controllable if and only if the rank of the matrix

$$(B,\ AB,\ A^2B,\ \ldots,\ A^{n-1}B)$$

is equal to $n$.

Proof: The reachable set from any point is a linear subspace with a non-empty interior, i.e. the full space, iff the rank of the system is $n$ at every point. Compute the brackets, with $u_1 = (1, 0, \ldots, 0)$:

$$[Ax + Bu_1,\ Ax - Bu_1] = 2ABu_1$$

Now compute $[Ax + Bu_1,\ ABu_1]$. With the bracket formula of 2.1 we obtain

$$[Ax + Bu_1,\ ABu_1] = -A^2Bu_1$$

etc. The signs do not affect the linear span, and one easily sees that the rank of the system is precisely the rank of the matrix $(B, AB, \ldots, A^{n-1}B)$.
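The rank test of Theorem 3.4.3 is straightforward to carry out by machine. The following is a minimal sketch using exact rational arithmetic from the standard library; the function names and the two sample pairs $(A, B)$ are illustrative assumptions, and matrices are plain lists of rows.

```python
from fractions import Fraction
from typing import List

Matrix = List[List[Fraction]]


def to_fractions(M) -> Matrix:
    """Convert a matrix of numbers to exact rationals."""
    return [[Fraction(v) for v in row] for row in M]


def mat_mul(A: Matrix, B: Matrix) -> Matrix:
    """Matrix product A B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def controllability_matrix(A, B) -> Matrix:
    """Build (B, AB, A^2 B, ..., A^(n-1) B) by horizontal concatenation."""
    A, block = to_fractions(A), to_fractions(B)
    n = len(A)
    blocks = []
    for _ in range(n):
        blocks.append(block)
        block = mat_mul(A, block)
    return [sum((blk[i] for blk in blocks), []) for i in range(n)]


def rank(M: Matrix) -> int:
    """Rank by Gaussian elimination over the rationals."""
    M = [row[:] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r


def is_controllable(A, B) -> bool:
    """Kalman criterion: rank (B, AB, ..., A^(n-1) B) equals n."""
    return rank(controllability_matrix(A, B)) == len(A)


# Double integrator: x1' = x2, x2' = u.
double_integrator = is_controllable([[0, 1], [0, 0]], [[0], [1]])
# Decoupled pair where the input never reaches the second state.
decoupled = is_controllable([[1, 0], [0, 1]], [[1], [0]])
```

For the double integrator the matrix $(B, AB)$ has rank 2, so the criterion is satisfied; in the decoupled sample $AB = B$, the rank is 1 and the criterion fails.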


4. OTHER RELATED QUESTIONS

To conclude this survey, we just mention here some related questions which are the object of current research.

4.1. Closedness of $G_t^+(\mathcal{D}).x$

Consider the system:

$$\frac{dx}{dt} = X^0(x) + \sum_{i=1}^{p} u_i X^i(x); \qquad x \in \mathbb{R}^n$$

With the notations of Section 3 we have:

$$A_r(x, t) = \overline{A_b(x, t)}$$

and clearly if $A_b(x, t)$ is closed this means:

$$A_r(x, t) = A_b(x, t)$$

that is to say, everything you can do you can do with a bang-bang control. This is a bang-bang principle. It is known that the bang-bang principle, singular arcs and the maximum principle are very closely related questions. On these topics, we mention papers by Hermes [20,21], Krener [30,34,35], Lobry [43,45].

4.2. Realization theory

Assume that you have a control system of the standard form:

$$\frac{dx}{dt} = X^0(x) + \sum_{i=1}^{p} u_i X^i(x), \qquad x \in M;\ (u_1, u_2, \ldots, u_p) \in \mathbb{R}^p$$

on some manifold $M$. Let $\varphi : M \to N$ be a mapping from $M$ into some other manifold $N$ (the observation). Let us say that two states $x_0$ and $x_1$ are equivalent if for any control $t \mapsto u(t)$ we have:

$$\varphi(x(x_0, t, u)) = \varphi(x(x_1, t, u)), \qquad t \ge 0$$

where $x(x_0, t, u)$ and $x(x_1, t, u)$ denote, respectively, the responses issued from $x_0$ and $x_1$. To speak of a minimal realization, it is necessary to form the quotient of $M$ under the above equivalence and to define a manifold structure on the quotient. It turns out that this is possible by a theorem of Sussmann [60], which generalizes the classical "closed-sub-group theorem". Applications are given in Sussmann [61,62]. Another approach, concerning local realizations of non-linear systems by bilinear ones, can be found in Krener [33].


4.3. Structural stability and classification


Let us say that a control system is structurally stable if a small change in the data does not change the general behaviour of the system very much. What are the structurally stable systems? Are they generic? Can they be classified? There are some attempts to work in this direction in Gerbier-Lobry [12,13].

4.4. Connection with diffusion process

Consider the diffusion process:

$$d\xi_t = X^0(\xi_t)\,dt + \sum_{i=1}^{p} X^i(\xi_t)\, dB_i$$

where $X^0, X^1, \ldots, X^p$ are vector fields and the $B_i$ are independent one-dimensional Brownian motions. By a result of Stroock and Varadhan, the support of the diffusion process is connected to the controllability problem for the control system:

$$\frac{dx}{dt} = X^0(x) + \sum_{i=1}^{p} u_i X^i(x).$$

On these topics, see Elliott [8,9], Kunita [38], Stroock-Varadhan [57].

REFERENCES

[1] BOOTHBY, W., A transitivity problem from control theory.

[2] BROCKETT, R., Lie theory and control systems defined on spheres, SIAM J. Appl. Math. 24 5 (1973).

[3] BROCKETT, R., System theory on group manifolds and coset spaces, SIAM J. Control 10 2 (1972).

[4] BROCKETT, R., "Lie algebras and Lie groups in control theory", Geometric Methods in System Theory, Reidel (1973).

[5] BRUNOVSKY, P., Local controllability of odd systems, to appear.

[6] BRUNOVSKY, P., LOBRY, C., "Contrôlabilité Bang-Bang, contrôlabilité différentielle et perturbation des systèmes non linéaires" (to appear in Ann. Matem. pura ed appl.).

[7] CHOW, W.L., "Über Systeme von linearen partiellen Differentialgleichungen erster Ordnung", Math. Ann. 117 (1939) 98-105.

[8] ELLIOTT, D., A consequence of controllability, J. Diff. Equ. 10 (1971) 364-370.

[9] ELLIOTT, D., "Diffusions on manifolds arising from controllable systems", Geometric Methods in System Theory (loc. cit., see Ref.[4]).

[10] ELLIOTT, D., TARN, J.T., Controllability and observation for bilinear systems (unpublished).

[11] GERBIER, Y., "Classification des couples de systèmes dynamiques du plan. Application à la théorie de la commande", Thèse, 3è cycle, Bordeaux 1974.

[12] GERBIER, Y., "Classification de certains systèmes dynamiques contrôlés du plan", C.R. Acad. Sc. Paris 280 (20 janvier 1975) 109-112.

[13] GERBIER, Y., LOBRY, C., "On the structural stability of dynamical control systems", Communication to 1975 I.F.A.C. Congress.

[14] GROTE, J., "Problems in Geodesic Control", Geometric Methods in System Theory (loc. cit., see Ref.[4]).

[15] GROTE, J., "La théorie des connections et contrôle", Publications Mathématiques de l'Université de Bordeaux, Année 73-74, Fascicule 3, 7-13.

[16] HAYNES, G.W., HERMES, H., Nonlinear controllability via Lie theory, SIAM J. Control 8 (1970) 450-460.

382 LOBRY

[1 7 ] HERMANN, R ., "On the A ccessibility problem in control theory". Internat. Sym. N on-linear Differential Equations and Nonlinear M echanics, A cad em ic Press New York (1963).

[1 8 ] HERMANN, R ., D ifferential Geom etry and the Calculus o f Variations, A cad em ic Press, New York (1968).[1 9 ] HERMES, H. , C ontrollability and the singular problem , SIAM J. Control 2 (1965) 241-260.[2 0 ] HERMES, H ., On lo ca l and global controllab ility , SIA M J. Control, 12 (1974) 252-261.[2 1 ] HERMES, H. , "On necessary and sufficient conditions for lo ca l controllability am ong a reference

tra jectory", G eom etric Methods in System Theory ( lo c . c it . see Ref. [ 4 ] ) .[2 2 ] HERMES, H. , "L ocal controllability and sufficient conditions in singular problem s" (to appear).[2 3 ] HERMES, H . , HAYNES, G .W ., On the nonlinear control problem with control appearing linearly,

SIAM J. Control 1_( 1963) 85-108 .[2 4 ] HIRSHORN, R ., "T op o log ica l groups and the control o f nonlinear Systems" (to appear).[2 5 ] HIRSHORN, R ., T op o log ica l sem i-groups, sets o f generators and controllability , Duke Math. J. 40

4 (1973) 937-947.[2 6 ] HIRSHORN, R ., "C ontrollability in Nonlinear System s", G eom etric Methods in System Theory

( lo c . c it . see Ref. [ 4 ] ) .[2 7 ] HIRSHORN, R ., "T op o log ica l sem i groups and controllability in bilinear system s", Ph. D. Thesis,

D iv. o f Eng. and Appl. Phys., Harvard Univ. (Sept. 1973).[2 8 ] JURDJEVIC, V . , "Certain controllability property o f analytic control systems", SIA M J. Control 10

(1972) 354 -360 .[2 9 ] JURDJEVIC, V., SUSSMANN, H ., "Control systems on Lie groups", J. D iff. Eqs 12_2 (1972).[3 0 ] KRENER, A . , "A generalization o f C h ow ’s theorem and the Bang-Bang theorem to nonlinear control

system s", SIA M J. Control 12(1974) 43 -52 .[3 1 ] KRENER, A . , "On the equ ivalence o f control systems and the linearization o f nonlinear control

system s", SIA M J. Control U_( 1973) 670-676.[3 2 ] KRENER, A . , "Bilinear and nonlinear realizations o f input output maps" (to appear),[3 3 ] KRENER, A . , "L ocal approxim ation o f control systems" (to appear in J. D iff. Eqs)[3 4 ] KRENER, A . , "T he high order m axim um principle" (to appear).[3 5 ] KRENER, A . , "The high order m axim um princip le", G eom etric Methods in System Theory ( lo c . c it .

see Ref. [ 4 ] ) .[3 6 ] KUCERA, Y . , "Solution in large o f control problem x = ( A ( l - u ) + B )x". C zech . Math. J. 1 6 (9 1 )

(1966) 600-623.[3 7 ] KUCERA, Y . , "Solution in large o f control problem x = (A u + Bv)x". C zech . Math. J. 17 (92) (1967)

91 -96 .[3 8 ] KUN1TA, S h ., Diffusion process and control systems, cours de D. E. A. Univ. Paris VI, Lab. de

C a lcu l des probabilités (1973-74).[3 9 ] LEVITT, М ., SUSSMANN, H ., On controllability by means o f two vector fields (to appear in SIAM).[4 0 ] LOBRY, C . , C ontrolabilité des systèmes non linéaires, SIAM J. Control 8 (1970) 573-605.[4 1 ] LOBRY, C . , Une propriété générique des couples de champs de vecteurs, J. Math. C zech . 22

(1972) 230-237.[4 2 ] LOBRY, C . , Quelques aspects qualitatifs de la théorie de la com m ande, T hëse-G renoble (1972).[4 3 ] LOBRY, C . , G eom etric structure o f D ynam ical polysystems. Warwick Control Theory Center report

19 (1972).[4 4 ] LOBRY, C . , C ontrollability o f nonlinear systems on com pact m anifolds, SIAM J. Control 12-1 (1974) 1 -4 .[4 5 ] LOBRY, C . , "Deux remarques sur la com m ande Bang Bang des systèmes semi linéa ires". (Proc. C onf.

Zakopane, Poland).[4 6 ] LOBRY, C . , "D yn am ical Polysystems and Control T h eory", G eom etric Methods in systems theory"

( lo c . c it . see Ref. [ 4 ] ) .[4 7 ] MARKUS, L . , "Control dynam ical systems” , Math. Systems Theory 3^(1969) 179-185.[4 8 ] MARKUS, L . , SELL, G .R . , "Capture and control in conservative dynam ical systems" Arch. Ration.

M ech. Anal. 3J_(1968) 271 -287 .[4 9 ] MARKUS, L . , SELL, G .R ., "Control in conservative dynam ical systems: Recurrence and capture in

aperiod ic f ie ld s", J. D iff. Eqs 16 (1974) 472 -505 .[5 0 ] MILNOR, J. W . , T op ology from the Differential V iew Point, The University Press o f Virginia,

Charlottesville (1965),[5 1 ] NAGANO, T . , Linear differential systems with singularities and app lication to transitive Lie algebras.

J. Math. Soc, Japan 18 (1966) 398-404.[5 2 ] PALAIS, R , , A global form ulation o f the Lie theory o f transformation groups, M em . A .M .S . 22 (1957),[5 3 ] REBHUHN, D . , On the set o f attainability, PhD-thesis Univ. o f Illinois Urbana - Cham paign (1974).[5 4 ] STEFAN, P ., A ccessibility and singular foliations, PhD-thesis, Univ. o f Warwich (1973).

IA E A -S M R -1 7 /7 1 383

[5 5 ] STEFAN, P ., Integrability o f systems o f vector fields (to appear).[5 6 ] STEFAN, P ., "T w o proofs o f C h ow ’s theorem " G eom etric Methods in Systems Theory ( lo c . c i t . ,

see Ref. [ 4 ] ) .[5 7 ] STROOKE, P ., VARADHAN, S ., "On the support o f diffusion Processes, with applications to the strong

m axim um princip le", 6th Berkeley Symp. M athem atical Statistics and Probability.[5 8 ] SUSSMANN, H ., The Bang-Bang problem for certain systems in G . L. (n, IR), SIAM J. Control (to appear),[5 9 ] SUSSMANN, H ., "Orbits o f fam ilies o f vector fields and integrability o f systems with singularities” .

Trans. Am er. Math. Soc. _180( 1973) 171-188.[6 0 ] SUSSMANN, H . , "On quotients o f m anifolds: a generalization o f the dosed subgroup theorem "

(to appear in J. D iff. G e o m .) .[ 61] SUSSMANN, H ., M inim al realizations o f nonlinear systems, G eom etric Methods in Systems Theory

( lo c . c it . , see Ref. [ 4 ] ) .[6 2 ] SUSSMANN, H . , "Observable realization o f finite dimensional non -linear systems" (to appear in

SIAM J. con tro l.)[6 3 ] SUSSMANN, H ., "Existence and uniqueness o f m inim al realizations o f nonlinear systems I:

Initialized systems" (to appear).[6 4 ] SUSSMANN, H ., "On the number o f vector fields needed to ach ieve controllab ility"(to appear).[6 5 ] SUSSMANN, H ., JURDJEVIC, V . , "C ontrollability o f nonlinear system s", J. D iff. Eqs 12 (1972) 95-116 .[6 6 ] SW AM Y-TARN, "Optim al control o f discrete bilinear system s", G eom etric Methods in System Theory

( lo c . c i t . , see Ref. [ 4 ] ) .[6 7 ] SW AMY, "O ptim al control o f single input bilinear system s", D .S .C . Dissertation (D ec. 1973)

Washington, University St. Louis, Missouri.

IAEA-SMR-17/63

INTRODUCTION TO CONVEX ANALYSIS

J.P. CECCONI
Centro di Studio per la Matematica e la Física Teórica, C.N.R.,
Genova, Italy

Abstract

INTRODUCTION TO CONVEX ANALYSIS.
This paper considers the main elements of convex analysis in infinite-dimensional spaces, that is, the most essential properties of convex functions valued in $\overline{\mathbb{R}}$ (the extended real numbers), such as: continuity property, duality, sub-differentiability, properties concerning the minimization of such functions, and, finally, the connections of these with the monotone-operators theory and the variational-inequalities theory. Convexity has always played a fundamental role in the study of variational problems, but systematic studies on convexity have been carried out only in recent times.

1. CONVEX SETS, SEPARATION THEOREMS

Let $X$ be a vector space over $\mathbb{R}$ (abbreviated v.s.). If $x, y \in X$, the set

$$[x, y] = \{\lambda x + (1 - \lambda) y : 0 \le \lambda \le 1\}$$

is called the closed line segment joining $x$ and $y$.

A subset $K$ of $X$ is convex if $x, y \in K$ imply that $[x, y] \subset K$. It is immediate that $X$ and $\emptyset$, the empty set, are convex sets. The intersection of convex sets is a convex set. The union of convex sets is not, in general, a convex set. If $A$ is a subset of $X$, the convex hull of $A$, denoted by $\operatorname{co} A$, is the smallest convex subset containing $A$. Evidently

$$\operatorname{co} A = \Big\{ \sum_{i=1}^{n} \alpha_i x_i : x_i \in A, \ \alpha_i \ge 0, \ \sum_{i=1}^{n} \alpha_i = 1, \ n \in \mathbb{N} \Big\}$$

Let $H = \{x : f(x) = \alpha\}$, where $f$ is a non-zero linear form on $X$ and $\alpha \in \mathbb{R}$, be a real affine hyperplane in $X$; the convex sets

$$F_\alpha = \{x : f(x) < \alpha\}, \qquad F^\alpha = \{x : f(x) > \alpha\}$$

are called the algebraically open semispaces determined by $H$, and the convex sets

$$G_\alpha = \{x : f(x) \le \alpha\}, \qquad G^\alpha = \{x : f(x) \ge \alpha\}$$

are the algebraically closed semispaces determined by $H$. Two nonempty subsets $A$, $B$ of $X$ are said to be algebraically separated (respectively strictly separated) by the real hyperplane $H$ if either $A \subset G_\alpha$, $B \subset G^\alpha$ or $A \subset G^\alpha$, $B \subset G_\alpha$ (respectively, if either $A \subset F_\alpha$, $B \subset F^\alpha$ or $A \subset F^\alpha$, $B \subset F_\alpha$).

It is obvious that the hyperplane $H = \{x \in X : f(x) = \alpha\}$ separates $A$ and $B$ if and only if

$$f(x) \le \alpha \ (\text{or } f(x) \ge \alpha) \quad \forall x \in A$$

and

$$f(y) \ge \alpha \ (\text{or } f(y) \le \alpha) \quad \forall y \in B$$

We recall now the Hahn-Banach theorem (in its analytical and geometrical form) and the separation theorems for convex sets.

Theorem 1 (Hahn-Banach, analytic form). In a v.s. $X$ let $p : X \to \mathbb{R}$ be a sublinear real function, i.e. a real function such that

$$p(x + y) \le p(x) + p(y); \quad x, y \in X$$

$$p(\alpha x) = \alpha\, p(x); \quad x \in X, \ \alpha \in \mathbb{R}_+ \cup \{0\}$$

Let $A$ be a linear subspace of $X$ and let $f : A \to \mathbb{R}$ be a linear real function on $A$ with

$$f(x) \le p(x), \quad x \in A$$

Then there exists a linear real function $F : X \to \mathbb{R}$ for which

$$F(x) = f(x), \quad x \in A$$

$$F(x) \le p(x), \quad x \in X$$

Proof. Consider the family $S$ of all real linear extensions $g$ of $f$ for which the inequality $g(x) \le p(x)$ holds for $x$ in the domain of $g$. Let the relation $h > g$ be defined in $S$ to mean that $h$ is an extension of $g$; this relation partially orders $S$. Let us show that every totally ordered subset $\mathcal{G} = \{g_\alpha : \alpha \in \mathcal{A}\}$ of $S$ has an upper bound in $S$. Let us define

$$B = \bigcup_{\alpha \in \mathcal{A}} A_\alpha, \quad \text{where } A_\alpha \text{ is the domain of } g_\alpha$$

$$g(x) = g_\alpha(x) \quad \text{if } g_\alpha \in \mathcal{G}, \ x \in A_\alpha$$

This definition is not ambiguous, for if $g_{\alpha_1}, g_{\alpha_2} \in \mathcal{G}$, then either $g_{\alpha_1} < g_{\alpha_2}$ or $g_{\alpha_2} < g_{\alpha_1}$. At any rate, if $x \in A_{\alpha_1} \cap A_{\alpha_2}$, then $g_{\alpha_1}(x) = g_{\alpha_2}(x)$. Clearly $g \in S$, and it is an upper bound for $\mathcal{G}$ in $S$. Then from Zorn's lemma follows the existence of a maximal extension $F$ of $f$ for which the inequality $F(x) \le p(x)$ holds for every $x$ in the domain of $F$. It remains to be shown that the domain $M$ of $F$ is $X$. Otherwise we can prove that there would be a $g \in S$ such that $F < g$ and $F \ne g$, and this would violate the maximality of $F$.

Let us assume the existence of an $x_0 \in X - M$. Every vector $x$ in the space $M_1 = M \oplus \{x_0\}$ spanned by $M$ and $x_0$ has a unique representation in the form $x = m + \alpha x_0$ with $m \in M$, $\alpha \in \mathbb{R}$. For any constant $c$, the function $g$ defined on $M_1$ by

$$g(m + \alpha x_0) = F(m) + \alpha c$$

is a proper extension of $F$. The desired contradiction will be reached when it is shown that $c$ may be chosen so that $g \in S$. Let $m$, $n$ be vectors in $M$. Then from

$$F(m) - F(n) = F(m - n) \le p(m - n) \le p(m + x_0) + p(-n - x_0)$$

it follows that

$$-p(-n - x_0) - F(n) \le p(m + x_0) - F(m)$$

Since the left side of this inequality is independent of $m$ and the right side is independent of $n$, there is a constant $c$ with

$$-p(-m - x_0) - F(m) \le c, \quad m \in M$$

$$c \le p(m + x_0) - F(m), \quad m \in M$$

We show that, with this $c$, it results for every $x = m + \alpha x_0 \in M_1$:

$$g(m + \alpha x_0) = F(m) + \alpha c \le p(m + \alpha x_0)$$

so that $g \in S$. In fact, if $\alpha > 0$, then

$$g(m + \alpha x_0) = \alpha c + F(m) = \alpha \Big\{ c + F\Big(\frac{m}{\alpha}\Big) \Big\} \le \alpha \Big\{ p\Big(\frac{m}{\alpha} + x_0\Big) - F\Big(\frac{m}{\alpha}\Big) + F\Big(\frac{m}{\alpha}\Big) \Big\} = p(m + \alpha x_0)$$

If $\alpha = -\beta < 0$, then

$$g(m + \alpha x_0) = F(m) + \alpha c = F(m) - \beta c = \beta \Big\{ F\Big(\frac{m}{\beta}\Big) - c \Big\}$$

$$\le \beta \Big\{ F\Big(\frac{m}{\beta}\Big) + F\Big(-\frac{m}{\beta}\Big) + p\Big(\frac{m}{\beta} - x_0\Big) \Big\} = p(m - \beta x_0)$$

$$= p(m + \alpha x_0)$$

The case $\alpha = 0$ is trivial, since $g(m) = F(m) \le p(m)$. The proof of the theorem is thus completed.

Corollary 1. In a v.s. $X$ let $p : X \to \mathbb{R}$ be a seminorm in $X$, i.e. a real function such that

$$p(x + y) \le p(x) + p(y), \quad x, y \in X$$

$$p(\alpha x) = |\alpha| \cdot p(x), \quad x \in X, \ \alpha \in \mathbb{R}$$

Let $A$ be a linear subspace of $X$ and let $f : A \to \mathbb{R}$ be a linear real function on $A$ with

$$|f(x)| \le p(x), \quad x \in A$$

Then there exists a linear real function $F : X \to \mathbb{R}$ for which

$$F(x) = f(x), \quad x \in A$$

$$|F(x)| \le p(x), \quad x \in X$$

Proof. From the theorem follows the existence of $F : X \to \mathbb{R}$ for which

$$F(x) = f(x); \quad x \in A$$

$$F(x) \le p(x); \quad x \in X$$

Therefore

$$-F(x) = F(-x) \le p(-x) = p(x)$$

so that $|F(x)| \le p(x)$ for every $x \in X$.

Definition 1. Let $K$ be a convex set in a v.s. $X$. A point $x \in X$ is called an internal point of $K$ if for every $v \in X$ there exists $\varepsilon \in \mathbb{R}_+$ such that for every $\delta \in \mathbb{R}$, $|\delta| < \varepsilon$, it results $x + \delta v \in K$.

Definition 2. Let $K$ be a convex set in a v.s. $X$ and let the origin $0$ of $X$ be an internal point of $K$. For each $x \in X$ let $I(x) = \{\alpha \in \mathbb{R}_+ : (x/\alpha) \in K\}$ and $k(x) = \inf I(x)$. The function $k(x)$ is called the gauge function of $K$.

Theorem 2. Let $K$ be a convex set containing the origin of $X$ as an internal point and let $k$ be its gauge function. Then

(a) $k$ is a sublinear non-negative function in $X$;
(b) if $x \in K$ then $k(x) \le 1$;
(c) the set of the internal points of $K$ is characterized by the condition $k(x) < 1$.

Proof. Statement (a) follows from the fact that the origin is an internal point of $K$, from the fact that $I(\alpha x) = \alpha I(x)$ for every $x \in X$ and $\alpha \in \mathbb{R}_+$, and from the fact that, for every $x \in X$, $\varepsilon \in \mathbb{R}_+$,

$$k(x) + \varepsilon \in I(x)$$

It follows that if $x, y \in X$ then

$$\frac{x}{k(x) + \varepsilon} \in K, \qquad \frac{y}{k(y) + \varepsilon} \in K$$

and from the convexity of $K$

$$\frac{x + y}{k(x) + k(y) + 2\varepsilon} = \frac{x}{k(x) + \varepsilon} \cdot \frac{k(x) + \varepsilon}{k(x) + k(y) + 2\varepsilon} + \frac{y}{k(y) + \varepsilon} \cdot \frac{k(y) + \varepsilon}{k(x) + k(y) + 2\varepsilon} \in K$$

Therefore

$$k(x) + k(y) + 2\varepsilon \in I(x + y)$$

and consequently

$$k(x + y) = \inf I(x + y) \le k(x) + k(y) + 2\varepsilon$$

Hence

$$k(x + y) \le k(x) + k(y)$$

Statement (b) is self-evident.

If $x$ is an internal point of $K$ then $x + \varepsilon x = x(1 + \varepsilon) \in K$ for some sufficiently small $\varepsilon \in \mathbb{R}_+$, so that $\frac{1}{1 + \varepsilon} \in I(x)$. Hence $k(x) = \inf I(x) \le \frac{1}{1 + \varepsilon} < 1$.

Conversely, if $k(x) < 1$ let $\varepsilon = 1 - k(x)$ and for every $y \in X$ let $|\delta| \{k(y) + k(-y)\} < \varepsilon$. Then

$$k(x + \delta y) \le k(x) + k(\delta y) = (1 - \varepsilon) + \delta k(y) < (1 - \varepsilon) + \varepsilon = 1$$

$$k(x + \delta y) \le k(x) + k(\delta y) = (1 - \varepsilon) - \delta k(-y) < (1 - \varepsilon) + \varepsilon = 1$$

according as $\delta$ is positive or negative, so that $x + \delta y \in K$. Therefore $x$ is an internal point of $K$.
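The properties of the gauge function stated in Theorem 2 can be checked numerically in a simple finite-dimensional case. The following is only a sketch under stated assumptions: the set $K$ is a hypothetical example (an ellipse in $\mathbb{R}^2$), and the helper names `gauge` and `in_ellipse` are illustrative and do not come from the text. Bisection is legitimate here because, for a convex $K$ containing the origin, $I(x)$ is an upper ray.

```python
import math

def gauge(x, in_K, hi=1e6, tol=1e-9):
    """Approximate k(x) = inf{a > 0 : x/a in K} by bisection.

    Assumes K is convex and contains 0 as an internal point, so that
    I(x) = {a > 0 : x/a in K} is an upper ray and bisection applies.
    """
    if not in_K(tuple(c / hi for c in x)):
        raise ValueError("x/hi is not in K; increase hi")
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if in_K(tuple(c / mid for c in x)):
            hi = mid   # mid belongs to I(x)
        else:
            lo = mid   # mid lies below inf I(x)
    return hi

# Hypothetical example set: the ellipse x1^2/4 + x2^2 <= 1 in R^2.
def in_ellipse(p):
    return p[0] ** 2 / 4.0 + p[1] ** 2 <= 1.0

x, y = (1.0, 0.5), (-3.0, 2.0)
kx, ky = gauge(x, in_ellipse), gauge(y, in_ellipse)
kxy = gauge((x[0] + y[0], x[1] + y[1]), in_ellipse)
print(kx, ky, kxy)   # kxy should not exceed kx + ky (subadditivity)
```

For this particular $K$ the gauge is known in closed form, $k(x) = \sqrt{x_1^2/4 + x_2^2}$, which gives an independent check of the bisection.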

We can prove now the following separation theorem for convex sets:

Theorem 3. Let $A$ and $B$ be disjoint non-empty convex subsets of a v.s. $X$ and let $A$ have an internal point. Then there exists a hyperplane $H$ which separates $A$ and $B$.

Proof. If $a$ is an internal point of $A$, then the origin $0$ of $X$ is an internal point of the convex set $A - a = \{x \in X : x = z - a, z \in A\}$. The affine hyperplane $H = \{x \in X : f(x) = \alpha\}$ separates $A$ and $B$ if and only if the affine hyperplane $H' = \{x \in X : f(x) = \alpha - f(a)\}$ separates the sets $A - a$ and $B - a$. Thus it suffices to prove the theorem under the additional assumption that $0$ is an internal point of $A$.

Let $b$ be any point of $B$, so that $-b$ is an internal point of

$$A - B = \{z \in X : z = x - y, \ x \in A, \ y \in B\}$$

and $0$ is an internal point of the convex set

$$K = A - B + b = \{z \in X : z = x - y + b, \ x \in A, \ y \in B\}$$

Since $A$ and $B$ are disjoint, the set $A - B$ does not contain $0$; hence $K$ does not contain $b$. Let $k$ be the gauge function of the convex set $K$, so $k(b) \ge 1$.

If for every $\alpha \in \mathbb{R}$ we put $f_0(\alpha b) = \alpha k(b)$ then $f_0$ is a linear function defined on the one-dimensional subspace of $X$ which consists of real multiples of $b$. Moreover, $f_0(\alpha b) \le k(\alpha b)$ for all $\alpha \in \mathbb{R}$, since for $\alpha \ge 0$ we have $f_0(\alpha b) = k(\alpha b)$ while for $\alpha < 0$ we have $f_0(\alpha b) = \alpha f_0(b) < 0 \le k(\alpha b)$. By the Hahn-Banach theorem $f_0$ can be extended to a linear real function $f$ such that $f(x) \le k(x)$ for every $x \in X$.

It follows that $f(x) \le 1$ for every $x \in K$, while $f(b) \ge 1$. Thus $K = A - B + b \subset \{x : f(x) \le 1\}$, $f(b) \ge 1$. For every $a \in A$, $\beta \in B$ we have

$$f(a) - f(\beta) + f(b) \le 1$$

and also

$$f(a) \le f(\beta) + 1 - f(b) \le f(\beta)$$

Let us call $\gamma = \sup \{f(a) : a \in A\}$. Hence results

$$f(a) \le \gamma, \quad a \in A$$

$$f(\beta) \ge \gamma, \quad \beta \in B$$

Thence

$$A \subset \{x : f(x) \le \gamma\}, \qquad B \subset \{x : f(x) \ge \gamma\}$$

and the proof is completed.

In the particular case where $B$ is a linear affine subspace of $X$ we have

Corollary 2 (Hahn-Banach theorem in geometrical form). If $A$ is a convex subset of a v.s. $X$, which has an internal point, and if $B$ is a linear affine subspace of $X$ such that $A \cap B = \emptyset$, then there exists an affine hyperplane of $X$ which contains $B$ and does not intersect $A$.

Proof. Let $H = \{x : h(x) = \alpha\}$ be the affine hyperplane of $X$ that separates $A$ and $B$ in consequence of Theorem 3, and let $A \subset \{h(x) \le \alpha\}$, $B \subset \{h(x) \ge \alpha\}$. If $b$ is a point in the linear affine space $B$ we have $h(y) = h(b) = \gamma$ for every $y \in B$. In fact, if $h(b') \ne h(b)$, $b' \in B$ and $B_0 = B - b$, $b' = b + u$, $u \in B_0$, it results $b + \lambda u \in B$ for every $\lambda \in \mathbb{R}$ and $h(u) \ne 0$ so that

$$h(b + \lambda u) = h(b) + \lambda h(u) < \alpha$$

for convenient $\lambda$; a contradiction.

Therefore $h(x) \le \alpha \le h(b)$ for every $x \in A$ and $h(y) = h(b)$ for every $y \in B$, so that the hyperplane $H_1 = \{x : h(x) = h(b)\}$, which contains $B$, separates $A$ and $B$.

Definition 3. A real v.s. $X$ endowed with a topology $\mathcal{T}$ for which the two axioms

(1) $(x, y) \to x + y$ is a continuous map on $X \times X$ into $X$

(2) $(\alpha, x) \to \alpha x$ is a continuous map on $\mathbb{R} \times X$ into $X$

are satisfied, is called a real topological vector space (abbreviated t.v.s.).

If $\mathcal{U} = \{U\}$ is a base of neighbourhoods of the origin $0$ of $X$ in $\mathcal{T}$ and $x_0 \in X$, then $x_0 + U$, $U \in \mathcal{U}$, is a base of neighbourhoods of $x_0$ in the topology $\mathcal{T}$. If $K$ is a convex set in a t.v.s. $X$, the closure $\overline{K}$ and the interior $K^\circ$ of $K$ are convex sets and $\overline{K} = \overline{K^\circ}$ if $K^\circ \ne \emptyset$; moreover, if $x \in K^\circ$, then $x$ is an internal point of $K$; if $x \in K^\circ$ and $y \in \overline{K}$, then every point in the half-open segment $[x, y[$ is interior to $K$. If $K^\circ \ne \emptyset$ and $x$ is an internal point of the convex set $K$, from $\overline{K} = \overline{K^\circ}$ it follows that $x \in K^\circ$.

If $A$ is a subset of a t.v.s. $X$, the intersection of all closed convex sets containing $A$ is called the closed convex hull of $A$ and is denoted by

$$\overline{\operatorname{co}}\, A$$

If $X$ is a t.v.s. and $H = \{x : f(x) = \alpha\}$ is an affine hyperplane of $X$, then $H$ is a closed subset of $X$ if the linear real function $f$ is continuous on $X$. In this case the semispaces $F_\alpha$, $F^\alpha$ [or $G_\alpha$, $G^\alpha$] are open [or closed] subsets of $X$.

Conversely, if for an $\alpha \in \mathbb{R}$ and a linear function $f : X \to \mathbb{R}$ the semispaces $F_\alpha$, $F^\alpha$ are open subsets of $X$, then $f$ is a continuous function. In fact we have

Theorem 4. If $H = \{x : f(x) = \alpha\}$ is an affine hyperplane of $X$ which separates two non-empty subsets $A$, $B$ of $X$, one of which has an interior point, then the linear real function $f : X \to \mathbb{R}$ is continuous.

Proof. Let $x_0$ be an interior point of $A$. Then there exists a neighbourhood $U$ of the origin $0$ in $X$ such that $x_0 + U \subset A$. In the hypothesis that $A \subset G_\alpha$, $B \subset G^\alpha$, we have $f(x_0 + U) \le \alpha$ and therefore, for every $x \in U$, $f(x) \le \alpha - f(x_0)$. Let $V = U \cap (-U)$; then $V$ and $-V$ are neighbourhoods of the origin of $X$ in $\mathcal{T}$ and for every $y \in V$ it results

$$f(y) \le \alpha - f(x_0)$$

$$f(-y) \le \alpha - f(x_0)$$

so that

$$|f(y)| \le \alpha - f(x_0)$$

Then, for every $\varepsilon \in \mathbb{R}_+$, we have

$$|f(y)| \le \varepsilon$$

if $y \in \dfrac{\varepsilon}{\alpha - f(x_0)} V = V'$, and $V'$ is a neighbourhood of the origin $0$ of $X$ in $\mathcal{T}$. Therefore $f$ is continuous at the origin $0$ of $X$. Since $f$ is linear and continuous at $0$ it is continuous everywhere.

Theorems 3 and 4 yield the following separation theorem in t.v.s.

Theorem 5. Let $A$ and $B$ be non-empty disjoint convex subsets of a t.v.s. $X$, one of them having an interior point. Then there exists an affine closed hyperplane which separates $A$ and $B$.

Definition 4. A t.v.s. $X$ is called locally convex (abbreviated l.c.s.) if it is a Hausdorff space such that every neighbourhood of the origin of $X$ contains a convex neighbourhood of $0$.

Every normed space is obviously an l.c.s. In general, every v.s. $X$ endowed with a family $\{p_\gamma : \gamma \in \Gamma\}$ of seminorms in $X$ such that for every $x \in X$, $x \ne 0$, a seminorm $p_\gamma$ exists such that $p_\gamma(x) \ne 0$, is an l.c.s.; a base of neighbourhoods of the origin $0$ is given by all finite intersections of sets $U_{\gamma, \varepsilon} = \{x : p_\gamma(x) \le \varepsilon\}$, $\gamma \in \Gamma$, $\varepsilon \in \mathbb{R}_+$; this topology on $X$ is called the topology generated by the family of seminorms $p_\gamma$. In this topology each of the given seminorms $p_\gamma$ is continuous on $X$. In an l.c.s. we can derive from the preceding theorems of separation the following theorem, which has become a standard tool of the theory.

Theorem 6. Let $A$ and $B$ be non-empty disjoint convex closed subsets of an l.c.s. $X$, one of which is compact. Then there exists a closed affine hyperplane which strictly separates $A$ and $B$.

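Proof. $A - B$ is a convex closed set and $(A - B) \cap \{0\} = \emptyset$. Therefore there exists a convex neighbourhood $U$ of the origin in $X$ such that $(A - B) \cap U = \emptyset$. From Theorem 5 it follows that there exists a closed hyperplane $H$ which separates $A - B$ and $U$. Let $f : X \to \mathbb{R}$ be the non-zero continuous linear function such that

$$H = \{x : f(x) = \alpha\}, \ \alpha \in \mathbb{R}, \quad \text{and} \quad f(A - B) \ge \alpha, \quad f(U) \le \alpha$$

Since $f$ is a non-zero function, there exists an $x \in X$ such that $f(x) \ne 0$; since $U$ is a neighbourhood of $0$, there exists a $\delta \in \mathbb{R}_+$ such that $kx \in U$ for $|k| \le \delta$, and consequently $f(kx) \le \alpha$ if $|k| \le \delta$. Therefore there exists an $\varepsilon \in \mathbb{R}_+$ such that $[-\varepsilon, \varepsilon] \subset \{k f(x) : |k| \le \delta\} \subset f(U)$ and, taking $\varepsilon$ small enough, $\alpha > \varepsilon$. Hence $f(a - b) = f(a) - f(b) \ge \alpha > \varepsilon$ for every $a \in A$, $b \in B$.

Let $c = \inf \{f(a), a \in A\}$; then $c \in \mathbb{R}$ because $A$ is compact and $f$ is continuous on $X$. We have also

$$f(a) \ge c \ge f(b) + \alpha > f(b) + \varepsilon, \quad a \in A, \ b \in B$$

and therefore

$$f(a) \ge c > c - \varepsilon > f(b), \quad a \in A, \ b \in B$$

so that the closed hyperplane $\{x : f(x) = c - \varepsilon\}$ strictly separates $A$ and $B$.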
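The quantities appearing in this proof can be made concrete in the plane. The sketch below is an illustration under stated assumptions, not part of the original argument: $A$ and $B$ are two hypothetical disjoint closed discs in $\mathbb{R}^2$, the linear function is $f(x) = \langle u, x \rangle$ with $u$ the unit vector pointing from the centre of $B$ to the centre of $A$, and the function name `separate_discs` is invented for the example. The separating level is taken halfway between $\sup f(B)$ and $c = \inf f(A)$, which corresponds to choosing $\varepsilon = \alpha / 2$ in the proof.

```python
import math

def separate_discs(c1, r1, c2, r2):
    """Return (u, level, c, sup_B) for the closed discs A = D(c1, r1), B = D(c2, r2).

    f(x) = <u, x> satisfies f(a) >= c > level > sup_B >= f(b) for a in A, b in B.
    The discs are a hypothetical example of compact convex sets in R^2.
    """
    d = (c1[0] - c2[0], c1[1] - c2[1])
    dist = math.hypot(d[0], d[1])
    if dist <= r1 + r2:
        raise ValueError("the discs are not disjoint")
    u = (d[0] / dist, d[1] / dist)                 # unit normal of the hyperplane
    c = u[0] * c1[0] + u[1] * c1[1] - r1           # c = inf f(A)
    sup_B = u[0] * c2[0] + u[1] * c2[1] + r2       # sup f(B)
    level = 0.5 * (c + sup_B)                      # strictly between sup f(B) and c
    return u, level, c, sup_B

u, level, c, sup_B = separate_discs((3.0, 0.0), 1.0, (0.0, 0.0), 1.5)
print(u, level, c, sup_B)
```

Here $c - \sup f(B)$ equals the distance between the centres minus the sum of the radii, which is positive exactly when the discs are disjoint.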

Two important consequences of these theorems are

Corollary 3. If $K$ is a convex subset of a t.v.s. for which $K^\circ \ne \emptyset$, then through every boundary point of $K$ passes a closed supporting hyperplane $H$; i.e. a closed hyperplane $H$ such that $\overline{K} \cap H \ne \emptyset$ and $K$ is contained in one of the closed semispaces determined by $H$. Moreover, $\overline{K}$ is the intersection of the closed semispaces which contain $K$ and are determined by the supporting hyperplanes of $K$.

Corollary 4. If $K$ is a non-empty convex closed subset of an l.c.s., then $K$ is the intersection of all closed semispaces containing $K$.

Proof of Corollary 3. To see that through every boundary point $x_0$ of $K$ passes at least one supporting hyperplane it suffices to apply Theorem 5 to the convex sets $K^\circ$ and $\{x_0\}$. Indeed there exists an affine closed hyperplane $H = \{x : f(x) = \alpha\}$ such that $f(x) \le \alpha$ for every $x \in K^\circ$ and $f(x_0) \ge \alpha$. Then for every $y \in \overline{K}$ it results $f(y) \le \alpha$ and consequently $f(x_0) = \alpha$. To prove the second assertion we first prove that there exists no supporting hyperplane of $K$ containing an interior point of $K$. In fact, assume that $x \in K^\circ \cap H$ where $H = \{x : f(x) = \alpha\}$ is a supporting hyperplane of $K$ such that $K \subset G_\alpha$. There exists $y \in K^\circ$ with $f(y) < \alpha$, since $H$ cannot contain $K^\circ$. Now

$$f(x + \varepsilon(x - y)) = f(x)(1 + \varepsilon) - \varepsilon f(y) = \alpha(1 + \varepsilon) - \varepsilon f(y) > \alpha(1 + \varepsilon) - \varepsilon\alpha = \alpha$$

for every $\varepsilon \in \mathbb{R}_+$. But from $x \in K^\circ$ it follows that $x + \varepsilon(x - y) \in K$ for convenient $\varepsilon \in \mathbb{R}_+$ and therefore $f(x + \varepsilon(x - y)) \le \alpha$ for convenient $\varepsilon \in \mathbb{R}_+$. This is a contradiction. From the first assertion we know that

$$\overline{K} \subset \bigcap_{H \in \mathcal{H}} G_\alpha$$

where $\mathcal{H}$ is the family of all closed supporting hyperplanes $H = \{x : f(x) = \alpha\}$ of $K$, each written so that $K \subset G_\alpha$. It remains to prove that if $y \notin \overline{K}$, then there exists a closed supporting hyperplane $H = \{x : f(x) = \alpha\}$ such that $K \subset G_\alpha$, $f(y) > \alpha$. Let $y \notin \overline{K}$, $x \in K^\circ$; the segment $[x, y[$ contains exactly one boundary point $x_0$ of $K$. There exists a closed supporting hyperplane $H = \{x : f(x) = \alpha\}$ passing through $x_0$ such that $K \subset G_\alpha$; $H$ does not contain $y$, since otherwise it would contain $x$, and this is a contradiction. Since $f(x) < \alpha = f(x_0)$ and $x_0$ lies between $x$ and $y$, it follows that $f(y) > \alpha$.

Proof of Corollary 4. It follows obviously from Theorem 6 and the fact that sets containing exactly one point are compact.

All the properties now proved are of fundamental importance in functional analysis. For example, if $X$ is an l.c.s., from the Hahn-Banach theorem follows the existence of linear continuous non-zero real functions on $X$: it suffices to consider two points $x, y \in X$, $x \ne y$, and to separate them strictly by a closed hyperplane $H$; if $H = \{x : f(x) = \alpha\}$ then $f : X \to \mathbb{R}$ is continuous and $f(x) \ne f(y)$. If $X$ is a t.v.s., we call topological dual of $X$ the v.s. $X^*$ of all the continuous, real linear functions on $X$ endowed with the usual structure.

It is convenient to denote elements of $X^*$ with notations $x^*$, $y^*$ and so on, and to denote the value of the linear continuous real function $x^* \in X^*$ on $x$ with the notation $\langle x, x^* \rangle$.

Then the map $(x, x^*) \to \langle x, x^* \rangle$ is a real bilinear function on $X \times X^*$ and $X^*$ [resp. $X$] is a v.s. of linear real functions on $X$ [resp. $X^*$]; moreover, every $x^* \in X^*$ [resp. $x \in X$] is the linear real function $x \to \langle x, x^* \rangle$ on $X$ [resp. $x^* \to \langle x, x^* \rangle$ on $X^*$]. We can introduce on $X^*$ [resp. on $X$] the $\sigma(X^*, X)$-topology [resp. the $\sigma(X, X^*)$-topology] by taking as a base at $0$ the family of all sets of the form $U(A, \varepsilon) = \{x^* \in X^* : |\langle x, x^* \rangle| < \varepsilon, \ x \in A\}$ where $A$ is a finite subset of $X$, $\varepsilon \in \mathbb{R}_+$ [resp. $U(A, \varepsilon) = \{x \in X : |\langle x, x^* \rangle| < \varepsilon, \ x^* \in A\}$ where $A$ is a finite subset of $X^*$, $\varepsilon \in \mathbb{R}_+$].

These topologies are called weak topologies on $X^*$ [resp. on $X$] associated with the duality between $X$ and $X^*$. Since $x^* \to |\langle x, x^* \rangle|$ is a seminorm for every $x \in X$ and for every pair $x^*, y^* \in X^*$, $x^* \ne y^*$, there exists an $x$ such that $\langle x, x^* \rangle \ne \langle x, y^* \rangle$, it results that $X^*$ endowed with $\sigma(X^*, X)$ is an l.c.s. If $X$ (with the initial topology) is an l.c.s., then also $X$ endowed with $\sigma(X, X^*)$ is an l.c.s.; in fact $x \to |\langle x, x^* \rangle|$ is a seminorm for every $x^* \in X^*$ and for every pair $x, y \in X$, $x \ne y$, there exists, in the hypothesis that $X$ is an l.c.s., an $x^* \in X^*$ such that

$$\langle x, x^* \rangle \ne \langle y, x^* \rangle.$$

The $\sigma(X, X^*)$ topology on $X$ is coarser than the initial topology on $X$; therefore the subsets of $X$ closed in the $\sigma(X, X^*)$ topology are closed also in the initial topology.

It is of fundamental importance that from Corollary 4 follows

Corollary 5. In an l.c.s. $X$ every closed convex set is closed in the $\sigma(X, X^*)$ topology; therefore a convex subset in an l.c.s. $X$ is closed if and only if it is weakly closed.

Proof. From Corollary 4 it follows that $K = \bigcap_{G_\alpha \in S} G_\alpha$, where $S$ is the family of all closed semispaces $G_\alpha = \{x : f(x) \le \alpha\}$ such that $K \subset G_\alpha$. But every $f$ considered in the definition of a $G_\alpha \in S$ is also continuous in the $\sigma(X, X^*)$ topology of $X$; therefore every $G_\alpha \in S$ is also closed in that topology. Then from $K = \bigcap_{G_\alpha \in S} G_\alpha$ it follows that $K$ is closed in the $\sigma(X, X^*)$ topology.

In the following we shall need also two theorems concerning normed spaces.

Definition 5. Let $X$ be a normed v.s. and $f : X \to \mathbb{R}$ a linear continuous function on $X$. We call $\|f\| = \inf F$ where $F = \{\beta \in \mathbb{R}_+ : |f(x)| \le \beta \|x\|, \ x \in X\}$. It is obvious that $F$ is not empty because $f$ is continuous. $\|f\|$ is called the norm of $f$.

Theorem 7. Let $X$ be a normed v.s., $M$ a linear subspace of $X$ and $f : M \to \mathbb{R}$ a continuous linear function on $M$. Then there exists a continuous linear function $F : X \to \mathbb{R}$ such that $F(x) = f(x)$ on $M$ and $\|F\| = \|f\|$.

Proof. Let $\|f\|$ be the norm of $f : M \to \mathbb{R}$ and $p(x) = \|f\| \, \|x\|$. Then $p : X \to \mathbb{R}$ is a seminorm on $X$ such that $|f(x)| \le p(x)$ on $M$. Then by Corollary 1 there exists a linear extension $F : X \to \mathbb{R}$ of $f$ such that $|F(x)| \le p(x) = \|f\| \, \|x\|$. Thus $\|F\| \le \|f\|$. On the other hand, since $F$ is an extension of $f$ we must have $\|f\| \le \|F\|$. This completes the proof of the theorem.

Theorem 8 (Mazur). Let $X$ be a normed v.s. and $\{x_n\}_{n \in \mathbb{N}}$ a sequence in $X$ which converges to $x$ in the weak topology of $X$. Then there exists a sequence

$$y_n = \sum_{i=n}^{p_n} \alpha_i^n x_i, \qquad \alpha_i^n \ge 0, \quad \sum_{i=n}^{p_n} \alpha_i^n = 1$$

whose elements are convex combinations of the given $x_n$, which converges in norm to $x$.

Proof. For each $n \in \mathbb{N}$ let us define $A_n = \bigcup_{i=n}^{\infty} \{x_i\}$. Then $x$ belongs to the weak closure of $A_n$, for every $n \in \mathbb{N}$. Therefore $x$ belongs to the weak closure of $\operatorname{co}(A_n)$ $\forall n \in \mathbb{N}$. But, by Corollary 5, $x \in \overline{\operatorname{co}}(A_n)$ $\forall n \in \mathbb{N}$. Then for each $n \in \mathbb{N}$ we can take a $y_n \in \operatorname{co}(A_n)$ such that $\|x - y_n\| < 1/n$. The theorem is proved.
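A classical illustration of Theorem 8, not taken from the text, is the sequence of standard unit vectors $e_n$ in the sequence space $\ell^2$: it converges weakly to $0$ although $\|e_n\| = 1$ for every $n$. The sketch below, which works with finite truncations and uses the hypothetical helper names `block_average_norm` and `mazur_block_length`, computes the norm of the convex combination $y_n = (e_n + \dots + e_{n+N-1})/N$. This norm equals $1/\sqrt{N}$, so that $N = n^2 + 1$ gives $\|y_n\| < 1/n$ as in the proof.

```python
import math

def block_average_norm(n, N):
    """l2 norm of y = (e_n + ... + e_{n+N-1}) / N.

    y is a convex combination of the vectors x_i = e_i with i >= n,
    stored here in a finite list long enough to hold its nonzero entries.
    """
    y = [0.0] * (n + N)
    for i in range(n, n + N):
        y[i] = 1.0 / N
    return math.sqrt(sum(c * c for c in y))

def mazur_block_length(n):
    """A block length N with 1/sqrt(N) < 1/n, namely N = n*n + 1."""
    return n * n + 1

for n in (1, 2, 5, 10):
    N = mazur_block_length(n)
    print(n, N, block_average_norm(n, N), 1.0 / n)
```

The equal weights $1/N$ are one convenient choice; the theorem only asserts that some convex combinations work.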

2. CONVEX FUNCTIONS

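Let $X$ be a (real) v.s. and $\overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty\} \cup \{+\infty\}$.

Definition 1. If $K$ is a convex set in $X$ and $f : K \to \overline{\mathbb{R}}$ is a function on $K$ whose values are real numbers, $-\infty$ or $+\infty$, we say that $f$ is convex on $K$ if for any $x \in K$, $y \in K$ we have

$$f(\lambda x + (1 - \lambda) y) \le \lambda f(x) + (1 - \lambda) f(y), \quad \lambda \in [0, 1] \qquad (2.1)$$

provided the second member is defined; i.e. the inequality subsists for all $x, y \in K$ with the exception of the points where $f(x) = -f(y) = \pm\infty$, with the usual computation rules in $\overline{\mathbb{R}}$:

$$a + \infty = +\infty + a = +\infty \quad \text{if } a \ne -\infty$$

$$a - \infty = -\infty + a = -\infty \quad \text{if } a \ne +\infty$$

$$a \cdot \infty = \infty, \quad a \cdot (-\infty) = -\infty \quad \text{if } a \in \mathbb{R}_+, \qquad 0 \cdot \infty = 0 \cdot (-\infty) = 0$$

We say that $f$ is strictly convex if (2.1) is a strict inequality for $x \ne y$, $\lambda \in \, ]0, 1[$.

From Definition 1 it follows obviously that, if $x_1, x_2, \dots, x_n \in K$,

$$\alpha_i \ge 0; \ i = 1, \dots, n; \qquad \sum_{i=1}^{n} \alpha_i = 1$$

then

$$f\Big(\sum_{i=1}^{n} \alpha_i x_i\Big) \le \sum_{i=1}^{n} \alpha_i f(x_i)$$

provided the second member is defined.

If $f : K \to \overline{\mathbb{R}}$ is a convex function on the convex set $K$, then for every $\alpha \in \overline{\mathbb{R}}$ the sets

$$G_\alpha = \{x \in K : f(x) \le \alpha\}, \qquad F_\alpha = \{x \in K : f(x) < \alpha\}$$

are convex; but the fact that $G_\alpha$ [resp. $F_\alpha$] is a convex set for every $\alpha \in \overline{\mathbb{R}}$ does not imply that $f$ is a convex function.

For example, if $K = \mathbb{R}$, any increasing $f : \mathbb{R} \to \overline{\mathbb{R}}$ is such that $G_\alpha$ [resp. $F_\alpha$] is convex for every $\alpha \in \overline{\mathbb{R}}$.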
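The last remark can be made concrete with a short numerical sketch. The choice $f(t) = t^3$ is an assumption made for illustration (it is one increasing function among many), and the helper names are hypothetical. On a grid, every sublevel set of $f$ is a gap-free run of points, a discrete stand-in for an interval, yet inequality (2.1) fails for $x = -2$, $y = 0$, $\lambda = 1/2$.

```python
def satisfies_2_1(f, x, y, lam):
    """Check inequality (2.1) at one triple (x, y, lam) for a real-valued f."""
    return f(lam * x + (1 - lam) * y) <= lam * f(x) + (1 - lam) * f(y)

def sublevel_has_no_gaps(f, alpha, grid):
    """On a sorted grid, test that {t : f(t) <= alpha} is a run of consecutive points."""
    idx = [i for i, t in enumerate(grid) if f(t) <= alpha]
    return not idx or idx[-1] - idx[0] + 1 == len(idx)

cube = lambda t: t ** 3                      # increasing on R, but not convex
grid = [i / 10 for i in range(-30, 31)]      # sample points in [-3, 3]

levels_ok = all(sublevel_has_no_gaps(cube, a, grid) for a in (-8, -1, 0, 1, 8))
fails_2_1 = not satisfies_2_1(cube, -2.0, 0.0, 0.5)
print(levels_ok, fails_2_1)
```

At the chosen point the left side of (2.1) is $f(-1) = -1$ while the right side is $\tfrac{1}{2} f(-2) + \tfrac{1}{2} f(0) = -4$.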

For any convex function $f : X \to \overline{\mathbb{R}}$ we call effective domain of $f$ the convex set

$$\operatorname{dom} f = \{x \in X : f(x) < +\infty\}$$

By a proper convex function on $X$ we shall mean a convex function with values in $\mathbb{R} \cup \{+\infty\}$ which is not merely the constant function $f(x) = +\infty$. Therefore, if $f : X \to \overline{\mathbb{R}}$ is a proper convex function, $\operatorname{dom} f$ is a nonempty convex set on which $f$ is real-valued. Conversely, if $K$ is a nonempty convex subset of $X$ and $f$ is a real-valued function on $K$ which is convex, i.e. satisfies (2.1) when $x \in K$, $y \in K$, then one can obtain a proper convex function on $X$ by setting $f(x) = +\infty$ for every $x \notin K$.

A very useful example of a proper convex function is the indicator function $\chi_K$ of a non-empty convex subset $K$ of $X$, which is defined by

$$\chi_K(x) = \begin{cases} 0 & \text{if } x \in K \\ +\infty & \text{if } x \notin K \end{cases}$$

It is obvious that a subset $K$ of $X$ is convex if and only if $\chi_K$ is a convex function on $X$. So the theory of convex subsets of $X$ is a part of the theory of the proper convex functions on $X$.
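The equivalence between convexity of $K$ and convexity of its indicator function can be tested on examples, and doing so shows why the rule $0 \cdot \infty = 0$ matters in practice. The following sketch makes several assumptions: $X = \mathbb{R}$, the sets are hypothetical examples, convexity is only tested on a finite grid, and the helper `times` is introduced because ordinary floating-point arithmetic gives `0 * inf = nan` rather than $0$.

```python
import math

INF = math.inf

def times(lam, v):
    """lam * v for lam >= 0, with the convention 0 * (+inf) = 0 used in the text."""
    return 0.0 if lam == 0 else lam * v

def indicator(in_K):
    """Indicator function of K: 0 on K, +inf outside K."""
    return lambda x: 0.0 if in_K(x) else INF

def convex_on_grid(f, grid, lams):
    """Test inequality (2.1) on all grid pairs; f takes values in R U {+inf}."""
    for x in grid:
        for y in grid:
            for lam in lams:
                rhs = times(lam, f(x)) + times(1 - lam, f(y))
                if f(lam * x + (1 - lam) * y) > rhs:
                    return False
    return True

grid = [i / 2 for i in range(-2, 8)]          # -1.0, -0.5, ..., 3.5
lams = [0.0, 0.25, 0.5, 0.75, 1.0]

chi_interval = indicator(lambda t: 0.0 <= t <= 1.0)                      # K = [0, 1]
chi_two_parts = indicator(lambda t: 0.0 <= t <= 1.0 or 2.0 <= t <= 3.0)  # not convex

print(convex_on_grid(chi_interval, grid, lams))
print(convex_on_grid(chi_two_parts, grid, lams))
```

Since both functions take values in $\mathbb{R} \cup \{+\infty\}$, the undefined sum $(+\infty) + (-\infty)$ never occurs, so the second member of (2.1) is always defined here.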

Definition 2. If f :X -*R is an extended real-valued function on X , the set

epi f = {(x, a) G X X R : f(x) § a }

is called the epigraph o f f.It is obvious that a subset G С X X R is the epigraph of som e function

if and only if for each x G X the set {a G R; (x, a) G G } is either R, 0 o r a closed internal of the type [d, +oo[. Then the function f is obtained from G = epi f by

f(x) = inf {a G R : (x, a) G G }

with the usual rule inf 0 = + 0 0 .We have the follow ing characterization o f convex functions:

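Theorem 1. In a v.s. $X$ an extended real-valued function $f : X \to \overline{\mathbb{R}}$ is convex if and only if $\operatorname{epi} f$ is a convex subset of $X \times \mathbb{R}$.

Proof. Let $f : X \to \overline{\mathbb{R}}$ be a convex function on $X$ and let $(x, a)$, $(y, b)$ be in $\operatorname{epi} f$. We prove that $\lambda(x, a) + (1 - \lambda)(y, b) \in \operatorname{epi} f$ for every $\lambda \in [0,1]$. In fact, $f(x) \le a < +\infty$, $f(y) \le b < +\infty$ and for every $\lambda \in [0,1]$ it results

$$f[\lambda x + (1 - \lambda) y] \le \lambda f(x) + (1 - \lambda) f(y) \le \lambda a + (1 - \lambda) b$$

so that

$$\lambda(x, a) + (1 - \lambda)(y, b) = (\lambda x + (1 - \lambda) y,\ \lambda a + (1 - \lambda) b) \in \operatorname{epi} f$$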

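Conversely, let us suppose that $\operatorname{epi} f$ is a convex subset of $X \times \mathbb{R}$. Then

$$\operatorname{dom} f = \{x \in X : f(x) < +\infty\}$$

is convex. In fact, if $x, y \in \operatorname{dom} f$, then there exist $a, b \in \mathbb{R}$ such that $f(x) \le a$, $f(y) \le b$ and therefore $(x, a) \in \operatorname{epi} f$, $(y, b) \in \operatorname{epi} f$. From the convexity of $\operatorname{epi} f$ it follows that for every $\lambda \in [0,1]$

$$\lambda(x, a) + (1 - \lambda)(y, b) = (\lambda x + (1 - \lambda) y,\ \lambda a + (1 - \lambda) b) \in \operatorname{epi} f$$

and so

$$f(\lambda x + (1 - \lambda) y) \le \lambda a + (1 - \lambda) b \in \mathbb{R} \qquad (2.1a)$$

and $\operatorname{dom} f$ is convex. Let us prove that $f : \operatorname{dom} f \to \mathbb{R} \cup \{-\infty\}$ is convex. This follows from (2.1a) by taking in it $a = f(x)$, $b = f(y)$ if $f(x), f(y) \in \mathbb{R}$, and by passing to the limit for $a \to -\infty$ (or $b \to -\infty$) if $f(x) = -\infty$ (or $f(y) = -\infty$). Now we can easily see that $f : X \to \overline{\mathbb{R}}$ is convex: in fact, (2.1) holds if $x, y \in \operatorname{dom} f$, and it follows from the computation rules in $\overline{\mathbb{R}}$ if $x, y \notin \operatorname{dom} f$. If, finally, $x \in \operatorname{dom} f$, $y \notin \operatorname{dom} f$, then (2.1) is not required if $f(x) = -\infty$ and it follows from the computation rules in $\overline{\mathbb{R}}$ if $f(x) \in \mathbb{R}$.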

The class of all convex functions on a v.s. $X$ has the following closure properties, which imply moreover that the class of all convex functions is a convex class.

Theorem 2. The supremum of any family $\{f_\alpha : \alpha \in A\}$ of convex functions on a v.s. $X$ is a convex function on $X$. The sum of two convex functions $f$, $g$ on a v.s. $X$ is a convex function on $X$ if we let $(f + g)(x) = +\infty$ when $f(x) = -g(x) = \pm\infty$. If $f$ is a convex function on a v.s. $X$ and $a \in \mathbb{R}_+$, then $af$ is a convex function on $X$.

Proof. The first assertion follows from the fact that

$$\operatorname{epi}\big(\sup \{f_\alpha : \alpha \in A\}\big) = \bigcap_{\alpha \in A} \operatorname{epi} f_\alpha$$

and from theorem 1. To prove the second assertion, let $f$, $g$ be convex functions on $X$, and let $x, y \in X$. For every $\lambda \in [0,1]$ we have

$$f[\lambda x + (1 - \lambda) y] \le \lambda f(x) + (1 - \lambda) f(y)$$

and

$$g[\lambda x + (1 - \lambda) y] \le \lambda g(x) + (1 - \lambda) g(y)$$

and, by addition, the corresponding statement for $f + g$; in fact, this is obvious if the additions are done in $\mathbb{R}$, and it follows from the calculation rules, including $(+\infty) + (-\infty) = +\infty$, if the additions are done in $\overline{\mathbb{R}}$. The third assertion is obvious.
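As an informal numerical illustration of theorem 2 (a sketch only, not part of the original lectures), the following Python fragment tests the convexity inequality (2.1) on a finite grid of the real line. The helper name `is_convex_on_grid`, the grid and the sample functions are assumptions chosen for illustration; such a test can refute convexity but can never prove it.

```python
def is_convex_on_grid(f, points, lambdas=(0.25, 0.5, 0.75), tol=1e-12):
    """Check f(l*x + (1-l)*y) <= l*f(x) + (1-l)*f(y) for all grid pairs."""
    return all(
        f(l * x + (1 - l) * y) <= l * f(x) + (1 - l) * f(y) + tol
        for x in points for y in points for l in lambdas
    )

# Illustrative grid on [-3, 3] (hypothetical choice).
points = [i / 4.0 for i in range(-12, 13)]

# Three convex functions on R.
family = [lambda x: x * x, lambda x: abs(x - 1.0), lambda x: -x]

def upper_envelope(x):
    """Pointwise supremum of the family: convex by theorem 2."""
    return max(g(x) for g in family)

def weighted_sum(x):
    """Sum and positive multiple of convex functions: convex by theorem 2."""
    return 2.0 * x * x + abs(x - 1.0)

def lower_envelope(x):
    """Pointwise infimum of two convex parabolas: not convex in general."""
    return min(x * x, (x - 2.0) ** 2)

sup_is_convex = is_convex_on_grid(upper_envelope, points)
sum_is_convex = is_convex_on_grid(weighted_sum, points)
inf_is_convex = is_convex_on_grid(lower_envelope, points)
```

The infimum is included only as a contrast: theorem 2 makes no claim about it, and the grid test rejects it at the midpoint of $0$ and $2$.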

The most interesting part of the theory of convex functions is developed in a l.c.s. $X$, as we suppose from now on.

We recall that a function $f : X \to \overline{\mathbb{R}}$ is said to be lower-semicontinuous on $X$ (abbreviated l.s.c.) if, for every $x_0 \in X$, we have

$$f(x_0) \le \liminf_{x \to x_0} f(x)$$

A function $f$ is said to be upper-semicontinuous on $X$ (abbreviated u.s.c.) if $-f$ is l.s.c. on $X$. If $f$ is l.s.c. and u.s.c. on $X$ it is continuous on $X$. Obviously the indicator function of a set $A$ in $X$ is l.s.c. [resp. u.s.c.] if and only if $A$ is a closed [resp. open] set. The following theorem gives a characterization of l.s.c. functions.

Theorem 3. Let $f : X \to \overline{\mathbb{R}}$ be an extended real-valued function on a l.c.s. $X$. Then the following conditions are equivalent:

(1) $f$ is l.s.c. on $X$
(2) the sets $F_a = \{x \in X : f(x) > a\}$ are open for every $a \in \mathbb{R}$
(3) $\operatorname{epi} f$ is a closed subset of $X \times \mathbb{R}$.

Proof. (1) implies (2). In fact, if $y \in F_a$, it follows from (1) that $\liminf_{x \to y} f(x) \ge f(y) > a$. Therefore there exists a neighbourhood $U(y)$ of $y$ where $f(x) > a$. Hence $U(y) \subset F_a$ and $F_a$ is open.

(2) implies (1). Let $x_0 \in X$, $f(x_0) \ne -\infty$ and $a < f(x_0)$, $a \in \mathbb{R}$; then $x_0 \in F_a$ and $F_a$ is open. Therefore $\inf \{f(x) : x \in F_a\} \ne -\infty$ and also $\liminf_{x \to x_0} f(x) \ge a$, and by the arbitrariness of $a$, $\liminf_{x \to x_0} f(x) \ge f(x_0)$. Nothing must be proved if $f(x_0) = -\infty$.

To prove the equivalence between (2) and (3) we define the map $\varphi : X \times \mathbb{R} \to \overline{\mathbb{R}}$ with value $\varphi(x, a) = f(x) - a$ and observe that $f$ is l.s.c. on $X$ if and only if $\varphi$ is l.s.c. on $X \times \mathbb{R}$. But, as we have proved, $\varphi$ is l.s.c. on $X \times \mathbb{R}$ if and only if the set $\{(x, a) \in X \times \mathbb{R} : \varphi(x, a) = f(x) - a \le \alpha\}$ is closed for every $\alpha \in \mathbb{R}$. But this set coincides with the set $\{(x, a) \in X \times \mathbb{R} : f(x) - \alpha \le a\}$, and this set is a translation of $\operatorname{epi} f$; therefore it is closed if and only if $\operatorname{epi} f$ is closed. The proof of the theorem is so completed.

Corollary 1. If $f : X \to \overline{\mathbb{R}}$ is a convex l.s.c. function on a l.c.s. $X$, then $f$ is l.s.c. in the weak topology $\sigma(X, X^*)$.

For every $a \in \mathbb{R}$ the set $G_a = \{x \in X : f(x) \le a\}$ is convex and closed, therefore it is convex and closed in the weak $\sigma(X, X^*)$ topology. So $f$ is l.s.c. in this topology.

The following theorems give us conditions under which a convex function is continuous.

Theorem 4. If a proper convex function $f : X \to \overline{\mathbb{R}}$ on a t.v.s. is bounded above by a real constant in a neighbourhood of a point $x \in X$, then $f$ is continuous at $x$.


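Proof. We can suppose that $x = 0$ and that $f(0) = 0$. Let $U$ be a neighbourhood of the origin where $f(x) \le a \in \mathbb{R}$. Let $V = U \cap (-U)$ and observe that $V$ is a symmetric neighbourhood of the origin. Let $\varepsilon \in\, ]0,1[$. If $x \in \varepsilon V$ we have $x/\varepsilon \in V$ and therefore, by the convexity of $f$,

$$f(x) = f\Big((1 - \varepsilon)\, 0 + \varepsilon\, \frac{x}{\varepsilon}\Big) \le (1 - \varepsilon) f(0) + \varepsilon f\Big(\frac{x}{\varepsilon}\Big) \le \varepsilon a$$

Moreover $-x/\varepsilon \in V$ and therefore

$$0 = f(0) = f\Big(\frac{1}{1 + \varepsilon}\, x + \frac{\varepsilon}{1 + \varepsilon}\Big(-\frac{x}{\varepsilon}\Big)\Big) \le \frac{1}{1 + \varepsilon} f(x) + \frac{\varepsilon}{1 + \varepsilon} f\Big(-\frac{x}{\varepsilon}\Big) \le \frac{1}{1 + \varepsilon} f(x) + \frac{\varepsilon a}{1 + \varepsilon}$$

so that $f(x) \ge -\varepsilon a$. We have therefore $|f(x)| \le \varepsilon a$ if $x \in \varepsilon V$, and $f$ is so continuous at the point considered.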

Theorem 5. Let $f : X \to \overline{\mathbb{R}}$ be a proper convex function on a t.v.s. $X$. If there exists an open non-empty set $U$ where $f$ is bounded above by a real constant, then $(\operatorname{dom} f)^\circ \supset U$ and $f$ is continuous on $(\operatorname{dom} f)^\circ$.

Proof. From the preceding theorem it follows that $f$ is continuous at every $x \in U$; moreover it is obvious that $U \subset (\operatorname{dom} f)^\circ$.

For every $y \in (\operatorname{dom} f)^\circ$ and a fixed $x_0 \in U$ let $\rho > 1$ be such that $z = x_0 + \rho (y - x_0) \in (\operatorname{dom} f)^\circ$. Let $t = 1 - (1/\rho) \in\, ]0,1[$ and consider the map $\varphi : x \in X \mapsto t x + (1 - t) z \in X$.

This is a continuous bijective map of $X$ onto $X$ whose inverse $\varphi^{-1}$ is also continuous. Therefore $\varphi(U)$ is an open subset of $X$ containing

$$\varphi(x_0) = \Big(1 - \frac{1}{\rho}\Big) x_0 + \frac{1}{\rho} z = \frac{1}{\rho}\big(z + (\rho - 1) x_0\big) = y$$

For every $w = \varphi(x)$, $x \in U$, we have, denoting by $M$ an upper bound of $f$ on $U$,

$$f(w) = f(t x + (1 - t) z) \le t f(x) + (1 - t) f(z) \le t M + (1 - t) f(z) < +\infty$$

since $z \in (\operatorname{dom} f)^\circ$. We have so proved that $f$ is bounded above in the neighbourhood $\varphi(U)$ of $y$. Therefore, by theorem 4, it is continuous at $y$. The theorem is completely proved. In the particular case when $X$ is a normed space, we have:

Theorem 6. If $f : X \to \overline{\mathbb{R}}$ is a proper convex function on a normed space $X$ and if there exists a non-empty open subset $U$ of $X$ where $f$ is bounded above by a real $M$, then $(\operatorname{dom} f)^\circ \ne \emptyset$ and $f$ is locally Lipschitzian on $(\operatorname{dom} f)^\circ$.

Proof. By theorem 5, $f$ is continuous on $(\operatorname{dom} f)^\circ$. Therefore, if $x \in (\operatorname{dom} f)^\circ$, there exists $r_0 \in \mathbb{R}_+$ such that if $\|y - x\| \le r_0$ it is $|f(y) - f(x)| \le 1$, and consequently there are real constants $m$, $M$ such that $m \le f(y) \le M$ where $\|y - x\| \le r_0$. Let us consider an $r \in\, ]0, r_0[$ and a point $y_1$ such that $\|y_1 - x\| \le r$. If we let $G(z) = f(y_1 + z) - f(y_1)$ for $\|z\| \le r_0 - r$, then $G(0) = 0$ and $G(z) \le M - m$ if $\|z\| \le r_0 - r$.

Another consequence of theorem 4 is that, for every $\varepsilon \in\, ]0,1[$, $|G(z)| \le \varepsilon (M - m)$ if $\|z\| \le \varepsilon (r_0 - r)$. If we take $y \in X$ such that $0 < \|y - y_1\| < r_0 - r$, then

$$\|y - y_1\| = \varepsilon (r_0 - r) \quad \text{with} \quad \varepsilon = \frac{\|y - y_1\|}{r_0 - r} \in\, ]0,1[$$

so that

$$|f(y) - f(y_1)| = |G(y - y_1)| \le \frac{M - m}{r_0 - r}\, \|y - y_1\|$$

If $y_2$ is another point in $X$ for which $\|y_2 - x\| \le r$ we choose on the segment $[y_1, y_2]$ a convenient number of points $z_1 = y_1, z_2, z_3, \dots, z_p = y_2$ such that

$$\|z_{i+1} - z_i\| = \frac{1}{p - 1}\, \|y_2 - y_1\| < r_0 - r, \quad i = 1, 2, \dots, p - 1$$

Then, for every consecutive pair $z_i$, $z_{i+1}$, we have

$$|f(z_{i+1}) - f(z_i)| \le \frac{M - m}{r_0 - r}\, \|z_{i+1} - z_i\|$$

and finally, summing up, we see that

$$|f(y_2) - f(y_1)| \le \frac{M - m}{r_0 - r}\, \|y_2 - y_1\|$$

The theorem is completely proved. In the special case where $X = \mathbb{R}^n$ we have

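Theorem 7. If $f : \mathbb{R}^n \to \overline{\mathbb{R}}$ is a convex proper function, then it is continuous in $(\operatorname{dom} f)^\circ$.

Proof. If $(\operatorname{dom} f)^\circ \ne \emptyset$ it contains $n + 1$ points $x_1, x_2, \dots, x_{n+1}$ which are affinely independent. If $x$ is in the open convex set $(\operatorname{co} \{x_1, \dots, x_{n+1}\})^\circ$ we have

$$x = \sum_{i=1}^{n+1} \alpha_i x_i, \quad \alpha_i > 0, \quad \sum_{i=1}^{n+1} \alpha_i = 1$$

and from the convexity of $f$

$$f(x) = f\Big(\sum_{i=1}^{n+1} \alpha_i x_i\Big) \le \sum_{i=1}^{n+1} \alpha_i f(x_i) \le \max_i f(x_i) < +\infty$$

Thus $f$ is bounded above on a non-empty open set, and our assertion follows now from theorem 5.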


In the first section we proved that every convex closed set in a l.c.s. can be obtained as the intersection of all the semispaces containing it. Now we investigate similar properties for convex l.s.c. functions in a l.c.s.

Definition 3. In a t.v.s. $X$ we call affine continuous function an $f : X \to \mathbb{R}$ such that $f(x) = \langle x, x^* \rangle + a$, where $x^* \in X^*$ and $a \in \mathbb{R}$.

Theorem 8. A proper convex l.s.c. function $f : X \to \overline{\mathbb{R}}$ in a l.c.s. $X$ is the pointwise supremum of all affine continuous functions $h : X \to \mathbb{R}$ such that $h(x) \le f(x)$ for every $x \in X$.

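Proof. It suffices to prove that for every $\bar{x} \in X$ and every real $\bar{a} < f(\bar{x})$ there exists an affine continuous function $h$ on $X$ such that $\bar{a} < h(\bar{x})$ and $h(x) \le f(x)$ for every $x \in X$. In order to prove it, we observe that $\operatorname{epi} f$ is a convex closed subset of $X \times \mathbb{R}$ and that $(\bar{x}, \bar{a}) \notin \operatorname{epi} f$.

By means of the separation theorem we can strictly separate $\operatorname{epi} f$ and $(\bar{x}, \bar{a})$ using an affine closed hyperplane $H = \{(x, a) : \langle x, x^* \rangle + \alpha a = \gamma\}$ of $X \times \mathbb{R}$, where $x^* \in X^*$, $\alpha, \gamma \in \mathbb{R}$ and $\langle x, x^* \rangle + \alpha a$ is a non-zero linear function on $X \times \mathbb{R}$. It is also

$$\langle \bar{x}, x^* \rangle + \alpha \bar{a} < \gamma \qquad (2.2)$$

$$\langle x, x^* \rangle + \alpha a > \gamma \quad \text{if } (x, a) \in \operatorname{epi} f \qquad (2.3)$$

Now if $f(\bar{x}) \in \mathbb{R}$, then $(\bar{x}, f(\bar{x})) \in \operatorname{epi} f$ and the second inequality gives us

$$\langle \bar{x}, x^* \rangle + f(\bar{x})\, \alpha > \gamma \qquad (2.4)$$

so that

$$(\bar{a} - f(\bar{x}))\, \alpha < 0$$

It must be $\alpha > 0$, and by dividing (2.2), (2.3) by $\alpha$ we obtain

$$\bar{a} < \frac{\gamma}{\alpha} - \Big\langle \bar{x}, \frac{x^*}{\alpha} \Big\rangle, \qquad \frac{\gamma}{\alpha} - \Big\langle x, \frac{x^*}{\alpha} \Big\rangle < f(x) \quad \forall x \in \operatorname{dom} f$$

and this proves that the affine continuous function $h(x) = \gamma/\alpha - \langle x, x^*/\alpha \rangle$ has the required property. If $f(\bar{x}) = +\infty$ and $\alpha \ne 0$ we can conclude as in the preceding case.

Let now $f(\bar{x}) = +\infty$ and $\alpha = 0$. In this case (2.2), (2.3) say that the affine continuous function $K(x) = \gamma - \langle x, x^* \rangle$ on $X$ is such that $K(\bar{x}) > 0$ and $K(x) < 0$ if $x \in \operatorname{dom} f$. Let us take now $(y, b)$ such that $y \in \operatorname{dom} f$, $b < f(y)$ and construct, as in the first case, an affine continuous function $K'(x) = \beta - \langle x, y^* \rangle$ such that $b < K'(y)$ and $K'(x) \le f(x)$ for every $x \in X$. Then, for every $c \in \mathbb{R}_+$, the affine continuous function

$$\ell(x) = \beta - \langle x, y^* \rangle + c K(x)$$

is such that $\ell(x) \le f(x)$ if $x \in \operatorname{dom} f$, and hence for every $x \in X$.

At this point let us remember that $K(\bar{x}) > 0$ and choose $c$ so great that $\ell(\bar{x}) > \bar{a}$.

Thus the theorem is completely proved. If we recall that the supremum of a family of convex l.s.c. functions in a t.v.s. $X$ is a convex l.s.c. function, we have also proved

Theorem 9. A proper function $f : X \to \overline{\mathbb{R}}$ in a l.c.s. $X$ is convex l.s.c. if and only if it is the supremum of a family of affine continuous functions.

Let us introduce the following definition: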

Definition 4. In a l.c.s. $X$ let $f : X \to \overline{\mathbb{R}}$ be an extended real-valued function. We call regularized of $f$ (abbreviated $\Gamma(f)$) the function

$$\Gamma(f) = \sup \{\varphi : \varphi \text{ affine continuous function on } X \text{ such that } \varphi(x) \le f(x) \text{ in } X\}$$

Obviously $\Gamma(f)$ is a convex l.s.c. function on $X$ and it results

$$(\Gamma f)(x) \le f(x) \quad \forall x \in X$$

In theorem 8 we have also proved the following.

Theorem 10. If $f : X \to \overline{\mathbb{R}}$ is a proper convex function on a l.c.s. $X$, then $f = \Gamma(f)$ if and only if $f$ is a convex l.s.c. function.

A useful approach to the regularization is provided by the notion of the conjugate of a given function.

Definition 5. In a l.c.s. $X$ let $f : X \to \overline{\mathbb{R}}$ be an extended real-valued function. For every $x^* \in X^*$ let

$$f^*(x^*) = \sup \{\langle x, x^* \rangle - f(x) : x \in X\}$$

The function $f^* : X^* \to \overline{\mathbb{R}}$ so defined is called the conjugate function of $f$. It is immediately seen that an affine continuous function on $X$, $h(x) = \langle x, x^* \rangle - a$, satisfies the condition $h(x) \le f(x)$ on $X$ if and only if $\langle x, x^* \rangle - f(x) \le a$ for every $x \in X$, i.e. if and only if $f^*(x^*) \le a$. It is also immediate that for every $x^* \in X^*$ it results $f^*(x^*) = \sup \{\langle x, x^* \rangle - f(x) : x \in \operatorname{dom} f\}$. The proof of the following theorem is obvious.

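Theorem 11. If $f, g : X \to \overline{\mathbb{R}}$ are extended real-valued functions in a l.c.s. $X$, then

(i) $f^*(0) = -\inf \{f(x) : x \in X\}$

(ii) $f(x) \le g(x)$ for every $x \in X$ implies $f^*(x^*) \ge g^*(x^*)$ for every $x^* \in X^*$

(iii) for every $\lambda \in \mathbb{R}_+$, $\lambda > 0$, we have $(\lambda f)^*(x^*) = \lambda f^*(x^*/\lambda)$

(iv) for every $\alpha \in \mathbb{R}$ we have $(f + \alpha)^*(x^*) = f^*(x^*) - \alpha$

(v) $\langle x, x^* \rangle \le f(x) + f^*(x^*)$ for every $x \in X$, $x^* \in X^*$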
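The properties of theorem 11 can be observed numerically when $X = \mathbb{R}$. The Python sketch below is an illustration only: it replaces the supremum over $X$ by a maximum over a finite grid, and the names `conjugate`, `xs` and the choice $f(x) = x^2/2$ are assumptions made for the example, not notation from the lectures.

```python
def conjugate(f, xs, slope):
    """Grid approximation of f*(slope) = sup_x { x * slope - f(x) }."""
    return max(x * slope - f(x) for x in xs)

# Grid on [-5, 5] with step 0.01 (an arbitrary illustrative choice).
xs = [i / 100.0 for i in range(-500, 501)]

def f(x):
    """f(x) = x^2 / 2; its exact conjugate is s^2 / 2."""
    return 0.5 * x * x

def g(x):
    """g = f + 1, used to illustrate properties (ii) and (iv)."""
    return f(x) + 1.0

# Property (i): f*(0) = -inf f.
f_star_at_zero = conjugate(f, xs, 0.0)

# Property (iv): (f + 1)*(s) = f*(s) - 1, here at s = 1.
shift_gap = conjugate(g, xs, 1.0) - (conjugate(f, xs, 1.0) - 1.0)

# Property (v): x * s <= f(x) + f*(s) on a sample of pairs (x, s).
slopes = (-2.0, -0.5, 0.0, 1.0, 3.0)
inequality_v_holds = all(
    x * s <= f(x) + conjugate(f, xs, s) + 1e-12
    for x in xs[::50] for s in slopes
)
```

Because the grid truncates $\mathbb{R}$ to $[-5, 5]$, the computed values agree with the exact conjugate only for slopes whose maximizer lies inside the grid.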

Once $f^*$ is defined we can consider, for every $x \in X$,

$$f^{**}(x) = \sup_{x^* \in X^*} \{\langle x, x^* \rangle - f^*(x^*)\}$$

The function $f^{**} : X \to \overline{\mathbb{R}}$ so defined is called the second conjugate of $f$. This function $f^{**}$ is, as we know, a convex l.s.c. function on $X$. If we compare $f^{**}$ with $f$ we have

Theorem 12. If $f : X \to \overline{\mathbb{R}}$ is an extended real-valued function on the l.c.s. $X$, then for its second conjugate we have $f^{**} = \Gamma(f)$. Therefore, if $f$ is a proper convex l.s.c. function, then $f^{**} = f$.

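Proof. From the definitions of $\Gamma(f)$ and $f^{**}$ we can derive

$$(\Gamma(f))(x) = \sup \{\langle x, x^* \rangle - a : a \in \mathbb{R},\ x^* \in X^* \text{ such that } \langle y, x^* \rangle - a \le f(y) \text{ for every } y \in X\}$$

$$f^{**}(x) = \sup \{\langle x, x^* \rangle - f^*(x^*) : x^* \in X^*\}$$

and

$$f^*(x^*) = \sup_{x \in X} \{\langle x, x^* \rangle - f(x)\}$$

so that, for a fixed $x^* \in X^*$, it is $\langle x, x^* \rangle - a \le f(x)$ on $X$ if and only if $f^*(x^*) \le a$. Therefore

$$(\Gamma(f))(x) = \sup \{\langle x, x^* \rangle - f^*(x^*) - \varepsilon : x^* \in X^*,\ \varepsilon \in \mathbb{R}_+ \cup \{0\}\}$$

It follows that

$$(\Gamma f)(x) = f^{**}(x)$$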
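Theorem 12 can be made concrete for a non-convex function of one real variable. The Python sketch below is a rough grid approximation under stated assumptions: the double-well function $f(x) = (x^2 - 1)^2$, the grids `xs` and `slopes`, and the helper names are all chosen here for illustration. It computes $f^*$ and then $f^{**}$, which should lie below $f$ and coincide with $f$ where $f$ agrees with its convex l.s.c. regularization.

```python
def conjugate_values(values, points, dual_points):
    """For each s in dual_points return max_i (points[i] * s - values[i])."""
    return [max(p * s - v for p, v in zip(points, values)) for s in dual_points]

def f(x):
    """Double-well function: not convex on [-1, 1]."""
    return (x * x - 1.0) ** 2

# Primal grid on [-2, 2] and dual grid of slopes on [-30, 30] (illustrative).
xs = [i / 100.0 for i in range(-200, 201)]
slopes = [j / 10.0 for j in range(-300, 301)]

f_values = [f(x) for x in xs]
f_star = conjugate_values(f_values, xs, slopes)   # f* sampled on the slopes

def biconjugate(x):
    """Grid approximation of f**(x) = sup_s { x * s - f*(s) }."""
    return max(x * s - fs for s, fs in zip(slopes, f_star))

gap_at_zero = f(0.0) - biconjugate(0.0)     # regularization flattens the bump
gap_at_1_5 = f(1.5) - biconjugate(1.5)      # no gap where f is already convex
```

On this example the regularization vanishes on $[-1, 1]$, so the gap at $0$ equals $f(0) = 1$, while at $x = 1.5$ the two functions agree up to rounding.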

If we define $f^{***}(x^*) = \sup \{\langle x, x^* \rangle - f^{**}(x) : x \in X\} = (f^{**})^*(x^*)$, then

Theorem 13. For every $f : X \to \overline{\mathbb{R}}$ it results $f^*(x^*) = f^{***}(x^*)$ on $X^*$.

Proof. In fact, from $f(x) \ge (\Gamma f)(x) = f^{**}(x)$ we obtain, by theorem 11, $f^*(x^*) \le f^{***}(x^*)$ on $X^*$. But from the definition of $f^{**}$ we have also, for every $x \in X$, $x^* \in X^*$:

$$\langle x, x^* \rangle \le f^{**}(x) + f^*(x^*)$$

so that

$$\langle x, x^* \rangle - f^{**}(x) \le f^*(x^*) \quad \forall x \in X,\ x^* \in X^*$$

From the definition of $f^{***}$ we obtain therefore $f^{***}(x^*) \le f^*(x^*)$ on $X^*$ and the proof is complete.


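The following theorem fixes a significant geometric property of $\Gamma(f)$.

Theorem 14. If $f : X \to \overline{\mathbb{R}}$ is a proper extended real-valued function on a l.c.s. $X$, then $\operatorname{epi} \Gamma(f) = \overline{\operatorname{co}}(\operatorname{epi} f)$.

Proof. We prove preliminarily that, for the given $f$, there exists a $g : X \to \overline{\mathbb{R}}$ such that $\operatorname{epi} g = \operatorname{co}(\operatorname{epi} f)$. In fact, if $(x, a) \in \operatorname{co}(\operatorname{epi} f)$ and $b > a$, then $(x, b) \in \operatorname{co}(\operatorname{epi} f)$, so that, if we let for every $x \in X$, $g(x) = \inf \{\lambda \in \mathbb{R} : (x, \lambda) \in \operatorname{co}(\operatorname{epi} f)\}$, it results obviously $\operatorname{epi} g = \operatorname{co}(\operatorname{epi} f)$.

Let $(x, a) \in \operatorname{co}(\operatorname{epi} f)$ and $b > a$. Then there exist $(x_i, a_i) \in \operatorname{epi} f$, $i = 1, \dots, n$, and $\lambda_i \in \mathbb{R}_+$ such that

$$\sum_{i=1}^{n} \lambda_i = 1, \qquad (x, a) = \sum_{i=1}^{n} \lambda_i (x_i, a_i), \qquad (x_i, a_i + (b - a)) \in \operatorname{epi} f$$

Then

$$\sum_{i=1}^{n} \lambda_i \big(x_i, a_i + (b - a)\big) = (x, b) \in \operatorname{co}(\operatorname{epi} f)$$

Successively we prove that, for the obtained $g : X \to \overline{\mathbb{R}}$, there exists a $G : X \to \overline{\mathbb{R}}$ such that $\operatorname{epi} G = \overline{\operatorname{epi} g}$.

In fact, if $(x, a) \in \overline{\operatorname{epi} g}$ and $b > a$, then $(x, b) \in \overline{\operatorname{epi} g}$, and then, if we let for every $x \in X$, $G(x) = \inf \{\lambda \in \mathbb{R} : (x, \lambda) \in \overline{\operatorname{epi} g}\}$, we have obviously $\operatorname{epi} G = \overline{\operatorname{epi} g}$.

To prove that if $(x, a) \in \overline{\operatorname{epi} g}$ and $b > a$, then $(x, b) \in \overline{\operatorname{epi} g}$, let us consider a generalized sequence $(x_\alpha, a_\alpha)$, $\alpha \in A$, such that $(x_\alpha, a_\alpha) \in \operatorname{epi} g$ for every $\alpha \in A$ and $(x_\alpha, a_\alpha)$ tends to $(x, a)$ in $X \times \mathbb{R}$. Then, for a convenient $\bar{\alpha} \in A$, it results $a_\alpha < b$ when $\alpha > \bar{\alpha}$ and, since $g(x_\alpha) \le a_\alpha$, it results also $g(x_\alpha) \le b$ when $\alpha > \bar{\alpha}$. Then $(x_\alpha, b) \in \operatorname{epi} g$ when $\alpha > \bar{\alpha}$, and therefore $(x, b) \in \overline{\operatorname{epi} g}$.

It is so proved that, for the given $f : X \to \overline{\mathbb{R}}$, there exists a $G : X \to \overline{\mathbb{R}}$ such that $\operatorname{epi} G = \overline{\operatorname{co}}(\operatorname{epi} f)$. We prove that this $G$ coincides with $\Gamma(f)$. Since it follows from the definition of $\Gamma(f)$ that $\Gamma(f)(x) \le f(x)$ for every $x \in X$, we have $\operatorname{epi} \Gamma(f) \supset \operatorname{epi} f$ and therefore

$$\operatorname{epi}(\Gamma(f)) = \overline{\operatorname{co}}(\operatorname{epi} \Gamma(f)) \supset \overline{\operatorname{co}}(\operatorname{epi} f) \supset \operatorname{epi} f$$

We obtain so $\operatorname{epi} \Gamma(f) \supset \operatorname{epi} G \supset \operatorname{epi} f$ and therefore

$$f^{**}(x) = \Gamma(f)(x) \le G(x) \le f(x) \quad \forall x \in X$$

Then $(f^{**})^*(x^*) \ge G^*(x^*) \ge f^*(x^*)$ and, using theorem 13, we see that

$$f^{***}(x^*) = G^*(x^*) = f^*(x^*) \quad \forall x^* \in X^*$$

Then $(f^{***})^*(x) = (f^{**})^{**}(x) = G^{**}(x) = f^{**}(x)$ for every $x \in X$. But as $G$ and $f^{**}$ are convex l.s.c. functions on $X$,

$$f^{**}(x) = (f^{**})^{**}(x) = G^{**}(x) = G(x) \quad \forall x \in X$$

and also

$$\operatorname{epi} \Gamma(f) = \operatorname{epi} f^{**} = \operatorname{epi} G = \overline{\operatorname{co}}(\operatorname{epi} f)$$

The theorem is so completely proved. Let us consider a few examples.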

Example 1. If $f(x) = +\infty$ [resp. $-\infty$] for every $x \in X$, then $f^*(x^*) = -\infty$ [resp. $+\infty$] for every $x^* \in X^*$.

Example 2. If $A$ is a subset of $X$ and $\chi_A$ is the corresponding indicator function (i.e. $\chi_A(x) = 0$ if $x \in A$, $\chi_A(x) = +\infty$ if $x \notin A$), then

$$(\chi_A)^*(x^*) = \sup \{\langle x, x^* \rangle - \chi_A(x) : x \in X\} = \sup \{\langle x, x^* \rangle : x \in A\}$$

is called the support function of $A$. We have $\operatorname{epi} \chi_A^{**} = \overline{\operatorname{co}}(\operatorname{epi} \chi_A)$.

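Example 3. If $f(x) = \langle x, x_0^* \rangle$, then

$$f^*(x^*) = \sup \{\langle x, x^* \rangle - \langle x, x_0^* \rangle : x \in X\} = \sup \{\langle x, x^* - x_0^* \rangle : x \in X\} = \begin{cases} +\infty & \text{if } x^* \ne x_0^* \\ 0 & \text{if } x^* = x_0^* \end{cases}$$

Example 4. Let $X = \mathbb{R}$ and $p \in \mathbb{R}$ such that $p > 1$. Let $1/p + 1/q = 1$ and consider $f : \mathbb{R} \to \mathbb{R}$ such that $f(x) = |x|^p / p$. Then $f$ is a convex continuous function on $\mathbb{R}$. Obviously $X^* = \mathbb{R}^* = \mathbb{R}$ and hence

$$f^*(y) = \sup \Big\{ x y - \frac{|x|^p}{p} : x \in \mathbb{R} \Big\}$$

But for every $y \in \mathbb{R}$, $\varphi(x) = y x - |x|^p / p$ is a continuous function such that

$$\lim_{x \to \pm\infty} \varphi(x) = -\infty \quad \text{and} \quad \varphi'(x) = y - |x|^{p-1} \operatorname{sig}(x) \text{ if } x \ne 0$$

The supremum is therefore attained at the point $x$ where

$$y = |x|^{p-1} \operatorname{sig}(x)$$

We have therefore

$$f^*(y) = x\, |x|^{p-1} \operatorname{sig}(x) - \frac{|x|^p}{p} = |x|^p \Big(1 - \frac{1}{p}\Big) = \frac{1}{q} \big(|x|^{p-1}\big)^q = \frac{1}{q} |y|^q$$

From theorem 11 (v) we have also

$$x y \le \frac{|x|^p}{p} + \frac{|y|^q}{q} \quad \forall x, y \in \mathbb{R}$$

the well-known Young inequality, from which Hölder's inequality follows.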
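The computation of Example 4 can be checked numerically. The Python sketch below is only an illustration: the grid, the choice $p = 3$ and the helper names `power_conjugate` and `sup_on_grid` are assumptions made here. It compares the closed form $|y|^q / q$ with a brute-force maximum of $x y - |x|^p / p$ over a grid, and samples the inequality of theorem 11 (v).

```python
def power_conjugate(y, p):
    """Closed form of Example 4: conjugate of |x|^p / p is |y|^q / q."""
    q = p / (p - 1.0)
    return abs(y) ** q / q

def sup_on_grid(y, p, xs):
    """Brute-force approximation of sup_x { x * y - |x|^p / p } over xs."""
    return max(x * y - abs(x) ** p / p for x in xs)

p = 3.0
q = p / (p - 1.0)                               # conjugate exponent, 1/p + 1/q = 1
xs = [i / 100.0 for i in range(-500, 501)]      # illustrative grid on [-5, 5]

# For y = 4 the maximizer x = |y|^(1/(p-1)) = 2 lies on the grid.
closed_form = power_conjugate(4.0, p)
grid_value = sup_on_grid(4.0, p, xs)

# Sampled check of x * y <= |x|^p / p + |y|^q / q.
samples = [k / 2.0 for k in range(-6, 7)]
young_holds = all(
    x * y <= abs(x) ** p / p + abs(y) ** q / q + 1e-12
    for x in samples for y in samples
)
```

The agreement between `closed_form` and `grid_value` depends on the maximizer lying inside (here, exactly on) the grid; for larger $|y|$ the truncated grid would underestimate the supremum.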

3. SUB-DIFFERENTIAL OF A CONVEX FUNCTION

Definition 1. Let $f : X \to \overline{\mathbb{R}}$ be a proper extended real-valued function on a l.c.s. $X$ and $x^* \in X^*$. We say that $x^*$ is a sub-gradient of $f$ at a point $x_0 \in X$ where $f(x_0) \in \mathbb{R}$ if

$$f(x) \ge f(x_0) + \langle x - x_0, x^* \rangle \quad \forall x \in X \qquad (3.1)$$

If a sub-gradient exists we say that $f$ is sub-differentiable at the point $x_0$. The set of all sub-gradients $x^*$ of $f$ at $x_0$ is denoted by $\partial f(x_0)$. The multivalued mapping $\partial f : x \in X \to \partial f(x) \subset X^*$ is called the sub-differential of $f$. If $f$ is not sub-differentiable at $x$ we have $\partial f(x) = \emptyset$.

Geometrically, the condition that $x^*$ is a sub-gradient of $f$ at $x_0$ means that the affine closed hyperplane $\{(x, a) \in X \times \mathbb{R} : \langle x - x_0, x^* \rangle + f(x_0) = a\}$ in $X \times \mathbb{R}$ is a supporting hyperplane of $\operatorname{epi} f$.

The following properties of $\partial f(x)$ are immediate consequences of definition 1.

Theorem 1. If $f : X \to \overline{\mathbb{R}}$ is an extended real-valued function on a l.c.s. $X$, then

(i) $f(x_0) = \min \{f(x) : x \in X\}$ if and only if $0 \in \partial f(x_0)$

(ii) $x^* \in \partial f(x_0)$ if and only if $f(x_0) + f^*(x^*) = \langle x_0, x^* \rangle$

(iii) $\partial f(x_0) \ne \emptyset$ implies $f(x_0) = f^{**}(x_0)$

(iv) $\partial f(x_0)$ is a convex subset of $X^*$ which is closed in the $\sigma(X^*, X)$ topology of $X^*$.

Proof. To prove assertion (i) we observe that, if $f(x) \ge f(x_0)$ for every $x \in X$, then $f(x) \ge f(x_0) + \langle x - x_0, 0 \rangle$, so that $0 \in \partial f(x_0)$.

Conversely, if $0 \in \partial f(x_0)$, then (3.1) holds with $x^* = 0$, so that $f(x) \ge f(x_0)$ for every $x \in X$.

To prove (ii) we observe that, if $x^* \in \partial f(x_0)$, then we have $f(x) \ge f(x_0) + \langle x - x_0, x^* \rangle$ for every $x \in X$, and also $-f(x) + \langle x, x^* \rangle \le -f(x_0) + \langle x_0, x^* \rangle$, with equality for $x = x_0$; hence $f^*(x^*) = \langle x_0, x^* \rangle - f(x_0)$.

If, conversely, $f^*(x^*) = \langle x_0, x^* \rangle - f(x_0)$, then $\langle x_0, x^* \rangle - f(x_0) \ge \langle x, x^* \rangle - f(x)$ for every $x \in X$ and (3.1) holds.

To prove (iii) we observe that, if $x^* \in \partial f(x_0)$, then the affine continuous function $f(x_0) + \langle x - x_0, x^* \rangle$ is a minorant of $f(x)$ and therefore $f(x_0) + \langle x - x_0, x^* \rangle \le \Gamma(f)(x) = f^{**}(x)$ for every $x \in X$. Then $f(x_0) \le f^{**}(x_0) \le f(x_0)$ and the assertion is proved.

To prove (iv) we recall that $\langle x, x^* \rangle \le f(x) + f^*(x^*)$ for every $x \in X$, $x^* \in X^*$. Therefore, by (ii), $\partial f(x_0) = \{x^* \in X^* : f^*(x^*) - \langle x_0, x^* \rangle \le -f(x_0)\}$, and this set is convex and $\sigma(X^*, X)$-closed because $f^*(x^*) - \langle x_0, x^* \rangle$ is a convex l.s.c. function on $X^*$.
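For $X = \mathbb{R}$ the definition (3.1) and criterion (ii) can be tried out directly. The Python sketch below is an illustration under stated assumptions: it uses $f(x) = |x|$, a finite grid in place of $\mathbb{R}$, and hypothetical helper names (`is_subgradient`, `conjugate`, `fenchel_equality_holds`). For this $f$ the sub-differential at $0$ is the interval $[-1, 1]$, while at $x_0 = 2$ it reduces to $\{1\}$.

```python
def is_subgradient(f, x0, slope, xs, tol=1e-12):
    """Test inequality (3.1): f(x) >= f(x0) + (x - x0) * slope on the grid."""
    return all(f(x) >= f(x0) + (x - x0) * slope - tol for x in xs)

def conjugate(f, xs, slope):
    """Grid approximation of f*(slope) = sup_x { x * slope - f(x) }."""
    return max(x * slope - f(x) for x in xs)

def fenchel_equality_holds(f, x0, slope, xs, tol=1e-9):
    """Test criterion (ii): f(x0) + f*(slope) == x0 * slope."""
    return abs(f(x0) + conjugate(f, xs, slope) - x0 * slope) <= tol

xs = [i / 100.0 for i in range(-500, 501)]   # illustrative grid on [-5, 5]

# Sub-gradients of |x| at the kink x0 = 0: every slope in [-1, 1] works.
at_kink = [is_subgradient(abs, 0.0, s, xs) for s in (-1.0, 0.0, 0.5, 1.0)]
too_steep = is_subgradient(abs, 0.0, 1.5, xs)

# At x0 = 2 the function is differentiable and only the slope 1 works.
slope_one_at_two = is_subgradient(abs, 2.0, 1.0, xs)
slope_half_at_two = is_subgradient(abs, 2.0, 0.5, xs)
```

Property (i) appears here as the fact that the slope $0$ is accepted at the minimizer $x_0 = 0$; `fenchel_equality_holds` gives the same verdicts as `is_subgradient` on these cases.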

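For convex functions we have the following theorem of sub-differentiability:

Theorem 2. Let $f : X \to \overline{\mathbb{R}}$ be a proper convex extended real-valued function on a l.c.s. $X$ which is continuous at a point $x_0 \in X$. Then $\partial f(x) \ne \emptyset$ for every $x \in (\operatorname{dom} f)^\circ$. Moreover we know that $x_0 \in (\operatorname{dom} f)^\circ$, so that $\partial f(x_0) \ne \emptyset$.

Proof. Since $f$ is continuous at $x_0$, there exists an open set containing $x_0$ where $f$ is bounded above. By theorem 5 of section 2 we obtain that $(\operatorname{dom} f)^\circ \ne \emptyset$ and that $f$ is continuous on $(\operatorname{dom} f)^\circ$. Therefore it suffices to prove that $\partial f(x_0) \ne \emptyset$.

Since $f$ is convex, $\operatorname{epi} f$ is a convex subset of $X \times \mathbb{R}$; since $f$ is continuous at $x_0$, $(\operatorname{epi} f)^\circ \ne \emptyset$. In fact, there exists an open subset $U$ of $X$ such that $x_0 \in U$ and $f(x) \le K$ on $U$, and therefore $V = U \times\, ]K, +\infty[$ is an open subset of $X \times \mathbb{R}$ such that $V \subset \operatorname{epi} f$. Since $(x_0, f(x_0))$ belongs to the boundary of $\operatorname{epi} f$, then by corollary 3 of section 1 we can separate $(x_0, f(x_0))$ and $(\operatorname{epi} f)^\circ$ with an affine closed hyperplane $H = \{(x, a) \in X \times \mathbb{R} : \langle x, x^* \rangle + \alpha a = \gamma\}$ of $X \times \mathbb{R}$, where $x^* \in X^*$, $\alpha, \gamma \in \mathbb{R}$ and $(x, a) \to \langle x, x^* \rangle + \alpha a$ is a non-zero linear function on $X \times \mathbb{R}$. We have therefore $\langle x, x^* \rangle + \alpha a \ge \gamma$ for every $(x, a) \in \operatorname{epi} f$ and $\langle x_0, x^* \rangle + \alpha f(x_0) = \gamma$, and hence

$$\langle x_0, x^* \rangle + \alpha (f(x_0) + \varepsilon) \ge \langle x_0, x^* \rangle + \alpha f(x_0), \quad \varepsilon \in \mathbb{R}_+$$

so that $\alpha \ge 0$. If $\alpha = 0$ we would have $\langle x - x_0, x^* \rangle \ge 0$ for every $x \in \operatorname{dom} f$ and, since $x_0 \in (\operatorname{dom} f)^\circ$, $x^* = 0$, so that $(x, a) \to \langle x, x^* \rangle + \alpha a = 0$ on $X \times \mathbb{R}$; this is a contradiction. Therefore it is $\alpha > 0$. Then we have

$$\langle x, x^* \rangle + \alpha f(x) \ge \gamma = \langle x_0, x^* \rangle + \alpha f(x_0) \quad \forall x \in \operatorname{dom} f$$

and hence for every $x \in X$, so that

$$f^*\Big(-\frac{x^*}{\alpha}\Big) = \sup \Big\{ \Big\langle x, -\frac{x^*}{\alpha} \Big\rangle - f(x) : x \in X \Big\} = \Big\langle x_0, -\frac{x^*}{\alpha} \Big\rangle - f(x_0)$$

This proves, by theorem 1 (ii), that $-x^*/\alpha \in \partial f(x_0)$. The theorem is so proved.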


Let us recall the definition of the Gateaux differential of an extended real-valued function in a l.c.s. $X$ and investigate the connections between the Gateaux differential and sub-gradients.

Definition 2. Let $f : X \to \overline{\mathbb{R}}$ be an extended real-valued function on a l.c.s. $X$, and let $y \in X$. We call derivative of $f$ at $x_0 \in X$ in the direction $y$, and denote it by $f'(x_0, y)$, the limit

$$\lim_{t \to 0^+} \frac{f(x_0 + t y) - f(x_0)}{t}$$

provided the limit exists in $\overline{\mathbb{R}}$. If there exists an $x^* \in X^*$ such that $f'(x_0, y) = \langle y, x^* \rangle$ for every $y \in X$ we say that $f$ is Gateaux-differentiable at the point $x_0$, and we call $x^*$ the Gateaux derivative of $f$ at the point $x_0$ and denote it by $f'(x_0)$. Obviously the Gateaux derivative is unique, provided it exists.
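The difference between a directional derivative and a Gateaux derivative can be seen already on $X = \mathbb{R}$. The Python sketch below is a finite-difference illustration only: the step `t`, the helper name `directional_derivative` and the test functions are assumptions of this example. It approximates $f'(x_0, y)$ by the quotient of definition 2 with a small fixed $t$, and checks whether $y \mapsto f'(x_0, y)$ behaves linearly.

```python
def directional_derivative(f, x0, y, t=1e-6):
    """One-sided difference quotient (f(x0 + t*y) - f(x0)) / t for small t > 0."""
    return (f(x0 + t * y) - f(x0)) / t

def square(x):
    return x * x

# |x| at x0 = 0: both one-sided derivatives equal 1, so the map
# y -> f'(0, y) = |y| is not linear and |x| is not Gateaux-differentiable at 0.
abs_right = directional_derivative(abs, 0.0, 1.0)
abs_left = directional_derivative(abs, 0.0, -1.0)
abs_linearity_defect = abs_right + abs_left      # would be 0 for a linear map

# x^2 at x0 = 1: f'(1, y) = 2*y is linear in y (up to the O(t) error).
sq_right = directional_derivative(square, 1.0, 1.0)
sq_left = directional_derivative(square, 1.0, -1.0)
sq_linearity_defect = sq_right + sq_left
```

The linearity defect $f'(x_0, y) + f'(x_0, -y)$ vanishes exactly when the two one-sided slopes match, which is the one-dimensional content of Gateaux-differentiability.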

Theorem 3. Let $f : X \to \overline{\mathbb{R}}$ be a proper convex function on a l.c.s. $X$. If $f$ is Gateaux-differentiable at $x_0 \in X$, then $\partial f(x_0) = \{f'(x_0)\}$. Conversely, if at $x_0 \in X$, $f(x_0) \in \mathbb{R}$, $f$ is continuous and $\partial f(x_0) = \{x^*\}$, then $f$ is Gateaux-differentiable at $x_0$ and $f'(x_0) = x^*$.

Proof. If $f$ is proper convex and Gateaux-differentiable at $x_0$ and, for every $y \in X$, we consider $\lambda \to \varphi(\lambda) = f(x_0 + \lambda y)$, then $\varphi$ is a proper convex function on $\mathbb{R}$, differentiable at $0$, and $\varphi'(0) = \langle y, f'(x_0) \rangle$. We have therefore

$$f(x_0 + y) - f(x_0) = \varphi(1) - \varphi(0) \ge \varphi'(0) = \langle y, f'(x_0) \rangle, \quad y \in X$$

so that

$$f'(x_0) \in \partial f(x_0)$$

On the other hand, if $x^* \in \partial f(x_0)$ we have, for every $y \in X$ and $\lambda \in \mathbb{R}_+$,

$$f(x_0 + \lambda y) \ge f(x_0) + \langle \lambda y, x^* \rangle$$

so that

$$\frac{f(x_0 + \lambda y) - f(x_0)}{\lambda} \ge \langle y, x^* \rangle$$

therefore, going to the limit for $\lambda \to 0^+$,

$$\langle y, f'(x_0) \rangle \ge \langle y, x^* \rangle \quad \forall y \in X$$

Replacing $y$ by $-y$, it follows that $\langle y, f'(x_0) - x^* \rangle = 0$ for every $y \in X$ and therefore $x^* = f'(x_0)$. Thus it is proved that $\partial f(x_0) = \{f'(x_0)\}$.

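Conversely, let $f$ be convex and let $\partial f(x_0)$ contain a unique element. Since $f$ is proper convex, $f(x_0) \in \mathbb{R}$ and $f$ is continuous at $x_0$, we have, for every $y \in X$, $f(x_0 + \lambda y) - f(x_0) \ge f'(x_0, \lambda y)$ for every $\lambda$ such that $x_0 + \lambda y \in (\operatorname{dom} f)^\circ$, and therefore this holds for every $\lambda$. It follows that the one-dimensional affine subspace $A = \{(x_0 + \lambda y,\ f(x_0) + \lambda f'(x_0, y)) : \lambda \in \mathbb{R}\}$ of $X \times \mathbb{R}$ does not intersect $(\operatorname{epi} f)^\circ$, which is a convex, non-empty set of $X \times \mathbb{R}$ since $f$ is continuous at $x_0$. By corollary 2 and theorem 4 (of section 1) there exists a closed affine hyperplane $H$ containing $A$ such that $H \cap (\operatorname{epi} f)^\circ = \emptyset$. If $H = \{(x, a) \in X \times \mathbb{R} : \langle x, x^* \rangle + \alpha a = \gamma\}$, where $x^* \in X^*$, $\alpha, \gamma \in \mathbb{R}$, and $(x, a) \to \langle x, x^* \rangle + \alpha a$ is a non-zero function on $X \times \mathbb{R}$, we can suppose that $\langle x, x^* \rangle + \alpha a \ge \gamma$ for every $(x, a) \in (\operatorname{epi} f)^\circ$ and

$$\langle x_0 + \lambda y, x^* \rangle + \alpha \big(f(x_0) + \lambda f'(x_0, y)\big) = \gamma$$

Let us observe that $\alpha \ne 0$; in fact, if $\alpha = 0$, then

$$\langle x_0, x^* \rangle + \lambda \langle y, x^* \rangle = \gamma \quad \forall y \in X,\ \lambda \in \mathbb{R}$$

so that $x^* = 0$ and, consequently, $\langle x, x^* \rangle + \alpha a = 0$ for every $(x, a)$; a contradiction. Then we have

$$\langle x, x^* \rangle + \alpha (f(x) + \varepsilon) \ge \gamma = \langle x_0, x^* \rangle + \alpha f(x_0) \quad \forall x \in (\operatorname{dom} f)^\circ,\ \varepsilon \in \mathbb{R}_+$$

and therefore, by taking $x = x_0$, $\alpha > 0$. So we have

$$f(x) \ge \Big\langle x_0 - x, \frac{x^*}{\alpha} \Big\rangle + f(x_0) \quad \forall x \in (\operatorname{dom} f)^\circ$$

and therefore for every $x \in X$. We have so proved that $-x^*/\alpha \in \partial f(x_0)$. Since

$$\langle x_0 + \lambda y, x^* \rangle + \alpha \big(f(x_0) + \lambda f'(x_0, y)\big) = \gamma \quad \forall \lambda \in \mathbb{R},\ y \in X$$

we obtain $\langle y, x^* \rangle = -\alpha f'(x_0, y)$ for every $y \in X$, so that $f$ is Gateaux-differentiable with the Gateaux derivative $-x^*/\alpha$.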

For a Gateaux-differentiable function the convexity can be characterized in the following form:

Theorem 4. Let $f : K \to \mathbb{R}$ be a real-valued function on a convex non-empty subset $K$ of a l.c.s. $X$ which is Gateaux-differentiable at every $x \in K$ in the following sense: there exists an $f'(x) \in X^*$ (not necessarily unique if $x \notin K^\circ$) such that for every $y \in X$ for which there exists an $\varepsilon \in \mathbb{R}_+$ such that $x + \delta y \in K$ if $\delta \in\, ]0, \varepsilon[$, we have

$$\lim_{t \to 0^+} \frac{f(x + t y) - f(x)}{t} = \langle y, f'(x) \rangle$$

Then $f$ is convex on $K$ if and only if $f(x) \ge f(x_0) + \langle x - x_0, f'(x_0) \rangle$ for every $x_0, x \in K$; $f$ is strictly convex if and only if $f(x) > f(x_0) + \langle x - x_0, f'(x_0) \rangle$ for every $x \in K$, $x \ne x_0$.

Proof. If $f$ is convex on $K$ and Gateaux-differentiable, then the arguments used in the proof of theorem 3 also prove that

$$f(x) \ge f(x_0) + \langle x - x_0, f'(x_0) \rangle \quad \text{for } x, x_0 \in K$$

Conversely, if $f$ is Gateaux-differentiable at every $x \in K$ and

$$f(x) \ge f(x_0) + \langle x - x_0, f'(x_0) \rangle \quad \text{for } x, x_0 \in K$$

then we have for every $x, y \in K$, $\alpha \in [0,1]$

$$f(x) \ge f[(1 - \alpha) x + \alpha y] + \alpha \langle x - y, f'[(1 - \alpha) x + \alpha y] \rangle$$

$$f(y) \ge f[(1 - \alpha) x + \alpha y] + (1 - \alpha) \langle y - x, f'[(1 - \alpha) x + \alpha y] \rangle$$

We multiply the first inequality by $(1 - \alpha)$ and the second by $\alpha$ and sum up to obtain

$$(1 - \alpha) f(x) + \alpha f(y) \ge f[(1 - \alpha) x + \alpha y]$$

so that $f$ is convex in $K$.

If it is strictly convex and Gateaux-differentiable we have for every $x_0, x \in K$, $\alpha \in\, ]0,1[$, $x_0 \ne x$,

$$\langle x - x_0, f'(x_0) \rangle \le \frac{f[x_0 + \alpha (x - x_0)] - f(x_0)}{\alpha} < \frac{(1 - \alpha) f(x_0) + \alpha f(x) - f(x_0)}{\alpha} = f(x) - f(x_0)$$

Conversely, if $f$ is Gateaux-differentiable and $f(x) > f(x_0) + \langle x - x_0, f'(x_0) \rangle$ for every $x \ne x_0$, then the argument used above proves that

$$f[\alpha x + (1 - \alpha) y] < \alpha f(x) + (1 - \alpha) f(y) \quad \text{if } x \ne y,\ \alpha \in\, ]0,1[$$

The theorem is completely proved.

Definition 3. Let $\varphi : X \to X^*$ be a map of a l.c.s. $X$ into the space $X^*$. We say that $\varphi$ is monotone if, for every $x, y \in X$,

$$\langle x - y, \varphi(x) - \varphi(y) \rangle \ge 0$$

With this definition we can prove that


Theorem 5. Let $f : K \to \mathbb{R}$ be a real-valued function on a convex subset $K$ of a l.c.s. $X$, which is Gateaux-differentiable at every $x \in K$. Then $f$ is convex if and only if $f'(x)$ is monotone on $K$.

Proof. If $f$ is Gateaux-differentiable at every $x \in K$ and convex, then for every $x, y \in K$ there exist $f'(x), f'(y) \in X^*$ such that

$$f(x) \ge f(y) + \langle x - y, f'(y) \rangle$$

$$f(y) \ge f(x) + \langle y - x, f'(x) \rangle$$

Adding these inequalities yields

$$\langle x - y, f'(x) - f'(y) \rangle \ge 0, \quad x, y \in K$$

and $f'$ is monotone on $K$.

Conversely, if $f$ is Gateaux-differentiable at every $x \in K$ and $f'$ is monotone on $K$, let, for every $\lambda \in [0,1]$ and $x, y \in K$, $\varphi(\lambda) = f[x + \lambda (y - x)]$ and consider $\varphi : [0,1] \to \mathbb{R}$. According to the hypothesis of Gateaux-differentiability of $f$,

$$\varphi'(\lambda) = \langle y - x, f'[x + \lambda (y - x)] \rangle$$

Therefore

$$(\lambda - \mu) \{\varphi'(\lambda) - \varphi'(\mu)\} = (\lambda - \mu) \langle y - x, f'[x + \lambda (y - x)] - f'[x + \mu (y - x)] \rangle$$

$$= \langle x + \lambda (y - x) - \{x + \mu (y - x)\},\ f'[x + \lambda (y - x)] - f'[x + \mu (y - x)] \rangle \ge 0$$

so that $\varphi'(\lambda)$ is an increasing function on $[0,1]$, $\varphi$ is convex on $[0,1]$ and

$$f[x + \lambda (y - x)] = \varphi(\lambda) \le \lambda \varphi(1) + (1 - \lambda) \varphi(0) = \lambda f(y) + (1 - \lambda) f(x)$$

The theorem is so proved.
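On $X = \mathbb{R}$ the monotonicity condition of definition 3 reduces to $(x - y)(f'(x) - f'(y)) \ge 0$, i.e. to $f'$ being non-decreasing. The Python sketch below is an illustration only, with hypothetical names and sample functions: it tests the condition on a finite grid for the derivative of the convex function $x^4$ and for the derivative of the non-convex function $(x^2 - 1)^2$.

```python
def is_monotone(grad, points, tol=1e-12):
    """Test (x - y) * (grad(x) - grad(y)) >= 0 for all pairs of grid points."""
    return all(
        (x - y) * (grad(x) - grad(y)) >= -tol
        for x in points for y in points
    )

def grad_quartic(x):
    """Derivative of the convex function x^4."""
    return 4.0 * x ** 3

def grad_double_well(x):
    """Derivative of the non-convex function (x^2 - 1)^2."""
    return 4.0 * x * (x * x - 1.0)

points = [i / 10.0 for i in range(-20, 21)]   # illustrative grid on [-2, 2]

quartic_is_monotone = is_monotone(grad_quartic, points)
double_well_is_monotone = is_monotone(grad_double_well, points)

# A concrete violating pair for the double well: x = 0.5, y = -0.5.
violation = (0.5 - (-0.5)) * (grad_double_well(0.5) - grad_double_well(-0.5))
```

The violating pair lies in the region between the two wells, where the double-well function fails to be convex, in agreement with theorem 5.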


We conclude these preliminary topics on convex-functions theory with the following theorem which, for convex functions, connects the property of being l.s.c. with the property of being Gateaux-differentiable:

Theorem 6. If $f : X \to \overline{\mathbb{R}}$ is a proper convex function on a l.c.s. $X$ which is Gateaux-differentiable at $x_0$, then $f$ is l.s.c. in the $\sigma(X, X^*)$ topology at $x_0$.

Proof. Let $x$ converge to $x_0$ in the $\sigma(X, X^*)$ topology; we must prove that $\liminf_{x \to x_0} f(x) \ge f(x_0)$. From theorem 3 it follows that, for every $x \in X$,

$$f(x) \ge f(x_0) + \langle x - x_0, f'(x_0) \rangle$$

therefore, if we go to the limit in the topology $\sigma(X, X^*)$, we obtain $\lim_{x \to x_0} \langle x - x_0, f'(x_0) \rangle = 0$ and consequently

$$\liminf_{x \to x_0} f(x) \ge f(x_0)$$

The theorem is so proved.

4. MINIMIZATION OF CONVEX FUNCTIONS

In this last section we give the most essential results on the minimization of real-valued convex functions on Banach spaces.

Theorem 1. Let $K$ be a closed convex set in a Banach space $X$ (assumed reflexive, since the proof below requires bounded sequences to have weakly convergent sub-sequences) and let $f : K \to \mathbb{R}$ be a real-valued convex l.s.c. function on $K$.

We suppose furthermore that either

(i) $K$ is bounded, or
(ii) $\lim_{\|x\| \to \infty,\ x \in K} f(x) = +\infty$

Then there exists an $x_0 \in K$ where $f(x)$ has a minimum in $K$, i.e.

$$f(x_0) = \inf \{f(x) : x \in K\}$$

If $f$ is strictly convex then $x_0$ is unique in $K$.

Proof. Let us consider $\lambda = \inf \{f(x) : x \in K\}$; then $\lambda \in [-\infty, +\infty[$. Let $x_n$ be a minimizing sequence in $K$, i.e. a sequence such that

$$\lim_{n \to \infty} f(x_n) = \lambda$$

The set $A = \{x_n : n \in \mathbb{N}\}$ is bounded in $X$: in fact, in the case (i), $A = \{x_n : n \in \mathbb{N}\} \subset K$ and $K$ is bounded; in the case (ii), $A$ is bounded since the sequence $\{f(x_n)\}$ is bounded above.

Then there exists a sub-sequence $\{x_{n_i}\}$ of $x_n$ which converges in the $\sigma(X, X^*)$ topology to an $x_0 \in X$. But $K$, being closed and convex, is closed also in the $\sigma(X, X^*)$ topology, so that $x_0 \in K$. Moreover, $f$, being convex and l.s.c. in the initial topology of $X$, remains l.s.c. in the $\sigma(X, X^*)$ topology. We have also

$$f(x_0) \le \lim_{i \to \infty} f(x_{n_i}) = \lambda$$

This proves that $f(x_0) = \lambda = \inf \{f(x) : x \in K\}$ and also that $\lambda \ne -\infty$.

If $x_1, x_2 \in K$ are such that $f(x_1) = f(x_2) = \lambda$, then from the convexity of $f$ we have $f[\tfrac{1}{2}(x_1 + x_2)] \le \tfrac{1}{2}(f(x_1) + f(x_2)) = \lambda$, so that also $\tfrac{1}{2}(x_1 + x_2) \in K$ is a point of minimum for $f$ in $K$. Therefore, if $f$ is strictly convex we cannot have $f(x_1) = f(x_2) = \lambda$ with $x_1, x_2 \in K$, $x_1 \ne x_2$, and therefore there exists a unique $x_0 \in K$ where $f$ has its minimum.

Theorem 2. Let $f : K \to \mathbb{R}$ be a convex real-valued function on a closed convex subset $K$ of a (reflexive) Banach space which is Gateaux-differentiable at every $x \in K$. Moreover, let $\lim_{\|x\| \to \infty,\ x \in K} f(x) = +\infty$ if $K$ is not bounded. Then there exists an $x_0 \in K$ such that $f(x_0) = \inf \{f(x) : x \in K\}$; moreover

$$\langle x - x_0, f'(x_0) \rangle \ge 0 \quad \forall x \in K \qquad (4.1)$$

$$\langle x - x_0, f'(x) \rangle \ge 0 \quad \forall x \in K \qquad (4.2)$$

Conversely, if $x_0 \in K$ satisfies (4.1) [resp. (4.2) and $f'$ is continuous on $K$], then $f(x_0) = \inf \{f(x) : x \in K\}$.

Proof. From theorem 6 of the preceding section it follows that $f$ is l.s.c. on $K$ in the $\sigma(X, X^*)$ topology. Therefore from theorem 1 it follows that there exists an $x_0 \in K$ such that $f(x_0) = \inf \{f(x) : x \in K\}$. Then, for every $x \in K$ and $\lambda \in [0,1]$,

$$f(x_0) \le f[(1 - \lambda) x_0 + \lambda x]$$

and consequently

$$\frac{f[x_0 + \lambda (x - x_0)] - f(x_0)}{\lambda} \ge 0, \quad \lambda \in\, ]0,1]$$

Going to the limit for $\lambda \to 0^+$ we have

$$\langle x - x_0, f'(x_0) \rangle \ge 0, \quad x \in K$$

(4.1) is so proved. But from theorem 5 of the preceding section it follows that

$$\langle x - x_0, f'(x) - f'(x_0) \rangle \ge 0 \quad \forall x, x_0 \in K$$

Addition yields

$$\langle x - x_0, f'(x) \rangle \ge 0, \quad x \in K$$

and (4.2) is proved.

Conversely, let $x_0 \in K$ satisfy (4.1). Then from the convexity of $f$ in $K$ it follows that, for every $x \in K$, $\lambda \in\, ]0,1[$,

$$f(x) - f(x_0) \ge \frac{1}{\lambda} \{f[(1 - \lambda) x_0 + \lambda x] - f(x_0)\}$$

and going to the limit $\lambda \to 0^+$

$$f(x) - f(x_0) \ge \langle x - x_0, f'(x_0) \rangle \ge 0$$

so that $f(x_0) = \inf \{f(x) : x \in K\}$ and consequently (4.2) is satisfied. Finally, let $x_0 \in K$ satisfy (4.2) and let $f'(x)$ be unique for every $x \in K$ and continuous on $K$. By taking in (4.2)

$$x = (1 - \lambda) x_0 + \lambda y, \quad y \in K,\ \lambda \in\, ]0,1[$$

we obtain

$$\lambda \langle y - x_0, f'[(1 - \lambda) x_0 + \lambda y] \rangle \ge 0$$

and therefore

$$\langle y - x_0, f'[(1 - \lambda) x_0 + \lambda y] \rangle \ge 0, \quad y \in K,\ \lambda \in\, ]0,1[$$

In the limit $\lambda \to 0^+$, from the continuity of $f'$ we have (4.1).

Remark 1. If the point $x_0$ in the hypothesis of the theorem is such that $x_0 \in K^\circ$, then $f'(x_0) = 0$ according to theorem 1 of section 3.

Remark 2. If $x_0$ satisfies $\langle x - x_0, f'(x_0) \rangle \ge 0$ for every $x \in K$, we say that it satisfies a variational inequality. Therefore theorem 2 is an existence theorem for solutions of some variational inequalities.

In order to indicate a general existence theorem for variational inequalities we state the following (without proof):


Theorem (Minty-Browder). Let $f : K \to X^*$ be a map of a closed convex subset $K$ of $X$ into $X^*$; we suppose that

(i) $f$ is monotone on $K$
(ii) for every $x, y \in K$ the real function $t \to \langle x - y, f[x + t(x - y)] \rangle$ is continuous where it is defined
(iii) there exists a $\xi \in K$ such that

$$\lim_{\|x\| \to \infty,\ x \in K} \frac{\langle x - \xi, f(x) \rangle}{\|x\|} = +\infty$$

Then for every $x^* \in X^*$ there exists an $x_0 \in K$ such that

$$\langle x - x_0, f(x) - x^* \rangle \ge 0, \quad x \in K$$

Example 1. Let $a(x, y)$ be a bilinear symmetric continuous form in a Banach space $X$, i.e. a map $(x, y) \in X \times X \to a(x, y) \in \mathbb{R}$ which is linear in both variables $x$, $y$, and let it be such that $|a(x, y)| \le M \|x\|\,\|y\|$ $\forall x, y \in X$ for a suitable $M$ in $\mathbb{R}$.

Let us suppose that $a(x, y)$ is coercive, i.e. there exists an $\alpha \in \mathbb{R}^+$ such that $|a(x, x)| \ge \alpha \|x\|^2$ $\forall x \in X$.

Let $x^* \in X^*$ and consider the real function $f(x) = a(x, x) - 2\langle x, x^* \rangle$ on $X$.

Then $f$ is a strictly convex function on $X$. In fact, from $a(x - y, x - y) \ge 0$ for $x, y \in X$, we obtain

$$a(x, x) - 2a(x, y) + a(y, y) \ge 0 \qquad (4.3)$$

and therefore

$$2a(x, y) \le a(x, x) + a(y, y) \qquad (4.4)$$

Moreover, the equality in (4.3), (4.4) holds if and only if $x = y$. From (4.4) it follows that, $\forall x, y \in X$, $\lambda \in [0,1]$,

$$a[\lambda x + (1-\lambda)y,\ \lambda x + (1-\lambda)y]$$

$$= \lambda^2 a(x, x) + 2\lambda(1-\lambda)\,a(x, y) + (1-\lambda)^2 a(y, y)$$

$$\le \lambda^2 a(x, x) + \lambda(1-\lambda)\{a(x, x) + a(y, y)\} + (1-\lambda)^2 a(y, y)$$

$$\le \lambda\, a(x, x) + (1-\lambda)\, a(y, y)$$

so that, from the linearity of $\langle x, x^* \rangle$ on $X$, it follows that

$$f[\lambda x + (1-\lambda)y] \le \lambda f(x) + (1-\lambda) f(y)$$

and, for $\lambda \in\, ]0,1[$, the equality holds if and only if $x = y$. Obviously the function $f$ is continuous (and consequently l.s.c.) on $X$ and such that $\lim_{\|x\| \to \infty} f(x) = +\infty$. In fact, for every $x \in X$, from the coercivity of $a(x, y)$ it follows that

$$f(x) = a(x, x) - 2\langle x, x^* \rangle \ge \alpha \|x\|^2 - 2\|x\| \cdot \|x^*\|$$

and since the right-hand side tends to $+\infty$ as $\|x\| \to \infty$, we have

$$\lim_{\|x\| \to \infty} f(x) = +\infty$$

Therefore, if $K$ is a closed convex subset of $X$, there exists, by theorem 1, a unique $x_0 \in K$ such that $f(x_0) = \inf\{ f(x) : x \in K \}$.

Since $f$ is obviously Gateaux-differentiable at every $x \in X$ with the Gateaux-derivative $f'(x)$ such that $\langle y, f'(x) \rangle = 2a(x, y) - 2\langle y, x^* \rangle$, theorem 2 entails that at $x_0$

$$a(x_0, x - x_0) - \langle x - x_0, x^* \rangle \ge 0 \quad \forall x \in K \qquad (4.5)$$

We have so proved that the variational inequality (4.5) has one and only one solution in $K$.
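As a small numerical illustration of (4.5), the following sketch takes $X = \mathbb{R}^2$, $a(x, y) = x^T A y$ with a symmetric positive-definite matrix $A$, and $K = [0,1]^2$. The matrix, the functional `xs` (playing the role of $x^*$), the box and the projected-gradient iteration are all assumptions made for this illustration; they are not taken from the lectures.

```python
def a(x, y, A):
    """Bilinear form a(x, y) = x^T A y on R^2."""
    return sum(x[i] * A[i][j] * y[j] for i in range(2) for j in range(2))

def pairing(x, xs):
    """Duality pairing <x, x*> on R^2 (ordinary dot product)."""
    return sum(xi * si for xi, si in zip(x, xs))

def project_box(x, lo=0.0, hi=1.0):
    """Projection onto the closed convex set K = [lo, hi]^2."""
    return [min(max(xi, lo), hi) for xi in x]

def minimize_on_box(A, xs, steps=500, step=0.1):
    """Projected gradient iteration for f(x) = a(x, x) - 2<x, x*> over K."""
    x = [0.5, 0.5]
    for _ in range(steps):
        grad = [2 * sum(A[i][j] * x[j] for j in range(2)) - 2 * xs[i]
                for i in range(2)]
        x = project_box([x[i] - step * grad[i] for i in range(2)])
    return x

def vi_residual(x0, x, A, xs):
    """Left-hand side of (4.5): a(x0, x - x0) - <x - x0, x*>."""
    d = [x[i] - x0[i] for i in range(2)]
    return a(x0, d, A) - pairing(d, xs)

A = [[2.0, 1.0], [1.0, 2.0]]   # symmetric, positive definite (illustrative)
xs = [3.0, -1.0]               # illustrative choice of x*
x0 = minimize_on_box(A, xs)

# Evaluate the variational inequality on a grid of test points x in K.
grid = [[i / 10, j / 10] for i in range(11) for j in range(11)]
worst = min(vi_residual(x0, x, A, xs) for x in grid)
```

For this data the unconstrained minimizer lies outside $K$, so the computed point sits on the boundary of the box, and the residual of (4.5) stays non-negative on the test grid up to rounding.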



IAEA-SMR-17/29

AN INTRODUCTION TO PROBABILITY THEORY

J. ZABCZYK
Institute of Mathematics, Polish Academy of Sciences, Warsaw, Poland

Abstract

AN INTRODUCTION TO PROBABILITY THEORY.
The purpose of this paper is to give the probabilistic foundations of stochastic control theory and to show some applications of probability theory to functional analysis. Such topics as martingales, conditional expectations, Wiener process, linear stochastic equations and Ito's integral are treated in a rigorous way. Only the elements of integration theory and normed spaces are taken for granted.

INTRODUCTION

The purpose of this paper is to give the probabilistic foundations to stochastic control theory and at the same time to present some applications of probability theory to functional analysis.

The formal prerequisite of the paper is familiarity with the concept of (probability) measure and Lebesgue's integral and a knowledge of such theorems (from integration theory) as Lebesgue's theorem, Fatou's lemma and Lebesgue's monotone convergence theorem, although all these theorems are formulated in the text. To make use of elementary parts of measure theory only, we omit all kinds of "extension theorems" such as the Carathéodory or Kolmogorov theorems and instead we take for granted the Lebesgue measure on the Euclidean space $\mathbb{R}^n$. We also assume as known the elementary properties of Hilbert and normed spaces. With these exceptions, the paper is self-contained and complete, although some proofs are given through problems.

The text is divided into five chapters: Preliminaries, Martingales, Conditioning, Wiener Process, Ito's Stochastic Integral. Preliminaries contain basic information on $\sigma$-fields, measurability and independence. Examples of normal distributions and independent random variables are also introduced. As far as applications to functional analysis are concerned, the next chapter, on Martingales, is the most important. "The basic definitions (of martingale theory) are inspired by crude notions of gambling but the theory has become a sophisticated tool of modern mathematics drawing from and contributing to other fields" [1]. We apply martingales to prove fundamental properties of Haar and Rademacher systems and the Radon-Nikodym theorem. The main results of this chapter are Doob's inequalities and the Martingale convergence theorem. Some relations between martingales and Markov chains (important in stochastic stability) are also indicated. The material of the chapter Conditioning is extensively used in stochastic control theory (see the lectures "Stochastic control of discrete-time systems"). An effort is made to link general definitions with concrete examples. Chapters 4 and 5 treat the Wiener process and Ito's integral. They can be considered a starting point for stochastic differential equations and for stochastic control of continuous-time systems. Linear stochastic equations and their physical interpretation ("white noise") are also considered.

In preparing the paper, we have used many existing books and articles on probability and martingales, among others an article on martingales by J.L. Doob [1] and the books "Probability" by J. Lamperti, "Probability and Potentials" by P.A. Meyer, and "Stochastic Integrals" by H.P. McKean Jr. [2, 3, 4].

1. PRELIMINARIES

1.1. Random variables and generated $\sigma$-fields

Let $\Omega$ be a set. A collection $\mathcal{F}$ of subsets of $\Omega$ is said to be a $\sigma$-field ($\sigma$-algebra) if

1) $\Omega \in \mathcal{F}$;

2) If $A \in \mathcal{F}$ then $A^c$, the complement of $A$, belongs to $\mathcal{F}$ too;

3) If $A_n \in \mathcal{F}$, $n = 1, 2, \dots$, then $\bigcup_{n=1}^{+\infty} A_n \in \mathcal{F}$.

The pair $(\Omega, \mathcal{F})$ is called a measurable space. Let $(\Omega, \mathcal{F})$, $(E, \mathcal{E})$ be two measurable spaces. A mapping $X$ from $\Omega$ into $E$ such that for all $A \in \mathcal{E}$ the set $\{\omega : X(\omega) \in A\}$ belongs to $\mathcal{F}$ is called a measurable mapping or a random variable.

Problem 1. Show that the composition of two random variables is also a random variable.

Let $\mathcal{M}$ be a collection of subsets of $\Omega$. The smallest $\sigma$-field on $\Omega$ which contains $\mathcal{M}$ is denoted by $\sigma(\mathcal{M})$ and is called the $\sigma$-field generated by $\mathcal{M}$. It is obtained as the intersection of all $\sigma$-fields on $\Omega$ containing $\mathcal{M}$. Analogously, let $(X_i)_{i \in I}$ be a family of mappings from $\Omega$ into $(E, \mathcal{E})$; then the smallest $\sigma$-field on $\Omega$ with respect to which all functions $X_i$ are measurable is called the $\sigma$-field generated by $(X_i)_{i \in I}$ and is denoted by $\sigma(X_i : i \in I)$.

Example 1. Let $X$ be a mapping from $\Omega$ into $(E, \mathcal{E})$; then $\sigma(X) = \{A_a : a \in \mathcal{E}\}$ where $A_a = \{\omega : X(\omega) \in a\}$.
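On a finite set the generated $\sigma$-field can be computed by brute force, which may help to make the definition concrete. The sketch below is only an illustration under that finiteness assumption: countable unions reduce to finite ones, and the function name `generated_sigma_field` is ours.

```python
from itertools import combinations

def generated_sigma_field(omega, collection):
    """Smallest family containing `collection` that is closed under
    complements and unions; on a finite omega this is the generated
    sigma-field."""
    omega = frozenset(omega)
    family = {frozenset(), omega} | {frozenset(s) for s in collection}
    changed = True
    while changed:
        changed = False
        current = list(family)
        for s in current:                      # close under complements
            if omega - s not in family:
                family.add(omega - s)
                changed = True
        for s, t in combinations(current, 2):  # close under unions
            if s | t not in family:
                family.add(s | t)
                changed = True
    return family

sigma = generated_sigma_field({1, 2, 3, 4}, [{1}, {1, 2}])
```

Here the generating sets $\{1\}$ and $\{1,2\}$ split $\Omega = \{1,2,3,4\}$ into the atoms $\{1\}$, $\{2\}$, $\{3,4\}$, so the result consists of all unions of these atoms.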

Let $E$ be a metric space. The $\sigma$-field on $E$ generated by all open (or all closed) subsets of $E$ is called the Borel $\sigma$-field and is denoted $\mathcal{B}(E)$.

Let $\mathbb{R}^1$ be the real line. A real-valued random variable is a random variable with values in $(\mathbb{R}^1, \mathcal{B}(\mathbb{R}^1))$.

A random variable which takes only a finite number of values is called a simple (or elementary) random variable. The proofs of the following two propositions are easy and left to the reader (see also Ref. [5]).

Proposition 1. Let $X, Y, X_1, X_2, \dots$ be real-valued random variables and $\alpha, \beta \in \mathbb{R}^1$; then $X \vee Y = \max(X, Y)$, $X \wedge Y = \min(X, Y)$, $\alpha X + \beta Y$ and $\limsup_n X_n$, $\liminf_n X_n$ are also random variables (in general, with values in $(\bar{\mathbb{R}}, \mathcal{B}(\bar{\mathbb{R}}))$, where $\bar{\mathbb{R}} = \mathbb{R} \cup \{-\infty, +\infty\}$).

Proposition 2. If $X$ is a non-negative random variable (r.v.), then there exists an increasing sequence of elementary non-negative r.v.'s that converges to $X$.


Let $X$ be a measurable mapping from $\Omega$ into $(E, \mathcal{E})$. Very often it is necessary to consider real-valued r.v.'s which are measurable with respect to $\sigma(X)$. The following lemma shows that such r.v.'s are measurable functions of $X$.

Lemma 1. A real-valued function $Y$ defined on $\Omega$ is measurable with respect to $\sigma(X)$ if and only if it is of the form $Y = f(X)$ for some real-valued r.v. $f$ defined on $(E, \mathcal{E})$.

Proof. If $f$ is a real-valued $\mathcal{E}$-measurable function then $f(X)$ is $\sigma(X)$-measurable as a composition of two random variables (see Problem 1). Let now $Y$ be any $\sigma(X)$-measurable function. We can assume that $Y$ is non-negative. If $Y$ is the indicator $I_A$ of a set $A \in \sigma(X)$ then (see Example 1) $A = \{X \in a\}$ for some $a \in \mathcal{E}$ and, therefore, $Y = I_a(X)$. Consequently, the lemma is true in this case. Since every simple r.v. is a linear combination of indicators, the lemma is true for all simple r.v.'s. If $Y = \lim_n f_n(X)$ and the $f_n$ are $\mathcal{E}$-measurable then $Y = f(X)$ where $f = \liminf_n f_n$. Application of Propositions 1 and 2 finishes the proof.

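Let $(E_1, \mathcal{E}_1), \dots, (E_n, \mathcal{E}_n)$ be measurable spaces; then by $\mathcal{E}_1 \times \dots \times \mathcal{E}_n$ we denote the smallest $\sigma$-field of subsets of $E_1 \times E_2 \times \dots \times E_n$ which contains all sets of the form $A_1 \times \dots \times A_n$, $A_i \in \mathcal{E}_i$, $i = 1, 2, \dots, n$.

Problem 2. Let $X_1$ and $X_2$ be mappings from $\Omega$ into $E_1$ and $E_2$, respectively. Show that the mapping $(X_1, X_2)$ from $\Omega$ into $E_1 \times E_2$ is a random variable if the mappings $X_1$ and $X_2$ are random variables. Hint: The family of all sets $A \in \mathcal{E}_1 \times \mathcal{E}_2$ such that $\{\omega : (X_1(\omega), X_2(\omega)) \in A\} \in \mathcal{F}$ is a $\sigma$-field.

Lemma 2. Let $\mathcal{M}_1, \dots, \mathcal{M}_n$ be some families of subsets of $E_1, E_2, \dots, E_n$ respectively and let $\mathcal{M}$ be the collection of all sets $A_1 \times \dots \times A_n$, $A_1 \in \mathcal{M}_1, \dots, A_n \in \mathcal{M}_n$. Then

$$\sigma(\mathcal{M}) = \sigma(\mathcal{M}_1) \times \dots \times \sigma(\mathcal{M}_n)$$

Proof. Since $\mathcal{M} \subset \sigma(\mathcal{M}_1) \times \dots \times \sigma(\mathcal{M}_n)$, therefore $\sigma(\mathcal{M}) \subset \sigma(\mathcal{M}_1) \times \dots \times \sigma(\mathcal{M}_n)$. On the other hand, let $\mathcal{G}_i$ be the family of all sets $A_i \in \sigma(\mathcal{M}_i)$ for which $E_1 \times \dots \times E_{i-1} \times A_i \times E_{i+1} \times \dots \times E_n \in \sigma(\mathcal{M})$. Then $\mathcal{G}_i$ is a $\sigma$-field and contains all sets from $\mathcal{M}_i$, thus $\mathcal{G}_i = \sigma(\mathcal{M}_i)$. From this, if $A_i \in \sigma(\mathcal{M}_i)$, $i = 1, 2, \dots, n$, then

$$A_1 \times \dots \times A_n = \bigcap_{i=1}^{n} E_1 \times \dots \times E_{i-1} \times A_i \times E_{i+1} \times \dots \times E_n \in \sigma(\mathcal{M})$$

and $\sigma(\mathcal{M}_1) \times \dots \times \sigma(\mathcal{M}_n) \subset \sigma(\mathcal{M})$.

Corollary 1. For any natural numbers $p_1, p_2, \dots, p_n$

$$\mathcal{B}(\mathbb{R}^{p_1} \times \dots \times \mathbb{R}^{p_n}) = \mathcal{B}(\mathbb{R}^{p_1}) \times \dots \times \mathcal{B}(\mathbb{R}^{p_n})$$

Corollary 2. Let $X = (X_1, \dots, X_n)$, where the $X_i$ are random variables with values in $(\mathbb{R}^p, \mathcal{B}(\mathbb{R}^p))$; then any real-valued $\sigma(X)$-measurable function $Y$ is of the form $Y = f(X_1, X_2, \dots, X_n)$ where $f$ is a Borel function on $\mathbb{R}^{np}$.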

A collection $\mathcal{M}$ of subsets of $\Omega$ is said to be a $\pi$-system if $\emptyset \in \mathcal{M}$ and if $A, B \in \mathcal{M}$ then $A \cap B \in \mathcal{M}$.

Example 2. If $E$ is a metric space then the family of all open (closed) subsets of $E$ is a $\pi$-system. If $E = \mathbb{R}$ or $E = [0, 1)$ then, respectively, $\{(-\infty, b] : b \in \mathbb{R}\}$ and $\{[a, b) : 0 \le a \le b < 1\}$ are $\pi$-systems.

The lemma below will be used frequently in the sequel.

Lemma 3. If $\mathcal{M}$ is a $\pi$-system and $\mathcal{G}$ is the smallest family of subsets of $\Omega$ such that

1) $\mathcal{M} \subset \mathcal{G}$;

2) If $A \in \mathcal{G}$ then $A^c \in \mathcal{G}$;

3) If $A_1, A_2, \dots \in \mathcal{G}$ and $A_n \cap A_m = \emptyset$ for $n \ne m$ then $\bigcup_{n=1}^{+\infty} A_n \in \mathcal{G}$,

then $\mathcal{G} = \sigma(\mathcal{M})$.

Proof. Since $\sigma(\mathcal{M})$ satisfies 1), 2) and 3), $\mathcal{G} \subset \sigma(\mathcal{M})$. To prove the opposite inclusion, we show first that $\mathcal{G}$ is a $\pi$-system. Let $A \in \mathcal{G}$ and define $\mathcal{G}_A = \{B : B \in \mathcal{G} \text{ and } A \cap B \in \mathcal{G}\}$. It is easy to check that $\mathcal{G}_A$ satisfies 2) and 3), and if $A \in \mathcal{M}$ then condition 1) is also satisfied. Thus for $A \in \mathcal{M}$, $\mathcal{G}_A = \mathcal{G}$, and we have proved that if $A \in \mathcal{M}$ and $B \in \mathcal{G}$ then $A \cap B \in \mathcal{G}$. But this implies $\mathcal{G}_B \supset \mathcal{M}$ and, consequently, $\mathcal{G}_B = \mathcal{G}$ for any $B \in \mathcal{G}$. Now the application of the following problem finishes the proof.

Problem 3. If a $\pi$-system $\mathcal{G}$ satisfies 2) and 3) then $\mathcal{G}$ is a $\sigma$-field.

Lemma 4. Let $f$ be a measurable mapping from $(E_1 \times E_2, \mathcal{E}_1 \times \mathcal{E}_2)$ into $(E, \mathcal{E})$. For every $x_1 \in E_1$, $f(x_1, \cdot)$ is a measurable mapping from $(E_2, \mathcal{E}_2)$ into $(E, \mathcal{E})$.

Proof. Assume first that $E = \mathbb{R}$ and $\mathcal{E} = \mathcal{B}(\mathbb{R})$ and let $\mathcal{G}$ be the family of all sets $A \in \mathcal{E}_1 \times \mathcal{E}_2$ such that for all $x_1 \in E_1$ the function $I_A(x_1, \cdot)$ is a real-valued r.v. on $(E_2, \mathcal{E}_2)$. If $A = A_1 \times A_2$ where $A_1 \in \mathcal{E}_1$, $A_2 \in \mathcal{E}_2$ then clearly $A \in \mathcal{G}$. Moreover $\mathcal{G}$ satisfies conditions 2) and 3) of Lemma 3, therefore $\mathcal{G} = \mathcal{E}_1 \times \mathcal{E}_2$. Let $f$ be a simple function; then

$$f(x_1, \cdot) = \sum_{i=1}^{n} \alpha_i I_{A_i}(x_1, \cdot)$$

for some disjoint sets $A_i \in \mathcal{E}_1 \times \mathcal{E}_2$, real numbers $\alpha_i \in \mathbb{R}$, $i = 1, 2, \dots, n$, and all $x_1 \in E_1$. Taking into account that a linear combination of measurable functions is also a measurable function, we see that $f(x_1, \cdot)$ is measurable in this case, too. But every non-negative measurable function $f$ is a limit of simple functions, therefore the lemma is valid for all non-negative, and thus for all, measurable functions. If $(E, \mathcal{E})$ is an arbitrary measurable space and $A \in \mathcal{E}$, then the composition $I_A(f)$ ($I_A$ is the indicator function of $A$) is a random variable on $E_1 \times E_2$. Applying the first part of the proof, we see that for every $x_1 \in E_1$ the function $I_A(f(x_1, \cdot))$ is $\mathcal{E}_2$-measurable. But then $\{x_2 : f(x_1, x_2) \in A\} = \{x_2 : I_A(f(x_1, x_2)) = 1\} \in \mathcal{E}_2$ and the proof is complete.

1.2. Integration

The definition of the integral $\int_\Omega X\,dP$ of a real-valued r.v. is taken for granted (see Refs [3, 5]). We only recall that the integral is well defined if either $\int_\Omega X^+ dP < +\infty$ or $\int_\Omega X^- dP < +\infty$, where $X^+ = X \vee 0$, $X^- = (-X) \vee 0$, and that $\int_\Omega X\,dP = \int_\Omega X^+ dP - \int_\Omega X^- dP$. If $\int_\Omega |X|\,dP < +\infty$ we say that $X$ is an integrable random variable. In probability theory, the integral of a random variable $X$ is called the expectation of $X$ and is denoted $E(X)$.

For the proof of the following two theorems we refer to Ref. [5].

Lebesgue's theorem. Let $X_1, X_2, \dots$ be real-valued r.v.'s. If for some integrable r.v. $Y$, $|X_n| \le Y$ almost surely, $n = 1, 2, \dots$, and the sequence $(X_n)$ converges (a.s.) to $X$, then $E(X_n) \to E(X)$.

Fatou's lemma. Let $X_1, X_2, \dots$ be non-negative r.v.'s; then $\liminf_n E(X_n) \ge E(\liminf_n X_n)$. If, in addition, the sequence $(X_n)$ is increasing to $X$, then $E(X_n) \to E(X)$. This last property is called Lebesgue's monotone convergence theorem.

1.3. Independence

Let $(\Omega, \mathcal{F}, P)$ be a probability space and let $(\mathcal{F}_i)_{i \in I}$ be a family of sub-$\sigma$-fields of $\mathcal{F}$. These $\sigma$-fields are said to be independent if, for every finite subset $J \subset I$ and every family $(A_i)_{i \in J}$ such that $A_i \in \mathcal{F}_i$, $i \in J$,

$$P\Big(\bigcap_{i \in J} A_i\Big) = \prod_{i \in J} P(A_i).$$

Random variables $(X_i)_{i \in I}$ are independent if the $\sigma$-fields $(\sigma(X_i))_{i \in I}$ are independent.

Problem 1. Let $X_1$ and $X_2$ be two real-valued r.v.'s, $X_1$ measurable with respect to $\mathcal{F}_1$, $X_2$ with respect to $\mathcal{F}_2$. If $\mathcal{F}_1$, $\mathcal{F}_2$ are independent $\sigma$-fields and the expectation $E(X_1 X_2)$ is well defined, then

$$E(X_1 X_2) = E(X_1)\,E(X_2) \qquad (1)$$

Hint: Assume that $X_1$, $X_2$ are non-negative; show first that (1) holds for simple real-valued r.v.'s. Use Proposition 1, Section 1.1, and Lebesgue's monotone convergence theorem (Section 1.2).

Example 1. Let us consider the probability space $([0, 1), \mathcal{B}[0, 1), P)$ where $P$ is the Lebesgue measure on the interval $[0, 1)$, and define for every $n = 1, 2, \dots$ and $k = 1, 2, \dots, 2^n$ the intervals $I_k^n = \left[\frac{k-1}{2^n}, \frac{k}{2^n}\right)$. The random variables

$$X_n(\omega) = \begin{cases} 0 & \text{if } \omega \in I_k^n,\ k \text{ odd} \\ 1 & \text{if } \omega \in I_k^n,\ k \text{ even} \end{cases}$$

are independent.

Proof. It follows, by induction on $n$, that for every $e_i = 0$ or $1$, $i = 1, 2, \dots, n$,

$$\{\omega : X_1(\omega) = e_1, \dots, X_n(\omega) = e_n\} = \left[\frac{e_1}{2} + \dots + \frac{e_n}{2^n},\ \frac{e_1}{2} + \dots + \frac{e_n}{2^n} + \frac{1}{2^n}\right)$$

This implies that $P(X_1 = e_1, \dots, X_n = e_n) = \frac{1}{2^n} = \prod_{i=1}^{n} P(X_i = e_i)$. An application of the definition of independence and the fact that $A_i \in \sigma(X_i)$ if and only if $A_i = \{\omega : X_i \in a_i\}$ for some Borel set $a_i$ (see Example 1 (Section 1.1)) finishes the proof.
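The product formula in the proof above can be checked exactly for small $n$. The following sketch uses rational arithmetic; the helper names `digit` and `joint_probability` are ours, and the check relies on the fact that $X_1, \dots, X_n$ are constant on each interval $I_k^n$.

```python
from fractions import Fraction
from itertools import product

def digit(n, omega):
    """X_n(omega): 0 if omega lies in I_k^n with k odd, 1 if k is even."""
    k = int(omega * 2**n) + 1      # index k of the interval I_k^n containing omega
    return 0 if k % 2 == 1 else 1

def joint_probability(pattern):
    """Lebesgue measure of {X_1 = e_1, ..., X_n = e_n}, computed exactly."""
    n = len(pattern)
    # X_1, ..., X_n are constant on level-n dyadic intervals,
    # so it suffices to evaluate them at the left endpoints k / 2^n.
    hits = sum(
        all(digit(i + 1, Fraction(k, 2**n)) == e for i, e in enumerate(pattern))
        for k in range(2**n)
    )
    return Fraction(hits, 2**n)

probabilities = {p: joint_probability(p) for p in product((0, 1), repeat=4)}
```

Every pattern of length 4 receives probability $1/16$, the product of the four marginal probabilities $1/2$. The same function `digit` returns the binary digits of $\omega$.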


Problem 2. Show that for every $\omega \in [0, 1)$,

$$\omega = \sum_{k=1}^{+\infty} \frac{X_k(\omega)}{2^k} \quad \text{(dyadic expansion)}.$$

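Lemma 1. Let $\mathcal{M}_i$ be a $\pi$-system on $\Omega$ and let $\mathcal{F}_i = \sigma(\mathcal{M}_i)$, $i \in I$. The $\sigma$-fields $(\mathcal{F}_i)_{i \in I}$ are independent if for every finite set $J \subset I$ and sets $A_i \in \mathcal{M}_i$, $i \in J$,

$$P\Big(\bigcap_{i \in J} A_i\Big) = \prod_{i \in J} P(A_i).$$

Proof. Assume, without loss of generality, that $I = J = \{1, 2, \dots, n\}$. Let us fix the sets $A_2 \in \mathcal{M}_2, A_3 \in \mathcal{M}_3, \dots, A_n \in \mathcal{M}_n$ and denote by $\mathcal{G}_1$ the family of all sets $A_1 \in \mathcal{F}_1$ for which

$$P\Big(\bigcap_{k=1}^{n} A_k\Big) = \prod_{k=1}^{n} P(A_k) \qquad (2)$$

The family $\mathcal{G}_1$ and the $\pi$-system $\mathcal{M}_1$ satisfy conditions 1), 2) and 3) of Lemma 3 (Section 1.1), therefore $\mathcal{G}_1 = \sigma(\mathcal{M}_1) = \mathcal{F}_1$. Analogously, let us fix sets $A_1 \in \mathcal{F}_1$ and $A_i \in \mathcal{M}_i$, $i = 3, 4, \dots, n$, and denote by $\mathcal{G}_2$ the family of all $A_2 \in \mathcal{F}_2$ which satisfy (2). Then $\mathcal{G}_2 = \sigma(\mathcal{M}_2) = \mathcal{F}_2$. An easy induction shows that (2) holds for all $A_i \in \mathcal{F}_i$, $i = 1, \dots, n$.

Corollary 1. Random variables $X_i$ with values in $(E_i, \sigma(\mathcal{M}_i))$, where the $\mathcal{M}_i$ are $\pi$-systems on $E_i$, $i \in I$, are independent if

$$P\Big(\bigcap_{i \in J} \{X_i \in A_i\}\Big) = \prod_{i \in J} P(X_i \in A_i)$$

for any finite subset $J \subset I$ and sets $A_i \in \mathcal{M}_i$, $i \in J$.

Proof. It is sufficient to remark that

$$\sigma(X_i) = \sigma(\{X_i \in A_i\} : A_i \in \sigma(\mathcal{M}_i)) = \sigma(\{X_i \in A_i\} : A_i \in \mathcal{M}_i)$$

Corollary 2. Since all intervals $(-\infty, x]$, $x \in \mathbb{R}$, form a $\pi$-system which generates $\mathcal{B}(\mathbb{R})$, real-valued r.v.'s $X_1, X_2, \dots, X_n$ are independent if and only if

$$P\Big(\bigcap_{i=1}^{n} \{X_i \le x_i\}\Big) = \prod_{i=1}^{n} P(X_i \le x_i)$$

for any real numbers $x_1, x_2, \dots, x_n$.

Problem 3. Let $X_1, X_2, \dots$ be independent random variables with values in $(E, \mathcal{E})$ and let $J_1, J_2, \dots$ be a finite or infinite sequence of disjoint subsets of $\{1, 2, \dots\}$. Then the $\sigma$-fields $\mathcal{F}_i = \sigma(X_k ; k \in J_i)$, $i = 1, 2, \dots$, are independent. Hint: The $\sigma$-fields $\mathcal{F}_i$ are generated by the $\pi$-systems of all sets $\bigcap_{k \in J_i} \{X_k \in A_k\}$ where $A_k \in \mathcal{E}$ and $A_k \ne E$ only for a finite number of $k$.

The lemma below is an important generalization of Problem 1. The proof is analogous to that of Lemma 4 (Section 1.1).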

Lemma 2. Let $f$ be a real-valued random variable defined on $(E_1 \times E_2, \mathcal{E}_1 \times \mathcal{E}_2)$ and let $X_1$, $X_2$ be independent random variables from $(\Omega, \mathcal{F}, P)$ into, respectively, $(E_1, \mathcal{E}_1)$, $(E_2, \mathcal{E}_2)$. If the expectation $E(f(X_1, X_2))$ is well defined then the function $f_1(\cdot) = E(f(\cdot, X_2))$ is $\mathcal{E}_1$-measurable and $E(f(X_1, X_2)) = E(f_1(X_1))$.

Proof. It is sufficient to prove Lemma 2 for functions $f = I_A$ where $A \in \mathcal{E}_1 \times \mathcal{E}_2$. Denote by $\mathcal{G}$ the family of all sets $A$ such that for $f = I_A$ the lemma holds. Then $\mathcal{G} \supset \mathcal{M}$, where $\mathcal{M}$ is the $\pi$-system of all sets $B_1 \times B_2$, $B_1 \in \mathcal{E}_1$, $B_2 \in \mathcal{E}_2$. Since $\mathcal{E}_1 \times \mathcal{E}_2 = \sigma(\mathcal{M})$ and the family $\mathcal{G}$ satisfies assumptions 2) and 3) of Lemma 3, Section 1.1, this yields $\mathcal{G} = \mathcal{E}_1 \times \mathcal{E}_2$.

1.4. Distributions of random variables

Let $(\Omega, \mathcal{F}, P)$ be a probability space, $(E, \mathcal{E})$ a measurable space and $X$ a random variable from $\Omega$ into $E$. The distribution of the r.v. $X$ is a probability law on $(E, \mathcal{E})$ defined as $\mu_X(A) = P(\omega : X(\omega) \in A)$, $A \in \mathcal{E}$.

Lemma 1. Let $\mu$ and $\nu$ be two probability measures on $(E, \sigma(\mathcal{M}))$ where $\mathcal{M}$ is a $\pi$-system. If

$$\mu(A) = \nu(A) \qquad (1)$$

for $A \in \mathcal{M}$, then (1) holds for all $A \in \sigma(\mathcal{M})$.

Proof. Denote by $\mathcal{G}$ the family of all sets $A \in \sigma(\mathcal{M})$ for which (1) holds. Then $\mathcal{G}$ satisfies assumptions 1), 2) and 3) of Lemma 3 (Section 1.1) and, consequently, $\mathcal{G} = \sigma(\mathcal{M})$.

Corollary 1. Let $\mu$ be a probability measure on $(\mathbb{R}^p, \mathcal{B}(\mathbb{R}^p))$. Then the distribution function $F$ of $\mu$ defined as

$$F(x) = \mu(y : y \le x), \quad x \in \mathbb{R}^p$$

determines $\mu$ uniquely. ($(y_1, \dots, y_p) \le (x_1, \dots, x_p)$ means $x_1 \ge y_1, \dots, x_p \ge y_p$.)

Proof. Since the family $\mathcal{M}$ of all sets $\{y : y \le x\}$, $x \in \mathbb{R}^p$, is a $\pi$-system and $\mathcal{B}(\mathbb{R}^p) = \sigma(\mathcal{M})$, Corollary 1 follows from Lemma 3, Section 1.1.

Thus if $X$ is an r.v. with values in $\mathbb{R}^p$ then the distribution function of $X$, defined as $F_X = F_{\mu_X}$, determines the distribution of $X$.

Example 1. If $X$ is a real-valued r.v. such that

$$F_X(t) = \begin{cases} 0 & \text{for } t < 0 \\ t & \text{for } 0 \le t < 1 \\ 1 & \text{for } 1 \le t \end{cases}$$

then the distribution of $X$ is the Lebesgue measure restricted to $[0, 1]$. We say, in this case, that $X$ has a uniform distribution on $[0, 1]$.

The concept of distribution is often applied through the following lemma:

Lemma 2. If $f$ is a non-negative random variable defined on $(E, \mathcal{E})$ then

$$E(f(X)) = \int_E f(x)\,\mu_X(dx) \qquad (2)$$

Proof. Formula (2) holds if $f = I_A$, $A \in \mathcal{E}$, and thus for any simple random variable. If $f$ is a non-negative r.v. then $f = \lim_n f_n$ for some increasing sequence of simple non-negative r.v.'s $f_1, f_2, \dots$. Since $\int_\Omega f_n(X)\,dP = \int_E f_n(x)\,\mu_X(dx)$, (2) holds because of Lebesgue's monotone convergence theorem (Section 1.2).

Corollary 2. The random variable $f(X)$ is integrable if and only if the function $f$ is $\mu_X$-integrable.

Let $\mu$ be a probability measure on $(\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n))$ such that $\int_{\mathbb{R}^n} |x|^2 \mu(dx) < +\infty$ ($|x|^2 = x_1^2 + \dots + x_n^2$). The column vector $m$ with components $m_i = \int_{\mathbb{R}^n} x_i\,\mu(dx)$, $i = 1, \dots, n$, is called the mean vector of $\mu$, and the matrix $Q = (\sigma_{i,j})_{i,j = 1, \dots, n}$, where $\sigma_{i,j} = \int_{\mathbb{R}^n} (x_i - m_i)(x_j - m_j)\,\mu(dx)$, is called the covariance matrix of $\mu$.

Let us remark that if $\mu = \mu_X$ is the distribution of a random variable $X$ with components $X_1, X_2, \dots, X_n$ and $E(X_i^2) < +\infty$, $i = 1, \dots, n$, then, by virtue of Lemma 2, $m_i = E(X_i)$, $\sigma_{i,j} = E(X_i - m_i)(X_j - m_j)$, $i, j = 1, \dots, n$. The vector $m$ and the matrix $Q$ are called in this case the mean vector and the covariance matrix of $X$.

If $T$ is a linear mapping from $\mathbb{R}^n$ into $\mathbb{R}^k$ (or, equivalently, $T$ is a $k \times n$ matrix) and $m$ is a vector from $\mathbb{R}^n$, then $T'$ and $m'$ denote, respectively, the conjugate mapping (the transpose matrix) and the transpose vector.

Problem 1. Show that the mean vector and the covariance matrix of the random variable $TX$ are equal to $Tm$ and $TQT'$. Hint: $E(TX) = TE(X)$ and $E(TX - Tm)(TX - Tm)' = T(E(X - m)(X - m)')T'$.
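The identities of Problem 1 can be checked exactly when $\mu$ is a finite uniform distribution on a few points of $\mathbb{R}^2$. The points and the matrix $T$ below are arbitrary illustrative choices, and the helper functions are ours; this is a sanity check on one example, not a proof.

```python
from fractions import Fraction as Fr

def mean_and_cov(points):
    """Mean vector and covariance matrix of the uniform law on `points`."""
    n, d = len(points), len(points[0])
    m = [sum(p[i] for p in points) / n for i in range(d)]
    Q = [[sum((p[i] - m[i]) * (p[j] - m[j]) for p in points) / n
          for j in range(d)] for i in range(d)]
    return m, Q

def matvec(T, x):
    return [sum(t * xi for t, xi in zip(row, x)) for row in T]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

# Illustrative data: four equally likely points in R^2 and a 3 x 2 matrix T.
points = [[Fr(0), Fr(1)], [Fr(2), Fr(-1)], [Fr(1), Fr(3)], [Fr(4), Fr(0)]]
T = [[Fr(1), Fr(2)], [Fr(0), Fr(-1)], [Fr(3), Fr(1)]]

m, Q = mean_and_cov(points)
m_T, Q_T = mean_and_cov([matvec(T, p) for p in points])  # law of TX
```

With rational arithmetic the moments of $TX$ computed directly coincide with $Tm$ and $TQT'$.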

1.5. Normal distributions

Let $\mu$ be a measure on $(\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n))$ and $g$ a Borel function such that

$$\mu(A) = \int_A g(x)\,dx \quad \text{for all } A \in \mathcal{B}(\mathbb{R}^n);$$

then the function $g$ is called a density of the measure $\mu$ (with respect to the n-dimensional Lebesgue measure, which is taken for granted).

Here, we introduce the normal distribution through the following problems (see Ref. [6, vol. 2]):

Problem 1. Let $R$ be an $n \times n$ symmetric matrix; then the function $\exp\{-(1/2)\langle Rx, x \rangle\}$, $x \in \mathbb{R}^n$, is integrable on $\mathbb{R}^n$ if and only if $\langle Rx, x \rangle > 0$ for all $x \ne 0$, i.e. if the matrix $R$ is positive-definite. Hint:

$$\int_{-\infty}^{+\infty} e^{-\gamma x^2}\,dx < +\infty$$

if and only if $\gamma > 0$. Use an appropriate system of co-ordinates on $\mathbb{R}^n$.

Problem 2. Let $Q$ be an $n \times n$ positive-definite matrix and $m$ a vector from $\mathbb{R}^n$; then the function $g_{m,Q}$:

$$g_{m,Q}(x) = ((2\pi)^n \det Q)^{-1/2} \exp\{-(1/2)\langle Q^{-1}(x - m),\ x - m \rangle\}$$

is a density of a probability law on $\mathbb{R}^n$. Hint: Use the formula

$$\frac{1}{\sqrt{2\pi\lambda}} \int_{-\infty}^{+\infty} e^{-\frac{x^2}{2\lambda}}\,dx = 1, \quad \lambda > 0.$$

This probability law is called a (non-degenerate) normal distribution.

Problem 3. The normal distribution with the density $g_{m,Q}$ has the mean vector $m$ and the covariance matrix $Q$. Hint: See Section 1.4.
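The statements of Problems 2 and 3 can be checked numerically for $n = 2$. The sketch below integrates the density by a midpoint rule on a square around $m$; the values of $m$ and $Q$, the size of the square and the grid resolution are illustrative assumptions.

```python
import math

def normal_density(x, m, Q):
    """g_{m,Q}(x) for n = 2, using the explicit inverse of a 2 x 2 matrix."""
    det = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
    inv = [[Q[1][1] / det, -Q[0][1] / det],
           [-Q[1][0] / det, Q[0][0] / det]]
    d = [x[0] - m[0], x[1] - m[1]]
    quad = sum(d[i] * inv[i][j] * d[j] for i in range(2) for j in range(2))
    return math.exp(-0.5 * quad) / math.sqrt((2 * math.pi) ** 2 * det)

def grid_moments(m, Q, half_width=8.0, steps=160):
    """Midpoint-rule approximations of the total mass, mean and covariance."""
    h = 2 * half_width / steps
    total, mean = 0.0, [0.0, 0.0]
    cov = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(steps):
        for j in range(steps):
            x = [m[0] - half_width + (i + 0.5) * h,
                 m[1] - half_width + (j + 0.5) * h]
            w = normal_density(x, m, Q) * h * h   # mass of one grid cell
            total += w
            for r in range(2):
                mean[r] += w * x[r]
                for c in range(2):
                    cov[r][c] += w * (x[r] - m[r]) * (x[c] - m[c])
    return total, mean, cov

m = [1.0, -1.0]
Q = [[2.0, 0.5], [0.5, 1.0]]      # symmetric, positive definite
total, mean, cov = grid_moments(m, Q)
```

Up to the discretization and truncation error, the total mass is 1 and the computed moments agree with $m$ and $Q$.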

Problem 4. Let a random variable $X$ with components $X_1, \dots, X_n$ be normally distributed; then, for every $k \le n$, the random vector with components $X_1, X_2, \dots, X_k$ is also normally distributed on $\mathbb{R}^k$. Hint: Use induction on $k = n, n-1, \dots, 1$.

If $\mu$ is a measure on $\mathbb{R}^n$ and $T$ is a linear map from $\mathbb{R}^n$ into $\mathbb{R}^k$ then the measure $T\mu$ is defined by the formula

$$T\mu(A) = \mu(T^{-1}A), \quad A \in \mathcal{B}(\mathbb{R}^k).$$

If a random variable $X$ has the distribution $\mu$ then the random variable $TX$ has the distribution $T\mu$. Any measure of the form $T\mu$, where $\mu$ is a normal distribution, is called a general normal distribution.

Problem 5. The measure $T\mu$ has the mean vector $Tm$ and the covariance matrix $TQT'$. Hint: Define on $\Omega = \mathbb{R}^n$ the random variable $X(x) = x$, $x \in \mathbb{R}^n$, and use Problem 1 (Section 1.4).

Problem 6. The general normal distribution $T\mu$ is a non-degenerate normal distribution if and only if the linear mapping $T$ is onto or, equivalently, if the matrix $TQT'$ is positive-definite. Hint: See Problem 4.

Corollary 1. A general normal distribution is a non-degenerate normal distribution if and only if its covariance matrix is positive-definite.

Problem 7. The components $X_1, \dots, X_n$ of a normally distributed r.v. $X$ are independent if and only if the covariance matrix $Q$ is diagonal (if $\sigma_{i,j} = 0$ for $i \ne j$).

Problem 8. Let $X_1, X_2, \dots$ be random variables normally distributed (in the general sense) with parameters $(m_1, Q_1), (m_2, Q_2), \dots$. Assume that the sequence $(X_k)$ converges almost surely to a random variable $X$, $m_k \to m$, $Q_k \to Q$ and $Q$ is positive-definite; then $X$ is normally distributed with parameters $(m, Q)$.

1.6. Sequences of independent random variables

Classical probability theory deals with sequences of independent real-valued random variables. In this section, we show that such sequences can be easily constructed if the Lebesgue measure on $[0, 1)$ is taken for granted. The construction goes back to H. Steinhaus.

The theorem below will be used as a source of concrete examples in the next chapters.

Theorem. Let $\mu_1, \mu_2, \dots$ be a sequence of probability measures on $\mathbb{R}^1$. There exists a sequence $(Y_n)$ of independent real-valued random variables defined on $([0, 1), \mathcal{B}[0, 1), P)$, where $P$ is the Lebesgue measure on $[0, 1)$, such that the distributions of $Y_n$ are exactly the measures $\mu_n$, $n = 1, 2, \dots$.

Proof. Let $X_1, X_2, \dots$ be the independent random variables constructed in Example 1, Section 1.3, and let $J_i = \{n_{i,j} : j = 1, 2, \dots\}$, $i = 1, 2, \dots$, be disjoint subsets of the set of natural numbers. Then, because of Problem 3, Section 1.3, the random variables

$$Z_i = \sum_{j=1}^{+\infty} \frac{X_{n_{i,j}}}{2^j}$$

are independent. We show that they have uniform distribution on $[0, 1)$. Let, for instance, $i = 1$ and define

$$S_n = \sum_{j=1}^{n} \frac{X_{n_{1,j}}}{2^j}$$

Then $P(S_n = k/2^n) = 1/2^n$, $k = 0, 1, 2, \dots, 2^n - 1$ (compare Example 1, Section 1.3) and, therefore, for $t \in [0, 1)$, $P(S_n \le t) \to t$. On the other hand, $P(S_n \le t) \to P(Z_1 \le t)$. Thus $P(Z_1 \le t) = t$. An application of Example 1 (Section 1.4) proves that $Z_1$ has the uniform distribution on $[0, 1)$.

Now let $\mu$ be a probability measure on $\mathbb{R}^1$ and let $F = F_\mu$ be its distribution function. Define $F^+(s) = \inf\{t : s \le F(t)\}$, $s \in [0, 1)$. If $Z$ has uniform distribution on $[0, 1)$, then the distribution of $F^+(Z)$ is exactly $\mu$. Indeed, from the definition of $F^+$, for $s \in [0, 1)$ and $t \in (-\infty, +\infty)$, $s \le F(F^+(s))$ and $F^+(F(t)) \le t$. Therefore, $\{s : F^+(s) \le t\} = [0, F(t)]$ and, consequently, $P(\omega : F^+(Z(\omega)) \le t) = P(\omega : Z(\omega) \le F(t)) = F(t)$.
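The map $F^+$ is what is nowadays usually called the generalized inverse (quantile function) of $F$. A minimal sketch for a measure $\mu$ concentrated on three atoms is given below; the atoms, their weights and the evenly spaced points standing in for a uniform variable $Z$ are illustrative assumptions, and only $0 < s < 1$ is handled.

```python
from collections import Counter
from fractions import Fraction as Fr

atoms = [-1, 0, 2]                          # support of mu, in increasing order
probs = [Fr(1, 4), Fr(1, 2), Fr(1, 4)]      # mu({atom})

def F(t):
    """Distribution function F(t) = mu((-inf, t])."""
    return sum(p for x, p in zip(atoms, probs) if x <= t)

def F_plus(s):
    """F^+(s) = inf{t : s <= F(t)} for 0 < s < 1; the infimum is an atom."""
    for x in atoms:
        if s <= F(x):
            return x
    raise ValueError("s must lie in (0, 1)")

# Stand-in for a uniform Z: the midpoints of N equal subintervals of [0, 1).
N = 1000
samples = [Fr(2 * k + 1, 2 * N) for k in range(N)]
counts = Counter(F_plus(s) for s in samples)
law = {x: Fr(c, N) for x, c in counts.items()}
```

The fraction of grid points sent to each atom equals the weight of that atom, which is the discrete counterpart of $P(F^+(Z) \le t) = F(t)$.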

To finish the proof of the theorem, it is sufficient to remark that if $F_1^+, F_2^+, \dots$ are the functions (defined as above) corresponding to the measures $\mu_1, \mu_2, \dots$ and $Z_1, Z_2, \dots$ are the real-valued random variables defined above, then the sequence $F_1^+(Z_1), F_2^+(Z_2), \dots$ has all the properties required.

2. MARTINGALES

2.1. Definition of martingales and supermartingales

Let $T$ be an arbitrary subset of $\mathbb{R}$ ordered by the relation $\le$. Let $(\Omega, \mathcal{F}, P)$ be a probability space and $(\mathcal{F}_t)_{t \in T}$ an increasing family of sub-$\sigma$-fields of $\mathcal{F}$.

A family $(X_t)_{t \in T}$ of finite real-valued random variables adapted to the family $(\mathcal{F}_t)_{t \in T}$ (i.e. $X_t$ are $\mathcal{F}_t$-measurable, $t \in T$) is said to be a martingale (or, respectively, a supermartingale, a submartingale) with respect to the family $(\mathcal{F}_t)_{t \in T}$ if

1) $X_t$ are integrable random variables, $t \in T$;

2) If $s \le t$, then for every event $A \in \mathcal{F}_s$

$$\int_A X_s\,dP = \int_A X_t\,dP$$

or, respectively,

$$\int_A X_s\,dP \ge \int_A X_t\,dP, \qquad \int_A X_s\,dP \le \int_A X_t\,dP.$$

Every real constant (or, respectively, decreasing, increasing) function defined on $T$ is a martingale (or, respectively, a supermartingale, a submartingale).

Let $(X_t)_{t \in T}$ be a supermartingale. The process $(-X_t)_{t \in T}$ is then a submartingale, and conversely.

Example 1. Let $(\Omega, \mathcal{F}, P)$ be a probability space and let $Y_1, Y_2, \dots$ be a sequence of independent random variables (see Section 1.6) defined on $\Omega$ such that $E|Y_n| < +\infty$ and $E(Y_n) = 0$ (or, respectively, $\le 0$, $\ge 0$) for all $n = 1, 2, \dots$; then the sequence $(X_n)$, $X_n = Y_1 + \dots + Y_n$, is a martingale (or, respectively, a super-, a submartingale) with respect to $(\mathcal{F}_n)$, where $\mathcal{F}_n = \sigma(Y_1, \dots, Y_n)$. Indeed, by virtue of Problem 3 (Section 1.3), the $\sigma$-fields $\mathcal{F}_n$ and $\sigma(Y_{n+1})$ are independent. Therefore, if $A \in \mathcal{F}_n$ then $I_A$ and $Y_{n+1}$ are independent random variables. Applying Problem 1 (Section 1.3) we see that

$$\int_A (X_{n+1} - X_n)\,dP = \int_A Y_{n+1}\,dP = E(I_A Y_{n+1}) = P(A)\,E(Y_{n+1}) = 0$$

(or, respectively, $\le 0$, $\ge 0$).
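For fair coin flips $Y_k = \pm 1$ the last identity can be verified by exact enumeration of all paths. In the sketch below an event $A \in \mathcal{F}_n$ is represented as a set of sign patterns of $(Y_1, \dots, Y_n)$; this representation and the function name are our assumptions for the illustration.

```python
from fractions import Fraction
from itertools import product

def increment_integral(n, event):
    """Exact value of the integral over A of (X_{n+1} - X_n) dP, where
    A = {(Y_1, ..., Y_n) in event} and the Y_k are independent fair signs."""
    total = Fraction(0)
    for path in product((-1, 1), repeat=n + 1):
        if path[:n] in event:
            # X_{n+1} - X_n = Y_{n+1}; every path has probability 2^-(n+1)
            total += Fraction(path[n], 2 ** (n + 1))
    return total

# Example event in F_3: the partial sum X_3 = Y_1 + Y_2 + Y_3 is positive.
positive_sum = {p for p in product((-1, 1), repeat=3) if sum(p) > 0}
value = increment_integral(3, positive_sum)
```

The integral vanishes for this event and, by the same cancellation between the two values of $Y_{n+1}$, for every event that depends only on the first $n$ flips.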

Problem 1. An urn contains $b$ black and $w$ white balls. Balls are drawn at random without replacement. Let $b_n$ and $w_n$ denote the numbers of black and white balls in the urn before the n-th drawing. Construct an appropriate probability space and show that the sequence

$$X_n = \frac{b_n}{b_n + w_n}, \quad n = 1, \dots, b + w$$

is a martingale with respect to $\mathcal{F}_n = \sigma(X_1, \dots, X_n)$.
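The core of Problem 1 is a one-step identity: given the current content of the urn, the expected proportion of black balls after one more draw equals the current proportion. The sketch below checks this identity exactly for small urns; it conditions on the pair $(b_n, w_n)$, which is a simplifying assumption of the illustration rather than the full argument with $\sigma(X_1, \dots, X_n)$.

```python
from fractions import Fraction

def one_step_expectation(b, w):
    """E[X_{n+1} | b_n = b, w_n = w] for drawing without replacement,
    where X = black / (black + white). Requires b + w >= 2."""
    total = b + w
    expectation = Fraction(0)
    if b > 0:   # a black ball is drawn with probability b / total
        expectation += Fraction(b, total) * Fraction(b - 1, total - 1)
    if w > 0:   # a white ball is drawn with probability w / total
        expectation += Fraction(w, total) * Fraction(b, total - 1)
    return expectation

# All urn contents with at least two balls and at most six of each colour.
states = [(b, w) for b in range(7) for w in range(7) if b + w >= 2]
identity_holds = all(one_step_expectation(b, w) == Fraction(b, b + w)
                     for b, w in states)
```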

Lemma 1. If $(X_t)$ and $(Y_t)$ are supermartingales (relative to $(\mathcal{F}_t)$) and $\alpha$, $\beta$ are positive numbers, then the processes $(\alpha X_t + \beta Y_t)$ and $(X_t \wedge Y_t)$ are also supermartingales. If $(X_t)$ is a martingale, then $(|X_t|)$ is a submartingale.

Proof. Let $s \le t$ and $A \in \mathcal{F}_s$. Since $\alpha \int_A X_s\,dP \ge \alpha \int_A X_t\,dP$ and $\beta \int_A Y_s\,dP \ge \beta \int_A Y_t\,dP$, therefore $\int_A (\alpha X_s + \beta Y_s)\,dP \ge \int_A (\alpha X_t + \beta Y_t)\,dP$.

Obviously,

$$\int_A (X_s \wedge Y_s)\,dP = \int_{A \cap \{X_s < Y_s\}} X_s\,dP + \int_{A \cap \{X_s \ge Y_s\}} Y_s\,dP$$

and

$$A \cap \{X_s < Y_s\} \in \mathcal{F}_s, \qquad A \cap \{X_s \ge Y_s\} \in \mathcal{F}_s.$$

By virtue of the definition of a supermartingale, we obtain

$$\int_{A \cap \{X_s < Y_s\}} (X_s - X_t)\,dP \ge 0, \qquad \int_{A \cap \{X_s \ge Y_s\}} (Y_s - Y_t)\,dP \ge 0$$

and, consequently,

$$\int_A X_s \wedge Y_s\,dP \ge \int_{A \cap \{X_s < Y_s\}} X_t \wedge Y_t\,dP + \int_{A \cap \{X_s \ge Y_s\}} X_t \wedge Y_t\,dP = \int_A X_t \wedge Y_t\,dP.$$

If $(X_t)$ is a martingale, then $(X_t \vee 0)$ and $((-X_t) \vee 0)$ are submartingales; therefore $|X_t| = X_t \vee 0 + (-X_t) \vee 0$ is a submartingale, too.

2.2. Rademacher and Haar systems and martingales

In this section, we introduce two martingales which are important from the point of view of functional analysis. The first martingale is connected with the so-called Rademacher functions. To define these functions, we use the same notations as in Example 1 (Section 1.3).

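Thus, $\Omega = [0, 1)$, $\mathcal{F} = \mathcal{B}[0, 1)$ and $P$ is the Lebesgue measure on $[0, 1)$. Define $\mathcal{F}_n$ as the $\sigma$-field generated by the intervals

$$I_k^n = \left[\frac{k-1}{2^n}, \frac{k}{2^n}\right), \quad k = 1, \dots, 2^n, \quad n = 1, 2, \dots$$

Clearly, $\mathcal{F}_n \subset \mathcal{F}_{n+1}$. The Rademacher functions $r_n$ are defined by the formula $r_n = 1 - 2X_n$ or, explicitly,

$$r_n(\omega) = \begin{cases} 1 & \text{if } \omega \in I_k^n,\ k \text{ odd} \\ -1 & \text{if } \omega \in I_k^n,\ k \text{ even} \end{cases}$$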
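The key property of the Rademacher functions in this setting is that $r_m$ integrates to zero over every dyadic interval of a coarser level $n < m$. The sketch below checks this exactly for a few levels; the function names are ours.

```python
from fractions import Fraction

def rademacher(n, omega):
    """r_n(omega): +1 if omega lies in I_k^n with k odd, -1 if k is even."""
    k = int(omega * 2**n) + 1
    return 1 if k % 2 == 1 else -1

def integral_over_dyadic(n, k, m):
    """Exact integral of r_m over I_k^n, for m > n."""
    # r_m is constant on level-m intervals, and I_k^n is the union of
    # 2^(m-n) of them; evaluate r_m at their left endpoints.
    pieces = 2 ** (m - n)
    first = (k - 1) * pieces
    return sum(Fraction(rademacher(m, Fraction(j, 2**m)), 2**m)
               for j in range(first, first + pieces))

integrals = [integral_over_dyadic(n, k, m)
             for n in range(1, 4)
             for k in range(1, 2**n + 1)
             for m in (n + 1, n + 2)]
```

Every entry of `integrals` is zero, so adding a multiple of $r_{n+1}$ does not change integrals over sets of $\mathcal{F}_n$.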

Proposition 1. For an arbitrary sequence of real numbers $\alpha_1, \alpha_2, \dots$ the sequence $(\alpha_1 r_1 + \dots + \alpha_n r_n)$ is a martingale with respect to $(\mathcal{F}_n)$.

Proof. It is sufficient to note that $\int_{I_k^n} r_{n+1}\,dP = 0$, $k = 1, 2, \dots, 2^n$, or to apply Example 1 (Section 2.1).

Let now $(\Omega, \mathcal{F}, P)$ be an arbitrary probability space and let $(\mathcal{F}_n)_{n = 0, 1, \dots}$ be an increasing sequence of $\sigma$-fields contained in $\mathcal{F}$ such that every $\mathcal{F}_n$ is generated by exactly $n + 1$ disjoint sets $A_0^n, \dots, A_n^n$, $\bigcup_{k=0}^{n} A_k^n = \Omega$, $P(A_k^n) > 0$, $k = 0, 1, \dots, n$. (To construct the $(n+1)$-st partition, we choose a set, say $A_{k_n}^n$, from the n-th partition and divide it into two parts of positive probabilities; the remaining sets $A_k^n$, $k \ne k_n$, are left unchanged.)

A generalized Haar system is a sequence of random variables $(H_n)_{n = 0, 1, \dots}$ adapted to $(\mathcal{F}_n)_{n = 0, 1, \dots}$ such that, for $n = 0, 1, \dots$,

1) $H_0 = 1$;

2) $E|H_{n+1}| > 0$, and $H_{n+1} = 0$ in the complement of the set $A_{k_n}^n$;

3) $E(H_{n+1}) = 0$, $E(H_{n+1}^2) = 1$.

Proposition 2. For an arbitrary sequence of real numbers $\alpha_0, \alpha_1, \alpha_2, \dots$, the sequence $(\alpha_0 H_0 + \dots + \alpha_n H_n)$ is an $(\mathcal{F}_n)$-martingale.

Proof. Let $A \in \mathcal{F}_n$. If $A \cap A_{k_n}^n = \emptyset$, then 2) implies $E(I_A H_{n+1}) = 0$. If $A \cap A_{k_n}^n \ne \emptyset$ then $A = B \cup A_{k_n}^n$, $B \in \mathcal{F}_n$ and $B \cap A_{k_n}^n = \emptyset$ and, therefore, $E(I_A H_{n+1}) = E(I_{A_{k_n}^n} H_{n+1}) = E(H_{n+1}) = 0$ because of 3). This proves the proposition.

2.3. Martingales and densities

The results of this section will be needed in Section 2.10, which is devoted to the Radon-Nikodym theorem.

Let $\mu$ and $P$ be two probability (or finite) measures on $(\Omega, \mathcal{G})$. We recall that the measure $\mu$ is said to be absolutely continuous on $\mathcal{G}$ with respect to $P$ if for every $A \in \mathcal{G}$ such that $P(A) = 0$, $\mu(A) = 0$ holds. Note that the definition depends not only on $P$ and $\mu$, but also on the $\sigma$-field $\mathcal{G}$.

A finite partition of $\Omega$ is a sequence $A_1, A_2, \dots, A_n$ of disjoint subsets of $\Omega$ such that

$$\bigcup_{i=1}^{n} A_i = \Omega.$$

Example 1. Let $\mathcal{G} = \sigma(A_1, \dots, A_n)$, where $(A_1, \dots, A_n)$ is a finite partition of $\Omega$, and let $\mu$ be absolutely continuous with respect to $P$ (on $\mathcal{G}$); then

$$\mu(A) = \int_A g(\omega)\,dP(\omega) \quad \text{for } A \in \mathcal{G} \qquad (1)$$

where

$$g = \sum_{i=1}^{n} \frac{\mu(A_i)}{P(A_i)}\,I_{A_i}$$

(terms with $P(A_i) = 0$ are taken to be zero).

The proof is obvious. Let us remark that the function $g$ is $\mathcal{G}$-measurable, and that (1) implies absolute continuity of $\mu$ with respect to $P$.

More generally, if $\mu$ and $P$ are arbitrary probability measures on $(\Omega, \mathcal{G})$ then a $\mathcal{G}$-measurable function $g$ such that (1) holds is called a density of $\mu$ with respect to $P$ (on $\mathcal{G}$). If $g$ is a density then a $\mathcal{G}$-measurable function $h$ is also a density if and only if $P(\omega : g(\omega) \ne h(\omega)) = 0$.

Let $\mathcal{T}$ be the collection of all finite partitions of $\Omega$. If $t = (A_1, \dots, A_{n_t})$, we define $\mathcal{F}_t = \sigma(A_1, \dots, A_{n_t})$. We write $t \le s$ if and only if $\mathcal{F}_t \subset \mathcal{F}_s$, that is, if the partition $s$ is a refinement of the partition $t$.

Proposition. Let $t_1 \le t_2 \le \dots$ be an increasing sequence of partitions of $\Omega$. Let $(\Omega, \mathcal{F}, P)$ be a probability space and $\mathcal{F}_{t_n} \subset \mathcal{F}$, $n = 1, 2, \dots$. If a measure $\mu$ is absolutely continuous with respect to $P$ on $\mathcal{F}_{t_n}$ and $g_n$ is a corresponding $\mathcal{F}_{t_n}$-measurable density, $n = 1, 2, \dots$, then the sequence $(g_n)$ is an $(\mathcal{F}_{t_n})$-martingale.

Proof. Let $A \in \mathcal{F}_{t_n}$; then $A \in \mathcal{F}_{t_{n+1}}$. Therefore, $\mu(A) = \int_A g_n\, dP$ and $\mu(A) = \int_A g_{n+1}\, dP$. Thus $\int_A g_n\, dP = \int_A g_{n+1}\, dP$. This proves the result.

Example 2. Let us consider the probability space $([0,1), \mathcal{B}[0,1), P)$, where $P$ is the Lebesgue measure, and a Borel, integrable function $g$. If $\mu$ is a measure with density $g$ and $\mathcal{F}_n = \sigma(I_1^n, \dots, I_{2^n}^n)$, $I_k^n = \left[\frac{k-1}{2^n}, \frac{k}{2^n}\right)$, $k = 1, \dots, 2^n$, then the density $g_n$ is a piecewise constant (on $I_k^n$) function given by the formula

$$g_n(\omega) = 2^n \int_{I_k^n} g(s)\, ds \quad \text{for } \omega \in I_k^n,\ k = 1, \dots, 2^n.$$
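The martingale property of the dyadic densities of Example 2 can be checked numerically. The following is only a minimal sketch: the density $g(s) = 2s$, its antiderivative `G` and the helper name `dyadic_density` are assumptions made for illustration and do not come from the text. Exact rational arithmetic is used so that the averaging identity $E(g_{n+1} \mid \mathcal{F}_n) = g_n$ can be verified without rounding.

```python
from fractions import Fraction


def dyadic_density(antiderivative, n):
    """Values of g_n on I_1^n, ..., I_{2^n}^n.

    g_n equals 2^n times the integral of g over I_k^n; the integral is
    obtained from an antiderivative of g (an assumption of this sketch).
    """
    m = 2 ** n
    return [
        m * (antiderivative(Fraction(k, m)) - antiderivative(Fraction(k - 1, m)))
        for k in range(1, m + 1)
    ]


def G(x):
    """Antiderivative of the hypothetical density g(s) = 2s on [0, 1)."""
    return x * x


g2 = dyadic_density(G, 2)
g3 = dyadic_density(G, 3)

# Each interval I_k^2 splits into two intervals of level 3 of equal Lebesgue
# measure, so E(g_3 | F_2) on I_k^2 is the plain average of the two values.
averaged = [(g3[2 * k] + g3[2 * k + 1]) / 2 for k in range(len(g2))]
```

Here `averaged` coincides with `g2`, which is the martingale identity of the Proposition for this particular choice of $g$.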

2.4. Martingales and Markov chains

Let $(\Omega, \mathcal{F}, P)$ be a probability space and $(E_1, \mathcal{E}_1)$, $(E_2, \mathcal{E}_2)$ some measurable spaces. A sequence $X_0, X_1, \dots$ of random variables from $\Omega$ into $E_1$ is said to be a Markov chain if there exist a sequence of independent random variables $\xi_0, \xi_1, \xi_2, \dots$ from $\Omega$ into $E_2$ and measurable mappings $F_n$ from $E_1 \times E_2$ into $E_1$ such that

$$X_{n+1} = F_n(X_n, \xi_n), \quad n = 0, 1, \dots \qquad (1)$$

and $\xi_0, \xi_1, \xi_2, \dots$ are independent of $X_0$. If the mappings $F_n$ and the distributions of $\xi_n$ do not depend on $n$, then the Markov chain $(X_n)$ is homogeneous (in time).

Example 1. If $\xi_0, \xi_1, \dots$ are real-valued r.v.'s, then the sequence $(X_n)$, where $X_{n+1} = X_0 + \xi_0 + \dots + \xi_n = X_n + \xi_n$, is a Markov chain, the so-called random walk.

Problem 1. Let us define the transition function $(\mathbb{P}_n)$ of the Markov chain $(X_n)$ by the formula

$$\mathbb{P}_n(x, A) = P(F_n(x, \xi_n) \in A) = E(I_A(F_n(x, \xi_n)));$$

then (for $n = 1, 2, \dots$):

If $A \in \mathcal{E}_1$, $\mathbb{P}_n(\cdot, A)$ is $\mathcal{E}_1$-measurable. (2)

If $x \in E_1$, $\mathbb{P}_n(x, \cdot)$ is a probability measure on $\mathcal{E}_1$. (3)

Hint: Apply Lemma 4 (Section 1.1) and Lemma 2 (Section 1.3).


Usually, Markov chains are defined starting from a transition function $(\mathbb{P}_n)$ (see, e.g. Ref. [6, Vol. 1]), but models of stochastic control theory are often described by Eqs (1) (see Refs [7, 8]). From the mathematical point of view, these two approaches are equivalent. In this direction, we propose to solve the following problem:

Problem 2. Let $E_1$ be 1) a countable (or finite) set and $\mathcal{E}_1$ the family of all subsets of $E_1$, or 2) $E_1 = \mathbb{R}^1$ and $\mathcal{E}_1 = \mathcal{B}(\mathbb{R}^1)$, and let a function $\mathbb{P}$ satisfy (2) and (3). Then there exists a measurable function $F$ from $E_1 \times [0,1)$ into $E_1$ such that for any random variable $\xi$ uniformly distributed on $[0,1)$

$$P(F(x, \xi) \in A) = \mathbb{P}(x, A), \quad x \in E_1,\ A \in \mathcal{E}_1.$$

Hint: In case 2) define $F(x, t) = F_x^{-1}(t)$, as in Section 1.6, where $F_x(t) = \mathbb{P}(x, (-\infty, t])$.

Let us now consider a homogeneous Markov chain $(X_n)$ with the transition function $\mathbb{P}$. A real-valued $\mathcal{E}_1$-measurable function $h$ is said to be harmonic (sub-, superharmonic) for $(X_n)$ if

1) $\mathbb{P}h = h$ (respectively, $\mathbb{P}h \ge h$, $\mathbb{P}h \le h$);

2) For every initial point $X_0 = x \in E_1$, $E|h(X_n)| < +\infty$, $n = 0, 1, 2, \dots$.

Here the operator $\mathbb{P}$ is defined as

$$\mathbb{P}h(x) = \int h(y)\, \mathbb{P}(x, dy), \quad x \in E_1.$$

The theorem below has many applications to stability theory.

Theorem. If $h$ is a harmonic (sub-, superharmonic) function for $(X_n)$, then the sequence $(h(X_n))$ is a martingale (sub-, supermartingale) with respect to $\mathcal{F}_n = \sigma(X_0, \dots, X_n)$ for any initial point $X_0 = x$.

Proof. Let $A \in \sigma(X_0, \dots, X_n)$; then (see Example 1, Section 1.1) $A = \{\omega : (X_0(\omega), \dots, X_n(\omega)) \in a\}$ for some $a \in \mathcal{E}_1 \times \mathcal{E}_1 \times \dots \times \mathcal{E}_1$. Therefore, $I_A\, h(X_{n+1}) = I_a(X_0, \dots, X_n)\, h(F(X_n, \xi_n))$. But $\xi_n$ is independent of $(X_0, \dots, X_n)$ and, consequently, by virtue of Lemma 2 (Section 1.3), $E(I_A h(X_{n+1})) = E(I_a(X_0, \dots, X_n)\, \mathbb{P}h(X_n)) = E(I_A\, \mathbb{P}h(X_n))$. Since $\mathbb{P}h = h$ ($\mathbb{P}h \ge h$ or $\mathbb{P}h \le h$), we obtain $E(I_A h(X_{n+1})) = E(I_A h(X_n))$ ($E(I_A h(X_{n+1})) \ge E(I_A h(X_n))$ or, respectively, $\le$).

Example 2 (coin tossing). Let $E_1 = \{\dots, -1, 0, 1, \dots\}$, $E_2 = \{-1, 1\}$ and let $\xi_0, \xi_1, \dots$ be independent r.v.'s such that $P(\xi_n = 1) = p$, $P(\xi_n = -1) = 1 - p$, $0 < p < 1$. The function $h$:

$$h(x) = x \ \text{ if } p = \tfrac{1}{2}, \qquad h(x) = \left(\frac{1-p}{p}\right)^{x} \ \text{ if } p \ne \tfrac{1}{2},$$

is harmonic for the random walk $X_{n+1} = X_0 + \xi_0 + \dots + \xi_n$. Thus, the sequence $(X_n)$ in the former case and the sequence $\left(((1-p)/p)^{X_n}\right)$ in the latter case are martingales.
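For the coin-tossing walk the operator $\mathbb{P}$ reduces to $\mathbb{P}h(x) = p\,h(x+1) + (1-p)\,h(x-1)$, so harmonicity can be verified directly. The sketch below does this on a finite range of states with exact rational arithmetic; the function names and the value $p = 2/3$ are illustrative assumptions.

```python
from fractions import Fraction


def apply_P(h, x, p):
    """(P h)(x) = p*h(x+1) + (1-p)*h(x-1) for the walk with steps +1 and -1."""
    return p * h(x + 1) + (1 - p) * h(x - 1)


def h_fair(x):
    """Candidate harmonic function for p = 1/2."""
    return Fraction(x)


def make_h_biased(p):
    """Candidate harmonic function x -> ((1-p)/p)^x for p != 1/2."""
    ratio = (1 - p) / p
    return lambda x: ratio ** x


p = Fraction(2, 3)  # hypothetical biased coin
h_biased = make_h_biased(p)

states = range(-5, 6)
fair_ok = all(apply_P(h_fair, x, Fraction(1, 2)) == h_fair(x) for x in states)
biased_ok = all(apply_P(h_biased, x, p) == h_biased(x) for x in states)

# h(x) = x is not harmonic for the biased walk: P h(0) = p - (1 - p).
drift_at_zero = apply_P(h_fair, 0, p)
```

The last line shows why the two cases of Example 2 need different functions: for $p \ne 1/2$ the identity function picks up the drift $2p - 1$.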


For the sake of completeness we propose the following problem (see Chapter 3):

Problem 3 (Markov property). Show that for any bounded, $\mathcal{E}_1$-measurable real function $h$:

$$E(h(X_{n+1}) \mid X_0, \dots, X_n) = E(h(X_{n+1}) \mid X_n) = \mathbb{P}_n h(X_n), \quad \text{a.s.}$$

Hint: $E(h(X_{n+1}) \mid X_0, \dots, X_n) = E(h(F_n(X_n, \xi_n)) \mid X_0, \dots, X_n)$. But $\xi_n$ is independent of $\mathcal{F}_n = \sigma(X_0, \dots, X_n)$; therefore, using Lemma 1, Section 3.2, $E(h(X_{n+1}) \mid \mathcal{F}_n) = \mathbb{P}_n h(X_n)$ a.s.

2.5. Stopping times

Let $(\Omega, \mathcal{F})$ be a measurable space and $(\mathcal{F}_t)_{t \in T}$ an increasing family of $\sigma$-fields, $\mathcal{F}_t \subset \mathcal{F}$, $t \in T$. A function $S : \Omega \to T$ is said to be a stopping time (relative to the family $(\mathcal{F}_t)$) if for every $t \in T$ the set $\{\omega : S(\omega) \le t\}$ belongs to $\mathcal{F}_t$, i.e. the condition $S \le t$ is a condition involving only what has happened up to and including time $t$. Let $S$ be a stopping time. By $\mathcal{F}_S$ we denote the collection of events $A \in \mathcal{F}$ such that $A \cap \{S \le t\} \in \mathcal{F}_t$ for all $t \in T$. It is easy to verify that $\mathcal{F}_S$ is a $\sigma$-field, the so-called $\sigma$-field of events prior to $S$.

Proposition 1. a) If $T$ is a finite or a countable subset of $\mathbb{R}$, then $S : \Omega \to T$ is a stopping time if and only if $\{\omega : S(\omega) = t\} \in \mathcal{F}_t$ for $t \in T$. b) If $S_1$ and $S_2$ are stopping times, then $S_1 \wedge S_2$ and $S_1 \vee S_2$ are again stopping times. c) Any stopping time $S$ is $\mathcal{F}_S$-measurable. d) If $S_1 \le S_2$, then $\mathcal{F}_{S_1} \subset \mathcal{F}_{S_2}$.

Proof. The properties a), b), c) follow directly from the definitions. To prove d) assume that $A \in \mathcal{F}_{S_1}$; then $\{S_2 \le t\} \cap A = \{S_2 \le t\} \cap \{S_1 \le t\} \cap A$ because $\{S_2 \le t\} \cap \{S_1 \le t\} = \{S_2 \le t\}$. Since $\{S_1 \le t\} \cap A \in \mathcal{F}_t$ and $\{S_2 \le t\} \in \mathcal{F}_t$, therefore $\{S_2 \le t\} \cap \{S_1 \le t\} \cap A \in \mathcal{F}_t$.

Example 1. Let $(X_n)$ be a sequence of real-valued r.v.'s adapted to $(\mathcal{F}_n)$ (that means $X_n$ are $\mathcal{F}_n$-measurable) and let $a$ be a real number. Then

$$S = \begin{cases} \text{the least } n \text{ such that } X_n \ge a, \\ +\infty \ \text{ if } X_n < a \text{ for all } n = 1, 2, \dots \end{cases}$$

is a stopping time. To see this, fix a natural number $k$; then $\{S = k\} = \{X_1 < a, \dots, X_{k-1} < a, X_k \ge a\} \in \sigma(X_1, \dots, X_k) \subset \mathcal{F}_k$.

The following special case of a theorem proved by Courrège and Priouret [9] is helpful in considering concrete examples.

Proposition 2. Let $X_1, X_2, \dots$ be random variables with values in a countable set $E \subset \mathbb{R}^1$. Define $\mathcal{F}_n = \sigma(X_1, \dots, X_n)$, $\mathcal{F}_\infty = \sigma(X_1, X_2, \dots)$. A mapping $S : \Omega \to \{1, 2, \dots, +\infty\}$ is a stopping time relative to $(\mathcal{F}_n)$ if and only if

1) $S$ is $\mathcal{F}_\infty$-measurable;

2) If $S(\omega) = n$ and $X_k(\omega) = X_k(\omega')$ for $k = 1, 2, \dots, n$, then $S(\omega') = n$.


Proof. We prove only that 1) and 2) imply that $S$ is a stopping time. (The opposite case is obvious.) Let us define $A = \{\omega : S(\omega) = n\}$ and $a = \{(x_1, \dots, x_n) \in E^n : x_i = X_i(\omega) \text{ for } i = 1, \dots, n \text{ and some } \omega \in A\}$. The set $a$ is at most a countable set, therefore it is a Borel one. The property 2) implies $A = \{\omega : (X_1(\omega), \dots, X_n(\omega)) \in a\}$ and, therefore, $A \in \sigma(X_1, \dots, X_n) = \mathcal{F}_n$.

2.6. More on stopping times. Marriage problem

In this section, we try to consider stopping times in a more intuitive way.

Let $\Omega$ be the set of all sequences $\omega = (e_1, e_2, \dots, e_N)$ where $e_n = -1$ or $1$, $n = 1, \dots, N$. Any sequence $\omega \in \Omega$ can be interpreted as the record of $N$ successive tosses of a coin: $-1$ stands for heads and $1$ for tails. Let us introduce, for any $n = 1, \dots, N$, a random variable $Y_n$ and a $\sigma$-field $\mathcal{F}_n$ by the formulas $Y_n(e_1, \dots, e_N) = e_n$, $\mathcal{F}_n = \sigma(Y_1, \dots, Y_n)$. Thus $Y_n$ represents the outcome of coin tossing at the moment $n$ and $\mathcal{F}_n$ the class of all past events up to and including time $n$ ($\mathcal{F}_n$ is generated by the sets $\{Y_1 = e_1, \dots, Y_n = e_n\}$). Any stopping time $S$ relative to $(\mathcal{F}_n)$ can be interpreted as a (non-anticipating) rule for stopping gambling: stop at the moment $n$ if the event $S = n$ has just occurred. Since the information available to the gambler at the moment $n$ is contained in the sequence of outcomes $Y_1, Y_2, \dots, Y_n$, we require $\{S = n\} \in \mathcal{F}_n$. Conversely, any rule for stopping gambling based on the past history only (non-anticipating rule) defines a stopping time. For instance, consider the following rule: stop at moment 1 if $Y_1 = -1$, continue if $Y_1 = 1$; stop at moment 2 if $Y_1 = 1$ and $Y_2 = -1$, continue if $Y_2 = 1$; stop at moment 3 if $Y_1 = 1$ and $Y_2 = 1$ and $Y_3 = -1$ or $Y_3 = 1$. Then the corresponding stopping time has to be defined as follows: $S(\omega) = 1$ for all $\omega$ such that $Y_1(\omega) = -1$; $S(\omega) = 2$ for all $\omega$ such that $Y_1(\omega) = 1$ and $Y_2(\omega) = -1$; $S(\omega) = 3$ otherwise.

To define a rule that anticipates the future (or, equivalently, a "stopping time" $S$ which does not satisfy $\{S = n\} \in \mathcal{F}_n$) we introduce a sequence $X_1, X_2, \dots, X_N$ which describes the fortune of the gambler at moments $1, 2, \dots, N$, i.e. let $X_n = 1 + Y_1 + \dots + Y_n$. The following rule: stop at the first moment $n$ such that $X_n = \max(X_1, \dots, X_N)$, anticipates the future. If, for instance, $N = 3$, then $\{S = 1\} = \{X_1 \ge X_2 \text{ and } X_1 \ge X_3\} = \{Y_2 = -1\}$. To stop the game at moment 1, the player should know the outcome at moment 2.
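Both rules can be tested mechanically for $N = 3$ by enumerating the eight records of the coin. The sketch below encodes the requirement $\{S = n\} \in \mathcal{F}_n$ as: two records that agree in their first $n$ tosses must agree on whether $S = n$. The function names are illustrative assumptions.

```python
from itertools import product

N = 3
OMEGA = list(product((-1, 1), repeat=N))  # all records (e_1, ..., e_N)


def rule_from_text(w):
    """Stop at the first toss equal to -1, and at moment 3 at the latest."""
    if w[0] == -1:
        return 1
    if w[1] == -1:
        return 2
    return 3


def first_maximum(w):
    """First n with X_n = max(X_1, ..., X_N), where X_n = 1 + Y_1 + ... + Y_n."""
    fortunes = []
    total = 1
    for e in w:
        total += e
        fortunes.append(total)
    return fortunes.index(max(fortunes)) + 1


def is_stopping_time(S):
    """Check {S = n} in F_n: the event may depend on the first n tosses only."""
    for n in range(1, N + 1):
        for w in OMEGA:
            for v in OMEGA:
                if w[:n] == v[:n] and (S(w) == n) != (S(v) == n):
                    return False
    return True


non_anticipating = is_stopping_time(rule_from_text)
anticipating = not is_stopping_time(first_maximum)
stop_at_one = {w for w in OMEGA if first_maximum(w) == 1}
```

The set `stop_at_one` consists exactly of the records with second toss $-1$, in agreement with the identity $\{S = 1\} = \{Y_2 = -1\}$.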

Problem 1. (The space $\Omega$ and the sequence $(\mathcal{F}_n)$ as above.) Let $r_N$ and $d_N$ denote, respectively, the number of all stopping times $S \le N$ and the number of all random variables $T$ with values in $\{1, 2, \dots, N\}$ which are $\mathcal{F}_N$-measurable. Show that $r_1 = 1$, $r_{N+1} = (1 + r_N)^2$, $d_N = N^{2^N}$, $N = 1, \dots$, and that $r_N / d_N \le \frac{1}{2}\left(\frac{2}{N}\right)^{2^N}$. Hint: Use Proposition 2, Section 2.5 and the estimate $r_N \le 2^{2^N - 1}$.
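For very small $N$ the numbers $r_N$ can be found by brute force, which gives a sanity check of the recurrence (it is no substitute for the proof asked for in Problem 1). The sketch below enumerates all maps from the $2^N$ records into $\{1, \dots, N\}$ and keeps those satisfying $\{S = n\} \in \mathcal{F}_n$; the function name is an assumption, and the enumeration is only feasible for $N \le 3$.

```python
from itertools import product


def count_stopping_times(N):
    """Number of stopping times S <= N on the space of N coin tosses."""
    omega = list(product((-1, 1), repeat=N))
    count = 0
    # Every assignment of a value in {1, ..., N} to each record is a candidate.
    for values in product(range(1, N + 1), repeat=len(omega)):
        S = dict(zip(omega, values))
        if all(
            (S[w] == n) == (S[v] == n)
            for n in range(1, N + 1)
            for w in omega
            for v in omega
            if w[:n] == v[:n]
        ):
            count += 1
    return count


r = [count_stopping_times(N) for N in (1, 2, 3)]
d = [N ** (2 ** N) for N in (1, 2, 3)]
```

The list `d` contains the total number of candidates $d_N = N^{2^N}$, so the ratios `r[i] / d[i]` illustrate how rare stopping times are among all $\mathcal{F}_N$-measurable random times.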

We propose also to consider the following "Marriage Problem". It can be formulated as follows: In an urn, there are $N$ balls. Inside, each ball contains a piece of gold which we do not see, and for different balls the amounts of gold are different. We may draw (at random) one and keep it and, of course, we want to draw the one with the most gold inside. But we are not allowed to check all balls for the amount of gold they contain. Instead, we are to follow the following rule. We are allowed to draw and check one ball at a time and decide either to keep it or to have another drawing. However, all balls which have been drawn, checked and rejected are not available any longer. If we keep the $k$-th ball drawn, the probability of maximal gain is $1/N$ for $k = 1, 2, \dots, N$. Here, we do not use the information available from previous choices. However, it may be shown that, using the information available, one can find a strategy which increases the probability of the best choice to $p_N$, where $p_N \to 1/e$ as $N \to \infty$. This is the problem one faces in life if one wishes to get married and believes that one will be given $N$ possibilities of choice.

The above problem can be formalized as an optimal stopping-time problem. All details are given in the book [10].
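The convergence $p_N \to 1/e$ can be illustrated numerically. The text does not spell out the strategy; the sketch below assumes the classical cutoff rule (reject the first $r$ balls, then keep the first ball that is better than all balls seen so far), whose success probability is $\frac{r}{N}\sum_{k=r+1}^{N}\frac{1}{k-1}$ for $r \ge 1$. The function names are illustrative.

```python
from fractions import Fraction


def success_probability(N, r):
    """Probability that the cutoff rule with r rejected balls picks the best of N.

    The best ball sits at position k with probability 1/N; for k > r it is
    chosen exactly when the best of the first k-1 balls is among the first r.
    """
    if r == 0:
        return Fraction(1, N)  # keep the very first ball
    return Fraction(r, N) * sum(Fraction(1, k - 1) for k in range(r + 1, N + 1))


def best_cutoff(N):
    """Cutoff r in {0, ..., N-1} maximizing the success probability."""
    return max(range(N), key=lambda r: success_probability(N, r))


p3 = success_probability(3, best_cutoff(3))
p100 = float(success_probability(100, best_cutoff(100)))
```

For $N = 3$ the rule already raises the probability of the best choice from $1/3$ to $1/2$, and for $N = 100$ the value `p100` lies close to $1/e \approx 0.3679$.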

2.7. Doob's optional sampling theorem

Theorem. Let $(X_n)_{n=1,\dots,k}$ be a supermartingale (a martingale) relative to $(\mathcal{F}_n)_{n=1,\dots,k}$. Let $S_1, S_2, \dots, S_m$ be an increasing sequence of $(\mathcal{F}_n)$-stopping times. The sequence $(X_{S_i})_{i=1,\dots,m}$ is then also a supermartingale (a martingale) with respect to the $\sigma$-fields $\mathcal{F}_{S_1} \subset \dots \subset \mathcal{F}_{S_m}$.

Proof. Let $A \in \mathcal{F}_{S_1}$. We shall prove that $\int_A (X_{S_1} - X_{S_2})\, dP \ge 0$ in the supermartingale case and $\int_A (X_{S_1} - X_{S_2})\, dP = 0$ in the martingale case. If for every $\omega$, $S_2(\omega) - S_1(\omega) \le 1$, then

$$\int_A (X_{S_1} - X_{S_2})\, dP = \sum_{r=1}^{k} \int_{A \cap \{S_1 = r\} \cap \{S_2 > r\}} (X_r - X_{r+1})\, dP,$$

but $A \cap \{S_1 = r\}$ and $\{S_2 > r\}$ belong to $\mathcal{F}_r$, therefore $A \cap \{S_1 = r\} \cap \{S_2 > r\} \in \mathcal{F}_r$. The definition of a supermartingale (a martingale) implies that the desired inequality (equality) follows.

In the general case, let $r = 0, 1, \dots, k$ and define stopping times $R_r = S_2 \wedge (S_1 + r)$. Then $S_1 = R_0 \le R_1 \le \dots \le R_k = S_2$ and $R_{i+1} - R_i \le 1$, $i = 0, \dots, k-1$. By virtue of the first part of the proof,

$$\int_A X_{S_1}\, dP \ge \int_A X_{R_1}\, dP \ge \dots \ge \int_A X_{R_k}\, dP = \int_A X_{S_2}\, dP.$$

In the martingale case, the above inequalities should be replaced by equalities and, thus, the theorem is proved.

Let us consider some simple applications of the above Doob's optional sampling theorem (more serious applications will be given in Sections 2.8 and 2.9).

Example 1 (Ruin problem). Let $(X_n)$ be the random walk (defined in Section 2.4) which starts at the origin ($X_0 = 0$). Thus $X_{n+1} = Y_0 + \dots + Y_n$, where $Y_0, Y_1, \dots$ are independent random variables, $P(Y_n = -1) = 1 - p$, $P(Y_n = 1) = p$, $n = 0, 1, 2, \dots$. Let $a$ and $b$ be natural numbers and let $S = \inf\{n : X_n = -a \text{ or } X_n = b\}$. In gambling language: a gambler with initial capital $a$ plays against an adversary with initial capital $b$; in each game, he can win or lose a dollar with probability $p$ and $q = 1 - p$, respectively, and the games are independent; $X_n$ is the gambler's total winnings after the $n$-th game and the game ends exactly at $S$, when he or his adversary has lost all his money. The ruin problem consists in finding the probability that the gambler will be ruined, i.e. the probability that $X_S = -a$.

Problem 1. Show that $P(S < +\infty) = 1$. Hint: See Remark 3, Section 2.9.

Assume that $p = 1/2$ (fair game) and define stopping times $S_n = S \wedge n$, $n = 1, 2, \dots$. Since $(X_n)$ is a martingale, by Doob's theorem (applied to the stopping times $0$ and $S_n$) $E(X_{S_n}) = E(X_0) = 0$. But $S < +\infty$ (a.s.) and, therefore, the sequence $(X_{S_n})$ is convergent:

$$X_{S_n} \to -a \ \text{ on } \{X_S = -a\}, \qquad X_{S_n} \to b \ \text{ on } \{X_S = b\}.$$

Lebesgue's theorem (Section 1.2) implies

$$0 = E(X_{S_n}) \to E(\lim_n X_{S_n}) = -a\, P(X_S = -a) + b\,(1 - P(X_S = -a)).$$

Consequently,

$$P(X_S = -a) = \frac{b}{a + b}, \qquad P(X_S = b) = \frac{a}{a + b}.$$

Problem 2. Consider the case $p \ne 1/2$. Hint: The sequence $\left((q/p)^{X_n}\right)$ is a martingale (Section 2.4). Answer:

$$P(X_S = -a) = \frac{1 - (q/p)^b}{(p/q)^a - (q/p)^b}.$$
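The two answers above can be cross-checked without martingales. The ruin probability $u(x)$, as a function of the current winnings $x$, satisfies $u(x) = p\,u(x+1) + q\,u(x-1)$ with $u(-a) = 1$, $u(b) = 0$; this is a tridiagonal linear system. The sketch below solves it exactly by forward elimination and back substitution and compares $u(0)$ with the closed formulas. The function names and the numerical values of $a$, $b$, $p$ are assumptions chosen for illustration.

```python
from fractions import Fraction


def ruin_probability(a, b, p):
    """u(0), where u(x) = p*u(x+1) + q*u(x-1), u(-a) = 1, u(b) = 0."""
    p = Fraction(p)
    q = 1 - p
    n = a + b - 1  # interior states x = -a+1, ..., b-1
    c = [Fraction(0)] * n  # modified upper-diagonal coefficients
    d = [Fraction(0)] * n  # modified right-hand sides
    # First row: u(-a+1) - p*u(-a+2) = q*u(-a) = q.
    c[0], d[0] = -p, q
    for i in range(1, n):
        denom = 1 + q * c[i - 1]
        c[i] = -p / denom
        d[i] = q * d[i - 1] / denom
    u = [Fraction(0)] * n
    u[-1] = d[-1]  # last row uses the boundary value u(b) = 0
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return u[a - 1]  # index of the state x = 0


def closed_form(a, b, p):
    """The answers of Example 1 (p = 1/2) and Problem 2 (p != 1/2)."""
    p = Fraction(p)
    q = 1 - p
    if p == q:
        return Fraction(b, a + b)
    r = q / p
    return (1 - r ** b) / (r ** (-a) - r ** b)


fair = ruin_probability(2, 3, Fraction(1, 2))
biased = ruin_probability(2, 3, Fraction(2, 3))
```

Both values agree with `closed_form`, which is what the optional sampling argument predicts.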

Example 2 (Strategy for a schoolboy). Let us consider the martingale introduced in Example 1, Section 2.1. Again by Doob's theorem, we obtain that, for any stopping time $S$ (relative to $(\mathcal{F}_n)$),

$$E(X_S) = E(X_0) = \frac{b}{b + w}.$$

The above identity can be interpreted as follows: During an examination, every schoolboy draws at random (without replacement) one question he has to answer. At the beginning of the examination, there are $b + w$ questions, and some pupil is able to answer in a satisfactory way exactly $w$ of them. At any moment, he knows which questions have been drawn and he wants to find an optimal moment for drawing a question so as to minimize the probability of choosing a "bad" question. But for any rule $S$, $E(X_S) = b/(b + w)$; therefore, all possible "strategies" are of the same value for him.

2.8. Fundamental inequalities

Theorem 1. Let $(X_n)_{n=1,\dots,k}$ be a supermartingale and $c$ a non-negative constant. Then we have

1) $c\,P(\sup_n X_n \ge c) \le E(X_1) - \int_{\{\sup_n X_n < c\}} X_k\, dP$

2) $\qquad \le E(X_1) + E(X_k^-)$

3) $c\,P(\inf_n X_n \le -c) \le -\int_{\{\inf_n X_n \le -c\}} X_k\, dP$

4) $\qquad \le E(X_k^-)$

Proof. To prove the inequalities 1), 2), define $S(\omega) = \inf\{n : X_n(\omega) \ge c\}$, or $S(\omega) = k$ if $\sup_n X_n(\omega) < c$. $S$ is a stopping time and $S \ge 1$. Thus, by Doob's optional sampling theorem, we obtain

$$E(X_1) \ge E(X_S) = \int_{\{\sup_n X_n \ge c\}} X_S\, dP + \int_{\{\sup_n X_n < c\}} X_S\, dP \ge c\,P(\sup_n X_n \ge c) + \int_{\{\sup_n X_n < c\}} X_k\, dP$$

or equivalently

$$c\,P(\sup_n X_n \ge c) \le E(X_1) - \int_{\{\sup_n X_n < c\}} X_k\, dP.$$

This is exactly inequality 1). Since $-X_k \le X_k^-$, inequality 2) follows, too. To establish the relations 3) and 4), we introduce an analogous stopping time $S$, $S = \inf\{n : X_n \le -c\}$ or $S = k$ if $\inf_n X_n > -c$. Since $S \le k$, therefore

$$E(X_k) \le E(X_S) = \int_{\{\inf_n X_n \le -c\}} X_S\, dP + \int_{\{\inf_n X_n > -c\}} X_S\, dP \le -c\,P(\inf_n X_n \le -c) + \int_{\{\inf_n X_n > -c\}} X_k\, dP.$$

These inequalities imply 3) and 4).

Problem 1 (Doob-Kolmogorov's inequality). Let $(X_n)_{n=1,\dots,k}$ be a martingale; then for every $c > 0$

$$P(\sup_n |X_n| \ge c) \le \frac{E|X_k|}{c}.$$

Hint: If $(X_n)$ is a martingale, then $(|X_n|)$ is a submartingale (Lemma 1, Section 2.1). Use 4), Theorem 1.

To formulate and prove the next theorem, we have to introduce some new definitions and notations.

Let $x = (x_1, \dots, x_n)$ be a sequence of real numbers and let $a < b$. Let $S_1$ be the first of the numbers $1, 2, \dots, n$ such that $x_{S_1} \ge b$, or $n$ if there exists no such number. Let $S_k$ be, for every even (respectively odd) integer $k > 1$, the first of the numbers $1, 2, \dots, n$ such that $S_k > S_{k-1}$ and $x_{S_k} \le a$ (respectively $x_{S_k} \ge b$). If no such number exists, we set $S_k = n$. In this way, the sequence $S_1, S_2, S_3, \dots$ is defined. The number $\mathcal{D}_n(x; (a,b))$ of downcrossings by the sequence $x$ of the interval $(a,b)$ is defined as the greatest integer $k$ such that one actually has $x_{S_{2k-1}} \ge b$ and $x_{S_{2k}} \le a$. If no such integer exists, we put $\mathcal{D}_n(x; (a,b)) = 0$. Let us remark that the "intervals" $[S_1, S_2], \dots, [S_{2k-1}, S_{2k}]$ represent the periods of time when the sequence $x$ is descending from $b$ to $a$. Analogously, the number $\mathcal{U}_n(x; (a,b))$ of upcrossings by the sequence $x$ of the interval $(a,b)$ can be defined. In fact, $\mathcal{U}_n(x; (a,b)) = \mathcal{D}_n(-x; (-b,-a))$. The numbers $S_1, S_2, \dots$ which correspond to the sequence $-x$ and the interval $(-b,-a)$ will be denoted as $R_1, R_2, \dots$.
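The definition of downcrossings translates into a one-pass procedure: wait for a value $\ge b$ (this locates $S_1, S_3, \dots$), then for a value $\le a$ (this locates $S_2, S_4, \dots$), and count the completed pairs. The sketch below follows this reading of the definition and obtains upcrossings from the identity $\mathcal{U}_n(x;(a,b)) = \mathcal{D}_n(-x;(-b,-a))$; the function names are assumptions.

```python
def downcrossings(x, a, b):
    """Number of downcrossings of the interval (a, b) by the finite sequence x."""
    if not a < b:
        raise ValueError("the interval requires a < b")
    count = 0
    waiting_for_low = False  # True once a value >= b has been seen
    for value in x:
        if not waiting_for_low and value >= b:
            waiting_for_low = True  # an odd-indexed moment S_{2k-1}
        elif waiting_for_low and value <= a:
            count += 1  # an even-indexed moment S_{2k} completes a descent
            waiting_for_low = False
    return count


def upcrossings(x, a, b):
    """Upcrossings of (a, b) by x, i.e. downcrossings of (-b, -a) by -x."""
    return downcrossings([-value for value in x], -b, -a)


path = [3, 0, 3, 0]
down = downcrossings(path, 1, 2)
up = upcrossings(path, 1, 2)
```

For the sample path the sequence descends from above 2 to below 1 twice and ascends once, so `down` and `up` differ by one, as is always the case up to at most one unit.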

Theorem 2 (Doob's inequalities). Let $(X_m)_{m=1,\dots,n}$ be a supermartingale relative to $(\mathcal{F}_m)_{m=1,\dots,n}$ and let $a < b$. Then

1) $E(\mathcal{D}_n(a,b)) \le \frac{1}{b-a}\left(E(X_1 \wedge b) - E(X_n \wedge b)\right)$

2) $E(\mathcal{U}_n(a,b)) \le \frac{1}{b-a}\, E((a - X_n)^+)$

Here $\mathcal{D}_n(a,b) = \mathcal{D}_n((X_1, \dots, X_n); (a,b))$, $\mathcal{U}_n(a,b) = \mathcal{U}_n((X_1, \dots, X_n); (a,b))$ ($\mathcal{D}_n(a,b)$ and $\mathcal{U}_n(a,b)$ are clearly random variables).

Proof. 1) Since $(X_m \wedge b)_{m=1,\dots,n}$ is a supermartingale and the numbers of downcrossings corresponding to $(X_m)_{m=1,\dots,n}$ and $(X_m \wedge b)_{m=1,\dots,n}$ are the same, we can assume that $X_m = X_m \wedge b$, $m = 1, \dots, n$. Let $S_1, \dots, S_{2\ell}$, $2\ell > n$, be the moments defined above and corresponding to the sequence $(X_1, \dots, X_n)$. Then

$$\Sigma := (X_{S_1} - X_{S_2}) + \dots + (X_{S_{2\ell-1}} - X_{S_{2\ell}}) \ge \mathcal{D}_n(a,b)\,(b - a).$$

Indeed, if $\mathcal{D}_n(a,b) = k$, then $\Sigma = (X_{S_1} - X_{S_2}) + \dots + (X_{S_{2k-1}} - X_{S_{2k}}) + (X_{S_{2k+1}} - X_{S_{2k+2}})$. But $S_{2k+1} = n$, or $S_{2k+1} < n$ and $S_{2k+2} = n$; therefore $X_{S_{2k+1}} - X_{S_{2k+2}} = 0$ or $X_{S_{2k+1}} - X_{S_{2k+2}} = b - X_n \ge 0$. Thus $X_{S_{2k+1}} - X_{S_{2k+2}} \ge 0$ and $\Sigma \ge k(b - a)$. Consequently,

$$X_{S_1} - X_{S_{2\ell}} = \Sigma + (X_{S_2} - X_{S_3}) + (X_{S_4} - X_{S_5}) + \dots + (X_{S_{2\ell-2}} - X_{S_{2\ell-1}}) \ge (b - a)\,\mathcal{D}_n(a,b) + (X_{S_2} - X_{S_3}) + \dots + (X_{S_{2\ell-2}} - X_{S_{2\ell-1}}).$$

By Doob's optional sampling theorem, the sequence $X_{S_1}, \dots, X_{S_{2\ell}}$ is a supermartingale and, in particular, $E(X_{S_i}) \ge E(X_{S_j})$ for $i \le j$. Therefore, $E(X_{S_1}) - E(X_{S_{2\ell}}) \ge (b - a)\,E(\mathcal{D}_n(a,b))$. But $E(X_{S_{2\ell}}) = E(X_n)$ and $E(X_{S_1}) \le E(X_1)$, and we conclude that

$$E(\mathcal{D}_n(a,b)) \le \frac{1}{b-a}\left(E(X_1) - E(X_n)\right) = \frac{1}{b-a}\left(E(X_1 \wedge b) - E(X_n \wedge b)\right).$$

Proof 2). Let $R_1, \dots, R_{2\ell}$ ($2\ell > n$) be the moments used in the definition of upcrossings and let $\Sigma_1 := (X_{R_2} - X_{R_1}) + \dots + (X_{R_{2\ell}} - X_{R_{2\ell-1}})$. Then

$$\Sigma_1 \ge (b - a)\,\mathcal{U}_n(a,b) + (X_n - a) \wedge 0.$$

Indeed, assume $\mathcal{U}_n(a,b) = k$; then $\Sigma_1 = (X_{R_2} - X_{R_1}) + \dots + (X_{R_{2k}} - X_{R_{2k-1}}) + (X_{R_{2k+2}} - X_{R_{2k+1}})$. If $R_{2k+1} = n$, then $X_{R_{2k+2}} - X_{R_{2k+1}} = 0$, and if $R_{2k+1} < n$ and $R_{2k+2} = n$, then $X_{R_{2k+2}} - X_{R_{2k+1}} = X_n - X_{R_{2k+1}} \ge X_n - a$. Thus, in both cases, $X_{R_{2k+2}} - X_{R_{2k+1}} \ge (X_n - a) \wedge 0$.

Again Doob's theorem implies that

$$0 \ge E(\Sigma_1) \ge (b - a)\,E(\mathcal{U}_n(a,b)) + E((X_n - a) \wedge 0).$$

Finally,

$$(b - a)\,E(\mathcal{U}_n(a,b)) \le -E((X_n - a) \wedge 0) = E((a - X_n)^+).$$

2.9. Martingale convergence theorem

Many important applications of martingales are connected with the following convergence theorem:

Theorem. Let $(X_n)_{n=1,2,\dots}$ be a supermartingale such that $\sup_n E(X_n^-) < +\infty$. Then the sequence $(X_n)$ converges a.s. to an integrable random variable $X$.

Proof. For every $a < b$ define $\mathcal{U}(a,b) = \sup_n \mathcal{U}_n(a,b) = \lim_n \mathcal{U}_n(a,b)$. Doob's inequality implies that

$$E(\mathcal{U}(a,b)) = \lim_n E(\mathcal{U}_n(a,b)) \le \frac{1}{b-a}\, \sup_n E((a - X_n)^+).$$

But $(a - X_n)^+ \le |a| + X_n^-$ and therefore

$$E(\mathcal{U}(a,b)) \le \frac{|a|}{b-a} + \frac{1}{b-a}\, \sup_n E(X_n^-) < +\infty.$$

On the other hand,

$$\{\liminf_n X_n < \limsup_n X_n\} = \bigcup_{a,b \text{ rationals}} \{\liminf_n X_n < a < b < \limsup_n X_n\} \subset \bigcup_{a,b \text{ rationals}} \{\mathcal{U}(a,b) = +\infty\}.$$

Since $E(\mathcal{U}(a,b)) < +\infty$, therefore $P(\mathcal{U}(a,b) = +\infty) = 0$. Thus

$$P(\liminf_n X_n < \limsup_n X_n) \le \sum_{a,b \text{ rationals}} P(\mathcal{U}(a,b) = +\infty) = 0.$$

This way we have obtained that almost surely $\liminf_n X_n = \limsup_n X_n = \lim_n X_n = X$. Let us remark that $|X_n| = X_n + 2X_n^-$. But for all $n$, $E(X_n) \le E(X_1)$ and, by virtue of Fatou's lemma,

$$E|X| \le \sup_n E(X_n + 2X_n^-) \le E(X_1) + 2\sup_n E(X_n^-) < +\infty.$$

This finishes the proof.

Remark 1. Since $X_n^- \le |X_n| \le X_n + 2X_n^-$, therefore $E(X_n^-) \le E|X_n| \le E(X_1) + 2E(X_n^-)$, and the conditions $\sup_n E(X_n^-) < +\infty$, $\sup_n E|X_n| < +\infty$ are equivalent (provided $E(X_1) < +\infty$).

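As an example of application of the martingale convergence theorem to functional analysis, we prove the following proposition:

Proposition. If $\sum_{k=0}^{+\infty} a_k^2 < +\infty$, then the Haar series $\sum_{k=0}^{+\infty} a_k H_k$ and the Rademacher series $\sum_{k=1}^{+\infty} a_k r_k$ converge almost surely.

Proof. It is easy to check that the Haar as well as the Rademacher systems are orthonormal systems. Therefore, using Schwarz' inequality,

$$E\left|\sum_{k=0}^{n} a_k H_k\right| \le \left(\sum_{k=0}^{n} a_k^2\right)^{1/2}, \qquad E\left|\sum_{k=1}^{n} a_k r_k\right| \le \left(\sum_{k=1}^{n} a_k^2\right)^{1/2}.$$

Since $\sum_{k=0}^{+\infty} a_k^2 < +\infty$, we obtain:

$$\sup_n E\left|\sum_{k=0}^{n} a_k H_k\right| < +\infty \quad \text{and} \quad \sup_n E\left|\sum_{k=1}^{n} a_k r_k\right| < +\infty.$$

But the sequences $\left(\sum_{k=0}^{n} a_k H_k\right)$ and $\left(\sum_{k=1}^{n} a_k r_k\right)$ are martingales (see Section 2.2) and the application of the martingale convergence theorem completes the proof.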

Remark 2. Exactly in the same way as the above proposition, it is possible to prove that if $\xi_1, \xi_2, \dots$ are independent r.v.'s such that $E(\xi_k) = 0$, $E(\xi_k^2) = 1$, $k = 1, 2, \dots$, and $\sum_{k=1}^{+\infty} a_k^2 < +\infty$, then the series $\sum_{k=1}^{+\infty} a_k \xi_k$ converges almost surely.

Remark 3. Let us consider Problem 1, Section 2.7. Assume, e.g., $p = 1/2$. If $S = +\infty$, then the sequence $(X_{S_n})$ does not converge. But $(X_{S_n})$ is a bounded martingale; therefore the convergence theorem implies $P(S = +\infty) = 0$.

2.10. Radon-Nikodym theorem

As an application of martingale theory we prove the Radon-Nikodym theorem:

Theorem. Let $\mu$ and $\nu$ be finite measures defined on a measurable space $(\Omega, \mathcal{F})$. There exist a function $g \ge 0$, $\mathcal{F}$-measurable and $\nu$-integrable, and a set $B \in \mathcal{F}$, $\nu(B) = 0$, such that for every set $A \in \mathcal{F}$

$$\mu(A) = \int_A g\, d\nu + \mu(A \cap B).$$

The proof will be a rather simple application of the martingale convergence theorem (Section 2.9). Relations between the martingale convergence theorem and the Radon-Nikodym theorem in a more general setting (martingales and measures with values in Banach spaces) were intensively investigated by many authors. It turned out, for instance, that these two theorems are equivalent (Chatterji [11]). In the proof of the theorem, we shall need a simple fact concerning "generalized" sequences. A partially ordered set $\mathcal{T}$ is said to be filtering to the right if for any $t_1, t_2 \in \mathcal{T}$ there exists $t \in \mathcal{T}$ such that $t_1 \le t$, $t_2 \le t$.

Every mapping $x(\cdot)$ from $\mathcal{T}$ into a metric space $E$ is called a generalized sequence. A generalized sequence $(x(t))_{t \in \mathcal{T}}$ converges to $x_0$ if for every $\varepsilon > 0$ there exists $t(\varepsilon) \in \mathcal{T}$ such that for $t \ge t(\varepsilon)$, $\rho(x(t), x_0) < \varepsilon$. In this case we write $\lim_{t \in \mathcal{T}} x(t) = x_0$. Here $\rho$ is a metric on $E$.

Lemma 1. Let $E$ be a complete metric space and $(x(t))_{t \in \mathcal{T}}$ a generalized sequence such that for every increasing sequence $t_1 \le t_2 \le \dots$ the sequence $(x(t_n))$ converges in $E$. Then there exists an element $x_0 \in E$ such that $\lim_{t \in \mathcal{T}} x(t) = x_0$ and, moreover, $\lim_n x(t_n) = x_0$ for some increasing sequence $(t_n)$.

Proof. It is easy to see (by contradiction) that for every $\varepsilon > 0$ there exists $t(\varepsilon) \in \mathcal{T}$ such that for every $t \ge t(\varepsilon)$, $\rho(x(t), x(t(\varepsilon))) < \varepsilon$. If we define an increasing sequence $(t_n)$ such that for $t \ge t_n$, $\rho(x(t), x(t_n)) \le 1/n$, we obtain that $(x(t_n))$ converges to an element $x_0 \in E$ and, moreover, that $\lim_{t \in \mathcal{T}} x(t) = x_0$.

Proof of the theorem. Let us define a probability measure $P = \frac{1}{c}(\mu + \nu)$, where $c = \mu(\Omega) + \nu(\Omega)$, and let $\mathcal{T}$ be the collection of all finite partitions $t = (A_1^t, \dots, A_{n_t}^t)$, $(A_i^t \in \mathcal{F})$, of $\Omega$ and let $\mathcal{F}_t = \sigma(A_1^t, \dots, A_{n_t}^t)$. Denote by $g_t$ the density of the measure $\mu$ with respect to $P$ (considered on $\mathcal{F}_t$) (see Section 2.3). For every increasing sequence of partitions $(t_n)$, the corresponding sequence of densities $(g_{t_n})$ is an $(\mathcal{F}_{t_n})$-martingale (see Section 2.3). Since $0 \le g_{t_n} \le c$, the convergence theorem (Section 2.9) implies that the sequence $(g_{t_n})$ converges $P$-almost surely to an integrable random variable, and from Lebesgue's theorem we obtain that $(g_{t_n})$ converges also in $E = L^1(\Omega, \mathcal{F}, P)$. Applying the above lemma we see that there exists a $P$-integrable, $\mathcal{F}$-measurable function $g$ such that $\lim_{t \in \mathcal{T}} g_t = g$. If $A \in \mathcal{F}$, then $A \in \mathcal{F}_t$ for some $t \in \mathcal{T}$. From this, if $s \ge t$, then $\mu(A) = \int_A g_s\, dP$, and passing to the limit we obtain

$$\mu(A) = \int_A g\, dP = \frac{1}{c}\int_A g\, d\mu + \frac{1}{c}\int_A g\, d\nu$$

or equivalently $\int_A (c - g)\, d\mu = \int_A g\, d\nu$ for all $A \in \mathcal{F}$. Using the fact that every $\mathcal{F}$-measurable function $h \ge 0$ is a limit of an increasing sequence of simple functions, as well as Lebesgue's monotone convergence theorem, we obtain

$$\int_A h(c - g)\, d\mu = \int_A h g\, d\nu.$$

To finish the proof of the theorem, define the set $B$ and the function $\hat{g}$ (which plays the role of $g$ in the statement of the theorem) as follows:

$$B = \{g = c\} \quad \text{and} \quad \hat{g} = 0 \ \text{on } B, \quad \hat{g} = \frac{g}{c - g} \ \text{on } B^c.$$

Since $\int_B h g\, d\nu = 0$ for any $\mathcal{F}$-measurable function $h \ge 0$, therefore $\nu(B) = 0$. Let $h = 0$ on $B$ and $h = \frac{1}{c - g}$ on $B^c$; then

$$\mu(A \setminus B) = \int_{A \setminus B} h(c - g)\, d\mu = \int_{A \setminus B} h g\, d\nu = \int_{A \setminus B} \hat{g}\, d\nu = \int_A \hat{g}\, d\nu$$

and finally

$$\mu(A) = \mu(A \setminus B) + \mu(A \cap B) = \int_A \hat{g}\, d\nu + \mu(A \cap B).$$

Problem 1. Prove that the function $g$ and the set $B$ given in the theorem are unique in the following sense: if $g_1$ and $B_1$ satisfy the statement of the theorem, then

$$\nu(g \ne g_1) = 0 \quad \text{and} \quad \mu(B \setminus B_1) + \mu(B_1 \setminus B) = 0.$$

Corollary 1. Let $\mu$ be a measure which is absolutely continuous with respect to $\nu$; then there exists a function $\hat{g} \ge 0$, $\mathcal{F}$-measurable, such that $\mu(A) = \int_A \hat{g}\, d\nu$, $A \in \mathcal{F}$.

Moreover, there exists an increasing sequence of partitions $t_1 \le t_2 \le \dots$ such that the corresponding sequence $(\hat{g}_{t_n})$ of densities of the measure $\mu$ with respect to $\nu$ (considered on $\mathcal{F}_{t_n}$, see Section 2.3) converges $\nu$-almost surely to the density $\hat{g}$.

Proof. Only the second part of the corollary requires a proof. In the notation of the proof of the theorem, Lemma 1 gives the existence of an increasing sequence of partitions $t_1 \le t_2 \le \dots$ such that $g_{t_n} \to g$ in $E$ and $P$-almost surely. But $\hat{g} = g/(c - g)$ and thus $g_{t_n}/(c - g_{t_n}) \to \hat{g}$. It is sufficient to note that $g_{t_n}/(c - g_{t_n}) = \hat{g}_{t_n}$.

Problem 2. Let $\mathcal{F}_n$ be a $\sigma$-field generated by a partition $(A_1^n, \dots, A_{k_n}^n)$, $n = 1, 2, \dots$, $\mathcal{F} = \sigma(\mathcal{F}_1, \mathcal{F}_2, \dots)$ and let $\mathcal{F}_n \subset \mathcal{F}_{n+1}$. If measures $\mu$ and $\nu$ are defined on a measurable space $(\Omega, \mathcal{F})$ and $\mu$ is absolutely continuous with respect to $\nu$, then the corresponding sequence of densities $(g_n)$ converges $\nu$-almost surely to the density $g$ of $\mu$ with respect to $\nu$.

Hint: See the proof of the above corollary.

3. CONDITIONING

3.1. Definition of the conditional expectation

As usual, let $(\Omega, \mathcal{F}, P)$ denote a fixed probability space. Let $X$ be an integrable random variable and $A \in \mathcal{F}$. The conditional expectation of $X$ given that $A$ has occurred is defined, in elementary probability theory, as

$$E(X \mid A) = \begin{cases} \dfrac{\int_A X\, dP}{P(A)} & \text{if } P(A) > 0, \\[2mm] \text{any number} & \text{if } P(A) = 0. \end{cases}$$

In particular, the conditional probability of an event $B$ given that $A$ has occurred is equal to

$$P(B \mid A) = E(I_B \mid A) = \frac{P(A \cap B)}{P(A)}.$$

It has been found useful to generalize these definitions and define conditional expectations as random variables, as follows. Let $(A_1, \dots, A_k)$ be a partition of $\Omega$ and let $\mathcal{G} = \sigma(A_1, \dots, A_k) \subset \mathcal{F}$. Then the conditional expectation of $X$ relative to $\mathcal{G}$ is the random variable with constant value $E(X \mid A_i)$ on each set $A_i$, $i = 1, \dots, k$. More generally, let $\mathcal{G} \subset \mathcal{F}$ be any $\sigma$-field. The conditional expectation of $X$ relative to $\mathcal{G}$ is a random variable $Y$ such that

1) $Y$ is $\mathcal{G}$-measurable;

2) If $A \in \mathcal{G}$, then $\int_A X\, dP = \int_A Y\, dP$.

Let us remark that if $\mathcal{G} = \sigma(A_1, \dots, A_k)$, then the random variable $Y$ with values $E(X \mid A_i)$ on $A_i$, $i = 1, \dots, k$, satisfies 1) and 2). Any random variable $Y$ that satisfies 1) and 2) will be denoted by $E(X \mid \mathcal{G})$. It is clear that if a random variable $Y_1$ satisfies 1) and 2), then $Y = Y_1$, except possibly on an $\omega$-set of probability zero. Moreover, as a consequence of the Radon-Nikodym theorem, we obtain the following proposition:
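On a finite probability space the partition case of the definition can be computed directly. The sketch below builds $Y = E(X \mid \sigma(A_1, \dots, A_k))$ as the block-wise average of $X$ and leaves conditions 1) and 2) to be checked on the result. The sample space (a fair die), the random variable and the partition are hypothetical choices made for illustration.

```python
from fractions import Fraction


def conditional_expectation(X, P, partition):
    """Return Y = E(X | sigma(partition)) as a dict omega -> value.

    X and P map sample points to values and probabilities; partition is a
    list of disjoint sets of sample points covering the sample space.
    """
    Y = {}
    for block in partition:
        mass = sum(P[w] for w in block)
        # On a block of probability zero any number is allowed; 0 is used here.
        if mass > 0:
            value = sum(X[w] * P[w] for w in block) / mass
        else:
            value = Fraction(0)
        for w in block:
            Y[w] = value  # Y is constant on each block, hence G-measurable
    return Y


# Hypothetical example: a fair die, X = face value, odd faces versus even faces.
P = {w: Fraction(1, 6) for w in range(1, 7)}
X = {w: Fraction(w) for w in range(1, 7)}
partition = [{1, 3, 5}, {2, 4, 6}]
Y = conditional_expectation(X, P, partition)

# Condition 2) on each generating set A_i: the integrals of X and Y coincide.
integrals_match = all(
    sum(X[w] * P[w] for w in block) == sum(Y[w] * P[w] for w in block)
    for block in partition
)
```

Since every set of $\sigma(A_1, \dots, A_k)$ is a union of blocks, checking condition 2) on the blocks is enough.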

Proposition 1. A random variable $Y$ which satisfies the conditions 1) and 2) always exists. There exists also an increasing sequence of partitions $(A_1^n, \dots, A_{k_n}^n)$ of $\Omega$ such that $E(X \mid \mathcal{G}_n) \to Y$, almost surely $P$, as $n \to +\infty$, where $\mathcal{G}_n = \sigma(A_1^n, \dots, A_{k_n}^n)$.

Proof. Without loss of generality, we may assume $X \ge 0$. Let us define a measure $\mu$: $\mu(A) = \int_A X\, dP$, $A \in \mathcal{G}$; then the measure $\mu$ is absolutely continuous with respect to $P$ and the Radon-Nikodym theorem implies that there exists a $\mathcal{G}$-measurable function $Y$ such that $\int_A X\, dP = \mu(A) = \int_A Y\, dP$.

To obtain the latter part of the proposition, it is sufficient to apply Corollary 1 (Section 2.10).

Proposition 2. Let $\mathcal{G}_n$ be $\sigma$-fields generated by finite partitions $(A_1^n, \dots, A_{k_n}^n)$, $n = 1, \dots$, with $\mathcal{G}_1 \subset \mathcal{G}_2 \subset \dots$, and let $\mathcal{G} = \sigma(\mathcal{G}_1, \mathcal{G}_2, \dots)$; then

$$E(X \mid \mathcal{G}_n) \to E(X \mid \mathcal{G}) \quad \text{almost surely.}$$

Proof. It is sufficient to apply Problem 2, Section 2.10.

Remark 1. By the very definition, if $X$ is a $\mathcal{G}$-measurable r.v., then $E(X \mid \mathcal{G}) = X$. It is worth remarking that if $X = I_B$, $\mathcal{G} = \sigma(A_1, \dots, A_k)$ and $A = \Omega$, then the condition 2) implies the classical Bayes formula:

$$P(B) = \sum_{i=1}^{k} P(B \mid A_i)\, P(A_i).$$

Problem 1. Let $Z$ be a $\mathcal{G}$-measurable r.v. such that $E|XZ| < +\infty$; then $E(XZ \mid \mathcal{G}) = Z\, E(X \mid \mathcal{G})$, $P$-almost surely. Hint: show first that the above formula is true for simple $Z$, and then pass to the limit.

Problem 2. Let $(H_n)_{n=0,1,\dots}$ be a generalized Haar system (see Section 2.2) relative to $(\mathcal{F}_n)_{n=0,1,\dots}$ and $X$ an integrable random variable; then

$$E(X \mid \mathcal{F}_n) = \sum_{k=0}^{n} E(X H_k)\, H_k.$$

Hint: Show first that any $\mathcal{F}_n$-measurable function is of the form

$$\sum_{k=0}^{n} a_k H_k.$$

Then use the orthonormality of the Haar system and Problem 1.

Corollary 1. A generalized Haar system is a complete and orthonormal basis in $L^2(\Omega, \mathcal{F}, P)$, where $\mathcal{F} = \sigma(\mathcal{F}_0, \mathcal{F}_1, \dots)$.

Proof. By Proposition 2 we know that $E(X \mid \mathcal{F}_n) \to E(X \mid \mathcal{F})$ (a.s.). If $X \in L^2(\Omega, \mathcal{F}, P)$, then $X$ is $\mathcal{F}$-measurable and therefore $E(X \mid \mathcal{F}) = X$. Thus the Fourier series $\sum E(X H_k) H_k$ converges to $X$ (a.s.) and, since it converges also in the sense of $L^2(\Omega, \mathcal{F}, P)$, we conclude that

$$X = \sum_{k=0}^{+\infty} E(X H_k)\, H_k$$

in $L^2(\Omega, \mathcal{F}, P)$. Corollary 1 will play an important role in the construction of Brownian motion (Section 4.2).

Remark 2. If a $\sigma$-field $\mathcal{G}$ is generated by a family of random variables, say $\mathcal{G} = \sigma(X_t;\ t \in T)$, then $E(X \mid \mathcal{G})$ will be denoted as $E(X \mid X_t,\ t \in T)$.

Problem 3. Let $r_1, \dots, r_n$ be Rademacher functions (Section 2.2) and let $s_k$ be the indicator of the interval $[(k-1)/2^n, k/2^n)$, $k = 1, \dots, 2^n$. Show that

$$E(X \mid r_1, \dots, r_n) = E(X \mid s_1, \dots, s_{2^n}) = \sum_{k=1}^{2^n} a_k s_k.$$

Find $a_k$.

3 .2 . B asic properties o f the conditional expectation

In this section , we sum m arize the basic properties o f the conditional expectation. Connections with m artingales and an estim ation problem will be d iscussed separately in Sections 3.3 and 3.4.

In the proposition below , we list the properties o f the conditional expectation which are s im ilar to those of the usual expectation.

Proposition 1. If X, Y are integrable random variables and a, b, c are real numbers, then

1) $E(aX + bY + c \mid \mathcal{G}) = aE(X \mid \mathcal{G}) + bE(Y \mid \mathcal{G}) + c$;

2) If, in addition, $X \le Y$ a.s., then $E(X \mid \mathcal{G}) \le E(Y \mid \mathcal{G})$ a.s.;

3) If $X_n$, n = 1, 2, ..., are integrable random variables which increase to X, then

$$\lim_n E(X_n \mid \mathcal{G}) = E(X \mid \mathcal{G}) \quad \text{(a.s.)}$$

4) (Jensen's inequality). Let h be a convex mapping from $\mathbb{R}^1$ into $\mathbb{R}^1$ and let X and h(X) be integrable r.v.'s; then

$$h(E(X \mid \mathcal{G})) \le E(h(X) \mid \mathcal{G}) \quad \text{(a.s.)}$$

Proof. We only prove property 4); the remaining properties are left to the reader. The function h is the upper envelope of a countable family of affine functions $h_n$, $h_n(x) = a_n x + b_n$, $x \in \mathbb{R}^1$. The random variables $h_n(X)$ are clearly integrable, and property 1) implies $h_n(E(X \mid \mathcal{G})) = E(h_n(X) \mid \mathcal{G})$. Since $h_n \le h$, using 2) we obtain $h_n(E(X \mid \mathcal{G})) \le E(h(X) \mid \mathcal{G})$ (a.s.). Therefore (the family $h_n(E(X \mid \mathcal{G}))$ is countable) $h(E(X \mid \mathcal{G})) = \sup_n h_n(E(X \mid \mathcal{G})) \le E(h(X) \mid \mathcal{G})$ (a.s.).

Corollary 1. Jensen's inequality implies that if $E|X|^p < +\infty$, $1 \le p \le +\infty$, then

$$\|E(X \mid \mathcal{G})\|_p \le \|X\|_p$$

Proof. Let $1 \le p < +\infty$; then $|E(X \mid \mathcal{G})|^p \le E(|X|^p \mid \mathcal{G})$ and therefore $E|E(X \mid \mathcal{G})|^p \le E|X|^p$. If $p = +\infty$, then $|X| \le \|X\|_\infty$ (a.s.), hence $|E(X \mid \mathcal{G})| \le \|X\|_\infty$ (a.s.) and $\|E(X \mid \mathcal{G})\|_\infty \le \|X\|_\infty$.
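Jensen's inequality and the contraction property can be checked by hand on a finite probability space, where conditioning on the σ-field generated by a partition reduces to averaging over each cell. The Python sketch below is only an illustration under that assumption; the sample space, the partition and the helper names `cond_exp` and `expect` are hypothetical and do not come from the lectures.

```python
from fractions import Fraction

def cond_exp(x, partition, prob):
    """E(X | G) on a finite sample space; G is generated by `partition` (assumed setup)."""
    out = [None] * len(x)
    for cell in partition:
        mass = sum(prob[w] for w in cell)
        avg = sum(x[w] * prob[w] for w in cell) / mass
        for w in cell:
            out[w] = avg          # E(X | G) is constant on each cell
    return out

def expect(x, prob):
    return sum(xi * pi for xi, pi in zip(x, prob))

# Hypothetical example: six equally likely outcomes, three cells.
prob = [Fraction(1, 6)] * 6
partition = [[0, 1], [2, 3, 4], [5]]
X = [Fraction(v) for v in (-3, 1, 2, 0, 4, -5)]

h = lambda v: v * v                                   # a convex function
EX_G = cond_exp(X, partition, prob)
lhs = [h(v) for v in EX_G]                            # h(E(X | G))
rhs = cond_exp([h(v) for v in X], partition, prob)    # E(h(X) | G)

jensen_holds = all(a <= b for a, b in zip(lhs, rhs))
# p = 2 case of the corollary, written for the squared norms
contraction_holds = expect(lhs, prob) <= expect([h(v) for v in X], prob)
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the comparisons are not affected by rounding.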

More characteristic properties of the conditional expectation are given in the following proposition:

Proposition 2. Let X be an integrable random variable.

1) If $\mathcal{G} = \{\emptyset, \Omega\}$, then $E(X \mid \mathcal{G}) = E(X)$;

2) If $\mathcal{G}$ is an arbitrary σ-field and X is a $\mathcal{G}$-measurable r.v., then $E(X \mid \mathcal{G}) = X$ (a.s.);

3) If X is independent of $\mathcal{G}$, then $E(X \mid \mathcal{G}) = E(X)$ a.s.;

4) If $\mathcal{G}_1$, $\mathcal{G}_2$ are two σ-fields, $\mathcal{G}_1 \subset \mathcal{G}_2 \subset \mathcal{F}$, then $E(E(X \mid \mathcal{G}_2) \mid \mathcal{G}_1) = E(X \mid \mathcal{G}_1)$ and, in particular,

5) $E(E(X \mid \mathcal{G})) = E(X)$ (Bayes formula).

Proof. We show, e.g., property 4). Let $A \in \mathcal{G}_1$; then $\int_A E(X \mid \mathcal{G}_1)\,dP = \int_A X\,dP$. On the other hand, since $A \in \mathcal{G}_2$, we obtain $\int_A E(E(X \mid \mathcal{G}_2) \mid \mathcal{G}_1)\,dP = \int_A E(X \mid \mathcal{G}_2)\,dP = \int_A X\,dP$. Thus 4) follows.

We formulate separately a generalization of property 3), Proposition 2, because this generalization is very important in applications:

Lemma 1. Let $X_1$, $X_2$ be two random variables with values in $(E_1, \mathcal{E}_1)$, $(E_2, \mathcal{E}_2)$, respectively. Assume that $X_1$ is independent of a σ-field $\mathcal{G}$ and $X_2$ is $\mathcal{G}$-measurable. If f is a real function defined on $E_1 \times E_2$, $\mathcal{E}_1 \times \mathcal{E}_2$-measurable, then

$$E(f(X_1, X_2) \mid \mathcal{G}) = f_2(X_2) \quad \text{(a.s.)}$$

where $f_2(x_2) = E(f(X_1, x_2))$, $x_2 \in E_2$, provided $E|f(X_1, X_2)| < +\infty$.

Proof. The proof is analogous to that of Lemma 4, Section 1.1, and will therefore be omitted (the usual technique of π-systems does work).

3.3. Conditional expectation and martingales

It is very important that, by using the notion of conditional expectation, it is possible to give a new, equivalent definition of a martingale, a supermartingale and a submartingale.

A family $(X_t)_{t \in T}$ of integrable random variables is said to be a martingale (or, respectively, a supermartingale, a submartingale) with respect to an increasing family of σ-fields $(\mathcal{F}_t)_{t \in T}$ if

1) $X_t$ is $\mathcal{F}_t$-measurable for $t \in T$ (adapted to $\mathcal{F}_t$),

2) $E(X_t \mid \mathcal{F}_s) = X_s$ a.s. (or, respectively, $\le$, $\ge$) for $t \ge s$.

The proof that the definition of a martingale (or, respectively, a supermartingale, a submartingale) given here is equivalent to that from Section 2.1 is almost immediate and is left to the reader.

We use the new definition to prove the following important Lemma 1.

Lemma 1. Let $(X_t)_{t \in T}$ be a martingale such that for some $p \in [1, +\infty)$ and all $t \in T$, $E|X_t|^p < +\infty$. Then

1) The family $(|X_t|^p)_{t \in T}$ is a submartingale;

2) If T = {1, 2, ..., n} and c is a positive number, then

$$P\Big(\sup_k |X_k| \ge c\Big) \le \frac{E|X_n|^p}{c^p}$$

Proof. 1) Let s < t; then by Jensen's inequality (Proposition 1, Section 3.2), $E(|X_t|^p \mid \mathcal{F}_s) \ge |E(X_t \mid \mathcal{F}_s)|^p = |X_s|^p$ (a.s.). 2) Since $(-|X_k|^p)_{k=1,\ldots,n}$ is a supermartingale, using inequality 4) from Theorem 1, Section 2.8, we obtain $c^p\,P(\inf_k(-|X_k|^p) \le -c^p) \le E|X_n|^p$. After an elementary transformation, 2) follows.
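Inequality 2) of the lemma can be verified exactly for a small martingale by enumerating all trajectories. The sketch below takes, as an assumed example not found in the lectures, the symmetric ±1 random walk with n = 10 steps, p = 2 and c = 4, and compares both sides of the inequality in exact arithmetic.

```python
from fractions import Fraction
from itertools import product

n, p, c = 10, 2, 4          # illustrative choices, not from the text
hits = 0                    # number of paths with max_k |X_k| >= c
moment = 0                  # sum over paths of |X_n|^p

for steps in product((-1, 1), repeat=n):
    s, running_max = 0, 0
    for e in steps:
        s += e                              # X_k = e_1 + ... + e_k is a martingale
        running_max = max(running_max, abs(s))
    if running_max >= c:
        hits += 1
    moment += abs(s) ** p

total = 2 ** n
lhs = Fraction(hits, total)                 # P(sup_k |X_k| >= c)
rhs = Fraction(moment, total) / c ** p      # E|X_n|^p / c^p
```

For this walk $E|X_n|^2 = n$, so the right-hand side equals 10/16; the enumeration confirms that the left-hand side does not exceed it.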

3.4. Conditional expectation and an estimation problem

As usual, $(\Omega, \mathcal{F}, P)$ is a fixed probability space. Let X be a real-valued random variable and Y a random variable with values in a measurable space $(E, \mathcal{E})$. X and Y will be interpreted as an unobservable parameter and observable data, respectively. The estimation problem can be stated as follows: knowing Y, estimate X in the best possible way. To be more specific, we assume $E(X^2) < +\infty$ and formulate precisely the so-called

Least-square estimation problem: Find a real function $\hat f$ defined on E and $\mathcal{E}$-measurable such that for any real function f defined on E and $\mathcal{E}$-measurable:

$$E(X - \hat f(Y))^2 \le E(X - f(Y))^2 \qquad (1)$$

Any function $\hat f$ satisfying (1) is called an optimal estimator. The existence of an optimal estimator is a consequence of the following proposition.

As usual, $L^2(\Omega, \mathcal{G}, P)$ denotes the Hilbert space of all $\mathcal{G}$-measurable, square-integrable r.v.'s. It can be considered as a closed subspace of $L^2(\Omega, \mathcal{F}, P)$.

Proposition. The conditional expectation $E(X \mid \mathcal{G})$ is exactly the orthogonal projection of X onto $L^2(\Omega, \mathcal{G}, P)$. Moreover, for any $Z \in L^2(\Omega, \mathcal{G}, P)$:

$$E(X - E(X \mid \mathcal{G}))^2 \le E(X - Z)^2 \qquad (2)$$

Proof. Let $X^\perp$ be the orthogonal projection of X onto $L^2(\Omega, \mathcal{G}, P)$ and $A \in \mathcal{G}$; then $X - X^\perp$ is orthogonal to $I_A$. Thus $E(I_A(X - X^\perp)) = 0$ or, equivalently, $\int_A X\,dP = \int_A X^\perp\,dP$. This implies $X^\perp = E(X \mid \mathcal{G})$. The Pythagorean identity completes the proof.

The solution to the least-square estimation problem is given by the following corollary:

Corollary. Let $\mathcal{G} = \sigma(Y)$. Since every $\mathcal{G}$-measurable real function is of the form f(Y) for some $\mathcal{E}$-measurable f, we have $E(X \mid \mathcal{G}) = \hat f(Y)$ for some $\mathcal{E}$-measurable function $\hat f$. Inequality (2) says that $\hat f$ is the required optimal estimator. Let us note that $\hat f$ is, in general, not uniquely determined.
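On a finite sample space the optimal estimator is simply the average of X over each level set of Y, and inequality (1) can be tested against competing estimators by brute force. The following sketch assumes a hypothetical six-point uniform space; the data `X`, `Y` and the helper names `optimal_estimator` and `mse` are made up for illustration.

```python
from fractions import Fraction
from itertools import product

prob = [Fraction(1, 6)] * 6                       # assumed uniform probabilities
Y = [0, 0, 1, 1, 1, 2]                            # observable data
X = [Fraction(v) for v in (-3, 1, 2, 0, 4, -5)]   # unobservable parameter

def optimal_estimator(X, Y, prob):
    """f_hat(y) = E(X | Y = y), computed cell by cell."""
    est = {}
    for y in set(Y):
        idx = [w for w in range(len(Y)) if Y[w] == y]
        mass = sum(prob[w] for w in idx)
        est[y] = sum(X[w] * prob[w] for w in idx) / mass
    return est

def mse(f, X, Y, prob):
    """E(X - f(Y))^2 for an estimator given as a dict y -> f(y)."""
    return sum(prob[w] * (X[w] - f[Y[w]]) ** 2 for w in range(len(X)))

f_hat = optimal_estimator(X, Y, prob)
best = mse(f_hat, X, Y, prob)

# Compare against every integer-valued f: {0, 1, 2} -> {-6, ..., 6}.
competitors = [dict(zip((0, 1, 2), vals)) for vals in product(range(-6, 7), repeat=3)]
no_competitor_wins = all(mse(f, X, Y, prob) >= best for f in competitors)
```

The competitor grid is only a finite spot check; the general statement is inequality (2).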

Remark. The above proposition suggests a new proof (independent of the Radon-Nikodym theorem) of the existence of the conditional expectation $E(X \mid \mathcal{G})$. Without loss of generality we can assume $X \ge 0$. If X is a bounded r.v., then $E(X \mid \mathcal{G})$ can be defined as the orthogonal projection $X^\perp$. If X is unbounded, then there exists an increasing sequence of bounded non-negative random variables $X_n \uparrow X$ and it is sufficient to define $E(X \mid \mathcal{G}) = \lim_n E(X_n \mid \mathcal{G})$.

Example 1. Let $H_0, \ldots, H_n$ be Haar functions and define $Y = (H_0, \ldots, H_n)$. If X is a real-valued r.v. with $E(X^2) < +\infty$, then the optimal estimator $\hat f$ is a function defined on $\mathbb{R}^{n+1}$,

$$\hat f(y_0, \ldots, y_n) = \sum_{k=0}^{n} E(XH_k)\,y_k$$

(see Problem 2, Section 3.1).

Example 2. If $Y = (r_1, \ldots, r_n)$, where $r_k$ are Rademacher functions, then the optimal estimator $\hat f$ can be defined as

$$\hat f(y_1, \ldots, y_n) = 2^n \int_{I(y_1, \ldots, y_n)} X(\omega)\,P(d\omega)$$

where $I(y_1, \ldots, y_n)$ is the interval $[\,y_1/2 + \ldots + y_n/2^n,\ y_1/2 + \ldots + y_n/2^n + 1/2^n) \cap [0, 1)$ (see Section 2.2).

Examples of optimal estimators that are more important from the "practical" point of view will be given in the next section.

3.5. Conditional densities

Let X and Y be two random vectors with components $X_1, \ldots, X_n$ and $Y_1, \ldots, Y_k$, respectively, let $g(\cdot, \cdot)$ be a (Borel-measurable) density of the joint distribution of (X, Y), and assume (to simplify notation) that g is a positive function on $\mathbb{R}^n \times \mathbb{R}^k$. The function $g(\cdot \mid \cdot)$ defined as

$$g(x \mid y) = \frac{g(x, y)}{\int_{\mathbb{R}^n} g(z, y)\,dz}, \qquad x \in \mathbb{R}^n,\ y \in \mathbb{R}^k$$

is called the conditional density of X with respect to Y. The importance of conditional densities follows from the following proposition.

Proposition 1. If f is a Borel real-valued function on $\mathbb{R}^n \times \mathbb{R}^k$, then

$$E(f(X, Y) \mid Y) = \int_{\mathbb{R}^n} f(x, Y)\,g(x \mid Y)\,dx \quad \text{(a.s.)} \qquad (1)$$

provided f(X, Y) is an integrable r.v.

Proof. First we show that for $A \in \mathcal{B}(\mathbb{R}^n)$

$$P(X \in A \mid Y) = \int_A g(x \mid Y)\,dx \qquad (2)$$

Since every set belonging to σ(Y) is of the form $\{Y \in B\}$, where $B \in \mathcal{B}(\mathbb{R}^k)$, (2) is equivalent to

$$E(I_A(X)\,I_B(Y)) = E\Big(I_B(Y) \int_A g(x \mid Y)\,dx\Big)$$

But g is the density of the distribution of (X, Y); therefore (see Lemma 2, Section 1.4)

$$E(I_A(X)\,I_B(Y)) = \int_{A \times B} g(x, y)\,dx\,dy$$

On the other hand, the function $\int_{\mathbb{R}^n} g(z, \cdot)\,dz$ is the density of Y and, consequently,

$$E\Big(I_B(Y) \int_A g(x \mid Y)\,dx\Big) = \int_B \Big(\int_A g(x \mid y)\,dx\Big)\Big(\int_{\mathbb{R}^n} g(z, y)\,dz\Big)\,dy = \int_{A \times B} g(x, y)\,dx\,dy$$

because of the definition of the conditional density $g(\cdot \mid \cdot)$. Combining all this, we obtain (2). From (2), we easily deduce that (1) holds for functions f of the form $f(x, y) = I_A(x)\,I_B(y)$, $A \in \mathcal{B}(\mathbb{R}^n)$, $B \in \mathcal{B}(\mathbb{R}^k)$. Let us denote by $\mathcal{K}$ the family of all sets $C \in \mathcal{B}(\mathbb{R}^n \times \mathbb{R}^k)$ such that (1) holds for $I_C$. Then $\mathcal{K}$ satisfies conditions 1), 2) and 3) of Lemma 3 (Section 1.1) and $\mathcal{K}$ contains the π-system $\mathcal{P}$ of all sets $A \times B$, $A \in \mathcal{B}(\mathbb{R}^n)$, $B \in \mathcal{B}(\mathbb{R}^k)$; therefore the mentioned Lemma 3 implies $\mathcal{K} = \sigma(\mathcal{P}) = \mathcal{B}(\mathbb{R}^n \times \mathbb{R}^k)$. The usual technique of "increasing sequences of simple functions" completes the proof.

Corollary 1. If f depends only on "x", then

$$E(f(X) \mid Y) = \int_{\mathbb{R}^n} f(x)\,g(x \mid Y)\,dx$$

Corollary 2. If Z is a random variable measurable with respect to σ(Y) and takes values in $(\mathbb{R}^m, \mathcal{B}(\mathbb{R}^m))$, then

$$E(f(X, Z) \mid Y) = \int_{\mathbb{R}^n} f(x, Z)\,g(x \mid Y)\,dx \quad \text{(a.s.)}$$

provided f is a real-valued $\mathcal{B}(\mathbb{R}^n) \times \mathcal{B}(\mathbb{R}^m)$-measurable function and f(X, Z) is an integrable r.v. Indeed, since Z = h(Y) for some Borel-measurable h, the above formula follows from (1).

If X is a random vector with components $X_i$ (or a random matrix with components $X_{ij}$), then $E(X \mid \mathcal{G})$ is a random vector with components $E(X_i \mid \mathcal{G})$ (or a random matrix with components $E(X_{ij} \mid \mathcal{G})$).

Example 1. Define

$$m_i(y) = \int_{\mathbb{R}^n} x_i\,g(x \mid y)\,dx$$

$$q_{ij}(y) = \int_{\mathbb{R}^n} (x_i - m_i(y))(x_j - m_j(y))\,g(x \mid y)\,dx, \qquad i, j = 1, \ldots, n,\ y \in \mathbb{R}^k$$

and let m, Q be the vector and the matrix with components $m_i$ and $q_{ij}$, respectively; then

$$E(X \mid Y) = m(Y)$$

and

$$E((X - m(Y))(X - m(Y))' \mid Y) = Q(Y) \quad \text{(a.s.)}$$

The random matrix Q(Y) is called the conditional covariance matrix. The above formulas follow directly from Proposition 1.

The following proposition plays an important role in stochastic control theory:

Proposition 2. Let the random vector (X, Y) be normally distributed with the density $g_{m,Q}$, where $m = (m_1, m_2)$ and

$$Q = \begin{pmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{pmatrix}$$

are, respectively, the mean vector and the covariance matrix of the random variable (X, Y). For every fixed $y \in \mathbb{R}^k$ the conditional density $g_{m,Q}(\cdot \mid y)$ is normal with the mean vector $m(y) = m_1 + Q_{12}Q_{22}^{-1}(y - m_2)$ and the conditional covariance matrix $Q(y) = Q_{11} - Q_{12}Q_{22}^{-1}Q_{21}$. Let us note that Q(y) does not depend on y.

Proof. Let

$$R = \begin{pmatrix} R_{11} & R_{12} \\ R_{21} & R_{22} \end{pmatrix}$$

denote the inverse matrix of Q. R is positive-definite and, therefore, $R_{11}$, $R_{22}$ are also positive-definite matrices. Since QR = I, we obtain $Q_{11}R_{11} + Q_{12}R_{21} = I$, $Q_{21}R_{11} + Q_{22}R_{21} = 0$ and, consequently,

$$R_{11}^{-1}R_{12} = -Q_{12}Q_{22}^{-1}, \qquad R_{11}^{-1} = Q_{11} - Q_{12}Q_{22}^{-1}Q_{21} \qquad (3)$$

From the definition of the conditional densities, we have

$$g_{m,Q}(x \mid y) = C(y)\,g_{m,Q}(x, y) = C_1(y)\exp\Big\{-\tfrac{1}{2}\big\langle R_{11}\big(x - m_1 + R_{11}^{-1}R_{12}(y - m_2)\big),\ x - m_1 + R_{11}^{-1}R_{12}(y - m_2)\big\rangle\Big\}$$

where C(y), $C_1(y)$ are constants depending on y only. Using (3), we obtain, finally,

$$g_{m,Q}(x \mid y) = C_1(y)\exp\Big\{-\tfrac{1}{2}\big\langle Q(y)^{-1}(x - m(y)),\ x - m(y)\big\rangle\Big\}$$

the desired result.
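In the scalar case n = k = 1 the conditional mean and variance, and the relations between the covariance matrix Q and its inverse R used in the proof, can be checked with a few lines of arithmetic. The sketch below is restricted to that scalar case; the function names and the numerical values are illustrative assumptions.

```python
def conditional_gaussian(m1, m2, q11, q12, q22, y):
    """Conditional mean and variance of X given Y = y for a bivariate normal (scalar blocks)."""
    mean = m1 + q12 / q22 * (y - m2)     # m(y) = m1 + Q12 Q22^{-1} (y - m2)
    var = q11 - q12 * q12 / q22          # Q11 - Q12 Q22^{-1} Q21
    return mean, var

def precision_entries(q11, q12, q22):
    """Entries R11, R12, R22 of the inverse of the 2 x 2 covariance matrix."""
    det = q11 * q22 - q12 * q12
    return q22 / det, -q12 / det, q11 / det

# Hypothetical parameters: m = (1, 0), Q = [[2, 1], [1, 4]], observation y = 2.
mean, var = conditional_gaussian(1.0, 0.0, 2.0, 1.0, 4.0, y=2.0)
r11, r12, r22 = precision_entries(2.0, 1.0, 4.0)

# Scalar versions of the two block identities linking R and Q:
schur_gap = abs(1.0 / r11 - var)         # R11^{-1} = Q11 - Q12 Q22^{-1} Q21
gain_gap = abs(-r12 / r11 - 1.0 / 4.0)   # -R11^{-1} R12 = Q12 Q22^{-1}
```

For matrix-valued blocks the same identities hold with matrix inverses in place of the divisions.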

4. WIENER PROCESS

4.1. Definition of a Wiener process

A family of real random variables $(W(t))_{t \ge 0}$ defined on a probability space $(\Omega, \mathcal{F}, P)$ is called a Wiener process or a Brownian motion process if it has the following properties:

1) $W(0, \omega) = 0$ (a.s.);

2) If $0 = t_0 < t_1 < \ldots < t_n$, then the random variables $W(t_1) - W(t_0), \ldots, W(t_n) - W(t_{n-1})$ are independent;

3) For every $t, s \ge 0$ the increment $W(t + s) - W(t)$ has a normal distribution with covariance s (and mean 0);

4) For almost all ω, the function $W(\cdot, \omega)$ is continuous on $[0, +\infty)$.

Problem 1. Show that a family $(W(t))_{t \ge 0}$ of real r.v.'s is a Wiener process if it satisfies 1) and 4) and the following two conditions:

2') If $0 < t_1 < t_2 < \ldots < t_n$, then the random vector $(W(t_1), \ldots, W(t_{n-1}), W(t_n))$ is normally distributed;

3') For every $t, s \ge 0$: $E(W(t)) = 0$, $E(W(t)W(s)) = t \wedge s$,

which means that $(W(t))_{t \ge 0}$ is a Gaussian process with the mean value function $\equiv 0$ and the covariance function $t \wedge s$. Hint: Use Problem 5, Section 1.5.

It is not obvious that a family $(W(t))_{t \ge 0}$ with properties 1)-4) actually exists. The first construction was given by Wiener in 1923. We introduce here a different, technically simpler construction, the so-called Levy-Ciesielski construction (see [2], [4]).

4.2. Levy-Ciesielski construction of a Wiener process

In the Levy-Ciesielski construction, the essential role is played by a Haar system (a special case of a generalized Haar system considered in Section 2.2) connected with a dyadic partition of the interval [0, 1]. Namely, let $h_0 \equiv 1$, and if $2^n \le k < 2^{n+1}$ then

$$h_k(t) = \begin{cases} 2^{n/2} & \text{if } \dfrac{k - 2^n}{2^n} \le t < \dfrac{k - 2^n}{2^n} + \dfrac{1}{2^{n+1}} \\[2mm] -2^{n/2} & \text{if } \dfrac{k - 2^n}{2^n} + \dfrac{1}{2^{n+1}} \le t < \dfrac{k - 2^n}{2^n} + \dfrac{1}{2^n} \\[2mm] 0 & \text{otherwise} \end{cases}$$

From Corollary 1, Section 3.1, we know that the system $\{h_k;\ k = 0, 1, \ldots\}$ forms an orthonormal and complete basis in the space $L^2([0, 1])$.
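The orthonormality of the dyadic Haar system can be confirmed numerically for the first few functions. The sketch below encodes the standard dyadic Haar functions on [0, 1] and evaluates inner products with a midpoint rule, which is exact (up to rounding) here because the functions involved are constant on every grid cell; the names `haar` and `inner` and the grid size are assumptions made for this illustration.

```python
def haar(k, t):
    """Dyadic Haar function h_k on [0, 1): h_0 = 1, and level n when 2**n <= k < 2**(n+1)."""
    if k == 0:
        return 1.0
    n = k.bit_length() - 1               # level with 2**n <= k < 2**(n+1)
    left = (k - 2 ** n) / 2 ** n
    half = 2.0 ** -(n + 1)
    if left <= t < left + half:
        return 2.0 ** (n / 2)
    if left + half <= t < left + 2 * half:
        return -(2.0 ** (n / 2))
    return 0.0

def inner(j, k, grid=1024):
    """Midpoint-rule value of the L^2([0,1]) inner product of h_j and h_k."""
    return sum(haar(j, (i + 0.5) / grid) * haar(k, (i + 0.5) / grid)
               for i in range(grid)) / grid

# Largest deviation of the Gram matrix of h_0, ..., h_15 from the identity.
gram_error = max(abs(inner(j, k) - (1.0 if j == k else 0.0))
                 for j in range(16) for k in range(16))
```

Completeness is not something a finite computation of this kind can show; it is the content of the corollary quoted above.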

Theorem 1. Let $(X_k)_{k=0,1,\ldots}$ be a sequence of independent random variables, normally distributed with mean 0 and covariance 1, defined on a probability space $(\Omega, \mathcal{F}, P)$. Then, for almost all ω, the series

$$\sum_{k=0}^{+\infty} X_k(\omega) \int_0^t h_k(s)\,ds = W(t, \omega), \qquad t \in [0, 1]$$

is uniformly convergent on [0, 1] and defines a Wiener process on [0, 1].

Lemma 1. Let $\varepsilon \in (0, \tfrac{1}{2})$ and M > 0. If $|a_k| \le Mk^\varepsilon$ for k = 1, 2, ..., then the series $\sum_{k=0}^{+\infty} a_k \int_0^t h_k(s)\,ds$ is uniformly convergent in the interval [0, 1].

Proof. If $2^n \le k < 2^{n+1}$, then the Schauder functions $S_k(t) = \int_0^t h_k(s)\,ds$, $t \in [0, 1]$, are non-negative, have disjoint supports and are bounded from above by $2^{n/2}\,2^{-n} = 2^{-n/2}$. Let us denote $b_n = \max(|a_k|;\ 2^n \le k < 2^{n+1})$; then

$$\sum_{2^n \le k < 2^{n+1}} |a_k|\,S_k(t) \le b_n 2^{-n/2}$$

for all $t \in [0, 1]$ and n = 0, 1, .... Thus the condition

$$\sum_{n=0}^{+\infty} b_n 2^{-n/2} < +\infty$$

is sufficient for the uniform convergence of the series

$$\sum_{n=0}^{+\infty}\ \sum_{2^n \le k < 2^{n+1}} |a_k|\,S_k(t)$$

and therefore for the uniform convergence of

$$\sum_{k=0}^{+\infty} a_k S_k(t)$$

too. From the inequalities $|a_k| \le Mk^\varepsilon$ it follows that $b_n \le 2^\varepsilon M 2^{n\varepsilon}$ for all n = 0, 1, ... and, consequently,

$$\sum_{n=0}^{+\infty} b_n 2^{-n/2} \le 2^\varepsilon M \sum_{n=0}^{+\infty} \big(2^{\varepsilon - 1/2}\big)^n < +\infty$$

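Lemma 2. Let $(X_k)_{k=0,1,\ldots}$ be a sequence of normally distributed random variables with mean 0 and covariance 1; then with probability one the sequence

$$\Big(\frac{|X_k|}{\sqrt{\log k}}\Big)_{k=2,3,\ldots}$$

is bounded.

Proof. Let c be a fixed positive number; then

$$P(|X_k| \ge c) = \frac{2}{\sqrt{2\pi}} \int_c^{+\infty} e^{-x^2/2}\,dx \le \frac{2}{\sqrt{2\pi}} \int_c^{+\infty} \frac{x}{c}\,e^{-x^2/2}\,dx = \frac{2}{c\sqrt{2\pi}}\,e^{-c^2/2}$$

From this we obtain that for $c > \sqrt{2}$

$$\sum_{k=2}^{+\infty} P\big(|X_k| \ge c\sqrt{\log k}\big) \le \frac{2}{\sqrt{2\pi}} \sum_{k=2}^{+\infty} \frac{1}{c\sqrt{\log k}}\,k^{-c^2/2} < +\infty$$

Therefore, if $c > \sqrt{2}$ then, with probability one, $|X_k| \ge c\sqrt{\log k}$ only for a finite number of k.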
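The Gaussian tail estimate used in the proof can be compared with the exact tail probability, which is available through the complementary error function as $P(|X| \ge c) = \operatorname{erfc}(c/\sqrt{2})$. The short sketch below does this for a few illustrative values of c; the function names are assumptions.

```python
import math

def two_sided_tail(c):
    """P(|X| >= c) for a standard normal X, via the complementary error function."""
    return math.erfc(c / math.sqrt(2.0))

def tail_bound(c):
    """Upper bound 2 / (c * sqrt(2 * pi)) * exp(-c**2 / 2) obtained in the proof."""
    return 2.0 / (c * math.sqrt(2.0 * math.pi)) * math.exp(-c * c / 2.0)

# (c, exact tail, bound) for a few sample thresholds
checks = [(c, two_sided_tail(c), tail_bound(c)) for c in (0.5, 1.0, 2.0, 4.0)]
bound_holds = all(tail <= bound for _, tail, bound in checks)
```

The bound is crude for small c but becomes sharp as c grows, which is the regime that matters for the Borel-Cantelli argument.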

Proof of Theorem 1. Lemma 1 and Lemma 2 imply that the series

$$\sum_{k=2}^{+\infty} X_k(\omega)\,S_k(t)$$

is, for almost all ω, uniformly convergent on the interval [0, 1]. Since the functions $S_k$ are continuous and $S_k(0) = 0$ for k = 0, 1, ..., the constructed process satisfies properties 1) and 4).

To prove that conditions 2') and 3') are satisfied, it is sufficient to show that the random vector $(W(t_1), \ldots, W(t_n))$ is normally distributed with zero mean vector and the positive-definite covariance matrix $Q = (t_i \wedge t_j)$, i, j = 1, ..., n. Let us introduce the functions $I_t$: the indicators of the intervals $[0, t] \subset [0, 1]$. Clearly, for i, j = 1, ..., n,

$$t_i \wedge t_j = \int_0^1 I_{t_i}(s)\,I_{t_j}(s)\,ds = \langle I_{t_i}, I_{t_j} \rangle$$

Problem 1. Show that the matrix $(t_i \wedge t_j)$ is positive definite. Hint:

$$\sum_{i,j=1}^{n} x_i x_j\,t_i \wedge t_j = \sum_{i,j=1}^{n} x_i x_j \langle I_{t_i}, I_{t_j} \rangle = \int_0^1 \big(x_1 I_{t_1}(s) + \ldots + x_n I_{t_n}(s)\big)^2\,ds$$

If $\sum_{i,j=1}^{n} x_i x_j\,t_i \wedge t_j = 0$, prove, by induction, that $x_n, x_{n-1}, \ldots, x_1$ are equal to zero.

Applying Parseval's identity we obtain that

$$t_i \wedge t_j = \sum_{k=0}^{+\infty} \langle I_{t_i}, h_k \rangle \langle I_{t_j}, h_k \rangle = \sum_{k=0}^{+\infty} S_k(t_i)\,S_k(t_j)$$

Let us define $W_N(t) = \sum_{k=0}^{N} X_k S_k(t)$; then $m_N = E(W_N(t)) = 0$ and the covariance matrices $Q_N$ of $(W_N(t_1), \ldots, W_N(t_n))$ are, according to straightforward calculations, equal to

$$Q_N = \Big(\sum_{k=0}^{N} S_k(t_i)\,S_k(t_j)\Big)_{i,j=1,\ldots,n}$$

It is also clear that the random vectors $(W_N(t_1), \ldots, W_N(t_n))$ are normally distributed (see Section 1.5); therefore the limit random vector $(W(t_1), \ldots, W(t_n))$ is also normally distributed with the mean vector zero and the covariance matrix $Q = (t_i \wedge t_j)_{i,j=1,\ldots,n}$ (see Problem 6, Section 1.5). This completes the proof.
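The construction lends itself to a direct numerical experiment. The sketch below uses a closed form for the Schauder functions of the standard dyadic Haar system, checks the Parseval relation $\sum_k S_k(s)S_k(t) = s \wedge t$ at dyadic points (where the first $2^m$ terms already give the exact value for points of the form $j/2^m$), and draws one truncated path $W_N$. The function names, the truncation levels and the random seed are assumptions made for illustration only.

```python
import random

def schauder(k, t):
    """S_k(t): integral over [0, t] of the k-th dyadic Haar function, for t in [0, 1]."""
    if k == 0:
        return t                          # h_0 = 1
    n = k.bit_length() - 1                # 2**n <= k < 2**(n+1)
    left = (k - 2 ** n) / 2 ** n
    mid = left + 2.0 ** -(n + 1)
    right = left + 2.0 ** -n
    if t <= left or t >= right:
        return 0.0
    height = 2.0 ** (n / 2)
    return height * (t - left) if t <= mid else height * (right - t)

def covariance_partial(s, t, N):
    """Partial sum over k < N of S_k(s) * S_k(t)."""
    return sum(schauder(k, s) * schauder(k, t) for k in range(N))

def sample_path(N, grid, seed=0):
    """One realization of the truncated series W_N on `grid` (seeded for reproducibility)."""
    rng = random.Random(seed)
    X = [rng.gauss(0.0, 1.0) for _ in range(N)]
    return [sum(X[k] * schauder(k, t) for k in range(N)) for t in grid]

pts = [j / 16 for j in range(17)]
max_err = max(abs(covariance_partial(s, t, 16) - min(s, t)) for s in pts for t in pts)
path = sample_path(64, pts)
```

Increasing N refines the path between dyadic points without changing its values at the coarser dyadic points, since the added Schauder functions vanish there.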

Problem 2. Construct a Wiener process on $[0, +\infty)$. Hint: Construct a countable family of independent Wiener processes on [0, 1] and piece them together.

4.3. "White noise", stochastic equations and Wiener's integral

In this section, we give an elementary introduction to "white noise", stochastic equations and the Wiener integral theory.

In applications, the following definition of "white noise" is used: "white noise" is a stationary process whose spectral density is a constant function on the whole real line. Since spectral measures (of stationary processes) are finite, the above definition is inconsistent. Nevertheless, we show that by using a Wiener process a good approximation of "white noise" can be constructed. Moreover, starting from this approximation, we give a physical interpretation of the stochastic equation.

Let $W_1$ and $W_2$ be two independent Wiener processes on $[0, +\infty)$. A Wiener process W on the whole line $\mathbb{R}$ is defined as

$$W(t) = \begin{cases} W_1(t) & \text{if } t \ge 0 \\ W_2(-t) & \text{if } t \le 0 \end{cases}$$

Let W be a fixed Wiener process on $\mathbb{R}$ and define for every number h > 0 a new process

$$\Delta_h(t) = \frac{1}{h}\,(W(t + h) - W(t)), \qquad t \in \mathbb{R}$$

It is easy to see that the process $\Delta_h$ is a Gaussian process (see the definition of a Wiener process) and that $E(\Delta_h(t)) = 0$, $t \in \mathbb{R}$. The proposition below shows that this process can be treated as an approximation of "white noise".

Proposition 1. For any h > 0 and all $t, s \in \mathbb{R}$, $E(\Delta_h(t)\Delta_h(s)) = r_h(t - s)$, where

$$r_h(u) = \begin{cases} \dfrac{1}{h}\Big(1 - \dfrac{|u|}{h}\Big) & \text{if } |u| \le h \\[2mm] 0 & \text{if } |u| \ge h \end{cases} \qquad (1)$$

Thus $\Delta_h$ is a stationary process (the covariance function depends on t - s only). If $p_h$ is the spectral density of $r_h$:

$$p_h(x) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{-iux}\,r_h(u)\,du, \qquad x \in \mathbb{R}$$

then

$$p_h(x) = \frac{1}{\pi}\,\frac{1 - \cos hx}{(hx)^2},\ x \ne 0; \qquad p_h(0) = \frac{1}{2\pi} \qquad (2)$$

and, consequently, $p_h \to 1/2\pi$ uniformly on every finite interval as $h \to 0$.

Proof. Both formulas are a consequence of straightforward calculations. Formula (1) follows from the easily checked relation $E(W(t) - W(s))^2 = |t - s|$, $t, s \in \mathbb{R}$, and (2) by integrating by parts.
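The two formulas of the proposition can be cross-checked numerically: for the triangular covariance $r_h(u) = (1/h)(1 - |u|/h)$ on $|u| \le h$ (the covariance of the difference quotients of a Wiener process), the Fourier integral should reproduce the closed form $(1 - \cos hx)/(\pi (hx)^2)$, and the latter should approach $1/2\pi$ for small h. The sketch below performs both checks with a midpoint rule; the function names, the step count and the sample values of h and x are assumptions.

```python
import math

def r(h, u):
    """Triangular covariance r_h(u) = (1/h) * (1 - |u|/h) for |u| <= h, and 0 otherwise."""
    return (1.0 - abs(u) / h) / h if abs(u) <= h else 0.0

def p_formula(h, x):
    """Closed-form spectral density: (1 - cos(hx)) / (pi * (hx)^2), with value 1/(2 pi) at x = 0."""
    if x == 0.0:
        return 1.0 / (2.0 * math.pi)
    return (1.0 - math.cos(h * x)) / (math.pi * (h * x) ** 2)

def p_numeric(h, x, n=20000):
    """Midpoint-rule value of (1 / 2 pi) * integral over [-h, h] of cos(u x) * r_h(u) du.

    The sine part of exp(-iux) integrates to zero because r_h is even.
    """
    du = 2.0 * h / n
    total = 0.0
    for i in range(n):
        u = -h + (i + 0.5) * du
        total += math.cos(u * x) * r(h, u)
    return total * du / (2.0 * math.pi)

quadrature_gap = abs(p_numeric(0.5, 3.0) - p_formula(0.5, 3.0))
flatness_gap = abs(p_formula(0.01, 3.0) - 1.0 / (2.0 * math.pi))
```

The second quantity illustrates the sense in which the spectrum becomes flat: for fixed x the density tends to the constant $1/2\pi$ as h decreases.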

Remark 1. The above proposition is also a justification of the statement: "white noise is a derivative of a Wiener process", because if $h \downarrow 0$ then (formally) $\Delta_h \to (d/dt)W(t)$. In fact, the trajectories of a Wiener process are nowhere differentiable functions (see Refs [2, 4]) and, therefore, the sequence $\Delta_h$, $h \downarrow 0$, is divergent. A related result will be proved in the next chapter (Section 5.2).

Let us now consider a mechanical system described by a linear differential equation

$$x^{(n)} + a_1 x^{(n-1)} + \ldots + a_{n-1}\dot{x} + a_n x = b\,\xi(t) \qquad (3)$$

with initial conditions $x^{(k)}(0) = x_{k+1}$, k = 0, ..., n - 1, and an outer stochastic force $b\xi$. If $\xi(t)$ is a "white noise" process, then, instead of considering the "formal" equation (3), it is reasonable to deal with its approximate version:

$$x^{(n)} + a_1 x^{(n-1)} + \ldots + a_n x = b\,\Delta_h \qquad (4)$$

$$x^{(k)}(0) = x_{k+1}, \qquad k = 0, 1, \ldots, n - 1$$

where the right-hand side is well defined for every ω. Equation (4) is equivalent to the system of equations

$$\dot{x}_1 = x_2,\quad \ldots,\quad \dot{x}_{n-1} = x_n,\quad \dot{x}_n = -a_n x_1 - \ldots - a_1 x_n + b\,\Delta_h \qquad (5)$$

with the initial conditions $x_k(0) = x_k$, k = 1, ..., n. We want to generalize the system (5) slightly. To do this, let us introduce independent Wiener processes $W_1, W_2, \ldots, W_n$ defined for $t \in \mathbb{R}$ and let $\Delta_h^1, \ldots, \Delta_h^n$ be the corresponding almost "white noise" processes: $\Delta_h^i(t) = (1/h)(W_i(t + h) - W_i(t))$. From now on, W(t) and $\Delta_h(t)$ will denote the (column) processes with components $W_i(t)$ and $\Delta_h^i(t)$, i = 1, ..., n, respectively. Let A and B be two n × n matrices; then the system (5) is a special case of the following system:

$$\dot{x} = Ax + B\Delta_h, \qquad x(0) = x \qquad (6)$$

Proposition 2. Let $x_h(t)$, $t \ge 0$, be the solution of Eq. (6). If $h \downarrow 0$, then the stochastic processes $(x_h(t))_{t \ge 0}$ tend, for almost all ω, uniformly on finite intervals, to a stochastic process $(X(t))_{t \ge 0}$ which satisfies the stochastic integral equation

$$X(t) = x + \int_0^t AX(s)\,ds + BW(t), \qquad t \ge 0 \qquad (7)$$

Equation (7) is sometimes written as a stochastic differential equation

$$dX(t) = AX(t)\,dt + B\,dW(t), \qquad X(0) = x \qquad (8)$$

The proof will be an easy application of the following important lemma:

Lemma 1. Let $B(\cdot)$ be a continuously differentiable function from $[0, +\infty)$ into the space of n × n matrices. Then

$$\int_0^t B(s)\Delta_h(s)\,ds \to B(t)W(t) - \int_0^t \dot{B}(s)W(s)\,ds, \qquad \text{as } h \downarrow 0 \qquad (9)$$

for almost all ω and uniformly with respect to t from compact intervals $\subset [0, +\infty)$.

Proof. The proof of the lemma follows from the identity

$$\int_0^t B(s)\Delta_h(s)\,ds = \int_h^t \frac{1}{h}\big(B(s - h) - B(s)\big)W(s)\,ds - \frac{1}{h}\int_0^h B(s)W(s)\,ds + \frac{1}{h}\int_t^{t+h} B(s - h)W(s)\,ds, \qquad t, h > 0$$

Remark 2. If $B(\cdot)$ is a continuously differentiable function, then the Stieltjes integral satisfies

$$\int_0^t B(s)\,dW(s) = B(t)W(t) - \int_0^t \dot{B}(s)W(s)\,ds \qquad (10)$$

and we can write:

$$\int_0^t B(s)\Delta_h(s)\,ds \to \int_0^t B(s)\,dW(s), \qquad t \ge 0$$
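The convergence of $\int_0^t B(s)\Delta_h(s)\,ds$ to the integration-by-parts expression $B(t)W(t) - \int_0^t \dot{B}(s)W(s)\,ds$ is a pathwise statement, so it can be tried out on any fixed continuous function w with w(0) = 0 in place of a trajectory. The sketch below uses a deterministic stand-in path and a scalar B; both choices, as well as the function names and the quadrature step count, are assumptions made purely for illustration and are not taken from the lectures.

```python
import math

def w(s):
    """Deterministic continuous stand-in for a trajectory, with w(0) = 0 (assumed example)."""
    return sum(2.0 ** (-k / 2) * math.sin(2 ** k * s) for k in range(6))

def B(s):
    return 1.0 + s * s          # an arbitrary continuously differentiable scalar B

def dB(s):
    return 2.0 * s              # its derivative

def midpoint(g, a, b, n=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    step = (b - a) / n
    return step * sum(g(a + (i + 0.5) * step) for i in range(n))

def lhs(h, t=1.0):
    """Integral over [0, t] of B(s) * (w(s + h) - w(s)) / h."""
    return midpoint(lambda s: B(s) * (w(s + h) - w(s)) / h, 0.0, t)

def rhs(t=1.0):
    """B(t) w(t) minus the integral over [0, t] of B'(s) w(s)."""
    return B(t) * w(t) - midpoint(lambda s: dB(s) * w(s), 0.0, t)

errors = {h: abs(lhs(h) - rhs()) for h in (0.1, 0.01, 0.001)}
```

For a genuine Wiener trajectory the same computation applies, with the path sampled on a grid fine enough to resolve increments of length h.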

Paley, Wiener and Zygmund [12] extended the definition of the stochastic integral $\int_0^t B(s)\,dW(s)$ to all functions $B(\cdot)$ such that $\int_0^t |B(s)|^2\,ds < +\infty$. Their approach is presented in the following problem.

Problem 1. Assume n = 1. Then the mapping from $L^2([0, 1])$ into $L^2(\Omega, \mathcal{F}, P)$ given by $b(\cdot) \to \int_0^1 b(s)\,dW(s)$, defined for all continuously differentiable b by (10), is an isometry: $E\big(\int_0^1 b(s)\,dW(s)\big)^2 = \int_0^1 b^2(s)\,ds$, and therefore can be extended to an isometry on the whole $L^2([0, 1])$. Show this. Hint: for b with b(1) = 0, formula (10) gives $E\big(\int_0^1 b(s)\,dW(s)\big)^2 = \int_0^1\!\!\int_0^1 (t \wedge s)\,\dot{b}(t)\dot{b}(s)\,dt\,ds$; in general the boundary term b(1)W(1) contributes as well.

The integral $\int_0^1 b(s)\,dW(s)$ defined in Problem 1 above is sometimes called Wiener's integral. A different construction of Wiener's integral will be given in the next chapter.

Proof of Proposition 2. Let us start from the observation that explicit formulas for the solutions of (6) and (7) can be easily derived. Namely, for all $t \ge 0$ we have:

$$x_h(t) = e^{At}x + e^{At}\int_0^t e^{-As}B\Delta_h(s)\,ds$$

and

$$X(t) = (x + BW(t)) + A\int_0^t e^{A(t-s)}(x + BW(s))\,ds = e^{At}x + Ae^{At}\int_0^t e^{-As}BW(s)\,ds + BW(t)$$

It is now sufficient to apply Lemma 1.
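The explicit formula for X(t) can be compared numerically with a direct time-stepping of the integral equation along the same sampled Wiener path. The sketch below treats the scalar case with assumed coefficients a = -1, b = 0.5 and initial value 1; the Euler scheme, the quadrature rule and the random seed are illustrative choices and not part of the lectures.

```python
import math
import random

def simulate(a=-1.0, b=0.5, x0=1.0, T=1.0, n=2000, seed=0):
    """Return (Euler value, explicit-formula value) of X(T) for dX = a X dt + b dW."""
    rng = random.Random(seed)
    dt = T / n
    W = [0.0]
    for _ in range(n):
        W.append(W[-1] + rng.gauss(0.0, math.sqrt(dt)))   # sampled Wiener path

    # Euler scheme for X(t) = x0 + int_0^t a X ds + b W(t)
    X = x0
    for k in range(n):
        X += a * X * dt + b * (W[k + 1] - W[k])

    # Explicit formula: e^{aT} x0 + a e^{aT} int_0^T e^{-as} b W(s) ds + b W(T),
    # with the integral approximated cell by cell (midpoint weight, averaged path value).
    integral = sum(math.exp(-a * (k + 0.5) * dt) * b * 0.5 * (W[k] + W[k + 1]) * dt
                   for k in range(n))
    explicit = math.exp(a * T) * x0 + a * math.exp(a * T) * integral + b * W[n]
    return X, explicit

euler_value, explicit_value = simulate()
gap = abs(euler_value - explicit_value)
```

Both numbers are discretizations of the same pathwise quantity, so the gap shrinks as the number of steps n grows.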

Remark 3. By exactly the same method a more general case can be treated. If, e.g., in Eq. (6) the matrix B is a continuously differentiable function of t, then Proposition 2 holds but Eq. (7) has to be changed:

$$X(t) = x + \int_0^t AX(s)\,ds + \int_0^t B(s)\,dW(s) \qquad (7')$$

Problem 2. Generalize Proposition 2 to the "time-dependent" case: A and B depend on t. Consider also a non-linear case: $\dot{x} = A(x) + B\Delta_h$, with $A(\cdot)$ a non-linear mapping from $\mathbb{R}^n$ into $\mathbb{R}^n$.

Remark 4. Relations between ordinary differential equations and stochastic differential equations are studied in Ref. [13].


5. ITO'S STOCHASTIC INTEGRAL

5.1. Introduction

In this chapter, following Ito, we define a stochastic integral $\int_0^t f(s)\,dW(s)$ for a wide class of stochastic processes $(f(t))_{t \ge 0}$ (not only for functions $f \in L^2([0, 1])$ as we did in Section 4.3). Such an extension is needed if one wants to deal with stochastic integral equations

$$X(t) = x + \int_0^t a(X(s))\,ds + \int_0^t b(X(s))\,dW(s), \qquad t \ge 0$$

(in differential form:

$$dX(t) = a(X(t))\,dt + b(X(t))\,dW(t))$$

where the function b does depend on x (compare equations (7) and (7') in Section 4.3).

The integral $\int_0^t f(s)\,dW(s)$ cannot be defined (even for continuous processes (f(t))) as a Stieltjes or Lebesgue-Stieltjes integral because of the following lemma.

Lemma 1. The trajectories of a Wiener process are (a.s.) not rectifiable in any time interval of positive length; therefore, they are not of bounded variation.

Proof. Let

$$\ell_n = \sum_{k=1}^{2^n} |W((k-1)/2^n) - W(k/2^n)|, \qquad n = 1, 2, \ldots$$

Then $E(e^{-\ell_n}) = \big(E(e^{-|W(t)|})\big)^{1/t}$ for $t = 1/2^n$, because the Wiener process (W(t)) has independent increments (Property 2). Therefore

$$E(e^{-\ell_n}) = \Big[\frac{2}{\sqrt{2\pi t}}\int_0^{+\infty} e^{-u}\,e^{-u^2/2t}\,du\Big]^{1/t}$$

Using the estimate $e^{-u} \le 1 - u + u^2/2$, $u \ge 0$, we obtain $E(e^{-\ell_n}) \to 0$, and since $(\ell_n)$ is an increasing sequence it tends to $+\infty$ (a.s.).
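The growth of the dyadic variation sums can be observed on a simulated path, and the estimate $e^{-u} \le 1 - u + u^2/2$ turns into an explicit bound $E(e^{-|W(t)|}) \le 1 - \sqrt{2t/\pi} + t/2$, since $E|W(t)| = \sqrt{2t/\pi}$ and $E\,W(t)^2 = t$. The sketch below computes both; the grid depth, the seed and the function names are assumptions for illustration.

```python
import math
import random

def dyadic_variations(levels=12, seed=1):
    """Sums of |increments| of one sampled path over dyadic partitions of [0, 1], levels 1..levels."""
    rng = random.Random(seed)
    n = 2 ** levels
    W = [0.0]
    for _ in range(n):
        W.append(W[-1] + rng.gauss(0.0, math.sqrt(1.0 / n)))
    out = []
    for m in range(1, levels + 1):
        step = n // 2 ** m                      # finest-grid steps per cell at level m
        out.append(sum(abs(W[(k + 1) * step] - W[k * step]) for k in range(2 ** m)))
    return out

def exp_bound(t):
    """Upper bound 1 - sqrt(2t/pi) + t/2 for E exp(-|W(t)|)."""
    return 1.0 - math.sqrt(2.0 * t / math.pi) + t / 2.0

ell = dyadic_variations()
# Bound on E exp(-l_n) for n = 10, i.e. exp_bound(t) ** (1/t) with t = 2**-10.
bound_n10 = exp_bound(2.0 ** -10) ** (2 ** 10)
```

The sums are non-decreasing in the level by the triangle inequality, and on a typical path they grow roughly like $2^{n/2}$.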

The other possibility is to use the explicit Levy-Ciesielski formula for a Wiener process (see Theorem 1, Section 4.2) and define

$$\int_0^t f(s)\,dW(s) = \sum_{k=0}^{+\infty} X_k \int_0^t f(s)h_k(s)\,ds \qquad (1)$$

but it is difficult, in this case, to discover the appropriate class of integrands f(t).

Problem 1. Show that if $f \in L^2([0, 1])$, then for fixed t the series in (1) is (a.s.) convergent and is equal to Wiener's integral of Section 4.3. Hint: Use Remark 2, Section 2.9.

It follows from the paper by Ito and Nisio [14] that if $f \in L^2([0, 1])$ then the series (1) converges (a.s.) uniformly on [0, 1].

5.2. Construction of the Ito stochastic integral

The Ito definition of the stochastic integral is valid for the so-called non-anticipating Brownian functionals f. To define them, let $(\mathcal{F}_t)_{t \ge 0}$ be an increasing family of σ-fields such that

1) For every $t \ge 0$, W(t) is $\mathcal{F}_t$-measurable;

2) The σ-fields $\mathcal{F}_t$ and $\mathcal{F}_t^+ = \sigma(W(s) - W(t):\ s \ge t)$ are independent.

For instance, using the standard "method of π-systems" (see Section 1.1), it is possible to prove that the σ-fields $\mathcal{F}_t = \sigma(W(s);\ s \le t)$ satisfy 1) and 2).

A family $(f(t))_{t \ge 0}$ of real-valued random variables is said to be a non-anticipating Brownian functional if

1) The function $f(t, \omega)$, $(t, \omega) \in [0, +\infty) \times \Omega$, is $\mathcal{B}([0, +\infty)) \times \mathcal{F}$-measurable;

2) f(t) is $\mathcal{F}_t$-measurable, $t \ge 0$;

3) $P\big(\int_0^t f^2(s)\,ds < +\infty,\ t \ge 0\big) = 1$.

($(\Omega, \mathcal{F}, P)$ is the basic probability space.)

First, we define the stochastic integral $\int_0^t f\,dW$ for $t \in [0, 1]$. Let us denote by $\mathcal{N}$ and $\mathcal{N}_c$ the linear spaces of all non-anticipating (on [0, 1]) Brownian functionals and of all non-anticipating Brownian functionals continuous on [0, 1] (for almost all ω), and define

$$|f|_2 = \Big(\int_0^1 f^2(s)\,ds\Big)^{1/2} \quad \text{for } f \in \mathcal{N}$$

$$|f|_c = \sup(|f(s)|;\ 0 \le s \le 1) \quad \text{for } f \in \mathcal{N}_c$$

The linear spaces $\mathcal{N}$, $\mathcal{N}_c$ can be treated as normed spaces with norms:

$$\|f\|_2 = E\Big(\frac{|f|_2}{1 + |f|_2}\Big),\ f \in \mathcal{N}; \qquad \|f\|_c = E\Big(\frac{|f|_c}{1 + |f|_c}\Big),\ f \in \mathcal{N}_c$$

Problem 2. Show that the spaces $\mathcal{N}$, $\mathcal{N}_c$ are complete (the so-called F-spaces).

Let us denote by $\mathcal{N}_s$ the subspace of $\mathcal{N}$ of all simple Brownian functionals. A process $f \in \mathcal{N}$ is said to be simple if there exist numbers $0 = t_0 \le t_1 \le \ldots \le t_n = 1$ such that

$$f(t) = f(t_k) \quad \text{for } t_k \le t < t_{k+1},\ k = 0, \ldots, n - 1$$

Problem 3. Show that the subspace $\mathcal{N}_s$ is dense in $\mathcal{N}$.

If $f \in \mathcal{N}_s$, Ito's stochastic integral is defined by the formula

$$\int_0^t f(s)\,dW(s) = \sum_{t_{k+1} \le t} f(t_k)\big(W(t_{k+1}) - W(t_k)\big) + f(t^*)\big(W(t) - W(t^*)\big)$$

where $t^* = \max(t_k:\ t_k \le t)$, $t \in [0, 1]$.

To extend the above definition to the whole $\mathcal{N}$, it is convenient to consider the stochastic integral as an operator:

$$I: \mathcal{N}_s \to \mathcal{N}_c, \qquad If(t) = \int_0^t f(s)\,dW(s), \quad t \in [0, 1]$$
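For a simple functional the Ito integral is a finite sum, which makes it easy to compute on a sampled path. The sketch below takes the simple functional equal to $W(t_k)$ on each cell $[t_k, t_{k+1})$ of a uniform grid (it uses only past values of the path, so it is non-anticipating) and compares the resulting sum with the algebraic identity $\sum_k W(t_k)\,\Delta W_k = \tfrac12\big(W(1)^2 - \sum_k (\Delta W_k)^2\big)$. The grid size, the seed and the function name `ito_simple` are assumptions for illustration.

```python
import math
import random

def ito_simple(f_vals, W):
    """Running Ito sums I[j] = sum over k < j of f(t_k) * (W(t_{k+1}) - W(t_k))."""
    I = [0.0]
    for k in range(len(W) - 1):
        I.append(I[-1] + f_vals[k] * (W[k + 1] - W[k]))   # f is evaluated at the left endpoint
    return I

rng = random.Random(2)
n = 1000
W = [0.0]
for _ in range(n):
    W.append(W[-1] + rng.gauss(0.0, math.sqrt(1.0 / n)))

I = ito_simple(W[:-1], W)                                 # f(t) = W(t_k) on [t_k, t_{k+1})
quad = sum((W[k + 1] - W[k]) ** 2 for k in range(n))      # sum of squared increments
identity_gap = abs(I[-1] - 0.5 * (W[-1] ** 2 - quad))
```

The left-endpoint evaluation is what distinguishes this sum from a Stieltjes-type sum with arbitrary evaluation points; with other evaluation points the limit would in general be different.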

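We show that I is a continuous (linear) operator. This follows from the following two lemmas:

Lemma 1. If $f \in \mathcal{N}_s$ and $\alpha \in \mathbb{R}$, then the process

$$X_t = \exp\Big(\alpha\int_0^t f\,dW - \frac{\alpha^2}{2}\int_0^t f^2\,ds\Big)$$

is an $(\mathcal{F}_t)$-martingale.

Proof. We prove the lemma for $f \equiv 1$, because the general case then follows directly by induction. Let us remark that for t > s and u = t - s

$$E\Big(\exp\big(\alpha(W_t - W_s) - \tfrac{1}{2}\alpha^2(t - s)\big)\Big) = \frac{1}{\sqrt{2\pi u}}\int_{-\infty}^{+\infty} \exp\Big(\alpha x - \frac{1}{2}\alpha^2 u - \frac{x^2}{2u}\Big)\,dx = \frac{1}{\sqrt{2\pi u}}\int_{-\infty}^{+\infty} \exp\Big(-\frac{1}{2}\,\frac{(x - \alpha u)^2}{u}\Big)\,dx = 1$$

Therefore, for $A \in \mathcal{F}_s$,

$$E(I_A(X_t - X_s)) = E\Big(I_A \exp\big(\alpha W_s - \tfrac{\alpha^2}{2}s\big)\Big(\exp\big(\alpha(W_t - W_s) - \tfrac{\alpha^2}{2}u\big) - 1\Big)\Big)$$

$$= E\Big(I_A \exp\big(\alpha W_s - \tfrac{\alpha^2}{2}s\big)\Big)\,E\Big(\exp\big(\alpha(W_t - W_s) - \tfrac{\alpha^2}{2}u\big) - 1\Big) = 0$$

because the random variables $I_A \exp(\alpha W_s - \tfrac{\alpha^2}{2}s)$ and $\exp(\alpha(W_t - W_s) - \tfrac{\alpha^2}{2}(t - s))$ are independent, being $\mathcal{F}_s$- and $\mathcal{F}_s^+$-measurable, respectively. This completes the proof.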
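The key computation in the proof is that $E\exp(\alpha Z - \alpha^2 u/2) = 1$ for $Z \sim N(0, u)$. It can be confirmed by numerical integration of the Gaussian integral, as in the sketch below; the quadrature rule, the integration window of twelve standard deviations around the shifted mean, and the function name are assumptions for illustration.

```python
import math

def exp_moment(alpha, u, n=40000, width=12.0):
    """Midpoint-rule value of E exp(alpha * Z - alpha**2 * u / 2) for Z ~ N(0, u)."""
    sd = math.sqrt(u)
    lo = alpha * u - width * sd        # the integrand is a Gaussian bump centred at alpha * u
    hi = alpha * u + width * sd
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx
        total += math.exp(alpha * x - alpha * alpha * u / 2.0 - x * x / (2.0 * u))
    return total * dx / math.sqrt(2.0 * math.pi * u)

values = [exp_moment(alpha, u) for alpha, u in ((1.5, 0.7), (-2.0, 3.0), (0.3, 0.01))]
```

Completing the square in the exponent, as done in the proof, is what centres the integrand at $\alpha u$ and makes the window choice natural.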

Lemma 2. If α, β are positive numbers and $f \in \mathcal{N}_s$, then

$$P\Big(\max_{0 \le t \le 1}\Big(\int_0^t f\,dW - \frac{\alpha}{2}\int_0^t f^2\,ds\Big) > \beta\Big) \le e^{-\alpha\beta} \qquad (1)$$

Proof. Let $t_0 \le t_1 \le \ldots \le t_n \le 1$ be an arbitrary sequence of positive numbers; then, by virtue of Lemma 1, the sequence $(X_{t_k})_{k=1,\ldots,n}$ is an $(\mathcal{F}_{t_k})_{k=1,\ldots,n}$-martingale and consequently, by the Doob-Kolmogorov inequality (Problem 1, Section 2.8),

$$P\Big(\max_{1 \le k \le n}\Big(\int_0^{t_k} f\,dW - \frac{\alpha}{2}\int_0^{t_k} f^2\,ds\Big) \ge \beta\Big) = P\Big(\max_{1 \le k \le n} X_{t_k} \ge e^{\alpha\beta}\Big) \le E(X_{t_n})\,e^{-\alpha\beta} = e^{-\alpha\beta}$$

Since the sequence $(t_k)_{k=1,\ldots,n}$ was arbitrary, the required result follows.

Since $\int_0^t f^2\,ds \le \int_0^1 f^2\,ds$ for $t \le 1$, from Lemma 2 (applied to f and to -f) we obtain the

Fundamental inequality:

$$P\Big(|If|_c > \beta + \frac{\alpha}{2}\,|f|_2^2\Big) \le 2e^{-\alpha\beta}$$

Corollary 1. Assume that $f_n \in \mathcal{N}_s$ and that $\|f_n\|_2 \to 0$ as $n \to +\infty$. This is equivalent to the statement that for every ε > 0, $P(|f_n|_2 > \varepsilon) \to 0$. But then the fundamental inequality implies $P(|If_n|_c > \varepsilon) \to 0$ for every ε > 0 and, therefore, $\|If_n\|_c \to 0$. This proves that the operator I is continuous.

Now we are able to finish the definition of the Ito integral. Let I also denote the unique continuous extension of the operator I to the whole $\mathcal{N}$. Such an extension exists because the operator I is continuous on $\mathcal{N}_s$. Since $\mathcal{N}_s$ is dense in $\mathcal{N}$, the extension is uniquely determined. If $f \in \mathcal{N}$, then we define

$$\int_0^t f(s)\,dW(s) = If(t), \qquad t \in [0, 1]$$

Since the operator I maps into $\mathcal{N}_c$, the stochastic integral defined above for all $f \in \mathcal{N}$ is a continuous function (a.s.) of the parameter t.

Problem 4. Extend the definition of the stochastic integral to all $t \ge 0$.

Remark 1. For further information on stochastic integrals as well as on stochastic differential equations we refer to Refs [4, 15].

REFERENCES

[1] DOOB, J.L., What is a martingale? Amer. Math. Month. 78 (1971) 451-463.
[2] LAMPERTI, J., Probability, Benjamin, New York and Amsterdam (1966).
[3] MEYER, P.A., Probability and Potentials, Blaisdell Publishing Company (1966).
[4] McKEAN, H.P., Jr., Stochastic Integrals, Academic Press, New York and London (1969).
[5] HALMOS, P.R., Measure Theory, Van Nostrand, New York (1950).
[6] FELLER, W., An Introduction to Probability Theory and its Applications, Wiley, New York, 1 (1968); 2 (1971).
[7] ASTROM, K.J., Introduction to Stochastic Control Theory, Academic Press, New York (1970).
[8] KUSHNER, H., Introduction to Stochastic Control, Holt, New York (1971).
[9] COURREGE, P., PRIOURET, P., Temps d'arrêt d'une fonction aléatoire: Relations d'equivalence associées et propriétés de decomposition, Publ. Inst. Statist. Univ. Paris 14 (1965) 245-274.
[10] DYNKIN, E.B., YUSHKEVICH, A., Markov Processes: Theorems and Problems, Plenum, New York (1969).
[11] CHATTERJI, S.D., Martingale convergence and the Radon-Nikodym theorem, Math. Scand. 22 (1968) 21-41.
[12] PALEY, R.E.A.C., WIENER, N., ZYGMUND, A., Note on random functions, Math. Z. 37 (1933) 647-668.
[13] WONG, E., ZAKAI, M., On the relation between ordinary and stochastic differential equations, Int. J. Eng. Sci. 3 (1965) 213-229.
[14] ITO, K., NISIO, M., On the convergence of sums of independent Banach space valued random variables, Osaka J. Math. 5 (1968) 35-48.
[15] LIPCER, P.S., SIRJAJEV, A.N., Statistic of Random Processes, Science, Moscow (1974) in Russian.

SE C R E TA R IA T OF SEMINAR

ORGANIZING

R . Conti

L. Markus

C. Olech

EDITOR

J .W . W eil

COMMITTEE

Mathematics Institute "U .D in i" ,U niversity of F loren ce ,Italy

M athem atics Institute,U niversity of W arwick,Coventry, United KingdomandSchool o f M athem atics, U niversity o f Minnesota, M inneapolis, USA

Institute o f M athematics,P olish Academ y of Sciences,W arsaw, Poland

D ivision o f Publications, IAEA, Vienna, A ustria


The following conversion table is provided for the convenience of readers and to encourage the use of SI units.

FACTORS FOR CONVERTING UNITS TO SI SYSTEM EQUIVALENTS*

* SI base units are the metre (m), kilogram (kg), second (s), ampere (A), kelvin (K), candela (cd) and mole (mol). [For further information, see International Standards ISO 1000 (1973), and ISO 31/0 (1974) and its several parts]

Factors are given exactly or to a maximum of 4 significant figures.

Multiply                              by                          to obtain

Mass

pound mass (avoirdupois)              1 lbm = 4.536 × 10⁻¹        kg
ounce mass (avoirdupois)              1 ozm = 2.835 × 10¹         g
ton (long) (= 2240 lbm)               1 ton = 1.016 × 10³         kg
ton (short) (= 2000 lbm)              1 short ton = 9.072 × 10²   kg
tonne (= metric ton)                  1 t = 1.00 × 10³            kg

Length

statute mile                          1 mile = 1.609 × 10⁰        km
yard                                  1 yd = 9.144 × 10⁻¹         m
foot                                  1 ft = 3.048 × 10⁻¹         m
inch                                  1 in = 2.54 × 10⁻²          m
mil (= 10⁻³ in)                       1 mil = 2.54 × 10⁻²         mm

Area

hectare                               1 ha = 1.00 × 10⁴           m²
(statute mile)²                       1 mile² = 2.590 × 10⁰       km²
acre                                  1 acre = 4.047 × 10³        m²
yard²                                 1 yd² = 8.361 × 10⁻¹        m²
foot²                                 1 ft² = 9.290 × 10⁻²        m²
inch²                                 1 in² = 6.452 × 10²         mm²

Volume

yard³                                 1 yd³ = 7.646 × 10⁻¹        m³
foot³                                 1 ft³ = 2.832 × 10⁻²        m³
inch³                                 1 in³ = 1.639 × 10⁴         mm³
gallon (Brit. or Imp.)                1 gal (Brit) = 4.546 × 10⁻³ m³
gallon (US liquid)                    1 gal (US) = 3.785 × 10⁻³   m³
litre                                 1 l = 1.00 × 10⁻³           m³

Force

dyne                                  1 dyn = 1.00 × 10⁻⁵         N
kilogram force                        1 kgf = 9.807 × 10⁰         N
poundal                               1 pdl = 1.383 × 10⁻¹        N
pound force (avoirdupois)             1 lbf = 4.448 × 10⁰         N
ounce force (avoirdupois)             1 ozf = 2.780 × 10⁻¹        N

Power

British thermal unit/second           1 Btu/s = 1.054 × 10³       W
calorie/second                        1 cal/s = 4.184 × 10⁰       W
foot-pound force/second               1 ft·lbf/s = 1.356 × 10⁰    W
horsepower (electric)                 1 hp = 7.46 × 10²           W
horsepower (metric) (= ps)            1 ps = 7.355 × 10²          W
horsepower (550 ft·lbf/s)             1 hp = 7.457 × 10²          W

Density

pound mass/inch³                      1 lbm/in³ = 2.768 × 10⁴     kg/m³
pound mass/foot³                      1 lbm/ft³ = 1.602 × 10¹     kg/m³

Energy

British thermal unit                  1 Btu = 1.054 × 10³         J
calorie                               1 cal = 4.184 × 10⁰         J
electron-volt                         1 eV = 1.602 × 10⁻¹⁹        J
erg                                   1 erg = 1.00 × 10⁻⁷         J
foot-pound force                      1 ft·lbf = 1.356 × 10⁰      J
kilowatt-hour                         1 kW·h = 3.60 × 10⁶         J

Pressure

newtons/metre²                        1 N/m² = 1.00               Pa
atmosphere (a)                        1 atm = 1.013 × 10⁵         Pa
bar                                   1 bar = 1.00 × 10⁵          Pa
centimetres of mercury (0°C)          1 cmHg = 1.333 × 10³        Pa
dyne/centimetre²                      1 dyn/cm² = 1.00 × 10⁻¹     Pa
feet of water (4°C)                   1 ftH2O = 2.989 × 10³       Pa
inches of mercury (0°C)               1 inHg = 3.386 × 10³        Pa
inches of water (4°C)                 1 inH2O = 2.491 × 10²       Pa
kilogram force/centimetre²            1 kgf/cm² = 9.807 × 10⁴     Pa
pound force/foot²                     1 lbf/ft² = 4.788 × 10¹     Pa
pound force/inch² (= psi) (b)         1 lbf/in² = 6.895 × 10³     Pa
torr (0°C) (= mmHg)                   1 torr = 1.333 × 10²        Pa

Velocity, acceleration

inch/second                           1 in/s = 2.54 × 10¹         mm/s
foot/second (= fps)                   1 ft/s = 3.048 × 10⁻¹       m/s
foot/minute                           1 ft/min = 5.08 × 10⁻³      m/s
mile/hour (= mph)                     1 mile/h = 4.470 × 10⁻¹     m/s
                                      1 mile/h = 1.609 × 10⁰      km/h
knot                                  1 knot = 1.852 × 10⁰        km/h
free fall, standard (= g)             = 9.807 × 10⁰               m/s²
foot/second²                          1 ft/s² = 3.048 × 10⁻¹      m/s²

Temperature, thermal conductivity, energy/area·time

Fahrenheit, degrees − 32              (°F − 32) × 5/9             °C
Rankine                               °R × 5/9                    K
                                      1 Btu·in/ft²·s·°F = 5.189 × 10²   W/m·K
                                      1 Btu/ft·s·°F = 6.226 × 10³       W/m·K
                                      1 cal/cm·s·°C = 4.184 × 10²       W/m·K
                                      1 Btu/ft²·s = 1.135 × 10⁴         W/m²
                                      1 cal/cm²·min = 6.973 × 10²       W/m²

Miscellaneous

foot³/second                          1 ft³/s = 2.832 × 10⁻²      m³/s
foot³/minute                          1 ft³/min = 4.719 × 10⁻⁴    m³/s
rad                                   rad = 1.00 × 10⁻²           J/kg
roentgen                              R = 2.580 × 10⁻⁴            C/kg
curie                                 Ci = 3.70 × 10¹⁰            disintegration/s

(a) atm abs: atmospheres absolute; atm (g): atmospheres gauge.

(b) lbf/in² (g) (= psig): gauge pressure; lbf/in² abs (= psia): absolute pressure.
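
As a worked illustration of how the table is meant to be used, the following minimal Python sketch multiplies a quantity by the tabulated factor to obtain its SI equivalent. The names `FACTORS`, `to_si` and `fahrenheit_to_celsius` are hypothetical helpers introduced here for illustration only; they are not part of the original publication, and only a small subset of the table is included.

```python
# Illustrative subset of the conversion table: unit -> (factor, SI unit).
# The dictionary and function names are assumptions made for this sketch.
FACTORS = {
    "lbm": (4.536e-1, "kg"),   # pound mass (avoirdupois)
    "ft": (3.048e-1, "m"),     # foot
    "Btu": (1.054e3, "J"),     # British thermal unit
    "atm": (1.013e5, "Pa"),    # atmosphere
}


def to_si(value, unit):
    """Multiply `value` by the tabulated factor; return (SI value, SI unit)."""
    factor, si_unit = FACTORS[unit]
    return value * factor, si_unit


def fahrenheit_to_celsius(deg_f):
    """Temperature needs an offset as well as a factor: (F - 32) * 5/9."""
    return (deg_f - 32.0) * 5.0 / 9.0


if __name__ == "__main__":
    print(to_si(10.0, "lbm"))            # 10 lbm expressed in kg
    print(fahrenheit_to_celsius(212.0))  # boiling point of water in degrees C
```

Because the tabulated factors carry at most 4 significant figures, results obtained this way should not be quoted to higher precision.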


INTERNATIONAL ATOMIC ENERGY AGENCY, VIENNA, 1976

SUBJECT GROUP: III Physics/Theoretical Physics
