  • 7/28/2019 andrews-notes.pdf

    1/326

    Incomplete Notes on

    Geometric Control Theory !

    Andrew D. Lewis

    23/06/2009

    Associate Professor, Department of Mathematics and Statistics, Queen's University, Kingston, ON K7L 3N6, Canada

    Email: [email protected], URL: http://penelope.mast.queensu.ca/andrew/


    Preface

    These are a very incomplete set of notes that will eventually turn into a book on geometric control theory. Parts of these notes are quite complete (e.g., Chapters 2, 5, and the part of Chapter 6 that is finished), parts are partially complete (e.g., Chapters 3 and 4), parts are just being started (e.g., Chapters 1, 8, and 9), and some are merely placeholders for things that are not yet written (e.g., Chapters 7, 10, and 11). Among the victims of the state of incompleteness is the referencing. So I am afraid that these notes are but an imperfect source for gaining access to the research literature in geometric control. Hopefully some of the gaps left here can be filled in by using the texts Jurdjevic [1997] and Agrachev and Sachkov [2004].

    The writing of this material started, ostensibly, as the writing of material for a short course on controllability. However, (enjoyable) distractions arose, the results of which are plain to see. These distractions prevent the material on controllability from being as complete as it might have been. This is no great crime, as the subject of controllability is incomplete by nature, and the intent of the short course is to outline the foundations of controllability theory, rather than the presentation of specific results on controllability. Nonetheless, the reader should be aware that, in their present state, these notes do not paint a very clear picture of the state of the art of controllability theory, at least as concerns specific necessary and sufficient conditions for controllability. The guilt I feel about this is bounded above by a medium-sized positive constant.

    At present there is no mention of applications of geometric control theory in these notes. I do not have any plans to include mention of such applications in the future. The guilt I feel about this is bounded above by a small positive constant. If you are interested in applications of geometric control theory to mechanical systems, then we refer to the texts of Bloch [2003] and Bullo and Lewis [2004].

    There are places where material is obviously missing. Places where absent material is perhaps less obvious are marked with an exclamation point in the margin, like this: !

    There will obviously be many mistakes, typographical and, I am afraid, otherwise. I would appreciate it if you could pass on any that you find.

    In summary:

    These notes are incomplete and mostly unchecked. Thus I will not guarantee that anything contained in them is correct. These notes are intended neither for distribution nor for citation.


    Table of Contents

    1 Notation and prerequisites

    2 Real analyticity
        2.1 Real analytic functions: definitions and fundamental properties
            2.1.1 Multi-index and partial derivative notation
            2.1.2 Formal power series
            2.1.3 Formal Taylor series
            2.1.4 Convergent power series
            2.1.5 Real analytic functions
        2.2 Real analytic multivariable calculus
            2.2.1 Real analyticity and operations on functions
            2.2.2 The real analytic Inverse Function Theorem
            2.2.3 Some consequences of the Inverse Function Theorem
        2.3 Real analytic differential geometry
            2.3.1 Real analytic manifolds, submanifolds, mappings, and vector bundles
            2.3.2 The Grauert–Morrey Embedding Theorem
            2.3.3 Extension of and approximation by real analytic maps
        2.4 Local properties of analytic functions
            2.4.1 Unique factorisation domains
            2.4.2 Noetherian rings and modules
            2.4.3 The Weierstrass Preparation Theorem
            2.4.4 Algebraic properties of germs of analytic functions
            2.4.5 Properties of analytic sections of vector bundles and their germs

    3 Time-dependent vector fields and their flows
        3.1 Vector fields depending measurably on time
            3.1.1 The finitely differentiable case
            3.1.2 The smooth case
            3.1.3 The locally Lipschitz case
        3.2 Absolutely continuous curves
            3.2.1 Some comments about absolute continuity
            3.2.2 Absolutely continuous curves on smooth manifolds
        3.3 Flows for time-dependent vector fields
            3.3.1 Integral curves: local existence and uniqueness

    4 Set-valued analysis on manifolds
        4.1 Riemannian manifolds as metric spaces
            4.1.1 Definition of the metric
            4.1.2 Equivalence of metrics
            4.1.3 The metric structure of the tangent bundle of a Riemannian manifold
        4.2 Set-valued maps between metric and topological spaces
            4.2.1 The Hausdorff distance
            4.2.2 Set-valued maps
            4.2.3 Notions of continuity for set-valued maps
            4.2.4 Lipschitz set-valued maps
            4.2.5 Measurable set-valued maps
        4.3 Convex sets, affine subspaces, and cones
            4.3.1 Definitions
            4.3.2 Combinations and hulls
            4.3.3 Topology of convex sets and cones
            4.3.4 Separation theorems for convex sets
        4.4 Differential inclusions on manifolds
            4.4.1 Definitions
            4.4.2 Continuity of differential inclusions
            4.4.3 Lipschitz differential inclusions
            4.4.4 Differential inclusions with measurable time dependence
            4.4.5 Selections of differential inclusions
            4.4.6 Trajectories for differential inclusions
            4.4.7 Relaxation
            4.4.8 Differential inclusions associated with a discontinuous vector field

    5 Families of vector fields, distributions, and affine distributions
        5.1 Distributions: definitions and basic properties
            5.1.1 Definitions
            5.1.2 Regular and singular points
            5.1.3 Distributions invariant under vector fields and diffeomorphisms
        5.2 The algebraic structure of sets of functions and vector fields
            5.2.1 Rings of functions and modules of vector fields
            5.2.2 Rings of germs of functions and modules of germs of vector fields
            5.2.3 Analytic germs
            5.2.4 Smooth germs
        5.3 The Orbit Theorem and some consequences
            5.3.1 Lie algebras of vector fields
            5.3.2 Immersed submanifolds
            5.3.3 Orbits
            5.3.4 Fixed-time orbits
            5.3.5 The Orbit Theorem
            5.3.6 The finitely generated Orbit Theorem
            5.3.7 The fixed-time Orbit Theorem
            5.3.8 Frobenius's Theorem
            5.3.9 Equivalence of Lie subalgebras of vector fields
        5.4 Affine distributions
            5.4.1 Definitions
            5.4.2 Regular and singular points
            5.4.3 Algebraic aspects of affine distributions
            5.4.4 The Lie algebra generated by an affine distribution
            5.4.5 Invariant subspace constructions

    6 Geometric system models
        6.1 Controls
            6.1.1 Metric space valued controls
            6.1.2 Subsets of admissible locally essentially bounded controls
            6.1.3 Euclidean space valued controls
            6.1.4 Subsets of admissible locally integrable controls
        6.2 Differential inclusion systems
            6.2.1 Definition of differential inclusion system
            6.2.2 Trajectories and reachable sets for differential inclusion systems
        6.3 Systems depending continuously on control
            6.3.1 Definition of control system
            6.3.2 Trajectories and reachable sets for control systems
            6.3.3 The fibred manifold picture for a control system
        6.4 Control-affine systems
            6.4.1 Definition of control-affine system
            6.4.2 Trajectories and reachable sets for control-affine systems
            6.4.3 Important classes of control-affine systems
            6.4.4 Transformations of control-affine systems
        6.5 Affine systems
            6.5.1 Definition of affine system
            6.5.2 The relationship between affine systems and control-affine systems
            6.5.3 Trajectories and reachable sets for affine systems

    7 Linear systems and linearisation of systems
        7.1 Linear systems
            7.1.1 Linear systems on vector spaces
            7.1.2 Linear systems on vector bundles
        7.2 Linearisation of system models
            7.2.1 Linearisation of differential inclusion systems
            7.2.2 Linearisation of control systems
            7.2.3 Linearisation of control-affine systems
            7.2.4 Linearisation of affine systems

    8 Variations and the reachable set
        8.1 Jet bundles of various sorts
            8.1.1 The symmetric algebra of a vector space
            8.1.2 Jet bundles of vector bundles
            8.1.3 Jet bundles of maps between manifolds
            8.1.4 The structure of jets of maps between Euclidean spaces
            8.1.5 Higher-order tangent vectors for nets
        8.2 Properties of the reachable set for differential inclusion systems
            8.2.1 The topology of the reachable set for a differential inclusion system
        8.3 Properties of reachable sets for control systems
            8.3.1 Topology of the reachable set for control systems
            8.3.2 States reachable by subsets of trajectories
        8.4 Variations for differential inclusion systems
            8.4.1 An algebro-geometric construction
            8.4.2 A characterisation of variations
            8.4.3 The relationship between variations and the reachable set
        8.5 Variations for control systems
            8.5.1 Definition of variations
            8.5.2 The relationship between variations and the reachable set

    9 Controllability theory
        9.1 Definitions for the various types of controllability
            9.1.1 Accessibility definitions
            9.1.2 Controllability definitions
            9.1.3 Geometric controllability definitions for control-affine and affine systems
        9.2 Examples illustrating controllability problems
            9.2.1 The difference between accessibility and local accessibility
            9.2.2 The distinction between accessibility and controllability
            9.2.3 The distinction between accessibility and strong accessibility
            9.2.4 Global controllability and local controllability
            9.2.5 The size of the control set can matter
            9.2.6 The role of feedback transformations
            9.2.7 Controllability might be computationally difficult to decide
        9.3 Accessibility theory
            9.3.1 Positive orbits and positive fixed-time orbits
            9.3.2 Accessibility for control systems
            9.3.3 Accessibility for control-affine systems
        9.4 Some controllability results
            9.4.1 Controllability results for differential inclusion systems
            9.4.2 Controllability results for control systems
            9.4.3 Controllability results for control-affine systems

    10 Optimal control theory

    11 Stabilisation theory


    This version: 23/06/2009

    Chapter 1

    Notation and prerequisites

    While we cover quite a bit of material in this book, it is not the case that the book is even close to being self-contained. In fact, quite the opposite is true: the reader is expected to have substantial prerequisites in many areas of mathematics. In this chapter we will give an overview of what is expected of the reader, give the notation we will use throughout the book, and provide references which readers can use at appropriate moments to fill in the required background. What we say is exceedingly terse.

    Note that it is not the case that all prerequisites need to be understood completely to read a chosen section. For example, it may be the case that some prerequisites are needed only in a certain proof. Thus a reader need not worry if there are certain gaps in their background.

    The very basics

    We use the most common set theoretic notation. One thing the reader may wish to be aware of is that when we write $A \subseteq B$ we mean that $A$ is a subset of $B$, allowing that $A = B$. If we require that $A \subseteq B$ and $A \neq B$, then we will write $A \subset B$. The power set of a set $X$ we denote by $2^X$. By $\mathrm{id}_X$ we denote the identity map on a set $X$. The cardinality of a set $X$ is denoted by $\mathrm{card}(X)$.

    By $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$ we denote the sets of integers, rational numbers, real numbers, and complex numbers, respectively. By $\mathbb{Z}_{\ge 0}$ (resp. $\mathbb{Z}_{>0}$) we denote the set of nonnegative (resp. positive) integers. By $\mathbb{R}_{\ge 0}$ (resp. $\mathbb{R}_{>0}$) we denote the set of nonnegative (resp. positive) real numbers. For $x \in \mathbb{R}$, $\lfloor x \rfloor$ denotes the largest integer less than or equal to $x$ and $\lceil x \rceil$ denotes the smallest integer not less than $x$.

    We use the terms "injective" and "surjective" for maps, rather than the terms "one-to-one" or "onto". If $S_1, \dots, S_k$ are sets, the map $\mathrm{pr}_j \colon S_1 \times \cdots \times S_k \to S_j$ defined by

    $$\mathrm{pr}_j(x_1, \dots, x_j, \dots, x_k) = x_j$$

    is the projection onto the $j$th factor.


    Topology

    We use basic facts from point set topology. Thus we expect that the reader knows what a topological space is, and knows about the various flavours of subsets of a topological space, like open sets, closed sets, compact sets, and connected sets. We shall make frequent use of basic theorems, often dealing with compactness, e.g., the Bolzano–Weierstrass Theorem and the Arzelà–Ascoli Theorem.

    Let $X$ be a topological space. By $\mathrm{int}(A)$, $\mathrm{cl}(A)$, and $\mathrm{bd}(A)$ we denote the interior, closure, and boundary of $A \subseteq X$, respectively. If $Y \subseteq X$, then $Y$ inherits the subspace topology. If $A \subseteq Y \subseteq X$, then $\mathrm{int}_Y(A)$, $\mathrm{cl}_Y(A)$, and $\mathrm{bd}_Y(A)$ denote the relative interior, relative closure, and relative boundary of $A$.

    We shall also often use metric spaces in our constructions, denoting a typical metric space by $(M, d)$, where $d$ denotes the distance function. For $x \in M$ and $r \in \mathbb{R}_{\ge 0}$, we denote by $B(r, x)$ and $\overline{B}(r, x)$ the open and closed balls, respectively, of radius $r$ and centre $x$.

    We refer to [Willard 1970] as a good introduction to topology. We will also pull many facts about topology from Chapter 1 of [Abraham, Marsden, and Ratiu 1988].

    Real analysis

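    The reader is assumed to be familiar with basic, and maybe some not so basic, real analysis. The Euclidean space of $n$ dimensions we denote by $\mathbb{R}^n$. Typically we will denote an element of $\mathbb{R}^n$ with a bold font, e.g., $\boldsymbol{x}$. For a vector $\boldsymbol{x} \in \mathbb{R}^n$ the components of $\boldsymbol{x}$ will be denoted by $(x_1, \dots, x_n)$ or $(x^1, \dots, x^n)$. On $\mathbb{R}^n$ we consider the standard inner product and norm denoted by

    $$\langle \boldsymbol{x}, \boldsymbol{y} \rangle = \sum_{j=1}^n x_j y_j, \qquad \|\boldsymbol{x}\| = \sqrt{\langle \boldsymbol{x}, \boldsymbol{x} \rangle},$$

    respectively. Other norms one can use are

    $$\|\boldsymbol{x}\|_1 = \sum_{j=1}^n |x_j|, \qquad \|\boldsymbol{x}\|_\infty = \max\{|x_j| \mid j \in \{1, \dots, n\}\}.$$

    At times it will be convenient to use these other norms, and to recall the inequalities

    $$\begin{aligned}
    \|\boldsymbol{x}\|_1 &\le \sqrt{n}\,\|\boldsymbol{x}\|, & \|\boldsymbol{x}\|_1 &\le n\,\|\boldsymbol{x}\|_\infty, & \|\boldsymbol{x}\| &\le \|\boldsymbol{x}\|_1, \\
    \|\boldsymbol{x}\| &\le \sqrt{n}\,\|\boldsymbol{x}\|_\infty, & \|\boldsymbol{x}\|_\infty &\le \|\boldsymbol{x}\|_1, & \|\boldsymbol{x}\|_\infty &\le \|\boldsymbol{x}\|.
    \end{aligned} \tag{1.1}$$

    By $B^n(r, \boldsymbol{x})$ (resp. $\overline{B}^n(r, \boldsymbol{x})$) we denote the open ball (resp. closed ball) of radius $r$ centred at $\boldsymbol{x} \in \mathbb{R}^n$. Unless otherwise stated, these balls are taken with respect to the standard norm. The balls with respect to the $\infty$-norm are denoted by $D^n(r, \boldsymbol{x})$ and $\overline{D}^n(r, \boldsymbol{x})$ and are often called "disks".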
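    The six inequalities (1.1) relating the standard norm, the 1-norm, and the max-norm can be spot-checked numerically. The following is only an illustrative sketch and not part of the notes: the function names, the tolerance, and the random sampling scheme are all assumptions made for the example, and a numerical check of this kind is of course no substitute for a proof.

```python
import math
import random


def norm_1(x):
    """The 1-norm: sum of absolute values of the components."""
    return sum(abs(c) for c in x)


def norm_2(x):
    """The standard (Euclidean) norm."""
    return math.sqrt(sum(c * c for c in x))


def norm_inf(x):
    """The max-norm: largest absolute value of a component."""
    return max(abs(c) for c in x)


def satisfies_1_1(x, tol=1e-12):
    """Check the six inequalities of (1.1) for one vector x.

    A small relative slack (hypothetical choice) absorbs rounding error
    in the cases where an inequality holds with equality.
    """
    n = len(x)
    n1, n2, ninf = norm_1(x), norm_2(x), norm_inf(x)
    slack = tol * (1.0 + n1)
    return all([
        n1 <= math.sqrt(n) * n2 + slack,
        n1 <= n * ninf + slack,
        n2 <= n1 + slack,
        n2 <= math.sqrt(n) * ninf + slack,
        ninf <= n1 + slack,
        ninf <= n2 + slack,
    ])


# Sample 1000 random vectors in R^5 and test every inequality on each.
random.seed(0)
samples = [[random.uniform(-10, 10) for _ in range(5)] for _ in range(1000)]
all_hold = all(satisfies_1_1(x) for x in samples)
```

    The vector with all components equal to 1 is a useful extreme case, since several of the inequalities hold there with equality.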


    If $U \subseteq \mathbb{R}^n$ is an open set and if $f \colon U \to \mathbb{R}^m$ is $k$-times continuously differentiable for $k \in \mathbb{Z}_{>0}$, then we say that $f$ is of class $C^k$. The $k$th derivative of $f$ at $\boldsymbol{x} \in U$ we denote by $D^k f(\boldsymbol{x})$, and we note that $D^k f(\boldsymbol{x})$ is a symmetric multilinear map from $(\mathbb{R}^n)^k$ to $\mathbb{R}^m$. The set of such maps we denote by $L^k_{\mathrm{sym}}(\mathbb{R}^n; \mathbb{R}^m)$. Maps that are infinitely differentiable are of class $C^\infty$ and maps that are real analytic are of class $C^\omega$. By a map of class $C^0$ we mean a continuous map.

    A good reference for the real analysis we use is [Rudin 1976]. Another excellent, and more advanced, source is [Hewitt and Stromberg 1975].

    Differential geometry

    We suppose the reader to be thoroughly familiar with basic differential geometry.

    We shall frequently employ the Einstein summation convention, although we will not do so slavishly. In particular, we will not be consistent with using superscript indices in places where the Einstein summation convention insists that you use superscript indices. Thus, for example, we will denote a point in $\mathbb{R}^n$ as $(x_1, \dots, x_n)$ and not as $(x^1, \dots, x^n)$ when we feel as if we want to do so.

    All manifolds we consider to be either infinitely differentiable or real analytic. When we use the word "smooth", we shall always mean infinitely differentiable. We shall often use the expression "infinitely differentiable or real analytic, as is required", by which we mean that infinite differentiability is assumed, unless objects are being considered that are real analytic. For manifolds $M$ and $N$ and for $r \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$, the mappings from $M$ to $N$ of class $C^r$ are denoted by $C^r(M, N)$. We abbreviate $C^r(M) = C^r(M, \mathbb{R})$ for $r \in \mathbb{Z}_{\ge 0} \cup \{\infty\} \cup \{\omega\}$. We shall, for the most part, carefully state degrees of differentiability. However, just to be safe: unless we state otherwise, all mappings and functions will be assumed to be of class $C^\infty$. If we wish to weaken this to some finite degree of differentiability or strengthen this to analyticity (we shall often do this), we shall say so explicitly.

    The tangent bundle of a manifold $M$ is denoted by $TM$, with $T_xM$ denoting the tangent space at $x$. The cotangent bundle and cotangent space at $x$ are similarly denoted by $T^*M$ and $T^*_xM$, respectively. For $r, s \in \mathbb{Z}_{\ge 0}$, $T^r_s(TM)$ denotes the vector bundle of $(r, s)$-tensors on $M$. So that there is no confusion, tangent vectors are tensors of type $(1, 0)$ and cotangent vectors are tensors of type $(0, 1)$.

    For a vector bundle $\pi \colon V \to M$ of class $C^k$, the set of $C^k$-sections will be denoted by $\Gamma^k(V)$, $k \in \mathbb{Z}_{\ge 0} \cup \{\infty, \omega\}$. Thus $\Gamma^k(TM)$ denotes the set of vector fields of class $C^k$ and, more generally, $\Gamma^k(T^r_s(TM))$ denotes the set of tensor fields of type $(r, s)$ and of class $C^k$; we do not use special notation for these. We denote by $V_x = \pi^{-1}(x)$ the fibre at $x$. Sometimes, but not always, the zero vector in the fibre $V_x$ is denoted by $0_x$.

    If $\Phi \colon M \to N$ is differentiable, $T\Phi \colon TM \to TN$ denotes the derivative of $\Phi$, with $T_x\Phi$ being the restriction to $T_xM$. We renounce other notation for the derivative, such as $\Phi_*$ (which we use for push-forward) or $d\Phi$.

    If $\Phi \colon M \to N$ is a diffeomorphism, if $A \in \Gamma^k(T^r_s(TM))$, and if $B \in \Gamma^k(T^r_s(TN))$, then $\Phi_*A$ denotes the push-forward of $A$ and $\Phi^*B$ denotes the pull-back of $B$. Note that $\Phi$ does not need to be a diffeomorphism to define $\Phi^*B$ in the case that $r = 0$, but the other possibilities generally require $\Phi$ to be a diffeomorphism.

    The Lie derivative of a tensor field $A$ with respect to a vector field $X$ is denoted by $\mathscr{L}_X A$. In case $A = f$ is a function, we might often write $\mathscr{L}_X f = Xf$ if we are trying to be concise.

    Almost everything we will need to know about differential geometry is contained in the text of Abraham, Marsden, and Ratiu [1988].

    Riemannian geometry

    Riemannian geometry does not feature critically in our presentation, but we will occasionally benefit from assuming that our manifolds possess a Riemannian metric. Principally, we will be interested in the metric structure a Riemannian manifold induces on the manifold, and we discuss this in Section 4.1.

    We will on occasion make use of some nontrivial facts about Riemannian manifolds; we refer to [Lang 1995] as a useful text in these cases.

    Measure and integration theory

    The dependence on time of controls must be allowed to be quite arbitrary in order to deal with some of the phenomena that can arise in control theory. A general and useful class of controls to consider are those that are measurable, by which we mean Lebesgue measurable. For controls taking values in Euclidean space, we can also ask for controls to be integrable, by which we mean Lebesgue integrable. Thus we require enough knowledge of measure theory to know what is meant by Lebesgue measurability and Lebesgue integrability. At various times we will also require enough knowledge of measure theory to understand standard, but nontrivial, manipulations of the concepts of measurability and integrability. For example, a reader who knows enough measure theory to know the Dominated Convergence Theorem and what it means will probably possess sufficient measure theory to get through this book.

    The Lebesgue measure on $\mathbb{R}$ will be denoted by $\lambda$. For an interval $I \subseteq \mathbb{R}$ and for a subset $A \subseteq \mathbb{R}$, we denote by $L^1(I; A)$ the Lebesgue integrable $A$-valued functions on $I$. By $L^1_{\mathrm{loc}}(I; A)$ we denote the subset of locally integrable $A$-valued functions, meaning that $f|K \in L^1(K; A)$ for every compact interval $K \subseteq I$.

    A good reference for measure and integration theory is [Cohn 1980].

    Functional analysis

    We expect the reader to be acquainted with elementary functional analysis, such as Banach space theory. We will also, however, occasionally need some less basic functional analysis. In particular, we will make use of some facts about locally convex topological vector spaces, as such concepts are essential to understanding how to topologise spaces of functions and vector fields on manifolds.

    Basics of functional analysis such as we need can be found in [Rudin 1991].


    Linear algebra

    A thorough understanding of linear algebra is essential in almost any area of control theory. We assume the reader to be fully acquainted with finite-dimensional linear algebra.

    For $\mathbb{R}$-vector spaces $U$ and $V$, we will denote the set of $\mathbb{R}$-linear maps from $U$ to $V$ by either $L(U; V)$ or $\mathrm{Hom}_{\mathbb{R}}(U; V)$. More or less, when we are focusing on analytical ideas, we will use the former notation, while the latter notation will be used when algebraic structure is the focus. By $V^* = L(V; \mathbb{R})$ we denote the dual of $V$. If $\alpha \in V^*$ and $v \in V$, we might denote $\alpha(v)$ by $\langle \alpha; v \rangle$ or $\alpha \cdot v$.

    For a subset $S$ of an $\mathbb{R}$-vector space $V$, $\mathrm{span}_{\mathbb{R}}(S)$ is the linear hull of $S$, i.e., the smallest subspace of $V$ containing $S$.

    By $\mathbb{R}^{m \times n}$ we denote the set of real matrices with $m$ rows and $n$ columns. A typical matrix will be denoted using a bold font, e.g., $\boldsymbol{A}$. By $\boldsymbol{I}_n$ we denote the $n \times n$ identity matrix.

    A good basic reference for finite-dimensional linear algebra is [Halmos 1986]. More advanced topics are covered in [Roman 2005].

    Algebra

    We expect that the reader knows the basic definitions and properties for groups, rings (in particular fields), modules (in particular, vector spaces), and algebras. We expect the reader to know about tensor products.

    By $S_k$ we denote the symmetric group of order $k$, which means, precisely, the group of bijections of the set $\{1, \dots, k\}$.

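    Let us review some notation regarding direct sums and products. For a family $(V_a)_{a \in A}$ of $\mathbb{R}$-vector spaces, we regard the direct sum of these vector spaces as the following set of maps:

    $$\bigoplus_{a \in A} V_a = \Bigl\{ \phi \colon A \to \bigcup_{a \in A} V_a \Bigm| \phi(a) \in V_a,\ a \in A,\ \phi(a) = 0 \text{ for all but finitely many } a \in A \Bigr\}.$$

    In like manner, the direct product of the same family of vector spaces is also a set of maps:

    $$\prod_{a \in A} V_a = \Bigl\{ \phi \colon A \to \bigcup_{a \in A} V_a \Bigm| \phi(a) \in V_a,\ a \in A \Bigr\}.$$

    Thus the direct sum is a subspace of the direct product if we use the operations of vector addition and scalar multiplication on $\prod_{a \in A} V_a$ defined by

    $$(\phi_1 + \phi_2)(a) = \phi_1(a) + \phi_2(a), \qquad (\lambda\phi)(a) = \lambda(\phi(a)).$$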
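    The finite-support condition in the definition of the direct sum is what makes its elements representable by finite data, even when the index set is infinite. As a minimal sketch, and under assumptions that are not in the notes (the index set is taken to be the nonnegative integers, every $V_a$ is taken to be $\mathbb{R}^2$, and the names `add`, `scale`, and `evaluate` are hypothetical), an element can be stored as a dictionary holding only its nonzero values:

```python
def add(phi, psi):
    """Pointwise sum (phi + psi)(a) = phi(a) + psi(a).

    Each argument maps an index a to a nonzero vector (a tuple);
    indices that are absent are understood to map to the zero vector.
    """
    out = {}
    for a in set(phi) | set(psi):
        u, v = phi.get(a), psi.get(a)
        if u is None:
            w = v
        elif v is None:
            w = u
        else:
            w = tuple(ui + vi for ui, vi in zip(u, v))
        # Keep the support minimal: drop entries that became zero.
        if any(c != 0 for c in w):
            out[a] = w
    return out


def scale(lam, phi):
    """Scalar multiple (lam * phi)(a) = lam * phi(a)."""
    out = {}
    for a, v in phi.items():
        w = tuple(lam * c for c in v)
        if any(c != 0 for c in w):
            out[a] = w
    return out


def evaluate(phi, a, dim=2):
    """Value of phi at index a; zero vector outside the support."""
    return phi.get(a, (0,) * dim)


# Two finitely supported maps on the (infinite) index set {0, 1, 2, ...}.
phi = {0: (1, 2), 5: (0, -1)}
psi = {5: (0, 1), 7: (3, 0)}
total = add(phi, psi)      # the values at index 5 cancel
doubled = scale(2, phi)
```

    Sums and scalar multiples of finitely supported maps are again finitely supported, which is the subspace property noted above. A general element of the direct product over an infinite index set has no such finite representation.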

    The $k$-fold tensor product of a vector space $V$ with itself we denote by

    $$T^k(V) = V \otimes \cdots \otimes V.$$

    The tensor algebra of $V$ is then $T(V) = \bigoplus_{k=0}^{\infty} T^k(V)$, with the understanding that $T^0(V) = \mathbb{R}$. We comment that $T^r_s(V) \simeq T^r(V) \otimes T^s(V^*)$ in the case when $V$ is finite-dimensional.

    We shall occasionally recall some not quite elementary facts from algebra, but will provide the necessary background in these cases. References for basic algebra are [Hungerford 1980] and [Lang 1984].

    Computational complexity

    This is definitely not a book on computational complexity methods in control theory, but we will on occasion make some statements using the language of computational complexity. Therefore, we should have in mind some idea about what these statements mean.

    The aspect of this that we will be the most vague about is the class of problems considered in the theory of computational complexity. The problems we consider are decidability problems, meaning that they have answers that are "yes" or "no", and will have a characteristic size which measures the complexity of the problem. For a problem involving graphs, for example, the size might be the number of nodes in the graph; for a problem involving linear algebra, the size might be the dimension of the vector space or some number related to the dimension of the vector space. For us, dealing with problems in control theory, the complexity of the problem will be related to the dimension of the state space and the way in which one represents the system. In any case, we will be concerned with decidability problems with a characteristic size that we will typically denote by $N$.

    By a solution algorithm we mean a method for taking an instance of the decidability problem and producing a correct "yes" or "no" answer in all cases. A solution algorithm which takes a problem of size $N$ and correctly returns the answer to the decidability problem in at most $K$ steps, where $K$ satisfies an inequality of the form $K \le CN^p$ for some $C, p \in \mathbb{Z}_{>0}$, is called a polynomial-time solution algorithm. The class of problems admitting polynomial-time solution algorithms is denoted P. The class NP consists of those problems for which, if one is given an affirmative answer to an instance of size $N$ of the decidability problem, one can evaluate whether the answer is correct using at most $K$ steps, where $K$ satisfies an inequality of the form $K \le CN^p$ for some $C, p \in \mathbb{Z}_{>0}$. Clearly P $\subseteq$ NP. It is not presently known whether P = NP. A problem $L$ is called NP-complete if (1) it is in NP and (2) any other problem $L'$ of size $N$ in NP can be converted to $L$ using an algorithm with $K$ steps, where $K$ satisfies $K \le CN^p$. A problem $L$ is called NP-hard if any problem $L'$ of size $N$ in NP can be converted to $L$ using an algorithm with $K$ steps, where $K$ satisfies $K \le CN^p$.

    The idea is that problems in the class P are "nice" in that a relatively efficient algorithm exists to solve them. Problems in the class NP are not known to be as nice (it is not known whether they can be solved using a polynomial-time algorithm), but are also not so bad since solutions can be efficiently verified. Problems that are NP-complete are the hardest problems in NP. Problems that are NP-hard are at least as difficult as any problem in NP.
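    The difference between solving a problem and verifying a proposed affirmative answer can be made concrete with Boolean satisfiability, a standard example of a problem in NP. Satisfiability is not discussed in these notes, so the choice of problem, the clause encoding, and the function names below are all assumptions made for illustration. In this sketch, checking a proposed assignment takes time proportional to the size of the formula, whereas the naive solution algorithm shown examines up to $2^n$ assignments for $n$ variables.

```python
from itertools import product


def verify(clauses, assignment):
    """Check a proposed certificate in time linear in the formula size.

    A formula is a list of clauses; a clause is a list of nonzero
    integers, where k stands for variable k and -k for its negation.
    The assignment maps each variable number to True or False.
    """
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def brute_force(clauses, num_vars):
    """Naive solution algorithm: try all 2**num_vars assignments.

    Returns a satisfying assignment, or None if there is none.
    """
    for values in product([False, True], repeat=num_vars):
        assignment = {k + 1: v for k, v in enumerate(values)}
        if verify(clauses, assignment):
            return assignment
    return None


# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
clauses = [[1, -2], [2, 3], [-1, -3]]
certificate = {1: True, 2: True, 3: False}
certificate_ok = verify(clauses, certificate)

# x1 and (not x1) has no satisfying assignment.
unsat = brute_force([[1], [-1]], 1)
```

    The existence of a fast `verify` is what places a problem in NP; nothing in the sketch bears on whether a polynomial-time replacement for `brute_force` exists.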


    An introduction to matters of computation and computational complexity can be found in the book [Sipser 1996].

    Control theory

    It is important not to forget that geometric control theory is about control theory as well as differential geometry. It is assumed that the reader is familiar with basic topics in control theory, as such familiarity forms an essential context within which to understand geometric control theory. Without appreciating the needs of control theory, it is possible to turn geometric control theory into a rather unhappy distortion of what it is supposed to be. When thinking about problems in geometric control theory, one should always ask, "What is the problem in control theory that this is contributing to?"

    Of course, control theory is an enormous subject with many specialities, and geometric control theory is only even slightly related to a small subset of these specialities, e.g., continuous-time, finite-dimensional systems described by differential equation models. And even within this small subset, one does not need to understand everything to be able to properly contextualise geometric control theory. For example, one need not understand the latest in robust adaptive controller design schemes to do useful research in geometric control theory. However, perhaps as a bare minimum, one should have familiarity with the following areas.

    1. Linear control theory: The basics of linear control theory are well-established, and so form an excellent starting point for studying control theory. The fundamental problems are all precisely formulated and, in some sense at least, solved. There are many linear systems texts available, most of which are intended for graduate students in engineering. Many of these are presented with sufficient rigour to be useful preparation for geometric control theory. One book that stands out from the norm in its treatment is that of Wonham [1985]. This book provides a very good formulation of linear control theory for the purposes of studying geometric control theory, and so might be a good starting point for someone coming from a background where control theory is absent.

    2. Some of nonlinear control theory: The subject known as nonlinear control theory is quite expansive. Some areas of nonlinear control theory clearly overlap with geometric control theory. However, there is a significant body of nonlinear control theory that really has no overlap with geometric control theory, and is more of a nonlinearisation of linear control theory. This latter sort of nonlinear control theory is presented in the text [Khalil 1996]. It is useful to understand some of this sort of nonlinear control theory. In particular, some of the more analytical techniques from this area are useful to know. However, the tools are decidedly not geometric, and at some point one has to really shake free from this sort of nonlinear control theory in order to immerse oneself in geometric control theory. A more geometric presentation of nonlinear control theory can be found in the books of Isidori [1995] and Nijmeijer and van der Schaft [1990]. The latter volume, in particular, is very closely aligned with a geometric approach, but is also faithful to the needs of control theory.


    This version: 23/06/2009

    Chapter 2

    Real analyticity

    One of the places where geometric control theory departs from classical differential geometry occurs in the occasional importance of real analyticity in geometric control theory. The reasons for this are deep and varied, as explained in [Sussmann 1990]. For our purposes, one of the important properties of real analyticity has to do with algebraic properties of germs of analytic functions. The setup for this is developed in Section 2.4. Another important aspect of real analyticity has to do with extending maps from a subset to a larger space. For smooth maps, this is often very easily done using things like partitions of unity and the Tietze Extension Theorem [Abraham, Marsden, and Ratiu 1988, §5.5]. For analytic maps, the matter is more subtle because of the absence of partitions of unity. In Section 2.3.3 we present some useful results concerning extensibility of analytic functions.

    The facts we present about real analyticity are more or less well-known, but it is useful to organise them in one place for ease of reference. What we say here is a bare scratching of the surface of the important, interesting, and subtle topic of real analyticity.

    2.1 Real analytic functions: definitions and fundamental properties

    Real analytic functions are defined as being locally prescribed by a convergent power series. We, therefore, begin by describing formal (i.e., not depending on any sort of convergence) power series. We then indicate how the usual notion of a Taylor series gives rise to a formal power series, and we prove Borel's Theorem which says that all formal power series arise as Taylor series. This leads us to consider convergence of power series, and then finally to consider real analytic functions.

    Much of what we say here is a fleshing out of some material from Chapter 2 of [Krantz and Parks 2002].

    2.1.1 Multi-index and partial derivative notation

    A multi-index is an element of $\mathbb{Z}^n_{\ge 0}$. For a multi-index $I$ we shall write $I = (i_1, \dots, i_n)$. We introduce the following notation:

    1. $|I| = i_1 + \cdots + i_n$;
    2. $I! = i_1! \cdots i_n!$;
    3. $x^I = x_1^{i_1} \cdots x_n^{i_n}$ for $x = (x_1, \dots, x_n) \in \mathbb{R}^n$;
    4. $|x|^I = |x_1|^{i_1} \cdots |x_n|^{i_n}$ for $x = (x_1, \dots, x_n) \in \mathbb{R}^n$.

    Note that the standard basis $(e_1, \dots, e_n)$ for $\mathbb{R}^n$ is in $\mathbb{Z}^n_{\ge 0}$, so we shall think of these vectors as elements of $\mathbb{Z}^n_{\ge 0}$ when it is convenient to do so.
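The notation above can be made concrete with a minimal sketch in which a multi-index is represented as a tuple of nonnegative integers. The helper names (`order`, `multi_factorial`, `monomial`, `abs_monomial`) are hypothetical and are chosen only for this illustration.

```python
from math import factorial, prod


def order(I):
    """|I| = i_1 + ... + i_n."""
    return sum(I)


def multi_factorial(I):
    """I! = i_1! ... i_n!."""
    return prod(factorial(i) for i in I)


def monomial(x, I):
    """x^I = x_1^{i_1} ... x_n^{i_n}."""
    return prod(xj ** ij for xj, ij in zip(x, I))


def abs_monomial(x, I):
    """|x|^I = |x_1|^{i_1} ... |x_n|^{i_n}."""
    return prod(abs(xj) ** ij for xj, ij in zip(x, I))


I = (2, 0, 1)
x = (3.0, -1.0, -2.0)
print(order(I), multi_factorial(I), monomial(x, I), abs_monomial(x, I))
```

For this choice of $I$ and $x$ one has $|I| = 3$, $I! = 2$, $x^I = -18$, and $|x|^I = 18$.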

    The following property of the set of multi-indices will often be useful.

    2.1.1 Lemma For $n \in \mathbb{Z}_{>0}$ and $m \in \mathbb{Z}_{\ge 0}$,
    \[ \operatorname{card}(\{I \in \mathbb{Z}^n_{\ge 0} \mid |I| = m\}) = \binom{n+m-1}{n-1}. \]

    Proof We begin with an elementary lemma.

    1 Sublemma For $n \in \mathbb{Z}_{>0}$ and $m \in \mathbb{Z}_{\ge 0}$,
    \[ \sum_{j=0}^m \binom{n+j-1}{n-1} = \binom{n+m}{n}. \]

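    Proof Recall that for $j, k \in \mathbb{Z}_{\ge 0}$ with $j \le k$ we have
    \[ \binom{k}{j} = \frac{k!}{j!(k-j)!}. \]
    We claim that if $j, k \in \mathbb{Z}_{>0}$ satisfy $j \le k$ then
    \[ \binom{k}{j} + \binom{k}{j-1} = \binom{k+1}{j}. \]
    This is a direct computation:
    \begin{align*}
    \frac{k!}{j!(k-j)!} + \frac{k!}{(j-1)!(k-j+1)!}
      &= \frac{(k-j+1)k!}{(k-j+1)j!(k-j)!} + \frac{jk!}{j(j-1)!(k-j+1)!} \\
      &= \frac{(k-j+1)k! + jk!}{j!(k-j+1)!} \\
      &= \frac{(k+1)k!}{j!((k+1)-j)!} = \frac{(k+1)!}{j!((k+1)-j)!} = \binom{k+1}{j}.
    \end{align*}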

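    Now we have
    \begin{align*}
    \sum_{j=0}^m \binom{n+j-1}{n-1}
      &= 1 + \sum_{j=1}^m \binom{n+j-1}{n-1} \\
      &= 1 + \sum_{j=1}^m \binom{n+j}{n} - \sum_{j=1}^m \binom{n+j-1}{n} \\
      &= 1 + \binom{n+m}{n} - \binom{n}{n} = \binom{n+m}{n},
    \end{align*}
    as desired.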

    We now prove the lemma by induction on $n$. For $n = 1$ we have
    \[ \operatorname{card}(\{j \in \mathbb{Z}_{\ge 0} \mid j = m\}) = 1 = \binom{m}{0}, \]
    which gives the conclusions of the lemma in this case. Now suppose that the lemma holds for $n \in \{1, \dots, k\}$. If $I \in \mathbb{Z}^{k+1}_{\ge 0}$ satisfies $|I| = m$, then write $I = (i_1, \dots, i_k, i_{k+1})$ and take $I' = (i_1, \dots, i_k)$. If $i_{k+1} = j \in \{0, 1, \dots, m\}$ then $|I'| = m - j$. Thus
    \begin{align*}
    \operatorname{card}(\{I \in \mathbb{Z}^{k+1}_{\ge 0} \mid |I| = m\})
      &= \sum_{j=0}^m \operatorname{card}(\{I' \in \mathbb{Z}^k_{\ge 0} \mid |I'| = m - j\}) \\
      &= \sum_{j=0}^m \binom{k+m-j-1}{k-1} = \sum_{j=0}^m \binom{k+j-1}{k-1} \\
      &= \binom{k+m}{k} = \binom{(k+1)+m-1}{(k+1)-1},
    \end{align*}
    using the sublemma in the penultimate step. This proves the lemma by induction.
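The counting formula of Lemma 2.1.1 and the identity of the sublemma are easy to check by brute force for small values. The following is a minimal sketch of such a check; the function name `count_multi_indices` is hypothetical, and the enumeration is deliberately naive.

```python
from itertools import product
from math import comb


def count_multi_indices(n, m):
    """Count the multi-indices I in Z^n_{>=0} with |I| = m by enumeration."""
    return sum(1 for I in product(range(m + 1), repeat=n) if sum(I) == m)


for n in range(1, 5):
    for m in range(0, 6):
        # Lemma 2.1.1: the count equals binom(n + m - 1, n - 1).
        assert count_multi_indices(n, m) == comb(n + m - 1, n - 1)
        # Sublemma: sum_{j=0}^m binom(n + j - 1, n - 1) = binom(n + m, n).
        assert sum(comb(n + j - 1, n - 1) for j in range(m + 1)) == comb(n + m, n)

print(count_multi_indices(3, 4), comb(3 + 4 - 1, 3 - 1))
```

Such a check proves nothing, of course, but it is a useful guard against off-by-one slips in the binomial coefficients.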

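    Multi-index notation is also convenient for representing partial derivatives of multivariable functions. Let us start from the ground up. Let $(e_1, \dots, e_n)$ be the standard basis for $\mathbb{R}^n$ and denote by $(\xi_1, \dots, \xi_n)$ the dual basis for $(\mathbb{R}^n)^*$. Let $U \subseteq \mathbb{R}^n$ be open and let $f \in C^\infty(U)$. The $k$th total derivative of $f$ at $x_0$ we denote by $D^kf(x_0)$, noting that $D^kf(x_0) \in L^k_{\mathrm{sym}}(\mathbb{R}^n; \mathbb{R})$ is a symmetric $k$-multilinear map [see Abraham, Marsden, and Ratiu 1988, Proposition 2.4.14]. We denote by $L^k(\mathbb{R}^n; \mathbb{R})$ the set of $k$-multilinear maps with its usual basis
    \[ \{\xi_{j_1} \otimes \cdots \otimes \xi_{j_k} \mid j_1, \dots, j_k \in \{1, \dots, n\}\}. \]
    Thinking of $D^kf(x_0)$ as a multilinear map, forgetting about its being symmetric, we write
    \[ D^kf(x_0) = \sum_{j_1, \dots, j_k = 1}^n \frac{\partial^k f}{\partial x_{j_1} \cdots \partial x_{j_k}}(x_0)\, \xi_{j_1} \otimes \cdots \otimes \xi_{j_k}. \]
    Now, for $j_1, \dots, j_k \in \{1, \dots, n\}$, define $I \in \mathbb{Z}^n_{\ge 0}$ by letting $i_m \in \mathbb{Z}_{\ge 0}$ be the number of times $m \in \{1, \dots, n\}$ appears in the list of numbers $j_1, \dots, j_k$. Then, via the product on $L^k_{\mathrm{sym}}(\mathbb{R}^n; \mathbb{R})$ as discussed in Section 8.1.1, we can also write
    \[ D^kf(x_0) = \sum_{\substack{i_1, \dots, i_n \in \mathbb{Z}_{\ge 0} \\ i_1 + \cdots + i_n = k}} \frac{1}{i_1! \cdots i_n!} \frac{\partial^k f}{\partial x_1^{i_1} \cdots \partial x_n^{i_n}}(x_0)\, \xi_1^{i_1} \cdots \xi_n^{i_n}. \]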

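    For $I = (i_1, \dots, i_n) \in \mathbb{Z}^n_{\ge 0}$, we write
    \[ \frac{\partial^{|I|} f}{\partial x^I}(x_0) = \frac{\partial^{|I|} f}{\partial x_1^{i_1} \cdots \partial x_n^{i_n}}(x_0). \]
    We may also write this in a different way:
    \[ \frac{\partial^{|I|} f}{\partial x^I}(x_0) = \frac{\partial^{|I|} f}{\underbrace{\partial x_1 \cdots \partial x_1}_{i_1\ \text{times}} \cdots \underbrace{\partial x_n \cdots \partial x_n}_{i_n\ \text{times}}}(x_0). \]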


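    Indeed, because of symmetry of the derivative, for any collection of numbers $j_1, \dots, j_{|I|} \in \{1, \dots, n\}$ for which $k$ occurs $i_k$ times for each $k \in \{1, \dots, n\}$, we have
    \[ \frac{\partial^{|I|} f}{\partial x^I}(x_0) = \frac{\partial^{|I|} f}{\partial x_{j_1} \cdots \partial x_{j_{|I|}}}(x_0). \]
    In any case, we can also write
    \[ D^kf(x_0) = \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| = k}} \frac{1}{I!} \frac{\partial^{|I|} f}{\partial x^I}(x_0)\, \xi_1^{i_1} \cdots \xi_n^{i_n}. \]

    We shall freely interchange the various partial derivative notations discussed above, depending on what we are doing.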

    2.1.2 Formal power series

    To get started with our discussion of real analyticity, it is useful to first engage in a little algebra so that we can write power series without having to worry about convergence.

    2.1.2 Definition (Formal power series with arbitrary indeterminates) Let $\xi = \{\xi_1, \dots, \xi_n\}$ be a finite set and denote by $\mathbb{Z}^\xi_{\ge 0}$ the set of maps from $\xi$ into $\mathbb{Z}_{\ge 0}$. The set of $\mathbb{R}$-formal power series with indeterminates $\xi$ is the set of maps from $\mathbb{Z}^\xi_{\ge 0}$ to $\mathbb{R}$, and is denoted by $\mathbb{R}[[\xi]]$.

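    There is a concrete way to represent $\mathbb{Z}^\xi_{\ge 0}$. Given $\phi \colon \xi \to \mathbb{Z}_{\ge 0}$ we note that $\phi$ is uniquely determined by the $n$-tuple
    \[ (\phi(\xi_1), \dots, \phi(\xi_n)) \in \mathbb{Z}^n_{\ge 0}. \]
    Such an $n$-tuple is nothing but an $n$-multi-index. Therefore, we shall identify $\mathbb{Z}^\xi_{\ge 0}$ with the set $\mathbb{Z}^n_{\ge 0}$ of multi-indices.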

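    Therefore, rather than writing $\alpha(\phi)$ for $\alpha \in \mathbb{R}[[\xi]]$ and $\phi \in \mathbb{Z}^\xi_{\ge 0}$, we shall write $\alpha(I)$ for $I \in \mathbb{Z}^n_{\ge 0}$. Using this notation, the $\mathbb{R}$-algebra operations are defined by
    \begin{align*}
    (\alpha + \beta)(I) &= \alpha(I) + \beta(I), \\
    (a\alpha)(I) &= a(\alpha(I)), \\
    (\alpha \cdot \beta)(I) &= \sum_{\substack{I_1, I_2 \in \mathbb{Z}^n_{\ge 0} \\ I_1 + I_2 = I}} \alpha(I_1)\beta(I_2),
    \end{align*}
    for $a \in \mathbb{R}$ and $\alpha, \beta \in \mathbb{R}[[\xi_1, \dots, \xi_n]]$. We shall identify the indeterminate $\xi_j$, $j \in \{1, \dots, n\}$, with the element $\xi_j$ of $\mathbb{R}[[\xi]]$ defined by
    \[ \xi_j(I) = \begin{cases} 1, & I = e_j, \\ 0, & \text{otherwise}, \end{cases} \]
    where $(e_1, \dots, e_n)$ is the standard basis for $\mathbb{R}^n$, thought of as a collection of elements of $\mathbb{Z}^n_{\ge 0}$. One can readily verify that, using this identification, the $k$-fold product of $\xi_j$ is
    \[ \xi_j^k(I) = \begin{cases} 1, & I = ke_j, \\ 0, & \text{otherwise}. \end{cases} \]
    Therefore, it is straightforward to see that if $\alpha \in \mathbb{R}[[\xi]]$ then
    \[ \alpha = \sum_{I = (i_1, \dots, i_n) \in \mathbb{Z}^n_{\ge 0}} \alpha(I)\, \xi_1^{i_1} \cdots \xi_n^{i_n}. \tag{2.1} \]
    Adopting the notational convention $\xi^I = \xi_1^{i_1} \cdots \xi_n^{i_n}$, the preceding formula admits the compact representation
    \[ \alpha = \sum_{I \in \mathbb{Z}^n_{\ge 0}} \alpha(I)\xi^I. \]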
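The algebra operations above can be sketched computationally by storing finitely many coefficients of a formal power series in a dictionary keyed by multi-indices. This is only an illustration under stated assumptions: the names `multiply` and `max_order` are hypothetical, and the truncation at total order `max_order` is a device of the sketch, not part of the definition. The truncation is harmless for the coefficients that are kept, since $(\alpha \cdot \beta)(I)$ only involves $\alpha(I_1)$ and $\beta(I_2)$ with $I_1 + I_2 = I$.

```python
def multiply(alpha, beta, max_order):
    """Cauchy product (alpha * beta)(I) = sum over I1 + I2 = I of alpha(I1) beta(I2).

    Series are dictionaries {multi-index tuple: coefficient}; only the
    coefficients with |I| <= max_order are retained.
    """
    result = {}
    for I1, a in alpha.items():
        for I2, b in beta.items():
            I = tuple(p + q for p, q in zip(I1, I2))
            if sum(I) <= max_order:
                result[I] = result.get(I, 0) + a * b
    return result


# The indeterminates xi_1 and xi_2 as elements of R[[xi_1, xi_2]].
xi1 = {(1, 0): 1}
xi2 = {(0, 1): 1}

# The k-fold product of xi_1 is supported at k e_1.
print(multiply(multiply(xi1, xi1, 3), xi1, 3))

# (xi_1 + xi_2)^2 has coefficients 1, 2, 1.
s = {(1, 0): 1, (0, 1): 1}
print(multiply(s, s, 2))
```

The first printed dictionary has the single key `(3, 0)`, in agreement with the formula for $\xi_j^k$ given above.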

    We can describe explicitly the units in the ring $\mathbb{R}[[\xi]]$, and give a formula for the inverse for these units.

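    2.1.3 Proposition (Units in $\mathbb{R}[[\xi]]$) A member $\alpha \in \mathbb{R}[[\xi]]$ is a unit if and only if $\alpha(0) \ne 0$. Moreover, if $\alpha$ is a unit, then we have
    \[ \alpha^{-1}(I) = \frac{1}{\alpha(0)} \sum_{k=0}^\infty \Bigl(1 - \frac{\alpha}{\alpha(0)}\Bigr)^k(I) \]
    for all $I \in \mathbb{Z}^n_{\ge 0}$.

    Proof First of all, suppose that $\alpha$ is a unit. Thus there exists $\beta \in \mathbb{R}[[\xi]]$ such that $\alpha \cdot \beta = 1$. In particular, this means that $\alpha(0)\beta(0) = 1$, and so $\alpha(0)$ is a unit in $\mathbb{R}$, i.e., is nonzero. Next suppose that $\alpha(0) \ne 0$. To prove that $\alpha$ is a unit we use the following lemma.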

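    1 Lemma If $\beta \in \mathbb{R}[[\xi]]$ satisfies $\beta(0) = 0$ then $(1 - \beta)$ is a unit in $\mathbb{R}[[\xi]]$ and
    \[ (1 - \beta)^{-1}(I) = \sum_{k=0}^\infty \beta^k(I) \]
    for all $I \in \mathbb{Z}^n_{\ge 0}$.

    Proof First of all, we claim that $\sum_{k=0}^\infty \beta^k$ is a well-defined element of $\mathbb{R}[[\xi]]$. To see this, we claim that, for $k \in \mathbb{Z}_{>0}$, $\beta^k(I) = 0$ whenever $|I| \in \{0, 1, \dots, k-1\}$. We can prove this by induction on $k$. For $k = 1$ this follows from the assumption that $\beta(0) = 0$. So suppose that $\beta^k(I) = 0$ for $|I| \in \{0, 1, \dots, k-1\}$, whenever $k \in \{1, \dots, r\}$. Then, for $I \in \mathbb{Z}^n_{\ge 0}$ satisfying $|I| \in \{0, 1, \dots, r\}$, we have
    \begin{align*}
    \beta^{r+1}(I) = (\beta \cdot \beta^r)(I)
      &= \sum_{\substack{I_1, I_2 \in \mathbb{Z}^n_{\ge 0} \\ I_1 + I_2 = I}} \beta(I_1)\beta^r(I_2) \\
      &= \beta(0)\beta^r(I) + \sum_{\substack{I' \in \mathbb{Z}^n_{\ge 0} \setminus \{0\} \\ I - I' \in \mathbb{Z}^n_{\ge 0}}} \beta(I')\beta^r(I - I') = 0,
    \end{align*}
    using the definition of the product in $\mathbb{R}[[\xi]]$ and the induction hypothesis, the latter applying since $|I - I'| \le |I| - 1 \le r - 1$ for each term in the last sum. Thus we indeed have $\beta^k(I) = 0$ whenever $|I| \in \{0, 1, \dots, k-1\}$. This implies that, if $I \in \mathbb{Z}^n_{\ge 0}$, then the sum $\sum_{k=0}^\infty \beta^k(I)$ is finite, and the formula in the statement of the lemma for $(1 - \beta)^{-1}$ at least makes sense. To see that it is actually the inverse of $1 - \beta$, for $I \in \mathbb{Z}^n_{\ge 0}$ we compute
    \[ \Bigl((1 - \beta) \cdot \sum_{k=0}^\infty \beta^k\Bigr)(I) = \sum_{k=0}^\infty \beta^k(I) - \sum_{k=1}^\infty \beta^k(I) = \beta^0(I), \]
    and $\beta^0 = 1$ is the identity element of $\mathbb{R}[[\xi]]$, as desired.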

    Proceeding with the proof, let us define $\beta = 1 - \frac{\alpha}{\alpha(0)}$ so that $\beta(0) = 0$. By the lemma, $1 - \beta$ is a unit. Since $\alpha = \alpha(0)(1 - \beta)$ it follows that $\alpha$ is also a unit, and that $\alpha^{-1} = \alpha(0)^{-1}(1 - \beta)^{-1}$. The formula in the statement of the proposition then follows from the lemma above.

    Note that one of the consequences of the proof of the proposition is that the expression given for $\alpha^{-1}$ makes sense since the sum is finite for a fixed $I \in \mathbb{Z}^n_{\ge 0}$.
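The formula of Proposition 2.1.3 can be tried out on truncated series. The sketch below is an illustration only: the names `multiply`, `inverse`, and `max_order` are hypothetical, series are stored as dictionaries keyed by multi-indices, and exact rational arithmetic is used so that the check is not clouded by rounding. Because $\beta^k$ has no terms of order less than $k$, summing $\beta^k$ for $k \le$ `max_order` already gives all coefficients of the inverse up to that order.

```python
from fractions import Fraction


def multiply(alpha, beta, max_order):
    """Cauchy product of two series, keeping coefficients with |I| <= max_order."""
    result = {}
    for I1, a in alpha.items():
        for I2, b in beta.items():
            I = tuple(p + q for p, q in zip(I1, I2))
            if sum(I) <= max_order:
                result[I] = result.get(I, 0) + a * b
    return result


def inverse(alpha, n, max_order):
    """Coefficients of alpha^{-1} up to order max_order, assuming alpha(0) != 0."""
    zero = (0,) * n
    a0 = Fraction(alpha.get(zero, 0))
    if a0 == 0:
        raise ValueError("alpha(0) = 0, so alpha is not a unit")
    # beta = 1 - alpha / alpha(0), which satisfies beta(0) = 0.
    beta = {I: -c / a0 for I, c in alpha.items() if I != zero}
    total = {zero: Fraction(1)}   # running sum of beta^k, starting with beta^0
    power = {zero: Fraction(1)}   # current power beta^k
    for _ in range(max_order):
        power = multiply(power, beta, max_order)
        for I, c in power.items():
            total[I] = total.get(I, 0) + c
    return {I: c / a0 for I, c in total.items()}


# alpha = 1 - xi_1 - xi_2, whose inverse is sum_k (xi_1 + xi_2)^k.
alpha = {(0, 0): 1, (1, 0): -1, (0, 1): -1}
inv = inverse(alpha, 2, 3)
print(inv[(1, 1)], inv[(2, 1)])

# The product alpha * inv agrees with 1 in all retained orders.
check = multiply(alpha, inv, 3)
print({I: c for I, c in check.items() if c != 0})
```

For this particular $\alpha$ the coefficient of $\xi_1^{i}\xi_2^{j}$ in the inverse is the binomial coefficient $\binom{i+j}{i}$, which is what the first line of output reflects.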

    2.1.3 Formal Taylor series

    One can see an obvious notational resemblance between the representation (2.1) and power series in the usual sense. A common form of power series is the Taylor series for an infinitely differentiable function about a point. In this section we flesh this out by assigning to a $C^\infty$-map a formal power series in a natural way. Throughout this section we let $\xi = \{\xi_1, \dots, \xi_n\}$ so $\mathbb{R}[[\xi]]$ denotes the $\mathbb{R}$-formal power series in these indeterminates.

    We let $(e_1, \dots, e_n)$ be the standard basis for $\mathbb{R}^n$. We might typically denote the dual basis for $(\mathbb{R}^n)^*$ by $(e^1, \dots, e^n)$, but notationally, in this section, it is instead convenient to denote the dual basis by $(\xi_1, \dots, \xi_n)$.

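    Let $x_0 \in \mathbb{R}^n$ and let $U$ be a neighbourhood of $x_0 \in \mathbb{R}^n$. We suppose that $f \colon U \to \mathbb{R}^m$ is infinitely differentiable. We let $D^kf(x_0)$ be the $k$th derivative of $f$ at $x_0$, noting that this is an element of $L^k_{\mathrm{sym}}(\mathbb{R}^n; \mathbb{R}^m)$ as discussed in Section 2.1.1. As we saw in our discussion in Section 2.1.1, writing this in the basis for $L^k_{\mathrm{sym}}(\mathbb{R}^n; \mathbb{R}^m)$ gives
    \[ D^kf(x_0) = \sum_{a=1}^m \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| = k}} \frac{1}{I!} \frac{\partial^{|I|} f_a}{\partial x^I}(x_0)\, (\xi_1^{i_1} \cdots \xi_n^{i_n}) \otimes e_a. \]
    Thus, to $f \in C^\infty(U; \mathbb{R}^m)$ we can associate an element $\alpha_f(x_0) \in \mathbb{R}[[\xi_1, \dots, \xi_n]] \otimes \mathbb{R}^m$ by
    \[ \alpha_f(x_0) = \sum_{a=1}^m \sum_{I \in \mathbb{Z}^n_{\ge 0}} \frac{1}{I!} \frac{\partial^{|I|} f_a}{\partial x^I}(x_0)\, (\xi_1^{i_1} \cdots \xi_n^{i_n}) \otimes e_a. \]
    One can (somewhat tediously) verify using the high-order Leibniz Rule [Abraham, Marsden, and Ratiu 1988, Supplement 2.4A] that, for $\mathbb{R}$-valued functions, this map is a homomorphism of $\mathbb{R}$-algebras. That is to say,
    \[ \alpha_{af}(x_0) = a\alpha_f(x_0), \quad \alpha_{f+g}(x_0) = \alpha_f(x_0) + \alpha_g(x_0), \quad \alpha_{fg}(x_0) = \alpha_f(x_0)\alpha_g(x_0). \]
    We shall call $\alpha_f(x_0)$ the formal Taylor series of $f$ at $x_0$.

    To initiate our discussions of convergence, let us consider $\mathbb{R}$-valued functions for the moment, just for simplicity. The expression for $\alpha_f(x_0)$ is reminiscent of the Taylor series for $f$ about $x_0$:
    \[ \sum_{k=0}^\infty \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| = k}} \frac{1}{I!} \frac{\partial^{|I|} f}{\partial x^I}(x_0)(x - x_0)^I. \]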

    This series will generally not converge, even though as small children we probably thought that it did converge for infinitely differentiable functions. The situation regarding convergence is, in fact, as dire as possible, as is shown by the following theorem of Borel [1895]. Actually, Borel only proves the case where $n = 1$. The proof we give for arbitrary $n$ follows [Mirkil 1956].

    2.1.4 Theorem (Borel) If $x_0 \in \mathbb{R}^n$ and if $U$ is a neighbourhood of $x_0$, then the map $f \mapsto \alpha_f(x_0)$ from $C^\infty(U)$ to $\mathbb{R}[[\xi]]$ is surjective.

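    Proof For $I \in \mathbb{Z}^n_{\ge 0}$ let us abbreviate
    \[ f^{(I)}(x) = \frac{\partial^{|I|} f}{\partial x^I}(x). \]
    Let us also define $h \colon \mathbb{R} \to \mathbb{R}$ by
    \[ h(x) = \begin{cases}
      0, & x \in (-\infty, -2], \\
      \mathrm{e} \cdot \mathrm{e}^{-1/(1 - (x+1)^2)}, & x \in (-2, -1), \\
      1, & x \in [-1, 1], \\
      \mathrm{e} \cdot \mathrm{e}^{-1/(1 - (x-1)^2)}, & x \in (1, 2), \\
      0, & x \in [2, \infty).
    \end{cases} \]
    As is well-known, cf. [Abraham, Marsden, and Ratiu 1988, Page 82] and Example 2.1.5, the function $h$ is infinitely differentiable. We depict the function in Figure 2.1.

    Figure 2.1 The bump function (graph of $h(x)$ against $x$ for $x \in [-3, 3]$)

    Let $\alpha \in \mathbb{R}[[\xi]]$. Without loss of generality we assume that $x_0 = 0$. Let $r \in \mathbb{R}_{>0}$ be such that $\overline{B}^n(r, 0) \subseteq U$. We recursively define a sequence $(f_j)_{j \in \mathbb{Z}_{\ge 0}}$ in $C^\infty(U)$ as follows. We take $f_0 \in C^\infty(U)$ such that $f_0(0) = \alpha(0)$ and such that $\operatorname{supp}(f_0) \subseteq \overline{B}^n(r, 0)$, e.g., take
    \[ f_0(x) = \alpha(0) h(\tfrac{2}{r}\|x\|). \]
    Now suppose that $f_0, f_1, \dots, f_k$ have been defined and define $g_{k+1} \colon U \to \mathbb{R}$ to be a homogeneous polynomial function in $x_1, \dots, x_n$ of degree $k+1$ so that, for every multi-index $I = (i_1, \dots, i_n)$ for which $|I| = k+1$, we have
    \[ g^{(I)}_{k+1}(0) = I!\,\alpha(I) - f^{(I)}_0(0) - \cdots - f^{(I)}_k(0), \]
    the factor $I!$ accounting for the factor $\frac{1}{I!}$ in the definition of the formal Taylor series. Note that $g^{(I)}_{k+1}(0) = 0$ if $|I| \in \{0, 1, \dots, k\}$ since in these cases $g^{(I)}_{k+1}$ will be a homogeneous polynomial of degree $k + 1 - |I|$. Next let
    \[ \tilde{g}_{k+1}(x) = g_{k+1}(x) h(\tfrac{2}{r}\|x\|) \]
    so that, since the function $x \mapsto h(\tfrac{2}{r}\|x\|)$ is equal to 1 in a neighbourhood of 0,
    \[ \tilde{g}^{(I)}_{k+1}(0) = I!\,\alpha(I) - f^{(I)}_0(0) - \cdots - f^{(I)}_k(0) \]
    if $|I| = k+1$, and $\tilde{g}^{(I)}_{k+1}(0) = 0$ if $|I| \in \{0, 1, \dots, k\}$. Also $\operatorname{supp}(\tilde{g}_{k+1}) \subseteq \overline{B}^n(r, 0)$. Next let $\lambda \in \mathbb{R}_{>0}$. If we define $h_{\lambda,k+1}(x) = \lambda^{-k-1}\tilde{g}_{k+1}(\lambda x)$ then, for a multi-index $I$ with $|I| = m$,
    \[ h^{(I)}_{\lambda,k+1}(x) = \lambda^{-k-1+m}\,\tilde{g}^{(I)}_{k+1}(\lambda x). \]
    Thus, if $|I| \in \{0, 1, \dots, k\}$, then we can choose $\lambda \ge 1$ sufficiently large that $|h^{(I)}_{\lambda,k+1}(x)| < 2^{-k-1}$ for every $x \in \overline{B}^n(r, 0)$. With $\lambda$ so chosen we take
    \[ f_{k+1}(x) = h_{\lambda,k+1}(x). \]
    This recursive definition ensures that, for each $k \in \mathbb{Z}_{\ge 0}$, $f_k$ has the following properties:

    1. $\operatorname{supp}(f_k) \subseteq \overline{B}^n(r, 0)$;
    2. $f^{(I)}_k(0) = I!\,\alpha(I) - f^{(I)}_0(0) - \cdots - f^{(I)}_{k-1}(0)$ whenever $I = (i_1, \dots, i_n)$ satisfies $|I| = k$;
    3. $f^{(I)}_k(0) = 0$ if $|I| \in \{0, 1, \dots, k-1\}$;
    4. $|f^{(I)}_k(x)| < 2^{-k}$ whenever $|I| \in \{0, 1, \dots, k-1\}$ and $x \in \overline{B}^n(r, 0)$.

    We then define $f(x) = \sum_{k=0}^\infty f_k(x)$.

    From the second and third properties of the functions $f_k$, $k \in \mathbb{Z}_{\ge 0}$, above we see that $\alpha = \alpha_f(0)$. It remains to show that $f$ is infinitely differentiable. We shall do this by showing that the sequences of partial sums for all partial derivatives converge uniformly. The partial sums we denote by
    \[ F_m(x) = \sum_{k=0}^m f_k(x). \]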


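    Since all functions in our series have support contained in $\overline{B}^n(r, 0)$, this is tantamount to showing that, for all multi-indices $I$, the sequences $(F^{(I)}_m)_{m \in \mathbb{Z}_{\ge 0}}$ are Cauchy sequences in the Banach space $C^0(\overline{B}^n(r, 0); \mathbb{R})$ of continuous $\mathbb{R}$-valued functions on $\overline{B}^n(r, 0)$ equipped with the norm
    \[ \|g\|_\infty = \sup\{|g(x)| \mid x \in \overline{B}^n(r, 0)\}. \]
    Let $\epsilon \in \mathbb{R}_{>0}$ and let $I = (i_1, \dots, i_n)$ be a multi-index. Let $N \in \mathbb{Z}_{>0}$ be such that
    \[ \sum_{j=l+1}^m \frac{1}{2^j} < \epsilon \]
    for every $l, m \ge N$ with $m > l$, this being possible since $\sum_{j=1}^\infty 2^{-j} < \infty$. Then, for $l, m \ge \max\{N, |I|\}$ with $m > l$ and $x \in \overline{B}^n(r, 0)$, we have
    \[ |F^{(I)}_l(x) - F^{(I)}_m(x)| = \Bigl|\sum_{j=l+1}^m f^{(I)}_j(x)\Bigr| \le \sum_{j=l+1}^m |f^{(I)}_j(x)| \le \sum_{j=l+1}^m \frac{1}{2^j} < \epsilon, \]
    showing that $(F^{(I)}_m)_{m \in \mathbb{Z}_{>0}}$ is a Cauchy sequence in $C^0(\overline{B}^n(r, 0); \mathbb{R})$, as desired.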
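The bump function used in the proof is easy to evaluate numerically, and doing so makes its role as a cutoff plain. The following minimal sketch implements $h$ and the radial cutoff $x \mapsto h(\tfrac{2}{r}\|x\|)$; the function names `bump` and `cutoff` are hypothetical, and the symmetry $h(-x) = h(x)$ is used to shorten the case analysis.

```python
import math


def bump(x):
    """The function h: equal to 1 on [-1, 1], equal to 0 outside (-2, 2)."""
    ax = abs(x)  # h is even, so it suffices to treat x >= 0
    if ax <= 1.0:
        return 1.0
    if ax >= 2.0:
        return 0.0
    return math.e * math.exp(-1.0 / (1.0 - (ax - 1.0) ** 2))


def cutoff(x, r):
    """The function x -> h(2 |x| / r): 1 for |x| <= r/2 and 0 for |x| >= r."""
    norm = math.sqrt(sum(xj * xj for xj in x))
    return bump(2.0 * norm / r)


for t in (0.0, 1.0, 1.25, 1.5, 1.75, 2.0):
    print(t, bump(t))

print(cutoff((0.1, 0.1), 1.0), cutoff((1.0, 0.0), 1.0))
```

Multiplying a polynomial by `cutoff` leaves all of its derivatives at the origin unchanged while forcing the support into the closed ball of radius $r$, which is exactly how the functions $\tilde{g}_{k+1}$ are produced in the proof.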

    Thus any possible coefficients in a formal power series can arise as the Taylor coefficients for an infinitely differentiable function. Of course, an arbitrary power series
    \[ \sum_{k=0}^\infty \sum_{\substack{I = (i_1, \dots, i_n) \\ |I| = k}} \alpha(I)(x_1 - x_{01})^{i_1} \cdots (x_n - x_{0n})^{i_n} \]
    may well only converge when $x = x_0$. Not only this, but even when the Taylor series does converge, it may not converge to the function producing its coefficients.

    2.1.5 Example (A Taylor series not converging to the function giving rise to it) We define $f \colon \mathbb{R} \to \mathbb{R}$ by
    \[ f(x) = \begin{cases} \mathrm{e}^{-\frac{1}{x^2}}, & x \ne 0, \\ 0, & x = 0, \end{cases} \]
    and in Figure 2.2 we show the graph of $f$. We claim that the Taylor series for $f$ is the zero $\mathbb{R}$-formal power series. To prove this, we must compute the derivatives of $f$ at $x = 0$. The following lemma is helpful in this regard.

    1 Lemma For $j \in \mathbb{Z}_{\ge 0}$ there exists a polynomial $p_j$ of degree at most $2j$ such that
    \[ f^{(j)}(x) = \frac{p_j(x)}{x^{3j}} \mathrm{e}^{-\frac{1}{x^2}}, \qquad x \ne 0. \]

    Proof We prove this by induction on $j$. Clearly the lemma holds for $j = 0$ by taking $p_0(x) = 1$. Now suppose the lemma holds for $j \in \{0, 1, \dots, k\}$. Thus
    \[ f^{(k)}(x) = \frac{p_k(x)}{x^{3k}} \mathrm{e}^{-\frac{1}{x^2}} \]
    for a polynomial $p_k$ of degree at most $2k$. Then we compute
    \[ f^{(k+1)}(x) = \frac{x^3p_k'(x) - 3kx^2p_k(x) + 2p_k(x)}{x^{3(k+1)}} \mathrm{e}^{-\frac{1}{x^2}}. \]
    Using the rules for differentiation of polynomials, one easily checks that $x \mapsto x^3p_k'(x) - 3kx^2p_k(x) + 2p_k(x)$ is a polynomial whose degree is at most $2(k+1)$.

    Figure 2.2 Everyone's favourite smooth but not analytic function (graph of $f(x)$ against $x$ for $x \in [-2, 2]$)

    From the lemma we infer the infinite differentiability of $f$ on $\mathbb{R} \setminus \{0\}$. We now need to consider the derivatives at 0. For this we employ another lemma.

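    2 Lemma $\displaystyle\lim_{x \to 0} \frac{\mathrm{e}^{-\frac{1}{x^2}}}{x^k} = 0$ for all $k \in \mathbb{Z}_{\ge 0}$.

    Proof We note that
    \[ \lim_{x \downarrow 0} \frac{\mathrm{e}^{-\frac{1}{x^2}}}{x^k} = \lim_{y \to \infty} \frac{y^k}{\mathrm{e}^{y^2}}, \qquad \lim_{x \uparrow 0} \frac{\mathrm{e}^{-\frac{1}{x^2}}}{x^k} = \lim_{y \to -\infty} \frac{y^k}{\mathrm{e}^{y^2}}. \]
    We have
    \[ \mathrm{e}^{y^2} = \sum_{j=0}^\infty \frac{y^{2j}}{j!}. \]
    In particular, $\mathrm{e}^{y^2} \ge \frac{y^{2(k+1)}}{(k+1)!}$, and so
    \[ \Bigl|\frac{y^k}{\mathrm{e}^{y^2}}\Bigr| \le \frac{(k+1)!}{|y|^{k+2}}, \]
    and so
    \[ \lim_{x \to 0} \frac{\mathrm{e}^{-\frac{1}{x^2}}}{x^k} = 0, \]
    as desired.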

    Now, letting $p_k(x) = \sum_{j=0}^{2k} a_jx^j$, we may directly compute
    \[ \lim_{x \to 0} f^{(k)}(x) = \lim_{x \to 0} \sum_{j=0}^{2k} a_jx^j \frac{\mathrm{e}^{-\frac{1}{x^2}}}{x^{3k}} = \sum_{j=0}^{2k} a_j \lim_{x \to 0} \frac{\mathrm{e}^{-\frac{1}{x^2}}}{x^{3k-j}} = 0. \]

    Thus we arrive at the conclusion that $f$ is infinitely differentiable on $\mathbb{R}$, and that $f$ and all of its derivatives are zero at $x = 0$. Thus the Taylor series is indeed zero. This is clearly a convergent power series; it converges everywhere to the zero function. However, $f(x) \ne 0$ except when $x = 0$. Thus the Taylor series about 0 for $f$, while convergent everywhere, converges to $f$ only at $x = 0$. This is therefore an example of a function that is infinitely differentiable at a point, but is not equal to its Taylor series at $x = 0$. This function may seem rather useless, but in actuality it is quite an important one. For example, we used it in the construction for the proof of Theorem 2.1.4. It is also used in the construction of partitions of unity which are so important in smooth differential geometry, and whose absence in real analytic differential geometry makes the latter subject so subtle.

    Another way to think of the preceding example is that it tells us that the map $f \mapsto \alpha_f(x_0)$ from $C^\infty(U)$ to $\mathbb{R}[[\xi]]$, while surjective, is not injective.
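The structure of the derivatives of $f(x) = \mathrm{e}^{-1/x^2}$ described in the example can be explored with a short computation. The sketch below is an illustration under stated assumptions: it generates polynomials by the recursion $p_{k+1}(x) = x^3p_k'(x) - 3kx^2p_k(x) + 2p_k(x)$ with $p_0 = 1$, which is what differentiating $p_k(x)x^{-3k}\mathrm{e}^{-1/x^2}$ gives, and the names `next_poly`, `degree`, and `derivative_formula` are hypothetical. Polynomials are stored as coefficient lists, lowest degree first.

```python
import math


def f(x):
    """The smooth but not analytic function of the example."""
    return math.exp(-1.0 / x ** 2) if x != 0 else 0.0


def next_poly(p, k):
    """Coefficients of x^3 p'(x) - 3k x^2 p(x) + 2 p(x), given those of p = p_k."""
    out = [0] * (len(p) + 2)
    for d, c in enumerate(p):
        out[d] += 2 * c                # contribution of 2 p
        out[d + 2] += (d - 3 * k) * c  # contributions of x^3 p' and -3k x^2 p
    return out


def degree(p):
    nonzero = [d for d, c in enumerate(p) if c != 0]
    return max(nonzero) if nonzero else 0


def derivative_formula(p, k, x):
    """Evaluate p(x) / x^{3k} * exp(-1/x^2) for x != 0."""
    value = sum(c * x ** d for d, c in enumerate(p))
    return value / x ** (3 * k) * math.exp(-1.0 / x ** 2)


polys = [[1]]                          # p_0 = 1
for k in range(6):
    polys.append(next_poly(polys[k], k))

for k, p in enumerate(polys):
    print(k, degree(p), derivative_formula(p, k, 0.05))
```

The printed degrees stay within the bound $2k$ of the lemma, and the values of the derivatives at $x = 0.05$ are already extraordinarily small, consistent with all derivatives of $f$ tending to zero at the origin.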

    2.1.4 Convergent power series

    Throughout this section we let $\xi = \{\xi_1, \dots, \xi_n\}$ so $\mathbb{R}[[\xi]]$ denotes the $\mathbb{R}$-formal power series in these indeterminates.

    Let us turn to formal power series that converge, and give some of their properties. Recalling notation from Section 2.1.1, we state the following.

    2.1.6 Definition (Convergent formal power series) Let $\xi = \{\xi_1, \dots, \xi_n\}$. A formal power series $\alpha \in \mathbb{R}[[\xi]]$ converges at $x \in \mathbb{R}^n$ if there exists a bijection $\phi \colon \mathbb{Z}_{>0} \to \mathbb{Z}^n_{\ge 0}$ such that the series
    \[ \sum_{j=1}^\infty \alpha(\phi(j))x^{\phi(j)} \]
    converges. Let us denote by $R_{\mathrm{conv}}(\alpha)$ the set of points $x \in \mathbb{R}^n$ such that $\alpha$ converges at $x$. We call $R_{\mathrm{conv}}(\alpha)$ the region of convergence. We denote by
    \[ \hat{\mathbb{R}}[[\xi]] = \{\alpha \in \mathbb{R}[[\xi]] \mid R_{\mathrm{conv}}(\alpha) \ne \{0\}\} \]
    the set of power series converging at some nonzero point.

    2.1.7 Remark (On notions of convergence for multi-indexed sums) Note that the definition of convergence we give is quite weak, as we require convergence for some arrangement of the index set $\mathbb{Z}^n_{\ge 0}$. A stronger notion of convergence would be that the series
    \[ \sum_{j=1}^\infty \alpha(\phi(j))x^{\phi(j)} \]
    converge for every bijection $\phi \colon \mathbb{Z}_{>0} \to \mathbb{Z}^n_{\ge 0}$. This, it turns out, is equivalent to absolute convergence of the series, i.e., that
    \[ \sum_{I \in \mathbb{Z}^n_{\ge 0}} |\alpha(I)||x|^I < \infty. \]
    This is essentially explained by Roman [2005] (see Theorem 13.24) and Rudin [1976] (see Theorem 3.55).¹ We shall take an understanding of this for granted.

    Let us now show that convergence as in the definition above at any nontrivial point (i.e., a nonzero point) leads to a strong form of convergence at a large subset of other points. To be precise about this, for $x \in \mathbb{R}^n$ let us denote
    \[ C(x) = \{(c_1x_1, \dots, c_nx_n) \in \mathbb{R}^n \mid c_1, \dots, c_n \in (-1, 1)\}. \]
    Thus $C(x)$ is the smallest open cube centred at the origin whose closure contains $x$.

    2.1.8 Theorem (Uniform and absolute convergence of formal power series) Let $\alpha \in \mathbb{R}[[\xi]]$ and suppose that $\alpha$ converges at $x_0 \in \mathbb{R}^n$. Then $\alpha$ converges uniformly and absolutely on every compact subset of $C(x_0)$.

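    Proof Let $K \subseteq C(x_0)$ be compact. The theorem holds trivially if $K = \{0\}$, so we suppose this is not the case. Let $\lambda \in (0, 1)$ be such that $|x_j| \le \lambda|x_{0j}|$ for $x \in K$ and $j \in \{1, \dots, n\}$. Let $\phi \colon \mathbb{Z}_{>0} \to \mathbb{Z}^n_{\ge 0}$ be a bijection such that
    \[ \sum_{j=1}^\infty \alpha(\phi(j))x_0^{\phi(j)} \]
    converges. This implies, in particular, that the sequence $(\alpha(\phi(j))x_0^{\phi(j)})_{j \in \mathbb{Z}_{>0}}$ is bounded. Thus there exists $C \in \mathbb{R}_{>0}$ such that $|\alpha(I)||x_0|^I \le C$ for every $I \in \mathbb{Z}^n_{\ge 0}$. Then $|\alpha(I)||x|^I \le C\lambda^{|I|}$ for every $x \in K$. In order to complete the proof we use the following lemma.

    1 Lemma For $x \in (-1, 1)$ and $m \in \mathbb{Z}_{\ge 0}$,
    \[ \sum_{j=0}^\infty \frac{(m+j)!}{j!}x^j = \frac{\mathrm{d}^m}{\mathrm{d}x^m}\Bigl(\frac{x^m}{1-x}\Bigr). \]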

    Proof Let $a \in (0, 1)$ and recall that
    \[ \frac{1}{1-a} = \sum_{j=0}^\infty a^j \quad \Longrightarrow \quad \frac{a^m}{1-a} = \sum_{j=0}^\infty a^{m+j} \]
    [Rudin 1976, Theorem 3.26]. Also, by the ratio test, the series
    \[ \sum_{j=0}^\infty (m+j)(m+j-1) \cdots (m+j-k)a^{m+j-k-1} \tag{2.2} \]
    converges for $k \in \mathbb{Z}_{\ge 0}$.

    Now, for $x \in [-a, a]$, since $|x^{m+j}| \le a^{m+j}$ and since $\sum_{j=0}^\infty a^{m+j} = \frac{a^m}{1-a} < \infty$, we have
    \[ \sum_{j=0}^\infty x^{m+j} = \frac{x^m}{1-x}, \]
    with the convergence being uniform and absolute on $[-a, a]$. Thus the series can be differentiated term-by-term to give
    \[ \frac{\mathrm{d}}{\mathrm{d}x}\Bigl(\frac{x^m}{1-x}\Bigr) = \sum_{j=0}^\infty (m+j)x^{m+j-1}. \]
    Since $|(m+j)x^{m+j-1}| \le (m+j)a^{m+j-1}$, this differentiated series converges uniformly and absolutely on $[-a, a]$ since the series (2.2) converges. In fact, by the same argument, this differentiation can be made $m$ times to give
    \[ \frac{\mathrm{d}^m}{\mathrm{d}x^m}\Bigl(\frac{x^m}{1-x}\Bigr) = \sum_{j=0}^\infty (m+j) \cdots (m+j-m+1)x^j = \sum_{j=0}^\infty \frac{(m+j)!}{j!}x^j, \]
    as desired.

    ¹ Also see the important paper of Dvoretzky and Rogers [1950] in this regard, where it is shown that the equivalence of absolute and unconditional convergence only holds in finite dimensions.

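    Now, continuing with the proof, for $x \in K$ and for $m \in \mathbb{Z}_{\ge 0}$ we have
    \begin{align*}
    \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| \le m}} |\alpha(I)x^I|
      &\le \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| \le m}} |\alpha(I)||x|^I
       \le \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| \le m}} C\lambda^{|I|}
       \le C\sum_{j=0}^\infty \binom{n+j-1}{n-1}\lambda^j \\
      &= C\sum_{j=0}^\infty \frac{(n+j-1)!}{(n-1)!\,j!}\lambda^j
       = \frac{C}{(n-1)!}\frac{\mathrm{d}^{n-1}}{\mathrm{d}\lambda^{n-1}}\Bigl(\frac{\lambda^{n-1}}{1-\lambda}\Bigr),
    \end{align*}
    using Lemma 2.1.1 and Lemma 1. Thus the sum
    \[ \sum_{I \in \mathbb{Z}^n_{\ge 0}} |\alpha(I)x^I| \]
    converges absolutely on $K$, and uniformly in $x \in K$ since our computation above provides a bound independent of $x$.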
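The two series identities used in the proof can be checked numerically. The sketch below is only an illustration, with hypothetical function names. It relies on the closed form $\frac{\mathrm{d}^m}{\mathrm{d}x^m}\bigl(\frac{x^m}{1-x}\bigr) = \frac{m!}{(1-x)^{m+1}}$, which holds because $\frac{x^m}{1-x}$ differs from $\frac{1}{1-x}$ by a polynomial of degree $m-1$; this closed form is a standard fact and is not stated in the notes. With it, the bound obtained in the proof becomes simply $C/(1-\lambda)^n$.

```python
from math import comb, factorial


def lemma_partial_sum(m, x, terms=400):
    """Partial sum of sum_j (m + j)! / j! * x^j."""
    total = 0.0
    coeff = float(factorial(m))          # (m + 0)! / 0!
    for j in range(terms):
        total += coeff * x ** j
        coeff *= (m + j + 1) / (j + 1)   # (m + j + 1)! / (j + 1)!
    return total


def lemma_closed_form(m, x):
    """m! / (1 - x)^(m + 1), the m-th derivative of x^m / (1 - x)."""
    return factorial(m) / (1.0 - x) ** (m + 1)


def multi_index_geometric_sum(n, lam, terms=400):
    """Partial sum of sum_j binom(n + j - 1, n - 1) * lam^j."""
    return sum(comb(n + j - 1, n - 1) * lam ** j for j in range(terms))


print(lemma_partial_sum(3, 0.5), lemma_closed_form(3, 0.5))
print(lemma_partial_sum(2, -0.5), lemma_closed_form(2, -0.5))
print(multi_index_geometric_sum(3, 0.5), 1.0 / (1.0 - 0.5) ** 3)
```

The last line reflects the fact that summing $\lambda^{|I|}$ over all multi-indices in $\mathbb{Z}^n_{\ge 0}$ gives $(1-\lambda)^{-n}$, by Lemma 2.1.1.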

    The result implies that, if we have convergence (in the weak sense of Definition 2.1.6) for a formal power series at some nonzero point in $\mathbb{R}^n$, we have a strong form of convergence in some neighbourhood of the origin. We now define
    \[ R_{\mathrm{abs}}(\alpha) = \bigcup_{r \in \mathbb{R}_{>0}} \Bigl\{x \in \mathbb{R}^n \Bigm| \sum_{I \in \mathbb{Z}^n_{\ge 0}} |\alpha(I)y^I| < \infty \text{ for all } y \in B^n(r, x)\Bigr\}, \]
    which we call the region of absolute convergence. The following result gives the relationship between the two regions of convergence.

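    2.1.9 Proposition ($\operatorname{int}(R_{\mathrm{conv}}(\alpha)) = R_{\mathrm{abs}}(\alpha)$) For $\alpha \in \mathbb{R}[[\xi]]$, $\operatorname{int}(R_{\mathrm{conv}}(\alpha)) = R_{\mathrm{abs}}(\alpha)$.

    Proof Let $x \in \operatorname{int}(R_{\mathrm{conv}}(\alpha))$. Then there exists $\lambda > 1$ such that $\lambda x \in R_{\mathrm{conv}}(\alpha)$. For such a $\lambda$, $x \in C(\lambda x)$. Let $K \subseteq C(\lambda x)$ be compact and let $r \in \mathbb{R}_{>0}$ be such that $B^n(r, x) \subseteq K$, e.g., take $K$ to be a large enough closed cube. By Theorem 2.1.8 it follows that
    \[ \sum_{I \in \mathbb{Z}^n_{\ge 0}} |\alpha(I)y^I| < \infty \]
    for $y \in B^n(r, x) \subseteq K$, and so $x \in R_{\mathrm{abs}}(\alpha)$.

    If $x \in R_{\mathrm{abs}}(\alpha)$ then there exists $r \in \mathbb{R}_{>0}$ such that
    \[ \sum_{I \in \mathbb{Z}^n_{\ge 0}} |\alpha(I)y^I| < \infty \]
    for $y \in B^n(r, x)$. In particular, $\alpha$ converges at every $y \in B^n(r, x)$ and so $x \in \operatorname{int}(R_{\mathrm{conv}}(\alpha))$.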

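    This result has the following corollary that will be useful for us.

    2.1.10 Corollary (Property of coefficients for convergent power series) If $\alpha \in \mathbb{R}[[\xi]]$ and if $x \in R_{\mathrm{abs}}(\alpha)$ then there exist $C, \epsilon \in \mathbb{R}_{>0}$ such that
    \[ |\alpha(I)| \le \frac{C}{(|x_1| + \epsilon)^{i_1} \cdots (|x_n| + \epsilon)^{i_n}} \]
    for every $I \in \mathbb{Z}^n_{\ge 0}$.

    Proof Note that, if $x \in R_{\mathrm{abs}}(\alpha)$, then
    \[ (|x_1|, \dots, |x_n|) \in R_{\mathrm{abs}}(\alpha) \]
    by definition of the region of absolute convergence and by Theorem 2.1.8. Now, by Proposition 2.1.9 there exists $\epsilon \in \mathbb{R}_{>0}$ such that
    \[ (|x_1| + \epsilon, \dots, |x_n| + \epsilon) \in R_{\mathrm{conv}}(\alpha). \]
    Thus there exists a bijection $\phi \colon \mathbb{Z}_{>0} \to \mathbb{Z}^n_{\ge 0}$ such that
    \[ \sum_{j=1}^\infty \alpha(\phi(j))(|x_1| + \epsilon)^{\phi(j)_1} \cdots (|x_n| + \epsilon)^{\phi(j)_n} \]
    converges. Therefore, the terms in this series must be bounded. Thus there exists $C \in \mathbb{R}_{>0}$ such that
    \[ |\alpha(I)|(|x_1| + \epsilon)^{i_1} \cdots (|x_n| + \epsilon)^{i_n} < C \]
    for every $I \in \mathbb{Z}^n_{\ge 0}$.

    Using this property of the coefficients of a convergent power series, one can deduce the following result.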

    2.1.11 Corollary (Convergent power series converge to infinitely differentiable functions) If $\alpha \in \hat{\mathbb{R}}[[\xi]]$ then the series
    \[ \sum_{I \in \mathbb{Z}^n_{\ge 0}} \alpha(I)x^I \]
    converges in $R_{\mathrm{abs}}(\alpha)$ to an infinitely differentiable function whose derivatives are obtained by differentiating the series term-by-term.

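    Proof Let $f$ denote the function defined on $R_{\mathrm{abs}}(\alpha)$ by the series. By induction it suffices to show that any partial derivative of $f$ is defined on $R_{\mathrm{abs}}(\alpha)$ by a convergent power series. Consider a term $\alpha(I)x^I$ in the power series for $I \in \mathbb{Z}^n_{\ge 0}$. For $j \in \{1, \dots, n\}$ we have
    \[ \frac{\partial}{\partial x_j}\alpha(I)x^I = \begin{cases} 0, & i_j = 0, \\ i_j\alpha(I)x^{I - e_j}, & i_j \ge 1. \end{cases} \]
    Thus, when differentiating the terms in the power series with respect to $x_j$, the only nonzero contribution will come from terms corresponding to multi-indices of the form $I + e_j$. In this case,
    \[ \frac{\partial}{\partial x_j}\alpha(I + e_j)x^{I + e_j} = (i_j + 1)\alpha(I + e_j)x^I. \]
    Therefore, the power series whose terms are the partial derivatives of those for the given power series with respect to $x_j$ is
    \[ \sum_{I \in \mathbb{Z}^n_{\ge 0}} (i_j + 1)\alpha(I + e_j)x^I. \]
    Now let $x \in R_{\mathrm{abs}}(\alpha)$ and, according to Corollary 2.1.10, let $C, \epsilon \in \mathbb{R}_{>0}$ be such that
    \[ |\alpha(I)| \le \frac{C}{(|x_1| + \epsilon)^{i_1} \cdots (|x_n| + \epsilon)^{i_n}}, \qquad I \in \mathbb{Z}^n_{\ge 0}. \]
    Let $y \in R_{\mathrm{abs}}(\alpha)$ be such that
    \[ y \in D^n(\tfrac{\epsilon}{2}, x) = \{x' \in \mathbb{R}^n \mid |x'_j - x_j| < \tfrac{\epsilon}{2},\ j \in \{1, \dots, n\}\}. \]
    Note that
    \[ |y_j| \le |x_j| + |y_j - x_j| < |x_j| + \tfrac{\epsilon}{2}. \]
    Also let
    \[ \lambda = \max\Bigl\{\frac{|x_1| + \frac{\epsilon}{2}}{|x_1| + \epsilon}, \dots, \frac{|x_n| + \frac{\epsilon}{2}}{|x_n| + \epsilon}\Bigr\} \in (0, 1). \]
    Then, writing $C' = C/(|x_j| + \epsilon)$ so that $|\alpha(I + e_j)| \le C'(|x_1| + \epsilon)^{-i_1} \cdots (|x_n| + \epsilon)^{-i_n}$, we compute
    \begin{align*}
    \sum_{I \in \mathbb{Z}^n_{\ge 0}} |i_j + 1||\alpha(I + e_j)||y|^I
      &\le \sum_{I \in \mathbb{Z}^n_{\ge 0}} C'|i_j + 1| \Bigl(\frac{|x_1| + \frac{\epsilon}{2}}{|x_1| + \epsilon}\Bigr)^{i_1} \cdots \Bigl(\frac{|x_n| + \frac{\epsilon}{2}}{|x_n| + \epsilon}\Bigr)^{i_n} \\
      &\le \sum_{m=0}^\infty \sum_{\substack{I \in \mathbb{Z}^n_{\ge 0} \\ |I| = m}} C'|i_j + 1|\lambda^m
       \le \sum_{m=0}^\infty C'(m+1)\binom{n+m-1}{n-1}\lambda^m,
    \end{align*}
    using Lemma 2.1.1. The ratio test shows that this last series converges. Thus the power series whose terms are the partial derivatives of those for the given power series with respect to $x_j$ converges uniformly and absolutely in a neighbourhood of $x$. Thus
    \[ \frac{\partial}{\partial x_j}\Bigl(\sum_{I \in \mathbb{Z}^n_{\ge 0}} \alpha(I)x^I\Bigr) = \sum_{I \in \mathbb{Z}^n_{\ge 0}} (i_j + 1)\alpha(I + e_j)x^I, \]
    which gives the corollary, after an induction as we indicated at the beginning of the proof.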
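Term-by-term differentiation acts on coefficients by $\alpha(I) \mapsto (i_j + 1)\alpha(I + e_j)$, and this is simple to try out on a truncated series. The sketch below is an illustration under stated assumptions: the names `differentiate` and `evaluate` are hypothetical, and the test series is the truncation of $\sum_I x^I = \frac{1}{(1-x_1)(1-x_2)}$, a choice made here only because its derivative is known in closed form.

```python
def differentiate(alpha, j):
    """Coefficients of the series differentiated term-by-term with respect to x_j.

    The term alpha(I) x^I contributes i_j alpha(I) x^(I - e_j) when i_j >= 1.
    Here j is a zero-based coordinate index.
    """
    out = {}
    for I, c in alpha.items():
        if I[j] >= 1:
            J = I[:j] + (I[j] - 1,) + I[j + 1:]
            out[J] = out.get(J, 0) + I[j] * c
    return out


def evaluate(alpha, x):
    """Sum the finitely many stored terms alpha(I) x^I."""
    total = 0.0
    for I, c in alpha.items():
        term = c
        for xj, ij in zip(x, I):
            term *= xj ** ij
        total += term
    return total


max_order = 60
alpha = {(i, k): 1.0
         for i in range(max_order + 1)
         for k in range(max_order + 1 - i)}

x = (0.3, -0.2)
d_alpha = differentiate(alpha, 0)
print(evaluate(alpha, x), 1.0 / ((1 - x[0]) * (1 - x[1])))
print(evaluate(d_alpha, x), 1.0 / ((1 - x[0]) ** 2 * (1 - x[1])))
```

At the chosen point, which lies well inside the region of absolute convergence, both truncated sums agree with the closed forms to within rounding error.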

    In Theorem 2.1.15 below we shall show that, in fact, convergent power series are real analytic on $R_{\mathrm{abs}}$.

    2.1.5 Real analytic functions

    Throughout this section we let $\xi = \{\xi_1, \dots, \xi_n\}$ so $\mathbb{R}[[\xi]]$ denotes the $\mathbb{R}$-formal power series in these indeterminates.

    Now understanding some basic facts about convergent power series, we are in a position to use this knowledge to define what we mean by a real analytic function, and give some properties of such functions.

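    2.1.12 Definition (Real analytic function) Let $U \subseteq \mathbb{R}^n$ be open. A function $f\colon U \to \mathbb{R}$ is real analytic or of class $C^\omega$ on $U$ if, for every $x_0 \in U$, there exist $\alpha_{x_0} \in \mathbb{R}[[\xi]]$ and $r \in \mathbb{R}_{>0}$ such that
    \[ f(x) = \sum_{I \in \mathbb{Z}^n_{\geq 0}} \alpha_{x_0}(I) (x - x_0)^I = \sum_{I \in \mathbb{Z}^n_{\geq 0}} \alpha_{x_0}(I) (x_1 - x_{01})^{i_1} \cdots (x_n - x_{0n})^{i_n} \]
    for all $x \in B^n(r, x_0)$. The set of real analytic functions on $U$ is denoted by $C^\omega(U)$.

    A map $f\colon U \to \mathbb{R}^m$ is real analytic or of class $C^\omega$ on $U$ if its components $f_1, \dots, f_m\colon U \to \mathbb{R}$ are real analytic. The set of real analytic $\mathbb{R}^m$-valued maps on $U$ is denoted by $C^\omega(U, \mathbb{R}^m)$.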
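To illustrate the definition, here is a short worked example in one variable. The choice of function and domain is ours, not taken from the notes. Take $U = (-\infty, 1)$ and $f(x) = 1/(1 - x)$. For any $x_0 \in U$, expanding about $x_0$ as a geometric series gives the required local power series.

```latex
\[
  \frac{1}{1 - x}
  = \frac{1}{(1 - x_0) - (x - x_0)}
  = \frac{1}{1 - x_0} \sum_{k=0}^{\infty} \Bigl( \frac{x - x_0}{1 - x_0} \Bigr)^{k}
  = \sum_{k=0}^{\infty} \frac{(x - x_0)^k}{(1 - x_0)^{k+1}},
  \qquad |x - x_0| < 1 - x_0 .
\]
```

Thus one may take $\alpha_{x_0}(k) = (1 - x_0)^{-(k+1)}$ and any $r \leq 1 - x_0$. Note that the admissible radius depends on $x_0$ and shrinks as $x_0$ approaches the boundary point $1$.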

    2.1.13 Notation (Real analytic or analytic) We shall very frequently, especially outside the confines of this chapter, write "analytic" in place of "real analytic." This is not problematic since in our private life we use the term "holomorphic" and not the term "analytic" when referring to functions of a complex variable.

    We can now show that a real analytic function is infinitely differentiable with real analytic derivatives, and that the power series coefficients $\alpha_{x_0}(I)$, $I \in \mathbb{Z}^n_{\geq 0}$, are actually the Taylor series coefficients for $f$ at $x_0$.

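    2.1.14 Theorem (Real analytic functions have analytic partial derivatives of all orders) If $U \subseteq \mathbb{R}^n$ is open and if $f \in C^\omega(U)$, then all partial derivatives of $f$ are real analytic. Moreover, if $x_0 \in U$, and if $\alpha \in \mathbb{R}[[\xi]]$ and $r \in \mathbb{R}_{>0}$ are such that
    \[ f(x) = \sum_{I \in \mathbb{Z}^n_{\geq 0}} \alpha(I) (x - x_0)^I \tag{2.3} \]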


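    for all $x \in B^n(r, x_0)$, then $\alpha = \alpha_f(x_0)$.

    Proof We begin with a lemma.

    1 Lemma If $U \subseteq \mathbb{R}^n$ is open and if $f \in C^\omega(U)$ then $f$ is differentiable and its partial derivatives are analytic functions.

    Proof Let $x_0 \in U$ and let $r \in \mathbb{R}_{>0}$ and $\alpha \in \mathbb{R}[[\xi]]$ be such that
    \[ f(x) = \sum_{I \in \mathbb{Z}^n_{\geq 0}} \alpha(I) (x - x_0)^I \tag{2.4} \]
    for all $x \in B^n(r, x_0)$. As we showed in the proof of Corollary 2.1.11, the power series whose terms are the partial derivatives of those for the power series for $f$ with respect to $x_j$ is
    \[ \sum_{I \in \mathbb{Z}^n_{\geq 0}} (i_j + 1) \alpha(I + e_j) (x - x_0)^I. \]

    Now let $\rho \in \mathbb{R}_{>0}$ be such that
    \[ \bar{x} \triangleq (x_{01} + \rho, \dots, x_{0n} + \rho) \in B^n(r, x_0). \]

    Since the series (2.4) converges at $\bar{x}$, the terms in the series (2.4) must be bounded. Thus there exists $C \in \mathbb{R}_{>0}$ such that, for all $I \in \mathbb{Z}^n_{\geq 0}$,
    \[ |\alpha(I) (\bar{x} - x_0)^I| = |\alpha(I)| \rho^{|I|} \leq C. \]

    Let $x \in B^n(r, x_0)$ be such that $|x_j - x_{0j}| < \lambda\rho$, $j \in \{1, \dots, n\}$, for some $\lambda \in (0, 1)$. We then estimate
    \[ \begin{aligned}
    \sum_{I \in \mathbb{Z}^n_{\geq 0}} |(i_j + 1) \alpha(I + e_j) (x - x_0)^I|
    &= \sum_{I \in \mathbb{Z}^n_{\geq 0}} (i_j + 1) |\alpha(I + e_j)| |x - x_0|^I \\
    &\leq C \sum_{I \in \mathbb{Z}^n_{\geq 0}} (i_j + 1) \frac{|x - x_0|^I}{\rho^{|I| + 1}} \\
    &\leq \frac{C}{\rho} \sum_{k=0}^\infty \sum_{\substack{I \in \mathbb{Z}^n_{\geq 0} \\ |I| = k}} (i_j + 1) \lambda^{|I|} \\
    &\leq \frac{C}{\rho} \sum_{k=0}^\infty (k + 1) \binom{n + k - 1}{n - 1} \lambda^k,
    \end{aligned} \]
    where we have used Lemma 2.1.1. The ratio test can be used to show that this last series converges. Since this holds for every $x$ for which $|x_j - x_{0j}| < \lambda\rho$ for $\lambda \in (0, 1)$, it follows that there is a neighbourhood of $x_0$ on which the series
    \[ \sum_{I \in \mathbb{Z}^n_{\geq 0}} \frac{\partial}{\partial x_j} \alpha(I) (x - x_0)^I \]
    converges absolutely and uniformly. This means that $\frac{\partial f}{\partial x_j}$ is represented by a convergent power series in a neighbourhood of $x_0$. Since $x_0 \in U$ is arbitrary, it follows that $\frac{\partial f}{\partial x_j}$ is analytic in $U$.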


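    Now, the only part of the statement in the theorem that does not follow immediately from a repeated application of the lemma is the final assertion. This conclusion is proved as follows. If we evaluate (2.3) at $x = x_0$ we see that $\alpha(0) = f(x_0)$. In the proof of the lemma above we showed that
    \[ \frac{\partial f}{\partial x_j}(x) = \sum_{I \in \mathbb{Z}^n_{\geq 0}} (i_j + 1) \alpha(I + e_j) (x - x_0)^I \]
    in a neighbourhood of $x_0$. If we evaluate this at $x = x_0$ we see that $\alpha(e_j) = \frac{\partial f}{\partial x_j}(x_0)$. We can then inductively apply this argument to higher-order derivatives to derive the formula
    \[ \alpha(i_1 e_1 + \dots + i_n e_n) = \frac{1}{i_1! \cdots i_n!} \frac{\partial^{i_1 + \dots + i_n} f}{\partial x_1^{i_1} \cdots \partial x_n^{i_n}}(x_0) = \frac{1}{I!} \frac{\partial^{|I|} f}{\partial x^I}(x_0), \]
    which gives $\alpha(I) = \alpha_f(x_0)(I)$, as desired.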

    By definition, real analytic functions are represented by convergent power series (in fact by their Taylor series, according to Theorem 2.1.14) in a neighbourhood of any point. Conversely, any convergent power series defines a real analytic function on its domain of convergence.

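    2.1.15 Theorem (Convergent power series define real analytic functions) If $\alpha \in \mathbb{R}[[\xi]]$ then the function $f\colon R_{\mathrm{abs}}(\alpha) \to \mathbb{R}$ defined by
    \[ f(x) = \sum_{I \in \mathbb{Z}^n_{\geq 0}} \alpha(I) x^I \tag{2.5} \]
    is real analytic.

    Proof By Corollary 2.1.11 we know that $f$ is infinitely differentiable and that its derivatives can be obtained by term-by-term differentiation of the series for $f$. Let $x_0 \in R_{\mathrm{abs}}$. By an induction on the argument of Corollary 2.1.11, if $J = (j_1, \dots, j_n) \in \mathbb{Z}^n_{\geq 0}$ then
    \[ \frac{\partial^{|J|} f}{\partial x^J}(x_0) = \sum_{I \in \mathbb{Z}^n_{\geq 0}} (i_1 + j_1) \cdots (i_1 + 1) \cdots (i_n + j_n) \cdots (i_n + 1) \, \alpha(I + J) x_0^I. \]

    Note that
    \[ \frac{1}{J!} \frac{\partial^{|J|} f}{\partial x^J}(x_0) = \sum_{I \in \mathbb{Z}^n_{\geq 0}} \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} \alpha(I + J) x_0^I. \]

    We must show that
    \[ f(x) = \sum_{J \in \mathbb{Z}^n_{\geq 0}} \frac{1}{J!} \frac{\partial^{|J|} f}{\partial x^J}(x_0) (x - x_0)^J \]
    for $x$ in some neighbourhood of $x_0$. To this end, the following lemma will be useful.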


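    1 Lemma If $a, b \in \mathbb{R}_{>0}$ satisfy $a + b < 1$, then
    \[ \sum_{J \in \mathbb{Z}^n_{\geq 0}} \sum_{I \in \mathbb{Z}^n_{\geq 0}} \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} a^{|I|} b^{|J|} = \Bigl( \frac{1}{1 - a - b} \Bigr)^n. \]

    Proof Recall that for $m \in \mathbb{Z}_{\geq 0}$ we have
    \[ \sum_{j=0}^m \binom{m}{j} a^{m-j} b^j = (a + b)^m, \]
    as is easily shown by induction. Therefore,
    \[ \sum_{m=0}^\infty \sum_{j=0}^m \binom{m}{j} a^{m-j} b^j = \frac{1}{1 - a - b}, \]
    cf. the proof of Lemma 1 from the proof of Theorem 2.1.8. Note that the sets
    \[ \{ (i, j) \in \mathbb{Z}^2_{\geq 0} \mid i + j = m \}, \qquad \{ (m, j) \in \mathbb{Z}^2_{\geq 0} \mid j \leq m \} \]
    are in one-to-one correspondence. Using this fact we have
    \[ \begin{aligned}
    \sum_{J \in \mathbb{Z}^n_{\geq 0}} \sum_{I \in \mathbb{Z}^n_{\geq 0}} \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} a^{|I|} b^{|J|}
    &= \sum_{J \in \mathbb{Z}^n_{\geq 0}} \sum_{I \in \mathbb{Z}^n_{\geq 0}} \binom{i_1 + j_1}{j_1} a^{i_1} b^{j_1} \cdots \binom{i_n + j_n}{j_n} a^{i_n} b^{j_n} \\
    &= \Bigl( \sum_{m_1=0}^\infty \sum_{j_1=0}^{m_1} \binom{m_1}{j_1} a^{m_1 - j_1} b^{j_1} \Bigr) \cdots \Bigl( \sum_{m_n=0}^\infty \sum_{j_n=0}^{m_n} \binom{m_n}{j_n} a^{m_n - j_n} b^{j_n} \Bigr) \\
    &= \Bigl( \frac{1}{1 - a - b} \Bigr)^n,
    \end{aligned} \]
    as desired.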
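The identity in the lemma is easy to check numerically. The sketch below does so for $n = 1$ by truncating the double series; the function name `double_series`, the sample values of $a$ and $b$, and the truncation order are illustrative assumptions. For general $n$ the sum factors into $n$ copies of the one-variable sum, so the value is the $n$-th power of what is computed here.

```python
from math import comb


def double_series(a: float, b: float, terms: int = 60) -> float:
    """Truncation of sum_{i,j >= 0} C(i + j, j) a^i b^j (the case n = 1)."""
    return sum(
        comb(i + j, j) * a**i * b**j
        for i in range(terms)
        for j in range(terms)
    )


a, b = 0.2, 0.3  # any positive values with a + b < 1
approx = double_series(a, b)
closed_form = 1.0 / (1.0 - a - b)
print(approx, closed_form)
```

Grouping the terms by $m = i + j$ shows that the omitted tail is at most $\sum_{m \geq 60} (a + b)^m$, which is negligible for these values.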

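    Let $z \in R_{\mathrm{abs}}$ be such that none of the components of $z$ are zero and such that $\frac{|x_{0j}|}{|z_j|} < 1$ for $j \in \{1, \dots, n\}$. This is possible by openness of $R_{\mathrm{abs}}$. Denote
    \[ \lambda = \max\Bigl\{ \Bigl| \frac{x_{01}}{z_1} \Bigr|, \dots, \Bigl| \frac{x_{0n}}{z_n} \Bigr| \Bigr\}. \]

    Let $\rho \in \mathbb{R}_{>0}$ be such that $\lambda + \rho < 1$, and let $x \in R_{\mathrm{abs}}$ satisfy $|x_j - x_{0j}| < \rho |z_j|$ for $j \in \{1, \dots, n\}$. Since the series (2.5) converges absolutely at $z$, its terms are bounded, and so there exists $C \in \mathbb{R}_{>0}$ such that $|\alpha(I)| |z|^I \leq C$ for all $I \in \mathbb{Z}^n_{\geq 0}$. With these definitions we now compute
    \[ \begin{aligned}
    \sum_{J \in \mathbb{Z}^n_{\geq 0}} \frac{1}{J!} \Bigl| \frac{\partial^{|J|} f}{\partial x^J}(x_0) \Bigr| |x - x_0|^J
    &\leq \sum_{J \in \mathbb{Z}^n_{\geq 0}} \sum_{I \in \mathbb{Z}^n_{\geq 0}} \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} |\alpha(I + J)| |x_0|^I |x - x_0|^J \\
    &\leq \sum_{J \in \mathbb{Z}^n_{\geq 0}} \sum_{I \in \mathbb{Z}^n_{\geq 0}} \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} \frac{C}{|z|^{I + J}} |x_0|^I \rho^{|J|} |z|^J \\
    &\leq C \sum_{J \in \mathbb{Z}^n_{\geq 0}} \sum_{I \in \mathbb{Z}^n_{\geq 0}} \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} \lambda^{|I|} \rho^{|J|} = C \Bigl( \frac{1}{1 - \lambda - \rho} \Bigr)^n,
    \end{aligned} \]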


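    using the lemma above. This shows that the Taylor series for $f$ at $x_0$ converges absolutely and uniformly in a neighbourhood of $x_0$. It remains to show that it converges to $f$.

    Let $x$ be a point in the neighbourhood of $x_0$ where the Taylor series of $f$ at $x_0$ converges. Let $k \in \mathbb{Z}_{>0}$. By Taylor's Theorem [Abraham, Marsden, and Ratiu 1988, Theorem 2.4.15] there exists $z \in \{ (1 - t) x_0 + t x \mid t \in [0, 1] \}$ such that
    \[ f(x) = \sum_{\substack{J \in \mathbb{Z}^n_{\geq 0} \\ |J| \leq k}} \frac{1}{J!} \frac{\partial^{|J|} f}{\partial x^J}(x_0) (x - x_0)^J + \sum_{\substack{J \in \mathbb{Z}^n_{\geq 0} \\ |J| = k + 1}} \frac{1}{J!} \frac{\partial^{|J|} f}{\partial x^J}(z) (x - x_0)^J. \]

    By Corollary 2.1.11 we have
    \[ \frac{1}{J!} \frac{\partial^{|J|} f}{\partial x^J}(z) = \sum_{I \in \mathbb{Z}^n_{\geq 0}} \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} \alpha(I + J) z^I. \]

    Therefore,
    \[ \begin{aligned}
    \Bigl| f(x) - \sum_{\substack{J \in \mathbb{Z}^n_{\geq 0} \\ |J| \leq k}} \frac{1}{J!} \frac{\partial^{|J|} f}{\partial x^J}(x_0) (x - x_0)^J \Bigr|
    &\leq \sum_{\substack{J \in \mathbb{Z}^n_{\geq 0} \\ |J| = k + 1}} \frac{1}{J!} \Bigl| \frac{\partial^{|J|} f}{\partial x^J}(z) \Bigr| |x - x_0|^J \\
    &\leq \sum_{\substack{J \in \mathbb{Z}^n_{\geq 0} \\ |J| = k + 1}} \sum_{I \in \mathbb{Z}^n_{\geq 0}} \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} |\alpha(I + J)| |z|^I |x - x_0|^J.
    \end{aligned} \]

    Just as we did above when we showed that the Taylor series for $f$ at $x_0$ converges absolutely, we can show that the series
    \[ \sum_{J \in \mathbb{Z}^n_{\geq 0}} \sum_{I \in \mathbb{Z}^n_{\geq 0}} \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} |\alpha(I + J)| |z|^I |x - x_0|^J \]
    converges. Therefore,
    \[ \lim_{k \to \infty} \sum_{\substack{J \in \mathbb{Z}^n_{\geq 0} \\ |J| = k + 1}} \sum_{I \in \mathbb{Z}^n_{\geq 0}} \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} |\alpha(I + J)| |z|^I |x - x_0|^J = 0, \]
    and so
    \[ \lim_{k \to \infty} \Bigl| f(x) - \sum_{\substack{J \in \mathbb{Z}^n_{\geq 0} \\ |J| \leq k}} \frac{1}{J!} \frac{\partial^{|J|} f}{\partial x^J}(x_0) (x - x_0)^J \Bigr| = 0, \]
    showing that the Taylor series for $f$ at $x_0$ converges to $f$ in a neighbourhood of $x_0$.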

    As a final result, let us characterise real analytic functions by providing an exact description of their derivatives.


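    2.1.16 Theorem (Derivatives of real analytic functions) If $f \in C^\infty(U)$ then the following statements are equivalent:

    (i) $f \in C^\omega(U)$;
    (ii) for each $x_0 \in U$ there exist a neighbourhood $V \subseteq U$ of $x_0$ and $C, r \in \mathbb{R}_{>0}$ such that
    \[ \Bigl| \frac{\partial^{|I|} f}{\partial x^I}(x) \Bigr| \leq C I! r^{-|I|} \]
    for all $x \in V$ and $I \in \mathbb{Z}^n_{\geq 0}$.

    Proof First suppose that $f$ is real analytic and let $x_0 \in U$. We will use the following lemmata.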

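    1 Lemma Let $J \in \mathbb{Z}^n_{\geq 0}$ and let $x \in \mathbb{R}^n$ satisfy $|x_k| < 1$, $k \in \{1, \dots, n\}$. Then
    \[ \sum_{I \in \mathbb{Z}^n_{\geq 0}} J! \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} x^I = \frac{\partial^{|J|}}{\partial x^J} \Bigl( \prod_{k=1}^n \frac{x_k^{j_k}}{1 - x_k} \Bigr). \]

    Proof By Lemma 1 from the proof of Theorem 2.1.8 we have
    \[ \sum_{i_k=0}^\infty j_k! \binom{i_k + j_k}{j_k} x_k^{i_k} = \sum_{i_k=0}^\infty \frac{(i_k + j_k)!}{i_k!} x_k^{i_k} = \frac{d^{j_k}}{dx_k^{j_k}} \Bigl( \frac{x_k^{j_k}}{1 - x_k} \Bigr), \qquad k \in \{1, \dots, n\}. \]

    Therefore,
    \[ \begin{aligned}
    \sum_{I \in \mathbb{Z}^n_{\geq 0}} J! \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} x^I
    &= \sum_{I \in \mathbb{Z}^n_{\geq 0}} j_1! \binom{i_1 + j_1}{i_1} x_1^{i_1} \cdots j_n! \binom{i_n + j_n}{i_n} x_n^{i_n} \\
    &= \Bigl( \sum_{i_1=0}^\infty j_1! \binom{i_1 + j_1}{i_1} x_1^{i_1} \Bigr) \cdots \Bigl( \sum_{i_n=0}^\infty j_n! \binom{i_n + j_n}{i_n} x_n^{i_n} \Bigr) \\
    &= \frac{d^{j_1}}{dx_1^{j_1}} \Bigl( \frac{x_1^{j_1}}{1 - x_1} \Bigr) \cdots \frac{d^{j_n}}{dx_n^{j_n}} \Bigl( \frac{x_n^{j_n}}{1 - x_n} \Bigr) \\
    &= \frac{\partial^{|J|}}{\partial x^J} \Bigl( \prod_{k=1}^n \frac{x_k^{j_k}}{1 - x_k} \Bigr),
    \end{aligned} \]
    as desired.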

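    2 Lemma For each $R \in (0, 1)$ there exist $A, \lambda \in \mathbb{R}_{>0}$ such that, for each $m \in \mathbb{Z}_{\geq 0}$,
    \[ \sup\Bigl\{ \Bigl| \frac{d^m}{dx^m} \Bigl( \frac{x^m}{1 - x} \Bigr) \Bigr| \Bigm| x \in [-R, R] \Bigr\} \leq A m! \lambda^{-m}. \]

    Proof We first claim that
    \[ \sum_{j=0}^\infty \binom{m + j}{j} x^j = \frac{1}{(1 - x)^{m+1}} \tag{2.6} \]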


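    for $x \in (-1, 1)$ and $m \in \mathbb{Z}_{\geq 0}$. Indeed, by [Rudin 1976, Theorem 3.26] we have
    \[ \sum_{j=0}^\infty x^j = \frac{1}{1 - x}, \]
    and convergence is uniform and absolute on $[-R, R]$ for $R \in (0, 1)$. Differentiating both sides $m$ times with respect to $x$ and dividing by $m!$ then gives (2.6).

    By Lemma 1 from the proof of Theorem 2.1.8 we have
    \[ \frac{d^m}{dx^m} \Bigl( \frac{x^m}{1 - x} \Bigr) = \sum_{j=0}^\infty \frac{(m + j)!}{j!} x^j \]
    for $x \in (-1, 1)$. If $x \in [-R, R]$ then
    \[ \begin{aligned}
    \Bigl| \frac{d^m}{dx^m} \Bigl( \frac{x^m}{1 - x} \Bigr) \Bigr| \frac{(1 - R)^m}{m!}
    &\leq (1 - R)^m \sum_{j=0}^\infty \frac{(m + j)!}{m! j!} |x|^j = (1 - R)^m \sum_{j=0}^\infty \binom{m + j}{j} |x|^j \\
    &= \frac{(1 - R)^m}{(1 - |x|)^{m+1}} = \Bigl( \frac{1 - R}{1 - |x|} \Bigr)^m \frac{1}{1 - |x|} \leq \frac{1}{1 - R}.
    \end{aligned} \]

    That is to say,
    \[ \Bigl| \frac{d^m}{dx^m} \Bigl( \frac{x^m}{1 - x} \Bigr) \Bigr| \leq \frac{1}{1 - R} m! (1 - R)^{-m}, \]
    and so the lemma follows with $A = \frac{1}{1 - R}$ and $\lambda = 1 - R$.

    Now, for $x$ in a neighbourhood $V$ of $x_0$ we have
    \[ f(x) = \sum_{I \in \mathbb{Z}^n_{\geq 0}} \frac{1}{I!} \frac{\partial^{|I|} f}{\partial x^I}(x_0) (x - x_0)^I. \]

    Let us abbreviate $\alpha(I) = \frac{1}{I!} \frac{\partial^{|I|} f}{\partial x^I}(x_0)$. By Corollary 2.1.10 there exist $C', \sigma \in \mathbb{R}_{>0}$ such that
    \[ |\alpha(I)| \leq C' \sigma^{-|I|}, \qquad I \in \mathbb{Z}^n_{\geq 0}. \]

    By Corollary 2.1.11 and following the computations from the proof of Theorem 2.1.15, we can write
    \[ \frac{1}{J!} \frac{\partial^{|J|} f}{\partial x^J}(x) = \sum_{I \in \mathbb{Z}^n_{\geq 0}} \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} \alpha(I + J) (x - x_0)^I \tag{2.7} \]
    for $J \in \mathbb{Z}^n_{\geq 0}$ and for $x$ in a neighbourhood of $x_0$. Therefore, there exists $\rho \in (0, \sigma)$ sufficiently small that, if $x \in \mathbb{R}^n$ satisfies $|x_j - x_{0j}| < \rho$, $j \in \{1, \dots, n\}$, (2.7) holds. Let $C_J$ be the partial derivative
    \[ \frac{\partial^{|J|}}{\partial x^J} \Bigl( \prod_{k=1}^n \frac{x_k^{j_k}}{1 - x_k} \Bigr) \]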


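    evaluated at $x = (\frac{\rho}{\sigma}, \dots, \frac{\rho}{\sigma})$. Let $R \in (0, 1)$ satisfy $R > \frac{\rho}{\sigma}$. By the second lemma above there exist $A, \lambda \in \mathbb{R}_{>0}$ such that, for each $k \in \{1, \dots, n\}$ and each $x_k \in [-R, R]$, we have
    \[ \Bigl| \frac{d^{j_k}}{dx_k^{j_k}} \Bigl( \frac{x_k^{j_k}}{1 - x_k} \Bigr) \Bigr| \leq A j_k! \lambda^{-j_k}. \]

    It follows that
    \[ \Bigl| \frac{\partial^{|J|}}{\partial x^J} \Bigl( \prod_{k=1}^n \frac{x_k^{j_k}}{1 - x_k} \Bigr) \Bigr| \leq A^n J! \lambda^{-|J|} \]
    whenever $x = (x_1, \dots, x_n)$ satisfies $|x_j| < R$, $j \in \{1, \dots, n\}$. In particular, $C_J \leq A^n J! \lambda^{-|J|}$. Then, for $x$ such that $|x_j - x_{0j}| < \rho$, $j \in \{1, \dots, n\}$, we have
    \[ \begin{aligned}
    \Bigl| \frac{\partial^{|J|} f}{\partial x^J}(x) \Bigr|
    &\leq \sum_{I \in \mathbb{Z}^n_{\geq 0}} J! \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} |\alpha(I + J)| |x - x_0|^I \\
    &\leq \sum_{I \in \mathbb{Z}^n_{\geq 0}} J! \binom{i_1 + j_1}{j_1} \cdots \binom{i_n + j_n}{j_n} C' \sigma^{-|J|} \Bigl( \frac{\rho}{\sigma} \Bigr)^{|I|} \\
    &= C' \sigma^{-|J|} C_J \leq C' A^n J! (\sigma\lambda)^{-|J|},
    \end{aligned} \]
    using the lemmata above. Thus the second condition in the statement of the theorem holds with $C = C' A^n$ and $r = \sigma\lambda$.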

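    Conversely, suppose that for each $x_0 \in U$ there exist a neighbourhood $V \subseteq U$ of $x_0$ and $C, r \in \mathbb{R}_{>0}$ such that
    \[ \Bigl| \frac{\partial^{|I|} f}{\partial x^I}(x) \Bigr| \leq C I! r^{-|I|} \]
    for all $x \in V$ and $I \in \mathbb{Z}^n_{\geq 0}$. Then, for $x_0 \in U$, let $C, r, \rho \in \mathbb{R}_{>0}$ be such that $\rho < r$ and
    \[ \Bigl| \frac{\partial^{|I|} f}{\partial x^I}(x) \Bigr| \leq C I! r^{-|I|}, \qquad I \in \mathbb{Z}^n_{\geq 0}, \ x \in B^n(\rho, x_0). \]

    Then, for $x \in B^n(\rho, x_0)$ we have
    \[ \sum_{I \in \mathbb{Z}^n_{\geq 0}} \frac{1}{I!} \Bigl| \frac{\partial^{|I|} f}{\partial x^I}(x_0) \Bigr| |x - x_0|^I \leq \sum_{I \in \mathbb{Z}^n_{\geq 0}} C \Bigl( \frac{\rho}{r} \Bigr)^{|I|} = C \sum_{k=0}^\infty \sum_{\substack{I \in \mathbb{Z}^n_{\geq 0} \\ |I| = k}} \Bigl( \frac{\rho}{r} \Bigr)^{|I|} = C \sum_{k=0}^\infty \binom{n + k - 1}{n - 1} \Bigl( \frac{\rho}{r} \Bigr)^k, \]
    using Lemma 2.1.1. By the ratio test, the last series converges, giving absolute convergence of the Taylor series of $f$ at $x_0$. We must also show that the series converges to $f$. Let $k \in \mathbb{Z}_{>0}$, let $x \in B^n(\rho, x_0)$, and recall from Taylor's Theorem [Abraham, Marsden, and Ratiu 1988, Theorem 2.4.15] that there exists
    \[ z \in \{ (1 - t) x_0 + t x \mid t \in [0, 1] \} \]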


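    such that
    \[ f(x) = \sum_{\substack{I \in \mathbb{Z}^n_{\geq 0} \\ |I| \leq k}} \frac{1}{I!} \frac{\partial^{|I|} f}{\partial x^I}(x_0) (x - x_0)^I + \sum_{\substack{I \in \mathbb{Z}^n_{\geq 0} \\ |I| = k + 1}} \frac{1}{I!} \frac{\partial^{|I|} f}{\partial x^I}(z) (x - x_0)^I. \]

    Thus
    \[ \begin{aligned}
    \Bigl| f(x) - \sum_{\substack{I \in \mathbb{Z}^n_{\geq 0} \\ |I| \leq k}} \frac{1}{I!} \frac{\partial^{|I|} f}{\partial x^I}(x_0) (x - x_0)^I \Bigr|
    &\leq \sum_{\substack{I \in \mathbb{Z}^n_{\geq 0} \\ |I| = k + 1}} \frac{1}{I!} \Bigl| \frac{\partial^{|I|} f}{\partial x^I}(z) \Bigr| |x - x_0|^I \\
    &\leq \sum_{\substack{I \in \mathbb{Z}^n_{\geq 0} \\ |I| = k + 1}} C \Bigl( \frac{\rho}{r} \Bigr)^{|I|} = C \binom{n + k}{n - 1} \Bigl( \frac{\rho}{r} \Bigr)^{k+1}.
    \end{aligned} \]

    As we saw above, the series
    \[ \sum_{k=0}^\infty \binom{n + k - 1}{n - 1} \Bigl( \frac{\rho}{r} \Bigr)^k \]
    converges, and so
    \[ \lim_{k \to \infty} \binom{n + k}{n - 1} \Bigl( \frac{\rho}{r} \Bigr)^{k+1} = 0, \]
    giving
    \[ \lim_{k \to \infty} \Bigl| f(x) - \sum_{\substack{I \in \mathbb{Z}^n_{\geq 0} \\ |I| \leq k}} \frac{1}{I!} \frac{\partial^{|I|} f}{\partial x^I}(x_0) (x - x_0)^I \Bigr| = 0, \]
    and so $f$ is equal to its Taylor series about $x_0$ in a neighbourhood of $x_0$.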
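The derivative estimate of Theorem 2.1.16 can be seen concretely for $f(x) = 1/(1 - x)$, whose $k$-th derivative is $k!/(1 - x)^{k+1}$. On $[-\frac{1}{2}, \frac{1}{2}]$ we have $1 - x \geq \frac{1}{2}$, so the estimate holds with $C = 2$ and $r = \frac{1}{2}$. The following sketch checks this in exact rational arithmetic; the function, the constants, the sample points, and the helper name `kth_derivative` are our illustrative choices, not the author's.

```python
from fractions import Fraction
from math import factorial


def kth_derivative(k, x):
    """k-th derivative of f(x) = 1/(1 - x), namely k!/(1 - x)**(k + 1)."""
    return Fraction(factorial(k)) / (1 - x) ** (k + 1)


C, r = Fraction(2), Fraction(1, 2)
samples = [Fraction(n, 8) for n in range(-4, 5)]  # points of [-1/2, 1/2]

# Check |f^(k)(x)| <= C * k! * r**(-k) for the first 25 derivatives.
estimate_holds = all(
    abs(kth_derivative(k, x)) <= C * factorial(k) / r**k
    for k in range(25)
    for x in samples
)
print(estimate_holds)
```

Equality is attained at $x = \frac{1}{2}$, so for this neighbourhood the constants cannot be improved; a smaller neighbourhood of $0$ would allow a larger $r$.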

    2.1.17 Remarks (On derivative estimates for real analytic functions)

    1. Note that, given any $N \in \mathbb{Z}_{>0}$, continuity of a $C^\infty$ function and its derivatives ensures that one can find an estimate of the form
    \[ \Bigl| \frac{\partial^{|I|} f}{\partial x^I}(x) \Bigr| \leq C I! r^{-|I|} \]
    for all $x$ in a neighbourhood of some point $x_0$ and for all $I$ for which $|I| \leq N$. Thus the key distinction between smooth and analytic functions is that for analytic functions one can do this uniformly for all $I \in \mathbb{Z}^n_{\geq 0}$.

    2. Readers familiar with the theory of holomorphic functions of several complex variables will recognise the estimate from the preceding theorem. Indeed, one can deduce the estimate we give from the holomorphic theory using the Cauchy estimates for derivatives. One place where the real theory differs from the complex theory is that, in the holomorphic case, the estimate one gives can be expressed as
    \[ \Bigl| \frac{\partial^{|I|} f}{\partial z^I}(z) \Bigr| \leq C I! r^{-|I|} \sup\{ |f(z')| \mid z' \in D^n(\rho, z_0) \}, \tag{2.8} \]


    if $f$ is holomorphic in the disk
    \[ D^n(\rho, z_0) = \{ z \in \mathbb{C}^n \mid |z_j - z_{0j}| < \rho, \ j \in \{1, \dots, n\} \}. \]

    The point here is that the bound on the derivatives involves the value of the function. In the real analytic case this is not possible. This has dire implications when one topologises the set of real analytic functions. Because of the form of the bounds for holomorphic functions, to topologise holomorphic functions is easy: one uses the topology of uniform convergence on compact sets. The analyticity of the limit function then essentially follows due to the bound (2.8). However, the set of real analytic functions is not a closed subspace of the set of continuous functions topologised by uniform convergence on compact sets.

    2.2 Real analytic multivariable calculus

    Now that we know what a real analytic function is, and what some of its properties are, we can turn to the calculus of real analytic functions. We describe here the bare basics of this theory, enough so that we can do basic real analytic differential geometry.

    2.2.1 Real analyticity and operations on functions

    Let us verify that analyticity is respected by the standard ring operations for $\mathbb{R}$-valued functions on a set.

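    2.2.1 Proposition (Real analyticity and algebraic operations) If $U \subseteq \mathbb{R}^n$ is open and if $f, g \in C^\omega(U)$ then $f + g, fg \in C^\omega(U)$. If, moreover, $g(x) \neq 0$ for all $x \in U$, then $\frac{f}{g} \in C^\omega(U)$.

    Proof Let us first prove that $f + g, fg \in C^\omega(U)$. Let $x_0 \in U$ and let $r \in \mathbb{R}_{>0}$ be such that, for $x \in B^n(r, x_0)$, we have
    \[ f(x) = \sum_{I \in \mathbb{Z}^n_{\geq 0}} \alpha_f(x_0)(I) (x - x_0)^I, \qquad g(x) = \sum_{I \in \mathbb{Z}^n_{\geq 0}} \alpha_g(x_0)(I) (x - x_0)^I, \]
    with the convergence being uniform and absolute on $B^n(r, x_0)$. Absolute convergence implies that for any bijection $\phi\colon \mathbb{Z}_{\geq 0} \to \mathbb{Z}^n_{\geq 0}$ we have
    \[ f(x) = \sum_{j=0}^\infty \alpha_f(x_0)(\phi(j)) (x - x_0)^{\phi(j)}, \qquad g(x) = \sum_{j=0}^\infty \alpha_g(x_0)(\phi(j)) (x - x_0)^{\phi(j)}. \]

    The standard results on sums and products [Rudin 1976, Theorem 3.4] of series now apply (noting that convergence is absolute) to show that
    \[ \begin{aligned}
    f(x) + g(x) &= \sum_{j=0}^\infty \bigl( \alpha_f(x_0)(\phi(j)) + \alpha_g(x_0)(\phi(j)) \bigr) (x - x_0)^{\phi(j)}, \\
    f(x) g(x) &= \sum_{k=0}^\infty \sum_{j=0}^k \bigl( \alpha_f(x_0)(\phi(j)) \alpha_g(x_0)(\phi(k - j)) \bigr) (x - x_0)^{\phi(j) + \phi(k - j)}
    \end{aligned} \]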


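    for all $x \in B^n(r, x_0)$, with convergence being absolute in both series. Absolute convergence then implies that we can "de-rearrange" the series to get
    \[ \begin{aligned}
    f(x) + g(x) &= \sum_{I \in \mathbb{Z}^n_{\geq 0}} \bigl( \alpha_f(x_0)(I) + \alpha_g(x_0)(I) \bigr) (x - x_0)^I, \\
    f(x) g(x) &= \sum_{I \in \mathbb{Z}^n_{\geq 0}} \sum_{\substack{I_1, I_2 \in \mathbb{Z}^n_{\geq 0} \\ I_1 + I_2 = I}} \alpha_f(x_0)(I_1) \alpha_g(x_0)(I_2) (x - x_0)^I
    \end{aligned} \]
    for $x \in B^n(r, x_0)$. Thus the power series $\alpha_f(x_0) + \alpha_g(x_0)$ and $\alpha_f(x_0) \alpha_g(x_0)$ converge in a neighbourhood of $x_0$ to $f + g$ and $fg$, respectively. In particular, $f + g, fg \in C^\omega(U)$.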
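The coefficient formulas for the sum and the product can be made concrete with a small sketch. Here a power series with finitely many nonzero coefficients (a polynomial) is stored as a dictionary from multi-indices to coefficients. This representation and the names `add_series` and `multiply_series` are assumptions made for illustration only; the product implements $(\alpha\beta)(I) = \sum_{I_1 + I_2 = I} \alpha(I_1)\beta(I_2)$.

```python
from collections import defaultdict


def add_series(alpha, beta):
    """Coefficientwise sum: (alpha + beta)(I) = alpha(I) + beta(I)."""
    out = defaultdict(int)
    for I, a in alpha.items():
        out[I] += a
    for I, b in beta.items():
        out[I] += b
    return dict(out)


def multiply_series(alpha, beta):
    """Cauchy product: (alpha * beta)(I) = sum of alpha(I1) * beta(I2) over I1 + I2 = I."""
    out = defaultdict(int)
    for I1, a in alpha.items():
        for I2, b in beta.items():
            I = tuple(i + j for i, j in zip(I1, I2))
            out[I] += a * b
    return dict(out)


f = {(0, 0): 1, (1, 0): 1, (0, 1): 1}  # 1 + x1 + x2
g = {(0, 0): 1, (1, 0): -1}            # 1 - x1

total = add_series(f, g)         # 2 + x2, with an explicit zero x1 coefficient
product = multiply_series(f, g)  # 1 + x2 - x1^2 - x1*x2
print(total)
print(product)
```

For genuinely infinite series one would truncate at a fixed total degree; the formulas are unchanged because the coefficient of $(x - x_0)^I$ in the product only involves multi-indices $I_1, I_2$ with $I_1 + I_2 = I$.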

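    To show that $\frac{f}{g} \in C^\omega(U)$ if $g$ is nonzero on $U$, we show that $\frac{1}{g} \in C^\omega(U)$; that $\frac{f}{g} \in C^\omega(U)$ then follows by our conclusion above for multiplication of real analytic functions. Let $x_0 \in U$ and let $r \in \mathbb{R}_{>0}$ be such that
    \[ g(x) = \sum_{I \in \mathbb{Z}^n_{\geq 0}} \alpha_g(x_0)(I) (x - x_0)^I \tag{2.9} \]
    for $x \in B^n(r, x_0)$, convergence being absolute and uniform in $B^n(r, x_0)$. Let us abbreviate $\alpha = \alpha_g(x_0)$. Since $g$ is nowhere zero on $U$ it follows that $\alpha(0) \neq 0$ and so, by Proposition 2.1.3, $\alpha$ is a unit in $\mathbb{R}[[\xi]]$ with inverse defined by
    \[ \alpha^{-1}(I) = \frac{1}{\alpha(0)} \sum_{k=0}^\infty \Bigl( 1 - \frac{\alpha(I)}{\alpha(0)} \Bigr)^k. \]

    We will show that this is a convergent power series. Let $\rho \in \mathbb{R}_{>0}$ be such that
    \[ \bar{x} \triangleq (x_{01} + \rho, \dots, x_{0n} + \rho) \in B^n(r, x_0). \]

    Since the series (2.9) converges at $\bar{x}$, the terms in the series must be bounded. Thus there exists $C \in \mathbb{R}_{>0}$ such that, for all $I \in \mathbb{Z}^n_{\geq 0}$,
    \[ |\alpha(I) (\bar{x} - x_0)^I| = |\alpha(I)| \rho^{|I|} \leq C. \]

    Therefore, for $I \in \mathbb{Z}^n_{\geq 0}$,
    \[ \Bigl| 1 - \frac{\alpha(I)}{\alpha(0)} \Bigr| \leq 1 + \frac{C}{|\alpha(0)|}. \]

    Let
    \[ C' = \max\Bigl\{ 1, 1 + \frac{C}{|\alpha(0)|} \Bigr\} \]
    and let $\lambda \in \mathbb{R}_{>0}$ be such that $C'\lambda \in (0, 1)$. If $x \in B^n(r, x_0)$ satisfies $|x_j - x_{0j}| < \lambda$, $j \in \{1, \dots, n\}$,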


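    we have
    \[ \begin{aligned}
    \sum_{I \in \mathbb{Z}^n_{\geq 0}} \sum_{k=0}^\infty \Bigl| \Bigl( 1 - \frac{\alpha(I)}{\alpha(0)} \Bigr)^k \Bigr| |x - x_0|^I
    &= \sum_{I \in \mathbb{Z}^n_{\geq 0}} \sum_{k=0}^{|I|} \Bigl| \Bigl( 1 - \frac{\alpha(I)}{\alpha(0)} \Bigr)^k \Bigr| |x - x_0|^I \\
    &\leq \sum_{m=0}^\infty \sum_{\substack{I \in \mathbb{Z}^n_{\geq 0} \\ |I| = m}} \sum_{k=0}^m (C')^k \lambda^m
    \leq \sum_{m=0}^\infty \sum_{\substack{I \in \mathbb{Z}^n_{\geq 0} \\ |I| = m}} \sum_{k=0}^m (C'\lambda)^m \\
    &= \sum_{m=0}^\infty \sum_{\substack{I \in \mathbb{Z}^n_{\geq 0} \\ |I| = m}} (m + 1) (C'\lambda)^m = \sum_{m=0}^\infty \binom{n + m - 1}{n - 1} (m + 1) (C'\lambda)^m,
    \end{aligned} \]
    using Lemma 2.1.1 and the fact, from Lemma 1 in the proof of Proposition 2.1.3, that $\bigl( 1 - \frac{\alpha}{\alpha(0)} \bigr)^k(I) = 0$ whenever $|I| < k$. The last series can be shown to converge by the ratio test, and this shows that the series
    \[ \frac{1}{\alpha(0)} \sum_{I \in \mathbb{Z}^n_{\geq 0}} \sum_{k=0}^\infty \Bigl( 1 - \frac{\alpha(I)}{\alpha(0)} \Bigr)^k (x - x_0)^I \]
    converges in a neighbourhood of $x_0$.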

    Of course, the addition part of the preceding result also applies to $\mathbb{R}^m$-valued real analytic maps. That is, if $f, g \in C^\omega(U, \mathbb{R}^m)$, then $f + g \in C^\omega(U, \mathbb{R}^m)$.

    Next we consider compositions of real analytic maps.

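    2.2.2 Proposition (Compositions of real analytic maps are real analytic) Let $U \subseteq \mathbb{R}^n$ and $V \subseteq \mathbb{R}^m$ be open, and let $f\colon U \to V$ and $g\colon V \to \mathbb{R}^p$ be real analytic. Then $g \circ f\colon U \to \mathbb{R}^p$ is real analytic.

    Proof It suffices to consider the case where $p = 1$, and so we regard $g$ as a scalar-valued function. We denote the components of $f$ by $f_1, \dots, f_m\colon U \to \mathbb{R}$. Let $x_0 \in U$ and let $y_0 = f(x_0) \in V$. For $x$ in a neighbourhood $U' \subseteq U$ of $x_0$ and for $k \in \{1, \dots, m\}$ we write
    \[ f_k(x) = \sum_{I \in \mathbb{Z}^n_{\geq 0}} \alpha_k(I) (x - x_0)^I \tag{2.10} \]
    and for $y$ in a neighbourhood $V' \subseteq V$ of $y_0$ we write
    \[ g(y) = \sum_{J \in \mathbb{Z}^m_{\geq 0}} \beta(J) (y - y_0)^J. \tag{2.11} \]

    Following Theorem 2.1.8, let $\bar{x} \in U'$ be such that the series (2.10) converges absolutely at $x = \bar{x}$ for each $k \in \{1, \dots, m\}$ and such that $\bar{x}_j - x_{0j} > 0$, $j \in \{1, \dots, n\}$. In like fashion, let $\bar{y} \in V'$ be such that (2.11) converges absolutely at $y = \bar{y}$ and such that $\bar{y}_k - y_{0k} > 0$, $k \in \{1, \dots, m\}$. Thus we have $A, B \in \mathbb{R}_{>0}$ such that
    \[ \sum_{I \in \mathbb{Z}^n_{\geq 0}} |\alpha_k(I)| (\bar{x} - x_0)^I < A, \quad k \in \{1, \dots, m\}, \qquad \sum_{J \in \mathbb{Z}^m_{\geq 0}} |\beta(J)| (\bar{y} - y_0)^J < B. \]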


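    Let
    \[ r = \min\Bigl\{ 1, \frac{\bar{y}_1 - y_{01}}{A}, \dots, \frac{\bar{y}_m - y_{0m}}{A} \Bigr\}. \]

    If $x \in U'$ satisfies $|x_j - x_{0j}| < r (\bar{x}_j - x_{0j})$, $j \in \{1, \dots, n\}$, then, for $k \in \{1, \dots, m\}$,
    \[ \sum_{\substack{I \in \mathbb{Z}^n_{\geq 0} \\ |I| \geq 1}} |\alpha_k(I)| |x - x_0|^I \leq \sum_{\substack{I \in \mathbb{Z}^n_{\geq 0} \\ |I| \geq 1}} |\alpha_k(I)| r^{|I|} (\bar{x} - x_0)^I \leq r \sum_{\substack{I \in \mathbb{Z}^n_{\geq 0} \\ |I| \geq 1}} |\alpha_k(I)| (\bar{x} - x_0)^I \leq r A \leq \bar{y}_k - y_{0k} \]
    since $r \leq 1$. Therefore, by (2.11),
    \[ \sum_{J \in \mathbb{Z}^m_{\geq 0}} |\beta(J)| \Bigl( \sum_{\substack{I_1 \in \mathbb{Z}^n_{\geq 0} \\ |I_1| \geq 1}} |\alpha_1(I_1)| |x - x_0|^{I_1} \Bigr)^{j_1} \cdots \Bigl( \sum_{\substack{I_m \in \mathbb{Z}^n_{\geq 0} \\ |I_m| \geq 1}} |\alpha_m(I_m)| |x - x_0|^{I_m} \Bigr)^{j_m} \leq B. \]

    It follows that
    \[ \sum_{J \in \mathbb{Z}^m_{\geq 0}} \beta(J) \Bigl( \sum_{\substack{I_1 \in \mathbb{Z}^n_{\geq 0} \\ |I_1| \geq 1}} \alpha_1(I_1) (x - x_0)^{I_1} \Bigr)^{j_1} \cdots \Bigl( \sum_{\substack{I_m \in \mathbb{Z}^n_{\geq 0} \\ |I_m| \geq 1}} \alpha_m(I_m) (x - x_0)^{I_m} \Bigr)^{j_m} \]
    converges absolutely for $x \in U'$ satisfying $|x_j - x_{0j}| < r (\bar{x}_j - x_{0j})$, $j \in \{1, \dots, n\}$. Note, however, that since $\alpha_k(0) = y_{0k}$, $k \in \{1, \dots, m\}$, this means that the series
    \[ \sum_{J \in \mathbb{Z}^m_{\geq 0}} \beta(J) \Bigl( \sum_{I_1 \in \mathbb{Z}^n_{\geq 0}} \alpha_1(I_1) (x - x_0)^{I_1} - y_{01} \Bigr)^{j_1} \cdots \Bigl( \sum_{I_m \in \mathbb{Z}^n_{\geq 0}} \alpha_m(I_m) (x - x_0)^{I_m} - y_{0m} \Bigr)^{j_m} \]
    converges absolutely for $x \in U'$ satisfying $|x_j - x_{0j}| < r (\bar{x}_j - x_{0j})$, $j \in \{1, \dots, n\}$. This last series, however, is precisely $g \circ f(x)$. This series is also a power series after a rearrangement, and any rearrangement will not affect convergence, cf. Remark 2.1.7. Thus $g \circ f$ is expressed as a convergent power series in a neighbourhood of $x_0$.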
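The substitution of one power series into another, as in the proof above, can be sketched for truncated series in one variable. We assume the expansions are centred so that the inner series has zero constant term (this corresponds to subtracting $y_0 = f(x_0)$ in the proof). The function names `mul_trunc` and `compose_trunc` and the example series are our own illustrative choices.

```python
def mul_trunc(p, q, order):
    """Product of two coefficient lists, discarding terms of degree > order."""
    out = [0] * (order + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= order:
                out[i + j] += a * b
    return out


def compose_trunc(g, f, order):
    """Coefficients of g(f(x)) up to degree `order`, assuming f[0] == 0."""
    result = [0] * (order + 1)
    power = [1] + [0] * order  # coefficients of f**0
    for coeff in g[: order + 1]:
        for i in range(order + 1):
            result[i] += coeff * power[i]
        power = mul_trunc(power, f, order)  # advance to the next power of f
    return result


g = [1] * 6    # g(y) = 1/(1 - y) = 1 + y + y^2 + ..., truncated at degree 5
f = [0, 1, 1]  # f(x) = x + x^2, with f(0) = 0
composite = compose_trunc(g, f, 5)
print(composite)
```

Because $f$ has zero constant term, $f^j$ contributes only to degrees at least $j$, so each coefficient of the composite is a finite sum. In this example $g \circ f(x) = 1/(1 - x - x^2)$, whose coefficients are the Fibonacci numbers.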

    2.2.2 The real analytic Inverse Function Theorem

    The Inv