International Journal of Physical and Mathematical Sciences
Vol 4, No 1 (2013) ISSN: 2010-1791
journal homepage: http://icoci.org/ijpms
A Brief Introduction to Differential Games
L. Gómez Esparza, G. Mendoza Torres, L. M. Saynes Torres.
Facultad de Ciencias de la Electrónica, Benemérita Universidad Autónoma de Puebla.
1. Introduction
The theory of dynamic games is concerned with multi-person decision making. The principal characteristic of a dynamic game is that it involves a decision process evolving in time (continuous or discrete), with more than one decision maker, each with his own cost function and possibly having access to different information. Dynamic game theory adopts characteristics from both game theory and optimal control theory, although it is much more versatile than either of them.
Differential games belong to a subclass of dynamic games called games in the state space. In a game in the state space, the modeler introduces a set of variables to describe the state of a dynamic system at any particular instant of the time interval over which the game takes place. The systematic study of differential game problems was initiated by Isaacs in 1954.
After the development of Pontryagin's maximum principle, it became clear that there was a connection between differential games and optimal control theory. In fact, differential game problems are a generalization of optimal control problems to cases in which there is more than one decision maker or player. However, differential games are conceptually much more complex than optimal control problems, in that it is not always clear what constitutes a solution. There are different kinds of optimal solutions for differential game problems, such as the minimax solution, the Nash equilibrium, and the Pareto equilibrium, depending on the characteristics of the game (see, e.g., Tolwinski (1982) and Haurie, Tolwinski, and Leitmann (1983)).
We present some results on cooperative and non-cooperative differential games and their "optimal" solutions. In particular, we study those related to the Pareto equilibrium (cooperative games) and the Nash equilibrium (non-cooperative games), although there are other types of cooperative and non-cooperative games, for example, commitment games and Stackelberg games, to name a few.
2. Preliminaries on optimal control theory
As mentioned above, optimal control problems are a special class of differential games, played by a single player with a single cost criterion. In this section we review some basic results of optimal control theory, dynamic programming and the maximum principle, since these results are decisive in dynamic game theory.
2.1. Statement of the optimal control problem (OCP)
In general, the (continuous-time) optimal control problem can be defined as follows:

Maximize J(u) = \int_0^T F(t, x(t), u(t)) dt + S(x(T))
subject to \dot{x}(t) = f(t, x(t), u(t)), x(0) = x_0, (1)

where \dot{x}(t) = f(t, x(t), u(t)) is called the state equation and J is called the objective function or cost criterion. In other words, the problem is to find the admissible control u^* which maximizes the objective function, subject to the state equation and the control constraints

u(t) \in \Omega(t). (2)

Usually the set \Omega(t) is determined by constraints (physical, economic, biological, etc.) on the values of the control variables at time t. The control u^* is called the optimal control and x^*, determined by means of the state equation with u = u^*, is called the optimal trajectory or an optimal path.
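To make the problem concrete, the following sketch approximates a toy instance numerically by direct discretization. The problem data (integrand x^2 + u^2, dynamics \dot{x} = u, horizon T = 1, x_0 = 1) and the discretization scheme are our own illustrative choices, not from the text; the instance is written as a minimization, i.e. as maximizing -J.

```python
import numpy as np
from scipy.optimize import minimize

# Toy OCP (illustrative, not from the text):
#   minimize J(u) = int_0^1 (x^2 + u^2) dt,  subject to x' = u, x(0) = 1.
# Known optimal value: tanh(1) ~= 0.7616 (from the associated Riccati equation).
N, T = 50, 1.0
dt = T / N

def cost(u):
    x, J = 1.0, 0.0
    for uk in u:                      # forward Euler on the state equation
        J += (x**2 + uk**2) * dt      # accumulate the running cost
        x += uk * dt
    return J

res = minimize(cost, np.zeros(N))     # unconstrained: Omega(t) = R here
print(res.fun)                        # close to tanh(1) ~= 0.7616
```

The unconstrained quadratic structure makes the default BFGS method adequate here; a genuine constraint set \Omega(t) would call for a bounded or projected method instead.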
2.2. Dynamic Programming and the Maximum Principle.
Dynamic programming is based on Bellman's principle of optimality, which Richard Bellman stated in his 1957 book on dynamic programming:

"An optimal policy has the property that, whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision."

Let us consider the optimal control problem (1). The maximum principle can be derived from Bellman's principle of optimality (see [45]). We state the maximum principle as follows.

Theorem 1. Assume that there exists an optimal couple (x^*, u^*) for the optimal control problem (1), and that f and F are continuously differentiable in x and continuous in t and u. Then there exists an adjoint variable \lambda that satisfies
\dot{\lambda}(t) = -H_x(t, x^*(t), u^*(t), \lambda(t)), (3)

\lambda(T) = S'(x^*(T)), (4)

H(t, x^*(t), u^*(t), \lambda(t)) = \max_{u \in \Omega(t)} H(t, x^*(t), u, \lambda(t)), (5)

where the so-called Hamiltonian H is defined as

H(t, x, u, \lambda) = F(t, x, u) + \lambda f(t, x, u). (6)

The maximum principle states that under certain assumptions there exists for every optimal control path an adjoint trajectory \lambda such that the maximum condition (5), the adjoint equation (3), and the transversality condition (4) are satisfied. To obtain a sufficiency theorem we augment these conditions by convexity assumptions. This yields the following theorem.
Theorem 2. Consider the optimal control problem given by equations (1) and (2), define the Hamiltonian function as in (6), and define the maximized Hamiltonian function

H^0(t, x, \lambda) = \max_{u \in \Omega(t)} H(t, x, u, \lambda). (7)

Assume that the state space is a convex set and that S is continuously differentiable and concave. Let u be a feasible control path with corresponding state trajectory x. If there exists an absolutely continuous function \lambda such that the maximum condition

H(t, x(t), u(t), \lambda(t)) = H^0(t, x(t), \lambda(t)), (8)

the adjoint equation

\dot{\lambda}(t) = -H_x(t, x(t), u(t), \lambda(t)), (9)

and the transversality condition

\lambda(T) = S'(x(T)) (10)

are satisfied, and such that the function H^0(t, \cdot, \lambda(t)) is concave and continuously differentiable with respect to x for all t, then (x, u) is an optimal path. If the set of feasible controls does not depend on the state, this result remains true if equation (10) is replaced by
(11)
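As a worked illustration of conditions (3)-(6) (our own linear-quadratic toy example, not from the text), the Pontryagin system can be solved in closed form:

```latex
% Toy example: maximize J(u) = -\int_0^1 (x^2 + u^2)\,dt,
% subject to \dot x = u, \; x(0) = 1, with S \equiv 0.
\[
H(t, x, u, \lambda) = -(x^2 + u^2) + \lambda u .
\]
% Maximum condition: H_u = -2u + \lambda = 0 \;\Rightarrow\; u^* = \lambda / 2 .
% Adjoint equation: \dot\lambda = -H_x = 2x , \quad transversality: \lambda(1) = 0 .
% Eliminating \lambda yields \ddot x = x, and the boundary conditions give
\[
x^*(t) = \frac{\cosh(1 - t)}{\cosh 1}, \qquad
u^*(t) = -\frac{\sinh(1 - t)}{\cosh 1}, \qquad
J(u^*) = -\tanh 1 .
\]
```

Since the maximized Hamiltonian H^0 = -x^2 + \lambda^2/4 is concave in x, Theorem 2 confirms that this candidate is indeed optimal.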
3. Differential games: basic concepts
The general N-player (deterministic) differential game is described by the state equation

\dot{x}(t) = f(t, x(t), u_1(t), \dots, u_N(t)), \quad x(0) = x_0, (12)

and the cost functional for each player is given by the equation

J^i(u_1, \dots, u_N) = \int_0^T F_i(t, x(t), u_1(t), \dots, u_N(t)) dt + S_i(x(T)), (13)

for i \in \mathbf{N}, where the index set \mathbf{N} = \{1, \dots, N\} is called the players' set.
In this formulation we consider a fixed interval of time [0, T] that is the prescribed duration of the game, and x_0 is the initial state, known by all players. The set of state paths generated by the state equation is called the trajectory space of the game. The controls u_i(t) \in U_i are chosen by player i for all t \in [0, T]; here U_i is named an admissible strategy set for player i. Then the problem can be stated as follows:
For each i \in \mathbf{N}, player i wants to choose his control u_i so as to minimize (or maximize) the cost functional (profits) J^i, subject to the state equation (12).
It is assumed that all players know the state equation as well as the cost functionals.
Example 1. In a two-firm differential game with one state variable x, the state evolves over time according to the differential equation

\dot{x}(t) = u_1(t)(m - x(t)) - u_2(t) x(t),

in which u_1, u_2 are scalar control variables of firms 1 and 2, respectively. The state variable x(t) represents the number of customers that firm 1 has at time t, and m is the constant size of the total market. Hence m - x(t) is the number of customers of firm 2. The control variables u_1(t), u_2(t) are the firms' respective advertising effort rates at time t. The differential equation, in this case, can be interpreted in the following way: the number of customers of firm 1 tends to increase by the advertising efforts of firm 1, since these efforts attract customers from firm 2. On the other hand, the advertising efforts of firm 2 tend to draw away customers from firm 1.
Payoffs are given by

J_1 = \int_0^T e^{-\rho t} [\pi_1 x(t) - C_1(u_1(t))] dt, \qquad
J_2 = \int_0^T e^{-\rho t} [\pi_2 (m - x(t)) - C_2(u_2(t))] dt,

in which \pi_1, \pi_2 represent the firms' unit revenues and \rho is the discount rate. The second term in the integrand of J_i is a convex advertising cost function C_i of firm i. Feasibility requires that u_1 and u_2 are not negative. Each firm wishes to choose its advertising strategy over [0, T] so as to maximize its payoff. The payoff is simply the present value of the firm's profit over the horizon.
Remark. In this game, the rival firm's actions do not influence a firm's payoff directly but only indirectly through the state dynamics.
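The customer dynamics of Example 1 can be sketched numerically. All parameter values below (market size, constant effort rates, time step) are our own illustrative assumptions:

```python
import numpy as np

# Advertising dynamics described in Example 1 (illustrative sketch):
#   x' = u1*(m - x) - u2*x
# Firm 1's ads attract firm 2's customers; firm 2's ads draw firm 1's away.
# All parameter values are assumptions for illustration.
m = 100.0            # constant total market size
u1, u2 = 0.3, 0.2    # constant advertising effort rates (assumed)
x, dt, T = 50.0, 0.01, 20.0

xs = [x]
for _ in range(int(T / dt)):          # forward Euler integration
    x += (u1 * (m - x) - u2 * x) * dt
    xs.append(x)

# With constant efforts the state converges to u1*m/(u1 + u2) = 60 customers.
print(round(xs[-1], 2))
```

The state stays in [0, m] for nonnegative efforts, consistent with its interpretation as a customer count.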
3.1. The information structure
In many problems the control function u_i, for each i \in \mathbf{N}, should be specified by means of an information structure, which is denoted by \eta_i(t) and is defined as

\eta_i(t) = \{x(s) : 0 \le s \le \epsilon_i(t)\},

where \epsilon_i(t) is nondecreasing in t.
Depending on the type of information available, we can define a strategy space \Gamma_i of player i consisting of all suitable mappings \gamma_i, with u_i(t) = \gamma_i(t, \eta_i(t)). We also require that \gamma_i(t, \eta_i(t)) belongs to U_i for all t \in [0, T].
Some types of standard information structures that arise in deterministic differential games are stated in the following definition.
Definition 1. In an N-player continuous-time deterministic dynamic game of prescribed duration [0, T], we say that player i's information is
(i) open-loop if \eta_i(t) = \{x_0\},
(ii) closed-loop if \eta_i(t) = \{x(s) : 0 \le s \le t\},
(iii) memoryless perfect state if \eta_i(t) = \{x_0, x(t)\}, for t \in [0, T],
(iv) feedback if \eta_i(t) = \{x(t)\}, for t \in [0, T].
The following theorem provides a set of conditions under which the problem given by equations (12) and (13) admits a unique solution for every N-tuple of strategies (\gamma_1, \dots, \gamma_N).
Theorem 3. Let the information structure for each player be any one of the information patterns of the definition above. Then, if
(i) f(t, x, u_1, \dots, u_N) is continuous in t for each x, u_1, \dots, u_N,
(ii) f(t, x, u_1, \dots, u_N) is uniformly Lipschitz in x, u_1, \dots, u_N,
(iii) for each i \in \mathbf{N}, \gamma_i(t, \eta_i(t)) is continuous in t for each x and uniformly Lipschitz in \eta_i(t),
then the differential equation (12) admits a unique state trajectory for every N-tuple (\gamma_1, \dots, \gamma_N), so that u_i(t) = \gamma_i(t, \eta_i(t)), and furthermore this unique trajectory is continuous.
3.2. Cooperative Games
In this section we fix the initial state x_0, and hence it will be omitted from the notation.
As mentioned above, differential games can be classified into two classes: cooperative games and non-cooperative games. In a cooperative game the players wish to cooperate to reach a result that will be beneficial to all.
3.2.1 Pareto Equilibrium
Definition 2. Let us consider a game with N players. Let J^i(\gamma) be player i's cost function, given the initial state and given that the players follow the multi-strategy \gamma = (\gamma_1, \dots, \gamma_N). Let \Gamma_i be the set of admissible strategies for player i, let \Gamma = \Gamma_1 \times \dots \times \Gamma_N, and let

J(\gamma) = (J^1(\gamma), \dots, J^N(\gamma)), (14)

where \gamma \in \Gamma. An admissible multi-strategy \gamma^* \in \Gamma is called Pareto-optimal if there does not exist another \gamma \in \Gamma such that

J^i(\gamma) \le J^i(\gamma^*) for all i \in \mathbf{N}, with J^j(\gamma) < J^j(\gamma^*) for some j. (15)

This concept can be illustrated in the following figure.
Figure 1.
We can see in Figure 1 that the pair (\gamma_1, \gamma_2) with the cost vector V = V(\gamma_1, \gamma_2) is not a Pareto equilibrium, since there exist other attainable points that are "below" V.
Let P be the set of Pareto equilibria (which is supposed to be nonempty). The set of vectors \{J(\gamma) : \gamma \in P\} is called the Pareto front of the game.
The following theorem provides one method to study the existence of Pareto equilibria.
Theorem 4. Let us consider the set of weight vectors

\Delta = \{\alpha = (\alpha_1, \dots, \alpha_N) : \alpha_i > 0, \ \alpha_1 + \dots + \alpha_N = 1\},

and for each \alpha \in \Delta consider the scalar function

J_\alpha(\gamma) = \sum_{i=1}^N \alpha_i J^i(\gamma). (16)

If for some vector \alpha \in \Delta there exists a strategy \gamma^* \in \Gamma that minimizes J_\alpha, i.e.,

J_\alpha(\gamma^*) = \min_{\gamma \in \Gamma} J_\alpha(\gamma), (17)

then \gamma^* is a Pareto equilibrium.
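The weighted-sum scalarization of Theorem 4 can be illustrated on a hypothetical static two-player game (our own example: a scalar joint decision a, with assumed costs J1(a) = a^2 and J2(a) = (a - 1)^2); each weight vector yields one Pareto equilibrium:

```python
from scipy.optimize import minimize_scalar

# Weighted-sum scalarization (Theorem 4) on a hypothetical static example.
J1 = lambda a: a**2              # player 1's cost (assumed)
J2 = lambda a: (a - 1.0)**2      # player 2's cost (assumed)

def pareto_point(alpha):
    """Minimize J_alpha = alpha*J1 + (1-alpha)*J2; by Theorem 4 the
    minimizer is a Pareto equilibrium."""
    res = minimize_scalar(lambda a: alpha * J1(a) + (1.0 - alpha) * J2(a))
    return res.x

# Sweeping the weight alpha traces the Pareto front; analytically a* = 1 - alpha.
for alpha in (0.25, 0.5, 0.75):
    print(alpha, round(pareto_point(alpha), 4))
```

Sweeping alpha over the whole simplex recovers the Pareto front of the game, exactly as the theorem suggests.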
3.3. Non-cooperative Games
In a non-cooperative game the players act independently and each
one wishes to optimize his
own objective function, i.e. players are rivals and all players
act in their own best interest, paying no
attention whatsoever to the fortunes of the other players.
An important example of non-cooperative games is the following problem.
3.3.1. Two-Person Zero-Sum Differential Games
Consider the state equation

\dot{x}(t) = f(t, x(t), u(t), v(t)), \quad x(0) = x_0. (18)

We may assume all variables to be scalar for the time being. In this equation, we let u and v denote the controls applied by players 1 and 2, respectively. We assume that

u(t) \in U, \quad v(t) \in V,

where U and V are convex sets. Consider the cost functional

J(u, v) = \int_0^T F(t, x(t), u(t), v(t)) dt + S(x(T)), (19)

which player 1 wants to maximize and player 2 wants to minimize. Since the gain of player 1 represents a loss to player 2, such games are named zero-sum games (because the sum of their cost functionals is identically zero). Thus, we are looking for admissible control trajectories u^* and v^* such that

J(u, v^*) \le J(u^*, v^*) \le J(u^*, v) for all admissible u, v. (20)

The solution (u^*, v^*) is known as a saddle point, but some authors call it the minimax solution. Here u^* and v^* stand for the optimal controls of players 1 and 2, respectively.
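A finite analogue of the saddle-point condition (20) is the minimax solution of a matrix game, which can be computed by linear programming; the rock-paper-scissors payoff matrix below is our own illustrative choice:

```python
import numpy as np
from scipy.optimize import linprog

# Saddle point of a zero-sum *matrix* game (finite analogue of (20)).
# Rock-paper-scissors payoffs for the maximizing row player (assumed example).
A = np.array([[0, -1, 1],
              [1, 0, -1],
              [-1, 1, 0]], dtype=float)
n = A.shape[0]

# LP: maximize v s.t. x^T A >= v componentwise, sum(x) = 1, x >= 0.
# Variables z = (x_1..x_n, v); linprog minimizes, so use c = (0,...,0,-1).
c = np.zeros(n + 1); c[-1] = -1.0
A_ub = np.hstack([-A.T, np.ones((n, 1))])   # v - (x^T A)_j <= 0 for each column j
b_ub = np.zeros(n)
A_eq = np.append(np.ones(n), 0.0).reshape(1, -1)
b_eq = [1.0]
bounds = [(0, None)] * n + [(None, None)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=bounds, method="highs")
x_opt, value = res.x[:n], res.x[-1]
print(x_opt, value)   # uniform mixing, game value 0
```

The symmetric structure of this particular matrix forces the value to zero and the optimal mixed strategy to be uniform.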
3.3.2. Nash equilibrium
First, we consider the case N = 2 (two players).
Definition 3. Let \gamma_2 be a strategy of player 2. We define the set of optimal responses of player 1 to the strategy \gamma_2 as

R_1(\gamma_2) = \{\gamma_1^* \in \Gamma_1 : J^1(\gamma_1^*, \gamma_2) \le J^1(\gamma_1, \gamma_2) for all \gamma_1 \in \Gamma_1\}. (21)

Similarly, the set of optimal responses of player 2 to a strategy \gamma_1 of player 1 is defined as

R_2(\gamma_1) = \{\gamma_2^* \in \Gamma_2 : J^2(\gamma_1, \gamma_2^*) \le J^2(\gamma_1, \gamma_2) for all \gamma_2 \in \Gamma_2\}. (22)

A multi-strategy (\gamma_1^*, \gamma_2^*) is said to be a Nash equilibrium if \gamma_1^* \in R_1(\gamma_2^*) and \gamma_2^* \in R_2(\gamma_1^*).
Equivalently, (\gamma_1^*, \gamma_2^*) is a Nash equilibrium if

J^1(\gamma_1^*, \gamma_2^*) \le J^1(\gamma_1, \gamma_2^*) for all \gamma_1 \in \Gamma_1, (23)

and

J^2(\gamma_1^*, \gamma_2^*) \le J^2(\gamma_1^*, \gamma_2) for all \gamma_2 \in \Gamma_2. (24)

In other words, in a Nash equilibrium a player cannot improve his situation if he alters his strategy unilaterally.
Generalizing to any finite number of players, we have the following.
Definition 4. A multi-strategy \gamma^* = (\gamma_1^*, \dots, \gamma_N^*) in \Gamma constitutes a Nash equilibrium solution if the following inequalities hold for all i \in \mathbf{N}:

J^i(\gamma^*) \le J^i(\gamma_1^*, \dots, \gamma_{i-1}^*, \gamma_i, \gamma_{i+1}^*, \dots, \gamma_N^*) for all \gamma_i \in \Gamma_i.

The interpretation of a Nash equilibrium solution is as follows: if one player tries to alter his strategy unilaterally, he cannot improve his performance by such a change. In this sort of situation each player is just interested in his own performance; that is, the game is played non-cooperatively.
Definition 5. Let us consider a dynamic game with N players and let J^i(t, x; \gamma) be the objective function of player i when the initial condition of the game is x at time t. Let \gamma = (\gamma_1, \dots, \gamma_N) be a Markovian multi-strategy, that is to say, each \gamma_i is Markovian (or feedback). We say that \gamma^* is a perfect Nash equilibrium if, for each i \in \mathbf{N} and any initial condition (t, x), it holds that

J^i(t, x; \gamma^*) = \inf_{\gamma_i} J^i(t, x; \gamma_1^*, \dots, \gamma_i, \dots, \gamma_N^*), (25)

where the infimum is taken over all the Markovian strategies of player i.
In other words, a perfect Nash equilibrium is a Markovian multi-strategy that is a Nash equilibrium whatever the initial condition of the game may be. In this case, some authors say that the Nash equilibrium is perfect in the subgames (subgame perfect). Observe that to solve (23) or (24) we essentially have to solve an OCP for each player. This suggests that, in principle, we can use techniques like the maximum principle or dynamic programming to find Nash equilibria.
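The best-response characterization (21)-(24) can be illustrated on a static Cournot-style duopoly (our own example; the linear demand and the cost parameter are assumptions). Iterating the optimal-response maps converges to the Nash equilibrium, where neither player can improve unilaterally:

```python
# Best-response iteration for a static Cournot duopoly (illustrative example;
# parameters assumed). Profits: pi_i(q1, q2) = q_i * (1 - q1 - q2) - c * q_i.
# Player i's best response to q_j: q_i = (1 - q_j - c) / 2.
c = 0.1

def best_response(q_other):
    return max(0.0, (1.0 - q_other - c) / 2.0)

q1 = q2 = 0.0
for _ in range(100):              # alternate the optimal-response maps
    q1 = best_response(q2)
    q2 = best_response(q1)

# Converges to the symmetric Nash equilibrium q* = (1 - c) / 3 = 0.3.
print(round(q1, 6), round(q2, 6))
```

At the fixed point each quantity is an optimal response to the other, which is exactly the Nash condition; the iteration converges here because the best-response map is a contraction.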
We will formulate the maximum principle for the open-loop case. We consider a differential game with N players, state space \mathbb{R}^n, and action sets U_i, for i \in \mathbf{N}. The dynamical model is

\dot{x}(t) = f(t, x(t), u_1(t), \dots, u_N(t)), \quad x(0) = x_0, (26)

for all t \in [0, T].
The admissible controls are open-loop, u_i(t) = \gamma_i(t, x_0), where \gamma_i is a measurable function from [0, T] to U_i. The players wish to maximize the objective functions

J^i(u_1, \dots, u_N) = \int_0^T F_i(t, x(t), u_1(t), \dots, u_N(t)) dt + S_i(x(T)).

Let \lambda be the matrix of adjoint variables whose i-th row is \lambda^i. In this case the Hamiltonian of player i is defined as follows:

H_i(t, x, u_1, \dots, u_N, \lambda^i) = F_i(t, x, u_1, \dots, u_N) + \lambda^i f(t, x, u_1, \dots, u_N).

Let us suppose that u^* = (u_1^*, \dots, u_N^*) is a Nash equilibrium and that x^* is the corresponding path. Then the following necessary conditions hold for each i \in \mathbf{N}: the adjoint equations are

\dot{\lambda}^i(t) = -\frac{\partial H_i}{\partial x}(t, x^*(t), u^*(t), \lambda^i(t)),

the terminal condition is

\lambda^i(T) = \frac{\partial S_i}{\partial x}(x^*(T)),

and the maximization of the Hamiltonian is

H_i(t, x^*(t), u^*(t), \lambda^i(t)) = \max_{u_i \in U_i} H_i(t, x^*(t), u_1^*(t), \dots, u_i, \dots, u_N^*(t), \lambda^i(t)).

We can note that this reduces the problem to a two-point boundary value problem that in some instances can be solved explicitly. For example, Clemhout and Wan (1974) consider trilinear games, called thus because the Hamiltonian is linear in the state, in the controls, and in the adjoint variable. Also, Dockner et al. (1985) identify several types of differential games that are solvable, in the sense that open-loop Nash equilibria can be determined explicitly.
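The two-point boundary value problem above can also be solved numerically. A minimal sketch for a symmetric two-player linear-quadratic game (our own example, written as a minimization for concreteness): each player minimizes \int_0^1 (x^2 + u_i^2) dt subject to \dot{x} = u_1 + u_2, x(0) = 1; the necessary conditions give u_i = -\lambda^i/2, and by symmetry \lambda^1 = \lambda^2 = \lambda:

```python
import numpy as np
from scipy.integrate import solve_bvp

# Open-loop Nash equilibrium of a symmetric two-player LQ game (our own example):
#   minimize J_i = int_0^1 (x^2 + u_i^2) dt,  x' = u1 + u2,  x(0) = 1.
# Pontryagin: u_i = -lam_i/2; by symmetry lam_1 = lam_2 = lam, giving the
# two-point boundary value problem  x' = -lam,  lam' = -2x,  x(0)=1, lam(1)=0.

def odes(t, y):                       # y[0] = x, y[1] = lam
    return np.vstack([-y[1], -2.0 * y[0]])

def bc(ya, yb):
    return np.array([ya[0] - 1.0,     # x(0) = 1
                     yb[1]])          # lam(1) = 0 (terminal condition)

t = np.linspace(0.0, 1.0, 11)
y0 = np.zeros((2, t.size)); y0[0] = 1.0
sol = solve_bvp(odes, bc, t, y0)

# Analytic check: x(t) = cosh(sqrt(2)(1 - t)) / cosh(sqrt(2)).
print(sol.y[0, -1])                   # x(1), about 1/cosh(sqrt(2)) ~= 0.459
```

For larger games without such symmetry, the same collocation approach applies to the full stacked system of state and adjoint equations.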
Example of a perfect Nash equilibrium
Let us consider a differential game with two players and finite horizon T. To save on notation, denote the control variables of the two players by u and v instead of u_1 and u_2. The state space is the real line, the initial state is a fixed number x_0, and the set of feasible controls is U for player 1 and V for player 2. The objective functionals are
and
The system dynamics are given by
Let us first try to find an open-loop Nash equilibrium of the above game, that is, a pair where and are the strategies for player 1 and player 2, respectively. If player 2 chooses to play , then player 1's problem can be written as
Maximize
Subject to (27)
Since is assumed by player 1 to be a fixed function, the maximization of the integral in (27) is equivalent to the maximization of , so that the problem above is equivalent to the problem
Maximize
Subject to (28)
with , and this problem has the optimal open-loop strategy , where is the unique solution of and
We also have that
(29)
where
. Finally, the state trajectory generated by this solution is
given by .
Note that the formula for now depends on player 2's control path, which is still to be determined.
Now consider player 2's control problem. If player 1 chooses ,
player 2's problem
can be written as
Maximize
Subject to (30)
Denoting by the costate variable of player 2, the Hamiltonian
function for this problem is
given by . Maximization with respect to
yields if and if . If then
, independently of . These properties imply that the maximized
Hamiltonian function is
given by
The adjoint equation and transversality condition for player 2's problem are
Using
this can be written as
(31)
The boundary value problem consisting of (28) and (31) has a unique solution, which is given by with from (29). The function is easily seen to be non-positive and strictly increasing on . It depends on the parameters whether for all , or whether can be smaller than for some . Because of the monotonicity, however, we know that in the latter case there exists a number such that for and for all . Careful analysis of (31) reveals that such a number exists if and only if and , in which case it is given by
In all other cases let us formally set . We summarize our results as follows. There exists a candidate for an open-loop Nash equilibrium, given by
where and are specified as above.
To verify that the above candidate is an open-loop Nash equilibrium, it suffices to prove that is an optimal control path in player 2's problem. This, however, follows from Theorem 2 by noting that for all , which shows that the maximized Hamiltonian function of player 2's problem is a concave function with respect to the state. This concludes the derivation of an open-loop Nash equilibrium for this example.
4. References
[1] Amir, R. (2003). Stochastic games in economics and related
fields: an
overview. In Neyman and Sorin (2003), Chapter 30.
[2] Arkin, V.I., Evstigneev, I.V. (1987). Stochastic Models of Control and Economic Dynamics. Academic Press, London.
[3] Balbus, L., Nowak, A.S. (2004). Construction of Nash equilibria in symmetric stochastic games of capital accumulation. Math. Meth. Oper. Res. 60, pp. 267-277.
[4] Basar, T., editor (1986). Dynamic Games and Applications in Economics. Lecture Notes in Economics and Mathematical Systems 265, Springer-Verlag, Berlin.
[5] Basar, T., Olsder, G.J. (1999). Dynamic Noncooperative Game Theory, Second Edition. SIAM, Philadelphia. (The first edition was published by Academic Press in 1982.)
[6] Bellman, R. (1957). Dynamic Programming. Princeton University Press, Princeton, N.J.
[7] Bernhard, P. (2005). Robust control approach to option
pricing, including
transaction costs. In Nowak and Szajowski (2005), pp.
391-416.
[8] Dockner, E., Feichtinger, G., Jorgensen, S. (1985).
Tractable classes of
nonzero sum open-loop Nash differential games: theory and
examples. J. Optim. Theory
Appl. 45, pp. 179-197.
[9] Dockner, E.J., Jorgensen, S., Long, N.V., Sorger, G. (2000).
Differential
Games in Economics and Management Science. Cambridge University
Press,
Cambridge, U.K.
[10] Feichtinger, G., Jorgensen, S. (1983). Differential game models in management. Euro. J. Oper. Res. 14, pp. 137-155.
[11] Fershtman, C., Muller, E. (1986). Turnpike properties of capital accumulation games. J. Econ. Theory 38, pp. 167-177.
[12] Filar, J.A., Petrosjan, L.A. (2000). Dynamic cooperative
games. Internatl.
Game Theory Rev. 2, pp. 47-65.
[13] Filar, J., Vrieze, K. (1997). Competitive Markov Decision
Processes.
Springer Verlag, New York.
[14] Fleming, W.H., Rishel, R.W. (1975). Deterministic and Stochastic Optimal Control. Springer-Verlag, New York.
[15] Fleming, W.H., Soner, H.M. (1992). Controlled Markov
Processes and
Viscosity Solutions. Springer-Verlag, New York.
[16] Fudenberg, D., Tirole, J. (1991). Game Theory. MIT Press, Cambridge, MA.
Gaidov, S.D. (1986). Pareto-optimality in stochastic differential games. Problems of Control and Information Theory 15, pp. 439-450.
[17] Gaidov, S.D. (1990). On the Nash-bargaining solution in
stochastic
differential games. Serdica 16, pp. 120-125.
[18] Isaacs, R. (1965). Differential Games. Wiley, New York.
[19] Jorgensen, S., Sorger, G. (1990). Feedback Nash equilibrium
in a problem of
optimal fishery management. J. Optim. Theory Appl. 64, pp.
293-310.
[20] Jorgensen, S., Yeung, D.W.K. (1996). Stochastic differential game model of a common property fishery. J. Optim. Theory Appl. 90, pp. 381-403.
[21] Kaitala, V., Hamalainen, R.P., Ruusunen, J. (1985). On the analysis of equilibrium and bargaining in a fishery game. In Feichtinger (1985), pp. 593-606.
[22] Kalai, E., Smorodinsky, M. (1975). Other solutions to
Nash's bargaining
problem. Econometrica 43, pp. 513-518.
[23] Kannan, D., Lakshmikantham, V., editors (2002). Handbook of Stochastic Analysis and Applications. Dekker, New York.
[24] Karatzas, I., Shreve, S.E. (1998). Methods of Mathematical Finance. Springer-Verlag, New York.
[25] Kirman, A.P., Sobel, M.J. (1974). Dynamic oligopoly with
inventories.
Econometrica 42, pp. 279-287.
[26] Kuhn, H.W., Szego, G.P., editors (1971). Differential Games and Related Topics. North-Holland, Amsterdam.
[27] Leitmann, G. (1974). Cooperative and Non-cooperative Many
Players
Differential Games. Springer-Verlag, New York.
[28] Le Van, C., Dana, R.-A. (2003). Dynamic Programming in Economics. Kluwer, Boston.
[29] Nash, J. (1950a). Equilibrium points in N-person games.
Proc. Natl. Acad.
Sci. 36, pp. 48-49.
[30] Nash, J. (1950b). The bargaining problem. Econometrica 18,
pp. 155-162.
[31] Nash, J. (1951). Non-cooperative games. Ann. Math. 54, pp.
286-295.
[32] Nash, J. (1953). Two-person cooperative games. Econometrica
21, pp. 128-
140.
[33] Neck, R. (1982). Dynamic systems with several decision makers. In Operations Research in Progress, ed. by G. Feichtinger and P. Kall, Reidel, New York, pp. 261-284.
[34] Neyman, A., Sorin, S., editors (2003). Stochastic Games and Applications. Kluwer, Dordrecht.
[35] Nowak, A.S., Szajowski, P. (2003). On Nash equilibrium in stochastic games of capital accumulation. In Stochastic Games and Applications, Volume 9, edited by L.A. Petrosjan and V.V. Mazalov, Nova Science, pp. 118-129.
[36] Nowak, A.S., Szajowski, K., editors (2005). Advances in Dynamic Games. (Annals of the International Society of Dynamic Games, vol. 7) Birkhauser, Boston.
[37] Petrosyan, L.A. (2003). Bargaining in dynamic games. In Petrosyan and Yeung (2003), pp. 139-143.
[38] Petrosyan, L.A. (2005). Cooperative differential games. In
Nowak and
Szajowski (2005), pp. 183-200.
[39] Petrosyan, L.A., Zenkevich, N.A. (1996). Game Theory. World Scientific, Singapore.
[40] Petrosyan, L.A., Yeung, D.W.K., editors (2003). ICM Millennium Lectures on Games. Springer-Verlag, Berlin.
[41] Pohjola, M. (1983). Nash and Stackelberg solutions in a differential game model of capitalism. J. Economic Dynamics and Control 6, pp. 173-186.
[42] Ricci, G., Editor (1991). Decision Processes in Economics.
Lecture Notes in
Economics and Mathematical Systems 353, Springer-Verlag,
Berlin.
[43] Roth, A.E. (1979). Axiomatic Models of Bargaining. Springer-Verlag, Berlin.
[44] Roth, A.E. (1985). Game-Theoretic Models of Bargaining. Cambridge University Press, Cambridge, U.K.
[45] Sethi, S.P., Thompson, G.L. (2000). Optimal Control Theory:
Applications to
Management Science and Economics, 2nd Edition. Kluwer,
Boston.
[46] Shapley, L. (1953). Stochastic games. Proc. Natl. Acad.
Sci. 39, pp. 1095-
1100.
[47] Stokey, N.L., Lucas, R.E. (1989). Recursive Methods in
Economic Dynamics.
Harvard University Press, Cambridge, MA.
[48] Tabak, D., Kuo, B.C. (1971). Optimal Control by
Mathematical Programming.
Prentice-Hall, Englewood Cliffs, N.J.
[49] Tolwinski, B., Haurie, A., Leitmann, G. (1986). Cooperative
equilibrium in
differential games. J. Math. Anal. Appl. 119, pp. 182-202.
[50] Toussaint, B., (1985). The transversality condition at
infinity applied to a
problem of optimal resource depletion. In Feichtinger (1985),
pp. 429-440.
[51] Vaisbord, E.M., Zhukovskii, V.I. (1988). Introduction to
Multi-Player
Differential Games and Their Applications. Gordon and Breach,
New York.
[52] Vega-Amaya, O. (2003). Zero-sum average semi-Markov games: fixed-point solutions of the Shapley equation. SIAM J. Control Optim. 42, pp. 1876
[53] Von Neumann, J., Morgenstern, O. (1944). The Theory of Games and Economic Behavior. Princeton University Press, Princeton, N.J.
[54] Yong, J., Zhou, X.Y. (1999). Stochastic Controls: Hamiltonian Systems and HJB Equations. Springer-Verlag, New York.
[55] Yu, P.L., Leitmann, G. (1974). Compromise solutions, domination structures, and Salukvadze's solution. J. Optim. Theory Appl. 3, pp. 362-378.
[56] Zhukovskiy, V.I., Salukvadze, M.E. (1994). The Vector-Valued Maximin. Academic Press, Boston.