International Journal of Physical and Mathematical Sciences
Vol 4, No 1 (2013) ISSN: 2010-1791
journal homepage: http://icoci.org/ijpms
A Brief Introduction to Differential Games
L. Gómez Esparza, G. Mendoza Torres, L. M. Saynes Torres.
Facultad de Ciencias de la Electrónica, Benemérita Universidad Autónoma de Puebla.
1. Introduction
The theory of dynamic games is concerned with multi-person decision making. The principal characteristic of a dynamic game is that it involves a decision process evolving in time (continuous or discrete), with more than one decision maker, each with his own cost function and possibly having access to different information. Dynamic game theory adopts characteristics from both game theory and optimal control theory, although it is much more versatile than either of them.
Differential games belong to a subclass of dynamic games called games in the state space. In a game in the state space, the modeler introduces a set of variables to describe the state of a dynamic system at any particular instant of the time interval over which the game takes place. The systematic study of differential game problems was initiated by Isaacs in 1954.
After the development of Pontryagin's maximum principle, it became clear that there was a connection between differential games and optimal control theory. In fact, differential game problems are a generalization of optimal control problems to cases in which there is more than one decision maker or player. However, differential games are conceptually much more complex than optimal control problems, in that it is not always clear what constitutes a solution. There are different kinds of optimal solutions for differential game problems, such as the minimax solution, the Nash equilibrium, and the Pareto equilibrium, depending on the characteristics of the game (see, e.g., Tolwinski (1982) and Haurie, Tolwinski, and Leitmann (1983)).
We present some results on cooperative and non-cooperative differential games and their "optimal" solutions. In particular, we study those related to the Pareto equilibrium (cooperative games) and the Nash equilibrium (non-cooperative games), although there are other types of cooperative and non-cooperative games, for example, commitment games and Stackelberg games, to name a few.
2. Preliminaries on optimal control theory
As mentioned above, optimal control problems are a special class of differential games, played by a single player with a single cost criterion. In this section we review some basic results of optimal control theory, dynamic programming and the maximum principle, since these results are decisive in dynamic game theory.
2.1. Statement of the optimal control problem (OCP)
In general, the (continuous-time) optimal control problem can be defined as follows:

Maximize J(u) = \int_0^T F(t, x(t), u(t)) dt + S(x(T))
subject to \dot{x}(t) = f(t, x(t), u(t)), x(0) = x_0, (1)

where \dot{x}(t) = f(t, x(t), u(t)) is called the state equation and J is called the objective function or cost criterion. In other words, the problem is to find the admissible control u^* which maximizes the objective function, subject to the state equation and the control constraints

u(t) \in \Omega(t). (2)

Usually the set \Omega(t) is determined by constraints (physical, economic, biological, etc.) on the values of the control variables at time t. The control u^* is called the optimal control and x^*, determined by means of the state equation with u = u^*, is called the optimal trajectory or an optimal path.
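To make the problem concrete, the following sketch approximates a toy instance numerically by direct discretization. The problem data (integrand x^2 + u^2, dynamics \dot{x} = u, horizon T = 1, x_0 = 1) and the discretization scheme are our own illustrative choices, not from the text; the instance is written as a minimization, i.e. as maximizing -J.

```python
import numpy as np
from scipy.optimize import minimize

# Toy OCP (illustrative, not from the text):
#   minimize J(u) = int_0^1 (x^2 + u^2) dt,  subject to x' = u, x(0) = 1.
# Known optimal value: tanh(1) ~= 0.7616 (from the associated Riccati equation).
N, T = 50, 1.0
dt = T / N

def cost(u):
    x, J = 1.0, 0.0
    for uk in u:                      # forward Euler on the state equation
        J += (x**2 + uk**2) * dt      # accumulate the running cost
        x += uk * dt
    return J

res = minimize(cost, np.zeros(N))     # unconstrained: Omega(t) = R here
print(res.fun)                        # close to tanh(1) ~= 0.7616
```

The unconstrained quadratic structure makes the default BFGS method adequate here; a genuine constraint set \Omega(t) would call for a bounded or projected method instead.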
2.2. Dynamic Programming and the Maximum Principle.
Dynamic programming is based on Bellman's principle of optimality, which Richard Bellman stated in his 1957 book on dynamic programming:

"An optimal policy has the property that, whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision."

Let us consider the optimal control problem (1). The maximum principle can be derived from Bellman's principle of optimality (see [45]). We state the maximum principle as follows.

Theorem 1. Assume that there exists an optimal couple (x^*, u^*) for the optimal control problem (1), and that f and F are continuously differentiable in x and continuous in t and u. Then there exists an adjoint variable \lambda that satisfies
\dot{\lambda}(t) = -H_x(t, x^*(t), u^*(t), \lambda(t)), (3)

\lambda(T) = S'(x^*(T)), (4)

H(t, x^*(t), u^*(t), \lambda(t)) = \max_{u \in \Omega(t)} H(t, x^*(t), u, \lambda(t)), (5)

where the so-called Hamiltonian H is defined as

H(t, x, u, \lambda) = F(t, x, u) + \lambda f(t, x, u). (6)

The maximum principle states that under certain assumptions there exists for every optimal control path an adjoint trajectory \lambda such that the maximum condition (5), the adjoint equation (3), and the transversality condition (4) are satisfied. To obtain a sufficiency theorem we augment these conditions by convexity assumptions. This yields the following theorem.
Theorem 2. Consider the optimal control problem given by equations (1) and (2), define the Hamiltonian function as in (6), and define the maximized Hamiltonian function

H^0(t, x, \lambda) = \max_{u \in \Omega(t)} H(t, x, u, \lambda). (7)

Assume that the state space is a convex set and that S is continuously differentiable and concave. Let u be a feasible control path with corresponding state trajectory x. If there exists an absolutely continuous function \lambda such that the maximum condition

H(t, x(t), u(t), \lambda(t)) = H^0(t, x(t), \lambda(t)), (8)

the adjoint equation

\dot{\lambda}(t) = -H_x(t, x(t), u(t), \lambda(t)), (9)

and the transversality condition

\lambda(T) = S'(x(T)) (10)

are satisfied, and such that the function H^0(t, \cdot, \lambda(t)) is concave and continuously differentiable with respect to x for all t, then (x, u) is an optimal path. If the set of feasible controls does not depend on the state, this result remains true if equation (10) is replaced by
(11)
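As a worked illustration of conditions (3)-(6) (our own linear-quadratic toy example, not from the text), the Pontryagin system can be solved in closed form:

```latex
% Toy example: maximize J(u) = -\int_0^1 (x^2 + u^2)\,dt,
% subject to \dot x = u, \; x(0) = 1, with S \equiv 0.
\[
H(t, x, u, \lambda) = -(x^2 + u^2) + \lambda u .
\]
% Maximum condition: H_u = -2u + \lambda = 0 \;\Rightarrow\; u^* = \lambda / 2 .
% Adjoint equation: \dot\lambda = -H_x = 2x , \quad transversality: \lambda(1) = 0 .
% Eliminating \lambda yields \ddot x = x, and the boundary conditions give
\[
x^*(t) = \frac{\cosh(1 - t)}{\cosh 1}, \qquad
u^*(t) = -\frac{\sinh(1 - t)}{\cosh 1}, \qquad
J(u^*) = -\tanh 1 .
\]
```

Since the maximized Hamiltonian H^0 = -x^2 + \lambda^2/4 is concave in x, Theorem 2 confirms that this candidate is indeed optimal.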
3. Differential games: basic concepts
The general N-player (deterministic) differential game is described by the state equation

\dot{x}(t) = f(t, x(t), u_1(t), \dots, u_N(t)), \quad x(0) = x_0, (12)

and the cost functional for each player is given by the equation

J^i(u_1, \dots, u_N) = \int_0^T F_i(t, x(t), u_1(t), \dots, u_N(t)) dt + S_i(x(T)), (13)

for i \in \mathbf{N}, where the index set \mathbf{N} = \{1, \dots, N\} is called the players' set.
In this formulation we consider a fixed interval of time [0, T] that is the prescribed duration of the game, and x_0 is the initial state, known by all players. The set of state paths generated by the state equation is called the trajectory space of the game. The controls u_i(t) \in U_i are chosen by player i for all t \in [0, T]; here U_i is named an admissible strategy set for player i. Then the problem can be stated as follows:
For each i \in \mathbf{N}, player i wants to choose his control u_i so as to minimize (or maximize) the cost functional (profits) J^i, subject to the state equation (12).
It is assumed that all players know the state equation as well as the cost functionals.
Example 1. In a two-firm differential game with one state variable x, the state evolves over time according to the differential equation

\dot{x}(t) = u_1(t)(m - x(t)) - u_2(t) x(t),

in which u_1, u_2 are scalar control variables of firms 1 and 2, respectively. The state variable x(t) represents the number of customers that firm 1 has at time t, and m is the constant size of the total market. Hence m - x(t) is the number of customers of firm 2. The control variables u_1(t), u_2(t) are the firms' respective advertising effort rates at time t. The differential equation, in this case, can be interpreted in the following way: the number of customers of firm 1 tends to increase by the advertising efforts of firm 1, since these efforts attract customers from firm 2. On the other hand, the advertising efforts of firm 2 tend to draw away customers from firm 1.
Payoffs are given by

J_1 = \int_0^T e^{-\rho t} [\pi_1 x(t) - C_1(u_1(t))] dt, \qquad
J_2 = \int_0^T e^{-\rho t} [\pi_2 (m - x(t)) - C_2(u_2(t))] dt,

in which \pi_1, \pi_2 represent the firms' unit revenues and \rho is the discount rate. The second term in the integrand of J_i is a convex advertising cost function C_i of firm i. Feasibility requires that u_1 and u_2 are not negative. Each firm wishes to choose its advertising strategy over [0, T] so as to maximize its payoff. The payoff is simply the present value of the firm's profit over the horizon.
Remark. In this game, the rival firm's actions do not influence a firm's payoff directly but only indirectly through the state dynamics.
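The customer dynamics of Example 1 can be sketched numerically. All parameter values below (market size, constant effort rates, time step) are our own illustrative assumptions:

```python
import numpy as np

# Advertising dynamics described in Example 1 (illustrative sketch):
#   x' = u1*(m - x) - u2*x
# Firm 1's ads attract firm 2's customers; firm 2's ads draw firm 1's away.
# All parameter values are assumptions for illustration.
m = 100.0            # constant total market size
u1, u2 = 0.3, 0.2    # constant advertising effort rates (assumed)
x, dt, T = 50.0, 0.01, 20.0

xs = [x]
for _ in range(int(T / dt)):          # forward Euler integration
    x += (u1 * (m - x) - u2 * x) * dt
    xs.append(x)

# With constant efforts the state converges to u1*m/(u1 + u2) = 60 customers.
print(round(xs[-1], 2))
```

The state stays in [0, m] for nonnegative efforts, consistent with its interpretation as a customer count.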
3.1. The information structure
In many problems the control function u_i, for each i \in \mathbf{N}, should be specified by means of an information structure, which is denoted by \eta_i(t) and is defined as

\eta_i(t) = \{x(s) : 0 \le s \le \epsilon_i(t)\},

where \epsilon_i(t) is nondecreasing in t.
Depending on the type of information available, we can define a strategy space \Gamma_i of player i consisting of all suitable mappings \gamma_i, with u_i(t) = \gamma_i(t, \eta_i(t)). We also require that \gamma_i(t, \eta_i(t)) belongs to U_i for all t \in [0, T].
Some types of standard information structures that arise in deterministic differential games are stated in the following definition.
Definition 1. In an N-player continuous-time deterministic dynamic game of prescribed duration [0, T], we say that player i's information is
(i) open-loop if \eta_i(t) = \{x_0\},
(ii) closed-loop if \eta_i(t) = \{x(s) : 0 \le s \le t\},
(iii) memoryless perfect state if \eta_i(t) = \{x_0, x(t)\}, for t \in [0, T],
(iv) feedback if \eta_i(t) = \{x(t)\}, for t \in [0, T].
The following theorem provides a set of conditions under which the problem given by equations (12) and (13) admits a unique solution for every N-tuple of strategies (\gamma_1, \dots, \gamma_N).
Theorem 3. Let the information structure for each player be any one of the information patterns of the definition above. Then, if
(i) f(t, x, u_1, \dots, u_N) is continuous in t for each x, u_1, \dots, u_N,
(ii) f(t, x, u_1, \dots, u_N) is uniformly Lipschitz in x, u_1, \dots, u_N,
(iii) for each i \in \mathbf{N}, \gamma_i(t, \eta_i(t)) is continuous in t for each x and uniformly Lipschitz in \eta_i(t),
then the differential equation (12) admits a unique state trajectory for every N-tuple (\gamma_1, \dots, \gamma_N), so that u_i(t) = \gamma_i(t, \eta_i(t)), and furthermore this unique trajectory is continuous.
3.2. Cooperative Games
In this section we fix the initial state x_0, and hence it will be omitted from the notation.
As mentioned above, differential games can be classified into two classes: cooperative games and non-cooperative games. In a cooperative game the players wish to cooperate to reach a result that will be beneficial to all.
3.2.1 Pareto Equilibrium
Definition 2. Let us consider a game with N players. Let J^i(\gamma) be player i's cost function, given the initial state and given that the players follow the multi-strategy \gamma = (\gamma_1, \dots, \gamma_N). Let \Gamma_i be the set of admissible strategies for player i, let \Gamma = \Gamma_1 \times \dots \times \Gamma_N, and let

J(\gamma) = (J^1(\gamma), \dots, J^N(\gamma)), (14)

where \gamma \in \Gamma. An admissible multi-strategy \gamma^* \in \Gamma is called Pareto-optimal if there does not exist another \gamma \in \Gamma such that

J^i(\gamma) \le J^i(\gamma^*) for all i \in \mathbf{N}, with J^j(\gamma) < J^j(\gamma^*) for some j. (15)

This concept can be illustrated in the following figure.
Figure 1.
We can see in Figure 1 that the pair (\gamma_1, \gamma_2) with the cost vector V = V(\gamma_1, \gamma_2) is not a Pareto equilibrium, since there exist other attainable points that are "below" V.
Let P be the set of Pareto equilibria (which is supposed to be nonempty). The set of vectors \{J(\gamma) : \gamma \in P\} is called the Pareto front of the game.
The following theorem provides one method to study the existence of Pareto equilibria.
Theorem 4. Let us consider the set of weight vectors

\Delta = \{\alpha = (\alpha_1, \dots, \alpha_N) : \alpha_i > 0, \ \alpha_1 + \dots + \alpha_N = 1\},

and for each \alpha \in \Delta consider the scalar function

J_\alpha(\gamma) = \sum_{i=1}^N \alpha_i J^i(\gamma). (16)

If for some vector \alpha \in \Delta there exists a strategy \gamma^* \in \Gamma that minimizes J_\alpha, i.e.,

J_\alpha(\gamma^*) = \min_{\gamma \in \Gamma} J_\alpha(\gamma), (17)

then \gamma^* is a Pareto equilibrium.
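The weighted-sum scalarization of Theorem 4 can be illustrated on a hypothetical static two-player game (our own example: a scalar joint decision a, with assumed costs J1(a) = a^2 and J2(a) = (a - 1)^2); each weight vector yields one Pareto equilibrium:

```python
from scipy.optimize import minimize_scalar

# Weighted-sum scalarization (Theorem 4) on a hypothetical static example.
J1 = lambda a: a**2              # player 1's cost (assumed)
J2 = lambda a: (a - 1.0)**2      # player 2's cost (assumed)

def pareto_point(alpha):
    """Minimize J_alpha = alpha*J1 + (1-alpha)*J2; by Theorem 4 the
    minimizer is a Pareto equilibrium."""
    res = minimize_scalar(lambda a: alpha * J1(a) + (1.0 - alpha) * J2(a))
    return res.x

# Sweeping the weight alpha traces the Pareto front; analytically a* = 1 - alpha.
for alpha in (0.25, 0.5, 0.75):
    print(alpha, round(pareto_point(alpha), 4))
```

Sweeping alpha over the whole simplex recovers the Pareto front of the game, exactly as the theorem suggests.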
3.3. Non-cooperative Games
In a non-cooperative game the players act independently and each
one wishes to optimize his
own objective function, i.e. players are rivals and all players
act in their own best interest, paying no
attention whatsoever to the fortunes of the other players.
An important example of non-cooperative games is the following problem.
3.3.1. Two-Person Zero-Sum Differential Games
Consider the state equation

\dot{x}(t) = f(t, x(t), u(t), v(t)), \quad x(0) = x_0. (18)

We may assume all variables to be scalar for the time being. In this equation, we let u and v denote the controls applied by players 1 and 2, respectively. We assume that

u(t) \in U, \quad v(t) \in V,

where U and V are convex sets. Consider the cost functional

J(u, v) = \int_0^T F(t, x(t), u(t), v(t)) dt + S(x(T)), (19)

which player 1 wants to maximize and player 2 wants to minimize. Since the gain of player 1 represents a loss to player 2, such games are named zero-sum games (because the sum of their cost functionals is identically zero). Thus, we are looking for admissible control trajectories u^* and v^* such that

J(u, v^*) \le J(u^*, v^*) \le J(u^*, v) for all admissible u, v. (20)

The solution (u^*, v^*) is known as a saddle point, but some authors call it the minimax solution. Here u^* and v^* stand for the optimal controls of players 1 and 2, respectively.
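A finite analogue of the saddle-point condition (20) is the minimax solution of a matrix game, which can be computed by linear programming; the rock-paper-scissors payoff matrix below is our own illustrative choice:

```python
import numpy as np
from scipy.optimize import linprog

# Saddle point of a zero-sum *matrix* game (finite analogue of (20)).
# Rock-paper-scissors payoffs for the maximizing row player (assumed example).
A = np.array([[0, -1, 1],
              [1, 0, -1],
              [-1, 1, 0]], dtype=float)
n = A.shape[0]

# LP: maximize v s.t. x^T A >= v componentwise, sum(x) = 1, x >= 0.
# Variables z = (x_1..x_n, v); linprog minimizes, so use c = (0,...,0,-1).
c = np.zeros(n + 1); c[-1] = -1.0
A_ub = np.hstack([-A.T, np.ones((n, 1))])   # v - (x^T A)_j <= 0 for each column j
b_ub = np.zeros(n)
A_eq = np.append(np.ones(n), 0.0).reshape(1, -1)
b_eq = [1.0]
bounds = [(0, None)] * n + [(None, None)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=bounds, method="highs")
x_opt, value = res.x[:n], res.x[-1]
print(x_opt, value)   # uniform mixing, game value 0
```

The symmetric structure of this particular matrix forces the value to zero and the optimal mixed strategy to be uniform.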
3.3.2. Nash equilibrium
First, we consider the case N = 2 (two players).
Definition 3. Let \gamma_2 be a strategy of player 2. We define the set of optimal responses of player 1 to the strategy \gamma_2 as

R_1(\gamma_2) = \{\gamma_1^* \in \Gamma_1 : J^1(\gamma_1^*, \gamma_2) \le J^1(\gamma_1, \gamma_2) for all \gamma_1 \in \Gamma_1\}. (21)

Similarly, the set of optimal responses of player 2 to a strategy \gamma_1 of player 1 is defined as

R_2(\gamma_1) = \{\gamma_2^* \in \Gamma_2 : J^2(\gamma_1, \gamma_2^*) \le J^2(\gamma_1, \gamma_2) for all \gamma_2 \in \Gamma_2\}. (22)

A multi-strategy (\gamma_1^*, \gamma_2^*) is said to be a Nash equilibrium if \gamma_1^* \in R_1(\gamma_2^*) and \gamma_2^* \in R_2(\gamma_1^*).
Equivalently, (\gamma_1^*, \gamma_2^*) is a Nash equilibrium if

J^1(\gamma_1^*, \gamma_2^*) \le J^1(\gamma_1, \gamma_2^*) for all \gamma_1 \in \Gamma_1, (23)

and

J^2(\gamma_1^*, \gamma_2^*) \le J^2(\gamma_1^*, \gamma_2) for all \gamma_2 \in \Gamma_2. (24)

In other words, in a Nash equilibrium a player cannot improve his situation if he alters his strategy unilaterally.
Generalizing to any finite number of players, we have the following.
Definition 4. A multi-strategy \gamma^* = (\gamma_1^*, \dots, \gamma_N^*) in \Gamma constitutes a Nash equilibrium solution if the following inequalities hold for all i \in \mathbf{N}:

J^i(\gamma^*) \le J^i(\gamma_1^*, \dots, \gamma_{i-1}^*, \gamma_i, \gamma_{i+1}^*, \dots, \gamma_N^*) for all \gamma_i \in \Gamma_i.

The interpretation of a Nash equilibrium solution is as follows: if one player tries to alter his strategy unilaterally, he cannot improve his performance by such a change. In this sort of situation each player is just interested in his own performance; that is, the game is played non-cooperatively.
Definition 5. Let us consider a dynamic game with N players and let J^i(t, x; \gamma) be the objective function of player i when the initial condition of the game is x at time t. Let \gamma = (\gamma_1, \dots, \gamma_N) be a Markovian multi-strategy, that is to say, each \gamma_i is Markovian (or feedback). We say that \gamma^* is a perfect Nash equilibrium if, for each i \in \mathbf{N} and any initial condition (t, x), it holds that

J^i(t, x; \gamma^*) = \inf_{\gamma_i} J^i(t, x; \gamma_1^*, \dots, \gamma_i, \dots, \gamma_N^*), (25)

where the infimum is taken over all the Markovian strategies of player i.
In other words, a perfect Nash equilibrium is a Markovian multi-strategy that is a Nash equilibrium whatever the initial condition of the game may be. In this case, some authors say that the Nash equilibrium is perfect in the subgames (subgame perfect). Observe that to solve (23) or (24) we essentially have to solve an OCP for each player. This suggests that, in principle, we can use techniques like the maximum principle or dynamic programming to find Nash equilibria.
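The best-response characterization (21)-(24) can be illustrated on a static Cournot-style duopoly (our own example; the linear demand and the cost parameter are assumptions). Iterating the optimal-response maps converges to the Nash equilibrium, where neither player can improve unilaterally:

```python
# Best-response iteration for a static Cournot duopoly (illustrative example;
# parameters assumed). Profits: pi_i(q1, q2) = q_i * (1 - q1 - q2) - c * q_i.
# Player i's best response to q_j: q_i = (1 - q_j - c) / 2.
c = 0.1

def best_response(q_other):
    return max(0.0, (1.0 - q_other - c) / 2.0)

q1 = q2 = 0.0
for _ in range(100):              # alternate the optimal-response maps
    q1 = best_response(q2)
    q2 = best_response(q1)

# Converges to the symmetric Nash equilibrium q* = (1 - c) / 3 = 0.3.
print(round(q1, 6), round(q2, 6))
```

At the fixed point each quantity is an optimal response to the other, which is exactly the Nash condition; the iteration converges here because the best-response map is a contraction.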
We will formulate the maximum principle for the open-loop case. We consider a differential game with N players, state space \mathbb{R}^n, and action sets U_i, for i \in \mathbf{N}. The dynamical model is

\dot{x}(t) = f(t, x(t), u_1(t), \dots, u_N(t)), \quad x(0) = x_0, (26)

for all t \in [0, T].
The admissible controls are open-loop, u_i(t) = \gamma_i(t, x_0), where \gamma_i is a measurable function from [0, T] to U_i. The players wish to maximize the objective functions

J^i(u_1, \dots, u_N) = \int_0^T F_i(t, x(t), u_1(t), \dots, u_N(t)) dt + S_i(x(T)).

Let \lambda be the matrix of adjoint variables whose i-th row is \lambda^i. In this case the Hamiltonian of player i is defined as follows:

H_i(t, x, u_1, \dots, u_N, \lambda^i) = F_i(t, x, u_1, \dots, u_N) + \lambda^i f(t, x, u_1, \dots, u_N).

Let us suppose that u^* = (u_1^*, \dots, u_N^*) is a Nash equilibrium and that x^* is the corresponding path. Then the following necessary conditions hold for each i \in \mathbf{N}: the adjoint equations are

\dot{\lambda}^i(t) = -\frac{\partial H_i}{\partial x}(t, x^*(t), u^*(t), \lambda^i(t)),

the terminal condition is

\lambda^i(T) = \frac{\partial S_i}{\partial x}(x^*(T)),

and the maximization of the Hamiltonian is

H_i(t, x^*(t), u^*(t), \lambda^i(t)) = \max_{u_i \in U_i} H_i(t, x^*(t), u_1^*(t), \dots, u_i, \dots, u_N^*(t), \lambda^i(t)).

We can note that this reduces the problem to a two-point boundary value problem that in some instances can be solved explicitly. For example, Clemhout and Wan (1974) consider trilinear games, called thus because the Hamiltonian is linear in the state, in the controls, and in the adjoint variable. Also, Dockner et al. (1985) identify several types of differential games that are solvable, in the sense that open-loop Nash equilibria can be determined explicitly.
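The two-point boundary value problem above can also be solved numerically. A minimal sketch for a symmetric two-player linear-quadratic game (our own example, written as a minimization for concreteness): each player minimizes \int_0^1 (x^2 + u_i^2) dt subject to \dot{x} = u_1 + u_2, x(0) = 1; the necessary conditions give u_i = -\lambda^i/2, and by symmetry \lambda^1 = \lambda^2 = \lambda:

```python
import numpy as np
from scipy.integrate import solve_bvp

# Open-loop Nash equilibrium of a symmetric two-player LQ game (our own example):
#   minimize J_i = int_0^1 (x^2 + u_i^2) dt,  x' = u1 + u2,  x(0) = 1.
# Pontryagin: u_i = -lam_i/2; by symmetry lam_1 = lam_2 = lam, giving the
# two-point boundary value problem  x' = -lam,  lam' = -2x,  x(0)=1, lam(1)=0.

def odes(t, y):                       # y[0] = x, y[1] = lam
    return np.vstack([-y[1], -2.0 * y[0]])

def bc(ya, yb):
    return np.array([ya[0] - 1.0,     # x(0) = 1
                     yb[1]])          # lam(1) = 0 (terminal condition)

t = np.linspace(0.0, 1.0, 11)
y0 = np.zeros((2, t.size)); y0[0] = 1.0
sol = solve_bvp(odes, bc, t, y0)

# Analytic check: x(t) = cosh(sqrt(2)(1 - t)) / cosh(sqrt(2)).
print(sol.y[0, -1])                   # x(1), about 1/cosh(sqrt(2)) ~= 0.459
```

For larger games without such symmetry, the same collocation approach applies to the full stacked system of state and adjoint equations.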
Example of a perfect Nash equilibrium
Let us consider a differential game with two players and finite horizon T. To save on notation, denote the control variables of the two players by u and v instead of u_1 and u_2. The state space is the real line, the initial state is a fixed number x_0, and the set of feasible controls is U for player 1 and V for player 2. The objective functionals are
and
The system dynamics are given by
Let us first try to find an open-loop Nash equilibrium of the above game, that is, a pair where and are the strategies for player 1 and player 2, respectively. If player 2 chooses to play , then player 1's problem can be written as
Maximize
Subject to (27)
Since is assumed by player 1 to be a fixed function, the maximization of the integral in (27) is equivalent to the maximization of , so that the problem above is equivalent to the problem
Maximize
Subject to (28)
with , and this problem has the optimal open-loop strategy , where is the unique solution of and
We also have that
(29)
where
. Finally, the state trajectory generated by this solution is
given by .
Note that the formula for now depends on player 2's control path, which is still to be determined.
Now consider player 2's control problem. If player 1 chooses ,
player 2's problem
can be written as
Maximize
Subject to (30)
Denoting by the costate variable of player 2, the Hamiltonian
function for this problem is
given by . Maximization with respect to
yields if and if . If then
, independently of . These properties imply that the maximized
Hamiltonian function is
given by
The adjoint equation and transversality condition for player 2's problem are
Using
this can be written as
(31)
The boundary value problem consisting of (28) and (31) has a unique solution, which is given by with from (29). The function is easily seen to be non-positive and strictly increasing on . It depends on the parameters whether for all , or whether can be smaller than for some . Because of the monotonicity, however, we know that in the latter case there exists a number such that for and for all . Careful analysis of (31) reveals that such a number exists if and only if and , in which case it is given by
In all other cases let us formally set . We summarize our results as follows. There exists a candidate for an open-loop Nash equilibrium, given by
where and are specified as above.
To verify that the above candidate is an open-loop Nash equilibrium, it suffices to prove that is an optimal control path in player 2's problem. This, however, follows from Theorem 2 by noting that for all , which shows that the maximized Hamiltonian function of player 2's problem is a concave function with respect to the state. This concludes the derivation of an open-loop Nash equilibrium for this example.
4. References
[1] Amir, R. (2003). Stochastic games in economics and related
fields: an
overview. In Neyman and Sorin (2003), Chapter 30.
[2] Arkin, V.I., Evstigneev, I.V. (1987). Stochastic Models of Control and Economic Dynamics. Academic Press, London.
[3] Balbus, L., Nowak, A.S. (2004). Construction of Nash equilibria in symmetric stochastic games of capital accumulation. Math. Meth. Oper. Res. 60, pp. 267-277.
[4] Basar, T., editor (1986). Dynamic Games and Applications in Economics. Lecture Notes in Economics and Mathematical Systems 265, Springer-Verlag, Berlin.
[5] Basar, T., Olsder, G.J. (1999). Dynamic Noncooperative Game Theory, Second Edition. SIAM, Philadelphia. (The first edition was published by Academic Press in 1982.)
[6] Bellman, R. (1957). Dynamic Programming. Princeton University Press, Princeton, N.J.
[7] Bernhard, P. (2005). Robust control approach to option
pricing, including
transaction costs. In Nowak and Szajowski (2005), pp.
391-416.
[8] Dockner, E., Feichtinger, G., Jorgensen, S. (1985).
Tractable classes of
nonzero sum open-loop Nash differential games: theory and
examples. J. Optim. Theory
Appl. 45, pp. 179-197.
[9] Dockner, E.J., Jorgensen, S., Long, N.V., Sorger, G. (2000).
Differential
Games in Economics and Management Science. Cambridge University
Press,
Cambridge, U.K.
[10] Feichtinger, G., Jorgensen, S. (1983). Differential game models in management. Euro. J. Oper. Res. 14, pp. 137-155.
[11] Fershtman, C., Muller, E. (1986). Turnpike properties of capital accumulation games. J. Econ. Theory 38, pp. 167-177.
[12] Filar, J.A., Petrosjan, L.A. (2000). Dynamic cooperative
games. Internatl.
Game Theory Rev. 2, pp. 47-65.
[13] Filar, J., Vrieze, K. (1997). Competitive Markov Decision
Processes.
Springer Verlag, New York.
[14] Fleming, W.H., Rishel, R.W. (1975). Deterministic and Stochastic Optimal Control. Springer-Verlag, New York.
[15] Fleming, W.H., Soner, H.M. (1992). Controlled Markov
Processes and
Viscosity Solutions. Springer-Verlag, New York.
[16] Fudenberg, D., Tirole, J. (1991). Game Theory. MIT Press, Cambridge, MA.
Gaidov, S.D. (1986). Pareto-optimality in stochastic differential games. Problems of Control and Information Theory 15, pp. 439-450.
[17] Gaidov, S.D. (1990). On the Nash-bargaining solution in
stochastic
differential games. Serdica 16, pp. 120-125.
[18] Isaacs, R. (1965). Differential Games. Wiley, New York.
[19] Jorgensen, S., Sorger, G. (1990). Feedback Nash equilibrium
in a problem of
optimal fishery management. J. Optim. Theory Appl. 64, pp.
293-310.
[20] Jorgensen, S., Yeung, D.W.K. (1996). Stochastic differential game model of a common property fishery. J. Optim. Theory Appl. 90, pp. 381-403.
[21] Kaitala, V., Hamalainen, R.P., Ruusunen, J. (1985). On the analysis of equilibrium and bargaining in a fishery game. In Feichtinger (1985), pp. 593-606.
[22] Kalai, E., Smorodinsky, M. (1975). Other solutions to
Nash's bargaining
problem. Econometrica 43, pp. 513-518.
[23] Kannan, D., Lakshmikantham, V., editors (2002). Handbook of Stochastic Analysis and Applications. Dekker, New York.
[24] Karatzas, I., Shreve, S.E. (1998). Methods of Mathematical Finance. Springer-Verlag, New York.
[25] Kirman, A.P., Sobel, M.J. (1974). Dynamic oligopoly with
inventories.
Econometrica 42, pp. 279-287.
[26] Kuhn, H.W., Szego, G.P., editors (1971). Differential Games and Related Topics. North-Holland, Amsterdam.
[27] Leitmann, G. (1974). Cooperative and Non-cooperative Many
Players
Differential Games. Springer-Verlag, New York.
[28] Le Van, C., Dana, R.-A. (2003). Dynamic Programming in Economics. Kluwer, Boston.
[29] Nash, J. (1950a). Equilibrium points in N-person games.
Proc. Natl. Acad.
Sci. 36, pp. 48-49.
[30] Nash, J. (1950b). The bargaining problem. Econometrica 18,
pp. 155-162.
[31] Nash, J. (1951). Non-cooperative games. Ann. Math. 54, pp.
286-295.
[32] Nash, J. (1953). Two-person cooperative games. Econometrica
21, pp. 128-
140.
[33] Neck, R. (1982). Dynamic systems with several decision makers. In Operations Research in Progress, ed. by G. Feichtinger and P. Kall, Reidel, New York, pp. 261-284.
[34] Neyman, A., Sorin, S., editors (2003). Stochastic Games and Applications. Kluwer, Dordrecht.
[35] Nowak, A.S., Szajowski, P. (2003). On Nash equilibrium in stochastic games of capital accumulation. In Stochastic Games and Applications, Volume 9, edited by L.A. Petrosjan and V.V. Mazalov, Nova Science, pp. 118-129.
[36] Nowak, A.S., Szajowski, K., editors (2005). Advances in Dynamic Games. (Annals of the International Society of Dynamic Games, vol. 7) Birkhauser, Boston.
[37] Petrosyan, L.A. (2003). Bargaining in dynamic games. In Petrosyan and Yeung (2003), pp. 139-143.
[38] Petrosyan, L.A. (2005). Cooperative differential games. In
Nowak and
Szajowski (2005), pp. 183-200.
[39] Petrosyan, L.A., Zenkevich, N.A. (1996). Game Theory. World Scientific, Singapore.
[40] Petrosyan, L.A., Yeung, D.W.K., editors (2003). ICM Millennium Lectures on Games. Springer-Verlag, Berlin.
[41] Pohjola, M. (1983). Nash and Stackelberg solutions in a differential game model of capitalism. J. Economic Dynamics and Control 6, pp. 173-186.
[42] Ricci, G., Editor (1991). Decision Processes in Economics.
Lecture Notes in
Economics and Mathematical Systems 353, Springer-Verlag,
Berlin.
[43] Roth, A.E. (1979). Axiomatic Models of Bargaining. Springer-Verlag, Berlin.
[44] Roth, A.E. (1985). Game-Theoretic Models of Bargaining. Cambridge University Press, Cambridge, U.K.
[45] Sethi, S.P., Thompson, G.L. (2000). Optimal Control Theory:
Applications to
Management Science and Economics, 2nd Edition. Kluwer,
Boston.
[46] Shapley, L. (1953). Stochastic games. Proc. Natl. Acad.
Sci. 39, pp. 1095-
1100.
[47] Stokey, N.L., Lucas, R.E. (1989). Recursive Methods in
Economic Dynamics.
Harvard University Press, Cambridge, MA.
[48] Tabak, D., Kuo, B.C. (1971). Optimal Control by
Mathematical Programming.
Prentice-Hall, Englewood Cliffs, N.J.
[49] Tolwinski, B., Haurie, A., Leitmann, G. (1986). Cooperative
equilibrium in
differential games. J. Math. Anal. Appl. 119, pp. 182-202.
[50] Toussaint, B., (1985). The transversality condition at
infinity applied to a
problem of optimal resource depletion. In Feichtinger (1985),
pp. 429-440.
[51] Vaisbord, E.M., Zhukovskii, V.I. (1988). Introduction to
Multi-Player
Differential Games and Their Applications. Gordon and Breach,
New York.
[52] Vega-Amaya, O. (2003). Zero-sum average semi-Markov games: fixed-point solutions of the Shapley equation. SIAM J. Control Optim. 42, pp. 1876
[53] Von Neumann, J., Morgenstern, O. (1944). The Theory of Games and Economic Behavior. Princeton University Press, Princeton, N.J.
[54] Yong, J., Zhou, X.Y. (1999). Stochastic Controls: Hamiltonian Systems and HJB Equations. Springer-Verlag, New York.
[55] Yu, P.L., Leitmann, G. (1974). Compromise solutions, domination structures, and Salukvadze's solution. J. Optim. Theory Appl. 3, pp. 362-378.
[56] Zhukovskiy, V.I., Salukvadze, M.E. (1994). The Vector-Valued Maximin. Academic Press, Boston.