Stochastic Omega-Regular Games Krishnendu Chatterjee Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-2007-122 http://www.eecs.berkeley.edu/Pubs/TechRpts/2007/EECS-2007-122.html October 8, 2007
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission.
Stochastic ω-Regular Games
by
Krishnendu Chatterjee
B. Tech. (IIT, Kharagpur) 2001
M.S. (University of California, Berkeley) 2004
A dissertation submitted in partial satisfaction of the
requirements for the degree of
Doctor of Philosophy
in
Computer Science
in the
GRADUATE DIVISION
of the
UNIVERSITY of CALIFORNIA at BERKELEY
Committee in charge:
Professor Thomas A. Henzinger, Chair
Professor Christos Papadimitriou
Professor John Steel
Fall, 2007
The dissertation of Krishnendu Chatterjee is approved:
Chair Date
Date
Date
University of California at Berkeley
Fall, 2007
Stochastic ω-Regular Games
Copyright Fall, 2007
by
Krishnendu Chatterjee
Abstract
Stochastic ω-Regular Games
by
Krishnendu Chatterjee
Doctor of Philosophy in Computer Science
University of California at Berkeley
Professor Thomas A. Henzinger, Chair
We study games played on graphs with ω-regular winning conditions specified as parity, Rabin,
Streett or Muller conditions. These games have applications in the verification, synthesis,
modeling, testing, and compatibility checking of reactive systems. Important distinctions
between graph games are as follows: (a) turn-based vs. concurrent games, depending
on whether at a state of the game only a single player makes a move, or players make
moves simultaneously; (b) deterministic vs. stochastic, depending on whether the transition
function is a deterministic or a probabilistic function over successor states; and (c) zero-sum
vs. non-zero-sum, depending on whether the objectives of the players are strictly conflicting
or not.
We establish that the decision problems for turn-based stochastic zero-sum games
with Rabin, Streett, and Muller objectives are NP-complete, coNP-complete, and PSPACE-
complete, respectively, substantially improving the previously known 3EXPTIME bound.
We also present strategy improvement style algorithms for turn-based stochastic Rabin and
Streett games. In the case of concurrent stochastic zero-sum games with parity objectives
we obtain a PSPACE bound, again improving the previously known 3EXPTIME bound. As
a consequence, concurrent stochastic zero-sum games with Rabin, Streett, and Muller ob-
jectives can be solved in EXPSPACE, improving the previously known 4EXPTIME bound.
We also present an elementary and combinatorial proof of the existence of memoryless ε-
optimal strategies in concurrent stochastic games with reachability objectives, for all real
ε > 0, where an ε-optimal strategy achieves the value of the game within ε against all strate-
gies of the opponent. We also use the proof techniques to present a strategy improvement
style algorithm for concurrent stochastic reachability games.
We then go beyond ω-regular objectives and study the complexity of an important
class of quantitative objectives, namely, limit-average objectives. In the case of limit-average
games, the states of the graph are labeled with rewards and the goal is to maximize the long-
run average of the rewards. We show that concurrent stochastic zero-sum games with
limit-average objectives can be solved in EXPTIME.
Finally, we introduce a new notion of equilibrium, called secure equilibrium, in non-
zero-sum games which captures the notion of conditional competitiveness. We prove the
existence of unique maximal secure equilibrium payoff profiles in turn-based deterministic
games, and present algorithms to compute such payoff profiles. We also show how the
notion of secure equilibrium extends the assume-guarantee style of reasoning in the game
theoretic framework.
Professor Thomas A. Henzinger
Dissertation Committee Chair
List of Tables

5.1 Strategy complexity of 2½-player games and its sub-classes with ω-regular objectives, where ΣPM denotes the family of pure memoryless strategies, ΣPF denotes the family of pure finite-memory strategies, and ΣM denotes the family of randomized memoryless strategies.
5.2 Computational complexity of 2½-player games and its sub-classes with ω-regular objectives.
8.1 Strategy complexity of concurrent games with ω-regular objectives, where ΣPM denotes the family of pure memoryless strategies, ΣM denotes the family of randomized memoryless strategies, and ΣHI denotes the family of randomized history-dependent, infinite-memory strategies.
8.2 Computational complexity of concurrent games with ω-regular objectives.
Acknowledgements
I am deeply grateful to my advisor, Tom Henzinger, for his wonderful support and guidance during my stay in Berkeley. Over the last five years he taught me all I know of research in the field of verification. He taught me how to think about research problems, helped me develop skills that are essential for a researcher, taught me how to write precisely and concisely, and even carefully and patiently corrected all my punctuation and grammatical mistakes. His enthusiasm, his patience, his ability to turn an ill-conceived idea into a precise problem, his knack for suggesting new ways to attack a problem when I got stuck, and his brilliance will always remain a source of inspiration. His influence is present in every page of this thesis and will be in everything I write in the future. I only wish a small fraction of his abilities has rubbed off on me.
I am thankful to Luca de Alfaro, Rupak Majumdar, and Marcin Jurdzinski for several collaborative works that appear in this dissertation. It was an absolute pleasure to work with Luca, and from him I received innumerable intuitions on the behavior of concurrent games, which form a large part of the thesis. Research with Rupak was a truly wonderful experience: his ability to suggest relevant and interesting new problems and his amazing sense of humor always made our research discussions fun. I started my research on graph games with Marcin, and I am grateful to him for getting me interested in the topic of graph games and explaining all the basics. Beyond helping me in research, he also influenced me a lot on how to clearly communicate an idea and patiently answer all questions (in my early days of research I surely had many stupid questions for him). I was also fortunate to collaborate with Orna Kupferman and Nir Piterman, and I thank them for sharing with me their knowledge of automata theory and teaching me a new way to look at games via automata. I am also thankful to Jean-Francois Raskin, Laurent Doyen, and Radha Jagadeesan for fruitful research collaborations; it was a pleasure to work with them. I am indebted to P.P. Chakrabarti and Pallab Dasgupta, who were my undergraduate
mentors at IIT Kharagpur, introduced me to the field of formal methods, and taught me all the basics of computer science. I feel simply lucky that such brilliant people brought me so caringly and smoothly to the field of verification and games.
I am grateful to the many people who read several of the results that appear in this thesis (as manuscripts or conference publications) and helped me with their comments to improve the results and presentation. Kousha Etessami and Mihalis Yannakakis pointed out a flaw in the statement of a result of Chapter 8 and then helped to the extent of correctly formulating the result (which now appears in Chapter 8). I am truly grateful to them. Hugo Gimbert, with his valuable comments, helped me make the results of Chapter 3 precise. Abraham Neyman helped immensely with his comments on Chapter 7; his comments were extremely helpful in improving and formalizing the results.
Christos Papadimitriou taught us an amazing course on “Algorithms, Internet and Game Theory” and reinforced my interest in games. I thank him for the course and for serving on my thesis committee. George Necula taught us a course on “Programming Languages” and illuminated several aspects of program verification; though I have not worked much in this field, he instilled in me a lot of interest, and I hope to pursue research in program verification in the future. I also thank him for serving on my qualifying exam committee. I am thankful to John Steel, who readily agreed to serve on my qualifying exam and thesis committee.
I thank all my friends who made my stay in Berkeley such a wonderful experience. I already had old friends Arindam and Arkadeb, and made many new friends. I had amazing discussions with labmates Arindam, Slobodan, Vinayak, Satrajit, and Arkadeb. I had an excellent roommate, Kaushik, who helped me in many ways and shared his vast knowledge of cricket, tennis, movies, and so many other topics. Another highlight was our great cricket sessions with Kaushik and Rahul. I had some great parties in the company of Kaushik, Rahul, Pankaj, Arkadeb, Mohan Dunga, Satrajit, and many more friends. I had some great times with some of my other close friends in Berkeley such as Ambuj, Anurag, Shanky, Vishnu, Ankit, Parag, Sanjeev, …. I was fortunate to meet Nilim and Antar-da in Berkeley, who in my early years took care of me as their younger brother.
Two of my school teachers, Manjusree Mukherjee and Sreya Kana Mukherjee, and two of my great friends, Binayak Roy and Abhijit Guria, will always remain a source of inspiration, and I can always rely on them when I am in trouble. Binayak, in my college days, taught me how to think about mathematics, and without him nothing would have been possible. I am thankful and grateful to them in too many ways to list.
Finally, my family has been an endless source of love, affection, support, and motivation for me. My grandparents, parents, Jetha, Jethima, Bad-di, Anju-di, Ranju-di, Sikha-da, Pradip-da, Abhi, Hriti: all my family members in Calcutta and relatives in Purulia and other parts of Bengal gave me love beyond imagination, and support and encouragement at all stages of my PhD life. My inner strength is my mother, and without her this thesis would not have been possible. So I dedicate this thesis to her.
Chapter 1
Introduction
One-shot and stochastic games. The study of games provides theoretical foundations in several fields of mathematics and computer science. The simplest class of games consists of the “one-shot” games: games with a single interaction between the agents, after which the game ends and the payoffs are decided (e.g., matrix games). However, a wide class of games progress over time and in a stateful manner, where the current game depends on the history of interactions. The class of concurrent stochastic games [Sha53, Eve57], which are played in rounds over a finite state space, is a natural model for such games.
Infinite games. In this thesis we will consider nonterminating games of perfect information
played on finite graphs. A nonterminating game proceeds for an infinite number of rounds.
The state of a game is a vertex of a graph. In each round, the state changes along an edge of
the graph to a successor vertex. Thus the outcome of the game, played for an infinite number of rounds, is an infinite path through the graph. We consider boolean objectives
for the two players: for each player, the resulting infinite path is either winning or losing.
The winning sets of paths are assumed to be ω-regular [Tho97]. Depending on how the
winning sets are specified, we distinguish between parity, Rabin, Streett, and Muller games,
as well as some subclasses thereof. The classes of parity, Rabin, Streett, and Muller objectives are canonical forms to express ω-regular objectives [Tho97]. Depending on the structure of
the graph, we distinguish between turn-based and concurrent games. In turn-based games,
the graph is partitioned into player-1 states and player-2 states: in player-1 states, player 1
chooses the successor vertex; and in player-2 states, player 2 chooses the successor vertex.
In concurrent games, in every round both players choose simultaneously and independently
from a set of available moves, and the combination of both choices determines the successor
vertex. Finally, we distinguish between deterministic and stochastic games: in stochastic
games, in every round the players’ moves determine a probability distribution on the possible
successor vertices, instead of determining a unique successor.
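As an illustration of these definitions, the sketch below (not from the thesis; the vertex names, owners, and priorities are made up) encodes a tiny turn-based deterministic game graph and decides the winner of an ultimately periodic play under a parity objective, assuming the "min-even" convention: player 1 wins a play iff the minimum priority visited infinitely often is even.

```python
# Illustrative encoding of a turn-based deterministic parity game.
# owner[v] tells which player moves at vertex v; priority[v] is the
# parity priority of v; edges[v] lists the successors of v.
owner    = {"a": 1, "b": 2, "c": 1}
edges    = {"a": ["b"], "b": ["a", "c"], "c": ["c"]}
priority = {"a": 1, "b": 2, "c": 0}

def parity_winner(prefix, cycle):
    """Decide the winner of the infinite play prefix . cycle^omega.

    Only the priorities on the cycle occur infinitely often, so the
    winner is determined by the minimum priority on the cycle
    (even: player 1 wins; odd: player 2 wins).
    """
    assert all(v in priority for v in prefix + cycle)
    m = min(priority[v] for v in cycle)
    return 1 if m % 2 == 0 else 2

# The play a b a b ... sees priorities 1 and 2 forever; the minimum
# infinitely-occurring priority is 1 (odd), so player 2 wins.
print(parity_winner([], ["a", "b"]))     # 2
# The play a b c c c ... eventually stays in c with priority 0 (even).
print(parity_winner(["a", "b"], ["c"]))  # 1
```

Finite game graphs always admit such ultimately periodic plays, which is why reasoning about cycles suffices for many of the algorithms discussed later.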
These games play a central role in several areas of computer science. One impor-
tant application arises when the vertices and edges of a graph represent the states and tran-
sitions of a reactive system, and the two players represent controllable versus uncontrollable
decisions during the execution of the system. The synthesis problem (or control problem)
for reactive systems asks for the construction of a winning strategy in the corresponding
graph game. This problem was first posed independently by Alonzo Church [Chu62] and
Richard Büchi [Buc62] in settings that can be reduced to turn-based deterministic games
with ω-regular objectives. The problem was solved independently by Michael Rabin using
logics on trees [Rab69], and by Büchi and Lawrence Landweber using a more game-theoretic
approach [BL69]; it was later resolved using improved methods [GH82, McN93] and in dif-
ferent application contexts [RW87, PR89]. Game-theoretic formulations have proved useful
not only for synthesis, but also for the modeling [Dil89, ALW89], refinement [HKR02], ver-
ification [dAHM00b, AHK02], testing [BGNV05], and compatibility checking [dAH01] of
reactive systems. The use of ω-regular objectives is natural in these application contexts.
This is because the winning conditions of the games arise from requirements specifications
for reactive systems, and the ω-regular sets of infinite paths provide an important and
robust paradigm for such specifications [MP92]. However, both the restriction to determin-
istic games and the restriction to turn-based games are limiting in some respects: prob-
abilistic transitions are useful to model uncertain behavior that is not strictly adversarial
[Var85, CY95], and concurrent choice is useful to model certain forms of synchronous inter-
action between reactive systems [dAHM00a, dAHM01]. The resulting concurrent stochastic
games have long been familiar to game theorists and mathematicians, sometimes under the
name of competitive Markov decision processes [FV97].
Qualitative and quantitative analysis. The central computational problem about a
game is the question of whether a player has a strategy for winning the game. However, in
stochastic graph games there are several degrees of “winning”: we may ask if a player has
a strategy that ensures a winning outcome of the game, no matter how the other player
resolves her choices (this is called sure winning); or we may ask if a player has a strategy
that achieves a winning outcome of the game with probability 1 (almost-sure winning);
or we may ask if the maximal probability with which a player can win is 1 in the limit,
defined as the supremum over all possible strategies of the infimum over all adversarial
strategies (limit-sure winning). While all three notions of winning coincide for turn-based
deterministic games [Mar75], and almost-sure winning coincides with limit-sure winning for
turn-based stochastic games [CJH03] (see Corollary 5 of chapter 4), all three notions are
different for concurrent games, even in the deterministic case [dAHK98]. This is because
for concurrent games, strategies that use randomization are more powerful than pure (i.e.,
nonrandomized) strategies. The computation of sure winning, almost-sure winning, and
limit-sure winning states is called the qualitative analysis of graph games. This is in contrast
to the quantitative analysis, which asks for computing for each state the maximal probability
with which a player can win in the limit, even if that limit is less than 1. For a fixed player,
the limit probability is called the sup-inf value, or the optimal value, or simply the value of
the game at a state. A strategy that achieves the optimal value is an optimal strategy, and
a strategy that ensures one of the three ways of winning is a sure (almost-sure; limit-sure)
winning strategy. Concurrent graph games are more difficult than turn-based graph games
for several reasons. In concurrent games, optimal strategies may not exist, but for every
real ε > 0, there may be a strategy that guarantees a winning outcome with a probability
that lies within ε of the optimal value [Eve57]. Moreover, ε-optimal or limit-sure winning
strategies may require infinite memory about the history of a game in order to prescribe
the next move of a player [dAH00]. By contrast, in the simplest scenarios —for example,
in the case of turn-based stochastic games with parity objectives— optimal and winning
strategies require neither randomization nor memory (see chapter 5); such pure memoryless
strategies can be implemented by control maps from states to moves.
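For the simplest of these settings, turn-based deterministic games with reachability objectives, the sure-winning states and a pure memoryless winning strategy can be computed by the classical attractor (backward-induction) construction. The Python sketch below is illustrative only, not taken from the thesis; the vertex names are made up and the game graph is assumed finite with every vertex having at least one successor.

```python
def attractor(owner, edges, target):
    """Compute the player-1 attractor of `target`: the set of vertices from
    which player 1 can force the play into `target`, together with a pure
    memoryless strategy for player 1 on the attractor.

    owner[v] in {1, 2}; edges[v] is the list of successors of v.
    """
    # count[v]: for a player-2 vertex v, how many successors are still
    # outside the attractor (when it hits 0, player 2 cannot escape).
    count = {v: len(edges[v]) for v in edges}
    attr = set(target)
    strategy = {}                 # player-1 choices inside the attractor
    frontier = list(target)
    # Build the reversed edge relation once.
    preds = {v: [] for v in edges}
    for u in edges:
        for v in edges[u]:
            preds[v].append(u)
    while frontier:
        v = frontier.pop()
        for u in preds[v]:
            if u in attr:
                continue
            if owner[u] == 1:
                attr.add(u)       # player 1 can move from u into the attractor
                strategy[u] = v
                frontier.append(u)
            else:
                count[u] -= 1
                if count[u] == 0: # every player-2 move from u stays inside
                    attr.add(u)
                    frontier.append(u)
    return attr, strategy

owner = {"s": 1, "t": 2, "g": 1}
edges = {"s": ["t"], "t": ["g"], "g": ["g"]}
win, strat = attractor(owner, edges, {"g"})
print(sorted(win))   # ['g', 's', 't']
print(strat)         # {'s': 't'}
```

The returned strategy is exactly a control map from states to moves in the sense above: from s, player 1 simply moves to t, and player 2 has no choice but to enter g.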
A game that has a winning strategy for one of the two players at every vertex
is called determined. There are two kinds of determinacy results for graph games. First,
the turn-based deterministic games have a qualitative determinacy, namely, determinacy
for sure winning: in every state of the game graph, one of the two players has a sure
winning strategy [Mar75]. Second, the turn-based stochastic games and the concurrent
games have a quantitative determinacy, that is, determinacy for optimal values: in every
state, the optimal values for both players add up to 1 [Mar98]. Both the sure-winning
determinacy result and the optimal-value determinacy results hold for all Borel objectives;
the sure-winning determinacy for turn-based deterministic games with Borel objectives
was established by Donald Martin [Mar75], and the optimal-value determinacy for Borel objectives was also established by Martin [Mar98] for a very general class of
games called Blackwell games, which include all games we consider in this thesis. For
concurrent games, however, there is no determinacy for sure winning: even if a concurrent
game is deterministic (i.e., nonstochastic) and the objectives are simple (e.g., single-step
reachability), neither player may have a strategy for sure winning [dAHK98]. Determinacy is
useful for solving games: when computing the sure winning states of a game, or the optimal
values, we can switch between the dual views of the two players whenever convenient.
Quantitative objectives. So far we have discussed qualitative objectives, i.e., an outcome of the game is assigned a payoff of either 0 or 1. The more general case of quantitative objectives consists of measurable functions that assign real-valued rewards to outcomes of a game. Several quantitative objectives have been studied by game theorists, also in the context of economics. The most notable quantitative objectives are discounted reward and limit-average (or mean-payoff) objectives. In such games the states of the game graph are labeled with real-valued rewards: for discounted reward objectives the payoff is the discounted sum of the rewards, and for limit-average objectives the payoff is the long-run average of the rewards. Games with discounted reward objectives were introduced by Shapley [Sha53] and have been studied in economics and also in systems theory [dAHM03]. Limit-average objectives have also been studied extensively in game theory [MN81].
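To make the two payoffs concrete: for an ultimately periodic play (a finite prefix of rewards followed by a repeated cycle), both payoffs have closed forms that follow directly from their definitions. The sketch below is illustrative, with made-up reward sequences; it assumes a discount factor 0 < λ < 1.

```python
def discounted_payoff(prefix, cycle, lam):
    """Discounted sum  sum_{t>=0} lam^t * r_t  of the infinite reward
    sequence prefix . cycle^omega, computed in closed form (0 < lam < 1)."""
    head = sum(r * lam**t for t, r in enumerate(prefix))
    one_cycle = sum(r * lam**t for t, r in enumerate(cycle))
    # Each repetition of the cycle is discounted by lam^len(cycle),
    # giving a geometric series.
    tail = lam**len(prefix) * one_cycle / (1 - lam**len(cycle))
    return head + tail

def limit_average_payoff(prefix, cycle):
    """Long-run average of prefix . cycle^omega: the finite prefix is
    amortized away, so the limit is the mean reward on the cycle."""
    return sum(cycle) / len(cycle)

# Rewards 1, 3, 1, 3, ... : limit-average 2; discounted sum with lam = 0.5
# is (1 + 3*0.5) / (1 - 0.25) = 10/3.
print(limit_average_payoff([], [1, 3]))    # 2.0
print(discounted_payoff([], [1, 3], 0.5))  # 3.3333...
```

The example also shows the qualitative difference between the two objectives: the discounted payoff weighs early rewards heavily, while the limit-average payoff depends only on the long-run behavior.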
Nonzero-sum games. In nonzero-sum games, both players may be winning. In this case,
the notion of rational behavior of the players is captured by Nash equilibria: a pair of
strategies for the two players is a Nash equilibrium if neither player can increase her payoff
by unilaterally switching her strategy [Jr50]. In stochastic games Nash equilibria exist in some special cases, and in the general setting the existence of ε-Nash equilibria, for ε > 0, is investigated. A pair of strategies for the two players is an ε-Nash equilibrium, for ε > 0, if neither player can increase her payoff by more than ε by unilaterally switching her strategy. We now present
the fundamental results on stochastic games, and then state the main contribution of the
thesis.
Previous results on turn-based deterministic games. Sure determinacy for turn-
based deterministic games with Borel objectives was established by a deep result of Mar-
tin [Mar75]: Martin showed that for complementary objectives of the two players, the sure winning sets of the players form a partition of the state space. For the special
case of Muller objectives, the result of Gurevich-Harrington [GH82] showed that finite-
memory sure-winning strategies exist for each player from their respective sure-winning set.
In the case of Rabin objectives, the existence of pure memoryless sure-winning strategies has been
established in [EJ88], and the results of [EJ88] also proved that turn-based deterministic
games with Rabin and Streett objectives are NP-complete and coNP-complete, respectively.
Zielonka [Zie98] used a tree representation of Muller objectives (referred to as the Zielonka tree)
and presented an elegant analysis of turn-based deterministic games with Muller objectives.
Using an insightful analysis of Zielonka's construction, [DJW97] presented an opti-
mal memory bound for pure sure-winning strategies for turn-based deterministic Muller
games. The complexity of turn-based deterministic games with Muller objectives was stud-
ied in [HD05] and the problem was shown to be PSPACE-complete. The algorithmic study
of turn-based deterministic games has received much attention in the literature. A few notable examples are as follows: (a) the small progress measure algorithm [Jur00], the strategy improvement algorithm [VJ00], and the subexponential-time algorithm [JPZ06] for parity games; (b) algo-
rithms for Streett and Rabin games [Hor05, KV98, PP06], and (c) algorithms for Muller
games [Zie98, HD05].
Previous results on concurrent games. The optimal value determinacy for one-shot
games is the famous minmax theorem of von Neumann, and such games can be solved in
polynomial time using linear programming. For concurrent games sure-determinacy does
not hold, and the optimal value determinacy for concurrent games with Borel objectives was
established by Martin [Mar98]. Concurrent games with qualitative reachability and more
general parity objectives have been studied in [dAHK98, dAH00]. The sure, almost-sure, and limit-sure winning states can be computed in polynomial time for reachability objectives [dAHK98], and for parity objectives the problems are in NP ∩ coNP [dAH00].
The values of concurrent games with parity objectives were characterized by quantitative
µ-calculus formulas in [dAM01], and from the characterization a 3EXPTIME algorithm
was obtained to solve concurrent parity games. The reduction of Rabin, Streett and Muller
objectives to parity objectives (an exponential reduction) [Tho97] and the algorithm for
parity objectives yield a 4EXPTIME algorithm to solve concurrent Rabin, Streett and
Muller games. For the special case of turn-based stochastic games the algorithm of [dAM01]
can be shown to work in 2EXPTIME for parity objectives, and thus one could obtain a
3EXPTIME algorithm for turn-based stochastic Rabin, Streett and Muller games.
Previous results on quantitative objectives. The determinacy of concurrent stochas-
tic games with discounted reward objectives was proved in [Sha53], and the determinacy
for limit-average objectives was proved in [MN81]. The existence of pure memoryless opti-
mal strategies for turn-based deterministic games with limit-average objectives was shown
in [EM79]; and for turn-based stochastic games in [LL69]. The existence of pure memo-
ryless strategies in turn-based stochastic games with discounted reward objectives can be
proved from the results of [Sha53]; see [FV97] for analysis of various classes of games with
discounted reward and limit-average objectives. The complexity of turn-based determin-
istic limit-average games has been studied in [ZP96]; also see [FV97] for algorithms for
turn-based stochastic games with discounted reward and limit-average objectives.
Previous results on nonzero-sum games. The existence of Nash equilibrium in one-
shot concurrent games is the celebrated result of Nash [Jr50]. The computation of Nash
equilibria in one-shot games is PPAD-complete [DGP06, CD06], also see [EY07] for related
complexity results. Nash's theorem holds when the strategy space is convex and compact. However, for infinite games the strategy space is not compact, and hence Nash's result does not immediately extend to infinite games. In fact, for concurrent zero-sum reachability games Nash equilibria (in zero-sum games Nash equilibria correspond to pairs of optimal strategies) need not exist. In such cases one investigates ε-Nash equilibria, and ε-Nash equilibria for all ε > 0 are the best one can achieve. Exact Nash equilibria do exist
in discounted stochastic games [Fin64]. For concurrent nonzero-sum games with payoffs
defined by Borel sets, surprisingly little is known. Secchi and Sudderth [SS01] showed that
exact Nash equilibria do exist when all players have payoffs defined by closed sets (“safety
objectives”). For the special case of two-player games, existence of ε-Nash equilibrium,
for every ε > 0, is known for limit-average objectives [Vie00a, Vie00b], and for parity
objectives [Cha05]. The existence of ε-Nash equilibrium in n-player concurrent games with
objectives in higher levels of Borel hierarchy is an intriguing open problem.
Organization and new results of the thesis. We now present the organization of the
thesis and the main results of each chapter.
1. (Chapter 2). The basic definitions of various classes of games, objectives, strategies,
and the formal notion of determinacy is presented in Chapter 2.
2. (Chapter 3). In Chapter 3 we consider concurrent games with tail objectives (a
generalization of Muller objectives) and prove several basic properties; e.g., we show
that if there is a state with positive value, then there is some state with value 1.
The properties we prove are useful in the analysis of later chapters.
3. (Chapter 4). In Chapter 4 we study turn-based stochastic games with Muller ob-
jectives. The main results of the chapter are as follows:
• we prove an optimal memory bound for pure optimal strategies in turn-based
stochastic Muller games;
• we show that the qualitative and quantitative analysis problems for turn-based
stochastic Muller games are both PSPACE-complete (improving the previously known 3EXPTIME
bound); and
• we present an improved memory bound for randomized optimal strategies as
compared to pure optimal strategies.
4. (Chapter 5). In Chapter 5 we study turn-based stochastic games with Rabin and
Streett objectives. The main results of the chapter are as follows:
• we show that the qualitative and quantitative analysis problems for turn-based
stochastic games with Rabin and Streett objectives are NP-complete and coNP-complete,
respectively (improving the previously known 3EXPTIME bound); and
• we present a strategy improvement algorithm for turn-based stochastic Rabin
and Streett games.
5. (Chapter 6). In Chapter 6 we study concurrent games with reachability objectives.
We present an elementary and combinatorial proof of the existence of memoryless ε-optimal strategies in concurrent games with reachability objectives, for all ε > 0. In contrast, the previous proofs of the result relied on deep results from analysis (such as the analysis of Puiseux series) [FV97]. The proof techniques we develop also lead to a
strategy improvement algorithm for concurrent reachability games.
6. (Chapter 7). In Chapter 7 we study the complexity of concurrent games with limit-
average objectives and show that these games can be solved in EXPTIME. It also
follows from our results that concurrent games with discounted reward objectives can
be solved in PSPACE. To the best of our knowledge this is the first complexity result
on the solution of concurrent limit-average games. Also the techniques used in the
chapter are useful in the analysis for Chapter 8.
7. (Chapter 8). In Chapter 8 we study the complexity of concurrent games with
parity objectives and show that the quantitative analysis of concurrent parity games
can be achieved in PSPACE (improving the previous 3EXPTIME bound); and as a
consequence obtain an EXPSPACE algorithm for Rabin, Streett and Muller objectives
(as compared to the previously known 4EXPTIME bound).
8. (Chapter 9). In Chapter 9 we study games that are not strictly competitive. We
present a new notion of equilibrium, called secure equilibria, that captures conditional competitiveness and the notion of adversarial external choice. We show that the maximal secure equilibrium payoff profile is unique for turn-based deterministic games, and present algorithms to compute such payoff profiles for ω-regular objectives. We then illustrate its application in the synthesis of
independent processes: we show that the notion of secure equilibria generalizes the
assume-guarantee style of reasoning in the game theoretic framework.
The relevant open problems for each chapter are listed along with the concluding remarks of the respective chapter.
Related topics. In this thesis we consider games played on graphs with finite state
spaces, where each player has perfect information about the state of the game. We briefly
discuss several extensions of such games which have been studied in the literature.
Beyond games for reactive systems. We have only discussed games played on graphs that are mainly used in the analysis of reactive systems. However, graph games are widely used in several other areas of computer science, such as Ehrenfeucht–Fraïssé games in finite-model theory, and network congestion games and auctions for the analysis of the internet [Pap01]. We keep our discussion limited to games related to the verification of reactive
systems, and now describe several extensions in this context.
Partial-information games. In the class of partial-information games, players have only partial information about the state of the game. Such games are much harder to solve than perfect-information games; for example, 2-player partial-information turn-based games are 2EXPTIME-complete for reachability objectives [Rei79], and several problems related to partial-information turn-based games with more than 2 players become undecidable [Rei79]. The results in [CH05] present a close connection between a sub-class
of partial-information turn-based games and perfect-information concurrent games. The
algorithmic analysis of partial-information turn-based games with ω-regular objectives has
been studied in [CDHR06]. The complexity of partial-information Markov decision processes
CHAPTER 1. INTRODUCTION 11
has been studied in [PT87].
Infinite-state games. There are several extensions of games played on finite state spaces
to games played on infinite state spaces. The most notable are pushdown games and
timed games. In pushdown games the states encode an unbounded
amount of information in a pushdown store (or stack); such games have been studied
in [Wal96]; see also [Wal04] for a survey. Pushdown games with stochastic transitions have
been studied in [EY05, EY06]. Timed games are played on finite state graphs,
but in continuous time with discrete transitions. Modeling time by clocks makes the
games infinite-state, and such games are studied in [MPS95, dAFH+03].
Logic and games. The connection between logical quantifiers and games is deep and well-
established. Game theory also provides a useful framework to study properties of sets. The
results of Martin [Mar75, Mar98] establishing Borel determinacy for 2-player and concurrent
games illuminate several key properties of sets. The close connection between logic on
trees and 2-player games is well exposed in [Tho97]. The µ-calculus is a logic of fixed
points and is expressive enough to capture all ω-regular objectives [Koz83]. Emerson and
Jutla [EJ91] established the equivalence of µ-calculus model checking and solving 2-player
parity games. Quantitative µ-calculus has been proposed in [dAM01] to solve concurrent
games with parity objectives, and in [MM02] to solve 2½-player games with parity objectives.
The model checking algorithm for the alternating temporal logic ATL requires game solving
procedures as sub-routines [AHK02].
Relationships between games. The relationship between games is an intriguing area of
research. The notions of abstraction of games [HMMR00, HJM03, CHJM05], refinement
relations between games [AHKV98], and distances between games [dAHM03] have been
explored in the literature.
Chapter 2
Definitions
In this chapter we present the definitions of several classes of game graphs,
strategies, objectives, and the notions of values and equilibria. We start with the definition of
game graphs.
2.1 Game Graphs
We first define turn-based game graphs, and then the more general class of concurrent
game graphs. We start with some preliminary notation. For a finite set A, a
probability distribution on A is a function δ: A → [0, 1] such that ∑_{a∈A} δ(a) = 1. We write
Supp(δ) = {a ∈ A | δ(a) > 0} for the support set of δ. We denote the set of probability
distributions on A by Dist(A).
2.1.1 Turn-based probabilistic game graphs
We consider several classes of turn-based games, namely, two-player turn-based
probabilistic games (212 -player games), two-player turn-based deterministic games (2-player
games), and Markov decision processes (112 -player games).
Turn-based probabilistic game graphs. A turn-based probabilistic game graph (or
2½-player game graph) G = ((S,E), (S1, S2, SP), δ) consists of a directed graph (S,E), a
partition of the vertex set S into three subsets S1, S2, SP ⊆ S, and a probabilistic transition
function δ: SP → Dist(S). The vertices in S are called states. The state space S is finite.
The states in S1 are player-1 states; the states in S2 are player-2 states; and the states in
SP are probabilistic states. For all states s ∈ S, we define E(s) = {t ∈ S | (s, t) ∈ E} to
be the set of possible successor states. We require that E(s) ≠ ∅ for every nonprobabilistic
state s ∈ S1 ∪ S2, and that E(s) = Supp(δ(s)) for every probabilistic state s ∈ SP. At
player-1 states s ∈ S1, player 1 chooses a successor state from E(s); at player-2 states
s ∈ S2, player 2 chooses a successor state from E(s); and at probabilistic states s ∈ SP, a
successor state is chosen according to the probability distribution δ(s).
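As a concrete illustration, the structural constraints above (the partition of S, nonempty E(s) at nonprobabilistic states, and E(s) = Supp(δ(s)) at probabilistic states) can be checked mechanically. The following Python sketch uses an invented representation and invented state names, purely for illustration:

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple

@dataclass
class TurnBasedGameGraph:
    """Sketch of a 2 1/2-player game graph ((S, E), (S1, S2, SP), delta)."""
    edges: Set[Tuple[str, str]]          # E
    s1: Set[str]                         # player-1 states
    s2: Set[str]                         # player-2 states
    sp: Set[str]                         # probabilistic states
    delta: Dict[str, Dict[str, float]]   # delta: SP -> Dist(S), stored sparsely

    def successors(self, s):
        """E(s) = {t | (s, t) in E}."""
        return {t for (u, t) in self.edges if u == s}

    def validate(self):
        # S1, S2, SP must partition the state space.
        assert not (self.s1 & self.s2) and not (self.s1 & self.sp) \
            and not (self.s2 & self.sp)
        for s in self.s1 | self.s2:
            assert self.successors(s), "E(s) nonempty at nonprobabilistic states"
        for s in self.sp:
            dist = self.delta[s]
            assert abs(sum(dist.values()) - 1.0) < 1e-9      # delta(s) in Dist(S)
            # E(s) = Supp(delta(s)) at probabilistic states.
            assert self.successors(s) == {t for t, p in dist.items() if p > 0}

g = TurnBasedGameGraph(
    edges={("a", "b"), ("b", "p"), ("p", "a"), ("p", "b")},
    s1={"a"}, s2={"b"}, sp={"p"},
    delta={"p": {"a": 0.5, "b": 0.5}},
)
g.validate()
```

Any violation of the definition (for instance, a probabilistic state whose distribution does not match its outgoing edges) would trip an assertion.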
The turn-based deterministic game graphs (or 2-player game graphs) are the special
case of the 2½-player game graphs with SP = ∅. The Markov decision processes (MDPs
for short; or 1½-player game graphs) are the special case of the 2½-player game graphs with
either S1 = ∅ or S2 = ∅. We refer to the MDPs with S2 = ∅ as player-1 MDPs, and to the
MDPs with S1 = ∅ as player-2 MDPs. A game graph that is both deterministic and an
MDP is called a transition system (or 1-player game graph): a player-1 transition system
has only player-1 states; a player-2 transition system has only player-2 states.
2.1.2 Concurrent game graphs
Concurrent game graphs. A concurrent game graph (or a concurrent game structure)
G = (S, A, Γ1, Γ2, δ) consists of the following components:
• A finite state space S.
• A finite set A of moves or actions.
• Two move assignments Γ1, Γ2: S → 2^A \ {∅}. For i ∈ {1, 2}, the player-i move assignment
Γi associates with every state s ∈ S a nonempty set Γi(s) ⊆ A of moves available to
player i at state s.
• A probabilistic transition function δ: S × A × A → Dist(S). At every state s ∈ S,
player 1 chooses a move a1 ∈ Γ1(s), and simultaneously and independently player 2
chooses a move a2 ∈ Γ2(s). A successor state is then chosen according to the proba-
bility distribution δ(s, a1, a2).
For all states s ∈ S and all moves a1 ∈ Γ1(s) and a2 ∈ Γ2(s), we define Succ(s, a1, a2) =
Supp(δ(s, a1, a2)) to be the set of possible successor states of s when the moves a1 and a2
are chosen. For a concurrent game graph, we define the set of edges as E = {(s, t) ∈ S × S |
(∃a1 ∈ Γ1(s))(∃a2 ∈ Γ2(s))(t ∈ Succ(s, a1, a2))}, and as with turn-based game graphs, we
write E(s) = {t | (s, t) ∈ E} for the set of possible successors of a state s ∈ S.
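The derived notions Succ(s, a1, a2) and E can be computed directly from δ and the move assignments. A small sketch on an invented concurrent game (states, moves, and probabilities are all illustrative):

```python
# Toy concurrent game: delta: S x A x A -> Dist(S), stored sparsely.
delta = {
    ("s", "a", "x"): {"s": 1.0},
    ("s", "a", "y"): {"t": 0.5, "s": 0.5},
    ("s", "b", "x"): {"t": 1.0},
    ("s", "b", "y"): {"t": 1.0},
    ("t", "a", "x"): {"t": 1.0},
}
gamma1 = {"s": {"a", "b"}, "t": {"a"}}   # move assignment Gamma1
gamma2 = {"s": {"x", "y"}, "t": {"x"}}   # move assignment Gamma2

def succ(s, a1, a2):
    """Succ(s, a1, a2) = Supp(delta(s, a1, a2))."""
    return {t for t, p in delta[(s, a1, a2)].items() if p > 0}

def edges():
    """E = {(s, t) | exists a1 in Gamma1(s), a2 in Gamma2(s),
    t in Succ(s, a1, a2)}."""
    return {(s, t)
            for s in gamma1
            for a1 in gamma1[s]
            for a2 in gamma2[s]
            for t in succ(s, a1, a2)}
```

On this example, edges() yields {("s", "s"), ("s", "t"), ("t", "t")}, so E(s) = {s, t}.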
We distinguish the following special classes of concurrent game graphs. The con-
current game graph G is deterministic if |Succ(s, a1, a2)| = 1 for all states s ∈ S and all
moves a1 ∈ Γ1(s) and a2 ∈ Γ2(s). A state s ∈ S is a turn-based state if there exists a player
i ∈ {1, 2} such that |Γi(s)| = 1; that is, player i has no choice of moves at s. If |Γ2(s)| = 1,
then s is a player-1 turn-based state; and if |Γ1(s)| = 1, then s is a player-2 turn-based
state. The concurrent game graph G is turn-based if every state in S is a turn-based state.
Note that the turn-based concurrent game graphs are equivalent to the turn-based proba-
bilistic game graphs: to obtain a 2½-player game graph from a turn-based concurrent game
graph G, for every player-i turn-based state s of G, where i ∈ {1, 2}, introduce |Γi(s)| many
probabilistic successor states of s. Moreover, the concurrent game graphs that are both
turn-based and deterministic are equivalent to the 2-player game graphs.
To measure the complexity of algorithms and problems, we need to define the size
of game graphs. We do this for the case that all transition probabilities can be specified as ra-
tional numbers. Then the size of a concurrent game graph G is equal to the size of the prob-
abilistic transition function δ, that is, |G| = ∑_{s∈S} ∑_{a1∈Γ1(s)} ∑_{a2∈Γ2(s)} ∑_{t∈S} |δ(s, a1, a2)(t)|,
where |δ(s, a1, a2)(t)| denotes the space required to specify a rational probability value.
2.2 Strategies
When choosing their moves, the players follow recipes that are called strategies.
We define strategies both for 2½-player game graphs and for concurrent game graphs. On a
concurrent game graph, the players choose moves from a set A of moves, while on a 2½-player
game graph, they choose successor states from a set S of states. Hence, for 2½-player game
graphs, we define the set of moves as A = S. For 2½-player game graphs, a player-1 strategy
prescribes the moves that player 1 chooses at the player-1 states S1, and a player-2 strategy
prescribes the moves that player 2 chooses at the player-2 states S2. For concurrent game
graphs, both players choose moves at every state, and hence for concurrent game graphs,
we define the sets of player-1 states and player-2 states as S1 = S2 = S.
Consider a game graph G. A player-1 strategy on G is a function σ: S* · S1 →
Dist(A) that assigns to every nonempty finite sequence ~s ∈ S* · S1 of states ending in a
player-1 state, a probability distribution σ(~s) over the moves A. By following the strategy σ,
whenever the history of a game played on G is ~s, player 1 chooses the next move
according to the probability distribution σ(~s). A strategy must prescribe only available
moves. Hence, for all state sequences ~s1 ∈ S* and all states s ∈ S1, if σ(~s1 · s)(a) > 0, then
the following condition must hold: a ∈ E(s) for 2½-player game graphs G, and a ∈ Γ1(s)
for concurrent game graphs G. Symmetrically, a player-2 strategy on G is a function π:
S* · S2 → Dist(A) such that if π(~s1 · s)(a) > 0, then a ∈ E(s) for 2½-player game graphs G,
and a ∈ Γ2(s) for concurrent game graphs G. We write Σ for the set of player-1 strategies,
and Π for the set of player-2 strategies on G. Note that |Π| = 1 if G is a player-1 MDP, and
|Σ| = 1 if G is a player-2 MDP.
2.2.1 Types of strategies
We classify strategies according to their use of randomization and memory.
Use of randomization. Strategies that do not use randomization are called pure. A
player-1 strategy σ is pure (or deterministic) if for all state sequences ~s ∈ S∗ · S1, there
exists a move a ∈ A such that σ(~s)(a) = 1. The pure strategies for player 2 are defined
analogously. We denote by ΣP the set of pure player-1 strategies, and by ΠP the set of pure
player-2 strategies. A strategy that is not necessarily pure is sometimes called randomized.
Use of memory. Strategies in general require memory to remember the history of a
game. The following alternative definition of strategies makes this explicit. Let M be a set
called memory. A player-1 strategy σ = (σu, σm) can be specified as a pair of functions: a
memory-update function σu: S × M → M, which given the current state of the game and the
memory, updates the memory with information about the current state; and a next-move
function σm: S1 × M → Dist(A), which given the current state and the memory, prescribes
the next move of the player. The player-1 strategy σ is finite-memory if the memory M is
a finite set; and the strategy σ is memoryless (or positional) if the memory M is a singleton,
i.e., |M| = 1. A finite-memory strategy remembers only a finite amount of information
about the infinitely many different possible histories of the game; a memoryless strategy is
independent of the history of the game and depends only on the current state of the game.
Note that a memoryless player-1 strategy can be represented as a function σ: S1 → Dist(A).
A memoryless strategy σ is uniform memoryless if it is a uniform
distribution over its support, i.e., for all states s we have σ(s)(a) = 0 if a ∉ Supp(σ(s)),
and σ(s)(a) = 1/|Supp(σ(s))| if a ∈ Supp(σ(s)). We denote by ΣF the set of finite-memory
player-1 strategies, and by ΣM and ΣUM the sets of memoryless and uniform memoryless player-
1 strategies, respectively. The finite-memory player-2 strategies ΠF, the memoryless player-2 strategies
ΠM, and the uniform memoryless player-2 strategies ΠUM are defined analogously.
A pure finite-memory strategy is a pure strategy that is finite-memory; we write ΣPF =
ΣP ∩ ΣF for the pure finite-memory player-1 strategies, and ΠPF for the corresponding
player-2 strategies. A pure memoryless strategy is a pure strategy that is memoryless. The
pure memoryless strategies use neither randomization nor memory; they are the simplest
strategies we consider. Note that a pure memoryless player-1 strategy can be represented
as a function σ: S1 → A. We write ΣPM = ΣP ∩ ΣM for the pure memoryless player-1
strategies, and ΠPM for the corresponding class of simple player-2 strategies.
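A finite-memory strategy (σu, σm) induces a history-based strategy by folding the memory-update function over the history. The sketch below adopts one possible convention (an assumption: the memory summarizes the strict prefix of the history, and the next-move function sees the current state); all concrete names are invented:

```python
def induced_strategy(m0, sigma_u, sigma_m):
    """Turn a finite-memory strategy (sigma_u, sigma_m) with initial memory m0
    into a history-based strategy: a function from a nonempty history
    (tuple of states) to a distribution over moves."""
    def sigma(history):
        m = m0
        for s in history[:-1]:              # fold memory update over the prefix
            m = sigma_u(s, m)
        return sigma_m(history[-1], m)      # Dist(A) at the current state
    return sigma

# Example: M = {0, 1} tracks the parity of visits to state "b";
# the next-move function plays a pure move depending on the memory.
sigma_u = lambda s, m: (m + 1) % 2 if s == "b" else m
sigma_m = lambda s, m: {"left": 1.0} if m == 0 else {"right": 1.0}
sigma = induced_strategy(0, sigma_u, sigma_m)
```

With |M| = 1 the fold is trivial and σ depends only on the current state, recovering the memoryless case; a pure strategy corresponds to σm always returning a point distribution, as above.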
2.2.2 Probability space and outcomes of strategies
A path of the game graph G is an infinite sequence ω = 〈s0, s1, s2, . . .〉 of states in
S such that (sk, sk+1) ∈ E for all k ≥ 0. We denote the set of paths of G by Ω. Once a
starting state s ∈ S and strategies σ ∈ Σ and π ∈ Π for the two players are fixed, the result
of the game is a random walk in G, denoted ω_s^{σ,π}.
Probability space of strategies. Given a finite sequence x = 〈s0, s1, . . . , sk〉 of states,
the cone for x is the set Cone(x) = {〈s'0, s'1, . . .〉 ∈ Ω | (∀0 ≤ i ≤ k)(si = s'i)} of paths with prefix x.
Let U be the set of cones for all finite paths of G. The set U is the set of basic open sets
in S^ω. Let F be the Borel σ-field generated by U, i.e., F is the smallest set that is closed
under complementation, countable union, and countable intersection, with Ω ∈ F and U ⊆ F. Then
(Ω, F) is a σ-algebra. Given strategies σ and π for player 1 and player 2, respectively, and
a state s, we define a function µ_s^{σ,π}: U → [0, 1] as follows:
• Cones of length 1: µ_s^{σ,π}(Cone(s')) = 1 if s = s', and 0 otherwise.
• Cones of length greater than 1: given a finite sequence ω_{k+1} = 〈s0, s1, . . . , sk, sk+1〉,
let ω_k = 〈s0, s1, . . . , sk〉 and
µ_s^{σ,π}(Cone(ω_{k+1})) = µ_s^{σ,π}(Cone(ω_k)) · ∑_{a1∈Γ1(sk), a2∈Γ2(sk)} δ(sk, a1, a2)(sk+1) · σ(ω_k)(a1) · π(ω_k)(a2).
The function µ_s^{σ,π} is a measure and there is a unique extension of µ_s^{σ,π} to a probability
measure on F (by the Carathéodory extension theorem [Bil95]). We denote this probability
measure on F, induced by the strategies σ and π and the starting state s, by Pr_s^{σ,π}. Then
(Ω, F, Pr_s^{σ,π}) is a probability space. An event Φ is a measurable set of paths, i.e., Φ ∈ F.
Given an event Φ, Pr_s^{σ,π}(Φ) denotes the probability that the random walk ω_s^{σ,π} is in Φ. For
a measurable function f: Ω → R we denote by E_s^{σ,π}[f] the expectation of the function
f under the probability distribution Pr_s^{σ,π}(·). For i ≥ 0, we denote by Xi: Ω → S the
random variable denoting the i-th state along a path, and by Y_{1,i} and Y_{2,i} the random
variables denoting the actions played in the i-th round of the play by player 1 and player 2,
respectively.
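The recurrence defining µ_s^{σ,π} on cones can be executed directly. A sketch on a toy concurrent game with memoryless strategies (all concrete names and numbers invented for illustration):

```python
# Toy concurrent game: delta: S x A x A -> Dist(S), stored sparsely.
delta = {
    ("s", "a", "x"): {"s": 1.0},
    ("s", "b", "x"): {"t": 1.0},
    ("t", "a", "x"): {"t": 1.0},
}
gamma1 = {"s": {"a", "b"}, "t": {"a"}}
gamma2 = {"s": {"x"}, "t": {"x"}}

# History-based strategies returning distributions over moves.
sigma = lambda hist: {"a": 0.5, "b": 0.5} if hist[-1] == "s" else {"a": 1.0}
pi = lambda hist: {"x": 1.0}

def cone_probability(path, s):
    """mu_s^{sigma,pi}(Cone(path)), computed by the recurrence in the text."""
    if path[0] != s:
        return 0.0                       # base case for cones of length 1
    prob = 1.0
    for k in range(len(path) - 1):
        hist, sk, nxt = path[:k + 1], path[k], path[k + 1]
        # Sum over joint moves at s_k, weighted by both strategies.
        prob *= sum(delta[(sk, a1, a2)].get(nxt, 0.0)
                    * sigma(hist).get(a1, 0.0) * pi(hist).get(a2, 0.0)
                    for a1 in gamma1[sk] for a2 in gamma2[sk])
    return prob
```

For instance, the cone of 〈s, t〉 has probability 0.5 here (player 1 plays b with probability ½), and the cone of 〈s, s, t〉 has probability 0.25.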
Outcomes of strategies. Consider two strategies σ ∈ Σ and π ∈ Π on a game graph G,
and let ω = 〈s0, s1, s2, . . .〉 be a path of G. The path ω is (σ, π)-possible for a 2½-player
game graph G if for every k ≥ 0 the following two conditions hold: if sk ∈ S1, then
σ(s0s1 . . . sk)(sk+1) > 0; and if sk ∈ S2, then π(s0s1 . . . sk)(sk+1) > 0. The path ω is (σ, π)-
possible for a concurrent game graph G if for every k ≥ 0, there exist moves a1 ∈ Γ1(sk) and
a2 ∈ Γ2(sk) for the two players such that σ(s0s1 . . . sk)(a1) > 0 and π(s0s1 . . . sk)(a2) > 0
and sk+1 ∈ Succ(sk, a1, a2). Given a state s ∈ S and two strategies σ ∈ Σ and π ∈ Π, we
denote by Outcome(s, σ, π) ⊆ Ω the set of (σ, π)-possible paths whose first state is s. Note
that Outcome(s, σ, π) is a probability-1 event, i.e., Pr_s^{σ,π}(Outcome(s, σ, π)) = 1.
Given a game graph G and a player-1 strategy σ ∈ Σ, we write G_σ for the game
played on G under the constraint that player 1 follows the strategy σ. Analogously, given G
and a player-2 strategy π ∈ Π, we write G_π for the game played on G under the constraint
that player 2 follows the strategy π. Observe that for a 2½-player game graph G or a
concurrent game graph G and a memoryless player-1 strategy σ ∈ Σ, the result G_σ is a
player-2 MDP. Similarly, for a player-2 MDP G and a memoryless player-2 strategy π ∈ Π,
the result G_π is a Markov chain. Hence, if G is a 2½-player game graph or a concurrent
game graph and the two players follow memoryless strategies σ and π, then the result
G_{σ,π} = (G_σ)_π is a Markov chain. The following observation will also be used later. Given a
game graph G and a strategy in Σ ∪ Π with finite memory M, the strategy can be interpreted
as a memoryless strategy in the synchronous product G × M of the game graph G with the
memory M. Hence the above observation (on memoryless strategies) also extends to finite-
memory strategies, i.e., if player 1 plays a finite-memory strategy σ, then G_σ is a player-2
MDP, and if both players follow finite-memory strategies, then we have a Markov chain.
2.3 Objectives
Consider a game graph G. Player-1 and player-2 objectives for G are measurable
sets Φ1, Φ2 ⊆ Ω of winning paths for the two players: player i, for i ∈ {1, 2}, wins the game
played on the graph G with the objective Φi iff the infinite path in Ω that results from
playing the game lies inside the set Φi. In the case of zero-sum games, the objectives of
the two players are strictly competitive, that is, Φ2 = Ω \ Φ1. A general class of objectives
are the Borel objectives. A Borel objective Φ ⊆ Ω is a Borel set in the Cantor topology
on the set S^ω of infinite state sequences (note that Ω ⊆ S^ω). An important subclass of
the Borel objectives are the ω-regular objectives, which lie in the first 2½ levels of the Borel
hierarchy (i.e., in the intersection of Σ^0_3 and Π^0_3). The ω-regular objectives are of special
interest for the verification and synthesis of reactive systems [MP92]. In particular, the
following specifications of winning conditions for the players define ω-regular objectives,
and subclasses thereof [Tho97].
Reachability and safety objectives. A reachability specification for the game graph
G is a set T ⊆ S of states, called target states. The reachability specification T requires
that some state in T be visited. Thus, the reachability specification T defines the set
Reach(T) = {〈s0, s1, s2, . . .〉 ∈ Ω | (∃k ≥ 0)(sk ∈ T)} of winning paths; this set is called
a reachability objective. A safety specification for G is likewise a set U ⊆ S of states;
they are called safe states. The safety specification U requires that only states in U be
visited. Formally, the safety objective defined by U is the set Safe(U) = {〈s0, s1, . . .〉 ∈ Ω |
(∀k ≥ 0)(sk ∈ U)} of winning paths. Note that reachability and safety are dual objectives:
Safe(U) = Ω \ Reach(S \ U).
Büchi and coBüchi objectives. A Büchi specification for G is a set B ⊆ S of states,
which are called Büchi states. The Büchi specification B requires that some state in B be
visited infinitely often. For a path ω = 〈s0, s1, s2, . . .〉, we write Inf(ω) = {s ∈ S | sk =
s for infinitely many k ≥ 0} for the set of states that occur infinitely often in ω. Thus, the
Büchi objective defined by B is the set Büchi(B) = {ω ∈ Ω | Inf(ω) ∩ B ≠ ∅} of winning
paths. The dual of a Büchi specification is a coBüchi specification C ⊆ S, which specifies a
set of so-called coBüchi states. The coBüchi specification C requires that the states outside
C be visited only finitely often. Formally, the coBüchi objective defined by C is the set coBüchi(C) =
{ω ∈ Ω | Inf(ω) ⊆ C} of winning paths. Note that coBüchi(C) = Ω \ Büchi(S \ C). It is
also worth noting that reachability and safety objectives can be turned into both Büchi and
coBüchi objectives by slightly modifying the game graph (for example, if every target state
s ∈ T is made a sink state, then we have Reach(T) = Büchi(T)).
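On an ultimately periodic path 〈prefix〉 · 〈cycle〉^ω, the set Inf(ω) is exactly the set of states on the repeated cycle, so membership in each of the objectives above reduces to finite checks. A sketch (state names invented):

```python
def in_reach(prefix, cycle, T):
    """omega in Reach(T): some state of T is visited."""
    return bool((set(prefix) | set(cycle)) & T)

def in_safe(prefix, cycle, U):
    """omega in Safe(U): only states of U are visited."""
    return set(prefix) <= U and set(cycle) <= U

def in_buchi(prefix, cycle, B):
    """omega in Buchi(B): Inf(omega) = set(cycle) meets B."""
    return bool(set(cycle) & B)

def in_cobuchi(prefix, cycle, C):
    """omega in coBuchi(C): Inf(omega) is contained in C."""
    return set(cycle) <= C
```

The duality Safe(U) = Ω \ Reach(S \ U) can be spot-checked on such paths: for any prefix and cycle over a state space S, in_safe(..., U) agrees with the negation of in_reach(..., S − U).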
Rabin and Streett objectives. We use colors to define objectives independent of game
graphs. For a set C of colors, we write [[·]]: C → 2^S for a function that maps each color to a set
of states. Inversely, given a set U ⊆ S of states, we write [U] = {c ∈ C | [[c]] ∩ U ≠ ∅} for the
set of colors that occur in U. Note that a state can have multiple colors. A Rabin objective
is specified as a set P = {(e1, f1), . . . , (ed, fd)} of pairs of colors ei, fi ∈ C. Intuitively, the
Rabin condition P requires that for some 1 ≤ i ≤ d, all states of color ei be visited finitely
often and some state of color fi be visited infinitely often. Let [[P]] = {(E1, F1), . . . , (Ed, Fd)}
be the corresponding set of so-called Rabin pairs, where Ei = [[ei]] and Fi = [[fi]] for all
1 ≤ i ≤ d. Formally, the set of winning plays is Rabin(P) = {ω ∈ Ω | ∃ 1 ≤ i ≤
d. (Inf(ω) ∩ Ei = ∅ ∧ Inf(ω) ∩ Fi ≠ ∅)}. Without loss of generality, we require that
⋃_{i∈{1,2,...,d}} (Ei ∪ Fi) = S. The parity (or Rabin-chain) objectives are the special case
of Rabin objectives such that E1 ⊊ F1 ⊊ E2 ⊊ F2 ⊊ · · · ⊊ Ed ⊊ Fd. A Streett objective is
again specified as a set P = {(e1, f1), . . . , (ed, fd)} of pairs of colors. The Streett condition
P requires that for each 1 ≤ i ≤ d, if some state of color fi is visited infinitely often,
then some state of color ei be visited infinitely often. Formally, the set of winning plays is
Streett(P) = {ω ∈ Ω | ∀ 1 ≤ i ≤ d. (Inf(ω) ∩ Ei ≠ ∅ ∨ Inf(ω) ∩ Fi = ∅)}, for the set
[[P]] = {(E1, F1), . . . , (Ed, Fd)} of so-called Streett pairs. Note that the Rabin and Streett
objectives are dual; i.e., the complement of a Rabin objective is a Streett objective, and
vice versa.
Parity objectives. A parity specification for G consists of a nonnegative integer d and
a function p: S → {0, 1, 2, . . . , 2d}, which assigns to every state of G an integer between
0 and 2d. For a state s ∈ S, the value p(s) is called the priority of s. We assume
without loss of generality that p^{-1}(j) ≠ ∅ for all 0 < j ≤ 2d; this implies that a parity
specification is completely specified by the priority function p (and d does not need to be
specified explicitly). The positive integer 2d + 1 is referred to as the number of priori-
ties of p. The parity specification p requires that the minimum priority of all states that
are visited infinitely often is even. Formally, the parity objective defined by p is the set
Parity(p) = {ω ∈ Ω | min{p(s) | s ∈ Inf(ω)} is even} of winning paths. Note that for
a parity objective Parity(p), the complementary objective Ω \ Parity(p) is again a parity
objective: Ω \ Parity(p) = Parity(p + 1), where the priority function p + 1 is defined by
(p + 1)(s) = p(s) + 1 for all states s ∈ S (if p^{-1}(0) = ∅, then use p − 1 instead of p + 1). This
self-duality of parity objectives is often convenient when solving games. It is also worth
noting that the Büchi objectives are parity objectives with two priorities (let p^{-1}(0) = B and
p^{-1}(1) = S \ B), and the coBüchi objectives are parity objectives with three priorities (let p^{-1}(0) = ∅,
p^{-1}(1) = S \ C, and p^{-1}(2) = C).
Parity objectives are also called Rabin-chain objectives, as they are a special case
of Rabin objectives [Tho97]: if the Rabin pairs [[P]] = {(E1, F1), . . . , (Ed, Fd)} form
a chain E1 ⊊ F1 ⊊ E2 ⊊ F2 ⊊ · · · ⊊ Ed ⊊ Fd, then Rabin(P) = Parity(p) for the priority
function p: S → {0, 1, . . . , 2d} that for all 1 ≤ j ≤ d assigns to each state in Ej \ Fj−1
the priority 2j − 1, and to each state in Fj \ Ej the priority 2j, where F0 = ∅. Conversely,
given a priority function p: S → {0, 1, . . . , 2d}, we can construct a chain E1 ⊊ F1 ⊊ · · · ⊊
Ed+1 ⊊ Fd+1 of Rabin sets such that Parity(p) = Rabin({(E1, F1), . . . , (Ed+1, Fd+1)}) as follows:
let E1 = ∅ and F1 = p^{-1}(0), and for all 1 < j ≤ d + 1, let Ej = Fj−1 ∪ p^{-1}(2j − 3)
and Fj = Ej ∪ p^{-1}(2j − 2). Hence, the parity objectives are a subclass of the Rabin
objectives that is closed under complementation. It follows that every parity objective is
both a Rabin objective and a Streett objective. The parity objectives are of special interest,
because every ω-regular objective can be turned into a parity objective by modifying the
game graph (take the synchronous product of the game graph with a deterministic parity
automaton that accepts the ω-regular objective) [Mos84].
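The priority-function-to-Rabin-chain construction above is mechanical, and its correctness can be checked exhaustively over all possible Inf-sets of a small example. A sketch (the 3-state example is invented):

```python
from itertools import combinations

def parity_to_rabin_chain(p, d):
    """Chain E1, F1, ..., E_{d+1}, F_{d+1} with Parity(p) = Rabin(pairs);
    p maps states to priorities in {0, ..., 2d}.  For j = 1 the recurrence
    below gives E1 = emptyset and F1 = p^{-1}(0), as in the text."""
    inv = lambda j: frozenset(s for s, pr in p.items() if pr == j)
    pairs, F = [], frozenset()
    for j in range(1, d + 2):
        E = F | inv(2 * j - 3)    # p^{-1}(-1) is empty, so E1 = emptyset
        F = E | inv(2 * j - 2)
        pairs.append((E, F))
    return pairs

def parity_holds(inf, p):
    """min priority of Inf(omega) is even."""
    return min(p[s] for s in inf) % 2 == 0

def rabin_holds(inf, pairs):
    """Some pair (E, F): Inf avoids E and meets F."""
    return any(not (inf & E) and (inf & F) for E, F in pairs)

# Exhaustive check over every nonempty Inf-set of a 3-state example.
p = {"a": 0, "b": 1, "c": 2}
pairs = parity_to_rabin_chain(p, d=1)
states = sorted(p)
for r in range(1, len(states) + 1):
    for inf in map(set, combinations(states, r)):
        assert parity_holds(inf, p) == rabin_holds(inf, pairs)
```

Here the chain is E1 = ∅ ⊊ F1 = {a} ⊊ E2 = {a, b} ⊊ F2 = {a, b, c}, and the loop verifies the equivalence on all seven candidate Inf-sets.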
Muller and upward-closed objectives. The most general form for defining ω-regular
objectives are Muller specifications. A Muller specification for the game graph G is a set
M ⊆ 2^S of sets of states. The sets in M are called Muller sets. The Muller specification
M requires that the set of states that are visited infinitely often is one of the Muller sets.
Formally, the Muller specification M defines the Muller objective Muller(M) = {ω ∈ Ω |
Inf(ω) ∈ M}. Note that Rabin and Streett objectives are special cases of Muller objectives.
The upward-closed objectives form a subclass of the Muller objectives, with the restriction
that the set M is upward-closed. Formally, a set UC ⊆ 2^S is upward-closed if the following
condition holds: if U ∈ UC and U ⊆ Z, then Z ∈ UC. Given an upward-closed set UC ⊆ 2^S,
the upward-closed objective is defined as the set UpClo(UC) = {ω ∈ Ω | Inf(ω) ∈ UC} of
winning plays.
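Whether a given family M ⊆ 2^S is upward-closed can be decided by brute force on small state spaces. A sketch with invented sets:

```python
from itertools import combinations

def is_upward_closed(M, S):
    """Check the condition: U in M and U ⊆ Z ⊆ S imply Z in M."""
    subsets = [frozenset(c) for r in range(len(S) + 1)
               for c in combinations(sorted(S), r)]
    return all(Z in M for U in M for Z in subsets if U <= Z)

S = {"a", "b"}
closed = {frozenset({"a"}), frozenset({"a", "b"})}       # upward-closed
not_closed = {frozenset({"a"})}                          # missing {a, b}
```

Here `closed` satisfies the condition, while `not_closed` fails it because the superset {a, b} of {a} is absent.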
2.4 Game Values
Given a state s and an objective Ψ1 for player 1, the maximal probability with
which player 1 can ensure that Ψ1 holds from s is the value of the game at s for player 1.
Formally, given a game graph G with objectives Ψ1 for player 1 and Ψ2 for player 2, we
define the value functions Val1 and Val2 for players 1 and 2, respectively, as follows:
Val_1^G(Ψ1)(s) = sup_{σ∈Σ} inf_{π∈Π} Pr_s^{σ,π}(Ψ1);
Val_2^G(Ψ2)(s) = sup_{π∈Π} inf_{σ∈Σ} Pr_s^{σ,π}(Ψ2).
If the game graph G is clear from the context, then we drop the superscript G. Given a
game graph G, a strategy σ for player 1, and an objective Ψ1, we use the following notation:
Val_1^σ(Ψ1)(s) = inf_{π∈Π} Pr_s^{σ,π}(Ψ1).
Given a game graph G, a strategy σ for player 1 is optimal from state s for objective Ψ1 if
Val1(Ψ1)(s) = Val_1^σ(Ψ1)(s) = inf_{π∈Π} Pr_s^{σ,π}(Ψ1).
Given a game graph G, a strategy σ for player 1 is ε-optimal, for ε ≥ 0, from state s for
objective Ψ1 if
Val1(Ψ1)(s) − ε ≤ inf_{π∈Π} Pr_s^{σ,π}(Ψ1).
Note that an optimal strategy is ε-optimal with ε = 0. The optimal and ε-optimal strategies
for player 2 are defined analogously. Computing values, optimal strategies, and ε-optimal strategies is
referred to as the quantitative analysis of games.
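On turn-based 2½-player graphs, the reachability value function can be approximated by a simple fixpoint iteration: player 1 maximizes over successors, player 2 minimizes, and probabilistic states average under δ. This is a sketch of one standard approach, converging to the values in the limit (exact algorithms exist); the toy graph and all names are invented:

```python
def reachability_values(s1, s2, sp, succ, delta, target, iters=100):
    """Approximate Val1(Reach(T)) on a turn-based 2 1/2-player game graph
    by value iteration."""
    states = s1 | s2 | sp
    v = {s: 1.0 if s in target else 0.0 for s in states}
    for _ in range(iters):
        nv = {}
        for s in states:
            if s in target:
                nv[s] = 1.0
            elif s in s1:
                nv[s] = max(v[t] for t in succ[s])     # player 1 maximizes
            elif s in s2:
                nv[s] = min(v[t] for t in succ[s])     # player 2 minimizes
            else:
                nv[s] = sum(p * v[t] for t, p in delta[s].items())
        v = nv
    return v

v = reachability_values(
    s1={"a", "goal", "trap"}, s2={"b"}, sp={"p"},
    succ={"a": {"p", "trap"}, "b": {"a", "goal"},
          "goal": {"goal"}, "trap": {"trap"}},
    delta={"p": {"goal": 0.5, "trap": 0.5}},
    target={"goal"},
)
```

On this example the values converge to 1 at "goal", 0 at "trap", and ½ elsewhere: player 1 at "a" prefers the coin-flip state "p" over "trap", and player 2 at "b" avoids "goal".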
Sure, almost-sure, positive and limit-sure winning strategies. Given a game graph
G with an objective Ψ1 for player 1, a strategy σ is a sure winning strategy for player 1
from a state s if for every strategy π of player 2 we have Outcome(s, σ, π) ⊆ Ψ1. A strategy
σ is an almost-sure winning strategy for player 1 from a state s for the objective Ψ1 if
for every strategy π of player 2 we have Pr_s^{σ,π}(Ψ1) = 1. A strategy σ is positive winning
for player 1 from the state s for the objective Φ if for every player-2 strategy π, we have
Pr_s^{σ,π}(Φ) > 0. A family of strategies ΣC is limit-sure winning for player 1 from a state s for
the objective Ψ1 if we have sup_{σ∈ΣC} inf_{π∈Π} Pr_s^{σ,π}(Ψ1) = 1. The sure winning, almost-
sure winning, positive winning and limit-sure winning strategies for player 2 are defined
analogously. Given a game graph G and an objective Ψ1 for player 1, the sure winning set
Sure_1^G(Ψ1) for player 1 is the set of states from which player 1 has a sure winning strategy.
Similarly, the almost-sure winning set Almost_1^G(Ψ1) for player 1 is the set of states from
which player 1 has an almost-sure winning strategy, the positive winning set Positive_1^G(Ψ1)
for player 1 is the set of states from which player 1 has a positive winning strategy,
and the limit-sure winning set Limit_1^G(Ψ1) for player 1 is the set of states from which
player 1 has limit-sure winning strategies. The sure winning set Sure_2^G(Ψ2), the almost-
sure winning set Almost_2^G(Ψ2), the positive winning set Positive_2^G(Ψ2) and the limit-sure
winning set Limit_2^G(Ψ2) with objective Ψ2 for player 2 are defined analogously. Again, if the
game graph G is clear from the context, we drop G from the superscript. It follows from
the definitions that for all 2½-player and concurrent game graphs and all objectives Ψ1 and
Ψ2, we have Sure1(Ψ1) ⊆ Almost1(Ψ1) ⊆ Limit1(Ψ1) ⊆ Positive1(Ψ1) and Sure2(Ψ2) ⊆
Almost2(Ψ2) ⊆ Limit2(Ψ2) ⊆ Positive2(Ψ2). A game is sure winning (resp. almost-sure
winning and limit-sure winning) for player i, for i ∈ {1, 2}, if every state is sure winning (resp. almost-
sure winning and limit-sure winning) for player i. Computing the sure winning,
almost-sure winning, positive winning and limit-sure winning sets and strategies is referred
to as the qualitative analysis of games.
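For 2-player reachability games, the sure winning set Sure1(Reach(T)) is computed by the classical attractor fixpoint: repeatedly add player-1 states with some successor already winning and player-2 states with all successors already winning. A sketch of this standard construction, with invented state names:

```python
def sure_winning_reach(s1, s2, succ, target):
    """Sure1(Reach(T)) on a 2-player game graph via attractor computation."""
    W = set(target)
    changed = True
    while changed:
        changed = False
        for s in s1 | s2:
            if s in W:
                continue
            if s in s1 and succ[s] & W:       # player 1 can step into W
                W.add(s); changed = True
            elif s in s2 and succ[s] <= W:    # player 2 cannot avoid W
                W.add(s); changed = True
    return W

W = sure_winning_reach(
    s1={"a"}, s2={"b", "c", "d"},
    succ={"a": {"t"}, "b": {"a", "c"}, "c": {"a"}, "d": {"d", "a"}},
    target={"t"},
)
```

Here "d" stays outside W: as a player-2 state it can loop on itself forever, avoiding the target, while "b" and "c" are forced into W.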
Sufficiency of a family of strategies. Let C ∈ {P, M, F, PM, PF} and consider the
family ΣC of special strategies for player 1. We say that the family ΣC suffices with respect
to an objective Ψ1 on a class G of game graphs for
• sure winning if for every game graph G ∈ G, for every state s ∈ Sure1(Ψ1) there is
a player-1 strategy σ ∈ ΣC such that for every player-2 strategy π ∈ Π we have
Outcome(s, σ, π) ⊆ Ψ1;
• almost-sure winning if for every game graph G ∈ G, for every state s ∈ Almost1(Ψ1)
there is a player-1 strategy σ ∈ ΣC such that for every player-2 strategy π ∈ Π we
have Pr_s^{σ,π}(Ψ1) = 1;
• positive winning if for every game graph G ∈ G, for every state s ∈ Positive1(Ψ1)
there is a player-1 strategy σ ∈ ΣC such that for every player-2 strategy π ∈ Π we
have Pr_s^{σ,π}(Ψ1) > 0;
• limit-sure winning if for every game graph G ∈ G, for every state s ∈ Limit1(Ψ1) we
have sup_{σ∈ΣC} inf_{π∈Π} Pr_s^{σ,π}(Ψ1) = 1;
• optimality if for every game graph G ∈ G, for every state s ∈ S there is a player-1
strategy σ ∈ ΣC such that Val1(Ψ1)(s) = inf_{π∈Π} Pr_s^{σ,π}(Ψ1);
• ε-optimality if for every game graph G ∈ G, for every state s ∈ S there is a player-1
strategy σ ∈ ΣC such that Val1(Ψ1)(s) − ε ≤ inf_{π∈Π} Pr_s^{σ,π}(Ψ1).
The notion of sufficiency for the size of finite-memory strategies is obtained by referring to the
size of the memory M of the strategies. The notions of sufficiency of strategies for player 2
are defined analogously.
For sure winning, 1½-player and 2½-player games coincide with 2-player (turn-
based deterministic) games where the random player (who chooses the successor at the
probabilistic states) is interpreted as an adversary, i.e., as player 2. This is formalized by
the proposition below.
Proposition 1 If a family ΣC of strategies suffices for sure winning with respect to an
objective Φ on all 2-player game graphs, then the family ΣC suffices for sure winning with
respect to Φ also on all 1½-player and 2½-player game graphs.
The following proposition states that randomization is not necessary for sure win-
ning.
Proposition 2 If a family ΣC of strategies suffices for sure winning with respect to a Borel
objective Φ on all concurrent game graphs, then the family ΣC∩ΣP of pure strategies suffices
for sure winning with respect to Φ on all concurrent game graphs.
2.5 Determinacy
The fundamental concept of rationality in zero-sum games is captured by the
notion of optimal and ε-optimal strategies. The key result that establishes the existence of
ε-optimal strategies, for all ε > 0, in zero-sum games is the determinacy result, which states
that the sum of the values of the players is 1 at all states, i.e., for all states s ∈ S, we have
Val1(Ψ1)(s) + Val2(Ψ2)(s) = 1. The determinacy result implies the following equality:
sup_{σ∈Σ} inf_{π∈Π} Pr_s^{σ,π}(Ψ1) = inf_{π∈Π} sup_{σ∈Σ} Pr_s^{σ,π}(Ψ1).
The determinacy result also guarantees the existence of ε-optimal strategies for all ε > 0,
for both players. A deep result by Martin [Mar98] established that determinacy holds
for all concurrent games with Borel objectives (see Theorem 1). A more refined notion
of determinacy is sure determinacy, which states that for an objective Ψ1 we have
Sure1(Ψ1) = S \ Sure2(Ω \ Ψ1). Sure determinacy holds for turn-based deterministic
games with all Borel objectives [Mar75]; however, sure determinacy does not hold for
2½-player games and concurrent games.
Theorem 1 For all Borel objectives Ψ1 and the complementary objective Ψ2 = Ω \ Ψ1, the
following assertions hold.
1. ([Mar75]). For all 2-player game graphs, the sure winning sets Sure1(Ψ1) and
Sure2(Ψ2) form a partition of the state space, i.e., Sure1(Ψ1) = S \ Sure2(Ψ2), and
the family of pure strategies suffices for sure winning.
2. ([Mar98]). For all concurrent game structures and for all states s we have
Val1(Ψ1)(s) + Val2(Ψ2)(s) = 1.
Given a game graph G, let us denote by Σε(Ψ1) and Πε(Ψ2) the sets of ε-optimal
strategies for player 1 for objective Ψ1 and for player 2 for objective Ψ2, respectively. We
obtain the following corollary from Theorem 1.
Corollary 1 For all concurrent game structures, for all Borel objectives Ψ1 and the com-
plementary objective Ψ2, and for all ε > 0, we have Σε(Ψ1) ≠ ∅ and Πε(Ψ2) ≠ ∅.
2.6 Complexity of Games
We now summarize the main complexity results related to 2-player, 2½-player and
concurrent games with parity, Rabin, Streett and Muller objectives. We first present the
result for 2-player games.
Theorem 2 (Complexity of 2-player games) The problem of deciding whether a state
s is a sure winning state, i.e., whether s ∈ Sure1(Ψ1) for an objective Ψ1, is NP-complete for
Rabin objectives and coNP-complete for Streett objectives [EJ88], and PSPACE-complete
for Muller objectives [HD05].
We now state the main complexity results known for concurrent game struc-
tures. The basic results were proved for reachability and parity objectives (given by The-
orem 3). By an exponential reduction of Rabin, Streett and Muller objectives to parity
objectives [Tho97], we obtain Corollary 2.
Theorem 3 The following assertions hold.
1. ([dAHK98]). For all concurrent game structures G, for all T ⊆ S, for a state s ∈ S,
whether Val1(Reach(T ))(s) = 1 can be decided in PTIME.
2. ([EY06]). For all concurrent game structures G, for all T ⊆ S, for a state s ∈ S,
a rational α and a rational ε > 0, whether Val1(Reach(T ))(s) ≥ α can be decided
in PSPACE; and a rational interval [l, u] such that Val1(Reach(T ))(s) ∈ [l, u] and
u − l ≤ ε can be computed in PSPACE.
3. ([dAH00]). For all concurrent game structures G, for all priority functions p, for a
state s ∈ S, whether Val1(Parity(p))(s) = 1 can be decided in NP ∩ coNP.
4. ([dAM01]). For all concurrent game structures G, for all priority functions p, for a
state s ∈ S, a rational α and a rational ε > 0, whether Val1(Parity(p))(s) ≥ α can be
decided in 3EXPTIME; and a rational interval [l, u] such that Val1(Parity(p))(s) ∈
[l, u] and u − l ≤ ε can be computed in 3EXPTIME.
Corollary 2 For all concurrent game structures G, for all Rabin, Streett, and Muller ob-
jectives Φ, for a state s ∈ S, a rational α and a rational ε > 0, whether Val1(Φ)(s) ≥ α
can be decided in 4EXPTIME; a rational interval [l, u] such that Val1(Φ)(s) ∈ [l, u] and
u − l ≤ ε can be computed in 4EXPTIME; and whether Val1(Φ)(s) = 1 can be decided in
2EXPTIME.
We now present the results for 2½-player game graphs. The results for 2½-player
game graphs are obtained as follows: the qualitative analysis for reachability and parity
objectives follows from the results on concurrent game structures; the result for the quantitative
analysis of reachability objectives follows from the results of Condon [Con92]; and the result
for the quantitative analysis of parity objectives follows from the results for concurrent games,
but with an exponential improvement. The results are presented in Theorem 4, and the
exponential reduction of Rabin, Streett, and Muller objectives to parity objectives [Tho97]
yields Corollary 3.
Theorem 4 The following assertions hold.
1. ([dAHK98]). For all 2½-player game graphs G, for all T ⊆ S, for a state s ∈ S,
whether Val1(Reach(T ))(s) = 1 can be decided in PTIME.
2. ([Con92]). For all 2½-player game graphs G, for all T ⊆ S, for a state s ∈ S,
and a rational α, whether Val1(Reach(T ))(s) ≥ α can be decided in NP ∩ coNP, and
Val1(Reach(T ))(s) can be computed in EXPTIME.
3. ([dAH00]). For all 2½-player game graphs G, for all priority functions p, for a state
s ∈ S, whether Val1(Parity(p))(s) = 1 can be decided in NP ∩ coNP.
4. ([dAM01]). For all 2½-player game structures G, for all priority functions p, for a
state s ∈ S, a rational α, and a rational ε > 0, whether Val1(Parity(p))(s) ≥ α can
be decided in 2EXPTIME; and a rational interval [l, u] such that Val1(Parity(p))(s) ∈
[l, u] and u − l ≤ ε can be computed in 2EXPTIME.
Corollary 3 For all 2½-player game graphs G, for all Rabin, Streett, and Muller objectives
Φ, for a state s ∈ S, a rational α and a rational ε > 0, whether Val1(Φ)(s) ≥ α can be
decided in 3EXPTIME; a rational interval [l, u] such that Val1(Φ)(s) ∈ [l, u] and u − l ≤ ε
can be computed in 3EXPTIME; and whether Val1(Φ)(s) = 1 can be decided in 2EXPTIME.
Chapter 3
Concurrent Games with Tail Objectives
In this chapter we consider concurrent games with tail objectives,¹ i.e., ob-
jectives that are independent of the finite prefixes of traces, and show that the class of tail
objectives is strictly richer than the ω-regular objectives. We develop new proof techniques
to extend several properties of concurrent games with ω-regular objectives to concurrent
games with tail objectives. We prove the positive limit-one property for tail objectives,
which states that for all concurrent games, if the optimum value for a player is positive for
a tail objective Φ at some state, then there is a state where the optimum value for Φ is 1
for that player. We also show that the optimum values of zero-sum games (with strictly
conflicting objectives) with tail objectives can be related to equilibrium values of nonzero-sum
games (with objectives that are not strictly conflicting) with simpler reachability objectives.
A consequence of our analysis is a polynomial-time reduction of the quantitative analysis of
tail objectives to the qualitative analysis for the sub-class of one-player stochastic games
(Markov decision processes). The properties we prove for the general class of concurrent games with tail
¹A preliminary version of the results of this chapter appeared in [Cha06, Cha07a].
objectives will be used in the later chapters for both concurrent and turn-based games with
Muller objectives.
3.1 Tail Objectives
The class of tail objectives is defined as follows.
Tail objectives. Informally, the class of tail objectives is the sub-class of Borel objectives
that are independent of all finite prefixes. An objective Φ is a tail objective if the following
condition holds: a path ω ∈ Φ if and only if for all i ≥ 0, ωi ∈ Φ, where ωi denotes the
path ω with the prefix of length i deleted. Formally, let Gi = σ(Xi, Xi+1, . . .) be the σ-field
generated by the random variables Xi, Xi+1, . . . .² The tail σ-field T is defined as
T = ⋂_{i≥0} Gi. An objective Φ is a tail objective if and only if Φ belongs to the tail σ-field
T, i.e., the tail objectives are indicator functions of events A ∈ T.
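As a concrete check of prefix-independence, the sketch below (our encoding, not from the text) represents an ultimately periodic path of priorities as a pair (prefix, cycle). Membership in a parity objective depends only on the priorities seen infinitely often, i.e., only on the cycle, so deleting any finite prefix never changes the verdict.

```python
# Sketch: parity objectives are prefix-independent (tail).
def in_parity(prefix, cycle):
    """Path satisfies the parity objective iff the minimum priority
       occurring infinitely often (= min of the cycle) is even."""
    return min(cycle) % 2 == 0

def drop_prefix(prefix, cycle, i):
    """The path with its first i positions deleted, as (prefix, cycle)."""
    if i <= len(prefix):
        return prefix[i:], cycle
    j = (i - len(prefix)) % len(cycle)
    return [], cycle[j:] + cycle[:j]

path = ([3, 1, 2], [2, 4])      # priorities along the path
verdicts = {in_parity(*drop_prefix(*path, i)) for i in range(20)}
# Every suffix gives the same verdict, so `verdicts` is a singleton.
```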
Observe that Muller and parity objectives are tail objectives. Büchi and coBüchi
objectives are special cases of parity objectives and hence tail objectives. Reachability
objectives are not necessarily tail objectives, but for a set T ⊆ S of states, if every state
s ∈ T is an absorbing state, then the objective Reach(T) is equivalent to Büchi(T) and hence
is a tail objective. It may be noted that since σ-fields are closed under complementation,
the class of tail objectives is closed under complementation. We give an example to show
that the class of tail objectives is richer than the ω-regular objectives.³
Example 1 Let r be a reward function that maps every state s to a real-valued reward
r(s), i.e., r : S → R. Given a reward function r, we define a function LimAvg_r : Ω → R
as follows: for a path ω = 〈s1, s2, s3, . . .〉 we have

LimAvg_r(ω) = lim inf_{n→∞} (1/n) · Σ_{i=1}^{n} r(si),

i.e., LimAvg_r(ω) is the long-run average of the rewards appearing in ω. For a constant
²We use σ for strategies and σ (boldface) for σ-fields.
³Our example shows that there are Π⁰₃-hard objectives that are tail objectives. It is possible that tail
objectives can express objectives in even higher levels of the Borel hierarchy than Π⁰₃, which would make our
results stronger.
c ∈ R consider the objective Φc defined as follows: Φc = {ω ∈ Ω | LimAvg_r(ω) ≥ c}.
Intuitively, Φc accepts the set of paths such that the “long-run” average of the rewards in
the path is at least the constant c. The “long-run” average condition is hard for the third
level of the Borel hierarchy (see Subsection 3.1.1 for the Π⁰₃-completeness proof) and cannot be
expressed as an ω-regular objective. It may be noted that the “long-run” average of a path
is independent of all finite prefixes of the path. Formally, the objectives Φc are tail
objectives. Since the Φc are Π⁰₃-hard objectives, it follows that tail objectives lie in higher levels
of the Borel hierarchy than the ω-regular objectives.
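The prefix-independence claimed in Example 1 can be seen numerically: for an ultimately periodic reward sequence, the long-run average equals the mean reward on the cycle, whatever finite prefix is attached. The closed-form prefix/cycle encoding below is our illustration.

```python
# Sketch: LimAvg of prefix . cycle^omega ignores the prefix.
def avg_first_n(prefix, cycle, n):
    """Average of the first n rewards of the sequence prefix . cycle^omega."""
    total = sum(prefix[:n])
    if n > len(prefix):
        full, rem = divmod(n - len(prefix), len(cycle))
        total += full * sum(cycle) + sum(cycle[:rem])
    return total / n

# A long prefix of reward 9 does not move the long-run average off 1/2:
a_with_prefix = avg_first_n([9] * 1000, [1, 0], 10**7)
a_plain       = avg_first_n([],         [1, 0], 10**7)
```

Both averages approach 1/2, so membership in Φc (for any c) is unaffected by the prefix.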
Notation. For ε > 0, an objective Φ for player 1 and the complementary objective Ω \ Φ for
player 2, we denote by Σε(Φ) and Πε(Ω \ Φ) the sets of ε-optimal strategies for player 1 and
player 2, respectively. Note that the quantitative determinacy of concurrent games is
equivalent to the existence of ε-optimal strategies for objective Φ for player 1 and Ω \ Φ for
player 2, for all ε > 0, at all states s ∈ S, i.e., for all ε > 0, Σε(Φ) ≠ ∅ and Πε(Ω \ Φ) ≠ ∅
(Corollary 1). We refer to the analysis of computing the limit-sure winning states (the set of
states s such that Val1(Φ)(s) = 1) and ε-limit-sure winning strategies (ε-optimal strategies
for the limit-sure winning states) as the qualitative analysis of objective Φ. We refer to the
analysis of computing the values and the ε-optimal strategies as the quantitative analysis of
objective Φ.
3.1.1 Completeness of limit-average objectives
Borel hierarchy. For a (possibly infinite) alphabet A, let Aω and A∗ denote
the sets of infinite and finite words over A, respectively. The finite Borel hierarchy
(Σ⁰₁, Π⁰₁), (Σ⁰₂, Π⁰₂), (Σ⁰₃, Π⁰₃), . . . is defined as follows:
• Σ⁰₁ = {W · Aω | W ⊆ A∗} is the set of open sets;
• for all n ≥ 1, Π⁰ₙ = {Aω \ L | L ∈ Σ⁰ₙ} consists of the complements of sets in Σ⁰ₙ;
• for all n ≥ 1, Σ⁰ₙ₊₁ = {⋃_{i∈N} Li | ∀i ∈ N. Li ∈ Π⁰ₙ} is the set obtained by countable
unions of sets in Π⁰ₙ.
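As a standard illustration of where a familiar objective sits in this hierarchy (this placement is classical, not taken from the text), the Büchi objective is a countable intersection of open sets and hence lies in Π⁰₂:

```latex
% Buchi(T): visit T infinitely often.
\mathrm{Buchi}(T)
  \;=\; \bigcap_{i \ge 0} \; \bigcup_{j \ge i} \;
        \{\, \omega \in S^{\omega} \mid \omega_j \in T \,\}
  \;\in\; \Pi^0_2,
% since each inner union is open (membership in it is witnessed by a
% finite prefix), i.e., a set in \Sigma^0_1, and the outer intersection
% is countable.
```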
Definition 1 (Wadge game) Let A and B be two (possibly infinite) alphabets. Let X ⊆
Aω and Y ⊆ Bω. The Wadge game GW(X, Y) is a two-player game between player 1
and player 2, played as follows. Player 1 first chooses a letter a0 ∈ A and then player 2 chooses a
(possibly empty) finite word b0 ∈ B∗; then player 1 chooses a letter a1 ∈ A and player 2
chooses a word b1 ∈ B∗, and so on. The play consists of player 1 writing a word wX = a0a1 . . .
and player 2 writing wY = b0b1 . . . . Player 2 wins if and only if both wY is infinite and
wX ∈ X iff wY ∈ Y.
Definition 2 (Wadge reduction) Given alphabets A and B, a set X ⊆ Aω is Wadge
reducible to a set Y ⊆ Bω, denoted as X ≤W Y , if and only if there exists a continuous
function f : Aω → Bω such that X = f−1(Y ). If X ≤W Y and Y ≤W X, then X and Y
are Wadge equivalent and we denote this by X ≡W Y .
The notion of strategies in Wadge games and winners is defined similarly to the
notion of games on graphs. The Wadge games and Wadge reduction are related by the
following result.
Proposition 3 ([Wad84]) Player 2 has a winning strategy in the Wadge game GW (X,Y )
iff X ≤W Y .
Wadge equivalence preserves Borel hierarchy and defines the natural notion of
completeness.
Proposition 4 If X ≡W Y , then X and Y belong to the same level of Borel hierarchy.
Definition 3 A set Y ∈ Σ⁰ₙ (resp. Y ∈ Π⁰ₙ) is Σ⁰ₙ-complete (resp. Π⁰ₙ-complete) if and
only if X ≤W Y for all X ∈ Σ⁰ₙ (resp. X ∈ Π⁰ₙ).
Our goal is to show that the lim inf objectives (defined in Example 1) are Π⁰₃-hard.
We first present a few notations.
Notations. Let A be an alphabet and B = {b0, b1}. For a word w ∈ A∗ or w ∈ B∗ we
denote by len(w) the length of w. For an infinite word w, or a finite word w with len(w) ≥ k,
we denote by (w ↾ k) the prefix of length k of w. For a word w ∈ Bω, or w ∈ B∗ with
len(w) ≥ k, we define

avg(w ↾ k) = (number of b0’s in (w ↾ k)) / k,

i.e., the fraction of b0’s in (w ↾ k). For a finite word w we let avg(w) = avg(w ↾ len(w)). Let

Y = { w ∈ Bω | lim inf_{k→∞} avg(w ↾ k) = 1 }
  = ⋂_{i≥0} ⋃_{j≥0} ⋂_{k≥j} { w ∈ Bω | avg(w ↾ k) ≥ 1 − 1/i }.
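The averages avg(w ↾ k) are easy to compute for concrete words; the sketch below (encoding ours: b0 as 0, b1 as 1) shows that every prefix average of b0^ω is 1, so b0^ω ∈ Y, while the averages of (b0 b1)^ω tend to 1/2, so its lim inf is 1/2 and the word is not in Y.

```python
# Sketch: prefix averages avg(w |- k) for two sample words over B.
def avg(bits, k):
    """Fraction of b0's (encoded 0) among the first k letters."""
    return bits[:k].count(0) / k

all_b0      = [0] * 200          # the word b0^omega (a long prefix of it)
alternating = [0, 1] * 100       # the word (b0 b1)^omega
avgs_b0  = [avg(all_b0, k) for k in range(1, 201)]
avgs_alt = [avg(alternating, k) for k in range(1, 201)]
```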
Hardness of Y. We will show that Y is Π⁰₃-hard. To prove the result we consider an
arbitrary X ∈ Π⁰₃ and show that X ≤W Y. A set X ⊆ Aω in Π⁰₃ is obtained as a
countable intersection of countable unions of closed sets, i.e.,

X = ⋂_{i≥0} ⋃_{j≥0} ( Aj · (Fij)ω ),

where Fij ⊆ A, and Aj denotes the set of words of length j in A∗. We show such an X is
Wadge reducible to Y by showing that player 2 has a winning strategy in GW(X, Y). In
the reduction we will use the following notation: given a word w ∈ A∗, let

sat(w) = { i | ∃j ≥ 0. w ∈ Aj · (Fij)∗ };
d(w) = max{ l | ∀l′ ≤ l. l′ ∈ sat(w) } + 1.

For example, if sat(w) = {0, 1, 2, 4, 6, 7}, then d(w) = max{0, 1, 2} + 1 = 3. The play between
5.4 Optimal Strategy Construction for Streett Objectives
The algorithms (Algorithm 4 and the randomized algorithm) compute values for
both player 1 and player 2 (i.e., both for Rabin and Streett objectives), but only construct
an optimal strategy for player 1 (i.e., the player with the Rabin objective). Since pure
memoryless optimal strategies exist for the Rabin player, it is much simpler to analyze and
obtain the values and an optimal strategy for player 1. We now show how, once these
values have been computed, we can obtain an optimal strategy for the Streett player as well.
We do this by computing sure winning strategies in 2-player games with Streett objectives.
Given a 2½-player game G with Rabin objective Φ for player 1 and the comple-
mentary objective Ω \ Φ for player 2, we first compute Val1(Φ)(s) for all states s ∈ S. An
optimal strategy π∗ for player 2 is constructed as follows: for a value class VC(r), with r < 1,
obtain a sure winning strategy πr for player 2 in Tr2as(Trwin2(G ↾ VC(r))), and in VC(r) the
strategy π∗ follows the strategy Tr2as(πr). By Lemma 29, it follows that π∗ is an optimal
strategy, and given all values, the construction of π∗ requires n calls to a procedure for
solving 2-player games with Streett objectives.
Theorem 33 Let G be a 2½-player game graph with n states and m edges, and let Φ
and Ω \ Φ be a Rabin and a Streett objective, respectively, with d pairs. Given the values
Val1(Φ)(s) = 1 − Val2(Ω \ Φ)(s) for all states s of G, an optimal strategy π∗ for player 2 can be
constructed in time n · O(TwoPlStreettGame(n·d, m·d, d+1)), where TwoPlStreettGame(n·d,
m·d, d+1) is the time required by any algorithm for solving 2-player Streett games with n·d
states, m·d edges, and d+1 Streett pairs.
Discussion on parity games. We briefly discuss the special case of parity games, and
then summarize the results. For the special case of 2½-player games with parity objectives, an
improved strategy improvement algorithm (where the improvement step can be computed
in polynomial time) is given in [CH06a]. We summarize the complexity of strategies and
CHAPTER 5. STOCHASTIC RABIN AND STREETT GAMES 130
Table 5.1: Strategy complexity of 2½-player games and their sub-classes with ω-regular objec-
tives, where ΣPM denotes the family of pure memoryless strategies, ΣPF the family of pure
finite-memory strategies, and ΣM the family of randomized memoryless strategies.

Objectives            1-player    1½-player   2-player   2½-player
Reachability/Safety   ΣPM         ΣPM         ΣPM        ΣPM
Parity                ΣPM         ΣPM         ΣPM        ΣPM
Rabin                 ΣPM         ΣPM         ΣPM        ΣPM
Streett               ΣPF / ΣM    ΣPF / ΣM    ΣPF        ΣPF
Muller                ΣPF / ΣM    ΣPF / ΣM    ΣPF        ΣPF
Table 5.2: Computational complexity of 2½-player games and their sub-classes with ω-regular
Since Succ(s, a1, a2) ∩ (S \ Vi) ≠ ∅, we must have Succ(s, a1, a2) ∩ VC(Parity(p), > r) ≠ ∅,
where s ∈ VC(Parity(p), r), i.e., Succ(s, a1, a2) ∩ (⋃_{j<i} Vj) ≠ ∅. It follows that condition 1,
condition 2, and condition 3 hold. We now prove condition 4.
Let v(s) = Val2(coParity(p))(s) for s ∈ S. Observe that v(s) = 1 for all
s ∈ W2. Hence to show the desired result, it suffices to show that in the MDP
G = M(G, f, P, ξf, Eq, Pos, Neg), for all states s and all a ∈ Γ2(s) we have

v(s) ≥ Σ_{t∈S} v(t) · δ(s, a)(t).

The inequality is proved by considering the following cases.
• For s ∈ S and a = (a1, a2) ∈ Eq(s, f(s)): for all t ∈ Succ(s, a1, a2) we have v(s) = v(t)
(as s and t are in the same value class). It follows that v(s) = Σ_{t∈S} v(t) · δ(s, a)(t).
• For s ∈ S and a = (a1, a2) ∈ Pos(s, f(s)) we have

(1 − v(s)) ≤ Σ_{t∈S} (1 − v(t)) · δ(s, a1, a2)(t),

i.e., v(s) ≥ Σ_{t∈S} v(t) · δ(s, a1, a2)(t). In other words, we have v(s) ≥ Σ_{t∈S} v(t) · δ(s, a)(t).
• For s ∈ S and a = (⊥, a2) we have

Σ_{t∈S} v(t) · δ(s, a)(t) = Σ_{t∈S} Σ_{a1∈f(s)} v(t) · δ(s, a1, a2)(t) · ξf(s)(a1).

Since ξf is a locally optimal selector, for all s ∈ S and for all a2 ∈ Γ2(s) we have

v(s) ≥ Σ_{t∈S} Σ_{a1∈f(s)} v(t) · δ(s, a1, a2)(t) · ξf(s)(a1).

Thus we have v(s) ≥ Σ_{t∈S} v(t) · δ(s, a)(t).
CHAPTER 8. CONCURRENT PARITY GAMES 193
Hence the desired result follows.
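The inequality v(s) ≥ Σ_t v(t) · δ(s, a)(t) says that v is excessive (a supermartingale) for every move available in the MDP; such a condition is mechanical to check. The sketch below uses a toy MDP and candidate value functions of our own devising, not the construction M(G, f, P, ξf, Eq, Pos, Neg) of the text.

```python
# Sketch: checking v(s) >= sum_t v(t) * delta(s, a)(t) at every move.
delta = {  # delta[s][a] -> list of (probability, successor)
    "u": {"stay": [(1.0, "u")], "risk": [(0.6, "u"), (0.4, "w")]},
    "w": {"stay": [(1.0, "w")]},
}

def is_excessive(v, delta, tol=1e-12):
    return all(v[s] + tol >= sum(p * v[t] for p, t in succ)
               for s, actions in delta.items()
               for succ in actions.values())

good_v = {"u": 0.4, "w": 0.0}   # satisfies the inequality at every move
bad_v  = {"u": 0.0, "w": 0.5}   # violated at state "u", action "risk"
```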
Algorithm. Given a concurrent game structure G, a parity objective Parity(p), a rational α,
and a state s, to decide whether Val1(Parity(p))(s) ≥ α, it is sufficient (and possible by
Lemma 53) to guess (P, f, ξ, Eq, Pos, Neg) such that P = (V0, V1, . . . , Vk) is a partition of
the state space, f : S → 2^A \ {∅} is a function such that f(s) ⊆ Γ1(s) for all s ∈ S, ξ is a selector
such that for all s ∈ S we have Supp(ξ(s)) = f(s), and the following conditions hold:
1. V0 = Limit1(Parity(p)), and Vk = Limit2(coParity(p));
2. for all 1 ≤ i ≤ k and all s ∈ S ∩ Vi we have
(a) for all (a1, a2) ∈ Eq(s, f) we have Succ(s, a1, a2) ⊆ Vi;
(b) for all (a1, a2) ∈ Pos(s, f) we have Succ(s, a1, a2) ∩ (⋃_{j<i} Vj) ≠ ∅;
(c) for all a2 ∈ Γ2(s), if Succ(s, ξ, a2) ∩ (S \ Vi) ≠ ∅, then Succ(s, ξ, a2) ∩ (⋃_{j<i} Vj) ≠ ∅
(also observe that it suffices to verify the condition for the selector ξU that at s
plays all actions in f(s) uniformly at random, instead of the selector ξ);
3. for all 1 ≤ i ≤ k − 1, every state s is limit-sure winning in (Gi, pi) =
QRS(G, Vi, f, ξU, Eq, Pos, Neg, p), where ξU is a selector that at a state s plays all
moves in f(s) uniformly at random; and
4. 1 − α ≥ v(f, P, ξ, Eq, Pos, Neg, Vk)(s).
Observe that in each Gi we need to verify that s is limit-sure winning; since limit-sure winning
in concurrent games does not depend on the precise transition probabilities (Lemma 50),
it is sufficient to verify the condition with ξU instead of ξ. The guess of the partition P and
of f is polynomial in the size of the game, and the guess of ξ will be obtained by a sentence in
the theory of reals. Once P and f are guessed, step 1 and step 2 can be achieved in PSPACE
(since for concurrent games whether a state s ∈ Limit1(Parity(p)) can be decided in NP ∩
coNP [dAH00]). We now present a sentence in the existential theory of the real-closed field
(the sub-class of the theory of the real-closed field where only existential quantifiers are
used) for the guessed ξ, to verify the last condition:
∃x. ∃v.
  ⋀_{s∈S, a1∈Γ1(s)} ( xs(a1) ≥ 0 )  ∧  ⋀_{s∈S} ( Σ_{a1∈Γ1(s)} xs(a1) = 1 )
∧ ⋀_{s∈S, a1∈f(s)} ( xs(a1) > 0 )  ∧  ⋀_{s∈S, a1∉f(s)} ( xs(a1) = 0 )
∧ ⋀_{s∈V0} ( v(s) = 0 )  ∧  ⋀_{s∈Vk} ( v(s) = 1 )
∧ ⋀_{1≤i≤k} ⋀_{s,t∈Vi} ( v(s) = v(t) )  ∧  ⋀_{1≤i<j≤k} ⋀_{s∈Vi, t∈Vj} ( v(s) > v(t) )
∧ ⋀_{s∈S\(V0∪Vk), a2∈Γ2(s)} ( v(s) ≥ Σ_{t∈S} Σ_{a1∈f(s)} v(t) · δ(s, a1, a2)(t) · xs(a1) )
∧ ⋀_{s∈S\(V0∪Vk), (a1,a2)∈Pos(s,f)} ( v(s) ≥ Σ_{t∈S} v(t) · δ(s, a1, a2)(t) )
∧ ( v(s) ≤ 1 − α ).
The first line of constraints ensures that x is a selector, and the second line of constraints
ensures that x is a selector with support f(s) for all states s ∈ S. The third line ensures
that v(s) is defined correctly for states in V0 and Vk. The fourth line ensures that
for all 1 ≤ i ≤ k and s, t ∈ Vi, the value is the same at s and t, and that if s ∈ Vi and
t ∈ Vj with i < j, then the value at s is greater than the value at t. The next two lines
present the inequality constraints that guarantee that with x as the selector we
have v(f, P, x, Eq, Pos, Neg, Vk)(s) ≤ v(s). The last constraint specifies that v(s) ≤ 1 − α.
Since the existential theory of the reals is decidable in PSPACE [Can88], we obtain an NPSPACE
algorithm to decide whether Val1(Parity(p))(s) ≥ α. Since NPSPACE = PSPACE, there is
a PSPACE algorithm to decide whether Val1(Parity(p))(s) ≥ α. By applying the binary-
search technique (as for Algorithm 7) we can approximate the value to a precision ε, for
ε > 0, applying the decision procedure log(1/ε) times. Thus we have the following result.
Theorem 44 (Computational complexity) Given a concurrent game structure G and
a parity objective Parity(p), a state s of G, rational ε > 0 and a rational α the following
assertions hold.
1. (Decision problem). Whether Val1(Parity(p))(s) ≥ α can be decided in PSPACE.
2. (Approximation problem). An interval [l, u] such that u − l ≤ 2ε and
Val1(Parity(p))(s) ∈ [l, u] can be computed in PSPACE.
The previous best known algorithm for approximating the values is triple exponential in
the size of the game graph and logarithmic in 1/ε [dAM01].
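The binary-search step used above is generic: a decision oracle "is the value ≥ α?" is called about log₂(1/ε) times to shrink [l, u] from [0, 1] down to width ≤ ε with the value trapped inside. In the sketch below the oracle is a stand-in for the PSPACE decision procedure, with an arbitrarily chosen hidden value.

```python
# Sketch: epsilon-approximation from a threshold decision oracle.
def approximate(decide, eps):
    l, u, calls = 0.0, 1.0, 0
    while u - l > eps:
        mid = (l + u) / 2
        if decide(mid):        # oracle: Val1(Parity(p))(s) >= mid ?
            l = mid
        else:
            u = mid
        calls += 1
    return (l, u), calls

true_val = 0.3217              # hidden value, for illustration only
(l, u), calls = approximate(lambda alpha: true_val >= alpha, 1e-4)
```

The loop invariant l ≤ Val ≤ u holds throughout, and the number of oracle calls is ⌈log₂(1/ε)⌉.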
Strategy complexity. Lemma 52 and Lemma 53 show that witnesses for perennial ε-
optimal strategies can be obtained by “stitching” (composing together) limit-sure winning
strategies and locally optimal selectors across value classes. This characterization, along with
results on the structure of limit-sure winning strategies, yields Theorem 45. From the results
of [dAH00] it follows that there are limit-sure winning strategies that coincide in the limit with a
memoryless selector σℓ such that Supp(σℓ) is the set of least-rank actions of the limit-sure
witness. The witness construction of ε-optimal strategies we presented extends this result
from limit-sure winning strategies to ε-optimal strategies (Theorem 45). Theorem 45 states
that there exist ε-optimal strategies that in the limit coincide with a locally optimal selector, i.e.,
with a memoryless strategy given by locally optimal selectors. This parallels the results of Mertens-
Neyman [MN81] for concurrent games with limit-average objectives.
Theorem 45 (Limit of ε-optimal strategies) For every ε > 0 and for all parity objectives
Φ there exist ε-optimal strategies σε such that the sequence of strategies σε converges to
a memoryless strategy σ with a locally optimal selector as ε → 0, i.e., limε→0 σε = σ, where
σ ∈ Σℓ(Φ) and σ is memoryless.
Complexity of concurrent ω-regular games. The complexity results for concurrent
games with the sure winning criterion follow from the results for 2-player games. Given a
concurrent game of size |G| and a parity objective with d priorities, the almost-sure and
limit-sure winning states can be computed in time O(|G|^{d+1}), and membership in the
almost-sure and limit-sure winning sets can be decided in NP ∩ coNP [dAH00]. We established that the
values of concurrent games with parity objectives can be approximated within ε-precision
in EXPTIME, for ε > 0. A concurrent game with a Rabin or Streett objective with d pairs
can be solved by transforming it to a game exponential in the size of the original game, with
a parity objective of O(d) priorities: the reduction is achieved using an index appearance record
(IAR) construction [Tho95], which is an adaptation of the latest appearance record (LAR)
construction of [GH82]. This conversion, along with the qualitative analysis of concurrent
games with parity objectives, shows that the almost-sure and limit-sure winning states of
concurrent games with Rabin and Streett objectives can be computed in EXPTIME. Moreover,
the conversion of concurrent games with Rabin and Streett objectives to concurrent games
with parity objectives, together with the quantitative analysis of concurrent games with
parity objectives, yields an EXPSPACE bound for computing values within ε-precision for
concurrent games with Rabin and Streett objectives. We summarize the results on strategy
and computational complexity in Table 8.1 and Table 8.2.
8.2 Conclusion
In this chapter we studied the complexity of concurrent games with parity objec-
tives, and as a consequence also obtained improved complexity results for concurrent games
with Rabin, Streett and Muller objectives. The interesting open problems are as follows:
1. The known lower bounds for the computation of almost-sure and limit-sure winning sets for
concurrent games with Rabin and Streett objectives are NP-hardness and coNP-hardness,
Table 8.1: Strategy complexity of concurrent games with ω-regular objectives, where ΣPM
denotes the family of pure memoryless strategies, ΣM the family of randomized memoryless
strategies, and ΣHI the family of randomized history-dependent, infinite-memory strategies.

Objectives     Sure    Almost-sure   Limit-sure   ε-optimal
Safety         ΣPM     ΣPM           ΣPM          ΣM
Reachability   ΣPM     ΣM            ΣM           ΣM
coBüchi        ΣPM     ΣM            ΣM           ΣM
Büchi          ΣPM     ΣM            ΣHI          ΣHI
Parity         ΣPM     ΣHI           ΣHI          ΣHI
Rabin          ΣPM     ΣHI           ΣHI          ΣHI
Streett        ΣPF     ΣHI           ΣHI          ΣHI
Muller         ΣPF     ΣHI           ΣHI          ΣHI
Table 8.2: Computational complexity of concurrent games with ω-regular objectives.

Objectives     Sure              Almost-sure   Limit-sure   Values
Safety         PTIME             PTIME         PTIME        PSPACE
Reachability   PTIME             PTIME         PTIME        PSPACE
coBüchi        PTIME             PTIME         PTIME        PSPACE
Büchi          PTIME             PTIME         PTIME        PSPACE
Parity         NP ∩ coNP         NP ∩ coNP     NP ∩ coNP    PSPACE
Rabin          NP-complete       EXPTIME       EXPTIME      EXPSPACE
Streett        coNP-complete     EXPTIME       EXPTIME      EXPSPACE
Muller         PSPACE-complete   EXPTIME       EXPTIME      EXPSPACE
respectively (inherited from the special case of 2-player games). The upper bounds are
EXPTIME, and it is open whether the problems are NP-complete and coNP-complete for
concurrent games with Rabin and Streett objectives, respectively.
2. As stated above, the lower bounds for the quantitative analysis of concurrent games with
Rabin and Streett objectives are NP-hardness and coNP-hardness, respectively. The upper
bounds are EXPSPACE. It remains open to obtain NP and coNP algorithms for concurrent
games with Rabin and Streett objectives, respectively, or even EXPTIME algorithms.
Chapter 9
Secure Equilibria and Applications
In this chapter we consider 2-player games with non-zero-sum objectives and
show their application in synthesis.¹ In 2-player non-zero-sum games, Nash equilibria capture
the options for rational behavior if each player attempts to maximize her payoff. In contrast
to classical game theory, we consider lexicographic objectives: first, each player tries to
maximize her own payoff, and then the player tries to minimize the opponent’s payoff.
Such objectives arise naturally in the verification of systems with multiple components.
There, instead of proving that each component satisfies its specification no matter how
the other components behave, it sometimes suffices to prove that each component satisfies
its specification provided that the other components satisfy their specifications. We say
that a Nash equilibrium is secure if it is an equilibrium with respect to the lexicographic
objectives of both players. We prove that in graph games with Borel objectives there
may be several Nash equilibria, but there is always a unique maximal payoff profile of a
secure equilibrium. We show how this equilibrium can be computed in the case of ω-regular
winning conditions. We then study the problem of synthesis of two independent processes,
each with its own specification, and show how the notion of secure equilibria generalizes
¹This chapter contains results from [CHJ04, CH07].
the assume-guarantee style of reasoning in a game theoretic framework and leads to a more
appropriate formulation of the synthesis problem.
9.1 Non-zero-sum Games
We consider 2-player non-zero-sum games, i.e., non-strictly competitive games. A
possible behavior of the two players is captured by a strategy profile (σ, π), where σ is
a strategy of player 1 and π is a strategy of player 2. Classically, the behavior (σ, π) is
considered rational if the strategy profile is a Nash equilibrium [Jr50], that is, if neither
player can increase her payoff by unilaterally changing her strategy. Formally, let v_1^{σ,π} be
the payoff of player 1 if the strategies (σ, π) are played, and let v_2^{σ,π} be the corresponding
payoff of player 2. Then (σ, π) is a Nash equilibrium if (1) v_1^{σ,π} ≥ v_1^{σ′,π} for all player-1
strategies σ′, and (2) v_2^{σ,π} ≥ v_2^{σ,π′} for all player-2 strategies π′. Nash equilibria formalize a
notion of rationality which is strictly internal: each player cares about her own payoff but
does not in the least care (cooperatively or adversarially) about the other player’s payoff.
Choosing among Nash equilibria. A classical problem is that many games have multiple
Nash equilibria, and some of them may be preferable to others. For example, one might
partially order the equilibria by (σ, π) ⪰ (σ′, π′) if both v_1^{σ,π} ≥ v_1^{σ′,π′} and v_2^{σ,π} ≥ v_2^{σ′,π′}. If a
unique maximal Nash equilibrium exists in this order, then it is preferable for both players.
However, maximal Nash equilibria may not be unique. In such cases external criteria, such
as the sum of the payoffs of both players, have been used to evaluate different rational
behaviors [Kre90, Owe95, vNM47]. These external criteria, which are based on a single
preference order on strategy profiles, are usually cooperative, in that they capture social
aspects of rational behavior. We define and study, instead, an adversarial external criterion
for rational behavior. Put simply, we assume that each player attempts to minimize the
other player’s payoff as long as, by doing so, she does not decrease her own payoff. This
yields two different preference orders on strategy profiles, one for each player. Among two
strategy profiles (σ, π) and (σ′, π′), player 1 prefers (σ, π), denoted (σ, π) ⪰1 (σ′, π′), if
either v_1^{σ,π} > v_1^{σ′,π′}, or both v_1^{σ,π} = v_1^{σ′,π′} and v_2^{σ,π} ≤ v_2^{σ′,π′}. In other words, the preference
order ⪰1 of player 1 is lexicographic: the primary goal of player 1 is to maximize her own
payoff; the secondary goal is to minimize the opponent’s payoff. The preference order ⪰2 of
player 2 is defined symmetrically. We refer to rational behaviors under these lexicographic
objectives as secure equilibria. (We do not know how to uniformly translate all games
with lexicographic preference orders to games with a single objective for each player, such
that the Nash equilibria of the translated games correspond to the secure equilibria of the
original games.)
Secure equilibria. The two orders ⪰1 and ⪰2 on strategy profiles, which express the
preferences of the two players, induce the following refinement of the notion of Nash equi-
librium: a strategy profile (σ, π) is a secure equilibrium if (1) (v_1^{σ,π}, v_2^{σ,π}) ⪰1 (v_1^{σ′,π}, v_2^{σ′,π}) for
all player-1 strategies σ′, and (2) (v_1^{σ,π}, v_2^{σ,π}) ⪰2 (v_1^{σ,π′}, v_2^{σ,π′}) for all player-2 strategies π′.
Note that every secure equilibrium is a Nash equilibrium, but a Nash equilibrium need not
be secure. The name “secure” equilibrium derives from the following equivalent character-
ization. We say that a strategy profile (σ, π) is secure if any rational deviation of player 2,
i.e., a deviation that does not decrease her payoff, will not decrease the payoff of player 1,
and, symmetrically, any rational deviation of player 1 will not decrease the payoff of player 2.
Formally, a strategy profile (σ, π) is secure if for all player-2 strategies π′, if v_2^{σ,π′} ≥ v_2^{σ,π} then
v_1^{σ,π′} ≥ v_1^{σ,π}, and for all player-1 strategies σ′, if v_1^{σ′,π} ≥ v_1^{σ,π} then v_2^{σ′,π} ≥ v_2^{σ,π}. The secure
profile (σ, π) can thus be interpreted as a contract between the two players which enforces
cooperation: any unilateral selfish deviation by one player cannot put the other player at a
disadvantage if she follows the contract. It is not difficult to show that a strategy profile is
a secure equilibrium iff it is both a secure profile and a Nash equilibrium. Thus, the secure
equilibria are those Nash equilibria which represent enforceable contracts between the two
players.
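The definitions of Nash and secure equilibria can be checked mechanically on a small matrix game; the sketch below does so for pure strategy profiles of a two-by-two game. The payoff matrices are an arbitrary illustration of ours (not from the text), chosen so that both (0, 0) and (1, 1) are Nash equilibria but only (1, 1) is secure.

```python
# Sketch: classifying pure profiles as Nash and/or secure equilibria.
from itertools import product

p1 = [[1, 0], [1, 1]]    # p1[i][j]: player 1's payoff at profile (i, j)
p2 = [[1, 1], [0, 1]]    # p2[i][j]: player 2's payoff at profile (i, j)

def nash(i, j):
    return all(p1[i2][j] <= p1[i][j] for i2 in (0, 1)) and \
           all(p2[i][j2] <= p2[i][j] for j2 in (0, 1))

def secure(i, j):
    # rational deviations of player 2 never hurt player 1, and vice versa
    ok2 = all(p2[i][j2] < p2[i][j] or p1[i][j2] >= p1[i][j] for j2 in (0, 1))
    ok1 = all(p1[i2][j] < p1[i][j] or p2[i2][j] >= p2[i][j] for i2 in (0, 1))
    return ok2 and ok1

nash_profiles = [ij for ij in product((0, 1), (0, 1)) if nash(*ij)]
secure_equilibria = [ij for ij in nash_profiles if secure(*ij)]
```

At (0, 0), player 2 can switch columns without losing anything while dropping player 1's payoff to 0, so that Nash equilibrium is not secure; at (1, 1) neither player has such a rational harmful deviation.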
Motivation: verification of component-based systems. The motivation for our def-
initions comes from verification. There, one would like to prove that a component of a
system (player 1) can satisfy a specification no matter how the environment (player 2)
behaves [AHK02]. Classically, this is modeled as a strictly competitive (zero-sum) game,
where the environment’s objective is the complement of the component’s objective. How-
ever, the zero-sum model is often overly conservative, as the environment itself typically
consists of components, each with its own specification (i.e., objective). Moreover, the in-
dividual component specifications are usually not complementary; a common example is
that each component must maintain a local invariant. So a more appropriate approach is
to prove that player 1 can meet her objective no matter how player 2 behaves as long as
player 2 does not sabotage her own objective. In other words, classical correctness proofs
of a component assume absolute worst-case behavior of the environment, while it would
suffice to assume only relative worst-case behavior of the environment —namely, relative
to the assumption that the environment itself is correct (i.e., meets its specification). Such
relative worst-case reasoning, called assume-guarantee reasoning [AL95, AH99, NAT03], so
far has not been studied in the natural setting offered by game theory.
Existence and uniqueness of maximal secure equilibria. We will see that in general
games, such as matrix games, there may be multiple secure equilibrium payoff profiles, even
several incomparable maximal ones. We show that for 2-player games with Borel objectives,
which may have multiple maximal Nash equilibria, there always exists a unique maximal
secure equilibrium payoff profile. In other words, in graph games with Borel objectives
there is a compelling notion of rational behavior for each player, which is (1) a classical
Nash equilibrium, (2) an enforceable contract (“secure”), and (3) a guarantee of maximal
payoff for each player among all behaviors that achieve (1) and (2).
Figure 9.1: A graph game with reachability objectives.
Examples. Consider the game graph shown in Fig. 9.1. Player 1 chooses the successor
node at square nodes and her objective is to reach the target s4. Player 2 chooses the
successor node at diamond nodes and her objective is to reach s3 or s4, also a reachability
objective. There are two player 1 strategies: the strategy σ1 chooses the move s0 → s1,
and σ2 chooses s0 → s2. There are also two player 2 strategies: the strategy π1 chooses
s1 → s3, and π2 chooses s1 → s4. The strategy profile (σ1, π1) leads the game into s3 and
therefore gives the payoff profile (0,1), indicating that player 1 loses and player 2 wins (i.e.,
only player 2 reaches her target). The strategy profiles (σ1, π2), (σ2, π1), and (σ2, π2) give the payoffs (1,1), (0,0), and (0,0), respectively. The profiles (σ1, π1), (σ1, π2), and (σ2, π1) are Nash equilibria, but (σ2, π2) is not: player 1 can improve her payoff from 0 to 1 by switching to σ1.
For example, in (σ1, π1) player 1 does not have an incentive to switch to strategy σ2 (which
would still give her payoff 0), and neither does player 2 have an incentive to switch to π2
(she is already getting payoff 1). However, the strategy profile (σ1, π1) is not a secure equilibrium, because player 1 can lower player 2's payoff (from 1 to 0) without changing her own payoff by switching to strategy σ2. Similarly, the strategy profile (σ1, π2) is not secure, because player 2 can lower player 1's payoff without changing her own payoff by switching to π1. So if both players, in addition to maximizing their own payoff, also attempt to minimize the opponent's payoff, then the resulting payoff profile is unique, namely, (0,0).
In other words, in this game, the only rational behavior for both players is to deny each
other’s objectives.
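The classification of these four profiles can be checked mechanically. The following Python sketch is not from the text: it transcribes the payoff table of the Fig. 9.1 game (strategy names "s1", "s2", "p1", "p2" are shorthand for σ1, σ2, π1, π2) and tests the Nash and security conditions directly.

```python
from itertools import product

# Payoff profiles (player 1, player 2) for the game of Fig. 9.1.
payoffs = {
    ("s1", "p1"): (0, 1),
    ("s1", "p2"): (1, 1),
    ("s2", "p1"): (0, 0),
    ("s2", "p2"): (0, 0),
}
S1, S2 = ("s1", "s2"), ("p1", "p2")

def is_nash(s, p):
    """No unilateral deviation improves the deviator's own payoff."""
    v1, v2 = payoffs[(s, p)]
    return (all(payoffs[(t, p)][0] <= v1 for t in S1)
            and all(payoffs[(s, q)][1] <= v2 for q in S2))

def is_secure(s, p):
    """A rational deviation of one player must not decrease the other's payoff."""
    v1, v2 = payoffs[(s, p)]
    ok1 = all(not (payoffs[(t, p)][0] >= v1 and payoffs[(t, p)][1] < v2) for t in S1)
    ok2 = all(not (payoffs[(s, q)][1] >= v2 and payoffs[(s, q)][0] < v1) for q in S2)
    return ok1 and ok2

secure_eq = [sp for sp in product(S1, S2) if is_nash(*sp) and is_secure(*sp)]
```

Enumerating the four profiles reproduces the discussion above: only (σ2, π1), with payoff profile (0,0), is both a Nash equilibrium and secure.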
Figure 9.2: A graph game with Büchi objectives.
This is not always the case: sometimes it is beneficial for both players to cooperate
to achieve their own objectives, with the result that both players win. Consider the game
graph shown in Fig. 9.2. Both players have Büchi objectives: player 1 (square) wants to visit s0 infinitely often, and player 2 (diamond) wants to visit s4 infinitely often. If player 1 always chooses s1 → s0 and player 2 always chooses s2 → s4, then both players win. This Nash equilibrium is also secure: if player 1 deviates by choosing s1 → s2, then player 2 can "retaliate" by choosing s2 → s3; similarly, if player 2 deviates by choosing s2 → s0, then player 1 can retaliate by s0 → s3. It follows that for purely selfish motives (and not some
social reason), both players have an incentive to cooperate to achieve the maximal secure
equilibrium payoff (1,1).
Outline and results. We first define the notion of secure equilibrium and give several
interpretations through alternative definitions. We then prove the existence and uniqueness
of maximal secure equilibria in graph games with Borel objectives. The proof is based on
the following classification of strategies. A player 1 strategy is called strongly winning if
it ensures that player 1 wins and player 2 loses (i.e., the outcome of the game satisfies
ϕ1 ∧ ¬ϕ2). A player 1 strategy is a retaliating strategy if it ensures that if player 2 wins,
then player 1 wins (i.e., the outcome satisfies ϕ2 → ϕ1). In other words, a retaliating
strategy for player 1 ensures that if player 2 causes player 1 to lose, then player 2 will
lose too. If both players follow retaliating strategies (σ, π), they may both win —in this
case, we say that (σ, π) is a winning pair of retaliating strategies— or they may both lose.
We show that at every node of a graph game with Borel objectives, either one of the two
players has a strongly winning strategy, or there is a pair of retaliating strategies. Based
on this insight, we give an algorithm for computing the secure equilibria in graph games
in the case that both players’ objectives are ω-regular. We then consider the problem of
synthesizing two independent processes, each with its own specification; we show that secure equilibria generalize the assume-guarantee style of reasoning in a game-theoretic framework, and we present an appropriate formulation of the synthesis problem.
9.2 Secure Equilibria
In a secure game the objective of player 1 is to maximize her own payoff and then
minimize the payoff of player 2. Similarly, player 2 maximizes her own payoff and then
minimizes the payoff of player 1. We want to determine the best payoff that each player can
ensure when both players play according to these preferences. We formalize this as follows.
A strategy profile (σ, π) is a pair of strategies, where σ is a player 1 strategy and π is a
player 2 strategy. The strategy profile (σ, π) gives rise to a payoff profile (v^{σ,π}_1, v^{σ,π}_2), where v^{σ,π}_1 is the payoff of player 1 if the two players follow the strategies σ and π, respectively, and v^{σ,π}_2 is the corresponding payoff of player 2. We define the player 1 preference order ≺1 and the player 2 preference order ≺2 on payoff profiles lexicographically: (v1, v2) ≺1 (v′1, v′2) iff either v1 < v′1, or both v1 = v′1 and v2 > v′2; the order ≺2 is defined symmetrically, with the roles of the two components exchanged.
If the two objectives ϕ1 and ϕ2 are ω-regular, then we obtain the following corol-
lary.
Corollary 8 (Computational complexity) Let n be the size of the game graph G.
• If ϕ1 and ϕ2 are parity objectives specified by priority functions, then the decision problem whether a given state lies in W10, or in W01, is coNP-complete; whether a given state lies in W11, or in W00, can be decided in NP. The four sets W10, W01, W11, and W00 can be computed in time O(n^{d+1} · d!), where d is the maximal number of priorities in the two priority functions.

• If the two objectives ϕ1 and ϕ2 are specified as LTL (linear temporal logic) formulas, then deciding W10, W01, W11, and W00 is 2EXPTIME-complete. The four sets can be computed in time O(n^{2^ℓ} · 2^{2^ℓ·log ℓ}), where ℓ is the sum of the lengths of the two formulas.
Proof. If the objectives ϕ1 and ϕ2 are parity objectives, and d is the maximal number
of priorities in the two priority functions, then the conjunctions ϕ1 ∧ ¬ϕ2, ϕ2 ∧ ¬ϕ1 and
ϕ1 ∧ ϕ2 can be expressed as Streett objectives [Tho97] with d pairs. The decision problem
for zero-sum games with Streett objectives is in co-NP [EJ88], the model-checking problem
for Streett objectives can be solved in polynomial time, and zero-sum games with Streett
objectives with d pairs can be solved in time O(n^{d+1} · d!) [PP06]. It follows that, for a given state s, whether s ∈ W10 and whether s ∈ W01 can be decided in coNP, and whether s ∈ A for A = S \ (W01 ∪ W10) can be decided in NP. Given the set A, whether s ∈ W11 and whether s ∈ W00 can be decided in polynomial time, by solving a model-checking problem with Streett objectives. It follows from the results of [CHP07] that deciding the winner of a game with a conjunction of two parity objectives is coNP-hard; hence coNP-completeness follows. This establishes the first part of the corollary.
Since the decision problem for zero-sum games with LTL objectives is 2EXPTIME-
complete [PR89], the 2EXPTIME lower bound is immediate. We obtain the matching upper
bound as follows. Let ℓ be the sum of the lengths of the two LTL formulas ϕ1 and ϕ2. LTL
formulas are closed under conjunction and negation, and hence ϕ1 ∧ ¬ϕ2 and ϕ2 ∧ ¬ϕ1
are LTL formulas of length ℓ + 2. An LTL formula of length ℓ can be converted into an
equivalent nondeterministic Büchi automaton of size 2^ℓ [VW86], and the nondeterministic Büchi automaton can be converted into an equivalent deterministic parity automaton of size 2^{2^ℓ·log ℓ} with 2^ℓ priorities [Saf88]. The problem then reduces to solving the zero-sum parity games obtained as the synchronous product of the game graph and the deterministic parity automaton. Since zero-sum parity games can be solved in time O(n^d) for game graphs of size n and parity objectives with d priorities [Tho97], the upper bound follows.
9.4 Assume-guarantee Synthesis
In this section we study the synthesis of two independent processes and show how secure equilibria are useful in this scenario. The classical synthesis problem for reactive
systems asks, given a proponent process A and an opponent process B, to refine A so
that the closed-loop system A||B satisfies a given specification Φ. The solution of this
problem requires the computation of a winning strategy for proponent A in a game against
opponent B. We define and study the co-synthesis problem, where the proponent A consists
itself of two independent processes, A = A1||A2, with specifications Φ1 and Φ2, and the goal
is to refine both A1 and A2 so that A1||A2||B satisfies Φ1∧Φ2. For example, if the opponent
B is a fair scheduler for the two processes A1 and A2, and Φi specifies the requirements of
mutual exclusion for Ai (e.g., starvation freedom), then the co-synthesis problem asks for
the automatic synthesis of a mutual-exclusion protocol.
We show that co-synthesis defined classically, with the processes A1 and A2 either
collaborating or competing, does not capture desirable solutions. Instead, the proper formu-
lation of co-synthesis is the one where process A1 competes with A2 but not at the price of
violating Φ1, and vice versa. We call this assume-guarantee synthesis and show that it can
be solved by computing secure-equilibrium strategies. In particular, from mutual-exclusion
requirements the assume-guarantee synthesis algorithm automatically computes Peterson’s
protocol.
We formally define the co-synthesis problem, using the automatic synthesis of a
mutual-exclusion protocol as a guiding example. More precisely, we wish to synthesize
two processes P1 and P2 so that the composite system P1||P2||R, where R is a scheduler
that arbitrarily but fairly interleaves the actions of P1 and P2, satisfies the requirements
of mutual exclusion and starvation freedom for each process. We show that traditional
zero-sum game-theoretic formulations, where P1 and P2 either collaborate against R, or
unconditionally compete, do not lead to acceptable solutions. We then show that for the
non-zero-sum game-theoretic formulation, where the two processes compete conditionally,
there exists a unique winning secure-equilibrium solution, which corresponds exactly to
Peterson’s mutual-exclusion protocol. In other words, Peterson’s protocol can be synthe-
sized automatically as the winning secure strategies of two players whose objectives are the
mutual-exclusion requirements. This is to our knowledge the first application of non-zero-
sum games in the synthesis of reactive processes. It is also, to our knowledge, the first
application of Nash equilibria —in particular, the special kind called “secure”— in system
design.
The new formulation of co-synthesis, with the two processes competing condition-
ally, is called assume-guarantee synthesis, because similar to assume-guarantee verification
(e.g., [AH99]), in attempting to satisfy her specification, each process makes the assumption
that the other process does not violate her own specification. The solution of the assume-
guarantee synthesis problem can be obtained by computing secure equilibria in 3-player
games, with the three players P1, P2, and R.
Process P1:
do
  flag[1]:=true; turn:=2;
  | while(flag[1]) nop;
  | while(flag[2]) nop;
  | while(turn=1) nop;
  | while(turn=2) nop;
  | while(flag[1] & turn=2) nop;
  | while(flag[1] & turn=1) nop;
  | while(flag[2] & turn=1) nop;
  | while(flag[2] & turn=2) nop;
  Cr1:=true;
  fin_wait;
  Cr1:=false;
  flag[1]:=false;
  wait[1]:=1;
  while(wait[1]=1)
  | nop;
  | wait[1]:=0;
while(true)

Process P2:
do
  flag[2]:=true; turn:=1;
  | while(flag[1]) nop; (C1)
  | while(flag[2]) nop; (C2)
  | while(turn=1) nop; (C3)
  | while(turn=2) nop; (C4)
  | while(flag[1] & turn=2) nop; (C5)
  | while(flag[1] & turn=1) nop; (C6)
  | while(flag[2] & turn=1) nop; (C7)
  | while(flag[2] & turn=2) nop; (C8)
  Cr2:=true;
  fin_wait;
  Cr2:=false;
  flag[2]:=false;
  wait[2]:=1;
  while(wait[2]=1)
  | nop; (C9)
  | wait[2]:=0; (C10)
while(true)

Figure 9.3: Mutual-exclusion protocol synthesis
9.4.1 Co-synthesis
In this section we define processes, refinement, schedulers, and specifications. We
consider the traditional co-operative [CE81] and strictly competitive [PR89, RW87] versions
of the co-synthesis problem; we refer to them as weak co-synthesis and classical co-synthesis,
respectively. We show the drawbacks of these formulations and then present a new formu-
lation of co-synthesis, namely, assume-guarantee synthesis.
Variables, valuations, and traces. Let X be a finite set of variables such that each variable
x ∈ X has a finite domain Dx. A valuation θ on X is a function θ : X → ⋃x∈X Dx that
assigns to each variable x ∈ X a value θ(x) ∈ Dx. We write Θ for the set of valuations on
X. A trace on X is an infinite sequence (θ0, θ1, θ2, . . .) ∈ Θω of valuations on X. Given a
valuation θ ∈ Θ and a subset Y ⊆ X of the variables, we denote by θ↾Y the restriction of the valuation θ to the variables in Y. Similarly, for a trace τ = (θ0, θ1, θ2, . . .) on X, we write τ↾Y = (θ0↾Y, θ1↾Y, θ2↾Y, . . .) for the restriction of τ to the variables in Y. The
restriction operator is lifted to sets of valuations, and to sets of traces.
Processes and refinement. For i ∈ {1, 2}, a process Pi = (Xi, δi) consists of a finite set Xi of variables and a nondeterministic transition function δi : Θi → 2^{Θi} \ {∅}, where Θi is the set of valuations on Xi. The transition function maps a present valuation to a nonempty set of possible successor valuations. We write X = X1 ∪ X2 for the set of variables of both processes; note that some variables may be shared by both processes. A refinement of process Pi = (Xi, δi) is a process P′i = (X′i, δ′i) such that (1) Xi ⊆ X′i, and (2) for all valuations θ′ on X′i, we have δ′i(θ′)↾Xi ⊆ δi(θ′↾Xi). In other words, the refined process P′i has possibly more variables than the original process Pi, and every possible update of the variables in Xi by P′i is a possible update by Pi. We write P′i ⪯ Pi to denote that P′i is a refinement of Pi. Given two refinements P′1 of P1 and P′2 of P2, we write X′ = X′1 ∪ X′2 for the set of variables of both refinements, and we denote the set of valuations on X′ by Θ′.
Schedulers. Given two processes P1 and P2, a scheduler R for P1 and P2 chooses at each
computation step whether it is process P1’s turn or process P2’s turn to update its variables.
Formally, the scheduler R is a function R : Θ∗ → {1, 2} that maps every finite sequence of global valuations (representing the history of a computation) to i ∈ {1, 2}, signaling that process Pi is next to update its variables. The scheduler R is fair if it assigns turns to both P1 and P2 infinitely often; i.e., for all traces (θ0, θ1, θ2, . . .) ∈ Θω, there exist infinitely many j ≥ 0 and infinitely many k ≥ 0 such that R(θ0, . . . , θj) = 1 and R(θ0, . . . , θk) = 2. Given two processes P1 = (X1, δ1) and P2 = (X2, δ2), a scheduler R for P1 and P2, and a start valuation θ0 ∈ Θ, the set of possible traces is [[(P1 || P2 || R)(θ0)]] = {(θ0, θ1, θ2, . . .) ∈ Θω | ∀j ≥ 0. if R(θ0, . . . , θj) = i, then θj+1↾(X \ Xi) = θj↾(X \ Xi) and θj+1↾Xi ∈ δi(θj↾Xi)}.
Note that during turns of one process Pi, the values of the private variables X \ Xi of the
other process remain unchanged.
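This interleaving semantics can be rendered as a small Python sketch. The two toy processes, their variables, and their transition functions below are invented for illustration; only the step rule (the scheduled process updates its own variables, everything else is left untouched) comes from the definition above.

```python
# One step of [[(P1 || P2 || R)(θ0)]]: the scheduler has picked process i,
# which updates its variables Xi; all other variables keep their values.

def restrict(theta, Y):
    return {x: v for x, v in theta.items() if x in Y}

def step(theta, i, procs):
    """All successor valuations when process i takes a turn."""
    Xi, delta_i = procs[i]
    succs = []
    for local in delta_i(restrict(theta, Xi)):
        theta2 = dict(theta)   # variables outside Xi are unchanged
        theta2.update(local)   # process i rewrites its own variables
        succs.append(theta2)
    return succs

# Toy processes: P1 owns {a, t}, P2 owns {b, t}; the variable t is shared.
procs = {
    1: ({"a", "t"}, lambda v: [{"a": 1 - v["a"], "t": v["t"]}]),
    2: ({"b", "t"}, lambda v: [{"b": v["b"] + 1, "t": 1 - v["t"]}]),
}
theta0 = {"a": 0, "b": 0, "t": 0}
```

For instance, when process 1 moves from theta0, the private variable b of process 2 is untouched, as required.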
Specifications. A specification Φi for processes Pi is a set of traces on X; that is, Φi ⊆
Θω. We consider only ω-regular specifications [Tho97]. We define boolean operations on
specifications using logical operators such as ∧ (conjunction) and → (implication).
Weak co-synthesis. In all formulations of the co-synthesis problem that we consider, the
input to the problem is given as follows: two processes P1 = (X1, δ1) and P2 = (X2, δ2), two
specifications Φ1 for process 1 and Φ2 for process 2, and a start valuation θ0 ∈ Θ. The weak
co-synthesis problem is defined as follows: do there exist two processes P′1 = (X′1, δ′1) and P′2 = (X′2, δ′2), and a valuation θ′0 ∈ Θ′, such that (1) P′1 ⪯ P1 and P′2 ⪯ P2 and θ′0↾X = θ0, and (2) for all fair schedulers R for P′1 and P′2, we have [[(P′1 || P′2 || R)(θ′0)]]↾X ⊆ (Φ1 ∧ Φ2).
Example 9 (Mutual-exclusion protocol synthesis) Consider the two processes shown
in Fig. 9.3. Process P1 (on the left) places a request to enter its critical section by setting
flag[1]:=true, and the entry of P1 into the critical section is signaled by Cr1:=true; and
similarly for process P2 (on the right). The two variables flag[1] and flag[2] are boolean,
and in addition, both processes may use a shared variable turn that takes two values 1 and
2. There are 8 possible conditions C1–C8 for a process to guard the entry into its critical
section.2 The figure shows all 8×8 alternatives for the two processes; any refinement without
additional variables will choose a subset of these. Process P1 may stay in its critical section
for an arbitrary finite amount of time (indicated by fin wait), and then exit by setting
Cr1:=false; and similarly for process P2. The while loop with the two alternatives C9
and C10 expresses the fact that a process may wait arbitrarily long (possibly infinitely long)
before a subsequent request to enter its critical section.
We use the notations □ and ◇ to denote always (safety) and eventually (reachability) specifications, respectively.² The specification for process P1 consists of two parts: a safety part Φ^{mutex}_1 = □¬(Cr1 = true ∧ Cr2 = true) and a liveness part Φ^{prog}_1 = □(flag[1] = true → ◇(Cr1 = true)). The first part Φ^{mutex}_1 specifies that both processes are not in their critical sections simultaneously (mutual exclusion); the second part Φ^{prog}_1 specifies that if process P1 wishes to enter its critical section, then it will eventually enter (starvation freedom). The specification Φ1 for process P1 is the conjunction of Φ^{mutex}_1 and Φ^{prog}_1. The specification Φ2 for process P2 is symmetric.

²Since a guard may check any subset of the three 2-valued variables, there are 256 possible guards; but all except 8 can be discharged immediately as not useful.
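These two specification patterns can be evaluated on ultimately periodic (lasso-shaped) traces. The Python sketch below is illustrative and not from the text: a trace is encoded as a finite prefix plus a loop repeated forever, and the variable names flag1, cr1, cr2 are ad hoc stand-ins for flag[1], Cr1, Cr2.

```python
# Checking □pred and □(trigger → ◇goal) on a lasso trace prefix·loop^ω,
# where each position is a dict of variable values.

def holds_always(pred, prefix, loop):
    """□pred: pred must hold at every position of prefix·loop^ω."""
    return all(pred(s) for s in prefix + loop)

def holds_response(trigger, goal, prefix, loop):
    """□(trigger → ◇goal) on prefix·loop^ω."""
    trace = prefix + loop
    loop_has_goal = any(goal(s) for s in loop)
    for i, s in enumerate(trace):
        if trigger(s):
            # ◇goal at position i: goal occurs later in the unrolling,
            # or anywhere in the loop (which repeats forever).
            if not (any(goal(t) for t in trace[i:]) or loop_has_goal):
                return False
    return True

mutex = lambda s: not (s["cr1"] and s["cr2"])
```

On a trace whose loop requests and then enters the critical section, both Φ^{mutex}_1 and Φ^{prog}_1 hold; a loop that requests but never enters violates the starvation-freedom part.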
The answer to the weak co-synthesis problem for Example 9 is “Yes.” A solution of the
weak co-synthesis formulation consists of two refinements P′1 and P′2 of the two given processes P1
and P2, such that the composition of the two refinements satisfies the specifications Φ1 and
Φ2 for every fair scheduler. One possible solution is as follows: in P ′1, the alternatives C4
and C10 are chosen, and in P ′2, the alternatives C3 and C10 are chosen. This solution is not
satisfactory, because process P1’s starvation freedom depends on the fact that process P2
requests to enter its critical section infinitely often. If P2 were to make only a single request
to enter its critical section, then the progress part of Φ1 would be violated.
Classical co-synthesis. The classical co-synthesis problem is defined as follows: do there exist two processes P′1 = (X′1, δ′1) and P′2 = (X′2, δ′2), and a valuation θ′0 ∈ Θ′, such that (1) P′1 ⪯ P1 and P′2 ⪯ P2 and θ′0↾X = θ0, and (2) for all fair schedulers R for P′1 and P′2, we have (a) [[(P′1 || P2 || R)(θ′0)]]↾X ⊆ Φ1 and (b) [[(P1 || P′2 || R)(θ′0)]]↾X ⊆ Φ2.
The answer to the classical co-synthesis problem for Example 9 is “No.” We will
argue later (in Example 10) why this is the case.
Assume-guarantee synthesis. We now present a new formulation of the co-synthesis
problem. The main idea is derived from the notion of secure equilibria. We refer to this
new formulation as the assume-guarantee synthesis problem; it is defined as follows: do
there exist two refinements P′1 = (X′1, δ′1) and P′2 = (X′2, δ′2), and a valuation θ′0 ∈ Θ′, such
Process P′1:
do
  flag[1]:=true; turn:=2;
  while (flag[2] & turn=2) nop;
  Cr1:=true;
  fin_wait;
  Cr1:=false;
  flag[1]:=false;
  wait[1]:=1;
  while(wait[1]=1)
  | nop;
  | wait[1]:=0;
while(true)

Process P′2:
do
  flag[2]:=true; turn:=1;
  while (flag[1] & turn=1) nop; (C8+C6)
  Cr2:=true;
  fin_wait;
  Cr2:=false;
  flag[2]:=false;
  wait[2]:=1;
  while(wait[2]=1)
  | nop; (C9)
  | wait[2]:=0; (C10)
while(true)
Figure 9.4: Peterson’s mutual-exclusion protocol
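The mutual-exclusion half of this claim can be checked by exhaustive state exploration. The following Python sketch is my own simplified model, not from the text: each process of Fig. 9.4 is transcribed as a three-location machine (request, busy-wait, critical section), the request's two assignments are taken as one atomic step, and a breadth-first search verifies that no reachable state has both processes in their critical sections.

```python
from collections import deque

def successors(s):
    """Interleaving successors of a global state (pc1, pc2, flag1, flag2, turn)."""
    out = []
    for who in (1, 2):
        pc1, pc2, f1, f2, turn = s
        pc = pc1 if who == 1 else pc2
        f_own = f1 if who == 1 else f2
        f_other = f2 if who == 1 else f1
        other = 2 if who == 1 else 1
        if pc == 0:                       # request: flag:=true; turn:=other
            pc, f_own, turn = 1, True, other
        elif pc == 1:                     # busy-wait on (flag_other & turn=other)
            if not (f_other and turn == other):
                pc = 2                    # enter the critical section
        else:                             # exit: flag:=false
            pc, f_own = 0, False
        if who == 1:
            out.append((pc, pc2, f_own, f2, turn))
        else:
            out.append((pc1, pc, f1, f_own, turn))
    return out

def reachable(init):
    """Breadth-first search over all reachable global states."""
    seen, frontier = {init}, deque([init])
    while frontier:
        for t in successors(frontier.popleft()):
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return seen

states = reachable((0, 0, False, False, 1))
```

In this model, pc = 2 means "in the critical section"; the search finds states where either process is critical, but never both at once.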
that (1) P′1 ⪯ P1 and P′2 ⪯ P2 and θ′0↾X = θ0, and (2) for all fair schedulers R for P′1 and P′2, we have (a) [[(P′1 || P2 || R)(θ′0)]]↾X ⊆ (Φ2 → Φ1) and (b) [[(P1 || P′2 || R)(θ′0)]]↾X ⊆ (Φ1 → Φ2) and (c) [[(P′1 || P′2 || R)(θ′0)]]↾X ⊆ (Φ1 ∧ Φ2).
The answer to the assume-guarantee synthesis problem for Example 9 is “Yes.”
A solution P′1 and P′2 is shown in Fig. 9.4. We will argue the correctness of this solution later (in Example 11). The two refined processes P′1 and P′2 constitute exactly Peterson's
solution to the mutual-exclusion problem. In other words, Peterson’s protocol can be derived
automatically as an answer to the assume-guarantee synthesis problem for the requirements
of mutual exclusion and starvation freedom. The success of assume-guarantee synthesis for
the mutual-exclusion problem, together with the failure of the classical co-synthesis, suggests
that the classical formulation of co-synthesis is too strong.
9.4.2 Game Algorithms for Co-synthesis
We reduce the three formulations of the co-synthesis problem to problems about
games played on graphs with three players.
Game graphs. A 3-player game graph G = ((S,E), (S1, S2, S3)) consists of a directed graph
(S,E) with a finite set S of states and a set E ⊆ S×S of edges, and a partition (S1, S2, S3)
of the state space S into three sets. The states in Si are player-i states, for i ∈ {1, 2, 3}, and player i decides the successor at a state in Si. The notions of strategies and plays are similar to the case of 2-player games. We denote by σi a strategy for player i and by Σi the set of all strategies for player i, for i ∈ {1, 2, 3}. Given a start state s ∈ S and three strategies σi ∈ Σi, one for each of the three players i ∈ {1, 2, 3}, there is a unique play, denoted ω^{σ1,σ2,σ3}(s) = (s0, s1, s2, . . .), such that s0 = s and for all k ≥ 0, if sk ∈ Si, then
σi(s0, s1, . . . , sk) = sk+1; this play is the outcome of the game starting at s given the three
strategies σ1, σ2, and σ3.
Winning. An objective Ψ is a set of plays; i.e., Ψ ⊆ Ω. We extend the notion of winning states to three-player games (the notation is derived from [AHK02]). For an objective Ψ, the set of winning states for player 1 in the game graph G is

〈〈1〉〉G(Ψ) = {s ∈ S | ∃σ1 ∈ Σ1. ∀σ2 ∈ Σ2. ∀σ3 ∈ Σ3. ω^{σ1,σ2,σ3}(s) ∈ Ψ};

a witness strategy σ1 for player 1 for the existential quantifier is referred to as a winning strategy. The winning sets 〈〈2〉〉G(Ψ) and 〈〈3〉〉G(Ψ) for players 2 and 3 are defined analogously. The set of winning states for the team consisting of player 1 and player 2, playing against player 3, is

〈〈1, 2〉〉G(Ψ) = {s ∈ S | ∃σ1 ∈ Σ1. ∃σ2 ∈ Σ2. ∀σ3 ∈ Σ3. ω^{σ1,σ2,σ3}(s) ∈ Ψ}.

The winning sets 〈〈I〉〉G(Ψ) for other teams I ⊆ {1, 2, 3} are defined similarly. The following determinacy result follows from [GH82].
Theorem 48 (Finite-memory determinacy [GH82]) Let Ψ be an ω-regular objective, let G be a 3-player game graph, and let I ⊆ {1, 2, 3} be a set of the players. Let J = {1, 2, 3} \ I. Then (1) 〈〈I〉〉G(Ψ) = S \ 〈〈J〉〉G(¬Ψ), and (2) there exist finite-memory strategies for the players in I such that against all strategies for the players in J, for all states s ∈ 〈〈I〉〉G(Ψ), the play starting at s given the strategies lies in Ψ.
Game solutions to weak and classical co-synthesis. Given two processes P1 = (X1, δ1)
and P2 = (X2, δ2), we define the 3-player game graph G = ((S,E), (S1, S2, S3)) as follows: let S = Θ × {1, 2, 3}; let Si = Θ × {i} for i ∈ {1, 2, 3}; and let E contain (1) all edges of the form ((θ, 3), (θ, 1)) for θ ∈ Θ, (2) all edges of the form ((θ, 3), (θ, 2)) for θ ∈ Θ, and (3) all edges of the form ((θ, i), (θ′, 3)) for i ∈ {1, 2} with θ′↾Xi ∈ δi(θ↾Xi) and θ′↾(X \ Xi) = θ↾(X \ Xi). In other words, player 1 represents process P1, player 2 represents process P2, and player 3 represents the scheduler. Given a play of the form ω = ((θ0, 3), (θ0, i0), (θ1, 3), (θ1, i1), (θ2, 3), . . .), where ij ∈ {1, 2} for all j ≥ 0, we write [ω]1,2 for the sequence of valuations (θ0, θ1, θ2, . . .) in ω (ignoring the intermediate valuations at player-3 states). A specification Φ ⊆ Θω defines the objective [[Φ]] = {ω ∈ Ω | [ω]1,2 ∈ Φ}.
In this way, the specifications Φ1 and Φ2 for the processes P1 and P2 provide the objectives
Ψ1 = [[Φ1]] and Ψ2 = [[Φ2]] for players 1 and 2, respectively. The objective for player 3
(the scheduler) is the fairness objective Ψ3 = Fair that both S1 and S2 are visited infinitely
often; i.e., Fair contains all plays (s0, s1, s2, . . .) ∈ Ω such that sj ∈ S1 for infinitely many
j ≥ 0, and sk ∈ S2 for infinitely many k ≥ 0.
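The construction of this game graph can be sketched directly. In the Python fragment below, the two processes, their variables, and their transition functions are toy assumptions, not taken from the text; only the edge rules (1)-(3) follow the definition above.

```python
from itertools import product

# Toy inputs: two processes over boolean variables a (owned by P1) and b (by P2).
DOMS = {"a": (0, 1), "b": (0, 1)}
X1, X2 = {"a"}, {"b"}

def delta1(v):  # process 1 toggles a
    return [{"a": 1 - v["a"]}]

def delta2(v):  # process 2 may keep or toggle b
    return [{"b": v["b"]}, {"b": 1 - v["b"]}]

valuations = [dict(zip(DOMS, vals)) for vals in product(*DOMS.values())]

def key(theta):  # hashable form of a valuation
    return tuple(sorted(theta.items()))

states = [(key(t), i) for t in valuations for i in (1, 2, 3)]
edges = []
for theta in valuations:
    k = key(theta)
    edges.append(((k, 3), (k, 1)))     # rule (1): scheduler picks process 1
    edges.append(((k, 3), (k, 2)))     # rule (2): scheduler picks process 2
    for i, Xi, delta in ((1, X1, delta1), (2, X2, delta2)):
        for upd in delta({x: theta[x] for x in Xi}):
            theta2 = {**theta, **upd}  # rule (3): variables outside Xi unchanged
            edges.append(((k, i), (key(theta2), 3)))
```

With 4 global valuations this yields 12 states; each process edge returns control to a player-3 (scheduler) state and leaves the other process's variable untouched.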
Proposition 15 Given two processes P1 = (X1, δ1) and P2 = (X2, δ2), two specifications Φ1 for P1 and Φ2 for P2, and a start valuation θ0 ∈ Θ, the answer to the weak co-synthesis problem is "Yes" iff (θ0, 3) ∈ 〈〈1, 2〉〉G(Fair → ([[Φ1]] ∧ [[Φ2]])); and the answer to the classical co-synthesis problem is "Yes" iff both (θ0, 3) ∈ 〈〈1〉〉G(Fair → [[Φ1]]) and (θ0, 3) ∈ 〈〈2〉〉G(Fair → [[Φ2]]).
Proof. We first note that for games with ω-regular objectives, finite-memory winning strategies suffice (Theorem 48). The proof proceeds by the following case analysis.

1. Given a finite-memory strategy σ1, a witness P′1 = (X′1, δ′1) for the weak co-synthesis problem can be obtained as follows: the variables X′1 \ X1 encode the finite-memory information of the strategy σ1, and the next-state function of the strategy is then captured by a deterministic update function δ′1. A similar construction holds for player 2.

2. Given a witness P′1 = (X′1, δ′1) for the weak co-synthesis problem, we first observe that any deterministic restriction of P′1 (i.e., with the transition function δ′1 made deterministic) is also a witness to the weak co-synthesis problem. A witness strategy σ1 in G is obtained as follows: the variables in X′1 \ X1 are encoded as the finite-memory information of σ1, and the deterministic update is captured by the next-state function. The construction of witness strategies for player 2 is similar.

The proof for the classical co-synthesis problem is similar.
Example 10 (Failure of classical co-synthesis) We now demonstrate the failure of
classical co-synthesis for Example 9. We show that for every strategy for process P1, there
exist spoiling strategies for process P2 and the scheduler such that (1) the scheduler is fair
and (2) the specification Φ1 of process P1 is violated. With any fair scheduler, process P1
will eventually set flag[1]:=true. Whenever process P1 enters its critical section (set-
ting Cr1:=true), the scheduler assigns a finite sequence of turns to process P2. During
this sequence, process P2 enters its critical section: it may first choose the alternative C10 to return to the beginning of the main loop, then set flag[2]:=true; turn:=1; then pass the guard C4 (since turn ≠ 2), and enter the critical section (setting Cr2:=true). This violates the mutual-exclusion requirement Φ^{mutex}_1 of process P1. On the other hand, if
process P1 never enters its critical section, this violates the starvation-freedom requirement Φ^{prog}_1 of process P1. Thus the answer to the classical co-synthesis problem is "No."
Game solution to assume-guarantee synthesis. We extend the notion of secure equi-
libria from 2-player games to 3-player games where player 3 can win unconditionally; i.e.,
〈〈3〉〉G(Ψ3) = S for the objective Ψ3 for player 3. In the setting of two processes and a
scheduler (player 3) with a fairness objective, the restriction that 〈〈3〉〉G(Ψ3) = S means
that the scheduler has a fair strategy from all states; this is clearly the case for Ψ3 = Fair.
(Alternatively, the scheduler may not be required to be fair; then Ψ3 is the set of all plays, and
the restriction is satisfied trivially.) We characterize the winning secure equilibrium states
and then establish the existence of finite-memory winning secure strategies (Theorem 50).
This will allow us to solve the assume-guarantee synthesis problem by computing winning
secure equilibria (Theorem 51).
Payoffs. In the following, we fix a 3-player game graph G and objectives Ψ1, Ψ2, and
Ψ3 for the three players such that 〈〈3〉〉G(Ψ3) = S. Since 〈〈3〉〉G(Ψ3) = S, any equilibrium
payoff profile will assign payoff 1 to player 3. Hence we focus on payoff profiles whose third
component is 1.
Payoff-profile ordering. The player-1 preference order ≺1 on payoff profiles is lexicographic:
(v1, v2, 1) ≺1 (v′1, v′2, 1) iff either (1) v1 < v′1, or (2) v1 = v′1 and v2 > v′2; that is, player 1
prefers a payoff profile that gives her greater payoff, and if two payoff profiles match in the
first component, then she prefers the payoff profile in which player 2’s payoff is smaller, i.e.,
it is the same preference order defined for secure equilibria for two players. The preference
order for player 2 is symmetric. The preference order for player 3 is such that (v1, v2, 1) ≺3
(v′1, v′2, 1) iff v1 + v2 > v′1 + v′2. Given two payoff profiles (v1, v2, v3) and (v′1, v′2, v′3), we write (v1, v2, v3) = (v′1, v′2, v′3) iff vi = v′i for all i ∈ {1, 2, 3}, and we write (v1, v2, v3) ⪯i (v′1, v′2, v′3) iff (v1, v2, v3) ≺i (v′1, v′2, v′3) or (v1, v2, v3) = (v′1, v′2, v′3).
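The three preference orders can be written out as predicates. The Python sketch below is a direct transcription of the definitions; the encoding of payoff profiles as triples is my own choice.

```python
# Strict preference orders on payoff profiles (v1, v2, 1), player 3's payoff fixed at 1.

def prec1(p, q):
    """Player 1: lexicographic — first maximize own payoff, then minimize player 2's."""
    (v1, v2, _), (w1, w2, _) = p, q
    return v1 < w1 or (v1 == w1 and v2 > w2)

def prec2(p, q):
    """Player 2: symmetric to player 1."""
    (v1, v2, _), (w1, w2, _) = p, q
    return v2 < w2 or (v2 == w2 and v1 > w1)

def prec3(p, q):
    """Player 3 prefers profiles with a smaller total payoff for players 1 and 2."""
    return p[0] + p[1] > q[0] + q[1]
```

For example, player 1 strictly prefers (1, 0, 1) over (1, 1, 1), since her own payoff is equal and player 2's payoff is smaller.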
Secure equilibria. A strategy profile (σ1, σ2, σ3) is a secure equilibrium at a state s ∈ S iff it is a Nash equilibrium with respect to the preference orders ⪯1, ⪯2, and ⪯3. For u, w ∈ {0, 1}, we write Suw1 ⊆ S for the set of states s such that a secure equilibrium with the payoff profile (u, w, 1) exists at s; that is, s ∈ Suw1 iff there is a secure equilibrium (σ1, σ2, σ3) at s with payoff profile (u, w, 1). Moreover, we write MSuw1(G) ⊆ Suw1 for the set of states s such that the payoff profile (u, w, 1) is a maximal secure equilibrium payoff profile at s; that is, s ∈ MSuw1(G) iff (1) s ∈ Suw1, and (2) for all u′, w′ ∈ {0, 1}, if s ∈ Su′w′1, then both (u′, w′, 1) ⪯1 (u, w, 1) and (u′, w′, 1) ⪯2 (u, w, 1). The states in MS111(G) are referred to as winning secure equilibrium states, and the witnessing secure equilibrium strategies as winning secure strategies.
Theorem 49 Let G be a 3-player game graph with the objectives Ψ1, Ψ2, and Ψ3 for
P2: enters the critical section by passing the guard C8 (since turn ≠ 2). After exiting its critical section, process P2 chooses the alternative C10 to return to the beginning of the main loop, sets flag[2]:=true; turn:=1; and then the scheduler assigns the turn to process P1, which cannot enter its critical section. The scheduler then assigns the turn to P2, and P2 again enters the critical section by passing guard C8; this sequence is repeated forever.

The same spoiling strategies work for the choices C2, C3, C6, and C7.

C4 The spoiling strategies cause the following sequence of updates:

P2: flag[2]:=true; turn:=1; [→]
P1: flag[1]:=true; turn:=2; [→]
P2: enters the critical section by passing the guard C3 (since turn ≠ 1).

After exiting its critical section, process P2 continues to choose the alternative C9 forever, and the scheduler alternates the turn between P1 and P2; process P1 cannot enter its critical section.

The same spoiling strategies work for the choice C5.
C8 The spoiling strategies cause the following sequence of updates: