Winning concurrent reachability games requires doubly-exponential patience Michal Koucký IM AS CR, Prague Kristoffer Arnsfelt Hansen, Peter Bro Miltersen.

Winning concurrent reachability games requires doubly-exponential patience

Michal Koucký
IM AS CR, Prague

Kristoffer Arnsfelt Hansen, Peter Bro Miltersen
Aarhus U., Denmark


Example

Player 1 chooses A ∈ {t,h}.

Player 2 chooses B ∈ {t,h}.

If A = B, then move one level up.
If A ≠ B = t, then move to the 1st level.
If A ≠ B = h, then Player 1 loses.

Entrance fee: $15

Win: $20

[Figure: a ladder of levels 1 through 7 with W above level 7.]


Entrance fee: $15. Win: $20.

Observation: To break even, you need at least ¾ probability to win.

Good news: you can win with probability arbitrarily close to 1.

Bad news: the expected time to win the game with probability at least ¾ is 10^25 years (one move per day).

… the age of the universe: 10^11 years
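The break-even observation is plain arithmetic and can be checked with a minimal sketch. The function name is an assumption made for illustration, and the two time figures are simply the ones quoted on the slide.

```python
from fractions import Fraction

def break_even_probability(fee, prize):
    """Smallest winning probability p for which p * prize covers the fee."""
    return Fraction(fee, prize)

p = break_even_probability(15, 20)
print(p)  # 3/4

# How many times the quoted age of the universe fits into the quoted playing time.
ratio = 10**25 // 10**11
print(ratio)  # 100000000000000
```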


Concurrent reachability games
[de Alfaro, Henzinger, Kupferman ’98, Everett ’57]

Two players play on a graph of states. At each step they simultaneously (independently) pick one of the possible actions each, and based on a transition table they move to the next state.
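One possible way to represent such a game in code is a table indexed by the current state and the pair of simultaneously chosen actions. The sketch below is illustrative only: the state names, action names, and the table itself are hypothetical and do not come from the slides.

```python
import random

# Hypothetical one-state game: transitions[state][(action1, action2)] -> next state.
transitions = {
    "start": {
        ("a", "a"): "goal",
        ("a", "b"): "start",
        ("b", "a"): "start",
        ("b", "b"): "goal",
    },
}

def step(state, strategy1, strategy2, rng):
    """Both players pick an action independently; the table gives the next state."""
    action1 = strategy1(state, rng)
    action2 = strategy2(state, rng)
    return transitions[state][(action1, action2)]

def uniform(state, rng):
    """Randomized memory-less strategy: each action with probability 1/2."""
    return rng.choice(["a", "b"])

def always_a(state, rng):
    """Deterministic memory-less strategy."""
    return "a"

rng = random.Random(0)
next_state = step("start", always_a, always_a, rng)
print(next_state)  # goal
```

Neither strategy function sees the other player's choice, which is what makes the move concurrent.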


Goals: Player 1 wants to reach a specific state or states.

Player 2 wants to prevent Player 1 from reaching these states.

Strategy of a player:
Memory-less (non-adaptive) – π : states → actions.
Adaptive – π : history → actions.

Probabilistic strategy: π gives a probability distribution over the possible actions.

Patience of a memory-less strategy π = 1 / (minimum non-zero probability in π) … [Everett ’57]
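The definition of patience translates directly into a few lines of code. This is a minimal sketch, assuming a memory-less strategy is stored as a dictionary from states to action distributions; the example strategy is hypothetical.

```python
from fractions import Fraction

def patience(strategy):
    """Patience = 1 / (smallest non-zero probability used anywhere in the strategy)."""
    nonzero = [p for dist in strategy.values() for p in dist.values() if p > 0]
    return 1 / min(nonzero)

# Hypothetical memory-less strategy: state -> {action: probability}.
strategy = {
    1: {"t": Fraction(1, 1000), "h": Fraction(999, 1000)},
    2: {"t": Fraction(1, 10), "h": Fraction(9, 10)},
    3: {"t": Fraction(0), "h": Fraction(1)},
}
print(patience(strategy))  # 1000
```

Zero probabilities are skipped, so a deterministic strategy has patience 1.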


Winning starting states:

Sure – Player 1 has a winning strategy that never fails.

Almost-Sure – Player 1 has a randomized strategy that reaches the goal with probability 1.

Limit-Sure – For every ε > 0, Player 1 has a strategy that reaches the goal with probability at least 1 – ε.


Purgatory_n

Player 1 chooses A ∈ {t,h}.

Player 2 chooses B ∈ {t,h}.

If A = B, then move one level up.
If A ≠ B = t, then move to the 1st level.
If A ≠ B = h, then move to state H.

[Figure: levels 1, 2, 3, …, n−1, n with state P above level n, and a separate state H.]
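The three rules can be written as a small transition function. The sketch below assumes, based on the diagram, that moving one level up from level n reaches P; the function names and the replay helper are illustrative and not part of the slides.

```python
def purgatory_step(level, a, b, n):
    """One move of Purgatory_n from a numbered level; returns the next state.

    Assumption (from the diagram): moving up from level n reaches P.
    """
    if a == b:
        return "P" if level == n else level + 1
    if b == "t":   # A != B and B = t: back to the 1st level
        return 1
    return "H"     # A != B and B = h: move to state H

def play(n, moves1, moves2):
    """Replay fixed move sequences from level 1 until P or H is reached."""
    state = 1
    for a, b in zip(moves1, moves2):
        state = purgatory_step(state, a, b, n)
        if state in ("P", "H"):
            break
    return state

print(play(3, "ttt", "ttt"))  # P
print(play(3, "tth", "ttt"))  # 1
print(play(3, "tt", "th"))    # H
```

The second replay shows the reset: after climbing to level 3, a single mismatch with B = t sends Player 1 back to level 1.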


Our results

Thm: 1) For every 0 < ε < ½, any ε-optimal strategy of Player 1 in Purgatory_n is of patience > (1/ε)^{2^{n-2}}.

2) For every l < n/2, any (1 – 2^{-l})-optimal strategy of Player 1 in Purgatory_n is of patience > 2^{2^{n-l-2}}.

Thm: For every 0 < ε < ½ and every concurrent reachability game with m > 61 actions in total, both players have ε-optimal strategies with patience < (1/ε)^{2^{42m}}.
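To get a feeling for the size of the two lower bounds, they can be evaluated exactly with Python's arbitrary-precision integers. This is only a numerical illustration; the function names and the sample parameters are assumptions.

```python
from fractions import Fraction

def purgatory_patience_lower_bound(eps, n):
    """(1/eps)^(2^(n-2)), the bound in part 1) of the first theorem."""
    return (1 / Fraction(eps)) ** (2 ** (n - 2))

def near_one_patience_lower_bound(n, l):
    """2^(2^(n-l-2)), the bound in part 2), stated for l < n/2."""
    assert 2 * l < n
    return 2 ** (2 ** (n - l - 2))

print(purgatory_patience_lower_bound(Fraction(1, 4), 7))  # 18446744073709551616
print(near_one_patience_lower_bound(7, 3))                # 16
```

For ε = ¼ and n = 7 the first bound is 4^32 = 2^64. Each additional level squares it.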


Thm: 1) For every 0 < ε < ε′, if every ε-optimal strategy of Player 1 is of patience > t, then the expected time to win the game by any ε′-optimal strategy of Player 1 can be forced to be Ω(t).

patience ~ expected time to win

All the results essentially hold also for adaptive strategies.

Recall: the expected time to win Purgatory_7 with probability at least ¾ is 10^25 years (one move per day).


Algorithmic consequences

Three algorithmic questions:

1. What are *-SURE states? PTIME [dAHK]

2. What are the winning probabilities of different states? PSPACE [EY]

3. What is the (ε-)optimal strategy?
   EXP-EXP-TIME upper-bound [CdAH,…]
   EXP-SPACE lower-bound [our results]

Cor: Any algorithm that manipulates winning strategies in explicit representation must use exponential space.

… explicit representation: integer fractions
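The corollary can be illustrated by counting bits. A non-zero probability smaller than 1/D, written as an integer fraction, needs a denominator larger than D. The sketch below takes D = (1/ε)^{2^{n-2}} from the Purgatory_n patience bound and assumes ε = ¼ as an example; the function name is hypothetical.

```python
def bits_for_denominator(eps_denominator, n):
    """Bits needed to write the integer (1/eps)^(2^(n-2)) in binary, for eps = 1/eps_denominator."""
    return (eps_denominator ** (2 ** (n - 2))).bit_length()

for n in (4, 7, 10, 13):
    print(n, bits_for_denominator(4, n))
# 4 9
# 7 65
# 10 513
# 13 4097
```

The bit count roughly doubles with every added level, so it grows exponentially in n.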


Purgatory_n

p_i – probability of playing t in state i in an ε-optimal strategy of Player 1.

Claim:
1) 0 < p_i < 1, for all i.
2) p_i < ε, for all i.
3) p_1 ≤ p_2 · p_3 ⋯ p_n
4) p_i ≤ p_{i+1} · p_{i+2} ⋯ p_n

Transition table (rows: Player 1's action, columns: Player 2's action):

1\2   t          h
t     level+1    loss
h     level=1    level+1

[Figure: levels 1, 2, 3, …, n−1, n below P, labeled with p_1, p_2, p_3, …, p_{n−1}, p_n and the annotations "Player 2 plays h" and "Player 2 plays t".]
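One way to see how claims 2) and 4) could combine into a doubly-exponential bound is to track, for each level i, an exponent e_i such that p_i < ε^{e_i}. The sketch below is an illustration of that bookkeeping under the stated claims, not necessarily the argument given in the talk; the function name is hypothetical.

```python
def epsilon_exponents(n):
    """Return e with p_i < eps**e[i] for i = 1..n, using only claims 2) and 4).

    Claim 2) gives e[n] = 1. Claim 4) bounds p_i by the product of the later
    p_j, so the exponents of the later levels add up.
    """
    e = {n: 1}
    for i in range(n - 1, 0, -1):
        e[i] = sum(e[j] for j in range(i + 1, n + 1))
    return e

e = epsilon_exponents(7)
print([e[i] for i in range(1, 8)])  # [32, 16, 8, 4, 2, 1, 1]
```

The exponent at level 1 comes out as 2^{n-2}, so p_1 < ε^{2^{n-2}}, which corresponds to a patience above (1/ε)^{2^{n-2}}.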


Open problems

Generic algorithm for ε-optimal strategy with symbolic representation?

How to redefine the game to be more realistic?


Goals: Player 1 wants to reach a specific state or states.

Player 2 wants to prevent Player 1 from reaching these states.

Winning starting states:

Sure – Player 1 has a winning strategy that never fails.

Almost-Sure – Player 1 has a randomized strategy that reaches the goal with probability 1.

Limit-Sure – For every ε > 0, Player 1 has a strategy that reaches the goal with probability at least 1 – ε.
