218 1 Lecture 3 Part2-Print

Post on 03-May-2017


Lecture 3

Dynamic games of complete information

- Part 2

Outline (part 2)

Questions/comments/observations are always encouraged, at any point during the lecture!!

• Repeated games

– Finite

– Infinite

Motivation

• Play the same normal-form game over and over

– each round is called a “stage game”

– example: Prisoner’s dilemma

Repeated games

• A repeated game is designed to examine the logic of long-term interaction

• It captures the idea that a player will take into account the effect of his current behavior on the other players’ future behavior, and aims to explain phenomena like cooperation, revenge, threats etc.

Finitely repeated games

• Everything is straightforward if we repeat a game a finite number of times

• We can write it as an extensive-form game with imperfect information

– at each round players don’t know what the others have done; afterwards they do

– overall payoff function is additive: sum of payoffs in stage games

Remarks

• Observe that the strategy space is much richer than it was in the normal-form setting

• Repeating a Nash strategy in each stage game will be an equilibrium in behavioral strategies (called a stationary strategy)

• We can apply backward induction in these games when the normal form game has a dominant strategy.

Prisoner’s dilemma as repeated game

Infinitely repeated games


Strategies

Nash equilibria with no discounting


Can we get anything else rather than repetitions of the stage game equilibrium?


Important points

What outcomes can be achieved as equilibria?


The folk theorem

Folk theorems

• Repeated games

– Structure of the equilibrium strategies (more useful)

– Determine the payoffs that can be sustained by equilibria -> conditions under which this set consists of nearly all reasonable payoff profiles (just existence of equilibria)

• “Folk theorems”

– focus of most of the formal development in repeated games

– Socially desirable outcomes that cannot be sustained if players are myopic can be sustained if players are foresighted (i.e. have long-term objectives)

Folk Theorems

• When players are patient, repeated play allows virtually any payoff to be an equilibrium outcome

• The set of Nash equilibrium outcomes includes outcomes that are not repetitions of an equilibrium of the constituent (stage) game

• To support such an outcome, each player must be deterred from deviating by being “punished”

• Punishment may take many forms

– One possibility: “trigger strategy” (any deviation by a player causes his opponent(s) to carry out a punitive action that lasts forever)

Repeated games – some preliminary conclusions

• Repeated games may introduce new equilibria and stimulate cooperation

– Infinitely repeated games (infinite horizon T)

• Finitely repeated games (finite horizon, T finite): solved by backward induction

– Players have incentives to cheat

• Infinite Horizon: description of a game where players think the game extends one more period with high probability

• Finite Horizon: terminal date of the game is known.

Nash equilibria with discounting

The folk theorem

The folk theorem with discounting

What about subgame perfect NE?

Strategy representation in repeated games - Automata

The following game is infinitely repeated with discount factor δ.

         C       D
  C    2, 2    0, 3
  D    3, 0    1, 1
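To make the later payoff comparisons concrete, here is a minimal Python sketch of the stage game above and of the normalized discounted average $(1-\delta)\sum_t \delta^t u_t$. The names `PAYOFFS` and `discounted_average` are illustrative and not part of the lecture, and the infinite sum is approximated by truncating the payoff stream at a finite length.

```python
# Stage-game payoffs as (row player, column player); rows and columns are C, D.
PAYOFFS = {
    ("C", "C"): (2, 2),
    ("C", "D"): (0, 3),
    ("D", "C"): (3, 0),
    ("D", "D"): (1, 1),
}

def discounted_average(stream, delta):
    """Return (1 - delta) * sum_t delta**t * u_t for a finite payoff stream."""
    return (1 - delta) * sum(u * delta**t for t, u in enumerate(stream))

# Mutual cooperation, truncated after 2000 periods (an approximation of forever).
coop_stream = [PAYOFFS[("C", "C")][0]] * 2000
print(discounted_average(coop_stream, 0.9))  # very close to 2
```

The normalization by $(1-\delta)$ puts the repeated-game payoff on the same scale as a single stage-game payoff, which is why constant play of (C,C) is worth 2.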

Grim Trigger Strategy

Consider the repeated prisoner’s dilemma. The strategy prescribes that the player initially cooperates, and continues to do so if both players cooperated at all previous times.

$s_i(a^1, \dots, a^T) = C$ if $a^t = (C, C)$ for all $t = 1, \dots, T$.

$s_i(a^1, \dots, a^T) = D$ otherwise.

Note that a player defects if either she or her opponent defected in the past.
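The definition translates almost directly into code. The following is a small sketch, assuming a history is stored as a Python list of action profiles `(own action, opponent's action)`; the function name `grim_trigger` is chosen here for illustration.

```python
def grim_trigger(history):
    """Return "C" if every past action profile was (C, C), otherwise "D".

    `history` is a list of (own action, opponent's action) pairs.
    """
    return "C" if all(profile == ("C", "C") for profile in history) else "D"

print(grim_trigger([]))                        # empty history: cooperate
print(grim_trigger([("C", "C"), ("C", "C")]))  # still cooperating
print(grim_trigger([("C", "C"), ("D", "C")]))  # own past defection also triggers D
```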

Automaton of Grim Trigger Strategy

• There are two states: C, in which C is chosen, and D, in which D is chosen.

• The initial state (*) is C.

• If the play is not (C,C) in any period then the state changes to D.

• If the automaton is in state D, it remains there forever.

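[Figure: automaton of the grim trigger strategy. State C (initial, *) loops on (C,C) and moves to state D on (C,D), (D,C) or (D,D).]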
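The automaton description can be sketched in Python as an initial state, an output rule and a transition rule. The dictionary layout and the helper `run` are assumptions made for this illustration, not notation from the lecture.

```python
# Automaton = initial state, action chosen in each state, transition function.
GRIM = {
    "initial": "C",
    "output": {"C": "C", "D": "D"},
    # State C persists only after (C, C); state D is absorbing.
    "next": lambda state, profile: (
        "C" if state == "C" and profile == ("C", "C") else "D"
    ),
}

def run(automaton, opponent_actions):
    """Play the automaton against a fixed sequence of opponent actions."""
    state, played = automaton["initial"], []
    for other in opponent_actions:
        own = automaton["output"][state]
        played.append(own)
        state = automaton["next"](state, (own, other))
    return played

# The opponent defects once in the third period; the automaton defects from then on.
print(run(GRIM, ["C", "C", "D", "C", "C"]))  # ['C', 'C', 'C', 'D', 'D']
```

Representing a strategy by its states rather than by full histories keeps the description finite even though the set of histories is infinite.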

More formalism!

One-step deviation principle

Central questions

1. If players are patient, can we get cooperative outcomes (better than NE for all players)?

2. If players are patient, what else can we get?

Grim-Trigger is SGPE

Suppose that both players adopt the grim-trigger strategy.

There are two sets of histories. Those for which grim-trigger strategy prescribes that the players play (C,C) and those for which the grim trigger strategy prescribes that they play (D,D).

In the first set of histories, if player i plays grim-trigger, then the outcome is (C, C) in every period with payoffs (2, 2, . . .), whose discounted average is 2.

If i deviates only once, she plays D. Then she reverts to the grim-trigger strategy, which prescribes playing D in all subsequent periods.

• The opponent, playing the grim-trigger strategy, plays D forever as a consequence of i’s one-shot deviation.

• The one-shot deviation (OSD) yields the stream of payoffs (3, 1, 1, . . .) with discounted average

$(1 - \delta)[3 + \delta + \delta^2 + \delta^3 + \cdots] = 3(1 - \delta) + \delta$.

• Thus player i cannot increase her payoff by deviating if and only if $2 \ge 3(1 - \delta) + \delta$, or $\delta \ge 1/2$.

• In the second set of histories, if player i plays grim trigger, then the outcome is (D, D) in every period with payoffs (1, 1, . . .), whose discounted average is 1.

• If i deviates only once, she plays C. Then she reverts to the grim trigger strategy, which prescribes playing D in all subsequent periods.

• The opponent, playing the grim trigger strategy, plays D forever regardless of i’s one-shot deviation, because the state is already D.

• The OSD yields the stream of payoffs (0, 1, 1, . . .) with discounted average

$(1 - \delta)[0 + \delta + \delta^2 + \delta^3 + \cdots] = \delta$.

• Player i cannot increase her payoff by deviating: $1 \ge \delta$.

• We conclude that if δ ≥ ½ then the strategy pair in which each player’s strategy is the grim-trigger strategy is a subgame-perfect equilibrium of the infinitely repeated Prisoner’s Dilemma.
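The threshold δ ≥ 1/2 can be checked numerically. The sketch below is an illustration only: it truncates the infinite payoff streams at a long finite horizon, and the function names are hypothetical. It compares the one-shot deviation stream (3, 1, 1, . . .) with the conforming stream (2, 2, . . .).

```python
def discounted_average(stream, delta):
    """Return (1 - delta) * sum_t delta**t * u_t for a finite payoff stream."""
    return (1 - delta) * sum(u * delta**t for t, u in enumerate(stream))

def grim_deviation_gain(delta, horizon=5000):
    """One-shot deviation payoff minus conforming payoff in the cooperative state."""
    conform = discounted_average([2] * horizon, delta)
    deviate = discounted_average([3] + [1] * (horizon - 1), delta)
    return deviate - conform

# The gain equals 3(1 - delta) + delta - 2 = 1 - 2*delta up to truncation error.
for delta in (0.3, 0.5, 0.7):
    print(delta, round(grim_deviation_gain(delta), 6))
```

A positive gain (δ below 1/2) means the deviation is profitable, so grim trigger is not an equilibrium; at δ = 1/2 the player is exactly indifferent, and above it the deviation strictly loses.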

Tit-for-Tat

• The player initially cooperates.

• At subsequent rounds, she plays the action played by the opponent at the previous round.

$s_i(a^1, \dots, a^T) = C$ if $a^T_j = C$.

$s_i(a^1, \dots, a^T) = D$ if $a^T_j = D$.

Here $a^T_j$ is the action of the opponent j in period T.

[Figure: automaton of the tit-for-tat strategy. Two states, C (initial, *) and D; edge labels ( . ,D), (C, . ), ( . ,C), (D, . ).]
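To see what a single deviation does when both players use tit-for-tat, the following sketch simulates the play path. The function names and the `deviate_at` parameter are illustrative assumptions, not part of the lecture.

```python
def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous action."""
    return opponent_history[-1] if opponent_history else "C"

def play(rounds, deviate_at=None):
    """Both players use tit-for-tat; player 1 flips her action once at `deviate_at`."""
    h1, h2 = [], []  # past actions of player 1 and player 2
    for t in range(rounds):
        a1, a2 = tit_for_tat(h2), tit_for_tat(h1)
        if t == deviate_at:
            a1 = "D" if a1 == "C" else "C"
        h1.append(a1)
        h2.append(a2)
    return list(zip(h1, h2))

print(play(4))                # (C, C) in every round
print(play(4, deviate_at=0))  # alternates (D, C), (C, D), (D, C), (C, D)
```

Unlike grim trigger, the punishment is not absorbing: one defection sends the two players into an endless alternation between (D,C) and (C,D).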

Tit for Tat is SGPE

• Suppose that both players adopt the tit-for-tat strategy.

• There are four sets of histories. They prescribe, respectively, (C,C), (C,D), (D,C) and (D,D).

• In the first set of histories, if player i plays tit for tat, then the outcome is (C, C) in every period with payoffs (2, 2, . . .), whose discounted average is 2.

• If i deviates only once, she plays D. Then she reverts to tit for tat. Given that the opponent plays tit for tat, the induced play is {(D,C),(C,D),(D,C),(C,D),…}, with payoffs (3,0,3,0,…).

• Hence player i does not deviate if:

$2 \ge (1-\delta)[3 + 0\cdot\delta + 3\delta^2 + 0\cdot\delta^3 + \cdots] = \frac{3(1-\delta)}{1-\delta^2}$,

that is to say: $\delta \ge 1/2$.

• In the set of histories prescribing (C,D), if players play tit for tat, then the outcome is {(C,D),(D,C),…}, which yields (0,3,0,3,…).

• If i deviates only once, she plays D. Then she reverts to tit for tat. Given that the opponent plays tit for tat, the induced play is (D,D) forever, with payoff 1.

• Hence player i does not deviate if:

$(1-\delta)[0 + 3\delta + 0\cdot\delta^2 + 3\delta^3 + \cdots] = \frac{3\delta(1-\delta)}{1-\delta^2} \ge 1$,

that is to say: $\delta \ge 1/2$.

• In the set of histories prescribing (D,C), if players play tit for tat, then the outcome is {(D,C),(C,D),…}, which yields (3,0,3,0,…). If i deviates only once, she plays C and the induced play is (C,C) forever, with payoff 2. Player i does not deviate if:

$(1-\delta)[3 + 0\cdot\delta + 3\delta^2 + 0\cdot\delta^3 + \cdots] = \frac{3(1-\delta)}{1-\delta^2} \ge 2$,

that is to say: $1/2 \ge \delta$.

• In the set of histories prescribing (D,D), if players play tit for tat, then the outcome is (D,D) forever, with payoff 1.

• If i deviates only once, the induced play is {(C,D),(D,C),…}, which yields (0,3,0,3,…).

• Player i does not deviate if:

$1 \ge (1-\delta)[0 + 3\delta + 0\cdot\delta^2 + 3\delta^3 + \cdots] = \frac{3\delta(1-\delta)}{1-\delta^2}$,

that is to say: $1/2 \ge \delta$.

• We conclude that the strategy pair in which each player plays the tit-for-tat strategy is a subgame-perfect equilibrium of the infinitely repeated Prisoner’s Dilemma if and only if δ = ½.

• This underlines the inherent fragility of tit-for-tat: it works only in a knife-edge case.
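The four one-shot-deviation conditions can be checked together. The sketch below uses exact rational arithmetic and the closed forms 3/(1+δ) for the stream (3,0,3,0,…) and 3δ/(1+δ) for the stream (0,3,0,3,…), which follow from dividing out the common factor (1−δ) in the geometric sums. The function name `tft_conditions` is a hypothetical choice for this illustration.

```python
from fractions import Fraction

def tft_conditions(delta):
    """No-deviation conditions for tit-for-tat, keyed by the prescribed profile."""
    alt_high = 3 / (1 + delta)         # discounted average of (3, 0, 3, 0, ...)
    alt_low = 3 * delta / (1 + delta)  # discounted average of (0, 3, 0, 3, ...)
    return {
        ("C", "C"): 2 >= alt_high,  # conform: 2 forever; deviate: (3, 0, ...)
        ("C", "D"): alt_low >= 1,   # conform: (0, 3, ...); deviate: 1 forever
        ("D", "C"): alt_high >= 2,  # conform: (3, 0, ...); deviate: 2 forever
        ("D", "D"): 1 >= alt_low,   # conform: 1 forever; deviate: (0, 3, ...)
    }

for delta in (Fraction(2, 5), Fraction(1, 2), Fraction(3, 5)):
    print(delta, all(tft_conditions(delta).values()))
```

Of the three values tried, only δ = 1/2 satisfies all four conditions: below it the (C,C) and (C,D) conditions fail, and above it the (D,C) and (D,D) conditions fail.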

Other strategies for repeated games

