The Evolution of Cooperation

Kentaro Toyama

Microsoft Research India

Indian Institute of Science August 10, 2005

Robert Axelrod’s

A Computer Game for Political Science

Outline

Prisoner’s Dilemma

Two Contests

Some Analysis

Real-World Scenarios

Agent-Based Simulation

Discussion

Robert Axelrod

Professor of Political Science and Public Policy at the University of Michigan, Ann Arbor

First paper on cooperation published in 1980.

Book, The Evolution of Cooperation, published in 1984 to wide acclaim.

Best known for this and related work; still active in this area and publishing new research.

http://www-personal.umich.edu/~axe/

The Prisoner’s Dilemma

Two-player game

Non-zero-sum

Model for many real-world scenarios

Story based on two criminals caught by police and interrogated separately…

The Prisoner’s Dilemma

Payoff Matrix (each cell: Player A’s payoff, Player B’s payoff)

                       Player B cooperates    Player B defects
Player A cooperates    -2, -2                 -5, 0
Player A defects       0, -5                  -4, -4

Think of payoffs as number of years of life lost, spent in jail.

The Prisoner’s Dilemma

Payoff Matrix (each cell: Player A’s payoff, Player B’s payoff)

                       Player B cooperates    Player B defects
Player A cooperates    3, 3                   0, 5
Player A defects       5, 0                   1, 1

(For ease of thinking, add 5 to each payoff. The larger the payoff, the better.)

If Player A cooperates, Player B should defect.

If Player A defects, Player B should defect.

No matter what the other player does, a rational, self-interested player will defect. (This is a Nash equilibrium.)

The Dilemma: There is a joint strategy that could result in better payoffs for both players. (The Nash equilibrium is not Pareto-optimal.)
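The argument above can be checked mechanically. The following is a minimal Python sketch, assuming the payoff values from the matrix above (3, 5, 0, 1); the dictionary layout and the function names are illustrative and not part of the talk.

```python
# Payoffs as (Player A, Player B), indexed by (A's move, B's move).
# "C" = cooperate, "D" = defect. Values are the slide's matrix after adding 5.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}
MOVES = ("C", "D")


def best_reply_for_b(a_move):
    """Return B's payoff-maximizing move against a fixed move by A."""
    return max(MOVES, key=lambda b_move: PAYOFFS[(a_move, b_move)][1])


def is_nash(a_move, b_move):
    """True if neither player gains by changing only their own move."""
    a_pay, b_pay = PAYOFFS[(a_move, b_move)]
    a_ok = all(PAYOFFS[(alt, b_move)][0] <= a_pay for alt in MOVES)
    b_ok = all(PAYOFFS[(a_move, alt)][1] <= b_pay for alt in MOVES)
    return a_ok and b_ok


def pareto_dominated(a_move, b_move):
    """True if another outcome is at least as good for both and better for one."""
    a_pay, b_pay = PAYOFFS[(a_move, b_move)]
    return any(
        a2 >= a_pay and b2 >= b_pay and (a2 > a_pay or b2 > b_pay)
        for (a2, b2) in PAYOFFS.values()
    )


if __name__ == "__main__":
    for a_move in MOVES:
        print(f"If A plays {a_move}, B's best reply is {best_reply_for_b(a_move)}")
    print("Nash equilibria:", [cell for cell in PAYOFFS if is_nash(*cell)])
    print("Is (D, D) Pareto-dominated?", pareto_dominated("D", "D"))
```

Mutual defection is the only cell that passes the Nash check, and it is dominated by mutual cooperation, which is the dilemma in one line each.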

The Prisoner’s Dilemma

Other Kinds of Games

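Linked Fates (C = cooperate, D = defect)
Both C: 4, 4
One C, one D: 3, 2 or 2, 3
Both D: 1, 1

Chicken
Both swerve: 2, 2
One swerves, one goes straight: 3, 1 or 1, 3
Both straight: 0, 0

Matching Coins
Heads and Heads: 1, -1
Heads and Tails, or Tails and Heads: -1, 1
Tails and Tails: 1, -1

Exploitation (C = cooperate, D = defect)
Both C: 1, 1
One C, one D: 0, 0 or 1, 2
Both D: 0, 1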

The Prisoner’s Dilemma

Payoff Matrix (Player A’s payoff in each cell)

                       Player B cooperates               Player B defects
Player A cooperates    R = 3 (reward for cooperation)    S = 0 (sucker’s payoff)
Player A defects       T = 5 (temptation to defect)      P = 1 (punishment for defection)

T > R > P > S

R > (S + T) / 2
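The two conditions are easy to test for any candidate payoffs. Below is a small Python sketch; the function name is illustrative, and the second set of values (T = 8) is a made-up counterexample, not from the talk.

```python
def is_prisoners_dilemma(t, r, p, s):
    """Check the two conditions on the slide: T > R > P > S and R > (S + T) / 2."""
    ordering = t > r > p > s
    # Mutual cooperation must pay more than taking turns exploiting each other.
    no_gain_from_alternating = r > (s + t) / 2
    return ordering and no_gain_from_alternating


if __name__ == "__main__":
    # The slide's values.
    print(is_prisoners_dilemma(t=5, r=3, p=1, s=0))
    # Hypothetical values: alternating C/D would average (0 + 8) / 2 = 4 > 3.
    print(is_prisoners_dilemma(t=8, r=3, p=1, s=0))
```

The second condition matters in the iterated game: without it, two players could do better by alternating exploitation than by cooperating steadily.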

The Prisoner’s Dilemma

Payoffs do not have to be symmetrical. For example:

One player: R = 3, T = 5, P = 1, S = 0
Other player: R = 30, T = 50, P = 10, S = 0

For each player:

T > R > P > S

R > (S + T) / 2

PD as a Model for Real-Life Scenarios

Research collaboration

Competitive advertising

“Tragedy of the Commons”

Warfare

Biological relationships

Driving in traffic

Iterated Prisoner’s Dilemma

Two players

Prisoner’s Dilemma played repeatedly

History of previous interactions remembered by each player

No other outside knowledge

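Iterated Prisoner’s Dilemma

Every game uses the same payoff matrix (each cell: Player A’s payoff, Player B’s payoff):

                       Player B cooperates    Player B defects
Player A cooperates    3, 3                   0, 5
Player A defects       5, 0                   1, 1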

Two-game iteration…

No matter what the other player does, a rational, self-interested player will defect on the second (last) game.

Both players know this, so on the first game, both players will defect, as well.

Iterated Prisoner’s Dilemma

N-game iteration…

No matter what the other player does, a rational, self-interested player will defect on the last game.

Both players know this, so both will defect on the game before it as well, and so on back to the first game.

A rational, self-interested player should defect all N times.

Iterated Prisoner’s Dilemma

If number of iterations uncertain…

Best strategy is no longer clear!

Unlike, e.g., chess, there is no single “best strategy” – it depends on the strategy of the other player.

Contest #1

Call for entries to game theorists

All entrants told of preliminary experiments

15 strategies = 14 entries + 1 RANDOM

Round-robin tournament against all other players and “twin”

Each game 200 iterations

Games run 5 times against each strategy

Scores averaged over all games
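The tournament format above can be sketched in a few lines of Python. This is an illustration only: the four rules below are a small hypothetical field, not the 15 actual entries, and the fixed random seed is an assumption made so that RANDOM is repeatable. With so few rules the ranking need not resemble the real contest.

```python
import itertools
import random

PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5), ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
rng = random.Random(0)  # fixed seed (an assumption) so RANDOM is repeatable


def tit_for_tat(mine, theirs):
    return theirs[-1] if theirs else "C"


def always_defect(mine, theirs):
    return "D"


def always_cooperate(mine, theirs):
    return "C"


def random_rule(mine, theirs):
    return rng.choice("CD")


def play_game(rule_a, rule_b, iterations=200):
    """Play one game of the given length and return both total scores."""
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(iterations):
        move_a = rule_a(hist_a, hist_b)
        move_b = rule_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


def round_robin(rules, repeats=5):
    """Average score per game for each rule; every rule also plays its twin."""
    totals = {name: 0 for name in rules}
    games = {name: 0 for name in rules}
    pairs = itertools.combinations_with_replacement(rules.items(), 2)
    for (name_a, rule_a), (name_b, rule_b) in pairs:
        for _ in range(repeats):
            score_a, score_b = play_game(rule_a, rule_b)
            totals[name_a] += score_a
            games[name_a] += 1
            if name_a != name_b:  # a twin game is counted once
                totals[name_b] += score_b
                games[name_b] += 1
    return {name: totals[name] / games[name] for name in rules}


if __name__ == "__main__":
    field = {
        "TIT FOR TAT": tit_for_tat,
        "ALWAYS DEFECT": always_defect,
        "ALWAYS COOPERATE": always_cooperate,
        "RANDOM": random_rule,
    }
    for name, score in sorted(round_robin(field).items(), key=lambda kv: -kv[1]):
        print(f"{name:18s} {score:7.1f}")
```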

And, the winner is…

TIT FOR TAT

“Cooperate on first move, thereafter reciprocate opponent’s previous action”

Shortest program submitted

By psychologist, Anatol Rapoport
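The rule quoted above fits in a few lines, which is consistent with it being the shortest entry. The Python below is a sketch, not Rapoport's program; the opponent move sequence is hypothetical.

```python
def tit_for_tat(opponent_history):
    """Cooperate on the first move, then copy the opponent's previous move."""
    if not opponent_history:
        return "C"
    return opponent_history[-1]


def respond_to(opponent_moves):
    """Return TIT FOR TAT's move sequence against a fixed list of opponent moves."""
    replies = []
    for turn in range(len(opponent_moves)):
        # On each turn the rule sees only the moves made before that turn.
        replies.append(tit_for_tat(opponent_moves[:turn]))
    return replies


if __name__ == "__main__":
    opponent = list("CCDCDDC")  # hypothetical opponent moves
    print("opponent:   ", "".join(opponent))
    print("TIT FOR TAT:", "".join(respond_to(opponent)))
```

Each defection by the opponent is answered exactly once, one move later, and cooperation resumes as soon as the opponent cooperates again.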

Analysis: “Nice” Guys Finish First

Top 8 strategies never defect first.

Analysis: To Forgive, Divine

Top two rules are willing to cooperate even after defections, if other player is “contrite”

DOWNING

- “Kingmaker”

- Tries to learn behavior of other player; starts by defecting twice.

- Hurts strategies that are unforgiving.

Other Interesting Strategies

TIT FOR TWO TATS

- Retaliate only if previous two are D’s

- Could have won tournament, if entered

NICE DOWNING

- Like DOWNING, but start with C’s

- Could have won tournament, if entered

Variations on TIT FOR TAT

- Did well, but none beat TIT FOR TAT
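To make the difference concrete, here is a Python sketch of TIT FOR TWO TATS next to TIT FOR TAT, following the one-line descriptions above. The opponent sequence is hypothetical, and NICE DOWNING is left out because the slides do not give enough detail to implement it.

```python
def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def tit_for_two_tats(opponent_history):
    """Defect only if the opponent defected on both of the previous two moves."""
    if len(opponent_history) >= 2 and list(opponent_history[-2:]) == ["D", "D"]:
        return "D"
    return "C"


def respond_to(rule, opponent_moves):
    """Moves the rule would make against a fixed opponent sequence."""
    return "".join(rule(opponent_moves[:turn]) for turn in range(len(opponent_moves)))


if __name__ == "__main__":
    opponent = list("DCDCDDC")  # hypothetical: isolated defections, then two in a row
    print("opponent:         ", "".join(opponent))
    print("TIT FOR TAT:      ", respond_to(tit_for_tat, opponent))
    print("TIT FOR TWO TATS: ", respond_to(tit_for_two_tats, opponent))
```

The more forgiving rule lets isolated defections pass, which avoids long echo feuds but also leaves it open to rules that defect every other move.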

Contest #2

Same setup as Contest #1, except…

Entries from first-round contestants as well as open call in magazine

63 strategies = 62 entries + 1 RANDOM

Each game ran for an uncertain number of iterations, ending after each move with probability 0.00346
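A short calculation shows what the ending probability implies for game length. The Python sketch below assumes the end-of-game check is made independently after every move, so that the length follows a geometric distribution; the function names are illustrative.

```python
import math
import random

END_PROBABILITY = 0.00346  # from the slide


def expected_length(p_end):
    """Mean number of moves when the game ends after each move with probability p_end."""
    return 1 / p_end


def median_length(p_end):
    """Smallest n such that the game has ended by move n with probability >= 0.5."""
    return math.ceil(math.log(0.5) / math.log(1 - p_end))


def sample_length(p_end, rng):
    """Draw one game length by flipping the 'end now?' coin after every move."""
    moves = 1
    while rng.random() >= p_end:
        moves += 1
    return moves


if __name__ == "__main__":
    print("expected length:", round(expected_length(END_PROBABILITY)))
    print("median length:  ", median_length(END_PROBABILITY))
    rng = random.Random(0)  # fixed seed, an assumption for repeatability
    print("five sampled lengths:", [sample_length(END_PROBABILITY, rng) for _ in range(5)])
```

Under this assumption the median game lasts 200 moves, the same as the fixed length in Contest #1, while the mean is about 289. Because no move is known to be the last, end-game defection tricks lose their footing.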

And, the winner is…

TIT FOR TAT, again!

(Again, by Anatol Rapoport)

Analysis: Contest #1 Lessons Validated

14 of top 15 strategies never defect first.

14 of bottom 15 strategies were not “nice”.

Forgiveness important.

Analysis: Be Retaliatory

Some entries tried to take advantage of “nice” strategies:

TRANQUILIZER – cooperate first; if the other cooperates, too, throw in a few defections.

TESTER – defect first; if the other doesn’t retaliate, cooperate twice, then alternate defection and cooperation. If the other ever defects, do TIT FOR TAT.

Strategies that are unresponsive to defections get taken advantage of.

Top strategies retaliate quickly.

Analysis: Sneakiness Doesn’t Pay

Entries that try to take advantage of “nice” strategies don’t gain as much as they lose.

TRANQUILIZER – 27th place in tournament.

TESTER – 46th place in tournament (out of 63).

Robustness of TIT FOR TAT

In six variations of Contest #2, TIT FOR TAT took first place in five and second place in one.

In a population simulation with 63 strategies, TIT FOR TAT emerges as the winner.

In a genetic algorithm experiment (1987), TIT-FOR-TAT-like algorithms prevailed.
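The population simulation can be illustrated with a toy version. The Python sketch below assumes a simple update in which each rule's share of the population grows in proportion to its average score; the three rules are a hypothetical stand-in for the 63 entries, so this shows the mechanism rather than reproducing the result.

```python
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5), ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

# Three simple deterministic rules; each sees only the opponent's past moves.
RULES = {
    "TIT FOR TAT": lambda theirs: theirs[-1] if theirs else "C",
    "ALWAYS DEFECT": lambda theirs: "D",
    "ALWAYS COOPERATE": lambda theirs: "C",
}


def game_score(rule_a, rule_b, iterations=200):
    """Total score of rule_a over one game against rule_b."""
    hist_a, hist_b, score_a = [], [], 0
    for _ in range(iterations):
        move_a, move_b = rule_a(hist_b), rule_b(hist_a)
        score_a += PAYOFF[(move_a, move_b)][0]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a


def next_generation(shares, score):
    """Grow each rule's population share in proportion to its average score."""
    fitness = {a: sum(shares[b] * score[a][b] for b in shares) for a in shares}
    mean_fitness = sum(shares[a] * fitness[a] for a in shares)
    return {a: shares[a] * fitness[a] / mean_fitness for a in shares}


def simulate(generations=50):
    """Start from equal shares and apply the update for the given number of generations."""
    score = {a: {b: game_score(RULES[a], RULES[b]) for b in RULES} for a in RULES}
    shares = {name: 1 / len(RULES) for name in RULES}
    for _ in range(generations):
        shares = next_generation(shares, score)
    return shares


if __name__ == "__main__":
    for name, share in simulate().items():
        print(f"{name:18s} {share:.3f}")
```

In this toy field ALWAYS DEFECT gains at first by exploiting ALWAYS COOPERATE, then declines as its victims become scarce, and TIT FOR TAT ends with the largest share.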

Stability of TIT FOR TAT

A population of TIT FOR TAT strategists cannot be invaded by a single strategy.

Nor can a population of ALWAYS DEFECT strategists.

But! A cluster of TIT FOR TATs can invade ALWAYS DEFECT,* while the converse is not true.

* Under certain conditions that imply that the future is sufficiently important for all players.
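The footnote's condition can be made concrete with a discount parameter. The Python sketch below assumes, following Axelrod's analysis, that each further move is worth a fraction w of the previous one (w is not named in the slides), and it checks only two challengers, ALWAYS DEFECT and a rule that alternates D and C, which that analysis identifies as the critical ones. Function names are illustrative.

```python
T, R, P, S = 5, 3, 1, 0  # payoffs from the slides


def tft_vs_tft(w):
    """Discounted score of TIT FOR TAT against itself: R on every move."""
    return R / (1 - w)


def alld_vs_tft(w):
    """ALWAYS DEFECT against TIT FOR TAT: T once, then P on every later move."""
    return T + w * P / (1 - w)


def alternator_vs_tft(w):
    """Alternating D, C against TIT FOR TAT: T, S, T, S, ..."""
    return (T + w * S) / (1 - w * w)


def tft_vs_alld(w):
    """TIT FOR TAT against ALWAYS DEFECT: S once, then P on every later move."""
    return S + w * P / (1 - w)


def alld_vs_alld(w):
    """ALWAYS DEFECT against itself: P on every move."""
    return P / (1 - w)


def tft_resists_single_invader(w):
    """A lone newcomer meets only TIT FOR TAT, so it must outscore tft_vs_tft to invade."""
    return tft_vs_tft(w) >= max(alld_vs_tft(w), alternator_vs_tft(w))


def min_cluster_share(w):
    """Share p of meetings with fellow TIT FOR TATs at which a cluster breaks even.

    Solves p * tft_vs_tft + (1 - p) * tft_vs_alld = alld_vs_alld for p.
    """
    return (alld_vs_alld(w) - tft_vs_alld(w)) / (tft_vs_tft(w) - tft_vs_alld(w))


if __name__ == "__main__":
    for w in (0.5, 0.6, 0.7, 0.9):
        print(f"w = {w}: TIT FOR TAT resists a single invader: {tft_resists_single_invader(w)}")
    print(f"w = 0.9: cluster needs more than {min_cluster_share(0.9):.3f} of its meetings in-cluster")
```

With these payoffs the check passes once w reaches 2/3. At w = 0.9 a cluster of TIT FOR TATs needs only about 5% of its meetings to be with each other to outscore the ALWAYS DEFECT natives, while a lone TIT FOR TAT always scores below them because S < P.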

General Lessons

Don’t be envious. (It doesn’t matter if others win.)
- TIT FOR TAT never scores more than the other player.

Be nice. (Don’t defect first.)
- The best way to do well is to cooperate with others who are also nice.

Retaliate swiftly.
- Or, others will take advantage.

Forgive.
- Feuds are costly. Defections shouldn’t prevent cooperation later on.

Don’t be too clever.
- Too much cleverness looks RANDOM.
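The first lesson's claim, that TIT FOR TAT never scores more than the other player, can be spot-checked by brute force. The Python sketch below plays TIT FOR TAT against randomly generated opponent move sequences; the trial count, sequence lengths, and seed are arbitrary choices for illustration.

```python
import random

PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5), ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def scores_against(opponent_moves):
    """Return (TIT FOR TAT score, opponent score) for a fixed opponent move sequence."""
    tft_score = 0
    other_score = 0
    previous = None  # opponent's previous move; None before the first move
    for their_move in opponent_moves:
        my_move = "C" if previous is None else previous
        pay_me, pay_them = PAYOFF[(my_move, their_move)]
        tft_score += pay_me
        other_score += pay_them
        previous = their_move
    return tft_score, other_score


def largest_lead(trials=1000, seed=1):
    """Largest (TIT FOR TAT score - opponent score) over random opponent sequences."""
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        moves = [rng.choice("CD") for _ in range(rng.randint(1, 50))]
        tft, other = scores_against(moves)
        lead = tft - other
        best = lead if best is None else max(best, lead)
    return best


if __name__ == "__main__":
    print("vs. all C:", scores_against(list("CCCC")))
    print("vs. all D:", scores_against(list("DDDD")))
    print("largest lead over random opponents:", largest_lead())
```

TIT FOR TAT can only defect after being defected on, so it is exploited once before each retaliation and can at best draw level. It wins tournaments by eliciting high joint scores, not by beating any single opponent.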

Trench Warfare

Common form of battle in World War I

Armies in deep trenches on either side of battle line

Machine guns and artillery

Prolonged engagement with same group of enemy troops

Trench Warfare is an IPD

Payoff Matrix (outcome for you)

                     They cooperate               They shoot to kill
You cooperate        Both live.                   You die and they win.
You shoot to kill    You live and win a medal.    Both die.

For a single round, no matter what the enemy does, it’s better to shoot to kill.

But, for an indefinite number of rounds…?

Trench Warfare

Cooperation spontaneously evolved:

“If the British shelled the Germans, the Germans replied, and the damage was equal.”

“[A British staff officer was] astonished to observe German soldiers walking about within rifle range…”

“These people … did not know there was a war on. Both sides … believed in … ‘live and let live’.”

“Suddenly a salvo arrived but did no damage. Naturally both sides got down and our men started swearing at the Germans, when all at once a brave German got on to his parapet and shouted out ‘We are very sorry about that; we hope no one was hurt. It is not our fault, it is that damned Prussian artillery.’”

Biological Mutualism

Fig Tree and Fig Wasp

Payoff Matrix (outcome for the wasp / outcome for the tree)

- Wasp lays eggs and pollinates, tree lets fig ripen: Die, but have kids. / Bear fruit; breed good wasps.

- Wasp lays eggs and pollinates, tree cuts off fig: Die, no kids. / Bear fruit.

- Wasp lays eggs without pollinating, tree lets fig ripen: Live, and have kids. / No fruit; breed bad wasps.

- Wasp lays eggs without pollinating, tree cuts off fig: Live, no kids. / No fruit.

For a single round, trees should cut off figs, wasps should lay eggs without pollinating.

But, for an indefinite number of rounds…?

Agent-Based Modeling

Simulation as a scientific method

Simulation allows hypothesis discovery, verification, and prediction.

Simulation is particularly valuable when many agents interact and the agents are expected to adapt.

Other Modeled Social Theories

1963 Cyert & March – Behavioral theory of the firm

1974 Schelling – Segregated neighborhoods

1980 Axelrod – Cooperation

2003 Axelrod – Ethnocentrism

Summary

Iterated Prisoner’s Dilemma as a model for many different types of interaction

There is no single optimal strategy in an IPD game, but TIT FOR TAT is strong, robust, and stable.

In real-world IPD scenarios, TIT-FOR-TAT-like strategies naturally evolve, even among antagonists and unintelligent players.

Agent-based modeling is a powerful tool for modeling populations in social and biological sciences.

TIT FOR TAT and Ethics

Mahabharata (~3000 BC)

“One should not behave towards others in a way which is disagreeable to oneself. This is the essence of morality. All other activities are due to selfish desire.”

Hammurabi’s Code (~1750 BC)

“If a man put out the eye of another man, his eye shall be put out. If he break another man's bone, his bone shall be broken.”

The Golden Rule (~30 AD)

“Do unto others as you would have them do unto you.”

Kant’s Categorical Imperative

“Act so that the maxim of action may be capable of becoming a universal law.”

Garrett Hardin (“The Tragedy of the Commons”, 1968)

“Conscience is self-eliminating.”

Thank you!
