The Evolution of Cooperation
Kentaro Toyama
Microsoft Research India
Indian Institute of Science
August 10, 2005
Robert Axelrod’s
A Computer Game for Political Science
Outline
Prisoner’s Dilemma
Two Contests
Some Analysis
Real-World Scenarios
Agent-Based Simulation
Discussion
Robert Axelrod
Professor of Political Science and Public Policy at the University of Michigan, Ann Arbor
First paper on cooperation published in 1980.
Book (left) published in 1984 to wide acclaim.
Best known for this and related work; still active in this area and publishing new research.
http://www-personal.umich.edu/~axe/
The Prisoner’s Dilemma
Two-player game
Non-zero-sum
Model for many real-world scenarios
Story based on two criminals caught by police and interrogated separately…
The Prisoner’s Dilemma
Payoff Matrix (each cell: Player A’s payoff, Player B’s payoff)
                         Player B: Cooperate    Player B: Defect
Player A: Cooperate      -2, -2                 -5, 0
Player A: Defect         0, -5                  -4, -4
Think of payoffs as the number of years of life lost, spent in jail.
The Prisoner’s Dilemma
Payoff Matrix (each cell: Player A’s payoff, Player B’s payoff)
                         Player B: Cooperate    Player B: Defect
Player A: Cooperate      3, 3                   0, 5
Player A: Defect         5, 0                   1, 1
(For ease of thinking, add 5 to each payoff. The larger the payoff, the better.)
The Prisoner’s Dilemma
If Player A cooperates, Player B should defect (5 instead of 3).
If Player A defects, Player B should defect (1 instead of 0).
The Prisoner’s Dilemma
No matter what the other player does, a rational, self-interested player will defect. (Mutual defection is the Nash equilibrium.)
The Dilemma: There is a joint strategy, mutual cooperation, that would give both players a better payoff. (The Nash equilibrium is not Pareto-optimal.)
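The two claims above (defection is the best reply to either move, and the resulting equilibrium is worse for both than mutual cooperation) can be checked mechanically. The following is a minimal sketch using the shifted payoffs from the slides; the names `PAYOFF`, `best_reply_for_a`, `best_reply_for_b`, `is_nash`, and `pareto_dominated` are illustrative choices, not part of the original talk.

```python
# Payoffs as (Player A, Player B), using the shifted matrix from the slides.
C, D = "C", "D"
MOVES = (C, D)
PAYOFF = {
    (C, C): (3, 3),
    (C, D): (0, 5),
    (D, C): (5, 0),
    (D, D): (1, 1),
}

def best_reply_for_a(b_move):
    """Move that maximizes Player A's payoff against a fixed move by B."""
    return max(MOVES, key=lambda a: PAYOFF[(a, b_move)][0])

def best_reply_for_b(a_move):
    """Move that maximizes Player B's payoff against a fixed move by A."""
    return max(MOVES, key=lambda b: PAYOFF[(a_move, b)][1])

def is_nash(a_move, b_move):
    """Neither player can gain by changing only their own move."""
    return best_reply_for_a(b_move) == a_move and best_reply_for_b(a_move) == b_move

def pareto_dominated(outcome):
    """True if another outcome is at least as good for both and better for one."""
    pa, pb = PAYOFF[outcome]
    return any(
        qa >= pa and qb >= pb and (qa > pa or qb > pb)
        for other, (qa, qb) in PAYOFF.items()
        if other != outcome
    )

nash_outcomes = [(a, b) for a in MOVES for b in MOVES if is_nash(a, b)]
print(nash_outcomes)             # [('D', 'D')]
print(pareto_dominated((D, D)))  # True: (C, C) is better for both players
```

Mutual defection is the only outcome that survives the best-reply test, yet it is the one outcome that another cell improves on for both players at once.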
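Other Kinds of Games
Linked Fates (moves: C, D)
- Both play C: 4, 4
- One plays C, the other D: 3 and 2
- Both play D: 1, 1
Chicken (moves: Swerve, Straight)
- Both swerve: 2, 2
- One swerves, the other goes straight: 1 for the player who swerves, 3 for the player who goes straight
- Both go straight: 0, 0
Matching Coins (moves: Heads, Tails)
- Coins match: 1, -1
- Coins differ: -1, 1
Exploitation (moves: C, D)
- Payoff pairs as listed on the slide: 1 1; 0 0; 1 2; 0 1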
The Prisoner’s Dilemma
Payoff Matrix (Player A’s payoff in each cell)
                         Player B: Cooperate                Player B: Defect
Player A: Cooperate      R = 3 (Reward for cooperation)     S = 0 (Sucker’s payoff)
Player A: Defect         T = 5 (Temptation to defect)       P = 1 (Punishment for defection)
T > R > P > S
R > (S + T) / 2
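The two conditions can be turned into a small check. This is a sketch only; the function name `is_prisoners_dilemma` is an illustrative choice. The second condition means that two players taking turns exploiting each other (averaging S and T) still do worse than steady mutual cooperation.

```python
def is_prisoners_dilemma(T, R, P, S):
    """Check the two conditions from the slide for one player's payoffs."""
    ordering = T > R > P > S             # temptation > reward > punishment > sucker
    no_gain_from_turns = R > (S + T) / 2 # alternating exploitation must not beat R
    return ordering and no_gain_from_turns

shifted = is_prisoners_dilemma(T=5, R=3, P=1, S=0)
jail_years = is_prisoners_dilemma(T=0, R=-2, P=-4, S=-5)
too_tempting = is_prisoners_dilemma(T=8, R=3, P=1, S=0)

print(shifted)       # True
print(jail_years)    # True: adding 5 to every payoff changes nothing
print(too_tempting)  # False: (0 + 8) / 2 = 4 exceeds R = 3
```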
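The Prisoner’s Dilemma
Payoff Matrix (payoffs for the two players in each cell)
- Both cooperate: 3 and 30
- One cooperates, the other defects: 0 for the cooperator; 5 or 50 for the defector, depending on which player defects
- Both defect: 1 and 10
T > R > P > S
R > (S + T) / 2
Payoffs do not have to be symmetrical.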
PD as a Model for Real-Life Scenarios
Research collaboration
Competitive advertising
“Tragedy of the Commons”
Warfare
Biological relationships
Driving in traffic
Iterated Prisoner’s Dilemma
Two players
Prisoner’s Dilemma played repeatedly
History of previous interactions remembered by each player
No other outside knowledge
Every game uses the same payoff matrix: 3, 3 if both cooperate; 5 for a lone defector and 0 for the cooperator; 1, 1 if both defect.
Iterated Prisoner’s Dilemma
Two-game iteration…
No matter what the other player does, a rational, self-interested player will defect on the second (last) game.
Both players know this, so on the first game, both players will defect, as well.
Iterated Prisoner’s Dilemma
N-game iteration…
The same reasoning applies backward from the last game: each player will defect on game N, so both will defect on game N-1, and so on back to the first game.
A rational, self-interested player should defect all N times.
Iterated Prisoner’s Dilemma
If the number of iterations is uncertain…
Best strategy is no longer clear!
Unlike, e.g., chess, there is no single “best strategy” – it depends on the strategy of the other player.
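A small simulation illustrates why the best strategy depends on the opponent. This is a sketch under stated assumptions: the strategies `always_cooperate`, `always_defect`, and `grudger` are hypothetical examples rather than contest entries, and a fixed 200 moves stands in for a game whose length the players do not know (none of the three strategies uses the length).

```python
# (my move, other's move) -> (my payoff, other's payoff)
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def always_cooperate(own_history, other_history):
    return "C"

def always_defect(own_history, other_history):
    return "D"

def grudger(own_history, other_history):
    """Hypothetical opponent: cooperates until the other player defects once."""
    return "D" if "D" in other_history else "C"

def play(strategy_a, strategy_b, rounds=200):
    """Play an iterated game and return the total scores (A, B)."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

candidates = {"always_defect": always_defect, "always_cooperate": always_cooperate}
vs_cooperator = {n: play(s, always_cooperate)[0] for n, s in candidates.items()}
vs_grudger = {n: play(s, grudger)[0] for n, s in candidates.items()}

print(vs_cooperator)  # {'always_defect': 1000, 'always_cooperate': 600}
print(vs_grudger)     # {'always_defect': 204, 'always_cooperate': 600}
```

Defecting is the better of the two candidates against an unconditional cooperator and the worse one against an opponent that holds a grudge.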
Contest #1
Call for entries to game theorists
All entrants told of preliminary experiments
15 strategies = 14 entries + 1 RANDOM
Round-robin tournament: each strategy plays every other strategy and its own “twin”
Each game 200 iterations
Each pairing run 5 times
Scores averaged over all games
And, the winner is…
TIT FOR TAT
“Cooperate on first move, thereafter reciprocate opponent’s previous action”
Shortest program submitted
Submitted by psychologist Anatol Rapoport
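The rule quoted above fits in one line of code. The following is a minimal sketch, not the program that was submitted; the function names, the `play` helper, and the two comparison strategies are illustrative assumptions, with payoffs taken from the matrix used throughout the talk.

```python
# (my move, other's move) -> (my payoff, other's payoff)
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(own_history, other_history):
    """Cooperate on the first move, then copy the opponent's previous move."""
    return "C" if not other_history else other_history[-1]

def always_defect(own_history, other_history):
    return "D"

def always_cooperate(own_history, other_history):
    return "C"

def play(strategy_a, strategy_b, rounds=200):
    """Play an iterated game and return the total scores (A, B)."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

print(play(tit_for_tat, always_defect))     # (199, 204)
print(play(tit_for_tat, tit_for_tat))       # (600, 600)
print(play(tit_for_tat, always_cooperate))  # (600, 600)
```

In these three pairings TIT FOR TAT never outscores its opponent: it loses narrowly to ALWAYS DEFECT and ties with the cooperative strategies at the mutual-cooperation total.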
Analysis: To Forgive, Divine
Top two rules are willing to cooperate even after defections, if the other player is “contrite”.
DOWNING
- “Kingmaker”
- Tries to learn behavior of other player; starts by defecting twice.
- Hurts strategies that are unforgiving.
Other Interesting Strategies
TIT FOR TWO TATS
- Retaliate only if the other player’s previous two moves are both D’s.
- Could have won the tournament, if entered.
NICE DOWNING
- Like DOWNING, but start with C’s.
- Could have won the tournament, if entered.
Variations on TIT FOR TAT
- Did well, but none beat TIT FOR TAT.
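The difference between TIT FOR TAT and TIT FOR TWO TATS shows up against an opponent that never defects twice in a row. The sketch below assumes a hypothetical `alternator` opponent (D, C, D, C, …) that was not a contest entry; the function names and the `play` helper are likewise illustrative.

```python
# (my move, other's move) -> (my payoff, other's payoff)
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(own_history, other_history):
    return "C" if not other_history else other_history[-1]

def tit_for_two_tats(own_history, other_history):
    """Defect only if the opponent's previous two moves were both D."""
    return "D" if other_history[-2:] == ["D", "D"] else "C"

def alternator(own_history, other_history):
    """Hypothetical opponent: D on odd-numbered moves, C on even-numbered moves."""
    return "D" if len(own_history) % 2 == 0 else "C"

def play(strategy_a, strategy_b, rounds=10):
    """Play an iterated game and return the total scores (A, B)."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

print(play(tit_for_two_tats, alternator))  # (15, 40)
print(play(tit_for_tat, alternator))       # (25, 25)
```

The more forgiving rule never retaliates here and is exploited on every other move, while TIT FOR TAT answers each defection and breaks even.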
Contest #2
Same setup as Contest #1, except…
Entries from first-round contestants as well as open call in magazine
63 strategies = 62 entries + 1 RANDOM
Each game ran for an uncertain number of iterations, with a probability of 0.00346 of ending after each move
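The ending probability determines how long a typical game lasts. The sketch below assumes the end-of-game check is made independently after every move, so that game length follows a geometric distribution; the function names are illustrative.

```python
import math

P_END = 0.00346  # probability that the game ends after any given move

def expected_length(p_end):
    """Mean number of moves when the game ends with probability p_end per move."""
    return 1.0 / p_end

def prob_longer_than(n, p_end):
    """Probability that the game lasts more than n moves."""
    return (1.0 - p_end) ** n

def median_length(p_end):
    """Number of moves n at which prob_longer_than(n, p_end) equals 0.5."""
    return math.log(0.5) / math.log(1.0 - p_end)

print(round(expected_length(P_END)))  # 289
print(round(median_length(P_END)))    # 200
```

Under this assumption, half of all games last about as long as the fixed 200 moves of Contest #1, but no player can know in advance which move will be the last.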
Analysis: Contest #1 Lessons Validated
14 of the top 15 strategies were “nice” (never defected first).
14 of the bottom 15 strategies were not “nice”.
Forgiveness important.
Analysis: Be Retaliatory
Some entries tried to take advantage of “nice” strategies:
TRANQUILIZER – cooperate first; if the other player cooperates too, throw in a few defections.
TESTER – defect first; if the other player doesn’t retaliate, cooperate twice, then alternate defection and cooperation. If the other player ever defects, do TIT FOR TAT.
Strategies that were unresponsive to defections got taken advantage of.
Top strategies retaliate quickly.
Analysis: Sneakiness Doesn’t Pay
Entries that try to take advantage of “nice” strategies don’t gain as much as they lose.
TRANQUILIZER – 27th place in tournament.
TESTER – 46th place in tournament (out of 63).
Robustness of TIT FOR TAT
In six variations of Contest #2, TIT FOR TAT took first place in five and second place in one.
In a population simulation with 63 strategies (right), TIT FOR TAT emerges as the winner.
In a genetic algorithm experiment (1987), TIT-FOR-TAT-like algorithms prevailed.
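The idea of a population simulation can be sketched with a toy model: each generation, a strategy's share of the population grows in proportion to the average score it earns against the current mix. This is an assumption-laden illustration, not the 63-strategy run from the talk; it uses only three hypothetical population members, 200-move games, and an update rule chosen for simplicity.

```python
# (my move, other's move) -> (my payoff, other's payoff)
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(own, other):
    return "C" if not other else other[-1]

def always_defect(own, other):
    return "D"

def always_cooperate(own, other):
    return "C"

def score_against(strategy_a, strategy_b, rounds=200):
    """Total payoff earned by strategy_a in one game against strategy_b."""
    hist_a, hist_b, total = [], [], 0
    for _ in range(rounds):
        move_a, move_b = strategy_a(hist_a, hist_b), strategy_b(hist_b, hist_a)
        total += PAYOFF[(move_a, move_b)][0]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return total

STRATEGIES = {"TFT": tit_for_tat, "ALL_D": always_defect, "ALL_C": always_cooperate}
SCORE = {a: {b: score_against(fa, fb) for b, fb in STRATEGIES.items()}
         for a, fa in STRATEGIES.items()}

def next_generation(shares):
    """New share of each strategy is proportional to share times average score."""
    fitness = {a: sum(shares[b] * SCORE[a][b] for b in shares) for a in shares}
    total = sum(shares[a] * fitness[a] for a in shares)
    return {a: shares[a] * fitness[a] / total for a in shares}

shares = {name: 1 / 3 for name in STRATEGIES}
for _ in range(50):
    shares = next_generation(shares)

print({name: round(share, 3) for name, share in shares.items()})
```

In this toy population ALWAYS DEFECT gains slightly at first by exploiting ALWAYS COOPERATE, then collapses as its victims become scarce, leaving TIT FOR TAT as the largest group.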
Stability of TIT FOR TAT
A population of TIT FOR TAT strategists cannot be invaded by a single strategy.
Nor can a population of ALWAYS DEFECT strategists.
But! A cluster of TIT FOR TATs can invade ALWAYS DEFECT,* while the converse is not true.
* Under certain conditions that imply that the future is sufficiently important for all players.
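The footnote's condition can be made concrete by introducing a parameter w, the probability that two players meet again after each move (an assumed symbol; the slides do not name it). With w fixed, total expected payoffs are geometric sums. The sketch below checks TIT FOR TAT against two challengers, ALWAYS DEFECT and a strategy alternating D and C, and computes how large a cluster of TIT FOR TAT newcomers must be to outscore ALWAYS DEFECT natives. Function names are illustrative.

```python
def tft_stability_threshold(T, R, P, S):
    """Smallest w at which a TIT FOR TAT population outscores both challengers.

    Against ALWAYS DEFECT:  R/(1-w) >= T + w*P/(1-w)    <=>  w >= (T-R)/(T-P)
    Against alternating DC: R/(1-w) >= (T + w*S)/(1-w*w) <=>  w >= (T-R)/(R-S)
    """
    return max((T - R) / (T - P), (T - R) / (R - S))

def min_cluster_share(T, R, P, S, w):
    """Share p of a newcomer's games that must be with fellow TIT FOR TAT players.

    Newcomers earn p*V(TFT|TFT) + (1-p)*V(TFT|ALL_D); natives earn V(ALL_D|ALL_D).
    Returns the p at which the two are equal; newcomers need more than this.
    """
    v_tft_tft = R / (1 - w)            # mutual cooperation forever
    v_tft_alld = S + w * P / (1 - w)   # exploited once, then mutual defection
    v_alld_alld = P / (1 - w)          # mutual defection forever
    return (v_alld_alld - v_tft_alld) / (v_tft_tft - v_tft_alld)

print(round(tft_stability_threshold(5, 3, 1, 0), 3))   # 0.667
print(round(min_cluster_share(5, 3, 1, 0, w=0.9), 4))  # 0.0476
```

With the talk's payoffs, the future is "sufficiently important" once w is at least 2/3, and at w = 0.9 newcomers who play roughly 5% of their games with one another already do better than the surrounding defectors.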
General Lessons
Don’t be envious. (It doesn’t matter if others win.)
- TIT FOR TAT never scores more than the other player.
Be nice. (Don’t defect first.)
- The best way to do well is to cooperate with others who are also nice.
Retaliate swiftly.
- Or, others will take advantage.
Forgive.
- Feuds are costly. Defections shouldn’t prevent cooperation later on.
Don’t be too clever.
- Too much cleverness looks RANDOM.
Trench Warfare
Common form of battle in World War I
Armies in deep trenches on either side of battle line
Machine guns and artillery
Prolonged engagement with same group of enemy troops
Trench Warfare is an IPD
Payoff Matrix
- You cooperate, they cooperate: Both live.
- You shoot to kill, they cooperate: You live and win a medal.
- You cooperate, they shoot to kill: You die and they win.
- You shoot to kill, they shoot to kill: Both die.
For a single round, no matter what the enemy does, it’s better to shoot to kill.
But, for an indefinite number of rounds…?
Trench Warfare
Cooperation spontaneously evolved:
“If the British shelled the Germans, the Germans replied, and the damage was equal.”
“[A British staff officer was] astonished to observe German soldiers walking about within rifle range…”
“These people … did not know there was a war on. Both sides … believed in … ‘live and let live’.”
“Suddenly a salvo arrived but did no damage. Naturally both sides got down and our men started swearing at the Germans, when all at once a brave German got on to his parapet and shouted out ‘We are very sorry about that; we hope no one was hurt. It is not our fault, it is that damned Prussian artillery.’”
Fig Tree and Fig Wasp
Payoff Matrix (wasp’s outcome; tree’s outcome)
- Wasp lays eggs and pollinates, tree lets fig ripen: Die, but have kids; bear fruit and breed good wasps.
- Wasp lays eggs and pollinates, tree cuts off fig: Die, no kids; bear fruit.
- Wasp lays eggs without pollinating, tree lets fig ripen: Live, and have kids; no fruit and breed bad wasps.
- Wasp lays eggs without pollinating, tree cuts off fig: Live, no kids; no fruit.
For a single round, trees should cut off figs, wasps should lay eggs without pollinating.
But, for an indefinite number of rounds…?
Agent-Based Modeling
Simulation as a scientific method
Simulation allows hypothesis discovery, verification, and prediction.
Simulation is particularly valuable when many agents interact and the agents are expected to adapt.
Other Modeled Social Theories
1963 Cyert & March – Behavioral theory of the firm
1974 Schelling – Segregated neighborhoods
1980 Axelrod – Cooperation
2003 Axelrod – Ethnocentrism
Summary
Iterated Prisoner’s Dilemma as a model for many different types of interaction
There is no single optimal strategy in an IPD game, but TIT FOR TAT is strong, robust, and stable.
In real-world IPD scenarios, TIT-FOR-TAT-like strategies naturally evolve, even among antagonists and unintelligent players.
Agent-based modeling is a powerful tool for modeling populations in social and biological sciences.
TIT FOR TAT and Ethics
Mahabharata (~3000 BC)
“One should not behave towards others in a way which is disagreeable to oneself. This is the essence of morality. All other activities are due to selfish desire.”
Hammurabi’s Code (~1750 BC)
“If a man put out the eye of another man, his eye shall be put out. If he break another man's bone, his bone shall be broken.”
The Golden Rule (~30 AD)
“Do unto others as you would have them do unto you.”
Kant’s Categorical Imperative
“Act so that the maxim of action may be capable of becoming a universal law.”
Garrett Hardin (“The Tragedy of the Commons”, 1968)
“Conscience is self-eliminating.”