Multi-agent learning: Emergence of Conventions
Gerard Vreeswijk, Intelligent Systems Group, Computer Science Department, Faculty of Sciences, Utrecht University, The Netherlands.
Last modified on April 3rd, 2014 at 13:17.

Slide 2: Motivation

Slide 3: Simple example of a Markov process

• Return probabilities are usually omitted in diagrams.
• In this case it can be derived that, on average, P(Sun) = 6/7 and P(Rain) = 1/7.
• How? We'll see . . .
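The Sun/Rain answer can be checked numerically. The slide's transition diagram is not reproduced here, so the probabilities below (Sun→Rain = 0.1, Rain→Sun = 0.6) are an assumed choice that is consistent with the stated answer; a minimal sketch:

```python
# Two-state weather chain. ASSUMED transition probabilities (the slide's
# diagram is not reproduced): Sun->Rain = 0.1, Rain->Sun = 0.6.
P = {("Sun", "Sun"): 0.9, ("Sun", "Rain"): 0.1,
     ("Rain", "Sun"): 0.6, ("Rain", "Rain"): 0.4}

# Balance: 0.1 * pi(Sun) = 0.6 * pi(Rain), so pi(Sun) = 6 * pi(Rain).
# Together with pi(Sun) + pi(Rain) = 1 this gives (6/7, 1/7).
pi_sun = 6 / 7
pi_rain = 1 / 7

# Check that (6/7, 1/7) is indeed a fixed point of the transition probabilities.
assert abs(pi_sun * P[("Sun", "Sun")] + pi_rain * P[("Rain", "Sun")] - pi_sun) < 1e-12
assert abs(pi_sun * P[("Sun", "Rain")] + pi_rain * P[("Rain", "Rain")] - pi_rain) < 1e-12
```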

Slide 4: Plan for today

1. Markov processes. (Ergodic process, communicating states/class, transient state/class, recurrent state/class, periodic state/class, absorbing state, irreducible process, stationary distribution.)
   Compute stationary distributions:
   • Solve n linear equations.
   • Compare n so-called z-trees (Freidlin and Wentzell, 1984).
2. Perturbed Markov processes. (Regular perturbed Markov process, punctuated equilibrium, stochastically stable state.)
   Compute stochastically stable states:
   • Compare k so-called z-trees, where k is the number of so-called recurrent classes (Peyton Young, 1993).

Slide 5: Plan for today

3. Applications.
   • Emergence of a currency standard.
   • Competing technologies: operating system A vs. operating system B.
   • Competing technologies: cell phone company A vs. cell phone company B. (If time allows.)
   • Schelling's model of segregation (1969).

Slide 6: Part 1: Markov processes

Slide 7: State transitions

Slide 8: Communication classes

Slide 9: Start state matters

Slide 10: Start state matters . . . but here it does not

Slide 11: The stationary distribution (and computing one)

P(A) = P(A|A′)P(A′) + P(A|B′)P(B′) + P(A|C′)P(C′) + P(A|D′)P(D′)

Let us assume that the visiting probabilities are stationary (A = A′, B = B′, . . . ):

P(A) = P(A|A)P(A) + P(A|B)P(B) + P(A|C)P(C) + P(A|D)P(D)
     = 0 · P(A) + 0 · P(B) + 1 · P(C) + 0 · P(D)
     = P(C)

Let us write this as A = C. Similarly, B = 0.8A, C = D, and D = 0.2A + B.

Four equations with four unknowns. (Always regular, i.e. det ≠ 0?)
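The four equations above can be solved mechanically. A minimal sketch in exact arithmetic: substituting C = A, D = A and B = 0.8A into the normalisation P(A) + P(B) + P(C) + P(D) = 1 gives 3.8A = 1:

```python
from fractions import Fraction

# Stationarity gave A = C, B = 0.8*A, C = D, D = 0.2*A + B.
# Substituting into the normalisation A + B + C + D = 1 yields 3.8*A = 1.
A = 1 / Fraction(19, 5)          # A = 5/19
B = Fraction(4, 5) * A           # B = 0.8*A = 4/19
C = A
D = A

assert A + B + C + D == 1
assert D == Fraction(1, 5) * A + B   # the fourth equation D = 0.2*A + B is consistent
print(A, B, C, D)                    # 5/19 4/19 5/19 5/19
```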

Slide 12: Theory of discrete Markov processes

Definitions:
• Stationary distribution: a fixed point of the transition probabilities.
• Empirical distribution: the long-run normalised frequency of visits.
• Limit distribution: the long-run probability of visiting a node.
• A process is path-dependent if the empirical distribution depends on the start state; ergodic otherwise.
• A class is recurrent if the process cannot escape it; transient otherwise.
• A process is irreducible if all states can reach each other.

Facts:
• If a node is recurrent, the process will return to it almost surely.
• If the number of states is finite:
  – There is at least one recurrence class.
  – There is precisely one recurrence class iff the process is ergodic.
• A stationary distribution always exists. It is unique iff the process is ergodic; in that case the stationary distribution coincides with the empirical distribution.
• If the process is ergodic and aperiodic, then the stationary distribution also coincides with the limit distribution.
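The fact that, for an ergodic chain, the empirical distribution converges to the stationary one can be illustrated by simulation. A sketch with an assumed two-state chain whose stationary distribution is (6/7, 1/7); the start state is deliberately the "wrong" one:

```python
import random

random.seed(0)
P = [[0.9, 0.1],   # state 0: stay with 0.9, leave with 0.1
     [0.6, 0.4]]   # state 1

visits = [0, 0]
state = 1                      # start state does not matter: the chain is ergodic
for _ in range(200_000):
    visits[state] += 1
    state = 0 if random.random() < P[state][0] else 1

empirical = [v / sum(visits) for v in visits]
# Long-run visit frequencies approach the stationary distribution (6/7, 1/7).
assert abs(empirical[0] - 6 / 7) < 0.01
assert abs(empirical[1] - 1 / 7) < 0.01
```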

Slide 13: Finding stationary distributions with many states is difficult

• Solve n equations in n unknowns. What if S is large? For example, a 10 × 10 transition matrix:

  0.1 0.2 0.0 0.1 0.0 0.1 0.0 0.3 0.0 0.2
  0.1 0.2 0.0 0.1 0.0 0.1 0.0 0.3 0.0 0.2
  0.1 0.2 0.0 0.1 0.0 0.1 0.0 0.3 0.0 0.2
  0.0 0.1 0.1 0.2 0.0 0.1 0.0 0.3 0.0 0.2
  0.5 0.2 0.0 0.1 0.0 0.0 0.0 0.0 0.0 0.2
  0.1 0.2 0.0 0.1 0.0 0.1 0.0 0.3 0.0 0.2
  0.0 0.1 0.1 0.2 0.0 0.1 0.0 0.3 0.0 0.2
  0.1 0.2 0.0 0.1 0.0 0.1 0.0 0.3 0.0 0.2
  0.3 0.1 0.2 0.0 0.1 0.0 0.0 0.0 0.3 0.0
  0.1 0.2 0.0 0.1 0.0 0.1 0.0 0.3 0.0 0.2

• Freidlin & Wentzell (1984): only look at so-called state trees.
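For a matrix of this size, "solve n equations" is still feasible numerically. A sketch using plain power iteration (π ← πP, repeated), which converges here because the chain has a single aperiodic recurrent class:

```python
# The 10x10 transition matrix from the slide (each row sums to 1).
P = [
    [0.1, 0.2, 0.0, 0.1, 0.0, 0.1, 0.0, 0.3, 0.0, 0.2],
    [0.1, 0.2, 0.0, 0.1, 0.0, 0.1, 0.0, 0.3, 0.0, 0.2],
    [0.1, 0.2, 0.0, 0.1, 0.0, 0.1, 0.0, 0.3, 0.0, 0.2],
    [0.0, 0.1, 0.1, 0.2, 0.0, 0.1, 0.0, 0.3, 0.0, 0.2],
    [0.5, 0.2, 0.0, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2],
    [0.1, 0.2, 0.0, 0.1, 0.0, 0.1, 0.0, 0.3, 0.0, 0.2],
    [0.0, 0.1, 0.1, 0.2, 0.0, 0.1, 0.0, 0.3, 0.0, 0.2],
    [0.1, 0.2, 0.0, 0.1, 0.0, 0.1, 0.0, 0.3, 0.0, 0.2],
    [0.3, 0.1, 0.2, 0.0, 0.1, 0.0, 0.0, 0.0, 0.3, 0.0],
    [0.1, 0.2, 0.0, 0.1, 0.0, 0.1, 0.0, 0.3, 0.0, 0.2],
]

n = len(P)
pi = [1.0 / n] * n
for _ in range(1000):          # repeatedly apply pi <- pi P
    pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]

# pi is now (numerically) a fixed point of the transition probabilities.
residual = max(abs(sum(pi[i] * P[i][j] for i in range(n)) - pi[j]) for j in range(n))
assert residual < 1e-12
```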

Slide 14: An irreducible (and finite) Markov process

Slide 15: One possible A-tree

Slide 16: Another possible A-tree

Slide 17: A perhaps easier way to compute the stationary distribution

• An s-tree, T_s, is a complete collection of disjoint paths from the states ≠ s to s.
• The likelihood of an s-tree T_s, written ℓ(T_s), is by definition the product of its edge probabilities.
• The likelihood of a state s, written ℓ(s), is by definition the sum of the likelihoods of all s-trees.

Theorem (Freidlin & Wentzell, 1984). Let P be an irreducible finite Markov process. Then, for all states, the likelihood of that state is proportional to the stationary probability of that state.
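The theorem can be verified by brute force on a small chain. A sketch for an assumed 3-state irreducible chain (not the one in the slides' figure): for each root s, enumerate all s-trees (every other state gets exactly one outgoing edge and every state reaches s), sum the edge-probability products, and check that the normalised likelihoods form a stationary distribution:

```python
from itertools import product

# An assumed 3-state irreducible chain.
P = [[0.0, 0.5, 0.5],
     [0.3, 0.0, 0.7],
     [0.4, 0.6, 0.0]]
S = range(3)

def state_likelihood(root):
    """Sum over all root-directed spanning trees of the product of edge probabilities."""
    others = [s for s in S if s != root]
    total = 0.0
    # Each non-root state picks one successor; keep only acyclic choices reaching root.
    for succ in product(S, repeat=len(others)):
        choice = dict(zip(others, succ))
        if any(s == t for s, t in choice.items()):
            continue                             # no self-loops in a tree
        ok = True
        for s in others:                         # follow the path from s; it must hit root
            seen, cur = set(), s
            while cur != root:
                if cur in seen:                  # cycle: not a tree
                    ok = False
                    break
                seen.add(cur)
                cur = choice[cur]
            if not ok:
                break
        if ok:
            p = 1.0
            for s, t in choice.items():
                p *= P[s][t]
            total += p
    return total

v = [state_likelihood(s) for s in S]
mu = [x / sum(v) for x in v]       # normalised likelihoods

# Freidlin-Wentzell: mu is the stationary distribution, i.e. a fixed point of P.
for j in S:
    assert abs(sum(mu[i] * P[i][j] for i in S) - mu[j]) < 1e-12
```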

Slide 18: Counting s-trees with Freidlin & Wentzell: example

Freidlin & Wentzell (1984):

μ(s) = v(s) / ∑_{t∈S} v(t),  where v(t) is by definition the sum of the likelihoods ℓ(T) of all t-trees T.

The unique C-tree is coloured red in the slide's figure. Computing ℓ(T_C) = 10ε · 1/4 · . . . = 5ε³/12. Similarly:

State:       A      B      C       D       E      F     G
Likelihood:  ε²/24  5ε³/9  5ε³/12  5ε²/24  ε²/24  ε/48  ε/32

Note what happens if ε → 0.

Slide 19: Part 2: Perturbed Markov processes

Slide 20: Motivation

Slide 21: Most Markov processes are path-dependent (non-ergodic)

Slide 22: Make them ergodic by perturbing with ε^r(s,s′) here and there

Slide 23: Compute s-trees from P0-recurrent classes only (!)

Slide 24: Compute s-trees from P0-recurrent classes only (!)

Slide 25: Class {B, D, E} possesses the lowest stochastic potential, viz. 4.

Slide 26: Example of P0 and Pε

lim_{ε→0} of

  0.0   0.2          0.2   0.1   0.5
  0.3   ε⁷           0.1   0.1   0.5 − ε⁷
  0.1   0.2          0.2   0.0   0.5
  0.7   0.1          0.2   0.0   0.0
  0.1   0.2 − ε²/2   0.2   ε²    0.5 − ε²/2
  0.0   0.0          0.1   0.0   0.9

equals

  0.0   0.2   0.2   0.1   0.5
  0.3   0.0   0.1   0.1   0.5
  0.1   0.2   0.2   0.0   0.5
  0.7   0.1   0.2   0.0   0.0
  0.1   0.2   0.2   0.0   0.5
  0.0   0.0   0.1   0.0   0.9

• Notice that some P0-positive probabilities "have to give way" to perturb P0-zero probabilities with ε. (Because row probabilities must add up to 1.)

Slide 27: Perturbed Markov processes

• P0 is a Markov process on a finite state space S.
• Let, for each ε ∈ (0, ε*], Pε be a Markov process on the same state space.
• The collection { Pε | ε ∈ (0, ε*] } is a regular perturbation of P0 if:
  1. Each Pε is ergodic.
  2. lim_{ε→0} Pε = P0.
  3. If Pε(s, s′) > 0 for some ε > 0, then

     0 < lim_{ε→0} Pε(s, s′) / ε^r(s,s′) < ∞

     for some r(s, s′) ≥ 0. This number is called the resistance from s to s′.
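Condition 3 says a vanishing entry Pε(s, s′) decays like ε^r(s,s′), so the resistance can be recovered numerically as the slope of log Pε(s, s′) against log ε. A small sketch, using the ε⁷ and ε² entries of the slide-26 example:

```python
import math

def resistance(p_of_eps, eps1=1e-2, eps2=1e-3):
    """Estimate r(s,s') from P_eps(s,s') ~ const * eps^r: slope on a log-log scale."""
    return ((math.log(p_of_eps(eps1)) - math.log(p_of_eps(eps2)))
            / (math.log(eps1) - math.log(eps2)))

r1 = resistance(lambda e: e ** 7)        # the eps^7 entry  -> resistance 7
r2 = resistance(lambda e: e ** 2)        # the eps^2 entry  -> resistance 2
r0 = resistance(lambda e: 0.5 - e ** 7)  # an entry that stays positive -> resistance 0

assert abs(r1 - 7) < 1e-9 and abs(r2 - 2) < 1e-9
assert abs(r0) < 1e-3
```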

Slide 28: Resistance

1. Each Pε is ergodic.
2. lim_{ε→0} Pε = P0.
3. If Pε(s, s′) > 0 for some ε > 0, then 0 < lim_{ε→0} Pε(s, s′) / ε^r(s,s′) < ∞ for some r(s, s′) ≥ 0.
4. For transitions s → s′ where P0(s, s′) = Pε(s, s′) = 0, the resistance is defined to be ∞.

Note:
• The number r(s, s′) is well-defined!
• If P0(s, s′) > 0 then r(s, s′) = 0.
• If r(s, s′) = 0 then P0(s, s′) > 0.

Slide 29: Stochastic stability

• Because each Pε is ergodic, the stationary distribution με is uniquely defined for every ε ∈ (0, ε*].
• It can be shown that lim_{ε→0} με(s) exists for every s. Let us call this distribution μ0.
• A state s is said to be stochastically stable if μ0(s) > 0.

Remarks:
• It can be shown that μ0 is a stationary distribution of P0.
• It follows that every regular perturbed Markov process possesses at least one stochastically stable state.
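A minimal numerical illustration (my own toy example, not from the slides): take P0 with two absorbing states A and B, and perturb with resistance 2 on A → B (probability ε²) and resistance 1 on B → A (probability ε). As ε → 0 the stationary distribution puts all mass on A, so A is the unique stochastically stable state:

```python
def stationary_two_state(p_ab, p_ba):
    """Stationary distribution of a 2-state chain with off-diagonal probs p_ab, p_ba."""
    # Balance for two states: pi_A * p_ab = pi_B * p_ba.
    pi_a = p_ba / (p_ab + p_ba)
    return pi_a, 1 - pi_a

# Perturbation: A -> B with prob eps^2 (resistance 2), B -> A with prob eps (resistance 1).
for eps in (1e-1, 1e-2, 1e-3):
    pi_a, pi_b = stationary_two_state(eps ** 2, eps)
    assert abs(pi_a - 1 / (1 + eps)) < 1e-12   # pi_A = 1/(1+eps) -> 1 as eps -> 0

# In the limit: mu0(A) = 1 > 0, mu0(B) = 0, so only A is stochastically stable.
pi_a, _ = stationary_two_state(1e-6 ** 2, 1e-6)
assert pi_a > 0.999
```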

Slide 30: A way to compute stochastically stable states

• Recurrent classes 1, . . . , K.
• The resistance of a path from i to j is by definition the sum of its edge resistances. (Why the sum?)
• Construct edges r_ij (between classes) with the minimum resistance from i to j.
• The resistance of a j-tree T_j, written r(T_j), is by definition the sum of its edge resistances (in the class graph).
• The stochastic potential of recurrence class j, written p(j), is by definition the minimum resistance over all j-trees.

Theorem (Young, 1993). Let { Pε | ε ∈ (0, ε*] } be a regular perturbed Markov process, and let με be the unique stationary distribution of Pε, ε > 0. Then:
• lim_{ε→0} με = μ0 exists.
• μ0 is a stationary distribution of P0.
• The stochastically stable states are precisely those that are contained in the recurrent class(es) of P0 with minimum stochastic potential.

Slide 31: Minimum path resistance: example

• Compute the path resistance between all K recurrent classes.
• With K recurrent classes there are always K(K − 1) minimum path resistances to be computed. (We work on the complete directed graph on the K classes.)

Example:
• Suppose there are three recurrent classes E1, E2, and E3.
• The minimum path resistances here are 1, 5, 6, 7, 8, 9.
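Given the minimum pairwise resistances, the stochastic potentials follow by enumerating j-trees on the class graph. The slide's figure is not reproduced, so the assignment of the six resistances {1, 5, 6, 7, 8, 9} to ordered pairs below is hypothetical; the enumeration logic itself is general:

```python
from itertools import product

# HYPOTHETICAL assignment of minimum path resistances to ordered class pairs.
r = {(1, 2): 1, (1, 3): 5, (2, 1): 6, (2, 3): 7, (3, 1): 8, (3, 2): 9}
classes = [1, 2, 3]

def _reaches(c, root, choice):
    """Does following successor edges from c lead to root (without cycling)?"""
    seen = set()
    while c != root:
        if c in seen:
            return False
        seen.add(c)
        c = choice[c]
    return True

def stochastic_potential(root):
    """Minimum total resistance over all trees directed into `root`."""
    others = [c for c in classes if c != root]
    best = float("inf")
    for succ in product(classes, repeat=len(others)):
        choice = dict(zip(others, succ))
        if any(c == t for c, t in choice.items()):
            continue
        if all(_reaches(c, root, choice) for c in others):
            best = min(best, sum(r[(c, t)] for c, t in choice.items()))
    return best

p = {c: stochastic_potential(c) for c in classes}
# With this assignment the potentials are p(1)=14, p(2)=9, p(3)=8, so the
# states of class 3 would be the stochastically stable ones.
assert min(p, key=p.get) == 3
```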

Slide 32: Nine j-trees generated by three recurrence classes

Slide 33: Revisit the earlier example

1. The unperturbed Markov process P0 possesses two recurrent classes, viz. E1 = {A} and E2 = {F, G}.
2. The least resistance from E1 to E2 is 10ε · . . . = ε/32. Resistance 1.
3. The least resistance from E2 to E1 is 1/3 · ε · . . . = ε²/24. Resistance 2.
4. There is only one resistance tree to either side, hence one minimum resistance tree.
5. The stochastic potential of E1 is 2; the stochastic potential of E2 is 1.
6. Conclusion: E2 is stochastically stable; E1 is not.

Slide 34: Part 3: Applications

Slide 35: Technology adoption

Payoff table (you choose a row, the other player a column):

              Other: OS A   Other: OS B
  You: OS A   (a, a)        (0, 0)
  You: OS B   (0, 0)        (b, b)

Total number of players: n, for example n = 5.
Sample size: s, for example s = 3.
Total number of players currently playing A: m, for example m = 2.

P( individual chooses A | AABBB ) = 3 / (5 choose 3) = 3/10

P( #A's = k | AABBB ) = (5 choose k) · (3/10)^k · (7/10)^(5−k)

This process is path-dependent (non-ergodic): for example always BABBB, BABBB, etc. → BBBBB. With b ≫ a even BAABB, etc. → BBBBB.
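The probability 3/10 can be checked by enumerating all samples. With a = b, an updating individual best-responds to the (weak) majority of its sample, so in state AABBB it chooses A exactly when its sample of 3 contains both A-players; a sketch:

```python
from fractions import Fraction
from itertools import combinations

state = "AABBB"                 # n = 5 players, m = 2 currently play A
s = 3                           # sample size

samples = list(combinations(range(len(state)), s))
# With a = b, choose A iff a*k >= b*(s-k), i.e. iff 2*k >= s for k A's in the sample.
choose_a = sum(1 for sample in samples
               if 2 * sum(state[i] == "A" for i in sample) >= s)
p_choose_a = Fraction(choose_a, len(samples))

assert len(samples) == 10       # (5 choose 3)
assert p_choose_a == Fraction(3, 10)
```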

Slide 36: Idiosyncratic play in technology adoption

"How, then, might institutional change occur? Because best-response play renders both conventions absorbing states, it is clear that in order to understand institutional change, some kind of nonbest-response play must be introduced. Suppose there is a probability ε that when individuals are in the process of updating, each may switch their type for idiosyncratic reasons. Thus, 1 − ε represents the probability that the individual pursues the best-response updating process described above. The idiosyncratic play accounting for nonbest responses need not be irrational or odd; it simply represents actions whose reasons are not explicitly modeled. Included is experimentation, whim, error, and intentional acts seeking to affect game outcomes but whose motivations are not captured by the above game."

From Microeconomics: Behavior, Institutions, and Evolution (Bowles, 2003).

Slide 37: The tipping effect

Total number of players: n, for example n = 5.
Sample size: s, for example s = 3.
Total number of players currently playing A: m, for example m = 2.

Suppose a = b. Let

  E1 = { s ∈ S | s ∼ AAAAA } = {AAAAA}
  T1 = { s ∈ S | s ∼ AAAAB } = {AAAAB, AAABA, . . . , BAAAA}
  T2 = { s ∈ S | s ∼ AAABB } = {AAABB, AABAB, . . . , BBBAA}
  T3 = { s ∈ S | s ∼ ABBBB }
  E2 = { s ∈ S | s ∼ BBBBB }

• How many idiosyncratic transitions must be made to move from E1 to E2?
• What is the resistance from E1 to E2? And from E2 to E1?
• What is (are) the stochastically stable state(s)?

Slide 38: Tipping point (general case)

• Suppose we're in all-B.
• Generally, an individual will choose A when

  a·k ≥ b·(s − k)  ⇔  k ≥ bs/(a + b).

  (Why "≥" instead of ">"?)
• Thus, ⌈bs/(a + b)⌉ idiosyncratic choices (say, errors) must be made to move from BBBBB . . . into the first transient class that, without further idiosyncrasies, leads to AAAAA.
• With probability ε of an idiosyncratic choice, the probability of this happening is

  (ε/2)^⌈bs/(a+b)⌉.

  Indeed ε/2, if we assume that idiosyncrasy is uniformly distributed among A and B. In that case, half of the idiosyncratic choices are counter-productive again!
• With this payoff matrix, the Pareto-optimal outcome is favoured, provided s is large enough.
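The threshold ⌈bs/(a + b)⌉ and its mirror image ⌈as/(a + b)⌉ make the tipping asymmetry concrete. A sketch with assumed payoffs a = 2, b = 1 (so all-A is the Pareto-optimal convention) and sample size s = 10:

```python
import math

a, b, s = 2, 1, 10   # assumed payoffs and sample size; a > b, so all-A is Pareto-optimal

mistakes_B_to_A = math.ceil(b * s / (a + b))   # idiosyncratic A-choices needed to leave all-B
mistakes_A_to_B = math.ceil(a * s / (a + b))   # idiosyncratic B-choices needed to leave all-A

assert mistakes_B_to_A == 4     # ceil(10/3)
assert mistakes_A_to_B == 7     # ceil(20/3)
# Leaving all-A takes more mistakes than leaving all-B, so for small eps the
# process spends almost all of its time at the Pareto-optimal convention all-A.
assert mistakes_A_to_B > mistakes_B_to_A
```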

Slide 39: Part 4: Schelling's model of segregation

Slide 40: Schelling's model in 2D (torus)

Slide 41: Schelling's model in 1D (circle)

• Schelling (1969, 1971, 1978).
• Isolated people are discontent. (Other people are content.)
• Possible swaps:

  Trade      Profit
  DD → CC     2
  DC → CC     1
  CD → DC     0
  CC → CD    −1
  CC → DD    −2

• This "problem" can be "solved" in "hundreds" of ways. (Analytically, stochastically, whatever.)
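The notion "isolated ⇒ discontent" on a circle is easy to make precise. A sketch (my own encoding with types X and Y, not from the slides): an agent is discontent iff both its neighbours on the circle have the other type, and a swap's profit is the drop in total discontent it causes:

```python
def discontent(circle):
    """Number of isolated agents: both neighbours have the other type."""
    n = len(circle)
    return sum(1 for i in range(n)
               if circle[i] != circle[i - 1] and circle[i] != circle[(i + 1) % n])

def swap_profit(circle, i, j):
    """Decrease in total discontent when the agents at i and j trade places."""
    after = list(circle)
    after[i], after[j] = after[j], after[i]
    return discontent(circle) - discontent(after)

mixed = list("XYXYY")                  # two isolated X's and one isolated Y
assert discontent(mixed) == 3
assert discontent(list("XXYYY")) == 0  # fully segregated: absorbing

# Swapping the isolated Y (position 1) with the X at position 2 segregates everyone.
assert swap_profit(mixed, 1, 2) == 3
```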

Slide 42: Young's take on Schelling's model

• Possible trades:

  Trade      Profit      Probability
  DD → CC     2 − 2m     high
  DC → CC     1 − 2m     high
  CD → DC     0 − 2m     low: ε^a
  CC → CD    −1 − 2m     lower: ε^b
  CC → DD    −2 − 2m     lowest: ε^c

  where 0 < a < b < c, and m are the moving costs.
• The resulting Markov process is ergodic and regular.

Slide 43: Recurrent classes are just {{a} | a ∈ Absorbing}

1. Determine all recurrent classes of P0.
   • All absorbing states A are recurrent.
   • If the process is not in an absorbing state, then a mutually advantageous swap is possible. Thus, every non-absorbing state is transient.

   Therefore, all and only the recurrent classes are singletons of absorbing states: R = {{a} | a ∈ A}.

Slide 44: Completely segregated vs. dispersed states

• Absorbing states are either completely segregated or dispersed: A = S ∪ D.
• For each s, s′ ∈ A, let r(s, s′) be defined as usual.

Claims:
1. If s ∈ D, there does not exist an s-tree from A\{s} with only a-edges.
2. If s ∈ S, there does exist an s-tree from A\{s} with only a-edges.
3. The classes with the lowest potential are L = {{s} | s ∈ S}.

Slide 45: Claim 1

If s ∈ D, there does not exist an s-tree from A\{s} with only a-edges.

Slide 46: Claim 2: a resistance

If s ∈ S, there does exist an s-tree from A\{s} with only a-edges.

Slide 47: Claim 2: a resistance, discontent individual (no problem)

If s ∈ S, there does exist an s-tree from A\{s} with only a-edges.

Slide 48: Absorbing state ⇔ state with low potential

Claim 1: If s ∈ D, there does not exist an s-tree from A\{s} with only a-edges.
• Let s ∈ D. We must show that some edges from A to s have resistance > a.
• Well, edges from S to s, at least, necessarily involve moves that create at least one discontent (= isolated) individual.
• Therefore, all j-trees from A to D have resistance b > a or c > a.

Claim 2: If s ∈ S, there does exist an s-tree from A\{s} with only a-edges.
• Let s ∈ S. We must show that all edges from A to s have resistance a.
  i)  From elements in S to other elements in S: ok! Put head to tail repeatedly.
  ii) From elements in D to elements in S: ok! Put head to tail of small groups repeatedly. If one large cluster, continue as in i).

ii) From elements in D to

elements in S: ok! Put headto tail of small groupsrepeatedly. If one large luster, ontinue as in i).Gerard Vreeswijk. Last modified on April 3rd, 2014 at 13:17 Slide 48