Exploring cultural transmission by iterated learning
Tom Griffiths, Brown University
Mike Kalish, University of Louisiana
With thanks to: Anu Asnaani, Brian Christian, and Alana Firl
Cultural transmission
• Most knowledge is based on secondhand data
• Some things can only be learned from others – cultural knowledge transmitted across generations
• What are the consequences of learners learning from other learners?
Iterated learning (Kirby, 2001)
Each learner sees data, forms a hypothesis, produces the data given to the next learner
Objects of iterated learning
• Knowledge communicated through data
• Examples:
– religious concepts
– social norms
– myths and legends
– causal theories
– language
Analyzing iterated learning
[Diagram: a chain of learners, alternating PL(h|d) and PP(d|h)]
PL(h|d): probability of inferring hypothesis h from data d
PP(d|h): probability of generating data d from hypothesis h
Analyzing iterated learning
What are the consequences of iterated learning?
[Chart: prior analyses of iterated learning, arranged by algorithm complexity (simple vs. complex) and method (simulations vs. analytic results)]
– Complex algorithms, simulations: Kirby (2001); Brighton (2002); Smith, Kirby, & Brighton (2003)
– Simple algorithms, analytic results: Komarova, Niyogi, & Nowak (2002)
– Complex algorithms, analytic results: ?
Bayesian inference
[Portrait: Reverend Thomas Bayes]
• Rational procedure for updating beliefs
• Foundation of many learning algorithms
• Widely used for language learning
Bayes’ theorem
$$P(h \mid d) = \frac{P(d \mid h)\,P(h)}{\sum_{h' \in H} P(d \mid h')\,P(h')}$$
h: hypothesis; d: data
P(h|d): posterior probability; P(d|h): likelihood; P(h): prior probability
The denominator sums over the space of hypotheses H.
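A minimal sketch of this computation over a discrete hypothesis space; the coin-flipping hypotheses, prior, and likelihood values are invented for illustration.

```python
# Bayes' theorem over a discrete hypothesis space.
# All hypotheses, priors, and likelihoods below are illustrative.
from math import comb

def posterior(prior, likelihood, d):
    """P(h|d) = P(d|h)P(h) / sum over h' of P(d|h')P(h')."""
    unnorm = {h: likelihood(d, h) * p for h, p in prior.items()}
    z = sum(unnorm.values())               # sum over the hypothesis space H
    return {h: p / z for h, p in unnorm.items()}

prior = {"fair": 0.7, "biased": 0.3}       # P(h): hypothetical prior
rates = {"fair": 0.5, "biased": 0.9}       # P(heads | h)

def likelihood(d, h):                      # P(d|h): binomial likelihood
    n, k = d                               # n flips, k heads
    return comb(n, k) * rates[h] ** k * (1 - rates[h]) ** (n - k)

print(posterior(prior, likelihood, (3, 3)))  # 3 heads in 3 flips favors "biased"
```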
Iterated Bayesian learning
Learners are Bayesian agents
$$P_L(h \mid d) = \frac{P_P(d \mid h)\,P(h)}{\sum_{h' \in H} P_P(d \mid h')\,P(h')}$$
[Diagram: the iterated learning chain, alternating PL(h|d) and PP(d|h)]
Markov chains
• Variables: x(t+1) is independent of the history given x(t)
• Converges to a stationary distribution under easily checked conditions for ergodicity
[Diagram: chain x → x → x → …]
Transition matrix: T = P(x(t+1)|x(t))
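As a sketch, here is a two-state chain with an invented transition matrix; sampling x(t+1) from T given only x(t) makes the Markov property concrete, and the empirical state frequencies approach the stationary distribution.

```python
# Simulating a Markov chain: x(t+1) depends only on x(t), via T.
# T[i, j] = P(x(t+1) = i | x(t) = j); the values are illustrative.
import numpy as np

rng = np.random.default_rng(0)
T = np.array([[0.9, 0.2],
              [0.1, 0.8]])                 # column-stochastic transition matrix

x = 0                                      # initial state
counts = np.zeros(2)
for _ in range(100_000):
    x = rng.choice(2, p=T[:, x])           # next state depends on current state only
    counts[x] += 1
print(counts / counts.sum())               # approaches [2/3, 1/3], the stationary distribution
```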
Stationary distributions
• Stationary distribution:
$$\pi_i = \sum_j P(x(t+1) = i \mid x(t) = j)\,\pi_j = \sum_j T_{ij}\,\pi_j$$
• In matrix form, $\pi = T\pi$, so the stationary distribution $\pi$ is the first eigenvector of the matrix T
• The second eigenvalue sets the rate of convergence
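A sketch of both points for the same illustrative chain: the stationary distribution is the eigenvector of T with eigenvalue 1, and the second eigenvalue governs how quickly the chain converges to it.

```python
# Stationary distribution as the first eigenvector of T.
import numpy as np

T = np.array([[0.9, 0.2],
              [0.1, 0.8]])                 # same illustrative chain as above

vals, vecs = np.linalg.eig(T)
order = np.argsort(-vals.real)             # eigenvalues, largest first
pi = vecs[:, order[0]].real                # eigenvector for eigenvalue 1
pi /= pi.sum()                             # normalize to a probability distribution
print(pi)                                  # [2/3, 1/3]
print(vals.real[order[1]])                 # second eigenvalue: rate of convergence (0.7)
```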
Analyzing iterated learning
[Diagram: d0 → h1 → d1 → h2 → d2 → h3 → …, alternating PL(h|d) and PP(d|h)]
• A Markov chain on hypotheses: h1 → h2 → h3 → …, with transition probability Σd PL(h'|d) PP(d|h)
• A Markov chain on data: d0 → d1 → d2 → …, with transition probability Σh PP(d'|h) PL(h|d)
• A Markov chain on hypothesis-data pairs: (h1,d1) → (h2,d2) → (h3,d3) → …
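The chain on hypotheses can be built explicitly by composing the learner's posterior with the production probabilities. A sketch with random toy values for the prior and likelihoods (not from any experiment):

```python
# Transition matrix of the chain on hypotheses: T[h', h] = sum_d PL(h'|d) PP(d|h).
import numpy as np

rng = np.random.default_rng(1)
n_h, n_d = 3, 4
prior = rng.dirichlet(np.ones(n_h))            # P(h): toy prior
PP = rng.dirichlet(np.ones(n_d), size=n_h).T   # PP[d, h] = P(d|h); columns sum to 1

PL = (PP * prior).T                            # Bayes: PL[h, d] proportional to PP[d, h] P(h)
PL /= PL.sum(axis=0, keepdims=True)            # normalize over hypotheses

T = PL @ PP                                    # compose production and learning
print(T.sum(axis=0))                           # each column sums to 1: a valid Markov chain
```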
Stationary distributions
• Markov chain on h converges to the prior, p(h)
• Markov chain on d converges to the “prior predictive distribution”:
$$p(d) = \sum_h p(d \mid h)\,p(h)$$
• Markov chain on (h,d) is a Gibbs sampler for
$$p(d,h) = p(d \mid h)\,p(h)$$
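A sketch checking the first two claims numerically on the same kind of toy problem: the leading eigenvector of the hypothesis chain recovers the prior, and that of the data chain recovers the prior predictive distribution.

```python
# Verifying convergence targets for iterated Bayesian learning (toy values).
import numpy as np

rng = np.random.default_rng(1)
n_h, n_d = 3, 4
prior = rng.dirichlet(np.ones(n_h))            # P(h)
PP = rng.dirichlet(np.ones(n_d), size=n_h).T   # PP[d, h] = P(d|h)
PL = (PP * prior).T                            # Bayesian posterior PL[h, d]
PL /= PL.sum(axis=0, keepdims=True)

def stationary(T):
    """Leading eigenvector of a column-stochastic matrix, normalized."""
    vals, vecs = np.linalg.eig(T)
    pi = vecs[:, np.argmax(vals.real)].real
    return pi / pi.sum()

print(np.allclose(stationary(PL @ PP), prior))       # chain on h -> the prior
print(np.allclose(stationary(PP @ PL), PP @ prior))  # chain on d -> prior predictive
```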
Implications
• The probability that the nth learner entertains the hypothesis h approaches p(h) as n → ∞
• Convergence to the prior occurs regardless of:
– the properties of the hypotheses themselves
– the amount or structure of the data transmitted
• The consequences of iterated learning are determined entirely by the biases of the learners
Identifying inductive biases
• Many problems in cognitive science can be formulated as problems of induction
– learning languages, concepts, and causal relations
• Such problems are not solvable without bias (e.g., Goodman, 1955; Kearns & Vazirani, 1994; Vapnik, 1995)
• What biases guide human inductive inferences?
If iterated learning converges to the prior, then it may provide a method for investigating biases
Serial reproduction (Bartlett, 1932)
• Participants see stimuli, then reproduce them from memory
• Reproductions of one participant are stimuli for the next
• Stimuli were interesting, rather than controlled
– e.g., “War of the Ghosts”
Iterated function learning (heavy lifting by Mike Kalish)
• Each learner sees a set of (x,y) pairs
• Makes predictions of y for new x values
• Predictions become the data for the next learner, as in the sketch below
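A minimal sketch of this transmission loop, substituting least-squares line fitting for the human learner; the initial data, learner, and noise level are stand-ins, not the experimental procedure.

```python
# Iterated function learning with a toy learner: fit a line, pass noisy
# predictions on to the next generation.
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0, 1, 20)
y = rng.uniform(0, 1, size=x.shape)            # arbitrary initial (x, y) data

for generation in range(9):
    a, b = np.polyfit(x, y, deg=1)             # learner's hypothesis: a line y = ax + b
    y = a * x + b + rng.normal(0, 0.05, x.shape)  # predictions (plus noise) for the next learner
    print(generation + 1, round(a, 2), round(b, 2))
```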
Function learning experiments
[Display: a stimulus bar, a response slider, and feedback]
Examine iterated learning with different initial data
[Results: functions produced at iterations 1–9, one row per set of initial data]
Iterated concept learning (heavy lifting by Brian Christian)
• Each learner sees examples from a species
• Identifies species of four amoebae
• Iterated learning is run within-subjects
Two positive examples
Bayesian model (Tenenbaum, 1999; Tenenbaum & Griffiths, 2001)
$$P(h \mid d) = \frac{P(d \mid h)\,P(h)}{\sum_{h' \in H} P(d \mid h')\,P(h')}$$
d: 2 amoebae; h: set of 4 amoebae
$$P(d \mid h) = \begin{cases} 1/|h|^m & d \in h \\ 0 & \text{otherwise} \end{cases}$$
m: number of amoebae in the set d (= 2)
|h|: number of amoebae in the set h (= 4)
$$P(h \mid d) = \frac{P(h)}{\sum_{h' : d \in h'} P(h')}$$
The posterior is the renormalized prior.
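A sketch of this model; the universe of six items and the uniform prior are placeholders. Because every hypothesis has |h| = 4, the likelihood is identical for every hypothesis consistent with the data, so the posterior is just the prior restricted to the consistent hypotheses and renormalized.

```python
# Concept-learning model: hypotheses are sets of 4 items, data are positive
# examples, P(d|h) = (1/|h|)^m if d is in h, else 0. Item set is illustrative.
from itertools import combinations

universe = range(6)                            # illustrative set of amoebae
hypotheses = list(combinations(universe, 4))   # all sets of 4 amoebae
prior = {h: 1 / len(hypotheses) for h in hypotheses}  # placeholder uniform prior

def posterior(d, prior):
    """Renormalized prior over hypotheses containing all examples in d."""
    consistent = {h: p for h, p in prior.items() if set(d) <= set(h)}
    z = sum(consistent.values())
    return {h: p / z for h, p in consistent.items()}

post = posterior(d=(0, 1), prior=prior)        # two positive examples
print(len(post), max(post.values()))           # 6 consistent hypotheses, each 1/6
```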
What is the prior?
Classes of concepts (Shepard, Hovland, & Jenkins, 1961)
[Figure: the six classes of concepts, defined over the dimensions shape, size, and color]
Experiment design (for each subject)
• 6 iterated learning chains, one per class (Classes 1–6)
• 6 independent learning “chains”, one per class (Classes 1–6)
Estimating the prior
Estimating the prior
Estimated prior probabilities:
Class 1: 0.861
Class 2: 0.087
Class 3: 0.009
Class 4: 0.002
Class 5: 0.013
Class 6: 0.028
[Scatter: Bayesian model vs. human subjects, r = 0.952]
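If iterated learning converges to the prior, a simple estimator falls out: tabulate the relative frequencies of the hypothesis classes produced late in the chains. A sketch with hypothetical response counts:

```python
# Estimating the prior from late-chain responses (hypothetical data).
from collections import Counter

responses = ["Class 1"] * 52 + ["Class 2"] * 5 + ["Class 6"] * 3  # hypothetical
counts = Counter(responses)
total = sum(counts.values())
prior_estimate = {c: n / total for c, n in counts.items()}
print(prior_estimate)                          # relative frequency per class
```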
Two positive examples (n = 20)
[Plots: probability of each hypothesis across iterations, for human learners and the Bayesian model]
Two positive examples (n = 20)
[Plot: probability of each hypothesis, Bayesian model vs. human learners]
Three positive examples
Three positive examples (n = 20)
[Plots: probability of each hypothesis across iterations, for human learners and the Bayesian model]
Three positive examples (n = 20)
[Plot: probability of each hypothesis, Bayesian model vs. human learners]
Conclusions
• The consequences of iterated learning with Bayesian learners are determined by the biases of the learners
• Consistent results are obtained with human learners
• Provides an explanation for cultural universals…
– universal properties are probable under the prior
– a direct connection between mind and culture
• …and a novel method for evaluating the inductive biases that guide human learning
Discovering the biases of models
Generic neural network: [figure]
Discovering the biases of models
EXAM (DeLosh, Busemeyer, & McDaniel, 1997): [figure]
Discovering the biases of models
POLE (Kalish, Lewandowsky, & Kruschke, 2004): [figure]