Transcript
Page 1:

Bayesian models of human inference

Josh Tenenbaum

MIT

Page 2:

The Bayesian revolution in AI

• Principled and effective solutions for inductive inference from ambiguous data:
– Vision
– Robotics
– Machine learning
– Expert systems / reasoning
– Natural language processing

• Standard view in AI: no necessary connection to how the human brain solves these problems.
– Heuristics & Biases program in the background (“We know people aren’t Bayesian, but…”).

Page 3:

Bayesian models of cognition

Visual perception [Weiss, Simoncelli, Adelson, Richards, Freeman, Feldman, Kersten, Knill, Maloney, Olshausen, Jacobs, Pouget, ...]

Language acquisition and processing [Brent, de Marcken, Niyogi, Klein, Manning, Jurafsky, Keller, Levy, Hale, Johnson, Griffiths, Perfors, Tenenbaum, …]

Motor learning and motor control [Ghahramani, Jordan, Wolpert, Kording, Kawato, Doya, Todorov, Shadmehr, …]

Associative learning [Dayan, Daw, Kakade, Courville, Touretzky, Kruschke, …]

Memory [Anderson, Schooler, Shiffrin, Steyvers, Griffiths, McClelland, …]

Attention [Mozer, Huber, Torralba, Oliva, Geisler, Yu, Itti, Baldi, …]

Categorization and concept learning [Anderson, Nosofsky, Rehder, Navarro, Griffiths, Feldman, Tenenbaum, Rosseel, Goodman, Kemp, Mansinghka, …]

Reasoning [Chater, Oaksford, Sloman, McKenzie, Heit, Tenenbaum, Kemp, …]

Causal inference [Waldmann, Sloman, Steyvers, Griffiths, Tenenbaum, Yuille, …]

Decision making and theory of mind [Lee, Stankiewicz, Rao, Baker, Goodman, Tenenbaum, …]

Page 4:

How to meet up with mainstream JDM (judgment and decision making) research, i.e., heuristics & biases?

1. How to reconcile apparently contradictory messages of H&B and Bayesian models?

Are people Bayesian or aren’t they? When are they, when aren’t they, and why?

2. How to integrate the H&B and Bayesian research approaches?

Page 5:

When are people Bayesian, and why?

• Low-level hypothesis (Shiffrin, Maloney, etc.)
– People are Bayesian in low-level input or output processes that have a long evolutionary history shared with other species, e.g., vision, motor control, memory retrieval.

Page 6:

When are people Bayesian, and why?

• Low-level hypothesis (Shiffrin, Maloney, etc.)
• Information format hypothesis (Gigerenzer)
– Higher-level cognition can be Bayesian when information is presented in formats that we have evolved to process, and that support simple heuristic algorithms; e.g., base-rate neglect disappears with “natural frequencies”.

[Figure: the same diagnosis problem presented with explicit probabilities vs. natural frequencies; a worked version of the two formats follows below.]
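To make the contrast concrete, here is a minimal sketch of the kind of diagnosis problem at issue. The numbers are the standard textbook ones (1% base rate, 80% hit rate, 9.6% false-alarm rate), not values taken from the slides.

```python
# Diagnosis problem, explicit-probability format: apply Bayes' rule directly.
p_disease = 0.01             # base rate
p_pos_given_disease = 0.80   # hit rate of the test
p_pos_given_healthy = 0.096  # false-alarm rate of the test

p_pos = (p_disease * p_pos_given_disease
         + (1 - p_disease) * p_pos_given_healthy)
print(p_disease * p_pos_given_disease / p_pos)  # ~0.078

# Natural-frequency format: the same computation phrased as counts.
# Of 1000 people, 10 have the disease and 8 of them test positive;
# of the 990 healthy people, about 95 test positive.
print(8 / (8 + 95))  # ~0.078
```

Both routes give the same answer, roughly 8%; Gigerenzer’s point is that people reliably compute the count-based version but neglect the base rate in the probability-based one.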

Page 7:

When are people Bayesian, and why?

• Low-level hypothesis (Shiffrin, Maloney, etc.)
• Information format hypothesis (Gigerenzer)
• Core capacities hypothesis
– Bayes can illuminate core human cognitive capacities for inductive inference – learning words and concepts, projecting properties of objects, causal inference, or action understanding: problems we solve effortlessly, unconsciously, and successfully in natural contexts, which any five-year-old solves better than any animal or computer.

Page 8:

When are people Bayesian, and why?

• Low-level hypothesis (Shiffrin, Maloney, etc.)
• Information format hypothesis (Gigerenzer)
• Core capacities hypothesis

Causal induction (Sobel, Griffiths, Tenenbaum, & Gopnik)

[Figure: candidate causal graphs with potential causes A and B and effect E, illustrated with blicket-detector trials: an AB trial followed by an A trial.]

Page 9:

When are people Bayesian, and why?

• Low-level hypothesis (Shiffrin, Maloney, etc.)
• Information format hypothesis (Gigerenzer)
• Core capacities hypothesis

Word learning (Tenenbaum & Xu)

[Figure: a hypothesis space of candidate word meanings, evaluated against the observed data; see the sketch below.]
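A minimal sketch of the Bayesian word-learning idea: hypotheses are nested candidate extensions, and the “size principle” makes a few tightly clustered examples favor the narrowest consistent hypothesis. The toy hypothesis space below is illustrative, not the actual Tenenbaum & Xu space.

```python
# Toy nested hypothesis space for the meaning of a novel word.
hypotheses = {
    "dalmatians": {"dalmatian"},
    "dogs": {"dalmatian", "terrier"},
    "animals": {"dalmatian", "terrier", "cat", "horse"},
}

def posterior(examples):
    """P(h | examples) with a uniform prior and 'size principle' likelihood:
    each example is sampled uniformly from the true extension, so
    P(examples | h) = (1/|h|)^n if all examples fall in h, else 0."""
    scores = {}
    for name, extension in hypotheses.items():
        if all(x in extension for x in examples):
            scores[name] = (1.0 / len(extension)) ** len(examples)
        else:
            scores[name] = 0.0
    z = sum(scores.values())
    return {name: s / z for name, s in scores.items()}

# One 'dalmatian' example is ambiguous between all three hypotheses;
# three such examples make 'dalmatians' far more probable than 'dogs'
# or 'animals' (it would be a suspicious coincidence otherwise).
print(posterior(["dalmatian"]))
print(posterior(["dalmatian", "dalmatian", "dalmatian"]))
```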

Page 10:

When are people Bayesian, and why?

• Low-level hypothesis (Shiffrin, Maloney, etc.)
• Information format hypothesis (Gigerenzer)
• Core capacities hypothesis

– Bayes can illuminate core human cognitive capacities for inductive inference – learning words and concepts, projecting properties of objects, causal inference, or action understanding: problems we solve effortlessly, unconsciously, and successfully in natural contexts, which a five-year-old solves better than any animal or computer.

– The mind is not good at explicit Bayesian reasoning about verbally or symbolically presented statistics, unless core capacities can be engaged.

Page 11:

When are people Bayesian, and why?

• Low-level hypothesis (Shiffrin, Maloney, etc.)
• Information format hypothesis (Gigerenzer)
• Core capacities hypothesis

[Figure (Krynski & Tenenbaum): judgments on the statistical vs. causal versions of the diagnosis problem, showing the proportions of correct answers vs. base-rate neglect.]

Page 12:

How to meet up with mainstream JDM (judgment and decision making) research, i.e., heuristics & biases?

1. How to reconcile apparently contradictory messages of H&B and Bayesian models?

Are people Bayesian or aren’t they? When are they, when aren’t they, and why?

2. How to integrate the H&B and Bayesian research approaches?

Page 13:

Reverse engineering

• The goal is to reverse-engineer human inference.
– A computational understanding of how the mind works and why it works the way it does.

• Even for core inferential capacities, we are likely to observe behavior that deviates from any ideal Bayesian analysis.

• These deviations are likely to be informative about how the mind works.

Page 14:

Analogy to visual illusions

(Shepard)

• Highlight the problems the visual system is designed to solve: inferring world structure from images, not judging properties of the images themselves.

• Reveal the visual system’s implicit assumptions about the physical world and the processes of image formation that are needed to solve these problems.

(Adelson)

Page 15:

How do we interpret deviations from a Bayesian analysis?

• H&B: People aren’t Bayesian, but use some other means of inference.
– Base-rate neglect: representativeness heuristic
– Recency bias: availability heuristic
– Order-of-evidence effects: anchoring and adjustment
– …

• Not so compelling as reverse engineering.
– What engineer would want to design a system based on “representativeness”, without knowing how it is computed, why it is computed that way, what problem it attempts to solve, when it works, or how its accuracy and efficiency compare to some ideal computation or other heuristics?

Page 16:

How do we interpret deviations from a Bayesian analysis?

Multiple levels of analysis (Marr)

• Computational theory
– What is the goal of the computation – the outputs and available inputs? What is the logic by which the inference can be performed? What constraints (prior knowledge) do people assume to make the solution well-posed?

• Representation and algorithm
– How is the information represented? How is the computation carried out algorithmically, approximating the ideal computational theory with realistic time & space resources?

• Hardware implementation

Page 17:

How do we interpret deviations from a Bayesian analysis?

Multiple levels of analysis (Marr)

• Computational theory
– What is the goal of the computation – the outputs and available inputs? What is the logic by which the inference can be performed? What constraints (prior knowledge) do people assume to make the solution well-posed?

• Representation and algorithm
– How is the information represented? How is the computation carried out algorithmically, approximating the ideal computational theory with realistic time & space resources?

• Hardware implementation

Bayes (at the level of computational theory)

Page 18:

Different philosophies

• H&B
– One canonical Bayesian analysis of any given task, and we know what it is.
– The ideal Bayesian solution can be computed.
– The question “Are people Bayesian?” is empirically meaningful on any given task.

• Bayes+Marr
– Many possible Bayesian analyses of any given task, and we need to discover which best characterize cognition.
– The ideal Bayesian solution can only be approximately computed.
– The question “Are people Bayesian?” is not an empirical one, at least not for an individual task. Bayes is a framework-level assumption, like distributed representations in connectionism or condition-action rules in ACT-R.

Page 19:

How do we interpret deviations from a Bayesian analysis?

Multiple levels of analysis (Marr)

• Computational theory
– What is the goal of the computation – the outputs and available inputs? What is the logic by which the inference can be performed? What constraints (prior knowledge) do people assume to make the solution well-posed?

• Representation and algorithm
– How is the information represented? How is the computation carried out algorithmically, approximating the ideal computational theory with realistic time & space resources?

• Hardware implementation

Page 20:

The centrality of causal inference

(Griffiths & Tenenbaum)

• In visual perception:

– Judge P(scene|image features) rather than P(image features|scene) or P(image features|other image features).

• Coin flipping: Which sequence is more likely to come from flipping a fair coin, HHTHT or HHHHH?

• Coincidences: How likely is it that 2 people at a random party of 25 share the same birthday? That 3 people share one at a party of 10? (A quick simulation follows below.)
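Neither answer appears on the slide; a quick Monte Carlo sketch (our own illustration) shows why the first event is unremarkable while the second is a genuine coincidence.

```python
import random
from collections import Counter

def p_shared_birthday(n_people, k_share, trials=100_000, seed=0):
    """Estimate the probability that at least k_share of n_people
    share a birthday, assuming 365 equally likely days."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        counts = Counter(rng.randrange(365) for _ in range(n_people))
        if max(counts.values()) >= k_share:
            hits += 1
    return hits / trials

print(p_shared_birthday(25, 2))  # ~0.57: more likely than not
print(p_shared_birthday(10, 3))  # ~0.001: genuinely surprising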

Page 21:

(Griffiths & Tenenbaum)

Rational measure of evidential support: the likelihood ratio

    P(data | h1) / P(data | h0)

Judgments of randomness:

    P(random | data) ∝ P(data | random) / P(data | regular)

Judgments of coincidence:

    P(regular | data) ∝ P(data | regular) / P(data | random)
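As a concrete instance, here is a minimal sketch of the support ratio for the two coin sequences from page 20. The “regular” alternative used here — a coin of unknown bias with a uniform prior, integrated out — is our own assumption; the slide does not specify a regular model.

```python
from math import factorial

def p_data_random(seq):
    """Likelihood under the 'random' hypothesis: a fair coin."""
    return 0.5 ** len(seq)

def p_data_regular(seq):
    """Likelihood under one simple 'regular' hypothesis (an assumption
    here): a coin of unknown bias theta with a uniform prior,
    integrated out analytically:
    integral of theta^h (1-theta)^t dtheta = h! t! / (h+t+1)!."""
    h, t = seq.count("H"), seq.count("T")
    return factorial(h) * factorial(t) / factorial(h + t + 1)

for seq in ["HHTHT", "HHHHH"]:
    support = p_data_random(seq) / p_data_regular(seq)
    print(seq, round(support, 3))
# HHTHT 1.875 -> the data favor 'random'
# HHHHH 0.188 -> the data favor 'regular'
```

This matches the intuition behind the coin-flipping question: HHHHH is exactly as probable as HHTHT under a fair coin, but it is far better explained by a regular process.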

Page 22:

How do we interpret deviations from a Bayesian analysis?

Multiple levels of analysis (Marr)

• Computational theory
– What is the goal of the computation – the outputs and available inputs? What is the logic by which the inference can be performed? What constraints (prior knowledge) do people assume to make the solution well-posed?

• Representation and algorithm
– How is the information represented? How is the computation carried out algorithmically, approximating the ideal computational theory with realistic time & space resources?

• Hardware implementation

Page 23:

Assuming the world is simple

• In visual perception:
– “Slow and smooth” prior on visual motion.

• Causal induction:
– P(blicket) = 1/6, deterministic “activation law”.

[Figure: blicket-detector trials. After an AB trial and then an A trial: P(A is a blicket | data) = 1, P(B is a blicket | data) ≈ 1/6. After an AB trial and then an AC trial: P(A is a blicket | data) ≈ 3/4, P(B is a blicket | data) ≈ 1/4. A sketch of the first computation follows below.]
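The first pair of numbers follows directly from the 1/6 prior plus the deterministic activation law. A minimal enumeration sketch (our own, with hypothetical helper names) reproduces them:

```python
from itertools import product

P_BLICKET = 1 / 6  # prior from the slide

def activates(on_detector, blicket):
    """Deterministic 'activation law': the detector fires iff at least
    one object placed on it is a blicket."""
    return any(blicket[o] for o in on_detector)

def posterior(trials):
    """Enumerate all blicket assignments; keep those consistent with
    the observed trials (all of which activated the detector)."""
    objects = sorted({o for trial in trials for o in trial})
    post = {o: 0.0 for o in objects}
    z = 0.0
    for bits in product([False, True], repeat=len(objects)):
        blicket = dict(zip(objects, bits))
        prior = 1.0
        for o in objects:
            prior *= P_BLICKET if blicket[o] else 1 - P_BLICKET
        if all(activates(t, blicket) for t in trials):
            z += prior
            for o in objects:
                if blicket[o]:
                    post[o] += prior
    return {o: p / z for o, p in post.items()}

# Backward blocking: an AB trial, then an A trial, both activating.
print(posterior([("A", "B"), ("A",)]))
# {'A': 1.0, 'B': 0.1666...} -- the slide's P(A) = 1, P(B) = 1/6
```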

Page 24:

Recognizing the world is complex (Kemp & Tenenbaum)

• In visual perception:
– Need uncertainty about the coherence ratio and velocity of coherent motion. (Lu & Yuille)

• Property induction:
– Properties should be distributed stochastically over a tree structure, not just focused on single branches.

Example argument: Gorillas have T9 cells. Seals have T9 cells. Therefore, horses have T9 cells.

Bayes with a single-branch prior: r = 0.50

Page 25:

Recognizing the world is complex (Kemp & Tenenbaum)

• In visual perception:
– Need uncertainty about the coherence ratio and velocity of coherent motion. (Lu & Yuille)

• Property induction:
– Properties should be distributed stochastically over a tree structure, not just focused on single branches.

Example argument: Gorillas have T9 cells. Seals have T9 cells. Therefore, horses have T9 cells.

Bayes with a “mutation” prior: r = 0.92 (a sketch of such a prior follows below)
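A minimal sketch of what a “mutation” prior can look like: the property switches on or off along the branches of a tree, so it tends to cluster on branches but can also appear in scattered species. The tree, the rate, and all names below are illustrative, not Kemp & Tenenbaum’s actual model.

```python
import math
import random

# A toy tree for illustration: node -> (parent, branch_length).
# Parents are listed before their children.
TREE = {
    "root":      (None, 0.0),
    "primates":  ("root", 1.0),
    "gorilla":   ("primates", 1.0),
    "ungulates": ("root", 1.0),
    "horse":     ("ungulates", 1.0),
    "cow":       ("ungulates", 1.0),
    "seal":      ("root", 2.0),
}

def sample_property(rate=0.3, rng=random):
    """Sample a true/false labeling of every node under a two-state
    'mutation' process: along a branch of length t, the property flips
    with probability (1 - exp(-2*rate*t)) / 2."""
    state = {}
    for node, (parent, t) in TREE.items():
        if parent is None:
            state[node] = rng.random() < 0.5  # root state: a coin flip
        else:
            p_flip = (1 - math.exp(-2 * rate * t)) / 2
            state[node] = state[parent] != (rng.random() < p_flip)
    return state

def p_conclusion(premises, conclusion, trials=100_000, seed=0):
    """Monte Carlo estimate of P(conclusion | premises) by rejection:
    keep only sampled labelings consistent with the premises."""
    rng = random.Random(seed)
    kept = hits = 0
    for _ in range(trials):
        s = sample_property(rng=rng)
        if all(s[x] for x in premises):
            kept += 1
            hits += s[conclusion]
    return hits / kept

# 'Gorillas and seals have it' spans distant branches, so the property
# looks widely distributed and generalizes fairly strongly to horses.
print(p_conclusion(["gorilla", "seal"], "horse"))
```

Unlike a single-branch prior, this process gives every labeling of the species some probability, with nearby species more likely to share the property.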

Page 26:

[Figure (Kemp & Tenenbaum): four kinds of properties — “has T9 hormones”, “is found near Minneapolis”, “can bite through wire”, “carry E. Spirus bacteria” — each suggesting a different structured prior over how the property is distributed.]

Page 27:

How do we interpret deviations from a Bayesian analysis?

Multiple levels of analysis (Marr)

• Computational theory
– What is the goal of the computation – the outputs and available inputs? What is the logic by which the inference can be performed? What constraints (prior knowledge) do people assume to make the solution well-posed?

• Representation and algorithm
– How is the information represented? How is the computation carried out algorithmically, approximating the ideal computational theory with realistic time & space resources?

• Hardware implementation

Page 28:

Sampling-based approximate inference (Griffiths et al., Goodman et al.)

• In visual perception:
– Temporal dynamics of bi-stability due to fast sampling-based approximation of a bimodal posterior (Schrater & Sundareswara).

• Order effects in category learning:
– Particle filter (sequential Monte Carlo), an online approximate inference algorithm assuming stationarity. (A minimal particle filter is sketched below.)

• Probability matching in classification decisions:
– Sampling-based approximations with guarantees of near-optimal generalization performance.
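A minimal particle-filter sketch (our own, not the Griffiths et al. model) shows how an online sampling approximation can produce order effects that exact Bayes would not: with few particles, early observations prune the sample set irreversibly.

```python
import random

def particle_filter_mean(data, n_particles, seed=0):
    """Sequential Monte Carlo estimate of a coin's bias theta from a
    stream of 0/1 observations, processed one at a time."""
    rng = random.Random(seed)
    particles = [rng.random() for _ in range(n_particles)]  # theta ~ U(0,1)
    for x in data:
        # Weight each particle by the likelihood of the new observation...
        weights = [th if x == 1 else 1.0 - th for th in particles]
        # ...then resample particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return sum(particles) / len(particles)

ones_first = [1] * 10 + [0] * 10
zeros_first = [0] * 10 + [1] * 10

# Exact Bayes gives the same posterior for both orders (the data are
# exchangeable); a small particle set does not.
for n in (5, 10_000):
    print(n, particle_filter_mean(ones_first, n),
             particle_filter_mean(zeros_first, n))
```

With 5 particles the two orders give visibly different estimates even though the exact posterior is identical; with 10,000 particles the estimates converge.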

Page 29:

Conclusions

• “Are people Bayesian?”, “When are they Bayesian?”
– Maybe not the most interesting questions in the long run…

• What is the best way to reverse-engineer cognition at multiple levels of analysis? Assuming core inductive capacities are approximately Bayesian at the computational-theory level offers several benefits:
– Explanatory power: why does cognition work?
– Fewer degrees of freedom in modeling
– A bridge to state-of-the-art AI and machine learning
– Tools to study the big questions: What are the goals of cognition? What does the mind know about the world? How is that knowledge represented? What are the processing mechanisms and why do they work as they do?