Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition

Dec 20, 2015

Page 1: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Tom Griffiths

CogSci C131/Psych C123
Computational Models of Cognition

Page 2: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Tom Griffiths

CogSci C131/Psych C123
Computational Models of Cognition

Page 3: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Computation Cognition


Page 4: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Cognitive science

• The study of intelligent systems

• Cognition as information processing

[Figure: two systems, each mapping input to output]

Page 5: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Computational modeling

Look for principles that characterize both computation and cognition

[Figure: two systems, each mapping input to output by computation]

Page 6: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Two goals

• Cognition:
  – explain human cognition (and behavior) in terms of the underlying computation

• Computation:
  – gain insight into how to solve some challenging computational problems

Page 7: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Computational problems

• Easy:
  – arithmetic, algebra, chess

• Difficult:
  – learning and using language
  – sophisticated senses: vision, hearing
  – similarity and categorization
  – representing the structure of the world
  – scientific investigation

human cognition sets the standard

Page 8: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Three approaches

Rules and symbols

Networks, features, and spaces

Probability and statistics

Page 9: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Three approaches

Rules and symbols

Networks, features, and spaces

Probability and statistics

Page 10: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Logic

All As are Bs
All Bs are Cs
All As are Cs

Aristotle (384-322 BC)

Page 11: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Mechanical reasoning

Ramon Llull (1232-1315)

Page 12: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

The mathematics of reason

Thomas Hobbes (1588-1679)

René Descartes (1596-1650)

Gottfried Leibniz (1646-1716)

Page 13: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Modern logic

George Boole (1815-1864)

Gottlob Frege (1848-1925)

P → Q
P
Q

Page 14: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Computation

Alan Turing (1912-1954)

Page 15: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

The World

Logic

P → Q

P

Q

Facts Inference Rules
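The inference pattern on this slide (from P → Q and P, conclude Q) can be applied mechanically. The following is a minimal sketch, assuming facts are plain strings and each rule is a (premise, conclusion) pair; the function name `forward_chain` and the extra rule Q → R are illustrative and not from the slides.

```python
def forward_chain(facts, rules):
    """Repeatedly apply modus ponens until no new facts can be derived.

    facts: iterable of known propositions (strings).
    rules: list of (premise, conclusion) pairs, read as premise -> conclusion.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            # Modus ponens: premise holds and premise -> conclusion, so conclusion holds.
            if premise in known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical knowledge base: the fact P and the rules P -> Q and Q -> R.
derived = forward_chain({"P"}, [("P", "Q"), ("Q", "R")])
```

Here `derived` ends up containing P, Q, and R: Q follows directly from the slide's rule, and R from chaining a second rule.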

Page 16: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Categorization

Page 17: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Categorization

cat small furry domestic carnivore


Page 18: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

The World

Logic

P → Q

P

Q

Facts Inference Rules

Page 19: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

The World

Early AI systems…

Workspace Operations

Actions

Observations

Facts

Goals

Operations

Page 20: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Rules and symbols

• Perhaps we can consider thought a set of rules, applied to symbols…
  – generating infinite possibilities with finite means
  – characterizing cognition as a “formal system”

• This idea was applied to:
  – deductive reasoning (logic)
  – language (generative grammar)
  – problem solving and action (production systems)

Page 21: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Language as a formal system


Noam Chomsky

Page 22: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Language

“a set (finite or infinite) of sentences, each finite in length and constructed out of a finite set of elements”

[Figure: the language L as a subset of all sequences]

This is a good sentence: 1
Sentence bad this is: 0

Linguistic analysis aims to separate the grammatical sequences, which are sentences of L, from the ungrammatical sequences, which are not.

Page 23: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

A context free grammar

S → NP VP
NP → T N
VP → V NP
T → the
N → man, ball, …
V → hit, took, …

[Tree: [S [NP [T the] [N man]] [VP [V hit] [NP [T the] [N ball]]]]]
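To make "infinite possibilities with finite means" concrete, the rewrite rules can be stored as a table and expanded mechanically. This is a rough sketch: the dictionary layout and the name `expand` are illustrative choices, and the lexicon is restricted to the words actually listed on the slide (the slide's "…" is left open).

```python
from itertools import product

# Rewrite rules from the slide; each symbol maps to its possible right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["T", "N"]],
    "VP": [["V", "NP"]],
    "T": [["the"]],
    "N": [["man"], ["ball"]],
    "V": [["hit"], ["took"]],
}


def expand(symbol):
    """Return every word sequence derivable from `symbol`."""
    if symbol not in GRAMMAR:
        return [[symbol]]  # a terminal word
    results = []
    for rhs in GRAMMAR[symbol]:
        # Expand each symbol on the right-hand side, then combine the pieces.
        parts = [expand(s) for s in rhs]
        for combo in product(*parts):
            results.append([word for part in combo for word in part])
    return results


sentences = [" ".join(words) for words in expand("S")]
```

With this restricted lexicon the grammar has no recursion, so the expansion terminates and `sentences` includes "the man hit the ball". A recursive rule would need a depth limit.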

Page 24: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Rules and symbols

• Perhaps we can consider thought a set of rules, applied to symbols…
  – generating infinite possibilities with finite means
  – characterizing cognition as a “formal system”

• This idea was applied to:
  – deductive reasoning (logic)
  – language (generative grammar)
  – problem solving and action (production systems)

• Big question: what are the rules of cognition?

Page 25: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Computational problems

• Easy:
  – arithmetic, algebra, chess

• Difficult:
  – learning and using language
  – sophisticated senses: vision, hearing
  – similarity and categorization
  – representing the structure of the world
  – scientific investigation

human cognition sets the standard

Page 26: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Inductive problems

• Drawing conclusions that are not fully justified by the available data
  – e.g. detective work

• Much more challenging than deduction!


“In solving a problem of this sort, the grand thing is to be able to reason backward. That is a very useful accomplishment, and a very easy one, but people do not practice it much.”

Page 27: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Challenges for symbolic approaches

• Learning systems of rules and symbols is hard!
  – some people who think of human cognition in these terms end up arguing against learning…

Page 28: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

The poverty of the stimulus

• The rules and principles that constitute the mature system of knowledge of language are actually very complicated

• There isn’t enough evidence to identify these principles in the data available to children

Therefore:

• Acquisition of these rules and principles must be a consequence of the genetically determined structure of the language faculty

Page 29: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

The poverty of the stimulus


Learning language requires strong constraints on the set of possible languages

These constraints are “Universal Grammar”

Page 30: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Challenges for symbolic approaches

• Learning systems of rules and symbols is hard!
  – some people who think of human cognition in these terms end up arguing against learning…

• Many human concepts have fuzzy boundaries
  – notions of similarity and typicality are hard to reconcile with binary rules

• Solving inductive problems requires dealing with uncertainty and partial knowledge

Page 31: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Three approaches

Rules and symbols

Networks, features, and spaces

Probability and statistics

Page 32: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Similarity

What determines similarity?


Page 33: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Representations

What kind of representations are used by the human mind?


Page 34: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.


Semantic networks Semantic spaces

Representations

How can we capture the meaning of words?

Page 35: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Categorization

Page 36: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Computing with spaces

[Figure: a unit taking perceptual features x1, x2 as input and producing output y (+1 = cat, -1 = dog); dogs and cats plotted in the (x1, x2) feature space]

$y = g(Wx)$

error: $E = (y - g(Wx))^2$
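As a rough illustration of the two formulas on this slide, the sketch below computes the output g(Wx) and the squared error for a single unit with two inputs. The slide does not say which function g is; tanh is assumed here because its range matches the +1/-1 coding, and the function names and weight values are illustrative.

```python
import math


def predict(weights, x, g=math.tanh):
    """Output of a single unit: g(Wx), with W a weight vector and x the features."""
    return g(sum(w * xi for w, xi in zip(weights, x)))


def squared_error(y, weights, x, g=math.tanh):
    """E = (y - g(Wx))^2 for one training example with target y."""
    return (y - predict(weights, x, g)) ** 2


# Hypothetical weights and features; the target +1 stands for "cat".
weights = [2.0, -2.0]
features = (1.0, 0.2)
error = squared_error(1.0, weights, features)
```

A positive output reads as "cat" and a negative one as "dog"; the error shrinks toward 0 as g(Wx) approaches the target.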

Page 37: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Networks, features, and spaces

• Artificial neural networks can represent any continuous function…

Page 38: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Problems with simple networks

[Figure: single-layer network with inputs x1, x2 and output y]

Some kinds of data are not linearly separable

[Figure: AND, OR, and XOR plotted over inputs x1 and x2]
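One way to see the difference between the three functions is to search for a single threshold unit that computes each of them. The sketch below tries every weight and bias combination on a small grid; the grid, the names, and the 0/1 coding are assumptions made for illustration. A grid search is not a proof, although for XOR it is a known result that no single linear threshold unit exists.

```python
from itertools import product

INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
TARGETS = {
    "AND": [0, 0, 0, 1],
    "OR": [0, 1, 1, 1],
    "XOR": [0, 1, 1, 0],
}
# Candidate values for w1, w2 and the bias b: -2.0, -1.5, ..., 2.0.
GRID = [v / 2 for v in range(-4, 5)]


def find_threshold_unit(targets):
    """Return (w1, w2, b) such that [w1*x1 + w2*x2 + b > 0] matches targets, or None."""
    for w1, w2, b in product(GRID, repeat=3):
        outputs = [int(w1 * x1 + w2 * x2 + b > 0) for x1, x2 in INPUTS]
        if outputs == targets:
            return (w1, w2, b)
    return None


solutions = {name: find_threshold_unit(t) for name, t in TARGETS.items()}
```

The search finds a separating line for AND and for OR, and returns `None` for XOR.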

Page 39: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

A solution: multiple layers

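[Figure: network with an input layer (x1, x2), a hidden layer (z1, z2), and an output layer (y); the hidden units re-represent the inputs so that y can be computed from z1 and z2]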
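A hidden layer is enough to compute XOR. The sketch below uses hand-picked weights rather than learned ones, and threshold units rather than a smooth g; both are simplifying assumptions. z1 acts as OR, z2 as AND, and the output fires when z1 is on and z2 is off.

```python
def step(v):
    """Threshold unit: 1 if the net input is positive, else 0."""
    return int(v > 0)


def xor_net(x1, x2):
    # Hidden layer (weights chosen by hand for illustration).
    z1 = step(x1 + x2 - 0.5)  # behaves like OR
    z2 = step(x1 + x2 - 1.5)  # behaves like AND
    # Output layer: on when z1 is on and z2 is off.
    return step(z1 - z2 - 0.5)


truth_table = {(a, b): xor_net(a, b) for a in (0, 1) for b in (0, 1)}
```

In the (z1, z2) space the four input patterns become linearly separable, which is what the output unit exploits.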

Page 40: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Networks, features, and spaces

• Artificial neural networks can represent any continuous function…

• Simple algorithms for learning from data
  – fuzzy boundaries
  – effects of typicality

Page 41: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

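[Figure: error E plotted against a weight w_ij, marking where $\partial E / \partial w_{ij} < 0$, $= 0$, and $> 0$]

$\Delta w_{ij} = -\eta \, \frac{\partial E}{\partial w_{ij}}$   ($\eta$ is the learning rate)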

General-purpose learning mechanisms
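The update rule on this slide moves a weight against the slope of the error: where the derivative is negative the weight increases, where it is positive the weight decreases, and where it is zero the weight stays put. A minimal sketch on a one-dimensional error surface follows; the quadratic E(w) = (w - 3)^2 and the settings for the learning rate and step count are made up for illustration.

```python
def gradient_descent(grad, w, eta=0.1, steps=100):
    """Repeatedly apply  w <- w - eta * dE/dw."""
    for _ in range(steps):
        w = w - eta * grad(w)
    return w


def error(w):
    """Hypothetical error surface with its minimum at w = 3."""
    return (w - 3.0) ** 2


def error_grad(w):
    """Derivative of the hypothetical error: dE/dw = 2 (w - 3)."""
    return 2.0 * (w - 3.0)


w_final = gradient_descent(error_grad, w=0.0)
```

Starting from w = 0 the slope is negative, so each step increases w; the steps shrink as the slope approaches zero near w = 3.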

Page 42: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

The Delta Rule

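[Figure: a unit taking perceptual features x1, x2 as input and producing output y; +1 = cat, -1 = dog]

$E = (y - g(Wx))^2$

$\Delta w_{ij} = -\eta \, \frac{\partial E}{\partial w_{ij}}$

$\frac{\partial E}{\partial w_{ij}} = -2 \, (y - g(Wx)) \, g'(Wx) \, x_j$

$\Delta w_{ij} = \eta \, (y - g(Wx)) \, g'(Wx) \, x_j$   (the constant factor 2 is absorbed into $\eta$)

Here $(y - g(Wx))$ is the output error and $g'(Wx) \, x_j$ is the influence of the input.

This holds for any function $g$ with derivative $g'$.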
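The delta rule can be run directly on a toy version of the cat/dog problem. This is a sketch under stated assumptions: g is taken to be tanh (so g' = 1 - g^2), there is no bias weight, and the four training points, the learning rate, and the number of epochs are invented for illustration.

```python
import math

# Hypothetical training set: (x1, x2) perceptual features, label +1 = cat, -1 = dog.
DATA = [
    ((1.0, 0.2), 1.0),
    ((0.8, 0.1), 1.0),
    ((0.1, 0.9), -1.0),
    ((0.2, 1.0), -1.0),
]


def train_delta_rule(data, eta=0.5, epochs=50):
    """Online delta rule: w_j += eta * (y - g(Wx)) * g'(Wx) * x_j."""
    weights = [0.0, 0.0]
    for _ in range(epochs):
        for x, y in data:
            net = sum(w * xi for w, xi in zip(weights, x))
            out = math.tanh(net)    # g(Wx)
            slope = 1.0 - out ** 2  # g'(Wx) for tanh
            for j, xj in enumerate(x):
                weights[j] += eta * (y - out) * slope * xj
    return weights


weights = train_delta_rule(DATA)
predictions = [math.tanh(sum(w * xi for w, xi in zip(weights, x))) for x, _ in DATA]
```

After training, the sign of each prediction matches its label: the weight on x1 ends up positive and the weight on x2 negative, reflecting which feature is larger for cats and which for dogs in this made-up data.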

Page 43: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Networks, features, and spaces

• Artificial neural networks can represent any continuous function…

• Simple algorithms for learning from data
  – fuzzy boundaries
  – effects of typicality

• A way to explain how people could learn things that look like rules and symbols…

Page 44: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Simple recurrent networks

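[Figure: simple recurrent network (Elman, 1990). The input x(i) feeds a hidden layer (z1, z2), which feeds an output layer producing x(i+1). The hidden activations are copied to context units, which are fed back to the hidden layer together with the next input.]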
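The copy-back mechanism can be sketched as a forward pass in which the hidden activations from one step become the context for the next. This is an illustration only: the weights are fixed by hand, training is omitted, and tanh hidden units with linear outputs are assumptions rather than details from the slide.

```python
import math


def srn_step(x, context, w_in, w_ctx, w_out):
    """One time step: hidden = tanh(W_in x + W_ctx context), output = W_out hidden."""
    hidden = [
        math.tanh(
            sum(w * v for w, v in zip(w_in[j], x))
            + sum(w * c for w, c in zip(w_ctx[j], context))
        )
        for j in range(len(w_in))
    ]
    output = [sum(w * h for w, h in zip(row, hidden)) for row in w_out]
    return output, hidden  # hidden is copied into the context units


def run_srn(sequence, w_in, w_ctx, w_out):
    context = [0.0] * len(w_in)  # context units start empty
    outputs = []
    for x in sequence:
        y, context = srn_step(x, context, w_in, w_ctx, w_out)
        outputs.append(y)
    return outputs


# Hypothetical 2-input, 2-hidden, 2-output network with hand-set weights.
W_IN = [[1.0, 0.0], [0.0, 1.0]]
W_CTX = [[0.5, 0.0], [0.0, 0.5]]
W_OUT = [[1.0, 0.0], [0.0, 1.0]]
outputs = run_srn([[1.0, 0.0], [1.0, 0.0]], W_IN, W_CTX, W_OUT)
```

The same input is presented twice, yet the two outputs differ, because the second step also sees the context left behind by the first. This is how the network's response can depend on what came earlier in a sequence.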

Page 45: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Hidden unit activations after 6 iterations of 27,500 words


(Elman, 1990)

Page 46: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Networks, features, and spaces

• Artificial neural networks can represent any continuous function…

• Simple algorithms for learning from data
  – fuzzy boundaries
  – effects of typicality

• A way to explain how people could learn things that look like rules and symbols…

• Big question: how much of cognition can be explained by the input data?

Page 47: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Challenges for neural networks

• Being able to learn anything can make it harder to learn specific things
  – this is the “bias-variance tradeoff”

Page 48: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Bias-variance tradeoff


Page 49: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Bias-variance tradeoff


Page 50: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Bias-variance tradeoff


Page 51: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Bias-variance tradeoff


Page 52: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

What about generalization?


Page 53: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

What happened?

• The set of 8th degree polynomials contains almost all functions through 10 points

• Our data are some true function, plus noise
• Fitting the noise gives us the wrong function
• This is called overfitting
  – while it has low bias, this class of functions results in an algorithm that has high variance (i.e. is strongly affected by the observed data)
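Overfitting is easy to reproduce with a small experiment. The sketch below is not the slide's own example: it assumes a linear true function, nine training points with a fixed, hand-written noise vector (so the run is repeatable), and compares the degree-8 polynomial that passes exactly through all nine points with a straight-line least-squares fit. All names and numbers are illustrative.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate at x the unique polynomial of degree len(xs) - 1 through all (xs, ys)."""
    total = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        basis = 1.0
        for k, xk in enumerate(xs):
            if k != j:
                basis *= (x - xk) / (xj - xk)
        total += yj * basis
    return total


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


def mse(predict, xs, targets):
    return sum((predict(x) - t) ** 2 for x, t in zip(xs, targets)) / len(xs)


def true_f(x):
    return 0.5 * x  # assumed true function


XS = [float(i) for i in range(9)]
NOISE = [0.3, -0.2, 0.25, -0.3, 0.1, -0.25, 0.3, -0.1, 0.2]  # fixed "noise"
YS = [true_f(x) + e for x, e in zip(XS, NOISE)]
TEST_XS = [i + 0.5 for i in range(8)]  # held-out points between the training points
TEST_YS = [true_f(x) for x in TEST_XS]

slope, intercept = fit_line(XS, YS)
poly = lambda x: lagrange_interpolate(XS, YS, x)
line = lambda x: slope * x + intercept

train_poly, train_line = mse(poly, XS, YS), mse(line, XS, YS)
test_poly, test_line = mse(poly, TEST_XS, TEST_YS), mse(line, TEST_XS, TEST_YS)
```

The degree-8 polynomial has zero training error because it passes through every noisy point, but its error on the held-out points is far larger than the line's: it has fitted the noise. The line cannot fit the training data perfectly, yet it generalizes better because its bias matches the true function.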

Page 54: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

The moral

• General purpose learning mechanisms do not work well with small amounts of data (the most flexible algorithm isn’t always the best)

• To make good predictions from small amounts of data, you need algorithms with bias that matches the problem being solved

• This suggests a different approach to studying induction…
  – (what people learn as n → 0, rather than n → ∞)

Page 55: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Challenges for neural networks

• Being able to learn anything can make it harder to learn specific things
  – this is the “bias-variance tradeoff”

• Neural networks allow us to encode constraints on learning in terms of neurons, weights, and architecture, but is this always the right language?

Page 56: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Three approaches

Rules and symbols

Networks, features, and spaces

Probability and statistics

Page 57: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Probability


Gerolamo Cardano (1501-1576)

Page 58: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Probability


Thomas Bayes (1701-1761)

Pierre-Simon Laplace (1749-1827)

Page 59: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Bayes’ rule

$P(h \mid d) = \frac{P(d \mid h) \, P(h)}{\sum_{h' \in H} P(d \mid h') \, P(h')}$

Posterior probability: $P(h \mid d)$
Likelihood: $P(d \mid h)$
Prior probability: $P(h)$
Denominator: sum over the space of hypotheses $H$

h: hypothesis
d: data

How rational agents should update their beliefs in the light of data
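For a finite hypothesis space, Bayes' rule is a few lines of arithmetic. The sketch below assumes a made-up example that is not from the slides: two hypotheses about a coin ("fair" with P(heads) = 0.5, "biased" with P(heads) = 0.9), equal priors, and data consisting of two heads in a row.

```python
def posterior(prior, likelihood):
    """Bayes' rule over a finite hypothesis space.

    prior: dict mapping hypothesis h -> P(h)
    likelihood: dict mapping hypothesis h -> P(d | h) for the observed data d
    """
    unnormalized = {h: likelihood[h] * prior[h] for h in prior}
    evidence = sum(unnormalized.values())  # the sum over h' in H
    return {h: p / evidence for h, p in unnormalized.items()}


# Hypothetical example: data d = two heads in a row.
prior = {"fair": 0.5, "biased": 0.5}
likelihood = {"fair": 0.5 ** 2, "biased": 0.9 ** 2}
post = posterior(prior, likelihood)
```

The posterior shifts toward "biased" (0.81 / 1.06, roughly 0.76) because that hypothesis makes the observed data more probable, and the two posterior probabilities still sum to 1.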

Page 60: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Cognition as statistical inference

• Bayes’ theorem tells us how to combine prior knowledge with data
  – a different language for describing the constraints on human inductive inference

Page 61: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Prior over functions


k = 8, = 5, = 1

k = 8, = 5, = 0.1

k = 8, = 5, = 0.3

k = 8, = 5, = 0.01

Page 62: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Maximum a posteriori (MAP) estimation


Page 63: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Cognition as statistical inference

• Bayes’ theorem tells us how to combine prior knowledge with data
  – a different language for describing the constraints on human inductive inference

• Probabilistic approaches also help to describe learning

Page 64: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Probabilistic context free grammars

S → NP VP   1.0
NP → T N    0.7
NP → N      0.3
VP → V NP   1.0
T → the     0.8
T → a       0.2
N → man     0.5
N → ball    0.5
V → hit     0.6
V → took    0.4

[Tree: [S (1.0) [NP (0.7) [T (0.8) the] [N (0.5) man]] [VP (1.0) [V (0.6) hit] [NP (0.7) [T (0.8) the] [N (0.5) ball]]]]]

P(tree) = 1.0 × 0.7 × 1.0 × 0.8 × 0.5 × 0.6 × 0.7 × 0.8 × 0.5
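The probability of a parse tree is the product of the probabilities of the rules used to build it. The sketch below uses the rule probabilities from the slide; the nested-tuple tree encoding and the function name `tree_prob` are illustrative choices.

```python
# Rule probabilities from the slide: (left-hand side, right-hand side) -> probability.
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("T", "N")): 0.7,
    ("NP", ("N",)): 0.3,
    ("VP", ("V", "NP")): 1.0,
    ("T", ("the",)): 0.8,
    ("T", ("a",)): 0.2,
    ("N", ("man",)): 0.5,
    ("N", ("ball",)): 0.5,
    ("V", ("hit",)): 0.6,
    ("V", ("took",)): 0.4,
}


def tree_prob(tree):
    """Probability of a tree given as (label, child, ...); a leaf word is a plain string."""
    if isinstance(tree, str):
        return 1.0  # words contribute no extra factor
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = RULE_PROB[(label, rhs)]
    for child in children:
        p *= tree_prob(child)
    return p


# The tree for "the man hit the ball".
TREE = (
    "S",
    ("NP", ("T", "the"), ("N", "man")),
    ("VP", ("V", "hit"), ("NP", ("T", "the"), ("N", "ball"))),
)
p_tree = tree_prob(TREE)
```

For this tree the product comes to about 0.047, the same nine factors as on the slide, multiplied in a different order.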

Page 65: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Probability and learnability

• Any probabilistic context free grammar can be learned from a sample from that grammar as the sample size becomes infinite

• Priors trade off with the amount of data that needs to be seen to believe a hypothesis
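The trade-off between priors and data can be made concrete with a small calculation. The sketch below assumes a made-up example that is not from the slides: a coin is either fair (P(heads) = 0.5) or biased (P(heads) = 0.9), every observed flip is heads, and we count how many flips it takes before the posterior probability of "biased" exceeds one half.

```python
def flips_needed(prior_biased, p_heads_biased=0.9, p_heads_fair=0.5, threshold=0.5):
    """Number of consecutive heads needed before P(biased | data) exceeds threshold."""
    n = 0
    while True:
        like_biased = p_heads_biased ** n
        like_fair = p_heads_fair ** n
        post = (like_biased * prior_biased) / (
            like_biased * prior_biased + like_fair * (1.0 - prior_biased)
        )
        if post > threshold:
            return n
        n += 1


even_prior = flips_needed(0.5)        # learner starts undecided
skeptical_prior = flips_needed(0.01)  # learner starts out doubting "biased"
```

With an even prior a single head is enough, while a learner who starts with a prior of 0.01 needs eight heads in a row. The lower the prior on a hypothesis, the more data it takes to believe it.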

Page 66: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Cognition as statistical inference

• Bayes’ theorem tells us how to combine prior knowledge with data
  – a language for describing the constraints on human inductive inference

• Probabilistic approaches also help to describe learning

• Big question: what do the constraints on human inductive inference look like?

Page 67: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Challenges for probabilistic approaches

• Computing probabilities is hard… how could brains possibly do that?

• How well do the “rational” solutions from probability theory describe how people think in everyday life?

Page 68: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.

Three approaches

Rules and symbols

Networks, features, and spaces

Probability and statistics

Page 69: Tom Griffiths CogSci C131/Psych C123 Computational Models of Cognition.