Jamie Alexandre

Jamie Alexandre. ≠ = would you like a cookie jason.

Dec 31, 2015


Preston Lloyd
Grammatical Complexity: The Chomsky Hierarchy


Recursion
• Something containing an instance of itself.


Recursion in Language

The dog walked down the street.

The dog the cat rode walked down the street.

The dog the cat the rat grabbed rode walked down the street.

Recursion: “Stack” Memory
The dog the cat the rat grabbed rode walked down the street.

DOG CAT RAT

WALK RIDE GRAB

“Limited performance…”

“Infinite competence…”
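The stack idea on this slide can be sketched in a few lines of Python. This is only an illustration: the word lists `NOUNS` and `VERBS` and the function name `pair_subjects_with_verbs` are assumptions made for the example, not part of the talk. Each noun is pushed as it is heard, and each verb pops the most recent unresolved noun.

```python
# Hypothetical word lists, chosen to cover the example sentence only.
NOUNS = {"dog", "cat", "rat"}
VERBS = {"walked", "rode", "grabbed"}


def pair_subjects_with_verbs(sentence):
    """Pair each verb with the most recently heard unresolved noun."""
    stack = []   # nouns still waiting for their verb
    pairs = []   # (noun, verb) pairs in the order they are resolved
    for word in sentence.lower().replace(".", "").split():
        if word in NOUNS:
            stack.append(word)
        elif word in VERBS:
            if not stack:
                raise ValueError("verb with no pending subject: " + word)
            pairs.append((stack.pop(), word))
    return pairs


sentence = "The dog the cat the rat grabbed rode walked down the street."
print(pair_subjects_with_verbs(sentence))
```

The innermost noun (RAT) is resolved first and the outermost (DOG) last, which is the last-in, first-out order the slide depicts.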


SRN: Simple Recurrent Network (Elman, 1990)

• Some ability to use longer contexts
• Incremental learning: no looking back
• No “rules”: distributed representation
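As a rough illustration of what "recurrent" means here, the sketch below runs the forward pass of a small Elman-style network in plain Python. It is not the model used in the study: the class name, layer sizes, random untrained weights, and the choice of `tanh` as the hidden activation are all assumptions for the example, and no learning rule is shown.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (untrained, for illustration only)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class SimpleRecurrentNetwork:
    """Forward pass only: the hidden state is copied back as context."""

    def __init__(self, n_symbols, n_hidden, seed=0):
        rng = random.Random(seed)
        self.n_symbols = n_symbols
        self.w_in = make_matrix(n_hidden, n_symbols, rng)
        self.w_ctx = make_matrix(n_hidden, n_hidden, rng)
        self.w_out = make_matrix(n_symbols, n_hidden, rng)
        self.context = [0.0] * n_hidden

    def step(self, symbol):
        """Consume one symbol; return a distribution over the next one."""
        one_hot = [1.0 if i == symbol else 0.0 for i in range(self.n_symbols)]
        from_input = matvec(self.w_in, one_hot)
        from_context = matvec(self.w_ctx, self.context)
        hidden = [math.tanh(a + b) for a, b in zip(from_input, from_context)]
        self.context = hidden  # carried over to the next time step
        return softmax(matvec(self.w_out, hidden))


net = SimpleRecurrentNetwork(n_symbols=4, n_hidden=8)
for symbol in [0, 2, 1]:
    probs = net.step(symbol)
print(probs)
```

Because the context layer is the only memory, the prediction at each step depends on earlier symbols only through that fixed-size hidden vector.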


PCFG: Probabilistic Context-Free Grammar

• Easily handles recursive structure, long-range context
• Hierarchical, “rule”-based representation
• More computationally complex, non-incremental learning

S → NP VP (0.8)
N’ → AdjP N’ (0.65)
N’ → N (0.35)
Adj → green (0.1)
…
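In a PCFG, the probability of a derivation is the product of the probabilities of the rules it uses. The sketch below shows that arithmetic in Python. The pairing of rules with probabilities is an assumption reconstructed from the order in which they appear on the slide, and the names `RULES` and `derivation_probability` are made up for the example.

```python
import math

# Assumed pairing of the slide's rules and probabilities (read off in order).
RULES = {
    ("S", ("NP", "VP")): 0.8,
    ("N'", ("AdjP", "N'")): 0.65,
    ("N'", ("N",)): 0.35,
    ("Adj", ("green",)): 0.1,
}


def derivation_probability(steps):
    """Multiply the probabilities of the rules applied in a derivation."""
    return math.prod(RULES[step] for step in steps)


# Fragment of a derivation: N' -> AdjP N', then the inner N' -> N.
fragment = [("N'", ("AdjP", "N'")), ("N'", ("N",))]
print(derivation_probability(fragment))  # 0.65 * 0.35
```

Each extra level of recursion multiplies in another rule probability, so deeper structures are assigned lower probability.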


Serial Reaction Time (SRT) Study

• Buttons flash in short sequences
– “press the button as quickly as possible when it lights up”

• Dependent measure: RT
– time from light on to correct button pressed

• Subjects seem to be making sequential predictions
RT ∝ P(button|context)

also: RT ∝ -log(P(button|context)) (“surprisal”, e.g. Hale, 2001; Levy, 2008)
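Surprisal is simply the negative log of the predicted probability. A minimal sketch, assuming base-2 logs (bits); the slide does not state a base, and the function name is chosen for the example:

```python
import math


def surprisal(probability):
    """Negative log probability, in bits (base 2 is an assumption here)."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(probability)


for p in (0.5, 0.25, 0.125):
    print(p, surprisal(p))
```

Halving the probability of a button adds one bit of surprisal, so rare events are penalized much more heavily than under a linear measure.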


Training the Humans

• Eight subjects per experimental condition

• Same sequences, different mappings

• Broken into 16 blocks, with breaks

• About an hour of button-pressing total

• Emphasized speed, while minimizing errors


Training the Models
• Trained on exactly the same sequences as the humans, but not fit to human data
• Predictions at every point based solely on sequences seen prior to that
• Results in sequence of probabilities
– correlated with sequence of human RTs, through surprisal (negative log probability)
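The procedure above (predict first, then update, then correlate) can be sketched with a deliberately simple stand-in model. The add-one-smoothed bigram counter below is an assumption chosen for brevity and is not the SRN or PCFG from the talk; the function names and the `alpha` smoothing parameter are likewise hypothetical.

```python
import math
from collections import defaultdict


def incremental_bigram_probs(sequence, n_symbols, alpha=1.0):
    """Probability of each symbol given only what was seen before it."""
    counts = defaultdict(lambda: defaultdict(float))
    probs = []
    previous = None
    for symbol in sequence:
        row = counts[previous]
        total = sum(row.values())
        # Predict before updating, so the model never looks ahead.
        probs.append((row[symbol] + alpha) / (total + alpha * n_symbols))
        row[symbol] += 1.0
        previous = symbol
    return probs


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


probs = incremental_bigram_probs([0, 1, 0, 1], n_symbols=2)
surprisals = [-math.log2(p) for p in probs]
print(probs)
```

Given a matching list of reaction times, `pearson(surprisals, rts)` would give the kind of correlation the slide describes; no RT data is included here.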

Analysis


A Case Study in Recursion: Palindromes

A C L Q L C A

(Sequences of length 5 through 15; total of 3728 trials per subject)
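A palindrome like the one shown can be built by choosing a random first half, adding a center symbol, and mirroring. The sketch below assumes odd lengths and a center symbol that never appears elsewhere; the symbol set and function names are illustrative, not taken from the experiment's materials.

```python
import random


def make_palindrome(length, symbols, center, rng):
    """Random first half, a unique center symbol, then the mirror image."""
    if length % 2 == 0:
        raise ValueError("this sketch assumes odd lengths")
    half = [rng.choice(symbols) for _ in range(length // 2)]
    return half + [center] + half[::-1]


def is_mirrored(sequence, center):
    """True if the sequence reads the same reversed, with one center symbol."""
    middle = len(sequence) // 2
    return (
        sequence[middle] == center
        and sequence.count(center) == 1
        and sequence == sequence[::-1]
    )


rng = random.Random(0)
sequence = make_palindrome(7, ["A", "C", "L"], "Q", rng)
print(" ".join(sequence))
```

Once the center symbol appears, the second half is fully determined by the first, which is what makes the structure predictable in principle.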

[Figure, two panels: correlation (surprisal vs RT) for PCFG and SRN across blocks 1-4, 5-8, 9-12, and 13-16 (average of 233 trials per block)]

“Did you notice any patterns?”

Subjects with no awareness of pattern: “No”, “None”, “Not really” (n=5)

Those with explicit awareness of pattern: “Circular pattern”, “Mirror pattern” (n=3)

SRN (implicit task performance)

PCFG (explicit task performance)

Will this replicate?

[Figure: correlation (surprisal vs RT) by block (1-16) for PCFG and SRN. Implicit, didn't notice (n=8)]


• Differences between individuals?
– or actually between modes of processing?

• What if we explicitly train subjects on the pattern?

• First half implicit, second half explicit

Explicit Training Worksheet

“This is the middle button in every sequence (and it only occurs in the middle position, halfway through the sequence): [button image]

This means that as soon as you see this button, you know that the sequence will start to reverse.

Here are some example sequences of various lengths: [example sequence images]”

And Quiz Sheet

“Now, complete these sequences using the same pattern (crossing out any unneeded boxes at the end of a sequence): [sequence boxes]”

[Figure: correlation (surprisal vs RT) by block (1-16) for PCFG and SRN. Fully explicit from middle (n=8); the point where explicit instruction was given is marked on the plot]

[Figure: Palindromes: The effect of explicit instruction after block 8. RT against percentage of the way through sequence, one curve per pair of blocks (1-2 through 15-16), grouped as before explicit instruction and after]

Context-free vs Context-sensitive

[Diagram: dependencies between paired elements A, B, C, D, with subscripts 1 and 2]


CFG:

CSG:

Explicit Instruction (after block 4)

Methods
• Four conditions, with 8 subjects in each
– Implicit context-free grammar (CFG)
– Implicit context-sensitive grammar (CSG)
– Explicit context-free grammar (CFG)
– Explicit context-sensitive grammar (CSG)
• Total of 640 sequences (4,120 trials) per subject
– Sequences of length 4, 6, 8, and 10
– Around 1.5 hours of button-pressing
– In blocks 9-16, 5% of the trials were “errors”

A1 B1 C1 C2 B2 A2

D2
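The two sequence types can be sketched as follows. The nested ordering matches the example on this slide (A1 B1 C1 C2 B2 A2). The crossed ordering (A1 B1 C1 A2 B2 C2) is an assumption about what the context-sensitive condition looked like, based on the usual textbook contrast, and `inject_error` with its position argument is a hypothetical helper: the slide shows D2 as an error item but not how errors were placed.

```python
def nested_sequence(items):
    """Context-free style: second half mirrors the first (A1 B1 B2 A2)."""
    first = [item + "1" for item in items]
    second = [item + "2" for item in reversed(items)]
    return first + second


def crossed_sequence(items):
    """Assumed context-sensitive style: second half repeats the order."""
    return [item + "1" for item in items] + [item + "2" for item in items]


def inject_error(sequence, position, wrong_item):
    """Return a copy with one element replaced by an unexpected item."""
    corrupted = list(sequence)
    corrupted[position] = wrong_item
    return corrupted


cfg = nested_sequence(["A", "B", "C"])
csg = crossed_sequence(["A", "B", "C"])
print(" ".join(cfg))
print(" ".join(csg))
print(" ".join(inject_error(cfg, 5, "D2")))
```

The nested form can be tracked with a single stack, while the crossed form needs the first half remembered in its original order, which is the intuition behind the context-free versus context-sensitive contrast.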

[Figure, four panels: Explicit CSG, Explicit CFG, Implicit CSG, Implicit CFG. Curves for Blocks 1-4, Blocks 5-8, Blocks 9-12 (errors thicker), and Blocks 13-16 (errors thicker)]

[Figure: RT (ms) for non-errors vs errors in each condition (Implicit CFG, Explicit CFG, Implicit CSG, Explicit CSG). Annotated differences, in the order extracted: **(6ms), **(27ms), (2ms), **(11ms)]

Conclusions

• Explicit/Implicit processing
– Implicit performance correlated with the predictions made by an SRN (a connectionist model)
– Explicit performance correlated with the predictions made by a PCFG (a rule-based model)

• Grammatical complexity
– Able to process context-free, recursive structures at a very rapid timescale
– More limited ability to process context-sensitive structures

Future Directions

• Longer training
• More complex grammars
– Determinism
• Other response measures
– EEG: more sensitive than RTs to initial stages of learning
• Field studies in Switzerland or Brazil…?


Broader Goals

• L2-learning pedagogy

Thankyous!

Mentorship
Jeff Elman
Roger Levy
Marta Kutas

Advice
Micah Bregman
Ben Cipollini
Vicente Malave
Nathaniel Smith
Angela Yu
Rachel Mayberry
Tom Urbach

Andrea, Seana and the 3rd Year Class!

Research Assistants
Frances Martin (2010)
Ryan Cordova (2009)
Wai Ho Chiu (2009)

[Figure: RT by condition (Implicit CFG, Explicit CFG, Implicit CSG, Explicit CSG) at error position - 2, error position - 1, *error position*, error position + 1, and error position + 2]

[Figure: Negative probability plotted against smoothed RTs. One row of panels per model (bigram, trigram, hmm5, ihmm, srn (one pass), pcfg8), one column per block group (Blocks 1-4, 5-8, 9-12, 13-16)]

[Figure: Surprisal plotted against smoothed RTs. One row of panels per model (bigram, trigram, hmm5, ihmm, srn (one pass), pcfg8), one column per block group (Blocks 1-4, 5-8, 9-12, 13-16)]

AGL and Language

• Areas associated with syntax may be involved
– Bahlmann, Schubotz, and Friederici (2008). Hierarchical artificial grammar processing engages Broca's area. NeuroImage, 42(2):525-534.

• P600-like effects can be seen in AGL
– Christiansen, Conway, & Onnis (2007). Neural Responses to Structural Incongruencies in Language and Statistical Learning Point to Similar Underlying Mechanisms.
– “violations in an artificial grammar can elicit late positivities qualitatively and topographically comparable to the P600 seen with syntactic violations in natural language”

Sanity Check: Effect is Local

[Figure: correlation (Prob vs RT) as a function of lag, from -15 to 20]
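One way a locality check of this kind could be computed is to shift one series against the other and recompute the correlation at each lag. The sketch below is an assumption about the general technique, not the author's analysis code; `lagged_correlation` and the toy series are made up for the example and are not data from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def lagged_correlation(xs, ys, lag):
    """Correlate xs[t] with ys[t + lag]; lag may be negative."""
    if lag >= 0:
        a, b = xs[: len(xs) - lag], ys[lag:]
    else:
        a, b = xs[-lag:], ys[: len(ys) + lag]
    return pearson(a, b)


# Toy series (not study data): ys is xs delayed by one step.
xs = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 1.0, 3.0]
ys = [0.0] + xs[:-1]
for lag in (-1, 0, 1, 2):
    print(lag, round(lagged_correlation(xs, ys, lag), 3))
```

If the effect is local, the correlation should peak at the lag where prediction and response line up and fall off on either side.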

Context-free Grammar

The dog the cat the rat grabbed rode walked.

S → NP VP

NP → N
NP → N S

N → the dog
N → the cat
N → the rat

VP → grabbed
VP → rode
VP → walked
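The grammar on this slide is small enough to check by hand, and a recursive recognizer makes the recursion explicit. The sketch below is an illustration under one simplification: "the" is dropped before parsing, so each N is a single token. Function names are chosen for the example.

```python
NOUNS = {"dog", "cat", "rat"}           # N -> the dog | the cat | the rat
VERBS = {"grabbed", "rode", "walked"}   # VP -> grabbed | rode | walked


def np_ends(tokens, start):
    """Positions where an NP starting at `start` can end (NP -> N | N S)."""
    if start >= len(tokens) or tokens[start] not in NOUNS:
        return set()
    return {start + 1} | s_ends(tokens, start + 1)


def s_ends(tokens, start):
    """Positions where an S starting at `start` can end (S -> NP VP)."""
    ends = set()
    for after_np in np_ends(tokens, start):
        if after_np < len(tokens) and tokens[after_np] in VERBS:
            ends.add(after_np + 1)
    return ends


def accepts(sentence):
    """True if the whole sentence is derivable from S."""
    words = sentence.lower().replace(".", "").split()
    tokens = [word for word in words if word != "the"]
    return len(tokens) in s_ends(tokens, 0)


print(accepts("The dog the cat the rat grabbed rode walked."))
print(accepts("The dog the cat walked."))
```

`np_ends` calls `s_ends`, which calls `np_ends` again: that mutual recursion mirrors the NP → N S rule, and it is why a sentence with two nouns and only one verb is rejected.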