
Simulation-based language understanding

Jan 03, 2016


timon-boyer

Transcript
Page 1: Simulation-based language understanding

Simulation-based language understanding

“Harry walked to the cafe.”

Schema   Trajector   Goal
walk     Harry       cafe

Analysis Process

Simulation Specification

Utterance

Simulation

Cafe

Constructions

General Knowledge

Belief State

Page 2: Simulation-based language understanding

Simulation specification

The analysis process produces a simulation specification that

•includes image-schematic, motor control and conceptual structures

•provides parameters for a mental simulation

Page 3: Simulation-based language understanding

NTL Manifesto

• Basic Concepts are Grounded in Experience
– Sensory, Motor, Emotional, Social

• Abstract and Technical Concepts map by Metaphor to more Basic Concepts

• Neural Computation models all levels

Page 4: Simulation-based language understanding

Simulation based Language Understanding

Constructions

Simulation

Utterance Discourse & Situational Context

Semantic Specification:

image schemas, frames, action schemas

Analyzer:

incremental, competition-based, psycholinguistically plausible

Page 5: Simulation-based language understanding

Embodied Construction Grammar

• Embodied representations
– active perceptual and motor schemas (image schemas, x-schemas, frames, etc.)
– situational and discourse context

• Construction Grammar
– Linguistic units relate form and meaning/function.
– Both constituency and (lexical) dependencies allowed.

• Constraint-based
– based on feature unification (as in LFG, HPSG)
– Diverse factors can flexibly interact.

Page 6: Simulation-based language understanding

Embodied Construction Grammar (ECG)

(Formalizing Cognitive Linguistics)

1. Linguistic Analysis

2. Computational Implementation
a. Test Grammars
b. Applied Projects – Question Answering

3. Map to Connectionist Models, Brain

4. Models of Grammar Acquisition

Page 7: Simulation-based language understanding

ECG Structures

• Schemas
– image schemas, force-dynamic schemas, executing schemas, frames…

• Constructions
– lexical, grammatical, morphological, gestural…

• Maps
– metaphor, metonymy, mental space maps…

• Situations (Mental Spaces)
– discourse, hypothetical, counterfactual…

Page 8: Simulation-based language understanding

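Embodied schemas

schema Container
   roles
      interior
      exterior
      portal
      boundary

schema Source-Path-Goal
   roles
      source
      path
      goal
      trajector

(In each definition, the first line gives the schema name and the indented entries are role names.)

[Diagrams: a Container with Interior, Exterior, Boundary, and Portal; a Source-Path-Goal path with Source, Path, Goal, and Trajector]

These are abstractions over sensorimotor experiences.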

Page 9: Simulation-based language understanding

ECG Schemas

schema <name>
   subcase of <schema>
   evokes <schema> as <local name>
   roles
      <local role>: <role restriction>
   constraints
      <role> ↔ <role>
      <role> ← <value>
      <predicate>

schema Hypotenuse subcase of Line-Segment

evokes Right-Tri as rt

roles

{lower-left: Point}

{upper-right: Point}

constraints

self ↔ rt.long-side
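
The schema notation above can be mirrored in a small data structure. The following is a minimal sketch, not the ECG implementation: the `Schema` class, its field names, and the `describe` helper are hypothetical and exist only to show how the keywords (subcase of, evokes, roles, constraints) fit together for the Hypotenuse example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Schema:
    """Hypothetical container mirroring the fields of the schema template."""
    name: str
    subcase_of: Optional[str] = None
    # local name -> evoked schema
    evokes: Dict[str, str] = field(default_factory=dict)
    # role name -> type restriction (None when unrestricted)
    roles: Dict[str, Optional[str]] = field(default_factory=dict)
    # identification constraints, each pair read as "left <-> right"
    bindings: List[Tuple[str, str]] = field(default_factory=list)


def describe(schema: Schema) -> str:
    """Render a Schema back into the indented notation used on the slide."""
    lines = [f"schema {schema.name}"]
    if schema.subcase_of:
        lines.append(f"  subcase of {schema.subcase_of}")
    for local, evoked in schema.evokes.items():
        lines.append(f"  evokes {evoked} as {local}")
    if schema.roles:
        lines.append("  roles")
        for role, restriction in schema.roles.items():
            lines.append(f"    {role}: {restriction}" if restriction else f"    {role}")
    if schema.bindings:
        lines.append("  constraints")
        for left, right in schema.bindings:
            lines.append(f"    {left} <-> {right}")
    return "\n".join(lines)


# The Hypotenuse example from the slide, expressed in the sketch above.
hypotenuse = Schema(
    name="Hypotenuse",
    subcase_of="Line-Segment",
    evokes={"rt": "Right-Tri"},
    roles={"lower-left": "Point", "upper-right": "Point"},
    bindings=[("self", "rt.long-side")],
)

print(describe(hypotenuse))
```

Keeping evoked schemas separate from roles reflects the notation: an evoked schema is given a local name but is not itself a role of the schema being defined.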

Page 10: Simulation-based language understanding

Source-Path-Goal; Container

schema SPG

subcase of TrajLandmark

roles

source: Place

path: Directed-Curve

goal: Place

{trajector: Entity}

{landmark: Bounded-Region}

schema Container

roles

interior: Bounded-Region

boundary: Curve

portal: Bounded-Region

Page 11: Simulation-based language understanding

Referent Descriptor Schemas

schema RD
   roles
      category
      gender
      count
      specificity
      resolved-ref
      modifications

schema RD5 // Eve
   roles
      category: HumanSchema
      gender: Female
      count: one
      specificity: Known
      resolved-ref: Eve Sweetser
      modifications: none

Page 12: Simulation-based language understanding

ECG Constructions

construction <name>

subcase of <construction>

constituents

<name>:<construction>

form

constraints

<name> before/meets <name>

meaning:

constraints

// same as for schemas

construction SpatialPP

constituents

prep: SpatialPreposition

lm: NP

form

constraints

prep meets lm

meaning: TrajectorLandmark

constraints

selfm ↔ prep

landmark ↔ lm.category

Page 13: Simulation-based language understanding

Into and The CXNs

construction Into
   subcase of SpatialPreposition
   form: WordForm
      constraints
         orth ← "into"
   meaning: SPG
      evokes Container as c
      constraints
         landmark ↔ c
         goal ↔ c.interior

construction The
   subcase of Determiner
   form: WordForm
      constraints
         orth ← "the"
   meaning
      evokes RD as rd
      constraints
         rd.specificity ← "known"

Page 14: Simulation-based language understanding

Two Grammatical CXNs

construction DetNoun
   subcase of NP
   constituents
      d: Determiner
      n: Noun
   form
      constraints
         d before n
   meaning
      constraints
         selfm ↔ d.rd
         category ↔ n

construction NPVP
   subcase of S
   constituents
      subj: NP
      vp: VP
   form
      constraints
         subj before vp
   meaning
      constraints
         profiled-participant ↔ subj

Page 15: Simulation-based language understanding

Simulation specification

The analysis process produces a simulation specification that

•includes image-schematic, motor control and conceptual structures

•provides parameters for a mental simulation

Page 16: Simulation-based language understanding

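Competition-based analyzer

• An analysis is made up of:
– A constructional tree
– A semantic specification
– A set of resolutions

Example: Bill gave Mary the book

Constructional tree: A-GIVE-B-X with constituents subj, v, obj1, obj2, filled by Ref-Exp (Bill), Give (gave), Ref-Exp (Mary), Ref-Exp (the book)

Semantic specification: Give-Action with roles giver (@Man), recipient (@Woman), theme (@Book)

Resolutions: Bill, Mary, book01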

Johno Bryant

Page 17: Simulation-based language understanding

Combined score determines best-fit

• Syntactic Fit:
– Constituency relations
– Combine with preferences on non-local elements
– Conditioned on syntactic context

• Antecedent Fit:
– Ability to find referents in the context
– Conditioned on syntax match, feature agreement

• Semantic Fit:
– Semantic bindings for frame roles
– Frame roles’ fillers are scored

Page 18: Simulation-based language understanding

0 Eve 1 walked 2 into 3 the 4 house 5

Constructs
--------------
NPVP[0] (0,5)
Eve[3] (0,1)
ActiveSelfMotionPath[2] (1,5)
WalkedVerb[57] (1,2)
SpatialPP[56] (2,5)
Into[174] (2,3)
DetNoun[173] (3,5)
The[204] (3,4)
House[205] (4,5)

Schema Instances

-------------------

SelfMotionPathEvent[1]

HouseSchema[66]

WalkAction[60]

Person[4]

SPG[58]

RD[177] ~ house

RD[5]~ Eve

Page 19: Simulation-based language understanding

Unification chains and their fillers

SelfMotionPathEvent[1].mover
SPG[58].trajector
WalkAction[60].walker
RD[5].resolved-ref
RD[5].category
Filler: Person4

SpatialPP[56].m
Into[174].m
SelfMotionPathEvent[1].spg
Filler: SPG58

SelfMotionPathEvent[1].landmark
House[205].m
RD[177].category
SPG[58].landmark
Filler: HouseSchema66

WalkedVerb[57].m
WalkAction[60].routine
WalkAction[60].gait
SelfMotionPathEvent[1].motion
Filler: WalkAction60
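
One way to picture a unification chain is as an equivalence class of slots that all share one filler. The sketch below uses a union-find structure for this. It is an illustration only, not the analyzer's actual algorithm: the `UnionFind` class, the `filler_of` helper, and the pairwise order of the bindings are assumptions, since the slide lists only the members of each chain.

```python
class UnionFind:
    """Minimal union-find over slot names (hypothetical helper)."""

    def __init__(self):
        self.parent = {}

    def find(self, slot):
        # Register unseen slots as their own representative.
        self.parent.setdefault(slot, slot)
        while self.parent[slot] != slot:
            # Path halving keeps the trees shallow.
            self.parent[slot] = self.parent[self.parent[slot]]
            slot = self.parent[slot]
        return slot

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


# Pairwise bindings assumed for illustration; the slide shows only chain membership.
bindings = [
    ("SelfMotionPathEvent[1].mover", "SPG[58].trajector"),
    ("SPG[58].trajector", "WalkAction[60].walker"),
    ("WalkAction[60].walker", "RD[5].resolved-ref"),
    ("SpatialPP[56].m", "Into[174].m"),
    ("Into[174].m", "SelfMotionPathEvent[1].spg"),
]

# One slot per chain is given a filler; the rest inherit it through the chain.
fillers = {
    "RD[5].resolved-ref": "Person4",
    "SelfMotionPathEvent[1].spg": "SPG58",
}

chains = UnionFind()
for left, right in bindings:
    chains.union(left, right)


def filler_of(slot):
    """Return the filler shared by every slot in the same chain, if any."""
    root = chains.find(slot)
    for filled_slot, filler in fillers.items():
        if chains.find(filled_slot) == root:
            return filler
    return None


print(filler_of("SelfMotionPathEvent[1].mover"))
print(filler_of("SpatialPP[56].m"))
```

Under these assumptions, asking for the mover of the motion event returns the same filler as the resolved referent of the descriptor for Eve, because both slots sit in one chain.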

Page 20: Simulation-based language understanding

Summary: ECG

• Linguistic constructions are tied to a model of simulated action and perception

• Embedded in a theory of language processing
– Constrains theory to be usable
– Basis for models of grammar learning

• Precise, computationally usable formalism
– Practical computational applications, like MT and NLU
– Testing of functionality, e.g. language learning

• A shared theory and formalism for different cognitive mechanisms
– Constructions, metaphor, mental spaces, etc.

• Reduction to Connectionist and Neural levels

Page 21: Simulation-based language understanding

Productive Argument Omission (Mandarin)
Johno Bryant & Eva Mok

1. ma1+ma gei3 ni3 zhei4+ge
   mother give 2PS this+CLS
   • Mother (I) give you this (a toy).

2. ni3 gei3 yi2
   2PS give auntie
   • You give auntie [the peach].

3. ao ni3 gei3 ya
   EMP 2PS give EMP
   • Oh (go on)! You give [auntie] [that].

4. gei3
   give
   • [I] give [you] [some peach].

CHILDES Beijing Corpus (Tardif, 1993; Tardif, 1996)

Page 22: Simulation-based language understanding

Arguments are omitted with different probabilities

All args omitted: 30.6%
No args omitted: 6.1%

[Bar chart: % elided (98 total utterances) for Giver, Recipient, and Theme; vertical axis from 0% to 100%]

Page 23: Simulation-based language understanding

Analyzing ni3 gei3 yi2 (You give auntie)

• Syntactic Fit:
– P(Theme omitted | ditransitive cxn) = 0.65
– P(Recipient omitted | ditransitive cxn) = 0.42

Two of the competing analyses:

Analysis 1:
   ni3     gei3       yi2         omitted
   ↓       ↓          ↓           ↓
   Giver   Transfer   Recipient   Theme
   (1-0.78)*(1-0.42)*0.65 = 0.08

Analysis 2:
   ni3     gei3       omitted     yi2
   ↓       ↓          ↓           ↓
   Giver   Transfer   Recipient   Theme
   (1-0.78)*(1-0.65)*0.42 = 0.03
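
The two products above can be reproduced with a few lines of code. This is a minimal sketch of the arithmetic on the slide only, not the analyzer itself: the `P_OMIT` table and the `syntactic_fit` function are hypothetical names, and the role probabilities are multiplied as independent factors, exactly as in the two expressions shown.

```python
# Omission probabilities for the ditransitive construction, as used in the
# slide arithmetic (giver 0.78, recipient 0.42, theme 0.65).
P_OMIT = {"Giver": 0.78, "Recipient": 0.42, "Theme": 0.65}


def syntactic_fit(expressed_roles):
    """Multiply P(omitted) for each missing role and 1 - P(omitted) for each expressed role."""
    score = 1.0
    for role, p_omitted in P_OMIT.items():
        score *= (1.0 - p_omitted) if role in expressed_roles else p_omitted
    return score


# Analysis with yi2 as Recipient: the Theme is the omitted argument.
recipient_reading = syntactic_fit({"Giver", "Recipient"})
# Analysis with yi2 as Theme: the Recipient is the omitted argument.
theme_reading = syntactic_fit({"Giver", "Theme"})

print(round(recipient_reading, 2))  # 0.08
print(round(theme_reading, 2))      # 0.03
```

On syntactic fit alone the recipient reading of yi2 scores higher, because omitting a theme is more likely than omitting a recipient under these probabilities.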

Page 24: Simulation-based language understanding

Using frame and lexical information to restrict type of reference

Lexical Unit gei3

Giver (DNI)

Recipient (DNI)

Theme (DNI)

The Transfer Frame

Giver

Recipient

Theme

Manner

Means

Place

Purpose

Reason

Time

Page 25: Simulation-based language understanding

Can the omitted argument be recovered from context?

• Antecedent Fit:

   ni3     gei3       yi2         omitted
   ↓       ↓          ↓           ↓
   Giver   Transfer   Recipient   Theme

   ni3     gei3       omitted     yi2
   ↓       ↓          ↓           ↓
   Giver   Transfer   Recipient   Theme

Discourse & Situational Context: child, mother, peach, auntie, table

?

Page 26: Simulation-based language understanding

How good of a theme is a peach? How about an aunt?

The Transfer Frame

Giver (usually animate)

Recipient (usually animate)

Theme (usually inanimate)

   ni3     gei3       yi2         omitted
   ↓       ↓          ↓           ↓
   Giver   Transfer   Recipient   Theme

   ni3     gei3       omitted     yi2
   ↓       ↓          ↓           ↓
   Giver   Transfer   Recipient   Theme

Semantic Fit:

   ni3     gei3       yi2         omitted
   ↓       ↓          ↓           ↓
   Giver   Transfer   Recipient   Theme

Page 27: Simulation-based language understanding

The argument omission patterns shown earlier can be covered with just ONE construction

• Each construction is annotated with probabilities of omission

• Language-specific default probability can be set

                  Subj    Verb       Obj1        Obj2
                  ↓       ↓          ↓           ↓
                  Giver   Transfer   Recipient   Theme
P(omitted|cxn):   0.78               0.42        0.65

[Bar chart: % elided (98 total utterances) for Giver, Recipient, and Theme]

Page 28: Simulation-based language understanding

Leverage process to simplify representation

• The processing model is complementary to the theory of grammar

• By using a competition-based analysis process, we can:
– Find the best-fit analysis with respect to constituency structure, context, and semantics
– Eliminate the need to enumerate allowable patterns of argument omission in grammar

• This is currently being applied in models of language understanding and grammar learning.

Page 29: Simulation-based language understanding

Modeling context for language understanding and learning

• Linguistic structure reflects experiential structure

– Discourse participants and entities

– Embodied schemas: action, perception, emotion, attention, perspective

– Semantic and pragmatic relations: spatial, social, ontological, causal

• ‘Contextual bootstrapping’ for grammar learning

Page 30: Simulation-based language understanding

The context model tracks accessible entities, events, and utterances

Discourse & Situational Context

Discourse:
   Discourse01
      participants: Eve, Mother
      objects: Hands, ...
      discourse-history: DS01
      situational-history: Wash-Action

Page 31: Simulation-based language understanding

Each of the items in the context model has rich internal structure

Situational History:
   Wash-Action
      washer: Eve
      washee: Hands

Discourse History:
   DS01
      speaker: Mother
      addressee: Eve
      attentional-focus: Hands
      content: {"are they clean yet?"}
      speech-act: question

Participants:
   Eve
      category: child
      gender: female
      name: Eve
      age: 2
   Mother
      category: parent
      gender: female
      name: Eve
      age: 33

Objects:
   Hands
      category: BodyPart
      part-of: Eve
      number: plural
      accessibility: accessible

Page 32: Simulation-based language understanding

Analysis produces a semantic specification

Linguistic Knowledge

Utterance

Discourse & Situational Context

Semantic Specification

World Knowledge

Analysis

“You washed them”

WASH-ACTION
   washer: Eve
   washee: Hands
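
How context fills in the referents of “You washed them” can be sketched with plain dictionaries. This is a rough illustration under stated assumptions, not the system's resolution procedure: the dictionary layout, the `resolve` function, and its two rules (“you” picks the addressee of the latest discourse segment, “them” picks a plural attentional focus) are hypothetical, and the new utterance is assumed to keep the speaker and addressee of DS01.

```python
# Context model entries taken from the slides; the dictionary layout is an assumption.
context = {
    "participants": {
        "Eve": {"category": "child", "gender": "female", "age": 2},
        "Mother": {"category": "parent", "gender": "female", "age": 33},
    },
    "objects": {
        "Hands": {"category": "BodyPart", "part_of": "Eve", "number": "plural"},
    },
    "discourse_history": [
        {
            "id": "DS01",
            "speaker": "Mother",
            "addressee": "Eve",
            "attentional_focus": "Hands",
            "content": "are they clean yet?",
            "speech_act": "question",
        }
    ],
    "situational_history": [
        {"type": "Wash-Action", "washer": "Eve", "washee": "Hands"},
    ],
}


def resolve(pronoun, ctx):
    """Toy resolution rules (hypothetical) against the latest discourse segment."""
    segment = ctx["discourse_history"][-1]
    if pronoun == "you":
        return segment["addressee"]
    if pronoun == "them":
        focus = segment["attentional_focus"]
        # "them" needs a plural referent.
        if ctx["objects"].get(focus, {}).get("number") == "plural":
            return focus
    return None


# Semantic specification for "You washed them".
semspec = {
    "schema": "WASH-ACTION",
    "washer": resolve("you", context),
    "washee": resolve("them", context),
}

print(semspec)
```

With the context above, the sketch yields the same role fillers as the slide: washer Eve and washee Hands.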

Page 33: Simulation-based language understanding

How Can Children Be So Good At Learning Language?

• Gold’s Theorem: No superfinite class of languages is identifiable in the limit from positive data only

• Principles & Parameters: Babies are born as blank slates but acquire language quickly (with noisy input and little correction) → Language must be innate: Universal Grammar + parameter setting

But babies aren’t born as blank slates! And they do not learn language in a vacuum!

Page 34: Simulation-based language understanding

Key ideas for a NT of language acquisition
Nancy Chang and Eva Mok

• Embodied Construction Grammar

• Opulence of the Substrate
– Prelinguistic children already have rich sensorimotor representations and sophisticated social knowledge

• Basic Scenes
– Simple clause constructions are associated directly with scenes basic to human experience (Goldberg 1995, Slobin 1985)

• Verb Island Hypothesis
– Children learn their earliest constructions (arguments, syntactic marking) on a verb-specific basis (Tomasello 1992)

Page 35: Simulation-based language understanding

Embodiment and Grammar Learning

Paradigm problem for Nature vs. Nurture

The poverty of the stimulus

The opulence of the substrate

Intricate interplay of genetic and environmental, including social, factors.

Page 36: Simulation-based language understanding

Two perspectives on grammar learning

Computational models

• Grammatical induction
– language identification
– context-free grammars, unification grammars
– statistical NLP (parsing, etc.)

• Word learning models
– semantic representations: logical forms, discrete representations, continuous representations
– statistical models

Developmental evidence

• Prior knowledge
– primitive concepts
– event-based knowledge
– social cognition
– lexical items

• Data-driven learning
– basic scenes
– lexically specific patterns
– usage-based learning

Page 37: Simulation-based language understanding

Key assumptions for language acquisition

• Significant prior conceptual/embodied knowledge
– rich sensorimotor/social substrate

• Incremental learning based on experience
– Lexically specific constructions are learned first.

• Language learning tied to language use
– Acquisition interacts with comprehension, production; reflects communication and experience in world.
– Statistical properties of data affect learning

Page 38: Simulation-based language understanding

Analysis draws on constructions and context

Form       Meaning
you        Addressee
washed     Wash-Action (roles: washer, washee)
them       ContextElement

Form relations: you before washed, washed before them

Context
   Wash-Action (washer: Eve, washee: Hands)
   Discourse Segment (roles: addressee, attentional-focus)

Page 39: Simulation-based language understanding

Learning updates linguistic knowledge based on input utterances

Learning

Discourse & Situational Context

Linguistic Knowledge

Analysis

Utterance

PartialSemSpec

World Knowledge

Page 40: Simulation-based language understanding

Context aids understanding: Incomplete grammars yield partial SemSpec

Form       Meaning
you        Addressee
washed     Wash-Action (roles: washer, washee)
them       ContextElement

Context
   Wash-Action (washer: Eve, washee: Hands)
   Discourse Segment (roles: addressee, attentional-focus)

Page 41: Simulation-based language understanding

Context bootstraps learning: new construction maps form to meaning

Form       Meaning
you        Addressee
washed     Wash-Action
them       ContextElement

Form relations: you before washed, washed before them
Meaning bindings: washer ↔ Addressee, washee ↔ ContextElement

Context
   Wash-Action (washer: Eve, washee: Hands)
   Discourse Segment (roles: addressee, attentional-focus)

Page 42: Simulation-based language understanding

Context bootstraps learning: new construction maps form to meaning

Form       Meaning
you        Addressee
washed     Wash-Action
them       ContextElement

Form relations: you before washed, washed before them
Meaning bindings: washer ↔ Addressee, washee ↔ ContextElement

YOU-WASHED-THEM
   constituents:
      YOU, WASHED, THEM
   form:
      YOU before WASHED
      WASHED before THEM
   meaning: WASH-ACTION
      washer: addressee
      washee: ContextElement

Page 43: Simulation-based language understanding

Grammar learning: suggesting new CxNs and reorganizing existing ones

reinforcement

reorganize
• merge
• join
• split

hypothesize
• map form to meaning
• learn contextual constraints

Linguistic Knowledge

Discourse & Situational Context

Analysis

Utterance

PartialSemSpec

World Knowledge

Page 44: Simulation-based language understanding

Challenge: How far up to generalize

• Eat rice

• Eat apple

• Eat watermelon

• Want rice

• Want apple

• Want chair

Inanimate Object
   Manipulable Objects
      Food
         Fruit
            apple
            watermelon
         Savory
            rice
   Unmovable Objects
      Furniture
         Chair
         Sofa
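
The generalization question can be made concrete by asking for the most specific category that covers every observed argument of a verb. The sketch below is one simple baseline, not the learner's actual strategy: the `PARENT` table encodes the hierarchy as read from the slide, the `generalize` function is a hypothetical name, and linking the word “chair” to the category Chair is an added assumption.

```python
# Child -> parent links, as read from the category hierarchy on the slide.
PARENT = {
    "apple": "Fruit",
    "watermelon": "Fruit",
    "rice": "Savory",
    "chair": "Chair",  # assumption: the word "chair" denotes the Chair category
    "Fruit": "Food",
    "Savory": "Food",
    "Chair": "Furniture",
    "Sofa": "Furniture",
    "Food": "Manipulable Objects",
    "Furniture": "Unmovable Objects",
    "Manipulable Objects": "Inanimate Object",
    "Unmovable Objects": "Inanimate Object",
}


def ancestors(node):
    """Return the node followed by its ancestors, most specific first."""
    chain = [node]
    while node in PARENT:
        node = PARENT[node]
        chain.append(node)
    return chain


def generalize(fillers):
    """Return the most specific category that covers every observed filler."""
    common = None
    for filler in fillers:
        chain = ancestors(filler)
        if common is None:
            common = chain
        else:
            # Keep only categories shared with this filler, preserving specificity order.
            common = [category for category in common if category in chain]
    return common[0] if common else None


print(generalize(["rice", "apple", "watermelon"]))  # objects seen with "eat"
print(generalize(["rice", "apple", "chair"]))       # objects seen with "want"
```

Under this baseline the objects of “eat” generalize to Food, while the objects of “want” are covered only by Inanimate Object, which illustrates why the choice of how far up to generalize matters.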

Page 45: Simulation-based language understanding

Challenge: Omissible constituents

• In Mandarin, almost anything available in context can be omitted – and often is in child-directed speech.

• Intuition:

• Same context, two expressions that differ by one constituent → a general construction in which that constituent is omissible

• May require verbatim memory traces of utterances + “relevant” context

Page 46: Simulation-based language understanding

When does the learning stop?

Bayesian Learning Framework

$$\hat{G} = \operatorname*{argmax}_{G} P(G \mid U, Z) = \operatorname*{argmax}_{G} P(U \mid G, Z)\, P(G)$$

• Most likely grammar (G) given utterances (U) and context (Z)

• The grammar prior includes a preference for the “kind” of grammar

• In practice, take the log and minimize cost: Minimum Description Length (MDL)

[Diagram: Schemas + Constructions; SemSpec; Analysis + Resolution; Context Fitting]

Page 47: Simulation-based language understanding

Intuition for MDL

Without generalization:

• S -> Give me NP
• NP -> the book
• NP -> a book

With generalization:

• S -> Give me NP
• NP -> DET book
• DET -> the
• DET -> a

Suppose that the prior is inversely proportional to the size of the grammar (e.g. number of rules)

It’s not worthwhile to make this generalization

Page 48: Simulation-based language understanding

Intuition for MDL

Without generalization:

• S -> Give me NP
• NP -> the book
• NP -> a book
• NP -> the pen
• NP -> a pen
• NP -> the pencil
• NP -> a pencil
• NP -> the marker
• NP -> a marker

With generalization:

• S -> Give me NP
• NP -> DET N
• DET -> the
• DET -> a
• N -> book
• N -> pen
• N -> pencil
• N -> marker
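
The rule-count intuition can be checked directly. The sketch below is a simplified illustration, not the learning model: it assumes, as the slides suggest, a prior inversely proportional to the number of rules, and it compares only that prior term because each pair of grammars generates the same sentences. The helpers `parse`, `expand`, `prior`, and `worth_generalizing` are hypothetical names.

```python
from itertools import product


def parse(rules):
    """Turn 'LHS -> RHS' strings into {lhs: [list of right-hand-side symbols]}."""
    grammar = {}
    for rule in rules:
        lhs, rhs = rule.split(" -> ")
        grammar.setdefault(lhs, []).append(rhs.split())
    return grammar


def expand(grammar, symbol="S"):
    """Enumerate every sentence derivable from symbol (grammars here are non-recursive)."""
    if symbol not in grammar:
        return {symbol}  # terminal word
    sentences = set()
    for rhs in grammar[symbol]:
        parts = [expand(grammar, s) for s in rhs]
        for combo in product(*parts):
            sentences.add(" ".join(combo))
    return sentences


def prior(rules):
    """Unnormalized prior, assumed inversely proportional to the number of rules."""
    return 1.0 / len(rules)


def worth_generalizing(specific, general):
    """Generalize only if both grammars cover the same sentences and the prior improves."""
    same_coverage = expand(parse(specific)) == expand(parse(general))
    return same_coverage and prior(general) > prior(specific)


nouns = ["book", "pen", "pencil", "marker"]

small_specific = ["S -> Give me NP", "NP -> the book", "NP -> a book"]
small_general = ["S -> Give me NP", "NP -> DET book", "DET -> the", "DET -> a"]

large_specific = ["S -> Give me NP"] + [f"NP -> {d} {n}" for n in nouns for d in ("the", "a")]
large_general = ["S -> Give me NP", "NP -> DET N", "DET -> the", "DET -> a"] + [f"N -> {n}" for n in nouns]

print(len(small_specific), len(small_general), worth_generalizing(small_specific, small_general))
print(len(large_specific), len(large_general), worth_generalizing(large_specific, large_general))
```

With one noun the generalized grammar has four rules against three, so the generalization does not pay off; with four nouns it has eight rules against nine, so it does.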

Page 49: Simulation-based language understanding

Usage-based learning: comprehension and production

[Diagram labels]

Shared resources: constructicon, world knowledge, discourse & situational context

Comprehension: utterance, analyze & resolve, analysis, simulation

Production: comm. intent, generate, utterance, response

Learning: reinforcement (usage), reinforcement (correction), hypothesize constructions & reorganize

Page 50: Simulation-based language understanding
Page 51: Simulation-based language understanding

From Molecule to Metaphor www.m2mbook.org

I. Embodied Information Processing
II. How the Brain Computes
III. How the Mind Computes
IV. Learning Concrete Words
V. Learning Words for Actions
VI. Abstract and Metaphorical Words
VII. Understanding Stories
VIII. Combining Form and Meaning
IX. Embodied Language

Page 52: Simulation-based language understanding

Basic Questions Addressed

• How could our brain, a mass of chemical cells, produce language and thought?
• How much can we know about our own experience?
• How do we learn new concepts?
• Does our language determine how we think?
• Is language innate?
• How do children learn grammar?
• Why make computational brain models of thought?
• Will our robots understand us?

Page 53: Simulation-based language understanding

Language, Learning and Neural Modeling
www.icsi.berkeley.edu/AI

• Scientific Goal: Understand how people learn and use language

• Practical Goal: Deploy systems that analyze and produce language

• Approach: Build models that perform cognitive tasks, respecting all experimental and experiential constraints. Embodied linguistic theories with advanced biologically-based computational methods

Page 54: Simulation-based language understanding

Simulation Semantics

• BASIC ASSUMPTION: SAME REPRESENTATION FOR PLANNING AND SIMULATIVE INFERENCE
– Evidence for common mechanisms for recognition and action (mirror neurons) in the F5 area (Rizzolatti et al. 1996, Gallese 1996, Boccino 2002) and from motor imagery (Jeannerod 1996)

• IMPLEMENTATION:
– x-schemas affect each other by enabling, disabling or modifying execution trajectories. Whenever the CONTROLLER schema makes a transition it may set, get, or modify state leading to triggering or modification of other x-schemas. State is completely distributed (a graph marking) over the network.

• RESULT: INTERPRETATION IS IMAGINATIVE SIMULATION!

Page 55: Simulation-based language understanding

Grammar learning: hypothesizing new constructions and reorganizing them

reinforcement

reorganize
• merge
• join
• split

hypothesize
• map form to meaning
• learn contextual constraints

Linguistic Knowledge

Discourse & Situational Context

Analysis

Utterance

PartialSemSpec

World Knowledge

Page 56: Simulation-based language understanding

Discovering the Conceptual Primitives
2008 Cognitive Science Conference

Cognitive Science is now in a position to discover the neural basis for many of the conceptual primitives underlying language and thought. The main concern is conceptual mechanisms whose neural realization does not depend on language and culture. These concepts (the primitives) are good candidates for a catalog of potential foundations of meaning.

Lisa Aziz-Zadeh, USC - Neuroscience
Daniel Casasanto, Stanford - Psycholinguistics
Jerome Feldman, UCB/ICSI - AI
Rebecca Saxe, MIT - Development
Len Talmy, Buffalo, UCB - Cognitive Linguistics

Page 57: Simulation-based language understanding

Understanding an utterance in context: analysis and simulation

Linguistic Knowledge

Simulation

Utterance

Discourse & Situational Context

Semantic Specification

World Knowledge

Analysis

Neural Theory of Language (Feldman, 2006)