(Presentation) Grounding words in perception and action: computational model. TRENDS in Cognitive Sciences 2005, Deb Roy / 최유진, autumn 2011

Jul 06, 2015

snuuxlab
Transcript
Page 1:

최유진

Page 2:

Grounding words in perception and action: computational insights.

Deb Roy

TRENDS in Cognitive Sciences

Vol.9 No.8 August 2005

Thursday, October 13, 2011

Page 3:

Language

English

Russian

Korean

French

Chinese

Japanese

Portuguese

Indian

German

Spanish

Arabic

Page 4:

Oneʼs language = Oneʼs perspective on the world

Goal: align the language of machines with that of humans.

Human - communicate with - Machine

Page 5:

Deb Roy

Associate Professor of Media Arts and Sciences

Director, Cognitive Machines

Roy studies how children learn language, and designs machines that learn to communicate in human-like ways. To enable this work, he has pioneered new data-driven methods for analyzing and modeling human linguistic and social behavior.

Research areas: artificial intelligence, cognitive modeling, human-machine interaction, data mining and information visualization

http://www.ted.com/talks/deb_roy_the_birth_of_a_word.html

Page 6:

0. Research Background

We use words to communicate about things and kinds of things, and about their properties, relations and actions. The paper draws an analogy between humans and machines.

- Research in robotics and simulated systems grounds language in machine perception and action, by analogy with human abilities.

- The research tradition in computational modeling is moving from a purely symbolic level toward connecting symbols to their referents in the physical world, because purely symbolic models cannot account for context-dependent word meaning.

Index.

1. Words about the physical world.
2. Association between words and perceptual categories.
3. Modeling context-dependent word use.
4. Models of infant word learning that process ʻfirst-person-perspectiveʼ sensory data.
5. Richer representational structures : grounding verbs in physical action.
6. Integration of action and perception in grounding nouns.
7. Conclusions

Page 7:

1. Words about the physical world.

• Is human language like a dictionary?

Existing computational models treat human word use as purely symbolic, whereas in real environments people use words socially and in relation to the real world.

Real-world referents : how does language acquire its meaning?

• Computational model and embodied nature of language : Complex crossmodal phenomena --> particularly useful in situated language acquisition.

When young children first learn words, those words mostly refer to concrete things in the physical environment (objects and activities).

• Implication of the study : machines may be able to autonomously acquire and verify beliefs about the world, and to communicate about those beliefs in natural language.

ROUND : visual feature
PUSH : motor control feature
HEAVY : haptic feature

Page 8:

2. Words - Perceptual Categories : Salient Linguistic Feature

2.1 Language grounding system & categorization.

Sensory input --(translation)--> natural language description.

Basic approach : map continuous sensor input (vectors) onto linguistic categories.

e.g. Generative and discriminative models of categorization.

(a). Two prototypes can ʻcompeteʼ (b), leading to a category boundary along points of equal distance from both prototypes (if non-Euclidean distance measures are used, non-linear boundaries may emerge). Categories may also be modeled by explicitly representing categorical boundaries. In (c), a linear model, f(height)=A*width + B, encodes the same categorical distinction as the prototypes in (b)
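The contrast between competing prototypes (b) and an explicit boundary (c) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the prototype coordinates, the labels `tall` and `wide`, and all function names are hypothetical and do not come from the paper. With Euclidean distance, the set of points equidistant from two prototypes is a straight line, so a linear rule of the form height = A*width + B reproduces the same decisions.

```python
import math

# Hypothetical prototypes in (width, height) feature space.
PROTOTYPES = {"tall": (1.0, 3.0), "wide": (3.0, 1.0)}

def classify_by_prototype(width, height):
    """Prototype competition: choose the label of the nearest prototype."""
    point = (width, height)
    return min(PROTOTYPES, key=lambda label: math.dist(point, PROTOTYPES[label]))

def bisector(p, q):
    """Return (A, B) so that height = A*width + B is equidistant from p and q.

    Assumes the two prototypes differ in height (otherwise the line is vertical).
    """
    (pw, ph), (qw, qh) = p, q
    a = -(qw - pw) / (qh - ph)
    b = (qw**2 + qh**2 - pw**2 - ph**2) / (2 * (qh - ph))
    return a, b

A, B = bisector(PROTOTYPES["tall"], PROTOTYPES["wide"])

def classify_by_boundary(width, height):
    """Explicit boundary: for these prototypes, 'tall' lies above the line."""
    return "tall" if height > A * width + B else "wide"
```

For these assumed prototypes the boundary is simply height = width, and both classifiers agree everywhere except exactly on the line, where the tie is broken differently.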

Page 9:

2. Words - Perceptual Categories : Salient Linguistic Feature

2.2. Models of color naming : is the perceptual model fixed?

Mojsilovicʼs early model : assumes that the mapping between language and visual perceptual categories is fixed.

In real environments, however, people use the same color word differently in different contexts.

“Purple” / “Red” / “Red wine”

Page 10:

3. Words - Perceptual Categories : Context-dependent Word Use

3.1 Gärdenforsʼ model : Color distance

How linguistic convention and visual perception combine to determine word meanings.

: Arbitrary linguistic convention operates within perceptual color constraints.

e.g. ʻRed wineʼ
in Spanish : ʻvino tintoʼ (literally, colored wine)
in Catalan : ʻvi negreʼ (black wine)

Whether the wine is called red (tinto) or black (negre) is an arbitrary matter of linguistic convention, but in Gärdenforsʼ model the contrast between red and white wine is explained perceptually.

: Distance between white and red (dark) wine > distance between white and white (light) wine (using context-independent prototypes)
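A toy version of this idea: the same wine color gets a different name depending on which contrast class the language makes available. This sketch simplifies Gärdenforsʼ account to "nearest prototype among the allowed labels"; the RGB values, the `RED_WINE` sample, and the function name are assumptions made for illustration.

```python
import math

# Hypothetical RGB prototypes; the values are illustrative only.
COLOR_PROTOTYPES = {
    "red": (220, 20, 30),
    "white": (250, 250, 245),
    "purple": (110, 20, 110),
    "black": (10, 10, 10),
}

def name_color(rgb, allowed=None):
    """Return the label whose prototype is nearest, optionally within a contrast class."""
    labels = list(COLOR_PROTOTYPES) if allowed is None else allowed
    return min(labels, key=lambda label: math.dist(rgb, COLOR_PROTOTYPES[label]))

# Assumed sample of a dark, purplish wine color.
RED_WINE = (100, 15, 50)

context_free = name_color(RED_WINE)                      # all color terms compete
spanish_like = name_color(RED_WINE, ["red", "white"])    # convention: red vs. white wine
catalan_like = name_color(RED_WINE, ["black", "white"])  # convention: black vs. white wine
```

Out of context the sample is closest to the purple prototype, while inside either two-term wine vocabulary it falls on the dark side of the contrast. Which dark term is used is the conventional part; that the wine is far from white is the perceptual part.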

Page 11:

3. Words - Perceptual Categories : Context-dependent Word Use

3.2 Regier : Spatial Distance

: studied graded acceptability judgments of 1) spatial terms.

How do English speakers judge the term “above” in relation to the physical context?

ʻThe circle is above the blockʼ : which of the configurations a, b, c is best described by this sentence?

“Above”

L1 : connects the centers of mass of the two regions.

L2 : connects the closest points of the two regions.

L1 of (b) = L1 of (c)
L2 of (a) = L2 of (b)

Therefore neither L1 nor L2 alone can explain the judgments.

Even a single English term such as above or near thus requires a combination of visual features.
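The two lines L1 and L2 are easy to compute for regions given as sets of points. The sketch below measures how far each line deviates from straight up and blends the two deviations into a graded score. The equal weights and the linear mapping from angle to score are assumptions chosen for illustration, not Regierʼs fitted model, and all names are hypothetical.

```python
import math
from itertools import product

def center_of_mass(points):
    """Mean position of a region given as a list of (x, y) points."""
    xs, ys = zip(*points)
    return sum(xs) / len(xs), sum(ys) / len(ys)

def closest_pair(landmark, trajector):
    """The (landmark point, trajector point) pair with the smallest distance."""
    return min(product(landmark, trajector), key=lambda pair: math.dist(*pair))

def deviation_from_up(start, end):
    """Angle in degrees between 'straight up' and the vector start -> end (y grows upward)."""
    dx, dy = end[0] - start[0], end[1] - start[1]
    return abs(math.degrees(math.atan2(dx, dy)))

def above_score(landmark, trajector, w_l1=0.5):
    """Graded 'above' rating in [0, 1] from a weighted blend of the L1 and L2 deviations."""
    l1 = deviation_from_up(center_of_mass(landmark), center_of_mass(trajector))
    l2 = deviation_from_up(*closest_pair(landmark, trajector))
    blended = w_l1 * l1 + (1 - w_l1) * l2
    return max(0.0, 1.0 - blended / 90.0)

block = [(x, 0) for x in range(5)]          # a flat block lying on the x axis
score_centered = above_score(block, [(2, 3)])  # circle over the middle of the block
score_corner = above_score(block, [(4, 3)])    # circle over the right-hand end
score_beside = above_score(block, [(8, 0)])    # circle level with the block
```

For the circle over the right-hand end, L2 is vertical while L1 is tilted, so the two features disagree and the blended rating lands between the clear cases.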

Page 12:

3. Words - Perceptual Categories : Context-dependent Word Use

3.2 Regier : CONT.

2) movements : simple movies of objects moving relative to one another, used to visually ground words such as ʻthroughʼ and ʻintoʼ.

e.g. ʻPutting a key into a lockʼ vs. ʻRemoving a key from a lockʼ : events distinguished by their initial points vs. their end points.

3.3 Limitations in spatial semantics and further studies.

- Lack of functional context, e.g. a robot cannot tell how the meaning of ʻbehindʼ differs between ʻclean behind the couchʼ and ʻhide behind the couchʼ.

Page 13:

4. Models of Infant word learning that process ʻfirst-person-perspectiveʼ sensory data

4.1. Cross-channel early lexical learning (CELL)

“Step into the shoes” of humans and learn from natural sensory data.

: Recordings from natural human environments can be processed directly, without manual transcription.

CELL computational model : learns the associations between visual categories and spoken words. - A model of learning words from sights and sounds.

CELL vs. blinded system : a 50% gap in accuracy!

Page 14:

4. Models of Infant word learning that process ʻfirst-person-perspectiveʼ sensory data

4.1. Cross-channel early lexical learning(CELL)

Method : Lexical Learning Analysis

1) STM : utterance-context pairs (audio-visual input)
- audio : phonetic representations of spoken sequences (linguistic unit)
- video : context, i.e. visually observable objects and motion (semantic/contextual unit)

2) LTM : lexical candidates
- utterances are decomposed into a set of hypothesized linguistic unit prototypes
- contexts are decomposed into a set of hypothesized semantic category prototypes

e.g. bounce - ball, ruf-ruf - dog, vrrooom - car ... shoes, truck

Limitations :

1) Noise from sensory processing
2) Semantically inappropriate candidates, e.g. ʻyeahʼ
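One way to see why a filler such as ʻyeahʼ is a poor lexical candidate is to score each word-category pairing by how much the word tells us about what is in view. The sketch below uses mutual information over hand-made utterance-context pairs. The choice of mutual information, the toy data, and the function name are assumptions made here, and real input would be phoneme sequences and visual features rather than clean symbols.

```python
import math
from collections import Counter

# Hypothetical utterance-context pairs: (words heard, visual category in view).
PAIRS = [
    ({"look", "ball"}, "ball"),
    ({"yeah", "ball", "bounce"}, "ball"),
    ({"yeah", "dog"}, "dog"),
    ({"ruf", "dog", "look"}, "dog"),
    ({"yeah", "car"}, "car"),
    ({"vroom", "car"}, "car"),
]

def mutual_information(pairs, word, category):
    """Mutual information in bits between 'word was heard' and 'category was in view'."""
    n = len(pairs)
    joint = Counter((word in words, seen == category) for words, seen in pairs)
    heard = Counter()
    in_view = Counter()
    for (w, c), count in joint.items():
        heard[w] += count
        in_view[c] += count
    mi = 0.0
    for (w, c), count in joint.items():
        p_joint = count / n
        mi += p_joint * math.log2(p_joint / ((heard[w] / n) * (in_view[c] / n)))
    return mi

mi_ball = mutual_information(PAIRS, "ball", "ball")
mi_yeah = mutual_information(PAIRS, "yeah", "ball")
```

In this toy data ʻballʼ is heard exactly when a ball is in view, so its score is high, while ʻyeahʼ occurs independently of the ball context and scores zero. Mutual information is also high for words that reliably signal the absence of a category, so a fuller treatment would check the direction of the association as well.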

Page 15:

1. word - perception : indirect processing
- purely semantic
- context-dependent

2. first-person perspective : direct processing
- CELL (single object at once)
- Eye gaze (multiple objects at once)

3. whatʼs next?

VERB = ACTION.

Page 16:

5. Richer representational structures : grounding verbs in physical action.

Verbs that refer to physical actions are naturally grounded in representations that encode the temporal flow of events.

5.1 Siskind : perceptually grounded model of verbs, based on video recordings of human hands moving colored blocks.

- Relations of contact, support, and attachment among the human hand, blocks, and other objects [Talmyʼs theory of force dynamics]

- semantics of basic verbs = temporal schema, an expected sequence of force-dynamic interactions.

e.g. ʻHand picks up blockʼ (subject verb object)

1. table-supports-block
2. hand-contacts-block
3. hand-attached-block
4. hand-supports-block

* Allen relations : the 13 possible logical relations between two time intervals A and B
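The 13 Allen relations can be enumerated with a small classifier over (start, end) pairs. The relation names follow common usage; the function name and the interval timings for the pick-up example are hypothetical.

```python
def allen_relation(a, b):
    """Name the Allen relation holding between intervals a and b, each a (start, end) pair."""
    (a1, a2), (b1, b2) = a, b
    if a1 >= a2 or b1 >= b2:
        raise ValueError("intervals must have start < end")
    if a2 < b1:
        return "before"
    if b2 < a1:
        return "after"
    if a2 == b1:
        return "meets"
    if b2 == a1:
        return "met-by"
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    # Only partial overlap remains at this point.
    return "overlaps" if a1 < b1 else "overlapped-by"

# Assumed timings (in video frames) for two stages of a pick-up event.
table_supports_block = (0, 30)
hand_supports_block = (30, 80)
hand_contacts_block = (20, 80)

handover = allen_relation(table_supports_block, hand_supports_block)
contact = allen_relation(table_supports_block, hand_contacts_block)
```

With these assumed timings, support by the table ends exactly when support by the hand begins ("meets"), while contact starts before the table lets go ("overlaps"). A temporal schema for a verb can be read as a set of such constraints between force-dynamic states.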

5.2. Bailey et al. developed a system that learns verb semantics grounded in an action control structure, the ʻx-schemaʼ.

- e.g. the difference between ʻpushʼ and ʻshoveʼ

Page 17:

6. Integration of action and perception in grounding nouns.

6.1. Roy : structured networks of motor and sensory primitives : a conversational robot named Ripley.

ʻHand me the blue one on your rightʼ

- Ripley maintains a dynamic mental model, a three-dimensional model of its physical environment : the objects, the robotʼs own body, and the position of its (human) conversation partner.

- the contents of the robotʼs mental model may be updated based on linguistic, visual, or haptic input. (Ripley remembers the position of an object when it is out of its sensory field.)

- multimodal sensory expectations :

When Ripley does something -> What the system expects

Looks at the location -> finds the visual region

Reaches to the location -> touches and grasps the object

Grasps the object -> control over the objectʼs location; location info. updated
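The bookkeeping behind such a mental model can be sketched as a small class: observations update positions, unseen objects keep their last known position, and the stored positions generate expectations about what should be seen or touched at a location. This is a toy stand-in under stated assumptions, not Ripleyʼs implementation; the class, its methods, the object name, and the coordinates are all hypothetical.

```python
import math
from dataclasses import dataclass, field

@dataclass
class MentalModel:
    """Toy stand-in for a robot's 3-D model of the objects around it."""
    positions: dict = field(default_factory=dict)  # object name -> (x, y, z)

    def observe(self, visible):
        """Merge what is currently sensed; unseen objects keep their last known position."""
        self.positions.update(visible)

    def expected_at(self, location, tolerance=0.05):
        """Names of objects the model expects to see or touch near a location."""
        return sorted(
            name for name, pos in self.positions.items()
            if math.dist(pos, location) <= tolerance
        )

    def grasp_and_move(self, name, new_location):
        """After a grasp the robot controls the object's location, so update it directly."""
        if name not in self.positions:
            raise KeyError(name)
        self.positions[name] = new_location

model = MentalModel()
model.observe({"blue_cup": (0.4, 0.1, 0.0)})  # the cup is in view
model.observe({})                             # the robot looks away; nothing is in view
still_expected = model.expected_at((0.4, 0.1, 0.0))
model.grasp_and_move("blue_cup", (0.0, 0.3, 0.0))
after_move = model.expected_at((0.0, 0.3, 0.0))
```

The empty second observation does not erase the cup, which is the object-permanence behavior described above, and the grasp updates the location without needing a new visual observation.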

Page 18:

6. Integration of action and perception in grounding nouns.

6.1. Roy : CONT.

Ripleyʼs representations and algorithms ground the meaning of verbs, adjectives, and nouns in a unified representational system.

VERBS : actions, represented by motor-control structures like x-schemas

ADJECTIVES : all perceptual properties correspond to actions.

red = color category linked to motor programs

heavy = haptic category linked to specific actions

NOUNS : objects linked with locations

Ball - Round (or color, size..) - all of the actions involved.

Page 19:

7. Conclusions

- Interaction between word use, perception, and action.
- Further research (Box 3) : other aspects of language such as grammatical composition and functional use in social context.
- Re-unites sub-fields of AI : computer vision, parsing, information retrieval, machine learning, and planning.
- Drop in the cost of sensor and robotic technology, and ubiquitous situated computing : create new forms of situated human-machine communication.
