A Cognitive Architecture Approach to Interactive Task Learning
John E. Laird
University of Michigan
Students: James Kirk, Shiwali Mohan (PARC), Aaron Mininger
Interactive Task Learning
An agent that
• learns new task specifications: objects, features, relations, goals and subgoals, possible actions (physical and conceptual), situational constraints on behavior, policy for behavior, and when the task is appropriate;
• uses natural interaction: language, gestures, demonstrations;
• comprehends the task description and uses its cognitive and physical capabilities to perform the task;
• learns fast (from a small number of experiences);
• learns a native representation (assimilation, fast execution).
Cognitive Architecture
• Fixed computational structures underlying intelligent behavior
– Representations of knowledge
– Memories that hold knowledge
– Processors that manipulate knowledge
• Supports end-to-end behavior
– Includes integration with perception and action
• General across tasks
– Architectural mechanisms are reused across every task and subtask
– Task-specific knowledge guides task behavior
• Complete
– No “escape” to additional specialized programming
Different Goals of Cognitive Architecture Research
Biological modeling:
– Model what we know about the brain: neurons, neural circuits, …
– Predict neural activity and cognitive behavior
– Examples: LEABRA, SPAUN
Psychological modeling:
– Model human performance in a wide range of cognitive tasks
– Predict human reaction time and error rates for psychological tasks
– Examples: ACT-R, EPIC, CLARION, LIDA, CHREST, 4CAPS
AI Functionality:
– Toward human-level intelligence inspired by psychology and biology
– Emphasizes more complex cognitive processing
– Examples: Soar, Companions, Sigma, ICARUS, Polyscheme, CogPrime
Newell’s Time Scale of Human Action
Scale (sec)   Time Units   System           Band
10^7          months       -                Social
10^6          weeks        -                Social
10^5          days         -                Social
10^4          hours        Task             Rational
10^3          10 min       Task             Rational
10^2          minutes      Task             Rational
10^1          10 sec       Unit task        Cognitive
10^0          1 sec        Operations       Cognitive
10^-1         100 ms       Deliberate act   Cognitive
10^-2         10 ms        Neural circuit   Biological
10^-3         1 ms         Neuron           Biological
10^-4         100 µs       Organelle        Biological
[Figure overlay: architectures placed along the time scale: Soar, ACT-R, LEABRA, SPAUN, Companions, EPIC, Sigma]
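The time scale can be read as a lookup from a duration to a band. The following is a minimal sketch of that reading, not part of the original slides: the function name `newell_band` is hypothetical, and rounding a duration to the nearest power of ten is an assumption made here to pick a row.

```python
import math

# Band boundaries as powers of ten in seconds, taken from the time scale.
BANDS = [
    (-4, -2, "Biological"),
    (-1, 1, "Cognitive"),
    (2, 4, "Rational"),
    (5, 7, "Social"),
]


def newell_band(seconds: float) -> str:
    """Return the band for a duration (hypothetical helper).

    Assumption: the duration is assigned to the nearest power of ten.
    """
    if seconds <= 0:
        raise ValueError("duration must be positive")
    exponent = round(math.log10(seconds))
    for low, high, name in BANDS:
        if low <= exponent <= high:
            return name
    return "outside the table"


# A ~50 ms cycle rounds to 10^-1 s and falls in the Cognitive band;
# an hour-long activity rounds to 10^4 s and falls in the Rational band.
cycle_band = newell_band(0.05)
hour_band = newell_band(3600)
```

Under this rounding assumption, anything shorter than the organelle row or longer than the months row is reported as outside the table rather than forced into a band.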
Standard Model: Commonalities Across Architectures
• Organization
– Modular architecture: WM, LTM, procedural, perceptual/motor…
• Representation of information
– Probabilistic/statistical representation of perceptual data
– Symbolic relational structures in short- and long-term memories
– Non-symbolic representations of meta-data, used for retrieval from long-term memory, decision making, learning
• Processing
– Complex behavior arises from simple decisions controlled by knowledge
– Significant internal asynchronous parallelism
– ~50 ms is the basic cycle time to achieve human real-time cognition
• Learning: multiple types, incremental and on-line
– Skill learning, reinforcement learning, activation adjustment, declarative learning
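The idea that complex behavior arises from a sequence of simple, knowledge-controlled decisions can be illustrated with a toy match-select-apply loop. This is a generic sketch, not the API or decision procedure of Soar or any other architecture: `Rule`, `run`, the numeric `preference`, and the counting task are all hypothetical, and the 50 ms figure is used only to estimate elapsed time.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

CYCLE_MS = 50  # approximate basic cycle time quoted on the slide


@dataclass
class Rule:
    name: str
    condition: Callable[[Dict], bool]  # tests working memory
    action: Callable[[Dict], None]     # modifies working memory
    preference: float = 0.0            # toy numeric preference (assumption)


def run(rules: List[Rule], wm: Dict, max_cycles: int = 100) -> List[str]:
    """Repeat: match rules against working memory, select one, apply it."""
    trace = []
    for _ in range(max_cycles):
        matched = [r for r in rules if r.condition(wm)]
        if not matched:
            break  # nothing applies, so the loop stops
        chosen = max(matched, key=lambda r: r.preference)
        chosen.action(wm)
        trace.append(chosen.name)
    return trace


def increment(wm: Dict) -> None:
    wm["count"] += 1


def report(wm: Dict) -> None:
    wm["done"] = True


rules = [
    Rule("increment",
         lambda wm: not wm["done"] and wm["count"] < wm["target"],
         increment),
    Rule("report",
         lambda wm: not wm["done"] and wm["count"] == wm["target"],
         report,
         preference=1.0),
]
wm = {"count": 0, "target": 3, "done": False}
trace = run(rules, wm)
elapsed_ms = len(trace) * CYCLE_MS  # four decisions at ~50 ms each
```

Each pass through the loop makes one small decision; the task-specific knowledge lives entirely in the rules, while the loop itself stays fixed across tasks.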
Common Structures of many Cognitive Architectures
[Diagram labels: Perception, Short-term Memory, Procedural Long-term Memory, Declarative Long-term Memory, Action Selection, Procedure Learning, Declarative Learning, Goals, Action]
Soar Structure
[Diagram labels: Symbolic Long-Term Memories (Procedural, Semantic, Episodic); Chunking; Reinforcement Learning; Semantic Learning; Episodic Learning; Symbolic Working Memory; Decision Procedure; Perception; Action; controller]
Spatial Visual System (SVS)
Object-based continuous metric space
Supports mental imagery
Interactive Task Learning Workshop
May 12-13, 2014: Ann Arbor, MI
John Anderson (CMU), Ken Forbus (Northwestern U), Kevin Gluck (AFRL), Chad Jenkins (Brown), John Laird (UM), Christian Lebiere (CMU),
Dario Salvucci (Drexel), Matthias Scheutz (Tufts), Andrea Thomaz (Georgia Tech), Greg Trafton (NRL), Robert Wray (Soar Tech), Shiwali Mohan (UM), James Kirk (UM)
Report: http://soar.eecs.umich.edu/publications
What isn’t Interactive Task Learning?
• Not just Interactive Task Learning
– Not just interpret and execute commands
– Learns multiple tasks that it can perform in the future
• Not just Interactive Task Learning
– Not just policy learning
– Learns task specification/formulation
• Not just Interactive Task Acquisition
– Not offline learning from observation or compilation of a high-level language: TAQL, HERBAL, HLSR, GDL
– Learns through natural mixed-initiative interaction with a human.
Big Picture
• Acquire task description via language
• Construct internal task representation
• Extract internal representation of objects in the world
• Reason over objects, relationships to determine available actions
• Search for solution by internally simulating actions
• Manipulate environment based on discovered solution
[Figure: example internal task representation for Tic-Tac-Toe. Labels: Game, A1, C1, P1, block, location, C11, C12, place, move]
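The reasoning and search steps can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the agent's actual implementation: the board encoding, the function names, and the use of iterative deepening are choices made here, and the opponent is ignored so that the search stays a simple single-agent problem.

```python
from typing import List, Optional, Tuple

Board = Tuple[Optional[str], ...]  # 9 cells, row-major; "X", "O", or None

# All rows, columns, and diagonals of a 3x3 board.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def available_actions(board: Board) -> List[int]:
    """Reason over objects and relations: 'place' is legal on any empty location."""
    return [i for i, cell in enumerate(board) if cell is None]


def simulate(board: Board, location: int, mark: str = "X") -> Board:
    """Internally simulate the 'place' action without touching the world."""
    cells = list(board)
    cells[location] = mark
    return tuple(cells)


def goal_reached(board: Board, mark: str = "X") -> bool:
    """Goal 'three-in-a-row': some line holds three of the agent's marks."""
    return any(all(board[i] == mark for i in line) for line in LINES)


def search(board: Board, depth: int) -> Optional[List[int]]:
    """Depth-limited search over simulated placements."""
    if goal_reached(board):
        return []
    if depth == 0:
        return None
    for loc in available_actions(board):
        plan = search(simulate(board, loc), depth - 1)
        if plan is not None:
            return [loc] + plan
    return None


def solve(board: Board, max_depth: int = 3) -> Optional[List[int]]:
    """Iterative deepening: try the shallowest plans first."""
    for depth in range(max_depth + 1):
        plan = search(board, depth)
        if plan is not None:
            return plan
    return None


# Two X's already in the top row: one simulated placement reaches the goal.
start: Board = ("X", "X", None, "O", "O", None, None, None, None)
plan = solve(start)
```

Only after `solve` returns a plan would the agent act in the environment, which corresponds to the final step of the pipeline.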
[Diagram of the agent in Soar. Labels: Perception; Action; Working Memory; Procedural Memory; Semantic Memory; Episodic Memory; Spatial Visual System; Spatial Primitives; Parsing Knowledge; Interaction; Indexing; Verb Learning; Noun Learning; Prep Learning; Action Knowledge; Word – Category Mapping; Preposition – Spatial Relation Mapping; Verb – Operator Mapping; Primitive Verb – Operator Mapping; Noun/Adjective – Perceptual Symbol Mapping; Locations; Agent’s Experiences]
Desiderata
D1. Learning Competent
D2. Task Competent
D3. Task General
D4. Easy to Teach
D5. Efficient Execution
D6. Organically integrates with other agent activities
Transfer Results between Games
[Bar chart: Number of Interactions (y-axis, 0 to 80) for Connect-3, Tic-Tac-Toe, and 4-Queens under three conditions: No transfers, After Connect-3, After Connect-3 and Tic-Tac-Toe]
Tic-Tac-Toe after 4-Queens and Connect-3
If Tic-Tac-Toe is taught last, only 5 interactions are needed.
M: the game is tic-tac-toe
A: I do not know how to play tic-tac-toe. Tell me the name of a legal action in this game.
M: place (learned in 4 Queens)
A: Ok tell me the name of another legal action in this game, or finished if no more actions.
M: finished
A: Ok tell me the name of the goal in the game.
M: three-in-a-row (learned in Connect-3)
A: Ok tell me the name of a failure state in the game. (or none)
M: none
A: Ok I have now learned the basics of the game.
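The count of five interactions follows from the shape of the dialogue: one mentor turn for the game name, one per legal action, one to close the action list, one for the goal, and one for the failure state. A minimal sketch of that bookkeeping follows; `teach_game` and its arguments are hypothetical names, and the sketch only reports which named concepts are not yet known rather than modeling the extra teaching they would require.

```python
from typing import Iterable, List, Optional, Set, Tuple


def teach_game(name: str,
               actions: Iterable[str],
               goal: str,
               failure: Optional[str],
               known_concepts: Set[str]) -> Tuple[List[str], List[str]]:
    """Replay the top-level teaching dialogue (hypothetical helper).

    Returns the mentor's turns and the named concepts the agent does not
    already know from earlier games.
    """
    actions = list(actions)
    mentor_turns = [f"the game is {name}"]
    mentor_turns += actions                # one turn per legal action name
    mentor_turns.append("finished")        # closes the list of actions
    mentor_turns.append(goal)              # name of the goal
    mentor_turns.append(failure if failure else "none")  # failure state
    named = actions + [goal] + ([failure] if failure else [])
    unknown = [c for c in named if c not in known_concepts]
    return mentor_turns, unknown


# Tic-Tac-Toe taught last: both concepts were learned in earlier games.
known = {"place", "three-in-a-row"}
turns, unknown = teach_game("tic-tac-toe", ["place"], "three-in-a-row", None, known)
```

With an empty `known` set, the same call would return `place` and `three-in-a-row` as unknown, each of which would need to be taught through further interaction.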
Efficiency of Communication
[Bar chart: Tokens (y-axis, 0 to 800) by Method for Specifying Instructions (NL average, Rosie+, Rosie, Soar, GDL) for ToH, Tic-Tac-Toe, and 8-puzzle]
Future Work on Taskability
• More generality and complexity
– More complex games and concepts (hidden state, dynamic action, …)
– Beyond games to more real-world applications (mobile robots)
• More accessible communication
– More natural language, gestures, …
• Learn new language constructions
– Extend syntactic structures through instruction
• Informed by available background knowledge
– Take advantage of available knowledge bases
Etc.
• Workshop on Interactive Task Learning in April at the International Conference on Cognitive Modelling (ICCM-2015) in Groningen, Netherlands.
• Advances in Cognitive Systems (ACS-2015) Conference in May at Georgia Tech.
• My web site: http://ai.eecs.umich.edu/people/laird/
• Soar web site: http://soar.eecs.umich.edu/
• References:
– Kirk, J., Laird, J. E. (2014): Interactive task learning for simple games. Advances in Cognitive Systems 3, 11-28.
– Mohan, S., Laird, J.: Learning Goal-Oriented Hierarchical Tasks from Situated Interactive Instruction. Proceedings of the 27th AAAI Conference on Artificial Intelligence (AAAI).
– Laird et al.: Report on the NSF-funded Workshop on Interactive Task Learning (2014).
– Mohan, S., Mininger, A., Laird, J. E. (2013): Towards an Indexical model of situated comprehension for real-world cognitive agents. Advances in Cognitive Systems 3, 163-182.
– Kirk, J., Laird, J. E. (2013): Learning Task Formulations through Situated Interactive Instruction. Proceedings of the 2nd Conference on Advances in Cognitive Systems, Baltimore, Maryland.
– Mohan, S., Mininger, A., Kirk, J., Laird, J. E. (2012): Acquiring Grounded Representations of Words with Situated Interactive Instruction. Advances in Cognitive Systems, Volume 2, December 2012, Palo Alto, California.