1/30 IM-CLeVeR: Intrinsically Motivated Cumulative Learning Versatile Robots Gianluca Baldassarre, Marco Mirolli, Francesco Mannella, Vincenzo Fiore, Stefano Zappacosta, Daniele Caligiore, Fabian Chersi, Vieri Santucci, Simona Bosco

Mar 26, 2015


1/30

IM-CLeVeR: Intrinsically Motivated Cumulative Learning Versatile Robots

Gianluca Baldassarre, Marco Mirolli, Francesco Mannella, Vincenzo Fiore, Stefano Zappacosta, Daniele Caligiore, Fabian Chersi, Vieri Santucci, Simona Bosco

2/30

Outline

IM-CLeVeR: Intrinsically Motivated Cumulative Learning Versatile Robots

- The figures of the project
- The project vision
- The 3 pillars of the project idea + 4 S/T objectives
- WP3: Experiments
- WP4: Abstraction
- WP5: Intrinsic motivations
- WP6: Hierarchical architectures
- WP7: Integration and demonstrators
- Conclusions

3/30

Outline

IM-CLeVeR: Intrinsically Motivated Cumulative Learning Versatile Robots

- Integrated project
- Call: Cognitive Systems, Interactions and Robotics
- EU funds: 5.9 million euros
- 7 (8) partners
- Start: May 2009
- End: April 2013

4/30

Vision: the problem

How can we create “truly intelligent” robots?

- Versatile: have many goals; re-use skills
- Robust: function in different conditions, with noise
- Autonomous: learning is paramount

Weng, McClelland, Pentland, Sporns, Stockman, Sur, Thelen (Science, 2001):

- …knowledge-based systems (e.g. production systems)…
- …learning systems focussed on single tasks (e.g. RL)…
- …evolutionary systems…

Important results, but limited autonomy and scalability. . . . on the contrary . . .

. . . organisms do scale, are flexible, and are robust!

5/30

Vision: the idea

Why are organisms so special? Looking at children…

6/30

Vision: the idea

Ingredients:
- Powerful abstractions: “elephant on table leg”, “it slides down”
- Explore
- Record interesting states
- Intrinsic motivations (interesting states, learning rates):
  - motivate to reproduce states (goals)
  - guide learning of skills
- Skills are re-used and composed:
  - to explore
  - to produce new skills

Science: which brain and behavioural mechanisms are behind these processes?

Technology: can we reverse engineer them? Can we design algorithms with a similar power?

7/30

Vision: 2 promises

- Science: we can understand organisms
- Technology: we can develop a new methodology for designing robots… in particular…

Learn actions cumulatively… on the basis of intrinsic motivations… re-use them to build other actions… and achieve externally assigned goals with them.

8/30

Vision: how we will do it: 3 pillars + 4 S/T objectives

The 3 pillars:
- WP4: Abstraction and attention
- WP5: Intrinsic motivations
- WP6: Hierarchical architectures to support cumulative learning

The 4 S/T objectives:
1. Empirical investigations: monkeys, children, adults, Parkinson patients
2. Computational bio-constrained models: mechanisms underlying brain and behaviour
3. Machine-learning models: powerful algorithms and architectures
4. Two robotic demonstrators: CLEVER-B, CLEVER-K

Diagram labels: Suitable representations; Focussing learning; Science; From Science to Technology; Technology; From Technology to Science

9/30

The project WPs

[Diagram repeated from the “3 pillars + 4 S/T objectives” slide, annotated with the work packages WP3, WP4, WP5, WP6 and WP7]

10/30

WP3: Experiments and mechatronic board

[Diagram repeated from the “3 pillars + 4 S/T objectives” slide, labelled WP3]

11/30

WP3: “Joystick experiment” background

USFD (Peter Redgrave & Kevin Gurney)

Actions → novel outcomes → dopamine → BG learning

Redgrave & Gurney, 2006, Nature Rev. Neuroscience

12/30

WP3: Empirical Experiments: “Joystick experiment”

Method:
- Adult humans and Parkinsonian patients
- Joystick manoeuvring (gesture, location, timing) of a cursor on a screen to obtain reinforcement or salient event
- For studying: Actions → novel outcomes → dopamine → BG learning

13/30

WP3: Empirical Experiments: “Board experiment”

- UCBM-LBRB (Eugenio Guglielmelli): mechatronic board, intelligent sensors
- UCBM-LDN (Flavio Keller): children
- CNR-ISTC-UCP (Elisabetta Visalberghi): monkeys
- Goals: (a) investigating properties of stimuli causing intrinsic motivations; (b) acquisition of skills based on intrinsic motivations

[Figure labels: inertial/magnetic unit + battery + wireless; tactile sensors]

Sabbatini, Stammati, Tavares, Visalberghi, 2007, Amer. J. Primatology
Campolo, Taffoni, Schiavone, Formica, Guglielmelli, Keller, 2009, Int. J. Social Robotics

14/30

WP4: Abstraction

[Diagram repeated from the “3 pillars + 4 S/T objectives” slide, labelled WP4]

15/30

WP4 Abstraction: motor, perception, attention, vergence

Abstraction is a key ingredient for intrinsic motivations and hierarchical actions:
- Motor: key in hierarchies
- Perceptual: key in intrinsic motivations: e.g., retina images would always be novel without abstraction
- Attention/vergence: two key forms of abstraction

16/30

WP4 Intrinsic motivations for developing vergence and perceptual abstraction

FIAS (Jochen Triesch)
- E.g.: reward when target fixated with both eyes drives development of vergence
- Similar mechanisms to develop perceptual abstraction

Weber & Triesch, 2009, IJCNN
Franz & Triesch, 2007, ICDL

17/30

WP5: Novelty detection

[Diagram repeated from the “3 pillars + 4 S/T objectives” slide, labelled WP5]

18/30

WP5 Intrinsic (extrinsic) motivations

Extrinsic motivations (e.g. food, sex, money):
- Psychology (Berlyne, White, Deci & Ryan): motivate actions to achieve specific goals
- Drive actions whose effects directly increase fitness
- Come back again together with the homeostatic needs they are associated with

Intrinsic motivations (skill/knowledge acquisition):
- Psychology: motivate actions for their own sake
- Drive actions whose effects are an increase in: (a) knowledge or prediction ability; (b) competence to do
- Stop driving actions once the knowledge/competence is acquired
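
The last point, that an intrinsic motivation fades once the knowledge is acquired, can be illustrated with a minimal Python sketch. It rests on assumptions that are not in the slides: the intrinsic reward is taken to be the absolute prediction error of a one-number predictor trained with a delta rule, and `run_intrinsic_learning`, `learning_rate` and `threshold` are hypothetical names and values chosen for illustration.

```python
def run_intrinsic_learning(outcome=1.0, learning_rate=0.3, threshold=0.05, max_steps=100):
    """Repeat one action; the intrinsic reward is the predictor's error on its outcome."""
    prediction = 0.0   # the agent's current guess about the action's outcome
    rewards = []
    for _ in range(max_steps):
        error = abs(outcome - prediction)   # surprise, used here as intrinsic reward
        rewards.append(error)
        if error < threshold:               # knowledge acquired: the motivation ends
            break
        # Delta-rule update: move the prediction towards the observed outcome.
        prediction += learning_rate * (outcome - prediction)
    return rewards


rewards = run_intrinsic_learning()
print([round(r, 3) for r in rewards])
```

The reward shrinks on every repetition and the loop stops by itself, whereas an extrinsic reward such as food would return whenever the associated need returns.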

19/30

WP5 Intrinsic motivations

CNR-LOCEN (Gianluca Baldassarre, Marco Mirolli)
- Young robot: low level of hierarchy develops skills based on evolved ‘reinforcers’ (knowledge-based intrinsic motivations)
- Young robot: high level of hierarchy selects skills which produce the highest surprise (competence-based intrinsic motivations)
- Adult robot: high level of hierarchy performs skill composition to achieve salient goals (external rewards fitness measure)

[Figure panels: child robot task; adult robot tasks; young robot results, before and after learning; adult robot results]

Schembri, Mirolli, Baldassarre, 2007, ICDL, ECAL, EPIROB

20/30

WP5 Novelty detection with habituable neural networks

UU (Ulrich Nehmzow)
- Task: find novel elements in world
- Image pre-processing (abstraction)
- Habituable neural network

[Figure: task]
[Figure from Marsland et al., 2005 (J. Rob. Aut. Sys.)]

Neto & Nehmzow, 2007, Rob. & Aut. Syst.
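
As a rough illustration of the habituation idea only, the Python sketch below keeps one habituable "synapse" per perceptual category and lets its efficacy decay each time the category is seen again. It is not the network of the cited papers: the image pre-processing is assumed to have already mapped each image to a category label, the update is an Euler step of a first-order habituation rule tau * dy/dt = alpha * (y0 - y) - S, and the class name and all parameter values are hypothetical.

```python
class HabituatingDetector:
    """Novelty detector with one habituable synapse per perceptual category."""

    def __init__(self, y0=1.0, alpha=1.05, tau=3.33):
        # y0: resting efficacy; alpha: recovery rate; tau: time constant.
        # All three values are illustrative assumptions.
        self.y0, self.alpha, self.tau = y0, alpha, tau
        self.efficacy = {}

    def novelty(self, category):
        """Return the current novelty of a category, then habituate to it."""
        y = self.efficacy.get(category, self.y0)
        # Euler step of tau * dy/dt = alpha * (y0 - y) - S with S = 1 (stimulus present).
        self.efficacy[category] = y + (self.alpha * (self.y0 - y) - 1.0) / self.tau
        return y


detector = HabituatingDetector()
familiar = [detector.novelty("wall") for _ in range(5)]  # decays with each repetition
fresh = detector.novelty("door")                          # first sighting: full novelty
print([round(v, 3) for v in familiar], fresh)
```

A category that keeps reappearing stops being reported as novel, while anything seen for the first time still scores the resting value `y0`.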

21/30

WP5 Intrinsic motivations based on information theory

IDSIA (Juergen Schmidhuber)
- Theoretic ML, robotics, information-theory intrinsic motivations
- ‘Data compression improvement’ = intrinsic motivation

Schmidhuber, 2009, Journal of SICE
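
The equation of ‘data compression improvement’ with intrinsic motivation can be made concrete with a toy Python sketch. The assumptions are mine, not the slide's: the "compressor" is a Laplace-smoothed symbol-frequency model, the cost of the history is its ideal code length in bits under that model, and the intrinsic reward at each step is the number of bits saved on the history by updating the model with the newest observation. `code_length` and `compression_progress` are hypothetical helper names.

```python
import math
from collections import Counter


def code_length(history, counts, alphabet_size):
    """Bits needed to encode `history` under a Laplace-smoothed frequency model."""
    total = sum(counts.values()) + alphabet_size
    return sum(-math.log2((counts[s] + 1) / total) for s in history)


def compression_progress(stream, alphabet_size):
    """Return, per observation, the bits saved on the history by the model update."""
    counts, history, rewards = Counter(), [], []
    for symbol in stream:
        history.append(symbol)
        before = code_length(history, counts, alphabet_size)
        counts[symbol] += 1                      # the 'learning' step
        after = code_length(history, counts, alphabet_size)
        rewards.append(before - after)           # compression improvement = reward
    return rewards


rewards = compression_progress("aaaaaaaa", alphabet_size=2)
print([round(r, 3) for r in rewards])
```

On a perfectly regular stream the reward is positive but keeps shrinking: once the regularity has been learned there is little left to compress better, so the stream becomes boring.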

22/30

WP6: Hierarchical architectures

[Diagram repeated from the “3 pillars + 4 S/T objectives” slide, labelled WP6]

23/30

WP6 Hierarchical architectures

Cumulative learning needs hierarchical architectures:
- To avoid catastrophic forgetting
- To find solutions by ‘composing skills’: dirty but fast solutions, then refine
- Because brain is hierarchical
- Because brain has a (soft) modularity at all levels

From Fuster, 2001, Neuron
McGovern, Sutton, Fagg

24/30

WP6 Intrinsic motivations, hierarchical RL (options)

UMASS (Andrew Barto)
- Intrinsically Motivated Reinforcement Learning
- HRL: options theory

Simsek & Barto, 2006, ICML; Singh, Barto, Chentanez, 2004, NIPS

Sutton et al., Option theory
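
In the options framework an option is usually described by three parts: an initiation set, an intra-option policy and a termination condition. The Python sketch below shows that structure on a hypothetical one-dimensional corridor; the corridor, the `Option` field names and `run_option` are illustrative assumptions, not code from the project.

```python
import random
from dataclasses import dataclass
from typing import Callable


@dataclass
class Option:
    can_start: Callable[[int], bool]    # initiation set: states where the option may begin
    policy: Callable[[int], int]        # intra-option policy: state -> primitive action
    stop_prob: Callable[[int], float]   # termination condition: state -> probability of stopping


def run_option(option, state, step, max_steps=100):
    """Execute an option until it terminates; return the final state and the steps taken."""
    if not option.can_start(state):
        raise ValueError("option not available in this state")
    for t in range(1, max_steps + 1):
        state = step(state, option.policy(state))
        if random.random() < option.stop_prob(state):
            return state, t
    return state, max_steps


def corridor_step(state, action):
    """Hypothetical world: positions 0..10, actions -1 (left) and +1 (right)."""
    return max(0, min(10, state + action))


# A skill that walks right until it reaches the 'door' at position 5.
go_to_door = Option(
    can_start=lambda s: s < 5,
    policy=lambda s: +1,
    stop_prob=lambda s: 1.0 if s == 5 else 0.0,
)

final_state, steps = run_option(go_to_door, 2, corridor_step)
print(final_state, steps)  # 5 3
```

A higher level of the hierarchy can then choose among such options as if they were single actions, which is what makes skill re-use and composition possible.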

25/30

WP6 Bio-inspired / bio-constrained hierarchical reinforcement learning

CNR-LOCEN (Gianluca Baldassarre & Marco Mirolli)
- Piaget theory: actions support learning of other actions
- Camera, dynamic arm, reaching tasks
- Continuous state/action reinforcement learning
- Hierarchical RL: segmentation, Piaget

Caligiore, Borghi, Parisi, Mirolli, Baldassarre, ongoing

26/30

WP6 Development of sensorimotor mappings in robots

AU (Mark Lee)
- Developmental psychology and robotics
- Staged development of sensorimotor behaviour
- LCAS – Lift Constraint, Act, and Saturate

Lee, Meng, Chao, 2007, Rob. & Auton. Sys.
Lee, Meng, Chao, 2007, Adaptive Behaviour

27/30

WP7: Integration

[Diagram repeated from the “3 pillars + 4 S/T objectives” slide, labelled WP7]

28/30

WP7 CLEVER-K: Kitchen scenario

Main responsible: IDSIA, UU

Leave a robot alone for a month or so… …interacting with the environment: on the basis of intrinsic motivations… …it will build up a repertoire of actions incrementally.

Come back and assign it a goal (e.g. by reward)… …and it will learn to accomplish it very quickly.

3 iCub robots from IIT (Giorgio Metta)

29/30

WP7 CLEVER-B: Board scenario

Main responsible: AU, CNR-LOCEN

30/30

Conclusions: A timely project

- Timely research goals: intrinsic motivations, hierarchical architectures
- Within important trends: developmental robotics, computational system neuroscience, emotions/motivations
- In synergy with various events: EpiRob, ICDL, J. of Autonomous Mental Development
- In line with EU calls: “Cognitive Systems, Interactions and Robotics”
- First EU Integrated Project wholly focussed on these topics

www.im-clever.eu