Robot control with Deep Reinforcement Learning Hadi Beik-Mohammadi INTELLIGENT ROBOTICS SEMINAR TALK, DECEMBER 2017

Aug 30, 2019

Page 1

Robot control with Deep Reinforcement Learning

Hadi Beik-Mohammadi

INTELLIGENT ROBOTICS SEMINAR TALK, DECEMBER 2017

Page 2

• Inverse and Forward Kinematics

• How to learn a behavior

• Methods

    • Inverse Recurrent Model

    • Deep Deterministic Policy Gradient

• Conclusion

Page 3

[Figure: a two-joint arm (Joint 0, Joint 1, End Effector) and its Target, shown in two views. Joint Space: axes Joint 0 and Joint 1, each from 0 to 360. Cartesian Space: axes X (CM) from 0 to 300 and Y (CM) up to 200. The values 80 and 290 are also marked.]

Page 4

Forward and Inverse Kinematics

Joint Space: θ1, θ2, θ3, ..., θn

Cartesian Space: X, Y, Z, θX, θY, θZ

Forward Kinematics: Joint Space → Cartesian Space

Inverse Kinematics: Cartesian Space → Joint Space
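A minimal sketch of the two mappings, assuming a planar arm with two revolute joints. The link lengths `L1` and `L2`, the function names, and the planar geometry are illustrative assumptions and do not come from the slides. Forward kinematics maps joint angles to the end-effector position; inverse kinematics recovers one joint configuration that reaches a given position.

```python
import math

# Hypothetical link lengths in cm (illustrative values, not from the talk).
L1 = 100.0
L2 = 80.0


def forward_kinematics(theta1, theta2):
    """Joint space -> Cartesian space for a planar 2-link arm (radians)."""
    x = L1 * math.cos(theta1) + L2 * math.cos(theta1 + theta2)
    y = L1 * math.sin(theta1) + L2 * math.sin(theta1 + theta2)
    return x, y


def inverse_kinematics(x, y):
    """Cartesian space -> joint space; returns one of the possible solutions.

    Raises ValueError if the target lies outside the reachable workspace.
    """
    d2 = x * x + y * y
    cos_t2 = (d2 - L1 * L1 - L2 * L2) / (2.0 * L1 * L2)
    if not -1.0 <= cos_t2 <= 1.0:
        raise ValueError("target is out of reach")
    theta2 = math.acos(cos_t2)  # solution with theta2 in [0, pi]
    theta1 = math.atan2(y, x) - math.atan2(L2 * math.sin(theta2),
                                           L1 + L2 * math.cos(theta2))
    return theta1, theta2


if __name__ == "__main__":
    target = (120.0, 60.0)
    angles = inverse_kinematics(*target)
    print("joint angles (rad):", angles)
    print("reached position:", forward_kinematics(*angles))
```

The inverse mapping is not unique in general: for this arm, a mirrored elbow configuration with negative `theta2` reaches the same point.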

Page 5

How to build agents that learn behaviors in a dynamic world?

"The brain evolved, not to think or feel, but to control movement"

Daniel Wolpert (a nice TED talk)

Learning a behavior:

Learning to map sequences of observations to actions, for a particular goal [3]

Page 6

What supervision does an agent need to learn purposeful behaviors in dynamic environments?

• Rewards: sparse feedback from the environment on whether the desired behavior is achieved

• Demonstrations

• Specifications/Attributes of good behavior

Page 7

Methods

Inverse Recurrent Model (IRM) [1]

• Control of Snake-Like, Many-Joint Robot Arms

• BPTT (backpropagation through time) on recurrent forward models

• Recurrent Neural Networks (LSTM)

• Offline

Deep Deterministic Policy Gradient (DDPG) [2]

• Deep Reinforcement Learning Method

• Actor-Critic Network

• Continuous Action Domain

• Model-Free

• Online

Page 8

Inverse Recurrent Model (IRM) [1]
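One way to read "BPTT on recurrent forward models" is that joint commands are inferred by propagating the target error backward through a forward model. The following is only a loose sketch of that idea, not the method of [1]: an analytic planar chain (`forward_model`) stands in for the learned LSTM forward model, and the function names, link length, learning rate, and step count are all assumptions made for illustration.

```python
import math


def forward_model(thetas, link_len=1.0):
    """Planar chain processed joint by joint, one 'recurrent' step per joint.

    Stand-in for a learned forward model (assumption, see lead-in).
    Returns the end-effector position and the absolute angle of every link.
    """
    x = y = phi = 0.0
    phis = []
    for theta in thetas:
        phi += theta                      # accumulate relative joint angles
        x += link_len * math.cos(phi)
        y += link_len * math.sin(phi)
        phis.append(phi)
    return (x, y), phis


def infer_joint_angles(target, thetas, lr=0.01, steps=5000, link_len=1.0):
    """Gradient descent on the joint angles through the fixed forward model."""
    thetas = list(thetas)
    for _ in range(steps):
        (x, y), phis = forward_model(thetas, link_len)
        ex, ey = x - target[0], y - target[1]   # position error
        # Backward pass: walk the chain from the last joint to the first,
        # accumulating d(position)/d(theta_j) as a running sum.
        dx = dy = 0.0
        grads = [0.0] * len(thetas)
        for j in reversed(range(len(thetas))):
            dx += -link_len * math.sin(phis[j])
            dy += link_len * math.cos(phis[j])
            grads[j] = ex * dx + ey * dy        # d(0.5*|error|^2)/d(theta_j)
        thetas = [t - lr * g for t, g in zip(thetas, grads)]
    return thetas


if __name__ == "__main__":
    target = (2.0, 2.5)
    angles = infer_joint_angles(target, [0.1] * 5)
    print("reached:", forward_model(angles)[0])
```

The forward model stays fixed during inference; only the inputs (the joint angles) are updated, which is what distinguishes this use of gradients from ordinary network training.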

Page 9

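Deep Deterministic Policy Gradient (DDPG)

[Diagram: actor-critic loop between the ACTOR NETWORK (Policy Function), the CRITIC NETWORK (State-Value Function), and the ENVIRONMENT. The ENVIRONMENT supplies the STATE, the ACTOR NETWORK outputs the ACTION, and the CRITIC NETWORK produces the TD (temporal-difference) signal used for learning. In DDPG [2] the critic estimates the action-value function Q(state, action) rather than a state-value function.]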
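A rough sketch of a single DDPG update in the spirit of [2], reduced to a scalar state and action so it runs without any deep learning library. The tiny hand-written `actor` and `critic` replace the deep networks, and every name and hyperparameter value here is an assumption for illustration. The critic is an action-value estimate Q(s, a), as in [2].

```python
GAMMA = 0.99   # discount factor (assumed value)
TAU = 0.01     # soft target-update rate (assumed value)


def actor(k, s):
    """Deterministic policy: action = mu(s) = k[0] * s."""
    return k[0] * s


def critic_features(s, a):
    """Quadratic features of the state-action pair."""
    return [s * s, s * a, a * a]


def critic(w, s, a):
    """Action-value estimate Q(s, a), linear in the quadratic features."""
    return sum(wi * fi for wi, fi in zip(w, critic_features(s, a)))


def soft_update(target, online, tau=TAU):
    """target <- tau * online + (1 - tau) * target, element-wise."""
    return [tau * o + (1.0 - tau) * t for t, o in zip(target, online)]


def ddpg_step(k, w, k_targ, w_targ, transition, lr_actor=1e-3, lr_critic=1e-2):
    """One update from a single transition (s, a, r, s_next)."""
    s, a, r, s_next = transition
    # 1) TD target computed with the target actor and target critic.
    y = r + GAMMA * critic(w_targ, s_next, actor(k_targ, s_next))
    # 2) Critic: one gradient step on 0.5 * (Q(s, a) - y)^2.
    td_error = critic(w, s, a) - y
    w = [wi - lr_critic * td_error * fi
         for wi, fi in zip(w, critic_features(s, a))]
    # 3) Actor: deterministic policy gradient, dQ/da at a = mu(s) times dmu/dk.
    a_pi = actor(k, s)
    dq_da = w[1] * s + 2.0 * w[2] * a_pi
    k = [k[0] + lr_actor * dq_da * s]   # gradient ascent on Q(s, mu(s))
    # 4) Target parameters slowly track the online parameters.
    return k, w, soft_update(k_targ, k), soft_update(w_targ, w), td_error


if __name__ == "__main__":
    params = ddpg_step([0.5], [0.0, 0.0, 0.0], [0.5], [0.0, 0.0, 0.0],
                       (1.0, 0.5, 1.0, 1.0))
    print(params)
```

The full algorithm in [2] also samples minibatches of transitions from a replay buffer and adds exploration noise to the actor's output; both are left out here to keep the update itself visible.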

Page 10

Deep Deterministic Policy Gradient (DDPG)

https://youtu.be/tJBIqkC1wWM

Page 11

Rewarding

[Figure: arm with Joint 0, Joint 1, and End Effector; the distance Dist(t) is marked.]

Reward 1 = Gaussian(Dist(t))

https://www.sfu.ca/sonic-studio/handbook/Graphics/Gaussian.gif

Reward 2 = Dist(t-1) - Dist(t)
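The two reward definitions could be written as follows. This is a sketch under stated assumptions: `Dist(t)` is taken to be the Euclidean distance between end effector and target at step t, the Gaussian is unnormalized so that it peaks at 1, and the width `sigma` is a made-up value. None of these details are given on the slide.

```python
import math


def distance(end_effector, target):
    """Euclidean distance between two 2D points (assumed meaning of Dist)."""
    return math.hypot(end_effector[0] - target[0], end_effector[1] - target[1])


def gaussian_reward(dist, sigma=20.0):
    """Reward 1: Gaussian of the current distance.

    Equals 1.0 at the target and decays toward 0 as the distance grows.
    sigma is an assumed width, in the same units as dist.
    """
    return math.exp(-(dist * dist) / (2.0 * sigma * sigma))


def progress_reward(prev_dist, dist):
    """Reward 2: Dist(t-1) - Dist(t), positive when the arm moved closer."""
    return prev_dist - dist


if __name__ == "__main__":
    target = (80.0, 290.0)
    before = distance((100.0, 250.0), target)
    after = distance((90.0, 270.0), target)
    print("Reward 1:", gaussian_reward(after))
    print("Reward 2:", progress_reward(before, after))
```

Reward 1 depends only on where the end effector is now, while Reward 2 depends on the change between consecutive steps, so it is zero whenever the arm stands still, even at the target.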

Page 13

[Figure: 2-DOF manipulator, actor and critic maps during learning]

Page 14

Pros:

• Operate over continuous action spaces

• Algorithm can learn policies end-to-end

• Model-Free

Cons:

• No proof that learning converges

• No guarantee on the quality of the results

• Requires a large number of training episodes to find solutions

Page 15

References

• [1] Sebastian Otte, Adrian Zwiener, and Martin V. Butz, Inherently Constraint-Aware Control of Many-Joint Robot Arms with Inverse Recurrent Models

• [2] Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra, Continuous Control with Deep Reinforcement Learning

• [3] Deep Reinforcement Learning and Control, Spring 2017, CMU 10703