Value Iteration Networks: Aviv Tamar, Yi Wu, Garrett Thomas, Sergey Levine and Pieter Abbeel @ UC Berkeley, NIPS 2016. 2017/03/09

Paper introduction: Value Iteration Networks (teamLab study session)

Mar 19, 2017


Ryo Yamamoto
Transcript

Value Iteration Networks

Aviv Tamar, Yi Wu, Garrett Thomas, Sergey Levine and Pieter Abbeel @ UC Berkeley

NIPS 2016

2017/03/09

2

Q https://www.slideshare.net/yamaryox/20160421-70945023

3

Q Q

Q

Q

4

Q reactive

5

CNN

6

Value Iteration Networks

7

s x, y s (s) a r

( (s), a )

8

(s)

max max ...

Value Iteration Value Iteration
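As a reminder of what plain value iteration computes, here is a minimal sketch on a small deterministic grid world. The grid layout, the reward values, the discount factor and all function names are assumptions made for illustration; none of them come from the slides or the paper.

```python
# Tabular value iteration on a deterministic grid world (illustrative sketch).
# All constants below are hypothetical choices, not values from the slides.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
GAMMA = 0.9          # discount factor
STEP_REWARD = -1.0   # cost of every ordinary move
GOAL_REWARD = 10.0   # reward for the move that enters the goal


def step(grid, state, action):
    """Return the next state; moves off the grid or into an obstacle do nothing."""
    r, c = state
    nr, nc = r + action[0], c + action[1]
    inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
    if not inside or grid[nr][nc] == 1:
        return state
    return (nr, nc)


def q_value(grid, values, state, action, goal):
    """One-step lookahead: reward plus discounted value of the next state."""
    nxt = step(grid, state, action)
    reward = GOAL_REWARD if nxt == goal else STEP_REWARD
    return reward + GAMMA * values[nxt[0]][nxt[1]]


def value_iteration(grid, goal, iterations=50):
    """grid[r][c] == 1 marks an obstacle. Returns a table of state values."""
    rows, cols = len(grid), len(grid[0])
    values = [[0.0] * cols for _ in range(rows)]
    for _ in range(iterations):
        new_values = [[0.0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                if grid[r][c] == 1 or (r, c) == goal:
                    continue  # obstacles and the terminal goal keep value 0
                new_values[r][c] = max(
                    q_value(grid, values, (r, c), a, goal) for a in ACTIONS
                )
        values = new_values
    return values


def greedy_path(grid, values, start, goal, max_steps=100):
    """Follow the action with the highest one-step lookahead value."""
    path = [start]
    state = start
    while state != goal and len(path) <= max_steps:
        best = max(ACTIONS, key=lambda a: q_value(grid, values, state, a, goal))
        state = step(grid, state, best)
        path.append(state)
    return path
```

Usage: `values = value_iteration(grid, goal=(0, 2))` followed by `greedy_path(grid, values, (2, 0), (0, 2))` plans a route by repeatedly taking the max over actions, which is the step the slides abbreviate as "max".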

Value Iteration Networks

10


CNN CNN

Value Iteration Networks

(s) CNN

Conv

max & softmax

13
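The "Conv" then "max" structure on this slide can be sketched as follows: one 3x3 convolution per action channel produces a Q map, and a max over the action channels gives the next value map, repeated K times. This is a simplified sketch under stated assumptions, not the authors' implementation: the kernels are hand-set to deterministic moves (a VIN learns them), the reward map is added directly instead of passing through a learned reward kernel, the final softmax is applied straight to the Q values at one cell, and the discount factor and all names are hypothetical.

```python
import math

GAMMA = 0.9  # discount factor (illustrative value)
MOVES = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]  # stay, up, down, left, right


def shift_kernel(dr, dc):
    """3x3 kernel that copies the value of the cell reached by a move, times GAMMA."""
    kernel = [[0.0] * 3 for _ in range(3)]
    kernel[1 + dr][1 + dc] = GAMMA
    return kernel


# One kernel per action channel. Hand-set here; learned in an actual VIN.
VALUE_KERNELS = [shift_kernel(dr, dc) for dr, dc in MOVES]


def conv3x3(image, kernel):
    """Zero-padded 3x3 cross-correlation, the usual CNN 'convolution'."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for kr in range(3):
                for kc in range(3):
                    rr, cc = r + kr - 1, c + kc - 1
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += kernel[kr][kc] * image[rr][cc]
            out[r][c] = total
    return out


def vi_module(reward, k):
    """Run k rounds of conv (Q per action channel) followed by max over channels."""
    rows, cols = len(reward), len(reward[0])
    value = [[0.0] * cols for _ in range(rows)]
    q = [[row[:] for row in value] for _ in MOVES]
    for _ in range(k):
        q = []
        for kernel in VALUE_KERNELS:
            shifted = conv3x3(value, kernel)
            # Adding the reward map equals a center-only kernel on the reward channel.
            q.append([[reward[r][c] + shifted[r][c] for c in range(cols)]
                      for r in range(rows)])
        value = [[max(channel[r][c] for channel in q) for c in range(cols)]
                 for r in range(rows)]
    return value, q


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def action_probs(q, state):
    """Softmax over the Q values of every action channel at one grid cell."""
    r, c = state
    return softmax([channel[r][c] for channel in q])
```

Because every step is a convolution, a max, or a softmax, the whole planner is differentiable, which is what lets it be trained with back-propagation like any other CNN.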

Value Iteration Networks

CNN Back-Propagation

14

15

Grid-World

Mars Rover Navigation

Continuous Control

Grid-World: 8x8, 16x16, 28x28; 3x3; CNN conv1=3x3x150, conv2=3x3x1; 10, 20, 36; 5000; 7

CNN FCN(NN)

17

Grid-World

18

Grid-World VIN

19

Grid-World

20

Grid-World

21

Mars Rover Navigation: 128x128; 108; CNN; 16x16; Conv(5x5x6), MaxPool(4x4), Conv(3x3x12), MaxPool(2x2), Conv(3x3x150), Conv(3x3x1); 10,000; 7
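The slide lists a 128x128 input, a 16x16 size, and a stack of Conv and MaxPool layers. The sketch below traces tensor shapes through that stack to check that the two sizes are consistent. It assumes "same"-padded convolutions, pooling with stride equal to the window size, and a single input channel; none of these are stated on the slide, and the function name is hypothetical.

```python
# Layer stack as listed on the slide: (kind, window size, output channels).
LAYERS = [
    ("conv", 5, 6),
    ("maxpool", 4, None),
    ("conv", 3, 12),
    ("maxpool", 2, None),
    ("conv", 3, 150),
    ("conv", 3, 1),
]


def trace_shapes(height, width, channels, layers):
    """Return the (height, width, channels) shape after each layer.

    Assumes 'same' padding for convolutions (spatial size unchanged) and
    non-overlapping pooling (stride equal to the window size).
    """
    shapes = [(height, width, channels)]
    for kind, size, out_channels in layers:
        if kind == "conv":
            channels = out_channels
        elif kind == "maxpool":
            height //= size
            width //= size
        else:
            raise ValueError(f"unknown layer kind: {kind}")
        shapes.append((height, width, channels))
    return shapes


if __name__ == "__main__":
    # The input channel count of 1 is an assumption for illustration.
    for shape in trace_shapes(128, 128, 1, LAYERS):
        print(shape)
```

Under these assumptions the two pooling layers shrink the image by 4 and then by 2, so a 128x128 input ends as a 16x16 single-channel map, matching the 16x16 figure on the slide.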

22

Mars Rover Navigation

23

Mars Rover Navigation

VIN 84.8%

CNN 90.3%

VIN

24

Continuous Control: (x, y, vx, vy); 16x16; 3x3; NN; CNN Conv1(3x3x150), Conv2(3x3x1); 20040

25

Continuous Control

(s) CNN

Conv

max(5x5) & x 3

26

Continuous Control CNN

27

Continuous Control

28

29

Value Iteration Networks End-to-End

CNN

30

Web

VIN

31