Page 1: Towards Chainer v1.5

Towards Chainer v1.5

10/14 Chainer meetup @ PFI/PFN

Seiya Tokui (Preferred Networks)

Page 2: Towards Chainer v1.5

Development history

• 6/12: v1.0
  – Basics of Variable/Function, FunctionSet & Optimizer, CUDA support
• 7/7: v1.1
  – Caffe reference model, type checking (forward/backward), Py3 support
• 8/19: v1.2
  – Many functions added, collect_parameters deprecated, type checking on backward removed
• 9/2: v1.3
  – CuPy, functions module reorganized

Page 3: Towards Chainer v1.5

CuPy

• CUDA array implementation with a NumPy-subset API
• Custom elementwise and reduction kernels are still supported (with broadcasting)
• No dependency on PyCUDA or scikits.cuda
  – Cf. the sudden renaming of scikits.cuda to scikit-cuda
• NumPy API coverage is still incomplete
• Most operations are not yet supported at the Function/Variable level

Page 4: Towards Chainer v1.5

Development history

• 6/12: v1.0
  – Basics of Variable/Function, FunctionSet & Optimizer, CUDA support
• 7/7: v1.1
  – Caffe reference model, type checking (forward/backward), Py3 support
• 8/19: v1.2
  – Many functions added, collect_parameters deprecated, type checking on backward removed
• 9/2: v1.3
  – CuPy, functions module reorganized
• 10/28: v1.4 (planned, delayed)
  – Some new functions?

Page 5: Towards Chainer v1.5

The cause of the delay

• New model structure (#363)
• I've been working on this since the release of v1.3
• It is unexpectedly difficult to get the design right
  – Still in the design phase
  – I'm planning to release this feature in v1.5

Page 6: Towards Chainer v1.5

Objective

• Replacement of FunctionSet/Optimizer
• Goals:
  – Provide a solid way of sharing and reusing (sub)network definitions
  – Avoid the "to_cpu/to_gpu trap" between FunctionSet and Optimizer
  – Portable save/load
  – Make all functions pure, for more flexibility and reusability

Page 7: Towards Chainer v1.5

Solution (current idea)

• Hierarchy of network definitions
• Example:
  – An autoencoder uses an encoder network and a decoder network
  – Each of these networks might be an MLP, a ConvNet, etc.
  – An MLP consists of several fully-connected layers
  – Each fully-connected layer defines a simple operation on the input variable
• Call each component a chain
• Modeling in Chainer becomes linking several chains into one big chain

Page 8: Towards Chainer v1.5

Terminology

• Link
  – A minimal component of a chain (e.g. Linear, Convolution2D, etc.)
  – A "parameterized function" in previous versions
  – It combines parameter variables with input variables to compute the output variables
• Chain, ChainList
  – A composition of child chains (including links)
  – Chain manages its child chains in a dictionary, while ChainList manages them in a list (see the sketch below)
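As a rough illustration of these terms, here is a minimal sketch in the style of the proposed API. The module paths (chainer.links as L, chainer.functions as F) and the exact constructor signatures are my assumptions about the planned design, not text from the slides.

```python
import numpy as np
import chainer
import chainer.functions as F
import chainer.links as L   # assumed home of Linear, Convolution2D, etc.

# A link is a "parameterized function": it combines its own parameter
# variables (W, b) with the input variable to compute the output variable.
layer = L.Linear(3, 2)
x = chainer.Variable(np.zeros((1, 3), dtype=np.float32))
y = layer(x)                 # y is a Variable; layer.W and layer.b are parameters

# A Chain composes child links/chains and manages them by name.
class Pair(chainer.Chain):
    def __init__(self):
        super(Pair, self).__init__(first=L.Linear(3, 4), second=L.Linear(4, 2))

    def __call__(self, x):
        return self.second(F.relu(self.first(x)))

# A ChainList composes child links/chains and manages them by position,
# which is convenient when the number of layers is itself a parameter.
class Stack(chainer.ChainList):
    def __call__(self, x):
        for link in self:
            x = link(x)
        return x

model = Stack(L.Linear(3, 4), L.Linear(4, 2))
```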

Page 9: Towards Chainer v1.5

Schematic of Link/Chain

Example of a classifier with a multi-layer perceptron

[Diagram: a Classifier chain wraps an MLP chain as its predictor; the MLP consists of three Linear links (layer1, layer2, layer3). The input variables x and t flow through the predictor and a loss function to produce loss. The legend distinguishes Link, Chain, and Function boxes.]
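A sketch of how the classifier in this diagram might be written under the proposed design. Only the structure (a Classifier chain holding an MLP chain named predictor, which holds three Linear links) comes from the slide; the layer sizes, the choice of loss function, and the module paths are illustrative assumptions.

```python
import chainer
import chainer.functions as F
import chainer.links as L

class MLP(chainer.Chain):
    def __init__(self):
        # layer1..layer3 are Linear links, as in the diagram; sizes are made up
        super(MLP, self).__init__(
            layer1=L.Linear(784, 100),
            layer2=L.Linear(100, 100),
            layer3=L.Linear(100, 10),
        )

    def __call__(self, x):
        h = F.relu(self.layer1(x))
        h = F.relu(self.layer2(h))
        return self.layer3(h)

class Classifier(chainer.Chain):
    def __init__(self, predictor):
        super(Classifier, self).__init__(predictor=predictor)

    def __call__(self, x, t):
        # x and t enter the chain; the loss is an ordinary Function application
        y = self.predictor(x)
        return F.softmax_cross_entropy(y, t)

model = Classifier(MLP())
```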

Page 10: Towards Chainer v1.5

Schematic of Link/Chain

Example of a Variational AutoEncoder

[Diagram: the VariationalAutoEncoder chain wraps an encoder chain (MLP) and a decoder chain (MLP?), each built from Linear links. The input x goes through the encoder to produce the latent z and the KL term kld; z goes through the decoder to produce the reconstruction term nll; kld and nll are added to form loss.]
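A corresponding sketch of the VAE diagram. The slide only fixes the structure (encoder and decoder chains combined by a parent chain whose loss is kld + nll); the Gaussian encoder head, the particular loss functions, and all sizes are my assumptions. Any chain that maps z back to x-sized logits can serve as the decoder; a concrete one is sketched after Page 16.

```python
import chainer
import chainer.functions as F
import chainer.links as L

class GaussianEncoder(chainer.Chain):
    """Hypothetical MLP encoder producing the parameters of q(z|x)."""
    def __init__(self, n_in, n_hidden, n_latent):
        super(GaussianEncoder, self).__init__(
            hidden=L.Linear(n_in, n_hidden),
            mu=L.Linear(n_hidden, n_latent),
            ln_var=L.Linear(n_hidden, n_latent),
        )

    def __call__(self, x):
        h = F.tanh(self.hidden(x))
        return self.mu(h), self.ln_var(h)

class VariationalAutoEncoder(chainer.Chain):
    def __init__(self, encoder, decoder):
        super(VariationalAutoEncoder, self).__init__(
            encoder=encoder, decoder=decoder)

    def __call__(self, x):
        mu, ln_var = self.encoder(x)
        kld = F.gaussian_kl_divergence(mu, ln_var)  # KL term from the encoder side
        z = F.gaussian(mu, ln_var)                  # sampled latent variable z
        nll = F.bernoulli_nll(x, self.decoder(z))   # reconstruction term (decoder outputs logits)
        return kld + nll                            # loss = kld + nll, as in the diagram
```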

Page 11: Towards Chainer v1.5

Define by Run

• Note that these diagrams do not mean the computational graph must be fixed at the definition of the chains
  – The graph is dynamically constructed during the forward computation (define-by-run)
• A chain might implement multiple methods that construct different graphs (see the sketch below)
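A minimal sketch of that last point: one chain whose two methods build different graphs at run time. The class and method names are made up for illustration.

```python
import chainer
import chainer.functions as F
import chainer.links as L

class Regressor(chainer.Chain):
    def __init__(self):
        super(Regressor, self).__init__(fc=L.Linear(4, 1))

    def __call__(self, x, t):
        # Training-time call: the graph (prediction + loss) is built here,
        # during the forward computation, not when the chain was defined.
        return F.mean_squared_error(self.fc(x), t)

    def predict(self, x):
        # Test-time call: a different, smaller graph with no loss node.
        return self.fc(x)
```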

Page 12: Towards Chainer v1.5

Example (gist: https://goo.gl/JKQgSy)

Page 13: Towards Chainer v1.5

Example (gist: https://goo.gl/JKQgSy)

Page 14: Towards Chainer v1.5

Example (gist: https://goo.gl/JKQgSy)

User can freely design the predictor chain.
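Since the gist code is not reproduced in this transcript, here is an illustrative continuation of the Classifier sketch above (same imports): the predictor slot can hold any chain, for example a small ConvNet instead of the MLP. All names and sizes are made up, and the input is assumed to be a 1×28×28 image batch.

```python
class ConvNet(chainer.Chain):
    def __init__(self):
        super(ConvNet, self).__init__(
            conv1=L.Convolution2D(1, 16, 5),
            conv2=L.Convolution2D(16, 32, 5),
            fc=L.Linear(512, 10),
        )

    def __call__(self, x):
        h = F.max_pooling_2d(F.relu(self.conv1(x)), 2)
        h = F.max_pooling_2d(F.relu(self.conv2(h)), 2)
        return self.fc(h)

model = Classifier(ConvNet())   # same Classifier chain, different predictor
```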

Page 15: Towards Chainer v1.5

Example (gist: https://goo.gl/JKQgSy)

Page 16: Towards Chainer v1.5

Example (gist: https://goo.gl/JKQgSy)

User can freely design the encoder/decoder chains.
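Likewise, continuing the VAE sketch above: the encoder and decoder are ordinary chains, so each can be designed and swapped independently. The decoder below is a made-up example that maps z back to logits over the input dimensions.

```python
class BernoulliDecoder(chainer.Chain):
    def __init__(self, n_latent, n_hidden, n_out):
        super(BernoulliDecoder, self).__init__(
            hidden=L.Linear(n_latent, n_hidden),
            out=L.Linear(n_hidden, n_out),
        )

    def __call__(self, z):
        return self.out(F.tanh(self.hidden(z)))

vae = VariationalAutoEncoder(
    encoder=GaussianEncoder(784, 500, 20),
    decoder=BernoulliDecoder(20, 500, 784),
)
```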

Page 17: Towards Chainer v1.5

Planned features of Link/Chain/ChainList

• The hierarchy is directly mapped to the HDF5 format on serialization
  – Only the parameters and auxiliary variables (computed during learning) are saved
• Helper methods to traverse the hierarchy (see the sketch below)
  – Iterate over all subchains in the hierarchy
  – Iterate over all parameter variables in the hierarchy
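A usage sketch of these planned features, continuing the earlier Classifier/MLP sketch. The helper names (serializers.save_hdf5/load_hdf5, links(), namedparams()) are my guesses at the eventual API and may differ in the actual release.

```python
from chainer import serializers

model = Classifier(MLP())            # chains from the earlier sketch

# The chain hierarchy maps directly to HDF5 groups, e.g. /predictor/layer1/W;
# only parameters and learned auxiliary values are stored.
serializers.save_hdf5('model.h5', model)
serializers.load_hdf5('model.h5', model)

# Traversal helpers over the hierarchy.
for link in model.links():                 # every sub-chain/link, recursively
    print(type(link).__name__)
for name, param in model.namedparams():   # every parameter variable
    print(name, param.data.shape)
```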

Page 18: Towards Chainer v1.5

New Optimizer

• Optimizer is also updated
• Optimizer will be aware of its target chain
  – It tracks the migration of the target chain between CPUs and GPUs
• Optimizer is also serializable (in HDF5 format; see the sketch below)
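A sketch of how the reworked Optimizer might be used with a chain; again, the exact method names are assumptions about the planned interface, not confirmed by the slides.

```python
import numpy as np
import chainer.functions as F
from chainer import optimizers, serializers

model = MLP()                        # chain from the earlier sketch
optimizer = optimizers.Adam()
optimizer.setup(model)               # the optimizer targets the chain directly
                                     # (and can follow it between CPU and GPU)

x = np.random.rand(8, 784).astype(np.float32)
t = np.random.randint(0, 10, size=8).astype(np.int32)

model.zerograds()
loss = F.softmax_cross_entropy(model(x), t)
loss.backward()
optimizer.update()

# The optimizer state itself (e.g. Adam moments) is serializable to HDF5.
serializers.save_hdf5('optimizer.h5', optimizer)
```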

Page 19: Towards Chainer v1.5

Parallel work: introduction of Cython

• CuPy drawback: the CPU-side manipulation is slow
• There is no single huge bottleneck: the causes of the slowdown are scattered
• The easiest point to fix: ctypes
  – ctypes is very slow
  – Even just querying the current device consumes non-negligible running time
  – @okuta san is working on replacing it with Cython
• Major impact on the Chainer package
  – The low-level interface will change
  – setup.py is drastically updated (a Cython extension requires Cython to build, yet the package must remain installable in environments where Cython is not installed yet; see the sketch below)
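For reference, a common pattern for this situation (not necessarily what Chainer will actually ship): compile the .pyx sources when Cython is available and fall back to pre-generated C files otherwise. The module and file names below are placeholders.

```python
from setuptools import setup, Extension

try:
    from Cython.Build import cythonize
    # Cython is available: generate C from the .pyx source at build time.
    ext_modules = cythonize(
        [Extension('mypkg.cuda_runtime', ['mypkg/cuda_runtime.pyx'])])
except ImportError:
    # Cython is not installed: build from the shipped, pre-generated C source.
    ext_modules = [Extension('mypkg.cuda_runtime', ['mypkg/cuda_runtime.c'])]

setup(name='mypkg', ext_modules=ext_modules)
```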

Page 20: Towards Chainer v1.5

Future work

• Lazy computation
  – See the VAE example: it computes all intermediate variables in the __call__ operator, while a user might want only some of them
  – Chainer currently computes eagerly, which causes unneeded computations
  – Avoiding unneeded computations is one of the easiest graph optimizations
  – More generally, I believe the future lies in a fusion of the symbolic and dynamic paradigms
• Symbolic optimization of computations on Variables (loop fusion, etc.)
• Variable tags (or annotations)
  – Cf. Blocks
• Learning process abstraction, data loading abstraction, etc.