Andy Feng & Jun Shi, Yahoo! Inc.

Distributed Deep Learning on Hadoop Clusters

Jan 06, 2017


Hadoop Summit
Transcript
Page 1: Distributed Deep Learning on Hadoop Clusters

Distributed Deep Learning on Hadoop Clusters

Andy Feng & Jun Shi, Yahoo! Inc.

Page 2

Our Talks @ Hadoop Summit

› Storm on YARN (2013): http://bit.ly/1W02tZy

› Spark on YARN (2014): http://bit.ly/1W03dxE

› Machine Learning on Hadoop/Spark (2015): http://bit.ly/1NW3GvO

Page 3

Agenda

• Why Deep Learning on Hadoop?

• CaffeOnSpark
  – Architecture
  – API: Scala + Python

• Demo: CaffeOnSpark + Python Notebook

Page 4

Deep Learning


Page 5

Use Case: Flickr Magic View (flickr.com/cameraroll)

Page 6

Yahoo Use Case: Yahoo Weather

› Beauty: computationally assessed
› Relevance: location, time, weather condition (cloudy, shower, …)

Page 7

Yahoo Vision Kit: Demo

Page 8

Flickr DL/ML Pipeline *

(1) Prepare datasets @ scale (10 billion photos; 7.5 million per day)
(2) Deep learning @ scale
(3) Non-deep learning @ scale
(4) Apply ML model @ scale

* http://bit.ly/1KIDfof by Pierre Garrigues, Deep Learning Summit 2015

Page 9

Deep Learning vs. Hadoop


Page 10

Machine Learning & Deep Learning on Hadoop

Page 11

Hadoop Cluster Enhanced

› GPU servers added: 4 Tesla K80 cards
  • 2 GK210 GPUs, 24 GB memory per card

› Network interface enhanced
  • InfiniBand for direct access to GPU memory
  • Ethernet for external communication
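As a quick tally of the per-server GPU capacity these specs imply (assuming all four K80 cards sit in a single server, which the slide does not state explicitly):

```python
cards_per_server = 4    # Tesla K80 cards (from the slide)
gpus_per_card = 2       # each K80 carries 2 GK210 GPUs
mem_per_card_gb = 24    # 24 GB of GPU memory per card

gpus_per_server = cards_per_server * gpus_per_card          # 8 GPUs
gpu_mem_per_server_gb = cards_per_server * mem_per_card_gb  # 96 GB
print(gpus_per_server, gpu_mem_per_server_gb)
```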

Page 12

Deep Learning Frameworks

Caffe
› Available since Sept. 2013; 6.3k forks
› Popular in the vision community & at Yahoo

TensorFlow
› Released in Nov. 2015; 9.8k forks

Theano, Torch, DL4J, etc.

Page 13

CaffeOnSpark Open Sourced
github.com/yahoo/CaffeOnSpark

• Released in Feb. 2016
• Apache 2.0 license
• Distributed deep learning
  – GPU or CPU
  – Ethernet or InfiniBand
• Easily deployed on public cloud or private cloud

Page 14

CaffeOnSpark: Scalable Architecture


Page 15

CaffeOnSpark: 19x Speedup (est.)

[Chart: Top-5 validation error vs. training latency (hours)]

Page 16

CaffeOnSpark: Deployment Options


• Single node
  – spark-submit --master local

• Multiple nodes
  – spark-submit --master URL -connection ethernet
    (e.g., EC2)
  – spark-submit --master URL -connection infiniband
    (e.g., Yahoo Hadoop cluster)

Page 17

CaffeOnSpark: DL Made Easy

Spark CLI:

  spark-submit --num-executors #_Processes \
    --class com.yahoo.ml.CaffeOnSpark caffe-on-spark.jar \
    -devices #_gpus_per_proc \
    -conf solver_config_file \
    -model model_file \
    -train | -test | -feature

Caffe configuration:

  layer {
    name: "data"
    type: "MemoryData"
    source_class: "com.yahoo.ml.caffe.LMDB"
    memory_data_param {
      source: "hdfs:///mnist/trainingdata/"
      batch_size: 64
      channels: 1
      height: 28
      width: 28
    }
    ...
  }
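When launching CaffeOnSpark jobs from a driver script rather than by hand, the CLI shown on this slide can be assembled programmatically. A minimal Python sketch (the function name and the solver/model file names are hypothetical placeholders, not from the slide):

```python
import shlex

def caffe_on_spark_cmd(num_executors, devices, solver, model, mode="-train"):
    """Assemble the spark-submit command line shown on the slide.

    num_executors/devices map to the #_Processes and #_gpus_per_proc
    placeholders; solver and model are caller-supplied file paths.
    """
    return [
        "spark-submit",
        "--num-executors", str(num_executors),
        "--class", "com.yahoo.ml.CaffeOnSpark",
        "caffe-on-spark.jar",
        "-devices", str(devices),
        "-conf", solver,
        "-model", model,
        mode,
    ]

# Example: 4 executor processes, 1 GPU each, training mode.
cmd = caffe_on_spark_cmd(4, 1, "lenet_solver.prototxt", "hdfs:///mnist/lenet.model")
print(shlex.join(cmd))
```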

Page 18

CaffeOnSpark: One Program (Scala) http://bit.ly/21ZY1c2


cos = new CaffeOnSpark(ctx)
conf = new Config(ctx, args).init()

// (1) train DL model  [deep learning]
dl_train_source = DataSource.getSource(conf, true)
cos.train(dl_train_source)

// (2) extract features via DL  [deep learning]
lr_raw_source = DataSource.getSource(conf, false)
ext_df = cos.features(lr_raw_source)

// (3) apply ML  [non-deep learning]
lr_input_df = ext_df.withColumn("L", cos.floats2doubleUDF(ext_df(conf.label)))
  .withColumn("F", cos.floats2doublesUDF(ext_df(conf.features(0))))
lr = new LogisticRegression().setLabelCol("L").setFeaturesCol("F")
lr_model = lr.fit(lr_input_df)

Page 19

CaffeOnSpark: One Notebook (Python) http://bit.ly/1REZ0cN


Page 20

CaffeOnSpark: UI & Logs

Page 21

Demo: CaffeOnSpark on EC2

https://github.com/yahoo/CaffeOnSpark/wiki
› Get started on EC2
› Python for CaffeOnSpark

Page 22

CaffeOnSpark: What’s Next?

› Validation within training
› Enhanced data layer
› RNN and LSTM
› Java API
› Asynchronous distributed training

Page 23

Related Work: SparkNet & DL4J

REPEAT:
1) [driver] sc.broadcast(model) to executors
2) [executor] apply DL training against a mini-batch of the dataset to update models locally
3) [driver] aggregate(models) to produce a new model
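The broadcast/train/aggregate loop above amounts to synchronous model averaging. A toy illustration in plain Python, with no Spark involved (the quadratic "loss", the learning rate, and the two simulated executors are invented for the sketch):

```python
def local_step(w, grad, lr=0.1):
    # One SGD step on a single executor's mini-batch gradient.
    return w - lr * grad

def train_round(w, minibatch_grads):
    # (1) driver "broadcasts" w; (2) each executor updates its local
    # copy; (3) driver aggregates by averaging the local models.
    local_models = [local_step(w, g) for g in minibatch_grads]
    return sum(local_models) / len(local_models)

# Minimize f(w) = w^2 (gradient 2w) with two executors that happen to
# see identical mini-batches; w shrinks toward 0 over the rounds.
w = 10.0
for _ in range(50):
    w = train_round(w, [2 * w, 2 * w])
print(w)
```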

Page 24

Summary


Yahoo Hadoop clusters enhanced for deep learning
› GPU nodes + CPU nodes
› InfiniBand network for fast communication

CaffeOnSpark open sourced
› Empowers Flickr and other Yahoo services
  • In production since Q3 2015
  • Reduced training latency, improved accuracy
› Scalable deep learning made easy
  • spark-submit on your Spark cluster

Page 25

Thank You!

Repo: github.com/yahoo/CaffeOnSpark
Email: [email protected]