Distributed Deep Learning on Hadoop Clusters
Andy Feng & Jun Shi, Yahoo! Inc.
Our Talks @ Hadoop Summit
• Storm on YARN (2013): http://bit.ly/1W02tZy
• Spark on YARN (2014): http://bit.ly/1W03dxE
• Machine Learning on Hadoop/Spark (2015): http://bit.ly/1NW3GvO
Agenda
• Why Deep Learning on Hadoop?
• CaffeOnSpark
  – Architecture
  – API: Scala + Python
• Demo: CaffeOnSpark + Python Notebook
Deep Learning
Yahoo Use Case: Yahoo Weather
The Yahoo Weather app selects background photos whose beauty is computationally assessed and that are relevant to the location, time, and weather condition (cloudy, shower, …).
Yahoo Vision Kit: Demo
Flickr DL/ML Pipeline
(1) Prepare datasets @ scale: 10 billion photos, 7.5 million new per day
(2) Deep learning @ scale
(3) Non-deep learning @ scale
(4) Apply ML model @ scale
* http://bit.ly/1KIDfof by Pierre Garrigues, Deep Learning Summit 2015
Deep Learning vs. Hadoop
Machine Learning & Deep Learning on Hadoop
Hadoop Cluster Enhanced
• GPU servers added: 4 Tesla K80 cards
  – each card: 2 GK210 GPUs, 24 GB memory
• Network interface enhanced
  – InfiniBand for direct access to GPU memory
  – Ethernet for external communication
Deep Learning Frameworks
• Caffe: available since September 2013 (6.3k forks); popular in the vision community and at Yahoo
• TensorFlow: released in November 2015 (9.8k forks)
• Theano, Torch, DL4J, etc.
CaffeOnSpark Open Sourced
github.com/yahoo/CaffeOnSpark
• Released in February 2016 under the Apache 2.0 license
• Distributed deep learning
  – GPU or CPU
  – Ethernet or InfiniBand
• Easily deployed on public or private cloud
CaffeOnSpark: Scalable Architecture
CaffeOnSpark: 19x Speedup (est.)
[Chart: top-5 validation error vs. training latency (hours)]
CaffeOnSpark: Deployment Options
• Single node
  – spark-submit --master local
• Multiple nodes over Ethernet (e.g., EC2)
  – spark-submit --master URL --connection ethernet
• Multiple nodes over InfiniBand (e.g., Yahoo Hadoop clusters)
  – spark-submit --master URL --connection infiniband
Spark CLI:
spark-submit --num-executors #_Processes \
    --class com.yahoo.ml.CaffeOnSpark caffe-on-spark.jar \
    -devices #_gpus_per_proc \
    -conf solver_config_file \
    -model model_file \
    -train | -test | -feature
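As a concrete illustration of this template (the executor count, file name, and HDFS path below are hypothetical, not from the talk), training with 4 processes and 1 GPU per process might look like:

spark-submit --num-executors 4 \
    --class com.yahoo.ml.CaffeOnSpark caffe-on-spark.jar \
    -devices 1 \
    -conf lenet_solver.prototxt \
    -model hdfs:///mnist/lenet.model \
    -train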
Caffe Configuration
layer {
  name: "data"
  type: "MemoryData"
  source_class: "com.yahoo.ml.caffe.LMDB"
  memory_data_param {
    source: "hdfs:///mnist/trainingdata/"
    batch_size: 64
    channels: 1
    height: 28
    width: 28
  }
  …
}
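The source_class field points Caffe's MemoryData layer at CaffeOnSpark's LMDB reader, so training data is streamed straight from HDFS. This layer block lives in the network definition; the solver file passed via -conf references that network. A minimal sketch of such a solver in standard Caffe solver syntax (the file name and values here are illustrative assumptions, not from the talk):

net: "lenet_train_test.prototxt"
base_lr: 0.01
momentum: 0.9
max_iter: 10000
solver_mode: GPU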
CaffeOnSpark: DL Made Easy
CaffeOnSpark: One Program (Scala) http://bit.ly/21ZY1c2
val cos = new CaffeOnSpark(ctx)
val conf = new Config(ctx, args).init()

// (1) train DL model
val dl_train_source = DataSource.getSource(conf, true)
cos.train(dl_train_source)

// (2) extract features via DL
val lr_raw_source = DataSource.getSource(conf, false)
val ext_df = cos.features(lr_raw_source)

// (3) apply non-deep ML (Spark MLlib logistic regression)
val lr_input_df = ext_df
  .withColumn("L", cos.floats2doubleUDF(ext_df(conf.label)))
  .withColumn("F", cos.floats2doublesUDF(ext_df(conf.features(0))))
val lr = new LogisticRegression().setLabelCol("L").setFeaturesCol("F")
val lr_model = lr.fit(lr_input_df)
Steps (1) and (2) are the deep learning portion; step (3) is non-deep learning, running in the same program on the same cluster.
CaffeOnSpark: One Notebook (Python) http://bit.ly/1REZ0cN
CaffeOnSpark: UI & Logs
Demo: CaffeOnSpark on EC2
https://github.com/yahoo/CaffeOnSpark/wiki
• Get started on EC2
• Python for CaffeOnSpark
CaffeOnSpark: What’s Next?
• Validation within training
• Enhanced data layer
• RNN and LSTM
• Java API
• Asynchronous distributed training
Related Work: SparkNet & DL4J
Repeat:
1) [driver] sc.broadcast(model) to executors
2) [executor] apply DL training against a mini-batch of the dataset to update the model locally
3) [driver] aggregate(models) to produce a new model
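A minimal Scala sketch of that loop (the Model type and the trainLocally/average helpers are illustrative stubs, not SparkNet or DL4J APIs):

object SyncSGDSketch {
  import org.apache.spark.SparkContext
  import org.apache.spark.rdd.RDD

  // Toy model: a flat weight vector.
  type Model = Array[Double]

  // Stand-in for a framework's real mini-batch SGD step.
  def trainLocally(model: Model, batch: Iterator[Array[Double]]): Model =
    model // stub: a real implementation would return updated weights

  // Average the per-executor models element-wise.
  def average(models: Array[Model]): Model =
    models.transpose.map(weights => weights.sum / models.length)

  def distributedTrain(sc: SparkContext, data: RDD[Array[Double]],
                       init: Model, iterations: Int): Model = {
    var model = init
    for (_ <- 1 to iterations) {                    // REPEAT
      val bcast = sc.broadcast(model)               // (1) driver broadcasts model
      val locals = data                             // (2) executors train locally
        .mapPartitions(batch => Iterator(trainLocally(bcast.value, batch)))
        .collect()
      model = average(locals)                       // (3) driver aggregates
    }
    model
  }
}

Note that the driver is the synchronization point in every iteration of this scheme; CaffeOnSpark's -connection option instead lets executors communicate directly over Ethernet or InfiniBand.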
Summary
• Yahoo Hadoop clusters enhanced for deep learning
  – GPU nodes + CPU nodes
  – InfiniBand network for fast communication
• CaffeOnSpark open sourced
  – Empowers Flickr and other Yahoo services
    • In production since Q3 2015
    • Reduced training latency and improved accuracy
  – Scalable deep learning made easy
    • spark-submit on your Spark cluster
Thank You!
Repo: github.com/yahoo/CaffeOnSpark
Email: caffeonspark-users@googlegroups.com