From the Lab to the Factory: Building a Production Machine Learning Infrastructure. Josh Wills, Senior Director of Data Science, Cloudera
Josh Wills, MLconf 2013

May 06, 2015

Josh Wills, Senior Director of Data Science, Cloudera: Building a Production Machine Learning Infrastructure (Quickly)
Transcript
Page 1: Josh Wills, MLconf 2013

From the Lab to the Factory: Building a Production Machine Learning Infrastructure
Josh Wills, Senior Director of Data Science, Cloudera

Page 2: Josh Wills, MLconf 2013

About Me

Page 3: Josh Wills, MLconf 2013

Data Science: Another Definition

Page 4: Josh Wills, MLconf 2013

Data Scientists Build Data Products.

Page 5: Josh Wills, MLconf 2013

All* Products Become Data Products

Page 6: Josh Wills, MLconf 2013

Identifying the Bottlenecks

Page 7: Josh Wills, MLconf 2013

Oryx: Model Building and Serving

• Algorithms
  • ALS Recommenders
  • K-Means Parallel
  • RDF

• Batch model building via MapReduce

• Server for real-time scoring and updates

• PMML 4.1 Models
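
The split between batch model building and a real-time scoring server can be illustrated with a small sketch. It assumes the batch job has already produced ALS-style user and item factor vectors, and shows only the serving-side step: scoring items for one user by dot product and returning the top results. The names `recommend`, `user_factors`, and `item_factors` are hypothetical and are not Oryx's API; the factor values are made up for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length factor vectors."""
    return sum(x * y for x, y in zip(a, b))


def recommend(
    user_factors: Dict[str, Vector],
    item_factors: Dict[str, Vector],
    user_id: str,
    n: int = 2,
    exclude: FrozenSet[str] = frozenset(),
) -> List[Tuple[str, float]]:
    """Return the n highest-scoring items for user_id.

    Assumption: both factor tables come from an offline (batch) ALS run;
    this function only does the cheap per-request scoring.
    """
    u = user_factors[user_id]
    scored = [
        (item, dot(u, v))
        for item, v in item_factors.items()
        if item not in exclude
    ]
    # Highest score first; ties broken by item id so results are stable.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:n]


# Made-up factors standing in for the output of a batch build.
user_factors = {"alice": [1.0, 0.0]}
item_factors = {"a": [0.5, 0.1], "b": [0.9, 0.3], "c": [0.1, 0.9]}

if __name__ == "__main__":
    print(recommend(user_factors, item_factors, "alice"))
```

Keeping the per-request work to a handful of dot products is what makes it plausible to serve a model in real time while the expensive factorization runs as a batch job.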


Page 8: Josh Wills, MLconf 2013

Gertrude: Evaluation via Experiments

• Multivariate Testing
  • Define and explore a space of parameters

• Overlapping Experiments
  • Tang et al. (2010)
  • Runs multiple independent experiments on every request
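
One way to picture running multiple independent experiments on every request is the layered scheme described by Tang et al. (2010): parameters are partitioned into layers, and each request is hashed separately per layer so that assignments in different layers are independent. The sketch below is a simplified illustration under that assumption, not Gertrude's implementation; the layer names, parameter names, bucket ranges, and the `parameters_for` helper are all hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple

NUM_BUCKETS = 1000

# Default parameter values served when a request is in no experiment.
DEFAULTS: Dict[str, object] = {"model": "als_v1", "results_per_page": 10}

# Hypothetical layers. Each layer owns a disjoint set of parameters and
# lists (bucket range, overrides) pairs; a request falls into at most one
# experiment per layer.
LAYERS: Dict[str, List[Tuple[range, Dict[str, object]]]] = {
    "ranking": [(range(0, 100), {"model": "als_v2"})],
    "ui": [(range(0, 500), {"results_per_page": 20})],
}


def bucket(request_id: str, layer: str) -> int:
    """Map a request to a bucket, salted by layer name.

    Salting by layer makes the assignment in one layer independent of
    the assignment in another.
    """
    digest = hashlib.sha256(f"{layer}:{request_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_BUCKETS


def parameters_for(request_id: str) -> Dict[str, object]:
    """Resolve the parameter values for one request across all layers."""
    params = dict(DEFAULTS)
    for layer, experiments in LAYERS.items():
        b = bucket(request_id, layer)
        for buckets, overrides in experiments:
            if b in buckets:
                params.update(overrides)
                break  # at most one experiment per layer
    return params


if __name__ == "__main__":
    for rid in ("req-1", "req-2", "req-3"):
        print(rid, parameters_for(rid))
```

Because the hash is deterministic, the same request identifier always resolves to the same parameter values, which keeps the experience consistent and lets logged outcomes be attributed to the right experiment.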


Page 9: Josh Wills, MLconf 2013

Planning for the Future

Page 10: Josh Wills, MLconf 2013

Josh Wills, Director of Data Science, Cloudera
@josh_wills

 

Thank you!