Deep Learning is Like Water: It's Everywhere! Arno Candel, PhD CTO, H2O.ai @ArnoCandel AI By The Bay, San Francisco March 6, 2017

Arno Candel AIByTheBay 030617

Mar 19, 2017

Transcript
Page 1: Arno Candel AIByTheBay 030617

Deep Learning is Like Water: It's Everywhere!

Arno Candel, PhD
CTO, H2O.ai
@ArnoCandel

AI By The Bay, San Francisco

March 6, 2017

Page 2: Arno Candel AIByTheBay 030617

Meet the H2O Makers

Page 3: Arno Candel AIByTheBay 030617


Software Product: H2O - AI for Business Transformation
• Distributed Data Frame with Scalable Execution Engine
• Distributed Algorithms: Deep Learning, GBM, RF, GLM, K-Means, PCA, …
• Apache v2 Open Source (github.com/h2oai)

H2O is Easy to Use and Deploy
• h2o.ai/download and run anywhere, immediately
• Client APIs: R, Python, Java, Scala, REST, Flow GUI
• Spark (cf. Sparkling Water), Hadoop, Bare Metal
• Productionize with auto-generated Java/C++ scoring code

H2O.ai - Makers of H2O, Sparkling Water, Deep Water, …

Page 4: Arno Candel AIByTheBay 030617

https://www.cbinsights.com

H2O.ai - At the Core of AI

Page 5: Arno Candel AIByTheBay 030617

H2O.ai - Loved By The Best

Page 6: Arno Candel AIByTheBay 030617

H2O.ai - Visionary in 2017 Gartner MQ for Data Science

Page 7: Arno Candel AIByTheBay 030617

H2O Deep Water got (Tech)Crunch’ed

Page 8: Arno Candel AIByTheBay 030617

H2O.ai - Highlighted by Fortune Magazine

http://fortune.com/2017/02/23/artificial-intelligence-companies/

Page 9: Arno Candel AIByTheBay 030617

Powerful, Scalable Techniques for Deep Learning and AI

brand new: Dec 2016

H2O Book - Written by the Community

Page 10: Arno Candel AIByTheBay 030617

H2O.ai Customer Love

http://www.h2o.ai/customers/

Page 11: Arno Candel AIByTheBay 030617

http://www.h2o.ai/customers/

H2O.ai Customer Love

Page 12: Arno Candel AIByTheBay 030617


High Level Architecture of H2O

HDFS

S3

NFS

Distributed In-Memory

Parallel Parser

Lossless Compression

H2O Compute Engine

Production Scoring Environment

Exploratory & Descriptive Analysis

Feature Engineering & Selection

Supervised & Unsupervised Modeling

Model Evaluation & Selection

Predict

Data & Model Storage

Model Export: Standalone Scoring Code

C/C++/Java, R/Py/etc.

Data Prep Export: Plain Old Java Object

Local

SQL

LDAP Kerberos SSL HTTPS

HTTP
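
The "Standalone Scoring Code" box on this slide refers to turning a trained model into source code that can score rows without the training cluster. The slides do not show how H2O generates that code, so the following is only a minimal sketch of the general idea, written in Python rather than the Java/C++ the slides mention. The toy tree, the feature names, and the helpers `emit_tree` and `emit_scorer` are all hypothetical and are not part of H2O.

```python
# Sketch only: generate dependency-free scoring source from a toy decision tree.
# The tree, its thresholds, and all function names are made up for illustration.

def emit_tree(node, indent="    "):
    """Recursively turn a tree dict into nested if/else source lines."""
    if "value" in node:  # leaf: return the predicted label
        return f"{indent}return {node['value']!r}\n"
    src = f"{indent}if row[{node['feature']!r}] < {node['threshold']!r}:\n"
    src += emit_tree(node["left"], indent + "    ")
    src += f"{indent}else:\n"
    src += emit_tree(node["right"], indent + "    ")
    return src


def emit_scorer(tree, name="score"):
    """Wrap the generated branches in a standalone function definition."""
    return f"def {name}(row):\n" + emit_tree(tree)


toy_tree = {
    "feature": "petal_len", "threshold": 2.5,
    "left": {"value": "setosa"},
    "right": {
        "feature": "petal_wid", "threshold": 1.8,
        "left": {"value": "versicolor"},
        "right": {"value": "virginica"},
    },
}

source = emit_scorer(toy_tree)   # plain text that could be saved to a file
namespace = {}
exec(source, namespace)          # load the generated code with no model object
score = namespace["score"]

print(source)
print(score({"petal_len": 1.4, "petal_wid": 0.2}))
```

The generated text has no reference to the tree dictionary or to any library, which is the property that makes exported scoring code easy to drop into a production service.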

Page 13: Arno Candel AIByTheBay 030617

Native APIs: Java, Scala; REST APIs: R, Python, Flow, JavaScript, Java


library(h2o)
h2o.init()
h2o.deeplearning(x=1:4, y=5, as.h2o(iris))

import h2o
from h2o.estimators.deeplearning import H2ODeepLearningEstimator
h2o.init()
dl = H2ODeepLearningEstimator()
dl.train(x=list(range(1,4)), y="Species", training_frame=iris.hex)

import _root_.hex.deeplearning.DeepLearning
import _root_.hex.deeplearning.DeepLearningParameters
val dlParams = new DeepLearningParameters()
dlParams._train = iris.hex
dlParams._response_column = 'Species
val dl = new DeepLearning(dlParams)
val dlModel = dl.trainModel.get

All heavy lifting is done by the backend!

Built-in interactive GUI and notebook - no coding needed!

Page 14: Arno Candel AIByTheBay 030617

Live Demo of Distributed Deep Learning in H2O

Airline dataset: 116M flights in the U.S. over 20 years

10x in-memory compression vs CSV

All cluster CPU cores are busy
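
To give a feel for why a typed, in-memory column can be much smaller than its CSV text, here is a minimal sketch. It is not H2O's actual compression scheme, which the slides do not describe; it only shows one generic lossless trick (store small integers as an offset from the column minimum in the narrowest fixed-width type). The helper names and the synthetic year column are assumptions made for illustration.

```python
# Sketch only: offset-encode an integer column and compare with its CSV size.
# This is a generic technique, not a description of H2O's internal format.
from array import array


def csv_bytes(values):
    """Bytes needed to write the values as text, one separator per value."""
    return sum(len(str(v)) + 1 for v in values)


def pack_column(values):
    """Store each value as (value - min) in the narrowest unsigned type that fits."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span < 2 ** 8:
        code = "B"   # 1 byte per value
    elif span < 2 ** 16:
        code = "H"   # 2 bytes per value
    else:
        code = "q"   # 8 bytes per value
    return lo, array(code, (v - lo for v in values))


def unpack_column(lo, packed):
    """Exact inverse of pack_column, so the encoding is lossless."""
    return [lo + v for v in packed]


# Synthetic four-digit year column (made-up data, 100,000 rows)
years = [1987 + (i % 22) for i in range(100_000)]

lo, packed = pack_column(years)
assert unpack_column(lo, packed) == years  # nothing was lost

text_size = csv_bytes(years)
packed_size = len(packed) * packed.itemsize
print(text_size, packed_size, text_size / packed_size)
```

On this synthetic column the packed form needs one byte per value instead of five bytes of text. Real ratios depend on the data and on the scheme used.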

Page 15: Arno Candel AIByTheBay 030617

Brand-new: H2O XGBoost Integration (Gradient Boosting)

Why XGBoost?
• Competitive accuracy and speed (great for Kaggle)
• GPU support (for small/medium data)
• Efficient on sparse data

Why integrate into H2O?
• Ease of use (Flow GUI, R/Py APIs)
• Real-time model status (varimp, metrics)
• Efficient data preprocessing (sparse, categorical)
• Integration into H2O ecosystem (modeling, deployment, support)

Page 16: Arno Candel AIByTheBay 030617

Live Demo of GPU Gradient Boosting in H2O

Kaggle dataset: BNP Paribas Cardif Claims Management

Page 17: Arno Candel AIByTheBay 030617

Deep Water = THE Deep Learning Platform

H2O integrates the top open-source DL tools

Native GPU support is up to 100x faster than

Enterprise Ready
Easy to train and deploy, interactive, scalable, etc.
Flow, R, Python, Spark/Scala, Java, REST, POJO, Steam

New Big Data Use Cases (previously impossible or difficult in H2O)

Image - social media, manufacturing, healthcare, …
Video - UX/UI, security, automotive, social media, …
Sound - automotive, security, call centers, healthcare, …
Text - NLP, sentiment, security, finance, fraud, …
Time Series - security, IoT, finance, e-commerce, …

Deep Water: Best Open-Source Deep Learning

Enterprise Deep Learning for Business Transformation

Page 18: Arno Candel AIByTheBay 030617

Deep Water Brings State-Of-The-Art Deep Learning on GPUs to H2O

H2O Deep Learning: simple multi-layer networks, CPUs
• Limited to business analytics, statistical models (CSV data)
• 1-5 layers
• MBs/GBs of data

H2O Deep Water: arbitrary networks, CPUs or GPUs
• Large networks for big data (e.g. image 1000x1000x3 -> 3M inputs per observation)
• 1-1000 layers
• GBs/TBs of data
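
The 1000x1000x3 image figure on this slide is simple arithmetic, sketched below. The second half goes one step further than the slide and is an assumption: it supposes a hypothetical fully connected first hidden layer of 1,000 units stored as 32-bit floats, to suggest why such inputs call for different network designs and hardware.

```python
# Inputs per observation for an RGB image, as quoted on the slide.
def input_count(height, width, channels):
    """Number of raw input values for one image."""
    return height * width * channels


n_inputs = input_count(1000, 1000, 3)
print(n_inputs)  # 3000000

# Assumption (not from the slides): a dense first layer with 1,000 hidden units.
hidden_units = 1000
weights = n_inputs * hidden_units      # one weight per input/unit pair
weight_gb = weights * 4 / 1e9          # 4 bytes per float32 weight
print(weights, weight_gb)              # 3000000000 12.0
```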

Page 19: Arno Candel AIByTheBay 030617

Open-Source - Leverage Community Code, Data and Models

Best Image Classifier as of Aug 2016: Google + Microsoft Hybrid Architecture

https://research.googleblog.com/2016/08/improving-inception-and-image.html

open-source implementation

H2O takes DL model definition as input

Page 20: Arno Candel AIByTheBay 030617

Build your own models with Deep Water Today!
https://github.com/h2oai/h2o-3/blob/master/h2o-py/tests/testdir_algos/deepwater/pyunit_inception_resnet_v2_deepwater.py

Live Demo of Deep Water for Image Classification on GPUs

Page 21: Arno Candel AIByTheBay 030617

Yesterday: Small Data (<GB)
Today: Big Data (TeraBytes, ExaBytes)

Data + Skills are good for business

Data + Machine Learning ARE the business

Things are Changing Quickly

Page 22: Arno Candel AIByTheBay 030617

Challenges With AI and Deep Learning

Page 23: Arno Candel AIByTheBay 030617

CEO: “We will transform our business with AI”
Management: “Hire someone to give us AI”
Senior Data Scientist: “I should look into AI”
Junior Data Scientist: “I use TensorFlow all the time”
High School Kid: “I did my internship on Deep Learning”
Average Joe: “I want a self-driving car (and keep my job)”

Stanford Professors: “focus on interpretability, start with simple models!”

The Hype and Reality of AI

Page 24: Arno Candel AIByTheBay 030617

H2O.ai Stanford Advisors


Sri/CEO, Boyd, Hastie, Tibshirani

in a conference room not so far away

me

Page 25: Arno Candel AIByTheBay 030617

Which Open-Source AI Platform to Use?

Page 26: Arno Candel AIByTheBay 030617

Which Programming Language To Use?

Which one for Development vs Production?

Page 27: Arno Candel AIByTheBay 030617

Which Hardware To Use?

Which one for Development vs Production?

Analog/Neuromorphic

Page 28: Arno Candel AIByTheBay 030617

Who Does the Work and on What Infrastructure?

Which one for Development vs Production?

Cloud? Which? On Premise?

Data Lake? Micro-Services?

Page 29: Arno Candel AIByTheBay 030617

Which one for Development vs Production?

When is the Model Good Enough?

Crowdsourcing? Trust a Genius? Internal Bake-Off?

Page 30: Arno Candel AIByTheBay 030617

What problem are you solving in the first place?

What problem should/could you be solving instead?

What can you learn from the model?

How can you improve the models? More, better data?

How can you characterize the model?

Do you need AI, Deep Learning or just a simple model?

Back to the Drawing Board!

Page 31: Arno Candel AIByTheBay 030617

Gradient Boosting Machine

Generalized Linear Modeling

Deep Learning

Distributed Random Forest

Do you need AI, Deep Learning or just a Simple Model?

Page 32: Arno Candel AIByTheBay 030617

Future Of AI: Or What’s Left for Humans to Do?

Charlie Chaplin - Modern Times 1936