Driverless AI - Introduction and a Look Under the Hood + Hands-on Lab - Arno Candel, CTO, H2O.ai

Jan 28, 2018

Sri Ambati
Transcript
Page 1: Driverless AI - Introduction and a Look Under the Hood + Hands-on Lab - Arno Candel, CTO, H2O.ai
Page 2:

Driverless AI: Introduction and a Look Under the Hood

+ Hands-On Lab

Arno Candel, CTO @arnocandel

Page 3:

Team H2O!

Page 4:
Page 5:

Shortage of Data Scientists

Page 6:

Mistake Correction

Automation needed to avoid human error

Page 7:
Page 8:

Hours for Driverless AI — Weeks for grandmasters

single run, fully automated: 6h on 3 GPUs

Driverless AI: 18th place in private LB (out of 2926)

Driverless AI: top 1% in BNP Paribas Kaggle competition

Page 9:

Driverless AI: top 5% in Amazon Kaggle competition

Driverless AI: 80th place in private LB (out of 1687, top 5%)

With a little bit of stacking: 20th place (top 1.5%)
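The "top N%" labels on these slides follow directly from place divided by number of teams. A minimal sketch of that arithmetic, using only the leaderboard figures quoted above (the function name and row labels are illustrative, not from the talk):

```python
def percentile_rank(place: int, teams: int) -> float:
    """Return a leaderboard position as a percentage of all teams."""
    return 100.0 * place / teams


# (label, place, teams, "top X%" bound claimed on the slide)
results = [
    ("BNP Paribas", 18, 2926, 1.0),
    ("Amazon", 80, 1687, 5.0),
    ("Amazon, with stacking", 20, 1687, 1.5),
]

for label, place, teams, bound in results:
    pct = percentile_rank(place, teams)
    # Each computed percentage should fall within the bound quoted on the slide.
    print(f"{label}: {place}/{teams} = top {pct:.2f}% (slide: top {bound:g}%)")
```

All three computed percentages fall inside the bounds stated on the slides.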

Driverless AI produces feature engineering pipeline (“more columns”) for downstream use
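To make "more columns" concrete, here is a toy sketch of a feature engineering step that widens a dataset for a downstream model. It is not Driverless AI's actual pipeline or API: the function `add_engineered_columns`, the two transformations (frequency encoding and a pairwise product), and the sample data are all assumptions chosen for illustration.

```python
from collections import Counter


def add_engineered_columns(rows, cat_col, num_cols):
    """Return copies of the rows with two extra columns appended.

    Hypothetical transformations, for illustration only:
    - frequency encoding of one categorical column
    - product of two numeric columns
    """
    counts = Counter(row[cat_col] for row in rows)
    a, b = num_cols
    widened = []
    for row in rows:
        new_row = dict(row)  # copy, so the input rows are left unchanged
        new_row[f"{cat_col}_freq"] = counts[row[cat_col]] / len(rows)
        new_row[f"{a}_x_{b}"] = row[a] * row[b]
        widened.append(new_row)
    return widened


rows = [
    {"city": "SF", "age": 30, "income": 5.0},
    {"city": "SF", "age": 40, "income": 6.0},
    {"city": "NY", "age": 50, "income": 7.0},
]
wide = add_engineered_columns(rows, "city", ("age", "income"))
print(sorted(wide[0].keys()))
```

The widened rows keep every original column, so any downstream model can train on them in place of the raw data.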

https://www.youtube.com/watch?v=qtUNyJlAID0&t=11s

https://github.com/kaz-Anova/Competitive_Dai

Page 10:

Automatic Visualization

Scalable outlier detection (no sampling)

Contains novel statistical algorithms to only show “relevant” aspects of the data

(soon: automated data cleaning)
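The slide does not describe the outlier detection algorithm itself. As a stand-in, the sketch below uses the textbook Tukey-fence rule (values beyond 1.5 times the interquartile range), applied to every value rather than to a sample. The function name, the multiplier `k`, and the data are assumptions; this is not the method used by the product.

```python
import statistics


def tukey_outliers(values, k=1.5):
    """Return the values lying outside the Tukey fences [Q1 - k*IQR, Q3 + k*IQR].

    Every value is checked (no sampling). This is the standard textbook rule,
    used here only as an illustration.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
outliers = tukey_outliers(data)
print(outliers)
```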

Page 11:

Machine Learning Interpretation

Gain confidence in models before deploying them!

Page 12:

MOJO: Pure Java Production Deployment
• feature engineering and model scoring logic
• auto-generated human-readable representation
• minimal platform-independent storage format
• scoring backend can be in any language (C/Java/C#/Go/etc.)

Page 13:
Page 14:

Feature Now Q1 2018 Q2 2018 Q3 2018

AutoDL Feature Engineering Recipe

Supervised Structured Data, CSV, Text

Overfitting and Leakage Prevention

Machine Learning Interpretation

Automatic Visualization

GUI

Python client API

Python scoring API

HTTP Thrift Scoring API

Multi-GPU (shared data)

Scoring MOJO (100% Java or C)

Data connectors: HDFS, SQL

User Management: LDAP, Kerberos

TensorFlow Deep Learning NLP Recipes

Time Series Recipes

Multi-GPU (sharded data) - optimized for DGX Volta

UDR (User-Defined Recipes), Verticals

Multi-Node Multi-GPU - optimized for DGX Volta

Sparkling Water Backend for Driverless AI

Driverless AI Roadmap

Page 15:

docs.h2o.ai

Page 16:

Hands-on Lab