Adatao Live Demo at the First Spark Summit

Post on 11-May-2015

DESCRIPTION

Adatao's Live Product Demo at the First Spark Summit, December 2, 2013, Nikko Hotel, San Francisco

Transcript

Adatao Live Demo at the First Spark Summit Dec 2, 2013, San Francisco (Video at the end of this deck)

Christopher Nguyen, PhD, Co-Founder & CEO

DATA INTELLIGENCE FOR ALL

Hadoop distributed/streaming analytics, Yahoo Hadoop Eng, UIUC PhD

Machine learning & machine vision, US Army Research Lab, Johns Hopkins PhD

Big-Data Compute Engines, Google Apps Engineering Director, Google Founders’ Award, HKUST Prof, 2 successful enterprise exits, Stanford PhD

Deep engineering & business experience from Google, Yahoo et al. PhDs in DM & ML from UIUC, Georgia Tech, Stanford, Berkeley, ...

BIG COMPUTE: Powerful In-Memory Data Mining & Machine Learning, Big Analytics Platform

BIG DATA: Hadoop HDFS, Cassandra, SQL DBMS, Streaming Data

BIG INSIGHTS: Visually Beautiful, Interactive Data Exploration, Narrative Web App

Business Users, Data Scientists, Data Engineers

ONE Integrated Platform for Business & Data Science & Engineering

Architecture Design: One Integrated Platform for Business & Data Science & Engineering (Business Users, Data Scientists, Data Engineers)

VS

OTHERS: stack for business users (Business Users), stack for data science (Data Scientists), stack for data eng (Data Engineers)

for Data Scientists & Engineers

Powerful In-Memory Data Mining & Machine Learning—Model Terabytes in Seconds

Interactive, Cluster-Scale Data Munging & Modeling with Native R, RStudio, Python, SQL, and Java Front-ends

Real-Time Scoring Directly From Trained Models

Share reproducible, live data analysis documents

Hadoop, Cassandra, RDBMS, Streaming Data

Big Data Mining & Machine Learning

for Business Users

A Beautiful New Way to Create & Share Visual Narratives of Your Analysis

- Perform Ad Hoc Queries in Plain English
- Publish Streaming, Interactive Dashboards
- Collaborate With Others In Real Time
- Query Terabytes in Seconds

Predictive Decision Making

Demo Deployment Diagram: CLIENT, MASTER, WORKER nodes

Demo Config

Cluster: 8-node x 8-core x 30GB RAM x 1TB Disk

Data Sets: 12GB-100GB, 100M-1B rows

Airline Arrival Data, 1988-2008 from DoT
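The configuration above implies some simple capacity arithmetic, sketched here in Python. The node counts and sizes are taken from the slide; the function names are hypothetical, and the in-memory check compares raw sizes only. It ignores runtime and object overhead, so it is a rough upper-level sanity check rather than a sizing guarantee.

```python
# Figures from the Demo Config slide.
NODES = 8
CORES_PER_NODE = 8
RAM_GB_PER_NODE = 30
DISK_TB_PER_NODE = 1
LARGEST_DATASET_GB = 100  # upper end of the 12GB-100GB range


def cluster_totals(nodes, cores_per_node, ram_gb_per_node, disk_tb_per_node):
    """Aggregate per-node resources into cluster-wide totals."""
    return {
        "cores": nodes * cores_per_node,
        "ram_gb": nodes * ram_gb_per_node,
        "disk_tb": nodes * disk_tb_per_node,
    }


def fits_in_memory(dataset_gb, total_ram_gb):
    """Raw size comparison only (assumption: no overhead is modeled)."""
    return dataset_gb <= total_ram_gb


totals = cluster_totals(NODES, CORES_PER_NODE, RAM_GB_PER_NODE, DISK_TB_PER_NODE)
largest_fits = fits_in_memory(LARGEST_DATASET_GB, totals["ram_gb"])

print(totals)        # cluster-wide cores, RAM (GB), disk (TB)
print(largest_fits)  # whether the largest data set is below total RAM
```

Under these figures the cluster totals 64 cores, 240GB of RAM, and 8TB of disk, so even the 100GB data set is smaller than aggregate RAM by raw size.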

Algorithms

- LM & supporting statistics (AIC, log-likelihood, R², cross-validation)
- Binning
- Classification metrics: confusion matrix, ROC, AUC, F1
- Logistic Regression with Ref Level for Categorical Vars
- k-Means
- Random Forest
- Naive Bayes
- Linear SVM
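As an illustration of the classification metrics named in the list, here is a minimal pure-Python sketch of a binary confusion matrix and the F1 score derived from it. It is not Adatao's implementation. The labels are made-up toy data, and treating `1` as "flight delayed" is an assumption chosen only to match the airline data set used in the demo.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary 0/1 labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:  # truth == 1 and pred == 0
            fn += 1
    return tp, fp, tn, fn


def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and their harmonic mean (F1); 0.0 when undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels (hypothetical): 1 = delayed, 0 = on time.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(tp, fp, tn, fn)         # 3 1 3 1
print(precision, recall, f1)  # 0.75 0.75 0.75
```

ROC and AUC extend the same idea by recomputing these counts across a sweep of decision thresholds instead of a single fixed one.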

Algorithm Roadmap

- Hierarchical Clustering
- Text Mining (token, POS, LDA, …)
- SVD
- Markov Chain Models
- Ensemble Models
- …

Thank you!

See demo video at http://youtu.be/5UAdk7oHoPE?t=7m
