Jovian Data Amazon Final Version

Jul 02, 2015

JovianDATA presentation at the AWS startup event
Transcript
Page 1: Jovian Data Amazon Final Version

JovianDATA © 2009 Confidential & Proprietary Information

2460 North First Street, Suite 170, San Jose, CA 95131
408-433-9383
www.joviandata.com

Analytics at the Speed of Thought

Satya Ramachandran, Vice President of Engineering

Anupam Singh, Chief Technology Officer

April 14, 2010

Page 2: Jovian Data Amazon Final Version

JovianDATA Mission

Technology platform to optimize your conversion funnel at the lowest cost

Page 3: Jovian Data Amazon Final Version

Why move to the cloud?

Customer: Media Conglomerate
• Problem: Generating 5TB of data per quarter
• Solution: A conventional data warehousing stack
• Impact: Capital expenditure of more than a million dollars; maintaining a terabyte-scale enterprise stack has recurring expenditure

Customer: Agency
• Problem: Getting 2TB of DoubleClick data per quarter per advertiser
• Solution: Sample 5000 users and use SAS for data mining
• Impact: Loss of analytic richness; build tech expertise to maintain a warehouse

Customer: Portal
• Problem: 200 Terabytes of data; large number of physical nodes in a datacenter
• Solution: Use ‘NoSQL’ (Hadoop etc.) to develop an analytics practice in house
• Impact: Deployment and SLA maintenance is impossible in a single, monolithic cluster; NoSQL does not solve issues of application provisioning

Considering AWS actively but not sure about
• Cap Ex benefits
• Current stack’s cloud readiness
• Application provisioning challenges

Page 4: Jovian Data Amazon Final Version

Introducing JovianDATA

• Extremely low TCO
• Billions of Impressions, Clicks & Conversions (100’s of TB)
• No sampling
• Multi-dimensional analytics
• In-Flight
• Fast Time-to-Value SaaS

Data sources: Ad Server Data, Search Engine Data + Sales/Conversion Data + Site/Web Analytics Data + Customer/3rd Party Data + Other Data Sources

Page 5: Jovian Data Amazon Final Version

Transforming Data to Actionable Insights

[Figure: Campaign Heat Map of Engagement (High / Medium / Low) over a Fully Materialized Data Cube; labels: Publishers, Time]

• Incremental updates
• Multi-dimensional indexes
• Multi-dimensional partitions

Page 6: Jovian Data Amazon Final Version

Agenda

JovianDATA Company Overview

JovianInsights – The Power of Analytics

Analytics Lifecycle Management

Innovations in Cloud Infrastructure Management

JovianDATA Cube Storage

Innovations in Advanced Analytics using commodity clusters

Page 7: Jovian Data Amazon Final Version

Avoiding Expensive Data Processing

Reduce Disk I/O by Materializing Expensive Groups
• Usage-based Automatic View Materialization

Avoid Network I/O
• Multi-Dimensional Partitioning

Page 8: Jovian Data Amazon Final Version

Why move to the cloud?

Problem: Capital Expenditure
• Current Solution: Cap Ex takes long approval cycles
• Solution: JovianDATA enables IT department to complement and extend cloud infrastructure on commodity machines
• Impact: Reduce IT cost by having a migration path to a low cost commodity cluster environment

Problem: Over Provisioning
• Current Solution: Resources are provisioned for the peak leading to massive underutilization
• Solution: JovianDATA enables extra load to be handled by dynamically provisioning virtual instances
• Impact: TCO is tightly fitted to usage rather than to peak

Problem: Application Isolation
• Current Solution: Configuration for applications are guessed resulting in expensive re-config cycles while deploying the application in production
• Solution: JovianDATA provides a configuration and deployment framework to isolate applications in their own set of instances
• Impact: Prototyped application can be deployed in production without interrupting other applications

Page 9: Jovian Data Amazon Final Version

Agenda

Reducing Capex

Application Isolation

Dynamic Provisioning

Page 10: Jovian Data Amazon Final Version

Managing CapEx with Role Based Clusters

SINGLE CLUSTER FOR DATA CLEANSING, LOAD AND QUERY

15TB, 100 NODES

Monthly Cost = $28,800

Page 11: Jovian Data Amazon Final Version

Managing Cap-Ex with Role Based Clusters

[Diagram labels: Ad Server Data, Search Engine Data; DATA CLEANSING; LOAD MODEL; HIBERNATE MODEL; QUERY; UI]

• 2 hours daily for load on 10 nodes
• 8 hours daily for query on 5 nodes

Monthly Cost = $2,052
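
As a rough check on the role-based figures, the sketch below compares compute node-hours for the always-on 100-node cluster with the load and query schedule quoted on this slide. The 30-day month and the function name `monthly_node_hours` are assumptions, and the hourly rate is only back-computed from the $28,800 figure, not a quoted price. The sketch does not try to reproduce the $2,052 total, because the transcript does not itemize what that number covers (for example the data cleansing step or storage).

```python
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30  # assumption: the slides do not state the month length


def monthly_node_hours(nodes, hours_per_day, days=DAYS_PER_MONTH):
    """Node-hours consumed in a month by `nodes` running `hours_per_day`."""
    return nodes * hours_per_day * days


# Single cluster: 100 nodes, always on, quoted at $28,800 per month.
single_cluster_hours = monthly_node_hours(100, HOURS_PER_DAY)
implied_rate = 28_800 / single_cluster_hours  # dollars per node-hour (derived)

# Role-based clusters: load on 10 nodes for 2 h/day, query on 5 nodes for 8 h/day.
role_based_hours = monthly_node_hours(10, 2) + monthly_node_hours(5, 8)

print("single cluster node-hours:", single_cluster_hours)
print("role-based node-hours:", role_based_hours)
print("implied $/node-hour:", round(implied_rate, 2))
```

Under these assumptions the role-based schedule uses a small fraction of the always-on node-hours, which is the effect the slide is illustrating.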

Page 12: Jovian Data Amazon Final Version

Agenda

Reducing Capex

Application Isolation

Dynamic Provisioning

Page 13: Jovian Data Amazon Final Version

Selective Replication for on demand perf

• Power analyst needs to perform complex, heavy number-crunching queries that typically take 8 - 10 hours
• Solution
  • FlexRestore™
  • Adds two new temporary nodes (Temp1, Temp2)
  • Creates new replicas for hot partitions and redistributes across nodes

[Diagram: Nodeset1 (Node1, Node2, Node3, Node4) plus temporary nodes Temp1 and Temp2, holding replicas of partitions P1, P3, P12, P22 and P34]

With Replication Factor = 1: Site Section Analytics = 10 minutes

With Replication Factor = 10: Site Section Analytics = 30 seconds
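
The idea of adding temporary nodes and replicating hot partitions onto them can be sketched in a few lines. This is a minimal illustration only, not FlexRestore's actual algorithm, which the slides do not describe. The function name `add_hot_replicas`, the `access_counts` metric, the `top_k` knob, the round-robin placement, and the example layout are all assumptions made for the sketch.

```python
from collections import Counter


def add_hot_replicas(placement, access_counts, temp_nodes, top_k=2):
    """Return a new placement with the hottest partitions copied to temp nodes.

    placement:     dict of node name -> set of partition ids
    access_counts: dict of partition id -> recent query count (hypothetical metric)
    temp_nodes:    names of the temporary nodes to add
    top_k:         how many hot partitions to replicate (assumed knob)
    """
    hot = [p for p, _ in Counter(access_counts).most_common(top_k)]
    # Copy the existing layout so the caller's placement is left untouched.
    new_placement = {node: set(parts) for node, parts in placement.items()}
    for node in temp_nodes:
        new_placement[node] = set()
    # Spread the hot partitions round-robin over the temporary nodes.
    for i, partition in enumerate(hot):
        new_placement[temp_nodes[i % len(temp_nodes)]].add(partition)
    return new_placement


# Illustrative layout; it does not reproduce the slide's diagram.
placement = {
    "Node1": {"P1", "P34"},
    "Node2": {"P22", "P12"},
    "Node3": {"P3"},
    "Node4": {"P1"},
}
access_counts = {"P1": 5, "P3": 40, "P12": 7, "P22": 90, "P34": 12}
result = add_hot_replicas(placement, access_counts, ["Temp1", "Temp2"])
print(sorted(result["Temp1"]), sorted(result["Temp2"]))
```

A real system would also have to copy the partition data and route queries to the new replicas; the sketch only models the placement decision.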

Page 14: Jovian Data Amazon Final Version

Reduce replication to maintain cost

• When the analysis is done and the extra performance is not needed, the SLA Controller brings down the two temporary nodes (and the extra replicas)
• Benefits
  • High performance computing power when you need it
  • But only when you need it to hold down operating costs

[Diagram: Nodeset1 (Node1, Node2, Node3, Node4) keeps partitions P1, P3, P12, P22 and P34; temporary nodes Temp1 and Temp2 and their replicas are removed]
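
The scale-down step can be sketched as removing the temporary nodes from a placement map while checking that no partition is lost. This is an illustration under assumptions, not the SLA Controller's actual logic: the function name `remove_temp_nodes`, the dictionary layout, and the safety check are all hypothetical.

```python
def remove_temp_nodes(placement, temp_nodes):
    """Drop temporary nodes (and the replicas they hold) from a placement.

    placement:  dict of node name -> set of partition ids
    temp_nodes: names of the temporary nodes to bring down
    """
    temp = set(temp_nodes)
    remaining = {n: set(p) for n, p in placement.items() if n not in temp}
    # Safety check: every partition must still live on a permanent node.
    before = set().union(*placement.values()) if placement else set()
    after = set().union(*remaining.values()) if remaining else set()
    lost = before - after
    if lost:
        raise ValueError(f"partitions only on temp nodes: {sorted(lost)}")
    return remaining


# Illustrative layout; it does not reproduce the slide's diagram.
scaled_up = {
    "Node1": {"P1", "P34"},
    "Node2": {"P22", "P12"},
    "Node3": {"P3"},
    "Temp1": {"P22"},
    "Temp2": {"P3"},
}
scaled_down = remove_temp_nodes(scaled_up, ["Temp1", "Temp2"])
print(sorted(scaled_down))
```

Because the temporary nodes hold only extra replicas in this model, removing them changes cost and query parallelism but not the set of partitions available.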

Page 15: Jovian Data Amazon Final Version

Agenda

Reducing Capex

Application Isolation

Dynamic Provisioning

Page 16: Jovian Data Amazon Final Version

FUNNEL ANALYSIS FOR CLIENT

Provision Tera Scale Applications in Minutes

Campaign Manager needs to run heavy duty reports for a Big Advertiser

Without Application Isolation: Data for all advertisers is kept ‘live’ on 50 nodes

50 live nodes per month = $14,400

Page 17: Jovian Data Amazon Final Version

Provision Tera Scale Applications in Minutes

FUNNEL ANALYSIS FOR CLIENT

HIBERNATED MODEL

Campaign Manager requests Application Provisioning for a Specific Advertiser

Application is provisioned in parallel from S3/EBS into EC2

50 nodes for fortnightly analysis = $320
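
The two 50-node figures ($14,400 for always-live nodes and $320 for the hibernated model) can be related with simple arithmetic. The sketch below back-computes an hourly rate from the always-on figure and then the number of active hours the $320 figure would imply. It assumes a 30-day month, the same per-node rate in both cases, and no storage charge for the hibernated data; none of these are stated in the slides.

```python
NODES = 50
HOURS_PER_MONTH = 24 * 30  # assumption: 30-day month

always_on_cost = 14_400  # 50 live nodes per month
hibernated_cost = 320    # 50 nodes for fortnightly analysis

# Dollars per node-hour implied by the always-on figure (derived, not quoted).
rate = always_on_cost / (NODES * HOURS_PER_MONTH)

# Hours per month the 50 nodes could run for $320 at that rate.
active_hours = hibernated_cost / (NODES * rate)

savings_ratio = always_on_cost / hibernated_cost

print("implied $/node-hour:", round(rate, 2))
print("implied active hours per month:", round(active_hours, 1))
print("cost ratio:", round(savings_ratio, 1))
```

Under these assumptions the hibernated figure corresponds to the cluster being live for only a small number of hours each month, which would be consistent with occasional analysis sessions, though the slides do not give the session length.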

Page 18: Jovian Data Amazon Final Version

Summary

Dynamic Provisioning with Selective Replication on EC2

10x Performance on EC2 replication

Reducing CapEx with Role based Temporary Clusters on EC2

10x Cost Savings with EC2 usage

Application Isolation with Application Hibernation on S3/EBS

100x Cost Savings with EC2-S3

Page 19: Jovian Data Amazon Final Version

Thank You