How to build an elastically scalable, multi-tenant, FREE big data service Webinar
Page 1: How to Build a Scalable and Free Big Data Service

How to build an elastically scalable, multi-tenant, FREE big data service

Webinar

Page 2: How to Build a Scalable and Free Big Data Service

Alan Ho (@karlunho)

Shailendra Baxi (@sbaxi)

Rajesh Bhargava (@rbhargava)

Page 3: How to Build a Scalable and Free Big Data Service

youtube.com/apigee

Page 4: How to Build a Scalable and Free Big Data Service

slideshare.com/apigee

Page 5: How to Build a Scalable and Free Big Data Service

Agenda

1. What we built and why

2. Demo

3. Technical Architecture

4. Developer Experience

Page 6: How to Build a Scalable and Free Big Data Service

Apigee Developer


Page 7: How to Build a Scalable and Free Big Data Service

What we built

Free big data service for building context-aware apps

Page 8: How to Build a Scalable and Free Big Data Service

Context Aware Apps are “Behavior Driven”


Page 9: How to Build a Scalable and Free Big Data Service

Developer Alternatives for Machine Learning

Amazon Machine Learning

Page 10: How to Build a Scalable and Free Big Data Service

Insights approach for Apigee Developer

• Accelerated Development

• Descriptive & Predictive

• Behavior Based Algorithms

• E2E Experience

• Free

Page 11: How to Build a Scalable and Free Big Data Service

Architecture

DATA → INSIGHTS

1. Data upload: Structured or Unstructured

2. Scalable: Volume, Variety & Velocity

3. Core IP: Machine Learning, Graph Processing, Unstructured Data

4. Analytics Offerings: Predictive & Journey analytics, segmentation

(Architecture diagram; layers shown:)

• User Interactions: Modeling Work Bench, User Interface, Query Language

• Prediction, Journey, Segmentation

• Computational Algorithms: Machine Learning Library

• Data Pipelines: Unstructured Data Processors, GRASP Processor

• Distributed Processing Foundation: Distributed Data and Job Management

• Apache usergrid

Page 12: How to Build a Scalable and Free Big Data Service

System Components

(Architecture diagram; components shown:)

• Applications: UI, Modeling Workbench; Application Data

• API (HTTP(S); HTTPS, AWS APIs)

• Transactional Datastore

• Ephemeral Hadoop Cluster: Modeling, Scoring, Data Transformation, Aggregation/Reporting

• Management Service

• Software Libraries: GRASP, Unstructured Data, Machine Learning

• Insights Master

• Data Staging Area

• Monitoring service

• Ingestion Datastore

• GRASP Query Service: Query Datastore, Query Server

• Real Time Service (Edge): Real Time Datastore (usergrid), node

• Persistent Datastore (diagram legend: S3, HDFS)

• Metadata Service
  - Runtime Metadata: Job Queue, Job Dependencies, Data Set partitions
  - Metadata Store
  - Static Metadata: DataStore & Dataset, Application, Job
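The runtime metadata on this slide includes a job queue and job dependencies. As a minimal sketch of how dependencies could drive execution order, the example below uses Python's standard `graphlib` module (Python 3.9+). The job names and the `runnable_order` helper are hypothetical, chosen for illustration; the slides do not describe how the Metadata Service actually schedules jobs.

```python
from graphlib import TopologicalSorter


def runnable_order(dependencies):
    """Return job names in an order that respects their dependencies.

    `dependencies` maps each job to the set of jobs it must wait for.
    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical jobs, loosely following the workloads named in the deck.
jobs = {
    "ingest": set(),
    "transform": {"ingest"},
    "model": {"transform"},
    "score": {"model"},
}

order = runnable_order(jobs)
print(order)
```

A scheduler built this way only needs the dependency map from the metadata store; it can hand each job to a cluster once everything before it in the order has finished.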

Page 13: How to Build a Scalable and Free Big Data Service

How does Insights work?

• Ingest Customer Data
  - Batch or browser based
  - Event based or Customer profile

• Aggregate behavior graphs
  - Cross-channel, domain-agnostic customer journey graphs
  - Enriched with Customer profile

• Query capability and machine learning
  - Customer journey visualization
  - Models & Scores

• Data scientist + developer support
  - R interface for predictive modeling on Hadoop
  - Integrated with API Edge (incl BaaS, node.js)
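To make the idea of aggregating events into cross-channel customer journey graphs concrete, here is a minimal sketch. It assumes each event is a `(customer_id, timestamp, channel, action)` tuple and counts transitions between consecutive steps per customer. The event shape and the `build_journey_graph` name are assumptions for illustration, not the GRASP implementation.

```python
from collections import Counter, defaultdict


def build_journey_graph(events):
    """Aggregate events into weighted edges between consecutive steps.

    Each node is "channel:action"; each edge weight is the number of
    times any customer moved directly from one node to the next.
    """
    by_customer = defaultdict(list)
    for customer_id, timestamp, channel, action in events:
        by_customer[customer_id].append((timestamp, f"{channel}:{action}"))

    edges = Counter()
    for steps in by_customer.values():
        steps.sort()  # order each customer's steps by timestamp
        nodes = [node for _, node in steps]
        for src, dst in zip(nodes, nodes[1:]):
            edges[(src, dst)] += 1
    return edges


# Hypothetical events across two channels.
events = [
    ("c1", 1, "web", "view"),
    ("c1", 2, "mobile", "add_to_cart"),
    ("c1", 3, "web", "purchase"),
    ("c2", 1, "web", "view"),
    ("c2", 5, "mobile", "add_to_cart"),
]

graph = build_journey_graph(events)
for (src, dst), count in sorted(graph.items()):
    print(f"{src} -> {dst}: {count}")
```

Because nodes carry no domain-specific fields, the same aggregation works for any event stream that can be reduced to a channel and an action.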

Data Flow

Stores: Customer Data store → Persistent Data store → HDFS on compute cluster → Serving Data store (Customer, usergrid)

1. Data Ingestion (Batch or Browser based)

2. Data Moved to Persistent storage

3. Data brought to the compute cluster for processing

4. Processed Data exported to appropriate location
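The data flow above can be sketched as four small functions, one per stage. This is an illustration only: the in-memory lists and dict stand in for the ingestion store, persistent storage, the cluster's HDFS, and the serving store, and every name here is hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Stores:
    """In-memory stand-ins for the stores named in the data flow."""
    ingestion: list = field(default_factory=list)
    persistent: list = field(default_factory=list)
    cluster: list = field(default_factory=list)
    serving: dict = field(default_factory=dict)


def ingest(stores, records):
    # Stage 1: batch or browser-based ingestion.
    stores.ingestion.extend(records)


def move_to_persistent(stores):
    # Stage 2: move ingested data to persistent storage.
    stores.persistent.extend(stores.ingestion)
    stores.ingestion.clear()


def load_into_cluster(stores):
    # Stage 3: copy data onto the compute cluster for processing.
    stores.cluster = list(stores.persistent)


def process_and_export(stores):
    # Stage 4: process (here, count events per customer) and export.
    counts = Counter(record["customer"] for record in stores.cluster)
    stores.serving.update(counts)
    stores.cluster.clear()  # the cluster copy is temporary


stores = Stores()
ingest(stores, [
    {"customer": "c1", "event": "view"},
    {"customer": "c1", "event": "purchase"},
    {"customer": "c2", "event": "view"},
])
move_to_persistent(stores)
load_into_cluster(stores)
process_and_export(stores)
print(stores.serving)
```

Keeping the durable copy in persistent storage and only a working copy on the cluster is what lets the cluster be discarded after a job completes.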

Page 14: How to Build a Scalable and Free Big Data Service

Data level Multi-tenancy

(Same architecture diagram as the System Components slide, annotated for multi-tenancy.)

• Datasets segregated/sharded by Account ID

• Data keyed by account ID
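Keying data by account ID can be sketched in two parts: a per-account storage prefix, and a stable hash that assigns an account to a shard. The path layout, function names, and the choice of SHA-256 are assumptions for illustration; the slides do not specify how Insights lays out or shards its data.

```python
import hashlib


def dataset_prefix(account_id, dataset):
    """Build a storage prefix that isolates one account's dataset."""
    if "/" in account_id or "/" in dataset:
        raise ValueError("account_id and dataset must not contain '/'")
    return f"accounts/{account_id}/datasets/{dataset}/"


def shard_for(account_id, num_shards):
    """Map an account to a shard using a hash that is stable across runs."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(account_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


prefix = dataset_prefix("acct-42", "clickstream")
shard = shard_for("acct-42", 8)
print(prefix, shard)
```

A cryptographic hash is used instead of Python's built-in `hash()` because the built-in is randomized per process for strings, which would move accounts between shards on every restart.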

Page 15: How to Build a Scalable and Free Big Data Service

Scalability

(Same architecture diagram as the System Components slide, annotated for scalability.)

• Horizontal Scaling

• Elastic/Ephemeral scaling

• Sharding
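Elastic, ephemeral scaling implies sizing the cluster from the current workload and releasing it when there is none. A minimal sketch of such a sizing rule follows. The `desired_nodes` function and its parameters are hypothetical, and a real service would likely also weigh job size and startup cost.

```python
import math


def desired_nodes(pending_jobs, jobs_per_node, max_nodes):
    """Return how many cluster nodes to run for the current job queue.

    Scales to zero when the queue is empty and never exceeds max_nodes.
    """
    if jobs_per_node <= 0:
        raise ValueError("jobs_per_node must be positive")
    if pending_jobs <= 0:
        return 0
    return min(max_nodes, math.ceil(pending_jobs / jobs_per_node))


for pending in (0, 9, 100):
    print(pending, desired_nodes(pending, jobs_per_node=4, max_nodes=10))
```

Returning zero for an empty queue is the part that matters for cost: with no pending work, no compute nodes need to stay up.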

Page 16: How to Build a Scalable and Free Big Data Service

Insights UI & APIs

• HTML5 Single page application

• Interacts with RESTful APIs

• Guide a novice user through the experience and help them understand important predictive / machine learning concepts

• Scalable REST API infrastructure

Page 17: How to Build a Scalable and Free Big Data Service

Insights R SDK


Page 18: How to Build a Scalable and Free Big Data Service

Developer Resources

• E2E Recommendation Tutorial: try it free!

• Sample Datasets

• Blog posts, Embedded Documentation

Page 19: How to Build a Scalable and Free Big Data Service

Try it out: Apigee Developer

https://accounts-beta.apigee.com


Page 20: How to Build a Scalable and Free Big Data Service

Summary

• Be practical when approaching multi-tenancy

• Cost can be drastically reduced with elastic scaling and multi-tenancy

• Developer experience requires continual refinement

• Try out our free service for yourself!