Prediction Systems Dan Crankshaw UCB RISE Lab Seminar 10/3/2015

Prediction Systems - GitHub Pages PredictionIO Ø Open-source Apache Incubating project, the company behind the project was recently acquired by Salesforce Ø Built on Apache Spark,

Mar 08, 2018

Page 1:

Prediction Systems
Dan Crankshaw

UCB RISE Lab Seminar
10/3/2015

Page 2:

[Figure labels: Big Data, Big Model, Training, Learning]

Timescale: minutes to days
Systems: offline and batch optimized
Heavily studied ... major focus of the AMPLab

Page 3:

[Figure labels: Big Data, Big Model, Training, Application, Decision, Query, ?, Learning, Inference]

Page 4:

[Figure labels: Big Data, Training, Learning, Inference, Big Model, Application, Decision, Query]

Timescale: ~20 milliseconds
Systems: online and latency optimized
Less studied …

Page 5:

[Figure labels: Big Data, Big Model, Training, Application, Decision, Query, Learning, Inference, Feedback]

Page 6:

[Figure labels: Big Data, Training, Application, Decision, Learning, Inference, Feedback]

Timescale: hours to weeks
Systems: combination of systems
Less studied …

Page 7:

[Figure labels: Big Data, Big Model, Training, Application, Decision, Query, Learning, Inference, Feedback]

Responsive (~10 ms)
Adaptive (~1 second)

Page 8:

Prediction Serving Challenges
Ø Complexity of deploying new models
    Ø New applications or products (0 → 1 models).
    Ø New data, features, model family (N → N+1 models).
    Ø Why is it hard: Frameworks not designed for low-latency serving, frameworks have different APIs, different resource requirements, and different costs.
Ø System Performance
    Ø Need to ensure low-latency predictions, scalable throughput. Deploying a new model can’t degrade system performance.
Ø Model or Statistical Performance
    Ø Model Selection: Which models to use?
    Ø When to deploy a new model?
    Ø How to adapt to feedback?
    Ø At a meta-level: what are the right metrics for measuring model performance?

Page 9:

LASER: A Scalable Response Prediction Platform for Online Advertising
Agarwal et al. 2014

Page 10:

LASER Overview
Ø Top-down system design enforced by company organizational structure
Ø Picked a model (logistic regression) and built the system based on that choice
Ø Force data scientists to use this model, express features in specialized configuration language
Ø Result: System and model family are tightly coupled

$p_{ijt} = \frac{1}{1 + \exp(-s_{ijt})}$

$s_{ijt} = \omega + s^{1,c}_{ijt} + s^{2,c}_{ijt} + s^{2,\omega}_{ijt}$
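As a rough illustration of the logistic scoring rule on this slide, the sketch below adds an intercept to three component scores and passes the sum through the logistic function. The function name, argument names, and example numbers are assumptions made for illustration; they do not come from LASER.

```python
import math

def predicted_response(omega, s1_c, s2_c, s2_w):
    """Logistic response prediction: p = 1 / (1 + exp(-s)).

    omega is the intercept; s1_c, s2_c and s2_w stand in for the three
    component scores in the slide's formula (hypothetical argument names).
    """
    s = omega + s1_c + s2_c + s2_w
    return 1.0 / (1.0 + math.exp(-s))

# Illustrative numbers only: a negative intercept gives a low baseline rate.
p = predicted_response(omega=-3.0, s1_c=0.4, s2_c=0.1, s2_w=0.5)
```

Because the score enters only through a sum, each component can be computed (or cached) independently and combined at request time.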

Page 11:

Addressing Deployment Complexity
Ø Fixed Model Choice: Can be hardcoded into system, no need for API to specify model
Ø Configuration language: specify feature construction in JSON-based configuration language
    Ø Restricts feature transformations to be built from component library
    Ø Allows for changes in pipeline without service restarts or code modification
    Ø Allows easy re-use of common features across an organization
    Ø Similar to PMML, PFA
Ø Language details
    Ø Source: translate data to numeric feature vectors
    Ø Transformer: Vector-to-vector transformations (transform, aggregate)
    Ø Assembler: Concatenates all feature pipelines together into single vector
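To make the Source / Transformer / Assembler split concrete, here is a minimal sketch of a JSON-configured feature pipeline. The JSON schema, the component names (`user_age`, `ad_bid`, `log1p`, `square`), and the `assemble` function are all hypothetical; LASER's actual configuration language is not shown in these slides.

```python
import json
import math

# Hypothetical JSON config: each pipeline names one source and a chain of transformers.
CONFIG = json.loads("""
{
  "pipelines": [
    {"source": "user_age", "transformers": ["log1p"]},
    {"source": "ad_bid",   "transformers": ["square", "log1p"]}
  ]
}
""")

# Component library (assumed): sources turn a raw record into a numeric vector,
# transformers map a vector to a vector.
SOURCES = {
    "user_age": lambda record: [float(record["age"])],
    "ad_bid": lambda record: [float(record["bid"])],
}
TRANSFORMERS = {
    "log1p": lambda vec: [math.log1p(x) for x in vec],
    "square": lambda vec: [x * x for x in vec],
}

def assemble(config, record):
    """Run every configured pipeline and concatenate the results into one vector."""
    features = []
    for pipeline in config["pipelines"]:
        vec = SOURCES[pipeline["source"]](record)
        for name in pipeline["transformers"]:
            vec = TRANSFORMERS[name](vec)
        features.extend(vec)
    return features

features = assemble(CONFIG, {"age": 0, "bid": 2})
```

Since the pipeline is data rather than code, swapping `CONFIG` changes the features without touching the serving code, which is the property the slide highlights.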

Page 12:

Addressing System Performance
Ø Precompute second-order interaction terms
    Ø The LASER logistic regression model includes second-order interaction terms between user and campaign features:

      $s^{2,c}_{ijt} = x_i' A c_j + \ldots$

Ø Don’t wait for delayed features
    Ø Features can be delayed by slow DB lookup, expensive computation
    Ø Solution: Substitute expected value for missing features and degrade accuracy, not latency
    Ø Solution: Cache precomputed scalar products in PRC, save overhead of re-computing features and dot products which are lazily evaluated
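The "degrade accuracy, not latency" idea can be sketched as follows: when a feature has not arrived by the time the score is needed, its expected value is used in its place. The function and the dictionary layout are assumptions for illustration, with `None` standing in for a lookup that did not finish in time.

```python
def score_with_fallback(fetched, expected, weights):
    """Linear score that never blocks on a missing feature.

    fetched:  feature name -> value, or None if the lookup was too slow
    expected: feature name -> precomputed expected value (assumed available)
    weights:  feature name -> model weight
    """
    total = 0.0
    for name, weight in weights.items():
        value = fetched.get(name)
        if value is None:
            # Feature is delayed: substitute its expected value instead of waiting.
            value = expected[name]
        total += weight * value
    return total

score = score_with_fallback(
    fetched={"a": 1.0, "b": None},
    expected={"a": 0.5, "b": 3.0},
    weights={"a": 2.0, "b": 1.0},
)
```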

Page 13:

Addressing Model Performance

$s_{ijt} = \omega + s^{1,c}_{ijt} + s^{2,c}_{ijt} + s^{2,\omega}_{ijt}$

Cold Start: Trained Offline
Warm Start: Trained Online

Ø Decompose model into slowly-changing and quickly-changing components
    Ø Fast retraining of warm-start (quickly-changing) component of model without cost of full retraining
Ø Explore/Exploit with Thompson Sampling
    Ø Sometimes serve ads with low empirical mean but high variance
    Ø Draw sample from posterior distribution over parameters and use sample to predict CTR instead of mode
    Ø In practice, hold $\Theta_c$ fixed and sample from $\Theta_w$
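A minimal sketch of Thompson sampling in this setting, under a strong simplifying assumption: each ad's cold-start score is a fixed number and its warm-start contribution has a scalar Gaussian posterior. The dictionary keys and function names are hypothetical and do not reflect LASER's actual posterior.

```python
import math
import random

def sampled_ctr(cold_score, warm_mean, warm_std, rng):
    """Draw the warm-start term from its (assumed Gaussian) posterior, then predict CTR."""
    warm = rng.gauss(warm_mean, warm_std)
    return 1.0 / (1.0 + math.exp(-(cold_score + warm)))

def choose_ad(ads, rng):
    """Serve the ad whose sampled CTR is highest (the cold-start part stays fixed)."""
    samples = [sampled_ctr(a["cold"], a["warm_mean"], a["warm_std"], rng) for a in ads]
    return max(range(len(ads)), key=samples.__getitem__)

rng = random.Random(0)
ads = [
    {"cold": 0.0, "warm_mean": 1.0, "warm_std": 0.0},  # well-known ad, high mean
    {"cold": 0.0, "warm_mean": 0.0, "warm_std": 3.0},  # uncertain ad, lower mean
]
picks = sum(choose_ad(ads, rng) for _ in range(1000))  # times the uncertain ad was served
```

The uncertain ad is served some of the time even though its posterior mean is lower, which is the exploration behavior the slide describes.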

Page 14:

Some Takeaways from LASER

Ø System performance is paramount in the broader application context
    Ø Slow page load has much larger impact on revenue than poor ad recommendation
    Ø AUC/accuracy is not always the most useful model performance metric
Ø The more assumptions you can make about your tools (software, models) the more tricks you can play (config language, shared features, warm-start/cold-start decomposition)
    Ø Safe for LASER to make these assumptions because they are enforced through extra-technological methods
    Ø Similar to some of the design choices we saw in Borg last week

Page 15:

Clipper

A Low-Latency Online Prediction Serving System

Daniel Crankshaw, Xin Wang, Giulio Zhou, Michael Franklin, Joseph E. Gonzalez, Ion Stoica

Page 16:

Goals of Clipper
Ø Design Choice: General purpose, easy to use prediction serving system
    Ø Generalize to many different ML applications (contrast to LASER which was designed to address LinkedIn’s ad-targeting needs)
    Ø Generalize to many frameworks/tools for a single application
    Ø Don’t tie the hands of data scientists developing models
    Ø Make it simple for a data scientist to deploy a new model into production
Ø Given these design choices, maximize system and model performance using model-agnostic techniques

Page 17:

Clipper Generalizes Models Across ML Frameworks

[Figure labels: Content Rec., Fraud Detection, Personal Asst., Robotic Control, Machine Translation; Clipper; Create, VW, Caffe]

Page 18:

Clipper Architecture

[Figure labels: Applications; Predict, Observe; RPC/REST Interface; Clipper; VW, Caffe, Create]

Page 19:

Clipper Architecture

[Figure labels: Applications; Predict, Observe; RPC/REST Interface; Clipper; Model Wrapper (MW) and RPC, four of each; Caffe]

Page 20:

Clipper Architecture

[Figure labels: Applications; Predict, Observe; RPC/REST Interface; Clipper; Model Wrapper (MW) and RPC, four of each; Caffe]

Model Selection Layer: Improve accuracy through ensembles, online learning and personalization

Model Abstraction Layer: Provide a common interface to models while bounding latency and maximizing throughput.

Page 21:

Clipper Architecture

[Figure labels: Applications; Predict, Observe; RPC/REST Interface; Clipper; Model Selection Layer (Selection Policy); Model Abstraction Layer (Caching, Adaptive Batching); Model Wrapper (MW) and RPC, four of each; Caffe]

Page 22:

[Figure labels: Model Selection Layer (Selection Policy); Model Abstraction Layer (Caching, Adaptive Batching); Model Wrapper (MW) and RPC, four of each; Caffe]

Provide a common interface to models while

Page 23:

[Figure labels: Correction Layer (Correction Policy); Model Abstraction Layer (Approximate Caching, Adaptive Batching); Model Wrapper (MW) and RPC, four of each; Caffe]

Common Interface → Simplifies Deployment:
Ø Evaluate models using original code & systems
Ø Models run in separate processes (Docker containers)
    Ø Resource isolation
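One way to picture the common interface is a thin wrapper class that every model sits behind, regardless of which framework produced it. The sketch below is an in-process illustration only: the class names and the `predict_batch` method are hypothetical, and the RPC and container boundaries mentioned on the slide are left out.

```python
from abc import ABC, abstractmethod

class ModelWrapper(ABC):
    """Hypothetical common interface: every model answers batched predict calls."""

    @abstractmethod
    def predict_batch(self, inputs):
        """Return one prediction per input, in the same order."""

class PerItemWrapper(ModelWrapper):
    """Adapts a model that only exposes a one-input-at-a-time predict function."""

    def __init__(self, predict_one):
        self._predict_one = predict_one

    def predict_batch(self, inputs):
        return [self._predict_one(x) for x in inputs]

class NativeBatchWrapper(ModelWrapper):
    """Adapts a model whose framework already predicts on whole batches."""

    def __init__(self, predict_many):
        self._predict_many = predict_many

    def predict_batch(self, inputs):
        return list(self._predict_many(inputs))

# Two stand-in "models" with different native call styles behind one interface.
wrappers = {
    "length_model": PerItemWrapper(len),
    "sum_model": NativeBatchWrapper(lambda batch: [sum(x) for x in batch]),
}
results = {name: w.predict_batch([[1, 2], [3]]) for name, w in wrappers.items()}
```

The serving layer only ever calls `predict_batch`, so adding a model from a new framework means writing one more wrapper rather than changing the serving code.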

Page 24:

[Figure labels: Model Selection Layer (Selection Policy); Model Abstraction Layer (Caching, Adaptive Batching); Model Wrapper (MW) and RPC, six of each; Caffe]

Common Interface → Simplifies Deployment:
Ø Evaluate models using original code & systems
Ø Models run in separate processes
    Ø Resource isolation
    Ø Scale-out

Problem: frameworks optimized for batch processing, not latency

Page 25:

Adaptive Batching to Improve Throughput

A single page load may generate many queries

Ø Why batching helps:
    Ø Hardware acceleration
    Ø Helps amortize system overhead
Ø Optimal batch depends on:
    Ø hardware configuration
    Ø model and framework
    Ø system load
Ø Clipper Solution: be as slow as allowed…
    Ø Increase batch size until the latency objective is exceeded (Additive Increase)
    Ø If latency exceeds SLO, cut batch size by a fraction (Multiplicative Decrease)
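The additive-increase / multiplicative-decrease rule on this slide can be sketched in a few lines. The step size, the backoff factor of 0.9, and the linear latency model used in the simulation are assumptions chosen for illustration, not values taken from the slides.

```python
def next_batch_size(batch_size, observed_latency_ms, slo_ms,
                    step=1, backoff=0.9, minimum=1):
    """AIMD update for the batch size.

    Grow by a fixed step while the latency objective is met; shrink by a
    multiplicative factor as soon as it is exceeded. step and backoff are
    illustrative defaults.
    """
    if observed_latency_ms > slo_ms:
        return max(minimum, int(batch_size * backoff))
    return batch_size + step

# Toy simulation: assume latency grows linearly with batch size (2 ms + 0.5 ms per item).
batch = 1
for _ in range(200):
    latency_ms = 2.0 + 0.5 * batch
    batch = next_batch_size(batch, latency_ms, slo_ms=20.0)
```

In this toy setting the batch size climbs until a batch misses the 20 ms objective, backs off, and then keeps oscillating just below the largest batch that still meets it.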

Page 26:

Adaptive Batching to Improve Throughput

[Figure: 25.5x throughput increase]

Page 27:

Clipper Architecture

[Figure labels: Applications; Predict, Observe; RPC/REST Interface; Clipper; Model Selection Layer (Selection Policy); Model Abstraction Layer (Caching, Adaptive Batching); Model Wrapper (MW) and RPC, four of each; Caffe]

Page 28:

Clipper: Model Selection Layer (Selection Policy)

Goal: Maximize accuracy through bandits and ensembles, online learning, and personalization

Incorporate feedback in real-time to achieve:
Ø robust predictions by combining multiple models & frameworks
Ø online learning and personalization by selecting and personalizing predictions in response to feedback

Page 29:

[Figure labels: Caffe, Big Data, Application, Learning, Inference, Feedback, Slow, Slow Changing Model, Fast Model Selection per-User, Clipper]

Page 30:

[Figure labels: Caffe, Slow Changing Model, Fast Model Selection per-User, Clipper]

Model Selection Policy
Improves prediction accuracy by:
Ø Incorporating real-time feedback
Ø Estimating confidence of predictions
Ø Determining how to combine multiple predictions
    Ø e.g., choose best, average, …
    Ø enables frameworks to compete
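As one possible illustration of a selection policy that combines predictions and learns from feedback, the sketch below keeps a weight per model, averages predictions by weight, and shrinks the weight of models that turn out to be wrong. This multiplicative-weights scheme, the class name, and the learning rate are assumptions; the slides do not specify which algorithm Clipper's policies use.

```python
import math

class WeightedSelectionPolicy:
    """Hypothetical policy: weighted average of model outputs, reweighted by feedback."""

    def __init__(self, model_names, learning_rate=0.5):
        self.weights = {name: 1.0 for name in model_names}
        self.learning_rate = learning_rate

    def combine(self, predictions):
        """Weighted average over the models that returned a prediction."""
        total = sum(self.weights[m] for m in predictions)
        return sum(self.weights[m] * p for m, p in predictions.items()) / total

    def observe(self, predictions, label):
        """Feedback step: down-weight each model in proportion to its absolute error."""
        for model, prediction in predictions.items():
            loss = abs(prediction - label)
            self.weights[model] *= math.exp(-self.learning_rate * loss)

policy = WeightedSelectionPolicy(["good_model", "bad_model"])
preds = {"good_model": 1.0, "bad_model": 0.0}
before = policy.combine(preds)
for _ in range(10):
    policy.observe(preds, label=1.0)  # the true outcome keeps matching good_model
after = policy.combine(preds)
```

After a few rounds of feedback the combined prediction is dominated by the model that has been right, which is one sense in which frameworks "compete".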

Page 31:

Cost of Ensembles

Increased Load
Ø Solutions:
    Ø Caching and Batching
    Ø Model Selection prioritizes frameworks for load-shedding

Stragglers
Ø e.g., framework fails to meet SLO
Ø Solution: Anytime predictions
    Ø Selection policy must select/combine from available predictions
    Ø e.g., built-in ensemble policy substitutes expected value

[Figure labels: Caffe, Slow Changing Model, Fast Changing User Model, Clipper]
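A sketch of anytime predictions for an ensemble: all models are queried in parallel, and any model that has not answered by the deadline contributes its expected value instead. The function name, the averaging rule, and the use of a thread pool are assumptions for illustration; a real deployment would query models over RPC.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait

def anytime_predict(models, x, expected, deadline_s):
    """Average the ensemble, substituting expected values for stragglers.

    models:   name -> callable taking the input x
    expected: name -> fallback value used when that model misses the deadline
    """
    pool = ThreadPoolExecutor(max_workers=len(models))
    futures = {name: pool.submit(fn, x) for name, fn in models.items()}
    done, _ = wait(list(futures.values()), timeout=deadline_s)
    pool.shutdown(wait=False)  # do not block on stragglers

    values = []
    for name, future in futures.items():
        if future in done:
            values.append(future.result())
        else:
            values.append(expected[name])  # straggler: use its expected value
    return sum(values) / len(values)

def slow_model(x):
    time.sleep(1.0)  # simulated straggler that misses the deadline
    return 0.2

prediction = anytime_predict(
    models={"fast": lambda x: 0.8, "slow": slow_model},
    x=None,
    expected={"fast": 0.5, "slow": 0.4},
    deadline_s=0.2,
)
```

The caller gets an answer within roughly the deadline, built from whatever predictions were available at that point.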

Page 32:

Limitations of Clipper
Ø Clipper does not address offline model retraining

Ø By treating deployed models as black boxes, Clipper forgoes the opportunity to optimize prediction execution of the models themselves or share computation between models

Ø Only performs coarse-grained tradeoffs of accuracy, robustness, and performance.

Page 33:

TensorFlow Serving
Ø Recently released open-source prediction-serving system from Google
Ø Companion to TensorFlow deep-learning ML framework
Ø Easy to deploy TensorFlow Models
Ø System automatically manages the lifetime of deployed models
    Ø Watches for new versions, loads and transfers requests to new models automatically
Ø System does not address model performance, only system performance (through batching)
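The model-lifetime behavior described above (notice a new version, load it, move requests to it, retire the old one) can be sketched generically as below. This is not the TensorFlow Serving API; the class and method names are hypothetical and models are plain Python callables.

```python
class VersionManager:
    """Hypothetical lifecycle manager: serve the newest loaded version, retire the rest."""

    def __init__(self):
        self.loaded = {}     # version number -> model callable
        self.serving = None  # version currently receiving requests

    def on_new_version(self, version, model):
        """Called when a newly trained version is discovered."""
        if self.serving is not None and version <= self.serving:
            return  # ignore stale or duplicate versions
        self.loaded[version] = model  # load the new version first
        previous, self.serving = self.serving, version  # then switch traffic to it
        if previous is not None:
            del self.loaded[previous]  # finally retire the old version

    def predict(self, x):
        return self.loaded[self.serving](x)

manager = VersionManager()
manager.on_new_version(1, lambda x: x + 1)
first = manager.predict(1)
manager.on_new_version(2, lambda x: x * 10)
second = manager.predict(1)
```

Loading the new version before switching traffic means requests are always served by a fully loaded model during the transition.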

Page 34:

TensorFlow Serving Architecture

[Figure labels: Applications; Predict; RPC/REST Interface; TensorFlow-Serving; Prediction Batching; V1, V2, V3; New model version trained; RETIRED]

Page 35:

TensorFlow Serving Architecture

[Figure labels: Applications; Predict; RPC/REST Interface; TensorFlow-Serving; Prediction Batching; V1, V2, V3; RETIRED]

Page 36:

Other Prediction-Serving Systems
Ø Turi
    Ø Company co-founded by Joey, Carlos Guestrin, and others to serve predictions from models (primarily) trained in the GraphLab Create framework
    Ø Not open-source
    Ø Recently acquired by Apple
Ø Oryx
    Ø Developed by Cloudera for serving Apache Spark models
    Ø Implementation of Lambda Architecture with Spark and Spark Streaming to incrementally maintain models
    Ø Open source
Ø PredictionIO
    Ø Open-source Apache Incubating project; the company behind the project was recently acquired by Salesforce
    Ø Built on Apache Spark, HBase, Spray, Elasticsearch