VELOX: MODELS IN ACTION
Presented by Dan Crankshaw
Henry Milner, Joseph Gonzalez, Peter Bailis, Haoyuan Li, Tomer Kaftan, Zhao Zhang, Ali Ghodsi, Michael Franklin, Michael Jordan, and Ion Stoica
https://amplab.cs.berkeley.edu/projects/velox/
What's wrong?
1. Built from scratch for each application
2. Different systems
3. Space inefficient
4. Stale predictions
5. The T-Swift effect (sample bias)
Catify: Music for Cats
[Diagram: Pipeline; Tachyon + HDFS; Node.js App Server; NGINX; MongoDB; Training Data; New Model]
The Missing Piece
[Diagram: Web Application; Velox (Prediction Service, Model Manager); Pipeline; Tachyon + HDFS]
[Diagram: learning lifecycle with labels Data, Training, Model, Serving, Predictions, Feedback]
BENEFITS
1. Low-latency and scalable predictions as a service
2. Integrated approach leads to fresher, better predictions
3. Easy translation to production predictions
4. Eases operational pain
PERSONALIZED MODELING
$\text{Rating} = w_u \cdot f(x; \theta)$
$w_u$: Personalized User Model (highly dynamic)
$f(x; \theta)$: Shared Basis Feature Models (change slowly)
VELOX
Predictions as a service
[Diagram: Web Application; Velox (Prediction Service, Model Manager); Pipeline; Tachyon + HDFS]
PREDICTION API
GET /velox/catify/predict_top_k?userid=22&k=100
GET /velox/catify/predict?userid=22&song=27632
PREDICTIONS
def predict( u: UUID, x: Context )
$w_u \cdot f(x; \theta)$
1. Look up user weight ($w_u$)
2. Compute features ($f(x; \theta)$)
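The two steps on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the in-memory tables `USER_WEIGHTS` and `SONG_FEATURES`, the integer IDs, and the `predict_top_k` helper built on `heapq.nlargest` are hypothetical stand-ins, not Velox's actual storage layout or implementation.

```python
import heapq
from typing import Dict, List

# Hypothetical in-memory stores; the slides do not specify the storage layout.
USER_WEIGHTS: Dict[int, List[float]] = {22: [0.5, 2.0]}
SONG_FEATURES: Dict[int, List[float]] = {27632: [1.0, 0.25], 101: [0.0, 1.0]}


def features(song: int) -> List[float]:
    """Stand-in for f(x; theta): here just a table lookup."""
    return SONG_FEATURES[song]


def predict(user: int, song: int) -> float:
    """Return the dot product w_u . f(x; theta)."""
    w_u = USER_WEIGHTS[user]   # step 1: look up user weight
    f_x = features(song)       # step 2: compute features
    return sum(w * f for w, f in zip(w_u, f_x))


def predict_top_k(user: int, k: int) -> List[int]:
    """Return the k songs with the highest predicted rating (assumed semantics)."""
    return heapq.nlargest(k, SONG_FEATURES, key=lambda song: predict(user, song))
```

With these toy tables, `predict(22, 27632)` evaluates `0.5 * 1.0 + 2.0 * 0.25`, and `predict_top_k(22, 1)` ranks every known song by that score.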
LOW-LATENCY PREDICTIONS
Partition users
[Diagram: three Velox + Tachyon nodes labeled Partition 0, Partition 1, Partition 2]
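One simple way to partition users across nodes is to hash the user ID. The sketch below assumes a CRC32 hash and a fixed partition count of three (matching the diagram); the slides only say "Partition users", so the hashing scheme and the `partition_for` name are assumptions rather than Velox's documented behavior.

```python
import zlib

NUM_PARTITIONS = 3  # matches the three partitions drawn on the slide


def partition_for(user_id: int, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a user ID to a partition index with a stable hash (assumed scheme)."""
    key = str(user_id).encode("utf-8")
    # crc32 is deterministic across processes, unlike Python's built-in hash()
    # for strings, so every front end routes a given user to the same node.
    return zlib.crc32(key) % num_partitions
```

Because a user always lands on the same partition, that node can hold the user's weight vector locally and answer predictions without a remote lookup.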
LOW-LATENCY PREDICTIONS
Feature Cache
Features shared between users
[Diagram: Velox; Tachyon; Feature Cache]
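Because item features do not depend on the user, they can be computed once and reused for every user who asks about the same song. A minimal sketch of that idea, assuming Python's `functools.lru_cache` as the cache and a placeholder feature function (both are illustrative, not the cache Velox actually uses):

```python
from functools import lru_cache
from typing import Sequence, Tuple


@lru_cache(maxsize=1024)
def song_features(song_id: int) -> Tuple[float, float]:
    """Placeholder for an expensive f(x; theta) computation (hypothetical)."""
    return (float(song_id % 7), float(song_id % 11))


def predict(w_u: Sequence[float], song_id: int) -> float:
    """Score a song for one user; the features come from the shared cache."""
    return sum(w * f for w, f in zip(w_u, song_features(song_id)))
```

Two different users scoring the same song trigger a single feature computation; `song_features.cache_info()` reports the hit and miss counts.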
SIMPLE EXPLORATION
Epsilon-greedy
[Plot: labels Rating, Songs; series Prediction]
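Epsilon-greedy serves the best predicted song most of the time and a random song with probability epsilon. A minimal sketch, assuming the predictions arrive as a dict from song ID to predicted rating (the function name and signature are illustrative):

```python
import random
from typing import Dict


def epsilon_greedy(predicted: Dict[int, float], epsilon: float,
                   rng: random.Random) -> int:
    """Pick a song: explore uniformly with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        # Explore: any song, regardless of its predicted rating.
        return rng.choice(sorted(predicted))
    # Exploit: the song with the highest predicted rating.
    return max(predicted, key=predicted.get)
```

Setting `epsilon=0.0` always returns the top prediction, while `epsilon=1.0` ignores the predictions entirely.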
ACTIVE LEARNING: LinUCB
Look at upper confidence bound
[Plot: labels Rating, Songs; series Prediction, Uncertainty]
Li, L., Chu, W., Langford, J., & Schapire, R. E. (2010). A contextual-bandit approach to personalized news article recommendation. WWW '10: Proceedings of the 19th International Conference on World Wide Web. New York, NY, USA: ACM. doi:10.1145/1772690.1772758
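The upper-confidence-bound rule can be sketched as follows. This follows the disjoint-model form of LinUCB from the cited paper (ridge regression per arm, score = estimated mean + alpha times a confidence width), written in pure Python with a Sherman-Morrison update of the inverse matrix. Treating each song as an arm, and the class and function names, are assumptions for illustration, not a description of how Velox wires it in.

```python
import math
from typing import Dict, List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


class LinUCBArm:
    """Ridge-regression state for one arm: A^-1 (d x d) and b (d)."""

    def __init__(self, dim: int) -> None:
        # A starts as the identity matrix, so its inverse is also the identity.
        self.a_inv = [[1.0 if i == j else 0.0 for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim

    def _a_inv_times(self, v: Sequence[float]) -> List[float]:
        return [dot(row, v) for row in self.a_inv]

    def ucb(self, x: Sequence[float], alpha: float) -> float:
        """Estimated reward plus alpha times the confidence width."""
        theta = self._a_inv_times(self.b)
        width = math.sqrt(dot(x, self._a_inv_times(x)))
        return dot(theta, x) + alpha * width

    def update(self, x: Sequence[float], reward: float) -> None:
        """Fold in one observation: A += x x^T, b += reward * x."""
        a_inv_x = self._a_inv_times(x)
        denom = 1.0 + dot(x, a_inv_x)
        # Sherman-Morrison rank-one update; A^-1 is symmetric.
        for i in range(len(x)):
            for j in range(len(x)):
                self.a_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
        self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]


def choose(arms: Dict[int, LinUCBArm], x: Sequence[float], alpha: float) -> int:
    """Pick the arm whose upper confidence bound is highest."""
    return max(arms, key=lambda arm_id: arms[arm_id].ucb(x, alpha))
```

An arm that has been observed many times has a narrow confidence width, so a rarely tried arm can win the selection even when its estimated mean is no higher. That is the targeted exploration the slide contrasts with epsilon-greedy.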
Catify: Music for Cats
[Diagram: Pipeline; Tachyon + HDFS; Node.js App Server; NGINX; MongoDB; Velox (Prediction Service, Model Manager)]
[Diagram: learning lifecycle with labels Data, Training, Model, Serving, Predictions, Feedback, Mgmt.; Realtime Learning]
USER-FACING API
GET /velox/catify/predict?userid=22&song=27632
GET /velox/catify/predict_top_k?userid=22&k=100
POST /velox/catify/observe?userid=22&song=27632&score=3.7
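A client could build these three requests with the standard library as sketched below. The paths and parameter names come from the slide; the base URL is a hypothetical placeholder, and the query parameters are joined with `&` as in a standard query string. Nothing is sent here; the code only constructs URLs and a request object.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE = "http://localhost:8080"  # hypothetical host and port, not from the slides


def predict_url(userid: int, song: int) -> str:
    """URL for a single-item prediction."""
    return f"{BASE}/velox/catify/predict?" + urlencode(
        {"userid": userid, "song": song})


def predict_top_k_url(userid: int, k: int) -> str:
    """URL for the top-k prediction endpoint."""
    return f"{BASE}/velox/catify/predict_top_k?" + urlencode(
        {"userid": userid, "k": k})


def observe_request(userid: int, song: int, score: float) -> Request:
    """POST request reporting an observed score (built, not sent)."""
    url = f"{BASE}/velox/catify/observe?" + urlencode(
        {"userid": userid, "song": song, "score": score})
    return Request(url, method="POST")
```

Passing the returned `Request` to `urllib.request.urlopen` would issue the call against a running server.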
ONLINE UPDATES
def observe(u: UUID, x: Context, y: Score)
$w_u \cdot f(x; \theta)$
1. Update $w_u$ with new training point
2. Basis functions stay fixed
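As an illustration of updating only the user weights while the features stay fixed, the sketch below applies one stochastic-gradient step on the squared error. The slides do not say which update rule is used, so the SGD step, the learning rate, and the `USER_WEIGHTS` table are all assumptions.

```python
from typing import Dict, List, Sequence

# Hypothetical per-user weight store, starting from zero weights.
USER_WEIGHTS: Dict[int, List[float]] = {22: [0.0, 0.0]}
LEARNING_RATE = 0.1  # hypothetical step size


def predict(user: int, f_x: Sequence[float]) -> float:
    """Dot product of the user's weights with fixed features f(x; theta)."""
    return sum(w * f for w, f in zip(USER_WEIGHTS[user], f_x))


def observe(user: int, f_x: Sequence[float], y: float) -> None:
    """Update w_u with one training point; f_x itself is never modified."""
    error = y - predict(user, f_x)
    # Gradient step on 0.5 * (y - w_u . f_x)^2 with respect to w_u only.
    USER_WEIGHTS[user] = [w + LEARNING_RATE * error * f
                          for w, f in zip(USER_WEIGHTS[user], f_x)]
```

Each call moves the user's prediction for that feature vector closer to the observed score, at a cost proportional to the feature dimension, which is what makes a per-request update practical.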
[Diagram: learning lifecycle with labels Data, Training, Model, Serving, Predictions, Feedback, Mgmt.; Realtime Learning + Offline Retraining]
[Diagram: learning lifecycle with labels Data, Model, Predictions, Serving, Feedback; Velox Model Management System; Spark]
The future of research in scalable learning systems will be in the integration of the learning lifecycle:
[Diagram: learning lifecycle with labels Data, Training, Model, Serving, Predictions, Feedback]
SUMMARY
• Model training and predictions rely on ad-hoc, manual processes spread across multiple systems