Scaling and Approximation in Complex Data Analysis Mikio Braun Delivery Lead Recommendation & Search Zalando SE datanatives, Nov 20, 2016

"Scaling and Approximation in Complex Data Analysis", Mikio Braun

Jan 08, 2017


Data & Analytics

Transcript
Page 1: "Scaling and Approximation in Complex Data Analysis", Mikio Braun

Scaling and Approximation

in Complex Data Analysis

Mikio Braun

Delivery Lead Recommendation & Search

Zalando SE

datanatives, Nov 20, 2016

Page 2

Recommendations at Zalando

● 135M visitors per month (about 4.5M per day)

● Recommendations for product page, newsletter, etc.

● Fully data-driven.

10.04.15

Page 3

Personalized Recommendations

Page 4

Big data on Hadoop

Page 5

Large Scale Learning

Page 6

Learning in a nutshell: Optimization

Page 7

Gradient Descent - Introducing approximations

Instead of doing anything “fancy”, try to minimize the prediction error in each step.

Or, not even consider the whole data set but just a few points at a time.

Or, not even that but only take one point at a time.

=> Stochastic Gradient Descent
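
The three variants on this slide (whole data set, a few points, one point at a time) can be sketched on a toy problem. This is a minimal illustration, not code from the talk: the synthetic one-dimensional data, the squared-error loss, the helper names `make_data`, `gradient`, and `fit`, and the learning rates are all assumptions chosen to keep the example small.

```python
import random

def make_data(n=200, seed=0):
    # Hypothetical synthetic data: y = 2*x + 1 plus small Gaussian noise.
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = 2.0 * x + 1.0 + rng.gauss(0.0, 0.05)
        data.append((x, y))
    return data

def gradient(w, b, points):
    # Gradient of the mean squared prediction error over the given points.
    gw = gb = 0.0
    for x, y in points:
        err = (w * x + b) - y
        gw += 2.0 * err * x
        gb += 2.0 * err
    n = len(points)
    return gw / n, gb / n

def fit(data, batch_size, steps, lr=0.1, seed=1):
    # One loop covers all three variants; only batch_size changes.
    rng = random.Random(seed)
    w = b = 0.0
    for _ in range(steps):
        if batch_size >= len(data):
            batch = data                          # whole data set
        else:
            batch = rng.sample(data, batch_size)  # a few points, or just one
        gw, gb = gradient(w, b, batch)
        w -= lr * gw
        b -= lr * gb
    return w, b

data = make_data()
full = fit(data, batch_size=len(data), steps=500)      # gradient descent
mini = fit(data, batch_size=10, steps=500)             # mini-batch
single = fit(data, batch_size=1, steps=2000, lr=0.05)  # stochastic gradient descent

for name, (w, b) in (("full", full), ("mini", mini), ("single", single)):
    print(f"{name:>6}: w={w:.3f} b={b:.3f}")
```

Each step of the single-point variant touches one example instead of all 200, so it is far cheaper per step, and in this toy setting all three runs should end up close to the same slope and intercept. The smaller batches give noisier estimates, which is the approximation the slide refers to.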

Page 8

Why can we just take a short-cut?

Page 9

Summary

● Large-scale complex data analysis: billions of examples, millions of features

● Many parts can be parallelized well

● Training of models is inherently hard

● Approximation can help to deal with this

● Goal is to generate good predictions of future data
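
One reading of "many parts can be parallelized well" is that an error gradient summed over examples splits into independent partial sums. The sketch below assumes a one-parameter model `y = w * x` and squared error; `partial_gradient` and `parallel_gradient` are hypothetical names, and a thread pool is used only to show the structure (in CPython it does not speed up CPU-bound work; a real system would distribute the shards across processes or machines).

```python
from concurrent.futures import ThreadPoolExecutor

def partial_gradient(args):
    # Sum of squared-error gradients for the model y = w * x over one shard.
    w, shard = args
    g = 0.0
    for x, y in shard:
        g += 2.0 * (w * x - y) * x
    return g, len(shard)

def parallel_gradient(w, data, n_shards=4):
    # Split the data into shards, compute each partial sum independently,
    # then combine the partial sums into one mean gradient.
    shards = [data[i::n_shards] for i in range(n_shards)]
    with ThreadPoolExecutor(max_workers=n_shards) as pool:
        parts = list(pool.map(partial_gradient, [(w, s) for s in shards]))
    total = sum(g for g, _ in parts)
    count = sum(n for _, n in parts)
    return total / count

# Tiny made-up data set following y = 3 * x.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
serial = sum(2.0 * (0.0 * x - y) * x for x, y in data) / len(data)
print(parallel_gradient(0.0, data), serial)
```

The sharded result matches the serial one because the combination step is just a sum, so no shard needs to see another shard's data.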