Scaling and Approximation in Complex Data Analysis Mikio Braun Delivery Lead Recommendation & Search Zalando SE datanatives, Nov 20, 2016
Jan 08, 2017
● 135M visitors per month (about 4.5M per day)
● Recommendations for product page, newsletter, etc.
● Fully data-driven.
Recommendations at Zalando
Personalized Recommendations
Big data on Hadoop
Large Scale Learning
Learning in a nutshell: Optimization
Instead of doing anything “fancy”, try to reduce the prediction error a little in each step.
Or, do not even consider the whole data set, just a few points at a time.
Or, not even that: take only one point at a time.
=> Stochastic Gradient Descent
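The three variants above differ only in how many points each step looks at. The following is a minimal sketch of that idea, not code from the talk: the toy data (y = 2x + 1 plus noise), the squared-error loss, the learning rate, and the names `gradient` and `fit` are all assumptions chosen for illustration.

```python
import random

def gradient(w, b, batch):
    """Mean gradient of the squared error 0.5 * (w*x + b - y)**2 over a batch."""
    gw = gb = 0.0
    for x, y in batch:
        err = w * x + b - y
        gw += err * x
        gb += err
    n = len(batch)
    return gw / n, gb / n

def fit(data, batch_size, steps, lr=0.05, seed=0):
    """Gradient descent where each step sees only `batch_size` random points.

    batch_size == len(data)  -> full gradient descent
    1 < batch_size < len     -> a few points at a time (mini-batch)
    batch_size == 1          -> one point at a time (SGD)
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(steps):
        batch = rng.sample(data, batch_size)  # random subset, no repeats
        gw, gb = gradient(w, b, batch)
        w -= lr * gw
        b -= lr * gb
    return w, b

# Hypothetical toy data: y = 2x + 1 with a little Gaussian noise.
data_rng = random.Random(42)
data = []
for _ in range(1000):
    x = data_rng.uniform(-1.0, 1.0)
    data.append((x, 2.0 * x + 1.0 + data_rng.gauss(0.0, 0.1)))

full = fit(data, batch_size=len(data), steps=500)
mini = fit(data, batch_size=32, steps=500)
sgd = fit(data, batch_size=1, steps=5000)

for name, (w, b) in [("full", full), ("mini-batch", mini), ("sgd", sgd)]:
    print(f"{name:>10}: w = {w:.3f}, b = {b:.3f}")
```

In this sketch all three settings should end up near the true slope and intercept. The single-point version takes noisier steps, but each step costs a tiny fraction of a full pass over the data.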
Gradient Descent - Introducing approximations
Why can we just take a shortcut?
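One common way to answer this, offered as a sketch rather than as the slide's own argument: if the objective is the average loss over n examples and the index i is drawn uniformly at random, the gradient at a single example is an unbiased estimate of the full gradient. The symbols L, \ell_i, w, and n are notation assumed here for illustration.

```latex
L(w) = \frac{1}{n} \sum_{i=1}^{n} \ell_i(w),
\qquad
\mathbb{E}_{i \sim \mathrm{Unif}\{1,\dots,n\}}\!\left[ \nabla \ell_i(w) \right]
= \sum_{i=1}^{n} \frac{1}{n} \, \nabla \ell_i(w)
= \nabla L(w).
```

Each cheap step therefore points in the right direction on average. The price is extra variance in the individual steps.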
● Large scale complex data analysis: billions of examples, millions of features
● Many parts can be parallelized well
● Training of models is essentially hard
● Approximation can help to deal with this
● Goal is to generate good predictions of future data
Summary