Blendle @ RecSys'17: Online Learning to Rank for Recommender Systems

Online Learning to Rank for Recommending

Daan Odijk, Lead data scientist @ Blendle

@dodijk

Mission: Help you discover and support the world’s best journalism

International
• May 2014: The Netherlands
• Sept 2015: Germany
• March 2016: United States

Publisher-backed: among others NY Times, Nikkei, Axel Springer

70 employees: 10 journalists & 50 developers

Blendle


In Blendle you can browse through all quality newspapers and magazines


You only pay for what you read, with a single click

Scale at Blendle


Articles
• > 6M in total
• > 7K new every day
• > 30% is read

Users
• > 1M users
• ~ 1 in 5 converts to a paying user

Events
• ~ 2B in total
• > 2M new every day


Our editors select the best articles for our email newsletter every day

Our personalisation algorithms create a personal bundle from this

@anneschuth Blendle


Why we personalize


Random Forest classifier trained on a year of editorial picks

Clustered based on Cosine similarity with TF.IDF vectors

Prioritised Selection
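
The clustering step relies on cosine similarity between TF.IDF vectors. A minimal sketch of that similarity measure follows. It assumes whitespace tokenization and a smoothed IDF, neither of which is taken from the talk, and the function names `tfidf_vectors` and `cosine` are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF.IDF vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumption) keeps every weight finite and positive.
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Made-up example texts: two on the same story, one unrelated.
docs = [
    "election results in the netherlands",
    "netherlands election results announced",
    "new recipe for apple pie",
]
vecs = tfidf_vectors(docs)
sim_related = cosine(vecs[0], vecs[1])
sim_unrelated = cosine(vecs[0], vecs[2])
```

Articles about the same story score higher than unrelated ones, so a similarity threshold (to be tuned) could group them into one cluster.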

[Figure: sales over time; x-axis ticks: today, 1w, 2w, 3w]

Short shelf life

Daily cold start
• >7K new articles every night
• Our newsletter is an important traffic driver
• No usage info to rank the newsletter before we send the newsletter

Article enrichment pipeline

• Author extraction
• Semantic linking: Wikipedia concepts
• Sentiment analysis: polarity scores (negative, positive)
• Stylometry: length, word variation, vocabulary richness, …
• Named Entity Recognition: locations, persons, organizations
• PoS-tagging
• Language detection
• Topic modeling: Spark LDA EM
• Tokenization
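
To make the stylometry features concrete, here is a minimal sketch of how length, word variation and vocabulary richness could be computed for one article. The exact definitions (type-token ratio for word variation, share of words used once for vocabulary richness) are assumptions for illustration, and `stylometry_features` is a hypothetical name.

```python
import re
from collections import Counter


def stylometry_features(text):
    """Compute a few simple style features for one article."""
    words = re.findall(r"[a-z']+", text.lower())
    n_words = len(words)
    if n_words == 0:
        return {"length": 0, "type_token_ratio": 0.0,
                "hapax_ratio": 0.0, "avg_word_length": 0.0}
    counts = Counter(words)
    # Words that occur exactly once in the article.
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return {
        "length": n_words,
        # Word variation: distinct words divided by all words.
        "type_token_ratio": len(counts) / n_words,
        # Vocabulary richness proxy: share of words used only once.
        "hapax_ratio": hapaxes / n_words,
        "avg_word_length": sum(len(w) for w in words) / n_words,
    }


features = stylometry_features("The cat sat on the mat. The end.")
```

Features like these need only the article text, which is what makes them usable before any reader has seen the article.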

User Profiling

Enrich

Aggregate Profile

• Reads • Views • Negative feedback

Learning to rank: preference learning

[Diagram: two Enrich steps and two Profile steps feed Extract ML Features, which feeds the Model (learning to predict)]


Ranking

[Diagram: Enrich and Profile feed Extract ML Features, which feeds Rank]
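
Preference learning can be sketched as a pairwise problem: for one user, an article that was read should score higher than one that was not. The example below is an illustration under stated assumptions, not Blendle's model. It uses a linear scorer trained with a logistic loss on feature differences, and the two features (`author_match`, `topic_match`) and all data are hypothetical.

```python
import math
import random


def train_pairwise(pairs, n_features, epochs=200, lr=0.1, seed=0):
    """Learn weights w so that w . preferred > w . other (logistic loss)."""
    rng = random.Random(seed)
    w = [0.0] * n_features
    pairs = list(pairs)
    for _ in range(epochs):
        rng.shuffle(pairs)
        for preferred, other in pairs:
            diff = [p - o for p, o in zip(preferred, other)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            # Gradient of log(1 + exp(-margin)) with respect to w.
            scale = -1.0 / (1.0 + math.exp(margin))
            w = [wi - lr * scale * di for wi, di in zip(w, diff)]
    return w


def rank(w, candidates):
    """Sort (article_id, features) candidates by descending model score."""
    return sorted(
        candidates,
        key=lambda c: -sum(wi * xi for wi, xi in zip(w, c[1])),
    )


# Hypothetical features per article: [author_match, topic_match].
# Each pair is (features of the read article, features of the skipped one).
pairs = [
    ([1.0, 0.2], [0.0, 0.1]),
    ([0.9, 0.8], [0.1, 0.7]),
    ([0.8, 0.1], [0.2, 0.9]),
]
w = train_pairwise(pairs, n_features=2)
ranked = rank(w, [("a", [0.0, 0.9]), ("b", [1.0, 0.0])])
ranked_ids = [article_id for article_id, _ in ranked]
```

In this toy data the user follows authors rather than topics, so the learned weights place article "b" above article "a".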

Online Learning to Rank

• Learning with a user in the loop
• Daily updates to our model

Dueling Bandit Gradient Descent [Yue et al., 2009; Hofmann et al., 2011]

[Diagram: for a query, an Exploitative Ranker and an Explorative Ranker, each shown as a point in weight space (axes wAuthor, wTopic)]

For Blendle the user is the query

Interleaved Ranking

TeamDraft Interleave

[Figure: for a query, the Exploitative Ranking and the Explorative Ranking (A B C D E F and C G D A B E) are merged step by step into one Interleaved Ranking; one ranker contributes A, B, E and the other C, G, D]

Radlinski, F., Kurup, M., & Joachims, T. (2008). How does clickthrough data reflect retrieval quality? In CIKM ’08.



Note: the interleaving method is NOT part of DBGD; it just provides feedback.
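
A minimal sketch of TeamDraft interleaving, following the description in Radlinski et al. (2008): the two rankers take turns picking their best remaining item, a coin flip decides who goes first when both have picked equally often, and each click is credited to the ranker that contributed the clicked item. The function names and the simulated click are illustrative assumptions.

```python
import random


def team_draft_interleave(ranking_a, ranking_b, length, rng=random):
    """Interleave two rankings; return the list and each item's team."""
    interleaved, team_of = [], {}
    picks = {"a": 0, "b": 0}
    rankings = {"a": ranking_a, "b": ranking_b}
    while len(interleaved) < length:
        # The team with fewer picks goes next; a coin flip breaks ties.
        if picks["a"] != picks["b"]:
            team = "a" if picks["a"] < picks["b"] else "b"
        else:
            team = rng.choice(["a", "b"])
        remaining = [i for i in rankings[team] if i not in team_of]
        if not remaining:
            # This team has nothing left; let the other team pick instead.
            team = "b" if team == "a" else "a"
            remaining = [i for i in rankings[team] if i not in team_of]
            if not remaining:
                break
        item = remaining[0]
        interleaved.append(item)
        team_of[item] = team
        picks[team] += 1
    return interleaved, team_of


def winner(team_of, clicked):
    """Credit each click to the team that contributed the clicked item."""
    credit = {"a": 0, "b": 0}
    for item in clicked:
        if item in team_of:
            credit[team_of[item]] += 1
    if credit["a"] == credit["b"]:
        return None  # tie: no preference inferred
    return "a" if credit["a"] > credit["b"] else "b"


interleaved, team_of = team_draft_interleave(
    list("ABCDEF"), list("CGDABE"), length=6, rng=random.Random(0)
)
# Simulated feedback: the reader opens article G only.
preferred = winner(team_of, clicked=["G"])
```

Article G occurs only in the second ranking, so a click on it is always credited to that ranker.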

[Diagram: DBGD update for a query, shown in weight space (axes wAuthor, wTopic) with article G and the Exploitative Ranker]
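
The DBGD loop of Yue et al. (2009) can be sketched as follows: perturb the exploitative weights in a random direction to obtain an explorative ranker, compare the two on a query, and take a small step in that direction only if the explorative ranker wins. In the sketch below the comparison is simulated with a hidden "ideal" weight vector, which stands in for real interleaved feedback. The step sizes `delta` and `alpha` and all values are hypothetical.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Sample a direction uniformly from the unit sphere."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 0.0:
            return [x / norm for x in v]


def propose(w, delta, rng):
    """Create the explorative ranker by perturbing the exploitative one."""
    u = random_unit_vector(len(w), rng)
    return [wi + delta * ui for wi, ui in zip(w, u)], u


def update(w, u, alpha, explorative_won):
    """Step towards the explorative ranker only if it won the comparison."""
    if not explorative_won:
        return list(w)
    return [wi + alpha * ui for wi, ui in zip(w, u)]


def distance(a, b):
    """Euclidean distance between two weight vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Simulation only: a hidden "ideal" weight vector stands in for real users.
rng = random.Random(42)
ideal = [1.0, 0.5]  # hypothetical [wAuthor, wTopic]
start = [0.0, 0.0]
w = list(start)
for _ in range(300):
    candidate, u = propose(w, delta=0.5, rng=rng)
    # Stand-in for interleaved feedback: the ranker closer to `ideal` wins.
    explorative_won = distance(candidate, ideal) < distance(w, ideal)
    w = update(w, u, alpha=0.1, explorative_won=explorative_won)
```

Keeping the update step `alpha` smaller than the exploration step `delta` means a single noisy comparison moves the production ranker only slightly.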



Online Evaluation

[Figure: Online Learning A/B Test; lift in articles read (y-axis, -2% to 6%) over days (x-axis)]

User Cold Start

+ Newspapers

[Figure: ranker weight in score (y-axis, 0% to 100%) against articles read (x-axis, 0 to 5), with series Onboarding and Reading]

5% lift in articles read
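
One way to read the cold-start figure is as a blend of two rankers whose weights shift as a new user reads more articles. The sketch below assumes a linear ramp that hands over fully to the reading-based ranker after 5 articles. The shape of the ramp, the function names and the scores are assumptions, not details from the talk.

```python
def reading_weight(articles_read, ramp=5):
    """Share of the final score taken from the reading-history ranker."""
    return min(max(articles_read, 0), ramp) / ramp


def blended_score(onboarding_score, reading_score, articles_read, ramp=5):
    """Mix the onboarding and reading rankers for a (possibly new) user."""
    w = reading_weight(articles_read, ramp)
    return (1.0 - w) * onboarding_score + w * reading_score


# A brand-new user is scored by onboarding preferences alone ...
new_user = blended_score(0.8, 0.2, articles_read=0)
# ... and a user with 5 or more reads by reading history alone.
active_user = blended_score(0.8, 0.2, articles_read=5)
```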

Diversification

[Figure: the "For you" selection is built step by step; each step marks the chosen articles (✓) under the labels Relevance and Similarity]
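
A common way to trade relevance off against similarity is greedy re-ranking in the style of maximal marginal relevance (MMR). Whether Blendle uses exactly this scheme is not stated in the slides, so the sketch below is an assumption for illustration. The `trade_off` parameter, the article ids and the topic-based similarity are all hypothetical.

```python
def diversify(relevance, similarity, k, trade_off=0.7):
    """Greedy selection: reward relevance, penalise similarity to picks."""
    selected = []
    candidates = set(relevance)
    while candidates and len(selected) < k:
        def gain(item):
            # Redundancy: similarity to the closest already selected item.
            redundancy = max(
                (similarity(item, s) for s in selected), default=0.0
            )
            return trade_off * relevance[item] - (1.0 - trade_off) * redundancy
        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(candidates), key=gain)
        selected.append(best)
        candidates.remove(best)
    return selected


def same_topic(a, b):
    """Toy similarity: 1.0 if two article ids share a topic prefix."""
    return 1.0 if a.split("_")[0] == b.split("_")[0] else 0.0


relevance = {"politics_1": 0.9, "politics_2": 0.85, "sports_1": 0.6}
diverse = diversify(relevance, same_topic, k=2)
relevance_only = diversify(relevance, same_topic, k=2, trade_off=1.0)
```

With the similarity penalty switched on, the second slot goes to the sports article even though a second politics article has a higher relevance score.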

Filter Bubbles

Breaking Bubbles
• Editorial selection
• Must reads
• Diversification
• Little effect of popularity

Making Bubbles
• Onboarding
• Reading history
• Explicit feedback

[Figure labels: For you, Must Reads]

Timing problem
• Our editors wake up at 5am and are done reading at 7am
• Which is also when we want to send our newsletter
• We simply can’t wait for a batch process

[Diagram: computing user article scores]
• article published → articles → enrich article (×3) → articles updates → persist update → articles
• editor picks article → picks → create pick update → picked articles
• article features + users → user article features → ranker A, ranker B, ranker Z → user article scores → persist score → scores → personalize bundle
• Legend: filled nightly with a batch process; scores arrive after seconds
• Volumes: #updates = 200; #users x #updates = 200M; #users x #updates x #rankers = 600M
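
The volume figures for the score pipeline follow from simple multiplication. The check below assumes roughly 1M users, in line with the "> 1M users" figure earlier in the deck. The ranker count of 3 is only implied by the ratio 600M / 200M and is not stated explicitly.

```python
users = 1_000_000   # assumption: rounded from "> 1M users"
updates = 200       # article updates, as stated on the slide

# One user article score is needed per (user, update) pair and per ranker.
scores_per_ranker = users * updates
implied_rankers = 600_000_000 // scores_per_ranker
total_scores = scores_per_ranker * implied_rankers
```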

[Diagram: sending the personalized newsletter]
• editor sends newsletter → newsletters
• users → select users → user newsletters
• scores → personalize bundle → personalized newsletter → send newsletter
• Timeline: ~7am to ~7.15am
• Legend: computed realtime; filled with a batch process
• Volumes: #newsletters = 1; #newsletters x #users = 1M

experimentation

Cold start problem
• So we enrich our content

Timing problem
• So we precompute as much as possible

Explanations: #FATREC

Why this article? Because you seem to like a long read article every now and then.


@dodijk research.blendle.io

http://research.blendle.io
