Story of the algorithms behind Deezer Flow
RecSysFr, Paris, March 23, 2016
B. Mathieu, Data Architect
T. Bouabca, Data Scientist
/01 Context
/02 Initial system
/03 Content tagging system
/04 Live adaptive algorithms
/05 Conclusion
/01 Context
Deezer overview
● Music streaming service
● 6M paying users
● 40M tracks
● 180+ countries
● Up to 200+ tracks / user / day
Adapt tracklist to:
● Music tastes
● Localization
● Activity
● Mood
● Time & day
● Discovery preferences
Interesting debate
Should we ask questions to the user or let data science do the magic?
Deezer Flow: Initial pitch
The magic play button
/02 Initial system
Initial system (2014)
Available data:
● User likes (artists, albums, tracks)
● User stream logs
● Album recommendation algorithm (collaborative filtering)
Strategy:
● Tracklist computed offline
● Tracks from library / listening habits
● Tracks from recommended albums
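As a rough illustration of the strategy above, the sketch below mixes tracks the user already knows with tracks taken from recommended albums. The function name, the alternating order and the `size` parameter are assumptions made for the example; the slides do not say how the two sources were actually blended.

```python
from itertools import zip_longest


def build_offline_tracklist(library_tracks, recommended_album_tracks, size=10):
    """Alternate known tracks and recommended tracks, dropping duplicates.

    Hypothetical helper: the alternation rule is an assumption, not the
    blending rule used in the 2014 system.
    """
    seen = set()
    tracklist = []
    for pair in zip_longest(library_tracks, recommended_album_tracks):
        for track in pair:
            # zip_longest pads the shorter list with None.
            if track is not None and track not in seen:
                seen.add(track)
                tracklist.append(track)
    return tracklist[:size]


example = build_offline_tracklist(["t1", "t2", "t3"], ["r1", "t2", "r2"], size=5)
```

Because the list is computed offline, a batch job could run this once per user and store the result until the next refresh.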
Cold start problem: addressing new users
1. New users are asked to select a few musical genres and a few artists
2. Build tracklist based on liked artists & similar artists
3. Fallback to top tracks in country
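The three steps above can be read as a fallback chain. Here is a minimal sketch, assuming hypothetical in-memory lookups (`similar_artists`, `top_tracks_by_artist`, `country_top_tracks`) in place of whatever services supplied this data in production.

```python
def cold_start_tracklist(liked_artists, similar_artists, top_tracks_by_artist,
                         country_top_tracks, size=5):
    """Build a first tracklist for a new user (illustrative sketch only)."""
    # Step 2: start from the artists picked at onboarding, then add similar ones.
    artists = list(liked_artists)
    for artist in liked_artists:
        for similar in similar_artists.get(artist, []):
            if similar not in artists:
                artists.append(similar)

    tracklist = []
    for artist in artists:
        for track in top_tracks_by_artist.get(artist, []):
            if track not in tracklist:
                tracklist.append(track)

    # Step 3: fall back to the country chart to fill any remaining slots.
    for track in country_top_tracks:
        if len(tracklist) >= size:
            break
        if track not in tracklist:
            tracklist.append(track)
    return tracklist[:size]


with_likes = cold_start_tracklist(
    ["a"], {"a": ["b"]}, {"a": ["a1"], "b": ["b1"]}, ["c1", "a1", "c2"], size=4
)
no_likes = cold_start_tracklist([], {}, {}, ["c1", "a1", "c2"], size=4)
```

A user who skips onboarding entirely ends up with the country chart alone, which matches the fallback in step 3.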
First Flow limitations
● Tracklist only fits the user’s tastes
● Tracklist does not fit the user’s mood, activity, time of day...
To reach this goal:
● Immediately take into account the user’s last interactions
● Refresh tracklist more often
● Insights into the content of a track
Need a more content-based approach
/03 Content tagging system
Building a content tagging system
How to consolidate such data?
● Heterogeneous sources
● Millions of songs, artists, playlists or albums to tag every day
Quality assessment:
● Monitoring every source
● Benchmarking
● Studying new metrics
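One simple way to consolidate tags coming from heterogeneous sources is a weighted vote. The sketch below is only an illustration: the source names, the reliability weights and the `min_score` threshold are all assumptions, and the slides do not describe the consolidation method actually used.

```python
from collections import defaultdict


def consolidate_tags(tags_by_source, source_weights, min_score=0.5):
    """Keep the tags backed by enough (weighted) sources.

    tags_by_source: {source_name: [tag, ...]} for one track, artist or album.
    source_weights: {source_name: reliability weight} (hypothetical values).
    Returns {tag: score}, with scores normalized to the range [0, 1].
    """
    totals = defaultdict(float)
    for source, tags in tags_by_source.items():
        weight = source_weights.get(source, 0.0)
        for tag in set(tags):  # a source votes at most once per tag
            totals[tag] += weight

    total_weight = sum(source_weights.get(s, 0.0) for s in tags_by_source)
    if total_weight == 0:
        return {}
    scores = {tag: score / total_weight for tag, score in totals.items()}
    return {tag: score for tag, score in scores.items() if score >= min_score}


consolidated = consolidate_tags(
    {
        "editorial": ["rock", "90s"],
        "user_playlists": ["rock", "workout"],
        "metadata": ["rock"],
    },
    {"editorial": 0.5, "user_playlists": 0.25, "metadata": 0.25},
)
```

Monitoring each source then amounts to tracking how often its tags survive consolidation, which gives one possible signal for adjusting the weights.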
Architecture overview
Content data:
- Tags
- Popularity
User data:
- Taste model
- Hot tracks
- Behaviors
Build tracklist
- Data cache
- User action history

- Update user models
- Consolidate tag data
- Build indexes
Action logs
/04 Live adaptive algorithms
The live Flow (2015)
● Generated user profile
● User history analyzed offline
● Recently played tracks
● Recent actions
● Querying tracks from ElasticSearch index
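To make the querying step concrete, the sketch below builds an Elasticsearch query body from a weighted tag profile and excludes recently played tracks. The `tags` field name and the use of track IDs as document IDs are assumptions for illustration; the real index mapping is not shown in the slides. The function only builds the request body and sends nothing.

```python
def build_flow_query(tag_weights, recently_played_ids, size=20):
    """Return an Elasticsearch query body for a user's tag profile.

    tag_weights: {tag: weight} taken from the generated user profile.
    recently_played_ids: track IDs to exclude (assumed to be document IDs).
    """
    # One boosted term clause per tag, strongest tags first.
    should = [
        {"term": {"tags": {"value": tag, "boost": weight}}}
        for tag, weight in sorted(tag_weights.items(), key=lambda kv: -kv[1])
    ]
    return {
        "size": size,
        "query": {
            "bool": {
                "should": should,
                "minimum_should_match": 1,
                "must_not": [{"ids": {"values": list(recently_played_ids)}}],
            }
        },
    }


query = build_flow_query({"rock": 2.0, "chill": 0.5}, ["42", "43"], size=10)
```

The resulting dictionary could be passed as the body of a search request; tracks matching higher-boosted tags score higher, so the ranking loosely follows the profile.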
We can be more precise!
Flat tag profiles can lead to mistakes
● Tag clustering
● Querying ES with different tag queries
● Serving tracks according to cluster proportion
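Serving tracks according to cluster proportion can be sketched as a slot allocation problem: each tag cluster receives a share of the tracklist matching its weight in the user's profile. The largest-remainder rounding and the cluster names below are assumptions chosen for the example, not the rule described in the talk.

```python
def allocate_slots(cluster_weights, n_tracks):
    """Split n_tracks between clusters in proportion to their weights."""
    total = sum(cluster_weights.values())
    exact = {c: n_tracks * w / total for c, w in cluster_weights.items()}
    slots = {c: int(share) for c, share in exact.items()}
    # Give the slots lost to rounding to the largest fractional remainders.
    leftover = n_tracks - sum(slots.values())
    by_remainder = sorted(exact, key=lambda c: exact[c] - slots[c], reverse=True)
    for cluster in by_remainder[:leftover]:
        slots[cluster] += 1
    return slots


def serve_tracks(candidates_by_cluster, slots):
    """Take the allocated number of tracks from each cluster's query results."""
    return [
        track
        for cluster, count in slots.items()
        for track in candidates_by_cluster.get(cluster, [])[:count]
    ]


slots = allocate_slots({"rock": 5, "jazz": 3, "rap": 2}, n_tracks=7)
served = serve_tracks(
    {"rock": ["r1", "r2", "r3", "r4", "r5"], "jazz": ["j1", "j2", "j3"], "rap": ["p1", "p2"]},
    slots,
)
```

Each cluster's candidate list would come from its own tag query, so a user who listens to both rock and jazz gets tracks from each cluster instead of tracks matching an averaged profile that fits neither.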
Live evaluation - A/B testing
Different metrics to follow:
● Listening time
● Satisfaction
● User interaction (skipped / liked)
● Reconnection to Flow
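As an illustration of how some of these metrics could be compared between A/B variants, the sketch below aggregates play events per variant. The event fields (`variant`, `seconds`, `skipped`, `liked`) are hypothetical, and satisfaction and reconnection are left out because they need data that is not described here.

```python
from collections import defaultdict


def summarize_variants(events):
    """Compute average listening time, skip rate and like rate per variant."""
    stats = defaultdict(lambda: {"plays": 0, "seconds": 0, "skips": 0, "likes": 0})
    for event in events:
        s = stats[event["variant"]]
        s["plays"] += 1
        s["seconds"] += event["seconds"]
        s["skips"] += int(event["skipped"])
        s["likes"] += int(event["liked"])
    return {
        variant: {
            "avg_listening_time": s["seconds"] / s["plays"],
            "skip_rate": s["skips"] / s["plays"],
            "like_rate": s["likes"] / s["plays"],
        }
        for variant, s in stats.items()
    }


summary = summarize_variants([
    {"variant": "A", "seconds": 200, "skipped": False, "liked": True},
    {"variant": "A", "seconds": 40, "skipped": True, "liked": False},
    {"variant": "B", "seconds": 180, "skipped": False, "liked": False},
    {"variant": "B", "seconds": 220, "skipped": False, "liked": True},
])
```

In practice the per-variant figures would also need a significance check before one algorithm is declared better than the other.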
/05 Conclusion
What’s next?
● Fitting to user’s mood
● Increased performance on first days
Where are we now?
● Collaborative filtering combined with Content-Based approach (coming soon)
● More adaptation to the context
We are hiring!
● Data scientist
● Data architect
● Search scientist
https://www.deezer.com/jobs
Thanks for your attention
Questions?