(SoWeMine Workshop) "#nowplaying on #Spotify: Leveraging Spotify Information on Twitter for Artist Recommendations" - Martin Pichl, Eva Zangerle and Günther Specht

Presentation

Aug 15, 2015

icwe2015

Page 1

The University of Innsbruck was founded in 1669 and is one of Austria’s oldest universities. Today, with over 28,000 students and 4,000 staff, it is western Austria’s largest institution of higher education and research. For further information visit: www.uibk.ac.at.

#nowplaying on #Spotify: Leveraging Spotify Information on Twitter for Artist Recommendations

Martin Pichl, Eva Zangerle and Günther Specht

Page 2

Agenda

• Why Music Recommendations?

• Dataset Creation & Recommendation Approach

• Discussion and Future Work

Page 3

Recent Trends

• Rise of the web enabled new distribution channels

• Online Stores

• Music Streaming Platforms

• …

• These new distribution channels

– Exploit a worldwide market

– Have virtually no inventory costs

→ More and more diverse music is available

Page 4

Why (Music) Recommender Systems?

• The user is confronted with more and more diverse music

– on streaming platforms

– in online stores

– on mobile devices

• and has a free choice

• Users often do not know what to listen to

→ Information Overload

• Recommender Systems

– Help users find music they like

→ Increase usability

Page 5

Research on Music Recommender Systems

• Publicly available data necessary

• Twitter

– People share what they are listening to at the moment

• Get additional information from Spotify

– Additional listening events

– Additional information about the tracks and the artists

– Additional information about the listening context

• The additional information is necessary to build a more specialized recommender system

Page 6

Example Tweets

Page 7

The Dataset

• Generated a dataset based on tweets that contains

– <UserID, ArtistID, TrackID>-triples

– Boolean preferences (listened/not listened)

• Cleaning

– Removed duplicates

– Removed certain accounts, e.g. @SpotifyNowPlaying

– Removed “Various Artists”
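The cleaning steps above could be sketched roughly as follows. This is only an illustration and not the authors' implementation: the function name `clean_events`, the tuple layout, and the two blocklists are assumptions, and each event is taken to be a (user, artist, track) triple of plain strings.

```python
from typing import Iterable

# Hypothetical blocklists; the slides only name @SpotifyNowPlaying and "Various Artists".
BLOCKED_USERS = {"@SpotifyNowPlaying"}
BLOCKED_ARTISTS = {"Various Artists"}

Event = tuple[str, str, str]  # (user, artist, track)


def clean_events(events: Iterable[Event]) -> list[Event]:
    """Drop blocked accounts, placeholder artists, and duplicates, keeping first-seen order."""
    seen: set[Event] = set()
    cleaned: list[Event] = []
    for event in events:
        user, artist, _track = event
        if user in BLOCKED_USERS or artist in BLOCKED_ARTISTS:
            continue
        if event in seen:
            continue  # Boolean preferences: a repeated triple adds no information
        seen.add(event)
        cleaned.append(event)
    return cleaned
```

Because preferences are Boolean (listened / not listened), a repeated triple carries no extra signal, which is why duplicates can simply be dropped.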

Page 8

Dataset Snapshot

• Dataset contains

– 513,489 listening events

– by 68,045 unique users

– listening to 97,586 unique tracks

– by 40,593 unique artists

• Distribution

– On average 4.77 tweets per user (SD = 30.02)

– Median of 2

Page 9

Artist Recommendations using this Dataset

• No content based information

– Recommendations are computed using collaborative filtering

• Collaborative Filtering (CF)

– CF recommends items that a user's most similar users listened to (and that are new to the user)

• CF relies on

– A user similarity measure

– A number of nearest neighbors 𝑘
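A minimal sketch of user-based CF with Boolean preferences, under stated assumptions: the slides do not say how candidate items are ranked, so scoring an item by the summed similarity of the neighbours who listened to it is an assumption, as are the names `recommend` and `overlap`. The similarity function is passed in, so any user similarity measure can be plugged in.

```python
from collections import defaultdict
from typing import Callable


def recommend(
    target: str,
    history: dict[str, set[str]],
    similarity: Callable[[set[str], set[str]], float],
    k: int,
    n: int,
) -> list[str]:
    """Recommend up to n items listened to by the k users most similar to target."""
    own = history[target]
    sims = [(similarity(own, items), user) for user, items in history.items() if user != target]
    # Keep the k most similar users; users with zero similarity are ignored.
    neighbours = sorted((pair for pair in sims if pair[0] > 0), reverse=True)[:k]
    # Assumed scoring rule: sum the similarities of the neighbours who listened to an item.
    scores: dict[str, float] = defaultdict(float)
    for sim, user in neighbours:
        for item in history[user] - own:  # only items that are new to the target
            scores[item] += sim
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]


def overlap(a: set[str], b: set[str]) -> float:
    """Placeholder similarity (number of shared items), used here only for illustration."""
    return float(len(a & b))


history = {"u1": {"A", "B"}, "u2": {"A", "B", "C"}, "u3": {"A", "D"}, "u4": {"E"}}
top = recommend("u1", history, overlap, k=2, n=2)
```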

Page 10

User Similarity

• Boolean Preferences

– Jaccard Coefficient is suitable

– $\mathit{Jaccard}_{i,j} = \frac{|S_i \cap S_j|}{|S_i \cup S_j|}$, where $S_i$ is the set of items user $i$ listened to

• Include all the information available

– Compute Jaccard Coefficient using the artist listening history

– Compute Jaccard Coefficient using the track listening history

– Combine both using a weighted average

• $\mathit{userSim} = w_a \cdot \mathit{artistSim} + w_t \cdot \mathit{trackSim}$
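The similarity computation on this slide can be sketched in a few lines. The function names and the convention that two empty histories have similarity 0 are assumptions made for this illustration; the slides do not cover that edge case.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard coefficient |a ∩ b| / |a ∪ b|; taken as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def user_sim(
    artists_i: set[str],
    artists_j: set[str],
    tracks_i: set[str],
    tracks_j: set[str],
    w_a: float,
    w_t: float,
) -> float:
    """userSim = w_a * artistSim + w_t * trackSim, as given on the slide."""
    return w_a * jaccard(artists_i, artists_j) + w_t * jaccard(tracks_i, tracks_j)
```

For example, two users who share one of two artists (artistSim = 0.5) but no tracks (trackSim = 0) get userSim = 0.5 · w_a.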

Page 11

Parameter Tuning

• Input Parameters

– $w_a$, $w_t$, $k$

– Optimized using a Genetic Algorithm (GA)

– Fitness = Precision of the recommender system

– On average, a good solution was found after 4.14 iterations (SD = 2.27)

Page 12

Genetic Algorithm

• $w_a$, $w_t$, $k$ are floating-point genes between 0 and 1 and form an individual

• Random initial distribution

• The fitness of each individual is measured using the precision

• Crossover and mutations of the best individual

• Terminate if the precision is 1 or a certain number of generations is reached
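The loop described above might look roughly like the sketch below. Everything beyond the slide is an assumption: the population size, keeping the fitter half as parents, uniform crossover, Gaussian mutation, and the function name `evolve`. The `fitness` callable stands in for the precision of the recommender system, and how the gene for $k$ in [0, 1] maps to an actual neighbour count is not stated on the slides and is left out here.

```python
import random
from typing import Callable

Individual = tuple[float, ...]  # genes (w_a, w_t, k), each in [0, 1]


def evolve(
    fitness: Callable[[Individual], float],
    population_size: int = 20,
    max_generations: int = 50,
    mutation_scale: float = 0.1,
    seed: int = 0,
) -> Individual:
    """Return the fittest individual found; all hyperparameters here are illustrative."""
    rng = random.Random(seed)
    # Random initial distribution of the three genes.
    population = [tuple(rng.random() for _ in range(3)) for _ in range(population_size)]
    best = max(population, key=fitness)
    for _ in range(max_generations):
        if fitness(best) >= 1.0:  # terminate once the precision reaches 1
            break
        # Keep the fitter half as parents, so the best individual always survives.
        parents = sorted(population, key=fitness, reverse=True)[: population_size // 2]
        children = []
        while len(parents) + len(children) < population_size:
            mother, father = rng.sample(parents, 2)
            # Uniform crossover followed by Gaussian mutation, clipped back to [0, 1].
            child = tuple(
                min(1.0, max(0.0, rng.choice(pair) + rng.gauss(0.0, mutation_scale)))
                for pair in zip(mother, father)
            )
            children.append(child)
        population = parents + children
        best = max(population, key=fitness)
    return best
```

Because the parents are carried over unchanged, the best fitness never decreases from one generation to the next.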

Page 13

The Big Picture

Page 14

Evaluation Setup

• Offline Evaluation

– From each user we removed 1/3 of the listening events for testing

– Recommended $p \cdot \text{size of the test set}$ items

– Varied $p$ between 0 and 1

– Computed precision and recall for each $p$

• Parameters used for the Evaluation

– $w_a = 0.21$

– $w_t = 0.94$

– $k = 59$
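A small sketch of the per-user hold-out and the size of the recommendation list. The slides do not say whether the held-out third is chosen at random or chronologically, nor how fractional counts are rounded; random selection, rounding the test size down, and rounding the list size up are assumptions, as are the function names.

```python
import math
import random


def split_user_events(events: list[str], rng: random.Random) -> tuple[list[str], list[str]]:
    """Hold out one third of a user's listening events (rounded down) for testing."""
    shuffled = events[:]
    rng.shuffle(shuffled)
    n_test = len(shuffled) // 3
    return shuffled[n_test:], shuffled[:n_test]  # (training set, test set)


def n_recommendations(p: float, test_size: int) -> int:
    """Number of items to recommend: p * size of the test set, rounded up (an assumption)."""
    return math.ceil(p * test_size)


train, test = split_user_events(list("abcdefghi"), random.Random(0))
```

With nine events, three are held out for testing and six remain for training; for $p = 0.5$ and a test set of ten items, five items would be recommended.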

Page 15

Evaluation Metrics

• Hit: item found in the test set

• $\mathit{precision} = \frac{\mathit{hits}}{\text{number of recommendations}}$

• Relevant items: all items in the test set

• $\mathit{recall} = \frac{\mathit{hits}}{\text{size of the test set}}$
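Both metrics follow directly from the hit count. The sketch below assumes a recommendation list without duplicates and returns 0.0 when a denominator would be zero; the function name is hypothetical.

```python
def precision_recall(recommended: list[str], test_set: set[str]) -> tuple[float, float]:
    """Return (precision, recall) for one user's recommendation list."""
    hits = sum(1 for item in recommended if item in test_set)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(test_set) if test_set else 0.0
    return precision, recall
```

For instance, four recommendations of which two appear in a test set of three items give a precision of 2/4 and a recall of 2/3.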

Page 16

Performance of the optimized Recommender System

n     Precision   Recall
1     0.4656      0.0228
2     0.3622      0.0547
3     0.3137      0.0782
4     0.2812      0.1003
5     0.2531      0.1195
6     0.2315      0.1286
7     0.2170      0.1396
8     0.2170      0.1396
9     0.1871      0.1583
10    0.1871      0.1583

[Figure: Precision and Recall (y-axis, 0 to 0.5) plotted against Number of Recommendations (% of the Testset)]

Page 17

Discussion

• Heading in the right direction

• Performance is limited for a high number of recommendations

– Data sparsity

– Approach is too general

• Performance could be improved by

– Reducing data sparsity

– Using a specialized algorithm better suited to music recommendation

Page 18

Next Steps towards a more specialized RS

• Match Spotify and Twitter Users

– Early experiments show that we can match ~10% of the dataset

– Better matching than using the username and played tracks?

• Extract listening context from playlist names, e.g.

– Christmas

– Workout, training

– Driving

– …
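One very simple way to approach the context extraction would be keyword matching on the playlist name. The sketch below is purely hypothetical: the slides describe this as future work and give no method, and the keyword map contains only the examples listed above.

```python
# Hypothetical keyword map built only from the examples on the slide.
CONTEXT_KEYWORDS = {
    "christmas": "Christmas",
    "workout": "Workout, training",
    "training": "Workout, training",
    "driving": "Driving",
}


def extract_context(playlist_name: str) -> set[str]:
    """Return the context labels whose keyword occurs in the playlist name (case-insensitive)."""
    name = playlist_name.lower()
    return {label for keyword, label in CONTEXT_KEYWORDS.items() if keyword in name}
```

A playlist called "My Christmas Driving Mix" would be tagged with both the Christmas and the Driving context, while a name without any known keyword yields no context.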

Page 19

Next Steps towards a more specialized RS

• The offline evaluation is rather limited

• Create an intuitive web interface

• Conduct a live user experiment

Page 20

Acknowledgments