Music Recommendation System
Group - G6
Advisor – Dr. Vikram Pudi
Music != Movies and books
• Collaborative filtering (CF) algorithms generally suffer from the cold-start problem, offer little novelty, and ignore the content of items.
• User preferences are mostly tracked implicitly, through listening habits, instead of asking users to rate items explicitly.
• A user can consume the same item several times, even repeatedly and continuously. Music labeling should mostly be done by music experts.
• Another big difference is the context in which music is heard: people may prefer hard rock in the morning, classical piano while working, and cool jazz over dinner. A recommender should handle these contextual differences as well.
Audio Feature Extraction
• Audio data is a time series in which the y-axis is the instantaneous amplitude and the x-axis is time.
Audio Feature Extraction (continued)
• The audio waveform is broken into short frames (1024 samples at 22050 Hz, about 46 ms each).
• Collect frame-level features and compute the mean and variance of each feature over the frames:
• Discrete short-time Fourier transform (STFT)
• Real cepstral coefficients
• Mel-frequency cepstral coefficients (MFCC)
• Zero crossing rate
• Spectral centroid, rolloff, flux, and LPC
• Rhythmic, harmonic, and beat features ..
• The result is a vector of 68 floating-point values, called the feature vector, for each audio file.
• Computationally expensive
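The framing and summarizing steps above can be sketched for a single feature. This is a minimal illustration, not the project's extractor: it assumes non-overlapping frames, uses only the zero crossing rate, and substitutes a synthetic sine tone for real audio. The function names are hypothetical.

```python
import math
from statistics import mean, pvariance

FRAME_SIZE = 1024    # samples per frame, as stated on the slide
SAMPLE_RATE = 22050  # Hz, as stated on the slide


def split_into_frames(samples, frame_size=FRAME_SIZE):
    """Cut the waveform into consecutive frames (assumption: no overlap).

    A trailing partial frame is dropped.
    """
    usable = len(samples) - len(samples) % frame_size
    return [samples[i:i + frame_size] for i in range(0, usable, frame_size)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def summarize_zcr(samples):
    """Return (mean, variance) of the per-frame zero crossing rate."""
    rates = [zero_crossing_rate(f) for f in split_into_frames(samples)]
    return mean(rates), pvariance(rates)


# Stand-in audio: one second of a 440 Hz sine tone.
tone = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
zcr_mean, zcr_var = summarize_zcr(tone)
```

Each listed feature would contribute its own mean and variance in the same way, and concatenating them would give the per-file feature vector.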
2. Automatic Playlist Generation using a song seed
• Aim: given a song, suggest the most similar songs in the library and build a mood-based playlist.
• Implementation:
• Seed song s0 with feature vector F0 = {p1, p2, …, pn}
• Find the song s in S − {s0} whose feature vector is closest to F0.
• Repeat this step to generate the whole playlist, up to a certain tolerance.
• Two modes: compare candidates against the seed song, or against the last recommended song.
• Note that there is no user involved.
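The procedure above can be sketched as a greedy nearest-neighbour search. This is an illustration under assumptions the slides do not state: Euclidean distance is used as the difference between feature vectors, and "tolerance" is read as a maximum allowed distance. `generate_playlist` and the toy library are hypothetical.

```python
import math


def generate_playlist(library, seed, length, mode="seed", tolerance=float("inf")):
    """Greedy nearest-neighbour playlist from a seed song.

    library:   dict mapping song name -> feature vector (list of floats)
    mode:      "seed" compares candidates with the seed song,
               "last" compares them with the most recently added song
    tolerance: stop once the nearest remaining song is farther than this
               (assumed reading of "up to a certain tolerance")
    """
    playlist = [seed]
    remaining = set(library) - {seed}
    while remaining and len(playlist) < length:
        reference = library[seed if mode == "seed" else playlist[-1]]
        # sorted() makes tie-breaking deterministic
        nearest = min(sorted(remaining), key=lambda s: math.dist(library[s], reference))
        if math.dist(library[nearest], reference) > tolerance:
            break
        playlist.append(nearest)
        remaining.remove(nearest)
    return playlist


# Toy 2-dimensional vectors; real feature vectors would have 68 components.
toy_library = {
    "a": [0.0, 0.0],
    "b": [1.0, 0.0],
    "c": [-1.2, 0.0],
    "d": [2.1, 0.0],
}
by_seed = generate_playlist(toy_library, "a", length=4, mode="seed")
by_last = generate_playlist(toy_library, "a", length=4, mode="last")
```

In this toy data the two modes give different orderings: seed mode stays close to the original song, while last-song mode lets the playlist drift away from it step by step.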
Problems in the user scenario
• The previous case was the simplest form of the recommendation problem.
• No user preferences are involved.
• Where are the user profiles, musical taste, feedback loops, ratings, and listening habits?
• The result always depends on the seed song.
• This is not ideal when the user already has a varied set of songs in the playlist.
• Simply averaging the songs' feature vectors cannot be the right solution.
3. Top-N recommendation
• Solution: clustering with dynamic k-means
• Cluster the songs with the k-means algorithm, based on their feature vectors.
• But the initial k is unknown. Solution:
• Fix a radius R within which a genre is usually clustered. If the user likes 3 genres, the outcome will be 3 or more clusters.
• The algorithm starts with k = 1. If any cluster's radius is greater than R, increase k by 1 (to 2) and recalculate, repeating until every cluster's radius is <= R.
• Then compute the score of each music item and select the top-N items, where N is given by the user.
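The growing-k loop can be sketched as follows. This is a simplified illustration rather than the project's implementation. It assumes the radius of a cluster is the largest Euclidean distance from its centroid to a member, and it uses plain Lloyd's k-means with seeded random initialization. All function names are hypothetical.

```python
import math
import random


def assign(points, centroids):
    """Put each point into the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's k-means; returns (centroids, clusters)."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iterations):
        clusters = assign(points, centroids)
        updated = [
            [sum(dim) / len(c) for dim in zip(*c)] if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


def cluster_radius(centroid, members):
    """Assumed definition: largest centroid-to-member distance."""
    return max((math.dist(centroid, m) for m in members), default=0.0)


def dynamic_kmeans(points, max_radius):
    """Start at k = 1 and add clusters until every radius is <= max_radius."""
    for k in range(1, len(points) + 1):
        centroids, clusters = kmeans(points, k)
        if all(cluster_radius(c, m) <= max_radius for c, m in zip(centroids, clusters)):
            return centroids, clusters
    raise ValueError("no clustering satisfies the radius bound")


# Three well-separated toy "genres" in 2 dimensions.
songs = [
    [0.0, 0.0], [0.5, 0.0], [0.0, 0.5],
    [10.0, 0.0], [10.5, 0.0], [10.0, 0.5],
    [0.0, 10.0], [0.5, 10.0], [0.0, 10.5],
]
centroids, clusters = dynamic_kmeans(songs, max_radius=1.0)
```

Because k-means can settle in a poor local optimum, the loop may stop at a k larger than the true number of genres, which matches the "3 or more clusters" remark above.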
Ranking and scoring items
• Calculate the score of each music item as
• $$\mathrm{Score}(m, c) = \frac{1}{\mathrm{Dist}(V_c, V_m)} \cdot \frac{\mathrm{ClusterData}(c)}{\mathrm{AllData}}$$
• Score(m, c) = score of music item m with respect to cluster c,
• ClusterData(c) = number of music instances in cluster c,
• Dist(Vc, Vm) = Euclidean distance between the cluster centroid Vc and the music item's feature vector Vm. The closer an item is to the centroid, the higher its score and the greater its chance of being recommended.
• AllData = number of pieces in the user's playlist.
• Sum each item's scores over all clusters.
• Sort the items by score in descending order and recommend the top-N items.
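The scoring and ranking steps can be sketched as below. Two assumptions are not from the slides: the clusters are taken to be built from the user's playlist, so AllData equals the sum of the cluster sizes, and a small `eps` is added to the distance to avoid division by zero when an item sits exactly on a centroid. The function names are hypothetical.

```python
import math


def score(item, centroid, cluster_size, playlist_size, eps=1e-9):
    """Score(m, c) = ClusterData(c) / (Dist(Vc, Vm) * AllData).

    eps is an added guard against a zero distance (assumption).
    """
    return cluster_size / ((math.dist(centroid, item) + eps) * playlist_size)


def top_n(candidates, centroids, clusters, n):
    """Sum each candidate's score over all clusters and return the n best names."""
    playlist_size = sum(len(members) for members in clusters)  # AllData (assumed)
    totals = {
        name: sum(
            score(vector, centroid, len(members), playlist_size)
            for centroid, members in zip(centroids, clusters)
        )
        for name, vector in candidates.items()
    }
    return sorted(totals, key=totals.get, reverse=True)[:n]


# Toy example: a large cluster near the origin and a small one at x = 10.
toy_centroids = [[0.0, 0.0], [10.0, 0.0]]
toy_clusters = [[[0.0, 0.0], [0.1, 0.0], [-0.1, 0.0]], [[10.0, 0.0]]]
toy_candidates = {"x": [1.0, 0.0], "y": [9.0, 0.0], "z": [5.0, 0.0]}
recommended = top_n(toy_candidates, toy_centroids, toy_clusters, n=2)
```

Candidate "x" ranks above "y" even though each lies one unit from a centroid, because the cluster near "x" holds three of the four playlist songs and therefore carries more weight.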
Stats
• Dataset: "A benchmark for automatic genre classification"
+-------------+----------+
| tag         | count(*) |
+-------------+----------+
| alternative |      145 |
| blues       |      120 |
| electronic  |      113 |
| folkcountry |      222 |
| funksoulrnb |       47 |
| jazz        |      319 |
| pop         |      116 |
| raphiphop   |      300 |
| rock        |      504 |
+-------------+----------+
Limitations
• No user feedback loops.
• Determining R is difficult, and the approach fails when there are many genres.
• Content-based recommendations are less accurate.
• It is hard to map user preferences into the music domain.
• More features would improve the results, but clustering is expensive with large feature vectors.
• The approach does not scale well and is not efficient.
Future work
• UI front-end and Interfaces
• Improving on the recommendation algorithms defined in the literature.
• Gathering implicit feedback and tracking user profiles.
• Playlists by artist, or with a user profile as the seed.
• Recommendation from a song set.
• Mining the web for new music information and other song attributes: MP3 blogs and web service APIs (Last.fm, MyStrands, Pandora, etc.).
References
• Adomavicius, G. and Tuzhilin, A. (2005), “Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions”
• Aucouturier, J.-J. and Pachet, F. (2002), “Music similarity measures: What’s the use?”
• Cano, P., Koppenberger, M., and Wack, N., “Content-based music audio recommendation”
• Logan, B., “Music recommendation from song sets”
• Tzanetakis, G. and Cook, P., “Musical genre classification of audio signals”
• Ban, J.H., Kim, K.M., and Park, K.S., “Quick audio retrieval using multiple feature vector”
• Cano, P., Koppenberger, M., and Wack, N. (2005), “An industrial-strength content-based music recommendation system”