Btp 1st

Music Recommendation System

Group - G6

Advisor – Dr. Vikram Pudi

Music != Movies and books

• Collaborative filtering (CF) algorithms generally suffer from the cold-start problem, offer little novelty, and ignore the content of items.

• User preferences are mostly tracked implicitly, through listening habits, rather than by asking users to rate items explicitly.

• A user can consume the same item many times, even repeatedly and continuously. Music labeling mostly has to be done by music experts.

• Another big difference is the context of the music: people may prefer hard rock in the morning, classical piano while working, and cool jazz over dinner. A music recommender should handle these contextual differences as well.

Audio Feature Extraction

• Audio data is a time series in which the y-axis is the instantaneous amplitude and the x-axis is time.

Audio Feature Extraction (continued)

• The audio waveform is broken into short frames (1024 samples at 22050 Hz).

• Collect frame-level features and compute the mean and variance of each feature over the frames

• Discrete short-time Fourier transform

• Real Cepstral Coefficients

• Mel Frequency Cepstral Coefficients

• Zero crossing rate

• Spectral centroid, rolloff, flux, and LPC

• Rhythmic, harmonic, and beat features

• The result is a vector of 68 floating-point values, called the feature vector, for each audio file.

• Computationally expensive
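The framing and summarizing steps above can be sketched with one of the listed features, the zero crossing rate. This is only a minimal illustration, not the project's extractor: the non-overlapping frames, the function names, and the synthetic sine tones used as input are all assumptions made for the example.

```python
import math
from statistics import mean, pvariance

FRAME_SIZE = 1024      # samples per frame, as stated on the slide
SAMPLE_RATE = 22050    # Hz, as stated on the slide


def split_into_frames(samples, frame_size=FRAME_SIZE):
    """Split a waveform into non-overlapping frames (assumption), dropping a short tail."""
    n_frames = len(samples) // frame_size
    return [samples[i * frame_size:(i + 1) * frame_size] for i in range(n_frames)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def summarize(samples):
    """Return (mean, variance) of the per-frame zero crossing rate."""
    rates = [zero_crossing_rate(f) for f in split_into_frames(samples)]
    return mean(rates), pvariance(rates)


def sine(freq_hz, seconds=1.0):
    """Synthetic test tone; a hypothetical stand-in for a decoded audio file."""
    n = int(SAMPLE_RATE * seconds)
    return [math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE) for t in range(n)]


low_mean, low_var = summarize(sine(220))
high_mean, high_var = summarize(sine(1760))
```

A higher-pitched tone crosses zero more often, so `high_mean` comes out larger than `low_mean`. Each such (mean, variance) pair would contribute two entries to the feature vector.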

2. Automatic Playlist Generation using a song seed

• Aim: given a song, suggest the most similar songs in the library and build a mood-based playlist

• Implementation:

• Seed song s0 with feature vector F0 = {p1, p2, …, pn}

• Find the song s in S − {s0} whose feature vector differs least from F0.

• Repeat this step to generate the whole playlist, up to a certain tolerance.

• Two modes: compare against the seed song, or against the last recommended song

• Note that there is no user involved.
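The seed-based procedure above can be sketched as a greedy nearest-neighbour loop. The slides do not say how the difference or the tolerance is defined, so the Euclidean distance, the `max_distance` stopping rule, and all names below are assumptions made for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def generate_playlist(library, seed_id, max_distance, mode="seed"):
    """Greedy nearest-neighbour playlist.

    library: dict mapping song id -> feature vector
    mode: "seed" compares every candidate with the seed song;
          "last" compares with the most recently added song.
    Stops when the nearest remaining song is farther than max_distance
    (an assumed reading of the tolerance).
    """
    if mode not in ("seed", "last"):
        raise ValueError("mode must be 'seed' or 'last'")
    playlist = [seed_id]
    remaining = set(library) - {seed_id}
    reference = library[seed_id]
    while remaining:
        # Ties are broken by song id so the result is deterministic.
        nearest = min(remaining, key=lambda s: (euclidean(library[s], reference), s))
        if euclidean(library[nearest], reference) > max_distance:
            break
        playlist.append(nearest)
        remaining.remove(nearest)
        if mode == "last":
            reference = library[nearest]
    return playlist


# Toy 2-dimensional library (hypothetical data; real vectors have 68 entries).
library = {
    "a": [0.0, 0.0],
    "b": [1.0, 0.0],
    "c": [2.0, 0.0],
    "d": [3.0, 0.0],
    "z": [10.0, 10.0],
}
seed_mode = generate_playlist(library, "a", max_distance=1.5, mode="seed")
last_mode = generate_playlist(library, "a", max_distance=1.5, mode="last")
```

In this toy library, seed mode stops after "b" because "c" is already too far from the seed, while last mode drifts along the chain "a", "b", "c", "d". This shows how the two modes can produce quite different playlists from the same seed.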

Problems in the User Scenario

• The previous case was the simplest form of the recommendation problem.

• No user preferences are involved.

• Where are the user profiles, musical taste, feedback loops, ratings, and listening habits?

• The result always depends on the seed song.

• This is not ideal when the user already has a varied list of songs in their playlist.

• Simply averaging the feature vectors of those songs cannot be the right solution.

3. Top-N recommendation

• Solution: clustering with dynamic k-means

• Cluster the songs with the k-means algorithm, based on their feature vectors.

• But we do not know the initial k. Solution:

• Fix a radius R within which a genre is usually clustered. If the user likes 3 genres, the outcome will be 3 or more clusters.

• The algorithm starts with k = 1. If any cluster's radius exceeds R, increase k by 1 and recluster, repeating until every cluster's radius is <= R.

• Then compute the score of each music item and select the top-N items, where N is given by the user.
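The loop over k described above can be sketched as follows. Several details are assumptions, since the slides do not specify them: the radius is taken as the largest distance from a centroid to a member of its cluster, k-means is plain Lloyd iteration, and the deterministic farthest-point initialization is chosen only to make the example reproducible.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def centroid(points):
    return [sum(col) / len(points) for col in zip(*points)]


def farthest_point_init(points, k):
    """Deterministic start: repeatedly add the point farthest from all chosen centres."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(euclidean(p, c) for c in centers)))
    return centers


def kmeans(points, k, iterations=50):
    """Plain Lloyd's k-means; returns (centers, clusters)."""
    centers = farthest_point_init(points, k)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: euclidean(p, centers[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centre.
        new_centers = [centroid(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters


def radius(center, cluster):
    """Largest distance from the centre to any member (assumed definition of radius)."""
    return max((euclidean(center, p) for p in cluster), default=0.0)


def dynamic_kmeans(points, max_radius):
    """Start with k = 1 and increase k until every cluster radius is <= max_radius."""
    for k in range(1, len(points) + 1):
        centers, clusters = kmeans(points, k)
        if all(radius(c, cl) <= max_radius for c, cl in zip(centers, clusters)):
            break
    return centers, clusters


# Toy playlist with three well-separated groups (hypothetical 2-D feature vectors).
playlist_vectors = [
    [0, 0], [0, 1], [1, 0],
    [10, 10], [10, 11], [11, 10],
    [20, 0], [20, 1], [21, 0],
]
centers, clusters = dynamic_kmeans(playlist_vectors, max_radius=2.0)
```

With R = 2.0 the loop rejects k = 1 and k = 2 and stops at k = 3, one cluster per group. Each rejected k costs a full k-means run, which is one reason the approach scales poorly.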

Ranking and scoring items

• Calculate the score of each music item as

• $\mathrm{Score}(m, c) = \dfrac{\mathrm{ClusterData}(c)}{\mathrm{Dist}(V_c, V_m) \cdot \mathrm{AllData}}$

• Score(m, c) = score of music item m with respect to cluster c,

• ClusterData(c) = number of music instances in cluster c,

• Dist(Vc, Vm) = Euclidean distance between the music item's feature vector Vm and the cluster centroid Vc. The closer an item is to the centroid, the higher its score and the better its chance of being recommended.

• AllData = number of pieces in the user's playlist.

• Sum each item's scores over all clusters.

• Sort the items by score in descending order and recommend the top-N items.
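The scoring and ranking steps above can be sketched as follows. The small `eps` term that avoids division by zero when an item sits exactly on a centroid is an assumption, as are the function names and the toy data. AllData is taken as the sum of the cluster sizes, which assumes every playlist song belongs to exactly one cluster.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def score(item_vec, centers, cluster_sizes, eps=1e-9):
    """Sum over clusters of ClusterData(c) / (Dist(Vc, Vm) * AllData)."""
    all_data = sum(cluster_sizes)
    return sum(
        size / ((euclidean(center, item_vec) + eps) * all_data)
        for center, size in zip(centers, cluster_sizes)
    )


def top_n(candidates, centers, cluster_sizes, n):
    """Return the ids of the n candidates with the highest total score."""
    ranked = sorted(
        candidates,
        key=lambda song_id: score(candidates[song_id], centers, cluster_sizes),
        reverse=True,
    )
    return ranked[:n]


# Hypothetical clustering of a 10-song playlist: 8 songs in one cluster, 2 in another.
centers = [[0.0, 0.0], [10.0, 0.0]]
cluster_sizes = [8, 2]

# Hypothetical library songs that are not yet in the playlist.
candidates = {
    "near_big": [1.0, 0.0],
    "near_small": [9.0, 0.0],
    "far": [5.0, 50.0],
}
recommended = top_n(candidates, centers, cluster_sizes, n=2)
```

Because ClusterData(c) weights each term, a song at distance 1 from the large cluster outranks a song at distance 1 from the small cluster, so the recommendations follow the genres that dominate the user's playlist.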

Stats

• Dataset = "A benchmark for automatic genre classification"

+-------------+----------+
| tag         | count(*) |
+-------------+----------+
| alternative |      145 |
| blues       |      120 |
| electronic  |      113 |
| folkcountry |      222 |
| funksoulrnb |       47 |
| jazz        |      319 |
| pop         |      116 |
| raphiphop   |      300 |
| rock        |      504 |
+-------------+----------+

Limitations

• No user feedback loops.

• Determining R is a problem, and the approach fails when there are numerous genres.

• Content-based recommendations are less accurate.

• It is hard to map user preferences into the music domain.

• More features would improve the results, but clustering is expensive with large feature vectors.

• The approach has scaling problems and is not efficient.

Future work

• UI front-end and Interfaces

• Improving on the recommendation algorithms defined in the literature.

• Gathering implicit feedback and tracking user-profiles.

• Playlists by artist, or with a user profile as the seed.

• Recommendation from a song-set.

• Mining the web for new music information and other song attributes: MP3 blogs and web-service APIs such as Last.fm, MyStrands, and Pandora.

References

• Adomavicius, G. and Tuzhilin, A. (2005), “Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions”

• Aucouturier, J-J. and Pachet, F. (2002), “Music similarity measures: What’s the use?”

• P. Cano, M. Koppenberger, N. Wack, “Content-based music audio recommendation”

• B. Logan, “Music recommendation from song-sets”

• G. Tzanetakis, P. Cook, “Musical genre classification of audio signals”

• J.H. Ban, K.M. Kim, K.S. Park, “Quick audio retrieval using multiple feature vector”

• Cano, P., Koppenberger, M., and Wack, N. (2005), “An industrial-strength content-based music recommendation system”