Recommendation as Search: Reflections on Symmetry
Post on 08-May-2015
1©MapR Technologies - Confidential
Recommendation as Search
Reflections on Symmetry
Company Background
MapR provides the industry’s best Hadoop Distribution
– Combines the best of the Hadoop community contributions with significant internally financed infrastructure development
Background of Team
– Deep management bench with extensive analytic, storage, virtualization, and open source experience
– Google, EMC, Cisco, VMWare, Network Appliance, IBM, Microsoft, Apache Foundation, Aster Data, Brio, ParAccel
Proven
– MapR used across industries (Financial Services, Media, Telecom, Health Care, Internet Services, Government)
– Strategic OEM relationship with EMC and Cisco
– Over 1,000 installs
What is Hadoop?
A new style of computation
A new style of combining computation and storage
Allows very large computations
Used by all large internet companies, many other industries
Fundamentally changes the economics of large-scale computation
Why Big Data?
Because we can
Because we can learn new things
Because new economics of computation favors large scale
Because big data can be simpler than small data
Recommendations
Often known as collaborative filtering
– “People who bought x also bought y”
Actors (people) interact (bought) with items (x and y)
– observe successful interaction
We want to suggest additional successful interactions
Observations are inherently very sparse
Examples
Customers buying books (Linden et al)
Web visitors rating music (Shardanand and Maes) or movies (Riedl, et al), (Netflix)
Internet radio listeners not skipping songs (Musicmatch)
Internet video watchers watching >30 s (Veoh)
iTunes song purchases or plays (Apple)
Fundamental Algorithm
History matrix A has the shape of actors x items
Cooccurrence matrix K = A’A has the shape of items x items
– K_xy = sum over all actors u of A_ux A_uy
– each term counts an actor who interacted with both x and y
A is also a linear operator
K tells us “users who interacted with x also interacted with y”
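A minimal sketch of the cooccurrence computation described above, assuming a binary history matrix and NumPy. The toy data and the `recommend` helper are hypothetical and only illustrate the shapes involved; they are not taken from the talk.

```python
import numpy as np

# Hypothetical toy history: rows are actors, columns are items.
# A[u, x] = 1 means actor u interacted with item x.
A = np.array([
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
])

# Cooccurrence matrix (items x items):
# K[x, y] counts the actors who interacted with both x and y.
K = A.T @ A

def recommend(history, K, top_n=2):
    """Apply K as a linear operator to a history vector and rank unseen items."""
    scores = K @ history
    # Push items the actor has already interacted with to the bottom.
    scores = np.where(history > 0, -1, scores)
    order = np.argsort(-scores, kind="stable")
    return [int(i) for i in order[:top_n]]

# An actor who has only interacted with item 0.
print(recommend(np.array([1, 0, 0, 0]), K))
```

Raw cooccurrence counts favor popular items, so a production system would normally filter or reweight K before using it this way.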
… Warning …
Mathematics ahead
Fundamental Algorithmic Structure
Cooccurrence
For very large data-sets
But Wait ...
Does it have to be that way?
But why not ...
Why just dyadic learning?
Why not triadic learning?
Why not p-adic learning?
For example
Users enter queries (A)
– (actor = user, item = query)
Users view videos (B)
– (actor = user, item = video)
A’A gives query recommendation
– “did you mean to ask for”
B’B gives video recommendation
– “you might like these videos”
The punch-line
B’A recommends videos in response to a query
– (isn’t that a search engine?)
– (not quite, it doesn’t look at content or meta-data)
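The cross-recommendation B’A can be sketched in a few lines, assuming binary user-by-query and user-by-video matrices over the same users. The query names, video names, data, and the `search` helper are all hypothetical.

```python
import numpy as np

# Hypothetical vocabularies.
queries = ["flamenco", "guitar", "cats"]
videos = ["flamenco dance", "classical guitar", "kitten clip"]

# A: users x queries (who entered which query).
A = np.array([
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
])
# B: users x videos (who viewed which video), same user order as A.
B = np.array([
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
])

# Cross-cooccurrence (videos x queries):
# C[v, q] counts users who entered query q and viewed video v.
C = B.T @ A

def search(query, top_n=2):
    """Rank videos for a query using behavior only, never content or meta-data."""
    scores = C[:, queries.index(query)]
    order = np.argsort(-scores, kind="stable")
    return [videos[i] for i in order[:top_n] if scores[i] > 0]

print(search("flamenco"))
print(search("cats"))
```

Note that nothing in `search` inspects the text of a video title: the ranking comes entirely from which users issued the query and what they watched.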
Real-life example
Query: “Paco de Lucia”
Conventional meta-data search results:
– “hombres del paco” times 400
– not much else
Recommendation based search:
– Flamenco guitar and dancers
– Spanish and classical guitar
– Van Halen doing a classical/flamenco riff
Hypothetical Example
Want a navigational ontology?
Just put labels on a web page with traffic
– This gives A = users x label clicks
Remember viewing history
– This gives B = users x items
Cross recommend
– B’A = label to item mapping
After several users click, results are whatever users think they should be
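One way to picture the label-to-item mapping building up as users click is to accumulate B’A one user at a time, since B’A is the sum over users of (items viewed) x (labels clicked). The sketch below assumes click and view logs held as plain dictionaries; the function names and the sample data are hypothetical.

```python
from collections import defaultdict

def label_to_item_counts(label_clicks, item_views):
    """Accumulate the entries of B'A user by user.

    label_clicks: dict mapping user -> set of labels clicked (rows of A)
    item_views:   dict mapping user -> set of items viewed (rows of B)
    """
    counts = defaultdict(lambda: defaultdict(int))
    for user, labels in label_clicks.items():
        for label in labels:
            for item in item_views.get(user, ()):
                counts[label][item] += 1
    return counts

def items_for(label, counts, top_n=3):
    """Items most often viewed by users who clicked the label."""
    ranked = sorted(counts[label].items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]

# Hypothetical logs.
label_clicks = {"u1": {"guitar"}, "u2": {"guitar"}, "u3": {"dance"}}
item_views = {"u1": {"v1", "v2"}, "u2": {"v2"}, "u3": {"v3"}}

counts = label_to_item_counts(label_clicks, item_views)
print(items_for("guitar", counts))
print(items_for("dance", counts))
```

Each additional user adds only their own clicks and views to the counts, so the mapping behind a label shifts toward whatever the users who click it go on to view.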
Resources
Me
– tdunning@maprtech.com
– @ted_dunning
Slides and such:
– http://info.mapr.com/ted-paris-05-2012
The original paper
– Accurate Methods for the Statistics of Surprise and Coincidence
– (check on citeseer)
Source code
– Mahout project
– contact me