Transcript Slides
Online Recommendations
THE UV DECOMPOSITION ALGORITHM
Motivation
It is now common to get “personal
recommendations” when we visit a website.
News articles
Product recommendations
Advertisements
Why ?
Unlike paper newspapers or brick and mortar stores, there is
no limit [in terms of space/inventory] what can be shown or
sold on a web-site..
Long-Tail effect
Large part of the income comes from the tail [example in
search revenue]
The Netflix challenge
Netflix is a (online) US company from where people can
rent movies
Netflix would like to recommend movies to users.
Netflix challenge (2006) – one million dollars prize who
could beat their movie recommendation by 10%
After three years the prize was awarded..
We will discuss one of the “main ideas” behind the
winning entry
[We will follow the discussion from the “Mining of
Massive Data Sets”]
The Data
Data can be arranged in the form…
M1
U1
M3
3
U2
2
U3
2
U4
M2
M4
M6
2
2
M7
5
4
1
3
M5
3
4
2
4
4
Users have given rating (between 1 and 5) from a database of 7 movie
..
What should be recommended to U2 ?
First ideas
Do missing entries mean a rating of “0” ?
How about simple dot product.
Other ideas…?
Clustering
Association Rules ?
Another basic idea
Let the user “u” rate movie “m” as follow:
Take the average of the following two numbers.
Average of user’s “u” ratings.
Average of all ratings given to movie “m” by all users’ who
rated “m”
This was only 3% worse than than the Netflix
algorithm (called CineMatch)
Latent Modeling
A big breakthrough is the idea of “latent variable
modeling”
The data we observe is a result of another variable or set of
variables which are “latent” (not observable)…
The latent variables generate the observed data…
In our case…the latent variables control the generation of the
ratings..
So the challenge is to “infer” the latent variables
from the observed data…
clustering
Bayes Theorem
Cognitive Science/AI
Scientists working in AI/Cognitive Science have
drawn the following analogy..
Mind Computer
Mental Representation Programs/Theories
Thinking Computational Process/Algorithms
Practical Outcome: Infer Latent Structures
The Key Idea
Decompose the User x Rating matrix into:
User x Rating = ( User x Genre ) x (Genre x Movies)
Number of Genres is typically small
Or
R =~ UV
Find U and V such that ||R – UV|| is minimized…
Almost like k-means clustering…why ?
Example of UV Decomposition
The criterion used to select U and V is the Root Mean Square Error (RMSE)
RMSE Example
æ 1 2 ö
ç
÷
3 ø
è
æ 1 3 ö
ç
÷
è 2 4 ø
1
1
1
2
2
2
(1-1) +(2-3) +(3- 4) =
0+1+1 =
2
3
3
3
Example..Continued
Results in an RMSE of 1.8
UV Computation
UV Computation..continued
Same calculation as before…..
UV Computation….
UV Decomposition
The above process can be generalized to any entry…
Continue the process until RMSE settles into a local
optimal…
So in spirit very similar to k-means..
References
Mining of Massive Data Sets
Rajaraman, Leskovic, Ullman