Online Recommendations

Online Recommendations

T H E U V D E C O M P O S I T I O N A L G O R I T H M

Motivation





It is now common to get “personal recommendations” when we visit a website.



News articles





Product recommendations

Advertisements

Why ?





Unlike paper newspapers or brick and mortar stores, there is no limit [in terms of space/inventory] what can be shown or sold on a web-site..

Long-Tail effect



Large part of the income comes from the tail [example in search revenue]

The Netflix challenge











Netflix is a (online) US company from where people can rent movies

Netflix would like to recommend movies to users.

Netflix challenge (2006) – one million dollars prize who could beat their movie recommendation by 10%

After three years the prize was awarded..

We will discuss one of the “main ideas” behind the winning entry

 [We will follow the discussion from the “Mining of

Massive Data Sets”]

The Data

 Data can be arranged in the form…

M1 M2 M3 M4 M5 M6 M7

U1

U2 2

3

2

2

4

5

U3

U4

2

3

1 3

2

4

4

4

Users have given rating (between 1 and 5) from a database of 7 movie

..

What should be recommended to U2 ?

First ideas

 Do missing entries mean a rating of “0” ?

 How about simple dot product.

 Other ideas…?



Clustering



Association Rules ?

Another basic idea

 Let the user “u” rate movie “m” as follow:

 Take the average of the following two numbers.





Average of user’s “u” ratings.

Average of all ratings given to movie “m” by all users’ who rated “m”

 This was only 3% worse than than the Netflix algorithm (called CineMatch)

Latent Modeling

 A big breakthrough is the idea of “latent variable modeling”





The data we observe is a result of another variable or set of variables which are “latent” (not observable)…

The latent variables generate the observed data…



In our case…the latent variables control the generation of the ratings..

 So the challenge is to “infer” the latent variables from the observed data…



 clustering

Bayes Theorem

Cognitive Science/AI

 Scientists working in AI/Cognitive Science have drawn the following analogy..







Mind  Computer

Mental Representation  Programs/Theories

Thinking  Computational Process/Algorithms

 Practical Outcome: Infer Latent Structures

The Key Idea



Decompose the User x Rating matrix into:



User x Rating = ( User x Genre ) x (Genre x Movies)



Number of Genres is typically small

 Or



R =~ UV



Find U and V such that ||R – UV|| is minimized…



Almost like k-means clustering…why ?

Example of UV Decomposition

The criterion used to select U and V is the Root Mean Square Error (RMSE)

RMSE Example

ç

æ

è

1 2

3

÷

ö

ø

ç

æ

è

1 3

2 4

÷

ö

ø

1

3

(1

-

1)

2 +

(2

-

3)

2 +

(3

-

4)

2 =

1

3

0

+

1

+

1

=

1

3

2

Example..Continued

Results in an RMSE of 1.8

UV Computation

UV Computation..continued

Same calculation as before…..

UV Computation….

UV Decomposition

 The above process can be generalized to any entry…

 Continue the process until RMSE settles into a local optimal…

 So in spirit very similar to k-means..

References

 Mining of Massive Data Sets



Rajaraman, Leskovic, Ullman

Online Recommendations

Online Recommendations

ç

æ

è

÷

ö

ø

ç

æ

è

÷

ö

ø

Related documents

Products

Support

Online Recommendations

Online Recommendations

ç

æ

è

÷

ö

ø

ç

æ

è

÷

ö

ø

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib