Item-Based Collaborative Filtering Recommendation Algorithms Ali Hamie Jatin Saluja

What is collaborative filtering?
● Most successful recommendation technique to date.
● It recommends an item, or predicts a rating, based on the opinions of other
like-minded users.
● Consists of a set of users, a set of items, and a set of opinions about the
items, such as ratings, reviews, or purchases.
Two Types of Collaborative Filtering Algorithms
● Memory-based collaborative filtering algorithms, which are user-based.
● Model-based collaborative filtering algorithms, which are item-based and are
the focus of the paper presented here.
Memory-based collaborative filtering
● This approach uses the entire user-item data set to form "neighborhoods"
of users.
● A neighborhood is a group of users who liked or disliked the same items.
● An algorithm such as user-based collaborative filtering is then used to
recommend items to a user based on what similar users liked.
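Challenges of User-Based CF Algorithms
● Sparsity: large commercial systems such as Amazon (books) and CDnow (music)
have very large catalogs, so each user rates only a small fraction of the items.
● Scalability: the computation needed by nearest-neighbor algorithms grows with
both the number of users and the number of items.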
Item based recommendation system
● Instead of looking at similar users, the item-based approach looks at the
items the user has rated and computes how similar they are to the target item,
using one of several similarity measures.
● It produces the k most similar items.
● A prediction algorithm is then run on the user's ratings of those similar
items to estimate the rating for the target item.
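The step of selecting the k most similar items can be sketched as follows. This is a minimal illustration, not the paper's implementation: the item names, the similarity values in `similarity`, and the helper names `sim` and `k_most_similar` are all hypothetical, and the similarities are assumed to be already computed.

```python
import heapq

# Hypothetical precomputed item-item similarities (values are made up).
similarity = {
    ("A", "B"): 0.97,
    ("A", "C"): 0.51,
    ("A", "D"): 0.80,
    ("B", "C"): 0.62,
    ("B", "D"): 0.75,
    ("C", "D"): 0.40,
}

def sim(i, j):
    """Look up the similarity of two items regardless of key order."""
    return similarity.get((i, j), similarity.get((j, i), 0.0))

def k_most_similar(target, items, k):
    """Return the k items most similar to `target`, most similar first."""
    candidates = [i for i in items if i != target]
    return heapq.nlargest(k, candidates, key=lambda i: sim(target, i))

print(k_most_similar("A", ["A", "B", "C", "D"], k=2))  # ['B', 'D']
```

Keeping only the top k neighbors per item is what lets the model stay small; the similarity measure itself is interchangeable.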
Cosine-based similarity
● A measure of the similarity between two items.
● Similarity is the cosine of the angle between two vectors. Here, items are
vectors in the m-dimensional user space.
● How does this work?
Example:
Consider the following texts:
1. Julie loves me more than Linda loves me.
2. Jane likes me more than Julie loves me.
Example...
Word counts for each text:
Word    Text 1   Text 2
me      2        2
Jane    0        1
Julie   1        1
Linda   1        0
likes   0        1
loves   2        1
more    1        1
than    1        1
Example...
a: [2, 1, 0, 2, 0, 1, 1, 1]
b: [2, 1, 1, 1, 1, 0, 1, 1]
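● These are the two 8-dimensional word-count vectors of the two texts. The order
of the words does not matter, as long as both vectors use the same order.
● The cosine of the angle between them is about 0.822.
The closer the value is to one, the more similar the two texts are.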
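The cosine for the two vectors above can be checked with a few lines of Python. This is only a sketch of the standard formula (dot product divided by the product of the vector lengths); the function name `cosine_similarity` is chosen here for illustration.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# The two word-count vectors from the example.
a = [2, 1, 0, 2, 0, 1, 1, 1]
b = [2, 1, 1, 1, 1, 0, 1, 1]

print(round(cosine_similarity(a, b), 3))  # 0.822
```

Here the dot product is 9 and the vector lengths are sqrt(12) and sqrt(10), so the result is 9 / sqrt(120).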
Correlation based similarity
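This similarity measure is based on how much the ratings given by common users to a pair of items deviate
from the average ratings of those items.
Let U denote the set of users who rated both item i and item j. The correlation similarity is then given by
$$\mathrm{sim}(i,j) = \frac{\sum_{u \in U} (R_{u,i} - \bar{R}_i)(R_{u,j} - \bar{R}_j)}{\sqrt{\sum_{u \in U} (R_{u,i} - \bar{R}_i)^2}\,\sqrt{\sum_{u \in U} (R_{u,j} - \bar{R}_j)^2}}$$
$R_{u,i}$ = rating of user u on item i
$\bar{R}_i$ = average rating of the i-th item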
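A small sketch of the correlation similarity follows. The ratings are made up, and one choice here is an assumption rather than something the slides specify: the item averages are taken over the co-rating users U only.

```python
from math import sqrt

# Hypothetical ratings: ratings[user][item] = rating on a 1-5 scale.
ratings = {
    "u1": {"i": 5, "j": 4},
    "u2": {"i": 3, "j": 2},
    "u3": {"i": 1, "j": 1},
    "u4": {"i": 4},  # did not rate j, so u4 is not in U
}

def correlation_similarity(ratings, i, j):
    """Pearson correlation of items i and j over users who rated both."""
    users = [u for u, r in ratings.items() if i in r and j in r]
    if not users:
        return 0.0
    # Assumption: item averages are computed over the co-rating users only.
    mean_i = sum(ratings[u][i] for u in users) / len(users)
    mean_j = sum(ratings[u][j] for u in users) / len(users)
    num = sum((ratings[u][i] - mean_i) * (ratings[u][j] - mean_j) for u in users)
    den_i = sqrt(sum((ratings[u][i] - mean_i) ** 2 for u in users))
    den_j = sqrt(sum((ratings[u][j] - mean_j) ** 2 for u in users))
    if den_i == 0 or den_j == 0:
        return 0.0  # no variation in one item's ratings, correlation undefined
    return num / (den_i * den_j)

print(round(correlation_similarity(ratings, "i", "j"), 3))  # 0.982
```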
Adjusted Cosine Similarity
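This similarity measure is a modified form of cosine-based similarity that takes into account the fact
that different users use different rating scales.
Some users rate items highly in general, while others tend to give lower ratings. To remove this
drawback of cosine-based similarity, we subtract each user's average rating from that user's ratings
for the pair of items in question:
$$\mathrm{sim}(i,j) = \frac{\sum_{u \in U} (R_{u,i} - \bar{R}_u)(R_{u,j} - \bar{R}_u)}{\sqrt{\sum_{u \in U} (R_{u,i} - \bar{R}_u)^2}\,\sqrt{\sum_{u \in U} (R_{u,j} - \bar{R}_u)^2}}$$
where $U$ is the set of users who rated both items, $R_{u,i}$ is the rating of user u on item i, and
$\bar{R}_u$ is the average of user u's ratings.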
Prediction Computation
Once we have built a model using one of the similarity measures described above,
we can predict the rating for any user-item pair with a weighted sum.
Prediction Computation...
First, we take all the items similar to our target item, and from those similar items, we pick items
which the active user has rated.
Then, we weight the user's rating for each of these items by the similarity between that item and the
target item.
Finally, we scale the prediction by the sum of similarities to get a reasonable value for
the predicted rating.
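The three steps above can be sketched as a short function. The item names, similarity values, and the name `predict_rating` are hypothetical. The sketch also assumes the scaling uses the sum of the absolute similarities, which is a common way to write the weighted sum.

```python
def predict_rating(user_ratings, similar_items):
    """Weighted-sum prediction for a target item.

    user_ratings:  item -> rating given by the active user
    similar_items: item -> similarity between that item and the target item
    """
    # Step 1: keep only the similar items that the active user has rated.
    rated = [(s, user_ratings[item])
             for item, s in similar_items.items()
             if item in user_ratings]
    # Step 3 (denominator): assumed to be the sum of absolute similarities.
    total = sum(abs(s) for s, _ in rated)
    if total == 0:
        return None  # no usable neighbors, so no prediction
    # Step 2: weight each rating by its similarity, then scale.
    return sum(s * r for s, r in rated) / total

# Hypothetical data: items similar to the target, and the user's ratings.
similar_items = {"B": 0.9, "C": 0.6, "D": 0.5}
user_ratings = {"B": 4, "C": 2}  # the user has not rated D

print(round(predict_rating(user_ratings, similar_items), 2))  # 3.2
```

Here the prediction is (0.9 * 4 + 0.6 * 2) / (0.9 + 0.6) = 4.8 / 1.5 = 3.2, which stays within the range of the user's own ratings.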
Advantages and Disadvantages
● More scalable than user-based CF
● Lower memory use
● Fewer sparsity problems
● Item similarities can be computed offline
● Disadvantage: computing item-item similarities takes a long time
Experimental Data
● MovieLens data set
● 100,000 ratings
● Matrix of users and items
● The data was read into an item hash table and a user hash table
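One way to build the two hash tables is sketched below with Python dictionaries. The sample lines are made up, and the tab-separated layout (user id, item id, rating, timestamp) is an assumption about the input file, not something stated on the slides.

```python
from collections import defaultdict

# Hypothetical sample in an assumed "user<TAB>item<TAB>rating<TAB>timestamp" layout.
sample = "1\t10\t4\t0\n1\t20\t3\t0\n2\t10\t5\t0\n"

def load_ratings(lines):
    """Build a user table (user -> item -> rating) and an item table (item -> user -> rating)."""
    by_user = defaultdict(dict)
    by_item = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        user, item, rating, _timestamp = line.split("\t")
        by_user[user][item] = int(rating)
        by_item[item][user] = int(rating)
    return by_user, by_item

by_user, by_item = load_ratings(sample.splitlines())
print(by_item["10"])  # {'1': 4, '2': 5}
```

The item table gives direct access to every rating of an item, which is what the item-item similarity computations iterate over; the user table serves the prediction step.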
Similarity algorithms, Mean Absolute Error
Model Size Sensitivity
Conclusion
● User-based CF
● Item-based CF
● Adjusted cosine similarity had lower error than the other similarity measures.
● Item-based CF gave better performance and efficiency than user-based CF.
References
● http://stackoverflow.com/questions/1746501/can-someone-give-anexample-of-cosine-similarity-in-a-very-simple-graphical-wa
● https://cran.rproject.org/web/packages/recommenderlab/vignettes/recommenderlab.pdf