Recommender Systems

In many cases, users are faced with a wealth of products and information from which they can choose. To alleviate this, many web sites help users by means of recommender systems:
• The user is shown a list of items or pages that are likely to interest them
• Once the user makes a choice, a new list can be presented

What data is used to make the recommendations?
• Explicit feedback
  - Ratings
  - Reviews
  - Auctions
• Implicit feedback
  - Page visits
  - Purchase data
  - Browsing paths

What are the types of recommendations?
• Item-to-item associations
  - Pages similar to this one
  - "Users who bought this book also bought X"
• User-to-user associations
  - Which other user has similar interests?
• User-to-item associations
  - Rating history describes the user
  - Items are described by attributes
  - Items are described by ratings of other users

Classification of Recommender Systems: Content-based approach
• An item is described by a set of attributes
  - Movies: e.g. director, genre, year, actors
  - Documents: bag-of-words
• A similarity metric defines the relationship between items, e.g. cosine similarity
• Examples
  - "Related pages" in a search engine
  - Google News

Related Approaches (content-based)
• Mooney and Roy (2000)
  - Their approach comes from the Information Retrieval (IR) field
  - They rely on the content of the items and use a similarity score to match items based on that content
• Burke (2000)
  - Also uses content-based recommendation, but additionally allows the user to enter explicit information about his or her preferences
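The content-based idea above (documents as bag-of-words vectors, compared with cosine similarity) can be sketched in a few lines. This is only an illustration: the helper names `bag_of_words`, `cosine_similarity`, and `most_similar`, as well as the three item descriptions, are hypothetical and not taken from the slides or the cited papers.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Turn a text into a term-frequency vector (a Counter)."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty description is similar to nothing
    return dot / (norm_a * norm_b)

def most_similar(target, items, k=2):
    """Rank the other items by similarity to `target`, highest first."""
    scores = [(name, cosine_similarity(items[target], vec))
              for name, vec in items.items() if name != target]
    return sorted(scores, key=lambda s: s[1], reverse=True)[:k]

# Hypothetical item descriptions (illustrative only)
docs = {
    "doc_a": "space mission launch rocket",
    "doc_b": "rocket launch delayed by weather",
    "doc_c": "recipe for tomato soup",
}
items = {name: bag_of_words(text) for name, text in docs.items()}
ranking = most_similar("doc_a", items)
```

Here `ranking` places `doc_b` ahead of `doc_c`, because only `doc_b` shares words with `doc_a`. A real system would typically weight terms (for example with TF-IDF) rather than use raw counts.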
Types of Recommender Systems: Collaborative filtering
• An item is described by user interactions
• Matrix V has n rows (number of users) and m columns (number of items)
• The elements of V are the user feedback. Examples:
  - Rating given to the item by each user
  - Users who viewed this item
• A similarity metric is defined between items

Related Approaches: Collaborative Filtering
• Uses historical data gathered from other users to make the recommendation
• Example: a user who wants to rent a movie tends to rely on friends to recommend items that they have liked
• The goal is to identify those users whose taste is predictive of the taste of a certain person, and to use their recommendations to construct an interesting list for that person

Collaborative Filtering Models
• Memory based
  - Neighborhood models
  - Latent factor models
• Model based
  - Classification
  - Bayesian networks
  - Association rules

Memory Based Approaches
• Work directly with the user data
• Given a user, the system finds the most similar users to make a recommendation
• There are two approaches:
  - Neighborhood
  - Latent factor

Neighborhood Approach
• An item-oriented approach: it evaluates the preference of a user for an item based on ratings of similar items by the same user
• Users are transformed to item space by viewing them as baskets of rated items
• There is no longer a need to compare users to items; items are related directly to items
• Pros: relies on a few significant neighborhood relations; effective at detecting very localized relationships
• Cons: ignores the vast majority of ratings by a user; unable to capture the totality of weak signals in all of a user's ratings

Latent Factor Models
• Transform both items and users to the same latent factor space, thus making them directly comparable
• The latent space tries to explain ratings by characterizing both products and users on factors automatically inferred from user feedback
• Pros: effective at estimating overall structure that relates simultaneously to most or all items
• Cons: poor at detecting strong associations among a small set of closely related items

[Figure: an example ratings matrix approximated (~) by the product of two low-rank factor matrices]

Singular Value Decomposition
• Decompose the ratings matrix R into a coefficients matrix U_K S_K and a factors matrix V_K such that

$$J = \left\| R - U_K S_K V_K^T \right\|^2 = \sum_{i=1}^{N} \sum_{j=1}^{M} \left( R_{ij} - \left[ U_K S_K V_K^T \right]_{ij} \right)^2$$

  is minimized
• U = eigenvectors of R R^T (N x N)
• V = eigenvectors of R^T R (M x M)
• S = diag(σ_1, …, σ_M), where the σ_i are the singular values of R (the square roots of the eigenvalues of R^T R)

$$\underbrace{\begin{pmatrix} r_{11} & \cdots & r_{1M} \\ \vdots & & \vdots \\ r_{N1} & \cdots & r_{NM} \end{pmatrix}}_{N \times M} \approx \underbrace{\begin{pmatrix} w_{11} & \cdots & w_{1K} \\ \vdots & & \vdots \\ w_{N1} & \cdots & w_{NK} \end{pmatrix}}_{N \times K} \underbrace{\begin{pmatrix} v_{11} & \cdots & v_{1M} \\ \vdots & & \vdots \\ v_{K1} & \cdots & v_{KM} \end{pmatrix}}_{K \times M}$$

Challenges in Collaborative Filtering
• User cold-start problem: not enough is known about a new user to decide who is similar (and perhaps there are no other users yet)
• Sparsity: when recommending from a large item set, users will have rated only some of the items, which makes it hard to find similar users
• Scalability: with millions of users and items, computations become slow
• Item cold-start problem: ratings for a new item cannot be predicted until some similar users have rated it (this is not a problem for content-based approaches)

Related Approaches: Srebro & Jaakkola (2003), Weighted SVD

$$J = \sum_{i=1}^{N} \sum_{j=1}^{M} W_{ij} \left( R_{ij} - \left[ U_K S_K V_K^T \right]_{ij} \right)^2$$

• Binary weights
  - W_ij = 1 means the element is observed
  - W_ij = 0 means the element is missing
• Positive weights
  - Weights are inversely proportional to noise variance
  - Allow for sampling density, e.g.
elements are actually sample averages from counties or districts

Related Approaches: SVD with Missing Values
• Uses expectation maximization (EM) to calculate the approximation of the matrix
  - E step: fills in the missing values of the ratings matrix with the values of the low-rank approximation matrix
  - M step: computes the best approximation matrix in Frobenius norm
• Local minima exist for weighted SVD

Related Approaches: Agarwal (2009), Regression-Based Latent Factor Models
• Presents a regression-based factor model that regularizes and deals with both cold-start and warm-start in a single framework
• It takes advantage of other users' ratings and of item and user features to predict the missing ratings

Model Based Approaches
• User data is compressed into a predictive model
• Instead of using ratings directly, develop a model of user ratings
• Use the model to predict ratings for new items
• To build the model:
  - Bayesian network (probabilistic)
  - Clustering (classification)
  - Rule-based approaches (e.g., association rules between co-purchased items)

Related Approaches: Stern (2009), Large Scale Online Bayesian Recommender
• Integrates collaborative filtering with content information
• Users and items are compared in the same space
• Flexible feedback model
• Bayesian probabilistic approach
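The EM procedure for SVD with missing values described earlier in this section can be sketched with NumPy. This is a minimal sketch under stated assumptions, not the cited authors' implementation: the function names `low_rank_approx` and `em_svd`, the mean-fill initialization, the fixed iteration count, and the small ratings matrix (where 0 marks a missing rating) are all hypothetical choices made for illustration.

```python
import numpy as np

def low_rank_approx(X, k):
    """Best rank-k approximation of X in Frobenius norm (truncated SVD)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]

def em_svd(R, observed, k, n_iter=100):
    """Approximate R with a rank-k matrix using only the observed entries.

    R        : 2-D float array of ratings (values at missing positions are ignored)
    observed : boolean mask, True where a rating is known
    k        : rank of the approximation
    """
    # Assumed initialization: fill missing entries with the mean observed rating
    X = np.where(observed, R, R[observed].mean())
    for _ in range(n_iter):
        approx = low_rank_approx(X, k)     # M step: best rank-k fit to the filled matrix
        X = np.where(observed, R, approx)  # E step: refill missing entries from the fit
    return low_rank_approx(X, k)

# Hypothetical 5 users x 4 items ratings matrix; 0 marks a missing rating
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
], dtype=float)
observed = R > 0
filled = em_svd(R, observed, k=2)
```

The entries of `filled` at the missing positions serve as predicted ratings. Each iteration cannot increase the squared error on the observed entries, but, as noted above, the procedure may stop in a local minimum, so the result can depend on the initialization.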
Value of the Recommendation
• Many considerations are taken into account to build the list of recommendations:
  - The likelihood of a recommendation being accepted by the user
  - The immediate value to the site
  - The long-term implications of the recommendations for the user's future choices
• Example: suggest a video camera that is accepted with probability 0.5, or a VCR that is accepted with probability 0.6
  - In the short term, recommending the video camera is less profitable than recommending the VCR
  - In the long term it might be more profitable (the camera has accessories that are likely to be purchased, whereas the VCR does not)

Sequential Nature of the Recommendation Process
• The recommender system suggests items to the user
• The user can accept one of the items offered, or none
• A new list of items is calculated based on the user's past ratings

Markov Decision Process (MDP)
• An MDP is a model for stochastic decision problems
• An MDP is a four-tuple (S, A, Rwd, tr), where S is a set of states, A is a set of actions, Rwd is the reward associated with each state/action pair, and tr is the transition function, which gives for each state and action the probability of moving to each next state
• The goal is to behave so as to maximize the total reward
• The optimal solution π is a policy specifying which action to perform in each state
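The four-tuple (S, A, Rwd, tr) and the camera/VCR example above can be combined into a toy MDP. The sketch below is illustrative only: the acceptance probabilities 0.5 and 0.6 come from the example, while the states, the profits (100 per device, 50 per accessory), the accessory acceptance probability 0.8, the discount factor `gamma`, and all function names are hypothetical assumptions. It is solved here with a simple policy iteration (evaluate the current policy, then improve it greedily).

```python
# Hypothetical toy MDP. Only the acceptance probabilities 0.5 (camera) and
# 0.6 (VCR) come from the example; every other number is made up.
S = ["start", "owns_camera", "owns_vcr", "done"]
A = {
    "start": ["rec_camera", "rec_vcr"],
    "owns_camera": ["rec_accessory"],
    "owns_vcr": ["rec_accessory"],
    "done": ["stop"],
}
# Rwd[(s, a)]: expected immediate profit of taking action a in state s
Rwd = {
    ("start", "rec_camera"): 0.5 * 100,
    ("start", "rec_vcr"): 0.6 * 100,
    ("owns_camera", "rec_accessory"): 0.8 * 50,
    ("owns_vcr", "rec_accessory"): 0.0,  # the VCR has no accessories
    ("done", "stop"): 0.0,
}
# tr[(s, a)]: probability of each next state
tr = {
    ("start", "rec_camera"): {"owns_camera": 0.5, "done": 0.5},
    ("start", "rec_vcr"): {"owns_vcr": 0.6, "done": 0.4},
    ("owns_camera", "rec_accessory"): {"done": 1.0},
    ("owns_vcr", "rec_accessory"): {"done": 1.0},
    ("done", "stop"): {"done": 1.0},
}

def q_value(s, a, V, gamma):
    """Immediate reward plus discounted expected value of the next state."""
    return Rwd[(s, a)] + gamma * sum(p * V[s2] for s2, p in tr[(s, a)].items())

def evaluate(policy, gamma, sweeps=200):
    """Iteratively compute the value of following `policy` from each state."""
    V = {s: 0.0 for s in S}
    for _ in range(sweeps):
        V = {s: q_value(s, policy[s], V, gamma) for s in S}
    return V

def policy_iteration(gamma):
    """Alternate policy evaluation and greedy improvement until stable."""
    policy = {s: A[s][0] for s in S}  # arbitrary initial policy
    while True:
        V = evaluate(policy, gamma)
        improved = {s: max(A[s], key=lambda a: q_value(s, a, V, gamma)) for s in S}
        if improved == policy:
            return policy, V
        policy = improved

far_sighted, V_far = policy_iteration(gamma=0.9)
myopic, V_myopic = policy_iteration(gamma=0.0)
```

With these made-up numbers, the myopic policy (`gamma=0.0`) recommends the VCR in the `start` state, while the far-sighted policy (`gamma=0.9`) recommends the camera because of the later accessory sale. Different assumed profits could reverse that outcome.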
Markov Decision Process (MDP)
• The value function V^π of the policy π is defined as:

$$V^{\pi}(s) = Rwd(s, \pi(s)) + \gamma \sum_{s' \in S} tr(s, \pi(s), s') \, V^{\pi}(s')$$

  where γ is a discount factor
• The optimal value function V* is defined as:

$$V^{*}(s) = \max_{a \in A} \left[ Rwd(s, a) + \gamma \sum_{s' \in S} tr(s, a, s') \, V^{*}(s') \right]$$

• To find the optimal policy π* and its corresponding value function V*:
  - Search the space of possible policies, starting with an initial policy π_0(s)
  - At each step, compute the value function based on the former policy and update the policy based on the new value function

Temporal Dynamics in the Recommendations
• Item-side effects:
  - Product perception and popularity are constantly changing
  - Seasonal patterns influence items' popularity
• User-side effects:
  - Customers continually redefine their taste
  - Transient, short-term bias; anchoring
  - Drifting rating scale
  - Change of rater within a household

Temporal Dynamics: Challenges
• Multiple sources: both items and users are changing over time
• Multiple targets: each user/item forms a unique time series, with scarce data per target
• Inter-related targets: signal needs to be shared among users (the foundation of collaborative filtering), so the multiple problems cannot be isolated from one another

Time Sensitive Recommenders: Koren (2009), Collaborative Filtering with Temporal Dynamics
• Uses factor models to separate different aspects of the ratings in order to observe changes in:
  1. Rating scale of individual users
  2. Popularity of individual items
  3. User preferences

Recommender Systems with Social Networks
• Use the interaction of the user with others to make recommendations
• Motivation: social influence, i.e. users adopt the behavior of their friends
• Challenge: how do we define influence between users?

Recommender Systems with Social Networks: Preliminary Approaches
• Jamali & Ester (2009), TrustWalker: A Random Walk Model for Combining Trust-based and Item-based Recommendation
  - Explores the trust network to find raters
  - Aggregates the ratings from raters for prediction
  - Different weights for users

Open Challenges
• Transparency
• Exploration versus exploitation
• Convince a user to accept a recommendation
• Help a user make a good decision
• Help a user fit a goal or mood
• Cold-start problems (for new items and for new users)
• Choosing what questions to ask users
• Trade-off between optimizing for this user vs. for all users
• How can meta-data on the user or item help?
• Guided navigation
  - Providing a guide over a vast body of content
  - Detecting the user's intent
• Time value
  - Does the value of user input decay with time?
  - Do items change in relevance with time?
  - How to adjust for recent user experience?
• Evaluation of the recommender's performance
• Scalability
• Combining content and collaborative recommenders efficiently