Recommender Systems
Chapter 1
Sara Qassimi
sara.qassimi@uca.ac.ma
L2IS Laboratory, FST Marrakech, Cadi Ayyad University

R&D work driven by three dimensions
[Diagram: the Data Scientist role, with the IT Dimension, the Domain Dimension, and the Scientific Dimension]
Pr. Sara Qassimi FST- UCA

How to make research successful
● Research = answering a question systematically
[Diagram labels: Question, Research Strategy (how to get from question to answer), Answer; Interesting; Compelling]

Research process
1. Identifying the problem.
2. Reviewing literature.
3. Setting research questions, objectives, and hypotheses.
4. Choosing the study design.
5. Deciding on the sample design.
6. Collecting and processing data.
7. Analyzing and modeling data, incorporating ML and DL.
8. Discussing research findings.
9. Writing the report.

5W1H Questioning Technique
● A framework that you can use when gathering information and investigating a problem.
● An iterative interrogative technique for exploring the cause-and-effect relationships underlying a particular problem. It helps you understand a situation and discern a problem by analyzing each of its dimensions, one perspective at a time.
● It can help you expand a discussion, scope your research, and organize your findings and reports.

5W1H: Community questions
Who • The authors of the paper.
When • The publishing date.
Where • The location of the lab/team and/or where the paper has been published.

5W1H: Engineering questions
What • The problem statement (P)
Why • The added value/solution (S)
How • The proposed approach (A)

This table compares the existing articles (A1, A2, ..., An) according to the comparison criteria (C1, C2, ..., Cn), which are identified during the study of the existing works.

Article  C1  C2  C3  C4
A1
A2
A3
A4
A5
A6
A7

Motivation: The importance of personalization in the post-pandemic market
Consumers expect brands to demonstrate that they know them on a personal level:
● Suggest relevant product/service recommendations
● Offer me targeted promotions
● Make it easy for me to navigate in-store and online

Recommender System (RS)
● A recommender system, or recommendation system, is a subclass of information filtering systems that seeks to predict the “rating” or “preference” a user would give to an item.
● An RS helps users cope with information overload by suggesting relevant content [1].
● An RS helps users with four key features:
○ Decide: predict a rating for a user concerning an item;
○ Compare: rank a list of items in a personalized way for a user;
○ Explore: give items similar to a given target item;
○ Discover: provide a user with unknown items that will be appreciated.
[1] Qassimi S, Abdelwahed EH (2021) Towards a Semantic Graph-based Recommender System. A Case Study of Cultural Heritage. JUCS - Journal of Universal Computer Science 27(7): 714-733.

● Customers (users) buy products (items) they like or that are recommended by others they trust.
● Online stores offer hundreds of thousands of items, making it difficult for users to find the most suitable one.
● Recommendation systems help users search through items and find the most suitable ones.
● Content providers and social networking services also use recommendation systems to manage and personalize content for users.
● Recommendation systems act as an automated assistant for the user, suggesting not only the items the user asked for but also related items they might like.
● Recommendation systems are among the most popular machine learning services used in business for personalizing content.

● Recommendations should be relevant to the user, i.e., they should match the user's interests.
● Recommend items the user has not seen or used before and that are entirely different from what they have rated or liked before.
● Recommend items the user has not seen or used before but that are similar to the items they have rated or liked before.
● Recommend items that are dissimilar to each other and that cover different topics or genres.

The most common applications of RS
● E-commerce: Online shopping websites such as Amazon, eBay, and Alibaba use recommendation systems to suggest products to customers based on their purchase history, browsing behavior, and preferences.
● Streaming services: Video and music streaming platforms such as Netflix, YouTube, and Spotify use recommendation systems to suggest movies, TV shows, and songs to their users based on their viewing or listening history, ratings, and preferences.
● Social networks: Social networking sites such as Facebook, Twitter, and LinkedIn use recommendation systems to suggest friends, groups, and content to users based on their interests, activities, and social connections.
● Online advertising: Advertising platforms such as Google Ads (formerly AdWords) and Facebook Ads use recommendation systems to suggest relevant ads to users based on their search history, browsing behavior, and demographics.
● Travel and hospitality: Travel and hospitality websites such as Airbnb and TripAdvisor use recommendation systems to suggest accommodations, activities, and restaurants to their users based on their search history, ratings, and preferences.
● Healthcare: Healthcare providers and insurers use recommendation systems to suggest treatments, medications, and health plans to patients based on their medical history, symptoms, and demographics.

The key element in building an RS is data.
Explicit Data
● Explicit data is usually a number given by a user to an item (e.g. 5-star ratings, user feedback).
Implicit Data
● Implicit data captures user interactions with the available items, i.e. observable user behaviors such as the number of clicks on product links.
Item description data
● Item metadata. Pre-processing is necessary to extract relevant information from unstructured data, such as the list of cast members of a movie on Netflix.

Major issues
● Sparsity:
○ There is insufficient data to extract descriptive metadata, ratings, and contextual information about items.
● Cold Start:
○ A user or an item is new to the system and has insufficient ratings or records at the start.

Recommendation system generations
● 1st Generation: Content-Based, Collaborative Filtering, Hybrid
● 2nd Generation: Matrix Factorization, Web usage mining based
● 3rd Generation: Collaborative Filtering using Deep Learning, stochastic artificial neural network, Personality-based, other AI-based models

AI & RS: User Augmentation & Empowering
▪ Spectacular development of data-enabled systems and applications
▪ These make it possible to add value by enhancing the recommendations to empower the user's decision making

Interactive recommendation with voice feedback
● Provides personalized recommendations based on user preferences and engages in a natural conversation with users through voice interactions. These systems often employ natural language processing (NLP) and voice recognition technologies.

Book Recommendation
User: "Recommend a mystery novel with a strong female lead."
System: "Certainly. Are you looking for something recent or a classic?"
User: "I'd prefer a recent release."
System: "I suggest 'The Silent Witness' by Fiona Barton. Would you like to hear more about it?"

Voice-Enabled Shopping Assistant
User: "Help me find a stylish winter coat."
System: "Of course. What's your preferred color and size?"
User: "I like black, and I wear a medium."
System: "Great choice. Here are some black winter coats in medium sizes available in your favorite stores."

Image recommender system: mapping images and users
● Aims to recommend images to users based on their preferences, behavior, or specific needs. These systems can be used in various contexts, including e-commerce, social media, and content discovery.

Travel Destination Image Recommender
Platform: A travel planning website or app.
Functionality: Users can input their travel preferences, such as beach destinations, historical sites, or adventure travel. The system recommends images of destinations that match these preferences.

Event and Activity Recommender
Platform: An event discovery app.
Functionality: Users can upload images of events or activities they have attended. The system recommends similar events happening in the user's area based on visual cues from the images.

Profiling users from multimedia data
● Involves creating detailed user profiles based on their interactions with multimedia content, such as images, videos, and audio. This information can be used for various purposes, including content recommendation, targeted advertising, and personalization.

Image and Audio Profiling for Ad Targeting
Platform: Online advertising platforms.
Data Source: User interactions with ads, including clicks and comments.
Profiling: Analyzing the types of images and audio in ads that lead to user engagement and conversions; detecting sentiments and emotions in the user's comments; tailoring ad content to match user preferences.

Image-based Social Media Profiling
Platform: Social media such as Instagram.
Data Source: The user's uploaded images, liked images, and captions.
Profiling: Analyzing the content of images, such as objects, scenery, and people, to determine the user's interests; analyzing image captions and comments to understand the user's preferences and sentiments.

Neural collaborative filtering (NCF)
● An approach that combines neural networks with collaborative filtering techniques to build recommender systems. NCF models are particularly effective for recommendation tasks because they can capture complex user-item interactions and provide highly personalized recommendations.

Travel Destination Recommendation
Problem: Recommending travel destinations and experiences to travelers.
Application: Travel planning websites can use NCF to examine a user's past trips, preferences, and travel history to suggest new destinations, accommodations, and activities.

NCF (Neural Collaborative Filtering): Suitable for travel destination recommendation when highly personalized and accurate recommendations are needed and when non-linear user-destination interactions are significant. It excels in handling data sparsity and providing fine-grained personalization.
CF (Collaborative Filtering): Can be a simpler and more computationally efficient choice for travel destination recommendation, particularly when data is sparse. It relies on user similarity and is effective in situations where personalized recommendations are less critical. However, it may struggle to capture complex, non-linear relationships.

Deep Matrix Factorization (DeepMF)
● Extends conventional matrix factorization by introducing deep neural networks. It seeks to discover hidden user and item characteristics by decomposing a user-item interaction matrix into latent factor representations.

Deep Matrix Factorization (DeepMF): Job Recommendation and Restaurant or Food Recommendation
Job Recommendation
Problem: Recommending job listings to job seekers.
Application: On a job search platform like LinkedIn, DeepMF can analyze a user's job search history, skills, and career interests to recommend relevant job openings and networking opportunities.

Restaurant or Food Recommendation
Problem: Recommending restaurants, dishes, or recipes to users.
Application: On food delivery apps like Uber Eats, DeepMF can consider a user's order history, cuisine preferences, and dietary restrictions to suggest new restaurants and menu items.

Convolutional Neural Network (CNN)
● CNNs are particularly useful when the recommendation problem involves structured or visual data. They are primarily designed for image analysis tasks, but they can be adapted creatively for certain aspects of recommender systems, especially when dealing with visual or image-related recommendations.

Fashion Recommendation
Problem: Recommending clothing and fashion items to users based on their preferences.
Application: CNNs can be employed to analyze the visual attributes of clothing items, such as style, color, pattern, and texture. User preferences and browsing behavior can be combined with CNN-based image analysis to recommend outfits or individual fashion items.

Art and Photography Recommendation
Problem: Recommending artworks, photographs, or visual content to art enthusiasts.
Application: CNNs can analyze the visual features of artworks and photographs. Users' past interactions and preferences are considered, and recommendations are made based on the visual similarity of artworks and on users' historical preferences.

Recipe Recommendation
Problem: Recommending recipes to users based on their dietary preferences and visual appeal.
Application: In recipe apps or websites, CNNs can be used to assess the visual appeal of dishes based on images.
User dietary restrictions and past recipe choices can be combined with image analysis to recommend visually appealing recipes that align with users' preferences.

Home Decor and Interior Design
Problem: Recommending home decor items and interior design ideas to users.
Application: On interior design platforms and websites, CNNs can analyze images of furniture, decor, and interior designs. Users' style preferences and room layouts can be taken into account, and CNNs can help recommend visually appealing decor items and design inspirations.

Graph Neural Network (GNN)
● GNNs are a powerful tool for solving recommendation problems where data can be represented as graphs. GNNs can capture complex relationships and patterns within graph-structured data, making them suitable for various recommendation scenarios.

Social Network-Based Recommendation
Problem: Recommending content or connections in a social network.
Application: In social media platforms, GNNs can be applied to model the user-user interaction graph. By analyzing the connections, user behavior, and content interactions, GNNs can suggest relevant content, such as posts or articles, or new friends to users.

Collaborative Filtering with Knowledge Graphs
Problem: Recommending items or resources in a knowledge-based system.
Application: GNNs can be used to incorporate knowledge graphs into collaborative filtering. By connecting users, items, and their attributes in a knowledge graph, GNNs can make recommendations that consider both user-item interactions and domain-specific knowledge.

Cross-domain recommender system
● Employs transfer learning (TL) techniques, which aim to leverage knowledge learned in one domain to improve recommendation performance in another domain.

E-commerce Cross-Domain Recommendations
Problem: Recommending products to users in different e-commerce domains (e.g., electronics, fashion, home decor).
Application: Transfer learning (TL) can be applied to learn user preferences and behavior in one domain and adapt this knowledge to make recommendations in other domains. For instance, a user's interactions and purchase history in the electronics domain can be used to recommend fashion items based on shared preferences like brand affinity or price range.

Multilingual Movie and TV Show Recommendations
Problem: Recommending movies and TV shows across different languages.
Application: TL can be employed to create recommendation models in one language and then fine-tune or adapt these models to other languages. Users then receive personalized recommendations based on their historical interactions, regardless of their language preferences.

News and Content Aggregation
Problem: Recommending news and other content across platforms.
Application: Transfer learning can be used to analyze user interactions with news articles on one platform and apply the learned knowledge to recommend content on another platform. For instance, user preferences and click behavior in a news app can be transferred to suggest relevant blog posts or videos on a different website.

Adaptive Music Recommendation
Problem: Recommending music that adapts to users' changing moods and activities.
Application: Transfer learning can be used to extract features from a user's historical listening data and adapt music recommendations to different contexts. For example, what is learned from a user's preference for upbeat music during workouts can be transferred to recommend relaxing music during meditation.

Collective matrix factorization (CMF)
● A technique that combines matrix factorization with transfer learning to enhance recommendation systems.
It leverages information from multiple related domains or sources to improve recommendation quality.

Cross-Domain News Recommendation
Problem: Recommending news articles across different news publishers.
Application: CMF can factorize user-news interaction matrices for each publisher while sharing latent factors across publishers. Transfer learning allows the system to understand user interests and preferences regardless of the news source, leading to more diverse and personalized news recommendations.

Cross-Platform Social Media Recommendations
Problem: Recommending content or connections to users across multiple social media platforms.
Application: CMF can factorize user-content interaction matrices for each platform separately. Because latent factors are shared among these platforms, user preferences and content relationships learned on one platform can be applied to recommend content or connections on other platforms. This approach enhances cross-platform user experience and engagement.

User behaviour
● Reinforcement learning (RL) can be applied to recommender systems to model user behavior and provide personalized recommendations. In RL-based recommender systems, the recommender is typically treated as the agent, the user as the environment it interacts with, and their interactions as a sequential decision-making process.

User behaviour: Exploration vs. Exploitation and Click-Through Rate (CTR) Optimization
Exploration vs. Exploitation
Problem: Balancing exploration of new items with exploitation of known preferences.
Application: In e-commerce, music streaming, or video streaming platforms, RL can help decide when to recommend items that are similar to a user's past choices (exploitation) and when to introduce new or diverse items to encourage exploration. RL agents learn exploration-exploitation trade-offs that improve user satisfaction.
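
The exploration-exploitation trade-off described above can be illustrated with an epsilon-greedy rule, one of the simplest strategies of this kind. The following is a minimal sketch, not the method of any particular platform: the function names, the simulated click probabilities, and the value of `epsilon` are all assumptions made for illustration.

```python
import random


def epsilon_greedy_pick(clicks, shows, epsilon, rng):
    """Return the index of the item to recommend next."""
    n_items = len(shows)
    if rng.random() < epsilon:
        # Exploration: recommend any item, uniformly at random.
        return rng.randrange(n_items)
    # Exploitation: recommend the item with the best observed click rate.
    rates = [c / s if s > 0 else 0.0 for c, s in zip(clicks, shows)]
    return max(range(n_items), key=lambda i: rates[i])


def simulate(true_click_probs, rounds=5000, epsilon=0.1, seed=0):
    """Simulate a user whose (hypothetical) click probabilities are hidden from the system."""
    rng = random.Random(seed)
    clicks = [0] * len(true_click_probs)
    shows = [0] * len(true_click_probs)
    for _ in range(rounds):
        item = epsilon_greedy_pick(clicks, shows, epsilon, rng)
        shows[item] += 1
        if rng.random() < true_click_probs[item]:
            clicks[item] += 1
    return clicks, shows


if __name__ == "__main__":
    clicks, shows = simulate([0.05, 0.20, 0.60])
    for i, (c, s) in enumerate(zip(clicks, shows)):
        print(f"item {i}: shown {s} times, clicked {c} times")
```

With a small `epsilon`, most recommendations go to the item that currently looks best, while the occasional random pick keeps refining the estimates for the other items. A larger `epsilon` explores more at the cost of showing weaker items more often.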
Click-Through Rate (CTR) Optimization
Problem: Maximizing the click-through rate for recommended items.
Application: In online advertising or content recommendation systems, RL can model user interactions with recommended ads or articles. The agent (the system) learns to select items that maximize user clicks or engagement over time. User clicks are treated as rewards, and RL algorithms such as Q-learning or deep reinforcement learning (DRL) are used to optimize the recommendations.

Markov Decision Process (MDP)
● MDPs can be applied to recommender systems to model the decision-making process of recommending items to users. In this context, the recommender system acts as an agent that makes sequential decisions to optimize user satisfaction or other relevant objectives.

E-commerce Product Recommendation
Problem: Recommending products to maximize user engagement.
Application: In an e-commerce setting, the recommender system can model user behavior as an MDP. States represent user preferences, browsing history, and available products. Actions correspond to recommending specific products. Rewards can be defined based on user interactions (e.g., clicks, purchases). The MDP agent learns a policy that optimizes product recommendations to increase user engagement and conversion rates.

Game Recommendation
Problem: Recommending video games to maximize user engagement.
Application: An MDP can model user behavior as users explore and interact with different games. States include user gaming preferences and game availability. Actions correspond to recommending specific games. Rewards can be defined based on user engagement metrics (e.g., playtime, in-game achievements). The MDP agent learns a policy that maximizes user engagement by recommending games aligned with user interests.
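
The MDP formulation above (states, actions, rewards, and a learned policy) can be made concrete with tabular Q-learning, one of the RL algorithms named earlier. This is a toy sketch under stated assumptions: the state is taken to be the category the user clicked last, the action is the category recommended next, and the click probabilities, hyperparameters, and function names are hypothetical. A real system would need a far richer state and would not know the click probabilities.

```python
import random


def q_learning(click_prob, episodes=2000, steps=20,
               alpha=0.05, gamma=0.5, epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy recommendation MDP.

    State: the category the user clicked last.
    Action: the category recommended next.
    Reward: 1 if the simulated user clicks, else 0.
    """
    rng = random.Random(seed)
    n_states, n_actions = len(click_prob), len(click_prob[0])
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state = rng.randrange(n_states)
        for _ in range(steps):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                action = max(range(n_actions), key=lambda a: q[state][a])
            clicked = rng.random() < click_prob[state][action]
            reward = 1.0 if clicked else 0.0
            # Assumption: a click moves the user to the recommended category.
            next_state = action if clicked else state
            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """For each state, the action with the highest learned Q-value."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    # Hypothetical user: most likely to click on the category clicked last.
    click_prob = [[0.9, 0.1, 0.1],
                  [0.1, 0.9, 0.1],
                  [0.1, 0.1, 0.9]]
    print(greedy_policy(q_learning(click_prob)))
```

The learned policy maps each state to the recommendation with the highest estimated long-term reward, which is the "policy that optimizes recommendations" in the examples above, reduced to three categories.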

● Active learning is a machine learning approach that involves iteratively selecting and labeling the most informative data points to train a model.
● In the context of a recommender system, active learning can be used to improve the performance of the recommendation model by actively choosing which user-item interactions to label or acquire for training.

How an Active Learning Recommender System Works
01 Initial Model
● Start with an initial recommender system model trained on a small labeled dataset.
02 Query Strategy / selection criteria
● Define a query strategy or selection criteria to identify which user-item interactions to label next. The goal is to choose interactions that are expected to provide the most valuable information to improve the model.
03 User-Item Interaction Selection
● Use the query strategy to select a set of user-item interactions for which the model's predictions are uncertain or where there is potential for improvement. These are typically interactions that the model is unsure about or has low confidence in.
04 Labeling / Feedback
● Request labels or feedback from users for the selected interactions. This can involve asking users to rate items or provide explicit feedback on their preferences.
05 Model Update
● Incorporate the newly labeled data into the training dataset and retrain the recommender model.
06 Iteration
● Repeat steps 2-5 for a predefined number of iterations or until a certain performance criterion is met.

Active Learning Recommender System
E-commerce Product Recommendation
- The recommender system starts with an initial model trained on user-product interactions.
- The query strategy targets users who have shown interest in a variety of product categories but have not made many purchases.
- These users are asked to provide feedback or ratings on specific products.
- The model is updated to offer more personalized recommendations across a broader range of product categories.
News Article Recommendation
- The system begins with a baseline model trained on user-article interactions.
- The query strategy identifies users who have diverse interests and have interacted with both mainstream and niche articles.
- These users are prompted to rate or provide feedback on a set of suggested articles.
- The model learns to offer more personalized news recommendations across a wider spectrum of topics.

Ratings
● Thumbs up and down / Like and Dislike
● 5-star ratings
● Machine Learning:
○ Binary outcomes as classification
○ 5-star ratings as regression

How to recommend?
Simple approach: Just sort by average rating!
Refinement: consider how confident we are in that rating.

Confidence: Why sort by lower bound?
● Suppose 2 items have 4 stars on average.
● Item #1 has 3 ratings, and Item #2 has 100 ratings.
● Higher number of raters: higher lower bound.
● Popularity increases the score.
Because Item #2 has more ratings, in other words a bigger sample size, we are more confident in its average rating, so its confidence interval is narrow. Item #1's confidence interval is wide. If we used the upper bound, Item #1 would be ranked higher than Item #2.
[Figure: confidence intervals of Item #2 and Item #1]

Confidence Intervals
● Given a random variable X, we can calculate the distribution of its sample mean.
● The more samples are collected (N), the narrower its distribution.
X ~ N(μ, σ²)
This notation indicates that the random variable X follows a normal distribution with mean μ and variance σ². In simpler terms, the values of X tend to cluster around μ, and the spread or variability of the data is determined by σ².
Central Limit Theorem: a suitably normalized sum of independent, identically distributed random variables with finite variance converges to a normal distribution. So X bar (the sample mean) is approximately normally distributed. The more samples are collected, the more confident we should be in our estimate of the average.

More problems with average rating
What if N (the number of ratings) is very small, or 0?
Naive solution: Smoothing (or dampening)
The basic idea is to add a small number to the numerator and denominator so that if there are zero ratings, we default to some predefined value.
Note: 3 stars is mostly considered a bad rating.

Supervised Machine Learning for RS
● Inputs X and corresponding targets Y
● Y might represent:
○ Did the user buy the product?
○ Click on the ad?
○ Click on the article?
○ Sign up for the newsletter?
○ Make an account?
○ What did the user rate this item?
● If the model is accurate, then we can use it to recommend items to users.
● The user is more likely to buy, click on, or rate highly those recommended items.
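
Returning to ranking by average rating: the lower-bound ranking and the smoothing idea described above can be sketched as follows. This is a minimal illustration under assumptions of ours, not formulas taken from the slides: the interval uses the normal approximation with z = 1.96 (roughly 95%), the smoothed mean is written as (sum of ratings + strength × prior_mean) / (N + strength), and the names `prior_mean` and `strength` are hypothetical. For very small samples a t-distribution would be a more careful choice than the normal approximation.

```python
import math


def mean_and_interval(ratings, z=1.96):
    """Sample mean with a normal-approximation confidence interval.

    Returns (mean, lower, upper). Needs at least 2 ratings, because the
    sample standard deviation is undefined for fewer.
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least 2 ratings")
    mean = sum(ratings) / n
    variance = sum((r - mean) ** 2 for r in ratings) / (n - 1)
    half_width = z * math.sqrt(variance / n)
    return mean, mean - half_width, mean + half_width


def smoothed_average(ratings, prior_mean=3.0, strength=1.0):
    """Dampened mean: (sum + strength * prior_mean) / (N + strength)."""
    return (sum(ratings) + strength * prior_mean) / (len(ratings) + strength)


if __name__ == "__main__":
    item_1 = [3, 4, 5]                       # 3 ratings, mean 4
    item_2 = [3] * 25 + [4] * 50 + [5] * 25  # 100 ratings, mean 4
    for name, ratings in (("Item #1", item_1), ("Item #2", item_2)):
        mean, lower, upper = mean_and_interval(ratings)
        print(f"{name}: mean={mean:.2f} lower={lower:.2f} upper={upper:.2f}")
    print(smoothed_average([]))  # no ratings: falls back to prior_mean
```

Both items have the same mean, but Item #2 gets the higher lower bound because its interval is narrower, while Item #1 gets the higher upper bound. Sorting by the lower bound therefore favors the item with more evidence behind its average.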

Input Features
● Common features include demographics:
○ Age
○ Gender
○ Religion
○ Location
○ Race
○ Occupation
○ Education Level
○ Marital Status
○ Socioeconomic Status
● E.g. pseudo-SQL: SELECT product_id FROM products WHERE location = 'Morocco'

Input features
● Can include any data collected about the users:
○ When the user signed up
○ Which pages they viewed
○ Credit card history
○ Purchase history
● Data can also be purchased from other sites (e.g. surveys, tracking)
● Supervised model:
○ Logistic regression
○ Random Forest
● What about the item?

What about the item?
● Given a user's age and gender, the probability that the user will buy an iPhone is probably different from the probability that the user will buy cat food!
● A separate predictive model for each item!
● Problem: each model needs a lot of data per item.

Add item attributes as model inputs
A single machine learning model: Classification (like or dislike) | Regression (predict the user's rating)
If we can predict how a user is going to behave, then we can tailor the user experience and make recommendations accordingly.

Difficulty in Getting Data
● Privacy: ad and tracking blockers
● Item data: dependent on the vendor entering data correctly
● If free-form, lots of string parsing is needed

More flexible: latent variable models
● Instead of using explicit features like age, gender, etc., features are learned implicitly.
● Latent variable models: features are learned automatically ("hidden causes").
● They may not be interpretable or correspond to neatly defined concepts like age.
● But they are chosen to be optimal with respect to the model's training objective.
● We do not collect these features manually: (user, item, rating) is enough!
● Will revisit when we talk about Matrix Factorization.

There are two main types of recommender systems: personalized and non-personalized.
Recommender System
● Personalized: Content-based, Collaborative Filtering, Hybrid
● Non-Personalized: Popularity, PageRank
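
The single-model idea described earlier (user features and item attributes together as inputs, a buy/like label as target) can be sketched with a small logistic regression trained by gradient descent. This is an illustrative sketch only: the two binary features `user_high_budget` and `item_is_premium`, the toy data, and the hyperparameters are assumptions, and in practice a library implementation of logistic regression or a random forest would normally be used.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(rows, labels, lr=0.5, epochs=2000):
    """Batch gradient descent on the logistic loss; returns (weights, bias)."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            # Prediction error for this example drives the gradient.
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def predict_proba(weights, bias, x):
    """Predicted probability that the user buys the item."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Each row is [user_high_budget, item_is_premium]: one user feature, one item attribute.
# Hypothetical toy data: everyone buys basic items, only high-budget users buy premium ones.
ROWS = [[0, 0], [1, 0], [1, 1], [0, 1]]
LABELS = [1, 1, 1, 0]

if __name__ == "__main__":
    weights, bias = train_logistic_regression(ROWS, LABELS)
    low_budget_user = 0
    items = {"basic": 0, "premium": 1}
    # Rank items for one user by predicted purchase probability.
    ranked = sorted(
        items,
        key=lambda name: predict_proba(weights, bias, [low_budget_user, items[name]]),
        reverse=True,
    )
    print(ranked)
```

Because item attributes are inputs, one model scores every (user, item) pair, so there is no need for a separate model per item. Recommending then amounts to sorting the candidate items by predicted probability for the given user.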