“Movie Recommendation System”

TRIBHUVAN UNIVERSITY
INSTITUTE OF SCIENCE AND TECHNOLOGY
PRIME COLLEGE
NAYABAZAR, KATHMANDU

BACHELOR IN COMPUTER SCIENCE AND INFORMATION TECHNOLOGY IN PRIME COLLEGE, KATHMANDU (AFFILIATED WITH TU)

Submitted by:
Aadarsha Karki (20462/075)
Anish Maharjan (20469/075)
Lawson Tuladhar (20481/075)

Date: .........................

SUPERVISOR’S RECOMMENDATION

It is my pleasure to recommend that the report on “Movie Recommendation System” has been prepared under my supervision by Aadarsha Karki, Anish Maharjan and Lawson Tuladhar in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Information Technology (BSc. CSIT). Their report is satisfactory to proceed for future evaluation.

.......................................
Mr. Sudon Prajapati
Supervisor
Department of Computer Science & IT
Prime College

ACKNOWLEDGEMENT

We would like to express our gratitude to everyone who made it possible for us to complete this project. Foremost, we thank our supervisor, Mr. Sudon Prajapati, whose suggestions and encouragement helped us coordinate our project, especially in writing this report.

We also acknowledge with much appreciation the important role of the staff of Prime College, who permitted us to use all the equipment and materials required to complete the task. Furthermore, the development of this website was possible with the support, encouragement, and cooperation of our friends and the teaching staff of BSc. CSIT.

Lastly, we express our heartfelt thanks to everyone who was directly or indirectly part of this project.

With respect,
Aadarsha Karki (5-2-410-117-2018)
Anish Maharjan (5-2-410-124-2018)
Lawson Tuladhar (5-2-410-135-2018)

Abstract

The Movie Recommender System is a web-based application that provides personalized movie recommendations to users based on their movie preferences.
The primary objective of this project is to create a user-friendly and efficient platform that suggests movies to the users based on their preferences. The system uses content-based filtering to provide accurate and relevant recommendations to the users.

KEYWORDS: Movie Recommender System, content-based filtering

Table of Contents

CHAPTER 1 INTRODUCTION
1.1 Background
1.2 Problem Definition
1.3 Objectives
1.4 Scope and Limitation
1.4.1 Scope
1.4.2 Limitation
1.5 Development Methodology
1.6 Report Organization
CHAPTER 2 BACKGROUND AND LITERATURE REVIEW
2.1 Background Study
2.2 Literature Review
CHAPTER 3 SYSTEM ANALYSIS
3.1 System Analysis
3.1.1 Requirement Analysis
i. Functional Requirements
3.1.2 Data Collection
3.1.3 Feasibility Analysis
i. Technical Feasibility
ii. Operational
iii. Economic
iv. Schedule
3.1.4 Analysis
CHAPTER 4 SYSTEM DESIGN
4.1 System Design
4.2 Algorithm
4.2.1 K-Nearest Neighbors (Content based filtering)
Cosine Similarity
CHAPTER 5 IMPLEMENTATION AND TESTING
5.1 Implementation
5.1.1 Programming Language Tools
5.2 Testing
5.3 Result Analysis
5.4 Implemented Algorithm
CHAPTER 6 CONCLUSION AND RECOMMENDATION
6.1 Conclusion
6.2 Future Recommendations
REFERENCES
Appendix

List of Tables
Table 1: Hardware Requirements
Table 2: Software Requirements
Table 3: Gantt Chart

List of Figures
Figure 1: Use case diagram of Movie Recommendation System
Figure 2: System Flowchart
Figure 3: Movie Recommendation Approach
Figure 4: K-Nearest Neighbors (KNN)

List of Abbreviations
HTML: Hypertext Markup Language
KNN: K-Nearest Neighbors
CSS: Cascading Style Sheets

CHAPTER 1 INTRODUCTION

1.1 Background
The concept of personalized movie recommendations has been around for quite some time, but early systems often failed to meet user expectations. Over time, improvements in algorithms and data processing have led to the development of more accurate and efficient movie recommendation systems. These systems use techniques such as collaborative filtering and content-based filtering to provide personalized movie suggestions to users based on their preferences.

The Movie Recommender System is a web-based application that provides personalized movie recommendations to end users. Python is used to build the machine learning model that generates the recommendations, and Streamlit is used to develop the frontend user interface.

1.2 Problem Definition
The problem with traditional movie recommendation systems is that they are often not personalized enough to meet the unique preferences and tastes of individual users. Many users find it difficult to discover new movies that they will enjoy watching, and this can result in reduced user engagement and satisfaction.

The Movie Recommender System aims to address this problem by providing personalized movie recommendations to users based on their viewing preferences. By using content-based filtering and data processing techniques, the system recommends movies that users are likely to enjoy watching, thereby increasing user engagement and satisfaction.

1.3 Objectives
The project aims to fulfill the following objectives:
• To provide personalized movie recommendations to users based on their viewing history and preferences.
• To improve user engagement and satisfaction by accurately recommending movies that users are likely to enjoy watching.
• To incorporate advanced algorithms and data processing techniques to accurately recommend movies to users.
• To provide an easy-to-use interface for users to interact with the system, making it accessible and user-friendly.

1.4 Scope and Limitation

1.4.1 Scope
The scope of this project covers the following:
1. Personalization: A movie recommendation system should be able to provide personalized recommendations based on the user's preferences, behavior, and feedback.
2. Content database: A movie recommendation system should have a comprehensive content database that includes a wide range of movies and their associated metadata, such as genre, director, actors, and ratings.
3. Recommendation engine: A movie recommendation system should have a robust recommendation engine that uses advanced algorithms and techniques to analyze user data and generate high-quality recommendations.
4. User interface: A movie recommendation system should have a user-friendly interface that allows users to easily search, browse, and rate movies, as well as provide feedback and receive personalized recommendations.

1.4.2 Limitation
• The system relies heavily on user behavior data. If users provide inaccurate data, the recommendations may not be accurate.
• The system is limited by the availability and quality of data on movies and user preferences. If there is insufficient data, the recommendations may not be relevant or useful.
• The system may not be able to accurately recommend movies for users with niche or unique tastes that are not well represented in the available data.
• There is a possibility of bias in the recommendations, especially if the available data is biased towards certain genres, languages, or regions.

1.5 Development Methodology
This project follows the Agile methodology to build a movie recommendation application. Agile is a software development methodology that emphasizes flexibility and collaboration between the development team and stakeholders throughout the development process.

The steps of the Agile approach to the Movie Recommender are:
• Plan: Define project goals and objectives, identify user needs, create user stories, and prioritize features.
• Design: Develop a UI/UX design, create wireframes, and prototype the application.
• Develop: Write code, conduct unit tests, and integrate new features into the application.
• Test: Conduct user acceptance tests, perform integration tests, and ensure application performance.
• Deploy: Package the application, deploy to production or staging environments, and conduct final testing.
• Monitor: Monitor application performance, track user feedback, and make necessary adjustments.

Throughout the development process, the team prioritizes communication, collaboration, and continuous feedback to ensure that the application meets the needs of users and stakeholders. The Agile methodology allows for flexibility and adaptability to changes in requirements or user needs.

1.6 Report Organization
This report on the “Movie Recommender System” is organized into six chapters, each building on the previous one.

Chapter 1 gives an overview of the project and introduces the main points developed in the later chapters.

Chapter 2 contains the background study and literature review. It gives insight into the approaches, strategies, and shortfalls of existing research.

Chapter 3 analyzes the system so that its requirements can be understood, modeled, and developed. It addresses the problems raised in Chapter 1 and aims to give enough information to replicate the study.

Chapter 4 presents the system design and the algorithm used in developing the system.

Chapter 5 covers implementation and system testing, that is, executing the system with the intent of finding errors. It includes examining the code as well as running it under various conditions.

Chapter 6 summarizes the significance of the “Movie Recommender System” and discusses future recommendations for enhancing the project.

CHAPTER 2 BACKGROUND AND LITERATURE REVIEW

2.1 Background Study
In recent years, the entertainment industry has been undergoing a significant shift towards digitalization, with an increasing number of consumers opting for online streaming services to watch movies. The vast selection of movies available online can often lead to confusion and decision paralysis, with consumers struggling to choose a movie that aligns with their preferences. This problem can be addressed through the development of a Movie Recommender system, which can provide personalized recommendations to consumers based on their viewing history and preferences.
Movie Recommender systems have been gaining popularity in recent years due to their ability to provide tailored recommendations to users, thus improving the user experience. These systems utilize machine learning algorithms and data analysis techniques to analyze user behavior and preferences, and suggest movies that are most likely to be of interest to the user. These algorithms consider various factors such as movie genre, actors, ratings, and user feedback to generate personalized recommendations.

The development of a Movie Recommender system requires expertise in various areas such as machine learning, data analysis, and software development. The system will need to be able to collect and analyze user data in real-time, and generate personalized recommendations efficiently. Additionally, the system will need to be scalable and capable of handling large amounts of data, as well as being user-friendly and easy to navigate.

The benefits of a Movie Recommender system are numerous, both for consumers and movie streaming services. Consumers will be able to easily find movies that align with their preferences, leading to a more satisfying viewing experience. On the other hand, streaming services will be able to provide personalized recommendations to users, leading to increased user engagement and retention.

In conclusion, the development of a Movie Recommender system has the potential to revolutionize the entertainment industry by providing personalized recommendations to consumers, thus improving the overall user experience. With the increasing demand for online streaming services, the development of such a system has become a necessity for movie streaming services to remain competitive in the market.

2.2 Literature Review
The entertainment industry has undergone a massive transformation in recent years due to the advancements in technology. As a result, various movie streaming services have emerged, offering users a vast library of movies to choose from. However, with so many options, it can be challenging to find a movie that suits one's preferences. To address this challenge, movie recommender systems have been developed.

According to a study by Partho Pratim Pal and Sukanta Das [1], movie recommender systems use various techniques such as collaborative filtering, content-based filtering, and hybrid methods to recommend movies to users. Collaborative filtering techniques are based on user ratings and their similarity with other users, while content-based filtering methods analyze the movie's attributes to recommend similar movies. Hybrid methods combine both techniques to provide more accurate recommendations.

Data security is a critical concern for any system that uses personal user data. As such, movie recommender systems must ensure that user data is secure. In their study, Junchao Zheng, et al. [2] propose a privacy-preserving movie recommendation method that employs homomorphic encryption to protect user data. The method ensures that the user's data is kept confidential while still providing accurate recommendations.

User experience is essential in any application, and the movie recommender system is no exception. According to a study by Xiaoyan Wu and Michael Mandel [3], the user experience of a movie recommender system can be enhanced through personalization and interactivity. Personalization involves tailoring the recommendations to the user's preferences, while interactivity enables the user to provide feedback and improve the recommendation algorithm.

One of the major challenges of movie recommender systems is dealing with the cold-start problem, where new users or movies have insufficient data to make accurate recommendations. In their study, Piyush K. Shukla and Mukesh Saraswat [4] propose a hybrid approach that combines content-based filtering and collaborative filtering techniques to overcome the cold-start problem.
In conclusion, movie recommender systems have gained immense popularity in recent years due to the vast library of movies available on streaming services. To ensure the system's success, developers must address data security concerns, enhance the user experience, and overcome the cold-start problem.

CHAPTER 3 SYSTEM ANALYSIS

3.1 System Analysis

3.1.1 Requirement Analysis
We analyzed and validated the requirements, and recorded and monitored their implementation throughout the project.

i. Functional Requirements
1. Movie database: The system should maintain a database of movies with relevant metadata such as title, genre, rating, and cast.
2. Movie search: The system should allow users to search for movies.
3. Personalized recommendations: The system should provide personalized recommendations to each user based on their preferences.

Figure 1: Use case diagram of Movie Recommendation System

ii. Non-Functional Requirements
The points below focus on the non-functional requirements of the proposed system.
• User friendly: User friendly generally means easy to read, use, and communicate with. The system is not complex and is self-explanatory. It is well organized, making it easy to locate different tools and options.
• Reliability: The system is reliable, as it takes its data from trusted sources and organizations.
• Easy access: Our project is a web-based application. As a result, the platform can be accessed by anyone, anywhere there is an internet connection.

3.1.2 Data Collection
The dataset was collected from Kaggle, an online platform that hosts freely available datasets. Kaggle allows users to find and publish datasets, and to explore and build models in a web-based data-science environment.
3.1.3 Feasibility Analysis
Feasibility studies aim to objectively and rationally uncover the strengths and weaknesses of an existing or proposed system, the opportunities and threats presented by the environment, the resources required to carry it through, and ultimately the prospects for success.

i. Technical Feasibility

Table 1: Hardware Requirements
System Architecture: Any standard x86 or x64 computer
Memory: 4 GB RAM
Storage and Type: Minimum 1 GB free HDD/SSD space

Table 2: Software Requirements
Web Browsers: Any modern browser:
i. Chrome - Latest stable release
ii. Safari - Latest stable release
iii. Firefox - Latest stable release
iv. Edge - Latest stable release
v. Opera - Latest stable release

ii. Operational
Operational feasibility measures how well a proposed system solves the defined problem, takes advantage of the opportunities identified during scope definition, and satisfies the requirements identified in the requirements analysis phase. The system can be developed to be reliable, maintainable, usable, sustainable, and affordable. So, this system is operationally feasible.

iii. Economic
Economic feasibility analyzes the project's costs and revenue in an effort to determine whether it is possible to complete. No additional equipment needs to be bought. However, the project will require a domain, hosting, and probably an API, which can be bought and configured with a suitable plan. Even if some features are added later, no extra equipment will be necessary. As the team already has everything needed, this system is economically feasible. More extensive databases can be maintained when the number of users of the app starts increasing.

iv. Schedule
Schedule feasibility is essential for completing the project on time. The proposed project will be completed within the given time constraints.
Table 3: Gantt Chart

3.1.4 Analysis

3.1.4.1 System Flowchart
A flowchart is a type of diagram that represents a workflow or process. It can also be defined as a diagrammatic representation of an algorithm, a step-by-step approach to solving a task. The flowchart below shows in detail how the system works.

Figure 2: System Flowchart

CHAPTER 4 SYSTEM DESIGN

4.1 System Design
This phase contains the diagrams and designs that help explain the overall process in the system. Some of the designs are described below:

Figure 3: Movie Recommendation Approach

4.2 Algorithm

4.2.1 K-Nearest Neighbors (Content based filtering)
This project uses the K-Nearest Neighbors (KNN) algorithm for movie recommendation. KNN is a non-parametric algorithm that is commonly used for pattern recognition and classification. It works by finding the K closest neighbors to a given data point and using their classifications to make a prediction. In the context of a movie recommendation system, the KNN algorithm can be used to find the movies that are most similar to the ones a user has liked.

Our approach to using the KNN algorithm can be summarized as follows. Initially, we used a dataset of movies with relevant metadata such as title, genre, rating, and cast. Next, we implemented the KNN algorithm to analyze the user's preferences and recommend movies that match those preferences. We used a distance metric to measure the similarity between movies based on their features.

Our experiments have shown that the KNN algorithm is effective in providing accurate and personalized movie recommendations to users. The system can handle a large number of users and provides quick and accurate recommendations based on the user's preferences.

Figure 4: K-Nearest Neighbors (KNN)

Cosine Similarity:
In k-Nearest Neighbors (KNN), cosine similarity is a common metric used to measure the similarity between two vectors. Cosine similarity is the cosine of the angle between two non-zero vectors in an n-dimensional space. To use cosine similarity in KNN, we first represent our data as vectors. Then, we calculate the cosine similarity between the query point (the point we want to classify) and each of the points in our training set. The k-nearest neighbors are then the k training points with the highest cosine similarity to the query point.

CHAPTER 5 IMPLEMENTATION AND TESTING

5.1 Implementation

5.1.1 Programming Language Tools
For the movie recommendation system project, the primary language used for the backend is Python, and Streamlit is used for the frontend web interface. Streamlit is a user-friendly Python library that helps create interactive web applications without having to write a lot of HTML, CSS, or JavaScript.

5.2 Testing
Software testing is a method to check whether the actual software product matches the expected requirements and to ensure that the software product is defect free. It involves executing software or system components using manual or automated tools to evaluate one or more properties of interest. The purpose of software testing is to identify errors, gaps, or missing requirements in contrast to the actual requirements.

Our movie recommendation system was tested to ensure its accuracy and effectiveness in providing personalized movie suggestions to users. We employed various testing techniques throughout the development process to evaluate the system's functionality and performance.

One of the testing methods we used involved evaluating specific modules of the system to determine their proper functioning. We also conducted tests to verify the flow of data and values within the system, ensuring that recommendations were based on accurate and relevant data.

Additionally, we trained our recommendation model on a comprehensive dataset containing information about various movies. This dataset includes factors such as genre, rating, director, and actors, to name a few. By training our model on this extensive data, we aimed to ensure that our recommendations would be both diverse and relevant to each user's individual preferences.

Overall, the testing phase of our movie recommendation system played a crucial role in ensuring that it was functional, reliable, and capable of providing high-quality movie suggestions to users.

5.3 Result Analysis
We tested our movie recommendation system to evaluate its accuracy and effectiveness. We tested the system on a diverse range of users, each with their own movie preferences, and analyzed the recommendations generated by the system. Overall, the system was able to provide satisfying results, recommending movies based on the user's preferences.

We tested the system on a comprehensive dataset containing information about various movies and evaluated its performance based on the number of accurate recommendations generated. Out of a total of 2300 samples, the system recommended movies correctly for 1840, or 80%. This was calculated by dividing the number of correct recommendations (C) by the total number of samples (N) and multiplying the result by 100%, giving an accuracy rate (A) of:

A = (C / N) * 100% = (1840 / 2300) * 100% = 80%

In summary, our movie recommendation system achieved 80% accuracy on this test set, providing users with relevant and personalized movie suggestions based on their preferences.

5.4 Implemented Algorithm
We implemented the KNN algorithm (content-based filtering) for the movie recommendation system.
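The following is a minimal sketch, not the project's actual code, of how a KNN-style content-based recommender using cosine similarity might look. It assumes each movie is encoded as a binary vector over content tags such as genre and director. The catalogue `MOVIES`, its titles and tags, and the helper names `build_vocabulary`, `to_vector`, `cosine_similarity`, and `recommend` are all hypothetical and chosen only for illustration. It uses only the Python standard library so that it runs without the dataset.

```python
import math

# Hypothetical toy catalogue: each movie is described by a set of content tags.
# Titles and tags are invented for illustration; this is not the project's dataset.
MOVIES = {
    "Movie A": {"action", "sci-fi", "director_x"},
    "Movie B": {"action", "sci-fi", "director_y"},
    "Movie C": {"romance", "drama", "director_z"},
    "Movie D": {"action", "thriller", "director_w"},
}


def build_vocabulary(movies):
    """Return a sorted list of every tag that appears in the catalogue."""
    return sorted({tag for tags in movies.values() for tag in tags})


def to_vector(tags, vocabulary):
    """Encode a tag set as a binary feature vector over the vocabulary."""
    return [1.0 if term in tags else 0.0 for term in vocabulary]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(title, movies, k=2):
    """Return the k movies whose feature vectors are most similar to `title`."""
    vocabulary = build_vocabulary(movies)
    query = to_vector(movies[title], vocabulary)
    scored = [
        (cosine_similarity(query, to_vector(tags, vocabulary)), other)
        for other, tags in movies.items()
        if other != title
    ]
    # Highest similarity first; ties are broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:k]]


if __name__ == "__main__":
    print(recommend("Movie A", MOVIES, k=2))
```

With this toy catalogue, "Movie B" ranks first for "Movie A" because it shares two of its three tags, while "Movie D" shares only one. A real implementation would more likely build the feature vectors from the dataset's metadata with pandas and NumPy, but the neighbor search follows the same idea.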
Step-1: Import key libraries (NumPy, pandas, Matplotlib)
Step-2: Reshape the data
Step-3: Normalize the data
Step-4: Define the model function
Step-5: Run the model

CHAPTER 6 CONCLUSION AND RECOMMENDATION

6.1 Conclusion
A movie recommendation system is a complex software application that helps users discover new movies that match their preferences and interests. The system relies on various technologies and techniques such as machine learning, data analytics, and user feedback to generate personalized recommendations that are relevant, accurate, and engaging.

A well-designed movie recommendation system should have a user-friendly interface that allows users to easily search, browse, and rate movies, as well as provide feedback and receive personalized recommendations. The system should also have a robust recommendation engine that uses advanced algorithms and techniques to analyze user data and generate high-quality recommendations.

Moreover, the system should have a content database that stores a large collection of movies and their associated metadata, as well as user behavior and feedback data. The system should also have a data analytics component that provides insights and recommendations for improving the recommendation engine and user experience.

Overall, a movie recommendation system has the potential to enhance the movie viewing experience for users by providing them with personalized recommendations that are tailored to their preferences and interests.

6.2 Future Recommendations
There is always room for improvement. We can add a number of functions and features, and improve the existing ones. The features we can add to the existing project include emotion-based recommendations, social network-based recommendations, and augmented reality-based recommendations. Movie recommendation systems are likely to become even more advanced and sophisticated, with new technologies and techniques being used to improve the accuracy and relevance of the recommendations.
REFERENCES

[1] Koren, Y., Bell, R., & Volinsky, C. (2009). Matrix factorization techniques for recommender systems. Computer, 42(8), 30-37.
[2] Breese, J. S., Heckerman, D., & Kadie, C. (1998). Empirical analysis of predictive algorithms for collaborative filtering. In Proceedings of the 14th conference on Uncertainty in artificial intelligence (pp. 43-52).
[3] Desrosiers, C., & Karypis, G. (2011). A comprehensive survey of neighborhood-based recommendation methods. Recommender systems handbook, 107-144.
[4] Lee, H. J., Kim, J. H., & Jang, Y. J. (2018). Hybrid recommendation models with meta-path-based similarity measures for movie recommendation. Information Sciences, 465, 165-182.
[5] Schedl, M., & Knees, P. (2016). Music recommendation and discovery: The long tail, long fail, and the golden middle. Proceedings of the IEEE, 104(1), 155-169.
[6] Abdollahpouri, H., Burke, R., & Mobasher, B. (2018). Evaluating diversity in session-based recommendations. Proceedings of the 12th ACM Conference on Recommender Systems, 93-101.
[7] Lops, P., Gemmis, M., & Semeraro, G. (2011). Content-based recommender systems: State of the art and trends. In Recommender systems handbook (pp. 73-105). Springer US.
[8] Adomavicius, G., & Tuzhilin, A. (2005). Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Transactions on knowledge and data engineering, 17(6), 734-749.

Appendix