Table of Contents

S. No.  Topic
1. Certificate
2. Acknowledgements
3. Abstract
4. List of Figures
5. List of Tables
6. Synopsis
7. CHAPTER 1 – INTRODUCTION
   1.1 Description of the Topic
   1.2 Problem Statement
   1.3 Objectives
   1.4 Scope of the Project
   1.5 Project Planning Activities
8. CHAPTER 2 – LITERATURE REVIEW
9. CHAPTER 3 – SYSTEM DESIGN AND METHODOLOGY
   3.1 System Design
   3.2 Algorithm Used
10. CHAPTER 4 – IMPLEMENTATION AND RESULTS
   4.1 Hardware and Software Requirements
   4.2 Implementation Details
   4.3 Results
11. CHAPTER 5 – CONCLUSION AND FUTURE WORK
   5.1 Conclusion
   5.2 Future Scope
12. References

Certificate

I, Krishna Gupta (08013702021), certify that the Summer Training Report (BCA) entitled "Predictive Analysis of Fixed Deposit User Engagement using Machine Learning and Data Science Tools" is done by me and is an authentic work carried out by me at "Internshala Training". The matter embodied in this project work has not been submitted earlier for the award of any degree or diploma, to the best of my knowledge and belief.

Signature of the Student
Date:

Certified that the Project Report (BCA-356) entitled "Predictive Analysis of Fixed Deposit User Engagement using Machine Learning and Data Science Tools" done by the above student is completed under my guidance.

Signature of the Guide:                         Signature of the Guide:
Date:                                           Date:
Name of the Guide: Ms. Anjum Rathi              Name of the Guide: Ms. Suman Singh
Designation: Assistant Professor                Designation: Assistant Professor

Countersigned
HOD, Computer Science

ACKNOWLEDGEMENT

I would like to acknowledge the contributions of the people without whose help and guidance this report would not have been completed. I acknowledge the counsel and support of my training guide, Ms. Suman Singh, Assistant Professor, CS Department, with respect and gratitude, whose expert guidance, support, encouragement, and enthusiasm have made this report possible. Her feedback vastly improved the quality of this report and made the work an enthralling experience. I am indeed fortunate to be supported by her.

I am also thankful to Prof. (Dr.) Ganesh Wadhwani, H.O.D. of the Computer Science Department, Institute of Technology & Management, New Delhi, for his constant encouragement, valuable suggestions, moral support, and blessings.

I shall ever remain indebted to the faculty members of the Institute of Technology & Management, New Delhi, for their persistent support and cooperation extended during this work.

This acknowledgement would remain incomplete if I failed to express my deep sense of obligation to my parents and God for their consistent blessings and encouragement.

Krishna Gupta
08013702021

Abstract

In today's dynamic financial landscape, where banks and financial institutions strive to maintain and expand their customer base, understanding and enhancing user engagement has become imperative. This project endeavors to harness the power of machine learning and data science tools to predict and optimize user engagement in fixed deposit services, a critical offering in the realm of financial products.

Fixed deposits represent a long-term financial commitment for customers, and predicting their engagement patterns is a multifaceted challenge. Leveraging advanced analytics techniques, this project aims to decipher the factors influencing user engagement, such as customer demographics, transaction history, and previous interactions with fixed deposit accounts.

The project's objectives encompass the entire data science lifecycle. It commences with data collection and preprocessing, followed by feature engineering to extract meaningful insights.
Machine learning models, including regression, classification, and clustering algorithms, are deployed to forecast user engagement. Moreover, customer segmentation techniques are applied to tailor marketing strategies, and a recommendation system is devised to offer personalized fixed deposit options. Through rigorous performance evaluation and visualization, the project illuminates the inner workings of the predictive models, making them interpretable for stakeholders. The results offer critical insights into user behavior and preferences, enabling banks to adapt their strategies, enhance customer satisfaction, and optimize their product offerings.

This project underscores the transformative potential of data-driven decision-making in the banking sector. By predicting customer engagement in fixed deposit services, financial institutions can not only retain existing customers but also attract new ones, fostering sustainable growth and competitiveness in a volatile financial landscape.

LIST OF FIGURES

Chapter 1:
Fig 1.1 PERT chart describing the workflow of the project.

Chapter 3:
Fig 3.1 Figure explaining the process of Logistic Regression.

Chapter 4:
Figure 4.1 Importing the libraries used in the project.
Figure 4.2 Loading the data and converting it into a data frame so as to perform operations.
Figure 4.3 Features in the dataset.
Figure 4.4 Overview of the dataset.
Figure 4.5 Data types of each variable.
Figure 4.6 Describing the dataset.
Figure 4.7 Missing values in the dataset.
Figure 4.8 Splitting the data into training and validation sets.
Figure 4.9 Making predictions on the validation set.
Figure 4.10 Checking the accuracy score.
Figure 4.11 Heat map showing the correlation between different features of the dataset.
Figure 4.12 Bar plot of the number of users who took an FD.
Figure 4.13 Client occupations of those taking an FD.
Figure 4.14 Age group into which most of the clients fall.
Figure 4.15 Pie chart showing the number of users who said yes to the subscription.
Figure 4.16 Pair plot.

LIST OF TABLES

Chapter 1:
Table 1.1 Responsibility-wise work distribution.

Chapter 2:
Table 2.1 Related work in this field.

Chapter 4:
Table 4.1 Hardware and software requirements of the system developed.
Table 4.2 Correlation table.
Table 4.3 Comparison between the different machine learning models used in the project.

LIST OF ABBREVIATIONS: None.

Synopsis

1. Title of the project:
"Predictive Analysis of Fixed Deposit User Engagement using Machine Learning and Data Science Tools".

2. Statement about the problem:
Fixed deposits (FDs) are a fundamental financial product offered by banks, often characterized by long-term commitments. To maintain and grow their customer base, banks need to continuously assess and improve the user experience for FD customers. Predictive analysis can play a pivotal role in achieving this by identifying patterns and trends in customer behavior.

3. Significance of the project:
• The project promotes data-driven decision-making, enabling the bank to tailor its strategies based on user behavior and preferences.
• The project's significance lies in its potential to drive revenue, improve customer satisfaction, and enable cost-effective marketing through data-driven insights and predictions, ultimately benefiting both the bank and its customers.

4. Objective:
The objective of the project is to create a data-driven model that can accurately predict fixed deposit subscriptions.
5. Scope:
The project will perform predictive analysis of fixed deposit user engagement using machine learning and data science, covering the following:
1. Collecting and preparing data on fixed deposit users.
2. Selecting influential features and engineering them.
3. Training machine learning models for user engagement prediction.
4. Validating and evaluating model performance.
5. Using the model to predict potential fixed deposit users and gain insights into user behavior.

6. H/W and S/W specifications:
• Hardware
  • Windows 7 or higher
  • 256 MB RAM (minimum)
  • Intel Pentium or above
• Software
  • Python
  • Jupyter Notebook
  • Libraries: Matplotlib, Seaborn, Pandas, Scikit-Learn

7. Data collection and methodology:
• Data collection was done from the Kaggle dataset website.
• Methodologies include:
  o Data preprocessing
  o Feature engineering
  o Data modelling

8. Algorithm:
This project uses Logistic Regression due to:
• Its high-precision results.
• The low effect of outliers on this model.

9. Limitations and constraints:
Limitations of this project include:
• Outliers were not removed from the dataset, since we used Logistic Regression, on which outliers have a low effect during model training.
• External factors such as interest-rate changes, policy updates, and overall market conditions are not considered, since such data was not present in the dataset and would require more complex methods to incorporate.

10. Conclusion and future scope:
The "Predictive Analysis of Fixed Deposit User Engagement using Machine Learning and Data Science Tools" project endeavors to harness the power of data-driven insights to improve user engagement in fixed deposit services. Through data analysis, predictive modeling, and personalized recommendations, the project aims to enhance customer satisfaction and drive business growth in the financial sector.

11. References and bibliography:
https://www.kaggle.com/datasets/bankindiauser/8756123
https://www.sciencedirect.com/science/article/abs/pii/S0927538X17303037
https://www.sciencedirect.com/science/article/abs/pii/S0378426619301025
https://www.iasj.net/iasj/article/263875
https://journal.formosapublisher.org/index.php/eajmr/article/view/2524
https://ieeexplore.ieee.org/abstract/document/10080695/authors#authors

CHAPTER-1 INTRODUCTION

1.1 Description of the topic:
A sector that plays a very significant part in the commercial and economic backdrop of any country is the banking sector. Data mining techniques can play a key role in providing different methods to analyse data, find useful patterns, and extract knowledge in this sector (Vajiramedhin and Suebsing, 2014). Data mining helps in the extraction of useful information from data (Turban et al., 2011). According to Venkatesh and Jacob (2016), machine learning has a greater capability to gather information from data, which results in more frequent use of data mining methods in the banking sector. Due to the large amount of data gathered in banks, data warehouses are required to store these data. Analysing and identifying patterns in such data can help banks identify trends and acquire knowledge. With this acquired knowledge, organizations can more clearly understand their customers and improve the services they provide.

The topic, "Predictive Analysis of Fixed Deposit User Engagement using Machine Learning and Data Science Tools," focuses on leveraging data science and machine learning techniques to forecast user engagement in fixed deposit products.
This research aims to develop predictive models that can identify potential fixed deposit customers and enhance financial institutions' marketing and product offerings. By analyzing user behavior and employing advanced algorithms, this project seeks to provide insights into customer preferences, ultimately improving engagement and investment in fixed deposits.

1.2 Problem Statement:
The client, a retail bank heavily reliant on term deposits, seeks to optimize its marketing efforts. Term deposits involve cash investments for a fixed period at an agreed-upon interest rate. The bank employs various outreach methods, including email, ads, telephonic, and digital marketing, with telephonic campaigns being particularly effective but expensive. To make telephonic marketing cost-effective, the goal is to identify in advance potential customers likely to subscribe to term deposits. Client data such as age, job type, marital status, and call details (e.g., call duration, day, and month) are available. The task is to predict whether a client will subscribe to a term deposit based on this data. This predictive model will enable targeted call outreach, enhancing the efficiency and success of marketing campaigns.

1.3 Objectives:
The objective of the topic, "Predictive Analysis of Fixed Deposit User Engagement using Machine Learning and Data Science Tools," is to develop predictive models that can accurately forecast user engagement with fixed deposit products. This involves leveraging machine learning algorithms and data science tools to analyze user behavior, identify potential customers, and optimize marketing and product strategies for financial institutions. Ultimately, the goal is to enhance customer engagement and promote investment in fixed deposits through data-driven insights and predictions.

1.4 Scope of the Project:
1. Data Collection and Preparation:
• Gather relevant data on fixed deposit user behavior, demographics, and engagement metrics.
• Clean and preprocess the data, addressing missing values and outliers.
2. Feature Engineering:
• Identify influential features affecting user engagement.
• Create new features or transform existing ones to improve predictive accuracy.
3. Model Selection and Training:
• Choose appropriate machine learning algorithms, including logistic regression.
• Train and fine-tune models on historical data to predict user engagement.
4. Model Evaluation and Validation:
• Assess model performance using metrics like accuracy, precision, and recall.
• Implement cross-validation techniques to ensure robustness.
5. Predictive Analysis:
• Apply the trained model to new data to predict potential fixed deposit users.
• Generate insights into user preferences and behavior.
6. Interpretation of Results:
• Understand the significance of individual features in predicting user engagement.
• Conduct feature importance analysis to identify key drivers.
7. Model Deployment:
• Create a user-friendly interface for real-time or batch predictions.
• Integrate the predictive model into the bank's marketing strategies.
8. Monitoring and Maintenance:
• Establish protocols for continuous model monitoring.
• Update the model as needed to adapt to evolving user behavior and market trends.
9. Ethical and Regulatory Compliance:
• Ensure data usage and model deployment align with privacy and regulatory standards.
• Address potential bias or discrimination in predictions.
10. Documentation and Recommendations:
• Compile comprehensive documentation covering data sources, preprocessing, modeling, and results.
• Provide actionable recommendations for optimizing the bank's marketing and engagement strategies based on predictive insights.
11. This project aims to leverage data science and machine learning tools to enhance user engagement with fixed deposits, ultimately benefiting the retail banking institution.

1.5 Project Planning Activities:

1.5.1 Team-member-wise work distribution
The team consists of two members, Piyush Goel and Krishna Gupta, and the work was distributed in the following manner.

Table 1.1 Responsibility-wise work distribution

Team Member: Piyush Goel and Krishna Gupta
Role/Responsibility: Project data collection
Task/Contribution: Data collection and preprocessing

Team Member: Krishna Gupta
Role/Responsibility: ML model selection
Task/Contribution: Model development and training

Team Member: Piyush Goel
Role/Responsibility: Graphic designing
Task/Contribution: Preparation of the presentation on Canva

The data collection was agreed upon by both team members, and the dataset was found by both on the Kaggle website. The preprocessing was done with the help of both team members.

Krishna Gupta:
• Focused on developing the predictive model for fixed deposit subscription prediction.
• Utilized Python and scikit-learn for model development and training.

Piyush Goel:
• Created the presentation for the project.
• Used the Canva graphic design tool for the preparation of the presentation.

1.5.2 PERT Chart

Fig. 1.1 PERT chart describing the workflow of the whole project.

CHAPTER 2 – LITERATURE REVIEW

2.1 SUMMARY OF PAPERS STUDIED

The objective of the paper by Nazar (2023-02-28) is to investigate the influence of investments in fixed deposits and real estate on the credit rating of Iraq's National Insurance Company. It addresses the timely and significant issue of credit ratings within Iraq's insurance market, emphasizing its relevance. The research adopts a comprehensive approach, combining theoretical insights with practical data derived from extensive records of the National Insurance Company spanning 2009 to 2020, complemented by personal interviews conducted through field visits. Statistical methodologies are applied for rigorous data analysis, ensuring a robust examination. An essential discovery is the notable absence of credit ratings for both the National Insurance Company and other insurance firms in Iraq, whether at the local or international level. To address this gap, the research proposes the establishment of a specialized credit rating institution within Iraq, fostering collaboration with key entities like the Central Bank of Iraq, the Ministry of Finance, and the Insurance Bureau. This approach is envisioned to be cost-effective compared to relying solely on international credit rating agencies, offering a tailored solution for Iraq's insurance industry.

The paper by Abhishek Rawat (KIT) et al. (2023-07-12) presents a noteworthy implementation: a secure fixed deposit system powered by smart contracts, meticulously developed within the Remix IDE. Notably, this system capitalizes on the Ethereum blockchain, ensuring an exceptionally high level of transparency, data security, and immutability, which are critical attributes for financial applications. The core of this innovation is a smart contract coded in Solidity, a robust programming language, and developed in the Remix Integrated Development Environment.
This contract empowers the system with a broad spectrum of functionalities, allowing the creation of new fixed deposit accounts, facilitating seamless fund deposits and withdrawals, automating interest calculations, and ensuring timely user notifications. Consequently, the research envisions this framework as a catalyst for financial institutions, greatly enhancing their operational efficiency and user experience through automation and heightened security.

The paper by Aditya Bodhankar et al. (2023) underscores the paramount importance of marketing in business growth and improvement. It highlights the significance of direct marketing campaigns in achieving specific business goals and the use of various communication channels, including telephones, social media, and digital marketing, to reach both local and distant clients. Recognizing the universal need for marketing, the paper zooms in on the banking sector, stressing the critical role of marketing analysis, particularly in loan approval, insurance policies, and fixed deposits. Within this sector, banks employ targeted strategies based on customer data, including transaction history. The study's core focus is the analytical approach it adopts, deploying Bayesian Logistic Regression to examine the sanctioning of bank fixed term deposits. It is worth noting that customer eligibility for loans, fixed deposits, or insurance hinges on a comprehensive analysis that considers factors such as transaction history and loan repayment punctuality, underlining the intricate nature of customer decisions in the financial landscape.

The study by the Frankfurt School of Finance and Management and Deutsche Bundesbank, Division Securities and Money Market Statistics (Willer, 2019-12-25), delves into the implications of recent regulatory proposals, notably the European Deposit Insurance Scheme, which seek to reshape deposit insurance systems. To evaluate the potential impact of these changes, it becomes crucial to discern the factors guiding depositors' decisions to withdraw or shift their funds. Remarkably, the research introduces a novel insight: Google searches related to 'deposit insurance' and similar terms can serve as indicative markers of depositors' apprehensions and anxieties. These online search patterns effectively capture the sentiments and concerns of depositors.

2.2 Integrated Summary of Literature Studied
The related work in this field is summarized in the following table, which lists each research paper, the models used along with their reported accuracy, and the main findings and limitations.

Table 2.1 Related work on predictive analysis of fixed deposits

Ref.: (Nazar, 2023-02-28)
Models: Logistic Regression (81%), Random Forest (88%), SVM (91%)
Dataset: Cryptocurrency dataset
Findings and limitations: A notable absence of credit ratings was found for the National Insurance Company and other insurance firms in Iraq; the study proposes establishing a specialized credit rating institution within Iraq.

Ref.: (Willer, 2019-12-25)
Models: Logistic Regression (84%), SVM (86%), Deep Learning (88.8%)
Dataset: Share market dataset
Findings and limitations: A heterogeneous insurance of deposits can lead to a sudden, fear-induced reallocation of deposits, endangering the stability of the banking sector even in the absence of redenomination risks.

Ref.: (Abhishek Rawat (KIT), 2023-07-12)
Models: SVM (86%), Deep Learning (88.8%), XGBoost (87.9%)
Dataset: German bank dataset
Findings and limitations: Secure fixed deposit system: the research demonstrates the successful implementation of a secure fixed deposit system using smart contracts in the Remix IDE. Blockchain-based solution: the system leverages Ethereum blockchain technology, ensuring high levels of transparency, data security, and immutability. Comprehensive functionality: the framework supports critical fixed deposit operations, including account creation, fund deposits, withdrawals, interest calculations, and user notifications.

Ref.: (Aditya Bodhankar, 2023)
Models: Logistic Regression (80%), SVM (82%), Decision Tree (88.8%)
Dataset: South India bank dataset
Findings and limitations: The findings emphasize prevailing investor sentiments and preferences in India regarding investment choices; Indian investors often perceive all investment approaches as carrying inherent risks, and the middle-income group tends to invest in traditional forms along with emerging avenues to balance risk. These insights can inform investment strategies and financial planning in the region. Limitations: respondents may not be fully truthful in their answers, and there is no way of checking misinterpretation or unintelligible replies.

CHAPTER-3 SYSTEM DESIGN AND METHODOLOGY

3.1 System Design:

3.1.1 Introduction to System Design
For a retail bank that relies heavily on term deposits, it is crucial to identify in advance which clients are likely to subscribe. This makes marketing campaigns more efficient and creates a sense of relevance in the customer's mind. A sound system design is therefore crucial for accurately predicting fixed deposit user engagement.

3.1.2 System Architecture:
The high-level architecture of the prediction system includes the following elements:
• Components: a CSV file containing client and campaign data as the data source, data preprocessing, data visualization, and data modelling.
• Data Flow: the data is taken from the source CSV file, imported using the Python pandas library, and then used for modelling.
• Technology Stack: the project uses Python libraries such as Pandas, Seaborn, Matplotlib, and Scikit-Learn for machine learning.

3.1.3 Data Sources:
The data source consists of a CSV file downloaded from the Kaggle website (see the References).

3.1.4 Data Preprocessing:
The data preprocessing steps followed to prepare the data for modelling are:
• Data cleaning
• Feature engineering
• Feature transformation
• Handling missing, null, and duplicate values

3.1.5 Modelling:
The machine learning model we employed for predicting fixed deposit subscription was Logistic Regression. This algorithm was chosen because the target is a binary (yes/no) value and because outliers have a low effect on its results; in any case, the data source we used contained almost no outliers, considering the large number of records we had. Since we were using Python, we used the scikit-learn library for model selection.

3.1.6 Challenges and Trade-offs:
The only challenge we encountered was with removing outliers: the data was already clean enough, and removing the few outliers present made the model lean towards bias, so we made a trade-off and did not remove them.

3.1.7 Conclusion:
Predicting whether a client will subscribe to a fixed deposit is a classification problem, and the best-suited algorithm we applied for it was Logistic Regression. It is one of the most widely used algorithms for this kind of problem. It gave higher accuracy than the other model we used (Linear Regression), reduced the error figures by a significant margin, and provided the best results.
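To make the preprocessing and modelling flow described in Sections 3.1.4 and 3.1.5 more concrete, the following is a minimal sketch in Python using pandas. It is illustrative only: the file name and the specific cleaning calls are assumptions, not the exact code used in the project.

import pandas as pd

# Load the raw CSV file downloaded from Kaggle (the file name is an assumption).
df = pd.read_csv("fixed_deposit_data.csv")

# Data cleaning: drop exact duplicate rows and rows that are completely empty.
df = df.drop_duplicates()
df = df.dropna(how="all")

# Handling missing / null values: report how many remain in each column.
print(df.isnull().sum())

# Feature transformation: convert categorical (object) columns into numeric
# dummy variables so they can be used by scikit-learn models.
df = pd.get_dummies(df, drop_first=True)

print(df.shape)

The actual model fitting that follows this preprocessing step is shown in Chapter 4.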
3.2 Algorithm Used
There are many algorithms available among machine learning models, but choosing one of them to address a particular problem depends entirely on the type of problem and the target values of your data. Choosing an appropriate model is a crucial part of a data science project, and we did the same in our project. After carefully analyzing all aspects of our dataset, we chose the Logistic Regression algorithm for our project. First, we used a Linear Regression model, and the results we obtained were reasonably good; however, with different combinations of test size and random state we could only achieve a highest score of 81.39%.

3.2.1 Logistic Regression
This type of statistical model (also known as a logit model) is often used for classification and predictive analytics. Logistic regression estimates the probability of an event occurring, such as voted or didn't vote, based on a given dataset of independent variables. Since the outcome is a probability, the dependent variable is bounded between 0 and 1. In logistic regression, a logit transformation is applied to the odds, that is, the probability of success divided by the probability of failure. This is also commonly known as the log odds, or the natural logarithm of the odds, and the logistic function is represented by the following formulas:

Logistic(x) = 1 / (1 + exp(−x))    (1)

ln( p / (1 − p) ) = β0 + β1·X1 + … + βk·Xk    (2)

Fig 3.1 Figure explaining the process of Logistic Regression

CHAPTER-4 IMPLEMENTATION AND RESULTS

4.1 Hardware and Software Requirements

Table 4.1 Hardware and software requirements of the system developed

Hardware:
• Intel Core i5 11th Gen, 8 GB RAM
• Screen resolution of at least 800 x 600 required for proper and complete viewing of screens; higher resolutions are not a problem.

Software:
• Any Windows-based operating system (Windows 11)
• WordPad or Microsoft Word
• Python language, Jupyter Notebook
• Scikit-learn

4.2 Implementation

4.2.1 Data Collection
Data collection in data science refers to the process of gathering, acquiring, and recording data from various sources for the purpose of analysis, interpretation, and decision-making. This dataset is extracted from Kaggle; our dataset is "Fixed Deposit Data", which contains past data of users taking FDs. We have therefore considered a labelled dataset, suitable for applying supervised machine learning techniques.

Figure 4.1 Importing the libraries used in the project.
Figure 4.2 Loading the data and converting it into a data frame so as to perform operations.
Figure 4.3 Features in the dataset.
Figure 4.4 Overview of the dataset.
Figure 4.5 Data types of each variable.

We can see that there are two data type formats:
1. object: the object format means the variables are categorical. The categorical variables in our dataset are: job, marital, education, default, housing, loan, contact, month, poutcome, subscribed.
2. int64: this represents integer variables. The integer variables in our dataset are: ID, age, balance, day, duration, campaign, pdays, previous.
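The notebook code behind Figures 4.1 to 4.5 is not reproduced in this report; the snippet below is a minimal sketch of what such a loading-and-inspection step typically looks like. The file name train.csv is an assumption.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Importing the libraries used in the project (Figure 4.1) and loading the
# data into a DataFrame (Figure 4.2).
train = pd.read_csv("train.csv")

# Features in the dataset and a quick overview (Figures 4.3 and 4.4).
print(train.columns.tolist())
print(train.head())

# Data types of each variable (Figure 4.5): 'object' columns such as job and
# marital are categorical, while 'int64' columns such as age and balance are numeric.
print(train.dtypes)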
4.2.2 Data Preparation
Data preparation, also known as data preprocessing or data cleaning, is a crucial step in the data science workflow. It involves transforming raw data from various sources into a format that is suitable for analysis, modeling, and machine learning. We prepare the raw data in this phase so that meaningful insights can be extracted from it.

Figure 4.6 Describing the dataset.
Figure 4.7 Missing values in the dataset.

There are no missing values in the train dataset. Next, we start building our predictive model to predict whether a client will subscribe to a term deposit or not. As the sklearn models take only numerical input, we convert the categorical variables into numerical values using dummies. We remove the ID variable, as it contains only unique values, and then apply dummies. We also remove the target variable and keep it in a separate variable.

Figure 4.8 Splitting the data into training and validation sets.

4.2.3 Model Building
Model building in data science is the process of creating predictive or descriptive models from data to make informed decisions, solve problems, or gain insights. In this phase, we choose the appropriate model for our data and divide the data into two parts: features (independent variables) and target (dependent variable). We then further divide them into training and testing sets so as to calculate the accuracy, or score, of our model. In this project, we used Logistic Regression because the target variable is binary (whether or not a client subscribed) and the data is structured; once the categorical variables are converted into numerical dummies, the model can be trained directly. It is also comparatively less sensitive to noise and outliers in the data.

Figure 4.9 Making predictions on the validation set.
Figure 4.10 Checking the accuracy score.

We obtained an accuracy score of around 90% on the validation dataset. Logistic regression has a linear decision boundary. What if our data has non-linearity? We would then need a model that can capture this non-linearity.
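The steps of Sections 4.2.2 and 4.2.3 (dummy encoding, splitting, fitting, and scoring) can be sketched as follows. This is a minimal, illustrative sketch: the file name, the yes/no encoding of the target, and the test_size and random_state values are assumptions, while the column names ID and subscribed follow the variable list given in Section 4.2.1.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

train = pd.read_csv("train.csv")

# Separate the target, drop the ID column, and one-hot encode the remaining
# categorical variables (Section 4.2.2).
target = train["subscribed"].map({"yes": 1, "no": 0})
features = pd.get_dummies(train.drop(columns=["ID", "subscribed"]))

# Split the data into training and validation sets (Figure 4.8).
X_train, X_val, y_train, y_val = train_test_split(
    features, target, test_size=0.2, random_state=12)

# Fit Logistic Regression and make predictions on the validation set (Figure 4.9).
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
predictions = model.predict(X_val)

# Check the accuracy score (Figure 4.10); the report obtained roughly 90%.
print(accuracy_score(y_val, predictions))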
4.3 Results
We conclude that the subscription predictions made by our model are reasonably accurate, achieving around 90% accuracy on the validation set. Also, no project is perfect; there is always scope for improvement. In our view, Logistic Regression is one of the algorithms best suited to this dataset.

DATA VISUALIZATION USING THE MATPLOTLIB AND SEABORN LIBRARIES

Heatmap, to see the correlation between the attributes: we can infer that the duration of the call is highly correlated with the target variable. This can be verified intuitively as well: the longer the call, the higher the chance that the client is showing interest in the term deposit, and hence the higher the chance that the client will subscribe.

Fig 4.11 Heatmap showing the correlation between different features of the dataset.

Bar plot: 3,715 users out of a total of 31,647 have subscribed, which is around 12%. We now explore the variables to gain a better understanding of the dataset.

Fig 4.12 Bar plot of the number of users who took an FD.

We see that most of the clients belong to blue-collar jobs and that students are the least in number, as students generally do not take a term deposit.

Fig 4.13 Client occupations of those taking an FD.

Displot: we can infer that most of the clients fall in the age group of 20-60.

Fig 4.14 Age group into which most of the clients fall.

Pie chart:

Fig 4.15 Pie chart showing the number of users who said yes to the subscription.

Fig 4.16 Pair plot showing the correlation in the train dataset.

Comparison table of the models used in the project:

Table 4.3 Comparison between the models used in the project
Decision Tree: test size 11162, accuracy 87.02%; test size 200, accuracy 92.01%
Logistic Regression: test size 31674, accuracy 88.44%; test size 200, accuracy 90.34%

CHAPTER-5 CONCLUSION AND FUTURE WORK

The project's conclusion highlights the key outcomes and insights derived from using logistic regression to predict the users who would take fixed deposits (FDs), with an accuracy score of approximately 90%. Here is a concise conclusion for the project:

This project successfully employed logistic regression as a predictive modeling technique to identify potential users who are likely to opt for fixed deposits (FDs). The model achieved an accuracy score of approximately 90%, demonstrating its effectiveness in making accurate predictions. Although it achieved its objective of accurately predicting the users who would take fixed deposits, the fact that it does not consider real-time changes in policy and many other factors limits it to the features present in the dataset.

In summary, the project underscores the power of data science and machine learning in improving decision-making processes within the financial sector. The logistic regression model's high accuracy signifies its utility in identifying potential FD users, contributing to more effective financial planning and customer engagement strategies.

In the near future, there are several promising avenues for further enhancing the project's predictive capabilities. These include refining the dataset with additional relevant features, considering alternative machine learning algorithms to optimize accuracy, and updating the data to maintain relevance in a dynamic market. Additionally, exploring time series analysis to capture temporal patterns, implementing customer segmentation for personalized strategies, and assessing the risk profile of fixed deposit users are important areas for development. The project can also benefit from real-time prediction capabilities, incorporating user feedback to refine predictions, and addressing ethical considerations in data usage. Creating a user-friendly deployment interface and establishing ongoing monitoring and maintenance protocols are crucial for long-term success. Finally, considering market expansion possibilities and collaborating with industry experts can help further refine and extend the project's impact in the financial sector.

References
1. Abhishek Rawat (KIT), H. U. (2023-07-12). Implementation of Algorithms for Fixed Deposit Using Smart Contract.
2. Aditya Bodhankar, D. P. (2023). Bank Fixed Term Deposit Analysis using Bayesian Logistic Regression.
3. Nazar, A. (2023-02-28). Analyzing the risks of fixed deposit investments and real estate and their impact on enhancing the credit rating of insurance companies.
4. Willer, F. (2019-12-25). Fear, deposit insurance schemes, and deposit reallocation in the German banking system.
5. https://www.kaggle.com/datasets/akaretirastogi897/785268
6. https://www.sciencedirect.com/science/article/abs/pii/S0927538X17303037
7. https://www.sciencedirect.com/science/article/abs/pii/S0378426619301025
8. https://www.iasj.net/iasj/article/263875
9. https://journal.formosapublisher.org/index.php/eajmr/article/view/2524
10. https://ieeexplore.ieee.org/abstract/document/10080695/authors#authors