MACHINE LEARNING APPROACH FOR CUSTOMER SEGMENTATION AND MARKET ANALYTICS

A Project Proposal submitted for the partial fulfillment of the requirements of the Advanced Diploma in Data Science (Part time) Program

By F.M.M. AADHIL IMAM
COADDS191P -009

Independent Research Project
Advanced Diploma in Data Science
National Institute of Business Management
Colombo, Sri Lanka
4th October 2020

ABSTRACT

In this paper, based on a data sample from a UK-based non-store online retailer, the author examines the importance of customer segmentation and market analytics. The author further aims to identify potential customers and their unsatisfied needs. This enables marketers to create targeted marketing messages for a specific group of customers, which increases the chances of a person buying a product.

Key words - Customer Segmentation, Market Analytics

Table of Contents

Chapter 1: Introduction
1.1 Background
1.2 Research Problem
1.3 Objective of the Project
1.4 Scope of the Project
1.5 Justification of Research
1.6 Expected Limitations
1.7 Proposed Work Schedule
Chapter 2: Literature Review
2.1 Introduction to the research theme
2.2 Theoretical explanation about the Key Words in the Topic
2.3 Findings by other researchers
2.4 The research gap
2.5 Table for Variables, their definitions and sources
2.6 Chapter conclusion
Chapter 3: Methodology
3.1 Introduction
3.2 Population, sample and Sampling technique
3.3 Type of Data to be collected and data sources
3.4 Data collection tools and plan
3.5 Conceptual framework
3.6 Hypothesis
3.7 Methods of Data Analysis
References

Chapter 1: Introduction

1.1 Background

Over the years, the commercial world has become more competitive, so organizations have to satisfy the needs and wants of their customers, attract new customers, and thereby enhance their businesses. In the business sector, the various trading chains generate a large amount of data.
This data is generated on a daily or monthly basis across the stores. This extensive database of customer transactions needs to be analyzed in order to design profitable strategies. All customers have different tastes and needs. As the customer base and the number of transactions grow, it becomes difficult to understand the requirements of each customer. Identifying potential customers can improve marketing campaigns, which ultimately increases sales. Segmentation can play a useful role by grouping those customers into various segments.

Identifying and satisfying the needs and wants of each customer in a business is a very complex task, because customers may differ in their needs, wants, demography, geography, tastes and preferences, behaviors and so on. As such, it is a wrong practice to treat all customers equally in business. This challenge has motivated the adoption of customer segmentation, or market segmentation, in which customers are subdivided into smaller groups or segments whose members show similar market behaviors or characteristics.

1.2 Research Problem

When similar characteristics are found in customers' behavior and needs, the customers are generalized into groups so that their demands can be met with various strategies. Those strategies can be an input to targeted marketing activities for specific groups, the launch of features aligned with customer demand, and the development of the product roadmap.

In the traditional method, the existing customer data and the general population data have to be compared in some way to deduce a relationship between them. A manual way of doing this is to compare the statistics of the customers with those of the general population. For example, the mean and standard deviation of age can be compared to determine which age group is more likely to be a customer, or salaries can be compared to see which groups of people become customers.
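The manual comparison described above can be illustrated with a minimal Python sketch. The age values are invented purely for illustration and do not come from the data set, and the helper name `summarize` is hypothetical.

```python
from statistics import mean, stdev

# Hypothetical ages, invented for illustration only (not from the data set).
customer_ages = [24, 27, 31, 29, 26, 31]
population_ages = [18, 25, 37, 44, 52, 61, 29, 70]


def summarize(values):
    """Return the mean and the sample standard deviation of a list of numbers."""
    return mean(values), stdev(values)


cust_mean, cust_sd = summarize(customer_ages)
pop_mean, pop_sd = summarize(population_ages)

# A lower mean and a smaller spread among customers would suggest that
# customers are concentrated in a narrower, younger age band.
print(f"customers:  mean age {cust_mean:.1f}, sd {cust_sd:.1f}")
print(f"population: mean age {pop_mean:.1f}, sd {pop_sd:.1f}")
```

The same comparison would have to be repeated for every attribute (salary, location and so on), and each comparison yields its own pair of numbers.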
However, this analysis produces many results, which in turn have to be analyzed to arrive at a final strategy. This process requires a lot of time, and by the time the analysis is complete, competitors may already have captured most of the market, putting the company at risk of going out of business.

1.3 Objective of the Project

The objective is to identify potential customers and their unsatisfied needs. This enables marketers to create targeted marketing messages for a specific group of customers, which increases the chances of a person buying a product.

1.4 Scope of the Project

The scope of the project is to develop customized marketing campaigns, design an optimal distribution strategy, choose specific product features for deployment, and prioritize the new product development efforts of the business.

1.5 Justification of Research

Segmentation allows businesses to make better use of their marketing budgets, gain a competitive edge over rival companies and, importantly, demonstrate a better knowledge of their customers' needs and wants. It helps to:

• Improve marketing efficiency - Breaking down a large customer base into more manageable pieces makes it easier to identify the target audience and launch campaigns to the most relevant people, using the most relevant channel.
• Identify new market opportunities - While grouping customers into clusters, a business may find that it has identified a new market segment, which could in turn alter its marketing focus and strategy.
• Better brand strategy - Once the key motivators for customers have been identified, such as design, price or practical needs, products can be branded appropriately.
• Improve distribution strategies - Identifying where and when customers shop can inform product distribution strategies, such as what types of products are sold at particular outlets.
• Customer retention - Using segmentation, marketers can identify groups that require extra attention and those that churn quickly, along with customers with the highest potential value. It can also help with creating targeted strategies that capture customers' attention and create positive, high-value experiences with the brand.

1.6 Expected Limitations

The main barrier to the project was obtaining the datasets. Only a limited number of datasets based on real-world customer data were available.

1.7 Proposed work schedule

Chapter 2: Literature Review

2.1 Introduction to the research theme

Customer segmentation refers to the process of dividing a market into groups of buyers with different behaviors and characteristics. This theory proposes to study and predict customers' future consumption trends by segmenting customer information and consumption behavior, and to support the profit and market planning of enterprises.

2.2 Theoretical explanation about the Key Words in the Topic

2.2.1 Customer Segmentation

The process of grouping customers into sections of individuals who share common characteristics is called customer segmentation.

2.2.2 Market analytics

Marketing analytics is customer lifecycle analytics: data analysis conducted around consumers to generate insights that guide marketing activities. Specifically, it includes analyses such as market segmentation, consumer lifetime value analysis, acquiring new customers, retaining existing customers, and enhancing customer engagement.

2.3 Findings by other researchers

The literature review is used to identify the conclusions of previous research on customer segmentation. Past literature will help to develop a framework for the new research.
In 2015, Chinedu Pascal Ezenkwu, Simeon Ozuomba and Constance Kalu published "Application of K-Means Algorithm for Efficient Customer Segmentation: A Strategy for Targeted Customer Services". They found that the K-means algorithm achieved a purity measure of 0.95, indicating 95% accurate segmentation of the customers. According to the authors, insight into a business's customer segmentation provides the following advantages:

• the ability of the business to customize market programs so that they suit each of its customer segments;
• business decision support in risky situations, such as credit relationships with its customers;
• identification of the products associated with each segment and how to manage the forces of demand and supply;
• uncovering latent dependencies and associations amongst customers, amongst products, or between customers and products of which the business may not be aware;
• the ability to predict customer defection and which customers are most likely to defect; and
• raising further market research questions as well as providing directions towards finding the solutions.

In April 2019, Balmeet Kaur and Pankaj Kumar Sharma published "Implementation of Customer Segmentation using Integrated Approach". In their study they observe that, in the competitive e-commerce market, the problem of identifying potential customers is gaining more and more attention. To address this problem in a timely manner, their paper proposes an integrated approach based on clustering using K-means and association mining using the Apriori technique. After the targeted customers and their associated buying patterns are identified, business managers can take strategic, profitable decisions accordingly. The authors state that this integrated model could be put directly into practice to provide better profit margins from sales.

2.4 The research gap

Customer segmentation based on stream clustering provides an ongoing picture of the makeup of the customer base.
It also indicates the value that different customer groups have for the company and shows where increased marketing activities may be worthwhile. It is not limited to the retail and e-commerce field; it can be applied in other sectors, too. It allows companies to lay the foundation for targeted marketing campaigns aimed, for instance, at rewarding loyal customers, preventing defections or gaining new customers. The segmentation also helps a company select the right communication channels. If the intention is to target online shoppers, for example, campaigns using social media and email are preferable to expensive direct mailings. In other words, this new approach to customer segmentation allows companies to reach the desired customers with the right messages via the right channels, which also improves the customer experience. These benefits are achieved continuously, because updating the clusters on an ongoing basis using the streams eliminates the key disadvantage of traditional customer segmentation.

2.5 Table for Variables, their definitions and sources

• InvoiceNo: Invoice number. Nominal. A 6-digit integral number uniquely assigned to each transaction. If this code starts with the letter 'c', it indicates a cancellation.
• StockCode: Product (item) code. Nominal. A 5-digit integral number uniquely assigned to each distinct product.
• Description: Product (item) name. Nominal.
• Quantity: The quantities of each product (item) per transaction. Numeric.
• InvoiceDate: Invoice date and time. Numeric. The day and time when a transaction was generated.
• UnitPrice: Unit price. Numeric. Product price per unit in sterling (£).
• CustomerID: Customer number. Nominal. A 5-digit integral number uniquely assigned to each customer.
• Country: Country name. Nominal. The name of the country where a customer resides.

2.6 Chapter conclusion

This chapter reviewed the conclusions of previous research on factors significant to customer segmentation.
Based on the analysis of past literature, the variables were extracted for the present study. Furthermore, new factors were included in the framework, which will be helpful for studying customer segmentation and market analytics.

Chapter 3: Methodology

3.1 Introduction

This chapter sets out the research process and methods of analysis used to identify factors in customer segmentation and market analytics. Furthermore, the chapter comprises the research design, study sample, data collection methods and data analysis plan.

3.2 Population, sample and Sampling technique

The sample data are drawn from transactions between 01/12/2010 and 09/12/2011 for a UK-based and registered non-store online retailer.

3.3 Type of Data to be collected and data sources

This is a transnational data set which contains all the transactions occurring between 01/12/2010 and 09/12/2011 for a UK-based and registered non-store online retailer. The company mainly sells unique all-occasion gifts. Many customers of the company are wholesalers.

Data source: https://archive.ics.uci.edu/ml/datasets/Online+Retail+II#

3.4 Data collection tools and plan

The study is conducted on secondary data stored by the online retail store; the required data fields were extracted from the original data source. The selected fields were stored in a tabular format for the convenience of the study.

3.5 Conceptual framework

3.6 Hypothesis

Developing a hypothesis is necessary, as the hypothesis will guide decisions on how to structure the data in order to cluster customers. For the orders, our hypothesis is that online purchases of the BLUE DIAMANTE PEN IN GIFT BOX depend on features such as size (big or small) and price tier (high/premium or low/affordable). Although we will cluster on the gift box model, the gift box model features (e.g. price, category, etc.) will be used for assessing the preferences of the customer clusters.
3.7 Methods of data analysis

3.7.1 Exploratory Data Analysis (EDA)

Exploratory data analysis is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. A statistical model may or may not be used, but EDA is primarily for seeing what the data can tell us beyond formal modeling or hypothesis testing.

3.7.2 Cohort analysis

Cohort analysis is a type of behavioral analytics in which users are grouped based on their shared traits to better track and understand their actions. It allows more specific, targeted questions to be asked and supports informed product decisions that can reduce churn and increase revenue.

3.7.3 Funnel analysis

A funnel analysis is a method of understanding the steps required to reach an outcome on a website and how many users get through each of those steps. The set of steps is referred to as a "funnel" because the typical shape visualizing the flow of users resembles a funnel, wide at the first step and narrowing as users drop off.

3.7.4 Market basket analysis

Market basket analysis is a data mining technique used by retailers to increase sales by better understanding customer purchasing patterns. It involves analyzing large data sets, such as purchase history, to reveal product groupings, as well as products that are likely to be purchased together.

3.7.5 Recency, Frequency and Monetary Value

Recency, frequency, monetary value (RFM) is a marketing analysis tool used to identify a company's or an organization's best customers by using certain measures. The RFM model is based on three quantitative factors:

• Recency: how recently a customer has made a purchase.
• Frequency: how often a customer makes a purchase.
• Monetary value: how much money a customer spends on purchases.

3.7.6 K Means clustering

K-means clustering is a type of unsupervised learning, which is used when you have unlabeled data (i.e., data without defined categories or groups). The goal of this algorithm is to find groups in the data, with the number of groups represented by the variable K.
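As a rough illustration of how the RFM measures and K-means could fit together, the following standard-library Python sketch computes recency, frequency and monetary value from a handful of invented transactions and then groups the customers with a small K-means loop. The customer numbers, dates, amounts, the snapshot date, the min-max scaling step, the choice of K = 2 and the use of the first K points as starting centroids are all assumptions made for this example. The proposal does not prescribe them, and a real analysis would more likely rely on a library implementation.

```python
from datetime import date

# Hypothetical transactions, invented for illustration only:
# (customer_id, invoice_date, quantity, unit_price_gbp)
TRANSACTIONS = [
    (1001, date(2011, 11, 20), 4, 5.0),
    (1001, date(2011, 12, 5), 10, 5.0),
    (1001, date(2011, 12, 8), 6, 5.0),
    (1002, date(2011, 11, 15), 5, 4.0),
    (1002, date(2011, 11, 30), 10, 4.0),
    (1002, date(2011, 12, 6), 5, 4.0),
    (2001, date(2011, 1, 10), 2, 5.0),
    (2002, date(2011, 2, 10), 3, 4.0),
]


def rfm_table(transactions, snapshot):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    Assumption: every row counts as one purchase.
    """
    last_seen, purchases, spend = {}, {}, {}
    for customer, day, quantity, unit_price in transactions:
        last_seen[customer] = max(last_seen.get(customer, day), day)
        purchases[customer] = purchases.get(customer, 0) + 1
        spend[customer] = spend.get(customer, 0.0) + quantity * unit_price
    return {
        c: ((snapshot - last_seen[c]).days, purchases[c], round(spend[c], 2))
        for c in sorted(last_seen)
    }


def min_max_scale(rows):
    """Rescale every column to [0, 1] so no single measure dominates distances."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return [
        tuple((v - lo) / span for v, lo, span in zip(row, lows, spans))
        for row in rows
    ]


def squared_distance(p, q):
    """Squared Euclidean distance between two equally long tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Plain K-means that starts from the first k points (a simplification)."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels, centroids


rfm = rfm_table(TRANSACTIONS, snapshot=date(2011, 12, 10))
customers = list(rfm)
labels, _ = kmeans(min_max_scale([rfm[c] for c in customers]), k=2)
for customer, label in zip(customers, labels):
    print(customer, rfm[customer], "-> segment", label)
```

On the actual data set, counting distinct InvoiceNo values per customer rather than rows would probably be a more faithful frequency measure, and cancelled invoices would need to be handled before computing monetary value.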
The algorithm works iteratively to assign each data point to one of K groups based on the features that are provided. Data points are clustered based on feature similarity.

References

Kishana R. Kashwan and C. M. Velu. Customer Segmentation Using Clustering and Data Mining Techniques [online]. Available at: <https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=&cad=rja&uact=8&ved=2ahUKEwjUu6L8uYvsAhXVdCsKHQhxDlAQFjALegQIAhAB&url=https%3A%2F%2Fpdfs.semanticscholar.org%2F4750%2Fa05c73c6a1d9836e568a24e935bfddd21ef3.pdf&usg=AOvVaw38BFNHjVp8XtcX1NYK9nis> [6 December 2013]

Patel Monil, Patel Darshan, Rana Jecky, Chauhan Vimarsh and B. R. Bhatt. Customer Segmentation using Machine Learning [online]. Available at: <https://www.academia.edu/43487362/Customer_Segmentation_using_Machine_Learnin> [June 2020]

Balmeet Kaur and Pankaj Kumar Sharma. Implementation of Customer Segmentation using Integrated Approach [online]. Available at: <https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=&cad=rja&uact=8&ved=2ahUKEwjUu6L8uYvsAhXVdCsKHQhxDlAQFjAPegQICxAB&url=https%3A%2F%2Fwww.ijitee.org%2Fwpcontent%2Fuploads%2Fpapers%2Fv8i6s%2FF61680486S19.pdf&usg=AOvVaw0cnFIHhCFyyD3LBGNwxqiU> [April 2019]

Chinedu Pascal Ezenkwu, Simeon Ozuomba and Constance Kalu. Application of K-Means Algorithm for Efficient Customer Segmentation: A Strategy for Targeted Customer Services [online]. Available at: <https://www.researchgate.net/publication/282862569_Application_of_KMeans_Algorithm_for_Efficient_Customer_Segmentation_A_Strategy_for_Targeted_Customer_Services> [2015]
Abhijit Bag. Customer Segmentation & Opportunity Analysis [online]. Available at: <https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=&ved=2ahUKEwieufrLhJjsAhUs_XMBHY91B8sQFjAMegQIAxAC&url=https%3A%2F%2Fwww.daitm.org.in%2Fwp-content%2Fuploads%2F2019%2F04%2F15499016029_AbhijitBag.pdf&usg=AOvVaw115VpcV1BUfcU4T-MNIxP3> [2018]

Raquel Florez-Lopez and Juan Manuel Ramon-Jeronimo. Marketing Segmentation Through Machine Learning Models: An Approach Based on Customer Relationship Management and Customer Profitability Accounting [online]. Available at: <https://www.researchgate.net/publication/249737446_Marketing_Segmentation_Through_Machine_Learning_Models_An_Approach_Based_on_Customer_Relationship_Management_and_Customer_Profitability_Accounting> [2008]

Juni Nurma Sari, Lukito Edi Nugroho, Ridi Ferdiana and P. Insap Santosa. Review on Customer Segmentation Technique on Ecommerce [online]. Available at: <https://www.researchgate.net/publication/313737530_Review_on_Customer_Segmentation_Technique_on_Ecommerce> [2011]

Andrew Aziz. Customer Segmentation based on Behavioral Data in E-marketplace [online]. Available at: <https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=&ved=2ahUKEwjSotGXhpjsAhUpH7cAHboACycQFjAAegQIBBAC&url=https%3A%2F%2Fuu.divaportal.org%2Fsmash%2Fget%2Fdiva2%3A1145508%2FFULLTEXT01.pdf&usg=AOvVaw0eaSVl-F9EhVq8rmDKHg01>

Ceren Iyim. Customer Segmentation with Machine Learning [online]. Available at: <https://towardsdatascience.com/customer-segmentation-with-machine-learninga0ac8c3d4d84>

Alan Zhang. Marketing Analytics - Anyone can do it [online]. Available at: <https://towardsdatascience.com/marketing-analytics-anyone-can-do-it-750d8ca63806>