International Journal of Engineering Trends and Technology (IJETT) – Volume 17 Number 5 – Nov 2014

Identification and Extraction of Parameters Influencing Commerce Trends Based on Data Mining in Shopping Centres

Theresa Rani Joseph1, Smitha Jacob2
1 PG Scholar, Department of Computer Science and Engineering, St. Josephs College of Engineering and Technology Pala, Kottayam
2 Assistant Professor, Department of Computer Science and Engineering, St. Josephs College of Engineering and Technology Pala, Kottayam

Abstract: This paper is an initial effort towards a proposed prescriptive analysis of commerce trends based on data mining in shopping centres. The prescriptive analysis is proposed as the second phase of our previous work on commerce recommender systems for improving customer relationship management in shopping centres. The first phase, which covered the design of a commerce recommender system, was intended to provide personalized as well as generic recommendations to the customers of the shopping centre. The second phase will concentrate on formulating optimal solutions and marketing strategies for the business needs of managers and admins, based on predictive and descriptive analysis of customer data. As a preliminary step towards the bigger analysis, we attempt to identify and extract relevant predictive and descriptive parameters that influence commerce trends, based on the first phase of the work we carried out.

Index Terms: Prescriptive Analysis, Descriptive Analysis, Predictive Analysis

I. INTRODUCTION

The research significance of commerce trends in shopping centres comes mostly from the role a shopping centre plays in the commerce world. Being part of the economy of every town and city, the shopping centre is a good domain for carrying out data mining studies. Data mining can be predictive or descriptive in nature.
Predictive data mining can be used to forecast nearly precise values based on patterns determined from known results, whereas descriptive data mining describes a data set concisely and presents interesting characteristics of the data without any predefined target [11]. Descriptive analytics is used to intelligently group or classify customers, or simply to better understand the composition of a population [12]. Descriptive analysis is often used as the primary step in segregating a population for detailed analysis. It helps to organize the database into well defined segments or groups.

According to Wikipedia, predictive analytics encompasses a variety of statistical techniques from modeling, machine learning, and data mining that analyze current and historical facts to make predictions about future, or otherwise unknown, events. It is imperative to identify the business problem and to determine the parameters that address the business problem. The actual predictive analysis comes after the above mentioned steps. Predictive analysis tracks down and filters the necessary data with the aim of producing valuable results. These results usually estimate the likelihood of an event or of possible outcomes. Predictive analysis has the power to augment customer relationship management systems by analyzing customer data.

Prescriptive analysis combines both descriptive and predictive analyses. It explores a set of possible actions and suggests actions based on descriptive and predictive analyses of data [13]. The uncertainty associated with a problem is accounted for, and ways to reduce the risks arising from it are suggested. It makes use of optimization and mathematical models to make recommendations and suggestions.

Customer Relationship Management (CRM) is an important aspect of the commerce world. It has been observed that the marketing success of an enterprise is founded on a continuous dialogue with users, leading to a real understanding of the product or service [2]. Good CRM usually includes the following key points: (1) presenting a single image of the organization; (2) understanding who the customers are and their likes and dislikes; (3) anticipating customer needs and addressing them proactively; and (4) recognizing when customers are dissatisfied and taking corrective action [1]. It is interesting to note that CRM uses predictive and descriptive analyses to proactively understand customer purchase habits and product demands.

II. RELATED WORK

The related work briefly deals with descriptive and predictive analyses, RFM analysis and customer relationship management.

A. Descriptive and Predictive Analyses

Descriptive analysis for the purpose of customer relationship management addresses a list of routine business needs [12]. The primary consideration here is the determination of the distinct types of customers of a shop. It analyses what differentiates one set of customers from others. Price sensitivity is also addressed, and the impact of discounts and incentives is evaluated. Segmentation, clustering and profiling are three common techniques used for descriptive analysis. Clustering helps to find the best groupings and, specifically in CRM, helps to clarify long held assumptions about customer groups. The process of dividing records into predefined groups is termed segmentation. Profiling involves a deeper understanding of the attributes and corresponding values within a defined segment.

Predictive analysis uses data associated with past events to predict whether a similar event will occur in future [12]. Predictive analysis has many applications within CRM. It can estimate the chances of a customer becoming a repeat buyer and eventually evolving into a high value customer.
It can reduce promotion costs by eliminating unproductive marketing strategies. Classification, regression and pattern recognition are among the techniques for predictive analysis. Predictive analytics has the power to significantly optimize customer relationship management systems. It can help an organization to analyze all its customer data and then expose patterns that predict customer behavior [14]. A predictive analytic solution can be made part of a CRM system for effective use of in-session data.

B. RFM Analysis and CRM

The concept of RFM was introduced by Bult and Wansbeek (1995) and has proven very effective (Blattberg et al., 2008) when applied to marketing databases [13]. RFM stands for Recency, Frequency and Monetary value. RFM analysis is a marketing technique for analyzing customer behavior such as how recently a customer has purchased (recency), how often the customer purchases (frequency), and how much the customer spends (monetary). It is a useful method to improve customer segmentation by dividing customers into various groups for future personalization services and to identify customers who are more likely to respond to promotions [15].

Customer Relationship Management is defined by four elements of a simple framework: Know, Target, Sell and Service. CRM requires the firm to know and understand its markets and customers [16]. This involves detailed customer intelligence in order to select the most profitable customers and identify those no longer worth targeting. CRM also incorporates development of the offer, including which products to sell to which customers and through which channel. As far as the customer life cycle is concerned, CRM identifies four important stages. Prospects are people who are not yet customers but are in the target market. Responders are prospects who have shown an interest in a product or service. Active customers are people who are currently using the product. Former customers may be customers who incurred high costs; those who are not appropriate customers because they are no longer part of the target market; or those who may have shifted their purchases to competing products.

III. PROPOSED SYSTEM

The basic framework of the system consists of a web application at the shopping centre side and an android application at the customer side. A brief description of the first phase of the work now follows [17]. The first phase of the work focused on providing a recommender system solely for the use of customers. The web application supports four kinds of users, including web admins, shop admins and customers. Shops can register on the website by providing the necessary details, and they need to be approved by web admins. Customers can register on the website and can download the android recommender application from it. The android application has a login for each registered user. The user can then search for shop products and avail recommendations.

[Figure: CENTRALISED DATABASE; Customer Specific Purchase & Visit Info; Inventory info for each shop; Frequent Pattern Mining & Prediction; Similarity Measuring Model; Prescriptive Analysis of Commerce Trends]
Fig 1: Proposed System Design

A dual recommender system was envisioned and implemented for the customer end. The dual system involves a personalized recommender system, and implementing this requires the purchase information for each registered user. This is available only in the shop database that stores the billing information. In addition, the search patterns of the customers need to be tracked using the android recommender application. The design also involved a generic recommender system, the implementation of which required the complete inventory information of each shop, with at least four levels of categorization information.
Hence it is necessary to set up a centralized database containing the consolidated purchase, visit and inventory information from each shop. It should be noted that each registered shop will be assigned a unique shop id by the web application.

The second phase of the work concentrates on performing prescriptive analysis of commerce trends for the use of shop admins or managers. Prescriptive analysis will help to figure out optimal solutions for the different business problems faced by the shop manager. Prescriptive analysis combines both predictive and descriptive analysis. Predictive analysis involves the prediction or forecast of results based on existing data. Descriptive analysis, on the other hand, involves a concise representation of existing data and the discovery of interesting facts from the data. This paper is an initial effort to track some of the decisive parameters for prescriptive analysis based on the customer specific and inventory specific data mining process carried out in the first phase. The identification of some high level commerce parameters, as well as the ways of extracting these specific parameters, is the focus of this work.

A. Identification of decisive parameters

In this preliminary phase, we identify six classes of commerce parameters. These parameters may have a descriptive or predictive nature with regard to the centralized database. We assume a centralized database containing the consolidated purchase, visit and inventory information from each shop within a shopping centre. The following are the parameters we consider in this paper.

1. Temporal parameters
2. Customer categorization parameters
3. Product based parameters
4. Price and offer based parameters
5. Search and browsing based parameters
6. Linked parameters

1) Temporal parameters: Temporal parameters, as the name suggests, are related to time.
RFM analysis typically deals with recency and frequency. Recency is the most recent date and time of purchase by the customer. Frequency counts the number of times the customer has made a purchase. Apart from these two parameters, we are interested in certain temporal parameters specific to the framework we have discussed.

First we consider temporal parameters at the granularity of a single day. The favourite purchase hour of the customer is one such attribute. It may vary between customers, so it is useful to map customers to their favourite purchase hour. This is a function of the frequency of purchase by the customer during the chosen hour of the day. Instead of doing an hourly analysis, it is sufficient to divide a day into time frames like morning, midday, noon, afternoon, evening and night. There could be specific interest groups during each of these time frames. Associative data will be useful for finding out any commonality between the customers of a specific time frame. The busiest and the least busy time frames of a day are another interesting attribute.

From the granularity of a single day, we move to the granularity of a week. The days with maximum, minimum and average sales in a week are parameters in this category. We can extend the granularity from week to month and to year and track similar parameters.

Predictive parameters deal with the likelihood of occurrence of events based on past data. We have already discussed rush hours and quieter hours at a shop during a single day; this comes under descriptive analysis. One predictive parameter is whether it will be productive to open express checkout lanes during rush hours. It may not be productive at the granularity of a single day, but it could give better results if we open express checkout lanes on busy days or holidays in a week. Similarly, we can check the best date and time for the release of offers.
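As a minimal sketch of the single-day temporal parameters described above, the following Python fragment maps each purchase timestamp to a time frame and then finds each customer's favourite frame as well as the busiest and least busy frames. The hour boundaries, the simplified set of frames and the names time_frame, favourite_frames and busiest_and_quietest are illustrative assumptions and are not part of the implemented framework; purchases are assumed to be available as (customer id, timestamp) pairs.

```python
from collections import Counter
from datetime import datetime

# Hypothetical hour boundaries (start inclusive, end exclusive).
# The paper does not fix these; any hour outside them counts as "night".
FRAME_BOUNDS = [
    (6, 11, "morning"),
    (11, 14, "midday"),
    (14, 17, "afternoon"),
    (17, 21, "evening"),
]

def time_frame(ts):
    """Return the name of the time frame that contains the timestamp."""
    for start, end, name in FRAME_BOUNDS:
        if start <= ts.hour < end:
            return name
    return "night"

def favourite_frames(purchases):
    """purchases: list of (customer_id, datetime).
    Returns {customer_id: frame with the most purchases}."""
    per_customer = {}
    for customer_id, ts in purchases:
        per_customer.setdefault(customer_id, Counter())[time_frame(ts)] += 1
    return {cid: counts.most_common(1)[0][0]
            for cid, counts in per_customer.items()}

def busiest_and_quietest(purchases):
    """Return (busiest frame, least busy frame) among frames that have
    at least one purchase. Assumes purchases is not empty."""
    totals = Counter(time_frame(ts) for _, ts in purchases)
    ranked = totals.most_common()
    return ranked[0][0], ranked[-1][0]
```

The same counting can be repeated with the weekday, month or year of the timestamp as the key to obtain the coarser granularities.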
2) Customer categorization parameters: Customer categorization is a typical descriptive analysis procedure. The RFM model is useful for customer categorization: we can rate customers on the basis of their recency, frequency and monetary value. The best valued customers are the ones with maximum scores for all three parameters. There could be interesting predictive parameters associated with a customer group. For example, we can direct our best offers and promotional mails to the best valued customers. It would not be wise to ignore the least valued customers, but we have to come up with specific strategies for each customer group. Grouping customers based on associative data is another way of classification. Associative data refers to a mix of information about the customer. It could include demographic information (age, marital status, nature of job, income etc.), geographic information and so on.

3) Product based parameters: Product based parameters depend on inventory data. There are many possibilities for deriving descriptive parameters from product data, depending on the type of shop and the range of products it carries. These can include the most sold product, the least sold product and products with average sales. We can group products into different sections and find the best sellers in each section. We can also find products that are gaining new attention from customers and those that customers are slowly ignoring. A valuable predictive parameter is how beneficial it is to hold additional stock of the most sold item and to introduce different brands of the most sold products. Brands play a big role in the sale of a product. Store managers have to choose the brand with the highest demand wisely and have to recognize brands with diminishing demand from customers.
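The descriptive product parameters above can be illustrated with a short Python sketch. It assumes that purchase records have been reduced to (product id, quantity) pairs and that a mapping from product id to section is available; these shapes, and the function names product_sales, most_and_least_sold, best_sellers_by_section and trend, are hypothetical and chosen only for illustration.

```python
from collections import Counter

def product_sales(purchase_rows):
    """purchase_rows: iterable of (product_id, quantity).
    Returns a Counter of units sold per product."""
    totals = Counter()
    for product_id, quantity in purchase_rows:
        totals[product_id] += quantity
    return totals

def most_and_least_sold(totals, inventory_ids):
    """Include every inventory item, so unsold products count as 0 units."""
    units = {pid: totals.get(pid, 0) for pid in inventory_ids}
    return max(units, key=units.get), min(units, key=units.get)

def best_sellers_by_section(totals, section_of):
    """section_of: {product_id: section}.
    Returns {section: product with the most units sold in that section}."""
    best = {}
    for pid, units in totals.items():
        section = section_of[pid]
        if section not in best or units > totals[best[section]]:
            best[section] = pid
    return best

def trend(previous, current):
    """Compare units sold in two periods and label each product as
    'rising', 'falling' or 'steady'."""
    labels = {}
    for pid in set(previous) | set(current):
        delta = current.get(pid, 0) - previous.get(pid, 0)
        labels[pid] = "rising" if delta > 0 else "falling" if delta < 0 else "steady"
    return labels
```

Comparing two consecutive periods with trend is one simple way to flag products that are gaining attention and products that are slowly being ignored; a real analysis would probably smooth over several periods.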
A well known analysis concept related to products is market basket analysis. It tries to find sets of items that are often bought together during a single purchase by the customer. Bread, butter and milk bought during the same purchase is one such example. Market basket analysis helps to predict items for combo offers.

4) Price and offer based parameters: Price based parameters are indispensable for commerce trend analysis. The role of price in commerce is often tricky. The general trend is to move towards cost effectiveness. Low priced products are attractive to most customer groups, but at times the decisive factors are the budget and the urgency of the customer group. It is a general practice in certain shops to maintain products of varying quality and price in order to attract more customer groups. But there are trademark shops which often target a specific customer group. One descriptive parameter to check is the price range of the most sold brands of products. It is also worthwhile to understand the effect of price variations on the sale of a product. Offers are inherently meant to increase sales. But it is important to note which offers gave the maximum sales, the period during which each offer was introduced, its duration and the customer groups involved.

5) Search and browsing based parameters: Search and browsing information gives useful insights into the tastes of each customer group. Moreover, it is a good indicator of recent market trends. The most searched item in a shop is a good descriptive parameter. Introducing offers for the most searched product is a predictive parameter. The recency of a search is as important as its frequency. The diminishing utility of a product can be inferred from a decline in search or browse activity surrounding that product.

6) Linked parameters: Most of the above mentioned parameters are linked to one another.
Price based parameters are closely related to temporal and product based parameters. A good example is the introduction of combo offers at peak sale hours. Combo offers combine more than one favourite product of the customers. The sale of combo products at a low price and at peak sale hours is likely to create a very positive response from customers. A synergy of multiple factors is at work in this case. Customer categorization parameters can easily be linked with product parameters. In a similar way, market basket analysis of different customer groups might yield surprising results.

B. Extraction of identified parameters

A brief discussion of the means of extracting the identified parameters gives an idea of the feasibility of the proposed prescriptive analysis for our framework. The framework is designed so that a new pattern id is created each time the user logs into the android app [17]. The time between a login and the following log off is considered the time interval of a pattern. The pattern ids and customer ids are maintained in a table called customer_patterns. Two separate tables are maintained for tracking the purchase pattern and the browsing pattern. Each time the user purchases an item, the maximum (that is, the latest) pattern id for that customer is chosen from the customer_patterns table. This is based on the assumption that the user will be logged into the android app during the visit to the mall. Since the pattern ids are created based on time, we are well equipped to capture temporal parameters. For collecting associative data for customer categorization, we can make use of the registration data available in the database, since each customer has to register on the website to download the android app. For data regarding the recency, frequency and monetary value of a customer, it is enough to query the customer_patterns table and the purchase and browsing info tables.
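A minimal sketch of how recency, frequency and monetary value could be derived by joining customer_patterns with the purchase information is given below. The tuple layouts, the use of one row per purchase and the function name rfm_values are assumptions made for illustration; the actual schema of the framework in [17] may differ.

```python
from datetime import date

def rfm_values(customer_patterns, purchases, today):
    """
    customer_patterns: iterable of (pattern_id, customer_id), standing in
        for rows of the customer_patterns table (assumed layout).
    purchases: iterable of (pattern_id, purchase_date, amount), standing in
        for rows of the purchase info table (assumed layout).
    Returns {customer_id: (recency_days, frequency, monetary)} where
    frequency is the number of purchase rows and monetary is total spend.
    Customers without any purchase row are not included.
    """
    owner = dict(customer_patterns)  # pattern_id -> customer_id
    summary = {}
    for pattern_id, purchased_on, amount in purchases:
        customer_id = owner[pattern_id]
        last, count, total = summary.get(customer_id, (purchased_on, 0, 0.0))
        summary[customer_id] = (max(last, purchased_on), count + 1, total + amount)
    return {cid: ((today - last).days, count, total)
            for cid, (last, count, total) in summary.items()}
```

Under these assumptions a lower recency value means a more recent purchase, so a scoring step built on top of this sketch would rank recency in the opposite direction to frequency and monetary value.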
Product based parameters can easily be tracked from the detailed inventory data available with each shop owner. Price and offer based parameters can be tracked partly from inventory data and partly from the purchase info table of each shop. The android application allows customers to check in to a shop to view recommendations and offers. The user can additionally search for a product and then check in to a shop offering the product to view further recommendations and offers. The browse and search history of the user is saved in the visit_info table, so it is feasible to extract parameters regarding the search and browse data of customers. For finding repeated purchase or browse patterns, we can access the user pattern data structure, which is a hash map [17] that contains the pattern id as key and the transaction array list corresponding to each pattern id as value. Frequency mapping is performed on this dynamic data structure to find repeated patterns.

IV. CONCLUSION

The identification and the ways of extraction of decisive parameters for prescriptive analysis have been discussed. The key to improving customer relationship management is to analyze the descriptive parameters that are available through various means like customer registration, purchase info, billing info and browsing info, to derive predictive parameters, and finally to move on to prescriptive analysis for finding optimal solutions for business needs. We have identified six categories of commerce parameters. Each category involves predictive and descriptive parameters relevant to our framework. The design and implementation of a prototype commerce recommender system for a shopping centre [17] has been the basis for studying the extraction of the identified parameters.

V. FUTURE WORKS

The discussion in this paper is a preliminary step towards building a strong customer relationship management system for a shopping centre. From the parameters we have identified, our aim is to move towards the complete prescriptive analysis of the business needs associated with a shopping centre. It is extremely useful to provide store owners with analytical reports which summarize specific customer requirements, behavioral patterns and solutions for business needs.

REFERENCES

[1] C. Dennis, D. Marsland and T. Cockett, “Data Mining for Shopping Centres - Customer Knowledge Management Framework,” in Journal of Knowledge Management, 2001.
[2] I. Richard, D. Foster and R. Morgan, “Brand Knowledge Management: Growing Brand Equity,” in Journal of Knowledge Management, 1998.
[3] R. Agrawal and R. Srikant, “Fast Algorithms for Mining Association Rules,” in Proc. Int’l Conf. Very Large Databases, pp. 487-499, Sept. 1994.
[4] Mohammed Javeed Zaki, Srinivasan Parthasarathy, Mitsunori Ogihara and Wei Li, “New Algorithms for Fast Discovery of Association Rules,” in Proc. Int’l Conf. Knowledge Discovery and Data Mining, 1997.
[5] J. Han, J. Pei, and Y. Yin, “Mining Frequent Patterns without Candidate Generation,” in Proc. ACM SIGMOD Conf. Management of Data, pp. 1-12, May 2000.
[6] Eric Hsueh-Chan Lu, Wang-Chien Lee and Vincent S. Tseng, “A Framework for Personal Mobile Commerce Pattern Mining and Prediction,” in IEEE Transactions on Knowledge and Data Engineering, 2012.
[7] Y. Lu, “Concept Hierarchy in Data Mining: Specification, Generation and Implementation,” master’s thesis, Simon Fraser Univ., 1997.
[8] J. L. Herlocker, J. A. Konstan, A. Borchers, and J. Riedl, “An Algorithmic Framework for Performing Collaborative Filtering,” in Proc. Int’l ACM SIGIR Conf. Research and Development in Information Retrieval, pp. 230-237, Aug. 1999.
[9] R. Agrawal and R. Srikant, “Mining Sequential Patterns,” in Proc. Int’l Conf. Data Eng., pp. 3-14, Mar. 1995.
[10] L. Kaufman and P. J. Rousseeuw, Finding Groups in Data: An Introduction to Cluster Analysis, Wiley, Mar. 1990.
[11] http://www-01.ibm.com
[12] http://www.askingsmarterquestions.com/predictive-vdescriptive-analytics-in-data-driven-marketing/
[13] http://cdn.intechopen.com/pdfs-wm/13162.pdf
[14] http://huaat.net/download/DMtechniques.pdf
[15] http://www.dataminingarticles.com/info/data-miningintroduction/
[16] https://faculty.washington.edu/socha/css572winter2012/ASA_Introduction_to_Analytics.pdf
[17] Theresa Rani Joseph and Smitha Jacob, “A Commerce Recommender System for Improving Customer Relationship Management in Shopping Centres,” in International Journal of Engineering Trends and Technology, vol. 13, no. 4, Jul. 2014.

ISSN: 2231-5381 http://www.ijettjournal.org