Website Popularity Prediction (Assignment 2) Data Mining and Industry Applications (ISCG8042) Website Popularity Prediction (Assignment 2) Due Date: 10/11/2014 Assisted by: Jane Zhao Prepared and Submitted by: Mohammed Alkorbi (1431788) Abdulbasit Almatrook (1430943) Haider Al Oliwi (1440428) Naji Alobaidi (1373749) Page 1 of 37 Website Popularity Prediction (Assignment 2) Table of Contents 1 Introduction .......................................................................................................................................... 5 1.1. 1.1.1 Prediction of Fitness Website Design ................................................................................... 5 1.1.2 Popularity Prediction of E-commerce Website..................................................................... 6 1.1.3 Popularity Prediction of Music Website ............................................................................... 6 1.1.4 Popularity Prediction of Education Website ......................................................................... 7 1.2. Data Mining Applications .............................................................................................................. 7 1.2.1. Retail Data Mining Applications ............................................................................................ 7 1.1.5 Telecommunication Data Mining Applications ..................................................................... 8 1.1.6 Banking Data Mining Applications ........................................................................................ 9 1.1.7 Website Popularity Prediction ............................................................................................ 10 1.2 2 Research Background.................................................................................................................... 5 Data Mining Tasks ....................................................................................................................... 11 1.2.1 Classification ....................................................................................................................... 11 1.2.2 Regression ........................................................................................................................... 11 1.2.3 Prediction ............................................................................................................................ 12 1.2.4 Clustering ............................................................................................................................ 14 1.2.5 Association Rule .................................................................................................................. 15 1.2.6 Summarization .................................................................................................................... 16 1.2.7 Outlier Analysis ................................................................................................................... 16 Methodology....................................................................................................................................... 17 2.1 Flowchart of System Design .............................................................................................................. 17 2.2 Explanation of System Design: .......................................................................................................... 18 2.2.1 System Module A: Extraction: .................................................................................................... 18 2.2.2 System Module B: Data Server Module ..................................................................................... 18 2.2.3 System Module C: Data Pre-processing Module ....................................................................... 18 2.2.4 System Module D: Data Mining ................................................................................................. 19 2.2.5 System Module E: Loading into Warehouse: ............................................................................. 20 2.2.6 System Module F: Visualization the Data: ................................................................................. 20 2.3 Difficulties and Solutions: ................................................................................................................. 20 2.3.1 Extraction: .................................................................................................................................. 20 2.3.2 Transformation Difficulties: ....................................................................................................... 20 2.3.3 Loading Issue:............................................................................................................................. 21 Page 2 of 37 Website Popularity Prediction (Assignment 2) 3 Experiments and Results Discussion ................................................................................................... 22 3.1 3.1.1 Linear Regression ................................................................................................................ 22 3.1.2 Logistic Regression .............................................................................................................. 22 3.1.3 Multilayer Perceptron ......................................................................................................... 23 3.1.4 Sequential Minimal Optimization for Regression (SMOreg)............................................... 24 3.2 Prediction of Fitness Website Design ......................................................................................... 24 3.2.1 Linear Regression ................................................................................................................ 24 3.2.2 Logistic Regression .............................................................................................................. 25 3.2.3 Multilayer Perceptron ......................................................................................................... 26 3.2.4 Sequential Minimal Optimization for Regression (SMOreg)............................................... 26 3.3 Popularity Prediction of E-commerce Website........................................................................... 27 3.3.3 Linear Regression ................................................................................................................ 27 3.3.2 Logistic Regression .............................................................................................................. 27 3.3.3 Multilayer Perceptron ......................................................................................................... 28 3.3.4 Sequential Minimal Optimization for Regression (SMOreg)............................................... 29 3.4 Popularity Prediction of Education Website ............................................................................... 29 3.4.1 Linear Regression ................................................................................................................ 29 3.4.2 Logistic Regression .............................................................................................................. 30 3.4.3 Multilayer Perceptron ......................................................................................................... 31 3.4.4 Sequential Minimal Optimization for Regression (SMOreg)............................................... 31 3.5 4 Listening Music Website Popularity ............................................................................................ 22 Compression ............................................................................................................................... 32 3.5.1 Linear Regression ................................................................................................................ 32 3.5.2 Logistic Regression .............................................................................................................. 32 3.5.3 Multilayer Perceptron ......................................................................................................... 33 3.5.4 Sequential Minimal Optimization for Regression (SMOreg)............................................... 33 3.5.5 Overall Comparison............................................................................................................. 33 Conclusion: .......................................................................................................................................... 35 Bibliography ................................................................................................................................................ 36 Page 3 of 37 Website Popularity Prediction (Assignment 2) Figure 1: The Impact of Online Membership Registration Form on the Popularity of the Website ............ 6 Figure 2 The Concept of Multilayer Perceptrons (NEURAL NETWORK, 2014)............................................ 13 Figure 3: The segmentation system provided by Nielsen Caritas based on American zip code to describe the lifestyle in each area in USA (Ngai, Xiu, & Chau, 2009) ........................................................................ 15 Figure 4 Flowchart of System Design .......................................................................................................... 17 Figure 5 Weka ............................................................................................................................................. 19 Figure 6 Linear Regression (Listening Music Website Popularity) .............................................................. 22 Figure 7 Logistic Regression (Listening Music Website Popularity) ............................................................ 22 Figure 8 Multilayer Perceptron (Listening Music Website Popularity) ....................................................... 23 Figure 9 SMOreg (Listening Music Website Popularity) ............................................................................. 24 Figure 10 Linear Regression (Prediction of Fitness Website Design).......................................................... 24 Figure 11 Logistic Regression (Prediction of Fitness Website Design)........................................................ 25 Figure 12 Multilayer Perceptron (Prediction of Fitness Website Design) .................................................. 26 Figure 13 SMOreg (Prediction of Fitness Website Design) ......................................................................... 26 Figure 14 Linear Regression (Popularity Prediction of E-commerce Website) ........................................... 27 Figure 15 Logistic Regression (Popularity Prediction of E-commerce Website) ......................................... 27 Figure 16 Multilayer Perceptron (Popularity Prediction of E-commerce Website).................................... 28 Figure 17 SMOreg (Popularity Prediction of E-commerce Website) .......................................................... 29 Figure 18 Linear Regression (Popularity Prediction of Education Website) ............................................... 29 Figure 19 Logistic Regression (Popularity Prediction of E-commerce Website) ......................................... 30 Figure 20 Multilayer Perceptron (Popularity Prediction of E-commerce Website).................................... 31 Figure 21 SMOreg (Popularity Prediction of E-commerce Website) .......................................................... 31 Page 4 of 37 Website Popularity Prediction (Assignment 2) 1 Introduction 1.1. Research Background Data mining is mainly concerned with finding new and interesting patterns from a large database. Data mining is also referred as knowledge discovery in database. It helps in predicting behaviours and future trends. This helps in making effective and efficient decisions that are based on knowledge. The contribution of four different research in the prediction of website design and popularity are as follows: 1.1.1 Prediction of Fitness Website Design The prediction of fitness website design contains information about the design factors of 50 fitness websites. These websites are located in USA, UK and NZ. The main aspect of this research is to determine the significant factor in designing fitness website design. In addition, whether these factors will affect the popularity of the fitness website. Several factors related to fitness website design have gathered including online member registration, online contact form, Google map function, search function, and background image. The research investigated whether each fitness website has these factors or not. Additionally, the research looked at the overall design of each fitness website, each website has given a percentage related to overall design. Information gathered about well-designed websites to determine the design rating as a percentage to each website. This information included the effective use of website contents, layout design, width and height of a website page, design of text and design of the image (Plumley, 2010) (MacDConald, 2011). The conclusion of this research was that having online membership registration increases the popularity of the website, see figure 1. Additionally, Out of 50 fitness website, 33 of them have online membership registration. The popularity of each fitness website is done as percentage. Page 5 of 37 Website Popularity Prediction (Assignment 2) Impact of Online Membership Registration Form on Popularity 1.2 1 0.8 0.6 0.4 0.2 0 120.00 100.00 80.00 60.00 40.00 20.00 0.00 Membership registration Popularity Figure 1: The Impact of Online Membership Registration Form on the Popularity of the Website 1.1.2 Popularity Prediction of E-commerce Website The topic chosen for assignment 1 was very interesting and really quite enlightening insights were developed during the commencement of it. The topic for assignment 1 was the “Website Popularity Prediction”. Information technology has really changed this World and has converted into a miraculous wonder. Internet is a very significant key in the progress of enterprises we see today and the essence of the topic was to apply the concepts of data mining and trying to figure out the popularity of numerous websites. A sample of 50 E-commerce websites was taken and then numerous attributes were assigned to each website such as “country”, “loading-time”, “home-page size”, “Ranking”, “linking websites” and some more of them. The major source of data collection was “Alexa.com” which is one of the largest web analytics company owned by the industry giant “Amazon” itself. Another website “pingdom.net” was used to retrieve the “response time” of the websites. The major column “popularity” was the dependent variable in the dataset which was totally depending upon the independent variable “Alexa’s ranking”. Using, the specific column a scale was formulated which filled the values in the column of popularity. Numerous, techniques of data pre-processing and data transformation were applied to the complete dataset to remove the anomalies and keep the data in consistent form. 1.1.3 Popularity Prediction of Music Website The objective the dataset to gather values from 51 sites in order to discover the popularity among music websites. To do so, data is collected based on five factors, which include: Loading Time: measuring the site page loading time by using SEO tools. Quality Links: identifying the number of dead-links on the page by using SEO tools. Page 6 of 37 Website Popularity Prediction (Assignment 2) Compatibility: identifying the percentage of devices that can be displayed on the website. Social Networks: identifying the website that is supplied with social means. Site Ranking: identifying the number of traffic on the website. 1.1.4 Popularity Prediction of Education Website The dataset collected for the present work consists of 50 educational websites. The dataset developed for these websites also provides rating against each field in order to analyze the level of compatibility, popularity, quality links, trust, website traffic and share content. The rating has been provided on the basis of users’ visit on these websites. The data collected for the present work has been properly labeled and presented in an excel sheet. The data has been gathered from the different websites of the UK, USA and New Zealand. These 50 selected websites serve as representatives of large number of educational websites available in World Wide Web. 1.2. Data Mining Applications A literature review contains data mining applications for retails, telecommunication, banking and website popularity prediction. These data mining applications will be as follows: 1.2.1. Retail Data Mining Applications Nowadays retailer is facing rapidly change and competitive environment among other retailers (Ramageri & Desai, 2013). In addition, retailers are competitive in producing a better and effective market for their products. Every day, retailers collect large amount of customers’ information, product information and purchases information. This information can be converted to discover a knowledge which will assist in making better business decision (Ramageri & Desai, 2013). Retailers are seeking to target right customers who will definitely buy their products. Data mining will assist in predicting customers’ purchases behaviour in the futures which will increase retails’ productivity (Ramageri & Desai, 2013). Some Retails’ data mining applications will be as followed: 1. Acquiring and Retaining Customer Retailers are seeking to keep their customers rather than getting a new customer therefore, retailers can convert the customer purchases details to knowledge by looking at customers’ purchases behaviour in the past months (Ramageri & Desai, 2013). This will assist retailers to predict customers’ need and interest in buying a specific product. Additionally, this prediction will assist retailers to keep their customers by offering discount for a particular product which will attract customers to buy their desire products (Ramageri & Desai, 2013). Page 7 of 37 Website Popularity Prediction (Assignment 2) 2. Market Basket Analysis Market basket analysis is a way to understand what kinds of products are likely to be purchased together (Ramageri & Desai, 2013). Understanding customers’ purchases behaviour will assist retailers to know customers need in the future and develop store’s layout. Retailers can predict the two products that are most frequently purchased together using data mining association rule task (Ramageri & Desai, 2013). For instance, retailers performed data mining association rule task and found out that a customer who buys razors are likely to buy shaving gel. According to this information, these products can be together in one shelf which will attract customers to buy these items together next time. Products association will enable retailers to design an effective and attractive shelves (Ramageri & Desai, 2013). In addition, this will assist customers to locate products easily and will assist retailers to provide discount for two specific products. 3. Customer segmentation and target market Customers’ segmentation is significant for retailers that assist to divide the market into many parts based on the customers’ demand. Data mining allows retailers to group and cluster customers based on their age, gender, spending habit and interests (Ramageri & Desai, 2013) (Data Segmentation, n.d). For instance, customers’ segmentation allows retailers to define the most profitable groups of customers therefore this will allow retailers to focus more in keeping these groups of customers which they are more valuable to their business. 1.1.5 Telecommunication Data Mining Applications Data mining is being effectively used in the telecommunication sector in increasing the customer satisfaction, overall process improvement and increasing the effectiveness and efficiency of the sector in unimaginable ways. The telecommunication industry was one of the first to handle data mining improvement. This is likely on the grounds that telecom affiliations routinely consumed and store goliath measures of unfathomable information, have an enormous client base, and work in a quickly changing and to a great degree nature's turf. Telecom affiliations use data mining to overhaul their publicizing endeavors, perceive contortion, and better deal with their telecom structures. Obviously, these affiliations also go up against distinctive data mining inconveniences by virtue of the tremendous size of their information sets, the persistent and regular parts of their information, and the need to foresee especially remarkable occasions, case in point, client twisting and system abnormalities. 1. Generating Intelligence from Customer Data The praise of data mining in the telecommunications industry could be seen as an improvement of the utilization of master structures in the information trades industry (Liebowitz, 1988). These systems were made to address the unusualness joined with keeping up an enormous structure framework and the need to open up system dependable quality while minimizing work costs. Page 8 of 37 Website Popularity Prediction (Assignment 2) The data mining applications for any industry rely on upon two variables: the information that are open and the business issues confronting the business. This area gives stronghold data about the information kept up by information trades affiliations. The inconveniences associated with mining telecom information are in like way delineated in this segment. Telecom affiliations keep up information about the telephone calls that investigate their systems as call unpretentious segment records, which contain interesting information for each one telephone call. In 2003, “At&t” long parcel clients made in plenitude of 300 million call reason for speculation records for reliably and, in light of the way that call unnoticeable segment records are kept online for a few months, this surmised that billions of call motivation behind premium records were quickly accessible for information mining (Cortes & Pregibon, 2001). Call motivation behind speculation information is helpful for advertising and intimidation recognition applications. Telecom affiliations likewise keep up clearing client data, case in point, charging data, and data got from outside social issues, for example, cash related assessment data. 2. Managing Network Data Telecom systems are amazingly bewildering approaches of apparatus, incorporated a gigantic number of interconnected areas. Each system fragment is fit for making fumble and status messages, which prompts a tremendous measure of schema information. This information must be secured and broke down to help framework organization limits, for instance, issue imprisonment. This data will irrelevantly join a timestamp, a string that uncommonly recognizes the supplies or programming fragment delivering the message and a code that clears up why the message is persistently delivered. For case, such a message may show, to the point that "controller 7 fulfilled a mishap of energy for 40 seconds starting at 12.03 pm on Monday, July 12." Due to the tremendous number of framework messages delivered, experts can't in any capacity, shape or structure handle each one message. Thus ace systems have been made to subsequently analyze these messages and take fitting movement, simply including an expert when an issue can't be subsequently decided (Weiss, 1998). As was the circumstances with the call unpretentious component data, framework data is also created in ceaseless as a data stream and must often be compacted in order to be supportive for data mining. This is often satisfied by applying a period window to the data. For example, such a summary may show, to the point that a supplies section fulfilled twelve events of a power difference in a 10-minute period. 1.1.6 Banking Data Mining Applications As per Mota Soares (2008), data mining in the banking sector assists in marketing, management of risk, Customer Relation Management (CRM), acquisition and retention of customers. The Page 9 of 37 Website Popularity Prediction (Assignment 2) introduction of technology in the banking sector has made it technologically strong and customer oriented. In the past few years, the data collected by banks has been increasing at a fast pace. Data mining helps in managing large data such that this facilitates banks to analyze information in context to financial profile of customers. This not only helps in enhancing customer relations, but also in developing a customized pricing policy (Soares, 2008). As stated by Goriness (2011), data mining is useful for the banking sector as it helps in the clear identification of different customer segments. It helps in the marketing of cards and, thereby, helps in enhancing profitability. Data mining also helps banks in the effective implementation of retention and acquisition programs and in product development (Gorunescu, 2011). 1.1.7 Website Popularity Prediction Nowadays, many applications have relied on the web services to promote their services such as business to business and e-commerce applications. The transformation into the web application framework has led to increasing of the data stored on the Web significantly. There is a valuable resource for revealing information about the browsing behaviour of website visitors by using logs recording to Web server access in Web mining. Due to the large amount of accesses, the volumes of logs that can be gathered produce a worthwhile source for web mining. Web services turn into more essential to supply a scalable infrastructure of registries that grants tools for end-users and developers to achieve knowledge discovery. Business Applications Data mining is being used widely in business intelligence due to the development of algorithms that using for data mining and efficient processors. Web applications in this field can be applied to predict cost and savings, and monitor the performance of services. 1. Cost and Savings Prediction The model aids businesses to evaluate the costs in case of the functionalities are needed. It can be provided with the previous data, which help planning businesses to use Web services by evaluating the cost and predicting the feasibility. 2. Performance Monitoring The environmental business where the major goal to reduce costs by stretching resources. Tasks need to be prioritised and observed by identifying the services usage pattern, and preparation software on grouping of services with comparable usage patterns can be improved. This grants to monitor the services at specific times in order to gain profundity knowledge in exact services. Web server logs where time-series are generated by the following steps: All data entries that contain services URL from web server log are selected and extracted. Page 10 of 37 Website Popularity Prediction (Assignment 2) Data entries grouping and time ordering by IP addresses and Web services, which leads to client interaction with the services. Identifying the client sessions separately by computing the time during each interaction. Finally, compute the number of clients for each service that using the service at defined time intervals. 1.2 Data Mining Tasks There are seven common data mining tasks that organizations can use to discover a knowledge in their large data sets. It is recommended for organizations to choose a particular data mining task that is appreciate for their business requirements (Ngai, Xiu, & Chau, 2009). For instance, retailers might choose clustering data mining task to be able to discover the most profitable groups of customers based on their age, gender, and spending habit. There are several data mining software that are available to preform data mining tasks such as Weka and NueCom software. The seven data mining tasks will be as follows: 1.2.1 Classification Classification is a very important data mining technique and computational methods and being used from quite a while now almost in every application of data mining. Classification comprises of foreseeing a certain result focused around a given information. Keeping in mind the end goal to anticipate the result, the calculation forms a preparation set containing a set of properties and the separate result, normally called objective or forecast quality (Freitas, 1997). The calculation tries to find connections between the traits that would make it conceivable to anticipate the conclusion. Next the calculation is given an information set not seen in the recent past, called expectation set, which contains the same set of properties, with the exception of the expectation property – not yet known. The calculation investigates the data and produces a forecast. The forecast correctness characterizes how "great" the calculation is. The process of classification is also known as supervised learning. Basically, we are given a specific set of database objects and these objects most of the times are labelled with various attributes. It is required that a model could be formulated that separates them easily into predefined categories or classes. There are many methods of doing classification but the most widely used in the applications of data mining would be decision trees, rule based classification, k-means, ID3 algorithms etc (Buja & Lee, 2001). Classification comes in very handy with all applications of data mining but especially when it comes to fraud detection. 1.2.2 Regression Fundamentally regression is a core concept of statistics but as data mining is also based upon the core values of statistics so both linear and multiple regression computational methods are Page 11 of 37 Website Popularity Prediction (Assignment 2) also used in the process of data mining in its various applications. Regression analysis is the process of determining how a variable y is related to one, or more, other variables x3,…., xk. The y is usually called the dependent variable and the xi’s are called the independent or explanatory variables (Buja & Lee, 2001). Multiple linear regression is highly applicable in data mining, the usual scenario’s where it is used in the process of data mining are historical activity patterns of customers based upon their credit cards usage with specific demographics and other significant traits, predicting the time of failure of equipment based upon utilization and environment conditions, predicting expenditure on vacation travel based on historic frequent flier data, products sales information, staffing management and predicting the impact of discounts on sale in retail stores and many other similar scenarios and applications of data mining (Buja & Lee, 2001). Regression is also referred to as supervised learning. There are many benefits of regression such as it reduces the error probability for pilot errors secondly, if predictors are known then the inference deduction is pretty easy and simple. Linear approximation is pretty good even if the right or correct answer is not linear. Regression presents a very strong framework for studying other methods. 1.2.3 Prediction One of a data mining methods is prediction, which is the purpose to find a relationship between independent variables with themselves as well as with dependent variables. For example, in order to predict a future marketing for certain business, there should be considered independent variable, which is represented by sale and dependent variable which could be profit. Then Applying regression curve depend on the profit data and the historical sale to predict the profit. Prediction technique can combine with other techniques in data mining, includes classification and analyzing trends. Numeric prediction is the assignment of predicting ordered values for supplied input. Regression analysis is used to evaluate the relationships between a dependent variable and one or more independent variables. It illustrates the condition of the dependent variable in the term of the variation of independent variables. Regression analysis is extensively used for forecasting and exploring the forms of the variables relationship. It suitable option in term of the predictor variables are also ordered value. Data mining prediction is classified into two methods as follows: Linear Regression It tries to form the relationship within two variables in order to monitored data by applying a linear equation. One of those variables is supposed to be a criterion variable where is used as explanatory variable, whereas the other is supposed to be a predictor variable. Simple regression is called when one predictor variable available on the prediction procedure. A regression line is the straight line where located through best-fitting on the points. Page 12 of 37 Website Popularity Prediction (Assignment 2) Nonlinear Regression Nonlinear regression is one of the shapes of regression analysis, data fitted as a mathematical function. In this model a curve line is created where values in the criterion variable was distributed randomly. Logarithmic functions is used to produce small summation of the squares in nonlinear regression. Multilayer Perceptrons: Neural networks refer to the process of information on the way that brain process information (Mu-sigma.com, 2014). Neural networks have been used in real world problems that are complex to solve. For instance, it has been used in fraud detection, voice recognition, financial forecasting, facial recognition and in other areas (Mu-sigma.com, 2014). The most well-known neural network model is called Multilayer perceptrons (MLP). These Multilayers are shown in figure 1. MLP need to have input data (numbers) in order to generate the desired output (Mu-sigma.com, 2014). The goal of MLP is to appropriately map the input data to the output so these input data can produce the desire output. MLP is used algorithm called backproagation algorithm (Mu-sigma.com, 2014). The process of backproagation algorithm is that the input data is presented to the neural network many times until the desired output is known that is known as training (NEURAL NETWORK, 2014). Figure 2 The Concept of Multilayer Perceptrons (NEURAL NETWORK, 2014) Sequential Minimal Optimization for Regression (SMOreg): Sequential Minimal Optimization (SMO) algorithm is one of the effective methods for training the Support Vector Machines (SVM), it can solve the regression problems. In this regression, the input value is first mapped onto multi-dimensional feature space using some fixed mapping i.e. nonlinear mapping, after that a linear model is constructed in this feature space. This regression Page 13 of 37 Website Popularity Prediction (Assignment 2) will perform using insensitive loss, and parallel, it will try to reduce the complexity by minimizing the risk. The aim is to find the maximum distance from any one point in the training dataset (decision boundary among two classes). (Chou, Chiu, Farfoura, & Al-Taharwa, 2011) 1.2.4 Clustering Cluster is a collection of information that similar with each other such as a collection of similar records, observations and cases (Ngai, Xiu, & Chau, 2009). Clustering is completely different from classification and prediction tasks in which the data mining clustering task does not attempt to classify and predict the value of target variables (Ngai, Xiu, & Chau, 2009). The purpose of data mining clustering task is to segment the whole date into similar clusters. For instance, Nielsen Caritas is clustering business. Among their services they used a segmentation technique based on American zip code to be able to define each geographic areas in USA (Ngai, Xiu, & Chau, 2009). Additionally, this segmentation is used to define the lifestyle for each area, see figure 2. For instance, the clusters for zip code 90210 (Beverly Hills, California) are 01, 03, 04, 07 and 16 cluster (Ngai, Xiu, & Chau, 2009). Clustering can assist small organization with small marketing budget to target market of a niche product (Ngai, Xiu, & Chau, 2009).In addition, clustering can be used in accounting to segment the financial behaviour into natural and suspicious groups. Page 14 of 37 Website Popularity Prediction (Assignment 2) Figure 3: The segmentation system provided by Nielsen Caritas based on American zip code to describe the lifestyle in each area in USA (Ngai, Xiu, & Chau, 2009) As shown in figure 3, the wealthiest lifestyle in America is defined as 01 Upper Crust (Ngai, Xiu, & Chau, 2009). 1.2.5 Association Rule The association rule task refer to the process of finding which attributes can go together (Ngai, Xiu, & Chau, 2009). In other words, the task of association rule attempt to find the relationship between two or more attributes. It is commonly used to analysis and predict customers’ purchases behaviour (Ngai, Xiu, & Chau, 2009). Additionally, it plays a significant part in market basket analysis, shelves designing and product clustering. The association rule has two part, the first part is called antecedent in which the item is found in the data and the second part is called consequent in which the item is found as combination with antecedent (Ngai, Xiu, & Chau, 2009) (Rouse, n.d). For instance, a particular supermarket found out from analysing customers’ purchases behaviour, that customers had bought dozen of eggs and milk together therefore, this supermarket can offer discount for these particular products to attack more customers in buying them. Additionally, this knowledge can assist the supermarket in designing shelves for these products to allow the customers to locate them easily. Page 15 of 37 Website Popularity Prediction (Assignment 2) 1.2.6 Summarization As per Han & Kamber (2011), summarization refers to the key concept of data mining that involves techniques for determining a compact description of a given database. Summarization methods comprise of the tabulation of mean and standard deviation. This is used for the purpose of data analysis, generation of automated reports and data visualization (Han, Kamber, & Pei, 2011). According to Tan (2012), compressing of a given set of data into smaller sets of pattern in order to obtain maximum information from the given database. It is the process of reducing data in a meaningful way. Summarization helps in the effective management of database. It reduces the complexity present in large databases and makes it easy to manage. It also helps in efficient and effective visual analysis. Various techniques are used to perform summarization, and are designed in a way that it facilitates automated exploration of unprocessed data (Tan, 2012). 1.2.7 Outlier Analysis According to Maimon & Rokach (2006), data objects that do not confirm and are different from the general behaviour of the data set are known as outliers. Outliers help in fraud detection by indicating the activities that can lead to fraud. The analysis of outlier is one of the important data mining tasks. Outliers are also defined as data elements that are different from other observations gathered. They also create doubt in context to the mechanism from which they are obtained (Maimon & Rokach, 2006). As per Perner (2009), outliers refer to abnormalities or anomalies in the data mining. There are three types of outliers namely, global, collective and contextual. Various approaches are used to detect outlier, such as distance-based approach. This approach takes into consideration full dimensional distance metric between two elements of data within high information space to find outliers. Outliers breach the normal mechanism for generating normal data. Outlier helps in the detection of credit card and other financial frauds (Perner, 2009). Page 16 of 37 Website Popularity Prediction (Assignment 2) 2 Methodology 2.1 Flowchart of System Design Figure 4 Flowchart of System Design Page 17 of 37 Website Popularity Prediction (Assignment 2) 2.2 Explanation of System Design: 2.2.1 System Module A: Extraction: The data is extracted from the excel files, which contain website loading time, response time, ranking, country and other attributes to determine the popularity of the websites. Popularity results are calculated on the basis of various web resources. Our system of extraction will compose of following sources. Archived data consisting of excel files. Website link data (www data). IT user researched data. 2.2.2 System Module B: Data Server Module After getting the data from various sources we need to store it to some temporary database, so a temporary database know as data server is used, which contain the data files so that they can be used in transformation. 2.2.3 System Module C: Data Pre-processing Module Data collected from different sources need to be processed before loading it in to the data warehouse. As the data is obtained from three different sources as discussed above, there are chances that the data is incomplete, noisy, dirty, inconsistent and redundant. This type of data is of no use to get the quality results, so there is a need to process the data to produce the quality results. Data ware house needs to consistent and its consistency highly depend upon the quality of pre-processing. Following steps are performed for this module. Data Cleansing: In this step we identifying the data that is incomplete, incorrect, inaccurate, irrelevant, etc. parts of the data and then replacing, modifying, and delete this data. For data cleansing in excel we will be using the Data Cleansing option from XLTools. Data Integration Module: The data collected from different sources need to be integrated for transformation process. This step is done because you cannot implement different system for various extraction sources. Data Transformation: The data transformation is included inside this data processing module, data transformation will transform the URL in to specific format that is consistent for the system similarly all other entities data is transform in to specific format to make them consistent. Similarly the unit and other rate would be fixed in a specific format in order to avoid inconsistency. Data Reduction: In this step the excel file is checked by the XL tool. The remove redundant option will be used to eliminate the data that is redundant. Page 18 of 37 Website Popularity Prediction (Assignment 2) 2.2.4 System Module D: Data Mining The third step which involves the business logics, the data that has been cleaned, integrated and transformed need to be used for some computational need in order to meet the business needs of the system design. One of the main business need we want for our system is to know the popularity of the websites. The steps that we will follow in this system of data mining module are Classification: (Decision tree): (Using Weka): If we want to know the about the individual site behaviour, we use the classification approach i.e. if we want to know whether a certain site has how much popularity and which country visit it most, all this will be done using the weka software, which will show the tree model at the end showing the popularity trend in behaviour. The data that can be accessed by weka should be in format understandable by weka i.e. *.arff, *.arff.gz etc. Figure 5 Weka The classify option is used while doing classification Clustering: The system will use the clustering techniques when the user will demand the cluster or group of sites which are most popular. Our system will use weka software, the excel file will be converted into system understandable format. By using weka explorer we will be able to make cluster, using the cluster option in weak as shown in above diagram. Association Rule Mining: The system will use association use mining when we need to know that, when a popular site is visited which next site the user opens, this is all done by association rule mining. These all steps are defined in weka explorer option as shown above. Page 19 of 37 Website Popularity Prediction (Assignment 2) 2.2.5 System Module E: Loading into Warehouse: When we extract and transform the data, the data is send to the data ware. The data ware is designed in such a way that most visited entities are group together as one table in order to decrease computational cost. 2.2.6 System Module F: Visualization the Data: In this system module, we design the visualization in such a way that user business demand should be fulfilled, for the excel sheet data, if the business need is popularity then the module of popularity will be displayed, which will help to see popularity, its association, classification and clustering etc. 2.3 Difficulties and Solutions: 2.3.1 Extraction: It is very difficult to extract the data, we cannot just extract the data from one source. Extracting data from one source and then concluding results will result in inaccurate results. The data gathering and authenticity of data is always been a problem for extraction. Solution: The solution to above problem is that the data should be gathered from multiple sources, and the sources should be authentic. During gathering of the dataset of excel, we gathered multiple site data to make it authentic. 2.3.2 Transformation Difficulties: Same data Different Representation: One of the difficulty we need to address is that often the data is same but has two different representation, like date, website name etc. Solution: The solution to this is that representation should be changed by applying a techniques which will depict that both the representations are same. Domain Value Redundancy: Some time the unit used is misinterprets by the customer and he changes the measure of unit, resulting in inconsistent data. Solution: The unit should be fixed and before entering, a screen should be displayed to every user which tells the unit used in the system. Non-atomic Data Fields: The non-atomic field is just like address problem, there is no specific format of address, where the street is no, house number and name. Solution: The system should be implemented by which we applying a techniques which will depict that what is street no m name etc. Page 20 of 37 Website Popularity Prediction (Assignment 2) Classification: While doing classification we have dense data, resulting in classification computation cost. The classification is time consuming and has high computation cost. The classification module is used to determine the pattern and details of any single entity value. Solution: use this module only when it is very much needed like when you need to the detail of some attribute value really important for business prospective. Clustering: In clustering we often found outliers and these outliers hinges the computational results, hence this problem should be addressed, like in excel some of the dataset fall outside the cluster. We eliminated the outliers that were in dataset to achieve the correctness. 2.3.3 Loading Issue: Full Data Refresh: When we will load the excel data in blocks in data ware, there should be guarantee that there should not be any change in the current values of excel block that we are loading in data ware and also the table should be empty. Incremental Data Refresh: When we will load the excel data in blocks in data ware, there should be guarantee that there should not be any change in the current values of excel block and the data ware already had this table in ware house. Trickle/Continuous Feed: While continuously updating the row and applying insert delete operations, be careful that this may cause the system functions to slow down. Solution: In order to avoid this we should only do the operations important for data ware and do the complete block loading or incremental only when all operation of the data ware and system are in low or closed in order to avoid slowness and any insert update problem. Page 21 of 37 Website Popularity Prediction (Assignment 2) 3 Experiments and Results Discussion 3.1 Listening Music Website Popularity When the outcome, or class, is numeric, and all the attributes are numeric, linear regression is a natural technique to consider. 3.1.1 Linear Regression Figure 6 Linear Regression (Listening Music Website Popularity) Correlation Coefficient The correlation coefficient measures is nearly close to one, which means perfect statically correlation. Every positive increase of 1 in one variable, there is a positive increase of 1 in the other. Errors In all above error measurements, a lower value means a more precise model, with a value of 0 depicting the statistically perfect model. 3.1.2 Logistic Regression Figure 7 Logistic Regression (Listening Music Website Popularity) Page 22 of 37 Website Popularity Prediction (Assignment 2) The accuracy is around 92%, which represent of 47 out of 51 records. The confusion matrix shows that it was rather better at classifying examples of class ‘popular’ than class ‘unpopular’. Only 2 errors were made in 31 examples of class a, whereas 2 errors were made in only 16 examples of class b. This difference is reflected in the Precision and Recall measures for the two classes. The Recall for ‘popular’ is high (0.939) reflecting the fact that there were only 2 false negatives: ‘popular’ examples classified as ‘unpopular’. The Precision for ‘popular’ is lower (0.939) reflecting the fact that there were 2 false positives: i.e. ‘unpopular’ examples classified as ‘popular’. 3.1.3 Multilayer Perceptron The perceptron computes a single output from multiple real-valued inputs by forming a linear combination according to its input weights and then possibly putting the output through some nonlinear activation function. Multilayer perceptron is closely related to logistic regression. Figure 8 Multilayer Perceptron (Listening Music Website Popularity) The accuracy is almost perfect with 90.2%, which represent of 46 out of 51 records. The interrater agreement statistic (Kappa) has moderate of level of agreement. The confusion matrix shows that it was rather better at classifying examples of class ‘popular’ than class ‘unpopular’. Only 2 errors were made in 31 examples of class a, whereas 3 errors were made in only 15 examples of class b. This difference is reflected in the Precision and Recall measures for the two classes. The Recall for ‘popular’ is high (0.939) reflecting the fact that there were only 2 false negatives: ‘popular’ examples classified as ‘unpopular’. The Precision for ‘popular’ is lower (0.932) reflecting the fact that there were 3 false positives: i.e. ‘unpopular’ examples classified as ‘popular’. Page 23 of 37 Website Popularity Prediction (Assignment 2) 3.1.4 Sequential Minimal Optimization for Regression (SMOreg) SMOreg deals with missing values and to transform nominal attributes into binary ones. Figure 9 SMOreg (Listening Music Website Popularity) Correlation Coefficient The correlation coefficient measures is nearly close to one, which means perfect statically correlation. Every positive increase of 1 in one variable, there is a positive increase of 1 in the other. Errors In all above error measurements, a lower value means a more precise model, with a value of 0 depicting the statistically perfect model. 3.2 Prediction of Fitness Website Design 3.2.1 Linear Regression Figure 10 Linear Regression (Prediction of Fitness Website Design) Correlation Coefficient The correlation coefficient measures is one, which means the perfect strong statically correlation. Every positive increase of 1 in one variable, there is a positive increase of 1 in the other. Errors In all above error measurements, a lower value means a more precise model, with a value of 0 depicting the statistically perfect model. Page 24 of 37 Website Popularity Prediction (Assignment 2) 3.2.2 Logistic Regression Figure 11 Logistic Regression (Prediction of Fitness Website Design) The accuracy is around 98%, which represent of 49 out of 50 records. The confusion matrix shows that it was rather better at classifying examples of class ‘unpopular’ than class ‘popular’. There are no errors were made in 47 examples of class b, whereas 1 error were made in only 3 examples of class a. This difference is reflected in the Precision and Recall measures for the two classes. The Recall for ‘unpopular’ is high (3.000) reflecting the fact that there were not any false negatives: ‘popular’ examples classified as ‘unpopular’. The Precision for ‘popular’ is lower (1.000) reflecting the fact that there were 3 false positives: i.e. ‘unpopular’ examples classified as ‘popular’. Page 25 of 37 Website Popularity Prediction (Assignment 2) 3.2.3 Multilayer Perceptron Figure 12 Multilayer Perceptron (Prediction of Fitness Website Design) The accuracy is almost perfect with 96%, which represent of 48 out of 50 records. The interrater agreement statistic (Kappa) has weak of level of agreement. The confusion matrix shows that it was rather better at classifying examples of class ‘unpopular’ than class ‘popular’. There is no error were made in 47 examples of class b, whereas 2 errors were made in only 1 example of class a. This difference is reflected in the Precision and Recall measures for the two classes. The Recall for ‘unpopular’ is high (3.000) reflecting the fact that there not any false negatives: ‘unpopular’ examples classified as ‘popular’. The Precision for ‘unpopular’ is lower (0.959) reflecting the fact that there were 2 false positives: i.e. ‘popular’ examples classified as ‘unpopular’. 3.2.4 Sequential Minimal Optimization for Regression (SMOreg) Figure 13 SMOreg (Prediction of Fitness Website Design) Correlation Coefficient The correlation coefficient measures is one, which means the perfect strong statically correlation. Every positive increase of 1 in one variable, there is a positive increase of 1 in the other. Errors Page 26 of 37 Website Popularity Prediction (Assignment 2) In all above error measurements, a lower value means a more precise model, with a value of nearly 0 depicting the statistically perfect model. 3.3 Popularity Prediction of E-commerce Website 3.3.1 Linear Regression Figure 14 Linear Regression (Popularity Prediction of E-commerce Website) Correlation Coefficient The correlation coefficient measures is nearly close the one, which means the perfect strong statically correlation. Every positive increase of 1 in one variable, there is a positive increase of 1 in the other. Errors In all above error measurements, a lower value means a more precise model, with a value of 0 depicting the statistically perfect model. But in this case we find the errors a bit high. 3.3.2 Logistic Regression Figure 15 Logistic Regression (Popularity Prediction of E-commerce Website) Page 27 of 37 Website Popularity Prediction (Assignment 2) The accuracy is around 94%, which represent of 47 out of 50 records. The confusion matrix shows that it was rather slightly better at classifying examples of class ‘unpopular’ than class ‘popular’. Only 3 error were made in 38 examples of class b, whereas 2 errors were made in only 29 examples of class a. This difference is reflected in the Precision and Recall measures for the two classes. The Recall for ‘unpopular’ is high (0.947) reflecting the fact that there were only 3 false negatives: ‘unpopular’ examples classified as ‘popular’. The Precision for ‘unpopular’ is lower (0.900) reflecting the fact that there were 2 false positives: i.e. ‘popular’ examples classified as ‘unpopular’. 3.3.3 Multilayer Perceptron Figure 16 Multilayer Perceptron (Popularity Prediction of E-commerce Website) The accuracy is almost perfect with 98%, which represent of 49 out of 50 records. The interrater agreement statistic (Kappa) has strong of level of agreement. The confusion matrix shows that it was rather better at classifying examples of class ‘popular’ than class ‘unpopular’. There is no error was made in 31 examples of class a, whereas 1 error were made in only 18 examples of class b. This difference is reflected in the Precision and Recall measures for the two classes. The Recall for ‘popular’ is perfect (3.000) reflecting the fact that there were not any false negatives: ‘popular’ examples classified as ‘unpopular’. The Precision for ‘popular’ is lower (0.969) reflecting the fact that there was one false positives: i.e. ‘unpopular’ examples classified as ‘popular’. Page 28 of 37 Website Popularity Prediction (Assignment 2) 3.3.4 Sequential Minimal Optimization for Regression (SMOreg) Figure 17 SMOreg (Popularity Prediction of E-commerce Website) Correlation Coefficient The correlation coefficient measures is nearly close the one, which means the perfect strong statically correlation. Every positive increase of 1 in one variable, there is a positive increase of 1 in the other. Errors In all above error measurements, a lower value means a more precise model, with a value of 0 depicting the statistically perfect model. But in this case we find the errors a bit high. 3.4 Popularity Prediction of Education Website 3.4.1 Linear Regression Figure 18 Linear Regression (Popularity Prediction of Education Website) Correlation Coefficient The correlation coefficient measures is 0.56, which means the moderate statically correlation. Errors In all above error measurements, a higher value means a weakness of precise model due to the highest of error values. Page 29 of 37 Website Popularity Prediction (Assignment 2) 3.4.2 Logistic Regression Figure 19 Logistic Regression (Popularity Prediction of E-commerce Website) The accuracy is 80%, which represent of 40 out of 50 records. The confusion matrix shows that it was rather better at classifying examples of class ‘popular’ than class ‘unpopular’. Only 4 errors were made in 23 examples of class a, whereas 6 errors were made in only 17 examples of class b. This difference is reflected in the Precision and Recall measures for the two classes. The Recall for ‘popular’ is high (0.823) reflecting the fact that there were only 4 false negatives: ‘popular’ examples classified as ‘unpopular’. The Precision for ‘popular’ is lower (0.793) reflecting the fact that there were 6 false positives: i.e. ‘unpopular’ examples classified as ‘popular’. Page 30 of 37 Website Popularity Prediction (Assignment 2) 3.4.3 Multilayer Perceptron Figure 20 Multilayer Perceptron (Popularity Prediction of E-commerce Website) The accuracy is moderate with 70%, which represent of 35 out of 50 records. The inter-rater agreement statistic (Kappa) has minimal of level of agreement. The confusion matrix shows that it was rather better at classifying examples of class ‘popular’ than class ‘unpopular’. There are 6 error were made in 21 examples of class a, whereas 9 errors were made in 14 examples of class b. This difference is reflected in the Precision and Recall measures for the two classes. The Recall for ‘popular’ is high (0.78) reflecting the fact that there were 6 false negatives: ‘popular’ examples classified as ‘unpopular’. The Precision for ‘popular’ is lower (0.700) reflecting the fact that there were 9 false positives: i.e. ‘unpopular’ examples classified as ‘popular’. 3.4.4 Sequential Minimal Optimization for Regression (SMOreg) Figure 21 SMOreg (Popularity Prediction of E-commerce Website) Correlation Coefficient The correlation coefficient measures is 0.40, which means the moderate statically correlation. Errors Page 31 of 37 Website Popularity Prediction (Assignment 2) In all above error measurements, a higher value means a weakness of precise model due to the highest of error values. 3.5 Compression 3.5.1 Linear Regression Music Fitness E-commerce Education root mean-squared error 1.12 0.0028 11.37 4.164 mean absolute error 0.16 0.0023 3.96 3.17 root relative squared error 6.16% 0.01% 31.89% 82.89% relative absolute error 0.99% 0.01% 11.75% 73.15% correlation coefficient 0.99 1 0.94 0.56 The result of four different prediction popularity are used a Linear Regression prediction techniques on a given dataset, that shown the prediction of fitness sites is the best regarding to all five metrics: it has the smallest value for each error measure and the largest correlation coefficient. Listening to music website prediction is the second best by all five metrics whereas the prediction of e-commerce website and education website are located on the third and fourth respectively. E-commerce has a strong value correlation with 0.94 but the overall of errors is a quite high. Education has moderate correlation coefficient and the highest overall errors. 3.5.2 Logistic Regression Correctly classified instances Recall Precision Music 92.15% Popular 0.93 Unpopular Popular 0.88 0.94 Unpopular 0.88 Fitness 98% 0.68 1 1 0.98 E-commerce 94% 0.94 0.95 0.97 0.90 Education 80% 0.85 0.74 0.79 0.81 The result of four different prediction popularity are used a Linear Regression prediction techniques on a given dataset, that shown the prediction of fitness sites is the best with 98 percent accuracy by less false and positive negatives. The second best prediction is E-commerce with 94 percent whereas Music and Education are located on the third and fourth respectively with 94 and 80 percent of accuracy. Page 32 of 37 Website Popularity Prediction (Assignment 2) 3.5.3 Multilayer Perceptron Correctly classified instances Kappa Recall Music 90.2% Fitness Precision 0.78 Popular Unpopular Popular 0.93 0.83 0.91 Unpopular 0.88 96% 0.48 0.33 1.00 1.00 0.96 E-commerce 98% 0.96 1.00 0.95 0.97 1.00 Education 70% 0.39 0.79 0.61 0.70 0.70 The result of four different prediction popularity are used a Multilayer Perceptron prediction techniques on a given dataset, that shown the prediction of E-commerce sites is the best with 98 percent accuracy by less false either positive or negatives and strong of level of agreement. The second best prediction is Fitness with 96 percent whereas Music and Education are located on the third and fourth respectively with 90.2 and 70 percent of accuracy. 3.5.4 Sequential Minimal Optimization for Regression (SMOreg) Music Fitness E-commerce Education root mean-squared error 1.21 0.08 11.2 5.16 mean absolute error 0.22 0.05 2.98 3.68 root relative squared error 6.64% 0.31% 31.40% 102.22% relative absolute error 1.38% 0.26% 8.83% 84.95% correlation coefficient 0.99 1 0.95 0.40 The result of four different prediction popularity are used a SMOreg prediction techniques on a given dataset, that shown the prediction of fitness sites is the best regarding to all five metrics: it has the smallest value for each error measure and the largest correlation coefficient. Listening to music website prediction is the second best by all five metrics whereas the prediction of e-commerce website and education website are located on the third and fourth respectively. E-commerce has a strong value correlation with 0.95 but the overall of errors is a quite high. Education has moderate correlation coefficient and the highest overall errors. 3.5.5 Overall Comparison Linear Regression Logistic Regression Multilayer Perceptron SMOreg Average Page 33 of 37 Website Popularity Prediction (Assignment 2) Music 0.99 92.15% 90.2% 0.99 0.95 Fitness 1 98% 96% 1 0.98 E-commerce 0.94 94% 98% 0.95 0.95 Education 0.56 80% 70% 0.40 0.61 Average 0.87 91.03% 88.55 0.85 0.87 Overall the prediction of Fitness websites is the best regarding to three regression methods. Fitness has strongest correlation coefficients in Linear Regression and SMOreg and strongest accuracy in Logistic Regression whereas the E-commerce has strongest accuracy in Multilayer Perceptron and second best accuracy in Logistic Regression. Music prediction locate in third position with second best correlation coefficients in Linear Regression and SMOreg. Education locate in forth position with lowest accuracy and correlation coefficient in all regression methods. However, Logistic Regression gives the best result prediction with average 91.03 accuracy Page 34 of 37 Website Popularity Prediction (Assignment 2) 4 Conclusion: The prediction of website popularity was performed against four data sets including Fitness website, Education website, Music website and E-commerce website data set. The designing system of website prediction popularity aims to determine which filed of website is appreciated for advertisement. Over all data sets have sort of high accuracy regarding to time series prediction methods (Linear regression, Logistic Regression, Multilayer Perceptron and Sequential Minimal Optimization). Additionally, most of the data sets have a low errors and high correlation. However, the collection of data sets contains only 50 objects which considers as small amount of objects regards to real life data mining. The high level of accuracy in data mining results will achieve with enormous data. Page 35 of 37 Website Popularity Prediction (Assignment 2) Bibliography Buja, A., & Lee, Y.-S. (2001). Data mining criteria for tree-based regression and classification. 27-36. Chou, J.-S., Chiu, C.-K., Farfoura, M., & Al-Taharwa, I. (2011, May). Optimizing the Prediction Accuracy of Concrete Compressive Strength Based on a Comparison of Data-Mining Techniques. Retrieved from Asce Library: http://ascelibrary.org/doi/abs/10.1061/%28ASCE%29CP.1943-5487.0000088 Cortes, C., & Pregibon, D. (2001). Signature-Based Methods for Data Streams. Data Mining and Knowledge Discovery, 167 - 182. Data Segmentation. (n.d). Retrieved from Cogent Analytic: http://www.cogentdataanalytics.com/glossary/data-segmentation.php Freitas, A. (1997). A Genetic Programming Framework for Two Data Mining Tasks: Classification and Generalized Rule Induction. Gorunescu, F. (2011). Data Mining: Concepts, Models and Techniques. Han, J., Kamber, M., & Pei, J. (2011). Data Mining: Concepts and Techniques: Concepts and Techniques. Elsevier. Liebowitz, J. (1988). Expert System Applications to Telecommunications. New York: John Wiley & Sons, Inc. MacDConald, M. (2011). Creating a Website The Missing Manual. United States: O'Reilly Media. Maimon, O., & Rokach, L. (2006). Data Mining and Knowledge Discovery Handbook. Springer Science & Business Media. Nayak, R. (2008). Data mining in web services discovery and monitoring. International Journal of Web Services Research, 5(1). pp. 62-80. NEURAL NETWORK. (2014). Retrieved from Mu Sigma: http://www.musigma.com/analytics/thought_leadership/cafe-cerebral-neural-network.html Ngai, E., Xiu, L., & Chau, D. (2009). Application of data mining techniques in customer relationship management. A literature review and classification, 36(2), 2592–2602. Perner, P. (2009). Machine Learning and Data Mining In Pattern Recognition. Springer Science & Business Media. Plumley, G. (2010). Website Design and Development: 100 Questions to Ask Before Building a Website. Canada: Wiley. Ramageri, B. M., & Desai, B. (2013). International Journal of Computer Science Engineering. Role of Data Mining in Retail Sector, 5(1), 47-50. Retrieved from http://www.ijcse.net/ Rouse, M. (n.d). Association Rules (in data mining). Retrieved from Tech target: http://searchbusinessanalytics.techtarget.com/definition/association-rules-in-data-mining Page 36 of 37 Website Popularity Prediction (Assignment 2) Soares, M. (2008). Applications of Data Mining in E-business and Finance. IOS Press. Tan, H. (2012). Knowledge Discovery and Data Mining. Springer Science & Business Media. Weiss, G. M. (1998). ANSWER: Network monitoring using object-oriented rule”. Proceedings of the Tenth Conference on Innovative Applications of Artificial Intelligence. Page 37 of 37