WHITE PAPER

GETTING AHEAD OF THE COMPETITION WITH DATA MINING

Ultimately, data mining boils down to continually finding new ways to be more profitable, which in today’s competitive world means making better and more accurate decisions faster than your competition. Data mining has a great affinity with the Six Sigma mindset and the two disciplines work well together. Both start out with an unknown outcome, both have improvement as the end goal, and both use well-established statistical methods. Whilst the end result does not guarantee you will double your sales or halve your production defects, you are very likely to find improvements and gain new insights into your business.

How much these insights are worth varies from industry to industry, but there is always the chance of finding highly valuable nuggets of gold. For example, if you are drilling for oil, knowing where to drill is hugely valuable given the cost of setting up. By contrast, the benefit of knowing why a product fails quality control, so that the causes can be mitigated, depends on the cost of rectification and the frequency of recurrence, which vary greatly from business to business.

The fact is that whilst we often use our in-depth business and industry knowledge to generate many great improvement ideas, you would have even more great ideas when presented with trends in your data that you were not aware of. For example, suppose that, as a delivery company, you discovered that the accident rate and delivery times were greatly affected by the driver’s age and by whether the route was mostly inner city or motorway. Would you not adjust which routes each driver is given? If you discovered that managers and team leaders were off sick a third as often as everybody else, would you change your sickness policy or find new ways to give people more responsibility and empowerment? Chances are you would take some form of action, but only in the light of this new information.
The other side to data mining is accuracy and measurability, the great example being forecasting. This typically means going beyond simple averages, linear regression and simple moving averages to weighted moving averages, exponential smoothing and time series analysis to forecast sales, staffing, stock levels and so on. These more advanced methods can take account of other attributes, clusters and seasonality, and have additional tuning inputs (such as the smoothing constant alpha) that can lead to significantly higher accuracy. As the best method varies from case to case, each can be implemented separately and then compared statistically, using measures such as mean squared error, mean absolute deviation and tracking signal, so that the most suitable forecast is used. If you’re not already using these methods, how much better would some of your decisions be if your estimates were that bit more accurate, and how much money would that save or make for your business?

Forecasting is also useful because the activity often uncovers hidden factors affecting your estimates that you were not aware of. For example, you might log competitor sales campaign dates and plot them against your own data. Competitors often stick to a schedule which, once known, can be very valuable to your forecasting.

In customer relationship management it’s critical to know your customer. Knowing your customer makes your marketing more effective, reduces churn, increases spend levels and ensures you are giving customers the service they want and expect. It also helps you exceed their expectations and evolve your offering. This is an area where data mining plays a major role and where nearly all the methods and tools play a part. Usually the first step in understanding your customers is to classify and cluster them.
For a retail company, a simple example would be classifying your customer base by age group, disposable income band and primary interest, and then clustering the combinations that naturally sit together. The resulting clusters are often given memorable names like the “silver surfer”. Using these clusters makes marketing to each group far easier, helps you focus on the right delivery channel strategy and can greatly increase conversion rates by being more targeted. Whilst this activity is often done manually, it becomes progressively harder as the number of influencing attributes grows. Data mining tools, however, will scan all your data and automatically recommend groupings based on a review of every field, which is also when it becomes a good idea to augment your internal data with external data.

The nice thing about clusters is that whilst you can’t change a customer’s gender or address, you can often influence other attributes and help move a customer to a new cluster of higher value. A great example is Amazon Prime. Once a customer joins Prime there is usually a big shift in lifetime value, shopping frequency, basket size, churn, ability to upsell and so on. Knowing the difference in value between the two clusters tells you the maximum you can afford to spend per customer to move them up to the higher-value cluster. However, bear in mind that it can also be effective to move customers down to a less profitable cluster if they look likely to leave, especially if you know before they consciously decide to do so.

But just how do you know they are likely to leave? The answer lies in understanding the data of every customer who has left in the past and then using data mining tools to automatically detect the influencing factors. There are several methods, but the simplest tool will give you a bar graph showing the relative importance of each factor.
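The factor ranking described above can be illustrated with a minimal sketch. It scores each attribute by how far the churn rate of any one of its values strays from the overall churn rate, then prints a crude text bar graph. The customer records, the field names (`contract`, `complaints`, `left`) and the scoring rule are all hypothetical, chosen only for illustration; real data mining tools use more rigorous statistics than this.

```python
from collections import defaultdict

# Hypothetical historical records; "left" marks customers who churned.
customers = [
    {"contract": "monthly", "complaints": "yes", "left": True},
    {"contract": "monthly", "complaints": "no", "left": True},
    {"contract": "monthly", "complaints": "yes", "left": True},
    {"contract": "annual", "complaints": "yes", "left": False},
    {"contract": "annual", "complaints": "no", "left": False},
    {"contract": "annual", "complaints": "no", "left": False},
    {"contract": "monthly", "complaints": "no", "left": False},
    {"contract": "annual", "complaints": "no", "left": True},
]


def churn_rate(rows):
    """Fraction of the given customers who left."""
    return sum(row["left"] for row in rows) / len(rows)


def factor_importance(rows, factors):
    """Score each factor by the largest gap between the churn rate of
    one of its values and the overall churn rate (an assumed, simple rule)."""
    overall = churn_rate(rows)
    scores = {}
    for factor in factors:
        groups = defaultdict(list)
        for row in rows:
            groups[row[factor]].append(row)
        scores[factor] = max(abs(churn_rate(g) - overall) for g in groups.values())
    # Most influential factor first.
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))


importance = factor_importance(customers, ["contract", "complaints"])
for factor, score in importance.items():
    print(f"{factor:<12}{'#' * round(score * 40)} {score:.2f}")
```

In this toy data the contract type separates leavers from stayers more strongly than the complaint history, so it is listed first.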
Once identified, these factors let you filter for the customers most likely to leave, so you’re able to take proactive action such as getting in touch and offering them a better deal, or a complimentary bottle of wine when things go wrong.

Just as important as knowing your customers is knowing your competitors. Today, with most businesses publishing their prices on the internet and public data available on web traffic, it’s easier than ever to profile your competitors and track how they are doing in comparison to your own efforts.

Some answers, however, you will never find because you just don’t have the data or it’s in a state that can’t be used. The unfortunate truth is that, from a Gartner perspective, most businesses have poor data quality, with most floating around levels 1 and 2 of their 5-level quality framework. This means your business is likely to be missing data and to hold a significant amount of incorrect data. Whilst poor data quality dilutes your analysis, it can be addressed with automated data cleansing via ETL, or by excluding the affected data entirely from the models you build. There are also data mining tools for filling in missing data and identifying outliers statistically, but the best solution is to have the right level of governance in place and a solid data strategy.

The biggest obstacle, however, is data that you don’t capture or that is simply not available. This is where a little creativity comes into play as you start to look at what data you can get and how likely it is to influence what you are investigating. For example, a business wants to open its next store but needs to pick the best possible location. It’s unlikely to already have good sales data on the area unless it sells online. From a data mining perspective, this is where it is useful to know as much as possible from as many angles as possible.
Additional data for the location might include population demographics, disposable income, education levels, number of household vehicles, density of competing businesses, university student populations, average house prices and so on. By linking all these factors to your own data and then letting the data mining tools discover the trends, the decision on location becomes more informed and the risk of a poor investment is reduced.

Another technique is to create derived metrics from the data you have in order to predict key supporting factors. Take the example of betting on a greyhound race. If you know that only 5% of dogs that are bumped finish first, or that 38% of the dogs that lead at the first corner go on to win, it soon becomes clear that it’s worth knowing the likelihood of these supporting factors. This is where you would create new metrics specifically to predict their likelihood: for example, the average trap the dog starts in, to see if collisions are more likely when it is in the wrong trap, or the ranked weight of each dog, to see if the lighter dog has the advantage in the initial sprint.

Other methods that support understanding your customer include association analysis, which is commonly used for basket analysis and is great at supporting a higher basket margin. This uses your customers’ shopping behaviour to learn which products they buy together. This learning can then be shared with other customers by grouping those products together on the shop floor or web page, which makes it easier to buy them as a group or bundle deal and also plants the idea of the combination in their minds.

Another method is called sequence analysis. This is often used to understand the click paths customers take when using a website, or to look at customer purchases over time. Knowing that a customer buys ink three months after buying a printer is useful, and a great reason to contact the customer.
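As a rough illustration of the association (basket) analysis described above, the sketch below counts how often pairs of products appear in the same basket and derives two standard measures: support (the share of baskets containing both items) and confidence (the share of baskets containing the first item that also contain the second). The baskets, the `pair_rules` helper and the 40% support threshold are assumptions made for this example, not output from any particular tool.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, one set of products per transaction.
baskets = [
    {"printer", "ink", "paper"},
    {"printer", "ink"},
    {"ink", "paper"},
    {"printer", "paper", "cable"},
    {"ink", "paper"},
]


def pair_rules(baskets, min_support=0.4):
    """Return (if_bought, then_bought, support, confidence) tuples for
    every product pair that appears in at least min_support of baskets."""
    item_counts = Counter(item for basket in baskets for item in basket)
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    total = len(baskets)
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / total
        if support < min_support:
            continue  # too rare to act on
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    # Strongest rules first: by confidence, then support, then name.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))


rules = pair_rules(baskets)
for if_bought, then_bought, support, confidence in rules:
    print(f"{if_bought} -> {then_bought}: "
          f"support {support:.0%}, confidence {confidence:.0%}")
```

Rules with high confidence are candidates for bundle deals or for placing the products side by side. Dedicated tools extend the same idea to larger item sets and add further measures.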
All these methods are easily accessible from the free Excel data mining add-in and the table analysis toolbars shown below. All you need is to be able to connect to an Analysis Services server.

Whilst this list of applications for data mining is far from exhaustive, hopefully you get the idea. Data mining brings to the table some very sophisticated mathematical techniques as well as some quite simple but clever methods. Consider the differences between the Naïve Bayes method, decision trees and neural networks. The Bayes method builds on the work of the Reverend Thomas Bayes, published in 1763 after his death, and grew out of his using marbles to count events. It is therefore quite simple but effective, as shown below in this bike buyer sample dataset.

Decision trees, which work by discovering the key pathways through your data, are more sophisticated and visual. They give you the probability of the desired outcome and colour-grade the pathways through the data, with darker shades indicating higher probability. This is my personal favourite as it can produce some great visual output that’s easy to understand. The example below shows the factors affecting a shopper’s likelihood of buying a bike. The best pathway shows that customers aged between 36 and 39 who don’t own a car have a 40.66% likelihood of buying a bike.

However, the most interesting is probably the neural network method, which was discovered as a by-product of mapping out how the human brain works. This method analyses all the possible relationships in your data and then, after combining all the attributes from the first pass, takes a second look at your data in a similar way. This is the equivalent of a chess player thinking two moves ahead. It is best suited to highly complex problems and is not something any report or cube can easily do. Below is the same bike buyer example but predicted using the neural network method, which, as you can see, is much more detailed.
At slicedbread, we use the Microsoft tools embedded in SQL Server and Office to support our clients with their data mining projects. Whilst many statisticians will prefer the more advanced and expensive SAS and SPSS solutions, the Microsoft tools cost nothing extra as long as you own SQL Server and Excel, they are user friendly, and Excel can handle more complexity than most typical businesses need. If you do need more advanced capabilities, you can always use the Data Mining Extensions (DMX) language in Analysis Services, which has extra support for things like nested tables. Also, if you have SQL Server Enterprise edition, there are some more advanced algorithms available.

When we work on a data mining project, we follow the life cycle below, which I would recommend if you want to have a go at this yourself.

Define the problem. Whilst you can point the tools at hundreds of fields and ask them to tell you something you don’t know, you will get better quality results by focusing on a single goal.

Collect your data. The more attributes you investigate the more data you will need, but a simple snapshot will usually do. The tools will usually use 70% of your data to design the model and the remaining 30% to test it. The model you build will often last some time before needing to be re-evaluated.

Transform and clean your data. Expect this to take the majority of the effort. The cleaner and more discrete your data the better. As some models work on continuous data such as sales, and others use discrete data like sales channel, it’s best to take continuous data and create additional discretized groupings in advance so you can have the best of both worlds. A good example would be age bands. Where you have bad data, either exclude it or fix it, but don’t leave it in.
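The preparation steps just described (discretizing a continuous field into bands, excluding bad records, and holding back 30% of the data for testing) can be sketched as follows. The band boundaries, field names and helper functions are hypothetical, and the mining tools normally perform the 70/30 split for you; this only illustrates the idea.

```python
import random

# Assumed age bands as (lowest age, highest age, label); adjust to suit your business.
AGE_BANDS = [(0, 17, "under 18"), (18, 34, "18-34"), (35, 54, "35-54"), (55, 120, "55+")]


def age_band(age):
    """Map a continuous age onto a discrete band, or None if out of range."""
    for low, high, label in AGE_BANDS:
        if low <= age <= high:
            return label
    return None


def prepare(rows):
    """Drop rows with a missing or impossible age and add an age_band field,
    keeping the original continuous age alongside it."""
    cleaned = []
    for row in rows:
        age = row.get("age")
        band = age_band(age) if isinstance(age, int) else None
        if band is None:
            continue  # bad data: exclude it rather than leave it in
        cleaned.append({**row, "age_band": band})
    return cleaned


def split_train_test(rows, train_fraction=0.7, seed=42):
    """Shuffle reproducibly, then split into training and testing sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical raw data, including two bad ages (-5 and a missing value).
ages = [23, 37, 41, -5, 68, 19, None, 52, 30, 45, 61, 17]
raw = [{"customer": i, "age": age} for i, age in enumerate(ages)]

cleaned = prepare(raw)
train, test = split_train_test(cleaned)
print(len(raw), "raw rows,", len(cleaned), "clean rows")
print(len(train), "rows to build the model,", len(test), "rows to test it")
```

Keeping both `age` and `age_band` on each record means algorithms that prefer continuous inputs and those that prefer discrete inputs can work from the same prepared data.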
Because the process is cyclic, you close in on your goal gradually: your first pass of the data is to identify trends, with subsequent passes going deeper into the data for greater accuracy. If you have millions of records, I would recommend pre-aggregating the data into a more summarised form.

Build your model. Using a clean and well-prepared data source makes data mining almost a matter of pointing the tools at the data, selecting your field settings and clicking go. However, there is plenty of opportunity to refine the models as you discover what correlates with and supports your goal, and what has no effect and needs removing. In most cases you will find yourself going back to your data and either creating new derived fields or pulling in brand new fields related to an attribute that you have just discovered is highly influential.

In addition to this approach, there are also some very powerful techniques you can use if you put your data into a pivot table, ideally using PowerPivot. This can reproduce some of the same outcomes but with greater control and a better understanding of how you got to the best solution. To take advantage of this, switch to a percent of row total view and keep pulling in new dimensions to see correlations quickly against your target attributes, which sit on a separate axis. This can be exceptionally useful as it’s much quicker than continuously rebuilding your model, and it can really help you focus your efforts.

Apply the model to deliver live data. Sometimes just knowing the answers to your questions is enough to go and change your business processes, but there will be times when you will want to record the answers in your data warehouse. This supports automated decision systems, which are particularly useful in call centres, as well as CRM applications, and it’s also a great way of sharing the knowledge.
Whilst you can query the data mining models for predictions on the fly and export the bulk results using DMX, my preferred method is to duplicate the data mining logic in your traditional ETL processes. This is because it’s faster, and when you’re updating your data every hour, as we do, this is an important factor.

Whilst not every trend or pattern you discover will be useful, the return on your investment should always be positive (Microsoft Press states an average of 150%), especially when using Microsoft tools, and in some cases it can be like winning the lottery. If you’re not already data mining but can see the benefit, you’re halfway there. The biggest misconception in the data mining world is that data mining is the same as data analysis, which every business is doing. This is your opportunity to get one step ahead. Give us a call if you want to chat about your data mining needs, or if you would like to challenge us to find the gold in your data for free.

Who are slicedbread?

Consider slicedbread a blend of ideas people, creatives, information architects and technical wizards, who believe that by devising the best strategies and manipulating the best technology, they can deliver unparalleled competitive advantage to clients (and also have a pretty good time whilst doing it). slicedbread build better business apps.

Get in touch

slicedbread.co.uk
01565 757 832
mail@slicedbread.co.uk
@slicedbread_it