White Paper

GETTING AHEAD OF THE COMPETITION WITH DATA MINING
Ultimately, data mining boils down to continually finding new ways to be
more profitable, which in today’s competitive world means making better and
more accurate decisions faster than your competition.
Data mining has a great affinity with the Six Sigma mindset, and the two disciplines work well together.
Both start out with an unknown outcome and the end goal of improvement, and both use well-established
statistical methods. Whilst the end result does not guarantee you will double your sales or halve your
production defects, you are very likely to find improvements and gain new insights into your business. How
much these insights are worth varies from industry to industry, but there is always the chance of finding highly
valuable nuggets of gold. For example, if you are drilling for oil, knowing where to drill is hugely valuable
given the cost of setting up. By contrast, the benefit of knowing why a product fails quality control, so that
the causes can be mitigated, depends on the cost of rectification and the frequency of recurrence, which
will vary greatly from business to business.
The fact is that whilst you often use your in-depth business and industry knowledge to generate many great
improvement ideas, you would have even more great ideas when presented with trends in your data that
you were not aware of. For example, if, as a delivery company, you discovered that the accident rate and
delivery times were greatly affected by the driver’s age and whether the route was inner city or mostly
motorway, would you not adjust which route each driver goes on? If you discovered managers and team
leaders were off sick a third as often as everybody else, would you change your sickness policy or find new
ways to give people more responsibility and empowerment? Chances are you would take some form of
action, but you would probably only have taken that action in the light of this new information.
The other side to data mining is about accuracy and measurability, the great example being forecasting.
This typically means going beyond simple averages, linear regression and moving averages to
more complex weighted moving averages, exponential smoothing and time series analysis to forecast
sales, staffing, stock levels and so on. These more advanced methods can account for other attributes, clusters
and seasonality, and have further tuning inputs, such as the alpha smoothing factor, that can lead to
significantly higher accuracy levels.
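As a rough illustration of the methods named above, the sketch below forecasts the next period with a simple moving average, a weighted moving average and simple exponential smoothing. The function names and the sales figures are invented for this example; it is a minimal sketch, not the Microsoft time series implementation, and it ignores seasonality.

```python
from typing import Sequence


def moving_average(history: Sequence[float], window: int) -> float:
    """Forecast the next value as the plain mean of the last `window` values."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def weighted_moving_average(history: Sequence[float], weights: Sequence[float]) -> float:
    """Forecast the next value using weights listed oldest first, newest last."""
    recent = history[-len(weights):]
    return sum(w * x for w, x in zip(weights, recent)) / sum(weights)


def exponential_smoothing(history: Sequence[float], alpha: float) -> float:
    """Forecast the next value with simple exponential smoothing.

    alpha close to 1 reacts quickly to recent values; alpha close to 0 smooths heavily.
    """
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


# Hypothetical monthly unit sales, invented purely for illustration.
sales = [100, 110, 105, 120, 130, 125]

print("moving average:       ", moving_average(sales, window=3))
print("weighted moving avg:  ", weighted_moving_average(sales, weights=[1, 2, 3]))
print("exponential smoothing:", exponential_smoothing(sales, alpha=0.5))
```

The weights and alpha are the kind of inputs you would tune against your own history rather than fix in advance.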
As the best method often varies from case to case, each can be implemented separately and then
statistically compared. This is done using measures like mean squared error, mean absolute deviation and
tracking signal so that the best-performing forecast is used. If you’re not already using these methods,
how much better would some of your decisions be if your estimates were that bit more accurate, and
how much money would that save or make for your business? Another benefit is that the activity of
forecasting often identifies hidden factors affecting your estimates that you were not aware of. For example,
you might log competitor sales campaign dates and plot them against your own data. Competitors often stick to
a schedule which, once known, can be very valuable to your forecasting.
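The statistical comparison of forecasts described above can be sketched as follows. The example scores two candidate forecasts against the same actuals using mean squared error, mean absolute deviation and tracking signal, then picks the one with the lowest squared error. The method names and all figures are hypothetical, and selecting on mean squared error alone is an assumption made for brevity.

```python
from typing import Dict, Sequence


def mean_squared_error(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Average of the squared forecast errors; penalises large misses heavily."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def mean_absolute_deviation(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Average size of the forecast error, ignoring its direction."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def tracking_signal(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Cumulative error divided by MAD; values far from zero suggest a biased forecast."""
    mad = mean_absolute_deviation(actual, forecast)
    running_error = sum(a - f for a, f in zip(actual, forecast))
    return running_error / mad if mad else 0.0


# Hypothetical actual sales and two candidate forecasts for the same periods.
actual = [100, 110, 105, 120]
forecasts: Dict[str, Sequence[float]] = {
    "method_a": [98, 112, 104, 118],
    "method_b": [90, 100, 95, 110],
}

for name, forecast in forecasts.items():
    print(
        name,
        "MSE:", mean_squared_error(actual, forecast),
        "MAD:", mean_absolute_deviation(actual, forecast),
        "tracking signal:", round(tracking_signal(actual, forecast), 2),
    )

# Keep the method with the lowest mean squared error.
best = min(forecasts, key=lambda name: mean_squared_error(actual, forecasts[name]))
print("best method:", best)
```

In this invented data, method_b under-forecasts every period, which shows up as a large positive tracking signal as well as a higher error.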
In customer relationship management it’s critical to know your customer. Knowing your customer makes
your marketing more effective, reduces churn, increases spend levels and ensures you are giving the
customer the service they want and expect. It also helps you exceed their expectations and evolve your
offering. This is an area where data mining plays a major role and where nearly all the methods and tools
play a part.
Usually the first step in understanding your customers is to classify and cluster them. For a retail company,
a simple example would mean classifying your customer base into age groups, disposable income
bands and primary interests, and then clustering the combinations that naturally stick together. The
resulting clusters are often given memorable names like the “silver surfer”, and using these clusters makes
marketing to each group far easier, helps you focus on the right delivery channel strategy and can greatly
increase conversion rates by being more targeted. Whilst this activity is often done manually, it becomes
progressively harder as the number of influencing attributes grows, particularly once you start
to augment your internal data with external data. Data mining tools, however, will scan all your data and
automatically recommend groupings based on a review of every field.
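To give a feel for what automatic grouping involves, here is a minimal k-means sketch on two attributes, age and disposable income. K-means is used here as a common, simple clustering technique; it is an assumption for illustration and not necessarily what any particular tool runs. The customer figures, starting centroids and function name are invented, and real data would normally be scaled first so that no single attribute dominates the distance.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def k_means(points: Sequence[Point], initial_centroids: Sequence[Point],
            iterations: int = 10) -> Tuple[List[int], List[Point]]:
    """Repeatedly assign points to the nearest centroid, then re-centre each cluster."""
    centroids = list(initial_centroids)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Hypothetical customers: (age, disposable income in thousands).
customers = [(22, 18), (25, 21), (24, 20), (63, 45), (67, 50), (65, 48)]

labels, centroids = k_means(customers, initial_centroids=[customers[0], customers[3]])
print("cluster per customer:", labels)
print("cluster centres:", centroids)
```

With more attributes the same loop applies unchanged, which is why this kind of grouping is easier to automate than to do by hand.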
The nice thing about clusters is that whilst you can’t change the customer’s gender or address, you can
often influence other attributes and help move a customer to a new cluster of higher value. A great example
is Amazon Prime. Once a customer joins Prime there is usually a big shift in lifetime value, shopping
frequency, basket size, churn, ability to upsell and so on. Knowing the value difference between the two
clusters helps you understand the maximum you can afford to spend per customer to move them up to a higher
value cluster. However, bear in mind it can also be effective to move customers down to a less profitable cluster if
they look likely to leave, especially if you know before they consciously decide to do so.
But just how do you know they are likely to leave? The answer lies in understanding the data of every
customer who has left in the past and then using data mining tools to automatically detect the influencing
factors. There are several methods, but the simplest tool will give you a bar graph showing the relative
importance of each factor. Once identified, these factors let you filter for the customers most likely
to leave, so you’re able to take proactive action such as getting in touch and offering them a better deal, or a
complimentary bottle of wine when things go wrong.
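A crude way to picture the relative importance of each factor is to compare churn rates across the values of each attribute: the wider the gap, the more that attribute appears to matter. The sketch below does this and prints a text bar per factor. The customer records, field names and the "spread" measure are all assumptions made for illustration; real tools use more robust statistics.

```python
from collections import defaultdict
from typing import Dict, List


def churn_rate_by_value(customers: List[dict], attribute: str) -> Dict[str, float]:
    """Churn rate for each distinct value of one attribute."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for row in customers:
        totals[row[attribute]] += 1
        churned[row[attribute]] += row["churned"]
    return {value: churned[value] / totals[value] for value in totals}


def factor_spread(customers: List[dict], attribute: str) -> float:
    """Gap between the highest and lowest churn rate across an attribute's values."""
    rates = churn_rate_by_value(customers, attribute).values()
    return max(rates) - min(rates)


# Hypothetical past customers; churned is 1 if they left, 0 if they stayed.
history = [
    {"contract": "monthly", "region": "north", "churned": 1},
    {"contract": "monthly", "region": "south", "churned": 1},
    {"contract": "monthly", "region": "north", "churned": 1},
    {"contract": "monthly", "region": "south", "churned": 0},
    {"contract": "annual", "region": "north", "churned": 0},
    {"contract": "annual", "region": "south", "churned": 0},
    {"contract": "annual", "region": "north", "churned": 0},
    {"contract": "annual", "region": "south", "churned": 1},
]

for factor in ("contract", "region"):
    spread = factor_spread(history, factor)
    print(f"{factor:10s} {'#' * round(spread * 20)} {spread:.2f}")
```

Here contract type separates leavers from stayers and region does not, so current customers on monthly contracts would be the first group to contact.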
Just as important as knowing your customers is knowing your competitors. Today, with most
businesses publishing their prices on the internet and public data available on web traffic, it’s now easier
than ever to profile your competitors and track how they are doing in comparison to your own efforts.
Some answers, however, you will never find because you just don’t have the data or it’s in a state that can’t
be used. The unfortunate truth is that, measured against Gartner’s five-level data quality framework, most
businesses have poor data quality, with most sitting around levels 1 and 2. This means your business
is likely to be missing data and to have a significant amount of incorrect data. Whilst poor data quality
dilutes your data analysis, it can be addressed with automated data cleansing via ETL, or by excluding the
affected data entirely from the models you build. There are also data mining tools for filling in missing data and identifying
outliers statistically, but the best solution is to have the right level of governance in place and a solid data
strategy.
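As a small illustration of filling in missing data and identifying outliers statistically, the sketch below replaces missing values with the median and flags outliers using a modified z-score based on the median and the median absolute deviation. Both choices are assumptions picked for simplicity (3.5 is a commonly used cut-off for this score); the order values and function names are invented.

```python
import statistics
from typing import List, Optional


def fill_missing_with_median(values: List[Optional[float]]) -> List[float]:
    """Replace missing entries (None) with the median of the known values."""
    known = [v for v in values if v is not None]
    median = statistics.median(known)
    return [median if v is None else v for v in values]


def flag_outliers(values: List[float], threshold: float = 3.5) -> List[bool]:
    """Flag values whose modified z-score exceeds the threshold."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return [False] * len(values)  # no spread, so nothing can be flagged
    return [abs(0.6745 * (v - median) / mad) > threshold for v in values]


# Hypothetical order values with one missing entry and one suspicious entry.
order_values = [50, 52, None, 49, 51, 500, 48]

filled = fill_missing_with_median(order_values)
flags = flag_outliers(filled)
print("filled:  ", filled)
print("outliers:", [v for v, is_outlier in zip(filled, flags) if is_outlier])
```

Whether a flagged value is an error or a genuinely large order is still a business judgement, which is where governance comes in.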
However, the biggest obstacle is when you don’t capture the data or it’s just not available. This is where a little
creativity comes into play as you start to look at what data you can get, and how likely it is to influence
what you are investigating. For example, a business wants to open its next store but needs to pick the
best possible location. It’s unlikely to already have good sales data on the area unless it sells online. From
a data mining perspective, this is where it is useful to know as much as possible from as many angles as
possible. Example additional data for the location might include population demographics, disposable
income, education levels, number of household vehicles, density of competing businesses, university student
populations, average house prices, etc. By linking all these factors to your own data and then letting the
data mining tools discover the trends, the decision on location becomes more informed and the risk of a
poor investment is reduced.
Another technique is to create derived metrics from the data you have to predict key supporting factors.
Take the example of betting on a greyhound race. If you know that only 5% of dogs that are bumped
finish first, or that 38% of the dogs that reach the first corner in the lead go on to win, it soon becomes clear
that it’s worth knowing the likelihood of these supporting factors. This is where you would create new
metrics specifically to predict their likelihood, such as the average trap the dog starts in, to see if
collisions are more likely when in the wrong trap, or the ranked weight of each dog, to see if the lighter dog
has the advantage in the initial sprint.
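The idea of a derived metric can be sketched in a few lines. The example below adds a weight rank to each dog within its race and then measures how often the lightest dog led at the first corner. The race records, field names and helper functions are entirely invented for illustration and say nothing about real greyhound results.

```python
from typing import Callable, List


def add_weight_rank(race: List[dict]) -> List[dict]:
    """Derived metric: rank each dog by weight within its race (1 = lightest)."""
    ordered = sorted(race, key=lambda dog: dog["weight_kg"])
    ranks = {dog["name"]: position for position, dog in enumerate(ordered, start=1)}
    return [{**dog, "weight_rank": ranks[dog["name"]]} for dog in race]


def rate(rows: List[dict], condition: Callable[[dict], bool],
         outcome: Callable[[dict], bool]) -> float:
    """Share of the rows meeting `condition` for which `outcome` is also true."""
    matching = [row for row in rows if condition(row)]
    return sum(outcome(row) for row in matching) / len(matching)


# Hypothetical races; led_first_corner records who reached the first corner in front.
races = [
    [{"name": "A", "weight_kg": 30.0, "led_first_corner": True},
     {"name": "B", "weight_kg": 32.5, "led_first_corner": False},
     {"name": "C", "weight_kg": 34.0, "led_first_corner": False}],
    [{"name": "D", "weight_kg": 31.0, "led_first_corner": False},
     {"name": "E", "weight_kg": 29.5, "led_first_corner": True},
     {"name": "F", "weight_kg": 33.0, "led_first_corner": False}],
    [{"name": "G", "weight_kg": 28.0, "led_first_corner": False},
     {"name": "H", "weight_kg": 30.5, "led_first_corner": True},
     {"name": "I", "weight_kg": 35.0, "led_first_corner": False}],
]

all_dogs = [dog for race in races for dog in add_weight_rank(race)]
lightest_leads = rate(
    all_dogs,
    condition=lambda dog: dog["weight_rank"] == 1,
    outcome=lambda dog: dog["led_first_corner"],
)
print(f"lightest dog led at the first corner in {lightest_leads:.0%} of races")
```

The new weight_rank field can then be fed to the mining tools alongside the raw fields.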
Other methods that support understanding your customer include association analysis, which is
commonly used for basket analysis and is great at supporting a higher basket margin. This uses your
customers’ shopping behaviours to learn what they buy in groups. This learning can then be shared with
other customers by grouping products together on the shop floor or web page, making it easier for them
to buy as a group or bundle deal and also planting the idea of the possible combination in their minds.
Another method is called sequence analysis. This is often used to understand the click paths customers take
when using a website, or to look at customer purchases over time. Knowing that a customer buys
ink for a printer three months after buying the printer is useful and a great reason to contact the
customer.
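The basket analysis described above usually rests on three simple measures: support (how often items appear together), confidence (how often the second item appears given the first) and lift (how much more often than chance). The sketch below computes them for one product pair. The baskets and function names are invented for illustration.

```python
from typing import List, Set


def support(baskets: List[Set[str]], items: Set[str]) -> float:
    """Share of baskets that contain every item in `items`."""
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(baskets: List[Set[str]], antecedent: Set[str], consequent: Set[str]) -> float:
    """Of the baskets containing the antecedent, the share that also contain the consequent."""
    return support(baskets, antecedent | consequent) / support(baskets, antecedent)


def lift(baskets: List[Set[str]], antecedent: Set[str], consequent: Set[str]) -> float:
    """Confidence relative to the consequent's base rate; above 1 suggests an association."""
    return confidence(baskets, antecedent, consequent) / support(baskets, consequent)


# Hypothetical shopping baskets.
baskets = [
    {"printer", "ink"},
    {"printer", "ink", "paper"},
    {"paper"},
    {"printer", "paper"},
]

print("support(printer, ink):  ", support(baskets, {"printer", "ink"}))
print("confidence(printer->ink):", round(confidence(baskets, {"printer"}, {"ink"}), 3))
print("lift(printer->ink):      ", round(lift(baskets, {"printer"}, {"ink"}), 3))
```

Pairs with high lift and reasonable support are the candidates for bundling or placing side by side.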
All these methods are easily accessible from the free Excel data mining add-in and table analysis bars
shown below. All you need is to be able to connect to an Analysis Services server.
Whilst the list of applications for data mining is endless, hopefully you get the idea. Data mining brings
to the table some very sophisticated mathematical techniques as well as some quite simple but clever
methods. Consider the differences between the Naïve Bayes method, decision trees and the neural net
method. The Bayes method stems from the work of the Reverend Thomas Bayes, published in 1763, and is based on using
marbles to count events; hence it is quite simple but effective, as shown below in this bike buyer sample
dataset.
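To show how much of Naïve Bayes really is counting, here is a minimal sketch for categorical data with add-one smoothing. It is not the Microsoft implementation and it does not use the bike buyer sample referred to above; the rows, attribute names and function names are invented for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def train(rows: List[Dict[str, str]], target: str):
    """Count how often each attribute value occurs within each outcome class."""
    class_counts = Counter(row[target] for row in rows)
    value_counts = defaultdict(Counter)   # (attribute, class) -> value counts
    distinct_values = defaultdict(set)    # attribute -> values seen in training
    for row in rows:
        for attribute, value in row.items():
            if attribute == target:
                continue
            value_counts[(attribute, row[target])][value] += 1
            distinct_values[attribute].add(value)
    return class_counts, value_counts, distinct_values


def predict(model, features: Dict[str, str]) -> Dict[str, float]:
    """Return the probability of each class, treating attributes as independent."""
    class_counts, value_counts, distinct_values = model
    total = sum(class_counts.values())
    scores = {}
    for label, class_count in class_counts.items():
        score = class_count / total  # prior probability of the class
        for attribute, value in features.items():
            seen = value_counts[(attribute, label)][value]
            # Add-one smoothing so an unseen combination never gives zero.
            score *= (seen + 1) / (class_count + len(distinct_values[attribute]))
        scores[label] = score
    normaliser = sum(scores.values())
    return {label: score / normaliser for label, score in scores.items()}


# Hypothetical training rows, invented for illustration.
rows = [
    {"cars": "none", "commute": "short", "buys": "yes"},
    {"cars": "none", "commute": "short", "buys": "yes"},
    {"cars": "none", "commute": "long", "buys": "yes"},
    {"cars": "two", "commute": "long", "buys": "no"},
    {"cars": "two", "commute": "short", "buys": "no"},
    {"cars": "two", "commute": "long", "buys": "no"},
]

model = train(rows, target="buys")
result = predict(model, {"cars": "none", "commute": "short"})
print({label: round(p, 3) for label, p in result.items()})
```

The "naïve" part is the assumption that attributes act independently, which is rarely true but often good enough for a first pass.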
Decision trees, which work by discovering the key pathways through your data, are more sophisticated and
visual. They give you the probability of the desired outcomes and colour grade the pathway through the
data, with darker shades indicating higher probability. This is my personal favourite as it can produce some
great visual output that’s easy to understand. The example below shows the factors affecting a shopper’s
likelihood of buying a bike. The best pathway shows that customers aged between 36 and 39 who don’t own
a car have a 40.66% likelihood of buying a bike. However, the most interesting is probably the Neural
Network method, which was discovered as a by-product of mapping out how the human brain works. This
method analyses all the possible relationships in your data and then, after combining all the attributes from the
first pass, takes a second look at your data in a similar way. This is the equivalent of a chess player thinking
two moves ahead. It is best suited to highly complex problems and is not something any report or cube can
easily do. Below is the same bike buyer example but predicted using the neural net method, which, as you
can see, is much more detailed.
At slicedbread, we use the Microsoft tools embedded in SQL Server and Office to support our clients with
their data mining projects. Whilst many statisticians will prefer the more advanced and expensive SAS and
SPSS solutions, the Microsoft tooling has the advantage of being free as long as you own SQL Server and Excel. It’s also
user friendly, and Excel can handle more complexity than most typical businesses need. However, if you need
more advanced capabilities then you can always use the Data Mining Extensions (DMX) language in Analysis
Services, which has extra support for things like nested tables. Also, if you have SQL Server Enterprise then there are
some more advanced algorithms available.
When we work on a data mining project, we follow the life cycle below, which I would recommend if you
want to have a go at this yourself.
Define the problem. Whilst you can point the tools at hundreds of fields and ask them to tell you something
you don’t know, you will get better quality results by focusing on a single goal.
Collect your data. The more attributes you investigate the more data you will need, but a simple snapshot
will usually do. The tools will usually use 70% of your data to build the model and the remaining 30% to
test it. The model you build will often last some time before needing to be re-evaluated.
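The 70/30 split mentioned above is straightforward to reproduce by hand if you ever need to. The sketch below shuffles the records with a fixed seed and cuts them at 70%. The function name, the seed and the stand-in records are assumptions for illustration; the tools handle this step for you.

```python
import random
from typing import List, Sequence, Tuple


def split_train_test(rows: Sequence, train_fraction: float = 0.7,
                     seed: int = 42) -> Tuple[List, List]:
    """Shuffle the rows reproducibly, then split them into training and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split can be repeated
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical snapshot of 100 customer records, represented here by their IDs.
records = list(range(100))

train_rows, test_rows = split_train_test(records)
print(len(train_rows), "rows to build the model,", len(test_rows), "rows to test it")
```

Shuffling first matters when the snapshot is sorted, for example by date or customer number, as otherwise the test set would not be representative.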
Transform and clean your data. Expect this to take the majority of the effort. The cleaner and more discrete
your data the better. As some models work on continuous data such as sales, and others utilise discrete data
like sales channel, it’s best to take continuous data and create additional discretized groupings in advance
so you can have the best of both worlds. A good example would be age bands. Where you have bad data,
either exclude it or fix it, but don’t leave it in. The process is cyclic: your
first pass of the data is to identify trends, with subsequent passes going deeper into the data as you close
in on your goal and go for greater accuracy. If you have millions of records, I would recommend
pre-aggregating the data into a more summarised form.
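As an illustration of the age band example, the sketch below keeps the continuous age and adds a discretized band alongside it, then pre-aggregates the rows into counts per band. The band boundaries, field names and customer rows are assumptions chosen for this example.

```python
from collections import Counter

# Hypothetical band boundaries: (exclusive upper age limit, label).
AGE_BANDS = [(18, "under 18"), (30, "18-29"), (45, "30-44"), (60, "45-59")]


def age_band(age: int) -> str:
    """Map a continuous age onto a discrete band."""
    for upper_limit, label in AGE_BANDS:
        if age < upper_limit:
            return label
    return "60+"


# Hypothetical customer rows.
customers = [{"age": 23}, {"age": 37}, {"age": 44}, {"age": 61}, {"age": 29}]

# Keep the original age and add the band, so both kinds of model can use the data.
banded = [{**row, "age_band": age_band(row["age"])} for row in customers]

# Pre-aggregate into a summary form: number of customers per band.
summary = Counter(row["age_band"] for row in banded)
print(summary)
```

Doing the banding once, up front, means every model sees the same definitions.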
Build your model. Using a clean and well-prepared data source makes data mining almost a matter of
pointing the tools at the data, selecting your field settings and clicking go. However, there is plenty of
opportunity to refine the models as you discover what is correlating with and supporting your goal and what
has no effect and needs removing. You will often find yourself going back to your data and
either creating new derived fields or pulling in brand new fields related to an attribute that you have just
discovered is highly influential.
In addition to this approach, there are also some very powerful techniques you can use if you put your data into
a pivot table, ideally using PowerPivot. This approach can reproduce some of the same outcomes but with
greater control over, and understanding of, how you got to the best solution. To take advantage of this, switch
to a percent of row total view and keep pulling in new dimensions to see correlations quickly against your
target attributes, which sit on a separate axis. This can be exceptionally useful as it’s much quicker than
continuously rebuilding your model and it can really help you focus your efforts.
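For readers who want to see what the percent of row total view is doing, the sketch below builds the same kind of table outside a pivot table: one dimension on the rows, the target attribute on the columns, and each cell shown as a percentage of its row. The rows, field names and function name are invented for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def percent_of_row_total(rows: List[dict], row_field: str,
                         column_field: str) -> Dict[str, Dict[str, float]]:
    """Cross-tabulate two fields and express each cell as a percentage of its row."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[row_field]][row[column_field]] += 1
    return {
        row_value: {col: 100 * n / sum(cols.values()) for col, n in cols.items()}
        for row_value, cols in counts.items()
    }


# Hypothetical customer rows with a candidate dimension and the target attribute.
rows = [
    {"age_band": "30-44", "buys": "yes"},
    {"age_band": "30-44", "buys": "yes"},
    {"age_band": "30-44", "buys": "yes"},
    {"age_band": "30-44", "buys": "no"},
    {"age_band": "60+", "buys": "yes"},
    {"age_band": "60+", "buys": "no"},
    {"age_band": "60+", "buys": "no"},
    {"age_band": "60+", "buys": "no"},
]

table = percent_of_row_total(rows, row_field="age_band", column_field="buys")
for band, cells in table.items():
    print(band, {outcome: f"{pct:.0f}%" for outcome, pct in cells.items()})
```

A dimension whose rows show very different percentages for the target is worth carrying into the model; one whose rows all look alike probably is not.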
Apply the model to deliver live data. Sometimes just knowing the answers to questions is enough to go
and change your business processes, but there will be times you will want to record the answers in your data
warehouse. This will support automated decision systems, which are particularly useful in call centres, as well
as supporting CRM applications, and it’s also a great way of sharing the knowledge. Whilst you can query
the data mining models for predictions on the fly and export the bulk results using DMX, my preferred
method is to duplicate the data mining logic in your traditional ETL processes. This is because it’s faster, and
when you’re updating your data every hour, as we do, this is an important factor.
Whilst not every trend or pattern you discover will be useful, the return on your investment should always
be positive (Microsoft Press states an average of 150%), especially when using Microsoft tools, and in some
cases can be like winning the lottery. If you’re not already data mining but can see the benefit, you’re
halfway there. The biggest misconception in the data mining world is that data mining is the same as data
analysis, which every business is doing. This is your opportunity to get one step ahead.
Give us a call if you want to chat about your data mining needs or if you
would like to challenge us to find the gold in your data for free.
Who are slicedbread?
Consider slicedbread a blend of ideas people, creatives, information
architects and technical wizards, who believe that by devising the best
strategies and manipulating the best technology, they can deliver
unparalleled competitive advantage to clients (and also have a pretty
good time whilst doing it).
slicedbread build better business apps.
Get in touch
slicedbread.co.uk
01565 757 832
mail@slicedbread.co.uk
@slicedbread_it