First Day - MIS 420: Business Intelligence and Analytics

MIS 420: Predictive Analytics &
Making Statistical Conclusions
[analytics: recap]
[descriptive analytics]
“Using descriptive analytics is like driving your car but
watching traffic through the rear-view mirror, not seeing
what’s ahead and thereby in danger of crashing.”
[predictive analytics]
“Using predictive analytics is like driving your car and
watching traffic through the front windshield, anticipating
traffic, making course corrections to avoid traffic jams and
getting there faster and safer.”
[predictive analytics: definitions]
“exploit patterns found in historical and transactional data
to identify risks and opportunities … capture relationships
among many factors to allow assessment of risk or potential
associated with a particular set of conditions, guiding
decision making for candidate transactions.”
“predicting future events and assessing the attractiveness of
various courses of action.”
HELPS TO ANSWER: WHAT SHOULD WE DO NEXT???
[descriptive vs. predictive analytics]
• Descriptive Analytics: the understanding of existing
(retrospective) data with the goal of understanding trends
via comparison
• This is the first step towards deriving predictive analytics
• Predictive Analytics: more sophisticated analytics that are
“forward thinking” in nature
• gaining insights by enhancing understanding, interpretation
and judgment for the purpose of good decision making
• Remember: not what WILL happen, but what is likely
to happen, based on past data.
[where are predictive analytics used?]
• Marketing
• Financial services
• Insurance
• Telecommunications
• Retail
• Travel
• Healthcare
• Pharmaceuticals
• …. And more!
[communication skills and predictive analytics]
• State-of-the-art BI technology is not enough: without a
common understanding and a common language between
business people and BI professionals, BI efforts will
fall short of desired results
• Business professionals need to appreciate the role of
technology as an enabler and they need to lead and
determine where & how IT/BI infrastructure should be
deployed to improve decision making
• IT & BI professionals need to understand the language
of strategy, business models and performance while
solving business problems, NOT technology problems
[benefits of predictive analytics]
• Informs decision making
• Rapid identification of new market opportunities
• Increased confidence in pursuing new and existing business strategies
• Enhanced ability to direct and focus decisions
• Draws reliable conclusions about current conditions and future events
• Timely feedback to executives on their strategic initiatives – without
feedback course corrections may be too late
• Provide leading indicators and insight to assist in planning for answering
the big question: What should we do next? – next quarter, next year etc.
[overall: why do organizations need predictive
analytics?]
• Improving business processes
• Doing more with less budget (working smarter not harder!)
• Allocating resources appropriately
• Understanding correlations and sensitivities with customer
segments
• Ensuring long-term financial resources are available to run
the business
• Developing Competitive Advantages and Differentiation
[EXAMPLES IN TABLEAU]
[Predictive Analytics in Tableau: Understanding
Trend Lines]
• Trend Lines: indicating the general
course or tendency of something
• Based on linear regression
• Example: Want to look @
relationship between Sales and
Profit.
• For Central Region (blue trend line),
as Sales increase, so does Profit. So,
based on historical data, as sales
increased, profit tends to increase.
• How do we know if this trend line is
meaningful, or legit?
• Look @ Trend Model
[Predictive Analytics in Tableau: Model Formula]
• Example: Right-click on the Central trend
line > select Describe Trend Model
• Model Formula: Profit =
Coefficient for Region*Sales + y-intercept
• y = mx + b
• Y-intercept: the value of y at the point
where the x-value is zero.
• Formula for Central:
• Profit = 0.198449*Sales + 10.7668
• Basically, Tableau uses that formula
to compute a Profit value for each
Sales value and plots the resulting line
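To make the "based on linear regression" step concrete, here is a minimal sketch of fitting y = mx + b by ordinary least squares. The data values and the function name fit_line are made up for illustration; they are not the Superstore data, and this is not a description of Tableau's internals.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + b. Returns (m, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: how x and y vary together, divided by how much x varies.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    # Intercept: the fitted line always passes through (mean_x, mean_y).
    b = mean_y - m * mean_x
    return m, b


# Hypothetical Sales and Profit values, for illustration only.
sales = [100.0, 200.0, 300.0, 400.0, 500.0]
profit = [30.0, 52.0, 68.0, 91.0, 109.0]

slope, intercept = fit_line(sales, profit)
print(f"Profit = {slope:.3f}*Sales + {intercept:.1f}")
```

The slope plays the role of the coefficient in Describe Trend Model, and the intercept is the y-intercept.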
[Predictive Analytics in Tableau: Coefficients, Y-Intercept, Interpretation]
• Formula for Central:
• Profit = 0.198449*Sales + 10.7668
• Coefficient for Central: 0.198449
• What does the coefficient mean?
• For every unit increase in X, Y goes up by the
coefficient.
• In our case: in the Central region, for every dollar
our Sales increase, our Profit increases by $0.1984
(about 20 cents).
• What is the formula for East?
• Sales coefficient: 0.154
• Y-intercept: 0.385
• Profit = 0.154*Sales + 0.385
• In the Eastern region, for every dollar
our Sales increase, our Profit increases
by $0.154 (about 15 cents).
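The two regional formulas can be applied directly, as in the short sketch below. The coefficients and y-intercepts are the ones quoted on the slides; the Sales figures and the names MODELS and predict_profit are hypothetical.

```python
# (coefficient, y-intercept) per region, as quoted on the slides.
MODELS = {
    "Central": (0.198449, 10.7668),
    "East": (0.154, 0.385),
}


def predict_profit(region, sales):
    """Apply Profit = coefficient*Sales + y-intercept for one region."""
    coefficient, y_intercept = MODELS[region]
    return coefficient * sales + y_intercept


# Hypothetical Sales amount of $1,000 in each region.
for region in MODELS:
    print(region, round(predict_profit(region, 1000.0), 2))

# One extra dollar of Sales changes predicted Profit by the coefficient.
extra = predict_profit("Central", 1001.0) - predict_profit("Central", 1000.0)
print(round(extra, 4))
```

The last line illustrates the interpretation on the slide: the difference between two predictions one dollar apart is the coefficient itself.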
[Predictive Analytics in Tableau: P-Value]
• Formula for Central:
• Profit = 0.198449*Sales + 10.7668
• Meaning: in the Central region, for every dollar our
Sales increase, our Profit increases by $0.1984 (about 20 cents).
• Is this significant? Need to look @ the p-value
• P-value: a way to measure significance
• How likely a result at least this strong would be to occur
by chance if there were no real relationship.
• With a calculated p-value of .001, we can say that the observed
outcome would be expected to occur by chance only 1 in 1,000 times
• Criteria: the standard cutoff is less than .05
• P > 0.10: not significant
• 0.05 < P < 0.10: weak evidence
• 0.01 < P < 0.05: moderate evidence
• 0.001 < P < 0.01: strong evidence
• P < 0.001: very strong evidence
• So is our statement meaningful for the Central
region?
• P-value < 0.0001
• So we can conclude that:
• In the Central region, for every dollar our Sales
increase, our Profit increases by $0.1984 (about 20 cents).
There is very strong evidence for this statement,
given that p < 0.0001.
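One way to build intuition for "how likely by chance" is a permutation test: shuffle the Profit values so any real link to Sales is broken, refit the slope, and count how often the shuffled slope is at least as steep as the real one. This is a swapped-in technique for illustration only; the slides do not say how Tableau computes its p-value, and the data below are hypothetical. The evidence_label function encodes the cutoff table above, with exact boundary values (for example p = 0.05) assigned to the weaker category as an assumption.

```python
import random


def slope(xs, ys):
    """Least-squares slope of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def permutation_p_value(xs, ys, trials=2000, seed=0):
    """Share of shuffles whose slope is at least as steep as the observed one."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(slope(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if abs(slope(xs, shuffled)) >= observed:
            hits += 1
    # Add 1 to both counts so the estimate is never exactly zero.
    return (hits + 1) / (trials + 1)


def evidence_label(p):
    """Map a p-value to the wording used in the cutoff table."""
    if p < 0.001:
        return "very strong evidence"
    if p < 0.01:
        return "strong evidence"
    if p < 0.05:
        return "moderate evidence"
    if p < 0.10:
        return "weak evidence"
    return "not significant"


# Hypothetical Sales and Profit values with a clear upward relationship.
sales = [120, 250, 310, 400, 480, 560, 630, 720, 810, 900]
profit = [35, 58, 75, 88, 108, 120, 139, 151, 172, 190]

p = permutation_p_value(sales, profit)
print(p, evidence_label(p))
```

With only 2,000 shuffles the smallest p-value this sketch can report is about 0.0005, so it cannot reproduce a figure like p < 0.0001; more trials would be needed for that resolution.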
[Predictive Analytics in Tableau: Practice]
• Now using the same table, answer the
following questions for the West Region:
• What is the coefficient?
• What is the y-intercept?
• What is the formula for the West
region?
• What does the coefficient mean in
terms of profit and sales?
• What is the p-value for the West
region?
• Can we make significant conclusions
for the relationship between sales and
profit in the West? Why or why not?
(use p-value to support your answers)
[Predictive Analytics in Tableau: Forecasting]
• Forecasting: estimations of future values of a measure based on
historical values
• Requirements: a measure, and a continuous date field.
• A continuous date is one that has an infinite number of possible values.
For example, the Order Date field in Superstore Sales has possible values
from year to individual day
• The measure must be a basic aggregation—forecasts can’t be calculated
for table calculations
• At least one complete cycle of the data that you want to forecast—for
example, if you want to forecast data for one year, you need at least one
year—preferably two or more—of data
[Predictive Analytics in Tableau: Forecasting
Example]
• Example: let’s forecast sales
over time
• Drag Sales into Rows
• Drag Order Date into Columns
• Change Order Date from Year to
MONTH
• We now have a nice line graph
that shows actual data. Let’s
forecast what the data will look
like over the next year
[Predictive Analytics in Tableau: Forecasting
Example]
• Example: Sales over time
• Right click on graph > Show
Forecast
• Let’s play with some of the
forecast options.
[Predictive Analytics in Tableau: Forecast
Options]
• Example: Sales over time
• Right-click on graph > Forecast
Options
• Forecast Options are basically the
assumptions of your forecast
• Forecast Length:
• Notice Auto is set to Next 13 months
• You can change it to however far ahead
you want to forecast
• Source Data:
• Aggregate by: you can change
the aggregation as well
[Predictive Analytics in Tableau: Describe
Forecast]
• Example: Sales over time
• Right-click on graph > Describe Forecast
• Sum of Sales: value being forecast (this will change if your
data changes)
• Initial: prediction interval for the first forecasting period
• Our first is Dec 2013 (the interval is +/- $123,597)
• Change from Initial: difference between first and last forecast
estimates
• Seasonal Effect: these fields are displayed for models
identified as having seasonality, that is, a repeating pattern of
variation over time.
• Contribution: the extent to which Trend and Seasonality
contribute to the forecast. These values are always expressed
as percentages and add up to 100%.
• Ours is only trend because no seasonality was indicated
• Quality: how well the forecast fits the actual data. Possible
values are GOOD, OK, and POOR. A naïve forecast is defined
as a forecast that estimates that the value of the next period
will be identical to the value of the current period. Quality is
expressed relative to a naïve forecast, such that OK means
the forecast is likely to have less error than a naïve forecast,
GOOD means that the forecast has less than half as much
error, and POOR means that the forecast has more error
than a naïve forecast
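The Quality rating can be sketched as a comparison against the naïve forecast described above. This is only an illustration under stated assumptions: the slides do not say which error measure Tableau uses, so mean absolute error is assumed here, and the sales figures, forecasts, and function names are hypothetical.

```python
def mean_abs_error(actual, predicted):
    """Average size of the forecast misses (assumed error measure)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def forecast_quality(history, predicted):
    """Rate forecasts for history[1:] against the naive forecast.

    The naive forecast for each period is simply the previous period's value.
    """
    targets = history[1:]
    naive = history[:-1]
    naive_error = mean_abs_error(targets, naive)
    if naive_error == 0:
        raise ValueError("history never changes, so the ratio is undefined")
    ratio = mean_abs_error(targets, predicted) / naive_error
    if ratio < 0.5:
        return "GOOD"  # less than half the naive error
    if ratio < 1.0:
        return "OK"    # less error than naive
    return "POOR"      # as much or more error than naive (tie is an assumption)


# Hypothetical monthly sales and two hypothetical sets of forecasts
# for months 2 through 5.
monthly_sales = [100, 120, 110, 130, 125]
close_forecast = [118, 112, 127, 126]
rough_forecast = [108, 116, 118, 128]

print(forecast_quality(monthly_sales, close_forecast))
print(forecast_quality(monthly_sales, rough_forecast))
```

Feeding the naïve forecast itself into forecast_quality gives a ratio of exactly 1, which this sketch rates POOR.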
[Predictive Analytics in Tableau: Things to
Remember in Forecasting]
• The forecast will only ever be as good as your data, and
the more you have of it, the better the forecast should be.
• The forecast can’t account for external influences on your
data, but you can factor those into your historical
data to build more complex forecasting models.
• For more information on forecasting, check out:
• http://beta.tableausoftware.com/qsg/forecasting.pdf (username:
v8beta, password: kraken)
• http://beta.tableausoftware.com/videos/Forecasting/Forecasting_Tableau_8_Beta.html
(use the username and password above)
• http://onlinehelp.tableausoftware.com/v8.0/pro/online/enus/help.html#forecasting.html