Demand Forecasting II Causal Analysis Chris Caplice ESD.260/15.770/1.260 Logistics Systems

advertisement
Demand Forecasting II
Causal Analysis
Chris Caplice
ESD.260/15.770/1.260 Logistics Systems
Sept 2006
Agenda
Forecasting Evaluation
Use of Causal Models in Forecasting
Approach and Methods
„
„
Ordinary Least Squares (OLS) Regression
Other Approaches
Closing Comments on Forecasting
MIT Center for Transportation & Logistics – ESD.260
2
© Chris Caplice, MIT
Forecast Evaluation
How do we determine what is a good forecast?
„
„
„
„
Accuracy - Closeness to actual observations
Bias - Persistent tendency to over or under predict
Fit versus Forecast – Tradeoff between accuracy to
past forecast to usefulness of predictability
Forecast Optimality – Error is equal to the random
noise distribution
Combination of art and science
„
„
Statistically – find a valid model
Art – find a model that makes sense
MIT Center for Transportation & Logistics – ESD.260
3
© Chris Caplice, MIT
Accuracy and Bias Measures
1. Forecast Error: et = xt - ̂xt
n
MD =
2. Mean Deviation:
∑e
t
t =1
n
n
3. Mean Absolute Deviation
MAD =
∑e
t
t =1
n
n
4. Mean Squared Error:
MSE =
n
5. Root Mean Squared Error:
t =1
n
n
2
t
n
2
t
6. Mean Percent Error:
MIT Center for Transportation & Logistics – ESD.260
t =1
∑e
RMSE =
7. Mean Absolute Percent Error: MAPE =
∑e
n
et
∑
D
MPE = t =1 t
n
et
∑D
t =1
t
n
4
© Chris Caplice, MIT
Moving Average Forecasts
116.00
115.00
114.00
113.00
MD
MAD
MSE
RMSE
MAPE
112.00
111.00
110.00
MA3
0.05
0.56
0.47
0.68
0.50%
MA10
0.21
1.07
1.67
1.29
0.96%
MA20
0.35
1.41
2.71
1.65
1.27%
109.00
108.00
-
20.00
MA3
40.00
MA10
60.00
MA20
MIT Center for Transportation & Logistics – ESD.260
80.00
100.00
120.00
ActDemand
5
© Chris Caplice, MIT
Analysis of the Forecast
25
Are the forecast errors
~N(0,Var(e))?
Frequency
10
Š What is the expected value of the
5
errors?
Š What is the variance of the errors?
„
.1
9)
0.
12
0.
42
0.
72
1.
03
1.
33
M
or
e
(0
.4
9)
.1
0)
.7
9)
(0
(1
.4
0)
0
(0
For Moving Averages:
15
(1
„
20
From actual observations,
error
Š Are the observed errors ~N(0,Var(e))?
Š For the MA3 data
„
„
„
Errors
2.00
μe = 0.05
σe= 0.69
σD= 1.478
1.50
1.00
0.50
Š Testing for Normalcy – Chi-Square,
Kolmogorov-Smirnov, or other tests
(0.50)
-
20
40
60
80
100
120
(1.00)
(1.50)
(2.00)
MIT Center for Transportation & Logistics – ESD.260
6
© Chris Caplice, MIT
Corrective Actions to Forecasts
Measures of Bias
„
Cumulative Sum of Errors (Ct)
Š Normalize by dividing by RMSE (Ut)
Š Ut should ~0 if unbiased
„
Smoothed Error Tracking Signal (Tt)
Š Tt=zt/MADt
Š Where zt= ωet + (1-ω)zt-1 (smoothing constant)
„
Autocorrelation of forecast Errors
Š Correlation between successive observations
Corrective Actions
„
Adaptive Forecasting
Š Methods where the smoothing coefficients change over time
Š Found (generally) to be no better than standard methods
„
Human Intervention
Š Overrule the model’s output – look for reason
Š Rules of thumb: |Tt|>f or |Ct|>k(RMSE) (f~0.4 and k~4)
Š Lower values (of k or f) lead to more intervention
MIT Center for Transportation & Logistics – ESD.260
7
© Chris Caplice, MIT
Causal Forecasting Models
Assumes that demand is highly correlated with some
environmental factors
Model is built to relate the independent exogenous
factors to the demand
Examples:
„
„
„
„
„
„
Diapers ~ f(birth rates lagged by 1 year)
NFL Jerseys ~ f(team and individual performance)
New products ~ f(product lifecycle)
Promotional Items ~ f(marketing promotions & ads)
Regional Sales ~ f(household demographics in area)
Umbrellas / Fuel ~ f(weather, temperature, rain, etc.)
Form of Dependent Variable dictates the method used
„
„
„
Continuous – takes any value
Discrete – takes only integer values
Binary – is equal to 0 or 1
MIT Center for Transportation & Logistics – ESD.260
8
© Chris Caplice, MIT
OLS Linear Regression
The relationship is described in terms of linear model
The data (xi,yi) are the observed pairs from which we try to
estimate the β coefficients to find the ‘best fit’
The error term, ε, is the ‘unaccounted’ or ‘unexplained’
portion
The error terms are assumed to be iid ~N(0,σ) and catch
all of the factors ignored or neglected in the model
E (Y | x) = β 0 + β1 x
yi = β 0 + β1 xi
Yi = β 0 + β1 xi + ε i
Observed
for i = 1, 2,...n
StdDev(Y | x) = σ
Unknown
MIT Center for Transportation & Logistics – ESD.260
9
© Chris Caplice, MIT
OLS Linear Regression
Residuals
„
„
„
Predicted or estimated values are found by using the regression
coefficients, b.
Residuals, ei, are the difference of actual – predicted values
Find the b’s that “minimize the residuals”
yˆi = b0 + b1 xi for i = 1, 2,...n
ei = yi − yˆi = yi − b0 + b1 xi for i = 1, 2,...n
How should I measure the residuals?
„
„
„
Min sum of errors - shows bias, but not accurate
Min sum of absolute error - accurate & shows bias, but intractable
Min sum of squares of error – shows bias & is accurate
2
e
(
∑ i =1 i ) = ∑ i =1 ( yi − yˆi ) = ∑ i=1 ( yi − b0 − b1 xi )
n
n
2
n
2
The best model minimizes the residual sum of squares
MIT Center for Transportation & Logistics – ESD.260
10
© Chris Caplice, MIT
OLS Linear Regression
We can find the optimal values of b0 and b1 by taking
first order conditions of the SSE:
∑ ( e ) = ∑ ( y − yˆ ) = ∑ ( y − b
n
i =1
2
n
2
i =1
i
i
i
n
i =1
i
0
− b1 xi )
2
This gives us the following coefficients:
b0 = y − b1 x
b1
∑
=
n
i =1
( xi − x )( yi − y )
2
(
x
−
x
)
∑ i =1 i
MIT Center for Transportation & Logistics – ESD.260
n
11
© Chris Caplice, MIT
OLS Linear Regression
Expansion to multiple variables is straightforward
So, for k variables we need to find k regression
coefficients
Yi = β 0 + β1 x1i + ... + β k xki + ε i
for i = 1, 2,...n
E (Y | x1 , x2 ,..., xk ) = β 0 + β1 x1 + β 2 x2 + ... + β k xk
StdDev(Y | x1 , x2 ,..., xk ) = σ
∑ ( e ) = ∑ ( y − yˆ ) = ∑ ( y − b
n
i =1
2
i
2
n
i =1
i
MIT Center for Transportation & Logistics – ESD.260
n
i =1
i
12
i
0
− b1 x1i − ... − bk xki )
2
© Chris Caplice, MIT
OLS Example
4,500
4,000
3,500
3,000
2,500
MIT Center for Transportation & Logistics – ESD.260
Ju
l
ay
M
ar
M
Ja
n
No
v
Se
p
Ju
l
ay
M
ar
2,000
M
Jan
Feb
Mar
Apr
May
Jun
Jul
Aug
Sep
Oct
Nov
Dec
Jan
Feb
Mar
Apr
May
Jun
Jul
Aug
Demand
3,025
3,047
3,079
3,136
3,454
3,661
3,554
3,692
3,407
3,410
3,499
3,598
3,596
3,721
3,745
3,650
4,157
4,221
4,238
4,008
Ja
n
Month
What do you see?
13
© Chris Caplice, MIT
OLS Example
Month
Establish relationship
„
„
„
Fi = f(X1i, X2i, …Xni)
=β0+β1X1i+β2X2i+…+βnXni
Fi = Level + Trend + Season
= β0 + β1 X1i + β2 X2i
Where X2i = 1 if a summer month,
= 0 o.w.
Points to consider:
„
„
„
„
Jan
Feb
Mar
Apr
May
Jun
Jul
Aug
Sep
Oct
Nov
Dec
Jan
Feb
Mar
Apr
May
Jun
Jul
Aug
Demand Period
3,025
1
3,047
2
3,079
3
3,136
4
3,454
5
3,661
6
3,554
7
3,692
8
3,407
9
3,410
10
3,499
11
3,598
12
3,596
13
3,721
14
3,745
15
3,650
16
4,157
17
4,221
18
4,238
19
4,008
20
Summer
0
0
0
0
1
1
1
1
0
0
0
0
0
0
0
0
1
1
1
1
What if the trend is not linear?
How do I handle seasonality if it impacts the trend?
How does OLS treat old versus new data?
How much information do I need to keep on hand?
MIT Center for Transportation & Logistics – ESD.260
14
© Chris Caplice, MIT
OLS Example (Excel)
4,300
4,100
3,900
3,700
Actual
Predicted
3,500
SUMMARY OUTPUT
3,300
0.979
0.958
0.953
79.21
20
3,100
Aug
Jul
Jun
May
Apr
Mar
Feb
Jan
Dec
Nov
Oct
Sep
Jul
Aug
Jun
May
Apr
Mar
Feb
2,900
Jan
Regression Statistics
Multiple R
R Square
Adjusted R Square
Standard Error
Observations
ANOVA
df
Regression
Residual
Total
Intercept
Period
Summer
2
17
19
Coefficients
2,969.14
48.03
303.51
SS
MS
F
Significance F
2442766.966 1221383.483 194.6730408
1.91955E-12
106658.4214 6274.024786
2549425.387
Standard Error
37.21
3.20
37.70
t Stat
79.79
15.00
8.05
P-value
0.0000
0.0000
0.0000
Lower 95%
Upper 95% Lower 95.0% Upper 95.0%
2,890.62
3,047.65
2,890.62
3,047.65
41.27
54.79
41.27
54.79
223.97
383.04
223.97
383.04
Fi = 2969 + 48 (Period) + 304 (Summer_Flag)
MIT Center for Transportation & Logistics – ESD.260
15
© Chris Caplice, MIT
OLS Example (Excel)
Coefficient of Determination
R2= 1-ESS/TSS=RSS/TSS
SUMMARY OUTPUT
Regression Statistics
Multiple R
R Square
Adjusted R Square
Standard Error
Observations
Standard Error (estimate of σ
around the regression line)
0.979
0.958
0.953
79.21
20
Sum of the Squares
Regression (RSS) =∑(ŷ-̅y)2
Error (ESS) =∑(y-ŷ)2
Total (TSS) =∑(y-̅y)2
ANOVA
df
Regression
Residual
Total
Intercept
Period
Summer
Regression
Coefficients
2
17
19
SS
MS
F
Significance F
2442766.966 1221383.483 194.6730408
1.91955E-12
106658.4214 6274.024786
2549425.387
Coefficients Standard Error
2,969.14
37.21
48.03
3.20
303.51
37.70
Degrees of
Freedom = n-k-1
MIT Center for Transportation & Logistics – ESD.260
t Stat
79.79
15.00
8.05
P-value
0.0000
0.0000
0.0000
Lower 95% Upper 95%
2,890.62
3,047.65
41.27
54.79
223.97
383.04
Std Error of Regression
Coeff (sbm)
16
95%
Confidence
Intervals
Lower 95.0%
Upper 95.0%
2,890.62
41.27
223.97
3,047.65
54.79
383.04
t-Statistic (bm/sbm)
Is bm different from 0?
P-value tells you % conf.
© Chris Caplice, MIT
Coefficient of Determination (R2)
Measures Goodness of Fit of the model
Captures the amount of variation that the model
‘explains’
„
R2=1-ESS/TSS = RSS/TSS
Š TSS = ESS + RSS
Š Variation of observed around mean = Variation of observed
around estimated – Variation of estimated around the mean
Generally, a higher R2 is better, but . . .
„
„
„
„
Model needs to make sense
High R2 does not indicate causality
It really depends on how the model is being used as
to what is ‘good enough’
The individual coefficients need to be tested
MIT Center for Transportation & Logistics – ESD.260
17
© Chris Caplice, MIT
Discrete Choice Models
What if you are predicting demand for one
product over another?
„
„
Model Selections (Blue vs. Red Cars)
Mode Forecasting (pick one of many)
OLS => y = b0 +b1 X
Y
1
Yi =
1 + e− β X i
Y
1
1
0
0
X
MIT Center for Transportation & Logistics – ESD.260
18
X
© Chris Caplice, MIT
Sales Forecasting Methods
Sales Forecast Errors (MAPE) by
forecast horizons in years
Expert Opinions
„
„
„
44.8%
37.3%
14.9%
Sales Force
Executives
Industry Surveys
Level
Statistical Models
„
„
„
„
„
30.6%
20.9%
11.2%
6.0%
3.7%
Naïve Model
Moving Average
Exp. Smoothing
Regression
Box-Jenkins
<.25 yrs ≤2 yrs >2 yrs
Industry
8
11
15
Corporate
7
11
18
Product Group
10
15
20
Product Line
11
16
20
Product
16
21
26
Source: Mentzer & Cox (1984)
Source: Dalrymple (1987) Survey 134 companies
MIT Center for Transportation & Logistics – ESD.260
19
© Chris Caplice, MIT
Misc. Forecasting Issues
Data Issues
„
„
„
„
Sales data is not demand data
Transactions can aggregate and skew actual demand
Ordering quantities can dictate sourcing
Historical data might not exist
Demand visibility can be skewed by level of echelon
„
„
Bullwhip effect
Collaborative Planning, Forecasting, and Replenishment
(CPFR)
Forecasting vs. Inventory Management
Statistical Validity vs. Use and Cost of Model
Demand is not always exogenous
MIT Center for Transportation & Logistics – ESD.260
20
© Chris Caplice, MIT
Questions, Comments,
Suggestions?
Download