FORECASTING
Quantitative Models
Causal Models

Forecasting Models

Forecasting
• Quantitative
  • Causal Model
  • Time series
    • Stationary
    • Trend
    • Trend + Seasonality
• Qualitative (Expert Judgment)
  • Delphi Method
  • Grassroots
  • Market Research
  • Jury Exec. Opinion

Quantitative Forecasts
Causal Forecasting Models

In a causal forecasting model, the forecast for the quantity of interest “rides piggyback” on another quantity or set of quantities. In other words, our knowledge of the value of one variable (or perhaps several variables) enables us to forecast the value of another variable.

In this model, let y denote the true value of some variable of interest, and let ŷ (also written y’) denote a predicted or forecast value for that variable.
One commonly used approach in creating a causal forecasting model is called curve fitting.
Curve Fitting Example: AN OIL COMPANY EXPANSION

• Consider an oil company that is planning to expand its network of modern self-service gasoline stations.
• The company plans to use traffic flow (measured in the average number of cars per hour) to forecast sales (measured in average dollar sales per hour).
• The firm has had five stations in operation for more than a year and has used historical data to calculate the following averages:
[Scatter plot: average Sales/hour ($0–$300) versus Cars/hour (50–250) for the five existing stations.]
Now, these data will be used to construct a forecasting function. The company can then forecast sales at any proposed location by measuring the traffic flow at that location and plugging its value into the constructed function.
Least Squares Method

The method of least squares is a formal procedure for curve fitting. It is a two-step process:
1. Select a specific functional form (e.g., a straight line or quadratic curve).
2. Within the set of functions specified in step 1, choose the specific function that minimizes the sum of the squared deviations between the data points and the function values.
To demonstrate the process, consider the sales–traffic flow example.

1. Assume a straight line; that is, functions of the form y = a + bx.
2. Draw the line in the scatter diagram and indicate the deviations between observed points and the function as di.

For example,

d1 = y1 – [a + bx1] = 220 – [a + 150b]

where
y1 = actual sales/hr at location 1
x1 = actual traffic flow at location 1
a = y-axis intercept for the function
b = slope for the function
[Scatter plot showing the line y = a + bx and the deviations d1 through d5 between each observed point and the line.]

The value d1² is one measure of how close the value of the function [a + bx1] is to the observed value y1; that is, it indicates how well the function fits at this one point.
One measure of how well the function fits overall is the sum of the squared deviations:

Σ(i=1 to 5) di²

Consider a general model with n as opposed to five observations. Since each di = yi – (a + bxi), the sum of the squared deviations can be written as:

Σ(i=1 to n) (yi – [a + bxi])²

Using the method of least squares, select a and b so as to minimize the sum in the equation above.
Now, take the partial derivative of the sum with respect to a and set the resulting expression equal to zero:

Σ(i=1 to n) –2(yi – [a + bxi]) = 0

A second equation is derived by following the same procedure with b:

Σ(i=1 to n) –2xi(yi – [a + bxi]) = 0

Recall that the values for xi and yi are the observations, and our goal is to find the values of a and b that satisfy these two equations.
The solution is:

b = [Σ(i=1 to n) xiyi – (1/n)(Σ(i=1 to n) xi)(Σ(i=1 to n) yi)] / [Σ(i=1 to n) xi² – (1/n)(Σ(i=1 to n) xi)²]

a = (1/n) Σ(i=1 to n) yi – b · (1/n) Σ(i=1 to n) xi

The next step is to determine the values for:

Σ(i=1 to n) xi,  Σ(i=1 to n) xi²,  Σ(i=1 to n) yi,  Σ(i=1 to n) xiyi

Note that these quantities depend only on observed data and can be found with simple arithmetic operations or automatically using Excel’s predefined functions.
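The arithmetic above can also be carried out directly in code. A minimal sketch in Python, using made-up (x, y) data rather than the oil company's actual five observations, computes the four sums and then a and b from the closed-form solution:

```python
# Closed-form least-squares fit from the summation formulas.
# The data here are hypothetical, for illustration only.
xs = [100, 150, 200, 250, 300]
ys = [120, 220, 250, 280, 330]

n = len(xs)
sum_x = sum(xs)
sum_y = sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)

# b = [Σxy - (1/n)(Σx)(Σy)] / [Σx² - (1/n)(Σx)²]
b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
# a = (1/n)Σy - b(1/n)Σx
a = sum_y / n - b * sum_x / n

print(a, b)
```

For these hypothetical data the fit works out to a = 48 and b = 0.96; Excel's Regression tool performs exactly this computation internally.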
Using Excel, click on Tools – Data Analysis… In the resulting dialog, choose Regression.
In the Regression dialog, enter the Y-range and X-range. Choose to place the output in a new worksheet called Results. Select Residual Plots and Normal Probability Plots to be created along with the output.
Click OK to produce the regression results. Note that a (Intercept) and b (X Variable 1) are reported as 57.104 and 0.92997, respectively.
To add the resulting least squares line, first click on the worksheet Chart 1, which contains the original scatter plot. Next, click on the data series so that they are highlighted, and then choose Add Trendline… from the Chart pull-down menu. Choose Linear Trend in the resulting dialog and click OK.
A linear trend is fit to the data:

[Scatter plot of Sales/hour ($) versus Cars/hour showing Series1 with the fitted trendline, Linear (Series1).]
One of the other summary output values that Excel reports is R Square = 69.4%. This is a “goodness of fit” measure which represents the R² statistic discussed in introductory statistics classes. R² ranges in value from 0 to 1 and gives an indication of how much of the total variation in Y from its mean is explained by the new trend line.

In fact, there are three different sums of squares:
TSS (Total Sum of Squares)
ESS (Error Sum of Squares)
RSS (Regression Sum of Squares)
The basic relationship between them is:

TSS = ESS + RSS

They are defined as follows:

TSS = Σ(i=1 to n) (Yi – Ȳ)²
ESS = Σ(i=1 to n) (Yi – Ŷi)²
RSS = Σ(i=1 to n) (Ŷi – Ȳ)²

Essentially, the ESS is the amount of variation that can’t be explained by the regression. The RSS quantity is effectively the amount of the original, total variation (TSS) that could be removed using the regression line.
R² is defined as:

R² = RSS / TSS

If the regression line fits perfectly, then ESS = 0 and RSS = TSS, resulting in R² = 1. In this example, R² = .694, which means that approximately 70% of the variation in the Y values is explained by the one explanatory variable (X), cars per hour.
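The three sums of squares and R² can be checked numerically. A minimal sketch, again using hypothetical data together with the least-squares coefficients for that data (not the oil company figures):

```python
# TSS, ESS, RSS and R² for a fitted line.  Data and coefficients
# are hypothetical; a and b are the least-squares fit for these points.
xs = [100, 150, 200, 250, 300]
ys = [120, 220, 250, 280, 330]
a, b = 48.0, 0.96

y_bar = sum(ys) / len(ys)
y_hat = [a + b * x for x in xs]

tss = sum((y - y_bar) ** 2 for y in ys)               # total variation
ess = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))  # unexplained
rss = sum((yh - y_bar) ** 2 for yh in y_hat)          # explained

r_squared = rss / tss
# For a least-squares fit, TSS = ESS + RSS holds.
print(tss, ess, rss, r_squared)
```

Note that the identity TSS = ESS + RSS only holds when a and b are the least-squares estimates for the same data.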
Now, returning to the original question: Should we build a station at Buffalo Grove, where traffic is 183 cars/hour? The best guess at the corresponding sales volume is found by placing this X value into the new regression equation:

ŷ = a + b·x

Sales/hour = 57.104 + 0.92997 × (183 cars/hour) = $227.29
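The point forecast is one line of arithmetic; a quick sketch using the coefficients reported in the regression output:

```python
# Point forecast at the proposed site, using the intercept and
# slope reported for the five-station data.
a, b = 57.104, 0.92997
x = 183  # cars/hour measured at the proposed location
sales_per_hour = a + b * x
print(round(sales_per_hour, 2))  # → 227.29
```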
However, it would be nice to be able to state a 95% confidence interval around this best guess.
We can get the information to do this from Excel’s Summary Output. Excel reports that the standard error (Se) is 44.18. This quantity represents the amount of scatter in the actual data around the regression line. The formula for Se is:

Se = sqrt[ Σ(i=1 to n) (Yi – Ŷi)² / (n – k – 1) ]

where n is the number of data points (e.g., 5) and k is the number of independent variables (e.g., 1).
This equation is equivalent to:

Se = sqrt[ ESS / (n – k – 1) ]

Once we know Se, then based on the normal distribution we can state that:
• We have 68% confidence that the actual value of sales/hour is within ±1 Se of the predicted value ($227.29).
• We have 95% confidence that the actual value of sales/hour is within ±2 Se of the predicted value ($227.29).

The 95% confidence interval is:
[227.29 – 2(44.18); 227.29 + 2(44.18)] = [$138.93; $315.65]
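The interval arithmetic can be sketched directly, using the forecast and the standard error from the summary output:

```python
# Approximate 95% interval: point forecast ± 2 standard errors.
y_hat = 227.29  # forecast sales/hour at 183 cars/hour
se = 44.18      # standard error from the regression output

low = round(y_hat - 2 * se, 2)
high = round(y_hat + 2 * se, 2)
print(low, high)  # → 138.93 315.65
```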
Another value of interest in the Summary report is the t-statistic for the X variable and its associated values. The t-statistic is 2.61 and the P-value is 0.0798. A P-value less than 0.05 would indicate that we have at least 95% confidence that the slope parameter (b) is statistically significantly different from 0 (zero). A slope of 0 results in a flat trend line and indicates no relationship between Y and X.

The 95% confidence limit for b is [–0.205; 2.064]. Thus, we can’t exclude the possibility that the true value of b might be 0.
Also given in the Summary report is the F-significance. Since there is only one independent variable, the F-significance is identical to the P-value for the t-statistic. In the case of more than one X variable, the F-significance tests the hypothesis that all the X variable parameters as a group are statistically significantly different from zero.
Concerning multiple regression models: as you add other X variables, the R² statistic will always increase, meaning the RSS has increased. In this case, the Adjusted R² statistic is a more reliable indicator of the true goodness of fit, because it compensates for the reduction in the ESS due to the addition of more independent variables. Thus, it may report a decreased Adjusted R² value even though R² has increased, unless the improvement in fit more than compensates for the addition of the new independent variables.
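The slides do not state the formula, but the standard textbook definition of Adjusted R² can be sketched as follows (n and k as defined for the standard error):

```python
# Standard Adjusted R² formula: 1 - (1 - R²)(n - 1)/(n - k - 1).
# It penalizes added independent variables that do not improve
# the fit enough to justify the lost degrees of freedom.
def adjusted_r2(r2, n, k):
    """r2: plain R²; n: number of observations; k: independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# With this example's R² = 0.694, n = 5 observations, k = 1 variable:
print(round(adjusted_r2(0.694, 5, 1), 3))  # → 0.592
```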
WHICH CURVE TO FIT?

If, for example, a quadratic function fits better than a linear function, why not choose a more general form, thereby getting an even better fit? In practice, functions of the following form (with only a single independent variable for illustrative purposes) are often suggested:

y = a₀ + a₁x + a₂x² + … + aₙxⁿ

Such a function is called a polynomial of degree n, and it represents a broad and flexible class of functions:
n = 2: quadratic
n = 3: cubic
n = 4: quartic
One must proceed with caution when fitting data with a polynomial function. For example, it is possible to find a (k – 1)-degree polynomial that will perfectly fit k data points. To be more specific, suppose we have seven historical observations, denoted (xi, yi), i = 1, 2, …, 7. It is possible to find a sixth-degree polynomial

y = a₀ + a₁x + a₂x² + … + a₆x⁶

that exactly passes through each of these seven data points.
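This claim is easy to verify numerically. A sketch using NumPy and seven made-up observations:

```python
import numpy as np

# A sixth-degree polynomial fit to seven points reproduces them
# exactly, so the sum of squared deviations is (numerically) zero.
xs = np.array([1.0, 2, 3, 4, 5, 6, 7])
ys = np.array([2.0, 1, 4, 3, 6, 5, 8])  # hypothetical observations

coeffs = np.polyfit(xs, ys, deg=6)      # degree k - 1 for k = 7 points
fitted = np.polyval(coeffs, xs)

ssd = np.sum((ys - fitted) ** 2)
print(ssd)  # essentially 0 -- a "perfect" fit
```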
A perfect fit gives zero for the sum of squared deviations. However, this is deceptive, for it does not imply much about the predictive value of the model for use in future forecasting.
Despite the perfect fit of the polynomial function, the forecast is very inaccurate; the linear fit might provide more realistic forecasts. Also, note that the polynomial fit has hazardous extrapolation properties (i.e., the polynomial “blows up” at its extremes).
Reliability and Validity

• Does the model make intuitive sense? Is the model easy to understand and interpret?
• Are the coefficients statistically significant (p-values less than .05)?
• Are the signs associated with the coefficients as expected?
• Does the model predict values that are reasonably close to the actual values?
• Is the model sufficiently sound (high R², low standard error, etc.)?
Correlation Coefficient and Coefficient of Determination

r = [n ΣXiYi – (ΣXi)(ΣYi)] / sqrt{ [n ΣXi² – (ΣXi)²][n ΣYi² – (ΣYi)²] }

• Correlation coefficient = r.
• Coefficient of determination = r².

where:
Yi = dependent variable
Xi = independent variable
n = number of observations
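The formula translates directly into code. A sketch with hypothetical data, which also confirms that r² matches R² = RSS/TSS computed for the same data:

```python
from math import sqrt

# Pearson correlation coefficient from the summation formula,
# using hypothetical (x, y) data.
xs = [100, 150, 200, 250, 300]
ys = [120, 220, 250, 280, 330]
n = len(xs)

sx, sy = sum(xs), sum(ys)
sxy = sum(x * y for x, y in zip(xs, ys))
sx2 = sum(x * x for x in xs)
sy2 = sum(y * y for y in ys)

r = (n * sxy - sx * sy) / sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))
r2 = r * r  # coefficient of determination
print(r, r2)
```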
Summary: Causal Forecasting Models

• The goal of a causal forecasting model is to develop the best statistical relationship between a dependent variable and one or more independent variables.
• The most common modeling approach used in practice is regression analysis. Only linear regression models are examined in this course.
• In causal forecasting models, when one tries to predict a dependent variable using a single independent variable, it is called a simple regression model.
• When one uses more than one independent variable to forecast the dependent variable, it is called a multiple regression model.
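A multiple regression fit follows the same least-squares idea with more columns. A minimal sketch with two hypothetical independent variables, solved by ordinary least squares with NumPy; the data are constructed so the true coefficients are known in advance:

```python
import numpy as np

# Hypothetical data built so that y = 10 + 1.2*x1 + 5*x2 exactly;
# the least-squares fit should recover these coefficients.
x1 = np.array([150.0, 200, 120, 300, 250])
x2 = np.array([2.0, 3.5, 1.0, 4.0, 2.5])
y = 10 + 1.2 * x1 + 5 * x2

# Design matrix: a column of ones (intercept) plus the two X variables.
A = np.column_stack([np.ones(len(y)), x1, x2])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
print(coef)  # ≈ [10.0, 1.2, 5.0]
```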