Chapter 13
FORECASTING
CHAPTER OUTLINE
13.1 Introduction
13.2 Quantitative Forecasting
13.3 Causal Forecasting Models
13.4 Time-Series Forecasting Models
13.5 The Role of Historical Data: Divide and Conquer
13.6 Qualitative Forecasting
13.7 Notes on Implementation
KEY TERMS
SELF-REVIEW EXERCISES
PROBLEMS
CASE 1: BANK OF LARAMIE
CASE 2: SHUMWAY, HORCH, AND SAGER (B)
CASE 3: MARRIOTT ROOM FORECASTING
REFERENCES
APPLICATION CAPSULE
Forecasting Improvement at L. L. Bean
L. L. Bean is a widely known retailer of high-quality outdoor
goods and apparel. The majority of its sales are generated
through telephone orders via its 800 service, which was introduced in 1986. Ten percent of its $870 million in 1993 sales
came from store transactions and 18% was ordered
through the mail, leaving the bulk (72%) from orders taken at
the company's call center.
Calls to L. L. Bean’s call center fit into two major classifications, telemarketing (TM) and telephone inquiry (TI), each
with its own 800-number. TM calls are primarily the order-placing calls that generate the vast majority of the company's
sales. TI callers are mainly customers who ask about the status
of their orders, report problems about orders, and so on. The
volume of calls and average duration of each call are quite different for these two classes. Annual call volumes for TM are
many times higher than those of TI, but the average length is
much less. TI agents are responsible for customer inquiries in
a variety of areas and thus require special training. It is therefore
important to accurately forecast the incoming call volumes for
TI and TM separately in order to properly schedule these two distinct
server groups.
The real focus of these forecasts is on the third week
ahead. Once the forecast is made, the schedulers can make
up a weekly schedule for their workers and give them two
weeks’ advance notice. Inaccurate forecasts are very costly to
L. L. Bean because they result in a mismatch of supply and
demand. Understaffing of TM agents increases opportunity
costs due to diminished revenues from lost orders (a percentage of customers who don’t get through immediately,
abandon the call, and never call back). Understaffing of TI
agents decreases overall customer satisfaction and erodes
customer loyalty. In both cases, understaffing leads to excessive queue times, which causes telephone-connect charges to
increase dramatically. On the other hand, overstaffing of
either group of agents incurs the obvious penalty of excessive direct labor costs for the underutilized pool of agents on
duty.
The staff-scheduling decisions would be quite routine if
it were not for the erratic nature and extreme seasonality of
L. L. Bean’s business. For example, the three-week period
before Christmas can make or break the year, as nearly 20% of
the annual calls come during this short period. L. L. Bean will
typically double the number of agents and quadruple the
number of phone lines during this period. After this period,
there is, of course, the exact opposite problem, the build-down
process. In addition, there is a strong day-of-week pattern
throughout the year in both types of calls, with weekly volume
highest on Monday and decreasing monotonically to its low on Sunday.
Another factor that must be considered by the forecasting model is the effect of catalog mailings, or "drops." These
are generally done so that the bulk of the catalogs arrive
around Tuesday, which disrupts the normal pattern of calls
tremendously. Many eager customers order immediately,
which creates a surge of new calls around the time of the
“drop.” The new forecasting model that was developed had
much greater forecast accuracy than L. L. Bean’s previous
approach and was able to produce a mean absolute percentage error of 7.4% for the TM group and 11.4% for the TI
group on five years of historical data. So far, on three-week-ahead forecasts of future call volumes, the new forecasting model has had
about the same accuracy as that demonstrated on the historical data. The increased precision afforded by these models is estimated to translate into $300,000 annual savings for
L. L. Bean through more efficient scheduling. (See Andrews
and Cunningham.)
13.1 INTRODUCTION

The date is June 15, 1941. Joachim von Ribbentrop, Hitler's special envoy, is meeting in
Venice with Count Ciano, the Italian foreign minister, whereupon von Ribbentrop says:
“My dear Ciano, I cannot tell you anything as yet because every decision is locked in the
impenetrable bosom of the Führer. However, one thing is certain: If we attack, the Russia of
Stalin will be erased from the map within eight weeks." (see Bullock) Seven days later, Nazi
Germany launched Operation Barbarossa and declared war on Russia. With this decision, a
chain of events that led to the end of the Third Reich had been set in motion, and the
course of history was dramatically changed.
Although few decisions are this significant, it is clearly true that many of the most
important decisions made by individuals and organizations crucially depend on an assessment of the future. Predictions or forecasts with greater accuracy than that achieved by the
German General Staff are thus fervidly hoped for and in some cases diligently worked for.
There are a few “wise” sayings that illustrate the promise and frustration of forecasting:
• “It is difficult to forecast, especially in regards to the future.”
• “It isn’t difficult to forecast, just to forecast correctly.”
• “Numbers, if tortured enough, will confess to just about anything!”
Economic forecasting considered by itself is an important activity. Government policies and business decisions are based on forecasts of the GDP, the level of unemployment,
the demand for refrigerators, and so on. Among the major insurance companies, one is
hard-pressed to find an investment department that does not have a contract with some
expert or firm to obtain economic forecasts on a regular basis. Billions of dollars of investments in mortgages and bonds are influenced by these forecasts. Over 2,000 people show
up each year at the Annual Forecast Luncheon sponsored by the University of Chicago to
hear the views of three economists on the economic outlook. The data are overwhelming.
Forecasting is playing an increasingly important role in the modern firm.
Not only is forecasting increasingly important, but quantitative models are playing an
increasingly important role in the forecasting function. There is clearly a steady increase in
the use of quantitative forecasting models at many levels in industry and government. A
conspicuous example is the widespread use of inventory control programs that include a
forecasting subroutine. Another example is the reliance of several industries (airlines,
hotels, rental cars, cruise lines) in the services sector of the economy on accurate forecasts
of demand as inputs to their sophisticated mathematical optimizers used for revenue management (e.g., How much to overbook? How many units should be made available at different discount levels?). For economic entities such as the GDP or exchange rates, many
firms now rely on econometric models for their forecasts. These models, which consist of a
system of statistically estimated equations, have had a significant impact on the decision
processes in both industry and government.
There are numerous ways to classify forecasting models and the terminology varies
with the classification. For example, one can refer to “long-range,” “medium-range,” and
“short-range” models. There are “regression” models, “extrapolation” models, and “conditional” or “precedent-based” models, as well as “nearest-neighbor” models. The major distinction we employ will be between quantitative and qualitative forecasting techniques.
13.2 QUANTITATIVE FORECASTING
Quantitative forecasting models possess two important and attractive features:
1. They are expressed in mathematical notation. Thus, they establish an unambiguous
record of how the forecast is made. This provides an excellent vehicle for clear communication about the forecast among those who are concerned. Furthermore, they
provide an opportunity for systematic modification and improvement of the forecasting technique. In a quantitative model coefficients can be modified and/or terms
added until the model yields good results. (This assumes that the relationship
expressed in the model is basically sound.)
2. With the use of spreadsheets and computers, quantitative models can be based on an
amazing quantity of data. For example, a major oil company was considering a reorganization and expansion of its domestic marketing facilities (gasoline stations).
Everyone understood that this was a pivotal decision for the firm. The size of the proposed capital investment alone, not to mention the possible influences on the revenue
from gasoline sales, dictated that this decision be made by the board of directors. In
order to evaluate the alternative expansion strategies, the board needed forecasts of
the demand for gasoline in each of the marketing regions (more than 100 regions
were involved) for each of the next 15 years. Each of these 1,500 estimates was based
on a combination of several factors, including the population and the level of new
construction in each region. Without the use of computers and quantitative models, a
study involving this level of detail would generally be impossible. In a similar way
inventory control systems that require forecasts that are updated on a monthly basis
for literally thousands of items could not be constructed without quantitative models
and computers.
The technical literature related to quantitative forecasting models is enormous, and a
high level of technical, mainly statistical, sophistication is required to understand the intricacies of the models in certain areas. In the following two sections we summarize some
of the important characteristics and the applicability of such models. We shall distinguish
two categories based on the underlying approach. These are causal models and time-series
models.
13.3 CAUSAL FORECASTING MODELS
In a causal forecasting model, the forecast for the quantity of interest “rides piggyback” on
another quantity or set of quantities. In other words, our knowledge of the value of one
variable (or perhaps several variables) enables us to forecast the value of another variable.
In more precise terms, let y denote the true value for some variable of interest, and let ŷ
denote a predicted or forecast value for that variable. Then, in a causal model,
ŷ = f(x1, x2, . . . , xn)

where f is a forecasting rule, or function, and x1, x2, . . . , xn is a set of variables.
In this representation the x variables are often called independent variables, whereas ŷ is
the dependent or response variable. The notion is that we know the independent variables
and use them in the forecasting model to forecast the dependent variable.
Consider the following examples:
1. If y is the demand for baby food, then x might be the number of children between 7
and 24 months old.
2. If y is the demand for plumbing fixtures, then x1 and x2 might be the number of
housing starts and the number of existing houses, respectively.
3. If y is the traffic volume on a proposed expressway, x1 and x2 might be the traffic volume on each of two nearby existing highways.
4. If y is the yield of usable material per pound of ingredients from a proposed chemical
plant, then x might be the same quantity produced by a small-scale experimental
plant.
For a causal model to be useful, either the independent variables must be known in
advance or it must be possible to forecast them more easily than ŷ, the dependent variable.
For example, knowing a functional relationship between the pounds of sauerkraut and the
number of bratwurst sold in Milwaukee in the same year may be interesting to sociologists,
but unless sauerkraut usage can be easily predicted, the relationship is of little value for
anyone in the bratwurst forecasting business. More generally, companies often find by
looking at past performance that their monthly sales are directly related to the monthly
GDP, and thus figure that a good forecast could be made using next month’s GDP figure.
The only problem is that this quantity is not known, or it may just be a forecast and thus
not a truly independent variable.
To use a causal forecasting model, then, requires two conditions:
1. There must be a relationship between values of the independent and dependent variables such that the former provides information about the latter.
2. The values for the independent variables must be known and available to the forecaster at the time the forecast must be made.
Before we proceed, let’s reemphasize what we mean by point 1. Simply because there is
a mathematical relationship does not guarantee that there is really cause and effect. Since
the Super Bowl began in 1967, almost every time an NFC team wins, the stock market’s
Standard & Poor 500 indicator increases for that year. When an AFC team wins, the market
usually goes down. In 32 years this rule has worked 88% of the time (28 out of 32)! If you
really believed there was a significant relationship between these two variables (which team
wins the Super Bowl and subsequent stock market performance that year), then in 1999
you would have taken all of your money out of the stock market and put it into a money
market account (savings) or if you wanted to go even further, you might have sold short
some stocks or the S&P Index because the Denver Broncos (AFC) won the Super Bowl early
that year.
One commonly used approach in creating a causal forecasting model is called curve
fitting.
CURVE FITTING: AN OIL COMPANY EXPANSION
The fundamental ideas of curve fitting are easily illustrated by a model in which one independent variable is used to predict the value of the dependent variable. As a specific example, consider an oil company that is planning to expand its network of modern self-service
gasoline stations. It plans to use traffic flow (measured in the average number of cars per
hour) to forecast sales (measured in average dollar sales per hour).
The firm has had five stations in operation for more than a year and has used historical data to calculate the averages shown in Figure 13.1 (OILCOMP.XLS). These data are
plotted in Figure 13.2. Such a plot is often called a scatter diagram. In order to create this
diagram (or chart) in Excel, we must do the following:
1. Highlight the range of data (B2:C6); then click on the Chart Wizard.
2. In the first step, indicate that you want the XY (Scatter) type of chart (the fifth choice), then indicate you want the first subtype of scatter chart you can choose (only the data points, no lines connecting).
3. In the second step, click on "Next>" because all of Excel's default choices are fine.
4. In the third step, enter the X-axis label as "Cars/hour" and the Y-axis label as "Sales/hour ($)," then click on "Next>."
5. In the final step, click on "As new sheet" to place the chart in a separate worksheet called "Chart1"; then click on "Finish."
We now wish to use these data to construct a function that will enable us to forecast
the sales at any proposed location by measuring the traffic flow at that location and plugging its value into the function we construct. In particular, suppose that the traffic flow at a
proposed location in Buffalo Grove is 183 cars per hour. How might we use the data in
Figure 13.2 to forecast the sales at this location?
Least Squares Fits The method of least squares is a formal procedure for curve fitting. It is a two-step process.
1. Select a specific functional form (e.g., a straight line or a quadratic curve).
2. Within the set of functions specified in step 1, choose the specific function that minimizes the sum of the squared deviations between the data points and the function
values.
FIGURE 13.1
Sales and Traffic Data
FIGURE 13.2
Scatter Plot of Sales Versus
Traffic
To demonstrate this process, consider the sales-traffic flow example. In step 1, assume that
we select a straight line; that is, we restrict our attention to functions of the form y = a + bx.
Step 2 is illustrated in Figure 13.3. Here values for a and b were chosen (we’ll show how to
do this in Excel momentarily), the appropriate line y = a + bx was drawn, and the deviations between observed points and the function are indicated. For example,
d1 = y1 – [a + bx1] = 220 – [a + 150b]
where y1 = actual (observed) sales/hr at location 1 (i.e., 220)
x1 = actual (observed) traffic flow at location 1 (i.e., 150)
a = intercept (on the vertical axis) for function in Figure 13.3
b = slope for the function in Figure 13.3
The value d1² is one measure of how close the value of the function [a + bx1] is to the
observed value, y1; that is, it indicates how well the function fits at this one point.
FIGURE 13.3
Method of Least Squares (sales/hour plotted against cars/hour, with the fitted line y = a + bx and the deviations d1 through d5 between the observed points and the line)
We want the function to fit well at all points. One measure of how well it fits overall is the sum of the squared deviations, Σ_{i=1}^{5} di². Let us now consider a general model with n as opposed to five observations. Then, since each di = yi – (a + bxi), the sum of the squared deviations can be written as

Σ_{i=1}^{n} (yi – [a + bxi])²        (13.1)
Using the method of least squares, we select a and b so as to minimize the sum
shown in equation (13.1). The rules of calculus can be used to determine the values of a
and b that minimize this sum. Over a century ago, mathematicians could not determine
the straight line that minimized the absolute deviation or error |yi – ŷi|, but they could
use calculus to determine the line that minimized the squared error (yi – ŷi)². Thus,
forecasting has been inundated with "least squares" formulas and rationalizations as to
why "squared" errors should be minimized. Today, with the advent of spreadsheets, we
are able to use other error measures (e.g., Mean Absolute Deviation [MAD] and Mean
Absolute Percentage Error [MAPE]) because spreadsheets combined with the Solver
algorithm can now minimize the sums of absolute errors or percentage errors as well.
These two newer error measures will be used and demonstrated extensively in Section
13.4.
To continue with the development of the traditional least squares approach, the procedure is to take the partial derivative of the sum in equation (13.1) with respect to a and set
the resulting expression equal to zero. This yields one equation. A second equation is
derived by following the same procedure with b. The equations that result from this procedure are
Σ_{i=1}^{n} –2(yi – [a + bxi]) = 0        and        Σ_{i=1}^{n} –2xi(yi – [a + bxi]) = 0
Recall that the values for xi and yi are the observations, and our goal is to find the values of a and b that satisfy these two equations. The solution can be shown to be
b = [ Σ_{i=1}^{n} xiyi – (1/n)(Σ_{i=1}^{n} xi)(Σ_{i=1}^{n} yi) ] / [ Σ_{i=1}^{n} xi² – (1/n)(Σ_{i=1}^{n} xi)² ]

a = (1/n) Σ_{i=1}^{n} yi – b (1/n) Σ_{i=1}^{n} xi        (13.2)
The next step is to determine the values of Σxi, Σxi², Σyi, and Σxiyi. Note that these quantities
depend only on the data we have observed and that we can find them with simple arithmetic operations. Of course, Excel is highly capable of doing this for us and in fact has
some predefined functions that will do this automatically. To do this, we simply
1. Click on Tools, then “Data Analysis . . .” (If you don’t see this as a choice on the submenu, you will need to click on “Add-ins” and then select “Analysis ToolPak.”)
2. Choose “Regression” and it brings up the dialog box shown in Figure 13.4. Enter the
Y-range as $C$2:$C$6 and the X-range as $B$2:$B$6.
3. Indicate that we want the results reported to us in a separate spreadsheet entitled
“Results.”
4. Click on OK.
The results that are automatically calculated and reported by Excel are shown in Figure
13.5.
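For readers who want to check the arithmetic outside the spreadsheet, here is a minimal sketch of equation (13.2) in Python. The five traffic/sales pairs below are hypothetical stand-ins for the observations in Figure 13.1 (OILCOMP.XLS); only with the actual data would the output match the regression report (a ≈ 57.1, b ≈ 0.93).

```python
# Closed-form least squares line y = a + b*x, following equation (13.2).
# The traffic/sales pairs are hypothetical stand-ins for the five stations in Figure 13.1.
traffic = [150, 120, 180, 210, 90]    # cars/hour (placeholder values)
sales   = [220, 160, 250, 270, 140]   # sales/hour in dollars (placeholder values)

n = len(traffic)
sum_x  = sum(traffic)
sum_y  = sum(sales)
sum_xy = sum(x * y for x, y in zip(traffic, sales))
sum_x2 = sum(x * x for x in traffic)

b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)   # slope
a = sum_y / n - b * sum_x / n                                  # intercept

print(f"a = {a:.3f}, b = {b:.5f}")
print(f"forecast at 183 cars/hour: {a + b * 183:.2f} dollars/hour")
```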
FIGURE 13.4
Excel’s Regression Tool
Dialog Box
FIGURE 13.5
Results of Regression
There is a wealth of information that is reported to us, but the parameters of immediate interest to us are contained in cells B17:B18. We note that the “Intercept” (a) and “X
Variable 1” value (b) are reported as:
b = 0.92997
a = 57.104
To add the resulting least squares line we must follow these steps:
1. Click on the worksheet with our original scatter plot (Chart 1).
2. Click on the data series so that they’re highlighted.
3. Click on the Chart menu, followed by “Add Trendline . . .”
4. In responding to the type of trendline we want, click on OK (because the linear trend
is the default choice).
5. Click on OK.
The result is shown in Figure 13.6 as a solid line.
Let’s explore what some of the other information reported in the regression’s
“Summary Output” means. The “R Square” value reported in cell B5 of Figure 13.5 is given
as 69.4%. This is a “goodness of fit” measure, like the sum of squared deviations. This number represents the R2 statistic discussed in introductory statistics classes. It ranges in value
from 0 to 1 and gives us an idea of how much of the total variation in Y from its mean is
explained by the new line we’ve drawn. Put another way, statisticians like to talk about the
FIGURE 13.6
Least Squares Linear
Trend Line
three different sums of errors (Total Sum of Squares [TSS], Error Sum of Squares [ESS],
and Regression Sum of Squares [RSS]). The basic relationship between them is:
TSS = ESS + RSS
and they are defined as follows:
TSS = Σ_{i=1}^{n} (Yi – Ȳ)²

ESS = Σ_{i=1}^{n} (Yi – Ŷi)²

RSS = Σ_{i=1}^{n} (Ŷi – Ȳ)²        (13.3)
The ESS is the quantity that we tried to minimize with the Regression tool. Essentially, the
sum of squared errors that is left after regression has done its job (ESS) is the amount of
variation that can’t be explained by the regression. The RSS quantity is effectively the
amount of the original, total variation (TSS) that we could remove by using our newfound
regression line. Put another way, R² is defined as

R² = RSS / TSS
If we could come up with a perfect fitting regression line (ESS = 0), we note that RSS = TSS
and the R2 = 1.0 (its maximum value). In our case, R2 = .694, meaning we can explain
approximately 70% of the variation in the Y values by using our one explanatory variable
(X), cars per hour.
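The decomposition in equation (13.3) can be verified numerically. The sketch below (Python, hypothetical placeholder data) fits the least squares line, then computes TSS, ESS, RSS, and R² = RSS/TSS; with the actual oil-company data, R² would come out to about 0.694.

```python
# TSS, ESS, RSS, and R-squared, per equation (13.3).
# x and y are hypothetical placeholder data; y_hat comes from the fitted least squares line.
x = [150, 120, 180, 210, 90]
y = [220, 160, 250, 270, 140]
n = len(x)

b = (sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n) / \
    (sum(xi * xi for xi in x) - sum(x) ** 2 / n)
a = sum(y) / n - b * sum(x) / n
y_hat = [a + b * xi for xi in x]

y_bar = sum(y) / n
tss = sum((yi - y_bar) ** 2 for yi in y)
ess = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
rss = sum((fi - y_bar) ** 2 for fi in y_hat)

print(f"TSS = {tss:.1f}  ESS = {ess:.1f}  RSS = {rss:.1f}   (TSS = ESS + RSS)")
print(f"R^2 = RSS/TSS = {rss / tss:.3f}")
```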
Now let’s get back to our original task—should we build a station at Buffalo Grove
where the traffic is 183 cars/hour? Our best guess at what the corresponding sales volume
would be is found by placing this X value into our new regression equation:
Sales/hour = 57.104 + 0.92997 * (183 cars/hour)
This gives us a forecasted sales/hour of $227.29. How confident are we in our forecast? It would
be nice to be able to state a 95% confidence interval around our best guess. The information we
need to do that is also contained in Excel’s summary output. In cell B7, Excel reports that the
standard error (Se) is 44.18. This quantity represents the amount of scatter in the actual data
around our regression line and is very similar in concept to the ESS. In fact, its formula is
Se = √[ Σ_{i=1}^{n} (Yi – Ŷi)² / (n – k – 1) ]        (13.4)

where n is the number of data points (5 in our example), and k is the number of independent variables (1 in our example). We can see that equation (13.4) is equivalent to

Se = √[ ESS / (n – k – 1) ]
Once we have this Se value, we can take advantage of a rough rule of thumb that is based on
the normal distribution and states that we can have 68% confidence that the actual value of
sales/hour would be within ± 1 Se of our predicted value ($227.29). Likewise we have 95%
confidence that the actual value of sales/hour would be within ± 2 Se of our predicted value
($227.29), meaning our 95% confidence interval would be
[227.29 – 2(44.18); 227.29 + 2(44.18)] or [$138.93; $315.65]
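Here is a small sketch of equation (13.4) and the rough two-standard-error interval in Python. The observed and fitted values in the first part are hypothetical; the second part simply redoes the interval arithmetic with the forecast and Se quoted in the text.

```python
import math

# Equation (13.4): standard error of the estimate.
def standard_error(y, y_hat, k):
    """Se = sqrt( sum of (Yi - Yhat_i)^2 / (n - k - 1) ); k = number of independent variables."""
    n = len(y)
    ess = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return math.sqrt(ess / (n - k - 1))

# Illustration with hypothetical observed and fitted values (k = 1 independent variable).
y_obs = [220.0, 160.0, 250.0, 270.0, 140.0]
y_fit = [196.6, 168.7, 224.5, 252.4, 140.8]
print(f"Se on the illustrative data = {standard_error(y_obs, y_fit, k=1):.2f}")

# Rough 95% interval around the forecast quoted in the text (forecast = $227.29, Se = 44.18).
forecast, se = 227.29, 44.18
print(f"approximate 95% interval: [{forecast - 2 * se:.2f}, {forecast + 2 * se:.2f}]")   # about [138.93, 315.65]
```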
To be more precise about these confidence intervals requires that we calculate Sp (the standard
prediction error), which is always larger than Se but is more complicated to derive and
beyond the scope of this introductory coverage. The intuition to remember here is that
when we're trying to predict Y based on values of X that are near the mean X̄, then Sp is very close to
Se. The farther the X values get from X̄, the larger the difference between Sp and Se.
Another value of interest in the Summary report is the t-statistic for the X variable and
its associated values (cells D18:G18). The t-statistic is given in cell D18 as 2.61. The P-value
in cell E18 is 0.0798. We desire to have the P-value less than 0.05, which would indicate that
we have at least 95% confidence that the slope parameter (b) is statistically significantly different from zero (a zero slope would be a flat line and indicate no relationship between Y
and X). In fact, Excel provides the 95% confidence interval for its estimate of b. In our case,
we have 95% confidence the true value for b is between –0.205 and 2.064 (cells F18 and
G18) and thus we can’t exclude the possibility that the true value of b might be zero.
Lastly, the F-significance reported in cell F12 is identical to the P-value for the t-statistic (0.0798) and will always be so if there is only a single independent variable. In the case
of more than one X variable, the F-significance tests the hypothesis that all the X variable
parameters as a group are statistically significantly different from zero.
One final note as you move into multiple regression models: As you add other X variables, the R2 statistic will always increase, meaning the RSS has increased. But in order to
keep from overmassaging the data (similar to what we discussed with the polynomial of
degree n – 1), you should keep an eye on the Adjusted R2 statistic (cell B6) as the more reliable indicator of the true goodness of fit because it compensates for the reduction in the
ESS due to the addition of more independent variables. Thus it may report a decreased
adjusted R² value even though R² has increased, whenever the improvement in RSS is not
large enough to compensate for the degrees of freedom used up by the new independent variables.
We should point out that we could have obtained the same parameter values for a and
b by using the Solver algorithm (set our objective function as the sum of the squared deviations, let the decision variables be a and b and then turn the Solver loose in trying to minimize our nonlinear objective function).
Note that our forecast also “predicts” earning $57.10 (the value for a) when no cars
arrive (i.e., cars/hour = 0). At this point it might be well to establish limits on the range over
which we feel that the forecast is valid (e.g., from 30 to 250 cars) or seek a logical explanation. Many service stations have convenience foods and also do a walk-in business. Thus
“a” might represent the amount of walk-in business (which would be constant regardless of
how much car traffic there is).
Fitting a Quadratic Function The example above has shown how to make linear fits
for the case of one independent variable. But the method of least squares can be used with
any number of independent variables and with any functional form. As an illustration, suppose that we wish to fit a quadratic function of the form
y = a0 + a1x + a2x²
to our previous data with the method of least squares. Our goal, then, is to select a0, a1, and
a2 in order to minimize the sum of squared deviations, which is now
Σ_{i=1}^{5} (yi – [a0 + a1xi + a2xi²])²        (13.5)
We proceed by setting the partial derivatives with respect to a0, a1, and a2 equal to zero. This
gives the equations
5a0 + (Σxi)a1 + (Σxi²)a2 = Σyi
(Σxi)a0 + (Σxi²)a1 + (Σxi³)a2 = Σxiyi
(Σxi²)a0 + (Σxi³)a1 + (Σxi⁴)a2 = Σxi²yi.        (13.6)
This is a simple set of three linear equations in three unknowns. Thus, the general name for
this least squares curve fitting is “Linear Regression.” The term linear comes not from a
straight line being fit, but from the fact that simultaneous linear equations are being solved.
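System (13.6) is just three linear equations, so any linear-equation routine will solve it. The sketch below builds the coefficient matrix from hypothetical (x, y) data and solves it with NumPy; with the actual data in OILCOMP.XLS the result should agree with the Solver answer reported shortly (a0 ≈ –13.6, a1 ≈ 2.15, a2 ≈ –0.0044).

```python
import numpy as np

# Solve the normal equations (13.6) for the quadratic y = a0 + a1*x + a2*x^2.
# x and y below are hypothetical placeholder data, not the workbook's actual figures.
x = np.array([150.0, 120.0, 180.0, 210.0, 90.0])
y = np.array([220.0, 160.0, 250.0, 270.0, 140.0])

A = np.array([
    [len(x),       x.sum(),       (x**2).sum()],
    [x.sum(),      (x**2).sum(),  (x**3).sum()],
    [(x**2).sum(), (x**3).sum(),  (x**4).sum()],
])
rhs = np.array([y.sum(), (x * y).sum(), (x**2 * y).sum()])

a0, a1, a2 = np.linalg.solve(A, rhs)      # three equations, three unknowns
print(f"a0 = {a0:.3f}, a1 = {a1:.3f}, a2 = {a2:.5f}")
```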
Finding the numerical values of the coefficients is a straightforward task in a spreadsheet. This time instead of using the “Regression” tool in Excel, we will demonstrate the use
of the Solver algorithm. We use the new spreadsheet called “Quadratic” in the same
OILCOMP.XLS workbook that is shown in Figure 13.7.
FIGURE 13.7
Quadratic Trend Spreadsheet

Cell   Formula                          Copy To
D7     = $B$2 + $B$3*B7 + $B$4*B7^2     D8:D11
E7     = C7 - D7                        E8:E11
F7     = E7^2                           F8:F11
F13    = SUM(F7:F11)                    —
F15    = SUMXMY2(C7:C11,D7:D11)         —
FIGURE 13.8
Solver Dialog Box for
Quadratic Trend
The steps to find the optimal values for the parameters (a0, a1, and a2) are indicated
below:
1. Click on the Tools menu, and then “Solver . . .”
2. Complete the Solver Parameters dialog box, as shown in Figure 13.8. Click on “Solve.”
We are basically setting up an unconstrained, nonlinear optimization model, where
the three parameters (cells B2:B4) are our decision variables (changing cells) and our
objective function is to minimize the sum of squared errors (cell F13).
3. When Solver returns its dialog box showing that it has found a solution, click on OK
and you see the results shown in Figure 13.9.
We see the optimal parameters are:
a0 = –13.586
a1 = 2.147
a2 = –0.0044
which yields a sum of squared errors of 4954. Note: Excel has a built-in function to help us
calculate this quantity directly (i.e., we could do it without columns E and F) known as
=sumxmy2(range1, range2) and it is shown in cell F15 of Figure 13.9. The function takes
FIGURE 13.9
Results for Optimal
Quadratic Parameters
FIGURE 13.10
Quadratic Least Squares
Function Fit to Data
the values in the second range and subtracts them from the values in the first range (one at
a time), squares the difference, and sums these squared differences up for all the values in
the range. To plot the original data and this quadratic function, we use the Chart Wizard
with the following steps:
1. Highlight the original range of data (B7:C11) in the “Quadratic” spreadsheet, then
click on the Chart Wizard.
2. In the first step, indicate that you want the XY (Scatter) type of chart (the fifth
choice), then indicate that you want the first subtype of scatter chart you can choose
(only the data points, no lines connecting).
3. In the second step, click on “Next>” because all of Excel’s default choices are fine.
4. In the third step, enter the X-axis label as “Cars/hour” and the Y-axis label as
“Sales/hour($)”; then click on “Next>.”
5. In the final step, click on “As new sheet” to place the chart in a separate worksheet
called “Chart2”; then click on “Finish.”
6. Click on the data series in Chart2 so that they’re highlighted.
7. Click on the Chart menu, followed by “Add Trendline.”
8. In responding to the type of trendline we want, click on “Polynomial” of order 2.
9. Click on OK and you get the graph shown in Figure 13.10.
To do this same thing with the "Regression" tool, you must first create a column for a second independent variable, X2 = X1², and then regress Y (Sales/hr) on both X1 (Cars/hr) and
X2 ([Cars/hr]²). We leave this as an exercise for the student (see Problem 13-23).
Comparing the Linear and Quadratic Fits In the method of least squares, we
have selected the sum of the squared deviations as our measure of “goodness of fit.” We can
thus compare the linear and the quadratic fit with this criterion. In order to make this comparison, we have to go back and use the linear regression “Results” spreadsheet and make
the corresponding calculation in the original “Data” spreadsheet. This work is shown in
Figure 13.11.
We see that the sum of the squared deviations for the quadratic function is indeed
smaller than that for the linear function (i.e., 4954 < 5854.7). Indeed, the quadratic gives us
FIGURE 13.11
Sum of Squared Errors Calculation for Linear Regression Results

Cell   Formula                              Copy To
D2     = Results!$B$17 + Results!$B$18*B2   D3:D6
E2     = C2 - D2                            E3:E6
F2     = E2^2                               F3:F6
F8     = SUM(F2:F6)                         —
roughly a 15% decrease in the sum of squared deviations. The general result has to hold in
this direction; that is, the quadratic function must always fit at least as well as the linear function.
A linear function is, after all, a special type of a quadratic function (one in which a2 = 0). It
follows then that the best quadratic function must be at least as good as the best linear
function.
WHICH CURVE TO FIT?
If a quadratic function is at least as good as a linear function, why not choose an even more
general form, such as a cubic or a quartic, thereby getting an even better fit? In principle the
method can be applied to any specified functional form. In practice, functions of the form
(again using only a single independent variable for illustrative purposes)
y = a0 + a1x + a2x² + · · · + anxⁿ
are often suggested. Such a function is called a polynomial of degree n, and it represents a
broad and flexible class of functions (for n = 2 we have a quadratic, n = 3 a cubic, n = 4 a
quartic, etc.). One can obtain an amazing variety of curves with polynomials, and thus they
are popular among curve fitters. One must, however, proceed with caution when fitting
data with a polynomial function. Under quite general conditions it is possible, for example,
to find a (k – 1)-degree polynomial that will perfectly fit k data points. To be more specific,
suppose that we have on hand seven historical observations, denoted (xi , yi), i = 1, 2, . . . , 7.
It is possible to find a sixth-degree polynomial
y = a0 + a1x + a2x² + · · · + a6x⁶
that exactly passes through each of these seven data points (see Figure 13.12).
This perfect fit (giving zero for the sum of squared deviations), however, is deceptive,
for it does not imply as much as you may think about the predictive value of the model for
use in future forecasting. For example, refer again to Figure 13.12. When the independent
variable (at some future time) assumes the value x8, the true value of y might be given by y8,
whereas the predicted value is ŷ8. Despite the previous perfect fit, the forecast is very inaccurate. In this situation a linear fit (i.e., a first-degree polynomial) such as the one indicated
in Figure 13.12 might well provide more realistic forecasts, although by the criterion of
least squares it does not “fit” the historical data nearly as well as the sixth-degree polynomial. Also, note that the polynomial fit has hazardous extrapolation properties. That is, the
polynomial “blows up” at its extremes; x values only slightly larger than x6 produce very
large predicted y’s. Looking at Figure 13.12, you can understand why high-order polynomial fits are referred to as “wild.”
FIGURE 13.12
A Sixth-Degree Polynomial Produces a Perfect Fit (the sixth-degree polynomial passes exactly through the seven data points at x1, . . . , x7, while a linear fit does not; at the future value x8 the polynomial's prediction ŷ8 is far from the true value y8)
One way of finding which fit is truly “better” is to use a different standard of comparison, the “mean squared error” or MSE—which is the sum of squared errors/(number of
points – number of parameters). For our linear fit, the number of parameters estimated
is 2 (a, b), so MSE = 5854/(5 – 2) = 1951.3; and for the quadratic fit, the MSE = 4954/
(5 – 3) = 2477.0. Thus, the MSE gets worse in this case, even though the total sum of squares
will always be less or the same for a higher-order fit. (Note: This is similar to how the adjusted
R2 statistic works.) We still have to be somewhat careful even with this new standard of comparison, because when there is a perfect fit, both the total sum of squares and MSE will be
0.00. Because of this, most prepackaged forecasting programs will fit only up through a cubic
polynomial, since higher degrees simply don’t reflect the general trend of actual data.
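The behavior described above is easy to reproduce. The sketch below (Python, with made-up, roughly linear data) fits a straight line and a degree n – 1 polynomial to n = 7 points: the high-order fit has essentially zero SSE yet gives an absurd prediction one period beyond the data, and its MSE, computed as SSE/(points – parameters), is not even defined. NumPy may warn that the high-degree fit is poorly conditioned.

```python
import numpy as np

# Overfitting demonstration: a degree (n - 1) polynomial fits n points perfectly
# but extrapolates wildly. The data are made up and roughly linear.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
y = np.array([2.1, 2.9, 3.6, 4.8, 5.1, 6.3, 6.8])

for degree in (1, len(x) - 1):
    params = degree + 1                              # number of fitted parameters
    coeffs = np.polyfit(x, y, degree)                # least squares polynomial fit
    sse = float(np.sum((y - np.polyval(coeffs, x)) ** 2))
    mse = sse / (len(x) - params) if len(x) > params else float("nan")
    pred = float(np.polyval(coeffs, 8.0))            # extrapolate one period past the data
    print(f"degree {degree}: SSE = {sse:.4f}, MSE = {mse:.4f}, prediction at x = 8: {pred:.2f}")
```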
What Is a Good Fit? The intent of the paragraphs above is to suggest that a model
that has given a good fit to historical data may provide a terrible fit to future data. That is, a
good historical fit may have poor predictive power. So what is a good fit?
The answer to this question involves considerations both philosophic and technical. It
depends, first, on whether one has some idea about the underlying real-world process that
relates the y’s and the x’s. To be an effective forecasting device, the forecasting function
must to some extent capture important features of that process. The more one knows, the
better one can do. To go very far into this topic, one must employ a level of statistics that
would extend well beyond this introductory coverage. For our purposes it suffices to state
that knowledge of the underlying process is typically phrased in statistical language. For
example, linear curve fitting, in the statistical context, is called linear regression. If the statistical assumptions about the linear regression model are precisely satisfied (e.g., errors are
normally distributed around the regression line), then in a precise and well-defined sense
statisticians can prove that the linear fit is the “best possible fit.”
But in a real sense, this begs the question. In the real world one can never be completely certain about the underlying process. It is never “served to us on a silver platter.”
One only has some (and often not enough) historical data to observe. The question then
becomes: How much confidence can we have that the underlying process is one that satisfies a particular set of statistical assumptions? Fortunately, quantitative measures do exist.
Statistical analysis, at least for simple classes of models like linear regression, can reveal how
well the historical data do indeed satisfy those assumptions.
And what if they do not? One tries a different model. Let us regress (digress) for a
moment to recall some of the philosophy involved with the use of optimization models
(which is exactly what least squares fitting is—an unconstrained nonlinear optimization).
There is an underlying real-world problem. The model is a selective representation of that
problem. How good is that model, or representation? One usually does not have precise
measures, and many paragraphs in this text have been devoted to the role of managerial
judgment and sensitivity analysis in establishing a model’s credibility. Ideally, to test the
goodness of a model, one would like to have considerable experience with its use. If, in
repeated use, we observe that the model performs well, then our confidence is high.1
However, what confidence can we have at the outset, without experience?
Validating Models One benchmark, which brings us close to the current context, is
to ask the question: Suppose the model had been used to make past decisions; how well
would the firm have fared? This approach “creates” experience by simulating the past.
This is often referred to as validation of the model. One way to use this approach, in the
forecasting context, is called “divide and conquer” and is discussed in Section 13.5.
Typically, one uses only a portion of the historical data to create the model—for example, to fit a polynomial of a specified degree. One can then use the remaining data to see
how well the model would have performed. This procedure is specified in some detail in
Section 13.5. At present, it suffices to conclude by stressing that in curve fitting the question of “goodness of fit” is both philosophic and technical, and you do not want to lose
sight of either issue.
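A minimal sketch of that validation idea in Python, with made-up data: fit candidate polynomials on the earlier part of the history and compare their squared errors on the periods held back. The specific split and degrees are arbitrary choices for illustration; the "divide and conquer" procedure itself is spelled out in Section 13.5.

```python
import numpy as np

# Holdout validation sketch: fit on the earlier periods, score on the later ones.
# The data, the 8/4 split, and the candidate degrees are all illustrative choices.
rng = np.random.default_rng(1)
x = np.arange(1, 13, dtype=float)                     # 12 historical periods
y = 50 + 3 * x + rng.normal(0, 4, size=x.size)        # roughly linear series with noise

x_fit, y_fit = x[:8], y[:8]                           # data used to fit the curve
x_hold, y_hold = x[8:], y[8:]                         # data held back to judge the fit

for degree in (1, 2, 5):
    coeffs = np.polyfit(x_fit, y_fit, degree)
    holdout_sse = float(np.sum((y_hold - np.polyval(coeffs, x_hold)) ** 2))
    print(f"degree {degree}: holdout SSE = {holdout_sse:.1f}")
```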
SUMMARY
A causal forecasting model uses one or more independent variables to forecast the value of
a dependent or response variable. The model is often created by fitting a curve to an existing set of data and then using this curve to determine the response associated with new values of the independent variable(s). The method of least squares is a particularly useful
method of fitting a curve. We illustrated the general concept of this method and considered
the specific problems of fitting a straight line, a quadratic function, and higher-order polynomials to a set of data. For simplicity, all of our illustrations involved a single independent
variable but the same techniques apply to models with many variables.
These few examples of causal forecasting models demonstrate that even in simple
models the required calculations are tedious. The wide availability of spreadsheets, however, has
made the burden of performing the necessary calculations insignificant.
The important questions are: What model, if any, can do a reliable job of forecasting, and
are the data required for such a model available and reliable?
We have discussed both philosophic and technical issues that the “curve fitting” manager must address. Comments on the role of causal models in managerial decision making
are reserved for Section 13.7. We now turn our attention to time-series analysis.
13.4 TIME-SERIES FORECASTING MODELS
Another class of quantitative forecasting techniques comprises the so-called time-series
forecasting models. These models produce forecasts by extrapolating the historical behavior
of the values of a particular single variable of interest. For example, one may be interested in
the sales for a particular item, or the fluctuation of a particular market price over time. Time-series models use a technique to extrapolate the historical behavior into the future.
Figuratively, the series is being lifted into the future “by its own bootstraps.” Time-series
data are historical data in chronological order, with only one value per time period. Thus,
the data for the service station from the previous section are not time-series data and cannot be analyzed using the techniques in this section.
1 No matter how much observation seems to substantiate the model, we can never conclude that the model
is “true.” Recall the high degree of “substantiation” of the flat earth model. “If you leave port and sail westward
you will eventually fall off the earth and never be seen again.”
EXTRAPOLATING HISTORICAL BEHAVIOR
In order to provide several examples of bootstrap methods, let us suppose that we have on
hand from the Wall Street Journal the daily closing prices of a March cocoa futures contract
for the past 12 days, including today, and that from this past stream of data we wish to predict tomorrow’s closing price. Several possibilities come to mind:
1. If it is felt that all historical values are important, and that all have equal predictive
power, we might take the average of the past 12 values as our best forecast for tomorrow.
2. If it is felt that today’s value (the 12th) is far and away the most important, this value
might be our best prediction for tomorrow.
3. It may be felt that in the current “fast-trending market” the first six values are too
antiquated, but the most recent six are important and each has equal predictive
power. We might then take the average of the most recent six values as our best estimate for tomorrow.
4. It may be felt that all past values contain useful information, but today’s (the 12th
observation) is the most important of all, and, in succession, the 11th, 10th, 9th, and
so on, observations have decreasing importance. In this case we might take a weighted
average of all 12 observations, with increasing weights assigned to each value in the
order 1 through 12 and with the 12 weights summing to 1.
5. We might actually plot the 12 values as a function of time and then draw a linear
“trend line” that lies close to these values. This line might then be used to predict
tomorrow’s value.
Let us now suppose that tomorrow’s actual closing price is observed and consider our
forecast for the day after tomorrow, using the 13 available historical values. Methods 1 and
2 can be applied in a straightforward manner. Now consider method 3. In this case we
might take tomorrow’s actual observed price, together with today’s and the previous four
prices, to obtain a new 6-day average. This technique is called a simple 6-period moving
average, and it will be discussed in more detail in the following sections.
Let us now refer to method 4. In this instance, since we employ all past values, we
would be using 13 rather than 12 values, with new weights assigned to these values. An
important class of techniques called exponential smoothing models operate in this fashion.
These models will also be explored in the ensuing discussion.
Finally, we shall explore in more detail the technique mentioned in item 5. This provides another illustration of forecasting by a curve fitting method.
We mention at this point that whenever we have values for a particular (single) variable of interest, which can be plotted against time, these values are often termed a time
series, and any method used to analyze and extrapolate such a series into the future falls
within the general category of time-series analysis. This is currently a very active area of
research in statistics and management science. We will be able to barely scratch the surface
in terms of formal development. Nevertheless, some of the important concepts, from the
manager’s viewpoint, will be developed. In this section, we will use the error measures of
MAD (mean absolute deviation) and MAPE (mean absolute percentage error) instead of
mean squared error (MSE), which was used extensively in Section 13.3.
CURVE FITTING
We have already considered curve fitting in the discussion of causal models. The main difference in the time-series context is that the independent variable is time. The historical
observations of the dependent variable are plotted against time, and a curve is then fitted to
these data. The curve is then extended into the future to yield a forecast. In this context,
extending the curve simply means evaluating the derived function for larger values of t, the
time. This procedure is illustrated for a straight line in Figure 13.13.
FIGURE 13.13
Fitting a Straight Line (historical sales data plotted against time, with the fitted line extended beyond period t to give forecasts for periods t + 1 and t + 2)
The use of time as an independent variable has more serious implications than altering
a few formulas, and a manager should understand the important difference between a
causal model using curve fitting and a time-series model using curve fitting. One of the
assumptions with curve fitting is that all the data are equally important (weighted). This
method produces a very stable forecast that is fairly insensitive to slight changes in the data.
The mathematical techniques for fitting the curves are identical, but the rationale, or
philosophy, behind the two models is basically quite different. To understand this difference, think of the values of y, the variable of interest, as being produced by a particular
underlying process or system. The causal model assumes that as the underlying system
changes to produce different values of y, it will also produce corresponding differences in
the independent variables and thus, by knowing the independent variables, a good forecast
of y can be deduced. The time-series model assumes that the system that produces y is
essentially stationary (or stable) and will continue to act in the future as it has in the past.
Future patterns in the movement of y will closely resemble past patterns. This means that
time is a surrogate for many factors that may be difficult to measure but that seem to vary
in a consistent and systematic manner with time. If the system that produces y significantly
changes (e.g., because of changes in environment, technology, or government policy), then
the assumption of a stationary process is invalid and consequently a forecast based on time
as an independent variable is apt to be badly in error.
Just as for causal models, it is, of course, possible to use other than linear functions to
extrapolate a series of observations (i.e., to forecast the future). As you might imagine, one
alternative that is often suggested in practice is to assume that y is a higher-order polynomial in t, that is,
yt = b0 + b1t + b2t² + . . . + bktᵏ
As before, appropriate values for the parameters b0, b1, . . . ,bk must be mathematically
derived from the values of previous observations. The higher-order polynomial, however,
suffers from the pitfalls described earlier. That is, perfect (or at least extremely good) historical fits with little or no predictive power may be obtained.
MOVING AVERAGES: FORECASTING STECO’S STRUT SALES
The assumption behind models of this type is that the average performance over the recent
past is a good forecast of the future. The fact that only the most recent data are being used
to forecast the future, and perhaps the weighting of the most recent data most heavily, produces a forecast that is much more responsive than a curve fitting model. This new type of
model will be sensitive to increases or decreases in sales, or other changes in the data. It is
perhaps surprising that these “naive” models are extremely important in applications.
Many of the world's major airlines need to generate forecasts of future demand by fare
class in order to feed these forecasts into sophisticated revenue management optimization
engines. A great number of the airlines use a particular type of moving average called exponentially weighted moving averages. In addition, almost all inventory control packages
include a forecasting subroutine based on this same type of moving average (exponentially
weighted moving averages). On the basis of a criterion such as “frequency of use,” the
method of moving averages is surely an important forecasting procedure.
One person who is deeply concerned about the use of simple forecasting models is
Victor Kowalski, the new vice president of operations of STECO. His introduction to
inventory control models is discussed in Chapter 7. Since he is responsible for the inventory
of thousands of items, simple (i.e., inexpensive) forecasting models are important to him.
In order to become familiar with the various models, he decides to “try out” different models on some historical data. In particular he decides to use last year’s monthly sales data for
stainless steel struts to learn about the different models and to see how well they would
have worked if STECO had been using the models last year. He is performing what is called
a validation study.
The forecasting models are presented, of course, in symbols. Victor feels that it would
be useful to use a common notation throughout his investigation. He thus decides to let
yt–1 = observed sales of struts in month t – 1
ŷt = forecast of sales for struts in period t
He is interested in forecasting the sales one month ahead; that is, he will take the known historical values y1, . . . , yt–1 (demand in months 1 through t – 1) and use this information to
produce ŷt the forecast for demand in month t. In other words, he will take the actual past
sales, up through May, for example, and use them to forecast the sales in June; then he will use
the sales through June to forecast sales in July, and so on. This process produces a sequence of
ŷt values. By comparing these values with the observed yt values, one obtains an indication of
how the forecasting model would have worked had it actually been in use last year.
Simple n-Period Moving Average The simplest model in the moving average category is the simple n-period moving average. In this model the average of a fixed number
(say, n) of the most recent observations is used as an estimate of the next value of y. For
example, if n equals 4, then after we have observed the value of y in period 15, our estimate
for period 16 would be
ŷ16 = (y15 + y14 + y13 + y12)/4

In general,

ŷt+1 = (1/n)(yt + yt–1 + . . . + yt–n+1)
The application of a 3-period and a 4-period moving average to STECO’s strut sales
data is shown in Table 13.1.
We see that the 3-month moving average forecast for sales in April is the average of
January, February, and March sales, (20 + 24 + 27)/3, or 23.67. Ex post (i.e., after the forecast) actual sales in April were 31. Thus in this case the sales forecast differed from actual
sales by 31 – 23.67, or 7.33.
Comparing the actual sales to the forecast sales using the data in Table 13.1 suggests
that neither forecasting method seems particularly accurate. It is, however, useful to replace
this qualitative impression with some quantitative measure of how well the two methods
performed. The measures of comparison we’ll use in this section are the mean absolute
deviation (MAD) and the mean absolute percentage error (MAPE), where
MAD = [ Σ over all forecasts |actual sales – forecast sales| ] / (number of forecasts)

MAPE = [ Σ over all forecasts ( |actual sales – forecast sales| / actual sales ) ] / (number of forecasts) × 100%        (13.7)
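Equation (13.7) translates directly into two small helper functions. A minimal sketch in Python (the April through June numbers in the illustration are taken from Table 13.1 below):

```python
# Mean absolute deviation and mean absolute percentage error, per equation (13.7).
def mad(actual, forecast):
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def mape(actual, forecast):
    return 100.0 * sum(abs(a - f) / a for a, f in zip(actual, forecast)) / len(actual)

# Small illustration: April-June actual strut sales and the corresponding
# 3-month moving average forecasts from Table 13.1.
actual   = [31, 37, 47]
forecast = [23.67, 27.33, 31.67]
print(f"MAD = {mad(actual, forecast):.2f}, MAPE = {mape(actual, forecast):.1f}%")
```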
Table 13.1  Three- and Four-Month Simple Moving Averages

MONTH   ACTUAL SALES ($000s)   THREE-MONTH SIMPLE MOVING AVERAGE FORECAST   FOUR-MONTH SIMPLE MOVING AVERAGE FORECAST
Jan.    20                     —                                            —
Feb.    24                     —                                            —
Mar.    27                     —                                            —
Apr.    31                     (20 + 24 + 27)/3 = 23.67                     —
May     37                     (24 + 27 + 31)/3 = 27.33                     (20 + 24 + 27 + 31)/4 = 25.50
June    47                     (27 + 31 + 37)/3 = 31.67                     (24 + 27 + 31 + 37)/4 = 29.75
July    53                     (31 + 37 + 47)/3 = 38.33                     (27 + 31 + 37 + 47)/4 = 35.50
Aug.    62                     (37 + 47 + 53)/3 = 45.67                     (31 + 37 + 47 + 53)/4 = 42.00
Sept.   54                     (47 + 53 + 62)/3 = 54.00                     (37 + 47 + 53 + 62)/4 = 49.75
Oct.    36                     (53 + 62 + 54)/3 = 56.33                     (47 + 53 + 62 + 54)/4 = 54.00
Nov.    32                     (62 + 54 + 36)/3 = 50.67                     (53 + 62 + 54 + 36)/4 = 51.25
Dec.    29                     (54 + 36 + 32)/3 = 40.67                     (62 + 54 + 36 + 32)/4 = 46.00
The MAD is calculated for the 3-month (beginning with April) and 4-month (beginning with May) moving average forecast in a new spreadsheet (STRUT.XLS), which is
shown in Figure 13.14. Since the 3-month moving average yields a MAD of 12.67 (cell
D16), whereas the 4-month moving average yields a MAD of 15.59 (cell F16), it seems (at
least historically) that including more historical data harms rather than helps the forecasting accuracy.
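The spreadsheet calculation in Figure 13.14 is easy to reproduce. The following Python sketch applies the 3- and 4-month simple moving averages to the strut sales from Table 13.1 and computes the MAD over the months for which each forecast exists; it gives the same 12.67 and 15.59 reported above.

```python
# Simple n-month moving average forecasts and their MAD for the strut data of Table 13.1.
sales = [20, 24, 27, 31, 37, 47, 53, 62, 54, 36, 32, 29]   # Jan. through Dec. ($000s)

def moving_average_mad(y, n):
    errors = []
    for t in range(n, len(y)):                  # the first forecast uses the first n months
        forecast = sum(y[t - n:t]) / n
        errors.append(abs(y[t] - forecast))
    return sum(errors) / len(errors)

print(f"3-month moving average MAD = {moving_average_mad(sales, 3):.2f}")   # 12.67
print(f"4-month moving average MAD = {moving_average_mad(sales, 4):.2f}")   # 15.59
```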
A simple moving average will always lag behind rising data and stay above declining
data. Thus, if there are broad rises and falls, simple moving averages will not perform well.
They are best suited to data with small erratic ups and downs, providing some stability in
the face of the random perturbations.
FIGURE 13.14
Mean Absolute Deviation Comparison for Three- and Four-Month Moving Average Forecasts

Cell   Formula              Copy To
C5     =AVERAGE(B2:B4)      C6:C13
D5     =ABS(B5-C5)          D6:D13
E6     =AVERAGE(B2:B5)      E7:E13
F6     =ABS(B6-E6)          F7:F13
D15    =SUM(D5:D13)         F15
D16    =AVERAGE(D5:D13)     F16
The simple moving average has two shortcomings, one philosophical and the other
operational. The philosophical problem centers on the fact that in calculating a forecast
(say, ŷ8), the most recent observation (y7) receives no more weight or importance than an
older observation such as y5. This is because each of the last n observations is assigned the
weight 1/n. This procedure of assigning equal weights stands in opposition to one’s intuition that in many instances the more recent data should tell us more than the older data
about the future. Indeed, the analysis in Figure 13.14 suggests that better predictions for
strut sales are based on the most recent data.
The second shortcoming, which is operational, is that if n observations are to be
included in the moving average, then (n – 1) pieces of past data must be brought forward
to be combined with the current (the nth) observation. All this data must be stored in
some way, in order to calculate the forecast. This is not a serious problem when a small
number of forecasts is involved. The situation is quite different for the firm that needs to
forecast the demand for thousands of individual products on an item-by-item basis. If,
for example, STECO is using 8-period moving averages to forecast demand for 5,000
small parts, then for each item 7 pieces of old data must be stored for each forecast, in
addition to the most recent observation. This implies that a total of 40,000 pieces of data
must be stored. Another example where the number of forecasts is huge comes from the
airline industry. Consider an airline with a large number of flights departing per day (like
United Airlines or Continental). Suppose it has 2,000 flights departing every day and it
tracks all flights for 300 days in advance. This means it has 600,000 flights to track and
forecast on an ongoing basis. In both these cases, storage requirements, as well as computing time, may become important factors in designing a forecasting and/or inventory
control system.
Weighted n-Period Moving Average The notion that recent data are more
important than old data can be implemented with a weighed n-period moving average.
This generalizes the notion of a simple n-period moving average, where, as we have seen,
each weight is 1/n. In this more general form, taking n = 3 as a specific example, we
would set
ŷ7 = α0y6 + α1y5 + α2y4
where the α's (which are called weights) are nonnegative numbers that are chosen so that
smaller weights are assigned to more ancient data and all the weights sum to 1. There are, of
course, innumerable ways of selecting a set of α's to satisfy these criteria. For example, if the
weighted average is to include the last three observations (a weighted 3-period moving
average), one might set
ŷ7 = (3/6)y6 + (2/6)y5 + (1/6)y4
Alternatively, one could define
ŷ7 = (5/10)y6 + (3/10)y5 + (2/10)y4
In both these expressions we have decreasing weights that sum to 1. In practice, the proper
choice of weights could easily be left to the Solver algorithm.
To get some idea about its performance, Victor applies the 3-month weighted moving
average with initial weights 3/6, 2/6, 1/6 to the historical stainless strut data. The forecasts
and the MAD are developed by Victor in a new sheet “WMA” in the same workbook
(STRUT.XLS) and are shown in Figure 13.15. Comparing the new MAD Victor obtained of
11.04 (cell G16) to the MAD of the 3-month simple moving average (12.67) and the
4-month simple moving average (15.59) confirms the suggestion that recent sales results
are a better indicator of future sales than are older data.
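A weighted moving average is just as easy to sketch in Python. The weights 3/6, 2/6, 1/6 and the data below are the illustrative values used above, with the largest weight applied to the most recent month.

# Sketch of a weighted 3-period moving average (weights 3/6, 2/6, 1/6,
# largest weight on the most recent observation) and its MAD.
sales = [20, 24, 27, 31, 37, 47, 53, 62, 54, 36, 32]    # illustrative strut data
weights = [3 / 6, 2 / 6, 1 / 6]                         # weights[0] applies to the newest point

forecasts = []
for t in range(3, len(sales)):
    window = sales[t - 3:t]                             # the three most recent observations
    forecasts.append(sum(w * y for w, y in zip(weights, reversed(window))))

mad = sum(abs(a - f) for a, f in zip(sales[3:], forecasts)) / len(forecasts)
print(round(mad, 2))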
FIGURE 13.15
Initial Three-Month Weighted Moving Average

Cell   Formula                          Copy To
B4     SUM(B1:B3)                       —
F5     SUMPRODUCT($B$1:$B$3, E2:E4)     F6:F13
G5     ABS(E5-F5)                       G6:G13
G15    SUM(G5:G13)                      —
G16    AVERAGE(G5:G13)                  —
Of course, if we let the Solver choose the optimal weights for us, we can do even better
than our initial guess at the weights. To let Solver choose the weights that minimize the
MAD, we do the following:
1. Click on Tools, then “Solver.”
2. Set the Target cell to G16 and tell Solver we want to minimize it.
3. Indicate that the changing cells are B1:B3.
4. Add the constraints that (a) B4 = 1.0, (b) B1:B3 ≥ 0, (c) B1:B3 ≤ 1, (d) B1 ≤ B2, and
(e) B2 ≤ B3.
5. Click on Solve and you get the results shown in Figure 13.16.
Here we see that the optimal weighting is to place all the weight on the most recent observation, which yields a MAD of 7.56 (a 31.5% reduction in error from our initial guess).
This continues to confirm our idea that we should give more weight to the most recent
observation (to the extreme in this example).
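Solver is the spreadsheet route to the optimal weights; outside Excel the same constrained search can be sketched with SciPy, assuming that library is available. The constraints below mirror the Solver setup (nonnegative weights that sum to 1 and do not increase as the data get older); because the MAD is not a smooth function, a gradient-based routine such as SLSQP treats it only approximately, and the illustrative data will not reproduce the 7.56 of Figure 13.16 exactly.

# Sketch: choose ordered, nonnegative weights summing to 1 that minimize the MAD,
# mirroring the Solver formulation above (assumes SciPy is installed).
import numpy as np
from scipy.optimize import minimize

sales = np.array([20, 24, 27, 31, 37, 47, 53, 62, 54, 36, 32], dtype=float)

def mad_for_weights(w):
    # w[0] weights the most recent of the three observations in each window.
    forecasts = [w @ sales[t - 3:t][::-1] for t in range(3, len(sales))]
    return float(np.mean(np.abs(sales[3:] - np.array(forecasts))))

result = minimize(
    mad_for_weights,
    x0=np.array([3 / 6, 2 / 6, 1 / 6]),
    method="SLSQP",
    bounds=[(0, 1)] * 3,
    constraints=[
        {"type": "eq", "fun": lambda w: w.sum() - 1},     # weights sum to 1
        {"type": "ineq", "fun": lambda w: w[0] - w[1]},   # newer weight >= next older
        {"type": "ineq", "fun": lambda w: w[1] - w[2]},
    ],
)
print(result.x.round(3), round(result.fun, 2))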
Although the weighted moving average places more weight on more recent data, it
does not solve the operational problems of data storage, since (n – 1) pieces of historical
FIGURE 13.16
Optimal Three-Month
Weighted Moving Average
sales data must still be stored. We now turn to a weighting scheme that cleverly addresses
this problem.
EXPONENTIAL SMOOTHING: THE BASIC MODEL
We saw that, in using a weighted moving average, there are many different ways to assign
decreasing weights that sum to 1. One way is called exponential smoothing, which is a
shortened name for an exponentially weighted moving average. This is a scheme that weights
recent data more heavily than past data, with weights summing to 1, but it avoids the operational problem just discussed. In this model, for any t ≥ 1 the forecast for period t + 1,
denoted ŷt+1, is a weighted sum (with weights summing to 1) of the actual observed sales in
period t (i.e., yt) and the forecast for period t (which was ŷt). In other words,
ŷt+1 = αyt + (1 – α)ŷt        (13.8)
where α is a user-specified constant such that 0 ≤ α ≤ 1. The value assigned to α determines how much weight is placed on the most recent observation in calculating the forecast for the next period. Note in equation (13.8) that if α is assigned a value close to 1,
almost all the weight is placed on the actual demand in period t.
Exponential smoothing has important computational advantages. To compute ŷt+1, only
ŷt need be stored (together with the value of α). As soon as the actual yt is observed, we compute ŷt+1 = αyt + (1 – α)ŷt. If STECO wanted to forecast demand for 5,000 small parts in each
period, then 10,001 items would have to be stored (the 5,000 ŷt values, the 5,000 yt values, and
the value of α), as opposed to the previously computed 40,000 items needed to implement an
8-period moving average. Depending on the behavior of the data, it might be necessary to
store a different value of α for each item, but even then much less storage would be required
than if using moving averages. The nice thing about exponential smoothing is that by
saving α and the last forecast, all the previous forecasts are being stored implicitly.
In order to obtain more insight into the exponential smoothing model, let us note that
when t = 1 the expression used to define ŷ2 is
ŷ2 = αy1 + (1 – α)ŷ1
In this expression ŷ1 is an “initial guess” at the value for y in period 1, and y1 is the
observed value in period 1. To get the exponential smoothing forecast going, we need to
provide this “initial guess.” Several options are available to us: (1) First and most commonly, we let ŷ1 = y1 (i.e., we assume a “perfect” forecast to get the process rolling, but we
don’t count this error of zero in our calculation of the MAD). Other choices include (2)
looking ahead at all the available data and letting ŷ1 = ȳ (the average of all available data), or (3)
letting ŷ1 = the average of just the first couple of months. We will choose the first approach.
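As a minimal sketch of equation (13.8) and of the first initialization option (ŷ1 = y1), the short Python function below produces the whole forecast series while storing only the previous forecast; the data are the illustrative strut values used earlier.

# Sketch of basic exponential smoothing: only alpha and the previous forecast are kept.
def exponential_smoothing(data, alpha, initial_forecast=None):
    # By default the initial forecast equals the first observation (option 1 in the text).
    forecast = data[0] if initial_forecast is None else initial_forecast
    forecasts = []
    for y in data:
        forecasts.append(forecast)                        # forecast made before observing y
        forecast = alpha * y + (1 - alpha) * forecast     # update used for the next period
    return forecasts

sales = [20, 24, 27, 31, 37, 47, 53, 62, 54, 36, 32]      # illustrative strut data
print([round(f, 1) for f in exponential_smoothing(sales, alpha=0.5)])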
At this point Victor decides to use the spreadsheet “EXPSMTH” in the same workbook
(STRUT.XLS) to apply exponential smoothing to the stainless steel strut data. Figure 13.17
shows actual sales and estimated sales for 12 months using an initial value of α = 0.5.
He has also calculated the MAD for February through December. Indeed, the exponential smoothing model with α = 0.5 yields a smaller MAD (9.92 in cell G16) than the
moving average models (see Figure 13.14) or our initial guess at a weighted moving average
model (see Figure 13.15).
Victor knows he can find a better model by using the Solver to select the optimal value
of α (one that minimizes the MAD), but he is pleased with the initial results. The MAD is
smaller than what he obtained with several previous models, and the calculations are simple. From a computational viewpoint it is reasonable to consider exponential smoothing as
an affordable way to forecast the sales of the many products STECO holds in inventory.
Although the results obtained from the exponential smoothing model are impressive,
it is clear that the particular numerical values (column F of Figure 13.17) depend on the
FIGURE 13.17
Exponential Smoothing Forecast, Initial α = 0.5

Cell   Formula                    Copy To
F3     $B$1*E2+(1-$B$1)*F2        F4:F13
G3     ABS(E3-F3)                 G4:G13
G15    SUM(G3:G13)                —
G16    AVERAGE(G3:G13)            —
values selected for the smoothing constant α and the "initial guess" ŷ1. In order to find the
optimal value of α, we just set up a nonlinear optimization model using Excel's Solver tool.
Of course, if we let the Solver choose the optimal α for us, we can do even better than our
initial guess of α = 0.5. To let Solver choose the α that minimizes the MAD, we do the
following:
1. Click on Tools, then "Solver . . ."
2. Set the Target cell to G16 and tell Solver we want to minimize it.
3. Indicate that the changing cell is B1.
4. Add the constraints that (a) B1 ≥ 0, and (b) B1 ≤ 1.
5. Click on Solve and you get the results shown in Figure 13.18.
FIGURE 13.18
Exponential Smoothing Forecast, Optimal α
Again, as with the weighted moving average approach, we see that, because of the
trend (up, then down) in the data, the more weight we put on the most recent
observation, the better the forecast. So, not surprisingly, the optimal α = 1.0, and this forecasting approach gives Victor a MAD of 6.82 (cell G16), which is the best performance he
has seen so far.
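The one-dimensional search Solver performs on cell B1 can be sketched, under the assumption that SciPy is available, with a bounded scalar minimizer; as in the text, the zero error of the "perfect" first forecast is left out of the MAD.

# Sketch: pick the alpha in [0, 1] that minimizes the MAD (assumes SciPy is installed).
from scipy.optimize import minimize_scalar

sales = [20, 24, 27, 31, 37, 47, 53, 62, 54, 36, 32]      # illustrative strut data

def mad_for_alpha(alpha):
    forecast = sales[0]                                    # initial guess: the first observation
    errors = []
    for y in sales[1:]:
        errors.append(abs(y - forecast))
        forecast = alpha * y + (1 - alpha) * forecast
    return sum(errors) / len(errors)

best = minimize_scalar(mad_for_alpha, bounds=(0, 1), method="bounded")
print(round(best.x, 3), round(best.fun, 2))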
Because of the importance of the basic exponential smoothing model, it is worth
exploring in more detail how it works and when it can be successfully applied to real problems. We will now examine some of its properties. To begin, note that if t ≥ 2 it is possible to
substitute t – 1 for t in (13.8) to obtain
ŷt = αyt–1 + (1 – α)ŷt–1
Substituting this relationship for ŷt back into the original expression for ŷt+1 (i.e., into
[13.8]) yields, for t ≥ 2,
ŷt+1 = αyt + α(1 – α)yt–1 + (1 – α)²ŷt–1
By successively performing similar substitutions, one is led to the following general expression for ŷt+1:
ŷt+1 = αyt + α(1 – α)yt–1 + α(1 – α)²yt–2 + . . . + α(1 – α)^(t–1)y1 + (1 – α)^t ŷ1        (13.9)
For example,
ŷ4 = αy3 + α(1 – α)y2 + α(1 – α)²y1 + (1 – α)³ŷ1
Since usually 0 < α < 1, it follows that 0 < 1 – α < 1. Thus,
α > α(1 – α) > α(1 – α)²
In other words, in the previous example y3, the most recent observation, receives more
weight than y2, which receives more weight than y1. This illustrates the general property of
an exponential smoothing model—that the coefficients of the y's decrease as the data become
older. It can also be shown that the sum of all of the coefficients (including the coefficient of ŷ1)
is 1; that is, in the case of ŷ4, for example,
α + α(1 – α) + α(1 – α)² + (1 – α)³ = 1
We have thus seen in equation (13.9) that the general value ŷt+1 is a weighted sum of all
previous observations (including the last observed value, yt). Moreover, the weights sum to 1
and decrease as the historical observations get older. The last term in the sum, namely ŷ1,
is not a historical observation. Recall that it was a "guess" at y1. We can now observe that as
t increases, the influence of ŷ1 on ŷt+1 decreases and in time becomes negligible. To see this,
note that the coefficient of ŷ1 in (13.9) is (1 – α)^t. Thus, the weight assigned to ŷ1 decreases
exponentially with t. Even if α is small (which makes [1 – α] nearly 1), the value of (1 – α)^t
decreases rapidly. For example, if α = 0.1 and t = 20, then (1 – α)^t = 0.12. If α = 0.1 and t =
40, then (1 – α)^t = 0.015. Thus, as soon as enough data have been observed, the value of ŷt+1
will be quite insensitive to the choice for ŷ1.
Obviously, the value of α, which is a parameter input by the manager, affects the performance of the model. As you can see explicitly in (13.8), it is the weight given to the data
value (yt) most recently observed. This implies that the larger the value of α, the more
strongly the model will react to the last observation (we call this a responsive forecast). This,
as we will see, may or may not be desirable. When α ≈ 0.0, the model places almost complete
trust in the last forecast and almost completely ignores the most recent observation (i.e., the
last data point). This would be an extremely stable forecast. Table 13.2 shows values for the
weights (in equation [13.9]) when α = 0.1, 0.3, and 0.5. You can see that for the larger values of α (e.g., α = 0.5) more relative weight is assigned to the more recent observations, and
the influence of older data is more rapidly diminished.
To illustrate further the effect of choosing various values for α (i.e., putting more or
less weight on recent observations), we consider three specific cases.
Table 13.2
Weights for Different Values of α

VARIABLE              COEFFICIENT     α = 0.1     α = 0.3     α = 0.5
yt                    α               0.1         0.3         0.5
yt–1                  α(1 – α)        0.09        0.21        0.25
yt–2                  α(1 – α)²       0.081       0.147       0.125
yt–3                  α(1 – α)³       0.07290     0.10290     0.0625
yt–4                  α(1 – α)⁴       0.06561     0.07203     0.03125
yt–5                  α(1 – α)⁵       0.05905     0.05042     0.01563
yt–6                  α(1 – α)⁶       0.05314     0.03530     0.00781
yt–7                  α(1 – α)⁷       0.04783     0.02471     0.00391
yt–8                  α(1 – α)⁸       0.04305     0.01729     0.00195
yt–9                  α(1 – α)⁹       0.03874     0.01211     0.00098
yt–10                 α(1 – α)¹⁰      0.03487     0.00847     0.00049
Sum of the Weights                    0.68619     0.98023     0.99951
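The columns of Table 13.2 are easy to verify, since the coefficient of yt–k in equation (13.9) is α(1 – α)^k; the short loop below reproduces the eleven weights and their sum for each value of α.

# Sketch reproducing the weight columns of Table 13.2.
for alpha in (0.1, 0.3, 0.5):
    weights = [alpha * (1 - alpha) ** k for k in range(11)]
    print(alpha, [round(w, 5) for w in weights], round(sum(weights), 5))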
Case 1 (Response to a Sudden Change) Suppose that at a certain point in time
the underlying system experiences a rapid and radical change. How does the choice of α influence the way in which the exponential smoothing model will react? As an illustrative
example consider an extreme case in which
yt = 0 for t = 1, 2, . . . , 99
yt = 1 for t = 100, 101, . . .
This situation is illustrated in Figure 13.19. Note that in this case if ŷ1 = 0, then ŷ100 = 0 for
any value of α, since we are taking the weighted sum of a series of zeros.
Thus, at time 99 our best estimate of y100 is 0, whereas the actual value will be 1. At
time 100 we will first see that the system has changed. The question is: How quickly will the
forecasting system respond as time passes and the information that the system has changed
becomes available?
To answer this question, we plot ŷt+1 for α = 0.5 and α = 0.1 in Figure 13.20. Note that
when α = 0.5, ŷ106 = 0.984; thus at time 105 our estimate of y106 would be 0.984, whereas
the true value will turn out to be 1. When α = 0.1 our estimate of y106 is only 0.468.
We see then that a forecasting system with α = 0.5 responds much more quickly to
changes in the data than does a forecasting system with α = 0.1. The manager would
thus prefer a relatively large α if the system is characterized by a low level of random
behavior, but is subject to occasional enduring shocks. (Case 1 is an extreme example of
this situation.)
However, suppose that the data are characterized by large random errors but a stable
mean. Then if α is large, a large random error in yt will throw the forecast value, ŷt+1, way
off. Hence, for this type of process a smaller value of α would be preferred.
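The two response curves of Figure 13.20 can be reproduced with a few lines of Python: the sketch below applies equation (13.8) to the step series defined above and prints the forecast ŷ106 for α = 0.5 and α = 0.1.

# Sketch of the Case 1 step response: y is 0 through period 99 and 1 afterward.
def step_response(alpha, periods=110):
    forecast, path = 0.0, {}                               # initial guess: first forecast is 0
    for t in range(1, periods + 1):
        y = 0 if t < 100 else 1
        forecast = alpha * y + (1 - alpha) * forecast      # becomes the forecast for t + 1
        path[t + 1] = forecast
    return path

for alpha in (0.5, 0.1):
    print(alpha, round(step_response(alpha)[106], 3))      # compare with Figure 13.20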
FIGURE 13.19
System Change When t = 100 (yt equals 0 before period 100 and 1 from period 100 on)
FIGURE 13.20
Response to a Unit Change in yt (ŷt+1 plotted against t for α = 0.5 and α = 0.1; when α = 0.5, ŷ106 = 0.984, and when α = 0.1, ŷ106 = 0.468)
Case 2 (Response to a Steady Change) As opposed to the rapid and radical
change investigated in Case 1, suppose now that a system experiences a steady change in the
value of y. An example of a steady growth pattern is illustrated in Figure 13.21. This example is called a linear ramp. Again the questions are: How will the exponential smoothing
model respond, and how will this response be affected by the choice of α?
In this case, recall that
ŷt+1 = αyt + α(1 – α)yt–1 + . . .
Since all previous y's (y1, . . . , yt–1) are smaller than yt, and since the weights sum to 1, it can
be shown that, for any α between 0 and 1, ŷt+1 < yt. Also, since yt+1 is greater than yt, we see
that ŷt+1 < yt < yt+1. Thus our forecast will always be too small. Finally, since smaller values
of α put more weight on older data, the smaller the value of α, the worse the forecast
becomes. But even with α very close to 1 the forecast is not very good if the ramp is steep.
The moral for managers is that exponential smoothing (or indeed any weighted moving
average), without an appropriate modification, is not a good forecasting tool in a rapidly
growing market or a declining market. The model can be adjusted to include the trend and
this is called Holt’s model (or exponential smoothing with trend), and the method will be
shown in more detail later in this section.
In reality, the observation that Victor made with the struts in our previous example
(i.e., both weighted moving average and exponential smoothing placed ALL the weight on
the most recent observation) is a good clue to you as a manager that there is obvious trend
in the data and that you should consider a different forecasting model.
FIGURE 13.21
Steadily Increasing Values of yt (a Linear Ramp)
Case 3 (Response to a Seasonal Change) Suppose that a system experiences a
regular seasonal pattern in y (such as would be the case if y represents, for example, the
demand in the city of Denver for swimming suits). How then will the exponential smoothing model respond, and how will this response be affected by the choice of α? Consider, for
example, the seasonal pattern illustrated in Figure 13.22, and suppose it is desired to
extrapolate several periods forward. For example, suppose we wish to forecast demand in
periods 8 through 11 based only on data through period 7. Then
ŷ8 = αy7 + (1 – α)ŷ7
Now to obtain ŷ9, since we have data only through period 7, we assume that y8 = ŷ8. Then
ŷ9 = αy8 + (1 – α)ŷ8 = αŷ8 + (1 – α)ŷ8 = ŷ8
Similarly, it can be shown that ŷ11 = ŷ10 = ŷ9 = ŷ8. In other words, ŷ8 is the best estimate of
all future demands.
Now let us see how good these predictions are. We know that
ŷt+1 = αyt + α(1 – α)yt–1 + α(1 – α)²yt–2 + . . .
Suppose that a small value of α is chosen. By referring to Table 13.2 we see that when α is small (say, 0.1) the coefficients for the most recent terms change relatively slowly (i.e.,
they are nearly equal to each other). Thus, ŷt+1 will resemble a simple moving average of a
number of terms. In this case the future predictions (e.g., ŷ11) will all be somewhere near
the average of the past observations. The forecast thus essentially ignores the seasonal pattern. If a large value of α is chosen, ŷ11, which equals ŷ8, will be close in value to y7, which is
obviously not good. In other words, the model fares poorly in this case regardless of the
choice of α.
The exponential smoothing model ŷt+1 = αyt + (1 – α)ŷt is intended for situations in
which the behavior of the variable of interest is essentially stable, in the sense that deviations over time have nothing to do with time, per se, but are caused by random effects that
do not follow a regular pattern. This is what we have termed the stationarity assumption.
Not surprisingly, then, the model has various shortcomings when it is used in situations
(such as a linear ramp or swimming suit demand) that do not fit this prescription.
Although this statement may be true, it is not very constructive. What approach should a
manager take when the exponential smoothing model as described above is not appropriate? In the case of a seasonal pattern, a naive approach would be to use the exponential
smoothing model on “appropriate” past data. For example, the airlines or hotels, which
exhibit strong day-of-week seasonality, could take a smoothed average of demand on previous Mondays to forecast demand on upcoming Mondays. Another business with monthly
seasonality might take a smoothed average of sales in previous Junes to forecast sales this
June. This latter approach has two problems. First, it ignores a great deal of useful information. Certainly sales from last Tuesday to Sunday in the airline or hotel example (or July
through this May in the other example) should provide at least a limited amount of
FIGURE 13.22
Seasonal Pattern in yt (demand plotted against t)
information about the likely level of sales this Monday (or June). Second, if the cycle is very
long, say a year, this approach means that very old data must be used to get a reasonable
sample size. The above assumption, that the system or process producing the variable of
interest is essentially stationary over time, becomes more tenuous when the span of time
covered by the data becomes quite large.
If the manager is convinced that there is either a trend (Case 2) or a seasonal effect
(Case 3) in the variable being predicted, a better approach is to develop forecasting models
that incorporate these features. When there is a discernible pattern of seasonality (which
can be seen fairly easily by graphing the data in Excel) there are methods, using simple
moving averages, to determine a seasonality factor. Using this factor, the data can be “deseasonalized,” some forecasting method used on the deseasonalized data, and the forecast can
then be “reseasonalized.” This approach will be shown after the trend model in the next
section.
HOLT’S MODEL (EXPONENTIAL SMOOTHING WITH TREND)
As discussed above, simple exponential smoothing models do not perform very well on
data that have an obvious up or down trend (and no seasonality). To correct
this, Holt developed the following model:
ŷt+k = Lt + kTt
where:
Lt = αyt + (1 – α)(Lt–1 + Tt–1)
Tt = β(Lt – Lt–1) + (1 – β)Tt–1
(13.10)
Holt's model allows us to forecast up to k time periods ahead. In this model, we now have
two smoothing parameters, α and β, both of which must be between 0 and 1. The Lt term
indicates the long-term level or base value for the time-series data. The Tt term indicates
the expected increase or decrease per period (i.e., the trend).
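As a minimal sketch of equations (13.10), the Python function below produces one-step-ahead Holt forecasts using the simple initialization L1 = y1 and T1 = 0 that is chosen in the example that follows; the EPS figures in the sketch are placeholders, not the actual Startup Airlines data.

# Sketch of Holt's model (13.10), producing one-step-ahead forecasts L(t-1) + T(t-1).
def holt_forecasts(data, alpha, beta):
    level, trend = data[0], 0.0                            # L1 = first observation, T1 = 0
    forecasts = []
    for y in data[1:]:
        forecasts.append(level + trend)                    # forecast made before observing y
        new_level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return forecasts

eps = [0.10, 0.15, 0.22, 0.24, 0.33, 0.41, 0.48, 0.55, 0.67, 0.74, 0.88, 1.02]  # placeholder EPS data
print([round(f, 2) for f in holt_forecasts(eps, alpha=0.5, beta=0.5)])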
Let’s demonstrate how to make this model work with a new example. Amy Luford is an
analyst with a large brokerage firm on Wall Street. She has been looking at the quarterly
earnings of Startup Airlines and is expected to make a forecast of next quarter’s earnings.
She has the following data and graph available to her in a spreadsheet (STARTUP.XLS) as
shown in Figure 13.23. Amy can see that the data has obvious trend to it, as she would
expect for a successful new business venture. She wants to apply Holt’s trend model to the
data to generate her forecast of earnings per share (EPS) for the thirteenth quarter. This
forecasting approach is demonstrated in her spreadsheet “Holt” in the same workbook
(STARTUP.XLS) and is shown in Figure 13.24. She needs initial values for both L and T.
FIGURE 13.23
Startup Airlines Earnings
per Share
FIGURE 13.24
Exponential Smoothing with Trend Model for Startup Airlines

Cell   Formula                         Copy To
C5     B5                              —
C6     $B$1*B6+(1-$B$1)*(C5+D5)        C7:C16
D6     $B$2*(C6-C5)+(1-$B$2)*D5        D7:D16
E6     SUM(C5:D5)                      E7:E17
F6     ABS(B6-E6)/B6                   F7:F16
F18    AVERAGE(F6:F16)                 —
She has several choices: (1) let L1 = actual EPS for quarter 1 and T1 = 0, (2) let L1 = average
EPS for all 12 quarters and T1 = average trend for all 12 quarters, and many other variations
in between. Amy chooses the first option.
With initial guesses of α = 0.5 and β = 0.5, she sees that the mean absolute percentage
error (MAPE) is 43.3% (cell F18). Although this is fairly high, Amy tries setting β to
zero (as if there were no trend, which takes her back to simple exponential smoothing) to see
whether she is gaining anything with the new model, and she sees in Figure 13.25 that the MAPE is
much worse at 78.1%.
FIGURE 13.25
Spreadsheet Model for
Startup Airlines with
No Trend
FIGURE 13.26
Optimal Exponential
Smoothing with Trend
Spreadsheet Model for
Startup Airlines
Finally, she decides to use Solver to help her find the optimal values for α and β, because she knows Solver can do better than her initial guesses of 0.5. To let Solver choose
the α and β that minimize the MAPE, she does the following:
1. Click on Tools, then "Solver."
2. Set the Target cell to F18 and tell Solver we want to minimize it.
3. Indicate that the changing cells are B1:B2.
4. Add the constraints that (a) B1:B2 ≥ 0, and (b) B1:B2 ≤ 1.
5. Click on Solve to get the results shown in Figure 13.26.
Amy sees that α* = 0.59 and β* = 0.42, and the MAPE has been lowered to 38%, which
is nearly a 12.5% improvement over the MAPE with her initial guesses of α = 0.5 and
β = 0.5.
Another forecasting approach Amy could have tried, given the obvious trend in her data
(and hence the poor fit of simple exponential smoothing and weighted moving averages),
is a linear regression on the data with time as the independent variable. We leave this as an
exercise for the student (see Problem 13-19).
SEASONALITY
When making forecasts using data from a time series, one can often take advantage of seasonality. Seasonality comprises movements up and down in a pattern of constant length
that repeats itself.
For example, if you were looking at monthly data on sales of ice cream, you would
expect to see higher sales in the warmer months (June to August in the Northern
Hemisphere) than in the winter months, year after year. The seasonal pattern would be 12
months long. If we used weekly data, the seasonal pattern would repeat every 52 periods.
The number of time periods in a seasonal pattern depends on how often the observations
are collected.
In another example we may be looking at daily data on the number of guests staying
overnight at a downtown business hotel. Our intuition might tell us that we expect high
numbers on Monday, Tuesday, and Wednesday nights, low numbers on Friday and
Saturday, and medium numbers on Thursday and Sunday. So our pattern would be as follows, starting with Sunday: Medium, High, High, High, Medium, Low, Low. The pattern
would repeat itself every seven days.
FIGURE 13.27
Coal Receipts over a
Nine-Year Period
The approach for treating such seasonal patterns consists of four steps: (1) Look at the
original data that exhibit a seasonal pattern. From examining the data and from our own
judgment, we hypothesize an m-period seasonal pattern. (2) Using the numerical approach
described in the next section, we deseasonalize the data. (3) Using the best forecasting
method available, we make a forecast in deseasonalized terms. (4) We reseasonalize the
forecast to account for the seasonal pattern.
We will illustrate these concepts with data on U.S. coal receipts by the commercial/residential sectors over a nine-year period (measured in thousands of tons).2 Frank Keetch is
the manager of Gillette Coal Mine and he is trying to make a forecast of demand in the
upcoming two quarters. He has entered the following data for the entire industry in a
spreadsheet (COAL.XLS) and it is graphed in Figure 13.27. Intuition tells Frank to expect
higher than average coal receipts in the first and fourth quarters (winter effects) and lower
than average in the second and third quarters (spring/summer effects).
Deseasonalizing The procedure to deseasonalize data is quite simply to average out
all variations that occur within one season. Thus for quarterly data an average of four periods is used to eliminate within-year seasonality. In order to deseasonalize a whole time
series, the first step is to calculate a series of m-period moving averages, where m is the
length of the seasonal pattern. In order to calculate this four-period moving average, he has
to add two columns (C and D) to his Excel spreadsheet, which is shown in Figure 13.28.
Column C of Figure 13.28 shows a four-period moving average of the data in column
B. The first number is the average of the first four periods,
(2159 + 1203 + 1094 + 1996)/4 = 1613
The second number is the average of the next four periods, and so on.
Frank really would like to center the moving average in the middle of the data from
which it was calculated. If m is odd, the first moving average (average of points 1 to m) is
easily centered on the (m + 1)/2 point (e.g., suppose you have daily data where m = 7, the
first seven-period moving average is centered on the (7 + 1)/2 or fourth point). This
process rolls forward to find the average of the second through (m + 1)st point, which is
centered on the (m + 3)/2 point, and so forth.
2 See Quarterly Coal Reports.
FIGURE 13.28
Spreadsheet to Deseasonalize the Data

Cell   Formula            Copy To
C10    AVERAGE(B8:B11)    C11:C42
D10    AVERAGE(C10:C11)   D11:D41
E10    B10/D10            E11:E41
F8     $E$1               F12, F16, F20, F24, F28, F32, F36, F40
F9     $E$2               F13, F17, F21, F25, F29, F33, F37, F41
F10    $E$3               F14, F18, F22, F26, F30, F34, F38, F42
F11    $E$4               F15, F19, F23, F27, F31, F35, F39, F43
If m is even, as it is in Frank’s situation, the task is a little more complicated, using an
additional step to get the moving averages centered. Since the average of the first four
points should really be centered at the midpoint between the second and third data point,
and the average of periods two through five should be centered halfway between periods
three and four, the value to be centered at period three can be approximated by taking the
average of the first two averages. Thus the first number in the centered moving average column (column D) is
(1613 + 1594)/2 = 1603
A graph of the original data along with their centered moving averages is shown in
Figure 13.29. Notice that as Frank expected, coal receipts are above the centered moving
average in the first and fourth quarters, and below average in the second and third quarters.
Notice that the moving average has much less volatility than the original series; again, the
averaging process eliminates the quarter-to-quarter movement.
FIGURE 13.29
Graph of Data and Its
Centered Moving Average
FIGURE 13.30
Ratios by Quarter and Their Averages

Cell   Formula            Copy To
E1     AVERAGE(H1:O1)     E2
E3     AVERAGE(G3:N3)     E4
The third step is to divide the actual data at a given point in the series by the centered
moving average corresponding to the same point. This calculation cannot be done for all
possible points, since at the beginning and end of the series we are unable to compute a
centered moving average. These ratios represent the degree to which a particular observation is below (as in the .682 for period three shown in cell E10 of Figure 13.28) or above (as
in the 1.24 for period four shown in cell E11) the typical level. Note that these ratios for the
third quarter tend to be below 1.0 and the ratios for the fourth quarter tend to be above 1.0.
These ratios form the basis for developing a “seasonal index.”
To develop the seasonal index, we first group the ratios by quarter (columns
G through O), as shown in Figure 13.30. We then average the ratios to the moving averages quarter by quarter (column E). For example, the ratios for the first quarter average
1.108. This is the seasonal index for the first quarter, and Frank concludes that the first quarter produces coal receipts that are on average about 110.8% of the average across all
quarters.
These seasonal indices represent what that particular season’s data look like on average
compared to the average of the entire series. A seasonal index greater than 1 means that season is higher than the average for the year; likewise an index less than 1 means that season
is lower than the average for the year.
The last step of deseasonalization is to take the actual data and divide it by the appropriate seasonal index. This is shown in columns F and G of Figure 13.31. The deseasonalized data are graphed in Figure 13.32. Frank notices that the deseasonalized data seem to
“jump around” a lot less than the original data.
Forecasting Once the data have been deseasonalized, a deseasonalized forecast can be
made. This should be based on an appropriate methodology that accounts for the pattern
in the deseasonalized data (e.g., if there is trend in the data, use a trend-based model). In
this example, Frank chooses to forecast coal receipts for the first quarter of the tenth year by
simple exponential smoothing. Using this forecasting method, it turns out that the optimal
smoothing constant is α* = 0.653, which yields an MSE of 47,920 (see Figure 13.33, page
CD13-36). Frank determines that the deseasonalized forecast would be 1726 thousand tons
for the first quarter of next year (cell H44). This would also be the deseasonalized forecast
amount for the second quarter given the data we currently have.
Reseasonalizing The last step in the process is for Frank to reseasonalize the forecast
of 1726. The way to do this is to multiply 1726 by the seasonal index for the first quarter
(1.108) to obtain a value of 1912. A seasonalized forecast for the second quarter would be
1726 times its seasonal index (0.784), giving a value of 1353. These values would represent
Frank’s point forecasts for the coming two quarters.
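The reseasonalizing step is just a multiplication by the appropriate seasonal index, as this short check of the figures quoted above confirms.

# Sketch of reseasonalizing the deseasonalized forecast of 1726 thousand tons.
deseasonalized_forecast = 1726
print(round(deseasonalized_forecast * 1.108))   # first-quarter forecast, about 1912
print(round(deseasonalized_forecast * 0.784))   # second-quarter forecast, about 1353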
THE RANDOM WALK
The moving average and exponential smoothing techniques discussed above are examples
of what are called time-series models. Recently, much more sophisticated methods for
time-series analysis have become available. These methods, based primarily on developments by G. E. P. Box and G. M. Jenkins, have already had an important impact on the
practice of forecasting, and indeed the Box-Jenkins approach is incorporated in several
computer-based forecasting packages.

FIGURE 13.31
Calculation of Deseasonalized Values

Cell   Formula    Copy To
G8     B8/F8      G9:G43
These time-series forecasting techniques are based on the assumption that the true values of the variable of interest, yt , are generated by a stochastic (i.e., probabilistic) model.
FIGURE 13.32
Graph of Deseasonalized
Data
FIGURE 13.33
Spreadsheet for Exponential Smoothing Forecast of Deseasonalized Data

Cell   Formula                                    Copy To
H8     G8                                         —
H9     $J$7*G8+(1-$J$7)*H8                        H10:H44
J10    SUMXMY2(G9:G43,H9:H43)/COUNT(G9:G43)       —
Introducing enough of the theory of probability to enable us to discuss these models in any
generality seems inappropriate, but one special and very important (and very simple)
process, called a random walk, serves as a nice illustration of a stochastic model. Here the
variable yt is assumed to be produced by the relationship
yt = yt–1 + ε
where the value of ε is determined by a random event. To illustrate this process even more
explicitly, let us consider a man standing at a street corner on a north-south street. He flips
a fair coin. If it lands with a head showing (H), he walks one block north. If it lands with a
tail showing (T), he walks one block south. When he arrives at the next corner (whichever
one it turns out to be), he repeats the process. This is the classic example of a random walk.
To put this example in the form of the model, label the original corner zero. We shall call
this the value of the first observation, y1. Starting at this point, label successive corners
going north +1, +2, . . . Also starting at the original corner label successive corners going
south –1, –2, . . . (see Figure 13.34). These labels that describe the location of our random
walker are the yt’s.
FIGURE 13.34
Classic Random Walk (the walker's corner position yt, from –4 south to +4 north, plotted against t = 1, . . . , 10)
In the model, yt = yt–1 + ε, where (assuming a fair coin) ε = 1 with probability 1/2 and
ε = –1 with probability 1/2. If our walker observes the sequence H, H, H, T, T, H, T, T, T,
he will follow the path shown in Figure 13.34.
Forecasts Based on Conditional Expected Value Suppose that after our special agent has flipped the coin nine times (i.e., he has moved nine times, and we have [starting with corner 0] ten observations of corners), we would like to forecast where he will be
after another move. This is the typical forecasting problem in the time-series context. That
is, we have observed y1, y2, . . . , y10 and we need a good forecast ŷ11 of the forthcoming
value y11. In this case, according to a reasonable criterion, the best value for ŷ11 is the conditional expected value of the random quantity y11. In other words, the best forecast is the
expected value of y11 given that we know y1,y2, . . . , y10. From the model we know that y11
will equal (y10 + 1) with a probability equal to 1\2 and y11 will equal (y10 – 1) with a probability equal to 1\2. Thus, E(y11y1, . . . , y10), the conditional expected value of y11 given y1,
y2, . . . , y10, is calculated as follows:
E(y11y1, . . . , y10) = (y10 + 1)1\2 + (y10 – 1)1\2 = y10
Thus we see that for this model the data y1, . . . , y9 are irrelevant, and the best forecast of the
random walker’s position one move from now is his current position. It is interesting to
observe that the best forecast of the next observation y12 given y1, . . . , y10 (the original set)
is also y10. Indeed, the best forecast for any future value of y1, given this particular model, is
its current value.
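A quick simulation makes the conditional-expectation argument concrete: replaying the coin sequence from Figure 13.34 reproduces the walker's path, and averaging the next position over many flips from a fixed current corner gives back, approximately, that corner.

# Sketch of the fair-coin random walk and its one-step-ahead forecast.
import random

random.seed(1)

steps = {"H": +1, "T": -1}                        # heads: one block north, tails: one block south
path = [0]                                        # y1 = the original corner
for flip in "HHHTTHTTT":                          # the sequence described for Figure 13.34
    path.append(path[-1] + steps[flip])
print(path)                                       # the ten observed corner labels y1, ..., y10

y10 = path[-1]                                    # current corner
next_positions = [y10 + random.choice((1, -1)) for _ in range(100_000)]
print(sum(next_positions) / len(next_positions))  # close to y10, i.e., E(y11 | y1, ..., y10) = y10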
Seeing What Isn’t There This example is not as silly as it may seem at first glance.
Indeed, there is a great deal of evidence that supports the idea that stock prices and foreign
currency exchange rates behave like a random walk and that the best forecast of a future
stock price or of an exchange rate (e.g., German Mark/$) is its current value. Not surprisingly, this conclusion is not warmly accepted by research directors and technical chartists
who make their living forecasting stock prices or exchange rates. One reason for the resistance to the random walk hypothesis is the almost universal human tendency when looking
at a set of data to observe certain patterns or regularities, no matter how the data are produced. Consider the time-series data plotted in Figure 13.35. It does not seem unreasonable
to believe that the data are following a sinusoidal pattern as suggested by the smooth curve
in the figure. In spite of this impression, the data were in fact generated by the random walk
model presented earlier in this section. This illustrates the tendency to see patterns where
there are none. In Figure 13.35, any attempt to predict future values by extrapolating the
sinusoidal pattern would have no more validity than flipping a coin.
FIGURE 13.35
Time-Series Data (yt plotted against t)

In concluding this section we should stress that it is not a general conclusion of time-series analysis that the best estimate of the future is the present (i.e., that ŷt+1 = yt). This
result holds for the particular random walk model presented above. The result depends
crucially on the assumption that the expected or mean value of ε, the random component,
is zero. If the probability that ε equals 1 had been 0.6 and the probability that ε equals –1
had been 0.4, the best forecast of yt+1 would not have been yt . To find this forecast one
would have had to find the new value for E(yt+1 | y1, . . . , yt). Such a model is called a random walk with a drift.
13.5
THE ROLE OF HISTORICAL
DATA: DIVIDE AND CONQUER
Historical data play a critical role in the construction and testing of forecasting models.
One hopes that a rationale precedes the construction of a quantitative forecasting
model. There may be theoretical reasons for believing that a relationship exists between
some independent variables and the dependent variable to be forecast and thus that a
causal model is appropriate. Alternatively, one may take the time-series view that the
“behavior of the past” is a good indication of the future. In either case, however, if a
quantitative model is to be used, the parameters of the model must be selected. For
example:
1. In a causal model using a linear forecasting function, y = a + bx, the values of a and b
must be specified.
2. In a time-series model using a weighted n-period moving average, ŷt+1 = α0yt +
α1yt–1 + . . . + αn–1yt–n+1, the number of terms, n, and the values for the weights, α0,
α1, . . . , αn–1, must be specified.
3. In a time-series model using exponential smoothing, ŷt+1 = αyt + (1 – α)ŷt, the value
of α must be specified.
In any of these models, in order to specify the parameter values, one typically must make
use of historical data. A useful guide in seeking to use such data effectively is to “divide and
conquer.” More directly, this means that it is often a useful practice to use part of the data to
estimate the parameters and the rest of the data to test the model. With real data, it is also
important to “clean” the data—that is, examine them for irregularities, missing information, or special circumstances, and adjust them accordingly.
For example, suppose that a firm has weekly sales data on a particular product for the
last two years (104 observations) and plans to use an exponential smoothing model to forecast sales for this product. The firm might use the following procedure:
1. Pick a particular value of α, and compare the values of ŷt+1 to yt+1 for t = 25 to 75.
The first 24 values are not compared, so as to negate any initial or "startup" effect;
that is, to nullify the influence of the initial guess, ŷ1. The manager would continue to
select different values of α until the model produces a satisfactory fit during the
period t = 25 to 75.
2. Test the model derived in step 1 on the remaining 29 pieces of data. That is, using the
best value of α from step 1, compare the values of ŷt+1 to yt+1 for t = 76 to 104.
If the model does a good job of forecasting values for the last part of the historical data,
there is some reason to believe that it will also do a good job with the future. On the other
hand, if by using the data from weeks 1 through 75, the model cannot perform well in predicting the demand in weeks 76 through 104, the prospects for predicting the future with
the same model seem dubious. In this case, another forecasting technique might be
applied.
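A sketch of the two-step procedure in Python, using a simulated weekly demand series purely for illustration: α is chosen on the estimation window (t = 25 to 75) and then judged on the holdout window (t = 76 to 104).

# Sketch of divide and conquer: fit alpha on one part of the history, test it on the rest.
import random

random.seed(2)
demand = [100 + random.gauss(0, 10) for _ in range(104)]   # simulated two years of weekly data

def smoothing_errors(data, alpha, start, end):
    # Absolute one-step-ahead errors for periods start..end (1-based period numbers).
    forecast, errors = data[0], []
    for t, y in enumerate(data[1:], start=2):
        if start <= t <= end:
            errors.append(abs(y - forecast))
        forecast = alpha * y + (1 - alpha) * forecast
    return errors

candidates = [a / 10 for a in range(1, 10)]
best_alpha = min(candidates, key=lambda a: sum(smoothing_errors(demand, a, 25, 75)))   # step 1
holdout = smoothing_errors(demand, best_alpha, 76, 104)                                # step 2
print(best_alpha, round(sum(holdout) / len(holdout), 2))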
The same type of divide-and-conquer strategy can be used with any of the forecasting
techniques we have presented. This approach amounts to simulating the model’s performance on past data. It is a popular method of testing models. It should be stressed, however, that this procedure represents what is termed a null test. If the model fails on historical data, the model probably is not appropriate. If the model succeeds on historical data,
one cannot be sure that it will work in the future. Who knows, the underlying system that is
producing the observations may change. It is this type of sobering experience that causes
certain forecasters to be less certain.
13.6
QUALITATIVE FORECASTING
EXPERT JUDGMENT
Many important forecasts are not based on formal models. This point seems obvious in the
realm of world affairs—matters of war and peace, so to speak. Perhaps more surprisingly it
is also often true in economic matters. For example, during the high-interest-rate period of
1980 and 1981, the most influential forecasters of interest rates were not two competing
econometric models run by teams of econometricians. Rather, they were Henry Kaufman
of Salomon Brothers and Albert Wojnilower of First Boston, the so-called Doctors Doom
and Gloom of the interest-rate world. These gentlemen combined relevant factors such as
the money supply and unemployment, as well as results from quantitative models, in their
own intuitive way (their own “internal” models) and produced forecasts that had widespread credibility and impact on the financial community.
The moral for managers is that qualitative forecasts can well be an important source of
information. Managers must consider a wide variety of sources of data before coming to a
decision. Expert opinion should not be ignored. A sobering and useful measure of all forecasts—quantitative and qualitative—is a record of past performance. Good performance in
the past is a sensible type of null test. An excellent track record does not promise good
results in the future. A poor record, however, hardly creates enthusiasm for high achievement in the future. Managers should thus listen to experts cautiously and hold them to a
standard of performance. This same type of standard should be applied by individuals to
the thousands of managers of stock and bond mutual funds. Every month the Wall Street
Journal publishes the ranking of the different mutual funds’ performance for the previous
month, year, three-year, five-year and ten-year period. Wise investors check the track record
of the different funds although it is not a guarantee of future performance.
There is, however, more to qualitative forecasting than selecting “the right” expert.
Techniques exist to elicit and combine forecasts from various groups of experts, and we
now turn our attention to these techniques.
THE DELPHI METHOD AND CONSENSUS PANEL
The Delphi Method confronts the problem of obtaining a combined forecast from a group
of experts. One approach is to bring the experts together in a room and let them discuss the
event until a consensus emerges. Not surprisingly, this group is called a consensus panel.
This approach suffers because of the group dynamics of such an exercise. One strong individual can have an enormous effect on the forecast because of his or her personality, reputation, or debating skills. Accurate analysis may be pushed into a secondary position.
The Delphi Method was developed by the Rand Corporation to retain the strength of
a joint forecast, while removing the effects of group dynamics. The method uses a coordinator and a set of experts. No expert knows who else is in the group. All communication is
through the coordinator. The process is illustrated in Figure 13.36.
After three or four passes through this process, a consensus forecast typically emerges.
The forecast may be near the original median, but if a forecast that is an outlier in round 1
is supported by strong analysis, the extreme forecast in round 1 may be the group forecast
after three or four rounds.
GRASSROOTS FORECASTING AND MARKET RESEARCH
Other qualitative techniques focus primarily on forecasting demand for a product or group
of products. They are based on the concept of asking either those who are close to the eventual consumer, such as salespeople, or consumers themselves, about a product or their purchasing plans.

FIGURE 13.36
Delphi Method
1. Coordinator requests forecasts.
2. Coordinator receives individual forecasts.
3. Coordinator determines (a) the median response and (b) the range of the middle 50% of the answers.
4. Coordinator requests explanations from any expert whose estimate is not in the middle 50%.
5. Coordinator sends to all experts (a) the median response, (b) the range of the middle 50%, and (c) the explanations.
Consulting Salesmen In grassroots forecasting, salespeople are asked to forecast
demand in their districts. In the simplest situations, these forecasts are added together to
get a total demand forecast. In more sophisticated systems individual forecasts or the total
may be adjusted on the basis of the historical correlation between the salesperson’s forecasts and the actual sales. Such a procedure makes it possible to adjust for an actual occurrence of the stereotyped salesperson’s optimism.
Grassroots forecasts have the advantage of bringing a great deal of detailed knowledge
to bear on the forecasting problem. The individual salesperson who is keenly aware of the
situation in his or her district should be able to provide better forecasts than more aggregate models. There are, however, several problems:
1. High cost: The time salespeople spend forecasting is not spent selling. Some view this
opportunity cost of grassroots forecasting as its major disadvantage.
2. Potential conflict of interest: Sales forecasts may well turn into marketing goals that
can affect a salesperson’s compensation in an important way. Such considerations
exert a downward bias in individual forecasts.
3. Product schizophrenia (i.e., stereotyped salesperson’s optimism): It is important for
salespeople to be enthusiastic about their product and its potential uses. It is not clear
that this enthusiasm is consistent with a cold-eyed appraisal of its market potential.
In summary, grassroots forecasting may not fit well with other organization objectives
and thus may not be effective in an overall sense.
Consulting Consumers Market research is a large and important topic in its own
right. It includes a variety of techniques, from consumer panels through consumer surveys
and on to test marketing. The goal is to make predictions about the size and structure of
the market for specific goods and/or services. These predictions (forecasts) are usually
based on small samples and are qualitative in the sense that the original data typically con-
sist of subjective evaluations of consumers. A large menu of quantitative techniques exists
to aid in determining how to gather the data and how to analyze them.
Market research is an important activity in most consumer product firms. It also plays
an increasingly important role in the political and electoral process.
13.7
NOTES ON IMPLEMENTATION
Whether in the private or public sector, the need to deal with the future is an implicit or
explicit part of every management action and decision. Because of this, managing the forecasting activity is a critical part of a manager’s responsibility. A manager must decide what
resources to devote to a particular forecast and what approach to use to obtain it.
The question of “what resources” hinges on two issues:
1. The importance of the forecast, or more precisely, the importance of the decision
awaiting the forecast and its sensitivity to the forecast.
2. The quality of the forecast as a function of the resources devoted to it.
In other words, how much does it matter, and how much does it cost? These are the same
questions that management must ask and answer about many of the services it purchases.
In actual applications, the selection of the appropriate forecasting method for a particular situation depends on a variety of factors. Some of the features that distinguish one situation from the next are
1. The importance of the decision
2. The availability of relevant data
3. The time horizon for the forecast
4. The cost of preparing the forecast
5. The time until the forecast is needed
6. The number of times such a forecast will be needed
7. The stability of the environment
APPLICATION CAPSULE
Improved Forecasting at Taco Bell Helps Save Millions in Labor Costs
Taco Bell Corporation has approximately 6,500 company-owned, licensed, and franchised locations in 50 states and a growing international market. Worldwide yearly sales are approximately $4.6 billion. In the late 1980s, the company restructured the
business to become more efficient and cost-effective. To do
this, the company relied on an integrated set of operations
research models, including forecasting to predict customer
arrivals. Through 1997, these models have saved over $53 million in labor costs.
At Taco Bell, labor costs represent approximately 30% of
every sales dollar and are among the largest controllable
costs. They are also among the most difficult to manage
because of the direct link that exists between sales capacity
and labor. Because the product must be fresh when sold, it is
not possible to produce large amounts during periods of low
demand to be warehoused and sold during periods of high
demand. Instead, the product must be prepared virtually
when it is ordered. And since demand is highly variable and is
concentrated during the meal periods (52% of daily sales
occur during the three-hour lunch period from 11 A.M. to 2
P.M.), determining how many employees to schedule to per-
form what functions in the store at any given time is a complex and vexing problem.
Customer transactions during a 15-minute interval are
subject to many sources of variability, including but not limited to time of the day, day of the week, week of the month,
and month of the year. To eliminate as many sources of variability as possible, all customer transaction data was separated
into a number of independent time series, each representing
the history corresponding to a specific 15-minute interval
during a specific day of the week. For example, the customer
transaction history at a particular store for all Fridays from
9:00 to 9:15 A.M. constituted the time series to be used to forecast customer transactions at that store for future Fridays
from 9:00 to 9:15 A.M.
Many different time-series forecasting methods were
tried and it was found that a six-week moving average provided the best forecast (i.e., the one that minimized mean
squared error). The forecasting system provides the manager
with the forecasts obtained for the next week for scheduling
purposes. The manager has the authority to modify the forecast based upon known events. (See Hueter and Swart.)
The importance of the decision probably plays the strongest role in determining
what forecasting method to use. Curiously, qualitative approaches (as opposed to quantitative) dominate the stage at the extremes of important and not very important
forecasts.
On the low end of the importance scale, think of the many decisions a supermarket manager makes on what are typically implicit forecasts: what specials to
offer, what to display at the ends of the aisles, how many baggers to employ. In such
cases, forecasts are simply business judgments. The potential return is not high
enough to justify the expenditure of resources required for formal and extensive
model development.
On the high end, the decisions are too important (and perhaps too complex) to be left
entirely to formal quantitative models. The future of the company, to say nothing of the
chief executive, may hinge on a good forecast and the ensuing decision. Quantitative models may certainly provide important input. In fact, the higher the planning level, the more you
can be sure that forecasting models will be employed at least to some extent. But for very
important decisions, the final forecast will be based on the judgment of the executive and
his or her colleagues.
The extent to which a quantitative model is employed as an input to this judgment
will depend, in the final analysis, on management’s assessment of the model’s validity. A
consensus panel (a management committee) is often the chosen vehicle for achieving the
final forecast. For example, what forecasts do you think persuaded Henry Ford II to reject
Lee Iacocca’s plan to move Ford into small, energy-efficient cars in the late 1970s? Also,
what forecasts led Panasonic to introduce a tape-based system while RCA introduced a
disk-based system for the TV player market? And what about the Cuban missile crisis?
The Bay of Pigs? Clearly, management’s personal view of the future played an important
role.
Quantitative models play a major role in producing directly usable forecasts in situations that are deemed to be of “midlevel importance.” This is especially true in short-range
(up to one month) and medium-range (one month to two years) scenarios. Time-series
analyses are especially popular for repetitive forecasts of midlevel importance in a relatively
stable environment. The use of exponential smoothing to forecast the demand for mature
products is a prototype of this type of application.
Causal models actively compete with various experts for forecasting various economic
phenomena in the midlevel medium range. Situations in which a forecast will be repeated
quite often, and where much relevant data are available, are prime targets for quantitative
models, and in such cases many successful models have been constructed. As our earlier
discussion of interest rates forecasts indicated, there is ample room in this market for the
“expert” with a good record of performance. In commercial practice one finds that many
management consulting groups, as well as specialized firms, provide forecasting “packages”
for use in a variety of midlevel scenarios.
As a final comment, we can make the following observations about the use of forecasting in decision making within the public sector: Just as in private industry, it is often the
case that the higher the level of the planning function, the more one sees the use of forecasting models employed as inputs. In such high-level situations there is a high premium
on expertise, and forecasting is, in one sense, a formal extension of expert judgment. Think
of the Council of Economic Advisors, the chairman of the Federal Reserve Board, or the
director of the Central Intelligence Agency. You can be sure that forecasts are of importance
in these contexts, and you can be sure that there is within these environments a continuing
updating and, one hopes, improvement of forecasting techniques. As always, the extent to
which the results of existing models are employed is a function of the executive’s overall
assessment of the model itself.
Key Terms
Causal Forecasting. The forecast for the
quantity of interest is determined as a
function of other variables.
Curve Fitting. Selecting a “curve” that
passes close to the data points in a
scatter diagram.
Scatter Diagram. A plot of the response
variable against a single independent
variable.
Method of Least Squares. A procedure
for fitting a curve to a set of data. It
minimizes the sum of the squared
deviations of the data from the curve.
Polynomial of Degree n. A function of the form y = a0 + a1x + a2x² + . . . + anxⁿ. Often used as the curve in a least squares fit.
Linear Regression. A statistical technique used to estimate the parameters
of a polynomial in such a way that the
polynomial “best” represents a set of
data. Also sometimes used to describe
the problem of fitting a linear function
to a set of data.
Validation. The process of using a
model on past data to assess its credibility.
Time-Series Forecasting. A variable of
interest is plotted against time and
extrapolated into the future using one
of several techniques.
Simple n-Period Moving Average.
Average of last n periods is used as the
forecast of future values; (n – 1) pieces
of data must be stored.
Weighted n-Period Moving Average. A
weighted sum, with decreasing
weights, of the last n observations is
used as a forecast. The sum of the
weights equals 1; (n – 1) pieces of data
must be stored.
Exponential Smoothing. A weighted sum, with decreasing weights, of all past observations; the sum of the weights equals 1, and only one piece of information (the most recent forecast) need be stored.
Holt’s Model. A variation of simple
exponential smoothing that accounts
for an upward or downward trend in
the data.
Seasonality. Movements up and down in
a pattern of constant length that
repeats itself in a time-series set of
data.
Random Walk. A stochastic process in
which the variable at time t equals the
variable at time (t – 1) plus a random
element.
Delphi Method. A method of achieving a consensus among experts
while eliminating factors of group
dynamics.
Consensus Panel. An assembled group
of experts that produces an agreed-upon forecast.
Grassroots Forecasting. Soliciting forecasts from individuals “close to” and
thus presumably knowledgeable about
the entity being forecast.
Market Research. A type of grassroots
forecasting that is based on getting
information directly from consumers.
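Several of the smoothing-based terms above boil down to short recursive calculations. The following sketch is our own illustration (it is not part of the chapter's spreadsheet files); it shows, in Python, one way the simple moving average, weighted moving average, and exponential smoothing forecasts could be computed, using the demand series that appears later in Problem 13-8.

# Minimal illustrations of the time-series forecasts defined above.
# Each function returns the one-step-ahead forecast made after seeing `history`.

def simple_moving_average(history, n):
    """Forecast = average of the last n observations."""
    return sum(history[-n:]) / n

def weighted_moving_average(history, weights):
    """Forecast = weighted sum of the last len(weights) observations.
    `weights` should sum to 1, heaviest weight last (most recent)."""
    recent = history[-len(weights):]
    return sum(w * y for w, y in zip(weights, recent))

def exponential_smoothing(history, alpha, initial_forecast):
    """Forecast via y_hat(t+1) = alpha*y(t) + (1 - alpha)*y_hat(t)."""
    y_hat = initial_forecast
    for y in history:
        y_hat = alpha * y + (1 - alpha) * y_hat
    return y_hat

if __name__ == "__main__":
    demand = [10, 14, 19, 26, 31, 35, 39, 44, 51, 55, 61, 54]  # Problem 13-8 data
    print(simple_moving_average(demand, 4))
    print(weighted_moving_average(demand, [0.1, 0.2, 0.3, 0.4]))
    print(exponential_smoothing(demand, alpha=0.3, initial_forecast=8))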
Self-Review Exercises
True-False

1. T F Minimizing total deviations (i.e., d1 + d2 + · · · + dn) is a reasonable way to define a “good fit.”
2. T F Least squares fits can be used for a variety of curves in addition to straight lines.
3. T F Regression analysis can be used to prove that the method of least squares produces the best possible fit for any specific real model.
4. T F The method of least squares is used in causal models as well as in time-series models.
5. T F In a weighted three-period moving-average forecast, the weights can be assigned in many different ways.
6. T F Exponential smoothing automatically assigns weights that decrease in value as the data get older.
7. T F Average squared error is one way to compare various forecasting techniques.
8. T F Validation refers to the process of determining a model’s credibility by simulating its performance on past data.
9. T F A “random walk” is a stochastic model.
10. T F At higher levels of management, qualitative forecasting models become more important.

Multiple Choice

11. Linear regression (with one independent variable)
a. requires the estimation of three parameters
b. is a special case of polynomial least squares
c. is a quick and dirty method
d. uses total deviation as a measure of good fit
12. An operational problem with a simple k-period moving average is that
a. it assigns equal weight to each piece of past data
b. it assigns equal weight to each of the last k observations
c. it requires storage of k – 1 pieces of data
d. none of the above
13. A large value of α puts more weight on ______ data in an exponential smoothing model.
a. recent
b. older
14. If the data being observed can best be thought of as being generated by random deviations about a stationary mean, a ______ value of α is preferable in an exponential smoothing model.
a. large
b. small
15. A divide-and-conquer strategy means
a. divide the modeling procedure into two parts: (1) use all the data to estimate parameter values, and (2) use the parameter values from part (1) to see how well the model works.
b. divide the data into two parts; estimate the parameters of the model on the first part; see how well the model works on the second part.
c. compare two models on the same database.
d. none of the above.
16. The Delphi Method
a. relies on the power of written arguments
b. requires resolution of differences via face-to-face debate
c. is mainly used as an alternative to exponential smoothing
d. none of the above
17. Conflict of interest can be a serious problem in
a. the Delphi Method
b. asking salespeople
c. market research based on consumer data
d. none of the above

Answers

1. F, 2. T, 3. F, 4. T, 5. T, 6. T, 7. T, 8. T, 9. T, 10. T, 11. b, 12. c, 13. a,
14. b, 15. b, 16. a, 17. b
Skill Problems
13-1. Consider the data set shown (contained in 13-1.XLS):

       x:   100    70    30    40    80    60    50    20    10    90
       y:    57    40    35    33    56    46    45    26    26    53

(a) Plot a scatter diagram of these data.
(b) Fit a straight line to the data using the method of least squares.
(c) Use the function derived in part (b) to forecast a value for y when x = 120.
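For readers working outside Excel, the least squares straight line requested in part (b) can also be obtained with a few lines of code. The sketch below is an illustration only (numpy.polyfit is just one of several ways to get the coefficients); it uses the Problem 13-1 data.

import numpy as np

# Data from Problem 13-1
x = np.array([100, 70, 30, 40, 80, 60, 50, 20, 10, 90], dtype=float)
y = np.array([57, 40, 35, 33, 56, 46, 45, 26, 26, 53], dtype=float)

# Least-squares straight line y = b0 + b1*x (polyfit returns highest power first)
b1, b0 = np.polyfit(x, y, deg=1)
print(f"fitted line: y = {b0:.2f} + {b1:.4f} x")

# Part (c): forecast y when x = 120
print("forecast at x = 120:", b0 + b1 * 120)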
13-2. Consider the following set of data, where x is the independent and y the dependent variable (contained in 13-2.XLS):

       x:    30    25    20    15    10     5
       y:    15    20    30    35    45    60

(a) Plot the scatter diagram for these data.
(b) Fit a straight line to the data by the method of least squares.
13-3. Consider the following set of data (contained in 13-3.XLS):

       x:     1      2      3      4      5      6      7
       y:  2.00   1.50   4.50   4.00   5.50   4.50   6.00

(a) Plot a scatter diagram of the data.
(b) Fit a straight line to the data by the method of least squares. Plot the line on the scatter diagram.
(c) Fit a quadratic function to the data by the method of least squares. Plot the curve on the scatter
diagram.
13-4. Fit a quadratic function to the data in Problem 13-2 by the method of least squares.
13-5. Compare the goodness of fit on the data in Problem 13-3 for the least squares linear function and the
least squares quadratic by calculating the sum of the squared deviations.
13-6. Compare the goodness of fit on the data in Problem 13-2 for the least squares linear function and the
least squares quadratic function (derived in Problem 13-4) by calculating the sum of the squared
deviations. Is the answer for 13-4 always better than that for 13-2?
13-7. Further investigation reveals that the x variable in Problem 13-1 is simply 10 times the time at which
an observation was recorded, and the y variable is demand. For example, a demand of 57 occurred at
time 10; a demand of 26 occurred at times 1 and 2.
(a) Plot actual demand against time.
(b) Use a simple four-period moving average to forecast demand at time 11.
(c) By inspecting the data, would you expect this to be a good model or not? Why?
13-8. Consider the following data set (contained in 13-8.XLS):

       TIME:      1    2    3    4    5    6    7    8    9   10   11   12
       DEMAND:   10   14   19   26   31   35   39   44   51   55   61   54

(a) Plot this time series. Connect the points with a straight line.
(b) Use a simple four-period moving average to forecast the demand for periods 5–13.
(c) Find the mean absolute deviation.
(d) Does this seem like a reasonable forecasting device in view of the data?
13-9. Consider the data in Problem 13-7.
(a) Use a four-period weighted moving average with the weights 4/10, 3/10, 2/10, and 1/10 to forecast
demand for time 11. Heavier weights should apply to more recent observations.
(b) Do you prefer this approach to the simple four-period model suggested in Problem 13-7? Why?
(c) Now find the optimal weights using the Solver. How much have you reduced your error measure
compared to (a)?
13-10. Consider the data in Problem 13-8.
(a) Use a four-period weighted moving average with the weights 0.1, 0.2, 0.3, and 0.4 to forecast
demand for time periods 5–13. Heavier weights should apply to more recent observations.
(b) Find the mean absolute deviation.
(c) Do you prefer this approach to the simple four-period model suggested in Problem 13-8? Why?
(d) Now find the optimal weights using the Solver. How much have you reduced your error measure
compared to the method of (a)?
13-11. Consider the time-series data in Problem 13-7.
(a) Let ŷ1 = 22 and α = 0.4. Use an exponential smoothing model to forecast demand in period 11.
(b) If you were to use an exponential smoothing model to forecast this time series, would you prefer
a larger (> 0.4) or smaller value for α? Why?
(c) Find the optimal value for α using Solver by minimizing the mean absolute percentage error.
13-12. Consider the time-series data in Problem 13-8.
(a) Assume that ŷ1 = 8 and α = 0.3. Use an exponential smoothing model to forecast demand in
periods 2–13.
(b) Find the mean absolute percentage error.
(c) Repeat the analysis using α = 0.5.
(d) If you were to use an exponential smoothing model to forecast this time series, would you prefer
α = 0.3, a larger (> 0.3), or a smaller value of α? Why?
(e) What is the optimal value for α? How much does it reduce the MAPE from part (a)?
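Problems 13-11(c) and 13-12(e) suggest optimizing α with Solver. If Solver is not available, a simple grid search gives essentially the same answer. The sketch below is an illustrative Python alternative, not the chapter's spreadsheet method; it uses the Problem 13-8 data with the ŷ1 = 8 start from Problem 13-12 and minimizes the MAPE over a grid of α values.

def mape_for_alpha(demand, alpha, initial_forecast):
    """Exponential smoothing with a given starting forecast for period 1.
    MAPE is computed over periods 2 onward, matching the problem statement."""
    y_hat = initial_forecast
    errors = []
    for t, y in enumerate(demand, start=1):
        if t > 1:                                  # period 1's forecast is just the start value
            errors.append(abs(y - y_hat) / y)
        y_hat = alpha * y + (1 - alpha) * y_hat    # forecast for period t + 1
    return 100 * sum(errors) / len(errors)

demand = [10, 14, 19, 26, 31, 35, 39, 44, 51, 55, 61, 54]   # Problem 13-8
grid = [a / 100 for a in range(1, 100)]
best_alpha = min(grid, key=lambda a: mape_for_alpha(demand, a, initial_forecast=8))
print(best_alpha, mape_for_alpha(demand, best_alpha, 8))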
13-13. The president of Quacker Mills wants a subjective evaluation of the market potential of a new nacho-flavored breakfast cereal from a group consisting of (1) the vice president of marketing, (2) the marketing
manager of the western region, and (3) ten district sales managers from the western region. Discuss the
advantages and disadvantages of a consensus panel and the Delphi Method for obtaining this evaluation.
13-14. Given that yt is produced by the relationship yt = yt–1 + ε, where ε is a random number with mean zero,
and y1 = 1, y2 = 2, y3 = 1.5, y4 = 0.8, y5 = 1, what is your best forecast of y6?
13-15. Given your current knowledge of the situation, would you recommend a causal or a time-series
model to forecast next month’s demand for Kellogg’s Rice Krispies? Why?
13-16. If α = 0.3, in calculating ŷ5, what is the weight on
(a) ŷ1
(b) y1
(c) y4
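A useful way to attack Problem 13-16 is to unroll the smoothing recursion. Repeatedly substituting the previous forecast into ŷ(t+1) = αy(t) + (1 – α)ŷ(t) gives the standard expansion (stated here for reference, not as the numeric answer):

\[
\hat{y}_{t+1} = \alpha y_t + (1-\alpha)\hat{y}_t
             = \alpha y_t + \alpha(1-\alpha)y_{t-1} + \alpha(1-\alpha)^2 y_{t-2} + \cdots + \alpha(1-\alpha)^{t-1} y_1 + (1-\alpha)^t \hat{y}_1 .
\]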
13-17. Discuss the merit of the measure “mean squared error.” In comparing two forecasting methods, is the
one with a smaller average squared error always superior?
13-18. If a company experiences exponential sales growth, how would you alter the sales forecasting
model to account for this?
13-19. Generate the following function for 100 periods (i.e., xt from 1 to 100):
yt = 300 + 200 * COS(0.1 * xt)
(a) Create a twenty-period simple moving average forecast for periods 21 to 100 and a fifty-period
simple moving average forecast for periods 51 to 100, and compare these forecasts against the
original function. What can you infer about the performance of the simple moving average forecasting technique?
(b) Create an exponential smoothing forecast using α = 0.3. What can you infer about the performance of exponential smoothing forecasts?
(c) Given the above function and forecasts, how would you as a manager use qualitative forecasting
to adjust the quantitative forecasts to achieve higher accuracy?
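One way to set up Problem 13-19 without a spreadsheet is sketched below. This is an illustration only (the choice of starting the smoothed series at the first observation is ours); the interpretation questions in parts (a)–(c) are left to the reader.

import math

# Generate y_t = 300 + 200*cos(0.1*t) for t = 1..100
y = [300 + 200 * math.cos(0.1 * t) for t in range(1, 101)]

def moving_average_forecast(series, n, t):
    """Forecast for period t (1-based) = average of the n periods ending at t - 1."""
    return sum(series[t - 1 - n:t - 1]) / n

# Part (a): 20-period forecasts for periods 21..100, 50-period forecasts for 51..100
ma20 = {t: moving_average_forecast(y, 20, t) for t in range(21, 101)}
ma50 = {t: moving_average_forecast(y, 50, t) for t in range(51, 101)}

# Part (b): exponential smoothing with alpha = 0.3, started at the first observation
alpha, y_hat = 0.3, y[0]
smoothed = []
for actual in y:
    smoothed.append(y_hat)              # forecast made before observing `actual`
    y_hat = alpha * actual + (1 - alpha) * y_hat

print(ma20[21], ma50[51], smoothed[20])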
13-20. Discuss the advantages and disadvantages of MAD, MAPE, and MSE as error measurements. Why
would you use one over another?
13-21. The marketing manager for 7–12, a small upstart convenience store corporation, wants to forecast the
sales of a new store opening in a small midwestern city. He has gathered the population statistics and
sales at the other ten stores the corporation operates. The data is shown in Table 13.3.
(a) Plot the scatter diagram of these data.
(b) Fit a straight line to the data using the method of least squares.
(c) Forecast sales for the new store if the population of the new city is 50,000.
13-22. The marketing manager for 7–12, a small upstart convenience store corporation, has found that sales
over time at his highest-performing store are as shown in Table 13.4.
(a) Use a three-period simple moving average to forecast periods 4 to 11.
(b) Use a three-period weighted moving average to forecast periods 4 to 11. Use Solver to optimize
the weights for the forecast.
(c) Use an exponentially smoothed forecast to forecast periods 2 to 11. Use Solver to optimize the
value of alpha.
Table 13.3

STORE NUMBER        SALES     POPULATION
      1           400,000        10,000
      2         1,250,000        65,000
      3         1,300,000        72,000
      4         1,100,000        54,000
      5           450,000        42,500
      6           540,000        36,800
      7           500,000        27,500
      8         1,425,000        85,000
      9         1,700,000        98,000
     10           475,000        37,500

Table 13.4

PERIOD        SALES
   1      $  750,000
   2      $  790,000
   3      $  810,000
   4      $  875,000
   5      $  990,000
   6      $1,090,000
   7      $  950,000
   8      $1,050,000
   9      $1,150,000
  10      $1,200,000
  11      $1,250,000
13-23. Discuss the effort and accuracy required with forecasts for the following:
(a) A short-term investment of $1,000,000 in volatility.com common stock with a time horizon of
two weeks or less.
(b) A long-term investment in an S&P-500 index fund with a time horizon of 30 years or more.
(c) A monthly materials requirements forecast for the production of jumbo jet aircraft.
(d) A weekly forecast for the number of custodians in a school.
13-24. Discuss the role of probability in the development of forecasting models. When would it be appropriate to use random variables in a forecasting model and what kind of statistical data would be used
in conjunction with the model?
13-25. Discuss the divide-and-conquer strategy as it applies to time-series forecasting. What alternatives
would you consider if the results from the divide-and-conquer strategy fail to meet your standards?
Application Problems
13-26. In some cases it is possible to obtain better forecasts by using a trend-adjusted forecast.
(a) Use the Holt trend model with ŷ1 = 22 to forecast the sequence of demands in Problem 13-7.
(b) Use the MSE error measure to compare the simple exponential smoothing model (Problem 13-11) with the trend-adjusted model from part (a) on forecasting demand.
13-27. In some cases it is possible to obtain better forecasts by using a trend-adjusted forecast.
(a) Use the Holt trend model with ŷ1 = 8 to forecast the sequence of demands in Problem 13-8.
(b) As in Problem 13-26, compare the above result with the result from Problem 13-12 (simple
exponential smoothing).
13-28. In Section 13.4 we presented the Holt trend model for forecasting the earnings of Startup Airlines by
Amy Luford. Use the same data to
(a) Develop a trend model using linear regression with time as the independent variable.
(b) How does its forecasting performance compare with the Holt trend model?
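Problems 13-26 through 13-28 all use Holt's trend-adjusted model. A common textbook form of the model keeps two smoothed components, a level and a trend; the sketch below is a generic Python implementation of that form (our own illustration; the smoothing constants shown and the zero initial trend are assumptions, not values taken from the chapter's spreadsheets).

def holt_forecasts(demand, alpha, beta, initial_level, initial_trend=0.0):
    """One-step-ahead forecasts from Holt's trend-adjusted exponential smoothing.

    level_t = alpha*y_t + (1 - alpha)*(level_{t-1} + trend_{t-1})
    trend_t = beta*(level_t - level_{t-1}) + (1 - beta)*trend_{t-1}
    forecast for t + 1 = level_t + trend_t
    """
    level, trend = initial_level, initial_trend
    forecasts = []
    for y in demand:
        forecasts.append(level + trend)          # forecast for this period, made last period
        new_level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    forecasts.append(level + trend)              # forecast for the next, unseen period
    return forecasts

# Example: the Problem 13-8 series (used in Problem 13-27) with an initial level of 8
demand = [10, 14, 19, 26, 31, 35, 39, 44, 51, 55, 61, 54]
print(holt_forecasts(demand, alpha=0.3, beta=0.2, initial_level=8))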
13-29. The spreadsheet AUTO.XLS contains data from Business Week on auto sales by month for 43 months.
(a) Deseasonalize the data.
(b) Find the best forecasting method for the deseasonalized data.
(c) What is your forecast for period 44?
(d) How much confidence do you have in your forecast?
13-30. Using the OILCOMP.XLS data, use the “Regression” tool of Excel to fit a quadratic function curve to
the data. Hint: You must first create a column for a second independent variable, X2 = X1^2, and then
regress Y (Sales/hr) on both X1 (Cars/hr) and X2 ([Cars/hr]^2).
(a) How do the results compare to those found in the chapter by using the Solver?
(b) Which technique seems easier?
(c) Which error measures can you use to compare these two approaches?
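The hint in Problem 13-30 (add a squared column and regress on both columns) can be mimicked outside Excel as well. The sketch below follows the same idea in Python; the numeric values are made-up stand-ins for the OILCOMP.XLS columns, which are not reproduced here.

import numpy as np

# Illustrative stand-ins for the OILCOMP.XLS columns (Cars/hr and Sales/hr)
cars_per_hr = np.array([10, 20, 30, 40, 50, 60], dtype=float)
sales_per_hr = np.array([120, 210, 270, 300, 310, 305], dtype=float)

# Build the design matrix [1, X1, X1^2] -- the "extra column" from the hint
X = np.column_stack([np.ones_like(cars_per_hr), cars_per_hr, cars_per_hr**2])

# Least-squares coefficients for Sales/hr = b0 + b1*X1 + b2*X1^2
coeffs, *_ = np.linalg.lstsq(X, sales_per_hr, rcond=None)
b0, b1, b2 = coeffs
print(f"Sales/hr ~ {b0:.2f} + {b1:.3f}*(Cars/hr) + {b2:.5f}*(Cars/hr)^2")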
13-31. Use an “investment” Internet web site to download the daily closing price data for AT&T stock for the
most recent thirty days. Use Holt’s model to forecast the next five days. Follow the next five days’ closing prices for AT&T and compare the actual closings with your forecast. Note: www.yahoo.com will
provide data in its finance section that can be downloaded into an Excel spreadsheet. Ensure that you
save the file as an Excel workbook after downloading it in .csv format.
Case Study
The Bank of Laramie
Jim Cowan was reviewing the staffing needs for the Proof
Department. He was the president of the Bank of Laramie and
one of his first priorities was to evaluate a staffing and scheduling problem.
Company Background

The Bank of Laramie conducted retail and corporate banking activities and provided a full line of trust services to its customers. It provided its customers a full line of corporate banking services, including cash management and money market services. It had total assets of $19 million and net income of $300 thousand.

Proof Department

The Proof Department was the heart of the bank’s check-clearing operations. The department received and processed checks and other documents to clear them in the shortest possible time in order to save on float, which averaged $450 thousand a day. The department was charged with the responsibility for sorting checks, proving the accuracy of deposits, distributing checks, and listing transactions arising from the daily operations of the bank.
The physical facility consisted of a room with two proof
machines and several tables. The department operated from
8:00 A.M. to 5:30 P.M. on Monday through Friday. Despite the
practice by other banks of handling check processing almost
entirely at night, the Bank of Laramie believed it was important to give its employees a normal workday.
The volume of items processed in the Proof Department had increased significantly in the last two years, from 780 thousand per year to 1.6 million per year. The scheduling problem in the department was magnified because of the uneven nature of the volume. Exhibit 1 contains deseasonalized weekly proof volumes going back to the beginning of the prior year. This volume pattern led management to use a part-time staff to cover peak loads. Currently, one full-time and two part-time proof operators were working at the bank. Each operator had an average processing rate of 700 items per hour.

EXHIBIT 1 Deseasonalized Weekly Proof Volumes

WEEK   VOLUME (000)     WEEK   VOLUME (000)
  1        23.4           34       31.1
  2        26.4           35       31.0
  3        28.7           36       29.6
  4        26.4           37       31.5
  5        28.6           38       31.3
  6        29.4           39       31.1
  7        29.9           40       34.9
  8        29.3           41       32.3
  9        32.2           42       35.6
 10        28.7           43       33.8
 11        27.8           44       31.3
 12        31.1           45       31.2
 13        32.7           46       30.4
 14        32.5           47       31.1
 15        28.9           48       32.7
 16        31.8           49       34.8
 17        32.8           50       34.5
 18        32.7           51       36.0
 19        31.7           52       28.3
 20        32.5           53       27.8
 21        32.7           54       30.4
 22        30.9           55       28.4
 23        30.5           56       29.3
 24        31.3           57       28.9
 25        30.1           58       33.5
 26        32.4           59       32.6
 27        28.5           60       32.0
 28        29.9           61       30.6
 29        31.7           62       31.9
 30        30.7           63       31.3
 31        31.6           64       31.6
 32        32.1           65       31.1
 33        30.1           66       32.0
Forecasting
The first thing Mr. Cowan had to do was forecast demand for
next week, week 67, and then he would need to work out a
schedule for the number of full- and part-time staff to meet
the predicted demand. A couple of simple forecasting methods appealed to him—namely, using the previous week’s
actual demand for the next week’s forecast or just using the
weekly average over the past 66 weeks. He wondered how
accurate these simple methods were, however, and whether or
not there was a better way.
He would use his forecast to determine how many hours
of additional part-time workers to schedule for the next week.
His base schedule was enough to do 15,000 checks; he could
add as many additional part-time hours as he wished to the
schedule. (Note: The base schedule includes the number of
checks that could be processed by one full-time and two part-time workers along with their other duties.) If he scheduled
either full- or part-time hours, he had to pay for them even if
the workers completed the check processing early. On the
other hand, if the volume of checks was so high that the checks
couldn’t be processed in the hours he scheduled for the week,
he would need to pay overtime wages (which were 50% above
regular wages) in order to complete the work for the week.
There was no requirement to finish a given day’s checks
during that same day, but all the checks for the week had to be done
by Friday afternoon.
His first task was to get a handle on the forecasting problem; then he could easily use it to find the number of part-time hours to schedule.
Questions
1. Look at the following five procedures suggested for forecasting the weekly workload requirements of the Proof Department:
• Method 1: This simple forecasting scheme uses the
previous week’s volume to forecast each succeeding
week’s volume (e.g., the forecast for week 10 would be
32.2, the volume for week 9).
• Method 2: This approach uses the average volume over
all the previous weeks as the forecast. Thus, the forecast for week 23 would be 30.05 (the average over the
first 22 weeks). To forecast week 67, weeks 1 through
66 would be averaged together.
• Method 3: This method is exponential smoothing with
alpha = 0.5. Use the spreadsheet (BANK.XLS) that
already has the data from Exhibit 1 entered and the
exponential smoothing forecast calculated for an
alpha of 0.5. (Notice that this forecasting method
requires an initial average to get things started. The
spreadsheet used 23.4 as this average; in essence taking a “best-case” approach by assuming the first average was perfect.)
• Method 4: This method takes a moving average
approach. The number of periods in the moving average is up to you. Try several different values to come
up with the best forecasting method you can.
• Method 5: This method is called linear regression. It
basically fits a straight line through the data, looking
for the overall trend.
2. Which of the forecasting methods presented would you recommend to forecast weekly volumes?
Adjust the smoothing constant if needed to improve Method 3.
3. How many part-time hours should Mr. Cowan schedule for next week? Hint: Think about the
costs of over- and underscheduling the number of hours.
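For readers who want to experiment with Methods 1–3 outside BANK.XLS, the Exhibit 1 volumes can be typed into a short script. The sketch below is an illustration only, not the case's intended solution; it computes week-67 forecasts and the mean absolute deviation for the naive method, the cumulative average, and exponential smoothing with alpha = 0.5 started at 23.4, as described in Method 3.

volumes = [23.4, 26.4, 28.7, 26.4, 28.6, 29.4, 29.9, 29.3, 32.2, 28.7, 27.8,
           31.1, 32.7, 32.5, 28.9, 31.8, 32.8, 32.7, 31.7, 32.5, 32.7, 30.9,
           30.5, 31.3, 30.1, 32.4, 28.5, 29.9, 31.7, 30.7, 31.6, 32.1, 30.1,
           31.1, 31.0, 29.6, 31.5, 31.3, 31.1, 34.9, 32.3, 35.6, 33.8, 31.3,
           31.2, 30.4, 31.1, 32.7, 34.8, 34.5, 36.0, 28.3, 27.8, 30.4, 28.4,
           29.3, 28.9, 33.5, 32.6, 32.0, 30.6, 31.9, 31.3, 31.6, 31.1, 32.0]

def mad(forecasts, actuals):
    """Mean absolute deviation over the weeks where a forecast exists."""
    pairs = [(f, a) for f, a in zip(forecasts, actuals) if f is not None]
    return sum(abs(f - a) for f, a in pairs) / len(pairs)

# Method 1: last week's volume; Method 2: average of all prior weeks;
# Method 3: exponential smoothing, alpha = 0.5, initial average 23.4.
naive = [None] + volumes[:-1]
cumavg = [None] + [sum(volumes[:t]) / t for t in range(1, len(volumes))]
smooth, s = [], 23.4
for v in volumes:
    smooth.append(s)
    s = 0.5 * v + 0.5 * s

for name, f in [("naive", naive), ("cumulative average", cumavg), ("smoothing", smooth)]:
    print(name, "MAD =", round(mad(f, volumes), 2))

print("week-67 forecasts:", volumes[-1], sum(volumes) / len(volumes), s)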
Shumway, Horch, and Sager (B)1
Christensen started to look at the circulation data of some of
the other monthly magazines represented by the client organization. The first set of data was for Working Woman (found in
the “WkgWom” sheet of SHSB.XLS and shown in Exhibit 1),
which was targeted at women who were in management
careers in business. Contents included sections devoted to
entrepreneurs, business news, economic trends, technology,
politics, career fields, social and behavioral sciences, fashion,
and health. It was sold almost entirely through subscriptions,
as evidenced by the latest figures reported to the Audit Bureau
of Circulation (823.6K subscriptions out of 887.8K total circulation).
The next graph represented circulation data for Country Living (found in the “CtryLiving” sheet of SHSB.XLS and shown in Exhibit 2), a journal that focused on both the practical concerns and the intangible rewards of living on the land. It was sold to people who had a place in the country, whether that was a working farm, a gentleman’s country place, or a weekend retreat.

EXHIBIT 2 Graph of Country Living Circulation
The third set of data was for Health (found in the “Hlth”
sheet of SHSB.XLS and shown in Exhibit 3), which was a
lifestyle magazine edited for women who were trying to look
and feel better. The magazine provided information on fitness,
beauty, nutrition, medicine, psychology, and fashions for the
active woman.
EXHIBIT 1 Graph of Working Woman Circulation
1 This case is to be used as the basis for class discussion rather
than to illustrate either the effective or ineffective handling of an
administrative situation. © 1990 by the Darden Graduate
Business School Foundation. Preview Darden case abstracts on
the World Wide Web at www.darden.virginia.edu/publishing.
EXHIBIT 3 Graph of Health Circulation
EXHIBIT 4 Graph of Better Homes and Gardens
Circulation
EXHIBIT 5 Graph of True Story Circulation
A fourth graph was for Better Homes and Gardens (found
in the “BH&G” sheet of SHSB.XLS and shown in Exhibit 4),
which competed with Good Housekeeping and was published
for husbands and wives who had serious interests in home and
family as the focal points of their lives. It covered these home-and-family subjects in depth: food and appliances, building
and handyman, decorating, family money management, gardening, travel, health, cars in your family, home and family
entertainment, new-product information, and shopping. The
magazine’s circulation appeared to be experiencing increased
volatility over time. Was this the beginning of a new pattern?
The last magazine was True Story (found in the
“TrueStry” sheet of SHSB.XLS and shown in Exhibit 5). It was
edited for young women and featured story editorials as well
as recipes and food features, beauty and health articles, and
home management and personal advice. This journal’s circulation appeared to have a definite downward trend over the
past nine years. Was the cause a general declining interest in
the subject matter, or was this a cycle that would correct itself
in the future (like the sine wave Christensen had studied in
trigonometry)?
Question
1. What’s the best forecasting method for each of the five
magazines? Use the concept of “Divide and Conquer” to
really test your different forecast methods.
Marriott Room Forecasting1
“A hotel room is a perishable good. If it is vacant for one night,
the revenue is lost forever.” Linda Snow was commenting on
the issue of capacity utilization in the hotel business. “On the
other hand, the customer is king with us. We go to great pains
to avoid telling a customer with a reservation at the front desk
that we don’t have a room for him in the hotel.”
As reservation manager of one of Marriott’s hotels, Linda
faced this trade-off constantly. To complicate the matter, customers often booked reservations and then failed to show, or
cancelled reservations just before their expected arrival. In
addition, some guests stayed over in the hotel extra days
beyond their original reservation and others checked out early.
A key aspect of dealing with the capacity-management problem was having a good forecast of how many rooms would be
needed on any future date. It was Linda’s responsibility to prepare a forecast on Tuesday afternoon of the number of rooms
that would be occupied each day of the next week (Saturday through Friday). This forecast was used by almost every
department within the hotel for a variety of purposes; now she needed the forecast for a decision in her own department.

1 This case is to be used as the basis for class discussion rather than to illustrate either the effective or ineffective handling of an
administrative situation. © 1989 by the Darden Graduate Business School Foundation. Preview Darden case abstracts on
the World Wide Web at www.darden.virginia.edu/publishing.
Hamilton Hotel
The Hamilton Hotel was a large downtown business hotel
with 1,877 rooms and abundant meeting space for groups and
conventions. It had been built and was operated by Marriott
Hotels, a company that operated more than 180 hotels
and resorts worldwide and was expanding rapidly into other
lodging-market segments. Management of The Hamilton
reported regularly to Marriott Corporation on occupancy and
revenue performance.
Hotel managers were rewarded for their ability to meet
targets for occupancy and revenue. Linda could not remember
a time when the targets went down, but she had seen them go
up in the two years since she took the job as reservation manager. The hotel managers were continuously comparing forecasts of performance against these targets. In addition to overseeing the reservations office with eight reservationists, Linda
prepared the week-ahead forecast and presented it on Tuesday
afternoon to other department managers in the hotel. The
forecast was used to schedule, for example, daily work assignments for housekeeping personnel, the clerks at the front desk,
restaurant personnel, and others. It also played a role in purchasing, revenue, and cost planning.
Overbooking
At the moment, however, Linda needed her forecast to know
how to treat an opportunity that was developing for next
Saturday. It was Tuesday, August 18, and Linda’s forecasts were
due by midafternoon for Saturday, August 22 through Friday,
August 28. Although 1,839 rooms were reserved already for
Saturday, Linda had just received a request from a tour company for as many as 60 more rooms for that night. The tour
company would take any number of rooms that Linda would
provide, up to a maximum of 60. Normally Linda
would be ecstatic about such a request: selling out the house
for a business hotel on a Saturday would be a real coup. The
request, in its entirety, put reservations above the capacity of
the hotel, however. True, a reservation on the books Tuesday
was not the same as a “head in the bed” on Saturday, especially
when weekend nights produced a lot of “no-show” reservations. “Chances are good we still wouldn’t have a full house on
Saturday,” Linda thought out loud. “But if everybody came
and someone was denied a room due to overbooking, I would
certainly hear about it, and maybe Bill Marriott would also!”
Linda considered the trade-off between a vacant room
and denying a customer a room. The contribution margin
from a room was about $90, since the low variable costs arose
primarily from cleaning the room and check-in/check-out.
On the other side, if a guest with a reservation was denied a
room at The Hamilton, the front desk would find a comparable room somewhere in the city, transport the guest there, and
provide some gratuity, such as a fruit basket, in consideration
for the inconvenience. If the customer were a Marquis cardholder (a frequent guest staying more than 45 nights a year in
the hotel), he or she would receive $200 cash plus the next two
stays at Marriott free. Linda wasn’t sure how to put a cost figure on a denied room; in her judgment, it should be valued,
goodwill and all, at about twice the contribution figure.
Forecasting
Linda focused on getting a good forecast for Saturday, August
22, and making a decision on whether to accept the additional
reservations for that day. She had historical data on demand
for rooms in the hotel; Exhibit 1 shows demand for the first three weeks, for dates starting with Saturday, May 23.

EXHIBIT 1 Historical Demand and Bookings Data

Cell     Formula     Copy To
E5       C5/D5       E6:E91

(Ten additional weeks [weeks 4–13] are contained in
MARRIOTT.XLS and thus Saturday, August 22, was the beginning of week 14 in this database.) “Demand” figures (column
C) included the number of turned-down requests for a reservation on a night when the hotel had stopped taking reservations because of capacity, plus the number of rooms actually
occupied that night. Also included in Exhibit 1 is the number
of rooms booked (column D) as of the Tuesday morning of
the week prior to each date. (Note that this Tuesday precedes a
date by a number of days that depends on the date’s day of
week. It is four days ahead of a Saturday date, seven days ahead
of a Tuesday, ten days ahead of a Friday. Also note that on a
Tuesday morning, actual demand is known for Monday night,
but not for Tuesday night.)
Linda had calculated pickup ratios for each date where
actual demand was known in Exhibit 1 (column E). Between a
Tuesday one week ahead and any date, new reservations were
added, reservations were canceled, some reservations were
extended to more nights, some were shortened, and some
resulted in no-shows. The net effect was a final demand that
might be larger than Tuesday bookings (a pickup ratio greater
than 1.0) or smaller than Tuesday bookings (a pickup ratio
less than 1.0). Linda looked at her forecasting task as one of
predicting the pickup ratio. With a good forecast of pickup
ratio, she could simply multiply by Tuesday bookings to
obtain a forecast of demand.
From her earliest experience in a hotel, Linda was aware
that the day of the week (DOW) made a lot of difference in
demand for rooms; her recent experience in reservations sug-
gested that it was key in forecasting pickup ratios. Downtown
business hotels like hers tended to be busiest in the middle of
the workweek (Tuesday, Wednesday, Thursday) and light on
the weekends. Using the data in her spreadsheet, she had calculated a DOW index for the pickup ratio during each day of
the week, which is shown in column F of Exhibit 1. Thus, for
example, the average pickup ratio for Saturday is about 86.5%
of the average pickup ratio for all days of the week. Her plan
was to adjust the data for this DOW effect by dividing each
pickup ratio by this factor. This adjustment would take out the
DOW effect, and put the pickup ratios on the same footing.
Then she could use the stream of adjusted pickup ratios to
forecast Saturday’s adjusted pickup ratio. To do this, she
needed to think about how to level out the peaks and valleys of
demand, which she knew from experience couldn’t be forecasted. Once she had this forecast of adjusted pickup ratio,
then she could multiply it by the Saturday DOW index to get
back to an unadjusted pickup ratio. “Let’s get on with it,” she
said to herself. “I need to get an answer back on that request
for 60 reservations.”
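Linda's procedure (compute pickup ratios, remove the day-of-week effect, smooth the adjusted series, then restore the DOW effect and scale by current bookings) can be expressed compactly. The sketch below is a minimal Python illustration of that logic; the demand and booking records are made up for the example, since the actual values live in MARRIOTT.XLS, and the plain average used as the smoother is only one of the choices the case invites you to consider.

from statistics import mean

# Illustrative records only; real values come from MARRIOTT.XLS.
# Each record: (day_of_week, final_demand, rooms_booked_as_of_the_prior_Tuesday)
history = [
    ("Sat", 1650, 1800), ("Sun", 1500, 1700), ("Mon", 1830, 1760),
    ("Sat", 1600, 1750), ("Sun", 1450, 1650), ("Mon", 1870, 1790),
]

# Step 1: pickup ratio = demand / Tuesday bookings (column E in Exhibit 1)
ratios = [(dow, demand / booked) for dow, demand, booked in history]

# Step 2: DOW index = average ratio for that day / average ratio over all days (column F)
overall = mean(r for _, r in ratios)
dow_index = {dow: mean(r for d, r in ratios if d == dow) / overall
             for dow in {d for d, _ in ratios}}

# Step 3: remove the DOW effect and smooth the adjusted ratios (here, a plain average)
adjusted = [r / dow_index[dow] for dow, r in ratios]
smoothed_adjusted = mean(adjusted)

# Step 4: restore the Saturday DOW effect and scale by current Saturday bookings
saturday_bookings = 1839
forecast = smoothed_adjusted * dow_index["Sat"] * saturday_bookings
print(round(forecast))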
Questions
1. Verify the day-of-week indices in column F of Exhibit 1.
2. What forecasting procedure would you recommend for making the Tuesday afternoon forecast for each day’s demand for the following Saturday through Friday?
3. What is your forecast for Saturday, August 22? What will you do about the current request for up to 60 rooms for Saturday?