Instructor's manual for this book

Preface
In preparing the manuscript for the third edition of Forecasting: methods and applications,
one of our primary goals has been to make the book as complete and thorough as possible
in order that it might best meet its intended objectives. The same set of principles has
guided us in preparing this instructor's manual. Our intent has been not only to provide
solutions to the exercises but also to offer other teaching materials and suggestions
to help those who teach forecasting. We hope that you will find
that this manual delivers on those objectives.
The instructor's manual is in four parts. To avoid confusion with the chapters in the
textbook, we will refer to these as Parts A through D. When the word Chapter is used, it
refers to that chapter in the text itself.
Part A is aimed at providing different course outlines for a number of different settings
in which the text has been used. These range from short executive seminars to subsegments
of a required college course to full-length courses at the graduate level on the subject of
forecasting.
In Part B, we have provided some teaching suggestions as to how we would teach a
course based on the book.
When teaching, we always use a range of additional teaching materials as complements
to the text. In Part C, we discuss the use of case studies and provide suggestions for
projects and exam questions. Their use will depend on the overall structure, teaching
style and design selected for the course.
Part D provides solutions to the end-of-chapter exercises. We have provided solutions
that can be used in teaching the course rather than just for grading student work. We hope
the graphs, tables and descriptions will be useful in preparing overhead transparencies or
handouts for students.
We have long found it useful to teach forecasting using a computer package for the
computational aspects of the subject. We have chosen in this edition not to emphasize
a particular package but to comment on the facilities available in a range of packages
(see Appendix I of the text). It is important to have a package that the students can
learn relatively quickly and which provides as many of the statistical facilities as possible.
This will depend on the students’ background and the type of course being taught. In
preparing the solutions, we have mainly used Minitab version 11 and SAS version 6.12.
Be aware that other packages may give slightly different numerical results due to different
algorithms being used.
We would like to express special thanks to a number of instructors who have helped us
with this manual and given us their feedback from use of the second edition of Forecasting:
methods and applications. At Stanford University, the book has been used by Professor
Fred Shepardson and Professor Peter Reiss. Teaching materials were also provided by
two of our colleagues at the University of Virginia, Professor Jim Freeland and Professor
Bob Landel and by Dr Gary Grunwald who provided some ideas for the exercises while
teaching at the University of Melbourne.
Spyros Makridakis
Fontainebleau, France
Steven Wheelwright
Boston, Massachusetts
Rob Hyndman
Melbourne, Australia
November 1997
A/Planning a forecasting course
Since the late 1940s organizational forecasting has been directly affected by numerous developments in estimation and prediction. Today organizations of all sizes find it essential to
make forecasts in order to reduce the uncertainty of their environment and take advantage
of opportunities. We have been involved in the forecasting area for more than 25 years,
and this book is the culmination of our efforts and experiences at teaching forecasting to
both full-time students and practicing managers with forecasting responsibility.
This book grew out of our perception that a book was needed that would cover a
full range of forecasting methods, be accurate and complete in describing the essential
characteristics and steps for applying those methods in practice, and yet not get bogged
down in theoretical questions underlying the development of individual techniques. The
purpose of the book is to fill a gap in the literature by presenting in terms that are easily
understandable (but that accurately and rigorously describe the techniques) a wide range
of forecasting methods useful to students and practitioners of management, economics,
engineering, and other disciplines requiring effective forecasting.
The material included in the book has been particularly effective in a wide range of
teaching activities. These have included seminars for middle management, seminars for
practicing forecasters, and classes taught to both graduates and undergraduates in business and in more specialized topic areas such as courses for statisticians, economists, and
management scientists.
Four objectives guided the structuring and development of the major sections of this
book. It is useful to understand these objectives before describing the way in which various
materials might be combined to develop an effective course or program on forecasting
and planning. Part A of this manual reviews these objectives and describes how the
materials in the text can be used to teach forecasting effectively. Alternative course
outlines that indicate how various parts of the book might be used for different audiences
and in programs of different length are also suggested. Parts B and C of this manual
provide additional information on the use of supplementary material and the teaching of
individual sessions in a forecasting course.
The various uses of the book can probably best be seen by considering specific course
outlines built around the text. Several such outlines are discussed and presented in Section
A/3.
A/1 Objectives
The major objectives we pursued in structuring and developing the four parts in this book
included the following:
1. Presenting essential aspects of a wide range of forecasting methods with sufficient
detail and clarity that they could be applied easily.
2. Presenting alternative forecasting methods in such a form that a minimum of technical background would be required to understand each technique, yet including
enough of the essential concepts and their theoretical basis that students could gain
a thorough understanding of each technique if desired.
3. Providing information concerning the operating and performance characteristics associated with each of the major forecasting methodologies so that criteria for selecting the most appropriate methods can be developed and applied.
4. Examining the important factors and issues in the application of forecasting and
planning and the effective use of forecasting resources in an ongoing organization.
It should be emphasized that while the body of each chapter is geared to the reader
with only a basic background in algebra, the aim has still been to present a complete
description of each technique and those factors that are relevant in deciding where and
how to use it.
A/2 Course materials
In addition to having a class study individual chapters, an instructor has four other major
sources of material to use as building blocks in developing a forecasting course.
Exercises As indicated in the preceding section, many of the chapters in this book deal
with specific techniques for forecasting. At the end of these methodology-oriented
chapters are exercises that we have found particularly helpful in teaching forecasting. The purpose of these exercises is to give students the opportunity to test their
knowledge of the basic mechanics of individual methodologies and their understanding of the strengths and limitations of those approaches. Complete solutions for
these exercises are provided in Part D of this manual. All data sets used in the
exercises are available from the book’s web page at
www.maths.monash.edu.au/~hyndman/forecasting/
Case studies Case studies on forecasting situations can take a variety of forms including
basic exercises with a few paragraphs describing the situation surrounding that exercise, providing organizational descriptions of a forecasting system, and even outlining
the management perspective of a decision maker who must use forecasts prepared
in accordance with a given methodology. Part C outlines procedures for using such
cases and suggests how those cases might be used to supplement the exercises at
the end of each chapter. Part C concludes with several additional exercises, project
suggestions and examination questions that might be useful in a forecasting course.
Computer programs A number of computer programs that implement many of the
quantitative techniques described in the text are available and their facilities are
summarized in Appendix I of the book. We have found that it is essential to use a
computer package when teaching forecasting to enable students to apply the techniques themselves without unnecessarily detailed calculations. In addition, use of
such a package has proved a particularly appropriate means of making students
aware of the strengths and weaknesses of individual techniques on the basis of their
applicability and results in a range of situations.
Additional readings At the end of each chapter is a set of references that we have found
relevant to topics raised in the chapter. These references provide more depth on the
most important issues and suggest follow-up material for students with a special
interest in a given topic.
A/3 Sample course outlines
We have used the materials included in Forecasting: Methods and Applications, third
edition, for a variety of purposes. These have varied in terms of (1) number of class
sessions in the course, (2) the audience at which the course is directed, and (3) the mix of
lecture and case sessions.
In each course, we find it useful to use real case studies. These are most successful
when the subject matter is of relevance to those attending the course. For longer courses
designed for users, it is important to include computer laboratory sessions giving those
attending real experience in forecasting real data. Often the concepts that are covered in
lectures are only understood after a person has carried out the procedure and seen the
results on a computer.
Here we provide six course outlines that vary along these three dimensions. Chapter
references are given for background reading. In longer courses, we would cover all (or
almost all) of the material in each chapter. In shorter courses, we give only a brief
introduction to the material listed.
Course A: 5-session (one-day) course for potential users of forecasting
Course A: for potential users of forecasting
Session 1: Introduction
(Chapter 1)
Evaluating the performance of a forecasting system, identifying issues in organizational forecasting, and providing an overview of forecasting methodology.
Session 2: Time series approaches to forecasting
(Sections 4/2 – 4/3)
Exponential smoothing methods and ARIMA models.
Session 3: Regression approaches to forecasting
(Sections 5/1 – 5/3; 6/1 – 6/5)
Correlation analysis, simple regression, and multiple regression.
Session 4: Evaluating alternative forecasting methods
(Section 2/4, Chapter 11)
Introduction of a basic framework.
Session 5: Managing the organization’s forecasting function
(Chapter 12)
Auditing the status of forecasting, identifying key problems, and designing and implementing effective action plans.
The first outline is for a one-day course that would serve as a primer for potential users
of forecasting in a business organization. Given the audience, the teaching approach is a
combination of lectures and case discussions that can draw into the class the experience
of individual participants.
It is assumed that those attending such a course have some basic mathematical abilities;
if they also have some forecasting experience, it is assumed that it is in a fairly limited area
or that it involves a fairly narrow set of methodologies. As outlined below, selected sections
of half a dozen chapters in this book can be used to illustrate various methodologies and
the ways in which forecasting management issues can be tackled. The references to the
textbook are given in italics.
Course B: 12-session (two-day) course for practicing forecasters
In the second course outline, 12 sessions spread over two days are used to introduce
those with forecasting responsibility in a company to various methodologies and to an
interactive forecasting computer package. This program has been used with companies
that have assigned forecasting responsibility to several people acting as business
managers for various product lines or as product/market managers for various segments
of the company’s business.
The text chapters indicated serve as further reference on specific topics. The basic
information on those topics would be covered during lecture class sessions and would be
applied during the sessions that use the computer. Cases are used as the source of data
and as illustrations of various management issues.
Course C: 11-session (two-day) course for managers who will use forecasting
This is also a two-day program (11 sessions) aimed at managers who make use of forecasting
rather than at forecasters. These may be managers who must do their own forecasting or
managers who interface with a forecasting group, such as at a public utility or in a large
consumer products firm. Again, this program assumes that an interactive forecasting
package is available, and output from the package is used to illustrate methodologies.
However, the participants do not actually use the package during the course.
Much of the class time is spent describing the way in which alternative forecasting
techniques can tackle representative management problems. The objectives of such a
course would be to help managers identify situations in which a quantitative forecasting
approach would be appropriate and how to describe and define such situations so that
such a technique could be applied.
Course D: 9-session (three-day) course for potential users of forecasting
This course gives a nine-session program that could be provided as an elective in a regular
college or university curriculum or it might be compressed into a three-day program for
managers or practicing forecasters. It would go into more depth on the topics covered in the
outlines for one- or two-day courses and take a management orientation by concentrating
on case applications and their discussion in the classroom. As indicated, this course outline
includes specific time for group work after individual preparation of the cases.
Course B: for practicing forecasters
Session 1: Introduction and defining the forecasting problem
(Chapters 1 and 2)
Major issues in forecasting, the concept of a forecasting strategy, a framework for
classifying forecasting methodologies, and measuring forecasting error.
Session 2: Exponential smoothing methods of forecasting
(Chapter 4)
Single exponential smoothing, Holt’s method, and Holt-Winters’ seasonal method.
Session 3: Using an interactive forecasting package
A laboratory session introducing a forecasting package. Inputting data, plotting data,
using exponential smoothing methods for tackling a given case study.
Session 4: Time series analysis
(Chapter 7)
Autoregressive and moving average models for time series analysis, autocorrelation
analysis, and model selection.
Session 5: Data preparation for forecasting
Determining what to forecast, deciding how to forecast, and securing the required
data. Use examples from past experience.
Session 6: Optional laboratory session
Application of smoothing methods and time series analysis to given case studies.
Session 7: Simple regression
(Chapter 5)
Correlation analysis, simple linear regression, and statistical tests of significance.
Session 8: Multiple regression
(Chapter 6)
Basics of time series multiple regression, causal factors in multiple regression, statistical characteristics of this method.
Session 9: Laboratory session
Applying simple and multiple regression to given case studies.
Session 10: Selection of a forecasting methodology
(Section 2/4 and Chapter 11)
Techniques for analyzing the characteristics of a given data set, criteria for selecting a forecasting methodology, and comparing the results obtained from alternative
techniques.
Session 11: Systematic improvement of forecasting
(Chapter 12)
Designing a forecasting strategy, measuring performance, and considering organizational aspects of the forecasting function.
Course C: for managers who will use forecasting
Session 1: Overview of forecasting for management
(Chapter 1)
A management perspective on forecasting, introduction to forecasting methodologies,
accuracy as a performance criterion, additional criteria for selecting a forecasting
method.
Session 2: Seasonal indices and decomposition
(Sections 3/1, 3/2 and 3/4)
Classical decomposition, computation of seasonal indices, deseasonalizing a data series.
Session 3: Exponential smoothing
(Chapter 4)
Basic smoothing models, performance on trend and seasonal patterns.
Session 4: Interactive forecasting package
Philosophy and characteristics, overview of program structure and components, data
creation and handling, inputs and outputs to the program, sample runs, limitations,
and applications.
Session 5: Autocorrelation analysis
(Section 2/3/3 and Section 7/1)
Time dependence, correlation error analysis. Apply to errors from exponential
smoothing method.
Session 6: Time series analysis
(Sections 7/2 – 7/8)
Concepts and theory, performance, and practice.
Session 7: Simple regression
(Chapter 5)
Linear time trend analysis, least squares estimation, performance, and practice.
Session 8: Multiple regression
(Chapter 6)
Causal relationships, illustrative examples, performance, and practice.
Session 9: Management use of forecasting
(Chapters 10 and 12)
Computer programs, business cycles, and judgmental inputs.
Session 10: Implementing quantitative forecasting techniques
(Chapter 12)
Selecting the right data, specifying the forecasting project, determining support requirements, using available methodologies.
Course D: for potential users of forecasting
Session 1: Introduction and overview
Session 2: Role of forecasting in decision making, planning, and control
(Chapter 1)
Session 3: Time series decomposition
(Chapter 3)
Classical decomposition, computation of seasonal indices, deseasonalizing a data series. Value of decomposition for forecasting.
Session 4: Exponential smoothing
(Chapter 4)
Group work on exercises for smoothing time series. Discussion of time series smoothing
and evaluation of accuracy.
Session 5: Time series analysis
(Chapter 7)
Introduction to autoregressive and moving average models for time series analysis,
autocorrelation analysis, and model selection. Some exercises on model selection and
autocorrelation analysis of errors. Interpretation of forecasts and prediction intervals.
Session 6: Simple regression
(Chapter 5)
Correlation, least squares estimation, simple linear regression, statistical tests of significance, forecasts.
Session 7: Multiple regression
(Chapter 6)
Seasonal dummy variables, other explanatory variables, variable selection, estimation,
forecasts, collinearity.
Session 8: Regression analysis in practice
(Chapter 6)
Group discussion of a case study. Identification of appropriate variables, variable
selection, interpretation of computer output, use of model for forecasting, forecasting
explanatory variables.
Session 9: The forecasting function in the firm
(Chapters 10 and 12)
Course E: segment of production course on short-range forecasting
Session 1: Introduction to short-range forecasting
(Chapters 1, 2, and 12)
Session 2: Analysis of time series data
(Chapters 3 and 7)
Plots, seasonality, autocorrelation.
Session 3: Exponential smoothing
(Chapter 4)
Simple models, calculation of forecasts, autocorrelation of errors.
Session 4: ARIMA models
(Chapter 7)
Overview of models, model selection, interpretation of forecasts, and prediction intervals.
Course E: 4-session segment of production course on short-range forecasting
This short course on forecasting is actually a four-session segment of an MBA elective
course on production and operations management that focuses on short-range forecasting
as an input to production planning and control. Students are expected to use an interactive forecasting package in applying selected methods to a variety of production planning
situations.
Course F: 18-session course for MBA elective on forecasting for management
This course is designed as a full-quarter elective course for MBA students who may accept
job assignments with forecasting responsibility immediately upon graduation. Thus, students electing this course would tend to have a fairly good background in mathematics and
would be particularly interested in the knowledge needed to apply individual techniques.
The majority of the class sessions would involve lectures dealing with various topics related
to quantitative forecasting techniques and their application. However, at the end of each
of several major sections of the course, one or more cases would be used to show the practical application of the concepts being discussed and some of the difficulties encountered
in their implementation. A forecasting competition would be conducted during the course
so students could forecast a variable, obtain results, and then prepare further forecasts.
We have also used class visitors at the end of a long course of this type. These would
be practicing forecasters from business or industry who would provide a fresh perspective
on the practice of forecasting and would also discuss non-statistical problems they have
encountered in forecasting.
Course F: for MBA elective on forecasting for management
Session 1: Introduction to forecasting - Why forecast?
(Chapter 1)
Summarizing data and relationships: a review of some useful concepts. Explanatory
methods of forecasting. Forecasting and causality.
Session 2: Summarizing data and relationships
(Chapters 2 and 3)
A review of some simple quantitative concepts, time series, moving average smoothing.
Session 3: Decomposition
(Chapter 3)
Trend analysis and seasonality, detrending, classical decomposition, overview of other
decomposition methods.
Session 4: Exponential smoothing methods
(Chapter 4)
Simple exponential smoothing, Holt’s method, Holt-Winters’ method.
Session 5: Regression
(Chapter 5)
Regression analysis, correlation, least squares estimation, tests of significance.
Session 6: Multiple regression
(Chapter 6)
Multiple regression models, significance tests, variable selection.
Session 7: Multiple regression
(Chapter 6)
More on estimating regressions, diagnostics, transformations, prediction with regression models.
Session 8: Multiple regression
(Chapter 6)
Lagged variables, spurious regressions.
Session 9: Multiple regression
(Chapter 6)
Econometrics and economic models.
Session 10: Time series analysis
(Chapter 7)
Tests for autocorrelation, analysis of errors from forecasting methods, autoregressive
models.
Session 11: Time series analysis
(Chapter 7)
ARMA and ARIMA models.
Session 12: Time series analysis
(Chapter 7)
Box-Jenkins procedures, estimation, tests.
Session 13: Time series analysis
(Chapter 7)
Examples and applications.
Session 14: Advanced forecasting methods
(Chapter 8)
Overview of some more advanced methods.
Session 15: Long-term forecasting
(Chapter 9)
Cycle analysis and indexes, cycles and the forecast problem, index construction, leading indicators.
Session 16: Judgmental forecasting
(Chapter 10)
Subjective methods, estimation, tests, model selection criteria, estimation, and diagnostics, choosing a method to fit the problem.
Session 17: Comparing forecasts
(Chapter 11)
Measures of accuracy. How do we compare forecasts? Canonical procedures, rules of
thumb.
Session 18: Conclusion
Review and award presentation for the competition.
A major project is also a worthwhile adjunct to such a course. One option would be
to give students a case and ask them to prepare the complete analysis and forecasts in
both written and oral form for the management of that company. Such a project allows
students not only to test their knowledge of various techniques but also to handle the
problems of deciding what to forecast and how, and determining what data would be
most appropriate.
A final variation of this forecasting project that we have used with some success is to
identify a forecasting situation in an existing company and to have the manager in that
situation come to class so that the students, acting as forecasters, can define the task and
determine what information is needed. Students can work on the project in teams and
make their final presentations to the manager involved. This type of project is perhaps
the best possible test of students’ knowledge of the subject area and their ability to handle
the practical aspects of forecasting in an organization. The final presentation could serve
as the final exam for the students.
Many other course outlines based on the material in this book are possible, of course.
Here we wish to introduce the range of such outlines and suggest how they might be
adapted to use the complementary materials described in Parts B and C of this manual.
B/Teaching suggestions
Chapter 1: The forecasting perspective
The fate of a course is often decided early, so motivating the subject is critical. This
chapter needs help to make it a lively opening for the course; here are some suggestions.
1. Spend time making the three points on page 5 meaningful.
(a) Scheduling existing resources
(b) Acquiring additional resources
(c) Determining what future resources are needed
If possible, invite a couple of business practitioners into the first class to make the
reality of these matters clear.
2. Consider Figure 1-1 and study it from the point of view of the group responsible for
making the sales forecast for a company. Sales forecasting is one of the most routine
and most fundamental business tasks. It directly impinges on budget policy at all
levels of the firm. Make the point that there are two directions to look:
where does it come from?  ->  Sales Forecast  ->  where does it go?
3. Following the previous point, try to establish (i) the context within which a “forecast” is made, and (ii) the environment in which a “forecast” is received. There is
more to forecasting than learning a mathematical method, getting data, running a
computer program, and reporting on extrapolated future values. Sales of a product
group take place within the context of sales of competing products (both internal
and external to the company). Sales depend on market demand. Market demand
depends on market conditions. And so on. It is a dynamic interlocking system out
there. When the sales forecast has been made it is passed on in the hierarchy, to
be examined, modified by expert judgment, repackaged, and passed on again. Forecasting is not a passive activity. The physical environment within which forecasts
are created is dynamic, and the organizational consumer of forecasts shares these
dynamics and adds its own human relations.
4. Table 1-1 offers opportunities for class participation. What forecasting scenarios
are hot topics right now—in the local environment (what will acid rain do to our
forests)? in the nation (will the national deficit ever decrease)? in any firm that you
know about (will our ski businesses survive in the uncertain weather)? And so on.
5. Sections 1/2 and 1/3 begin to introduce the jargon of the field of forecasting, and it
is important to establish some of the most commonly used terms. For example, a lot
will be said about time series methods and explanatory methods. Distinctions are
also made between quantitative methods and qualitative methods. And you can no
doubt think up other dichotomous categories. The purpose of these categorizations
is to aid in communication.
For example, with the two mentioned above, we can identify four cells:
                 quantitative                  non-quantitative
time series      time series, quantitative     time series, non-quantitative
causal           causal, quantitative          causal, non-quantitative
As more and more dichotomous categories are invented we can generate many, many
possible (methodology) cells—but some of them will not be possible. It might be
enough to try to think of examples for the four cells given above.
6. Be careful with the distinction between explanatory models and time series models.
There is some semantic confusion with this choice of words, because “explanatory
models” can deal with “time series” data. So the context has to help out. When we
say time series models or time series methods we usually mean to talk about one set
of (time-dependent) data and we will try to develop a model (an equation) which can
be thought of as the “generating process” of these data. We specifically will not look
at its relationship with other variables. When we say causal models we are specifically
looking for other variables (which may be time series) which offer explanations (or
linkages) to the main variable (which may also be a time series).
7. It is important to see forecasting as a process involving several steps as outlined in
Section 1/3. Many people focus on just Step 4: choosing and fitting models. To
emphasize the basic steps in a forecasting task, it is helpful to take a particular
example and lead a class discussion on what is involved in each of the five steps.
Chapter 2: Basic forecasting tools
Chapter 2 introduces some of the basic quantitative tools that will be used extensively
later. Therefore it is important that the material covered here is well understood.
1. The two examples in Section 2/1 will help to fix the distinction between explanatory
and time series methods.
2. The patterns discussed in Section 2/2/1 are important for later reference. In the
many time series to be dealt with in the book, it will become second nature to ask
(i) is there a trend? (ii) is there an annual pattern (seasonality)? (iii) is there a
longer (than one year) cycle?
It is important to emphasize that when we talk about “business cycles” we mean
something different from “seasonality” which refers to the annual pattern. Cycles
are almost always longer than seasonal patterns.
3. The plots in Section 2/2, particularly the time plot and the scatterplot, will be
used extensively in subsequent chapters. These also help students understand the
difference between explanatory and time series methods.
4. In dealing with the familiar descriptive statistics in Section 2/3, the following aspects
could be stressed.
(a) Cross-sectional versus time series data: Some students find it difficult to switch
their thinking from elementary descriptive statistics of cross-sectional data (e.g.,
the weights of 100 mice) to time series data which may have trend, seasonality
and longer term business cycles (e.g., housing starts). The mean of “housing
starts” does not convey much information because of the trend, seasonality and
cycle. Similarly, the variance of “housing starts” is hard to interpret. So in
Section 2/3, as each descriptive statistic is defined, it is useful to ask: is there
any difference in meaning when we talk about cross-sectional data or time series
data?
(b) Absolute values: The sum of the deviations from the mean for any data set
always equals zero. The sum of the “absolute” deviations from the mean does
better. But it is an awkward statistic to deal with. (For those with a little
calculus background, ask them to differentiate the MAD equation (2.3) with
respect to the mean, and the point will be established.) It is easier to deal with
the mean of the squared values. (Again, ask those with calculus background to
differentiate the MS equation (2.4)).
(c) Mean square versus variance: There is a tendency for current statistics texts
to define variance using (n − 1) in the denominator without explanation. Then
there is the inevitable question, “Why (n−1) and not n?” When we write down
formulas for computing summary numbers (statistics) the formulas don’t know
anything about assumptions, so from their point of view it makes no difference.
Only when a statistical model has been defined (using random variables with
their accompanying probability distributions) does it make any sense to talk
about “unbiased” or “biased” estimates, and then a meaning can be given to
the issue. To avoid some of the problems in this connection, we have chosen to
talk about the mean square as the simple average of squares, and the variance
will use (n − 1) in the denominator. This is a matter of convenience. Since
this issue is tied up with d.f. (degrees of freedom), it is useful to make a point
that d.f. are defined as (number of independent data points) minus (number of
parameters estimated). For example, if there are n = 20 students in the class
and each has a score on a test, then how many numbers must you sing out to a
stranger before they know all the scores? Twenty. So there are 20 d.f. for the
raw data. Now compute the mean. Find the deviation from the mean for all
20 students. How many of these deviations must be sung out before a stranger
can know them all? Nineteen. After hearing 19 of them, the last one must be
such that they sum to zero. One d.f. was lost in computing the mean.
(d) Bivariate statistics: Covariance and correlation are important concepts to understand. Note that variance is a special case of covariance (if you convert Y
to X). Understanding correlation is important in regression (Chapters 5 and
6) and in autocorrelation (Chapter 7).
(e) Autocorrelation: To give students a feel for the meaning of autocorrelation,
show some plots of a time series plotted against itself, lagged one period, then
lagged two periods, etc. The physical act of making one or two of these scatterplots is worth the time it takes.
Another worthwhile activity is to do Exercise 2.4 in class with students guessing
which plots align with which ACFs. Discussion is usually easy to stimulate with
students justifying their answers.
(f) Accuracy measurements: Dwell on the interpretation of these summary numbers, so that the distinctions between absolute statistical measures and relative
measures are clearly understood. For example, in dealing with MSE versus
MAPE, you could consider the X and F data in Table 2-9.
(i) What happens to the values of MSE and MAPE if the data were all multiplied by 2?
(ii) Similarly consider what happens to the values of MSE and MAPE if 100
was subtracted from all numbers.
In case (i) the MSE is going to be 4 times larger but the MAPE won’t change.
In case (ii) the MSE won’t change but the MAPE is considerably changed.
What happens if 119 is subtracted from all data?
(g) Theil’s U statistic: This is worth understanding. It becomes easy to use in
practice because when U < 1 the forecasting method is better than the naïve
one (using last period’s observation as the forecast for the next period). When
U > 1 it is worse than the naïve method. Unfortunately, not many computer
forecasting programs calculate Theil’s U.
(h) ACF of forecast error: It is good practice to look at the ACF of the errors obtained from any forecasting method (see Section 7/1). It is worth emphasizing
this throughout Chapters 4–8 whenever a forecast is calculated.
5. Throughout the book, we emphasize the calculation of forecast intervals wherever
possible because it provides a sense of how dependable a forecast is. Section 2/5
allows some simple forecast intervals to be computed using only the MSE. This
helps students understand the MSE better and introduces the basic concept of a
forecast interval.
6. The reason for giving a special section (2/6) to least squares is simply because it is
almost all-pervasive in the model building world. It has been found to be extremely
valuable in statistics (e.g., in the fitting of regression models to data).
7. Many forecasting tasks can be simplified by transforming or adjusting the data in
some way. It is helpful to ask students how they would transform or adjust a few
given data sets. This can lead to some interesting discussion and emphasizes the
need to understand the data before doing any forecasting.
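The two questions in point 4(f) can be checked numerically before class. The sketch below uses a small made-up data set (it is not the data of Table 2-9); the function names `mse` and `mape` and all of the numbers are illustrative assumptions.

```python
# Illustrative check of how MSE and MAPE react to rescaling and shifting.
# The observations and forecasts are invented for the demonstration.

def mse(actual, forecast):
    """Mean squared error: the simple average of the squared errors."""
    errors = [a - f for a, f in zip(actual, forecast)]
    return sum(e * e for e in errors) / len(errors)


def mape(actual, forecast):
    """Mean absolute percentage error (undefined if any actual value is 0)."""
    pct = [abs((a - f) / a) * 100 for a, f in zip(actual, forecast)]
    return sum(pct) / len(pct)


actual = [120.0, 135.0, 150.0, 160.0, 180.0]
forecast = [125.0, 130.0, 155.0, 150.0, 170.0]

# Case (i): multiply every number by 2.
doubled = ([2 * a for a in actual], [2 * f for f in forecast])
# Case (ii): subtract 100 from every number.
shifted = ([a - 100 for a in actual], [f - 100 for f in forecast])

for label, (x, f) in [("original", (actual, forecast)),
                      ("times 2", doubled),
                      ("minus 100", shifted)]:
    print(f"{label:10s} MSE = {mse(x, f):8.2f}   MAPE = {mape(x, f):6.2f}%")
```

The errors are unchanged by a shift, so the MSE stays put while the percentages are taken relative to much smaller values. If a shift brings any observation to zero, the percentage error for that period is undefined, which is worth raising when students try other constants.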
Chapter 3: Time series decomposition
Decomposition methods are among the oldest of all the forecasting procedures. They are
easy to understand, at least in principle, because most people dealing with time series
data assume the presence of trend, the influence of a business cycle, seasonality (if we’re
dealing with monthly or quarterly data, for instance), and the ever-present noise (the error
term, the disturbance term, or the random shock). The following points should be noted
in teaching this chapter.
1. The components implied by decomposition are invariably described as trend, cycle,
seasonality and noise (or other words to describe this uncontrollable part). When
we speak of trend it seems easy to understand, but in fact it is not all that clear. It
is often inextricably mixed up with the so-called cycle (which itself is not a mathematical cycle, such as a sine wave, but rather an irregularly shaped up and down
movement associated with “general business conditions”) and the only way trend can
be separated from cycle is by arbitrary definition of trend. Because of this complexity many decomposition methods (e.g., the Census II method) identify trend/cycle
as one component. Seasonality is less ambiguous and it refers to systematic patterns
that occur within the calendar year.
Suggestion: Have students come up with a written definition of these four components.
2. The ratio-to-moving averages method is easy to compute (see Table 3-6) and it is
good to see plots of the original data, the moving average and the ratio-to-moving
average. When the ratio-to-moving average values are portrayed as in Table 3-6 they
can be visualized in a seasonal plot (Section 2/2/2) which allows for the stability
of the seasonal pattern to be assessed. Students should be clear on the meaning of
each of the columns in Table 3-6.
3. The Wall Street Journal, Business Week, Fortune and other business magazines all
make repeated reference to seasonally adjusted time series and it is important that
all students know exactly what this means. The ratio-to-moving averages are in fact
seasonal indices plus the random noise component. By averaging these seasonal
indices for each month (or quarter as the case may be), the random component is
reduced, and the resulting seasonal index is a measure of the impact of the season.
By dividing the original data by this seasonal index we are left with seasonally
adjusted data, which still contains trend, cycle and noise. That is, the influence of
the season has been removed. The Census II method talks about preliminary and
final “seasonal adjustment factors” (same as “seasonal indices”), and preliminary
and final seasonally adjusted series.
4. If software is available for one or more of these decomposition methods, it is interesting for students to compare the results. In particular, investigate what the methods
produce when the series contains some unusual behavior such as a level shift or
some outliers. The classical decomposition method is not designed to cope with
such behaviors, but the Census II and STL methods both contain some robustness
facilities.
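The steps in points 2 and 3 (moving average, ratio-to-moving average, averaged seasonal indices, seasonally adjusted data) can be sketched in a few lines. This is a minimal illustration for a multiplicative quarterly series with an even seasonal period; the function names and the artificial series are assumptions, and it does not reproduce Table 3-6 or the refinements of Census II.

```python
def centered_moving_average(y, period):
    """Centred moving average for an even period (a 2 x period MA)."""
    half = period // 2
    out = [None] * len(y)
    for t in range(half, len(y) - half):
        window = y[t - half:t + half + 1]
        # The two end points get half weight so the average is centred on t.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        out[t] = total / period
    return out


def seasonal_indices(y, period):
    """Average the ratio-to-moving-average values for each season."""
    ma = centered_moving_average(y, period)
    ratios = [[] for _ in range(period)]
    for t, (obs, m) in enumerate(zip(y, ma)):
        if m is not None:
            ratios[t % period].append(obs / m)
    raw = [sum(r) / len(r) for r in ratios]
    scale = period / sum(raw)  # normalise so the indices average to 1
    return [index * scale for index in raw]


# Artificial quarterly series: linear trend times a fixed seasonal pattern.
true_seasonal = [0.8, 1.1, 1.3, 0.8]
series = [(100 + 2 * t) * true_seasonal[t % 4] for t in range(16)]

indices = seasonal_indices(series, 4)
adjusted = [obs / indices[t % 4] for t, obs in enumerate(series)]

print("estimated indices:", [round(i, 3) for i in indices])
print("seasonally adjusted:", [round(a, 1) for a in adjusted])
```

Because the series contains no noise, the estimated indices should land very close to the pattern used to build it, and the adjusted series should be close to the underlying trend. Adding noise or an outlier to `series` shows students why the averaging step matters.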
Chapter 4: Exponential smoothing methods
Exponential smoothing methods can be useful as an introduction to some of the ideas of
time series forecasting, particularly the concept of forecasts being weighted averages of
time-lagged observations. They are also useful forecasting methods in their own right.
1. These are time series methods as opposed to explanatory methods.
2. In dealing with time-dependent data the concept of a moving average is valuable
because it is dynamic. It moves with time.
3. Whereas moving averages involve equal weights over a set of observations, the simple
exponential smoothing (SES) method is fundamentally different in that it implies
unequal (exponentially decaying) weights.
Aside: You can engage the students in a discussion on how to weight past data in
making a forecast. Should the latest data count more than earlier data? When is
this true? (E.g., when older data was based on a different manufacturing process.)
When might it not be true? (E.g., when current data occurs during a strike.)
4. In order to appreciate the fact that all methodologies have built-in limitations, it is
useful to do what engineers typically do, namely, test the methods on some standard
types of input series. This can be demonstrated by constructing a simple series of 20
observations containing a level shift part way through the series. Then apply both
a moving average and SES to the data to see how each method accommodates the
step. This can be a most enlightening experience for students and a valuable base
for later work. Similar test series might contain a single outlier, or a trend.
5. Following the previous point, you can ask if SES and moving averages keep pace with
trend. And then discuss seasonality as a complicating factor. Can moving averages
and SES take care of seasonal indices?
6. Discuss Pegels’ two-way classification with the students to emphasize the difference
between linear and multiplicative trend and seasonality. The flexibility of Pegels’
classification has yet to be fully appreciated, and it is worth discussing some of the
cells other than those corresponding to SES (A-1), Holt’s method (B-1) and Holt-Winters’ method (B-2 and B-3). For example, consider cell C-3 which will often
outperform Holt-Winters’ method.
7. A very important point to establish for exponential smoothing methods is the fact
that an initialization process (for getting a method going) has to be defined. Since
the initialization procedure has an influence on all subsequent smoothed values it
has to be handled with care. Therefore, in deciding how well an SES model fits, for
example, it is wise to define a “test period” which excludes that early part of the
series which is still “settling down during the initializing phase”. Contrast this with
regression models where we can define “errors of fit” for all data points at once. For
this reason, we have given explicit statements about each strategy and some general
comments on alternative initialization strategies in Section 4/5/1. Note that these
are not the only ways of going about it. Students may be able to come up with their
own suggestions.
8. We have designed this chapter to be complete in the sense that the equations are
all given and fully worked examples are provided. Table 4-11 is something that you
might work toward. It gives in one place a comparison of how all the methods do on
one set of seasonal data. If your students have come to the course with their own
data sets they should be encouraged to work toward a table similar to this.
9. Also note that it is appropriate to discuss the extensive forecasting experiments
described in Chapter 11.
10. A good exercise in discussing ARRSES would be to ask students to generate a time
series using a slowly changing α value, and then do an ARRSES analysis to see if
the method comes close to what was simulated.
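The level-shift experiment in point 4 is easy to set up. The following sketch builds a 20-observation step series and produces one-step forecasts from a moving average and from SES. The window length, the value of alpha and the choice to initialize SES with the first observation are all assumptions made for the illustration.

```python
def moving_average_forecasts(y, k):
    """Forecast for period t is the mean of the k observations before t."""
    return [None if t < k else sum(y[t - k:t]) / k for t in range(len(y))]


def ses_forecasts(y, alpha):
    """One-step SES forecasts, initialized with the first observation."""
    forecasts = [y[0]]
    for t in range(1, len(y)):
        forecasts.append(alpha * y[t - 1] + (1 - alpha) * forecasts[t - 1])
    return forecasts


# Step input: level 10 for ten periods, then level 20 for ten periods.
series = [10.0] * 10 + [20.0] * 10

ma = moving_average_forecasts(series, k=4)
ses = ses_forecasts(series, alpha=0.3)

for t in range(9, 20):
    print(f"t={t:2d}  y={series[t]:5.1f}  MA(4)={ma[t]:6.2f}  SES={ses[t]:6.2f}")
```

The moving average reaches the new level exactly once its window contains only post-shift observations, whereas SES closes the gap geometrically and never quite gets there. Replacing the step with a single outlier or a linear trend gives the other test series mentioned in point 4.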
Chapter 5: Simple regression
Many students will already have had a first course in statistics and will have done
some simple regression—mostly in the context of cross-sectional data. Chapters 5 and 6
of this text should accomplish the following:
(a) Consolidate understanding of simple and multiple regression for cross-sectional data.
(b) Discuss the importance and limitations of the correlation coefficient.
(c) Discuss the use of regression in a forecasting (time series) context.
(d) Deal with the practical application of simple regression, multiple regression and econometric models.
The following suggestions will assist in teaching this material.
1. Discuss the data setup for simple regression, multiple regression and econometric
models. Mention that in econometric modelling the dependent and independent
variables become mixed up—in the sense that Y variables appear on the right hand
side of econometric equations as well as on the left hand side.
2. Review the details of simple regression of Y on X, make sure everyone knows the fundamental statistics (mean, variance, covariance, correlation, regression coefficients),
and then deal with the definition and role of the overall F test, the t-tests for individual coefficients and the sampling fluctuation of the coefficients.
3. The correlation coefficient is a very widely used statistic and therefore should be understood well. Mention that it is a measure of linear association, that it is therefore
unaffected by any linear transformation, that its sampling fluctuation is large for
small sample sizes (so beware of those regressions based on 10 observations!), and
that it can be severely affected by skewness (or outliers).
4. Emphasize the difference between regression for cross-sectional data and for time
series data. Cross-sectional regression can be useful in a forecasting context (e.g., the
automobile data in chapter 2) but time-indexed data and time series regression pose
special problems. The error terms in cross-sectional regression are usually assumed
independent, but in time series regression this independence is often suspect, and
in some cases (e.g., in dynamic regression models) the errors are carefully defined
not to be independent. The autocorrelation coefficient is simple to define and to
compute, but its sampling distribution is more difficult to handle than the sampling
distribution of the correlation coefficient.
5. Many textbooks talk about regression as a forecasting tool but very few actually do
forecasting with regression. For example, if we regress $Y_t$ on $X_{t-1}$,
$$Y_t = 3 + 5X_{t-1} + \text{(error)},$$
and we want to forecast $Y_{t+1}$, then the equation allows us to do that:
$$\hat{Y}_{t+1} = 3 + 5X_t.$$
We already know $X_t$, and so can obtain $\hat{Y}_{t+1}$. However, if we regress $Y_t$ on $X_t$:
$$Y_t = 3 + 5X_t + \text{(error)},$$
then in order to forecast $Y_{t+1}$ we will need to know $X_{t+1}$, and we do not know this.
So we will have to forecast $X_{t+1}$ before we can forecast $Y_{t+1}$.
6. Discuss the meaning of equations (5.3) and (5.4) for slope and intercept. The slope
is actually (covariance of X and Y ) divided by (variance of X) and the intercept is
the mean of Y minus the slope times the mean of X.
7. There are no assumptions involved in calculating a correlation coefficient. Equation
(5.10) for r is merely a formula. When it comes to regressing Y on X and some
statistical regression model is defined, then the correlation between X and Y , when
squared, has another useful interpretation. It is “the proportion of variance explained
by the linear relationship between X and Y ”. What this means is that, knowing the
X values we will be able to recover a certain proportion of the variance of Y , and
this proportion is $r^2$.
8. The r value is a measure of linear association so point out the message in Figure 5-7
when a strong nonlinear association cannot be picked up by r. Note also that for
small samples the correlation coefficient is “notoriously unstable” (a phrase Kendall
and Stuart use in the Advanced Theory of Statistics, Vol. 1). Finally, emphasise how
skewness can have a profound effect on r. The King Kong example (see Figure 5-8)
is a useful illustration of this effect and the answers to exercise 5.6 present further
evidence.
9. It is a good idea to contrast equations (5.13) and (5.14) and ask the question: “Where
are the random variables in each equation?” In (5.13) there is only one random
variable—ε. In (5.14) there are three random variables—a, b and e. The values of
a and b are estimates for the unknown parameters, α and β in (5.13). This is why
we can define a standard error for the slope and a standard error for the intercept—
standard errors which are needed to define t tests for the slope and intercept.
10. Emphasise that the F statistic involves a numerator degree of freedom and a denominator degree of freedom, and make sure that students know how to read the
F tables (Appendix III, Table C). A pragmatic point is that the F test should be
done first when appraising a regression analysis, and afterward the individual t tests
can be examined. In the case of simple regression there is no difference, because the
t test is a special case of the F test, but in multiple regression (chapter 6) this is
important.
11. In simple regression there is an intimate connection between the slope and the intercept. Since the least squares regression line always goes through the mean of X
and the mean of Y , it stands to reason that if the slope is changed the intercept is
changed and vice versa. If the means of X and Y are both positive then an increase
in the slope will cause a decrease in the intercept, and vice versa.
Equations (5.17) and (5.18) should be studied. Note in (5.17) that the second term
under the square root sign will be small if the denominator (representing the “spread
of the Xs”) is large relative to the numerator (which is the mean of X). In (5.18)
the standard error of the slope depends on how spread out the Xs are. If they are
well spread out, the standard error is small.
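Points 6, 7 and 11 can be verified together on a tiny data set. The sketch below computes the slope as covariance over variance, the intercept from the two means, and checks that the squared correlation equals the proportion of variance explained. The data and the function names are invented for illustration.

```python
def mean(values):
    return sum(values) / len(values)


def fit_line(x, y):
    """Least squares intercept and slope for a regression of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    # Covariance over variance; the divisor (n or n - 1) cancels in the ratio.
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


def correlation(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [8.1, 12.9, 18.2, 22.8, 28.0]

a, b = fit_line(x, y)
r = correlation(x, y)

fitted = [a + b * xi for xi in x]
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
sst = sum((yi - mean(y)) ** 2 for yi in y)
explained = 1 - sse / sst

print(f"intercept = {a:.3f}, slope = {b:.3f}")
print(f"r squared = {r * r:.6f}, proportion explained = {explained:.6f}")
print(f"line at mean of X = {a + b * mean(x):.3f}, mean of Y = {mean(y):.3f}")
```

The last line of output illustrates the remark in point 11 that the fitted line always passes through the two means, which is why changing the slope forces a change in the intercept.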
Chapter 6: Multiple regression
1. As for many other chapters, it will be helpful here to have a readily available regression package for students to work on, so that they can check various things in the
chapter and can run their own data through various regression models.
2. If at all possible, students should run their own data through the various analyses
for maximum understanding. We sometimes adapt the exercises at the end of the
chapter to use data sets of interest to our students.
3. In multiple regression for cross-sectional data it is important to point out that the
significance of individual coefficients is contingent upon the other regressors present
in the regression. We emphasise this and say: “the coefficient for $X_3$ is significant
in the presence of the other regressors”.
4. In real world regression problems a considerable amount of time is spent selecting
independent variables and coming up with a reasonable model specification. To
illustrate this we have used a mutual savings bank data set and have done a detailed
analysis of most of the stages that led to a model which was actually used by a large
metropolitan bank. The example is sometimes a little complicated, but we feel it is
worth the effort to get into it in detail.
5. In the notes on chapter 5 we pointed out that regression is often spoken of as a
forecasting methodology, but seldom actually used explicitly in a forecasting context.
In this chapter we carry the bank study through to its conclusion by forecasting
with a final regression model. We explore the difficulties of having to forecast the
independent variables before we can forecast the dependent variable.
6. As in chapter 5, it is worthwhile to understand the cross-sectional regression model
thoroughly, and then consider where the time series regression applications violate
certain assumptions. Since this text is not addressed to formal statisticians, it is
enough to discuss the implications of correlated errors, improper specification (e.g.,
linear when it might be curvilinear, or two regressors when it should be four regressors), multicollinearity, etc., and refer interested students to texts such as Draper
and Smith (1981) for more details.
7. Table 6-1 and Table 6-2: Take time to get to know these data sets because they will
be used a lot during the chapter. Have students graph the data sets and keep them
handy for class discussion.
8. The Durbin-Watson statistic is described in equation (6.9) and Table 6-7. Students
don’t have too much difficulty learning how this statistic is computed. However,
learning to use the D-W tables (Appendix III, Table F) is not so straightforward.
Please spend time going through a couple of examples in using the tables.
9. Selecting variables for inclusion in regression is a meaty subject and we give only an
introduction to the major ideas. In the context of the bank example we go through
some of the procedures without explaining all the details (for example, we talk about
using principal components to get a short list of variables). However, any serious
multiple regression analysis will need to consider variable selection carefully.
10. Section 6/4 (Multicollinearity) gives some information that is not often given about
multicollinearity. We hear too often that “multicollinearity is present” when the
highest correlation among any pair of regressors is only 0.7, say. And we hear too
often that “multicollinearity is not a problem” when there are no large correlations
(i.e., not larger than say 0.5). Both of these statements are incorrect. Table 6-12
shows quite clearly that even when the correlations among regressors never get bigger
than 0.333 we can have perfect multicollinearity.
11. Standard error formula (6.13) is the multivariate equivalent of (5.19). It is a little
harder to interpret because it is written in matrix notation, but it should be part of
a regression package so that confidence intervals can be determined.
12. Table 6-14 should be studied very carefully to ensure students understand how these
forecasts are obtained. It takes a little while to get the time intervals straight, but
it’s a real issue.
13. In discussing econometric models (Section 6/6) we have only given an introduction to
how econometric models are related to the multiple regression models which are the
subject of this chapter. Our aims are to give students an appreciation for econometric
models, their breadth and depth, and the need for specialized skills to develop and
use them effectively.
The topic of econometric modelling is itself an extensive field and we have not chosen
to cover it in this book. An instructor may choose to include other materials on
econometric methods (such as Johnston, 1984; Judge et al., 1988; or Pindyck and
Rubinfeld, 1991) to complement the materials in this chapter. A useful introductory
perspective is provided by Aykac and Borges “Econometric methods for managerial
applications” in the Handbook of Forecasting, Makridakis and Wheelwright (editors),
(New York: Wiley and Sons, 1982).
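The claim in point 10, that small pairwise correlations do not rule out perfect multicollinearity, can be demonstrated with a constructed example. The sketch below is not a reproduction of Table 6-12; it uses four indicator (dummy) variables for four equally sized groups, an assumption chosen because the arithmetic is easy to follow.

```python
def correlation(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5


# Eight observations, each belonging to one of four groups.
groups = [0, 1, 2, 3] * 2

# One indicator column per group: 1 if the observation is in that group.
columns = [[1.0 if g == k else 0.0 for g in groups] for k in range(4)]

pairwise = {
    (i, j): correlation(columns[i], columns[j])
    for i in range(4)
    for j in range(i + 1, 4)
}
for (i, j), value in pairwise.items():
    print(f"corr(X{i + 1}, X{j + 1}) = {value:.3f}")

# Every row sums to 1, so the four columns add up to the intercept column.
row_sums = [sum(col[r] for col in columns) for r in range(len(groups))]
print("row sums:", row_sums)
```

No pair of regressors has a correlation larger than one third in absolute value, yet a regression containing an intercept and all four columns has no unique least squares solution, because the columns sum exactly to the column of ones.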
Chapter 7: The Box-Jenkins methodology for ARIMA models
1. We have chosen to present this material in a different order from that found in most
other textbooks. Students always find this material a little difficult at first, and we
have found the order given in the textbook the most successful approach in leading
students through ARIMA modeling.
2. Section 7/1 allows students to firmly grasp the idea of white noise and the use of
the ACF and PACF before considering ARIMA models. The white noise tests can
be applied to the residuals from regression models or exponential smoothing models.
Introducing residual analysis in the context of these earlier forecasting methods
emphasises that these ideas are not only applicable to ARIMA models, but to any
forecasting methodology. It also allows students to become familiar with some of
the tools used in ARIMA modelling before having to learn about ARIMA models
themselves.
3. Next we introduce the ideas of stationarity and differencing in Section 7/2. We find
it better to introduce these ideas before ARMA models (rather than after as is often
done) because it allows ARMA models to be applied to a much wider range of time
series from the start. A common approach is to consider only stationary series at
first and non-stationary series later, but this gives students the initial impression
that ARMA models are not very widely applicable. We find that the approach given
in the book leads students to be more positive about these very useful models.
4. You can anticipate that we will be wanting to use the backward shift operator notation later (particularly in chapter 8) and should make a decision whether to introduce
it at this time or not. Students do not seem to find this difficult to digest.
5. Students usually find autoregressive models relatively easy to understand as an extension of multiple regression. This is the way we normally introduce them. Next
consider the connection between exponential smoothing and AR processes. An exponential smoothing process also involves weighted past values and so is a special
case of an AR process.
6. Moving average models are usually more difficult to understand at first. Some instructors try to connect them with previous forecasting methods such as exponential
smoothing. We have not found that students find this particularly helpful. Instead
we just say they are like a multiple regression, but with past errors as the “explanatory variables”. Get the point across that linear functions of past values of the error
series are called MA processes, and linear functions of past values of the observations
are called AR processes.
7. Note the potential confusion between moving average models, moving average
smoothing and moving average forecasting. Students find this unfortunate duplication of terminology difficult and it needs to be explained very carefully.
8. If you have the facilities it is recommended that you have students generate time
series using some of the simpler models so that they really know what is implied.
Students will learn more about the models if they have to generate data using a
spreadsheet package, than if they use a package with built-in data generation facilities. Generating data with known properties and then studying the shape of the
theoretical ACF and PACF gives a valuable insight into the BJ models.
Make a point about learning in one direction and analyzing real data in the opposite
direction. In schematic form, this is:
(a) generate data with known properties
(b) study the theoretical ACF and PACF
(c) store the simulated data and then analyze it.
See if the empirical ACF and PACF match the
properties that we started with.
In real world data, it is the other way around:
(a) analyze the observed data
(b) study the empirical ACF and PACF
(c) try to identify a theoretical underlying model
that could have given rise to the observed data
9. Note that the constant term $c$ in these models is not the same as the mean of the
time series if there is an AR component. For example, if the mean of $Y_t$ is $\mu$, then
$$Y_t - \mu = \phi(Y_{t-1} - \mu) + e_t$$
so that $c = \mu(1 - \phi)$. In general, the constant term will be
$$c = \mu(1 - \phi_1 - \phi_2 - \cdots - \phi_p).$$
For an MA model, the constant is equal to the mean of the series.
10. It will become increasingly important for students to be able to write out the equation
for any ARIMA model, so practice at this time is important.
11. The ACF and PACF are used repeatedly in later sections of this chapter, so they
should be learned thoroughly. Give students some sample ACFs and PACFs and
have them guess the type of series they come from.
12. Before proceeding to Section 7/5, it would be wise to have students very comfortable
with the ideas of stationarity, ACFs, PACFs, ARIMA models and how they are all
related. They should be able to write the equations for simple ARIMA models.
13. In trying to generate time series that are from an ARIMA(p, d, q) process, students
will need to be aware of the restrictions on the values that the AR and MA coefficients
can take on.
14. Students should not be fooled into believing that it will be easy identifying ARIMA
models for real data series. As soon as the model becomes a mixed model—even
the very simplest ARIMA(1, 0, 1)—the shape of the ACF and PACF can become
confusing. It is good to learn the properties of the elementary models, and it is good
to remember that they will seldom make themselves known unequivocally in real
data series.
15. Estimation (Section 7/4) is often taught in two parts: preliminary estimation and
final estimation (using some iterative process such as Marquardt’s algorithm). Computationally, this is the way it must be done, but from the point of view of a forecaster
using a computer package, the computational details are not relevant. Therefore, we
have focussed on the results which are obtained from a computer package as these
are of most relevance to practising forecasters.
16. The use of the AIC in Section 7/6 is not common in introductory forecasting books,
but we have found it extremely useful in practice and many computer packages are
now giving it as part of the standard output.
17. The conversion of an ARIMA equation into a form suitable for forecasting (Section
7/8/1) takes a little bit of algebraic multiplication and rearrangement of terms.
However, since all forecasting packages will provide forecasts from an ARIMA model
automatically, students will probably never need to do the calculation themselves.
The purpose of including the details in this section is to show how the equations give
forecasts, something which may not be immediately obvious to students, particularly
when there is an MA component.
18. The material in Section 7/8/3 is very poorly understood, even by some experienced
forecasters. The effect of differencing on the forecasts is worth understanding. Too
often differencing is carried out without thought for its implications later on.
19. Students learn most from working through an analysis of a time series from start
to finish. Exercises 7.8 and 7.9 are useful for this purpose. See also Section C/2/7
which can be used as a student project in longer forecasting courses as it enables
each student to choose a different set of data to analyze.
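The data-generation exercise in point 8 and the remark about the constant term in point 9 can be combined in one short simulation. The sketch below generates an AR(1) series with a chosen mean and compares the sample mean and the sample autocorrelations with their theoretical values. The parameter values, the seed and the function names are assumptions for illustration; a spreadsheet version would follow the same recursion.

```python
import random


def simulate_ar1(mu, phi, sigma, n, seed=0):
    """Generate Y_t = c + phi * Y_{t-1} + e_t with c = mu * (1 - phi)."""
    rng = random.Random(seed)
    c = mu * (1 - phi)
    y = [mu]  # start the series at its mean
    for _ in range(n - 1):
        y.append(c + phi * y[-1] + rng.gauss(0, sigma))
    return y


def sample_acf(y, lag):
    """Sample autocorrelation of y at the given lag."""
    n = len(y)
    m = sum(y) / n
    num = sum((y[t] - m) * (y[t - lag] - m) for t in range(lag, n))
    den = sum((v - m) ** 2 for v in y)
    return num / den


mu, phi = 50.0, 0.6
y = simulate_ar1(mu, phi, sigma=1.0, n=5000)

print(f"constant c = {mu * (1 - phi):.1f}, series mean = {sum(y) / len(y):.2f}")
for k in (1, 2, 3):
    print(f"lag {k}: sample ACF = {sample_acf(y, k):.3f}, theory = {phi ** k:.3f}")
```

The constant is 20 while the series fluctuates around 50, which makes the distinction in point 9 concrete. Shortening the series to 50 or 100 observations shows how noisy the empirical ACF becomes, a useful lead-in to point 14.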
Chapter 8: Advanced forecasting models
1. There is a lot of material in this chapter, and the instructor may wish to select
only a few topics to cover. In a shorter course, we suggest omitting Section 8/5
(Multivariate autoregressive models) and Section 8/6 (State space models). Please
note that Sections 8/1, 8/2 and 8/3 are sequential. Therefore it is important to
cover Section 8/1 (Regression with ARIMA errors) well before going on to Sections
8/2 (Dynamic regression models) and 8/3 (Intervention analysis).
2. The approach we have adopted for Sections 8/1 and 8/2 is very different from that
found in most books. We have followed the Pankratz approach (and terminology)
to modelling rather than the more traditional Box-Jenkins’ approach. We have
taught using both approaches many times and have found students find the Pankratz
approach very much easier to follow and use. In our own consulting work, we have
also found it a much simpler methodology when fitting dynamic regression (transfer
function) models.
3. It is essential when considering Sections 8/2 and 8/3 that students are comfortable
with the backshift operator notation.
4. Unfortunately, there are not many software packages which allow the range of models
covered in this chapter to be fitted. We have taught the material to a range of
students using the point-and-click interface available in the SAS Forecasting system. It
provides particularly good facilities for dynamic regression and intervention models.
We have had most success in teaching this material through case studies with the
students spending most of the class time doing the analysis on PCs.
5. After studying Sections 8/1 and 8/2, have the students find examples in the real
world where one input variable influences another variable dynamically over future
time. One of our students came up with three series that seemed to go round robin:
monthly gas prices, domestic autos produced and autos sold.
6. For intervention analysis, a good class project is to ask students to read an article
involving the application of intervention analysis, then prepare their own report or
oral presentation on what was done. We have done this with the Ledolter and Chan
(1996) article with good results.
7. For Sections 8/4 through 8/7, we only provide a brief introduction to the ideas involved, plus some applications. Our aim here is to provide students with enough
information to know when these models might be applicable and in what circumstances they might be useful. If students wish to use these models, they will need
to learn much more about them than is described in our book.
8. An interesting activity is to have students hunt for illustrations in the literature
which make use of one or more of the methods covered in this chapter. Each student
can give a brief presentation based on one application and lead a discussion on
whether the model was appropriate to the problem.
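As a quick refresher on the backshift notation mentioned in item 3 above, the following identities may be worth putting on the board before starting Sections 8/2 and 8/3. The symbols Y_t, X_t and the omega weights are our own generic notation for this sketch, and the sign convention for the weights may differ from the one used in the text.

```latex
% Definition of the backshift operator B
B Y_t = Y_{t-1}, \qquad B^k Y_t = Y_{t-k}

% First difference and seasonal difference (monthly data)
(1 - B) Y_t = Y_t - Y_{t-1}, \qquad (1 - B^{12}) Y_t = Y_t - Y_{t-12}

% A polynomial in B applied to an input series X_t gives a distributed lag
(\omega_0 + \omega_1 B + \omega_2 B^2) X_t
  = \omega_0 X_t + \omega_1 X_{t-1} + \omega_2 X_{t-2}
```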
Chapter 9: Forecasting the long term
1. The first idea to get across to the students is that of long-term
mega-trends. A good way of doing so is to present Figure 9-1 and ask if the data
show a trend (most students will say “No”), then present Figure 9-2 and ask the same
question. At this point several students will say that 14 years is enough to establish
a long-term trend. One can then show Figure 9-3, which indicates that the 14 years
of Figure 9-2, shown as a shaded region of Figure 9-3, are merely a small part of
Figure 9-3. Finally, one can show Figure 9-4 and discuss why mega-trends can only
be established by going back to the beginning of the Industrial Revolution (that is
around 1800). Another figure which can be used to help illustrate this starting date
as well as the persistence of mega-trends is Figure 9-6 which shows wheat prices in
constant £ and goes back to the middle of the 13th century.
2. Once long-term mega-trends have been identified they can be extrapolated unless we
believe that they will change due to some other revolution similar to the industrial
one. If that is the case then we have to make our predictions not by extrapolation
but by using analogies or by making various scenarios about the implications of large
changes like those of the forthcoming Information Revolution.
3. When forecasting for the long term, deviations around the long-term mega-trends are
of critical importance as cycles can last for many years or even decades. Moreover,
since cycles are mostly random walks we have to go beyond pure quantitative models
to predict them. This is a point worth making and can be illustrated by generating
random numbers, cumulating their effects, and showing the result on a graph. Such
graphs show that turning points cannot be predicted quantitatively, since the series
are random walks.
4. Chapter 9 has a lot of figures that usually generate a great deal of interest from
students. The way to present them is by discussing the implications if they are
extrapolated in the long run, ending up with the question of what will happen when
our buying power increases (at a double rate since real prices drop exponentially and
real income increases exponentially) and we get a situation of over-abundance, while
at the same time there are huge inequalities between rich and poor nations, and between
rich and poor citizens within single nations.
5. The discussion of such implications, as well as of those that come up when
talking about analogies and scenarios relating the Industrial Revolution to the forthcoming
Information Revolution, generates great interest and strong opinions, which provide
the basis for a lively debate.
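The random-walk demonstration suggested in item 3 above can be prepared in a few lines. The following is a minimal sketch, not a prescribed procedure: the function names, the choice of standard normal shocks, and the seeds are all our assumptions. The plotting helper needs matplotlib, which is imported only when that helper is called.

```python
import random
from itertools import accumulate


def random_walk(n_steps, seed=None):
    """Cumulate n_steps independent N(0, 1) shocks into a random walk."""
    rng = random.Random(seed)
    shocks = [rng.gauss(0.0, 1.0) for _ in range(n_steps)]
    return list(accumulate(shocks))


def turning_points(series):
    """Return the indices where the series switches between rising and falling."""
    points = []
    for i in range(1, len(series) - 1):
        before = series[i] - series[i - 1]
        after = series[i + 1] - series[i]
        if before * after < 0:      # direction of change reverses at i
            points.append(i)
    return points


def plot_walks(n_walks=4, n_steps=200, seed=0):
    """Plot several walks for class display (requires matplotlib)."""
    import matplotlib.pyplot as plt

    for k in range(n_walks):
        plt.plot(random_walk(n_steps, seed=seed + k), label=f"walk {k + 1}")
    plt.xlabel("Time period")
    plt.ylabel("Cumulated random shocks")
    plt.legend()
    plt.show()
```

Calling `plot_walks()` draws four walks on one graph. Students can be asked to mark where they think each "cycle" will turn next before the following segment is revealed.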
Chapter 10: Judgmental forecasting and adjustments
1. One way of introducing the topic of judgmental forecasting is to give Figure 10-4 to
the class and then ask them to make forecasts after consulting the figure. Tell one
third of the people that the product shown in the figure is mature, the second third
that it is old, and the final third that it is new. The results of their forecasts can
be summarised and presented. They are usually similar to those shown in Figure
10-5, which indicates how preconceived ideas are used and how they can bias
the forecasting process (after all, it can be pointed out that the great majority of new
products fail after a couple of years).
2. Large errors in judgmental forecasts can also be illustrated by comparing the performance of professional investment managers with that of randomly selected stocks.
The consistent under-performance of expert managers is remarkable and can be used
as the basis for discussing what is wrong, as well as for illustrating how one can improve
investment returns without having to pay any fees to professional experts by simply
selecting bonds, stocks and other investments randomly. In addition one can discuss
why people prefer “experts” to manage their investments (obviously, they feel more
secure by doing so, or alternatively they think that they reduce their uncertainty)
while clearly such a choice results in smaller returns and extra fees.
3. Another interesting topic in Chapter 10 is the use of decision rules instead of intuitive,
global judgment when the judgmental inputs can be quantified. Again there is
a lot of material for interesting discussion starting with the finding that decision
models in the form of multiple regression equations can predict the performance
(the average GPA) of candidates for universities more accurately than admissions
officers can. This and similar types of decision rules can therefore be discussed as ways
of improving future-oriented decision-making.
4. The last part of this chapter deals with ways of debiasing decision making so that
the advantages of both quantitative models and judgment can be exploited while
avoiding their disadvantages.
5. The following is a list of judgmental exercises (there are two versions: one to be
given to half the class and the second to the other half). These exercises provide
an excellent way to show the students their biases as their answers from the two
versions vary considerably.
Judgmental exercises
1. What is the percentage of countries in the UN that are African? To make your
estimate, I would suggest that you start with a value of 65% (this percentage was
found in the computer by generating a random number between 0 and 100). First
decide whether this value is too high or too low—then move upward or downward
from that value to what you feel is the true value.
Your final estimate as to the true percentage of African nations in the UN is ________.
2. A psychological test was administered to a group of 100 people. The group consisted
of 30 engineers and 70 lawyers. The following descriptions were obtained for Peter
Jones:
“Peter Jones is of high intelligence and exhibits a strong drive for competence. He has a need for order and clarity and for neat and tidy systems in
which every detail finds its appropriate place. His writing is enlivened by
somewhat corny puns and by flashes of imagination. He seems to have little feel and little sympathy for other people and does not enjoy interacting
with others. Self-centred, he nonetheless has a deep moral sense.”
If you had to place a bet on whether a participant in the test named Peter Jones
was an engineer or a lawyer, what would you say?
Peter Jones is an engineer ________
Peter Jones is a lawyer ________
Please put a cross on the appropriate line.
3. You are the chief executive officer of a company faced with a difficult choice. Because
of worsening economic conditions, 12,000 people will need to be fired to reduce the
payroll costs and avoid serious financial problems. Two alternative programs to
combat the firings have been proposed to you. The estimates of the consequences of
the programs are as follows:
• If program A is adopted, 4,000 jobs will be saved.
• If program B is adopted, there is a two-thirds probability that no jobs will
be saved and a one-third probability that 12,000 jobs will be saved.
Which of the two programs would you select?
A [ ]   B [ ]
Please tick the appropriate box.
4. The figure below shows the sales of “Electrack”, a video game produced by Jeutronics, a medium sized French toy company. Provide optimistic, most likely and
pessimistic forecasts for the year 2001.
Year    Sales
1993     4,433
1994    60,298
1995    67,884
1996    89,512

[Figure 1: Actual sales of “Electrack”. Horizontal axis: years 1993 to 2001; vertical axis: sales, 0 to 250,000.]
Your Forecasts of “Electrack” Sales in 2001:
Pessimistic ________
Most Likely ________
Optimistic ________
5. You are in a store about to buy a new watch which will cost 350FF. As you wait for
the sales clerk, a friend comes by and tells you that an identical watch is available in
another store two blocks away for 200FF. You know that the service and reliability
of the other store are just as good as this one. Will you travel two blocks to save
150FF?
6. FINISHED FILES ARE THE RESULT OF YEARS OF SCIENTIFIC STUDY
COMBINED WITH THE EXPERIENCE OF YEARS
Please indicate the number of Fs which appear in the above sentence. ________
How confident are you of your above answer? Indicate your confidence on a scale of
0 to 100 with 0 indicating no confidence and 100 indicating full confidence. ________
How many times did you read the sentence “FINISHED . . . OF YEARS”? ________
Judgmental exercises
1. What is the percentage of countries in the UN that are African? To make your
estimate, I would suggest that you start with a value of 10% (this percentage was
found in the computer by generating a random number between 0 and 100). First
decide whether this value is too high or too low—then move upward or downward
from that value to what you feel is the true value.
Your final estimate as to the true percentage of African nations in the UN is ________.
2. A psychological test was administered to a group of 100 people. The group consisted
of 30 engineers and 70 lawyers. If you had to place a bet on whether a participant
in the test named Peter Jones was an engineer or a lawyer, what would you say?
Peter Jones is an engineer ________
Peter Jones is a lawyer ________
Please put a cross on the appropriate line.
3. You are the chief executive officer of a company faced with a difficult choice. Because
of worsening economic conditions, 12,000 people will need to be fired to reduce the
payroll costs and avoid serious financial problems. Two alternative programs to
combat the firings have been proposed to you. The estimates of the consequences of
the programs are as follows:
• If program A is adopted, 8,000 people will be fired.
• If program B is adopted, there is a one-third probability that nobody will
be fired and a two-thirds probability that 12,000 people will be fired.
Which of the two programs would you select?
A [ ]   B [ ]
Please tick the appropriate box.
4. The figure below shows the sales of “Electrack”, a video game produced by Jeutronics, a medium sized French toy company. Figure 2 shows the most likely predictions
from a widely-used computerized mathematical model for new products. After having looked at Figures 1 and 2, provide an optimistic, most likely and pessimistic
forecast for the year 2001.
Year    Sales
1993     4,433
1994    60,298
1995    67,884
1996    89,512

[Figure 1: Actual sales of “Electrack”. Horizontal axis: years 1993 to 2001; vertical axis: sales.]
[Figure 2: Actual and predicted sales of “Electrack”. Horizontal axis: years 1993 to 2001; vertical axis: sales.]
Your Forecasts of “Electrack” Sales in 2001:
Pessimistic ________
Most Likely ________
Optimistic ________
5. You are in a store about to buy a new video camera that costs 4000FF. As you
wait for the sales clerk, a friend comes by and tells you that an identical camera is
available in another store two blocks away for 3850FF. You know that the service
and reliability of the other store are just as good as this one. Will you travel two
blocks to save 150FF?
6. FINISHED FILES ARE THE RESULT OF YEARS OF SCIENTIFIC STUDY
COMBINED WITH THE EXPERIENCE OF YEARS
(Please do not read the above sentence again)
Please indicate the number of Fs which appear in the above sentence. ________
How confident are you of your above answer? Indicate your confidence on a scale of
0 to 100 with 0 indicating no confidence and 100 indicating full confidence. ________
Chapter 11: The use of forecasting methods in practice
1. The material of Chapter 11 relates to that of Chapter 10. The major question which
arises at the beginning of the chapter is the choice between judgmental and statistical
forecasting methods. As the quote by Sanders and Manrodt explains:
“Like past investigations (surveys) we found that judgmental methods are the
dominant forecasting procedure used in practice.”
This means that there is a great deal of potential improvement as managers now
realize the potential for higher forecasting accuracy (and therefore reduced costs)
and they are continuously pushed, at the same time, to operate more efficiently and
effectively in order to reduce their operational cost. Thus the time is right to persuade them of the considerable benefits they can obtain by using the knowledge and
experience we have accumulated which clearly indicates the benefits from statistical
forecasting and how it can be best integrated (this is also the major topic for the
next chapter) with judgmental predictions.
2. The findings in Section 11/2 are very useful and relevant in putting forecasting
on a practical footing. In this section, the major empirical evidence is
summarized, and various tables and figures are available to back up the findings.
The major question for forecasting users is then how to select the most appropriate
method for their specific situation. This topic is discussed in Section 11/3, which lists
the factors involved. It is worthwhile for the instructor to present each of these factors
and discuss them in class. Obviously a critical factor is the last one (the number and frequency
of forecasts). This factor points to the need for simple methods that can be
completely automated when the number of forecasts required is very large and they
are needed frequently.
3. In the absence of clear factors, when guidelines are not obvious, or in case of doubt
as to what method to select, the best alternative is to combine three or four simple
methods and use their average as a way of predicting the future. As is well known
from many empirical studies, such a simple average of the combined forecasting
methods is more accurate than the individual methods being combined, while
at the same time the variance of its forecasting errors is smaller than that
of the individual methods involved. Combining can, therefore, be presented as a
practical alternative which improves forecasting accuracy and reduces the chances
of errors (in particular large ones).
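The combining idea in item 3 above can be sketched in code as follows. This is an illustration under our own assumptions, not the procedure of any particular study: the three methods chosen (naive, overall mean, simple exponential smoothing with a smoothing constant of 0.3), the function names, and the toy data are all hypothetical. The sketch shows only how a simple average is formed and does not itself demonstrate any accuracy gain.

```python
def naive_forecast(history):
    """Forecast the next value as the last observed value."""
    return history[-1]


def mean_forecast(history):
    """Forecast the next value as the mean of all observations."""
    return sum(history) / len(history)


def ses_forecast(history, alpha=0.3):
    """One-step forecast from simple exponential smoothing.

    The level is initialised at the first observation; alpha = 0.3 is an
    arbitrary illustrative smoothing constant.
    """
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def combined_forecast(history, methods):
    """Simple (equally weighted) average of the individual forecasts."""
    forecasts = [method(history) for method in methods]
    return sum(forecasts) / len(forecasts)


history = [10, 12, 14]              # hypothetical data
methods = [naive_forecast, mean_forecast, ses_forecast]
print(combined_forecast(history, methods))
```

In class, students can apply the same functions to a holdout sample and compare the errors of the combination with those of each individual method.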
Chapter 12: Implementing forecasting: its uses, advantages, and
limitations
1. In Chapter 12 it is important to emphasize what can and cannot be predicted, or
in other words to present and discuss the limits of predictability. This can be done in
terms of short-, medium- and long-term predictions, as the horizon of our forecasts
presents different challenges and problems as far as predictability is concerned. Critical in such a discussion is the medium term, in which the ups and downs
of business cycles must be predicted. Such predictions are extremely difficult and present a major challenge for businesses when they attempt to make budget estimates. The same is true
for the longer run (18 months to 5 years) when predictions for the five-year business
plan are made and when longer-term cyclical deviations around the long-term trend
must be dealt with. The topic of predictability, or the lack of it, can be related to
the introductory chapters of the book and to our experience from the forecasting
practice (including the findings from the surveys among forecasting users presented
in the previous chapter).
2. The second topic of Chapter 12 is the organisational side of forecasting: the
problems encountered in organisations that use forecasting methods, and the possible
solutions to those problems. There is enough information in the corresponding part of Chapter 12 to
describe these problems and discuss suggestions for solving them satisfactorily.
3. The third section of Chapter 12 (Extrapolative predictions versus creative insights)
discusses the role and value of forecasting beyond its operational applications. As the
title implies, its greatest value is when the forecasts are creative in nature, which
by definition means that they cannot be based on simply extrapolating historical
information. On the contrary it may be necessary to go against conventional wisdom
in order to come up with creative insights about future changes or what the future
might hold.
4. In the last part of this chapter the instructor can discuss and possibly develop
his/her own ideas about how forecasting is going to evolve in the future. Central
to such a conception will have to be the creation of a learning process which will
result in organisational learning (rather than each individual’s forecasting
experience resting with that person and disappearing when he or she
changes jobs or company). Creating learning about forecasting is more practical
and cost efficient these days through the use of groupware (or intranets) which
allow the people in organisations working on forecasting not only to exchange ideas,
information and inside knowledge, but also to record the forecasting process they
have been using as well as their successes and failures so that they can be reviewed in
the future by themselves or others in ways that can enhance learning. In other words
ways must be developed which can help organisations to improve their forecasting
process by knowing and avoiding past mistakes while using practices that have
been found to be successful in the past.
C/Additional materials for teaching
forecasting
This part suggests additional materials that might be used to complement the contents
of Forecasting: Methods and Applications, 3rd ed., in a teaching situation.
Since cases can be a valuable addition to a forecasting course and yet often represent
a very different style of teaching from straight lectures and problem sets, Section C/1
outlines ways in which cases can be effectively integrated into the teaching of forecasting.
Section C/2 provides some suggestions for special project assignments. These provide a
context for forecasting but are shorter than field-based cases. Lastly, Section C/3 consists
of exam questions.
C/1 Using cases in teaching forecasting
Many different types of cases can be used to meet very different purposes in teaching.
Briefly, these can be grouped into three categories. The first would be case exercises in
which the case is simply an expanded problem (that is, what was traditionally described
as a “story” problem) providing data and their context for the students to use in applying
a specific tool or technique being covered in a forecasting course. The second type would
be the management process case in which the case describes the forecasting process in
an organization and allows students to consider the process itself as opposed to specific
techniques applied to specific data sets. The final category would be a mix of these two
but generally oriented toward applying a technique to specific data and then looking at the
management decision-making implications of the resulting forecasts. This third category
is closely tied to the implementation of what the forecaster recommends to management.
A case course is taught using a problem-centered, participant-involved method of instruction. For most class sessions in such a course, a case is assigned to be read and
prepared for discussion and analysis in the classroom. Each case describes a specific management problem and seeks to describe selected aspects of an everyday situation that
either the forecasting specialist or the management user of forecasting might encounter.
C/1/1 The nature of a case
In spite of the realism that an instructor seeks to build into all cases, they cannot be
completely true-to-life management situations for the following reasons.
1. The information comes to the student in a neatly presented form. By contrast,
managers and forecasters must gather facts in ongoing situations from memos, conversations, statistical reports, and the public press in a much less organized fashion.
2. A case is designed to fit a particular unit of class time and to focus on a certain
category of problem, such as a forecasting technique, forecasting system, or the
management use of a given forecast. Consequently, it may omit elements of the real
situation—people or organizational issues for example—in order to focus attention
on what the instructor would like the class to see.
3. A case is a “snapshot” taken at a given point in time. In reality, business problems
form a continuum requiring some action today, further consideration and action
tomorrow, and so on. It is very seldom that a manager can wrap up his or her
problems, put them away and go on to the next “case” as is done in a course.
4. While students studying cases are required to make decisions, they do not have the
responsibility for implementing those decisions and do not have to bear the burden
of ineffective implementation. This can be a particularly important shortcoming
when training forecasting specialists, who will have to interface with managers, who
in turn must make decisions based on their forecasts.
C/1/2 The educational purposes of cases
Cases can help forecasters and managers sharpen their analytical skills by exposing them
to facts and figures which must be evaluated and used to produce both quantitative and
qualitative evidence to support recommendations and decisions. In case discussions students are typically challenged by instructors and peers to defend their arguments and
analyses. This can have the cumulative effect on the students of helping to develop a
problem-solving methodology and heightened ability to think, reason, and apply specific
techniques in a rigorous fashion.
Case studies cut across a range of organizational situations and provide exposure to
a far greater number of situations than would be likely on a single job involving normal
day-to-day routine. Thus, cases permit building knowledge across a range of subjects and
situations by dealing selectively and intensively with problems in each field. Students
come to recognize that the problems they face as a manager or forecaster are not unique
to one organization or even a system of organizations. This helps them to develop a more
professional sense of their tasks and the way in which they can be handled most effectively.
Cases and the related class discussions can provide the focal point around which the
student’s past experience, expertise, observations, and rules of thumb can be brought
together in a framework for effectively tackling new situations. What each class member
brings to identifying the central problems in a case, analyzing them and proposing solutions
to them, is as important as the content of the case itself. The lessons of experience
can be tested as students present and defend their analyses against those of participants
with different experiences and attitudes. This is one place where common problems,
interdependencies, differences of perception, and organizational needs can be highlighted
and resolved in a systematic fashion.
An important benefit of using cases is that they help students learn how to ask the
right questions. It has often been said that 90% of the task of a good manager is to ask
useful questions. Answers can be relatively easy to find once the appropriate questions are
asked. Even when assignment questions are used with individual cases, students should
be pressed to ask themselves, “What are the real problems that the individual forecaster
or manager must resolve in this situation?”
One final benefit that an instructor often seeks to achieve through the use of cases is
to transfer to the students the excitement and challenge that can come from pursuing
management and forecasting careers. In the cases they prepare, students often see some
problems they are glad they do not have to face in real life, and others that they recognize
from first-hand experience. They should come to recognize that being a manager or
working as a forecaster for a manager can be a great challenge intellectually, politically,
and socially.
C/1/3 How students should prepare a case
There is no single form of case preparation that works best for everyone. However, some
general guidelines can be offered that might well be adapted to the way each individual
student does his or her work. These guidelines would include the following.
1. Go through the case almost as fast as you can turn the pages, asking what the case
is about and what types of information are being provided for analysis.
2. Read the case very carefully, underlining key facts and perhaps writing those and
key issues in the margin. The students should try to put themselves in the position
of the manager or forecaster in the case and develop a sense of involvement in that
person’s problem.
3. After a thorough reading of the case it’s often useful to prepare a list of the key problems. The case can then be gone through once more, picking up those considerations
and data that are relevant for resolving each of those problem areas.
4. Perform the analysis that will enable the student to conclude what recommendations
are most consistent with the situation and the facts provided.
5. Develop a set of recommendations that can be supported by case data and the
student’s analysis, and indicate how those recommendations should be implemented
in this situation.
These five steps can best be applied by the student working individually. The next step
which can aid in case preparation is to have students meet in discussion groups to present
their arguments to others as well as to listen to the arguments of their peers. This testing
of the student’s analysis and recommendations is an important preparatory step for class
discussion. The purpose of the group discussion is not to develop a consensus or a group
plan of action, but rather to help each member refine, adjust, and amplify their own
thinking. It is not necessary or even desirable that the discussion group members agree.
In class the instructor will usually let the class direct the discussion toward those topics
where most of the individuals have concentrated their attention. However, the faculty
member is also likely to prod the class to explore fully those avenues that are most relevant
based on the faculty member’s experience and based on the purposes for which the case was
included in that course. Often the faculty member will summarize the discussion and draw
out the useful lessons and observations toward the end of the class discussion. However,
this might also be done by asking a student to provide that type of summary. It should be
emphasized that learning through the case method results from rigorous discussion and
controversy. Each member of the class and the instructor must assume responsibility for
preparing a case and for contributing ideas to the class discussion.
C/1/4 Use of cases in course design
Just as there are many types of cases, there are many purposes for which they may be
used in course design. Perhaps the simplest is to select exercise cases that can be used
to present data for the students to use with various forecasting techniques. Many of the
cases that will be suggested later in this part are of this type and, thus, can be used
simply as exercises or work problems for students.
A second way in which cases are often used by instructors who do not teach largely by
the case method is as a way to get at implementation and management decision-making
issues. In such a use of cases, an instructor might choose to cover a topic such as regression
analysis using a more traditional lecture method with exercises and problems and then
conclude that section of the course with one or two classes built around management
process-oriented cases. These can be viewed as a way for students not only to apply the
techniques related to regression that they have learned but to look at the implications
of those applications for managers and to consider how they might be effectively “sold”
to management and implemented. If this were to be the only use of cases the instructor
might simply choose to end each of four or five major sections of the course with one or
two cases, resulting in a course that is approximately 80% exercises and lecture/problems,
and 20% case applications.
A third way in which cases often have been used effectively in teaching forecasting and
planning is to teach the basic techniques and their applications using exercises and problems and then at the very end of the course to have a major section on implementation.
In that section, cases requiring the use of different techniques and illustrating the range
of management issues related to implementation could be addressed. If this approach is
followed a class of thirty sessions might have only the last four or five built around case
studies of the management process and implementation type.
Still a fourth approach to utilizing cases in this subject area would be to build the entire
course around cases. While this is certainly feasible given the amount of material readily
available, this can often be a most challenging task if the students are not used to case
courses from other parts of their curriculum. The authors’ experience would suggest that
it is best to use one of the foregoing forms of case use initially, before building an entire
course around cases.
When students are not particularly well versed in the case method, it has often been
found effective to assign study groups to meet in preparation for the classes in which cases
will be used and then to have two or three individuals briefly (5 minutes each) present
their recommendations and analysis at the start of class in order to get the discussion
going. That provides a complete set of thoughts and ideas on the case situation that the
rest of the class discussion can build on. It also ensures that students are well prepared for
that class since they know they might be called on to make such a starting presentation.
C/1/5 Obtaining cases
Prepared cases with teaching notes are available from Harvard Business School Publishing,
60 Harvard Way, Box 230-5A, Boston, MA 02163. They can also be obtained through the
internet at
www.hbsp.harvard.edu
These can be reprinted by the instructor and used to complement the exercises in the text
itself.
C/2 Assignments
C/2/1 The Phrygian thread factory
(Prepared by Professor Fred Shepardson, Stanford University. Used by permission.)
The Phrygian Thread Factory was founded in 1947 by Ikos Matzakis shortly after
emigrating from Greece. The enterprise had begun on a small scale, supplying thread for
the local garment industry. In those days, Matzakis would buy cotton fiber from relatives
in Greece, import it to the United States, and dye it and spin it to produce a rather wide
range of end products.
Since those humble beginnings Phrygian had grown to become a not inconsequential
thread supplier for the Northeastern United States. In addition to supplying the garment
industry and various distributors and retailers of sewing threads, Phrygian was now supplying large industrial users. Major customers included the auto industry (for upholstery
and seat belts) and the telecommunications industry (for wrapping and insulating cables).
Similarly, Phrygian no longer restricted itself to cotton thread. The bulk of its output
was now nylon, although significant amounts of rayon, cotton, and silk were produced
as well. Phrygian’s product line was virtually unlimited, for it was standard operating
procedure to do custom dye jobs to match customer color specifications. However, color
notwithstanding, there still remained nearly a hundred distinct items in the product line.
One of Phrygian’s most important products was NC-216. This was a bonded nylon
thread customarily used by the auto industry in sewing seams in upholstery. To make it,
Phrygian began with raw nylon fiber of weight 210 denier. Two strands were spun together
with a right-hand twist to form a thread. Then three threads were twisted together, again
with a right-hand twist, to form the final thread. Once this was ready, it was loosely
wound into large spools and sent to the dyehouse.
The dyehouse staff would dye the thread in batches of up to ninety pounds. From the
dye vats, the thread would go directly to large walk-in ovens to accelerate drying. After
24–48 hours in the ovens the thread was moved to large drying rooms to finish drying.
After one to five days in the drying room, the thread was ready to be sent upstairs for
bonding.
In the bonding room the thread was passed slowly through a hot liquid plastic solution
and then through heaters and on to winders. Once this process was completed, random
samples were taken and tested, primarily for breaking strength. The thread was finally
sent down to the spooling room to be put on customer-specified spools (usually one pound
spools). Once finished, the completed order was sent down to shipping to be packaged
and shipped.
Other products followed the same general flow, although some required additional procedures (such as skeining before dyeing) and others required fewer (for example, no bonding).
In the office suite life was characterized by a constant effort to track down and expedite
orders. Orders were phoned in by salesmen in the field. In the case of custom color
requirements, color samples followed by mail. For most orders, salesmen wanted to know
a projected delivery date. The production manager, Roy, and his assistant, Fred, would
characteristically supply delivery dates off the top of their heads. In this process they
relied on their intelligence guided by experience. For all products they were aware of the
normal production time. They also knew that these lead times were quite flexible. With
constant monitoring, a product could be shipped in a much shorter time than its expected
processing time. However, with no monitoring a product often took much longer than the
normal lead time.
For very important orders, Roy and Fred would promise an early date and then ride
the department managers closely to make sure the date was met.
Other aspects of production control were done in the same sort of ad hoc manner.
Workforce levels, overtime, and extra shifts were decided on pretty much a day-to-day
basis. Bernie, the new plant manager, had decided things had to change. This decision
had been made in response to the latest catastrophe. Phrygian had just received a large
rush order from Non-Specific Motors for three thousand pounds of NC-216, a thousand
pounds in each of three colors. Bernie was at first jubilant when he heard of it. But his
jubilation was short-lived when he learned that there was not enough 210 denier nylon to
meet such a large order. There was already additional nylon backordered but it was not
scheduled to arrive in time to be of use for the Non-Specific order.
While Roy and Fred scrambled orders, robbing Peter to pay Paul, and combed the countryside for additional supplies of 210 nylon, Bernie sat in his office and plotted strategy.
Bernie decided the first requirement was a good forecasting system so they would not be
caught off guard like this again. By going over orders for the last three years, he noticed
that each year Non-Specific had placed a large rush order for NC-216 at about this same
time. He felt sure such information could be used in planning operations. He decided to
call Roy and Fred in and have them set up a forecasting system.
The next day Bernie made his pitch. “Look, you guys, things have been going pretty
well. Phrygian’s profits are up, even our market share is up. But now we’re getting bigger
and I think we’ve just about hit our breaking point. Last year you hired Fred, Roy. That
took some of the load off you, but already it’s getting ahead of you again. We are getting
more customer complaints about orders being shipped late. And now we’ve gotten caught
short on 210 for the Non-Specific order.”
“Look, Bernie, we’re getting burdened on this I know. But it’s the first time this has
happened. Give us a break.”

Table 3-1: Monthly orders for NC-216 in pounds (a)

Month      1972      1973      1974      1975      1976      1977      1978      1979      1980
January    -         1340      3690      4110      4500      2600      5330      5140      6820
February   -         1500      3520      3870      4290      5830      5290      4900      6540
March      -         1570      3330      3550      4010      5400      4960      4400      6030
April      -         1360      3120      3420      3830      4210      4730      4090      5770
May        -         1350      2880      3250      3570      3900      4370      4600 (c)  5510
June       -         1400      2670      2910      3250      3640      4020      4540      5000
July       -         1610      2790      3080      3520      4010      4020 (b)  4930      5430
August     -         2280      3540      3890      4280      4830      4830      5920      6520
September  -         2730      3920      4310      4830      5270      4880      6480      7180
October    260       3210      4310      4860      5310      5960      5540      7170      -
November   550       3350      4200      4660      5180      5830      5430      7080      -
December   930       3620      4070      4520      5030      5510      5210      6930      -

(a) As a matter of policy, no special discounts or sales campaigns have been held for NC-216.
(b) Phrygian dropped NC-336 from their regular product line, retaining it as an option at a
    price premium.
(c) Phrygian introduced NC-236, identical to NC-216, except with a right-hand spin and a
    left-hand twist.

“Roy, I know it’s the first time and I’m not really blaming you. But I want to make
sure it doesn’t happen a second time. I want you two to develop a forecasting system so
you’ll have a better idea of what to expect.”
“You mean some sort of automated technique for the new computer you got?”
“That’s right, Fred. Since sales orders are now being entered into a data file for the
billing system anyway, there must be some way of accessing that information and using
it.”
After the meeting Roy and Fred sat in Roy’s office kicking around ideas. “Look, Roy,
here’s the orders for NC-216 since late 1972. That’s when we introduced NC-216, to meet
the demand caused by federally mandated seat belts (see Table 3-1). Suppose we just use
this one item for discussion purposes. Now what do you propose to do?”
“Well, Fred, I’m not sure. In the past I always used to talk with the salesmen periodically
to get a feel for what they thought was coming. Then I’d use that with my intuition
to make decisions on ordering raw materials and setting up vacation and maintenance
schedules.”

Table 3-2: Fred’s forecasts for October 1979 through September 1981

Month      1979      1980      1981
January    -         5983      6796 *
February   -         6767      7481
March      -         6355      6910
April      -         6380      6305
May        -         5326      6116
June       -         4960      5698
July       -         5565      6105
August     -         6646      7350
September  -         7018      7980
October    7353      8006      -
November   7009      7736      -
December   6795      7578      -

* For January 1981 through September 1981, the inflation factor was calculated using data
  from October 1980.

“Well, your intuition didn’t intuit the big Non-Specific order.”
“Actually, I’d thought of it. That’s why we have had that big order already in on 210.
It’s just I thought we wouldn’t need it for almost another month.”
“Hey, Roy, maybe we could just use last year’s figure for a month’s demand, plus an
inflation factor to get a forecast for the same month this year.”
“That might make sense, Fred. But look here with the NC-216. Notice in January of
1977 the low figure. That’s probably because of our wildcat strike that entire month. If
we’d used that figure to predict January 1978 we would have really been caught short.”
“Well, perhaps this forecasting system should have a manual component as well. A
place for our intuition to pick up on facts like that.”
“We’re busy enough already, Fred. With almost a hundred products, not considering
color differences, we’d be buried under the reams of data and the damned computer output.
Of course we should be able to take advantage of the fact that most of our products fall
into one of three or four demand patterns.”
“How’s this then, Roy? Suppose we are trying to forecast January 1981. We need
the forecast about three months in advance; that means the first week of October, 1980.
Suppose we take the average January demand for the five years preceding and inflate it by
a certain percentage. Since we will know October 1980’s real demand by then, we will let
the inflation percentage be the percentage difference between October 1980’s real demand
and the five preceding years’ average of October demands. Following that procedure, we
should be able to develop forecasts for even the next twelve months.” (See Table 3-2.)
“That sounds really good, Fred. But let’s test it by going back in the data and forecasting our last twelve months’ demand. Then we can compare it with reality (see Table
3-2). In any case, just to protect ourselves, I think we should hire a consultant and see
what ideas he has.”
“Sounds good to me. Boy, this could cut Phrygian Thread’s Gordian Knot.”
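For instructors who wish to demonstrate Fred's proposal in class, the procedure he describes can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not part of the case itself: the names ORDERS, LEAD_MONTHS, WINDOW_YEARS and fred_forecast are ours, the data are transcribed from Table 3-1 with the year columns read as running consecutively from 1972 to 1980, and the "inflation percentage" is applied as a ratio of actual demand to the five-year average for the month three months before the target month.

```python
# Monthly orders for NC-216 in pounds, transcribed from Table 3-1.
# Assumption: the year columns run consecutively from 1972 to 1980.
# None marks months with no recorded orders.
ORDERS = {
    1972: [None] * 9 + [260, 550, 930],
    1973: [1340, 1500, 1570, 1360, 1350, 1400, 1610, 2280, 2730, 3210, 3350, 3620],
    1974: [3690, 3520, 3330, 3120, 2880, 2670, 2790, 3540, 3920, 4310, 4200, 4070],
    1975: [4110, 3870, 3550, 3420, 3250, 2910, 3080, 3890, 4310, 4860, 4660, 4520],
    1976: [4500, 4290, 4010, 3830, 3570, 3250, 3520, 4280, 4830, 5310, 5180, 5030],
    1977: [2600, 5830, 5400, 4210, 3900, 3640, 4010, 4830, 5270, 5960, 5830, 5510],
    1978: [5330, 5290, 4960, 4730, 4370, 4020, 4020, 4830, 4880, 5540, 5430, 5210],
    1979: [5140, 4900, 4400, 4090, 4600, 4540, 4930, 5920, 6480, 7170, 7080, 6930],
    1980: [6820, 6540, 6030, 5770, 5510, 5000, 5430, 6520, 7180] + [None] * 3,
}

LEAD_MONTHS = 3   # the forecast is needed about three months in advance
WINDOW_YEARS = 5  # Fred averages the five preceding years


def demand(year, month):
    """Actual orders for a calendar month (1-12), or None if not recorded."""
    return ORDERS.get(year, [None] * 12)[month - 1]


def five_year_average(year, month):
    """Mean demand for `month` over the five years preceding `year`."""
    values = [demand(y, month) for y in range(year - WINDOW_YEARS, year)]
    if any(v is None for v in values):
        raise ValueError(f"not enough history before {year}-{month:02d}")
    return sum(values) / WINDOW_YEARS


def fred_forecast(year, month):
    """Five-year average for the target month, inflated by the ratio observed
    in the reference month LEAD_MONTHS earlier."""
    ref_year, ref_month0 = divmod(year * 12 + (month - 1) - LEAD_MONTHS, 12)
    ref_month = ref_month0 + 1
    actual = demand(ref_year, ref_month)
    if actual is None:
        raise ValueError(f"no actual demand for {ref_year}-{ref_month:02d}")
    inflation = actual / five_year_average(ref_year, ref_month)
    return five_year_average(year, month) * inflation


# Backtest over the last twelve months with known demand, as Roy suggests.
backtest = [(1979, m) for m in (10, 11, 12)] + [(1980, m) for m in range(1, 10)]
abs_pct_errors = []
for y, m in backtest:
    forecast, actual = fred_forecast(y, m), demand(y, m)
    abs_pct_errors.append(abs(actual - forecast) / actual)
    print(f"{y}-{m:02d}  forecast {forecast:7.0f}  actual {actual:5d}")
print(f"MAPE over the backtest: {100 * sum(abs_pct_errors) / len(abs_pct_errors):.1f}%")
```

Under these assumptions the function gives about 5983 for January 1980 and 6767 for February 1980, which agree with the corresponding entries of Table 3-2. The sketch does not handle the 1981 forecasts, where the table footnote indicates that a single inflation factor was reused.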
Assignment
You are to act as a consultant to Phrygian Thread. Prepare a careful analysis of the
demand pattern for NC-216. Then present a forecasting system for Fred and Roy to
consider for their product line. Evaluate your model on whatever criteria you feel to be
appropriate. Make explicit any assumptions you make.
C/2/2 Nike stock price predictions
(Prepared by Professor Peter Reiss, Stanford University. Used by permission.)
M.A. Verage is a retired stockbroker who on occasion still tries her hand at picking
stocks. Last summer she invested her entire life’s savings in Nike, a company that sells
not only running shoes, but also a wide range of athletic apparel. Having made a bundle
on her original investment, M.A. is now concerned that the athletic fad will fade. (She
privately fears that this is already happening.) She is also concerned that the current
speculative bubble of optimism on Wall Street will burst, and this too will send Nike stock
prices plummeting.
Her dilemma is this: since last August, the value of her stock has ranged from one and
a half to two times what she paid for it. If she sells now, she can realize a sure return on
her original investment. On the other hand, Nike’s stock price has historically been quite
strong—even in recessions—and it may return to its 1983 high of $24.00. Not wanting to
make a foolish decision either way, she decides to call in an expert who can predict what
will happen to the price of Nike stock over the next 15 trading days.
For a rather modest fee, you have suddenly become an expert on stock price behavior.
Having not had a course in finance and not having access to any information about Nike’s
prospects in the athletic apparel market, you must resort to forecasting Nike’s stock price
using only information on past stock prices.
Assignment
1. In a four-page (or less) document, you or your group must present your forecast of
how Nike’s stock price will behave. If you present more than one set of forecasts,
you must state which one is your most preferred (and why).
2. You may assume that she has taken this subject and is familiar with the material
covered in the text.
3. You should spend at least three-quarters of your discussion describing the data
(e.g., plots, transformations, statistics . . .) and why you have elected to use your
forecasting techniques. Running every possible method on the data would be terribly
time consuming and burdensome. Please limit your efforts by first looking at the
data and then deciding what to do. You will be graded not so much on how you
do at forecasting the price and the sophistication of your techniques, as you will
on the thoughtfulness and clarity of your discussion. Remember, the idea here is
not so much to give you a once-and-for-all grade, as it is for you to experiment with
techniques used in class. If you get to the point where you are disgusted with moving
averages, exponential smoothing, etc, you have probably tried too many techniques
and written too much.
4. Your paper is limited to four double-spaced pages of text. You can attach as many
plots or copies of output as you want.
5. The data on Nike’s stock price are below. Please use only these data. The data are
arrayed by date and price. February data are given first, followed by March.
(Dates read across each row.)

Date  Price     Date  Price     Date  Price     Date  Price     Date  Price
01    19.875    02    19.375    03    19.500    04    20.125    07    20.000
09    20.125    10    20.250    11    21.500    14    22.750    15    23.000
16    22.625    17    23.125    18    23.125    21    23.125    22    22.625
23    22.625    24    19.125    25    17.000    28    17.875    01    17.625
02    17.125    03    16.625    04    16.125    07    16.000    08    15.625
09    16.000    10    16.375    11    16.250    14    16.125    15    16.125
16    16.125    17    16.000    18    15.625    21    15.875    22    16.000
23    16.250    24    15.875
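As one possible starting point for discussing this assignment, the following Python sketch compares the naive (last price) forecast with single exponential smoothing on the Nike data. It is a sketch under stated assumptions, not a recommended answer: the prices are taken in chronological order, reading the table across each row of dates; the smoothing constant ALPHA = 0.5 is an arbitrary illustrative value; and the function names are ours.

```python
# Nike closing prices, February then March, in assumed chronological order
# (reading the data table across each row of dates).
PRICES = [
    19.875, 19.375, 19.500, 20.125, 20.000,
    20.125, 20.250, 21.500, 22.750, 23.000,
    22.625, 23.125, 23.125, 23.125, 22.625,
    22.625, 19.125, 17.000, 17.875, 17.625,
    17.125, 16.625, 16.125, 16.000, 15.625,
    16.000, 16.375, 16.250, 16.125, 16.125,
    16.125, 16.000, 15.625, 15.875, 16.000,
    16.250, 15.875,
]

HORIZON = 15  # trading days to be forecast
ALPHA = 0.5   # hypothetical smoothing constant, chosen only for illustration


def ses(series, alpha):
    """Single exponential smoothing.

    Returns the final level (the flat forecast for every future period)
    and the list of one-step-ahead forecast errors.
    """
    level = series[0]
    errors = []
    for y in series[1:]:
        errors.append(y - level)            # error of the one-step forecast
        level = alpha * y + (1 - alpha) * level
    return level, errors


def mse(errors):
    """Mean squared error of a list of forecast errors."""
    return sum(e * e for e in errors) / len(errors)


# With alpha = 1 the smoothed level is simply the last observation,
# so the same function also gives the naive (random walk) forecast.
naive_level, naive_errors = ses(PRICES, 1.0)
ses_level, ses_errors = ses(PRICES, ALPHA)

print(f"Naive: forecast {naive_level:.3f} for each of the next {HORIZON} days, "
      f"one-step MSE {mse(naive_errors):.4f}")
print(f"SES (alpha={ALPHA}): forecast {ses_level:.3f}, "
      f"one-step MSE {mse(ses_errors):.4f}")
```

Both methods produce a flat forecast over the 15-day horizon, which is a useful prompt for discussing what can and cannot be said about future prices from past prices alone.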
C/2/3 Final Paper Option
(Prepared by Professor Peter Reiss, Stanford University. Used by permission.)
Paper requirements
All papers must be typed, double-spaced and have neat corrections. Your paper can be
anywhere from seven to sixteen pages in length. I suggest you aim for seven to ten pages,
but if you require more pages to make your arguments clear, you may use them. Footnotes,
references and exhibits are not counted in the above page limits.
In writing your papers, remember that, although content is paramount, style and clear
prose are also important. Any sources that contribute significant or little known facts
must be referenced with footnotes. You are also not permitted to turn in any paper
submitted for another course (or any modified course paper). All data sources must be
clearly detailed.
Topics
You pretty much have freedom to choose your own topic as long as it is related to a
forecasting problem. This forecasting problem can be an event change study, a time series
study, or a characteristics-based study. You must use real data. This data can be gathered
from public or private sources. You must reserve 5 to 10 per cent of your data for out of
sample predictions. You may not use this data when fitting your model. Once you have
settled on perhaps several models, you are then to forecast the out of sample data, as
well as periods (or phenomena) beyond your data. You are then to write up your results,
evaluating how you did in forecasting your reserved data and how you expect to do with
your future forecasts. You must not alter your models once you have simulated them over
the reserved data. I will not penalize you for bad out of sample predictions as long as you
intelligently evaluate why they occurred and you indicate how you might have gone about
fixing your model once these new data were revealed (you can actually revise your model
if you find it a compelling exercise—but you must report your initial models and results
first).
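The hold-out requirement above can be illustrated with a short Python sketch. It is only a sketch: the series below is a placeholder rather than real data, the naive forecast stands in for whatever model the student chooses, and the function names are hypothetical.

```python
def split_holdout(series, holdout_fraction=0.10):
    """Reserve the last 5 to 10 per cent of the observations for
    out-of-sample evaluation; the rest is used for fitting."""
    if not 0.05 <= holdout_fraction <= 0.10:
        raise ValueError("holdout_fraction should be between 0.05 and 0.10")
    n_holdout = max(1, round(len(series) * holdout_fraction))
    return series[:-n_holdout], series[-n_holdout:]


def mean_absolute_error(actual, forecast):
    """Average absolute difference between actual values and forecasts."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


# Placeholder series (NOT real data), used only to show the mechanics.
series = [float(v) for v in range(100, 140)]

train, holdout = split_holdout(series)

# Stand-in model: the naive forecast repeats the last fitted observation.
# The model is fitted on `train` only and is not revised after this point.
forecasts = [train[-1]] * len(holdout)

print(f"{len(train)} observations for fitting, {len(holdout)} reserved")
print(f"Out-of-sample MAE: {mean_absolute_error(holdout, forecasts):.2f}")
```

The point of the split is that the reserved observations play no part in fitting, so the out-of-sample error is an honest measure of forecast performance.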
As far as content, you should in the beginning of the paper lay out the issue or topic
you are discussing. This should include a statement of the forecasting problem or topic,
any analytical or numerical frameworks you wish to use, and a brief statement of the
forecasting methods you have chosen to explore. You then should present your analysis
and the facts to back your analysis. Finally, you should give a brief summary of your models,
and their out of sample predictions, and an evaluation of the reliability of your model.
Suggestions for actions based on your forecasts should also be made at this point.
Sample topics might include:
1. A model that predicts the sales of a company’s product line.
2. A model of movements in macroeconomic variables.
3. A model of a firm’s choice of strategies (e.g., capacity utilization, price, output,
quality, advertising, etc.).
4. A model of seasonality, cycles and trend in macroeconomic or company data.
5. Inventory modeling and production scheduling problems.
6. Judgmental forecasting.
C/2/4 Demand for blood tests
(Prepared by Professor Fred Shepardson, Stanford University. Adapted with permission.)
This assignment is based on the article “Box-Jenkins vs multiple regression: some
adventures in forecasting the demand for blood tests” by Everette S. Gardner, Interfaces,
Vol. 9, August 1979.
This paper is a report on consulting activities performed for the Clinical Coagulation
Laboratory at the North Carolina Memorial Hospital. You are to put yourself in the
position of the Laboratory Director. Consider the paper as the consultants’ final report
to you on their study of the problem of forecasting the Laboratory’s demand for blood
tests. While you are responsible for this project, your boss has taken a keen interest
in it as well and she is eager to have the Laboratory start using an analytically based
forecasting method. She has also received a copy of the consultants’ report and you can
assume she has read it. She is now awaiting your report on the project and how you will
proceed. Your presentation should include a brief analysis of the consultants’ work and
your own recommendation for implementing the proposed forecasting procedure. What is
your presentation?
C/2/5 Winning Wines
As the operations manager for Winning Wines, Sandra McDougal has become quite concerned about managing her product release. The firm produces a large range of wines
primarily for local consumption. Lately Winning Wines has found a valuable new export market in South East Asia. Currently, of the 39 distinct products Winning Wines
produces, 12 are for this new market.
Ms McDougal is worried about managing the release of her products gradually so that
customer demand is satisfied without affecting the price by oversupply. Consequently she
has decided to introduce a formal forecasting procedure at Winning Wines. In order to
begin her analysis of the demand pattern for the company’s products, Ms McDougal has
selected a product for which the company has a detailed history—a popular sparkling
white wine.
From marketing she has obtained extensive monthly sales data for this wine. Graphs
of the data with ACF and PACF plots are attached. What forecasting techniques should
Ms McDougal be considering at this point? Defend your choice. What advice can you
give to help her in developing a forecasting system for demand for Winning Wines’ entire
product line?
[Figure: time plot of sparkling wine sales (liters), with ACF and PACF plots for lags 0 to 30.]
Sparkling wine sales (liters) for Winning Wines.
C/2/6 Decomposition and exponential smoothing assignment
Select one time series of real data. The series can be selected from among those available
in the Time Series Data Library (www.maths.monash.edu.au/~hyndman/tseries/) or can be
published data or data you have collected. The data series must be seasonal and comprise
at least 30 observations.
1. Make a time plot of your data and describe the main features of the series.
2. Transform your series if necessary. Explain which transformation was used and why.
If no transformation was used, explain why not.
3. Decompose the transformed series using an additive model. Produce a decomposition
plot and a seasonal sub-series plot for the decomposition.
4. Forecast the next two years of your series using Holt–Winters’ additive method. Give
the parameters of the method and report the MSE, MAPE and MAD of the one-step
forecasts from your method. If you transformed the series, give the forecasts on the
original scale.
5. Find a prediction interval for the next observation using the MSE. Check the assumption of normality.
6. Add your forecasts and prediction interval to the graph of the data.
7. Explain why the MSE cannot be used to obtain prediction intervals for longer-term
forecasts.
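The accuracy measures in item 4 and the prediction interval in item 5 can be sketched in Python as follows. This is a sketch under assumptions we should state plainly: MAD is taken to mean the mean absolute deviation of the one-step forecast errors, the interval assumes approximately normal errors with variance equal to the one-step MSE, and the numbers in the usage lines are made up purely to show the calculation.

```python
import math


def one_step_measures(actual, forecast):
    """Return (MSE, MAPE in per cent, MAD) for one-step forecasts."""
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    mape = 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    mad = sum(abs(e) for e in errors) / n
    return mse, mape, mad


def prediction_interval(point_forecast, mse, z=1.96):
    """Approximate 95% interval for the NEXT observation only,
    assuming normal one-step errors with variance equal to the MSE."""
    half_width = z * math.sqrt(mse)
    return point_forecast - half_width, point_forecast + half_width


# Hypothetical values, for illustration only.
actual = [100.0, 110.0, 120.0, 130.0]
forecast = [98.0, 113.0, 119.0, 134.0]

mse, mape, mad = one_step_measures(actual, forecast)
lower, upper = prediction_interval(135.0, mse)
print(f"MSE {mse:.2f}  MAPE {mape:.2f}%  MAD {mad:.2f}")
print(f"95% interval for the next observation: ({lower:.1f}, {upper:.1f})")
```

The interval is built from the one-step MSE, so it describes only the next observation; forecast errors at longer horizons generally have larger variance, which is the issue raised in item 7.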
C/2/7 ARIMA Assignment
Select one time series of real data. The series can be selected from among those available
in the Time Series Data Library (www.maths.monash.edu.au/~hyndman/tseries/) or can be
published data or data you have collected. The data series must comprise at least 30
observations.
You should produce forecasts of the series using an ARIMA model. Write a brief report
(about 4 pages) of your analysis including
• transformations
• model selection
• estimation
• diagnostics
• forecasts and prediction intervals
Explain carefully what you have done and why you have done it. You should also compare
your results with those obtained using an exponential smoothing method. Which method
do you think gives the better forecasts?
You should write as if your report is to a client who is interested in forecasts of your
data. You may assume that your client is familiar with the material covered in the text.
You will be graded not so much on the sophistication of your techniques, as you will on
the thoughtfulness and clarity of your discussion and the communication of your results.
You will also be required to give a presentation of your analysis in class.
C/3 Exams
C/3/1 Sample exam questions for a time series and forecasting subject
Question 1 The graphs in Figure 1 concern the production of sulphuric acid in Australia
between March 1956 and March 1992.
1.1 Describe the series in a few sentences. Does transforming or differencing seem
appropriate?
1.2 Given that Australian economic policy was radically different between 1972
and 1975 due to the Whitlam Labor Government and that there was a severe
recession in 1991–1992, explain in a couple of sentences some of the unusual
features of the time series plot. What other information might be helpful in
modelling this series?
1.3 Your client has asked you to provide forecasts of this series for the next two
years. She has no specific idea of the expected behavior of the forecasts and
does not require forecast intervals. Consider each of the methods listed below.
Say, in a few words each, if and why you think each of the methods listed might
be appropriate or not for this situation. If you find more than one method
that might be appropriate, discuss in about two sentences the relative merits
of the appropriate methods. Assume methods a)–e) will be applied to the
data as given, without any preceding actions taken.
a) Single exponential forecasting
b) Holt’s method
c) Holt–Winters’ method
d) AR(1) with −1 < φ < 1
e) ARIMA(0,1,1) with −1 < θ < 1 with a constant
f) ARIMA(0,1,4) applied to the data differenced at lag 4.
[Figure 1 panels: time plot, “Quarterly production of sulphuric acid in Australia” (time axis 1960 to 1990); ACF and partial ACF against lag (0 to 20).]
Figure 1: Graphs relating to production of sulphuric acid in Australia.
[Figure 2 panels: time plot of the index by year; ACF and partial ACF against lag (0 to 25).]
Figure 2: Graphs relating to the Beveridge wheat price index.
Question 2 The graphs in Figure 2 concern the Beveridge wheat price index from 1500–
1869.
2.1 Describe the series in a few sentences. Explain why taking logarithms of the
series is appropriate. Does differencing also seem appropriate? Explain why
or why not.
2.2 Suppose you wish to forecast the series for the next 10 years. Consider each of
the methods listed below. Say, in a few words each, if and why you think each
of the methods listed might be appropriate or not for this situation. If you
find more than one method that might be appropriate, discuss in about two
sentences the relative merits of the appropriate methods. Assume methods
(a)–(f) will be applied to the logged data without any other actions taken.
(a) Single exponential forecasting
(b) Holt’s method
(c) Holt–Winters’ method
(d) AR(1) with −1 < φ < 1
(e) ARIMA(p,1,q) with no constant
(f) ARIMA(p,1,q) with a constant
(g) ARMA(p,q) applied to the logged data differenced at lag 12.
Question 3 The graph in Figure 3 is of the number of housing starts in the US each
month for nine years.
3.1 Consider forecasting the time series using the various methods listed in the
previous question. Say, in a few words each, if and why you think each of the
methods might be appropriate or not for the client in this situation. If you
find more than one method that might be appropriate, discuss in about two
sentences the relative merits of the appropriate methods. Assume methods
(a)–(f) will be applied to the data as given without any preceding actions
taken.
3.2 Describe what the ACF would probably look like for this series and describe
any actions you would take before trying to fit a stationary ARMA model.
3.3 Discuss in about 4–5 sentences (but without giving any equations) what actions you would take after you have obtained the parameter estimates from
your ARMA model but before you produce any forecasts.
[Figure 3: time plot of monthly housing starts (thousands), time axis 1966 to 1975.]
Figure 3: U.S. monthly housing starts, January 1966–December 1974.
Question 4 The graphs in Figure 4 concern the total building and construction activity
in Australia each quarter. The units represent the value of work done in millions of
dollars at 1984/1985 prices. Data are available from July 1976 to September 1994.
However, the graphs are based on a restricted set of data. The first quarter on the
graph is July–September 1976; the last quarter on the graph is July–September
1991. Let $Y_t$ denote the raw series shown in the time plot and let $X_t$ denote the
series after differencing at lags 1 and 4. The ACF and PACF graphs are for the $X_t$
series.
4.1 Describe the series in a few sentences. Does transforming seem appropriate?
If so, what transformation would you try? What features of the series suggest
differencing is appropriate?
4.2 You are developing a forecasting model for the Housing Industry Association
and you wish to test the model by forecasting the data from December 1991 to
September 1994. Consider each of the methods listed below. Comment, in a
few words each, on whether the methods listed might be appropriate for these
data. If more than one method might be appropriate, discuss in about two
sentences the relative merits of the appropriate methods. Assume methods
a)–e) will be applied to the data, $Y_t$, without any preceding actions taken.
[Figure 4 panels: time plot, “Building and construction activity in Australia” (time axis 1976 to 1992); “ACF and PACF for differenced data” against lag (0 to 20).]
Figure 4: Graphs relating to quarterly totals of building and construction activity in Australia.
First quarter: Jul–Sep 1976; last quarter: Jul–Sep 1991.
a) Single exponential forecasting
b) Holt’s method
c) Holt–Winters’ method
d) AR(1) with −1 < φ < 1
e) ARIMA(0,4,1)
f) AR(8) applied to $X_t$
g) ARIMA(1,0,0) applied to $X_t$
4.3 Explain why it is better when evaluating forecast performance to fit the model
using the data up to September 1991 rather than using the complete data set
up to September 1994.
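The differencing at lags 1 and 4 that defines the X series in Question 4 can be sketched in Python as follows. The quarterly series used here is synthetic (a linear trend plus a fixed quarterly pattern), chosen by us only so that the effect of each differencing step is easy to verify; it is not the building and construction data.

```python
def difference(series, lag):
    """Return the series differenced at the given lag: y[t] - y[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Synthetic quarterly series (NOT the exam data): linear trend + fixed seasonality.
seasonal = [30.0, -10.0, -25.0, 5.0]
y = [2500.0 + 20.0 * t + seasonal[t % 4] for t in range(24)]

seasonal_diff = difference(y, 4)   # removes the fixed quarterly pattern
x = difference(seasonal_diff, 1)   # removes the remaining trend

# Equivalent single expression: x[t] = y[t] - y[t-1] - y[t-4] + y[t-5]
x_direct = [y[t] - y[t - 1] - y[t - 4] + y[t - 5] for t in range(5, len(y))]

print("After lag-4 differencing:", seasonal_diff[:4])
print("After lag-4 and lag-1 differencing:", x[:4])
```

For this synthetic series the lag-4 difference is constant and the further lag-1 difference is zero, which shows why the two operations together remove both a stable seasonal pattern and a linear trend. Five observations are lost in the process.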
C/3/2 Two hour exam for a time series and forecasting subject
All questions of this exam involve the series plotted below.
[Figure 1: time plot, “Monthly retail turnover recreational goods (Tasmania)”, time axis 1982 to 1996.]
Figure 1: Time plot of monthly retail turnover ($ million) of recreational goods in Tasmania
between April 1982 and March 1996.
The last 15 months of data are given below:
      Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
1995  13.7  14.7  14.8  13.0  14.0  13.4  13.6  14.9  13.5  14.7  15.7  21.9
1996  16.9  16.3  14.7
Question 1
1.1 Describe the series plotted above in a few sentences. Comment on trend,
seasonality, cycles and changes in variance and discuss the causes for these.
1.2 Explain why it is easier to analyze the logarithms of the data rather than the
raw data.
1.3 Your client has asked you to provide forecasts of this series for the next two
years. Consider each of the methods listed below. Assume the methods will
be applied to the logged data. Say, in a few words each, if and why you think
each of the methods listed might be appropriate or not for this situation. If
you find more than one method that might be appropriate, discuss in about
two sentences the relative merits of the appropriate methods.
a) Single exponential forecasting
b) Holt’s method
c) Holt–Winters’ method
d) AR(1) with −1 < φ < 1
e) ARIMA(0,1,1) with −1 < θ < 1 and with the mean removed after differencing at lag 1
f) ARMA(p,q) model fitted to the series after differencing at lag 12.
g) Seasonal means method.
Question 2 Figure 2 shows the results of an STL decomposition applied to the logarithm
of the data shown in Figure 1. The seasonal component is assumed to be constant
from year to year. Figure 3 shows the seasonal pattern.
2.1 Say which quantities are plotted in each graph of Figures 2 and 3.
2.2 Explain how seasonally adjusted data can be obtained using the quantities
plotted in Figure 2.
2.3 If you were using a classical decomposition, what sort of moving average
smoother would be appropriate for estimating the trend of the series? Express the smoother as a weighted moving average smoother and explain how
the weights ensure there is no seasonal contamination of the trend estimate.
2.4 Explain why there is a problem with computing a moving average smoother
near the ends of the series. Explain why a loess smoother does not have this
problem.
2.5 What sort of decomposition would have been necessary if we had used the
raw data instead of the logged data?
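For instructors preparing a solution to 2.3 and 2.4, one smoother commonly used in classical decomposition of monthly data is the centered 12-term moving average (a 2×12 MA). The Python sketch below shows its weights and applies it to a synthetic series; the series, the seasonal pattern and the function name are our own illustrative choices, not the Tasmanian data.

```python
# Weights of the centered 12-term (2x12) moving average: 13 terms,
# with half weight on the two end points.
WEIGHTS = [1 / 24] + [1 / 12] * 11 + [1 / 24]
HALF_WIDTH = 6


def centered_ma_2x12(series):
    """Trend estimate; None where the 13-term window does not fit."""
    trend = [None] * len(series)
    for t in range(HALF_WIDTH, len(series) - HALF_WIDTH):
        window = series[t - HALF_WIDTH : t + HALF_WIDTH + 1]
        trend[t] = sum(w * v for w, v in zip(WEIGHTS, window))
    return trend


# Synthetic monthly series (NOT the exam data): linear trend plus a fixed
# seasonal pattern whose twelve values sum to zero.
pattern = [v / 100 for v in (5, 0, -2, -8, -3, -6, -5, 1, -7, 0, 3, 22)]
true_trend = [2.0 + 0.01 * t for t in range(48)]
series = [true_trend[t] + pattern[t % 12] for t in range(48)]

trend = centered_ma_2x12(series)
print("First six trend values:", trend[:6])
print("Trend at t = 6:", round(trend[6], 4), "true value:", round(true_trend[6], 4))
```

Every calendar month receives a total weight of 1/12 in each window (the two half-weighted end points fall in the same month), so a stable seasonal pattern averages out. The first and last six values cannot be computed, which is the end-point problem raised in 2.4.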
[Figure 2: four stacked panels plotted against time, 1982 to 1996.]
Figure 2: STL decomposition of the logarithm of the data shown in Figure 1.
[Figure 3: plot titled “Seasonal pattern”, one value for each month from Jan to Dec.]
Figure 3: Seasonal pattern (or indices) based on the STL decomposition in Figure 2.
[Figure 4: plot titled “Forecasts”, time axis 1982 to 1998.]
Figure 4: Forecasts of the logarithms of the data shown in Figure 1, computed using Holt-Winters’ method with parameters a = 0.47, b = 0.41 and c = 0.0.
Question 3 Holt-Winters’ method was used to forecast the logged data. The forecasts
are shown in Figure 4. The MSE for the one-step forecasts is 0.0045. The first few
forecasts are:
Apr 96   May 96   Jun 96   Jul 96   Aug 96
2.69     2.74     2.70     2.74     2.79
3.1 Give the forecasts for April–August 1996 on the original scale.
3.2 Compute a 95% prediction interval for the first forecast (on the original scale).
3.3 The smoothing parameters (α, β and γ) were chosen to minimize the one-step
MSE. Using the Holt-Winters’ equations, explain what γ = 0 implies about
the data. Discuss how this feature of the data is also seen in the seasonal
decomposition in Figure 2.
Question 4 Let the series plotted in Figure 1 be denoted by $\{X_t\}$, let $Y_t = \log(X_t)$ and
let $W_t = Y_t - Y_{t-12}$. Then the following model was fitted:
$W_t = 0.52 W_{t-1} + 0.38 W_{t-2} + Z_t$
where $\{Z_t\}$ is white noise with variance 0.0063.
4.1 What sort of ARIMA model is $W_t$ (i.e. what are p, d and q)?
4.2 Is the model for $W_t$ stationary?
4.3 Write down the model for $Y_t$. Is the model for $Y_t$ stationary?
4.4 Compute the forecasts for $X_t$ for April 1996 and May 1996.
4.5 Compare the forecast performance of this model with the method used in
Question 3, referring to the one-step MSE for both models.
4.6 The Ljung-Box statistic for h = 24 is 50.3. Complete the Ljung-Box test and
comment on the adequacy of the model.
End of Exam
C/3/3 Final Exam: Elective MBA Course
(Prepared by Professor Peter Reiss, Stanford University. Adapted by permission.)
Part I: True or False (Worth 100 points. Suggested time allotment: 1 hour)
The following statements are either always true or they are false. Answer each
statement “True” or “False” and give a one or two sentence justification that supports
your conclusion. You will receive zero for an incorrect answer, one for the right
answer, and 1 to 3 more points for your justification.
1. A forecast error cost function based upon absolute deviations (MAD) penalizes
you more for large errors than does mean squared error (MSE).
2. The larger $R^2$, the better the model.
3. The smaller the standard error of a multiple regression model, the more accurate
are the predictions of the future.
4. The first four randomness tests we discussed in class will (with an extremely
high probability) perform equally well in detecting any trend or seasonality in
a time series.
5. Given that we now know how to use Box-Jenkins procedures, there is absolutely
no reason to ever use a simple moving average of past values of the time series.
6. The mean of a time series (that has any variance) can never be an optimal
forecasting rule.
7. A single moving average (for example, $\hat{X}_t = (X_{t-1} + X_{t-2})/2$) is always stationary.
8. Single moving averages as a smoothing and forecasting tool work well on data
that have a strong linear trend.
9. Single Exponential Smoothing (SES) puts most of its averaging weight on values
that are mid-way between the start and end of the observations being averaged.
10. Holt’s method is always better than single exponential smoothing because it
removes more trend.
11. In SES, if the value of α that minimizes MSE is equal to one, this implies $X_t$
is best represented as a moving average of length one (i.e. $\hat{X}_t = X_{t-1}$).
12. If you definitely have seasonality, trend and randomness in your time series
data, Winters’ method should be used before any other smoothing method.
13. In trying to pick the best exponential smoothing method, it is always best to
compare MSE. The method with the smallest MSE is always preferred.
14. If a variable appears to be insignificant in a regression, it should be dropped.
15. A high F statistic for your regression indicates that you have a good forecasting
model.
16. A multiplicative decomposition method will usually be preferred to an additive
decomposition method when the variability of the data increases through time.
17. The adjusted $R^2$ is a much better measure of goodness-of-fit and forecasting
accuracy because it penalizes you for data mining.
18. If the residuals from a regression show evidence of first-order serial correlation,
then we can obtain better forecasts of the dependent variable by incorporating
that serial dependence in our forecasting rule.
19. The F sub-block test in regression analysis is not useful if the linear regression
slope coefficients change through time.
20. Econometric system models are always better forecasting tools than linear regression models because simultaneous equations models use more information
about the process generating the data.
21. Correlation always implies causality.
22. Autocorrelation coefficients are not very useful for detecting the presence of an
autoregressive process in time series data.
23. Differencing techniques can always make a time series stationary. Besides, we
can never over-difference a time series.
24. Box-Jenkins forecasting methods do not work well in the presence of quadratic
time trends in the data.
25. If there are patterns in your out-of-sample forecast errors, but your within-sample forecast errors appear random, it was just by chance that you obtained
this nonrandomness out of sample.
Part II: Thought Questions (Worth 70 points. Suggested time allotment: 3/4 hour.)
Please answer only one of the following questions. Long answers are not required,
so please pay attention to your time. The situations described are hypothetical and
the names have been changed to protect the offenders.
1. Recently you received a forecasting report that used a multiple regression model
to construct the forecasts. The author of the report chose his model because
“of the several hundred tried, it had by far the highest R-squared, adjusted
R-squared and F statistic. It also had the lowest sum of squared residuals and
a Durbin-Watson of 1.40 for n = 100 and k = 5.” The author also claims that
on a reserved sample of the data the model has “very little bias and the MSE
is only slightly worse than within sample.” Given only this information,
(a) Comment on the author’s judgment in choosing his forecasting model.
(b) Suggest a number of diagnostics or additional steps you would like to see
the author take.
2. Recently the Bank of the GSB issued its forecasts of macroeconomic aggregate
variables (e.g. GNP, the money supply, interest rates, etc.) for 1998. In their
report they say that they have a ten equation model of the U.S. economy.
Discuss:
(a) What types of data (or models) they would have to have in order to predict
the 1998 values.
(b) What sorts of questions you as a consumer of these forecasts would ask in
order to be more confident that their forecast of a 20 percent inflation rate
was reasonable.
Part III: Forecasting analysis (Worth 130 points. Suggested time allotment: 1 1/4 hours.)
(a) Look at the following retail sales data (see the attached pages) and write a
short description of the data.
(b) A colleague has proposed that
“Box-Jenkins, time trend and seasonal regression, and decomposition
appear to work best as forecasting tools.”
Looking at the attached output that contains data and diagnostics on each of
these best methods, evaluate the above statement. Comment on the strengths
and weaknesses of the three approaches.
Graphs of the retail turnover data. (Panels: time plot of quarterly retail turnover, ACF and PACF.)
Decomposition plot of the retail turnover data. (Panels: data, trend-cycle, seasonal and remainder, plotted against time.)
Part of computer output for Seasonal decomposition:
Seasonal indices
-51.80256 -51.52647
189.64777 -86.31869
Part of computer output for Regression:
Regression Coefficients:
                 Value   Std. Error    t value   Pr(>|t|)
(Intercept)  1271.2339       9.1093   139.5535     0.0000
t               5.0954       0.2961    17.2103     0.0000
Q1             36.8263       9.5463     3.8577     0.0005
Q2             35.6839       9.5417     3.7398     0.0007
Q3            275.2755       9.5463    28.8360     0.0000
Residual standard error: 20.77 on 34 degrees of freedom
Multiple R-Squared: 0.9771
F statistic: 363.2 on 4 and 34 degrees of freedom, the p-value is 0
Part of computer output for ARIMA modelling:
Period(s) of Differencing = 1,4.
Number of observations = 34
NOTE: The first 5 observations were eliminated by differencing.

Parameter   Estimate   Approx. Std Error   T Ratio   Lag
MA1,1        0.74282             0.20763      3.58     4

Variance Estimate   = 37259.2677
Std Error Estimate  = 193.026599
AIC                 = 458.540303
SBC                 = 460.066663
Number of Residuals = 34

To Lag   Chi Square   DF    Prob   Autocorrelations
     6         2.05    5   0.842   -0.007 -0.005 -0.164  0.106 -0.083 -0.074
    12         5.92   11   0.878   -0.100  0.055  0.191 -0.161 -0.004 -0.054
    18        13.49   17   0.703   -0.118 -0.067  0.112  0.233  0.165  0.015
    24        20.71   23   0.599   -0.118 -0.031  0.091 -0.225  0.019 -0.014
Model for variable TURNOVER
No mean term in this model.
Period(s) of Differencing = 1,4.
Moving Average Factors
Factor 1: 1 - 0.74282 B**(4)
Regression residuals for the retail turnover data. (Panels: residuals from regression, ACF and PACF.)
ARIMA residuals for the retail turnover data. (Panels: residuals from ARIMA model, ACF and PACF.)
D/Solutions to exercises
Chapter 1: The forecasting perspective
1.1 Look for pragmatic applications in the real world. Note that there are no fixed
answers in this problem.
(a) Dow theory: There is an element of belief that past patterns will continue
into the future. So first, look for the patterns (support and resistance levels)
and then project them ahead for the market and individual stocks. This is a
quantitative time series method.
(b) Random walk theory: This is quantitative, and involves a time series rather
than an explanatory approach. However, the forecasts are very simple because
of the lack of any meaningful information. The best prediction of tomorrow’s
closing price is today’s closing price. In other words, if we look at first differences
of closing prices (i.e., today’s closing price minus yesterday’s closing price) there
will be no pattern to discover.
(c) Prices and earnings: Here instead of dealing with only one time series (i.e., the
stock price series) we look at the relation between stock price and earnings per
share to see if there is a relationship—maybe with a lag, maybe not. Therefore this is an explanatory approach to forecasting and would typically involve
regression analysis.
1.2 Step 1: Problem definition This would involve understanding the nature of the individual product lines to be forecast. For example, are they high-demand products or specialty biscuits produced for individual clients? It is also important
to learn who requires the forecasts and how they will be used. Are the forecasts
to be used in scheduling production, or in inventory management, or for budgetary planning? Will the forecasts be studied by senior management, or by
the production manager, or someone else? Have there been stock shortages so
that demand has gone unsatisfied in the recent past? If so, would it be better
to try to forecast demand rather than sales so that we can try to prevent this
happening again in the future? The forecaster will also need to learn whether
the company requires one-off forecasts or whether the company is planning on
introducing a new forecasting system. If the latter, are they intending it to
be managed by their own employees and, if so, what software facilities do they
have available and what forecasting expertise do they have in-house?
Step 2: Gathering information It will be necessary to collect historical data on each
of the product lines we wish to forecast. The company may be interested in
forecasting each of the product lines for individual selling points. If so, it is
important to check that there are sufficient data to allow reasonable forecasts
to be obtained. For each variable the company wishes to forecast, at least a
few years of data will be needed.
There may be other variables which impact the biscuit sales, such as economic
fluctuations, advertising campaigns, introduction of new product lines by a
competitor, advertising campaigns of competitors, production difficulties. This
information is best obtained by key personnel within the company. It will be
necessary to conduct a range of discussions with relevant people to try to build
an understanding of the market forces.
If there are any relevant explanatory variables, these will need to be collected.
Step 3: Preliminary (exploratory) analysis Each series of interest should be graphed
and its features studied. Try to identify consistent patterns such as trend
and seasonality. Check for outliers. Can they be explained? Do any of the
explanatory variables appear to be strongly related to biscuit sales?
Step 4: Choosing and fitting models A range of models will be fitted. These models
will be chosen on the basis of the analysis in Step 3.
Step 5: Using and evaluating a forecasting model Forecasts of each product line will
be made using the best forecasting model identified in Step 4. These forecasts
will be compared with expert in-house opinion and monitored over the period
for which forecasts have been made.
There will be work to be done in explaining how the forecasting models work
to company personnel. There may even be substantial resistance to the introduction of a mathematical approach to forecasting. Some people may feel
threatened. A period of education will probably be necessary.
A review of the forecasting models should be planned.
Part D. Solutions to exercises
Chapter 2: Basic forecasting tools
2.1 (a) One simple answer: choose the mean temperature in June 1994 as the forecast
for June 1995. That is, 17.2 °C.
(b) The time plot below shows clear seasonality with average temperature higher
in summer.
Exercise 2.1(b): Time plot of average monthly temperature in Paris (January 1994–May
1995). (Vertical axis: Celsius; horizontal axis: month.)
2.2 (a) Rapidly increasing trend, little or no seasonality.
(b) Seasonal pattern of period 24 (low when asleep); occasional peaks due to strenuous activity.
(c) Seasonal pattern of period 7 with peaks at weekends; possibly also peaks during
holiday periods such as Easter or Christmas.
(d) Strong seasonality with a weekly pattern (low on weekends) and a yearly pattern. Peaks in either summer (air-conditioning) or winter (heating) or both
depending on climate. Probably increasing trend with variation increasing with
trend.
2.3 (a) Smooth series with several large jumps or direction changes; very large range
of values; logs help stabilize variance.
(b) Downward trend (or early level shift); cycles of about 15 days; outlier at day
8; no transformation necessary.
(c) Cycles of about 9–10 years; large range and little variation at low points indicating transformation will help; logs help stabilize variance.
(d) No clear trend; seasonality of period 12; high in July; no transformation necessary.
(e) Initial trend; level shift end of 1982; seasonal period 4 (high in Q2 and Q3, low
in Q1); no transformation necessary.
2.4 1-B, 2-A, 3-D, 4-C. The easiest approach to this question is to first identify D.
Because it has a peak at lag 12, the time series must have a pattern of period 12.
Therefore it is likely to be monthly. The slow decay in plot D shows the series has
trend. The only series with both trend and seasonality of period 12 is Series 3. Next
consider plot C which has a peak at lag 10. Obviously this cannot reflect a seasonal
pattern since the only series remaining which is seasonal is series 2 and that has
period 12. Series 4 is strongly cyclic with period approximately 10 and series 1 has
no seasonal or strong cyclic patterns. Therefore C must correspond to series 4. Plot
A shows a peak at lag 12 indicating seasonality of period 12. Therefore, it must
correspond with series 2. That leaves plot B aligned with series 1.
2.5 (a)
       Mean   Median    MAD     MSE   St.dev.
X     52.99    52.60   3.11   15.94      4.14
Y     43.70    44.42   2.47    8.02      2.94
(b) Mean and median give a measure of center; MAD, MSE and St.dev. are measures of spread.
(c) r = −0.660. See plot on next page.
(d) It is inappropriate to compute autocorrelations since there is no time component
to these data. The data are from 14 different runners. (Autocorrelation would
be appropriate if they were data from the same runner at 14 different times.)
2.6 (a) See plot on following page.
(b) and (c)
Notation:
Error 1 = (actual demand) − (method 1 forecast)
Error 2 = (actual demand) − (method 2 forecast)
Exercise 2.5(c): Plot of running times versus maximal aerobic capacity. (Horizontal axis: X, running times; vertical axis: Y, maximal aerobic capacity.)
Exercise 2.6(a): Time plots of data and forecasts. (Demand against month; series shown: Actual, Forecast Method 1, Forecast Method 2.)
Period   Actual   Method 1   Error 1   Method 2   Error 2
   1       139       157       −18       170       −31
   2       137       145        −8       162       −25
   3       174       140        34       157        17
   4       142       162       −20       173       −31
   5       141       149        −8       164       −23
   6       162       144        18       158         4
   7       180       156        24       166        14
   8       164       172        −8       179       −15
   9       171       167         4       177        −6
  10       206       169        37       180        26
  11       193       193         0       199        −6
  12       207       193        14       202         5
  13       218       202        16       211         7
  14       229       213        16       221         8
  15       225       223         2       232        −7
  16       204       224       −20       235       −31
  17       227       211        16       225         2
  18       223       221         2       232        −9
  19       242       222        20       233         9
  20       239       235         4       243        −4

Analysis of errors (periods 1–20)
             Method 1   Method 2
ME               6.25      −4.80
MAE             14.45      14.29
MSE            307.25     294.00
MPE              2.55      −3.61
MAPE             7.87       8.24
Theil’s U        0.94       0.85
On MAE and MSE, Method 2 is better than Method 1. On MAPE, Method 1
is better than Method 2. Note that this is different from the conclusion drawn
in Section 4/2/3 where these two methods are compared. The difference is that
we have used a different time period over which to compare the results. Holt’s
method (Method 2) performs quite poorly at the start of the series. In Chapter
4, this period is excluded from the analysis of errors.
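The error measures in this comparison can be recomputed directly from the tabulated actual values and forecasts. The following is a minimal sketch in Python; the function name `accuracy` is an illustrative choice, and the forecasts used are the rounded values printed in the table. With these rounded values the MAE for Method 2 comes out at 14.00 rather than the printed 14.29 (the reason for the difference is not clear from the table); the ordering of the two methods on MAE is unaffected.

```python
# Actual demand and the two sets of forecasts, transcribed from the
# Exercise 2.6 table (periods 1-20).
actual = [139, 137, 174, 142, 141, 162, 180, 164, 171, 206,
          193, 207, 218, 229, 225, 204, 227, 223, 242, 239]
method1 = [157, 145, 140, 162, 149, 144, 156, 172, 167, 169,
           193, 193, 202, 213, 223, 224, 211, 221, 222, 235]
method2 = [170, 162, 157, 173, 164, 158, 166, 179, 177, 180,
           199, 202, 211, 221, 232, 235, 225, 232, 233, 243]


def accuracy(actual, forecast):
    """Return ME, MAE, MSE, MPE and MAPE, with error = actual - forecast."""
    errors = [a - f for a, f in zip(actual, forecast)]
    pct = [100.0 * e / a for e, a in zip(errors, actual)]
    n = len(errors)
    return {
        "ME": sum(errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": sum(e * e for e in errors) / n,
        "MPE": sum(pct) / n,
        "MAPE": sum(abs(p) for p in pct) / n,
    }


stats1 = accuracy(actual, method1)
stats2 = accuracy(actual, method2)
for name in ("ME", "MAE", "MSE", "MPE", "MAPE"):
    print(f"{name:5s} {stats1[name]:8.2f} {stats2[name]:8.2f}")
```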
2.7 (a) Changes: −0.25, −0.26, 0.13, . . . , −0.09, −0.77. There are 78 observations in
the DOWJONES.DAT file. Therefore there are 77 changes.
(b) Average change: 0.1336. So the next 20 changes are each forecast to be 0.1336.
(c) The last value of the series is 121.23. So the next 20 are forecast to be:
X̂79 = 121.23 + 0.1336 = 121.36
X̂80 = 121.36 + 0.1336 = 121.50
X̂81 = 121.50 + 0.1336 = 121.63
etc.
In general, $\hat{X}_{78+h} = 121.23 + h(0.1336)$, since the last observation is $X_{78} = 121.23$.
(d) See the plot below.
Exercise 2.7(d): Plot of Dow Jones index (DOWJONES.DAT). (Vertical axis: Dow Jones index; horizontal axis: day.)
(e) The average change is $c = \frac{1}{n-1}\sum_{t=2}^{n}(X_t - X_{t-1})$ and the forecasts are $\hat{X}_{n+h} = X_n + hc$. Therefore,
$$\hat{X}_{n+h} = X_n + h\,\frac{1}{n-1}\sum_{t=2}^{n}(X_t - X_{t-1}) = X_n + \frac{h}{n-1}(X_n - X_1).$$
This is a straight line with slope equal to $(X_n - X_1)/(n-1)$. When $h = 0$,
$\hat{X}_{n+h} = X_n$ and when $h = -(n-1)$, $\hat{X}_{n+h} = X_1$. Therefore, the line is drawn
between the first and last observations.
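The result in part (e) can be checked numerically. The sketch below compares the average-change forecast with the straight line through the first and last observations. The short series `toy` is made up for illustration (it is not the DOWJONES.DAT data), and the function names are hypothetical.

```python
def drift_forecast(series, h):
    """Forecast h steps ahead by adding h times the average change."""
    n = len(series)
    avg_change = sum(series[t] - series[t - 1] for t in range(1, n)) / (n - 1)
    return series[-1] + h * avg_change


def line_through_ends(series, h):
    """Value at time n + h of the line through the first and last observations."""
    n = len(series)
    slope = (series[-1] - series[0]) / (n - 1)
    return series[-1] + h * slope


# Hypothetical illustrative series, not taken from the exercise data.
toy = [110.9, 110.7, 110.4, 110.6, 111.3]

for h in (-(len(toy) - 1), 0, 1, 5, 20):
    print(h, round(drift_forecast(toy, h), 4), round(line_through_ends(toy, h), 4))
```

The two columns agree for every h because the sum of successive changes telescopes to the last value minus the first.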
2.8 (a) See the plot on the next page. The variation when the production is low is
much less than the variation in the series when the production is high. This
indicates a transformation is required.
(b) See the plot on the next page.
(c) See the table on page 84.
Exercise 2.8 (a) and (b): Time plots of Japanese automobile production and the logarithms
of Japanese automobile production. (Vertical axes: vehicles (thousands) and logarithms of vehicles; horizontal axis: year, 1950–1990; the forecast is marked on the plot.)
Year    Data    Log   Forecast    Error   Error²   |Error/Log|
1947      11   2.40
1948      20   3.00     2.40      0.598    0.357     0.1996
1949      29   3.37     3.00      0.372    0.138     0.1103
1950      32   3.47     3.37      0.098    0.010     0.0284
1951      38   3.64     3.47      0.172    0.030     0.0472
1952      39   3.66     3.64      0.026    0.001     0.0071
1953      50   3.91     3.66      0.249    0.062     0.0635
1954      70   4.25     3.91      0.337    0.113     0.0792
1955      69   4.23     4.25     −0.014    0.000     0.0034
1956     111   4.71     4.23      0.475    0.226     0.1009
1957     182   5.20     4.71      0.495    0.245     0.0950
1958     188   5.24     5.20      0.032    0.001     0.0062
1959     263   5.57     5.24      0.336    0.113     0.0602
1960     482   6.18     5.57      0.606    0.367     0.0981
1961     814   6.70     6.18      0.524    0.275     0.0782
1962     991   6.90     6.70      0.197    0.039     0.0285
1963    1284   7.16     6.90      0.259    0.067     0.0362
1964    1702   7.44     7.16      0.282    0.079     0.0379
1965    1876   7.54     7.44      0.097    0.009     0.0129
1966    2286   7.73     7.54      0.198    0.039     0.0256
1967    3146   8.05     7.73      0.319    0.102     0.0396
1968    4086   8.32     8.05      0.261    0.068     0.0314
1969    4675   8.45     8.32      0.135    0.018     0.0159
1970    5289   8.57     8.45      0.123    0.015     0.0144
1971    5811   8.67     8.57      0.094    0.009     0.0109
1972    6294   8.75     8.67      0.080    0.006     0.0091
1973    7083   8.87     8.75      0.118    0.014     0.0133
1974    6552   8.79     8.87     −0.078    0.006     0.0089
1975    6942   8.85     8.79      0.058    0.003     0.0065
1976    7842   8.97     8.85      0.122    0.015     0.0136
1977    8514   9.05     8.97      0.082    0.007     0.0091
1978    9269   9.13     9.05      0.085    0.007     0.0093
1979    9636   9.17     9.13      0.039    0.002     0.0042
1980   11043   9.31     9.17      0.136    0.019     0.0146
1981   11180   9.32     9.31      0.012    0.000     0.0013
1982   10732   9.28     9.32     −0.041    0.002     0.0044
1983   11112   9.32     9.28      0.035    0.001     0.0037
1984   11465   9.35     9.32      0.031    0.001     0.0033
1985   12271   9.41     9.35      0.068    0.005     0.0072
1986   12260   9.41     9.41     −0.001    0.000     0.0001
1987   12249   9.41     9.41     −0.001    0.000     0.0001
1988   12700   9.45     9.41      0.036    0.001     0.0038
1989   13026   9.47     9.45      0.025    0.001     0.0027
1990                    9.47
Exercise 2.8 (c) and (d).
(d) MSE=0.059 (average of column headed Error²)
MAPE=3.21% (average of values in last column multiplied by 100).
(e) See graph. Forecast is $e^{9.47} = 13026$.
(f ) There are a large number of possible methods. One method, which is discussed
in Chapter 5, is to consider only data after 1970 and use a straight line fitted
through the original data (i.e. without taking logarithms).
(g) The data for 1974 is lower than would be expected. If this information could be
included in the forecasts, the MSE and MAPE would both be smaller because
the forecast error in 1974 would be smaller.
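The figures in parts (c) to (e) can be reproduced with a few lines of code. The sketch below applies the last-value (naïve) forecast to the logged series and computes the MSE and MAPE. It assumes that the 43 data values in the table correspond to the years 1947–1989, which is how the table appears to be laid out; the variable names are illustrative.

```python
import math

# Japanese automobile production (thousands), as tabulated for Exercise 2.8.
# Assumption: the 43 values are for 1947-1989.
production = [
    11, 20, 29, 32, 38, 39, 50, 70, 69, 111, 182, 188, 263, 482, 814,
    991, 1284, 1702, 1876, 2286, 3146, 4086, 4675, 5289, 5811, 6294,
    7083, 6552, 6942, 7842, 8514, 9269, 9636, 11043, 11180, 10732,
    11112, 11465, 12271, 12260, 12249, 12700, 13026,
]

logs = [math.log(x) for x in production]

# Naive forecast: the forecast of each log value is the previous log value.
forecasts = logs[:-1]
actuals = logs[1:]
errors = [a - f for a, f in zip(actuals, forecasts)]

mse = sum(e * e for e in errors) / len(errors)
mape = 100 * sum(abs(e / a) for e, a in zip(errors, actuals)) / len(errors)

# Forecast for the next year, back-transformed to the original scale.
next_forecast = math.exp(logs[-1])

print(f"MSE  = {mse:.3f}")
print(f"MAPE = {mape:.2f}%")
print(f"Next-year forecast = {next_forecast:.0f}")
```

Because the naïve forecast on the log scale is simply the last logged value, back-transforming returns the last observation itself.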
Chapter 3: Time series decomposition
3.1
   Y     3-MA     5-MA   3×3-MA     7-MA   5×5-MA
  42    55.50    70.33    62.92    81.50    81.14
  69    70.33    81.50    73.50    91.60    88.71
 100    94.67    91.60    93.56    99.83    96.65
 115   115.67   111.40   113.22   107.57   111.10
 132   129.33   128.40   129.11   126.00   125.92
 141   142.33   142.60   142.33   141.86   141.60
 154   155.33   155.60   155.33   156.71   156.80
 171   168.33   170.00   169.56   172.86   172.32
 180   185.00   187.40   185.78   189.29   189.80
 204   204.00   206.00   205.11   210.71   210.96
 228   226.33   230.00   228.56   236.86   236.72
 247   255.33   261.40   257.78   268.29   262.54
 291   291.67   298.80   295.56   283.00   289.27
 337   339.67   316.50   331.78   298.80   304.09
 391   364.00   339.67   351.83   316.50   318.32
Exercise 3.1: Smoothers fitted to the shipments data. (Series shown: 3-MA, 3x3 MA, 5-MA, 5x5 MA, 7-MA.)
The graph on the previous page shows the five smoothers. Because moving average
smoothers are “flat” at the ends, the best smoother in this case is the one with the
smallest number of terms, namely the 3-MA.
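The smoothed values for Exercise 3.1 can be reproduced with a short function. The sketch below assumes, as the tabulated values suggest, that the averaging window is simply truncated at the ends of the series (for example, the first 3-MA value is the mean of the first two observations). The function name `centered_ma` is an illustrative choice.

```python
def centered_ma(values, k):
    """Centered moving average of odd order k, with truncated windows at the ends."""
    half = k // 2
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Shipments data from Exercise 3.1.
shipments = [42, 69, 100, 115, 132, 141, 154, 171, 180, 204, 228, 247, 291, 337, 391]

ma3 = centered_ma(shipments, 3)
ma5 = centered_ma(shipments, 5)
ma7 = centered_ma(shipments, 7)
ma3x3 = centered_ma(ma3, 3)   # a 3-MA of the 3-MA
ma5x5 = centered_ma(ma5, 5)   # a 5-MA of the 5-MA

for row in zip(shipments, ma3, ma5, ma3x3, ma7, ma5x5):
    print("  ".join(f"{v:7.2f}" for v in row))
```

The end values show why the wider smoothers flatten out: near the ends they average over observations that all lie on one side of the point being estimated.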
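3.2
$$\hat{T}_t = \frac{1}{3}\Big[\frac{1}{5}(Y_{t-3} + Y_{t-2} + Y_{t-1} + Y_t + Y_{t+1}) + \frac{1}{5}(Y_{t-2} + Y_{t-1} + Y_t + Y_{t+1} + Y_{t+2}) + \frac{1}{5}(Y_{t-1} + Y_t + Y_{t+1} + Y_{t+2} + Y_{t+3})\Big]$$
$$= \frac{1}{15}Y_{t-3} + \frac{2}{15}Y_{t-2} + \frac{1}{5}Y_{t-1} + \frac{1}{5}Y_t + \frac{1}{5}Y_{t+1} + \frac{2}{15}Y_{t+2} + \frac{1}{15}Y_{t+3}.$$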
3.3 (a) The 4 MA is designed to eliminate seasonal variation because each quarter
receives equal weight. The 2 MA is designed to center the estimated trend at
the data points. The combination 2 × 4 MA also gives equal weight to each
quarter.
(b) $\hat{T}_t = \frac{1}{8}Y_{t-2} + \frac{1}{4}Y_{t-1} + \frac{1}{4}Y_t + \frac{1}{4}Y_{t+1} + \frac{1}{8}Y_{t+2}$.
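The weights of a combined moving average, such as the 3 × 5 MA of Exercise 3.2 or the 2 × 4 MA of Exercise 3.3, are the convolution of the weights of the two simple moving averages. A minimal sketch using exact fractions follows; the helper names are illustrative.

```python
from fractions import Fraction


def ma_weights(k):
    """Equal weights of a simple k-term moving average."""
    return [Fraction(1, k)] * k


def convolve(a, b):
    """Weights obtained by applying one moving average to the output of another."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


w_3x5 = convolve(ma_weights(3), ma_weights(5))
w_2x4 = convolve(ma_weights(2), ma_weights(4))

print([str(w) for w in w_3x5])  # weights on Y_{t-3}, ..., Y_{t+3}
print([str(w) for w in w_2x4])  # weights on Y_{t-2}, ..., Y_{t+2}
```

In the 2 × 4 MA the two end weights fall on the same quarter (Y at t − 2 and t + 2), so each quarter receives a total weight of 1/4.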
3.4 (a) Use 2×4 MA to get trend. If the end-points are ignored, we obtain the following
results.
Data:
         Y1     Y2     Y3     Y4
Q1       99    120    139    160
Q2       88    108    127    148
Q3       93    111    131    150
Q4      111    130    152    170
Trend:
           Y1        Y2        Y3        Y4
Q1               110.250   129.250   150.125
Q2               114.875   134.500   154.750
Q3    100.375    119.625   139.875
Q4    105.500    124.375   145.125
Data – trend:
          Y1       Y2       Y3       Y4      Ave
Q1               9.750    9.750    9.875    9.792
Q2              –6.875   –7.500   –6.750   –7.042
Q3    –7.375    –8.625   –8.875            –8.292
Q4     5.500     5.625    6.875             6.000
(b) Hence, the seasonal indices are:
Ŝ1 = 9.8, Ŝ2 = −7.0, Ŝ3 = −8.3 and Ŝ4 = 6.0.
The seasonal component consists of replications of these indices.
(c) End points ignored. Other approaches are possible.
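The additive decomposition in Exercise 3.4 can be reproduced as follows. This is a minimal sketch: the trend is a 2×4 MA with the end-points left undefined, and each seasonal index is the plain average of the de-trended values for that quarter, as in the solution above. Variable names are illustrative.

```python
# Quarterly data for years 1-4 in time order (Q1, Q2, Q3, Q4 of each year).
y = [99, 88, 93, 111,
     120, 108, 111, 130,
     139, 127, 131, 152,
     160, 148, 150, 170]

# 2x4 MA weights on Y_{t-2}, ..., Y_{t+2}.
weights = [1 / 8, 1 / 4, 1 / 4, 1 / 4, 1 / 8]

# Trend is undefined for the first two and last two observations.
trend = [None] * len(y)
for t in range(2, len(y) - 2):
    trend[t] = sum(w * y[t - 2 + j] for j, w in enumerate(weights))

detrended = [None if tr is None else obs - tr for obs, tr in zip(y, trend)]

# Seasonal index for each quarter: average of the available de-trended values.
seasonal = []
for q in range(4):
    vals = [d for i, d in enumerate(detrended) if d is not None and i % 4 == q]
    seasonal.append(sum(vals) / len(vals))

print([round(s, 3) for s in seasonal])
```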
3.5 (a) See the top plot on the next page. There is clear trend which appears close to
linear, and strong seasonality with a peak in August–October and a trough in
January–March.
(b) Calculations are given at the bottom of the next page. The decomposition plot
is shown at the top of the next page.
Plastic sales: time plot and decomposition plot. (Panels: data, trend-cycle, seasonal and remainder.)

Year        J       F       M       A       M       J       J       A       S       O       N       D
Data
1         742     697     776     898    1030    1107    1165    1216    1208    1131     971     783
2         741     700     774     932    1099    1223    1290    1349    1341    1296    1066     901
3         896     793     885    1055    1204    1326    1303    1436    1473    1453    1170    1023
4         951     861     938    1109    1274    1422    1486    1555    1604    1600    1403    1209
5        1030    1032    1126    1285    1468    1637    1611    1608    1528    1420    1119    1013
2×12 MA Trend
1                                                         977.0   977.0   977.1   978.4   982.7   990.4
2      1000.5  1011.2  1022.3  1034.7  1045.5  1054.4  1065.8  1076.1  1084.6  1094.4  1103.9  1112.5
3      1117.4  1121.5  1130.7  1142.7  1153.6  1163.0  1170.4  1175.5  1180.5  1185.0  1190.2  1197.1
4      1208.7  1221.3  1231.7  1243.3  1259.1  1276.6  1287.6  1298.0  1313.0  1328.2  1343.6  1360.6
5      1374.8  1382.2  1381.2  1370.6  1351.2  1331.2
Ratios
1                                                         119.2   124.5   123.6   115.6    98.8    79.1
2        74.1    69.2    75.7    90.1   105.1   116.0   121.0   125.4   123.6   118.4    96.6    81.0
3        80.2    70.7    78.3    92.3   104.4   114.0   111.3   122.2   124.8   122.6    98.3    85.5
4        78.7    70.5    76.2    89.2   101.2   111.4   115.4   119.8   122.2   120.5   104.4    88.9
5        74.9    74.7    81.5    93.8   108.6   123.0
Seasonal indices
Ave      77.0    71.3    77.9    91.3   104.8   116.1   116.8   122.9   123.6   119.3    99.5    83.6
Exercise 3.5(a) and (b): Multiplicative classical decomposition of plastic sales data.
(c) The trend does appear almost linear except for a slight drop at the end. The
seasonal pattern is as expected. Note that it does not make much difference
whether these data are analyzed using a multiplicative decomposition or an
additive decomposition.
3.6
Period t   Trend T_t   Seasonal S_t   Forecast Ŷ_t = T_t S_t /100
   61       1433.96        76.96            1103.6
   62       1442.81        71.27            1028.3
   63       1451.66        77.91            1131.0
   64       1460.51        91.34            1334.0
   65       1469.36       104.83            1540.3
   66       1478.21       116.09            1716.1
   67       1487.06       116.76            1736.3
   68       1495.91       122.94            1839.1
   69       1504.76       123.55            1859.1
   70       1513.61       119.28            1805.4
   71       1522.46        99.53            1515.3
   72       1531.31        83.59            1280.0
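The forecasts for Exercise 3.6 are the projected trend multiplied by the seasonal index and divided by 100. A minimal sketch follows; the starting trend value (1433.96 at period 61) and the step of 8.85 per period are read off the tabulated trend column rather than re-estimated, and the variable names are illustrative.

```python
# Seasonal indices (January-December) from the multiplicative decomposition.
seasonal = [76.96, 71.27, 77.91, 91.34, 104.83, 116.09,
            116.76, 122.94, 123.55, 119.28, 99.53, 83.59]

# Linear trend projection, read off the table: 1433.96 at t = 61,
# increasing by 8.85 per period.
trend_start, step = 1433.96, 8.85
trend = [trend_start + step * i for i in range(12)]

# Multiplicative recombination: forecast = trend * seasonal / 100.
forecasts = [t * s / 100 for t, s in zip(trend, seasonal)]

for period, (t, s, f) in enumerate(zip(trend, seasonal, forecasts), start=61):
    print(f"{period}  {t:8.2f}  {s:7.2f}  {f:8.1f}")
```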
3.7 (a) See the top of the figure on the previous page.
(b) The calculations are given below.
Year       Q1      Q2      Q3      Q4
Data
1         362     385     432     341
2         382     409     498     387
3         473     513     582     474
4         544     582     681     557
5         628     707     773     592
6         627     725     854     661
4×2 MA
1                       382.5   388.0
2       399.3   413.3   430.4   454.8
3       478.3   499.6   519.4   536.9
4       557.9   580.6   601.5   627.6
5       654.8   670.6   674.9   677.0
6       689.4   708.1
Ratios
1                       112.9    87.9
2        95.7    99.0   115.7    85.1
3        98.9   102.7   112.1    88.3
4        97.5   100.2   113.2    88.7
5        95.9   105.4   114.5    87.4
6        91.0   102.4
Seasonal indices
Ave      95.8   101.9   113.7    87.5
Exercise 3.7: Decomposition plot for exports from French company. (Panels: data, trend-cycle, seasonal and remainder.)
(c) Multiplicative decomposition seems appropriate here because the variance is
increasing with the level of the series. The most interesting feature of the
decomposition is that the trend has levelled off in the last year or so. Any
forecast method should take this change in the trend into account.
3.8 (a) The top plot shows the original data followed by trend-cycle, seasonal and
irregular components. The bottom plot shows the seasonal sub-series.
(b) The trend-cycle is almost linear and the seasonal component is very small
compared to the trend-cycle. The seasonal pattern is difficult to see in the time
plot of the original data. Values are high in March, September and December and
low in January and August. For the last six years, the December peak and
March peak have been almost constant. Before that, the December peak was
growing and the March peak was dropping. There are several possible outliers
in 1991.
(c) The recession is seen by several negative outliers in the irregular component.
This is also apparent in the data time plot. Note: the recession could be made
part of the trend-cycle component by reducing the span of the loess smoother.
3.9 (a) and (b) Calculations are given below. Note that the seasonal indices are
computed by averaging the de-trended values within each half-year.
Data    2×2 MA     Detrended    Seasonal     Seasonal
        Trend      Data         Component    Adjusted Data
1.09                             0.017        1.073
1.07    1.0825     -0.0125      -0.014        1.084
1.10    1.0825      0.0175       0.017        1.083
1.06    1.0750     -0.0150      -0.014        1.074
1.08    1.0625      0.0175       0.017        1.063
1.03    1.0450     -0.0150      -0.014        1.044
1.04    1.0300      0.0100       0.017        1.023
1.01    1.0225     -0.0125      -0.014        1.024
1.03    1.0075      0.0225       0.017        1.013
0.96                            -0.014        0.974
(c) With more data, we could take moving averages of the detrended values for
each half-year rather than a simple average. This would result in a seasonal
component which changed over time.
Chapter 4: Exponential smoothing methods
4.1
Period     t    Data Y_t     MA(3)             SES(α = 0.7)
                            Ŷ_t      E_t       Ŷ_t      E_t
1974 1     1      5.4
     2     2      5.3                           5.40    -0.10
     3     3      5.3                           5.33    -0.03
     4     4      5.6       5.33     0.27       5.31     0.29
1975 1     5      6.9       5.40     1.50       5.51     1.39
     2     6      7.2       5.93     1.27       6.48     0.72
     3     7      7.2       6.57     0.63       6.99     0.21
     4     8                7.10                7.14
Accuracy statistics from period 4 through 7
                MA(3)    SES(α = 0.7)
ME               0.92        0.65
MAE              0.92        0.65
MAPE            13.22        9.56
MSE              1.08        0.64
Theil’s U        1.40        1.14
Theil’s U statistic suggests that the naïve (or last value) method is better than
either of these. If SES is used with an optimal value of α chosen, then α = 1 is
selected. This is equivalent to the naïve method. Note different packages may give
slightly different results for SES depending on how they initialize the method. Some
packages will also allow α > 1.
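The MA(3) and SES forecasts for Exercise 4.1 can be reproduced with the sketch below. It assumes that SES is initialized by setting the forecast for period 2 equal to the first observation, which matches the tabulated values; as noted above, other packages may initialize differently. The function names are illustrative.

```python
y = [5.4, 5.3, 5.3, 5.6, 6.9, 7.2, 7.2]  # periods 1-7


def ma_forecasts(y, k):
    """Index i holds the forecast for period i + 1: the mean of the previous k values."""
    return [None] * k + [sum(y[t - k:t]) / k for t in range(k, len(y) + 1)]


def ses_forecasts(y, alpha):
    """Index i holds the forecast for period i + 1; period 2 is initialized at Y_1."""
    forecasts = [None, y[0]]
    for t in range(1, len(y)):
        forecasts.append(alpha * y[t] + (1 - alpha) * forecasts[t])
    return forecasts


def mse(y, forecasts, first, last):
    """Mean squared one-step error over periods first..last (1-based, inclusive)."""
    errors = [y[p - 1] - forecasts[p - 1] for p in range(first, last + 1)]
    return sum(e * e for e in errors) / len(errors)


ma3 = ma_forecasts(y, 3)
ses = ses_forecasts(y, 0.7)

print("MA(3) forecast for period 8:", round(ma3[7], 2))
print("SES forecast for period 8:  ", round(ses[7], 2))
print("MSE, periods 4-7:", round(mse(y, ma3, 4, 7), 2), round(mse(y, ses, 4, 7), 2))
```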
4.2 (a) Forecasts for May 1992
Method     Forecast      MSE
MA(3)        24.0      1484.3
MA(5)        48.6      1031.2
MA(7)        55.6       757.5
MA(9)        51.7       860.8
MA(11)       53.1      1313.8
(b) Forecasts for May 1992
Method     Forecast      MSE
α = 0.1      45.5      1421.35
α = 0.3      41.9      1211.80
α = 0.5      33.1      1193.98
α = 0.7      29.1      1225.40
α = 0.9      28.7      1298.49
(c) Of these forecasting methods, the best MA(k) method has k = 7 and the best
SES method has α = 0.5. However, it should be noted that the MSE values for
the MA methods are taken over different periods. For example, the MSE for the
MA(7) method is computed only over 9 observations because it is not possible
to compute an MA(7) forecast for the first seven observations. So the MSE
values are not strictly comparable for the MA forecasts. It would be better to
use a holdout sample but there are too few data.
4.3 Optimizing α for SES over the period 3 through 10:
α       MAPE      MSE
0.1     65.60    79.34
0.2     53.46    47.24
0.3     44.43    29.95
0.4     37.60    20.10
0.5     32.32    14.17
0.6     28.16    10.41
0.7     24.82     7.91
0.8     22.08     6.17
0.9     19.80     4.92
1.0     17.86     4.00
The optimal value is α = 1.
With Holt’s method, any combination of α and β will give MAPE=0. This is so
because the differences between successive values of (4.13) are always going to be
zero with this errorless series. Using α = 1 for SES and α = 0.5 and β = 0.5 for
Holt’s method gives the following results.
Data        SES            Holt’s
Y_t      Ŷ_t    E_t      Ŷ_t    E_t
  2
  4        2      2        4      0
  6        4      2        6      0
  8        6      2        8      0
 10        8      2       10      0
 12       10      2       12      0
 14       12      2       14      0
 16       14      2       16      0
 18       16      2       18      0
 20       18      2       20      0
(a) Clearly Holt’s method is better as it allows for the trend in the data.
(b) For SES, α = 1. Because of the trend, the forecasts will always lag behind the
actual values so that the forecast errors will always be at least 2. Choosing
α = 1 makes the forecast errors as small as possible for SES.
(c) See above.
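The α table for Exercise 4.3 can be reproduced with a few lines of code. The sketch below assumes the first forecast is set to the first observation (F2 = Y1) and that accuracy is measured over periods 3 through 10; the function names are illustrative only, and other packages may initialize SES differently.

```python
def ses_forecasts(y, alpha):
    """One-step SES forecasts, assuming F_2 = Y_1.

    Returns (period, actual, forecast) tuples for periods 2..n (1-based).
    """
    forecast = y[0]
    rows = []
    for t in range(1, len(y)):
        rows.append((t + 1, y[t], forecast))
        # Update: F_{t+1} = alpha * Y_t + (1 - alpha) * F_t
        forecast = alpha * y[t] + (1 - alpha) * forecast
    return rows


def mape_mse(y, alpha, first_period=3):
    """MAPE (in percent) and MSE over periods first_period..n."""
    rows = [(a, f) for p, a, f in ses_forecasts(y, alpha) if p >= first_period]
    mape = 100 * sum(abs(a - f) / a for a, f in rows) / len(rows)
    mse = sum((a - f) ** 2 for a, f in rows) / len(rows)
    return mape, mse


y = list(range(2, 21, 2))  # the errorless series 2, 4, ..., 20
for k in range(1, 11):
    alpha = k / 10
    mape, mse = mape_mse(y, alpha)
    print(f"{alpha:.1f}  {mape:6.2f}  {mse:6.2f}")
```

Both criteria fall steadily as α increases, which is why α = 1 is selected for this trending series.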
4.4 (a) (b) and (c) See the table on the following page.
(d) There’s not much to choose between these methods. They are both bad! Look
at Theil’s U values for instance. The last value method over the same period
(13–28) gives MSE=6.0, MAPE=2.05 and Theil’s U=1.0.
Calculations for Exercise 4.4
Period  Data   MA(12)              MA(6)
t       Yt     Forecast   Et       Forecast   Et
1       108
2       108
3       110
4       106
5       108
6       108
7       105                        108.00     -3.00
8       100                        107.50     -7.50
9        97                        106.17     -9.17
10       95                        104.00     -9.00
11       95                        102.17     -7.17
12       92                        100.00     -8.00
13       95    102.67     -7.67     97.33     -2.33
14       95    101.58     -6.58     95.67     -0.67
15       98    100.50     -2.50     94.83      3.17
16       97     99.50     -2.50     95.00      2.00
17      101     98.75      2.25     95.33      5.67
18      104     98.17      5.83     96.33      7.67
19      101     97.83      3.17     98.33      2.67
20       99     97.50      1.50     99.33     -0.33
21       95     97.42     -2.42    100.00     -5.00
22       95     97.25     -2.25     99.50     -4.50
23       96     97.25     -1.25     99.17     -3.17
24       96     97.33     -1.33     98.33     -2.33
25       97     97.67     -0.67     97.00      0.00
26       98     97.83      0.17     96.33      1.67
27       94     98.08     -4.08     96.17     -2.17
28       92     97.75     -5.75     96.00     -4.00
29              97.33               95.50

Accuracy criteria: periods 13–28
             MA(12)    MA(6)
ME           -1.51     -0.10
MAE           3.12      2.96
MSE          14.40     12.64
MAPE          3.23      3.03
Theil’s U     1.58      1.45
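As a check on the calculations for Exercise 4.4, the following sketch recomputes the moving average forecasts and the first four accuracy criteria from the 28 observations in the table. The helper names are illustrative, and Theil's U is left out because its exact definition is not restated here.

```python
y = [108, 108, 110, 106, 108, 108, 105, 100, 97, 95, 95, 92, 95, 95,
     98, 97, 101, 104, 101, 99, 95, 95, 96, 96, 97, 98, 94, 92]


def ma_forecasts(y, k):
    """MA(k) forecast for period t (1-based) is the mean of the k previous values."""
    return {t: sum(y[t - 1 - k:t - 1]) / k for t in range(k + 1, len(y) + 2)}


def accuracy(y, forecasts, first, last):
    """ME, MAE, MSE and MAPE (percent) over periods first..last (1-based)."""
    pairs = [(y[t - 1], y[t - 1] - forecasts[t]) for t in range(first, last + 1)]
    n = len(pairs)
    return {
        "ME": sum(e for _, e in pairs) / n,
        "MAE": sum(abs(e) for _, e in pairs) / n,
        "MSE": sum(e * e for _, e in pairs) / n,
        "MAPE": 100 * sum(abs(e) / a for a, e in pairs) / n,
    }


for k in (12, 6):
    fc = ma_forecasts(y, k)
    stats = accuracy(y, fc, 13, 28)  # same evaluation period for both methods
    print(f"MA({k}): forecast for period 29 = {fc[29]:.2f}")
    print("  " + "  ".join(f"{name}={value:.2f}" for name, value in stats.items()))
```

Evaluating both methods over periods 13 to 28 keeps the criteria comparable, which is the point made about MSE values in Exercise 4.2(c).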
4.5 (a) (b) and (c)
                        Paperbacks                 Hardcovers
                        SES          Holt          SES          Holt
Smoothing parameters    α = 0.213    α = 0.335     α = 0.347    α = 0.437
                                     β = 0.453                  β = 0.157
Forecast Day 31         210.15       224.24        240.38       250.73
Forecast Day 32         210.15       231.79        240.38       254.63
Forecast Day 33         210.15       239.33        240.38       258.53
Forecast Day 34         210.15       246.88        240.38       262.43
MAE                     29.6         33.9          27.3         28.6
MSE                     1252.2       1701.7        1060.6       1273.0
MAPE                    17.1         18.4          13.5         14.3
Theil’s U               0.68         0.92          0.81         0.92
For both series, SES forecasting is performing better than Holt’s method.
(d) SES forecasts are “flat” and Holt’s forecasts show a linear trend. Both series
show an upward linear trend and we would expect the forecasts to reflect that
trend. Perhaps an out-of-sample analysis would give a better indication of the
merits of the two methods.
(e) The autocorrelation functions of the forecast errors in each case are plotted on
the next page. In each case, there is no noticeable pattern. Only a few spikes
are just outside the critical bounds which is expected.
4.6 Here is a complete run for one set of values (β = 0.1 and α1 = 0.1). Note that in this
program we have chosen to make the first three values of α be equal to the starting
value. This is not crucial, but it does make a difference.
t     Yt       Ft        Et         At        Mt       αt
1     200.0    200.00      0.00      0.00      0.00    0.100
2     135.0    200.00    -65.00     -6.50      6.50    0.100
3     195.0    193.50      1.50     -5.70      6.00    0.100
4     197.5    193.65      3.85     -4.74      5.79    0.950
5     310.0    197.31    112.69      7.00     16.48    0.820
6     175.0    289.74   -114.74     -5.18     26.30    0.425
7     155.0    241.00    -86.00    -13.26     32.27    0.197
8     130.0    224.08    -94.08    -21.34     38.45    0.411
9     220.0    185.43     34.57    -15.75     38.06    0.555
10    277.5    204.62     72.88     -6.89     41.55    0.414
11    235.0    234.77      0.23     -6.17     37.41    0.166
12             234.81                                  0.165
[Figure: ACF plotted against lag (1 to 14) in four panels: SES paperbacks, Holt paperbacks, SES hardbacks, Holt hardbacks.]
Exercise 4.5 (e): Autocorrelation functions of forecast errors.
For other combinations of values for β and starting values for α, here is what the
final α value is:
           α = 0.1    α = 0.2    α = 0.3
β = 0.1    0.165      0.058      0.140
β = 0.3    0.327      0.454      0.618
β = 0.5    0.732      0.783      0.797
β = 0.7    0.143      0.133      0.133
The time series is not very long and therefore the results are somewhat fickle. In
any event, it is clear that the β value and the starting values for α have a profound
effect on the final value of α.
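The run tabulated for Exercise 4.6 can be sketched in code as follows. This is a minimal sketch that reproduces the table under the conventions it implies: F1 = Y1, A0 = M0 = 0, the first three α values held at the starting value, and α(t+1) = |At/Mt| thereafter. The function and argument names are illustrative, and other programs may initialize differently.

```python
def arrses(y, beta, alpha_start, n_fixed=3):
    """Adaptive-response-rate SES.

    Returns (forecasts, alphas); both lists run over periods 1..n+1.
    The first n_fixed alphas are held at alpha_start (an assumption
    matching the run shown in the table).
    """
    forecasts = [y[0]]                   # F_1 = Y_1
    alphas = [alpha_start] * n_fixed
    a = m = 0.0                          # smoothed error and smoothed absolute error
    for t, obs in enumerate(y):
        e = obs - forecasts[t]
        a = beta * e + (1 - beta) * a
        m = beta * abs(e) + (1 - beta) * m
        # F_{t+1} = alpha_t * Y_t + (1 - alpha_t) * F_t
        forecasts.append(alphas[t] * obs + (1 - alphas[t]) * forecasts[t])
        if t + 1 >= n_fixed:
            # alpha_{t+1} = |A_t / M_t|
            alphas.append(abs(a / m) if m > 0 else alpha_start)
    return forecasts, alphas


y = [200.0, 135.0, 195.0, 197.5, 310.0, 175.0, 155.0, 130.0, 220.0, 277.5, 235.0]
forecasts, alphas = arrses(y, beta=0.1, alpha_start=0.1)
print(f"final alpha = {alphas[-1]:.3f}, forecast for period 12 = {forecasts[-1]:.2f}")
```

Calling `arrses` with other values of `beta` and `alpha_start` shows how sensitive the final α is to these choices on such a short series.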
4.7 Holt-Winters’ method is best because the data are seasonal. The variation increases
with the level, so we use Holt-Winters’ multiplicative method. The optimal smoothing parameters (giving smallest MSE) are a = 0.479, b = 0.00 and c = 1.00. These
give the following forecasts (read left to right):
309.1   312.1   315.2   318.3   321.3   324.4
327.5   330.5   333.6   336.7   339.7   342.8
345.9   348.9   352.0   355.0   358.1   361.2
364.2   367.3   370.4   373.4   376.5   379.6
4.8 First choose any values for the three parameters. Here we have used α = β = γ = 0.1.
Different values will choose different initial values. Our program uses the method
described in the textbook and gave the following results:
ME        MAE      MSE        MAPE    r1      Theil’s U
-240.5    240.5    62469.9    37.5    0.70    2.6
Now compare with the optimal values: α = 0.917, β = 0.234 and γ = 0.000. Using
the same initialization, we obtain the results in Table 4-11, namely
ME       MAE      MSE       MAPE    r1      Theil’s U
-9.46    24.00    824.75    3.75    0.17    0.29
Chapter 5: Simple regression
5.1 (a) −0.7, almost 1, 0.2
(b) False. The correlation is negative. So below-average values of one are associated
with above-average values of the other variable.
(c) Wages have been increasing over time due to inflation. At the same time,
population has been increasing and consequently, new houses need to be built.
So, because they are both increasing with time, they are positively correlated.
(d) There are many factors affecting unemployment and it is simplistic to draw a
causal connection with inflation on the basis of correlation. As in the previous
question, both vary with time and the correlation could be induced by their
time trends. Or they could both be related to some third variable such as
business confidence or government spending.
(e) The older people in the survey had much less opportunity for education than the
younger people. This negative correlation is caused by the increase in education
levels over time.
5.2 (a) X̄ = 5, Ȳ = 25, Σ(Xi − X̄)² = 20, Σ(Xi − X̄)(Yi − Ȳ) = 78. So b = 78/20 = 3.9
and a = 25 − 3.9(5) = 5.5. Hence, the regression line is Ŷ = 5.5 + 3.9X.
(b) Σ(Ŷi − Ȳ)² = 304.20, Σ(Yi − Ŷi)² = 25.80, σe = √(25.80/(5 − 2)) = 2.933. So
F = [304.20/(2 − 1)] / [25.80/(5 − 2)] = 35.4.
This has (2 − 1) = 1 df for the numerator and (5 − 2) = 3 df for the denominator.
From Table C in Appendix III, the P-value is slightly smaller than 0.010. (Using
a computer, it is 0.0095.) Standard errors:
s.e.(a) = (2.93) √(1/5 + 25/20) = 3.53
s.e.(b) = (2.93) √(1/20) = 0.656.
On 3 df, t∗ = 3.18 for a 95% confidence interval. Hence 95% intervals are
α: 5.500 ± 3.18(3.53) = [−5.7, 16.7]
β: 3.900 ± 3.18(0.656) = [1.8, 6.0]
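The arithmetic in Exercise 5.2 (a) and (b) can be checked from the summary quantities alone. The sketch below uses only the sums quoted in the solution and the tabulated multiplier t∗ = 3.18; the variable names are illustrative.

```python
import math

# Summary quantities quoted in the solution to Exercise 5.2
n = 5
x_bar, y_bar = 5.0, 25.0
sxx, sxy = 20.0, 78.0            # sum (Xi - Xbar)^2 and sum (Xi - Xbar)(Yi - Ybar)
ss_reg, ss_err = 304.20, 25.80   # explained and residual sums of squares
t_star = 3.18                    # t multiplier on 3 df, taken from the text

b = sxy / sxx                            # slope
a = y_bar - b * x_bar                    # intercept
sigma_e = math.sqrt(ss_err / (n - 2))    # residual standard deviation
f_stat = (ss_reg / (2 - 1)) / (ss_err / (n - 2))
se_a = sigma_e * math.sqrt(1 / n + x_bar ** 2 / sxx)
se_b = sigma_e * math.sqrt(1 / sxx)
r_squared = ss_reg / (ss_reg + ss_err)

print(f"a = {a:.2f}, b = {b:.2f}, sigma_e = {sigma_e:.3f}, F = {f_stat:.2f}")
print(f"s.e.(a) = {se_a:.3f}, s.e.(b) = {se_b:.4f}, R-sq = {r_squared:.3f}")
print(f"95% CI for intercept: [{a - t_star * se_a:.1f}, {a + t_star * se_a:.1f}]")
print(f"95% CI for slope:     [{b - t_star * se_b:.1f}, {b + t_star * se_b:.1f}]")
```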
Output from Minitab for Exercise 5.2:
Regression Analysis
The regression equation is
Y = 5.50 + 3.90X

Predictor    Coef      StDev     T       P
Constant     5.500     3.531     1.56    0.217
X            3.9000    0.6557    5.95    0.010

S = 2.933    R-Sq = 92.2%    R-Sq(adj) = 89.6%

Analysis of Variance
Source        DF    SS        MS        F        P
Regression    1     304.20    304.20    35.37    0.010
Error         3      25.80      8.60
Total         4     330.00
(c) R2 = 0.922, rXY = rY Ŷ = 0.960.
(d) The line through the middle of the graph is the line of best fit. The 95%
prediction interval shown is the interval which would contain the Y value with
probability 0.95 if the X value was 17. The 80% prediction interval shown is
the interval which would contain the Y value with probability 0.80 if the X
value was 26. The dotted line at the boundary of the light shaded region gives
the ends of all the 95% prediction intervals. The dotted line at the boundary
of the dark shaded region gives the ends of all the 80% prediction intervals.
5.3 (a) See the plot on the next page and the Minitab output on page 101. The straight
line is Ŷ = 0.46 + 0.22X.
(b) See the plot on the next page. The residuals may show a slight curvature (Λ
shaped). However, the curvature is not strong and the fitted model appears
reasonable.
(c) R2 = 90.2%. Therefore, 90.2% of the variation in melanoma rates is explained
by the linear regression.
(d) From the Minitab output:
Prediction: 9.286. Prediction interval: (6.749, 11.823)
[Figure: melanoma rate plotted against ozone depletion.]
Exercise 5.3(a): Scatterplot of melanoma rate against ozone depletion.
[Figure: residuals plotted against ozone depletion.]
Exercise 5.3(b): Scatterplot of residuals from the linear regression.
Output using Minitab for Exercise 5.3:
MTB > Regress 'Melanoma' 1 'Ozone';
SUBC> predict 40.
The regression equation is
Melanoma = 0.460 + 0.221 Ozone

Predictor    Coef       StDev      T       P
Constant     0.4598     0.6258     0.73    0.481
Ozone        0.22065    0.02426    9.09    0.000

S = 0.9947    R-Sq = 90.2%    R-Sq(adj) = 89.1%

Analysis of Variance
Source        DF    SS        MS        F        P
Regression    1     81.822    81.822    82.70    0.000
Error         9      8.905     0.989
Total         10    90.727

Fit      StDev Fit    95.0% CI             95.0% PI
9.286    0.517        (8.116, 10.456)      (6.749, 11.823)
Note that it is the prediction interval (PI) we want here. Minitab also gives
the confidence interval (CI) for the line at this point, something we have not
covered in the book.
(e) This analysis has assumed that the susceptibility to melanoma among people
living in the various locations is constant. This is unlikely to be true due to
the diversity of racial mix and climate over the locations. Apart from ozone
depletion, melanoma will be affected by skin type, climate, culture (e.g. is
sun-baking encouraged?), diet, etc.
5.4 (a) See plot on the next page and computer output on page 103.
(b) Coefficients: a = 4.184, b = 0.9431. Only b is significant, showing the relationship is significant. (We could refit the model without the intercept term.)
(c) If X = 80, Ŷ = 4.184 + 0.9431(80) = 79.63. Standard error of forecast is 1.88
(from computer output).
[Figure: production rating plotted against manual dexterity.]
Exercise 5.4(a): Scatterplot of production rating against manual dexterity test scores.
[Figure: production rating plotted against manual dexterity, with prediction intervals.]
Exercise 5.4(e): 95% prediction intervals for production rating.
Output using Minitab for Exercise 5.4
MTB > Regress 'Y' 1 'X';
SUBC> Predict 'newX'.
The regression equation is
Y = 4.18 + 0.943 X

Predictor    Coef       StDev      T        P
Constant     4.184      3.476       1.20    0.244
X            0.94306    0.05961    15.82    0.000

S = 5.126    R-Sq = 93.3%    R-Sq(adj) = 92.9%

Analysis of Variance
Source        DF    SS        MS        F         P
Regression    1     6576.8    6576.8    250.29    0.000
Error         18     473.0      26.3
Total         19    7049.8

Fit      StDev Fit    95.0% CI           95.0% PI
23.05    2.38         (18.04, 28.05)     (11.17, 34.92)
41.91    1.46         (38.85, 44.97)     (30.71, 53.10)
60.77    1.18         (58.28, 63.26)     (49.71, 71.82)
79.63    1.88         (75.68, 83.58)     (68.16, 91.10)
(d) For confidence and prediction intervals, use Table B with 18 df. 95% CI for β
is 0.94306 ± 2.10(0.05961) = [0.82, 1.07].
(e) See output. Again it is the prediction interval (PI) we want here, not the
confidence interval (CI). The prediction intervals are shown in the plot on the
previous page.
5.5 (a) See the plot on the following page. The straight line regression model is Ŷ =
20.2−0.145X where Y = electricity consumption and X = temperature. There
is a negative relationship because heating is used for lower temperatures, but
there is no need to use heating for the higher temperatures. The temperatures
are not sufficiently high to warrant the use of air conditioning. Hence, the
electricity consumption is higher when the temperature is lower.
[Figure: electricity consumption (Mwh) plotted against temperature.]
Exercise 5.5(a): Electricity consumption (Mwh) plotted against temperature (degrees Celsius).
[Figure: residuals plotted against temperature, with one possible outlier marked.]
Exercise 5.5(c): Residual plot for the straight line regression of electricity consumption
against temperature.
(b) r = −0.791
(c) See the plot on the previous page. Apart from the possible outlier, the model
appears to be adequate. There are no highly influential observations.
(d) If X = 10, Ŷ = 20.2 − 0.145(10) = 18.75. If X = 35, Ŷ = 20.2 − 0.145(35) =
15.12. The first of these predictions seems reasonable. The second is unlikely.
Note that X = 35 is outside the range of the data, making prediction dangerous. For temperatures above about 20°C, it is unlikely electricity consumption
would continue to fall because no heating would be used. Instead, at high
temperatures (such as X = 35°C), electricity consumption is likely to increase
again due to the use of air-conditioning.
5.6 (a) When H = 130 and W = 45, r = 0.553.
(b) When H = 40 and W = 150, r = −0.001.
(c) The following table shows the influence of outliers at various positions.
H      W      r
129      0    -0.393
128     22     0.032
122     44     0.527
112     64     0.773
 99     83     0.846
 83     99     0.810
 65    112     0.627
 44    122     0.151
 22    128    -0.365
  0    129    -0.624
The point about all this is that an outlier (and skewness in general) can seriously
affect the correlation coefficient. It is a good idea to look at the scatterplot
before computing any correlation.
5.7 (a) See the plot on the next page. The winning time has been decreasing with year.
There is an outlier in 1896.
(b) The fitted line is Ŷ = 196−0.0768X where X denotes the year of the Olympics.
Therefore the winning time has been decreasing an average 0.0768 seconds per
year.
(c) The residuals are plotted on the next page. The residuals show random scatter
about 0 with only one unusual point (the outlier in 1896). But note that the
last five residuals are positive. This suggests that the straight line is “levelling
out”—the winning time is decreasing at a slower rate now than it was earlier.
[Figure: winning time plotted against year.]
Exercise 5.7(a): Scatterplot of winning times against year.
[Figure: residuals plotted against year.]
Exercise 5.7(c): Residual plot for linear regression model of winning times.
(d) The predicted winning time in the 2000 Olympics is
Ŷ = 196 − 0.0768(2000) = 42.50 seconds.
This would smash the world record. But given the previous five results (with
positive residuals), it would seem more likely that the actual winning time
would be higher. A prediction interval is
42.50 ± 2.0796(1.1762) = 42.50 ± 2.45 = [40.05, 44.95].
5.8 (a) There is strong seasonality with peaks in November and December and a trough
in January. The surfing festival shows as a smaller peak in March from 1988.
The variation in the series is increasing with the level and there is a strong
positive trend due to sales growth.
(b) Logarithms are necessary to stabilize the variance so it does not increase with
the level of the series.
(c) See the plot on the next page and the computer output on page 109. The fitted
line is Ŷ = −526.57 + 0.2706X where X is the year and Y is the logged annual
sales.
(d) X = 1994: Ŷ = −526.57 + 0.2706(1994) = 12.98
X = 1995: Ŷ = −526.57 + 0.2706(1995) = 13.25
X = 1996: Ŷ = −526.57 + 0.2706(1996) = 13.52
Prediction intervals (from computer output):
X = 1994: [12.57, 13.40]
X = 1995: [12.80, 13.71]
X = 1996: [13.03, 14.02]
(e) We transform the forecasts and intervals with the exponential function:
Total annual sales for 1994: exp(12.98) = $434,443
Total annual sales for 1995: exp(13.25) = $569,439
Total annual sales for 1996: exp(13.52) = $746,383
Prediction intervals:
X = 1994: [exp(12.57), exp(13.40)] = [286673, 658385]
X = 1995: [exp(12.80), exp(13.71)] = [361994, 895764]
X = 1996: [exp(13.03), exp(14.02)] = [455060, 1224208]
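The back-transformation in Exercise 5.8(e) can be sketched as below. The fits and 95% prediction limits are the unrounded values from the Minitab output for this exercise (the dollar figures quoted in the solution appear to be based on these rather than on the two-decimal values); the dictionary and function names are illustrative.

```python
import math

# (fit, lower 95% PI, upper 95% PI) on the log scale, from the Minitab output
log_forecasts = {
    1994: (12.9818, 12.5661, 13.3975),
    1995: (13.2524, 12.7994, 13.7054),
    1996: (13.5230, 13.0282, 14.0178),
}


def back_transform(fit, lower, upper):
    """Map a forecast and its interval from the log scale to the original scale."""
    return math.exp(fit), math.exp(lower), math.exp(upper)


sales = {year: back_transform(*values) for year, values in log_forecasts.items()}
for year, (fit, lower, upper) in sales.items():
    print(f"{year}: forecast ${fit:,.0f}, 95% PI [{lower:,.0f}, {upper:,.0f}]")
```

Because the exponential function is monotone, the end points of each interval are transformed directly; the resulting intervals are no longer symmetric about the forecast.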
[Figure: monthly sales plotted against time.]
Exercise 5.8(a): Time plot of sales figures.
[Figure: log total annual sales plotted against year (1987 to 1993), with the fitted line.]
Exercise 5.8(c): Regression line fitted to the logged sales data.
Output using Minitab for Exercise 5.8:
MTB > regress 'Log Sales' 1 'Year';
SUBC> predict 'new years';
The regression equation is
Log Sales = - 527 + 0.271 Year

Predictor    Coef       StDev      T         P
Constant     -526.57    46.44      -11.34    0.000
Year         0.27059    0.02334     11.60    0.000

S = 0.1235    R-Sq = 96.4%    R-Sq(adj) = 95.7%

Analysis of Variance
Source        DF    SS        MS        F         P
Regression    1     2.0501    2.0501    134.45    0.000
Error         5     0.0762    0.0152
Total         6     2.1263

Fit        StDev Fit    95.0% CI                95.0% PI
12.9818    0.1044       (12.7135, 13.2502)      (12.5661, 13.3975)
13.2524    0.1257       (12.9293, 13.5755)      (12.7994, 13.7054) X
13.5230    0.1476       (13.1435, 13.9025)      (13.0282, 14.0178) X
X denotes a row with X values away from the center
These prediction intervals are very wide because we are only using annual totals
in making these predictions. A more accurate method would be to fit a model
to the monthly data allowing for the seasonal patterns. This is discussed in
Chapter 7.
(f ) One way would be to calculate the proportion of sales for each month compared
to the total sales for that year. Averaging these proportions will give a rough
guide as to how to split the annual totals into 12 monthly totals.
[Figure: percentage mortality plotted against percentage of Type A birds.]
Exercise 5.9(a): Scatterplot of percentage mortality against percentage of Type A birds.
5.9 (a) The plot is shown above. The fitted line is
Ŷ = 4.38 + 0.0154X
where X = percentage of type A birds and Y = percentage mortality.
(b) From the computer output:
Predictor    Coef        StDev       T       P
Constant     4.3817      0.6848      6.40    0.000
% Type A     0.015432    0.007672    2.01    0.046
So the t-test is significant (since P < 0.05). A 95% confidence interval for the
slope is
0.01543 ± 1.976(0.007672) = 0.01543 ± 0.01516 = [0.0003, 0.031].
This suggests that the Type A birds have a higher mortality than the Type B
birds, the opposite to what the farmers claim.
(c) For a farmer using all Type A birds, X = 100. So Ŷ = 4.38 + 0.0154(100) =
5.92%. For a farmer using all Type B birds, X = 0. So Ŷ = 4.38%. Prediction
intervals for these are [2.363, 9.487] and [0.587, 8.177] respectively.
(d) R² = 2.6%. So only 2.6% of the variation in mortality is due to bird type.
(e) This information suggests that heat may be a lurking variable. If Type A birds
are being used more in summer and the mortality is higher in summer, then the
increased mortality of Type A birds may be due to the summer rather than the
bird type. A proper randomized experiment would need to be done to properly
assess whether bird type is having an effect here.
[Figure: gas consumption plotted against price, with the fitted lines for Model 1 and Model 2.]
Exercise 5.10(b): Scatterplot of gas consumption against price.
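The confidence interval for the slope in Exercise 5.9(b) follows the same estimate ± multiplier × standard error pattern used in Exercise 5.4(d). A minimal sketch, with an illustrative helper name and the multipliers quoted in the solutions:

```python
def conf_interval(estimate, std_err, multiplier):
    """Return (lower, upper) for estimate +/- multiplier * std_err."""
    half_width = multiplier * std_err
    return estimate - half_width, estimate + half_width


# Exercise 5.9(b): slope for % Type A, multiplier 1.976
low_59, high_59 = conf_interval(0.015432, 0.007672, 1.976)
print(f"5.9(b): [{low_59:.4f}, {high_59:.4f}]")

# Exercise 5.4(d): slope for manual dexterity, multiplier 2.10 (18 df)
low_54, high_54 = conf_interval(0.94306, 0.05961, 2.10)
print(f"5.4(d): [{low_54:.2f}, {high_54:.2f}]")
```

The lower limit in 5.9(b) sits only just above zero, which agrees with the P-value of 0.046 being only slightly below 0.05.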
5.10 (a) Cross sectional data. There is no time component.
(b) See the plot above.
(c) When the price is higher, the consumption may be lower due to the pressure of
increased cost. Therefore, we would expect b1 < b2 < 0.
(d) Model 1: First take logarithms of Yi, then use simple linear regression to obtain
a = 5.10, b = −0.0153, σe² = 0.0735.
Model 2: Split data into two groups. Fit each group separately using simple
linear regression to obtain
a1 = 221, b1 = −2.91 and a2 = 84.8, b2 = −0.447.
Using the equation given in the question, we obtain
σe² = 2913.7/16 = 182.06.
The fitted lines are shown on the graph above.
(e) Model 1: R² = r²(Y, Ŷ1) = 0.721.
Model 2: R² = r²(Y, Ŷ2) = 0.859. The second model is better, with the higher R² value.
The residual plots are given on the following page. Again, the second model is
much better, showing random scatter about zero. The first model shows a pattern
in the residuals.
(f ) The graph on page 114 shows a local linear regression through the data. The
fitted curve resembles the fitted lines for model 2. This suggests that model 2
is a reasonable model for the data. However, our approach has also meant the
two lines do not join at X = 60. A better model would force them to join. This
means the parameters must be restricted which makes the estimation much
harder.
(g) and (h) Using model 2, forecasts are obtained by
Ŷ = 220.9 − 2.906X when X ≤ 60,
Ŷ = 84.8 − 0.447X when X > 60,
and standard errors are obtained from (5.19):
s.e.(Ŷ) = √182.06 × √(1 + 1/20 + (X − 63)²/10672.11).
The 95% PI are obtained using Ŷ ± t∗(s.e.) where t∗ = 2.12 (from Table B with
16 df). Hence, we obtain the following values.
X      Ŷ         s.e.     95% PI
40     104.67    14.15    [74.7, 134.7]
60      46.55    13.83    [17.2, 75.9]
80      49.03    14.00    [19.3, 78.7]
100     40.09    14.65    [9.0, 71.1]
120     31.15    15.70    [−2.1, 64.4]
For example, at a price of 80c, the gas consumption will lie between 19.3 and
78.7 for 95% of towns.
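The forecasts and intervals in parts (g) and (h) can be sketched in code as follows. The coefficients, σe² = 182.06, X̄ = 63, Σ(X − X̄)² = 10672.11 and t∗ = 2.12 are the values quoted in the solution; the function names are illustrative, and small differences from the table arise from rounding.

```python
import math


def forecast(price):
    """Model 2: two straight lines, switching at a price of 60."""
    if price <= 60:
        return 220.9 - 2.906 * price
    return 84.8 - 0.447 * price


def std_error(price, sigma2=182.06, n=20, x_bar=63.0, sxx=10672.11):
    """Standard error of a forecast, in the form given by equation (5.19)."""
    return math.sqrt(sigma2) * math.sqrt(1 + 1 / n + (price - x_bar) ** 2 / sxx)


def prediction_interval(price, t_star=2.12):
    """95% prediction interval using the tabulated multiplier on 16 df."""
    y_hat = forecast(price)
    half_width = t_star * std_error(price)
    return y_hat - half_width, y_hat + half_width


for price in (40, 60, 80, 100, 120):
    low, high = prediction_interval(price)
    print(f"{price:4d}  {forecast(price):7.2f}  {std_error(price):6.2f}  "
          f"[{low:5.1f}, {high:5.1f}]")
```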
[Figure: residuals from model 1 and from model 2, each plotted against price.]
Exercise 5.10(e): Residual plots for the two models.
[Figure: consumption plotted against price, with a local linear regression curve.]
Exercise 5.10(f): Local linear regression through the gas consumption data. The fitted line
suggests that model 2 is more appropriate.
[Figure: consumption plotted against price, with prediction intervals.]
Exercise 5.10(h): 95% prediction intervals for gas consumption.
Chapter 6: Multiple regression
6.1 (a) df for numerator = k and for denominator = n − k − 1 where n = number of
observations and k = number of explanatory variables. Here, k = 16 so that
n − 16 − 1 = 30. Hence, n = 30 + 16 + 1 = 47.
(b) R̄² = 1 − (1 − R²)(n − 1)/(n − k − 1) = 1 − (1 − 0.943)(47 − 1)/(47 − 16 − 1) = 0.913.
(c) F = 31.04 on (17,30) df. From Table C in Appendix III, the P -value is much
smaller than 0.01. So the regression is highly significant.
(d) The coefficients should be compared with a t30 distribution. From Table B in
Appendix III, any value greater than 2.04 in absolute value will be significant
at the 5% level. So the constant and variables 4, 8, 12, 13, 14, 15 and 17 are
significant in the presence of other explanatory variables. Note that the significance level of 5% is arbitrary. There is no reason why some other significance
level (e.g. 2%) could not be used.
(e) The next stage would be to reduce the number of variables in the model by
removing some of the least significant variables and re-fitting the model.
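The adjusted R² calculation in part (b) is easy to check in code. The sketch below uses an illustrative function name; the second call checks the formula against the Minitab output for Exercise 5.2, where R² = 304.20/330.00 with n = 5 and k = 1 and the reported R-Sq(adj) is 89.6%.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k explanatory variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Exercise 6.1: denominator df = n - k - 1 = 30 with k = 16, so n = 47
k = 16
n = 30 + k + 1
print(f"n = {n}, adjusted R-sq = {adjusted_r2(0.943, n, k):.3f}")

# Check against the Minitab output for Exercise 5.2
print(f"Exercise 5.2: {adjusted_r2(304.20 / 330.00, 5, 1):.3f}")
```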
6.2 (a) The fitted model is Ĉ = 273.93 − 5.68P + 0.034P². For this model, R² = 0.8315.
[Recall: in exercise 5.10, model 1 had R² = 0.721 and model 2 had R² = 0.859.]
So the R̄² values for each model, using R̄² = 1 − (1 − R²)(n − 1)/(n − k − 1) with
n = 20 observations, are:
Model 1 (k = 1): R̄² = 1 − (1 − 0.721)(19/18) = 0.706.
Model 2 (k = 3): R̄² = 1 − (1 − 0.859)(19/16) = 0.833.
Model 3 (k = 2): R̄² = 1 − (1 − 0.832)(19/17) = 0.812.
These values show that model 2 is the best model, followed by model 3. The t
values for the coefficients are:
Model 1             Model 2              Model 3
α : t = 10.22       α1 : t = 10.33       β0 : t = 8.83
β : t = −5.47       β1 : t = −6.61       β1 : t = −5.62
                    α2 : t = 4.11        β2 : t = 4.57
                    β2 : t = −1.99
Of these, only β2 from model 2 is not significantly different from zero. This
suggests that a better model would be to allow the second part of model 2 to
be a constant rather than a linear function.
(b) From the computer output the following 95% prediction intervals are obtained.
Output using Minitab for Exercise 6.2:
MTB > regress 'C' 2 'P' 'Psq';
SUBC> predict 'newP' 'newPsq'.
The regression equation is
C = 274 - 5.68 P + 0.0339 Psq

Predictor    Coef        StDev       T        P
Constant     273.93      31.03        8.83    0.000
P            -5.676      1.009       -5.62    0.000
Psq          0.033904    0.007412     4.57    0.000

S = 14.37    R-Sq = 83.2%    R-Sq(adj) = 81.2%

Analysis of Variance
Source        DF    SS         MS        F        P
Regression    2     17327.0    8663.5    41.95    0.000
Error         17     3511.0     206.5
Total         19    20838.0

Source    DF    Seq SS
P         1     13005.7
Psq       1      4321.3

Fit       StDev Fit    95.0% CI              95.0% PI
173.97    14.29        (143.82, 204.13)      (131.21, 216.74) XX
101.14     4.77        ( 91.08, 111.21)      ( 69.19, 133.10)
 55.43     4.91        ( 45.07,  65.80)      ( 23.38,  87.48)
 36.85     4.95        ( 26.40,  47.29)      (  4.77,  68.92)
 45.38     7.14        ( 30.31,  60.46)      ( 11.52,  79.25)
 81.04    18.49        ( 42.02, 120.06)      ( 31.62, 130.46) XX
X denotes a row with X values away from the center
XX denotes a row with very extreme X values
[Figure: gas consumption plotted against price (20 to 120), with the fitted quadratic curve.]
Exercise 6.2: Quadratic regression of gas consumption against price. 95% prediction intervals shown.
P      Ĉ         95% PI
20     173.97    [131.21, 216.74]
40     101.14    [69.19, 133.10]
60      55.43    [23.38, 87.48]
80      36.85    [4.77, 68.92]
100     45.38    [11.52, 79.25]
120     81.04    [31.62, 130.46]
It is clear from the plot that it is dangerous predicting outside the observed
price range. In this case, the predictions at P = 20 and P = 120 are almost certainly wrong. Predicting outside the range of the explanatory variable is always
dangerous, but much more so when a quadratic (or higher-order polynomial) is
used.
(c) r(P, P²) = 0.990. If we were to use P, P² and P³, the correlations among these
explanatory variables would be very high and we would have a serious multicollinearity problem on our hands. The coefficient estimates would be unstable
(i.e. have large standard errors). Multicollinearity will often be a problem with
polynomial regression.
6.3 (a) From Table 6-15, we obtain the following values
Period    Actual    Forecast
54         4.646    1.863
55         1.060    1.221
56        -0.758    0.114
57         4.702    2.779
58         1.878    1.959
59         6.620    5.789
Analysis of errors: periods 54 through 59.
ME      MAE     MSE     MPE      MAPE     ACF1     Theil’s U
0.74    1.11    2.15    34.82    41.32    -0.35    0.34
Strictly speaking, we should not compute relative measures when the data cross
the zero line (i.e., when there are positive and negative values) because relative
measures will “blow up” if divided by zero.
(b) and (c) Optimizing the coefficients for Holt’s method will give better forecasts.
Another approach is to use a simple MA forecast. An MA(2) forecast actually
works better than Holt’s method for both series. Other approaches are also
possible.
Calculate accuracy statistics for your forecasts and compare them with the
forecasts in Table 6-14.
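The error measures in part (a) can be recomputed directly from the six actual and forecast values listed there, and the same function can be reused for the comparison suggested in parts (b) and (c). This sketch covers ME, MAE, MSE, MPE and MAPE only; the function name is illustrative, and ACF1 and Theil's U are omitted.

```python
actual = [4.646, 1.060, -0.758, 4.702, 1.878, 6.620]     # periods 54-59
forecast = [1.863, 1.221, 0.114, 2.779, 1.959, 5.789]


def error_measures(actual, forecast):
    """Standard and relative accuracy measures (relative ones in percent)."""
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    return {
        "ME": sum(errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": sum(e * e for e in errors) / n,
        # Relative measures divide by the actual value, so they are
        # unreliable when the data are near or cross zero.
        "MPE": 100 * sum(e / a for e, a in zip(errors, actual)) / n,
        "MAPE": 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
    }


measures = error_measures(actual, forecast)
for name, value in measures.items():
    print(f"{name:5s} {value:7.2f}")
```

Period 56, where the actual value is −0.758, contributes a percentage error above 100%, which illustrates the warning about relative measures for data that cross zero.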
6.4 (a) The fitted equation is
Ŷ = 73.40 + 1.52X1 + 0.38X2 − 0.27X3 .
95% confidence intervals for the parameters are calculated using a t6 distribution. So the multiplier is 2.45:
73.40 ± 2.45(14.687) = [37.46, 109.3]
1.52 ± 2.45(0.1295) = [1.20, 1.84]
0.38 ± 2.45(0.1941) = [−0.09, 0.85]
−0.27 ± 2.45(0.1841) = [−0.72, 0.18]
(b) F = 123.3 on (3,6) df. P = 0.000. This means that the probability of results
like this, if the three explanatory variables were not relevant, is very small.
(c) The residual plots on page 120 show the model is satisfactory. There is no
pattern in any of the residual plots.
(d) R2 = 0.984. Therefore 98.4% of the variation in Y is explained by the regression
relationship.
Output using Minitab for Exercise 6.4:
MTB > Regress 'Y' 3 'X1' 'X2' 'X3';
SUBC> Predict 10 40 30;
SUBC> Confidence 90.
The regression equation is
Y = 73.4 + 1.52 X1 + 0.381 X2 - 0.268 X3

Predictor    Coef       StDev     T        P
Constant     73.40      14.69      5.00    0.002
X1           1.5162     0.1295    11.71    0.000
X2           0.3815     0.1941     1.97    0.097
X3           -0.2685    0.1841    -1.46    0.195

S = 2.326    R-Sq = 98.4%    R-Sq(adj) = 97.6%

Analysis of Variance
Source        DF    SS         MS        F         P
Regression    3     2001.54    667.18    123.32    0.000
Error         6       32.46      5.41
Total         9     2034.00

Source    DF    Seq SS
X1        1     1118.36
X2        1      871.67
X3        1       11.51

Fit       StDev Fit    90.0% CI              90.0% PI
95.762    1.632        (92.590, 98.934)      (90.239, 101.285)
[Figure: residuals plotted against X1, X2 and X3 in three panels.]
Exercise 6.4(c): Residual plots for the cement data.
(e) The signs of the coefficients indicate the direction of the effect of each variable.
X1 increases heat and has the greatest effect (the largest coefficient). The other
variables are not significant, so they may not have any effect. If they do, the
coefficients suggest that X2 might increase heat and X3 might decrease heat.
(f ) For X1 = 10, X2 = 40 and X3 = 30, Ŷ = 73.40+1.52(10)+0.38(40)−0.27(30) =
95.76. 90% Prediction interval: [90.24,101.29]
6.5 The data for this exercise were taken from McGee and Carleton (1970) “Piecewise regression”, Journal of the American Statistical Association, 65, 1109–1124. It might
be worthwhile to get this paper to compare what conventional regression can accomplish when there are special features in the data. In this case, the relationship
between the Boston dollar volume and the NYSE-AME dollar volume underwent a
series of changes over the time period of interest. In this paper, the solution was as
follows:
from Jan ’67 through Oct ’67:    Ŷ = 8.748 + 0.0061X
from Nov ’67 through Jul ’68:    Ŷ = −20.905 + 0.0114X
from Aug ’68 through Nov ’68:    Ŷ = −79.043 + 0.0205X
from Dec ’68 through Nov ’69:    Ŷ = 11.075 + 0.0067X
Notice the slope coefficients in these four equations. They are small (because
Boston’s dollar volume is small relative to the big board volumes) but they get
steadily larger (from 61 to 114 to 205, in units of 0.0001) in successive periods of commission
splitting. Then in Dec ’68, the SEC said “no more commission splitting” and it
hurt the Boston dollar volume. The slope went back to 67, which is almost where it
started.
(a) The fitted equation is Ŷ = −66.2 + 0.014X. The following output was obtained
from a computer package.
               Value       Std. Error    t value    Pr(>|t|)
(Intercept)    -66.2193    39.6809       -1.6688    0.1046
X                0.0138     0.0029        4.7856    0.0000
F statistic: 22.9 on 1 and 33 degrees of freedom
the p-value is 0.00003465
R-sq = 0.4097 Rbar-sq = 0.3918 D-W = 0.694
Clearly, the regression is significant, although the intercept is not significant.
(b) Output from computer package:
               Value       Std. Error    t value    Pr(>|t|)
(Intercept)    -67.2116    40.2550       -1.6696    0.1047
X                0.0135     0.0030        4.5025    0.0001
time             0.2737     0.6518        0.4199    0.6773
F statistic: 11.25 on 2 and 32 degrees of freedom
the p-value is 0.0001992
R-sq = 0.4129 Rbar-sq = 0.3762 D-W = 0.6814
Here, the regression is significant, but time is not significant. In fact, comparing
these two models shows that adding time to the regression equation is actually
worse than not adding it. See the R̄² values. And for both analyses, the D-W
statistic shows that there is a lot of pattern left in the residuals. A piecewise
regression approach does far better with this data set.
[Figure: Y plotted against X, with successive points connected.]
Exercise 6.5(c): Connected scatterplot for the Boston and American stock exchanges.
(c) See the plot above.
6.6 (a) and (b) Here are the seasonality indices based on the regression equations
(6.10) and (6.12). They represent the intercept term in the regression for each
of the 12 first differences.
           Using (6.10)    Using (6.12)
Mar-Feb    -2.6             -6.2
Apr-Mar    -6.7            -10.6
May-Apr    -3.5             -7.4
Jun-May    -5.3             -9.2
Jul-Jun    -3.6             -7.4
Aug-Jul    -5.2             -9.2
Sep-Aug    -5.9             -9.7
Oct-Sep    -6.9            -10.7
Nov-Oct    -4.1             -7.9
Dec-Nov    -4.7             -8.5
Jan-Dec    -0.8             -4.6
Feb-Jan    -2.2             -6.2
These two sets of seasonal indices are not quite the same. In the first equation (6.10), all eleven dummy variables for seasonality were allowed to be in
the regression. In the second equation (6.12), the best subsets regression procedure did not allow the first seasonal dummy into the final equation. The
absolute values are not so important because, in the presence of different sets
of explanatory variables, we expect the intercept terms to be different.
(c) The seasonal indices should be the same regardless of which month is used as
a base.
6.7 (a) Yt = 78.7 + 0.534xt + et
(b) DW = 0.57. dL = 1.04 at 1% level. Therefore there is significant positive
autocorrelation.
Chapter 7: The Box-Jenkins methodology for ARIMA models
7.1 (a) In general, the approximate standard error of the sample autocorrelations is
1/√n. So the larger the value of n, the smaller the standard error. Therefore,
the ACF has more variation for small values of n than for large values of n. All
three series show the autocorrelations mostly falling within the 95% bands. The
few that lie just outside the bands are not of concern since we would expect
about 5% of spikes to cross the bands. There is no reason to think these series
are anything but white noise.
(b) The lines shown are 95% critical values. These are calculated as ±1.96/√n. So
they are closer to zero when n is larger. The autocorrelations vary randomly,
but they mostly stay within the bounds.
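The effect of n on the critical values can be illustrated with a short calculation. This is a sketch only: the function name `acf_bounds` and the three series lengths are assumptions made for illustration, not values taken from the exercise.

```python
import math

def acf_bounds(n, z=1.96):
    """Approximate 95% critical values (+/- z / sqrt(n)) for white-noise autocorrelations."""
    half_width = z / math.sqrt(n)
    return -half_width, half_width

# Hypothetical series lengths, chosen only to show how the band narrows as n grows.
for n in (36, 360, 1000):
    lower, upper = acf_bounds(n)
    print(f"n = {n:4d}: ({lower:.3f}, {upper:.3f})")
```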
7.2 The time plot shows that the series has a non-stationary level. It wanders up and down
over time in a similar way to a random walk. The ACF decays very slowly which
also indicates non-stationarity in the level. Finally, the PACF has a very large value
at lag 1, indicating the data should be differenced.
7.3 The five models are
AR(1)       Yt = 0.6Yt−1 + et.
MA(1)       Yt = et + 0.6et−1.
ARMA(1,1)   Yt = 0.6Yt−1 + et + 0.6et−1.
AR(2)       Yt = −0.8Yt−1 + 0.3Yt−2 + et.
MA(2)       Yt = et + 0.8et−1 − 0.3et−2.
In each case, we assume Yt = 0 and et = 0 for t ≤ 0. The generated data are shown
on the following two pages. There is a lot of similarity in the shapes of the series
because they are based on exactly the same errors.
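The way the five series are built from one common set of errors can be sketched as follows. The function `simulate_arma` is a hypothetical helper, not part of any package, and the three errors used are backed out from the first three AR(1) values in the table of generated data (an assumption made here so that the sketch can be compared with that table).

```python
def simulate_arma(errors, ar=(), ma=()):
    """Generate Y_t = sum(ar[i] * Y_{t-1-i}) + e_t + sum(ma[j] * e_{t-1-j}).

    Values of Y and e before the start of the series are taken to be zero,
    as assumed in the solution.
    """
    series = []
    for t, error in enumerate(errors):
        value = error
        for i, phi in enumerate(ar):
            if t - 1 - i >= 0:
                value += phi * series[t - 1 - i]
        for j, theta in enumerate(ma):
            if t - 1 - j >= 0:
                value += theta * errors[t - 1 - j]
        series.append(value)
    return series

# First three errors, recovered from the AR(1) column: e_t = Y_t - 0.6 * Y_{t-1}.
errors = [0.010, 1.380, 0.5304]

models = {
    "AR(1)": dict(ar=[0.6]),
    "MA(1)": dict(ma=[0.6]),
    "ARMA(1,1)": dict(ar=[0.6], ma=[0.6]),
    "AR(2)": dict(ar=[-0.8, 0.3]),
    "MA(2)": dict(ma=[0.8, -0.3]),
}
generated = {name: simulate_arma(errors, **spec) for name, spec in models.items()}
for name, values in generated.items():
    print(name, [round(v, 3) for v in values])
```

Because every model is driven by the same `errors` list, the resulting series share their broad shape, as noted above.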
7.4 (a) The ACF is slow to die out and the time plot shows the series wandering in a
non-stationary way. So we take first differences. The ACF of the first differences
shows one significant spike at lag 1, indicating an MA(1) is appropriate. So the
model for the raw data is ARIMA(0,1,1).
(b) There is no consistent trend in the raw data and the differenced data have
mean close to zero. Therefore, there is no need to include a constant term.
(c) (1 − B)Yt = (1 − θ1B)et.
(d) See the Minitab output for Exercise 7.4 below. There may be slight differences with different
software packages and even different versions of the same package. The Ljung-Box
statistics are not significant and the ACF and PACF of residuals show no
significant differences from white noise.
Generated data for Exercise 7.3
 t     AR(1)    MA(1)   ARMA(1,1)    AR(2)    MA(2)
 1     0.010    0.010     0.010      0.010    0.010
 2     1.386    1.386     1.392      1.372    1.388
 3     1.362    1.358     2.193     -0.565    1.631
 4     2.397    1.898     3.214      2.443    1.590
 5     2.758    2.268     4.196     -0.804    2.425
 6     2.695    1.832     4.350      2.416    1.622
 7     1.947    0.954     3.564     -1.844    0.766
 8     0.968   -0.002     2.136      2.000   -0.248
 9     2.481    1.780     3.062     -0.253    1.641
10     2.209    1.860     3.697      1.523    2.300
11     1.055    0.162     2.380     -1.564   -0.264
12    -0.797   -1.592    -0.164      0.278   -1.862
13    -1.628   -2.008    -2.106     -1.842   -2.213
14    -1.047   -0.760    -2.024      1.487   -0.561
15     1.062    1.648     0.434     -0.052    1.979
16     0.917    1.294     1.554      0.768    1.653
17     0.560    0.178     1.111     -0.620   -0.273
18     1.276    0.946     1.612      1.666    0.864
19    -1.334   -1.536    -0.569     -3.619   -1.351
20    -0.711   -1.170    -1.511      3.485   -1.872
21     0.484    0.964     0.057     -2.964    1.612
22     2.050    2.306     2.340      5.176    2.461
23     2.070    1.896     3.300     -4.190    1.975
24     0.112   -0.626     1.354      3.775   -0.986
25     0.987    0.242     1.054     -3.357   -0.236
26     2.262    2.222     2.855      5.488    2.745
27     0.327   -0.028     1.685     -6.428    0.030
28    -1.514   -2.328    -1.317      5.079   -3.035
29     0.272    0.154    -0.636     -4.811    0.121
30    -0.427    0.118    -0.264      4.783    0.867
[Figure: Exercise 7.3: Simulated ARMA series. Panels: AR(1), MA(1), ARMA(1,1), AR(2), MA(2); horizontal axis: t from 0 to 30.]
Output using Minitab for Exercise 7.4:
MTB > ARIMA 0 1 1 'Strikes';
SUBC> NoConstant;
SUBC> Forecast 3.

Final Estimates of Parameters
Type      Coef     StDev      T
MA   1    0.3174   0.1886     1.68

Differencing: 1 regular difference
Number of observations: Original series 30, after differencing 29
Residuals:  SS = 9256634 (backforecasts excluded)
            MS = 330594   DF = 28

Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag           12            24
Chi-Square    8.1(DF=11)    34.1(DF=23)

Forecasts from period 30
                        95 Percent Limits
Period    Forecast      Lower       Upper
31        4164.87       3037.70     5292.04
32        4164.87       2800.11     5529.63
33        4164.87       2598.14     5731.60
(e) The last observation is Y30 = 3885; the last residual in the series is ê30 = −881.87
(obtained from the computer package). Now
Yt = Yt−1 + et − 0.3174et−1.
So
Ŷ31 = Y30 + ê31 − 0.3174ê30
= 3885 + 0 − 0.3174(−881.87) = 4164.9
Ŷ32 = Ŷ31 + 0 − 0.3174(0) = 4164.9
Ŷ33 = Ŷ32 + 0 − 0.3174(0) = 4164.9
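The hand calculation in (e) can be reproduced with a few lines of code. This is a sketch under the assumptions stated in the solution (future errors replaced by zero); the function name `arima011_forecasts` is hypothetical.

```python
def arima011_forecasts(last_obs, last_resid, theta, horizon):
    """Point forecasts for (1 - B)Y_t = (1 - theta*B)e_t, with future errors set to zero."""
    forecasts = []
    previous, previous_resid = last_obs, last_resid
    for _ in range(horizon):
        forecast = previous - theta * previous_resid
        forecasts.append(forecast)
        # Beyond one step ahead, the previous "residual" is an unknown future error: use 0.
        previous, previous_resid = forecast, 0.0
    return forecasts

# Last observation, last residual and MA coefficient as quoted in the solution.
forecasts = arima011_forecasts(3885.0, -881.87, 0.3174, 3)
print([round(f, 1) for f in forecasts])
```

Only the first step uses the last residual; after that the forecast function is flat, which is why all three forecasts coincide.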
(f) See the graph below.
[Figure: Exercise 7.4(f): Predicted number of strikes in USA. 95% prediction intervals shown. Vertical axis: number of strikes in USA; horizontal axis: 1950 to 1980.]
7.5 (a) The monthly data show strong seasonality and the seasonal pattern is reasonably stable. There is no trend in the data (this is a mature product).
(b) The pattern in the ACF plot shows the dominance of the seasonality. The
autocorrelations at lags 6, 18 and 30 are negative (because we are correlating
the high periods with the low periods) and at lags 12, 24 and 36 they are
positive (because we are correlating high periods with high periods).
(c) The pattern in the PACF plot is not particularly revealing. However, there
is little need to try to interpret this plot when the analysis clearly shows the
dominance of the seasonality. The best approach would be to difference the
series to reduce the effect of the seasonality and then see what is left over.
(d) These graphs suggest a seasonal MA(1) because of the spike at lag 12 in the
ACF and the decreasing spikes at lags 12 and 24 in the PACF. Overall, the
suggested model is ARIMA(0,1,0)(0,1,1)12.
(e) Using the backshift operator: (1 − B)(1 − B^12)Yt = (1 − ΘB^12)et. Rewriting
gives
Yt − Yt−12 − Yt−1 + Yt−13 = et − Θet−12.
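The rewriting step in (e) amounts to multiplying two polynomials in the backshift operator B. A minimal sketch, with a hypothetical helper `poly_mul` that represents each polynomial as a list of coefficients indexed by the power of B:

```python
def poly_mul(a, b):
    """Multiply two polynomials in B given as coefficient lists (index = power of B)."""
    product = [0.0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            product[i + j] += a_i * b_j
    return product

regular = [1.0, -1.0]                    # 1 - B
seasonal = [1.0] + [0.0] * 11 + [-1.0]   # 1 - B^12

lhs = poly_mul(regular, seasonal)
# Keep only the lags with a non-zero coefficient.
nonzero = {lag: coef for lag, coef in enumerate(lhs) if coef != 0.0}
print(nonzero)
```

The non-zero lags and their signs correspond to the terms Yt, Yt−1, Yt−12 and Yt−13 on the left-hand side of the rewritten equation.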
7.6 (a) ARIMA(3,1,0).
(b) For the differenced data, the PACF has significant spikes at lags 1, 2 and
3 and a spike at lag 17 which is marginally significant. The spike at lag 17
is probably due to chance. Therefore an AR(3) is an appropriate model for
the differenced data. Consequently, an ARIMA(3,1,0) model is suitable for the
original data.
(c) Now
(Yt − Yt−1 ) = 0.42(Yt−1 − Yt−2 ) − 0.20(Yt−2 − Yt−3 ) − 0.30(Yt−3 − Yt−4 ) + et .
Therefore
Yt = 1.42Yt−1 − 0.62Yt−2 − 0.10Yt−3 + 0.30Yt−4 + et
and
Ŷ1940 = 1.42(1797) − 0.62(1791) − 0.10(1627) + 0.30(1665) = 1778.1
Ŷ1941 = 1.42(1778.1) − 0.62(1797) − 0.10(1791) + 0.30(1627) = 1719.8
Ŷ1942 = 1.42(1719.8) − 0.62(1778.1) − 0.10(1797) + 0.30(1791) = 1697.3
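The two steps in (c), expanding the AR(3) model for the differences into an equation for the levels and then forecasting recursively, can be sketched as follows. Both function names are hypothetical, and the four starting values are the 1936 to 1939 observations implied by the first forecast equation.

```python
def level_coefficients(diff_ar):
    """Expand (1 - phi1*B - ... - phip*B^p)(1 - B) into AR coefficients on the levels."""
    poly = [1.0] + [-phi for phi in diff_ar]      # 1 - phi1*B - ... - phip*B^p
    expanded = [0.0] * (len(poly) + 1)
    for power, coef in enumerate(poly):
        expanded[power] += coef                   # contribution of coef * 1
        expanded[power + 1] -= coef               # contribution of coef * (-B)
    return [-coef for coef in expanded[1:]]       # move the lagged terms to the right-hand side

def ar_forecasts(history, coefs, horizon):
    """Recursive forecasts; history is ordered oldest first, future errors set to zero."""
    values = list(history)
    forecasts = []
    for _ in range(horizon):
        nxt = sum(phi * values[-(lag + 1)] for lag, phi in enumerate(coefs))
        forecasts.append(nxt)
        values.append(nxt)
    return forecasts

coefs = level_coefficients([0.42, -0.20, -0.30])
forecasts = ar_forecasts([1665, 1627, 1791, 1797], coefs, 3)
print([round(c, 2) for c in coefs])
print([round(f, 1) for f in forecasts])
```

Small differences from the printed values in the last decimal place can arise because the hand calculation rounds each forecast before reusing it.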
7.7 (a) ARIMA(4,0,0).
(b) The model was chosen because the last significant spike in the PACF was at
lag 4. Note that the spikes at lags 2 and 3 were not significant. This makes
no difference. It is the last significant spike which determines the order of the
model.
(c) The model is
Yt = 146.1 + 0.891Yt−1 − 0.257Yt−2 + 0.392Yt−3 − 0.333Yt−4 + et .
So
Ŷ1969 = 146.1 + 0.891(545) − 0.257(552) + 0.392(534) − 0.333(512) = 528.7
Ŷ1970 = 146.1 + 0.891(528.7) − 0.257(545) + 0.392(552) − 0.333(534) = 515.7
Ŷ1971 = 146.1 + 0.891(515.7) − 0.257(528.7) + 0.392(545) − 0.333(552) = 499.5
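The same recursion, this time with a constant term, reproduces the three forecasts in (c). This is a sketch with a hypothetical function name; the starting values are the 1965 to 1968 observations implied by the first forecast equation.

```python
def ar_forecasts_with_constant(history, constant, coefs, horizon):
    """Recursive AR forecasts; history is ordered oldest first, future errors set to zero."""
    values = list(history)
    forecasts = []
    for _ in range(horizon):
        nxt = constant + sum(phi * values[-(lag + 1)] for lag, phi in enumerate(coefs))
        forecasts.append(nxt)
        values.append(nxt)
    return forecasts

forecasts = ar_forecasts_with_constant(
    [512, 534, 552, 545],              # Y1965, Y1966, Y1967, Y1968
    146.1,
    [0.891, -0.257, 0.392, -0.333],
    3,
)
print([round(f, 2) for f in forecasts])
```

The code carries full precision from one step to the next, whereas the hand calculation reuses forecasts rounded to one decimal place, so the later forecasts may differ slightly from the printed ones.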
7.8 (a) The centered 12-MA smooth is shown in the plot below. The trend
is generally linear and increasing with a flat period between 1990 and 1993.
[Figure: Exercise 7.8(a): Total net generation of electricity in USA. Vertical axis: US electricity generation; horizontal axis: year, 1985 to 1997.]
(b) The variation does not change much with the level, so transforming will not
make much difference.
(c) The data are not stationary. There is a trend and seasonality in the data.
Differencing at lag 12 gives the data shown in the plot for Exercise 7.8(c). These
appear stationary although it is possible another difference at lag 1 is needed.
(d) From the plots for Exercise 7.8(c) it is clear there is a seasonal MA component of order
1. In addition there is a significant spike at lag 1 in both the ACF and PACF.
Hence plausible models are ARIMA(1,0,0)(0,1,1)12 and ARIMA(0,0,1)(0,1,1)12.
Comparing the two models we have the following results:
ARIMA(1,0,0)(0,1,1)12    AIC=900.2
ARIMA(0,0,1)(0,1,1)12    AIC=926.9
Hence the better model is the first one. Note that different packages will give
different values for the AIC depending on how it is calculated. Therefore the
same package should be used for all calculations.
(e) The residuals from the ARIMA(1,0,0)(0,1,1)12 model are shown in the plots for
Exercise 7.8(e). Because there are significant spikes in the ACF and PACF, the model is
not adequately describing the series. These plots suggest we need to add an
MA(1) term to the model. So we fit the revised model ARIMA(1,0,1)(0,1,1)12.
This time, the residual plots (not shown here) look like white noise. The AIC
is 876.7. Part of the computer output for fitting the revised model is shown
below.
                        Approx.
Parameter   Estimate    Std Error    T Ratio   Lag
MA1,1       0.74427     0.05887       12.64      1
MA2,1       0.77650     0.09047        8.58     12
AR1,1       0.99566     0.0070613    141.00      1
So the fitted model is
(1 − 0.996B)(1 − B^12)Yt = (1 − 0.744B)(1 − 0.777B^12)et.
[Figure: Exercise 7.8(c): Seasonally differenced electricity generation. Panels: electricity data differenced at lag 12 (time plot, 1986 to 1997), ACF and PACF (lags 0 to 40).]
[Figure: Exercise 7.8(e): Residuals from ARIMA(1,0,0)(0,1,1)12 model fitted to the electricity data. Panels: residuals (time plot, 1985 to 1997), ACF and PACF (lags 0 to 40).]
Output using SAS for Exercise 7.8:
                        Approx.
Parameter   Estimate    Std Error    T Ratio   Lag
MA1,1       0.86486     0.06044       14.31      1
MA2,1       0.80875     0.09544        8.47     12
AR1,1       0.27744     0.10751        2.58      1

Variance Estimate   = 41.8498466
Std Error Estimate  = 6.46914574
AIC                 = 864.616345
SBC                 = 873.195782
Number of Residuals = 129

Autocorrelations of Residuals
To    Chi
Lag   Square   DF   Prob    Autocorrelations
 6     1.60     3   0.659   0.028  -0.036  -0.020  -0.010  -0.014   0.095
12     7.67     9   0.568   0.004  -0.082   0.073   0.128  -0.095   0.072
18    15.32    15   0.429   0.126   0.016  -0.091   0.125  -0.105  -0.020
24    18.67    21   0.607   0.065  -0.048   0.051   0.005   0.069  -0.085
Note that the first factor on the left of the fitted model, (1 − 0.996B), is almost the same as the differencing operator (1 − B).
This suggests that we probably should have taken a first difference as well as a
seasonal difference. We repeated the above analysis and arrived at the following
model: ARIMA(1,1,1)(0,1,1)12 which has AIC=864.6.
The computer output for the final model is shown above. The figures under
the heading Chi Square concern the Ljung-Box test. Clearly the model passes
the test (see Table E in Appendix III).
(f) Forecasts for the next 24 months are given in the table below.
Month    Obs   Forecast   Std Error   Lower 95%   Upper 95%
Nov 96   143   240.1614    6.4691     227.4821    252.8407
Dec 96   144   262.5516    6.9981     248.8356    276.2677
Jan 97   145   270.2423    7.1820     256.1659    284.3187
Feb 97   146   244.0064    7.3027     229.6934    258.3194
Mar 97   147   249.8899    7.4074     235.3718    264.4081
Apr 97   148   232.7683    7.5069     218.0550    247.4816
May 97   149   249.3720    7.6042     234.4680    264.2759
Jun 97   150   270.7257    7.6999     255.6341    285.8173
Jul 97   151   295.5439    7.7944     280.2671    310.8207
Aug 97   152   295.6598    7.8878     280.2000    311.1196
Sep 97   153   257.1358    7.9800     241.4952    272.7764
Oct 97   154   246.4526    8.0712     230.6332    262.2719
Nov 97   155   245.0224    8.4340     228.4920    261.5528
Dec 97   156   267.1930    8.6077     250.3222    284.0638
Jan 98   157   274.8228    8.7406     257.6914    291.9541
Feb 98   158   248.5699    8.8622     231.2003    265.9395
Mar 98   159   254.4488    8.9796     236.8491    272.0484
Apr 98   160   237.3258    9.0948     219.5004    255.1513
May 98   161   253.9291    9.2083     235.8811    271.9772
Jun 98   162   275.2828    9.3205     257.0150    293.5506
Jul 98   163   300.1009    9.4313     281.6160    318.5858
Aug 98   164   300.2168    9.5407     281.5173    318.9163
Sep 98   165   261.6929    9.6490     242.7812    280.6045
Oct 98   166   251.0096    9.7560     231.8881    270.1311
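The 95% limits in the forecast table appear to be the point forecast plus or minus 1.96 standard errors. The sketch below checks this for three rows copied from the table; the multiplier 1.96 and the helper name `interval` are assumptions made for illustration.

```python
Z_95 = 1.96  # assumed normal multiplier for a 95% interval

def interval(forecast, std_error, z=Z_95):
    """Symmetric prediction interval: forecast +/- z * standard error."""
    return forecast - z * std_error, forecast + z * std_error

# (forecast, std error, tabulated lower, tabulated upper) for Nov 96, Dec 96 and Oct 98.
rows = [
    (240.1614, 6.4691, 227.4821, 252.8407),
    (262.5516, 6.9981, 248.8356, 276.2677),
    (251.0096, 9.7560, 231.8881, 270.1311),
]

# Largest absolute gap between the computed and tabulated limits across the checked rows.
max_gap = 0.0
for forecast, std_error, lower, upper in rows:
    computed_lower, computed_upper = interval(forecast, std_error)
    max_gap = max(max_gap, abs(computed_lower - lower), abs(computed_upper - upper))
print(f"largest gap between computed and tabulated limits: {max_gap:.4f}")
```

The widening of the intervals with the forecast horizon comes entirely from the growing standard error in the fourth column.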
7.9 (a) See the plot on the following page. Note that there is strong seasonality and
a pronounced trend-cycle. One way to study the consistency of the seasonal
pattern is to compute the seasonal sub-series and see how stable each month
is. The results are given below.
Year     Jan    Feb    Mar    Apr    May    Jun    Jul    Aug    Sep    Oct    Nov    Dec
1955:   94.7   94.0   96.5  101.3  102.4  103.7  104.5  104.3  104.1  101.2   98.3   95.4
1956:   94.1   93.5   96.8  103.1  104.1  102.8  103.7  103.6  103.6  101.7   98.6   96.8
1957:   95.9   96.8   99.0   97.7   99.5  101.1  102.0  103.3  105.1  103.2   99.8   96.7
1958:   95.0   94.8   96.1  100.4  101.7  102.1  103.6  104.9  104.4  101.1   96.7   95.3
1959:   93.9   94.5   96.4  100.9  102.1  103.3  104.7  106.0  104.4  100.9   98.7   96.1
1960:   95.7   95.7   95.1   98.9  100.8  102.5  104.6  106.0  104.0  100.1   98.0   96.7
1961:   95.2   94.8   96.5  101.3  101.7  103.7  105.2  105.3  104.3  101.2   97.7   96.5
1962:   93.5   93.7   95.6  100.7  102.3  102.5  104.4  106.4  103.5  101.0   97.6   96.9
1963:   94.6   93.4   95.5   99.1  100.8  104.1  106.1  107.4  104.1  100.7   97.9   97.4
1964:   93.6   93.2   94.6   98.6  100.2  103.5  106.6  107.5  103.6  101.7   97.9   96.9
1965:   95.6   92.7   94.0   96.7   99.4  103.7  108.2  108.0  104.7  100.5   98.4   99.6
1966:   97.0   93.7   95.2   97.0   98.4  104.1  105.9  107.2  104.2   99.7   97.1   96.8
1967:   93.9   93.6   94.1   99.0  102.5  105.7  109.2  109.9  104.9   99.8   98.3   93.8
1968:   91.2   91.7   94.5   99.0  101.9  103.1  105.7  106.0  103.5  100.2  100.7   99.1
1969:   96.0   94.3   94.1   96.8  100.7  104.5  106.3  107.2  103.7  102.5  100.2   99.4
1970:   95.8   93.0   92.0   96.0  100.2  103.7  106.0  105.8  102.7   98.9   97.1   96.5
These detrended data are relatively consistent from year to year with only minor
variations occurring here and there. For example, December 1967 and January
and February 1968 were noticeably lower than surrounding years.
[Figure: Exercise 7.9(a): Employment in the motion picture industry. Panels: time plot (1955 to 1971), ACF and PACF (lags 0 to 40).]
[Figure: Exercise 7.9(a): Sub-series of detrended data. Vertical axis: ratios; horizontal axis: month, Jan to Dec.]
Another way to look at seasonal patterns is via autocorrelation functions. Note
that for the raw data, the ACF shows strong seasonality over several seasonal
lags. This is further evidence of the consistency of the seasonal pattern. The
plot below shows the detrended data. Again, the seasonal pattern is very consistent although the amplitude of the pattern each year varies.
Unusual results in early 1968 and early 1970 are seen.
[Figure: Exercise 7.9(a): Seasonal differences of employment in the motion picture industry. Panels: detrended employment in motion picture industry (time plot, 1955 to 1971), ACF and PACF (lags 0 to 40).]
(b) For the first 96 months, we identified an ARIMA(0,1,0)(0,1,1)12. For the second
96 months, we identified an ARIMA(0,1,0)(1,1,0)12. In practice, there is little
difference between these models. This means that once the trend has been
eliminated (by differencing), the seasonal patterns are very similar.
(c) Using the above ARIMA(0,1,0)(0,1,1)12 model, we obtained the following forecasts.
Month      Actual   Forecast   Upper (95%)   Lower (95%)    Error
Jan 1963   167.2    167.3286    172.4228      162.2345     -0.1286
Feb 1963   165.0    166.7030    171.7902      161.6159     -1.7030
Mar 1963   168.8    167.7141    172.8012      162.6269      1.0859
Apr 1963   175.0    176.5972    181.6844      171.5100     -1.5972
May 1963   177.9    176.9498    182.0369      171.8626      0.9502
Jun 1963   183.7    179.2212    184.3084      174.1340      4.4788
Jul 1963   187.2    186.1708    191.2579      181.0836      1.0292
Aug 1963   189.3    188.7598    193.8470      183.6727      0.5402
Sep 1963   183.4    186.1678    191.2550      181.0806     -2.7678
Oct 1963   177.3    177.2392    182.3264      172.1520      0.0608
Nov 1963   172.3    170.7711    175.8583      165.6839      1.5289
Dec 1963   171.4    168.8708    173.9580      163.7837      2.5292
Jan 1964   164.9    167.5935    172.6807      162.5063     -2.6935
Feb 1964   164.4    163.9412    169.0247      158.8577      0.4588
Mar 1964   166.9    167.4086    172.4920      162.3251     -0.5086
Apr 1964   174.2    174.2641    179.3475      169.1806     -0.0641
May 1964   177.5    176.4075    181.4909      171.3240      1.0925
Jun 1964   183.6    180.0358    185.1193      174.9523      3.5642
Jul 1964   189.5    186.3499    191.4333      181.2664      3.1501
Aug 1964   191.6    191.2063    196.2898      186.1228      0.3937
Sep 1964   185.1    187.7172    192.8007      182.6338     -2.6172
Oct 1964   181.9    178.9557    184.0392      173.8722      2.9443
Nov 1964   175.4    175.7857    180.8692      170.7022     -0.3857
Dec 1964   174.2    172.6567    177.7402      167.5732      1.5433
(d) For the second half of the data we used the ARIMA(0,1,0)(1,1,0)12 model to obtain
the forecasts in the table below. The actual 1971–1972 figures
are also shown. The source is “Employment and Earnings, US 1909–1978”,
published by the Department of Labor, 1979.
A good exercise would be to take these forecasts and check the MAPE for 1971
and 1972 separately. The MAPE for the first forecast year should be smaller
than the MAPE for the second year.
(e) If the objective is to forecast the next 12 months then the latest data is obviously
the most relevant, but to get seasonal indices we have to go back several years,
and to anticipate what the next move of the large cycle is going to be, we really
need to look at as much data as possible. So a good strategy would be
i. study the trend-cycle by looking at the 12-month moving average;
ii. remove the trend-cycle and study the consistency of the seasonality;
iii. decide how much of the data series to retain for the ARIMA modeling;
iv. forecast the next 12 months and use some judgment as to how to modify
the ARIMA forecasts on the basis of anticipated trend-cycle movements.
Forecasts for Exercise 7.9(d)
Month      Actual   Forecast   Upper (95%)   Lower (95%)    Error
Jan 1971   194.5    196.0141    201.4418      190.5864     -1.5141
Feb 1971   187.9    191.2939    198.9698      183.6180     -3.3939
Mar 1971   187.7    189.9446    199.3456      180.5436     -2.2446
Apr 1971   198.3    197.3595    208.2149      186.5042      0.9405
May 1971   202.7    205.7424    217.8790      193.6057     -3.0424
Jun 1971   204.2    213.1446    226.4396      199.8495     -8.9446
Jul 1971   211.7    217.8789    232.2392      203.5186     -6.1789
Aug 1971   213.4    218.8543    234.2061      203.5025     -5.4543
Sep 1971   212.0    213.2939    229.5769      197.0109     -1.2939
Oct 1971   203.4    208.3371    225.5009      191.1733     -4.9371
Nov 1971   199.5    204.7595    222.7611      186.7580     -5.2595
Dec 1971   199.3    203.4670    222.2690      184.6650     -4.1670
Jan 1972   191.3    196.2469    217.0365      175.4573     -4.9469
Feb 1972   192.1    191.0587    213.6617      168.4557      1.0413
Mar 1972   193.3    189.3618    213.6432      165.0804      3.9382
Apr 1972   203.4    196.9907    222.8417      171.1396      6.4093
May 1972   205.5    205.3066    232.6373      177.9760      0.1934
Jun 1972   218.2    212.5618    241.2960      183.8276      5.6382
Jul 1972   220.3    217.4298    247.5022      187.3575      2.8702
Aug 1972   219.9    218.2314    249.5848      186.8780      1.6686
Sep 1972   211.9    213.0587    245.6428      180.4746     -1.1587
Oct 1972   204.5    207.6473    241.4174      173.8773     -3.1473
Nov 1972   198.5    204.3907    239.3064      169.4750     -5.8907
Dec 1972   200.5    203.2051    239.2300      167.1802     -2.7051
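The exercise suggested in (d), checking the MAPE for 1971 and 1972 separately, can be carried out as sketched below. The function name `mape` is hypothetical and the data are copied from the Actual and Forecast columns of the table of forecasts for Exercise 7.9(d). Whether the first-year MAPE really is the smaller one for this particular pair of years is left for the output to show.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, expressed in percent."""
    terms = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(terms) / len(terms)

actual_1971 = [194.5, 187.9, 187.7, 198.3, 202.7, 204.2,
               211.7, 213.4, 212.0, 203.4, 199.5, 199.3]
forecast_1971 = [196.0141, 191.2939, 189.9446, 197.3595, 205.7424, 213.1446,
                 217.8789, 218.8543, 213.2939, 208.3371, 204.7595, 203.4670]
actual_1972 = [191.3, 192.1, 193.3, 203.4, 205.5, 218.2,
               220.3, 219.9, 211.9, 204.5, 198.5, 200.5]
forecast_1972 = [196.2469, 191.0587, 189.3618, 196.9907, 205.3066, 212.5618,
                 217.4298, 218.2314, 213.0587, 207.6473, 204.3907, 203.2051]

mape_1971 = mape(actual_1971, forecast_1971)
mape_1972 = mape(actual_1972, forecast_1972)
print(f"MAPE 1971: {mape_1971:.2f}%")
print(f"MAPE 1972: {mape_1972:.2f}%")
```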
7.10 (a) There is strong seasonality as can be seen from the time plot and the seasonal
peaks in the ACF.
(b) The trend in the series is small compared to the seasonal variation. However,
there is a period of downward trend in the first four years, followed by an
upward trend for four years. At the end the trend seems to have levelled off.
(c) The one large spike in the PACF of Figure 7-34 suggests the series needs differencing at lag 1. This is also apparent from the slow decay in the ACF and
the non-stationary mean in the time plot.
(d) You would need to difference again at lag 1 and plot the ACF and PACF of the
new series (differenced at lags 12 and 1). It is not possible to identify a model
from Figures 7-33 and 7-34.
Chapter 8: Advanced forecasting models
8.1 (a) The fitted model in Exercise 6.7 (using OLS) was
Yt = 78.7 + 0.534xt + Nt .
The computer output below shows the results for fitting the straight line regression with AR(1) errors. Hence the new model is
Yt = 79.3 + 0.508xt + Nt
where
Nt = 0.72Nt−1 + et .
In this case, the error model makes very little difference to the parameters.
Output from SAS for Exercise 8.1:
                        Approx.
Parameter   Estimate    Std Error   T Ratio   Lag   Variable   Shift
MU          79.27236    0.76093     104.18      0   SALES        0
AR1,1        0.72469    0.14647       4.95      1   SALES        0
NUM1         0.50801    0.02318      21.91      0   ADVERT       0

Constant Estimate   = 21.8242442
Variance Estimate   = 1.11639088
Std Error Estimate  = 1.056594
AIC                 = 74.2915405
SBC                 = 77.825702
Number of Residuals = 24

Autocorrelation Check of Residuals
To    Chi
Lag   Square   DF   Prob    Autocorrelations
 6     3.46     5   0.630    0.027   0.099  -0.037   0.111  -0.060  -0.274
12     9.31    11   0.593    0.055   0.126   0.229  -0.227   0.060  -0.095
18    16.39    17   0.497   -0.117  -0.238  -0.080   0.054  -0.108   0.101
(b) The ACF and PACF of the errors are plotted below. An AR(1)
model for the errors is appropriate since there is a single significant spike at
lag 1 in the PACF and geometric decay in the ACF. This is confirmed by the
Ljung-Box test in the computer output above. The Q* values are given under
the column Chi Square. None are significant, showing that the residuals from the
full model are white noise.
[Figure: Exercise 8.1: Errors from regression model with AR(1) error term. Panels: errors from regression model (time plot), ACF and PACF (lags 2 to 12).]
Output from SAS for Exercise 8.2(a):
                        Approx.
Parameter   Estimate    Std Error   T Ratio   Lag   Variable   Shift
MU           9.56328    0.40537      23.59      0   HURON        0
AR1,1        0.78346    0.06559      11.94      1   HURON        0
NUM1        -0.02038    0.01066      -1.91      0   YEAR         0

Constant Estimate   = 2.07087134
Variance Estimate   = 0.51219788
Std Error Estimate  = 0.71568001
AIC                 = 216.450147
SBC                 = 224.205049
Number of Residuals = 98

Autocorrelation Check of Residuals
To    Chi
Lag   Square   DF   Prob    Autocorrelations
 6     8.35     5   0.138    0.222  -0.100  -0.133  -0.056  -0.007  -0.042
12    15.01    11   0.182   -0.051   0.009   0.175   0.017  -0.121  -0.107
18    16.36    17   0.499   -0.053   0.014   0.019   0.058   0.006  -0.067
24    25.47    23   0.326   -0.071  -0.166  -0.043   0.051   0.160   0.092
8.2 (a) To reduce numerical error, we subtracted 1900 from the year to create an explanatory variable. Hence the year ranged from -25 (1875) to 72 (1972). The
computer output above shows the fitted model to be
Yt = 9.56 − 0.02xt + Nt where Nt = 0.78Nt−1 + et
and xt is the year minus 1900.
(b) The errors are shown in the plot for Exercise 8.2(b) below. This demonstrates
that a better model would have an AR(2) error term since the PACF has two
significant spikes at lags 1 and 2. The spike at lag 10 is probably due to chance.
The ACF shows geometric decay which is possible with an AR(2) model. So
the full regression model is
Yt = β0 + β1xt + Nt where Nt = φ1Nt−1 + φ2Nt−2 + et.
Fitting this model gives the SAS output for Exercise 8.2(b) shown below. So the fitted model is
Yt = 9.53 − 0.02xt + Nt where Nt = 1.00Nt−1 − 0.29Nt−2 + et.
[Figure: Exercise 8.2(b): Errors from regression model with AR(1) error term. Panels: errors from regression model (time plot, 1880 to 1960), ACF and PACF (lags 5 to 15).]
Output from SAS for Exercise 8.2(b):
                        Approx.
Parameter   Estimate    Std Error    T Ratio   Lag   Variable   Shift
MU           9.53078    0.30653       31.09      0   HURON        0
AR1,1        1.00479    0.09839       10.21      1   HURON        0
AR1,2       -0.29128    0.10030       -2.90      2   HURON        0
NUM1        -0.02157    0.0082537     -2.61      0   YEAR         0

Constant Estimate   = 2.73048107
Variance Estimate   = 0.4760492
Std Error Estimate  = 0.68996319
AIC                 = 210.396534
SBC                 = 220.736404
Number of Residuals = 98

Autocorrelation Check of Residuals
To    Chi
Lag   Square   DF   Prob    Autocorrelations
 6     0.60     4   0.964    0.018  -0.028  -0.003   0.040   0.054  -0.007
12     5.35    10   0.867   -0.032  -0.037   0.167  -0.007  -0.098  -0.055
18     6.21    16   0.986   -0.036   0.005  -0.025   0.035  -0.006  -0.063
24    10.49    22   0.981   -0.003  -0.141  -0.007   0.006   0.116   0.008
8.3 (a) ARIMA(0,1,1)(2,1,0)12 . This model would have been chosen by first identifying
that differences at lags 12 and 1 are necessary to make the data stationary. Then
looking at the ACF and PACF of the differenced data would have shown two
significant spikes in the PACF at lags 12 and 24. There would have also been
a significant spike in the ACF at lag 1 and geometric decay in the early lags of
the PACF.
(b) Since both parameter estimates are positive (and significantly different from
zero), we can conclude that electricity consumption increases with both heating
degrees and cooling degrees. Because b2 is larger, we know that there is a greater
increase in electricity usage for each heating degree than for each cooling degree.
(c) To use this model for forecasting, we would first need forecasts of both X1,t
and X2,t into the future. These could be obtained by taking averages of these
variables over the equivalent months of the previous few decades. Then the
model can be used to forecast electricity demand over the next 12 months by
forecasting the Nt series using the method discussed in Chapter 7 and plugging
the forecasts of X1,t, X2,t and Nt into the formula for Yt.
(d) If the model was fitted using a standard regression package (thus modeling Nt
as white noise), then the seasonality and autocorrelation in the data would have
been ignored. This would result in less efficient parameter estimates and invalid
estimates of their standard errors. In particular, tests for significance would be
incorrect, as would prediction intervals. Also, when producing forecasts of Yt,
the forecasts of Nt would all be zero. Hence, the model would not adequately
allow for the seasonality or autocorrelation in the data.
8.4 (a) b = 3, r = 1, s = 2.
(b) ARIMA(2,0,0)
(c) ω0 = −0.53, ω1 = −0.37, ω2 = −0.51, δ1 = 0.57, δ2 = 0, θ1 = θ2 = 0, φ1 = 1.53,
φ2 = −0.63.
(d) 27 seconds.
8.5 See the graphs for Exercise 8.5 below.
8.6 (a) The three series are shown in the table of generated data for Exercise 8.6 below. For Set 1, four Xt values are needed
(since v1, v2, v3 and v4 are all non-zero). Therefore 27 Yt values can be produced.
Similarly 26 Yt values for Set 2 and 24 Yt values for Set 3 can be calculated.
(b) The first model is
Yt = [2.0B / (1 − 0.7B)] Xt + Nt.
The simplest way to generate data for this transfer function is to rewrite it as
follows
(1 − 0.7B)Yt = (1 − 0.7B)B(2 − 1.4B)Xt + (1 − 0.7B)Nt
so that
Yt = 0.7Yt−1 + 2.0Xt−1 − 1.4Xt−2 + Nt − 0.7Nt−1 .
Thus Yt values can only be generated for times t = 3, 4, . . . since we need at
least two previous Xt values. However, for t = 3, we also need Y2. To start the
process going, we have assumed here that Y2 = 0. Other values could also have
been used. The effect of this initialization is negligible after a few time periods.
The second model is easier to generate as we can write it
Yt = 1.2Xt + 2.0Xt−1 − 0.8Xt−2 + Nt .
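The two generating recursions in (b), exactly as written above, can be sketched in code. The function names are hypothetical, and only the first four rows of Nt and Xt from the table of generated data are used, so the sketch produces Yt for t = 3 and t = 4 only.

```python
# First four rows of Nt and Xt (t = 1..4); index 0 is unused padding so that N[t], X[t] match t.
N = [None, -0.8003, 0.8357, 1.4631, 0.7332]
X = [None, 50, 90, 50, 30]

def generate_first_model(X, N, y2=0.0):
    """Y_t = 0.7*Y_{t-1} + 2.0*X_{t-1} - 1.4*X_{t-2} + N_t - 0.7*N_{t-1}, started from Y_2 = y2."""
    Y = {2: y2}
    for t in range(3, len(X)):
        Y[t] = 0.7 * Y[t - 1] + 2.0 * X[t - 1] - 1.4 * X[t - 2] + N[t] - 0.7 * N[t - 1]
    return Y

def generate_second_model(X, N):
    """Y_t = 1.2*X_t + 2.0*X_{t-1} - 0.8*X_{t-2} + N_t; no starting value is needed."""
    return {t: 1.2 * X[t] + 2.0 * X[t - 1] - 0.8 * X[t - 2] + N[t]
            for t in range(3, len(X))}

first = generate_first_model(X, N)
second = generate_second_model(X, N)
print({t: round(y, 1) for t, y in first.items()})
print({t: round(y, 1) for t, y in second.items()})
```

The first recursion needs the starting value Y2 because of its autoregressive term, while the second depends only on current and past Xt values.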
[Figure: Exercise 8.5: Impulse response weights for the four different transfer functions. Panels (a) to (d); vertical axis: weight; horizontal axis: lag, 0 to 10.]
Generated data for Exercise 8.6
 t      Nt      Xt    Set 4 Yt   Set 5 Yt
 1   -0.8003    50       -          -
 2    0.8357    90      0.0         -
 3    1.4631    50    110.9      201.5
 4    0.7332    30     51.3       64.7
 5    0.3260    80     25.7      116.3
 6   -0.7442    80    135.0      231.3
 7    0.7362    30    143.8      132.7
 8    1.1931    70     49.3       81.2
 9   -1.4681    60    130.2      186.5
10   -0.5285    10    113.7       75.5
11    0.4314    40     16.4       20.4
12   -1.6341    20     75.5       94.4
13    0.8198    40     38.8       56.8
14    0.4183    20     79.0       88.4
15   -0.4065    10     38.6       19.6
16   -0.0615    30     19.3       39.9
17    0.1432    60     59.7      124.1
18   -1.0747    70    118.6      178.9
19   -0.5355    40    139.2      139.5
20   -0.1454    70     79.7      107.9
21    0.2088    10    140.1      120.2
22   -0.6854    30     19.2       -0.7
23    0.1182    30     60.1       88.1
24    0.6971    40     60.7       84.7
25    0.3698    30     80.3       92.4
26   -0.0802   100     59.9      147.9
27   -0.9202    60    199.1      247.1
28    1.1483    90    121.1      149.1
29   -0.1663    60    179.8      203.8
30   -0.5461   100    119.4      167.5
Set 1 Yt (27 values): 58.7, 52.3, 61.3, 65.7, 59.2, 55.5, 49.5, 37.4, 27.4, 29.8, 30.4, 23.6, 19.9, 29.1, 46.9, 56.5, 56.9, 49.2, 34.3, 28.1, 30.7, 34.4, 46.9, 64.1, 76.1, 75.8, 76.5
Set 2 Yt (26 values): 58.3, 51.3, 62.7, 66.2, 56.5, 56.5, 50.4, 35.4, 29.8, 29.4, 29.6, 23.9, 20.1, 27.9, 47.5, 56.9, 57.2, 48.3, 35.1, 28.7, 30.4, 33.9, 46.1, 66.1, 74.8, 75.5
Set 3 Yt (24 values): 74.7, 88.2, 87.5, 79.5, 83.4, 72.4, 54.8, 47.4, 41.6, 40.9, 36.1, 27.9, 38.5, 59.9, 74.2, 76.3, 73.1, 54.7, 43.4, 43.9, 44.1, 61.1, 84.8, 97.5
8.7 (a) The average cost of a night’s accommodation is C/R.
(b) There are a number of ways this could be done. The simplest is to define the
monthly CPI to be the same as that of the quarter. For example, January,
February and March of 1980 would each have a CPI of 45.2; April, May and
June 1980 would each have a CPI of 46.6; and so on. Other methods might
involve fitting a smooth curve through the quarterly figures and using the curve
to predict the CPI at other points along the time axis.
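The simplest approach described in (b), giving each month the CPI of its quarter, can be sketched as follows. The function name is hypothetical; the two quarterly values are the ones quoted in the solution for the first two quarters of 1980.

```python
def quarterly_to_monthly(quarterly_values):
    """Repeat each quarterly value three times, once for each month of the quarter."""
    monthly = []
    for value in quarterly_values:
        monthly.extend([value] * 3)
    return monthly

cpi_quarterly = [45.2, 46.6]   # Q1 and Q2 of 1980, as quoted in the solution
cpi_monthly = quarterly_to_monthly(cpi_quarterly)
print(cpi_monthly)   # January to June 1980
```

A step function of this kind changes only every third month, which is worth keeping in mind when interpreting any seasonality found later in the fitted model.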
(c) See figure below.
[Figure: Exercise 8.7(c): Time plots of average room rate and CPI. Panels: Consumer price index (Melbourne); Average rate per room per night ($); horizontal axis: 1980 to 1996.]
(d) Our preliminary model is
Yt = a + (ν0 + ν1B + · · · + ν6B^6)Xt + Nt
where Yt denotes the average room rate, Xt denotes the CPI and Nt is an AR(1)
process. The estimated errors from this model are shown in the figure for
Exercise 8.7(d). They are clearly non-stationary and have some seasonality.
So we difference both Yt and Xt and refit the model with Nt specified as an
ARIMA(1,0,0)(1,0,0)12. The parameter estimates are shown below (as given
by SAS).
[Figure: Exercise 8.7(d): Errors from regression model with AR(1) error term. Panels: residuals from dynamic regression with AR(1) errors (time plot, 1980 to 1996), ACF and PACF (lags 5 to 20).]
Parameter   Estimate    s.e.     P-value
a            0.20200    0.2848   0.4791
ν0           0.20730    0.2602   0.4267
ν1          -0.41687    0.2634   0.1154
ν2           0.23165    0.2655   0.3842
ν3           0.32048    0.2716   0.2396
ν4          -0.72093    0.2665   0.0075
ν5           0.74707    0.2633   0.0051
ν6          -0.36272    0.2656   0.1739
Thus the intercept and first four coefficients are not significant and can be
omitted. Hence we select b = 4. We shall retain the last three coefficients for
the moment. Since they show no clear pattern, we select r = 0 and s = 3 giving
the model
\[ Y_t = (\omega_0 + \omega_1 B + \omega_2 B^2) B^4 X_t + N_t. \]
Looking at the ACF and PACF of the error series (not shown) and trying a
number of alternative models led us to the model ARIMA(2,1,0)(2,0,0)$_{12}$ for
$N_t$. That is
\[ (1 - \phi_1 B - \phi_2 B^2)(1 - \Phi_1 B^{12} - \Phi_2 B^{24})(1 - B) N_t = e_t. \]
The parameter values (all significant) were
Parameter   Estimate
ω0           0.52
ω1           0.61
ω2          -0.47
φ1          -0.49
φ2          -0.33
Φ1           0.37
Φ2           0.41
The model suggests that there is a lag of four months between changes in the
CPI and changes in the price of travel accommodation. The seasonality inherent
in the model may be due to seasonal price variation or due to the way CPI was
estimated from quarterly data.
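To make the four-month delay concrete, the regression part of the model, ω0 Xt−4 + ω1 Xt−5 + ω2 Xt−6, can be evaluated directly. The following is only a sketch in Python (the fit itself was done in SAS). The function name and the short CPI series are illustrative assumptions, and the error term Nt is deliberately left out, so the result is the regression component only and not a forecast of Yt.

```python
# Estimates of omega_0, omega_1, omega_2 from the solution to 8.7(d).
OMEGA = (0.52, 0.61, -0.47)
DELAY = 4  # b = 4 months


def regression_component(x, t, omega=OMEGA, delay=DELAY):
    """Return omega_0*x[t-4] + omega_1*x[t-5] + omega_2*x[t-6].

    `x` is a monthly series indexed from 0; hypothetical helper, not SAS output.
    """
    if t - delay - (len(omega) - 1) < 0:
        raise ValueError("not enough history before time t")
    return sum(w * x[t - delay - k] for k, w in enumerate(omega))


# Illustrative CPI values (quarterly figures repeated for each month).
cpi = [115.0, 115.0, 115.0, 116.2, 116.2, 116.2, 116.9]
value = regression_component(cpi, t=6)
print(f"regression component at t=6: {value:.2f}")
```

A change in the CPI at time t therefore cannot reach the regression component until time t + 4, which is the sense in which the model implies a four-month lag.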
(e) Forecasts of CPI were obtained using Holt’s method. These are only needed
from November 1995 because of the time lag of 4 months. Actual data beyond
June 1995 are given in the second column for comparison.
Month      Actual Yt   Xt−4    Xt−5    Xt−6    Predicted Yt
Jul 1995      94.0     115.0   115.0   115.0       90.4
Aug 1995      96.7     116.2   115.0   115.0       91.8
Sep 1995      94.8     116.2   116.2   115.0       92.0
Oct 1995      89.6     116.2   116.2   116.2       91.6
Nov 1995      95.8     116.9   116.2   116.2       93.4
Dec 1995      91.5     117.3   116.9   116.2       90.3
Jan 1996      92.0     117.7   117.3   116.9       90.4
Feb 1996      95.5     118.1   117.7   117.3       92.6
Mar 1996     100.6     118.5   118.1   117.7       94.7
Apr 1996      94.1     118.9   118.5   118.1       90.7
May 1996      97.2     119.3   118.9   118.5       91.5
Jun 1996     102.9     119.8   119.3   118.9       93.0

[Figure: Exercise 8.8: Time plot of daily perceptual speed scores for a schizophrenic patient. The
drug intervention is shown at day 61. Axes: day (0 to 120) against perceptual speed score.]
8.8 (a) See the figure above.
(b) The step intervention model with an ARIMA(0,1,1) error was used:
\[ Y_t = \omega X_t + N_t, \qquad (1 - B) N_t = \theta e_{t-1} + e_t, \]
where $Y_t$ denotes the perceptual speed score and $X_t$ denotes the step dummy
variable. The estimated coefficients were
Parameter   Estimate
ω           -22.1
θ             0.76
(c) The drug has lowered the perceptual speed score by about 22.
(d) The new model is
\[ Y_t = \frac{\omega}{1 - \delta B} X_t + N_t, \qquad (1 - B) N_t = \theta e_{t-1} + e_t. \]
(An ARIMA(0,1,1) error was found to be the best model again.) Here the
estimated coefficients were
Parameter   Estimate
ω           -13.21
δ             0.54
θ             0.76
The following accuracy measures show that the delayed effect model fits the
data better.
Model          MAPE   MSE    AIC
Step           15.1   92.5   542.8
Delayed step   15.0   91.1   538.4
The forecasts for the two models are very similar. This is because the effect of
the step in the delayed step model is almost complete at the end of the series,
60 days after the drug intervention.
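The claim that the step effect is almost complete can be checked from the estimates. With a step input, the term ω/(1 − δB) Xt contributes ω(1 + δ + · · · + δ^k) = ω(1 − δ^(k+1))/(1 − δ) on the k-th day after the intervention begins, approaching ω/(1 − δ) in the long run. The Python sketch below evaluates this. The function name is an assumption made for illustration, and the calculation ignores the error term Nt.

```python
def delayed_step_effect(omega, delta, k):
    """Effect k periods after the step begins: omega * sum_{j=0..k} delta**j."""
    return omega * (1 - delta ** (k + 1)) / (1 - delta)


# Estimates from the delayed step model in 8.8(d).
omega, delta = -13.21, 0.54

# Limiting effect as k grows without bound.
long_run = omega / (1 - delta)

for k in (0, 1, 5, 10, 60):
    effect = delayed_step_effect(omega, delta, k)
    share = effect / long_run
    print(f"k={k:2d}  effect={effect:7.2f}  share of long-run effect={share:.4f}")
```

Under these estimates the long-run effect is about −28.7, and because δ^k shrinks geometrically, the effect 60 days after the intervention is indistinguishable from that limit. This is why the delayed step model gives an almost flat forecast function at the end of the series.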
(e) The best ARIMA model we found was an ARIMA(0,1,1) with θ = 0.69. This
gave MAPE=15.4, MSE=100.8 and AIC=550.9.
This model gives a flat forecast function (since we did not include a constant
term). The forecast values are 33.9. Because the step effect is almost complete
in the delayed step model, it also gives a virtually flat forecast function with
forecast values of 34.1. Hence there is virtually no difference. If forecasts had
been made earlier (for example, at day 80), there would have been a difference
because the step effect would still be in progress and so the delayed step model
would have shown a continuing decline in perceptual speed. The real advantage
of the intervention model over the ARIMA model is that the intervention model
provides a way of measuring and evaluating the effect of an intervention.
(f) If the drug dose varied from day to day and the reaction times depended on dose,
then a better model would be a dynamic regression model with the quantity
of drug as an explanatory variable.
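8.9 (a)
\[
\begin{bmatrix} Y_t - Y_{t-1} \\ X_t - X_{t-1} \end{bmatrix}
= \Phi_1 \begin{bmatrix} Y_{t-1} - Y_{t-2} \\ X_{t-1} - X_{t-2} \end{bmatrix}
+ \Phi_2 \begin{bmatrix} Y_{t-2} - Y_{t-3} \\ X_{t-2} - X_{t-3} \end{bmatrix}
+ \cdots
+ \Phi_{12} \begin{bmatrix} Y_{t-12} - Y_{t-13} \\ X_{t-12} - X_{t-13} \end{bmatrix}
+ Z_t.
\]
(b)
\begin{align*}
Y_t &= Y_{t-1} - 0.38(Y_{t-1} - Y_{t-2}) + 0.15(X_{t-1} - X_{t-2}) \\
    &\quad - 0.37(Y_{t-2} - Y_{t-3}) + 0.13(X_{t-2} - X_{t-3}) + \cdots \\
    &= 0.62 Y_{t-1} + 0.01 Y_{t-2} + 0.15 X_{t-1} - 0.02 X_{t-2} + \cdots
\end{align*}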
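The coefficients in the last line of (b) come from collecting terms at each lag. As a check, using only the terms written out explicitly (higher-order lags of the differences do not involve these four variables):

```latex
\begin{align*}
Y_{t-1}: &\quad 1 - 0.38 = 0.62, \\
Y_{t-2}: &\quad 0.38 - 0.37 = 0.01, \\
X_{t-1}: &\quad 0.15, \\
X_{t-2}: &\quad -0.15 + 0.13 = -0.02.
\end{align*}
```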
(c)
• The multivariate model assumes feedback. That is, $X_t$ depends on past values
of $Y_t$. The regression model does not allow this.
• The regression model does not assume $X_t$ is random.
• The regression model allows $Y_t$ to depend on $X_t$ as well as on the past values
$X_{t-1}, X_{t-2}, \ldots$. The multivariate AR model only allows dependence on past
values of $\{X_t\}$.
• For these data, it is unlikely that room rates ($Y_t$) will substantially affect the
CPI ($X_t$), although it is possible. Small values in the lower left of the coefficient
matrices suggest that $Y_t$ is not affecting $X_t$, whereas $Y_t$ should depend on $X_t$.
So regression is probably better.
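8.10 (a) An AR(3) model can be written using the same procedure as the AR(2) model
described in Section 8/5/1. Thus we define $X_{1,t} = Y_t$, $X_{2,t} = Y_{t-1}$ and
$X_{3,t} = Y_{t-2}$. Then write
\[
\mathbf{X}_t = \begin{bmatrix} \phi_1 & \phi_2 & \phi_3 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \mathbf{X}_{t-1}
+ \begin{bmatrix} a_t \\ 0 \\ 0 \end{bmatrix}
\]
and
\[ Y_t = [1 \;\; 0 \;\; 0] \, \mathbf{X}_t. \]
This is now in state space form with
\[
F = \begin{bmatrix} \phi_1 & \phi_2 & \phi_3 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}, \quad
G = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad
H = [1 \;\; 0 \;\; 0], \quad
e_t = \begin{bmatrix} a_t \\ 0 \\ 0 \end{bmatrix} \quad \text{and} \quad z_t = 0.
\]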
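One way to check the state space form is to simulate the AR(3) recursion directly and again through the state equation, and confirm that the two series agree. The Python sketch below does this with the standard library only. The coefficient values, the zero starting state and the function names are assumptions made for illustration.

```python
import random


def ar3_direct(phi, shocks):
    """Simulate Y_t = phi1*Y_{t-1} + phi2*Y_{t-2} + phi3*Y_{t-3} + a_t (zero pre-sample values)."""
    y = []
    for t, a in enumerate(shocks):
        past = [y[t - k] if t - k >= 0 else 0.0 for k in (1, 2, 3)]
        y.append(sum(p * v for p, v in zip(phi, past)) + a)
    return y


def ar3_state_space(phi, shocks):
    """Same series via X_t = F X_{t-1} + G e_t and Y_t = H X_t, with e_t = (a_t, 0, 0)'."""
    F = [list(phi), [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    H = [1.0, 0.0, 0.0]
    state = [0.0, 0.0, 0.0]
    y = []
    for a in shocks:
        e = [a, 0.0, 0.0]  # G is the identity matrix, so G e_t = e_t
        state = [sum(F[i][j] * state[j] for j in range(3)) + e[i] for i in range(3)]
        y.append(sum(h * s for h, s in zip(H, state)))
    return y


random.seed(0)
phi = (0.5, -0.3, 0.1)  # hypothetical AR coefficients
shocks = [random.gauss(0.0, 1.0) for _ in range(50)]

direct = ar3_direct(phi, shocks)
via_state = ar3_state_space(phi, shocks)
max_gap = max(abs(u - v) for u, v in zip(direct, via_state))
print(f"largest difference between the two simulations: {max_gap:.2e}")
```

The second and third rows of F simply shift the previous values of the series down the state vector, which is all that is needed to carry Yt−1 and Yt−2 forward.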
(b) An MA(1) can be written as $Y_t = \theta a_{t-1} + a_t$ where $a_t$ is white noise. We can
write this in state space form by letting $F = 0$, $G = \theta$, $e_t = a_{t-1}$, $z_t = a_t$ and
$H = 1$. Thus
\[ Y_t = X_t + a_t \quad \text{and} \quad X_t = \theta a_{t-1}. \]
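(c) Holt’s method is defined in Chapter 4 as
\begin{align*}
L_t &= \alpha Y_t + (1 - \alpha)(L_{t-1} + b_{t-1}), \\
b_t &= \beta (L_t - L_{t-1}) + (1 - \beta) b_{t-1},
\end{align*}
with the one-step forecast as $F_{t+1} = L_t + b_t$. Hence the one-step error is
$e_t = Y_t - L_{t-1} - b_{t-1}$. The first row can be written
\begin{align*}
L_t &= \alpha (Y_t - L_{t-1} - b_{t-1}) + L_{t-1} + b_{t-1} \\
    &= \alpha e_t + L_{t-1} + b_{t-1}
\end{align*}
and the second row can be written
\begin{align*}
b_t &= \beta (L_t - L_{t-1}) + (1 - \beta) b_{t-1} \\
    &= b_{t-1} + \beta (L_t - L_{t-1} - b_{t-1}) \\
    &= b_{t-1} + \beta \alpha e_t
\end{align*}
using the first equation.
Now let $X_{t,1} = L_t$ and $X_{t,2} = b_t$. Then the state space form of the model is
\begin{align*}
\mathbf{X}_t &= \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \mathbf{X}_{t-1}
              + \begin{bmatrix} \alpha \\ \beta\alpha \end{bmatrix} e_t, \\
Y_t &= [1 \;\; 1] \, \mathbf{X}_{t-1} + e_t.
\end{align*}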
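The equivalence between the usual smoothing form of Holt's method and the error-correction form derived above can be checked numerically. The Python sketch below runs both recursions on the same data and compares the resulting levels and trends. The data, the smoothing parameters and the starting values are made up for illustration, and the function names are assumptions.

```python
def holt_standard(y, alpha, beta, level0, trend0):
    """Holt's method in its usual smoothing form; returns (levels, trends)."""
    level, trend = level0, trend0
    levels, trends = [], []
    for obs in y:
        new_level = alpha * obs + (1 - alpha) * (level + trend)
        new_trend = beta * (new_level - level) + (1 - beta) * trend
        level, trend = new_level, new_trend
        levels.append(level)
        trends.append(trend)
    return levels, trends


def holt_error_correction(y, alpha, beta, level0, trend0):
    """Holt's method written in terms of the one-step error e_t."""
    level, trend = level0, trend0
    levels, trends = [], []
    for obs in y:
        error = obs - level - trend  # e_t = Y_t - L_{t-1} - b_{t-1}
        # Both updates use the previous level and trend on the right-hand side.
        level, trend = level + trend + alpha * error, trend + beta * alpha * error
        levels.append(level)
        trends.append(trend)
    return levels, trends


# Hypothetical data and parameters.
series = [10.0, 12.5, 13.1, 15.8, 17.2, 18.0, 21.4, 22.9]
std_levels, std_trends = holt_standard(series, 0.5, 0.3, level0=10.0, trend0=1.0)
ec_levels, ec_trends = holt_error_correction(series, 0.5, 0.3, level0=10.0, trend0=1.0)

gap = max(
    max(abs(u - v) for u, v in zip(std_levels, ec_levels)),
    max(abs(u - v) for u, v in zip(std_trends, ec_trends)),
)
print(f"largest difference between the two forms: {gap:.2e}")
```

The error-correction loop is exactly the state equation with states Lt and bt: the previous state is multiplied by the matrix with rows (1, 1) and (0, 1), and the error is added with weights α and βα.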
(d) The state space form might be preferable because
• it allows missing values to be handled easily;
• it is easy to generalize to allow the parameters to change over time;
• the Kalman recursion equations can be used to calculate the forecasts and
likelihood.
Chapter 9: Forecasting the long term
9.1 There is little doubt that computer power and memory are growing exponentially
while prices are declining exponentially. It is therefore only a matter of time before
computers costing only a few hundred dollars can perform an incredible array of tasks
which until now have been the sole prerogative of humans, for example playing chess
(a high-power judgmental and creative process). It is therefore up to our imaginations
to come up with future scenarios in which such computers are used as extensively as
electrical appliances are used today. The trick is to free our thinking so that we can
come up with scenarios that are not constrained by our perception of the present, when
computers are being used mostly to make calculations.
9.2 As the cost of computers (including all of the peripherals such as printers and
scanners) is being reduced drastically, and as we will soon have devices that perform
a great number of functions now done by separate machines, it will become more
practical and economical to work at home. Furthermore, the size of these all-purpose
machines is being continuously reduced. In the next five to ten years we will be able
to have at home everything that is now provided to us in an office, using two machines:
one a powerful all-inclusive computer and the other a printer-scanner-photocopier-fax
machine. Moreover, these two machines will be connected to any network we wish via
modems so that we can communicate and get information from anywhere.
9.3 As mentioned in Exercise 9.1, there is no doubt that computer and equipment
prices are declining exponentially at a fast rate. This will make it possible for
everyone to afford them and to have an office not only at home but at any other place
he or she wishes, including one’s car, a hotel room, a summer vacation residence, or
a sailboat.
9.4 Statements like those referred to in Exercise 9.4 abound and demonstrate how
shortsighted people are when predicting the future. As a matter of fact, as late
as the beginning of our century people did not predict all four major inventions of
the Industrial Revolution (cars, telephones, electrical appliances and television) that
have dramatically changed our lives. Moreover, they did not predict the huge impact
of computers even as late as the beginning of the 1950s. This is why we must break
from our present mode of thinking and see things in a different, new light. This is
where scenarios and analogies can be extremely useful.
Chapter 10: Judgmental forecasting and adjustments
10.1 Phillips’ problems have to do with the management bias of overoptimism, that is,
believing that all changes will be successful and that people’s resistance to change
can be overcome. This is not true, but we tend to believe that most organisational
changes are successful because we hear and read about the successful ones while
there is very little mention of those that fail. Introducing changes must therefore be
considered in an objective manner, and our ability to succeed must be estimated
realistically.
10.2 The quote by Glassman illustrates the extent to which professional, expert
investment managers underperform the average of the market. Business Week, Fortune
and other business journals regularly publish summary statistics of the performance
of mutual funds and other professionally managed funds, benchmarking them against
Standard & Poor’s or other appropriate indexes. The instructor can therefore get
some more recent comparisons than those shown in Chapter 10 and show them in
class.
10.3 Because the length (and depth) of cycles varies a great deal, it is extremely
difficult to say how long it will take until the expansion that started in May 1991 is
interrupted. It will all depend on the specific situation involved, which will require
judgmental inputs structured in such a way as to avoid biases and other problems.
10.4 One will encounter twenty 8s when counting from 0 to 100. When given
this exercise most people say nine or ten because they do not count the 8s in the
tens position of 80 to 87 and 89 (and they usually count the 8s in 88 only once).
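The count is easy to verify by brute force. The short Python sketch below counts every written 8 from 0 to 100 and splits the total into 8s in the units position and 8s in the tens position. The function name is an assumption made for illustration.

```python
def count_digit(digit, start, stop):
    """Count how often `digit` is written when listing the integers start..stop inclusive."""
    return sum(str(n).count(str(digit)) for n in range(start, stop + 1))


total = count_digit(8, 0, 100)

# Split the total by position to see where the commonly missed 8s are.
units = sum(1 for n in range(0, 101) if n % 10 == 8)          # 8, 18, ..., 98
tens = sum(1 for n in range(0, 101) if (n // 10) % 10 == 8)   # 80, 81, ..., 89

print(total, units, tens)  # 20 10 10
```

The ten 8s in the units position are the ones most people find. The ten in the tens position (80 to 89, including the first 8 of 88) are the ones that tend to be overlooked.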
Chapter 11: The use of forecasting methods in practice
11.1 The results of Table 1 are very similar to those of the previous M-Competitions. As
a matter of fact the resemblance is phenomenal given the fact that the series used
and the time horizon they refer to are completely different.
11.2 In our view the combined method will do extremely well. More specifically, its
accuracy will be higher than that of the individual methods being combined, while the
variance of its forecasting errors will be smaller than that of the methods involved.
11.3 It seems that proponents of new forecasting methods usually exaggerate their
benefits. This has been the case with methods under the banner of neural networks,
machine learning and expert systems. These methods did not do well in the M3-IJF
Competition. In addition, only a few experts participated in the competition using
such methods, even though more than a hundred were contacted (and invited to
participate) and more than fifty expressed an interest in the M3-IJF Competition,
indicating that they would “possibly” participate. In the final analysis it seems that
it is not so simple to run a great number of series through such methods, which is why
so few participants used them.
Chapter 12: Implementing forecasting: its uses, advantages, and limitations
The exercises for Chapter 12 are general and can be answered by referring to the text
of Chapter 12, which covers each of the topics. Each instructor can therefore develop
his or her own way of answering these exercises.