ECON 325 -- FORECASTING Preliminaries & Introduction

Course Assumptions
• This is not an introduction to R, per se, but as part of the
course we will be spending a significant amount of time using
the R software platform.
• R is integrated into the text that we’ll be using, and your
homework assignments will typically utilize R. To some
degree, however, it’s YOUR responsibility to learn the
software.
• Some concepts will also be illustrated in Excel, so that
you can see how they work, rather than simply the end
product.
Course Assumptions
• This is most definitely not a statistics course. I assume
you are familiar with concepts such as:
1. Mean
2. Standard deviation
3. Probability distributions (e.g., normal distribution)
4. Quantiles, etc.
• While previous experience with cross-sectional regression
is recommended, it isn’t required.
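As a quick self-check on these prerequisites, the concepts listed above map onto a handful of base R functions. The numbers below are made up purely for illustration; they are not course data.

```r
# Hypothetical sample: ten made-up observations, used only for illustration.
x <- c(12, 15, 11, 18, 14, 13, 17, 16, 12, 15)

mean(x)                            # sample mean
sd(x)                              # sample standard deviation (n - 1 denominator)
quantile(x, c(0.25, 0.50, 0.75))   # sample quartiles

# Normal distribution: cumulative probability and quantile functions
pnorm(1.96)    # P(Z <= 1.96) for a standard normal, roughly 0.975
qnorm(0.975)   # the 97.5% quantile of a standard normal, roughly 1.96
```

If each of these lines makes sense to you, you have the statistical background the course assumes.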
Course Assumptions
• This is also not a theory course. There won’t be any
proofs or derivations. My goal is to teach you forecasting
tools, as well as instruct you when and how to use them
most effectively.
What is a forecast?
• A prediction or estimate of an actual outcome expected at
a future time period or for another situation.
What can be forecast?
• Pretty much anything where past patterns can be
expected to evolve in the same manner into the future.
What can be forecast?
Some of the areas in which forecasting currently plays an
important role are:
• Scheduling: Forecasts of the level of demand for product, material,
labor, financing, or service are an essential input to scheduling.
• Acquiring resources: Forecasting is required to determine future
resource requirements.
• Determining resource requirements: All organizations must
determine what resources they want to have in the long term.
These determinations all require good forecasts and managers
who can interpret the predictions and make appropriate decisions.
What can be forecast?
• The environment doesn’t have to be static. Rather, the
environment must be expected to change in a predictable
manner.
What can be forecast?
• There’s a difference between forecast accuracy and
precision in identifying a particular variable’s law of
motion.
• Example: exchange rates follow a random walk over the
short run
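A small simulation can make the random walk point concrete. The sketch below assumes a driftless Gaussian random walk with a made-up step size; it is simulated data, not actual exchange rates. Even though we know the law of motion exactly, the best forecast is simply the last observed value, and the uncertainty around it grows with the horizon.

```r
set.seed(325)    # arbitrary seed so the run is reproducible
sigma <- 0.01    # assumed (hypothetical) standard deviation of each step
n_obs <- 250     # length of the simulated history

# Random walk: y_t = y_{t-1} + e_t, with e_t ~ N(0, sigma^2)
y <- cumsum(rnorm(n_obs, mean = 0, sd = sigma))

h <- 1:20                                     # forecast horizons
point_forecast <- rep(y[n_obs], length(h))    # best guess: the last observed value
forecast_sd <- sigma * sqrt(h)                # h-step-ahead forecast error sd

# Approximate 95% forecast interval at each horizon
lower <- point_forecast - qnorm(0.975) * forecast_sd
upper <- point_forecast + qnorm(0.975) * forecast_sd

plot(y, type = "l", xlim = c(1, n_obs + max(h)),
     ylim = range(c(y, lower, upper)),
     xlab = "Time", ylab = "Simulated series")
lines(n_obs + h, point_forecast, lty = 2)     # flat point forecast
lines(n_obs + h, lower, lty = 3)              # widening interval bounds
lines(n_obs + h, upper, lty = 3)
```

The model is identified perfectly here, yet the forecast is flat and the interval keeps widening: precision about the law of motion does not buy forecast accuracy.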
Overview of forecasting techniques
• Forecasting situations vary widely in their time horizons,
types of data patterns, factors affecting actual outcomes,
and many other aspects.
Qualitative forecasting
• If there are no data available, or if the data available are
not relevant to the forecasts, then qualitative forecasting
methods must be used. These methods are not purely
guesswork – there are well-developed approaches for
getting good forecasts without using historical data.
Quantitative forecasting
Can be applied when two conditions are satisfied:
1. Numerical information about the past is available
2. It is reasonable to assume that some aspects of the
past patterns will continue into the future.
• Continuity assumption
• Wide range of quantitative forecasting methods: time series
methods (for data collected at regular intervals over time) or
cross-sectional methods (for data collected at a single point in time)
Unpredictable events
• Little or no information is available.
• Example: likely impact of Yellowstone super-volcano
eruption is ???
Cross-sectional forecasting
• We want to predict the value of something we have
not observed, using the information on the cases that we
have observed.
• Examples:
• Hedonic pricing. Suppose we have data on housing prices for all
houses sold in 2011 in a particular area, and are interested in
predicting the price of a house not in our data set using various
house characteristics (location, # of bedrooms, age, etc.)
• Fuel economy data for a range of 2009 model cars. We are
interested in predicting the carbon footprint of a vehicle not in our
data set using information such as the size of the engine and the
fuel efficiency of the car.
Cross-sectional forecasting
• Cross-sectional models are used when the variable to be
forecast exhibits a relationship with one or more other
predictor variables.
• The purpose of the cross-sectional model is to describe the form of
the relationship and use it to forecast values of the forecast
variable that have not been observed.
• Under this model, any change in predictors will affect the output of
the system in a predictable way, assuming that the relationship
does not change.
• Models in this class include regression models, additive
models, and some kinds of neural networks.
Car Emissions Example
Model                        Engine (litres)  City (mpg)  Highway (mpg)  Carbon (tons CO2 per year)
Chevrolet Aveo               1.6              25          34             6.6
Chevrolet Aveo 5             1.6              25          34             6.6
Honda Civic                  1.8              25          36             6.3
Honda Civic Hybrid           1.3              40          45             4.4
Honda Fit                    1.5              27          33             6.1
Honda Fit                    1.5              28          35             5.9
Hyundai Accent               1.6              26          35             6.3
Kia Rio                      1.6              26          35             6.1
Nissan Versa                 1.8              27          33             6.3
Nissan Versa                 1.8              24          32             6.8
Pontiac G3 Wave              1.6              25          34             6.6
Pontiac G3 Wave 5            1.6              25          34             6.6
Pontiac Vibe                 1.8              26          31             6.6
Saturn Astra 2DR Hatchback   1.8              24          30             6.8
Saturn Astra 4DR Hatchback   1.8              24          30             6.8
Scion xD                     1.8              26          32             6.6
Toyota Corolla               1.8              27          35             6.1
Toyota Matrix                1.8              25          31             6.6
Toyota Prius                 1.5              48          45             4.0
Toyota Yaris                 1.5              29          35             5.9
Car Emissions Example
• Table 1.1: Fuel economy and carbon footprints for 2009
model cars with automatic transmissions, four cylinders
and small engines. City and Highway represent fuel
economy while driving in the city and on the highway.
• A forecaster may wish to predict the carbon footprint (tons
of CO2 per year) for other similar vehicles that are not
included in the above table.
Car Emissions Example
Forecasting Procedure:
1. It is necessary to first estimate the effects of the
predictors (number of cylinders, size of engine, and fuel
economy) on the variable to be forecast (carbon
footprint).
2. Then, provided that we know the predictors for a car not
in the table, we can forecast its carbon footprint.
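The two steps can be sketched in base R using only the 20 cars listed in Table 1.1. This is an illustration under that assumption: the names fuel_small and fit_small are made up here, only City is used as a predictor, and because this is a small subset of the full data set, the estimates will not match those from the full sample.

```r
# The 20 cars from Table 1.1, in table order (City mpg and Carbon tons CO2/year).
fuel_small <- data.frame(
  City   = c(25, 25, 25, 40, 27, 28, 26, 26, 27, 24,
             25, 25, 26, 24, 24, 26, 27, 25, 48, 29),
  Carbon = c(6.6, 6.6, 6.3, 4.4, 6.1, 5.9, 6.3, 6.1, 6.3, 6.8,
             6.6, 6.6, 6.6, 6.8, 6.8, 6.6, 6.1, 6.6, 4.0, 5.9)
)

# Step 1: estimate the effect of the predictor on the variable to be forecast.
fit_small <- lm(Carbon ~ City, data = fuel_small)
coef(fit_small)   # intercept and slope estimated from these 20 cars only

# Step 2: forecast the carbon footprint of a car that is not in the table.
new_car <- data.frame(City = 30)
pred95 <- predict(fit_small, newdata = new_car,
                  interval = "prediction", level = 0.95)
pred80 <- predict(fit_small, newdata = new_car,
                  interval = "prediction", level = 0.80)
pred95   # columns: fit (point forecast), lwr, upr
pred80   # same point forecast, narrower interval
```

Setting interval = "prediction" asks for an interval for an individual new car rather than a confidence interval for the average, which is what a forecast interval requires.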
Car Emissions Example
R code for scatter plot and regression line:
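library(fpp)  # assumed source of the fuel data set (134 cars)
# Scatter plot of carbon footprint against city fuel economy (jittered to separate ties)
plot(jitter(Carbon) ~ jitter(City), xlab="City (mpg)",
     ylab="Carbon footprint (tons per year)", data=fuel)
fit <- lm(Carbon ~ City, data=fuel)   # simple linear regression
abline(fit)                           # add the fitted line to the plot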
This scatter plot shows Carbon (carbon footprint in tons
per year) against City (fuel economy in city driving
conditions, in miles per gallon) for all 134 cars in the data
set. Also plotted is the estimated regression line:
ŷ = 12.53 − 0.22x.
Car Emissions Example
For a car with City driving fuel economy x = 30 mpg, the
forecast average footprint is ŷ = 5.90 tons of CO2 per
year. The corresponding 95% and 80% forecast intervals
are [4.95, 6.84] and [5.28, 6.51] respectively (calculated
using R).
R code for forecast value:
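fitted(fit)[1]   # fitted value for the first car in the data
# Forecast for a new car with City = 30 (forecast() is from the forecast package)
fcast <- forecast(fit, newdata=data.frame(City=30))
plot(fcast, xlab="City (mpg)", ylab="Carbon footprint (tons per year)")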
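As a quick arithmetic check, plugging x = 30 into the regression line as printed gives a value slightly above the reported 5.90. The small gap is presumably because the printed coefficients are rounded to two decimals, while R forecasts from the unrounded estimates. Both reported intervals are centred on the same point, which is consistent with the 5.90 figure.

```latex
\hat{y} = 12.53 - 0.22 \times 30 = 12.53 - 6.60 = 5.93

\frac{4.95 + 6.84}{2} = 5.895, \qquad \frac{5.28 + 6.51}{2} = 5.895
```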
Explanatory vs Time Series Forecasting
There is an important distinction between explanatory (or
causal) models and time series models. There is some
semantic confusion with this choice of words because
“explanatory models” can deal with “time series” data.
Time Series Models
• When we say time series models or time series methods,
we usually mean working with a single set of (time-dependent)
data and trying to develop a model (an equation)
that can be thought of as the “generating process” of
these data.
• We specifically will not look at its relationship with other
variables. That is, the system is treated as a black box,
and no attempt is made to discover the factors affecting its
behavior.
Time Series Models
There are two main reasons for wanting to treat a system
as a black box.
1. The system may not be understood, and even if it were
understood it may be extremely difficult to measure the
relationships assumed to govern its behavior.
2. The main concern may be only to predict what will
happen, and not to know why it happens.
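To see what the black-box approach looks like in practice, here is a minimal sketch in base R. It assumes an AR(1) model, which is just one simple choice for illustration, and uses lh, a small series that ships with R, as a stand-in for whatever series is of interest.

```r
# 'lh' is a built-in series (datasets package); no other variables are used.
fit_ar <- arima(lh, order = c(1, 0, 0))   # AR(1): each value depends on the previous one

# Forecast five periods ahead using only the series' own history.
fc <- predict(fit_ar, n.ahead = 5)

fc$pred                          # point forecasts
fc$pred - qnorm(0.9) * fc$se     # lower bounds of approximate 80% intervals
fc$pred + qnorm(0.9) * fc$se     # upper bounds of approximate 80% intervals
```

Nothing in this code says why the series moves; it only extrapolates the pattern in its past values, which is exactly the black-box idea.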
Explanatory Models
When we say causal models we are specifically looking for
other variables (which may be time series) which offer
explanations (or linkages) to the main variable (which may
also be a time series).
1. Explanatory models assume that the variable to be
forecasted exhibits an explanatory relationship with one
or more independent variables.
2. The purpose of the explanatory model is to discover the
form of the relationship and use it to forecast future
values of the forecast variable.
Time Series vs. Explanatory Models
Both time series and explanatory models have advantages
in certain situations.
• Time series models can often be used more easily to
forecast
• Explanatory models can be used with greater success
for policy and decision making.
Exercise
Several approaches have been suggested by those
attempting to predict stock market movements. Three of
them are described briefly below. How does each relate to
the different approaches to forecasting described here?
A. Dow Theory: There tend to be support levels (lower bounds) and
resistance levels (upper bounds) for stock prices, both for the
overall market and for individual stocks. These levels can be
found by plotting prices of the market or stock over time.
B. Random Walk Theory: There is no way to predict future movements
in the stock market or individual stocks, since all available
information is quickly assimilated by investors and moves market
prices in the appropriate direction.
C. Prices of individual stocks or of the market in general are largely
determined by earnings (e.g., dividend growth model).