THE UNIVERSITY OF LETHBRIDGE
DEPARTMENT OF GEOGRAPHY
GEOGRAPHY 3235: Quantitative Models for Geographic Analysis
Fall 2007
Assignment 2 – Output Evaluation and Curve Estimation (Part 2)
Introduction
Curve-fitting and extrapolation techniques provide well-defined scientific methods for
quantitative modellers to understand past trends and make educated predictions about
future trends. Initially, one must perform input evaluation – that is, the curve ‘fitting’
procedure – to select an appropriate mathematical model that encompasses the past data.
Then one can apply curve ‘estimation’ procedures to make predictions about future
values for a time series according to various models.
Goal
To apply curve estimation techniques to a time series and assess measures of
goodness of fit for several types of curves fitted to the given set of data.
Data
The data for this exercise are provided to you in an Excel spreadsheet and describe:
Mauna Loa, Hawaii, USA Yearly Average Atmospheric Carbon Dioxide Concentrations –
Source: C. D. Keeling, T.P. Whorf, and the Carbon Dioxide Research Group,
Scripps Institution of Oceanography (SIO), University of California.
Taber, Alberta Official Population –
Source: Alberta Municipal Affairs, Official Populations, 1901 - 2001. Available
online at http://www.municipalaffairs.gov.ab.ca/ms_official_pop_lists.htm
Calgary, Alberta Official Population –
Source: Statistics Canada, Census of Population.
General Instructions for Performing
Curve Estimation Analysis using SPSS
The SPSS Curve Estimation procedure will be demonstrated by your lab instructor.
Essentially, this feature of SPSS can be used with time series data to generate curve
estimation regression statistics and related plots for many different curve estimation
regression models – including three of the ones discussed in this course (linear,
geometric and logistic). In addition to the regression statistics, you can also save
predicted values and residuals for each prediction interval according to each model.
To set up the Curve Estimation procedure, choose Analyze > Regression > Curve
Estimation. Your dependent variable will be the population or measure that you are
modeling, and your independent variable will be set as ‘Time’. This assigns a simple
integer index that begins with “1” for the first observation and increases by 1 for
each subsequent observation up to the nth (you MUST have ‘Time’ as the independent
variable in order for some of the subsequent steps to work). You must also specify
the model or models for SPSS to fit to the data. You are familiar with the linear,
geometric (titled “Compound” in SPSS), and logistic models, but many more are
available.
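As a rough illustration of what the procedure computes for the linear and geometric models, the following Python sketch fits both to a series indexed by ‘Time’ = 1, 2, …, n. It assumes the geometric (“Compound”) model has the form Y = b0 × b1^t and is fitted by ordinary least squares on ln(Y); the function names and the sample series are hypothetical, and the results should be checked against the SPSS output rather than treated as a substitute for it.

```python
import math


def least_squares(t, y):
    """Ordinary least squares for y = a + b*t; returns (a, b)."""
    n = len(t)
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    b = sxy / sxx
    return y_bar - b * t_bar, b


def time_index(n):
    """The 'Time' variable: 1 for the first observation, up to n."""
    return list(range(1, n + 1))


def fit_linear(y):
    """Linear model Y = a + b*t."""
    return least_squares(time_index(len(y)), y)


def fit_compound(y):
    """Geometric ('Compound') model Y = b0 * b1**t, fitted on ln(Y)."""
    logs = [math.log(v) for v in y]
    ln_b0, ln_b1 = least_squares(time_index(len(y)), logs)
    return math.exp(ln_b0), math.exp(ln_b1)


# Hypothetical series that grows by 10% per period (not assignment data).
pop = [1100.0, 1210.0, 1331.0, 1464.1]
a, b = fit_linear(pop)
b0, b1 = fit_compound(pop)
print(f"linear:   Y = {a:.1f} + {b:.1f} * t")
print(f"compound: Y = {b0:.1f} * {b1:.3f} ** t")
```

Because the sample series grows by a constant ratio, the compound fit recovers that ratio as b1, while the linear fit can only approximate it with a constant increment.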
The ‘Save’ box gives you the option to calculate more than just the regression
statistics and plots for the data according to the specified model. In the ‘Save
Variables’ section, choosing to save ‘Predicted values’ and ‘Residuals’ gives you
Yc and (Y – Yc) for each time period. Table 3.7 on page 43 of your text shows the
interim calculations required for computing output evaluation statistics. Notice
that when applied to time series data (such as that displayed in columns 1 and 2 of
Table 3.7), the Curve Estimation procedure will quickly let you fill in most of the
rest of the required values (such as columns 3 and 4).
When specifying to save ‘Predicted values’, note that by default SPSS will only
calculate these values up to the last case of the data set. For example, if your
population data include 12 observations that run in odd-numbered years from 1981 to
2003, the last ‘Predicted value’ calculated automatically will be the estimate for
2003. However, you may predict values beyond the 12th observation by choosing
‘Predict through:’ and specifying the 13th, 14th, 15th, or nth observation. For
example, predicting through the 18th observation (as shown at right) will give an
estimated population that corresponds to the year 2015, since each observation
after the 12th advances the series by two years.
(As you know, this is about as far forward as we dare extrapolate given the 12
historical observations, based on generally-accepted ‘rules of thumb’).
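The link between an observation number and its calendar year can be sketched in a few lines of Python. The series layout (12 observations in odd-numbered years starting in 1981) comes from the example above; the coefficients and function names are hypothetical, and a linear model is assumed purely for illustration.

```python
def year_for_observation(index, first_year, step):
    """Calendar year of the given 1-based observation number."""
    return first_year + (index - 1) * step


def predict_through(a, b, last_index):
    """Linear-model estimates Yc = a + b*t for t = 1 .. last_index."""
    return [a + b * t for t in range(1, last_index + 1)]


FIRST_YEAR, STEP, N_OBSERVED = 1981, 2, 12   # series described in the text
A, B = 5000.0, 150.0                          # hypothetical coefficients

estimates = predict_through(A, B, last_index=15)
for t, yc in enumerate(estimates, start=1):
    label = "observed period" if t <= N_OBSERVED else "extrapolated"
    print(t, year_for_observation(t, FIRST_YEAR, STEP), round(yc, 1), label)
```

Working out the year for each index before extrapolating makes it easier to confirm that a ‘Predict through:’ value lands on the year you intend.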
Last, note that in order to complete the curve estimation procedure for the logistic
model, you will need to add one additional parameter – the upper growth limit of the
model. This is entered in the “Upper Bound:” space below the logistic model
checkbox.
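To see why the upper bound is needed, the sketch below assumes the logistic model takes the form Y = 1 / (1/u + b0 × b1^t), where u is the upper bound. Under that assumption, ln(1/Y – 1/u) is linear in t, so the remaining two parameters can be estimated by ordinary least squares once u is supplied. The function names and sample values are hypothetical; confirm the model form against the SPSS documentation before relying on it.

```python
import math


def fit_logistic(y, upper_bound):
    """Estimate (b0, b1) in Y = 1 / (1/u + b0 * b1**t) for t = 1 .. n."""
    if any(v <= 0 or v >= upper_bound for v in y):
        raise ValueError("every observation must lie between 0 and the upper bound")
    t = list(range(1, len(y) + 1))
    # Linearize: ln(1/Y - 1/u) = ln(b0) + ln(b1) * t
    z = [math.log(1.0 / v - 1.0 / upper_bound) for v in y]
    n = len(t)
    t_bar = sum(t) / n
    z_bar = sum(z) / n
    slope = (sum((ti - t_bar) * (zi - z_bar) for ti, zi in zip(t, z))
             / sum((ti - t_bar) ** 2 for ti in t))
    intercept = z_bar - slope * t_bar
    return math.exp(intercept), math.exp(slope)


def predict_logistic(b0, b1, upper_bound, t):
    """Estimated value Yc at time index t."""
    return 1.0 / (1.0 / upper_bound + b0 * b1 ** t)


# Hypothetical series generated from known parameters (not assignment data).
U = 10000.0
series = [predict_logistic(0.002, 0.7, U, t) for t in range(1, 9)]
b0, b1 = fit_logistic(series, U)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
print("estimate for t = 50:", round(predict_logistic(b0, b1, U, 50), 1))
```

With b1 below 1 the term b0 × b1^t shrinks toward zero, so the estimates level off at the upper bound. An upper bound at or below the largest observed value makes the logarithm undefined, which is why the sketch rejects it.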
1. Copy the two variables – “year” and “pop” – from your existing input evaluation
data set(s) from last week into new SPSS data file(s).
2. Using Analyze > Regression > Curve Estimation, obtain the following for each
data set according to each model: the regression coefficients (found in the SPSS
Output under “Parameter Estimates”), the goodness-of-fit statistics (under “Model
Summary”), and the estimated values (added to the SPSS Data window as
“FIT_1”).
3. Finally, manually calculate the output evaluation statistics (ME, MAPE) for your
time series data according to the linear and geometric models. Refer to Table
3.7 and formulas 3.5 and 3.6 on pp. 42-44 in your text (**Hint: you may need to
calculate some additional simple variables to assist in the calculation of the
MAPE statistic**).
4. Neatly summarize the information from the above questions into a series of
carefully formatted tables. You will need to copy/paste some of these data
(especially the predicted values according to each of the three models) into
Microsoft Excel in order to create your charts. Again, make sure to
keep things labeled and organized!
5. Using Microsoft Excel, create an X-Y graph for each data set that plots all of the
observed values AND the three “series” of estimated/predicted values. Display
the observed values as ‘markers’ without connecting lines; but display the
estimated/predicted values as lines only, omitting the markers. Title your graph
and carefully label each axis and the legend – ensure that your line symbols are
easily interpreted when the graph is printed in black and white! Print your graphs
and include them with your lab report.
6. To enhance your understanding of the linear model, calculate by hand the a, b,
and Yc values for the year 2011 for the Taber population data set. Use a separate
sheet of paper for each model, making sure to clearly indicate the appropriate
formulas for each calculation. Show all of your handwritten work! As well,
remember to use the ‘index’ or observation number in your calculation – not the
actual year 2011. (Remember that by using the index numbers and not the actual
years, the mean of the time variable will be zero, which makes the computations
much simpler!)
7. For each data set, assess the goodness-of-fit, ME, and MAPE statistics that you
have obtained in the previous questions. What do these values tell you about the
utility of the different curves in modeling each data set? How is this reflected in
the graphs that you created in Question 5 above?
8. In one or two paragraphs, compare the techniques of input and output
evaluation. In what ways are these methods similar, and how do they differ?
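The calculations in Questions 3 and 6 can be checked with a short Python sketch. It assumes the common definitions ME = Σ(Y – Yc) / n and MAPE = (100 / n) × Σ|Y – Yc| / Y; verify these against formulas 3.5 and 3.6 in your text before relying on them. The linear fit uses index values centered on zero (observation number minus its mean), which is one possible way to obtain a time variable with a mean of zero. All function names and sample values are hypothetical.

```python
def mean_error(observed, predicted):
    """ME: average of the residuals (Y - Yc)."""
    n = len(observed)
    return sum(y - yc for y, yc in zip(observed, predicted)) / n


def mape(observed, predicted):
    """MAPE: average of |Y - Yc| / Y, expressed as a percentage."""
    n = len(observed)
    return 100.0 * sum(abs(y - yc) / y for y, yc in zip(observed, predicted)) / n


def fit_linear_centered(observed):
    """Fit Yc = a + b*x, where x is the observation number minus its mean."""
    n = len(observed)
    center = (n + 1) / 2                      # mean of 1 .. n
    x = [t - center for t in range(1, n + 1)]
    a = sum(observed) / n                     # simplifies because sum(x) == 0
    b = sum(xi * yi for xi, yi in zip(x, observed)) / sum(xi * xi for xi in x)
    return a, b, center


def estimate(a, b, center, observation):
    """Yc for any observation number, including ones beyond the data."""
    return a + b * (observation - center)


# Hypothetical values (not assignment data).
y = [3.0, 5.0, 7.0, 9.0]
a, b, center = fit_linear_centered(y)
print("a =", a, "b =", b)
print("estimate for observation 5:", estimate(a, b, center, 5))

obs, pred = [100.0, 200.0], [110.0, 190.0]
print("ME   =", mean_error(obs, pred))        # the two errors cancel
print("MAPE =", round(mape(obs, pred), 2), "percent")
```

The second example shows why ME and MAPE should be read together: errors of opposite sign cancel in ME, while MAPE still reports their average size relative to the observed values.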
Your laboratory report should be typed with a cover sheet and submitted to your lab instructor on or before
October 4, 2007. Reports should be submitted only in person OR through the geography assignment drop
box; no email submissions please. You may format your lab report with numbers indicating the answers to
each of the questions. For ‘discussion’-type questions, please respond in paragraph form, using correct
spelling, grammar and punctuation. For ‘action’-type questions, please make use of the Copy/Paste
functions in Microsoft Office to insert your work into the lab report. If a table or chart does not easily fit
on the page, attach it as a clearly labelled appendix. Following the format used in your textbook
and the Guide to Term Papers on the course web page, note that graphs and tables should be
numbered and should include titles, axis labels, and a source to indicate where you obtained the data.