THE UNIVERSITY OF LETHBRIDGE
DEPARTMENT OF GEOGRAPHY
GEOGRAPHY 3235: Quantitative Models for Geographic Analysis
Fall 2007

Assignment 2 – Output Evaluation and Curve Estimation (Part 2)

Introduction

Curve-fitting and extrapolation techniques give quantitative modellers well-defined scientific methods for understanding past trends and making educated predictions about future trends. Initially, one must perform input evaluation (that is, the curve ‘fitting’ procedure) to select an appropriate mathematical model that encompasses the past data. Then one can apply curve ‘estimation’ procedures to predict future values of a time series according to various models.

Goal

To apply curve estimation techniques to a time series and to assess various measures of goodness of fit for different types of curves fitted to the given data.

Data

The data for this exercise are provided to you in an Excel spreadsheet and describe:

Mauna Loa, Hawaii, USA Yearly Average Carbon Dioxide Emissions – Source: C. D. Keeling, T.P. Whorf, and the Carbon Dioxide Research Group, Scripps Institution of Oceanography (SIO), University of California.

Taber, Alberta Official Population – Source: Alberta Municipal Affairs, Official Populations, 1901 - 2001. Available online at http://www.municipalaffairs.gov.ab.ca/ms_official_pop_lists.htm

Calgary, Alberta Official Population – Source: Statistics Canada, Census of Population.

General Instructions for Performing Curve Estimation Analysis using SPSS

The SPSS Curve Estimation procedure will be demonstrated by your lab instructor. Essentially, this feature of SPSS can be used with time series data to generate regression statistics and related plots for many different curve estimation models, including three of the ones discussed in this course (linear, geometric and logistic).
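
As a rough illustration of the three curve types named above, the sketch below writes each one as a Python function of the time index t. It assumes the equation forms given in the SPSS documentation for the Linear, Compound and Logistic options; the names b0, b1 and upper, and every numeric value, are hypothetical and chosen only for illustration.

```python
def linear(t, b0, b1):
    """Linear model: Y = b0 + b1 * t."""
    return b0 + b1 * t


def compound(t, b0, b1):
    """Geometric model (called 'Compound' in SPSS): Y = b0 * b1**t."""
    return b0 * b1 ** t


def logistic(t, b0, b1, upper):
    """Logistic model with an upper bound: Y = 1 / (1/upper + b0 * b1**t)."""
    return 1.0 / (1.0 / upper + b0 * b1 ** t)


if __name__ == "__main__":
    # Hypothetical parameter values, NOT estimates from the assignment data.
    for t in range(1, 6):
        print(
            t,
            round(linear(t, 10.0, 3.0), 2),
            round(compound(t, 10.0, 1.5), 2),
            round(logistic(t, 0.09, 0.5, 100.0), 2),
        )
```

With b1 between 0 and 1, the logistic curve in this form rises toward the upper bound as t increases, which is why that model needs one more parameter than the other two.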
In addition to the regression statistics, you can also save predicted values and residuals for each prediction interval according to each model.

To set up the Curve Estimation procedure, choose Analyze > Regression > Curve Estimation. Your dependent variable will be the population or measure that you are modeling, and your independent variable will be set as ‘Time’. This assigns a simple integer index that begins at 1 for the first observation and increases by 1 for each subsequent observation up to the nth (you MUST have ‘Time’ as the independent variable for some of the subsequent steps to work). You must also specify the model that SPSS should fit to the data. You are familiar with the linear, geometric (titled “compound” in SPSS) and logistic models, but there are also many more!

The ‘Save’ box gives you the option to calculate more than just the regression statistics and plots for the data according to the specified model. In the ‘Save Variables’ section, choosing to save ‘Predicted values’ and ‘Residuals’ gives you the Yc and (Y – Yc) values for each time period. Table 3.7 on page 43 of your text shows the interim calculations required for computing output evaluation statistics. Notice that when applied to time series data (such as that displayed in columns 1 and 2 of Table 3.7), the Curve Estimation procedure will quickly let you fill in most of the remaining values (such as columns 3 and 4).

When you save ‘Predicted values’, note that SPSS calculates them only up to the last case of the data set. For example, if your population data include 12 observations in odd-numbered years from 1981 to 2003, by default the last ‘Predicted value’ calculated will be the estimate for 2003.
However, you may predict values beyond the 12th observation by choosing ‘Predict through:’ and specifying the 13th, 14th, 15th, or nth observation. For example, predicting through the 18th observation (as shown at right) will give an estimated population corresponding to the year 2015, since each observation after 2003 adds two years. (As you know, this is about as far forward as we dare extrapolate given the 12 historical observations, based on generally accepted ‘rules of thumb’.)

Last, note that in order to complete the curve estimation procedure for the logistic model, you will need to add one additional parameter: the upper growth limit of the model. This is entered in the “Upper Bound:” space below the logistic model checkbox.

1. Copy the two variables, “year” and “pop”, from your existing input evaluation data set(s) from last week into new SPSS data file(s).

2. Using Analyze > Regression > Curve Estimation, obtain for each data set and each model: the regression coefficients (found in the SPSS Output under “Parameter Estimates”), the goodness-of-fit statistics (under “Model Summary”), and the estimated values (added to the SPSS Data window as “FIT_1”).

3. Finally, manually calculate the output evaluation statistics (ME, MAPE) for your time series data according to the linear and geometric models. Refer to Table 3.7 and formulas 3.5 and 3.6 on pp. 42-44 in your text (**Hint: you may need to calculate some additional simple variables to assist in computing the MAPE statistic**).

4. Neatly summarize the information from the above questions into a series of carefully formatted tables. You will need to copy/paste some of these data (especially the predicted values according to each of the three models) into Microsoft Excel, in particular for the creation of your charts.
Again, make sure to keep things labeled and organized!

5. Using Microsoft Excel, create an X-Y graph for each data set that plots all of the observed values AND the three “series” of estimated/predicted values. Display the observed values as ‘markers’ without connecting lines, but display the estimated/predicted values as lines only, omitting the markers. Title your graph and carefully label each axis and the legend. Ensure that your line symbols are easily interpreted when the graph is printed in black and white! Print your graphs and include them with your lab report.

6. To enhance your understanding of the linear model, calculate by hand the a, b, and Yc values for the year 2011 for the Taber population data set. Use a separate sheet of paper for each model, making sure to clearly indicate the appropriate formulas for each calculation. Show all of your handwritten work! As well, remember to use the ‘index’ or observation number in your calculation, not the actual year 2011. (Remember that by using the index numbers and not the actual years, the mean of the time variable will be zero, which makes the computations much simpler!)

7. For each data set, assess the goodness-of-fit, ME, and MAPE statistics that you have obtained in the previous questions. What do these values tell you about the utility of the different curves in modeling each data set? How is this reflected in the graphs that you created in Question 5 above?

8. In one or two paragraphs, compare the techniques of input and output evaluation. In what ways are these methods similar, and how do they differ?

Your laboratory report should be typed with a cover sheet and submitted to your lab instructor on or before October 4, 2007. Reports should be submitted only in person OR through the geography assignment drop box; no email submissions please. You may format your lab report with numbers indicating the answers to each of the questions.
For ‘discussion’-type questions, please respond in paragraph form, using correct spelling, grammar and punctuation. For ‘action’-type questions, please make use of the Copy/Paste functions in Microsoft Office to insert your work into the lab report. If a table or chart does not easily fit on the page, attach it as a clearly labelled appendix. Following the format used in your textbook and the Guide to Term Papers on the course web page, note that graphs and tables should be numbered and should include titles, axis labels, and a source indicating where you obtained the data.
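
The hand calculations in Questions 3 and 6 can be checked with a short script. The following is a minimal Python sketch, not a substitute for your handwritten work. It assumes the usual definitions ME = sum(Y - Yc) / n and MAPE = (100 / n) * sum(|Y - Yc| / Y), which you should confirm against formulas 3.5 and 3.6 in your text, and it assumes centred index numbers whose mean is zero. The population values are made up for illustration and are not the Taber data.

```python
def centred_index(n):
    """Index numbers centred on zero, e.g. -2, -1, 0, 1, 2 for n = 5."""
    return [i - (n - 1) / 2 for i in range(n)]


def linear_fit(y):
    """Least-squares a and b for Yc = a + b*X when mean(X) is zero."""
    x = centred_index(len(y))
    a = sum(y) / len(y)  # with mean(X) = 0, a is simply the mean of Y
    b = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    return a, b


def mean_error(observed, predicted):
    """ME: average of the residuals (Y - Yc)."""
    return sum(o - p for o, p in zip(observed, predicted)) / len(observed)


def mape(observed, predicted):
    """MAPE: average of |Y - Yc| / Y, expressed as a percentage."""
    total = sum(abs(o - p) / o for o, p in zip(observed, predicted))
    return 100 * total / len(observed)


# Hypothetical populations for five equally spaced observations.
pop = [100.0, 120.0, 130.0, 160.0, 170.0]

a, b = linear_fit(pop)
yc = [a + b * x for x in centred_index(len(pop))]

# Extrapolate two observations past the last one by extending the index.
last_x = centred_index(len(pop))[-1]
yc_future = a + b * (last_x + 2)

print("a =", a, "b =", b)
print("ME =", mean_error(pop, yc))
print("MAPE =", round(mape(pop, yc), 2), "%")
print("Yc two observations ahead =", yc_future)
```

For the linear model fitted by least squares, the residuals sum to zero, so ME is zero up to rounding; MAPE is usually the more informative statistic when comparing curves.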