TIME SERIES (AS 91580, AS 3.8) – SUGGESTIONS FOR TEACHING TIME SERIES

Why study time series?

Students are always asking their teachers why they have to study a topic, so start the unit by answering this question first! The primary reason most people study time series is that they are interested in predicting the future. To do this they model the past behaviour of a time series and hope that this pattern of behaviour will continue into the future, so that a prediction can be calculated. The problem is that some time series are simply unpredictable; those that are not, we can attempt to model.

Predictions of time series are required in many different areas:
- Population projections are calculated by Government bodies in order to predict when a new school, hospital, road, bridge, prison or houses will need to be built.
- Economic forecasts, such as share prices or exchange rates, are used by financial institutions.
- Weather forecasts are probably the most common type of prediction, and we hear about them every day.
- Environmental forecasts covering topics like global warming, monitoring populations of species close to extinction, spread of disease, rainfall, temperature etc. are calculated by scientists from a variety of disciplines.

As a Government Statistician I produced a variety of projections, including:
- Ocean wave heights for engineers who were trying to develop machines to harness wave energy. They needed to know wave heights because from these they could calculate the forces that the machines would need to withstand.
- Prison population projections by type of prisoner (lifer, remand, short-term, young offender, etc.) so that decisions about where and when to build new prisons could be taken.
- Predictions of the number of prescriptions dispensed nationally. The government subsidises each prescription dispensed, so a budget could be worked out from the projections.
- Predictions of hospital waiting lists for a variety of procedures.
These predictions were not only used as part of hospital planning but were also input into medical training programmes to ensure that the right number of specialists were being trained in the right areas.

What are time series?

Provide a selection of time series for your students to discuss in groups. Some possible sources for these are:
- Statistics NZ
- Figure.NZ
- Datamarket
- American Statistical Association
- Google trends

Allow students to develop their own descriptions before you introduce correct terminology. Encourage students to speculate about possible reasons for the variation that they can see. Give the same time series to different groups – it is interesting to see the different features that different groups will notice.

Aim to include a range of time series with increasing complexity, from:
- Stationary (no trend) time series with little variation
- Time series with no long term trend but some seasonality
- Time series with a linear trend
- Time series with a linear trend and seasonality
- Time series with a non-linear trend and seasonality
- Time series with a piece-wise trend, with and without seasonality
- Time series with a linear trend and cycle
- Time series with a non-linear trend and cycle
- Time series with no discernible pattern, i.e. one that is unpredictable

Examples of time series with some of these characteristics are given in Appendix 1. Some of the time axes are unquantified; this is deliberate and designed to stimulate debate about what the unit of measurement might be, judging from the shape of the variation. Don’t worry too much about the metadata at this stage either; the main focus should be on developing students’ skills of describing time series patterns.

Terminology

Having exposed students to a variety of time series which they described using their own vocabulary, revisit the same time series and repeat the descriptions, this time using the correct terminology. Terms to cover include:

Trend – short term and long term.
The long term trend is the most slowly changing component of the series. The trend can be either increasing or decreasing over time and it may be linear or nonlinear. A short term trend is a temporary shift which may or may not have been caused by a one-off unusual event; once this event has passed, the previous long term trend direction normally resumes.

Seasonality – remind students that a ‘season’ might be a day, a week, a month, a quarter or any repeating time period.

Residuals – ask students to identify any unusual residuals, that is, any residual which is greater than 10% of the overall variation in the raw data series. Any unusual values thus identified warrant further investigation. Perhaps the unusual value represents an error in the data, or perhaps it occurred as the result of a related unusual event; students will need to research events around the time of the unusual value to conjecture about possible reasons. Conjectures are fine; proof is not required.

Peaks and Troughs – terms used to describe local maxima and minima in a time series. Students should identify whether peaks or troughs occur at the same point in the seasonal cycle and again conjecture about possible reasons for this.

Cycle – a cycle is a recurrent wave-like pattern. The period and amplitude of a cycle are neither fixed nor predictable. Thus we can describe cycles as irregular wave-like patterns in series. Many financial and economic time series have cycles that are related to changing business conditions. Students should be exposed to time series with cycles but they will not have to model them, as the techniques required are far too complex.

Smoothing Techniques

Hopefully some students will have struggled to adequately describe the time series you have exposed them to, particularly if you have presented your time series with equal length axes as opposed to a longer x axis. The overall trend is often hard to identify, particularly in a series which is dominated by seasonality.
In order to view just the trend without the distraction of the seasonality, a number of smoothing techniques are available. If you google smoothing techniques you will see there are many. The student tutorial provided in Appendix 2 introduces a few smoothing techniques, namely:
- Moving Mean
- Weighted Moving Mean
- Exponential smoothing (α = 0.5)
- Exponential smoothing (α = 0.1)

Through completion of this tutorial, students can see the effects of smoothing, and this lays the foundation for using the time series module of iNZight, which uses a smoothing procedure called Holt-Winters. A Teacher’s Guide to Holt-Winters is attached at Appendix 3. Students do not need to know how Holt-Winters works but they do need to understand that it is a refinement of exponential smoothing, so it can be helpful to go through the process of calculating smoothed values by hand before they are exposed to the software that will handle the calculations for them. Without this step, the software becomes a ‘black box’ and an important component of the student’s learning trajectory will have been omitted. Robust research also supports the importance of this step.

Description of overall trend

With the move to using real data in the teaching and assessment of time series, the task of describing the overall trend of the time series has become a lot more complex. Scaffolding can no longer be provided by using the coefficient of a linear regression model, and the thorny issue of how many pieces comprise the time series emerges. This issue is the subject of a separate paper currently being prepared by the NZSA Education Committee, which will be posted on the Census@School website when finalised. The advice, in short, is to train students to describe the long term trend of their time series, which means they should not be distracted by the short term variations that will always be present. See examples of acceptable and unacceptable trend descriptions below.
“Looking at the smoothed values, there appears to be a slight increasing trend in the mean area of Arctic sea ice from Jan 1990 – Dec 1992, followed by a decline in the mean area of Arctic Sea Ice during 1993, a slight increasing trend in 1994, then general decreasing trend from 1995-2011 (with a more rapidly decreasing trend than the rest of the years at the second half of 1995 and 2007 respectively).”

This is an unacceptable description of the trend as it focuses too much on short term variations.

“Overall the trend in the mean area of Arctic sea ice from 1990 to 2010 is slightly decreasing.”

This is an acceptable description of the overall trend; the only addition might be some comment quantifying the change, for example: “The area of Arctic sea ice shows a very gradual decline over the period 1990 to 2010. The trend level has fallen from around 9.5 million km² to around 8.5 million km² over the time period.”

Predictions

If a student understands the underlying concepts of the model of their time series, they can then make sensible statements about their predictions. For example:
- How far into the future are the predictions likely to be reliable?
- Are there any indications that past patterns of behaviour are not going to continue?
- Are predictions available for any related time series? How do these predictions compare to those calculated?
- What does the width of the confidence intervals tell you about the predictions?
- What does the width of the confidence intervals tell you about the fit of the model in general?

Remember: “All models are wrong, but some are useful” (Box, 1987).

Predictions are especially problematic if there has been an unusual value near the end of the time series, and this will be reflected in wide confidence intervals. Model robustness can be tested by removing the last few values of a time series, refitting the model and investigating how the ‘predictions’ compare with actual data values.
If the actual data values fall within the prediction confidence intervals, model robustness is supported.

Interpretation and Conjecture

Encourage students to explain the features they have observed in their time series. Some features will be easier to explain than others.

Possible explanations for seasonal effects:
- Ice cream sales – more sold in summer, fewer in winter.
- Power usage – in NZ more power is used in winter and less in summer, but compare this with countries that have hotter climates, where power usage is often greater in summer because of air conditioning.
- Alcohol sales – often peak around Christmas and New Year.
- Retail sales – again, peaks around Christmas are common. Perhaps compare with countries that do not celebrate Christmas: are their seasonal patterns different?
- Weather aspects – is rainfall seasonal?

Possible explanations for changes in long term overall trend:
- 2008 Global Financial Crisis. Many financial series show dramatic disruptions to overall trends around 2008.
- Global Warming. Consider related series.
- Health Scares – SARS, Asian flu, AIDS, Ebola, Mad Cow Disease. These show up well on Google trends.
- Acts of Terrorism – such acts can dramatically affect airline travel and other aspects of tourism.
- Major Sporting Events – Olympics, Commonwealth Games, Football World Cup, Rugby World Cup.

Investigate other research to see if it confirms or refutes a suspected overall trend change.

Interrogative Reasoning

At the end of an initial analysis of a time series, students should consider further questions inspired by their investigation. For example, if conditions changed, can they suggest how this might affect predictions? Some time series may reflect patterns shown in related time series but with a lag, i.e. the pattern in one series is several time periods behind that in another series.
In the Food for Thought data set a drop in the four retail spending series – supermarkets, fresh food, takeaways and restaurants – was found, but the drop occurred in each series at different times. In this scenario fresh food spending was the first to fall, followed by takeaways, then restaurants and supermarkets. Thus if in the future a drop is observed in fresh food spending, it may be an indicator of falls to come in other related time series.

The American Statistical Association website (http://www.the-numbers.com/ and http://www.amstat.org/publications/jse/v17n1/datasets.mclaren.html) has a large data set concerning box office takings for a number of different movies. This data set provides ample opportunity to exercise interrogative reasoning. Movie data questions that come to mind include:
- Do similar film genres have similar box office patterns?
- Which film genre’s box office takings drop off the quickest?
- Do sequels display similar box office patterns?
- How do different genres compare? Action vs rom-com, for example?
- If you were a cinema manager, what sort of movie would you try to get? Does this vary depending on the time of year?

What happens next in time series analysis?

It is always good to be able to explain to students what happens next in a topic. In time series it generally means moving on to more complicated models that will enable them to model some of the time series they saw at the beginning of the unit. Models covered in a Stage 3 Time Series course at the University of Auckland include:
- Autocorrelation – inclusion of an element in the model for correlation between values
- Transformation of series – for time series with a non-constant variation
- Alternative smoothing techniques
- Harmonic models – using trigonometric functions to model trends
- ARCH models – autoregressive conditional heteroscedasticity models, used to model time series whose variation changes over time.
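For teachers who want a feel for the first item in the list above, here is a rough sketch of a sample autocorrelation: the correlation between a series and a copy of itself shifted by a given number of time periods. The function name `autocorrelation` and the toy quarterly series are made up for illustration and are not taken from the Auckland course; this is one common textbook form of the calculation, not a definitive implementation.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given lag (hypothetical helper)."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must be between 0 and len(series) - 1")
    mean = sum(series) / n
    # Total squared deviation from the mean (the lag-0 term).
    denominator = sum((x - mean) ** 2 for x in series)
    if denominator == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    # Products of deviations that are `lag` periods apart.
    numerator = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numerator / denominator


# Toy quarterly series with a repeating seasonal pattern (illustrative data only).
quarterly = [1, 2, 3, 4] * 5
for k in (1, 2, 4):
    print(k, round(autocorrelation(quarterly, k), 2))
```

For a seasonal series like this toy one, the value at a lag of one full season (here 4) is strongly positive, which is one way software can detect seasonality.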
Time Series for Teaching and Assessment

DO use a variety of time series with linear & non-linear trends, fluctuating variation, cycles and large residuals in your teaching of time series.

DO NOT use these more complex time series in your assessments. Such time series are beyond the capability of many 3rd Year University students, so don’t expect your secondary school students to cope. This does pose a problem, as real data is often not nicely behaved, yet teachers are encouraged to use real data in their assessments.

Another problem is that the conditions of assessment guidelines ask teachers to provide multivariate data sets for time series assessments, from which students must select one time series to analyse. It is an extremely difficult, almost impossible, task to find a multivariate data set in which all variables provide analysis opportunities of equal difficulty. It can also represent a huge marking workload when students select different series. I suggest teachers limit the multivariate data set to 2 or 3 variables maximum. Some teachers are also using alternative forms of assessment, such as a presentation rather than a report, for the time series internal in an attempt to reduce the marking workload.

The hierarchical levels of reasoning referred to in this document are taken from a framework for the development of reasoning in time series constructed for my Master’s thesis, which is due to be submitted at the end of January 2016.
Rachel Passmore
November 2015

Appendix 1

TIME SERIES – TREND DESCRIPTION

[Figure: DAILY_PER_THEATER – A Beautiful Mind; vertical axis 0 to 10000, horizontal axis time periods 1 to 106]

[Figure: Daily Numbers – Spiderman; vertical axis 0 to 15000, horizontal axis time periods 1 to 76]

[Figures: FROM GOOGLE TRENDS]

Appendix 2

EXAMPLES OF SMOOTHING TECHNIQUES FOR TIME SERIES
STUDENT INVESTIGATION INSTRUCTIONS

Introduction

There are many different ways to smooth a time series. Methods depend on the type of time series but have also depended historically on ease of calculation. Thus the Moving Mean or Centred Moving Mean approach has historically been favoured for its ease of calculation. Technological change means we are no longer restricted to such a smoothing technique, especially given its disadvantages.

Techniques to be investigated
1. Moving Mean
2. Weighted Moving Mean
3. Exponential smoothing

INSTRUCTIONS

1. Copy the EXCEL data file called LArain. This contains data on rainfall in Los Angeles, measured in inches, between 1908 and 1973. The time variable should be in column A and the rainfall data in column B. Add these headings in the suggested cells:
   D1 – enter MM(3) or Moving Mean, order 3
   F1 – enter Weighted Moving Mean
   H1 – enter Exponentially smoothed (α = 0.5)
   K1 – enter Exponentially smoothed (α = 0.1)
2. Plot the time series using the INSERT option in EXCEL.
3. SMOOTHING TECHNIQUE 1 – This technique smooths the series by calculating the mean of a number of consecutive values of the time series. The ORDER of the moving mean refers to the number of consecutive values you include in the calculation of your mean. E.g. ORDER = 3 means: first smoothed value = mean of first three values of series. This first smoothed value will be plotted against the second time period in your series.
4. In cell D3 enter the following formula: =average(B2:B4) and then copy the formula down to cell D66.
5.
Plot LA rainfall and Moving Mean (Order 3) on same graph.
6. SMOOTHING TECHNIQUE 2 – The moving mean technique applies an equal weight to each of the previous values included in the smoothing calculation. Smoothing using a weighted mean allows us to manipulate these weights. For example, instead of using equal weights we could allocate 50% (or 0.50) weighting to the most recent value, 30% (or 0.30) weighting to the value before that and 20% (or 0.20) weighting to the value before that. E.g. first smoothed value = (0.5 x third value) + (0.3 x second value) + (0.2 x first value)
7. In cell F5 enter the following formula: =0.5*B4+0.3*B3+0.2*B2 and copy the formula down to cell F67.
8. Plot LA rainfall data and Weighted Moving Mean on same graph.
9. SMOOTHING TECHNIQUE 3 – With exponential smoothing, each smoothed value is a weighted mean of ALL of the previous values in the series. The weights decrease in size over time. Greater weight is attached to more recent values; less weight is attached to values further in the past. Exponential smoothing requires TWO initial parameters:
   - First smoothed value, estimated by the first data value
   - Smoothing parameter, α, which can range over 0 < α < 1 but is usually below 0.5. Enter 0.5 in cell I2.
10. Insert the following formula in cell H3: =B2
    This initialises the first smoothed value to the first data value.
11. Insert the following formula in cell H4: =$I$2*B3+(1-$I$2)*H3
    This calculates the second smoothed value. The $ signs surrounding I2 can be added by pressing F4. This ensures that the constant, α, does not change when the formula is copied. Copy the formula down to cell H67.
12. Plot LA rainfall data and exponentially smoothed series on same graph.
13. SMOOTHING TECHNIQUE 4 – This is another exponential smoothing technique, but this time we are going to reduce the smoothing parameter, α, from 0.5 to 0.1.
14. Initialise the first smoothed value. Insert the following formula in cell K3: =B2
15. Initialise the smoothing parameter, α.
Enter 0.1 in cell L2.
16. Insert the following formula in cell K4: =$L$2*B3+(1-$L$2)*K3 and then copy it down to cell K67. Again use the F4 button to keep the value of α unchanged in the formula.
17. Plot LA rainfall and exponentially smoothed series on same graph.

QUESTION – WHICH TECHNIQUE DO YOU PREFER AND WHY?

EXTRA ACTIVITIES TO TRY

18. Adjust the order of the Moving Mean. What does an order of 4, 5 or bigger look like? What are the disadvantages of a higher order Moving Mean?
19. Adjust the weights in the Weighted Moving Mean. What difference would weights of 70%, 20% & 10% make, for example? Try some other weights, or perhaps another order and another set of weights.
20. Adjust values of α. Try values in between, such as 0.4, 0.3 or 0.2.
21. Repeat the exercise with another stationary (no trend) series.

Appendix 3

A TEACHER’S GUIDE TO THE MODELS USED IN TIME SERIES MODULE OF iNZight

Introduction

The Time Series module of the FREE software package iNZight uses two different statistical models. The model used to obtain the series decomposition is called a Seasonal Trend Lowess and the model used to calculate predictions is a Holt-Winters model. This guide gives teachers a brief summary of the models used; the new standard does not expect students to know any of the theoretical background to the models.

Seasonal Trend Lowess (LOWESS – Locally Weighted Regression Scatterplot Smoothing)

Smoothing or filtering a Time Series is best thought of as similar to the idea of filtering music through an amplifier. We can amplify certain sounds or we can suppress certain sounds. Similarly, we can suppress (remove) certain features in a Time Series, such as seasonality, in order to model the trend and/or cycle. Once we have built a suitable model for the smoothed series, we can add back the appropriate seasonal component in order to produce predictions. A common method for smoothing a Time Series is to use moving averages, which is what has traditionally been taught in schools for AS 3.1.
One drawback of moving averages is that our moving average series becomes shorter than the original Time Series. If we have monthly data, our first moving average value is calculated on observations 1 to 12, and the second moving average value is calculated on observations 2 to 13. We then average these two values to get our first centred moving average value, which then replaces observation 7 in our original series, so the first six observations have no moving average value. Similarly, at the end of our series, there are six observations that we have no moving average values for.

A more useful tool for isolating and then removing the seasonal component of a Time Series is Seasonal Trend Lowess, a decomposition function in R (the programming language that iNZight is written in). The method used is to first smooth the trend and cycle using a lowess smoother (fitting a local regression to a window of points and using the point on the fitted regression line as the value of the smooth for the time value in the middle of the window). The regression that is used is “weighted”, in that observations near the edge of the window are given less weight than observations near the centre of the window when determining the local regression line. Then a separate lowess smoother is used on each seasonal subseries (i.e. all the January observations, all the February observations, …). The “trend and cycle” smoothed value and the appropriate “seasonal” smoothed value can be subtracted from the original observation to yield the remainder or random component for that observation. iNZight produces a plot of the decomposition that shows the original series, the seasonal component, the trend and cycle and, finally, the random component.

A third option for smoothing data is exponential smoothing, and it is this technique that is used in the Holt-Winters model.

Holt-Winters Model

This model, often referred to as a procedure, was first proposed in the early 1960s. It uses a process known as exponential smoothing.
All data values in a series contribute to the calculation of the prediction model.

[Figure: Stationary Time Series; series TS1 plotted against Time, 0 to 500]

Exponential smoothing in its simplest form should only be used for non-seasonal time series with no trend, i.e. a roughly constant level (what is known as a stationary time series). It seems a reasonable assumption to give more weight to the more recent data values and less weight to the data values from further in the past. An intuitive set of weights is one in which the weights decrease each time by a constant ratio. Strictly speaking this implies an infinite number of past observations, but in practice there will be a finite number. Such a procedure is known as exponential smoothing since the weights lie on an exponential curve.

If the smoothed series is denoted by $S_t$ and $\alpha$ denotes the smoothing parameter (the exponential smoothing constant), with $0 < \alpha < 1$, the smoothed series is given by:

$$S_t = \alpha y_t + (1 - \alpha) S_{t-1}, \quad \text{where } S_1 = y_1$$

The smaller the value of $\alpha$, the smoother the resulting series. It can be shown that:

$$S_t = \alpha y_t + \alpha(1 - \alpha) y_{t-1} + \alpha(1 - \alpha)^2 y_{t-2} + \dots + (1 - \alpha)^{t-1} y_1$$

Consider the following Time Series: 14 24 5 18 10 17 23 17 23 …

Using the formulae above, with an exponential smoothing constant $\alpha = 0.1$:

$$S_1 = y_1 = 14$$
$$S_2 = \alpha y_2 + (1 - \alpha) S_1 = 0.1(24) + 0.9(14) = 15$$
$$S_3 = \alpha y_3 + (1 - \alpha) S_2 = 0.1(5) + 0.9(15) = 14$$
$$S_4 = \alpha y_4 + (1 - \alpha) S_3 = 0.1(18) + 0.9(14) = 14.4$$

etc.

Thus the smoothed series depends on all previous values, with the most weight given to the most recent values. Exponential smoothing requires a large number of observations. Exponential smoothing is not appropriate for data that has a seasonal component, cycle or trend. However, modified methods of exponential smoothing are available to deal with data containing these components.

The Holt-Winters model uses a modified form of exponential smoothing. It applies three exponential smoothing formulae to the series. Firstly, the level (or mean) is smoothed to give a local average value for the series.
Secondly, the trend is smoothed, and lastly each seasonal sub-series (i.e. all the January values, all the February values, … for monthly data) is smoothed separately to give a seasonal estimate for each of the seasons. A combination of these three series is used to calculate the predictions output by iNZight.

The exponential smoothing formulae applied to a series with a trend and constant seasonal component using the Holt-Winters additive technique are:

$$a_t = \alpha (Y_t - s_{t-p}) + (1 - \alpha)(a_{t-1} + b_{t-1})$$
$$b_t = \beta (a_t - a_{t-1}) + (1 - \beta) b_{t-1}$$
$$s_t = \gamma (Y_t - a_t) + (1 - \gamma) s_{t-p}$$

where:
$\alpha$, $\beta$ and $\gamma$ are the smoothing parameters
$a_t$ is the smoothed level at time $t$
$b_t$ is the change in the trend at time $t$
$s_t$ is the seasonal smooth at time $t$
$p$ is the number of seasons per year

The Holt-Winters algorithm requires starting (or initialising) values. Most commonly:

$$a_p = \frac{1}{p}(Y_1 + Y_2 + \dots + Y_p)$$
$$b_p = \frac{1}{p}\left[\frac{Y_{p+1} - Y_1}{p} + \frac{Y_{p+2} - Y_2}{p} + \dots + \frac{Y_{p+p} - Y_p}{p}\right]$$
$$s_1 = Y_1 - a_p, \quad s_2 = Y_2 - a_p, \quad \dots, \quad s_p = Y_p - a_p$$

The Holt-Winters forecasts are then calculated using the latest estimates from the appropriate exponential smooths that have been applied to the series. So we have our forecast for time period $T$:

$$\hat{y}_T = a_T + b_T + s_T$$

where:
$a_T$ is the smoothed estimate of the level at time $T$
$b_T$ is the smoothed estimate of the change in the trend value at time $T$
$s_T$ is the smoothed estimate of the appropriate seasonal component at $T$

As mentioned earlier, the Holt-Winters model assumes that the seasonal pattern is relatively constant over the time period. Students would be expected to notice changes in the seasonal pattern and identify this as a potential problem with the model, particularly if long-term predictions are made. In practice this is dealt with by transforming the original data and modelling the transformed series, or by using a multiplicative model. Students are not expected to know this, but are required to identify a variable seasonal pattern as a potential problem.
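The additive updating formulae above can be sketched in a few lines of Python, which may help teachers see how the three smooths feed into each other. This is a minimal illustration only, not the code that iNZight runs. The function name `holt_winters_additive` and the toy data are made up. The smoothing parameters are supplied by hand rather than estimated from the data. Forecasts more than one step ahead use the common textbook form (level plus h times the trend plus the latest seasonal estimate for that season), which is an assumption that goes slightly beyond the formulae in this guide.

```python
def holt_winters_additive(y, p, alpha, beta, gamma, horizon=1):
    """Additive Holt-Winters sketch (hypothetical helper, not iNZight's code).

    y is the series, p the number of seasons per year. Returns the final
    level, the final trend and a list of `horizon` forecasts.
    """
    n = len(y)
    if n < 2 * p:
        raise ValueError("need at least two full years of data")

    # Starting values: mean of the first year, average year-on-year change
    # per period, and first-year deviations from the starting level.
    level = sum(y[:p]) / p
    trend = sum((y[p + i] - y[i]) / p for i in range(p)) / p
    seasonal = [y[i] - level for i in range(p)]  # seasonal[i] belongs to time i

    for t in range(p, n):
        previous_level = level
        old_seasonal = seasonal[t - p]  # estimate from one year earlier
        level = alpha * (y[t] - old_seasonal) + (1 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
        seasonal.append(gamma * (y[t] - level) + (1 - gamma) * old_seasonal)

    # Assumed multi-step form: level + h * trend + latest matching seasonal.
    forecasts = [
        level + h * trend + seasonal[n - p + (h - 1) % p]
        for h in range(1, horizon + 1)
    ]
    return level, trend, forecasts


# Toy quarterly data with a fixed seasonal pattern and no trend (illustrative only).
toy = [10, 20, 30, 40] * 3
final_level, final_trend, next_year = holt_winters_additive(
    toy, p=4, alpha=0.5, beta=0.5, gamma=0.5, horizon=4
)
print(final_level, final_trend, next_year)
```

With a series that simply repeats the same four quarterly values, the level stays at the yearly mean, the trend stays at zero and the forecasts reproduce the seasonal pattern, which is a useful sanity check before trying real data.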
The exponential smoothing formulae applied to a series using Holt-Winters Multiplicative models are:

$$a_t = \alpha \frac{Y_t}{s_{t-p}} + (1 - \alpha)(a_{t-1} + b_{t-1})$$
$$b_t = \beta (a_t - a_{t-1}) + (1 - \beta) b_{t-1}$$
$$s_t = \gamma \frac{Y_t}{a_t} + (1 - \gamma) s_{t-p}$$

The initialising values are as for the additive model, except:

$$s_1 = \frac{Y_1}{a_p}, \quad s_2 = \frac{Y_2}{a_p}, \quad \dots, \quad s_p = \frac{Y_p}{a_p}$$

So we have our prediction for time period $T$:

$$\hat{y}_T = (a_T + b_T) s_T$$

Calculation of Prediction Intervals for Holt-Winters

Reference: Yar, M. & Chatfield, C. (1990) Prediction intervals for the Holt-Winters forecasting procedure, International Journal of Forecasting, Vol. 6, pp. 127-137, North Holland.

There are many situations where it is important to give interval predictions, rather than point predictions, as a means of assessing future uncertainty. An interval prediction associated with a prescribed probability is sometimes called a confidence interval, but it is recommended that the term prediction interval is used in the context of time series analysis. This is because prediction interval is more descriptive and because the term ‘confidence interval’ is usually applied to interval estimates of model parameters.

Unfortunately it is relatively common to see predictions made without any reference to prediction intervals. This may be because there are a number of different ways that prediction intervals can be calculated. The paper above provides not only details of how the prediction intervals for Holt-Winters are produced but also compares the authors’ preferred method with several alternative methods. It also compares the prediction intervals calculated for the same data set by a variety of different models.

Derivation and details of the prediction interval calculation can be found in Yar & Chatfield’s (1990) paper – see sections 3 and 4 on page 129. In one example given in the paper, a monthly index of employment in manufacturing in Canada, a prediction of 115.9 is provided for three years after the end of the actual data. A prediction interval of [113.95, 117.85] is also calculated.
A suggested interpretation of this prediction interval (P.I.) is ‘There is a 95% chance that the true index value of employment in manufacturing in Canada in three years’ time will be between 113.95 and 117.85.’ The details given in this paper apply to an additive Holt-Winters model only.

Assessing non-stationary model forecasts

The test of any prediction model is how well it predicts when compared to actual data values. To do this, either remove the last few given observations or find the next few actual observations. Different prediction models can then be compared using a statistic known as the Root Mean Squared Error of Prediction (RMSEP). The formula for calculating this statistic is given below:

$$\text{RMSEP} = \sqrt{\frac{1}{\tau} \sum_{t=1}^{\tau} (y_t - \hat{y}_t)^2}$$

where $\tau$ is the number of predictions we are using in the calculation of RMSEP. Students are not expected to calculate RMSEP.

Want to know more?

There are several YouTube clips that explain exponential smoothing and Holt-Winters using Excel or R if you are interested.

Summary of what students need to know
- The Holt-Winters Additive model assumes the seasonal pattern is reasonably constant.
- The Holt-Winters model uses a technique of exponential smoothing, which is a weighted sum of previous values in a series. More weight is given to more recent values and less weight is given to values from the distant past.
- The Holt-Winters Additive model exponentially smooths three series in order to produce predictions – the level, the trend and the seasonal sub-series.
- Students should be able to identify cyclical components and inconsistent seasonal patterns. They should note that such features are incompatible with the assumptions underlying the Holt-Winters Additive model and suggest that a multiplicative model be considered instead. Such a comment would be expected at Excellence level only. Students are NOT expected to calculate a multiplicative model.

With thanks to Mike Forster, Department of Statistics, University of Auckland