Week 3 September 15-19 Five Mini-Lectures QMM 510 Fall 2014 ML 3.1 Chapter Contents 14.1 Time-Series Components 14.2 Trend Forecasting 14.3 Assessing Fit 14.4 Moving Averages 14.5 Exponential Smoothing 14.6 Seasonality 14.7 Index Numbers 14.8 Forecasting: Final Thoughts 14-2 So many topics, so little time … Chapter 14 Time-Series Analysis Chapter 14 Time-Series Analysis • A time-series variable (Y) consists of data observed over n periods of time. • Businesses use time-series data - to monitor a process to determine if it is stable - to predict the future (forecasting) • Time-series data can also be used to understand economic, population, health, crime, sports, and social problems. 14-3 Chapter 14 Time-Series Analysis Time- Series Data • Time-series data are usually plotted as a line graph. • Time is on the horizontal (X) axis. • Trends and fluctuations are easier to see on a line graph. • The following notation is used: yt is the value of the time series in period t (t is an index denoting the time period t = 1, 2, …, n); n is the number of time periods; y1, y2, …, yn is the data set for analysis. 14-4 Chapter 14 Time-Series Components Time-Series Data • To distinguish time-series data from cross-sectional data, use yt instead of xi for an individual observation. Measuring Time Series • Time-series data may be measured at a point in time. • For example, prime rate of interest is measured at a particular point in time. • Time-series data may also be measured over an interval of time. • For example, gross domestic product (GDP) is a flow of goods and services measured over an interval of time. 14-5 Chapter 14 Time-Series Components Periodicity • • The periodicity is the time interval over which data are collected. Data can be collected once a year (e.g., 1 observation per year), quarter (e.g., 4 observations per year), month (e.g., 12 observations per year), etc. Additive versus Multiplicative Models • • Time-series decomposition seeks to separate a time series Y into four components: - Trend (T) - Cycle (C) - Seasonal (S) - Irregular (I) These components are assumed to follow either an additive or a multiplicative model. 14-6 Additive versus Multiplicative Models • 14-7 The multiplicative model becomes additive if logarithms are taken (for nonnegative data): Chapter 14 Time-Series Components Trend • • • Trend (T) is the general movement over all years (t = 1, 2, ..., n). Trends may be steady and predictable, increasing, decreasing, or staying the same. A mathematical trend can be fitted to any data but may or may not be useful for predictions. 14-8 Chapter 14 Time-Series Components Chapter 14 Time-Series Components Cycle • • Cycle (C) is a repetitive up-anddown movement about a trend that covers several years. Over a small number of time periods, cycles are undetectable or may resemble a trend. Note: Forecasters generally ignore the cycle so the multiplicative model is just Y = T x S x I. 14-9 Chapter 14 Time-Series Components Seasonal • • • Seasonal (S) is a repetitive cyclical pattern within a year (or within a week, day, or other time period). Over a small number of time periods, cycles are undetectable or may resemble a trend. By definition, annual data have no seasonality. 14-10 Chapter 14 Time-Series Components Irregular • • Irregular (I) is a random disturbance that follows no pattern. It is also called the error component or random noise reflecting all factors other than trend, cycle and seasonality. 14-11 Chapter 14 Trend Forecasting ML 3.2 The main categories of forecasting models are: We will only look at this one category of models (more time to surf) 14-12 Chapter 14 Trend Forecasting Steps in Forecasting: 1. Make a nice Excel graph a. Highlight the data column (excluding heading) b. Insert > Line Chart (e.g., with markers) c. Add a descriptive chart title, etc. 2. Click on the line in the graph to select the variable. 3. Right-click and choose Add Trendline. 4. Select a trend (e.g., linear). Try several. 5. Make forecasts (if desired). 6. If quarterly or monthly data, calculate seasonal factors (using MegaStat or Minitab). 7. Multiply each numerical forecast by its seasonal factor to get seasonally adjusted forecasts. Detailed examples follow… 14-13 Chapter 14 Trend Forecasting Three Trend Models • Three trend models are especially useful in business applications: • All three models can be fitted by Excel, MegaStat, or MINITAB. Linear Trend Model • The linear trend model has the form yt = a + bt • It is the simplest and may suffice for short-run forecasting or as a baseline model. 14-14 Chapter 14 Trend Forecasting Linear Trend Model Linear Trend Calculations • Linear trend is fitted by using ordinary least squares formulas. • Note: Instead of using the actual time values (e.g., years), use an index xt = 1, 2, 3, …. as the independent variable. 14-15 Chapter 14 Trend Forecasting Linear Trend Calculations these calculations are done by Excel (whew!) Forecasting a Linear Trend • Once the slope and intercept have been calculated, a forecast can be made for any future time period by inserting t = n+1, n+2, n+3, etc into the fitted trend equation. 14-16 Chapter 14 Trend Forecasting Linear Trend: Calculating R2 • R2 can be calculated as • An R2 close to 1.0 would indicate a good fit to the past data. • However, a high R2 does not guarantee a good forecast. Projecting a trend assumes that nothing changes. 14-17 Chapter 14 Trend Forecasting Exponential Trend Model • The exponential trend model has the form yt = aebt. • Useful for a time series that grows or declines at the same rate (b) in each time period. 14-18 Chapter 14 Trend Forecasting When to Use the Exponential Model • The exponential model (yt = aebt) is often used for data that may be assumed to grow at a steady percent growth rate (e.g., financial data). • You can compare growth rates in two time-series variables with dissimilar data units by comparing their b estimates (i.e., the fitted growth rate b is unit-free) • There may not be much difference between a linear and exponential model when the data set covers only a few time periods. • The linear model yt = a + bt and the exponential model yt = aebt are equally simple because they are two-parameter models, and a logtransformed exponential model is actually linear. 14-19 Chapter 14 Trend Forecasting Exponential Trend Calculations Calculations of the exponential trend are done by using a transformed variable zt = ln(yt) to produce a linear equation so that the least squares formulas can be used. Excel does all this. Once the least squares calculations are completed, Excel transforms the intercept back to the original units by exponentiation to get the correct intercept. Caution: You can’t fit an exponential model if any data values are zero or negative. 14-20 Chapter 14 Trend Forecasting Quadratic Trend • A quadratic trend model has the form yt = a + bt + ct2 • If c = 0, then the quadratic model becomes a linear model (i.e., the linear model is a special case of the quadratic model). yt = a + bt + ct2 • Fitting a quadratic model is a way of checking for nonlinearity. If c does not differ significantly from zero, then the linear model would suffice. Note: A quadratic equation is unfamiliar to many, and has no simple interpretation. Use it only when your data has a peak or trough and no other model suffices . 14-21 Chapter 14 Trend Forecasting Quadratic Trend Depending on the values of b and c, the quadratic model can assume any of four shapes: Note: Use the quadratic only for short term forecasts when no other model suffices . 14-22 Which Trend Model? … or maybe none of the above will give reasonable forecasts 14-23 Chapter 14 Trend Forecasting Chapter 14 Trend Forecasting Five Measures of Fit We usually refer to R2 because it is familiar. “Fit” refers to how well the estimated trend model matches the observed historical past data. We usually look at R2 because it is familiar. 14-24 Chapter 14 Trend Forecasting Example: Revenue of Amazon.com Inc (AMZN) • Eyeball the data – see anything unusual? • Make a nice graph. • Fit several trend models using Excel. Quarter t yt Quarter t yt 03/31/2004 06/30/2004 09/30/2004 12/31/2004 03/31/2005 06/30/2005 09/30/2005 12/31/2005 03/31/2006 06/30/2006 09/30/2006 12/31/2006 03/31/2007 06/30/2007 09/30/2007 12/31/2007 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 1530 1387 1462 2541 1902 1753 1858 2977 2279 2139 2307 3986 3015 2886 3262 5672 03/31/2008 06/30/2008 09/30/2008 12/31/2008 03/31/2009 06/30/2009 09/30/2009 12/31/2009 03/31/2010 06/30/2010 09/30/2010 12/31/2010 03/31/2011 06/30/2011 09/30/2011 12/31/2011 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 4135 4063 4264 6704 4889 4651 5449 9520 7131 6566 7560 12947 Revenue is in $millions (e.g., first data value is 1.53 billion) Objective: Fill in these 4 boxes 14-25 Chapter 14 Trend Forecasting Excel’s “Polynomial Order 2” is a “Quadratic” trend Example: Revenue of Amazon.com Inc (AMZN) Make nice graph, then click on the data series Be sure to click these 2 boxes 14-26 Chapter 14 Example: Amazon Revenue 14000 Net Sales (millions) 12000 y = 284.58x + 117.77 R² = 0.7396 10000 8000 6000 4000 2000 0 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 Note: Excel will show forecasts on the graph but no numbers are given. 14-27 Chapter 14 Example: Amazon Revenue Fitted Trends: Amazon Revenue If necessary, format the fitted trend label (right-click it) to show more decimals. Moving average (not really a trend model) 14-28 Chapter 14 Example: Amazon Revenue 14000 Net Sales (millions) 12000 y = 284.58x + 117.77 R² = 0.7396 10000 8000 6000 4000 2000 0 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 Interpretation: growing at $284.58 million per quarter, 74% of variation explained by linear trend model, forecasts seem low? Interpretation: complex nonlinear equation, 82% of variation explained by quadratic trend 14000 12000 Net Sales (millions) Net Sales (millions) 12000 14000 y = 12.752x2 - 85.227x + 1966.8 R² = 0.8168 10000 8000 6000 4000 2000 Interpretation: growing 6.86% per quarter, 89% of variation explained by exponential trend, believable forecasts y = 1316.5e0.0686x R² = 0.8861 10000 8000 6000 4000 2000 0 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 0 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 14-29 Chapter 14 Example: Amazon Revenue 14000 Net Sales (millions) 12000 y = 284.58x + 117.77 R² = 0.7396 Excel formula for t = 32 forecast: =284.58*32 + 117.77 10000 8000 Excel formula for t = 32 forecast: =12.75*32^2 - 85.227*32 + 1966.8 6000 4000 Excel formula for t = 32 forecast: =1316.5*EXP(.0686*32) 2000 0 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 14000 12000 Net Sales (millions) Net Sales (millions) 12000 14000 y = 12.752x2 - 85.227x + 1966.8 R² = 0.8168 10000 8000 6000 4000 2000 y = 1316.5e0.0686x R² = 0.8861 10000 8000 6000 4000 2000 0 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 0 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 14-30 Chapter 14 Example: Amazon Revenue Unadjusted Forecasts t 29 30 31 32 Linear Quad Expon 8370 8655 8939 9224 10220 10887 11579 12298 9625 10309 11041 11825 Note: To make these forecasts, the formulas from fitted trends were entered into cells beside the time index t = 29, 30, 31, 32 (as shown below). Unadjusted Forecasts t 29 30 31 32 Linear =117.11+284.58*J11 =117.11+284.58*J12 =117.11+284.58*J13 =117.11+284.58*J14 Quad =1966.8-85.227*J11+12.752*J11^2 =1966.8-85.227*J12+12.752*J12^2 =1966.8-85.227*J13+12.752*J13^2 =1966.8-85.227*J14+12.752*J14^2 Expon =1316.5*EXP(0.0686*J11) =1316.5*EXP(0.0686*J12) =1316.5*EXP(0.0686*J13) =1316.5*EXP(0.0686*J14) Note: In this example, the time index t = 29, 30, 31, 32 is in cells J11, J12, J13, J14 14-31 Chapter 14 Example: Amazon Revenue to make future forecasts, insert formula in each cell for all three fitted models using time index in column A (or wherever it is) Note: The year and quarter are just labels – they are not used in any of the calculations. 14-32 Chapter 14 Example: Amazon Revenue Unadjusted Forecasts t 29 30 31 32 Linear Quad Expon 8370 8655 8939 9224 10220 10887 11579 12298 9625 10309 11041 11825 Comment: The linear forecasts are much more conservative than the other two trend models. Quadratic forecasts are the most aggressive, though only slightly more than the exponential forecasts. Comment: These are quarterly data, so now we should adjust the forecasts for seasonality. 14-33 Four Trend-Fitting Criteria Criteria for selecting a trend forecasting model: Criterion • Occam’s Razor • Overall fit • Believability • Fit to recent data. 14-34 Ask Yourself Would a simpler model suffice? How does the trend fit the past data? Does the extrapolated trend “look right”? Does the fitted trend match the last few data points? Chapter 14 Trend Forecasting Chapter 14 Forecasting with Seasonality ML 3.3 When and How to Deseasonalize • When the data periodicity is monthly or quarterly, calculate a seasonal index and use it to deseasonalize the data. • For the multiplicative model, a seasonal index is a ratio. • The seasonal indexes must sum to 12 for monthly data or to 4 for quarterly data. • In a multiplicative model, seasonal indexes near 1.00 suggest a lack of seasonality: Y=TxSxI if S = 1.00 then S disappears 14-35 Chapter 14 Seasonality When and How to Deseasonalize Step 1: Calculate a centered moving average (CMA) for each month (quarter). Step 2: Divide each observed yt value by the MA to obtain seasonal ratios. Step 3: Average the seasonal ratios by the month (quarter) to get raw seasonal indexes. Step 4: Adjust the raw seasonal indexes so they sum to 12 (monthly) or 4 (quarterly). Step 5: Divide each yt by its seasonal index to get deseasonalized data. Note: We rely on MegaStat or another computer package for these complex calculations. 14-36 Chapter 14 Seasonality MegaStat Menus to label the data by year and quarter 14-37 Chapter 14 Seasonality MegaStat’s Seasonal Analysis t 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 Year 2004 2004 2004 2004 2005 2005 2005 2005 2006 2006 2006 2006 2007 2007 2007 2007 2008 2008 2008 2008 2009 2009 2009 2009 2010 2010 2010 2010 Net Quarter sales 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 Centered Moving (millions) Average 1,530 1,387 1,462 1776.737 2,541 1868.901 1,902 1964.049 1,753 2067.995 1,858 2169.625 2,977 2265.000 2,279 2369.375 2,139 2551.625 2,307 2769.750 3,986 2955.125 3,015 3167.875 2,886 3498.000 3,262 3848.750 5,672 4135.875 4,135 4408.250 4,063 4662.500 4,264 4885.750 6,704 5053.500 4,889 5275.125 4,651 5775.250 5,449 6407.500 9,520 6927.125 7,131 7430.375 6,566 8122.625 7,560 12,947 Ratio to Seasonal Net sales (millions) CMA Indexes Deseasonalized 0.956 1,601.1 0.837 1,657.8 0.823 0.851 1,717.7 1.360 1.356 1,874.0 0.968 0.956 1,989.9 0.848 0.837 2,094.8 0.856 0.851 2,182.3 1.314 1.356 2,195.6 0.962 0.956 2,384.3 0.838 0.837 2,556.0 0.833 0.851 2,709.6 1.349 1.356 2,939.7 0.952 0.956 3,154.3 0.825 0.837 3,448.7 0.848 0.851 3,831.3 1.371 1.356 4,183.2 0.938 0.956 4,326.1 0.871 0.837 4,855.2 0.873 0.851 5,008.2 1.327 1.356 4,944.3 0.927 0.956 5,114.9 0.805 0.837 5,557.8 0.850 0.851 6,400.0 1.374 1.356 7,021.1 0.960 0.956 7,460.5 0.808 0.837 7,846.2 0.851 8,879.4 1.356 9,548.5 14-38 Chapter 14 Seasonality MegaStat’s Seasonal Indexes 2004 2005 2006 2007 2008 2009 2010 mean: adjusted: Calculation of Seasonal Indexes 1 2 3 0.823 0.968 0.848 0.856 0.962 0.838 0.833 0.952 0.825 0.848 0.938 0.871 0.873 0.927 0.805 0.850 0.960 0.808 0.951 0.833 0.847 0.956 0.837 0.851 4 1.360 1.314 1.349 1.371 1.327 1.374 1.349 1.356 3.980 4.000 <--- adjusted so they add to 4.000 14-39 Chapter 14 Seasonality MegaStat’s Graph Note: MegaStat’s graph does not show any forecasts (only the deseasonalized time series So … we have to make our own numerical forecasts 14-40 Chapter 14 Seasonality Seasonal Adjustment Unadjusted Forecasts t 29 30 31 32 Now, multiply each trend forecast by its quarterly seasonal factor Linear Quad Expon 8370 8655 8939 9224 10220 10887 11579 12298 9625 10309 11041 11825 Seasonally Adjusted Forecasts t Linear 29 30 31 32 Quad 8018 7215 7607 12520 Expon 9790 9076 9854 16693 9221 8594 9395 16051 MegaStat Seasonals 0.956 Qtr 1 0.837 Qtr 2 0.851 Qtr 3 1.356 Qtr 4 14-41 Chapter 14 Seasonality Compare Forecasts and Choose One 14-42 ML 3.4 Chapter Contents 14.1 14.2 14.3 14.4 14.5 14.6 14.7 14.8 14-43 Time-Series Components Trend Forecasting Assessing Fit Moving Averages Exponential Smoothing Seasonality Index Numbers Forecasting: Final Thoughts … when no trend model works Chapter 14 More Time Series Methods Chapter 14 Moving Averages Trendless or Erratic Data • In cases where the time series y1, y2, …, yn is erratic or has no consistent trend, there may be little point in fitting a trend line. • A simple approach is to calculate either a trailing or centered moving average. Trailing Moving Average (TMA) • The TMA simply averages the data over the last m periods. • The TMA smoothes the past fluctuations in the time series in order to see the pattern more clearly. • Choosing a larger m yields a “smoother” TMA but requires more data. Note: Excel uses the TMA method. 14-44 Chapter 14 Moving Averages Centered Moving Average (CMA) • The CMA smoothing method calculates the mean of the current observation and observations on either side of the current data. For example, for m = 3: • • When m is odd (m = 3, 5, etc.), the CMA is easy to calculate. When m is even, the mean would lie between two data points and would not be correctly centered, so we would take a double moving average. Caution: Excel does not offer the CMA method (only TMA). 14-45 Chapter 14 Exponential Smoothing Forecast Updating • The exponential smoothing model is a special kind of moving average. • This one-period-ahead forecasting technique is utilized for data that have up-and-down movements but no consistent trend. • The updating formula is where 14-46 Chapter 14 Exponential Smoothing Smoothing Constant () • • • • • The forecast Ft+1 is a weighted average of yt (the current data) and Ft (the previous forecast). The value of (the smoothing constant) is the weight given to the latest data. A small value of would give low weight to the most recent observation. A large value of would give heavy weight to the previous forecast. The larger the value of , the more quickly the forecasts adapt to recent data. Choosing the Value of • If = 1, there is no smoothing at all and the forecast for the next period is the same as the latest data point. 14-47 Chapter 14 Exponential Smoothing Initializing the Process • • • • • Where do we get the initial forecast F1 (i.e., how do we initialize the process)? Method A Use the first data value. Set F1 = y1 Although simple, if y1 is unusual, it could take a few iterations for the forecasts to stabilize. Method B Average the first six data values. Set F1 = (y1 + y2 + y3 + y4 + y5 + y6)/6 This method consumes more data and is still somewhat vulnerable to unusual y-values. 14-48 Chapter 14 Exponential Smoothing Effect of Past Data • • Note that Ft 1 depends on Ft, which in turn depends on Ft 1, and so on all the way back to F1. The effect of past data diminishes as time increases. • To see this, replace Ft with Ft 1 and repeat this type of substitution indefinitely to obtain • The coefficients diminish so older yt values have less effect on the current forecast. 14-49 Chapter 14 Exponential Smoothing Example from LearningStats: 14-50 Chapter 14 Advanced Methods • The Box-Jenkins method uses several different types of timeseries modeling techniques that fall into a class called ARIMA (Autoregressive Integrated Moving Average) models • AR (autoregressive) models take advantage of the dependency that may exist between values in the time series. • MA (moving average) models take advantage of the dependency that may exist between errors in the forecasts. • Although they are more powerful and general than trend models, these methods require sophisticated software and additional training. Excel and MegaStat do not offer these methods. 14-51 Assignments ML 3.5 • Project P-1 (data, tasks, questions) • • • • Review instructions Look at the data Your task is to write a nice, readable report (not a spreadsheet) Length is up to you • Project P-2 (data, tasks, questions) • • • • 0-52 Review instructions Look at the data Your task is to write a nice, readable report (not a spreadsheet) Length is up to you General Instructions For each team project, submit a short (5-10 page) report (using Microsoft Word or equivalent) that answers the questions posed. Strive for effective writing (see textbook Appendix I). Creativity and initiative will be rewarded. Avoid careless spelling and grammar. Paste graphs and computer tables or output into your written report (it may be easier to format tables in Excel and then use Paste Special > Picture to avoid weird formatting and permit sizing within Word). Allocate tasks among team members as you see fit, but all should review and proofread the report (submit only one report). 0-53 Chapter 0 Projects: General Instructions Project P-2 You will be assigned a company and team members (see Moodle). Delegate tasks and collaborate as seems appropriate, based on your various skills. Submit one report. Data: Download your company’s quarterly revenue data 2006-2012 from Moodle or from Doane’s teaching web page. Analysis: (a) Briefly describe the company’s history, products, services, competition, and market conditions (e.g., Yahoo or Mergent or Google). (b) Fit several trends (e.g., linear, quadratic, exponential) using Excel. (c) Interpret each fitted trend equation. Discuss and compare the R2 statistics. (d) Forecast the next 4 quarters (t = 29, 30, 31, 32) based on trend alone using each fitted trend model (i.e., plug in the time index for periods n+1, n+2, n+3, n+4). (e) Use MegaStat (or Minitab) to calculate quarterly seasonal factors. Is there noticeable seasonality? (f) Ambitious students: Multiply each quarterly trend forecast by its seasonal factor. Discuss the effect of the seasonal adjustment. (g) Using the four criteria for assessing forecasts (see p. 614), which trend model (if any) would yield credible forecasts? If none, then what? 0-54 Project P-2 A Few Examples These are from Doane’s teaching web page (only the “cleaned” files were posted to Moodle). You can look up your company’s stock ticker symbol on the internet (e.g., EMC) Note: Use the “cleaned”: spreadsheets for your forecasting project. The others are the “raw data” from Mergent Online. 0-55 Project P-2 Note: Use only the “cleaned”: spreadsheets for your P-2 project. The others are “raw data” from Mergent Online. 0-56 Project P-2 Hint: Watch the instructor’s video walkthrough using Amazon’s revenue as an example (posted on Moodle) 0-57