Chapter 3
İŞL 276
Exploring Data Patterns & Choosing a Forecasting Technique
Fall 2014
- Collection of valid and reliable data is the most time-consuming and difficult part of forecasting.
- The difficult task facing most forecasters is how to find relevant data that will help solve their specific decision-making problems.
- GIGO: garbage in, garbage out.
- The following four criteria can be applied to determine whether data will be useful:
  1. Data should be reliable and accurate.
  2. Data should be relevant.
  3. Data should be consistent.
  4. Data should be timely.
There Are Two Types of Data:
- Cross-sectional data: observations collected at a single point in time.
- Time series data: observations collected over successive increments of time.
Exploring Time Series Data Patterns:
An important aspect of selecting an appropriate forecasting method for time series data is to consider the following types of data patterns:

1. Horizontal Pattern
- The observations fluctuate around a constant level or mean.
- This type of series is called stationary in its mean.
2. Trend Pattern
- The observations grow or decline over an extended period of time.
- This type of series is called nonstationary.
- The trend is the long-term component that represents the growth or decline in the time series over an extended period of time.
3. Cyclical Pattern
- The observations exhibit rises and falls that are not of a fixed period.
- The cyclical component is the wave-like fluctuation around the trend.
- Cyclical fluctuations are often influenced by changes in economic expansions and contractions (the business cycle).
4. Seasonal Pattern
- The observations are influenced by seasonal factors.
- The seasonal component refers to a pattern of change that repeats itself year after year.
- In a monthly series the seasonal component measures the variability of the series each month; in a quarterly series it represents the variability in each quarter, and so on.
Autocorrelation Analysis
- Autocorrelation analysis for different time lags of a variable is used to identify time series data patterns, including components such as trend and seasonality.
- Autocorrelation is the correlation between a variable lagged one or more periods and itself.
- It is measured by the autocorrelation coefficient at lag k, denoted ρ_k, which is estimated by the sample autocorrelation coefficient r_k at lag k; k = 0, 1, 2, …
where

  r_k = Σ_{t=k+1}^{n} (Y_t − Ȳ)(Y_{t−k} − Ȳ) / Σ_{t=1}^{n} (Y_t − Ȳ)² ;  k = 0, 1, 2, …
- Y_t and Y_{t−k} are the observations at time periods t and t − k, respectively, and Ȳ is the mean of the series.
- Autocorrelation analysis can be carried out by examining the correlogram, or autocorrelation function.
- The correlogram (autocorrelation function) is a graph of the autocorrelations for various lags of a time series.
Example 3.1
Harry Vernon has collected data on the number of VCRs sold last year at Vernon's Music Store. We need the lag 1 and lag 2 autocorrelation coefficients (r1 and r2).
Month       # of VCRs
January         123
February        130
March           125
April           138
May             145
June            142
July            141
August          146
September       147
October         157
November        150
December        160
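As a sketch (plain Python, no libraries assumed), the sample autocorrelation formula above can be applied directly to these 12 observations:

```python
def autocorr(y, k):
    """Sample autocorrelation coefficient r_k at lag k."""
    n = len(y)
    ybar = sum(y) / n
    # numerator: sum over t = k+1..n (0-based: t = k..n-1)
    num = sum((y[t] - ybar) * (y[t - k] - ybar) for t in range(k, n))
    # denominator: total sum of squared deviations
    den = sum((yt - ybar) ** 2 for yt in y)
    return num / den

vcrs = [123, 130, 125, 138, 145, 142, 141, 146, 147, 157, 150, 160]

r1 = autocorr(vcrs, 1)
r2 = autocorr(vcrs, 2)
print(round(r1, 3), round(r2, 3))  # 0.572 0.463
```

These match the values r1 = .572 and r2 = .463 used later in Example 3.2.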
Solution:
[The worked lag 1 and lag 2 calculation tables and the MINITAB correlogram output shown on the slides give r1 = .572 and r2 = .463.]
- The correlogram display can be used to study data patterns, including trend and seasonality.
- Autocorrelation coefficients at different time lags of a variable can be used to answer the following questions about a time series:
  1. Are the data random?
  2. Do the data have a trend?
  3. Are the data stationary?
  4. Are the data seasonal?
- If a series is random, the autocorrelations between Y_t and Y_{t−k} for any lag k are close to zero: the successive values of the time series are not related to each other.
- If a series has a trend, successive observations are highly correlated, and the autocorrelation coefficients are typically significantly different from zero for the first several time lags and then gradually drop toward zero as the number of lags increases.
- The autocorrelation coefficient for time lag 1 is often very large (close to 1).
- The autocorrelation coefficient for time lag 2 will also be large, though not as large as for time lag 1.
- If a series has a seasonal pattern, a significant autocorrelation coefficient will occur at the seasonal time lag or at multiples of the seasonal lag.
- The seasonal lag is 4 for quarterly data and 12 for monthly data.
- How does an analyst determine whether an autocorrelation coefficient is significantly different from zero?
- Statisticians have shown that the sampling distribution of the sample autocorrelation coefficient r_1 is approximately normal with mean zero and approximate standard deviation 1/√n.
- Knowing this, we can compare the sample autocorrelation coefficients with this theoretical sampling distribution and determine whether, for given time lags, they come from a population whose mean is zero.
Checking the Significance of the Autocorrelation Coefficients
- In this section we determine whether the autocorrelation coefficient ρ_k at lag k (k = 0, 1, 2, …) is different from zero for a given time series data set.
- The sampling distribution of the sample autocorrelation coefficient r_k at lag k (k = 2, 3, …) is approximately normal with mean zero and approximate standard deviation SE(r_k), given by:
  SE(r_k) = √( (1 + 2 Σ_{i=1}^{k−1} r_i²) / n ) ;  k = 2, 3, …

and, for lag 1,

  SE(r_1) = √(1/n)
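A minimal sketch of this standard-error computation in Python; the r values below are the lag 1 and lag 2 coefficients from Example 3.1 (n = 12), so the outputs are SE(r_1) = .289 and SE(r_2) = .371:

```python
import math

def se_rk(r, k, n):
    """Approximate standard error of the sample autocorrelation at lag k.

    r is the list [r_1, r_2, ...] of sample autocorrelations; for k = 1
    the sum below is empty and the formula reduces to SE(r_1) = sqrt(1/n).
    """
    s = sum(r[i] ** 2 for i in range(k - 1))  # r_1^2 + ... + r_{k-1}^2
    return math.sqrt((1 + 2 * s) / n)

r = [0.572, 0.463]  # r_1, r_2 for the VCR data of Example 3.1
se1 = se_rk(r, 1, 12)
se2 = se_rk(r, 2, 12)
print(round(se1, 3), round(se2, 3))  # 0.289 0.371
```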
1. Testing for randomness
- At significance level α, the time series data are random if, for every lag k,

  −t_{α/2, n−1} · SE(r_k) ≤ r_k ≤ +t_{α/2, n−1} · SE(r_k)

2. Testing an individual ρ_k
- Using significance level α, for k = 1, 2, … we test

  H0: ρ_k = 0
  H1: ρ_k ≠ 0

  using the test statistic

  t = r_k / SE(r_k)

  and reject H0 if |t| ≥ t_{α/2, n−1} or if the P-value < α.
3. Testing a subset of ρ_k ; k = 1, 2, …, m
- We use one of the common portmanteau tests, the modified Box-Pierce (Ljung-Box) statistic:

  Q = n(n + 2) Σ_{k=1}^{m} r_k² / (n − k)

- We reject the hypothesis that all m autocorrelations in the subset are zero if Q > χ²_{α,m} or if the p-value < α.
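As a sketch, the Q statistic can be computed in a few lines; here it is applied, purely for illustration, to the first two autocorrelations of the VCR data from Example 3.1:

```python
def ljung_box_q(r, n, m):
    """Modified Box-Pierce (Ljung-Box) statistic for the first m autocorrelations.

    r is the list [r_1, ..., r_m] of sample autocorrelations.
    """
    return n * (n + 2) * sum(r[k - 1] ** 2 / (n - k) for k in range(1, m + 1))

# r_1, r_2 for the VCR data of Example 3.1 (n = 12)
q = ljung_box_q([0.572, 0.463], n=12, m=2)
print(round(q, 1))  # 8.6 -- compared with the chi-square critical value x^2_{alpha, m}
```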
Exploring Time Series Data Types
The autocorrelation coefficients are used to determine whether the time series data are:

1. Random data
- The time series is random (independent) if the autocorrelations between Y_t and Y_{t−k} for any lag k are close to zero.
- This implies that successive data values are not related to each other.
- We can construct a confidence interval to check that almost all sample autocorrelations lie within a range centered at zero.
- For example, at the 5% significance level, the time series data are random if about 95% of the sample autocorrelations lie within

  −2.2 · SE(r_k) ≤ r_k ≤ +2.2 · SE(r_k)  for all k = 1, 2, 3, …

- It is also possible to use the Q statistic to test whether a subset of autocorrelations is zero.
- For example, at the 5% significance level, the time series data are random if Q for a subset of 10 autocorrelations is less than χ²_{0.05,10} = 18.31.
Example 3.2
A hypothesis test is developed to determine whether a particular autocorrelation coefficient is significantly different from zero for the correlogram figure. The null and alternative hypotheses for testing the significance of the lag 1 population autocorrelation coefficient are

  H0: ρ_1 = 0
  H1: ρ_1 ≠ 0
If the null hypothesis is true, the test statistic

  t = (r_1 − ρ_1) / SE(r_1) = (r_1 − 0) / SE(r_1) = r_1 / SE(r_1)

has a t distribution with df = n − 1 = 12 − 1 = 11, so for a 5% significance level the decision rule is:

Decision Rule: if t < −2.2 or t > 2.2, reject H0 and conclude that the lag 1 autocorrelation is significantly different from 0.

The critical values ±2.2 are the upper and lower .025 points of a t distribution with 11 df. The standard error of r_1 is SE(r_1) = √(1/12) = .289, and the value of the test statistic becomes

  t = (r_1 − ρ_1) / SE(r_1) = .572 / .289 = 1.98

H0: ρ_1 = 0 cannot be rejected because −2.2 < 1.98 < 2.2.
Now for lag 2:

  H0: ρ_2 = 0
  H1: ρ_2 ≠ 0
  t = r_2 / SE(r_2)

Decision Rule: if t < −2.2 or t > 2.2, reject H0 and conclude that the lag 2 autocorrelation is significantly different from 0 (same critical values ±2.2 and 11 df).

The standard error of r_2 is

  SE(r_2) = √( (1 + 2 r_1²) / n ) = √( (1 + 2(.572)²) / 12 ) = .371

The value of the test statistic becomes

  t = r_2 / SE(r_2) = .463 / .371 = 1.25

H0: ρ_2 = 0 cannot be rejected because −2.2 < 1.25 < 2.2.
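The two t tests of Example 3.2 can be reproduced with a short script (a sketch; the values .572, .463, and n = 12 come from Example 3.1):

```python
import math

n = 12
r1, r2 = 0.572, 0.463

se1 = math.sqrt(1 / n)                  # SE(r_1) = .289
t1 = r1 / se1                           # lag 1 test statistic

se2 = math.sqrt((1 + 2 * r1 ** 2) / n)  # SE(r_2) = .371
t2 = r2 / se2                           # lag 2 test statistic

# Critical value t_{.025, 11} = 2.2: neither lag is significant.
print(round(t1, 2), round(t2, 2))  # 1.98 1.25
```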
- An alternative way to check for significant autocorrelation is to construct, say, 95% confidence limits centered at 0.
- For lags 1 and 2 these limits are 0 ± 2.2 · SE(r_1) and 0 ± 2.2 · SE(r_2).
- Autocorrelation significantly different from 0 is indicated whenever a value of r_k falls outside the corresponding confidence limits.
- The 95% confidence limits are shown in the correlogram by the dashed lines in the graphical display of the autocorrelation function.
2. Stationary and nonstationary data
- The time series is stationary if the observations fluctuate around a constant level or mean.
- For a stationary series, the sample autocorrelation coefficients decline to zero fairly rapidly, generally after the second or third time lag.
- The time series is nonstationary (has a trend) if successive observations are highly correlated.
- For a nonstationary series, the autocorrelation coefficients are significantly different from zero for the first several time lags and then gradually drop toward zero as the number of lags increases.
- For nonstationary data to be analyzed, the trend must be removed from the data before modeling.
- One possible technique for removing the trend is the differencing method.
- Differencing the data at order 1, ΔY_t = Y_t − Y_{t−1}, may remove the trend so that the time series becomes stationary.
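First differencing can be sketched in one line; with a purely linear trend (hypothetical data), the differenced series is constant, i.e. stationary:

```python
def difference(y):
    """First differences: delta_t = y_t - y_{t-1}."""
    return [y[t] - y[t - 1] for t in range(1, len(y))]

trended = [10, 13, 16, 19, 22, 25]  # steady upward trend
print(difference(trended))          # [3, 3, 3, 3, 3]
```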
Example 3.4
An analyst for the Sears company is assigned the task of forecasting operating revenue for 2001. She gathers the data for the years 1955 to 2000, as shown in the table below.
The data are plotted as a time series in the figure. A 95% confidence interval for the autocorrelation coefficient at time lag 1 uses 0 ± Z.025 · √(1/46), since there are n = 46 observations. The autocorrelations at the first several time lags are significantly different from zero (.96, .92, and .87), and the values then gradually drop toward zero.
The data series was differenced using MINITAB to remove the trend and create a stationary series. The differenced series shows no evidence of a trend. One can notice that the autocorrelation coefficient at time lag 3 (0.32) is significantly different from zero, while the autocorrelations at lags other than lag 3 are small.
3. Seasonal data
- The time series is seasonal if a significant autocorrelation coefficient occurs at the seasonal time lag or at multiples of the seasonal lag.
- For example, for quarterly seasonal data a significant autocorrelation coefficient will appear at lag 4; for monthly seasonal data, at lag 12; and so forth.
- For the time series data to be analyzed, the seasonal component should be removed from the data before modeling.
- Different techniques that help remove the seasonal components will be studied later.
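As an illustration (synthetic quarterly data, not a series from the text), a strongly seasonal quarterly series produces a large autocorrelation at lag 4 and a small one at lag 1:

```python
def autocorr(y, k):
    """Sample autocorrelation coefficient r_k at lag k."""
    n = len(y)
    ybar = sum(y) / n
    num = sum((y[t] - ybar) * (y[t - k] - ybar) for t in range(k, n))
    den = sum((yt - ybar) ** 2 for yt in y)
    return num / den

# Six years of hypothetical quarterly sales repeating the same seasonal shape
quarterly = [10, 20, 30, 40] * 6

r1 = autocorr(quarterly, 1)
r4 = autocorr(quarterly, 4)
print(round(r1, 3), round(r4, 3))  # -0.125 0.833 -- the seasonal lag 4 dominates
```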
Example 3.5
An analyst for Outboard Marine Corporation has always felt that sales were seasonal. He gathers the data shown in the table below for the quarterly sales of Outboard Marine Corporation from 1984 to 1996 and plots them as a time series graph.
By observing the time series plot, he notices a seasonal pattern. He computes the autocorrelation coefficients and notes that those at time lags 1 and 4 are significantly different from zero. He concludes that Outboard Marine sales are seasonal on a quarterly basis.
Choosing a Forecasting Technique

1. For Stationary Time Series Data
The suggested methods are as follows:
- Naïve Methods
- Simple Averaging Methods
- Moving Averages
- Autoregressive Moving Average (ARMA)
- Box-Jenkins

These methods are suggested whenever the following conditions exist:
- The forces generating the series have stabilized, and the environment in which the series exists is relatively unchanging.
- A very simple model is needed because of a lack of data or for ease of explanation or implementation.
- Stability may be obtained by making simple corrections for factors such as population growth or inflation.
- The series may be transformed into a stable one using logarithms, square roots, or differences.
- The series is a set of forecast errors from a forecasting technique that is considered adequate.
Examples:
- The unit sales of a product or service in the maturation stage of its life cycle.
- The number of sales resulting from a constant level of effort.
- The number of breakdowns per week on an assembly line with a uniform production rate.
- Income expressed as per capita income.
2. For Nonstationary (Trended) Time Series Data
The suggested methods are as follows:
- Moving Averages
- Holt's Linear Exponential Smoothing
- Simple Regression
- Growth Curves
- Exponential Models
- Autoregressive Integrated Moving Average (ARIMA)
- Box-Jenkins
These methods are suggested whenever the following conditions exist:
- Increased productivity and new technology lead to changes in lifestyle.
- An increasing population causes increases in demand for goods and services.
- Inflation affects the purchasing power of the dollar and hence economic variables.
- Market acceptance increases.
Examples:
- Demand for electronic components (increased with the use of computers).
- Railroad usage (decreased with the use of airplanes).
- Sales revenues of consumer goods.
- Salaries, production costs, and prices.
- The growth period in the life cycle of a new product.
3. For Seasonal Time Series Data
The suggested methods are as follows:
- Classical Decomposition
- Census X-12
- Winters' Exponential Smoothing
- Multiple Regression
- Autoregressive Integrated Moving Average (ARIMA)
- Box-Jenkins

These methods are suggested whenever the following conditions exist:
- Weather influences the variable of interest.
- The annual calendar influences the variable of interest.
Examples:
- Electricity consumption.
- Summer and winter activities (e.g., sports such as skiing).
- Clothing sales.
- Agricultural growing seasons.
- Retail sales influenced by holidays, three-day weekends, and the school calendar.
4. For Cyclical Time Series Data
The suggested methods are as follows:
- Classical Decomposition
- Economic Indicators
- Econometric Models
- Multiple Regression
- Autoregressive Integrated Moving Average (ARIMA)
- Box-Jenkins
These methods are suggested whenever the following conditions exist:
- The business cycle influences the variable of interest.
- Shifts in popular tastes occur.
- Shifts in population occur.
- Shifts in the product life cycle occur.

Examples:
- Economic, market, and competitive factors.
- Fashions, music, food.
- Wars, famines, epidemics, natural disasters.
- Introduction, growth, decline.
- Maturation and market saturation.
Remark
- The time horizon (short, intermediate, or long term) of a forecast is important in the selection of a forecasting technique.
- For short- and intermediate-term forecasts, a variety of quantitative techniques can be applied.
- As the forecasting horizon increases, a number of these techniques become less applicable.

The following table shows when to choose the appropriate forecasting technique.
Measuring Forecasting Error
- Let Y_t be the actual value of a time series at time t and Ŷ_t the forecast value at time t, where t = 1, 2, 3, …, n.
- The difference between the actual value and its forecast value is called the residual, or forecast error, and is usually denoted by e_t:

  e_t = Y_t − Ŷ_t
Remarks on Empirical Evaluation of Forecasting Methods
- Statistically sophisticated or complex methods do not necessarily produce more accurate forecasts than simpler methods.
- The various accuracy measures (MAD, MSE, MAPE, and MPE) produce consistent results when used to evaluate different forecasting methods.
- Combining the forecasts of the three smoothing methods does well, on average, in comparison with other methods.
- The performance of the various forecasting methods depends on the length of the forecasting horizon and the kind of data analyzed.
Types of Forecast Accuracy Measures

1. Mean Absolute Deviation (MAD)
It is useful when the analyst wants to measure forecast error in the same units as the original series.

  MAD = (1/n) Σ_{t=1}^{n} |Y_t − Ŷ_t|

2. Mean Squared Error (MSE)
It is useful because it penalizes large forecast errors: a technique that produces moderate errors is preferable to one that usually has small errors but occasionally yields extremely large ones.

  MSE = (1/n) Σ_{t=1}^{n} (Y_t − Ŷ_t)²
3. Mean Absolute Percentage Error (MAPE)
It is useful when the size or magnitude of the forecast variable is important in evaluating the accuracy of the forecast, and when the actual values of the series are large.
MAPE provides an indication of how large the forecast errors are in comparison to the actual values of the series.
It can also be used to compare the accuracy of the same or different techniques on two entirely different series.

  MAPE = (1/n) Σ_{t=1}^{n} |Y_t − Ŷ_t| / Y_t
4. Mean Percentage Error (MPE)
It is useful when the analyst wants to determine whether a forecasting method is biased (consistently forecasting low or high). Therefore:
- If MPE is very close to zero, the forecasting method is unbiased.
- If MPE is a large negative percentage, the forecasting method is consistently overestimating.
- If MPE is a large positive percentage, the forecasting method is consistently underestimating.

  MPE = (1/n) Σ_{t=1}^{n} (Y_t − Ŷ_t) / Y_t
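A minimal sketch computing all four measures for a naïve forecast (each period's forecast is the previous actual value, as in Example 3.6); the data here are the VCR series of Example 3.1, used purely for illustration:

```python
def accuracy(actual, forecast):
    """MAD, MSE, MAPE (%), and MPE (%) for paired actual/forecast values."""
    n = len(actual)
    errors = [y - f for y, f in zip(actual, forecast)]
    mad = sum(abs(e) for e in errors) / n
    mse = sum(e ** 2 for e in errors) / n
    mape = 100 * sum(abs(e) / y for e, y in zip(errors, actual)) / n
    mpe = 100 * sum(e / y for e, y in zip(errors, actual)) / n
    return mad, mse, mape, mpe

vcrs = [123, 130, 125, 138, 145, 142, 141, 146, 147, 157, 150, 160]
actual, naive = vcrs[1:], vcrs[:-1]  # naive forecast: previous period's value

mad, mse, mape, mpe = accuracy(actual, naive)
print(round(mad, 2), round(mse, 2), round(mape, 2), round(mpe, 2))
# 6.27 52.45 4.35 2.26
```

The positive MPE reflects the upward drift in the series: the naïve method consistently underestimates here.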
In general, the above four measures of forecast accuracy are used as follows:
- To compare the accuracy of two or more different techniques.
- To measure the usefulness and reliability of a particular technique.
- To help in the search for an optimal technique.
Determining the Accuracy of a Forecasting Technique
To evaluate the adequacy of a forecasting technique, we should check the following:
- Randomness of the residuals → use the autocorrelation function of the residuals.
- Normality of the residuals → use a histogram or normal probability plot of the residuals.
- Significance of the parameter estimates → use the t test for all parameter estimates.
- Simplicity and understandability of the technique for decision makers.
Example 3.6
The following table shows the data for the daily number of customers requiring repair work, Y_t, and a forecast of these data, Ŷ_t, for Gary's Chevron Station. The forecasting technique used the number of customers serviced in the previous period as the forecast for the current period. This simple technique will be discussed in Chapter 4. The following computations were employed to evaluate this model using MAD, MSE, MAPE, and MPE.
Application to Management
The following are a few examples of situations constantly arising in the business world for which a sound forecasting technique would help the decision-making process:
- A soft drink company wants to project the demand for its major product over the next two years, by month.
- A major telecommunications company wants to forecast the quarterly dividend payments of its chief rival for the next three years.
- A university needs to forecast student credit hours by quarter for the next four years in order to develop budget projections for the state legislature.
- A public accounting firm needs monthly forecasts of dollar billings so it can plan for additional accounting positions and begin recruiting.
- The quality control manager of a factory that makes aluminum ingots needs a weekly forecast of production defects for the company's top management.
- A banker wants to see the projected monthly revenue of a small bicycle manufacturer that is seeking a large loan to triple its output capacity.
- A federal government agency needs annual projections of the average miles per gallon of American-made cars over the next 10 years in order to make regulatory recommendations.
- A personnel manager needs a monthly forecast of absent days for the company workforce in order to plan overtime expenditures.
- A savings and loan company needs a forecast of delinquent loans over the next two years in an attempt to avoid bankruptcy.
- A company that makes computer chips needs an industry forecast for the number of personal computers sold over the next five years in order to plan its research and development budget.