SMA 6304 / MIT 2.853 / MIT 2.854 Manufacturing Systems
Lecture 11: Forecasting
Lecturer: Prof. Duane S. Boning
Copyright 2003 © Duane S. Boning.

Agenda
1. Regression
   • Polynomial regression
   • Example (using Excel)
2. Time Series Data & Regression
   • Autocorrelation – ACF
   • Example: white noise sequences
   • Example: autoregressive sequences
   • Example: moving average
   • ARIMA modeling and regression
3. Forecasting Examples

Regression – Review & Extensions
• Single Model Coefficient: Linear Dependence
• Slope and Intercept (or Offset):
• Polynomial and Higher Order Models:
• Multiple Parameters
• Key point: “linear” regression can be used as long as the model is linear in the coefficients (the form of the dependence on the independent variable does not matter)

Polynomial Regression Example
[Figure: Bivariate Fit of y By x; y (60 to 95) versus x (5 to 40), showing Fit Mean, Linear Fit, and Polynomial Fit Degree=2]
• Replicate data provides opportunity to check for lack of fit

Growth Rate – First Order Model
• Mean significant, but linear term not
• Clear evidence of lack of fit

Growth Rate – Second Order Model
• No evidence of lack of fit
• Quadratic term significant

Polynomial Regression In Excel
• Create an additional input column for each higher-order term of the input (here x^2)
• Use “Data Analysis” and “Regression” tool

   x     x^2     y
  10     100    73
  10     100    78
  15     225    85
  20     400    90
  20     400    91
  25     625    87
  25     625    86
  25     625    91
  30     900    75
  35    1225    65

Regression Statistics
  Multiple R           0.968
  R Square             0.936
  Adjusted R Square    0.918
  Standard Error       2.541
  Observations         10

ANOVA
               df    SS         MS         F         Significance F
  Regression    2    665.706    332.853    51.555    6.48E-05
  Residual      7     45.194      6.456
  Total         9    710.9

               Coefficients   Standard Error   t Stat    P-value    Lower 95%   Upper 95%
  Intercept       35.657          5.618         6.347    0.0004      22.373      48.942
  x                5.263          0.558         9.431    3.1E-05      3.943       6.582
  x^2             -0.128          0.013        -9.966    2.2E-05     -0.158      -0.097

Polynomial Regression
• Generated using JMP

Analysis of Variance
  Source      DF   Sum of Squares   Mean Square   F Ratio    Prob > F
  Model        2     665.70617        332.853     51.5551    <.0001
  Error        7      45.19383          6.456
  C. Total     9     710.90000

Lack Of Fit
  Source         DF   Sum of Squares   Mean Square   F Ratio   Prob > F
  Lack Of Fit     3     18.193829        6.0646      0.8985    0.5157
  Pure Error      4     27.000000        6.7500
  Total Error     7     45.193829
  Max RSq 0.9620

Summary of Fit
  RSquare                       0.936427
  RSquare Adj                   0.918264
  Root Mean Sq Error            2.540917
  Mean of Response              82.1
  Observations (or Sum Wgts)    10

Parameter Estimates
  Term         Estimate     Std Error   t Ratio   Prob>|t|
  Intercept    35.657437    5.617927     6.35     0.0004
  x             5.2628956   0.558022     9.43     <.0001
  x*x          -0.127674    0.012811    -9.97     <.0001

Effect Tests
  Source   Nparm   DF   Sum of Squares   F Ratio    Prob > F
  x          1      1     574.28553      88.9502    <.0001
  x*x        1      1     641.20451      99.3151    <.0001

Agenda
1. Regression
   • Polynomial regression
   • Example (using Excel)
2. Time Series Data & Time Series Regression
   • Autocorrelation – ACF
   • Example: white noise sequences
   • Example: autoregressive sequences
   • Example: moving average
   • ARIMA modeling and regression
3. Forecasting Examples

Time Series – Time as an Implicit Parameter
• Data is often collected with a time-order
• An underlying dynamic process (e.g. due to physics of a manufacturing process) may create autocorrelation in the data
[Figure: x versus time (0 to 50) for an uncorrelated series and for an autocorrelated series]

Intuition: Where Does Autocorrelation Come From?
• Consider a chamber with volume V, and with gas flow in and gas flow out at rate f.
We are interested in the concentration x at the output, in relation to a known input concentration w.

Key Tool: Autocorrelation Function (ACF)
• Time series data: time index i
• CCF: cross-correlation function
• ACF: auto-correlation function
⇒ ACF shows the “similarity” of a signal to a lagged version of same signal
[Figure: x versus time (0 to 100), and ACF r(k) versus lags (0 to 40)]

Stationary vs. Non-Stationary
• Stationary series: Process has a fixed mean
[Figure: two series versus time (0 to 500), one ranging over -10 to 10 and one ranging over -10 to 30]

White Noise – An Uncorrelated Series
• Data drawn from IID Gaussian
• Sample mean
• Sample variance
• ACF: We also plot the 3σ limits – values within these not significant
• Note that r(0) = 1 always (a signal is always equal to itself with zero lag – perfectly autocorrelated at k = 0)
[Figure: x versus time (0 to 200), and ACF r(k) versus lags (0 to 40)]

Autoregressive Disturbances
• Generated by:
• Mean
• Variance
• Slow drop in ACF with large α
So AR (autoregressive) behavior increases variance of signal.
[Figure: x versus time (0 to 500), and ACF r(k) versus lags (0 to 40)]

Another Autoregressive Series
• Generated by:
• High negative autocorrelation:
• Slow drop in ACF with large α
• But now ACF alternates in sign
[Figure: x versus time (0 to 500), and ACF r(k) versus lags (0 to 40)]

Random Walk Disturbances
• Generated by:
• Mean
• Variance
• Very slow drop in ACF for α = 1
[Figure: x versus time (0 to 500), and ACF r(k) versus lags (0 to 40)]
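
As a rough illustration of the ACF behavior described for the white noise, autoregressive, and random walk slides, the sketch below simulates each kind of series and prints r(0) through r(3). It assumes the series are generated by x[i] = alpha * x[i-1] + w[i] with unit-variance Gaussian noise w. That model form, the alpha values (0, 0.9, -0.9, 1), the function names sample_acf and ar1, and the 3/sqrt(n) approximation for the 3σ limits are assumptions made for illustration, not values taken from the slides.

```python
import math
import random


def sample_acf(x, max_lag):
    """Sample autocorrelation r(k) for k = 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares, so r(0) = 1
    return [sum(dev[i] * dev[i + k] for i in range(n - k)) / c0
            for k in range(max_lag + 1)]


def ar1(alpha, n, rng):
    """Assumed model: x[i] = alpha * x[i-1] + w[i], with w ~ N(0, 1)."""
    x, prev = [], 0.0
    for _ in range(n):
        prev = alpha * prev + rng.gauss(0.0, 1.0)
        x.append(prev)
    return x


rng = random.Random(0)  # fixed seed so repeated runs give the same series
n = 500
series = {
    "white noise (alpha = 0)": ar1(0.0, n, rng),
    "AR, alpha = 0.9": ar1(0.9, n, rng),
    "AR, alpha = -0.9": ar1(-0.9, n, rng),
    "random walk (alpha = 1)": ar1(1.0, n, rng),
}

# Large-sample approximation: for white noise, r(k) has std. dev. near 1/sqrt(n).
limit = 3.0 / math.sqrt(n)

for name, x in series.items():
    r = sample_acf(x, 3)
    print(f"{name:26s}", " ".join(f"{v:+.2f}" for v in r),
          f"(3-sigma limit {limit:.2f})")
```

Under these assumptions, the printed r(k) values should fall off slowly for alpha = 0.9, alternate in sign for alpha = -0.9, stay close to 1 for the random walk, and sit inside the 3σ band for white noise at k > 0.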

Moving Average Sequence
• Generated by:
• Mean
• Variance
• r(1) ≈ β
• Jump in ACF at specific lag
So MA (moving average) behavior also increases variance of signal.
[Figure: x versus time (0 to 500), and ACF r(k) versus lags (0 to 40)]

ARMA Sequence
• Generated by:
• Both AR & MA behavior
• Slow drop in ACF with large α
[Figure: x versus time (0 to 500), and ACF r(k) versus lags (0 to 40)]

ARIMA Sequence
• Start with ARMA sequence:
• Add Integrated (I) behavior: random walk (integrative) action
• Slow drop in ACF with large α
[Figure: x versus time (0 to 500), and ACF r(k) versus lags (0 to 40)]

Periodic Signal with Autoregressive Noise
[Figure: two columns, “Original Signal” and “After Differencing”; each shows x versus time (0 to 400) and ACF r(k) versus lags (0 to 40)]
• See underlying signal with period = 5

Agenda
1. Regression
   • Polynomial regression
   • Example (using Excel)
2. Time Series Data & Regression
   • Autocorrelation – ACF
   • Example: white noise sequences
   • Example: autoregressive sequences
   • Example: moving average
   • ARIMA modeling and regression
3. Forecasting Examples

Cross-Correlation: A Leading Indicator
• Now we have two series:
  – An “input” or explanatory variable x
  – An “output” variable y
• CCF indicates both AR and lag:
[Figure: x and y versus time (0 to 500), and CCF rxy(k) versus lags (0 to 40)]

Regression & Time Series Modeling
• The ACF and CCF are helpful tools in selecting an appropriate model structure
  – Autoregressive terms?
    • x_i = α x_{i-1}
  – Lag terms?
    • y_i = γ x_{i-k}
• One can structure data and perform regressions
  – Estimate model coefficient values, significance, and confidence intervals
  – Determine confidence intervals on output
  – Check residuals

Statistical Modeling Summary
1. Statistical Fundamentals
   • Sampling distributions
   • Point and interval estimation
   • Hypothesis testing
2. Regression
   • ANOVA
   • Nominal data: modeling of treatment effects (mean differences)
   • Continuous data: least squares regression
3. Time Series Data & Forecasting
   • Autoregressive, moving average, and integrative behavior
   • Auto- and Cross-correlation functions
   • Regression and time-series modeling
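
The regression workflow from the “Regression & Time Series Modeling” slide (pick a structure from the CCF, structure the data as lagged pairs, estimate coefficients with intervals, check residuals) can be sketched on synthetic data as follows. This is a minimal sketch, not the lecture's own example: the data-generating models x[i] = alpha * x[i-1] + noise and y[i] = gamma * x[i-k] + noise, the values alpha = 0.8, gamma = 2.0, k = 3, the noise levels, and the helper names cross_corr and fit_through_origin are all assumptions chosen for illustration.

```python
import math
import random


def cross_corr(x, y, k):
    """Sample cross-correlation between x[i] and y[i + k] (x leading y by k)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((v - mx) ** 2 for v in x))
    sy = math.sqrt(sum((v - my) ** 2 for v in y))
    return sum((x[i] - mx) * (y[i + k] - my) for i in range(n - k)) / (sx * sy)


def fit_through_origin(u, v):
    """Least-squares slope for v = g * u, plus its standard error and residuals."""
    suu = sum(a * a for a in u)
    g = sum(a * b for a, b in zip(u, v)) / suu
    resid = [b - g * a for a, b in zip(u, v)]
    s2 = sum(e * e for e in resid) / (len(u) - 1)  # one fitted coefficient
    return g, math.sqrt(s2 / suu), resid


rng = random.Random(1)  # fixed seed so repeated runs give the same data
n, alpha, gamma, lag = 500, 0.8, 2.0, 3  # illustrative values, not from the lecture

# Synthetic input x (autoregressive) and output y (lagged response to x).
x, prev = [], 0.0
for _ in range(n):
    prev = alpha * prev + rng.gauss(0.0, 1.0)
    x.append(prev)
y = [(gamma * x[i - lag] if i >= lag else 0.0) + rng.gauss(0.0, 0.5)
     for i in range(n)]

# 1. Use the CCF to pick the lag with the strongest x-to-y relationship.
k_hat = max(range(0, 11), key=lambda k: abs(cross_corr(x, y, k)))

# 2. Structure the data as lagged pairs and regress.
a_hat, a_se, _ = fit_through_origin(x[:-1], x[1:])                 # x_i on x_{i-1}
g_hat, g_se, resid = fit_through_origin(x[:n - k_hat], y[k_hat:])  # y_i on x_{i-k}

# 3. Report estimates with approximate 95% intervals (+/- 2 standard errors).
print(f"lag k = {k_hat}")
print(f"alpha = {a_hat:.3f} +/- {2 * a_se:.3f}")
print(f"gamma = {g_hat:.3f} +/- {2 * g_se:.3f}")

# 4. Check residuals: lag-1 autocorrelation should be near zero if the model fits.
m = sum(resid) / len(resid)
r1 = (sum((resid[i] - m) * (resid[i + 1] - m) for i in range(len(resid) - 1))
      / sum((e - m) ** 2 for e in resid))
print(f"residual r(1) = {r1:.3f}")
```

Fitting through the origin keeps the sketch close to the slide's model forms, which have no intercept; with real process data an intercept term and a fuller residual ACF check would usually be worth adding.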