

Forecasting

Lecturer: Prof. Duane S. Boning

Rev 8


Regression – Review & Extensions

• Single model coefficient, linear dependence: $y = \beta x + \epsilon$

• Slope and intercept (or offset): $y = \beta_0 + \beta_1 x + \epsilon$

• Polynomial and higher-order models: $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \epsilon$

• Multiple parameters: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon$

• Key point: “linear” regression can be used as long as the model is linear in the coefficients; the dependence on the independent variable itself may be nonlinear (see the sketch after this list)

• Time dependencies

– Explicit

– Implicit
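As a minimal illustration of the key point above (a sketch; the data and the true coefficient values are invented for this example), a quadratic fit is still ordinary least-squares regression because the model is linear in its coefficients:

    import numpy as np

    # Quadratic model y = b0 + b1*x + b2*x^2: nonlinear in x, but linear
    # in the coefficients (b0, b1, b2), so ordinary least squares applies.
    rng = np.random.default_rng(0)
    x = np.linspace(0.0, 10.0, 50)
    y = 1.0 + 2.0 * x - 0.3 * x**2 + rng.normal(0.0, 1.0, x.size)

    # Design matrix: one column per coefficient.
    X = np.column_stack([np.ones_like(x), x, x**2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(beta)   # estimates close to (1.0, 2.0, -0.3)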


Agenda

1. Regression
• Polynomial regression
• Example (using Excel)

2. Time Series Data & Time Series Regression
• Autocorrelation – ACF
• Example: white noise sequences
• Example: autoregressive sequences
• Example: moving average
• ARIMA modeling and regression

3. Forecasting Examples


Time Series – Time as an Implicit Parameter

• Data are often collected in time order

• An underlying dynamic process (e.g. due to the physics of a manufacturing process) may create autocorrelation in the data

[Figure: two example series plotted against time, one uncorrelated and one autocorrelated]


Intuition: Where Does Autocorrelation Come From?

• Consider a chamber with volume $V$, with gas flowing in and flowing out at rate $f$. We are interested in the concentration $x$ at the output, in relation to a known input concentration $w$.
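The connection to autoregression can be sketched with a simple mass balance (an outline, assuming perfect mixing and a sampling interval $\Delta t$; this derivation is a reconstruction, not copied from the slide):

$$ V \frac{dx}{dt} = f\,w(t) - f\,x(t) $$

Discretizing with an Euler step of size $\Delta t$ gives

$$ x_i = \left(1 - \frac{f \Delta t}{V}\right) x_{i-1} + \frac{f \Delta t}{V}\, w_{i-1} = \alpha\, x_{i-1} + (1 - \alpha)\, w_{i-1} $$

so the output concentration depends on its own previous value: an autoregressive term with $\alpha = 1 - f \Delta t / V$.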


Key Tool: Autocorrelation Function (ACF)

• Time series data: $x_i$, with time index $i = 1, \ldots, n$

• ACF: auto-correlation function,

$$ r_k = \frac{c_k}{c_0}, \qquad c_k = \frac{1}{n} \sum_{i=1}^{n-k} (x_i - \bar{x})(x_{i+k} - \bar{x}) $$

• CCF: cross-correlation function; the same idea applied between two different signals

⇒ The ACF shows the “similarity” of a signal to a lagged version of the same signal

[Figure: a time series plotted against time, and its sample ACF plotted against lag]
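A small helper for computing the sample ACF (a sketch in Python rather than the Excel used in lecture; the function name and interface are mine):

    import numpy as np

    def acf(x, max_lag=40):
        """Sample autocorrelation r_k = c_k / c_0 for lags 0..max_lag."""
        x = np.asarray(x, dtype=float)
        n = x.size
        xc = x - x.mean()
        c0 = np.dot(xc, xc) / n
        return np.array([np.dot(xc[:n - k], xc[k:]) / (n * c0)
                         for k in range(max_lag + 1)])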


Stationary vs. Non-Stationary

Stationary series: the process has a fixed mean

Non-stationary series: the mean itself wanders over time

[Figure: a stationary series fluctuating about a constant mean, and a non-stationary series whose level drifts, each plotted against time]


White Noise – An Uncorrelated Series

• Data drawn from an IID Gaussian distribution, $x_i \sim N(\mu, \sigma^2)$

• ACF: we also plot the $3\sigma$ limits; values within these limits are not significant

• Note that $r(0) = 1$ always: a signal is always equal to itself with zero lag, i.e. perfectly autocorrelated at $k = 0$

• Sample mean: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$. Sample variance: $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$

[Figure: a white-noise series plotted against time, and its sample ACF plotted against lag, with no significant autocorrelation at any nonzero lag]
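Tying the two previous ideas together (a sketch reusing the acf helper above; under white noise the standard error of $r_k$ is approximately $1/\sqrt{n}$, so the $3\sigma$ limits are about $\pm 3/\sqrt{n}$):

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.normal(0.0, 1.0, 200)    # IID Gaussian white noise

    r = acf(x, max_lag=40)
    limit = 3.0 / np.sqrt(x.size)    # approximate 3-sigma significance limits
    print(r[0])                      # always 1.0 at lag 0
    print(np.abs(r[1:]) < limit)     # nearly all nonzero lags fall inside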


Autoregressive Disturbances

• Generated by: $x_i = \alpha\, x_{i-1} + \epsilon_i$, with IID noise $\epsilon_i \sim N(0, \sigma^2)$

• Mean: $E[x_i] = 0$

• Variance: $\mathrm{Var}(x_i) = \dfrac{\sigma^2}{1 - \alpha^2}$ for $|\alpha| < 1$

Slow drop in the ACF for large $\alpha$

[Figure: an AR(1) series plotted against time, and its sample ACF plotted against lag]

So AR (autoregressive) behavior increases the variance of the signal, as the simulation below illustrates.
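A quick check of the variance formula (a sketch; $\alpha = 0.9$ and $\sigma = 1$ are assumed values, not taken from the slide):

    import numpy as np

    rng = np.random.default_rng(2)
    alpha, sigma, n = 0.9, 1.0, 100_000
    x = np.zeros(n)
    for i in range(1, n):
        x[i] = alpha * x[i - 1] + rng.normal(0.0, sigma)   # AR(1) recursion

    print(x.var())                    # sample variance of the series
    print(sigma**2 / (1 - alpha**2))  # theoretical value, about 5.26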


Another Autoregressive Series

• Generated by: $x_i = \alpha\, x_{i-1} + \epsilon_i$ with $\alpha < 0$

• High negative autocorrelation: the magnitude of the ACF still drops slowly for large $|\alpha|$, but now the ACF alternates in sign from lag to lag

[Figure: an AR(1) series with negative $\alpha$ plotted against time, and its sample ACF, which alternates in sign while decaying slowly]


Random Walk Disturbances

• Generated by: $x_i = x_{i-1} + \epsilon_i$, i.e. the AR model with $\alpha = 1$

• Mean: $E[x_i] = 0$, taking $x_0 = 0$

• Variance: $\mathrm{Var}(x_i) = i\,\sigma^2$, which grows without bound, so the series is non-stationary

[Figure: a random-walk series plotted against time, and its sample ACF plotted against lag]

Very slow drop in the ACF for $\alpha = 1$
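The unbounded variance follows by unrolling the recursion (with $x_0 = 0$ and independent noise terms):

$$ x_i = \sum_{j=1}^{i} \epsilon_j \quad\Rightarrow\quad \mathrm{Var}(x_i) = \sum_{j=1}^{i} \sigma^2 = i\,\sigma^2 $$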


Moving Average Sequence

• Generated by: $x_i = \epsilon_i + \beta\,\epsilon_{i-1}$, an MA(1) model

• Mean: $E[x_i] = 0$

• Variance: $\mathrm{Var}(x_i) = (1 + \beta^2)\,\sigma^2$

Jump in the ACF at a specific lag: $r(1) \neq 0$, with no significant autocorrelation beyond lag 1

[Figure: an MA(1) series plotted against time, and its sample ACF with a single significant spike at lag 1]

So MA (moving average) behavior also increases the variance of the signal.
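For an MA(1) model the only nonzero autocorrelation is $r(1) = \beta / (1 + \beta^2)$; a short simulation confirms this (a sketch reusing the acf helper; $\beta = 0.8$ is an assumed value):

    import numpy as np

    rng = np.random.default_rng(3)
    beta, n = 0.8, 100_000
    e = rng.normal(0.0, 1.0, n + 1)
    x = e[1:] + beta * e[:-1]    # MA(1) sequence

    r = acf(x, max_lag=5)        # acf helper defined earlier
    print(r[1])                  # about beta / (1 + beta^2) = 0.49
    print(r[2:])                 # near zero beyond lag 1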


ARMA Sequence

• Generated by: $x_i = \alpha\, x_{i-1} + \epsilon_i + \beta\,\epsilon_{i-1}$

• Combines both AR & MA behavior

Slow drop in the ACF for large $\alpha$

[Figure: an ARMA(1,1) series plotted against time, and its sample ACF plotted against lag]


ARIMA Sequence

• Start with an ARMA sequence: $x_i = \alpha\, x_{i-1} + \epsilon_i + \beta\,\epsilon_{i-1}$

• Add integrated (I) behavior, a random-walk (integrative) action: $z_i = z_{i-1} + x_i$

Slow drop in the ACF for large $\alpha$

[Figure: an ARIMA series that wanders far from zero due to the integration, and its sample ACF plotted against lag]
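Differencing undoes the integration: $\nabla z_i = z_i - z_{i-1} = x_i$ recovers the stationary ARMA series. A sketch (with assumed values $\alpha = 0.9$, $\beta = 0.8$):

    import numpy as np

    rng = np.random.default_rng(4)
    alpha, beta, n = 0.9, 0.8, 500
    e = rng.normal(0.0, 1.0, n + 1)
    x = np.zeros(n)
    for i in range(1, n):
        x[i] = alpha * x[i - 1] + e[i + 1] + beta * e[i]   # ARMA(1,1)

    z = np.cumsum(x)                  # integrated (I) behavior: z_i = z_{i-1} + x_i
    x_recovered = np.diff(z)          # first difference recovers the ARMA series
    print(np.allclose(x_recovered, x[1:]))   # True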


Periodic Signal with Autoregressive Noise

Original signal: the periodicity is masked by the autoregressive noise

After differencing: the underlying signal, with period = 5, becomes visible

[Figure: the original and differenced series plotted against time, each with its sample ACF plotted against lag]


Cross-Correlation: A Leading Indicator

• Now we have two series:

– An “input” or explanatory variable x

– An “output” variable y

• The CCF indicates both the AR behavior and the lag between the two series:

[Figure: the input series x and the output series y plotted against time, and their sample CCF plotted against lag, peaking at the lag by which x leads y]
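A sketch of using the cross-correlation to find the lead (the lag $k = 5$ and the noise level are assumed for illustration):

    import numpy as np

    rng = np.random.default_rng(5)
    n, k = 500, 5
    x = rng.normal(0.0, 1.0, n)
    y = np.roll(x, k) + 0.5 * rng.normal(0.0, 1.0, n)   # y follows x by k steps
    y[:k] = 0.0                                         # drop wrapped-around samples

    # Sample CCF at nonnegative lags; the peak locates the lead of x over y.
    xc, yc = x - x.mean(), y - y.mean()
    denom = n * xc.std() * yc.std()
    ccf = [np.dot(xc[:n - lag], yc[lag:]) / denom for lag in range(20)]
    print(int(np.argmax(ccf)))                          # about 5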


Regression & Time Series Modeling

• The ACF and CCF are helpful tools for selecting an appropriate model structure

– Autoregressive terms? Include $x_{i-1}$ as a regressor for $x_i$

– Lag terms? Include $x_{i-k}$ as a regressor for $y_i$

• One can structure the data and perform regressions (see the sketch after this list)

– Estimate model coefficient values, significance, and confidence intervals

– Determine confidence intervals on output

– Check residuals
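A minimal sketch of structuring lagged data for ordinary least squares (the function name, the single AR term, and the column layout are illustrative assumptions):

    import numpy as np

    def fit_lagged(y, x, k):
        """Regress y_i on y_{i-1} (AR term) and x_{i-k} (lag term) via OLS."""
        m = max(1, k)
        n = len(y)
        X = np.column_stack([np.ones(n - m),
                             y[m - 1:n - 1],   # autoregressive term y_{i-1}
                             x[m - k:n - k]])  # lagged input term x_{i-k}
        coef, *_ = np.linalg.lstsq(X, y[m:], rcond=None)
        return coef   # (intercept, AR coefficient, lag coefficient)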


Statistical Modeling Summary

1. Statistical Fundamentals
• Sampling distributions
• Point and interval estimation
• Hypothesis testing

2. Regression
• ANOVA
• Nominal data: modeling of treatment effects (mean differences)
• Continuous data: least-squares regression

3. Time Series Data & Forecasting
• Autoregressive, moving average, and integrative behavior
• Auto- and cross-correlation functions
• Regression and time-series modeling


MIT OpenCourseWare http://ocw.mit.edu

2.854 / 2.853 Introduction to Manufacturing Systems

Fall 2010

For information about citing these materials or our Terms of Use, visit http://ocw.mit.edu/terms.
