Time Series

Chapter 11: Time Series
1.
a.
23.96
b.
24.064
2.
a.
S5 = 24.95; T5 = 1.19
b.
S6 = 26.026; T6 = 1.1672
3.
5,660
4.
Seasonality is apparent on a listing or plot if the same patterns appear at regular intervals. This occurs
typically in monthly or quarterly data. You might notice, for example, that every year there is a high in
summer and a low in winter, as you might expect for sales of a product like ice cream. The ACF is helpful
in seeing seasonality because it will show a peak at the appropriate interval or intervals. For example, if
monthly data are seasonal, then you should expect a peak at lag 12 in the ACF. Similarly, if quarterly data
are seasonal, you should expect a peak at lag 4.
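The lag-12 peak can be illustrated with a small simulation. This is a minimal sketch on made-up monthly data (a yearly sine wave plus noise), not on any data set from the chapter; the helper name acf and all numeric settings are assumptions chosen for illustration.

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation r_k for k = 0..max_lag."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    denom = np.sum(d * d)
    return np.array([np.sum(d[k:] * d[:len(d) - k]) / denom
                     for k in range(max_lag + 1)])

rng = np.random.default_rng(0)      # fixed seed so the sketch is repeatable
months = np.arange(120)             # ten years of hypothetical monthly data
series = 10 * np.sin(2 * np.pi * months / 12) + rng.normal(0, 3, size=120)

r = acf(series, 24)
# Lag 12 lines up the same months of adjacent years, so r[12] is strongly
# positive; lag 6 pairs summer with winter, so r[6] is negative.
print("lag 6:", round(r[6], 2), " lag 12:", round(r[12], 2))
```

For quarterly data the same idea applies with a period of 4 in place of 12.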
If there is additive seasonality in monthly data, then December should be high (or low) each year by the same
amount relative to the monthly average. For example, December sales could be about $80,000 higher than
the average for the year. On the other hand, multiplicative seasonality involves a percentage increment. In
this case, December sales might be 10% higher than the monthly average.
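The difference between the two kinds of seasonality shows up when the overall level changes. The sketch below uses the $80,000 increment and the 10% index from the example; the two monthly averages are hypothetical values chosen only for illustration.

```python
additive_effect = 80_000.0   # fixed dollar increment, from the example in the text
seasonal_index = 1.10        # 10% above the monthly average, from the example in the text

def december_additive(monthly_average):
    """December value when the seasonal effect is a fixed amount."""
    return monthly_average + additive_effect

def december_multiplicative(monthly_average):
    """December value when the seasonal effect is a fixed percentage."""
    return monthly_average * seasonal_index

for avg in (800_000.0, 1_600_000.0):   # hypothetical monthly averages
    add_gap = december_additive(avg) - avg
    mult_gap = round(december_multiplicative(avg) - avg)
    print(f"average {avg:,.0f}: additive gap {add_gap:,.0f}, multiplicative gap {mult_gap:,}")
```

When the average doubles, the additive gap stays at 80,000 while the multiplicative gap doubles with it.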
5.
The politician's claim that unemployment fell was based on raw data: 88,000 more people were employed
in that month than in the previous month. The Bureau's claim that unemployment rose by 98,000 was based
on the expectation that employment would rise by 186,000 (perhaps based on the seasonal business cycle)
when in fact it rose by only 88,000. The Bureau's number gives the better picture of the state of the
economy because when the economy is healthy, there is an increase of 186,000 jobs in that time period.
The fact that this number was not reached indicates a faltering economy.
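The arithmetic behind the two claims can be written out as a short sketch. Both figures are taken from the answer above; the variable names are illustrative.

```python
expected_increase = 186_000   # usual seasonal rise in employment for the month
actual_increase = 88_000      # raw month-over-month rise in employment

# Seasonally adjusted change: the raw change measured against the usual seasonal change.
adjusted_change = actual_increase - expected_increase
print(adjusted_change)   # -98000: employment is 98,000 short of the seasonal expectation
```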
6.
a.
The line chart of the baseball averages through the years 1901 to 2001 appears as follows:
[Figure: Leading Batting Average; x-axis: Year (1901 to 2001), y-axis: Batting Average (0.300 to 0.440)]
From the plot or from observing the data in the Edit worksheet, a downward trend is apparent,
although the average was low relative to the trend in the early years and high relative to the trend
in recent years. The George Brett average in 1980 (0.390) was high relative to neighboring
observations, but there are others that are similarly different. For example, in 1977 Rod Carew
hit 0.388. There is also a peak in 1994.
b.
The trendline is:
[Figure: Leading Batting Average with trendline; x-axis: Year (1901 to 2001), y-axis: Batting Average (0.300 to 0.440)]
c.
The ACF is:
The ACF plot follows the pattern associated with a trend. This is in accord with the trend that
appears in the plot of the raw data.
d.
The ACF of the difference in batting average is:
The line plot appears as:
[Figure: Difference in Batting Avg; x-axis: Year (1901 to 2001), y-axis: Difference (-0.06 to 0.06)]
The differenced series has no evident trend although there seems to be less variability near the
beginning of the series.
e.
The only significant correlation is in the first lag and it is a negative correlation. The correlation
drops after the first lag and no other lags are statistically significant. The negative value for the first
lag indicates that the change in the batting average is negatively correlated with the change in the
previous year. In other words, if the batting average has decreased one year it is more likely to
increase in the next.
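One mechanism that can produce a negative lag-1 correlation in a differenced series is independent year-to-year scatter around a slowly moving level: a value that lands high one year tends to be followed by a drop. The sketch below simulates such a series. The data are hypothetical and are not the batting averages; the helper name lag1_autocorr is an assumption.

```python
import numpy as np

def lag1_autocorr(x):
    """Sample autocorrelation at lag 1."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return np.sum(d[1:] * d[:-1]) / np.sum(d * d)

rng = np.random.default_rng(1)      # fixed seed so the sketch is repeatable
years = np.arange(101)
# Hypothetical series: slow downward trend plus independent yearly noise.
level = 0.40 - 0.0005 * years + rng.normal(0, 0.02, size=years.size)
diff = np.diff(level)               # year-over-year change

r_diff = lag1_autocorr(diff)
print("lag-1 autocorrelation of the differences:", round(r_diff, 2))
```

For pure independent noise around a trend, the lag-1 autocorrelation of the differences is close to -0.5 in theory; a real series may show a weaker effect.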
f.
Here are the predictions and standard errors of simple exponential smoothing models. The
minimum standard forecasting error occurs with a smoothing parameter value of 0.4.
Smoothing Value   Forecasted Value   Standard Error
0.2               0.3653             0.01928
0.3               0.3659             0.01883
0.4               0.3657             0.01875
0.5               0.3655             0.01887
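The comparison across smoothing values can be sketched as follows. This is a minimal illustration on simulated data, not the batting averages, so its numbers will not match the table. It assumes the smoother starts at the first observation and measures error as the root mean squared one-step forecast error, which may differ from the standard error the software reports.

```python
import numpy as np

def ses_one_step_errors(y, w):
    """One-step forecast errors and the next forecast for simple exponential smoothing."""
    smoothed = y[0]                       # assumption: initialise at the first observation
    errors = []
    for value in y[1:]:
        errors.append(value - smoothed)   # the forecast for this period is the last smoothed value
        smoothed = w * value + (1 - w) * smoothed
    return np.array(errors), smoothed     # the final smoothed value is the next forecast

rng = np.random.default_rng(2)            # fixed seed so the sketch is repeatable
# Hypothetical series: a slowly wandering level plus independent noise.
y = 0.36 + np.cumsum(rng.normal(0, 0.005, 100)) + rng.normal(0, 0.015, 100)

results = {}
for w in (0.2, 0.3, 0.4, 0.5):
    errors, forecast = ses_one_step_errors(y, w)
    results[w] = (forecast, np.sqrt(np.mean(errors ** 2)))

best_w = min(results, key=lambda w: results[w][1])
for w, (forecast, rmse) in results.items():
    print(w, round(forecast, 4), round(rmse, 5))
print("smallest error at w =", best_w)
```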
7.
a.
The line chart appears as follows:
[Figure: Power vs. Date; x-axis: Date (Jan 1978 to Jul 1990), y-axis: Power (150.0 to 290.0)]
There are two peaks each year. There is a summer peak probably caused by high demand for air
conditioning and there is a winter peak probably resulting from heating demands.
b.
The forecasted values for the next 12 months are:
Obs.   Forecast   Lower    Upper
157    257.42     245.41   269.43
158    229.45     217.30   241.60
159    234.65     222.35   246.95
160    217.36     204.91   229.82
161    229.44     216.82   242.06
162    249.15     236.36   261.94
163    274.52     261.55   287.49
164    274.80     261.64   287.96
165    240.71     227.35   254.06
166    230.54     216.98   244.09
167    228.18     214.42   241.94
168    252.54     238.56   266.51
c.
The multiplicative seasonal indices are:
July, August, and January are the three months of highest power production. August production is
25% higher than April production.
d.
e.
The standard errors for the 5 exponential smoothing models are as follows:
Location   Linear   Seasonal   Standard Error
0.05       0.15     0.05       6.743145
0.05       0.30     0.05       7.463687
0.15       0.15     0.05       6.039115
0.15       0.30     0.05       6.165231
0.30       0.15     0.05       5.779812
0.30       0.30     0.05       5.972524
The smallest standard error comes from a location parameter of 0.3, a linear parameter of 0.15, and
a seasonal parameter of 0.05.
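The selection can be checked mechanically. The sketch below copies the rows of the table above into a list and picks the row with the smallest standard error; the variable names are illustrative.

```python
# (location, linear, seasonal, standard error), copied from the table above
models = [
    (0.05, 0.15, 0.05, 6.743145),
    (0.05, 0.30, 0.05, 7.463687),
    (0.15, 0.15, 0.05, 6.039115),
    (0.15, 0.30, 0.05, 6.165231),
    (0.30, 0.15, 0.05, 5.779812),
    (0.30, 0.30, 0.05, 5.972524),
]

best = min(models, key=lambda m: m[3])   # row with the smallest standard error
print(best)   # (0.3, 0.15, 0.05, 5.779812)
```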
8.
a.
The line plot of the visitation for Exit Glacier appears as follows:
There are two trends present: visitation at Exit Glacier is high in the summer and very low in the
winter, and overall visitation is growing every year.
b.
The two line plots for visitation appear as follows:
June 1994 is much higher than any previous month and much higher than previous June values. It
will cause forecasts for future months to be higher.
c.
The plot of the seasonally adjusted values appears as follows:
The 36th observation (December 1992) shows an increase in the seasonally adjusted values. After
this date, the adjusted values appear to vary around a higher average value.
d.
Using exponential smoothing with the location parameter = 0.15, the linear parameter = 0.15 and
the seasonal parameter = 0.05, here are forecasted values for the next 12 months:
e.
Here are the forecasted visitation data for the next 12 months, with parameters of 0.05, 0.15, and
0.05:
Here are the data with parameters of .01, .15, and .05:
Location   Linear   Seasonal   Standard Error
0.15       0.15     0.05       8419.575
0.05       0.15     0.05       7644.315
0.01       0.15     0.05       8402.932
The predictions for the next 12 months are much reduced with the smaller location parameters, and
the standard error is smaller as well. The second model might be the best, since it has the smallest
standard error. Some explanation for the jump in June 1994 would help in deciding between the
projections. Was there an event at the glacier that drew more visitors? Was a new
road recently put in, or a new shuttle system, or something else that would make access to the
glacier much easier? Or has the Park experienced a dramatic and temporary increase in visitation
due to publicity from some other event?
f.
The lower ends of many confidence intervals are negative, particularly in the winter. Since the
numbers represent visitors, they cannot be negative! Clearly these confidence intervals have to be
viewed with caution.
9.
a.
Use the LOG10(D2) function to transform the count of visitors to Exit Glacier and then use the Fill
Down command to fill in the rest of the values in the column.
b.
The line plot of the log10(visits) appears as follows:
By plotting the logs we can view smaller variations in the number of visits in the winter months.
It is possible that there are peaks that occur in the winter months which we could not see in the
raw counts plots (due to the width of the y-axis scale).
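For readers working outside a spreadsheet, the LOG10 transformation from part a and its inverse can be sketched as below. The visitor counts are hypothetical values chosen to span winter and summer magnitudes; they are not the Exit Glacier data.

```python
import numpy as np

# Hypothetical monthly visitor counts (winter lows through a summer peak).
visits = np.array([120, 85, 430, 5200, 31000, 48000])

log_visits = np.log10(visits)   # same transformation as LOG10(D2) filled down the column
back = 10 ** log_visits         # exponentiating returns to the count scale

print(np.round(log_visits, 2))
```

On the log scale the whole range fits within about three units, so winter variation is no longer flattened by the summer peak. A month with zero visitors would need separate handling, since the logarithm of zero is undefined.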
c.
The predicted and exponentiated log(visits) values for an exponential model with the location
parameter = 0.15, the linear parameter = 0.15 and the seasonal parameter = 0.05 are:
The confidence intervals are unreasonably wide. The upper 95% confidence interval suggests that
more than 3 million people could visit the Exit Glacier site in June 1995!
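A rough calculation suggests why the exponentiated intervals are so wide. The sketch below assumes a 95% interval of the form forecast ± 1.96 × standard error on the log10 scale and uses the standard error reported for these parameters in part d (0.363263). The software's actual interval formula may differ, and the point forecast is a hypothetical number used only for illustration.

```python
standard_error_log10 = 0.363263   # reported in part d for parameters 0.15, 0.15, 0.05
z = 1.96                          # assumption: normal-theory 95% multiplier

half_width_log10 = z * standard_error_log10
factor = 10 ** half_width_log10   # multiplicative half-width on the count scale

point_forecast = 50_000           # hypothetical visitor count, for illustration only
lower = point_forecast / factor
upper = point_forecast * factor
print(round(factor, 2), round(lower), round(upper))
```

Under these assumptions the upper limit is roughly five times the point forecast and the lower limit roughly one fifth of it. An interval that is symmetric on the log scale becomes a very wide, lopsided interval in visitor counts.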
d.
Here are the forecasted values with parameters .01, .15, and .05:
and with parameters .05, .15, and .05:
Location   Linear   Seasonal   Standard Error
0.15       0.15     0.05       0.363263
0.05       0.15     0.05       0.326634
0.01       0.15     0.05       0.317860
The set of parameters .01, .15, and .05 gives the smallest standard error for the log(visitation)
data, as well as the smallest forecasted visitation.
e.
Here are the projected winter visitation numbers using the raw and log counts:
Month            Raw Count Projection   Log Count Projection
December 1995    442                    248
January 1995     275                    217
February 1995    201                    143
March 1995       551                    471
The transformed values predict a lower number of visits for the winter months. When
determining the number of personnel you need during winter months, a difference of 200 visits
might mean hiring (or not hiring) an additional employee. The results from previous years
indicate that it may be more appropriate to forecast the lower number of visits rather than the
higher number. Other information besides the forecasts from the models should be taken into
account, however.
10.
a.
The line plot appears as follows:
There is some evidence of seasonality albeit with a great deal of variability. There appear to be
about 7 separate peaks.
b.
The boxplot appears as:
A period of high body temperatures precedes menstruation. The onset of menstruation appears to
be associated with a decrease in temperature of about 0.5 degrees.
c.
The ACF plot is:
Based on the shape of the plot, the between-peak distance is roughly 35, and thus the length of the
period is about 35 days.
d.
The exponential smoothing model using a location parameter=0.15, linear parameter=0.01 and the
seasonal parameter=0.05 appears as follows:
e.
Here are the predicted temperatures for the next cycle, using parameters of 0.15, 0.01, and 0.15:
Here are the predicted temperatures for the next cycle, using parameters of 0.15, 0.01, and 0.25:
Location   Linear   Seasonal   Standard Error
0.15       0.01     0.05       0.221952
0.15       0.01     0.15       0.229522
0.15       0.01     0.25       0.236464
11.
a.
The smoothed plot with a smoothing parameter of 0.15 appears as follows (after reformatting):
[Figure: smoothed plot for w = 0.15]
MSE = 11,595.06
For w = 0.085, MSE = 11,263.54
[Figure: smoothed plot for w = 0.085]
For w = 0.05, MSE = 11,188.20
[Figure: smoothed plot for w = 0.05]
The lowest MSE occurs with the smoothing parameter = 0.05.
b.
The plots show a downward trend in draft numbers as the year progresses.
c.
The ACF plot (shown following) does not indicate the presence of any significant autocorrelation in
the draft numbers.
d.
The draft numbers show a downward trend as the year progresses, making it more likely that people
born later in the year receive lower draft numbers (and are thus more likely to be drafted, since
lower numbers were called first). There is no indication that draft numbers are autocorrelated.
12.
a.
The line chart appears as follows:
[Figure: Oil Production; x-axis: Jan to Dec, y-axis: Production (0 to 160); one line per year, 1992 to 1995]
Production is seasonal. Cottonseed oil production reaches its height in the months from
November to January. The production is at its lowest from July to September.
b.
The smoothed plot and 12 forecasted values appear as follows:
[Figure: smoothed production with 12 forecasted values, by observation number]
c.
Adjusting for the seasonal effects yields the following plot:
[Figure: Production and Adjusted values, by observation number]
It is difficult to determine whether the mean production level (after adjusting for seasonal
variation) has changed. There is one peak that occurs in the last year that suggests that it might
have.
The ANOVA table for the regression is:
Regression Statistics
Multiple R          0.266
R Square            0.071
Adjusted R Square   0.051
Standard Error      11.321
Observations        48

ANOVA
             df   SS         MS        F       Significance F
Regression   1    449.403    449.403   3.507   0.067
Residual     46   5895.438   128.162
Total        47   6344.841

             Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept    94.896         3.320            28.585   0.000     88.214      101.578
Obs.         0.221          0.118            1.873    0.067     -0.017      0.458
The p-value for the slope of the regression is 0.067, which is not significant at the 5% level. We
therefore cannot conclude that there has been a significant increase in cottonseed oil production
over the past four years.
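The entries in the regression output can be cross-checked against one another. The sketch below recomputes the residual mean square, the F statistic, the t statistic for the slope, and R Square from the sums of squares and coefficients reported above; the variable names are illustrative.

```python
ss_regression, df_regression = 449.403, 1
ss_residual, df_residual = 5895.438, 46
slope, slope_se = 0.221, 0.118

ms_residual = ss_residual / df_residual
f_stat = (ss_regression / df_regression) / ms_residual
t_stat = slope / slope_se
r_squared = ss_regression / (ss_regression + ss_residual)

print(round(ms_residual, 3), round(f_stat, 3), round(t_stat, 3), round(r_squared, 3))
# 128.162 3.507 1.873 0.071
```

With a single predictor, the F statistic equals the square of the slope's t statistic (up to rounding of the reported coefficients), which is why both tests share the p-value of 0.067.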
13.
a.
The two-way table appears as:
Stoppage
Year   Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
1981     6    7   16   17   18   30   23    9    5    7    5    2
1982     2    3    4   14   15   18   13    9   14    3    1    0
1983     1    5    5    2   12   16   10    7    7   12    4    0
1984     6    3    2    7    5    5    8    5   10    4    4    3
1985     2    4    4    3    2    2    9    6   11    6    3    2
1986     4    3    2    4    6   11   13   10    8    5    2    1
1987     2    5    3    2    3    8    6    3    7    1    6    0
1988     3    5    3    0    5    7    4    7    2    3    1    0
1989     3    0    3    6    8    2    6    6    6    5    5    1
1990     2    3    7    5    5    6    1    5    3    2    3    2
1991     0    2    1    7    7    5    0    4    3    6    3    1
1992     0    1    1    4    6    6    1    3    8    5    0    0
1993     2    1    4    2    5    2    3    5    4    4    3    0
1994     1    2    3    5    4    9    4    5    7    4    1    0
1995     1    1    4    3    1    2    3    5    4    5    2    0
b.
The boxplot and line plots appear as:
[Figure: boxplot of work stoppages by month, Jan to Dec]
[Figure: Work Stoppage; x-axis: Month, y-axis: Stoppage (0 to 35); one line per year, 1981 to 1995]
The highest work stoppages appear in the summer months from May to September. The fewest
work stoppages appear from December to February.
c.
The adjusted values appear as follows (after reformatting):
[Figure: Stoppage and Adjusted values, by observation number]
There does appear to be a decrease in work stoppages over the decade after adjusting for seasonal
effects.
d.
The plot of smoothed values appears as follows:
[Figure: smoothed work stoppages, by observation number]
e.
Based on the findings, it appears that work stoppages may be seasonal and they have declined over
the past ten years.
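Both findings can be checked directly from the two-way table in part a. The sketch below copies the table into a dictionary and totals it by month and by year; the variable names are illustrative, and simple totals are used here rather than the seasonal adjustment or smoothing applied in parts c and d.

```python
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Work stoppages by year (rows) and month (columns), copied from part a.
stoppages = {
    1981: [6, 7, 16, 17, 18, 30, 23, 9, 5, 7, 5, 2],
    1982: [2, 3, 4, 14, 15, 18, 13, 9, 14, 3, 1, 0],
    1983: [1, 5, 5, 2, 12, 16, 10, 7, 7, 12, 4, 0],
    1984: [6, 3, 2, 7, 5, 5, 8, 5, 10, 4, 4, 3],
    1985: [2, 4, 4, 3, 2, 2, 9, 6, 11, 6, 3, 2],
    1986: [4, 3, 2, 4, 6, 11, 13, 10, 8, 5, 2, 1],
    1987: [2, 5, 3, 2, 3, 8, 6, 3, 7, 1, 6, 0],
    1988: [3, 5, 3, 0, 5, 7, 4, 7, 2, 3, 1, 0],
    1989: [3, 0, 3, 6, 8, 2, 6, 6, 6, 5, 5, 1],
    1990: [2, 3, 7, 5, 5, 6, 1, 5, 3, 2, 3, 2],
    1991: [0, 2, 1, 7, 7, 5, 0, 4, 3, 6, 3, 1],
    1992: [0, 1, 1, 4, 6, 6, 1, 3, 8, 5, 0, 0],
    1993: [2, 1, 4, 2, 5, 2, 3, 5, 4, 4, 3, 0],
    1994: [1, 2, 3, 5, 4, 9, 4, 5, 7, 4, 1, 0],
    1995: [1, 1, 4, 3, 1, 2, 3, 5, 4, 5, 2, 0],
}

# Totals down each column (month) and across each row (year).
monthly_totals = [sum(row[i] for row in stoppages.values()) for i in range(12)]
yearly_totals = {year: sum(row) for year, row in stoppages.items()}

busiest = months[monthly_totals.index(max(monthly_totals))]
quietest = months[monthly_totals.index(min(monthly_totals))]

print(dict(zip(months, monthly_totals)))
print("busiest:", busiest, " quietest:", quietest)   # busiest: Jun  quietest: Dec
print(yearly_totals[1981], yearly_totals[1995])      # 145 31
```

The monthly totals peak from May to September and are lowest from December to February, and the yearly total falls from 145 in 1981 to 31 in 1995.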
14.
a.
The two-way table appears as:
Unemployment
Year    Jan   Feb   Mar   Apr   May   Jun   Jul   Aug   Sep   Oct   Nov   Dec
1981   19.1  19.3  19.2  18.8  19.1  19.8  18.6  18.8  19.7  20.3  21.3  21.1
1982     22  22.6  21.8  22.8  22.8  22.9    24  23.7  23.6  23.7  24.1  24.1
1983   23.1  22.8  23.5  23.4  22.8    24  22.8  22.9  21.7  21.4  20.2  19.9
1984   19.5  19.4  19.8  19.2  18.7  18.2  18.8  18.7  19.2  18.6  17.7  18.8
1985   18.8  18.3  18.2  17.5  18.5  18.5  20.2  17.9  17.9    20  18.3  19.1
1986   18.1  18.8  18.2  19.2  18.6  19.2  18.4    18  18.4  17.7  18.1  17.5
1987   17.7    18  17.9  17.3  17.4  16.5  15.8  15.9  16.2  17.3  16.6    16
1988   16.1  15.6  16.6    16  15.3  14.2  14.8  15.4  15.5  15.1  13.9  14.8
1989   16.4    15  13.9  14.6  14.8  15.7  14.2  14.6  15.2    15  15.5  15.3
1990   14.8    15  14.3  14.7    15  14.3    15  16.3  16.4  16.5  17.1  17.4
1991   18.6  17.4  18.3  17.8  18.8  18.5  19.4  18.9  18.8  19.1    19  20.3
1992   19.2  20.1  20.3  18.5  20.1    23  20.8  19.9    21  18.3  20.5  19.8
1993   19.9  19.7  19.7  19.5  19.8  19.9  18.4  18.4  18.2  18.7  18.5  17.9
1994   18.3  18.1    18    19    18  17.7  17.5  17.3  17.6  17.4  15.5  16.9
1995   16.5  17.5  16.1  17.3  17.5  17.2    18  17.4  17.8  17.3  17.5  17.9
1996   17.8    17  17.1  16.8  16.6  16.2  16.7    17    16  16.3  16.8  16.5
b.
The spaghetti plot appears as:
[Figure: Unemployment; x-axis: Month (Jan to Dec), y-axis: Unemployment (12 to 26); one line per year, 1981 to 1996]
c.
The boxplot appears as:
[Figure: boxplot of Unemployment by Month, Jan to Dec; y-axis: 12 to 26]
There is no evidence of seasonality in this plot.
d.
The results of the seasonal adjustment appear as follows:
There is nothing in the seasonal indices to indicate that these data are seasonal.