Solutions to Midterm (STA 4234, October 7, 2013)
The outputs are provided by Minitab.
1.
Regression Analysis: purity versus hydro

The regression equation is
purity = 77.9 + 11.8 hydro

Predictor    Coef  SE Coef      T      P
Constant   77.863    4.199  18.54  0.000
hydro      11.801    3.485   3.39  0.003

S = 3.59656   R-Sq = 38.9%   R-Sq(adj) = 35.5%

Analysis of Variance
Source          DF      SS      MS      F      P
Regression       1  148.31  148.31  11.47  0.003
Residual Error  18  232.83   12.94
Total           19  381.15

Unusual Observations
Obs  hydro  purity     Fit  SE Fit  Residual  St Resid
 18   0.99  96.850  89.546   1.047     7.304     2.12R

R denotes an observation with a large standardized residual.

Predicted Values for New Observations
New Obs     Fit  SE Fit            95% CI            95% PI
      1  89.664   1.025  (87.510, 91.818)  (81.807, 97.521)

Values of Predictors for New Observations
New Obs  hydro
      1   1.00
(a) The regression equation is purity = 77.9 + 11.8 hydro.
(b) Analysis of Variance
Source          DF      SS      MS      F      P
Regression       1  148.31  148.31  11.47  0.003
Residual Error  18  232.83   12.94
Total           19  381.15
F = 11.47 with p = 0.003. Since p < 0.05, the null hypothesis H0: β1 = 0 is rejected, and we conclude that there is a
significant linear relationship between purity and percent of hydrocarbons.
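As a hand check on the ANOVA table, F, R-Sq and R-Sq(adj) can be recomputed from the sums of squares and degrees of freedom. The Python sketch below is only an illustration; the function name `anova_summary` is hypothetical and not part of Minitab.

```python
def anova_summary(ss_reg, ss_err, df_reg, df_err):
    """Return (F, R-Sq, adjusted R-Sq) computed from ANOVA table entries."""
    ss_total = ss_reg + ss_err
    df_total = df_reg + df_err
    ms_reg = ss_reg / df_reg          # mean square for regression
    ms_err = ss_err / df_err          # mean square error
    f_stat = ms_reg / ms_err
    r_sq = ss_reg / ss_total
    r_sq_adj = 1.0 - ms_err / (ss_total / df_total)
    return f_stat, r_sq, r_sq_adj

# SS and DF values copied from the ANOVA table for Problem 1.
f_stat, r_sq, r_sq_adj = anova_summary(148.31, 232.83, 1, 18)
print(round(f_stat, 2), round(100 * r_sq, 1), round(100 * r_sq_adj, 1))
```

Using the unrounded mean square error (232.83/18) rather than the printed 12.94 is what reproduces Minitab's F to two decimals.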
(c) R-Sq = 38.9%
(d) A 95% confidence interval on the slope parameter β1 is 11.801 ± 2.101(3.485) =
(4.48, 19.12), where 2.101 is the upper 2.5% point of the t distribution with 18 degrees of freedom.
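The arithmetic for the slope interval can be reproduced with a few lines of Python. This is a minimal sketch: the helper name `coef_interval` is made up for illustration, and the critical value 2.101 is taken from the solution rather than computed.

```python
def coef_interval(coef, se, t_crit):
    """Two-sided confidence interval: coef +/- t_crit * se."""
    half_width = t_crit * se
    return coef - half_width, coef + half_width

# Slope estimate and its standard error from the Minitab output;
# 2.101 is the t critical value for 18 error degrees of freedom.
lower, upper = coef_interval(11.801, 3.485, 2.101)
print(f"({lower:.2f}, {upper:.2f})")
```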
(e) A 95% confidence interval on the mean purity when the hydrocarbon percentage is
1.00 is (87.510, 91.818).
(f) A 95% prediction interval on oxygen purity when the hydrocarbon percentage is 1.00
is (81.807, 97.521).
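Both intervals in (e) and (f) follow from the Fit, SE Fit and S values in the output: the confidence interval uses SE Fit alone, while the prediction interval adds the error variance S². The sketch below assumes the critical value 2.101 from part (d); the variable names are illustrative only.

```python
import math

fit, se_fit = 89.664, 1.025   # "Predicted Values for New Observations"
s = 3.59656                   # residual standard deviation S
t_crit = 2.101                # t critical value, 18 error degrees of freedom

# Mean response: uncertainty of the fitted line only.
ci_half = t_crit * se_fit
# New observation: fitted-line uncertainty plus error variance.
pi_half = t_crit * math.sqrt(s**2 + se_fit**2)

ci = (fit - ci_half, fit + ci_half)
pi = (fit - pi_half, fit + pi_half)
print(tuple(round(v, 3) for v in ci), tuple(round(v, 3) for v in pi))
```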
2. (a) The correlation is r = √(R-Sq) = √0.389 = 0.624, taken as positive because the fitted slope is positive.
(b) Testing H0: ρ = 0 is equivalent to testing H0: β1 = 0, so the test statistic is t = 3.39 with p = 0.003. We
reject the hypothesis that ρ = 0 and conclude that the correlation is different from zero.
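The equivalence of the two tests can be checked numerically: the statistic t = r√(n − 2)/√(1 − r²) computed from the correlation should agree with the slope t value in the Minitab output. A minimal sketch, assuming n = 20 (total DF 19 plus one); the function name `t_from_r` is hypothetical.

```python
import math

def t_from_r(r, n):
    """t statistic for H0: rho = 0 based on the sample correlation r."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r**2)

# r from part (a); n = 20 observations.
t0 = t_from_r(0.624, 20)
print(round(t0, 2))
```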
(c) A 95% confidence interval for ρ is
(tanh[arctanh(0.624) − 1.96/√17], tanh[arctanh(0.624) + 1.96/√17]) = (tanh(0.256), tanh(1.207)) = (0.251, 0.836).
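The endpoints of the Fisher z interval can be recomputed with the standard library. This is a sketch only: `fisher_ci` is an illustrative name, n = 20 is inferred from the total DF of 19, and 1.96 is the normal critical value used in the solution.

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """Approximate CI for rho via Fisher's z transform, arctanh(r)."""
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)   # standard error is 1/sqrt(n - 3)
    # Back-transform the endpoints to the correlation scale.
    return math.tanh(z - half_width), math.tanh(z + half_width)

lower, upper = fisher_ci(0.624, 20)
print(round(lower, 3), round(upper, 3))
```

The interval is not symmetric about r because tanh compresses values near 1.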
(d) For H0: ρ = 0.6, the test statistic is Z0 = (arctanh(0.624) − arctanh(0.6))√17 = 0.15825. Since the
rejection region is |Z0| > Z(α/2) = 1.96, we fail to reject H0.
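The same transform gives the test statistic for a nonzero hypothesized correlation. A minimal sketch, assuming n = 20 and a two-sided 5% test; `fisher_z_stat` is a hypothetical helper name.

```python
import math

def fisher_z_stat(r, rho0, n):
    """Z statistic for H0: rho = rho0 using Fisher's z transform."""
    return (math.atanh(r) - math.atanh(rho0)) * math.sqrt(n - 3)

z0 = fisher_z_stat(0.624, 0.6, 20)
reject = abs(z0) > 1.96   # two-sided test at the 5% level
print(round(z0, 5), reject)
```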
3.
Regression Analysis: y versus x1, x2, x3, x4, x5

The regression equation is
y = 52.1 + 0.0556 x1 + 0.282 x2 + 0.125 x3 - 0.000 x4 - 16.1 x5

Predictor     Coef  SE Coef       T      P
Constant     52.08    18.89    2.76  0.020
x1         0.05556  0.02987    1.86  0.093
x2         0.28214  0.05761    4.90  0.001
x3          0.1250   0.4033    0.31  0.763
x4         -0.0000   0.2016   -0.00  1.000
x5         -16.065    1.456  -11.03  0.000

S = 8.06536   R-Sq = 93.7%   R-Sq(adj) = 90.6%

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       5   9712.5  1942.5  29.86  0.000
Residual Error  10    650.5    65.0
Total           15  10363.0

Source  DF  Seq SS
x1       1   225.0
x2       1  1560.2
x3       1     6.2
x4       1     0.0
x5       1  7921.0
Regression Analysis: y versus x2, x5

The regression equation is
y = 80.1 + 0.282 x2 - 16.1 x5

Predictor     Coef  SE Coef       T      P
Constant    80.135    5.691   14.08  0.000
x2         0.28214  0.05883    4.80  0.000
x5         -16.065    1.487  -10.81  0.000

S = 8.23571   R-Sq = 91.5%   R-Sq(adj) = 90.2%

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       2   9481.3  4740.6  69.89  0.000
Residual Error  13    881.7    67.8
Total           15  10363.0

Source  DF  Seq SS
x2       1  1560.2
x5       1  7921.0

Unusual Observations
Obs    x2      y    Fit  SE Fit  Residual  St Resid
  7  95.0  71.00  86.38    3.57    -15.38    -2.07R

R denotes an observation with a large standardized residual.
(a) The desired multiple regression model is
y = 52.1 + 0.0556 x1 + 0.282 x2 + 0.125 x3 - 0.000 x4 - 16.1 x5
(b)
Analysis of Variance
Source          DF       SS      MS      F      P
Regression       5   9712.5  1942.5  29.86  0.000
Residual Error  10    650.5    65.0
Total           15  10363.0
Since F = 29.86 with p = 0.000, we reject H0: β1 = β2 = β3 = β4 = β5 = 0 (all slope coefficients are zero) and
conclude that the regression is significant.
(c) x2 (t = 4.90, p = 0.001) and x5 (t = -11.03, p = 0.000) appear to contribute to the model. The other regressors x1, x3 and x4 have
small t values with p values above 0.05 (0.093, 0.763 and 1.000), which implies that they are not significant at the 5% level.
Predictor     Coef  SE Coef       T      P
Constant     52.08    18.89    2.76  0.020
x1         0.05556  0.02987    1.86  0.093
x2         0.28214  0.05761    4.90  0.001
x3          0.1250   0.4033    0.31  0.763
x4         -0.0000   0.2016   -0.00  1.000
x5         -16.065    1.456  -11.03  0.000
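Each T value in the table is Coef divided by SE Coef, and a regressor is significant at the 5% level when |T| exceeds the t critical value for 10 error degrees of freedom. The sketch below assumes the critical value 2.228 that the solution uses in part (e); the dictionary layout is only for illustration.

```python
# (Coef, SE Coef) pairs copied from the five-regressor Minitab output.
rows = {
    "Constant": (52.08, 18.89),
    "x1": (0.05556, 0.02987),
    "x2": (0.28214, 0.05761),
    "x3": (0.1250, 0.4033),
    "x4": (-0.0000, 0.2016),
    "x5": (-16.065, 1.456),
}
t_crit = 2.228  # t critical value, 10 error degrees of freedom

# T = Coef / SE Coef for every row.
t_values = {name: coef / se for name, (coef, se) in rows.items()}

# Regressors (excluding the intercept) whose |T| exceeds the critical value.
significant = [name for name, t in t_values.items()
               if name != "Constant" and abs(t) > t_crit]
print(significant)
```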
(d) For the model in part (a), R-Sq = 93.7% and R-Sq(adj) = 90.6%. For the model with only
temperature (x2) and particle size (x5), R-Sq = 91.5% and R-Sq(adj) = 90.2%. The adjusted values are
nearly identical, so dropping x1, x3 and x4 loses very little explanatory power.
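The R-Sq and R-Sq(adj) values for both models can be recomputed from the two ANOVA tables. A minimal sketch; the function name `r_squared` is hypothetical, and the SS and DF inputs are copied from the Minitab output.

```python
def r_squared(ss_reg, ss_err, df_reg, df_err):
    """Return (R-Sq, adjusted R-Sq) from ANOVA sums of squares."""
    ss_total = ss_reg + ss_err
    df_total = df_reg + df_err
    r_sq = ss_reg / ss_total
    # The adjusted version penalizes extra regressors via mean squares.
    r_sq_adj = 1.0 - (ss_err / df_err) / (ss_total / df_total)
    return r_sq, r_sq_adj

full = r_squared(9712.5, 650.5, 5, 10)      # y versus x1, ..., x5
reduced = r_squared(9481.3, 881.7, 2, 13)   # y versus x2, x5
print([round(100 * v, 1) for v in full], [round(100 * v, 1) for v in reduced])
```

R-Sq can only go up when regressors are added, so the adjusted values are the fairer comparison between the two models.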
(e) For the model in part (a), a 95% confidence interval for the regression
coefficient for temperature is 0.282 ± 2.228(0.05761) = (0.154, 0.410), using the t critical value
with 10 degrees of freedom. For the model with only temperature and
particle size, a 95% confidence interval is 0.282 ± 2.16(0.05883) = (0.155, 0.409), using 13 degrees
of freedom. These two intervals are almost the same.
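To see how close the two intervals are, their endpoints and widths can be compared directly. This sketch takes the critical values 2.228 and 2.16 from the solution as given; `coef_interval` is an illustrative helper name.

```python
def coef_interval(coef, se, t_crit):
    """Two-sided confidence interval: coef +/- t_crit * se."""
    half_width = t_crit * se
    return coef - half_width, coef + half_width

full_ci = coef_interval(0.282, 0.05761, 2.228)     # five-regressor model, 10 error df
reduced_ci = coef_interval(0.282, 0.05883, 2.160)  # x2 and x5 only, 13 error df

full_width = full_ci[1] - full_ci[0]
reduced_width = reduced_ci[1] - reduced_ci[0]
print(full_ci, reduced_ci, full_width - reduced_width)
```

The reduced model has a slightly larger standard error but a smaller critical value, and the two effects nearly cancel.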