Spring 2014 Exam 1 Solutions

STA 6167 – Exam 1 – Spring 2014 – PRINT Name _______________________
For all significance tests, use the $\alpha = 0.05$ significance level.
Q.1. A simple linear regression was fit relating the number of species of arctic flora observed (Y) to July mean temperature
(X, in degrees Celsius). The results of the regression, based on n=19 temperature stations, are given below.
ANOVA
             df      SS      MS       F    Significance F
Regression    1   39858   39858   79.87    0.0000
Residual     17    8484     499
Total        18   48342

            Coefficients  Standard Error  t Stat  P-value  Lower 95%  Upper 95%
Intercept      -34.49          16.56       -2.08   0.0527    -69.43       0.46
JulyTemp        24.60           2.75        8.94   0.0000     18.79      30.41

S_XX = 65.85    X-bar = 5.7
p.1.a. What proportion of the variation in number of species is “explained” by mean July temperature?
$R^2 = SSR/TSS = 39858/48342 = 0.8245$
p.1.b. Compute a 95% Confidence Interval for the population mean number of species, with mean July temperature of 6
degrees.
$$\hat{Y}_6 = -34.49 + 24.60(6) = 113.11$$
$$\frac{1}{n} + \frac{(6 - \bar{X})^2}{S_{XX}} = \frac{1}{19} + \frac{(6 - 5.7)^2}{65.85} = 0.0540$$
$$SE\{\hat{Y}_6\} = \sqrt{MSE\left[\frac{1}{n} + \frac{(6 - \bar{X})^2}{S_{XX}}\right]} = \sqrt{499(0.0540)} = 5.19$$
$$t_{.025,17} = 2.110 \qquad 95\%\ CI:\ 113.11 \pm 2.110(5.19) = 113.11 \pm 10.95 = (102.16,\ 124.06)$$
p.1.c. Compute a 95% Prediction Interval for the number of species, at a single station with mean July temperature of 6
degrees.
$$\hat{Y}_{6,new} = -34.49 + 24.60(6) = 113.11$$
$$SE\{\hat{Y}_{6,new}\} = \sqrt{MSE\left[1 + \frac{1}{n} + \frac{(6 - \bar{X})^2}{S_{XX}}\right]} = \sqrt{499(1.0540)} = 22.93$$
$$t_{.025,17} = 2.110 \qquad 95\%\ PI:\ 113.11 \pm 2.110(22.93) = 113.11 \pm 48.39 = (64.72,\ 161.50)$$
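The interval calculations in p.1.b and p.1.c differ only by the extra "1 +" inside the standard error. The following is a minimal sketch of both, assuming the summary values from the Q.1 output and the tabled critical value 2.110; the function name `interval_at_x` and its parameters are illustrative, not part of the exam.

```python
import math

def interval_at_x(b0, b1, x0, mse, n, x_bar, s_xx, t_crit, new_obs=False):
    """Interval for the mean response (new_obs=False) or for a single new
    observation (new_obs=True) at X = x0 in simple linear regression."""
    fit = b0 + b1 * x0
    # Variance factor: 1/n + (x0 - x_bar)^2 / S_XX, plus 1 for a new observation.
    factor = 1.0 / n + (x0 - x_bar) ** 2 / s_xx
    if new_obs:
        factor += 1.0
    se = math.sqrt(mse * factor)
    half_width = t_crit * se
    return fit, se, fit - half_width, fit + half_width

# Values from the Q.1 output; 2.110 is the tabled t(.025, 17) critical value.
args = dict(b0=-34.49, b1=24.60, x0=6, mse=499, n=19,
            x_bar=5.7, s_xx=65.85, t_crit=2.110)
ci = interval_at_x(**args)                # mean response at X = 6
pi = interval_at_x(**args, new_obs=True)  # single new station at X = 6
print("CI (fit, se, lower, upper):", [round(v, 2) for v in ci])
print("PI (fit, se, lower, upper):", [round(v, 2) for v in pi])
```

The prediction interval is much wider than the confidence interval because it adds the variance of a single observation (MSE) to the variance of the estimated mean.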
Q.2. An experiment was conducted, relating the penetration depth of missiles (Y) to their impact factor (X). The results from
the regression, and the residual versus fitted plot, are given below (n=25).
ANOVA
             df         SS
Regression    1   1.585884
Residual     23   0.713406
Total        24   2.29929

            Coefficients  Standard Error
Intercept     0.633253       0.103076
impact        0.06           0.008391

[Figure: Residuals vs Fitted Values. Vertical axis: Residuals (-0.4 to 0.4); horizontal axis: fitted values (0 to 2).]
p.2.a. Test $H_0: \beta_1 = 0$ (Penetration depth is not associated with impact factor) based on the t-test.
$$TS:\ t_{obs} = \frac{\hat{\beta}_1}{SE\{\hat{\beta}_1\}} = \frac{0.06}{0.008391} = 7.15$$
$$RR:\ |t_{obs}| \ge t_{.025,23} = 2.069$$
p.2.b. Test $H_0: \beta_1 = 0$ (Penetration depth is not associated with impact factor) based on the F-test.
$$TS:\ F_{obs} = \frac{MSR}{MSE} = \frac{SSR/1}{SSE/23} = \frac{1.585884/1}{0.713406/23} = 51.13$$
$$RR:\ F_{obs} \ge F_{.05,1,23} = 4.279$$
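In simple linear regression the t-test and F-test of the slope are equivalent, with the squared t statistic equal to the F statistic. A short sketch checking this with the Q.2 table entries follows; the variable names are illustrative, and the small gap between the two values is assumed to come from the slope being reported as 0.06 (rounded).

```python
# Slope estimate and its standard error from the Q.2 coefficient table.
slope, se_slope = 0.06, 0.008391
t_obs = slope / se_slope

# Sums of squares and degrees of freedom from the Q.2 ANOVA table.
ssr, df_reg = 1.585884, 1
sse, df_err = 0.713406, 23
f_obs = (ssr / df_reg) / (sse / df_err)

# Tabled critical values quoted in the solution.
T_CRIT, F_CRIT = 2.069, 4.279
reject_t = abs(t_obs) >= T_CRIT
reject_f = f_obs >= F_CRIT

print(f"t_obs = {t_obs:.2f}, t_obs^2 = {t_obs**2:.2f}, F_obs = {f_obs:.2f}")
print("reject H0 (t-test):", reject_t, "| reject H0 (F-test):", reject_f)
```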
p.2.c. The residual plot appears to display non-constant error variance. A regression of the squared residuals on the impact
factors (X) is fit, and the ANOVA is given below. Conduct the Breusch-Pagan test to test whether the errors are related to
X. Do you reject the null hypothesis of constant variance? Yes or No
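ANOVA
             df         SS
Regression    1   0.007959
Residual     23   0.025824
Total        24   0.033783

$$TS:\ X^2_{BP} = \frac{SSR_{e^2}/2}{\left(SSE_y/n\right)^2} = \frac{0.007959/2}{(0.713406/25)^2} = \frac{0.0039795}{0.00081432} = 4.887$$
$$RR:\ X^2_{BP} \ge \chi^2_{.05,1} = 3.841$$
Since 4.887 ≥ 3.841, reject the null hypothesis of constant variance: Yes.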
Q.3. An experiment was conducted to measure air permeability of fabric (Y) as a function of the following factors: warp
density (X1), weft density (X2), and mass per unit area (X3). There were n=30 observations, and 4 models are fit:
Model 1: $E\{Y\} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_{12} X_1 X_2 + \beta_{13} X_1 X_3 + \beta_{23} X_2 X_3 + \beta_{11} X_1^2 + \beta_{22} X_2^2 + \beta_{33} X_3^2$,  SSE = 72.4
Model 2: $E\{Y\} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_{12} X_1 X_2 + \beta_{13} X_1 X_3 + \beta_{23} X_2 X_3$,  SSE = 86.5
Model 3: $E\{Y\} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3$,  SSE = 813.6
Model 4: $E\{Y\} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_{23} X_2 X_3$,  SSE = 122.7
p.3.a. Use the first two models to test $H_0: \beta_{11} = \beta_{22} = \beta_{33} = 0$.
Complete Model 1: SSE = 72.4, $df_E = 30 - 10 = 20$
Reduced Model 2: SSE = 86.5, $df_E = 30 - 7 = 23$
$$TS:\ F_{obs} = \frac{(86.5 - 72.4)/(23 - 20)}{72.4/20} = \frac{4.70}{3.62} = 1.298$$
$$RR:\ F_{obs} \ge F_{.05,3,20} = 3.098$$
p.3.b. Use the 3rd and 4th models to test whether the weft-mass interaction is significant, controlling for all main effects.
Complete Model 4: SSE = 122.7, $df_E = 30 - 5 = 25$
Reduced Model 3: SSE = 813.6, $df_E = 30 - 4 = 26$
$$TS:\ F_{obs} = \frac{(813.6 - 122.7)/(26 - 25)}{122.7/25} = \frac{690.9}{4.91} = 140.77$$
$$RR:\ F_{obs} \ge F_{.05,1,25} = 4.242$$
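Both comparisons in Q.3 are instances of the same complete-versus-reduced F test. The sketch below wraps that formula in one helper and applies it to the SSE and error degrees of freedom quoted above; `nested_f` is an illustrative name, and the critical values are the tabled ones from the solution.

```python
def nested_f(sse_reduced, df_reduced, sse_complete, df_complete):
    """F = [(SSE_R - SSE_C) / (df_R - df_C)] / [SSE_C / df_C]."""
    numerator = (sse_reduced - sse_complete) / (df_reduced - df_complete)
    denominator = sse_complete / df_complete
    return numerator / denominator

# p.3.a: Model 2 (reduced) vs Model 1 (complete), tests the quadratic terms.
f_3a = nested_f(86.5, 23, 72.4, 20)
# p.3.b: Model 3 (reduced) vs Model 4 (complete), tests the X2*X3 interaction.
f_3b = nested_f(813.6, 26, 122.7, 25)

reject_3a = f_3a >= 3.098  # tabled F(.05, 3, 20)
reject_3b = f_3b >= 4.242  # tabled F(.05, 1, 25)

print(f"3.a: F = {f_3a:.3f}, reject H0: {reject_3a}")
print(f"3.b: F = {f_3b:.2f}, reject H0: {reject_3b}")
```

Under these numbers the quadratic terms are not significant, while the weft-mass interaction is.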
Q.4. A regression model was fit, relating the big 3 television networks' prime-time market share (Y, %) to
household penetration of cable/satellite dish providers (X = MVPD) for the years 1980-2004 (n=25). The regression
results and residual versus time plot are given below.
ANOVA
             df       SS       MS       F
Regression    1   7073.7   7073.7   685.7
Residual     23    237.3     10.3
Total        24   7311.0

            Coefficients  Standard Error  t Stat  P-value
Intercept     112.029          2.090       53.61   0.0000
mvpd           -0.863          0.033      -26.19   0.0000

[Figure: Residuals versus time. Vertical axis: Residuals (-6 to 6); horizontal axis: time order (1 to 25).]
p.4.a. Compute the correlation between big 3 market share and MVPD.
$$R^2 = \frac{SSR}{TSS} = \frac{7073.7}{7311.0} = .9675$$
$$R = \text{sgn}(\hat{\beta}_1)\sqrt{R^2} = -\sqrt{.9675} = -.9836$$
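The correlation takes its magnitude from the square root of R-squared and its sign from the estimated slope. A small sketch of that step, using the Q.4 ANOVA values (the helper name `signed_correlation` is hypothetical):

```python
import math

def signed_correlation(ssr, tss, slope):
    """Pearson correlation in simple linear regression:
    sqrt(SSR/TSS) carrying the sign of the estimated slope."""
    r_squared = ssr / tss
    return math.copysign(math.sqrt(r_squared), slope)

# SSR and TSS from the Q.4 ANOVA table; the mvpd slope is -0.863.
r = signed_correlation(7073.7, 7311.0, -0.863)
print(f"r = {r:.4f}")
```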
p.4.b. The residual plot appears to display serial autocorrelation over time. Conduct the Durbin-Watson test, with null
hypothesis that errors are not autocorrelated.
$$\sum_{t=2}^{25} (e_t - e_{t-1})^2 = 161.4$$
$$DW = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2} = \frac{161.4}{237.3} = 0.6802$$
$$d_L(\alpha = 0.05, n = 25, p = 1) = 1.29 \qquad d_U(\alpha = 0.05, n = 25, p = 1) = 1.45$$
$$DW = 0.6802 < d_L(\alpha = 0.05, n = 25, p = 1) = 1.29 \Rightarrow \text{Reject } H_0$$
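The Durbin-Watson statistic is the ratio of the sum of squared successive differences of the residuals to the sum of squared residuals. The sketch below computes it from a residual series and applies the lower/upper bound decision rule for positive autocorrelation. The exam does not list the 25 residuals, so the `toy_residuals` series is purely hypothetical; the exam's own value is reproduced from the two sums it reports.

```python
def durbin_watson(residuals):
    """DW = sum_{t=2..n} (e_t - e_{t-1})^2 / sum_{t=1..n} e_t^2."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

def dw_decision(dw, d_lower, d_upper):
    """Decision rule for testing against positive autocorrelation."""
    if dw < d_lower:
        return "reject H0"
    if dw > d_upper:
        return "do not reject H0"
    return "inconclusive"

# Hypothetical residuals, only to show the function on raw data.
toy_residuals = [1.0, 2.0, 1.5, -0.5, -1.0]
dw_toy = durbin_watson(toy_residuals)

# Exam values: numerator sum 161.4, SSE 237.3, tabled bounds 1.29 and 1.45.
dw_exam = 161.4 / 237.3
decision = dw_decision(dw_exam, 1.29, 1.45)
print(f"toy DW = {dw_toy:.4f}; exam DW = {dw_exam:.4f} -> {decision}")
```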
p.4.c. Data were transformed to conduct estimated generalized least squares (EGLS), to account for the autocorrelation.
The parameter estimates and standard errors are given below. Obtain 95% confidence intervals for $\beta_1$, based on Ordinary
Least Squares (OLS) and EGLS. Note that the error degrees of freedom are 23 for OLS and 22 for EGLS (the
autocorrelation coefficient is also estimated).

            beta-egls   SE(b-egls)
Intercept    110.577       3.469
mvpd          -0.845       0.055

OLS: $t_{.025,23} = 2.069$: $-0.863 \pm 2.069(0.033) = -0.863 \pm 0.068 = (-0.931,\ -0.795)$
EGLS: $t_{.025,22} = 2.074$: $-0.845 \pm 2.074(0.055) = -0.845 \pm 0.114 = (-0.959,\ -0.731)$
Q.5. Regression analyses were fit, relating various chemical levels to age for stranded bottlenose dolphins in South
Carolina and Florida.
This plot gives the quadratic fit, relating mercury/selenium molar ratio (Y) to age (X) for the Florida dolphins. Complete
the following parts. Note: The data were NOT centered. The model fit was: $E(Y) = \beta_0 + \beta_1 X + \beta_2 X^2$
n = 14. Predicted value when age = 15: $0.1295 + 0.1479(15) - 0.0046(15^2) = 1.313$
Test $H_0: \beta_1 = \beta_2 = 0$
$$TS:\ F_{obs} = \frac{R^2/p}{(1 - R^2)/(n - p')} = \frac{.6151/2}{.3849/(14 - 3)} = 8.79$$
$$RR:\ F_{obs} \ge F_{.05,2,11} = 3.982$$
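The prediction and the overall F test in Q.5 can both be computed from the reported coefficients and R-squared alone. A minimal sketch, assuming p counts the predictors and p' = p + 1 counts the parameters including the intercept (function names are illustrative):

```python
def predict_quadratic(b0, b1, b2, x):
    """Fitted value from E(Y) = b0 + b1*X + b2*X^2 (uncentered X)."""
    return b0 + b1 * x + b2 * x ** 2

def overall_f_from_r2(r_squared, p, n):
    """F = (R^2 / p) / ((1 - R^2) / (n - p')), with p' = p + 1."""
    p_prime = p + 1
    return (r_squared / p) / ((1.0 - r_squared) / (n - p_prime))

y_hat_15 = predict_quadratic(0.1295, 0.1479, -0.0046, 15)
f_obs = overall_f_from_r2(0.6151, p=2, n=14)
reject = f_obs >= 3.982  # tabled F(.05, 2, 11)

print(f"predicted ratio at age 15: {y_hat_15:.3f}")
print(f"F_obs = {f_obs:.2f}, reject H0: {reject}")
```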
Q.6. A study was conducted to determine which factors were associated with percent release (Y) of hydroxypropyl
methylcellulose (HPMC) tablets. The factors were:
X1 = Carr’s compressibility index,
X2 = angle of repose,
X3 = solubility,
X4 = molecular weight,
X5 = compression force,
X6 = apparent viscosity of 4% (w/v) HPMC.
The sample size was n=18, and the authors reported the fit of the following models.
p.6.a. Complete the table in terms of AIC and SBC (BIC).
Model  Predictors          p'    SSE      AIC       SBC
1      X1,X2,X3,X4,X5,X6    7   42.62   29.5151   35.7477
2      X1,X2,X3,X4,X5       6   42.62   27.5151   32.8574
3      X1,X2,X3,X4          5   48.58   27.8711   32.3230
4      X1,X3,X4,X6          5   48.58   27.8711   32.3230
5      X1,X3,X4             4   52.86   27.3910   30.9524
6      X2,X3,X4             4   75.31   33.7623   37.3238
7      X3,X4,X6             4   48.85   25.9709   29.5324
Models 3 and 4:
$$AIC = 18\ln(48.58) + 2(5) - 18\ln(18) = 69.8978 + 10 - 52.0267 = 27.8711$$
$$SBC = 18\ln(48.58) + \ln(18)(5) - 18\ln(18) = 69.8978 + 14.4519 - 52.0267 = 32.3230$$
Model 7:
$$AIC = 18\ln(48.85) + 2(4) - 18\ln(18) = 69.9976 + 8 - 52.0267 = 25.9709$$
$$SBC = 18\ln(48.85) + \ln(18)(4) - 18\ln(18) = 69.9976 + 11.5615 - 52.0267 = 29.5324$$
p.6.b. Which model is “best” based on AIC: Model 7
BIC: Model 7
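The AIC and SBC columns follow from SSE, p', and n alone, so the whole table and the "best" model can be recomputed in a few lines. The sketch below assumes the SSE-based forms used in the worked solution, AIC = n ln(SSE) + 2p' − n ln(n) and SBC = n ln(SSE) + ln(n) p' − n ln(n); the names `aic`, `sbc`, and `models` are illustrative.

```python
import math

def aic(sse, p_prime, n):
    """AIC = n*ln(SSE) + 2*p' - n*ln(n)."""
    return n * math.log(sse) + 2 * p_prime - n * math.log(n)

def sbc(sse, p_prime, n):
    """SBC (BIC) = n*ln(SSE) + ln(n)*p' - n*ln(n)."""
    return n * math.log(sse) + math.log(n) * p_prime - n * math.log(n)

N = 18
# (predictors, p', SSE) for the seven models reported in Q.6.
models = [
    ("X1,X2,X3,X4,X5,X6", 7, 42.62),
    ("X1,X2,X3,X4,X5", 6, 42.62),
    ("X1,X2,X3,X4", 5, 48.58),
    ("X1,X3,X4,X6", 5, 48.58),
    ("X1,X3,X4", 4, 52.86),
    ("X2,X3,X4", 4, 75.31),
    ("X3,X4,X6", 4, 48.85),
]

table = [(name, aic(sse, p, N), sbc(sse, p, N)) for name, p, sse in models]
for name, a, s in table:
    print(f"{name:<20} AIC = {a:8.4f}  SBC = {s:8.4f}")

# Smaller is better for both criteria.
best_aic = min(table, key=lambda row: row[1])[0]
best_sbc = min(table, key=lambda row: row[2])[0]
print("best by AIC:", best_aic, "| best by SBC:", best_sbc)
```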
p.6.c. R2 for the complete model was 0.9278. Compute the total (corrected) sum of squares (TSS):
$$R^2 = 1 - \frac{SSE}{TSS} \;\Rightarrow\; \frac{SSE}{TSS} = 1 - R^2 \;\Rightarrow\; TSS = \frac{SSE}{1 - R^2} = \frac{42.62}{1 - .9278} = \frac{42.62}{.0722} = 590.305$$