STA 6167 – Exam 1 – Spring 2014 – PRINT Name _______________________

For all significance tests, use the $\alpha = 0.05$ significance level.

Q.1. A simple linear regression was fit relating the number of species of arctic flora observed (Y) to July mean temperature (X, in degrees Celsius). The results of the regression model, based on n = 19 temperature stations, are given below.

ANOVA
              df       SS        MS        F      Significance F
Regression     1     39858     39858     79.87        0.0000
Residual      17      8484       499
Total         18     48342

              Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept        -34.49          16.56         -2.08    0.0527     -69.43       0.46
JulyTemp          24.60           2.75          8.94    0.0000      18.79      30.41

SS_XX = 65.85     X-bar = 5.7

p.1.a. What proportion of the variation in number of species is “explained” by mean July temperature?

\[ R^2 = \frac{SSR}{TSS} = \frac{39858}{48342} = .8245 \]

p.1.b. Compute a 95% Confidence Interval for the population mean number of species, with mean July temperature of 6 degrees.

\[ \hat{Y}_6 = -34.49 + 24.60(6) = 113.11 \]
\[ SE\{\hat{Y}_6\} = \sqrt{MSE\left(\frac{1}{n} + \frac{(6-\bar{X})^2}{S_{XX}}\right)} = \sqrt{499\left(\frac{1}{19} + \frac{(6-5.7)^2}{65.85}\right)} = \sqrt{499(.0540)} = 5.19 \]
\[ t_{.025,17} = 2.110 \]
\[ \text{95\% CI: } 113.11 \pm 2.110(5.19) \equiv 113.11 \pm 10.95 \equiv (102.16,\ 124.06) \]

p.1.c. Compute a 95% Prediction Interval for the number of species, at a single station with mean July temperature of 6 degrees.

\[ \hat{Y}_{6,new} = -34.49 + 24.60(6) = 113.11 \]
\[ SE\{\hat{Y}_{6,new}\} = \sqrt{MSE\left(1 + \frac{1}{n} + \frac{(6-\bar{X})^2}{S_{XX}}\right)} = \sqrt{499(1.0540)} = 22.93 \]
\[ t_{.025,17} = 2.110 \]
\[ \text{95\% PI: } 113.11 \pm 2.110(22.93) \equiv 113.11 \pm 48.39 \equiv (64.72,\ 161.50) \]

Q.2. An experiment was conducted, relating the penetration depth of missiles (Y) to their impact factor (X). The results from the regression, and the residual versus fitted plot, are given below (n = 25).

ANOVA
              df        SS
Regression     1     1.585884
Residual      23     0.713406
Total         24     2.29929

              Coefficients   Standard Error
Intercept       0.633253        0.103076
impact          0.06            0.008391

[Figure: Residuals vs Fitted Values (vertical axis: Residuals; horizontal axis: fitted values)]

p.2.a. Test H0: $\beta_1 = 0$ (Penetration depth is not associated with impact factor) based on the t-test.

\[ TS: t_{obs} = \frac{\hat{\beta}_1}{SE\{\hat{\beta}_1\}} = \frac{0.06}{0.008391} = 7.15 \qquad RR: |t_{obs}| \geq t_{.025,23} = 2.069 \]

p.2.b.
Test H0: $\beta_1 = 0$ (Penetration depth is not associated with impact factor) based on the F-test.

\[ TS: F_{obs} = \frac{MSR}{MSE} = \frac{SSR/1}{SSE/23} = \frac{1.585884/1}{0.713406/23} = 51.13 \qquad RR: F_{obs} \geq F_{.05,1,23} = 4.279 \]

p.2.c. The residual plot appears to display non-constant error variance. A regression of the squared residuals on the impact factors (X) is fit, and the ANOVA is given below. Conduct the Breusch-Pagan test to test whether the error variance is related to X. Do you reject the null hypothesis of constant variance? Yes or No

ANOVA
              df        SS
Regression     1     0.007959
Residual      23     0.025824
Total         24     0.033783

Here $SSR_{e^2}$ is the regression sum of squares from the fit of the squared residuals on X, and $SSE_{Y}$ is the error sum of squares from the original regression of Y on X.

\[ TS: X^2_{BP} = \frac{SSR_{e^2}/2}{\left(SSE_{Y}/n\right)^2} = \frac{0.007959/2}{(0.713406/25)^2} = \frac{0.0039795}{0.00081432} = 4.887 \qquad RR: X^2_{BP} \geq \chi^2_{.05,1} = 3.841 \]

Since 4.887 ≥ 3.841, reject the null hypothesis of constant variance (Yes).

Q.3. An experiment was conducted to measure air permeability of fabric (Y) as a function of the following factors: warp density (X1), weft density (X2), and mass per unit area (X3). There were n = 30 observations, and 4 models were fit:

Model 1: $E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_{12} X_1 X_2 + \beta_{13} X_1 X_3 + \beta_{23} X_2 X_3 + \beta_{11} X_1^2 + \beta_{22} X_2^2 + \beta_{33} X_3^2$     SSE = 72.4
Model 2: $E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_{12} X_1 X_2 + \beta_{13} X_1 X_3 + \beta_{23} X_2 X_3$     SSE = 86.5
Model 3: $E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3$     SSE = 813.6
Model 4: $E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_{23} X_2 X_3$     SSE = 122.7

p.3.a. Use the first two models to test H0: $\beta_{11} = \beta_{22} = \beta_{33} = 0$.

Complete Model 1: SSE = 72.4, $df_E = 30 - 10 = 20$
Reduced Model 2: SSE = 86.5, $df_E = 30 - 7 = 23$

\[ TS: F_{obs} = \frac{(86.5 - 72.4)/(23 - 20)}{72.4/20} = \frac{4.70}{3.62} = 1.298 \qquad RR: F_{obs} \geq F_{.05,3,20} = 3.098 \]

p.3.b. Use the 3rd and 4th models to test whether the weft-mass interaction is significant, controlling for all main effects (H0: $\beta_{23} = 0$).

Complete Model 4: SSE = 122.7, $df_E = 30 - 5 = 25$
Reduced Model 3: SSE = 813.6, $df_E = 30 - 4 = 26$

\[ TS: F_{obs} = \frac{(813.6 - 122.7)/(26 - 25)}{122.7/25} = \frac{690.9}{4.91} = 140.77 \qquad RR: F_{obs} \geq F_{.05,1,25} = 4.242 \]

Q.4. A regression model was fit, relating the big 3 television networks' prime-time market share (Y, %) to household penetration of cable/satellite dish providers (X = MVPD) for the years 1980-2004 (n = 25). The regression results and residual versus time plot are given below.
ANOVA
              df       SS        MS        F
Regression     1     7073.7    7073.7    685.7
Residual      23      237.3      10.3
Total         24     7311.0

              Coefficients   Standard Error   t Stat   P-value
Intercept       112.029          2.090         53.61    0.0000
mvpd             -0.863          0.033        -26.19    0.0000

[Figure: Residuals versus time order (vertical axis: Residuals; horizontal axis: observations 1 to 25)]

p.4.a. Compute the correlation between big 3 market share and MVPD.

\[ R^2 = \frac{SSR}{TSS} = \frac{7073.7}{7311.0} = .9675 \qquad R = \mathrm{sgn}(\hat{\beta}_1)\sqrt{R^2} = -.9836 \]

p.4.b. The residual plot appears to display serial autocorrelation over time. Conduct the Durbin-Watson test, with null hypothesis that errors are not autocorrelated.

\[ \sum_{t=2}^{25} (e_t - e_{t-1})^2 = 161.4 \qquad d_L(0.05, n=25, p=1) = 1.29 \qquad d_U(0.05, n=25, p=1) = 1.45 \]
\[ DW = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2} = \frac{161.4}{237.3} = 0.6802 < d_L(0.05, n=25, p=1) = 1.29 \;\Rightarrow\; \text{Reject } H_0 \]

p.4.c. Data were transformed to conduct estimated generalized least squares (EGLS), to account for the autocorrelation. The parameter estimates and standard errors are given below. Obtain 95% confidence intervals for $\beta_1$, based on Ordinary Least Squares (OLS) and EGLS. Note that the error degrees of freedom are 23 for OLS and 22 for EGLS (one degree of freedom is used to estimate the autocorrelation coefficient).

              beta-egls   SE(b-egls)
Intercept      110.577       3.469
mvpd            -0.845       0.055

\[ \text{OLS: } t_{.025,23} = 2.069: \quad -0.863 \pm 2.069(0.033) \equiv -0.863 \pm 0.068 \equiv (-0.931,\ -0.795) \]
\[ \text{EGLS: } t_{.025,22} = 2.074: \quad -0.845 \pm 2.074(0.055) \equiv -0.845 \pm 0.114 \equiv (-0.959,\ -0.731) \]

Q.5. Regression analyses were fit, relating various chemical levels to age for stranded bottlenose dolphins in South Carolina and Florida. This plot gives the quadratic fit, relating mercury/selenium molar ratio (Y) to age (X) for the Florida dolphins. Complete the following parts. Note: The data were NOT centered.

The model fit was: $E(Y) = \beta_0 + \beta_1 X + \beta_2 X^2$     n = 14

Predicted value when age = 15: $0.1295 + 0.1479(15) - 0.0046(15^2) = 1.313$

Test H0: $\beta_1 = \beta_2 = 0$

\[ TS: F_{obs} = \frac{R^2/p}{(1 - R^2)/(n - p')} = \frac{.6151/2}{.3849/(14 - 3)} = 8.79 \qquad RR: F_{obs} \geq F_{.05,2,11} = 3.982 \]

Q.6. A study was conducted to determine which factors were associated with percent release (Y) of hydroxypropyl methylcellulose (HPMC) tablets.
The factors were: X1 = Carr’s compressibility index, X2 = angle of repose, X3 = solubility, X4 = molecular weight, X5 = compression force, X6 = apparent viscosity of 4% (w/v) HPMC. The sample size was n = 18, and the authors reported the fit of the following models.

p.6.a. Complete the table in terms of AIC and SBC (BIC).

Model   Predictors            p'     SSE       AIC        SBC
1       X1,X2,X3,X4,X5,X6     7     42.62    29.5151    35.7477
2       X1,X2,X3,X4,X5        6     42.62    27.5151    32.8574
3       X1,X2,X3,X4           5     48.58    27.8711    32.3230
4       X1,X3,X4,X6           5     48.58    27.8711    32.3230
5       X1,X3,X4              4     52.86    27.3910    30.9524
6       X2,X3,X4              4     75.31    33.7623    37.3238
7       X3,X4,X6              4     48.85    25.9709    29.5324

Models 3 and 4:
\[ AIC = 18\ln(48.58) + 2(5) - 18\ln(18) = 69.8978 + 10 - 52.0267 = 27.8711 \]
\[ SBC = 18\ln(48.58) + \ln(18)(5) - 18\ln(18) = 69.8978 + 14.4519 - 52.0267 = 32.3230 \]

Model 7:
\[ AIC = 18\ln(48.85) + 2(4) - 18\ln(18) = 69.9976 + 8 - 52.0267 = 25.9709 \]
\[ SBC = 18\ln(48.85) + \ln(18)(4) - 18\ln(18) = 69.9976 + 11.5615 - 52.0267 = 29.5324 \]

p.6.b. Which model is “best” based on AIC: Model 7     BIC: Model 7

p.6.c. R2 for the complete model was 0.9278. Compute the total (corrected) sum of squares (TSS):

\[ R^2 = 1 - \frac{SSE}{TSS} \;\Rightarrow\; \frac{SSE}{TSS} = 1 - R^2 \;\Rightarrow\; TSS = \frac{SSE}{1 - R^2} = \frac{42.62}{1 - .9278} = \frac{42.62}{.0722} = 590.305 \]
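
The Q.6 arithmetic can be checked with a short script. The following is a minimal Python sketch, assuming the SSE-based forms used in the worked solution, AIC = n ln(SSE) + 2p' − n ln(n) and SBC = n ln(SSE) + ln(n)p' − n ln(n). The names `models`, `aic`, `sbc`, and `results` are illustrative choices, not part of the exam.

```python
import math

n = 18  # sample size stated in Q.6

# (predictors, p', SSE) for models 1 to 7, copied from the table in p.6.a
models = [
    ("X1,X2,X3,X4,X5,X6", 7, 42.62),
    ("X1,X2,X3,X4,X5", 6, 42.62),
    ("X1,X2,X3,X4", 5, 48.58),
    ("X1,X3,X4,X6", 5, 48.58),
    ("X1,X3,X4", 4, 52.86),
    ("X2,X3,X4", 4, 75.31),
    ("X3,X4,X6", 4, 48.85),
]


def aic(sse, p, n):
    """AIC in the SSE-based form assumed above: n*ln(SSE) + 2p' - n*ln(n)."""
    return n * math.log(sse) + 2 * p - n * math.log(n)


def sbc(sse, p, n):
    """SBC (BIC) in the SSE-based form assumed above: n*ln(SSE) + ln(n)*p' - n*ln(n)."""
    return n * math.log(sse) + math.log(n) * p - n * math.log(n)


# One row per model: (predictors, p', SSE, AIC, SBC)
results = [(name, p, sse, aic(sse, p, n), sbc(sse, p, n)) for name, p, sse in models]

for name, p, sse, a, s in results:
    print(f"{name:<18} p'={p}  SSE={sse:6.2f}  AIC={a:8.4f}  SBC={s:8.4f}")

# Smaller is better for both criteria
best_aic = min(results, key=lambda r: r[3])[0]
best_sbc = min(results, key=lambda r: r[4])[0]
print("Best by AIC:", best_aic)
print("Best by SBC:", best_sbc)

# p.6.c: TSS recovered from R^2 = 1 - SSE/TSS for the complete model
tss = 42.62 / (1 - 0.9278)
print(f"TSS = {tss:.3f}")
```

Both criteria use the same n ln(SSE) − n ln(n) term, so they differ only in the penalty per parameter: 2 for AIC versus ln(18) ≈ 2.89 for SBC. That is why SBC is larger than AIC for every model in the table.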