STA 6167 – Exam 1 – Spring 2014 – PRINT Name _______________________ For all significance tests, use = 0.05 significance level. Q.1. A simple linear regression was fit relating number of species of arctic flora observed (Y) and July mean temperature (X, in Celsius). The results of the regression model, based on n=19 temperature stations is given below. ANOVA df Regression Residual Total Intercept JulyTemp 1 17 18 SS 39858 8484 48342 MS 39858 499 F Significance F 79.87 0.0000 Coefficients Standard Error t Stat P-value Lower 95% Upper 95% -34.49 16.56 -2.08 0.0527 -69.43 0.46 24.60 2.75 8.94 0.0000 18.79 30.41 SS_XX X-bar 65.85 5.7 p.1.a. What proportion of the variation in number of species is “explained” by mean July temperature? p.1.b. Compute a 95% Confidence Interval for the population mean number of species, with mean July temperature of 6 degrees. p.1.c. Compute a 95% Prediction Interval for the number of species, at a single station with mean July temperature of 6 degrees. Q.2. An experiment was conducted, relating the penetration depth of missiles (Y) to its impact factor (X). The results from the regression, and the residual versus fitted plot are given below (n=25). Residuals vs Fitted Values ANOVA df Regression Residual Total SS 1 1.585884 23 0.713406 24 2.29929 0.4 0.3 0.2 0.1 0 -0.1 0 Residuals 0.5 1 1.5 2 Coefficients Standard Error -0.2 Intercept 0.633253 0.103076 -0.3 impact 0.06 0.008391 -0.4 p.2.a. Test H0: 1 = 0 (Penetration depth is not associated with impact factor) based on the t-test. Test Statistic: _____________________________________ Rejection Region: _______________________ p.2.b. Test H0: 1 = 0 (Penetration depth is not associated with impact factor) based on the F-test. Test Statistic: _____________________________________ Rejection Region: _______________________ p.2.c. The residual plot appears to display non-constant error variance. A regression of the squared residuals on the impact factors (X) is fit, and the ANOVA is given below. Conduct the Breusch-Pagan test to test whether the errors are related to X. Do you reject the null hypothesis of constant variance? Yes or No ANOVA df Regression Residual Total SS 1 0.007959 23 0.025824 24 0.033783 Test Statistic: _____________________________________________ Rejection Region: _____________________ Q.3. An experiment was conducted to measure air permeability of fabric (Y) as a function of the following factors: warp density (X1), weft density (X2), and Mass per unit area (X3). There were n=30 observations, and 4 models are fit: E Y 0 1 X 1 2 X 2 3 X 3 12 X 1 X 2 13 X 1 X 3 23 X 2 X 3 11 X 12 22 X 22 33 X 32 E Y 0 1 X 1 2 X 2 3 X 3 12 X 1 X 2 13 X 1 X 3 23 X 2 X 3 E Y 0 1 X 1 2 X 2 3 X 3 SSE 72.4 SSE 86.5 SSE 813.6 E Y 0 1 X 1 2 X 2 3 X 3 23 X 2 X 3 SSE 122.7 p.3.a. Use the first two models to test H0: . Test Statistic: ______________________________________________ Rejection Region: ___________________ p.3.b. Use the 3rd and 4th models to test whether the weft-mass interaction is significant, controlling for all main effects. Test Statistic: ______________________________________________ Rejection Region: ___________________ Q.4. A regression model was fit, relating the share of big 3 television network prime-time market share (Y, %) to household penetration of cable/satellite dish providers (X = MVPD) for the years 1980-2004 (n=25). The regression results and residual versus time plot are given below. Residuals ANOVA df Regression Residual Total 1 23 24 SS 7073.7 237.3 7311.0 MS 7073.7 10.3 F 685.7 6 4 2 0 Residuals 1 Coefficients Standard Error t Stat P-value Intercept 112.029 2.090 53.61 0.0000 mvpd -0.863 0.033 -26.19 0.0000 3 5 7 9 11 13 15 17 19 21 23 25 -2 -4 -6 p.4.a. Compute the correlation between big 3 market share and MVPD. p.4.b. The residual plot appears to display serial autocorrelation over time. Conduct the Durbin-Watson test, with null hypothesis that residuals are not autocorrelated. 25 e e t 2 t t 1 2 161.4 d L 0.05, n 25, p 1 1.29 dU 0.05, n 25, p 1 1.45 Test Statistic: ____________________________________ Reject H0? Yes or No p.4.c. Data were transformed to conduct estimated generalized least squares (EGLS), to account for the auto-correlation. The parameter estimates and standard errors are given below. Obtain 95% confidence intervals for 1, based on Ordinary Least Squares (OLS) and EGLS. Note that the error degrees’ of freedom are 23 for OLS and 22 for EGLS (estimated the autocorrelation coefficient). beta-egls SE(b-egls) 110.577 3.469 -0.845 0.055 OLS 95% CI: ____________________________________ EGLS 95% CI: _________________________________ Q.5. Regression analyses were fit, relating various chemical levels to age for stranded bottlenose dolphins in South Carolina and Florida. This plot gives the quadratic fit, relating mercury/selenium molar ratio (Y) to age (X) for the Florida dolphins. Complete the following parts. Note: The data were NOT centered. The model fit was: E(Y) = 0 + 1X + 2X2 n = ____________ Predicted value when age = 15 ______________________________________________ Test H0: 1 = 2 = 0 Test Statistic: ________________________________ Rejection Region: __________________________ Q.6. A study was conducted to determine which factors were associated with percent release (Y) of hydroxypropyl methylcellulose (HPMC) tablets. The factors were: X1 = Carr’s compressibility index, X2 = angle of repose, X3 = solubility, X4 =molecular weight, X5 = compression force X6 = apparent viscosity of 4% (w/v) HPMC. The sample size was n=18, and the authors reported the fit of the following models. p.6.a. Complete the table in terms of AIC and SBC (BIC). Predictors X1,X2,X3,X4,X5,X6 X1,X2,X3,X4,X5 X1,X2,X3,X4 X1,X3,X4,X6 X1,X3,X4 X2,X3,X4 X3,X4,X6 p' SSE 7 6 5 5 4 4 4 AIC SBC 42.62 29.5151 138.7028 42.62 27.5151 138.7028 48.58 48.58 52.86 27.3910 172.1760 75.31 33.7623 243.4362 48.85 p.6.b. Which model is “best” based on AIC: ___________________________ BIC: ___________________________ p.6.c. R2 for the complete model was 0.9278. Compute the total (corrected) sum of squares (TSS):