STATISTICS 401D                                                Spring 2016
Laboratory Assignment 7

1. A paper in the Journal of the Association of Asphalt Paving Technologists (Vol. 59, 1990) describes an experiment to determine the effect of air voids on percentage retained strength of asphalt. For purposes of the experiment, air voids are controlled at three levels: low (2-4%), medium (4-6%), and high (6-8%). The data are shown in the following table. The variability of the strengths may be just as important as the mean strengths of the asphalt. A testing agency wanted to verify that there are no differences among the three air void levels with respect to the variability in the strength of asphalt produced under each condition. Use the file air voids.jmp downloaded from the web page.

   Air Voids    Retained Strength (%)
   Low          106   90  103   90   79   88   92   95
   Medium        80   69   94   91   70   83   87   83
   High          78   80   62   69   76   85   69   85

(a) Examine whether the assumptions about the population distributions required to use Hartley’s test for the above purpose are satisfied.

(b) Use the statistics output from executing a JMP analysis to conduct Hartley’s test by hand using α = .05. State the null and alternative hypotheses, the test statistic, and the rejection region clearly.

(c) Extract from the JMP output the results of performing Levene’s test for testing the above hypotheses. State the value of the test statistic, the p-value, and your decision based on α = .05.

2. The data shown in the following table are highway gasoline mileage performance (MPG) and engine displacement (Disp.) for a sample of 22 midsize cars from the 2005 model year. Simple linear regression was used to model mileage (y) using displacement as the explanatory variable (x).

   Make          Model        Disp. (liters)   MPG (highway)
   ACURA         TL           3.2              29
   BMW           525I         2.5              28
   BUICK         CENTURY      3.1              30
   CHEVROLET     MALIBU       3.5              32
   CHRYSLER      SEBRING      2.4              30
   DODGE         STRATUS      2.7              28
   HONDA         ACCORD       2.4              34
   HYUNDAI       ELANTRA      2                32
   HYUNDAI       SONATA       2.4              30
   INFINITI      Q45          4.5              23
   KIA           OPTIMA       2.4              30
   KIA           SPECTRA      2                34
   LEXUS         ES330        3.3              29
   MERCURY       SABLE        3                27
   MAZDA         MAZDA6       2.3              28
   MERCEDES      E320         3.2              28
   MITSUBISHI    GALANT       2.4              30
   NISSAN        ALTIMA       2.5              29
   PONTIAC       GRAND PRIX   3.8              28
   TOYOTA        CAMRY        2.4              34
   VOLKSWAGEN    PASSAT       2.8              27
   VOLVO         S80 FWD      2.5              30

Perform the following computations needed to fit the model y = β0 + β1 x + ε by hand calculation. You must show work for all your numerical answers. Finally, execute a JMP program to obtain all quantities needed and confirm that all numbers you computed by hand can be obtained from the JMP output by highlighting and labelling them on the JMP output. Use the midsize.jmp JMP data file and the text file midsize.txt provided. Write answers for this problem on separate sheets and attach.

a) Construct a plot (using software known to you or using JMP Graph Builder) which shows the scatter of (x, y) data points with y on the vertical axis.

b) Compute the following using Excel:
   Σ xi = ____ ,  Σ xi² = ____ ,  Σ yi = ____ ,  Σ yi² = ____ ,  Σ xi yi = ____

c) Use the method of least squares to obtain estimates β̂0 and β̂1 of the parameters in the model using the results of part (b).

d) Give the least squares prediction equation (which is the equation of the line that yields the minimum sum of squared residuals).

e) Obtain the predicted highway MPG (round to one decimal) and the corresponding residual for one car, the Mitsubishi Galant.

f) According to this model, what is the expected increase in the mean MPG associated with a 0.2 liter decrease in displacement?

g) Compute a table of predicted values ŷ and residuals y − ŷ corresponding to the observed values y. (You may use Excel for this and the next two parts.)

h) Extend the table in part (g) to include columns y − ȳ and ŷ − ȳ.

i) Compute the sums of squares Σ(y − ȳ)², Σ(ŷ − ȳ)², and Σ(y − ŷ)² using the table in part (h). Explain how these give a decomposition of total variability in sample values into two parts and identify the parts.

j) What proportion of the total variability in mileage is explained by using only engine displacement in a linear regression model? What does this say about the role of engine displacement in predicting mileage? Give a reason for the moderate value for this statistic in this problem and what the experimenter may do to obtain an improved model for prediction.

k) Give the point estimate s² of σ².

l) Compute the estimated standard errors of β̂0 and β̂1.

m) Construct 95% confidence intervals for β0 and β1.

n) Test the hypothesis H0: β1 = 0 vs. Ha: β1 ≠ 0 using a t-statistic. Give your conclusion using α = .05.

o) Construct an analysis of variance table using quantities computed in parts (b), (c), and Syy. Include a column for computing the F-statistic to test the hypothesis in part (n). Use the F-tables to determine the rejection region and state your decision.

p) Construct a 95% confidence interval for the mean highway MPG of cars with an engine displacement of 2.6 liters.

q) Construct a 95% prediction interval for the highway MPG of a car with an engine displacement of 2.6 liters.

r) (JMP analysis only) Save columns of predicted values, residuals, and confidence and prediction intervals into the JMP data table. Journal this table and save as a Word file. Obtain a printed copy to turn in along with the printed copy of the output from the JMP analysis. Make sure the JMP analysis contains the following plots and analyses (in addition to the standard output):
   i) A plot of the data with the fitted regression line and the confidence and prediction interval curves superimposed.
   ii) A plot of the residuals against the corresponding engine displacement (x) values.
   iii) A plot of the residuals against the corresponding predicted (ŷ) values.

s) Study the two plots obtained in (ii) and (iii) of part (r) above.
Use these to comment on the adequacy of the model you fitted to these data.

t) A consumer group says that a 1.0 liter increase in engine displacement in automobiles will decrease highway mileage by more than 5 mpg. Do the data support this conjecture?

u) A manufacturer of a vehicle with an engine displacement of 2.5 liters claims that the mean highway mileage of this vehicle will exceed 30 MPG. Do the data support this claim?

Note: To answer parts (t) and (u), a hypothesis you specify must be tested either by using an appropriate test statistic or by using an appropriate confidence interval.

Due Tuesday, March 29, 2016 (turn in during the first 20 minutes of the lab)
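
The hand calculations in Problem 2, parts (b), (c), and (i), could be cross-checked with a short script. The following is only a minimal sketch in Python, which is not part of the assignment (the assignment itself calls for Excel and JMP). The names disp, mpg, and fit_simple_linear are hypothetical, and the data values are transcribed from the table in Problem 2, so they should be verified against midsize.txt before relying on any result.

```python
# Data transcribed from the Problem 2 table (assumption: check against midsize.txt).
disp = [3.2, 2.5, 3.1, 3.5, 2.4, 2.7, 2.4, 2.0, 2.4, 4.5, 2.4,
        2.0, 3.3, 3.0, 2.3, 3.2, 2.4, 2.5, 3.8, 2.4, 2.8, 2.5]
mpg = [29, 28, 30, 32, 30, 28, 34, 32, 30, 23, 30,
       34, 29, 27, 28, 28, 30, 29, 28, 34, 27, 30]


def fit_simple_linear(x, y):
    """Least squares fit of y = b0 + b1*x from the raw sums of part (b)."""
    n = len(x)
    # Raw sums requested in part (b).
    sum_x = sum(x)
    sum_y = sum(y)
    sum_x2 = sum(xi * xi for xi in x)
    sum_y2 = sum(yi * yi for yi in y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))

    # Corrected sums of squares and cross products.
    sxx = sum_x2 - sum_x ** 2 / n
    syy = sum_y2 - sum_y ** 2 / n
    sxy = sum_xy - sum_x * sum_y / n

    # Least squares estimates for part (c).
    b1 = sxy / sxx
    y_bar = sum_y / n
    b0 = y_bar - b1 * sum_x / n

    # Fitted values, residuals, and the sums of squares for part (i).
    fitted = [b0 + b1 * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    sse = sum(r * r for r in residuals)            # sum of (y - y_hat)^2
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)  # sum of (y_hat - y_bar)^2

    return {
        "n": n, "sum_x": sum_x, "sum_y": sum_y, "sum_x2": sum_x2,
        "sum_y2": sum_y2, "sum_xy": sum_xy,
        "sxx": sxx, "syy": syy, "sxy": sxy,
        "b0": b0, "b1": b1,
        "fitted": fitted, "residuals": residuals,
        "sst": syy, "ssr": ssr, "sse": sse,
        "r2": ssr / syy,        # proportion of variability explained, part (j)
        "s2": sse / (n - 2),    # point estimate of sigma^2, part (k)
    }


fit = fit_simple_linear(disp, mpg)
for key in ("sum_x", "sum_x2", "sum_y", "sum_y2", "sum_xy", "b0", "b1",
            "sst", "ssr", "sse", "r2", "s2"):
    print(f"{key:>7s} = {fit[key]:.4f}")
```

The sketch works from the raw sums rather than a library routine so that each printed quantity lines up with a step of the hand calculation. The totals sst, ssr, and sse should satisfy sst = ssr + sse up to rounding, which is the decomposition asked about in part (i).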