Activity 5: Regression

The Turbine Oil Oxidation Test (TOST) and the Rotating Bomb Oxidation Test (RBOT) are different procedures for evaluating the oxidation stability of steam turbine oils. We will assume that the regression function of Y = RBOT time (hr) on X = TOST time (min) has the simple linear form, and will use the data in www.stat.psu.edu/~mga/401/labs/05/lab5/regr.data.txt to estimate the model parameters.

1. Plot the Data:
Graph > Scatterplot > Simple > Input RBOT for Y and TOST for X > OK

The plot suggests that the assumption of a simple linear regression model is reasonable.

2. Fit the Simple Linear Regression Model:
Stat > Regression > Regression > Input RBOT for Response and TOST for Predictor > OK

The output that Minitab gives, together with my explanatory notes, follows:

Regression Analysis: RBOT: versus TOST:

The regression equation is
RBOT: = - 13.9 + 0.0902 TOST:

[NOTE: The regression equation is the estimated regression function, under the assumption of a simple linear regression model. Thus, the estimate of the intercept is -13.9, while the estimate of the slope is 0.0902.]

Predictor      Coef  SE Coef      T      P
Constant     -13.85    44.78  -0.31  0.763
TOST:       0.09024  0.01188   7.59  0.000

[NOTE: This table gives the estimates of the coefficients. The column headed SE Coef gives the estimated standard errors of the estimators. The column headed T gives the ratios Coef/(SE Coef), which are the statistics for testing the null hypothesis that the corresponding coefficient is zero. The last column gives the p-values of these tests. Thus, the intercept is not significantly different from zero, but there is strong evidence that the slope is not zero.]

S = 25.1546   R-Sq = 85.2%   R-Sq(adj) = 83.7%

[NOTE: S is the estimate of the intrinsic scatter, that is, of the standard deviation of the error term. R-Sq is the proportion of the total variability explained by the model. A large R-Sq suggests a good fit of the model to the data.]
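The T column can be checked by hand from the other two columns. The short Python sketch below is not part of the Minitab session; it only recomputes Coef/(SE Coef) from the rounded values printed in the coefficient table. The dictionary and function names are illustrative choices. Because the inputs are rounded, the ratios agree with Minitab's T values only up to rounding in the last digit.

```python
# Coefficient estimates and estimated standard errors, copied from the
# rounded values in the Minitab coefficient table.
coef = {"Constant": -13.85, "TOST": 0.09024}
se_coef = {"Constant": 44.78, "TOST": 0.01188}


def t_ratio(estimate, std_error):
    """Statistic for testing H0: coefficient = 0, i.e. Coef / (SE Coef)."""
    return estimate / std_error


# One T ratio per row of the coefficient table.
t_values = {name: t_ratio(coef[name], se_coef[name]) for name in coef}

for name, t in t_values.items():
    print(f"{name}: T = {t:.3f}")
```

The ratios should be close to the -0.31 and 7.59 reported by Minitab, which works from unrounded estimates.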
Analysis of Variance

Source           DF     SS     MS      F      P
Regression        1  36489  36489  57.67  0.000
Residual Error   10   6328    633
Total            11  42817

[NOTE: The Analysis of Variance table decomposes the total variability, or total sum of squares (which is the numerator of the sample variance of the response variable), into the variability explained by the regression model and the variability due to the intrinsic scatter (or error variability). The regression sum of squares (i.e. the variability explained by the model) divided by the total sum of squares gives the R-Sq we saw above.]

Unusual Observations

Obs  TOST:   RBOT:     Fit  SE Fit  Residual  St Resid
  3   3750  375.00  324.56    7.27     50.44     2.09R

R denotes an observation with a large standardized residual.

[NOTE: Minitab, and other software packages, list observations for which the response deviates considerably from the regression line. With our data, the response of the third observation does so.]

FURTHER NOTES:

a. The estimates of the regression coefficients, together with their estimated standard errors, which are given in the first table, can be used to construct confidence intervals. Thus,
(0.09024 - (2.228)(0.01188), 0.09024 + (2.228)(0.01188)) = (0.0638, 0.1167)
is a 95% CI for the slope, β1, where 2.228 is the 97.5th percentile of the t-distribution with 10 degrees of freedom and 0.01188 is the estimated standard error of the slope estimator.

b. The test statistics T, which are given in the first table, also allow us to test one-sided hypotheses regarding the parameters. For example, to test H0: β1 = 0 vs. Ha: β1 > 0 at the 0.05 level of significance, the rejection rule is T > 1.812, where 1.812 is the 95th percentile of the t-distribution with 10 degrees of freedom. Since T = 7.59, the null hypothesis is rejected.

3. CI and PI:
Here we will see the commands needed to obtain a CI for the mean response at a given value of X, as well as a PI for a future Y observation at a given value of X.

Stat > Regression > Regression > Input RBOT for Response and TOST for Predictor > Click Options, then input 4000 into Prediction intervals for new observations, and 95 into Confidence level.

The additional part of the Minitab output is:

Predicted Values for New Observations

New Obs     Fit  SE Fit            95% CI            95% PI
      1  347.12    8.00  (329.30, 364.94)  (288.31, 405.94)

Values of Predictors for New Observations

New Obs  TOST:
      1   4000

[NOTE: The PI is wider than the CI, as expected from a previous discussion.]
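The intervals in notes a and b and in step 3 can be reproduced from the printed output alone. The Python sketch below is not part of the Minitab session. It uses the t percentiles quoted in the notes (2.228 and 1.812, both for 10 degrees of freedom) and the rounded values of Coef, SE Coef, Fit, SE Fit and S. The PI uses the standard simple linear regression formula for the prediction standard error, sqrt(S^2 + (SE Fit)^2), which the output does not display; the handout itself only states that the PI is wider. All variable and function names are illustrative.

```python
import math

# t percentiles with 10 degrees of freedom, as quoted in notes a and b.
T_975_DF10 = 2.228
T_95_DF10 = 1.812


def interval(center, half_width):
    """Symmetric interval (center - half_width, center + half_width)."""
    return (center - half_width, center + half_width)


# Note a: 95% CI for the slope, from Coef and SE Coef of the TOST row.
slope, se_slope = 0.09024, 0.01188
slope_ci = interval(slope, T_975_DF10 * se_slope)

# Note b: one-sided test of H0: slope = 0 vs. Ha: slope > 0 at level 0.05.
reject_h0 = 7.59 > T_95_DF10

# Step 3: CI for the mean response and PI for a new response at TOST = 4000.
fit, se_fit, s = 347.12, 8.00, 25.1546
mean_ci = interval(fit, T_975_DF10 * se_fit)

# Assumption: standard formula for the prediction standard error,
# combining the intrinsic scatter S with the uncertainty in the fit.
pred_se = math.sqrt(s**2 + se_fit**2)
pred_pi = interval(fit, T_975_DF10 * pred_se)

print("slope CI:", slope_ci)
print("reject H0:", reject_h0)
print("mean CI:", mean_ci)
print("PI:", pred_pi)
```

Since the prediction standard error adds S^2 under the square root, the PI is necessarily wider than the CI at the same value of X. Small differences from the Minitab intervals in the last printed digit come from using rounded inputs.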