Stat 401B Lab 2

Overview

In this lab you will look at residuals using JMP. A residual is the difference between an observed value and a predicted value. The predicted value, and thus the residual, is determined by the type of problem you have (the statistical model).

Computer Exercises

The objective for these exercises is to show you how to get residuals using JMP.

1. We will go back to the fuel economy data. The JMP file is available from the course web page. Recall that this file contains a sample of 36 vehicles and their combined city and highway mpg. For this one-sample problem the model and definition of a residual are:

   Model: $y_i = \mu + \varepsilon_i$
   Residual: $y_i - \bar{y}$

   a. In order to create residuals for the one-sample problem you need to subtract the value of the sample mean from each Average MPG value. To do this:

      • Go to the data table and add a new column. To do this click on the red triangle next to Columns. Select New Column. Name this column Residual.
      • Highlight the Residual column in the data table and go to the Cols pull down menu. Select Formula. A formula window will open. Enter a formula using the following mouse clicks: [Average MPG], the minus key, Statistical + Col Mean, [Average MPG]. The finished formula should read Average MPG - Col Mean(Average MPG).

   b. Use Analyze + Distribution to analyze the residuals. You should include a histogram, box plot, and Normal Quantile Plot. You may also wish to use Fit Distribution + Normal to superimpose a normal curve on the histogram.

2. We will now go back to the body mass index data for men and women. The JMP file is available from the course web page. Recall that this file contains the Body Mass Index (BMI) for 50 men and 50 women. For this two-sample problem the model and definition of a residual are:

   Model: $y_{1i} = \mu_1 + \varepsilon_{1i}$, $y_{2i} = \mu_2 + \varepsilon_{2i}$
   Residual: $y_{1i} - \bar{y}_1$, $y_{2i} - \bar{y}_2$

   a. JMP can automatically calculate residuals for you from the Fit Y by X platform. Go to Analyze + Fit Y by X and enter BMI as the Y, Response and Gender as the X, Factor. Click on OK.
      Go to the red triangle pull down in the output window and select Save + Residual. This will create a new column in your data table labeled BMI centered by Gender. These are the residuals for the two-sample problem.

   b. Use Analyze + Distribution to analyze the residuals, BMI centered by Gender. You should have a histogram, box plot, and Normal Quantile Plot. You can use Fit Distribution + Normal to superimpose a normal curve on the histogram. Note that even though you have two samples of BMI values (Male and Female), you analyze a combined (single) set of residuals.

   c. For the two-sample model there is also the condition of equal standard deviations for the random errors. You can check this condition both graphically and numerically.

      • Graphically: Use Fit Y by X and cast the residuals (BMI centered by Gender) as the Y, Response and Gender as the X, Factor. You should see side-by-side dot plots of the residuals. JMP should automatically add a horizontal line at zero. Compare the spread of the residuals for each Gender.
      • Numerically: Go to the red triangle pull down in the Fit Y by X output and select Means and Std Dev. This will provide the standard deviations of the residuals for Males and Females separately. Compare these two sample standard deviations. The condition of equal population standard deviations is considered satisfied provided the larger sample standard deviation is less than 3 times the smaller one. Alternatively, use Analyze + Distribution and cast BMI as the Y, Columns and Gender as the By variable. This will calculate the sample standard deviation of BMI for each Gender. Because subtracting each group's mean does not change the spread, these are the same as the standard deviations of the residuals, so you can compare them in the same way.
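If you would like to check the arithmetic behind the JMP steps by hand, the calculations can be sketched in a few lines of Python. This is only an illustration, not part of the JMP workflow: the data values and the function names (one_sample_residuals, two_sample_residuals, sd_ratio) are made up for this sketch and are not taken from the course files. It assumes the rule of thumb stated above, that the larger sample standard deviation should be less than 3 times the smaller.

```python
from statistics import mean, stdev

# Hypothetical values for illustration only; these are NOT the course data files.
mpg = [21.0, 24.5, 19.0, 30.5, 27.0]
bmi = {
    "Male": [24.1, 27.3, 22.8, 29.0],
    "Female": [21.5, 26.2, 23.9, 30.4],
}


def one_sample_residuals(values):
    """Residuals y_i - ybar for a single sample."""
    ybar = mean(values)
    return [y - ybar for y in values]


def two_sample_residuals(groups):
    """Residuals for two (or more) samples: each value minus its own group mean."""
    residuals = {}
    for name, values in groups.items():
        group_mean = mean(values)
        residuals[name] = [y - group_mean for y in values]
    return residuals


def sd_ratio(groups):
    """Larger group sample standard deviation divided by the smaller one."""
    sds = [stdev(values) for values in groups.values()]
    return max(sds) / min(sds)


mpg_resid = one_sample_residuals(mpg)
bmi_resid = two_sample_residuals(bmi)

# The two-sample residuals are analyzed as one combined set.
combined = bmi_resid["Male"] + bmi_resid["Female"]

ratio = sd_ratio(bmi_resid)
print("One-sample residuals:", [round(r, 2) for r in mpg_resid])
print("Combined two-sample residuals:", [round(r, 2) for r in combined])
print("SD ratio (larger/smaller):", round(ratio, 2), "| less than 3:", ratio < 3)
```

Within each sample the residuals sum to zero (up to rounding), and centering by the group mean leaves each group's standard deviation unchanged, which is why comparing the standard deviations of BMI by Gender gives the same answer as comparing the standard deviations of the residuals.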