Stat 401B Lab 3 Fall 2005

1 Overview

In this lab you will be introduced to using JMP for simple linear regression analysis. For this lab you need to be sitting in front of a Windows PC that has JMP.

2 Warm-up Exercise

Simple Linear Regression (SLR) is available in two ways in JMP: Fit Y by X and Fit Model. Fit Y by X is under Basic Stats on the JMP Starter. Fit Model is under Modeling on the JMP Starter. Both are also available under the Analyze pull-down menu. Multiple Regression and some advanced SLR features are available only under Fit Model. This handout illustrates obtaining regression output from Fit Y by X. JMP's Fit Model analysis platform will be covered later.

Many activities (sports, jobs, etc.) depend on a person being able to accurately determine (guess) quantities visually. One individual was asked to visually determine the distances (to the nearest foot) of thirteen landmarks. For this warm-up exercise, you will use the guessed distances (X) and the actual distances (Y) to 12 of those landmarks. The thirteenth landmark is reserved to see whether the simple linear regression line can predict better than the individual can guess.

Object       1    2    3    4    5    6    7    8    9   10   11   12   13
Guessed, X  45  120  100   90  140   20  120   12  150  180  250  200  160
Actual, Y   53  100   95   85  121   27  105   19  136  165  232  182    .

Fit Y by X

1. Create a data worksheet with three columns. To do this, click on the red triangle next to Columns and choose Add Multiple Columns. Name the columns Object, Guessed and Actual. Do not enter an Actual value for Object 13 (arrow down past this row and JMP will automatically put a period in the cell, indicating a missing value).

2. Go to Fit Y by X under Basic Stats on the JMP Starter, or select Analyze → Fit Y by X.

3. Select Actual as the Response (Y) and Guessed as the Factor (X), then click OK. This produces a scatter plot of the data.

4. Using the analysis pop-up menu (indicated by the little red triangle to the left of Bivariate Fit of Actual by Guessed), you can choose various fits and options, including Fit Mean, Fit Line, and Fit Polynomial. Choosing a fit produces a table of output below the scatter plot corresponding to the fit. The fitted line or curve is also drawn on the plot, and a new pop-up menu of options specific to the fit is created just below the scatter plot. Choosing options from this new pop-up menu either creates more output or creates new columns in the worksheet (e.g., you can save the residuals from a particular fit as a new column in the worksheet for further analysis). You can also fit a model, exclude a row, and re-fit the same model to see the effect of the excluded point on the fit: JMP graphs both fitted equations on the same scatter plot and provides both sets of output in the same analysis window. JMP also provides many options (formatting and statistical) when you right-click (on Windows) on the items displayed, both graphical items and numerical/text items.

5. Select Fit Line from the analysis pop-up menu (red triangle icon). Note the fitted line now drawn on the scatter plot, and explore the output displayed below the scatter plot. Also notice the new pop-up menu below the scatter plot (labeled Linear Fit); click it and look over your analysis options.

6. Identify the following items in the Linear Fit output:
   • equation of the fitted line
   • $R^2$ = RSquare
   • $s_{Y|X}$ = Root Mean Square Error
   • $n$
   • p-value for the test of $\beta_1 = 0$ vs. $\beta_1 \neq 0$

7. Right-click on the Parameter Estimates portion of the output. From the resulting contextual pop-up menu, select Lower 95% from the Columns menu item. Repeat this, selecting Upper 95%. This adds two columns to this section of the output. These columns provide 95% confidence intervals for the intercept and the slope of the model $Y = \beta_0 + \beta_1 X + \varepsilon$.

8. From the Linear Fit pop-up menu (just below the scatter plot), choose Confid Curves Fit to plot 95% confidence bands for $\mu_{Y|X} = \beta_0 + \beta_1 X$. Now select Confid Curves Indiv to plot 95% prediction bands for individual values of Y.

9. Save Predicteds and Save Residuals each create a new column in the JMP data table: one of $\hat{Y}$-values and one of $\hat{\varepsilon}$-values, respectively. Note that even though object 13 was not used in the analysis, JMP will predict a value for this object. Plot Residuals adds a plot to the bottom of the output: Residuals ($\hat{\varepsilon}$) vs. Guessed Distances (X). Choose all three of these menu commands and note the new columns and the new plot. (You could now use JMP's Analyze → Distribution platform to analyze the residuals from this model, for example checking for Normality with JMP's Normal Quantile Plot.)
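
As a cross-check outside JMP, the quantities identified in step 6 and the slope interval from step 7 can be reproduced from the standard least-squares formulas. The following is a minimal Python sketch and is not part of the original lab: the function name fit_line, the variable names, and the hard-coded t critical value (2.228, the tabled 0.975 quantile for 10 degrees of freedom) are assumptions made for illustration. It uses the 12 complete rows of the table and then predicts the actual distance for object 13 from its guessed distance of 160.

```python
import math

# Data from the handout's table (objects 1-12; object 13 has no Actual value).
guessed = [45, 120, 100, 90, 140, 20, 120, 12, 150, 180, 250, 200]
actual = [53, 100, 95, 85, 121, 27, 105, 19, 136, 165, 232, 182]


def fit_line(x, y):
    """Least-squares fit of y = b0 + b1*x (hypothetical helper, not a JMP function)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = sxy / sxx                # slope estimate
    b0 = y_bar - b1 * x_bar       # intercept estimate
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    rmse = math.sqrt(sse / (n - 2))   # root mean square error, s_{Y|X}

    return {
        "n": n,
        "intercept": b0,
        "slope": b1,
        "rsquare": 1 - sse / syy,
        "rmse": rmse,
        "se_slope": rmse / math.sqrt(sxx),  # standard error of the slope
    }


fit = fit_line(guessed, actual)

# Assumption: tabled t quantile t(0.975, df = 10), valid here because n - 2 = 10.
T_CRIT_DF10 = 2.228
half_width = T_CRIT_DF10 * fit["se_slope"]
slope_ci = (fit["slope"] - half_width, fit["slope"] + half_width)

# Point prediction for object 13 (guessed distance 160).
predicted_13 = fit["intercept"] + fit["slope"] * 160

print(f"Actual = {fit['intercept']:.3f} + {fit['slope']:.4f} * Guessed")
print(f"RSquare = {fit['rsquare']:.4f}, RMSE = {fit['rmse']:.3f}, n = {fit['n']}")
print(f"95% CI for slope: ({slope_ci[0]:.4f}, {slope_ci[1]:.4f})")
print(f"Predicted actual distance for object 13: {predicted_13:.1f}")
```

The printed values should agree, up to rounding, with the Linear Fit report and with the Lower 95% and Upper 95% columns for the slope; the last line corresponds to the value Save Predicteds places in row 13.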