Stat 401B Lab 3

1 Overview
In this lab you will be introduced to using JMP for simple linear regression analysis. For this lab
you need to be sitting in front of a Windows PC that has JMP installed.
2 Warm-up Exercise
Simple Linear Regression (SLR) is available in two ways in JMP: Fit Y by X and Fit Model. Fit Y by
X is under Basic Stats on the JMP Starter. Fit Model is under Modeling on the JMP Starter. Both
are available under the Analyze pull-down menu. Multiple Regression and some advanced SLR
features are available only under Fit Model. This handout illustrates obtaining regression output
from Fit Y by X. JMP’s Fit Model analysis platform will be covered later.
Manatees are large, gentle sea mammals that live in the warm waters of Florida. Because these
creatures tend to float just below the surface of the water, they are subject to injury and sometimes
death from propellers on motor boats. Environmentalists claimed that as the number of motor boats
registered in Florida increased, more and more manatees were killed. For this warm-up exercise,
you will use the number of motor boats registered, in 1000’s (X), and the number of manatees
killed (Y) for the 14 years from 1977 through 1990. The fifteenth year, 1991, is held out to see how
well the simple linear regression line predicts the number of manatees killed in that year.
Year    Boats (1000’s), X    Manatees Killed, Y
1977    447                  13
1978    460                  21
1979    481                  24
1980    498                  16
1981    513                  24
1982    512                  20
1983    526                  15
1984    559                  34
1985    585                  33
1986    614                  33
1987    645                  39
1988    675                  43
1989    711                  50
1990    719                  47
1991    716                  .
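As an optional cross-check outside JMP, the least-squares line for the 14 years in the table can be sketched in plain Python. This is only an illustration, not part of the lab procedure: the names boats, killed, and fit_line are made up for this sketch, and the arithmetic is the usual closed-form least-squares solution rather than anything taken from JMP.

```python
# Boats registered (in 1000's) and manatees killed, 1977-1990, from the table.
# The 1991 row is deliberately left out of the fit, as in the lab.
boats = [447, 460, 481, 498, 513, 512, 526, 559, 585, 614, 645, 675, 711, 719]
killed = [13, 21, 24, 16, 24, 20, 15, 34, 33, 33, 39, 43, 50, 47]


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y-hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1


b0, b1 = fit_line(boats, killed)
print(f"Predicted Killed = {b0:.4f} + {b1:.6f} * Boats")

# Prediction for the held-out year: 716 thousand boats were registered in 1991.
pred_1991 = b0 + b1 * 716
print(f"Predicted Killed for 1991 = {pred_1991:.2f}")
```

Under these assumptions the line comes out near Predicted Killed = -41.4 + 0.125 Boats, which gives a 1991 prediction of about 48 manatees. The JMP output should agree up to rounding.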
Fit Y by X
1. Create a data worksheet with three columns. To do this, click on the red triangle next to
Columns and select Add Multiple Columns. Name the columns Year, Boats and Killed, and
enter the data from the table above. Do not enter a value for Killed in 1991 (arrow down
past this row and JMP will automatically put a period in the cell, indicating a missing value).
2. Go to Fit Y by X under Basic Statistics or select Analyze → Fit Y by X.
3. Select Killed as the Response (Y) and Boats as the Factor (X), then click OK. This produces
a plot of the data.
4. Using the analysis pop-up menu (indicated by the little red triangle to the left of Bivariate
Fit of Killed by Boats), you can choose various fits and options, such as Fit Mean, Fit Line,
and Fit Polynomial. Choosing a fit produces a table of output below the scatterplot
corresponding to the fit. The fitted line/curve is also drawn on the plot, and a new pop-up
menu of options specific to the fit is created just below the scatterplot. Choosing options from
this new pop-up menu either creates more output or creates new columns in the worksheet
(e.g., you can save the residuals from a particular fit as a new column in the worksheet for
further analysis). You can also fit a model, exclude a row, and re-fit the same model to see
the effect of the excluded point on the fit: JMP graphs both fitted equations on
the same scatterplot and provides both sets of output in the same analysis window. JMP also
provides many formatting and statistical options when you right-click (on Windows) on
the items displayed (both graphical items and numerical/text items).
5. Select Fit Line from the analysis pop-up menu (red triangle icon). Note the fitted line now
plotted in the scatter plot; explore the output displayed below the scatter plot. Also notice
the new pop-up menu below the scatter plot (labeled Linear Fit); click here and notice your
analysis options.
6. Identify the following items from the Linear Fit output:
• equation of the fitted line (Note: JMP does not indicate that this is really a predicted
number of manatees killed. You will need to add this when you write the prediction
equation.)
• R² = RSquare
• sY|X = Root Mean Square Error
• n
• p-value for the test of β1 = 0 vs. β1 ≠ 0
7. Right-click on the Parameter Estimates portion of the output. From the resulting contextual
pop-up menu, select Lower 95% from the Columns menu item. Repeat this selecting Upper
95%. This adds two additional columns to this section of the output. These columns provide
95% confidence intervals for the intercept and the slope of the model: Y = β0 + β1 X + ε.
8. From the Linear Fit pop-up menu (just below the scatter plot), choose Confid Curves Fit to
plot 95% confidence bands for µY|X = β0 + β1 X. Now, select Confid Curves Indiv to plot 95%
prediction bands for individual values of Y.
9. Save Predicteds and Save Residuals (in the Linear Fit pop-up menu) both create new
columns in the JMP data table: one of Ŷ-values and one of ε̂-values, respectively. Note that
even though 1991 was not used in the analysis, JMP will predict a value for this year. Plot
Residuals adds an additional plot to the bottom of the output: a plot of Residuals (ε̂) vs.
Boats (X). Choose all three of these menu commands and note the new columns and new
plot. (One could now use JMP’s Analyze → Distribution platform to analyze the residuals
from this model, for example checking for Normality via JMP’s Normal Quantile Plot.)
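
The quantities named in steps 6 through 9 can also be reproduced by hand as a check on the JMP output. The sketch below uses only the Python standard library and the textbook simple linear regression formulas. It is not how JMP computes its results, and every name in it (r_square, rmse, t_ratio, interval_at, and so on) is hypothetical. It also assumes the critical value 2.179, the 0.975 quantile of the t distribution with n - 2 = 12 degrees of freedom as read from a standard t table, so it applies only to this 14-row data set.

```python
import math

# Data from the table in this handout, 1977-1990 (1991 is left out of the fit).
boats = [447, 460, 481, 498, 513, 512, 526, 559, 585, 614, 645, 675, 711, 719]
killed = [13, 21, 24, 16, 24, 20, 15, 34, 33, 33, 39, 43, 50, 47]
n = len(boats)

x_bar = sum(boats) / n
y_bar = sum(killed) / n
sxx = sum((x - x_bar) ** 2 for x in boats)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(boats, killed))
syy = sum((y - y_bar) ** 2 for y in killed)

# Step 6: fitted line, RSquare, Root Mean Square Error, and the slope t ratio.
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
predicted = [b0 + b1 * x for x in boats]                 # step 9: Y-hat values
residuals = [y - p for y, p in zip(killed, predicted)]   # step 9: residuals
sse = sum(e ** 2 for e in residuals)
r_square = 1 - sse / syy
rmse = math.sqrt(sse / (n - 2))
se_b1 = rmse / math.sqrt(sxx)
se_b0 = rmse * math.sqrt(1 / n + x_bar ** 2 / sxx)
t_ratio = b1 / se_b1   # compared against a t distribution with n - 2 df

# Assumption: 0.975 quantile of t with 12 df, copied from a t table.
T_CRIT = 2.179

# Step 7: 95% confidence intervals for the intercept and the slope.
ci_b0 = (b0 - T_CRIT * se_b0, b0 + T_CRIT * se_b0)
ci_b1 = (b1 - T_CRIT * se_b1, b1 + T_CRIT * se_b1)


def interval_at(x0):
    """Step 8: fitted value at x0 with half-widths of the 95% interval for
    the mean response and of the 95% prediction interval for one new Y."""
    y_hat = b0 + b1 * x0
    spread = 1 / n + (x0 - x_bar) ** 2 / sxx
    half_mean = T_CRIT * rmse * math.sqrt(spread)
    half_indiv = T_CRIT * rmse * math.sqrt(1 + spread)
    return y_hat, half_mean, half_indiv


fit_1991, half_mean, half_indiv = interval_at(716)
print(f"RSquare = {r_square:.4f}, RMSE = {rmse:.4f}, n = {n}")
print(f"t ratio for slope = {t_ratio:.3f}")
print(f"95% CI for slope: ({ci_b1[0]:.4f}, {ci_b1[1]:.4f})")
print(f"1991: fit {fit_1991:.2f}, mean +/- {half_mean:.2f}, indiv +/- {half_indiv:.2f}")
```

With these inputs, RSquare should come out near 0.886 and Root Mean Square Error near 4.28. The interval for an individual Y is always wider than the interval for the mean, which is why the Confid Curves Indiv bands sit outside the Confid Curves Fit bands in the JMP plot.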