Stat 401B Lab 3

Overview
In this lab you will be introduced to using JMP for simple linear regression analysis. For
this lab you need to be working at a Windows PC that has JMP installed.
Warm-up Exercise
Simple Linear Regression (SLR) is available in two ways in JMP: Fit Y by X and Fit
Model. Fit Y by X is under Basic Stats on the JMP Starter. Fit Model is under Modeling
on the JMP Starter. Both are available under the Analyze pull-down menu. Multiple
Regression and some advanced SLR features are available only under Fit Model. This
handout illustrates obtaining regression output from Fit Y by X. JMP’s Fit Model
analysis platform will be covered later.
In class we looked at the relationship between the number of manatees killed (Y) and the
number of motorboats registered, in 1000’s (X), for the 14 years from 1977 through 1990.
The fifteenth year, 1991, will be reserved to see if the simple linear regression line can
predict the number of manatees killed in that year.
Year    Motorboats (1000's), X    Manatees Killed, Y
1977    447                       13
1978    460                       21
1979    481                       24
1980    498                       16
1981    513                       24
1982    512                       20
1983    526                       15
1984    559                       34
1985    585                       33
1986    614                       33
1987    645                       39
1988    675                       43
1989    711                       50
1990    719                       47
1991    716                       .
Fit Y by X
1. Create a data worksheet with three columns. To do this, click on the red triangle
next to Columns and choose Add Multiple Columns. Name the columns Year, Boats and
Killed, then enter the data from the table above. Do not enter a value for Killed in 1991
(arrow down past this row and JMP will automatically put a period in the cell indicating
a missing value).
2. Go to Fit Y by X under Basic Stats on the JMP Starter, or select Analyze + Fit Y by X.
3. Select Killed as the Response (Y) and Boats as the Factor (X), then click OK.
This produces a plot of the data.
4. Using the analysis pop-up menu (indicated by the little red triangle to the left of
Bivariate Fit of Killed by Boats), you can choose various fits and options, including
Fit Mean, Fit Line, and Fit Polynomial. Choosing a fit produces a table of output
below the plot corresponding to the fit. The fitted line or curve is also drawn on the
plot, and a new pop-up menu of options specific to the fit is created just below the
plot. Choosing options from this new pop-up menu either creates more output or creates
new columns in the worksheet (e.g. you can save the residuals from a particular fit as
a new column in the worksheet for further analysis). You can also fit a model, exclude
a row, and re-fit the same model to see the effect of the excluded point on the fit:
JMP will graph both fitted equations on the same scatter plot and provide both sets of
output in the same analysis window. JMP also provides many options (formatting and
statistical) when you right-click on the items displayed (both graphical items and
numerical/text items).
5. Select Fit Line from the analysis pop-up menu (red triangle icon). Note the fitted
line is now on the plot of the data. Explore the output displayed below the scatter
plot. Also notice the new pop-up menu below the plot (labeled Linear Fit); click
here and notice your analysis options.
6. Identify the following items from the Linear Fit output:
• The equation of the fitted line. (Note: JMP does not indicate that this is
really a predicted number of manatees killed. You will need to add this
when you write the prediction equation.)
• R² = RSquare
• s_{Y|X} = Root Mean Square Error
• n, the number of observations used in the fit
• P-value for the test of β₁ = 0 vs. β₁ ≠ 0
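To check the items above by hand, the following is a minimal Python sketch of the usual least-squares formulas applied to the 1977 to 1990 data. It is not how JMP computes its output; the variable names and the helper functions t_pdf and two_sided_p are illustrative, and the P-value is approximated by numerically integrating the t density with n − 2 degrees of freedom.

```python
import math

# Data from the table above (1977-1990; 1991 is held out of the fit).
boats = [447, 460, 481, 498, 513, 512, 526, 559, 585, 614, 645, 675, 711, 719]
killed = [13, 21, 24, 16, 24, 20, 15, 34, 33, 33, 39, 43, 50, 47]

n = len(boats)
x_bar = sum(boats) / n
y_bar = sum(killed) / n

# Corrected sums of squares and cross-products.
sxx = sum((x - x_bar) ** 2 for x in boats)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(boats, killed))
syy = sum((y - y_bar) ** 2 for y in killed)

# Least-squares slope and intercept.
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Error sum of squares, RSquare, and Root Mean Square Error.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(boats, killed))
r_square = 1 - sse / syy
rmse = math.sqrt(sse / (n - 2))

# t ratio for testing slope = 0.
se_b1 = rmse / math.sqrt(sxx)
t_ratio = b1 / se_b1


def t_pdf(u, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + u * u / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided P-value via Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # area between -|t| and |t|
    return max(0.0, 1 - central_area)


p_value = two_sided_p(t_ratio, n - 2)

print(f"Predicted Killed = {b0:.2f} + {b1:.4f} * Boats")
print(f"RSquare = {r_square:.3f}, RMSE = {rmse:.3f}, n = {n}")
print(f"t ratio = {t_ratio:.2f}, P-value = {p_value:.2g}")
```

The printed values should agree with the Linear Fit output up to rounding: a slope near 0.125, an intercept near −41.4, RSquare near 0.886, and a Root Mean Square Error near 4.28.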
7. Right-click on the Parameter Estimates portion of the output. From the resulting
contextual pop-up menu, select Lower 95% from the Columns menu item. Repeat
this, selecting Upper 95%. This adds two additional columns to this section of the
output. These columns provide 95% confidence intervals for the intercept and the
slope of the model: Y = β₀ + β₁X + ε.
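The Lower 95% and Upper 95% columns can be checked with the textbook interval estimate ± t × (standard error). The sketch below is a hand calculation under that assumption, not JMP's code. The names are illustrative, and T_CRIT = 2.179 is the tabled t value for 95% confidence with n − 2 = 12 degrees of freedom.

```python
import math

# Data from the table above (1977-1990; 1991 is held out of the fit).
boats = [447, 460, 481, 498, 513, 512, 526, 559, 585, 614, 645, 675, 711, 719]
killed = [13, 21, 24, 16, 24, 20, 15, 34, 33, 33, 39, 43, 50, 47]

# Tabled t critical value: 95% confidence, 12 degrees of freedom.
T_CRIT = 2.179

n = len(boats)
x_bar = sum(boats) / n
y_bar = sum(killed) / n
sxx = sum((x - x_bar) ** 2 for x in boats)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(boats, killed))

# Least-squares estimates.
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Root Mean Square Error, the estimate of the error standard deviation.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(boats, killed))
rmse = math.sqrt(sse / (n - 2))

# Standard errors of the slope and the intercept.
se_b1 = rmse / math.sqrt(sxx)
se_b0 = rmse * math.sqrt(1 / n + x_bar ** 2 / sxx)

# 95% confidence intervals: estimate -/+ t * standard error.
slope_ci = (b1 - T_CRIT * se_b1, b1 + T_CRIT * se_b1)
intercept_ci = (b0 - T_CRIT * se_b0, b0 + T_CRIT * se_b0)

print(f"Intercept: {b0:.2f}, 95% CI ({intercept_ci[0]:.2f}, {intercept_ci[1]:.2f})")
print(f"Slope:     {b1:.4f}, 95% CI ({slope_ci[0]:.4f}, {slope_ci[1]:.4f})")
```

Because the interval for the slope does not contain 0, it tells the same story as the small P-value for the test of β₁ = 0.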
8. From the Linear Fit pop-up menu (just below the scatter plot), choose Confid
Curves Fit to plot 95% confidence bands for μ_{Y|X} = β₀ + β₁X. Now, select
Confid Curves Indiv to plot 95% prediction bands for individual values of Y.
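The difference between the two sets of bands can be seen in the standard formulas: the band for the mean uses sqrt(1/n + (x − x̄)²/Sxx), while the band for an individual Y adds 1 under the square root. The sketch below evaluates both at a chosen number of boats. It assumes these textbook pointwise formulas rather than reproducing JMP's plotting code, and bands and T_CRIT are illustrative names.

```python
import math

# Data from the table above (1977-1990; 1991 is held out of the fit).
boats = [447, 460, 481, 498, 513, 512, 526, 559, 585, 614, 645, 675, 711, 719]
killed = [13, 21, 24, 16, 24, 20, 15, 34, 33, 33, 39, 43, 50, 47]

# Tabled t critical value: 95% confidence, 12 degrees of freedom.
T_CRIT = 2.179

n = len(boats)
x_bar = sum(boats) / n
y_bar = sum(killed) / n
sxx = sum((x - x_bar) ** 2 for x in boats)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(boats, killed))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(boats, killed))
rmse = math.sqrt(sse / (n - 2))


def bands(x0):
    """Return (fit, confidence interval for the mean, prediction interval) at x0."""
    fit = b0 + b1 * x0
    h = 1 / n + (x0 - x_bar) ** 2 / sxx
    half_mean = T_CRIT * rmse * math.sqrt(h)       # band for the mean of Y at x0
    half_indiv = T_CRIT * rmse * math.sqrt(1 + h)  # band for an individual Y at x0
    return (fit,
            (fit - half_mean, fit + half_mean),
            (fit - half_indiv, fit + half_indiv))


for x0 in (450, x_bar, 716):
    fit, ci, pi = bands(x0)
    print(f"Boats = {x0:6.1f}: fit {fit:5.1f}, "
          f"mean ({ci[0]:5.1f}, {ci[1]:5.1f}), "
          f"individual ({pi[0]:5.1f}, {pi[1]:5.1f})")
```

Both bands are narrowest at the mean number of boats and widen toward the ends of the data, which is the curved shape seen on the JMP plot. The prediction band is always the wider of the two.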
9. Save Predicteds and Save Residuals both create new columns in the JMP data
table. Note that even though 1991 was not used in the analysis, JMP will predict a
value for this year. Plot Residuals adds an additional plot to the bottom of the
output: a plot of Residuals (Y − Ŷ) vs. Boats (X). Choose all three of these menu
commands and note the new columns and new plot. (One could now use JMP’s
Analyze + Distribution platform to analyze the residuals from this model,
checking for Normality via JMP's Normal Quantile Plot for example.)
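The saved columns can be mimicked outside JMP as follows. This is a minimal sketch, assuming the missing 1991 response is represented by None: rows with a missing response are left out of the fit, every row gets a predicted value, and only rows with an observed response get a residual. The list names are illustrative.

```python
# All 15 rows of the worksheet; Killed is missing (None) for 1991.
years = list(range(1977, 1992))
boats = [447, 460, 481, 498, 513, 512, 526, 559, 585, 614, 645, 675, 711, 719, 716]
killed = [13, 21, 24, 16, 24, 20, 15, 34, 33, 33, 39, 43, 50, 47, None]

# Fit the line using only rows with an observed response.
used = [(x, y) for x, y in zip(boats, killed) if y is not None]
n = len(used)
x_bar = sum(x for x, _ in used) / n
y_bar = sum(y for _, y in used) / n
sxx = sum((x - x_bar) ** 2 for x, _ in used)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in used)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Predicted values exist for every row, including 1991.
predicted = [b0 + b1 * x for x in boats]

# Residuals (Y minus predicted Y) exist only where Y was observed.
residuals = [None if y is None else y - p for y, p in zip(killed, predicted)]

for yr, x, y, p, r in zip(years, boats, killed, predicted, residuals):
    y_txt = "." if y is None else f"{y}"
    r_txt = "." if r is None else f"{r:6.2f}"
    print(f"{yr}  {x:4d}  {y_txt:>3}  {p:6.2f}  {r_txt}")
```

The 1991 row should show a predicted value of about 48 manatees killed and a period in the residual column, and the residuals for the 14 fitted years sum to zero up to rounding.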