STATISTICS 101 - Homework 4

Due Friday, February 22, 2002
• Homework is due by 5:00 PM on the due date in my office. You can always hand in your
homework at the end of lecture on Friday.
• You may talk with others about the homework problems but please write your solutions
up independently. Please answer homework questions in complete sentences. Make sure to
staple the pages of your assignment together. Be sure to indicate your lab section on
your paper.
• You will have an opportunity to get help on homework during lab.
Problem: In lab 2, we looked at the relationship between the length of a person’s forearm and
the length of a person’s foot. In lab, we visually fit a regression line to this data. Now, using JMP,
we will fit a least squares regression line to the data. The data for this problem was obtained by
taking a probability sample of measurements from the ten Stat 101 lab sections. There are 50 data
points, 25 from men and 25 from women. To analyze this data, complete the following steps.
1. Go to the website www.public.iastate.edu/~wrstephe/stat101.html and go to the link
Lab Data for Homework 4. Click on the right mouse button and select Save Link As
or Save Target As. Name the file hmwk4.txt and save it to either the computer’s hard
drive, or a diskette.
2. Start the computer program JMP. Select File → Open from the JMP menu. Enter the
name of the file (hmwk4.txt), and change the Files of type: settings to Text Import
Preview. Then click on Open and then Delimited. In the box that appears, put a check
mark in the box near Space in the End of Field Box. Put a check mark in the box near
Table contains column headers. Click on Apply Settings. At this point, JMP gives
you a preview of the column names and the first two rows of your data. If everything looks
good, press OK.
3. First we want to look at the distribution of the foot length separately. Select Analyze
→ Distribution from the JMP menu. Select the column foot and click the button Y,
Columns. Then click on OK.
4. You should now have a histogram, boxplot, and statistics for the foot length. We want to
make a few changes to the information JMP has calculated. First, click on the red triangle
next to foot and select Stem and Leaf. This should add a stem-and-leaf plot to your
window. Now, click on the red triangle next to foot and select Histogram Options →
Count Axis. This should add a count axis to the histogram in the window. Finally, click
on the red triangle next to foot and select Display Options → Horizontal Layout.
5. From the JMP menu, select File → Print to print your output. Turn this paper in with
your assignment. You will use this output to answer question 14.
6. Now, we want to look at the relationship between the two variables, arm and foot. Specifically,
we would like to predict the length of a person’s foot from the length of their forearm.
From the JMP menu, select Analyze → Fit Y by X. Select the column foot and click on
the button Y, Response. Select the column arm and click on the button X, Factor. Then
click on OK.
7. You should have a scatterplot of the two variables with the variable arm on the horizontal
axis and the variable foot on the vertical axis. To add the regression line to the scatterplot,
click on the red triangle next to Bivariate Fit of foot By arm and select Fit Line. This
should add a regression line to the scatterplot and statistics for the regression line to the
window.
8. To get a complete picture of the regression line, we must study the residual plot. Click on
the red triangle next to Linear Fit and select Plot Residuals. A residual plot should be
added to the bottom of the window.
9. From the JMP menu, select File → Print to print your output. Turn this paper in with
your assignment. You will need this output to answer question 14.
10. In this analysis, gender could be a lurking variable. In order to account for this variable, we
should calculate two different regression lines, one to predict a woman’s foot length using her
arm length, and one to predict a man’s foot length using his arm length. From the JMP
menu, select Analyze → Fit Y by X. Select the column foot and click on the button Y,
Response. Select the column arm and click on the button X, Factor. Finally, select the
column gender and click on the button By. Then click OK.
11. You should now have two scatterplots. The scatterplot on the top is for the men, and the
scatterplot on the bottom is for the women. To add a regression line to the men’s scatterplot,
click on the red triangle next to Bivariate Fit of foot By arm and below the letter “M”,
and select Fit Line. To add a regression line to the women’s scatterplot, click on the red
triangle next to Bivariate Fit of foot By arm and below the letter “W”, and select Fit
Line.
12. To get a complete picture of each regression line, we must study its residual plot. For each
regression line, click on the red triangle next to Linear Fit and select Plot Residuals. A
residual plot will be added for each regression line.
13. From the JMP menu, select File → Print to print your output. Turn this paper in with
your assignment. You will need this output to answer question 14.
14. Use your output to answer the following questions.
(a) Describe the distribution of the foot length. Make sure to include in your description the
five number summary, the mean and standard deviation, and the shape of the histogram.
(b) Describe the scatterplot of forearm length vs. foot length. Give the regression equation
for predicting foot length from forearm length, give an interpretation of the slope of the
regression equation, and give an interpretation of the R² value for the regression. Finally,
describe the residual plot, and make note of any potential problems with the regression.
(c) Describe the scatterplot of forearm length vs. foot length for the women and the men,
making note of their differences. Give the regression equation for predicting foot length
from forearm length for each gender, and give an interpretation of the R² value for
each gender. Describe the residual plot for each gender, and make note of any potential
problems with the regression. Which regression is better at predicting foot length from
arm length, the one for women or the one for men?
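
Optional cross-check: the calculations that JMP performs in steps 3 through 13 can be
sketched in Python. This is a minimal sketch under stated assumptions, not part of the
required JMP procedure. The rows in SAMPLE are made up for illustration and are not the
lab data; the column order (arm, foot, gender) and the function names are assumptions.
Quartile conventions differ between packages, so the Q1 and Q3 values computed here may
not match JMP's output exactly.

```python
import statistics

# Made-up illustrative rows, NOT the lab data. The layout (space-delimited,
# header row, columns arm/foot/gender with codes M and W) is an assumption
# based on how the handout describes hmwk4.txt.
SAMPLE = """arm foot gender
10.0 9.5 W
10.5 10.0 W
11.0 10.0 W
11.5 11.0 M
12.0 11.0 M
12.5 12.0 M
"""


def read_table(text):
    """Parse space-delimited text whose first line holds the column names."""
    lines = [ln.split() for ln in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using inclusive quartiles."""
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, med, q3, max(values)


def least_squares(x, y):
    """Fit y = intercept + slope * x by least squares.

    Returns (slope, intercept, r_squared, residuals).
    """
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual = observed value minus the value predicted by the line.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sst = sum((yi - my) ** 2 for yi in y)   # total variation in y
    sse = sum(r ** 2 for r in residuals)    # variation left unexplained
    return slope, intercept, 1 - sse / sst, residuals


rows = read_table(SAMPLE)
arm = [float(r["arm"]) for r in rows]
foot = [float(r["foot"]) for r in rows]

# Steps 3-4: distribution of foot length.
print("five-number summary:", five_number_summary(foot))
print("mean:", statistics.fmean(foot), "sd:", statistics.stdev(foot))

# Steps 6-8: one regression line for everyone, plus its residuals.
slope, intercept, r2, resid = least_squares(arm, foot)
print(f"all: foot = {intercept:.3f} + {slope:.3f} * arm, R^2 = {r2:.3f}")
print("residuals:", [round(r, 3) for r in resid])

# Steps 10-12: a separate regression line for each gender.
for g in sorted({r["gender"] for r in rows}):
    gx = [float(r["arm"]) for r in rows if r["gender"] == g]
    gy = [float(r["foot"]) for r in rows if r["gender"] == g]
    s, i, r2g, _ = least_squares(gx, gy)
    print(f"{g}: foot = {i:.3f} + {s:.3f} * arm, R^2 = {r2g:.3f}")
```

To run the sketch on the real file, the contents of hmwk4.txt could be read and passed to
read_table in place of SAMPLE, assuming the file's column names match those used above.
Plotting the residuals against arm gives the same kind of picture as JMP's Plot Residuals.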