STATISTICS 101 L - Homework 2 Due Friday, January 30, 2004

• Homework is due by 4:00 PM on the due date at 327 Snedecor. You can always hand in your homework
at the end of lecture on Friday.
• You may talk with others about the homework problems but please write your solutions up independently.
• Please answer homework questions in complete sentences. Make sure to staple the pages of your
assignment together.
• Normally you will have an opportunity to get help on homework during lab.
Reading:
Jan. 21 - Jan. 23    Section 1.2
Jan. 26 - Jan. 30    Section 1.3
Assignment:
1. Read pages 38-55 of the text and do problems 1.47, 1.52, 1.64, and 1.65.
2. On homework 1 the birth weights of 44 babies born at a Brisbane, Australia hospital were given.
Eighteen of those babies were girls. Their birth weights, in grams, are given below.
3837 3523 2383 3334 3430 3500
2208 3480 3866 2576 3116 3542
3208 3428 3278 3746 2184 1745
(a) Calculate the mean and median for this sample of data. Comparing these two values, what shape
would you expect the distribution to have? Explain briefly your choice.
(b) Calculate a five number summary for these data and use this to construct a box plot. Be sure
to include a realistic axis for the box plot. Is the distribution of birth weights symmetric or skewed?
Explain briefly what it is about the shape of this box plot that indicates symmetry or skew.
(c) What are the values of the sample range and the sample InterQuartile Range (IQR) for these
data?
(d) Use your calculator to compute the sample standard deviation of these 18 measurements.
(e) Suppose the hospital found out that the scale was not calibrated correctly and each of the reported
weights above is 200 grams more than it should be. If we correct for this mistake how would the
mean, median, range, IQR and standard deviation change? (Hint: Don’t change the data and
recalculate the answers. Instead, think about what each measures and how that is
affected by subtracting a constant.)
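The summaries asked for in parts (a) through (e) can be cross-checked with a short Python sketch. It assumes the quartile rule in which Q1 and Q3 are the medians of the lower and upper halves of the sorted data; the course text and JMP may use a slightly different rule, so their quartiles could differ a little. The names `five_number_summary` and `summarize` are illustrative only.

```python
import statistics

# Birth weights (grams) of the 18 girls, copied from the table above.
weights = [3837, 3523, 2383, 3334, 3430, 3500,
           2208, 3480, 3866, 2576, 3116, 3542,
           3208, 3428, 3278, 3746, 2184, 1745]

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max).

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    leaving out the middle value when the sample size is odd.
    """
    s = sorted(data)
    half = len(s) // 2
    lower, upper = s[:half], s[len(s) - half:]
    return (s[0], statistics.median(lower), statistics.median(s),
            statistics.median(upper), s[-1])

def summarize(data):
    """Collect the measures of center and spread asked for in the problem."""
    low, q1, med, q3, high = five_number_summary(data)
    return {
        "mean": statistics.mean(data),
        "median": med,
        "range": high - low,
        "IQR": q3 - q1,
        "sd": statistics.stdev(data),  # sample standard deviation (n - 1)
    }

original = summarize(weights)
corrected = summarize([w - 200 for w in weights])  # undo the 200 g offset

# Print each measure before and after the correction, side by side.
for name in original:
    print(f"{name:>6}: {original[name]:9.2f} -> {corrected[name]:9.2f}")
```

Comparing the two columns of the printout is a way to check the reasoning in part (e), not a substitute for it.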
3. (JMP assignment) How faithful is the Old Faithful Geyser in Yellowstone National Park? In the table
below are the times (minutes) between eruptions of the Old Faithful Geyser during part of August
1985.
80 81 84 74 93 80 108 62 81 51
71 50 54 85 54 60 50 79 74 82
57 89 85 75 86 92 77 54 59 58
80 54 58 65 53 43 57 80 81 81
75 90 79 76 78 89 80 73 66 49
77 73 57 58 52 60 61 81 87 92
60 60 88 91 83 84 82 62 53 50
86 83 68 50 60 69 48 81 80 88
77 65 76 87 87 74 81 71 50 62
56 82 78 48 49 71 73 79 87 93
• Go to the course webpage www.public.iastate.edu/~wrstephe/stat101L.html. There is
a link for Old Faithful times between eruptions data. Click on the right mouse button
and select Save Link As or Save Target As. Save this file as geyser.txt on either the temp
directory of the computer or on a disc.
• Start the JMP program and select File → Open from the JMP menu. Enter the name of the
file (geyser.txt), and change the Files of type: settings to Text Import Preview. Then click
on Open and then Delimited. In the box that appears, click on Space in the End of Field
Box. Put a check mark in the box near Table contains column headers. Click on Apply
Settings. At this point, JMP gives you a preview of the column names and the first two rows of
your data. If everything looks good, press OK.
• We want to describe the distribution of this data using JMP. This can be done using Distribution.
Be sure you get a histogram with the times along the horizontal axis. Make sure you have counts
on the vertical axis of your histogram. You should ask for a stem and leaf display. Print off your
output. Using your output, answer the following questions. You can put your answers on the
printed output.
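For readers working outside JMP, the import step described above (space as the field delimiter, first row as column headers, then a preview of the first two rows) can be sketched in Python. This is only an illustration: the layout of the real geyser.txt is assumed, and the function name `read_space_delimited` and the column name `Time` are hypothetical.

```python
import csv
import io

def read_space_delimited(stream):
    """Read a space-delimited table whose first row holds the column names."""
    reader = csv.reader(stream, delimiter=" ", skipinitialspace=True)
    # Drop empty cells left by trailing spaces, then drop blank lines.
    rows = [[cell for cell in row if cell] for row in reader]
    rows = [row for row in rows if row]
    header, body = rows[0], rows[1:]
    return header, [[float(cell) for cell in row] for row in body]

# Hypothetical contents: the column name and layout of the real file are assumed.
# For a saved file, open it instead: open("geyser.txt", newline="")
sample = io.StringIO("Time\n80\n81\n84\n")
header, data = read_space_delimited(sample)
print(header)    # column names
print(data[:2])  # first two rows, like the JMP preview
```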
(a) What percentage of the times between eruptions are less than one hour? What percentage are greater
than an hour and a half?
(b) Describe the shape of this distribution. Based on the shape, would you expect the mean to be
equal to, less than, or greater than the median? Explain your answer.
(c) Give the five number summary for these data.
(d) Compute the sample IQR and sample range.
(e) Give the mean and standard deviation for these data.
(f) Did JMP split the stems in the stem plot? If yes, how did JMP split the stems? If no, do you
think JMP should have split the stems?
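The percentages in part (a) and the stem and leaf display in part (f) can be cross-checked against the JMP output with a minimal Python sketch. It assumes stems are the tens digits and that a split stem is divided into leaves 0-4 and leaves 5-9; JMP may choose its stems differently. The function names `percent` and `stem_and_leaf` are illustrative.

```python
# Times (minutes) between eruptions, copied from the table above.
times = [80, 81, 84, 74, 93, 80, 108, 62, 81, 51,
         71, 50, 54, 85, 54, 60, 50, 79, 74, 82,
         57, 89, 85, 75, 86, 92, 77, 54, 59, 58,
         80, 54, 58, 65, 53, 43, 57, 80, 81, 81,
         75, 90, 79, 76, 78, 89, 80, 73, 66, 49,
         77, 73, 57, 58, 52, 60, 61, 81, 87, 92,
         60, 60, 88, 91, 83, 84, 82, 62, 53, 50,
         86, 83, 68, 50, 60, 69, 48, 81, 80, 88,
         77, 65, 76, 87, 87, 74, 81, 71, 50, 62,
         56, 82, 78, 48, 49, 71, 73, 79, 87, 93]

def percent(data, condition):
    """Percentage of observations for which condition(value) is true."""
    return 100 * sum(1 for x in data if condition(x)) / len(data)

def stem_and_leaf(data, split=False):
    """Map each stem (tens) to its sorted leaves (units digits).

    With split=True each stem is divided in two: leaves 0-4 and leaves 5-9.
    Stems with no observations are left out of the result.
    """
    rows = {}
    for x in sorted(data):
        stem, leaf = divmod(x, 10)
        key = (stem, leaf >= 5) if split else (stem, False)
        rows.setdefault(key, []).append(leaf)
    return rows

print("less than 60 minutes:   ", percent(times, lambda t: t < 60), "%")
print("greater than 90 minutes:", percent(times, lambda t: t > 90), "%")
for (stem, _), leaves in stem_and_leaf(times, split=True).items():
    print(f"{stem:>2} | {''.join(str(leaf) for leaf in leaves)}")
```

Calling `stem_and_leaf(times)` without `split=True` gives the unsplit display, which makes it easy to compare the two versions when answering part (f).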
4. (JMP assignment) The total annual rainfall amounts (in inches) for 100 years (1902-2002) in Los Angeles,
California are given in the table below.
4.4 21.0 10.7 7.2 18.8 26.2 11.2 16.9 19.7 11.6
17.8 12.0 9.0 12.3 4.9 8.2 32.8 12.5 13.7 16.2
11.6 7.4 27.0 7.8 8.2 10.6 19.2 11.5 12.5 12.6
9.1 8.1 19.7 27.5 5.6 8.0 13.1 12.7 8.6 19.2
31.0 12.5 33.4 16.6 21.1 7.2 23.4 9.8 13.9 11.7
12.4 7.7 12.3 22.0 9.5 12.3 22.4 18.0 15.3 19.3
12.4 17.9 7.2 20.4 16.0 11.7 13.5 17.6 19.9 18.7
24.4 12.8 14.4 13.7 11.9 11.6 21.7 7.9 17.1 19.5
8.1 10.4 14.9 8.0 12.0 19.2 14.9 6.7 23.7 8.7
27.4 31.3 21.3 8.4 9.5 18.2 11.9 9.6 13.4 19.3
These data are also on the course webpage as Los Angeles annual rainfall data and can be accessed in
the same way as the Old Faithful data. Repeat the steps from exercise 3 to obtain JMP output. On
the output answer the following questions.
(a) In what percent of the 100 years was the rainfall amount less than or equal to 8 inches? In what
percent of the 100 years was the rainfall amount greater than 27 inches?
(b) Referring to the histogram, describe the distribution of yearly rainfall in Los Angeles for the 100
years between 1902 and 2002.
(c) Does the Stem and Leaf display have split stems? If yes, how did JMP split the stems?
(d) Report the value of a measure of center and the value of a measure of spread for these data.
Explain why you chose each measure.
(e) Would the distribution of yearly rainfall in Los Angeles give you any information about the
distribution of yearly rainfall in Des Moines? Explain your answer.
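A histogram with counts on the vertical axis, together with candidate measures of center and spread, can be sketched in Python as a cross-check on the JMP output. The bin width of 5 inches is an assumption made for illustration; JMP chooses its own bins. The function name `bin_counts` is hypothetical.

```python
import statistics
from collections import Counter

# Annual rainfall (inches), copied from the table above.
rain = [4.4, 21.0, 10.7, 7.2, 18.8, 26.2, 11.2, 16.9, 19.7, 11.6,
        17.8, 12.0, 9.0, 12.3, 4.9, 8.2, 32.8, 12.5, 13.7, 16.2,
        11.6, 7.4, 27.0, 7.8, 8.2, 10.6, 19.2, 11.5, 12.5, 12.6,
        9.1, 8.1, 19.7, 27.5, 5.6, 8.0, 13.1, 12.7, 8.6, 19.2,
        31.0, 12.5, 33.4, 16.6, 21.1, 7.2, 23.4, 9.8, 13.9, 11.7,
        12.4, 7.7, 12.3, 22.0, 9.5, 12.3, 22.4, 18.0, 15.3, 19.3,
        12.4, 17.9, 7.2, 20.4, 16.0, 11.7, 13.5, 17.6, 19.9, 18.7,
        24.4, 12.8, 14.4, 13.7, 11.9, 11.6, 21.7, 7.9, 17.1, 19.5,
        8.1, 10.4, 14.9, 8.0, 12.0, 19.2, 14.9, 6.7, 23.7, 8.7,
        27.4, 31.3, 21.3, 8.4, 9.5, 18.2, 11.9, 9.6, 13.4, 19.3]

def bin_counts(data, width):
    """Count observations in bins [k*width, (k+1)*width), keyed by lower edge."""
    counts = Counter(int(x // width) * width for x in data)
    return dict(sorted(counts.items()))

# Text histogram: one row per bin, one star per year.
for lower, count in bin_counts(rain, 5).items():
    print(f"{lower:>2} to <{lower + 5:<2} | {'*' * count} ({count})")

# Two candidate measures of center and one of spread.
print("mean   :", round(statistics.mean(rain), 2))
print("median :", statistics.median(rain))
print("sd     :", round(statistics.stdev(rain), 2))
```

Which pair of measures to report in part (d) depends on the shape seen in the histogram, so the sketch prints several and leaves the choice to the reader.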