STAT1600 Test #1 Form A

Stat1600
Midterm #1
Solution, Form A
1. The high temperature on Jan 30th (in °F) for the last 5 years in a northern U.S. city is as
follows:
23, 30, 25, 24, 28
(a) (10 points) Calculate the mean, $\bar{X}$, of this data.
ANSWER: 26
CALCULATION/REASON: (show your work)
$\bar{X} = \frac{23 + 30 + 25 + 24 + 28}{5} = \frac{130}{5} = 26.$
(b) (10 points) Calculate the median of this data.
ANSWER: 25
CALCULATION/REASON: (show your work)
sorted list: 23, 24, 25, 28, 30
$\frac{n+1}{2} = \frac{5+1}{2} = 3 \implies \mathrm{MED} = 25$, the 3rd ordered value.
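The mean and median calculations in parts (a) and (b) can be double-checked with a short script. This is only a sketch using Python's standard `statistics` module; the variable names are chosen here for illustration and are not part of the exam.

```python
from statistics import mean, median

temps = [23, 30, 25, 24, 28]  # high temperatures in °F from the problem

x_bar = mean(temps)                  # sum divided by count: 130 / 5
ordered = sorted(temps)              # [23, 24, 25, 28, 30]
position = (len(temps) + 1) // 2     # (n + 1) / 2 = 3 when n = 5
med = ordered[position - 1]          # 3rd ordered value (1-based position)

print(x_bar, med)
print(median(temps) == med)          # the library median agrees with the position rule
```

The position rule (n + 1)/2 picks a single ordered value only when n is odd, which is the case here.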
(c) (15 points) Let’s compare the summary statistics of temperature for two cities during
the past year.
Statistics           City A   City B
Mean                 62°F     60°F
Standard Deviation   10°F     14°F
Based on the information provided in the table, would you expect one of these cities to
have more extremes in their temperatures than the other?
ANSWER: Yes
REASON:
City B has more extremes: its larger SD (14°F versus 10°F) means its temperatures are more spread out around the mean.
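To make the comparison concrete, one can look at the interval Mean ± 2 SD for each city. The choice of a 2 SD interval is an illustrative assumption made here, not something part (c) asks for; the sketch below only does the arithmetic on the table values.

```python
# Summary statistics from the table, in °F: (mean, standard deviation)
cities = {"City A": (62, 10), "City B": (60, 14)}

def two_sd_range(mean, sd):
    """Return the interval (mean - 2*sd, mean + 2*sd)."""
    return (mean - 2 * sd, mean + 2 * sd)

ranges = {name: two_sd_range(m, s) for name, (m, s) in cities.items()}

for name, (low, high) in ranges.items():
    print(f"{name}: {low}°F to {high}°F")
```

City B's interval is wider on both ends, which matches the answer that City B is expected to see more extreme temperatures.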
2. Suppose the height of males in the United States follows a Normal Distribution with a mean
of 69 inches and a standard deviation of 4 inches. Given this information, use the Z-Table
provided to answer the following questions:
(a) (10 points) What percentage of men are under 65 inches tall?
The z-score of x = 65:
$z = \frac{x - \text{Mean}}{\text{SD}} = \frac{65 - 69}{4} = -1.00.$
Hence, the area to the left of 65 under N(Mean = 69, SD = 4) equals the area to
the left of −1.00 under the Z curve:
(area to the left of −1.00) = 1 − (area to the left of 1.00) = 1 − .8413 = .1587
That is, 15.87%.
On the other hand, you can apply the empirical rule: 65 inches is 1 SD below the
mean (= 69 − 4) and hence the percentage is
$\frac{100\% - 68\%}{2} = \frac{32\%}{2} = 16\%$
which is fairly close to 15.87%.
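The table lookup in part (a) can be reproduced numerically. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library in place of the printed Z-Table; the variable names are illustrative.

```python
from statistics import NormalDist

heights = NormalDist(mu=69, sigma=4)   # model stated in the problem

z = (65 - 69) / 4                      # z-score of 65 inches
p_table = NormalDist().cdf(z)          # area to the left of z under the Z curve
p_direct = heights.cdf(65)             # same area computed on the original scale

print(round(p_table, 4), round(p_direct, 4))
```

Both routes give the same area, which is the point of standardizing: the z-score moves the question onto the standard normal curve without changing the answer.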
(b) (15 points) What percentage of men are between 61 and 77 inches?
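The z-scores of 61 and 77 are, respectively,
for 61: $\frac{61 - 69}{4} = -2.00$; for 77: $\frac{77 - 69}{4} = 2.00$
and hence the percentage is the area between −2.00 and 2.00 under the Z curve:
$1 - (\text{area in the two tails beyond } \pm 2.00) = 2 \times (\text{area to the left of } 2.00) - 1 = 2 \times .9772 - 1 = .9544$
That is, 95.44%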
On the other hand, you can apply the empirical rule: 61 and 77, respectively, are
Mean − 2SD and Mean + 2SD and hence the percentage is 95% which is fairly
close to 95.44%.
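As a numerical check of part (b), the area between 61 and 77 inches can be computed directly and via the symmetry shortcut. This sketch again substitutes `statistics.NormalDist` for the Z-Table. Because the table rounds the area to the left of 2.00 to .9772, the unrounded result differs from .9544 in the fourth decimal place.

```python
from statistics import NormalDist

heights = NormalDist(mu=69, sigma=4)   # model stated in the problem

# Area between 61 and 77 inches, computed on the original scale
p_between = heights.cdf(77) - heights.cdf(61)

# Same area via symmetry of the Z curve: 2 * (area left of 2.00) - 1
p_symmetry = 2 * NormalDist().cdf(2.0) - 1

print(round(p_between, 4), round(p_symmetry, 4))
```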
(c) (10 points) A male basketball player is 77 inches tall. What percentile is he in? (In
other words, what percentage of males are shorter than he is?)
The z-score for 77 is 2.00 (from part (b)) and hence the percentage is 97.72% (area is
.9772 from Z Table).
You can also apply the empirical rule:
$\underbrace{95\%}_{\text{Mean} \pm 2\text{SD}} + \underbrace{\frac{100\% - 95\%}{2}}_{\text{below Mean} - 2\text{SD}} = 97.5\%$
which is fairly close to 97.72%.
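A percentile is just the area to the left expressed as a percentage, so part (c) can be sketched as a small helper. The function name `percentile_of` is hypothetical and introduced here only for illustration.

```python
from statistics import NormalDist

heights = NormalDist(mu=69, sigma=4)   # model stated in the problem

def percentile_of(height_in):
    """Percentage of the modeled population shorter than height_in (inches)."""
    return 100 * heights.cdf(height_in)

player = percentile_of(77)             # the 77-inch basketball player
empirical = 95 + (100 - 95) / 2        # empirical-rule estimate from the solution

print(round(player, 2), empirical)
```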
3. Fred is in a weekly golf league. His scores from the last n = 19 weeks are shown in the stem
and leaf plot below:
------------
Leaf Unit = 1
 7  579
 8  0123489
 9  01255
10  012
11  1
------------
(So, for example, the first observation under stem 11 is read as 111
and the first observation under stem 7 is read as 75)
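Reading the plot means combining each stem with each of its leaves. The sketch below expands the plot into a sorted list of scores; the dictionary representation is a choice made here for illustration.

```python
# Stems mapped to their leaves, copied from the plot (leaf unit = 1)
stem_and_leaf = {7: "579", 8: "0123489", 9: "01255", 10: "012", 11: "1"}

# Each observation is 10 * stem + leaf
scores = sorted(
    10 * stem + int(leaf)
    for stem, leaves in stem_and_leaf.items()
    for leaf in leaves
)

print(len(scores))
print(scores)
```

The length of the list should match the n = 19 weeks stated in the problem, which is a quick way to catch a misread leaf.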
(a) (7 points) In addition to the Stem-and-Leaf Plot, what other type(s) of data presentation would be most appropriate for this data? Explain your answer. Your choices are:
(Bar Chart, Pie chart, Histogram, Dotplot, Box-and-Whisker Plot)
Answer:
Since this is a numerical data set, a histogram, dotplot, and box-and-whisker plot are
also appropriate to present the data. Bar charts and pie charts are meant for categorical data.
(b) (8 points) After examining the Stem-and-Leaf Plot, what seems to be the shape of this
data?
Answer: right skewed
Reason:
Turn the stem-and-leaf plot 90° counter-clockwise and inspect the shape: long right
tail. Hence, it's right skewed.
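The visual judgment can be supported with two rough numerical checks: the number of observations per stem, and how far the extremes sit from the median. These checks are an addition for illustration and not part of the exam's expected reasoning; they are consistent with a right skew but do not prove it.

```python
from statistics import mean, median

# Scores read off the stem-and-leaf plot
scores = [75, 77, 79, 80, 81, 82, 83, 84, 88, 89,
          90, 91, 92, 95, 95, 100, 101, 102, 111]

# Observations per stem: the counts trail off toward the high scores
counts_by_stem = {}
for s in scores:
    counts_by_stem[s // 10] = counts_by_stem.get(s // 10, 0) + 1

avg = mean(scores)
med = median(scores)
right_reach = max(scores) - med   # distance from median to MAX
left_reach = med - min(scores)    # distance from median to MIN

print(counts_by_stem)
print(avg > med, right_reach > left_reach)
```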
(c) (15 points) Give the five-number summary then construct a boxplot (with whiskers extending to the two extremes, MIN and MAX). (Hint: How do you construct a sorted
list of the data from the stem-and-leaf plot above?)
MIN and MAX: MIN=75, MAX=111
Median: (show your work)
$\frac{n+1}{2} = \frac{19+1}{2} = 10 \implies \mathrm{MED} = 89$, the 10th ordered value.
Quartiles, Q1 and Q3: (show your work)
$\frac{n+1}{4} = \frac{19+1}{4} = 5$
so, Q1 = 5th ordered value = 81
Q3 = 5th largest value = 95
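The position rules used above translate directly into list indexing. The sketch below assumes, as is true here, that n + 1 is divisible by 4 so each position lands on a single ordered value; other sample sizes would need an interpolation rule that this solution does not cover.

```python
# Scores read off the stem-and-leaf plot, already sorted
scores = [75, 77, 79, 80, 81, 82, 83, 84, 88, 89,
          90, 91, 92, 95, 95, 100, 101, 102, 111]

n = len(scores)
med_pos = (n + 1) // 2     # 10th ordered value
q_pos = (n + 1) // 4       # 5th ordered value from either end

five_number = (
    scores[0],             # MIN
    scores[q_pos - 1],     # Q1: 5th smallest (1-based position)
    scores[med_pos - 1],   # median
    scores[-q_pos],        # Q3: 5th largest
    scores[-1],            # MAX
)

print(five_number)
```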
boxplot: [figure not reproduced; it was drawn over a horizontal axis with ticks from 75 to 110 in steps of 5]
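One way to visualize the five-number summary without a plotting library is a one-line text boxplot. The function `ascii_boxplot` and its drawing characters are hypothetical choices made for this sketch; it is a rough stand-in for a hand-drawn boxplot, not a reproduction of the original figure.

```python
def ascii_boxplot(minimum, q1, med, q3, maximum):
    """Draw one character per unit from minimum to maximum."""
    cells = []
    for value in range(minimum, maximum + 1):
        if value in (minimum, maximum):
            cells.append("|")      # whisker ends at MIN and MAX
        elif value == med:
            cells.append("#")      # median marker inside the box
        elif q1 <= value <= q3:
            cells.append("=")      # the box spans Q1 to Q3
        else:
            cells.append("-")      # whiskers outside the box
    return "".join(cells)

# Five-number summary from the solution: MIN, Q1, median, Q3, MAX
plot = ascii_boxplot(75, 81, 89, 95, 111)
print(plot)
```

The right whisker (95 to 111) comes out longer than the left one (75 to 81), which matches the right-skewed shape noted in part (b).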
4. (5 points) Extra Credit Problem
Assume we have a data set that includes the Area Codes of our customers’ phone numbers.
Some examples in the data set include 269, 626, 610, etc. Clearly these are numbers, but
are considered nominal (categorical) data. Discuss or prove why these values are actually
nominal (categorical) data and should not be analyzed as numerical data.
Area codes are used as labels to determine from which part of the country people are
calling. Arithmetic operations cannot meaningfully be performed on area codes: it would
make no sense to add, subtract, multiply, or divide them. Hence they are not numerical data.
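The distinction can be illustrated in code. In this sketch the customer list is hypothetical (it reuses the example codes from the problem, with one repeated by assumption). Counting how often each label occurs is meaningful, while averaging the codes runs without error but produces a number that describes nothing.

```python
from collections import Counter
from statistics import mean

# Hypothetical customer records; stored as strings because they are labels
area_codes = ["269", "626", "610", "269"]

# Meaningful for nominal data: frequencies and the most common category
counts = Counter(area_codes)
most_common_code, frequency = counts.most_common(1)[0]

# Not meaningful: the arithmetic works, but the result is not a region
average = mean(int(code) for code in area_codes)

print(most_common_code, frequency)
print(average, str(average) in counts)
```

Storing the codes as strings is a deliberate design choice: it makes accidental arithmetic on them fail loudly instead of silently producing a meaningless value.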