Unit 3/Exam 1 Review Key

Math 075
Exam 1 Review
Module 2
NAME:_____________________________________
Directions: Do not use Minitab. A calculator is allowed.
FORMULAS:
SD = √( ∑(x − x̄)² / (n − 1) )
IQR = Q3 − Q1
ADM = ∑|x − x̄| / n
1) The body temperature of students is taken each time a student goes to the nurse’s office. The
five-number summary for the temperatures (in degrees Fahrenheit) of students on a particular
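day is (table not reproduced; values as used in the answers below):
Min = 96.6, Q1 = 97.85, Median = 98.25, Q3 = 98.6, Max = 101.6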
a) Would you expect the mean temperature of all students who visited the nurse’s office to be
higher or lower than the median? Explain.
Look at the distance from the median to the extremes (max/min). If the distance to one
extreme is MUCH bigger than the distance to the other, then the distribution is skewed. If
the two distances are close in length, then the distribution is symmetric.
Median – Min = 98.25 - 96.6 = 1.65
Max – Median = 101.6 - 98.25 = 3.35
The distance from the Median to the Max is twice as big as the distance from the Median
to the Min. So the distribution is skewed to the right. This means that the mean will be
bigger than the median since the mean is pulled towards the tail.
b) After the data were picked up in the afternoon, three more students visited the nurse’s
office with temperatures of 96.7°, 98.4°, and 99.2°. Were any of these students outliers?
Explain using the fences at Q1 − 1.5 × IQR and Q3 + 1.5 × IQR.
Upper fence is at Q3 + 1.5 × IQR = 98.6 + 1.5 × (98.6 − 97.85) = 99.725
Lower fence is at Q1 − 1.5 × IQR = 97.85 − 1.5 × (98.6 − 97.85) = 96.725
The student with the 96.7° temperature is an outlier because 96.7 falls below the lower fence
of 96.725. The other two temperatures (98.4° and 99.2°) lie between the fences, so they are
not outliers.
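The fence check can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the quartiles 97.85 and 98.6 are the ones used in the answer, and the variable names are made up for illustration.

```python
# Quartiles as used in the answer to part b).
q1, q3 = 97.85, 98.6
iqr = q3 - q1

# Fences sit 1.5 IQRs beyond each quartile.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# The three temperatures recorded later in the afternoon.
late_temps = [96.7, 98.4, 99.2]
outliers = [t for t in late_temps if t < lower_fence or t > upper_fence]

print(round(lower_fence, 3), round(upper_fence, 3))
print(outliers)
```

Only values strictly outside the fences are flagged, which is why 99.2° is kept even though it is well above Q3.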
2) The boxplots show the ages of people involved in accidents according to their role in
the accident.
a) Which role involved the youngest person,
and what is the age?
Passenger, 0
b) Which role involved the person with the
lowest median age, and what is the age?
Passenger, 23
c) Which role involved the smallest range of
age, and what is it?
Cyclist, 52-10 = 42
d) Which role involved the largest IQR of age,
and what is it?
Pedestrian, 64-20 = 44
e) Which role has the most symmetric distribution? Explain.
Pedestrian, because the distances from the median to the two extremes are about the same.
f) Which role has the most skewed distribution? Explain.
Passenger, because the distance from the median to the max is so much bigger than the
distance from the median to the min.
3) The students in a biology class kept a record of the height (in centimeters) of plants for a class
experiment.
a) Sketch a histogram for these data.
b) Find the mean and median of the plant heights.
Mean:
x̄ = (49 + 67 + 38 + 55 + 62 + 54 + 36 + 41 + 56 + 43 + 49 + 75 + 44 + 60 + 48 + 52 + 48 + 53 + 59 + 32) / 20
= 1021 / 20 = 51.05
Median:
Sort the 20 values first:
32 36 38 41 43 44 48 48 49 49 52 53 54 55 56 59 60 62 67 75
Find the average of the two middle values (the 10th and 11th).
(49 + 52) / 2 = 50.5
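A quick way to check the mean and median is to let Python do the sorting. The sketch below uses only the 20 heights listed in the problem. The point it illustrates is that the two middle values must be read from the sorted list, where they are 49 and 52, not from the list in its original order.

```python
from statistics import mean, median

# Plant heights (cm) exactly as listed in the problem.
heights = [49, 67, 38, 55, 62, 54, 36, 41, 56, 43,
           49, 75, 44, 60, 48, 52, 48, 53, 59, 32]

ordered = sorted(heights)
# With 20 values, the median averages the 10th and 11th sorted values.
middle_pair = (ordered[9], ordered[10])

print(mean(heights))
print(middle_pair, median(heights))
```

The mean (51.05) comes out slightly above the median (50.5), which fits a roughly symmetric shape with a small pull toward the high end.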
c) Is it appropriate to use the mean to summarize these data? Explain.
Yes, the graph is roughly symmetric (with a possible small skew to the right). The center is
approximately equidistant to both ends of the histogram.
d) Describe the distribution of plant heights.
The distribution is unimodal and symmetric with no outliers. Most plants have a height of
about 40-60cm. The smallest plant is 32cm. The largest plant is 75cm, which gave us a slight
skew and higher mean. The range is 75-32 = 43cm. There are some plants that are twice as
big as others.
4) All students in the physical education class completed a basketball free-throw shooting event
and the highest number of shots made was 32. The next day, the PE teacher realized that he
had made a mistake. The student had actually made 35 shots. Indicate whether changing the
student’s score made each of these summary statistics increase, decrease, or stay about the
same:
a) Mean - Increase
b) Median - Remain the same
c) Range - Increase
d) IQR - Remain the same
5) The mean number of hours worked for the 30 males was 6, and for the 20 females was 9.
What is the overall mean number of hours worked?
x̄ = (30 × 6 + 20 × 9) / 50 = 360 / 50 = 7.2
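The overall mean in problem 5 is a weighted mean: each group's mean counts in proportion to the group's size. A minimal sketch, with a made-up `groups` list holding the (count, mean) pairs from the problem:

```python
# (number of people, mean hours worked) for males and females.
groups = [(30, 6), (20, 9)]

total_hours = sum(count * group_mean for count, group_mean in groups)
total_people = sum(count for count, _ in groups)

overall_mean = total_hours / total_people
print(overall_mean)
```

Simply averaging 6 and 9 would give 7.5, which is too high because it ignores that there are more males than females.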
6) Create a boxplot for this set of data using the scale below. Remember the fences are at
Q1 − 1.5 × IQR and Q3 + 1.5 × IQR.
7) We collect these data from 50 male students. Which variable is categorical and which is
quantitative?
A) eye color - C
B) head circumference - Q
C) marital status - C
D) number of cigarettes smoked daily - Q
E) number of TV sets at home - Q
8) Which one of the quantitative variables in problem 7 is most likely to be bimodal? Why?
Number of cigarettes smoked daily, because you will probably have many people who don’t
smoke at all and the people who do smoke will most likely smoke many cigarettes.
9) Which one of the quantitative variables in problem 7 is most likely to be unimodal and
symmetric? Why?
Head circumference, because there will be an average head size and most people will be near
that size. No one will have an extremely small head or an extremely large head.
10) Why might you choose to display data with a dotplot rather than a boxplot?
A dotplot shows approximately what every individual value is, whereas a boxplot only gives
you the five-number summary.
11) The 1999 Consumer Reports New Car Buying Guide reported the number of seconds required
for a variety of cars to accelerate from 0 to 30 mph. The cars were also classified into six
categories by type. The following boxplots display the distributions of acceleration times for
each type of car. (Note: the asterisks on the boxplot for the small cars denote outliers.)
a) If we compare a typical car in each category, which type accelerates the fastest? What
part(s) of the boxplots did you compare to make your choice?
The sports car accelerates the fastest. I used the median to make my choice.
b) If we compare the range of acceleration times for each car type, which type performs
the most consistently? What part of the boxplots did you compare to make your choice?
The large car has the most consistent acceleration time. I looked at the IQR (distance
from Q1 to Q3 or the length of the entire box). I also looked at the range (distance
from min to max or the length of the entire graph.) I noticed that the small car has the
smallest IQR, but the large car has the smallest range. I decided to pick the large car
as the most consistent, because it had the smallest range and no unpredictable outliers.
c) Now, let's focus only on the Small cars. If the outliers were removed from the dataset
of Small cars, which of the following measures of spread would be least affected?
Overall range, interquartile range (the distance between the 1st and 3rd quartile marks),
or standard deviation.
The IQR would be the least affected because the middle 50% of the values would remain about
the same after the outliers are removed. The SD and range would definitely decrease.
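The effect described in part c) can be seen on a small made-up example. The acceleration times below are hypothetical (the actual Consumer Reports data are not given here), and the quartiles are taken as medians of the lower and upper halves, which is an assumed convention.

```python
from statistics import median, stdev

def spread(values):
    """Range, IQR and sample SD; quartiles are medians of the two halves."""
    s = sorted(values)
    half = (len(s) + 1) // 2          # the median joins both halves when n is odd
    q1, q3 = median(s[:half]), median(s[-half:])
    return {"range": s[-1] - s[0], "IQR": q3 - q1, "SD": stdev(s)}

# Hypothetical small-car times in seconds; 4.6 and 4.9 play the role of the outliers.
times = [3.3, 3.4, 3.5, 3.5, 3.6, 3.6, 3.7, 3.8, 4.6, 4.9]
trimmed = [t for t in times if t < 4.5]

before, after = spread(times), spread(trimmed)
change = {name: before[name] - after[name] for name in before}

for name in before:
    print(name, round(before[name], 2), "->", round(after[name], 2))
```

In this sketch the range shrinks the most, the SD shrinks noticeably, and the IQR barely moves, because the quartiles depend only on the middle of the sorted list.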
12) Which is true of the data whose distribution is shown?
I. The distribution is skewed to the right.
II. The mean is smaller than the median.
III. We should summarize with mean and standard deviation.
[Figure: distribution plot not reproduced. The answer choices are only partly legible: "I only",
"III only", "I and II", "D) I and III" or "D) II and III", "E) I, II, and III".]
13) The IQR of the data displayed in this dotplot is most likely to be ...
[Figure: dotplot with axis ticks at 40, 60, 80, 100]
A) 5
B) 12
C) 1
D) 65
E) 20
14) A boxplot is a graphical summary of the data set. I cannot tell by looking at the boxplot how
many data points are in the data set or how the data is distributed within each quartile. To
illustrate this important idea, make up two different sets of data to match this boxplot. Put
10 numbers in one of the data sets and 13 numbers in the other data set.
Set 1: 0, 1 or 2, 2, 2 or 3, 2 or 3, 3 or 4, 4 or 5, 5, 5 or 6, 6
The two middle numbers (the 5th and 6th, underlined in the original) must have an average of 3.
Set 2: 0, 0 or 1 or 2, 0 or 1 or 2, 2, 2 or 3, 2 or 3, 3, 3 or 4 or 5, 3 or 4 or 5, 5, 5 or 6,
5 or 6, 6
Note: I may ask for a data set with a large SD or ADM. Keep in mind that if you want a
large spread try to keep the numbers as far away from the mean as possible and if you want
a small spread try to keep the numbers as close to the mean as possible. In this type of
problem I would have to mention what the mean is.
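One way to check an answer to problem 14 is to compute the five-number summary of both made-up sets and confirm they match. The sketch below picks one allowed choice for each "or" in Set 1 and Set 2. The target summary (0, 2, 3, 5, 6) is inferred from the fixed entries in the two templates, since the boxplot itself is not reproduced here. Quartiles are taken as medians of the lower and upper halves, with the median included in both halves when the count is odd. That convention is an assumption.

```python
from statistics import median

def five_number(values):
    """Min, Q1, median, Q3, max using medians of the two halves."""
    s = sorted(values)
    half = (len(s) + 1) // 2          # includes the median in both halves when n is odd
    return (s[0], median(s[:half]), median(s), median(s[-half:]), s[-1])

# One valid choice from each template in the answer.
set1 = [0, 1, 2, 2, 3, 3, 4, 5, 5, 6]              # 10 numbers
set2 = [0, 1, 2, 2, 2, 3, 3, 4, 5, 5, 5, 6, 6]     # 13 numbers

print(five_number(set1))
print(five_number(set2))
```

Both sets give the same five numbers even though they differ in size and in how the values are spread inside each quarter, which is exactly the information a boxplot hides.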
15) A class of fourth graders takes a diagnostic reading test, and scores are reported by reading
grade level. The 5 number summaries for the boys and girls are shown below.
        Min   Q1    Median   Q3    Max
Boys:   2.8   4.1   4.8      5.5   5.6
Girls:  2.1   4.5   4.9      5.6   5.8
a) Which group has the highest score? Circle one: Boys / Girls
Girls with a max of 5.8
b) Which group has the greatest range? Circle one: Boys / Girls
Girls with a range of 5.8 - 2.1 = 3.7
c) Which group has the highest IQR? Circle one: Boys / Girls
Boys with an IQR = 5.5 - 4.1 = 1.4
d) Which group’s scores appear to be more skewed? Explain.
Girls:
Median – Min = 4.9 – 2.1 = 2.8
Max – Median = 5.8 – 4.9 = .9
Boys:
Median – Min = 4.8 – 2.8 = 2
Max – Median = 5.6 – 4.8 = .8
The girls' scores are more skewed because the distance from the Median to the Min (2.8) is
about three times the distance from the Max to the Median (0.9). The boys' distances
(2 and 0.8) are closer to each other than the girls' distances (2.8 and 0.9).
e) Which group generally did better on the test? Explain.
Girls did better because they have a higher Q1, median, Q3 and Max. They had a low min, but the
distribution was skewed to the left meaning that there weren’t very many low scores.
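The comparisons in problem 15 all come straight from the two five-number summaries, so they can be tabulated in one pass. A small sketch follows. The dictionary layout and the "left tail" and "right tail" labels are made up for illustration.

```python
# Five-number summaries (min, Q1, median, Q3, max) from the problem.
summaries = {
    "Boys":  (2.8, 4.1, 4.8, 5.5, 5.6),
    "Girls": (2.1, 4.5, 4.9, 5.6, 5.8),
}

stats = {}
for group, (low, q1, med, q3, high) in summaries.items():
    stats[group] = {
        "range": round(high - low, 1),
        "IQR": round(q3 - q1, 1),
        "left tail": round(med - low, 1),    # median to min
        "right tail": round(high - med, 1),  # max to median
    }

for group, values in stats.items():
    print(group, values)
```

A left tail much longer than the right tail is the sign of left skew, and the gap between the two tails is larger for the girls.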
16) Review the following:
a) CW Mod 2 Topic 2.1 all
b) CW Mod 2 Topic 2.2 #4-9
c) CW Mod 2 Topic 2.4 all
d) Notes Slide 1-5
e) OLI Checkpoints, “Learn by Doing” and “Did I get This”! Approximately 20-50% of the exam
questions will be OLI type problems.
17) Consider the following data values. Find the following for each of the data sets.
Set A: 1 2 4 5 5 6 6 6 7 8 9
Set B: 0 1 1 2 2 2 3 4 5 6 7 8 8 9
Set A:
a) Min - 1
b) Q1 - 4.5
c) Median - 6
d) Q3 - 6.5
e) Max - 9
f) IQR - 2
g) Range - 8
h) Mean - 5.36
i) ADM - 1.79
j) Standard Deviation - 2.38
Set B:
a) Min - 0
b) Q1 - 2
c) Median - 3.5
d) Q3 - 7
e) Max - 9
f) IQR - 5
g) Range - 9
h) Mean - 4.14
i) ADM - 2.59
j) Standard Deviation - 3.01
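The answers to problem 17 can be checked with a short script. The sketch assumes Set A is 1, 2, 4, 5, 5, 6, 6, 6, 7, 8, 9 (11 values) and Set B is 0, 1, 1, 2, 2, 2, 3, 4, 5, 6, 7, 8, 8, 9 (14 values), which is the reading of the data that reproduces the key's means of 5.36 and 4.14. It also assumes quartiles are medians of the lower and upper halves, with the median included in both halves when the count is odd. The helper name `summarize` is made up.

```python
from statistics import mean, median, stdev

def summarize(values):
    """Five-number summary plus IQR, range, mean, ADM and sample SD."""
    s = sorted(values)
    half = (len(s) + 1) // 2          # includes the median in both halves when n is odd
    q1, q3 = median(s[:half]), median(s[-half:])
    m = mean(s)
    adm = sum(abs(x - m) for x in s) / len(s)   # average distance from the mean
    return {
        "min": s[0], "Q1": q1, "median": median(s), "Q3": q3, "max": s[-1],
        "IQR": q3 - q1, "range": s[-1] - s[0],
        "mean": round(m, 2), "ADM": round(adm, 2), "SD": round(stdev(s), 2),
    }

set_a = [1, 2, 4, 5, 5, 6, 6, 6, 7, 8, 9]
set_b = [0, 1, 1, 2, 2, 2, 3, 4, 5, 6, 7, 8, 8, 9]

a, b = summarize(set_a), summarize(set_b)
print(a)
print(b)
```

Note that the ADM divides by n while the standard deviation divides by n − 1, matching the formulas at the top of the review.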