Stat 101 – Exam 1
February 8, 2008
Name: ________________________
Section: L
INSTRUCTIONS: Read the questions carefully and completely. Answer each question and
show work in the space provided. Partial credit will not be given if work is not shown. When
asked to explain, describe, or comment, do so within the context of the problem. Be sure to
include units when dealing with quantitative variables.
1. [18 pts] Short answer.
a) [2] Statistics is about … _______________. (Fill in the blank with one word.)
b) [5] A study is conducted by the Iowa Department of Transportation. Every
twentieth vehicle passing mile marker 160 on Interstate 80 in Iowa has its speed
measured with a radar gun. In addition, the type of vehicle (car, light truck, or
heavy truck) is noted. Answer the questions Who? and What? for this study.
When answering the question What?, be sure to indicate the type of variable,
Categorical or Numerical.
c) [4] A small bed and breakfast has 5 rooms to rent by the night. The least
expensive room is $100. The median price of rooms at the bed and breakfast is
$100. The IQR of room prices at the bed and breakfast is $35 and the range of room
prices is $70. What is the average (mean) price of rooms at the bed and
breakfast? In order to get credit you must show your work.
d) [3] For a sample of 100 values, the distance from the minimum to the lower
quartile is 1.8, the distance from the lower quartile to the median is 0.9, the
distance from the median to the upper quartile is 0.4 and the distance from the
upper quartile to the maximum is 0.1. Will the sample mean of the 100 values be
smaller than, about equal to, or greater than the sample median of the 100 values?
Explain briefly.
e) [4] Ty Cobb’s batting average in 1911 was 0.420 while Ted Williams’ batting
average in 1941 was 0.406. For the decade 1911 to 1920, the mean batting
average was 0.266 and the standard deviation was 0.0371. For the decade 1941
to 1950, the mean batting average was 0.267 and the standard deviation was
0.0326. Who had the higher standardized batting average, Ty Cobb or Ted
Williams? Support your answer statistically.
2. [15 pts] The table below indicates the rank attained by males and females in the New
York City Police Department (NYPD).
                     Rank in NYPD
Gender    Officer  Detective  Sergeant  Higher Rank   Total
Female       4281        806       415          111    5613
Male        21900       4058      3898         1910   31766
Total       26181       4864      4313         2021   37379
a) [5] Answer the questions Who? and What? In answering the question What?, be
sure to indicate the type of variable, Categorical or Numerical.
b) [2] What percentage of the NYPD is female?
c) [3] Considering only females, what percentage of females in the NYPD are
Sergeants?
d) [5] On the next page is a Mosaic plot. Based on this plot, is there a difference
between males and females in terms of the rank they attain in the NYPD? Explain
your answer briefly.
[Mosaic plot of Rank (1-Officer, 2-Detective, 3-Sergeant, 4-Higher Ranks) by Gender (Female, Male); vertical axis from 0.00 to 1.00. Figure not reproduced.]
3. [12 pts] Environmental Protection Agency estimates of automobile fuel economy for
highway driving appear to follow a Normal model with population mean μ=24.8 mpg and
population standard deviation σ=6.2 mpg.
a) [4] Draw the model for automobile fuel economy. Clearly label it showing what
the 68-95-99.7 rule indicates about fuel economy mpg.
b) [3] What percentage of automobiles will have fuel economy for highway driving
less than 20.0 mpg?
c) [5] The EPA wants to designate the top 2% of automobiles as “highly fuel
efficient” for highway driving. What fuel economy will an automobile have to
get in order to earn the “highly fuel efficient” designation for highway driving?
4. [20 pts] A random sample of 35 Division 1 Women’s Basketball Teams was taken. The
number of points the team scored in its last game was recorded. Refer to JMP output
below.
[JMP output for Score: histogram with box plot (Score axis from 20 to 100, Count axis from 2 to 10) and Normal Quantile Plot (Normal quantile axis from -3 to 3, probability scale from .01 to .99). Figures not reproduced.]
Stem  Leaf      Count
  9   0           1
  8   6           1
  8   11          2
  7   688         3
  7   112334      6
  6   77889       5
  6   03344       5
  5   5568899     7
  5   023         3
  4   9           1
  4
  3
  3
  2   8           1
2|8 represents 28
Quantiles
100.0%  maximum   90.0
 75.0%  quartile  73.0
 50.0%  median    67.0
 25.0%  quartile  58.0
  0.0%  minimum   28.0
Moments
Mean     65.4
Std Dev  12.2
N        35
a) [3] Describe the shape of the histogram.
b) [2] What are the sample median score and sample mean score?
c) [4] Are the values of the sample mean and sample median consistent with your
description of the shape of the histogram in a)? Explain briefly.
d) [2] How many sample values are within 1 standard deviation of the mean?
e) [1] According to the box plot there is one potential outlier. What is the value of
this potential outlier?
f) [4] If the potential outlier is deleted (leaving a sample of 34 scores) what are the
new sample median and sample mean?
g) [4] If the potential outlier is deleted, could the remaining 34 scores have come
from a population described by a Normal model? Support your answer by
making reference to the Normal Quantile Plot.
5. [10 pts] An experiment is conducted to see if a high dose of a cholesterol-lowering drug
(Zocor™) is more effective than a low dose. The response variable is the change in
cholesterol level. A negative value indicates that a subject's cholesterol went down by that
much. A positive value indicates that the subject's cholesterol actually went up while taking the
drug. Below are box plots for the two groups of 18 men who participated in the
experiment.
[Box plots of Change in Total Cholesterol (horizontal axis from -150 to 50) by Zocor Dose (Low, High). Figure not reproduced.]
a) [1] Describe the shape of the distribution of change in total cholesterol for the
group that received the high dose.
b) [3] Would you use the sample range to numerically summarize the spread of the
low dose group? Explain briefly.
c) [3] Give the approximate values for the sample medians for the high and low
dose groups. According to this measure of center, which dose is more effective
in lowering total cholesterol? Explain briefly.
d) [3] Which group, high or low dose, exhibits more variation? Support your
answer statistically.
Things you should know for the first exam.
Types of variables: Categorical and Numerical (Quantitative)
Five number summary: Minimum, Lower Quartile, Median, Upper Quartile, Maximum.
Range = Maximum – Minimum
IQR = Upper Quartile – Lower Quartile
Sample mean: $\bar{y} = \frac{\sum y}{n}$
Sample standard deviation: $s = \sqrt{\frac{\sum (y - \bar{y})^2}{n - 1}}$
Standardized score: $z = \frac{y - \bar{y}}{s}$
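As a rough illustration of how the sample mean, sample standard deviation, and standardized score formulas on this sheet fit together, here is a minimal Python sketch. The data values and the function names are hypothetical, chosen only for illustration; they do not come from any question on the exam.

```python
import math

def sample_mean(values):
    """Sample mean: sum of the values divided by n."""
    return sum(values) / len(values)

def sample_sd(values):
    """Sample standard deviation: divides the sum of squared deviations by n - 1."""
    ybar = sample_mean(values)
    squared_devs = sum((y - ybar) ** 2 for y in values)
    return math.sqrt(squared_devs / (len(values) - 1))

def standardized_score(y, ybar, s):
    """Standardized score: how many standard deviations y lies from the mean."""
    return (y - ybar) / s

# Hypothetical data, not taken from the exam.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

ybar = sample_mean(values)
s = sample_sd(values)
z_largest = standardized_score(max(values), ybar, s)

print(f"mean = {ybar:.3f}, sd = {s:.3f}, z for largest value = {z_largest:.3f}")
```

The sketch divides by n - 1, as the sheet does, rather than by n; Python's built-in `statistics.stdev` uses the same convention and can serve as a cross-check.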
Normal model:
68% between μ − 1σ and μ + 1σ
95% between μ − 2σ and μ + 2σ
99.7% between μ − 3σ and μ + 3σ
Use Table Z: $z = \frac{y - \mu}{\sigma}$ and $y = \mu + z\sigma$
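The Normal model formulas above can be sketched in Python as follows. This is only an illustration under stated assumptions: the values of μ and σ are hypothetical and unrelated to the exam questions, the helper names are made up, and `statistics.NormalDist` (Python 3.8 or later) stands in for Table Z.

```python
from statistics import NormalDist

# Hypothetical Normal model, not taken from the exam.
MU = 100.0
SIGMA = 15.0

# The standard Normal distribution plays the role of Table Z.
STANDARD_NORMAL = NormalDist()

def z_from_y(y, mu, sigma):
    """Convert a value y to a standardized score: z = (y - mu) / sigma."""
    return (y - mu) / sigma

def y_from_z(z, mu, sigma):
    """Convert a standardized score back to the original scale: y = mu + z * sigma."""
    return mu + z * sigma

# Check the 68-95-99.7 rule: share of values within k standard deviations of the mean.
for k in (1, 2, 3):
    share = STANDARD_NORMAL.cdf(k) - STANDARD_NORMAL.cdf(-k)
    low, high = y_from_z(-k, MU, SIGMA), y_from_z(k, MU, SIGMA)
    print(f"within {k} sd ({low:.1f} to {high:.1f}): {share:.4f}")

# Forward lookup (value to percentage): share of values below y = 85.
share_below_85 = STANDARD_NORMAL.cdf(z_from_y(85.0, MU, SIGMA))

# Reverse lookup (percentage to value): cutoff for the top 10% of values.
cutoff_top_10 = y_from_z(STANDARD_NORMAL.inv_cdf(0.90), MU, SIGMA)

print(f"share below 85: {share_below_85:.4f}")
print(f"top 10% cutoff: {cutoff_top_10:.2f}")
```

A printed Table Z rounds z to two decimal places, so answers worked by hand may differ slightly from the values this sketch prints.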
I think I scored _________ out of 75 points on this exam.