Statistics 108
Spring 2003 (Prof. Rizzardi)
Name:___________
+ If the answer is a single number or interval, circle your final answer.
+ Show all work for full credit. Neatness counts.
+ If you are not certain of an answer, describe your logic for partial credit.
+ If a question uses the answer from a previous problem which you could not answer,
use a sensible number as the result of the previous problem and utilize that answer for
the question. Explicitly tell me on the exam that you are doing this.
+ For this final, you are allowed to use:
  - the provided statistical tables,
  - a calculator,
  - 2 sheets of notes.
(Problem 1) Let x1 = 3, x2 = 7, x3 = 2.
(1a) Calculate x̄.
(1b) Calculate the sample standard deviation. Show your work.
(Problem 2) Fill in both blanks. The smallest value a probability can be is _________ and the
largest value is ___________, and any probability value stated outside of this range is a mistake.
(Problem 3) Suppose you were to calculate a 1-standard deviation window by calculating x̄ ± s.
Assuming the dot plot of the data is approximately “bell-shaped”, you would expect there to be
roughly _________% of the data within the 1-standard deviation window.
(Problem 4) Suppose the following probabilities hold in an animal population:
P( diseased | male ) = 0.2
P( diseased | female ) = 0.1
P( male ) = 0.30
P( female ) = 0.70
Are disease and sex independent? Explain.
(Problem 5) A study (fictional!) at a local hospital was performed to see if breast feeding of
infants was associated with lower allergy rates later in the child’s life. The mothers of young
children suffering from severe non-food allergies were asked whether or not the child had been
breast fed regularly for at least 2 months after birth. Mothers of children who were in the hospital
for non-disease injuries (e.g., a broken arm) were also asked the same question. Data were later
analyzed to determine whether breast feeding was more or less prevalent among the children
with allergies.
(5a) Was this an experimental or observational study? Explain why.
(5b) If this were an observational study, explain what would be required to make it an
experimental study. Or, if this were an experimental study, describe how it could have been
turned into an observational study.
(Problem 6) Suppose a fair 6-sided die is rolled once. Each event below occurs if the die
shows one of the values in its set.
A={1,2,3}
B={2,4,6}
C={2,3,5}
D={3,4,5,6}
(6a) Calculate P(D).
(6b) Calculate P(D^c); i.e., the probability of the complement of event D.
(6c) Calculate P ( A  C )
(6d) Calculate P(A | B).
(Problem 7) The leaf lengths of 582 trillium plants were collected by HSU students. The boxplot
of the lengths is shown below. Units are in centimeters.
[Figure: boxplot of leaf length (cm); axis labeled "leaf" with ticks at 5, 10, 15, and 20.]
(7a) Approximately what percent of the leaf lengths are less than 15cm in length?
(7b) Give a rough calculation of the interquartile range. Show your work.
(Problem 8) Suppose the probability of a newborn calf being male is 0.3; i.e., P(male)=0.3.
Also suppose n=6 calves were born, their sexes are independent, and Y is the number of calves
that are male.
(8a) What is the name of the probability distribution that describes the random variable Y?
(8b) What is the probability of the 6 calves being 2 males and 4 females; i.e., calculate P( Y=2).
(Problem 9) A study was carried out where the weight (pounds) and cholesterol levels (mg/100
ml) were compared. Of interest was whether cholesterol is associated with weight. A simple
linear regression analysis was performed by a statistician. Below is some of the Minitab output.
The regression equation is
cholesterol = - 128 + 2.03 wt

Predictor      Coef   SE Coef       T       P
Constant    -127.57     78.90   -1.62   0.130
wt           2.0320    0.4447    4.57   0.001

Regression Plot
cholesterol = -127.567 + 2.03199 wt
S = 36.8697   R-Sq = 61.6 %   R-Sq(adj) = 58.7 %
[Figure: regression plot; vertical axis cholesterol (ticks 100 to 300), horizontal axis wt (ticks 140 to 220).]
(9a) If a randomly sampled man weighed 180 pounds, using the regression analysis, what would
you expect his cholesterol to be?
(9b) For each pound increase in weight, you would expect cholesterol to
(a) Decrease about 128 mg/100ml
(b) Increase about 128 mg/100ml
(c) Increase about 2.0 mg/100ml
(d) Increase about 0.4 mg/100ml
(e) Increase about 4.6 mg/100ml
(9c) The correlation coefficient between weight and cholesterol is about:
(a) -2.0 (b) -0.75 (c) -0.06 (d) 0 (e) +0.06 (f) +0.75 (g) +2.0
(Problem 10) An April 2003 Wall Street Journal – NBC news poll interviewed 605 adults. 430
of those surveyed answered “approve” to the question, “In general, do you approve or disapprove
of the job that George W. Bush is doing as president?”
(10a) What is the population?
(10b) What is the sample?
(10c) Calculate a 95% confidence interval (conservative approach) for the proportion of
Americans who approve of Bush’s job as president.
(10d) Suppose you wanted a margin of error of ± 2%. How many people would have needed to
be surveyed?
(Problem 11) The contingency table below is chi-square test output from Minitab with some
parts deleted. It involves the student data and compares hair color against gender.
           black   blond   brown  lightbro     red     All
female         3      15       9         4       3      34
            2.96   15.28     WWW      VVVV    1.97   UUUUU

male           3      16     XXX         4       1      35
            3.04   15.72     YYY      TTTT    2.03   35.00

All            6      31      20         8       4      69
            6.00   31.00   20.00      8.00    4.00   69.00

Chi-Square = 1.218, DF = 4, P-Value = 0.875
6 cells with expected counts less than 5.0

Cell Contents:  Count
                Exp Freq
(11a) For brown-hair males, (observed) XXX= __________.
(11b) For brown-hair males, (expected) YYY=___________.
(11c) Circle the most appropriate conclusion.
(a) There is statistically significant evidence that the mean hair color of males is equal to
females (P=0.875).
(b) There is not statistically significant evidence that the mean hair color of males is equal
to females (P=0.875).
(c) There is statistically significant evidence that gender and hair color are dependent
(P=0.875).
(d) There is not statistically significant evidence that gender and hair color are dependent
(P=0.875).
(Problem 12) Suppose the random variable X is distributed according to the standard normal
distribution (mean=0, sd=1), calculate P(X > 1.3).
(Problem 13) Suppose the random variable X is normally distributed with a mean of 70 and
standard deviation of 3. Also suppose 36 X’s were sampled and their mean, X̄, calculated.
(13a) Calculate P(69 < X̄ < 71).
(13b) Do you think your answer for part (13a) would be approximately the same even if the random
variable X were not normally distributed? Explain. (Hint: think about Problem 14.)
(Problem 14) Fill in both blanks. Suppose the random variable X has any distribution with
mean μ and standard deviation σ. Then, as the size of the sample (n) gets large, the distribution
of _____________ (hint: a symbol and/or letter) will become
“approximately _________________” with mean μ and standard deviation σ/√n.
(Problem 15) Suppose 30 island foxes were captured and weighed. The mean weight was 10
pounds with a sample standard deviation of 1.5 pounds.
(15a) Calculate a 95% confidence interval for the mean weight of island foxes.
(15b) What is meant by “95% confidence interval” in this problem?
(15c) True or False: The confidence interval will typically get wider with a larger
sample size. If false, explain why the interval would get narrower.
(Problem 17) A jolly doctor claimed that rubbing the tummy lowers a person’s diastolic blood
pressure by an average of 5 mm Hg. To investigate this claim, researchers measured the blood pressure
of 81 randomly chosen people. Next, they had the people rub their tummies and then took their
blood pressures again. For each person, the researchers calculated the difference as
d = Before – After. Thus, a decrease in blood pressure would give a d>0.
The average difference (decrease) was d̄ = 6.9 with a standard deviation of s = 9.
(17a) Calculate a 95% confidence interval for the mean decrease in blood pressure.
(17b) Test the hypothesis that the decrease in blood pressure was 5 or more mm Hg using a level
of significance of 5% (α = 0.05).
(17b.i) State the null and alternative hypotheses.
(17b.ii) Calculate the appropriate test statistic.
(17b.iii) Calculate a p-value. Sketch a graph shading in the area beneath the density curve which
equals the p-value.
(17b.iv) The “p” in p-value is for probability. Explain what the probability in your p-value is
describing.
(17b.v) Should you keep or reject your null hypothesis? Explain how you reached this
conclusion.
(17b.vi) Is it possible that you made a Type I error? Explain.
(Problem 18) Let A and B be two non-disjoint events (they overlap), but neither is a subset of
the other (neither is completely inside one another). Draw a Venn Diagram. Label your circles A
and B. Shade in the region A  B c .
(Problem 19) Genetics would have you believe that the offspring of two gray sheep would have
a 25% chance of being white, 50% chance of being gray, and 25% chance of being black. Of the
200 lambs produced by gray sheep, 55 were white, 98 were gray, and 47 were black. Do the
data support the hypothesis put forth by the geneticists?
(19a) State the null and alternative hypotheses.
(19b) Calculate a test statistic.
(19c) Calculate a p-value.
(19d) State your conclusion in a sentence. (Simply saying reject or keep Ho is not adequate.)
(Problem 20) Suppose the distribution of the number of cars owned by an American family
is given by the table below.
k = number of cars      0      1      2      3      4
P(X = k), pdf        0.05   0.20   0.60   aaa=   0.05
P(X ≤ k), cdf        bbb=   bbb=   bbb=   bbb=   bbb=
(20a) Fill in aaa to complete the probability distribution function (pdf).
(20b) Fill in bbb’s to complete the cumulative distribution function (cdf).
(Problem 21) Suppose the heights of 30,000 men (randomly selected Army recruits) were
measured. The General wanted to know if the average height of Army recruits is different from
70 inches. You go into Minitab and perform a t-test on the data getting the following output.
One-Sample T: Height

Test of mu = 70 vs mu not = 70

Variable        N      Mean    StDev   SE Mean
Height      30000   70.0483   3.0131    0.0174

Variable              95.0% CI            T       P
Height      ( 70.0142, 70.0824)        2.78   0.005
(21a) State the null and alternative hypotheses.
(21b) Are the results statistically significant? Explain.
(21c) Are the results practically significant? That is, is the difference of real importance? Explain.