STAT 103 Sample Questions for the Final Exam

(SOLUTIONS ARE IN RED BOLDFACE)
1. Illinois has about 20 times as many registered voters as Wyoming. Other things being equal, a random sample of 1,800 registered voters taken
in Illinois will give results of
____ greater accuracy
XXXX about the same accuracy
____ less accuracy
____ either greater or less accuracy
than a random sample of 1,800 registered voters taken in Wyoming. Choose one option and EXPLAIN.
Accuracy (SE% = SD/sqrt[n]) is determined by the sample size, not the population size. Both states have the same sample size. See Chapter 20.
2. Los Angeles has about 4 times as many registered voters as San Diego. A simple random sample of registered voters is taken in each city to
estimate the percentage who will vote for school bonds. Other things being equal, a sample of 4,000 taken in Los Angeles will be about
______ four times as accurate
XXXX twice as accurate
______ as accurate
as a sample of 1,000 taken in San Diego. Choose one option, and say why. Accuracy (SE% = SD/sqrt[n]) is determined by sample size. n
for Los Angeles is 4 times that for San Diego and the square root of 4 is 2; hence twice as accurate. See Chapter 20.
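Questions 1 and 2 both rest on the rule that the SE for a percentage shrinks with the square root of the sample size and does not depend on the population size. A minimal Python sketch of that arithmetic follows. The box SD of 0.5 is an assumption made for illustration (the questions do not give one), and the function name is hypothetical.

```python
import math

def se_percent(sd_box, n):
    """SE for a sample percentage, in percentage points, from n draws."""
    return sd_box / math.sqrt(n) * 100

SD_BOX = 0.5  # assumed SD of the 0-1 box; not stated in the questions

# Question 1: same sample size in both states, so the same SE.
se_illinois = se_percent(SD_BOX, 1800)
se_wyoming = se_percent(SD_BOX, 1800)

# Question 2: four times the sample size halves the SE.
se_san_diego = se_percent(SD_BOX, 1000)
se_los_angeles = se_percent(SD_BOX, 4000)
accuracy_ratio = se_san_diego / se_los_angeles

print(se_illinois == se_wyoming, round(accuracy_ratio, 6))
```

Any other box SD gives the same ratio, since the SD cancels when the two SEs are divided.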
3. Using the same data, you compute a 95% confidence interval and a 99% confidence interval. Choose one option
____The intervals have the same width;
XXXX the 99% confidence interval is wider;
____the 95% confidence interval is wider;
____you can't say unless you know the sample size and standard deviation.
The multiplier for 95% is 1.96 and the multiplier for 99% is 2.576, making the 99% confidence interval wider. See Chapters 21 and 23.
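The two multipliers quoted in Question 3 can be recovered from the normal curve. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `normal_multiplier` is hypothetical.

```python
from statistics import NormalDist

def normal_multiplier(confidence):
    """z value that leaves (1 - confidence)/2 in each tail of the normal curve."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)

z95 = normal_multiplier(0.95)
z99 = normal_multiplier(0.99)

# With the same SE, the interval width is 2 * multiplier * SE,
# so the larger multiplier gives the wider interval.
print(round(z95, 2), round(z99, 3))
```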
4. New laser instruments were used to measure the speed of light; 225 measurements were taken. They averaged out to 299,775 km per sec, with
a standard deviation of 45 km/sec. Assume the Gauss model with no bias. Fill in the blanks in part [a] and answer the rest True or False. (See
Chapter 24)
a. The true speed of light is estimated as 299,775 km/sec; this estimate is likely to be off by 45/sqrt[225] = 3 km/sec or so.
F b. 299,775 ± 6 km/sec is a 95% confidence interval for the average of the 225 readings. We know the average of the 225 readings with 100%
accuracy.
T c. 299,775 ± 6 km/sec is a 95% confidence interval for the true speed of light.
F d. There is about a 95% probability that the next reading will be in the range 299,775 ± 6 km/sec. Use the SD (± 45 km/sec) here, not the SE.
F e. About 95% of the 225 readings were in the range 299,775 ± 6 km/sec. Use the SD (± 45 km/sec) here, not the SE.
T f. If another 225 readings are made, there is about a 95% probability that their average will be in the range 299,775 ± 6 km/sec.
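The numbers in Question 4 come from dividing the SD by the square root of the number of readings and then going 2 SEs either way. A short Python sketch of that arithmetic, with variable names chosen here for illustration:

```python
import math

n_readings = 225
average = 299_775      # km/sec
sd = 45                # km/sec, spread of individual readings

se_average = sd / math.sqrt(n_readings)   # likely size of the error in the average
half_width = 2 * se_average               # "2 SEs" for about 95% confidence
interval = (average - half_width, average + half_width)

print(se_average, interval)
```

The SD (45) describes single readings, while the SE (3) describes the average of all 225, which is why parts d and e call for the SD.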
5. One way of playing the Illinois Lottery Daily Game is to buy a $1 ticket and attempt to match a 3 digit number between 000 and
999. If your number comes up you win $499 (you receive $500 but your ticket costs $1); otherwise if one of the 999 losing numbers
comes up, you lose the $1 cost of your ticket. (See Chapters 16-18)
[a] Draw the box model for this game. There are 999 cards with -$1 and one with $499.
[b] The average of the box in part (a) is -$0.50 and the SD is about $16. Suppose 1,000,000 plays are made in one day. Noting that
the square root of 1,000,000 is 1,000, the expected winnings of those playing is n(AV) = -$500,000 with an SE of sqrt[n](SD) =
$16,000.
[c] Suppose 1,000,000 plays per day are made for several days. About what percentage of those days are the participants going to
lose at least $468,000? Pr(SUM < -468,000) = Pr(z < {-468000 – (-500000)}/16000) = Pr(z < 2) = 97.72%
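The box model in Question 5 can be checked directly. The sketch below computes the box average and SD from the ticket counts, then the expected value and SE for the sum of 1,000,000 draws. It is an illustration only; the variable names are not from the handout. Note that the unrounded SD (about 15.8) gives a slightly higher chance than the 97.72% obtained with the rounded SD of 16.

```python
import math
from statistics import NormalDist

tickets = [(-1, 999), (499, 1)]   # (payoff in dollars, number of tickets)
box_size = sum(count for _, count in tickets)

box_avg = sum(value * count for value, count in tickets) / box_size
box_sd = math.sqrt(
    sum(count * (value - box_avg) ** 2 for value, count in tickets) / box_size
)

plays = 1_000_000
expected_sum = plays * box_avg
se_sum = math.sqrt(plays) * box_sd

# Chance that the players' net winnings fall below -$468,000.
chance_exact_sd = NormalDist(expected_sum, se_sum).cdf(-468_000)
# Same calculation with the SD rounded to 16, as in the worked answer.
chance_rounded_sd = NormalDist(-500_000, 16_000).cdf(-468_000)

print(box_avg, round(box_sd, 2), round(chance_rounded_sd, 4))
```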
6. A box of tickets has an average of 100 and an SD of 16. Four hundred draws will be made at random with replacement from the box. Find the
chance that the average of the draws will turn out to be in the range 98 to 102. (See Chapter 23) EXP{X-bar} = AV = 100, SE{X-bar} = SD/sqrt[n]
= 16/20 = 0.80. Pr(98 < X-bar < 102) = Pr({98 – 100}/(0.80) < Z < {102 – 100}/(0.80)) = Pr(-2.5 < Z < 2.5) = 98.76%.
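The calculation in Question 6 follows a pattern that recurs in later questions: find the SE for the average, then read the chance from the normal curve. A sketch using the standard library, with a hypothetical helper name:

```python
import math
from statistics import NormalDist

def chance_average_between(box_avg, box_sd, n_draws, low, high):
    """Normal-curve chance that the average of n_draws lands between low and high."""
    se_average = box_sd / math.sqrt(n_draws)
    sampling_curve = NormalDist(box_avg, se_average)
    return sampling_curve.cdf(high) - sampling_curve.cdf(low)

chance = chance_average_between(100, 16, 400, 98, 102)
print(round(chance, 4))
```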
7. The National Survey of Salaries and Wages in Public Schools shows that the mean salary of teachers is $40,133 and the standard deviation is
about $8,000. If you took a random sample of 64 teachers you expect their total salary to be n(AV) = $2,568,512, give or take sqrt[n](SD) =
$64,000, or so. (See Chapters 16-18)
Their mean salary is expected to be AV = $40,133, give or take SD/sqrt[n] = $1000, or so. (See Chapter 23)
Find the probability that the mean salary of these 64 teachers is over $42,000. Pr(X-bar > 42,000) = Pr(Z > {42000 – 40133}/1000) = Pr(Z >
1.867) = 0.031 = 3.1%. (See Chapter 23)
Do you need to assume that public school teacher salaries follow the normal curve to answer this last question? No; the sample size n = 64 is large, so the sample mean follows the normal curve approximately.
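Question 7 uses the same box average and SD in two ways: multiplied up for the total and divided down for the mean. A brief sketch of both, with variable names chosen for illustration:

```python
import math
from statistics import NormalDist

n_teachers = 64
avg_salary = 40_133
sd_salary = 8_000

# Sum of draws: expected value n * AV, SE sqrt(n) * SD.
expected_total = n_teachers * avg_salary
se_total = math.sqrt(n_teachers) * sd_salary

# Average of draws: expected value AV, SE SD / sqrt(n).
se_mean = sd_salary / math.sqrt(n_teachers)

chance_over_42000 = 1 - NormalDist(avg_salary, se_mean).cdf(42_000)
print(expected_total, se_total, se_mean, round(chance_over_42000, 3))
```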
8. A university has 10,000 students. In order to estimate the percentage of nonsmokers, a simple random sample of 225 students is
chosen. It turns out that 198 of them are nonsmokers.
[a] Find a 95% confidence interval for the percentage of nonsmokers. pi = .88 ± (1.96)SD/sqrt[n] = .88 ± (1.96)sqrt[(.88)(1 - .88)]/sqrt[225] = .88 ± .0425 = 88% ± 4.25% or 83.75% < pi < 92.25%. (See Chapter 21)
[b] Give a symbol for the parameter being estimated The Greek letter pi.
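The interval in Question 8 can be reproduced step by step. Like the worked answer, this sketch uses the sample percentage to estimate the SD of the box and ignores any correction for sampling without replacement from 10,000 students. The variable names are illustrative.

```python
import math

nonsmokers = 198
n_sampled = 225

p_hat = nonsmokers / n_sampled                 # sample fraction of nonsmokers
sd_box = math.sqrt(p_hat * (1 - p_hat))        # bootstrap estimate of the box SD
se_fraction = sd_box / math.sqrt(n_sampled)
margin = 1.96 * se_fraction                    # 95% multiplier from the normal curve

ci_low, ci_high = p_hat - margin, p_hat + margin
print(round(p_hat, 2), round(margin, 4), round(ci_low, 4), round(ci_high, 4))
```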
9. In election years, the Bureau of Labor Statistics makes a special report on voting. In 1972 about 63% of all the people of voting age in these
households said they voted; but only 56% of the total population of voting age did in fact vote. Can the difference be explained by sampling
variability? If not, how else can it be explained? You may assume that the Bureau's sample is a simple random sample of 60,000 people.
No. The statistic 63% is over 34 SEs larger than the expected percentage 56%. The explanation is response bias (people lie). (See
Chapter 20 or 21 or 26)
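The "over 34 SEs" figure in Question 9 comes from treating 56% as the true percentage and finding the SE for a simple random sample of 60,000. A sketch of that check, with illustrative variable names:

```python
import math

n_sampled = 60_000
true_fraction = 0.56      # fraction of the voting-age population that voted
reported_fraction = 0.63  # fraction in the sample who said they voted

sd_box = math.sqrt(true_fraction * (1 - true_fraction))
se_fraction = sd_box / math.sqrt(n_sampled)
z = (reported_fraction - true_fraction) / se_fraction

print(round(se_fraction, 5), round(z, 1))
```

A gap of this many SEs is far beyond what chance error produces, which is why the answer points to bias rather than sampling variability.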
10. The lives of a certain type of light bulb follow the normal curve with a mean life span of 1000 hours and a standard deviation of 100 hours.
[a] Find the probability that one such component lasts more than 875 hours. Pr(X > 875) = Pr(Z > {875 – 1000}/100) = Pr(Z > -1.25) = .894 = 89.4% (Basic; see
Chapter 5)
[b] Find the probability that four such components all last more than 875 hours. (.894)^4 = .64 = 64% (Basic; see Chapter 13)
[c] Find the probability that at least one of four such components lasts more than 875 hours. Pr(not all fail) = 1 – Pr(all fail) = 1 – (1 - .894)^4 =
.99988 = 99.988% (Basic; see Chapter 14).
[d] Find the probability that four such components have a mean life exceeding 875 hours. Pr(X-bar > 875) = Pr(Z > {875 – 1000}/(100/sqrt[4])) = Pr(Z > -2.5) =
.9938 = 99.38% (See Chapter 23)
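The four parts of Question 10 use one normal-curve probability in three different ways, plus a separate calculation for the mean. The sketch below assumes, as the worked answers do, that the four lifetimes are independent; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mean_life, sd_life = 1000, 100
threshold = 875
n_bulbs = 4

# [a] One bulb lasts more than 875 hours.
p_one = 1 - NormalDist(mean_life, sd_life).cdf(threshold)

# [b] All four last more than 875 hours (multiplication rule).
p_all = p_one ** n_bulbs

# [c] At least one lasts more than 875 hours (complement of "all fail").
p_at_least_one = 1 - (1 - p_one) ** n_bulbs

# [d] The mean of four exceeds 875 hours (SE = SD / sqrt(n)).
se_mean = sd_life / math.sqrt(n_bulbs)
p_mean = 1 - NormalDist(mean_life, se_mean).cdf(threshold)

print(round(p_one, 3), round(p_all, 2), round(p_at_least_one, 5), round(p_mean, 4))
```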
11. A company that owns and services a fleet of cars for its sales force has found that the service lifetime of disc brake pads varies from car to car
according to a normal distribution with mean of 43,000 miles and standard deviation of 4,500 miles. A new and cheaper brand of brake pads is
installed on 36 cars. [a] If the new brand of pads lasts as long as the previous type, you would expect the average life of brake pads on the 36 cars
to be 43,000 miles give or take 4500/sqrt[36] = 750 miles. (See Chapter 23)
[b] The average life of the pads on these 36 cars turned out to be only 41,000 miles. What is the probability that the sample average life is 41,000
miles or less if the new pads last just as long as the old ones? Pr(X-bar < 41000) = Pr(Z = {OBS – EXP}/SE < {41000 – 43000}/750) = Pr(Z < -2.667)
= 0.0038 (See Chapter 23)
[c] The company takes this probability as evidence that the average lifetime of the new brand of pads is less than 43,000 miles. Is this warranted
by the facts? Explain. Yes; the P-value = 0.0038 is very small when you test the null hypothesis that the pads last 43,000 miles on average, so chance variation is an unlikely explanation.
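The P-value in Question 11 is a one-sided normal-curve area below the observed average. A sketch of the computation, with illustrative variable names:

```python
import math
from statistics import NormalDist

n_cars = 36
null_mean = 43_000     # miles, lifetime of the old pads
sd_life = 4_500        # miles
observed_mean = 41_000

se_mean = sd_life / math.sqrt(n_cars)
z = (observed_mean - null_mean) / se_mean
p_value = NormalDist().cdf(z)   # area to the left of z

print(se_mean, round(z, 3), round(p_value, 4))
```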
12. In a double blind test 16 patients were given an experimental sleeping pill one night and a placebo on another night. On the average the
patients slept longer under the effect of the sleeping pill than the placebo. The 16 patients slept an average of 1.58 hours longer, with an SD of 1.23
hours, using the sleeping pill instead of the placebo. Use the t-curve with df = 15. (See Chapter 26) Here SD+ = sqrt[16/15]SD = 1.27.
[a] Find the 95% confidence interval for the average increase in sleeping time for patients using the sleeping pill. 95% multiplier for df = 15 is 2.13.
mu = 1.58 ± (2.13)(1.27/sqrt[16]) = 1.58 ± .68 or 0.90 < mu < 2.26
[b] Find the 99% confidence interval for the average increase in sleeping time for patients using the sleeping pill. 99% multiplier for df = 15 is 2.95.
mu = 1.58 ± (2.95)(1.27/sqrt[16]) = 1.58 ± .94 or 0.64 < mu < 2.52
[c] Give a symbol for the parameter being estimated. The Greek symbol mu
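The two intervals in Question 12 differ only in the t multiplier. The sketch below takes the multipliers 2.13 and 2.95 from the worked answers rather than computing them, since Python's standard library has no t-curve; the helper name is hypothetical.

```python
import math

n_patients = 16
mean_gain = 1.58   # hours
sd = 1.23          # hours, SD of the 16 differences

sd_plus = math.sqrt(n_patients / (n_patients - 1)) * sd
se_mean = sd_plus / math.sqrt(n_patients)

def t_interval(multiplier):
    """Interval mean_gain +/- multiplier * SE, rounded to two decimals."""
    margin = multiplier * se_mean
    return (round(mean_gain - margin, 2), round(mean_gain + margin, 2))

ci95 = t_interval(2.13)   # t multiplier for df = 15, from the worked answer
ci99 = t_interval(2.95)   # t multiplier for df = 15, from the worked answer
print(round(sd_plus, 2), ci95, ci99)
```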
13. A Louis Harris Poll was taken of 1250 U.S. adults about their views of banning handgun sales. Of those sampled, 650 favored a ban. At the
5% significance level, does the poll provide sufficient evidence to conclude that a majority of the U.S. adults (i.e., more than 50%) favor banning
handgun sales? (See Chapter 26)
Null Hypothesis: pi = 50%   Alternative Hypothesis: pi > 50%   alpha = 5%
Note: 650/1250 = 0.52 = 52% and SE% = sqrt[(.50)(1 - .50)]/sqrt[1250] = 0.01414
Test statistic: Z = {OBS – EXP}/SE = {.52 - .50}/0.01414 = 1.414
Do we need to assume the Gauss model (normality of the population) to perform this test? No; the variable is dichotomous.
Table from the book to use for this problem is on page (or calculator function): A-105 (or normalcdf)
P-value: Pr(Z > 1.414) = 0.0786 (using calculator)
Circle the correct decision: XXXX Retain the Null   ____ Reject the Null
Detailed conclusion: There is insufficient evidence (P-value = 7.86% > 5%) to conclude that
more than 50% of the population favors a ban of handgun sales.
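The z-test in Question 13 uses the null value of 50% to build the SE. A sketch of the test statistic and one-sided P-value, with illustrative variable names:

```python
import math
from statistics import NormalDist

n_polled = 1250
in_favor = 650
null_fraction = 0.50
alpha = 0.05

observed_fraction = in_favor / n_polled
se_fraction = math.sqrt(null_fraction * (1 - null_fraction)) / math.sqrt(n_polled)
z = (observed_fraction - null_fraction) / se_fraction
p_value = 1 - NormalDist().cdf(z)   # one-sided: alternative is pi > 50%

reject_null = p_value < alpha
print(round(z, 3), round(p_value, 4), reject_null)
```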
14. An environmentalist group monitors the temperature rise in the water 50 yards downstream from a nuclear power station. Federal regulations
permit a rise in temperature of 3.0°F. They collect 16 water samples and obtain a mean rise of 3.2°F with a standard deviation of 1.0°F. Does this
show that the power plant exceeds government limits (or can the observed mean be explained by chance variation)? Carry out the appropriate test
of significance at the 5% level. (See Chapter 26) Here SD+ = sqrt[16/15] SD = 1.033, SE+ = SD+/sqrt[n] = .2582
Null Hypothesis: The true rise in temperature is 3.0°F (mu = 3.0)   Alternative Hypothesis: The rise is greater than 3.0°F (mu > 3.0)   alpha = 5%
Test statistic: t = {OBS – EXP}/SE = {3.2 – 3.0}/0.2582 = 0.77
Do we need to assume the Gauss model (normality of the population) to perform this test? Yes
(bootstrapping from a small quantitative sample)
Table from the book to use for this problem is on page (or calculator function): A-107 (or tcdf)
P-value: 10% < P-value < 25% (or P-value of about 22.5% using tcdf with df = 15).
Circle the correct decision: XXXX Retain the Null   ____ Reject the Null
Detailed conclusion: There is insufficient evidence (P-value of about 22.5% > 5%) to conclude that
the true rise in temperature is above 3.0°F, that is, that the power plant is in violation of federal standards.
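The t-test in Question 14 can be checked without a t table by integrating the t-curve numerically. This sketch is a stand-in for the calculator's tcdf, not a reproduction of it: it uses Simpson's rule on the standard t density, and both function names are hypothetical.

```python
import math

def t_density(x, df):
    """Height of Student's t-curve with df degrees of freedom at x."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate Pr(T > t) for t >= 0: 0.5 minus the Simpson's-rule area from 0 to t."""
    h = t / steps
    area = t_density(0, df) + t_density(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_density(i * h, df)
    return 0.5 - area * h / 3

n_samples = 16
mean_rise, sd, limit = 3.2, 1.0, 3.0   # degrees Fahrenheit

sd_plus = math.sqrt(n_samples / (n_samples - 1)) * sd
se_mean = sd_plus / math.sqrt(n_samples)
t_stat = (mean_rise - limit) / se_mean
p_value = t_upper_tail(t_stat, df=n_samples - 1)

print(round(t_stat, 2), round(p_value, 3))
```

The P-value falls between 10% and 25%, in line with the table bracket in the worked answer, so the null is retained at the 5% level.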
15. A brand of water-softener salt comes in packages marked “net weight 40 lb.” The company that makes the salt claims that the bags contain an
average of 40 lb of salt and that the standard deviation is 1.5 lb. Assume the weights follow the normal curve.
[a] Obtain the probability that the weight of one bag of water-softener salt will be 39 lb or less, if the company’s claim is true.
Pr(Z < {39 – 40}/1.5) = Pr(Z < -0.667) = .2525 (Basic; see Chapter 5)
[b] Determine the probability that the mean weight of 10 randomly selected bags of water-softener salt will be 39 lb or less, if the company’s claim
is true. Pr(Z = {OBS – EXP}/SE < {39 – 40}/(1.5/sqrt[10])) = Pr(Z < -2.108) = .01751 (See Chapter 18)
[c] If you bought one bag of water-softener salt and it weighed 39 lb, would you consider this sufficient evidence that the company’s claim is
incorrect? No. Explain your answer. The P-value = 25.25% is large.
[d] If you bought 10 bags of water-softener salt and their mean weight was 39 lb, would you consider this sufficient evidence that the company’s
claim is incorrect? Yes. Explain your answer. The P-value = 1.751% is small. (See Chapter 26)
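Question 15 contrasts the chance for a single bag with the chance for the mean of ten bags; only the spread changes. A sketch of both probabilities, with illustrative variable names:

```python
import math
from statistics import NormalDist

claimed_mean = 40.0   # lb
claimed_sd = 1.5      # lb
observed = 39.0       # lb
n_bags = 10

# One bag: the spread is the SD itself.
p_one_bag = NormalDist(claimed_mean, claimed_sd).cdf(observed)

# Mean of ten bags: the spread is the SE, SD / sqrt(n).
se_mean = claimed_sd / math.sqrt(n_bags)
p_mean_of_ten = NormalDist(claimed_mean, se_mean).cdf(observed)

print(round(p_one_bag, 4), round(p_mean_of_ten, 5))
```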
16. Batteries are tested by keeping flashlights on until the light deteriorates by 50%. When 25 such batteries were tested, the results followed a
normal curve with a mean of 20 hours and a standard deviation of 3 hours.
[a] About 95% of the batteries lasted between 14.12 and 25.88 hours (20 ± (1.96)(3) = 20 ± 5.88) (Basic; see Chapter 5)
[b] A 95% confidence interval for the mean life of batteries of this type is mu = 20 ± (2.06)(3/sqrt[25]) = 20 ± 1.24 or (18.76 < mu < 21.24) (See
Chapter 23)
[c] Give a symbol for the parameter being estimated The Greek symbol mu.
[d] Test the hypothesis that the mean life of batteries of this type is at least 22 hours. Use alpha = .05. (See Chapter 26)
H0: mu = 22, HA: mu < 22, alpha = 5%, t = {OBS – EXP}/SE = {20 – 22}/(3/sqrt[25]) = -3.33. P-value = Pr(t < -3.33)
= 0.0014 (off chart on page A-106). Since 0.0014 < 5%, reject H0.
17. A psychologist wants to study the effect of home environment on IQ scores. She selects a sample of students, gives each an IQ test, and also asks the size of
the family, mother's occupation, father's occupation, and many other questions. In this study, the following are dependent (response) variables. (See Chapter 8)
XXXX a. IQ scores;
____ b. family size;
____ c. mother's occupation;
____ d. both b and c.
18. Consider the following scores of six students on two tests in a statistics course. The means and standard deviations are given
below (You don't have to calculate them).
Student   Score on     Score on      zx     zy     (zx)(zy)
          first test   second test
------------------------------------------------------------------
1         40           65            -1.5   +0.5   -0.75
2         60           60            -0.5    0      0
3         70           55             0     -0.5    0
4         80           60            +0.5    0      0
5         70           45             0     -1.5    0
6         100          75            +1.5   +1.5   +2.25
------------------------------------------------------------------
Totals                                0      0      1.50
AVx = 70   AVy = 60   SDx = 20   SDy = 10
[a] Compute r = 1.50/6 = 0.25 (See Chapter 8)
[b] Draw the scatter plot. (not shown)
[c] Draw the regression line on the scatter plot carefully. (not shown)
[d] Find an equation of the regression line. y-hat – 60 = (0.25)(10/20)(x – 70) or y-hat = .125x + 51.25 (See Chapter 10)
[e] If another student in the class got 80 on the first test, estimate her score on the second test. y-hat = .125(80) + 51.25 = 61.25
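The correlation, regression line, and prediction in Question 18 can be reproduced from the standard units in the table. This sketch uses the averages and SDs as given in the question rather than recomputing them from the raw scores; the function name is hypothetical.

```python
# Standard units copied from the table in the question.
zx = [-1.5, -0.5, 0.0, 0.5, 0.0, 1.5]
zy = [0.5, 0.0, -0.5, 0.0, -1.5, 1.5]

av_x, av_y = 70, 60   # averages as given in the question
sd_x, sd_y = 20, 10   # SDs as given in the question

# r is the average of the products of the standard units.
r = sum(a * b for a, b in zip(zx, zy)) / len(zx)

# Regression line: slope = r * SDy / SDx, passing through the point of averages.
slope = r * sd_y / sd_x
intercept = av_y - slope * av_x

def predict_second_test(first_test_score):
    """Regression estimate of the second-test score."""
    return slope * first_test_score + intercept

print(r, slope, intercept, predict_second_test(80))
```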
19. To estimate forearm length of men, the following results were obtained for about 1000 men:
Average height = 69 inches, SD for height = 2.5 inches,
Average forearm length = 18 inches, SD = 1 inch;
Correlation coefficient = .80.
Estimate the average length of the forearms of men whose heights are (See Chapter 10)
[a] unknown 18 (use mean forearm length of all men)
[b] 69 inches 18 (the regression line goes through the point of averages (AVx,AVy) = (69,18)) or you can use the four step
method.
[c] 73 inches 19.28 (Four step method: 1. x = 73; 2. zx = {73 – 69}/2.5 = 1.6; 3. zy-hat = rzx = (.80)(1.6) = 1.28; 4. y-hat = 18 +
(1.28)(1.0) = 19.28)
[d] 66 inches 17.04 (Four step method: 1. x = 66; 2. zx = {66 – 69}/2.5 = -1.2; 3. zy-hat = rzx = (.80)(-1.2) = -0.96; 4. y-hat = 18 + (-0.96)(1.0) = 17.04)
[e] Find an equation of the regression line. y-hat – 18 = (0.80)(1.0/2.5)(x – 69) or y-hat = .32x – 4.08
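The four-step method used in Question 19 translates directly into a small function. The sketch below follows the steps as numbered in the worked answers; the function and variable names are chosen here for illustration.

```python
av_height, sd_height = 69.0, 2.5     # inches
av_forearm, sd_forearm = 18.0, 1.0   # inches
r = 0.80

def estimate_forearm(height):
    """Four-step regression estimate of average forearm length at a given height."""
    x = height                                  # step 1: the given height
    z_x = (x - av_height) / sd_height           # step 2: convert to standard units
    z_y_hat = r * z_x                           # step 3: multiply by r
    return av_forearm + z_y_hat * sd_forearm    # step 4: convert back to inches

# Equivalent slope-intercept form of the regression line.
slope = r * sd_forearm / sd_height
intercept = av_forearm - slope * av_height

for height in (69, 73, 66):
    print(height, round(estimate_forearm(height), 2))
print(round(slope, 2), round(intercept, 2))
```

Both forms give the same estimates, since the regression line passes through the point of averages with slope r times SDy/SDx.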