STATISTICS B REVIEW
S4.1
I can compute the mean of the sum given the mean and
standard deviation of each random variable in a set of
random variables.
p. 321: 1 – 9
S4.1
I can compute the variance and standard deviation of the
sum given the mean and standard deviation of each random
variable in a set of random variables.
p. 321: 1 – 9
S4.2
I can define and use the Central Limit Theorem with
sampling distributions.
p. 328 – 331: 1 – 13
S4.3
I can use the normal model to compute probabilities for the
sample mean.
p. 328 – 331: 1 – 13
S4.3
I can use the CLT to compute probabilities for the sample
mean.
p. 328 – 331: 1 – 13
S4.4
I can define and use the large sample distribution of
sample proportion to compare probabilities.
p. 349 - 353: 1 – 12
S4.4
I can define and use rules of thumb for the applicability
of the large sample distribution.
p. 349 - 353: 1 – 12
S4.5
I can use the normal model to derive the percent
confidence interval for the mean.
p. 349 - 353: 1 – 12;
p. 360 – 364: 1 – 13
S4.5
I can use the CLT to derive the percent confidence
interval for the mean
p. 349 - 353: 1 – 12;
p. 360 – 364: 1 – 13
S4.6
I can compute control limits for commonly used control
charts.
p. 268-271: 12 – 15
S4.6
I can compute control limits to assess whether a
process is out of control.
p. 268-271: 12 – 15
S5.1
I can compute bias, variance and mean squared error of
estimators of the mean and proportion.
p. 349 - 353: 1 – 12;
p. 360 – 364: 1 – 13
S5.2
I can understand the logic of confidence intervals.
p. 349 - 353: 1 – 12;
S5.2
I can define the meaning of confidence interval.
p. 349 - 353: 1 – 12;
S5.5
I can compute the sample size required for a fixed
confidence level and interval width for confidence
intervals for mean and proportions
p. 378- 381: 1 – 13
S5.3
S5.4
S5.5
I can compute and interpret confidence intervals for
the difference between two means (in both the
paired and unpaired setting) when the standard
deviation is unknown, using the t distribution.
I can compute and interpret (large samples)
confidence intervals for the difference between two
proportions, using the normal distribution.
I can compute the sample size required for a fixed
confidence level and interval width for confidence
intervals for mean and proportions
p. 460 – 464: 1 – 13
p. 476 – 478: 1 - 13
p. 488 – 492: 1 – 10
p. 501 – 505: 1 – 10
p. 378- 381: 1 – 13
S6.1
I can explain what null hypothesis, alternative
hypothesis, p-value, Type I error, Type II error and
power mean.
I can explain the logic of significance testing.
S6.2
S6.3
I can, assuming a normal model and known
standard deviation, carry out a significance test for
a single mean, with emphasis on understanding the
computation and interpretation of the p-value.
I can carry out (large samples) significance tests for
one proportion, with emphasis on proper
interpretation of results.
I can carry out (large samples) significance tests for
the difference of two proportions, with emphasis on
proper interpretation of results.
p. 399-401: 1-9
p. 399-401: 1-9
p. 412-415: 1 – 15;
p. 423-424: 1 – 9
p. 371 – 373:1 – 14
p. 501 – 505: 1 – 10
S6.3
S6.4
S6.5
I can carry out (large samples) significance tests for
one proportion, with emphasis on proper
interpretation of results.
I can carry out significance tests for the difference
of two means (paired and unpaired) using the t
distribution, with emphasis on proper interpretation
of results.
I can carry out significance tests for one mean
using the t distribution, with emphasis on proper
interpretation of results.
I can carry out chi-squared significance tests of
independence with emphasis on proper
interpretation of results.
I can carry out chi-squared significance tests of
goodness of fit with emphasis on proper
interpretation of results.
I can carry out chi-squared significance tests of
homogeneity with emphasis on proper
interpretation of results.
S6.7
S7.1
I can demonstrate, in the context of specific studies,
the understanding that a result can be statistically
significant while of insignificant practical
importance.
I can know the statistical model for regression,
including linearity.
p. 440-443: 1 – 15
p. 460 – 464: 1 – 13
p. 476 – 478: 1 - 13
p. 488 – 492: 1 – 10
p. 433-436: 1 – 13
p. 524 – 525: 1 – 8
p. 532 – 534: 1 – 10
w.s.12.1
p. 399-401: 1-9
p. 549 – 551: 1 – 9
I can know the statistical model for regression,
including normality of errors
p. 549 – 551: 1 – 9
I can know the statistical model for regression,
including constancy of error variance
p. 549 – 551: 1 – 9
I can compute a confidence interval for the slope of
a regression line using the t distribution.
S7.2
S7.3
S8.1
S8.2
S8.3
S8.3
S8.4
I can interpret a confidence interval for the slope of
a regression line using the t distribution.
I can test hypotheses about the slope of a regression
line, with emphasis on interpretation of results.
I can demonstrate knowledge of the assumptions
required for all of the inferential procedures
(confidence intervals and significance test).
I can, in the context of specific studies, recognize
aspects of study design that either support or offer
evidence against required assumptions.
I can demonstrate knowledge of the possible effects
of incorrect assumptions (i.e., improperly specified
models) on the inferential procedures.
I can demonstrate knowledge of the possible effects
of incorrect assumptions (i.e., improperly specified
models) on the robustness of inferential procedures
to departures from specified assumptions.
I can show in context an understanding that
statistical models are approximations to reality.
I can show in context an understanding that care
should be exercised in assigning too much
precision to measures such as confidence levels or
p-values.
p. 549 – 551: 1 – 9
p. 549 – 551: 1 – 9
p. 560 – 562: 1 - 6
p. 399-401: 1-9
p. 399-401: 1-9
p. 399-401: 1-9
p. 399-401: 1-9
p. 433-436: 1 – 13
p. 399-401: 1-9
p. 399-401: 1-9
1. (S4.1) Give two examples of sample statistics: ______________________
2. (S4.1) State the Central Limit Theorem (Section 8.2)
3. (S4.1) Given σ = 2.35 and n = 13, find the standard error of the mean.
4. (S4.2) Jimmy is the manager of a bottling plant in Denver. He has observed that the amount of orange juice in
each 24 oz bottle is normally distributed, with a mean of 24.3 ounces and a standard deviation of 0.7 ounces. Find
the probability that, if a customer buys four bottles of juice, the mean of the four will be greater than 24 ounces.
Round your answer to 4 decimal places.
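A problem like this can be checked numerically. The sketch below assumes Python 3.8 or later and its standard-library `statistics.NormalDist`; the variable names are illustrative only. It uses the fact that the mean of n = 4 bottles has standard error σ/√n, and it also shows the value obtained when z is first rounded to two decimals, as it would be for a printed z-table lookup.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 24.3, 0.7, 4          # values from problem 4
se = sigma / sqrt(n)                 # standard error of the sample mean

z = (24 - mu) / se                   # z-score of the 24 oz cutoff
p_exact = 1 - NormalDist().cdf(z)    # P(sample mean > 24) with unrounded z
p_table = 1 - NormalDist().cdf(round(z, 2))  # same area, z rounded as in a z-table

print(round(p_exact, 4), round(p_table, 4))
```

The two results differ only in the third and fourth decimal places, which is the usual gap between software output and table lookups.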
5. (S4.3) The number of boxes of Girl Scout cookies sold by each of the Girl Scouts in a Midwestern city has a
distribution which is approximately normal with mean μ = 75 boxes and standard deviation σ = 30 boxes. Find the
probability that the sample mean (x̄) number of boxes of cookies sold by a random sample of 36 Girl Scouts is
between 60 and 90 boxes.
6. (S4.4) The Smith twins Sally and Billy are in different math classes at Freedom High. On their final
exams, Sally scored 70 on a test with a mean of 80 and a standard deviation of 10 and Billy scored 62
with a mean of 71 and a standard deviation of 11. Who scored better?
7. (S4.4) What size does the sample have to be in order to use the Student's t distribution?
8. (S4.5) Given the confidence interval, 6.71 < µ < 7.23, and that s = 1.15 and n = 75, find the confidence
level, c.
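One possible way to work backwards from an interval to its confidence level is sketched here. It assumes the large-sample normal approximation (n = 75) rather than the t distribution, and the names `margin` and `z_c` are illustrative. The half-width E satisfies E = z_c · s/√n, so z_c can be solved for and converted to a central area.

```python
from math import sqrt
from statistics import NormalDist

lower, upper = 6.71, 7.23   # interval from problem 8
s, n = 1.15, 75

margin = (upper - lower) / 2          # E, half the interval width
z_c = margin / (s / sqrt(n))          # critical value implied by E = z_c * s / sqrt(n)
c = 2 * NormalDist().cdf(z_c) - 1     # central area between -z_c and +z_c

print(round(z_c, 2), round(c, 2))
```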
9. (S5.2) What does it mean when a confidence interval, such as 3.2 < μ < 3.8, is computed at a
90% level?
10. (S5.3) When computing a 95% confidence interval for the given information, what item would you
find first?
11. (S5.4) A random sample of 100 felony trials in a large city in the Midwest shows the mean waiting
time between arrest and trial is 173 days with standard deviation 28 days. Find a 99% confidence interval
for the mean time interval between arrest and trial.
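A minimal sketch of the large-sample interval x̄ ± z_c · s/√n for this problem follows, assuming Python's standard-library `NormalDist` for the critical value; the variable names are illustrative. A printed table value of 2.58 instead of the computed critical value widens each end by roughly 0.01.

```python
from math import sqrt
from statistics import NormalDist

x_bar, s, n = 173, 28, 100   # values from problem 11
confidence = 0.99

z_c = NormalDist().inv_cdf((1 + confidence) / 2)   # two-sided critical value
margin = z_c * s / sqrt(n)                         # maximal error of estimate E
ci = (x_bar - margin, x_bar + margin)

print(round(ci[0], 2), round(ci[1], 2))
```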
12. (S5.5) The Director of a Museum would like to know what fraction of the museum associates make
purchases through the gift shop catalogue.
(a) If no preliminary study is done, how large a sample must be taken if the director is to say with
90% confidence that the sample estimate is within 2% of the population proportion?
(b) A preliminary study showed that out of 60 associates, 12 have used the gift shop catalogue.
What size sample does the director need in order to say with 90% confidence that the sample
estimate is within 2% of the population proportion?
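The sample-size formula for a proportion, n = p(1 − p)(z_c/E)², can be sketched as below. The helper `sample_size_proportion` is a hypothetical name, and z_c = 1.645 is the usual table value for 90% confidence, entered by hand as an assumption. With no preliminary estimate, p = 0.5 gives the most conservative (largest) n.

```python
from math import ceil

def sample_size_proportion(z_c, margin, p_hat=0.5):
    """Smallest n with z_c * sqrt(p_hat * (1 - p_hat) / n) <= margin."""
    return ceil(p_hat * (1 - p_hat) * (z_c / margin) ** 2)

z_90 = 1.645                                        # table value for 90% confidence (assumed)
n_a = sample_size_proportion(z_90, 0.02)            # (a) no preliminary estimate
n_b = sample_size_proportion(z_90, 0.02, 12 / 60)   # (b) preliminary p_hat = 0.2
print(n_a, n_b)
```

The result is always rounded up, since rounding down would leave the margin of error slightly larger than requested.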
13. (S5.5) A random sample of 40 Salt Lake City teachers showed the standard deviation of teaching
experience to be 5.3 years. How many more teachers should be included in the sample to have 95%
confidence that the sample mean number of years teaching experience is within six months of the
population mean?
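For a mean, the corresponding formula is n = (z_c · s / E)². The sketch below assumes the table value z_c = 1.96 for 95% confidence and treats six months as E = 0.5 year; the variable names are illustrative. Because 40 teachers are already in the sample, only the difference needs to be added.

```python
from math import ceil

s, margin = 5.3, 0.5      # years; six months = 0.5 year
z_95 = 1.96               # table value for 95% confidence (assumed)
already_sampled = 40

n_required = ceil((z_95 * s / margin) ** 2)          # total sample size needed
additional = max(n_required - already_sampled, 0)    # teachers still to be added
print(n_required, additional)
```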
14. (S6.1) Based on the null hypothesis below, identify whether the result is a Type I error, a Type II
error, or not an error.
H0 : There is a smoke alarm in the kitchen.
Result: The smoke alarm doesn’t go off when there is a fire in the kitchen.
15. (S6.1) What are your two choices in terms of H0 during hypothesis testing?
16. (S6.2) Killer bees have migrated into this country. There is fear that they will spread across the nation.
However, they cannot survive in cold climates. It is thought that they cannot tolerate temperatures below
36º F. To test this claim a random sample of 9 killer bee hives were subjected to colder and colder
temperatures until they died. The temperatures at which the hives died were recorded. The mean
temperature was 37 º F with standard deviation 4 º F. Assuming that the killing temperature level is
normally distributed, test the claim that the mean killing temperature is different from 36 º F. Use a 1%
level of significance.
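The test statistic for a one-sample t test is t = (x̄ − μ₀)/(s/√n). The sketch below computes it for this problem; the critical value is copied from a t-table (two tails, α = 0.01, 8 degrees of freedom) rather than computed, since the Python standard library has no t distribution. Variable names are illustrative.

```python
from math import sqrt

x_bar, mu_0, s, n = 37, 36, 4, 9   # values from problem 16
t_stat = (x_bar - mu_0) / (s / sqrt(n))

# Two-tailed critical value for alpha = 0.01 with n - 1 = 8 degrees of
# freedom, copied from a t-table rather than computed here.
t_crit = 3.355
reject_h0 = abs(t_stat) > t_crit
print(t_stat, reject_h0)
```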
17. (S6.3) Of a random sample of 150 American adults, 69 claim that they prefer drinking regular
soft drinks to drinking diet soft drinks, and of a random sample of 130 American adults, 61 claim
that they prefer drinking diet soft drinks to drinking regular soft drinks. Would this mean that the
proportion of American adults that drink regular soft drinks is less than that of those who drink diet soft
drinks? Use a 5% level of significance.
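A hedged sketch of the pooled two-proportion z test for these counts is given below. It assumes a left-tailed test with the table critical value −1.645, and the variable names are illustrative. Rounding intermediate values by hand can shift the third decimal of z slightly.

```python
from math import sqrt

x1, n1 = 69, 150   # first sample from problem 17
x2, n2 = 61, 130   # second sample

p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)                        # pooled proportion under H0
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))  # standard error of p1 - p2
z = (p1 - p2) / se

z_crit = -1.645          # left-tailed critical value at alpha = 0.05 (table value)
reject_h0 = z < z_crit
print(round(z, 2), reject_h0)
```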
18. (S6.4) In order to test how the environment affects the ability of a child to understand concepts better,
an experiment was conducted on six pairs of identical twins. In the given table, row B represents the
writing age in months of the randomly selected member of a twin pair in an experimental group whereas
row A represents the writing age in months of the other member of the twin pair who was not part of the
experimental group. Using a 1% level of significance, test if there is a difference in the writing ages of
the two groups. Then, find the P-Value.
        Twin 1  Twin 2  Twin 3  Twin 4  Twin 5  Twin 6
Row A   59      61      65      60      66      58
Row B   70      72      68      61      65      69
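A paired test works on the differences within each twin pair. The sketch below, with illustrative variable names, computes t = d̄/(s_d/√n) using the standard-library `statistics` module; the critical value 4.032 is a t-table entry (two tails, α = 0.01, 5 degrees of freedom) entered by hand.

```python
from math import sqrt
from statistics import mean, stdev

row_a = [59, 61, 65, 60, 66, 58]   # twin not in the experimental group
row_b = [70, 72, 68, 61, 65, 69]   # twin in the experimental group

d = [a - b for a, b in zip(row_a, row_b)]      # paired differences A - B
t_stat = mean(d) / (stdev(d) / sqrt(len(d)))   # stdev is the sample standard deviation

t_crit = 4.032   # two-tailed t-table value, alpha = 0.01, df = 5
reject_h0 = abs(t_stat) > t_crit
print(round(t_stat, 2), reject_h0)
```

Taking the differences as B − A instead only flips the sign of t and does not change the two-tailed conclusion.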
19. (S6.5) According to recent marketing research, the distribution of the ages of people that buy
compact discs in the U.S. is listed in the table below. A random sample of 750 people is also presented in
the table. At a 5% significance level, does it appear that the distribution of the ages of the customers in the
report fits that from customers in the U.S. as a whole? ONLY calculate χ².
Age in years    Percent of Customers    Number of Customers
                in the Report           in the Report
Less than 20    40.1                    97
21-30           21.3                    374
31-40           19.0                    65
More than 40    19.6                    214
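The goodness-of-fit statistic is χ² = Σ (O − E)²/E, where each expected count E is the sample size times the stated percentage. A minimal sketch with illustrative variable names:

```python
percents = [40.1, 21.3, 19.0, 19.6]   # stated distribution by age group
observed = [97, 374, 65, 214]         # sample counts by age group

n = sum(observed)                                 # 750 people sampled
expected = [n * p / 100 for p in percents]        # counts expected under H0
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi_sq, 2))
```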
20. (S6.7) The Magic Dragon Cigarette Company claims that their cigarettes contain an average of only
10 mg of tar. A random sample of 100 Magic Dragon cigarettes shows the average tar content to be 11.5
mg with standard deviation 4.5 mg. Use a 1% level of significance.
Which statement correctly identifies this situation?
(a) Too much emphasis on precision
(b) Too little emphasis on precision
(c) Incorrect assumption that the sample is large enough
(d) Incorrect assumption that the sample is small enough
21. (S7.1 & S7.2) In the following table, x represents the number of questions David has answered more
than Sally in the school exam and y represents the percent of times David gets a higher grade than Sally.
x 13 16 14 19 15 12 22
y 13 14 20 12 22 19 23
(a) Find the linear regression equation for the given data. You may use a calculator. Round to the
nearest thousandth.
(b) Find the standard error of estimate for the data set. Round to the nearest thousandth.
(c) Using the data and linear regression line from part (a), find an 85% confidence interval for y when x =
17. Round to the nearest hundredth.
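Parts (a) and (b) can be checked without a calculator's regression menu. The sketch below, with illustrative variable names, computes the least-squares slope and intercept from the sums of squares and then the standard error of estimate, taken here as √(SSE/(n − 2)). Small differences in the last decimal can appear depending on when rounding is applied.

```python
from math import sqrt

x = [13, 16, 14, 19, 15, 12, 22]
y = [13, 14, 20, 12, 22, 19, 23]
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Standard error of estimate: root of residual sum of squares over n - 2.
sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
s_e = sqrt(sse / (n - 2))
print(round(slope, 4), round(intercept, 3), round(s_e, 3))
```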
22. (S7.3) In the following table, x represents the average percentage of change in prices of consumer
goods and y represents the average percentage change in expenditure of a family. Use a 2.5% level of
significance to test the claim that   0.
x 3.2 3.9 2.6 3.7 3.5 3.3 3.1
y 12.4 17.6 13.4 17.6 14.4 9.2 5
23. (S 7.3) In the following table, x represents the per capita sales of a wholesale outlet in thousands and y
represents the per capita income in thousands of dollars of the outlet. Use a 5% level of significance to
estimate the P-value for the claim that   0.
x 14.6 15.1 15.3 15.8 16.4 16.7 16.8
y 19.1 24.2 18.5 23.8 19.6 25.2 24.9
24. (S8.1) Which confidence interval is longer – 80% or 95%? Why? If the confidence interval is longer,
then what can be said about the x values (sample statistics) in terms of the mean ( x )?
25. (S8.2) In a high school study of test preparation methods it was shown that method 2 produced 1%
better results over method 1. At α = .01, test results indicate it is statistically significant. Method 2 is
expensive to purchase. Should the district buy the new test preparation kit?
26. (S8.3) Incorrect assumptions can affect the robustness of inferential procedures. Which measure of
central tendency is most affected by outliers?
27. (S8.4) What is the test statistic (tailed test) needed to determine if you are testing an item being
different than the stated statistic?
Key to Benchmark 3 review
1. x̄, s, s²
2. As n grows the distribution approaches a normal distribution
3. 1.277
4. .8051
5. 99.74%
6. Sally = -1: Billy = -.8181 so Billy is better
7. less than 30
8. 95%
9. 90% of the time the true mean will lie between 3.2 and 3.8
10. C = 95%
11. 90% confident
12. a) 1692; b) 1083
13. 165.78 < μ < 180.22
14. Type I
15. Reject null or accept null
16. T=.75; do not reject; p =
17. Z= -0.151; do not reject
18. T = -2.615; .02 < p < .05
19. Chi square = 498.06
20. A
21. Y = 14.484 + 0.1947x; E = 8.924; 8.87 < y < 26.72
22. T-value does not lie within the critical region, so do not reject
23. .02 < p < .05
24. 95%; To be more sure you include a wider range; you can be 95% sure the sample statistic will
be in the range
25. No
26. Mean
27. Not equal to