Stat 101: Chapters 16-20. Note: The Final Is Cumulative CH 1-20

Important Formulas and Concepts
1
1
Chapter 16
1.1
Definitions
1. Standard Error
When we estimate the standard deviation of a sampling distribution using statistics
found from the data, the estimate is called a standard error. [$SE(\hat{p}) = \sqrt{\hat{p}(1-\hat{p})/n}$.]
2. Confidence Interval (CI)
A level C confidence interval for a model parameter is an interval of values usually of
the form Estimate ± Margin of Error found from data in such a way that C% of all
random samples will yield intervals that capture the true parameter value.
3. One Proportion z-interval
A confidence interval for the true value of a proportion. The confidence interval is
$\hat{p} \pm z^*_{1-\alpha/2}\, SE(\hat{p})$, where $z^*$ is a critical value from the standard normal model corresponding
to the specified confidence level.
4. Margin of Error (MOE)
In a confidence interval, the extent of the interval on either side of the observed statistic
value. It is typically the product of a critical value from the sampling distribution and
a standard error from the data. A small MOE corresponds to a confidence interval that
pins down the parameter precisely. A large MOE corresponds to a confidence interval
that gives relatively little information about the estimated parameter.
$MOE_{proportion} = z^* \sqrt{\hat{p}(1-\hat{p})/n}$.
5. Critical Value
The number of standard errors to move away from the mean of the sampling distribution to correspond to the specified level of confidence. The critical value for a normal
sampling distribution, denoted $z^*$, is usually found from a table or technology.
1.2
Some z values (Critical Values) for Confidence Intervals
CI:       z*
90% CI    1.645
95% CI    1.96
99% CI    2.576
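The critical values in this table can also be recovered with technology. The following is a minimal sketch using Python's standard library; the function name `z_critical` is an assumption made for illustration and is not part of the textbook or this guide.

```python
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Two-sided critical value z* for a confidence level such as 0.95.

    A level-C interval leaves (1 - C)/2 in each tail, so z* is the
    standard normal quantile at 1 - (1 - C)/2.
    """
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} CI: z* = {z_critical(level):.3f}")
```

Rounded to three decimals, the printed values should agree with the table entries above.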
This version: November 21, 2015, by Jennifer Pajda-De La O. May not include all things that could
possibly be tested on. To be used as an additional reference to studying all Chapters 16-20. Most definitions,
formulas, and selected problems come from Intro Stats by De Veaux, Velleman and Bock, 4th edition,
published by Pearson.
2
Extra Information
Review any and all notes and supplementary materials. It may be the case that something
was accidentally omitted from this study guide. Also, review any problems that may have
been discussed in class as not all example problems may have been provided here.
3
Chapter 17
1. Hypothesis
A model or proposition that we adopt in order to test.
2. Null Hypothesis (H0 )
The claim being assessed in a hypothesis test that states “no change from the traditional value,” “no effect”, “no difference”, or “no relationship”. For a claim to be a
testable null hypothesis, it must specify a value for some population parameter that
can form the basis for assuming a sampling distribution for a test statistic.
3. Alternative Hypothesis (HA )
The alternative hypothesis proposes what we should conclude if we reject the null
hypothesis.
4. P-value
The probability of observing a value for a test statistic at least as far from the hypothesized value as the statistic value actually observed if the null hypothesis is true.
A small p-value indicates either that the observation is improbable or that the probability calculation was based on incorrect assumptions. The assumed truth of the null
hypothesis is the assumption under suspicion.
5. One-proportion Z-test
A test of the null hypothesis that the proportion of a single sample equals a specified
value, $H_0: p = p_0$, by referring the statistic $z = (\hat{p} - p_0)/SD(\hat{p})$ to a standard Normal model.
6. Effect Size
The difference between the null hypothesis value and the true value of a model parameter.
7. Two-sided (Tailed) Alternative
An alternative hypothesis is two-sided ($H_A: p \neq p_0$) when we are interested in deviations in either direction away from the hypothesized parameter value.
8. One-sided (Tailed) Alternative
An alternative hypothesis is one-sided ($H_A: p > p_0$ or $H_A: p < p_0$) when we are
interested in deviations in only one direction away from the hypothesized parameter
value.
4
Chapter 18
1. Student’s t distribution
A family of distributions indexed by its degrees of freedom. The t-models are unimodal,
symmetric, and bell shaped, but have fatter tails and a narrower center than the
Normal model. As the degrees of freedom increase, t-distributions approach the Normal
distribution.
2. Degrees of Freedom for Student’s t distribution (df)
For the t-distribution, the degrees of freedom are equal to n − 1, where n is the sample
size.
3. One-sample t-interval for the mean
This is the confidence interval for the mean. It is given by $\bar{y} \pm t^*_{n-1}\, SE(\bar{y})$, where
$SE(\bar{y}) = s/\sqrt{n}$. The critical value $t^*_{n-1}$ depends on the particular confidence level that you
specify and on the number of degrees of freedom, $n - 1$.
4. One-sample t-test for the mean
This is the hypothesis test. It tests the hypothesis $H_0: \mu = \mu_0$ using the statistic
$t_{n-1} = (\bar{y} - \mu_0)/(s/\sqrt{n})$.
5
Chapter 19
1. Statistically significant
When the p-value falls below the alpha level, we say that the test is “statistically
significant” at that alpha level.
2. Alpha level
The threshold p-value that determines when we reject a null hypothesis. If we observe
a statistic whose p-value based on the null hypothesis is less than α, we reject that
null hypothesis.
3. Significance level
The alpha level is also called the significance level, most often in a phrase such as a
conclusion that a particular test is “significant at the 5% significance level.”
4. Critical value
The value in the sampling distribution model of the statistic whose p-value is equal to
the alpha level. The critical value is often denoted with an asterisk, as $z^*$ and $t^*$.
5. Type I Error
The error of rejecting a null hypothesis when in fact it is true (also called a false
positive). The probability of a Type I Error is α.
6. Type II Error
The error of failing to reject a null hypothesis when in fact it is false (also called a false
negative). The probability of a Type II Error is β.
7. β
The probability of a Type II Error is commonly denoted β and depends on the effect
size.
8. Power
The probability that a hypothesis test will correctly reject a false null hypothesis is
the power of the test. To find the power, we must specify a particular alternative
parameter value as the “true” value. For any specific value in the alternative, the
power is 1 − β.
9. Effect Size
The difference between the null hypothesis value and the true value of a model parameter.
6
Chapter 20
1. Sampling distribution of the difference between two proportions
The sampling distribution of $\hat{p}_1 - \hat{p}_2$ is, under appropriate assumptions, modeled by
a Normal model with mean $\mu = p_1 - p_2$ and standard deviation
$SD(\hat{p}_1 - \hat{p}_2) = \sqrt{p_1(1-p_1)/n_1 + p_2(1-p_2)/n_2}$.
2. Two-proportion z-interval
This is the confidence interval. A two-proportion z-interval gives a confidence interval
for the true difference in proportions, $p_1 - p_2$, in two independent groups. The confidence
interval is $(\hat{p}_1 - \hat{p}_2) \pm z^* \times SE(\hat{p}_1 - \hat{p}_2)$, where $z^*$ is the critical value from the standard Normal
model corresponding to the specified confidence level.
3. Pooling
Data from two or more populations may sometimes be combined, or pooled, to estimate
a statistic (typically a pooled variance) when the estimated value is assumed to be the
same in both populations. The resulting larger sample size may lead to an estimate
with lower sample variance. However, pooled estimates are appropriate only when the
required assumptions are true.
4. Two-proportion z-test
This is the hypothesis test. Test the null hypothesis $H_0: p_1 - p_2 = 0$ by comparing
the statistic $z = (\hat{p}_1 - \hat{p}_2)/SE_{pooled}(\hat{p}_1 - \hat{p}_2)$ to the standard normal model.
5. Two-sample t-interval for the difference between means
A confidence interval for the difference between the means of two independent groups
is found as $(\bar{y}_1 - \bar{y}_2) \pm t^*_{df} \times SE(\bar{y}_1 - \bar{y}_2)$. Here, $SE(\bar{y}_1 - \bar{y}_2) = \sqrt{s_1^2/n_1 + s_2^2/n_2}$,
and the number of degrees of freedom is given by a special formula.
6. Two-sample t-test for the difference between means
A hypothesis test for the difference between the means of two independent groups. It
tests the null hypothesis $H_0: \mu_1 - \mu_2 = \Delta_0$, where the hypothesized difference $\Delta_0$ is
almost always 0. This uses the statistic
$t_{df} = \frac{(\bar{y}_1 - \bar{y}_2) - \Delta_0}{SE(\bar{y}_1 - \bar{y}_2)}$,
with the degrees of freedom given by a special formula.
7. Pooled t-test
A hypothesis test for the difference in the means of two independent groups when we
are willing and able to assume that the variances of the groups are equal. It tests the
null hypothesis $H_0: \mu_1 - \mu_2 = \Delta_0$, where the hypothesized difference $\Delta_0$ is almost
always 0. This uses the statistic
$t_{df} = \frac{(\bar{y}_1 - \bar{y}_2) - \Delta_0}{SE_{pooled}(\bar{y}_1 - \bar{y}_2)}$,
with degrees of freedom $(n_1 - 1) + (n_2 - 1)$.
7
Confidence Interval Creation and Hypothesis Testing Summary
7.1
1-Proportion
Proportion - always use p
• Confidence Interval Creation
CI: $\hat{p} \pm \underbrace{z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}}_{MOE}$
$z^*$ = critical value
Table of critical values for $z^*$ for Confidence Intervals:
CI:    z*
90%    1.645
95%    1.96
96%    2.054
98%    2.326
99%    2.576
• Hypothesis Testing
Step 1: Write down your hypothesis
$H_0: p = p_0$
$H_A: p < p_0$ or $p > p_0$ or $p \neq p_0$
Step 2: Calculate your test statistic
$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$
Step 3: Calculate the p-value
Step 4: State your conclusion. If p-value ≤ α, (usually α = 0.05), then Reject H0 . If
p-value is > α, then Do Not Reject H0 .
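The four steps above can be sketched in Python. This is a minimal illustration, not part of the original guide: the function name `one_proportion_z_test` and its `alternative` argument are assumptions, and `statistics.NormalDist` stands in for a z table or calculator.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alternative="two-sided"):
    """Return (z, p_value) for H0: p = p0.

    alternative is "less", "greater", or "two-sided" (assumed labels).
    The standard deviation uses p0, because the null hypothesis is
    assumed true when computing the test statistic.
    """
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    std_normal = NormalDist()
    if alternative == "greater":
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return z, p_value


# Step 4: compare the p-value to alpha (here 0.05).
z, p_value = one_proportion_z_test(420, 800, 0.5, alternative="greater")
decision = "Reject H0" if p_value <= 0.05 else "Do Not Reject H0"
print(f"z = {z:.2f}, p-value = {p_value:.4f}: {decision}")
```

Because the code does not round z before looking up the probability, its p-value can differ in the third decimal place from a hand calculation that rounds z to two decimals first.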
7.2
1-Sample Mean
Sample Mean - always use x or y
The degrees of freedom is given by df = n − 1.
• Confidence Interval Creation
CI: $\bar{y} \pm \underbrace{t^*_{n-1} \frac{s}{\sqrt{n}}}_{MOE}$
$t^*_{n-1}$ = critical value
Use Appendix D Table T to determine the critical values of t.
• Hypothesis Testing
Step 1: Write down your hypothesis
$H_0: \mu = \mu_0$
$H_A: \mu < \mu_0$ or $\mu > \mu_0$ or $\mu \neq \mu_0$
Step 2: Calculate your test statistic
$t_{n-1} = \frac{\bar{y} - \mu_0}{s/\sqrt{n}}$
Step 3: Calculate the p-value
Step 4: State your conclusion. If p-value ≤ α, (usually α = 0.05), then Reject H0 . If
p-value is > α, then Do Not Reject H0 .
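The one-sample interval and test statistic above can be sketched as follows. This is a hedged illustration: the function names are assumptions, and the critical value `t_star` is passed in by hand (for example from Table T), because Python's standard library has no t-distribution.

```python
from math import sqrt


def t_interval(y_bar, s, n, t_star):
    """Confidence interval y_bar +/- t* * s / sqrt(n).

    t_star must be looked up for df = n - 1 and the chosen confidence level.
    """
    moe = t_star * s / sqrt(n)
    return y_bar - moe, y_bar + moe


def t_statistic(y_bar, s, n, mu0):
    """Test statistic t_{n-1} = (y_bar - mu0) / (s / sqrt(n))."""
    return (y_bar - mu0) / (s / sqrt(n))


# Example with n = 33, sample mean 8.2, s = 3.3, and t* = 1.694 (90%, df = 32).
low, high = t_interval(8.2, 3.3, 33, 1.694)
print(f"90% CI: ({low:.3f}, {high:.3f})")
```

The p-value for the test statistic still has to come from a t table or a statistics package.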
7.3
Difference of Proportions
Difference of Proportions - always use p1 − p2
• Confidence Interval Creation
CI: $(\hat{p}_1 - \hat{p}_2) \pm \underbrace{z^* \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}}_{MOE}$
$z^*$ = critical value
See Section ?? for examples of the critical values for z ∗ .
• Hypothesis Testing
Step 1: Write down your hypothesis
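$H_0: p_1 - p_2 = 0$
$H_A: p_1 - p_2 < 0$ or $p_1 - p_2 > 0$ or $p_1 - p_2 \neq 0$
Step 2: Calculate your test statistic
$z = \frac{\hat{p}_1 - \hat{p}_2}{SE_{pooled}(\hat{p}_1 - \hat{p}_2)}$,
$SE_{pooled}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_{pooled}(1-\hat{p}_{pooled})}{n_1} + \frac{\hat{p}_{pooled}(1-\hat{p}_{pooled})}{n_2}}$,
$\hat{p}_{pooled} = \frac{\text{Number of Successes in Group 1} + \text{Number of Successes in Group 2}}{n_1 + n_2}$.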
Step 3: Calculate the p-value
Step 4: State your conclusion. If p-value ≤ α, (usually α = 0.05), then Reject H0 . If
p-value is > α, then Do Not Reject H0 .
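The pooled two-proportion test above can be sketched in Python. This is an illustrative sketch under the stated formulas; the function name `two_proportion_z_test` and the `alternative` labels are assumptions, not notation from the textbook.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2, alternative="two-sided"):
    """Return (z, p_value) for H0: p1 - p2 = 0 using the pooled SE.

    x1, x2 are the numbers of successes; n1, n2 are the sample sizes.
    """
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)
    se_pooled = sqrt(p_pooled * (1 - p_pooled) / n1
                     + p_pooled * (1 - p_pooled) / n2)
    z = (p1_hat - p2_hat) / se_pooled
    std_normal = NormalDist()
    if alternative == "greater":
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return z, p_value


# 200 of 500 in group 1 versus 230 of 600 in group 2, HA: p1 - p2 > 0.
z, p_value = two_proportion_z_test(200, 500, 230, 600, alternative="greater")
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

Note that pooling is used only for the hypothesis test; the confidence interval in this section uses the unpooled standard error.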
7.4
Difference of Means - 2 Independent Groups; Any type of
Variance
Difference of Means - always use x1 − x2 or y 1 − y 2
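The degrees of freedom is given by (round down to the nearest integer)
$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1-1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2-1}\left(\frac{s_2^2}{n_2}\right)^2}$.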
• Confidence Interval Creation
CI: $(\bar{y}_1 - \bar{y}_2) \pm \underbrace{t^*_{df} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}_{MOE}$
$t^*_{df}$ = critical value
• Hypothesis Testing
Step 1: Write down your hypothesis
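$H_0: \mu_1 - \mu_2 = \Delta_0$
$H_A: \mu_1 - \mu_2 < \Delta_0$ or $\mu_1 - \mu_2 > \Delta_0$ or $\mu_1 - \mu_2 \neq \Delta_0$
Step 2: Calculate your test statistic
$t_{df} = \frac{(\bar{y}_1 - \bar{y}_2) - \Delta_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$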
Step 3: Calculate the p-value
Step 4: State your conclusion. If p-value ≤ α, (usually α = 0.05), then Reject H0 . If
p-value is > α, then Do Not Reject H0 .
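The degrees-of-freedom formula in this section is tedious by hand, so a short sketch may help. The function names `welch_df` and `welch_t` are assumptions chosen for illustration (the formula is commonly associated with Welch's t-test); the summary statistics in the example are those of the two-route commuting problem in the Chapter 20 examples.

```python
from math import floor, sqrt


def welch_df(s1, n1, s2, n2):
    """Unrounded degrees of freedom for two groups with unequal variances."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


def welch_t(y1_bar, s1, n1, y2_bar, s2, n2, delta0=0.0):
    """Test statistic ((y1_bar - y2_bar) - delta0) / SE(y1_bar - y2_bar)."""
    se = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return ((y1_bar - y2_bar) - delta0) / se


# s1 = 3, s2 = 2, n1 = n2 = 20; round df down to the nearest integer.
df = floor(welch_df(3, 20, 2, 20))
print(f"df = {df}")
```

The critical value $t^*_{df}$ and the p-value still come from a t table or a statistics package.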
7.5
Difference of Means - 2 Independent Groups; Equal Variance
both groups
Difference of Means - always use x1 − x2 or y 1 − y 2
The degrees of freedom is given by df = n1 + n2 − 2.
• Confidence Interval Creation
CI: $(\bar{y}_1 - \bar{y}_2) \pm \underbrace{t^*_{df} \times SE_{pooled}(\bar{y}_1 - \bar{y}_2)}_{MOE}$
$t^*_{df}$ = critical value
• Hypothesis Testing
Step 1: Write down your hypothesis
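$H_0: \mu_1 - \mu_2 = \Delta_0$
$H_A: \mu_1 - \mu_2 < \Delta_0$ or $\mu_1 - \mu_2 > \Delta_0$ or $\mu_1 - \mu_2 \neq \Delta_0$
Step 2: Calculate your test statistic
$t_{df} = \frac{(\bar{y}_1 - \bar{y}_2) - \Delta_0}{SE_{pooled}(\bar{y}_1 - \bar{y}_2)}$,
$SE_{pooled}(\bar{y}_1 - \bar{y}_2) = s_{pooled} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$,
$s_{pooled} = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}$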
Step 3: Calculate the p-value
Step 4: State your conclusion. If p-value ≤ α, (usually α = 0.05), then Reject H0 . If
p-value is > α, then Do Not Reject H0 .
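The pooled standard error and test statistic can be sketched as follows. This is a minimal illustration; the function names are assumptions, and the example numbers (standard deviations 3 and 2 with 20 observations per group) are used only to show the arithmetic.

```python
from math import sqrt


def pooled_se(s1, n1, s2, n2):
    """SE_pooled = s_pooled * sqrt(1/n1 + 1/n2), assuming equal variances."""
    s_pooled = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    return s_pooled * sqrt(1 / n1 + 1 / n2)


def pooled_t(y1_bar, s1, n1, y2_bar, s2, n2, delta0=0.0):
    """Return (t, df) for the pooled t-test, with df = n1 + n2 - 2."""
    t = ((y1_bar - y2_bar) - delta0) / pooled_se(s1, n1, s2, n2)
    return t, n1 + n2 - 2


se = pooled_se(3, 20, 2, 20)
print(f"SE_pooled = {se:.4f}")
```

When the two sample sizes are equal, the pooled and unpooled standard errors coincide; the two methods then differ only in their degrees of freedom.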
7.6
Paired Differences of Means
The degrees of freedom is given by df = n − 1.
• Confidence Interval Creation (n = number of pairs)
CI: $\bar{d} \pm \underbrace{t^*_{n-1} \frac{s_d}{\sqrt{n}}}_{MOE}$
$t^*_{n-1}$ = critical value
• Hypothesis Testing
Step 1: Write down your hypothesis
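$H_0: \mu_d = \Delta_0$
$H_A: \mu_d < \Delta_0$ or $\mu_d > \Delta_0$ or $\mu_d \neq \Delta_0$
Step 2: Calculate your test statistic
$t_{n-1} = \frac{\bar{d} - \Delta_0}{s_d/\sqrt{n}}$
$\bar{d}$ = average of the differences
$s_d$ = standard deviation of the differences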
Step 3: Calculate the p-value
Step 4: State your conclusion. If p-value ≤ α, (usually α = 0.05), then Reject H0 . If
p-value is > α, then Do Not Reject H0 .
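For paired data, the test reduces to a one-sample t-test on the differences. The sketch below illustrates this; the function name `paired_t` is an assumption, and the data are made-up numbers used only to show the mechanics.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second, delta0=0.0):
    """Return (t, df) for H0: mu_d = delta0, where d = second - first."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)                 # n = number of pairs
    d_bar = mean(diffs)            # average of the differences
    s_d = stdev(diffs)             # sample standard deviation of the differences
    t = (d_bar - delta0) / (s_d / sqrt(n))
    return t, n - 1


# Hypothetical before/after measurements on five subjects.
before = [10, 12, 9, 11, 13]
after = [11, 14, 12, 15, 18]
t, df = paired_t(before, after)
print(f"t = {t:.3f} with df = {df}")
```

The key design point is that the two samples are never treated as independent groups; only the list of pairwise differences enters the calculation.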
8
Extra Information
Review any and all notes and supplementary materials. It may be the case that something
was accidentally omitted from this study guide. Also, review any problems that may have
been discussed in class as not all example problems may have been provided here.
9
Example Problems
Q3 pg 444 The 95% confidence interval for the number of teens who reported that they had
misrepresented their age online is from 45.6% to 52.5%. There were 799 teens in this
study.
(a) Interpret the interval in this context.
(b) Explain the meaning of “95% confident” in this context.
1. A study found that 16 of 40 peanut candy bars in fact did not contain peanuts.
(a) Construct a 90% confidence interval.
(b) Interpret your 90% confidence interval.
(c) Construct a 95% confidence interval.
(d) Interpret your 95% confidence interval.
Q15,16 pg 446 Several factors are involved in the creation of a confidence interval. Among them are
the sample size, the level of confidence, and the margin of error. Which statements are
true?
(a) For a given sample size, higher confidence means a smaller margin of error.
(b) For a given confidence level, halving the margin of error requires a sample twice
as large.
(c) For a certain confidence level, you can get a smaller margin of error by selecting
a bigger sample.
(d) For a fixed margin of error, larger samples provide greater confidence.
9.1
Chapter 16
1. I sample 600 people and 432 of them like cats. Construct a 95% confidence interval
for the population proportion.
2. I think the proportion of people that eat candy is around 0.75. I am going to construct
a 90% confidence interval and want the margin of error to be ±0.025. How large should
the sample size be?
3. Jimmy samples 930 people and 234 took public transportation. Construct a 99%
confidence interval for the population proportion.
4. I am going to construct a 95% confidence interval for the proportion of people that
wear eyeglasses and want the margin of error to be ±0.2. I have no idea what to
estimate for the population proportion. How large should the sample size be?
9.2
Chapter 17
1. A researcher believes that more than 50% of all people voted in the last election. She
samples 800 people and 420 of them voted. Test her claim at a significance level of
0.05 (i.e. compare the P-value to 0.05).
(a) State the hypotheses to be tested.
(b) Compute the test statistics (z-value). You must show your computation to receive
credit.
(c) Compute the P-value associated with your test statistic.
(d) Make a conclusion about the hypotheses.
2. A researcher believes that fewer than 75% of all mollusks are tasty. He samples 1200
mollusks and 865 of them are tasty. Test his claim at a significance level of 0.05 (i.e.
compare the P-value to 0.05).
(a) State the hypotheses to be tested.
(b) Compute the test statistics (z-value). You must show your computation to receive
credit.
(c) Compute the P-value associated with your test statistic.
(d) Make a conclusion about the hypotheses.
3. A researcher believes that the percentage of people that watch Game of Thrones is
different than 27%. He samples 900 people and 220 of them watch. Test his claim at
a significance level of 0.05 (i.e. compare the P-value to 0.05).
(a) State the hypotheses to be tested.
(b) Compute the test statistics (z-value). You must show your computation to receive
credit.
(c) Compute the P-value associated with your test statistic.
(d) Make a conclusion about the hypotheses.
9.3
Chapter 18
1. A butcher wants to estimate the mean weight of a ham. She samples 33 hams and
computes a sample mean weight of 8.2 pounds and a sample standard deviation of 3.3
pounds. What is a 90% confidence interval for the population mean weight of ham?
Please indicate the value you used for z ∗ or t∗ .
2. A professor is interested in the mean length of a letter of recommendation. He samples
51 letters and finds a sample mean length of 620 words with a sample standard deviation
of 90 words. What is a 95% confidence interval for the population mean length of a
letter? Please indicate the value you used for z ∗ or t∗ .
3. A computer professional wants to know the mean number of emails people receive each
day. She is going to compute a 95% confidence interval and wants a margin of error of
±2 emails. She believes the standard deviation to be 18 emails. How large should the
sample size be to ensure this margin of error?
4. A researcher believes that the mean age at which a person first votes is greater than 22
years. He samples 27 people and computes a sample mean of 24.3 years and a sample
standard deviation of 8 years.
(a) State the hypotheses to be tested.
(b) What is the value of your test statistic (t or z value)?
(c) What is the P-value?
(d) What conclusion should be drawn (compare p-value to 0.05).
5. A researcher believes that the mean age at which a person first tries chocolate is less
than 3 years. He samples 24 people and computes a sample mean of 2.3 years and a
sample standard deviation of 1.5 years.
(a) State the hypotheses to be tested.
(b) What is the value of your test statistic (t or z value)?
(c) What is the P-value?
(d) What conclusion should be drawn (compare p-value to 0.05).
6. A researcher believes that the mean height of a prairie dog is different than 14 inches.
She samples 31 prairie dogs and computes a sample mean of 15.8 inches and a sample
standard deviation of 3.6 inches.
(a) State the hypotheses to be tested.
(b) What is the value of your test statistic (t or z value)?
(c) What is the P-value?
(d) What conclusion should be drawn (compare p-value to 0.05).
9.4
Chapter 19
Q4 pg 526 Which of the following are true? If false, explain briefly.
(a) A very low P-value provides evidence against the null hypothesis.
(b) A high P-value is strong evidence in favor of the null hypothesis.
(c) A P-value above 0.10 shows that the null hypothesis is true.
(d) If the null hypothesis is true, you can’t get a p-value below 0.01.
Q7 pg 526 Which of the following statements are true? If false, explain briefly.
(a) Using an alpha level of 0.05, a p-value of 0.04 results in rejecting the null hypothesis.
(b) The alpha level depends on the sample size.
(c) With an alpha level of 0.01, a p-value of 0.10 results in rejecting the null hypothesis.
(d) Using an alpha level of 0.05, a p-value of 0.06 means the null hypothesis is true.
Q11 pg 527 For each of the following situations, state whether a Type I or Type II, or neither error
has been made. Explain briefly.
(a) A bank wants to know if the enrollment on their website is above 30% based on
a small sample of customers. They test $H_0: p = 0.3$ versus $H_A: p > 0.3$ and
reject the null hypothesis. Later they find out that actually 28% of all customers
enrolled.
(b) A student tests 100 students to determine whether other students on her campus
prefer Coke or Pepsi and finds no evidence that preference for Coke is not 0.5.
Later, a marketing company tests all students on campus and finds no difference.
(c) A human resource analyst wants to know if the applicants this year score, on
average, higher on their placement exam than the 52.5 points the candidates
averaged last year. She samples 50 recent tests and finds the average to be 54.1
points. She fails to reject the null hypothesis that the mean is 52.5 points. At
the end of the year, they find that the candidates this year had a mean of 55.3
points.
(d) A pharmaceutical company tests whether a drug lifts the headache relief rate
from the 25% achieved by the placebo. They fail to reject the null hypothesis
because the p-value is 0.465. Further testing shows that the drug actually relieves
headaches in 38% of people.
9.5
Chapter 20
1. A researcher samples 600 children and 500 of them like ice cream. She also samples
450 adults and 350 of them like ice cream. Construct a 95% confidence interval for the
difference of population proportions of children and adults that like ice cream.
2. A researcher samples 1200 children and 500 of them like to exercise. She also samples
900 adults and 350 of them like to exercise. Construct a 90% confidence interval for
the difference of population proportions of children and adults that like to exercise.
3. A scientist believes that the proportion of North American bees that are hostile is
greater than the proportion of South American bees. She samples 500 North American
bees and 200 are hostile. She samples 600 South American bees and 230 are hostile.
(a) State the hypotheses to be tested.
(b) Compute the sample statistic (z value or t value). You must show work to receive
credit.
(c) Give the P-value or range of P-Values.
(d) What decision should the scientist make at a significance level of 5%?
4. A scientist believes that the proportion of North American bears that are hostile is
greater than the proportion of South American bears. She samples 800 North American
bears and 200 are hostile. She samples 1200 South American bears and 240 are hostile.
(a) State the hypotheses to be tested.
(b) Compute the sample statistic (z value or t value). You must show work to receive
credit.
(c) Give the P-value or range of P-Values.
(d) What decision should the scientist make at a significance level of 5%?
Q61 pg 578 A man who moves to a new city sees that there are two routes he could take to work.
A neighbor who has lived there a long time tells him Route A will average 5 minutes
faster than Route B. The man decides to experiment; he wants to find out if the mean
difference between Route A and B is different from 5 minutes. Each day, he flips a coin
to determine which way to go, driving each route 20 days. He finds that Route A takes
an average of 40 minutes, with a standard deviation of 3 minutes, and Route B takes
an average of 43 minutes, with a standard deviation of 2 minutes. Histograms of travel
times for the routes are roughly symmetric and show no outliers. Assume α = 0.05.
(a) Find a 95% confidence interval for the difference in average commuting time for
the two routes. Use df= 33.
(b) State the hypotheses to be tested.
(c) Compute the value of the test score.
(d) Give the P-value or range of P-values.
(e) Do the results seem significant?
Q78 pg 582 Researchers randomly assigned participants either a tall, thin “highball” glass or a
short, wide “tumbler,” each of which held 355 ml. Participants were asked to pour 1.5
oz = 44.3 ml of water into their glass. Did the shape of the glass make a difference in
how much liquid they poured? In particular, test to see if they poured less water into
the “highball” glass than the “tumbler”. Assume α = 0.1. Here are the summaries:
             Highball   Tumbler
n            99         99
$\bar{y}$    42.2 ml    60.9 ml
s            16.2 ml    17.9 ml
(a) Find a 90% confidence interval for the difference in average water held for the two
glasses. Use df = 194.
(b) State the hypotheses to be tested.
(c) Compute the value of the test score. (Assume all conditions are met.)
(d) Give the P-value or range of P-values.
(e) Do the results seem significant?
9.6
Various Chapters
1. We want to estimate the healing rate for a wound. A sample of size 17 is collected
and the sample mean is computed to be 24.3 micrometers per hour, with a sample
standard deviation of s= 8 micrometers per hour. What is a 95% confidence interval
for the population mean?
2. A sample of size n=150 people is collected and the sample proportion of people who are
illiterate is computed to be .20. Compute a 95% confidence interval for the population
proportion of illiterate people.
3. You believe that the proportion of people that like cheese is .80. You are going to
construct a 95% confidence interval and want the margin of error to be plus or minus
.03. What should the sample size be?
4. Teresa knows that appointment times are approximately normally distributed. She
believes the mean wait time is longer than 25 minutes. She conducts a test with α
= 0.05 and the appropriate hypotheses. She selects 25 random appointments and the
sample mean was found to be 25.66 minutes and a sample standard deviation of 10
minutes.
(a) State the hypotheses to be tested.
(b) Compute the value of the test score.
(c) Give the P-value or range of P-values.
(d) Do the results seem significant?
5. You claim that the proportion of people who watch American Idol is greater than .50.
You sample n=200 people and compute a sample proportion of .53. Assume α = 0.05.
(a) State the hypotheses to be tested.
(b) Compute the value of the test score.
(c) Give the P-value or range of P-values.
(d) Do the results seem significant?
6. You want to compare the proportion of gamers amongst women and men. You survey
300 women and 400 men. 175 of the women were gamers and 200 of the men were
gamers. Construct a 95% confidence interval for the difference of proportions.
7. You believe that the proportion of men that are colorblind is greater than the proportion of women that are colorblind. You sample 900 men and 90 of them are
colorblind. You sample 700 women and 45 of them are colorblind. Assume α = 0.05.
(a) State the hypotheses to be tested.
(b) Compute the value of the test score.
(c) Did you use the pooled proportion in part b.?
(d) Compute the P-value.
(e) Are the results significant?
10
Example Solutions
Q3 pg 444 (a) We are 95% confident that, if we were to ask all teens whether they have misrepresented their age online, between 45.6% and 52.5% of them would say they
have.
(b) If we were to collect many random samples of 799 teens, about 95% of the confidence intervals would contain the true proportion of all teens who admit to
misrepresenting their age online.
1. This problem tells us that p̂ = 16/40, n = 40.
(a) For a 90% confidence interval, $z^* = 1.645$. The 90% confidence interval would
then be
$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
$= \frac{16}{40} \pm 1.645 \sqrt{\frac{(16/40)(1 - 16/40)}{40}}$
$= \frac{16}{40} \pm 1.645 \sqrt{0.006}$
$= (0.2726, 0.5274)$
(b) We are 90% confident that between 27% and 53% of all peanut candy bars did
not contain peanuts.
(c) For a 95% confidence interval, $z^* = 1.96$. The 95% confidence interval would then
be
$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
$= \frac{16}{40} \pm 1.96 \sqrt{\frac{(16/40)(1 - 16/40)}{40}}$
$= \frac{16}{40} \pm 1.96 \sqrt{0.006}$
$= (0.2482, 0.5518)$
(d) We are 95% confident that between 25% and 55% of all peanut candy bars did
not contain peanuts.
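The interval calculations in this solution can be checked with a short script. This is a sketch for verification only; the function name `one_proportion_z_interval` is an assumption, and it computes $z^*$ from the standard normal model rather than reading it from the table.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_interval(successes, n, confidence):
    """Return (low, high) for p_hat +/- z* * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    moe = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - moe, p_hat + moe


for level in (0.90, 0.95):
    low, high = one_proportion_z_interval(16, 40, level)
    print(f"{level:.0%} CI: ({low:.4f}, {high:.4f})")
```

Small differences in the fourth decimal place from the hand calculation are possible, since the script does not round $z^*$.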
Q15,16 pg 446 (a) False. Higher confidence means a larger margin of error.
Suppose n = 10. Suppose $\hat{p} = 0.5$. Start with 90% Confidence ($z^* = 1.645$).
Calculate MOE. Now change to 95% Confidence ($z^* = 1.96$). Calculate MOE.
Compare the two results.
$MOE_{90} = 1.645 \sqrt{\frac{0.5(1-0.5)}{10}} = 1.645 \sqrt{\frac{0.25}{10}} = 1.645 \sqrt{0.025} = 0.26$,
$MOE_{95} = 1.96 \sqrt{\frac{0.5(1-0.5)}{10}} = 1.96 \sqrt{\frac{0.25}{10}} = 1.96 \sqrt{0.025} = 0.31$.
From this, we can see that MOE increases when confidence increases.
(b) False. The margin of error decreases as the square root of the sample size increases.
Halving the margin of error requires a sample four times as large as the original.
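Suppose $\hat{p} = 0.5$. Suppose 95% Confidence ($z^* = 1.96$). Start with MOE = 0.6.
Then compare with MOE = 0.3.
$0.6 = 1.96 \sqrt{\frac{0.5(1-0.5)}{n}} \Rightarrow \frac{0.6}{1.96} = \sqrt{\frac{0.25}{n}} \Rightarrow 0.306 = \sqrt{\frac{0.25}{n}} \Rightarrow 0.0937 = \frac{0.25}{n} \Rightarrow n = \frac{0.25}{0.0937} \Rightarrow n = 2.668$,
$0.3 = 1.96 \sqrt{\frac{0.5(1-0.5)}{n}} \Rightarrow \frac{0.3}{1.96} = \sqrt{\frac{0.25}{n}} \Rightarrow 0.153 = \sqrt{\frac{0.25}{n}} \Rightarrow 0.0234 = \frac{0.25}{n} \Rightarrow n = \frac{0.25}{0.0234} \Rightarrow n = 10.68$.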
So our original n = 2.668 and the new n = 10.68, which is approximately 4 times
the original value of n.
(c) True. Larger samples are less variable, which translates to a smaller margin of
error. We can be more precise at the same level of confidence.
Suppose $\hat{p} = 0.5$. Suppose 90% Confidence. Start with n = 2 and compare to
n = 18.
$MOE_2 = 1.645 \sqrt{\frac{0.5(1-0.5)}{2}} = 1.645 \sqrt{\frac{0.25}{2}} = 1.645 \sqrt{0.125} = 0.582$,
$MOE_{18} = 1.645 \sqrt{\frac{0.5(1-0.5)}{18}} = 1.645 \sqrt{\frac{0.25}{18}} = 1.645 \sqrt{0.0139} = 0.194$.
Our MOE decreased when n increased.
(d) True. Larger samples are less variable, which makes us more confident that a
given confidence interval succeeds in catching the population proportion.
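Suppose MOE = 0.4. Suppose $\hat{p} = 0.5$. Compare the confidence of n = 5 to
n = 8.
$0.4 = z_5^* \sqrt{\frac{0.5(1-0.5)}{5}} \Rightarrow 0.4 = z_5^* \sqrt{0.05} \Rightarrow \frac{0.4}{\sqrt{0.05}} = z_5^* \Rightarrow 1.789 = z_5^*$,
$0.4 = z_8^* \sqrt{\frac{0.5(1-0.5)}{8}} \Rightarrow 0.4 = z_8^* \sqrt{0.03125} \Rightarrow \frac{0.4}{\sqrt{0.03125}} = z_8^* \Rightarrow 2.263 = z_8^*$.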
As the sample size increases, $z^*$ increases, which means that the confidence level
increases.
10.1
Chapter 16
1. I sample 600 people and 432 of them like cats. Construct a 95% confidence interval
for the population proportion.
$\hat{p} = \frac{432}{600} = 0.72$
$z^* = 1.96$
$n = 600$
CI: $\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
$\Rightarrow 0.72 \pm 1.96 \sqrt{\frac{0.72(1-0.72)}{600}}$
$\Rightarrow (0.684, 0.756)$
2. I think the proportion of people that eat candy is around 0.75. I am going to construct
a 90% confidence interval and want the margin of error to be ±0.025. How large should
the sample size be?
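$\hat{p} = 0.75$
$z^* = 1.645$
$MOE = 0.025$
$MOE = z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
$\Rightarrow 0.025 = 1.645 \sqrt{\frac{0.75(1-0.75)}{n}}$
$\Rightarrow \frac{0.025}{1.645} = \sqrt{\frac{0.1875}{n}}$
$\Rightarrow \left(\frac{0.025}{1.645}\right)^2 = \frac{0.1875}{n}$
$\Rightarrow n = \frac{0.1875}{\left(\frac{0.025}{1.645}\right)^2}$
$\Rightarrow n = 811.8075$
$\Rightarrow n \approx 812$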
3. Jimmy samples 930 people and 234 took public transportation. Construct a 99%
confidence interval for the population proportion.
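$\hat{p} = \frac{234}{930}$
$z^* = 2.576$
$n = 930$
CI: $\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
$\Rightarrow \frac{234}{930} \pm 2.576 \sqrt{\frac{(234/930)(1 - 234/930)}{930}}$
$\Rightarrow 0.252 \pm 2.576 \sqrt{\frac{0.188}{930}}$
$\Rightarrow (0.215, 0.289)$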
4. I am going to construct a 95% confidence interval for the proportion of people that
wear eyeglasses and want the margin of error to be ±0.2. I have no idea what to
estimate for the population proportion. How large should the sample size be?
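$\hat{p} = 0.5$ when we don't have any idea for the population proportion
$z^* = 1.96$
$MOE = 0.2$
$MOE = z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
$\Rightarrow 0.2 = 1.96 \sqrt{\frac{0.5(1-0.5)}{n}}$
$\Rightarrow \frac{0.2}{1.96} = \sqrt{\frac{0.25}{n}}$
$\Rightarrow \left(\frac{0.2}{1.96}\right)^2 = \frac{0.25}{n}$
$\Rightarrow n = \frac{0.25}{\left(\frac{0.2}{1.96}\right)^2}$
$\Rightarrow n = 24.01$
$\Rightarrow n \approx 25$ (always round up)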
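The two sample-size problems in this section follow the same algebra: solve the margin-of-error formula for n and round up. A minimal sketch, with the function name `sample_size_for_proportion` as an assumption:

```python
from math import ceil


def sample_size_for_proportion(moe, z_star, p_guess=0.5):
    """Smallest n with z* * sqrt(p(1 - p)/n) <= moe.

    p_guess defaults to 0.5, the most conservative choice when there is
    no prior estimate of the population proportion.
    """
    return ceil(p_guess * (1 - p_guess) * (z_star / moe) ** 2)


print(sample_size_for_proportion(0.025, 1.645, p_guess=0.75))  # problem 2
print(sample_size_for_proportion(0.2, 1.96))                   # problem 4
```

Rounding up rather than to the nearest integer is deliberate: a smaller n would give a margin of error slightly larger than requested.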
10.2
Chapter 17
1. A researcher believes that more than 50% of all people voted in the last election. She
samples 800 people and 420 of them voted. Test her claim at a significance level of
0.05 (i.e. compare the P-value to 0.05).
(a) State the hypotheses to be tested.
H0 : p = 0.5
HA : p > 0.5
(b) Compute the test statistics (z-value). You must show your computation to receive
credit.
$\hat{p} = 420/800 = 0.525$. $n = 800$.
$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}} = \frac{0.525 - 0.5}{\sqrt{\frac{0.5(1-0.5)}{800}}} = \frac{0.025}{\sqrt{\frac{0.25}{800}}} = 1.41$
(c) Compute the P-value associated with your test statistic.
P (Z > 1.41) = normalcdf (1.41, 999) = 0.0793
(d) Make a conclusion about the hypotheses.
Since the p-value is “large” (0.0793 > 0.05), Do Not Reject H0 . The results are
not significant.
2. A researcher believes that fewer than 75% of all mollusks are tasty. He samples 1200
mollusks and 865 of them are tasty. Test his claim at a significance level of 0.05 (i.e.
compare the P-value to 0.05).
(a) State the hypotheses to be tested.
H0 : p = 0.75
HA : p < 0.75
(b) Compute the test statistics (z-value). You must show your computation to receive
credit.
$\hat{p} = 865/1200 = 0.721$. $n = 1200$.
$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}} = \frac{0.721 - 0.75}{\sqrt{\frac{0.75(1-0.75)}{1200}}} = \frac{-0.029}{\sqrt{\frac{0.1875}{1200}}} = -2.32$
(c) Compute the P-value associated with your test statistic.
P (Z < −2.32) = normalcdf (−999, −2.32) = 0.0102
(d) Make a conclusion about the hypotheses.
Since the p-value is “small” (0.0102 < 0.05), Reject H0 . The results are significant.
3. A researcher believes that the percentage of people that watch Game of Thrones is
different than 27%. He samples 900 people and 220 of them watch. Test his claim at
a significance level of 0.05 (i.e. compare the P-value to 0.05).
(a) State the hypotheses to be tested.
H0 : p = 0.27
$H_A: p \neq 0.27$
(b) Compute the test statistic (z-value). You must show your computation to receive
credit.
p̂ = 220/900 = 0.244. n = 900.
$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}} = \frac{0.244 - 0.27}{\sqrt{\frac{0.27(1-0.27)}{900}}} = \frac{-0.026}{\sqrt{\frac{0.1971}{900}}} = -1.76$
(c) Compute the P-value associated with your test statistic.
Note that this is a 2-sided test.
p-value= 2P (Z < −1.76)
= 2 (normalcdf (−999, −1.76))
= 2(0.039)
= 0.078
(d) Make a conclusion about the hypotheses.
Since the p-value is “large” (0.078 > 0.05), Do Not Reject H0 . The results are
not significant.
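The three tests above follow the same recipe, which can be sketched in Python using only the standard library. The function name and the `alternative` argument are illustrative assumptions; `NormalDist().cdf` plays the role of the calculator's normalcdf. Because the code does not round p̂ before computing z, its values differ slightly from the hand calculations (for example, about −1.73 rather than −1.76 in Problem 3), but the conclusions at the 0.05 level are the same.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0, alternative):
    """Return (z, p_value) for H0: p = p0.

    alternative is "greater", "less", or "two-sided" (hypothetical labels).
    """
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)   # standard deviation under H0
    z = (p_hat - p0) / se
    cdf = NormalDist().cdf(z)           # P(Z < z) for a standard normal
    if alternative == "greater":
        p_value = 1 - cdf
    elif alternative == "less":
        p_value = cdf
    elif alternative == "two-sided":
        p_value = 2 * min(cdf, 1 - cdf)
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value

# Problems 1-3 of this section
for args in [(420, 800, 0.5, "greater"),
             (865, 1200, 0.75, "less"),
             (220, 900, 0.27, "two-sided")]:
    z, p = one_proportion_z_test(*args)
    print(f"z = {z:.3f}, p-value = {p:.4f}, reject H0 at 0.05: {p < 0.05}")
```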
10.3 Chapter 18
1. A butcher wants to estimate the mean weight of a ham. She samples 33 hams and
computes a sample mean weight of 8.2 pounds and a sample standard deviation of 3.3
pounds. What is a 90% confidence interval for the population mean weight of ham?
Please indicate the value you used for z ∗ or t∗ .
Summary of what is given:
n = 33
ȳ = 8.2
s = 3.3
For confidence intervals for the mean, we use t∗, with n − 1 degrees of freedom and
90% confidence (for this case). Thus, $t^*_{32} = 1.694$.
$CI: \bar{y} \pm t^*_{n-1} \frac{s}{\sqrt{n}}$
$\Rightarrow 8.2 \pm 1.694 \frac{3.3}{\sqrt{33}}$
$\Rightarrow (7.227, 9.173)$
2. A professor is interested in the mean length of a letter of recommendation. He samples
51 letters and finds a sample mean length of 620 words with a sample standard deviation
of 90 words. What is a 95% confidence interval for the population mean length of a
letter? Please indicate the value you used for z ∗ or t∗ .
Summary of what is given:
n = 51
ȳ = 620
s = 90
For confidence intervals for the mean, we use t∗, with n − 1 degrees of freedom and
95% confidence (for this case). Thus, $t^*_{50} = 2.009$.
$CI: \bar{y} \pm t^*_{n-1} \frac{s}{\sqrt{n}}$
$\Rightarrow 620 \pm 2.009 \frac{90}{\sqrt{51}}$
$\Rightarrow (594.682, 645.318)$
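Both intervals above use the same formula, so a small helper can reproduce them. This is a sketch under one stated assumption: the critical value t∗ is looked up in a table (or on a calculator) and passed in, because the Python standard library has no t-distribution. The function name is illustrative.

```python
import math

def t_interval(y_bar, s, n, t_star):
    """One-sample t-interval: y_bar +/- t* * s / sqrt(n).

    t_star must be the critical value for n - 1 degrees of freedom,
    taken from a t-table or calculator.
    """
    moe = t_star * s / math.sqrt(n)   # margin of error
    return y_bar - moe, y_bar + moe

# Problem 1 (hams) and Problem 2 (letters)
print(t_interval(8.2, 3.3, 33, 1.694))
print(t_interval(620, 90, 51, 2.009))
```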
3. A computer professional wants to know the mean number of emails people receive each
day. She is going to compute a 95% confidence interval and wants a margin of error of
±2 emails. She believes the standard deviation to be 18 emails. How large should the
sample size be to ensure this margin of error?
Summary of what is given:
MOE = 2
s = 18
For sample size calculation, since this is based on the mean, we would normally use t∗ with
n − 1 degrees of freedom, but the degrees of freedom depend on the unknown n. Note that as
n becomes really large, the t-distribution becomes more like the normal distribution.
Therefore, use the 95% confidence interval critical value from the normal distribution
instead: z∗ = 1.96. Sample size can be calculated as follows:
$MOE = t^*_{n-1} \frac{s}{\sqrt{n}} \Rightarrow 2 = 1.96 \frac{18}{\sqrt{n}}$
$\Rightarrow \frac{2}{1.96 \times 18} = \frac{1}{\sqrt{n}}$
$\Rightarrow \sqrt{n} = \frac{1.96 \times 18}{2}$
$\Rightarrow n = \left(\frac{1.96 \times 18}{2}\right)^2$
$\Rightarrow n = 311.1696$
$\Rightarrow n \approx 312$
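The same calculation as a short Python sketch. The function name is an illustrative assumption, and the code uses the normal critical value z∗ in place of t∗, exactly as the solution above does.

```python
import math

def sample_size_for_mean(moe, sd_guess, z_star):
    """Smallest n for which z* * sd / sqrt(n) is at most the desired MOE."""
    n_exact = (z_star * sd_guess / moe) ** 2
    return math.ceil(n_exact)  # round up: a partial observation is impossible

# MOE = 2 emails, assumed standard deviation 18, 95% confidence
print(sample_size_for_mean(2, 18, 1.96))
```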
4. A researcher believes that the mean age at which a person first votes is greater than 22
years. He samples 27 people and computes a sample mean of 24.3 years and a sample
standard deviation of 8 years.
(a) State the hypotheses to be tested.
H0 : µ = 22
HA : µ > 22
(b) What is the value of your test statistic (t or z value)?
Use the t-test statistic because we are dealing with means.
$t_{n-1} = \frac{\bar{y} - \mu_0}{s/\sqrt{n}}$
$t_{27-1} = t_{26} = \frac{24.3 - 22}{8/\sqrt{27}} = \frac{2.3}{1.54} = 1.49$
(c) What is the P-value?
On your calculator: tcdf (1.49, 999, 26) = 0.0741
On the table: Go to degrees of freedom 26, find where 1.49 is in the row, and then
look at the one-tail probability values. The probability is between 0.05 and 0.10.
(d) What conclusion should be drawn (compare p-value to 0.05).
Since the p-value is “large” (0.0741 > 0.05), Do Not Reject H0 . The results are
not significant.
5. A researcher believes that the mean age at which a person first tries chocolate is less
than 3 years. He samples 24 people and computes a sample mean of 2.3 years and a
sample standard deviation of 1.5 years.
(a) State the hypotheses to be tested.
H0 : µ = 3
HA : µ < 3
(b) What is the value of your test statistic (t or z value)?
Use the t-test statistic because we are dealing with means.
$t_{n-1} = \frac{\bar{y} - \mu_0}{s/\sqrt{n}}$
$t_{24-1} = t_{23} = \frac{2.3 - 3}{1.5/\sqrt{24}} = \frac{-0.7}{0.3062} = -2.286$
(c) What is the P-value?
On your calculator: tcdf (−999, −2.286, 23) = 0.0159.
On the table: Go to degrees of freedom 23, find where 2.286 is in the row, and
then look at the one-tail probability values. The probability is between 0.01 and
0.025.
(d) What conclusion should be drawn (compare p-value to 0.05).
Since the p-value is “small” (0.0159 < 0.05), Reject H0 . The results are significant.
6. A researcher believes that the mean height of a prairie dog is different than 14 inches.
She samples 31 prairie dogs and computes a sample mean of 15.8 inches and a sample
standard deviation of 3.6 inches.
(a) State the hypotheses to be tested.
H0 : µ = 14
HA : µ ≠ 14
(b) What is the value of your test statistic (t or z value)? Use the t-test statistic
because we are dealing with means.
$t_{n-1} = \frac{\bar{y} - \mu_0}{s/\sqrt{n}}$
$t_{31-1} = t_{30} = \frac{15.8 - 14}{3.6/\sqrt{31}} = \frac{1.8}{0.6466} = 2.784$
(c) What is the P-value?
On your calculator: 2tcdf (2.784, 999, 30) = 2(0.0046) = 0.0092.
On the table: go to degrees of freedom 30, find where 2.784 is in the row, and
then look at the two-tail probability values. The probability is lower than 0.01.
(d) What conclusion should be drawn (compare p-value to 0.05).
Since the p-value is “small” (0.0092 < 0.05), Reject H0 . The results are significant.
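For readers without a calculator that has tcdf, the tail probabilities used in Problems 4 to 6 can be approximated numerically. The sketch below is an assumption-laden stand-in, not the calculator's algorithm: it integrates the t density with Simpson's rule using only the standard library, and all function names are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # P(0 < T < |t|)
    upper = 0.5 - area            # P(T > |t|), by symmetry about 0
    return upper if t >= 0 else 1 - upper

def one_sample_t(y_bar, mu0, s, n):
    """Test statistic (y_bar - mu0) / (s / sqrt(n))."""
    return (y_bar - mu0) / (s / math.sqrt(n))

print(t_upper_tail(1.49, 26))        # Problem 4: upper tail
print(t_upper_tail(2.286, 23))       # Problem 5: lower tail of -2.286, by symmetry
print(2 * t_upper_tail(2.784, 30))   # Problem 6: two-sided
```

The lower-tail probability in Problem 5 is obtained from the upper tail of the positive value, which is valid because the t-distribution is symmetric about zero.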
10.4 Chapter 19
Q4 pg 526 Which of the following are true? If false, explain briefly.
(a) A very low P-value provides evidence against the null hypothesis.
True.
(b) A high P-value is strong evidence in favor of the null hypothesis.
False. A high p-value shows that the data are consistent with the null hypothesis
but does not prove that the null hypothesis is true.
(c) A P-value above 0.10 shows that the null hypothesis is true.
False. No p-value ever shows that the null hypothesis is true (or false).
(d) If the null hypothesis is true, you can’t get a p-value below 0.01.
False. If the null hypothesis is true, you will get a p-value below 0.01 about once
in a hundred hypothesis tests.
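The answer to part (d) can be illustrated with a small simulation. This is a sketch under simplifying assumptions: the test statistic is drawn directly from a standard normal distribution (which is what a z-statistic follows when the null hypothesis is true), the test is upper-tailed, and the function name, trial count, and seed are arbitrary choices.

```python
import random
from statistics import NormalDist

def fraction_below(alpha, trials=100_000, seed=1):
    """Fraction of simulated tests with P-value < alpha when H0 is true."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    hits = 0
    for _ in range(trials):
        z = rng.gauss(0, 1)                 # z-statistic under a true H0
        p_value = 1 - std_normal.cdf(z)     # upper-tail P-value
        if p_value < alpha:
            hits += 1
    return hits / trials

# Should be close to 0.01, i.e. about once in a hundred tests
print(fraction_below(0.01))
```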
Q7 pg 526 Which of the following statements are true? If false, explain briefly.
(a) Using an alpha level of 0.05, a p-value of 0.04 results in rejecting the null hypothesis.
True.
(b) The alpha level depends on the sample size.
False. The alpha level is set independently and does not depend on the sample
size.
(c) With an alpha level of 0.01, a p-value of 0.10 results in rejecting the null hypothesis.
False. The p-value would have to be less than 0.01 to reject the null hypothesis.
(d) Using an alpha level of 0.05, a p-value of 0.06 means the null hypothesis is true.
False. It means that we do not have enough evidence at that alpha level to reject
the null hypothesis.
Q11 pg 527 For each of the following situations, state whether a Type I error, a Type II error, or
neither has been made. Explain briefly.
(a) A bank wants to know if the enrollment on their website is above 30% based on
a small sample of customers. They test H0 : p = 0.3 versus HA : p > 0.3 and
reject the null hypothesis. Later they find out that actually 28% of all customers
enrolled.
Type I Error. The actual value is not greater than 0.3, but they rejected the null
hypothesis.
(b) A student tests 100 students to determine whether other students on her campus
prefer Coke or Pepsi and finds no evidence that preference for Coke is not 0.5.
Later, a marketing company tests all students on campus and finds no difference.
No error. The actual value is 0.5 which was not rejected.
(c) A human resource analyst wants to know if the applicants this year score, on
average, higher on their placement exam than the 52.5 points the candidates
averaged last year. She samples 50 recent tests and finds the average to be 54.1
points. She fails to reject the null hypothesis that the mean is 52.5 points. At
the end of the year, they find that the candidates this year had a mean of 55.3
points.
Type II Error. The actual mean was 55.3 points, which is greater than 52.5, so the
null hypothesis was false but was not rejected.
(d) A pharmaceutical company tests whether a drug lifts the headache relief rate
from the 25% achieved by the placebo. They fail to reject the null hypothesis
because the p-value is 0.465. Further testing shows that the drug actually relieves
headaches in 38% of people.
Type II Error. The null hypothesis was not rejected, but it was false. The true
relief rate was greater than 0.25.
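The reasoning in Q11 reduces to a two-by-two decision table, sketched below. The function and argument names are illustrative assumptions. Following the answers above, the second argument records whether the alternative hypothesis later turned out to be true.

```python
def classify_decision(rejected_h0, alternative_is_true):
    """Label the outcome of a test once the truth is known."""
    if rejected_h0 and not alternative_is_true:
        return "Type I error"     # rejected H0 although the alternative is false
    if not rejected_h0 and alternative_is_true:
        return "Type II error"    # failed to reject H0 although the alternative is true
    return "No error"

print(classify_decision(True, False))    # (a) bank enrollment
print(classify_decision(False, False))   # (b) Coke vs. Pepsi
print(classify_decision(False, True))    # (c) placement exam, (d) headache drug
```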
10.5 Chapter 20
1. A researcher samples 600 children and 500 of them like ice cream. She also samples
450 adults and 350 of them like ice cream. Construct a 95% confidence interval for the
difference of population proportions of children and adults that like ice cream.
What we are given:
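$\hat{p}_1 = \frac{500}{600}$
$\hat{p}_2 = \frac{350}{450}$
Since we are considering the confidence interval for the difference of proportions, we
need a value for z∗. Here, z∗ = 1.96. The confidence interval is
$CI: (\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
$\Rightarrow \left(\frac{500}{600} - \frac{350}{450}\right) \pm 1.96 \sqrt{\frac{\frac{500}{600}\left(1-\frac{500}{600}\right)}{600} + \frac{\frac{350}{450}\left(1-\frac{350}{450}\right)}{450}}$
$\Rightarrow \frac{1}{18} \pm 1.96 \sqrt{\frac{5/36}{600} + \frac{14/81}{450}}$
$\Rightarrow (0.0069, 0.1042)$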
2. A researcher samples 1200 children and 500 of them like to exercise. She also samples
900 adults and 350 of them like to exercise. Construct a 90% confidence interval for
the difference of population proportions of children and adults that like to exercise.
What we are given:
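$\hat{p}_1 = \frac{500}{1200}$
$\hat{p}_2 = \frac{350}{900}$
Since we are considering the confidence interval for the difference of proportions, we
need a value for z∗. Here, z∗ = 1.645. The confidence interval is
$CI: (\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
$\Rightarrow \left(\frac{500}{1200} - \frac{350}{900}\right) \pm 1.645 \sqrt{\frac{\frac{500}{1200}\left(1-\frac{500}{1200}\right)}{1200} + \frac{\frac{350}{900}\left(1-\frac{350}{900}\right)}{900}}$
$\Rightarrow \frac{1}{36} \pm 1.645 \sqrt{\frac{35/144}{1200} + \frac{77/324}{900}}$
$\Rightarrow (-0.0078, 0.0633)$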
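The two intervals above can be reproduced with a short function. This is a sketch; the function name and argument order are illustrative assumptions, and z∗ is passed in from the critical-value table.

```python
import math

def two_proportion_z_interval(x1, n1, x2, n2, z_star):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

print(two_proportion_z_interval(500, 600, 350, 450, 1.96))     # Problem 1
print(two_proportion_z_interval(500, 1200, 350, 900, 1.645))   # Problem 2
```

The interval in Problem 2 contains 0, so at 90% confidence the data do not rule out equal proportions; the interval in Problem 1 does not contain 0.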
3. A scientist believes that the proportion of North American bees that are hostile is
greater than the proportion of South American bees. She samples 500 North American
bees and 200 are hostile. She samples 600 South American bees and 230 are hostile.
(a) State the hypotheses to be tested.
H0 : pN A − pSA = 0
HA : pN A − pSA > 0
(b) Compute the sample statistic (z value or t value). You must show work to receive
credit.
This is the hypothesis test for the difference of proportions.
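$\hat{p}_{pooled} = \frac{\text{\#Success Grp1} + \text{\#Success Grp2}}{n_1 + n_2} = \frac{200+230}{500+600} = \frac{43}{110}$
$SE_{pooled}(\hat{p}_{NA} - \hat{p}_{SA}) = \sqrt{\frac{\hat{p}_{pooled}(1-\hat{p}_{pooled})}{n_1} + \frac{\hat{p}_{pooled}(1-\hat{p}_{pooled})}{n_2}}$
$= \sqrt{\frac{\frac{43}{110}\left(1-\frac{43}{110}\right)}{500} + \frac{\frac{43}{110}\left(1-\frac{43}{110}\right)}{600}}$
$= \sqrt{\frac{0.2381}{500} + \frac{0.2381}{600}}$
$= 0.0295$
$z = \frac{\hat{p}_{NA} - \hat{p}_{SA}}{SE_{pooled}(\hat{p}_{NA} - \hat{p}_{SA})} = \frac{\frac{200}{500} - \frac{230}{600}}{0.0295} = \frac{1/60}{0.0295} = 0.565$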
(c) Give the P-value or range of P-Values.
On your calculator: normalcdf (0.565, 999) = 0.2860.
(d) What decision should the scientist make at a significance level of 5%?
Since the p-value is “large” (0.2860 > 0.05), Do Not Reject H0 . The results are
not significant.
4. A scientist believes that the proportion of North American bears that are hostile is
greater than the proportion of South American bears. She samples 800 North American
bears and 200 are hostile. She samples 1200 South American bears and 240 are hostile.
(a) State the hypotheses to be tested.
H0 : pN A − pSA = 0
HA : pN A − pSA > 0
(b) Compute the sample statistic (z value or t value). You must show work to receive
credit.
This is the hypothesis test for the difference of proportions.
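$\hat{p}_{pooled} = \frac{\text{\#Success Grp1} + \text{\#Success Grp2}}{n_1 + n_2} = \frac{200+240}{800+1200} = 0.22$
$SE_{pooled}(\hat{p}_{NA} - \hat{p}_{SA}) = \sqrt{\frac{\hat{p}_{pooled}(1-\hat{p}_{pooled})}{n_1} + \frac{\hat{p}_{pooled}(1-\hat{p}_{pooled})}{n_2}}$
$= \sqrt{\frac{0.22(1-0.22)}{800} + \frac{0.22(1-0.22)}{1200}}$
$= \sqrt{\frac{0.1716}{800} + \frac{0.1716}{1200}}$
$= 0.0189$
$z = \frac{\hat{p}_{NA} - \hat{p}_{SA}}{SE_{pooled}(\hat{p}_{NA} - \hat{p}_{SA})} = \frac{\frac{200}{800} - \frac{240}{1200}}{0.0189} = \frac{0.05}{0.0189} = 2.646$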
(c) Give the P-value or range of P-Values.
On your calculator: normalcdf (2.646, 999) = 0.0041.
(d) What decision should the scientist make at a significance level of 5%?
Since the p-value is “small” (0.0041 < 0.05), Reject H0 . The results are significant.
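The pooled test used in Problems 3 and 4 can be sketched as follows. The function name is an illustrative assumption, the alternative is fixed to the upper-tailed case used in both problems, and `NormalDist().cdf` stands in for the calculator's normalcdf. Small differences from the hand-computed z-values come from rounding the standard error to four decimals by hand.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z, p_value) for H0: p1 - p2 = 0 against HA: p1 - p2 > 0."""
    p_pooled = (x1 + x2) / (n1 + n2)                       # pooled proportion under H0
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    p_value = 1 - NormalDist().cdf(z)                      # upper-tail area
    return z, p_value

print(two_proportion_z_test(200, 500, 230, 600))    # bees
print(two_proportion_z_test(200, 800, 240, 1200))   # bears
```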
Q61 pg 578 A man who moves to a new city sees that there are two routes he could take to work.
A neighbor who has lived there a long time tells him Route A will average 5 minutes
faster than Route B. The man decides to experiment; he wants to find out if the mean
difference between Route A and B is different from 5 minutes. Each day, he flips a coin
to determine which way to go, driving each route 20 days. He finds that Route A takes
an average of 40 minutes, with a standard deviation of 3 minutes, and Route B takes
an average of 43 minutes, with a standard deviation of 2 minutes. Histograms of travel
times for the routes are roughly symmetric and show no outliers. Assume α = 0.05.
(a) Find a 95% confidence interval for the difference in average commuting time for
the two routes. Use df= 33.
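Since df = 33, $t^*_{33} = 2.0345$.
$CI: (\bar{y}_B - \bar{y}_A) \pm t^*_{df} \sqrt{\frac{s_B^2}{n_B} + \frac{s_A^2}{n_A}}$
$\Rightarrow (43 - 40) \pm 2.0345 \sqrt{\frac{2^2}{20} + \frac{3^2}{20}}$
$\Rightarrow 3 \pm 2.0345 \sqrt{0.65}$
$\Rightarrow (1.36, 4.64)$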
Note that this result means that we are 95% confident that Route B has a mean
commuting time between 1.36 and 4.64 minutes more than the mean commuting
time of Route A. Also, because 5 minutes is not within the interval, it appears
that the neighbor may be exaggerating the average difference in commuting time.
(b) State the hypotheses to be tested.
H0 : µB − µA = 5
HA : µB − µA ≠ 5
(c) Compute the value of the test score.
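$t_{33} = \frac{(\bar{y}_B - \bar{y}_A) - \Delta_0}{\sqrt{\frac{s_B^2}{n_B} + \frac{s_A^2}{n_A}}} = \frac{(43-40)-5}{\sqrt{\frac{2^2}{20} + \frac{3^2}{20}}} = \frac{-2}{\sqrt{0.65}} = -2.481$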
(d) Give the P-value or range of P-values.
On your calculator: 2tcdf (−999, −2.481, 33) = 2(0.00919) = 0.0184.
On the table: go to degrees of freedom 33, find where 2.481 is in the row, and
then look at the two-tail probability values. The p-value is between 0.01 and 0.02.
(e) Do the results seem significant?
Since the p-value is “small” (0.0184 < α = 0.05), Reject H0 . The results are
significant. There is evidence to conclude that the average difference in commuting
time is different from 5 minutes. However, we don’t know if it is higher than 5
minutes or lower than 5 minutes because we did not test for that.
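Parts (a) and (c) of the commuting problem share the same standard error, which the sketch below makes explicit. The function name and argument order are illustrative assumptions, and the critical value t∗ is supplied from a table, as in the solution.

```python
import math

def two_sample_t(y1, s1, n1, y2, s2, n2, t_star, delta0=0.0):
    """Return ((lower, upper), t) for the difference of means y1 - y2.

    delta0 is the hypothesized difference under H0; t_star is the critical
    value for the chosen confidence level and degrees of freedom.
    """
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)   # unpooled standard error
    diff = y1 - y2
    ci = (diff - t_star * se, diff + t_star * se)
    t = (diff - delta0) / se
    return ci, t

# Route B minus Route A, df = 33, hypothesized difference of 5 minutes
ci, t = two_sample_t(43, 2, 20, 40, 3, 20, t_star=2.0345, delta0=5)
print(ci, t)
```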
Q78 pg 582 Researchers randomly assigned participants either a tall, thin “highball” glass or a
short, wide “tumbler,” each of which held 355 ml. Participants were asked to pour 1.5
oz = 44.3 ml of water into their glass. Did the shape of the glass make a difference in
how much liquid they poured? In particular, test to see if they poured less water into
the “highball” glass than the “tumbler”. Assume α = 0.1. Here are the summaries:
            n     ȳ          s
Highball    99    42.2 ml    16.2 ml
Tumbler     99    60.9 ml    17.9 ml
(a) Find a 90% confidence interval for the difference in average water held for the two
glasses. Use df = 194.
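Because we are looking for a 90% confidence interval with df = 194, $t^*_{194} = 1.6528$.
$CI: (\bar{y}_H - \bar{y}_T) \pm t^*_{df} \sqrt{\frac{s_H^2}{n_H} + \frac{s_T^2}{n_T}}$
$\Rightarrow (42.2 - 60.9) \pm 1.6528 \sqrt{\frac{16.2^2}{99} + \frac{17.9^2}{99}}$
$\Rightarrow -18.7 \pm 1.6528 \sqrt{5.8874}$
$\Rightarrow -18.7 \pm 1.6528(2.4264)$
$\Rightarrow (-22.71, -14.69)$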
(b) State the hypotheses to be tested.
H0 : µH − µT = 0
HA : µH − µT < 0
(c) Compute the value of the test score. (Assume all conditions are met.)
$t_{194} = \frac{(\bar{y}_H - \bar{y}_T) - 0}{\sqrt{\frac{s_H^2}{n_H} + \frac{s_T^2}{n_T}}} = \frac{42.2 - 60.9}{\sqrt{\frac{16.2^2}{99} + \frac{17.9^2}{99}}} = \frac{-18.7}{2.4264} = -7.707$
(d) Give the P-value or range of P-values.
On your calculator: tcdf (−999, −7.707, 194) ≈ 0.
On the table (or an online table): go to degrees of freedom 194, find where 7.707
is in the row, and then look at the one-tail probability values. Compare 7.707 to
the values on the table. The p-value is less than 0.001.
(e) Do the results seem significant?
Since the p-value is “small” (0 < α = 0.1), Reject H0 . The results are significant. There is sufficient evidence to conclude that they poured less water into the
“highball” glass than the “tumbler”.
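The problems from Chapter 20 supply the degrees of freedom (df = 33 and df = 194) without showing where they come from. One common source, assumed here rather than stated in this guide, is the Welch-Satterthwaite approximation for two-sample t procedures. The sketch below, with an illustrative function name, gives values that round down to the df used in both problems.

```python
def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximate degrees of freedom for two samples."""
    v1 = s1 ** 2 / n1   # estimated variance of the first sample mean
    v2 = s2 ** 2 / n2   # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

print(welch_df(2, 20, 3, 20))         # commuting routes
print(welch_df(16.2, 99, 17.9, 99))   # highball vs. tumbler
```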
10.6 Various Chapters
1. We want to estimate the healing rate for a wound. A sample of size 17 is collected
and the sample mean is computed to be 24.3 micrometers per hour, with a sample
standard deviation of s= 8 micrometers per hour. What is a 95% confidence interval
for the population mean?
What we are given:
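ȳ = 24.3
s = 8
n = 17
Because we want the confidence interval for the population mean, use the formula
$CI: \bar{y} \pm t^*_{n-1} \frac{s}{\sqrt{n}}$
$\Rightarrow 24.3 \pm t^*_{17-1} \frac{8}{\sqrt{17}}$
$\Rightarrow 24.3 \pm 2.120 \frac{8}{\sqrt{17}}$
$\Rightarrow (20.187, 28.413)$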
2. A sample of size n=150 people is collected and the sample proportion of people who are
illiterate is computed to be .20. Compute a 95% confidence interval for the population
proportion of illiterate people.
What we are given:
n = 150
p̂ = 0.2
Because we want the confidence interval for the population proportion, use the formula
$CI: \hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
$\Rightarrow 0.2 \pm 1.96 \sqrt{\frac{0.2(1-0.2)}{150}}$
$\Rightarrow 0.2 \pm 1.96 \sqrt{\frac{0.16}{150}}$
$\Rightarrow (0.136, 0.264)$
3. You believe that the proportion of people that like cheese is .80. You are going to
construct a 95% confidence interval and want the margin of error to be plus or minus
.03. What should the sample size be?
What we are given:
p̂ = 0.8
M OE = 0.03
Since this is dealing with one proportion, use the formula
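$MOE = z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \Rightarrow 0.03 = 1.96 \sqrt{\frac{0.8(1-0.8)}{n}}$
$\Rightarrow \frac{0.03}{1.96} = \sqrt{\frac{0.16}{n}}$
$\Rightarrow \left(\frac{0.03}{1.96}\right)^2 = \frac{0.16}{n}$
$\Rightarrow n = \frac{0.16}{\left(\frac{0.03}{1.96}\right)^2}$
$\Rightarrow n = 682.95$
$\Rightarrow n \approx 683$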
4. Teresa knows that appointment wait times are approximately normally distributed. She
believes the mean wait time is longer than 25 minutes. She conducts a test with α
= 0.05 and the appropriate hypotheses. She selects 25 random appointments and finds
a sample mean of 25.66 minutes and a sample standard deviation of 10 minutes.
(a) State the hypotheses to be tested.
H0 : µ = 25
HA : µ > 25
(b) Compute the value of the test score.
Since this is asking for us to test the population mean, we need the formula
$t_{n-1} = \frac{\bar{y} - \mu_0}{s/\sqrt{n}}$
$\Rightarrow t_{25-1} = t_{24} = \frac{25.66 - 25}{10/\sqrt{25}} = \frac{0.66}{2} = 0.33$
(c) Give the P-value or range of P-values.
On your calculator: tcdf (0.33, 999, 24) = 0.372.
On the table: go to degrees of freedom 24, find where 0.33 is on the row, and then
look at the one-tail probability values. The probability is greater than 0.10.
(d) Do the results seem significant?
Since the p-value is “large” (0.372 > α = 0.05), Do Not Reject H0 . The results
are not significant.
5. You claim that the proportion of people who watch American Idol is greater than .50.
You sample n=200 people and compute a sample proportion of .53. Assume α = 0.05.
(a) State the hypotheses to be tested.
H0 : p = 0.5
HA : p > 0.5
(b) Compute the value of the test score.
Since this is asking for us to test the population proportion, we need the formula
$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}} = \frac{0.53 - 0.5}{\sqrt{\frac{0.5(1-0.5)}{200}}} = \frac{0.03}{\sqrt{\frac{0.25}{200}}} = \frac{0.03}{\sqrt{0.00125}} = 0.849$
(c) Give the P-value or range of P-values.
On your calculator: normalcdf (0.849, 999) = 0.198.
(d) Do the results seem significant?
Since the p-value is “large” (0.198 > α = 0.05), Do Not Reject H0 . The results
are not significant.
6. You want to compare the proportion of gamers amongst women and men. You survey
300 women and 400 men. 175 of the women were gamers and 200 of the men were
gamers. Construct a 95% confidence interval for the difference of proportions.
What we are given:
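$\hat{p}_W = \frac{175}{300}$
$\hat{p}_M = \frac{200}{400}$
Because we want the confidence interval for the difference of proportions, use the
formula
$CI: (\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
$\Rightarrow \left(\frac{175}{300} - \frac{200}{400}\right) \pm 1.96 \sqrt{\frac{\frac{175}{300}\left(1-\frac{175}{300}\right)}{300} + \frac{\frac{200}{400}\left(1-\frac{200}{400}\right)}{400}}$
$\Rightarrow \frac{1}{12} \pm 1.96 \sqrt{\frac{35/144}{300} + \frac{0.25}{400}}$
$\Rightarrow (0.0091, 0.1576)$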
7. You believe that the proportion of men that are colorblind is greater than the
proportion of women that are colorblind. You sample 900 men and 90 of them are
colorblind. You sample 700 women and 45 of them are colorblind. Assume α = 0.05.
(a) State the hypotheses to be tested.
H0 : pM − pW = 0
HA : pM − pW > 0
(b) Compute the value of the test score.
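$\hat{p}_{pooled} = \frac{\text{Number of Successes in Group 1} + \text{Number of Successes in Group 2}}{n_1 + n_2} = \frac{90+45}{900+700} = 0.084375$
$SE_{pooled}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_{pooled}(1-\hat{p}_{pooled})}{n_1} + \frac{\hat{p}_{pooled}(1-\hat{p}_{pooled})}{n_2}}$
$= \sqrt{\frac{0.084375(1-0.084375)}{900} + \frac{0.084375(1-0.084375)}{700}}$
$= \sqrt{8.584 \times 10^{-5} + 1.104 \times 10^{-4}}$
$= \sqrt{1.9621 \times 10^{-4}}$
$= 0.014$
$z = \frac{\hat{p}_1 - \hat{p}_2}{SE_{pooled}(\hat{p}_1 - \hat{p}_2)} = \frac{\frac{90}{900} - \frac{45}{700}}{0.014} = \frac{1/28}{0.014} = 2.55$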
(c) Did you use the pooled proportion in part b.?
YES
(d) Compute the P-value.
On your calculator: normalcdf (2.55, 999) = 0.0054
(e) Are the results significant?
Since the p-value is “small” (0.0054 < α = 0.05), Reject H0 . The results are
significant.