Answer key

Recitation #9 Answer Key
HYPOTHESIS TESTING AND
SIGNIFICANCE TESTS
CONCEPT REVIEW
Null and alternative hypotheses
The null hypothesis (H0) often represents either a skeptical perspective or a
claim to be tested (often a position of ‘no difference’). The alternative hypothesis
(HA) represents an alternative claim under consideration and is often
represented by a range of possible parameter values.
Even if we fail to reject the null hypothesis, we typically do not accept the null
hypothesis as true. Failing to find strong evidence for the alternative hypothesis
is not equivalent to accepting the null hypothesis. This is why we say we ‘fail
to reject H0’.
Significance tests
They are equivalent to using confidence intervals to test hypotheses.
Suppose H0: μ = v.
We do not reject H0 at the 95% confidence level if the 95% CI built around the sample mean
m includes v.
Do not reject H0 at 95% if:
m – t0.025 × SE < v < m + t0.025 × SE
or, equivalently, if:
– t0.025 < (m – v) / SE < + t0.025
(Note that we are talking about two-tailed tests, but our critical t values come from a right-tail table. This is why I am using t0.025 for the 95% confidence level: on the
right-tail t distribution table, 0.025 refers to the 2.5% of the distribution that lies to the right of
a 95% CI. That is, the critical t value for a two-tailed test at the 0.05 level equals
t0.025 for a one-tailed test.)
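The equivalence between the confidence-interval check and the t-statistic check can be sketched in a few lines of Python. This is only an illustration: the function names and the numbers in `examples` are hypothetical and are not taken from the exercises.

```python
def ci_contains(m, v, se, t_crit):
    """True if the null value v lies inside the interval m ± t_crit * se."""
    return m - t_crit * se < v < m + t_crit * se


def t_within_critical(m, v, se, t_crit):
    """True if the t statistic (m - v) / se lies between -t_crit and +t_crit."""
    return -t_crit < (m - v) / se < t_crit


# Hypothetical numbers, chosen only for illustration.
examples = [
    (10.4, 10.0, 0.25, 2.000),  # null value falls inside the interval
    (11.0, 10.0, 0.25, 2.000),  # null value falls outside the interval
]
for m, v, se, t_crit in examples:
    by_ci = ci_contains(m, v, se, t_crit)
    by_t = t_within_critical(m, v, se, t_crit)
    print(m, v, by_ci, by_t)  # the two checks agree in each row
```

Both functions answer the same question ("do we fail to reject H0?"), so either form may be used in the exercises that follow.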
Practice exercises
Writing hypotheses
1 For a)-c), is it a null hypothesis, or an alternative hypothesis?
a) In Canada, the proportion of adults who favor legalized gambling equals 0.50.
b) The proportion of all Canadian College students who are regular smokers now
is less than 0.24 (the value it was ten years ago).
c) The mean IQ of all students at Lake Wobegon High School is larger than 100.
a) Null hypothesis
b) Alternative hypothesis
c) Alternative hypothesis
2 Write the null and alternative hypotheses in words and then symbols for each of the
following situations.
(a) New York is known as “the city that never sleeps”. A random sample of 25 New
Yorkers were asked how much sleep they get per night. Do these data provide
convincing evidence that New Yorkers on average sleep less than 8 hours a night?
(b) Employers at a firm are worried about the effect of March Madness, a basketball
championship held each spring in the US, on employee productivity. They estimate
that on a regular business day employees spend on average 15 minutes of company
time checking personal email, making personal phone calls, etc. They also collect data
on how much company time employees spend on such non-business activities during
March Madness. They want to determine if these data provide convincing evidence
that employee productivity decreases during March Madness.
(a) H0 : μ = 8 (On average, New Yorkers sleep 8 hours a night.)
HA : μ < 8 (On average, New Yorkers sleep less than 8 hours a night.)
(b) H0 : μ = 15 (The average amount of company time each employee spends not
working is 15 minutes for March Madness.)
HA : μ > 15 (The average amount of company time each employee spends not working
is greater than 15 minutes for March Madness.)
3 A study suggests that the average college student spends 2 hours per week
communicating with others online. You believe that this is an underestimate and
decide to collect your own sample for a hypothesis test. You randomly sample 60
students from your dorm and find that on average they spent 3.5 hours a week
communicating with others online. A friend of yours, who offers to help you with the
hypothesis test, comes up with the following set of hypotheses. Indicate any errors
you see.
H0 : x̄ < 2 hours
HA : x̄ > 3.5 hours
First, the hypotheses should be about the population mean (μ) not the sample mean.
Second, the null hypothesis should have an equal sign and the alternative hypothesis
should be about the null hypothesized value, not the observed sample mean. The
correct way to set up these hypotheses is shown below:
H0 :μ = 2 hours
HA :μ > 2 hours
The one-sided test indicates that we are only interested in showing that 2 is an
underestimate. Here the interest is in only one direction, so a one-sided test seems
most appropriate. If we were also interested in whether the data showed strong evidence
that 2 was an overestimate, then the test should be two-sided.
Confidence Intervals
4 Consider whether there is strong evidence that the average age of runners has
changed from 2006 to 2012 in the Cherry Blossom Run using a confidence
interval approach. In 2006, the average age was 36.13 years, and in 2012 the
average was 35.05 years with a standard deviation of 8.97 years for 100 runners.
First, set up the hypotheses:
H0: The average age of runners has not changed from 2006 to 2012, μage = 36.13.
HA: The average age of runners has changed from 2006 to 2012, μage ≠ 36.13.
We assume the normal model may be applied to ȳ. Using the sample mean and
standard error, we can construct a 95% confidence interval for μage to determine
if there is sufficient evidence to reject H0:
ȳ ± 1.96 × s/√100 → 35.05 ± 1.96 × 0.90 → (33.29, 36.81)
This confidence interval contains the null value, 36.13. Because 36.13 is not
implausible, we cannot reject the null hypothesis. We have not found strong
evidence that the average age is different than 36.13 years.
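As a quick check of the arithmetic, the interval can be recomputed in Python. This is a minimal sketch; the helper name `normal_ci` is our own, and it uses the normal critical value 1.96 as in the solution.

```python
import math


def normal_ci(mean, sd, n, z=1.96):
    """Confidence interval mean ± z * sd / sqrt(n) under the normal model."""
    se = sd / math.sqrt(n)
    return mean - z * se, mean + z * se


# Values from the 2012 Cherry Blossom Run sample in the exercise.
low, high = normal_ci(35.05, 8.97, 100)
null_value = 36.13  # the 2006 average age

print(round(low, 2), round(high, 2))
print("null value inside interval:", low < null_value < high)
```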
5 Colleges frequently provide estimates of student expenses such as housing. A
consultant hired by a community college claimed that the average student
housing expense was $650 per month. What are the null and alternative
hypotheses to test whether this claim is accurate?
The sample mean for student housing is $611.63 and the sample standard
deviation is $132.85. Construct a 95% confidence interval for the population
mean and evaluate your hypotheses.
H0: The true average student housing expense μ = $650.
Ha: The true average student housing expense μ ≠ $650.
The standard error associated with the mean may be estimated using the sample
standard deviation divided by the square root of the sample size. Recall that n =75
students were sampled.
SE = s/√n = 132.85/√75 = 15.34
The data are a simple random sample and the sample (presumably) represents no more than 10%
of all students at the college, so the observations are independent. The sample size is
also sufficiently large (n = 75) and the data exhibit only moderate skew. Thus, the
normal model may be applied to the sample mean, and a 95% confidence
interval may be accurately constructed:
x̄ ± z* × SE → 611.63 ± 1.96 × 15.34 → (581.56, 641.70)
Because the null value $650 is not in the confidence interval, a true mean of $650 is
implausible and we reject the null hypothesis. The data provide statistically
significant evidence that the actual average housing expense is less than $650 per
month.
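The whole procedure for this exercise (standard error, interval, decision) can be sketched as one small function. The name `ci_test` and its return layout are our own choices for illustration, not part of the course material.

```python
import math


def ci_test(sample_mean, sd, n, null_value, z_star=1.96):
    """Two-sided test via a confidence interval.

    Returns the standard error, the interval, and whether H0 is rejected
    (H0 is rejected when the null value falls outside the interval).
    """
    se = sd / math.sqrt(n)
    low, high = sample_mean - z_star * se, sample_mean + z_star * se
    reject = not (low < null_value < high)
    return se, (low, high), reject


# Housing expense data from the exercise: mean 611.63, SD 132.85, n = 75.
se, (low, high), reject = ci_test(611.63, 132.85, 75, 650)
print(round(se, 2), round(low, 2), round(high, 2), reject)
```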
6 A survey was conducted on 203 undergraduates from Duke University who took an
introductory statistics course in Spring 2012. Among many other questions, this
survey asked them about the number of exclusive relationships they have been in. The
histogram below shows the distribution of the data from this sample. The sample
average is 3.2 with a standard deviation of 1.97.
Estimate the average number of exclusive relationships Duke students have been in
using a 90% confidence interval and interpret this interval in context. Check any
conditions required for inference, and note any assumptions you must make as you
proceed with your calculations and conclusions.
We must assume it is a simple random sample to move forward; in practice, we would
investigate whether this is the case, but here we will just report that we are making
this assumption. Note that there are no students who have had no exclusive
relationships in the sample, which suggests some student responses are likely missing
(perhaps only positive values were reported). The sample size is at least 30. The skew
is strong, but the sample is relatively large so this is not a concern.
SE = s/√n = 1.97/√203 = 0.138, and z* = 1.645 for 90% confidence, so the
90% confidence interval is 3.2 ± 1.645 × 0.138 → (2.97, 3.43)
We are 90% confident that the true value of average number of exclusive
relationships that Duke students have been in is between 2.97 and 3.43.
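The 90% interval can be reproduced with Python's standard library. This sketch assumes the normal model, so the critical value z* is taken from the standard normal distribution via `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

mean, sd, n = 3.2, 1.97, 203  # sample summary from the exercise
confidence = 0.90

# z* leaves (1 - confidence) / 2 of the standard normal in each tail.
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
se = sd / math.sqrt(n)
low, high = mean - z_star * se, mean + z_star * se

print(round(z_star, 3), round(low, 2), round(high, 2))
```

Changing `confidence` to 0.95 widens the interval, since z* grows from about 1.645 to about 1.96.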
Significance Tests
7 We want to test H0: μ = 100 against Ha: μ ≠ 100 for a dataset containing 101
observations. If the t statistic is 1.2 this indicates that:
a) There is strong evidence that μ = 100
b) There is not enough evidence to conclude that μ ≠ 100
c) There is strong evidence that μ ≠ 100
d) There is strong evidence that μ > 100
e) There is strong evidence that μ < 100
f) If μ were equal to 100, it would be unusual to obtain data such as those
observed.
g) We do not have enough information to make any conclusions.
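One way to reason through this question is to compare the t statistic with the two-sided critical value for df = 100. The sketch below assumes the standard table value t0.025 = 1.984 for df = 100, and uses a normal approximation for the two-sided p-value (an assumption that is reasonable only because df is large).

```python
from statistics import NormalDist

t_stat = 1.2
t_crit = 1.984  # t0.025 for df = 100, taken from a standard t table

# Normal approximation to the two-sided p-value; with df = 100 the
# t distribution is close to the standard normal.
p_approx = 2 * (1 - NormalDist().cdf(abs(t_stat)))

reject = abs(t_stat) > t_crit
print("reject H0:", reject)
print("approximate two-sided p-value:", round(p_approx, 2))
```

Because |1.2| is below the critical value, we fail to reject H0, which matches the statement that there is not enough evidence to conclude that μ ≠ 100.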
8 A study compared treatments for teenage girls suffering from anorexia. For
each girl, the study observed her change in weight while receiving the therapy.
Let μ denote the population mean change in weight for the cognitive behavioral
treatment. We would like to test whether this treatment has an effect. For the 17
girls who received family therapy, the changes in weight were
11, 11, 6, 9, 14, -3, 0, 7, 22, -5, -4, 13, 13, 9, 4, 6, 11.
Part of our output shows:
Mean
95% Confidence Interval
Lower
Upper
3.60
t-value
df
2-Tail Sig
0.0007
Fill in the missing results.
To test whether the treatment has an effect we would test:
H0: μ = 0
Ha: μ≠ 0
If N=17, then df=16.
We need the sample mean in order to find the upper bound of the
confidence interval: x̄ = 7.29.
We know that 3.60 = 7.29 – t0.025 × se, where t0.025 = 2.120 for df = 16, so
se = (3.60 – 7.29) / (–2.120) = 1.74
Upper = 7.29 + 2.120 × 1.74 = 10.98
t-value = (sample mean – mean in null hypothesis) / se = (7.29 – 0) / 1.74 = 4.19
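The missing entries can also be recovered directly from the 17 observations. The sketch below is one way to do so with Python's standard library; it takes the critical value t0.025 = 2.120 for df = 16 from the solution rather than computing it.

```python
import math
import statistics

# Weight changes for the 17 girls, as listed in the exercise.
changes = [11, 11, 6, 9, 14, -3, 0, 7, 22, -5, -4, 13, 13, 9, 4, 6, 11]
t_crit = 2.120  # t0.025 for df = 16, as used in the solution

n = len(changes)
mean = statistics.mean(changes)
se = statistics.stdev(changes) / math.sqrt(n)  # sample SD divided by sqrt(n)

lower = mean - t_crit * se
upper = mean + t_crit * se
t_value = (mean - 0) / se  # null hypothesis value is 0

print("df:", n - 1, "mean:", round(mean, 2), "se:", round(se, 2))
print("CI:", round(lower, 2), round(upper, 2), "t:", round(t_value, 2))
```

The last digit of the upper bound may differ slightly from the hand calculation, because the hand calculation rounds the mean and se before combining them.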
9 A manufacturer claims that bearings produced by their machine last 7 hours on
average under harsh conditions. A factory worker randomly samples 75 ball bearings,
and records their lifespans under harsh conditions. He calculates a sample mean of
6.85 hours, and the standard deviation of the data is 1.25 working hours. The
following histogram shows the distribution of the lifespans of the ball bearings in this
sample. Conduct a formal hypothesis test of this claim. Make sure to check that
relevant conditions are satisfied.
The sample is presumably a simple random sample, though we should verify that is
the case. Generally, this is what is meant by “random sample”, though it is a good
idea to actually check. The sample size is at least 30. The data are only slightly
skewed. Under the assumption that the random sample is a simple random sample, x̄
will be approximately normally distributed.
H0 : μ = 7 hours.
HA :μ ≠ 7 hours.
SE = s/√n = 1.25/√75 = √3/12 ≈ 0.144
t statistic = (sample mean – assumed true value) / SE = (6.85 – 7) / (√3/12) = –1.04
The degrees of freedom are N – 1 = 74. We look at the t distribution critical values table
(page 593) and find that the critical t.025 values (also noted as t* at the 95% confidence
level) for 60 and 80 degrees of freedom are 2.000 and 1.990 respectively. Since
– t.025 < –1.04 < t.025, that is, |–1.04| < 2.000 and |–1.04| < 1.990,
we cannot reject the null hypothesis. Note that even though we do not have the exact
t.025 (critical t) value for 74 degrees of freedom, the values for both 80 and
60 degrees of freedom are greater than the absolute value of our t statistic, so we can
be sure that we cannot reject our null hypothesis.
All in all, the data do not provide convincing evidence that the average lifespan of all
ball bearings produced by this machine is different than 7 hours.
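The bracketing argument (the table has no row for df = 74, so we check the rows on either side) can be sketched as follows. The function name `t_statistic` is our own; the critical values are the ones quoted in the solution.

```python
import math


def t_statistic(sample_mean, null_value, sd, n):
    """t = (sample mean - null value) / (sd / sqrt(n))."""
    return (sample_mean - null_value) / (sd / math.sqrt(n))


t_stat = t_statistic(6.85, 7, 1.25, 75)

# Critical t.025 values quoted in the solution for the two table rows
# that bracket df = 74.
bracketing_critical_values = {60: 2.000, 80: 1.990}

# If |t| exceeds neither bracketing value, it cannot exceed the value
# for df = 74, which lies between them.
reject_for_any_row = any(
    abs(t_stat) > crit for crit in bracketing_critical_values.values()
)
print(round(t_stat, 2), "reject H0:", reject_for_any_row)
```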
10 A hospital administrator randomly selected 64 patients and measured the time (in
minutes) between when they checked in to the ER and the time they were first seen by
a doctor. The average time is 137.5 minutes and the standard deviation is 39 minutes.
He is getting grief from his supervisor on the basis that the wait times in the ER
increased greatly from last year’s average of 127 minutes. However, the administrator
claims that the increase is probably just due to chance.
(a) Are conditions for inference met? Note any assumptions you must make to
proceed.
(b) Using a significance level of 0.05 (95% confidence), is the change in wait times statistically
significant? Use a two-sided test since it seems the supervisor had to inspect the data
before he suggested an increase occurred.
(c) Would the conclusion of the hypothesis test change if the significance level was
changed to 0.01 (99% confidence)?
a) The sample size is at least 30. No information is provided about the skew. In
practice, we would ask to see the data to check this condition, but here we will
make the assumption that the skew is not very strong.
b) H0 : μ = 127.
HA : μ ≠ 127.
SE = 39/√64 = 4.875
t-statistic = (sample mean – null hypothesis value) / SE = (137.5 – 127) / 4.875 = 2.15
df = N – 1 = 63. Looking at Table B (page 593), we find that the critical t.025 for 60 df is
2.000. Since the value for 63 df would be slightly lower, we can confidently say that
|2.15| > t.025
Thus, we reject the null hypothesis. The data provide convincing evidence that the
average ER wait time has increased over the last year.
c) For a 99% confidence level, our critical t value (t.005) would be 2.66. Since
|2.15| < t.005
we fail to reject the null hypothesis. The conclusion of the hypothesis test would change.
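Parts (b) and (c) differ only in the critical value, which a short Python sketch makes explicit. The critical values below are the df = 60 table entries quoted in the solution; the dictionary layout is just one convenient way to organize the comparison.

```python
import math

sample_mean, null_value, sd, n = 137.5, 127, 39, 64
se = sd / math.sqrt(n)
t_stat = (sample_mean - null_value) / se

# Critical values for df = 60 quoted in the solution
# (the table has no row for df = 63).
critical = {"95%": 2.000, "99%": 2.66}

# Reject H0 at a given level when |t| exceeds that level's critical value.
decisions = {level: abs(t_stat) > crit for level, crit in critical.items()}
print(se, round(t_stat, 2), decisions)
```

The same t statistic leads to rejection at one level and not at the other, which is why the confidence level should be chosen before looking at the data.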
11 A study considers whether the mean score μ on a college entrance exam for
students in 2007 is any different from the mean of 500 for students in 1957. To do
this, researchers used a nationwide random sample of 10,000 students who took the
exam in 2007, with ȳ = 497 and s = 100. Show that the result is highly significant
statistically, but not practically significant.
H0: μ = 500
Ha: μ ≠ 500
se = s/√n = 100/√10000 = 1
t statistic = (497 – 500) / 1 = –3
From Table B, the critical t value for a 95% confidence level and 9999 degrees of
freedom (we look at df = ∞) is 1.96. Since
|–3| > 1.96
we can reject the null hypothesis.
This result is highly significant: the absolute value of our t statistic is well above
our critical t, and we can even reject H0 at the 99% level since |–3| > t.005 = 2.576.
Even though we have strong evidence against the null hypothesis, we must note
that this is not an important finding in any practical sense. Even with a large sample of
10,000 students, our sample mean of 497 is still very close to our null
hypothesis value of 500. Our big sample size allows us to be very sure that the value
of the mean is not 500, but our t test tells us nothing about how far the
parameter falls from the value in H0. In this case it lies close enough that our result cannot be
considered to have much practical importance.
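To see how sample size drives statistical significance, the sketch below holds the sample mean, null value, and standard deviation fixed and varies only n. The sample size of 10,000 is from the exercise; the smaller sizes are hypothetical and included only for comparison.

```python
import math


def t_for_sample_size(sample_mean, null_value, sd, n):
    """t statistic for a given sample size, everything else held fixed."""
    return (sample_mean - null_value) / (sd / math.sqrt(n))


# n = 10_000 is the sample size in the exercise; the smaller sizes are
# hypothetical and used only to show the effect of n.
for n in (100, 1_000, 10_000):
    t = t_for_sample_size(497, 500, 100, n)
    print(n, round(t, 2), "exceeds 1.96:", abs(t) > 1.96)
```

The gap between the sample mean and 500 is the same 3 points in every row; only the standard error shrinks as n grows, so a practically small difference can become statistically significant.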