Math 103 - Activity #35

Math 103 - Cooley
Statistics for Teachers
OCC
Activity #35 – Hypothesis Testing
California State Content Standard - Statistics, Data Analysis, and Probability
N/A
Hypothesis : A statement that something is true.
Null Hypothesis H0 : The statement being tested in a statistical test. The test is designed to assess the
strength of the evidence against the null hypothesis. Usually the null hypothesis is a statement of “no
effect” or “no difference.”
Alternative Hypothesis Ha : A hypothesis to be considered as an alternative to the null hypothesis. The
alternative hypothesis should express the hopes or suspicions we bring to the data.
Hypothesis Test : The problem in a hypothesis test is to decide whether the null hypothesis should be
rejected in favor of the alternative hypothesis.
Possibilities for the alternative hypothesis
• One-Sided Alternative – Ha: μ < μ0 or Ha: μ > μ0
• Two-Sided Alternative – Ha: μ ≠ μ0
Basic Logic of Hypothesis Testing
Take a random sample from the population. If the sample data are consistent with the null hypothesis, do
not reject the null hypothesis; if the sample data are inconsistent with the null hypothesis (in the direction
of the alternative hypothesis), reject the null hypothesis and conclude that the alternative hypothesis is
true.
Exercises:
For each of the following problems, hypothesis tests are proposed. For each hypothesis test,
a) determine the null hypothesis.
b) determine the alternative hypothesis.
c) classify the hypothesis test as two tailed, left tailed, or right tailed.
1) Agriculture Books. The R. R. Bowker Company of New York collects information on the retail
prices of books and publishes the data in The Bowker Annual Library and Book Trade Almanac.
In 2000, the mean retail price of agriculture books was $66.52. A hypothesis test is to be
performed to decide whether this year’s mean retail price of agriculture books has changed from
the 2000 mean.
2) Early-Onset Dementia. Dementia is the loss of the intellectual and social abilities severe enough
to interfere with judgment, behavior, and daily functioning. Alzheimer’s disease is the most
common type of dementia. In the article “Living with Early Onset Dementia: Exploring the
Experience and Developing Evidence-Based Guidelines for Practice” (Alzheimer’s Care
Quarterly, Vol. 5, Issue 2, pp. 111-122), P. Harris and J. Keady explored the experience and
struggles of people diagnosed with dementia and their families. A hypothesis test is to be
performed to decide whether the mean age at diagnosis of all people with early-onset dementia is
less than 55 years old.
3) Worker Fatigue. A study by M. Chen et al. titled “Heat Stress Evaluation and Worker Fatigue in
a Steel Plant” (American Industrial Hygiene Association, Vol. 64, pp. 352-359) assessed fatigue in
steel-plant workers due to heat stress. Among other things, the researchers monitored the heart rates
of a random sample of 29 casting workers. A hypothesis test is to be conducted to decide whether
the mean post-work heart rate of casting workers exceeds the normal resting heart rate of 72 beats
per minute (bpm).
Test Statistic : The statistic used as a basis for deciding whether the null hypothesis should be rejected.
Rejection region : The set of values for the test statistic that leads to rejection of the null hypothesis.
Non-rejection region : The set of values for the test statistic that leads to non-rejection of the null hypothesis.
Critical values : The values of the test statistic that separate the rejection and non-rejection regions. A
critical value is considered part of the rejection region.
Type I error : Rejecting the null hypothesis when it is in fact true.
Type II error : Not rejecting the null hypothesis when it is in fact false.
                     H0 is:
Decision:            True               False
Do not reject H0     Correct decision   Type II error
Reject H0            Type I error       Correct decision
Significance Level
The probability of making a Type I error, that is, of rejecting a true null hypothesis, is called the
significance level, α, of a hypothesis test.
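One way to see what the significance level means is to simulate many samples from a population in which the null hypothesis really is true and count how often a test rejects it. The sketch below is only an illustration: the population values (mean 50, standard deviation 10), the sample size, and the function name type_i_error_rate are all made-up assumptions, and it uses a two-tailed z-test with σ known.

```python
import random
from statistics import NormalDist, fmean

def type_i_error_rate(alpha=0.05, mu0=50.0, sigma=10.0, n=25,
                      trials=20000, seed=1):
    """Estimate how often a two-tailed z-test rejects a TRUE null hypothesis.

    All default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    # Critical value with area alpha/2 in each tail of the standard normal curve
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where H0 (mu = mu0) is true
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / n ** 0.5)
        # A critical value counts as part of the rejection region
        if abs(z) >= z_crit:
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"estimated Type I error rate: {rate:.3f}")
```

With a large number of trials, the estimated rate should land close to the chosen α, since every rejection here is a Type I error.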
Relation Between Type I and Type II Error Probabilities
For a fixed sample size, the smaller we specify the significance level, α, the larger will be the probability,
β, of not rejecting a false null hypothesis.
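The trade-off can be checked numerically for one specific case. The sketch below assumes a right-tailed z-test with σ known and a particular true mean mu1 larger than the null value mu0; the numbers (mu0 = 100, mu1 = 105, σ = 15, n = 36) and the function name beta_right_tailed are hypothetical, chosen only to illustrate the relation.

```python
from statistics import NormalDist

def beta_right_tailed(alpha, mu0, mu1, sigma, n):
    """Probability of a Type II error for a right-tailed one-mean z-test
    when the true population mean is mu1 (with mu1 > mu0)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)   # critical value
    # When the true mean is mu1, the test statistic is normal with this mean
    # and standard deviation 1
    shift = (mu1 - mu0) / (sigma / n ** 0.5)
    # H0 is not rejected when z falls below the critical value
    return std_normal.cdf(z_alpha - shift)

# Hypothetical setting: same sample size, three significance levels
for alpha in (0.10, 0.05, 0.01):
    beta = beta_right_tailed(alpha, mu0=100, mu1=105, sigma=15, n=36)
    print(f"alpha = {alpha:.2f}  ->  beta = {beta:.3f}")
```

In this setting the printed β values increase as α decreases, which is the relation stated above.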
Possible Conclusions for a Hypothesis Test
Suppose that a hypothesis test is conducted at a small significance level.
• If the null hypothesis is rejected, we conclude that the alternative hypothesis is true.
• If the null hypothesis is not rejected, we conclude that the data do not provide sufficient evidence
to support the alternative hypothesis.
Note: When the null hypothesis is rejected in a hypothesis test performed at the significance level, α, we
simply say “the test results are statistically significant at the α level.” Similarly, when the null hypothesis
is not rejected in a hypothesis test performed at the significance level, α, we simply say “the test results are
not statistically significant at the α level.”
Obtaining Critical Values
Suppose that a hypothesis test is to be performed at the significance level, α. Then the critical value(s)
must be chosen so that, if the null hypothesis is true, the probability is α that the test statistic will fall in
the rejection region.
The One-Mean z-Test (Critical Value Approach)
Assumptions
1) Simple random sample
2) Normal population or large sample (n ≥ 30)
3) σ known
Step 1 – The null hypothesis is H0: μ = μ0 and the alternative hypothesis is one of the following:
Ha: μ ≠ μ0 (Two-tailed) or Ha: μ < μ0 (Left-tailed) or Ha: μ > μ0 (Right-tailed)
Step 2 – Decide on the significance level α.
x  0
Step 3 – Compute the value of the test statistic z 
/ n
.
Step 4 – The critical value(s) are
±z_{α/2} (Two-tailed) or −z_α (Left-tailed) or z_α (Right-tailed)
Use Table II to find the critical value(s).
[Figure: three panels labeled (Two-tailed), (Left-tailed), (Right-tailed)]
Step 5 – If the value of the test statistic falls in the rejection region, reject H0; otherwise, do not reject H0.
Step 6 – Interpret the results of the hypothesis test.
The hypothesis test is exact for normal populations and is approximately correct for large samples from
non-normal populations. By saying that the hypothesis test is exact, we mean that the true significance
level is equal to α; by saying that it is approximately correct, we mean that the true significance level is
only approximately equal to α.
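The steps of the critical-value approach can be sketched in Python. This is a minimal illustration, not part of the original activity: the function name one_mean_z_test, its parameters, and the sample numbers in the usage lines are hypothetical, and the critical value is computed from the standard normal distribution rather than read from Table II.

```python
from math import sqrt
from statistics import NormalDist

def one_mean_z_test(xbar, mu0, sigma, n, alpha, tail="two"):
    """One-mean z-test, critical-value approach.

    tail is "two", "left", or "right", matching the form of Ha.
    Returns (z, critical value, reject H0?).
    """
    # Step 3: test statistic
    z = (xbar - mu0) / (sigma / sqrt(n))
    # Step 4: critical value(s) from the standard normal distribution
    std_normal = NormalDist()
    if tail == "two":
        crit = std_normal.inv_cdf(1 - alpha / 2)   # rejection region: |z| >= crit
        reject = abs(z) >= crit
    elif tail == "left":
        crit = -std_normal.inv_cdf(1 - alpha)      # rejection region: z <= crit
        reject = z <= crit
    elif tail == "right":
        crit = std_normal.inv_cdf(1 - alpha)       # rejection region: z >= crit
        reject = z >= crit
    else:
        raise ValueError('tail must be "two", "left", or "right"')
    # Step 5: the decision is carried by the boolean `reject`
    return z, crit, reject

# Hypothetical numbers, not taken from any exercise in this activity
z, crit, reject = one_mean_z_test(xbar=103.0, mu0=100.0, sigma=15.0,
                                  n=36, alpha=0.05, tail="right")
print(f"z = {z:.3f}, critical value = {crit:.3f}, reject H0: {reject}")
```

Step 6, the interpretation in words, is left to the reader; the function only reports whether the test statistic falls in the rejection region.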
Some Important Values of z
z_0.10    z_0.05    z_0.025    z_0.01    z_0.005
1.28      1.645     1.96       2.33      2.575
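These values can also be computed directly rather than looked up. The sketch below assumes z_α means the value with area α to its right under the standard normal curve; the helper name z_upper is hypothetical.

```python
from statistics import NormalDist

def z_upper(alpha):
    """Return z_alpha: the value with area alpha to its right
    under the standard normal curve."""
    return NormalDist().inv_cdf(1 - alpha)

for alpha in (0.10, 0.05, 0.025, 0.01, 0.005):
    print(f"z_{alpha} = {z_upper(alpha):.3f}")
```

The computed values agree with the table to about two decimal places; small differences in the third decimal (for example for z_0.01 and z_0.005) come from how the printed table is rounded.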
When to Use the One-Mean z-Test
• For small samples–say, of size less than 15–the z-test should be used only when the variable under
consideration is normally distributed or very close to being so.
• For samples of moderate size–say, between 15 and 30–the z-test can be used unless the data
contain outliers or the variable under consideration is far from being normally distributed.
• For large samples–say, of size 30 or more–the z-test can be used essentially without restriction.
However, if outliers are present and their removal is not justified, you should perform the
hypothesis test once with the outliers and once without them to see what effect the outliers have. If
the conclusion is affected, use a different procedure or take another sample.
• If outliers are present but their removal is justified and results in a data set for which the z-test is
appropriate (as previously stated), the procedure can be used.
Exercises:
4) Agriculture Books. The R. R. Bowker Company of New York collects information on the retail
prices of books and publishes the data in The Bowker Annual Library and Book Trade Almanac.
In 2000, the mean retail price of agriculture books was $66.52. This year’s retail prices for
28 randomly selected agriculture books are shown in the following table.
68.45 60.99 59.36 58.86 76.61 46.58 79.94
67.99 66.01 59.38 57.05 66.95 55.02 69.33
75.09 55.56 55.77 47.05 68.27 75.67 71.78
67.12 50.54 59.52 75.31 56.26 62.57 75.59
At the 10% significance level, do the data provide sufficient evidence to conclude that this year’s
mean retail price of agriculture books has changed from the 2000 mean? Assume that the
standard deviation of prices for this year’s agriculture books is $8.45. (Note: The sum of the
data is $1788.62.)
5) Early-Onset Dementia. Dementia is the loss of the intellectual and social abilities severe enough
to interfere with judgment, behavior, and daily functioning. Alzheimer’s disease is the most
common type of dementia. In the article “Living with Early-Onset Dementia: Exploring the
Experience and Developing Evidence-Based Guidelines for Practice” (Alzheimer’s Care
Quarterly, Vol. 5, Issue 2, pp. 111-122), P. Harris and J. Keady explored the experience and
struggles of people diagnosed with dementia and their families. A simple random sample of 21
people with early-onset dementia gave the following data on age at diagnosis.
60 58 52 58 59 58 51
61 54 59 55 53 44 46
47 42 56 57 49 41 43
At the 1% significance level, do the data provide sufficient evidence to conclude that the mean age
at diagnosis of all people with early-onset dementia is less than 55 years old? Assume that the
population standard deviation is 6.8 years. (Note: x̄ = 52.5 years.)
6) Worker Fatigue. A study by M. Chen et al. titled “Heat Stress Evaluation and Worker Fatigue in
a Steel Plant” (American Industrial Hygiene Association, Vol. 64, pp. 352-359) assessed fatigue
in steel-plant workers due to heat stress. A random sample of 29 casting workers had a mean
post-work heart rate of 78.3 beats per minute (bpm). At the 5% significance level, do the data provide
sufficient evidence to conclude that the mean post-work heart rate for casting workers exceeds the
normal resting heart rate of 72 bpm? Assume that the population standard deviation of post-work
heart rates for casting workers is 11.2 bpm.