Chapter 7

Hypothesis Testing with One Sample
Section 7.1 – Overview
Hypothesis – a claim or statement about a property of a population
Hypothesis Test – a standard procedure for testing a claim about a property of a
population
Below is a general outline of a hypothesis test. We will be discussing each of
these steps in more detail as we discuss Section 7.2.
Steps for Hypothesis Testing
1. Determine the hypotheses.
Null Hypothesis: an assumption concerning the value of the population parameter
being studied (usually represents no effect, no change, no difference, etc.)
Notation: H0
Alternative Hypothesis: a statement that specifies an alternative set of possible values
for the population parameter that is not included in the null hypothesis (states the
result for which we hope to find evidence)
Notation: H1 (or HA or Ha)
Note: The null hypothesis may or may not be true. We will carry out a study and then
determine if we have strong enough evidence to conclude that the null hypothesis
is false (meaning our evidence suggests that H1 is true).
2. Obtain a simple random sample of n observations from the desired population and
calculate the observed sample statistic. For example, if we want to test something about
a population proportion (p), then we would calculate the sample proportion ( p̂ ). If we
want to test something about a population mean (µ), then we would calculate the sample
mean ( x̄ ).
The test statistic is the corresponding z-score (or t-score) for the observed statistic
under the assumption that the null hypothesis is true.
3. Determine the “strength” of your evidence.
The evidence is strong if the outcome we observe is highly unlikely to occur by chance,
assuming the null hypothesis is true (meaning the data are better explained by the
alternative hypothesis).
The evidence is weak if the outcome we observe can easily occur by chance, assuming
the null hypothesis is true.
We measure the strength of the evidence by calculating a P-value.
P-value: the probability of obtaining a sample outcome as extreme or more extreme
than the actual observed outcome, assuming the null hypothesis is true.
The smaller the P-value, the stronger the evidence is against H0.
(You may also think of a small P-value as indicating a small risk of wrongly rejecting
a null hypothesis that is actually true. The P-value is not the probability that H0 is true.)
4. Draw a conclusion.
If the P-value is “small”, then we reject H0 in favor of H1.
If the P-value is “large”, then we fail to reject H0, meaning we cannot conclude H1.
Note: You may NEVER conclude that the null is true.
Unfortunately, you CANNOT be certain that you have made the correct conclusion.
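The four steps above can be sketched in Python for a claim about a proportion. This is only an illustration, not part of the original outline: the hypotheses, the data (58 successes in 100 trials), and the 0.05 cutoff for a "small" P-value are all made-up assumptions. `NormalDist` from the Python standard library supplies the normal probabilities.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: hypotheses (hypothetical claim): H0: p = 0.5  versus  H1: p > 0.5
p0 = 0.5

# Step 2: sample statistic and test statistic (made-up data)
n = 100
successes = 58
p_hat = successes / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Step 3: P-value for a right-tailed test, P(Z >= z) assuming H0 is true
p_value = 1 - NormalDist().cdf(z)

# Step 4: conclusion, treating a P-value at or below 0.05 as "small" (an assumption)
alpha = 0.05
decision = "reject H0" if p_value <= alpha else "fail to reject H0"

print(f"z = {z:.2f}, P-value = {p_value:.4f}, decision: {decision}")
```

With these made-up numbers z is 1.6 and the P-value is about 0.055, so we fail to reject H0. That does not mean H0 has been shown to be true.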
Section 7.2 – Basics of Hypothesis Testing
Null and Alternative Hypotheses
Look at exercises 5, 7, 9, and 11 on page 335.
Note: The claim that you wish to support must be worded so that it becomes the alternative
hypothesis.
Test Statistic
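For a proportion: $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}}$
For a mean (σ known): $z = \dfrac{\bar{x} - \mu}{\dfrac{\sigma}{\sqrt{n}}}$
For a mean (σ not known): $t = \dfrac{\bar{x} - \mu}{\dfrac{s}{\sqrt{n}}}$
Look at Examples 21 and 23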
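The three test statistics can be written as small Python functions. This is a sketch for checking hand calculations. The function names and the sample inputs are hypothetical, and in each case p or µ stands for the value claimed in the null hypothesis.

```python
from math import sqrt

def z_for_proportion(p_hat, p, n):
    """z = (p_hat - p) / sqrt(p*q/n), where q = 1 - p."""
    q = 1 - p
    return (p_hat - p) / sqrt(p * q / n)

def z_for_mean(x_bar, mu, sigma, n):
    """z = (x_bar - mu) / (sigma / sqrt(n)); used when sigma is known."""
    return (x_bar - mu) / (sigma / sqrt(n))

def t_for_mean(x_bar, mu, s, n):
    """t = (x_bar - mu) / (s / sqrt(n)); used when sigma is not known (d.f. = n - 1)."""
    return (x_bar - mu) / (s / sqrt(n))

# Hypothetical inputs, chosen only to exercise the formulas
print(z_for_proportion(0.56, 0.50, 400))   # proportion
print(z_for_mean(103.0, 100.0, 15.0, 36))  # mean, sigma known
print(t_for_mean(103.0, 100.0, 12.0, 16))  # mean, sigma not known
```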
Critical Region, Significance Level, Critical Value, and P-value
Critical Region – the set of all values of the test statistic that cause us to reject H0
(the set of all values that are highly unlikely to occur by chance if H0 is true)
Significance Level – the probability that we choose to use to determine if an outcome is
highly unlikely
Critical Value(s) – the value(s) that separates the critical region from the rest of the sampling
distribution
Two-tailed test
One-tailed test
Left-tailed
Right-tailed
P-value – the probability of obtaining a sample outcome as extreme or more extreme than the
actual observed outcome, assuming the null hypothesis is true.
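For a z test statistic, critical values and P-values for each kind of test can be computed as follows. This is a sketch that assumes the test statistic follows the standard normal distribution. The function names and the "left" / "right" / "two" labels are hypothetical conveniences.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def critical_values(alpha, tail):
    """Critical z value(s) separating the critical region at significance level alpha."""
    if tail == "left":
        return (Z.inv_cdf(alpha),)
    if tail == "right":
        return (Z.inv_cdf(1 - alpha),)
    if tail == "two":
        c = Z.inv_cdf(1 - alpha / 2)  # alpha is split evenly between the two tails
        return (-c, c)
    raise ValueError("tail must be 'left', 'right', or 'two'")

def p_value(z, tail):
    """Probability of a test statistic as extreme or more extreme than z, assuming H0."""
    if tail == "left":
        return Z.cdf(z)
    if tail == "right":
        return 1 - Z.cdf(z)
    if tail == "two":
        return 2 * (1 - Z.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")

for tail in ("left", "right", "two"):
    print(tail, [round(c, 3) for c in critical_values(0.05, tail)])
```

For t tests the same logic applies, but the probabilities come from the t distribution with n – 1 degrees of freedom, which the standard library does not provide.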
Look at exercises 25, 26, 27, 29, and 31
Statistically Significant
Decisions and Conclusions
Decision Criterion:
Traditional Method
P-value Method
Another Option
Look at exercises 33 and 35
Giving the P-value is ALWAYS more informative than just stating if the results are
statistically significant or not.
Advantages to this Approach:
When the P-value is reported, the decision of whether or not to reject the null hypothesis is
left up to the reader.
For example, suppose a P-value of .03 is reported.
If you, the reader, think that a 5% level of significance (α = .05) is sufficient, then you would
choose to reject the null hypothesis in favor of the alternative hypothesis.
If, however, a second reader thinks that a 5% level of significance is insufficient and would
rather use α = .01, then he or she would fail to reject the null hypothesis.
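The two readers in this example can be mimicked in a few lines of Python. The sketch assumes the usual decision rule of rejecting H0 when the P-value is at or below α. The function name `decide` is hypothetical.

```python
def decide(p_value, alpha):
    """P-value method: reject H0 when the P-value is at or below alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

reported_p = 0.03  # the P-value from the example above
for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: {decide(reported_p, alpha)}")
```

The same reported P-value leads to different decisions, which is why reporting the P-value itself is more informative than reporting only the decision.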
Publishing Our Results:
P-values are very often reported when describing the results of studies in many fields.
Therefore, it is very important to understand what they are telling you.
Example: The financial aid office of a university asks a sample of students
about their employment and earnings. The report says, “For
academic year earnings, a significant difference ( P-value = .038)
was found between the sexes, with men earning more on the
average.”
Interpretation: If there really is no difference in academic year earnings
between the sexes, then we would have seen a difference this
big or bigger in only 3.8% of all samples. (i.e. A difference this large would be
unusual if chance alone were at work.)
Consequences of Our Decisions – Type I & Type II Errors
The first thing to note is that for any hypothesis test, there are four possible outcomes, two of
which are correct and two of which are incorrect.
                        Actual Truth
Decision                H0 is true          H1 is true
Reject H0               Type I Error        Correct decision
Fail to Reject H0       Correct decision    Type II Error
Type I Error: We reject H0 when it is true.
Type II Error: We fail to reject H0 when it is false.
Probability of a Type I Error: The probability that the test statistic falls in the critical
region when the null hypothesis is true. Notation: α
Probability of a Type II Error: The probability that the test statistic does not fall in the
critical region when the null hypothesis is false. Notation: β
Power – the probability (1 – β) of rejecting a false null hypothesis
(See pages 331-333 and exercise 43 for more details.)
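To make β and power concrete, the sketch below computes them for one specific case: a right-tailed z test of H0: µ = µ0 with σ known, when the true mean is actually µ1. The choice of test, the function name, and all of the numbers are assumptions made for illustration only.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(mu0, mu1, sigma, n, alpha):
    """beta for a right-tailed z test of H0: mu = mu0 when the true mean is mu1."""
    se = sigma / sqrt(n)  # standard deviation of the sample mean
    # Smallest sample mean that lands in the critical region
    cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # Probability the sample mean stays below the cutoff when mu = mu1
    return NormalDist(mu1, se).cdf(cutoff)

# Hypothetical values: H0 says mu = 100, but the true mean is 105
beta = type_ii_error(mu0=100, mu1=105, sigma=15, n=36, alpha=0.05)
power = 1 - beta
print(f"beta = {beta:.3f}, power = {power:.3f}")
```

Increasing n or α in this sketch lowers β and raises the power, while moving µ1 closer to µ0 does the opposite.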
Implications of Rejecting or Failing to Reject the Null Hypothesis
If a test statistic falls in the critical region, it does not prove that the null hypothesis is
false. Instead, it indicates that we have strong evidence to believe it is not true.
When the test statistic falls in the critical region, there are two possibilities.
1) The null hypothesis really is false.
2) By bad luck, we have observed a very unlikely event in our sample.
Similarly, if the test statistic does not fall in the critical region, it does not prove that the
null hypothesis is true. Instead, it indicates that our evidence is not strong enough to
reject the null hypothesis.
(This is the reason we do not want to say that we accept the null hypothesis.)
Since we assume that the null hypothesis is true in the beginning, it takes strong evidence
from the data to reject it. Usually we will choose (or be given) a small α, such as .05 or
.01. By choosing α small, we can guarantee that our chance of making a Type I Error is
small.
Statistical Significance is NOT the same as practical importance.
If we use a small sample, the test has little power, so we are unlikely to reject the null hypothesis unless the true difference is large.
As the sample size increases, it becomes more likely that we will reject the null
hypothesis. Hence, if a very large sample is used, we may reject the null hypothesis, thus
reporting that our test statistic is statistically significant (meaning that it fell in the
critical region), even if the difference is not of any practical importance.
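The effect of sample size can be seen by holding a small observed difference fixed and letting n grow. The sketch below uses a two-tailed z test with σ known. The numbers (a sample mean of 100.5 against a claimed mean of 100, with σ = 15) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_p(x_bar, mu0, sigma, n):
    """Two-tailed P-value for a z test of H0: mu = mu0 with sigma known."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same tiny observed difference (0.5 units) at three sample sizes
for n in (25, 400, 10_000):
    print(n, round(two_tailed_p(100.5, 100.0, 15.0, n), 4))
```

The difference of 0.5 is the same in every row. Only the sample size changes, yet the P-value moves from large to very small.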
Section 7.3 – Testing a Claim About a Proportion
Requirements
1. The data was gathered by using a simple random sampling method.
2. The conditions for the binomial distribution are satisfied.
(Independent trials and each trial has two possible outcomes.)
3. Both np and nq are greater than or equal to 5.
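Requirement 3 is easy to check before running the test. In this small sketch, p is taken to be the value claimed in the null hypothesis. The function name is hypothetical.

```python
def normal_approx_ok(n, p):
    """Requirement 3: both n*p and n*q (with q = 1 - p) are at least 5."""
    q = 1 - p
    return n * p >= 5 and n * q >= 5

print(normal_approx_ok(40, 0.1))   # np = 4, so the requirement fails
print(normal_approx_ok(100, 0.1))  # np = 10 and nq = 90, so it holds
```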
Examples – Exercises 1, 5, 7, 11, and 17
Section 7.4 – Testing a Claim About a Mean: σ Known
Requirements
1. The data was gathered by using a simple random sampling method.
2. The value of the population standard deviation, σ, is known.
3. Either the population is already normal or n ≥ 30 so the Central Limit Theorem can be
applied.
Examples – Exercises 5, 7, 9, and 11
Section 7.5 – Testing a Claim About a Mean: σ Not Known
Requirements
1. The data was gathered by using a simple random sampling method.
2. The value of the population standard deviation, σ, is not known.
3. Either the population is already normal or n ≥ 30 so the Central Limit Theorem can be
applied.
Examples – Exercises 1, 2, 3, 4, 5, 7, 9, 13, 15, and 17
Section 7.6 – Testing a Claim About a Standard Deviation or Variance
Requirements
1. The data was gathered by using a simple random sampling method.
2. The population has a normal distribution.
Test Statistic: $\chi^2 = \dfrac{(n-1)s^2}{\sigma^2}$ with d.f. = n – 1
Examples – Exercises 1 and 5
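The chi-square test statistic can be computed directly from a sample. In this sketch the data and the claimed standard deviation are made up, the function name is hypothetical, and σ is the value stated in the null hypothesis. Critical values or P-values would still come from a chi-square table or statistical software, since the Python standard library has no chi-square distribution.

```python
from statistics import variance

def chi_square_statistic(sample, sigma0):
    """Return ((n - 1) * s^2 / sigma0^2, d.f.) for H0: sigma = sigma0."""
    n = len(sample)
    s2 = variance(sample)  # sample variance s^2 (divides by n - 1)
    return (n - 1) * s2 / sigma0 ** 2, n - 1

# Hypothetical sample and claimed standard deviation
sample = [2, 4, 4, 4, 5, 5, 7, 9]
chi2, df = chi_square_statistic(sample, sigma0=2)
print(f"chi-square = {chi2:.2f} with d.f. = {df}")
```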