Hypothesis Testing

A hypothesis is a claim or statement about a property of a
population (in our case, about the mean or a proportion of
the population).
A hypothesis test (or test of significance) is a standard
procedure for testing a claim or statement about a property
of a population.
It is extremely important to realize that we are not making
definitive conclusions. We are giving probabilistic
conclusions. We are either concluding that the results we
get are likely due to chance, or unlikely.
Examples
If we flip a coin 100 times, and 52 come up heads, this could
easily occur by chance. There is not sufficient evidence to
suggest that the coin is unfair.
If we flip a coin 100 times, and 75 come up heads, this would
be an extremely rare event if the coin was fair. The
extremely low probability is evidence that the coin may not
be fair.
Note: It would be very sloppy of us to conclude in the second
example that the coin is definitely unfair. Although
extremely rare, 75 heads is still possible by chance from a
fair coin.
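The two coin examples can be checked numerically. The sketch below uses Python (a choice made here, not something the slides use) to compute the exact binomial probability of getting at least k heads in 100 flips of a fair coin. The helper name prob_at_least is made up for illustration.

```python
from math import comb

def prob_at_least(k, n=100, p=0.5):
    """Exact binomial probability of k or more successes in n trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

p52 = prob_at_least(52)  # 52 or more heads from a fair coin
p75 = prob_at_least(75)  # 75 or more heads from a fair coin

print(f"P(at least 52 heads) = {p52:.4f}")
print(f"P(at least 75 heads) = {p75:.2e}")
```

The first probability is a bit under 0.4, so 52 heads is unremarkable. The second is less than one in a million, which is why 75 heads counts as evidence against a fair coin without proving anything.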
Another Example
A light bulb is advertised as having a mean life of 1000 hours.
From a sample, we find the mean life of our sample to be
900 hours. The 95% confidence interval for the population
mean is 850 < μ < 1050 hours.
We CANNOT conclude:
That the actual mean life of light bulbs is 900 hours
That the advertised life is wrong
That the advertised life is correct
We CAN conclude:
From our sample, we are 95% confident that the population
mean is between 850 hours and 1050 hours. Since 1000
hours is included in that interval, we do not have sufficient
evidence to say that the advertised life is wrong.
Another approach
Claim: The mean life of light bulbs is less than 1000
Working Assumption: The mean life of light bulbs is 1000
The sample resulted in a mean life of 900
Assuming that μ = 1000, the probability that the mean of our
sample would be less than 900 is P(x̄ < 900) = 0.0951
There are two possible explanations for why our sample came
out with a mean life of 900 hours. Either this occurred by
chance (with probability 9.5%), or the actual mean life of
light bulbs is less than 1000. Since the probability (9.5%)
isn’t horribly small, we decide that random chance is a
reasonable explanation. There isn’t sufficient evidence to
support the claim that the mean life of light bulbs is less
than 1000 hours.
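A probability like this one comes from the sampling distribution of the sample mean. The slides do not give the sample size or standard deviation, so the sketch below assumes a normal sampling distribution with a standard error of 76.3 hours. That value is a hypothetical one, picked only because it lands close to the 0.0951 quoted above.

```python
from statistics import NormalDist

mu_0 = 1000          # working assumption from the slide
sample_mean = 900    # observed sample mean from the slide
std_error = 76.3     # ASSUMPTION: not in the source, chosen for illustration

# Sampling distribution of the sample mean if the working assumption holds
sampling_dist = NormalDist(mu=mu_0, sigma=std_error)

# Probability of a sample mean below 900 when mu really is 1000
p_low = sampling_dist.cdf(sample_mean)
print(f"P(sample mean < 900 | mu = 1000) = {p_low:.4f}")
```

With a different standard error the probability changes, and so might the conclusion.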
Formal Hypothesis Testing
The brief process
Convert your claim into a symbolic null and
alternative hypothesis
Calculate a test statistic
Compare the test statistic to critical values OR
Find a probability
Write a conclusion
Components of a Formal
Hypothesis Test
The Null hypothesis (denoted H0) is a statement that
the value of a population parameter (such as
proportion or mean) is equal to some claimed
value.
The alternative hypothesis (denoted H1 or Ha) is a
statement that the value of a population parameter
somehow differs from the null hypothesis. The
symbolic form must be a >, < or ≠ statement.
We will be testing the null hypothesis directly (by
assuming it’s true) to reach a conclusion to either
reject H0 or fail to reject H0.
Note: We cannot support a claim that a parameter is
equal to a value. So, the null hypothesis must
always include equality, and the alternative
hypothesis must be inequality.
Process
1. Identify the claim to be tested and express
it in symbolic form.
2. Give the symbolic form that must be true
when the original claim is false
3. Pick the one not including equality to be
H1, and let the null hypothesis be that the
parameter equals the value being
considered.
Example
Claim: The mean IQ of statistics students is greater
than 110.
Symbolic form: μ > 110
Opposite: μ ≤ 110
H0: μ = 110
H1: μ > 110
Note: While often your claim will be the alternative
hypothesis, it won’t always be.
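The three-step process can be sketched as a small function. The name form_hypotheses and the lookup tables are invented for this illustration; the logic is just the rule that H1 is whichever of the claim and its opposite lacks equality.

```python
# Each comparison paired with its logical opposite
OPPOSITE = {">": "≤", "<": "≥", "≠": "=", "≥": "<", "≤": ">", "=": "≠"}
HAS_EQUALITY = {"=", "≤", "≥"}

def form_hypotheses(symbol, claim_op, value):
    """Return (H0, H1, which hypothesis the original claim became)."""
    opposite_op = OPPOSITE[claim_op]
    # Step 3: H1 is the statement that does not include equality
    h1_op = opposite_op if claim_op in HAS_EQUALITY else claim_op
    h0 = f"H0: {symbol} = {value}"
    h1 = f"H1: {symbol} {h1_op} {value}"
    claim_is = "H0" if claim_op in HAS_EQUALITY else "H1"
    return h0, h1, claim_is

# The IQ example: the claim is mu > 110, so the claim becomes H1
print(form_hypotheses("μ", ">", 110))

# A claim that includes equality becomes H0 instead
print(form_hypotheses("p", "≤", 0.5))
```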
Test Statistics
A test statistic is a value computed from the sample
data, used in making the decision whether or not
to reject the null hypothesis.
Z value for proportion: $z = \dfrac{\hat{p} - p}{\sqrt{pq/n}}$
Z value for mean (sigma known): $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
T value for mean (sigma unknown): $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$
The test statistic indicates how far our sample
deviates from the assumed population parameter.
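As a minimal sketch, the three test statistics named above can be written as Python functions. The function and argument names are chosen here for illustration, and q is taken to be 1 − p. The coin examples from earlier serve as inputs.

```python
from math import sqrt

def z_proportion(p_hat, p, n):
    """z statistic for a sample proportion p_hat against a claimed p."""
    q = 1 - p
    return (p_hat - p) / sqrt(p * q / n)

def z_mean(x_bar, mu, sigma, n):
    """z statistic for a sample mean when the population sigma is known."""
    return (x_bar - mu) / (sigma / sqrt(n))

def t_mean(x_bar, mu, s, n):
    """t statistic for a sample mean when only the sample s is available."""
    return (x_bar - mu) / (s / sqrt(n))

# Coin examples: 52 and 75 heads in 100 flips, assuming p = 0.5
print(z_proportion(0.52, 0.5, 100))
print(z_proportion(0.75, 0.5, 100))
```

For 52 heads the statistic is about 0.4 standard errors above the assumed proportion; for 75 heads it is about 5.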
Critical region and significance
Critical region (or rejection region) is the set of all
values of the test statistic that cause us to reject the
null hypothesis.
Significance level (α) is the probability that the test
statistic will fall in the critical region when the
null hypothesis is actually true. Common values
are 0.01, 0.05 and 0.10
A Critical value is any value that separates the
critical region from values of the test statistic that
would not cause us to reject the null hypothesis
Example
Using a significance level of α = 0.05, let's find the critical
value for each of these alternative hypotheses:
p ≠ 0.5: The critical region is in both tails of the normal
distribution. Using the same method we used in chapter 6,
we find the critical values to be z = -1.96 and z = 1.96
p < 0.5: The critical region is in the left tail of the normal
distribution. Using the methods from 5.2, we find c so
P(z < c) = 0.05. The critical value is -1.645
p > 0.5: The critical region is in the right tail of the normal
distribution. Using the methods from 5.2, we find c so
P(z < c) = 0.95. The critical value is 1.645
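These critical values can also be looked up in software instead of a table. The sketch below uses NormalDist from Python's standard statistics module (available from Python 3.8); the variable names are chosen for illustration.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # mean 0, standard deviation 1

# Two-tailed: alpha is split evenly between the two tails
two_tailed = (std_normal.inv_cdf(alpha / 2), std_normal.inv_cdf(1 - alpha / 2))

# Left-tailed: find c with P(z < c) = alpha
left_tailed = std_normal.inv_cdf(alpha)

# Right-tailed: find c with P(z < c) = 1 - alpha
right_tailed = std_normal.inv_cdf(1 - alpha)

print(f"two-tailed:   {two_tailed[0]:.3f} and {two_tailed[1]:.3f}")
print(f"left-tailed:  {left_tailed:.3f}")
print(f"right-tailed: {right_tailed:.3f}")
```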
P-Value
The P-value is the probability, assuming the null
hypothesis is true, of getting a value of the test
statistic that is at least as extreme as the one
obtained for the sample data. If the P-value is
very small (such as less than 0.05), we will reject
the null hypothesis.
See pullout for help on how to calculate P-value.
The exact process depends on your alternative
hypothesis.
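As a rough illustration of how the alternative hypothesis changes the calculation, here is a sketch for a z test statistic only (a t statistic would need a t distribution, which this sketch does not cover). The function name p_value and its string argument are assumptions made for this example, and the z value of 0.4 is the statistic for 52 heads in 100 flips under p = 0.5.

```python
from statistics import NormalDist

def p_value(z, alternative):
    """P-value for a z statistic; alternative is '<', '>' or '≠' as in H1."""
    cdf = NormalDist().cdf(z)
    if alternative == "<":
        return cdf                      # area to the left of z
    if alternative == ">":
        return 1 - cdf                  # area to the right of z
    if alternative == "≠":
        return 2 * min(cdf, 1 - cdf)    # twice the smaller tail area
    raise ValueError("alternative must be '<', '>' or '≠'")

# Two-tailed test of a fair coin after 52 heads in 100 flips (z = 0.4)
print(f"{p_value(0.4, '≠'):.4f}")
```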
Decisions and Conclusions
Our final conclusion will always be one of these:
1. Reject the null hypothesis
2. Fail to reject the null hypothesis
Traditional Method
Reject H0 if the test statistic falls within the critical
region
Otherwise fail to reject the null hypothesis
P-value method
Reject H0 if P-value ≤ α
Fail to reject H0 if P-value > α
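The traditional method and the P-value method lead to the same decision. The sketch below shows this for a right-tailed z test; the function name decide_right_tailed is made up here, and the z values 5.0 and 0.4 are the statistics for 75 and 52 heads in 100 flips under p = 0.5.

```python
from statistics import NormalDist

def decide_right_tailed(z, alpha=0.05):
    """Return the decision from the traditional and P-value methods."""
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha)  # boundary of the critical region
    p = 1 - std_normal.cdf(z)                 # right-tail P-value

    traditional = "reject H0" if z >= critical else "fail to reject H0"
    p_method = "reject H0" if p <= alpha else "fail to reject H0"
    return traditional, p_method

print(decide_right_tailed(5.0))  # 75 heads in 100 flips
print(decide_right_tailed(0.4))  # 52 heads in 100 flips
```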
Less common methods
Find P-value, and leave conclusion to the reader
Look at whether the claimed value of the population
parameter falls in the confidence interval estimate
Final Wording
If your original claim contains equality (became H0)
Reject H0: “There is sufficient evidence to warrant
rejection of the claim that…”
Fail to Reject H0: “There is not sufficient evidence to
warrant rejection of the claim that…”
If your original claim does not contain equality (was H1)
Reject H0: “The sample data support the claim that…”
Fail to Reject H0: “There is not sufficient sample
evidence to support the claim that…”
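The four wordings can be selected mechanically once you know whether the claim became H0 and whether H0 was rejected. A small sketch, with the function name final_wording invented for illustration:

```python
def final_wording(claim, claim_is_h0, reject_h0):
    """Pick the concluding sentence for a hypothesis test."""
    if claim_is_h0:
        if reject_h0:
            return ("There is sufficient evidence to warrant "
                    f"rejection of the claim that {claim}.")
        return ("There is not sufficient evidence to warrant "
                f"rejection of the claim that {claim}.")
    if reject_h0:
        return f"The sample data support the claim that {claim}."
    return ("There is not sufficient sample evidence to support "
            f"the claim that {claim}.")

# Light bulb example: the claim was H1 and we failed to reject H0
print(final_wording("the mean life of light bulbs is less than 1000 hours",
                    claim_is_h0=False, reject_h0=False))
```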
Homework
7-2: 1-35 every other odd
Every odd recommended.