MATH/STAT 352: LECTURE 17
Sections 6.1, 6.2, 6.12: Tests of Hypotheses, Large
Sample Test for the Population Mean, Fixed Level of
Significance Tests.
TESTING HYPOTHESES FOR A SINGLE SAMPLE
PROBLEM: Make a statement about a parameter.
GOAL: DECIDE whether the sample data support or contradict
this statement.
PROCESS: Test hypothesis (statement)
Examples:
1. Is a coin fair? Is the probability of H equal to 0.5?
Statement or hypothesis: P(H)=0.5
2. Average height in a population is 64 in.
Hypothesis: average height μ = 64.
TESTING HYPOTHESES - INTRODUCTION
Definition. A hypothesis is a statement about an unknown
parameter.
Based on sample data we TEST if the HYPOTHESIS is true or
false.
DECISION: Accept or reject hypothesis.
- If the hypothesis is consistent with the data, we accept it (no
reason to reject it).
- Otherwise, we reject the hypothesis in favor of an alternative
relative to which we judge our hypothesis.
NULL AND ALTERNATIVE HYPOTHESES

Null hypothesis Ho – statement denoting “no effect”, “no
change”.

Alternative hypothesis Ha - reflects “expected change”,
“research hypothesis”.
IDEA: We hold on to Ho as true and reject it ONLY IF there is
sufficient evidence against it.
EXAMPLE: A coin is tossed 100 times and gives 62 H. Is it a fair
coin?
Assume the coin is fair until proven otherwise. Ho: P(H)=0.5
Alternative? Coin is not fair. Ha: P(H) ≠ 0.5
Two sided alternative
(possible values of P(H) on both sides of 0.5)
NULL AND ALTERNATIVE HYPOTHESES: EXAMPLES contd.
One sided alternatives.
1. Coin tossing example contd. New question.
Does the coin favor H?
Ho: P(H) ≤ 0.5
Ha: P(H) > 0.5
2. Suppose the coin came up H 25 times.
Does the coin favor T?
Ho: P(H) ≥ 0.5
Ha: P(H) < 0.5
Example 2. A new drug comes to market for high blood pressure.
Decide if it is significantly better than the old one.
Ho: Old drug same as new
(both drugs equally effective)
Ha: New drug better than old
(new drug more effective than old one)
Example: Identify the Null and Alternative Hypothesis.
a) The proportion of drivers who admit to running red lights is
greater than 0.5.
Ho: p ≤ 0.5
H1: p > 0.5
(The claim p > 0.5 does not contain equality, so it is the alternative.)
b) The mean height of professional basketball players is at most
7 ft.
Ho: μ ≤ 7 ft
H1: μ > 7 ft
(The claim μ ≤ 7 ft contains equality, so it is the null hypothesis.)
c) The standard deviation of IQ scores of actors is equal to 15.
Ho: σ = 15
H1: σ ≠ 15
Example: ProCare Industries claimed that couples using their product
Gender Choice would have girls at a rate that is greater than 50% or
0.5. In an experiment whereby 100 couples used Gender Choice in
an attempt to have a baby girl, there were exactly 52 girls born.
NOTE: Under normal circumstances the proportion of girls p is 0.5,
so a claim that Gender Choice is effective can be expressed as
p > 0.5.
Ho: p ≤ 0.5
H1: p > 0.5
Is the observed sample unusual? Could it happen by chance?
Assume p=0.5. Using a normal distribution as an approximation to
the binomial distribution, we find P(52 or more girls in 100 births) =
0.3821.
Conclusion. We do not reject random chance as a reasonable
explanation. We conclude that the proportion of girls born to
couples using Gender Choice is not significantly greater than the
proportion that we would expect by random chance.
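The probability 0.3821 quoted in the Gender Choice example can be checked numerically. The sketch below assumes the normal approximation is applied with a continuity correction (the slides do not say so explicitly, but that choice reproduces the quoted value); the variable names are illustrative.

```python
from statistics import NormalDist

n, p0, girls = 100, 0.5, 52          # values from the Gender Choice example

mean = n * p0                        # binomial mean under Ho: 50
sd = (n * p0 * (1 - p0)) ** 0.5      # binomial standard deviation under Ho: 5

# Assumed continuity correction: "52 or more" on the discrete scale
# is treated as "51.5 or more" on the continuous normal scale.
z = (girls - 0.5 - mean) / sd
prob = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}, P(52 or more girls) = {prob:.4f}")
# prints: z = 0.30, P(52 or more girls) = 0.3821
```

A probability of about 0.38 is not small, which is why random chance is not rejected as an explanation.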
EXAMPLE
Average score on midterms in a calculus class in the past years was 70. This
year a sample of 100 students averaged 73. Are the students smarter this
year? Assume scores follow a normal distribution with σ = 10.
Solution. Let μ = true mean midterm score this year (unknown),
X = score, sample mean x̄ = 73, σ = 10, n = 100, X ~ N(μ, 10).
Ho: μ ≤ 70 (no change from last yr)
Ha: μ > 70 (smarter students this yr)
If Ho is true, how likely is it to observe a sample mean X̄ of 73 or higher?
By CLT (or Fact 2), X̄ is normal with mean μ_X̄ = 70 and standard deviation
σ_X̄ = σ/√n = 10/√100 = 1, so
P(X̄ ≥ 73) = P(Z ≥ (73 − 70)/1) = P(Z ≥ 3) = 0.0013,
which is very small!
So, the data suggest that Ho is not true. DECISION: Reject Ho.
CONCLUSION: Students are smarter this year.
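The tail probability in the calculus example can be reproduced with a few lines of Python. This is a minimal sketch using only the numbers given in the example; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, xbar, sigma, n = 70, 73, 10, 100   # values from the calculus example

se = sigma / sqrt(n)                    # standard deviation of the sample mean
z = (xbar - mu0) / se                   # standardized sample mean
tail = 1 - NormalDist().cdf(z)          # P(Z >= z) when mu = 70

print(f"se = {se:.1f}, z = {z:.1f}, P(sample mean >= 73) = {tail:.4f}")
# prints: se = 1.0, z = 3.0, P(sample mean >= 73) = 0.0013
```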
NOTES

The data is the truth.

If the chances of observing what actually happened are small if
Ho is true, then the data tells us that Ho is not true.

If the data we observed are likely to come up when Ho is true,
the data supports Ho.

When in doubt about one sided alternative, use two sided
alternative.
Example
A market survey company MS claims that more than 25% of
Internet users pay their bills online. A recent survey of 50
Internet users in Nevada showed that only 5% pay their bills
online. Assuming that MS claim holds in NV, the chances of
such survey results are very small, below 0.0005.
Do the survey results provide evidence in support of the MS claim
in NV?
Answer: If the claim holds, the chances of observing what
was observed are very small, so the claim is probably not
true. These data do not support MS's claim.
TESTING HYPOTHESES: P-VALUE APPROACH
QUESTION: How to make the decision TO REJECT OR NOT Ho?
A p-value is the probability of observing a value of the test statistic at
least as contradictory to Ho (favoring Ha) as the observed value,
when Ho is assumed to be true.
In the calc test example, for the sample of 100 observations with
sample mean 73: p-value = P(X̄ ≥ 73 given that μ = 70) = 0.0013.
If the p-value is small, then the probability of observing what we observed
if Ho is true is small, so we have evidence against Ho: reject Ho.
Typically, we reject Ho for p-values below 0.01 or 0.05.
P-value is also called observed significance level.
TESTING HYPOTHESES: ERRORS
CORRECT DECISIONS AND ERRORS
                        DECISION
TRUTH       Reject Ho           Do not reject Ho
Ho true     Type I error        Correct decision
Ho false    Correct decision    Type II error
TESTING HYPOTHESES: FIXED LEVEL OF SIGNIFICANCE
APPROACH
2 TYPES OF ERROR:
Type I Error: Reject Ho when it is true.
Type II Error: Do not reject Ho when it is false.
LEVEL OF SIGNIFICANCE α OF A TEST = probability of Type I
error we are willing to tolerate.
Our procedures are constructed in such a way that they have
minimal chance of Type II error for a given significance level α.
Usually the significance level is decided before any data are
collected. The choice of significance level is up to the researcher.
Controlling Type I and Type II Errors; Power of a Test




Let β denote the probability of a Type II error.
- For any fixed α, an increase in the sample size n will cause a
decrease in β.
- For any fixed sample size n, a decrease in α will cause an
increase in β. Conversely, an increase in α will cause a
decrease in β.
- To decrease both α and β, increase the sample size.
Power of a test. The power of a hypothesis test is the
probability (1 - β ) of rejecting a false null hypothesis.
It is computed by using a particular significance level α and a
particular value of the population parameter that is an
alternative to the value assumed true in the null hypothesis.
That is, the power of the hypothesis test is the probability of
supporting an alternative hypothesis that is true.
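The relationships among α, β, n and power can be illustrated numerically. The sketch below computes the power of an upper-tailed one-sample z-test with σ known. The function name `power_upper_tail` is hypothetical, μo = 70 and σ = 10 are borrowed from the calculus example, and the alternative value 72 and the α and n grids are assumptions chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def power_upper_tail(mu0, mu_alt, sigma, n, alpha):
    """Power of the z-test of Ho: mu <= mu0 vs Ha: mu > mu0
    when the true mean is mu_alt and sigma is known."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)        # critical value
    shift = (mu_alt - mu0) / (sigma / sqrt(n))     # true mean in standard-error units
    # Ho is rejected when Z > z_alpha; under mu_alt the statistic is N(shift, 1).
    return 1 - std_normal.cdf(z_alpha - shift)

# mu0 = 70 and sigma = 10 come from the calculus example;
# mu_alt = 72 and the alpha and n values are hypothetical.
for alpha in (0.05, 0.01):
    for n in (25, 100, 400):
        pw = power_upper_tail(70, 72, 10, n, alpha)
        print(f"alpha = {alpha}, n = {n:3d}: power = {pw:.3f}, beta = {1 - pw:.3f}")
```

In the printed table, β shrinks as n grows for each α, and β is larger for the smaller α at each n, in line with the statements above.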
6.2: Large sample test for the population mean.
ONE SAMPLE Z-TEST
Setup: Population normal with σ known, OR σ not known but LARGE SAMPLE and
population close to normal (then the sample standard deviation s is used in
place of σ).
STEP 1. State the hypotheses:
Ho: μ = μo versus Ha: μ ≠ μo, or
Ho: μ ≤ μo (≥) versus Ha: μ > μo (<).
STEP 2. Compute the test statistic:
z = (x̄ − μo) / (σ/√n).
STEP 3. Compute the critical number/value for the test and find the
critical/rejection region. The critical value depends on Ha.
Two sided alternative: critical value = zα/2.
One sided alternatives: critical value = zα if Ha: μ > μo,
or = −zα if Ha: μ < μo.
FIXED LEVEL OF SIGNIFICANCE PROCEDURE contd.
STEP 4. DECISION: the critical/rejection region depends on Ha.
Ha: μ ≠ μo: Reject Ho if |z| > zα/2;
Ha: μ > μo: Reject Ho if z > zα;
Ha: μ < μo: Reject Ho if z < −zα.
STEP 5. Answer the question in the problem.
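The fixed-level procedure can be sketched as a small Python function. The function name `z_test_fixed_level` and the labels "two-sided", "greater" and "less" are hypothetical choices for this illustration; the numbers in the usage lines are those of the heights example in this lecture.

```python
from math import sqrt
from statistics import NormalDist

def z_test_fixed_level(xbar, mu0, sigma, n, alpha, alternative):
    """One-sample z-test by the critical-value method.
    alternative: "two-sided", "greater" (Ha: mu > mu0) or "less" (Ha: mu < mu0).
    Returns (z, critical value, True if Ho is rejected)."""
    z = (xbar - mu0) / (sigma / sqrt(n))              # STEP 2: test statistic
    std_normal = NormalDist()
    if alternative == "two-sided":                    # STEPS 3 and 4
        crit = std_normal.inv_cdf(1 - alpha / 2)      # z_{alpha/2}
        reject = abs(z) > crit
    elif alternative == "greater":
        crit = std_normal.inv_cdf(1 - alpha)          # z_alpha
        reject = z > crit
    elif alternative == "less":
        crit = -std_normal.inv_cdf(1 - alpha)         # -z_alpha
        reject = z < crit
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z, crit, reject

# Heights example: xbar = 62, mu0 = 66, sigma = 10, n = 36, alpha = 0.01.
z, crit, reject = z_test_fixed_level(62, 66, 10, 36, 0.01, "less")
print(f"z = {z:.2f}, critical value = {crit:.3f}, reject Ho: {reject}")
# prints: z = -2.40, critical value = -2.326, reject Ho: True
```

The critical value −2.326 is the table value −2.33 carried to one more decimal place.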
P-value approach
STEPS 1 and 2 are the same as in the fixed level of significance procedure.
STEP 3. Compute the p-value, where z is the value of the test statistic:
Two sided test: Ha: μ ≠ μo, p-value = 2P(Z > |z|).
One sided tests: Ha: μ > μo, p-value = P(Z > z);
Ha: μ < μo, p-value = P(Z < z).
STEP 4. DECISION
Reject Ho if p-value < significance level α.
STEP 5. Answer the question in the problem.
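The p-value approach can be sketched the same way. The function name `z_test_p_value` and the alternative labels are hypothetical; the usage lines take their numbers from the SAT example in this lecture.

```python
from math import sqrt
from statistics import NormalDist

def z_test_p_value(xbar, mu0, sigma, n, alternative):
    """Return (z, p-value) for a one-sample z-test with sigma known.
    alternative: "two-sided", "greater" (Ha: mu > mu0) or "less" (Ha: mu < mu0)."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    cdf = NormalDist().cdf
    if alternative == "two-sided":
        p = 2 * (1 - cdf(abs(z)))      # 2 P(Z > |z|)
    elif alternative == "greater":
        p = 1 - cdf(z)                 # P(Z > z)
    elif alternative == "less":
        p = cdf(z)                     # P(Z < z)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z, p

# SAT example: xbar = 500, mu0 = 475, sigma = 100, n = 100, alpha = 0.05.
alpha = 0.05
z, p = z_test_p_value(500, 475, 100, 100, "two-sided")
print(f"z = {z:.2f}, p-value = {p:.4f}, reject Ho: {p < alpha}")
# prints: z = 2.50, p-value = 0.0124, reject Ho: True
```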
EXAMPLE
1. Suppose the verbal SAT score for 100 students gives an average of
500, and it is known that σ = 100. Test the hypothesis that the true mean SAT
score for this population is 475 versus a two sided alternative. Use
significance level α = 5%.
Solution. n = 100, x̄ = 500, σ = 100, μo = 475, α = 5%.
STEP 1. Ho: μ = 475, Ha: μ ≠ 475.
STEP 2. Test statistic: z = (500 − 475) / (100/√100) = 2.5.
STEP 3. Critical value zα/2 = z0.025 = 1.96.
STEP 4. |z| = 2.5 > 1.96, so reject Ho.
STEP 5. There is enough evidence to support the claim that the true mean SAT
score for this pop. differs significantly from 475.
P-value: P(Z > 2.5) = 0.0062, so p-value = 2(0.0062) = 0.0124.
EXAMPLE
2. Suppose that the mean height of men is 66”. A sample of 36 women yielded a
mean height of 62”. Are women, on average, shorter than men? Use σ = 10
and 1% significance level. Compute the p-value for your test.
Solution. n = 36, x̄ = 62, σ = 10, μo = 66, α = 1%.
STEP 1. Ho: μ ≥ 66, Ha: μ < 66, where μ is the mean height of women.
STEP 2. Test statistic: z = (62 − 66) / (10/√36) = −2.4.
STEP 3. Critical value −zα = −z0.01 = −2.33.
STEP 4. z = −2.4 < −2.33, so reject Ho.
STEP 5. There is enough evidence to support the claim that on average, women
are shorter than men.
P-value = P(X̄ ≤ 62) = P(Z ≤ (62 − 66)/(10/√36)) = P(Z ≤ −2.4) = 0.0082.
Testing hypotheses in MINITAB
Example: SAT scores.
Use Stat, Basic Statistics, 1 sample Z, use Summarized data, check
“perform hypothesis test”, and set hypothesized mean, in Options set the
confidence level (1- significance level) and set alternative hypothesis.
Results:
One-Sample Z
Test of mu = 475 vs not = 475
The assumed standard deviation = 100
  N   Mean  SE Mean          95% CI     Z      P
100  500.0     10.0  (480.4, 519.6)  2.50  0.012
Since p-value=0.012 < 0.05=significance level, we reject Ho, and conclude that the
mean SAT score for this population differs significantly from 475.
Testing hypotheses in MINITAB
Example: Heights.
Results:
One-Sample Z
Test of mu = 66 vs < 66
The assumed standard deviation = 10
                     99% Upper
 N   Mean  SE Mean       Bound      Z      P
36  62.00     1.67       65.88  -2.40  0.008
Since the p-value=0.008 < 0.01=significance level, we reject Ho, and
conclude that women are on average shorter than men.
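The SE Mean, confidence interval and upper bound columns in the two MINITAB outputs can be reproduced by hand. The sketch below assumes the standard z-interval formulas (x̄ ± zα/2·σ/√n for the two sided interval and x̄ + zα·σ/√n for the one sided upper bound); the slides do not show how MINITAB computes these columns, so this is a consistency check and not a description of the software.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

# SAT example: two sided test at the 5% level, so a 95% confidence interval.
n, xbar, sigma = 100, 500, 100
se_sat = sigma / sqrt(n)                              # SE Mean
half_width = std_normal.inv_cdf(0.975) * se_sat       # z_{0.025} * SE Mean
ci = (xbar - half_width, xbar + half_width)
print(f"SAT:     SE Mean = {se_sat:.1f}, 95% CI = ({ci[0]:.1f}, {ci[1]:.1f})")

# Heights example: lower-tailed test at the 1% level, so a 99% upper bound.
n, xbar, sigma = 36, 62, 10
se_h = sigma / sqrt(n)                                # SE Mean
upper = xbar + std_normal.inv_cdf(0.99) * se_h        # xbar + z_{0.01} * SE Mean
print(f"Heights: SE Mean = {se_h:.2f}, 99% upper bound = {upper:.2f}")
```

In both examples the hypothesized mean (475 and 66) falls outside the interval or bound, which agrees with the decision to reject Ho.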