• In statistical testing, we use deductive reasoning to specify what should happen if the conjecture or null hypothesis is true.
• A study is designed to collect data to challenge this hypothesis.
• We determine if what we expected to happen is supported by the data.
• If it is not supported, we reject the conjecture.
• If it cannot be rejected, we (tentatively) fail to reject it.
• We have not proven the conjecture; we have simply not been able to disprove it with these data.
Statistical tests are based on the concept of Proof by Contradiction:
"If P then Q" is logically equivalent to "If NOT Q then NOT P."
Analogy with the justice system

Court of Law
1. Start with the premise that the person is innocent. (The opposite is guilty.)
2. If enough evidence is found to show beyond reasonable doubt that the person committed the crime, reject the premise. (Enough evidence that the person is guilty.)
3. If not enough evidence is found, we don't reject the premise. (Not enough evidence to conclude the person is guilty.)

Statistical Hypothesis Test
1. Start with the null hypothesis, the status quo. (The opposite is the alternative hypothesis.)
2. If a significant amount of evidence is found to refute the null hypothesis, reject it. (Enough evidence to conclude the alternative is true.)
3. If not enough evidence is found, we don't reject the null. (Not enough evidence to disprove the null.)
CLAIM: A new variety of turf grass has been developed that is claimed to resist drought better than currently used varieties.
CONJECTURE: The new variety resists drought no better than currently used varieties.
DEDUCTION: If the new variety is no better than other varieties (P), then areas planted with the new variety should display the same number of surviving individuals (Q) after a fixed period without water as areas planted with other varieties.
CONCLUSION: If more surviving individuals are observed for the new varieties than the other varieties (NOT Q), then we conclude the new variety is indeed not the same as the other varieties, but in fact is better (NOT P).
The five parts of a statistical test:

1. Null Hypothesis ($H_0$)
2. Alternative Hypothesis ($H_A$)
3. Test Statistic (T.S.): computed from the sample data; its sampling distribution is known if the null hypothesis is true.
4. Rejection Region (R.R.): reject $H_0$ if the test statistic computed with the sample data is unlikely to have come from the sampling distribution under the assumption that the null hypothesis is true.
5. Conclusion: Reject $H_0$ or Do Not Reject $H_0$.
Example:
$$H_0: \mu \le 530 \qquad H_A: \mu > 530$$
(Other possible alternatives: $H_A: \mu < 530$ or $H_A: \mu \ne 530$.)

T.S.:
$$z^* = \frac{\bar{y} - \mu_0}{\sigma/\sqrt{n}} \sim N(0, 1)$$

R.R.: $z^* > z_\alpha$, or $z^* < -z_\alpha$, or $|z^*| > z_{\alpha/2}$, depending on the alternative.
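A minimal sketch of this z-test in Python. The slide gives only the hypotheses, so the sample summaries (ybar, sigma, n) below are hypothetical values assumed purely for illustration:

```python
from scipy.stats import norm
import math

# H0: mu <= 530 tested against HA: mu > 530 (upper-tail test)
mu0, alpha = 530, 0.05

# Hypothetical sample summaries, assumed for illustration only
ybar, sigma, n = 545, 60, 40

z_star = (ybar - mu0) / (sigma / math.sqrt(n))  # standardized test statistic
z_crit = norm.ppf(1 - alpha)                    # z_alpha, about 1.645

print(f"z* = {z_star:.3f}, z_alpha = {z_crit:.3f}")
print("Reject H0" if z_star > z_crit else "Do not reject H0")
```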
Research Hypotheses: the thing we are primarily interested in "proving".

In general, the three possible alternatives are
$$H_{A1}: \mu > \mu_0 \qquad H_{A2}: \mu < \mu_0 \qquad H_{A3}: \mu \ne \mu_0$$

Example (average height of the class):
$$H_{A1}: \mu > 1.6\text{ m} \qquad H_{A2}: \mu < 1.6\text{ m} \qquad H_{A3}: \mu \ne 1.6\text{ m}$$

Null Hypothesis: things are what they say they are, the status quo.
$$H_0: \mu = \mu_0 \qquad \text{(here } H_0: \mu = 1.6\text{ m)}$$
(It's common practice to always write $H_0$ in this way, even though what is meant is the opposite of $H_A$ in each case.)
Hypotheses of interest:
$$H_{A1}: \mu > \mu_0 \qquad H_{A2}: \mu < \mu_0 \qquad H_{A3}: \mu \ne \mu_0 \ (\mu < \mu_0 \text{ or } \mu > \mu_0)$$

Test Statistic: the sample mean
$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$$

Under $H_0$: the sample mean has a sampling distribution that is normal with mean $\mu_0$,
$$\bar{y} \sim N\!\left(\mu_0, \ \sigma/\sqrt{n}\right).$$

Under $H_A$: the sample mean has a sampling distribution that is also normal, but with a mean $\mu_1$ that is different from $\mu_0$,
$$\bar{y} \sim N\!\left(\mu_1, \ \sigma/\sqrt{n}\right).$$
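A small simulation sketch of these two sampling distributions; the values of mu0, mu1, sigma, and n are assumed here purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
mu0, mu1, sigma, n, reps = 0.0, 0.5, 1.0, 25, 10_000  # assumed illustrative values

# Sample means from repeated samples drawn under H0 and under HA
means_H0 = rng.normal(mu0, sigma, size=(reps, n)).mean(axis=1)
means_HA = rng.normal(mu1, sigma, size=(reps, n)).mean(axis=1)

print("Under H0: mean of ybar =", round(means_H0.mean(), 3), " SE =", round(means_H0.std(), 3))
print("Under HA: mean of ybar =", round(means_HA.mean(), 3), " SE =", round(means_HA.std(), 3))
print("Theoretical standard error =", sigma / np.sqrt(n))  # 0.2
```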
$H_0$: true vs. $H_{A1}$: true

[Figure: the sampling distribution of $\bar{y}$ when $H_0$ is true (centered at $\mu_0$) and when $H_{A1}$ is true (centered at $\mu_1$), with three possible locations of the observed sample mean marked (1), (2), and (3).]

What would you conclude if the sample mean fell in location (1)? How about location (2)? Location (3)? Which of (1), (2), or (3) is most likely when $H_0$ is true?
$H_0$: assumed true. Testing $H_{A1}: \mu > \mu_0$.

Under $H_0$, $\bar{y} \sim N\!\left(\mu_0, \ \sigma/\sqrt{n}\right)$.

Rejection Region: reject $H_0$ if the sample mean is in the upper tail of the sampling distribution, that is,
$$\text{if } \bar{y} \ge C_1, \ \text{reject } H_0,$$
where $C_1$ is the critical value cutting off the upper tail above $\mu_0$.
Reject $H_0$ if the sample mean is larger than "expected". If $H_0$ were true, we would expect 95% of sample means to be less than the upper limit of a 90% CI for $\mu$, that is, less than
$$C_1 = \mu_0 + 1.645\,\frac{\sigma}{\sqrt{n}},$$
where 1.645 comes from the standard normal table.

In this case, if we use this critical value, then in 5 out of 100 repetitions of the study we would reject $H_0$ incorrectly; that is, we would make an error:
$$P\!\left(\frac{\bar{y} - \mu_0}{\sigma/\sqrt{n}} \ge 1.645\right) = 0.05.$$

But suppose $H_{A1}$ is the true situation; then most sample means will be greater than $C_1$, and we will be making the correct decision to reject more often.
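A quick numerical check of this critical value, using the values $\mu_0 = 38$, $\sigma = 5.6$, and $n = 50$ from the worked example later in these notes:

```python
from scipy.stats import norm
import math

mu0, sigma, n, alpha = 38, 5.6, 50, 0.05   # values taken from the later worked example
se = sigma / math.sqrt(n)

C1 = mu0 + norm.ppf(1 - alpha) * se        # C1 = mu0 + 1.645 * sigma / sqrt(n)
tail = 1 - norm.cdf(C1, loc=mu0, scale=se) # P(ybar >= C1 | H0), should be 0.05

print(f"C1 = {C1:.3f}, tail probability under H0 = {tail:.3f}")
```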
Rejection Regions for Different Alternative Hypotheses

Under $H_0: \mu = \mu_0$, the sampling distribution of $\bar{y}$ is $N(\mu_0, \ \sigma/\sqrt{n})$.

• Testing $H_{A1}: \mu > \mu_0$: the rejection region is the upper 5% tail, $\bar{y} > C_1$.
• Testing $H_{A2}: \mu < \mu_0$: the rejection region is the lower 5% tail, $\bar{y} < C_2$.
• Testing $H_{A3}: \mu \ne \mu_0$: the rejection region is split between the two tails, 2.5% in each, $\bar{y} < C_{3L}$ or $\bar{y} > C_{3U}$.

[Figure: three panels showing these rejection regions on the $N(\mu_0, \sigma/\sqrt{n})$ curve, with critical values $C_1$, $C_2$, $C_{3L}$, and $C_{3U}$ marked.]
$H_0$: true.

[Figure: the sampling distribution under $H_0$ (centered at $\mu_0$) with the rejection region above $C_1$; possible sample-mean locations (3) and (1) fall below $C_1$, and location (2) falls above it.]

If the sample mean is at location (1) or (3), we make the correct decision. If the sample mean is at location (2) and $H_0$ is actually true, we make the wrong decision.

This is called making a TYPE I error, and the probability of making this error is usually denoted by the Greek letter $\alpha$:
$$\alpha = P(\text{Reject } H_0 \text{ when } H_0 \text{ is the true condition}).$$

If $C_1 = \mu_0 + 1.645\,\dfrac{\sigma}{\sqrt{n}}$, then $\alpha = 1/20 = 5/100 = 0.05$.
If $C_1 = \mu_0 + z_\alpha\,\dfrac{\sigma}{\sqrt{n}}$, then the Type I error probability is $\alpha$.
$H_{A1}$: true.

[Figure: the sampling distribution under $H_{A1}$ (centered at $\mu_1$) with the rejection region above $C_1$; locations (3) and (1) fall below $C_1$, and location (2) falls above it.]

If the sample mean is at location (1) or (3), we make the wrong decision. If the sample mean is at location (2), we make the correct decision.

This is called making a TYPE II error, and the probability of making this type of error is usually denoted by the Greek letter $\beta$:
$$\beta = P(\text{Do Not Reject } H_0 \text{ when } H_A \text{ is the true condition}).$$

We will come back to the Type II error later.
$H_0$: true vs. $H_{A1}$: true

[Figure: the $H_0$ distribution (centered at $\mu_0$) and the $H_{A1}$ distribution (centered at $\mu_1$), with the rejection region above $C_1$; the Type I error region ($\alpha$) is the upper tail of the $H_0$ curve, and the Type II error region ($\beta$) is the part of the $H_{A1}$ curve below $C_1$.]

• We want to minimize the Type I error, just in case $H_0$ is true, and we wish to minimize the Type II error, just in case $H_A$ is true.
• Problem: a decrease in one causes an increase in the other.
• Also: we can never have a Type I or Type II error equal to zero!
The solution to our quandary is to set the Type I error rate to be small and hope for a small Type II error also. The logic is as follows:

1. Assume the data come from a population with an unknown distribution but with mean $\mu$ and standard deviation $\sigma$.
2. For any sample of size $n$ from this population we can compute a sample mean $\bar{y}$.
3. Sample means from repeated studies having samples of size $n$ will have a sampling distribution that is approximately normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$ (the standard error of the mean). This is the Central Limit Theorem.
4. If the null hypothesis is true and $\mu = \mu_0$, then we deduce that with probability $\alpha$ we will observe a sample mean greater than $\mu_0 + z_\alpha\,\sigma/\sqrt{n}$. (For example, for $\alpha = 0.05$, $z_\alpha = 1.645$.)
Following this same logic, rejection rules for all three possible alternative hypotheses can be derived. Reject $H_0$ if:

• $H_{A1}: \mu > \mu_0$: reject if $\dfrac{\bar{y} - \mu_0}{\sigma/\sqrt{n}} > z_\alpha$
• $H_{A2}: \mu < \mu_0$: reject if $\dfrac{\bar{y} - \mu_0}{\sigma/\sqrt{n}} < -z_\alpha$
• $H_{A3}: \mu \ne \mu_0$: reject if $\left|\dfrac{\bar{y} - \mu_0}{\sigma/\sqrt{n}}\right| > z_{\alpha/2}$

Note: it is just easier to work with a standardized test statistic; we only need the standard normal table to find critical values.

In $(1 - \alpha)\,100\%$ of repeated studies, if the true population mean is $\mu_0$ as conjectured by $H_0$, the decision concluded from the statistical test will be correct. On the other hand, in $\alpha\,100\%$ of studies the wrong decision (a Type I error) will be made.
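These three rules can be collected into one small Python helper; the function name and its arguments are illustrative rather than anything defined in the notes:

```python
from scipy.stats import norm
import math

def z_test(ybar, mu0, sigma, n, alpha=0.05, alternative="greater"):
    """One-sample z-test of H0: mu = mu0 against the chosen alternative."""
    z_star = (ybar - mu0) / (sigma / math.sqrt(n))
    if alternative == "greater":        # HA1: mu > mu0
        reject = z_star > norm.ppf(1 - alpha)
    elif alternative == "less":         # HA2: mu < mu0
        reject = z_star < -norm.ppf(1 - alpha)
    else:                               # HA3: mu != mu0 (two-sided)
        reject = abs(z_star) > norm.ppf(1 - alpha / 2)
    return z_star, reject

# Example call using the numbers from the worked example later in these notes
print(z_test(ybar=40.1, mu0=38, sigma=5.6, n=50, alternative="greater"))
```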
The value $0 < \alpha < 1$ represents the risk we are willing to take of making the wrong decision. But what if I don't wish to take any risk? Why not set $\alpha = 0$? What if the true situation is expressed by the alternative hypothesis and not the null hypothesis?

[Figure: rejection regions for $\alpha = 0.05$, $\alpha = 0.01$, and $\alpha = 0.001$; as $\alpha$ shrinks, the critical value moves further from $\mu_0$.]

Suppose $H_{A1}$ is really the true situation. Then the samples come from the distribution described by the alternative hypothesis, and the sampling distribution of the mean will have mean $\mu_1$, not $\mu_0$:
$$\bar{y} \sim N\!\left(\mu_1, \ \sigma/\sqrt{n}\right).$$
(at a Type I error probability of $\alpha$)

Rule: reject $H_0$ if
$$\frac{\bar{y} - \mu_0}{\sigma/\sqrt{n}} \ge z_\alpha, \quad \text{or equivalently, if} \quad \bar{y} \ge C_1 = \mu_0 + z_\alpha\,\frac{\sigma}{\sqrt{n}}.$$

Under $H_0$,
$$\Pr\!\left(\frac{\bar{y} - \mu_0}{\sigma/\sqrt{n}} \ge z_\alpha\right) = \alpha.$$

[Figure: the $H_0$ distribution centered at $\mu_0$, with "do not reject $H_0$" below $C_1$ and "reject $H_0$" above it.]

The decision is pretty clear cut when $\mu_1$ is much greater than $\mu_0$.
When $H_0$ is true:
$$\Pr\!\left(\frac{\bar{y} - \mu_0}{\sigma/\sqrt{n}} \ge z_\alpha \,\middle|\, H_0\right) = \alpha.$$

When $H_A$ is true:
$$\Pr\!\left(\frac{\bar{y} - \mu_0}{\sigma/\sqrt{n}} \ge z_\alpha \,\middle|\, H_A\right) = 1 - \beta.$$

[Figure: the $H_0$ curve (centered at $\mu_0$) with Type I error area $\alpha$ above $C_1$, and the $H_A$ curve (centered at $\mu_1$) with Type II error area $\beta$ below $C_1$.]

If $H_A$ is the true situation, then any sample whose mean is larger than $C_1 = \mu_0 + z_\alpha\,\sigma/\sqrt{n}$ will lead to the correct decision (reject $H_0$, accept $H_A$).
If $H_A$ is the true situation, then any sample whose mean is less than $\mu_0 + z_\alpha\,\sigma/\sqrt{n}$ will lead to the wrong decision (do not reject $H_0$; that is, reject $H_A$).

Decision          H0 is true           HA is true
Reject H0         Type I error (α)     Correct (1 − β)
Accept H0         Correct (1 − α)      Type II error (β)
[Figure: the $H_0$ curve (centered at $\mu_0$) and the $H_A$ curve (centered at $\mu_1$), with the areas $\alpha$, $\beta$, and $1 - \beta$ marked relative to $C_1$.]

Type I error probability (reject $H_0$ when $H_0$ is true): $\alpha$.

Power of the test (reject $H_0$ when $H_A$ is true):
$$\Pr\!\left(\frac{\bar{y} - \mu_0}{\sigma/\sqrt{n}} \ge z_\alpha \,\middle|\, H_A\right) = 1 - \beta.$$

Type II error probability (reject $H_A$, i.e., do not reject $H_0$, when $H_A$ is true): $\beta$.
Example

$H_0: \mu = \mu_0 = 38$ versus $H_A: \mu > 38$.
Sample size: $n = 50$; sample mean: $\bar{y} = 40.1$; sample standard deviation: $s = 5.6$.

What risk are we willing to take that we reject $H_0$ when in fact $H_0$ is true?
$P(\text{Type I error}) = \alpha = 0.05$, so the critical value is $z_{.05} = 1.645$.

Rejection region: reject $H_0$ if $\dfrac{\bar{y} - \mu_0}{\sigma/\sqrt{n}} \ge z_\alpha$. Here
$$\frac{40.1 - 38}{5.6/\sqrt{50}} = 2.651 > 1.645.$$

Conclusion: reject $H_0$.
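A quick check of this calculation; the one-sided p-value is not given on the slide and is added here only for reference:

```python
from scipy.stats import norm
import math

ybar, mu0, s, n, alpha = 40.1, 38, 5.6, 50, 0.05

z_star = (ybar - mu0) / (s / math.sqrt(n))   # standardized test statistic
p_value = 1 - norm.cdf(z_star)               # one-sided (upper-tail) p-value

print(f"z* = {z_star:.2f}  (critical value {norm.ppf(1 - alpha):.3f}),  p-value = {p_value:.4f}")
# z* is about 2.65 > 1.645, so reject H0 at alpha = 0.05
```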
Type II Error

To compute the Type II error probability for a hypothesis test, we need a specific alternative hypothesis, rather than a vague one.

Vague: $H_A: \mu > \mu_0$, $H_A: \mu < \mu_0$, $H_A: \mu \ne \mu_0$ (e.g., $H_A: \mu \ne 10$).

Specific: $H_A: \mu = \mu_1 > \mu_0$ or $H_A: \mu = \mu_1 < \mu_0$ (e.g., $H_A: \mu = 5 < 10$ or $H_A: \mu = 20 > 10$).

Note: as the difference between $\mu_0$ and $\mu_1$ gets larger, the probability of committing a Type II error ($\beta$) decreases.

Significant difference: $\Delta = |\mu_0 - \mu_1|$.
Computing the probability of a Type II error ($\beta$)

For $H_0: \mu = \mu_0$ vs. a specific one-sided alternative ($H_A: \mu = \mu_1 > \mu_0$ or $H_A: \mu = \mu_1 < \mu_0$):
$$\beta = \Pr\!\left(Z \le z_\alpha - \frac{|\mu_0 - \mu_1|}{\sigma/\sqrt{n}}\right).$$

For $H_0: \mu = \mu_0$ vs. $H_A: \mu \ne \mu_0$, with true mean $\mu = \mu_1 \ne \mu_0$ (to a close approximation):
$$\beta = \Pr\!\left(Z \le z_{\alpha/2} - \frac{|\mu_0 - \mu_1|}{\sigma/\sqrt{n}}\right).$$

Here $\Delta = |\mu_0 - \mu_1|$ is the critical difference, and $1 - \beta$ is the power of the test.
Example: Power

$n = 50$, $H_0: \mu = \mu_0 = 38$, $\bar{y} = 40.1$, $s = 5.6$, $\Pr(\text{Type I error}) = \alpha = 0.05$, so $z_\alpha = 1.645$.

For the specific alternative $H_A: \mu = \mu_1 = 40$ (so $\Delta = |\mu_0 - \mu_1| = 2$):
$$\beta = \Pr\!\left(Z \le z_\alpha - \frac{|\mu_0 - \mu_1|}{\sigma/\sqrt{n}}\right) = \Pr\!\left(Z \le 1.645 - \frac{\Delta\sqrt{50}}{\sigma}\right).$$

Assuming $\sigma$ is actually equal to $s$ (usually what is done):
$$\beta = \Pr\!\left(Z \le 1.645 - \frac{2\sqrt{50}}{5.6}\right) = \Pr(Z \le -0.88) = 0.1894.$$
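A quick check of this β (and the corresponding power) in Python:

```python
from scipy.stats import norm
import math

mu0, mu1, sigma, n, alpha = 38, 40, 5.6, 50, 0.05
delta = abs(mu0 - mu1)

z_alpha = norm.ppf(1 - alpha)                            # about 1.645
beta = norm.cdf(z_alpha - delta * math.sqrt(n) / sigma)  # Type II error probability

print(f"beta = {beta:.4f}, power = {1 - beta:.4f}")      # beta ~ 0.189, power ~ 0.811
```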
$$\beta = \Pr\!\left(Z \le z_\alpha - \frac{\Delta\sqrt{n}}{\sigma}\right)$$

[Figure: the Type II error probability $\beta$ plotted against $\Delta$, falling from near $1 - \alpha$ toward 0.]

As $\Delta$ gets larger, it is less likely that a Type II error will be made.
$$\beta = \Pr\!\left(Z \le z_\alpha - \frac{\Delta\sqrt{n}}{\sigma}\right)$$

[Figure: the Type II error probability $\beta$ plotted against $n$.]

As $n$ gets larger, it is less likely that a Type II error will be made. What happens for larger $\sigma$?
$$\beta = \Pr\!\left(Z \le z_\alpha - \frac{|\mu_0 - \mu_1|\sqrt{n}}{\sigma}\right) = \Pr\!\left(Z \le z_\alpha - \frac{\Delta\sqrt{n}}{\sigma}\right), \qquad \text{Power} = 1 - \beta.$$

Δ    n     β      Power
1    50    .755   .245
2    50    .286   .714
4    50    .001   .999
1    25    .857   .143
2    25    .565   .435
4    25    .054   .946
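A short script that approximately reproduces the tabled β and power values. It assumes σ = 5.6 (the value used in the earlier power example) and a two-sided test at α = 0.05, i.e. a critical value of about 1.96, which is what the tabled β values appear to correspond to:

```python
from scipy.stats import norm
import math

sigma = 5.6                 # assumed, from the earlier example
z_crit = norm.ppf(0.975)    # about 1.96, two-sided test at alpha = 0.05 (assumed)

for n in (50, 25):
    for delta in (1, 2, 4):
        beta = norm.cdf(z_crit - delta * math.sqrt(n) / sigma)
        print(f"n={n:2d}  Delta={delta}  beta={beta:.3f}  power={1 - beta:.3f}")
```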
[Figure: Pr(Type II error) $\beta$ and power $1 - \beta$ plotted against $\Delta$ for $n = 10$ and $n = 50$; $\beta$ starts near $1 - \alpha$ at $\Delta = 0$ and falls toward 0 as $\Delta$ gets large, while power starts near $\alpha$ and rises toward 1. See Table 4 of Ott and Longnecker.]
$$\beta = \Pr\!\left(Z \le z_\alpha - \frac{|\mu_0 - \mu_1|\sqrt{n}}{\sigma}\right) = \Pr\!\left(Z \le z_\alpha - \frac{\Delta\sqrt{n}}{\sigma}\right)$$

1) For fixed $\sigma$ and $n$, $\beta$ decreases (power increases) as $\Delta$ increases.
2) For fixed $\sigma$ and $\Delta$, $\beta$ decreases (power increases) as $n$ increases.
3) For fixed $\Delta$ and $n$, $\beta$ decreases (power increases) as $\sigma$ decreases.
[Figure: two sampling distributions of $\bar{y}$ for the same $\Delta$: $N(\mu_0, \ \sigma/\sqrt{n})$ and $N(\mu_0, \ \sigma_1/\sqrt{n})$ with $\sigma_1 < \sigma$.]

Decreasing $\sigma$ decreases the spread in the sampling distribution of $\bar{y}$. Note: the cutoff $z_\alpha\,\sigma/\sqrt{n}$ (and hence $C_1$) changes. The same thing happens if you increase $n$. The same thing happens if $\Delta$ is increased.
1) Specify the critical difference, $\Delta$ (assume $\sigma$ is known).
2) Choose $P(\text{Type I error}) = \alpha$ and $P(\text{Type II error}) = \beta$ based on traditional and/or personal levels of risk.
3) One-sided tests:
$$n = \frac{\sigma^2 (z_\alpha + z_\beta)^2}{\Delta^2}$$
4) Two-sided tests:
$$n = \frac{\sigma^2 (z_{\alpha/2} + z_\beta)^2}{\Delta^2}$$

Example: one-sided test, $\sigma = 5.6$, and we wish to show a difference of $\Delta = 0.5$ as significant (i.e., reject $H_0: \mu = \mu_0$ in favor of $H_A: \mu = \mu_1 = \mu_0 + \Delta$) with $\alpha = 0.05$ and $\beta = 0.2$:
$$n = \frac{(5.6)^2 (1.65 + 0.84)^2}{(0.5)^2} = 777.7 \approx 778.$$
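A quick sketch of this sample-size calculation; with exact normal quantiles the answer comes out near 776, while the 778 above uses the rounded values 1.65 and 0.84:

```python
from scipy.stats import norm
import math

sigma, delta, alpha, beta = 5.6, 0.5, 0.05, 0.20  # values from the example above

z_alpha = norm.ppf(1 - alpha)   # about 1.645 (one-sided test)
z_beta = norm.ppf(1 - beta)     # about 0.842

n = sigma**2 * (z_alpha + z_beta)**2 / delta**2
print(math.ceil(n))   # ~776 with exact quantiles; 778 results from the rounded 1.65 and 0.84
```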