Fundamentals of Hypothesis Testing: One-Sample Tests
1
Goals
After completing this chapter, you should be able to:
• Formulate null and alternative hypotheses for applications involving a single population mean or proportion
• Formulate a decision rule for testing a hypothesis
• Know how to use the critical value and p-value approaches to test the null hypothesis (for both mean and proportion problems)
• Know what Type I and Type II errors are
2
What is a Hypothesis?
• A hypothesis is a claim (assumption) about a population parameter:
  • population mean
    Example: The mean monthly cell phone bill of this city is µ = $42
  • population proportion
    Example: The proportion of adults in this city with cell phones is p = .68
3
The Null Hypothesis, H0
• States the assumption (numerical) to be tested
  Example: The average number of TV sets in U.S. homes is equal to three (H0: µ = 3)
• Is always about a population parameter, not about a sample statistic
  Correct: H0: µ = 3
  Incorrect: H0: X̄ = 3
4
The Null Hypothesis, H0 (continued)
• Begin with the assumption that the null hypothesis is true
  • Similar to the notion of innocent until proven guilty
• Refers to the status quo
• Always contains “=”, “≤” or “≥” sign
• May or may not be rejected
5
The Alternative Hypothesis, H1
• Is the opposite of the null hypothesis
  • e.g., The average number of TV sets in U.S. homes is not equal to 3 (H1: µ ≠ 3)
• Challenges the status quo
• Never contains the “=”, “≤” or “≥” sign
• May or may not be proven
• Is generally the hypothesis that the researcher is trying to prove
6
Hypothesis Testing Process
• Claim: the population mean age is 50. (Null Hypothesis: H0: µ = 50)
• Now select a random sample from the population
• Suppose the sample mean age is 20: X̄ = 20
• Is X̄ = 20 likely if µ = 50? If not likely, REJECT the null hypothesis
7
Reason for Rejecting H0
[Figure: sampling distribution of X̄ if H0 is true, centered at µ = 50, with the observed sample mean of 20 far out in the lower tail]
If it is unlikely that we would get a sample mean of this value (20) if in fact µ = 50 were the population mean, then we reject the null hypothesis that µ = 50.
8
Level of Significance, α
• Defines the unlikely values of the sample statistic if the null hypothesis is true
  • Defines rejection region of the sampling distribution
• Is designated by α (level of significance)
  • Typical values are .01, .05, or .10
• Is selected by the researcher at the beginning
• Provides the critical value(s) of the test
9
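Level of Significance and the Rejection Region
Level of significance = α
• Two-tail test: H0: µ = 3, H1: µ ≠ 3 (rejection regions of area α/2 in each tail)
• Upper-tail test: H0: µ ≤ 3, H1: µ > 3 (rejection region of area α in the upper tail)
• Lower-tail test: H0: µ ≥ 3, H1: µ < 3 (rejection region of area α in the lower tail)
[Figure: three sampling distributions centered at 0; the rejection region is shaded and its boundary represents the critical value]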
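The three rejection regions above can be located numerically. The following is a minimal sketch using Python's standard library; the function name `critical_values` and the tail labels are assumptions made for illustration, not part of the slides.

```python
from statistics import NormalDist

def critical_values(alpha, tail):
    """Return the critical Z value(s) for significance level alpha.

    tail is "two", "upper", or "lower" (labels chosen for this sketch).
    """
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "two":
        upper = z.inv_cdf(1 - alpha / 2)  # alpha/2 in each tail
        return (-upper, upper)
    if tail == "upper":
        return (z.inv_cdf(1 - alpha),)    # all of alpha in the upper tail
    if tail == "lower":
        return (z.inv_cdf(alpha),)        # all of alpha in the lower tail
    raise ValueError("tail must be 'two', 'upper', or 'lower'")

for tail in ("two", "upper", "lower"):
    print(tail, [round(v, 2) for v in critical_values(0.05, tail)])
```

A one-tail test puts all of α in one tail, so its critical value sits closer to 0 than the two-tail critical values for the same α.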
10
Errors in Making Decisions
• Type I Error
  • Reject a true null hypothesis
  • Considered a serious type of error
  The probability of Type I Error is α
  • Called level of significance of the test
  • Set by researcher in advance
11
Errors in Making Decisions (continued)
• Type II Error
  • Fail to reject a false null hypothesis
  The probability of Type II Error is β
12
Outcomes and Probabilities
Possible Hypothesis Test Outcomes
Key: Outcome (Probability)

                      Actual Situation
Decision              H0 True               H0 False
Do Not Reject H0      No error (1 - α)      Type II Error (β)
Reject H0             Type I Error (α)      No Error (1 - β)
13
Type I & II Error Relationship
• Type I and Type II errors cannot happen at the same time
  • Type I error can only occur if H0 is true
  • Type II error can only occur if H0 is false
If Type I error probability (α) increases, then Type II error probability (β) decreases
14
Factors Affecting Type II Error
• All else equal,
  • β increases when the difference between the hypothesized parameter and its true value decreases
  • β increases when α decreases
  • β increases when σ increases
  • β increases when n decreases
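The four relationships above can be checked numerically. The sketch below computes β for an upper-tail Z test of H0: µ ≤ µ0; all numbers (µ0 = 52, a true mean of 55, σ = 10, n = 64, α = .10) are hypothetical values picked for illustration, and the function name `type_ii_error` is likewise an assumption.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(mu0, mu_true, sigma, n, alpha):
    """beta for an upper-tail Z test of H0: mu <= mu0 when the true mean is mu_true."""
    std_error = sigma / sqrt(n)
    # H0 is rejected when the sample mean exceeds this cutoff.
    cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * std_error
    # beta = P(sample mean <= cutoff) when the true mean is mu_true.
    return NormalDist(mu_true, std_error).cdf(cutoff)

# Hypothetical baseline, then one factor changed at a time.
base = type_ii_error(mu0=52, mu_true=55, sigma=10, n=64, alpha=0.10)
smaller_diff = type_ii_error(52, 53, 10, 64, 0.10)
smaller_alpha = type_ii_error(52, 55, 10, 64, 0.01)
larger_sigma = type_ii_error(52, 55, 20, 64, 0.10)
smaller_n = type_ii_error(52, 55, 10, 16, 0.10)

print(f"baseline beta:      {base:.3f}")
print(f"smaller difference: {smaller_diff:.3f}")
print(f"smaller alpha:      {smaller_alpha:.3f}")
print(f"larger sigma:       {larger_sigma:.3f}")
print(f"smaller n:          {smaller_n:.3f}")
```

Each changed scenario should print a β larger than the baseline, matching the direction stated in the list.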
15
Hypothesis Tests for the Mean
Hypothesis Tests for µ:
• σ Known
• σ Unknown
16
Z Test of Hypothesis for the Mean (σ Known)
• Convert sample statistic (X̄) to a Z test statistic
The test statistic is:
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$
17
Critical Value Approach to Testing
• For two-tailed test for the mean, σ known:
  • Convert sample statistic (X̄) to test statistic (Z statistic)
  • Determine the critical Z values for a specified level of significance α from a table or computer
  • Decision Rule: If the test statistic falls in the rejection region, reject H0; otherwise do not reject H0
18
Two-Tail Tests
• There are two cutoff values (critical values), defining the regions of rejection
H0: µ = 3
H1: µ ≠ 3
[Figure: rejection regions of area α/2 in each tail. Reject H0 below the lower critical value −Z, do not reject H0 between −Z and +Z, reject H0 above the upper critical value +Z. The X̄ axis is centered at 3 and the Z axis at 0.]
19
Review: 10 Steps in Hypothesis Testing
1. State the null hypothesis, H0
2. State the alternative hypothesis, H1
3. Choose the level of significance, α
4. Choose the sample size, n
5. Determine the appropriate statistical technique and the test statistic to use
6. Find the critical values and determine the rejection region(s)
20
Review: 10 Steps in Hypothesis Testing (continued)
7. Collect data and compute the test statistic from the sample result
8. Compare the test statistic to the critical value to determine whether the test statistic falls in the region of rejection
9. Make the statistical decision: Reject H0 if the test statistic falls in the rejection region
10. Express the decision in the context of the problem
21
Hypothesis Testing Example
Test the claim that the true mean # of TV sets in US homes is equal to 3. (Assume σ = 0.8)
1-2. State the appropriate null and alternative hypotheses
  • H0: µ = 3, H1: µ ≠ 3 (This is a two-tailed test)
3. Specify the desired level of significance
  • Suppose that α = .05 is chosen for this test
4. Choose a sample size
  • Suppose a sample of size n = 100 is selected
22
Hypothesis Testing Example (continued)
5. Determine the appropriate technique
  • σ is known so this is a Z test
6. Set up the critical values
  • For α = .05 the critical Z values are ±1.96
7. Collect the data and compute the test statistic
  • Suppose the sample results are n = 100, X̄ = 2.84 (σ = 0.8 is assumed known)
So the test statistic is:
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{2.84 - 3}{0.8 / \sqrt{100}} = \frac{-0.16}{0.08} = -2.0$$
23
Hypothesis Testing Example (continued)
8. Is the test statistic in the rejection region?
Reject H0 if Z < -1.96 or Z > 1.96; otherwise do not reject H0
[Figure: standard normal curve with α/2 = .05/2 = .025 in each tail. Reject H0 below −1.96, do not reject H0 between −1.96 and +1.96, reject H0 above +1.96.]
Here, Z = -2.0 < -1.96, so the test statistic is in the rejection region
24
Hypothesis Testing Example (continued)
9-10. Reach a decision and interpret the result
[Figure: standard normal curve with α/2 = .05/2 = .025 in each tail and critical values ±1.96; the test statistic −2.0 lies in the lower rejection region]
Since Z = -2.0 < -1.96, we reject the null hypothesis and conclude that there is sufficient evidence that the mean number of TVs in US homes is not equal to 3
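The TV-set example can be reproduced in a few lines. This is a minimal sketch with Python's standard library; the variable names are chosen here for illustration, and the numbers are those given in the example (µ = 3, σ = 0.8, n = 100, X̄ = 2.84, α = .05).

```python
from math import sqrt
from statistics import NormalDist

# Steps 1-4: hypotheses (H0: mu = 3 vs. H1: mu != 3), alpha, and sample size.
mu0, sigma = 3.0, 0.8
alpha, n = 0.05, 100

# Step 6: two-tail critical value, alpha/2 in each tail.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Step 7: test statistic from the sample mean.
x_bar = 2.84
z_stat = (x_bar - mu0) / (sigma / sqrt(n))

# Steps 8-9: reject H0 if the statistic falls in either rejection region.
reject = z_stat < -z_crit or z_stat > z_crit

print(f"Z = {z_stat:.2f}, critical values = ±{z_crit:.2f}, reject H0: {reject}")
```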
25
p-Value Approach to Testing
• p-value: Probability of obtaining a test statistic equal to or more extreme (≤ or ≥) than the observed sample value, given H0 is true
  • Also called observed level of significance
  • Smallest value of α for which H0 can be rejected
26
p-Value Approach to Testing (continued)
• Convert Sample Statistic (e.g., X̄) to Test Statistic (e.g., Z statistic)
• Obtain the p-value from a table or computer
• Compare the p-value with α
  • If p-value < α, reject H0
  • If p-value ≥ α, do not reject H0
27
p-Value Example
• Example: How likely is it to see a sample mean of 2.84 (or something further from the mean, in either direction) if the true mean is µ = 3.0?
X̄ = 2.84 is translated to a Z score of Z = -2.0
P(Z < −2.0) = .0228
P(Z > 2.0) = .0228
p-value = .0228 + .0228 = .0456
[Figure: standard normal curve with tail areas of .0228 beyond Z = ±2.0, compared with α/2 = .025 beyond the critical values ±1.96]
28
p-Value Example (continued)
• Compare the p-value with α
  • If p-value < α, reject H0
  • If p-value ≥ α, do not reject H0
Here: p-value = .0456, α = .05
Since .0456 < .05, we reject the null hypothesis
[Figure: standard normal curve with tail areas of .0228 beyond Z = ±2.0, compared with α/2 = .025 beyond the critical values ±1.96]
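The two-tailed p-value for this example can also be obtained "from a computer" rather than a table. This is a small sketch; the helper name `two_tailed_p_value` is an assumption for illustration. The computed value differs slightly in the fourth decimal from the table-based .0456, because the table rounds each tail area to .0228 before doubling.

```python
from statistics import NormalDist

def two_tailed_p_value(z_stat):
    """P(Z <= -|z|) + P(Z >= |z|) under the standard normal distribution."""
    return 2 * NormalDist().cdf(-abs(z_stat))

alpha = 0.05
p_value = two_tailed_p_value(-2.0)
reject = p_value < alpha

print(f"p-value = {p_value:.4f}, reject H0: {reject}")
```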
29
Connection to Confidence Intervals
• For X̄ = 2.84, σ = 0.8 and n = 100, the 95% confidence interval is:
$$2.84 - (1.96)\frac{0.8}{\sqrt{100}} \quad \text{to} \quad 2.84 + (1.96)\frac{0.8}{\sqrt{100}}$$
2.6832 ≤ µ ≤ 2.9968
• Since this interval does not contain the hypothesized mean (3.0), we reject the null hypothesis at α = .05
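The interval and the resulting decision can be checked with a short sketch. The function name `z_confidence_interval` is hypothetical; the inputs are the values from the example.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

low, high = z_confidence_interval(2.84, 0.8, 100)
contains_hypothesized_mean = low <= 3.0 <= high

print(f"{low:.4f} <= mu <= {high:.4f}")
print("reject H0 at alpha = .05:", not contains_hypothesized_mean)
```

For a two-tail test, rejecting H0 at level α is equivalent to the hypothesized mean falling outside the (1 − α) confidence interval, which is why both approaches reach the same decision here.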
Chap 8-30
One-Tail Tests
• In many cases, the alternative hypothesis focuses on a particular direction
H0: µ ≥ 3
H1: µ < 3
This is a lower-tail test since the alternative hypothesis is focused on the lower tail below the mean of 3
H0: µ ≤ 3
H1: µ > 3
This is an upper-tail test since the alternative hypothesis is focused on the upper tail above the mean of 3
31
Lower-Tail Tests
• There is only one critical value, since the rejection area is in only one tail
H0: µ ≥ 3
H1: µ < 3
[Figure: rejection region of area α in the lower tail. Reject H0 below the critical value −Z; do not reject H0 above it. The Z axis is centered at 0 and the X̄ axis at µ.]
32
Upper-Tail Tests
• There is only one critical value, since the rejection area is in only one tail
H0: µ ≤ 3
H1: µ > 3
[Figure: rejection region of area α in the upper tail. Do not reject H0 below the critical value Zα; reject H0 above it. The Z axis is centered at 0 and the X̄ axis at µ.]
33
Example: Upper-Tail Z Test for Mean (σ Known)
A phone industry manager thinks that customer monthly cell phone bills have increased, and now average over $52 per month. The company wishes to test this claim. (Assume σ = 10 is known)
Form hypothesis test:
H0: µ ≤ 52   the average is not over $52 per month
H1: µ > 52   the average is greater than $52 per month (i.e., sufficient evidence exists to support the manager’s claim)
34
Example: Find Rejection Region (continued)
• Suppose that α = .10 is chosen for this test
Find the rejection region:
[Figure: standard normal curve with α = .10 in the upper tail. Do not reject H0 below 1.28; reject H0 above 1.28.]
Reject H0 if Z > 1.28
35
Review: One-Tail Critical Value
What is Z given α = 0.10?
[Figure: standard normal curve with area .90 to the left of the critical value and α = .10 in the upper tail]
Standard Normal Distribution Table (Portion)
 Z     .07     .08     .09
1.1   .8790   .8810   .8830
1.2   .8980   .8997   .9015
1.3   .9147   .9162   .9177
The table entry closest to .90 is .8997, in row 1.2 and column .08.
Critical Value = 1.28
36
Example: Test Statistic (continued)
Obtain sample and compute the test statistic
Suppose a sample is taken with the following results: n = 64, X̄ = 53.1 (σ = 10 was assumed known)
• Then the test statistic is:
$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{53.1 - 52}{10 / \sqrt{64}} = 0.88$$
37
Example: Decision (continued)
Reach a decision and interpret the result:
[Figure: standard normal curve with α = .10 in the upper tail beyond 1.28; the test statistic Z = .88 falls in the do-not-reject region]
Do not reject H0 since Z = 0.88 ≤ 1.28
i.e.: there is not sufficient evidence that the mean bill is over $52
38
p-Value Solution (continued)
Calculate the p-value and compare to α (assuming that µ = 52.0)
$$P(\bar{X} \ge 53.1) = P\left(Z \ge \frac{53.1 - 52.0}{10 / \sqrt{64}}\right) = P(Z \ge 0.88) = 1 - .8106 = .1894$$
[Figure: standard normal curve with α = .10 in the upper tail beyond 1.28; the area to the right of Z = .88 is the p-value, .1894]
Do not reject H0 since p-value = .1894 > α = .10
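Both approaches to the cell phone bill example can be run together. This is a minimal sketch assuming the example's values (µ = 52, σ = 10, n = 64, X̄ = 53.1, α = .10); the function name `upper_tail_z_test` is introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist

def upper_tail_z_test(x_bar, mu0, sigma, n, alpha):
    """Upper-tail Z test of H0: mu <= mu0 with sigma known.

    Returns the test statistic, the critical value, and the p-value.
    """
    z = NormalDist()
    z_stat = (x_bar - mu0) / (sigma / sqrt(n))
    z_crit = z.inv_cdf(1 - alpha)   # all of alpha in the upper tail
    p_value = 1 - z.cdf(z_stat)     # P(Z >= z_stat)
    return z_stat, z_crit, p_value

z_stat, z_crit, p_value = upper_tail_z_test(53.1, 52, 10, 64, 0.10)

print(f"Z = {z_stat:.2f}, critical value = {z_crit:.2f}, p-value = {p_value:.4f}")
print("critical value approach rejects H0:", z_stat > z_crit)
print("p-value approach rejects H0:       ", p_value < 0.10)
```

The two approaches always agree: Z exceeds the critical value exactly when the p-value is below α.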
39
t Test of Hypothesis for the Mean (σ Unknown)
• Convert sample statistic (X̄) to a t test statistic
The test statistic is:
$$t_{n-1} = \frac{\bar{X} - \mu}{S / \sqrt{n}}$$
40
Example: Two-Tail Test (σ Unknown)
The average cost of a hotel room in New York is said to be $168 per night. A random sample of 25 hotels resulted in X̄ = $172.50 and S = $15.40. Test at the α = 0.05 level.
(Assume the population distribution is normal)
H0: µ = 168
H1: µ ≠ 168
41
Example Solution: Two-Tail Test
H0: µ = 168
H1: µ ≠ 168
• α = 0.05
• n = 25
• σ is unknown, so use a t statistic
• Critical Value: t24 = ± 2.0639
$$t_{n-1} = \frac{\bar{X} - \mu}{S / \sqrt{n}} = \frac{172.50 - 168}{15.40 / \sqrt{25}} = 1.46$$
[Figure: t distribution with α/2 = .025 in each tail. Reject H0 below −2.0639 or above 2.0639; the test statistic 1.46 falls in the do-not-reject region.]
Do not reject H0: not sufficient evidence that true mean cost is different than $168
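The hotel example can be checked with the sketch below. Python's standard library has no t distribution, so the p-value is approximated by integrating the t density with Simpson's rule; that numerical approach and the helper names are assumptions of this sketch, not the slides' method. The critical value 2.0639 is taken from the example.

```python
from math import gamma, pi, sqrt

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_tailed_t_p_value(t_stat, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t_stat|] (steps must be even)."""
    b = abs(t_stat)
    h = b / steps
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3   # approximates P(0 <= T <= |t_stat|)
    return 2 * (0.5 - central_area)

x_bar, mu0, s, n = 172.50, 168, 15.40, 25
t_stat = (x_bar - mu0) / (s / sqrt(n))
t_crit = 2.0639                    # t with 24 d.f., alpha/2 = .025 (from the example)
p_value = two_tailed_t_p_value(t_stat, n - 1)

print(f"t = {t_stat:.2f}, reject H0: {abs(t_stat) > t_crit}, p-value ≈ {p_value:.3f}")
```

As a sanity check on the approximation, plugging the critical value 2.0639 into `two_tailed_t_p_value` with 24 degrees of freedom should return a value very close to .05.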
42
Connection to Confidence Intervals
• For X̄ = 172.5, S = 15.40 and n = 25, the 95% confidence interval is:
$$172.5 - (2.0639)\frac{15.4}{\sqrt{25}} \quad \text{to} \quad 172.5 + (2.0639)\frac{15.4}{\sqrt{25}}$$
166.14 ≤ µ ≤ 178.86
• Since this interval contains the hypothesized mean (168), we do not reject the null hypothesis at α = .05
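For reference, the arithmetic behind the interval endpoints can be written out step by step, using only the values given in the example:

```latex
\[
172.5 \pm (2.0639)\,\frac{15.40}{\sqrt{25}}
  = 172.5 \pm (2.0639)(3.08)
  = 172.5 \pm 6.3568
  \;\Longrightarrow\;
  166.14 \le \mu \le 178.86
\]
```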
43
Hypothesis Tests for Proportions
• Involves categorical variables
• Two possible outcomes
  • “Success” (possesses a certain characteristic)
  • “Failure” (does not possess that characteristic)
• Fraction or proportion of the population in the “success” category is denoted by p
44
Proportions (continued)
• Sample proportion in the success category is denoted by ps
$$p_s = \frac{X}{n} = \frac{\text{number of successes in sample}}{\text{sample size}}$$
• When both np and n(1-p) are at least 5, ps can be approximated by a normal distribution with mean and standard deviation
$$\mu_{p_s} = p, \qquad \sigma_{p_s} = \sqrt{\frac{p(1 - p)}{n}}$$
45
Hypothesis Tests for Proportions
• The sampling distribution of ps is approximately normal, so the test statistic is a Z value:
$$Z = \frac{p_s - p}{\sqrt{\frac{p(1 - p)}{n}}}$$
Hypothesis Tests for p:
• np ≥ 5 and n(1-p) ≥ 5: use the Z test above
• np < 5 or n(1-p) < 5: not discussed in this chapter
46
Z Test for Proportion in Terms of Number of Successes
• An equivalent form to the last slide, but in terms of the number of successes, X:
$$Z = \frac{X - np}{\sqrt{np(1 - p)}}$$
Hypothesis Tests for X:
• X ≥ 5 and n-X ≥ 5: use the Z test above
• X < 5 or n-X < 5: not discussed in this chapter
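To see why the two forms are equivalent, substitute $p_s = X/n$ into the proportion form of the statistic and multiply numerator and denominator by n. This is a short derivation sketch using only symbols already defined in the slides:

```latex
\[
Z = \frac{p_s - p}{\sqrt{\dfrac{p(1-p)}{n}}}
  = \frac{\dfrac{X}{n} - p}{\sqrt{\dfrac{p(1-p)}{n}}}
  = \frac{n\left(\dfrac{X}{n} - p\right)}{n\sqrt{\dfrac{p(1-p)}{n}}}
  = \frac{X - np}{\sqrt{\dfrac{n^{2}\,p(1-p)}{n}}}
  = \frac{X - np}{\sqrt{np(1-p)}}
\]
```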
47
Example: Z Test for Proportion
A marketing company claims that it receives 8% responses from its mailing. To test this claim, a random sample of 500 were surveyed with 25 responses. Test at the α = .05 significance level.
Check:
np = (500)(.08) = 40
n(1-p) = (500)(.92) = 460
48
Z Test for Proportion: Solution
H0: p = .08
H1: p ≠ .08
α = .05
n = 500, ps = .05
Critical Values: ± 1.96
Test Statistic:
$$Z = \frac{p_s - p}{\sqrt{\frac{p(1 - p)}{n}}} = \frac{.05 - .08}{\sqrt{\frac{.08(1 - .08)}{500}}} = -2.47$$
[Figure: standard normal curve with rejection regions of area .025 below −1.96 and above 1.96; the test statistic −2.47 falls in the lower rejection region]
Decision: Reject H0 at α = .05
Conclusion: There is sufficient evidence to reject the company’s claim of 8% response rate.
49
p-Value Solution (continued)
Calculate the p-value and compare to α
(For a two-sided test the p-value is always two-sided)
p-value = .0136:
P(Z ≤ −2.47) + P(Z ≥ 2.47) = 2(.0068) = 0.0136
[Figure: standard normal curve with tail areas of .0068 beyond Z = ±2.47, compared with α/2 = .025 beyond the critical values ±1.96]
Reject H0 since p-value = .0136 < α = .05
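The mailing-response example can be reproduced as follows. This is a minimal sketch; the function name `proportion_z_test` is hypothetical and the inputs are the example's values (25 responses out of 500, claimed p = .08, α = .05). The computed p-value comes out slightly below the slide's .0136 because the slide rounds Z to −2.47 before looking up the tail area.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(successes, n, p0, alpha):
    """Two-tail Z test of H0: p = p0 using the normal approximation."""
    # The approximation is only used when np >= 5 and n(1-p) >= 5.
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("need np >= 5 and n(1-p) >= 5 for the Z test")
    p_s = successes / n                               # sample proportion
    z_stat = (p_s - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * NormalDist().cdf(-abs(z_stat))      # both tails
    return z_stat, p_value, p_value < alpha

z_stat, p_value, reject = proportion_z_test(25, 500, 0.08, 0.05)

print(f"Z = {z_stat:.2f}, p-value = {p_value:.4f}, reject H0: {reject}")
```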
50
Potential Pitfalls and Ethical Considerations
• Use randomly collected data to reduce selection biases
• Do not use human subjects without informed consent
• Choose the level of significance, α, before data collection
• Do not employ “data snooping” to choose between one-tail and two-tail test, or to determine the level of significance
• Do not practice “data cleansing” to hide observations that do not support a stated hypothesis
• Report all pertinent findings
51