P-value

Example: Problem 35
State DMV records indicate that of all vehicles undergoing emissions
testing during the previous year, 70% passed on the first try. A random
sample of 200 cars tested in a particular county during the current year
yields 124 that passed on the initial test. Does this suggest that the true
proportion for this county during the current year differs from the
previous statewide proportion?
Hypothesis: H0 : p = 0.70 vs. Ha : p ≠ 0.70.
Value of the test statistic:
z = (p̂ − p0) / √(p0(1 − p0)/n) = (0.62 − 0.70) / √(0.70(1 − 0.70)/200) = −2.469
Significance Level    Rejection Region          Conclusion
.05                   z ≥ 1.96 or z ≤ −1.96     Reject H0
.02                   z ≥ 2.33 or z ≤ −2.33     Reject H0
.01                   z ≥ 2.58 or z ≤ −2.58     Fail to reject H0
.002                  z ≥ 3.08 or z ≤ −3.08     Fail to reject H0
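The comparison of the observed z with each rejection region can be sketched in Python. This is a minimal illustration using only the standard library; the function name one_proportion_z and the dictionary decisions are hypothetical names chosen here, not part of the original example. Critical values are computed from the standard normal quantile function, so they may differ from rounded table entries in the last digit (for instance at level .002).

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z(successes, n, p0):
    """Large-sample z statistic for testing H0: p = p0."""
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Data from the emissions example: 124 of 200 cars passed, p0 = 0.70.
z = one_proportion_z(124, 200, 0.70)

decisions = {}
for alpha in (0.05, 0.02, 0.01, 0.002):
    # Two-tailed critical value: the upper alpha/2 point of N(0, 1).
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    decisions[alpha] = abs(z) >= z_crit
    verdict = "reject H0" if decisions[alpha] else "fail to reject H0"
    print(f"alpha = {alpha}: critical value {z_crit:.2f}, {verdict}")
```

The decision flips between levels .02 and .01, which suggests that the smallest level at which H0 is rejected lies somewhere between those two values.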
Definition
The P-value (or observed significance level) is the smallest level of
significance at which H0 would be rejected when a specified test
procedure is used on a given data set. Once the P-value has been
determined, the conclusion at any particular level α results from
comparing the P-value to α:
1. P-value ≤ α ⇒ reject H0 at level α.
2. P-value > α ⇒ fail to reject H0 at level α.
Convention: it is customary to call the data significant when H0 is
rejected and not significant otherwise.
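The decision rule is a single comparison, which a short Python sketch makes explicit. The function name decide and the P-value 0.03 are illustrative assumptions, not taken from the data in these notes.

```python
def decide(p_value, alpha):
    """Apply the rule: reject H0 at level alpha exactly when P-value <= alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a P-value must lie in [0, 1]")
    return "reject H0" if p_value <= alpha else "fail to reject H0"

# Hypothetical P-value, compared against two significance levels.
for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: {decide(0.03, alpha)}")
```

Note that the boundary case P-value = α counts as a rejection, matching rule 1.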
An equivalent definition for P-value:
Definition
The P-value is the probability, calculated assuming H0 is true, of
obtaining a test statistic value at least as contradictory to H0 as
the value that actually resulted. The smaller the P-value, the more
strongly the data contradict H0.
P-value for z Tests


P = 1 − Φ(z) for an upper-tailed test
P = Φ(z) for a lower-tailed test
P = 2[1 − Φ(|z|)] for a two-tailed test
where Φ(z) is the cdf of the standard normal rv.
e.g. the P-value for our first example is
P = 2[1 − Φ(|z|)] = 2[1 − Φ(| − 2.469|)]
= 2[1 − Φ(2.469)] = 2[1 − .9932]
= 0.0136
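The three z-test formulas can be sketched in Python with the standard library's NormalDist, whose cdf method plays the role of Φ. The function name z_test_p_value and its tail argument are hypothetical names introduced for this illustration.

```python
from statistics import NormalDist

def z_test_p_value(z, tail):
    """P-value of a z test; tail is 'upper', 'lower', or 'two'."""
    phi = NormalDist().cdf  # standard normal cdf, i.e. Φ
    if tail == "upper":
        return 1 - phi(z)
    if tail == "lower":
        return phi(z)
    if tail == "two":
        return 2 * (1 - phi(abs(z)))
    raise ValueError("tail must be 'upper', 'lower', or 'two'")

# Two-tailed P-value for the emissions example, z = -2.469.
p = z_test_p_value(-2.469, "two")
print(f"P-value = {p:.4f}")
```

Because the cdf is evaluated numerically rather than read from a table rounded to four decimals, the result may differ from the hand calculation in the fourth decimal place.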
P-value for t Tests


P = 1 − Tν(t) for an upper-tailed test
P = Tν(t) for a lower-tailed test
P = 2[1 − Tν(|t|)] for a two-tailed test
where Tν(t) is the cdf of the t distribution with ν degrees of freedom.
Table A.8 gives upper-tail probabilities of the t distribution. The
relation between the upper-tail probability and the cdf is simply
upper-tail probability = 1 − Tν(t)
For lower-tail probabilities, recall that the t distribution is symmetric
about 0. Thus the lower-tail probability corresponding to t ≤ −c with
c > 0 is the same as the upper-tail probability corresponding to t ≥ c
with the same degrees of freedom.
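Without a table at hand, Tν(t) can be approximated numerically. The sketch below is one possible approach and not the method these notes rely on: it integrates the t density with Simpson's rule and uses the symmetry about 0 described above. The names t_pdf, t_cdf, and t_test_p_value are hypothetical, and math.gamma limits the sketch to moderate degrees of freedom.

```python
from math import gamma, pi, sqrt

def t_pdf(x, nu):
    """Density of the t distribution with nu degrees of freedom."""
    c = gamma((nu + 1) / 2) / (sqrt(nu * pi) * gamma(nu / 2))
    return c * (1 + x * x / nu) ** (-(nu + 1) / 2)

def t_cdf(t, nu, steps=2000):
    """Approximate T_nu(t): 0.5 plus or minus the Simpson's-rule area on [0, |t|]."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, nu) + t_pdf(a, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def t_test_p_value(t, nu, tail):
    """P-value of a t test; tail is 'upper', 'lower', or 'two'."""
    if tail == "upper":
        return 1 - t_cdf(t, nu)
    if tail == "lower":
        return t_cdf(t, nu)
    if tail == "two":
        return 2 * (1 - t_cdf(abs(t), nu))
    raise ValueError("tail must be 'upper', 'lower', or 'two'")

# Symmetry check: the lower tail at -c equals the upper tail at c.
c_val, nu = 2.0, 9
print(t_test_p_value(-c_val, nu, "lower"), t_test_p_value(c_val, nu, "upper"))
```

The two printed values should agree, which is the symmetry property that lets an upper-tail table serve lower-tailed tests as well.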
Test about a Population Mean
Example:
To determine whether the pipe welds in a nuclear power plant
meet specifications, a random sample of 10 welds is selected, and
tests are conducted on each weld in the sample. The sample data
is recorded as follows
101.9 100.4 101.2 100.9 101.7
101.5 100.9 100.1 101.6 100.8
with X̄ = 101.10 and s = .585.
It is known that the weld strength is normally distributed. If the
specifications state that the mean strength should be greater
than 100.5 lb/in², can we conclude that the pipe welds meet the
specifications?
Hypothesis: H0 : µ = 100.5 vs. Ha : µ > 100.5.
Value of the test statistic:
t = (X̄ − µ0) / (s/√n) = (101.10 − 100.5) / (.585/√10) = 3.24
The P-value for this upper-tailed test is
P = 1 − Tν(t) = 1 − T9(3.24) ≈ 0.005
where 0.005 is found from Table A.8 with t = 3.2 and ν = 9.
Therefore, if the significance level is α with α ≥ 0.005, e.g.
α = 0.05, we will reject H0 ; otherwise we do not reject H0 .
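The weld example can be reproduced from the raw data as a check on the summary values. The sketch below uses only the Python standard library; the helper t_upper_tail is a hypothetical name, and it approximates the upper-tail probability by Simpson's rule rather than by Table A.8, so the result may differ slightly from the table value.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev

# Weld strengths from the example; mu0 is the specified mean strength.
welds = [101.9, 100.4, 101.2, 100.9, 101.7,
         101.5, 100.9, 100.1, 101.6, 100.8]
mu0 = 100.5

n = len(welds)
x_bar = mean(welds)
s = stdev(welds)                  # sample standard deviation (divisor n - 1)
t = (x_bar - mu0) / (s / sqrt(n))
nu = n - 1                        # degrees of freedom

def t_upper_tail(t_obs, nu, steps=2000):
    """P(T >= t_obs) for t_obs >= 0: 0.5 minus the Simpson's-rule area on [0, t_obs]."""
    c = gamma((nu + 1) / 2) / (sqrt(nu * pi) * gamma(nu / 2))

    def pdf(x):
        return c * (1 + x * x / nu) ** (-(nu + 1) / 2)

    h = t_obs / steps             # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(t_obs)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - total * h / 3

p_value = t_upper_tail(t, nu)
print(f"x_bar = {x_bar:.2f}, s = {s:.3f}, t = {t:.2f}, P-value = {p_value:.4f}")
```

The computed P-value should land close to the tabulated 0.005, so the conclusion at α = 0.05 is unchanged.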