
Statistics 312 – Dr. Uebersax
19 Hypothesis Testing and p-values
1. Hypothesis Testing and p-values
Null and Alternative Hypotheses
Many applications of statistical inference in engineering involve testing some hypothesis (e.g.,
that a new product performs better than an old product, or that the number of defective units is
above some specified level).
However, the way classical statistical inference is designed, we typically don't try to prove our
hypothesis directly, but instead construct a second, 'opposite' hypothesis, and seek to reject
that.

Null Hypothesis: The 'opposite' of our scientific hypothesis. Expressed in forms like
"the new product performs the same as the old product", or "the number of defective
units is less than or equal to some specified level". Because this hypothesis often
implies no effect (e.g., old design and new design are equal), it is called the null
hypothesis. The symbol for the null hypothesis is H0.

Alternative Hypothesis: Our original hypothesis (e.g., old and new designs perform
differently) is called the alternative hypothesis. The symbol for the alternative
hypothesis is H1.
In order to prove (or stated more accurately, supply evidence in favor of) our 'alternative' (i.e.
our original) hypothesis, we seek statistical evidence that will enable us to reject the null
hypothesis as implausible.
This reverse-logic approach is the classical approach to statistical hypothesis testing. The
Bayesian approach is more logical: it tries to directly test the original hypothesis. However, we
will not be considering the Bayesian approach here.
Errors in Hypothesis Testing
We can either reject or fail to reject the null hypothesis, and it is either true or not true. This
leads to four possible scenarios: two correct inferences and two incorrect ones.
                                  True State
Decision             H0 True (No Effect)    H0 False (H1 True)
Do not reject H0     Correct                Type II Error
Reject H0            Type I Error           Correct
The error probabilities are:
α = P(Type I error) = P(reject H0 when H0 is true)
β = P(Type II error) = P(do not reject H0 when H0 is false)
Test statistic. A sample statistic whose distribution is known if H0 is true.
p-value. A measure of how unusual the value of the test statistic obtained from a sample is
under the assumption that the null hypothesis is true. More precisely, it is the probability, if
H0 is true, of obtaining a test statistic at least as extreme as the one observed. A “small”
p-value indicates that the sample data with the associated test statistic is unlikely to have been
obtained if the null hypothesis is true and so will lead us to reject the null. If the p-value is not
“small” then we do not have strong evidence that the null is false and so we will fail to reject the
null. Note that we do not “prove” the null, we only fail to disprove it: the null hypothesis might
be true, but we cannot prove it.
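As an illustration of the p-value definition, here is a minimal sketch for a two-sided test of a mean when the population standard deviation is known. The numbers (a hypothesized mean of 100, σ = 15, n = 36, sample mean 106) and the function name z_test_p_value are made up for this example and do not come from the course materials.

```python
from math import sqrt
from statistics import NormalDist


def z_test_p_value(sample_mean, mu0, sigma, n):
    """Two-sided p-value for H0: mu = mu0, assuming sigma is known."""
    # Test statistic: its distribution is standard normal if H0 is true.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Probability, under H0, of a |Z| at least as large as the one observed.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical numbers, chosen only for illustration.
z, p = z_test_p_value(sample_mean=106, mu0=100, sigma=15, n=36)
print(f"z = {z:.2f}, p-value = {p:.4f}")
```

Here z = 2.4 and the p-value is about 0.016, so a sample mean this far from 100 would be unusual if H0 were true.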
The p-value and α are related, but not the same. As we shall see, we fix α in advance, but the
value of p depends on the result of our study.
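The difference between α and p can be seen in a small simulation. This is only a sketch under assumed conditions: normally distributed data with known σ = 1, samples of size n = 25, H0: µ = 0, and one particular alternative (µ = 0.5) picked for illustration. α is set once, before any data are drawn; a new p-value is computed from each simulated sample.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

ALPHA = 0.05                   # fixed in advance, before any data are seen
MU0, SIGMA, N = 0.0, 1.0, 25   # assumed H0 mean, known sigma, sample size


def p_value(sample):
    """Two-sided z-test p-value for H0: mu = MU0 (changes with every sample)."""
    z = (fmean(sample) - MU0) / (SIGMA / sqrt(N))
    return 2 * (1 - NormalDist().cdf(abs(z)))


def rejection_rate(true_mean, trials=10_000, seed=1):
    """Fraction of simulated studies in which H0 is rejected (p <= ALPHA)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, SIGMA) for _ in range(N)]
        if p_value(sample) <= ALPHA:
            rejections += 1
    return rejections / trials


# H0 true: every rejection is a Type I error, so this rate estimates alpha.
type_i_rate = rejection_rate(true_mean=MU0)
# H0 false (assumed true mean 0.5): every non-rejection is a Type II error.
type_ii_rate = 1 - rejection_rate(true_mean=0.5)

print(f"estimated alpha = {type_i_rate:.3f}, estimated beta = {type_ii_rate:.3f}")
```

The estimated Type I error rate should come out close to the chosen α of 0.05, while the estimated β depends on the sample size and on how far the assumed true mean is from the H0 value.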
The term “statistical significance” is used somewhat inconsistently to refer either to α or to the
p-value.
2. Video: Hypothesis Testing and p-values
Type I errors: http://www.youtube.com/watch?v=EowIec7Y8HM
Hypothesis testing: http://www.youtube.com/watch?v=-FtlH4svqx4
Khan Academy series: http://goo.gl/S3E2yE
3. Homework Review
Problem: For a sample of size n = 64, the sample mean = 85. The population standard
deviation = 8. Set up a 99% confidence interval for the population mean, µ.
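One way to check this problem numerically is the classical interval for a mean with known population standard deviation: sample mean ± z × σ/√n. The sketch below assumes that formula applies (known σ, normal sampling distribution of the mean); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values given in the problem statement.
n, sample_mean, sigma = 64, 85.0, 8.0
confidence = 0.99

# Critical value that leaves (1 - confidence) / 2 in each tail.
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z_crit * sigma / sqrt(n)   # standard error is 8 / sqrt(64) = 1
lower, upper = sample_mean - margin, sample_mean + margin

print(f"z = {z_crit:.3f}, interval = ({lower:.2f}, {upper:.2f})")
```

With z ≈ 2.576 and a standard error of 1, the interval is roughly 82.42 to 87.58.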
8.8 An engineering consulting firm wanted to evaluate the diameter of rivet heads. The
following data represent the diameters (in hundredths of an inch) for a random
sample of 25 rivet heads:
6.81 6.79 6.69 6.59 6.65 6.60 6.74
6.84 6.81 6.71 6.66 6.76 6.76 6.77
6.71 6.79 6.72 6.72 6.72 6.79 6.83
(a) Set up a 95% confidence interval estimate of the average diameter of rivet heads (in
hundredths of an inch).
Homework: Read pp. 392–398 (Skip “Critical Value” and “Regions” sections)