Statistics 312 – Dr. Uebersax
19 Hypothesis Testing and p-values

1. Hypothesis Testing and p-values

Null and Alternative Hypotheses

Many applications of statistical inference in engineering involve testing some hypothesis (e.g., that a new product performs better than an old product, or that the number of defective units is above some specified level). However, the way classical statistical inference is designed, we typically don't try to prove our hypothesis directly; instead we construct a second, 'opposite' hypothesis and seek to reject that.

Null Hypothesis: The 'opposite' of our scientific hypothesis. It is expressed in forms like "the new product performs the same as the old product" or "the number of defective units is less than or equal to some specified level". Because this hypothesis often implies no effect (e.g., old design and new design are equal), it is called the null hypothesis. The symbol for the null hypothesis is H0.

Alternative Hypothesis: Our original hypothesis (e.g., old and new designs perform differently) is called the alternative hypothesis. The symbol for the alternative hypothesis is H1.

In order to prove (or, stated more accurately, supply evidence in favor of) our 'alternative' (i.e., our original) hypothesis, we seek statistical evidence that will enable us to reject the null hypothesis as implausible.

This reverse-logic approach is the classical approach to statistical hypothesis testing. The Bayesian approach is more logical: it tries to directly test the original hypothesis. However, we will not be considering the Bayesian approach here.

Errors in Hypothesis Testing

We can either reject or fail to reject the null hypothesis, and it is either true or not true. This leads to four possible scenarios: two correct inferences and two incorrect ones.

                      True State
                      H0 True             H0 False
                      (No Effect)         (H1 True)
Do not reject H0      Correct decision    Type II error
Reject H0             Type I error        Correct decision

The error probabilities are:

α = P(Type I error) = P(reject H0 | H0 true)
β = P(Type II error) = P(do not reject H0 | H0 false)

Test statistic. A sample statistic whose distribution is known if H0 is true.

p-value. A measure of how unusual the value of the test statistic obtained from a sample is under the assumption that the null hypothesis is true. More precisely, it is the probability, computed assuming H0 is true, of obtaining a test statistic at least as extreme as the one actually observed. A "small" p-value indicates that the sample data, with the associated test statistic, would be unlikely to have been obtained if the null hypothesis were true, and so leads us to reject the null. If the p-value is not "small", then we do not have strong evidence that the null is false, and so we fail to reject the null.

Note that we do not "prove" the null; we only fail to disprove it. This says that the null hypothesis might be true, but we cannot prove it.

The p-value and α are related, but not the same. As we shall see, we fix α in advance, but the value of p depends on the result of our study. The term statistical significance is used somewhat inconsistently to refer either to α or to the p-value.

2. Video: Hypothesis Testing and p-values

Type I errors: http://www.youtube.com/watch?v=EowIec7Y8HM
Hypothesis testing: http://www.youtube.com/watch?v=-FtlH4svqx4
Khan Academy series: http://goo.gl/S3E2yE

3. Homework

Review Problem: For a sample of size n = 64, the sample mean = 85. The population standard deviation = 8. Set up a 99% confidence interval for the population mean, µ.

8.8 An engineering consulting firm wanted to evaluate the diameter of rivet heads. The following data represent the diameters (in hundredths of an inch) for a random sample of 25 rivet heads:

6.81 6.79 6.69 6.59 6.65 6.60 6.74 6.84 6.81 6.71 6.66 6.76 6.76 6.77 6.71 6.79 6.72 6.72 6.72 6.79 6.83

(a) Set up a 95% confidence interval estimate of the average diameter of rivet heads (in hundredths of an inch).

Homework: Read pp. 392–398 (skip the "Critical Value" and "Regions" sections).
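
Worked sketch for the Review Problem: the short Python script below is one possible way to check the arithmetic, using only the standard library. It computes the 99% z interval for µ from n = 64, sample mean 85, and population standard deviation 8. To tie this back to the p-value definition in Section 1, it also computes a two-sided p-value for a null hypothesis H0: µ = 88. The null value 88, the choice α = 0.01, and the function names are assumptions made purely for illustration; they do not come from the notes or the textbook.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(mean, sigma, n, confidence):
    """Two-sided z interval for a population mean when sigma is known."""
    alpha = 1.0 - confidence
    # Critical value that leaves alpha/2 in each tail of the standard normal.
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    margin = z_crit * sigma / sqrt(n)
    return mean - margin, mean + margin


def two_sided_p_value(mean, mu0, sigma, n):
    """P(|Z| >= |z_obs|) computed under H0: mu = mu0, with sigma known."""
    z_obs = (mean - mu0) / (sigma / sqrt(n))
    return 2.0 * (1.0 - NormalDist().cdf(abs(z_obs)))


# Values taken from the Review Problem.
low, high = z_confidence_interval(mean=85, sigma=8, n=64, confidence=0.99)
print(f"99% CI for mu: ({low:.3f}, {high:.3f})")

# Hypothetical null value and significance level (assumptions, not from the notes).
MU0 = 88
ALPHA = 0.01
p = two_sided_p_value(mean=85, mu0=MU0, sigma=8, n=64)
print(f"p-value for H0: mu = {MU0}: {p:.4f}")
print("reject H0" if p <= ALPHA else "fail to reject H0")
```

Here the standard error is 8/√64 = 1, so the interval is 85 ± 2.576, or about (82.42, 87.58). Under the assumed null, the observed z is −3 and the two-sided p-value is about 0.0027, which is below the assumed α = 0.01. This agrees with the interval: the hypothetical value 88 lies outside it.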