CHAPTER 7-1 HYPOTHESIS TESTING

Hypothesis testing in a nutshell:
- a claim is made about the value of a population parameter.
- a sample is collected and the value of the corresponding sample statistic is calculated.
- we do not expect the sample statistic to equal the hypothesized value of the population parameter, even if the claim is correct (sampling error). So, the initial claim is rejected only if the sample statistic is different enough from the claimed value to be highly improbable.

Hypothesis testing builds upon knowledge of sampling distributions. We need to know how probable or improbable the value of the sample statistic is if the claim is true.

Formally:

1. State the Null and Alternative Hypotheses

Def’n. The Null Hypothesis, Ho: the claim being made about the value of the population parameter. It is called “null” because it is the status quo: it is accepted unless proven wrong by the data (“innocent until proven guilty”).

Def’n. The Alternative Hypothesis, HA: the opposite of the null hypothesis; the conclusion that must be true if the null hypothesis is rejected.

Ex. (from text). A cereal company labels its boxes as containing 368 grams of cereal. There is variation from box to box, but the average should be 368 grams. A quality control inspection would test the production process to ensure the average weight is 368 g.

Ho: μ = 368 grams
HA: μ ≠ 368 grams

This is a hypothesis test about the population mean.

2. To test the hypothesis, sample information must be used. We know the sample mean is unlikely to equal 368 grams even if the null is true (sampling variability). We will reject the null only if the sample mean is “very” different from the hypothesized value of the population mean. This notion is formalized using “Regions of Rejection and Non-Rejection”.

Suppose the sample size is at least 30. Then we know the sampling distribution of the sample mean is approximately
\[ \bar{X} \sim N\!\left(\mu_o, \frac{\sigma^2}{n}\right), \]
where μo is the hypothesized value of the population mean.
We now know how likely or unlikely various values of the sample mean are and can define rejection and acceptance regions.

[Figure: sampling distribution of the sample mean centred at μ, showing the Region of Nonrejection between the two Critical Values and a Rejection Region in each tail.]

The null hypothesis will be rejected if the test statistic found in the sample falls into either rejection region.

Two types of errors:

Type I Error: rejecting a null that is, in fact, true. Suppose the average weight in the sample is 375 grams and this falls into the upper rejection region (i.e., exceeds the critical value). We may decide this value of the sample mean is too improbable under the null and reject the null, but improbable does not mean impossible. We could hang an innocent person.

Type II Error: not rejecting a null that is, in fact, false. We may find an average weight of 370 grams and decide this is not improbable enough to reject the null hypothesis that μ = 368 g. But the mean might be greater than 368. We just do not have sufficient evidence against the null.

3. Clearly, some decision has to be made about when enough evidence exists to reject the null hypothesis. This leads to:

Level of Significance, α: the probability threshold for rejection. Ho is rejected when, if Ho were true, a sample result at least as extreme as the one observed would occur with probability less than α. α is also the probability of committing a Type I error.

Conventional levels of significance: α = .01, .05, or .10.

Suppose that we choose α = .05. We will reject the null hypothesis if, were it true, the probability of observing what we observe in the data is less than 5%. I.e., the value of the sample statistic lies so far from the hypothesized value of μ, in either direction, that a value at least that far away would occur with probability 5% or less.

How far is “far enough away”? That depends on the probability distribution of the sample statistic (also called the test statistic).

4. Calculating test statistic.
Need to know sampling distribution of the sample statistic:

When Ho, HA concern population mean, μ
Recall:
- X̄ is normally distributed if n is large, or if n is small but the population is normal and σ is known: use z.
- the standardized X̄ follows a t-distribution if n is small, the population is normal and σ is unknown: use t.

When Ho, HA concern population proportion, p
Recall: ps is approximately normal if np > 5 and n(1-p) > 5: use z.

z-tests of Hypothesis for the Mean (X̄ is normally distributed)

Suppose the average household income in Canada is $35,000. The null hypothesis is that Peterborough’s average income is the same.

Ho: μ = $35,000
HA: μ ≠ $35,000

Decide that α = .05 (5% level of significance).

A sample of 100 Peterborough households is surveyed and we find X̄ = $33,000, s = $15,000.

Although the sample mean is less than the hypothesized mean, we cannot reject the null out of hand. We need to determine the probability of observing what we see in the sample and will reject the null only if this probability is less than 5%.

The test is based on the knowledge that, approximately,
\[ \bar{X} \sim N\!\left(\mu, \frac{s^2}{n}\right). \]

Since the sample mean is normal, there is a 95% probability of obtaining a value within 1.96 standard errors of the population mean. Therefore, there is a 5% probability of getting a value for the sample mean more than 1.96 standard errors away from the population mean. If our sample mean is more than 1.96 standard errors away from $35,000 we will reject the null hypothesis (although the probability of Type I error is 5%).

Calculate the test statistic:
\[ z = \frac{\bar{X} - \mu_o}{s/\sqrt{n}} = \frac{33000 - 35000}{15000/10} = \frac{-2000}{1500} = -1.33 \]

Since -1.33 lies between -1.96 and 1.96, we cannot reject the null hypothesis. Although the sample mean is not equal to the hypothesized population mean, the probability of observing a sample mean at least this far from $35,000 in either direction is greater than 5% (about 18%).

More formally:

Ho: μ = μo
HA: μ ≠ μo (two-tail test)

Collect sample and calculate X̄.
Determine significance level: α

Determine the test statistic: If X̄ is normally distributed, the test statistic is
\[ z = \frac{\bar{X} - \mu_o}{s/\sqrt{n}} \]

Determine the critical values for the test: with a two-tail test, we reject the null hypothesis if the sample mean is so far in either direction from the hypothesized population mean as to have a probability less than α. Suppose α = .05. Then the critical values for the test statistic are at 1.96 and -1.96:

Reject Ho if
\[ \frac{\bar{X} - \mu_o}{s/\sqrt{n}} > 1.96 \quad \text{or} \quad \frac{\bar{X} - \mu_o}{s/\sqrt{n}} < -1.96; \]
otherwise do not reject Ho.

These critical values determine the limits of the acceptance region.

Example: (statistical process control) A bolt-making machine turns out thousands of bolts per hour. If properly adjusted (within control), the bolts have a mean diameter of 14.00 millimetres. Past data also shows that σ = .15 mm (so, as always, there is variability in the manufacturing process). As part of the quality control process, a random sample of 49 bolts is taken and the average diameter in this sample is calculated as 14.06 mm. Is this machine out of control limits?

Ho: μ = 14.00 mm
HA: μ ≠ 14.00 mm

Choose α = .01 (probability of Type I error is 1%).

Since n > 30, the sampling distribution of the sample mean is approximately normal. Then, with σ known from past data and used in place of s,
\[ z = \frac{\bar{X} - \mu_o}{\sigma/\sqrt{n}} \]

What are the critical values of the test statistic? Since this is a two-tail test, put ½ of α in each tail. The limits of the acceptance region are then where
\[ \frac{\bar{X} - \mu_o}{\sigma/\sqrt{n}} = -z_{\alpha/2} \quad \text{or} \quad \frac{\bar{X} - \mu_o}{\sigma/\sqrt{n}} = z_{\alpha/2}, \]
where \( z_{\alpha/2} = z_{.005} = 2.576 \).

Compute the value of the test statistic:
\[ z = \frac{\bar{X} - \mu_o}{\sigma/\sqrt{n}} = \frac{14.06 - 14.00}{.15/7} = 2.80 \]

Since the sample mean lies 2.80 standard errors above 14.00 mm, which is beyond the critical value of 2.576, we reject the null hypothesis (the machine needs adjusting).

The correct sampling distribution must be used. Suppose, in the example above, a sample of only 6 bolts is taken. What is the sampling distribution of the sample mean? If the population is normal and σ is known, then X̄ is normal. If the population is not normal, we cannot appeal to the central limit theorem with this sample size.
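As a numerical check on the two z-test examples above (the Peterborough income sample and the 49-bolt sample), here is a minimal Python sketch. The function name two_tail_z_test and its parameter names are hypothetical, chosen only for illustration; the critical value is looked up with the standard library's statistics.NormalDist rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist


def two_tail_z_test(sample_mean, mu_0, sd, n, alpha):
    """Two-tail z-test for a population mean (illustrative helper).

    `sd` is sigma when it is known, otherwise the sample standard
    deviation s from a large sample. Returns (z, critical value, reject?).
    """
    standard_error = sd / sqrt(n)
    z = (sample_mean - mu_0) / standard_error
    # Half of alpha goes in each tail, so the critical value is the
    # (1 - alpha/2) quantile of the standard normal distribution.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z, z_crit, abs(z) > z_crit


# Income example: n = 100, sample mean 33,000, s = 15,000, alpha = .05
z, z_crit, reject = two_tail_z_test(33000, 35000, 15000, 100, 0.05)
print(f"income: z = {z:.2f}, critical = ±{z_crit:.2f}, reject Ho: {reject}")

# Bolt example: n = 49, sample mean 14.06 mm, sigma = .15 mm, alpha = .01
z, z_crit, reject = two_tail_z_test(14.06, 14.00, 0.15, 49, 0.01)
print(f"bolts:  z = {z:.2f}, critical = ±{z_crit:.3f}, reject Ho: {reject}")
```

Run as written, this should reproduce the hand calculations: z = -1.33 for the income sample (do not reject at α = .05) and z = 2.80 for the bolts (reject at α = .01).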
If the population is normal, but σ is not known, we can estimate σ using the sample standard deviation, s, and the standardized sample mean follows a t distribution with n - 1 degrees of freedom:
\[ \frac{\bar{X} - \mu_o}{s/\sqrt{n}} \sim t_{n-1} \]

Repeat the example using the appropriate sampling distribution, with n = 6, X̄ = 14.06 mm and s = .15 mm.

Ho: μ = 14.00 mm
HA: μ ≠ 14.00 mm

Choose α = .01 (probability of Type I error is 1%).

Since n < 30 and σ must be estimated using s, the sampling distribution of the sample mean follows a t-distribution if the population can be assumed normal. Then,
\[ \frac{\bar{X} - \mu_o}{s/\sqrt{n}} \sim t_{n-1} \]

What are the critical values of the test statistic? Since this is a two-tail test, put ½ of α in each tail. The limits of the acceptance region are then where
\[ \frac{\bar{X} - \mu_o}{s/\sqrt{n}} = -t_{\alpha/2} \quad \text{or} \quad \frac{\bar{X} - \mu_o}{s/\sqrt{n}} = t_{\alpha/2}, \]
where, with n - 1 = 5 degrees of freedom, \( t_{\alpha/2} = t_{.005} = 4.032 \).

Compute the value of the test statistic:
\[ t = \frac{\bar{X} - \mu_o}{s/\sqrt{n}} = \frac{14.06 - 14.00}{.15/2.45} = 0.98 \]

Since the sample mean lies only 0.98 estimated standard errors above 14.00 mm, well inside the critical values of ±4.032, we cannot reject the null hypothesis.

Next: one-tail tests, power of the test, hypothesis testing of population proportion.
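To illustrate why the correct sampling distribution matters in the 6-bolt example above, the following Python sketch first recomputes the test statistic and then simulates many samples of 6 bolts drawn with Ho true. It is only a sketch under stated assumptions: the simulated population is taken to be normal with mean 14.00 mm and standard deviation .15 mm, the critical values 4.032 and 2.576 are table look-ups typed in as constants, and the names t_statistic and false_rejection_rate are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, stdev

# Values from the 6-bolt example; the critical values are table look-ups.
MU_0, S, N = 14.00, 0.15, 6
T_CRIT = 4.032  # t distribution, 5 degrees of freedom, .005 in each tail
Z_CRIT = 2.576  # standard normal, .005 in each tail


def t_statistic(sample_mean, mu_0, s, n):
    """(sample mean - hypothesized mean) / estimated standard error."""
    return (sample_mean - mu_0) / (s / sqrt(n))


t = t_statistic(14.06, MU_0, S, N)
print(f"t = {t:.2f}, reject Ho: {abs(t) > T_CRIT}")


def false_rejection_rate(critical_value, trials=20_000, seed=1):
    """Share of samples, drawn with Ho true, that land in the rejection region.

    Assumption for the simulation: bolt diameters are normal with
    mean MU_0 and standard deviation S.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(MU_0, S) for _ in range(N)]
        stat = t_statistic(mean(sample), MU_0, stdev(sample), N)
        if abs(stat) > critical_value:
            rejections += 1
    return rejections / trials


rate_t = false_rejection_rate(T_CRIT)
rate_z = false_rejection_rate(Z_CRIT)
print(f"Type I error rate using the t critical value: {rate_t:.3f}")
print(f"Type I error rate using the z critical value: {rate_z:.3f}")
```

The first line printed should match the hand calculation (t = 0.98, do not reject). In the simulation, the rate using the t critical value should come out near the intended .01, while using the z critical value with such a small sample should give a noticeably higher rate, roughly .05, which is the cost of using the wrong sampling distribution.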