Chapter 12: Hypothesis Testing

• Remember that our ultimate goal is to take information obtained from a sample and use it to draw a conclusion about the population.
• To determine whether we can reach a certain conclusion about the population, we perform a hypothesis test.
• The purpose of hypothesis testing is to determine whether a claim about the true value of a population characteristic is valid.
• Hypothesis testing is a process for evaluating hypotheses about population parameters using sample statistics.

Writing Hypotheses

• The null hypothesis (H0) describes our initial belief about a population parameter.
• The null hypothesis must contain a condition of equality (no difference in values).
• For example, let’s say that a candy company believes each of its chocolate bars contains an average of 70 grams of peanuts. Its null hypothesis would be:
  H0: μ = 70
• The alternative hypothesis (Ha) describes what we want to establish or what we suspect is true.
• Sticking with the same chocolate bar example, let’s say we took five samples of chocolate bars and found the mean amounts of peanuts to be 40 grams, 51 grams, 35 grams, 55 grams, and 46 grams. Now we start to suspect the company is lying and that the mean amount of peanuts is less than 70 grams. Here’s how we would write both our null and alternative hypotheses:
  H0: μ = 70
  Ha: μ < 70

Example: Write a null and alternative hypothesis for each situation.
a) Scores on a psychological study measuring students’ attitudes toward school range from 0 to 200. The mean score for US college students is about 115. A teacher suspects that older students have better attitudes toward school.
   H0: μ = 115
   Ha: μ > 115
b) Buddy read a newspaper report claiming that 12% of all adults in the US are left-handed. He believes that the proportion of lefties at Seton Hall is not equal to 12%.
   H0: p = 0.12
   Ha: p ≠ 0.12

One-tail vs Two-tail

• A hypothesis test is classified as either a one-tailed test or a two-tailed test, based on the alternative hypothesis.
• If our alternative were that the mean is greater than 70, we would only care about the right tail (still a one-tailed test).

Terminology

• The significance level of a test is a predetermined level of evidence that is required to reject the null hypothesis.
  – Common levels of significance are 1%, 5%, and 10%.
• Critical values are the values we read from either the z-table or the t-table that separate the “reject the null hypothesis” region from the “do not reject the null hypothesis” region.
• The test statistic measures, in standardized units, how far a sample statistic diverges from what we would expect if the null hypothesis were true.
• The test statistic when performing a hypothesis test for a single population mean will be:
  t = (x̄ - μ0) / (s / √n)
• The test statistic when performing a hypothesis test for a single population proportion will be:
  z = (p̂ - p0) / √(p0(1 - p0) / n)

Two Approaches

• There are two approaches to hypothesis testing: the classical approach and the p-value approach.
• In the classical approach, the computed value of the test statistic is compared to the critical value(s). If the test statistic falls within the critical region, we reject the null hypothesis. If it does not, we fail to reject the null hypothesis.
• In the p-value approach, we reject the null hypothesis if the p-value is less than the level of significance.

Step 1: Does the mean amount owed on delinquent credit cards usually exceed $2000?
Step 5:
Step 8: At the 1% level of significance, there is not convincing evidence to support the claim that the mean amount owed on delinquent credit cards is more than $2000.

Example: A pineapple company is interested in the sizes of pineapples grown in its fields. Last year, the mean weight of the pineapples was 31 ounces. This year, the company is using a different irrigation system, and management is wondering how this change will affect the mean weight. They will be concerned if the mean weight is not equal to 31 ounces.
A sample of 50 pineapples was taken that produced a mean weight of 31.935 ounces, with a standard deviation of 2.394 ounces. Is there convincing evidence that the mean weight of pineapples produced in the field has changed this year? Use a 5% level of significance.

Step 1: Has the mean weight of pineapples changed this year?
Use df = 49 and an area in two tails of 0.05.
Step 5:
Step 8: At the 5% level of significance, there is convincing evidence to support the claim that the mean weight of pineapples has changed this year.

Step 1: Is the mean etch-a-sketch production for South Pole Elves lower than 165?
Step 5:
Step 6:
Step 8: At the 5% level of significance, there is convincing evidence to support the claim that the mean etch-a-sketch production is less than 165.

P-Value

• The p-value (probability value) is the probability of getting a test statistic at least as extreme as the observed test statistic, assuming the null hypothesis is true.
• If the p-value is smaller than our significance level (alpha), we will reject the null hypothesis.
• If the p-value is larger than our significance level, we will fail to reject the null hypothesis.

Let’s revisit some previous examples.

Example: Recall the pineapple company that tested, at a 5% level of significance, whether the mean weight of its pineapples has changed from last year’s 31 ounces, using a sample of 50 pineapples with a mean weight of 31.935 ounces and a standard deviation of 2.394 ounces.

We know that our hypotheses are:
  H0: μ = 31
  Ha: μ ≠ 31

Suppose this test yielded a p-value of 0.0081. This is lower than our significance level of 0.05. Therefore, we would reject our null hypothesis.

Now suppose a test run at a 1% level of significance yielded a p-value of 0.02. This is NOT lower than our significance level of 0.01. Therefore, we would fail to reject our null hypothesis.

However, what would happen if we had a different significance level, say 5%? Would our conclusion be different? (At 5%, the p-value of 0.02 is below 0.05, so we would reject.) This shows that the conclusions we draw in hypothesis testing can be in error.

Type I and Type II Errors

• It is possible to perform all of the correct procedures in a hypothesis test and still make an error in our conclusion.
• The different kinds of errors we can make are known as Type I and Type II errors.
• A Type I error occurs when we reject the null hypothesis even though it is actually true.
• A Type I error is also known as a “false positive.”
• Examples of Type I errors:
  – You go to the doctor for an exam. You have a clean bill of health, but the doctor tells you that you have cancer.
  – In a courtroom, the null hypothesis is that the defendant is innocent, and this null is actually true. A Type I error would be convicting the defendant.
• A Type II error occurs when we fail to reject the null hypothesis even though it is actually false.
• A Type II error is also known as a “false negative.”
• Examples of Type II errors:
  – You go to the doctor for an exam. You have cancer, but the doctor tells you that you are perfectly fine.
  – In a courtroom, a Type II error would be failing to convict a guilty person.

Example: Suppose Jackie Moon says he is a 65% free throw shooter. A sample of his recent free throws was taken, and his sample free-throw percentage was 50%.
a) State the hypotheses.
b) A test was run and it yielded a p-value of 0.02. Using a significance level of 5%, what conclusion would you make?
c) If your conclusion was an error, what type of error did you make?
   Type I. Because 0.02 < 0.05, we rejected the null hypothesis, so an error would mean we rejected a true null.

Example: John and Jeremy are venture capitalists who invested in a new company called Holy Shirts and Pants. The company claims its mean daily profit is $370. However, a recent sample was taken and the sample mean daily profit was $290.
a) State the hypotheses.
b) A test was run and it yielded a p-value of 0.07. Using a significance level of 5%, what conclusion would you make?
c) If your conclusion was an error, what type of error did you make?
   Type II. Because 0.07 > 0.05, we failed to reject the null hypothesis, so an error would mean we failed to reject a false null.

Hypothesis Testing for a Population Proportion

• Recall that the test statistic for a population proportion is calculated by:
  z = (p̂ - p0) / √(p0(1 - p0) / n)
• Also remember that we always use the z-distribution when testing for a population proportion.
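The proportion test statistic recalled above can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library: the function name `one_proportion_z_test` and its `tail` argument are hypothetical, not part of the course material. The inputs are the counts from the liquid paper example that follows (17 defective bottles out of 200, a claimed 5% defect rate, a right-tailed test at the 1% level).

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, tail):
    """Return (z, p_value) for the null hypothesis H0: p = p0.

    tail is "left", "right", or "two", matching the alternative hypothesis.
    (Hypothetical helper written for illustration.)
    """
    p_hat = successes / n                  # sample proportion
    std_error = sqrt(p0 * (1 - p0) / n)    # standard error assuming H0 is true
    z = (p_hat - p0) / std_error           # test statistic in standardized units
    std_normal = NormalDist()              # standard normal (z) distribution
    if tail == "left":
        p_value = std_normal.cdf(z)
    elif tail == "right":
        p_value = 1 - std_normal.cdf(z)
    elif tail == "two":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return z, p_value


alpha = 0.01
z, p_value = one_proportion_z_test(successes=17, n=200, p0=0.05, tail="right")

# Classical approach: compare z to the right-tail critical value.
z_crit = NormalDist().inv_cdf(1 - alpha)
reject_classical = z > z_crit

# P-value approach: compare the p-value to the significance level.
reject_p_value = p_value < alpha

print(f"z = {z:.3f}, critical value = {z_crit:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_p_value else "Fail to reject H0")
```

With these inputs the test statistic is about 2.27, which falls short of the 1% right-tail critical value of about 2.33, so both approaches fail to reject the null hypothesis.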
Example: Prestige Worldwide manufactures liquid paper. The machine used in making the liquid paper is known to produce 5% defective bottles of liquid paper. A random sample of 200 bottles was taken recently and showed that 17 of the bottles were defective. Using a 1% level of significance, can we conclude that the machine is producing more than 5% defective bottles of liquid paper?

Step 1: Is the machine producing more than 5% defective bottles of liquid paper?
Step 5:
Step 6:
Step 8: At the 1% level of significance, there is not convincing evidence to support the claim that the machine is producing more than 5% defective bottles of liquid paper.

Example: Johnny Karate claims to be an 80% free throw shooter. In a recent sample of 50 free throws, he made 32. Is there convincing evidence to claim that Johnny Karate is less than an 80% free throw shooter? Use a 5% level of significance to test.

Step 1: Is Johnny Karate less than an 80% free throw shooter?
Step 5:
Step 6:
Step 8: At the 5% level of significance, there is convincing evidence to support the claim that Johnny Karate is less than an 80% free throw shooter.
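To close the chapter, the classical and p-value approaches can be compared numerically on the pineapple example from earlier (n = 50, sample mean 31.935, standard deviation 2.394, null value 31, two-tailed at 5%). The sketch below is one possible way to do this with only the Python standard library. The helper names `t_pdf` and `t_upper_tail` are hypothetical, the tail area is approximated with Simpson's rule rather than read from a table, and the critical value 2.010 is the t-table entry for df = 49 with 0.05 in two tails. In practice a statistics package would normally supply these values.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > |t|) using Simpson's rule on [0, |t|].

    The t distribution is symmetric about 0, so the area to the right
    of |t| is 0.5 minus the area between 0 and |t|. steps must be even.
    """
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2         # Simpson weights: 4 for odd, 2 for even
        total += weight * t_pdf(i * h, df)
    return 0.5 - total * h / 3


# Summary statistics from the pineapple example.
n, x_bar, s, mu0, alpha = 50, 31.935, 2.394, 31.0, 0.05

df = n - 1
t_stat = (x_bar - mu0) / (s / math.sqrt(n))

# Classical approach: compare |t| to the two-tailed critical value.
t_crit = 2.010                             # t-table value for df = 49, two-tail area 0.05
reject_classical = abs(t_stat) > t_crit

# P-value approach: a two-tailed p-value doubles the upper-tail area.
p_value = 2 * t_upper_tail(t_stat, df)
reject_p_value = p_value < alpha

print(f"t = {t_stat:.3f}, df = {df}, p-value = {p_value:.4f}")
print("Reject H0" if reject_p_value else "Fail to reject H0")
```

The test statistic comes out to about 2.76, beyond the critical value of 2.010, and the approximated p-value is roughly 0.008, in line with the 0.0081 quoted in the P-Value section. Both approaches reject the null hypothesis, as they always agree when run at the same significance level.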