Part 8: Hypothesis Testing

Inference: Estimating a population parameter

Our goal in this chapter is to assess the evidence provided by the data and to decide between two competing claims (hypotheses) about the population. We will be making inferences about a population mean and a population proportion.

Stating Hypotheses:

The Null Hypothesis (\(H_0\)): This states what is generally believed to be the true value of the population parameter. It is similar to a criminal court case (the subject is innocent until proven guilty). Often, if there is no value given for the population parameter, it is a statement of “no effect” or “no difference.” For example, for a parameter such as a mean \(\mu\), the null hypothesis looks like \(H_0: \mu = \mu_0\); for example, \(H_0: \mu = 3\). For a probability \(p\), the hypothesis looks like \(H_0: p = p_0\); for example, \(H_0: p = 0.5\).

The Alternative Hypothesis (\(H_A\)): This is also called the “research hypothesis.” We need enough evidence in support of it in order to reject the null hypothesis. It is similar to a criminal court case (the subject is convicted of a crime only if there is sufficient evidence to do so). It can be written in one of three forms, depending on your test.

One-Tailed:
Right Sided: \(H_A: \mu > \mu_0\)
Left Sided: \(H_A: \mu < \mu_0\)
Two-Sided:
\(H_A: \mu \neq \mu_0\)

Practice: For each example below, state the notation for the null and the alternative hypotheses.
a. A consumer analyst reports that the mean life of a certain type of car battery is 74 months. Test this claim.
b. 20% of cars of a certain model have needed costly transmission work after being driven between 50,000 and 100,000 miles. The manufacturer hopes that a redesign of a transmission component has solved this problem.
c. Only about 20% of people who try to quit smoking succeed. Sellers of a motivational tape claim that listening to the recorded messages can help people quit.
Possible Errors in Hypothesis Testing

Because a statistician must make inferences (or conclusions) based on random data that is subject to sampling error, we can make mistakes in hypothesis testing. In fact, there are two types of errors that can be made:

                              TRUTH (unknown to the statistician)
Statistician's Decision       Null (H_0) is True      Null (H_0) is False
Reject Null (H_0)             Type I error            Correct decision
Fail to reject Null (H_0)     Correct decision        Type II error

1. Type I error = the null hypothesis is true, but we mistakenly reject it. The probability of a Type I error is denoted by \(\alpha\) and is called the level of significance of the test. This threshold is where we reject the null hypothesis. If that rejection is a mistake, then the probability of making that error is \(\alpha\).
2. Type II error = the null hypothesis is false, but we fail to reject it. The probability of a Type II error is denoted by \(\beta\).

Example #1: Your gas gauge in your car is broken. You believe you have enough gas to get to school. If the null hypothesis is that you have enough gas to travel to school, then which of the following would be an example of a Type I error?
A. On your way to school, you stop at the local gas station and put gas in your car and it takes 11.9 gallons of gas and you have a 12 gallon gas tank.
B. On your way to school, you choose not to stop for gas and you run out of gas and don’t get to school.
C. On your way to school, you choose not to stop for gas and you make it in time for your first class.
D. On your way to school, you stop at the local gas station and put gas in your car but it only takes a few gallons because there was plenty already in the tank.

Example #2: What is the relationship between \(\alpha\) and \(\beta\)?
1) Suppose I increase my \(\alpha\) level from 0.05 to 0.10. Which error increases?
2) Suppose I decrease my \(\alpha\) level from 0.05 to 0.01. Which error decreases?

Reducing Type I and Type II Errors: The only way to reduce both errors at the same time is to increase the sample size. A larger sample gives us more information about the population parameter and a better estimate of reality.
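The trade-off between the two error probabilities can be seen numerically. The sketch below is only an illustration under stated assumptions: it uses a right-sided z-test with a known population standard deviation, and the function name `type_ii_error` and all of the numbers (a null mean of 50, a true mean of 52, a standard deviation of 10) are hypothetical and do not come from the examples in these notes.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def type_ii_error(alpha, mu0, mu_true, sigma, n):
    """Probability of a Type II error for a right-sided z-test.

    Tests H0: mu = mu0 against HA: mu > mu0 with known sigma.
    mu_true is an assumed true mean (larger than mu0), so H0 is false.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)       # we reject H0 when Z > z_crit
    shift = (mu_true - mu0) / (sigma / sqrt(n))  # distance of the truth from mu0 in standard errors
    return STD_NORMAL.cdf(z_crit - shift)        # P(fail to reject H0 | mu = mu_true)


# Hypothetical numbers, chosen only for illustration.
for alpha in (0.01, 0.05, 0.10):
    for n in (25, 100):
        beta = type_ii_error(alpha, mu0=50, mu_true=52, sigma=10, n=n)
        print(f"alpha={alpha:.2f}  n={n:3d}  beta={beta:.3f}")
```

For a fixed sample size, the printed Type II error probability moves in the opposite direction from the significance level, while a larger sample lowers it at every significance level shown.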
Vocabulary:

1. Test statistic: A sample statistic that is computed from the data. It helps us make a statistical decision: do we have enough evidence to reject the null hypothesis or not?
2. P-value: The probability of observing a test statistic this extreme or more extreme if the null hypothesis (the statement about the population parameter) is true. This value measures how much evidence you have against the null hypothesis. A small p-value indicates that the outcome measured from the sample data is unlikely if the null hypothesis is true, so it provides strong evidence against the null hypothesis.
3. Significance level: The cutoff for the p-value that we fix in advance. It states when the null hypothesis should be rejected. This cutoff is called the significance level \(\alpha\) (alpha) of the test, and it is compared to the p-value in #2 above. Common significance levels are \(\alpha = 0.10\), \(\alpha = 0.05\), and \(\alpha = 0.01\). If our p-value is low and falls below the threshold, we reject the null hypothesis and conclude our result is statistically significant.

TESTING HYPOTHESES ABOUT A MEAN

5 Steps in Hypothesis Testing:

1. Set up your hypotheses:
2. Calculate your test statistic:
\[ Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} \]
3. Find the P-value for the observed data. The P-value measures the weight of evidence against the null hypothesis. Here \(z\) is the observed value of the test statistic:
\(H_a: \mu > \mu_0\): p-value = \(P(Z > z) = 1 - P(Z < z)\)
\(H_a: \mu < \mu_0\): p-value = \(P(Z < z)\)
\(H_a: \mu \neq \mu_0\): p-value = \(2\,P(Z < -|z|)\)
4. Make a statistical decision and justify the decision. Compare your P-value to the pre-specified significance level \(\alpha\):
Reject the null [RTN] if the p-value < \(\alpha\)
Fail to reject the null [FTRN] if the p-value > \(\alpha\)
5. State your conclusion in the context of the problem: Write a statement about the population mean! The statement should read: There is enough evidence to conclude….or there is not sufficient evidence to conclude…

Example #1: Employees in a large accounting firm claim that the mean salary of the firm’s accountants is less than that of its competitor’s, which is $45,000.
A random sample of 30 of the firm’s accountants has a mean salary of $43,500. Suppose it is known that the salaries follow a normal distribution, with a population standard deviation of $5200. Test the employees’ claim at a 0.05 level of significance.
A. State the null and alternative hypotheses.
B. Conduct a test of the hypotheses above.
I. Calculate the test statistic.
II. Calculate the corresponding p-value.
III. Make a statistical decision.
IV. Interpret your decision in the context of the problem.
C. Which of the following statements correctly interprets the p-value for this test?
a) It is the probability of observing a population mean at least as large as $45000 in a sample whose mean is $43500.
b) It is the probability the population mean is not equal to $45000.
c) It is the probability of observing a sample mean at least as small as $43500 in a population whose mean is $45000.
d) It is the probability of observing a sample mean at least as large as $43500 in a population whose mean is $45000.

Example #2: A company that makes cola drinks states that the mean caffeine content per one 12-ounce bottle of cola is 40 milligrams. Suppose you work as a quality control manager and are asked to verify this claim. During your tests, you find that a random sample of thirty 12-ounce bottles of cola has a mean caffeine content of 38.9 milligrams. At the 0.01 level of significance, can you reject the company’s claim? Suppose we know from previous studies that the population standard deviation is 5.5.
a. State the null and alternative hypotheses.
b. Calculate the test statistic.
c. Calculate the p-value of the test statistic.
d. Make, and justify, a statistical conclusion.
e. Interpret your conclusion to someone who knows nothing about statistics.

What if we do not know the population standard deviation?
When the population is normally or approximately normally distributed, we have a random sample, and \(\sigma\) is unknown, the test statistic is:
\[ t = \frac{\bar{X} - \mu_0}{S / \sqrt{n}} \]
which is compared to a t distribution with \(n - 1\) degrees of freedom. When using a t-table, we cannot find the exact p-value of a hypothesis test; we can only bracket it between two tabulated tail probabilities.

Example #1: Suppose we are conducting a right-tailed hypothesis test, with n = 20 and test statistic 1.54. What is the corresponding p-value?

Example #2: The Better Business Bureau of San Diego states that the average price of a standard hotel room for a Saturday night stay is at most $115. You go out and take a random sample of 33 hotels and get an average price of $119.80 with a sample standard deviation of $10. Is there sufficient evidence to show that the average hotel price exceeds $115?
a) State the null and alternative hypotheses:
b) Calculate the test statistic:
c) Find the p-value:
d) Make and justify a statistical decision:
e) State your conclusion in the context of the problem:

Example #3: The personnel department of a large corporation wants to estimate the family dental expenses of its employees to determine the feasibility of providing a dental insurance plan. A business journal claimed that the mean dental expense of families is 330 dollars per year. A random sample of 10 employees reveals the following family dental expenses (in dollars) for the preceding year. Is there statistical evidence that this claim is wrong?
115 262 246 85 410 208 173 425 316 160
State the appropriate null and alternative hypotheses.
Calculate the test statistic.
Calculate the corresponding p-value.
Make a statistical decision using a level of significance of 0.05. Justify your decision.
Interpret your decision in the context of the problem.

Using the critical value method to perform a hypothesis test

Critical value: separates the rejection region from the non-rejection region. (It is the z-score corresponding to the tail area.)
Rejection region: the range of values for which we would reject the null hypothesis.
If the test statistic falls in this region, we reject the null hypothesis.

Example #1: Employees in a large accounting firm claim that the mean salary of the firm’s accountants is less than that of its competitor’s, which is $45,000. A random sample of 30 of the firm’s accountants has a mean salary of $43,500. Suppose it is known that the salaries follow a normal distribution, with a population standard deviation of $5200. Test the employees’ claim at a 0.05 level of significance.

Example #2: A company that makes cola drinks states that the mean caffeine content per one 12-ounce bottle of cola is 40 milligrams. Suppose you work as a quality control manager and are asked to verify this claim. During your tests, you find that a random sample of thirty 12-ounce bottles of cola has a mean caffeine content of 38.9 milligrams. At the 0.01 level of significance, can you reject the company’s claim? Suppose we know from previous studies that the population standard deviation is 5.5.

Example #3: The personnel department of a large corporation wants to estimate the family dental expenses of its employees to determine the feasibility of providing a dental insurance plan. A business journal claimed that the mean dental expense of families is 330 dollars per year. A random sample of 10 employees reveals the following family dental expenses (in dollars) for the preceding year. Is there statistical evidence that this claim is wrong?
115 262 246 85 410 208 173 425 316 160

Hypothesis Tests for a Population Proportion: The Same 5 Steps

1. Set up your hypotheses:
2. Calculate your test statistic:
\[ z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}}, \qquad q_0 = 1 - p_0 \]
3. Find the P-value for the observed data. The P-value measures the weight of evidence against the null hypothesis. Here \(z\) is the observed value of the test statistic:
\(H_a: p > p_0\): p-value = \(P(Z > z) = 1 - P(Z < z)\)
\(H_a: p < p_0\): p-value = \(P(Z < z)\)
\(H_a: p \neq p_0\): p-value = \(2\,P(Z < -|z|)\)
4. Make a statistical decision and justify the decision.
5. State your conclusion in the context of the problem.
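The five steps, for a mean with known population standard deviation and for a proportion, can be sketched in a few lines of Python. This is a minimal sketch, not a reference implementation: the function names (`p_value_from_z`, `z_test_mean`, `z_test_proportion`, `decide`) and the sample numbers are hypothetical, and the normal probabilities come from the standard library's `statistics.NormalDist` rather than from a z-table.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def p_value_from_z(z, alternative):
    """Step 3: p-value for an observed z, by form of the alternative."""
    if alternative == "greater":    # right-sided alternative
        return 1 - STD_NORMAL.cdf(z)
    if alternative == "less":       # left-sided alternative
        return STD_NORMAL.cdf(z)
    if alternative == "two-sided":  # not-equal alternative
        return 2 * STD_NORMAL.cdf(-abs(z))
    raise ValueError(f"unknown alternative: {alternative!r}")


def z_test_mean(xbar, mu0, sigma, n, alternative):
    """Step 2 for a mean when the population standard deviation is known."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    return z, p_value_from_z(z, alternative)


def z_test_proportion(successes, n, p0, alternative):
    """Step 2 for a proportion; the standard error uses p0, not p-hat."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    return z, p_value_from_z(z, alternative)


def decide(p_value, alpha):
    """Step 4: reject only when the p-value falls below alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


# Hypothetical data: 230 successes in 500 trials, H0: p = 0.50 vs Ha: p < 0.50.
z, p = z_test_proportion(successes=230, n=500, p0=0.50, alternative="less")
print(f"proportion: z = {z:.3f}, p-value = {p:.4f}, {decide(p, alpha=0.05)}")

# Hypothetical data: sample mean 103 from n = 36, H0: mu = 100 vs Ha: mu > 100.
z, p = z_test_mean(xbar=103, mu0=100, sigma=15, n=36, alternative="greater")
print(f"mean:       z = {z:.3f}, p-value = {p:.4f}, {decide(p, alpha=0.05)}")
```

Step 1 (stating the hypotheses) and Step 5 (the conclusion in context) stay with the analyst; the code only covers the arithmetic in between.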
Example #1: In 2004, 5.8% of job applicants who were tested for drugs failed the test. Test the claim that the failure rate is now lower if a random sample of 1520 current job applicants results in 58 failures.
a) State the null and alternative hypotheses.
b) Calculate the test statistic.
c) Calculate the p-value of the test statistic.
d) Make, and justify, a statistical conclusion at the 0.05 level.
e) Interpret your conclusion to someone who knows nothing about statistics.

Example #2: Which situation best describes when to declare results statistically significant?
A. The p-value is less than the significance level and we fail to reject the null hypothesis.
B. The p-value is less than the significance level and we reject the null hypothesis.
C. The p-value is greater than the significance level and we reject the null hypothesis.
D. The p-value is greater than the significance level and we fail to reject the alternative hypothesis.

Example #3: A government study claimed that 60% of Americans agree that oil exploration in our national parks is necessary. In a survey of 1945 Americans, 1200 said they agree that oil exploration in our national parks is necessary. Is this sufficient evidence to reject the claim?
a) Set up the null and alternative hypotheses.
b) Calculate the test statistic z.
c) Calculate the p-value.
d) Make, and justify, a statistical conclusion at the 0.05 level.
e) State your conclusions in the context of the problem.

Duality of Confidence Intervals and Two-Sided Hypothesis Tests:

When testing \(H_0: p = p_0\) versus \(H_a: p \neq p_0\), if a \((1 - \alpha) \times 100\%\) confidence interval contains \(p_0\), we do not reject the null hypothesis at the level \(\alpha\). If the confidence interval does not contain \(p_0\), we have evidence that supports the alternative hypothesis, thus we reject the null hypothesis at the level \(\alpha\).

Example #1: A random sample of 87 students who took a bar exam prep course was selected and only 24 of them report that they will have to retake the exam.
A. Construct a 95% confidence interval for the population proportion.
B. If we calculate a 90% confidence interval instead of a 95% confidence interval, then:
a) The width of the confidence interval would be unchanged.
b) The confidence interval width would decrease.
c) The confidence interval width would increase.
Answer:
C. An instructor for the prep course stated at the beginning of the course that 35% of students would fail the bar exam. Use the confidence interval above to test this claim.
I. State the null and alternative hypotheses.
II. Make an appropriate decision, giving a one-sentence justification of your decision.

Practice Problems:

1. Sports car owners in a town complain that the state vehicle inspection station judges their cars differently from the family style cars. Previous records indicate that 30% of all passenger cars fail the inspection the first time through. In a random sample of 150 sports cars, 50 failed the inspection on the first time through. Is there sufficient evidence to indicate that the proportion of first failures for sports cars is higher than the proportion for all passenger cars?
Step 1: Set up the null and alternative hypotheses.
Step 2: Calculate the test statistic.
Step 3: Calculate the p-value.
Step 4: Make and justify a statistical decision (\(\alpha = 0.10\)).
Step 5: State your conclusions in the context of the problem.

2. According to a report sponsored by the National Center for Health Statistics, 75% of American women have been married by the age of 30. In a random sample of 125 women between the ages of 19 and 30 it was found that 84 of them had been married. Is this sufficient evidence to indicate that the claim of a 75% marriage rate is too high? Use a 0.05 level of significance.
Step 1: Set up the null and alternative hypotheses.
Step 2: Calculate the test statistic.
Step 3: Calculate the p-value.
Step 4: Make and justify a statistical decision.
Step 5: State your conclusions in the context of the problem.

3. Diet Guide magazine claims that juice fasting for a week is an excellent way to lose weight. An article states that the mean weight loss for people who juice fast for one week was at least 12 pounds. The Atkins diet group is very skeptical of this claim, and selects an SRS of 20 individuals who juice fast for a week. The sample of 20 individuals has a mean weight loss of 10.3 pounds, with a standard deviation of 4.8 pounds. It is known that weight loss follows a normal distribution.
State the appropriate null and alternative hypotheses.
Calculate the test statistic.
Calculate the corresponding p-value.
Make a statistical decision using a level of significance of 0.01. Justify your decision.
Interpret your decision in the context of the problem.

4. Suppose in a hypothesis test, the probability of committing a Type II error was decreased. Which of the following is correct?
A. The probability of committing a Type I error would be decreased.
B. The level of significance would be decreased.
C. The power of the test would be decreased.
D. The power of the test would be increased.
Answer:

5. We calculate a 90% confidence interval for p to be (0.53, 0.67) and want to test the following: \(H_0: p = 0.50\) vs. \(H_a: p \neq 0.50\).
a) If the confidence interval is used to test the given hypotheses, what would the statistical decision be? Justify your decision.
b) What would the level of significance of the hypothesis test be?
c) If your statistical decision had been a mistake, what type of error would you have made?

SUMMARY: Make sure that you understand this.

A. Stating Hypotheses
The Null Hypothesis (\(H_0\))
The Alternative Hypothesis: One-Tailed (Right or Left Sided) or Two-Sided.

Possible Errors in Hypothesis Testing

                              TRUTH (unknown to the statistician)
Statistician's Decision       Null (H_0) is True      Null (H_0) is False
Reject Null (H_0)             Type I error            Correct decision
Fail to reject Null (H_0)     Correct decision        Type II error

1. The probability of a Type I error is denoted by \(\alpha\) and is called the level of significance of the test.
2. The probability of a Type II error is denoted by \(\beta\).
3. Power of the test = \(1 - \beta\).

Vocabulary: Test statistic, P-value, Significance level \(\alpha\). If our p-value is low and falls below the threshold, we reject the null hypothesis and conclude our result is statistically significant.

TESTING HYPOTHESES ABOUT A MEAN

5 Steps in Hypothesis Testing:

1. Set up your hypotheses.
2. Calculate your test statistic:
\[ Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} \]
3. Find the P-value for the observed data. The P-value measures the weight of evidence against the null hypothesis. Here \(z\) is the observed value of the test statistic:
\(H_a: \mu > \mu_0\): p-value = \(P(Z > z) = 1 - P(Z < z)\)
\(H_a: \mu < \mu_0\): p-value = \(P(Z < z)\)
\(H_a: \mu \neq \mu_0\): p-value = \(2\,P(Z < -|z|)\)
4. Make a statistical decision and justify the decision. Compare your P-value to the pre-specified significance level \(\alpha\):
Reject the null [RTN] if the p-value < \(\alpha\)
Fail to reject the null [FTRN] if the p-value > \(\alpha\)
5. State your conclusion in the context of the problem: Write a statement about the population mean! The statement should read: There is enough evidence to conclude….or there is not sufficient evidence to conclude…

The case where the population standard deviation is NOT known

When the population is normally or approximately normally distributed, we have a random sample, and \(\sigma\) is unknown, the test statistic is:
\[ t = \frac{\bar{X} - \mu_0}{S / \sqrt{n}} \]
When using a t-table, we cannot find the exact p-value of a hypothesis test.

Hypothesis Tests for a Population Proportion: The Same 5 Steps

1. Set up your hypotheses.
2. Calculate your test statistic:
\[ z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}}, \qquad q_0 = 1 - p_0 \]
3. Find the P-value for the observed data. The P-value measures the weight of evidence against the null hypothesis:
\(H_a: p > p_0\): p-value = \(P(Z > z) = 1 - P(Z < z)\)
\(H_a: p < p_0\): p-value = \(P(Z < z)\)
\(H_a: p \neq p_0\): p-value = \(2\,P(Z < -|z|)\)
4. Make a statistical decision and justify the decision.
5. State your conclusion in the context of the problem.

Duality of Confidence Intervals and Two-Sided Hypothesis Tests

When testing \(H_0: p = p_0\) versus \(H_a: p \neq p_0\), if a \((1 - \alpha) \times 100\%\) confidence interval contains \(p_0\), we do not reject the null hypothesis at the level \(\alpha\). If the confidence interval does not contain \(p_0\), we have evidence that supports the alternative hypothesis, thus we reject the null hypothesis at the level \(\alpha\).
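The duality rule can be illustrated with a short Python sketch. Several things here are assumptions rather than part of these notes: the interval is the usual large-sample interval \(\hat{p} \pm z^* \sqrt{\hat{p}(1 - \hat{p})/n}\), the function names are hypothetical, and the counts (84 successes out of 200) are made up. Because this interval uses \(\hat{p}\) in its standard error while the z-test uses \(p_0\), the two methods can occasionally disagree when \(p_0\) sits very close to an endpoint.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence):
    """Large-sample confidence interval for a proportion (assumed form)."""
    p_hat = successes / n
    # z* leaves (1 - confidence) / 2 in each tail of the standard normal.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def two_sided_decision(interval, p0):
    """Reject H0: p = p0 exactly when p0 falls outside the interval."""
    low, high = interval
    return "fail to reject H0" if low <= p0 <= high else "reject H0"


# Hypothetical data: 84 successes in 200 trials, 95% confidence (alpha = 0.05).
ci = proportion_interval(successes=84, n=200, confidence=0.95)
print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
for p0 in (0.45, 0.50):
    print(f"H0: p = {p0:.2f} -> {two_sided_decision(ci, p0)} at alpha = 0.05")
```

The confidence level and the significance level are tied together: a 95% interval corresponds to a two-sided test at \(\alpha = 0.05\), and a 90% interval to \(\alpha = 0.10\).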