CHS Prob and Stats Chapter 8: Hypothesis Testing

CHAPTER 8
8-1: OVERVIEW AND 8-2 Pt. I: BASICS OF HYPOTHESIS TESTING

8-1: OVERVIEW

Hypothesis: a claim or statement about a property of a population.

Hypothesis Test (Test of Significance): a standard procedure for testing a claim about a property of a population.

Rare Event Rule for Inferential Statistics: If, under a given assumption, the probability of a particular observed event is exceptionally small, we conclude that the assumption is probably not correct. Following this rule, we test a claim by analyzing sample data in an attempt to distinguish between results that can easily occur by chance and results that are highly unlikely to occur by chance.

Example 1: According to the National Center for Chronic Disease Prevention and Health Promotion, 73.8% of females between the ages of 18 and 29 exercise. Kathleen believes that more women in this age range are now exercising, so she obtains a simple random sample of 1000 women and finds that 750 of them are exercising. Is this evidence that the percent of women between the ages of 18 and 29 who are exercising has increased? What if Kathleen's sample resulted in 920 women exercising?

Approach: Here is the situation Kathleen faces. If 73.8% of 18-29 year old females exercise, she would expect 738 of the 1000 sampled respondents to exercise. The questions that Kathleen wants to answer are "How likely is it to obtain a sample of 750 out of 1000 women exercising from a population in which the percentage of women who exercise is 73.8%? How likely is a sample that has 920 women exercising?"

Results: The result of 750 is close to what one would expect, so Kathleen is not inclined to believe that the percentage of women exercising has increased. However, the likelihood of obtaining a sample of 920 women who exercise is extremely low if the actual percentage of women who exercise is 73.8%.
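To put rough numbers on "how likely," the sketch below estimates the chance of seeing at least 750, or at least 920, exercisers in a sample of 1000 when the true proportion is 0.738. It assumes the normal approximation to the binomial (introduced formally in Section 8-3), and the function name `upper_tail_prob` is illustrative only.

```python
from math import sqrt
from statistics import NormalDist

def upper_tail_prob(successes, n, p):
    """Approximate P(count >= successes) when the true proportion is p,
    using the normal approximation to the binomial (an assumption here)."""
    p_hat = successes / n
    z = (p_hat - p) / sqrt(p * (1 - p) / n)
    # Area to the right of z equals the area to the left of -z.
    return z, NormalDist().cdf(-z)

z_750, prob_750 = upper_tail_prob(750, 1000, 0.738)
z_920, prob_920 = upper_tail_prob(920, 1000, 0.738)

print(f"750 of 1000: z = {z_750:.2f}, probability = {prob_750:.4f}")
print(f"920 of 1000: z = {z_920:.2f}, probability = {prob_920:.3g}")
```

Under these assumptions, the probability for 750 comes out at roughly 0.19, which is not unusual, while the probability for 920 is essentially zero. That matches the informal reasoning in the Results paragraph.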
For the case of obtaining a sample of 920 women who exercise, Kathleen can conclude one of two things: either the proportion of women who exercise is 73.8% and her sample just happens to include a lot of women who exercise, or the proportion of women who exercise has increased. Provided the sampling was performed correctly, Kathleen is more inclined to believe that the percentage of women who exercise has increased.

8-2 Part I: THE BASICS OF HYPOTHESIS TESTING

Objectives:
1. Given a claim, identify the null hypothesis and the alternative hypothesis, and express them both in symbolic form.
2. Given a claim and sample data, calculate the value of the test statistic.
3. Given a value of the test statistic, identify the P-value.
4. Make the conclusion to reject or fail to reject the null hypothesis.
5. State the conclusion of a hypothesis test in simple, nontechnical terms.

COMPONENTS OF A FORMAL HYPOTHESIS TEST

NULL HYPOTHESIS (H₀): the statement to be tested. It says that the value of a population parameter (such as a proportion, mean, or standard deviation) is EQUAL TO some claimed value. We test the null hypothesis directly in the sense that we assume it is true and reach a conclusion to either reject H₀ or fail to reject H₀. The claim that we seek evidence for always becomes the alternative hypothesis.

ALTERNATIVE HYPOTHESIS (H₁): the statement that the parameter has a value that somehow differs from the null hypothesis. The symbolic form of the alternative hypothesis must use one of these symbols: < or > or ≠.

If you are conducting a study and want to use a hypothesis test to SUPPORT your claim, the claim must be worded so that it becomes the alternative hypothesis (and can be expressed with the symbols < or > or ≠). You can never support a claim that some parameter is equal to some specified value.

IDENTIFYING H₀ AND H₁
1. Identify the specific claim or hypothesis to be tested, and express it in symbolic form.
2. Give the symbolic form that must be true when the original claim is false.
3. Of the two symbolic expressions obtained so far, let the alternative hypothesis H₁ be the one not containing equality, so that H₁ uses < or > or ≠. Let the null hypothesis H₀ be the symbolic expression that the parameter equals the fixed value being considered.

EXAMPLE 2: IDENTIFYING THE NULL AND ALTERNATIVE HYPOTHESES
a. The proportion of workers who get jobs through networking is greater than 0.5.
b. The mean weight of airline passengers with carry-on baggage is at most 195 lb (the current figure used by the Federal Aviation Administration).
c. The standard deviation of IQ scores of actors is equal to 15.

TEST STATISTIC: a value used in making a decision about the null hypothesis. It is found by converting a sample statistic (such as a sample proportion, sample mean, or sample standard deviation) to a score (such as z, t, or χ²) with the assumption that the null hypothesis is true.

TEST STATISTIC FOR PROPORTION:
$$z = \frac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}}$$

TEST STATISTIC FOR MEAN, σ Known:
$$z = \frac{\bar{x} - \mu_{\bar{x}}}{\dfrac{\sigma}{\sqrt{n}}}$$

EXAMPLE 3: FINDING THE TEST STATISTIC
A survey of n = 703 randomly selected workers showed that 61% of those respondents found their job through networking. Find the value of the test statistic for the claim that most (more than 50%) workers get their jobs through networking.

THREE WAYS TO SET UP THE NULL AND ALTERNATIVE HYPOTHESES:
1. EQUAL VERSUS NOT EQUAL (TWO-TAILED TEST)
   H₀: parameter = some value
   H₁: parameter ≠ some value
2. EQUAL VERSUS LESS THAN (LEFT-TAILED TEST)
   H₀: parameter = some value
   H₁: parameter < some value
3. EQUAL VERSUS GREATER THAN (RIGHT-TAILED TEST)
   H₀: parameter = some value
   H₁: parameter > some value

SIGNIFICANCE LEVEL (α): the probability of rejecting the null hypothesis when the null hypothesis is actually true.
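As a rough check on Example 3 above, the following sketch converts the sample proportion into a z score using the test statistic for a proportion. The variable names are illustrative, and `p0 = 0.5` is the value assumed under the null hypothesis H₀: p = 0.5.

```python
from math import sqrt

n = 703        # sample size from Example 3
p_hat = 0.61   # sample proportion (61% found jobs through networking)
p0 = 0.5       # proportion assumed under the null hypothesis
q0 = 1 - p0

# z = (p_hat - p) / sqrt(p * q / n), computed under the assumption that H0 is true
z = (p_hat - p0) / sqrt(p0 * q0 / n)
print(f"z = {z:.2f}")
```

A z score this far above zero says the sample proportion lies several standard deviations above what H₀ predicts.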
P-VALUE: the probability of getting a value of the test statistic that is at least as extreme as the one representing the sample data, assuming that the null hypothesis is true. The null hypothesis is rejected if the P-value is very small, such as 0.05 or less.

CONCLUSION: REJECT OR FAIL TO REJECT THE NULL HYPOTHESIS.

P-value Method: Reject the null if the P-value ≤ α. Fail to reject the null if the P-value > α.

Confidence Intervals: Because a confidence interval estimate of a population parameter contains the likely values of that parameter, reject a claim that the population parameter has a value that is not included in the confidence interval.

EXAMPLE 5: FINDING P-VALUES
First determine whether the given conditions result in a right-tailed, left-tailed, or two-tailed test. Then use Figure 8-6 to find the P-value, and state a conclusion about the null hypothesis.
a. A significance level of α = 0.05 is used in testing the claim that p > 0.25, and the sample data result in a test statistic of z = 1.18.
b. test statistic of z = 2.34.

CHAPTER 8
8-2 Pt. I: BASICS OF HYPOTHESIS TESTING SUMMARY

Null Hypothesis: the parameter is equal to a particular value. Reject H₀ or fail to reject H₀.

Alternative Hypothesis: the parameter has a value that differs from the null hypothesis (<, >, ≠).

Test Statistic: a value used in making a decision about H₀. Convert a sample statistic (such as a sample proportion, sample mean, or sample standard deviation) to a score (such as z, t, or χ²) with the assumption that the null hypothesis is true.

DECISION RULES BASED ON THE REJECTION REGION AND THE VALUE OF z.

** SIDE NOTE: If a confidence interval actually contains the claimed value, then you have no reason to believe the claim is wrong, since the variability of the sample could result in any value in the confidence interval.
If, on the other hand, the entire confidence interval is on one side of the claimed value and does not contain the claimed value, then you would have reason to doubt the claim.

Any result that is unlikely to have occurred by chance is called STATISTICALLY SIGNIFICANT.

P-VALUE: the probability of getting a value of the test statistic that is at least as extreme as the one representing the sample data, assuming that the null hypothesis is true. The null hypothesis is rejected if the P-value is very small, such as 0.05 or less.

The P-value tells you down to what level of α your data are statistically significant, which then allows you to reject the null hypothesis. For example, a P-value of 0.0234 tells you that you can reject the null at α values of 0.10, 0.05, 0.04, and 0.03, and that you fail to reject the null at α levels of 0.02, 0.01, and so on.

If the alternative hypothesis contains:
1. A less than symbol (<), the P-value is the area to the left of the test statistic.
2. A greater than symbol (>), the P-value is the area to the right of the test statistic.
3. A not equal to symbol (≠), the P-value is twice the area to the right of |z|.

DECISIONS AND CONCLUSIONS

It does not matter which hypothesis is the claim; you always test the null hypothesis. The test always involves one of two possible results:
1. Reject the null hypothesis.
2. Fail to reject the null hypothesis.

** Notice that the complement of "reject" is "fail to reject." You do not "accept" the null. This is similar to the results of a criminal trial. A defendant is either "guilty" or "not guilty." He is not "innocent."

THE DECISION TO REJECT OR FAIL TO REJECT THE NULL HYPOTHESIS.

P-value Method: Reject the null if the P-value ≤ α. Fail to reject the null if the P-value > α.

Confidence Intervals: Because a confidence interval estimate of a population parameter contains the likely values of that parameter, reject a claim that the population parameter has a value that is not included in the confidence interval.
Caution: Sometimes a conclusion based on a confidence interval may be different from a conclusion based on a hypothesis test.

THE WORDING OF YOUR CONCLUSION SHOULD USE SIMPLE, NONTECHNICAL TERMS IN STATING WHAT THE CONCLUSION REALLY MEANS. THE FIGURE BELOW SUMMARIZES A PROCEDURE FOR WORDING A FINAL CONCLUSION.

When making decisions about claims using sample data, you must accept the fact that errors will be made. You might reject the null hypothesis when in actuality it is true, and you have no way of knowing whether that conclusion is wrong. There are four possible outcomes from hypothesis testing.

OUTCOMES FROM HYPOTHESIS TESTING
1. Reject the null hypothesis when the alternative hypothesis is actually true. This decision would be correct.
2. Do not reject the null hypothesis when the null hypothesis is actually true. This decision would be correct.
3. Reject the null hypothesis when the null hypothesis is actually true. This decision would be incorrect. This type of error is called a TYPE I ERROR.
4. Do not reject the null hypothesis when the alternative hypothesis is actually true. This decision would be incorrect. This type of error is called a TYPE II ERROR.

TYPE I AND TYPE II ERRORS

                      | TRUE STATE OF NATURE (REALITY)
DECISION/CONCLUSION   | H₀ IS TRUE                                        | H₀ IS FALSE
REJECT H₀             | TYPE I ERROR (rejecting a true null hypothesis) α | CORRECT DECISION
FAIL TO REJECT H₀     | CORRECT DECISION                                  | TYPE II ERROR (failing to reject a false H₀) β

"ROUTINE FOR FUN": Type I: RTN (Reject True Null); Type II: FRFN (Failure to Reject a False Null).

Example: IDENTIFYING TYPE I AND TYPE II ERRORS
Assume that we are conducting a hypothesis test of a claim that p < 0.5. Here are the null and alternative hypotheses:
H₀: p = 0.5
H₁: p < 0.5
Give statements identifying
a. a Type I error.
b. a Type II error.
Type I and Type II errors can be controlled in part by selecting an appropriate level of α, which is the probability of a Type I error. However, we do not select β, the probability of a Type II error, directly. Mathematically, α, β, and the sample size n are all related, so when you choose any two of them, the third is automatically determined. The usual practice is to select values of α and n, so the value of β is determined. For Type I errors with more serious consequences, select smaller values of α. Then choose a sample size n as large as is reasonable, based on time, cost, and other relevant factors. The following considerations may be relevant:
1. For any fixed α, an increase in sample size n will cause a decrease in β. That is, a larger sample will lessen the chance that you make the error of not rejecting the null when it is actually false.
2. For any fixed sample size n, a decrease in α will cause an increase in β. Conversely, an increase in α will cause a decrease in β.
3. To decrease both α and β, increase the sample size.

Consider these claims: The mean weight of M&Ms is supposed to be at least 0.8535 g. Bufferin tablets are supposed to have a mean weight of 325 mg of aspirin. These claims carry very different levels of seriousness. For the less serious claim, use α = 0.05 and a sample size of n = 100; for the more serious claim, use α = 0.01 and a sample size of n = 500.

Any value of α may be chosen, although these are the most common:
α = 0.01: willingness to make a Type I error 1% of the time.
α = 0.05: willingness to make a Type I error 5% of the time.
α = 0.10: willingness to make a Type I error 10% of the time.

POWER OF A TEST: We use β to denote the probability of failing to reject a false null hypothesis (Type II error). It follows that 1 − β is the probability of rejecting a false null hypothesis.
Statisticians refer to this probability as the power of a test, and use it often to gauge the test's effectiveness in recognizing that the null hypothesis is false. That is, the power of a hypothesis test is the probability of supporting an alternative hypothesis that is true.

8-3: TESTING A CLAIM ABOUT A PROPORTION

Goal: To be able to test a hypothesis (claim) made about a population proportion.

REQUIREMENTS
1. The sample observations are a simple random sample.
2. Conditions for a binomial distribution are satisfied. (Fixed number of independent trials with constant probabilities; each trial has two outcome categories of "success" and "failure.")
3. The conditions np ≥ 5 and nq ≥ 5 are both satisfied, so the binomial distribution of sample proportions can be approximated by a normal distribution with μ = np and σ = √(npq). Note that p is the assumed proportion used in the claim, not the sample proportion.

NOTATION
n = sample size or number of trials.
p̂ = x/n (sample proportion).
p = population proportion (used in the null hypothesis).
q = 1 − p

TEST STATISTIC FOR TESTING A CLAIM ABOUT A PROPORTION
$$z = \frac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}}$$

P-values: use the standard normal distribution (Table A-2) and refer to Figure 8-6 on page 396.

EXAMPLE 1: (Using the P-value method) In 2005, a survey of a simple random sample of 900 U.S. households showed that 528 of them use email (2006 World Almanac and Book of Facts). Use those sample results to test the claim that less than 60% of all U.S. households use email.

Step 1: (a) State the hypothesis (claim) in symbolic form. (b) Give the symbolic form that must be true if the original claim is false.
Step 2: Identify the null and alternative hypotheses as well as the level of significance.
Step 3: Compute the test statistic.
Step 4: Identify the P-value.
Step 5: Make a conclusion.
Step 6: Summarize the results (conclusion).
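Steps 3 through 5 of Example 1 above can be sketched in a few lines of Python. This is a minimal illustration, not the required classroom method: the helper names `p_value_from_z` and `proportion_z_test` are made up for this sketch, and because the significance level is not given in the example as it appears here, `alpha = 0.05` is an assumption.

```python
from math import sqrt
from statistics import NormalDist

def p_value_from_z(z, tail):
    """P-value for a z test statistic; tail is 'left', 'right', or 'two'."""
    phi = NormalDist().cdf
    if tail == "left":
        return phi(z)                 # area to the left of z
    if tail == "right":
        return phi(-z)                # area to the right of z
    if tail == "two":
        return 2 * phi(-abs(z))       # twice the area beyond |z|
    raise ValueError("tail must be 'left', 'right', or 'two'")

def proportion_z_test(x, n, p0, tail):
    """z statistic and P-value for a claim about a proportion (H0: p = p0)."""
    p_hat = x / n
    q0 = 1 - p0
    z = (p_hat - p0) / sqrt(p0 * q0 / n)
    return z, p_value_from_z(z, tail)

# Example 1: claim p < 0.60, so H0: p = 0.60 and H1: p < 0.60 (left-tailed).
z, p_value = proportion_z_test(x=528, n=900, p0=0.60, tail="left")

alpha = 0.05  # assumed significance level; not stated in the example text
decision = "reject H0" if p_value <= alpha else "fail to reject H0"
print(f"z = {z:.2f}, P-value = {p_value:.4f}, decision: {decision}")
```

With the assumed α = 0.05, the P-value is well above α, so the sketch fails to reject H₀: the sample does not provide sufficient evidence to support the claim that fewer than 60% of households use email.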
EXAMPLE 2: (Using the P-value method) In a recent survey of a simple random sample of 1002 people, 701 said that they voted in the most recent presidential election (based on data from ICR Research Group). Test the claim that when surveyed, the proportion of people who said that they voted is equal to 0.61, which is the proportion of people who actually voted.

Step 1: (a) State the hypothesis (claim) in symbolic form. (b) Give the symbolic form that must be true if the original claim is false.
Step 2: Identify the null and alternative hypotheses as well as the level of significance.
Step 3: Compute the test statistic.
Step 4: Identify the P-value.
Step 5: Make a conclusion.
Step 6: Summarize the results (conclusion).

EXAMPLE 3: Clarinex is a drug used to treat asthma. In clinical tests of this drug, 1655 patients were treated with 5-mg doses of Clarinex, and 2.1% of them experienced fatigue (based on data from Schering Corporation). Test the claim that the percentage of Clarinex users experiencing fatigue is greater than the 1.2% rate for those not using Clarinex. Does it appear that fatigue is an adverse reaction to Clarinex?

Step 1: (a) State the hypothesis (claim) in symbolic form. (b) Give the symbolic form that must be true if the original claim is false.
Step 2: Identify the null and alternative hypotheses as well as the level of significance.
Step 3: Compute the test statistic.
Step 4: Identify the P-value.
Step 5: Make a conclusion.
Step 6: Summarize the results (conclusion).

8-4: TESTING A CLAIM ABOUT A MEAN: σ Known

Goal: To be able to test a hypothesis (claim) made about a population mean, given that the population standard deviation is a known value.

REQUIREMENTS
1. The sample is a simple random sample.
2. The value of the population standard deviation σ is known.
3. Either or both of these conditions is satisfied: the population is normally distributed or n > 30.

TEST STATISTIC FOR TESTING A CLAIM ABOUT A MEAN: σ Known
$$z = \frac{\bar{x} - \mu_{\bar{x}}}{\dfrac{\sigma}{\sqrt{n}}}$$

P-values: use the standard normal distribution (Table A-2) and refer to Figure 8-6 on page 396.

EXAMPLE 1: The average cost of owning and operating a vehicle is $8121 per 15,000 miles, including fixed and variable costs. A random survey of 40 automobile owners revealed an average cost of $8350, and the population standard deviation is $750. Is there sufficient evidence to conclude that the average is greater than $8121? Use α = 0.01. (Source: New York Times Almanac 2010)

Step 1: (a) State the hypothesis (claim) in symbolic form. (b) Give the symbolic form that must be true if the original claim is false.
Step 2: Identify the null and alternative hypotheses as well as the level of significance.
Step 3: Compute the test statistic.
Step 4: Identify the P-value.
Step 5: Make a conclusion.
Step 6: Summarize the results (conclusion).

EXAMPLE 2: The average depth of the Hudson Bay is 305 feet. Climatologists were interested in seeing if the effects of warming and melting ice were affecting the water level. Fifty-five measurements over a period of two weeks yielded a sample mean of 306.2 feet. The population variance is known to be 3.57. Can it be concluded at a 0.05 level of significance that the average depth has increased? (Source: World Almanac and Book of Facts 2010)

Step 1: (a) State the hypothesis (claim) in symbolic form. (b) Give the symbolic form that must be true if the original claim is false.
Step 2: Identify the null and alternative hypotheses as well as the level of significance.
Step 3: Compute the test statistic.
Step 4: Identify the P-value.
Step 5: Make a conclusion.
Step 6: Summarize the results (conclusion).
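The computational steps for Examples 1 and 2 of Section 8-4 can be sketched as follows. The helper name `mean_z_test` is illustrative, and both tests are treated as right-tailed because each asks whether the mean has increased. Note that Example 2 gives the population variance, so the sketch takes its square root to get σ.

```python
from math import sqrt
from statistics import NormalDist

def mean_z_test(x_bar, mu0, sigma, n):
    """Right-tailed z test for a mean with sigma known (H0: mu = mu0, H1: mu > mu0)."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    p_value = NormalDist().cdf(-z)   # area to the right of z
    return z, p_value

def decide(p_value, alpha):
    """P-value method: reject H0 when the P-value is at most alpha."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"

# Example 1: vehicle cost, sigma = 750, n = 40, alpha = 0.01
z1, p1 = mean_z_test(x_bar=8350, mu0=8121, sigma=750, n=40)
decision1 = decide(p1, alpha=0.01)

# Example 2: Hudson Bay depth, variance = 3.57 so sigma = sqrt(3.57), n = 55, alpha = 0.05
z2, p2 = mean_z_test(x_bar=306.2, mu0=305, sigma=sqrt(3.57), n=55)
decision2 = decide(p2, alpha=0.05)

print(f"Example 1: z = {z1:.2f}, P-value = {p1:.4f}, {decision1}")
print(f"Example 2: z = {z2:.2f}, P-value = {p2:.2g}, {decision2}")
```

In this sketch, Example 1 does not reach significance at the strict α = 0.01 level even though its P-value is below 0.05, which illustrates how the choice of α affects the decision. Example 2 gives a very small P-value and rejects H₀.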
EXAMPLE 3: A researcher estimates that the average revenue of the largest businesses in the United States is greater than $24 billion. A simple random sample of 45 companies is selected, and the revenues (in billions of dollars) are shown below. At α = 0.05, is there enough evidence to support the researcher's claim? Assume the population standard deviation is $28.7 billion. (Source: New York Times Almanac)

178 122  91  44  35  61  56  46  20
 32  30  28  28  20  27  29  16  16
 19  15  41  38  36  15  25  31  30
 19  19  19  25  25  18  14  15  24
 23  17  17  22  22  21  20  17  20

Step 1: (a) State the hypothesis (claim) in symbolic form. (b) Give the symbolic form that must be true if the original claim is false.
Step 2: Identify the null and alternative hypotheses as well as the level of significance.
Step 3: Compute the test statistic.
Step 4: Identify the P-value.
Step 5: Make a conclusion.
Step 6: Summarize the results (conclusion).
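Because Example 3 supplies raw data rather than a sample mean, the first task is to compute x̄. The sketch below does that and then applies the same right-tailed z test with σ known. The variable names are illustrative, and the data are entered exactly as listed in the example.

```python
from math import sqrt
from statistics import NormalDist, mean

# Revenues in billions of dollars, as listed in Example 3 (n = 45)
revenues = [
    178, 122, 91, 44, 35, 61, 56, 46, 20,
    32, 30, 28, 28, 20, 27, 29, 16, 16,
    19, 15, 41, 38, 36, 15, 25, 31, 30,
    19, 19, 19, 25, 25, 18, 14, 15, 24,
    23, 17, 17, 22, 22, 21, 20, 17, 20,
]

n = len(revenues)
x_bar = mean(revenues)   # sample mean
mu0 = 24                 # H0: mu = 24, H1: mu > 24 (the researcher's claim)
sigma = 28.7             # population standard deviation given in the example
alpha = 0.05

z = (x_bar - mu0) / (sigma / sqrt(n))
p_value = NormalDist().cdf(-z)   # right-tailed: area to the right of z
decision = "reject H0" if p_value <= alpha else "fail to reject H0"

print(f"n = {n}, sample mean = {x_bar:.2f}")
print(f"z = {z:.2f}, P-value = {p_value:.4f}, decision: {decision}")
```

Here the P-value falls below α = 0.05, so the sketch rejects H₀; in nontechnical terms, the sample data support the claim that the average revenue is greater than $24 billion.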