Hypothesis Testing of Qualitative Data
Connecting probability concepts in a logical way to develop the formulas used for testing hypotheses
By Pete Kaslik, Pierce College, Ft. Steilacoom

The Need To Do Statistics Begins With A Question That Leads to a Decision
The decision typically has financial or health implications. For the purposes of this demonstration, the question is whether the US should allow drilling for oil in the Pacific Ocean and in the Arctic Ocean.

We will assume there are only 2 possible points of view on this.
1. Our lifestyles require oil and therefore we should get oil no matter where it is and no matter the possible consequences.
2. We have already gotten the easiest oil and the potential consequences of drilling in the ocean outweigh the benefits. Therefore we should transition to an oil-free society instead of drilling for more.

Before analyzing data, it is useful to understand the topic. Following is a brief explanation of the oil issue.

Hubbert
• In 1956, M. King Hubbert, a petroleum geologist, explained that oil field production follows a normal curve. That is, it starts slow, increases to a peak and then declines. The same curve applies to countries and the world.
• He predicted that the US would peak in the early 1970s.

US Oil Production and Consumption (and population)
[Figure: U.S. Crude Oil Daily Production and Consumption and US Population (1920-Oct 2011). Left axis: million barrels per day, for U.S. Field Production of Crude Oil and U.S. Field Consumption of Crude Oil. Right axis: US population. Horizontal axis: month and year.]
Notice how much more we consume than produce, hence the reason for all the imports.
Notice the peak in US oil production around 1970.

World Oil Production
World oil production history and projections (left) are produced by the US Energy Information Administration (EIA) and are available on the website: http://www.eia.gov/pub/oil_gas/petroleum/presentations/2000/long_term_supply/sld001.htm.
These projections were made in 2000, so we can now see whether they are on track by looking at data obtained between the time of the predictions and 2010.
[Figure: World Oil Supply (Billion BBL/year), 1975 to 2015. Data from http://www.eia.gov/cfapps/ipdbproject/IEDIndex3.cfm]

New Oil Fields
• There are known reserves off the west coast of the US and in the Arctic.
• There is conflict about allowing drilling in the ocean, particularly after the incident in the Gulf in 2010.
What if US citizens decided whether to drill in the oceans or transition to a more environmentally sustainable society?

Our Choice
• If a super majority of Americans wanted to drill, then we drill.
• If those who want to drill are not a super majority, then we should take a more conservative approach to impacting our planet and begin our transition to a more sustainable society.
• We will consider a super majority to be 66.7%.

Our Question
• Does a super majority of adult Americans want to drill for oil in the ocean waters, in spite of the risks?

A Census
• To know the absolute answer to this question would require a census, that is, asking every single adult in the country (approximately 230 million people).
• Since a census is expensive, time consuming and generally not possible, we will make a hypothesis about the opinions of people and then take a sample to test our hypothesis.

Our Hypotheses
• H0: p = 0.667
• H1: p > 0.667
• p is the proportion of all adult Americans who think we should drill in marine waters.
• A super majority is over 66.7%, thus the null hypothesis indicates there isn't a super majority and the alternate indicates a super majority.

3 Methods
• We will test these hypotheses using 3 different, but related methods.
– Binomial distribution – gives exact results
– Normal approximation to the binomial distribution – gives approximate results
– Sampling distribution of sample proportions – gives approximate results

Visualizing the Hypotheses
[Figure: The null distribution (H0: p = 0.667) and three possible alternate distributions (H1: p > 0.667), each dividing the population into Don't Drill and Drill.]

Picture the Opinion of the Entire Adult US Population
Use your imagination to picture 230 million black or green circles on this map, one for each adult's opinion.
[Figure: map of the US. Legend: drill, don't drill. Image of US Map from http://www.thinkstockphotos.com/search/#us map/f=PIHV]

The Sample
• Since we can't do a census, our next best alternative to understanding the population is to take a random sample. We then have to use this sample as a way of determining which hypothesis to support.

Error
• Because we are going to make a judgment about the entire population based on a sample, it is possible that we will make an error as a result of the data we randomly select.
• If the data supports the alternate hypothesis, we could make a Type I Error.
• If the data supports the null hypothesis, we could make a Type II Error.
• We will not know if we make an error but there are consequences if we do.

Consequences of Errors
• The consequence of a Type I error would be that we would drill for oil when we don't have a super majority.
• The consequence of a Type II error would be that we wouldn't drill for oil when the super majority thinks we should.

Our Sample
This is our sample, in the order in which the samples were taken. Black represents drill, green represents not drilling.
What we have to decide is how to use this sample to determine which hypothesis to support. One way to use this sample is to count the number in favor of drilling (black) and the number opposed to drilling (green). Our sample has 16 in favor, 4 opposed.
Disclaimer: Obviously, this is a very small sample. This sample size is being used to keep this demonstration reasonable.

Hypothesis Testing Theory
• When testing hypotheses, we start with the assumption that the null hypothesis is true.
• We reject the null hypothesis only if we get data that is unlikely. That is, we get data that would be considered a rare event.
• To determine what is and isn't a rare event, we must first recognize that random samples do not look exactly like the population from which they were drawn.

The Null Distribution
• Our objective is to create the null distribution. This distribution shows the complete set of possible outcomes and gives the probability of each of the outcomes.
• $\text{Probability} = \dfrac{\text{Number of Favorable Outcomes}}{\text{Number of Possible Outcomes}}$
This assumes that every outcome is equally likely, which in a simple random sample would be the case (theoretically).

The Starting Point
• If we assume the null hypothesis is true, that is, we assume that exactly two thirds (0.667) of the population wants to drill and exactly one third (0.333) does not want to drill, and if we randomly select from this population, then the probability that we select someone who wants to drill is 0.667.

P(A or B) = P(A) + P(B)
• When one selection is made from a population, the probability it has one of two mutually exclusive characteristics is found by adding the probabilities of each characteristic.
• This rule is useful for understanding complements.
• Assuming everyone in the country has an opinion on this topic, then the probability we select someone who wants to drill or doesn't want to drill = 1, which is a certainty.
Therefore:
• P(Drill or Don't Drill) = P(Drill) + P(Don't Drill)
• 1 = P(Drill) + P(Don't Drill)

Complements
• 1 = P(Drill) + P(Don't Drill)
• With a little algebra, we see that P(Don't Drill) = 1 – P(Drill). Therefore, if the probability of selecting someone who wants to drill is 0.667, the probability of selecting someone who doesn't want to drill is 1 – 0.667 = 0.333.

P(A and B) = P(A)P(B)
• For independent events such as selecting people and asking their opinion about drilling, the probability of any specific sequence of responses is found by multiplying the probabilities of each individual response.
• Thus, if we selected 2 people from the population, it is possible to get the following four combinations of opinions, where B (black) means drill and G (green) means don't drill.
• These can be shown as P(B and B), P(B and G), P(G and B), P(G and G); however, this will be abbreviated by removing the word "and" to give: P(BB), P(BG), P(GB), P(GG)
• Using the above rule we have:
P(BB) = P(B)P(B) = 0.667 • 0.667 = 0.445
P(BG) = P(B)P(G) = 0.667 • 0.333 = 0.222
P(GB) = P(G)P(B) = 0.333 • 0.667 = 0.222
P(GG) = P(G)P(G) = 0.333 • 0.333 = 0.111
Notice that the sum of all these probabilities is 1, since they are the complete set of possible outcomes for a sample size of 2.

Applying this to our sample
• Remember our sample. The probability of getting it in exactly this order can be represented mathematically as:
P(BBGBBBBGBBBBBGGBBBBB)
= P(B)P(B)P(G)P(B)P(B)P(B)P(B)P(G)P(B)P(B)P(B)P(B)P(B)P(G)P(G)P(B)P(B)P(B)P(B)P(B)
= (0.667)(0.667)(0.333)(0.667)(0.667)(0.667)(0.667)(0.333)(0.667)(0.667)(0.667)(0.667)(0.667)(0.333)(0.333)(0.667)(0.667)(0.667)(0.667)(0.667)
= 0.0000189
Therefore, the probability of getting our exact sequence of drill and don't drill opinions, assuming the null hypothesis is true, is 0.0000189.
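The multiplication above can be checked with a few lines of Python. This is only a minimal sketch using the standard library; the names `P_DRILL`, `P_DONT`, `sample` and `sequence_probability` are illustrative and do not come from the slides.

```python
import math

# Probabilities under the null hypothesis (H0: p = 0.667).
P_DRILL = 0.667
P_DONT = 1 - P_DRILL  # complement rule, about 0.333

# The sample in the order collected: B = drill (black), G = don't drill (green).
sample = "BBGBBBBGBBBBBGGBBBBB"

def sequence_probability(seq: str) -> float:
    """Multiply the individual probabilities of one exact, ordered sequence."""
    return math.prod(P_DRILL if s == "B" else P_DONT for s in seq)

prob_sequence = sequence_probability(sample)

# Multiplication is commutative, so grouping the factors with exponents
# gives the same value regardless of the order of the responses.
prob_exponents = P_DRILL ** sample.count("B") * P_DONT ** sample.count("G")

print(f"{prob_sequence:.7f}")   # 0.0000189
print(f"{prob_exponents:.7f}")  # 0.0000189
```

Any other ordering of the same 16 B's and 4 G's passed to `sequence_probability` gives the same value, up to floating-point rounding.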
Applying this to our sample
If the probability of this exact sequence is 0.0000189, then because multiplication is commutative, does it seem reasonable that the probability of each of the following exact sequences is also 0.0000189?
[Figure: several rearrangements of the 16 black and 4 green circles.]

Using Exponents
The last sequence is a convenient way of representing 16 black and 4 green circles because we can use exponents to make our calculations faster. Thus we have:
$P(\text{drill})^{16}\,P(\text{don't drill})^{4} = (0.667)^{16}(0.333)^{4} = 0.0000189$

A shift in our thinking
• When we take a sample, we don't really care about the order in which the data are collected. What we want to know is the probability of getting a particular number of people who want to drill. In this example, we might want to know the probability that exactly 16 out of 20 people in the sample will want to drill.

Adding Probabilities
• We now need to return to the rule for mutually exclusive events.
• P(A or B) = P(A) + P(B)
• Our use of this formula involves the different combinations of outcomes to the survey:
[Figure: several arrangements of 16 black and 4 green circles.]
Of course, there are many more arrangements (combinations) of 16 black and 4 green circles than are shown here.

Multiplication instead of addition
• Since each combination has exactly the same probability (0.0000189), instead of finding each combination and adding the probabilities of each, we can simply multiply that probability by the number of combinations. Combinations are found using
$_nC_r = \binom{n}{r} = \dfrac{n!}{(n-r)!\,r!}$
Since our sample size is n = 20 and the number of people who want to drill is r = 16, then
$_{20}C_{16} = \binom{20}{16} = \dfrac{20!}{(20-16)!\,16!} = 4845$
TI 83/84: 20 Math PRB #3 16 Enter

Contrast
• Make sure you understand the difference between the probability of one particular sequence of outcomes and the probability of a particular number of people who want to drill.
• The probability of one particular sequence, such as our sample in the order it was collected, is 0.0000189.
• The probability of exactly 16 successes in a sample of size 20 with a probability as defined in the null hypothesis is 4845 • 0.0000189 ≈ 0.0916 (about 0.0914 if the sequence probability is not rounded first).
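The counting step can be sketched in Python as follows. This is an illustration only, assuming Python 3.8 or later for `math.comb`; the variable names are not from the slides.

```python
import math

n, r = 20, 16          # sample size and number who want to drill
p, q = 0.667, 0.333    # P(drill) and P(don't drill) under H0

# Number of distinct orderings of 16 B's and 4 G's: 20! / ((20-16)! * 16!)
n_arrangements = math.comb(n, r)

# Probability of any one specific ordering.
p_one_sequence = p ** r * q ** (n - r)

# Probability of exactly 16 who want to drill, in any order.
p_exactly_16 = n_arrangements * p_one_sequence

print(n_arrangements)          # 4845
print(f"{p_exactly_16:.3f}")   # 0.091
```

The unrounded product is about 0.0914; the 0.0916 on the slide comes from rounding the sequence probability to 0.0000189 before multiplying.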
Focus on what's important
• We care about the number of people who want to drill more than the order in which their opinion was recorded.
• Recall that our objective is to determine the likelihood of particular outcomes.
• For a sample of size 20, there are 21 possible outcomes for the number of people who want to drill. These possible outcomes are: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20

Binomial Distribution Formula
• We found the probability of exactly 16 people who want to drill by multiplying the number of combinations times the probability of a specific combination. Formally, this is:
$\dfrac{n!}{(n-r)!\,r!}\,p^{r}q^{\,n-r}$
where n = sample size, r = number who want to drill, p = probability of selecting someone who wants to drill and q = probability of selecting someone who doesn't want to drill.
TI 83/84: 2nd Distr Binompdf(n,p,r). Example: Binompdf(20,0.667,16)

Binomial Distribution
• If we apply the binomial distribution formula to each possible number of successes, we can create a binomial distribution.

Number  Probability    Number  Probability    Number  Probability
0       2.81E-10       7       0.003          14      0.182
1       1.13E-08       8       0.009          15      0.146
2       2.14E-07       9       0.025          16      0.091
3       2.58E-06       10      0.054          17      0.043
4       2.19E-05       11      0.098          18      0.014
5       1.41E-04       12      0.148          19      0.003
6       7.04E-04       13      0.182          20      0.000

Binomial Distribution Stick Graph
[Figure: Binomial Distribution For The Number Of People Who Want To Drill If Exactly 66.7% of the Population Wants To Drill. Horizontal axis: number of people in a sample of size 20 who want to drill (0 to 20). Vertical axis: probability (0.00 to 0.20).]

Testing Our Hypotheses
Review: The principle behind hypothesis testing is to assume the null hypothesis is true and then determine if selecting our data was a rare event. The probability of selecting our data or more extreme data is called the p-value.
Since our hypotheses are H0: p = 0.667 and H1: p > 0.667, the direction of the extreme is to the right, because it is high values that would lead us to conclude the proportion is more than 0.667, and high values are to the right on a number line.
• Rare events have small p-values.
• A p-value is considered small enough to be regarded as a rare event if it is less than or equal to alpha (α), which is the probability of making a Type I error.
If p-value ≤ α, accept H1. The data are significant.
• Let α = 0.05.
• Since our data was 16 people who wanted to drill, we can find the probability of getting 16 or more by adding up the probabilities on the binomial distribution.

Number  Probability
16      0.091
17      0.043
18      0.014
19      0.003
20      0.000
Sum     0.152  (computed from the unrounded probabilities)

Our p-value = 0.152. Since the p-value is greater than α, we conclude that it would not be a rare event to get 16 out of 20 people to support drilling if exactly 66.7% of the adult American population supports drilling. Therefore the evidence from our sample supports the null hypothesis, which indicates there is not a super majority.

Using the Calculator to Test a Hypothesis Using the Binomial Distribution
• Rather than creating an entire binomial distribution and adding up the probabilities for the data and more extreme values, we can use the binomcdf function on the calculator. Binomcdf always adds the probabilities to the left, so if the direction of the extreme is to the right, we need to use the complement rule and also subtract one from our x value. That means the probability of getting 16 or more equals 1 minus the probability of getting 15 or fewer.
• 1 – binomcdf(n,p,r-1) = 1 – binomcdf(20,0.667,15) = 0.152
• If the direction of the extreme had been to the left because of a < symbol in the alternate hypothesis, we would use binomcdf(n,p,r) to get the p-value.
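The exact binomial test described above can be sketched in Python. The helpers `binom_pmf` and `binom_cdf` are hypothetical stand-ins written here to mirror the calculator's Binompdf and binomcdf; they are not library functions.

```python
import math

def binom_pmf(n: int, p: float, r: int) -> float:
    """Probability of exactly r successes in n trials (like Binompdf(n,p,r))."""
    return math.comb(n, r) * p ** r * (1 - p) ** (n - r)

def binom_cdf(n: int, p: float, r: int) -> float:
    """Probability of r or fewer successes (like binomcdf(n,p,r))."""
    return sum(binom_pmf(n, p, k) for k in range(r + 1))

n, p0, observed, alpha = 20, 0.667, 16, 0.05

# Right-tail p-value computed two ways: adding the tail directly,
# and using the complement rule with r - 1.
p_value_direct = sum(binom_pmf(n, p0, k) for k in range(observed, n + 1))
p_value_complement = 1 - binom_cdf(n, p0, observed - 1)

# Decision rule from the slides: accept H1 only if p-value <= alpha.
reject_null = p_value_direct <= alpha

print(f"{p_value_direct:.3f}")      # 0.152
print(f"{p_value_complement:.3f}")  # 0.152
print(reject_null)                  # False
```

For a left-tailed alternative (a < symbol in H1), `binom_cdf(n, p0, observed)` alone would play the role of the p-value.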
Summary
• A binomial distribution includes all possible outcomes and their probabilities if the null hypothesis is true. Using this distribution allows us to find the exact p-value for our data and thereby determine if the data is rare enough to cause us to reject the null hypothesis.

The Second Method
Normal Approximation of the Binomial Distribution
[Figure: Binomial Distribution For The Number Of People Who Want To Drill If Exactly 66.7% of the Population Wants To Drill, overlaid with a normal curve with mean 13.34 and standard deviation 2.1077. Horizontal axis: number of people who want to drill (0 to 20). Vertical axis: 0.00 to 0.20.]
Notice how nicely the normal curve fits the binomial distribution.

Binomial Distribution Parameters
• The expected value, or mean, of this distribution is $\mu = np = 20(0.667) = 13.34$
• The standard deviation of this distribution is $\sigma = \sqrt{np(1-p)} = \sqrt{20(0.667)(0.333)} = 2.1077$
This mean and standard deviation can be used to label the x axis of the normal curve.

Normal Approximation
[Figure: the same binomial distribution and normal curve, with the horizontal axis labeled both with the counts 0 to 20 and with the mean and whole standard deviations from it: 7.0, 9.1, 11.2, 13.3, 15.4, 17.6, 19.7.]

Using the Normal Distribution
• Since the mean is 13.34, the standard deviation is 2.1077 and our sample number of people who want to drill is 16, we can use the z formula to determine how many standard deviations our sample is from the mean. We can use that value and the standard normal z tables to find the probability of getting 16 or more successes.
$z = \dfrac{x-\mu}{\sigma} = \dfrac{16-13.34}{2.1077} = 1.26$
From the tables, we find the area to the right of this z-value (hint: 1 – area to the left) is 0.1038, which is our p-value.
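The normal-approximation calculation can be reproduced with Python's standard library. This is a sketch under the slides' setup (no continuity correction); `normal_right_tail` is a helper name introduced here, built on `math.erfc`.

```python
import math

n, p0, observed = 20, 0.667, 16

mean = n * p0                        # mu = np
sd = math.sqrt(n * p0 * (1 - p0))    # sigma = sqrt(np(1-p))
z = (observed - mean) / sd           # standard deviations above the mean

def normal_right_tail(z_value: float) -> float:
    """Area under the standard normal curve to the right of z_value."""
    return 0.5 * math.erfc(z_value / math.sqrt(2))

p_value_normal = normal_right_tail(z)

print(f"mean = {mean:.2f}, sd = {sd:.4f}, z = {z:.2f}")  # mean = 13.34, sd = 2.1077, z = 1.26
print(f"p-value = {p_value_normal:.4f}")
```

Because z is not rounded to 1.26 before the tail area is taken, the result is close to 0.1035 rather than the table value of 0.1038.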
The TI 83/84 calculator gives a more precise p-value if you enter 2nd distr #2:
Normalcdf(low, high, mean, standard deviation) = Normalcdf(16, 1E99, 20*0.667, sqrt(20*0.667*0.333)) = 0.10346
As when the hypothesis was tested using the binomial distribution, this p-value is greater than α, so the data supports the null hypothesis: there isn't a super majority who want to drill.

Counts are difficult to compare
Both of the methods used so far rely on counting the number of people who want to drill. If every sample were the same size, this would be a good way to compare results from different samples. However, if our sample had 16 people who want to drill and someone else had 42 people who want to drill, how could we compare these numbers?
The obvious question is: 42 out of how many? If the sample size was 48, we would come to one conclusion; if the sample size was 100, we would come to a different conclusion.

From Counts to Proportions
• Comparing the results of two samples with different sample sizes leads us to the concept of proportions.
• Proportions result from dividing the number of people who want to drill by the sample size. Thus:
$\dfrac{16}{20} = 0.8 \qquad \dfrac{42}{100} = 0.42 \qquad \dfrac{42}{48} = 0.875$
We will use this approach of dividing the counts by the sample size to convert our distribution from counts to proportions. We will also convert the mean and standard deviation of the number of people who want to drill.
Remember that the mean of a binomial distribution is μ = np.
If we divide the mean by the sample size n we get:
$\dfrac{np}{n} = p$
Remember the standard deviation of a binomial distribution is $\sqrt{np(1-p)}$. We can divide the standard deviation by the sample size n to get:
$\dfrac{\sqrt{np(1-p)}}{n} = \sqrt{\dfrac{np(1-p)}{n^{2}}} = \sqrt{\dfrac{p(1-p)}{n}}$

Changing to Proportions
[Figure: Distribution of Sample Proportion of People Who Want To Drill. The horizontal axis shows each count with its proportion in parentheses, from 0 (0.00) to 20 (1.00), with the mean at 13.3 (0.667) and the standard deviation marks 7.0 (0.35), 9.1 (0.455), 11.2 (0.56), 15.4 (0.77), 17.6 (0.88) and 19.7 (0.985). Vertical axis: 0.00 to 0.20.]

Using the Normal Distribution
For the distribution of sample proportions, $x = \hat{p}$, $\mu = p$ and $\sigma = \sqrt{\dfrac{p(1-p)}{n}}$.
$\hat{p} = \dfrac{x}{n} = \dfrac{16}{20} = 0.8$
$\sigma = \sqrt{\dfrac{0.667(1-0.667)}{20}} = 0.10538$
$z = \dfrac{\hat{p}-p}{\sqrt{\dfrac{p(1-p)}{n}}} = \dfrac{0.8-0.667}{0.10538} = 1.26$
Notice that this is exactly the same z value we got when using the Normal Approximation for the Binomial Distribution. Consequently, the p-value will be the same, as will our conclusion.
From the tables, we find the area to the right of this z-value (hint: 1 – area to the left) is 0.1038, which is our p-value.
The TI 83/84 calculator gives a more precise p-value if you enter 2nd distr #2:
Normalcdf(low, high, mean, standard deviation) = Normalcdf(16/20, 1E99, 0.667, sqrt(0.667*0.333/20)) = 0.10346

Surprised?
• It should come as no surprise that the normal approximation for the binomial distribution and the sampling distribution of sample proportions yield exactly the same result. This is because the normal approximation uses counts while the sampling distribution represents the counts as proportions. Nothing else is different.

Summary
• Binomial distributions produce exact p-values.
• The normal approximation to the binomial distribution and the sampling distribution of sample proportions produce identical p-values (0.103) that will be close to, but not the same as, those produced by the binomial distribution (0.152).
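The claim that the count-based and proportion-based calculations give the same z value can be checked numerically. The following is a minimal sketch with illustrative variable names, using only Python's standard library.

```python
import math

n, p0, observed = 20, 0.667, 16

# Sampling distribution of sample proportions.
p_hat = observed / n                          # sample proportion, 16/20
std_error = math.sqrt(p0 * (1 - p0) / n)      # sqrt(p(1-p)/n)
z_proportion = (p_hat - p0) / std_error

# Normal approximation to the binomial, using counts.
z_count = (observed - n * p0) / math.sqrt(n * p0 * (1 - p0))

# Right-tail area under the standard normal curve.
p_value = 0.5 * math.erfc(z_proportion / math.sqrt(2))

print(f"standard error = {std_error:.5f}")    # 0.10538
print(f"z (proportions) = {z_proportion:.2f}, z (counts) = {z_count:.2f}")
print(f"p-value = {p_value:.4f}")
```

Dividing both the numerator and the denominator of the count-based z by n turns it into the proportion-based z, which is why the two agree up to floating-point rounding.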
Conclusion
• Ultimately, we use the last method to test hypotheses because it is easier to discuss proportions than counts, since sample sizes can vary. Keep in mind, however, that the method produces approximate p-values. The approximation improves as the sample size increases.
• Because the p-value is greater than the level of significance (0.05),
– we select the null hypothesis.
– the data are not significant.
– we will not drill.
– there is the possibility we made a Type II error.