Statistical Analysis – Chapter 6 “Hypothesis Testing”
Dr. Roderick Graham
Fashion Institute of Technology

Basic Concepts of Hypothesis Testing
What do we mean by a hypothesis?
A “proposed” explanation for a phenomenon.
In statistics, a hypothesis usually centers on explaining sample means.
We usually hypothesize that a particular sample is or is not like the population from which it is drawn.

Why the Central Limit Theorem is important for hypothesis testing…
Remember that the central limit theorem states that:
With a sample size over 30, sample means are approximately normally distributed.
The mean of those sample means equals the mean of the population.
This means that with any sample, we can EXPECT the sample mean to be close to the mean of the population from which it is drawn.

Conceptual Background for Hypothesis Testing
But when we take a sample whose mean is different from the population mean, it may be an indication that this sample is not like the population.
But how can we be sure?
We can never be exactly sure… but we can use the normal table to tell us the chances that a sample value is like or unlike the population from which it is drawn.

Basics of Hypothesis Testing (using a class example)
At the beginning of one of our classes, you completed a small survey about the types of music you like.
This survey is turned into a scale. Social scientists use this scale as a measure of cultural openness.
Americans score around 3.21 on this scale, with a standard deviation of .95.
We will test to see if the sample we drew from this class is significantly different from the population.

Basics of Hypothesis Testing (using a class example)
Step 1
All statistical tests start with an assumption.
For this example, the assumption is that the mean of the population, µ, is 3.21, and the standard deviation of the population, σ, is .95. The population is Americans.
We also assume that our sample is no different from the American population.
Thus… $\mu = \bar{X}$
In other words, because of the central limit theorem, we assume that the mean of the population = the mean of the sample. So any sample we take should have a mean of approximately 3.21.

Basics of Hypothesis Testing (using a class example)
Step 2
We will set a “cut-off” for accepting this assumption.
Our cut-off will be 95% of the normal distribution area (z = +/- 1.96).
If we calculate a z-score for our sample and it falls outside of this range, we reject our assumption that the sample is the same as the population.

Basics of Hypothesis Testing (using a class example)
Step 2 (cont’d)
Let’s make the necessary calculations. We need a mean and a standard deviation in order to use this formula:

$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$

1. The mean of the population, µ, is 3.21. The population standard deviation, σ, is .95.
2. What is the mean of our sample? This is $\bar{X}$. The mean is 3.50.
3. We cannot assume a normal distribution. Thus, we need to know the N of the sample in order to compute the standard deviation of the sample mean (the standard error). The N = 38. We compute it using this:

$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$

Basics of Hypothesis Testing (using a class example)
Step 2 (cont’d)

Population                     Sample (FIT Students)
µ = 3.21                       N = 38
σ = .95                        $\bar{X}$ = 3.50
z-score (test) = +/- 1.96      $\sigma_{\bar{X}}$ = .154
                               z-score (sample) = 1.88

Equations used…

$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{.95}{\sqrt{38}} = .154$

$z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{3.50 - 3.21}{.154} = 1.88$

Basics of Hypothesis Testing (using a class example)
Step 3
Now we evaluate our findings.
We decided that any z-score beyond +/- 1.96 (above +1.96 or below -1.96) is not like the population.
The z-score for our sample is 1.88, which falls inside that range.
So, do we reject the initial assumption?
NO. We do not reject the assumption. By this test, FIT students are not significantly different from the American population.

Errors with hypothesis testing
Type I error – you have rejected an assumption when you should not have. We call this “the α risk” (alpha risk).
The alpha is the probability of a Type I error, and it sets the point at which we reject or do not reject an assumption. For example, we set our cut-off for the FIT sample at 95% (+/- 1.96).
For the FIT example, the α is .05 (1 - .95).
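The FIT class example worked through in Steps 1 to 3 can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `one_sample_z` and its parameter names are illustrative choices and are not part of the course materials.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(sample_mean, pop_mean, pop_sd, n, confidence=0.95):
    """Two-tailed z-test of a sample mean against a known population.

    Returns (standard_error, z, critical_z, reject).
    """
    standard_error = pop_sd / sqrt(n)                  # sigma / sqrt(n)
    z = (sample_mean - pop_mean) / standard_error      # (X-bar - mu) / standard error
    alpha = 1 - confidence                             # Type I error risk
    critical_z = NormalDist().inv_cdf(1 - alpha / 2)   # cut-off for a two-tailed test
    reject = abs(z) > critical_z                       # outside the cut-off?
    return standard_error, z, critical_z, reject


# Values from the class example (FIT students vs. Americans).
se, z, crit, reject = one_sample_z(3.50, 3.21, 0.95, 38)
print(f"standard error = {se:.3f}")
print(f"z (sample)     = {z:.2f}")
print(f"z (cut-off)    = +/- {crit:.2f}")
print("reject the assumption" if reject else "do not reject the assumption")
```

With the class values, the sample z-score stays inside the +/- 1.96 cut-off, so the assumption is not rejected.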
Errors with hypothesis testing
Type II error – you accept the assumption when you shouldn’t have. We call this “the β risk” (beta risk).
In other words, the sample is really different from the population, but we did not identify it.
Power = 100% - β, or 1 - β
Power is the probability of making a correct decision by avoiding a Type II error, that is, of detecting a difference that really exists.

Let’s go over a problem… (p. 170)
NIH agreed to supply immunizations for viruses. A process is set up to fill test tubes to an average of 9.00 ml, with a standard deviation of .35 ml. Now let’s say we take samples of 49 test tubes, and 99% of all sample averages fall between 8.87 ml and 9.13 ml. We will use this 99% criterion to accept µ = 9.00 ml.
a. What is the probability of Type I error?
b. What is the probability of Type II error if the process shifts to µ = 9.20 ml?
c. What is the power of the test in part b?

Sample Questions for Review
Questions 6.6 and 6.8 (work in groups if you like)
We may finish these in class… if not, answers will be posted on the website.
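The NIH test-tube problem above can be worked numerically. The sketch below is one way to do it, assuming the acceptance region is exactly the stated 8.87 ml to 9.13 ml interval and that sample means are normally distributed; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

sigma, n = 0.35, 49
lower, upper = 8.87, 9.13           # 99% acceptance region given in the problem
standard_error = sigma / sqrt(n)    # .35 / 7 = .05

# a. Type I error: a sample mean falls outside the region although mu = 9.00.
on_target = NormalDist(mu=9.00, sigma=standard_error)
alpha = 1 - (on_target.cdf(upper) - on_target.cdf(lower))

# b. Type II error: a sample mean still falls inside the region although mu = 9.20.
shifted = NormalDist(mu=9.20, sigma=standard_error)
beta = shifted.cdf(upper) - shifted.cdf(lower)

# c. Power: the chance of detecting the shift.
power = 1 - beta

print(f"alpha = {alpha:.3f}")   # close to .01; the limits in the problem are rounded
print(f"beta  = {beta:.3f}")
print(f"power = {power:.3f}")
```

Rounded to two decimals, this gives α of about .01, β of about .08, and power of about .92.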