UNC-Wilmington Department of Economics and Finance
ECN 377 Dr. Chris Dumas
Hypothesis Testing Terminology, Z-tests of Means

"A statistical hypothesis is simply a claim about a population that can be put to the test using information from a random sample drawn from the population." (Wonnacott p. 257)

Null Hypothesis "H0" = the baseline, status-quo assumption/hypothesis
Alternative Hypothesis "H1" = what might be true instead of the status quo

The idea: (1) Collect data. (2) The data might provide enough evidence to reject H0, leading you to accept H1. If not, accept (continue to believe) H0.

Hint: Often, the Null Hypothesis H0 is that some parameter, like a mean or standard deviation, equals some number (usually zero), and the Alternative Hypothesis H1 is that the parameter is not equal to that number.

One-tailed Hypothesis Test (two kinds)
H0: population parameter equals some number
H1: population parameter greater than that number
or
H0: population parameter equals some number
H1: population parameter less than that number

Two-tailed Hypothesis Test (only one kind)
H0: population parameter equals some number
H1: population parameter not equal to that number

TIP: If your hypothesis mentions one particular direction for an effect, use a one-tailed test; otherwise, use a two-tailed test.

"Confidence Level of a test" (C.L.) is a measure of the reliability of a hypothesis test. A 95% Confidence Level means that if we took many random samples from a population and conducted our hypothesis test on each of the samples, the test would give us the correct result 95% of the time. Typically, C.L. = 95% or 0.95 is used, but in situations where you need to be very careful, a larger value of C.L. is used, say 99%, and in situations where you can allow a greater chance for error, a smaller value of C.L., say C.L. = 90%, can be used. The Confidence Level tells us the probability of accepting H0 when H0 is actually true.
The investigator chooses the Confidence Level of a test; that is, the Confidence Level of a test is a subjective decision. Of course, a higher confidence level is always better, but the higher the confidence level chosen, the larger the amount of data needed to conclude that H0 is false (when it truly is false). Don't confuse the Confidence Level (C.L.) with the Confidence Interval (C.I.), described later.

"Significance Level of a test" ("alpha" = "α", also called Type I Error) is another measure of the reliability of a hypothesis test, but this measure gives your probability of being wrong. A Significance Level of 5% means that if we took many random samples from a population and conducted our hypothesis test on each of the samples, the test would give us the wrong result 5% of the time. The Significance Level tells us the probability of rejecting H0 when H0 is actually true.

Like the Confidence Level, the Significance Level is ultimately a subjective choice, but once the investigator chooses the Confidence Level, the Significance Level is determined automatically:

Significance Level = α = (100% - Confidence Level)

Typically, α = 0.05 is used, but in situations where you need to be very careful, a smaller value of α is used, say α = 0.01, and in situations where you can allow a greater chance for error, a larger value of α, say α = 0.10, can be used.

When you use α = 0.05, it means:
prob. of rejecting H0 when H0 is true = 0.05
prob. of being wrong if you reject H0 = 0.05

Other terms sometimes used in hypothesis testing:
Type II Error = the probability of accepting H0 when H0 is actually false.
Power = the probability of rejecting H0 when H0 is actually false.
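The meaning of α can be checked with a quick simulation. This is just a sketch with made-up numbers (a population mean of 100, σ = 15, samples of 25): when H0 is actually true, a Z-test run at α = 0.05 should wrongly reject H0 about 5% of the time.

```python
import random
import math

random.seed(42)

mu0 = 100.0        # true population mean -- H0 is actually true here
sigma = 15.0       # known population standard deviation
n = 25             # sample size
z_critical = 1.96  # two-tailed critical value for alpha = 0.05

rejections = 0
trials = 20000
for _ in range(trials):
    # Draw a random sample from the population and run the Z-test on it
    sample = [random.gauss(mu0, sigma) for _ in range(n)]
    xbar = sum(sample) / n
    se = sigma / math.sqrt(n)
    z_test = (xbar - mu0) / se
    if abs(z_test) > z_critical:   # reject H0 (wrongly, since H0 is true)
        rejections += 1

type_i_rate = rejections / trials
print(round(type_i_rate, 3))   # close to alpha = 0.05
```

The simulated rejection rate lands near 0.05: out of many repeated tests on samples from a population where H0 holds, about 5% give the wrong answer, which is exactly what the Significance Level promises.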
Testing Hypotheses about Means

Terminology
X = a variable of interest
Xi = a particular value of X (namely, the "ith" value)
N = population size
μ = mean of population
σ = standard deviation of population
n = sample size
Xbar = mean of sample (as opposed to the population mean)
s = standard deviation of sample (as opposed to the population standard deviation)

The idea: We're in a situation where we do not have data on the full population; rather, we just have a sample. We are going to use the sample mean, Xbar, to test whether the population mean, μ, is equal to some number, "a". (Looking ahead, often "a" is the number zero, because in many applications, μ will be the average "effect" of one variable on another, and we want to test whether that average "effect" is zero, or not.)

Types of hypotheses about means

One-tailed Hypothesis Tests (two kinds)
H0: μ = a
H1: μ > a
or
H0: μ = a
H1: μ < a

Two-tailed Hypothesis Test (only one kind)
H0: μ = a
H1: μ ≠ a

Looking ahead, we are going to do these hypothesis tests in two different situations:
(1) where we know the population standard deviation, σ (we are going to use a "Z table" for this).
(2) where we don't know σ but must approximate it with the sample std. dev. "s" (we are going to use a "t table" for this).

The Variation of the Sample Mean, Xbar

Central Limit Theorem: As sample size increases (that is, "in the limit"), if you take a bunch of random samples from a population, and then take the mean of each sample, the distribution of all of those means will have the shape of a normal (bell-shaped) distribution, regardless of the shape of the distribution of the individual values in the population from which the samples were taken.

Amazing Implication: If we want to test hypotheses about means, we can always assume that the distribution of the means is a normal distribution, regardless of the shape of the distribution of the underlying population.
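The Central Limit Theorem can be seen in a quick simulation. The population here is illustrative (an exponential distribution, which is heavily skewed and looks nothing like a bell curve): take many random samples, compute each sample's mean, and the means pile up symmetrically around the population mean with far less variation than the individual values.

```python
import random
import math

random.seed(1)

n = 50             # size of each sample
num_samples = 10000
lam = 0.5          # exponential population: heavily skewed; mean = sd = 1/lam
pop_mean = 1.0 / lam   # 2.0
pop_sd = 1.0 / lam     # 2.0

means = []
for _ in range(num_samples):
    sample = [random.expovariate(lam) for _ in range(n)]
    means.append(sum(sample) / n)

mean_of_means = sum(means) / num_samples
sd_of_means = math.sqrt(sum((m - mean_of_means) ** 2 for m in means) / num_samples)

# The sample means cluster tightly and roughly symmetrically around the
# population mean, even though the individual exponential values do not.
print(round(mean_of_means, 2))   # close to pop_mean = 2.0
print(sd_of_means < pop_sd)      # much less variation than individual values
below = sum(1 for m in means if m < pop_mean) / num_samples
print(round(below, 2))           # close to 0.50 -- roughly symmetric
```

The standard deviation of the means also comes out close to pop_sd/√n, the "standard error" formula discussed next.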
This is why we work with the normal distribution so much!!

If you look at a collection of means from many samples, one mean from each sample, the means have less variation than do the individual data values; so, variation in means across samples is smaller than variation in individual data values. The variation in the means from a collection of samples is called . . .

"Standard Error" = "s.e." = the standard deviation of the mean. Remember, this is not the standard deviation of X, but rather the standard deviation of Xbar.

The formula for s.e. depends on whether or not you know the population standard deviation, σ:

s.e. = σ/√n if you know σ of the population,
or
s.e. = s/√n if you don't know σ and have to approximate it with "s" from the sample.

The standard error of Xbar is typically smaller than the standard deviation of X, because the averaging involved in calculating Xbar leads to "averaging out" some of the variation that was present in the original X values.

The "Z-distribution"

The Z-distribution is a particular type of normal ("bell-shaped") distribution. It is the normal distribution that has μ = 0 and σ = 1. The Z-distribution is also called the Standard Normal distribution.

It is important to keep in mind two facts about the Z distribution:
1) Like all probability distributions, the total area under the whole distribution always equals 100%, or 1.00. This fact is used often when solving problems with the Z table.
2) Because Z is a symmetric distribution, half = 50% = 0.50 of the Z values occur on either side of Z = 0. This fact is used often when solving problems with the Z table.

The Z-distribution can be used for hypothesis tests when you know the standard deviation of the population, σ. The idea is: If you have a variable X with a normal distribution, convert it to the equivalent Z distribution, and then do hypothesis tests with the Z distribution.
By doing this, you only need to know the hypothesis test formulas for the Z distribution rather than needing to know all the formulas for all types of normal distributions.

The Z table

The Z table gives the area under the Z distribution. The area under the Z distribution is equal to the probability that a randomly selected Z would take on the values on the Z axis. So, "area under Z curve" = "probability Z takes on those values".

Using the Z table to Test Hypotheses about Means

When It's Okay to Use the Z table -- You may use the Z table to test hypotheses about means when you know σ of the population!! (You must use the t table, described later, when you don't know σ of the population.)

In cases where it's okay to use the Z table (that is, when you know the σ of the population), the standard error of the sample mean Xbar is:

s.e. = σ/√n (we already saw this formula, I'm just re-copying it here)

and the Z conversion formula for Xbar uses this s.e. in the denominator:

Z = (Xbar − μ) / s.e.

When we test hypotheses about means using the Z table, we convert our Xbar value to a Z value using the formula above, and then we use the Z table.

Example -- One-Sided Test

A computer chip manufacturer has produced millions of computer chips using an "old" technology. The population of chips produced by the old technology has a mean lifetime of 1200 hours per chip, and a standard deviation of 300 hours. A sample of 100 chips is taken from a trial run of a "new" technology. The sample mean lifetime for the 100 chips is 1265 hours. Assuming the standard deviation remains the same at 300 hours for the new chips, can we conclude that the mean lifetime is larger for the chips made using the "new" technology?

Given:
X = chip lifetime
Xi = lifetime of a particular chip
X is normally distributed
μ0 = mean chip lifetime of "old technology" population = 1200
σ0 = standard deviation of "old technology" population = 300
n = sample size = 100
μ1 = mean chip lifetime of "new technology" population
Note: μ1 is the unknown population mean; we're hypothesizing about it, based on our sample.
Xbar = mean chip lifetime of "new technology" sample = 1265
s.e. = σ/√n = 300/sqrt(100) = 30

1) Set up hypothesis test:
One-tailed Hypothesis Test
H0: μ1 = 1200 (that is, maybe μ1 is just the same as the old μ0)
H1: μ1 > 1200 (or, maybe it's larger)

2) Select significance level and "Z-critical" value for the hypothesis test:

Let's select the typical significance level, α = 0.05. Now, if Xbar is much larger than μ0, then that is evidence that μ1 is larger than μ0. How large must Xbar be for us to conclude that μ1 is larger than μ0, with an α = 0.05 chance of being wrong in our conclusion? Rather than answering this question for Xbar directly, we are going to convert Xbar to its equivalent Z value, and then answer the question using the Z table. To do this, we need to know what value of Z is large enough so that, if we get that value (or larger) for Z, then we have only an α = 0.05 chance of being wrong in our conclusion that μ1 is larger than μ0. To do this, look up the α = 0.05 probability in the body of the Z table (not in the margins of the table, but in the body of the table), and then read the value of Z that corresponds to this probability by looking in the margins of the table. This Z value is called "Z critical."

For α = 0.05, the Z value in the table is between 1.64 and 1.65, so let's say Zcritical = 1.645. (This is the correct Zcritical for the one-tailed test that we are doing in this example; it would be different for a two-tailed test, as we will see...)

3) Convert Xbar to its equivalent Z value, called "Z test"

Convert Xbar to its equivalent Z value, called "Ztest," using the formula:

Ztest = (Xbar − μ0) / s.e. = (1265 − 1200) / 30 = 2.17

4a) Compare Ztest to Zcritical

Because we are testing whether μ1 > 1200, we want to determine whether Ztest > Zcritical. If Ztest > Zcritical, then we reject H0 and accept H1. Otherwise, accept H0.
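The calculation in steps 2 through 4a can be sketched in Python. All the numbers below come from the chip-lifetime example itself; the only outside value is Zcritical = 1.645, read from the Z table as in step 2.

```python
import math

mu0 = 1200.0        # mean lifetime under H0 ("old technology" population)
sigma = 300.0       # known population standard deviation
n = 100             # sample size
xbar = 1265.0       # sample mean from the "new technology" trial run
z_critical = 1.645  # one-tailed critical value for alpha = 0.05 (from Z table)

se = sigma / math.sqrt(n)     # standard error: 300 / 10 = 30
z_test = (xbar - mu0) / se    # (1265 - 1200) / 30 ≈ 2.17

print(round(se, 1), round(z_test, 2))   # 30.0 2.17
if z_test > z_critical:
    print("Reject H0; accept H1 (mean lifetime is larger)")
else:
    print("Accept H0")
```

Because 2.17 > 1.645, the code reaches the same conclusion as the text: reject H0.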
In our case, Ztest = 2.17 and Zcritical = 1.645, so Ztest > Zcritical, and thus we conclude: reject H0 and accept H1.

Given that we accept μ1 > 1200, what is our best estimate of μ1? Our best estimate of μ1 is Xbar, which is equal to 1265.

4b) Calculate p-value of test (this is an alternative to comparing Ztest to Zcritical)

Instead of comparing Ztest to Zcritical to decide whether to reject H0, we could instead calculate the "p-value" of the test. The p-value gives the probability of getting the Xbar of our sample, or a value of Xbar even farther from μ0, when H0 is in fact true. If the p-value is small, then it is unlikely that H0 is true, and we therefore reject H0. (On the other hand, if the p-value is large, then we accept H0.)

To calculate the p-value:

p-value = the probability that Z is larger than Ztest

For example, in the previous step, we found that Ztest = 2.17. If we look up the probability for this Z value in the Z table, we find the probability is 0.015, so for this example:

p-value = 0.015

This means that if H0 is in fact true, the chances of getting a value of Xbar equal to the one from our sample (or larger) are only 0.015, or 1.5%. This chance is so small that we reject H0, and we accept the alternative hypothesis H1.

Notice that the conclusion we reach using the p-value is the same as the conclusion we reached by comparing Ztest to Zcritical. This is always the case; you can either compare Ztest to Zcritical or use the p-value, and either way you get the same answer about whether to reject H0. In general, the rule for using p-values is this:

If p-value < α, then reject H0 and accept H1.
If p-value > α, then accept H0 and reject H1.

Example -- Two-Sided Test

Consider the computer chip manufacturer again. Suppose we wanted to find out whether the mean lifetime for the "new technology" chips had changed, in either a positive or negative direction, from the mean lifetime for the "old technology" chips.
This is a two-sided hypothesis, because we are testing whether the new mean changed in either direction relative to the old mean.

1) Set up hypothesis test:
Two-tailed Hypothesis Test
H0: μ1 = 1200 (that is, maybe μ1 is just the same as the old μ0)
H1: μ1 ≠ 1200 (or, maybe it's larger or smaller)

2) Select significance level and "Z-critical" value for the hypothesis test:

Let's select the typical significance level, α = 0.05. For a two-sided test, we use α/2 when looking up Zcritical. In this case, α/2 = 0.025. From the Z table, Zcritical for 0.025 = 1.96.

3) Convert Xbar to its equivalent Z value, called "Z test"

Same as before, Ztest = 2.17

4a) Compare Ztest to Zcritical

Same as before: If Ztest > Zcritical, then we reject H0 and accept H1. Otherwise, accept H0. Only this time, Zcritical is 1.96, rather than 1.645, as it was for the one-sided test. For the two-sided test, Ztest = 2.17 and Zcritical = 1.96, so Ztest > Zcritical, and thus we still conclude: reject H0 and accept H1.

4b) Calculate p-value of test

The procedure is the same as for the one-sided test, except we compare the p-value to α/2 rather than α. So:

If p-value < (α/2), then reject H0 and accept H1.
If p-value > (α/2), then accept H0 and reject H1.
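The two-tailed version of the chip example can also be sketched in Python. One note on an assumption: instead of looking the tail probability up in a printed Z table, the code below uses the standard-library identity P(Z > z) = 0.5 * erfc(z / √2) (math.erfc) as a stand-in for the table; the numbers themselves are all from the example.

```python
import math

mu0 = 1200.0       # mean lifetime under H0
sigma = 300.0      # known population standard deviation
n = 100            # sample size
xbar = 1265.0      # sample mean
alpha = 0.05
z_critical = 1.96  # two-tailed critical value (from Z table, for alpha/2 = 0.025)

se = sigma / math.sqrt(n)
z_test = (xbar - mu0) / se                           # ≈ 2.17
p_one_tail = 0.5 * math.erfc(z_test / math.sqrt(2))  # ≈ 0.015, as in the Z table

print(round(z_test, 2), round(p_one_tail, 3))   # 2.17 0.015
# Decision by critical value:
print(abs(z_test) > z_critical)    # True -> reject H0
# Same decision by p-value, compared to alpha/2 as in step 4b:
print(p_one_tail < alpha / 2)      # True -> reject H0
```

Both decision rules agree, just as in the one-sided example: 2.17 > 1.96 and 0.015 < 0.025, so either way we reject H0 and accept H1.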