Hypothesis Testing: z-test

The sampling distribution of a mean (SDM) describes the behavior of a sample mean:
\[ \bar{x} \sim N(\mu, \sigma_{\bar{x}}), \quad \text{where } \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]

The sampling distribution of a mean is based on:
1. the central limit theorem,
2. the law of large numbers (unbiased nature of the sample mean)

We compare the means of two samples taken from specific sub-groups of the population, and we analyze the difference between the two means.
The question under consideration: “Is the difference between the samples large enough to allow us to conclude (with a known probability of error) that the populations represented by the samples are different?”

Example 1: A quality engineer would like to determine whether the production process he is in charge of monitoring is still producing products whose mean response value is the required value of 0 (process is in-control), or whether it is now producing products whose mean response value differs from the required value of 0 (process is out-of-control).
Statement 1 (Null): μ = 0 (process in-control)
Statement 2 (Alternative): μ ≠ 0 (process out-of-control)

Example 2: Suppose we want to estimate the difference between the mean weights of all participants before (μ1) and after (μ2) a weight loss program. To accomplish this, suppose we take a sample of 40 participants and measure their weights before and after the completion of this program.
Statement 1 (Null): μ1 = μ2 (No weight loss)
Statement 2 (Alternative): μ2 < μ1 (Significant weight loss)

Consider the following situation: Previous records state that the birth weights of babies in England are normally distributed with a mean of 3000g and a standard deviation of 500g. We think that babies in Australia may have a mean birth weight greater than 3000g, and we would like to test this hypothesis.
Convert the research question to null and alternative hypotheses
• The null hypothesis (H0) is a claim of “no difference in the population”
• The alternative hypothesis (Ha) claims “H0 is false”
• Collect data and seek evidence against H0 as a way of bolstering Ha (deduction)

Null hypothesis, denoted H0
– Assuming that H0 is true, there is no difference between the parameters of the two populations
– The mean birth weight is equal to 3000g. H0: μ = 3000g

Alternative hypothesis, denoted H1
– We reject H0 and say there is a difference between the populations if the difference between the sample statistics is large enough
– The mean birth weight of Australian babies is greater than 3000g. H1: μ > 3000g

Once we have set up our null and alternative hypotheses, we can collect a sample of data. Imagine that we take a sample of 44 babies from Australia, measure their birth weights, and observe that the sample mean of these 44 weights is \(\bar{x}\) = 3275.955g.

Now we want to calculate the probability of obtaining a sample with a mean as large as 3275.955g by chance, under the assumption that the null hypothesis H0 is true.

We know from the previous lecture that if X1, X2, ..., Xn are independent and identically distributed random variables from N(μ, σ), then
\[ \bar{x} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \]

Now we can calculate the probability of obtaining a sample with a mean as large as 3275.955g using standardization:
\[ z_{stat} = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} \]
where \(\mu_0\) is the population mean assuming H0 is true, and \(\sigma_{\bar{x}} = \sigma/\sqrt{n} = 500/\sqrt{44} = 75.378\).

Here \(z_{stat} = (3275.955 - 3000)/75.378 \approx 3.66\), and the probability we want is P(Z > zstat).
This probability is called the p-value of the test. In this case the p-value is very low. But how low does this probability have to be before we can conclude that the null hypothesis is false?

Convention: choose a level of significance α as the threshold for a significant difference (α = 0.01 or α = 0.05). We conclude that there is significant evidence against the null hypothesis if the p-value is less than or equal to the chosen α (0.01 or 0.05).

We have to look at the tail of the standard normal distribution beyond the zstat.
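The standardization step for the baby weight example can be sketched in Python. This is a minimal illustration and not part of the original lecture: the function name `one_sample_z_right_tail` is hypothetical, the normal CDF comes from `statistics.NormalDist` in the standard library, and the sketch assumes σ is known and the alternative is right-tailed (H1: μ > μ0).

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_right_tail(sample_mean, mu0, sigma, n):
    """Return (z_stat, p_value) for H0: mu = mu0 against H1: mu > mu0.

    Assumes the population standard deviation sigma is known.
    (Hypothetical helper written for this illustration.)
    """
    standard_error = sigma / sqrt(n)  # sigma_xbar = sigma / sqrt(n)
    z_stat = (sample_mean - mu0) / standard_error
    # Right-tail area of the standard normal distribution beyond z_stat
    p_value = 1.0 - NormalDist().cdf(z_stat)
    return z_stat, p_value


# Baby weight example: mu0 = 3000 g, sigma = 500 g, n = 44, sample mean = 3275.955 g
z_stat, p_value = one_sample_z_right_tail(3275.955, 3000, 500, 44)
alpha = 0.01
print(f"z = {z_stat:.2f}, p = {p_value:.5f}, reject H0: {p_value <= alpha}")
```

For a left-tailed alternative the p-value would instead be `NormalDist().cdf(z_stat)`, and for a two-sided alternative it would be twice the smaller tail area.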
Convert z statistics to P-value:
• For Ha: μ > μ0, p = P(Z > zstat) = right tail beyond zstat
• For Ha: μ < μ0, p = P(Z < zstat) = left tail beyond zstat
• For Ha: μ ≠ μ0, p = 2 × one-tailed p-value
Use the Table to find these probabilities.

In the baby weight example, the test is one-sided (H1: μ > 3000g) and the p-value is about 0.0001, which is lower than α = 0.01. In this case, we conclude that “there is significant evidence against the null hypothesis at the 0.01 level.” Another way of saying this is that “we reject the null hypothesis at the 0.01 level”.

In the 1970s, 20–29 year old men in the U.S. had a mean body weight μ of 170 pounds. The standard deviation σ was 40 pounds. We test whether mean body weight in the population now differs.
Null hypothesis H0: μ = 170 (“no difference”)
Alternative hypothesis Ha: μ ≠ 170 (two-sided test)

Sampling distribution of \(\bar{x}\) under H0: µ = 170 for n = 64:
\[ \bar{x} \sim N(170, 5) \]

For the illustrative example, μ0 = 170. We know σ = 40. Take a random sample of n = 64. Therefore
\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{40}{\sqrt{64}} = 5 \]

If we found a sample mean of 173, then:
\[ z_{stat} = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{173 - 170}{5} = 0.60 \]
If we found a sample mean of 185, then:
\[ z_{stat} = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{185 - 170}{5} = 3.00 \]

One-sided p-value: area under the curve (AUC) in the tail beyond zstat
Two-sided p-value: consider potential deviations in both directions, so double the one-sided p-value
• For zstat = 3.00, one-sided p = 0.0013, so two-sided p = 2 × 0.0013 = 0.0026.
• For zstat = 0.60, one-sided p = 0.2743, so two-sided p = 2 × 0.2743 = 0.5486.

If we choose α = 0.05:
• For zstat = 3.00, two-sided p = 0.0026. The two-sided p-value is less than 0.05, so we reject the null hypothesis and say that the mean weight of 20–29 year old men has increased significantly since the 1970s.
• For zstat = 0.60, two-sided p = 0.5486. The two-sided p-value is greater than 0.05, so we fail to reject the null hypothesis: there is no statistically significant difference between the mean weight of 20–29 year old men in the 1970s and now.

1. Make assumptions and meet test requirements
2. Define the null and alternative hypothesis
3. Select the sampling distribution and establish the critical region
4. Compute the test statistic
5. Make a decision and interpret the test results

Let X represent Wechsler Adult Intelligence Scale (WAIS) scores
◦ Typically, X ~ N(100, 15)
◦ Take a simple random sample (SRS) of n = 9 from the Lake Wobegon population (“where all the women are strong, all the men are good-looking, and all the children are above average”)
◦ Data: {116, 128, 125, 119, 89, 99, 105, 116, 118}
◦ Calculate: \(\bar{x}\) = 112.8
Does the sample mean provide strong evidence that the population mean μ > 100?

Hypotheses:
H0: µ = 100
Ha: µ > 100 (one-sided)
Ha: µ ≠ 100 (two-sided)

Test statistic:
\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{9}} = 5 \]
\[ z_{stat} = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}} = \frac{112.8 - 100}{5} = 2.56 \]

One-sided test (Ha: µ > 100):
p-value: p = P(Z ≥ 2.56) = 0.0052
For α = 0.05, p < α (0.0052 < 0.05), so it is unlikely that the sample came from this null distribution. We reject the null hypothesis H0.

Two-sided test (Ha: µ ≠ 100):
Ha considers random deviations “up” and “down” from μ0, that is, the tails above and below ±zstat.
Thus, two-sided p = 2 × 0.0052 = 0.0104
Two-sided p < α (0.0104 < 0.05). We reject the null hypothesis H0.

In general, comparing two population means is the way to show that one population is different from, or better than, another.
Examples
◦ Competing Companies / Products
◦ Treatment vs. No treatment
◦ New method vs. Old method

The same five steps listed above apply when comparing two means.

Case study: Safety of drinking water (Arizona Republic, May 27, 2001)
Water was sampled from 10 communities in Phoenix and 10 communities in rural Arizona.
Determine whether the arsenic (As) concentration differs between these two areas.
Asking “Is there a difference between the mean arsenic levels (μ1 and μ2) in these two areas?” is equivalent to testing whether μ1 − μ2 is different from 0.

Two independent populations
Question: Do men and women differ significantly in their support of gun control?
For men (Sample 1)
– Mean \(\bar{x}_1\) = 6.2
– Standard deviation s1 = 1.3
– Sample size N1 = 324
For women (Sample 2)
– Mean \(\bar{x}_2\) = 6.5
– Standard deviation s2 = 1.4
– Sample size N2 = 317

Three conditions:
1. The two samples are independent
2. The standard deviations σ1 and σ2 of the two populations are known
3. At least one of the following conditions is fulfilled:
   i. Both samples are large (i.e., n1 ≥ 30 and n2 ≥ 30)
   ii. If either one or both sample sizes are small, then both populations from which the samples are drawn are normally distributed
So the Central Limit Theorem applies and we can use the standard normal distribution (Z). In the gun control example σ1 and σ2 are not known; because both samples are large, the sample standard deviations s1 and s2 are used in their place.

Null hypothesis, H0: μ1 = μ2
– The null hypothesis asserts there is no difference between the 2 populations
Alternative hypothesis, H1: μ1 ≠ μ2
– The research hypothesis contradicts H0 and asserts there is a difference between the populations

When the 3 conditions are satisfied, the sampling distribution of \(\bar{x}_1 - \bar{x}_2\) is (approximately) normal, with mean and standard deviation as follows:
\[ \mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2 \quad \text{and} \quad \sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]

Significance level
– Alpha (α) = 0.05 (two-tailed)
– If H0 is true, the decision to reject it has only a 0.05 probability of being incorrect
– Z(critical) = ±1.96
– If the probability (p-value) is less than 0.05, Z(obtained) will be beyond Z(critical)

Sample outcomes
– Standard error of the difference: \(\sigma_{\bar{x}_1 - \bar{x}_2}\) = 0.1067
– Z(obtained) = –2.80
– This is beyond Z(critical) = ±1.96
– The obtained z score falls in the critical region
– Therefore, H0 is rejected
The difference between men and women is statistically significant: we can conclude (at the 0.05 significance level) that a difference exists between men and women in their support of gun control.

According to the Kaiser Family Foundation surveys in 2011 and 2010, the average premium for health insurance for family coverage was $15,073 in 2011 and $13,770 in 2010, and the standard deviations were $2,160 and $1,990, respectively.
Suppose that these averages were based on random samples of 250 and 200 employees who had such health insurance plans for 2011 and 2010, respectively. Test at the 1% significance level whether the population means for the two years are different.

Step 1: The population standard deviations, σ1 and σ2, are known. Both samples are large: n1 > 30 and n2 > 30. Therefore, we use the normal distribution to perform the hypothesis test.

Step 2:
H0: μ1 – μ2 = 0 (The two population means are not different)
H1: μ1 – μ2 ≠ 0 (The two population means are different)

Prem Mann, Introductory Statistics, 8/E Copyright © 2013 John Wiley & Sons. All rights reserved.

Step 3: α = 0.01. The ≠ sign in the alternative hypothesis indicates that the test is two-tailed.
Area in each tail = α / 2 = 0.01 / 2 = 0.005
The critical values of z are 2.58 and –2.58

Step 4:
\[ \sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{(2160)^2}{250} + \frac{(1990)^2}{200}} = \$196.1196 \]
\[ z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{x}_1 - \bar{x}_2}} = \frac{(\$15{,}073 - \$13{,}770) - 0}{196.1196} = 6.64 \]
The 0 in the numerator is the value of μ1 – μ2 from H0.

Step 5: Because the value of the test statistic z = 6.64 falls in the rejection region, we reject the null hypothesis H0. Therefore, we conclude that the average annual premiums for employer-sponsored health insurance for family coverage were different for 2011 and 2010.
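Steps 3 to 5 of the premium example can be checked with a short Python sketch. This is an illustration under stated assumptions rather than the textbook's own procedure: the function name `two_sample_z_test` is hypothetical, and the sketch assumes independent samples, known population standard deviations, and a two-tailed alternative. It uses only `statistics.NormalDist` from the standard library.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(mean1, mean2, sigma1, sigma2, n1, n2, alpha=0.05):
    """Two-tailed z-test of H0: mu1 - mu2 = 0 for independent samples.

    Returns (standard_error, z_stat, z_critical, p_value, reject_h0).
    (Hypothetical helper written for this illustration.)
    """
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    # Standard deviation of the sampling distribution of xbar1 - xbar2
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    # The hypothesized difference mu1 - mu2 is 0 under H0
    z_stat = ((mean1 - mean2) - 0.0) / standard_error
    # Positive critical value leaving alpha / 2 in each tail
    z_critical = std_normal.inv_cdf(1.0 - alpha / 2.0)
    p_value = 2.0 * (1.0 - std_normal.cdf(abs(z_stat)))
    reject_h0 = abs(z_stat) > z_critical
    return standard_error, z_stat, z_critical, p_value, reject_h0


# Health insurance premiums, 2011 (sample 1) vs 2010 (sample 2), alpha = 0.01
se, z, z_crit, p, reject = two_sample_z_test(
    15073, 13770, 2160, 1990, 250, 200, alpha=0.01
)
print(f"SE = {se:.4f}, z = {z:.2f}, critical z = ±{z_crit:.2f}, reject H0: {reject}")
```

The same function can be called with the gun control figures (6.2, 6.5, 1.3, 1.4, 324, 317) at α = 0.05, with the sample standard deviations standing in for σ1 and σ2; it gives a z of about –2.8, beyond the critical value of ±1.96.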