Hypothesis Testing – Examples and Case Studies

Steps in Hypothesis Testing
1. Determine the null hypothesis and the alternative hypothesis.
2. Collect and summarize the data into a test statistic.
3. Use the test statistic to determine the p-value.
4. The result is statistically significant if the p-value is less than or equal to the level of significance.
5. Report your conclusions in terms of the original hypothesis.

Inferences about means
•Now that we know how to test hypotheses about proportions, it’d be nice to be able to do the same for means.
•Just as we did before, we will base both our confidence interval and our hypothesis test on the sampling distribution model.
•The Central Limit Theorem told us that the sampling distribution model for means is Normal, with mean equal to the population mean and standard deviation (SD):
SD(sample mean) = population standard deviation / square root of (sample size)
•All we need is a random sample of quantitative data and the true population standard deviation. That’s the problem: the true population standard deviation is almost never known.
•Proportions have a link between the proportion value and the standard deviation of the sample proportion.
•This is not the case with means: knowing the sample mean tells us nothing about the SD of the sample mean.

We’ll do the best we can: estimate the population standard deviation with the sample standard deviation. Our resulting standard error of the sample mean (SEM) is:
SEM = sample standard deviation / square root of (sample size)

Reminder: Conditions for Rule for Sample Means
•Population of measurements is bell-shaped, and a random sample of any size is measured.
OR
•Population of measurements of interest is not bell-shaped, but a large random sample is measured. A sample of size 40 is considered “large,” but if there are extreme outliers, it is better to have a larger sample.
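As a minimal sketch of the SEM formula above, the Python snippet below estimates the standard error of the sample mean from raw data. The function name standard_error_of_mean and the data values are hypothetical, chosen only for illustration. statistics.stdev computes the sample standard deviation (dividing by n - 1).

```python
import math
import statistics


def standard_error_of_mean(sample):
    """Estimate the SEM: sample standard deviation / sqrt(sample size)."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    return statistics.stdev(sample) / math.sqrt(n)


# Hypothetical measurements, not taken from the text.
readings = [120.0, 131.0, 118.0, 127.0, 135.0, 124.0, 129.0, 122.0]
sem = standard_error_of_mean(readings)
print(f"sample mean = {statistics.mean(readings):.2f}, SEM = {sem:.2f}")
```

Because the sample standard deviation stands in for the unknown population value, the result is an estimate of the SD of the sample mean, not the SD itself.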
Example: Executives' blood pressures
The National Center for Health Statistics reports that the mean systolic blood pressure for males 35 to 44 years of age is 128. The medical director of a large company looks at the medical records of 72 executives in this age group and finds that the mean systolic blood pressure in this sample is 126.1 and that the sample standard deviation is 15.2. Is this evidence that the company's executives have a different mean blood pressure from the general population?

The hypotheses: The null hypothesis is "no difference" from the national mean. The alternative is two-sided. Why? Because the question asks only whether the executives differ from the national mean, not in which direction. So the hypotheses about the unknown population mean of the executive population are
Null hypothesis: population mean equal to 128
Alternative hypothesis: population mean not equal to 128

The sampling distribution: If the null hypothesis is true, the sample mean comes from a Normal distribution with mean equal to 128. The standard error of the sample mean (SEM) is:
SEM = sample standard deviation / square root of (sample size) = 15.2 / square root of (72) = 1.79

The data: The sample mean is 126.1. The standardized score (z-score) for this outcome is
standardized score (z-score) = (observation – null value) / standard error = (126.1 – 128) / 1.79 = -1.06

The P-value: The figure locates the sample outcome -1.06 (in the standard scale) on the Normal curve that represents the sampling distribution if the null hypothesis is true. The two-sided P-value is the probability of an outcome at least this far out in either direction. In Table 8.1, the closest standardized score is -1.08. This is the 14th percentile of a Normal distribution, so the area to the left of -1.08 is 0.14. The area to the left of -1.08 plus the area to the right of 1.08 is double this, equal to 0.28. This is our approximate P-value. (The exact P-value, from software, is P = 0.289.)

The conclusion? A P-value of 0.289 is well above any conventional level of significance (such as 0.05), so the data do not give good evidence that the executives' mean blood pressure differs from the national mean.
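The arithmetic in the blood pressure example can be checked with a short Python sketch. The function names normal_cdf and one_sample_z_test are hypothetical, introduced here for illustration. The Normal curve area is computed from the standard library's math.erfc instead of a printed table, so the p-value corresponds to the software value, not the table approximation.

```python
import math


def normal_cdf(z):
    """Area under the standard Normal curve to the left of z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def one_sample_z_test(sample_mean, null_value, sample_sd, n):
    """Return (standard error, z-score, two-sided p-value)."""
    sem = sample_sd / math.sqrt(n)
    z = (sample_mean - null_value) / sem
    # Two-sided: outcomes at least this far out in either direction.
    p_two_sided = 2 * normal_cdf(-abs(z))
    return sem, z, p_two_sided


# Numbers from the executives' blood pressure example.
sem, z, p = one_sample_z_test(sample_mean=126.1, null_value=128,
                              sample_sd=15.2, n=72)
print(f"SEM = {sem:.2f}, z = {z:.2f}, two-sided p-value = {p:.3f}")
```

Rounded to the precision used in the example, this should agree with SEM = 1.79, z = -1.06, and P = 0.289.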
Comparing Two Means
This time the parameter of interest is the difference between the two population means.
For independent random quantities, variances add. So the standard error (SE), or measure of variability, of the difference between two sample means is
SE(Sample 1 Mean – Sample 2 Mean) = square root of [(SEM1)^2 + (SEM2)^2]
where
SEM1 = sample 1 standard deviation / square root of (sample 1 size)
SEM2 = sample 2 standard deviation / square root of (sample 2 size)
Note: Rules for Sample Means apply for comparing two means.
This measure of variability is known as the Standard Error of the Difference (SED):
SED = square root of [(SEM1)^2 + (SEM2)^2]

Null hypothesis: Population mean difference is zero.
Alternative hypothesis: Population mean difference is not zero.
The sampling distribution: If the null hypothesis is true, the difference between the two sample means comes from a Normal distribution with mean equal to zero and standard error equal to the SED.

Example: Weight Loss for Diet vs Exercise
Determine the null and alternative hypotheses.
Null hypothesis: No difference in average fat lost in population for two methods. Population mean difference is zero.
Alternative hypothesis: There is a difference in average fat lost in population for two methods. Population mean difference is not zero.

Collect and summarize data into a test statistic.
The sample mean difference (Diet – Exercise) = 5.9 – 4.1 = 1.8 kg, and the standard error of the difference is 0.83.
standardized score = (observation – null value) / standard error = (sample mean difference – null value) / standard error of the difference = (1.8 – 0) / 0.83 = 2.17

Determine the p-value.
The alternative hypothesis was two-sided.
p-value = 2 × [proportion of bell-shaped curve above 2.17]
From Table 8.1, the proportion above 2.17 is about 0.015, so the p-value is about 2 × 0.015 = 0.03.
95% confidence interval for the difference: 1.8 ± 2(0.83), or about 0.14 kg to 3.46 kg.

Example. Cholesterol.
A randomized controlled double-blind experiment was performed to demonstrate the efficacy of a drug called "cholestyramine" in reducing blood cholesterol levels and preventing heart attacks.
•There were 3,806 subjects, all middle-aged men at high risk of heart attack.
•1,906 were randomly assigned to the treatment group and the remaining 1,900 to the control group.
•The subjects were followed for 7 years.
•The drug did reduce the cholesterol level in the treatment group (by about 8%).
•Furthermore, there were 155 heart attacks in the treatment group and 187 in the control group: 8.1% versus 9.8%, z = -1.8, P = 0.035 (one-tailed).
•This was called "strong evidence" that cholestyramine helps prevent heart attacks.

How Journals Present Tests
Mozart, Relaxation, and Performance on Spatial Tasks
There were three listening conditions (Mozart, a relaxation tape, and silence), and all subjects participated in all three conditions.
Null hypothesis: No differences in population mean spatial reasoning IQ scores after each of three listening conditions.
Alternative hypothesis: Population mean spatial reasoning IQ scores do differ for at least one of the conditions compared with the others.
“A one-factor (listening condition) repeated measures analysis of variance … revealed that subjects performed better on the abstract/spatial reasoning tests after listening to Mozart than after listening to either the relaxation tape or to nothing (F[2,35] = 7.08, p = 0.002).”
“The music condition differed significantly from both the relaxation and silence conditions (t = 3.41, p = 0.002; t = 3.67, p = 0.0008, two-tailed, respectively). The relaxation and silence conditions did not differ (t = 0.795, p = 0.432, two-tailed).”

Example: Weight Loss - Diet versus Exercise
The test statistic of 2.17, presented earlier as a z-test statistic, is actually a t-test statistic, because the sample standard deviations were used in the computation.
The degrees of freedom (df) for the t-test are (42 + 47 – 2) = 87. Excel: TDIST(2.17,87,2) gives a p-value of 0.0327, very close to the 0.0300 found using the standard normal curve (z-curve).

Text Questions
Study: Alterations in Brain and Immune Function Produced by Mindfulness Meditation
Q16. The participants were right-handed volunteers at a biotechnology company in Madison, Wisconsin, in the Midwestern United States. However, for the relationship studied in this randomized experiment, there should be nothing unusual about them, so the results probably apply to adults who would volunteer for a study like this one.
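Returning to the diet versus exercise comparison, the t-based and z-based p-values quoted for the test statistic 2.17 can be reproduced without a spreadsheet. This is a rough sketch, not the method Excel uses: the t-curve area is approximated by Simpson's rule applied to the t density, using only the Python standard library. The function names normal_two_sided_p, t_pdf, and t_two_sided_p are hypothetical.

```python
import math


def normal_two_sided_p(z):
    """Two-sided p-value from the standard Normal curve."""
    return math.erfc(abs(z) / math.sqrt(2))


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on the t density from 0 to |t|."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3  # area under the curve between 0 and |t|
    return 2 * (0.5 - central_half)


df = 42 + 47 - 2  # degrees of freedom from the two sample sizes
print(f"z-based p-value: {normal_two_sided_p(2.17):.4f}")
print(f"t-based p-value (df = {df}): {t_two_sided_p(2.17, df):.4f}")
```

With 87 degrees of freedom the t curve is already close to the Normal curve, which is why the two p-values differ only in the third decimal place.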