Common Statistical Mistakes

Mistake #1
• Failing to investigate data for data entry or recording errors.
• Failing to graph data and calculate basic descriptive statistics before analyzing data.

Example: Wrong Decision Due to Error

Test of mu = 26.000 vs mu not = 26.000

Variable   N    Mean     StDev   SE Mean   T       P       95.0% CI
With       16   25.625   3.964   0.991     -0.38   0.71    (23.513, 27.737)
Without    15   24.733   1.792   0.463     -2.74   0.016   (23.741, 25.725)

Mistake #2
• Using the wrong statistical procedure in analyzing your data.
• Includes failing to check that necessary assumptions are met.

Example: Wrong Decision Due to Wrong Analysis

Pulse Rates Before and After Marching

Student    1    2    3    4
BEFORE     60   56   90   78
AFTER      78   66   96   88
DIFF A-B   18   10   6    10

Paired Data Design, so analyze with Paired t-test.

Paired T for AFTER - BEFORE

             N   Mean    StDev   SE Mean
AFTER        4   82.00   12.96   6.48
BEFORE       4   71.00   15.87   7.94
Difference   4   11.00   5.03    2.52

95% CI for mean difference: (2.99, 19.01)
T-Test of mean difference = 0 (vs not = 0): T-Value = 4.37  P-Value = 0.02

Conclude mean pulse rate after is greater than mean pulse rate before.

Two sample T for AFTER vs BEFORE

         N   Mean   StDev   SE Mean
AFTER    4   82.0   13.0    6.5
BEFORE   4   71.0   15.9    7.9

95% CI for mu AFTER - mu BEFORE: (-15.3, 37.3)
T-Test mu AFTER = mu BEFORE (vs not =): T = 1.07  DF = 5  P = 0.33

Conclude no difference in mean pulse rates before and after marching.

Mistake #3
• Failing to design your study so that it has high enough power to call meaningful differences “significantly different.”
• Includes concluding that the null hypothesis is true. Should be “not enough evidence to say the null is false.”

Example: Low Power

Success = Yes, I recycle.

Gender   X    N    Sample p
Male     33   59   0.559322
Female   54   79   0.683544

Estimate for p(1) - p(2): -0.124222
95% CI for p(1) - p(2): (-0.287215, 0.0387704)
Test for p(1) - p(2) = 0 (vs not = 0): Z = -1.49  P-Value = 0.135

A number of students said that they were surprised that the hypothesis test said “no difference in percentages.”

Example: Low Power

Power and Sample Size
Test for Two Proportions
Testing proportion 1 = proportion 2 (versus not =)
Calculating power for: proportion 1 = 0.55 and proportion 2 = 0.70
Alpha = 0.05  Difference = -0.15

Sample Size*   Power
60             0.4366
70             0.4911
80             0.5421

* Sample size = number in EACH group

Mistake #4
• Failing to report a confidence interval as well as the P-value.
• The P-value tells you whether a result is statistically significant.
• The confidence interval tells you what the population value might be.

Example: A Significant, but Potentially Meaningless Difference

Two sample T for Phone

Gender   N    Mean   StDev   SE Mean
Male     59   79     162     21
Female   80   153    247     28

95% CI for mu (1) - mu (2): (-142, -5)
T-Test mu (1) = mu (2) (vs not =): T = -2.11  P = 0.036  DF = 135

The P-value tells us there is a significant difference, but the confidence interval tells us that the difference in the averages could be as small as 5 minutes.

Incidentally: Removing Outliers

Two sample T for Phone

Gender   N    Mean   StDev   SE Mean
Male     58   59.9   66.5    8.7
Female   79   129    133     15

95% CI for mu (1) - mu (2): (-103.7, -35)
T-Test mu (1) = mu (2) (vs not =): T = -4.02  DF = 121  P = 0.0001

The difference in male and female phone usage becomes even more significant. We are 95% confident that the difference in the averages is now at least 35 minutes.

Mistake #5
• “Fishing” for significant results. That is, performing several hypothesis tests on a data set and reporting only those results that are significant.
• If α = P(Type I error) = 0.05, and we perform 20 tests on the same data set when every null hypothesis is true, we can expect to make 1 Type I error on average (0.05 × 20 = 1).
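
The arithmetic behind the fishing problem can be checked with a short sketch. The function names below are illustrative, not from the slides. The expected count assumes every null hypothesis is true; the "at least one error" probability additionally assumes the 20 tests are independent, which tests run on a single data set often are not.

```python
def expected_type_i_errors(alpha, n_tests):
    """Expected number of false rejections when every null hypothesis is true."""
    return alpha * n_tests


def prob_at_least_one_type_i(alpha, n_tests):
    """Chance of one or more false rejections, ASSUMING independent tests."""
    return 1 - (1 - alpha) ** n_tests


alpha = 0.05   # significance level used for each test
n_tests = 20   # number of hypothesis tests run on the data set

expected = expected_type_i_errors(alpha, n_tests)
at_least_one = prob_at_least_one_type_i(alpha, n_tests)

print(f"Expected Type I errors: {expected:.2f}")
print(f"P(at least one Type I error): {at_least_one:.3f}")
```

Under these assumptions the chance of at least one spurious “significant” result in 20 tests is roughly 64%, which is why reporting only the significant tests is misleading.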

Example: Results Obtained from Fishing
• Primary driver of $10,000 vehicle and going away for Spring Break are related (P = 0.01).
• Virginity and supporting self through school are related (P = 0.045).
• Virginity and graduating in four years are related (P = 0.041).
• Virginity and attending non-football PSU sports events are related (P = 0.016).

Mistake #6
• Overstating the results of an observational study.
  – That is, suggesting that one variable “caused” the differences in the other variable.
  – As opposed to correctly saying that the two variables are “associated” or “correlated.”
• Don’t forget that a significant result may be “spurious.”

Example: Misleading Headlines
• Virgins don’t support themselves through school.
• Non-virgins too busy to go to non-football PSU sporting events.
• Non-virgins also too busy to graduate in four years.

Mistake #7
• Using a non-random or unrepresentative sample.
• Includes extending the results of an unrepresentative sample to the population.

Example: Unrepresentative Sample
• Shere Hite wrote a book in 1987 called “Women and Love.”
• 100,000 questionnaires about love, sex, and relationships were sent to women’s groups. Only 4,500 questionnaires were returned (a 4.5% response rate).
• Entire book devoted to results of survey.
• Examples: 91% of divorcees initiated the divorce; 70% of women married 5 years committed adultery.

Mistake #8
• Failing to use all of the basic principles of experiments, including randomization, blinding, and controlling.
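
Randomization, the first principle named under Mistake #8, can be sketched in a few lines. This is a minimal illustration only: the subject IDs, the even two-group split, and the function name `randomize` are all assumptions made for the example, not part of the slides.

```python
import random


def randomize(subjects, seed=None):
    """Randomly split subjects into a treatment group and a control group."""
    rng = random.Random(seed)   # a seeded generator makes the assignment reproducible
    shuffled = list(subjects)   # copy so the caller's list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical subject IDs S01..S20
subjects = [f"S{i:02d}" for i in range(1, 21)]
treatment, control = randomize(subjects, seed=1)

print("Treatment:", sorted(treatment))
print("Control:  ", sorted(control))
```

Letting chance decide who receives the treatment keeps the experimenter's judgment, and any lurking variable tied to it, from determining group membership.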