Common Statistical Mistakes

Mistake #1
• Failing to investigate data for data entry or
recording errors.
• Failing to graph data and calculate basic
descriptive statistics before analyzing data.
Example: Wrong Decision Due to Error
Test of mu = 26.000 vs mu not = 26.000

Variable   N    Mean    StDev  SE Mean      T      P   95.0% CI
With      16  25.625    3.964    0.991  -0.38   0.71   (23.513, 27.737)
Without   15  24.733    1.792    0.463  -2.74  0.016   (23.741, 25.725)
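The SE Mean and T columns can be checked from the summary statistics alone. The sketch below assumes the With and Without rows differ only by one suspect observation (the output shows N dropping from 16 to 15 but does not say so explicitly). The helper name `one_sample_t` is ours for illustration, not a Minitab command.

```python
import math

def one_sample_t(mean, stdev, n, mu0):
    """Return (standard error, t statistic) for a one-sample t-test."""
    se = stdev / math.sqrt(n)
    return se, (mean - mu0) / se

# Summary statistics copied from the output above; 26.0 is the hypothesized mean.
se_with, t_with = one_sample_t(25.625, 3.964, 16, 26.0)
se_without, t_without = one_sample_t(24.733, 1.792, 15, 26.0)

print(f"With:    SE = {se_with:.3f}, T = {t_with:.2f}")
print(f"Without: SE = {se_without:.3f}, T = {t_without:.2f}")
```

Dropping a single value cuts the standard deviation by more than half, which is enough to move the test from P = 0.71 to P = 0.016.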
Mistake #2
• Using the wrong statistical procedure in
analyzing your data.
• Includes failing to check that necessary
assumptions are met.
Example: Wrong Decision Due to Wrong Analysis
Pulse Rates Before and After Marching

Student  BEFORE  AFTER  DIFF (AFTER - BEFORE)
1            60     78     18
2            56     66     10
3            90     96      6
4            78     88     10

Paired data design, so analyze with a paired t-test.
Paired T for AFTER - BEFORE

             N   Mean  StDev  SE Mean
AFTER        4  82.00  12.96     6.48
BEFORE       4  71.00  15.87     7.94
Difference   4  11.00   5.03     2.52

95% CI for mean difference: (2.99, 19.01)
T-Test of mean difference = 0 (vs not = 0): T-Value = 4.37  P-Value = 0.02
Conclude mean pulse rate after is greater than mean pulse rate before.
Two sample T for AFTER vs BEFORE

         N  Mean  StDev  SE Mean
AFTER    4  82.0   13.0      6.5
BEFORE   4  71.0   15.9      7.9

95% CI for mu AFTER - mu BEFORE: (-15.3, 37.3)
T-Test mu AFTER = mu BEFORE (vs not =): T = 1.07  P = 0.33  DF = 5
Conclude no difference in mean pulse rates before and after marching.
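Both analyses can be reproduced from the four pairs of pulse rates. This is a minimal sketch using only the Python standard library; the variable names are ours, and the two-sample version uses the unpooled (Welch) standard error, which is the choice that matches the T and DF values in the output above.

```python
import math
import statistics

before = [60, 56, 90, 78]
after = [78, 66, 96, 88]

# Paired analysis: a one-sample t-test on the within-student differences.
diffs = [a - b for a, b in zip(after, before)]
se_paired = statistics.stdev(diffs) / math.sqrt(len(diffs))
t_paired = statistics.mean(diffs) / se_paired

# Two-sample analysis: wrongly treats the two columns as independent groups.
var_a = statistics.variance(after) / len(after)
var_b = statistics.variance(before) / len(before)
se_two = math.sqrt(var_a + var_b)
t_two = (statistics.mean(after) - statistics.mean(before)) / se_two

# Welch-Satterthwaite degrees of freedom; the output above shows this truncated.
df_two = (var_a + var_b) ** 2 / (
    var_a ** 2 / (len(after) - 1) + var_b ** 2 / (len(before) - 1)
)

print(f"Paired:     SE = {se_paired:.2f}, T = {t_paired:.2f}")
print(f"Two-sample: SE = {se_two:.2f}, T = {t_two:.2f}, DF = {int(df_two)}")
```

The mean difference is 11 in both cases. Only the standard error changes: pairing removes the large student-to-student variation, so the same difference is divided by about 2.5 instead of about 10.2.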
Mistake #3
• Failing to design your study so that it has
high enough power to call meaningful
differences “significantly different.”
• Includes concluding that the null hypothesis
is true. Should be “not enough evidence to
say the null is false.”
Example: Low Power
Success = Yes, I recycle.

Gender   X   N  Sample p
Male    33  59  0.559322
Female  54  79  0.683544

Estimate for p(1) - p(2): -0.124222
95% CI for p(1) - p(2): (-0.287215, 0.0387704)
Test for p(1) - p(2) = 0 (vs not = 0): Z = -1.49  P-Value = 0.135
A number of students said that they were surprised that the
hypothesis test said “no difference in percentages.”
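The estimate, interval, Z, and P-value above can be recomputed from the counts. This sketch assumes the unpooled standard error for both the test and the interval, since that choice reproduces the reported numbers; the variable names are ours.

```python
import math
from statistics import NormalDist

x1, n1 = 33, 59   # males who recycle, males sampled
x2, n2 = 54, 79   # females who recycle, females sampled
p1, p2 = x1 / n1, x2 / n2
diff = p1 - p2

# Unpooled standard error of the difference in sample proportions.
se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
z = diff / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# 95% confidence interval for p(1) - p(2).
z_crit = NormalDist().inv_cdf(0.975)
ci = (diff - z_crit * se, diff + z_crit * se)

print(f"Estimate = {diff:.6f}, Z = {z:.2f}, P = {p_value:.3f}")
print(f"95% CI = ({ci[0]:.6f}, {ci[1]:.6f})")
```

The interval runs from about -29 to +4 percentage points, so the data are compatible with a sizable difference as well as with none. That is the sense in which the test lacks power.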
Power and Sample Size
Test for Two Proportions
Testing proportion 1 = proportion 2 (versus not =)
Calculating power for:
proportion 1 = 0.55 and proportion 2 = 0.70
Alpha = 0.05  Difference = -0.15

Sample Size*   Power
          60  0.4366
          70  0.4911
          80  0.5421

* Sample size = number in EACH group
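A rough version of this power calculation can be sketched with a simple normal approximation. This is not the formula behind the software output above: the function `approx_power` is a hypothetical helper, and its values come out a few points lower than the table. It leads to the same conclusion, namely that with 60 to 80 students per group the chance of detecting a 0.55 versus 0.70 difference is roughly a coin flip.

```python
import math
from statistics import NormalDist

def approx_power(p1, p2, n, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test, n per group.

    Assumption: unpooled standard error and a normal approximation.
    """
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    se = math.sqrt(p1 * (1 - p1) / n + p2 * (1 - p2) / n)
    shift = abs(p1 - p2) / se
    # Probability that the z statistic lands in either rejection region.
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)

powers = {n: approx_power(0.55, 0.70, n) for n in (60, 70, 80)}
for n, power in powers.items():
    print(f"n per group = {n}: approximate power = {power:.2f}")
```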
Mistake #4
• Failing to report a confidence interval as
well as the P-value.
• P-value tells you if statistically significant.
• Confidence interval tells you what the
population value might be.
Example: A Significant, but Potentially Meaningless Difference
Two sample T for Phone

Gender   N  Mean  StDev  SE Mean
Male    59    79    162       21
Female  80   153    247       28

95% CI for mu (1) - mu (2): (-142, -5)
T-Test mu (1) = mu (2) (vs not =): T = -2.11  P = 0.036  DF = 135
P-value tells us significant difference, but confidence interval tells us
that the difference in the averages could be as small as 5 minutes.
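Reporting the interval alongside the test takes only a few extra lines. The sketch below works from the rounded summary statistics in the output above and, as a stated assumption, substitutes the normal critical value for the t critical value (reasonable with about 135 degrees of freedom). Because of the rounding and that substitution, its results differ slightly from the software output.

```python
import math
from statistics import NormalDist

# Rounded summary statistics from the output above (phone use in minutes).
n_m, mean_m, sd_m = 59, 79, 162
n_f, mean_f, sd_f = 80, 153, 247

diff = mean_m - mean_f
se = math.sqrt(sd_m ** 2 / n_m + sd_f ** 2 / n_f)
t_stat = diff / se

# Assumption: normal critical value used in place of the t critical value.
z_crit = NormalDist().inv_cdf(0.975)
ci = (diff - z_crit * se, diff + z_crit * se)

print(f"Difference = {diff}, T = {t_stat:.2f}")
print(f"Approximate 95% CI = ({ci[0]:.0f}, {ci[1]:.0f})")
```

The test statistic alone says only that the difference is unlikely to be zero. The interval shows how wide the range of plausible differences is, from a few minutes to well over two hours.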
Incidentally… Outliers
Removing Outliers…
Two sample T for Phone

Gender   N  Mean  StDev  SE Mean
Male    58  59.9   66.5      8.7
Female  79   129    133       15

95% CI for mu (1) - mu (2): (-103.7, -35)
T-Test mu (1) = mu (2) (vs not =): T = -4.02  P = 0.0001  DF = 121
The difference in male and female phone usage becomes even more
significant. We are 95% confident that the difference in the
averages is now more than 35 minutes.
Mistake #5
• “Fishing” for significant results. That is,
performing several hypothesis tests on a
data set, and reporting only those results
that are significant.
• If α = P(Type I error) = 0.05, and we perform 20
tests on the same data set, we can expect to
make 1 Type I error even when every null
hypothesis is true (0.05 × 20 = 1).
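The arithmetic behind this bullet, plus the chance of getting at least one false positive, can be sketched as follows. The second figure assumes the 20 tests are independent and that every null hypothesis is true, which is an idealization and not something the survey data guarantee.

```python
alpha = 0.05     # Type I error rate for each individual test
n_tests = 20     # number of hypothesis tests run on the same data set

# Expected number of false positives when every null hypothesis is true.
expected_false_positives = alpha * n_tests

# Assumption: independent tests. Probability that at least one is "significant".
p_at_least_one = 1 - (1 - alpha) ** n_tests

print(f"Expected Type I errors: {expected_false_positives:.1f}")
print(f"P(at least one Type I error): {p_at_least_one:.3f}")
```

Under these assumptions the chance of at least one spurious "significant" result is roughly 64%, so a handful of small P-values from a long list of tests is weak evidence on its own.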
Example:
Results Obtained from Fishing
• Primary driver of $10,000 vehicle and going
away for Spring Break are related (P=0.01).
• Virginity and supporting self through school
are related (P = 0.045).
• Virginity and graduating in four years are
related (P = 0.041).
• Virginity and attending non-football PSU
sports events are related (P = 0.016).
Mistake #6
• Overstating the results of an observational
study.
– That is, suggesting that one variable “caused”
the differences in the other variable.
– As opposed to correctly saying that the two
variables are “associated” or “correlated.”
• Don’t forget that a significant result may be
“spurious.”
Example: Misleading Headlines
• Virgins don’t support themselves through
school.
• Non-virgins too busy to go to non-football
PSU sporting events.
• Non-virgins also too busy to graduate in
four years.
Mistake #7
• Using a non-random or unrepresentative
sample.
• Includes extending the results of an
unrepresentative sample to the population.
Example:
Unrepresentative sample
• Shere Hite wrote a book in 1987 called
“Women and Love.”
• 100,000 questionnaires about love, sex, and
relationships sent to women’s groups. Only
4,500 questionnaires returned.
• Entire book devoted to results of survey.
• Examples: 91% of divorcees initiated the
divorce; 70% of women married 5 years
committed adultery.
Mistake #8
• Failing to use all of the basic principles of
experiments, including randomization,
blinding, and controlling.