Using and reporting measures of effect size

Using and Reporting
Measures of Effect Size
Roger E. Kirk
Department of Psychology & Neuroscience
Baylor University
Three Categories of Measures of
Effect Magnitude
1. Measures of effect size (typically, standardized
mean differences)
2. Measures of strength of association
3. A large category of other kinds of measures
Four Purposes of Measures of
Effect Magnitude
1. Estimate the sample size required to achieve an
acceptable power
2. Integrate the results of empirical research studies
in meta-analyses
3. Supplement the information provided by null
hypothesis significance tests
4. Determine whether research results are
practically significant
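The first purpose above can be illustrated with a rough sample-size calculation. This is a minimal sketch, assuming a two-sided test of two independent group means and the normal approximation n = 2((z_(1-alpha/2) + z_power) / d)^2; the function name n_per_group and the default alpha and power values are illustrative choices, not part of the slides.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate participants needed per group to detect a standardized
    mean difference d (normal approximation, two-sided test)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)


# Hypothetical effect sizes chosen for illustration.
for d in (0.2, 0.5, 0.8):
    print(d, n_per_group(d))
```

Because the approximation ignores the t distribution, exact power software will usually ask for one or two more participants per group.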
Four Criticisms of Null Hypothesis
Significance Testing
1. Answers the wrong question
What we want to know is the probability that the null
hypothesis is true, given our data: p(H0 | D)
Null hypothesis significance testing tells us the
probability of obtaining our data or more extreme data
if the null hypothesis is true: p(D | H0 )
2. Is a trivial exercise
According to John Tukey
“the effects of A and B are always different—in some
decimal place—for any A and B. Thus asking
‘Are the effects different?’ is foolish.”
According to Bruce Thompson
“Statistical testing becomes a tautological search for
enough participants to achieve statistical significance.
If we fail to reject, it is only because we have been too
lazy to drag in enough participants”
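Thompson's point can be sketched numerically: hold a tiny standardized difference fixed and let the sample grow. The sketch below assumes two equal-sized independent groups and uses a normal approximation to the test statistic, z = d * sqrt(n / 2); the function name two_sided_p and the value d = 0.05 are hypothetical.

```python
from math import erfc, sqrt


def two_sided_p(d, n_per_group):
    """Two-sided p value (normal approximation) for an observed
    standardized mean difference d with n_per_group in each group."""
    z = abs(d) * sqrt(n_per_group / 2)
    return erfc(z / sqrt(2))  # equals 2 * (1 - Phi(z))


# The same trivial effect, tested with ever larger samples.
for n in (20, 200, 2000, 20000):
    print(n, two_sided_p(0.05, n))
```

The effect never changes, yet the p value falls below any chosen threshold once n is large enough.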
3. Requires us to make a dichotomous decision from a
continuum of uncertainty
The adoption of .05 as the dividing point between
significance and non-significance is quite arbitrary.
4. Does not address the question of whether results are
important, valuable, or useful: that is, their practical
significance.
Three Basic Questions that Researchers
Want to Answer from Their Research
1. Is an observed effect real or should it be attributed
to chance?
2. If the effect is real, how large is it?
3. Is the effect large enough to be useful?
Recommendation of the APA
Publication Manual
“Because confidence intervals combine information
on location and precision and can often be used to
infer significance levels, they are, in general, the best
reporting strategy . . . Multiple degree-of-freedom
indicators are often less useful than effect-size
indicators that decompose multiple degree-of-freedom
tests into one degree-of-freedom effects . . .”
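Effect size
Cohen’s d
\[ d = \frac{|\mu_E - \mu_C|}{\sigma} \tag{1} \]
Three ways to estimate \(\sigma\)
Cohen d
\[ d = \frac{|\bar{Y}_E - \bar{Y}_C|}{\hat{\sigma}_{pooled}}, \qquad \hat{\sigma}_{pooled} = \sqrt{\frac{(n_E - 1)\hat{\sigma}_E^2 + (n_C - 1)\hat{\sigma}_C^2}{(n_E - 1) + (n_C - 1)}} \]
Glass g′
\[ g' = \frac{|\bar{Y}_E - \bar{Y}_C|}{\hat{\sigma}_C} \]
where \(\hat{\sigma}_C\) is the estimated standard deviation of the control group
Hedges g
\[ g = \frac{|\bar{Y}_E - \bar{Y}_C|}{\hat{\sigma}_{pooled}}, \qquad \hat{\sigma}_{pooled} = \sqrt{\frac{(n_1 - 1)\hat{\sigma}_1^2 + \cdots + (n_p - 1)\hat{\sigma}_p^2}{(n_1 - 1) + \cdots + (n_p - 1)}} \]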
Guidelines for Interpreting d
d = 0.2 is a small effect
d = 0.5 is a medium effect
d = 0.8 is a large effect
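The three estimators of the standardized mean difference can be sketched in a few lines. This is an illustrative sketch, not the author's code: the function names and the three small data sets are hypothetical, and Hedges g follows the definition on the slide (pooling the variances of all p groups in the design) rather than the small-sample bias correction that often goes by the same name.

```python
from math import sqrt
from statistics import mean, variance


def pooled_sd(*groups):
    """Square root of the df-weighted average of the group variances."""
    num = sum((len(g) - 1) * variance(g) for g in groups)
    den = sum(len(g) - 1 for g in groups)
    return sqrt(num / den)


def cohen_d(experimental, control):
    # Pools the variances of the two groups being compared.
    return abs(mean(experimental) - mean(control)) / pooled_sd(experimental, control)


def glass_g_prime(experimental, control):
    # Uses only the control group's standard deviation.
    return abs(mean(experimental) - mean(control)) / sqrt(variance(control))


def hedges_g(experimental, control, *other_groups):
    # Pools the variances of every group in the design.
    sd = pooled_sd(experimental, control, *other_groups)
    return abs(mean(experimental) - mean(control)) / sd


# Hypothetical scores, for illustration only.
experimental = [6, 7, 8, 9, 10]
control = [4, 5, 6, 7, 8]
third_group = [1, 5, 9]

print(cohen_d(experimental, control))
print(glass_g_prime(experimental, control))
print(hedges_g(experimental, control, third_group))
```

With equal variances in the two compared groups, d and g′ coincide; g differs here only because the third group's larger variance enters the pooled estimate.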
Strength of Association
\[ \omega^2,\ \rho_I = \frac{\sigma^2_{treatment}}{\sigma^2_{error} + \sigma^2_{treatment}} \tag{1} \]
Sample estimators of omega squared and the intraclass
correlation
\[ \hat{\omega}^2 = \frac{SS_{treat} - (df_{treat})\,MS_{error}}{SS_{total} + MS_{error}} \]
\[ \hat{\rho}_I = \frac{MS_{treat} - MS_{error}}{MS_{treat} + (n - 1)\,MS_{error}} \]
Guidelines for Interpreting
Omega Squared
ω² = .010 is a small association
ω² = .059 is a medium association
ω² = .138 is a large association
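The sample estimators of omega squared and the intraclass correlation can be computed directly from the one-way ANOVA sums of squares. The sketch below assumes a completely randomized design with equal group sizes n; the function name anova_effect_sizes and the data are hypothetical.

```python
from statistics import mean


def anova_effect_sizes(groups):
    """Return (omega squared estimate, intraclass correlation estimate)
    for a one-way design with equal n per group."""
    p = len(groups)
    n = len(groups[0])
    grand_mean = mean(y for g in groups for y in g)

    ss_treat = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_error = sum((y - mean(g)) ** 2 for g in groups for y in g)
    ss_total = ss_treat + ss_error

    df_treat = p - 1
    df_error = sum(len(g) for g in groups) - p
    ms_treat = ss_treat / df_treat
    ms_error = ss_error / df_error

    omega_sq = (ss_treat - df_treat * ms_error) / (ss_total + ms_error)
    rho_i = (ms_treat - ms_error) / (ms_treat + (n - 1) * ms_error)
    return omega_sq, rho_i


# Hypothetical scores for three treatment levels.
groups = [[4, 5, 6, 7, 8], [6, 7, 8, 9, 10], [8, 9, 10, 11, 12]]
omega_sq, rho_i = anova_effect_sizes(groups)
print(omega_sq, rho_i)
```

For these data SS_treat = 40, SS_error = 30, MS_treat = 20, and MS_error = 2.5, so the estimates are 35/72.5 and 17.5/30.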
Measures of Effect Magnitude
________________________________________________________________
Effect Size               Strength of Association                      Other Measures
________________________________________________________________
Cohen d, f, g, h, q, w    r, r_pb, r², R, R², η, η², η²_mult, φ, φ²    Abs. risk reduction ARR
Glass g′                  Chambers r_e                                 Cliff p
Hedges g                  Cohen f²                                     Cohen U1, U2, U3
Mahalanobis D             Contingency coef C                           Shift function
Mean1 – Mean2             Cramér V                                     Dunlap CL_R
Mdn1 – Mdn2               Fisher Z                                     Grissom PS
Mode1 – Mode2             Friedman r_m                                 Logit d′
Rosenthal Π               Goodman λ, γ                                 McGraw & Wong CL
Tang φ                    ω̂²                                           Odds ratio ω
Thompson d*               ρ̂_I                                          Preece ratio of success
Wilcox Λ̂_Mdn,σb           Herzberg R²                                  Probit d′
                          Kelley ε²                                    Relative risk RR
                                                                       Sánchez-Meca d_Cox
More Measures of Effect Magnitude
________________________________________________________________
Effect Size               Strength of Association                      Other Measures
________________________________________________________________
Wilcox & Muska Q̂_0.632    Kendall W                                    Rosenthal & Rubin BESD
                          Lord R²                                      Rosenthal & Rubin ES_counternull
                          Olejnik ω̂²_G, η̂²_G                           Wilcoxon λ
                          r_equivalent
                          r_alerting
                          r_contrast
                          r_effect size
                          Tatsuoka ω̂²_mult.c
                          Wherry R²
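Two Ways to Estimate the
Denominator of Cohen’s d
\[ d = \frac{|\mu_E - \mu_C|}{\sigma} \]
\[ \hat{\sigma}_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{\frac{\hat{\sigma}^2_{pooled}}{n_1} + \frac{\hat{\sigma}^2_{pooled}}{n_2}} \tag{1} \]
\[ \hat{\sigma}_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{\frac{\hat{\sigma}^2_{pooled}}{n_1} + \frac{\hat{\sigma}^2_{pooled}}{n_2} - 2r\left(\frac{\hat{\sigma}_{pooled}}{\sqrt{n_1}}\right)\left(\frac{\hat{\sigma}_{pooled}}{\sqrt{n_2}}\right)} \tag{2} \]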
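The two expressions differ only in the correlation term, which a short sketch makes concrete. It assumes the form sqrt(s²/n1 + s²/n2 - 2r(s/sqrt(n1))(s/sqrt(n2))), which reduces to the independent-samples expression when r = 0; the function name se_diff and the numbers are hypothetical.

```python
from math import sqrt


def se_diff(sd_pooled, n1, n2, r=0.0):
    """Standard error of the difference between two means.
    r is the correlation between the two sets of scores (0 for independent samples)."""
    var = (sd_pooled ** 2 / n1
           + sd_pooled ** 2 / n2
           - 2 * r * (sd_pooled / sqrt(n1)) * (sd_pooled / sqrt(n2)))
    return sqrt(var)


# Hypothetical values: pooled SD of 10 with 25 scores per condition.
independent = se_diff(10, 25, 25)          # expression (1)
correlated = se_diff(10, 25, 25, r=0.5)    # expression (2)
print(independent, correlated)
```

A positive correlation shrinks the denominator, so the same mean difference yields a larger standardized value under expression (2) than under expression (1).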
Effect of the Unreliability of the
Dependent Variable, Y, On the
Proportion of Explained Variance
\(r_{XY}\) cannot exceed \((r_{XX'})^{1/2}(r_{YY'})^{1/2}\),
where \(r_{XX'}\) and \(r_{YY'}\) are the reliabilities of X and Y
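A small numeric sketch shows how this ceiling limits the proportion of explained variance. The function name max_correlation and the reliability values are hypothetical, chosen only for illustration.

```python
from math import sqrt


def max_correlation(rel_x, rel_y):
    """Upper bound on the X-Y correlation given the reliabilities of X and Y."""
    return sqrt(rel_x) * sqrt(rel_y)


# Hypothetical case: X measured without error, Y with reliability .64.
ceiling_r = max_correlation(1.0, 0.64)
ceiling_r_squared = ceiling_r ** 2  # largest attainable proportion of explained variance
print(ceiling_r, ceiling_r_squared)
```

Under these assumed reliabilities, even a perfect underlying relationship could account for no more than about 64% of the observed variance in Y.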
Double-Blind Study of 22,071
Men Physicians
Aspirin group: p_A = .01259
Placebo group: p_P = .02166
p_A – p_P = .01259 – .02166 = –.009
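The two proportions above are enough to compute several of the "other measures" listed earlier: absolute risk reduction, relative risk, and the odds ratio. The sketch below uses the standard textbook definitions of these quantities; the function name risk_measures is hypothetical.

```python
def risk_measures(p_treated, p_control):
    """Return (absolute risk reduction, relative risk, odds ratio)
    from the event proportions in the treated and control groups."""
    arr = p_control - p_treated
    rr = p_treated / p_control
    odds_ratio = (p_treated / (1 - p_treated)) / (p_control / (1 - p_control))
    return arr, rr, odds_ratio


# Proportions reported on the slide for the aspirin and placebo groups.
arr, rr, odds_ratio = risk_measures(0.01259, 0.02166)
print(arr, rr, odds_ratio)
```

The absolute difference is under one percentage point, yet the relative risk shows the event proportion in the aspirin group is a little over half that in the placebo group, which is why the choice of measure matters when judging practical significance.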
THE
END