Estimating the Standardized Mean Change with Heterogeneous

Heterogeneous Variance
Effect Size Estimation from Pretest-Posttest-Control
Designs with Heterogeneous Variances
Scott B. Morris
Illinois Institute of Technology
Paper Presented at the 20th Annual Conference of the Society for Industrial and Organizational
Psychology, Los Angeles, CA, April 2005.
The effectiveness of meta-analysis depends on the quality of the effect size estimates
from primary research results. It is critical that effect size estimates be unbiased and that the
sampling properties of the effect size estimates be known. In particular, meta-analytic
procedures require estimates of the sampling variance of effect sizes in order to obtain optimal
weights, to build confidence intervals, and to estimate between-study variance components
(Hedges & Olkin, 1985).
The standardized mean difference, d, is a common index of effect size for meta-analysis
of the effectiveness of organizational interventions. The techniques for meta-analysis of d were
developed under the common assumptions of independence, normality and homogeneity of
variance (Hedges & Olkin, 1985). Research has shown that violating these assumptions can bias
meta-analytic results (Grissom & Kim, 2001; Harwell, 1997; Morris, 2004). Therefore, it is
important to develop methods that are robust to violation of these assumptions.
This paper will discuss the Pretest-Posttest-Control (PPC) design, a popular design for
assessing the effectiveness of organizational interventions. The PPC design involves two
independent groups of participants assigned to alternate treatment conditions (e.g., treatment and
control groups). All participants are measured before and after the intervention, allowing the
measurement of individual change. Change scores, however, may be influenced by maturation,
spontaneous remission, or historical events that occur between measurement occasions (Cook &
Campbell, 1979). By comparing the change in the treatment group to the change observed in a
control or placebo group, bias due to the analysis of change scores can be reduced.
Consequently, the PPC design is often preferred over designs where the outcome is measured
only at posttest or designs with no control group (Cook & Campbell, 1979).
Methods for meta-analysis of effect sizes from the PPC design have been
described in a number of sources (Becker, 1988; Carlson & Schmidt, 1999; Morris & DeShon,
2002). All of the existing methods were developed under the assumption that variances are
homogeneous across treatment groups at both pretest and posttest. Although this assumption is
often justified (Hedges, 1981), there are many situations where it is reasonable to expect
differences in the variance of the outcome variable across treatment conditions.
A potential cause of variance heterogeneity is the treatment x subject interaction (Cook &
Campbell, 1979). If the effectiveness of a treatment is not the same for all research participants,
some individuals will show a greater change due to treatment than others will. Consequently, the
post-treatment variance will reflect both initial individual differences as well as differences in the
effectiveness of treatment. In contrast, the variance of scores in the control group will only
reflect initial individual differences, because the individuals do not receive the treatment. This
pattern of variances is common in studies of training effectiveness (Carlson & Schmidt, 1999).
Variance heterogeneity has implications for the definition of the effect size estimate as
well as its sampling distribution. The impact of variance heterogeneity on the definition of the
effect size estimate was given thorough consideration in early work on meta-analysis (Glass,
McGaw & Smith, 1981); however, this work did not address the sampling variance of the
proposed effect size. This paper will review the rationale for the Glass et al. (1981) effect size
estimate, and then derive the sampling distribution for this statistic.
Estimating the Standardized Mean Difference
In order for the effect size to have consistent interpretation across studies, it must be
expressed in a common metric. Because a collection of studies will often use a variety of
measures, the mean difference between groups will not be directly comparable across studies.
The standardized mean difference avoids this problem by dividing the mean difference by the
within-group standard deviation. This removes differences due to the scaling of the dependent
variables, and promotes comparability of effect sizes across studies.
This approach assumes that, except for scaling differences, the variance of the dependent
variable is the same across studies. If two studies have a common scale but different variances,
the effect size from the two studies will have a different interpretation, and cannot be
meaningfully combined in a meta-analysis.
In the PPC design, participants are assigned to either a treatment or control condition, and
each participant is measured both before and after treatment occurs. Therefore, there are four
means and four standard deviations that could be used to define the effect size. The pattern of
heterogeneity among these four cells will depend on the theoretical mechanism causing the
variances to differ.
A likely source of variance heterogeneity is differential treatment effects. To the
extent that individuals receive differing amounts of treatment or treatment is more effective for
some than others (a subject by treatment interaction), posttest variance in the treatment group
will be inflated relative to pretest variance. For these reasons, Glass et al. (1981) argued that the
posttest standard deviation of the treatment group might not be comparable across studies, even
after removing differences due to scaling. Therefore, the posttest standard deviation of the
treatment group is not a good standardizer.
The standard deviation of the pretest scores, on the other hand, is more likely to be
consistent across studies (assuming participants are sampled from the same population).
Because pretest scores are measured before treatment has been administered, they will not be
affected by the differential treatment effect. Therefore, an effect size defined in terms of the
pooled pretest standard deviation across treatment and control groups is likely to have metric
comparability across studies. The posttest variance in the control group should also be
unaffected by the differential treatment effect. However, pooling effect sizes across pretest and
posttest scores in the control group complicates the distribution of the effect size (Morris, 2003),
and will not be considered here.
Although the recommendation that effect sizes be defined using pretest standard deviations has
been repeated in many treatments of meta-analysis (Becker, 1988; Carlson & Schmidt, 1999;
Morris & DeShon, 2002), the impact of the recommendation on meta-analysis procedures has
not been fully investigated. Specifically, there has been little consideration of the impact of
variance heterogeneity on the sampling variance of effect size estimates. Sampling variance
plays a central role in almost all uses of effect size estimates. Sampling variance is used to
construct confidence intervals around individual effect size estimates, it may be used to define
weights for estimating the mean effect size in a meta-analysis, and provides the basis for tests of
homogeneity of effect size and estimates of random variance components in random effects
meta-analysis.
Using the pretest standard deviation does not eliminate the effect of variance
heterogeneity on the sampling variance of the mean difference. The current research develops the
sampling variance of the standardized mean change from a PPC design when variances are
heterogeneous, and illustrates how failure to use the correct sampling variance can lead to
inaccurate conclusions in meta-analysis.
Definition of the Effect Size
The data are assumed to be randomly sampled from two populations, corresponding to
treatment and control conditions. Pretest and posttest scores in each population have a bivariate
normal distribution with correlation $\rho$. The pretest scores from both populations and the posttest
scores in the control group are assumed to have equal variance, $\sigma^2$, while the posttest variance in
the treatment group, $\sigma^2_{T,post}$, may differ from the others. The means are indicated by $\mu_{T,pre}$ for the
treatment population pretest, $\mu_{T,post}$ for the treatment population posttest, $\mu_{C,pre}$ for the control
group pretest, and $\mu_{C,post}$ for the control group posttest.
The standardized mean change in each population is defined as the mean difference
between posttest and pretest scores, divided by the common pretest standard deviation. The
standardized mean change for the treatment group ($\delta_T$) is
$$\delta_T = \frac{\mu_{T,post} - \mu_{T,pre}}{\sigma}. \tag{1}$$
The standardized mean change for the control group ($\delta_C$) is
$$\delta_C = \frac{\mu_{C,post} - \mu_{C,pre}}{\sigma}. \tag{2}$$
The effect size for the PPC design is defined as the difference between the standardized mean
change for the treatment and control groups (Becker, 1988; Carlson & Schmidt, 1999; Morris &
DeShon, 2002),
$$\delta = \delta_T - \delta_C = \frac{(\mu_{T,post} - \mu_{T,pre}) - (\mu_{C,post} - \mu_{C,pre})}{\sigma}.$$
An individual study consists of nT participants receiving treatment, and nC participants in
the control group. The pretest and posttest means for the treatment group are indicated by Mpre,T
and Mpost,T, respectively. The pretest and posttest means for the control group are indicated by
Mpre,C and Mpost,C, respectively. A separate estimate of the standard deviation can be obtained
for the treatment group at pretest (SDpre,T) and posttest (SDpost,T), and for the control group at
pretest (SDpre,C) and posttest (SDpost,C).
Previous research (Carlson & Schmidt, 1999; Morris, 2003) suggests defining the effect
size using the pooled pretest standard deviation,
$$SD_P = \sqrt{\frac{(n_T - 1)\,SD_{pre,T}^2 + (n_C - 1)\,SD_{pre,C}^2}{n_T + n_C - 2}}. \tag{3}$$
An unbiased estimate of the effect size is given by
$$d_{ppc} = c_P \left[ \frac{(M_{post,T} - M_{pre,T}) - (M_{post,C} - M_{pre,C})}{SD_P} \right], \tag{4}$$
where $c_P$ (written $c$ in what follows) is Hedges' (1981) bias correction, which is approximately
$$c \approx 1 - \frac{3}{4(n_T + n_C - 2) - 1}. \tag{5}$$
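As an illustration of Equations 3 through 5, the following is a minimal Python sketch that computes the bias-corrected effect size from summary statistics. The function names and the example numbers are hypothetical and are not taken from any study; the bias correction uses the approximation in Equation 5 rather than the exact expression.

```python
import math

def pooled_pretest_sd(n_t, n_c, sd_pre_t, sd_pre_c):
    """Pooled pretest standard deviation (Equation 3)."""
    num = (n_t - 1) * sd_pre_t ** 2 + (n_c - 1) * sd_pre_c ** 2
    return math.sqrt(num / (n_t + n_c - 2))

def bias_correction(n_t, n_c):
    """Approximate small-sample bias correction (Equation 5)."""
    return 1.0 - 3.0 / (4.0 * (n_t + n_c - 2) - 1.0)

def d_ppc(m_pre_t, m_post_t, m_pre_c, m_post_c, sd_pre_t, sd_pre_c, n_t, n_c):
    """Bias-corrected effect size for the PPC design (Equation 4)."""
    sd_p = pooled_pretest_sd(n_t, n_c, sd_pre_t, sd_pre_c)
    # Treatment-group change minus control-group change.
    contrast = (m_post_t - m_pre_t) - (m_post_c - m_pre_c)
    return bias_correction(n_t, n_c) * contrast / sd_p

# Hypothetical summary statistics, invented for illustration only.
example_d = d_ppc(m_pre_t=50.0, m_post_t=58.0, m_pre_c=50.0, m_post_c=52.0,
                  sd_pre_t=10.0, sd_pre_c=10.0, n_t=10, n_c=10)
print(round(example_d, 3))
```

Only the pretest standard deviations enter the standardizer, so an inflated treatment-group posttest variance does not change the point estimate.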
Distribution of Effect Size Estimate
Under the assumptions of the model, the sample mean contrast in the numerator of the
effect size estimate (i.e., the difference between treatment and control group change scores) is
normally distributed, and is an unbiased estimate of the contrast among population means. If
homogeneity of variance is assumed across all cells, the variance of the mean contrast is
$$\mathrm{var}\left[(M_{post,T} - M_{pre,T}) - (M_{post,C} - M_{pre,C})\right] = 2\sigma^2 (1 - \rho) \left( \frac{1}{n_T} + \frac{1}{n_C} \right). \tag{6}$$
However, when the variance of the treatment group posttest scores differs from the other
conditions, the variance of the mean contrast becomes
$$\mathrm{var}\left[(M_{post,T} - M_{pre,T}) - (M_{post,C} - M_{pre,C})\right] = \frac{\sigma^2_{T,post} + \sigma^2 - 2\rho\,\sigma\,\sigma_{T,post}}{n_T} + \frac{2\sigma^2 (1 - \rho)}{n_C}. \tag{7}$$
Let h be the ratio of the variance of the mean contrast to the variance of pretest scores,
$$h = \frac{\mathrm{var}\left[(M_{post,T} - M_{pre,T}) - (M_{post,C} - M_{pre,C})\right]}{\sigma^2}. \tag{8}$$
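For the pattern of variance heterogeneity described above,
$$h = \frac{\dfrac{\sigma_T^2}{\sigma_C^2} + 1 - 2\rho\,\dfrac{\sigma_T}{\sigma_C}}{n_T} + \frac{2(1 - \rho)}{n_C}, \tag{9}$$
where $\sigma_T^2 = \sigma^2_{T,post}$ is the treatment-group posttest variance and $\sigma_C^2 = \sigma^2$ is the common variance of the other three cells.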
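The ratio h in Equation 9 depends only on the sample sizes, the pre-post correlation, and the ratio of the treatment-group posttest variance to the common variance. A small Python sketch, with hypothetical function and argument names, shows that it reduces to the homogeneous form implied by Equation 6 when the variance ratio is 1.

```python
import math

def variance_ratio_h(n_t, n_c, rho, var_ratio=1.0):
    """h from Equation 9.

    var_ratio is the treatment-group posttest variance divided by the
    common variance of the other cells (an illustrative parameter name).
    """
    sd_ratio = math.sqrt(var_ratio)
    treatment_part = (var_ratio + 1.0 - 2.0 * rho * sd_ratio) / n_t
    control_part = 2.0 * (1.0 - rho) / n_c
    return treatment_part + control_part

h_equal = variance_ratio_h(10, 10, rho=0.5)                   # homogeneous case
h_unequal = variance_ratio_h(10, 10, rho=0.5, var_ratio=4.0)  # posttest variance 4
print(h_equal, h_unequal)
```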
Let $g_{PPC}$ be the sample effect size estimate without the bias correction,
$$g_{PPC} = \frac{(M_{post,T} - M_{pre,T}) - (M_{post,C} - M_{pre,C})}{SD_P}. \tag{10}$$
The sample effect size divided by $\sqrt{h}$ would be
$$\frac{g_{PPC}}{\sqrt{h}} = \frac{\left[(M_{post,T} - M_{pre,T}) - (M_{post,C} - M_{pre,C})\right] / \left(\sigma\sqrt{h}\right)}{SD_P / \sigma}, \tag{11}$$
which is distributed as a noncentral t (Huynh, 1989) with $df = n_T + n_C - 2$ and noncentrality
parameter,
$$\theta = \frac{\delta}{\sqrt{h}}. \tag{12}$$
Therefore, $g_{PPC}$ is distributed as $\sqrt{h}$ times t (Huynh, 1989), and the unbiased estimate, $d_{PPC} = c\,g_{PPC}$,
is distributed as $c\sqrt{h}$ times t, where c is the bias factor approximated by Equation 5.
The expectation and variance of the noncentral t (Johnson & Kotz, 1970) are given by
$$E(t) = \frac{\theta}{c}, \tag{13}$$
and
$$\mathrm{var}(t) = \left( \frac{df}{df - 2} \right) \left( 1 + \theta^2 \right) - \frac{\theta^2}{c^2}. \tag{14}$$
Therefore, the expected value of $d_{PPC}$ is
$$E(d_{PPC}) = c\sqrt{h}\left( \frac{\theta}{c} \right) = \delta. \tag{15}$$
The heterogeneity-assumed variance of $d_{PPC}$ is $c^2 h$ times the variance of t. Therefore,
$$\sigma^2_{HET}(d_{PPC}) = c^2 \left( \frac{n_T + n_C - 2}{n_T + n_C - 4} \right) \left( h + \delta^2 \right) - \delta^2, \tag{16}$$
or,
$$\sigma^2_{HET}(d_{PPC}) = c^2 \left( \frac{n_T + n_C - 2}{n_T + n_C - 4} \right) \left[ \frac{\dfrac{\sigma_T^2}{\sigma_C^2} + 1 - 2\rho\,\dfrac{\sigma_T}{\sigma_C}}{n_T} + \frac{2(1 - \rho)}{n_C} + \delta^2 \right] - \delta^2. \tag{17}$$
Most current methods for meta-analysis assume homogeneity of variance (e.g., Hedges &
Olkin, 1985), in which case the homogeneity-assumed variance of the effect size would be
$$\sigma^2_{HOM}(d) = c^2 \left( \frac{n_T + n_C - 2}{n_T + n_C - 4} \right) \left[ \frac{2(1 - \rho)(n_T + n_C)}{n_T n_C} + \delta^2 \right] - \delta^2. \tag{18}$$
In cases where heterogeneity of variance exists, the use of Equation 18 can be quite
inaccurate. Table 1 illustrates the difference between the homogeneity-assumed variance
(Equation 18) and the true variance (Equation 17) when $\delta = 1$ and $\rho = .5$, and under different levels
of heterogeneity and sample size. The results clearly show that assuming homogeneity can result
in substantial bias in variance estimates. Specifically, when the treatment group has the larger
variance, the homogeneity-assumed variance tends to underestimate the true variance. When the
treatment group variance was four times larger than the control group variance, the homogeneity-assumed variance underestimated the true variance by as much as 56%.
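The comparison between Equations 17 and 18 can be checked numerically. The sketch below is a plain-Python illustration with hypothetical function names; it assumes the approximate bias correction of Equation 5 and a common variance of 1 for the three unaffected cells, as in the note to Table 1.

```python
import math

def bias_correction(n_t, n_c):
    """Approximate bias correction (Equation 5)."""
    return 1.0 - 3.0 / (4.0 * (n_t + n_c - 2) - 1.0)

def var_het(delta, rho, n_t, n_c, var_ratio):
    """Heterogeneity-assumed variance (Equation 17).

    var_ratio is the treatment posttest variance over the common variance.
    """
    sd_ratio = math.sqrt(var_ratio)
    h = (var_ratio + 1.0 - 2.0 * rho * sd_ratio) / n_t + 2.0 * (1.0 - rho) / n_c
    df = n_t + n_c - 2
    c = bias_correction(n_t, n_c)
    return c ** 2 * (df / (df - 2)) * (h + delta ** 2) - delta ** 2

def var_hom(delta, rho, n_t, n_c):
    """Homogeneity-assumed variance (Equation 18)."""
    h = 2.0 * (1.0 - rho) * (n_t + n_c) / (n_t * n_c)
    df = n_t + n_c - 2
    c = bias_correction(n_t, n_c)
    return c ** 2 * (df / (df - 2)) * (h + delta ** 2) - delta ** 2

# Conditions of Table 1: delta = 1, rho = .5.
for var_ratio in (0.25, 0.5, 2.0, 4.0):
    for n_t, n_c in ((10, 10), (10, 25), (25, 10), (25, 25)):
        hom = var_hom(1.0, 0.5, n_t, n_c)
        het = var_het(1.0, 0.5, n_t, n_c, var_ratio)
        pct = 100.0 * (hom - het) / het
        print(f"{var_ratio:5.2f} {n_t:3d} {n_c:3d} {hom:6.3f} {het:6.3f} {pct:5.0f}")
```

Differences in the printed table are taken as the homogeneity-assumed value minus the heterogeneity-assumed value, expressed as a percentage of the latter, which is the convention that matches the signs in Table 1.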
Misestimating the sampling variance could have serious implications for the conclusions
drawn from a meta-analysis. Estimates of sampling variance are needed to compute the weighted
mean effect size and test for homogeneity of effect size across studies. The following section
discusses the impact of using the incorrect variance formula on these procedures.
Meta-Analytic Procedures
Mean Effect Size
In a meta-analysis, the researcher is generally interested in estimating the mean effect
size and testing for homogeneity of effect size. The most precise estimate of the mean effect size
is obtained by weighting the individual effect sizes by the reciprocal of the variance (Hedges &
Olkin, 1985),
$$\bar{d} = \frac{\sum_{j=1}^{k} w_j d_j}{\sum_{j=1}^{k} w_j}, \tag{19}$$
where k is the number of studies, and
$$w_j = \frac{1}{\sigma^2(d_j)}. \tag{20}$$
In general, more accurate estimates of the variance should lead to a more precise
weighted mean. Therefore, when variances are unequal, using $\sigma^2_{HET}(d)$ should result in a more
precise estimate of the mean than using $\sigma^2_{HOM}(d)$. However, the benefit of using the more
accurate variance estimate is complicated by the fact that the weights must be based on sample
estimates of the population parameters. Using sample statistics to define the weights can create
bias in the weighted mean. Because both the effect size estimate and the weight are based on the
same data, they will tend to be correlated across samples, and this correlation creates bias in the
weighted mean.
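The inverse-variance weighting in Equations 19 and 20 can be sketched in a few lines of Python. The function name and the two-study example are hypothetical; in practice the variances would come from Equation 17 or Equation 18.

```python
def weighted_mean_effect(effects, variances):
    """Inverse-variance weighted mean effect size (Equations 19 and 20)."""
    weights = [1.0 / v for v in variances]
    return sum(w * d for w, d in zip(weights, effects)) / sum(weights)

# Two invented studies: the more precise study (variance 0.1) dominates.
d_bar = weighted_mean_effect([0.2, 0.8], [0.1, 0.3])
print(round(d_bar, 3))
```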
Previous research has shown that this bias tends to be very small when variances are
homogeneous (Van Den Noortgate & Onghena, 2003). When variances are homogeneous, the
weight depends only on the sample size, the estimates of the effect size, and the estimate of the
pre-post correlation. Furthermore, the bias can be avoided simply by computing the weights
using the mean effect size and mean correlation rather than the sample statistics. However, for
$\sigma^2_{HET}(d)$, the weight also depends on the ratio of treatment group to control group variance.
Because the sample effect size and the sample variance ratio both depend on the standard
deviation of the control group, the correlation and the resulting bias may be non-trivial.
Homogeneity of Effect Size
Researchers are also interested in determining whether the treatment effect is
homogeneous across a pool of studies. Hedges' (1981) Q-test is commonly used to test whether
the observed variance in effect sizes is larger than expected due to sampling error,
$$Q = \sum_{j=1}^{k} \frac{(d_j - \bar{d})^2}{\sigma^2(d_j)}. \tag{21}$$
Under the null hypothesis of homogeneity, Q has a chi-square distribution with k-1 df.
When the variance of the treatment group is larger than the variance of the control group,
using $\sigma^2_{HOM}(d)$ in the Q-test would lead to underestimation of true variance, and correspondingly
exaggerated values of Q. The potential bias in Q could be remedied by using the correct
variance formula, $\sigma^2_{HET}(d)$.
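A minimal Python sketch of the Q statistic in Equation 21 follows. The function name and example values are hypothetical, and the weighted mean is computed with the same inverse-variance weights as in Equation 19. The sketch returns Q and its degrees of freedom and leaves the comparison with a chi-square critical value to the caller.

```python
def q_statistic(effects, variances):
    """Homogeneity statistic Q (Equation 21) and its degrees of freedom."""
    weights = [1.0 / v for v in variances]
    d_bar = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    q = sum(w * (d - d_bar) ** 2 for w, d in zip(weights, effects))
    return q, len(effects) - 1

# Invented two-study example.
q, df = q_statistic([0.2, 0.8], [0.1, 0.3])
print(round(q, 3), df)
```

Because each squared deviation is divided by the assumed sampling variance, understating that variance inflates Q, which is the mechanism described above.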
As with the weighted mean, it is unknown to what extent sampling error in the estimate
of the variance ratio will affect the results of the Q test using $\sigma^2_{HET}(d)$. Thus, while the more
accurate formula is correct asymptotically, it is not clear how well procedures based on the
improved variance estimate will perform in small samples. To explore the viability of the
proposed method, a Monte Carlo simulation was conducted to examine the accuracy of meta-analytic results using the modified variance formula (Equation 17).
Monte Carlo Simulation
A Monte Carlo simulation explored the effectiveness of alternate variance estimates
under conditions of both homogeneous and heterogeneous variance. The simulation was repeated
under a variety of conditions likely to influence the accuracy of meta-analytic results, such as the
effect size, the pre-post correlation, the sample size, and the number of studies.
For the simulation, the set of studies in a meta-analysis either all had homogeneous
variance or all had an equal degree of heterogeneous variance. For the heterogeneous conditions,
the variance of the treatment group posttest was 4.0, while the variance of the treatment group
pretest, control group pretest and control group posttest were all 1.0. This represents a large
difference in variance. For the homogeneous variance conditions, all variances were 1.0.
The number of studies in a meta-analysis (k) was set at 10 or 25. The effect size was
constant across all studies within a meta-analysis. The population effect size ($\delta$) was set at 0.0,
0.5, and 1.0, corresponding to no effect, a moderate effect, and a large effect. The population
pre-post correlation ($\rho$) was equal for treatment and control groups, and was constant across
studies within a meta-analysis. The values for the pre-post correlation were 0.0, 0.4, and 0.8.
Sample size was allowed to vary across studies within a meta-analysis. The sample sizes
for treatment and control conditions were randomly sampled from four levels (5, 10, 20, 30)
based on a specified probability distribution. The distribution of sample sizes varied across
conditions, so that the average sample size was 10, 15 or 25. The probability distributions are
shown in Table 2. Average sample size was manipulated separately for treatment and control
groups. When the average sample size was equal across groups, a high proportion of the studies
had similar sample sizes across groups. When the average sample size was different across
groups, a high proportion of the studies had substantial differences in sample sizes across groups.
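The sample-size manipulation can be illustrated with a short Python sketch that draws study sample sizes from the probability distributions reported in Table 2. The dictionary layout, function names, and seed are assumptions made for illustration; the original simulation was not written in Python.

```python
import random

SAMPLE_SIZES = [5, 10, 20, 30]

# Probabilities from Table 2 for each sample-size condition.
SIZE_PROBS = {
    "small": [0.6, 0.2, 0.1, 0.1],
    "medium": [0.2, 0.4, 0.2, 0.2],
    "large": [0.0, 0.1, 0.3, 0.6],
}

def expected_n(condition):
    """Average sample size implied by a condition's distribution."""
    return sum(n * p for n, p in zip(SAMPLE_SIZES, SIZE_PROBS[condition]))

def draw_sample_sizes(condition, k, rng):
    """Draw sample sizes for k studies under the given condition."""
    return rng.choices(SAMPLE_SIZES, weights=SIZE_PROBS[condition], k=k)

rng = random.Random(2005)  # arbitrary seed for reproducibility
n_treatment = draw_sample_sizes("small", k=10, rng=rng)
n_control = draw_sample_sizes("large", k=10, rng=rng)
print(expected_n("small"), expected_n("medium"), expected_n("large"))
```

Drawing the treatment and control sizes from different conditions, as in the last two calls, mimics the conditions in which the average sample size differed across groups.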
Under each combination of the parameters, results were averaged across 10,000 meta-analyses. Each meta-analysis consisted of k studies. For each study, nC scores in the control
group and nT scores in the treatment group were randomly generated from a multivariate normal
distribution using the IMSL DRNMVN routine. At both pretest and posttest, the control group
had a mean of 0 and a standard deviation of 1. Pretest scores in the treatment group also had a
mean of 0 and a standard deviation of 1. A linear transformation was used to create treatment
group posttest scores with a mean of $\delta$ and a variance of 1 or 4 (a standard deviation of 1 or 2), depending on the condition.
Based on these scores, an effect size was computed for each study using Equation 4.
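The data-generation step can be approximated with the Python standard library. The sketch below is not the IMSL DRNMVN routine used in the study; it is an assumed equivalent that builds correlated pretest and posttest scores from two independent standard normal draws and then rescales the posttest. All function names and the seed are hypothetical.

```python
import math
import random

def simulate_group(n, rho, post_mean, post_sd, rng):
    """Draw n (pretest, posttest) pairs from a bivariate normal distribution.

    Pretest scores are standard normal; posttest scores have the requested
    mean and standard deviation and correlate rho with the pretest.
    """
    pre, post = [], []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        z_post = rho * z1 + math.sqrt(1.0 - rho ** 2) * z2
        pre.append(z1)
        # Linear transformation to the target posttest mean and SD.
        post.append(post_mean + post_sd * z_post)
    return pre, post

def mean(xs):
    return sum(xs) / len(xs)

def sd(xs):
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

def corr(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (sd(xs) * sd(ys))

rng = random.Random(2005)  # arbitrary seed
# Heterogeneous condition: treatment posttest variance 4 (SD 2), delta = 1.
pre_t, post_t = simulate_group(n=10, rho=0.4, post_mean=1.0, post_sd=2.0, rng=rng)
pre_c, post_c = simulate_group(n=10, rho=0.4, post_mean=0.0, post_sd=1.0, rng=rng)
```

The resulting lists can be summarized with `mean` and `sd` and passed to an implementation of Equation 4 to obtain one simulated effect size.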
For each meta-analysis, the weighted mean effect size was computed two ways: first with
weights defined using the inverse of the homogeneity-assumed variance (Equation 18) and
second with the weights defined using the inverse of the heterogeneity-assumed variance
(Equation 17). For both approaches, the unweighted average values across studies were used as
estimates of $\delta$ and $\rho$ in the variance formula. For the heterogeneity-assumed variance, the
variance ratio was estimated using the treatment group posttest variance divided by the treatment
group pretest variance. The resulting mean effect size was averaged across iterations of the
simulation to obtain the expected value.
The homogeneity of effect size test was computed twice within each meta-analysis: first
using the homogeneity-assumed variance and second using the heterogeneity-assumed variance.
The resulting Q-value was compared to a chi-square distribution with k-1 df. Type I error rate
was defined as the proportion of meta-analyses within a condition where Q exceeded the critical
chi-square value at $\alpha = .05$.
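The Type I error rate defined above is simply the rejection proportion across simulated meta-analyses. A minimal Python sketch, with a hypothetical function name and invented Q values, is shown below; the critical values are the standard upper 5% points of the chi-square distribution for k - 1 = 9 and k - 1 = 24 degrees of freedom.

```python
# Upper 5% chi-square critical values for df = k - 1 with k = 10 and k = 25.
CHI2_CRIT_05 = {9: 16.919, 24: 36.415}

def type_i_error_rate(q_values, critical_value):
    """Proportion of meta-analyses in which Q exceeds the critical value."""
    rejections = sum(1 for q in q_values if q > critical_value)
    return rejections / len(q_values)

# Invented Q values from four hypothetical meta-analyses with k = 10.
rate = type_i_error_rate([1.0, 20.0, 5.0, 17.0], CHI2_CRIT_05[9])
print(rate)
```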
Results
Both methods of conducting the meta-analysis produced a mean effect size that was
nearly unbiased. The results are summarized in Tables 3 and 4. Consistent with past research
(Morris, 2003), the weighted mean effect size using the homogeneity-assumed variance formula
was essentially unbiased under all conditions examined in the study. For meta-analysis based on
the heterogeneity-assumed variance formula, there was little or no bias when the population
effect size was zero. For $\delta > 0$, there was a slight negative bias. The degree of bias was greatest
when sample size was small in both groups, and when variances were unequal. For example, in
the condition with unequal variance, nT=nC=10, $\rho = 0$, and 25 studies in the meta-analysis, an
effect size of 1.0 was underestimated by 3% (mean d = 0.97). In many cases, the bias was much
smaller. When the population effect size was 1.0, the average bias was -.01.
The accuracy of the Q-test is summarized in Tables 5 and 6. When the assumption of
homogeneity of variance was met, the Q-test based on the homogeneity-assumed variance had
reasonably accurate Type I error rates, ranging from .05 to .08. Type I error rates for the
simulations with k=25 are shown in Figure 1. Similar results were obtained with k=10.
When variance was homogeneous, the Q-test based on the heterogeneity-assumed variance
produced inflated Type I error rates under some conditions (see Figure 1). When the population
effect size was 0, the test was slightly conservative, with Type I error rates ranging from .02 to
.05 (M = .04). When the population effect size was 1.0, the Type I error rates were overly
liberal, ranging from .06 to .24 (M=.10). Type I error rate inflation was particularly high when
the pre-post correlation was large, as indicated in Figure 1.
As expected, when the homogeneity of variance assumption was violated, the Q-test
based on the homogeneous variance formula was not accurate. The magnitude of bias was
considerable, and increased with the number of studies. For k=10, Type I error rates ranged
from .19 to .75 (M = .43), while for k=25, Type I error rates ranged from .33 to .97 (M = .68).
Type I error rates were highest when the pre-post correlation was large, and when the treatment-group sample size was smaller than the control-group sample size (see Figure 2).
When treatment and control group variances were unequal, Type I error rates for the Q-test based on the heterogeneity-assumed variance were considerably more accurate than the
traditional Q-test (see Figure 2). Results are summarized here for k=25. Similar results were
found for k=10. When sample sizes were equal, Type I error rates were close to the nominal
level on average (M=.06), although they ranged from .03 to .12. When the sample size for the
treatment group was larger than the sample size for the control group, the test was reasonably
accurate. For example, for nT=25 and nC=10, Type I error rates ranged from .03 to .07 (M=.05).
However, when the sample size for the treatment group was smaller than the sample size for the
control group, the test was overly liberal. When nC=25 and nT=10, Type I error rates ranged
from .11 to .15 (M=.12).
Conclusion
When subgroup variances are unequal, common meta-analytic methods may be
inaccurate unless appropriate modifications are made. The common recommendation to
standardize the effect size using only the pretest standard deviations (Becker, 1988; Carlson &
Schmidt, 1999; Morris & DeShon, 2002) is successful at producing an estimate of effect size that
is unbiased. However, additional adjustments are needed to obtain accurate estimates of the
sampling variance of effect sizes. Standard formulas for the sampling variance, which assume
variance homogeneity, can lead to substantial errors regarding the variability of effect sizes
across studies.
A likely pattern of variance heterogeneity occurs when an experimental treatment does
not affect all individuals equally, resulting in larger variance for posttest scores in the treatment
group than for pretest or control group scores. In this situation, failure to adjust for variance
heterogeneity can severely bias the Q-test for homogeneity of effect size across studies. A
simulated meta-analysis yielded Type I error rates as high as 97%. Under some conditions, the
traditional Q-test was almost guaranteed to find significant variance across studies, when in fact
the true effect size was constant. Using this method could lead researchers to falsely conclude
that random differences across studies are due to substantive moderator variables.
Some support was found for an alternate estimate of sampling variance, which takes into
account the degree of heterogeneity. Use of the proposed estimate avoided the excessive
inflation of Type I error found with the homogeneity-assumed formula. However, the results for
the new estimate were somewhat mixed. Use of the heterogeneity-assumed variance to compute
the weighted mean resulted in a very small downward bias in the mean effect size, particularly
when sample size was small. In addition, Type I error rates on the Q-test differed from the
normative alpha level under some conditions. Specifically, when variances were homogeneous,
Type I error rates were more consistent using the homogeneity-assumed formula. When both $\delta$
and $\rho$ were large, the heterogeneity-assumed formula produced Type I error rates as high as .24.
When variances differed across conditions, tests based on the heterogeneity-assumed formula
were consistently more accurate than the test currently used by meta-analysts. However, the new
procedure still yielded Type I error rates as high as .15. Thus, although the proposed variance
formula produced some improvement in the accuracy of the Q-test, additional work is needed to
refine the procedure.
When it is reasonable to assume that variances are equal across groups and across time, it
is recommended that researchers use existing procedures based on the homogeneity-assumed
variance formula. When the assumption was met, this approach better controlled for Type I
error, and produced a weighted mean that was unbiased. However, when variances are unequal,
the heterogeneity-assumed variance formula is recommended for procedures that rely on
estimates of sampling variance, such as testing homogeneity of effect size across studies, and
estimating the true variance of effect size.
Several limitations of the study should be noted. The simulations only examined
conditions where the degree of variance heterogeneity was constant across studies. In practice, a
meta-analysis is likely to include studies with varying degrees of heterogeneity. When only a
few studies have heterogeneous variance, the impact on the results will be minimized. Similarly,
the simulation modeled the situation where the population effect size and the pre-post correlation
were constant across studies. Future research should also consider the impact of the alternate
formulas under a wider range of conditions.
Another limitation is that the study only examined one pattern of heterogeneity: inflated
variance in treatment-group posttest scores with homogeneity across the other conditions.
Although this form of heterogeneity is often a concern, other patterns are possible. The
equations derived in this paper should apply to situations where the posttest variance in the
treatment group is either larger or smaller than the other conditions. The same approach can be
used to derive appropriate formulas for other patterns of heterogeneity, such as situations where
posttest scores are inflated, perhaps differentially, in both treatment and control groups, or when
the pre-post correlation differs across groups.
References
Becker, B. J. (1988). Synthesizing standardized mean-change measures. British Journal
of Mathematical and Statistical Psychology, 41, 257-278.
Carlson, K. D., & Schmidt, F. L. (1999). Impact of experimental design on effect size:
Findings from the research literature on training. Journal of Applied Psychology, 84, 851-862.
Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design and analysis
issues for field settings. Boston, MA: Houghton Mifflin.
Glass, G. V., McGaw, B., & Smith, M. L. (1981). Meta-analysis in social research.
Beverly Hills, CA: Sage.
Grissom, R. J., & Kim, J. J. (2001). Review of assumptions and problems in appropriate
conceptualization of effect size. Psychological Methods, 6, 135-146.
Harwell, M. (1997). An empirical study of Hedges' Homogeneity Test. Psychological
Methods, 2, 219-231.
Hedges, L. V. (1981). Distribution theory for Glass's estimator of effect size and related
estimators. Journal of Educational Statistics, 6, 107-128.
Hedges, L. V., & Olkin, I. (1985). Statistical methods for meta-analysis. San Diego,
CA: Academic Press.
Huynh, C. L. (1989). A unified approach to the estimation of effect size in meta-analysis.
Paper presented at the Annual Meeting of the American Educational Research Association, San
Francisco (ERIC Document Reproduction Service No. ED 306 248).
Johnson, N. L., & Kotz, S. (1970). Continuous univariate distributions. NY: John
Wiley & Sons.
Morris, S. B. (2003, April). Estimating Effect Size from the Pretest-Posttest-Control
Design. Paper presented at the 18th annual conference of the Society for Industrial and
Organizational Psychology, Orlando, FL.
Morris, S. B. (2004, April). Effect Size Estimation from Two Independent Groups with
Heterogeneous Variances. Paper presented at the 19th annual conference of the Society for
Industrial and Organizational Psychology, Chicago, IL.
Morris, S. B., & DeShon, R. P. (2002). Combining effect size estimates in meta-analysis
with repeated measures and independent-groups designs. Psychological Methods, 7, 105-125.
Van Den Noortgate, W., & Onghena, P. (2003). Estimating the mean effect size in
meta-analysis: Bias, precision, and mean squared error of different weighting methods.
Behavior Research Methods, Instruments, & Computers, 35, 504-511.
Table 1
Inaccuracy of Traditional Estimate of Effect Size Variance (Assuming Homogeneity) when δ = 1
and ρ = .5.

                          Variance of Effect Size
                        Homogeneity   Homogeneity
σ²_Post,T    nT    nC       Assumed   Not Assumed   Difference   % Difference
0.25         10    10         0.238         0.213        0.026             12
0.25         10    25         0.159         0.133        0.025             19
0.25         25    10         0.159         0.148        0.010              7
0.25         25    25         0.092         0.082        0.010             12
0.5          10    10         0.238         0.217        0.021             10
0.5          10    25         0.159         0.138        0.021             15
0.5          25    10         0.159         0.150        0.008              6
0.5          25    25         0.092         0.083        0.008             10
2            10    10         0.238         0.299       -0.060            -20
2            10    25         0.159         0.218       -0.060            -27
2            25    10         0.159         0.182       -0.024            -13
2            25    25         0.092         0.116       -0.024            -21
4            10    10         0.238         0.445       -0.206            -46
4            10    25         0.159         0.362       -0.203            -56
4            25    10         0.159         0.240       -0.081            -34
4            25    25         0.092         0.173       -0.081            -47

Note: σ² = 1 for pretest scores in both groups and posttest scores in control group.
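The Homogeneity Assumed column of Table 1 can be checked numerically. The sketch below is an illustration, not a restatement of the paper's derivation: it assumes the variance expression commonly given for the pretest-posttest-control effect size under equal variances, with the usual small-sample correction c(df) = 1 - 3/(4df - 1), and it takes the two values in the table caption to be δ = 1 (population effect size) and ρ = .5 (pre-post correlation). The function names are hypothetical. The percent-difference helper divides by the heterogeneity-based variance, which is inferred from the tabled values rather than stated in the table.

```python
def bias_correction(df):
    """Small-sample correction factor c(df) = 1 - 3 / (4*df - 1)."""
    return 1.0 - 3.0 / (4.0 * df - 1.0)


def var_d_homogeneous(delta, rho, n_t, n_c):
    """Sampling variance of the PPC effect size, assuming equal variances.

    Assumed formula (not quoted from the table itself):
    c^2 * a * df/(df-2) * (1 + delta^2 / a) - delta^2,
    where a = 2(1 - rho)(1/n_t + 1/n_c) and df = n_t + n_c - 2.
    """
    df = n_t + n_c - 2
    a = 2.0 * (1.0 - rho) * (1.0 / n_t + 1.0 / n_c)
    c = bias_correction(df)
    return c ** 2 * a * (df / (df - 2.0)) * (1.0 + delta ** 2 / a) - delta ** 2


def percent_difference(difference, var_not_assumed):
    """Difference as a percentage of the heterogeneity-based variance."""
    return 100.0 * difference / var_not_assumed


if __name__ == "__main__":
    for n_t, n_c in [(10, 10), (10, 25), (25, 10), (25, 25)]:
        print(n_t, n_c, round(var_d_homogeneous(1.0, 0.5, n_t, n_c), 3))
```

Under these assumptions the function returns the same value for every row that shares nT and nC, which matches the table: the homogeneity-based variance does not depend on the treatment-group posttest variance.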
Table 2
Proportion of Studies at Each Sample Size in Simulated Meta-Analyses.

                 Sample Size Condition
N            Small    Medium    Large
5              0.6       0.2      0
10             0.2       0.4      0.1
20             0.1       0.2      0.3
30             0.1       0.2      0.6
Average N       10        15       25
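The Average N row of Table 2 is the proportion-weighted mean of the four sample sizes. A minimal sketch of that arithmetic follows, together with one way study sample sizes could be drawn from these proportions in a simulation. The names and the use of Python's `random` module are illustrative assumptions; the paper does not describe its sampling code.

```python
import random

SAMPLE_SIZES = [5, 10, 20, 30]

# Proportion of studies at each sample size, by condition (Table 2).
PROPORTIONS = {
    "Small": [0.6, 0.2, 0.1, 0.1],
    "Medium": [0.2, 0.4, 0.2, 0.2],
    "Large": [0.0, 0.1, 0.3, 0.6],
}


def average_n(condition):
    """Proportion-weighted mean sample size for one condition."""
    props = PROPORTIONS[condition]
    return sum(n * p for n, p in zip(SAMPLE_SIZES, props))


def draw_sample_sizes(condition, k, rng):
    """Draw k study sample sizes with the tabled proportions (hypothetical helper)."""
    return rng.choices(SAMPLE_SIZES, weights=PROPORTIONS[condition], k=k)


if __name__ == "__main__":
    for name in PROPORTIONS:
        print(name, round(average_n(name), 6))
    print(draw_sample_sizes("Medium", 10, random.Random(1)))
```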
Table 3
Bias in Weighted Mean Effect Size Using Homogeneity-Assumed (HOM) and Heterogeneity-Assumed (HET) Variance Estimates as Weights (k=10).

                         σ²_T = σ²_C         σ²_T > σ²_C
δ     ρ     nT   nC       HOM       HET       HOM       HET
0     0     10   10    0.0046    0.0050    0.0034    0.0030
0     0     15   10    0.0050    0.0047    0.0014    0.0010
0     0     25   10    0.0023    0.0022    0.0033    0.0034
0     0     10   15    0.0013    0.0012    0.0026    0.0019
0     0     15   15   -0.0002   -0.0004   -0.0017   -0.0018
0     0     25   15   -0.0018   -0.0015    0.0026    0.0020
0     0     10   25    0.0022    0.0019   -0.0040   -0.0037
0     0     15   25    0.0001   -0.0002    0.0002    0.0002
0     0     25   25   -0.0025   -0.0024   -0.0008   -0.0013
0     0.4   10   10    0.0044    0.0039    0.0025    0.0016
0     0.4   15   10    0.0000    0.0003    0.0004   -0.0007
0     0.4   25   10    0.0047    0.0048    0.0011    0.0008
0     0.4   10   15   -0.0015   -0.0008    0.0000   -0.0005
0     0.4   15   15    0.0003    0.0001   -0.0012   -0.0014
0     0.4   25   15   -0.0008   -0.0007   -0.0017   -0.0008
0     0.4   10   25   -0.0003    0.0001    0.0017    0.0012
0     0.4   15   25    0.0006    0.0006   -0.0011   -0.0015
0     0.4   25   25   -0.0013   -0.0015   -0.0014   -0.0017
0     0.8   10   10   -0.0010   -0.0007    0.0009    0.0019
0     0.8   15   10    0.0010    0.0009    0.0040    0.0029
0     0.8   25   10    0.0021    0.0019    0.0014    0.0014
0     0.8   10   15    0.0017    0.0019    0.0018    0.0021
0     0.8   15   15    0.0001    0.0002    0.0033    0.0032
0     0.8   25   15    0.0001    0.0000    0.0007    0.0011
0     0.8   10   25    0.0001    0.0000   -0.0003   -0.0007
0     0.8   15   25   -0.0004   -0.0003   -0.0015   -0.0013
0     0.8   25   25    0.0002    0.0002    0.0008    0.0008
1     0     10   10    0.0027   -0.0138    0.0025   -0.0255
1     0     15   10    0.0024   -0.0068   -0.0002   -0.0199
1     0     25   10    0.0043   -0.0003    0.0026   -0.0087
1     0     10   15    0.0032   -0.0084    0.0034   -0.0186
1     0     15   15   -0.0027   -0.0119    0.0025   -0.0143
1     0     25   15    0.0011   -0.0035    0.0010   -0.0096
1     0     10   25    0.0028   -0.0070    0.0016   -0.0150
1     0     15   25   -0.0009   -0.0086    0.0000   -0.0140
1     0     25   25   -0.0005   -0.0054   -0.0003   -0.0105
1     0.4   10   10    0.0060   -0.0084    0.0003   -0.0303
1     0.4   15   10    0.0025   -0.0061    0.0033   -0.0163
1     0.4   25   10    0.0068    0.0027    0.0021   -0.0093
1     0.4   10   15    0.0015   -0.0103   -0.0011   -0.0254
1     0.4   15   15   -0.0005   -0.0078    0.0015   -0.0163
1     0.4   25   15    0.0008   -0.0034    0.0024   -0.0085
1     0.4   10   25   -0.0008   -0.0091    0.0013   -0.0112
1     0.4   15   25    0.0002   -0.0062   -0.0023   -0.0152
1     0.4   25   25    0.0001   -0.0043   -0.0040   -0.0140
1     0.8   10   10    0.0021   -0.0068    0.0025   -0.0160
1     0.8   15   10    0.0015   -0.0040   -0.0003   -0.0134
1     0.8   25   10   -0.0001   -0.0023    0.0027   -0.0054
1     0.8   10   15    0.0036   -0.0033    0.0035   -0.0095
1     0.8   15   15    0.0017   -0.0025    0.0030   -0.0082
1     0.8   25   15    0.0005   -0.0018    0.0017   -0.0058
1     0.8   10   25   -0.0002   -0.0050   -0.0004   -0.0092
1     0.8   15   25   -0.0003   -0.0036   -0.0009   -0.0091
1     0.8   25   25    0.0000   -0.0021   -0.0004   -0.0072

Note: nT and nC represent average sample size.
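The bias values in Table 3 are the difference between the average weighted mean effect size across simulated meta-analyses and the population value δ. A minimal sketch of that computation follows, assuming the conventional inverse-variance weights of Hedges and Olkin (1985); whether the variance estimates entering the weights are the HOM or HET versions is left to the caller. Function names are hypothetical.

```python
def weighted_mean_effect(effects, variances):
    """Inverse-variance weighted mean of k effect size estimates."""
    weights = [1.0 / v for v in variances]
    return sum(w * d for w, d in zip(weights, effects)) / sum(weights)


def bias(mean_estimates, delta):
    """Average of the meta-analytic means across replications, minus delta."""
    return sum(mean_estimates) / len(mean_estimates) - delta


if __name__ == "__main__":
    # Two studies: the more precise one (variance 0.1) gets three times the weight.
    print(weighted_mean_effect([0.0, 1.0], [0.1, 0.3]))
```

Because each weight depends on the study's own variance estimate, any dependence of that estimate on the observed effect size feeds into the weighted mean. This is one mechanism that could produce the negative bias seen for the HET weights when δ = 1.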
Table 4
Bias in Weighted Mean Effect Size Using Homogeneity-Assumed (HOM) and Heterogeneity-Assumed (HET) Variance Estimates as Weights (k=25).

                         σ²_T = σ²_C         σ²_T > σ²_C
δ     ρ     nT   nC       HOM       HET       HOM       HET
0     0     10   10   -0.0035   -0.0034   -0.0005   -0.0003
0     0     15   10   -0.0014   -0.0013   -0.0021   -0.0028
0     0     25   10   -0.0004   -0.0004   -0.0021   -0.0019
0     0     10   15   -0.0006   -0.0006   -0.0012   -0.0028
0     0     15   15   -0.0009   -0.0010   -0.0021   -0.0016
0     0     25   15   -0.0010   -0.0011    0.0014    0.0016
0     0     10   25   -0.0006   -0.0001    0.0014    0.0008
0     0     15   25   -0.0023   -0.0021   -0.0019   -0.0020
0     0     25   25    0.0007    0.0008    0.0007    0.0008
0     0.4   10   10    0.0009    0.0013   -0.0007    0.0003
0     0.4   15   10    0.0001    0.0005   -0.0018   -0.0016
0     0.4   25   10   -0.0013   -0.0012   -0.0006   -0.0007
0     0.4   10   15    0.0018    0.0017    0.0006    0.0013
0     0.4   15   15    0.0019    0.0021    0.0002    0.0001
0     0.4   25   15    0.0006    0.0006   -0.0006   -0.0007
0     0.4   10   25    0.0006    0.0004    0.0001   -0.0005
0     0.4   15   25   -0.0003   -0.0001   -0.0008   -0.0001
0     0.4   25   25    0.0005    0.0004    0.0009    0.0015
0     0.8   10   10   -0.0010   -0.0009    0.0010    0.0011
0     0.8   15   10   -0.0003   -0.0003   -0.0002   -0.0003
0     0.8   25   10    0.0004    0.0004   -0.0004   -0.0003
0     0.8   10   15   -0.0006   -0.0007    0.0004    0.0002
0     0.8   15   15    0.0007    0.0007    0.0009    0.0007
0     0.8   25   15    0.0000    0.0000    0.0013    0.0011
0     0.8   10   25    0.0008    0.0007    0.0006    0.0003
0     0.8   15   25    0.0004    0.0005    0.0007    0.0005
0     0.8   25   25    0.0002    0.0003    0.0004    0.0005
1     0     10   10   -0.0019   -0.0188   -0.0033   -0.0334
1     0     15   10   -0.0008   -0.0114   -0.0006   -0.0221
1     0     25   10   -0.0004   -0.0053   -0.0014   -0.0130
1     0     10   15    0.0006   -0.0128   -0.0005   -0.0234
1     0     15   15    0.0001   -0.0091    0.0004   -0.0167
1     0     25   15   -0.0010   -0.0060    0.0000   -0.0114
1     0     10   25    0.0006   -0.0090    0.0010   -0.0161
1     0     15   25   -0.0009   -0.0084    0.0000   -0.0142
1     0     25   25    0.0000   -0.0051    0.0012   -0.0095
1     0.4   10   10    0.0015   -0.0136    0.0004   -0.0295
1     0.4   15   10   -0.0008   -0.0103   -0.0016   -0.0229
1     0.4   25   10   -0.0017   -0.0058   -0.0001   -0.0120
1     0.4   10   15   -0.0002   -0.0118   -0.0007   -0.0236
1     0.4   15   15    0.0005   -0.0076    0.0005   -0.0173
1     0.4   25   15    0.0000   -0.0046    0.0010   -0.0110
1     0.4   10   25   -0.0002   -0.0083    0.0000   -0.0168
1     0.4   15   25    0.0008   -0.0056   -0.0004   -0.0149
1     0.4   25   25    0.0022   -0.0023    0.0001   -0.0110
1     0.8   10   10   -0.0002   -0.0096    0.0009   -0.0175
1     0.8   15   10    0.0005   -0.0049   -0.0004   -0.0136
1     0.8   25   10   -0.0010   -0.0032   -0.0008   -0.0088
1     0.8   10   15   -0.0003   -0.0071    0.0015   -0.0131
1     0.8   15   15   -0.0001   -0.0045   -0.0012   -0.0127
1     0.8   25   15    0.0002   -0.0020    0.0005   -0.0075
1     0.8   10   25    0.0003   -0.0044    0.0012   -0.0076
1     0.8   15   25   -0.0007   -0.0043   -0.0001   -0.0085
1     0.8   25   25    0.0003   -0.0019    0.0006   -0.0065

Note: nT and nC represent average sample size.
Table 5
Type I Error Rate for Homogeneity of Effect Size Test Using Homogeneity-Assumed (HOM)
and Heterogeneity-Assumed (HET) Variance Estimates (k=10).

                         σ²_T = σ²_C         σ²_T > σ²_C
δ     ρ     nT   nC       HOM       HET       HOM       HET
0     0     10   10    0.0680    0.0441    0.3581    0.0592
0     0     15   10    0.0678    0.0524    0.3007    0.0539
0     0     25   10    0.0570    0.0507    0.2029    0.0464
0     0     10   15    0.0668    0.0483    0.4308    0.0746
0     0     15   15    0.0602    0.0474    0.3653    0.0591
0     0     25   15    0.0572    0.0517    0.2738    0.0504
0     0     10   25    0.0629    0.0546    0.5173    0.0912
0     0     15   25    0.0542    0.0480    0.4614    0.0675
0     0     25   25    0.0589    0.0536    0.3753    0.0523
0     0.4   10   10    0.0609    0.0404    0.4045    0.0552
0     0.4   15   10    0.0585    0.0414    0.3346    0.0495
0     0.4   25   10    0.0541    0.0489    0.2299    0.0433
0     0.4   10   15    0.0619    0.0421    0.4718    0.0715
0     0.4   15   15    0.0536    0.0417    0.3986    0.0577
0     0.4   25   15    0.0549    0.0490    0.2953    0.0406
0     0.4   10   25    0.0599    0.0458    0.5790    0.0886
0     0.4   15   25    0.0560    0.0432    0.5198    0.0651
0     0.4   25   25    0.0526    0.0474    0.4239    0.0482
0     0.8   10   10    0.0583    0.0342    0.5577    0.0611
0     0.8   15   10    0.0516    0.0382    0.4739    0.0492
0     0.8   25   10    0.0516    0.0463    0.3433    0.0344
0     0.8   10   15    0.0510    0.0315    0.6374    0.0702
0     0.8   15   15    0.0528    0.0406    0.5827    0.0543
0     0.8   25   15    0.0523    0.0477    0.4449    0.0348
0     0.8   10   25    0.0529    0.0309    0.7511    0.0908
0     0.8   15   25    0.0514    0.0377    0.6952    0.0593
0     0.8   25   25    0.0514    0.0448    0.5842    0.0359
0.5   0     10   10    0.0663    0.0479    0.3481    0.0572
0.5   0     15   10    0.0670    0.0547    0.2951    0.0544
0.5   0     25   10    0.0583    0.0553    0.1996    0.0519
0.5   0     10   15    0.0703    0.0566    0.4273    0.0774
0.5   0     15   15    0.0632    0.0511    0.3585    0.0614
0.5   0     25   15    0.0570    0.0538    0.2562    0.0483
0.5   0     10   25    0.0568    0.0522    0.5094    0.0972
0.5   0     15   25    0.0583    0.0561    0.4658    0.0721
0.5   0     25   25    0.0557    0.0537    0.3764    0.0551
0.5   0.4   10   10    0.0606    0.0422    0.3920    0.0611
0.5   0.4   15   10    0.0589    0.0497    0.3198    0.0504
0.5   0.4   25   10    0.0528    0.0521    0.2213    0.0458
0.5   0.4   10   15    0.0586    0.0465    0.4590    0.0730
0.5   0.4   15   15    0.0577    0.0511    0.4030    0.0617
0.5   0.4   25   15    0.0541    0.0542    0.2917    0.0487
0.5   0.4   10   25    0.0570    0.0457    0.5664    0.0975
0.5   0.4   15   25    0.0573    0.0502    0.5133    0.0731
0.5   0.4   25   25    0.0496    0.0502    0.4083    0.0494
0.5   0.8   10   10    0.0529    0.0500    0.5252    0.0727
0.5   0.8   15   10    0.0528    0.0559    0.4486    0.0541
0.5   0.8   25   10    0.0524    0.0644    0.3326    0.0450
0.5   0.8   10   15    0.0561    0.0493    0.6194    0.0759
0.5   0.8   15   15    0.0518    0.0558    0.5599    0.0624
0.5   0.8   25   15    0.0544    0.0679    0.4203    0.0424
0.5   0.8   10   25    0.0523    0.0444    0.7375    0.0941
0.5   0.8   15   25    0.0518    0.0549    0.6745    0.0687
0.5   0.8   25   25    0.0501    0.0643    0.5635    0.0445
1     0     10   10    0.0676    0.0635    0.3349    0.0689
1     0     15   10    0.0643    0.0673    0.2727    0.0618
1     0     25   10    0.0600    0.0653    0.1939    0.0563
1     0     10   15    0.0642    0.0623    0.4099    0.0824
1     0     15   15    0.0636    0.0665    0.3432    0.0703
1     0     25   15    0.0545    0.0647    0.2632    0.0611
1     0     10   25    0.0570    0.0605    0.4978    0.0981
1     0     15   25    0.0564    0.0611    0.4523    0.0805
1     0     25   25    0.0605    0.0724    0.3541    0.0631
1     0.4   10   10    0.0620    0.0640    0.3615    0.0683
1     0.4   15   10    0.0617    0.0715    0.3056    0.0638
1     0.4   25   10    0.0547    0.0727    0.2135    0.0569
1     0.4   10   15    0.0607    0.0659    0.4360    0.0804
1     0.4   15   15    0.0573    0.0705    0.3765    0.0740
1     0.4   25   15    0.0546    0.0746    0.2784    0.0563
1     0.4   10   25    0.0574    0.0622    0.5506    0.1039
1     0.4   15   25    0.0572    0.0728    0.4824    0.0804
1     0.4   25   25    0.0545    0.0778    0.3829    0.0597
1     0.8   10   10    0.0536    0.1213    0.4405    0.0940
1     0.8   15   10    0.0561    0.1297    0.3830    0.0788
1     0.8   25   10    0.0521    0.1228    0.2852    0.0637
1     0.8   10   15    0.0586    0.1147    0.5514    0.0999
1     0.8   15   15    0.0517    0.1266    0.4774    0.0813
1     0.8   25   15    0.0538    0.1402    0.3642    0.0667
1     0.8   10   25    0.0479    0.0883    0.6855    0.1115
1     0.8   15   25    0.0526    0.1170    0.6020    0.0854
1     0.8   25   25    0.0508    0.1484    0.4830    0.0604

Note: nT and nC represent average sample size.
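The Type I error rates in Table 5 are the proportion of simulated meta-analyses in which the homogeneity test rejected even though all studies shared one population effect size. A minimal sketch of the test follows, assuming the standard Q statistic of Hedges and Olkin (1985), referred to a chi-square distribution with k - 1 degrees of freedom. The critical values are the tabled .05 points for k = 10 and k = 25; the function names are hypothetical.

```python
# Upper .05 critical values of chi-square, keyed by degrees of freedom (k - 1).
CHI2_CRIT_05 = {9: 16.919, 24: 36.415}


def q_statistic(effects, variances):
    """Q = sum of w_i * (d_i - weighted mean)^2, with w_i = 1 / variance_i."""
    weights = [1.0 / v for v in variances]
    mean = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    return sum(w * (d - mean) ** 2 for w, d in zip(weights, effects))


def rejects_homogeneity(effects, variances):
    """True if Q exceeds the .05 critical value for k - 1 degrees of freedom."""
    df = len(effects) - 1
    return q_statistic(effects, variances) > CHI2_CRIT_05[df]


def type_i_error_rate(rejections):
    """Proportion of replications (list of booleans) in which the test rejected."""
    return sum(rejections) / len(rejections)
```

If the variance estimates are too small, as with the HOM estimates when the treatment-group variance is inflated, the weights are too large and Q is inflated. That is consistent with the high HOM rejection rates in the σ²_T > σ²_C columns.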
Table 6
Type I Error Rate for Homogeneity of Effect Size Test Using Homogeneity-Assumed (HOM)
and Heterogeneity-Assumed (HET) Variance Estimates (k=25).

                         σ²_T = σ²_C         σ²_T > σ²_C
δ     ρ     nT   nC       HOM       HET       HOM       HET
0     0     10   10    0.0745    0.0396    0.6055    0.0548
0     0     15   10    0.0718    0.0474    0.5209    0.0511
0     0     25   10    0.0598    0.0503    0.3582    0.0500
0     0     10   15    0.0645    0.0397    0.7119    0.0830
0     0     15   15    0.0647    0.0451    0.6284    0.0630
0     0     25   15    0.0574    0.0486    0.4724    0.0469
0     0     10   25    0.0609    0.0502    0.8169    0.1101
0     0     15   25    0.0542    0.0432    0.7646    0.0761
0     0     25   25    0.0559    0.0448    0.6459    0.0522
0     0.4   10   10    0.0665    0.0334    0.6607    0.0561
0     0.4   15   10    0.0632    0.0376    0.5601    0.0431
0     0.4   25   10    0.0592    0.0489    0.3985    0.0397
0     0.4   10   15    0.0682    0.0345    0.7612    0.0782
0     0.4   15   15    0.0599    0.0376    0.6750    0.0522
0     0.4   25   15    0.0518    0.0417    0.5220    0.0390
0     0.4   10   25    0.0524    0.0318    0.8686    0.1068
0     0.4   15   25    0.0563    0.0380    0.8193    0.0772
0     0.4   25   25    0.0551    0.0443    0.7106    0.0468
0     0.8   10   10    0.0589    0.0265    0.8581    0.0637
0     0.8   15   10    0.0561    0.0342    0.7711    0.0454
0     0.8   25   10    0.0508    0.0443    0.5956    0.0281
0     0.8   10   15    0.0570    0.0245    0.9182    0.0818
0     0.8   15   15    0.0504    0.0304    0.8666    0.0534
0     0.8   25   15    0.0527    0.0439    0.7404    0.0277
0     0.8   10   25    0.0476    0.0193    0.9724    0.1097
0     0.8   15   25    0.0512    0.0307    0.9490    0.0632
0     0.8   25   25    0.0535    0.0426    0.8706    0.0277
0.5   0     10   10    0.0732    0.0429    0.5953    0.0610
0.5   0     15   10    0.0676    0.0486    0.4987    0.0558
0.5   0     25   10    0.0567    0.0512    0.3473    0.0487
0.5   0     10   15    0.0749    0.0499    0.6944    0.0768
0.5   0     15   15    0.0615    0.0494    0.6043    0.0634
0.5   0     25   15    0.0657    0.0609    0.4640    0.0543
0.5   0     10   25    0.0598    0.0516    0.8149    0.1127
0.5   0     15   25    0.0615    0.0529    0.7600    0.0827
0.5   0     25   25    0.0579    0.0538    0.6272    0.0549
0.5   0.4   10   10    0.0665    0.0384    0.6434    0.0596
0.5   0.4   15   10    0.0578    0.0442    0.5539    0.0507
0.5   0.4   25   10    0.0589    0.0556    0.3860    0.0412
0.5   0.4   10   15    0.0650    0.0403    0.7473    0.0828
0.5   0.4   15   15    0.0611    0.0492    0.6829    0.0589
0.5   0.4   25   15    0.0500    0.0488    0.5160    0.0464
0.5   0.4   10   25    0.0530    0.0388    0.8632    0.1103
0.5   0.4   15   25    0.0521    0.0453    0.8071    0.0803
0.5   0.4   25   25    0.0489    0.0497    0.6903    0.0460
0.5   0.8   10   10    0.0588    0.0473    0.8152    0.0737
0.5   0.8   15   10    0.0569    0.0620    0.7433    0.0494
0.5   0.8   25   10    0.0547    0.0713    0.5640    0.0361
0.5   0.8   10   15    0.0529    0.0426    0.9107    0.0922
0.5   0.8   15   15    0.0501    0.0541    0.8484    0.0664
0.5   0.8   25   15    0.0478    0.0688    0.7034    0.0343
0.5   0.8   10   25    0.0516    0.0365    0.9645    0.1251
0.5   0.8   15   25    0.0534    0.0554    0.9335    0.0756
0.5   0.8   25   25    0.0524    0.0777    0.8520    0.0339
1     0     10   10    0.0779    0.0679    0.5690    0.0761
1     0     15   10    0.0734    0.0739    0.4794    0.0688
1     0     25   10    0.0621    0.0746    0.3294    0.0564
1     0     10   15    0.0702    0.0657    0.6793    0.0990
1     0     15   15    0.0662    0.0694    0.5979    0.0737
1     0     25   15    0.0592    0.0744    0.4423    0.0624
1     0     10   25    0.0626    0.0658    0.7964    0.1173
1     0     15   25    0.0591    0.0690    0.7353    0.0938
1     0     25   25    0.0583    0.0761    0.6072    0.0700
1     0.4   10   10    0.0723    0.0728    0.6147    0.0797
1     0.4   15   10    0.0627    0.0757    0.4988    0.0654
1     0.4   25   10    0.0574    0.0834    0.3585    0.0553
1     0.4   10   15    0.0633    0.0695    0.7104    0.1010
1     0.4   15   15    0.0639    0.0839    0.6314    0.0733
1     0.4   25   15    0.0556    0.0894    0.4806    0.0622
1     0.4   10   25    0.0526    0.0612    0.8419    0.1289
1     0.4   15   25    0.0572    0.0762    0.7718    0.0969
1     0.4   25   25    0.0572    0.0920    0.6382    0.0642
1     0.8   10   10    0.0642    0.1605    0.7396    0.1194
1     0.8   15   10    0.0578    0.1854    0.6459    0.0902
1     0.8   25   10    0.0523    0.1806    0.4879    0.0704
1     0.8   10   15    0.0618    0.1473    0.8485    0.1297
1     0.8   15   15    0.0554    0.1769    0.7735    0.0960
1     0.8   25   15    0.0504    0.2079    0.6216    0.0695
1     0.8   10   25    0.0522    0.1114    0.9403    0.1485
1     0.8   15   25    0.0540    0.1635    0.8931    0.0985
1     0.8   25   25    0.0507    0.2358    0.7805    0.0679

Note: nT and nC represent average sample size.
Figure 1
Type I Error Rate for Q-test When Variance is Homogeneous Across Groups (k=25).
[Figure not reproduced. Two panels (δ = 0 and δ = 1) plot Type I error rate (vertical axis, 0 to 0.25) against pre-post correlation (0, 0.4, 0.8) for the Homogeneity-Assumed and Heterogeneity-Assumed variance estimates.]
Note: The nominal Type I error rate was .05.
Figure 2
Type I Error Rate for Q-test When Variance is Heterogeneous Across Groups (σ²_T > σ²_C).
[Figure not reproduced. Type I error rate (vertical axis, 0 to 1) is plotted against nT/nC (horizontal axis, 0 to 3.0), with separate lines for pre-post correlations r = 0, r = .4, and r = .8 under the Homogeneity-Assumed and Heterogeneity-Assumed variance estimates.]
Note: k = 25. The dashed line indicates the nominal Type I error rate (.05).