Why effect size? - Department of Education

Funded through the ESRC’s Researcher Development Initiative

Sessions 1.2-1.3: Effect Size Calculation
Prof. Herb Marsh, Ms. Alison O’Mara, Dr. Lars-Erik Malmberg
Department of Education, University of Oxford
ESRC RDI One Day Meta-analysis workshop (Marsh, O’Mara, Malmberg)

[Flowchart of the meta-analysis process: Establish research question; Define relevant studies; Develop code materials; Locate and collate studies; Pilot coding, coding; Data entry and effect size calculation; Main analyses; Supplementary analyses]

The effect size makes meta-analysis possible
- It is based on the “dependent variable” (i.e., the outcome).
- It standardizes findings across studies so that they can be directly compared.

Any standardized index can be an “effect size” (e.g., standardized mean difference, correlation coefficient, odds ratio), but it must
- be comparable across studies (standardization)
- represent the magnitude and direction of the relationship
- be independent of sample size

Different studies in the same meta-analysis can be based on different statistics, but each has to be transformed to a standardized effect size that is comparable across studies.

Sample size, significance and d effect size

Study  Group  N    M    SD  t      p      d
1      Exp    10   105  15  0.750  0.466  0.333
1      Cntr   10   100  15
2      Exp    50   105  15  1.667  0.099  0.333
2      Cntr   50   100  15
3      Exp    100  105  15  2.360  0.019  0.333
3      Cntr   100  100  15

The mean difference and d are identical in all three studies; only the sample size, and with it t and p, changes.

Simulate ds on homemade calculator
T-test and effect sizes (ES.xls)

           Treatment  Control  pooled SD
M          105        100
SD         15         15       15
N          15         15
t = 0.91, df = 28, sign. = 0.3691 (two-tailed), d = 0.33

           Treatment  Control  pooled SD
M          100        105
SD         15         15       15
N          15         15
t = -0.91, df = 28, sign. = 0.3691 (two-tailed), d = -0.33

Things to try:
- Change direction of effects
- Change Ns (equal or unequal?)
- Change SDs

Effect size as proportion in the Treatment group doing better than the average Control group person
[Figure: three panels of overlapping Control and Treatment distributions]
- d = .20: 57% of T above the Control mean
- d = .50: 69% of T above the Control mean
- d = .80: 79% of T above the Control mean

Effect size as proportion of success in the Treatment versus Control group (Binomial Effect Size Display = BESD)
[Figure: three panels of overlapping Control and Treatment distributions]
- d = .20: success in 55% of T, 45% of C
- d = .50: success in 62% of T, 38% of C
- d = .80: success in 68% of T, 32% of C

Why effect size?
- There has long been a focus on significance level (safeguarding against Type I (α) error); today the focus is on practical and meaningful significance.
- Cohen, J. (1994). The earth is round (p < .05), American Psychologist, 49, 997–1003.

                   Real world: H0 true   Real world: H1 true
Study: accept H0   ok                    Type II (β) error
Study: accept H1   Type I (α) error      ok

A short history of the effect size
(Huberty, 2002; see also Olejnik & Algina, 2000 for review of effect sizes)

Power and effect size
- Power: “finding what is out there”
- Type II (β) error: “not finding what is out there”
- Power (1 – β): the probability of rejecting a false H0 hypothesis
- Power of .80 or .90 in primary research

Power, sought effect size, at significance level α = .05 in primary research (prior to conducting study)
[Figure: Sample size for three effect sizes, α = .05. x-axis: power (0.50 to 0.90); y-axis: N needed per sample (0 to 600); one line each for effect .20, .25 and .30]

How meaningful is a “small” effect size?
Raw counts
          No heart attack  Heart attack  Total
Aspirin   10,933           104           11,037
Placebo   10,845           189           11,034
Total     21,778           293           22,071

Percentages (row)
          No heart attack  Heart attack  Total
Aspirin   99.06            0.94          100
Placebo   98.29            1.71          100
Total     98.67            1.33          100

Binomial effect size display (proportions)
          No heart attack  Heart attack
Aspirin   0.517            0.483
Placebo   0.483            0.517

- A small effect size changed the course of an RCT in 1987: placebo group participants were given aspirin instead (see Rosenthal, 1994, p. 242).

$\chi^2_{(1)} = 25.01$

$r = \phi = \sqrt{\chi^2 / n} = \sqrt{25.01 / 22071} = 0.034 \quad (r^2 = .0011)$

BESD (Binomial Effect Size Display):
- Treatment success rate: $.50 + r/2$
- Control success rate: $.50 - r/2$

Which effect size?
- Within one meta-analysis, you can include studies based on any combination of statistical analyses (e.g., t-tests, ANOVA, correlation, odds ratio, chi-square).
- The “art” of meta-analysis is how to compute effect sizes based on non-standard designs and studies that do not supply complete data (see Lipsey&Wilson_AppB.pdf).
- Convert all effect sizes into a common metric based on the “natural” metric given research in the area, e.g., d, r, OR.

- Standardized mean difference
  - Group contrast research (treatment groups or naturally occurring groups)
  - Inherently continuous construct
- Correlation coefficient
  - Association between inherently continuous constructs
- Odds ratio
  - Group contrast research (treatment or naturally occurring groups)
  - Inherently dichotomous construct
- Regression coefficients and other multivariate effects
  - Requires access to variance-covariance (correlation) matrices for each included study

Calculating ds (1)
[Diagram: means and standard deviations, correlations, p-values, F-statistics, t-statistics and “other” test statistics all feed into d]
Almost all test statistics can be transformed into a standardized effect size “d”.

Calculating ds (1)
- Represents a standardized group contrast on an inherently continuous measure
- Uses the pooled standard deviation
- Commonly called “d”

$ES = \dfrac{\bar{X}_{G1} - \bar{X}_{G2}}{s_{pooled}}$

If $n_1 = n_2$: $s_{pooled} = \sqrt{\dfrac{s_1^2 + s_2^2}{2}}$

If $n_1 \neq n_2$: $s_{pooled} = \sqrt{\dfrac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}}$

Various contrast effect sizes
- Cohen’s d: $ES = d = \dfrac{\bar{X}_{G1} - \bar{X}_{G2}}{SD_{pooled}}$
- Hedges’ g: $ES = g = \dfrac{\bar{X}_{G1} - \bar{X}_{G2}}{S_{pooled}}$
- Glass’s Δ: $ES = \Delta = \dfrac{\bar{X}_{G1} - \bar{X}_{G2}}{s_C}$, where $s_C$ is the SD of the control group

Calculating d (1) using Ms, SDs and ns
T-test and effect sizes

           Treatment  Control  pooled SD
M          25         20
SD         5          5        5
N          25         30
t = 3.6927, df = 53, sign. = 0.0005 (two-tailed), d = 1.0000

$ES = \dfrac{\bar{X}_{Exper} - \bar{X}_{Control}}{s_{pooled}} = \dfrac{25 - 20}{5} = 1.00$

Remember to code treatment effect in positive direction!
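The calculation on the slide above can be sketched in a few lines of Python. This is only an illustration, not the workshop's ES.xls calculator: the function names `pooled_sd`, `cohens_d` and `t_from_d` are made up for this example, and the t-value is recovered from d under the assumption of an independent-samples t-test with pooled variance.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled SD weighted by degrees of freedom.

    With n1 == n2 this reduces to sqrt((sd1**2 + sd2**2) / 2).
    """
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def cohens_d(m_treat, sd_treat, n_treat, m_ctrl, sd_ctrl, n_ctrl):
    """Standardized mean difference, treatment minus control."""
    return (m_treat - m_ctrl) / pooled_sd(sd_treat, n_treat, sd_ctrl, n_ctrl)


def t_from_d(d, n1, n2):
    """Independent-samples t implied by d (pooled-variance assumption)."""
    return d / math.sqrt(1 / n1 + 1 / n2)


# Slide example: treatment M = 25, SD = 5, N = 25; control M = 20, SD = 5, N = 30
d = cohens_d(25, 5, 25, 20, 5, 30)
t = t_from_d(d, 25, 30)
print(f"d = {d:.2f}, t = {t:.4f}, df = {25 + 30 - 2}")
```

Swapping the treatment and control arguments flips the sign of d, which is why the slide insists on coding the treatment effect in the positive direction.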
ES_calculator.xls

Calculating d (2) using ES calculator, using Ms, ns, and t-value

Calculating d (3) using ES calculator, using ns and t-value
- “The treatment group scored higher than the control group at Time 2 (t[28] = 4.11; p < .001).”
- From the sample description we learn that $n_1 = n_2$ (15 per group).

$ES_{sm} = t \sqrt{\dfrac{n_1 + n_2}{n_1 n_2}} = 4.11 \sqrt{\dfrac{15 + 15}{(15)(15)}} = 4.11 \sqrt{\dfrac{30}{225}} = 4.11 \sqrt{.1333} = (4.11)(0.365) = 1.50$

Calculating d (3) correcting for small sample bias
Hedges proposed a correction for small sample size bias (ns < 20). It must be applied before analysis.

$ES'_{sm} = ES_{sm} \left[ 1 - \dfrac{3}{4N - 9} \right]$

$ES'_{sm} = 1.5 \left[ 1 - \dfrac{3}{4(30) - 9} \right] = (1.5)(.97) = 1.46$

Calculating d (4) using ES calculator, using ns and F-value

$ES = 2 \sqrt{\dfrac{F}{N}} = 2 \sqrt{\dfrac{1.6}{40}} = .40$

Remember: in a two-group ANOVA, $F = t^2$.

Calculating d (5) using ES calculator, using p-value
“The mean-level comparison was not significant (p = .53)”

T-test table
df = (n1 + n2 – 2)
Sometimes authors only report, e.g., p < .01 (n = 22). If so, use a conservative approach to reading the t-test table.
NOTE: When p = n.s., some researchers code d = 0 in the database.

Example dataset so far (1) (ES_enter.sav):

study  es    Treat  Cntr  n    Groups
1      1.00  25     30    55   2
2      0.80  40     40    80   2
3      1.46  15     15    30   2
4      0.40  20     20    40   2
5      0.10  80     75    155  2

Use all available tools for calculating the following 5 effect sizes
- ES 6: MT = 21, MC = 20, nT = 60, nC = 60, t = .55
- ES 7: MT = 103.5, MC = 100, SDT = 22.0, SDC = 18.5, nT = 45, nC = 35
- ES 8: nT = 45, nC = 40, p < .05
- ES 9: nT = 100, nC = 120, F = 8.73
- ES 10: nT = 200, nC = 160, t = 5.66
(see electronic document: “Correct ds for 5 effect sizes.doc”)

Example dataset so far (2) (ES_enter.sav):

study  es    Treat  Cntr  n    Groups
1      1.00  25     30    55   2
2      0.80  40     40    80   2
3      1.46  15     15    30   2
4      0.40  20     20    40   2
5      0.10  80     75    155  2
6      0.10  60     60    120  2
7      0.17  45     35    80   2
8      0.43  45     40    85   2
9      0.40  100    120   220  2
10     0.60  200    160   360  2

Calculating d (11) using ES calculator, using number of successful outcomes per group

Frequencies        Success  Failure
Treatment group    a        b
Control group      c        d

           Success  Failure  Total
Treatment  28       28       56
Control    31       34       65
Total      59       62       121

$ES_{OR} = \dfrac{ad}{bc} = \dfrac{28 \times 34}{28 \times 31} = 1.097$

$\log_e(1.097) = .092$

logit effect size: $.092 / 1.83 = .05$

Calculating d (12) using ES calculator, using proportion of successes per group (53% vs. 48.5%)

Calculating d (13) using paired t-test (only one experimental group; “each person their own control”)

$ES_{sg} = \dfrac{\bar{X}_{T2} - \bar{X}_{T1}}{s_{pooled}}$

$\bar{X}_{T1} = 4.50,\ \bar{X}_{T2} = 6.25,\ s_P = 2.50$

$ES_{sg} = \dfrac{6.25 - 4.50}{2.50} = .70$

Don’t use the SD of the change score!
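The conversions worked through on the preceding slides (t to d, the small-sample correction, F to d, odds ratio to d, and the single-group gain score) can be collected in one short Python sketch. The function names are invented for this illustration and are not taken from ES_calculator.xls. Two assumptions are worth flagging: `d_from_f` takes equal group sizes for granted, and the odds-ratio conversion divides by 1.83 only because the slides do (the standard deviation of the logistic distribution, π/√3, is about 1.81).

```python
import math


def d_from_t(t, n1, n2):
    """d from an independent-samples t-value and the two group sizes."""
    return t * math.sqrt((n1 + n2) / (n1 * n2))


def small_sample_correction(es, n_total):
    """Hedges' correction for small-sample bias; n_total is the total N."""
    return es * (1 - 3 / (4 * n_total - 9))


def d_from_f(f, n_total):
    """d from a two-group ANOVA F (F = t**2), assuming equal group sizes."""
    return 2 * math.sqrt(f / n_total)


def d_from_odds_ratio(a, b, c, d, divisor=1.83):
    """Logit d from a 2x2 table: a, b = treatment success, failure;
    c, d = control success, failure. divisor=1.83 follows the slides."""
    odds_ratio = (a * d) / (b * c)
    return math.log(odds_ratio) / divisor


def d_gain(mean_t1, mean_t2, s_pooled):
    """Single-group gain-score effect size (pooled SD, not the change-score SD)."""
    return (mean_t2 - mean_t1) / s_pooled


print(f"t -> d:      {d_from_t(4.11, 15, 15):.2f}")
print(f"corrected:   {small_sample_correction(1.5, 30):.2f}")
print(f"F -> d:      {d_from_f(1.6, 40):.2f}")
print(f"OR -> d:     {d_from_odds_ratio(28, 28, 31, 34):.2f}")
print(f"gain score:  {d_gain(4.50, 6.25, 2.50):.2f}")
```

Each call uses the numbers from the corresponding slide, so the printed values can be compared with the es column of the example dataset.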
For gain scores: r = correlation between Time 1 and Time 2, and

$s_{pooled} = \sqrt{(s_{T1}^2 + s_{T2}^2) / 2}$

Calculating d (14) using paired t-test (only one experimental group)
n (pairs) = 90, t-value = 6.5, r = .70

Calculating d (15)
- “The 20 participants increased .84 z-scores between time 1 and time 2 (p < .01)”
- ES = .84
- Correct for small sample bias:

$ES'_{sm} = .84 \left[ 1 - \dfrac{3}{(4)(20) - 9} \right] = (.84)(.957) = .80$

Example dataset so far (3) (ES_enter.sav):

study  es    Treat  Cntr  n    Groups
1      1.00  25     30    55   2
2      0.80  40     40    80   2
3      1.46  15     15    30   2
4      0.40  20     20    40   2
5      0.10  80     75    155  2
6      0.10  60     60    120  2
7      0.17  45     35    80   2
8      0.43  45     40    85   2
9      0.40  100    120   220  2
10     0.60  200    160   360  2
11     0.05  56     65    121  2
12     0.10  70     80    150  2
13     0.70  80     0     80   1
14     0.53  90     0     90   1
15     0.80  20     0     20   1

Method difference: mean contrast (Groups = 2) and gain scores (Groups = 1)

Summary of equations from Lipsey & Wilson (2001)
(for more formulae see Lipsey & Wilson Appendix B)

Weighting for mean-level differences
The effect sizes are weighted by the inverse of the variance to give more weight to effects based on larger sample sizes.
Variance for a mean-level comparison is calculated as

$v_i = \dfrac{n_1 + n_2}{n_1 n_2} + \dfrac{d_i^2}{2 (n_1 + n_2)}$

The standard error of each effect size is given by the square root of the sampling variance:

$SE = \sqrt{v_i}$

Enter_w.xls

$se = \sqrt{\dfrac{n_1 + n_2}{n_1 n_2} + \dfrac{d_i^2}{2 (n_1 + n_2)}}$

study  es    Treat  Cntr  n    Groups  se
1      1.00  25     30    55   2       0.2871
2      0.80  40     40    80   2       0.2324
3      1.46  15     15    30   2       0.4109
4      0.40  20     20    40   2       0.3194
5      0.10  80     75    155  2       0.1608
6      0.10  60     60    120  2       0.1827
7      0.17  45     35    80   2       0.2258
8      0.43  45     40    85   2       0.2198
9      0.40  100    120   220  2       0.1367
10     0.60  200    160   360  2       0.1084
11     0.05  56     65    121  2       0.1824
12     0.10  70     80    150  2       0.1638

Weighting for gain scores
SE for gain scores:

$SE_{sg} = \sqrt{\dfrac{2 (1 - r)}{n} + \dfrac{ES_{sg}^2}{2n}}$

T1 and T2 scores are dependent, so we need to get the correlation between T1 and T2 into the equation (it is not always reported).
Inverse variance for gain scores:

$w_{sg} = \dfrac{1}{SE_{sg}^2} = \dfrac{2n}{4 (1 - r) + ES_{sg}^2}$

Enter_w.xls

study  es    Treat  Cntr  n   Groups  r     se
13     0.70  80     0     80  1       0.65  0.1087
14     0.53  90     0     90  1       0.70  0.0907
15     0.80  20     0     20  1       0.50  0.2569

Compute the weighted mean ES and s.e. of the ES in SPSS (var_ofES.sps), parts (1) and (2)

Compute the weighted mean ES and s.e. of the ES
Weight the ES by the inverse of the squared s.e. (the inverse variance):

$w = \dfrac{1}{SE_{ES}^2}$

The average ES:

$\overline{ES} = \dfrac{\sum (w_i ES_i)}{\sum w_i}$

Standard error of the average ES:

$SE_{\overline{ES}} = \sqrt{\dfrac{1}{\sum w_i}}$

Enter_w.xls
For study 1: $w = 1 / SE_{ES}^2 = 1 / (0.2871)^2 = 12.13$ and $wes = w \times ES = 12.13 \times 1.000 = 12.13$.

study  es    Treat  Cntr  n    Groups  r     se      w       wes
1      1.00  25     30    55   2             0.2871  12.13   12.13
2      0.80  40     40    80   2             0.2324  18.52   14.81
3      1.46  15     15    30   2             0.4109  5.92    8.65
4      0.40  20     20    40   2             0.3194  9.80    3.92
5      0.10  80     75    155  2             0.1608  38.66   3.87
6      0.10  60     60    120  2             0.1827  29.96   3.00
7      0.17  45     35    80   2             0.2258  19.62   3.34
8      0.43  45     40    85   2             0.2198  20.70   8.90
9      0.40  100    120   220  2             0.1367  53.48   21.39
10     0.60  200    160   360  2             0.1084  85.11   51.06
11     0.05  56     65    121  2             0.1824  30.07   1.50
12     0.10  70     80    150  2             0.1638  37.29   3.69
13     0.70  80     0     80   1       0.65  0.1087  84.66   59.26
14     0.53  90     0     90   1       0.70  0.0907  121.55  64.42
15     0.80  20     0     20   1       0.50  0.2569  15.15   12.12
Sums                                                 582.63  272.07

$\overline{ES} = \dfrac{\sum (w_i ES_i)}{\sum w_i} = \dfrac{272.07}{582.63} = 0.47$

$SE_{\overline{ES}} = \sqrt{\dfrac{1}{\sum w_i}} = \sqrt{\dfrac{1}{582.63}} = 0.04$

Funnel plot for x = sample size, y = ES
- Does the average ES converge toward the ES of the largest (n) study?

average es         0.47
se of mean es      0.04
95% C.I. lower     0.39
95% C.I. upper     0.55

95% C.I. = ±1.96 * s.e.
99% C.I. = ±2.58 * s.e.
99.9% C.I. = ±3.29 * s.e.

[Figure: Effect sizes by sample size. x-axis: sample size (0 to 400); y-axis: ES (0.00 to 1.60)]

Funnel plot including s.e. of ES
- ES in a smaller sample has a larger standard error (s.e.)
[Figure: Effect sizes by sample size, with s.e. of each ES. x-axis: sample size (0 to 400); y-axis: ES (-0.25 to 2.00)]

Population: N = ‘size’, μ = ‘mean’, δ = ‘effect size’
Sample: n = ‘size’, m = ‘mean’, d = ‘effect size’

Interval estimates
The “likely” population parameter is the sample parameter ± uncertainty
- Standard errors (s.e.)
- Confidence intervals (C.I.)

Calculating rs
[Diagram: means and standard deviations (d), χ², p-values, F-statistics, t-statistics and “other” test statistics all feed into r]
Almost all test statistics can be transformed into a standardized effect size “r”.

Correlations / relationships between variables
- r_xy: Pearson’s product moment coefficient (continuous × continuous)
- r_pb: point-biserial correlation (dichotomous × continuous)
- χ² (dichotomous × dichotomous)
- r_s: Spearman’s rank-order coefficient (ordinal × ordinal)
And others, e.g.,
- φ coefficient, odds ratio (OR)
- Cramér’s V, contingency coefficient C
- Tetrachoric and polychoric correlations, etc.
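Returning to the inverse-variance weighting and interval estimates described above, the weighted mean ES, its standard error and the 95% confidence interval can be recomputed from the example dataset with a short Python sketch. It is not a translation of var_ofES.sps or Enter_w.xls; the names `TWO_GROUP`, `GAIN`, `var_contrast`, `var_gain` and `weighted_mean` are hypothetical, and the effect sizes are the rounded values printed in the table, so results can differ from the spreadsheet in the last decimal.

```python
import math

# (es, n_treat, n_ctrl) for the two-group studies 1-12
TWO_GROUP = [
    (1.00, 25, 30), (0.80, 40, 40), (1.46, 15, 15), (0.40, 20, 20),
    (0.10, 80, 75), (0.10, 60, 60), (0.17, 45, 35), (0.43, 45, 40),
    (0.40, 100, 120), (0.60, 200, 160), (0.05, 56, 65), (0.10, 70, 80),
]
# (es, n, r) for the single-group gain-score studies 13-15
GAIN = [(0.70, 80, 0.65), (0.53, 90, 0.70), (0.80, 20, 0.50)]


def var_contrast(d, n1, n2):
    """Sampling variance of d for a two-group mean contrast."""
    return (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))


def var_gain(es, n, r):
    """Sampling variance of a gain-score ES; r is the T1-T2 correlation."""
    return 2 * (1 - r) / n + es**2 / (2 * n)


def weighted_mean(effects, variances, z=1.96):
    """Inverse-variance weighted mean, its s.e., and a z-based C.I."""
    weights = [1 / v for v in variances]
    mean = sum(w * es for w, es in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    return mean, se, (mean - z * se, mean + z * se)


effects = [es for es, _, _ in TWO_GROUP] + [es for es, _, _ in GAIN]
variances = ([var_contrast(es, n1, n2) for es, n1, n2 in TWO_GROUP]
             + [var_gain(es, n, r) for es, n, r in GAIN])

mean_es, se_mean, ci = weighted_mean(effects, variances)
print(f"mean ES = {mean_es:.2f}, s.e. = {se_mean:.2f}, "
      f"95% C.I. = {ci[0]:.2f} to {ci[1]:.2f}")
```

Passing `z=2.58` or `z=3.29` to `weighted_mean` gives the 99% and 99.9% intervals listed on the funnel plot slide.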
Bias when dichotomising continuous variables
- X and Y are both “truly” continuous, but in the study one of them is dichotomised: X = continuous, Y = 50/50 split gives an r_pb that is 80% of the value it would have had if Y had been continuous.
- X and Y are both “truly” continuous, but both are dichotomised: the maximum value of φ if X = 30/70 split and Y = 50/50 split is φ = .33.

Calculating rs from d (1)
r can be used in all situations where d can, but d cannot be used in all situations where r is appropriate.

r      d
0.90   4.13
0.80   2.67
0.70   1.96
0.60   1.50
0.50   1.15
0.40   0.87
0.30   0.63
0.20   0.41
0.10   0.20
0.00   0.00
-0.10  -0.20
-0.20  -0.41
-0.30  -0.63
-0.40  -0.87
-0.50  -1.15
-0.60  -1.50
-0.70  -1.96
-0.80  -2.67
-0.90  -4.13

Calculating rpb (2)
- If X and Y are inherently continuous, the mean contrast is a better option than r_pb.

$ES_{sm} = \dfrac{2 r_{pb}}{\sqrt{1 - r_{pb}^2}}$

Calculating r (3) from t-value
Appropriate for both independent and dependent samples t-test values.

$r = \sqrt{\dfrac{t^2}{t^2 + df}} = \dfrac{t}{\sqrt{t^2 + df}}$

Calculating r (4) from χ²-value

$r = \phi = \sqrt{\dfrac{\chi^2}{N}}, \qquad r = \dfrac{Z}{\sqrt{N}}$

Sources of error
Cf. Structural Equation Model (circle = latent/unobserved construct, rectangle = manifest/observed variable)
[Diagram: latent (unobserved) X and latent (unobserved) Y are correlated $r_{x^*y^*}$; latent X points to the manifest (observed) variable x with reliability $r_{xx}$; latent Y points to the manifest (observed) variable y with reliability $r_{yy}$; the observed correlation is $r_{xy}$]

$r_{x^*y^*} = \dfrac{r_{xy}}{\sqrt{r_{xx}} \sqrt{r_{yy}}}$

Alternatively: transform rs into Fisher’s Zr-transformed rs, which are more normally distributed

$ES_{Zr} = .5 \log_e \left[ \dfrac{1 + ES_r}{1 - ES_r} \right], \qquad SE_{Zr} = \dfrac{1}{\sqrt{n - 3}}$

r      Fisher's zr
0.90   1.47
0.80   1.10
0.70   0.87
0.60   0.69
0.50   0.55
0.40   0.42
0.30   0.31
0.20   0.20
0.10   0.10
0.00   0.00
-0.10  -0.10
-0.20  -0.20
-0.30  -0.31
-0.40  -0.42
-0.50  -0.55
-0.60  -0.69
-0.70  -0.87
-0.80  -1.10
-0.90  -1.47

Worked example for study 1 (rr.xls):

$r_{x^*y^*} = \dfrac{r_{xy}}{\sqrt{r_{xx}} \sqrt{r_{yy}}} = \dfrac{.46}{\sqrt{.80} \sqrt{.70}} = .46 / .74 = .61$

study  r       n     aY    aX    rx*y*    SEZr    FisherZr  FisherZr_dis
1      0.460   145   0.80  0.70  0.6147   0.0839  0.4973    0.7164
2      0.330   132   0.70  0.71  0.4681   0.0880  0.3428    0.5076
3      0.250   80    0.83  0.78  0.3107   0.1140  0.2554    0.3213
4      -0.200  442   0.82  0.80  -0.2469  0.0477  -0.2027   -0.2521
5      -0.250  662   0.86  0.69  -0.3245  0.0390  -0.2554   -0.3367
6      -0.400  320   0.75  0.80  -0.5164  0.0562  -0.4236   -0.5714
7      -0.100  450   0.89  0.83  -0.1163  0.0473  -0.1003   -0.1169
8      0.100   106   0.82  0.87  0.1184   0.0985  0.1003    0.1190
9      0.275   1927  0.71  0.76  0.3744   0.0228  0.2823    0.3935
10     0.150   2863  0.80  0.83  0.1841   0.0187  0.1511    0.1862

[Figure: Ten effect sizes (r). x-axis: N (0 to 3500); y-axis: ES (r) (-0.600 to 0.600)]
[Figure: Ten disattenuated ES (rx*y*). x-axis: N (0 to 3500); y-axis: ES (disattenuated r) (-0.600 to 0.800)]

Calculating OR (chi2.sps)

$r = \phi = \sqrt{\dfrac{\chi^2_{(1)}}{N}}, \qquad r = \dfrac{Z}{\sqrt{N}}$

Crosstabulation: inoculated * escaped disease (count)

                    caught disease (0)  escaped (1)  Total
non-inoculated (0)  75                  204          279
inoculated (1)      32                  265          297
Total               107                 469          576

$\chi^2_{(1)} = 24.68$
$r = \phi = \sqrt{\chi^2 / n} = \sqrt{24.68 / 576} = 0.207$

Frequencies        Success  Failure
Treatment group    a        b
Control group      c        d

$ES_{OR} = \dfrac{ad}{bc} = \dfrac{265 \times 75}{204 \times 32} = 3.04$

or, using cell proportions: $ES_{OR} = \dfrac{.46 \times .13}{.35 \times .06} = 3.04$ (computed from unrounded proportions)

$\log_e(3.04) = 1.11$

Risk Estimate (SPSS output)
                                                             Value  95% C.I. lower  95% C.I. upper
Odds ratio for inoculated (0 non-inoculated / 1 inoculated)  3.045  1.937           4.786
For cohort escaped disease = 0 caught disease                2.495  1.706           3.649
For cohort escaped disease = 1 escaped                       .819   .755            .889
N of valid cases                                             576

logit effect size: $1.11 / 1.83 = .61$

Pearson’s 5 studies escaping Enteric Fever (1904)

N
Study  non-inoculated cases  non-inoculated escaped  inoculated cases  inoculated escaped
1      75                    204                     32                265
2      1489                  9040                    35                1670
3      257                   10724                   26                2509
4      82                    1203                    72                1135
5      1475                  109034                  84                10798

Proportions
Study  non-inoculated cases  non-inoculated escaped  inoculated cases  inoculated escaped
1      0.27                  0.73                    0.11              0.89
2      0.14                  0.86                    0.02              0.98
3      0.02                  0.98                    0.01              0.99
4      0.06                  0.94                    0.06              0.94
5      0.01                  0.99                    0.01              0.99

Ratios (escaped / cases), odds ratio (success of inoculation / success of non-inoculation) and natural logarithm of OR
Study  non-inoculated  inoculated  OR    LN(OR)
1      2.72            8.28        3.04  1.11
2      6.07            47.71       7.86  2.06
3      41.73           96.50       2.31  0.84
4      14.67           15.76       1.07  0.07
5      73.92           128.55      1.74  0.55

Odds of an event

$\text{odds} = \dfrac{p}{1 - p}$, where p = probability of the event

p     p/(1-p)  logit = ln(p/(1-p))  EXP(logit)
0.01  0.01     -4.595               0.0101
0.05  0.05     -2.944               0.0526
0.10  0.11     -2.197               0.1111
0.20  0.25     -1.386               0.2500
0.30  0.43     -0.847               0.4286
0.40  0.67     -0.405               0.6667
0.50  1.00     0.000                1.0000
0.60  1.50     0.405                1.5000
0.70  2.33     0.847                2.3333
0.80  4.00     1.386                4.0000
0.90  9.00     2.197                9.0000
0.95  19.00    2.944                19.0000
0.99  99.00    4.595                99.0000

$ES_{OR} = \dfrac{265 \times 75}{32 \times 204} = 3.044$

$\log_e(3.044) = 1.113$

$1.113 / 1.83 = 0.61$

[Screenshot of the database: each study is one line in the data base; columns include effect size, variance of the effect size, sample sizes, duration, and reliability of the instrument]

Organising effect sizes within study (1)
“Flat dataset”

Study_ID  ES1   ES2   ES3   ES4   DV1cat  DV2cat  DV3cat  DV4cat
001       0.77  .     .     .     1       .       .       .
002       0.20  0.05  0.10  .     2       2       4       .
003       0.40  0.30  .     .     3       1       .       .
004       0.25  0.22  .     .     1       1       .       .
005       0.30  0.40  0.10  .     4       4       2       .
006       0.60  0.50  0.30  0.30  2       2       1       4

Categories of verbal DV: 1 = verbal IQ, 2 = reading comprehension, 3 = reading-lag, 4 = spelling-lag

Organising effect sizes within study (2)
“Hierarchical dataset” (effect sizes nested within study)

Study_ID  ES    DVcat
001       0.77  1
002       0.20  2
002       0.05  2
002       0.10  4
003       0.40  3
003       0.30  1
004       0.25  1
004       0.22  1
005       0.30  4
005       0.40  4
005       0.10  2
006       0.60  2
006       0.50  2
006       0.30  1
006       0.30  4

Organising effect sizes within study (3)
“Hierarchical dataset”, with one construct per DV per study

Study_ID  ES     DVcat
001       0.770  1
002       0.130  2
002       0.100  4
003       0.400  3
003       0.300  1
004       0.235  1
005       0.350  4
005       0.100  2
006       0.555  2
006       0.300  1
006       0.300  4

Organising effect sizes within study (4)
“Hierarchical dataset”, with one DV per study

Study_ID  ES
001       0.770
002       0.120
003       0.350
004       0.240
005       0.270
006       0.430

NOTE: an alternative to aggregating ESs within study is multilevel meta-analysis.

Exercise: effect size calculation (4 method/result extracts from journals)
- Do boys have higher general (global) self-concept (self-worth) than girls?
- Decide which effect size to use (d, r, OR).
- Calculate appropriate effect sizes.

Effect size literature
- Cohen, J. (1969). Statistical Power Analysis for the Behavioral Sciences, 1st Edition. Hillsdale: Lawrence Erlbaum Associates (2nd Edition, 1988).
- Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49, 997–1003.
- Gwet, K. (2001). Handbook of interrater reliability. How to estimate the level of agreement between two or multiple raters. Gaithersburg: STATAXIS Publishing.
- Huberty, C. J. (2002). A history of effect size indices. Educational and Psychological Measurement, 62, 227–240.
- McCartney, K., & Rosenthal, R. (2000). Effect size, practical importance, and social policy for children. Child Development, 71, 173–180.
- Olejnik, S., & Algina, J. (2000). Measures of effect size for comparative studies: Applications, interpretations, and limitations. Contemporary Educational Psychology, 25, 241–286.
