Meta-analysis: summarising data for two arm trials and other simple outcome studies
Steff Lewis, statistician

When can/should you do a meta-analysis?
• When more than one study has estimated an effect
• When there are no differences in the study characteristics that are likely to substantially affect outcome
• When the outcome has been measured in similar ways
• When the data are available (take care with interpretation when only some data are available)

Types of data
• Dichotomous/binary data
• Counts of infrequent events
• Short ordinal scales
• Long ordinal scales
• Continuous data
• Censored data

What to collect
• Need the total number of patients in each treatment group
Plus:
• Binary data
– The number of patients who had the relevant outcome in each treatment group
• Continuous data
– The mean and standard deviation of the outcome in each treatment group
• Then enter data into RevMan / MIX (easy to use and free)
http://www.mix-for-meta-analysis.info/
http://www.cc-ims.net/RevMan/
Or R (harder to use and free)
Or Stata (harder to use and costs money)
Etc etc....

Summary statistic for each study
• Calculate a single summary statistic to represent the effect found in each study
• For binary data
– Risk ratio with rarer event as outcome
• For continuous data
– Difference between means

Meta-analysis

Averaging studies
• Starting with the summary statistic for each study, how should we combine these?
• A simple average gives each study equal weight
• This seems intuitively wrong
• Some studies are more likely to give an answer closer to the ‘true’ effect than others

Weighting studies
• More weight to the studies which give us more information
– More participants
– More events
– Lower variance
• Weight is closely related to the width of the study confidence interval: wider confidence interval = less weight

Displaying results graphically
• RevMan (the Cochrane Collaboration’s free meta-analysis software) and MIX produce forest plots (as do R and Stata and some other packages)

[Forest plot]
Review: Corticosteroids for acute traumatic brain injury
Comparison: 01 Any steroid administered in any dose against no steroid
Outcome: 01 Death at end of follow up period

Study or sub-category   Steroid n/N   Control n/N   RR (fixed) [95% CI]
Alexander 1972          16/55         22/55         0.73 [0.43, 1.23]
Brackman 1983           44/81         47/80         0.92 [0.70, 1.21]
CRASH 2004              1052/4985     893/4979      1.18 [1.09, 1.27]
Chacon 1987             1/5           0/5           3.00 [0.15, 59.8…]
Cooper 1979             26/49         13/27         1.10 [0.69, 1.77]
Dearden 1986            33/68         21/62         1.43 [0.94, 2.19]
Faupel 1976             16/67         16/28         0.42 [0.24, 0.71]
Gaab 1994               19/133        21/136        0.93 [0.52, 1.64]
Giannotta 1984          34/72         7/16          1.08 [0.59, 1.98]
Grumme 1995             38/175        49/195        0.86 [0.60, 1.25]
Hemesniemi 1979         35/81         36/83         1.00 [0.70, 1.41]
Pitts 1980              114/201       38/74         1.10 [0.86, 1.42]
Ransohoff 1972          9/17          13/18         0.73 [0.43, 1.25]
Saul 1981               8/50          9/50          0.89 [0.37, 2.12]
Stubbs 1989             13/98         5/54          1.43 [0.54, 3.80]
Zagara 1987             4/12          4/12          1.00 [0.32, 3.10]
Zarete 1995             0/30          0/30          Not estimable
Total (95% CI)          6179          5904          1.12 [1.05, 1.20]

Total events: 1462 (Steroid), 1194 (Control)
Test for heterogeneity: Chi² = 26.46, df = 15 (P = 0.03), I² = 43.3%
Test for overall effect: Z = 3.27 (P = 0.001)
Axis: risk ratio from 0.1 to 10 (log scale); left of 1 = Steroid better, right of 1 = Steroid worse

Heterogeneity

What is heterogeneity?
• Heterogeneity is variation between the studies’ results

Causes of heterogeneity
Differences between studies with respect to:
• Patients: diagnosis, in- and exclusion criteria, etc.
• Interventions: type, dose, duration, etc.
• Outcomes: type, scale, cut-off points, duration of follow-up, etc.
• Quality and methodology: randomised or not, allocation concealment, blinding, etc.

How to deal with heterogeneity
1. Do not pool at all
2. Ignore heterogeneity: use fixed effect model
3. Allow for heterogeneity: use random effects model
4. Explore heterogeneity: meta-regression (tricky)

How to assess heterogeneity from a forest plot

Statistical measures of heterogeneity
• The Chi² test measures the amount of variation in a set of trials, and tells us if it is more than would be expected by chance

[Forest plot: trials from the Cochrane logo, corticosteroids for preterm birth (neonatal death). Estimates with 95% confidence intervals; x-axis: odds ratio, 0.05 to 4; left of 1 = corticosteroids better, right of 1 = corticosteroids worse]
Studies: Liggins 1972, Block 1977, Morrison 1978, Taeusch 1979, Papageorgiou 1979, Schutte 1979, Collaborative Group 1981
Pooled odds ratio: 0.61 (0.46, 0.81)
Heterogeneity test: Q = 11.2 (6 d.f.), p = 0.08

[Two forest plots side by side, estimates with 95% confidence intervals, x-axis: odds ratio, 0.05 to 4. First panel: the corticosteroids for preterm birth trials above, heterogeneity test Q = 11.2 (6 d.f.), p = 0.08. Second panel: heterogeneity test Q = 44.7 (27 d.f.), p = 0.02]

I squared quantifies heterogeneity

$$ I^2 = \frac{Q - df}{Q} \times 100\% $$

where Q = heterogeneity χ² statistic and df = its degrees of freedom
• I² can be interpreted as the proportion of total variability explained by heterogeneity, rather than chance
• Roughly, I² values of 25%, 50%, and 75% could be interpreted as indicating low, moderate, and high heterogeneity
• For more info see: Higgins JPT et al. Measuring inconsistency in meta-analyses. BMJ 2003;327:557-60.
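The I² formula above can be checked against the heterogeneity figures quoted on the slides. This is a minimal sketch, not how RevMan computes it internally: the function name i_squared is made up for illustration, and setting negative values to zero follows the convention described in Higgins et al. 2003.

```python
def i_squared(q: float, df: int) -> float:
    """Return I² as a percentage from the heterogeneity statistic Q and its df.

    Illustrative helper (hypothetical name). Negative values are set to
    zero, following the convention in Higgins et al. 2003.
    """
    if q <= 0:
        return 0.0
    return max(0.0, 100.0 * (q - df) / q)


# Q and df values quoted on the slides
examples = [
    ("Steroids for brain injury", 26.46, 15),
    ("Corticosteroids for preterm birth", 11.2, 6),
    ("Second panel (27 d.f.)", 44.7, 27),
]

for label, q, df in examples:
    print(f"{label}: Q = {q}, df = {df}, I² = {i_squared(q, df):.1f}%")
```

The first example reproduces the I² = 43.3% reported under the steroid forest plot (Chi² = 26.46, df = 15).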
Fixed and random effects

Fixed effect
Philosophy behind fixed effect model:
• there is one real value for the treatment effect
• all trials estimate this one value
Problems with ignoring heterogeneity:
• confidence intervals too narrow

Random effects
Philosophy behind random effects model:
• there are many possible real values for the treatment effect (depending on dose, duration, etc etc).
• each trial estimates its own real value

Example

Could we just add the data from all the trials together?
• One approach to combining trials would be to add all the treatment groups together, add all the control groups together, and compare the totals
• This is wrong for several reasons, and it can give the wrong answer

If we add up the columns we get 34.3% vs 32.5%, a RR of 1.06, a higher chance of death in the steroids group
From a meta-analysis, we get RR=0.96, a lower chance of death in the steroids group

Problems with simple addition of studies
• breaks the power of randomisation
• imbalances within trials introduce bias *
* The Pitts trial contributes 17% (201/1194) of all the data to the experimental column, but 8% (74/925) to the control column. Therefore it contributes more information to the average chance of death in the experimental column than it does to the control column. There is a high chance of death in this trial, so the chance of death for the experimental column is higher than for the control column.

Interpretation

Interpretation - “Evidence of absence” vs “Absence of evidence”
• If the confidence interval crosses the line of no effect, this does not mean that there is no difference between the treatments
• It means we have found no statistically significant difference in the effects of the two interventions

In the example below, as more data are included, the overall odds ratio remains the same but the confidence interval narrows.
It is not true that there is ‘no difference’ shown in the first rows of the plot – there just isn’t enough data to show a statistically significant result.

[Forest plot]
Review: Steff
Comparison: 01 Absence of evidence and Evidence of absence
Outcome: 01 Increasing the amount of data...

Study or sub-category   Treatment n/N   Control n/N   OR (fixed) [95% CI]
1 study                 10/100          15/100        0.63 [0.27, 1.48]
2 studies               20/200          30/200        0.63 [0.34, 1.15]
3 studies               30/300          45/300        0.63 [0.38, 1.03]
4 studies               40/400          60/400        0.63 [0.41, 0.96]
5 studies               50/500          75/500        0.63 [0.43, 0.92]

Axis: odds ratio from 0.1 to 10 (log scale); left of 1 = Favours treatment, right of 1 = Favours control

Interpretation - Weighing up benefit and harm
• When interpreting results, don’t just emphasise the positive results.
• A treatment might cure acne instantly, but kill one person in 10,000 (very important as acne is not life threatening).

Interpretation - Quality
• Rubbish studies = unbelievable results
• If all the trials in a meta-analysis were of very low quality, then you should be less certain of your conclusions.
• Instead of “Treatment X cures depression”, try “There is some evidence that Treatment X cures depression, but the data should be interpreted with caution.”

Summary
• Choose an appropriate effect measure
• Collect data from trials and do a meta-analysis if appropriate
• Interpret the results carefully
– Evidence of absence vs absence of evidence
– Benefit and harm
– Quality
– Heterogeneity

Sources of statistics help and advice
Cochrane Handbook for Systematic Reviews of Interventions
http://www.cochrane.org/resources/handbook/index.htm
The Cochrane distance learning material
http://www.cochrane-net.org/openlearning/
The Cochrane RevMan user guide.
http://www.cc-ims.net/RevMan/documentation.htm
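Worked check of the “Increasing the amount of data...” example: the rows of that plot can be reproduced with a short calculation. This is a minimal sketch under stated assumptions. It treats each row as a single 2×2 table of the cumulative counts and uses the usual normal approximation on the log odds ratio with z = 1.96; the slides do not say which method RevMan used, and the function name odds_ratio_ci is made up for illustration.

```python
import math


def odds_ratio_ci(events_t, n_t, events_c, n_c, z=1.96):
    """Odds ratio and approximate 95% CI from one 2x2 table.

    Illustrative helper (hypothetical name). Uses the standard error of
    the log odds ratio, sqrt(1/a + 1/b + 1/c + 1/d), so every cell of
    the table must be non-zero.
    """
    a, b = events_t, n_t - events_t   # treatment: events, non-events
    c, d = events_c, n_c - events_c   # control: events, non-events
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (
        math.exp(log_or),
        math.exp(log_or - z * se),
        math.exp(log_or + z * se),
    )


# k identical studies of 10/100 (treatment) vs 15/100 (control)
for k in range(1, 6):
    or_, lo, hi = odds_ratio_ci(10 * k, 100 * k, 15 * k, 100 * k)
    label = "1 study" if k == 1 else f"{k} studies"
    print(f"{label}: OR = {or_:.2f} [{lo:.2f}, {hi:.2f}]")
```

The point estimate stays at 0.63 in every row, while the interval narrows until it no longer includes 1 from four studies onwards, which is the “absence of evidence” point made on the slide.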