Critically Evaluating the Evidence: Tools for Appraisal

Elizabeth A. Crabtree, MPH, PhD (c)
Director of Evidence-Based Practice, Quality Management
Assistant Professor, Library & Informatics
Medical University of South Carolina

Steps of EBP
1) Ask the question
2) Find the best evidence
3) Evaluate the evidence
4) Apply the information
5) Evaluate outcomes

Step 3: Evaluate the Evidence
Systematic, Critical Appraisal

It’s peer-reviewed, therefore it must be OK?
Adapted from: Heneghan, Carl. Introduction, 16th Oxford Workshop on Evidence-Based Practice, September, 2010.

What is in “the stack”?
• Gold mine
• Bonfire

Hierarchy of Evidence

CONSORT
• Consolidated Standards of Reporting Trials
• Focus – Randomized Controlled Trials (RCTs)
  » 2-group, parallel
• Checklist of 25 items
  – Title/Abstract
  – Introduction
  – Methods
  – Results
  – Discussion
  – Other information
The CONSORT Group

STROBE
• Strengthening the Reporting of Observational Studies in Epidemiology
• Focus – Cross-sectional, Case-control, Cohort and Observational Studies
• Checklist of 22 items
  – Title/Abstract
  – Introduction
  – Methods
  – Results
  – Discussion
  – Other information
STROBE Statement

CASP
• Critical Appraisal Skills Programme
• Focus – Systematic Reviews, RCTs, Qualitative Studies, Diagnostic Test Studies, Cohort Studies, Case-control Studies & Economic Evaluation Studies
• 10 - 12 questions per appraisal tool
  – Validity
  – Results
  – Relevance
CASP

Body of Evidence
• All studies relevant to a given PICO question
  – Recommend grouping studies by PICO question
• Assess the quality of relevant studies as a group
How is this done???

GRADE Quality Assessment Criteria

What is the GRADE System?
Grading of Recommendations Assessment, Development and Evaluation
• Built on previous systems
• International group of guideline developers

Advantages of GRADE
• Transparent process of moving from evidence to recommendations
• Explicit, comprehensive criteria for downgrading and upgrading quality of evidence ratings
• Explicit evaluation of the importance of outcomes of alternative management strategies

GRADE vs. The Competition

Quality & Recommendations
• Quality of evidence: the extent to which one can be confident that an estimate of effect is adequate to support recommendations
• Strength of recommendation: the extent to which one can be confident that adherence to the recommendation will do more good than harm

Utilization

Getting Started…
• Must have a clearly defined question
• Patient(s), intervention, comparison, and outcome of interest (PICO)
In adult patients (population), is the use of glucocorticosteroids (intervention) associated with VTE (outcome)?

Chutes & Ladders
Evaluation of evidence can lower its quality or raise its quality.

Key Elements: Chutes
• Study design limitations
• Inconsistency
• Indirectness
• Imprecision
• Reporting bias

Study Design Limitations
• Basic study design (randomized trials or observational)
• Study limitations
  – Insufficient sample size
  – Lack of blinding
  – Lack of allocation concealment
  – Large losses to follow-up
  – Non-adherence to intent-to-treat analysis
  – Stopped for early benefit
  – Selective reporting of measured outcomes

Inconsistency of Results
• Detailed study methods and execution
  – Wide variation of treatment effect across studies
  – Populations varied (e.g. sicker, older)
  – Interventions varied (e.g. doses)
  – Outcomes varied (e.g.
diminishing effect over time)
• Increased heterogeneity = ↓ quality (I²: < 0.25 low; 0.25 – 0.5 moderate; > 0.5 high)

Indirectness of Evidence
• The extent to which the people, interventions, and outcome measures are similar to those of interest
  – Indirect comparisons
  – Different populations
  – Different interventions
  – Different outcomes measured
  – Comparisons not applicable to question/outcome

Imprecision
• Accuracy of data/results
• Results include just a few events or observations
  – Sample size lower than calculated for optimal information (needed for decision-making)
  – Confidence intervals are sufficiently wide that an estimate is consistent with either important harms or benefits

Bias

Key Elements: Ladders
• Effect
• Dose response
• Plausible confounders

Effect
Magnitude of treatment effect
• Strong effect
  – e.g., meta-analysis of observational studies found that bicycle helmets reduce the risk of head injuries, RR 0.31 (95% CI, 0.13 to 0.37)
• Very strong effect
  – e.g., meta-analysis looking at impact of warfarin prophylaxis in cardiac valve replacement
  – Relative risk for thromboembolism with warfarin was 0.17 (95% CI, 0.13 to 0.24)

Dose Response
Evidence of a dose-response gradient
• The more exposure to an intervention, the greater the harm
  – Higher warfarin dose → higher INR → increased bleeding

Plausible Confounders
• All plausible confounders would have reduced the demonstrated effect
• or would suggest a spurious effect when results show no effect

Evidence of Association
• Strong evidence of association
  – Significant relative risk of > 2 (< 0.5) based on consistent evidence from two or more observational studies, with no plausible confounders
• Very strong evidence of association
  – Significant relative risk of > 5 (< 0.2) based on direct evidence with no major threats to validity

Quality of Supporting Evidence
High
• Further research is very unlikely to change confidence in the estimate of effect
• Consistent evidence from well-performed RCTs or
exceptionally strong evidence from unbiased observational studies

Moderate
• Further research is likely to have an important impact on confidence in the estimate of effect and may change the estimate
• Evidence from RCTs with important limitations or unusually strong evidence from unbiased observational studies

Low
• Further research is very likely to have an important impact on confidence in the estimate of effect and is likely to change the estimate
• Evidence for at least 1 critical outcome from observational studies or from RCTs with serious flaws or indirect evidence

Very Low
• Any estimate of effect is very uncertain
• Evidence for at least 1 of the critical outcomes from unsystematic clinical observations or very indirect evidence

Outcomes: Critical or Important
Guyatt, G. H., Oxman, A. D., Kunz, R., Vist, G. E., Falck-Ytter, Y. & Schünemann, H. J. (2008). What is “quality of evidence” and why is it important to clinicians? BMJ 336, 995-998.

Strength of Recommendations
Strong vs. Weak

Strong Recommendation
• Desirable effects clearly outweigh undesirable effects or vice versa
• Certain that benefits do, or do not, outweigh risks & burdens

Weak Recommendation
• Desirable effects closely balanced with undesirable effects
• Benefits, risks & burdens are finely balanced OR appreciable uncertainty exists about the magnitude of benefits & risks

Moving from Strong to Weak
To treat or not to treat…
• Absence of high quality evidence
• Imprecise estimates
• Uncertainty or variation in individuals’ value of the outcomes
• Small net benefits
• Uncertain if net benefits are worth the costs

Strong Recommendations
• Strong recommendation, high quality evidence
  – Recommendation can apply to most patients.
  – Further research is unlikely to change our confidence in the estimate of effect.
• Strong recommendation, moderate quality evidence
  – Recommendation can apply to most patients.
  – Further research (if performed) is likely to have an important effect on our confidence in the estimate of effect and may change the estimate.
• Strong recommendation, low quality evidence
  – Recommendation may change when higher quality evidence becomes available.
  – Further research (if performed) is likely to have an important influence on our confidence in the estimate of effect and is likely to change the estimate.
• Strong recommendation, very low quality evidence (very rarely applicable)
  – Recommendation may change when higher quality evidence becomes available; any estimate of effect, for at least 1 critical outcome, is uncertain.

Weak Recommendations
• Weak recommendation, high quality evidence
  – The best action may differ, depending on circumstances or patients or societal values.
  – Further research is unlikely to change our confidence in the estimate of effect.
• Weak recommendation, moderate quality evidence
  – Alternative approaches likely to be better for some patients under some circumstances.
  – Further research (if performed) is likely to have an important influence on our confidence in the estimate of effect and may change the estimate.
• Weak recommendation, low quality evidence
  – Other alternatives may be equally reasonable.
  – Further research is likely to have an important influence on our confidence in the estimate of effect and is likely to change the estimate.
• Weak recommendation, very low quality evidence
  – Other alternatives may be equally reasonable.
  – Any estimate of effect, for at least 1 critical outcome, is uncertain.

Guideline Evaluation: AGREE II
• Appraisal of Guidelines for Research and Evaluation
• Focus – evaluation of practice guidelines
• Checklist of 23 questions
• Six domains
  – Scope and Purpose
  – Stakeholder Involvement
  – Rigor of Development
  – Clarity and Presentation
  – Applicability
  – Editorial Independence
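
Worked sketch: GRADE "chutes and ladders"
The downgrading and upgrading logic described earlier in this deck can be illustrated with a short Python sketch. It is a simplified illustration and not an official GRADE tool. The function and variable names (`grade_quality`, `effect_upgrade`, `heterogeneity_label`) are hypothetical. The sketch assumes the usual GRADE convention that randomized trials start at "high" and observational studies start at "low", that each of the five "chutes" lowers the rating by 0, 1, or 2 levels, and that the "ladders" raise it using the relative-risk thresholds from the Evidence of Association slide. In practice these are judgment calls made by a review panel, not mechanical rules.

```python
# Simplified, illustrative GRADE rating sketch (hypothetical helper names).
LEVELS = ["very low", "low", "moderate", "high"]

# The five "chutes" named in the deck.
CHUTES = {
    "study_limitations",
    "inconsistency",
    "indirectness",
    "imprecision",
    "reporting_bias",
}


def heterogeneity_label(i2: float) -> str:
    """Label I-squared (as a fraction, 0 to 1) using the deck's cut-offs."""
    if not 0.0 <= i2 <= 1.0:
        raise ValueError("I-squared must be between 0 and 1")
    if i2 < 0.25:
        return "low"
    if i2 <= 0.5:
        return "moderate"
    return "high"


def effect_upgrade(rr: float) -> int:
    """Levels to move up for effect size: RR > 5 or < 0.2 gives 2; RR > 2 or < 0.5 gives 1."""
    if rr <= 0:
        raise ValueError("relative risk must be positive")
    if rr > 5 or rr < 0.2:
        return 2
    if rr > 2 or rr < 0.5:
        return 1
    return 0


def grade_quality(design, downgrades=None, rr=None,
                  dose_response=False, confounders_favor_effect=False):
    """Return a quality label after applying chutes (down) and ladders (up)."""
    if design not in ("rct", "observational"):
        raise ValueError("design must be 'rct' or 'observational'")
    downgrades = downgrades or {}
    unknown = set(downgrades) - CHUTES
    if unknown:
        raise ValueError(f"unknown downgrade factors: {sorted(unknown)}")

    # Assumption: RCTs start at high (index 3), observational at low (index 1).
    level = 3 if design == "rct" else 1
    level -= sum(downgrades.values())            # chutes
    if rr is not None:
        level += effect_upgrade(rr)              # ladder: magnitude of effect
    level += int(dose_response)                  # ladder: dose-response gradient
    level += int(confounders_favor_effect)       # ladder: plausible confounders
    return LEVELS[max(0, min(len(LEVELS) - 1, level))]


if __name__ == "__main__":
    # Observational evidence with a very strong effect (RR 0.17, as in the warfarin slide).
    print(grade_quality("observational", rr=0.17))
    # RCT evidence downgraded one level each for inconsistency and imprecision.
    print(grade_quality("rct", {"inconsistency": 1, "imprecision": 1}))
```

The sketch clamps the result to the four levels used in the Quality of Supporting Evidence slide, so evidence can never be rated above "high" or below "very low" no matter how many factors apply.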