The GRADE approach: an introductory workshop
Holger Schünemann, MD, PhD
Professor and Chair, Dept. of Clinical Epidemiology & Biostatistics
Professor of Medicine
Michael Gent Chair in Healthcare Research
McMaster University, Hamilton, Canada
NTP, Raleigh, June 22, 2011

The Department of Clinical Epidemiology & Biostatistics at McMaster
History
- 1967: founded by David Sackett
- 6 chairs since
- Instrumental in the specialty of Clinical Epidemiology, origin of “Evidence-Based Medicine”
People
- 45 full-time and joint faculty
- ~120 associate and part-time faculty; 19 emeritus
- ~180 staff
- ~200 PhD and Master's students

Content
• Guidelines and GRADE
  – Background about GRADE
• Quality of evidence
• Going from evidence to recommendations

What is a guideline?
• "Guidelines are recommendations intended to assist providers and recipients of health care and other stakeholders to make informed decisions. Recommendations may relate to clinical interventions, public health activities, or government policies." WHO 2003, 2007

Evidence based healthcare decisions
Population values and preferences; (Clinical) state and circumstances; Expertise; Research evidence
Haynes et al. 2002

Confidence in evidence
• There always is evidence
  – “When there is a question there is evidence”
• Better research: greater confidence in the evidence and decisions

Hierarchy of evidence based on quality
STUDY DESIGN
• Randomized Controlled Trials
• Cohort Studies and Case Control Studies
• Case Reports and Case Series, Non-systematic observations
• Expert Opinion
BIAS

“Everything should be made as simple as possible but not simpler.”

Explain the following?
• Confounding, effect modification & ext. validity
• Concealment of randomization
• Blinding (who is blinded in a double blinded study?)
• Intention to treat analysis and its correct application
• P-values and confidence intervals

BMJ 2003
Relative risk reduction: … > 99.9% (1/100,000)
U.S.
Parachute Association reported 821 injuries and 18 deaths out of 2.2 million jumps in 2007
BMJ 2003

Simple hierarchies are (too) simplistic
STUDY DESIGN
• Randomized Controlled Trials
• Cohort Studies and Case Control Studies
• Case Reports and Case Series, Non-systematic observations
• Expert Opinion
BIAS
Schünemann & Bone, 2003

Which hierarchy?
Recommendation for use of oral anticoagulation in patients with atrial fibrillation and rheumatic mitral valve disease
• AHA: evidence B, recommendation Class I
• ACCP: evidence A, recommendation 1
• SIGN: evidence IV, recommendation C

Oxford Centre for Evidence Based Medicine Levels of Evidence and Grades of Recommendations, 23 November 1999
Columns: Grade of Recommendation; Level of Evidence; Therapy/Prevention, Aetiology/Harm; Prognosis; Diagnosis; Economic analysis
1a SR (with homogeneity) of RCTs SR (with homogeneity*) of Level 1 diagnostic studies; or a CPG validated on a test set. SR (with homogeneity*) of Level 1 economic studies
1b Individual RCT (with narrow Confidence Interval) SR (with homogeneity*) of inception cohort studies; or a CPG validated on a test set. Individual inception cohort study with > 80% follow-up Independent blind comparison of an appropriate spectrum of consecutive patients, all of whom have undergone both the diagnostic test and the reference standard.
1c All or none All or none case-series Absolute SpPins and SnNouts Analysis comparing all (critically-validated) alternative outcomes against appropriate cost measurement, and including a sensitivity analysis incorporating clinically sensible variations in important variables. Clearly as good or better, but cheaper. Clearly as bad or worse but more expensive. Clearly better or worse at the same cost.
2a SR (with homogeneity*) of cohort studies SR (with homogeneity*) of Level >2 diagnostic studies SR (with homogeneity*) of Level >2 economic studies
2b Individual cohort study (including low quality RCT; e.g., <80% follow-up) SR (with homogeneity*) of either retrospective cohort studies or untreated control groups in RCTs. Retrospective cohort study or follow-up of untreated control patients in an RCT; or CPG not validated in a test set. Any of: · Independent blind or objective comparison; · Study performed in a set of non-consecutive patients, or confined to a narrow spectrum of study individuals (or both) all of whom have undergone both the diagnostic test and the reference standard; · A diagnostic CPG not validated in a test set. Analysis comparing a limited number of alternative outcomes against appropriate cost measurement, and including a sensitivity analysis incorporating clinically sensible variations in important variables.
2c “Outcomes” Research
3a SR (with homogeneity*) of case-control studies Individual Case-Control Study Independent blind comparison of an appropriate spectrum, but the reference standard was not applied to all study patients Analysis without accurate cost measurement, but including a sensitivity analysis incorporating clinically sensible variations in important variables.
A B 3b
4 Case-series (and poor quality cohort and case-control studies) Case-series (and poor quality prognostic cohort studies) Any of: · Reference standard was unobjective, unblinded or not independent; · Positive and negative tests were verified using separate reference standards; · Study was performed in an inappropriate spectrum** of patients.
Analysis with no sensitivity analysis
5 Expert opinion without explicit critical appraisal, or based on physiology, bench research or “first principles” (therapy, prognosis and diagnosis columns); expert opinion without explicit critical appraisal, or based on economic theory (economic analysis column)
C D “Outcomes” Research
Oxford Centre for Evidence-Based Medicine (Chris Ball, Dave Sackett, Bob Phillips, Brian Haynes, and Sharon Straus).

USPSTF - Grade Definitions After May 2007: Certainty
Level of Certainty: High
The available evidence usually includes consistent results from well-designed, well-conducted studies in representative primary care populations. These studies assess the effects of the preventive service on health outcomes. This conclusion is therefore unlikely to be strongly affected by the results of future studies.
Level of Certainty: Moderate
The available evidence is sufficient to determine the effects of the preventive service on health outcomes, but confidence in the estimate is constrained by such factors as:
• The number, size, or quality of individual studies.
• Inconsistency of findings across individual studies.
• Limited generalizability of findings to routine primary care practice.
• Lack of coherence in the chain of evidence.
As more information becomes available, the magnitude or direction of the observed effect could change, and this change may be large enough to alter the conclusion.
Level of Certainty: Low
The available evidence is insufficient to assess effects on health outcomes. Evidence is insufficient because of:
• The limited number or size of studies.
• Important flaws in study design or methods.
• Inconsistency of findings across individual studies.
• Gaps in the chain of evidence.
• Findings not generalizable to routine primary care practice.
• Lack of information on important health outcomes.
More information may allow estimation of effects on health outcomes.
The USPSTF defines certainty as "likelihood that the USPSTF assessment of the net benefit of a preventive service is correct."

• Recommendations for prognosis
  – Use prognostic information to determine baseline risk for healthcare decisions

Centers for Disease Control and Prevention (CDC)
Columns: Evidence of Effectiveness; Execution (Good or Fair); Design Suitability (Greatest, Moderate, or Least); Number of Studies; Consistent; Effect Size; Expert Opinion
Strong
• Good; Greatest; At Least 2; Yes; Sufficient; Not Used
• Good; Greatest or Moderate; At Least 5; Yes; Sufficient; Not Used
• Good or Fair; Greatest; At Least 5; Yes; Sufficient; Not Used
• Meet Design, Execution, Number, and Consistency Criteria for Sufficient But Not Strong Evidence; Large; Not Used
Sufficient
• Good; Greatest; 1; Not Applicable; Sufficient; Not Used
• Good or Fair; Greatest or Moderate; At Least 3; Yes; Sufficient; Not Used
• Good or Fair; Greatest, Moderate, or Least; At Least 5; Yes; Sufficient; Not Used
Expert Opinion
• Varies; Varies; Varies; Varies; Sufficient; Supports a Recommendation
Insufficient
• A. Insufficient Designs or Execution; B. Too Few Studies; C. Inconsistent; D. Small; E. Not Used

Healthcare problem
“Healthy people”; “Herd immunity”; “Long term perspective”; “Disease perception”; “Lots of other things”
recommendation

GRADE Working Group
Grades of Recommendation Assessment, Development and Evaluation
• Aim: to develop a common, transparent and sensible system for grading the quality of evidence and the strength of recommendations
• International group of guideline developers, methodologists & clinicians from around the world (>250 contributors), since 2000
• International group: ACCP, AHRQ, Australian NMRC, BMJ Clinical Evidence, Cochrane Collaboration, CDC, McMaster, NICE, Oxford CEBM, SIGN, UpToDate, USPSTF, WHO
CMAJ 2003, BMJ 2004, BMC 2004, BMC 2005, AJRCCM 2006, Chest 2006, BMJ 2008

GRADE Uptake
• World Health Organization
• CDC-ACIP
• Allergic Rhinitis in Asthma Guidelines (ARIA)
• American Thoracic Society
• American College of Physicians
• European Respiratory Society
• European Society of Thoracic Surgeons
• British Medical Journal
• Infectious Disease Society of America
• American College of Chest Physicians
• UpToDate®
• National Institute for Health and Clinical Excellence (NICE)
• Scottish Intercollegiate Guidelines Network (SIGN)
• Cochrane Collaboration
• Clinical Evidence
• Agency for Healthcare Research and Quality (AHRQ)
• Partner of GIN
• Over 40 major organizations

Guideline development
Process
• Prioritise problems & scoping
• Establish guideline panel and develop questions, including outcomes
• Find and critically appraise systematic review(s) and/or prepare protocol(s) for systematic review(s) and prepare systematic review(s) (searches, selection of studies, data collection and analysis)
• Prepare an evidence profile
• Assess the quality of evidence for each outcome
• Prepare a Summary of Findings table
• If developing guidelines: assess the overall quality of evidence and decide on the direction (which alternative) and strength of the recommendation
• Draft guideline
• Consult with stakeholders and/or external peer reviewers
• Disseminate guidelines
• Update review or guidelines when needed
• Adapt guidelines, if needed
• Prioritise guidelines/recommendations for implementation
• Implement or support implementation of the guidelines
• Evaluate the impact of the guidelines and implementation strategies
• Update systematic review/guidelines

Case scenario
A 13-year-old girl who lives in rural Indonesia presented with flu symptoms and developed severe respiratory distress over the course of the last 2 days. She required intubation. The history reveals that she shares her living quarters with her parents and her three siblings. At night the family’s chicken stock shares this room too, and several chickens had died unexpectedly a few days before the girl fell sick.
Potential interventions: antivirals, such as the neuraminidase inhibitors oseltamivir and zanamivir

Types of questions
Background Questions
• Definition: What is Avian Influenza?
• Mechanism: What is the mechanism of action of oseltamivir?
Foreground Questions
• Benefit > harm: In patients with avian influenza, does oseltamivir therapy improve survival, …?
Framing a foreground question
Population: Avian Flu/influenza A (H5N1) patients
Intervention: Oseltamivir
Comparison: No pharmacological intervention
Outcomes: Mortality, hospitalizations, resource use, adverse outcomes, antimicrobial resistance
Schünemann, et al., The Lancet ID, 2007

Choosing outcomes
• Desirable outcomes
  – lower mortality
  – reduced hospital stay
  – reduced duration of disease
  – reduced resource expenditure
• Undesirable outcomes
  – adverse reactions
  – the development of resistance
  – costs of treatment
• Every decision comes with desirable and undesirable consequences
Developing recommendations must include a consideration of desirable and undesirable outcomes

Relative importance of outcomes
• Decision makers (and guideline authors) need to consider the relative importance of outcomes when balancing these outcomes to make a recommendation
• Relative importance varies across populations
• Relative importance may vary across patient groups within the same population
• When considered critical: evaluate

GRADE: recommendation – quality of evidence
Clear separation:
1) Recommendation: 2 grades – weak/conditional/optional or strong (for or against an intervention)?
  – Balance of benefits and downsides, values and preferences, resource use and quality of evidence
2) 4 categories of quality of evidence: (High), (Moderate), (Low), (Very low)?
  – methodological quality of evidence
  – likelihood of bias
  – by outcome and across outcomes
*www.GradeWorking-Group.org

GRADE Quality of Evidence
In the context of a systematic review
• The quality of evidence reflects the extent to which we are confident that an estimate of effect is correct.
In the context of making recommendations
• The quality of evidence reflects the extent to which our confidence in estimates of the effects is adequate to support a particular recommendation.
Likelihood of and confidence in an outcome

Definition of grades of evidence
Research
• A/High: Further research is very unlikely to change confidence in the estimate of effect.
• B/Moderate: Further research is likely to have an important impact on confidence in the estimate of effect and may change the estimate.
• C/Low: Further research is very likely to have an important impact on confidence in the estimate of effect and is likely to change the estimate.
• D/Very low: Any estimate of effect is very uncertain.
Confidence in evidence
• A/High: We are very confident that the true effect lies close to that of the estimate of the effect.
• B/Moderate: We are moderately confident in the effect estimate: the true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
• C/Low: Our confidence in the effect estimate is limited: the true effect may be substantially different from the estimate of the effect.
• D/Very low: We have very little confidence in the effect estimate: the true effect is likely to be substantially different from the estimate of effect.

Determinants of quality
• RCTs start high
• observational studies start low
• 5 factors that can lower quality
  1. limitations in detailed design and execution (risk of bias criteria)
  2. Inconsistency (or heterogeneity)
  3. Indirectness (PICO and applicability)
  4. Imprecision (number of events and confidence intervals)
  5. Publication bias
• 3 factors can increase quality
  1. large magnitude of effect
  2. all plausible residual confounding may be working to reduce the demonstrated effect or increase the effect if no effect was observed
  3. dose-response gradient

1. Design and Execution/Risk of Bias
Examples:
• Inappropriate selection of exposed and unexposed groups
• Failure to adequately measure/control for confounding
• Selective outcome reporting
• Failure to blind (e.g.
outcome assessors)
• High loss to follow-up
• Lack of concealment in RCTs
• Intention to treat principle violated

Design and Execution/RoB
From Cates, CDSR 2008

Design and Execution/RoB
Overall judgment required

2. Inconsistency of results (Heterogeneity)
• if there is inconsistency, look for an explanation
  – patients, intervention, comparator, outcome
• if the inconsistency is unexplained, lower quality

Reminders for immunization uptake

Indoor air pollution: ALRI

Non-steroidal drug use and risk of pancreatic cancer
Capurso G, Schünemann HJ, Terrenato I, Moretti A, Koch M, Muti P, Capurso L, Delle Fave G. Meta-analysis: the use of non-steroidal anti-inflammatory drugs and pancreatic cancer risk for different exposure categories. Aliment Pharmacol Ther. 2007 Oct 15;26(8):1089-99.

3. Directness of Evidence
• differences in
  – populations/patients (children – neonates, women in general – pregnant women)
  – interventions (all vaccines, new – old)
  – comparator appropriate (new policy – old or no policy)
  – outcomes (important – surrogate: cases prevented – seroconversion)
• indirect comparisons
  – interested in A versus B
  – have A versus C and B versus C
  – Vaccine A versus Placebo versus Vaccine B

• Possibly. The “high” dose effects of bisphenol A in laboratory animals that provide clear evidence for adverse effects on development, i.e., reduced survival, birth weight, and growth of offspring early in life, and delayed puberty in female rats and male rats and mice, are observed at levels of exposure that far exceed those encountered by humans. However, estimated exposures in pregnant women and fetuses, infants, and children are similar to levels of bisphenol A associated with several “low” dose laboratory animal findings of effects on the brain and behavior, prostate and mammary gland development, and early onset of puberty in females. When considered together, these laboratory animal findings provide limited evidence that bisphenol A has adverse effects on development.
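
The determinants of quality listed earlier (a starting level set by study design, five factors that can lower it, three that can raise it) can be sketched as a small calculation. This is a minimal illustration, not a GRADE tool: the function name `rate_quality`, the factor keys, and the idea of recording each concern as a number of levels are assumptions made for this example, and in practice every step is a judgment rather than arithmetic.

```python
# Illustrative sketch only: names and the numeric encoding are assumptions.
LEVELS = ("very low", "low", "moderate", "high")

# The five factors that can lower quality and the three that can raise it.
LOWERING = {"risk_of_bias", "inconsistency", "indirectness",
            "imprecision", "publication_bias"}
RAISING = {"large_effect", "dose_response", "plausible_confounding"}


def rate_quality(design, lowered=None, raised=None):
    """Return a quality level for one outcome.

    design  -- "randomised" (starts high) or "observational" (starts low)
    lowered -- dict mapping a lowering factor to the levels taken off
    raised  -- dict mapping a raising factor to the levels added
    """
    lowered = lowered or {}
    raised = raised or {}
    if design not in ("randomised", "observational"):
        raise ValueError(f"unknown design: {design!r}")
    unknown = (set(lowered) - LOWERING) | (set(raised) - RAISING)
    if unknown:
        raise ValueError(f"unknown factors: {sorted(unknown)}")

    start = 3 if design == "randomised" else 1
    score = start - sum(lowered.values()) + sum(raised.values())
    # Clamp to the four categories: the scale stops at very low and at high.
    return LEVELS[max(0, min(3, score))]


print(rate_quality("randomised"))
print(rate_quality("randomised", lowered={"indirectness": 2, "imprecision": 1}))
print(rate_quality("observational", raised={"large_effect": 1}))
```

The three calls print `high`, `very low` and `moderate`. The sketch rates one outcome at a time, which matches the "by outcome" assessment described above.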
Hierarchy of outcomes according to their importance to assess the effect of phosphate lowering drugs in patients with renal failure and hyperphosphatemia
Importance of outcomes (scale 9 to 1)
• Critical for decision making (9 to 7): mortality; myocardial infarction; fractures
• Important, but not critical for decision making (6 to 4): pain due to soft tissue calcification / function
• Low importance for decision making (3 to 1): flatulence
Surrogates (relation to important outcomes increasingly uncertain): coronary calcification, bone density, soft tissue calcification, Ca2+/P product

4. Publication Bias
• Should always be suspected
  – Only small “positive” studies (hypothesis confirming)
  – For profit interest
  – Various methods to evaluate – none perfect, but clearly a problem

I.V. Mg in acute myocardial infarction
• Meta-analysis: Yusuf S. Circulation 1993
• ISIS-4: Lancet 1995
Publication bias: Egger M, Smith DS. BMJ 1995;310:752-54

Funnel plot (y-axis: standard error; x-axis: odds ratio)
Symmetrical: No publication bias

Funnel plot (y-axis: standard error; x-axis: odds ratio)
Asymmetrical: Publication bias?
File drawer problem: no interest in publishing or being published

Indoor air pollution: ALRI

5. Imprecision
• Small sample size
  – small number of events
• Wide confidence intervals
  – uncertainty about magnitude of effect
• Extent to which confidence in estimate of effect is adequate to support decision

Example: Immunization in children

What can raise quality?
1. large magnitude can upgrade (RRR 50%/RR 2)
  – very large: two levels (RRR 80%/RR 5)
  – criteria
    • everyone used to do badly
    • almost everyone does well
  – parachutes to prevent death when jumping from airplanes

Reminders for immunization uptake

What can raise quality?
2.
dose response relation
  – (higher INR – increased bleeding)
  – childhood lymphoblastic leukemia
    • risk for CNS malignancies 15 years after cranial irradiation
    • no radiation: 1% (95% CI 0% to 2.1%)
    • 12 Gy: 1.6% (95% CI 0% to 3.4%)
    • 18 Gy: 3.3% (95% CI 0.9% to 5.6%)
3. all plausible confounding may be working to reduce the demonstrated effect or increase the effect if no effect was observed

All plausible residual confounding would result in an overestimate of effect
• Hypoglycaemic drug phenformin causes lactic acidosis
• The related agent metformin is under suspicion for the same toxicity.
• Large observational studies have failed to demonstrate an association
  – Clinicians would be more alert to lactic acidosis in the presence of the agent
• Vaccine – adverse effects

Quality assessment criteria
Study design and initial quality of a body of evidence
• Randomised trials: High
• Observational studies: Low
Lower if
• Risk of Bias
• Inconsistency
• Indirectness
• Imprecision
• Publication bias
Higher if
• Large effect
• Dose response
• All plausible residual confounding & bias
  – Would reduce a demonstrated effect
  – Would suggest a spurious effect if no effect was observed
Quality of a body of evidence
• A/High (four plus: ⊕⊕⊕⊕)
• B/Moderate (three plus: ⊕⊕⊕)
• C/Low (two plus: ⊕⊕)
• D/Very low (one plus: ⊕)

Evidence Profiles/Summaries

Content
• Background
• Quality of evidence
• Moving from evidence to recommendations

Strength of recommendation
“The strength of a recommendation reflects the extent to which we can, across the range of patients for whom the recommendations are intended, be confident that desirable effects of a management strategy outweigh undesirable effects.”
• Strong or weak/conditional

Determinants of the strength of recommendation
Factors that can strengthen a recommendation, with comment:
Quality of the evidence: The higher the quality of evidence, the more likely is a strong recommendation.
Balance between desirable and undesirable effects: The larger the difference between the desirable and undesirable consequences, the more likely a strong recommendation is warranted. The smaller the net benefit and the lower the certainty for that benefit, the more likely a weak recommendation is warranted.
Values and preferences: The greater the variability in values and preferences, or uncertainty in values and preferences, the more likely a weak recommendation is warranted.
Costs (resource allocation): The higher the costs of an intervention (that is, the more resources consumed), the less likely a strong recommendation is warranted.

Developing recommendations

Case scenario
A 13-year-old girl who lives in rural Indonesia presented with flu symptoms and developed severe respiratory distress over the course of the last 2 days. She required intubation. The history reveals that she shares her living quarters with her parents and her three siblings. At night the family’s chicken stock shares this room too, and several chickens had died unexpectedly a few days before the girl fell sick.

Methods – WHO Rapid Advice Guidelines for management of Avian Flu
• Applied findings of a recent systematic evaluation of guideline development for WHO/ACHR
• Group composition (including panel of 13 voting members):
  – clinicians who treated influenza A(H5N1) patients
  – infectious disease experts
  – basic scientists
  – public health officers
  – methodologists
• Independent scientific reviewers:
  – Identified systematic reviews, recent RCTs, case series, animal studies related to H5N1 infection

Oseltamivir for Avian Flu
Summary of findings:
• No clinical trial of oseltamivir for treatment of H5N1 patients.
• 4 systematic reviews and health technology assessments (HTA) reporting on 5 studies of oseltamivir in seasonal influenza.
  – Hospitalization: OR 0.22 (0.02 – 2.16)
  – Pneumonia: OR 0.15 (0.03 – 0.69)
• 3 published case series.
• Many in vitro and animal studies.
• No alternative that is more promising at present.
• Cost: $40 per treatment course

From evidence to recommendation
Factors that can strengthen a recommendation, with comment:
• Quality of the evidence: Very low quality evidence
• Balance between desirable and undesirable effects: Uncertain, but small reduction in relative risk still leads to large absolute effect
• Values and preferences: Little variability and clear
• Costs (resource allocation): Low cost under non-pandemic conditions

Example: Oseltamivir for Avian Flu
Recommendation: In patients with confirmed or strongly suspected infection with avian influenza A (H5N1) virus, clinicians should administer oseltamivir treatment as soon as possible (????? recommendation, very low quality evidence).
Schünemann et al. The Lancet ID, 2007

Example: Oseltamivir for Avian Flu
Recommendation: In patients with confirmed or strongly suspected infection with avian influenza A (H5N1) virus, clinicians should administer oseltamivir treatment as soon as possible (strong recommendation, very low quality evidence).
Values and Preferences
Remarks: This recommendation places a high value on the prevention of death in an illness with a high case fatality. It places relatively low values on adverse reactions, the development of resistance and costs of treatment.
Schünemann et al.
The Lancet ID, 2007

Implications of a strong recommendation
• Patients: Most people in this situation would want the recommended course of action and only a small proportion would not
• Clinicians: Most patients should receive the recommended course of action
• Policy makers: The recommendation can be adopted as a policy in most situations

Implications of a conditional/weak recommendation
• Patients: The majority of people in this situation would want the recommended course of action, but many would not
• Clinicians: Be more prepared to help patients to make a decision that is consistent with their own values/decision aids and shared decision making
• Policy makers: There is a need for substantial debate and involvement of stakeholders

Systematic review
• P I C O
• Outcome: Critical, Critical, Important, Not
• Summary of findings & estimate of effect for each outcome
• Quality for each outcome: High, Moderate, Low, Very low
• Randomization increases initial quality
• Grade down: 1. Risk of bias 2. Inconsistency 3. Indirectness 4. Imprecision 5. Publication bias
• Grade up: 1. Large effect 2. Dose response 3. Confounders
Guideline development
• Grade overall quality of evidence across outcomes based on lowest quality of critical outcomes
• Formulate recommendations:
  – For or against (direction)
  – Strong or weak (strength)
• By considering: Quality of evidence; Balance benefits/harms; Values and preferences
• Revise if necessary by considering: Resource use (cost)
• “We recommend using…”
• “We suggest using…”
• “We recommend against using…”
• “We suggest against using…”

Issues in guideline development in Public Health
• Causation versus effects of intervention
  – Causation not equivalent to efficacy of interventions
  – Bradford Hill
    • Nearly half a century old – tablet from the mountain?
• Harms caused by medications
  – Assumption is that removal of exposure leads to NO adverse effects
• How confident can one be that removal of the exposure is effective in preventing disease?
  – Whether drugs or environmental factors, it will depend on the intervention to remove exposure
Schünemann et al. JECH 2010

Conclusions
• Clinical practice guidelines should be based on the best available evidence to be evidence based
• GRADE combines what is known in health research methodology and provides a structured approach to improve communication
• Criteria for evidence assessment across questions and outcomes
• Criteria for moving from evidence to recommendations
• Transparent, systematic
  – four categories of quality of evidence
  – two grades for strength of recommendations
• Transparency in decision making and judgments is key

Formulating Questions and Choosing Outcomes

Outline
• Type of questions
• Framing a foreground question
• Choosing outcomes
• Relative importance of outcomes

Guidelines and questions
Guidelines are a way of answering questions about clinical, communication, organisational or policy interventions, in the hope of improving health care or health policy. It is therefore helpful to structure a guideline in terms of answerable questions.
WHO Guideline Handbook, 2008

Types of questions
Background Questions
• Definition: What is COPD?
• Mechanism: What is the mechanism of action of mucolytic therapy?
Foreground Questions
• Efficacy: In patients with COPD, does mucolytic therapy improve survival?

Framing a foreground question
P I C O
Population:
Intervention:
Comparison:
Outcomes:

Case scenario
A 13-year-old girl who lives in rural Indonesia presented with flu symptoms and developed severe respiratory distress over the course of the last 2 days. She required intubation. The history reveals that she shares her living quarters with her parents and her three siblings. At night the family’s chicken stock shares this room too, and several chickens had died unexpectedly a few days before the girl fell sick.
Potential interventions: antivirals, such as the neuraminidase inhibitors oseltamivir and zanamivir
What are examples of:
• Background questions
• Foreground questions
  • Population:
  • Intervention:
  • Comparison:
  • Outcomes:

Framing a foreground question
Population: Avian Flu/influenza A (H5N1) patients
Intervention: Oseltamivir (or Zanamivir)
Comparison: No pharmacological intervention
Outcomes: Mortality, hospitalizations, resource use, adverse outcomes, antimicrobial resistance
Schünemann, Hill et al., The Lancet ID, 2007

Choosing outcomes
• Every decision comes with desirable and undesirable consequences
• Developing recommendations must include a consideration of desirable and undesirable outcomes
• Outcomes should be patient-important outcomes.

Choosing outcomes
• desirable outcomes
  – lower mortality
  – reduced hospital stay
  – reduced duration of disease
  – reduced resource expenditure
• undesirable outcomes
  – adverse reactions
  – the development of resistance
  – costs of treatment

Choosing outcomes
• What if what is important is not measured?
• What if what is measured is not important?
• How do we make sure we’ve covered all important outcomes?
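
The PICO framing above lends itself to a small record type that forces every part of a foreground question to be filled in before evidence is searched. The sketch below is only an illustration: the class name `ForegroundQuestion`, its fields, and the sentence template are assumptions made here, while the example values are the ones on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForegroundQuestion:
    """Hypothetical container for one PICO question."""
    population: str
    intervention: str
    comparison: str
    outcomes: tuple

    def __post_init__(self):
        # A foreground question is only answerable if all four parts exist.
        if not (self.population and self.intervention
                and self.comparison and self.outcomes):
            raise ValueError("population, intervention, comparison and "
                             "outcomes are all required")

    def as_sentence(self):
        """Render the question in a single sentence (template is assumed)."""
        return (f"In {self.population}, does {self.intervention}, "
                f"compared with {self.comparison}, "
                f"change {', '.join(self.outcomes)}?")


question = ForegroundQuestion(
    population="patients with avian influenza A (H5N1)",
    intervention="oseltamivir",
    comparison="no pharmacological intervention",
    outcomes=("mortality", "hospitalizations", "resource use",
              "adverse outcomes", "antimicrobial resistance"),
)
print(question.as_sentence())
```

Keeping the outcomes as an explicit list makes it easier to check them against the three questions on the last slide, for example whether an outcome that matters to patients is missing.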
Relative importance of outcomes
• Decision makers (and guideline authors) need to consider the relative importance of outcomes when balancing these outcomes to make a recommendation
• Relative importance varies across populations
• Relative importance may vary across patient groups within the same population
• When considered critical: evaluate

Relative importance of outcomes
• 9, 8, 7: Critical for decision making
• 6, 5, 4: Important, but not critical for decision making
• 3, 2, 1: Of low importance

Using GRADEpro
Creating a new GRADEpro file
Profile groups
Profiles
Managing outcomes

Content
• Quality of evidence
• Going from evidence to recommendations

Healthcare problem
recommendation

Strength of recommendation
“The strength of a recommendation reflects the extent to which we can, across the range of patients for whom the recommendations are intended, be confident that desirable effects of a management strategy outweigh undesirable effects.”
• Strong or conditional

Strength of recommendation
The degree of confidence that the desirable effects of adherence to a recommendation outweigh the undesirable effects.
Desirable effects
• health benefits
• less burden
• savings
Undesirable effects
• harms
• more burden
• costs

Determinants of the strength of recommendation
Factors that can strengthen a recommendation, with comment:
Quality of the evidence: The higher the quality of evidence, the more likely is a strong recommendation.
Balance between desirable and undesirable effects: The larger the difference between the desirable and undesirable consequences, the more likely a strong recommendation is warranted. The smaller the net benefit and the lower the certainty for that benefit, the more likely a weak recommendation is warranted.
Values and preferences: The greater the variability in values and preferences, or uncertainty in values and preferences, the more likely a weak recommendation is warranted.
Costs (resource allocation):
The higher the costs of an intervention (that is, the more resources consumed), the less likely a strong recommendation is warranted.

Balancing benefits and downsides
• For: ↑ herd immunity, ↓ morbidity, ↓ death, ↑ QoL
• Against: ↑ resources, ↑ allergic reactions, ↑ nausea, ↑ local skin reactions
• Strength: Conditional or Strong

Implications of a strong recommendation
• Policy makers: The recommendation can be adopted as a policy in most situations
• Patients: Most people in this situation would want the recommended course of action and only a small proportion would not
• Clinicians: Most patients should receive the recommended course of action

Implications of a conditional recommendation
• Policy makers: There is a need for substantial debate and involvement of stakeholders
• Patients: The majority of people in this situation would want the recommended course of action, but many would not
• Clinicians: Be more prepared to help patients to make a decision that is consistent with their own values/decision aids and shared decision making

Case scenario
A 13-year-old girl who lives in rural Indonesia presented with flu symptoms and developed severe respiratory distress over the course of the last 2 days. She required intubation. The history reveals that she shares her living quarters with her parents and her three siblings. At night the family’s chicken stock shares this room too, and several chickens had died unexpectedly a few days before the girl fell sick.
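
The two axes on the balancing slides (for or against, strong or conditional) pair up with the four standard opening phrases quoted earlier in the workshop. A minimal sketch of that pairing follows, assuming the common GRADE convention that "recommend" signals a strong recommendation and "suggest" a conditional or weak one; the function name `recommendation_stem` and the lookup table are illustrative only.

```python
# Assumed pairing: strong -> "recommend", conditional/weak -> "suggest".
STEMS = {
    ("strong", "for"): "We recommend using",
    ("conditional", "for"): "We suggest using",
    ("strong", "against"): "We recommend against using",
    ("conditional", "against"): "We suggest against using",
}


def recommendation_stem(strength, direction):
    """Return the opening phrase for a recommendation.

    strength  -- "strong", "conditional" or "weak" (weak is treated as
                 a synonym of conditional)
    direction -- "for" or "against"
    """
    if strength == "weak":
        strength = "conditional"
    try:
        return STEMS[(strength, direction)]
    except KeyError:
        raise ValueError(
            f"unsupported combination: {strength!r}, {direction!r}") from None


print(recommendation_stem("strong", "for"))
print(recommendation_stem("weak", "against"))
```

Fixing the wording per cell means a reader can infer strength and direction from the first words of a recommendation, which supports the implications for patients, clinicians and policy makers listed above.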
Methods – WHO Rapid Advice Guidelines for Avian Flu
Applied findings of a recent systematic evaluation of guideline development for WHO/ACHR
Group composition (including panel of 13 voting members):
• clinicians who treated influenza A(H5N1) patients
• infectious disease experts
• basic scientists
• public health officers
• methodologists
Independent scientific reviewers:
• Identified systematic reviews, recent RCTs, case series, animal studies related to H5N1 infection

Oseltamivir for Avian Flu
Summary of findings:
• No clinical trial of oseltamivir for treatment of H5N1 patients.
• 4 systematic reviews and health technology assessments (HTA) reporting on 5 studies of oseltamivir in seasonal influenza.
– Hospitalization: OR 0.22 (0.02 to 2.16)
– Pneumonia: OR 0.15 (0.03 to 0.69)
• 3 published case series.
• Many in vitro and animal studies.
• No alternative that was more promising at present.
• Cost: $40 per treatment course

From evidence to recommendation
Factors that can strengthen a recommendation, each with its comment:
• Quality of the evidence: Very low quality evidence
• Balance between desirable and undesirable effects: Uncertain, but small reduction in relative risk still leads to large absolute effect
• Values and preferences: Little variability and clear
• Costs (resource allocation): Low cost under non-pandemic conditions

Complex data & decisions: yes/no?

Recommendation - The Guidelines Group recommends that fluoroquinolones are / are not used in the treatment of all patients with MDR (strong / conditional recommendation; low / moderate / high grade of evidence)

Recommendation: In women with histologically confirmed CIN, the expert panel recommends/suggests cryotherapy/LEEP over cryotherapy/LEEP.
Population: Women with histologically confirmed CIN
Intervention: Cryotherapy versus LEEP

Factor: High or moderate evidence (is there high or moderate quality evidence?) The higher the quality of evidence, the more likely is a strong recommendation.
Decision: Yes / No
Explanation: There is moderate quality evidence from both randomised and observational controlled studies for recurrence rates. However, there is low quality evidence for other outcomes which were considered critical and important for decision making (e.g., severe adverse events, cervical cancer). There is uncertainty for fertility and other obstetrical outcomes, and HIV acquisition/transmission was not measured.

Factor: Certainty about the balance of benefits versus harms and burdens (is there certainty?) The larger the difference between the desirable and undesirable consequences and the certainty around that difference, the more likely a strong recommendation. The smaller the net benefit and the lower the certainty for that benefit, the more likely is a conditional/weak recommendation.
Decision: Yes / No
Explanation: Benefits of LEEP were greater, and harms were fewer or similar
• Recurrence rates of CIN I, CIN II-III and all CINs are probably greater with cryotherapy
– CIN II-III, OR 3.3 (1.04 to 10.46)
– CIN I, OR 2.74 (0.62 to 12.07)
– All CIN, OR 2.14 (1.05 to 4.33)
• Cryotherapy may be less acceptable to patients than LEEP
• There may be little difference in serious adverse events between cryotherapy and LEEP, but there may be fewer minor adverse events (such as pain) with cryotherapy
• It is unclear whether there is a difference in fertility/obstetric outcomes

Factor: Certainty in or similar values (is there certainty or similarity?) The more certainty or similarity in values and preferences, the more likely a strong recommendation.
Decision: YES / No
Explanation: Similar values across women
• High value was placed on CIN recurrence, serious adverse events and acceptability to the patient
• Low value was placed on minor adverse events

Factor: Resource implications (are the resources worth the intervention?) The lower the cost of an intervention compared to the alternative that is considered and other costs related to the decision, that is, the fewer resources consumed, the more likely is a strong recommendation.
Decision: YES / No
Explanation: More resources required for LEEP
• Need for more skilled providers to perform LEEP
• Need for more or expensive equipment/supplies for LEEP; electrical supply for LEEP
• Need for local anaesthesia with LEEP

Overall strength of recommendation: Conditional

Example: Oseltamivir for Avian Flu
Recommendation: In patients with confirmed or strongly suspected infection with avian influenza A (H5N1) virus, clinicians should administer oseltamivir treatment as soon as possible (strong recommendation, very low quality evidence).
Remarks: This recommendation places a high value on the prevention of death in an illness with a high case fatality. It places relatively low values on adverse reactions, the development of resistance and costs of treatment.
Schünemann et al., The Lancet ID, 2007

Issues in guideline development for immunization
• Causation versus effects of intervention
– Causation not equivalent to efficacy of interventions
– Bradford Hill
• Nearly half a century old – tablet from the mountain?
• Harms caused by interventions
– Assumption is that removal of vaccine (or no exposure) leads to NO adverse effects
• How confident can one be that removal of the exposure is effective in preventing disease?
– Whether immunization or environmental factors: will depend on the intervention to remove exposure

Current state of recommendations
• Reviewed 7527 recommendations
– 1275 randomly selected
• Inconsistency across/within
• 31.6% did not present recommendations clearly
– Most of them not written as executable actions
• 52.7% did not indicate strength

Recommendation
• The Guideline Group recommends rapid DST testing for resistance to INH and RIF or RIF alone over conventional testing or no testing at the time of diagnosis of TB (conditional, low quality evidence).
• Values and preferences: A high value was placed on outcomes such as preventing death and transmission of MDR as a result of delayed diagnosis as well as avoiding spending resources.

Group composition
• Group composition might affect recommendation
• Common principle: include all affected by the recommendations (multi-disciplinary groups incl. patients/carers)
– Industry?
• Keep a manageable size

The Process: How to make it constructive?
• Group members are heterogeneous and might have different objectives
• Chair facilitates rather than leads the group
• Common understanding of goal, tasks and ground rules
• Similar level of required knowhow and skills
• Sufficient technical support

Balanced participation and formal agreement
• Key task of chair
• Formal consensus processes
– Delphi Method
– Nominal group process
– Voting

Group processes
How to present controversies
• Lay out the controversies
• Describe the evidence
• Ask members to focus on the agreed upon evidence and the factors leading to a decision
• Ask whether there still is disagreement
• Vote
– Make voting explicit and transparent (ways of doing this to come tomorrow)

Conclusions - Process
• Success depends on strong chair(s), training of group, good facilitation and technical support
– Clinical and methods co-chairs
• Formal consensus developing methods might support agreement on recommendations
– Voting represents forced consensus
• Guideline development will require sufficient resources.

GRADE Grid
P I C O
Outcomes: Critical outcome, Critical outcome, Important outcome, Not important outcome
Systematic review
Summary of findings & estimate of effect for each outcome
Quality of evidence for each outcome: High, Moderate, Low, Very low
Randomization increases initial quality
Grade down:
1. Risk of bias
2. Inconsistency
3. Indirectness
4. Imprecision
5. Publication bias
Grade up:
1. Large effect
2. Dose response
3. Confounders

Guideline development
Grade overall quality of evidence across outcomes based on lowest quality of critical outcomes
Formulate recommendations:
• For or against (direction)
• Strong or conditional (strength)
By considering:
• Quality of evidence
• Balance benefits/harms
• Values and preferences
(Revise by considering:)
• Resource use (cost)
Wording:
• “We recommend using…/should”
• “We suggest using…/might”
• “We recommend against using…/should not”
• “We suggest against using…/might not”

Conclusions
• WHO guidelines should be based on the best available evidence to be evidence based
• GRADE is the approach used by WHO and gaining acceptance internationally
• Combines what is known in health research methodology and provides a structured approach to improve communication
• Does not avoid judgments but provides framework
• Criteria for evidence assessment across questions and outcomes
• Criteria for moving from evidence to recommendations
• Transparent, systematic
– four categories of quality of evidence
– two grades for strength of recommendations
• Transparency in decision making and judgments is key

Format
• Mix of seminars/interactive lectures, self directed learning and simulation
– Large group and smaller group discussion
– Computer work
• Simulate guideline panel work
• Select rapporteur (both for large group and any small group work)
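The rating steps in the GRADE grid (initial quality set by study design, grading down and up, then taking the lowest quality among critical outcomes) can be sketched in Python. This is a simplified illustration under stated assumptions, not an official GRADE tool: it assumes the usual convention that randomized evidence starts at "high" and observational evidence at "low", it treats every grading step as exactly one level, and the function names and example outcomes are hypothetical. The real process rests on panel judgment rather than arithmetic.

```python
# The four quality categories from the slides, ordered lowest to highest.
LEVELS = ["very low", "low", "moderate", "high"]


def rate_outcome(randomized: bool, downgrades: int = 0, upgrades: int = 0) -> str:
    """Rate one outcome.

    Assumption: randomized evidence starts at 'high', observational at 'low'.
    Each downgrade (risk of bias, inconsistency, indirectness, imprecision,
    publication bias) or upgrade (large effect, dose response, confounders)
    moves the rating one level; the result is clamped to the four categories.
    """
    start = LEVELS.index("high") if randomized else LEVELS.index("low")
    index = max(0, min(len(LEVELS) - 1, start - downgrades + upgrades))
    return LEVELS[index]


def overall_quality(outcomes: dict) -> str:
    """Overall quality = lowest quality among the critical outcomes.

    `outcomes` maps an outcome name to (quality, is_critical).
    """
    critical = [quality for quality, is_critical in outcomes.values() if is_critical]
    if not critical:
        raise ValueError("at least one critical outcome is required")
    return min(critical, key=LEVELS.index)


# Hypothetical evidence profile, for illustration only.
profile = {
    "mortality": (rate_outcome(randomized=False, downgrades=1), True),
    "pneumonia": (rate_outcome(randomized=True, downgrades=2), True),
    "nausea": (rate_outcome(randomized=True), False),
}
print(overall_quality(profile))
```

Because the overall rating follows the weakest critical outcome, the non-critical outcome rated "high" in this example does not raise the overall quality.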