Quality Standards for Real-World Research: Focus on Observational Database Studies of Comparative Effectiveness

Nicolas Roche1, Helen Reddel2, Richard Martin3, Guy Brusselle4, Alberto Papi5, Mike Thomas6, Dirkje Postma7, Vicky Thomas8, Cynthia Rand9, Alison Chisholm8, and David Price10

1. Cochin Hospital Group, APHP and University Paris Descartes, Paris, France
2. Woolcock Institute of Medical Research, University of Sydney, Glebe, Australia
3. National Jewish Medical and Research Center, Denver, Colorado
4. Department of Respiratory Medicine, Ghent University Hospital and Ghent University, Ghent, Belgium
5. Research Center on Asthma and COPD, University of Ferrara, Ferrara, Italy
6. Academic Primary Care, University of Aberdeen, Foresterhill Health Centre, Aberdeen, UK
7. Department of Pulmonology, Center Groningen and University of Groningen, Groningen, The Netherlands
8. Research in Real Life, Cambridge, United Kingdom
9. Johns Hopkins School of Medicine, Baltimore, Maryland
10. Centre of Academic Primary Care, University of Aberdeen, Aberdeen, UK

Aims of the paper
1. Improve knowledge and understanding of methodological issues specifically related to comparative observational studies using clinical and administrative datasets.
2. Provide checklists (for researchers and reviewers) of key markers of quality when conducting and appraising such studies.

Context: what is real-life research?
• Effectiveness and comparative effectiveness studies aim to evaluate the (relative) benefits of available therapeutic options as used in real clinical practice situations (i.e., in unselected patients receiving usual care).
• Such studies can use observational or clinical trial designs, but in both cases put emphasis on high external validity.
• Their goal is to complement classical efficacy randomized controlled trials (RCTs), which have high internal validity and are required for the registration of treatments.

Context: why do we need real-life research?
• In asthma, it has been shown that the highly selected patient populations recruited to registration RCTs represent less than 5% of the general target patient population.1
• Although these efficacy trials are rigorous in design and address important questions regarding the risk/benefit profile of new therapies, their conclusions strictly apply only to the selected population recruited to the trial.
o They are limited in the extent to which they can be extrapolated to reflect the treatment effects achievable at the population level.
1. Herland K, et al. Respir Med 2005;99:11–19.

Context: why do we need real-life research?
• Patient characteristics controlled for by tight RCT design include:2–5
o Smoking status
o Excess weight
o Presence of other comorbidities
o Concomitant treatments
o Environmental exposures
Similarly, in RCTs some clinical management issues that can modulate the
• RCTs also utilize intense patient–physician interaction: they control patient behaviour, reinforce patient education, enforce therapy adherence, and insist on effective use of the inhalation device (for inhaled therapies).6,7
2. Peters-Golden M, et al. Eur Respir J 2006;27:495–503.
3. Price DB, et al. Allergy 2006;61:737–742.
4. Thomas M, et al. BMC Pulm Med 2006;6:S4.
5. Molimard M, et al. J Aerosol Med 2003;16:249–254.
6. Giraud V, et al. Respir Med 2011;105:1815–1822.
7. Price D, et al. Treatment Strategies 2012;3:37–46.

Limitations: RCT inclusions/exclusions
• Studies have shown that efficacy RCTs exclude about 95% of asthma and 90% of COPD routine care populations because of strict inclusion criteria.1
1. Herland K, et al. Respir Med 2005;99:11–19.
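
The way strict inclusion criteria shrink a routine care population can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Patient` fields, the four criteria and their thresholds, and the six-patient toy cohort are hypothetical and are not taken from Herland et al. or from any real dataset. The function applies the criteria one after another and records how many patients remain after each step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    # Hypothetical fields chosen for illustration only.
    age: int
    current_smoker: bool
    has_comorbidity: bool
    fev1_pct_predicted: float


# Hypothetical inclusion criteria, applied in this order.
CRITERIA = [
    ("age 18-65", lambda p: 18 <= p.age <= 65),
    ("non-smoker", lambda p: not p.current_smoker),
    ("no comorbidity", lambda p: not p.has_comorbidity),
    ("FEV1 50-85% predicted", lambda p: 50 <= p.fev1_pct_predicted <= 85),
]


def eligibility_dropoff(patients, criteria):
    """Return (criterion, patients remaining, fraction of the original cohort)
    after each criterion is applied in sequence."""
    remaining = list(patients)
    total = len(remaining)
    steps = []
    for label, keep in criteria:
        remaining = [p for p in remaining if keep(p)]
        fraction = len(remaining) / total if total else 0.0
        steps.append((label, len(remaining), fraction))
    return steps


# Toy cohort (invented values, not real patients).
cohort = [
    Patient(45, False, False, 70.0),
    Patient(70, False, False, 70.0),
    Patient(50, True, False, 60.0),
    Patient(30, False, True, 75.0),
    Patient(40, False, False, 95.0),
    Patient(55, True, True, 40.0),
]

steps = eligibility_dropoff(cohort, CRITERIA)
for label, n_left, fraction in steps:
    print(f"{label}: {n_left} remaining ({fraction:.0%})")
```

In this toy cohort only one of six patients satisfies all four criteria; the order in which criteria are applied changes the intermediate counts but not the final one.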
[Figure: Patient RCT eligibility drop-off with sequential application of standard inclusion criteria. Panels: COPD, Asthma.]

An integrated evidence base
• To ensure the widest possible generalizability of results, highly controlled RCTs conducted in highly selected populations must be complemented by larger studies with:
o Target populations (i.e., populations in whom we intend to use the intervention)
o Settings
o Durations
that mimic the real world.
• The need for such research in asthma has been advocated by several groups in recent years.8–10
8. Silverman SL. Am J Med 2009;122:114–120.
9. Reddel HK, et al. Am J Respir Crit Care Med 2009;180:59–99.
10. Holgate S, et al. Eur Respir J 2008;32:1433–1442.

REG’s integrated evidence framework
• The Respiratory Effectiveness Group (REG) has proposed a new framework to enable classification of clinical research studies in terms of their general design.
• The framework is intended to complement the previously proposed PRECIS wheel (see later slide).
• The REG framework relies on two axes:
o One describing the type of studied population in relation to the broadest target population
o The other describing the “ecology of care” (or management approach) in relation to usual standard of care in the community.
11. Roche N, et al. Lancet Respir Med 2013;1:e29–e30.

REG’s integrated evidence framework
• The position of a study within the framework serves as a description of the study, not as a representation of the quality of evidence it provides.
• The framework is a tool for describing the basic characteristics of the study design and population.
• Multiple studies can be placed relative to each other with respect to their relevance to the general target population, and for each study the appropriate quality assessment tools can be identified.
11. Roche N, et al. Lancet Respir Med 2013;1:e29–e30.

REG’s integrated evidence framework
A means of positioning individual studies with respect to their relevance to the general target population.
11. Roche N, et al. Lancet Respir Med 2013;1:e29–e30.

Improving Guidelines: PRECIS Wheel
• 9 “spokes,” each representing a different element of the study design (e.g., study eligibility criteria, expertise of individuals applying the intervention).
• Each spoke, or axis, represents an explanatory–pragmatic (i.e., efficacy–effectiveness) continuum, and aspects of a trial are scored/positioned along each respective axis depending on the extent to which they reflect the characteristics of an explanatory (efficacy) RCT or a pragmatic effectiveness trial.
12. Thorpe KE, et al. CMAJ 2009;180:E47–E57.

Evidence quality assessment tools
• RCTs: CONSORT Statement13
• Pragmatic trials: CONSORT Statement14
• Observational studies in epidemiology: STROBE statement15
• Pharmacoepidemiology and pharmacovigilance studies: EMA-ENCePP checklist for study protocols16
• Clinical trial protocols: SPIRIT recommendations17
• Datasets: Quality criteria and minimal dataset requirements for observational studies (UNLOCK initiative)18
• Meta-analyses reporting: QUOROM19 & PRISMA20
13. Altman DG, et al. Ann Intern Med 2001;134:663–694.
14. Zwarenstein M, et al. BMJ 2008;337:a2390.
15. Vandenbroucke JP, et al. Ann Intern Med 2007;147:W163–W194.
16. ENCePP: http://www.encepp.eu/standards_and_guidances/documents/ENCePPGuideofMethStandardsinPE.pdf
17. Chan A-W, et al. Lancet 2013;381:91–92.
18. Chavannes N, et al. Prim Care Respir J 2010;19:408.
19. Moher D, et al. Lancet 1999;354:1896–1900.
20. Liberati A, et al. BMJ 2009;339:b2700.

Quality Standards: Observational studies
• Traditional perception:
o Observational studies provide weak evidence to support treatment recommendations
o Efficacy RCTs represent the top-level evidence in many guidelines.
BUT
• RCTs cannot be non-interventional, so questions of “usual care” are best addressed via alternative means.
• Guidelines should include different sources of evidence and acknowledge that evidence from well-designed observational studies may be moderate (or even strong) if the treatment effect is large and the evaluation has accounted for all of the plausible confounders and biases in properly adjusted analyses.

Quality Standards: reporting
• Whatever the type of study design, it is crucial that the study is reported in such a way that the:
o appropriateness and quality of the chosen methodology
o relevance of the results
can be assessed by readers (e.g., care givers, researchers, guideline developers, policy makers, patient associations, journal editors, and reviewers) so they can determine whether (and how) they should use the findings.

Observational studies: key limitations
• The main potential limitations of observational studies are:
o Selection bias (e.g., confounding by severity or indication: differential prescribing based on unevaluable patient characteristics)
o Information bias (e.g., data that lead to misclassification)
o Recall bias (when assessment of treatment exposure and/or outcomes depends on patients’ or caregivers’ recall)
o Detection bias (when an event of interest is less, or more, likely to be captured in one treatment group than in the other)

Observational studies: Preparation (I)
• A priori planning (prospective design) helps avoid “fishing” strategies and should specify:
o The purpose of the study (i.e., hypotheses to test)
o Primary and secondary outcomes
o Study design (i.e., cross-sectional or longitudinal)
o Pre-specified analyses, ensuring that all potentially relevant variables required to characterize patients are included

Observational studies: Preparation (II)
• A suitable database to answer the key study question should be selected.
• A reliable, identifiable index event should be defined (e.g. treatment change).
• A detailed database extraction and statistical analysis plan must then be prepared.
• The study population and subgroups of interest must be precisely defined.
• To reduce potential bias, possible confounders should be identified and accounted for appropriately by matching and/or adjustment strategies.
• A dedicated independent steering committee should guide these steps.
• The preparation process for observational studies should include the registration and, if possible, the publication of a study protocol in a public repository, with a commitment to publish regardless of results.

Observational studies: Analyses and reporting
• Demonstrate the robustness of results by:
o Assessing whether the studied database population is representative of the target patients
o Establishing the consistency of results through sensitivity analyses and across relevant patient subgroups
o Demonstrating their reproducibility in different datasets where similar criteria have been used to define the target populations, index events and outcomes.
• Use the same pre-defined population for all components of the analyses (e.g. effectiveness, tolerance and medico-economic outcomes).
• Begin the reporting of results with a flow chart detailing patient selection.

Observational studies: Analyses and reporting
• Patient characteristics, both demographic and medical (including markers of disease severity, comorbidities and concomitant treatments), should be described in detail and compared between treatment groups.
• Use patient matching or statistical modeling to adjust for differences between treatment arms.
• The results of all analyses that are conducted (e.g. matched, unmatched, adjusted and unadjusted) should be reported to help demonstrate the robustness of the chosen method of analysis.

Observational studies: Matched analyses
• If the differences are too great to apply adjusted analyses alone, matching should be considered as an additional tool for ensuring similarity of patients based on key demographic characteristics and markers of disease severity.
• This can be done using:
o Propensity scores: patients are assigned a score based on their baseline profile and matched to other patients with a similar score
o Matching individual patients using a predefined set of key matching criteria
• Both of these processes require close liaison between medical experts and statisticians to agree on suitable criteria for matching.

Observational studies: Discussion of results
• The discussion has to address the specific aspects of the study design.
• Consider the results from the perspective of the initial hypotheses (do they confirm or contradict the underlying study hypothesis?) before viewing them from a broader perspective.
• Set the results of observational database effectiveness studies within context by comparing them to those of efficacy RCTs on the same topic.
o If they differ, could it be because of unaddressed bias in the observational study?

Observational studies: Discussion of results
• The authors should present the rationale for their analysis approach and discuss whether they feel it has successfully reduced the risk of bias.
• Limitations of the study must also be acknowledged.
• Conclusions should be qualified with a note about the level of confidence that readers should have in the reliability, robustness and generalizability of results (i.e., the level of evidence provided by the study), and new studies should be suggested to challenge, strengthen or extend the conclusions.
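
The propensity-score option described under "Matched analyses" can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: it assumes the scores have already been estimated (for example by a logistic regression on baseline characteristics, which is not shown), and it uses greedy 1:1 nearest-neighbour matching without replacement within a caliper, which is only one of several common matching strategies. The function name, the caliper of 0.05 and the patient identifiers and scores are all hypothetical.

```python
def match_on_propensity(treated, controls, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    treated, controls: dicts mapping patient id -> propensity score (0 to 1).
    A treated patient is matched to the closest unused control, provided the
    score difference does not exceed the caliper; otherwise the treated
    patient stays unmatched. Returns a list of (treated_id, control_id).
    """
    available = dict(controls)  # controls not yet used in a pair
    pairs = []
    for t_id, t_score in sorted(treated.items()):  # fixed order for reproducibility
        if not available:
            break
        # Closest remaining control; ties are broken by control id.
        c_id = min(available, key=lambda c: (abs(available[c] - t_score), c))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Invented scores for illustration only.
treated = {"T1": 0.30, "T2": 0.62, "T3": 0.90}
controls = {"C1": 0.28, "C2": 0.33, "C3": 0.60, "C4": 0.10}

pairs = match_on_propensity(treated, controls)
print(pairs)  # [('T1', 'C1'), ('T2', 'C3')]
```

Here T3 remains unmatched because no control lies within the caliper. Unmatched patients of this kind are one reason to report results from both the matched and the unmatched populations.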
Quality criteria for observational database comparative studies
(Section: quality criteria)

Background
• Clear underlying hypotheses and specific research question(s)

Methods: study design
• Observational comparative effectiveness database study
• Independent steering committee involved in a priori definition of the study methodology (including statistical analysis plan), review of analyses and interpretation of results
• Registration in a public repository with a commitment to publish results

Methods: databases
• High-quality database(s) with few missing data for measures of interest
• Validation studies

Methods: outcomes
• Clearly defined primary and secondary outcomes, chosen a priori
• The use of proxy and composite measures is justified and explained.
• The validity of proxy measures has been checked.

Methods: length of observation
• Sufficient duration to reliably assess outcomes of interest and long-term treatment effects

Methods: patients
• Well-described inclusion and exclusion criteria, reflecting target patients’ characteristics in the real world

Methods: analyses
• Study groups are compared at baseline using univariate analyses.
• Biases related to baseline differences are avoided using matching and/or adjustments.
• Sensitivity analyses are performed to check the robustness of results.

Methods: sample size
• Sample size is calculated based on clear a priori hypotheses regarding the occurrence of outcomes of interest and target effect of studied treatment vs. comparator.

Results
• Flow chart explaining all exclusions
• Detailed description of patients’ characteristics, including demographics, characteristics of the disease of interest, comorbidities, and concomitant treatments
• If patients are lost to follow-up, their characteristics are compared with those of patients remaining in the analyses.
• Extensive presentation of results obtained in unmatched and matched populations (if matching was performed) using univariate and multivariate, unadjusted and adjusted analyses
• Sensitivity analyses and/or analyses of several databases go in the same direction as primary analyses.

Discussion
• Summary and interpretation of findings, focusing first on whether they confirm or contradict a priori hypotheses
• Discussion of differences with results of efficacy randomized controlled trials
• Discussion of possible biases and confounding factors, especially related to the observational nature of the study
• Suggestions for future research to challenge, strengthen, or extend study results

Conclusions (I)
• An integrated approach to evidence evaluation combines data of high internal validity (classical RCTs) with those of greater external validity (pragmatic trials and observational studies) to inform clinical decision making, guidance, and policy.
BUT
• This requires the reliability and generalizability of different study designs to first be determined:
o Characterisation of the study in terms of generalizability of its ecology of care and study population, then
o Assessment of its quality (using design-specific tools)

Conclusions (II)
• Further work is required in this area to turn the well-intended calls for better integration of different study approaches into meaningful action.
• A systematic review of the existing respiratory guidelines is required to identify where real-world studies can add useful complementary data.
• There is also a need to test the REG’s integrated research framework, to apply it to published research, and to use it to critically appraise the quality of the existing real-world evidence base.
• The REG plans to undertake these activities.

Conclusions (III)
• Until the REG’s systematic reviews are complete, the paper seeks to bring together the various challenges and considerations faced by those conducting and reviewing observational research and to provide useful checklists of key quality markers for observational research.
• The checklists should be used as guidance, so that their principles (a priori planning, appropriate analysis and transparency) are embodied by all those seeking to conduct high-quality observational research and recognized by those appraising it.
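
As one illustration of a priori planning, the checklist item on sample size (calculated from hypotheses about outcome occurrence and the target effect of treatment vs. comparator) can be sketched with the standard normal-approximation formula for comparing two proportions with equal group sizes. This is a hedged sketch, not a method prescribed by the paper: the function name, the default two-sided alpha of 0.05 and power of 0.80, and the example outcome rates (20% vs. 30%) are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p_treated, p_comparator, alpha=0.05, power=0.80):
    """Patients needed per group to detect a difference between two outcome
    proportions with a two-sided test (normal approximation, equal groups)."""
    if p_treated == p_comparator:
        raise ValueError("The target effect must be non-zero.")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p_treated * (1 - p_treated) + p_comparator * (1 - p_comparator)
    effect = p_treated - p_comparator
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical a priori assumption: 20% of treated patients vs. 30% of
# comparator patients experience the outcome of interest.
n_per_group = sample_size_two_proportions(0.20, 0.30)
print(n_per_group)  # 291
```

Smaller target effects require larger groups, which is why the hypothesized effect should be fixed before the database is interrogated rather than after.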