Implementing Evidence

Evidence-Based Quality Improvement: The State Of The Science

Article in Health Affairs, January 2005. DOI: 10.1377/hlthaff.24.1.138

Quality improvement strategies, just like medical interventions, need to rest on a strong evidence base.

by Kaveh G. Shojania and Jeremy M. Grimshaw

ABSTRACT: Routine practice fails to incorporate research evidence in a timely and reliable fashion. Many quality improvement (QI) efforts aim to close these gaps between clinical research and practice. However, in sharp contrast to the paradigm of evidence-based medicine, these efforts often proceed on the basis of intuition and anecdotal accounts of successful strategies for changing provider behavior or achieving organizational change. We review problems with current approaches to QI research and outline the steps required to make QI efforts based as much on evidence as the practices they seek to implement.

Consider the following scenario: A patient comes to see a physician. The patient obviously suffers from a serious chronic illness. Various diagnostic tests show gross abnormalities. Some sort of treatment is necessary.
The physician consults several colleagues, one of whom reports success in the treatment of a patient with a similar chronic illness using oval red capsules with the number “250” stamped on them. The physician rummages through the clinic’s medication room and finds some red pills, although they are square tablets with the number “100” stamped on them. He instructs the patient to take this medication once daily and report back to him. At the next visit, the patient reports most of the same symptoms, except that mornings tend to be better than they used to be. A little disappointed at this equivocal improvement, the doctor nonetheless publishes his results, as he knows there are other clinicians out there with sick patients who may benefit from knowing that “red pills improve morning symptoms in patients with chronic illnesses,” as his article title reads. Moreover, it may not matter if they come in tablet or capsule form. The article receives considerable attention. Many decide that the problem of sick patients is so urgent that there is no time to conduct further studies: Chronically ill patients should receive red pills as soon as possible. Others resist this approach as potentially squandering precious resources and therefore call for more research on the benefit of red pills in sick patients.

[Author note: Kaveh Shojania (kshojania@ohri.ca) is an assistant professor of medicine at the University of Ottawa (Ontario) and a scientist in the Clinical Epidemiology Program at the Ottawa Health Research Institute (OHRI). Jeremy Grimshaw is director of the OHRI Clinical Epidemiology Program, director of the Centre for Best Practice in the Institute of Population Health, and a professor in the Department of Medicine at the University of Ottawa. ©2005 Project HOPE–The People-to-People Health Foundation, Inc.]
Several randomized controlled trials (RCTs) report negative results, prompting some researchers to reconsider the issue of pill shape, while others try pills of different colors. Enough studies appear in the literature to warrant review articles and commentaries. Some conclude that the available literature shows no consistent benefit for pills of any type. They wonder what makes patients so difficult to cure—perhaps they don’t want to change? Other, more optimistic researchers point out that although there are no “magic bullets,” a number of pills show promise. For now, what seems most important is not the color or shape of the pills used but rather their number, as trials administering at least two types of pills consistently report greater benefits than those in which patients received only a single type.

Replace “patients” with “quality problems,” and the above scenario captures the state of the science for promoting the translation of evidence from clinical research into practice. From the perspectives of clinical medicine and the research enterprise, we regard it as absurd to proceed directly from a patient’s poorly understood complaints to reaching for a bottle of pills simply because they are handy and resemble ones recommended anecdotally by a colleague. The decision to administer these pills without any understanding of their active ingredients or their mode of action would be completely unsupportable. Yet comparably unsupportable activities occur routinely in quality improvement (QI) research.

Quality problems are widespread and often glaring, but as in the above scenario, the reasons for these problems remain unclear.1 Do providers not know the latest literature, or know it but disagree with it? Do they agree with the literature, but inadequate support systems frustrate their efforts to comply with its recommendations? Perhaps financial incentives are misaligned?
Remedies for quality problems, like their medicinal counterparts, come in a variety of colors and shapes—critical pathways, disease management, report cards, and local opinion leaders to champion guidelines, to name a few—all with active ingredients as poorly defined as those in the pills in the above scenario. And just as in that scenario, evaluations of these remedies sometimes report beneficial results, but no single approach produces these results consistently. Instead of exploring deeper reasons for these failures (what key ingredients do red pills contain?), the field has simply moved on to the next superficial variable (maybe pill shape plays a crucial role?).

In this paper we briefly summarize the evolution of approaches to implementing evidence in practice, review the results thus far for particular implementation strategies, and outline a plan for future advances.

Evidence-based medicine (EBM) is the explicit use of the best available evidence to inform decisions about the care of individual patients.2 Under this paradigm, hypotheses about clinical care undergo rigorous evaluation instead of having their effectiveness presumed on the basis of anecdotal experience or pathophysiological arguments. QI research seeks to implement in routine practice the processes and outcomes of care established by the best available evidence. Unfortunately, these efforts have often proceeded without insistence on the same level of rigor required to establish these QI targets as worthy of implementation. After multiple rigorously designed and conducted clinical trials establish the benefit of some process of care, implementation efforts typically proceed on the basis of intuition, anecdotal stories of success, or studies that exhibit little of the methodological sophistication seen in the research that established the intervention’s benefit.
Strategies for implementing EBM require an evidence base of their own.3

Evolution Of QI And Implementation Research

Efforts to implement EBM in routine practice have evolved through four overlapping phases, each characterized by its own optimistic version of “If you build it, they will come.”4

• Passive diffusion (“If you publish it, they will come”). In this earliest and particularly optimistic phase, it was assumed that clinicians would naturally act upon new clinical research as it appeared. The only acknowledged impediment to the flow of evidence from the pages of medical journals to the minds of practitioners was the sheer volume of information and variation in its quality. Advocates of evidence-based medicine promoted the adoption of systematic reading habits and the acquisition of basic skills in critiquing research articles.

• Guidelines and systematic reviews (“If you read it for them, they will come”). In this second phase, it was realized that even with more judicious reading habits and critical skills honed in journal clubs, a variety of factors prevented clinicians from acquiring evidence in a reliable and timely fashion. Systematic reviews of the evidence and clinical practice guidelines would therefore identify and synthesize studies addressing important clinical decisions, accompanied by graded recommendations for practitioners. Why practice guidelines generally failed to change practice likely involved a combination of continued reliance on passive diffusion and other factors that have received only limited study, including disagreement with the content of guidelines (which show wide variations in methodological quality and quickly become out of date), personal characteristics of providers (for example, resistance to perceived infringements on physician autonomy), and logistic or financial barriers to implementation.5

• Industrial-style quality improvement (“If you TQM/CQI it, they will come”).
This third stage introduced more active approaches to quality improvement, best represented by the “plan-do-study-act” cycles of total quality management (TQM) and continuous quality improvement (CQI). In many ways, TQM and CQI are not so much specific interventions as general approaches to improving quality. In fact, heterogeneity in what counts as TQM or CQI may explain the disappointing results reported by reviews of their impact in health care.6

Some may claim that the benefits of TQM, CQI, and other general approaches to quality improvement (such as “Six Sigma”) have been well established in other industries and that the disappointing results in health care reflect inadequate commitment to their principles or suboptimal implementation of their methods. Although there may be some truth to this, it is also important to note that QI programs in other industries have seldom undergone the level of scrutiny to which health care routinely subjects new aspects of care and efforts to implement them. In fact, what counts as “well established” in other industries often consists of case reports or observational studies that in health care would be regarded as hypothesis-generating research, not confirmatory evidence.

• Systems reengineering (“If you completely rebuild it, they will come”). This fourth and present stage contrasts with the incremental cycles of TQM and CQI (although these are still commonly encountered) by seeking meaningful quality improvement through radical redesign of existing systems of care. Redesign efforts attempt to capture the optimal means of accomplishing key goals, instead of relying on Rube Goldberg–like protocols that reflect myriad, often contradictory historical forces and incentives for change.7 Reengineering efforts often involve a major component of information technology (IT) as the means of achieving a more optimal, streamlined delivery system.
Evaluations of IT already show the familiar pattern of prominent successes accompanied by equally prominent failures.8 Instead of simply moving on to the next new paradigm, it is worth considering what deficiencies have existed in the literature and how they might be corrected.

• Barriers to QI interventions. In each of the above phases, QI initiatives have typically proceeded on the basis of presumptions about practitioners’ needs and untested assumptions about effective means of addressing them, in precise opposition to the paradigm of evidence-based medicine. During the past thirty years, a number of groups have amassed an evidence base for implementation research.9 Only in the past decade, however, have researchers focused on identifying the barriers that prevent evidence-based care (Exhibit 1) and on designing QI strategies to address them.10 The terms “knowledge translation” and “implementation research” also appear in the literature, capturing the notion that although a given practice may be supported by evidence, the best way to implement that practice requires a research base of its own. While slightly more technical, these terms are roughly interchangeable with each other and with QI research.

Designs For Evaluating QI Research

Systematic reviews of QI strategies have consistently identified weak designs in the primary studies evaluating those strategies. For example, a review of CQI studies published in major U.S. medical journals reported that 75 percent of them relied on simple before-after designs, often at single sites, which makes it difficult to attribute any observed benefits to the CQI interventions.11

• Expediency versus rigor.
Given the perceived urgency for QI efforts, some have resisted calls to adopt evaluative designs comparable in rigor to those typically found in clinical research, especially given the additional challenges of studying change in complex organizations. Ironically, a better case for permitting well-designed observational studies to provide adequate evidence for major policy decisions can be made for clinical research than for quality improvement. RCTs offer protection from the effects of unknown predictors of treatment outcome by balancing their prevalence between control and experimental groups. In clinical research, we usually understand many of these predictors and can therefore adjust for imbalances in observational studies. By contrast, we generally have very limited understanding of the factors that determine the success of a QI intervention, making randomized designs (not to mention blinding, concealment of allocation, and other often overlooked aspects of trial design) even more important if one wants to avoid wasting resources on ineffective interventions.

EXHIBIT 1
Examples Of Barriers To Translating Evidence Into Practice

Structural (financial disincentives; inappropriate skill mix; lack of facilities or equipment):
- Cost and reimbursement issues related to administering the birth dose of hepatitis B vaccine (HBV) in the hospital
- Staffing of ICUs by intensivists is widely called for, but meeting this goal is highly unlikely given current projections of the intensivist workforce

Peer group (practice patterns determined by local standards and beliefs rather than evidence or formal consensus statements):
- Multiple examples of wide variations in rates of surgery or diagnostic tests from one geographic area to another

Professional:
- Knowledge/skills: physicians’ knowledge of the crucial role for aspirin in treating acute myocardial infarction, and published recommendations from clinical experts, lagging behind or even contradicting existing evidence from randomized trials
- Attitudes and beliefs: concerns that guidelines do not reflect real-world practice; resistance to “cookbook medicine”

Patient factors:
- Patients’ requests for specific diagnostic tests or treatments despite their not being recommended (for example, requests for antibiotics as treatment for viral upper respiratory tract infections)
- Patients’ informed choices not to pursue recommended care (for example, the choice of some parents not to use newer vaccines for their children, such as HBV beginning at birth)

SOURCE: Authors’ analysis based on R. Grol and M. Wensing, “What Drives Change? Barriers to and Incentives for Achieving Evidence-Based Practice,” Medical Journal of Australia 180, no. 6, Supp. 1 (2004): S57–S60.
NOTES: Failures of routine practice to replicate recommended care have frequently been ascribed to lack of knowledge and to recalcitrance on the part of physicians. In fact, barriers to adoption of evidence-based care include structural issues, peer-group effects, and patient factors. ICU is intensive care unit.

• The “before-after” approach. When resources or time constraints prevent an RCT, institutions should strongly consider trial designs that avoid the problems of a simple “before-after” design.12 In this approach, if Hospital A decides to implement a particular QI program, it might look at outcomes of interest in the year before and the year after implementation to determine whether or not any significant change occurred. Such studies suffer from two major drawbacks. First, background factors can produce large fluctuations in processes or outcomes of interest irrespective of QI interventions.
Second, during any given period, multiple changes typically occur within a health care system or its socioeconomic environment. One or more of these other changes might have produced the desired improvements.

• Time-series design. One way to minimize these possibilities is to look at multiple time periods (for example, monthly outcomes over at least one year before and after the intervention). This conveys the extent of background variation and also indicates the extent to which any trend toward improvement may have been present prior to the intervention. Various mathematical tools allow one to formally test what is often readily apparent visually—namely, that a marked change did (or did not) occur at the time of the intervention. One example of this so-called interrupted time-series design, which clearly provided more accurate information than a simple before-after study would have done, is an evaluation in the United Kingdom of the impact of guidelines mailed to providers when they requested radiographic studies. During a four-year period at the two hospitals involved in the intervention, a simple before-after study would have suggested significant reductions in referrals for eleven of the eighteen procedures evaluated, whereas the time-series analysis indicated no benefit attributable to the intervention.13

• Controlled before-after design. When multiple time points before and after an intervention are not feasible, a reasonable alternative to a time-series analysis at Hospital A is a controlled before-after study, in which the same before-after measurements occur in one or more hospitals that did not implement the change of interest but are otherwise comparable with Hospital A.
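The logic of the interrupted time-series design can be made concrete with a short segmented-regression sketch. This is an illustration only, with simulated monthly data rather than figures from the U.K. radiology study: the model estimates the pre-existing trend, the immediate level change at the intervention, and any change in slope afterward, so a drop that merely continues a baseline trend is not credited to the intervention.

```python
import numpy as np

def segmented_regression(y, intervention_month):
    """Fit the interrupted time-series (segmented regression) model:
    y = b0 + b1*time + b2*post + b3*(time since intervention) + error.
    b2 estimates the immediate level change at the intervention;
    b3 estimates the change in slope afterward."""
    t = np.arange(len(y), dtype=float)
    post = (t >= intervention_month).astype(float)
    t_after = np.where(post == 1.0, t - intervention_month, 0.0)
    X = np.column_stack([np.ones_like(t), t, post, t_after])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # [intercept, baseline slope, level change, slope change]

# Simulated monthly referral rates: a pre-existing downward trend
# with NO additional effect at the intervention (month 12).
rng = np.random.default_rng(0)
months = np.arange(24)
y = 100 - 1.5 * months + rng.normal(0, 1, 24)

b0, b1, b2, b3 = segmented_regression(y, intervention_month=12)

# A naive before-after comparison "finds" a large drop in the mean...
naive_drop = y[:12].mean() - y[12:].mean()
# ...but the segmented model attributes it to the baseline trend:
# the slope b1 is near -1.5, while the level change b2 is near zero.
```

In a naive before-after comparison the post-period mean is far below the pre-period mean, yet the segmented model assigns essentially all of that difference to the baseline slope, which is exactly the distinction the radiology-guideline evaluation turned on.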
A study of the impact of “critical pathways” on surgical length-of-stay provides a striking example of the benefit of a controlled before-after design.14 A major Boston teaching hospital implemented critical pathways for major surgical procedures and observed significant decreases in length-of-stay, ranging from 3 percent to 9 percent (p < .01 for each before-after comparison). However, when lengths-of-stay for the same procedures were analyzed in other Boston hospitals, the same or greater decreases were found. This comparison allowed the investigators to recognize that reductions in surgical length-of-stay at their own hospital likely reflected secular changes, presumably in response to general economic pressures to shorten hospital stays. Reliance on a simple before-after design would have resulted in the mistaken attribution of these changes to the pathways.

Evidence Of The Effectiveness Of Specific QI Strategies

Regardless of trial design, individual evaluations of QI strategies will always provide less complete pictures than systematic reviews. By gathering evidence from multiple clinical and organizational settings, systematic reviews provide decisionmakers with more useful assessments of the totality of evidence supporting a given approach to improving quality than individual studies can.15 The increasingly recognized importance of systematic reviews in informing policy decisions has led some to distinguish “health technology assessments” from systematic reviews, with the former reflecting a balance between the ideals of scientific rigor and the need of policymakers to receive evidence syntheses on short timelines.16 Here we review findings from several major systematic reviews and health technology assessments of QI strategies.

• Use of multifaceted interventions.
The first major review of the evidence supporting a variety of QI strategies found no “magic bullets” for addressing quality problems.17 It did, however, identify trends toward modest benefits for many interventions, especially those using multiple strategies for promoting change (“multifaceted” interventions). In other words, instead of using provider education alone or audit and feedback alone, effective interventions more often combined elements from two or more categories. Effective interventions were also more likely to involve active than passive strategies (for example, simply mailing guidelines to providers).

• Targeting provider behavior. A comparably broad review five years later found that the literature now included forty-one systematic reviews of implementation strategies targeting provider behavior.18 This “overview of overviews” largely echoed the previous findings: No interventions consistently produced large improvements, and the ones producing modest improvements tended to be active and multifaceted. More recently, a synthesis of more than 200 evaluations of strategies for promoting implementation of guidelines again showed modest but consistent evidence of improvements in care.19 Across all studies, intervention groups exhibited a median absolute improvement of approximately 10 percent in adherence to target processes of care. As in previous reviews, however, most implementation strategies showed wide variations in effect size. For instance, RCTs of interventions involving provider reminders reported changes in adherence to guidelines ranging from a 34 percent improvement to a 1 percent decline. As in the scenario at the outset, red pills seemed to work remarkably well in some studies while producing no effect in others.
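To make the summary statistic concrete, a “median absolute improvement” is the median, across trials, of the absolute change in adherence attributable to the intervention (read here as percentage points of adherence). The sketch below uses entirely hypothetical trial data, with numbers chosen only to echo the reported range from a 34-point gain to a 1-point decline, and a simple controlled before-after contrast for each trial.

```python
import statistics

# Hypothetical adherence figures (percent of eligible patients who
# received the target process of care) for five imaginary trials.
# These numbers are illustrative only -- not data from the reviews cited.
trials = [
    {"ctrl_before": 50, "ctrl_after": 50, "int_before": 40, "int_after": 74},
    {"ctrl_before": 60, "ctrl_after": 62, "int_before": 58, "int_after": 72},
    {"ctrl_before": 55, "ctrl_after": 57, "int_before": 54, "int_after": 66},
    {"ctrl_before": 45, "ctrl_after": 46, "int_before": 44, "int_after": 53},
    {"ctrl_before": 70, "ctrl_after": 73, "int_before": 68, "int_after": 70},
]

def absolute_improvement(t):
    """Change in adherence in the intervention arm minus the change
    in the control arm, in absolute percentage points."""
    return (t["int_after"] - t["int_before"]) - (t["ctrl_after"] - t["ctrl_before"])

effects = sorted(absolute_improvement(t) for t in trials)
median_effect = statistics.median(effects)
# The median summarizes the "typical" trial, while the spread of
# effects shows how inconsistent the same strategy can be across settings.
```

The point of the median here is robustness: one trial with a 34-point gain does not drag the summary upward the way a mean would, which is why wide per-strategy ranges can coexist with a modest overall figure.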
Two interesting findings emerged from this review: Multifaceted interventions had median effect sizes that were not significantly greater than those of single-faceted ones, and interventions involving passive dissemination, such as educational materials, produced modest but consistently positive improvements. These findings represent “good news–bad news” results. On the one hand, the review did not bear out two central tenets of the field—that significant improvement requires multifaceted interventions and that passive strategies offer little chance of success. On the other hand, if single-faceted interventions and passive dissemination strategies provide modest benefits, that certainly makes things easier for organizations trying to implement particular guidelines or other changes in practice.

Exhibit 2 lists common types of QI strategies evaluated in the above reviews and comments on their effectiveness.

EXHIBIT 2
Common Quality Improvement Strategies

Provider education. Examples: conferences or printed educational materials detailing current recommendations for management of a particular condition; educational outreach visits to providers’ offices, usually targeting more specific aspects of care, such as appropriate medication choices for a target condition. Effectiveness: generally ineffective if judged on the basis of improving patient outcomes; if judged in terms of increasing provider knowledge, can be effective.

Provider reminder systems and decision support (systems for prompting health professionals to recall information relevant to a specific patient or clinical encounter; when accompanied by a recommendation, such systems are classified as clinical decision support). Examples: a sheet on the front of the chart alerting the provider to the date of the patient’s most recent mammogram and its result; a computer-generated suggestion to intensify diabetes medications based on the most recent HbA1c value. Effectiveness: reminders often effective if well integrated with workflow; decision support sometimes effective, but less so for the more complex situations in which it would be most desirable.

Audit and feedback (summary of clinical performance for an individual provider or clinic, transmitted back to the provider). Example: reports to providers or provider groups summarizing the percentages of their eligible patients who have achieved a target outcome (cholesterol below a certain value) or received a target process of care (counseling about smoking cessation), accompanied by recommended targets. Effectiveness: small to modest (at best) benefits for various forms of audit and feedback (such as report cards and benchmarking); variations in format may explain some of the observed variations in effectiveness, in addition to providers’ attitudes toward the accuracy or credibility of the reports.

Patient education. Examples: individual or group sessions with a nurse educator for patients with diabetes; medication education from a pharmacist for patients with heart failure. Effectiveness: modest to large effects for some conditions and patient populations.

Organizational change. Example: changes in the structure or organization of the health care team or setting designed to improve processes or outcomes of care. Effectiveness: mostly positive results for case management and disease management programs; mixed results for total quality management and continuous quality improvement.

Financial incentives, regulation, and policy. Examples: financial bonuses for achieving a target level of compliance with targeted processes of care; change from fee-for-service to salaried or capitated reimbursement systems. Effectiveness: some evidence for achieving target goals, but also for concerning decreases in access and conflicts of interest in physician-patient relationships.

SOURCE: Categories and definitions based on K.G. Shojania et al., Closing the Quality Gap: A Critical Analysis of Quality Improvement Strategies, Volume 1—Series Overview and Methodology; see Note 20 in text.
However, the major conclusion to draw from the literature is that general conclusions about what works are still tentative.

• Detailed evidence for two chronic illnesses: diabetes and hypertension. One explanation for the variation in effectiveness of specific QI strategies across different studies may be the nature of the QI target. To address this issue, the Agency for Healthcare Research and Quality (AHRQ) recently funded a series of systematic reviews of QI strategies and requested that the first two reviews focus on diabetes and hypertension. The reviews sought to identify interventions successful in improving patient outcomes through organizational changes or modifications to provider behavior, such as those listed in Exhibit 2.20 Interventions directed at patients were included as long as they also involved at least some strategy for changing provider behavior or achieving organizational change. The diabetes review identified sixty-six trials evaluating an intervention that targeted providers or organizations and reported improvements in processes or outcomes of care. The hypertension review included eighty-two trials.21

Exhibit 3 compares the effectiveness of specific QI strategies across the two target conditions. Two main findings emerge from this exhibit. First, a given strategy may work for diabetes but not for hypertension, which emphasizes that the effectiveness of a particular approach to quality improvement depends at least partly on the clinical context and almost certainly on other contextual factors (such as the beliefs and attitudes of providers and organizational features) that have received little study. Second, for diabetes, everything seems to work.
This observation likely reflects two factors: (1) publication bias, such that positive studies have a much larger chance of being published (an explanation borne out in the detailed analysis of the diabetes review); and (2) confounding of the benefit of any given QI strategy by the presence of multiple co-interventions. To address these issues, a more sophisticated regression analysis adjusted for variations in study quality (for example, study size and design) and took into account the presence of multiple co-interventions in the vast majority of studies. In this analysis, only three QI strategies emerged as significantly more beneficial than other interventions, and one of these, disease or case management, showed a similar result in the hypertension review. Since other reviews of the disease management literature have reported generally positive results, we discuss this strategy in more detail below.22

EXHIBIT 3
Impacts Of Selected Quality Improvement (QI) Strategies For Two Chronic Illnesses: Diabetes And Hypertension

Type of QI strategy | Diabetes (significant improvement in glycemic control) | Hypertension (significant improvement in systolic or diastolic blood pressure)
Provider education | Yes (9 studies) | No for SBP (10 studies); no for DBP (11 studies)
Provider reminders | Yes (8 studies) | No for SBP (6 studies); no for DBP (6 studies)
Audit and feedback | Yes (5 studies) | Yes for SBP (3 studies); no for DBP (3 studies)
Patient education | Yes (18 studies) | Yes for SBP (18 studies); no for DBP (21 studies)
Disease or case management | Yes (12 studies) | Yes for SBP (4 studies); no for DBP (7 studies)
Changes to team or staffing | Yes (15 studies) | Yes for SBP (19 studies); no for DBP (22 studies)
Multifaceted interventions better than single interventions? | Yes (33 multifaceted studies, 6 single-faceted studies) | Insufficient data to compare (42 multifaceted studies, only 1 single-faceted)

SOURCES: Authors’ analysis based on K.G. Shojania et al., Closing the Quality Gap: A Critical Analysis of Quality Improvement Strategies, Volume 2—Diabetes Mellitus Care; and J.M. Walsh et al., Closing the Quality Gap: A Critical Analysis of Quality Improvement Strategies, Volume 3—Hypertension Care; see Note 21 in text.
NOTE: This exhibit compares findings from the thirty-nine trials in the diabetes review that reported impacts on glycemic control with the thirty-three trials in the hypertension review that reported average changes in blood pressure.

• Disease or case management. In our review, we defined disease management as any intervention involving coordination of diagnosis, treatment, or other aspects of ongoing management by a person or multidisciplinary team in collaboration with or supplementary to the primary care provider. Despite efforts to make our definition consistent with others in the literature, we found a number of disagreements with respect to classifying specific studies. For instance, one review of the “chronic care model” highlighted a study of primary care disease management for diabetes as exemplary.23 By contrast, a comprehensive systematic review of disease management across a variety of conditions did not include this study, nor did we in our review.
Similarly, we regarded a randomized trial published in a prominent journal as an excellent example of disease management, whereas the authors of a recent randomized trial evaluating disease management for diabetes care did not mention this article in their discussion of the literature. These discrepancies almost certainly reflect the nebulous and varied definitions of disease management, rather than deficient search strategies or heterogeneous inclusion criteria. In fact, we contacted the authors of the more recent trial to confirm that they knew of the study in question and simply did not regard it as disease management, rather than having overlooked it. Consistent with our experience, authors of systematic reviews have consistently emphasized the need for better definitions of disease management.

Therefore, it seems that there is a pill called “disease management” that produces promising results, but its active ingredients remain unclear, as do key features of its mode of delivery. Unless better clarity emerges, disease management may become a fad that disappears in the face of well-designed negative trials.24 Or, worse, it will simply be replaced when a newer, more appealing, but equally poorly understood pill appears on the market.

Where To Go From Here?

The existing QI literature differs from the rest of biomedical research, especially that informed by the paradigm of EBM, in two major respects. First, evaluations of specific interventions often fail to meet basic standards for the conduct and reporting of research. Second, and more fundamentally, the choices of particular interventions lack compelling theories predicting their success or informing specific features of their development. Methodological shortcomings in the QI research literature include basic problems with the design and analysis of the interventions and poor reporting of the results.
For instance, in our review of interventions to improve diabetes care, roughly one-third of studies omitted key data elements such as pre-intervention values for the outcome of interest and measures of variation for these outcomes. Some studies did not even report the number of patients or providers participating in the study. Even with better-designed and -reported studies, however, progress in QI research will require better understanding of the factors driving provider and organizational change. We need empirically derived models to inform the decision to select specific implementation strategies, based on clinical features of the quality target, organizational or social context, and relevant attitudes and beliefs of providers and patients. Until such theories emerge, better understanding of the problem being addressed is an essential first step. Instead of presuming that provider behavior reflects lack of knowledge, inadequate incentives, or any of the barriers listed in Exhibit 1, those interested in change need to determine which factors play the predominant role for the given QI target. If providers fail to perform some basic aspect of preventive care because they forget, amid multiple competing tasks, and not because they do not know of its importance, then a reminder system has a far greater chance of success than an educational strategy, no matter how well designed. Once an intervention has been developed, the next step should be a pilot study to confirm that it works as intended—the QI equivalent of Phase I clinical studies. Too often, interventions are immediately evaluated in a clinical trial without prior data regarding basic processes expected to mediate the target improvements.25 How frequently do providers read the audit and feedback reports sent to them? Do patients understand the self-management materials provided to them? 
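The pilot-study step described above can be illustrated with a minimal sketch. The following Python fragment is purely hypothetical: the record fields, the example data, and the resulting rates are invented for illustration and are not drawn from the studies discussed. It simply computes the kind of process measures a pilot should report, such as how often feedback reports were actually read.

```python
# Hypothetical sketch of the "QI Phase I" step: summarize whether an
# intervention's basic processes occurred before running a full trial.
# All field names and data below are invented for illustration.

def process_measures(pilot_records):
    """Return the fraction of pilot records in which each process occurred.

    Each record is a dict of booleans, e.g. 'report_read' (provider opened
    the audit-and-feedback report) and 'patient_contacted' (nurse reached
    the patient).
    """
    n = len(pilot_records)
    if n == 0:
        return {}
    return {
        "report_read_rate": sum(r["report_read"] for r in pilot_records) / n,
        "contact_success_rate": sum(r["patient_contacted"] for r in pilot_records) / n,
    }

# Example: a pilot in which feedback reports are rarely read would argue
# against proceeding to a full trial of audit and feedback as designed.
pilot = [
    {"report_read": True,  "patient_contacted": True},
    {"report_read": False, "patient_contacted": True},
    {"report_read": False, "patient_contacted": False},
    {"report_read": False, "patient_contacted": True},
]
print(process_measures(pilot))  # report_read_rate 0.25, contact_success_rate 0.75
```

In a real pilot, such rates would be compared against prespecified thresholds before committing resources to a full-scale evaluation.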
A recent well-designed evaluation of a disease management intervention for diabetics reported no improvements in glycemic control but accompanied this null result with key data, such as the low success rate for nurses trying to contact patients in the intervention group.26 Omitting such information, which is currently the norm in the literature, would be like conducting a drug trial without reporting whether the patients actually took the medication.

Concluding Remarks

We constructed the analogy of the pills at the outset to emphasize the degree to which QI research lags behind the rest of biomedical research and the EBM paradigm. Even when QI strategies undergo rigorous evaluation, the trials might as well involve "red capsules with the number 250 stamped on them," as the essential ingredients that account for their impact remain unclear and the appropriate instructions for the bottle unknown. The pill metaphor also pertains to another important feature of QI research: the quest for a wonder drug. Although the public awaits "cures" for various diseases, the medical community has long recognized that progress occurs through incremental gains, with new therapies typically providing modest reductions (on the order of 10–20 percent) in the relative risk of adverse outcomes. In the quality arena, however, even the medical community expects miracle cures, and one often finds apothecaries peddling pills that promise cures for all that ails us—elimination of medication errors, universal adherence to key process measures, and dramatically improved patient outcomes.
The wonder pill most frequently encountered currently is in fact the "wonder clinical information system," despite the often glaring discrepancy between the promise of systems in showrooms and the way they perform in the real world.27

As QI research becomes more rigorous, with greater attention to understanding why particular interventions work and the factors that augment or interfere with their success in different settings, we believe that a number of strategies will prove effective at promoting evidence-based care. As in the rest of health care, however, these effects will generally be modest. Unless we adjust our expectations, the continued quest for dramatic cures will result in missed opportunities to make consistent, incremental improvements in care.

Jeremy Grimshaw holds a Canada Research Chair in Health Knowledge Transfer and Uptake.

NOTES
1. E.A. McGlynn et al., "The Quality of Health Care Delivered to Adults in the United States," New England Journal of Medicine 348, no. 26 (2003): 2635–2645.
2. D.L. Sackett et al., "Evidence Based Medicine: What It Is and What It Isn't" (Editorial), British Medical Journal 312, no. 7023 (1996): 71–72.
3. R. Grol and J. Grimshaw, "Evidence-based Implementation of Evidence-based Medicine," Joint Commission Journal on Quality Improvement 25, no. 10 (1999): 503–513.
4. C.D. Naylor, "Putting Evidence into Practice," American Journal of Medicine 113, no. 2 (2002): 161–163.
5. M.D. Cabana et al., "Why Don't Physicians Follow Clinical Practice Guidelines? A Framework for Improvement," Journal of the American Medical Association 282, no. 15 (1999): 1458–1465.
6. D. Blumenthal and C.M. Kilo, "A Report Card on Continuous Quality Improvement," Milbank Quarterly 76, no. 4 (1998): 625–648.
7. L. Locock, "Healthcare Redesign: Meaning, Origins, and Application," Quality and Safety in Health Care 12, no. 1 (2003): 53–57.
8.
Several recent, well-designed trials have shown no benefit for computerized clinical decision support. For computerized physician order entry, recent "failures" have included dramatic system shutdowns and less dramatic failures in which the majority of orders remain handwritten. Citations for specific examples are available on request from the author; send e-mail to kshojania@ohri.ca.
9. L.A. Bero et al., "Closing the Gap between Research and Practice: An Overview of Systematic Reviews of Interventions to Promote the Implementation of Research Findings—The Cochrane Effective Practice and Organization of Care Review Group," British Medical Journal 317, no. 7156 (1998): 465–468.
10. R. Grol and M. Wensing, "What Drives Change? Barriers to and Incentives for Achieving Evidence-based Practice," Medical Journal of Australia 180, no. 6 Supp. (2004): S57–S60.
11. S.M. Shortell et al., "Assessing the Impact of Continuous Quality Improvement on Clinical Practice: What It Will Take to Accelerate Progress," Milbank Quarterly 76, no. 4 (1998): 593–624.
12. M. Eccles et al., "Research Designs for Studies Evaluating the Effectiveness of Change and Improvement Strategies," Quality and Safety in Health Care 12, no. 1 (2003): 47–52.
13. L. Matowe et al., "Effects of Mailed Dissemination of the Royal College of Radiologists' Guidelines on General Practitioner Referrals for Radiography: A Time Series Analysis," Clinical Radiology 57, no. 7 (2002): 575–578.
14. S.D. Pearson et al., "Critical Pathways Intervention to Reduce Length of Hospital Stay," American Journal of Medicine 110, no. 3 (2001): 175–180.
15. L.A. Bero and A.R. Jadad, "How Consumers and Policymakers Can Use Systematic Reviews for Decision Making," Annals of Internal Medicine 127, no. 1 (1997): 37–42.
16. D. Rotstein and A. Laupacis, "Differences between Systematic Reviews and Health Technology Assessments: A Trade-off between the Ideals of Scientific Rigor and the Realities of Policy Making," International Journal of Technology Assessment in Health Care 20, no. 2 (2004): 177–183.
17. A.D. Oxman et al., "No Magic Bullets: A Systematic Review of 102 Trials of Interventions to Improve Professional Practice," Canadian Medical Association Journal 153, no. 10 (1995): 1423–1431.
18. J.M. Grimshaw et al., "Changing Provider Behavior: An Overview of Systematic Reviews of Interventions," Medical Care 39, no. 8, Supp. 2 (2001): II2–II45.
19. J.M. Grimshaw et al., "Effectiveness and Efficiency of Guideline Dissemination and Implementation Strategies," Health Technology Assessment 8, no. 6 (2004): 1–84.
20. K.G. Shojania et al., Closing the Quality Gap: A Critical Analysis of Quality Improvement Strategies, Volume 1—Series Overview and Methodology, 2004, www.ahrq.gov/downloads/pub/evidence/pdf/qualgap1/front.pdf (10 November 2004).
21. K.G. Shojania et al., Closing the Quality Gap: A Critical Analysis of Quality Improvement Strategies, Volume 2—Diabetes Mellitus Care, September 2004, www.ahrq.gov/downloads/pub/evidence/pdf/qualgap2/qualgap2.pdf (29 November 2004); and J.M. Walsh et al., Closing the Quality Gap: A Critical Analysis of Quality Improvement Strategies, Volume 3—Hypertension Care (forthcoming).
22. S.R. Weingarten et al., "Interventions Used in Disease Management Programmes for Patients with Chronic Illness—Which Ones Work? Meta-Analysis of Published Reports," British Medical Journal 325, no. 7370 (2002): 925.
23. See K.G. Shojania and J.M. Grimshaw, "Still No Magic Bullets: Pursuing More Rigorous Research in Quality Improvement," American Journal of Medicine 116, no. 11 (2004): 778–780.
24. S.L. Krein et al., "Case Management for Patients with Poorly Controlled Diabetes: A Randomized Trial," American Journal of Medicine 116, no. 11 (2004): 732–739; and R.F. DeBusk et al., "Care Management for Low-Risk Patients with Heart Failure: A Randomized Controlled Trial," Annals of Internal Medicine 141, no. 8 (2004): 606–613.
25. M.E. Hulscher et al., "Process Evaluation on Quality Improvement Interventions," Quality and Safety in Health Care 12, no. 1 (2003): 40–46.
26. Krein et al., "Case Management for Patients with Poorly Controlled Diabetes."
27. J.D. Kleinke, "Release 0.0: Clinical Information Technology in the Real World," Health Affairs 17, no. 6 (1998): 23–38.