Assessing the quality and appropriateness of care

Case reviews are frequently carried out for quality assurance, negligence claims,
investigations of adverse outcomes, and disciplinary proceedings. Despite the
popularity of this process, it has severe limitations. Reviewing cases requires an
understanding of the science of assessing the quality of care.
First we need to define quality of care; the most comprehensive definition is that of
Donabedian1:
‘Quality of care is a conceptual entity represented by the entire continuum from
process to outcome and not by either one independently’2
Ideally, assessment of quality of care is based on an audit of the outcomes in a group
practice. These may be compared with an expected standard based on the literature or on
comparable data from other groups. In either case, since case mix (the type of patient
seen) will vary, outcome data should be risk-adjusted to allow for this. Since the
outcome may be just as much a function of risk factors such as age and comorbidity (the
presence of other medical conditions, see earlier discussion) as of process (the care
provided), it is important to determine which factors are responsible for the observed
outcomes, and which are modifiable.
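
To make risk adjustment concrete, below is a minimal sketch of indirect standardization
(observed/expected ratios) on synthetic data. It is illustrative only: the covariates,
coefficients, and practice groups are hypothetical and not drawn from any study cited here.

```python
# Minimal sketch of risk adjustment by indirect standardization.
# All data and coefficients are synthetic and purely illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000
age = rng.integers(40, 90, n)          # years
comorbidities = rng.poisson(1.5, n)    # count of other conditions
practice = rng.integers(0, 4, n)       # which group practice treated the patient

# Synthetic adverse outcomes driven by case mix, not by the practice itself.
p_true = 1 / (1 + np.exp(-(-6.0 + 0.05 * age + 0.4 * comorbidities)))
outcome = rng.random(n) < p_true

# Fit a risk model on case-mix variables only (practice identity excluded).
X = np.column_stack([age, comorbidities])
expected = LogisticRegression().fit(X, outcome).predict_proba(X)[:, 1]

# Observed/expected ratio per practice: values near 1.0 suggest outcomes
# are explained by case mix rather than by differences in process.
for g in range(4):
    mask = practice == g
    print(f"practice {g}: O/E = {outcome[mask].sum() / expected[mask].sum():.2f}")
```

An outlying O/E ratio would then prompt scrutiny of process, bearing in mind that
residual confounding can survive even a well-specified risk model.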
Alternatively, risk-adjusted outcomes may be examined within a group practice to see if
there are obvious outliers, from which lessons may be learnt and practice modified.
Obviously this applies equally to exceptionally bad and exceptionally good outcomes.
Finally, subsets of patients may be examined to see if there are certain groups of
patients who appear to benefit more or less, which may lead to modifications in practice.
In contrast, negligence or malpractice claims and disciplinary proceedings work
backwards from known adverse outcomes to examine the process for the appropriateness of
the care delivered, an approach that is much more susceptible to bias.
Limitations
Even when trained reviewers are used, assessments of the appropriateness or quality of
care show consistently low reliability3,4,5,6,7,8,9,10,11. Agreement between reviewers
examining the same cases can be expressed as inter-rater reliability coefficients or kappa
scores. While these are often at acceptable levels for agreement about specific
outcomes, reliability is much lower for process.12,13,14,15,16,17 However, reliability is
improved where there are accepted evidence-based guidelines (Hofer 2004, Naylor 1998).
(For further discussion see Measurement: Outcomes or Process? below.)
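
As a worked illustration of why raw agreement can overstate reliability, the following
short calculation of Cohen's kappa uses a hypothetical two-reviewer contingency table;
the counts are invented for illustration only.

```python
# Cohen's kappa for two reviewers rating the same cases as
# acceptable/substandard. Counts are hypothetical.
import numpy as np

# Rows = reviewer A, columns = reviewer B.
table = np.array([[70.0, 10.0],
                  [15.0,  5.0]])
n = table.sum()

p_observed = np.trace(table) / n  # raw proportion of agreement
p_chance = (table.sum(axis=1) * table.sum(axis=0)).sum() / n**2
kappa = (p_observed - p_chance) / (1 - p_chance)

print(f"agreement = {p_observed:.2f}, chance = {p_chance:.2f}, kappa = {kappa:.2f}")
# Raw agreement looks respectable (0.75), but kappa is only about 0.14:
# most of the agreement is what chance alone would produce.
```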
In a recent review, Naylor18 critiques the reliance placed on profiles of clinical
performance, driven by a ‘watchword’ of greater assessment and accountability in health
care.19 As he states, performance measurement remains ‘a very inexact science’ and
‘clinical outcomes are relatively insensitive to process-of-care problems’. He
demonstrates the limitations of even risk-adjusted data due to confounding by multiple
factors, and that without a ‘level playing field’, ‘unfair comparison of disparate patient
populations’ produces misleading data and impressions. He stresses that any assessment of
clinical performance must take into account four critical elements: process of care,
clinical outcome, patient perception, and patient satisfaction. Finally, he emphasizes
the importance of considering the impact of performance profiles in terms of damaged
‘professional and institutional reputations and morale’. Theoretically, there are also
potential harms to patients, particularly if such assessments lead to excessive risk
aversion.
Outcome Bias and Related Factors
Of even more concern is the effect of knowledge of outcomes in biasing assessment of
the level of care.20 Outcome bias is the relationship between the known severity of an
outcome and the degree of harshness with which people judge the alleged perpetrators of
that outcome. The desire to allocate blame, and the tendency to render it in proportion
to the outcome, are natural human fallibilities; in recent years they have been driven by
the defensiveness resulting from the escalating application of tort law21 and dramatic
press coverage of the estimated incidence and prevalence of medical error.22,23 This
represents a significant retreat from an earlier position that “to seek absolute safety
is to advocate therapeutic nihilism”.24 It has been suggested that this fulfils a need to
annul wrong and lessen hurt (Merry and McCall Smith 2003).
Knowledge of outcomes changes both the harshness of implicit judgement and the
willingness to render judgement. Furthermore, the specific nature of the outcome will
influence assessments. When trained reviewers are given the same cases but minor adverse
outcomes are altered to more severe ones, assessments of the quality of care change
sharply25,26. Caplan et al. found that the proportion of cases assessed as inappropriate
by reviewers increased by 14 percentage points (from 19% to 33% of the total), and
judgements of appropriate care decreased by 31 percentage points (from 67% to 36% of the
total, almost halving), and in some cases by as much as 61%, after the outcomes given to
the reviewers were changed for the same description of the level of care.27 This is
consistent with the observation that judgement of care is highly dependent on context and
‘hindsight knowledge’ as opposed to the actual care provided; this has been a consistent
finding28,29,30,31,32 in many fields of human behaviour, and has a sound basis in
psychology.33,34,35
At best, retrospective chart reviews remain a ‘blunt instrument’36 with a built-in
systematic bias towards negativity. Process and outcome are at best loosely linked in
complex and dynamic settings. Not all bad process results in adverse outcomes, and bad
outcomes ‘occur despite good decisions, [including]…complete and thorough
consideration of all of the available information, goals, and contributing factors’
(Edwards 1984). Even in the investigation of adverse outcomes, an adaptive learning
framework demonstrates that the key ‘is not to focus on where people went wrong but to
understand how their decisions and actions made sense at the time’ (Henriksen 2003).
Henriksen also points out how this pervasive bias leads processes such as morbidity and
mortality conferences (see above) to mete out ‘a fair share of shame and blame’,
associated with ‘delusional clarity and important lessons unlearned’. This can be
further compounded by another shortcoming, that of attribution error37:
“Rather than giving careful consideration to the full range of situational,
organizational, and system related factors that are present when adverse events
happen, humans serving in a supervisory or evaluative capacity tend to make
dispositional attributions and view the mishap as evidence of an inherent
character flaw or defect on the part of the individual involved”
Other psychological factors operating under these circumstances include the strong
sense of personal responsibility instilled by medical hierarchies, and notions of moral
meaning and just reciprocation, in which consequences and causation of comparable
magnitude are sought. In a competitive medical culture, personal responsibility runs deep
and successful outcomes are more highly valued than correct decision processes.
Related to outcome bias is the concept of hindsight bias38, in which people exaggerate the
extent to which individuals can predict the consequences of their actions.39,40,41
Alternatively, those with prior knowledge of the outcome believe falsely that they would
have predicted such an outcome. As Arkes and colleagues point out, the natural human
tendency is to try to make sense of a train of events rather than analyze the data
independently or reconstruct the events in context.42 Physicians are highly susceptible
to bias in their judgements, and hindsight bias is a compelling influence among them,
leading to overconfidence in their own abilities and an inadequate appreciation of the
difficulties in making the original decision. Hindsight bias works through ‘creeping
determinism’, whereby outcome information is unconsciously integrated into knowledge
of prior events.43 This is reinforced by secondary gain, whereby the assessor appears
more knowledgeable and intelligent, and maintains esteem by being intolerant of
ambiguity.
Often forgotten in assessing quality of care is the ‘mirror image’ problem. Driven by an
over-emphasis on, and misunderstanding of, the Hippocratic dictum of ‘do no harm’, the
harms of not acting are overlooked in the equation.44 The risks of all feasible
alternatives must be considered in aiding patient decision making.
Memory
Another major limitation in evaluative processes such as this is the reliance on memory.
Henriksen and others45,46 point out that human memory is not only frail (limitations in
the reproductive function of memory) but subject to systematic bias, so-called
reconstructive memory47. In this process, recalled fragments are ordered and filled in
with constructions that appear to make sense. Two factors conducive to systematic
reconstruction are the passage of time and reliance on those not directly involved. Other
factors include the way questions are framed48, assimilation to the familiar, and outcome
bias.
Adaptive Learning
When these factors are placed in the broader context of adaptive learning49, we find
evidence of motivational factors such as the desire to maintain public esteem or to
maintain more orderly control over one's environment.
Incorporating Patient Preferences
In defining quality of care, Brook50 identifies two components that, as he says, ‘are
important to people’: care of high technical quality, and the recognition that ‘all
patients wish to be treated in a humane and culturally appropriate manner and be invited
to participate fully in deciding about their therapy’. Put simply, individuals’ value
systems shape their choices.
Naylor asks ‘What is appropriate care?’51, noting increased public concern about
accountability in the face of shrinking resources, and a seemingly endless series of
studies revealing wide, inexplicable variations in what physicians do and how they do it.
However, he notes that the biggest problem is defining and measuring quality of care. It
has been said that the cornerstone of good care is sound clinical judgement, but the
difficulty is in judging clinical judgement. Naylor offers this definition: ‘an ability to
determine what interventions (including calculated inactivity) will lead consistently to net
benefits for individual patients.’ While this might seem straightforward to some, even the
most rigorous attempts to evaluate it give rise to concern; for instance, different panels
of experts rating the same elective hysterectomies varied by as much as 100% in the
proportion they deemed inappropriate.52 These sorts of studies are widespread53. In
pondering these failures of agreement, he points out that a major failing of chart reviews
and other forms of review is that they ignore patients’ preferences:
“Surely the key to more appropriate utilization is much greater patient
involvement in decision making, not post hoc audits based on a few clinicians’
judgments”.
Acknowledging that there are many ‘grey zone’ areas of clinical practice (Naylor 1995),
he attempted to answer his original question:
“What is appropriate care? The answer will often be that it depends. It depends on
which clinicians are asked, where they live and work, what weight is given to
different types of evidence and end points, whether one considers the preferences
of patients and families, the level of resources in a given health system, and the
prevailing values of both the system and the society in which it operates… [Some]
will undoubtedly dislike this ambiguity. Clinicians, however, may welcome it as
affirmation that the art of medicine is unlikely to be managed away for many years
to come.”
In an earlier paper (Naylor 1995) he characterised clinical medicine as ‘a few things we
know, a few things we think we know (but probably don’t), and lots of things we don’t
know at all’. He notes that although lack of evidence characterises much of what we do,
‘paralytic indecisiveness’ is surprisingly uncommon, because we have become adept at
making decisions under uncertainty and at confusing ‘personal opinion with evidence, or
personal ignorance with genuine scientific uncertainty’. What is black and white in the
abstract rapidly becomes grey when applied to meeting individual patients’ needs.
This has been discussed in a separate document, Clinical Judgement: Decision Making
Preferences. Put simply, quality or appropriateness of care cannot be assessed in
isolation from the individual patient and their family.
Measurement: Outcomes or Process?
Hofer et al. (2000) define the outcomes of interest as being at three levels: preventable
morbidity, mortality, and patient satisfaction.
Essentially missing in most procedures such as the one discussed in this document is
attention to overall, as opposed to individual, outcomes from which observers work
backwards. As Brook and colleagues (2000) state:
“Process measures are only as good as the evidence that associates them with
improved outcomes. Process assessments produce the harshest judgment of the
quality of care.”
In one of the original studies by this RAND group, only 4 of 300 patient records
examined were deemed to reflect appropriate care, yet in only 25% of cases could it be
determined whether the outcome could have been improved by changes to the process,
despite 98% of patients having received ‘inappropriate’ care54. They also note that, for
many proposed measures, it is extraordinarily difficult to determine what actually
transpired. Just because something, such as a discussion, is not documented in a chart
does not mean it did not take place. Assessments of process must therefore be reliably
linked to events, desirable or undesirable (Brook 1973).
In addition, they question the equity and fairness of quality assessments that compare
providers in the absence of well-documented reliability and validity. They stress the
importance of patient input to any assessment. Quality assessment remains largely a
research tool.
In particular, they noted that despite increasing emphasis on meeting patients’
expectations and on patient satisfaction as outcomes, chart reviews are unlikely to
reflect this component.
Summary
Lack of epidemiological robustness and rigour, as in most of the error literature (Hofer
2000), severely hampers assessment of the quality and appropriateness of care. There are
wide variations in how physicians practise, without necessarily adversely affecting
outcomes. Patients’ own values and preferences strongly influence both the process and
outcome of care, and physicians consistently fail either to estimate these accurately or
to take them into account. Judgements on care are subject to severe bias, reflecting
wisdom after the event and individual prejudices.
Understanding the problems inherent in assessing quality of care, including its
limitations, lack of reliability and validity, and inherent negative bias, is an
essential first step in making any judgement as to the quality of care delivered.
Dr Michael Goodyear,
Department of Medicine,
Dalhousie University,
Halifax, Nova Scotia, Canada.
March 2005.
References
1. Donabedian A. Quality assessment and assurance: unity of purpose, diversity of means. Inquiry 1988;25:173-92.
2. Hofer TP, Kerr EA, Hayward RA. What is an error? Effective Clinical Practice, November 2000.
3. Koran LM. The reliability of clinical methods, data and judgments (first of two parts). N Engl J Med 1975;293(13):642-6.
4. Koran LM. The reliability of clinical methods, data and judgments (second of two parts). N Engl J Med 1975;293(14):695-701.
5. Dubois RW, Brook RH. Preventable deaths: who, how often, and why? Ann Intern Med 1988;109(7):582-9.
6. Brennan TA, Localio RJ, Laird NL. Reliability and validity of judgments concerning adverse events suffered by hospitalized patients. Med Care 1989;27(12):1148-58.
7. Thomas EJ, Lipsitz SR, Studdert DM, Brennan TA. The reliability of medical record review for estimating adverse event rates. Ann Intern Med 2002;136(11):812-6. Erratum in: Ann Intern Med 2002;137(2):147.
8. Localio AR, Weaver SL, Landis JR, Lawthers AG, Brennan TA, Hebert L, Sharp TJ. Identifying adverse events caused by medical care: degree of physician agreement in a retrospective chart review. Ann Intern Med 1996;125(6):457-64.
9. Hayward RA, McMahon LF Jr, Bernard AM. Evaluating the care of general medicine inpatients: how good is implicit review? Ann Intern Med 1993;118(7):550-6.
10. Hofer TP, Asch SM, Hayward RA, Rubenstein LV, Hogan MM, Adams J, Kerr EA. Profiling quality of care: is there a role for peer review? BMC Health Serv Res 2004;4(1):9.
11. Goldman RL. The reliability of peer assessments of quality of care. JAMA 1992;267:958-60.
12. Brook RH, Appel FA. Quality-of-care assessment: choosing a method for peer review. N Engl J Med 1973;288(25):1323-9.
13. Nobrega FT, Morrow GW, Smoldt RK, Offord KP. Quality assessment in hypertension: analysis of process and outcome methods. N Engl J Med 1977;296(3):145-8.
14. Fessel WJ, Van Brunt EE. Assessing quality of care from the medical record. N Engl J Med 1972;286(3):134-8.
15. Brook RH. Quality: can we measure it? N Engl J Med 1977;296(3):170-2.
16. Schroeder SA, Kabcenell AI. Do bad outcomes mean substandard care? JAMA 1991;265(15):1995.
17. Smith MA, Atherly AJ, Kane RL, Pacala JT. Peer review of the quality of care: reliability and sources of variability for outcome and process assessments. JAMA 1997;278:1573-8.
18. Naylor CD. Public profiling of clinical performance. JAMA 2002;287(10):1323-5.
19. Relman AS. Assessment and accountability: the third revolution in medical care. N Engl J Med 1988;319:1220-2.
20. Henriksen K, Kaplan H. Hindsight bias, outcome knowledge and adaptive learning. Qual Saf Health Care 2003;12(Suppl 2):ii46-50.
21. Richards C. Who's to blame for bad luck? Chicago Sun-Times, July 9, 2003:51.
22. CNN. Physicians divided: report of medical-error deaths on target or exaggerated? CNN.com, July 5, 2000.
23. Wiley D. Mistakes that kill. Macleans.ca, August 13, 2001.
24. Mills DH. Medical insurance feasibility study: a technical study. West J Med 1978;128:360-5.
25. Merry A, McCall Smith A. Errors, Medicine, and the Law. Cambridge University Press, 2003. pp 162-4, 194-8.
26. LaBine SJ, LaBine G. Determination of negligence and the hindsight bias. Law Hum Behav 1996;20:501-16.
27. Caplan RA, Posner KL, Cheney FW. Effect of outcome on physician judgments of appropriateness of care. JAMA 1991;265(15):1957-60.
28. Brennan TA, Leape LL, Laird NM, Hebert L, Localio AR, Lawthers AG, Newhouse JP, Weiler PC, Hiatt HH. Incidence of adverse events and negligence in hospitalized patients. Results of the Harvard Medical Practice Study I. N Engl J Med 1991;324(6):370-6.
29. Brennan TA, Sox CM, Burstin HR. Relation between negligent adverse events and the outcomes of medical-malpractice litigation. N Engl J Med 1996;335:1963-7.
30. Berlin L. Malpractice issues in radiology: outcome bias. AJR 2004;183:557-60.
31. Morris JA Jr, Carillo Y, Jenkins JM, et al. Surgical adverse events, risk management, and malpractice outcome: morbidity and mortality review is not enough. Ann Surg 2003;237:844-52.
32. Blendon RJ, DesRoches CM, Brodie M, et al. Views of practising physicians and the public on medical errors. N Engl J Med 2002;347:1933-40.
33. Hoch SJ, Loewenstein GF. Outcome feedback: hindsight and information. J Exp Psychol 1989;15:605-19.
34. Reason J. Human Error. Cambridge University Press, 1990.
35. Edwards W. How to make good decisions. Acta Psychol 1984;56:5-27.
36. Neale G, Woloshynowych M. Retrospective case record review: a blunt instrument that needs sharpening. Qual Saf Health Care 2003;12(1):2-3.
37. Nisbett R, Ross L. Human Inference: Strategies and Shortcomings of Social Judgment. Prentice-Hall, 1980.
38. Berlin L. Malpractice issues in radiology: hindsight bias. AJR 2000;175:597-601.
39. Baron J, Hershey JC. Outcome bias in decision evaluation. J Personality Soc Psychol 1988;54:569-79.
40. Russo JE, Schoemaker PJH. Decision Traps: The Ten Barriers to Brilliant Decision-Making and How to Overcome Them. Simon & Schuster, 1989.
41. Hawkins SA, Hastie R. Hindsight: biased judgements of past events after the outcomes are known. Psychol Bull 1990;107:311-27.
42. Arkes HR, Wortmann RL, Saville PD, Harkness AR. Hindsight bias among physicians weighing the likelihood of diagnoses. J Appl Psychol 1981;66:252-4.
43. Renfrew DI, Franken EA, Berbaum KS, Weigelt FH, Abu-Yousef MM. Error in radiology: classification and lessons in 182 cases presented at a problem case conference. Radiology 1992;183:145-50.
44. Bogardus ST, Holmboe E, Jekel JF. Perils, pitfalls and possibilities in talking about medical risk. JAMA 1999;281(11):1037-41.
45. Redelmeier DA, Schull MJ, Hux JE, et al. Problems for clinical judgement: 1. Eliciting an insightful history of present illness. CMAJ 2001;164(5):647-51.
46. Redelmeier DA, Schull MJ, Hux JE, et al. Problems for clinical judgement: 2. Obtaining a reliable past medical history. CMAJ 2001;164(6):809-13.
47. Bartlett FC. Remembering: A Study in Experimental and Social Psychology. Cambridge University Press, 1932.
48. Loftus EF. The eyewitness on trial. Trial, October 1980:31-3.
49. Campbell JD, Tesser A. Motivational interpretations of hindsight bias: an individual difference analysis. J Personality 1983;51:605-20.
50. Brook RH, McGlynn EA, Shekelle PG. Defining and measuring quality of care: a perspective from US researchers. Int J Qual Health Care 2000;12(4):281-95.
51. Naylor CD. What is appropriate care? N Engl J Med 1998;338(26):1918-20.
52. Shekelle PG, Kahan JP, Bernstein SJ, Leape LL, Kamberg CJ, Park RE. The reproducibility of a method to identify the overuse and underuse of medical procedures. N Engl J Med 1998;338:1888-95.
53. Naylor CD. Grey zones of clinical practice: some limits to evidence-based medicine. Lancet 1995;345:840-2.
54. Brook RH, McGlynn EA, Cleary PD. Quality of health care. Part 2: measuring quality of care. N Engl J Med 1996;335:966-70.