Sam G Campbell MB BCh, FCFP(EM), Dip PEC(SA), FCCHL.
Chief, Department of Emergency Medicine, Charles V Keating Emergency and Trauma Centre
Professor of Emergency Medicine, Dalhousie University, Halifax, Nova Scotia.

CFPC CoI Templates: Slide 1
Faculty: Sam Campbell
Relationships with commercial interests:
◦ Grants/Research Support: Shire, NSHA, Boehringer-Ingelheim.
◦ Speakers Bureau/Honoraria: Boehringer-Ingelheim.
◦ Other: Employee, PraxES Medical Group

CFPC CoI Templates: Slide 2
This program has received no financial or in-kind support from anyone.
Potential for conflict(s) of interest:
◦ Sam Campbell has received no payment/funding from any organization whose product(s) are being discussed in this program.

CFPC CoI Templates: Slide 3
Active involvement in Choosing Wisely Canada
Dal Critical thinking group

Remind people about Choosing Wisely
Basic concept of testing
◦ Why/How tests lie
◦ How should we use them?
Clinical Context/Bayesian approach

P.H., 54 yr old
‘Check up’
CBC, ‘lytes, BUN/Creat, LFTs, Lipids, TSH, Fe, PSA, Vit B12, folate, Vit D.
Transaminases mildly elevated
Repeat in a month (still up)
Hepatitis serology, ANA, Abd US.
3.5 cm lesion in rt kidney (?angiomyolipoma)
CT – confirms AML
Can Fam Phys 2015;61:535-7.

Lab:
US and CT:
Missed work
Anxiety ++
Reassurance???
Can Fam Phys 2015;61:535-7.

L.A.W. Canadian Journal of Diagnosis (in press)

What we do is not benign …
What we ask may not be either …

Choosing Wisely: a campaign to help physicians and patients engage in conversations about unnecessary tests, treatments and procedures, and to help physicians and patients make smart and effective choices to ensure high-quality care.
Lists of interventions of questionable value from different specialist organizations.

[Figure legend: Germany, U.K., Canada, Japan, U.S.]
[Figure x-axis: 1970, 1980, 1990, 2000, 2008, 2011. Source: OECD, 2013]

IOM: 30% of health care spending is wasteful, with no added value to patient care
Inappropriate testing
◦ > 50% of prescriptions for respiratory infections
◦ 28 - 65% of lumbar spine MRIs inappropriate
◦ 9 - 16% of head scans for headache
◦ Bone density scans, Vit D levels, pre-operative tests …

Hunters vs. Fishermen:

This is a simplistic preliminary discussion of a complicated issue

Doctors,

They are just tools, each designed for a purpose
Test variability may be related to:
◦ the test
◦ the interpreter
◦ duration of symptoms/stage of the illness
◦ lab equipment, reagents, procedure, or even lab error.
Test results should never be
◦ accepted at face value
◦ interpreted without considering pre-test (clinical) probability of disease.

Diagnostic tests are used:
◦ To help establish diagnoses
◦ Culture: ‘more is better’
◦ To relieve pressure from patients/family (Cyberchondria)
◦ To delay making a decision (entertain the patient while we wait for something to declare itself)
◦ Consultant expectations
◦ To save time explaining/examining
◦ To perpetuate the myth of medical clarity
◦ “Routine” screening, just because it is what we do!

‘I’m going to order this test because I don’t have time to tell you why you don’t need it…’

Diagnosis
◦ Rule In vs. Rule Out
◦ Treatment Threshold

Out of sight – ethereal/magical
Measured once and rarely challenged
Presumption that the result will help your patient
Presumption of benefit exceeding risk
◦ Phlebotomy risks
◦ Risk of false results
◦ Waste of time/money
◦ Misinformation/misinterpretation

Tests don't "make" diagnoses; they supplement clinical judgement and reduce the level of diagnostic uncertainty.
Unless applied and interpreted carefully, tests can be misleading.

The premise of diagnostic testing is that there are 2 populations of people
◦ those with the disease
◦ those without
... who differ on at least one testable parameter.

Almost all tests lie!
Most tests can be ‘positive’ for several reasons
Not everyone with (for example) pneumonia has an infiltrate on x-ray, and not everyone with an infiltrate has pneumonia.
Patient variability and test variability result in an overlap between the results for diseased and normal populations for virtually all tests

Most objective tests assess a measurable parameter and classify the patient as "normal" or "abnormal."
"Normal" is typically established by determining test values in disease-free people and identifying the range in which 95% of this population lies.
There is variability in the normal and in the diseased population, and overlap between the two groups. Some levels are therefore compatible with either health or disease.

Imagine a test that screens people for a disease.
◦ Each person taking the test either has or does not have the disease.
◦ The test outcome can be positive (predicting that the person has the disease) or negative (predicting that the person does not have the disease).
◦ The test results for each subject may or may not match the subject's actual status, i.e. the test may lie

True positive: Sick people correctly diagnosed as sick
False positive: Healthy people incorrectly identified as sick
True negative: Healthy people correctly identified as healthy
False negative: Sick people incorrectly identified as healthy

Each test will have its own strengths and weaknesses, and we can describe these.
Sensitivity: the ability to recognize the thing being tested for when it is present. A negative result on a highly sensitive test helps rule it out.
Specificity: precise; if the test says the quality is present, then it is. A positive result on a highly specific test helps rule it in.
A perfect test would be described as 100% sensitive (i.e. predicting all people from the sick group as sick) and 100% specific (i.e. not predicting anyone from the healthy group as sick)

Highly sensitive tests don’t miss those who have a disease. The trade-off is that they will be positive in some people who don’t.
These are false positive results.
Highly specific tests won’t be positive in the absence of disease. The price? Some who have it will escape detection. These are false negatives.
Sensitivity and specificity are not independent. When you increase one, you often decrease the other.

False negatives delay diagnoses. False positives create them.
All testing is susceptible to both

Test results are categorized as:
◦ True or false positive, or true or false negative, all relative to a ‘gold standard’ (which may also be imperfect)
The gold standard is more accurate, but often too slow, expensive or invasive to do as a first-line test.

The chance that a positive result is false is not just a function of sensitivity and specificity. It also depends on the individual's actual risk of having the disease and on how common the disease itself is.

Thomas Bayes (1701–1761)

10% of patients with acute MI fail to develop ST segment changes.
20-30% of ST↑ have no MI
N Engl J Med 2003;349:2128-35

‘Screening’ ECG
He has ST elevation
Should we send him to hospital at once?

‘Monitoring’ ECG completely normal
Cancel the cath?

~80% of cases will have a high WBC
WBC is ↑ in up to 70% of patients with other causes of right lower quadrant pain
If only ‘grey zone’ cases are included, it may perform less well than clinical judgement!

Fig. 1: Hypothetical probability density distributions of measured plasma brain natriuretic peptide (BNP) levels in 2 subgroups of a study population.
Victor M. Montori et al. CMAJ 2005;173:385-390 ©2005 by Canadian Medical Association

Fig. 2: These hypothetical probability density distributions reflect a study population of middle-aged patients who all have recurrent asthma and chronic CHF. The patients whose dyspnea is caused by asthma exacerbations look clinically similar to those whos...
Victor M. Montori et al. CMAJ 2005;173:385-390

Your clinical assessment is still critical!
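The dependence on how common the disease is can be made concrete with a little arithmetic. The following is a minimal Python sketch, not part of the original slides: the function name `predictive_values` and the 90%/90% test are hypothetical, chosen only to show how the same test behaves at different pre-test probabilities.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test used at a given pre-test probability.

    All three arguments are fractions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)  # chance a positive result is true
    npv = true_neg / (true_neg + false_neg)  # chance a negative result is true
    return ppv, npv


# Hypothetical test (assumed values): 90% sensitive and 90% specific.
for prevalence in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"pre-test probability {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

With these assumed numbers, a positive result is correct only about 8% of the time when the pre-test probability is 1%, 50% of the time at 10%, and 90% of the time at 50%. The test itself has not changed; only the clinical context has.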
Likelihood ratio: the likelihood of a positive test result in a patient with the target disorder compared with that in a patient without the disorder
LR+ = sensitivity / (1 - specificity)
LR- = (1 - sensitivity) / specificity

LR+ 2-5        Small changes
LR+ 5-10       Moderate changes
LR+ >10        Large changes
LR- 0.5-0.2    Small changes
LR- 0.2-0.1    Moderate changes
LR- <0.1       Large changes
Victor M. Montori et al. CMAJ 2005;173:385-390

The ‘power’ of the test (its likelihood ratios) depends on what you thought in the first place.

Radiation/blood loss
Unnecessary intervention
Inappropriate reassurance
Confirmation bias
Cost

‘one third of health care costs could be saved without depriving any patient of beneficial care’
Howard Brody, doi:10.1056/nejmp0911423 (nejm.org)

When it doesn’t matter:
◦ Seasonal viral illness
◦ Prostate screening in >80
◦ Surgical conditions in people not fit for surgery
◦ Minor facial fractures
When pre-test probability is really low:
◦ Clinical picture
◦ Rare conditions and no risk factors
Spinning a coin to rule out malaria is a really sensitive test in Tuktoyaktuk

If you already know the answer
◦ Repeating normal screens too early
◦ OA in older patients
When risk of investigation/treatment approaches risk of illness
◦ Contrast medium
◦ Radiation
When the test can’t answer the question you need answered
◦ CT scan for cerebellar disease
◦ Lumbar/cervical spine x-ray for ‘sprains’
◦ Sinus x-rays
e.g. ‘Ottawa rules’

Evidence-based guidelines suggest that:
◦ We should tailor screening to individual patient health profiles and move to "opportunistic" screening
◦ We should screen only for conditions that cause serious illness or functional difficulties, and only when an accurate test and effective treatments are available.
http://www.cfhi-fcass.ca/publicationsandresources/Mythbusters/
Cadman D et al. JAMA 1984;251:1580-1585.

• What will I do if the result is +ve? -ve?
• Will it improve the management of my patient?
• What is the benefit related to the cost?
• ‘I’ll just do all of the tests and see what you might have’
Or,
• ‘I’m going to do a test to supplement my clinical impression’

Discussion:
26 yr old female
Dysuria, frequency, suprapubic discomfort
Afebrile, no back pain, N/V.
Has had previous UTIs – pretty much the same.

Our options:
◦ Urine dip?
◦ Microscopy?
◦ Culture?
◦ Empiric treatment?
◦ Treat only if positive test?

Four symptoms and 1 sign increased the probability of UTI:
◦ dysuria LR, 1.5
◦ frequency LR, 1.8
◦ hematuria LR, 2.0
◦ back pain LR, 1.6
◦ costovertebral angle tenderness LR, 1.7
Four symptoms and 1 sign decreased the probability of UTI:
◦ absence of dysuria, negative LR (NLR), 0.5
◦ absence of back pain NLR, 0.8
◦ history of vaginal discharge NLR, 0.3
◦ history of vaginal irritation NLR, 0.2
◦ vaginal discharge on examination NLR, 0.7
JAMA. 2002 May 22-29;287(20):2701-10.

The 2 most powerful signs/symptoms are a history of vaginal discharge and a history of vaginal irritation
◦ Both lower the probability of UTI when present (LRs, 0.3 and 0.2, respectively).
Using combinations of symptoms:
◦ LR 24.6 for the combination of dysuria and frequency but no vaginal discharge or irritation.
◦ In patients with recurrent UTI, one study found that self-diagnosis significantly increased the probability of UTI (LR, 4.0).
JAMA. 2002 May 22-29;287(20):2701-10.

Reasonable to rule in infection, but not better than clinical judgement.
Not good enough to rule it out.
◦ 57-96% sensitive and 94-98% specific for identifying pyuria
Emerg Med J. 2003 Jul;20(4):362-3.
Am J Med. 2002 Jul 8;113 Suppl 1A:20S-28S.
Ann Emerg Med. 1989 May;18(5):560-3.

◦ In women who present with >1 symptoms of UTI, the probability of infection is ~50%
◦ Physical exam and tests are not able to lower the post-test probability to a level where a UTI can be ruled out
◦ Specific combinations of symptoms raise the probability to >90%, effectively ruling in the diagnosis based on history alone.
JAMA. 2002 May 22-29;287(20):2701-10.
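The step from a ~50% pre-test probability to ">90%" can be checked by converting probability to odds, multiplying by the likelihood ratio, and converting back. This is a minimal Python sketch of that standard calculation, not something taken from the slides: the function name `post_test_probability` is hypothetical, and the inputs are the pre-test probability and LRs quoted above.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Apply a likelihood ratio to a pre-test probability (both as plain numbers)."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


PRE_TEST = 0.50  # ~50% in women presenting with symptoms of UTI (from the slide)

# LRs quoted on the slides (JAMA 2002;287:2701-10).
findings = {
    "dysuria + frequency, no vaginal discharge or irritation": 24.6,
    "self-diagnosis in recurrent UTI": 4.0,
    "history of vaginal discharge": 0.3,
    "history of vaginal irritation": 0.2,
}

for finding, lr in findings.items():
    probability = post_test_probability(PRE_TEST, lr)
    print(f"{finding}: LR {lr} -> post-test probability {probability:.1%}")
```

Starting from 50%, an LR of 24.6 gives a post-test probability of about 96%, while the most powerful negative finding (LR 0.2) still leaves about 17%. That is consistent with the point that history can rule a UTI in but does not lower the probability enough to rule it out.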