Evidence-Based Medicine
The Pediatrics Clerkship EBM Curriculum
STUDENT WORKBOOK 2004-2005
Sponsored by the Departments of Pediatrics and Medical Education

TABLE OF CONTENTS
Welcome and Introduction
Goals and Competencies
Curricular Activities
Assignment: Answerable Clinical Questions (ACQ)
Assignment: EBM Write-Up
EBM Learning Resources
Competency Levels, Incomplete Grade, and Remediation
Pediatrics Clerkship EBM Page
Therapy and Diagnosis Specific Articles
What is Evidence-Based Medicine
Developing an Answerable Clinical Question
Searching for Answers to Clinical Questions
Therapy: Summary of Approach to Validity and Results
Expanded Evaluation of a Therapy Article
Risk Reduction Calculator
Critically Appraised Topic (CAT) Form
Therapy: Example of a Write-Up Using a CAT Form
Diagnosis: Summary of Approach to Validity, Results and Applicability
Expanded Evaluation of a Diagnostic Test Article
Diagnostic Test Calculator
Basic Statistics for Diagnostic Tests
Diagnostic Test Likelihood Ratio Nomogram
Diagnosis: Example of a Write-Up Using a CAT Form
PICO Mnemonic for Applicability

EVIDENCE-BASED MEDICINE CURRICULUM FOR THE PEDIATRICS CLERKSHIP

Dear Pediatrics Clerkship Student,

Welcome to Pediatrics and welcome to the Evidence-Based Medicine (EBM) Curriculum. Below, please find information on goals, competencies, activities, and responsibilities. This curriculum may be considered a continuation of your work with EBM in the ECM course. The difference is that you will now have an opportunity to re-learn and apply the tools of EBM in an actual clinical context.

Each student "EBM PAIR" (see Curricular Activities) has been assigned a mentor (http://ebm.peds.uic.edu/clerkship/ ). Your mentor is eager to work with you on this important learning program. While there are a few required activities, much of the learning that we hope will occur depends on your initiative. The "EBM Learning Resources" section was specifically designed as a resource for self-directed learning.

Your Pediatrics Clerkship is designed to facilitate learning of both pediatric background and foreground information. Background information in pediatrics encompasses the basic set of facts about child health and disease. Medical schools are traditionally quite good at providing educational curricula for acquiring background information. Examples of questions that a learner may ask to gather background information are:

What are the physical findings in bronchiolitis?
How is iron deficiency anemia routinely diagnosed?
What is the known life expectancy for a child with sickle cell anemia?
What is the standard treatment for attention-deficit/hyperactivity disorder?
What are the recommended immunizations for a well baby in the first twelve months of life?
What resources are available to help a depressed adolescent?

Good sources of this kind of information are: involvement in patient care, mentors, textbooks, and MD Consult. Most of your clerkship activities should focus on acquiring background information.

A smaller proportion of your activities in the clerkship should focus on acquiring foreground information, which is the subject of this EBM curriculum. Foreground information is what is obtained by answering higher-level questions. Examples of this type of information include clinical evidence for the therapeutic efficacy of a new or an existing treatment, and the diagnostic accuracy of a newly proposed diagnostic test. Until recently, there were few, if any, formal curricula designed to facilitate learning of foreground information. The EBM Curriculum for the Pediatrics Clerkship is an educational intervention to address this learning need.

To better illustrate the difference between background and foreground information gathering, please see the paired questions below. Note that a foreground question is often a thoughtful follow-up question to a background question.

Background: What are the physical findings in bronchiolitis?
Foreground: In babies with new-onset wheezing, what is the diagnostic accuracy of the history and physical examination, compared to viral cultures, in diagnosing bronchiolitis? (a diagnosis question)

Background: How is iron deficiency anemia routinely diagnosed?
Foreground: In children with suspected iron deficiency anemia, what is the diagnostic accuracy of serum ferritin versus using the MCV and hemoglobin count, compared to bone marrow aspiration (or some other suitable gold standard), in diagnosing iron deficiency anemia? (a diagnosis question)

Background: What is the known life expectancy for a child with sickle cell anemia?
Foreground: In children with sickle cell anemia, what is the prognostic significance of frequent episodes of acute chest syndrome, compared to no episodes, on probability of survival at age forty? (a prognosis question)

Background: What is the standard treatment for attention-deficit/hyperactivity disorder?
Foreground: In children with Ritalin-resistant attention-deficit/hyperactivity disorder, what is the therapeutic efficacy of clonidine, compared to Adderall, as measured by parental report on the Conners Scale? (a therapy question)

Background: What are the recommended immunizations for a well baby in the first twelve months of life?
Foreground: In the population of otherwise healthy infants, what is the efficacy of the pneumococcal vaccine Prevnar, compared to placebo, in preventing pneumococcal meningitis? (a type of therapy question)

Background: What resources are available to help a depressed adolescent?
Foreground: Among mildly depressed adolescents, what is the therapeutic efficacy of outpatient cognitive therapy plus antidepressants, compared to outpatient cognitive therapy alone, in reducing the frequency of depression six months following initiation of treatment? (a therapy question)

Answers to foreground questions are rarely found in textbooks. By their nature, foreground questions require up-to-date answers, and textbooks are often a number of years out of date by the time they are published. The online clinical research bibliographic databases and study syntheses (meta-analyses, methodologically sound guidelines) are much more likely to provide answers to foreground questions.
By achieving the basic competencies of the EBM Curriculum for the Pediatrics Clerkship, we anticipate that you will have attained a beginner-level ability to formulate clear foreground questions ("answerable clinical questions") based on real patient encounters, search for answers (clinical studies), evaluate study methodology, analyze study results, and approach the application of results to your patients. These EBM tools are likely to be of aid to you in all of your future clinical endeavors.

Sincerely,
Jordan Hupert
Jerry Niederman
Larry Roy
Alan Schwartz
for the EBM mentoring group

GOALS
1. To actively employ the pediatric patient encounter as a forum for clinical learning
2. To answer clinical questions using the clinical research literature

COMPETENCIES
By the end of the Pediatrics Clerkship, the student will demonstrate how to
1. Develop an answerable clinical question (ACQ) from a patient encounter
2. Assess the methodologic validity of diagnosis and therapy research studies
3. Analyze the results of diagnosis and therapy studies, employing the tools of evidence-based medicine (EBM)
4. Approach the application of therapy and diagnosis study results to specific patient scenarios

TOOLS NEEDED TO ACHIEVE COMPETENCIES
The Pediatrics Clerkship will provide resources to facilitate student learning of
1. The PICO (Patient, Intervention, Comparison, Outcome) format for ACQs
2. PubMed Clinical Queries
3. The definition and application of the major issues of methodologic validity and applicability for diagnosis and therapy studies
4. The definition and application of the following concepts
a. prevalence
b. pre-test probability
c. sensitivity
d. specificity
e. likelihood ratio
f. post-test probability
g. absolute risk reduction
h. number needed to treat
i. 95% confidence interval
j. statistical and clinical power
k. PICO for applicability

CURRICULAR ACTIVITIES
(See "Competencies" section for remediation of non-completion of activities)

Aside from the pre- and post-tests, the EBM activities will be done in pairs (with occasional exceptions). Please see the web site for pair (= EBM PAIR) assignments, as well as mentor assignments.

1. Pre-Test
This will be completed either on-site or electronically. The results of the pre-test and the post-test will not affect your clerkship grade. The purpose of the tests is to inform the EBM mentors as to how well students are learning and how well mentors are facilitating learning of EBM.

2. Answerable Clinical Question (ACQ)
Within the first 3 weeks of the clerkship, each EBM PAIR must submit either a therapy or a diagnosis ACQ to your mentor via the Pediatric Clerkship EBM Page http://ebm.peds.uic.edu/clerkship/ . The ACQ is to be based on a pediatric patient with whom you have had clinical interaction during the first 3 weeks of the clerkship. If the ACQ is inadequate or deficient, your mentor will help with fixing it or will suggest sending another ACQ.

3. Search and EBM Write-Up
A. Each EBM PAIR should conduct a search of the online medical bibliographic databases to find an answer to the ACQ within 72 hours of receiving approval of the ACQ from your mentor. Send the reference of the article that best answers your question to your mentor.
B. Within the first 3 weeks of the clerkship, each EBM PAIR is to arrange at least one face-to-face meeting with your mentor to discuss the EBM project and write-up ("CAT", Critically Appraised Topic).
C. You must complete the EBM write-up using the Critically Appraised Topic (CAT) form (available online at the Pediatric Clerkship EBM Page http://ebm.peds.uic.edu/clerkship/ ).
The final draft must be submitted via e-mail to your assigned mentor by Sunday, the first day of the 5th week of the clerkship. Some examples of completed CATs are included in this workbook.
D. Students rotating in Pediatrics after the first clerkship will receive a list of EBM topics completed by students in earlier clerkships. EBM topics are not to be repeated. Each EBM PAIR is to work on a unique ACQ or a unique aspect of a previous ACQ.

If you choose to answer an ACQ on therapy, you ideally should be able to generate an NNT. All diagnosis articles will allow generation of LRs. Both of these numbers may be calculated using the online calculators available on the Pediatrics Clerkship EBM Page http://ebm.peds.uic.edu/clerkship/

4. Post-Test
The post-test (similar in form to the pre-test) will be taken at the end of the clerkship just prior to the shelf exam.

EBM LEARNING RESOURCES
In addition to the workshop handbook, the following resources are available to help you achieve the curricular competencies:

A. http://www.cche.net/usersguides/main.asp (This site has the JAMA collection of articles on EBM, including those on diagnosis and therapy)
B. http://ebm.peds.uic.edu (Location of EBM Consult Service, EBM calculators and brief diagnostic test tutorials. Developed by Dr. Alan Schwartz)
C. http://ebm.peds.uic.edu/clerkship/ Pediatrics Clerkship EBM Page
D. Evidence-Based Medicine. How to practice and teach EBM. David L. Sackett, et al. Second edition. 2000. Churchill Livingstone. (Available from the UIC bookstore or Amazon.com http://www.amazon.com/exec/obidos/ASIN/0443062404/qid=1021228742/sr=8-1/ref=sr_8_71_1/002-4147953-8775251 )
E. Users' Guides to the Medical Literature. Gordon Guyatt and Drummond Rennie. 2002. AMA Press. (Available from Amazon.com http://www.amazon.com/exec/obidos/ASIN/1579471749/qid=1022870085/sr=2-2/103-7432821-2643037 )
F. http://bmj.com/cgi/content/full/315/7107/540 (A paper on diagnostic tests by Trisha Greenhalgh.
In her list of questions, she combines issues of methodologic validity and applicability. For the sake of uniformity, when doing your write-up, use the questions of validity given to you at the workshop)
G. http://bmj.com/cgi/content/full/315/7105/422 (General paper on statistics, also by Dr. Greenhalgh. Particularly useful for confidence intervals)
H. http://www.cebm.utoronto.ca/practise (Good reference on answering clinical questions, including diagnostic test and therapy questions)
I. http://www.med.ualberta.ca/ebm/ebm.htm (Evidence-Based Medicine Tool Kit. Includes the validity questions with links to explanations)
J. http://www.intensivecare.com/Tutorial.html#anchor1214386 (An online tutorial)

GRADING
Students will receive a grade of "Achieved Competency," "Did Not Achieve Competency," or "Incomplete." EBM curriculum grades will be taken into consideration when determining the "Problem Solving" component of the clinical grade. Students who receive "Incomplete" for the EBM curriculum will receive an "Incomplete" for the clerkship until it is remediated.

COMPETENCY LEVELS, "INCOMPLETE", AND REMEDIATION

Activity: Pre-test
Competency level: NA
Clinical grade of (non-excused) "Incomplete": NA

Activity: ACQs
Competency level: Submits an appropriate ACQ in the appropriate PICO format
Clinical grade of (non-excused) "Incomplete": Not submitted as required

Activity: Written assignment
Competency level:
1. Submits EBM write-up on time using CAT form in which all sections are completed.
2. Achieves competency in discussion of
A. Validity: Addresses at least 3 of the validity questions (that are enumerated in the Student Handbook) for therapy and diagnostic test clinical trials
B. Results (75% accuracy): For a therapy study, reports results in terms of CER, EER, ARR, NNT, and 95% CIs for ARR and NNT, where applicable. For a diagnostic test study, reports results in terms of pre-test probabilities, LRs, post-test probabilities, and 95% CIs for LR and post-test probability.
C. Applicability (75% accuracy): Addresses issues using "PICO for Applicability" or standard questions in the "Summary" sections (see Table of Contents).
Clinical grade of (non-excused) "Incomplete": Does not complete EBM write-up by deadline.

Activity: Post-test
Competency level: NA
Clinical grade of (non-excused) "Incomplete": NA

Remediation of (non-excused) clinical grade of "Incomplete": Completes requirements

PEDIATRICS CLERKSHIP EBM PAGE
The link to this website is: http://ebm.peds.uic.edu/clerkship/

This site serves students in the Pediatrics clerkship. Places to go from here:
Welcome and introduction
Mentor assignments and list of mentors
Student workbook
Submit ACQ
Submit CAT
Online tools
o Diagnostic test calculator
o Number needed to treat/harm calculator

This web site is a joint project of Dr. Alan Schwartz of the Departments of Medical Education and Pediatrics and Dr. Jordan Hupert of the Department of Pediatrics at UIC.

THERAPY AND DIAGNOSIS SPECIFIC ARTICLE LIST
http://www.cche.net/usersguides/therapy.asp
Evidence-Based Medicine: A New Approach to Teaching the Practice of Medicine
How to Use an Article About Therapy or Prevention (PAY PARTICULAR ATTENTION TO ABSOLUTE RISK REDUCTION AND NUMBER NEEDED TO TREAT)
How to Use an Article About a Diagnostic Test

DEVELOPING AN ANSWERABLE CLINICAL QUESTION
(Based on Evidence-Based Medicine, 1997, Churchill Livingstone)

Learning how to ask an answerable clinical question (ACQ) is the first step in applying the results of clinical research to patient care. A well-formulated ACQ will save you time: the search for evidence will be an efficient, focused process, rather than a chaotic search for vaguely relevant clinical trials.

There are four parts to an ACQ:
1) The patient's problem
2) The potential intervention (test, treatment, prognostic factor, etiology, etc.)
3) Comparison to another potential intervention (if necessary)
4) The outcomes of interest.
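The four parts can also be written down as a small record, which makes it easy to see when a part (usually the comparison) is absent. The sketch below is illustrative only: the class name ClinicalQuestion and its field names are hypothetical and are not part of this workbook or of any EBM tool. The example text is the treatment question used in this workbook.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClinicalQuestion:
    """Hypothetical container for the four parts of an ACQ."""
    patient: str                      # 1) the patient's problem
    intervention: str                 # 2) test, treatment, prognostic factor, etc.
    outcome: str                      # 4) the outcome of interest
    comparison: Optional[str] = None  # 3) comparison, if one is needed

    def as_text(self) -> str:
        """Join the parts that are present into a single question."""
        parts = [self.patient, self.intervention]
        if self.comparison is not None:
            parts.append(self.comparison)
        parts.append(self.outcome)
        return " ".join(parts)


question = ClinicalQuestion(
    patient="In infants with West Syndrome (infantile spasms),",
    intervention="would use of vigabatrin",
    comparison="compared to ACTH therapy",
    outcome="result in faster and more efficient seizure reduction?",
)
print(question.as_text())
```

Leaving the comparison optional mirrors part 3 above, which is needed only for some questions.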
Here are four examples of ACQs broken down into their four component parts:

Diagnosis
1) In an otherwise healthy seven-year-old boy with a sore throat,
2) how does the clinical exam
3) compare to throat culture
4) in diagnosing group A β-hemolytic streptococcal infection?

Treatment
1) In infants with West Syndrome (infantile spasms),
2) would use of vigabatrin
3) compared to ACTH therapy
4) result in faster and more efficient seizure reduction?

Prognosis
1) In children with Down syndrome,
2) is IQ an important prognostic factor
4) in predicting Alzheimer's later in life?
(Notice that this question did not have a comparison component.)

Causation/Etiology
1) Controlling for confounding factors, do otherwise healthy children
2) exposed in utero to cocaine,
3) compared to children not exposed,
4) have an increased incidence of learning disabilities at age six years?

Exercise: Develop ACQs for the following cases:
1. Your attending in clinic wants you to start penicillin on a 3yo girl with a sore throat, fever, rhinorrhea and cough. He says the chance of strep in this patient is fairly high.
2. A pregnant woman is visiting your office for a pre-natal pediatric visit. She says that she heard that the injectable form of vitamin K, which is given routinely to babies soon after birth, may cause cancer later in life.
3. The mother of a child with frequent febrile seizures is insisting that her son be started on an anticonvulsant. Her concern is that she "just can't deal with any more seizures."
4. An 18yo immigrant, who contracted hepatitis C as a baby while receiving a blood transfusion for unknown reasons, wants to know if he is likely to develop hepatic carcinoma.

SEARCHING FOR ANSWERS TO CLINICAL QUESTIONS

Busy practicing pediatricians (even academic pediatricians) need search methods that are both fast and sufficiently reliable to retrieve high quality articles that specifically answer their questions.
We will discuss one search service and two databases that attempt to fulfill these two criteria.

1. PUBMED CLINICAL QUERIES

PubMed is a free online search service of the National Library of Medicine that searches the MEDLINE biomedical bibliographic database. PubMed offers several search options, including an option called Clinical Queries. The Clinical Queries option is based on the results of work by Dr. Brian Haynes, et al (Haynes RB, Wilczynski N, McKibbon KA, Walker CJ, Sinclair JC. Developing Optimal Search Strategies for Detecting Clinically Sound Studies in MEDLINE. J Am Med Informatics Assoc 1994;1:447-58). [The last author, Dr. Jack Sinclair, is a neonatologist.] They developed search terms which would result in the retrieval of the most methodologically sound articles in four categories: diagnosis, therapy, prognosis, and etiology.

The various search terms were determined as the result of a diagnostic test experiment. Dr. Haynes and his team hand searched ten general medicine and internal medicine journals published between the years 1986-1991. They picked out those studies which they felt were of the highest methodological quality. This hand-search method became the "gold" standard and the chosen articles became the results of the "gold" standard test. The diagnostic test was the computer search of MEDLINE. Every potentially useful search term and all combinations of these terms were used to find the hand-picked articles. More than 100,000 combinations were tried by computer to find the "gold" standard studies.

Emerging from these many combinations were two sets of search terms for each of the four categories: those which maximized specificity and those which maximized sensitivity. The search terms that maximized specificity (i.e., specificity of the search) minimized false positives. Thus, the articles retrieved were largely "gold standard" articles. Unfortunately, as one maximizes specificity, sensitivity suffers: false negatives increase. In terms of the search, this means that some "gold standard" articles were missed. The search terms that maximized sensitivity cast a wider net upon MEDLINE. The search retrieved a larger number of "gold standard" articles (increased sensitivity means decreased false negatives); however, specificity suffered, leading to retrieval of articles which were not among the "gold standard" group. The search terms which maximized specificity and sensitivity in each of the four categories have been embedded in PubMed Clinical Queries.

How to Use PubMed Clinical Queries

Go to the "new" PubMed Clinical Queries.
Decide on the type of evidence you are looking for (therapy, diagnosis, etiology, prognosis) and click that category.
Click either "specificity" or "sensitivity". In general, it is best to start with specificity (the default). This will give you the shortest list of articles, one of which, hopefully, will be applicable.
Enter a few search terms. Qualifying words such as "AND" or "OR" should be capitalized. MeSH headings (Medical Subject Headings) may be used. Do not add words such as therapy, randomized, blind. The search engine automatically incorporates similar high-efficiency terms.

Example

To appreciate the value of PubMed Clinical Queries, let us search for evidence that will answer the following question: In babies with colic, what is the therapeutic efficacy of any treatment compared to no specific therapy in decreasing crying spells (as determined by the parents)?

First, go to "regular" PubMed (the new version). Type in "colic" AND "infant" AND "therapy." Then click search. Note the number of articles you find (approximately 309). Next go to Clinical Queries. Click "therapy" and "sensitivity." Type in "infant" AND "colic" and then search. Notice the number of retrieved articles has decreased (approximately 113).
These also should be better quality articles, on average, than those retrieved with regular PubMed. Click "details" near the top of your search results. At the bottom of the new screen is the actual query with the embedded search terms developed by Dr. Haynes and his team to filter in methodologically sound studies. Now go back to the Clinical Queries Search Page, click "specificity" and search. This type of search is the most restrictive, filtering in only those studies of the highest quality (but possibly missing some). This search method produces the smallest quantity of studies (approximately 27). However, notice that most, if not all, of the studies listed are prospective, randomized studies.

Therapy: Reviewing the Evidence
Adapted from Sackett DL, Straus SE, Richardson WS, Rosenberg W, and Haynes RB, EVIDENCE-BASED MEDICINE: How to Practice and Teach EBM. 2nd Ed. Churchill Livingstone. 2000

Are the results likely to be valid? If NO then STOP.
Was the assignment of patients to treatment randomized?
Was follow-up sufficiently long and complete?
Were all patients analyzed in the groups to which they were randomized (intention to treat)?
Were patients and clinicians kept blind to treatment?

Are the results important? If NO then STOP.

Treatment     Adverse outcome present   Adverse outcome absent   Totals
Drug group    a                         b                        a+b
Placebo       c                         d                        c+d
Totals        a+c                       b+d                      a+b+c+d

Control Event Rate (CER) = c/(c+d)
Experimental Event Rate (EER) = a/(a+b)
Absolute risk reduction (ARR) = CER - EER
Number needed to treat (NNT) = 1/ARR

Are the results applicable to my patient? If NO then STOP.
Is our patient so different from those in the study that its results cannot apply?
Is the treatment feasible in our setting?
What are our patient's potential benefits and harms from the therapy?
What are our patient's values and expectations for both the outcome we are trying to prevent and the treatment we are offering?

Evaluating an Article about Therapy

You are in your office. It is 6:00 P.M.
and your day is over. As you pack up your briefcase to head home, the nurse brings in a 7-year-old boy with a history of moderate persistent asthma. His mother says he's been coughing and wheezing since yesterday and he's getting worse. On exam, the patient is in moderate respiratory distress with a RR=40. His mentation is normal. He has bilateral wheezing with fair air entry and moderate subcostal/intercostal retractions. His oxygen saturation on room air is 85% and his peak expiratory flow rate (PEFR) is 45% of predicted for height. You start him on oxygen and albuterol by nebulization. After 30 minutes and 2 albuterol treatments there is only mild improvement in his wheezing; he is still tachypneic (RR 34) and hypoxic (oxygen saturation 90%), and his PEFR is only 50% of predicted. This patient is a definite admission (after stabilization in the ER). As you are waiting for the ambulance, you recall hearing about a study using magnesium in moderate exacerbations of asthma and wonder what the likelihood is that your patient would benefit from the magnesium treatment. You decide to formulate an answerable clinical question and find an answer as soon as your patient is transferred to the ER.

The pediatrician is faced with therapy decisions many times each day. Evaluating evidence for or against new therapies (or older, unproved therapies) that may be beneficial is part of providing a high level of care for our patients. The example which follows will outline one approach to answering a clinical question about therapy.

The Question
P: In children with an acute moderate exacerbation of asthma,
I: what is the therapeutic efficacy of magnesium,
C: compared to placebo (or no treatment),
O: in improving PEFR (and possibly saving an admission)?

The Search
You quickly go to your computer, call up PubMed, click "Clinical Queries", and begin your search for evidence of efficacy (and safety) of magnesium in patients with moderate exacerbation of asthma.
You click "therapy" and "specificity" and enter the words "magnesium AND asthma AND child." Eleven studies are retrieved. The fourth in the list looks promising: Ciarallo L, Sauer AH, Shannon MW. "Intravenous magnesium therapy for moderate to severe pediatric asthma: results of a randomized placebo-controlled trial." J Pediatr 1996;129:809-14. You quickly download a copy of the article from OVID and briefly analyze it.

Objective
"To evaluate the efficacy of intravenous magnesium therapy for moderate to severe asthma exacerbations in pediatric patients."

Methods
All children 6-14 years of age presenting to the ER of Children's Hospital in Boston 9/93 - 12/94 were evaluated for the study. Inclusion criteria: PEFR < 60% predicted and an IV placed for reasons other than the study. Exclusion criteria: fever > 38.5 degrees C, systolic BP < 25% for age, recent use of theophylline, history of cardiac, renal, or pulmonary disease, and pregnancy. Patients were randomized in what appears to have been a double blind fashion to either 100 ml of 25 mg/kg (maximum 2 gm) MgSO4 or saline. All patients were given 2 mg/kg of methylprednisolone by IV.

Validity

Primary Issues:
1. Was the assignment of patients to treatments randomized? Yes.
2. Were all patients who entered the trial properly accounted for and attributed at its conclusion? Yes.
3. Was follow-up complete? Yes.
4. Were all patients analyzed in the groups to which they were randomized? Yes.

This question is important for very practical purposes. Consider this (fictitious) example. Two hundred patients are randomly assigned to receive either coumadin (the experimental medication) or aspirin to prevent thrombosis in children under 3 years of age following a Fontan procedure to treat single ventricle congenital heart disease. One hundred patients end up in each group. After twelve months the investigators found that 16 patients assigned to the coumadin group and 16 patients assigned to the aspirin group had a thrombotic event: no apparent improvement with coumadin. However, the investigators discovered that within the first 2 months of the study, 35 of the patients assigned to the coumadin group had stopped taking it, for a variety of reasons, and started taking aspirin. When they analyzed only those who actually took the coumadin for a full 12 months, only 2 patients (of the remaining 65) had a thrombotic event.

The investigators did their calculations in two ways: 1) they analyzed the patients in the groups to which they had been originally randomized, and 2) they analyzed the patients in the groups in which they ended the study. The first approach is called an intention-to-treat analysis. In general, this is the type of analysis that has meaning to practicing physicians. The intention-to-treat approach is much more typical of the "real life" situation of clinical practice, where patients take, or do not take, their medicine for a variety of reasons. The practical difference in the example is that the first way of calculating does not demonstrate a benefit of coumadin, whereas the second way does. The conclusion from a practical clinical point of view is that use of coumadin is not clinically effective. If ways could be found to increase compliance, another study could be performed to retest coumadin versus aspirin.

Secondary Issues:
5. Were patients, health workers, and study personnel "blind" to treatment? It does appear from the description of the methods that they were.
6. Were the groups similar at the start of the trial? Not completely. Table 1 demonstrates that patients randomized to the magnesium group had a statistically significantly lower baseline PEFR than those randomized to placebo. This group, therefore, had more room for improvement.
Since improvement from baseline was an outcome variable, the results would be biased in favor of the magnesium group.
7. Aside from the experimental intervention, were the groups treated equally? Yes.

Having decided that the study is at least minimally valid, you apply to the results the very basic evidence-based medicine statistical tools you were taught as a resident.

Results
The primary outcome was measured as percent improvement in PEFR 80 minutes after initiation of drug infusion. The authors found that the group given magnesium showed a significant improvement from their entry-level PEFR: 46% vs. 16% in the placebo group (p=0.05). No significant side effects were noted (in particular, no BP effects), though the small sample size precludes conclusions about uncommon significant side effects.

This result, while interesting, has limited clinical meaning. It doesn't answer the question, "What is the likelihood that my patient will benefit from the treatment?" Fortunately, the authors give us a patient-oriented outcome for PEFR. At the end of the observation period 4 of the 15 patients in the magnesium group (27%) vs. 11 of the 16 patients in the placebo group (69%) had a PEFR < 60% predicted. In order to calculate a clinically useful statistic, let us place the data into a 2x2 table.

                 PEFR > 60%   PEFR < 60%   Total
Mg +             11           4            15
Mg - (placebo)   5            11           16
Total                                      31

The absolute risk reduction (ARR) is the rate of disease in the control group minus the rate of disease in the treatment group. In our example, the rate of "disease" is the percentage of patients with PEFR < 60% at the conclusion of the observation period.

For the placebo group = 11/16 = 0.69
For the Mg group = 4/15 = 0.27
The absolute risk reduction = 0.69 - 0.27 = 0.42

The 95% confidence interval (CI) for the ARR is [0.10, 0.74]; the ARR is therefore statistically significant (since the interval does not cross 0). [Click on "ARR/NNT" found in the sidebar to see how the 95% CI for both the ARR and NNT are calculated.]
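The ARR arithmetic can be reproduced in a few lines. The sketch below assumes a simple normal-approximation (Wald) interval for the difference between two proportions; the workbook does not say which method its online calculator uses, so that choice is an assumption, and the function name arr_with_ci is hypothetical. With the PEFR counts from the study it gives values that round to the figures quoted in the text.

```python
import math


def arr_with_ci(control_events, control_total, treated_events, treated_total, z=1.96):
    """Absolute risk reduction with an approximate (Wald) 95% CI.

    Assumption: normal approximation for the difference of two proportions.
    Returns (arr, lower bound, upper bound).
    """
    cer = control_events / control_total   # control event rate
    eer = treated_events / treated_total   # experimental event rate
    arr = cer - eer
    se = math.sqrt(cer * (1 - cer) / control_total
                   + eer * (1 - eer) / treated_total)
    return arr, arr - z * se, arr + z * se


# PEFR < 60% at end of observation: 11 of 16 on placebo, 4 of 15 on magnesium
arr, low, high = arr_with_ci(11, 16, 4, 15)
print(f"ARR = {arr:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

With samples this small, other interval methods can give somewhat different bounds, so treat the output as an approximation.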
Thus, Mg treatment leads to a 42% absolute reduction in the proportion of patients with a PEFR < 60% at the end of the observation period. In addition, if this study were repeated many times, about 95% of the confidence intervals calculated in this way would contain the true ARR. Another way of stating this is that you can be 95% confident that the true ARR lies somewhere between 0.10 and 0.74.

There is another, perhaps even more clinically meaningful statistic called the number needed to treat (NNT). The NNT tells you how many patients you would have to treat to see an effect of the drug in one patient (over and above control).

    NNT = 1/ARR

In our example, the NNT = 1/0.42 = 2.4. Its 95% CI is [1, 10], and the NNT is therefore statistically significant (since the CI does not include infinity [1/0]). You have a personal rule of thumb: for mildly invasive treatments with no significant side effects and at least moderately significant benefits (this magnesium treatment seems to fit all these criteria), you need to be 95% confident that you will not have to treat more than 25 patients to benefit one over and above control. In this case, ~2 patients must be treated with magnesium to benefit one patient over and above control. Also, you can be 95% confident that the true NNT lies somewhere between 1 and 10. Therefore, assuming the side effects are mild and/or uncommon (hard to know from this small study, though there were no serious side effects noted), this result appears to meet all your criteria for use.

In the same study, the authors reported a statistically significant decreased admission rate among the patients treated with magnesium. Eleven patients in the Mg group and 16 patients in the placebo group were admitted.

              Admitted +    Admitted -    Total
    Mg +          11             4          15
    Mg -          16             0          16
    Total                                   31

What is the ARR (the risk in this case is admission to the hospital)? Answer: 16/16 - 11/15 = 0.27 [0.05, 0.43].

What is the NNT? Answer: 1/0.27 = 3.7 [2, 22].
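The NNT arithmetic above can be sketched the same way. This minimal example assumes the NNT interval is obtained simply by inverting the two ARR confidence bounds, which is only sensible when the ARR interval excludes 0; the function name is hypothetical, and the workbook rounds its NNT bounds to whole patients, so the decimals below will not match it exactly.

```python
def number_needed_to_treat(arr, arr_low, arr_high):
    """Return (NNT, lower, upper) by inverting the ARR and its CI bounds."""
    if arr_low <= 0:
        # If the ARR interval reaches 0, the NNT interval runs to infinity
        raise ValueError("ARR interval includes 0; NNT interval is unbounded")
    # The largest ARR gives the smallest NNT, and vice versa
    return 1 / arr, 1 / arr_high, 1 / arr_low

# PEFR example from the text: ARR = 0.42, 95% CI [0.10, 0.74]
nnt, low, high = number_needed_to_treat(0.42, 0.10, 0.74)
print(f"NNT = {nnt:.1f}, 95% CI [{low:.1f}, {high:.1f}]")
# prints: NNT = 2.4, 95% CI [1.4, 10.0]
```

Applied to the rule of thumb in the text, the upper bound (10) is the number to compare against the cutoff of 25 patients.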
Thus, only four patients would need to be treated to prevent one admission, over and above control, with 95% confidence that fewer than 25 patients would have to be treated to detect a benefit (no admission) in one patient, again fulfilling your criteria for use.

Dr. Alan Schwartz has developed an online calculator which can determine ARR (with 95% CI) as well as NNT (with 95% CI). This should make your life a bit easier. Click http://araw.mede.uic.edu/cgi-bin/nntcalc.pl to try it.

Authors' Conclusions

"Children treated with IV magnesium for moderate to severe asthma had...greater improvement in short-term pulmonary function [compared to controls]...suggesting a role for the agent as an adjunct in the treatment of such patients."

Questions/Concerns/Your Conclusions/Applicability

There are a couple of important points. The first is that the randomization procedure didn't work: the baseline PEFRs of the two groups were different. One way to overcome this "random" problem is to randomize more patients. This would have been a desirable thing to do, as the baseline PEFR favored the experimental group. On the other hand, the results relating to admissions were certainly biased away from the experimental group, as all of the patients in the study were slated for admission. It was the significant (statistically and clinically) improvement in the treatment group which prevented 27% of the admissions in that group, whereas all control patients were admitted.

The results of this study certainly favor the use of magnesium based on your NNT cutoff criteria. It would be helpful to see what other well-designed magnesium studies demonstrate in children, especially studies in which the randomization worked. It would also be important to note any significant adverse effects.
To the extent one can tell, the patients in this study (those who visited the ER of the Children's Hospital of Boston) are similar to those you see (though you tend to see only a few patients with moderate to severe exacerbations in your office). Therefore, from a patient "type" standpoint, the results appear to be applicable.

Resolution of Your Patient's Story

Your patient was admitted to the general pediatric ward. He was discharged after three days in good condition. Early the next week you set up a meeting with the ER staff to begin evaluating the possibility of employing magnesium as part of the outpatient therapy for asthma exacerbations.

Risk Reduction Calculator

To use, go to: http://araw.mede.uic.edu/cgi-bin/nntcalc.pl or link through the Pediatrics Clerkship EBM Page http://ebm.peds.uic.edu/clerkship/

Enter your data in one of these ways:

Numbers of patients who experience good and bad outcomes under the new therapy and control therapy:

                   Good Outcome    Bad Outcome    Total
    New therapy
    Control
    Total

or

Type of event and event rates (and, optionally, sample size):

    The events I'm interested in are: adverse
    Control event rate: ___ %
    Experimental event rate: ___ %
    Optional: # of patients in control group, # of patients in experimental group

This is what the Critically Appraised Topic (CAT) form looks like (it expands as you fill it in). Please access it at the Pediatrics Clerkship EBM Page http://ebm.peds.uic.edu/clerkship/ and click on “Submit CAT.”

CRITICALLY APPRAISED TOPIC

    TOPIC TITLE
    Date
    Name of Reviewer(s)
    Patient Story (be brief)
    Answerable Clinical Question (PICO)
    The Search
    The Study Citation
    Methods (focus on your question)
    Issues of Validity (see specific questions)
    Results (focus on your question)
    Applicability (see PICO for Applicability)
    Resolution of Patient Story
    CLINICAL BOTTOM LINE

An example of a completed CAT for a Therapy Article.
This is the link to the full text: http://bmj.com/cgi/content/full/314/7097/1800?maxtoshow=&HITS=10&hits=10&RESULTFORMAT=&titleabstract=gingivostomatitis&searchid=1024936087142_10594&stored_search=&FIRSTINDEX=0&fdate=1/1/1997&resourcetype=1,2,3,4,10

CRITICALLY APPRAISED TOPIC

TOPIC TITLE: Acyclovir and Gingivostomatitis

Date: 1/20/01

Name of Reviewer: Jordan Hupert

Patient Story: 13 month old girl with fever and sores in her mouth for 2 days. She is not clinically dehydrated, but she has been drinking less and appears uncomfortable.

Answerable Clinical Question (PICO): In children with probable herpes gingivostomatitis, what is the therapeutic efficacy of oral acyclovir, compared to placebo, on rate of cure?

The Search: PubMed Clinical Queries; Therapy; Specificity; gingivostomatitis AND acyclovir

The Study Citation: Amir J, Harel L, Smetana Z, Versano I. Treatment of herpes simplex gingivostomatitis with aciclovir in children: a randomised double blind placebo controlled study. BMJ. 1997;314:1800-3.

Methods (focus on your question): 1-6 years old. Ambulatory/ER setting. Within 3 days of symptom onset. Randomized/placebo/double blind. 15 mg/kg, 5x/day x 7 days. 72 children randomized, 62 HSV+, 1 dropped out; 61 HSV+ = study population.

Issues of Validity:
    Randomized trial? Yes
    Patients accounted for at end of trial? Yes
    Follow-up long enough? Yes
    Intention to treat? Yes for 61. See Results for 72.

Results (focus on your question): Lesion resolution after 7 days of treatment:
    61 patients: CER 21/30, EER 2/31, ARR = 0.63 [0.45, 0.82], NNT = 2 [1, 2].
    72 patients (worst case scenario): CER 21/36, EER 7/36, ARR = 0.39 [0.18, 0.60], NNT = 3 [2, 5].
    Other results (acyclovir vs placebo, medians): Oral lesions: 4 vs 10 days. Fever: 1 vs 3 days. Drinking difficulties: 3 vs 6 days.

Applicability, Limitations, Concerns: No serious side effects, and the same in both groups. Would less frequent or shorter course acyclovir administration work? Study compliance was good. Real life compliance? Extrapolation to older/younger children? Treatment effect after 3 days of symptoms? Uncommon serious side effects: check PDR and literature.

Resolution of Patient Story: Acyclovir prescribed. Patient improved in 2 days, healed in 7 days.

CLINICAL BOTTOM LINE: Acyclovir is effective in treating children with suspected herpetic gingivostomatitis.

Diagnosis: Reviewing the Evidence

Adapted from Sackett DL, Straus SE, Richardson WS, Rosenberg W, and Haynes RB, EVIDENCE-BASED MEDICINE: How to Practice and Teach EBM. 3rd Ed. Churchill Livingstone. 2000

Are the results likely to be valid? If NO then STOP.
    Was there an independent, blind comparison with a reference (“gold”) standard?
    Was the diagnostic test evaluated in an appropriate spectrum of patients (like those in whom we would use it in practice)?
    Was the reference (“gold”) standard applied regardless of the diagnostic test result?

Are the results important? If NO then STOP.

                          Disease
    Test Result     Present    Absent    Totals
    Positive           a          b       a+b
    Negative           c          d       c+d
    Totals            a+c        b+d      a+b+c+d

    Sensitivity = a/(a+c)
    Specificity = d/(b+d)
    Positive predictive value = a/(a+b)
    Negative predictive value = d/(c+d)
    LR (+) = [a/(a+c)]/[b/(b+d)]
    LR (-) = [c/(a+c)]/[d/(b+d)]

Are the results applicable to my patient? If NO then STOP.
    Is the diagnostic test available, affordable, accurate, and precise in our setting?
    Can we generate a clinically sensible estimate of our patient’s pretest probability?
    Will the resulting post-test probability affect our management and help our patient?

EVALUATING AN ARTICLE ABOUT DIAGNOSIS

Tanya is an 8-month-old girl, otherwise normal, who is seeing you with what her mother describes as two days of fever greater than 102° F. Physical exam is unremarkable except for a temperature of 103° F in your office and mild URI symptoms. You decide to obtain a urinalysis and urine culture (by catheter) and send it to the nearby lab.
Two hours later you get a call that there were 7 WBCs seen as determined by a hemocytometer and no bacteria on gram stain. The interpretation by the lab is "negative urinalysis, culture pending." You've often wondered how accurate pyuria and bacteriuria (or their absence) are in predicting a urinary tract infection in infants. You therefore formulate your question in preparation for a search.

The Question:
    P: In children under two years of age with fever and no obvious source of infection,
    I: what is the diagnostic accuracy of pyuria and bacteriuria,
    C: compared to urine culture,
    O: in the diagnosis of urinary tract infection (UTI)?

The Search: You quickly click onto PubMed Clinical Queries, click "diagnosis," "specificity," and type in "pyuria AND bacteruria AND infant." Fourteen articles come up, including one that seems to be right on target: Hoberman A, Wald ER, Reynolds EA, et al. "Pyuria and bacteriuria in urine specimens obtained by catheter from young children with fever." J Pediatr 1994;124:513-9.

Objective: The study had a number of objectives. The objective which addressed your question was to "...assess the validity of microscopic urinalysis for diagnosis of UTI."

Methods: The patient population was made up of children under two years of age from whom a urine specimen was obtained for urinalysis and culture. WBCs in urine were counted on a hemocytometer from uncentrifuged urine. Gram stains were done on uncentrifuged urine. A positive urinalysis was defined as greater than 10 leukocytes/mm³ and any bacteria seen on gram stain.

Validity:

1. Was there an independent blind comparison with a reference (gold) standard? There was a reference (gold) standard. A positive urine culture was defined as > 50,000 colonies/ml. The reference standard was presumably independent. There is no mention as to whether it was a blind comparison.

2. Did the patient sample include an appropriate spectrum of patients to whom the diagnostic test will be applied in clinical practice? We only know a few things about the patients. We know their age and we know that they were seen in the Emergency Department of the Children's Hospital of Pittsburgh, though we do not know any other specifics about the patients. Since Pittsburgh is a large city not unlike Chicago, we can be reasonably assured that the patients seen in the Emergency Department in Pittsburgh are similar to those we see here in Chicago.

3. Did the results of the test evaluated influence the decision to perform the reference (gold) standard? No.

4. Were the methods for performing the test described in sufficient detail to permit replication? Yes.

Results:

Before we discuss specific results of this study, we need to discuss a few basic concepts about diagnostic tests. This section discusses how to use the results of studies of diagnostic tests. There are three steps in using diagnostic tests:

1. Assigning a pre-test probability of disease for our patient,
2. Finding or calculating the likelihood ratio (LR) for a particular test result, and
3. Calculating the post-test probability of disease using the pre-test probability and LR.

Step One: Pre-Test Probability

Diagnostic tests modify our patient's pre-test probability of disease. It is important to accurately estimate the initial probability of disease in our patient. Sometimes our initial estimate is quite accurate. For example, we may know that in our clinic population, 15% of the pre-school kids are iron deficient. At other times, we may have only a vague idea of the pre-test probability, as for the pre-test probability of parasitic gastrointestinal infection in a recent immigrant with abdominal pain. In this case, we could try to refine our estimate by going to the literature. If the literature does not help, we could ask one of our experienced colleagues or an expert in parasitic diseases.
If no one knows, we must make our best guess. A diagnostic test can only change our pre-test probability.

Step Two: The Likelihood Ratio

Consider the table below:

                        Reality (Gold Standard)
    TEST          Disease                  No Disease
    Positive      True Positive (TP)       False Positive (FP)
    Negative      False Negative (FN)      True Negative (TN)
    Totals        All With Disease         All Without Disease

On the left side of the table are listed the results of the test. In the simplest case, the test is either positive or negative. At the top of the table is the reference, or gold, standard. The gold standard is the best available definition of the disease; it is often the definitive test for a disease. In extreme cases, the diagnosis at autopsy might be the gold standard. When the gold standard itself is imperfect or unknown (e.g. the definition of a urinary tract infection in a patient with a dysfunctional bladder), the test being evaluated will carry the same uncertainties as the gold standard.

A positive test constituted > 10 WBC/mm³ + bacteria on gram stain. A negative test was anything else. The gold standard was a urine culture with > 5 x 10^4 colony forming units/ml. The data from the article were entered into the 2 x 2 table below.

                                    Urine Culture
    Enhanced Urinalysis          Positive    Negative
    >10 WBC + bacteria              91          12
    Anything else                   11        2067
    Totals                         102        2079

Notice that the enhanced urinalysis was not perfect; there were false positives and false negatives. This is common for tests.

We can use the data in the table to calculate a number of important properties of the test and study population: sensitivity, specificity, predictive values, prevalence rates, and likelihood ratios (LR). The most useful property for our purposes is the LR (See Likelihood Ratios on the sidebar). The LR reflects the essence of a test because it combines within a single value both the sensitivity and specificity. A +LR defines the diagnostic strength of a positive test; a -LR defines the diagnostic strength of a negative test.
The LR tells us how much we must modify our initial pre-test probability. Refer to the first table. The LR for a positive test (+LR) is defined by either of these two equivalent ratios:

    +LR = [TP / All With Disease] / [FP / All Without Disease]
    or
    +LR = Sensitivity / [1 - Specificity]

Let us calculate the +LR from our example:

    +LR = [91/102]/[12/2079] = 155/1

The interpretation of this result is that a positive result from the enhanced urinalysis test will change the pre-test probability (actually, the pre-test "odds," see below) 155 times more toward the diagnosis of urinary tract infection than away from it.

The LR for a negative test (-LR) is defined by either of these two equivalent ratios:

    -LR = [FN / All With Disease] / [TN / All Without Disease]
    or
    -LR = [1 - Sensitivity] / Specificity

The -LR for the enhanced urinalysis test is

    -LR = [11/102]/[2067/2079] = 0.11/1

The interpretation of this result is that a negative result from the enhanced urinalysis test will change the pre-test probability (actually, the pre-test "odds," see below) 0.11 times more toward the diagnosis of urinary tract infection than away from it. This, of course, means that a negative test result leads us away from the diagnosis of urinary tract infection.

Notice that the mathematical formulations for +LR and -LR are

    +LR = sensitivity/[1 - specificity]
    -LR = [1 - sensitivity]/specificity

Recall (or take our word for it) that sensitivity and specificity are prevalence independent. Thus, LRs do not change from population to population. This is one of the most valuable characteristics of LRs. The same LR is used in Chicago and Bombay, even if the disease is much more prevalent in Bombay. This holds provided that the disease is defined identically in both locations, that the LR was calculated from a group of patients likely to be found in both locations, and that those patients are the ones who are likely to be tested.
(A study developing a test for the early detection of group B strep in neonates should not include healthy 6-month-old babies. The +LR and -LR from such a study may be falsely elevated and depressed, respectively.) Typically, these problems either do not arise or do not change the LRs by much. However, this is a good example of how it pays to pay attention to the methods section of a clinical study. We are now ready to learn how to use LRs.

Step Three: Calculating the Post-Test Probability of Disease

The following demonstrates how the concepts are developed and used. As a practical matter, use the online calculator http://araw.mede.uic.edu/cgi-bin/testcalc.pl . We will use what we have discussed above to calculate the probability of disease given a particular test result. Recall the data presented above:

                                    Urine Culture
    Enhanced Urinalysis          Positive    Negative
    >10 WBC + bacteria              91          12
    Anything else                   11        2067
    Totals                         102        2079

A positive test result

Step One: As already mentioned, we need to have an initial estimate of the probability of disease. In our example, urinary tract infection was present in 4.7% of the study patients (102/[102 + 2079]). Let us assume that our patient population is similar to the study's population. We will use 5% as our estimate for the pre-test probability of urinary tract infection in our febrile patients < 2 years old. Since LRs are in ratio, or odds, form, we need to convert our pre-test probability into an odds form. Thus, 5% = 5/100 = 5 out of 100 children are infected = 5 are infected and 95 are not infected. The odds, therefore, are 5/95 = 0.05/1

Step Two: Assume that we obtained a positive test result (>10 WBC/mm³ + bacteria on gram stain). The LR for this result, +LR, as calculated previously, is 155/1.
Step Three: Now we modify our pre-test odds with the results of the test:

    Pre-Test Odds x +LR = Post-Test Odds
    0.05/1 x 155/1 = 7.8/1

The interpretation of the post-test odds is that in febrile children < 2 years old, a positive enhanced urinalysis test increases the odds of urinary tract infection from 0.05/1 to 7.8/1, a 155-fold increase.

We then convert the post-test odds back into a post-test probability. We do this because we are more comfortable talking about the probability of disease than the odds of a disease (with the exception of those who frequent the racetrack). Thus, odds of infection of 7.8/1 means that there are 7.8 children with infection for every child without infection. Therefore, the probability of infection is 7.8 (children with infection) divided by 8.8 (total number of children with and without infection).

    7.8 / 8.8 = 0.89 (89%)

Notice what the test did. It took a patient with an initial probability of 5% for a urinary tract infection and modified it to give a post-test probability of 89%. This is a very powerful test and demonstrates one of the useful characteristics of LRs. This child with positive results is likely to be started immediately with antibiotics.

A negative test result

What if the test came back negative?

Step One: The prevalence hasn't changed.

    Pre-test probability = 5%
    Pre-test odds = 0.05/1

Step Two: The likelihood ratio for a negative test, -LR, = 0.11/1

Step Three:

    Pre-Test Odds x -LR = Post-Test Odds
    0.05/1 x 0.11/1 = 0.006/1
    Post-test probability = 0.006 / [0.006 + 1] = 0.006 (0.6%)

This is a patient with < 1% probability of a urinary tract infection. It would make sense to hold off on the antibiotics, check the culture results the next day, and keep an eye open for other causes of fever if the child does not improve.

Using the online test calculator

Dr. Alan Schwartz has developed an online calculator which will do all the LR and post-test probability calculations automatically.
It will also calculate the 95% confidence intervals. Now that you know the fundamentals of LRs and post-test probability calculation, you may wish to consult the online calculator for your future needs. The calculator is at http://ebm.peds.uic.edu/ebm/testcalc.shtml. Here's what the results of using the online test calculator would be: [calculator output not reproduced here]

Multilevel Test Results

Many times, the results of a test are not "positive or negative." Consider our first example of an enhanced urinalysis. This test can be considered as a multilevel test. In addition to ">10 WBCs + Bacteria," we were able to calculate the data (from one of the tables in the article) for the other possible combinations of WBCs and bacteria (see table below).

    Test result                +UTI     -UTI      LR
    >10 WBCs + Bacteria          91       12     155
    <10 WBCs + Bacteria           4       58     1.41
    >10 WBCs/No Bacteria          2       61     0.67
    <10 WBCs/No Bacteria          5     1948     0.05
    Total                       102     2079

Notice that each test level has an LR. The calculation for the LR follows our definition. Thus, the LR for a test result of "<10 WBCs + Bacteria" is the probability that the patient comes from the diseased versus the nondiseased population. The calculation for this LR is [4/102]/[58/2079] = 1.41/1. This LR can then be combined in the usual fashion with the pre-test probability of our patient having a UTI to obtain the post-test probability of disease. What is the post-test probability? Notice that the LR is approximately equal to 1 and therefore our patient's post-test probability will not be very different from the pre-test probability. This is what we would expect for a test with mixed results (positive for bacteria, negative for WBCs).

Authors' Conclusions:

The authors' conclusions do not specifically address our clinical question, and therefore we will go directly to the next section.

Questions/Concerns/Your Conclusions/Applicability:

There were a number of concerns about this study. It is not clear what percentage of "all febrile children" were included in the sample.
The methods section merely states that all children "from whom a urine sample was obtained" were included in the study. Another concern is the definition of a urinary tract infection. It was defined as 50,000 colonies/ml. This is at odds with others who define UTI with either fewer or more colonies/ml. However, this study did attempt to demonstrate why 50,000 colonies/ml is an appropriate definition (based on other results in the study, which compared urine culture to DMSA scanning), and for our purposes we can use it.

In terms of feasibility, cell counting with a hemocytometer is a technically simple procedure. A gram stain is also relatively simple. Both of these procedures can be done in any laboratory. These results do seem to apply to your patient since you use a laboratory that does urinalysis by hemocytometer and gram stain in a way that is similar to the Hoberman study. The conclusion of the laboratory that the child has a negative urinalysis is consistent with the approach of the study.

Resolution of Your Patient's Story

You decided not to start your patient on antibiotics. Twenty-four hours later, your patient was doing well and the results of the culture came back as "no growth after 24 hours."

Diagnostic Test Calculator: http://ebm.peds.uic.edu/clerkship/

Enter your data in one of these ways:

Numbers of patients with and without the disease who test positive and negative:

                     Disease present    Disease absent    Total
    Test positive
    Test negative
    Total

or

Disease prevalence, test sensitivity, and test specificity (and, optionally, sample size):

    Prevalence (e.g. 0.10):
    Sensitivity (e.g. 0.80):
    Specificity (e.g. 0.80):
    Total sample size:

or

Disease prevalence, positive likelihood ratio, and negative likelihood ratio (and, optionally, sample size):

    Prevalence (e.g. 0.10):
    +LR (e.g. 4):
    -LR (e.g. 0.01):
    Total sample size:

Here are the concepts and how it is done mathematically:

BASIC STATISTICS FOR DIAGNOSTIC TESTS

                        Gold Standard (Reality)
    TEST        Disease                       No Disease
    +           a                             b
    -           c                             d
    Total       All With Disease (a + c)      All Without Disease (b + d)

Likelihood Ratios

For a given test result, the likelihood ratio is the probability that our patient comes from the diseased versus the non-diseased population (definition courtesy of Dr. Jack Sinclair, Department of Pediatrics, McMaster University).

    The likelihood ratio for a positive test = +LR = [a/(a+c)]/[b/(b+d)]
    The likelihood ratio for a negative test = -LR = [c/(a+c)]/[d/(b+d)]

The likelihood ratio, when combined with the patient’s pre-test probability (prevalence) of having the disease, will give you the post-test probability of disease in that patient. Use the post-test probability nomogram to do this or see below for an exact method.

Here is an exact way to do it:

Determine the prevalence of disease and convert it to a prevalence ratio (the pre-test odds):

    prevalence ratio (PR) = prevalence/(1 – prevalence)

Then calculate the post-test odds:

    PR x LR = post-test odds of disease

Finally, convert the post-test odds back to a probability:

    Probability of disease = [post-test odds]/[1 + post-test odds]

For those interested in calculating a 95% confidence interval around an LR:

    95% Confidence Interval for +LR = exp( ln(+LR) ± 1.96 x sqrt{ [c/(a+c)]/a + [d/(b+d)]/b } )
    95% Confidence Interval for -LR = exp( ln(-LR) ± 1.96 x sqrt{ [a/(a+c)]/c + [b/(b+d)]/d } )

Other helpful statistics:

Sensitivity = The ability of a test to detect diseased people from a diseased population = a/(a + c).

Specificity = The ability of a test to detect healthy people from a healthy population = d/(b + d).

Positive predictive value = The probability that a positive test result is a true positive given a specific disease prevalence = a/(a + b).

Negative predictive value = The probability that a negative test result is a true negative given a specific disease prevalence = d/(c + d).
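The definitions above can be sketched in a few lines of Python, using the enhanced urinalysis data (a = 91, b = 12, c = 11, d = 2067) as a check. This is an illustrative sketch, not the online calculator's code: the function names are ours, and the confidence interval assumes the usual log-scale method, which is how we read the workbook's formula for the +LR interval.

```python
import math

def diagnostic_stats(a, b, c, d):
    """a = true pos, b = false pos, c = false neg, d = true neg."""
    sens = a / (a + c)
    spec = d / (b + d)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": a / (a + b),            # positive predictive value
        "npv": d / (c + d),            # negative predictive value
        "lr_pos": sens / (1 - spec),
        "lr_neg": (1 - sens) / spec,
    }

def post_test_probability(pre_test_probability, lr):
    """Probability -> odds, multiply by the LR, then odds -> probability."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * lr
    return post_odds / (1 + post_odds)

def lr_pos_ci(a, b, c, d, z=1.96):
    """95% CI for +LR on the log scale (assumed reading of the formula)."""
    lr = (a / (a + c)) / (b / (b + d))
    se = math.sqrt((c / (a + c)) / a + (d / (b + d)) / b)
    return math.exp(math.log(lr) - z * se), math.exp(math.log(lr) + z * se)

stats = diagnostic_stats(a=91, b=12, c=11, d=2067)
print(round(stats["lr_pos"]), round(stats["lr_neg"], 2))            # 155 0.11
print(round(post_test_probability(0.05, stats["lr_pos"]), 2))      # 0.89
print(round(post_test_probability(0.05, stats["lr_neg"]), 3))      # 0.006
ci_low, ci_high = lr_pos_ci(91, 12, 11, 2067)
```

The same `post_test_probability` function works for any level of a multilevel test: pass in that level's LR and your own estimate of the pre-test probability.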
Here is a diagnostic test post-test probability nomogram (see article above) if you do not want to calculate probability of disease from the pre-test probability and the likelihood ratio using the web-based diagnostic test calculator.

An example of a completed CAT for a Diagnosis Article.

Link to full text: Full Text, see OVID: http://gateway1.ovid.com/ovidweb.cgi

CRITICALLY APPRAISED TOPIC

TOPIC TITLE: 13C-Urea Breath Test for H. Pylori

Date: 11/7/02

Name of Reviewer(s): J. Hupert

Patient Story (be brief): 11 yo HF with 3 weeks of intermittent epigastric pain, somewhat relieved with food ingestion. Stool heme negative. One of your colleagues suggested the "breath test" for H. pylori.

Answerable Clinical Question (PICO):
    P: In children with abdominal pain suggestive of gastritis,
    I: what is the diagnostic accuracy of the 13C-urea breath test,
    C: compared to biopsy,
    O: in diagnosing H. pylori infection?

The Search: PubMed --> Clinical Queries --> specificity --> helicobacter pylori AND child AND urea breath test

The Study Citation: Kawakami E, Machado RS, Reber M, Patricio FR. 13C-urea breath test with infrared spectroscopy for diagnosing helicobacter pylori infection in children and adolescents. J Pediatr Gastroenterol Nutr. 2002 Jul;35(1):39-43.

Methods (focus on your question): 18 month study period. 82 children evaluated, 75 children included in analysis (see results for the other 7 patients). Age: 6 months - 18 years. All were referred for endoscopy. Culture, histology, and rapid urease test done on the six biopsy specimens from each patient (2 specimens per test). H. pylori infection was defined by a positive culture or both a positive histology and a positive rapid urease test. 13C-urea breath test was performed using an infrared isotope analyzer at baseline and at 30 minutes.

Issues of Validity (see specific questions):

1. Was there an independent, blind comparison with a reference (gold) standard?
The gold standard was either a positive culture or both a positive histology and a positive rapid urease test on biopsy specimens. Histology evaluation does appear to have been independent. No mention was made of who performed the rapid urease test and culture. However, they are relatively objective tests. Blinding was not discussed. There is no mention as to whether the breath test preceded the endoscopy. However, it seems reasonable that it did. If so, blinding may have been helpful to eliminate bias.

2. Was the diagnostic test evaluated in an appropriate spectrum of patients (like those in whom we would use it in practice)? The age range is appropriate. Close to half of all patients were infected, which seems higher than one is likely to find in the primary care setting, suggesting possible spectrum bias. The variety of disease appears sufficiently broad.

3. Was the reference (gold) standard applied regardless of the diagnostic test result? Yes.

Results (focus on your question): 7 patients had discordant gold-standard results (2 tested positive and 5 negative with the 13C-urea breath test) and were excluded from the investigators' analysis.

    Based on 75 patients:
    Sensitivity = 0.97, Specificity = 0.93
    +LR = 14 [5, 42], -LR = 0.03 [0.10, 0.24]
    Assuming a 41% prevalence, post-test probabilities for positive and negative test are 91% [77, 97] and 2% [1, 14], respectively.

    If we assume the worst case scenario for the 7 excluded patients (2 false positives and 5 false negatives), then based on 82 patients:
    Sensitivity = 0.83, Specificity = 0.89
    +LR = 7.7 [3.3, 18], -LR = 0.19 [0.09, 0.39]
    Assuming a 44% prevalence, post-test probabilities for positive and negative test are 86% [72, 93] and 13% [7, 23], respectively.

Applicability (see PICO for Applicability):

(P) Is my patient similar enough to the patients in the study that the evidence can be applied? It seems likely.
(I) Could the intervention in the study be carried out in my setting, and in a way that is similar enough to the way it was conducted in the study? Yes. However, given the range of results, it would be important to evaluate other H. pylori tests which may be easier to perform from an office setting.

(C) Is the comparison in the study similar to the standard of care in my setting? The gold standard used is accepted.

(O) Are the results important enough and are the outcomes measured in the study similar enough to those that are relevant and important in my setting or to my patient? The results are likely to be important. Our H. pylori prevalence is probably closer to 10% of those in whom we would entertain the diagnosis. The likelihood ratios do not change. The post-test probabilities would then be 46 - 61% for a positive test and 0.3 - 2% for a negative test (range of worst case and investigators' case scenarios). H. pylori is a clinically significant diagnosis and those values are clinically significant.

Resolution of Patient Story: The patient tested positive on the breath test, was started on treatment and is doing better 3 weeks later.

CLINICAL BOTTOM LINE: The 13C-urea breath test is a sufficiently accurate test for diagnosing H. pylori infection.

PICO Mnemonic for Applicability

Developed by Alan Schwartz, PhD

(P) Is my patient similar enough to the patients in the study that the evidence can be applied? Would my patient have met the study's inclusion criteria? A valid study may not be applicable to your patient if your patient differs in important ways from the study patients.

(I) Could the intervention in the study be carried out in my setting, and in a way that is similar enough to the way it was conducted in the study? A valid study may not be applicable to your patient if the study intervention is impractical, too costly, requires skills, equipment, or medications that are not locally available, etc.
(C) Is the comparison in the study similar to the standard of care (or for a diagnostic test study, the gold standard) in my setting? A valid study may not be applicable to your patient if you are already using a better standard of care (or for a diagnostic test study, you have a better gold standard) than that to which the study intervention is compared.

(O) Are the outcomes measured in the study similar enough to those that are relevant and important in my setting or to my patient? A valid study may not be applicable to your patient if it reports outcomes that cannot be measured practically in your setting, or that are unimportant to your patient.