evidence-based medicine curriculum for the pediatrics clerkship

Evidence-Based Medicine
The Pediatrics Clerkship
EBM Curriculum
STUDENT WORKBOOK
2004-2005
Sponsored by the Departments of Pediatrics and Medical Education
TABLE OF CONTENTS
Welcome and Introduction
Goals and Competencies
Curricular Activities
Assignment: Answerable Clinical Questions (ACQ)
Assignment: EBM Write-Up
EBM Learning Resources
Competency Levels, Incomplete Grade, and Remediation
Pediatrics Clerkship EBM Page
Therapy and Diagnosis Specific Articles
What is Evidence-Based Medicine
Developing an Answerable Clinical Question
Searching for Answers to Clinical Questions
Therapy: Summary of Approach to Validity and Results
Expanded Evaluation of a Therapy Article
Risk Reduction Calculator
Critically Appraised Topic (CAT) Form
Therapy: Example of a Write-Up Using a CAT Form
Diagnosis: Summary of Approach to Validity, Results and Applicability
Expanded Evaluation of a Diagnostic Test Article
Diagnostic Test Calculator
Basic Statistics for Diagnostic Tests
Diagnostic Test Likelihood Ratio Nomogram
Diagnosis: Example of a Write-Up Using a CAT Form
PICO Mnemonic for Applicability
EVIDENCE-BASED MEDICINE CURRICULUM FOR THE
PEDIATRICS CLERKSHIP
Dear Pediatrics Clerkship Student,
Welcome to Pediatrics and welcome to the Evidence-Based Medicine (EBM)
Curriculum. Below, please find information on goals, competencies, activities,
and responsibilities. This curriculum may be considered a continuation of your
work with EBM in the ECM course. The difference is that you will now have
an opportunity to re-learn and apply the tools of EBM in an actual clinical context.
Each student “EBM PAIR” (see “Curricular Activities”) has been assigned a mentor
(http://ebm.peds.uic.edu/clerkship/ ). Your mentor is eager to work with you on
this important learning program. While there are a few required activities, much
of the learning depends on your initiative. The “EBM Learning Resources” section
was specifically designed as a resource for self-directed learning.
Your Pediatrics Clerkship is designed to facilitate learning of both pediatric
background and foreground information. Background information in pediatrics
encompasses the basic set of facts about child health and disease. Medical
schools are traditionally quite good at providing educational curricula for
acquiring background information. Examples of questions that a learner may ask
to gather background information are:
What are the physical findings in bronchiolitis? How is iron deficiency anemia routinely
diagnosed? What is the known life expectancy for a child with sickle cell anemia? What
is the standard treatment for attention-deficit/hyperactivity disorder? What are the
recommended immunizations for a well baby in the first twelve months of life? What
resources are available to help a depressed adolescent?
Good sources of this kind of information are: involvement in patient care,
mentors, textbooks, and MD Consult. Most of your clerkship activities should
focus on acquiring background information. A smaller proportion of your
activities in the clerkship should focus on acquiring foreground information, which
is the subject of this EBM curriculum.
Foreground information is what is obtained by answering higher level questions.
Examples of this type of information would include clinical evidence for the
therapeutic efficacy of a new or an existing treatment, and the diagnostic
accuracy of a newly proposed diagnostic test. Until recently, there were few, if
any, formal curricula designed to facilitate learning of foreground information.
The EBM Curriculum for the Pediatrics Clerkship is an educational intervention to
address this learning need.
To better illustrate the difference between background and foreground
information gathering, please see the table below. Note that a foreground
question is often a thoughtful follow-up question to a background question.
Background: What are the physical findings in bronchiolitis?
Foreground: In babies with new-onset wheezing, what is the diagnostic accuracy of
the history and physical examination, compared to viral cultures, in diagnosing
bronchiolitis? (a diagnosis question)

Background: How is iron deficiency anemia routinely diagnosed?
Foreground: In children with suspected iron deficiency anemia, what is the
diagnostic accuracy of serum ferritin versus using the MCV and hemoglobin count,
compared to bone marrow aspiration (or some other suitable gold standard), in
diagnosing iron deficiency anemia? (a diagnosis question)

Background: What is the known life expectancy for a child with sickle cell anemia?
Foreground: In children with sickle cell anemia, what is the prognostic
significance of frequent episodes of acute chest syndrome, compared to no
episodes, on probability of survival at age forty? (a prognosis question)

Background: What is the standard treatment for attention-deficit/hyperactivity
disorder?
Foreground: In children with Ritalin-resistant attention-deficit/hyperactivity
disorder, what is the therapeutic efficacy of clonidine, compared to Adderall, as
measured by parental report on the Conners Scale? (a therapy question)

Background: What are the recommended immunizations for a well baby in the first
twelve months of life?
Foreground: In the population of otherwise healthy infants, what is the efficacy
of the pneumococcal vaccine Prevnar, compared to placebo, in preventing
pneumococcal meningitis? (a type of therapy question)

Background: What resources are available to help a depressed adolescent?
Foreground: Among mildly depressed adolescents, what is the therapeutic efficacy
of outpatient cognitive therapy plus antidepressants, compared to outpatient
cognitive therapy alone, in reducing the frequency of depression six months
following initiation of treatment?
Answers to foreground questions are rarely found in textbooks. By their nature,
foreground questions require up-to-date answers. Textbooks are often a number
of years out of date by the time they are published. The online clinical research
bibliographic databases, or study syntheses (meta-analyses, methodologically
sound guidelines) are much more likely to provide answers to foreground
questions.
By achieving the basic competencies of the EBM Curriculum for the Pediatrics
Clerkship, we anticipate that you will have attained a beginner-level ability to
formulate clear foreground questions ("answerable clinical questions") based on
real patient encounters, search for answers (clinical studies), evaluate study
methodology, analyze study results, and approach the application of results to
your patients. These EBM tools are likely to be of aid to you in all of your future
clinical endeavors.
Sincerely,
Jordan Hupert
Jerry Niederman
Larry Roy
Alan Schwartz
for the EBM mentoring group.
GOALS
1. To actively employ the pediatric patient encounter as a forum for clinical learning
2. To answer clinical questions using the clinical research literature
COMPETENCIES
By the end of the Pediatrics Clerkship, the student will demonstrate how to
1. Develop an answerable clinical question (ACQ) from a patient encounter
2. Assess the methodologic validity of diagnosis and therapy research studies
3. Analyze the results of diagnosis and therapy studies, employing the tools of
evidence-based medicine (EBM)
4. Approach the application of therapy and diagnosis study results to specific
patient scenarios
TOOLS NEEDED TO ACHIEVE COMPETENCIES
The Pediatrics Clerkship will provide resources to facilitate student learning of
1. The PICO (Patient, Intervention, Comparison, Outcome) format for ACQ’s
2. PubMed Clinical Queries
3. The definition and application of the major issues of methodologic validity and
applicability for diagnosis and therapy studies
4. The definition and application of the following concepts
a. prevalence
b. pre-test probability
c. sensitivity
d. specificity
e. likelihood ratio
f. post-test probability
g. absolute risk reduction
h. number needed to treat
i. 95% confidence interval
j. statistical and clinical power
k. PICO for applicability
CURRICULAR ACTIVITIES
(See “Competencies” section for remediation of non-completion of activities)
Aside from the pre- and post-tests, the EBM activities will be done in
pairs (with occasional exceptions). Please see the web site for pair
(= EBM PAIR) assignments, as well as mentor assignments.
1. Pre-Test
This will be completed either on-site or electronically. The results of the pre-test
and the post-test will not affect your clerkship grade. The purpose of the tests is
to inform the EBM mentors as to how well students are learning and how well
mentors are facilitating learning of EBM.
2. Answerable Clinical Question (ACQ)
Within the first 3 weeks of the clerkship, each EBM PAIR must submit either a
therapy or a diagnosis ACQ to its mentor via the Pediatric Clerkship EBM Page
http://ebm.peds.uic.edu/clerkship/ . The ACQ is to be based on
a pediatric patient with whom you have had clinical interaction during the
first 3 weeks of the clerkship. If the ACQ is inadequate or deficient, your
mentor will help you revise it or will suggest submitting another ACQ.
3. Search and EBM Write-Up
A. Each EBM PAIR should conduct a search of the online medical
bibliographic databases to find an answer to the ACQ within 72 hours of
receiving approval of its ACQ from the mentor.
B. Send the reference of the article that best answers your question to your
mentor.
C. Within the first 3 weeks of the clerkship, each EBM PAIR is to
arrange at least one face-to-face meeting with its mentor to
discuss the EBM project and write-up (“CAT”, Critically Appraised
Topic). You must complete the EBM write-up using the Critically
Appraised Topic (CAT) form (available on-line at the Pediatric Clerkship
EBM Page http://ebm.peds.uic.edu/clerkship/ ). The final draft must be
submitted via e-mail to your assigned mentor by Sunday, the first day of
the 5th week of the clerkship. Some examples of completed CAT's are
included in this workbook. Students rotating in Pediatrics after the
first clerkship will receive a list of EBM topics completed by students in
earlier clerkships. EBM topics are not to be repeated. Each EBM PAIR is
to work on a unique ACQ or a unique aspect of a previous ACQ.
D. If you choose to answer an ACQ on therapy, you ideally should be able to
generate an NNT. All diagnosis articles will allow generation of LR’s.
Both of these numbers may be calculated using the online calculators
available on the Pediatrics Clerkship EBM Page
http://ebm.peds.uic.edu/clerkship/
4. Post-Test
A post-test (similar in form to the pre-test) will be taken at the end of the
clerkship just prior to the shelf exam.
EBM Learning Resources
In addition to the workshop handbook, the following resources are
available to help you achieve the curricular competencies:
A. http://www.cche.net/usersguides/main.asp (This site has the JAMA
collection of articles on EBM including those on diagnosis and therapy)
B. http://ebm.peds.uic.edu (Location of EBM Consult Service, EBM
calculators and brief diagnostic test tutorials. Developed by Dr. Alan
Schwartz)
C. http://ebm.peds.uic.edu/clerkship/ Pediatrics Clerkship EBM Page
D. Evidence-Based Medicine. How to practice and teach EBM. David L.
Sackett, et al. Second edition. 2000. Churchill Livingstone. (Available
from the UIC bookstore or Amazon.com
http://www.amazon.com/exec/obidos/ASIN/0443062404/qid=1021228742/sr=8-1/ref=sr_8_71_1/002-4147953-8775251 )
E. Users’ Guides to the Medical Literature. Gordon Guyatt and Drummond
Rennie. 2002. AMA Press. (Available from Amazon.com
http://www.amazon.com/exec/obidos/ASIN/1579471749/qid=1022870085/sr=2-2/103-7432821-2643037 )
F. http://bmj.com/cgi/content/full/315/7107/540 (A paper on diagnostic tests
by Trisha Greenhalgh. In her list of questions, she combines issues of
methodologic validity and applicability. For the sake of uniformity, when
doing your write-up, use the questions of validity given to you at the
workshop)
G. http://bmj.com/cgi/content/full/315/7105/422 (General paper on statistics,
also by Dr. Greenhalgh. Particularly useful for confidence intervals)
H. http://www.cebm.utoronto.ca/practise (Good reference on answering
clinical questions, including diagnostic test and therapy questions)
I. http://www.med.ualberta.ca/ebm/ebm.htm (Evidence-Based Medicine
Tool Kit. Includes the validity questions with links to explanations)
J. http://www.intensivecare.com/Tutorial.html#anchor1214386 (an online
tutorial)
GRADING
Students will receive a grade of “Achieved Competency,” “Did Not Achieve
Competency,” or “Incomplete.” EBM curriculum grades will be taken into consideration
when determining the “Problem Solving” component of the clinical grade. Students who
receive “Incomplete” for the EBM curriculum will receive an “Incomplete” for the clerkship
until it is remediated.
COMPETENCY LEVELS, “INCOMPLETE”, AND REMEDIATION

ACTIVITY: Pre-test
Competency level: NA
Clinical grade of (non-excused) “Incomplete”: NA

ACTIVITY: ACQ’s
Competency level: Submits an appropriate ACQ in the appropriate PICO format
Clinical grade of (non-excused) “Incomplete”: Not submitted as required

ACTIVITY: Written assignment
Competency level:
1. Submits EBM write-up on time using CAT form in which all sections are completed.
2. Achieves competency in discussion of
A. Validity: Addresses at least 3 of the validity questions (that are enumerated
in the Student Handbook) for therapy and diagnostic test clinical trials
B. Results (75% accuracy): For a therapy study, reports results in terms of CER,
EER, ARR, NNT, 95% CI’s for ARR and NNT, where applicable. For a diagnostic
test study, reports results in terms of pre-test probabilities, LR’s, post-test
probabilities, 95% CI’s for LR and post-test probability.
C. Applicability (75% accuracy): Addresses issues using “PICO for Applicability”
or standard questions in the “Summary” sections (see Table of Contents).
Clinical grade of (non-excused) “Incomplete”: Does not complete EBM write-up by
deadline.

ACTIVITY: Post-test
Competency level: NA
Clinical grade of (non-excused) “Incomplete”: NA

REMEDIATION OF (NON-EXCUSED) CLINICAL GRADE OF “INCOMPLETE”: Completes
requirements
The link to this website is: http://ebm.peds.uic.edu/clerkship/
Pediatrics Clerkship EBM Page
This site serves students in the Pediatrics clerkship.
Places to go from here:
• Welcome and introduction
• Mentor assignments and list of mentors
• Student workbook
• Submit ACQ
• Submit CAT
• Online tools
o Diagnostic test calculator
o Number needed to treat/harm calculator
This web site is a joint project of Dr. Alan Schwartz
of the Departments of Medical Education and
Pediatrics and Dr. Jordan Hupert of the Department
of Pediatrics at UIC.
Therapy and Diagnosis Specific Article List
http://www.cche.net/usersguides/therapy.asp
Evidence-Based Medicine: A New Approach to Teaching the Practice of
Medicine
How to Use an Article About Therapy or Prevention (PAY PARTICULAR
ATTENTION TO ABSOLUTE RISK REDUCTION AND NUMBER NEEDED TO
TREAT)
How to Use an Article About a Diagnostic Test
Developing an Answerable Clinical Question
(Based on Evidence-Based Medicine, 1997, Churchill Livingstone)
Learning how to ask an answerable clinical question (ACQ) is the first step in applying
the results of clinical research to patient care. A well-formulated ACQ will save you
time: the search for evidence will be an efficient, sensibly-honed process, rather than a
chaotic search for vaguely relevant clinical trials.
There are four parts to an ACQ:
1) The patient’s problem
2) The potential intervention (test, treatment, prognostic factor, etiology, etc.)
3) Comparison to another potential intervention (if necessary)
4) The outcomes of interest.
Here are four examples of ACQ’s broken down into their four component parts:
Diagnosis
1) In an otherwise healthy seven-year-old boy with a sore throat,
2) how does the clinical exam
3) compare to throat culture
4) in diagnosing group A β-hemolytic streptococcal infection?
Treatment
1) In infants with West Syndrome (infantile spasms),
2) would use of vigabatrin
3) compared to ACTH therapy
4) result in faster and more efficient seizure reduction?
Prognosis
1) In children with Down syndrome,
2) is IQ an important prognostic factor
4) in predicting Alzheimer’s later in life?
(Notice that this question did not have a comparison component.)
Causation/Etiology
1) Controlling for confounding factors, do otherwise healthy children
2) exposed in utero to cocaine,
3) compared to children not exposed,
4) have an increased incidence of learning disabilities at age six years?
Exercise:
Develop ACQ’s for the following cases:
1. Your attending in clinic wants you to start penicillin on a 3yo girl with a sore
throat, fever, rhinorrhea and cough. He says the chance of strep in this patient is
fairly high.
2. A pregnant woman is visiting your office for a pre-natal pediatric visit. She says
that she heard that the injectable form of vitamin K, which is given routinely to
babies soon after birth, may cause cancer later in life.
3. The mother of a child with frequent febrile seizures is insisting that her son be
started on an anticonvulsant. Her concern is that she “just can’t deal with any
more seizures.”
4. An 18yo immigrant, who contracted hepatitis C as a baby while receiving a blood
transfusion for unknown reasons, wants to know if he is likely to develop hepatic
carcinoma.
SEARCHING FOR ANSWERS TO
CLINICAL QUESTIONS
Busy practicing pediatricians (even academic pediatricians) need search
methods that are both fast and sufficiently reliable to retrieve high quality articles
that specifically answer their questions. We will discuss one search service and
two databases that attempt to fulfill these two criteria.
1. PUBMED CLINICAL QUERIES
PubMed is a free on-line search service of the National Library of Medicine that
searches the MEDLINE biomedical bibliographic database. PubMed offers
several search options including an option called Clinical Queries. The Clinical
Queries option is based on the results of work by Dr. Brian Haynes, et al (Haynes
RB, Wilczynski N, McKibbon KA, Walker CJ, Sinclair JC. Developing Optimal
Search Strategies for Detecting Clinically Sound Studies in MEDLINE. J Am Med
Informatics Assoc 1994;1:447-58). [The last author, Dr. Jack Sinclair, is a
neonatologist.] They developed search terms which would result in the retrieval
of the most methodologically sound articles in four categories: diagnosis,
therapy, prognosis, and etiology. The various search terms were determined as
the result of a diagnostic test experiment. Dr. Haynes and his team hand
searched ten general medicine and internal medicine journals published between
the years 1986-1991. They picked out those studies which they felt were of the
highest methodological quality. This hand-search method became the "gold"
standard and the chosen articles became the results of the "gold" standard test.
The diagnostic test was the computer search of MEDLINE. Every potentially
useful search term and all combinations of these terms were used to find the
hand-picked articles. More than 100,000 combinations were tried by computer to
find the "gold" standard studies. Emerging from these many combinations were
two sets of search terms for each of the four categories: those which maximized
specificity and those which maximized sensitivity. The search terms that
maximized specificity (i.e., specificity of the search) minimized false positives.
Thus, few of the articles retrieved fell outside the "gold standard" group.
Unfortunately, as one maximizes specificity, sensitivity suffers: false negatives
increase. In terms of the search, this means that some "gold standard" articles are
missed. The search terms that maximized sensitivity cast a wider net over
MEDLINE. That search retrieves a larger share of the "gold standard" articles
(increased sensitivity means fewer false negatives); however, specificity
suffers, leading to retrieval of more articles that are not among the "gold standard"
group. Both sets of search terms, one maximizing specificity and one maximizing
sensitivity, have been embedded in PubMed Clinical Queries for each of the four
categories.
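The trade-off described above can be made concrete with a small Python sketch that treats a search strategy as a diagnostic test scored against the hand-searched "gold standard." All of the counts below are hypothetical, chosen only for illustration; they are not figures from the Haynes study, and the function name `search_accuracy` is likewise an assumption.

```python
def search_accuracy(gold_retrieved, total_retrieved, gold_total, database_total):
    """Score a search strategy against a hand-searched gold standard.

    gold_retrieved  -- retrieved articles that are in the gold-standard set
    total_retrieved -- all articles the search returned
    gold_total      -- size of the gold-standard set
    database_total  -- number of articles in the database searched
    """
    true_pos = gold_retrieved
    false_pos = total_retrieved - gold_retrieved
    false_neg = gold_total - gold_retrieved
    true_neg = database_total - gold_total - false_pos
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical database: 10,000 articles, 100 of them "gold standard".
narrow = search_accuracy(gold_retrieved=50, total_retrieved=60,
                         gold_total=100, database_total=10_000)
broad = search_accuracy(gold_retrieved=90, total_retrieved=590,
                        gold_total=100, database_total=10_000)

print(f"specific search:  sensitivity {narrow[0]:.2f}, specificity {narrow[1]:.3f}")
print(f"sensitive search: sensitivity {broad[0]:.2f}, specificity {broad[1]:.3f}")
```

In this made-up example the specific search returns a short list that misses half of the gold-standard articles, while the sensitive search finds most of them at the cost of several hundred extra articles to sift through.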
How to Use PubMed Clinical Queries
Go to the "new" PubMed Clinical Queries. Decide on the type of evidence you
are looking for (therapy, diagnosis, etiology, prognosis [and click that]). Click
either "specificity" or "sensitivity". In general, it is best to start with specificity (the
default). This will give you the shortest list of articles, one of which, hopefully, will
be applicable. Enter a few search terms. Qualifying words such as "AND" or "OR"
should be capitalized. MeSH headings (Medical Subject Headings) may be used.
Do not add words such as therapy, randomized, or blind. The search engine will
automatically incorporate similar high-efficiency terms.
Example
To appreciate the value of PubMed Clinical Queries, let us search for evidence
that will answer the following question. In babies with colic, what is the
therapeutic efficacy of any treatment compared to no specific therapy in
decreasing crying spells (as determined by the parents)?
First, go to "regular" PubMed (the new version). Type in "colic" AND "infant" AND
"therapy." Then click search. Note the number of articles you find (approximately
309). Next go to Clinical Queries. Click "therapy" and "sensitivity." Type in "infant"
AND "colic" and then search. Notice the number of retrieved articles has
decreased (approximately 113). These also should be better quality articles, on
average, than those retrieved with regular PubMed. Click "details" near the top of
your search results. At the bottom of the new screen is the actual query with the
embedded search terms developed by Dr. Haynes and his team to filter in
methodologically sound studies. Now go back to Clinical Queries Search Page
and click "specificity" and search. This type of search is the most restrictive,
filtering in only those studies of the highest quality (but possibly missing some).
This search method produces the smallest quantity of studies (approximately 27).
However, notice that most, if not all, of the studies listed are prospective,
randomized studies.
Therapy: Reviewing the Evidence
Adapted from Sackett DL, Straus SE, Richardson WS, Rosenberg W, and Haynes RB, EVIDENCE-BASED
MEDICINE: How to Practice and Teach EBM. 2nd Ed. Churchill Livingstone. 2000

Are the results likely to be valid? If NO then STOP.
• Was the assignment of patients to treatment randomized?
• Was follow-up sufficiently long and complete?
• Were all patients analyzed in the groups to which they were
randomized (intention to treat)?
• Were patients and clinicians kept blind to treatment?

Are the results important? If NO then STOP.

                     Adverse outcome
Treatment Group      Present    Absent    Totals
Drug                 a          b         a+b
Placebo              c          d         c+d
Totals               a+c        b+d       a+b+c+d

Control Event Rate (CER) = c/(c+d)
Experimental Event Rate (EER) = a/(a+b)
Absolute risk reduction (ARR) = CER – EER
Number needed to treat (NNT) = 1/ARR

Are the results applicable to my patient? If NO then STOP.
• Is our patient so different from those in the study that its results
cannot apply?
• Is the treatment feasible in our setting?
• What are our patient’s potential benefits and harms from the
therapy?
• What are our patient’s values and expectations for both the
outcome we are trying to prevent and the treatment we are
offering?
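The four formulas in the 2x2 table above can be sketched in a few lines of Python. The cell names a, b, c and d follow the table; the function name `risk_reduction` and the example counts are hypothetical and serve only to show the arithmetic.

```python
def risk_reduction(a, b, c, d):
    """Compute CER, EER, ARR and NNT from the cells of a therapy 2x2 table.

    a, b -- drug group with / without the adverse outcome
    c, d -- placebo group with / without the adverse outcome
    """
    eer = a / (a + b)   # Experimental Event Rate
    cer = c / (c + d)   # Control Event Rate
    arr = cer - eer     # Absolute Risk Reduction
    # NNT is undefined when the two event rates are identical.
    nnt = 1 / arr if arr != 0 else float("inf")
    return {"CER": cer, "EER": eer, "ARR": arr, "NNT": nnt}


# Hypothetical trial: 10 of 100 treated and 20 of 100 control patients
# have the adverse outcome.
result = risk_reduction(a=10, b=90, c=20, d=80)
for name, value in result.items():
    print(f"{name} = {value:.2f}")
```

With these invented counts the ARR is 0.10, so about 10 patients would need to be treated to prevent one adverse outcome.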
Evaluating an Article about Therapy
You are in your office. It is 6:00 P.M. and your day is over. As you pack up your
briefcase to head home, the nurse brings in a 7-year-old boy with a history of
moderate persistent asthma. His mother says he's been coughing and wheezing
since yesterday and he's getting worse. On exam, the patient is in moderate
respiratory distress with a RR=40. His mentation is normal. He has bilateral
wheezing with fair air entry and moderate subcostal/intercostal retractions. His
oxygen saturation on room air is 85% and his peak expiratory flow rate (PEFR) is
45% of predicted for height. You start him on oxygen and albuterol by
nebulization. After 30 minutes and 2 albuterol treatments there is only mild
improvement in his wheezing and he is still tachypneic (34) and hypoxic (90%)
and his PEFR is only 50% expected. This patient is a definite admission (after
stabilization in the ER). As you are waiting for the ambulance, you recall hearing
about a study using magnesium in moderate exacerbations of asthma and
wonder what is the likelihood that your patient would benefit from the magnesium
treatment. You decide to formulate an answerable clinical question and find an
answer as soon as your patient is transferred to the ER.
The pediatrician is faced with therapy decisions many times each day.
Evaluating evidence for or against new therapies (or older, unproved therapies)
that may be potentially beneficial is part of providing a high level of care for our
patients. The example which follows will outline one approach to answering a
clinical question about therapy.
The Question
P: In children with an acute moderate exacerbation of asthma,
I: what is the therapeutic efficacy of magnesium,
C: compared to placebo (or no treatment),
O: in improving PEFR (and possibly saving an admission)?
The Search
You quickly go to your computer, call up PubMed, click "Clinical Queries", and
begin your search for evidence of efficacy (and safety) of magnesium in patients
with moderate exacerbation of asthma. You click "therapy" and "specificity" and
enter the words "magnesium AND asthma AND child." Eleven studies are
retrieved. The fourth in the list looks promising: Ciarallo L, Sauer AH, Shannon
MW. "Intravenous magnesium therapy for moderate to severe pediatric asthma:
results of a randomized placebo-controlled trial." J Pediatr 1996;129:809-14.
You quickly download a copy of the article from OVID and briefly analyze it.
Objective
"To evaluate the efficacy of intravenous magnesium therapy for moderate to
severe asthma exacerbations in pediatric patients."
Methods
All children 6-14 years of age presenting to the ER of Children's Hospital in
Boston 9/93 - 12/94 were evaluated for the study. Inclusion criteria: PEFR <
60% predicted and an IV placed for reasons other than the study. Exclusion
criteria: fever > 38.5 degrees C, systolic BP < 25% for age, recent use of
theophylline, history of cardiac, renal, or pulmonary disease, and pregnancy.
Patients were randomized in what appears to have been a double blind fashion
to either 100 ml of 25 mg/kg (maximum 2 gm) MgSO4 or saline. All patients were
given 2 mg/kg of methylprednisolone by IV.
Validity
Primary Issues:
1. Was the assignment of patients to treatments randomized?
Yes.
2. Were all patients who entered the trial properly accounted for and attributed
at its conclusion?
Yes.
3. Was follow-up complete?
Yes.
4. Were all patients analyzed in the groups to which they were randomized?
Yes.
This question is important for very practical purposes. Consider this (fictitious)
example. Two hundred patients are randomly assigned to receive either
coumadin (the experimental medication) or aspirin to prevent thrombosis in
children under 3 years of age following a Fontan procedure to treat single
ventricle congenital heart disease. One hundred patients end up in each group.
After twelve months the investigators found that 16 patients assigned to the
coumadin group and 16 patients assigned to the aspirin group had a thrombotic
event - no apparent improvement with coumadin. However, the investigators
discovered that within the first 2 months of the study, 35 of the patients assigned
to the coumadin group had stopped taking it - for a variety of reasons - and
started taking aspirin. When they analyzed only those who actually took the
coumadin for a full 12 months, only 2 patients (of the remaining 65) actually had
a thrombotic event. The investigators did their calculations in two ways: 1) they
analyzed the patients in the groups to which they had been originally
randomized, and 2) they analyzed the patients in the groups in which they ended
the study. The first approach is called an intention to treat analysis. In general,
this is the type of analysis that has meaning to practicing physicians. The
intention to treat approach is much more typical of the "real life" situation of
clinical practice where patients take, or do not take, their medicine for a variety of
reasons. The practical difference in the example is that the first way of
calculating does not demonstrate a benefit of coumadin, whereas the second way of
calculating does. The conclusion from a practical clinical point of view is that use
of coumadin is not clinically effective. If ways could be found to increase
compliance, another study could be performed to retest coumadin versus aspirin.
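The two ways of calculating can be worked through with the numbers from the fictitious coumadin example. The sketch below is one reading of that example: it assumes that the 14 events not seen among the 65 patients who stayed on coumadin occurred in the 35 patients who switched, and that those 35 are counted with the aspirin group in the second analysis. The variable names are illustrative only.

```python
def event_rate(events, patients):
    """Proportion of patients in a group who had a thrombotic event."""
    return events / patients


# 1) Intention to treat: analyze patients in the groups to which they
#    were randomized (100 per group, 16 events in each).
itt_coumadin = event_rate(16, 100)
itt_aspirin = event_rate(16, 100)
itt_arr = itt_aspirin - itt_coumadin

# 2) Groups in which patients ended the study: 35 patients moved from
#    coumadin to aspirin, leaving 65 on coumadin with 2 events.
switchers = 35
stayed_on_coumadin = 100 - switchers              # 65 patients
events_on_coumadin = 2
events_in_switchers = 16 - events_on_coumadin     # 14 events (assumed, see above)
end_coumadin = event_rate(events_on_coumadin, stayed_on_coumadin)
end_aspirin = event_rate(16 + events_in_switchers, 100 + switchers)
end_arr = end_aspirin - end_coumadin

print(f"Intention to treat: coumadin {itt_coumadin:.1%}, "
      f"aspirin {itt_aspirin:.1%}, ARR {itt_arr:.1%}")
print(f"As ended the study: coumadin {end_coumadin:.1%}, "
      f"aspirin {end_aspirin:.1%}, ARR {end_arr:.1%}")
```

Under these assumptions the intention to treat analysis shows no difference between the groups, while the second analysis shows an apparent benefit that reflects who stayed on the drug as much as the drug itself.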
Secondary Issues:
5. Were patients, health workers, and study personnel "blind" to treatment?
It does appear from the description of the methods that they were.
6. Were the groups similar at the start of the trial?
Not completely. Table 1 demonstrates that patients randomized to the
magnesium group had a statistically significantly lower baseline PEFR than those
randomized to placebo. This group, therefore, had more room for improvement. Since
improvement from baseline was an outcome variable, the results would be
biased in favor of the magnesium group.
7. Aside from the experimental intervention, were the groups treated equally?
Yes.
Having decided that the study is at least minimally valid, you apply to the results
the very basic evidence-based medicine statistical tools you were taught as a
resident.
Results
The primary outcome was measured as percent improvement in PEFR 80
minutes after initiation of drug infusion. The authors found that the group given
magnesium showed a significant improvement from their entry-level PEFR: 46%
vs. 16% in the placebo group (p=0.05). No significant side effects were noted
(in particular, no BP effects, though the small sample size precludes conclusions
about uncommon significant side effects).
This result, while interesting, has limited clinical meaning. It
doesn't answer the question, "What is the likelihood that my patient will benefit
from the treatment?" Fortunately, the authors give us a patient-oriented outcome
for PEFR.
At the end of the observation period 4 of the 15 patients in the magnesium group
(27%) vs. 11 of the 16 patients in the placebo group (69%) had a PEFR < 60%
predicted.
In order to calculate a clinically useful statistic, let us place data into a 2x2 table.
            PEFR > 60%    PEFR < 60%    Total
Mg +        11            4             15
Mg -        5             11            16
Total                                   31
The absolute risk reduction (ARR) is the rate of disease in the control group
minus the rate of disease in the treatment group.
In our example, the rate of "disease" is the percentage of patients with PEFR <
60% at the conclusion of the observation period.
For the Placebo group = 11/16 = 0.69
For the Mg group = 4/15 = 0.27
The absolute risk reduction = 0.69 - 0.27 = 0.42
The 95% confidence interval (CI) for the ARR is [0.10, 0.74], and the ARR is
therefore statistically significant (since the interval does not include 0). [Click on
"ARR/NNT" found in the sidebar to see how the 95% CI for both the ARR and NNT are
calculated.]
Thus, Mg treatment leads to an absolute reduction of 42 percentage points in the
proportion of patients with a PEFR < 60% at the end of the observation period. In
addition, if this study were repeated many times and a 95% CI were computed each
time, about 95% of those intervals would contain the true ARR. A looser way of
stating this is that you can be 95% confident that the true ARR lies somewhere
between 0.10 and 0.74.
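The ARR arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: it assumes a normal-approximation (Wald) interval, which reproduces the [0.10, 0.74] quoted here, but the workbook does not say which method the online calculator uses. The function name `arr_with_ci` is made up for this illustration.

```python
import math

def arr_with_ci(control_events, control_total, treated_events, treated_total, z=1.96):
    """Absolute risk reduction with a normal-approximation (Wald) 95% CI.

    Assumption: Wald interval; the online calculator may use another method.
    """
    cer = control_events / control_total   # control event rate
    eer = treated_events / treated_total   # experimental event rate
    arr = cer - eer
    se = math.sqrt(cer * (1 - cer) / control_total
                   + eer * (1 - eer) / treated_total)
    return arr, arr - z * se, arr + z * se

# PEFR < 60% at end of observation: 11/16 on placebo, 4/15 on magnesium
arr, low, high = arr_with_ci(11, 16, 4, 15)
print(f"ARR = {arr:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

An interval that includes 0 would mean the data are compatible with no difference between the groups.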
There is another, perhaps even more clinically meaningful statistic called the
number needed to treat (NNT).
The NNT tells you how many patients you would have to treat to see an
effect of the drug (over and above control).
NNT = 1/ARR
In our example, the NNT = 1/0.42 = 2.4. Its 95% CI is [1, 10], and it is therefore
statistically significant (since the CI does not include infinity [1/0]). You have a
personal rule of thumb: for mildly invasive treatments with no significant side
effects and at least moderately significant benefits (this magnesium treatment
seems to fit all these criteria), you need to be 95% confident that you will not
have to treat more than 25 patients to benefit one over and above control. In this
case, ~2 patients must be treated with magnesium to benefit one patient over
and above control. Also, you can be 95% confident that the true NNT lies
somewhere between 1 and 10. Therefore, assuming the side effects are mild
and/or uncommon (hard to know from this small study, though there were no
serious side effects noted), this result appears to meet all your criteria for use.
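As a rough sketch of the NNT step, the code below inverts the ARR and its confidence limits and then applies the 25-patient rule of thumb described above. The function name and the constant `MAX_ACCEPTABLE_NNT` are hypothetical labels for this illustration, and the inputs are the rounded ARR values from the text.

```python
def nnt_from_arr(arr, arr_low, arr_high):
    """NNT = 1/ARR. The NNT limits are the reciprocals of the ARR limits, swapped."""
    if arr_low <= 0:
        # An ARR interval that reaches 0 gives an NNT interval that reaches infinity.
        raise ValueError("ARR interval includes 0; NNT interval is unbounded")
    return 1 / arr, 1 / arr_high, 1 / arr_low

MAX_ACCEPTABLE_NNT = 25  # the personal rule of thumb from the text (an assumption, not a standard)

nnt, nnt_low, nnt_high = nnt_from_arr(0.42, 0.10, 0.74)
print(f"NNT = {nnt:.1f}, 95% CI [{nnt_low:.1f}, {nnt_high:.1f}]")
print("Meets rule of thumb:", nnt_high <= MAX_ACCEPTABLE_NNT)
```

The unrounded limits are about 1.4 and 10; the text reports them as whole patients.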
In the same study, the authors reported a statistically significant decreased
admission rate among the patients treated with magnesium. Eleven patients in
the Mg group and 16 patients in the placebo group were admitted.
              ADMITTED
Mg        +         -         Total
+         11        4         15
-         16        0         16
Total                         31
What is the ARR (the risk in this case being admission to the hospital)? Answer:
16/16 - 11/15 = 0.27 [0.05, 0.43].
What is the NNT? Answer: 1/0.27 = 3.7 [2, 22].
Thus, only four patients would need to be treated to prevent one admission, over
and above control, with 95% confidence that fewer than 25 patients would have to
be treated to detect a benefit (no admission) in one patient, again fulfilling your
criteria for use.
Dr. Alan Schwartz has developed an online calculator which can determine ARR
(with 95% CI) as well as NNT (with 95% CI). This should make your life a bit
easier. Click http://araw.mede.uic.edu/cgi-bin/nntcalc.pl to try it.
Authors' Conclusions
"Children treated with IV magnesium for moderate to severe asthma
had...greater improvement in short-term pulmonary function [compared to
controls]...suggesting a role for the agent as an adjunct in the treatment of such
patients."
Questions/Concerns/Your Conclusions/Applicability
There are a couple of important points. The first is that the randomization
procedure didn't work: the baseline PEFRs of the two groups were different.
One way to overcome this "random" problem is to randomize more patients. This
would have been a desirable thing to do, as the baseline PEFR favored the
experimental group. On the other hand, the results relating to admissions were
certainly biased away from the experimental group, as all of the patients in the
study were slated for admission. It was the significant (statistically and clinically)
improvement in the treatment group which prevented 27% of the admissions in
that group, whereas all control patients were admitted. The results of this study
certainly favor the use of magnesium based on your NNT cutoff criteria. It would
be helpful to see what other well-designed magnesium studies demonstrate in
children, especially studies in which the randomization worked. It would also be
important to note any significant adverse effects. To the extent one can tell, the
patients in this study (those who visited the ER of the Children's Hospital of
Boston) are similar to those you see (though you tend to see only a few patients
with moderate to severe exacerbations in your office). Therefore, from a patient
"type" standpoint, the results appear to be applicable.
Resolution of Your Patient's Story
Your patient was admitted to the general pediatric ward. He was discharged
after three days in good condition. Early the next week you set up a meeting with
the ER staff to begin evaluating the possibility of employing magnesium as part
of the outpatient therapy for asthma exacerbations.
Risk Reduction Calculator
To use, go to: http://araw.mede.uic.edu/cgi-bin/nntcalc.pl
or link through the Pediatrics Clerkship EBM Page http://ebm.peds.uic.edu/clerkship/
Enter your data in one of these ways:
Numbers of patients who experience good and bad outcomes under the new therapy
and control therapy:
                 Good Outcome   Bad Outcome   Total
  New therapy
  Control
  Total
  [Compute]
or
Type of event and event rates (and, optionally, sample size):
  The events I'm interested in are: adverse
  Control event rate (%):
  Experimental event rate (%):
  Optional: # of patients in control group:
  Optional: # of patients in experimental group:
  [Compute] [Clear Entries]
This is what the Critically Appraised Topic (CAT) form looks like (it expands as you fill
it in). Please access it at the Pediatric Clerkship EBM Page
http://ebm.peds.uic.edu/clerkship/ and click on “Submit CAT.”
CRITICALLY APPRAISED TOPIC
TOPIC TITLE
Date
Name of Reviewer(s)
Patient Story (be brief)
Answerable Clinical Question (PICO)
The Search
The Study Citation
Methods (focus on your question)
Issues of Validity (see specific questions)
Results (focus on your question)
Applicability (see PICO for Applicability)
Resolution of Patient Story
CLINICAL BOTTOM LINE
An example of a completed CAT for a Therapy Article. This is the link to the full text:
http://bmj.com/cgi/content/full/314/7097/1800?maxtoshow=&HITS=10&hits=10&RESULTFORMAT=&titleabstract=gingivostomatitis&searchid=1024936087142_10594&stored_search=&FIRSTINDEX=0&fdate=1/1/1997&resourcetype=1,2,3,4,10
CRITICALLY APPRAISED TOPIC
TOPIC TITLE
Acyclovir and Gingivostomatitis
Name of Reviewer
Jordan Hupert
Date
1/20/01
Patient Story
13 month old girl with fever and sores in her mouth for 2 days.
She is not clinically dehydrated, but she has been drinking less
and appears uncomfortable.
Answerable Clinical Question (PICO)
In children with probable herpes gingivostomatitis, what is the
therapeutic efficacy of oral acyclovir, compared to placebo, on
rate of cure?
The Search
PubMed --> Clinical Queries --> Therapy --> Specificity -->
gingivostomatitis AND acyclovir
The Study Citation
Amir J, Harel L, Smetana Z, Versano I. Treatment of herpes
simplex gingivostomatitis with aciclovir in children: a
randomised double blind placebo controlled study. BMJ.
1997;314:1800-3.
Methods (focus on your question)
- 1-6 years old. Ambulatory/ER setting.
- Within 3 days of symptom onset.
- Randomized/placebo/double blind.
- 15 mg/kg, 5x/day x 7 days.
- 72 children randomized, 62 HSV+, 1 dropped out
- 61 HSV+ = study population
Issues of Validity
- Randomized Trial? Yes
- Patients accounted at end of trial? Yes
- Follow-up long enough? Yes
- Intention to treat? Yes for 61. See Results for 72.
Results (focus on your question)
Lesion resolution after 7 days of treatment:
61 patients: CER: 21/30, EER 2/31, ARR = 0.63 [0.45, 0.82],
NNT = 2 [1, 2].
72 patients (worst case scenario): CER: 21/36, EER 7/36, ARR =
0.39 [0.18, 0.60], NNT = 3 [2, 5].
Other results: (acyclovir vs placebo, medians)
- Oral lesions: 4 vs 10 days
- Fever: 1 vs 3 days
- Drinking difficulties: 3 vs 6 days
Applicability, Limitations, Concerns
No serious side effects and same in both groups.
- Would less frequent or shorter course acyclovir administration
work?
- Study compliance was good. Real life compliance?
- Extrapolation to older/younger children?
- Treatment effect after 3 days of symptoms?
- Uncommon serious side effects --> check PDR and literature
Resolution of Patient Story
Acyclovir prescribed. Patient improved in 2 days, healed in 7
days.
CLINICAL BOTTOM LINE
Acyclovir is effective in treating children with suspected herpetic
gingivostomatitis.
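The worst-case figures in this CAT can be explored with a short sketch. It assumes, from the counts 21/36 and 7/36, that the 6 extra placebo children were counted as event-free and the 5 extra acyclovir children were all counted as having the event; that reading is inferred from the numbers, not stated in the CAT. The interval is again a normal-approximation (Wald) interval, and `arr_ci` is a hypothetical helper name.

```python
import math

def arr_ci(control_events, control_n, treated_events, treated_n, z=1.96):
    """ARR with a Wald 95% CI (an assumed method, used here for illustration)."""
    cer = control_events / control_n
    eer = treated_events / treated_n
    arr = cer - eer
    se = math.sqrt(cer * (1 - cer) / control_n + eer * (1 - eer) / treated_n)
    return arr, arr - z * se, arr + z * se

# Analysis of the 61 HSV-positive children who completed the trial
observed = arr_ci(21, 30, 2, 31)

# Worst case for all 72 randomized children (assumed allocation, see lead-in):
# 6 added placebo children without the event, 5 added acyclovir children with it
worst = arr_ci(21, 30 + 6, 2 + 5, 31 + 5)

for label, (arr, low, high) in (("61 patients", observed), ("72 patients", worst)):
    print(f"{label}: ARR = {arr:.2f} [{low:.2f}, {high:.2f}], NNT = {1 / arr:.1f}")
```

The results agree with the CAT to within rounding, and the treatment effect remains statistically significant even under the pessimistic assumption.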
Diagnosis: Reviewing the Evidence
Adapted from Sackett DL, Straus SE, Richardson WS, Rosenberg W, and Haynes RB, EVIDENCE-BASED
MEDICINE: How to Practice and Teach EBM. 3rd Ed. Churchill Livingstone. 2000
Are the results likely to be valid? If NO then STOP.
- Was there an independent, blind comparison with a reference
(“gold”) standard?
- Was the diagnostic test evaluated in an appropriate spectrum of
patients (like those in whom we would use it in practice)?
- Was the reference (“gold”) standard applied regardless of the
diagnostic test result?
Are the results important? If NO then STOP.
                       Disease
Test Result     Present    Absent    Totals
Positive        a          b         a+b
Negative        c          d         c+d
Totals          a+c        b+d       a+b+c+d
Sensitivity = a/(a+c)
Specificity = d/(b+d)
Positive predictive value = a/(a+b)
Negative predictive value = d/(c+d)
LR (+) = [a/(a+c)]/[b/(b+d)]
LR (-) = [c/(a+c)]/[d/(b+d)]
Are the results applicable to my patient? If NO then STOP.
- Is the diagnostic test available, affordable, accurate, and precise in
our setting?
- Can we generate a clinically sensible estimate of our patient’s pre-test probability?
- Will the resulting post-test probability affect our management and
help our patient?
EVALUATING AN ARTICLE ABOUT DIAGNOSIS
Tanya is an 8-month-old girl, otherwise normal, who is seeing you with what her
mother describes as two days of fever greater than 102° F. Physical exam is
unremarkable except for a temperature of 103° F in your office and mild URI
symptoms. You decide to obtain a urinalysis and urine culture (by catheter) and
send it to the nearby lab. Two hours later you get a call that there were 7 WBCs
seen as determined by a hemocytometer and no bacteria on gram stain. The
interpretation by the lab is "negative urinalysis, culture pending." You've often
wondered how accurate pyuria and bacteriuria (or their absence) are in predicting
a urinary tract infection in infants. You therefore formulate your question in
preparation for a search.
The Question:
P: In children under two years of age with fever and no obvious source of
infection,
I: what is the diagnostic accuracy of pyuria and bacteriuria,
C: compared to urine culture,
O: in the diagnosis of urinary tract infection (UTI)?
The Search:
You quickly click onto PubMed Clinical Queries, click "diagnosis," "specificity,"
and type in "pyuria AND bacteruria AND infant." Fourteen articles come up,
including one that seems to be right on target: Hoberman A, Wald ER, Reynolds
EA, et al. "Pyuria and bacteriuria in urine specimens obtained by catheter from
young children with fever." J Pediatr 1994;124:513-9.
Objective:
The study had a number of objectives. The objective which addressed your
question was to "...assess the validity of microscopic urinalysis for diagnosis of
UTI."
Methods:
The patient population was made up of children under two years of age from
whom a urine specimen was obtained for urinalysis and culture. WBCs in urine
were counted on a hemocytometer from uncentrifuged urine. Gram stains were
done on uncentrifuged urine. A positive urinalysis was defined as greater than 10
leukocytes/mm3 and any bacteria seen on gram stain.
Validity:
1. Was there an independent blind comparison with a reference (gold) standard?
There was a reference (gold) standard. A positive urine culture was defined as >
50,000 colonies/ml. The reference standard was presumably independent. There
is no mention as to whether it was a blind comparison.
2. Did the patient sample include an appropriate spectrum of patients to whom
the diagnostic test will be applied in clinical practice?
We only know a few things about the patients. We know their age and we know
that they were seen in the Emergency Department of the Children's Hospital of
Pittsburgh, though we do not know any other specifics about the patients. Since
Pittsburgh is a large city not unlike Chicago, we can be reasonably assured that
the patients seen in the Emergency Department in Pittsburgh are similar to those
we see here in Chicago.
3. Did the results of the test evaluated influence the decision to perform the
reference (gold) standard?
No.
4. Were the methods for performing the test described in sufficient detail to
permit replication?
Yes.
Results:
Before we discuss specific results of this study, we need to discuss a few basic
concepts about diagnostic tests. This section discusses how to use the results of
studies of diagnostic tests. There are three steps in using diagnostic tests:
1. Assigning a pre-test probability of disease for our patient,
2. Finding or calculating the likelihood ratio (LR) for a particular test result,
and
3. Calculating the post-test probability of disease using the pre-test
probability and LR.
Step One: Pre-Test Probability
Diagnostic tests modify our patient's pre-test probability of disease. It is important
to accurately estimate the initial probability of disease in our patient. Sometimes
our initial estimate is quite accurate. For example, we may know that in our clinic
population, 15% of the pre-school kids are iron deficient. At other times, we may
have only a vague idea of the pre-test probability, as for the pre-test probability of
parasitic gastrointestinal infection in a recent immigrant with abdominal pain. In
this case, we could try to refine our estimate by going to the literature. If the
literature does not help, we could ask one of our experienced colleagues or an
expert in parasitic diseases. If no one knows, we must make our best guess. A
diagnostic test can only change our pre-test probability.
Step Two: The Likelihood Ratio
Consider the table below:
                 Reality (Gold Standard)
TEST         Disease                No Disease
Positive     True Positive (TP)     False Positive (FP)
Negative     False Negative (FN)    True Negative (TN)
Totals       All With Disease       All Without Disease
On the left side of the table are listed the results of the test. In the simplest case,
the test is either positive or negative. At the top of the table is the reference, or
gold, standard. The gold standard is the best available definition of the disease;
it is often the definitive test for a disease -- in extreme cases, the diagnosis at
autopsy might be the gold standard. When the gold standard itself is imperfect or
unknown (e.g. the definition of a urinary tract infection in a patient with a
dysfunctional bladder), the test being evaluated will carry the same uncertainties
as the gold standard.
A positive test constituted > 10 WBC/mm3 + bacteria on gram stain. A negative
test was anything else. The gold standard was a urine culture with > 5 x 10^4
colony forming units/ml. The data from the article were entered into the 2 X 2
table below.
                                  Urine Culture
Enhanced Urinalysis          Positive     Negative
>10 WBC + bacteria           91           12
Anything else                11           2067
Totals                       102          2079
Notice that the enhanced urinalysis was not perfect: there were false positives
and false negatives. This is common for tests.
We can use the data in the table to calculate a number of important properties of
the test and study population: sensitivity, specificity, predictive values,
prevalence rates, and likelihood ratios (LR). The most useful property for our
purposes is the LR (See Likelihood Ratios on the sidebar).
The LR reflects the essence of a test because it combines within a single value
both the sensitivity and specificity. A +LR defines the diagnostic strength of a
positive test; a -LR defines the diagnostic strength of a negative test. The LR
tells us how much we must modify our initial pre-test probability.
Refer to the first table. The LR for a positive test (+LR) is defined by either of
these two ratios:
+LR = (TP / All With Disease) / (FP / All Without Disease)
or
+LR = Sensitivity / (1 - Specificity)
Let us calculate the +LR from our example:
+LR = [91/102]/[12/2079] = 155/1
The interpretation of this result is that a positive result from the enhanced
urinalysis test will change the pre-test probability (actually, the pre-test "odds,"
see below) 155 times more toward the diagnosis of urinary tract infection than
away from it.
The LR for a negative test (-LR) is defined by either of these two equivalent
ratios:
-LR = (FN / All With Disease) / (TN / All Without Disease)
or
-LR = (1 - Sensitivity) / Specificity
The -LR for the enhanced urinalysis test is
-LR = [11/102]/[2067/2079] = 0.11/1
The interpretation of this result is that a negative result from the enhanced
urinalysis test will change the pre-test probability (actually, the pre-test "odds,"
see below) 0.11 times more toward the diagnosis of urinary tract infection than
away from it. This, of course, means that a negative test result leads us away
from the diagnosis of urinary tract infection.
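The two likelihood ratios can be reproduced directly from the 2 x 2 counts. The sketch below is illustrative only; the function name `likelihood_ratios` is an assumption of this example and not part of any calculator mentioned in the workbook.

```python
def likelihood_ratios(tp, fp, fn, tn):
    """Return (+LR, -LR) from the four cells of a 2 x 2 diagnostic table."""
    sensitivity = tp / (tp + fn)   # TP / all with disease
    specificity = tn / (tn + fp)   # TN / all without disease
    pos_lr = sensitivity / (1 - specificity)
    neg_lr = (1 - sensitivity) / specificity
    return pos_lr, neg_lr

# Enhanced urinalysis vs. urine culture, from the table above
pos_lr, neg_lr = likelihood_ratios(tp=91, fp=12, fn=11, tn=2067)
print(f"+LR = {pos_lr:.0f}, -LR = {neg_lr:.2f}")
```

An LR well above 1 moves the odds toward disease, an LR well below 1 moves them away, and an LR near 1 changes little.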
Notice that the mathematical formulations for +LR and -LR are
+LR = sensitivity/[1 - specificity]
-LR = [1- sensitivity]/specificity
Recall (or take our word for it), that sensitivity and specificity are prevalence
independent. Thus, LR's do not change from population to population. This is one
of the most valuable characteristics of LR's. The same LR is used in Chicago and
Bombay, even if the disease is much more prevalent in Bombay. This holds
provided that the disease is defined identically in both locations, that the LR was
calculated from a group of patients likely to be found in both locations, and that
those patients are the ones who are likely to be tested. (A study developing a test
for the early detection of group B strep in neonates should not include healthy
6-month-old babies. The +LR and -LR from such a study may be falsely elevated
and depressed, respectively.) Typically, these problems either do not arise or do
not change the LR's by much. However, this is a good example of how it pays to
pay attention to the methods section of a clinical study. We are now ready to
learn how to use LRs.
Step Three: Calculating the Post-Test Probability of Disease
The following demonstrates how the concepts are developed and used. As
a practical matter, use the online calculator http://araw.mede.uic.edu/cgi-bin/testcalc.pl .
We will use what we have discussed above to calculate the probability of disease
given a particular test result. Recall the data presented above:
                                  Urine Culture
Enhanced Urinalysis          Positive     Negative
>10 WBC + bacteria           91           12
Anything else                11           2067
Totals                       102          2079
A positive test result
Step One:
As already mentioned, we need to have an initial estimate of the probability of
disease. In our example, urinary tract infection was present in 4.7% of the study
patients (102/[102 + 2079]). Let us assume that our patient population is similar
to the study's population. We will use 5% as our estimate for the pre-test
probability of urinary tract infection in our febrile patients < 2 years old. Since
LRs are in ratio, or odds, form, we need to convert our pre-test probability
into an odds form. Thus, 5% = 5/100 = 5 out of 100 children are infected = 5 are
infected and 95 are not infected. The odds, therefore, are
5/95 = 0.05/1
Step Two:
Assume that we obtained a positive test result (>10 WBC/mm3 + bacteria on
gram stain). The LR for this result, +LR, as calculated previously, is 155/1.
Step Three:
Now we modify our pre-test odds with the results of the test:
Pre-Test Odds x +LR = Post-Test Odds
0.05/1 x 155/1 = 7.8/1
The interpretation of the post-test odds is that in febrile children < 2 years old, a
positive enhanced urinalysis test increases the likelihood of urinary tract infection
from 0.05/1 to 7.8/1, a 155-fold increase.
We then convert the post-test odds back into a post-test probability. We do this
because we are more comfortable talking about the probability of disease than
the odds of a disease (with the exception of those who frequent the racetrack).
Thus, odds of infection of 7.8/1 means that there are 7.8 children with infection
for every child without infection. Therefore, the probability of infection is 7.8
(children with infection) divided by 8.8 (total number of children with and without
infection).
7.8 / 8.8 = 0.89 (89%)
Notice what the test did. It took a patient with an initial probability of 5% for a
urinary tract infection and modified it to give him a post-test probability of 89%.
This is a very powerful test and demonstrates one of the useful characteristics of
LRs. This child with positive results is likely to be started immediately with
antibiotics.
A negative test result
What if the test came back negative?
Step One
The prevalence hasn't changed.
Pre-test probability = 5%
Pre-test odds = 0.05/1
Step Two
The likelihood ratio for a negative test, -LR, = 0.11/1
Step Three
Pre-Test Odds x -LR = Post-Test Odds
0.05/1 x 0.11/1 = 0.006/1
Post-test probability = 0.006 / [0.006 + 1] = 0.006 (0.6%)
This is a patient with < 1% probability of a urinary tract infection. It would make
sense to hold off on the antibiotics, check the culture results the next day, and
keep an eye open for other causes of fever if the child does not improve.
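The three steps (probability to odds, multiply by the LR, odds back to probability) can be sketched in a few lines. This is a hedged illustration, not the online calculator's code; the function name is made up, and the code carries the pre-test odds unrounded (5/95 = 0.0526) where the text rounds them to 0.05.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)   # Step One
    post_odds = pre_odds * likelihood_ratio                        # Steps Two and Three
    return post_odds / (1 + post_odds)                             # back to a probability

positive = post_test_probability(0.05, 155)    # positive enhanced urinalysis
negative = post_test_probability(0.05, 0.11)   # negative enhanced urinalysis
print(f"Positive test: {positive:.0%}, negative test: {negative:.1%}")
```

Despite the difference in rounding, the results match the 89% and 0.6% worked out above.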
Using the online test calculator
Dr. Alan Schwartz has developed an online calculator which will do all the LR
and post-test probability calculations automatically. It also will calculate the 95%
confidence intervals. Now that you know the fundamentals of LR's and post-test
probability calculation, you may wish to consult the online calculator for your
future needs. The calculator is available at
http://ebm.peds.uic.edu/ebm/testcalc.shtml.
[Figure: results from the online test calculator for this example]
Multilevel Test Results
Many times, the results of a test are not "positive or negative." Consider our first
example of an enhanced urinalysis. This test can be considered as a multilevel
test. In addition to ">10 WBCs + Bacteria," we were able to calculate the data
(from one of the tables in the article) for the other possible combinations of
WBCs and bacteria (see table below).
Test result               +UTI     -UTI     LR
>10 WBCs + Bacteria       91       12       155
<10 WBCs + Bacteria       4        58       1.41
>10 WBCs/No Bacteria      2        61       0.67
<10 WBCs/No Bacteria      5        1948     0.05
Total                     102      2079
Notice that each test level has an LR. The calculation for the LR follows our
definition. Thus, the LR for a test result of "<10 WBCs + Bacteria" is the
probability that the patient comes from the diseased versus the nondiseased
population. The calculation for this LR is [4/102]/[58/2079] = 1.41/1. This LR can
then be combined in the usual fashion with the pre-test probability of our patient
having a UTI to obtain the post-test probability of disease. What is the post-test
probability? Notice that the LR is approximately equal to 1 and therefore our
patient's post-test probability will not be very different from his pre-test
probability. This is what we would expect for a test with mixed results (positive for
bacteria, negative for WBCs).
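For a multilevel test, each level's LR is the proportion of diseased patients with that result divided by the proportion of non-diseased patients with that result. The sketch below applies this to the counts in the table; the dictionary layout and the function name `level_likelihood_ratios` are assumptions of this example.

```python
def level_likelihood_ratios(levels):
    """levels maps each result level to (count with disease, count without disease)."""
    total_diseased = sum(d for d, _ in levels.values())
    total_healthy = sum(h for _, h in levels.values())
    return {
        name: (d / total_diseased) / (h / total_healthy)
        for name, (d, h) in levels.items()
    }

urinalysis_levels = {
    ">10 WBCs + bacteria": (91, 12),
    "<10 WBCs + bacteria": (4, 58),
    ">10 WBCs, no bacteria": (2, 61),
    "<10 WBCs, no bacteria": (5, 1948),
}

lrs = level_likelihood_ratios(urinalysis_levels)
for name, lr in lrs.items():
    print(f"{name}: LR = {lr:.2f}")

# Post-test probability for the mixed result, assuming the 5% pre-test probability used earlier
pre_odds = 0.05 / 0.95
post_odds = pre_odds * lrs["<10 WBCs + bacteria"]
print(f"Post-test probability: {post_odds / (1 + post_odds):.1%}")
```

Because the LR for the mixed result is close to 1, the post-test probability stays close to the 5% starting point.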
Authors' Conclusions:
The authors' conclusions do not specifically address our clinical question, and
therefore we will go directly to the next section.
Questions/Concerns/Your Conclusions/Applicability:
There were a number of concerns about this study.
It is not clear what percentage of "all febrile children" were included in the
sample. The methods section merely states that all children "from whom a urine
sample was obtained" were included in the study. Another concern is the
definition of a urinary tract infection. It was defined as > 50,000 colonies/ml. This is
at odds with others who define UTI with either fewer or more colonies/ml.
However, this study did attempt to demonstrate why 50,000 colonies/ml is an
appropriate definition (based on other results in the study, which compared urine
culture to DMSA scanning), and for our purposes we can use it. In terms of
feasibility, cell counting with a hemocytometer is a technically simple procedure.
A gram stain is also relatively simple. Both of these procedures can be done in
any laboratory. These results do seem to apply to your patient since you use a
laboratory that does urinalysis by hemocytometer and gram stain in a way that is
similar to the Hoberman study. The conclusion of the laboratory that the child has
a negative urinalysis is consistent with the approach of the study.
Resolution of Your Patient's Story
You decided not to start your patient on antibiotics. Twenty-four hours later, your
patient was doing well and the results of the culture came back as "no growth
after 24 hours."
Diagnostic Test Calculator: http://ebm.peds.uic.edu/clerkship/
Numbers of patients with and without the disease who test positive and negative:
                  Disease present   Disease absent   Total
  Test positive
  Test negative
  Total
  [Compute]
or
disease prevalence, test sensitivity, and test specificity (and, optionally, sample
size):
  Prevalence (e.g. 0.10):
  Sensitivity (e.g. 0.80):
  Specificity (e.g. 0.80):
  Total sample size:
  [Compute]
or
disease prevalence, positive likelihood ratio, and negative likelihood ratio (and,
optionally, sample size):
  Prevalence (e.g. 0.10):
  +LR (e.g. 4):
  -LR (e.g. 0.01):
  Total sample size:
  [Compute] [Clear Entries]
Here are the concepts and how it is done mathematically:
BASIC STATISTICS FOR DIAGNOSTIC TESTS
                    Gold Standard (Reality)
TEST        Disease                      No Disease
+           a                            b
-           c                            d
Total       All With Disease (a + c)     All Without Disease (b + d)
Likelihood Ratios
For a given test result, the likelihood ratio is the probability that our
patient comes from the diseased versus the non-diseased population
(definition courtesy of Dr. Jack Sinclair, Department of Pediatrics, McMaster
University).
The likelihood ratio for a positive test = +LR = [a/(a+c)]/[b/(b+d)]
The likelihood ratio for a negative test = -LR = [c/(a+c)]/[d/(b+d)]
The likelihood ratio, when combined with the patient’s pre-test
probability (prevalence) of having the disease, will give you the post-test
probability of disease in that patient. Use the post-test probability
nomogram to do this or see below for an exact method.
Here is an exact way to do it:
Determine the prevalence of disease and convert it to a prevalence ratio
prevalence ratio (PR) = prevalence/(1 – prevalence)
Then calculate the post-test odds
PR x LR = post-test odds of disease
Finally, convert the post-test odds back to a probability
Probability of disease = [post-test odds]/[1 + post-test odds]
For those interested in calculating a 95% confidence interval around an LR:
95% Confidence Interval for +LR = e^(ln(+LR) ± 1.96 x sqrt{[c/(a+c)]/a + [d/(b+d)]/b})
95% Confidence Interval for -LR = e^(ln(-LR) ± 1.96 x sqrt{[a/(a+c)]/c + [b/(b+d)]/d})
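A sketch of this interval in code may help. It assumes the usual log method, in which the limits are exp(ln LR ± 1.96 x SE) and SE is the square root of the bracketed sum in the formulas above; the cell letters a, b, c, d follow the table, and the function name is hypothetical. The example counts are the enhanced urinalysis data from earlier in the workbook.

```python
import math

def lr_confidence_intervals(a, b, c, d, z=1.96):
    """95% CIs for +LR and -LR by the log method (assumed reading of the formulas above)."""
    pos_lr = (a / (a + c)) / (b / (b + d))
    neg_lr = (c / (a + c)) / (d / (b + d))
    se_pos = math.sqrt((c / (a + c)) / a + (d / (b + d)) / b)
    se_neg = math.sqrt((a / (a + c)) / c + (b / (b + d)) / d)
    pos_ci = (math.exp(math.log(pos_lr) - z * se_pos),
              math.exp(math.log(pos_lr) + z * se_pos))
    neg_ci = (math.exp(math.log(neg_lr) - z * se_neg),
              math.exp(math.log(neg_lr) + z * se_neg))
    return pos_lr, pos_ci, neg_lr, neg_ci

pos_lr, pos_ci, neg_lr, neg_ci = lr_confidence_intervals(a=91, b=12, c=11, d=2067)
print(f"+LR = {pos_lr:.0f} [{pos_ci[0]:.0f}, {pos_ci[1]:.0f}]")
print(f"-LR = {neg_lr:.2f} [{neg_ci[0]:.2f}, {neg_ci[1]:.2f}]")
```

Working on the log scale keeps both limits positive, which matters because an LR can never be negative.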
Other helpful statistics:
Sensitivity = The ability of a test to detect diseased people from a diseased
population = a/(a + c).
Specificity = The ability of a test to detect healthy people from a healthy
population = d/(b + d).
Positive predictive value = The probability that a given test result is a true
positive given a specific disease prevalence = a/(a + b).
Negative predictive value = The probability that a given test result is a true
negative given a specific disease prevalence = d/(c + d).
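These four statistics follow directly from the cells of the 2 x 2 table. The sketch below is illustrative; the function name `diagnostic_summary` is an assumption, and the example counts are the enhanced urinalysis data used earlier.

```python
def diagnostic_summary(a, b, c, d):
    """a = true positives, b = false positives, c = false negatives, d = true negatives."""
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "positive predictive value": a / (a + b),
        "negative predictive value": d / (c + d),
    }

stats = diagnostic_summary(a=91, b=12, c=11, d=2067)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Note that the predictive values depend on the prevalence in the sample (here about 5%), whereas sensitivity and specificity do not.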
Here is a diagnostic test post-test probability nomogram (see article above) if you do
not want to calculate probability of disease from the pre-test probability and the
likelihood ratio using the web-based diagnostic test calculator.
An example of a completed CAT for a Diagnosis Article. Link to full text:
Full Text, see OVID: http://gateway1.ovid.com/ovidweb.cgi
CRITICALLY APPRAISED TOPIC
TOPIC TITLE
13C-Urea Breath Test for H. Pylori
Date
11/7/02
Name of Reviewer(s)
J. Hupert
Patient Story (be brief)
11 yo HF with 3 weeks of intermittent epigastric pain, somewhat relieved
with food ingestion. Stool heme negative. One of your colleagues
suggested the "breath test" for H. pylori.
Answerable Clinical Question (PICO)
P: In children with abdominal pain suggestive of gastritis,
I: what is the diagnostic accuracy of the 13C-urea breath test,
C: compared to biopsy,
O: in diagnosing H. pylori infection?
The Search
PubMed -->Clinical Queries --> specificity --> helicobacter pylori AND
child AND urea breath test
The Study Citation
Kawakami E, Machado RS, Reber M, Patricio FR. 13C-urea breath test
with infrared spectroscopy for diagnosing helicobacter pylori infection in
children and adolescents. J Pediatr Gastroenterol Nutr. 2002 Jul;35(1):39-43.
Methods (focus on your question)
18 month study period. 82 children evaluated, 75 children included in
analysis (see results for the other 7 patients). Age: 6 months - 18 years.
All were referred for endoscopy. Culture, histology, and rapid urease test
done on the six biopsy specimens from each patient (2 specimens per test).
H. pylori infection was defined by a positive culture or both a positive
histology and a positive rapid urease test. 13C-urea breath test was
performed using an infrared isotope analyzer at baseline and at 30 minutes.
Issues of Validity (see specific questions)
1. Was there an independent, blind comparison with a reference (gold)
standard?
The gold standard was either a positive culture or both a positive histology
and a positive rapid urease test on biopsy specimens. Histology evaluation
does appear to have been independent. No mention was made of who
performed the rapid urease and culture. However, they are relatively
objective tests. Blinding was not discussed. There is no mention as to
whether the breath test preceded the endoscopy. However, it seems
reasonable that it did. If so, blinding may have been helpful to eliminate
bias.
2. Was the diagnostic test evaluated in an appropriate spectrum of patients
(like those in whom we would use it in practice)?
The age range is appropriate. Close to half of all patients were infected,
which seems higher than one is likely to find in the primary care setting,
suggesting possible spectrum bias. The variety of disease appears
sufficiently broad.
3. Was the reference (gold) standard applied regardless of the diagnostic
test result?
Yes.
Results (focus on your question)
7 patients had discordant gold-standard results (2 tested positive and 5
negative with the 13C-urea breath test) and were excluded from the
investigators' analysis.
Based on 75 patients:
Sensitivity = 0.97, Specificity = 0.93
+LR = 14 [5, 42], -LR = 0.03 [0.01, 0.24]
Assuming a 41% prevalence, post-test probabilities for positive and
negative test are 91% [77, 97] and 2% [1, 14], respectively.
Assuming a worst case scenario for the 7 excluded patients (2 false positives
and 5 false negatives), based on 82 patients:
Sensitivity = 0.83, Specificity = 0.89
+LR = 7.7 [3.3, 18], -LR = 0.19 [0.09, 0.39]
Assuming a 44% prevalence, post-test probabilities for positive and
negative test are 86% [72, 93] and 13% [7, 23], respectively.
Applicability (see PICO for Applicability)
(P) Is my patient similar enough to the patients in the study that the
evidence can be applied?
It seems likely.
(I) Could the intervention in the study be carried out in my setting, and
in a way that is similar enough to the way it was conducted in the study?
Yes. However, given the range of results, it would be important to
evaluate other H. pylori tests which may be easier to perform in an
office setting.
(C) Is the comparison in the study similar to the standard of care in my
setting?
The gold standard used is accepted.
(O) Are the results important enough and are the outcomes measured in
the study similar enough to those that are relevant and important in my
setting or to my patient?
The results are likely to be important. Our H. pylori prevalence is probably
closer to 10% of those in whom we would entertain the diagnosis. The
likelihood ratios do not change. The post-test probabilities would then be
46 - 61% for a positive test and 0.3 - 2% for a negative test (range of worst
case and investigators' case scenarios). H. pylori is a clinically significant
diagnosis and those values are clinically significant.
Resolution of Patient Story
The patient tested positive on the breath test, was started on treatment and
is doing better 3 weeks later.
CLINICAL BOTTOM LINE
The 13C-urea breath test is a sufficiently accurate test for diagnosing H.
pylori infection.
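The step in this CAT from the study's prevalence to a local prevalence of about 10% can be reproduced with a short sketch. It reuses the likelihood ratios reported in the Results; the function name and the dictionary of scenarios are assumptions of this illustration.

```python
def post_test_probability(prevalence, likelihood_ratio):
    """Pre-test probability -> odds -> post-test odds -> post-test probability."""
    pre_odds = prevalence / (1 - prevalence)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)

# (+LR, -LR) pairs as reported in the CAT
scenarios = {
    "investigators' analysis": (14, 0.03),
    "worst case": (7.7, 0.19),
}

local_prevalence = 0.10  # the reviewer's estimate for his own setting
results = {}
for name, (pos_lr, neg_lr) in scenarios.items():
    results[name] = (
        post_test_probability(local_prevalence, pos_lr),
        post_test_probability(local_prevalence, neg_lr),
    )
    print(f"{name}: positive {results[name][0]:.0%}, negative {results[name][1]:.1%}")
```

The same likelihood ratios give quite different post-test probabilities at 10% than at the study's 41 to 44% prevalence, which is why the local pre-test estimate matters.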
PICO Mnemonic for Applicability
Developed by Alan Schwartz, PhD
(P) Is my patient similar enough to the patients in the study that the evidence
can be applied? Would my patient have met the study's inclusion criteria?
A valid study may not be applicable to your patient if your patient differs in
important ways from the study patients.
(I) Could the intervention in the study be carried out in my setting, and in a
way that is similar enough to the way it was conducted in the study? A
valid study may not be applicable to your patient if the study intervention is
impractical, too costly, requires skills, equipment, or medications that are not
locally available, etc.
(C) Is the comparison in the study similar to the standard of care (or for a
diagnostic test study, the gold standard) in my setting? A valid study may
not be applicable to your patient if you are already using a better standard of care
(or for a diagnostic test study, you have a better gold standard) than that to which
the study intervention is compared.
(O) Are the outcomes measured in the study similar enough to those that are
relevant and important in my setting or to my patient? A valid study may not
be applicable to your patient if it reports outcomes that cannot be measured
practically in your setting, or that are unimportant to your patient.