An Epidemiological Approach to Diagnostic Process

Steve Doucette, BSc, MSc
Email: sdoucette@ohri.ca
Ottawa Health Research Institute
Clinical Epidemiology Program
The Ottawa Hospital (General Campus)
Topics to be covered…
-Through use of illustrative examples involving clinical
trials, we’ll discuss the following:
Diagnostic and Screening tests
Conditional Probability
The 2 X 2 Table
Sensitivity, Specificity, Predictive Value
ROC curves
Bayes Theorem
Likelihood and Odds
What are Diagnostic & Screening
tests?
Important part of medical decision making
In practice, many tests are used to obtain
diagnoses
Screening tests: Are used for persons who
are asymptomatic but who may have early
disease or disease precursors
Diagnostic tests: Are used for persons who
have a specific indication of possible
illness
What’s the difference?
Screening tests:
-The proportion of affected persons is likely to be
small (e.g., breast cancer)
-Early detection of disease is helpful only if early
intervention is helpful
Diagnostic tests:
-Many patients have medical problems that require
investigation
-Usually used to diagnose disease for immediate
treatment
Why conduct diagnostic tests?
Does a positive acid-fast smear guarantee that
the patient has active tuberculosis?
NO
Does a toxic digoxin concentration inevitably
signify digitalis intoxication?
NO
By having a factor VIII ratio < 0.8, are you
automatically known to be a hemophilia carrier?
NO
Not all tests are perfect…but
A positive test result should increase the
“probability” that the disease is present.
“Good” tests aim to be:
-sensitive
-specific
-predictive
-accurate
Terminology
Sensitive test: If all persons with the
disease have “positive” tests, we say the
test is sensitive to the presence of disease
Specific test: If all persons without the
disease test “negative”, we say the test is
specific to the absence of the disease
Predictive (positive & negative) test: If the
results of the test are indicative of the true
outcome
Terminology
Accuracy: The accuracy of a test is the
proportion of all test results that are
correct. It counts true positive and true
negative results among all the results of
the test.
Prevalence: The number or proportion of
cases of a given disease or other attribute
that exists in a defined population at a
specific time.
Terminology
Probability: A number expressing the
likelihood that a specific event will occur,
expressed as the ratio of the number of
actual occurrences to the number of
possible occurrences.
P(A) = a / n
Terminology
Conditional Probability: A number
expressing the likelihood that a specific
event will occur, GIVEN that certain
conditions hold.
P(A|B)
Sensitivity, Specificity, Positive & Negative
Predictive Values are all conditional
probabilities.
Terminology
Sensitivity: The proportion of positive results among all
the patients who have the disease, P(T+|D+).
Specificity: The proportion of negative results among all
the patients who do not have the disease, P(T-|D-).
Positive Predictive Value: The proportion of patients who
have the disease among all the patients who tested
positive, P(D+|T+).
Negative Predictive Value: The proportion of patients
who do not have the disease among all the patients who
tested negative, P(D-|T-).
These are all conditional probabilities!!
The 2 X 2 Table
                        Truth
Test Result       +         -         Total
+                 A         B         A+B
-                 C         D         C+D
Total             A+C       B+D       A+B+C+D
The 2 X 2 Table
                        Truth
Test Result       +         -
+                 A         B
-                 C         D
Formulas (a, b, c, d are the counts in cells A, B, C, D):
Sensitivity    = a / (a+c)
Specificity    = d / (b+d)
Accuracy       = (a+d) / (a+b+c+d)
Prevalence     = (a+c) / (a+b+c+d)
Positive Test  = (a+b) / (a+b+c+d)
Negative Test  = (c+d) / (a+b+c+d)
Diseased       = (a+c) / (a+b+c+d)
Not Diseased   = (b+d) / (a+b+c+d)
Predictive Value:
positive       = a / (a+b)
negative       = d / (c+d)
The 2 X 2 Table
Example: Testing for Genetic Hemophilia
-A method for testing whether an individual is a
carrier of hemophilia (a bleeding disorder) takes the ratio
of factor VIII activity to factor VIII antigen. This ratio
tends to be lower in carriers thus providing a basis for
diagnostic testing. In this example, a ratio < 0.8 gives a
positive test result.
Results:
-38 tested positive, 6 incorrectly.
-30 tested negative, 2 incorrectly.
The 2 X 2 Table
                        Carrier State
Test Result       Carrier    Non-Carrier    Total
+ (F8 < 0.8)        32            6           38
- (F8 > 0.8)         2           28           30
Total               34           34           68
The 2 X 2 Table
                        Carrier State
Test Result       Carrier    Non-Carrier    Total
+                   32            6           38
-                    2           28           30
Total               34           34           68
Exercise:
Sensitivity    =
Specificity    =
Accuracy       =
Prevalence     =
Positive Test  =
Negative Test  =
Diseased       =
Not Diseased   =
Predictive Value:
positive       =
negative       =
The 2 X 2 Table
Example: Testing for digoxin toxicity
-A method for testing whether an individual is
digoxin toxic measures serum digoxin levels. A
cut-off value for serum concentration provides a
basis for diagnostic testing.
Results:
-39 tested positive, 14 incorrectly.
-96 tested negative, 18 incorrectly.
The 2 X 2 Table
                        Toxicity
Test Result       D+        D-        Total
T+                25        14          39
T-                18        78          96
Total             43        92         135
The 2 X 2 Table
                        Toxicity
Test Result       D+        D-        Total
T+                25        14          39
T-                18        78          96
Total             43        92         135
Exercise:
Sensitivity    =
Specificity    =
Accuracy       =
Prevalence     =
Positive Test  =
Negative Test  =
Diseased       =
Not Diseased   =
Predictive Value:
positive       =
negative       =
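To check the answers to the two exercises above, the formulas from the 2 X 2 table can be sketched in Python. This is only an illustrative sketch: the class name `TwoByTwo` and its method names are assumptions made for this example, not part of the original material. The cell names a, b, c, d follow the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cell counts of a 2 X 2 table (hypothetical helper for illustration).

    a = test +, disease +    b = test +, disease -
    c = test -, disease +    d = test -, disease -
    """
    a: int
    b: int
    c: int
    d: int

    @property
    def total(self) -> int:
        return self.a + self.b + self.c + self.d

    def sensitivity(self) -> float:
        return self.a / (self.a + self.c)

    def specificity(self) -> float:
        return self.d / (self.b + self.d)

    def accuracy(self) -> float:
        return (self.a + self.d) / self.total

    def prevalence(self) -> float:
        return (self.a + self.c) / self.total

    def ppv(self) -> float:
        """Positive predictive value."""
        return self.a / (self.a + self.b)

    def npv(self) -> float:
        """Negative predictive value."""
        return self.d / (self.c + self.d)


# Counts taken from the hemophilia and digoxin tables.
hemophilia = TwoByTwo(a=32, b=6, c=2, d=28)
digoxin = TwoByTwo(a=25, b=14, c=18, d=78)

for name, table in [("hemophilia", hemophilia), ("digoxin", digoxin)]:
    print(f"{name}: Sn={table.sensitivity():.2f} Sp={table.specificity():.2f} "
          f"Acc={table.accuracy():.2f} Prev={table.prevalence():.2f} "
          f"PPV={table.ppv():.2f} NPV={table.npv():.2f}")
```

Keeping the four counts together in one object means every measure is derived from the same table, which avoids mixing up rows and columns when working by hand.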
Sensitivity & Specificity – Trade off
Ideally we would like to have 100%
sensitivity and specificity.
If we want our test to be more sensitive,
we will pay the price of losing specificity.
Increasing specificity will result in a
decrease in sensitivity.
Back to Hemophilia example…
Table 1 (as before):
Test Result       Carrier    Non-Carrier    Total
+                   32            6           38
-                    2           28           30
Total               34           34           68
Exercise:
Sensitivity = 32/(32+2) = 0.94
Specificity = 28/(28+6) = 0.82
Predictive Value:
positive = 32/(32+6) = 0.84
negative = 28/(28+2) = 0.93
Table 2 (more sensitive, less specific):
Test Result       Carrier    Non-Carrier    Total
+                   33           13           46
-                    1           21           22
Total               34           34           68
Exercise:
Sensitivity = 33/(33+1) = 0.97
Specificity = 21/(21+13) = 0.62
Predictive Value:
positive = 33/(33+13) = 0.72
negative = 21/(21+1) = 0.95
Example 2: How can prevalence affect predictive value?
Table 1 (as before, prevalence = 34/68 = 0.50):
Test Result       Carrier    Non-Carrier    Total
+                   32            6           38
-                    2           28           30
Total               34           34           68
Exercise:
Sensitivity = 32/(32+2) = 0.94
Specificity = 28/(28+6) = 0.82
Predictive Value:
positive = 32/(32+6) = 0.84
negative = 28/(28+2) = 0.93
Table 2 (100 times as many non-carriers, prevalence = 34/3434 = 0.01):
Test Result       Carrier    Non-Carrier    Total
+                   32          600          632
-                    2         2800         2802
Total               34         3400         3434
Exercise:
Sensitivity = 32/(32+2) = 0.94
Specificity = 2800/(2800+600) = 0.82
Predictive Value:
positive = 32/(32+600) = 0.05
negative = 2800/(2800+2) = 0.999
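The effect shown in the two tables above can be reproduced with a short Python sketch. It keeps the 34 carriers fixed and multiplies the non-carrier column by a factor, as the second table does with a factor of 100. The function name `hemophilia_summary` and the intermediate factor of 10 are assumptions added for illustration only.

```python
def hemophilia_summary(scale):
    """Summarise the hemophilia table with the non-carrier column
    (cells b and d) multiplied by `scale`; carriers stay at 34."""
    a, c = 32, 2                    # carriers: test +, test -
    b, d = 6 * scale, 28 * scale    # non-carriers: test +, test -
    total = a + b + c + d
    return {
        "prevalence": (a + c) / total,
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "ppv": a / (a + b),
        "npv": d / (c + d),
    }


# scale=1 is the original table, scale=100 is the low-prevalence table;
# scale=10 is an extra, hypothetical intermediate case.
for scale in (1, 10, 100):
    s = hemophilia_summary(scale)
    print(f"scale={scale:>3}  prevalence={s['prevalence']:.3f}  "
          f"Sn={s['sensitivity']:.2f}  Sp={s['specificity']:.2f}  "
          f"PPV={s['ppv']:.2f}  NPV={s['npv']:.3f}")
```

Sensitivity and specificity are computed within each column, so they do not change with the scale factor. The predictive values are computed across rows, so they shift as the prevalence falls.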
Summary
The 2 X 2 Table allows us to compute
sensitivity, specificity, and predictive
values of a test.
The prevalence of a disease can affect
how our test results should be interpreted.
ROC Curves - Introduction
[Figure: overlapping distributions of the test result (axis from 0.5 to 1.1) for persons With Disease and Without Disease. A cut-off value for the test splits the results into POSITIVE and NEGATIVE, which defines the four regions TP = a, FP = b, FN = c, TN = d.]
ROC Curves - Introduction
[Figure: the same two distributions with a different cut-off value, which changes the sizes of the TP, FP, FN and TN regions.]
ROC Curves
An ROC curve is a graphical
representation of the trade off between the
false negative and false positive rates for
every possible cut off. Equivalently, the
ROC curve is the representation of the
tradeoffs between sensitivity (Sn) and
specificity (Sp).
By tradition, the plot shows 1-Sp on the X
axis and Sn on the Y axis.
ROC Curves
Example: Given 5 different cut-offs for the
hemophilia example (0.5, 0.6, 0.7, 0.8, 0.9), what
might an ROC curve look like?
Cut-off    Sensitivity    Specificity    1 - Specificity
0.5           0.30           0.97             0.03
0.6           0.65           0.94             0.06
0.7           0.85           0.88             0.12
0.8           0.94           0.82             0.18
0.9           0.97           0.63             0.37
ROC Curves
[Figure: ROC curve for the hemophilia example, plotting Sensitivity (Y axis, 0 to 1) against 1 - Specificity (X axis, 0 to 1).]
ROC Curves
We are usually happy when the ROC curve climbs rapidly
towards the upper left-hand corner of the graph. This
means that sensitivity and specificity are both high.
[Figure: ROC curve plotting Sensitivity (Y axis, 0 to 1) against 1 - Specificity (X axis, 0 to 1).]
We are less happy when the ROC curve follows a diagonal
path from the lower left-hand corner to the upper
right-hand corner. This means that every improvement in
the false positive rate is matched by a corresponding
worsening of the false negative rate.
ROC Curves
Area under ROC curve:
1 = Perfect diagnostic test
0.5 = Useless diagnostic test
[Figure: ROC curve plotting Sensitivity (Y axis, 0 to 1) against 1 - Specificity (X axis, 0 to 1).]
If the area is 1.0, you have an
ideal test, because it achieves
both 100% sensitivity and 100%
specificity.
If the area is 0.5, then you have
a test which has effectively 50%
sensitivity and 50% specificity.
This is a test that is no better
than flipping a coin.
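As a rough illustration of the area under the curve, the five cut-offs tabulated earlier for the hemophilia example can be joined by straight lines and the area estimated with the trapezoid rule. This is a sketch under stated assumptions: the end points (0, 0) and (1, 1) are added by hand, the curve between points is taken to be linear, and the function names are hypothetical. With only five points the result is an approximation.

```python
# (cut-off, sensitivity, specificity) rows from the hemophilia ROC table
rows = [
    (0.5, 0.30, 0.97),
    (0.6, 0.65, 0.94),
    (0.7, 0.85, 0.88),
    (0.8, 0.94, 0.82),
    (0.9, 0.97, 0.63),
]


def roc_points(rows):
    """Return (1 - Sp, Sn) points sorted by X, with the assumed
    end points (0, 0) and (1, 1) added."""
    points = sorted((1 - sp, sn) for _, sn, sp in rows)
    return [(0.0, 0.0)] + points + [(1.0, 1.0)]


def trapezoid_auc(points):
    """Area under a piecewise-linear curve through the points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


auc = trapezoid_auc(roc_points(rows))
print(f"Approximate area under the ROC curve: {auc:.2f}")
```

For these five points the estimate comes out at about 0.92, well above the 0.5 of a useless test.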
What's a good value for the area
under the curve?
Deciding what a good value is for area under
the curve is tricky and it depends a lot on the
context of your individual problem.
What are the costs associated with
misclassifying someone as non-diseased
when in fact they were diseased? (False Negative)
What are the costs associated with
misclassifying someone as diseased when in
fact they weren’t? (False Positive)
ROC Curves
[Figure: ROC curves for three tests (Test 1, Test 2, Test 3), plotting Sensitivity (Y axis, 0 to 1) against 1 - Specificity (X axis, 0 to 1).]
Bayes Theorem
The 2 x 2 table offers a direct way to
compute the positive and negative
predictive values.
Bayes Theorem gives identical results
without constructing the 2 x 2 table.
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B \mid A)\,P(A) + P(B \mid \text{not } A)\,P(\text{not } A)}$$
Note: this requires $P(B) \neq 0$
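Bayes Theorem
Applying these results:
Positive Predictive Value = P(D+|T+)
$$P(D{+} \mid T{+}) = \frac{P(T{+} \mid D{+})\,P(D{+})}{P(T{+} \mid D{+})\,P(D{+}) + P(T{+} \mid D{-})\,P(D{-})}$$
where P(T+|D+) = Sensitivity and P(T+|D-) = 1 - Specificity
Negative Predictive Value = P(D-|T-)
$$P(D{-} \mid T{-}) = \frac{P(T{-} \mid D{-})\,P(D{-})}{P(T{-} \mid D{-})\,P(D{-}) + P(T{-} \mid D{+})\,P(D{+})}$$
where P(T-|D-) = Specificity and P(T-|D+) = 1 - Sensitivity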
How does Bayes Rule help?
Example: Investigators have developed a
diagnostic test, and in a population we
know the test’s sensitivity and specificity.
The results of a diagnostic test will allow
us to compute the probability of disease.
The new, updated, probability from new
information is called the posterior
probability.
Back to Digoxin example…
Say we know that someone’s probability of
toxicity is 0.6. We now give them the diagnostic
test and find out that their digoxin levels were
high and they tested positive. What is the new
probability of disease, given the positive test
result information?
$$P(D{+} \mid T{+}) = \frac{P(T{+} \mid D{+})\,P(D{+})}{P(T{+} \mid D{+})\,P(D{+}) + P(T{+} \mid D{-})\,P(D{-})}$$
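Back to Digoxin example…
$$P(D{+} \mid T{+}) = \frac{P(T{+} \mid D{+})\,P(D{+})}{P(T{+} \mid D{+})\,P(D{+}) + P(T{+} \mid D{-})\,P(D{-})}$$
We know P(D+) = 0.6, so P(D-) = 1 - 0.6 = 0.4
From before,
Sensitivity = 25/(25+18) = 0.58
Specificity = 78/(78+14) = 0.85, so 1 - Specificity = 1 - 0.85 = 0.15
$$P(D{+} \mid T{+}) = \frac{0.58 \times 0.6}{0.58 \times 0.6 + 0.15 \times 0.4} = 0.85$$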
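Back to Digoxin example…
$$P(D{-} \mid T{-}) = \frac{P(T{-} \mid D{-})\,P(D{-})}{P(T{-} \mid D{-})\,P(D{-}) + P(T{-} \mid D{+})\,P(D{+})}$$
We know P(D+) = 0.6, so P(D-) = 1 - 0.6 = 0.4
From before,
Sensitivity = 25/(25+18) = 0.58, so 1 - Sensitivity = 1 - 0.58 = 0.42
Specificity = 78/(78+14) = 0.85
$$P(D{-} \mid T{-}) = \frac{0.85 \times 0.4}{0.85 \times 0.4 + 0.42 \times 0.6} = 0.57$$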
Digoxin example continued…
What happens to the positive and negative
predictive values if our ‘prior’ probability of
disease, P(D+), changes…
Example 2: What is the new probability of
disease given the same positive test, if
the probability of disease was known to be
0.3 before testing?
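Back to Digoxin example…
$$P(D{+} \mid T{+}) = \frac{P(T{+} \mid D{+})\,P(D{+})}{P(T{+} \mid D{+})\,P(D{+}) + P(T{+} \mid D{-})\,P(D{-})}$$
We know P(D+) = 0.3, so P(D-) = 1 - 0.3 = 0.7
From before,
Sensitivity = 25/(25+18) = 0.58
Specificity = 78/(78+14) = 0.85, so 1 - Specificity = 1 - 0.85 = 0.15
$$P(D{+} \mid T{+}) = \frac{0.58 \times 0.3}{0.58 \times 0.3 + 0.15 \times 0.7} = 0.62$$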
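Back to Digoxin example…
$$P(D{-} \mid T{-}) = \frac{P(T{-} \mid D{-})\,P(D{-})}{P(T{-} \mid D{-})\,P(D{-}) + P(T{-} \mid D{+})\,P(D{+})}$$
We know P(D+) = 0.3, so P(D-) = 1 - 0.3 = 0.7
From before,
Sensitivity = 25/(25+18) = 0.58, so 1 - Sensitivity = 1 - 0.58 = 0.42
Specificity = 78/(78+14) = 0.85
$$P(D{-} \mid T{-}) = \frac{0.85 \times 0.7}{0.85 \times 0.7 + 0.42 \times 0.3} = 0.83$$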
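The four digoxin calculations above follow the same pattern, so they can be sketched as two small Python functions. The function names are assumptions chosen for this illustration. The inputs are the rounded sensitivity (0.58) and specificity (0.85) used in the worked examples, together with the two prior probabilities 0.6 and 0.3.

```python
def positive_predictive_value(sensitivity, specificity, prior):
    """P(D+|T+) from Bayes theorem; `prior` is P(D+) before testing."""
    true_pos = sensitivity * prior
    false_pos = (1 - specificity) * (1 - prior)
    return true_pos / (true_pos + false_pos)


def negative_predictive_value(sensitivity, specificity, prior):
    """P(D-|T-) from Bayes theorem; `prior` is P(D+) before testing."""
    true_neg = specificity * (1 - prior)
    false_neg = (1 - sensitivity) * prior
    return true_neg / (true_neg + false_neg)


# Rounded digoxin values from the worked examples.
SENSITIVITY = 0.58
SPECIFICITY = 0.85

for prior in (0.6, 0.3):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prior)
    npv = negative_predictive_value(SENSITIVITY, SPECIFICITY, prior)
    print(f"P(D+)={prior:.1f}  P(D+|T+)={ppv:.2f}  P(D-|T-)={npv:.2f}")
```

Lowering the prior from 0.6 to 0.3 lowers the positive predictive value and raises the negative predictive value, even though the test itself is unchanged.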
Hemophilia example continued…
Example: Mrs. X had positive lab results. What is
the probability that she is a carrier?
P(D+|T+)
Hemophilia is a genetic disorder. If Mrs. X’s
mother was a carrier, Mrs. X would have a 50-50
chance of being a carrier. (Prior probability)
If all we knew was that her grandmother was a
carrier, Mrs. X would have a 25% chance of
being a carrier.
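Hemophilia example continued…
$$P(D{+} \mid T{+}) = \frac{P(T{+} \mid D{+})\,P(D{+})}{P(T{+} \mid D{+})\,P(D{+}) + P(T{+} \mid D{-})\,P(D{-})}$$
From before,
Sensitivity = 32/(32+2) = 0.94
Specificity = 28/(28+6) = 0.82, so 1 - Specificity = 0.18
Mother was a carrier (prior probability 0.5):
$$P(D{+} \mid T{+}) = \frac{0.94 \times 0.5}{0.94 \times 0.5 + 0.18 \times 0.5} = 0.84$$
Grandmother was a carrier (prior probability 0.25):
$$P(D{+} \mid T{+}) = \frac{0.94 \times 0.25}{0.94 \times 0.25 + 0.18 \times 0.75} = 0.64$$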
Summary
Bayes theorem allows us to calculate the
positive and negative predictive values
using only sensitivity, specificity, and the
probability of disease (prevalence).
Likelihood and Odds
Likelihood Ratio:
$$\mathrm{LR}{+} = \frac{\text{Sensitivity}}{1 - \text{Specificity}}$$
$$\mathrm{LR}{-} = \frac{1 - \text{Sensitivity}}{\text{Specificity}}$$
What would a good LR+ look like?
HIGH LR+ and LOW LR- imply both sensitivity
and specificity are close to 1
Likelihood and Odds
The odds in favor of ‘A’ are defined as:
$$\text{Odds in favor of } A = \frac{P(A)}{P(\text{not } A)} = \frac{P(A)}{1 - P(A)}$$
Example: if P(A) = 2/3 then the odds in
favor of A are:
$$\frac{2/3}{1 - 2/3} = 2 \quad \text{(or 2 to 1)}$$
Likelihood and Odds
We can also calculate the probability knowing the odds of disease:
$$P(A) = \frac{\text{odds}}{1 + \text{odds}}$$
Example: if the odds = 2 (that is 2:1)
then the probability of A is:
$$\frac{2}{1 + 2} = \frac{2}{3}$$
Likelihood and Odds
Some more simple examples:
-The odds in favor of heads when a fair coin is tossed
are 1. (Ratio of 1:1)
-The odds in favor of rolling a ‘6’ on any throw of a
fair die are 0.2. (Ratio of 1:5)
-The odds AGAINST rolling a ‘6’ on any throw of a fair
die are 5. (Ratio of 5:1)
-The odds in favor of drawing an ace from an
ordinary deck of playing cards are 1/12. (Ratio of 1:12)
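The two conversions between probability and odds can be sketched in Python and checked against the examples above. The function names are assumptions for illustration. The standard-library `Fraction` type is used so that results such as 1/12 stay exact instead of being rounded.

```python
from fractions import Fraction


def odds_from_probability(p):
    """Odds in favor of an event with probability p (p must not be 1)."""
    return p / (1 - p)


def probability_from_odds(odds):
    """Probability of an event with the given odds in favor."""
    return odds / (1 + odds)


examples = {
    "P(A) = 2/3": Fraction(2, 3),
    "heads on a fair coin": Fraction(1, 2),
    "rolling a 6 on a fair die": Fraction(1, 6),
    "drawing an ace from 52 cards": Fraction(4, 52),
}

for label, p in examples.items():
    odds = odds_from_probability(p)
    # Converting back should recover the original probability.
    assert probability_from_odds(odds) == p
    print(f"{label}: probability {p}, odds in favor {odds}")
```

The odds against an event are the reciprocal of the odds in favor, so the odds against rolling a 6 are 1 / (1/5) = 5.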
Likelihood and odds
Recall, ‘prior’ probability was the known
probability of outcome (ex. Disease)
before our diagnostic test.
Posterior probability is the probability of
outcome (ex. Disease) after updating
results from our diagnostic test.
Prior and posterior odds have the same
definition.
Posterior Odds
$$\text{Posterior odds in favor of } A = \text{Prior odds in favor of } A \times \text{Likelihood ratio}$$
Use LR+ if they tested positive
Use LR- if they tested negative
Hemophilia example continued…
What were the odds that Mrs. X was a
carrier when the only information known
was:
Her mother was a carrier?
Her grandmother was a carrier?
$$\text{Posterior odds in favor of } A = \text{Prior odds in favor of } A \times \text{Likelihood ratio}$$
Hemophilia example continued…
STEP 1.
What were the prior odds of being a carrier for Mrs. X
when her mother was a carrier? (Hint: she had a 50-50
chance)
Answer: her odds were 1:1, or simply 1.
What were her odds when her grandmother was a
carrier? (Hint: she had a 25% chance)
Answer: her odds were 1:3, or simply 1/3.
Hemophilia example continued…
STEP 2.
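What is the likelihood ratio of the test? (In this
case LR+, since she tested positive in our example.)
Answer:
$$\mathrm{LR}{+} = \frac{\text{Sensitivity}}{1 - \text{Specificity}} = \frac{32/34}{1 - 28/34} = \frac{32}{6} \approx 5.3$$
(Using the rounded values 0.94 and 0.82 gives 0.94/0.18 ≈ 5.2; the value 5.3 comes from the unrounded counts.)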
Hemophilia example continued…
What were the posterior odds that Mrs. X was a carrier, given her
positive test, when the only prior information was that her mother
was a carrier?
$$\text{Posterior odds in favor of } A = \text{Prior odds in favor of } A \times \text{Likelihood ratio} = 1 \times 5.3 = 5.3$$
The odds are 5.3 to 1 in favor of Mrs. X being a carrier.
What were the posterior odds that Mrs. X was a carrier, given her
positive test, when the only prior information was that her
grandmother was a carrier?
$$\text{Posterior odds in favor of } A = \text{Prior odds in favor of } A \times \text{Likelihood ratio} = (1/3) \times 5.3 = 1.8$$
The odds are 1.8 to 1 in favor of Mrs. X being a carrier.
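The odds calculation for Mrs. X can be sketched in Python, and converting the posterior odds back to a probability shows that it agrees with the Bayes theorem results found earlier (0.84 and 0.64). The function names are assumptions for illustration. Exact fractions from the hemophilia table are used, so the likelihood ratio is 16/3 (about 5.3) with no rounding.

```python
from fractions import Fraction

# Hemophilia table: 32 of 34 carriers and 6 of 34 non-carriers test positive.
SENSITIVITY = Fraction(32, 34)
SPECIFICITY = Fraction(28, 34)
LR_POSITIVE = SENSITIVITY / (1 - SPECIFICITY)


def posterior_odds(prior_probability, likelihood_ratio):
    """Prior odds times the likelihood ratio."""
    prior_odds = prior_probability / (1 - prior_probability)
    return prior_odds * likelihood_ratio


def probability_from_odds(odds):
    return odds / (1 + odds)


for relative, prior in [("mother", Fraction(1, 2)),
                        ("grandmother", Fraction(1, 4))]:
    odds = posterior_odds(prior, LR_POSITIVE)
    prob = probability_from_odds(odds)
    print(f"{relative} a carrier: posterior odds {float(odds):.1f} to 1, "
          f"posterior probability {float(prob):.2f}")
```

Working in odds turns the update into a single multiplication, which is why the likelihood ratio is convenient when the same test is applied to patients with different prior probabilities.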
Summary
The prior odds of disease can affect the
posterior odds of a disease even with the
same test result.
The odds of disease can be computed
from the probability of disease and vice
versa.
Reference

JA Ingelfinger, F Mosteller, LA Thibodeau, JH Ware.
Biostatistics in Clinical Medicine, 3rd Edition.
McGraw-Hill Companies, Inc. 1994.