Specificity

Sensitivity, Specificity and ROC Curve Analysis
Criteria for Evaluating a Screening Test
•Validity: provides a good indication of who does and does not have disease
-Sensitivity of the test
-Specificity of the test
•Reliability (precision): gives consistent results when given to the same person under the same conditions
•Yield: amount of disease detected in the population, relative to the effort
-Prevalence of disease/predictive value
Validity of Screening Test (Accuracy)
-Sensitivity: Is the test detecting true cases of disease? Ideal is 100%: 100% of cases are detected; =Pr(T+|D+)
-Specificity: Is the test excluding those without disease? Ideal is 100%: 100% of non-cases are negative; =Pr(T-|D-)
-See Gehlbach, Chp. 10
Example: Screening for Glaucoma using IOP

                True Cases of Glaucoma
IOP > 22:         Yes        No
  Yes              50       100
  No               50      1900
  (total)         100      2000

Sensitivity = 50% (50/100)       False Negative = 50%
Specificity = 95% (1900/2000)    False Positive = 5%
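The arithmetic behind these two percentages can be sketched in a few lines of Python. This is only an illustration: the function name `sensitivity_specificity` and its argument names are assumptions made for this sketch, and the counts are the four cells of the glaucoma table above.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Return (sensitivity, specificity) from the four cells of a 2x2 table."""
    sensitivity = tp / (tp + fn)  # Pr(T+|D+): share of true cases that test positive
    specificity = tn / (tn + fp)  # Pr(T-|D-): share of non-cases that test negative
    return sensitivity, specificity


# Counts from the glaucoma example (IOP > 22 versus true disease status).
se, sp = sensitivity_specificity(tp=50, fn=50, fp=100, tn=1900)
print(f"Sensitivity = {se:.0%}, Specificity = {sp:.0%}")
print(f"False Negative = {1 - se:.0%}, False Positive = {1 - sp:.0%}")
```

The false negative and false positive rates are simply the complements of sensitivity and specificity.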
Where do we set the cut-off for a screening test?
Consider:
-The impact of a high number of false positives: anxiety, cost of further testing
-Importance of not missing a case: seriousness of disease, likelihood of re-screening
Yield from the Screening Test: Predictive Value
•Relationship between Sensitivity, Specificity, and Prevalence of Disease
-If prevalence is low, even a highly specific test will give large numbers of False Positives
•Predictive Value of a Positive Test (PPV): Likelihood
that a person with a positive test has the disease
•Predictive Value of a Negative Test (NPV): Likelihood
that a person with a negative test does not have the
disease
Screening for Glaucoma using IOP

                True Cases of Glaucoma
IOP > 22:         Yes        No
  Yes              50       100
  No               50      1900
  (total)         100      2000

Specificity = 95% (1900/2000)
False Positive = 5%
Positive Predictive Value = 33% (50/150)
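The link between sensitivity, specificity, prevalence and predictive value can be worked through with Bayes' rule. The sketch below is illustrative only: `predictive_values` is a name invented here, the first call uses the glaucoma table (prevalence 100/2100), and the second call uses a prevalence of 1 in 1000, which is a hypothetical value chosen to show the effect of rare disease and is not taken from the slides.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) as proportions, given test accuracy and disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)  # Pr(D+|T+)
    npv = true_neg / (true_neg + false_neg)  # Pr(D-|T-)
    return ppv, npv


# Glaucoma example: 100 cases among 2100 people screened.
ppv, npv = predictive_values(0.50, 0.95, prevalence=100 / 2100)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")

# Hypothetical rarer disease (1 in 1000), same test: PPV drops sharply.
ppv_rare, _ = predictive_values(0.50, 0.95, prevalence=0.001)
print(f"PPV at 1-in-1000 prevalence = {ppv_rare:.1%}")
```

With the table's own prevalence the PPV comes out at 50/150, matching the slide; holding the test fixed and lowering prevalence is what drives the PPV down.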
How Good does a Screening Test have to be?
IT DEPENDS
-Seriousness of disease, consequences of high false positivity rate:
-Rapid HIV test should have >90% sensitivity, 99.9% specificity
-Screen for nearsighted children proposes 80% sensitivity, >95% specificity
-Pre-natal genetic questionnaire could be 99% sensitive, 80% specific
Choosing a cut-point: receiver operating characteristic curves
• Situation where screening test yields results as
a continuous value (e.g., intraocular pressure
for glaucoma)
• Want to select a value above (or below) which
to call “diseased” or “at risk”
• How do we select that value?
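One way to see what is at stake in the choice is to try every candidate cut-point and record the true positive and false positive fractions each one produces. The sketch below assumes a test where higher values suggest disease; the function name `roc_points` and the IOP readings are hypothetical, invented purely for illustration and not taken from any study.

```python
def roc_points(diseased, non_diseased):
    """For each candidate threshold, return (threshold, TP fraction, FP fraction).

    A result is called positive when it is strictly greater than the threshold.
    """
    thresholds = sorted(set(diseased) | set(non_diseased))
    points = []
    for t in thresholds:
        tp_fraction = sum(v > t for v in diseased) / len(diseased)
        fp_fraction = sum(v > t for v in non_diseased) / len(non_diseased)
        points.append((t, tp_fraction, fp_fraction))
    return points


# Hypothetical IOP readings (made up for this example).
diseased = [19, 23, 25, 28, 31]
non_diseased = [12, 14, 16, 18, 21, 24]

points = roc_points(diseased, non_diseased)
for threshold, tpf, fpf in points:
    print(f"IOP > {threshold}: TP fraction = {tpf:.2f}, FP fraction = {fpf:.2f}")
```

Raising the threshold lowers both fractions together, which is the trade-off the cut-point has to balance.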
[Figure: distributions of non-diseased cases and diseased cases, with a threshold marked. Horizontal axis: test result value, or subjective judgment of likelihood that case is diseased]
More typically:
[Figure: distributions of non-diseased cases and diseased cases. Horizontal axis: test result value, or subjective judgment of likelihood that case is diseased]
[Figure, three panels: threshold between non-diseased cases and diseased cases set for a less aggressive mindset, a moderate mindset, and a more aggressive mindset. Axes: TP Fraction (sensitivity) against FP Fraction (1-specificity)]
[Figure: non-diseased cases and diseased cases with a threshold, and the entire ROC curve. Axes: TP Fraction (sensitivity) against FP Fraction (1-specificity)]
[Figure: entire ROC curves compared. Curve labels: highly discriminating (good); somewhat discriminating (not as good); non-informative (no better than chance). Annotation: reader skill and/or level of technology. Axes: TP Fraction (sensitivity) against FP Fraction (1-specificity)]
Use area under the curve (AUC) to judge discriminating ability.
Gehlbach: want AUC>80%
Luke Neff: Refractory Burn Shock Data
Logistic Regression and ROC Curve Analysis
Response Profile

Ordered Value    PET    Total Frequency
      1           0           22
      2           1           20

Testing Global Null Hypothesis: BETA=0

Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio       20.2651     1        <.0001
Score                  15.3270     1        <.0001
Wald                   10.1930     1        0.0014
Analysis of Maximum Likelihood Estimates

Parameter            DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Intercept             1     -3.0649            0.9514            10.3771        0.0013
Admission Lactate     1      0.8436            0.2642            10.1930        0.0014

Odds Ratio Estimates

Effect               Point Estimate    95% Wald Confidence Limits
Admission Lactate             2.325         1.385         3.902
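The odds ratio and its Wald limits follow from the Admission Lactate estimate and standard error by exponentiation. The sketch below shows that step; `odds_ratio_with_wald_ci` is a name invented here, and the normal quantile 1.959964 for a 95% interval is an assumption about how the limits were formed, consistent with the "95% Wald" label.

```python
import math


def odds_ratio_with_wald_ci(estimate, std_error, z=1.959964):
    """Exponentiate a logistic coefficient and its Wald limits (z = 95% normal quantile)."""
    point = math.exp(estimate)
    lower = math.exp(estimate - z * std_error)
    upper = math.exp(estimate + z * std_error)
    return point, lower, upper


# Admission Lactate: estimate and standard error from the table above.
or_point, or_lower, or_upper = odds_ratio_with_wald_ci(0.8436, 0.2642)
print(f"Odds ratio = {or_point:.3f} ({or_lower:.3f}, {or_upper:.3f})")
```

Read this way, each one-unit increase in admission lactate multiplies the odds of the outcome by roughly 2.3.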
ROC Area

Area      Standard Error    95% Wald Confidence Limits
0.8489            0.0633        0.7249        0.9729
[Figure annotation: point that maximizes the sum of sensitivity and specificity. Corresponds to a lactate value of about 3.0]
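A lactate value can be converted to a predicted probability, and back, using the fitted intercept and slope. The sketch below assumes the predicted probabilities used in the ROC analysis come from the single-predictor model reported above (intercept -3.0649, slope 0.8436); the function names are hypothetical.

```python
import math

INTERCEPT = -3.0649  # from the maximum likelihood estimates
SLOPE = 0.8436       # coefficient for Admission Lactate


def predicted_probability(lactate):
    """Predicted probability of the outcome at a given admission lactate."""
    return 1 / (1 + math.exp(-(INTERCEPT + SLOPE * lactate)))


def lactate_for_probability(p):
    """Invert the model: the lactate value that gives predicted probability p."""
    return (math.log(p / (1 - p)) - INTERCEPT) / SLOPE


p_at_3 = predicted_probability(3.0)
print(f"Predicted probability at lactate 3.0: {p_at_3:.4f}")
print(f"Lactate at predicted probability 0.3695: {lactate_for_probability(0.3695):.2f}")
```

Under that assumption, a lactate of about 3.0 corresponds to a predicted probability of roughly 0.37.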
Classification Table

Pred Prob   True Pos   True Neg   False Pos   False Neg     Se    1 - Sp
  0.9995        1         22          0          19        0.05    0
  0.9863        2         22          0          18        0.1     0
  0.9838        3         22          0          17        0.15    0
  0.96          4         22          0          16        0.2     0
  0.9402        6         22          0          14        0.3     0
  0.9353        7         22          0          13        0.35    0
  0.9182        8         22          0          12        0.4     0
  0.889         9         22          0          11        0.45    0
  0.8401       10         22          0          10        0.5     0
  0.8284       11         22          0           9        0.55    0
  0.7894       12         22          0           8        0.6     0
  0.675        12         21          1           8        0.6     0.05
  0.637        12         20          2           8        0.6     0.09
  0.5767       12         18          4           8        0.6     0.18
  0.5351       13         17          5           7        0.65    0.23
  0.493        14         17          5           6        0.7     0.23
  0.4302       14         16          6           6        0.7     0.27
  0.4096       15         16          6           5        0.75    0.27
  0.3894       16         16          6           4        0.8     0.27
  0.3695       17         16          6           3        0.85    0.27
  0.3312       18         15          7           2        0.9     0.32
  0.3127       18         14          8           2        0.9     0.36
  0.2611       18         13          9           2        0.9     0.41
  0.2299       18         12         10           2        0.9     0.45
  0.1881       19         10         12           1        0.95    0.55
  0.1637       19          8         14           1        0.95    0.64
  0.1525       19          7         15           1        0.95    0.68
  0.1419       19          5         17           1        0.95    0.77
  0.1226       19          4         18           1        0.95    0.82
  0.1056       19          2         20           1        0.95    0.91
  0.0907       19          1         21           1        0.95    0.95
  0.0718       20          0         22           0        1       1
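The area under the ROC curve can be approximated from these rows by joining the (1 - Sp, Se) points with straight lines and summing the trapezoids. The sketch below does that with the True Pos and False Pos counts from the table, using 20 diseased and 22 non-diseased subjects as in the response profile. The function name `trapezoid_auc` is invented here, and the trapezoid rule is an assumption about how the area is computed, not a statement of the original software's method.

```python
# True positive and false positive counts, one entry per row of the table,
# ordered from the highest predicted-probability cutoff to the lowest.
TRUE_POS = [1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 12, 12, 12, 13, 14,
            14, 15, 16, 17, 18, 18, 18, 18, 19, 19, 19, 19, 19, 19, 19, 20]
FALSE_POS = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 4, 5, 5,
             6, 6, 6, 6, 7, 8, 9, 10, 12, 14, 15, 17, 18, 20, 21, 22]


def trapezoid_auc(tp_counts, fp_counts, n_diseased, n_non_diseased):
    """Area under the ROC curve by the trapezoid rule, starting from the origin."""
    xs = [0.0] + [fp / n_non_diseased for fp in fp_counts]  # 1 - specificity
    ys = [0.0] + [tp / n_diseased for tp in tp_counts]      # sensitivity
    area = 0.0
    for i in range(1, len(xs)):
        area += (xs[i] - xs[i - 1]) * (ys[i] + ys[i - 1]) / 2
    return area


auc = trapezoid_auc(TRUE_POS, FALSE_POS, n_diseased=20, n_non_diseased=22)
print(f"AUC = {auc:.4f}")
```

Working from the counts rather than the rounded Se and 1 - Sp columns avoids rounding error; the result agrees with the reported area of 0.8489.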