Proof that max(Youden) occurs when using incidence of the outcome,

Another Youden property
The maximum of Youden’s Index, Y, defined as (sens + spec – 1), occurs when one uses the
prevalence of disease as one’s threshold for defining high vs. low risk in a well-calibrated prediction
model.
Proof: This is really just a reinterpretation of the Neyman-Pearson lemma.
It is instructive, however, and actually much simpler, to prove it via a ROC diagram (and even
simpler with a FROC (prior frequency-weighted ROC) diagram).
Consider the graph! In a ROC diagram (sens plotted against 1 – spec), Youden's Y is constant wherever
(sens + spec) is constant, namely along any 45-degree line. The slope of such a line corresponds to a
DLR (diagnostic likelihood ratio) of 1:1 (signifying “no information”), as exemplified by
DLR("no data") = 1:1, i.e. the main diagonal.
The maximum of sens + spec, and thus of Y, is attained where the ROC curve touches the highest such
line, i.e., where its local slope is also 1, so that the local DLR is 1:1 and, accordingly, the test
result at that point is uninformative.
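The slope argument can be written out as a short derivation. This is only a sketch under added assumptions: the ROC curve is written as y = R(x), a notation not used in the original, and R is taken to be smooth and concave.

```latex
\begin{align*}
x &= 1 - \mathrm{spec}, \qquad y = \mathrm{sens} = R(x), \\
Y &= \mathrm{sens} + \mathrm{spec} - 1 = y - x = R(x) - x, \\
\frac{dY}{dx} &= R'(x) - 1 = 0 \iff R'(x) = 1 .
\end{align*}
```

Because R'(x) decreases along a concave curve, dY/dx changes sign from positive to negative at that point, so the stationary point is a maximum. The slope R'(x) is the local DLR at the corresponding cut-off, which is why slope 1 means DLR = 1:1.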
Now, these "no data" situations imply
posterior odds = (1:1)(prior odds) = prior odds = "prevalence" odds,
so the maximum of Y is attained when the posterior odds equal the prevalence odds
(more vividly phrased: where the posterior odds tip from being higher to being lower than the
prevalence odds). Choose that point as your threshold, and Y is maximized. QED.
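A small exact check may make the result concrete. The sketch below assumes a hypothetical, perfectly calibrated model with five risk strata; the risk values, the stratum weights, and the function name `youden` are illustrative choices, not part of the original note. Subjects are called high risk when their predicted risk exceeds the threshold.

```python
from fractions import Fraction

# Hypothetical well-calibrated model: in each stratum the predicted risk p
# equals the true event probability. Weights are the stratum shares.
risks = [Fraction(1, 20), Fraction(1, 10), Fraction(1, 5),
         Fraction(2, 5), Fraction(7, 10)]
weights = [Fraction(3, 10), Fraction(3, 10), Fraction(1, 5),
           Fraction(1, 10), Fraction(1, 10)]

# Prevalence is the weighted mean of the calibrated risks.
prevalence = sum(w * p for p, w in zip(risks, weights))


def youden(threshold):
    """Return sens + spec - 1 when 'high risk' means p > threshold."""
    # Cases correctly flagged as high risk, as a share of all cases.
    sens = sum(w * p for p, w in zip(risks, weights)
               if p > threshold) / prevalence
    # Non-cases correctly left as low risk, as a share of all non-cases.
    spec = sum(w * (1 - p) for p, w in zip(risks, weights)
               if p <= threshold) / (1 - prevalence)
    return sens + spec - 1


# Every distinct classification rule is reached by one of these cut-offs.
candidates = [Fraction(0)] + risks
best = max(youden(t) for t in candidates)

print("prevalence:", prevalence)
print("Y at prevalence threshold:", youden(prevalence))
print("best Y over all cut-offs:", best)
```

In this setup each stratum contributes w(p - prevalence) / (prevalence(1 - prevalence)) to Y when it is labelled high risk, so Y grows only by including strata whose risk exceeds the prevalence, which is the tipping point described above.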
[With a general, not necessarily quantitative, test result variable the diagram is still
applicable provided that the diagram is 'concavified' (test values are reordered by decreasing DLR
from left to right).]
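The reordering step can be sketched for a nominal test result. The category labels and counts below are hypothetical, and every category is assumed to contain at least one non-diseased subject so that each DLR is finite.

```python
from fractions import Fraction

# Hypothetical counts per test-result category (illustrative only).
diseased = {"A": 10, "B": 40, "C": 20, "D": 30}
healthy = {"A": 40, "B": 10, "C": 30, "D": 20}

n_dis = sum(diseased.values())
n_hea = sum(healthy.values())


def dlr(cat):
    """Diagnostic likelihood ratio of one category."""
    return Fraction(diseased[cat], n_dis) / Fraction(healthy[cat], n_hea)


# 'Concavify': order the categories by decreasing DLR, left to right.
ordered = sorted(diseased, key=dlr, reverse=True)

# Cumulate into ROC points (x = 1 - spec, y = sens), starting at the origin.
roc = [(Fraction(0), Fraction(0))]
for cat in ordered:
    x, y = roc[-1]
    roc.append((x + Fraction(healthy[cat], n_hea),
                y + Fraction(diseased[cat], n_dis)))

# Youden's Y at each ROC point is y - x.
youden = [y - x for x, y in roc]
best = max(range(len(roc)), key=lambda i: youden[i])

# Categories to the left of the best point are called 'positive'.
positive = ordered[:best]
print("order:", ordered)
print("positive categories:", positive, "with Y =", youden[best])
```

With these counts the categories called positive are exactly those whose DLR exceeds 1:1, in line with the argument for the quantitative case.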