Statistics in Screening/Diagnosis
Annie Herbert
Research & Development Department
Salford Royal Hospitals NHS Foundation Trust
annie.herbert@manchester.ac.uk
0161 2064567

Outline
• Intro: Design, recording results
• Sensitivity
• Specificity
• Continuous variables: ROC curves
• Predictive values
• Likelihood ratio
• Bias

Introduction: diagnostic test
Patient enters clinic and takes test, e.g. blood sample.
• Patient has disease:
  – Test result: Positive (right diagnosis, ‘true positive’)
  – Test result: Negative (wrong diagnosis, ‘false negative’)
• Patient doesn’t have disease:
  – Test result: Positive (wrong diagnosis, ‘false positive’)
  – Test result: Negative (right diagnosis, ‘true negative’)

Introduction: assessing a diagnostic test
All participants receive both the index test and the reference test, and the results are recorded in a 2 x 2 table.

                      Reference test
                      +        -
Index test      +     90       60
                -     10       240

Introduction: 2x2 table of results

                        ‘TRUTH’ (by Reference test, Gold standard)
                        +                     -                       Total
TEST RESULT       +     True Positive         False Positive          Total said to have disease
(by Index test)   -     False Negative        True Negative           Total said not to have disease
Total                   Total with disease    Total without disease

Hypothetical example: Breast cancer screening study
• Gold standard: Mammography
• Cheaper/more convenient option: GP examination

Breast cancer screening results

                        ‘TRUTH’ (by mammography)
                        +        -        Total
TEST RESULT       +     95       45       140
(by GP Exam)      -     5        855      860
Total                   100      900      1000

Sensitivity - Definition
• What proportion of people who have the condition are identified as positive by the test?
• If a test has a high sensitivity, most people with the condition are picked up by the test.
[Diagram labels: Have condition; +ve test]

Sensitivity - Calculation

                        ‘TRUTH’ (by gold standard)
                        +        -
TEST RESULT       +     a        b
                  -     c        d
Total                   a+c      b+d

Sensitivity = a/(a+c)

Sensitivity - Example

                        Mammography
                        +        -
GP Exam           +     95       45
                  -     5        855
Total                   100      900

Sensitivity = 95/100 = 0.95
I.e.
95% of patients diagnosed as having breast cancer by the mammogram are picked up by GP examination.

Specificity - Definition
• What proportion of people who don’t have the condition are identified as negative by the test?
• If a test has a high specificity, most people without the condition are ruled out by the test.
[Diagram labels: Don’t have condition; -ve test]

Specificity - Calculation

                        ‘TRUTH’ (by gold standard)
                        +        -
TEST RESULT       +     a        b
                  -     c        d
Total                   a+c      b+d

Specificity = d/(b+d)

Specificity - Example

                        Mammography
                        +        -
GP Exam           +     95       45
                  -     5        855
Total                   100      900

Specificity = 855/900 = 0.95
I.e. 95% of patients diagnosed as not having breast cancer by the mammogram are ruled out by GP examination.

Sensitivity & Specificity - Notes
• It is essential to have a confirmed ‘true’ diagnosis (+ve or -ve). The sensitivity and specificity estimates are only as accurate as the gold standard.
• Sensitivity and specificity are estimated from a sample, and so should be accompanied by confidence intervals to convey the amount of uncertainty. (StatsDirect: Analysis -> Proportions -> Single)

Tests based on continuous variables (1)
• One or more continuous variables can be a marker for a condition, where a very low/high level indicates a low/high likelihood of having the condition.
• A cut-off level can be determined, where a value higher/lower than that cut-off indicates a positive test result.
• Different cut-off points will give different sensitivity/specificity values.

Tests based on continuous variables (2)
E.g.
Creatine kinase in patients with unstable angina or acute myocardial infarction.
[Figure: creatine kinase (axis from 10 to 10000) plotted for two groups, angina and myocardial infarction.]
Data of Frances Boa, from ‘An Introduction to Medical Statistics’ by Martin Bland.

Tests based on continuous variables (3)
Cut-off level at 80
[Figure: creatine kinase by group (angina, myocardial infarction) with the cut-off marked.]

                  Truth
                  +        -
Test        +     27       54
            -     0        39
Total             27       93

Sensitivity = 27/27 = 1.0
Specificity = 39/93 = 0.42

Tests based on continuous variables (4)
Cut-off level at 100
[Figure: creatine kinase by group (angina, myocardial infarction) with the cut-off marked.]

                  Truth
                  +        -
Test        +     26       35
            -     1        58
Total             27       93

Sensitivity = 26/27 = 0.96
Specificity = 58/93 = 0.62

The trade-off
• Plot sensitivity against (1-specificity) to get the ROC (‘receiver operating characteristic’) curve.
• Ideally we want high sensitivity and high specificity (but an increase in one is at the expense of the other).
• Choosing a cut-off also requires some clinical judgement, e.g. it is likely considered better to send women without breast cancer to have a mammogram than to give those with breast cancer the all clear.
• Check sensitivity and specificity values in a new sample.

ROC curve
[Figure: ROC plot for MI data. Sensitivity (0.00 to 1.00) against 1-Specificity (0.00 to 1.00), with annotations marking where sensitivity = 1.0, sensitivity = 0.0, specificity = 1.0 and specificity = 0.0.]
The diagonal line represents sensitivity = 1 - specificity, i.e. taking the test is as good as flipping a coin.

Optimum cut-off
MI data:
• ‘Optimum’ cut-off point selected = 302
• Sensitivity (95% CI) = 0.93 (0.76 to 0.99)
• Specificity (95% CI) = 0.97 (0.91 to 0.99)
Note: ‘optimum’ assumes sensitivity and specificity are of equal concern.

Area under the ROC curve
• Area under the ROC curve can be between 0 (sensitivity and specificity always 0.0) and 1 (sensitivity and specificity always 1.0); an area of 0.5 corresponds to the diagonal line, i.e. a test no better than chance.
• Can be useful for comparing two tests.
• MI data: Area under the curve is an estimate of the ‘probability that creatine kinase of a random person with MI will be higher than for a random person with angina’.
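Worked example (cut-offs, ROC points and area under the curve): the following is a minimal Python sketch, not part of the original slides. It recomputes sensitivity and specificity from the two creatine kinase tables above (cut-offs 80 and 100), turns each into a point on the ROC plot, and illustrates the pairwise interpretation of the area under the curve. The function names `sens_spec` and `auc_by_pairs` are hypothetical, and the `toy_mi` / `toy_angina` values are made up purely for illustration; they are not the Boa data.

```python
from typing import Dict, Sequence, Tuple


def sens_spec(tp: int, fp: int, fn: int, tn: int) -> Tuple[float, float]:
    """Return (sensitivity, specificity) from the four cells of a 2 x 2 table."""
    return tp / (tp + fn), tn / (fp + tn)


def auc_by_pairs(with_condition: Sequence[float],
                 without_condition: Sequence[float]) -> float:
    """Proportion of (with, without) pairs where the 'with' value is higher.

    Ties count as half a win. This is the pairwise reading of the area
    under the ROC curve described on the slide.
    """
    wins = 0.0
    for x in with_condition:
        for y in without_condition:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(with_condition) * len(without_condition))


# Counts taken from the slides: (TP, FP, FN, TN) at each cut-off.
tables: Dict[int, Tuple[int, int, int, int]] = {
    80: (27, 54, 0, 39),
    100: (26, 35, 1, 58),
}

# Each cut-off gives one point (1 - specificity, sensitivity) on the ROC plot.
roc_points: Dict[int, Tuple[float, float]] = {}
for cutoff, cells in tables.items():
    sensitivity, specificity = sens_spec(*cells)
    roc_points[cutoff] = (1 - specificity, sensitivity)
    print(f"cut-off {cutoff}: sensitivity={sensitivity:.2f}, "
          f"specificity={specificity:.2f}")

# HYPOTHETICAL marker values, invented for illustration only.
toy_mi = [310.0, 450.0, 95.0, 1200.0]
toy_angina = [40.0, 85.0, 120.0, 60.0, 95.0]
print(f"toy AUC = {auc_by_pairs(toy_mi, toy_angina):.2f}")
```

Raising the cut-off from 80 to 100 moves the ROC point down and to the left: one MI patient is missed, but 19 fewer angina patients test positive.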
The difference between sensitivity & specificity and predictive values…
• Sensitivity & Specificity: How good is the test at making the right diagnosis?
• Predictive Values: Once diagnosis has been made, how reliable is it?

Positive Predictive Value - Definition
• Proportion of those with a positive test result that actually have the condition.
• If a test has a high positive predictive value and someone tests positive for the condition, there is a high probability that they have it.
[Diagram labels: Test positive; Have condition]

Positive Predictive Value - Calculation

                  ‘TRUTH’
                  +        -        Total
TEST        +     a        b        a+b
RESULT      -     c        d        c+d

PPV = a/(a+b)

Positive Predictive Value - Example

                        ‘TRUTH’ (by mammogram)
                        +        -        Total
TEST RESULT       +     95       45       140
(by GP Exam)      -     5        855      860

PPV = 95/140 = 0.68
I.e. 68% of patients who test positive for breast cancer by GP examination could be expected to test positive by mammogram.

Negative Predictive Value - Definition
• Proportion of those with a negative test result that really don’t have the condition.
• If a test has a high negative predictive value and someone tests negative for the condition, there is a high probability that they don’t have it.
[Diagram labels: Don’t have condition; Test negative]

Negative Predictive Value - Calculation

                  ‘TRUTH’
                  +        -        Total
TEST        +     a        b        a+b
RESULT      -     c        d        c+d

NPV = d/(c+d)

Negative Predictive Value - Example

                        ‘TRUTH’ (by mammogram)
                        +        -        Total
TEST RESULT       +     95       45       140
(by GP Exam)      -     5        855      860

NPV = 855/860 = 0.99
I.e. 99% of patients who test negative for breast cancer by GP examination would be expected to test negative by mammogram.

Prevalence
• What proportion of people in a cohort have the disease? E.g. “The prevalence of breast cancer in females over 40 years of age is approximately 1.5%”.
• ‘Prevalence’ is not the same as ‘incidence’.
• Sensitivity & specificity values are unaffected by prevalence, though predictive values are. E.g.
Test with 95% sensitivity and 95% specificity (source: MeReC Briefing, supplement to issue 30).

Example: Self administered cognitive screening test (TYM) for detection of Alzheimer’s disease: cross sectional study, Brown et al, June 2009
“A score of 42/50 had a sensitivity of 93% and specificity of 86% in the diagnosis of Alzheimer’s disease. The TYM was more sensitive in detection of Alzheimer’s disease than the mini-mental examination, detecting 93% of patients compared with 52% for the mini-mental state examination. The negative and positive predictive values of the TYM with the cut off of 42 were 99% and 42% with a prevalence of Alzheimer’s disease of 10%.”

Likelihood Ratio - Definition
• How many times more (or less) likely a patient with the condition is to have that particular test result than a patient without the condition.
• Can be used to calculate the probability of an individual patient having the condition, based on their test results.
[Figure: Use of Fagan's nomogram for calculating post-test probabilities. Deeks, J. J. et al. BMJ 2004;329:168-169.]

Bias in studies:
• Is the reference appropriate?
• Was the same reference used for all patients (verification bias)?
• Were assessors blind to case details?
• Was it a ‘diagnostic case-control study’?
• Which population was the test tested in?

Summary
• All patients must have both new test & reference test (gold standard).
• Report 2x2 table and give sensitivity, specificity with precision.
• A good screening test is not necessarily a good diagnostic test.
• Test cut-offs in an independent sample.
• Predictive values vary according to prevalence.
• Consider all potential sources of bias.

Recommended Texts
• BMJ Statistics Notes:
  – 1) Sensitivity & Specificity
  – 2) Predictive Values
  – 3) ROC Curves
  – 4) Likelihood Ratios
• Assessing bias
  – How to read a paper: Papers that report diagnostic or screening tests, by Trisha Greenhalgh; BMJ 1997;315:540.
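
Worked example (predictive values, prevalence and likelihood ratios): the following is a minimal Python sketch, not part of the original slides. It shows how predictive values follow from sensitivity, specificity and prevalence, and how a likelihood ratio converts a pre-test probability into a post-test probability via odds (the calculation a Fagan nomogram performs graphically). The function names are hypothetical, and the prevalences looped over for the 95%/95% test are illustrative choices, not figures from the MeReC Briefing.

```python
from typing import Tuple


def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (LR+, LR-). LR+ is undefined when specificity is exactly 1."""
    return sensitivity / (1 - specificity), (1 - sensitivity) / specificity


def post_test_probability(pre_test_probability: float,
                          likelihood_ratio: float) -> float:
    """Convert probability to odds, multiply by the LR, convert back."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# TYM figures quoted on the slide: sensitivity 93%, specificity 86%, prevalence 10%.
ppv, npv = predictive_values(0.93, 0.86, 0.10)
print(f"TYM: PPV={ppv:.2f}, NPV={npv:.2f}")

# The same PPV is reached by the likelihood ratio route.
lr_pos, lr_neg = likelihood_ratios(0.93, 0.86)
print(f"TYM: LR+={lr_pos:.1f}, post-test probability after a positive result="
      f"{post_test_probability(0.10, lr_pos):.2f}")

# A 95% sensitive, 95% specific test at ILLUSTRATIVE prevalences.
for prevalence in (0.01, 0.10, 0.50):
    ppv_p, npv_p = predictive_values(0.95, 0.95, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV={ppv_p:.2f}, NPV={npv_p:.2f}")
```

With sensitivity and specificity held fixed, the positive predictive value falls sharply as prevalence falls, which is why a test that performs well in a clinic population can produce mostly false positives when used for screening.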