Some Difficult Decisions are Easier without Computer Support
/ TA Mammography, RT Diversity /

Andrey A. Povyakalo (joint work with E Alberdi, L Strigini and P Ayton)
andrey@csr.city.ac.uk
DIRC workshop, Edinburgh, 16 March 2005

Computer Aided Detection for Mammography

Computer Aided Detection (CAD) Tool
– aims to mark Regions of Interest (ROI) on a digitised mammogram image to prevent overlooking by the human reader
– applies a pattern recognition algorithm
– claimed not to be a diagnostic tool
– claimed that: “the potential for missed lesions is not increased over routine screening mammography when used as labelled”

Prescribed Procedure
– Reader looks at the original mammogram and interprets it as usual, then
– activates CAD and looks at a small low-resolution image of the mammogram with marked ROIs on it, then
– checks whether or not some ROIs have been overlooked and revises her/his assessment, if necessary

Controversy
• US FDA (1998) “… use of the device improved the radiologist's detection rate from approximately 80 out of 100 cancers to almost 88 out of 100…”
• Warren Burhenne, LJ et al. (2000) “...CAD prompting could have potentially helped reduce this false-negative rate by 77% (89 of 115) without an increase in the recall rate.”
• Brem, RF et al. (2003) “…for every 100,000 women with breast cancer identified without the use of computer-aided detection, an estimated additional 21,200 cancers would be found with the use of computer-aided detection. ...”
• Freer, TW & Ulissey MJ (2001) “The use of CAD ... can increase the detection of early-stage malignancies without undue effect on the recall rate or positive predictive value for biopsy.” (8 more cancers of 49 found with CAD)
• Taylor, PM et al. (2004) “… this version of the ImageChecker would not have a significant impact on the UK screening programme...”
• Gur, D et al. (2004) “The introduction of computer-aided detection … was not associated with statistically significant changes in recall and ... detection rates”
• Alberdi, E et al. (2004) “Possible automation bias effects in CAD use ... may degrade human decision-making for some categories of cases under certain conditions...”

HTA trial (University College London)
• 50 readers looked at 180 cases:
– 60 cancers
– 120 normal cases (‘normals’)
• in two conditions:
– without computer support (unprompted session)
– with computer support (prompted session)
• to make a recall decision
• Rate of cancers much higher than in real working conditions
• CADT printout used instead of the real system
• Results:
– the trial administrators found NO statistically significant impact of CAD on human performance

Trial data for cancers
[Figure: trial data for cancers. Horizontal axis: cases ranked by their difficulty. Vertical axis: readers ranked by their sensitivity.]
• Sensitivity: fraction of cases recalled by the reader without CAD
• Case difficulty: fraction of readers missing the case without CAD
• Blue points mark <case, reader> pairs where the unaided decision was wrong and the decision supported by CAD was correct
• Red points mark <case, reader> pairs where the unaided decision was correct but the decision supported by CAD was wrong

Regression Estimates
• Difficulty of case i: d(i)
• Sensitivity of reader j: f(j)
• Probability that reader j recalls case i
– in the unprompted condition: P_un(i, j) = F(d(i), f(j))
– in the prompted condition: P_pr(i, j) = G(d(i), f(j))
• F, G: some functions found via logit regression
• Impact: Imp(d(i), f(j)) = G(d(i), f(j)) - F(d(i), f(j))

Effect of CAD on probability of recalling cancer (all cases)
[Figure: estimated effect of CAD plotted over case difficulty and reader sensitivity. Horizontal axis: fraction of readers missing the case without CAD (case difficulty). Vertical axis: fraction of cases recalled by the reader without CAD (sensitivity). Annotations: “The more sensitive readers hindered”, “The less sensitive readers benefit”, “Maximum effect”, “Maximum damage”, “More of the easy cases recalled”, “More of the difficult cases missed”.]

Effect of CAD on probability of recalling cancer (cases with correct prompts)
[Figure: same axes as above. Annotations: “The more sensitive readers hindered”, “The less sensitive readers benefit”, “Maximum effect”, “Maximum damage”, “More of the easy cases recalled”.]

Effect of CAD on probability of recalling cancer (cases without correct prompts)
[Figure: same axes as above. Annotations: “The more sensitive readers hindered”, “Maximum damage”, “More of the difficult cases missed”.]

Concordance of decisions
• More precisely: probability that two randomly selected readers either both recall or both do not recall a randomly selected case
• significantly greater in the prompted condition for
– all cases: by 0.812 - 0.789 = 0.022 (95% CI: 0.018, 0.027)
– correctly prompted cases: by 0.849 - 0.834 = 0.015 (95% CI: 0.010, 0.019)
– cases without correct prompts: by 0.701 - 0.655 = 0.046 (95% CI: 0.036, 0.056)
• Does the technology reduce the human diversity?

Conclusions
• Exploratory analyses to generate hypotheses
• Generated hypotheses to be tested with independent data
• Conjecture: CAD helps the less sensitive radiologists
• The use of CAD by more sensitive radiologists is questionable
• Use of CAD leads to more concordance between decisions of different radiologists
• Generalisation of results from studies with a small number of participating radiologists* is questionable
• Mechanisms?
• MIRCAD proposal submitted to EPSRC
– City, Edinburgh and UCL involved
___________
* similar to those by Warren Burhenne, LJ et al. (2000) - 5 readers, Brem, RF et al. (2003) - 7 readers, Freer, TW & Ulissey MJ (2001) - 2 readers (Freer + Ulissey),
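
Illustrative sketch (not part of the original analysis)
The impact estimate Imp = G - F and the concordance measure defined in the slides can be written down in a few lines of Python. This is only a minimal sketch under stated assumptions: the slides say that F and G were found via logit regression but do not give their form, so the linear predictor with an interaction term used here is an assumption, and the function names, coefficient tuples and toy decision matrix are hypothetical and are not the trial's estimates or data. The concordance function also assumes that the two readers are distinct (drawn without replacement).

```python
import math
from itertools import combinations


def logistic(z):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-z))


def recall_probability(d, f, coef):
    """Assumed logit model for the probability of recall.

    d: case difficulty, f: reader sensitivity.
    coef = (b0, b_d, b_f, b_df) is a hypothetical coefficient tuple;
    the interaction term b_df * d * f is an assumption of this sketch.
    """
    b0, b_d, b_f, b_df = coef
    return logistic(b0 + b_d * d + b_f * f + b_df * d * f)


def impact(d, f, coef_unprompted, coef_prompted):
    """Imp(d, f) = G(d, f) - F(d, f): prompted minus unprompted probability."""
    g = recall_probability(d, f, coef_prompted)
    f_un = recall_probability(d, f, coef_unprompted)
    return g - f_un


def concordance(decisions):
    """Probability that two distinct random readers agree on a random case.

    decisions[case][reader] is True if that reader recalled that case.
    Agreement means both recall or both do not recall.
    """
    if not decisions:
        raise ValueError("need at least one case")
    total = 0.0
    for row in decisions:
        pairs = list(combinations(row, 2))
        if not pairs:
            raise ValueError("need at least two readers per case")
        agreeing = sum(1 for a, b in pairs if a == b)
        total += agreeing / len(pairs)
    return total / len(decisions)


# Toy data only: 2 cases, 3 readers, in each condition.
unprompted = [[True, True, False], [True, False, False]]
prompted = [[True, True, True], [True, False, False]]
print("concordance gain:", concordance(prompted) - concordance(unprompted))
```

A positive value of `impact` at a given (difficulty, sensitivity) point corresponds to the "readers benefit" regions of the plots, and a negative value to the "readers hindered" regions. Confidence intervals such as those quoted on the concordance slide would need a resampling or analytic step that this sketch does not attempt.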