JAMA Ophthalmology Journal Club Slides: Centralized Grading of Retinopathy of Prematurity

Daniel E, Quinn GE, Hildebrand PL, et al; e-ROP Cooperative Group. Validated system for centralized grading of retinopathy of prematurity: Telemedicine Approaches to Evaluating Acute-Phase Retinopathy of Prematurity (e-ROP) Study. JAMA Ophthalmol. Published online March 26, 2015. doi:10.1001/jamaophthalmol.2015.0460.

Copyright restrictions may apply

Introduction
• Good neonatal care, especially in emerging world economies with increasing survival of preterm low-birth-weight infants, is propagating an epidemic of retinopathy of prematurity (ROP), elevating ROP into a leading cause of preventable childhood blindness.
• Ophthalmologists skilled in detection of ROP are scarce.
• Remote screening methods are feasible options for focusing this limited expertise on diagnostic examinations of infants at risk and needing treatment.
• Nonphysician trained readers (TRs) of ROP images can substantially reduce cumulative diagnostic screening time of ROP experts if their grading is valid and reliable.
• Objective
– To describe a centralized system for grading digital images of ROP by nonphysician TRs in the Telemedicine Approaches to Evaluating Acute-Phase Retinopathy of Prematurity (e-ROP) Study.

Methods
• Study Design
– Multicenter observational cohort study.
• Participants
– Infants with birth weight <1251 g.
• Data Analysis
– Intergrader and intragrader agreement calculated using exact percentage of agreement and weighted κ.
– Weighted κ was calculated by a weighting matrix specific to each grading item in which discrepant grades were assigned partial credit for agreement depending on how close they were, and “ungradable” was given 25% agreement with all other grades.
– 95% CI for the weighted κ was calculated using the bootstrap.

Methods
• TRs from diverse undergraduate backgrounds went through 3-phase training.
– Phase 1: Didactic lectures, interactive sessions, and reading assignments; visit to neonatal intensive care unit at The Children’s Hospital of Philadelphia to observe imaging of premature babies; and knowledge assessment testing.
– Phase 2: TRs independently viewed and graded training image sets with known ROP grading from another study; used a paper grading form; and had training sessions graded, reviewed, and discussed to determine areas that warranted additional training.
– Phase 3: TRs graded and reviewed 100 ROP image sets using electronic form and grading protocol and met with study chair, reading center (RC) director, and a clinical expert.
• The e-ROP grading protocol detailed criteria for evaluation of image quality and the key morphologic features of ROP.
• A group of 4 experts (retinal specialists experienced in ROP) and the RC director generated consensus final grading results for use in assessing TR performance.
• TRs were certified when agreement with consensus final grading was >80%.

Methods: Study Flow
Stages: enrollment, diagnostic exam, imaging, grading.
• 1285 infants enrolled
• 1257 infants had 4263 diagnostic examinations
• 244 infants had RW-ROP; 1013 infants did not have RW-ROP
• 242 infants with RW-ROP imaged; 999 infants without RW-ROP imaged
• All 242 infants with RW-ROP had image sets selected; 613 infants without RW-ROP had image sets selected
• 1759 image sets from 454 eyes with RW-ROP selected
• 150 image sets from 30 eyes without RW-ROP selected
• 3611 image sets from 1226 eyes without RW-ROP selected
• 5520 image sets from 1710 eyes of 855 infants graded
RW-ROP indicates referral-warranted ROP.

Methods
• Grading of Study Images
– Standard 6-image sets acquired for each eye were graded independently by 2 TRs, with discrepancies adjudicated by RC director.
– Grading performed at standardized independent workstations with secure Internet access and similarly configured computers with monitors calibrated every 2 weeks to maintain consistency in brightness and hue.
– Software developed for displaying and manipulating contrast, brightness, and magnification in the ROP images.
– Data from grading captured using web-based forms.
– TRs masked to all infant demographic information including birth weight and gestational age, clinical data on ROP findings from the diagnostic eye examination, and the grading results from image sets of previous visits and image sets from the fellow eye.

Methods
Figure: Screenshot of e-ROP Image Display and Web-Based Forms for Grading

Results
Adjudication for Components of RW-ROP

| Component From RC Final Grading | Images, No. | Images With Any Adjudication for Component, No. (%) |
|---|---|---|
| Total RC grading | 5520 | 3115 (56.4) |
| RW-ROP: No | 3911 | 505 (12.9) |
| RW-ROP: Yes | 1495 | 787 (52.6) |
| RW-ROP: Cannot determine | 114 | 60 (52.6) |
| RW-ROP: Total | 5520 | 1352 (24.5) |
| Plus disease: No | 5315 | 122 (2.3) |
| Plus disease: Yes | 130 | 71 (54.6) |
| Plus disease: Cannot determine | 75 | 20 (26.7) |
| Plus disease: Total | 5520 | 213 (3.9) |
| Zone I ROP: No | 5018 | 421 (8.4) |
| Zone I ROP: Yes | 402 | 228 (56.7) |
| Zone I ROP: Cannot determine | 100 | 36 (36.0) |
| Zone I ROP: Total | 5520 | 685 (12.4) |
| Stage 3 or worse ROP: No | 4067 | 469 (11.5) |
| Stage 3 or worse ROP: Yes | 1359 | 425 (31.3) |
| Stage 3 or worse ROP: Cannot determine | 94 | 38 (40.4) |
| Stage 3 or worse ROP: Total | 5520 | 932 (16.9) |

Results
TR Grading Variability of Morphology

| Variable | Intergrader variability (a): agreement, % | Intergrader variability (a): weighted κ (95% CI) | Intragrader variability (b): agreement, % | Intragrader variability (b): weighted κ (95% CI) |
|---|---|---|---|---|
| Abnormal posterior pole vessels | 80 | 0.60 (0.39-0.82) | 86 | 0.73 (0.51-0.94) |
| Total quadrants of plus or preplus disease | 65 | 0.68 (0.51-0.84) | 74 | 0.75 (0.57-0.94) |
| Total quadrants of plus | 88 | 0.50 (0.32-0.68) | 98 | 0.87 (0.69-1.00) |
| Dominant feature | 71 | 0.58 (0.42-0.75) | 69 | 0.65 (0.48-0.82) |
| Any ROP | 95 | 0.89 (0.68-1.00) | 100 | 1.00 (1.00-1.00) |
| Demarcation line | 99 | 0.74 (0.56-0.92) | 100 | 1.00 (1.00-1.00) |
| Ridge | 83 | 0.65 (0.45-0.86) | 98 | 0.95 (0.74-1.00) |
| Extraretinal fibrovascular proliferation | 83 | 0.67 (0.47-0.88) | 88 | 0.77 (0.57-0.97) |
| Flat preretinal neovascular proliferation | 100 | 1.00 (1.00-1.00) | 100 | 1.00 (1.00-1.00) |
| Retinal detachment | 100 | 1.00 (1.00-1.00) | 100 | 1.00 (1.00-1.00) |
| Highest stage | 81 | 0.85 (0.67-1.00) | 89 | 0.95 (0.77-1.00) |
| Lowest zone | 80 | 0.81 (0.63-0.99) | 93 | 0.95 (0.77-1.00) |
| Plus disease | 90 | 0.57 (0.37-0.77) | 98 | 0.87 (0.67-1.00) |
| Zone I ROP | 83 | 0.43 (0.24-0.63) | 90 | 0.70 (0.51-0.90) |
| Stage 3 or worse ROP | 83 | 0.67 (0.47-0.88) | 88 | 0.77 (0.57-0.97) |
| RW-ROP | 85 | 0.72 (0.52-0.93) | 88 | 0.77 (0.57-0.97) |

(a) Combined contemporaneous variable sample graded every 3 months (n = 80).
(b) Repeated grading of 25 image sets for temporal drift.

Results
TR Grading Variability of Image Quality

| Image Quality | Intergrader variability (a): agreement, % | Intergrader variability (a): weighted κ (95% CI) | Intragrader variability (b): agreement, % | Intragrader variability (b): weighted κ (95% CI) |
|---|---|---|---|---|
| Pupil | 94 | 0.73 (0.56-0.91) | 95 | 0.76 (0.59-0.93) |
| Disc center | 66 | 0.47 (0.30-0.65) | 85 | 0.67 (0.46-0.88) |
| Disc up | 74 | 0.56 (0.43-0.69) | 91 | 0.78 (0.63-0.93) |
| Disc down | 78 | 0.40 (0.25-0.56) | 96 | 0.81 (0.63-0.99) |
| Disc temporal | 81 | 0.66 (0.51-0.81) | 90 | 0.76 (0.61-0.91) |
| Disc nasal | 74 | 0.49 (0.34-0.64) | 89 | 0.64 (0.50-0.78) |

(a) Combined contemporaneous variable sample graded every 3 months (n = 80).
(b) Repeated grading of 25 image sets for temporal drift.

Comment
• A system to evaluate the competency of remote nonphysician TRs had not been detailed in prior ROP telemedicine studies.
• e-ROP RC:
– Developed an ROP curriculum for training and certification for nonphysician readers.
– Developed and implemented a standardized grading protocol using electronic data capture.
– Established a standard criterion of reference for RW-ROP morphology, any ROP, and preplus disease in retinal digital images through a process of integrating the grading of 3 expert readers, the RC director, and the study chair and using this for comparing TR grading during certification.
• The excellent agreement between TRs reflects the extensive and rigorous training and certification process.

Comment
• This study’s data suggest that the e-ROP system for training and certifying nonphysicians to grade ROP images under the supervision of an RC director reliably detects potentially serious ROP with good intragrader and intergrader consistency and minimal temporal drift.
• Zone I ROP
– Intragrader agreement least in identifying zone I ROP attributable largely to difficulty in accurately identifying the foveal center in the images.
– Consistent with the results from a study that reported large variability in identifying the foveal center by ROP-specialized ophthalmologists.
– The reliability of identifying the foveal center and subsequent delineation of zone I consistently in digital images could be increased by using a standard zone I template for digital images.
• Enhancing the appearance of the ROP morphology and attenuating background noise in poor-quality images by manipulating the contrast, brightness, magnification, and gray tone appear to bring more certainty to detecting ROP pathology in digital images; this will be tested in a future study.

Comment
• Plus Disease
– Identifying plus disease had an intergrader variability weighted κ of 0.57.
– International Classification of Retinopathy of Prematurity images as standards for tortuosity and dilation in identifying plus and preplus disease do not appear to adequately minimize intergrader variability.
– Identification of plus disease among ROP experts appears to be highly variable over several previous studies.
– These disagreements in identifying plus disease that persist in telemedicine ROP studies need more rigorous refinements on the definition and quantitative methods of detecting plus disease in digital images.

Comment
• Limitations of the Study
– Readers had no access to information on the gestational age, birth weight, or findings in the fellow eye, which could have improved the sensitivity and specificity in the study.
– No gold standard to assess the competency of the TR in identifying morphological features in the retinal images; the consensus opinion of a few experts in ROP, subject to error, was used as the standard for comparison for training and certifying TRs.
• To our knowledge, this is the first study that has demonstrated consistent and good agreement between and among nonphysician TRs grading ROP from digital images using a centralized reading facility.
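
Worked sketch: weighted κ with bootstrap 95% CI
The Methods describe weighted κ computed from an item-specific weighting matrix, with 25% credit for “ungradable” against all other grades and a bootstrap 95% CI. The Python sketch below illustrates one way such a calculation could look. The linear distance-based partial credit, the percentile form of the bootstrap, the function names (build_weights, weighted_kappa, bootstrap_ci), and the toy grades are all illustrative assumptions; they are not the study's actual matrices, software, or data.

```python
import random
from collections import Counter


def build_weights(ordered_grades, ungradable="ungradable", ungradable_credit=0.25):
    """Agreement credit for every pair of grades.

    Assumption: credit falls linearly with the distance between ordered grades.
    The slides only say credit depends on how close the grades are.
    """
    k = len(ordered_grades)
    weights = {}
    for i, gi in enumerate(ordered_grades):
        for j, gj in enumerate(ordered_grades):
            weights[(gi, gj)] = 1.0 if k == 1 else 1.0 - abs(i - j) / (k - 1)
    # "Ungradable" gets 25% agreement with every other grade (from the Methods).
    for g in ordered_grades:
        weights[(g, ungradable)] = ungradable_credit
        weights[(ungradable, g)] = ungradable_credit
    weights[(ungradable, ungradable)] = 1.0
    return weights


def weighted_kappa(pairs, weights):
    """Weighted kappa for a list of (grader_1, grader_2) grade pairs."""
    n = len(pairs)
    row = Counter(a for a, _ in pairs)  # marginal counts, grader 1
    col = Counter(b for _, b in pairs)  # marginal counts, grader 2
    p_obs = sum(weights[(a, b)] for a, b in pairs) / n
    p_exp = sum(weights[(a, b)] * row[a] * col[b] for a in row for b in col) / (n * n)
    if 1.0 - p_exp < 1e-12:
        # Degenerate case (both graders used one identical grade throughout).
        # Returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return (p_obs - p_exp) / (1.0 - p_exp)


def bootstrap_ci(pairs, weights, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI (assumed variant): resample pairs with replacement."""
    rng = random.Random(seed)
    stats = sorted(
        weighted_kappa(rng.choices(pairs, k=len(pairs)), weights)
        for _ in range(n_boot)
    )
    low = stats[int((alpha / 2) * n_boot)]
    high = stats[int((1 - alpha / 2) * n_boot) - 1]
    return low, high


# Hypothetical toy data: two graders, four image sets.
grades = ["no", "preplus", "plus"]
weights = build_weights(grades)
toy_pairs = [("no", "no"), ("no", "no"), ("plus", "plus"), ("no", "plus")]
kappa = weighted_kappa(toy_pairs, weights)  # 0.5 for this toy data
low, high = bootstrap_ci(toy_pairs, weights, n_boot=500)
```

In the toy data, observed weighted agreement is 0.75 and chance-expected agreement is 0.50, giving κ = (0.75 - 0.50) / (1 - 0.50) = 0.5. With samples as small as the 25 image sets used for intragrader drift, bootstrap intervals of this kind would be expected to be wide, which is consistent with the CI widths reported in the Results tables.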

Contact Information
• If you have questions, please contact the corresponding author:
– Ebenezer Daniel, MBBS, MS, MPH, PhD, Ophthalmology Reading Center, Department of Ophthalmology, University of Pennsylvania, 3535 Market St, Ste 700, Philadelphia, PA 19104 (ebdaniel@mail.med.upenn.edu).

Funding/Support
• This work was supported by cooperative agreement grant U10 EY017014 from the National Eye Institute.

Conflict of Interest Disclosures
• All authors have completed and submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest. Dr Hildebrand reported receiving consulting fees from Inoveon Corp, serving as chairman of the board of directors for Inoveon Corp, and receiving royalties for US patents 5940802 and 6470320, “Digital Disease Management System” (assignee: Board of Regents, University of Oklahoma). Dr Ells reported serving as a member of the scientific advisory board for Clarity Systems. Dr Hubbard reported receiving payment from the University of Pennsylvania as an expert grader of photographs in this work and receiving consulting fees from VisionQuest Biomedical, LLC for grading photographs outside this work. Dr Capone reported being a founding partner of FocusROP, LLC. Dr Ying reported serving as a statistical consultant for Janssen Research and Development, LLC. No other disclosures were reported.