Comparison of Monitors for Digital Diagnosis of Medical Images

1MIEW KEEN CHOONG, 1RAJASVARAN LOGESWARAN, 2ZAHARAH MUSA, 1AZIAH ALI

1 Faculty of Engineering, Multimedia University, Persiaran Multimedia, 63100 Cyberjaya, MALAYSIA
2 Diagnostic Imaging Department, Selayang Hospital, 68100 Batu Caves, MALAYSIA

Abstract: - Radiology departments have been using monitors as the display medium for several modalities for some time now, and radiologists make their diagnoses based on the images displayed on these monitors. This article studies the diagnostic differences among a 15 inch CRT monitor, a 20 inch CRT monitor, a 21 inch high-quality CRT monitor, and a 15 inch LCD monitor. A number of images from different modalities were used, and five radiologists took part in the study. Blind testing was conducted in which the radiologists diagnosed the images and stated their confidence levels. The results show no significant diagnostic difference between the monitors. Non-high-quality monitors may therefore act as a low-cost solution, especially for medical posts in remote and less affluent communities. The findings also indicate that portable notebooks are viable mobile tools for medical image diagnosis.

Key-Words: - Monitors, Medical Images, ROC analysis, Diagnostic quality, MRI, CT, X-ray

1 Introduction
Display is an important link in the imaging chain. With medical images increasingly stored digitally, monitors have become the dominant display medium, and reporting and viewing radiological images on monitors is now common practice in medicine [1]. The quality of the monitors used must therefore be good. One of the advantages of digital medical images is that the data can be sent directly from the console, via modem, to any workstation for interpretation in a telemedicine environment. When a study must be sent to a radiologist's home or office workstation for emergency interpretation, the image is viewed on a normal consumer monitor [2].
Moreover, non-high-quality monitors are commonly used by referring physicians and clinicians. It is therefore necessary to compare the normal consumer monitors with the high-quality monitors used for diagnosis; specifically, the monitor chosen must not lead to misinterpretation and misdiagnosis. Since images that score highly on objective quality scales are not necessarily ranked equally highly by human viewers, a randomized double-blind trial closely resembling actual clinical practice has to be conducted, with the results graded to allow statistical analysis [2].

2 Methodology
This study was designed as a multiobserver, reader-performance (receiver operating characteristic, ROC) [3] study in which observer performance was measured for four different types of monitors. Participating radiologists rated each image with regard to the likelihood that an abnormality was present. A total of 25 images (18 X-ray, 4 CT, and 3 MRI) were used. The images comprised acquisitions of a variety of body parts, such as the ankle joint, foot, abdomen, skull, pelvis, chest, hand, wrist, humerus, cervical spine, shoulder, brain, and lumbar spine. The images were chosen by the Head of Radiology of the collaborating medical institution such that they contained a mixture of cases with and without lesions. The X-ray radiographs were archived with a minimum 1576 x 1976 pixel resolution and a 10-bit depth; CT (Computed Tomography) images were stored at 512 x 512 pixel resolution with a 12-bit depth; and MRI (Magnetic Resonance Imaging) images were stored at 256 x 256 pixel resolution with a 12-bit depth. All the images used were stored in the DICOM (Digital Imaging and Communication in Medicine) format [4-5], the standard defined by the American College of Radiology (ACR) and the National Electrical Manufacturers Association (NEMA) as a framework for medical-imaging communication.
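As a rough illustration of the data volumes involved, the uncompressed size of each image class can be estimated from the stated resolutions and bit depths. This is a back-of-the-envelope sketch only: it assumes each pixel occupies a whole 16-bit word, which is typical for DICOM data with 10- to 12-bit depths but is not stated in the study.

```python
# Estimate uncompressed storage per image, assuming each pixel is
# stored in a 16-bit (2-byte) word -- a common DICOM allocation for
# 10- and 12-bit data (an assumption, not taken from the study).
def image_size_mb(width, height, bytes_per_pixel=2):
    return width * height * bytes_per_pixel / (1024 * 1024)

for name, (w, h) in {"X-ray": (1576, 1976),
                     "CT":    (512, 512),
                     "MRI":   (256, 256)}.items():
    print(f"{name}: {image_size_mb(w, h):.2f} MB")
```

Under these assumptions, a single X-ray radiograph is roughly an order of magnitude larger than a CT slice, which helps explain why display resolution and transmission bandwidth matter most for radiographs.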
Diagnostic experiments were conducted using different monitors, namely a conventional 15 inch CRT (Cathode Ray Tube) office computer monitor, a 20 inch CRT, and a 21 inch high-quality CRT monitor (SMM2183L, Siemens SimoMed HM 1.8 MP, high-contrast). The 21 inch CRT monitor is the default monitor used for diagnosis with the Picture Archiving and Communication System (PACS) in the Hospital Information System (HIS) at the medical institution; as such, it is treated as the 'gold standard' in this experiment. Taking into account that LCD (Liquid Crystal Display) monitors are becoming progressively more important [1], and that portable computers (e.g. laptops) are becoming more powerful and offer the added advantage of mobility for second-opinion consultation, an evaluation was also performed with a 15 inch laptop LCD monitor. Table 1 compares the monitors. The viewing software used was Siemens MagicView 1000 [6] for the SMM2183L, and Siemens MagicView 300 [6] for the 15 inch, 20 inch and LCD monitors. MagicView 1000 is the high-end reporting workstation and the default software used by the PACS for medical image display, whereas MagicView 300 is a clinician viewing workstation. The display resolutions chosen for the monitors are indicated in the 'matrix used' column of the table; resolutions and refresh rates were as recommended for standard usage of the monitors.

Table 1: The different monitors used for the experiment

  Monitor   Model          Max. matrix   Matrix used   Refresh rate (Hz)
  CRT 15"   Compaq v55     1024 x 768    1024 x 768    47.5 - 125
  CRT 20"   Compaq v1000   1800 x 1440   1024 x 768    48 - 160
  CRT 21"   SMM 2183L      1600 x 1200   1280 x 1024   50 - 120
  LCD 15"   Compaq LCD     1024 x 768    1024 x 768    60

Five radiologists were involved in the study. Blind tests were conducted in which the radiologists had to identify which images contained lesions and which did not. Interpretations of the selected images were performed individually in four separate sessions at one-week intervals.
Using the lower-quality 15 inch monitor first, the radiologists were shown the 25 images (one at a time) under room conditions similar to the real diagnostic environment. The radiologists were asked to detect any lesions and to state their confidence level (1 = not confident to 6 = very confident). They were also asked to rate the diagnostic quality of the images shown (1 = poor to 6 = good). After one week, the same radiologists re-examined the images using the 20 inch monitor, and the process was repeated a week later with the high-quality 21 inch monitor. The evaluations on the different monitors were separated by intervals of at least one week so that the radiologists based their analysis on the images presented to them rather than on memory. From the data collected, statistical tests of the possible effects of using different monitors were conducted.

3 Results
A powerful multi-reader multi-case statistical method, the receiver operating characteristic (ROC) curve [7], was used to analyse the results. The ROC curve is a plot of sensitivity versus 1-specificity, where 'sensitivity' is the proportion of abnormal cases correctly identified as such by the diagnostic test, and 'specificity' is the proportion of normal cases correctly identified as such. ROC analysis was performed on the confidence levels to measure diagnostic accuracy with the different monitors. The area under the ROC curve was analysed, as it is the average sensitivity over all possible specificities [8-10]. The areas under the ROC curves were calculated using the Analyse-it software (available at http://www.analyse-it.com/download/dl.asp), which uses a non-parametric method for constructing the curves [11] and the Hanley and McNeil method [12] for comparing them. Table 2 summarises the area under the ROC curve for abnormality detection by each radiologist (R1-R5) with the different monitors.
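The non-parametric area under the ROC curve can be computed directly from rating data without fitting a curve: it equals the Mann-Whitney statistic, i.e. the probability that a randomly chosen abnormal case receives a higher confidence rating than a randomly chosen normal case (ties counted as one half). The Hanley and McNeil standard error [9] can then support comparisons between areas [12]. The sketch below illustrates the computation; the ratings shown are invented for illustration and are not the study's data.

```python
import math

def auc_nonparametric(abnormal, normal):
    """Mann-Whitney estimate of the area under the ROC curve:
    P(rating of an abnormal case > rating of a normal case), ties = 0.5."""
    wins = sum((a > n) + 0.5 * (a == n) for a in abnormal for n in normal)
    return wins / (len(abnormal) * len(normal))

def auc_standard_error(auc, n_abnormal, n_normal):
    """Hanley-McNeil (1982) standard error of a nonparametric AUC."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc * auc / (1 + auc)
    var = (auc * (1 - auc)
           + (n_abnormal - 1) * (q1 - auc * auc)
           + (n_normal - 1) * (q2 - auc * auc)) / (n_abnormal * n_normal)
    return math.sqrt(var)

# Illustrative confidence ratings (1-6) for one hypothetical reader:
abnormal = [6, 5, 5, 4, 6, 3]   # cases with lesions
normal   = [2, 1, 3, 2, 4, 1]   # cases without lesions
a = auc_nonparametric(abnormal, normal)
se = auc_standard_error(a, len(abnormal), len(normal))
print(f"AUC = {a:.3f}, SE = {se:.3f}")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of abnormal from normal cases, which is how the per-radiologist values in Table 2 should be read.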
Table 2: Area under the ROC curve for detection of abnormalities by individual radiologists for the different monitors

       CRT 15"   CRT 20"   CRT 21"   LCD
  R1   0.781     0.943     0.917     0.833
  R2   0.877     0.921     0.895     0.912
  R3   0.991     0.789     0.842     0.991
  R4   1.000     0.746     0.816     0.886
  R5   0.978     0.978     0.842     0.930

A plot of the mean area under the ROC curve is presented in Fig. 1, where the error bars are the standard deviation of the mean over all trials. The differences in the mean area under the ROC curve are very small, which supports the observation that there were no significant differences in diagnostic accuracy between the monitors.

[Fig. 1: Mean area under the ROC curve with the different monitors]

Fig. 2 shows the mean diagnostic quality ratings of the 25 images shown on the different monitors. Overall, the images were rated most highly when the 21 inch SMM2183L was used. The SMM2183L is the radiologists' daily diagnostic monitor, which may have biased them in its favour. For the 15" CRT and LCD monitors the ratings are quite low, which may be caused by a perception among the participating radiologists that smaller monitors display lower-quality images. Based on the statistical results in Table 2 and Fig. 1, however, this perception is unfounded. Instead, the results suggest that the diagnostic quality achievable with low-cost monitors is comparable to that of high-quality monitors. As such, smaller low-cost monitors may be an economical and convenient solution for displaying medical images.

[Fig. 2: Diagnostic quality of the images shown on the different monitors, as rated by the individual radiologists]

4 Discussion
In this study, ambiguity arose because no patient history was provided and the radiologists were not told what abnormalities to search for; the diagnoses were therefore based purely on interpretation of what was presented on the monitors. The findings indicate that referring physicians and clinicians may use non-high-quality monitors in their respective clinics without undue risk. In addition, non-high-quality monitors (e.g. the v55 and v1000) may be a low-cost solution, particularly in rural areas where cost and availability are pertinent factors. LCD monitors may be higher in price than 15" CRT monitors, but they are still cheaper than high-quality monitors and may act as a low-cost mobile solution. With the price of monitors decreasing and their quality improving, it is worth keeping an eye on the market for low-cost, good-quality monitors for medical displays.

As the results show that there is generally no diagnostically significant difference between high-quality and non-high-quality monitors, attention turns to factors such as display performance and user behaviour. Display performance is measured by luminance, grey scale and contrast, resolution, veiling glare, and linearity, to name a few [13-14]; user behaviour includes characteristics such as search behaviour, viewing time, and radiological experience [14]. These factors would be needed for a more thorough evaluation of the displays used in a radiology department, should a technical analysis be desired to support the results presented here. This article provides a preliminary study targeted at subjective diagnostic evaluation by medical experts in a workplace environment. This is important, as the technical specifications of hardware are not necessarily reflected in the day-to-day operation of the human visual system and clinical experience.
Based on these results, and given the availability of good-quality LCD screens on modern notebook PCs, further work should investigate the viability of such mobile platforms for medical diagnosis. The results presented here may also be extended to monitor selection in other applications.

5 Conclusion
No obvious trend is observed in the results presented, indicating that there was no significant preference for any particular monitor in the detection of abnormalities by the radiologists (each radiologist obtained their best results on a different monitor). However, there was a trend of higher confidence with larger monitors with which the users had prior experience. As such, some amount of training and exposure may be required to build confidence when migrating to lower-quality, smaller, and less costly monitors. The results for the LCD are important as they support the use of notebook PCs for conducting diagnosis. Furthermore, the findings may prove beneficial in a more cost-effective telemedicine environment.

Acknowledgement: This paper reports a collaborative study conducted with images and medical expertise (Dr Noriza Ismail, Dr Wan Zainab Hayati Wan Mokhtar, Dr Faizatuddarain Mahmood, Dr S. Sivalingam, and Dr Shahrina Man Harun) from Selayang Hospital, Malaysia. The authors would also like to acknowledge Michel Bister (formerly of Multimedia University) for his contributions toward this project.

References:
[1] G. Partan, R. Mayrhofer, R. Brauer, and H. Hruby. LCD monitors for diagnostic imaging. Electromedica, Vol. 70, No. 2, 2002, pp. 165-168.
[2] M. Aanestad, B. Edwin, and R. Marvik. Medical image quality as a socio-technical phenomenon. Paper presented at IT in Health Care: Socio-technical Approaches, 2001.
[3] Lara and Jo van Schalkwyk. The magnificent ROC (Receiver Operating Characteristic curve). Available at http://www.anaesthetist.com/mnm/stats/roc/, 1/10/2001.
[4] American College of Radiology (ACR)/National Electrical Manufacturers Association Standards Publication for Data Compression Standards, NEMA Publication PS-2, Washington DC, 1989.
[5] Digital Imaging and Communication in Medicine (DICOM), version 3, American College of Radiology (ACR)/National Electrical Manufacturers Assoc. (NEMA) Standards Draft, Dec. 1992.
[6] Siemens Healthcare Services [PACS]. Available at http://www.siemens.co.uk/med/shs/solutions/pacs6.htm
[7] D.D. Dorfman, K.S. Berbaum, and C.E. Metz. Receiver operating characteristic rating analysis: generalization to the population of readers and patients with the jackknife method. Invest Radiol, Vol. 27, 1992, pp. 723-731.
[8] J.A. Swets. ROC analysis applied to the evaluation of medical imaging techniques. Invest Radiol, Vol. 14, 1979, pp. 109-121.
[9] J.A. Hanley and B.J. McNeil. The meaning and use of the area under a receiver operating characteristic curve. Radiology, Vol. 143, 1982, pp. 29-36.
[10] C.E. Metz. ROC methodology in radiologic imaging. Invest Radiol, Vol. 21, 1986, pp. 720-733.
[11] J.R. Beck and E.K. Schultz. The use of relative operating characteristic (ROC) curves in test performance evaluation. Arch Pathol Lab Med, Vol. 110, 1986, pp. 13-20.
[12] J.A. Hanley and B.J. McNeil. A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology, Vol. 148, 1983, pp. 839-843.
[13] M.J. Flynn, J. Kanicki, A. Badano, and W.R. Eyler. High-fidelity electronic display of digital radiographs. RadioGraphics, Vol. 19, 1999, pp. 1653-1669.
[14] E. Krupinski. Choosing the right monitor and interface for viewing medical images. Invited symposium at the 6th Internet World Congress for Biomedical Sciences, 2000.