Computer-aided Diagnosis Based on Speckle Patterns for Ultrasound Images

Woo Kyung Moon1, Chung-Ming Lo2, Chiun-Sheng Huang3, Jeon-Hor Chen4,5, and Ruey-Feng Chang2,6*

1 Department of Radiology, Seoul National University Hospital, Seoul, Korea
2 Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
3 Department of Surgery, National Taiwan University Hospital and National Taiwan University College of Medicine, Taipei, Taiwan
4 Center for Functional Onco-Imaging and Department of Radiological Science, University of California Irvine, California, USA
5 Department of Radiology, China Medical University Hospital, Taichung, Taiwan
6 Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan

* Corresponding Author:
Professor Ruey-Feng Chang
Department of Computer Science and Information Engineering
National Taiwan University
Taipei, Taiwan 10617, R.O.C.
Telephone: 886-2-33664888~331
Fax: 886-2-23628167
E-mail: rfchang@csie.ntu.edu.tw

Abstract

For breast ultrasound, the scatterer number density from backscattered echo was demonstrated in previous research to be a useful feature for tumor characterization. To take advantage of the scatterer number density in B-mode images, spatial compound imaging was used, and the statistical properties of speckle patterns were analyzed in this study for use in distinguishing between benign and malignant lesions. A total of 137 breast masses (95 benign cases and 42 malignant cases) were used in the proposed computer-aided diagnosis (CAD) system. For each mass, the average number of speckle pixels in a region of interest (ROI) was calculated to utilize the concept of scatterer number density.
In addition, the first-order and second-order statistics of the speckle pixels were quantified to obtain the distributions of the pixel values and the spatial relations among the pixels. The performance of the speckle features extracted from each ROI was compared with the performance of the segmentation features extracted from each segmented tumor. The proposed CAD system using the speckle features achieved an accuracy of 89.1% (122/137), a sensitivity of 81.0% (34/42), and a specificity of 92.6% (88/95). None of the differences in these indices between the speckle features and the segmentation features was statistically significant (p>0.05). In a receiver operating characteristic (ROC) curve analysis, the Az value of the speckle features was significantly better than the Az value of the segmentation features (0.93 vs. 0.86, p=0.0359). The performance of this approach supports the notion that the speckle patterns induced by the scatterers in tissues can provide information for classifying tumors. The proposed speckle features, which were extracted readily from a drawn ROI without any preprocessing, also provide a more efficient classification approach than tumor segmentation.

Keywords: Speckle, Breast cancer, Ultrasound, Spatial compound imaging, Computer-assisted diagnosis

Introduction

Breast ultrasound (US), including spatial compound imaging (SCI), is being explored for distinguishing between benign and malignant lesions (Stavros et al. 1995; Cha et al. 2005; Cha et al. 2007). The sonographic appearances of breast tumors are constructed by means of acoustic transmission and are interpreted by radiologists upon clinical examination for a diagnosis. To standardize the terminology used for describing tumors, the BI-RADS lexicon was proposed by the American College of Radiology (American College of Radiology 2003).
According to the descriptors defined in the BI-RADS descriptive categories, the dominant sonographic findings of a tumor are classified and analyzed by radiologists to evaluate the likelihood of malignancy. The descriptive categories include shape, orientation, margin, lesion boundary, echo pattern, and posterior acoustic features. By quantifying the descriptors used clinically, various computer-aided diagnosis (CAD) systems (Rangayyan et al. 2000; Shen et al. 2007a; Shen et al. 2007b; Nie et al. 2008) were developed to provide an efficient procedure to diagnose breast tumors automatically. After tumor segmentation, the tumor characteristics were extracted and quantified to distinguish between benign and malignant lesions. The quantitative features utilized in these CAD systems can be classified as morphology or texture features. Morphology features are used to describe the tumors' shape characteristics, and texture features are used to show the echogenicity properties through the correlations among pixels.

Furthermore, the speckle patterns from backscattered echo were analyzed to provide information for tissue characterization (Tuthill et al. 1988). Speckle is generated by the constructive and destructive interference of US waves backscattered from tissue scatterers, which are tissue structures equal to or smaller than the US wavelength. In the backscattered echoes, the effects of the scatterer number density are visible in the statistical properties when the number of scatterers per resolution cell is small (i.e., < 10). The amplitude histogram of the signal with a sparse scatterer number density has a pre-Rayleigh distribution. With an increasing number of scatterers, the speckle is fully developed, and the corresponding histogram approaches a Rayleigh distribution.
Based on these results, the backscattered echo was further modeled by the Nakagami parameter to distinguish between benign and malignant lesions (Shankar et al. 2003; Chang et al. 2010). The Nakagami parameter effectively quantified the statistics of the backscattered echo from different scattering conditions, including pre-Rayleigh, Rayleigh, and post-Rayleigh. In the previous studies, the speckle patterns from the backscattered echo signal were demonstrated to be useful in classifying tumors. However, the signal data are not easy to acquire and are less familiar to radiologists, who generally use B-mode images to evaluate tumor characteristics.

In this study, we analyzed the speckle patterns in B-mode images obtained with SCI to develop more useful features for tumor classification. For speckle extraction, a tumor and its surrounding tissues were included in a region of interest (ROI) that was cropped from a B-mode image. Next, the statistical properties of the speckle pixels extracted from an ROI were quantified as features for distinguishing between benign and malignant lesions. Moreover, the diagnostic performance of the speckle features was compared with the performance of the morphology and texture features obtained by tumor segmentation.

Materials and Methods

US acquisition

The breast US images collected in this study were acquired using an ATL HDI 5000 scanner (Philips, Bothell, WA). The applied L12-5 probe was a 192-element linear array transducer with variable frequency ranging from 5 to 12 MHz and a scan width of 38 mm. We obtained image data sets with and without SCI (SonoCT™, Philips). The SCI method combined nine frames produced from different transmit angles to reduce speckle and shadowing.
In our experiment, conventional US images were first obtained without SCI mode, and SCI images were subsequently obtained in an identical plane without changing depth, focus position, or gain settings. The gray-level value of the image pixels ranges from 0 to 255.

A total of 137 masses were examined based on the B-mode images produced by the scanner from April 2002 to May 2003. Two breast radiologists with 5 and 15 years of experience, respectively, classified the lesions into BI-RADS assessment categories according to the observation of the B-mode images before biopsy. There were 69 lesions (50%) in BI-RADS 3 (probably benign), 53 lesions (39%) in BI-RADS 4 (suspicious abnormality), and 15 lesions (11%) in BI-RADS 5 (highly suggestive of malignancy).

All lesions were pathologically proven by core needle biopsy or fine-needle aspiration cytology and comprised 95 benign lesions (69%) and 42 malignant lesions (31%). The lesion sizes ranged from 0.4 to 3.0 cm (mean: 1.3 ± 0.6 cm). Among the benign lesions, there were 74 cases of fibroadenoma and 21 cases of fibrocystic changes, with sizes of 0.4-2.5 cm (mean: 1.2 cm). Among the malignant lesions, there were 40 cases of invasive ductal carcinoma, 1 case of invasive tubular carcinoma, and 1 case of invasive papillary carcinoma, with sizes of 0.6-3.0 cm (mean: 1.6 cm). The patients with benign lesions had a mean age of 42 (range 22–64), and the patients with malignant lesions had a mean age of 49 (range 34–77). For this study, approval was obtained from our institutional review board, and informed consent was waived.

ROI

A subimage of a B-mode image was cropped as an ROI. To specify a tumor region, a rectangular bounding box was used to enclose the tumor. In principle, the tumor was in the center of the ROI, leaving a distance of 5–10 pixels between the tumor boundary and the bounding box.
Depending on the tumor size, the smallest ROI was 55 × 42 pixels (0.57 cm × 0.44 cm), and the largest one was 290 × 160 pixels (3.02 cm × 1.67 cm). The ROI selection is illustrated in Fig. 1 (a).

Speckle features

A US B-mode image is composed of the backscattered echoes reflected from tissues. As the US pulses are transmitted, tissues respond to the pulses in various ways, such as absorption, reflection, and scattering, according to their physical properties. The scatterers with microstructure contained in tissues, such as tissue parenchyma, scatter the US pulses and produce an interference pattern called speckle. In this study, the speckle pixels in B-mode images were extracted to generate useful features for tissue characterization.

The speckle patterns have granular appearances with small differences in gray level. For fully developed speckle, the number of scatterers is considerable. The intensity image should have an exponential distribution and a constant ratio of the mean to the standard deviation (SD) of 1.0 (Tuthill et al. 1998). According to this speckle property, the B-mode images were first log decompressed to obtain the raw intensity images (Smith and Fenster 2000). The decompression procedure was performed to extract the speckle pixels more precisely. The pixel value of the acquired image data was log compressed to a 0-255 log scale to reduce the dynamic range. To obtain the raw intensity value, the log decompression converted the 0-255 log scale back to a linear scale. The gray value of the pixels was converted by the equation

$$I(x, y) = 10^{G(x, y)/G_0} \tag{1}$$

where G(x,y) is the pixel value in B-mode images and G0, which is a linearization factor related to the frequency of the transducer (Smith and Fenster 2000), converts G(x,y) to a linear scale. The G0 value was used to determine the range of the intensity values.
After the conversion, I(x,y) is obtained as the acoustic intensity. Next, a moving 5×5 window is used to find regions of the raw intensity image in which the ratio of the mean intensity to the SD lies between 0.8 and 1.2. If a region satisfies the condition, the center pixel of the region is defined as speckle.

After the extraction of speckle pixels, the statistical properties of the speckle pixels were converted into parameters for tissue characterization. A previous study (Tuthill et al. 1988) showed that the scatterer number density from backscattered echo is useful for tumor characterization. For B-mode images, the average number of speckle pixels in an ROI is calculated as

$$\mathrm{S\_avgnum} = \frac{\mathrm{S\_num}}{\mathrm{ROI\_num}} \tag{2}$$

where S_num is the total number of speckle pixels in an ROI, and ROI_num is the number of all pixels in the ROI. An illustration is presented in Fig. 1. The ROIs of a benign tumor and a malignant tumor are shown in Fig. 1 (b) and (d), respectively. The speckle pixels extracted from the ROIs are shown in white (a pixel value of 255 on the 8-bit gray-level image) in Fig. 1 (c) and (e) for visualization. Clearly, the density of speckle pixels in the ROI of the benign tumor (Fig. 1 (c)) is higher than that of the malignant tumor (Fig. 1 (e)).

In addition to S_avgnum, the first-order statistics and second-order statistics of the extracted speckle pixels in the ROI are utilized. Among the first-order statistics, the mean and SD of the qualified mean/SD values (0.8–1.2) in a moving window are defined as

$$\mathrm{S\_mean} = \frac{\sum_{P \in S} mSD(P)}{\mathrm{S\_num}} \tag{3}$$

$$\mathrm{S\_SD} = \sqrt{\frac{\sum_{P \in S} \left(mSD(P) - \mathrm{S\_mean}\right)^2}{\mathrm{S\_num}}} \tag{4}$$

where S is the set of extracted speckle pixels and mSD(P) is the mean/SD value of a speckle pixel P. Also, the mean and SD of the extracted speckle pixels in the 256 gray-level scale are defined as

$$\mathrm{S\_gmean} = \frac{\sum_{P \in S} G(P)}{\mathrm{S\_num}} \tag{5}$$

$$\mathrm{S\_gSD} = \sqrt{\frac{\sum_{P \in S} \left(G(P) - \mathrm{S\_gmean}\right)^2}{\mathrm{S\_num}}} \tag{6}$$

where G(P) is the gray-level value of a speckle pixel.
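The speckle extraction and the first-order features of Eqs. (1)–(6) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the value of `g0` is hypothetical (the text gives no numeric G0), border pixels without a full window are skipped, the window bounds 0.8 and 1.2 are treated as inclusive, and SDs are computed by dividing by N. The function names are illustrative only.

```python
import math

def log_decompress(gray, g0):
    # Eq. (1): I(x, y) = 10 ** (G(x, y) / G0). g0 is scanner dependent;
    # any numeric value passed here is an assumption.
    return [[10.0 ** (g / g0) for g in row] for row in gray]

def mean_sd(values):
    # Mean and SD with division by N (an assumption, matching Eqs. (3)-(6)).
    n = len(values)
    m = sum(values) / n
    return m, math.sqrt(sum((v - m) ** 2 for v in values) / n)

def speckle_ratios(intensity, win=5, lo=0.8, hi=1.2):
    # Map (y, x) -> mean/SD for each window centre whose ratio is in [lo, hi].
    # Pixels closer than win // 2 to the ROI border are skipped (assumption).
    h, w, r = len(intensity), len(intensity[0]), win // 2
    found = {}
    for y in range(r, h - r):
        for x in range(r, w - r):
            vals = [intensity[j][i]
                    for j in range(y - r, y + r + 1)
                    for i in range(x - r, x + r + 1)]
            m, sd = mean_sd(vals)
            if sd > 0 and lo <= m / sd <= hi:
                found[(y, x)] = m / sd
    return found

def speckle_features(gray, g0, win=5):
    # First-order speckle features of one ROI given as rows of 0-255 values.
    ratios = speckle_ratios(log_decompress(gray, g0), win)
    roi_num = len(gray) * len(gray[0])
    feats = {"S_avgnum": len(ratios) / roi_num}          # Eq. (2)
    if ratios:  # the remaining features are undefined without speckle pixels
        feats["S_mean"], feats["S_SD"] = mean_sd(list(ratios.values()))
        feats["S_gmean"], feats["S_gSD"] = mean_sd(
            [gray[y][x] for y, x in ratios])
    return feats

# Toy 5x5 checkerboard ROI: only the centre pixel has a full 5x5 window.
roi = [[255 if (x + y) % 2 else 0 for x in range(5)] for y in range(5)]
feats = speckle_features(roi, g0=100.0)  # g0=100.0 is a hypothetical value
```

In the toy ROI the single full window has a mean/SD ratio close to 1, so the centre pixel is marked as speckle and S_avgnum equals 1/25. A uniform ROI yields no speckle pixels because the SD of every window is zero.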
With regard to the spatial relations between speckle pixels, textures, as the second-order statistics, are assumed to describe the regional information. In this study, the gray-level co-occurrence matrices (GLCM) (Haralick et al. 1973) are employed to provide texture features. The 21 speckle features, including 16 GLCM texture features, that were used in a classifier to predict the likelihood of malignancy of the tumors are listed in Table 1.

Segmentation features

In this study, the speckle features obtained from an ROI were compared with the features extracted from a segmented tumor. To segment a tumor, the level set method was implemented (Huang et al. 2011). The quantitative features obtained from segmentation can be classified into two categories: morphology features and texture features (Rangayyan et al. 2000; Shen et al. 2007a; Nie et al. 2008; Chang et al. 2011). Morphology features were proposed to describe the tumor shape, and texture features were proposed to represent the tumor echogenicity.

Morphology features

After segmentation, the tumor contour is delineated by the level set method. Next, the geometric characteristics of the tumor contour can be described by quantitative morphology features. In past research (Rangayyan et al. 2000; Shen et al. 2007a; Nie et al. 2008; Chang et al. 2011), a variety of morphology features have been suggested to estimate tumor shape, orientation, and margins. Instead of measuring tumors directly, the best-fit ellipse was utilized for approximating the size and position of a tumor (Shen et al. 2007b). Analyzing the properties of the best-fit ellipse gave a close measurement of the tumor characteristics. For example, the angle of the major axis of the ellipse was used to calculate the tumor orientation. Comparing the ellipse perimeter and the tumor boundary helped to determine the smoothness of the tumor.
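As an illustration of how ellipse-based features such as the axis lengths, the axis ratio, and the orientation might be derived from a segmented tumor, the sketch below fits an ellipse to a binary mask using second-order central moments. The moment-based fit is an assumption chosen for brevity; the text does not state which fitting procedure the cited work uses. The ellipse perimeter uses Ramanujan's approximation, and the function name and returned keys are hypothetical.

```python
import math

def best_fit_ellipse(mask):
    # mask: rows of 0/1 values, 1 marking tumor pixels (hypothetical input).
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    n = len(pts)
    xc = sum(x for x, _ in pts) / n
    yc = sum(y for _, y in pts) / n
    # Second-order central moments of the tumor region.
    mu20 = sum((x - xc) ** 2 for x, _ in pts) / n
    mu02 = sum((y - yc) ** 2 for _, y in pts) / n
    mu11 = sum((x - xc) * (y - yc) for x, y in pts) / n
    spread = math.sqrt(4 * mu11 ** 2 + (mu20 - mu02) ** 2)
    # Full axis lengths of the ellipse with the same second moments.
    major = 4 * math.sqrt((mu20 + mu02 + spread) / 2)
    minor = 4 * math.sqrt(max(0.0, (mu20 + mu02 - spread) / 2))
    # Angle of the major axis in image coordinates (y grows downward).
    theta = 0.5 * math.atan2(2 * mu11, mu20 - mu02)
    # Ramanujan's approximation of the ellipse perimeter from the semi-axes.
    a, b = major / 2, minor / 2
    perimeter = math.pi * (3 * (a + b) - math.sqrt((3 * a + b) * (a + 3 * b)))
    return {"major": major, "minor": minor, "theta": theta,
            "perimeter": perimeter}

# Toy mask: a 9 x 3 pixel horizontal bar, so the major axis lies along x.
bar = [[1] * 9 for _ in range(3)]
ellipse = best_fit_ellipse(bar)
```

A ratio such as Ep/Tp would additionally need the tumor perimeter measured along the segmented contour, which is not computed here.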
Additionally, the inherent properties of tumors were utilized to develop other morphology features. Rangayyan et al. (2000) used the tumor perimeter and area to estimate the compactness of a tumor. Nie et al. (2008) described the roundness of a tumor by the normalized radial length (NRL), which was defined as the Euclidean distance between the tumor center and the pixels on the tumor boundary normalized by the maximum distance.

Texture features

Texture features were used to describe the characteristics of the specific pattern in a region. After segmentation, the tumor region is defined as the pixels surrounded by the tumor contour. Therefore, the analyses of the pixel values inside the tumor region were used as the texture features. In B-mode images, different tissues inside the tumor reflect different echogenicity patterns that result in various distributions of gray-level values. GLCM (Haralick et al. 1973), which calculates the spatial correlations among pixels, was used in this study to provide the texture information inside the tumors. Furthermore, the average intensity of the tumor was compared with that of surrounding tissues and posterior shadowing to describe the tumor characteristics (Shen et al. 2007b). The 38 quantitative features mentioned above, which were collected for predicting the likelihood of tumor malignancy in a classifier, are listed in Table 2.

Statistical analysis

The quantitative features mentioned above were implemented in our experiment to distinguish between benign and malignant lesions. For this purpose, the speckle features were tested for discriminability. Different test methods were employed according to the distribution of the feature values. First, the normality of each distribution was assessed using the Kolmogorov-Smirnov test (Field 2009).
If the distribution of a feature was normal, then Student's t-test (Field 2009) was applied. Nonnormal distributions were analyzed using the Mann-Whitney U test (Field 2009). The test result was quantified using p values. A p value of less than 0.05 was considered to be statistically significant. For classifying masses, only the significant features were used in the classifier.

The classifier used in this study was the binary logistic regression model (Hosmer 2000). To classify the tumors as malignant or benign, the significant features were gathered in the classifier. Next, backward elimination was applied in a stepwise procedure to select features from the significant features. The subset of features that achieved the lowest error rate in the trained classifier was selected as the most relevant for classifying tumors. The performance of the classifier was examined by the leave-one-out cross-validation method (Alpaydin 2004). With K cases involved in the validation, the classifier was trained K times. In every instance, one case was left out of the K cases and was used to test the classifier trained using the remaining K-1 cases.

According to the biopsy-proven pathology, the classification result was evaluated based on five performance indices: accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). Using the chi-square test, the performance indices of the speckle features, which were extracted from ROIs, were compared with those of the segmentation features. Moreover, the trade-offs between sensitivity and specificity achieved by the two feature sets were compared using the receiver operating characteristic (ROC) curve. The Az value, which represents the normalized area under the curve, was measured for comparison via the z-test. ROCKIT software (C.
Metz, University of Chicago, Chicago, IL, USA) was used to analyze the ROC curve, and all other test methods were performed with SPSS software (version 16 for Windows; SPSS, Chicago, IL, USA).

Results

For detecting speckle pixels, the performances of different window sizes were calculated and are listed in Table 3. In most cases, no speckle pixels were detected when using window sizes bigger than 9 × 9. Therefore, the performances of 3 × 3, 5 × 5, and 7 × 7 were compared. Although the differences were not significant, the result achieved by 5 × 5 was better than those achieved by 3 × 3 and 7 × 7. We adopted 5 × 5 as the window size for the following calculations.

To evaluate the effect of speckle-reducing techniques, the speckle features extracted from conventional US and SCI were compared. Figure 2 (a) shows an ROI of a B-mode image generated by conventional US, and Fig. 2 (c) shows the identical case generated by SCI. The extracted speckle pixels of the ROIs are depicted in white in Fig. 2 (b) and (d). The illustration reveals that SCI slightly reduced the speckle in the B-mode images.

By Student's t-test or the Mann-Whitney U test, the speckle features and the segmentation features were evaluated to determine whether they were significant in distinguishing between benign and malignant lesions. The significant speckle features and the significant segmentation features are shown in Table 4 and Table 5, respectively. Next, the significant features were used in the classifier to predict whether the tumors were benign or malignant. In Table 6, the performances achieved by different feature sets are listed. First, none of the differences in the performance indices between the speckle features extracted from conventional US and SCI was significant (p>0.05). Note that the significant speckle features listed in Table 4 were evaluated based on SCI.
For conventional US, there was one fewer significant speckle feature than for SCI (Inertia ave.). In further comparisons, the speckle features extracted from ROIs with SCI were used.

Using the chi-square test, the differences in the five performance indices between the speckle features and the segmentation features were not significant (p>0.05). A significant difference was observed in the comparison of Az using the z-test (p=0.0359). The performance of the combined feature set including the speckle features and the segmentation features was also calculated. This comparison showed that the Az of the combined feature set was slightly better than the Az of the speckle features and was significantly better than the Az of the segmentation features (p=0.0219). To illustrate, the ROC curves of the speckle features extracted from conventional US and SCI are shown in Fig. 3 (a). In Fig. 3 (b), the ROC curves of the speckle features with SCI, the segmentation features, and the combined feature set are presented.

Two cases are illustrated to show the performance of the speckle features. The malignant tumor in Fig. 4, which has an S_avgnum value of 0.13, was classified correctly by the speckle features and was misclassified by the segmentation features. An example of a benign tumor is shown in Fig. 5. This tumor, with an S_avgnum value of 0.17, was classified correctly by the speckle features and was misclassified by the segmentation features.

In order to refine the features, the cases misdiagnosed by the speckle features were analyzed in terms of margin, echogenicity, calcification, and shadowing. Echogenicity was related to the misclassification. First, if the surrounding tissues of a tumor included in an ROI were hypoechogenic, the image composition may affect the classification result.
Second, when a malignant tumor was isoechoic with its surrounding tissues, the tumor was classified as benign. Tumor margin, calcification, and shadowing did not affect the performance.

Discussion

Various US CAD systems were developed based on the features used by radiologists on clinical examination. The features can be observed by human vision and thus are used to describe the appearances of tumors, such as their shape, orientation, and margin. In these CAD systems, the feature extraction and quantification procedure, which affects the final performance of the classification, is highly dependent on the segmentation result. However, the computation required in a segmentation procedure is considerable, and a well-segmented result is indispensable.

With respect to the signal features suggested in other research studies, the analysis of the scatterer number density from the backscattered echo is also useful in classifying tumors (Shankar et al. 2003; Chang et al. 2010). Nevertheless, general US scanners are not designed to obtain the signal data conveniently. In this study, the speckle pixels in common B-mode images were extracted from ROIs to utilize the scatterer properties. The first-order and second-order statistics of the extracted speckle pixels were quantified into the features for tumor classification and were compared with the features from segmentation. To the best of our knowledge, this study represents the first attempt to use the speckle density and the first-order statistics of speckle pixels in common B-mode images for breast tumor classification.

Furthermore, GLCM texture features (Haralick et al. 1973) were utilized in this study. Conventionally, texture features were used to characterize the correlations among pixels inside the tumor.
In our CAD system, only the speckle pixels detected in an ROI were considered. The diagnostic accuracy of the proposed speckle features was better than the accuracy of the Nakagami parameter reported in the literature (89% vs. 82%) (Chang et al. 2010). Considering the trade-offs between sensitivity and specificity, the proposed speckle features were also better in the comparison of Az (0.93 vs. 0.81). The proposed speckle features were based on B-mode images, which are the basic function of widely available US scanners. Because of this convenience, the proposed CAD system could be used widely with common US scanners. We obtained SCI and conventional images with and without the use of the scanner's SonoCT feature.

In our study, the speckle features extracted from B-mode images obtained by using both conventional US and SCI achieved good performance (Az=0.89 and Az=0.93, respectively), thereby suggesting that the speckle features can be used in these two settings of the US scanner. For comparison with the segmentation features, the speckle features obtained with SCI were used. The reason for this choice is that a relatively large amount of noise appeared in the conventional US images and caused the failure of tumor segmentation. In comparison, the performance of the five indices achieved by the speckle features obtained with SCI is close to the performance achieved by the segmentation features. Considering the trade-offs between sensitivity and specificity, the Az of the speckle features is significantly better than the Az of the segmentation features (0.93 vs. 0.86, p=0.0359). By combining the speckle features and the segmentation features, the combined feature set also performed significantly better than the segmentation features (p=0.0219). The result indicates that the proposed speckle features provide equal or better diagnostic information than the conventional segmentation features.
In estimating the likelihood of malignancy, the combined feature set can determine the classification result with greater accuracy. That is, incorporating the morphological properties of a tumor with the underlying scatterer characteristics improves the reliability of the CAD system.

For clinical use, the proposed CAD system, which employs the speckle features to classify tumors, is expected to be a more efficient procedure than tumor segmentation. In our experiment, calculating the speckle features of all 137 cases took 121 seconds using a computer with an Intel® Core™2 Quad CPU Q9400 at 2.66 GHz and 3.25 GB memory (Intel, Santa Clara, CA, USA). After acquiring the ROIs, the average processing time for the proposed CAD system to classify each case was 0.88 seconds. In clinical examination, the speckle density can immediately be displayed visually in white on the screen. The suggested classification result can be shown almost in real time while the radiologists specify the tumor area. Generally, the routine procedure for radiologists to mark the tumor size on a B-mode image is sufficient to specify the tumor area. The extra time for the processing of the underlying CAD system would not be noticeable.

The limitation of this study is that the B-mode images were acquired from one US scanner (ATL HDI 5000). In our experiment, we have shown the difference between conventional US and SCI for the proposed speckle features. The result demonstrated that the performances achieved by these two techniques were very close. The B-mode images used in the experiment were acquired by the radiologist with 15 years of experience. For each patient, the adaptive configuration of the US scanner was customized for generating each US image. Similarly, the proposed CAD system can be adjusted for individual US systems.
By training on up-to-date data, the classifier generates customized parameters for optimizing performance. So far, the available image data collected from the clinical examinations were obtained from this one US scanner. Applying the proposed CAD system to different US platforms will be investigated in the future. Another method for showing the advantage of the proposed CAD system would be to collect various nonmass types of lesions as the specimens for diagnosis. Nonmass lesions are abnormal tissues without distinct boundaries. Because of the indistinct boundary between the lesion and its background tissues, it is not practical to utilize segmentation for extracting features. Lesions that are close to a nipple or that have posterior shadowing are the specimens that we will use to evaluate the proposed CAD system.

Acknowledgments

The authors would like to thank the National Science Council of the Republic of China and National Taiwan University for financially supporting this research under Contract No. NSC 98-2221-E-002-172-MY3 and 10R80919-6. This study was also supported by a grant from the Innovative Research Institute for Cell Therapy, Republic of Korea (A062260).

References

Alpaydin E. Introduction to machine learning. Cambridge, Mass: MIT Press, 2004.
American College of Radiology. Breast Imaging Reporting and Data System, 4th ed. American College of Radiology, 2003.
Cha JH, Moon WK, Cho N, Chung SY, Park SH, Park JM, Han BK, Choe YH, Cho G, Im JG. Differentiation of benign from malignant solid breast masses: Conventional US versus spatial compound imaging. Radiology 2005;237:841-6.
Cha JH, Moon WK, Cho N, Kim SM, Park SH, Han BK, Choe YH, Park JM, Im JG. Characterization of benign and malignant solid breast masses: Comparison of conventional US and tissue harmonic imaging. Radiology 2007;242:63-9.
Chang CC, Tsui PH, Yeh CK, Liao YY, Kuo WH, Chang KJ, Chen CN. Ultrasonic Nakagami Imaging: A Strategy to Visualize the Scatterer Properties of Benign and Malignant Breast Tumors. Ultrasound Med Biol 2010;36:209-17.
Chang RF, Moon WK, Shen YW, Huang CS, Chiang LR. Computer-Aided Diagnosis for the Classification of Breast Masses in Automated Whole Breast Ultrasound Images. Ultrasound Med Biol 2011;37:539-48.
Field AP. Discovering statistics using SPSS, 3rd ed. Los Angeles: SAGE Publications, 2009.
Haralick RM, Shanmugam K, Dinstein I. Textural Features for Image Classification. IEEE Trans Syst Man Cybern 1973;SMC-3:610-21.
Hosmer DW. Applied logistic regression. 2nd edition. New York: Wiley, 2000.
Huang CS, Moon WK, Chang SC, Chang RF. Breast Tumor Classification Using Fuzzy Clustering for Breast Elastography. Ultrasound Med Biol 2011;37:700-8.
Nie K, Chen JH, Yu HJ, Chu Y, Nalcioglu O, Su MY. Quantitative Analysis of Lesion Morphology and Texture Features for Diagnostic Prediction in Breast MRI. Acad Radiol 2008;15:1513-25.
Rangayyan RM, Mudigonda NR, Desautels JEL. Boundary modelling and shape analysis methods for classification of mammographic masses. Med Biol Eng Comput 2000;38:487-96.
Shankar PM, Dumane VA, George T, Piccoli CW, Reid JM, Forsberg F, Goldberg BB. Classification of breast masses in ultrasonic B scans using Nakagami and K distributions. Phys Med Biol 2003;48:2229-40.
Shen WC, Chang RF, Moon WK. Computer aided classification system for breast ultrasound based on breast imaging reporting and data system (BI-RADS). Ultrasound Med Biol 2007a;33:1688-98.
Shen WC, Chang RF, Moon WK, Chou YH, Huang CS. Breast ultrasound computer-aided diagnosis using BI-RADS features. Acad Radiol 2007b;14:928-39.
Smith WL, Fenster A. Optimum scan spacing for three-dimensional ultrasound by speckle statistics.
Ultrasound Med Biol 2000;26:551-62. 453 Stavros AT, Thickman D, Rapp CL, Dennis MA, Parker SH, Sisney GA. Solid breast 454 nodules: Use of sonography to distinguish between benign and malignant lesions. 455 Radiology 1995;196:123-34. 456 Tuthill TA, Krucker JF, Fowlkes JB, Carson PL. Automated three-dimensional US 457 frame positioning computed from elevational speckle decorrelation. Radiology 458 1998;209:575-82. 459 460 Tuthill TA, Sperry RH, Parker KJ. Deviations from Rayleigh Statistics in Ultrasonic Speckle. Ultrason Imaging 1988;10:81-9. 461 22 462 463 Figure Captions 464 465 Fig. 1 Illustration of the ROI selection and speckle extraction. (a) A tumor is selected 466 by drawing an ROI from the B-mode image. (b) The ROI of a benign tumor. (c) 467 The extracted speckle pixels inside the ROI of (b) are shown by a white 468 appearance. (d) The ROI of a malignant tumor. (e) The extracted speckle 469 pixels inside the ROI of (d) are indicated by a white appearance. 470 Fig. 2 The comparison between conventional US and SCI for speckle pixels. (a) The 471 ROI of a B-mode image with conventional US. (b) The extracted speckle 472 pixels inside the ROI of (a) are shown by a white appearance. (c) The ROI of a 473 B-mode image with SCI. (d) The extracted speckle pixels inside the ROI of (c) 474 are indicated by a white appearance. 475 Fig. 3 The ROC curves of feature sets. (a) The ROC curves of the speckle features 476 with conventional US and with SCI. (b) The ROC curves of the speckle 477 features with SCI, segmentation features, and the combined feature set. Note 478 that the combined feature set includes the speckle features with SCI and the 479 segmentation features. 480 Fig. 4 A malignant tumor that was classified correctly by the speckle features but was 23 481 misclassified by the segmentation features. (a) The ROI of the tumor. (b) The 482 S_avgnum of the tumor is relatively small (0.13). (c) The segmentation result 483 of the tumor. 484 Fig. 
5 A benign tumor that was classified correctly by the speckle features but was 485 misclassified by the segmentation features. (a) The ROI of the tumor. (b) The 486 S_avgnum of the tumor is relatively large (0.17). (c) The segmentation result 487 of the tumor. 488 24 489 490 Table 1 21 speckle features. Category Feature Description First order statistics S_avgnum The average number speckle pixels in a ROI S_mean, S_SD The mean and SD of the qualified mean/SD in a moving window S_gmean, S_gSD The mean and SD of speckle of pixels in 256 gray-level Second order statistics Energy ave., Energy std., 16 GLCM texture features Entropy ave., Entropy std., Correlation ave., Correlation std., Inverse Difference Moment ave., Inverse Difference Moment std., Inertia ave., Inertia std., Cluster Shade ave. Cluster Shade std., (Haralick et al. 1973) Cluster Prominence ave., Cluster Prominence std., Haralick Correlation ave., Haralick Correlation std. 491 492 Table 2 38 segmentation features. Category Feature Description Morphology Tumor_a, Tumor_p Tumor area and perimeter Ellipse_a, The length of the major axis of the best-fit ellipse Ellipse_b, The length of the minor axis Ellipse_a/b, Ellipse_a / Ellipse_b Ep/Tp, The ratio of the ellipse perimeter and the tumor perimeter 25 Texture Ellipse_compactness, The overlap between the ellipse and the tumor Ellipse_theta The angle of the major axis of the ellipse (Shen et al. 2007a) NRL entropy, NRL variance NRL features(Nie et al. 2008) Compactness Tumor roundness(Rangayyan et al. 2000) Undulation, Sharp, MU Features about undulations on the tumor boundary(Shen et al. 2007b) NS The number of spicules on the tumor boundary MNS NS×Compactness MaxSpicule The length of the longest spicule of NS LB The average intensity difference between the inner and outer bands around the tumor boundary (Shen et al. 2007b) PS The average intensity difference between the tumor and the region under the tumor (Shen et al. 
2007b) PS_diff The average intensity difference between the surrounding tissues and the region under the tumor EPc The average intensity difference between the 25% brighter pixels and whole tumor pixels (Shen et al. 2007b) EP_diff The average intensity difference between the tumor and the surrounding tissues 16 GLCM texture features (Haralick et al. 1973) Energy ave., Energy std., Entropy ave., Entropy std., Correlation ave., Correlation std., Inverse Difference Moment ave., Inverse Difference Moment std., Inertia ave., Inertia std., Cluster Shade ave. Cluster Shade std., Cluster Prominence ave., Cluster Prominence std., Haralick Correlation ave., 26 Haralick Correlation std. 493 494 Table 3 The comparison of different window sizes for detecting speckle pixels. Accuracy Sensitivity Specificity PPV NPV Az 495 3×3 5×5 7×7 ≥9×9 83.9% (115/137) 73.8% (31/42) 88.4% (84/95) 73.8% (31/42) 88.4% (84/95) 0.89 89.1% (122/137) 81.0% (34/42) 92.6% (88/95) 82.9% (34/41) 91.7% (88/96) 0.93 84.7% (116/137) 76.2% (32/42) 88.4%. (84/95) 74.4% (32/43) 89.4% (84/94) 0.89 N/A 5×5 VS. 3×3 (p-value) 0.2160 5×5 VS. 7×7 (p-value) 0.2833 N/A 0.4340 0.5949 N/A 0.3217 0.3217 N/A 0.3136 0.3421 N/A 0.4537 0.5875 N/A 0.0657 0.0762 N/A: not available. 496 497 Table 4 The mean, standard deviation (SD), median, and p-value (Student’s t-test or 498 Features S_avgnum S_mean S_SD S_gmean S_gSD Mann-Whitney U-test) of significant speckle features with SCI. Benign Mean±SD 0.17±0.05 1.01±0.01 Median Malignant Mean±SD 0.10±0.05 1.03±0.01 0.114 64.07±17.82 25.52±5.57 Median 0.110 45.83±18.48 17.66±3.94 27 p-value <0.001* <0.001* <0.001* <0.001* <0.001* Energy ave. Energy std. Entropy ave. Correlation ave. Correlation std. Inverse Difference Moment ave. Inertia ave. Inertia std. Cluster Shade ave. Cluster Shade std. Cluster Prominence ave. Cluster Prominence std. Haralick Correlation ave. Haralick Correlation std. 
499 0.02 0.006 5.76±0.38 0.04 0.009 5.16±0.44 0.08 0.003 0.55±0.01 1.82±0.19 0.69±0.1 0.17 0.013 0.58±0.01 1.57±0.17 0.57±0.1 121.95 13.14 4399.12 333.57 33598.55 1164.68 40.23 4.02 909.71 79.64 9968.95 418.58 <0.001* 0.04* <0.001* <0.001* <0.001* <0.001* <0.001* <0.001* 0.002* <0.001* <0.001* <0.001* <0.001* <0.001* * p-value<0.05 indicates a statistically significant difference. 500 501 Table 5 The mean, standard deviation (SD), median, and p-value (Student’s t-test or 502 Mann-Whitney U-test) of significant segmentation features. Features Tumor_a Tumor_p Ellipse_a Ellipse_b Ellipse_a/b Ep/Tp Ellipse_theta NRL entropy NRL variance Undulation Sharp MU MNS LB EPc EP_diff Energy ave. Entropy std. Correlation std. Inverse Difference Moment ave. Inverse Difference Moment std. Inertia ave. Benign Mean±SD Median 4676 344 57.05±22.22 Malignant Mean±SD p-value Median 12839 616 70.92±23.73 30.92 1.63 0.79±0.09 54.77 1.34 0.67±0.10 0.08 2.56±0.37 0.14±0.03 0.20 2.31±0.42 0.11±0.03 2 2 5 1.69±0.73 34.91±8.95 4 3.5 7.5 2.15±1.06 26.15±8.97 0.41 36.05±9.27 0.64 28.88±9.09 0.08 0.20±0.03 0.09 0.18±0.03 0.02 0.72±0.04 0.05±0.01 0.80±0.24 0.01 0.75±0.05 0.04±0.01 0.67±0.23 28 <0.001* <0.001* <0.001* <0.001* <0.001* <0.001* 0.002* <0.001* <0.001* <0.001* <0.001* <0.001* 0.01* <0.001* 0.004* <0.001* 0.02* <0.001* 0.04* <0.001* <0.001* 0.003* Inertia std. Cluster Shade ave. Haralick Correlation ave. Haralick Correlation std. 503 0.23±0.08 0.18±0.08 9.60 3529.22 54.12 17.18 2166.20 27.39 <0.001* 0.005* 0.01* <0.001* * p-value<0.05 indicates a statistically significant difference. 504 505 Table 6 The comparison of five performance indices and p-values (chi-square test) 506 Accuracy Sensitivity Specificity PPV NPV Az between the speckle features and the segmentation features. 
Speckle (with SCI) Speckle (with convention al US) Segmentat ion Combined 89.1% (122/137) 81.0% (34/42) 92.6% (88/95) 82.9% (34/41) 91.7% (88/96) 0.93 88.3% (121/137) 78.6% (33/42) 92.6% (88/95) 82.5% (33/40) 90.7% (88/97) 0.89 84.7% (116/137) 73.8% (31/42) 89.5%. (85/95) 75.6% (31/41) 88.5% (85/96) 0.86 89.8% (123/137) 81.0% (34/42) 93.7% (89/95) 85.0% (34/40) 91.8% (89/97) 0.94 Speckle (with SCI) VS. Speckle (with convention al US) (p-value) 0.8487 Speckle (with SCI) VS. Segmentat ion (p-value) Combined VS. Speckle (with SCI) (p-value) Combined VS. Segmentat ion (p-value) 0.2833 0.8443 0.2052 0.7860 0.4340 1.0000 0.4340 1.0000 0.4458 0.7738 0.2960 0.9595 0.4138 0.7994 0.2886 0.8168 0.4684 0.9827 0.4541 0.2513 0.0359* 0.9573 0.0219* 507 * p-value<0.05 indicates a statistically significant difference. 508 Combined feature set includes the speckle features with SCI and the segmentation 509 features. 29
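As a worked check of how the performance indices in Table 6 follow from the case counts, the minimal Python sketch below recomputes accuracy, sensitivity, specificity, PPV, and NPV for the speckle features with SCI. The counts are read from the table (34 of 42 malignant and 88 of 95 benign cases classified correctly); the names `ConfusionCounts` and `performance_indices` are illustrative assumptions and are not part of the original study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Case counts for a binary benign/malignant classifier (hypothetical helper)."""
    tp: int  # malignant cases classified as malignant
    fn: int  # malignant cases classified as benign
    tn: int  # benign cases classified as benign
    fp: int  # benign cases classified as malignant


def performance_indices(c: ConfusionCounts) -> dict:
    """Return the five indices reported in Table 6 as fractions in [0, 1]."""
    total = c.tp + c.fn + c.tn + c.fp
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


# Counts taken from the "Speckle (with SCI)" column of Table 6:
# 42 malignant cases (34 detected) and 95 benign cases (88 detected).
speckle_sci = ConfusionCounts(tp=34, fn=8, tn=88, fp=7)
indices = performance_indices(speckle_sci)

for name, value in indices.items():
    print(f"{name}: {100 * value:.1f}%")
```

Under these counts, the PPV and NPV denominators are 34 + 7 = 41 and 88 + 8 = 96, which matches the fractions printed in the table. The Az values and the chi-square p-values are not reproduced here because they require the per-case classifier outputs.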