Additional file 1

Supplementary Table 1. Intraclass correlation coefficients (ICC) for masked reliability assessment of computer-extracted features (n=91 pairs)

| Feature Type and Number | Feature | Definition | ICC | Reference |
|---|---|---|---|---|
| SELECTED FEATURES | | | | |
| Gray level magnitude-based features: | | | | |
| M1 | AVE | Average gray value within ROI; higher values correspond to denser region | 0.986 | [1-3] |
| M2 | MinCDF | Gray value corresponding to the 5% region cutoff on cumulative density function; higher values correspond to denser region | 0.971 | [1-3] |
| M3 | Balance | Ratio of (95%CDF-AVE) to (AVE-5%CDF); related to skewness; values less than one correspond to an ROI that is skewed towards relatively denser values | 0.959 | [1-4] |
| Texture-based features: | | | | |
| T1 | Energy | Measure of image homogeneity; higher values correspond to being more homogeneous | 0.980 | [5-7] |
| T2 | MaxF (COOC) | Largest number of a gray value pair in the co-occurrence matrix; measure of image homogeneity; higher values correspond to being more homogeneous | 0.987 | [5-7] |
| OTHER FEATURES | | | | |
| Gray level magnitude-based features: | | | | |
| M4 | MaxCDF | Gray value corresponding to the 95% region cutoff on cumulative density function; higher values correspond to denser region | 0.996 | [1-3] |
| M5 | 70%CDF | Gray value corresponding to the 70% region cutoff on cumulative density function; higher values correspond to denser region | 0.989 | [1-3] |
| M6 | 30%CDF | Gray value corresponding to the 30% region cutoff on cumulative density function; higher values correspond to denser region | 0.982 | [1-3] |
| M7 | Balance2 | Ratio of (70%CDF-AVE) to (AVE-30%CDF); related to skewness; values larger than one correspond to denser regions | 0.842 | [1-4] |
| M8 | Skewness | The denseness measure; negative values correspond to denser region | 0.981 | [1-4] |
| M9 | RMS | Root mean square variation; quantifies the magnitude of parenchymal patterns | 0.990 | [1-3, 7] |
| Texture-based features: | | | | |
| T3 - T8 | D_BC [1-6] | Fractal dimension estimated based on box-counting method; lower values correspond to coarser texture | Range: 0.895-0.995 | [3, 8] |
| T9 | D_M | Fractal dimension estimated based on Minkowski method; lower values correspond to coarser texture | 0.993 | [3, 8] |
| T10 - T17 | Beta [1-8] | Exponent beta from power law spectrum analysis; characterizes the frequency content of texture pattern; higher values correspond to coarser texture | Range: 0.960-0.990 | [9] |
| T18 | Contrast (COOC) | Contrast measure calculated from co-occurrence matrix; measure of image local variations | 0.998 | [5-7] |
| T19 | Contrast (NGTDM) | Contrast measure calculated from Neighborhood-Graytone-difference matrix; measure of image local variations | 0.989 | [1-4] |
| T20 | Correlation (COOC) | Measure of image linearity; larger values correspond to linear patterns | 0.791 | [5-7] |
| T21 | Entropy (COOC) | Measure of randomness of gray level pairs | 0.996 | [5-7] |
| T22 | ZeroMeasure (COOC) | Zero measures in co-occurrence matrix; measure of image homogeneity | 0.982 | [5-7] |
| T23 | Skewness (COOC) | Measure of the asymmetry of co-occurrence matrix; image homogeneity measure | 0.983 | [5-7] |
| T24 | MeanEdgeGradient | Average of edge gradient; image coarseness measure | 0.997 | [3, 7] |
| T25 | MaxEdgeGradient | Maximum edge gradient; image coarseness measure | 0.905 | [3, 7] |
| T26 | MinEdgeGradient | Minimum edge gradient; image coarseness measure | 0.933 | [3, 7] |
| T27 | StdDevEdgeGradient | Standard deviation of edge gradient; image coarseness measure | 0.996 | [3, 7] |
| T28 | Coarseness (NGTDM) | Measure of image coarseness; higher values correspond to coarser region | 0.953 | [1-4] |
| T29 | FMP | First moment of power spectrum; spatial frequency content of parenchymal patterns | 0.972 | [1-3, 7] |

Supplementary Table 2.
Correlations between selected computer-extracted features (n=237 women)

| Feature Type and Number | Feature | Correlation with AVE: r* | p-value | Correlation with MinCDF: r* | p-value | Correlation with Balance: r* | p-value | Correlation with Energy: r* | p-value |
|---|---|---|---|---|---|---|---|---|---|
| Gray level magnitude-based features: | | | | | | | | | |
| M1 | AVE | 1.00 | | | | | | | |
| M2 | MinCDF | 0.60 | <.0001 | 1.00 | | | | | |
| M3 | Balance | -0.77 | <.0001 | -0.41 | <.0001 | 1.00 | | | |
| Texture-based features: | | | | | | | | | |
| T1 | Energy | -0.09 | 0.19 | 0.48 | <.0001 | -0.09 | 0.19 | 1.00 | |
| T2 | MaxF (COOC) | 0.03 | 0.69 | 0.43 | <.0001 | -0.13 | 0.053 | 0.90 | <.0001 |

*Spearman's correlation coefficient

Supplementary Table 3. Sensitivity analyses of the ability of the trained classifier to distinguish between BRCA1/2 mutation carriers and non-carriers in age-matched datasets

| Training dataset | Selected features ¥ | Testing dataset | No. non-carriers | No. carriers | AUC | SE | Mean paired difference in probability score from trained classifier | SD | p-value † | Mean paired difference in PMD | SD | p-value † |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Original training dataset* | PMD alone | Age-matched ~3 years | 19 | 19 | 0.55 | 0.09 | | | | -1.83 | 18.04 | 0.83 |
| Original training dataset* | Features alone (4 features selected: MinCDF, Energy, AVE, MaxF (COOC)) | Age-matched ~3 years | 19 | 19 | 0.71 | 0.09 | 0.28 | 0.48 | 0.02 | | | |
| Original training dataset* | Features + PMD (4 features selected: MinCDF, Energy, AVE, MaxF (COOC)) | Age-matched ~3 years | 19 | 19 | 0.72 | 0.08 | 0.32 | 0.53 | 0.02 | | | |
| Original training dataset* | PMD alone | Age-matched ~1 year | 17 | 17 | 0.52 | 0.09 | | | | -0.77 | 19.33 | 1.00 |
| Original training dataset* | Features alone (4 features selected: MinCDF, Energy, AVE, MaxF (COOC)) | Age-matched ~1 year | 17 | 17 | 0.73 | 0.09 | 0.26 | 0.52 | 0.06 | | | |
| Original training dataset* | Features + PMD (4 features selected: MinCDF, Energy, AVE, MaxF (COOC)) | Age-matched ~1 year | 17 | 17 | 0.74 | 0.09 | 0.25 | 0.53 | 0.08 | | | |
| Training dataset truncated at upper age-limit of mutation carriers § | Features alone (3 features selected: MinCDF, MaxF (COOC), Balance) | Testing dataset truncated at upper age-limit of mutation carriers §; age-matched ~3 years | 19 | 19 | 0.72 | 0.09 | 0.18 | 0.39 | 0.055 | -1.83 | 18.04 | 0.83 |
| Training dataset truncated at upper age-limit of mutation carriers § | Features + PMD (3 features selected: MinCDF, MaxF (COOC), Balance) | Testing dataset truncated at upper age-limit of mutation carriers §; age-matched ~3 years | 19 | 19 | 0.71 | 0.09 | 0.18 | 0.37 | 0.06 | | | |

* Original training set includes 70 non-carriers and 107 BRCA1/2 mutation carriers
§ Training dataset truncated at upper age-limit of mutation carriers (i.e., 55 years) includes 48 non-carriers and 48 BRCA1/2 mutation carriers
¥ Selected features: Percent mammographic density was not selected by the trained classifier but was forced into the models where noted
† P-value for differences between age-matched pairs from Wilcoxon signed rank test
AUC, area under the curve; PMD, percent mammographic density; SE, standard error

Supplementary Figure 1. Histogram of the number of times that each feature was selected in the 177 leave-one-case-out stepwise feature selection using linear discriminant analysis of the training dataset. Note the features M1-M9 correspond to gray level magnitude-based features, and features T1-T29 correspond to texture-based features. Two gray level magnitude-based features (i.e., M1, AVE; M2, MinCDF) and two texture-based features (i.e., T1, Energy; T2, MaxF (COOC)) were selected more than half the time (as indicated by the dashed line in the figure), and were included in subsequent Bayesian Artificial Neural Network models.

References

1. Huo Z, Giger ML, Wolverton DE, et al. Computerized analysis of mammographic parenchymal patterns for breast cancer risk assessment: feature selection. Med Phys 2000;27(1):4-12.
2. Li H, Giger ML, Huo Z, et al. Computerized analysis of mammographic parenchymal patterns for assessing breast cancer risk: effect of ROI size and location. Med Phys 2004;31(3):549-555.
3. Li H, Giger ML, Olopade OI, et al. Computerized texture analysis of mammographic parenchymal patterns of digitized mammograms. Acad Radiol 2005;12(7):863-873.
4. Huo Z, Giger ML, Olopade OI, et al. Computerized analysis of digitized mammograms of BRCA1 and BRCA2 gene mutation carriers. Radiology 2002;225(2):519-526.
5. Chen W, Giger ML, Li H, et al. Volumetric texture analysis of breast lesions on contrast-enhanced magnetic resonance images. Magn Reson Med 2007;58(3):562-571.
6. Haralick RM, Shanmugam K, Dinstein I. Textural features for image classification. IEEE Trans Syst Man Cybern 1973;3(6):610-621.
7. Sonka M, Hlavac V, Boyle R. Image Processing, Analysis, and Machine Vision. 2nd ed. PWS Publishing; 1999.
8. Li H, Giger ML, Olopade OI, et al. Fractal analysis of mammographic parenchymal patterns in breast cancer risk assessment. Acad Radiol 2007;14(5):513-521.
9. Li H, Giger ML, Olopade OI, et al. Power spectral analysis of mammographic parenchymal patterns for breast cancer risk assessment. J Digit Imaging 2008;21(2):145-152.
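
Illustrative note on the gray level magnitude-based definitions in Supplementary Table 1. The sketch below shows one way the features AVE, MinCDF, MaxCDF, 30%CDF, 70%CDF, Balance and Balance2 could be computed from the pixel values of an ROI. It is not the authors' implementation. The function names (`cdf_cutoff`, `ratio`, `gray_level_features`) are hypothetical, and the reading of an "x% region cutoff" as the lowest gray value at which the empirical cumulative distribution reaches x% is an assumption; the cited references [1-4] give the original definitions.

```python
import math


def cdf_cutoff(pixels, fraction):
    """Return the smallest gray value whose empirical CDF reaches `fraction`.

    Assumption: the "x% region cutoff" is the lowest gray value g such that
    at least x% of the ROI pixels are <= g.
    """
    ordered = sorted(pixels)
    n = len(ordered)
    for rank, value in enumerate(ordered, start=1):
        if rank / n >= fraction:
            return value
    return ordered[-1]


def ratio(numerator, denominator):
    """Divide, returning NaN when the denominator is zero (e.g. a constant ROI)."""
    return numerator / denominator if denominator != 0 else math.nan


def gray_level_features(pixels):
    """Compute first-order gray level features for a flat list of ROI pixels."""
    if not pixels:
        raise ValueError("ROI must contain at least one pixel")
    ave = sum(pixels) / len(pixels)
    min_cdf = cdf_cutoff(pixels, 0.05)   # 5% cutoff (MinCDF)
    max_cdf = cdf_cutoff(pixels, 0.95)   # 95% cutoff (MaxCDF)
    cdf30 = cdf_cutoff(pixels, 0.30)
    cdf70 = cdf_cutoff(pixels, 0.70)
    return {
        "AVE": ave,
        "MinCDF": min_cdf,
        "MaxCDF": max_cdf,
        "30%CDF": cdf30,
        "70%CDF": cdf70,
        # Balance: (95%CDF - AVE) / (AVE - 5%CDF), as defined in the table.
        "Balance": ratio(max_cdf - ave, ave - min_cdf),
        # Balance2: (70%CDF - AVE) / (AVE - 30%CDF), as defined in the table.
        "Balance2": ratio(cdf70 - ave, ave - cdf30),
    }


if __name__ == "__main__":
    # Toy ROI with gray values 1..100; real ROIs would come from a mammogram.
    roi = list(range(1, 101))
    for name, value in gray_level_features(roi).items():
        print(name, value)
```

Sorting the pixels and walking the ranks keeps the cutoff definition explicit; a percentile routine from a numerical library could be substituted, though its interpolation rule may give slightly different cutoffs.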