MMP1 and MMP7 as Potential Peripheral Blood Biomarkers in Idiopathic Pulmonary Fibrosis

Ivan O. Rosas MD, Thomas J. Richards PhD, Kazuhisa Konishi MD, Yingze Zhang PhD, Kevin Gibson MD, Anna E. Lokshin PhD, Kathleen O. Lindell MSN, Jose Cisneros PhD, Sandra D. MacDonald RN, Annie Pardo PhD, Frank Sciurba MD, James Dauber MD, Moises Selman MD, Bernadette R. Gochuico MD, Naftali Kaminski MD.

Online Supplement

Methods

Study Population

Initial IPF Derivation cohort - This study included 74 patients with IPF evaluated at the Dorothy P. and Richard P. Simmons Center for Interstitial Lung Disease at the University of Pittsburgh Medical Center between 2002 and 2004, and was approved by the Institutional Review Board at the University of Pittsburgh School of Medicine. The diagnosis of IPF was determined similarly in all cohorts and was based on established criteria[1]. Briefly, all patients presented with progressive dyspnea of > 3 months, nonproductive cough, restrictive pulmonary functional impairment with reduced diffusing capacity for carbon monoxide (DLCO), arterial hypoxemia exacerbated by exercise, and Velcro-type inspiratory crackles on physical examination. Other known causes of interstitial lung disease, such as drug toxicities, environmental exposures, and connective tissue diseases, were excluded. When HRCT findings typical of UIP were found, the diagnosis of IPF was established and surgical lung biopsy was avoided. HRCT findings were considered characteristic of IPF when exhibiting bilateral basilar subpleural reticulation and honeycomb cysts, accompanied by traction bronchiectasis and architectural distortion. When HRCT was considered atypical (for example, showing typical features but more extensive ground glass attenuation), the diagnosis of usual interstitial pneumonia was confirmed on surgical lung biopsy as indicated[2]. All patients were seen by Simmons Center Physicians (KFG, JD).
Clinical data (demographics, smoking status, clinical findings, drug treatment, pulmonary functions, and imaging studies) were available through the Simmons Center Database. Smoking status was characterized as “never”, “former” (patients with smoking history who quit smoking at least 12 months before presentation), or “current” (patients with smoking history who are either still smoking or quit less than a year before presentation)[3]. Fifty-three controls were obtained from the Pulmonary Division Sample Collection core. Baseline demographic information is detailed in Table 1 of the manuscript. The mean age for IPF patients was 65.9 ± 9.4 years, average FVC% predicted was 61.9 ± 20.8 and average DLCO% predicted was 42.1 ± 17.4. Eight IPF patients had been treated with prednisone, 3 with IFN, 16 with IFN + prednisone, and 47 were untreated as of the sampling date.

COPD cohort - Samples from 73 patients with COPD who were evaluated at the University of Pittsburgh were available for this study. Individuals were clinically stable at the time of examination, had tobacco exposure of at least 10 pack-years and had no clinical diagnosis of rheumatologic, infectious or other systemic inflammatory disease. Exclusion criteria included a dominant restrictive spirometric impairment, a significant allergic history, completely reversible airflow obstruction or a history of clinical asthma. The study was approved by the University of Pittsburgh Institutional Review Board. GOLD stages were defined as previously described[4]: Stage 0: Chronic cough and sputum production but normal lung function. Stage I: Mild airflow limitation (FEV1/FVC < 70% and FEV1 80% predicted or more). Stage II: FEV1 50%-80% predicted. Stage III: FEV1 30%-50% predicted. Stage IV: FEV1 < 30% predicted or the presence of respiratory failure or clinical signs of right heart failure. The COPD cohort included 13 patients with GOLD 0-I, 21 patients with GOLD II and 39 patients with GOLD III-IV.
Sarcoidosis cohort - 47 patients with sarcoidosis evaluated at the University of Pittsburgh Medical Center between 2002 and 2004 were available for this study. 70% of patients were female and 21.3% were African-American. The average age was 45.7 ± 9.7 years. The diagnosis and staging of disease were performed according to American Thoracic Society/European Respiratory Society criteria. The staging was defined as: stage 1, adenopathy alone; stage 2, adenopathy plus infiltrates; stage 3, infiltrates alone; stage 4, fibrosis. Patients with stages 2-4 (n=29) showed FVC % 76.7 ± 22.1, and DLCO % 72.9 ± 25.5.

Hypersensitivity Pneumonitis cohort - Sera from 41 patients with subacute/chronic HP were included in this study (age 49.8 ± 12.8 years). Diagnosis of HP was made as previously described[5,6]. Briefly, patients showed the following features: a) antecedent of bird exposure and positive serum antibodies against avian antigens; b) clinical and functional features of an interstitial lung disease; c) HRCT showing diffuse centrilobular poorly defined micronodules, ground glass attenuation, focal air trapping and mild/moderate fibrotic changes; and d) >35% lymphocytes in bronchoalveolar lavage (BAL) fluid. Forty-four percent of the patients were biopsied and in all of them lung histology was compatible with the diagnosis of HP.

Validation cohort - 20 controls, 8 patients with sub-clinical ILD, 16 patients with familial IPF and 9 patients with sporadic IPF were evaluated at the Clinical Center, National Institutes of Health (NIH), in Bethesda, MD. Subjects, at least 18 years old, were recruited by advertisements, and were enrolled in protocols 99H-0068 and/or 04-H-0211, approved by the National Heart, Lung, and Blood Institute Institutional Review Board. Subjects were eligible if they had an open lung biopsy demonstrating usual interstitial pneumonia or HRCT scan findings consistent with IPF as outlined by the American Thoracic Society/European Respiratory Society guidelines.
Written informed consent was obtained from subjects. The cohorts have been recently described by us[7,8]. Average ages for subjects with sporadic and familial IPF were 66 ± 8 years and 64 ± 11 years, respectively. Eight patients with familial IPF were diagnosed with early asymptomatic interstitial lung disease using HRCT[8]; the mean age in this group was 49 ± 11 years. Twenty normal volunteers with a mean age of 39 ± 17 years were used as a control group. Gender, ethnic origin and smoking status for the four groups are presented in Table 2.

Samples for oligonucleotide microarrays were obtained from the University of Pittsburgh Health Sciences Tissue Bank as previously described by us[9]. The use of archived tissue has been approved by the local Institutional Review Board. Diagnosis of IPF was supported by history, physical examination, pulmonary function studies, and chest high-resolution computed tomography (HRCT), and was corroborated by open lung biopsy. The morphologic diagnosis of IPF was based on typical microscopic findings consistent with usual interstitial pneumonia[2]. The patients fulfilled the criteria of the American Thoracic Society and European Respiratory Society[10]. Twenty-three samples obtained from surgical remnants of biopsies or lungs explanted from patients with IPF who underwent pulmonary transplant and 15 normal histology lung samples resected from patients with lung cancer were used for microarray analysis.

Collection and Storage of Blood. 45 ml of peripheral blood were drawn from subjects using standardized phlebotomy procedures, after informed consent. Handling and processing were similar for patients and controls. Cells, plasma or serum were separated by centrifugation, and all specimens were immediately aliquoted, frozen and stored in a dedicated -80°C freezer. No more than two freeze-thaw cycles were allowed per sample.
Bronchoalveolar lavage (BAL)

BAL was performed through flexible fiber-optic bronchoscopy under local anesthesia as previously described by us[9]. Briefly, 300 ml of normal saline were instilled in 50-ml aliquots, with an average recovery of 60-70%. The recovered BAL fluid was centrifuged at 250 × g for 10 min at 4°C. The cell pellet was resuspended in 1 ml phosphate-buffered saline and an aliquot was used to evaluate the total number of cells. Other aliquots were fixed in carbowax, stained with hematoxylin & eosin and used for differential cell count. Supernatants were kept at -70°C until use. BAL fluids from 22 IPF patients (age 62.2 ± 7.2 years) and 10 normal controls (age 41.5 ± 5 years) were available for this study. These patients belong to a cohort of IPF individuals studied and followed up at the National Institute of Respiratory Diseases, Mexico, and their inclusion was approved by the local Ethics Committee. Diagnosis of IPF was made based on established criteria[1] and confirmed by lung biopsy in 40% of the patients. In IPF patients BAL was performed as part of the diagnostic process as previously described by us[6,9,11]. Cells were stained with hematoxylin & eosin for differential cell counts, and supernatants were frozen at -70°C until use.

All studies were approved by the Institutional Review Board at the University of Pittsburgh, the National Heart, Lung, and Blood Institute or the National Institute of Respiratory Diseases, Mexico. Informed consent was obtained from all patients.

Multiplex Analysis

The Luminex xMAP technology (Luminex Corp., Austin, TX) combines the principle of a sandwich immunoassay with fluorescent-bead-based technology, allowing individual and multiplex analysis of up to 100 different analytes in a single microtiter well. The Luminex xMAP plasma assays were done in 96-well microplate format according to the protocol by Biosource International (Camarillo, CA).
A filter-bottom, 96-well microplate (Millipore, Billerica, MA) was blocked for 10 minutes with PBS/bovine serum albumin. To generate a standard curve, 5-fold dilutions of appropriate standards were prepared in serum diluent. Standards and patient sera were pipetted at 50 µL per well in duplicate and mixed with 50 µL of the bead mixture. The microplate was incubated for 1 hour at room temperature on a microtiter shaker. Wells were then washed three times with washing buffer using a vacuum manifold. Phycoerythrin (PE)-conjugated secondary antibody was added to the appropriate wells. The wells were incubated for 45 minutes in the dark, with constant shaking. Wells were washed twice, assay buffer was added to each well, and samples were analyzed using the Bio-Plex suspension array system (Bio-Rad Laboratories, Hercules, CA). Fluorescence measures were converted to protein concentrations using a five-parameter logistic curve (5-PL).

Sources of bead-based immunoassays. The 34-plex assay for IL1A, IL1B, IL2, IL2R, IL4, IL5, IL6, IL7, IL8, IL10, IL12B (IL-12p40), IL13, IL15, IL17, TNFA, IFNA, IFNG, GMCSF, EGF, TNFR10B, VEGF, GCSF, FGF2, HGF, CCL5, CXCL10, CCL11, CCL3, CCL5, CCL2, CXCL9, TNFRS1A, TNFRS1B was purchased from Biosource International (Camarillo, CA). MMP assays for MMP1, MMP2, MMP3, MMP7, MMP8, MMP9, MMP12, MMP13 were obtained from R&D Systems (Minneapolis, MN). Assays for FAS, EGFR, FASL, CKRT19, IGFBP1, KLK10 were developed in our Pittsburgh Clinical Proteomics Core Facility[13]. The assays were validated as described previously[13]. Inter-assay variability within replicates, presented as an average coefficient of variation, was in the range of 5.4 to 15%. Intra-assay variability was between 7.0 and 15%[13]. Each assay was further validated in comparison with the appropriate ELISA and demonstrated 97-99% correlation[13].

ELISA

Quantitative sandwich enzyme immunoassay for human AGER was performed as recommended by the manufacturer (R&D Systems, Minneapolis, MN).
Patient samples were prepared by diluting 1:2 in Calibrator Diluent provided with the kit. The AGER standard was reconstituted in 1 ml of distilled water to 50,000 pg/mL. Standards assembled on the plate ranged from 0 pg/mL to 5000 pg/mL, with the appropriate calibrator diluent. All samples and standards were run in duplicate and incubated for two hours at room temperature with Assay Diluent RD1-60. Following four washes, Conjugate was added to each well and incubated. After 2 hours at room temperature, the reagents were aspirated and the plate was washed an additional four times. Shielded from light, the samples were incubated with substrate solution at room temperature for 30 minutes. After color development in proportion to the bound AGER, the reaction was stopped and the optical density of the samples was measured at 450 nm on a microplate reader. The quantitative sandwich enzyme immunoassay for human MMP1 and MMP7 was performed similarly, as recommended by the manufacturer (R&D Systems, Minneapolis, MN).

Oligonucleotide microarray experiments

Lung samples were lysed in ice-cold Trizol and total RNA was extracted and used as a template for double stranded cDNA synthesis. RNA quantity was determined by OD measurement at 260 nm and RNA integrity by Agilent Bioanalyzer. Labeling was performed using the Agilent Low RNA Input Linear Amplification Kit PLUS, (One-Color) (Agilent Technologies, Santa Clara, CA). Briefly, double stranded cDNA synthesis was performed using an oligo(dT)24 primer containing a T7 RNA polymerase promoter site. The cDNA was used as a template to generate Cy3 labeled cRNA that was used for hybridization. After purification and fragmentation, aliquots of each sample were hybridized to Agilent Whole Mouse Genome 4 × 44K multi pack arrays (Agilent Technologies, Santa Clara, CA). After hybridization, each array was sequentially washed and scanned (Agilent DNA microarray scanner).
Arrays were individually visually inspected for hybridization defects and quality control procedures were applied, as recommended by the manufacturer of the arrays. For array readout we used Agilent Feature Extraction Software. Data files were imported into a microarray database and linked with updated gene annotations using SOURCE (http://genome-www5.stanford.edu/cgi-bin/SMD/source/sourceSearch) and then normalized using cyclic LOESS[14]. Differentially expressed genes were identified using Significance Analysis of Microarrays (SAM)[15]. Probes corresponding to the genes that encode the 49 protein markers were identified through their gene symbols or Entrez gene ids. Expression levels for the probes that corresponded to these markers were extracted. In the case of multiple redundant probes we selected the highest expressing probe with the lowest Q value.

Pulmonary function testing (PFT)

Measurements were made using standard equipment (SensorMedics, Yorba Linda, CA) according to American Thoracic Society recommendations[12]. Forced expiratory volume in one second (FEV1), forced vital capacity (FVC), and diffusion capacity (DLCO) were expressed as percentages of predicted values.

Data handling and statistical analysis

Protein concentrations obtained from 5-PL curve fitting were exported from the Luminex software to tab-delimited text files. Concentrations with fluorescence below background were set to zero. Data visualization and clustering were performed using Genomica (http://genomica.weizmann.ac.il/index.html), previously described[16], and Spotfire Decision Site 9 (TIBCO, Palo Alto, CA). A protein was considered differentially expressed when there was a change of at least 25% in concentration and statistical significance at p-value < 0.05 corrected for multiple testing. Data are reported as mean ± standard deviation. The Wilcoxon rank-sum test was used to identify potential biomarkers that univariately distinguish IPF samples from controls.
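The differential-expression criterion above can be illustrated with a short sketch. The analysis in this supplement was carried out in R; the Python below is only a stand-in written for illustration, not the authors' code. It assumes a normal approximation to the Wilcoxon rank-sum statistic (with tie correction, no continuity correction), a Bonferroni adjustment as the multiple-testing correction, and it reads "a change of at least 25%" as a ratio of medians of at least 1.25 or at most 0.75. The function names `rank_sum_p` and `screen_markers` are hypothetical.

```python
import math
from statistics import median


def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank-sum p-value (tie-corrected normal approximation)."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0          # mean of the 1-based ranks i+1..j
        t = j - i                             # size of this group of tied values
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0      # Mann-Whitney U for sample x
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    return math.erfc(abs(z) / math.sqrt(2))   # equals 2 * (1 - Phi(|z|))


def screen_markers(ipf, control, alpha=0.05, min_change=0.25):
    """ipf and control map marker name -> list of concentrations (assumed layout)."""
    n_tests = len(ipf)
    results = {}
    for marker in ipf:
        p_adj = min(1.0, rank_sum_p(ipf[marker], control[marker]) * n_tests)
        med_ipf, med_ctl = median(ipf[marker]), median(control[marker])
        if med_ctl > 0:
            fold = med_ipf / med_ctl          # fold ratio as a ratio of medians
        else:
            fold = float("inf") if med_ipf > 0 else 1.0
        selected = p_adj < alpha and abs(fold - 1.0) >= min_change
        results[marker] = {"p_adj": p_adj, "fold": fold, "selected": selected}
    return results
```

Calling `screen_markers` with two dictionaries of per-marker concentration lists returns, for each marker, the adjusted p-value, the fold ratio, and whether both criteria are met. For small samples an exact test may be preferable to the normal approximation used here.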
Statistical analysis was performed using the R language and environment for statistical computing ([17]; http://www.r-project.org). Univariate comparisons were performed using the Wilcoxon rank-sum test; for multiple testing, the familywise error rate (FWER) was controlled at 0.05 using the Bonferroni method. Fold ratios are expressed as ratios of medians. Descriptive statistics and box plots were used to examine univariate distributions of protein concentrations in IPF and control subsets.

Classification trees were estimated with the rpart package for recursive partitioning, an implementation of the classification and regression trees (CART) algorithm. Classification performance was assessed using the ROCR package (http://rocr.bioinf.mpi-sb.mpg.de). Briefly, the CART algorithm begins by searching the entire data set for a protein and a binary split on that protein that minimize the misclassification of samples in the resulting “nodes” of the split. Each potential biomarker is compared to a threshold, and each sample is classified as IPF or Control depending on whether the marker is above or below the threshold. After the first split is chosen, the algorithm is applied recursively to each resulting node, until the resulting nodes are too small to split further. The usual procedure is to overfit classification trees to the data and prune back the trees using optimal cost-complexity pruning.

For oligonucleotide array data analysis we applied SAM, and significance was controlled at FDR = 5% (SAM Q value = 5)[15].

To investigate the ability of MMP7 and MMP1 to jointly distinguish IPF from HP, we plotted the values of MMP1 (x-axis) and MMP7 (y-axis) in all patients (manuscript, Figure 4D). Closed circles indicate patients with IPF and open circles indicate patients with HP.
Corners represent points at which the trade-off between positive predictive value (PPV) and negative predictive value (NPV) is optimal for ruling out IPF (blue) or ruling in IPF (red) based on MMP1 and MMP7 concentrations. As can be seen in Figure 4D, the combination of MMP7 and MMP1 concentrations performs well in terms of the predictive value of a positive or a negative test result. For example, a serum level of MMP7 > 2.6 ng/ml and a serum level of MMP1 > 8.9 ng/ml had a 91% positive predictive value (PPV) for IPF, and a serum level of MMP7 < 2.9 ng/ml and MMP1 > 3.5 ng/ml had a 96% negative predictive value (NPV) for IPF (Figure S1). To derive the corners in Figure 4D we calculated optimal concentration combinations in the trade-off of PPV and NPV by plotting the PPV vs. NPV for all combinations (Figure S1). Each “corner point” demarcated in Figure S1 has a corresponding line of the same color in Figure 4D of the manuscript.

References

1. (2000) American Thoracic Society. Idiopathic pulmonary fibrosis: diagnosis and treatment. International consensus statement. American Thoracic Society (ATS), and the European Respiratory Society (ERS). Am J Respir Crit Care Med 161: 646-664.
2. Katzenstein AL, Myers JL (1998) Idiopathic pulmonary fibrosis: clinical relevance of pathologic classification. Am J Respir Crit Care Med 157: 1301-1315.
3. King TE, Jr., Tooze JA, Schwarz MI, Brown KR, Cherniack RM (2001) Predicting survival in idiopathic pulmonary fibrosis: scoring system and survival model. Am J Respir Crit Care Med 164: 1171-1181.
4. Pauwels RA, Buist AS, Calverley PM, Jenkins CR, Hurd SS (2001) Global strategy for the diagnosis, management, and prevention of chronic obstructive pulmonary disease. NHLBI/WHO Global Initiative for Chronic Obstructive Lung Disease (GOLD) Workshop summary. Am J Respir Crit Care Med 163: 1256-1276.
5. Bustos ML, Frias S, Ramos S, Estrada A, Arreola JL, et al.
(2007) Local and circulating microchimerism is associated with hypersensitivity pneumonitis. Am J Respir Crit Care Med 176: 90-95.
6. Selman M, Pardo A, Barrera L, Estrada A, Watson SR, et al. (2006) Gene expression profiles distinguish idiopathic pulmonary fibrosis from hypersensitivity pneumonitis. Am J Respir Crit Care Med 173: 188-198.
7. Ren P, Rosas IO, Macdonald SD, Wu HP, Billings EM, et al. (2007) Impairment of alveolar macrophage transcription in idiopathic pulmonary fibrosis. Am J Respir Crit Care Med 175: 1151-1157.
8. Rosas IO, Ren P, Avila NA, Chow CK, Franks TJ, et al. (2007) Early interstitial lung disease in familial pulmonary fibrosis. Am J Respir Crit Care Med 176: 698-705.
9. Pardo A, Gibson KF, Cisneros J, Richards TJ, Yang Y, et al. (2005) Up-Regulation and Profibrotic Role of Osteopontin in Human Idiopathic Pulmonary Fibrosis. PLoS Med 2.
10. (2002) American Thoracic Society/European Respiratory Society International Multidisciplinary Consensus Classification of the Idiopathic Interstitial Pneumonias. This joint statement of the American Thoracic Society (ATS), and the European Respiratory Society (ERS) was adopted by the ATS board of directors, June 2001 and by the ERS Executive Committee, June 2001. Am J Respir Crit Care Med 165: 277-304.
11. Selman M, Carrillo G, Estrada A, Mejia M, Becerril C, et al. (2007) Accelerated variant of idiopathic pulmonary fibrosis: clinical behavior and gene expression pattern. PLoS ONE 2: e482.
12. Brusasco V, Crapo R, Viegi G (2005) Coming together: the ATS/ERS consensus on clinical pulmonary function testing. Eur Respir J 26: 1-2.
13. Gorelik E, Landsittel DP, Marrangoni AM, Modugno F, Velikokhatnaya L, et al. (2005) Multiplexed immunobead-based cytokine profiling for early detection of ovarian cancer. Cancer Epidemiol Biomarkers Prev 14: 981-987.
14. Wu W, Dave N, Tseng GC, Richards T, Xing EP, et al. (2005) Comparison of normalization methods for CodeLink Bioarray data. BMC Bioinformatics 6: 309.
15.
Tusher VG, Tibshirani R, Chu G (2001) Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci U S A 98: 5116-5121.
16. Segal E, Friedman N, Kaminski N, Regev A, Koller D (2005) From signatures to models: understanding cancer using microarrays. Nat Genet 37 Suppl: S38-45.
17. Ihaka R, Gentleman R (1996) R: A language for data analysis and graphics. Journal of Computational and Graphical Statistics 5: 299-314.

Supplementary Figure Legend

Figure S1. PPV and NPV for Figure 4D. Red and blue corners in manuscript Figure 4D were derived by computing PPV and NPV at all observed combinations of MMP-7 and MMP-1 concentrations. The points delineated by corners in this figure correspond to optimal trade-off points between PPV and NPV and also to corners of the same line color in manuscript Figure 4D.

[Figure S1 plot: Negative Predictive Value (%) on the y-axis against Positive Predictive Value (%) on the x-axis, both axes spanning 40 to 100. Point labels recovered from the figure: 66%, 80%, 85%, 91%, 96%, 77%, 75%, 70%.]
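The scan described in the Figure S1 legend, computing PPV and NPV at every observed combination of MMP7 and MMP1 concentrations, can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' R code. It assumes a single decision rule in which a test is positive when both markers exceed their cutoffs, and it treats a "corner" as a cutoff pair that no other pair beats on both PPV and NPV; the supplement does not spell out either detail. The function names and the sample layout `(mmp1, mmp7, is_ipf)` are hypothetical.

```python
def ppv_npv(samples, mmp7_cut, mmp1_cut):
    """Return (PPV, NPV) for one cutoff pair; None where a value is undefined.

    samples: list of (mmp1, mmp7, is_ipf) tuples (assumed layout).
    A test is called positive when BOTH markers exceed their cutoffs (assumption).
    """
    tp = fp = tn = fn = 0
    for mmp1, mmp7, is_ipf in samples:
        positive = mmp7 > mmp7_cut and mmp1 > mmp1_cut
        if positive and is_ipf:
            tp += 1
        elif positive:
            fp += 1
        elif is_ipf:
            fn += 1
        else:
            tn += 1
    ppv = tp / (tp + fp) if tp + fp else None
    npv = tn / (tn + fn) if tn + fn else None
    return ppv, npv


def scan_cutoffs(samples):
    """Evaluate every pair of observed MMP7 and MMP1 values as candidate cutoffs."""
    mmp1_values = sorted({s[0] for s in samples})
    mmp7_values = sorted({s[1] for s in samples})
    rows = []
    for c7 in mmp7_values:
        for c1 in mmp1_values:
            ppv, npv = ppv_npv(samples, c7, c1)
            if ppv is not None and npv is not None:
                rows.append((c7, c1, ppv, npv))
    return rows


def corner_points(rows):
    """Keep cutoff pairs that no other pair matches or beats on both PPV and NPV."""
    def dominated(r):
        return any(o[2] >= r[2] and o[3] >= r[3] and (o[2] > r[2] or o[3] > r[3])
                   for o in rows)
    return [r for r in rows if not dominated(r)]
```

Plotting the `(ppv, npv)` pairs returned by `scan_cutoffs` gives a scatter of the kind shown in Figure S1, and `corner_points` picks out the candidates along its upper-right boundary. A separate rule-out criterion, such as the MMP7-below-cutoff rule quoted in the text, would need its own definition of a positive test.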