Supplementary Material

Double-stranded RNA induces molecular and inflammatory signatures that are directly relevant to COPD

Paul Harris*1, Sriram Sridhar*2, Ruoqi Peng1, Jonathan E. Phillips1, Ronald G. Cohn1#, Lisa Burns1, John Woods1, Meera Ramanujam1, Martine Loubeau1, Gaurav Tyagi3, John Allard2, Michael Burczinski2, Palanikumar Ravindran2, Donavan Cheng2, Hans Bitter2, Jay S. Fine1, Carla M.T. Bauer1, and Christopher S. Stevenson1,4§

Hoffmann-La Roche Inc., pRED, Pharma Research & Early Development, 1DTA Inflammation, 2Translational Research Sciences, 3Non-Clinical Safety, 340 Kingsland Street, Nutley, NJ 07110 USA

4Imperial College London, National Heart and Lung Institute, Centre for Respiratory Infections, Respiratory Pharmacology Group, Pharmacology and Toxicology Section, Exhibition Road, London SW2 7AZ

*Authors have contributed equally to this work

§Correspondence: Christopher S. Stevenson, PhD, Inflammation Discovery, F. Hoffmann-La Roche Inc., 340 Kingsland Street, Nutley, NJ 07110-1199
E-mail: Christopher.stevenson@roche.com
FAX: +1-973-235-5005

Supplementary methods

Microarray processing and data analysis

Total RNA was isolated from lung tissue of poly I:C- and saline-treated mice at 7 time points (2, 6, 24, 48, 72, and 96 hours and 7 days; n = 6 per group) after homogenization in QIAzol reagent. Purified total RNA was amplified and labeled using NuGEN Ovation kits (NuGEN Technologies, Inc., San Carlos, CA), and samples were hybridized to Affymetrix Mouse 430 2.0 arrays. Array washing, staining, and scanning were performed according to standard Affymetrix protocols (Affymetrix Inc., Santa Clara, CA). Probe-level data were curated by first mapping individual probe sequences to the most current genome sequences. Probes that mapped non-uniquely to genes or had outdated mappings were discarded, and the remaining probes were summarized into probesets and normalized using Robust Multi-array Average (RMA).
Potential outlier samples were assessed using principal component analysis (PCA) on all normalized probesets across all samples. Probe-level data were subsequently summarized to unique genes based on a variance filter, yielding one expression value per unique gene across all samples. Murine genes were then mapped to their human orthologs for subsequent pathway analysis. This resulted in 14300 unique mouse genes that mapped to human orthologs and were used for analysis. Differentially expressed genes (DEGs) were determined using ANOVA, with pairwise comparisons between poly I:C and saline treatment at each time point. P-values for DEGs in pairwise comparisons were adjusted using a Benjamini-Hochberg correction to account for multiple hypothesis testing 1. Genes that were significantly altered by at least 2-fold between poly I:C treatment and saline controls (false discovery rate, FDR < 0.05) were considered to be differentially expressed. Unsupervised hierarchical clustering was performed on the union of DEGs between poly I:C- and saline-treated samples across all time points to determine phases of response to poly I:C treatment. Genes common to and unique among these phases of response were determined by taking the union and intersection of the DEGs for each phase.

Gene ontology and pathway analysis

Gene ontology (GO) functional analysis and pathway enrichment analysis were performed on DEGs between saline- and poly I:C-treated mice at each time point. Enrichment of functional ontologies for all DEGs was determined by a hypergeometric test, using internally curated GO biological process annotations. Internal curation of this repository involved generating clusters of GO categories based on the degree of overlap between sets of genes within a functional group. Significantly enriched clusters were determined based on a mean FDR cutoff for all the groups within a cluster.
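As an illustration of the two statistical steps named above, the following is a minimal sketch of a hypergeometric enrichment test followed by a Benjamini-Hochberg adjustment. It is not the authors' internal pipeline: the function names and the category size, DEG count, and overlap in the example are hypothetical, and only the 14300-gene universe is taken from the text.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Upper-tail probability P(X >= k) for a hypergeometric draw.

    N: genes in the universe, K: genes annotated to the category,
    n: number of DEGs, k: DEGs that fall in the category.
    """
    upper = min(K, n)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical example: 500 DEGs out of 14300 genes, one GO category
# with 200 annotated genes, 20 of which are DEGs.
p = hypergeom_sf(k=20, N=14300, K=200, n=500)
print(f"enrichment p-value: {p:.3g}")

# Hypothetical p-values for four categories, adjusted together.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

The test is one-sided (over-representation only), which is the usual choice for category enrichment; whether the original analysis used the same tail is an assumption.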
Similarly, pathway enrichment for all DEGs was determined using internally curated data from the NCI Pathway Interaction Database (http://pid.nci.nih.gov/index.shtml). This repository includes pathway data imported from BioCarta and Reactome. Significantly enriched clusters of pathways were also determined as described above.

Gene set enrichment analysis of custom inflammatory gene signatures

Additional pathway analyses were conducted for custom-defined inflammatory signatures of interest, which were compiled from the literature using three different text-mining resources: Ingenuity's Knowledgebase (Ingenuity Systems, Inc, Redwood City, CA), Ariadne's ResNet database (Ariadne Genomics, Rockville, MD), and I2E from Linguamatics (Linguamatics Ltd, Cambridge, UK). These signatures include TLR3 signaling. An additional set of blood cell-specific signatures was obtained from a previous study involving expression profiling of 17 different blood cell subtypes from resting and activated cell populations 2. Signatures of interest were evaluated in the murine poly I:C expression dataset using gene set enrichment analysis 3 to determine altered expression of signaling pathways in response to poly I:C treatment at each time point. Briefly, enrichment of gene sets was calculated against the entire set of 14300 genes from the poly I:C treatment dataset, ranked based on a composite score of fold-change and FDR differences between saline and poly I:C treatment at each time point. FDR values were determined for gene set enrichment by permuting genes within gene sets.

Modular analysis of blood transcriptomic signatures

Enrichment of blood transcriptome modules was performed as previously described 4. Briefly, individual sets of poly I:C DEGs were compiled for each time point assayed.
Enrichment of each set of poly I:C DEGs was determined across each of 28 blood transcriptome modules by determining the overlap of poly I:C DEGs with each module, normalized by the total number of genes within a module.

Modular enrichment = [(number of up-regulated genes) - (number of down-regulated genes)] / (total number of genes in module)

The significance of the enrichment was determined using a hypergeometric test.

Gene set variation analysis

Enrichment of poly I:C genes in clinical datasets was determined using the gene set variation analysis (GSVA) algorithm 5, as implemented in the R software environment (http://www.r-project.org/). Briefly, GSVA determines enrichment of gene sets in an expression dataset on a per-sample basis by transforming the gene-by-sample matrix into a gene set-by-sample matrix. The transformation is carried out using a non-parametric, unsupervised approach that calculates the relative enrichment of a gene set in each sample across the sample space. This allows for the sample-wise comparison of gene set enrichment across a dataset. GSVA enrichment scores were calculated using the up- and down-regulated poly I:C genes from the in vivo murine data as gene sets. Relative enrichment of these gene sets was then calculated on a per-sample basis in 2 clinical microarray datasets available on the Gene Expression Omnibus (GEO): GSE1122 6 and GSE10667 7. GSE1122 consisted of lung tissue samples from COPD and non-COPD lungs, while GSE10667 comprised lung tissue from stable and acute exacerbations of IPF and non-IPF controls. Statistical significance of gene set enrichment was determined by applying a linear model to the GSVA enrichment scores within each dataset and determining the FDR for pairwise differences between sample groups (e.g. COPD vs. Control in GSE1122, Stable IPF vs. Control in GSE10667, etc.).
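To make the shape of the per-sample transformation concrete, the sketch below turns a gene-by-sample matrix into a gene set-by-sample matrix using a deliberately simplified score: the mean centred rank of the gene-set genes in each sample. This is not the GSVA algorithm, which uses kernel-estimated cumulative distributions and a Kolmogorov-Smirnov-like random walk statistic; it is only a rank-based stand-in. All function names, gene names, and expression values are hypothetical.

```python
def rank_scores(expr, gene_set):
    """Score one gene set in every sample.

    expr: dict mapping gene -> list of expression values (one per sample).
    Each gene is ranked across samples, ranks are centred to [-0.5, 0.5],
    and the centred ranks of the gene-set genes are averaged per sample.
    Ties are broken by sample order.
    """
    genes = [g for g in gene_set if g in expr]
    if not genes:
        raise ValueError("no gene-set genes found in the expression matrix")
    n_samples = len(expr[genes[0]])
    if n_samples < 2:
        raise ValueError("at least two samples are needed to rank across samples")
    scores = [0.0] * n_samples
    for g in genes:
        values = expr[g]
        order = sorted(range(n_samples), key=lambda s: values[s])
        for rank, s in enumerate(order):
            scores[s] += rank / (n_samples - 1) - 0.5
    return [total / len(genes) for total in scores]


def gene_set_matrix(expr, gene_sets):
    """Build a gene set-by-sample matrix as a dict of score lists."""
    return {name: rank_scores(expr, genes) for name, genes in gene_sets.items()}


# Toy data: three genes measured in four samples (values are made up).
expr = {
    "gene_a": [1.0, 2.0, 8.0, 9.0],
    "gene_b": [0.5, 1.5, 7.0, 6.0],
    "gene_c": [5.0, 5.1, 4.9, 5.0],
}
gene_sets = {"signature_up": {"gene_a", "gene_b"}}

for name, row in gene_set_matrix(expr, gene_sets).items():
    print(name, [round(x, 3) for x in row])
```

Each row of the result can then be compared between sample groups in the same way as any per-sample measurement, which is the property the linear-model step relies on.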
Unsupervised hierarchical clustering was also performed on the enrichment scores for each dataset to identify clusters of samples within a dataset that showed similar enrichment of poly I:C signatures.

Non-invasive airway hyper-responsiveness (AHR)

To determine if AHR was a feature of the model, AHR was measured at 6, 24, or 48 hours after poly I:C insult. Mice were placed into a whole-body plethysmograph (WBP) interfaced with computers using differential pressure transducers (BUXCO system). After an acclimation period of at least 5 minutes, mice were exposed to aerosolized normal saline, followed by increasing doses of methacholine (MCh) solution (5, 10, 20, 40, 80 mg/ml, Sigma-Aldrich). After each aerosol challenge, enhanced pause (Penh) readings were recorded for 5 minutes and are reported as the average value for each dose.

Supplementary Figure Legends

Figure S1. Poly I:C dose-dependently increases the numbers of total cells (A), neutrophils (B), and lymphocytes (C) in the BALF. Neutrophil and lymphocyte numbers were determined by performing differential cell counts. Data are expressed as mean ± SEM of n = 8-10 mice. Significance (relative to the poly I:C vehicle control) was determined using a Student's t-test and is denoted as follows: *p < 0.05; **p < 0.01; and ***p < 0.001.

Figure S2. The major inflammatory features induced by poly I:C instillation include peribronchiolar (A), perivascular (B), and interstitial (C) inflammatory cell infiltrate. Representative images for control samples are provided for reference (D - F). Lung sections were stained with H&E and pictures were taken at 400x magnification.

Figure S3. Poly I:C induces AHR to nebulized methacholine challenges at 24 hours. Airway responses to increasing doses of methacholine were measured using conscious, whole-body plethysmography at 6 (A), 24 (B), and 48 (C) hours post poly I:C administration. Data are expressed as mean ± SEM of n = 8 mice.
Significance was determined using the AUC values relative to the saline control group.

Figure S4. Monocyte, dendritic cell, and NK cell markers are enriched in response to poly I:C treatment. (A) Gene set enrichment analysis of blood cell marker gene panels from a previously published study (Abbas et al., 2009) was applied to this murine poly I:C dataset. Columns represent the entire ranked gene list based on pairwise comparisons of poly I:C versus saline treatment at each time point. Each row represents an individual gene set. Each box denotes the enrichment score of the gene set against the poly I:C data at each time point. Up-regulated gene sets are shaded red, while down-regulated gene sets are shaded blue. (B) Inflammatory modules based on analysis of blood transcriptional modules from a previously published study (Chaussabel et al., 2008) were enriched in response to poly I:C treatment. Columns represent comparisons of poly I:C versus saline treatment at each time point, while rows represent individual transcriptional modules. Circles are shaded red if there is a high degree of overlap between genes in a module and up-regulated poly I:C genes at a specific time point. Circles are shaded blue if there is a high degree of overlap between genes in a module and down-regulated poly I:C genes at a time point. Asterisks indicate statistically significant (hypergeometric p < 0.05) overlaps between module genes and differentially expressed poly I:C genes.

Figure S5. Poly I:C induces the expression of cytokines and chemokines as measured by microarray analysis. TNFalpha (A), KC (B), MIP-1beta (C), RANTES (D), and total IL-12 (E) were measured by microarray. Data are expressed as mean ± SEM of n = 6 mice. Significance (relative to the poly I:C vehicle control) was determined using a Student's t-test and is denoted as follows: *p < 0.05; **p < 0.01; and ***p < 0.001.

Figure S6. Flow of gene set variation analysis.
Two publicly available datasets (GSE1122 and GSE10667) were used to test per-patient enrichment of poly I:C signatures using gene set variation analysis.

References

1. Benjamini Y, Hochberg Y. Controlling the False Discovery Rate - a Practical and Powerful Approach to Multiple Testing. J Roy Stat Soc B Met 1995; 57(1): 289-300.
2. Abbas AR, Wolslegel K, Seshasayee D, Modrusan Z, Clark HF. Deconvolution of blood microarray data identifies cellular activation patterns in systemic lupus erythematosus. PLoS One 2009; 4(7): e6098.
3. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proceedings of the National Academy of Sciences of the United States of America 2005; 102(43): 15545-15550.
4. Chaussabel D, Quinn C, Shen J, Patel P, Glaser C, Baldwin N et al. A modular analysis framework for blood genomics studies: application to systemic lupus erythematosus. Immunity 2008; 29(1): 150-164.
5. GSVA - The Gene Set Variation Analysis Package. http://www.bioconductor.org/packages/2.8/bioc/html/GSVA.html, 2011. Accessed 2011.
6. Golpon HA, Coldren CD, Zamora MR, Cosgrove GP, Moore MD, Tuder RM et al. Emphysema lung tissue gene expression profiling. American Journal of Respiratory Cell and Molecular Biology 2004; 31(6): 595-600.
7. Konishi K, Gibson KF, Lindell KO, Richards TJ, Zhang Y, Dhir R et al. Gene expression profiles of acute exacerbations of idiopathic pulmonary fibrosis. American Journal of Respiratory and Critical Care Medicine 2009; 180(2): 167-175.