Supplementary Materials and Methods
Microarray Datasets
Preparation of the labeled cRNA samples for Affymetrix DNA microarray analysis was
described previously (Berchuck et al., 2005).
Four microarray datasets of clinical ovarian samples and one microarray
dataset of ovarian cancer cell lines that include clear cell and serous subtypes were obtained
from the Gene Expression Omnibus (GEO) web site at
http://www.ncbi.nlm.nih.gov/sites/entre and from the website of the University of Texas
M.D. Anderson Cancer Center at
http://www.mdanderson.org/departments/expther/bastovcalab/. GEO datasets are:
GSE6008 including 99 samples (OCCC: 8, OS: 41, OM: 13, OE: 37) from the University
of Michigan (Hendrix et al., 2006); GSE2109 including 138 samples (OCCC: 11, OS: 87,
OM: 13, OE: 27) from the Expression Project for Oncology (expO)
(https://expo.intgen.org/geo/); GSE4198 including 44 samples (OCCC: 6, OS: 38) from
Stanford University (Schaner et al., 2003); and GSE3001 including 10 ovarian cancer cell
lines (OCCC: 5, OS: 5) from Hiroshima University (Komatsu et al., 2006). A dataset
including 50 samples (OCCC: 9, OS: 23, OM: 9, OE: 9) was from the University of Texas
M.D. Anderson Cancer Center (PMID16144910) (Marquez et al., 2005).
Bioinformatics Methodologies
Identification of differentially expressed genes between OCCC and non-OCCC: We identified
differentially expressed genes by comparing OCCC samples with non-clear cell carcinoma
(non-OCCC) samples using SAM (Tusher et al., 2001) with a Wilcoxon-type statistic and a
q-value threshold of 5%. For statistical analysis of gene expression data, R
version 2.8.2 (R Development Core Team, R: A language and environment for statistical
computing. R Foundation for Statistical Computing, Vienna, http://www.R-project.org.)
and Bioconductor version 2.3 (Ihaka and Gentleman, 1996) were used.
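
The gene-wise comparison described above can be illustrated with a minimal Python sketch. It is not SAM: SAM estimates q-values by permutation, whereas this sketch substitutes a normal-approximation Wilcoxon rank-sum test followed by Benjamini-Hochberg adjustment as a simpler stand-in. The function names and the toy expression values are hypothetical and are not taken from the original analysis, which was carried out in R.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, no tie correction)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2))


def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    q = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        q[idx] = running_min
    return q


# Toy data (hypothetical): gene -> (OCCC values, non-OCCC values)
expr = {
    "geneA": ([8.1, 7.9, 8.4, 8.0], [5.0, 5.2, 4.9, 5.1, 5.3]),
    "geneB": ([6.0, 6.2, 5.9, 6.1], [6.1, 5.8, 6.3, 6.0, 6.2]),
}
pvals = [rank_sum_p(occc, non_occc) for occc, non_occc in expr.values()]
qvals = bh_qvalues(pvals)
selected = [gene for gene, q in zip(expr, qvals) if q < 0.05]
```

With only two genes the adjustment is nearly trivial; on a real array the same functions would be applied to every probe set before thresholding at 5%.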
Hierarchical clustering: Genes selected by SAM were used to conduct
average-linkage hierarchical clustering of KyotoOv38, GSE6008, and external datasets
(GSE2109, GSE4198, GSE3001, and PMID16144910) with Cluster version 3.0
(http://rana.lbl.gov/eisen/). Datasets from other microarray platforms were matched to
the U133A platform as follows: for the U133 Plus 2.0 dataset (GSE2109), only the probe
sets shared with U133A were used; for PMID16144910, the “best match” annotation file
provided by Affymetrix (http://www.netaffx.com) (Ramaswamy et al., 2003) was used;
for GSE4198, Chip Comparer
(http://tenero.duhs.duke.edu/genearray/perl/chip/chipcomparer.pl) was used;
and for GSE3001, the files were merged using GeneIDs. For the cDNA spotted array
datasets (GSE4198 and GSE3001), only genes with a standard deviation (SD) > 0.7 across
all samples were used. For all analyses,
levels of gene expression were standardized using mean-centering. Heat maps and
dendrograms were generated with Java TreeView (http://jtreeview.sourceforge.net/).
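
The two preprocessing steps mentioned above (the SD filter and mean-centering) can be sketched in Python as follows. This is only an illustration under stated assumptions: the source does not say whether the sample or population SD was used, nor whether centering was done per gene, so the sketch assumes the sample SD and per-gene centering. The function names and the toy matrix are hypothetical.

```python
import statistics


def filter_by_sd(expr, threshold=0.7):
    """Keep genes whose sample SD across all samples exceeds the threshold."""
    return {gene: vals for gene, vals in expr.items()
            if statistics.stdev(vals) > threshold}


def mean_center(expr):
    """Subtract each gene's mean so that every gene is centered at zero."""
    centered = {}
    for gene, vals in expr.items():
        m = statistics.mean(vals)
        centered[gene] = [v - m for v in vals]
    return centered


# Toy matrix (hypothetical): gene -> log-ratio values across three samples
toy = {"geneA": [1.0, 2.0, 3.0], "geneB": [1.0, 1.1, 0.9]}
prepared = mean_center(filter_by_sd(toy))
```

The resulting matrix is what would then be passed to an average-linkage clustering tool such as Cluster 3.0.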
Gene Set Enrichment Analysis (GSEA): GSEA was performed following the
developer’s protocol (Subramanian et al., 2005) (http://www.broad.mit.edu/gsea/) to determine
whether the up-regulated or down-regulated gene sets in the OCCC signature are enriched
within OCCC samples or non-OCCC samples in external datasets (GSE2109, GSE4198,
GSE3001, and PMID16144910).
Allez analysis: The biological characteristics of the OCCC signature were
evaluated by the enrichment of MSigDB gene sets (v2.5 updated April 7 2008,
Subramanian et al., 2005) using the R package allez 1.0 (Newton et al., 2007). Briefly, for
each gene set, the proportion of the annotated genes in the OCCC signature was compared
to that for all probe set genes. A gene set was considered significantly enriched if the
nominal p-value was less than 0.05 (Pyeon et al., 2007).
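
The idea of comparing the proportion of a gene set inside the signature with its proportion among all genes can be illustrated with a one-sided hypergeometric test. This is a stand-in chosen for simplicity and is not the random-set statistic computed by the allez package; the function and argument names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_in_set, n_signature, n_overlap):
    """P(X >= n_overlap) when n_signature genes are drawn without replacement
    from n_universe genes, of which n_in_set belong to the gene set."""
    total = comb(n_universe, n_signature)
    upper = min(n_in_set, n_signature)
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_signature - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy call (hypothetical numbers): 10 genes on the array, 4 in the gene set,
# 3 in the signature, 2 of which fall in the gene set.
p = hypergeom_enrichment_p(10, 4, 3, 2)
```

A gene set would be flagged when this nominal p-value falls below 0.05, mirroring the threshold stated above.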
Pathway analysis: Ingenuity Pathway Analysis (IPA; Ingenuity Systems®,
http://www.ingenuity.com) is a commercial application that calculates the association
between a particular gene set and known pathways. The activated pathways in OCCC were
searched using IPA with filtering of “Human” and “Relax Relation (including Indirect)”.
Evaluation of the effect of oxidative stress on the OCCC signature: To evaluate
enrichment of the OCCC signature among genes induced by oxidative stress, we generated
an enrichment score as defined by GSEA procedures (Sweet-Cordero et al., 2005). Rank-ordered
gene lists of $N$ total genes, $g_j$ ($j = 1, \ldots, N$), were based on the log fold changes, $c_j$,
between conditions “with” and “without” oxidative stress. The fraction of genes in the
OCCC signature (set $S$ of $N_H$ genes), weighted by absolute log fold changes, $P_{\mathrm{hit}}$, and
the fraction of genes not in $S$, $P_{\mathrm{miss}}$, were evaluated at each position $i$ of the ranked list.
Phit i  
cj
N
g j S
j i
, where N R 
R
Pmiss i  
c
g j S
j
1
 N  N 
g j S
j i
H
The running enrichment score (RES) is the difference between these empirical distribution
functions, $ES(i) = P_{\mathrm{hit}}(i) - P_{\mathrm{miss}}(i)$. The magnitude of enrichment was
assessed by the maximum deviation of RES from zero.
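
The running enrichment score defined above can be computed with a short Python sketch. It assumes the gene list is already sorted by log fold change, that the signature contains at least one gene with a nonzero fold change, and that at least one gene lies outside the signature. The function name and the toy data are hypothetical.

```python
def running_enrichment(ranked_genes, fold_changes, signature):
    """Return the running score ES(i) for i = 1..N and its maximum deviation from zero.

    ranked_genes : gene identifiers, already rank-ordered by log fold change
    fold_changes : log fold changes c_j, in the same order
    signature    : set S of signature genes
    """
    n = len(ranked_genes)
    hits = [g in signature for g in ranked_genes]
    n_h = sum(hits)                                             # N_H
    n_r = sum(abs(c) for c, h in zip(fold_changes, hits) if h)  # N_R
    p_hit = p_miss = 0.0
    res = []
    for c, h in zip(fold_changes, hits):
        if h:
            p_hit += abs(c) / n_r      # weighted step for a signature gene
        else:
            p_miss += 1.0 / (n - n_h)  # uniform step for a non-signature gene
        res.append(p_hit - p_miss)
    es = max(res, key=abs)             # signed value with the largest magnitude
    return res, es


# Toy ranked list (hypothetical): signature genes sit at the top
genes = ["a", "b", "c", "d"]
log_fc = [2.0, 1.0, -1.0, -2.0]
res, es = running_enrichment(genes, log_fc, {"a", "b"})
```

Because both cumulative fractions reach 1 at the end of the list, the running score always returns to zero at the last position; the sign of the maximum deviation indicates whether the signature concentrates among induced or repressed genes.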
Statistical Analysis
The Mann-Whitney U test with Bonferroni correction was used to compare expression
levels, as detected by RT-PCR analyses, between OCCC and non-OCCC samples.
Fisher’s exact test was used in the analysis of contingency tables. The correlation between
expression levels in the KyotoOv38 microarray dataset and those measured by semi-qRT-PCR or
qRT-PCR was evaluated using Spearman’s rank correlation. A p-value less than 0.05 was considered
statistically significant.
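
Two of the procedures named above, Spearman’s rank correlation and the Bonferroni correction, are simple enough to sketch in plain Python. This is an illustration only, with hypothetical function names and toy values; it computes Spearman’s coefficient as the Pearson correlation of average ranks and assumes neither input is constant.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's coefficient: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


# Toy values (hypothetical): microarray signal vs. qRT-PCR measurement
rho = spearman_rho([7.2, 8.5, 6.1, 9.0], [1.1, 2.3, 0.4, 3.0])
adjusted = bonferroni([0.01, 0.02, 0.5])
```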
Supplementary References
Berchuck A, Iversen ES, Lancaster JM, Pittman J, Luo J, Lee P et al (2005).
Patterns of gene expression that characterize long-term survival in advanced stage serous
ovarian cancers. Clin Cancer Res 11: 3686-96.
Hendrix ND, Wu R, Kuick R, Schwartz DR, Fearon ER, Cho KR (2006).
Fibroblast growth factor 9 has oncogenic activity and is a downstream target of Wnt
signaling in ovarian endometrioid adenocarcinomas. Cancer Res 66: 1354-62.
Ihaka R, Gentleman R (1996). R: A language for data analysis and graphics. J
Comput Graph Stat 5: 299-314.
Komatsu M, Hiyama K, Tanimoto K, Yunokawa M, Otani K, Ohtaki M et al
(2006). Prediction of individual response to platinum/paclitaxel combination using novel
marker genes in ovarian cancers. Mol Cancer Ther 5: 767-75.
Marquez RT, Baggerly KA, Patterson AP, Liu J, Broaddus R, Frumovitz M et
al (2005). Patterns of gene expression in different histotypes of epithelial ovarian cancer
correlate with those in normal fallopian tube, endometrium, and colon. Clin Cancer Res 11:
6116-26.
Newton MA, Quintana FA, den Boon JA, Sengupta S, Ahlquist P (2007).
Random-set methods identify distinct aspects of the enrichment signal in gene-set analysis.
Annals of Applied Statistics 1: 85-106.
Pyeon D, Newton MA, Lambert PF, den Boon JA, Sengupta S, Marsit CJ et al
(2007). Fundamental differences in cell cycle deregulation in human papillomavirus-positive
and human papillomavirus-negative head/neck and cervical cancers. Cancer Res
67: 4605-19.
Ramaswamy S, Ross KN, Lander ES, Golub TR (2003). A molecular signature
of metastasis in primary solid tumors. Nat Genet 33: 49-54.
Schaner ME, Ross DT, Ciaravino G, Sorlie T, Troyanskaya O, Diehn M et al
(2003). Gene expression patterns in ovarian carcinomas. Mol Biol Cell 14: 4376-86.
Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA
et al (2005). Gene set enrichment analysis: a knowledge-based approach for interpreting
genome-wide expression profiles. Proc Natl Acad Sci U S A 102: 15545-50.
Sweet-Cordero A, Mukherjee S, Subramanian A, You H, Roix JJ, Ladd-Acosta
C et al (2005). An oncogenic KRAS2 expression signature identified by cross-species
gene-expression analysis. Nat Genet 37: 48-55.
Tusher VG, Tibshirani R, Chu G (2001). Significance analysis of microarrays
applied to the ionizing radiation response. Proc Natl Acad Sci U S A 98: 5116-21.