
Supplement to “High-Resolution Analysis of Copy Number
Alterations and Associated Expression Changes in Ovarian
Tumors”
Peter M. Haverty, Lawrence S. Hon, Joshua S. Kaminker, John Chant, Zemin Zhang
SUPPLEMENTARY METHODS
Breakpoint Analysis by Subtype
The number of chromosomal breakpoints in each sample was calculated as the number
of GLAD segments minus one. Gains and losses were calculated as the number of GLAD
segments with an inferred log ratio of > 0.3 and < -0.3, respectively.
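The counting rule above can be sketched as follows. This is a minimal illustration, assuming each sample's GLAD output is available as a plain list of per-segment inferred log ratios; the function name, the input layout, and the example values are hypothetical and not taken from the original analysis.

```python
def summarize_segments(segment_log_ratios, threshold=0.3):
    """Count breakpoints, gains, and losses for one sample.

    segment_log_ratios: inferred log ratio of each GLAD segment (assumed input).
    """
    n_segments = len(segment_log_ratios)
    # Breakpoints are transitions between segments: number of segments minus one.
    breakpoints = max(n_segments - 1, 0)
    # Gains and losses are segments beyond the +/- threshold (strict inequality).
    gains = sum(1 for r in segment_log_ratios if r > threshold)
    losses = sum(1 for r in segment_log_ratios if r < -threshold)
    return {"breakpoints": breakpoints, "gains": gains, "losses": losses}


# Made-up log ratios for a single sample with five segments.
example = [0.02, 0.45, -0.10, -0.62, 0.31]
summary = summarize_segments(example)
```

With these made-up values the sample has four breakpoints, two gained segments, and one lost segment.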
Comparison of Tumor Gain Regions with Normal Copy Number Polymorphisms
Data on normal human copy number variations (Redon et al., 2006) were downloaded
from http://projects.tcag.ca/variation/data/500K_EA_sample_level_CNV.gff for the Affymetrix
500K SNP array. For the tumor data, GLAD segments in individual samples with an inferred
copy number of > 2.5 or < 1.62 were separately selected for comparison.
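As a rough check on these cutoffs, the sketch below converts a copy number to a log2 ratio, assuming a diploid baseline of two copies, and selects gained and lost segments separately. Under that assumption, 2.5 and 1.62 copies correspond to log2 ratios of about 0.32 and -0.30. The helper names, the dictionary layout, and the segment coordinates are hypothetical.

```python
import math


def copy_number_to_log2_ratio(copy_number, baseline=2.0):
    """Log2 ratio of a copy number relative to an assumed diploid baseline."""
    return math.log2(copy_number / baseline)


def select_altered(segments, gain_cn=2.5, loss_cn=1.62):
    """Split segments into gains (> gain_cn) and losses (< loss_cn)."""
    gains = [s for s in segments if s["copy_number"] > gain_cn]
    losses = [s for s in segments if s["copy_number"] < loss_cn]
    return gains, losses


# Made-up segments for illustration only.
segments = [
    {"chrom": "8", "start": 1000, "end": 5000, "copy_number": 3.1},
    {"chrom": "8", "start": 5001, "end": 9000, "copy_number": 2.0},
    {"chrom": "3", "start": 2000, "end": 4000, "copy_number": 1.2},
]
gains, losses = select_altered(segments)
gain_cutoff_log2 = round(copy_number_to_log2_ratio(2.5), 2)
loss_cutoff_log2 = round(copy_number_to_log2_ratio(1.62), 2)
```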
LOH Analysis
Genotypes were generated using the BRLMM method (apt-probeset-genotype version
1.8.5 from the Affymetrix Power Tools package). The HMM-based method implemented in
Partek 6.3b was used to detect regions of LOH in each sample using the default parameters.
Pathway Analysis of Gains and Losses
Genes in the GISTIC-identified peaks of gain and loss, with significant expression changes, were
overlaid onto a global molecular network developed from information contained in the Ingenuity
knowledge base (Ingenuity Systems, www.ingenuity.com). This database contained
information for 231 out of 260 genes. Significant expression changes were taken to be two-fold
over-expression or under-expression, in samples with a given CNA relative to normal samples,
for regions of gain or loss, respectively. Networks of these genes were then algorithmically
generated based on their connectivity.
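The two-fold expression criterion can be illustrated with a short sketch. It assumes linear-scale expression values, a comparison of group means, and an inclusive two-fold cutoff; none of these details are stated in the text, and the function name and example values are hypothetical.

```python
from statistics import mean


def is_significant_change(cna_values, normal_values, region_type, fold=2.0):
    """Apply a two-fold criterion in the direction matching the CNA.

    cna_values: expression in samples carrying the CNA (assumed linear scale).
    normal_values: expression in normal samples (assumed linear scale).
    region_type: "gain" requires over-expression, "loss" under-expression.
    """
    ratio = mean(cna_values) / mean(normal_values)
    if region_type == "gain":
        return ratio >= fold
    if region_type == "loss":
        return ratio <= 1.0 / fold
    raise ValueError("region_type must be 'gain' or 'loss'")


# Made-up expression values for illustration only.
gain_hit = is_significant_change([400, 600], [200, 250], "gain")
loss_hit = is_significant_change([50, 70], [200, 250], "loss")
gain_miss = is_significant_change([300], [200], "gain")
```

Here the first two genes pass the filter (ratios of about 2.2 and 0.27), while the third does not (ratio of 1.5).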
Gene Locations
SNP Chip probe sets were mapped to the genome, NCBI assembly version 36, using
annotation provided by the Affymetrix web site
(http://www.affymetrix.com/products/arrays/specific/500k.affx). Genes and Affymetrix
expression probe sets were localized on the genome by aligning RefSeq sequences and probe set
targets to the genome, NCBI Version 36, using GMAP (Wu and Watanabe, 2005). Cytoband and
miRNA locations, for the NCBI Version 36 of the genome, were downloaded from the UCSC
genome browser (http://www.genome.ucsc.edu).
REFERENCES
Benjamini Y, Hochberg Y. 1995. Controlling the False Discovery Rate: a Practical and Powerful
Approach to Multiple Testing. J. R. Statist. Soc. 57:289-300.
Redon R, Ishikawa S, Fitch KR, Feuk L, Perry GH, Andrews TD, Fiegler H, Shapero MH,
Carson AR, Chen W, Cho EK, Dallaire S, Freeman JL, Gonzalez JR, Gratacos M, Huang
J, Kalaitzopoulos D, Komura D, MacDonald JR, Marshall CR, Mei R, Montgomery L,
Nishimura K, Okamura K, Shen F, Somerville MJ, Tchinda J, Valsesia A, Woodwark C,
Yang F, Zhang J, Zerjal T, Zhang J, Armengol L, Conrad DF, Estivill X, Tyler-Smith C,
Carter NP, Aburatani H, Lee C, Jones KW, Scherer SW, Hurles ME. 2006. Global
variation in copy number in the human genome. Nature 444:444-54.
Wu TD, Watanabe CK. 2005. GMAP: a genomic mapping and alignment program for mRNA
and EST sequences. Bioinformatics 21:1859-75.
SUPPLEMENTARY FIGURE LEGENDS
Figure S1: Copy Number Alteration Trends by Subtype
a) The number of transitions from one copy number to another (breakpoints) per sample,
stratified by ovarian cancer subtype. b) The sum of the inferred log ratio (ILR) in each segment >
0.3 (gain) in each sample, stratified by subtype. c) The sum of the ILR in each segment < -0.3
(loss) in each sample, stratified by subtype.
Figure S2: Comparison of CNA and CNV lengths
Distribution of the lengths of genome segments with abnormal copy number in a normal
population (blue) (Redon et al., 2006) and in 52 ovary tumors (red) for a) gain and b) loss.
Figure S3: GISTIC and Heatmaps for CNAs on Chromosomes 3, 8, and 20
Heatmap and GISTIC for specific chromosomes. Details of the amplicon structure and statistical
significance are presented for (a,b) chromosome 3; (b, c) chromosome 8; (d,e) chromosome 20.
Heatmaps and GISTIC amplification significance were prepared and plotted as in Fig. 2.
Figure S4: Chromosome 8 Amplifications and Associated Expression Change
a) GISTIC Q-values for chromosome 8, plotted as in Fig. 4. b) Close-up of the region of panel
“a” indicated by vertical, red lines. The locations of all genes associated with a RefSeq transcript
or Affymetrix probeset are indicated in red. c) Expression ratios for genes in the genome region
depicted in panel “b” and represented on the HGU133 Plus 2.0 array. For each gene, the probeset
with the highest 90th percentile value was selected. Red bars indicate the log2
ratio of the mean value in samples with copy gain of this gene and the mean value in samples
without gain of this gene. Orange bars represent the log2 ratio of the mean value in samples with
copy gain of this gene and the mean expression level in the normal samples. In order to show the
PVT1 probeset 1558290_a_at, which is not present in the Affymetrix U133A or B chips, we
used expression data from Affymetrix U133 Plus 2.0 chips for 41 of the 52 tumors and 51
normal samples.
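The probeset selection and the plotted ratios described in this legend can be sketched as below. This is an illustration only: it assumes linear-scale expression values, uses the inclusive quantile method from the Python standard library for the 90th percentile, and uses hypothetical probeset names and made-up values.

```python
import math
from statistics import mean, quantiles


def percentile_90(values):
    """90th percentile: the last of the nine decile cut points."""
    return quantiles(values, n=10, method="inclusive")[-1]


def pick_probeset(probesets):
    """Return the probeset name whose values have the highest 90th percentile."""
    return max(probesets, key=lambda name: percentile_90(probesets[name]))


def log2_ratio_of_means(group_a, group_b):
    """Log2 of (mean of group_a / mean of group_b)."""
    return math.log2(mean(group_a) / mean(group_b))


# Hypothetical probesets for one gene, with expression across four samples.
probesets = {
    "probeset_1": [10, 12, 11, 13],
    "probeset_2": [80, 95, 90, 400],
}
chosen = pick_probeset(probesets)

# Made-up values: samples with gain versus samples without gain.
ratio_gain_vs_no_gain = log2_ratio_of_means([400, 400], [100, 100])
```

The same `log2_ratio_of_means` helper would apply to the comparison against normal samples by passing the normal-sample values as the second group.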
Figure S5: Chromosome 8 Deletions and Associated Expression Changes
a) GISTIC Q-values for loss on chromosome 3, plotted as in Fig. 2. Vertical, dashed lines
indicate areas of focus in panels ‘b’ (red) and ‘c’ (blue). b) Close-up of the region of panel ‘a’
indicated by vertical, red lines. The locations of all genes associated with a RefSeq transcript or
Affymetrix probeset are indicated in red. c) Close-up of the region of panel ‘a’ indicated by
vertical, blue lines. The locations of all genes associated with a RefSeq transcript or an
Affymetrix probeset are indicated in red. d) Expression ratios for genes in the genome region
depicted in panel ‘b’ and represented on the HGU133A and B arrays. For each gene, the probeset
with the highest 90th percentile value was selected. Red bars indicate the log2
ratio of the mean value in samples with copy gain of this gene and the mean value in samples
without gain of this gene. Orange bars represent the log2 ratio of the mean value in samples with
copy gain of this gene and the mean expression level in the normal samples. e) Expression ratios
for genes in the genome region depicted in panel ‘c’, plotted as in panel ‘d’.
Figure S6: Deletions and LOH
a) The fraction of samples with copy loss (ILR < -0.3) at each position. b) The fraction of samples
determined to have LOH at each position. c) A Heatmap and hierarchical clustering dendrogram
depicting regions of LOH (red). Hierarchical clustering was performed with the Euclidean
distance metric and complete linkage clustering.
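For readers unfamiliar with the clustering settings named here, the following is a naive sketch of agglomerative clustering with Euclidean distance and complete linkage. It assumes each sample is encoded as a binary LOH profile (1 = LOH, 0 = no LOH) over a shared set of positions; this encoding, the function names, and the example matrix are assumptions for illustration, not the software used for the figure.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def complete_linkage(profiles):
    """Naive agglomerative clustering; returns the merge history.

    Each merge is (members_of_cluster_1, members_of_cluster_2, distance),
    where members are tuples of sample indices.
    """
    clusters = [(i,) for i in range(len(profiles))]
    merges = []
    while len(clusters) > 1:
        best = None
        for c1, c2 in combinations(clusters, 2):
            # Complete linkage: distance between clusters is the largest
            # pairwise distance between their members.
            d = max(euclidean(profiles[i], profiles[j]) for i in c1 for j in c2)
            if best is None or d < best[0]:
                best = (d, c1, c2)
        d, c1, c2 = best
        # Replace the two closest clusters with their union.
        clusters = [c for c in clusters if c not in (c1, c2)]
        clusters.append(c1 + c2)
        merges.append((c1, c2, d))
    return merges


# Made-up LOH profiles: 4 samples x 5 positions.
loh = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 1],
]
merges = complete_linkage(loh)
```

In this toy example samples 0 and 1 merge first, then samples 2 and 3, and the two pairs join last. For real data a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="complete"` and `metric="euclidean"` would be the more practical choice.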
Figure S7: Pathway Analysis of Gains and Losses
Pathway analysis inferred connectivity among genes in GISTIC-identified regions of gain and
loss with significant expression changes. Network nodes have been colored to indicate the level
and magnitude of their expression change in samples with a given CNA relative to normal ovary.
Green represents decreased expression and red indicates increased expression. Colored network
nodes represent genes in this set and white nodes represent genes related to these genes through
the Ingenuity database.