Supplementary Tables

Manuscript ID: HEP-12-1863
Supplementary materials
Genomic landscape of copy number aberrations enables the identification of oncogenic
drivers in hepatocellular carcinoma
Kai Wang1,*, Ho Yeong Lim2, Stephanie Shi3, Jeeyun Lee2, Shibing Deng1, Tao Xie1, Zhou Zhu1, Yuli Wang1, David Pocalyko1,†, Wei Jennifer Yang1, Paul A. Rejto1, Mao Mao1, Cheol-Keun Park4,* and Jiangchun Xu1,*,‡
1 Oncology Research Unit, Pfizer Inc., San Diego, CA 92121, USA
2 Division of Hematology-Oncology, Department of Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul 135-710, Korea
3 External Research Solutions, Pfizer Inc., San Diego, CA 92121, USA
4 Department of Pathology, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul 135-710, Korea
† Current address: Janssen Research & Development, 3210 Merryfield Row, San Diego, CA
92121, USA
‡ Current address: Quanticel Pharmaceuticals, 9393 Towne Centre Dr., Suite 110, San Diego,
CA 92121
* Correspondence should be addressed to:
Kai Wang (Kai.Wang4@pfizer.com)
Cheol-Keun Park (ckpark@skku.edu)
Jiangchun Xu (jxucam@gmail.com)
Contents
Supplementary Methods
Clinico-pathological features of primary HCC samples
DNA and RNA extraction
Gene expression analysis
Copy number data processing pipeline
RNAi Knockdown
Assays used in quantitative real-time RT-PCR
Immunohistochemical staining on tissue microarrays
Supplementary Tables
Table S1. Major demographic and clinicopathological parameters of the HCC cohort.
Table S2. Cell lines used in this study and their sources.
Table S3. All CNA peaks predicted by GISTIC2 analysis.
Table S4. Association of average copy number of the focal amplification and deletion peaks identified by GISTIC2 to clinical and outcome variables.
Table S5. All pathways enriched among cis-acting genes in CNA peaks.
Table S6. Candidate driver genes selected based on focal CNA, expression changes and model availability.
Table S7. Somatic copy number, gene expression and IHC staining results for BCL9.
Table S8. Somatic copy number, gene expression and IHC staining results for MTDH.
Supplementary Figures
Figure S1. Association between somatic CNA, mRNA expression and clinical outcome.
Figure S2. Pair-wise DNA/DNA correlations reveal significant associations between unlinked loci.
Figure S3. Distributions of GISTIC2 peak statistics.
Figure S4. Frequent somatic copy number alterations in critical signaling pathways in HCC.
Figure S5. Immunostaining in HCCs for BCL9 and MTDH.
Figure S6. Inferred copy numbers and expression levels of BCL9 and MTDH in a panel of 30 HCC cell line models.
Figure S7. Association to clinical outcomes and AJCC tumor stages for the putative CNA drivers BCL9 and MTDH.
References
Supplementary Methods
Clinico-pathological features of primary HCC samples
We defined curative resection as complete resection of all tumor nodules with clear microscopic
resection margins and no residual tumors as indicated by a computed tomography (CT) scan one
month after surgery. None of the patients received preoperative chemotherapy. Clinical
parameters, including age, gender, date of surgery, and tumor size were obtained from pathology
reports. We also examined histopathologic features of the HCCs including histological
differentiation, vascular invasion, intrahepatic metastasis, AJCC stage (1) and non-tumor liver
pathology. HCCs were graded histologically according to the criteria of Edmondson and Steiner
(2). Vascular invasion was considered present when a neoplastic cell group was surrounded by
one or more endothelial cells or the tunica media of the vessel. Intrahepatic metastasis was
defined according to the criteria of the Liver Cancer Study Group of Japan (3). Serum
α-fetoprotein measurement and CT scans were performed at least once every 3 months after surgery
until December 31, 2010. When tumor recurrence was suspected, precise diagnostic imaging was
performed by means of magnetic resonance imaging. Disease-free survival (DFS) was defined as
time from surgery to the date of tumor recurrence. While HCC is the cause of death for most
patients with the disease, some patients died of liver failure or other causes in the absence of
progressive HCC. For disease-specific survival (DSS), we defined HCC-related mortality
(disease-specific death) as follows: 1) tumor occupying more than 80% of the liver, 2) portal
venous tumor thrombus (PVTT) proximal to the second bifurcation, 3) obstructive jaundice due
to tumor, 4) distant metastases, or 5) variceal hemorrhage with PVTT proximal to the first
bifurcation (4).
DNA and RNA extraction
Genomic DNA and total RNA were extracted from the sliced tissue specimens using the
QIAamp DNA mini kit and RNeasy Plus Mini kit (Qiagen, Hilden, Germany), respectively.
RNA integrity was assessed using Agilent 2100 BioAnalyzer (Agilent Technologies, Palo Alto,
CA). For gene expression analysis, 276 tumor and 247 adjacent non-tumor liver samples with an
RNA integrity number greater than 5.0 were included.
Gene expression analysis
Background corrected bead level data were exported from Illumina GenomeStudio for further
analysis. Median was used to summarize bead level data to probe level data, which were then
transformed with base 2 logarithm. The log transformed probe intensity data were normalized
using quantile normalization. A probe was called present (p≤0.05) or absent (p>0.05) based on
its comparison to the negative controls at the 0.05 level. For genes with multiple probes, probe level
data were summarized into gene level data using median after removing non-performing probes.
Non-performing probes were defined only for genes with more than one probe and for the
purpose of averaging probe level data. A non-performing probe has average intensity below the
background threshold (i.e., 95% quantile of the negative control probes) for both normal and
tumor samples, and another probe from the same gene has intensity at least two fold above the
background threshold in either tumor or normal samples. In this case, we excluded the “non-performing” probes when averaging the probe level data into gene level data.
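The normalization and probe-filtering steps described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the dictionary layout for probe data, and the tie handling in the quantile normalization are assumptions made for the example. It also assumes that intensities and the background threshold are already on the log2 scale, so that "two fold above" the threshold corresponds to adding 1.

```python
from statistics import mean, median


def quantile_normalize(samples):
    """Force every sample (a list of log2 probe intensities) onto one shared distribution."""
    n = len(samples[0])
    ranked = [sorted(s) for s in samples]
    # Reference value for rank k is the mean of the k-th smallest intensity across samples.
    reference = [mean(r[k] for r in ranked) for k in range(n)]
    result = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])  # ties are broken by probe order
        out = [0.0] * n
        for rank, probe_index in enumerate(order):
            out[probe_index] = reference[rank]
        result.append(out)
    return result


def non_performing(probes, background):
    """Return the names of probes flagged as non-performing for one gene.

    probes maps a probe name to {"tumor": [...], "normal": [...]} log2 intensities
    (hypothetical layout); background is the log2 background threshold.
    """
    if len(probes) < 2:  # the definition applies only to genes with more than one probe
        return set()
    avg = {name: (mean(v["tumor"]), mean(v["normal"])) for name, v in probes.items()}
    flagged = set()
    for name, (tumor, normal) in avg.items():
        weak = tumor < background and normal < background
        # Two-fold above background is +1 on the log2 scale.
        strong_sibling = any(
            max(t, m) >= background + 1.0
            for other, (t, m) in avg.items()
            if other != name
        )
        if weak and strong_sibling:
            flagged.add(name)
    return flagged


def gene_level(probes, background, group="tumor"):
    """Per-sample median over the probes that survive the filter."""
    dropped = non_performing(probes, background)
    kept = [v[group] for name, v in probes.items() if name not in dropped]
    return [median(values) for values in zip(*kept)]
```

For a gene with one strong and one weak probe, `gene_level` returns the strong probe's values unchanged, since the weak probe is excluded before the median is taken.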
Copy number data processing pipeline
Normalized single channel intensity data were further corrected for potential dye-bias effect with
the tQN method (5) and converted into raw copy number estimates in the form of Log R ratios
(LRRs), which were then adjusted for genomic wave effect using the PennCNV package (6). For
primary HCCs, we further normalized the LRR values in tumors by subtracting those from their
matched non-tumor liver tissues, in an effort to eliminate germline copy number variations in
each individual patient. Copy number segmentation was performed on the LRR data using the
GLAD algorithm (7) with default parameters as provided in the R aroma package (8). To
summarize the copy number of each gene in a sample, we took the mean LRR of the segment to
which it belongs, and converted it back to copy number space by 2^(LRR+1). When a gene
overlapped multiple segments, the LRR of the segment with the largest absolute value was taken.
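As a worked illustration of the gene-level summary described above, the sketch below subtracts matched non-tumor LRRs, picks the overlapping segment with the largest absolute LRR, and converts it to copy number with 2^(LRR+1). The function names and the (start, end, mean LRR) segment tuples are assumptions for the example; segmentation itself (GLAD) is not reproduced here.

```python
def matched_lrr(tumor_lrr, normal_lrr):
    """Subtract matched non-tumor LRRs from tumor LRRs, marker by marker."""
    return [t - n for t, n in zip(tumor_lrr, normal_lrr)]


def lrr_to_copy_number(lrr):
    """Convert a log R ratio to copy number: LRR = 0 maps to 2 copies."""
    return 2.0 ** (lrr + 1.0)


def gene_copy_number(gene_start, gene_end, segments):
    """Summarize one gene from segments on the same chromosome.

    segments is a list of (start, end, mean_lrr) tuples (hypothetical layout).
    When several segments overlap the gene, the one with the largest
    absolute LRR is used.
    """
    overlapping = [
        lrr for start, end, lrr in segments
        if start <= gene_end and end >= gene_start
    ]
    if not overlapping:
        return None  # no segment covers the gene
    return lrr_to_copy_number(max(overlapping, key=abs))
```

For example, a gene spanning two segments with mean LRRs of 0.1 and -1.0 is summarized from the -1.0 segment, giving a copy number of 1.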
RNAi Knockdown
siRNA transfection was performed using Lipofectamine 2000 (Life Technologies, Carlsbad, CA)
according to the manufacturer’s protocol with control (Block-iT Alexa Fluor Red, Life
Technologies) or target siRNAs (BCL9: M-007268-01, MTDH: M-018531-00, Thermo
Scientific Dharmacon, Lafayette, CO).
Assays used in quantitative real-time RT-PCR
Taqman gene expression assay was performed using the following assays: BCL9 Taqman Gene
Expression Assay (Applied Biosystems, Hs00979216_m1), MTDH Taqman Gene Expression
Assay (Applied Biosystems, Hs00757841_m1) and Human GAPDH Endogenous Control
(VIC/TAMRA Probe, Primer Limited, Applied Biosystems, cat#4310884E).
Immunohistochemical staining on tissue microarrays
All histologic sections were examined and representative tumor areas free of necrosis or
hemorrhage were pre-marked in formalin-fixed paraffin-embedded blocks. Two 2.0-mm-diameter tissue cores were taken from the donor blocks and transferred to the recipient paraffin
block at defined array positions. Uninvolved normal liver tissues from 12 patients with
metastatic colonic carcinoma of the liver were used as controls. Immunostaining was performed
using rabbit polyclonal antibody to BCL9 (ab37305, 1:100; Abcam Inc., Cambridge, MA) and
mouse monoclonal antibody to MTDH (NBP1-51585, 1:400; Novus Bio., Littleton, CO).
Consecutive 4-µm tissue sections embedded in the slides were deparaffinized with xylene,
hydrated in serial dilutions of alcohol, and immersed in peroxidase-blocking solution (Dako,
Glostrup, Denmark) to quench endogenous peroxidase activity. The sections were then
microwaved in 0.01 mol/L Citrate buffer (pH 6.0) for 30 minutes. Incubation with the primary
antibody was performed overnight at 4ºC. 3,3’-Diaminobenzidine tetrahydrochloride was used as
the chromogen, and Mayer’s hematoxylin counterstain was applied. Negative controls (isotype-matched irrelevant antibody or preimmune mouse serum as primary antibody) were run
simultaneously. Results of staining were evaluated without knowledge of the clinicopathologic
features. Duplicate tissue cores for each tumor showed high levels of homogeneity for staining
intensity and percentage of positive cells. The higher score was taken as the final score where
there was a difference between duplicate tissue cores. Immunoreactivity for BCL9 was observed
in the nucleus and/or cytoplasm of tumor cells, but only in the cytoplasm of hepatocytes in 12
control normal livers. For assessment of the positivity of immunostaining for BCL9, only nuclear
staining was regarded as positive. The staining intensity was first scored (0, negative; 1, weak; 2,
moderate; 3, strong), followed by the percentage of positive cells (1, 1-5%; 2, 6–25%; 3, 26–
50%; 4, 51%–75%; 5, >75%). The final score of each tumor was obtained by multiplying the
score for staining intensity by the score for percentage of positive cells. For categorical analyses,
the immunoreactivity of tumor cells was graded as low (total score =1), moderate (total score
=2), or high (total score ≥3). Immunoreactivity for MTDH was observed only in the cytoplasm
of tumor cells and hepatocytes in 12 control normal livers. In all control normal livers, MTDH
was observed in fewer than 20% of hepatocytes. We defined MTDH as positive when ≥20% of
tumor cells showed cytoplasmic immunoreactivity. The staining intensity was first scored (0,
negative; 1, weak; 2, moderate; 3, strong), followed by the percentage of positive cells (1, 21–40%; 2, 41–60%; 3, 61–80%; 4, >80%). The final score of each tumor was obtained by
multiplying the score for staining intensity by the score for percentage of positive cells. For
categorical analyses, the immunoreactivity of tumor cells was graded as low (total score =1-4),
moderate (total score =5-8), or high (total score =9-12).
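The two scoring schemes above can be written out as a short sketch. The bin boundaries and grade cutoffs come from the text; everything else is an assumption for illustration: percentages are treated as integers, a percentage below the lowest stated bin receives a percentage score of 0, and a final score outside the stated grades (including 0) returns no grade. Note that a value of exactly 20% for MTDH is not covered by the stated bins and scores 0 here.

```python
from bisect import bisect_right

# Lower bounds (percent of positive cells) for percentage scores 1, 2, 3, ...
BCL9_LOWER_BOUNDS = [1, 6, 26, 51, 76]  # scores 1..5
MTDH_LOWER_BOUNDS = [21, 41, 61, 81]    # scores 1..4


def percent_score(percent, lower_bounds):
    """Map an integer percentage of positive cells to its bin score (0 if below all bins)."""
    return bisect_right(lower_bounds, percent)


def final_score(intensity, percent, lower_bounds):
    """Staining intensity (0-3) multiplied by the percentage score."""
    return intensity * percent_score(percent, lower_bounds)


def bcl9_grade(score):
    """Categorical grade for BCL9: low (=1), moderate (=2), high (>=3)."""
    if score == 1:
        return "low"
    if score == 2:
        return "moderate"
    if score >= 3:
        return "high"
    return None  # score 0: no grade is stated in the text


def mtdh_grade(score):
    """Categorical grade for MTDH: low (1-4), moderate (5-8), high (9-12)."""
    if 1 <= score <= 4:
        return "low"
    if 5 <= score <= 8:
        return "moderate"
    if 9 <= score <= 12:
        return "high"
    return None  # score 0: no grade is stated in the text
```

When duplicate cores differ, the text takes the higher score, which corresponds to calling `max` on the two `final_score` results before grading.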
Supplementary Tables
Table S1. Major demographic and clinicopathological parameters of the HCC cohort.
See file “Wang_HCC_CNA_landscape_Table_S1.docx”.
Table S2. Cell lines used in this study and their sources.
See file “Wang_HCC_CNA_landscape_Table_S2.docx”.
Table S3. All CNA peaks predicted by GISTIC2 analysis.
See file “Wang_HCC_CNA_landscape_Table_S3.docx”. A full version of Table S3 can be
found in file “Wang_HCC_CNA_landscape_Table_S3_full.docx”.
Table S4. Association of average copy number of the focal amplification and deletion peaks
identified by GISTIC2 to clinical and outcome variables.
See file “Wang_HCC_CNA_landscape_Table_S4.docx”.
Table S5. All pathways enriched among cis-acting genes in CNA peaks.
See file “Wang_HCC_CNA_landscape_Table_S5.docx”. A full version of Table S5 can be
found in file “Wang_HCC_CNA_landscape_Table_S5_full.docx”.
Table S6. Candidate driver genes selected based on focal CNA, expression changes and model
availability.
See file “Wang_HCC_CNA_landscape_Table_S6.docx”.
Table S7. Somatic copy number, gene expression and IHC staining results for BCL9.
See file “Wang_HCC_CNA_landscape_Table_S7.docx”.
Table S8. Somatic copy number, gene expression and IHC staining results for MTDH.
See file “Wang_HCC_CNA_landscape_Table_S8.docx”.
Supplementary Figures
[Figure S1 image. Panel A: histogram of counts (Y-axis) against cis-correlation (X-axis) for "CNA-mRNA correlation in cis" and "Permutation". Panel B: number of genes below the p-value cutoff (Y-axis, log scale) against -log10(p-value) (X-axis) for DSS, DFS, DSS perm and DFS perm.]
Figure S1. Association between somatic CNA, mRNA expression and clinical outcome. (A)
Distribution of genome-wide cis-correlation between somatic CNAs and mRNA expression
levels across the HCCs (red) and those obtained from the permuted dataset where sample labels
were randomly scrambled (blue). (B) Cumulative distribution of Cox regression p-values for
associating somatic CNAs with clinical outcomes, including both disease-specific survival (DSS,
blue) and disease-free survival (DFS, red), in comparison to the same distributions calculated from a
permuted dataset where sample labels were randomly scrambled (“DSS perm” in green and
“DFS perm” in black). The X-axis of the plot shows the –log10 of the Cox regression p-value cutoffs,
and the Y-axis is the number of genes with a p-value smaller than the corresponding cutoff on the X-axis.
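The comparison in panel (A) can be sketched as below: a per-gene correlation between copy number and expression across samples, and the same statistic after scrambling the sample order of the expression data. This is only an illustration of the idea. The caption does not name the correlation measure used for cis-correlation, so Pearson correlation is an assumption here, as are the function names, the gene-to-values dictionary layout, and the fixed random seed.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cis_correlations(copy_number, expression):
    """Correlate each gene's copy number with its own expression.

    Both arguments map a gene name to values listed in the same sample order
    (hypothetical layout).
    """
    return {
        gene: pearson(copy_number[gene], expression[gene])
        for gene in copy_number
        if gene in expression
    }


def permuted_cis_correlations(copy_number, expression, seed=0):
    """Same statistic after scrambling the sample order of the expression data."""
    n_samples = len(next(iter(expression.values())))
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    # One shared permutation is applied to every gene, mimicking scrambled sample labels.
    shuffled = {gene: [values[i] for i in order] for gene, values in expression.items()}
    return cis_correlations(copy_number, shuffled)
```

Plotting histograms of the two returned sets of values would give the observed and permuted distributions compared in the panel.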
Figure S2. Pair-wise DNA/DNA correlations reveal significant associations between unlinked
loci. Pair-wise Pearson correlations computed from ~20k gene copy number are ordered by
genes’ chromosomal positions through the genome on the X and Y axes with red indicating a
positive correlation and blue indicating a negative correlation. The red diagonal represents the
correlation of genes with themselves.
[Figure S3 image. Panel A: peak size (KB) for amplifications and deletions (two-sample t-test: p-value = 2.6e-007). Panels B and C: peak frequency (Y-axis) against peak size (KB, X-axis, log scale). Panel D: peak frequency (Y-axis) against peak amplitude (X-axis) for amplification and deletion peaks. Panel E: correlation in cis for non-peak, deletion and amplification genes (p = 4e-032 and p = 4.4e-005).]
Figure S3. Distributions of GISTIC2 peak statistics. (A) Peak size distribution and
comparison between amplification and deletion peaks. (B) Relationship between peak frequency
and peak size for amplification peaks. (C) Relationship between peak frequency and peak
size for deletion peaks. (D) Relationship between peak frequency and peak amplitude. Peak
frequencies were calculated based on copy number cutoffs of 3 and 1.3 for amplification and deletion
peaks, respectively. Peak amplitudes were taken as the average copy number of a peak among patients
called positive for the peak. (E) Distribution of cis-correlations for genes not in any GISTIC2 peak, in
deletion or amplification peaks. P-values shown were based on two-sample t-tests.
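The peak frequency and peak amplitude definitions in this caption can be sketched as follows. The cutoffs of 3 and 1.3 come from the caption; whether the cutoffs are inclusive is not stated, so the inclusive comparisons below are an assumption, as is the function name.

```python
from statistics import mean

AMPLIFICATION_CUTOFF = 3.0  # copy number cutoff for amplification peaks
DELETION_CUTOFF = 1.3       # copy number cutoff for deletion peaks


def peak_statistics(copy_numbers, kind):
    """Return (frequency, amplitude) of one peak across patients.

    copy_numbers holds the peak's copy number in each patient; kind is
    "amplification" or "deletion". Amplitude is the average copy number
    among patients called positive, or None when no patient is positive.
    """
    if kind == "amplification":
        positive = [c for c in copy_numbers if c >= AMPLIFICATION_CUTOFF]
    elif kind == "deletion":
        positive = [c for c in copy_numbers if c <= DELETION_CUTOFF]
    else:
        raise ValueError("kind must be 'amplification' or 'deletion'")
    frequency = len(positive) / len(copy_numbers)
    amplitude = mean(positive) if positive else None
    return frequency, amplitude
```

For instance, a peak with copy numbers 2.0, 3.5, 4.5 and 2.1 across four patients has an amplification frequency of 0.5 and an amplitude of 4.0.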
Figure S4. Frequent somatic copy number alterations in critical signaling pathways in HCC.
Figure S5. Immunostaining in HCCs for BCL9 and MTDH (HRP, original magnification ×200)
showing high levels of immunoreactivity for BCL9 in the nucleus (A) and MTDH in the
cytoplasm (B).
[Figure S6 image. Panels A and B: gene expression (log2, Y-axis) against copy number (X-axis).]
Figure S6. Inferred copy numbers and expression levels of BCL9 and MTDH in a
panel of 30 HCC cell line models. (A) BCL9; (B) MTDH. Cell lines colored in green
were used as amplified models for each candidate driver in the functional validation,
and those in pink were used as controls (i.e. copy number neutral with respect to the
target).
[Figure S7 image. Panels A and D: probability of disease-specific survival (Y-axis) against time in months (X-axis); the two panel titles report p=0.716 and p=0.12. Panels B and E: probability of disease-free survival (Y-axis) against time in months (X-axis); the two panel titles report p=0.215 and p=0.033. Legends compare patients with inferred copy number <2.3 against the copy number 3 group (disease-specific survival: BCL9 n=106 vs. 24, MTDH n=109 vs. 36; disease-free survival: BCL9 n=110 vs. 24, MTDH n=114 vs. 37). Panels C and F: inferred copy number (Y-axis) against AJCC stage I to IV (X-axis) for BCL9 (p=0.0008) and MTDH (p=0.0090).]
Figure S7. Association to clinical outcomes and AJCC tumor stages for the putative CNA
drivers BCL9 and MTDH. Panels (A-C) show data for BCL9; panels (D-F) show data for
MTDH. Patients were separated into two groups based on the amplification status of
BCL9 and MTDH. Differences in disease-specific and disease-free survival were
assessed by Kaplan-Meier curves and the associated log rank test. For association with
AJCC tumor stage, a linear trend test was performed (p-values shown in parentheses).
References
1. Edge SB, Byrd DR, Compton CC, Fritz AG, Greene FL, Trotti A. AJCC Cancer Staging Manual. 7 ed. Chicago, IL: Springer, 2010.
2. Edmondson HA, Steiner PE. Primary carcinoma of the liver: a study of 100 cases among 48,900 necropsies. Cancer 1954;7:462-503.
3. LCSGJ: The Liver Cancer Study Group of Japan: The general rules for the clinical and pathological study of primary liver cancer. In. 2 ed. Tokyo, Japan: Kanehara & Co., 2003; 38.
4. Hoshida Y, Villanueva A, Kobayashi M, Peix J, Chiang DY, Camargo A, Gupta S, et al. Gene expression in fixed tissues and outcome in hepatocellular carcinoma. N Engl J Med 2008;359:1995-2004.
5. Staaf J, Vallon-Christersson J, Lindgren D, Juliusson G, Rosenquist R, Hoglund M, Borg A, et al. Normalization of Illumina Infinium whole-genome SNP data improves copy number estimates and allelic intensity ratios. BMC Bioinformatics 2008;9:409.
6. Wang K, Li M, Hadley D, Liu R, Glessner J, Grant SF, Hakonarson H, et al. PennCNV: an integrated hidden Markov model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data. Genome Res 2007;17:1665-1674.
7. Hupe P, Stransky N, Thiery JP, Radvanyi F, Barillot E. Analysis of array CGH data: from signal ratio to gain and loss of DNA regions. Bioinformatics 2004;20:3413-3422.
8. Bengtsson H, Simpson K, Bullard J, Hansen K. aroma.affymetrix: A generic framework in R for analyzing small to very large Affymetrix data sets in bounded memory. Berkeley: Department of Statistics, University of California, Berkeley; February 2008. Report No.: Tech Report #745.
9. Kanehisa M, Goto S, Sato Y, Furumichi M, Tanabe M. KEGG for integration and interpretation of large-scale molecular data sets. Nucleic Acids Res;40:D109-114.