file - European Urology

Supplement – Materials and methods
1. Bladder tumour materials
FFPE archival UBC specimens obtained following TURBT were randomly selected from the years
2005 and 2010 from the departmental databases (University Health Network, Toronto and
Mount Sinai Hospital, Toronto). To compare FF and FFPE, we used 4 bladder tumour
samples from 4 patients. For our cohort, we analyzed an additional 49 tumours from 47
patients, of whom 7 had multiple samples analyzed to study inter-tumour
heterogeneity (Figure 1c-e). The study was carried out with institutional ethics board
approval from both institutions (UHN: 11-0134-T and MSH: 11-0015-E). The oldest direct
comparison between FF and FFPE was performed on a sample from 2007 (embedded in
FFPE for 6 years). The oldest FFPE RNAseq was performed on samples from 2005
(embedded in FFPE for 6 years). Two years (2005 and 2010) were chosen so that quality
control could be assessed on older samples, and all samples passed the quality control
criteria set for RNAseq analysis (see the RNAseq quality control section). Cases were
reviewed pathologically to confirm that they were unequivocally low or high grade and
suitable for RNA extraction based on tumour size and tumour purity (e.g. absence of
haemorrhage, divergent pathology, necrosis or inflammation). For each FFPE block/section,
we extracted a minimum of 750 ng total RNA, and the smallest tumour sample we analyzed
had a volume of approximately 0.8 mm³.
2. Tissue sectioning and RNA collection
The FFPE blocks were serially sectioned and the first section (5 µm; subsequent sections
10 µm) was stained (H&E) and assessed by a pathologist (TvdK) for tumour grade and stage.
Viable tumour regions were then evaluated and marked before total RNA was collected and
extracted.
3. RNA extraction
Total RNA was extracted using the RNeasy kit (QIAGEN). The yield and quality of total RNA were
then assessed using Nanodrop (Thermo Fisher) and BioAnalyzer (Agilent), respectively.
Ribosomal RNA (rRNA) was removed using a bead-based hybridization kit, RiboZero
(Epicentre, EPI-MRZG12324SP). cDNA libraries were prepared using the Illumina TruSeq RNA
sample preparation kit v2 (RS-122-2001) and then loaded as 2 indexed samples per lane on
an Illumina HiSeq 2000.
4. wRNASeq protocols and analytical pipeline
The raw sequencing reads in fastq format were obtained from the image files produced by
the HiSeq 2000 using the standard Illumina CASAVA software (version 1.8). For each sample,
around 150 million 101 bp paired-end reads were generated and mapped onto the human
genome (hg19) using TopHat 1.4.1, allowing for up to two mismatches1.
After obtaining aligned BAM files, a custom Perl script was used to select unique reads that
mapped to only one location on the genome; reads mapped to multiple sites on the
genome were discarded. These filtered BAM files were then analyzed using a custom
R-based pipeline to calculate gene expression profiles using ENSEMBL annotation for coding
genes and ENCODE annotations for lincRNAs2. To estimate the expression level of each
gene, including both coding and non-coding genes, the number of reads mapped onto the
gene was counted regardless of transcript isoform and normalized to total mapped
reads to obtain transcript union Reads Per Million total reads (truRPMs). For coding genes,
reads mapped onto both exons and introns were all counted for truRPM calculations using a
custom R script.
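The counting and normalization described above can be illustrated with a minimal sketch. It is written in Python rather than the custom Perl and R scripts used in the study, which are not reproduced here. The function names, the input layout (one `(read_id, gene_id)` pair per reported alignment) and the choice to use the uniquely mapped reads as the "total mapped reads" denominator are all assumptions made for illustration.

```python
from collections import Counter


def unique_alignments(alignments):
    """Keep only reads reported at exactly one genomic location.

    `alignments` is an iterable of (read_id, gene_id) pairs, one pair per
    reported alignment. gene_id is None when the alignment overlaps no
    annotated gene. This input layout is hypothetical.
    """
    alignments = list(alignments)
    hits_per_read = Counter(read_id for read_id, _ in alignments)
    return [(r, g) for r, g in alignments if hits_per_read[r] == 1]


def tru_rpm(alignments):
    """Reads per gene, ignoring isoforms, per million total mapped reads.

    Assumption: the denominator is the number of uniquely mapped reads
    that remain after multi-mapped reads are discarded.
    """
    kept = unique_alignments(alignments)
    total_mapped = len(kept)
    if total_mapped == 0:
        return {}
    counts = Counter(g for _, g in kept if g is not None)
    return {g: n * 1e6 / total_mapped for g, n in counts.items()}


# Toy input: read r4 maps to two places and is discarded; r5 is intergenic.
example = [
    ("r1", "GENE_A"), ("r2", "GENE_A"), ("r3", "GENE_B"),
    ("r4", "GENE_A"), ("r4", "GENE_B"), ("r5", None),
]
print(tru_rpm(example))
```

Because reads are assigned to the gene as a whole (exons and introns for coding genes), no isoform-level model is needed; the gene identifier is the only key used for counting.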
5. RNAseq quality control
The quality of total RNA extracted from FFPE sections was examined using Nanodrop
(Thermo Fisher), and all RNA samples had OD260/OD280 > 2, indicating high purity of
RNA. Unlike total RNA extracted from fresh frozen samples, RNA molecules extracted from
FFPE are highly fragmented; we therefore omitted the fragmentation step
during the generation of cDNA sequencing libraries. This is a key modification
of the protocol. For RNA extracted from fresh frozen
samples, we followed the Illumina protocol to fragment the RNA. Efficient removal of rRNA
is another critical factor for
successful cDNA library construction from FFPE RNA samples. The efficiency of rRNA
removal was determined by calculating the ratio of GAPDH RNA to 18S rRNA using
Taqman qPCR. The Taqman probes for human GAPDH (Cat. 4333764) and 18S (Cat. 4333760)
were obtained from Life Technologies. All cDNA libraries displayed successful removal of
rRNA (GAPDH:rRNA > 1), indicating high quality. The quality of sequencing reads was assessed
using a publicly available tool, FastQC
(http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). All samples that passed the
quality control measurements of FastQC were then subjected to subsequent analysis.
Poor-quality 5′ or 3′ read ends were trimmed using a custom Perl script before mapping onto the
human genome, and no sample contained more than 10% rRNA. The storage time of the
FFPE samples had no effect on the RNAseq data quality based on the above standards.
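The three numerical thresholds quoted in this section can be combined into a simple per-sample check, sketched below in Python. The thresholds (absorbance ratio > 2, GAPDH:rRNA > 1, at most 10% rRNA reads) come from the text. How the GAPDH:rRNA ratio was derived from the qPCR data is not stated, so the 2^(Ct_rRNA - Ct_GAPDH) conversion, which assumes roughly 100% amplification efficiency for both assays, is an assumption, as are the class and field names.

```python
from dataclasses import dataclass


@dataclass
class SampleQC:
    od260_280: float      # Nanodrop absorbance ratio of the total RNA
    ct_gapdh: float       # qPCR threshold cycle for the GAPDH assay
    ct_rrna: float        # qPCR threshold cycle for the rRNA assay
    rrna_fraction: float  # fraction of sequenced reads assigned to rRNA


def gapdh_to_rrna_ratio(ct_gapdh, ct_rrna):
    """Relative GAPDH:rRNA abundance from Ct values.

    Assumption: both assays amplify with ~100% efficiency, so each cycle
    of difference corresponds to a two-fold difference in template.
    """
    return 2.0 ** (ct_rrna - ct_gapdh)


def passes_qc(sample):
    """Apply the three thresholds quoted in the text to one sample."""
    return (
        sample.od260_280 > 2.0
        and gapdh_to_rrna_ratio(sample.ct_gapdh, sample.ct_rrna) > 1.0
        and sample.rrna_fraction <= 0.10
    )


# Hypothetical values for illustration only, not data from the study.
depleted = SampleQC(od260_280=2.05, ct_gapdh=22.0, ct_rrna=25.0, rrna_fraction=0.04)
not_depleted = SampleQC(od260_280=2.05, ct_gapdh=25.0, ct_rrna=22.0, rrna_fraction=0.04)
print(passes_qc(depleted), passes_qc(not_depleted))
```

A lower Ct means more template, so a library in which GAPDH crosses the threshold earlier than the rRNA assay has a ratio above 1, consistent with successful depletion.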
6. Statistical analyses
For heatmap and statistical analyses, publicly available R packages were used, while
network presentations were prepared in Cytoscape. Analysis of network modularity was
performed as reported previously3. Briefly, after removal of high and low abundance genes,
expression levels of coding genes were log transformed and median centered, and the
average correlation of co-expression of a hub with its interacting partners across LG samples
was calculated and compared with that across the HG cluster. The hubs with altered
correlation between LG and HG were selected and the significance was assessed using
permutation. In total, 392 significant hubs (p<0.05) were identified. Of these, 302 hubs
showed interactions and were used to construct the network in Fig 2c.
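The hub-level comparison described above can be sketched as follows. This is an illustrative Python outline, not the analysis code used in the study: the use of Pearson correlation, the absolute difference as the test statistic, and shuffling of LG/HG sample labels as the permutation scheme are assumptions, and all function and variable names are hypothetical. The input `expr` is assumed to hold log-transformed, median-centred expression values keyed by gene and then by sample.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def avg_hub_correlation(expr, hub, partners, samples):
    """Mean correlation between a hub and each interacting partner."""
    hub_vals = [expr[hub][s] for s in samples]
    cors = [pearson(hub_vals, [expr[p][s] for s in samples]) for p in partners]
    return sum(cors) / len(cors)


def hub_permutation_p(expr, hub, partners, lg, hg, n_perm=1000, seed=0):
    """Observed |LG - HG| difference in average correlation and its p-value.

    Sample labels are shuffled between the two groups (assumed scheme);
    the p-value uses the standard (hits + 1) / (n_perm + 1) correction.
    """
    rng = random.Random(seed)
    observed = abs(avg_hub_correlation(expr, hub, partners, lg)
                   - avg_hub_correlation(expr, hub, partners, hg))
    pooled = list(lg) + list(hg)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        perm_lg, perm_hg = pooled[:len(lg)], pooled[len(lg):]
        diff = abs(avg_hub_correlation(expr, hub, partners, perm_lg)
                   - avg_hub_correlation(expr, hub, partners, perm_hg))
        if diff >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

A hub would be retained when its p-value falls below 0.05, matching the threshold quoted in the text; no multiple-testing adjustment is shown because none is described.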
Supplementary Results
Comparison with other similar studies using larger cohorts.
We compared our results with the Lindgren et al. study4, which used microarrays to examine
expression levels of 2508 coding genes between high grade and low grade bladder tumours.
These authors identified 392 and 400 coding genes upregulated in high grade and low grade
tumours, respectively. Compared with our DEGs, 80 and 32 coding genes overlapped with the
high grade and low grade associated genes, respectively. Interestingly, the overlapping
genes show a wider range of fold changes in our data than in theirs (Fig. S2a and S2b,
boxplots in the left panels). For the genes that appear only in the list by Lindgren et al.4, our
RNAseq data have a similar distribution to the microarray data (Fig. S2a and S2b, boxplots
in the right panels); however, due to our strict statistical standards, these genes are not
included in our DEGs. When comparing our results to another similar study5, among our 947
DEGs, 241 were also identified by microarray profiling of fresh frozen invasive (T1-T4) versus
non-invasive (Ta) tumours. In summary, our results show good concordance with
previous studies, in keeping with the known limitations of comparing different
datasets of this nature6.
Supplementary References
1. Trapnell, C., Pachter, L. & Salzberg, S.L. TopHat: discovering splice junctions with
RNA-Seq. Bioinformatics 25, 1105-1111 (2009).
2. Djebali, S. et al. Landscape of transcription in human cells. Nature 489, 101-108.
3. Taylor, I.W. et al. Dynamic modularity in protein interaction networks predicts breast
cancer outcome. Nat Biotechnol 27, 199-204 (2009).
4. Lindgren, D. et al. Combined gene expression and genomic profiling define two
intrinsic molecular subtypes of urothelial carcinoma and gene signatures for
molecular grading and outcome. Cancer research 70, 3463-3472 (2010).
5. He, X. et al. Differentiation of a highly tumorigenic basal cell compartment in
urothelial carcinoma. Stem Cells 27, 1487-1495 (2009).
6. Lauss, M., Ringner, M. & Hoglund, M. Prediction of stage, grade, and survival in
bladder cancer using genome-wide expression data: a validation study. Clinical
cancer research : an official journal of the American Association for Cancer Research
16, 4421-4433 (2010).