Clinical and genomic analysis of a randomised phase II study evaluating
anastrozole and fulvestrant in postmenopausal patients treated for large
operable or locally-advanced hormone-receptor-positive breast cancer
Quenel-Tueux et al
Supplementary Methods
DNA was purified from frozen samples by phenol/chloroform extraction (Phase Lock
Gel, 5 PRIME GmbH, Hamburg, Germany) followed by a DNeasy column (Qiagen GmbH,
Hilden, Germany). The order of extraction was randomized, except that pre- and
post-treatment biopsies from the same tumor were processed at the same time. Libraries
were prepared and sequenced on a GAIIx sequencer (Illumina, San Diego, CA) in
multiple batches, as described (Wood et al, 2010). Reads passing quality control in
Casava (Illumina) were aligned to human genome version hg19 using bwa version
0.5.9r16 in single-end mode with default parameters. The bam files are available in the NCBI
Sequence Read Archive under accession number SRP035504. The quality was assessed
with qualimap v.0.7.1 (Garcia-Alcalde et al, 2012). The bam files were reformatted in
Perl with the bam2windows.pl script, available at
www.precancer.leeds.ac.uk/software-and-datasets/cnanorm/, and processed with the
CNAnorm package (Gusnanto et al,
2012) in R statistical software (R Core Team, 2013). CNAnorm converts the number of
reads in non-overlapping sliding windows to ratios with respect to a pool of normal
female DNA, normalizes for GC content and segments the ratios with DNAcopy
(Gusnanto et al, 2012; Olshen et al, 2004). Base quality, mapping quality, read depth and
tumor cell content are summarized in Supplementary Table 6. Tumor cell content was
estimated independently by a pathologist and by CNAnorm. To create the heatmap in
Fig. 3, the normalized segmented ratios from CNAnorm (Gusnanto et al, 2012) for
genomic windows with complete data were median-centered and then
centroid-clustered with correlation as the distance metric using Cluster 3.0 and
Treeview 1.1.6r4 (de Hoon et al, 2004) (the cdt and atr files are provided as
supplementary data).
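The preparation of the heatmap input can be sketched in a few lines of Python. This is only an illustration, not the Cluster 3.0 implementation: the function names are hypothetical, missing values are represented as None, the median centering is assumed to be applied to each sample's profile, and the correlation is assumed to be the centered Pearson correlation.

```python
from math import sqrt
from statistics import mean, median


def complete_windows(profiles):
    """Keep only the window positions where every sample has a value.

    `profiles` maps a sample name to a list of segmented ratios,
    with None marking a window that has no data (an assumed encoding).
    """
    n = len(next(iter(profiles.values())))
    keep = [i for i in range(n)
            if all(p[i] is not None for p in profiles.values())]
    return {name: [p[i] for i in keep] for name, p in profiles.items()}


def median_centre(profile):
    """Subtract the profile's own median from every window value."""
    m = median(profile)
    return [v - m for v in profile]


def correlation_distance(x, y):
    """Return 1 - Pearson correlation for two equal-length profiles."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 1.0 - sxy / sqrt(sxx * syy)


# Toy profiles (invented values, for illustration only).
profiles = {
    "tumour_A": [1.0, None, 2.0, 3.0],
    "tumour_B": [2.0, 5.0, 4.0, 6.0],
}
centred = {k: median_centre(v) for k, v in complete_windows(profiles).items()}
distance = correlation_distance(centred["tumour_A"], centred["tumour_B"])
```

Under these assumptions the Pearson distance between two samples is unchanged by per-sample centering, so the centering mainly affects how the heatmap is displayed rather than how the samples cluster.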
To identify significant differences before and after treatment, reads were
counted in 200 kb windows and processed in CNAnorm with the same parameters for
all samples. To correct for differences in normal tissue contamination between the two
samples, a linear model was used to project the segmented pre-treatment biopsy values
onto the post-treatment surgical measurement scale. The procedure is shown for
tumour H11 in Supplementary Fig. 4. In this case there was no difference between the
profiles before and after treatment. The software finds the peaks corresponding to the
major ploidy values (the blue dots in Supplementary Fig. 4B). When the amount of normal
tissue differs before and after treatment, the spacing of the peaks differs between the
two plots. The
linear model adjusts the scale to ensure that the spacing of the peaks is the same before
and after treatment. In the normalised data we do not know the amount of normal tissue
in either sample, but we do know that the separation of the modal ploidy peaks is the
same, so we can legitimately compare the two profiles. We used this approach with all
chromosomes included in the linear model, then repeated it with only chr1 and chr16
included in the model. The logic behind using chr1 and chr16 as reference chromosomes is
that changes in these chromosomes are likely to be present in all subclones, because the
der(1;16) translocation is normally the first oncogenic event to occur in the
transformation of ER+ tumour cells. The expected values after applying the model to the
biopsy data were subtracted from the observed values in the surgical samples to
generate the raw differences in copy number. The difference score is the standard
deviation of these differences. To prevent distortion of the results by outliers, segments
<20 Mb were removed before calculating the difference scores. The median absolute
deviation (MAD) of the normalized ratios minus the segmented copy numbers was used
as a measure of noise in individual samples. The MADs for the pre-treatment biopsies
were multiplied by the slope coefficient from the linear model, and the difference scores
were then divided by the higher of the MADs for the biopsy and surgical sample to adjust for
quality. To estimate the significance of the corrected scores, the tumors for which
replicates were available were used to generate an empirical distribution of corrected
scores for self-comparisons (n = 9). The difference between the corrected scores for the
before-and-after treatment comparisons and the mean of the self-comparisons for the
replicates was divided by the standard deviation of the replicates to generate Z values
for each tumor. The Z value thus measures differences before and after treatment as a
multiple of the standard deviation of the differences between the replicates under the
null hypothesis that two biopsies from the same tumour should have the same profile.
The results are shown for a linear model including all chromosomes and for one using
chr1 and chr16 as "invariant" reference chromosomes (the most logical choice for
luminal breast cancer); the same tumors were identified as different with both models.
To facilitate visual interpretation of differences before and after treatment, the inferred
modal ploidy in the copy number plots shown was adjusted to the same value in both
samples (this scaling does not change the underlying data).
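The scoring procedure described above can be summarised in a short Python sketch. It is not the code used in the study: all function names are hypothetical, the linear model is assumed to be an ordinary least-squares fit with an intercept, the standard deviations are sample standard deviations (n - 1), the MAD is unscaled, and each input value is assumed to represent one segment with its length given in Mb.

```python
from statistics import mean, median, stdev


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def mad(values):
    """Median absolute deviation about the median (no scaling constant)."""
    m = median(values)
    return median([abs(v - m) for v in values])


def sample_noise(ratios, segmented):
    """Noise of one sample: MAD of normalized ratios minus segmented values."""
    return mad([r - s for r, s in zip(ratios, segmented)])


def difference_score(pre_seg, post_seg, seg_len_mb, ref_mask=None,
                     min_len_mb=20.0):
    """Project biopsy values onto the surgical scale; return (score, slope).

    ref_mask selects the segments used to fit the model, for example those
    on chr1 and chr16; None means that all segments are used.
    """
    if ref_mask is None:
        ref_mask = [True] * len(pre_seg)
    fit_x = [x for x, keep in zip(pre_seg, ref_mask) if keep]
    fit_y = [y for y, keep in zip(post_seg, ref_mask) if keep]
    intercept, slope = fit_line(fit_x, fit_y)
    # Observed surgical value minus the value expected from the biopsy,
    # skipping segments shorter than the length cut-off.
    diffs = [y - (intercept + slope * x)
             for x, y, length in zip(pre_seg, post_seg, seg_len_mb)
             if length >= min_len_mb]
    return stdev(diffs), slope


def corrected_score(score, slope, noise_pre, noise_post):
    """Divide by the larger noise estimate, with the biopsy MAD rescaled."""
    return score / max(slope * noise_pre, noise_post)


def z_value(score, self_scores):
    """Express a corrected score relative to the replicate self-comparisons."""
    return (score - mean(self_scores)) / stdev(self_scores)
```

A typical call sequence would be difference_score for the biopsy and surgical sample, sample_noise for each sample, corrected_score to combine them, and z_value against the corrected scores of the replicate self-comparisons. Passing a ref_mask that is true only for segments on chr1 and chr16 corresponds to the second model described in the text.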
Supplementary clinical and surgical results
In the anastrozole arm, one patient who refused surgery had stable disease and received
chemotherapy and trastuzumab for a HER2-positive tumour. The other had stable
disease, so anastrozole treatment was continued for an additional three years. At the
end of this period she developed progressive disease, refused surgery again, and was
transferred to fulvestrant. Of the six patients in the fulvestrant arm who did not
undergo surgery, two had disease progression at 4 and 5 months, treated by
chemotherapy and mastectomy, respectively; three showed stable disease and
continued to receive endocrine therapy (1 anastrozole and 2 fulvestrant) for more than
6 months; and the remaining patient did not undergo surgical treatment because of rib
invasion after 6 months of hormonal therapy (she was not scored as having progressed
because she was T4a at inclusion).
References
de Hoon MJ, Imoto S, Nolan J, Miyano S (2004) Open source clustering software.
Bioinformatics 20(9): 1453-4
Garcia-Alcalde F, Okonechnikov K, Carbonell J, Cruz LM, Gotz S, Tarazona S, Dopazo J,
Meyer TF, Conesa A (2012) Qualimap: evaluating next-generation sequencing alignment
data. Bioinformatics 28(20): 2678-9
Gusnanto A, Wood HM, Pawitan Y, Rabbitts P, Berri S (2012) Correcting for cancer
genome size and tumour cell content enables better estimation of copy number
alterations from next-generation sequence data. Bioinformatics 28(1): 40-7
Olshen AB, Venkatraman ES, Lucito R, Wigler M (2004) Circular binary segmentation for
the analysis of array-based DNA copy number data. Biostatistics 5(4): 557-72
R Core Team (2013) R: A language and environment for statistical computing. R
Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org/
Wood HM, Belvedere O, Conway C, Daly C, Chalkley R, Bickerdike M, McKinley C, Egan P,
Ross L, Hayward B, Morgan J, Davidson L, MacLennan K, Ong TK, Papagiannopoulos K,
Cook I, Adams DJ, Taylor GR, Rabbitts P (2010) Using next-generation sequencing for
high resolution multiplex analysis of copy number variation from nanogram quantities
of DNA from formalin-fixed paraffin-embedded specimens. Nucleic Acids Res 38(14):
e151