SUPPLEMENTARY INFORMATION
MATERIALS AND METHODS

Patient samples
Blood and bone marrow (BM) cells from both donor and recipient were obtained from the South Australian Cancer Research Biobank. Mesenchymal stromal cells (MSC) were cultured from BM aspirates as a source of germline control DNA. Whole exome sequencing (WES) was performed on three donor samples (MSC, AML diagnosis and AML relapse) and one recipient sample (AML diagnosis). In addition, paired diagnostic and remission samples from 12 other patients with DNMT3A-mutant AML were available for targeted sequencing. None of these 12 patients had therapy-related AML or an antecedent diagnosis of a hematological neoplasm.

Whole Exome Sequencing (WES) and Targeted Massively Parallel Sequencing
WES libraries were enriched using a Roche NimbleGen capture kit and sequenced on the Illumina HiSeq2500. Briefly, 1 µg of genomic DNA was sheared to a mean fragment size of 200 bp using the Covaris S220 before conversion to barcoded DNA libraries using a TruSeq DNA LT Sample Preparation Kit (Illumina, San Diego, CA, USA). After purification, libraries were quantified by Agilent Bioanalyzer HS DNA assay and combined in equal amounts into pools of 6 prior to solution-phase capture using the SeqCap EZ Exome Library v3.0 (Roche NimbleGen, Madison, WI, USA). The three donor samples (MSC, diagnosis, relapse) and one recipient sample (diagnosis) were sequenced together with other unrelated samples on five Illumina HiSeq2500 flowcells (v3 SBS chemistry, 2x100 bp paired-end), with 6 samples multiplexed per lane. All but the MSC sample were included on two flowcells. The number of sequenced fragments for the MSC sample was 30 million, while for the other three samples there were 121, 86 and 71 million fragments, respectively.

Targeted massively parallel sequencing was performed on a custom 29-gene panel of myeloid genes covering all coding regions (Supplementary Table 3) using an Ion Torrent AmpliSeq approach. Briefly, the targeted gene libraries were generated from 10 ng of genomic DNA using the Ion AmpliSeq Library Kit v2.0 and the custom primer pool as per the manufacturer’s protocol (Life Technologies, Guilford, CT, USA). After adapter ligation and a 5-cycle PCR amplification incorporating barcodes, the libraries were quantified by Agilent Bioanalyzer HS DNA assay and combined in equal amounts into a pool of 12 samples. The library pool was diluted to 6 pM and templated onto Ion Sphere Particles (ISPs) by emulsion PCR using the automated Ion OneTouch2 system with the Ion P1 Template OT2 200 Kit (Life Technologies). ISPs were sequenced on an Ion P1 chip (Ion P1 Sequencing 200 Kit v3 chemistry) on the Ion Proton.

Sequence analysis
The WES reads were mapped to the human genome (hg19) using bwa sampe (v0.6.2). Sorting and indexing were carried out using samtools (v0.1.12a), followed by duplicate marking using picard (v1.71). Mapping resulted in average coverage over the NimbleGen capture regions of 34.1×, 96.5×, 94.2× and 76.1× for the donor’s MSC, diagnosis and relapse samples and the recipient’s diagnosis sample, respectively.
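
The mapping steps named above can be sketched as a list of shell commands for one paired-end sample. This is a minimal illustration only: the file names, the reference name `hg19.fa` and the helper `mapping_commands` are hypothetical, and the options actually used (for example read-group tagging or threading) are not stated in the text and are omitted here. The commands are built as strings and not executed.

```python
from typing import List


def mapping_commands(sample: str, ref: str = "hg19.fa") -> List[str]:
    """Return shell command strings for one paired-end sample (not executed).

    File naming is a hypothetical convention chosen for this sketch.
    """
    fq1 = f"{sample}_R1.fastq.gz"
    fq2 = f"{sample}_R2.fastq.gz"
    return [
        # bwa 0.6.x: align each read file separately, then pair with sampe
        f"bwa aln {ref} {fq1} > {sample}_R1.sai",
        f"bwa aln {ref} {fq2} > {sample}_R2.sai",
        f"bwa sampe {ref} {sample}_R1.sai {sample}_R2.sai {fq1} {fq2} > {sample}.sam",
        # samtools 0.1.x: convert to BAM, sort (output given as a prefix), index
        f"samtools view -bS {sample}.sam > {sample}.bam",
        f"samtools sort {sample}.bam {sample}.sorted",
        f"samtools index {sample}.sorted.bam",
        # Picard 1.x shipped one jar per tool; duplicates are marked, not removed
        f"java -jar MarkDuplicates.jar INPUT={sample}.sorted.bam "
        f"OUTPUT={sample}.dedup.bam METRICS_FILE={sample}.dup_metrics.txt",
    ]


if __name__ == "__main__":
    for cmd in mapping_commands("donor_dx"):
        print(cmd)
```

Building the commands as plain strings keeps the order of the steps (align, pair, sort, index, mark duplicates) visible without depending on the tools being installed.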

The GATK toolkit (v2.5.2-v2.8.1) was used to realign reads around indels and recalibrate base quality scores, and its UnifiedGenotyper was used to call variants (multi-sample calling) according to the Broad’s “best practices pipeline” for the GATK v2 series. Variants were annotated using the ACRF Cancer Genome Facility’s custom annotation pipeline based on SnpEff and SnpSift (v3.3) [18]. Annotation information was taken from Ensembl (v73) [19], dbSNP (v137), the 1000 Genomes Project (integrated phase 1, v3) [20], the Exome Sequencing Project (6500SI-V2) [21], COSMIC (v67) [22] and GERP scores [23], as well as other public databases.
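
As a rough sketch of the GATK v2 steps described above, the commands below perform per-sample indel realignment and base quality recalibration, followed by one multi-sample UnifiedGenotyper call. The file names, the known-sites file and both helper functions are hypothetical, and `-glm BOTH` (calling SNPs and indels together) is an assumption, since the text does not state which genotype likelihood model was used. The commands are only assembled as strings.

```python
from typing import List

GATK = "java -jar GenomeAnalysisTK.jar"  # GATK v2 was distributed as a single jar


def per_sample_commands(sample: str, ref: str = "hg19.fa",
                        known_sites: str = "dbsnp_137.vcf") -> List[str]:
    """Indel realignment and base quality recalibration for one sample."""
    bam = f"{sample}.dedup.bam"  # hypothetical duplicate-marked input
    return [
        f"{GATK} -T RealignerTargetCreator -R {ref} -I {bam} "
        f"-o {sample}.intervals",
        f"{GATK} -T IndelRealigner -R {ref} -I {bam} "
        f"-targetIntervals {sample}.intervals -o {sample}.realigned.bam",
        f"{GATK} -T BaseRecalibrator -R {ref} -I {sample}.realigned.bam "
        f"-knownSites {known_sites} -o {sample}.recal.grp",
        f"{GATK} -T PrintReads -R {ref} -I {sample}.realigned.bam "
        f"-BQSR {sample}.recal.grp -o {sample}.recal.bam",
    ]


def joint_call_command(samples: List[str], ref: str = "hg19.fa",
                       out: str = "joint.vcf") -> str:
    """One UnifiedGenotyper run over all samples (multi-sample calling)."""
    inputs = " ".join(f"-I {s}.recal.bam" for s in samples)
    # -glm BOTH is an assumption made for this sketch
    return f"{GATK} -T UnifiedGenotyper -R {ref} {inputs} -glm BOTH -o {out}"


if __name__ == "__main__":
    names = ["donor_msc", "donor_dx", "donor_rel", "recipient_dx"]
    for name in names:
        print("\n".join(per_sample_commands(name)))
    print(joint_call_command(names))
```

Passing every sample's BAM to a single UnifiedGenotyper run is what makes the calling multi-sample: genotypes for all four exomes are reported at each variant site.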

Basic filtering was applied to remove variants that were relatively unlikely to be of interest, by imposing two conditions. First, the variant had to be rare (<0.5%) both in the 1092 individuals of the 1000 Genomes Project and in the 4300 European-American and 2203 African-American individuals of the Exome Sequencing Project. Second, variants were retained only if they showed evidence of evolutionary conservation in mammalian (GERP ≥ 2) or other vertebrate (PhastCons ≥ 0.9) species, or if their predicted functional impact had the potential to be non-trivial (i.e. not synonymous coding and not classified by SnpEff as a “modifier”), or if they were known somatic mutations in the COSMIC database. Finally, we compared the variants passing the above criteria to an in-house collection of 51 exomes of patients with non-hematological malignancies, some of which were sequenced concurrently with the four exomes considered here. A variant that occurred more than once (heterozygous) in this collection was discarded. In total, these filters reduced the number of sites to be considered further to 7480.
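
The filtering logic described above can be expressed as a single predicate. The sketch below is illustrative only: the `Variant` record and its field names are hypothetical, the Exome Sequencing Project frequency is treated as one combined value (the text does not say whether the two populations were tested separately), and the in-house rule is read as "seen in more than one exome".

```python
from dataclasses import dataclass
from typing import Optional

MAX_POPULATION_AF = 0.005  # "rare" threshold of 0.5%


@dataclass
class Variant:
    """Hypothetical annotated variant record; field names are illustrative."""
    impact: str = "MODIFIER"          # SnpEff impact: HIGH/MODERATE/LOW/MODIFIER
    effect: str = "INTRON"            # SnpEff effect, e.g. SYNONYMOUS_CODING
    af_1000g: float = 0.0             # allele frequency in 1000 Genomes
    af_esp: float = 0.0               # allele frequency in ESP (assumed combined)
    gerp: Optional[float] = None      # mammalian conservation score
    phastcons: Optional[float] = None # vertebrate conservation score
    in_cosmic: bool = False           # known somatic mutation in COSMIC
    inhouse_count: int = 0            # occurrences in the 51 in-house exomes


def passes_filters(v: Variant) -> bool:
    # Condition 1: rare in both population databases
    rare = v.af_1000g < MAX_POPULATION_AF and v.af_esp < MAX_POPULATION_AF
    # Condition 2: conserved, OR potentially functional, OR in COSMIC
    conserved = ((v.gerp is not None and v.gerp >= 2)
                 or (v.phastcons is not None and v.phastcons >= 0.9))
    functional = v.effect != "SYNONYMOUS_CODING" and v.impact != "MODIFIER"
    of_interest = conserved or functional or v.in_cosmic
    # Final step: discard variants seen more than once in the in-house exomes
    recurrent_inhouse = v.inhouse_count > 1
    return rare and of_interest and not recurrent_inhouse


if __name__ == "__main__":
    missense = Variant(impact="MODERATE", effect="NON_SYNONYMOUS_CODING")
    print(passes_filters(missense))
```

Note that the second condition is a disjunction: a synonymous variant can still be retained if it is conserved or listed in COSMIC.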

The Ion Torrent targeted sequencing data were processed with the Torrent Suite™ software v4.0.1 using the AmpliSeq workflow. This suite automates the generation of sequence reads, trimming of adapter sequences and removal of poor-quality reads. Variant calls were made using the Torrent Variant Caller plugin (4.0-5, 72041) with the Somatic Mutation default settings, except for ‘SNP minimum allele frequency’ (0.5%) and ‘Indel min allele frequency’ (1.25%). Variants were annotated using SnpEff, COSMIC and local in-house databases as detailed above.
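
To make the two allele-frequency thresholds concrete, the sketch below applies them to read counts. It does not reproduce the Torrent Variant Caller itself; the function names are hypothetical, and treating the threshold as inclusive is an assumption.

```python
# Minimum allele frequencies quoted in the text, as fractions
MIN_ALLELE_FREQUENCY = {"snp": 0.005, "indel": 0.0125}


def allele_frequency(alt_reads: int, depth: int) -> float:
    """Fraction of reads at a site that support the variant allele."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return alt_reads / depth


def passes_min_af(kind: str, alt_reads: int, depth: int) -> bool:
    """True if the observed frequency reaches the threshold for this variant type.

    kind is "snp" or "indel"; the comparison is inclusive by assumption.
    """
    return allele_frequency(alt_reads, depth) >= MIN_ALLELE_FREQUENCY[kind]


if __name__ == "__main__":
    # 12 variant reads out of 2000 is 0.6%
    print(passes_min_af("snp", 12, 2000), passes_min_af("indel", 12, 2000))
```

At a depth of 2000 reads, for example, 12 supporting reads (0.6%) would be enough for a SNP under these settings but not for an indel.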

Additional cases of DNMT3A-mutant AML
The DNMT3A mutation load of paired diagnostic and remission samples from 12 AML patients with DNMT3A R882H/C mutations was measured using a custom Sequenom MassArray assay (Sequenom, Inc., San Diego, CA, USA). Allele loads of concurrent mutations in isocitrate dehydrogenase 1 and 2 (IDH1/2), Kirsten rat sarcoma oncogene homolog (KRAS) and NPM1 were measured using a custom Sequenom assay, Sanger sequencing and a restriction fragment length polymorphism assay, respectively.

Study oversight
The research was approved by the Royal Adelaide Hospital Human Research Ethics Committee and all patients gave written informed consent.

Supplementary Table 1. Gene mutations in the two brothers.

Gene    Genome (hg19)                      mRNA Transcript              Protein
DNMT3A  chr2:g.25457242C>T                 NM_022552; c.2645G>A         p.Arg882His
NPM1    chr5:g.170837544_170837547dupTCTG  NM_002520; c.860_863dupTCTG  p.Trp288Cysfs*12
FLT3    chr13:g.28592642C>A                NM_004119; c.2503G>T         p.Asp835Tyr
IDH1    chr2:g.209113112C>T                NM_005896; c.395G>A          p.Arg132His
NOTCH4  chr6:g.32178533C>T                 NM_004557; c.2861G>A         p.Cys954Tyr
WT1     chr11:g.32413566G>A                NM_024424; c.1180C>T         p.Arg394Trp
SMC1A   chrX:g.53423420T>C                 NM_006306; c.2680A>G         p.Ile894Val

Supplementary Table 2. Mutation allele loads of 12 DNMT3A-mutant AML patients who achieved complete remission after induction chemotherapy.
Columns: Patient #; Age at diagnosis (years); Sex; Karyotype; Duration of CR1 (days); Sample type; DNMT3A R882H; DNMT3A R882C; IDH1 R132C; IDH1 R132H; IDH2 R140Q; IDH2 R172K; NPM1 W288Cfs*12; KRAS G12D.
1
65
M
46,XY,ins(17;2)
(p13;p21p23)[20]/
46,XY[1]
268
Dx
95
2
39
F
46,XX-7,+8[23]/
46,XX[7]
NA
CR1
Dx
54
45
3
69
F
46, XX
71
4
37
M
46, XY
360
49
58
47
48
29
56
53
5
64
F
46, XX
>1,460
6
63
F
46, XX
767
7
53
M
46,XY,+1~22
dmin[cp29].ish
dmin(ETO-,c-MYC,MLL-,RUNX1-)/
46,XY[6]
1188
CR1
Dx
CR1
Dx
CR1
Rel
CR2
Dx
CR1
CR1
Dx
CR1
Rel
CR2
Dx
55
15
27
41
54
45
12
17
0
0
491
CR1
CR2
Dx
8
50
F
46,XX,del(7)
(q?31.2)[10]/
46,XX,-7,+mar[2]/
91~92,XXXX,-7,
-7,+marx1~2[6]/
46,XX[2]
9
60
F
46, XX
230
10
65
M
46, XY
100
11
64
M
47,XY,+4,del(4)
(q12q31)[7]/
46,XY[13]
181
12
60
F
46, XX
159
32
0
48
0
48
0
NA
0
43
0
46
0
50
0
60
13
41
CR1
CR2
P1
P2
Dx
CR1
Dx
CR1
Dx
58
0
53
CR1
Dx
CR1
0
30
0
6
26
0
0
0
72
5
5
0
0
0
70
0
0
0
0
0
55
0
49
0
Supplementary Table 3. AmpliSeq 29 Gene Panel list. The entire coding region of each
gene was encompassed by massively parallel sequencing for mutation detection.
Genes
ASXL1
BAP1
BRAF
CBL
CEBPA
DNMT3A
EGFR
EZH2
GATA2
IDH1
IDH2
JAK1
JAK2
KIT
KRAS
MET
MPL
MYD88
NOTCH1
NPM1
NRAS
PTPN11
RUNX1
SF3B1
SRP72
SRSF2
TET2
U2AF1
XPO1
REFERENCES
18. Cingolani P, Platts A, Wang le L, Coon M, Nguyen T, Wang L, et al. A program for
annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs
in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3. Fly (Austin) 2012;
6: 80-92.
19. Flicek P, Amode MR, Barrell D, Beal K, Billis K, Brent S, et al. Ensembl 2014. Nucleic
Acids Res 2014; 42: D749-755.
20. Abecasis GR, Auton A, Brooks LD, DePristo MA, Durbin RM, Handsaker RE, et al. An
integrated map of genetic variation from 1,092 human genomes. Nature 2012; 491: 56-65.
21. Tennessen JA, Bigham AW, O'Connor TD, Fu W, Kenny EE, Gravel S, et al. Evolution
and functional impact of rare coding variation from deep sequencing of human exomes.
Science 2012; 337: 64-69.
22. Forbes SA, Bhamra G, Bamford S, Dawson E, Kok C, Clements J, et al. The Catalogue
of Somatic Mutations in Cancer (COSMIC). Curr Protoc Hum Genet 2008; Chapter 10:
Unit 10.11.
23. Cooper GM, Stone EA, Asimenos G, Green ED, Batzoglou S, Sidow A. Distribution and
intensity of constraint in mammalian genomic sequence. Genome Res 2005; 15: 901-913.