Supplementary information

Whole exome sequencing identifies a novel somatic mutation in MMP8 associated with a t(1;22)-acute megakaryoblastic leukemia

Yeunhee Kim PhD1,10, Vincent P. Schulz PhD2, Noriko Satake MD3, Tanja A. Gruber4,5, Alexandra Teixeira6,10, Stephanie Halene MD7, Patrick G. Gallagher MD2,8, and Diane S. Krause MD, PhD1,9,10*

1Departments of Laboratory Medicine, 2Pediatrics, 6Pathology, 7Internal Medicine (Division of Hematology and Yale Comprehensive Cancer Center), 8Genetics, and 9Cell Biology; 10Yale Stem Cell Center; Yale University School of Medicine, New Haven, CT.
3Department of Pediatrics (Section of Hematology/Oncology), University of California, Davis, Comprehensive Cancer Center, Sacramento, CA
4Departments of Oncology and 5Pathology, St. Jude Children's Research Hospital, Memphis, TN

*Correspondence to:
Diane S. Krause
P.O. Box 208035, 333 Cedar Street, New Haven, CT 06520-8035
Office (203) 737-1678
Fax (203) 785-4305
Email: diane.krause@yale.edu

Running title: MMP8 mutation in AMKL

Supplementary Materials and Methods

Sample collection
Parental consent was obtained and documented for all procedures according to the University of California, Davis Institutional Review Board protocol. Samples of bone marrow and peripheral blood were obtained from the patient at presentation with leukemia and on several post-induction days when the patient was in remission. Morphological and immunophenotypic analyses were performed at the UC Davis Pathology Department according to standard protocols. FISH and cytogenetic analyses were performed by the Mayo Clinic Laboratory (Children's Oncology Group designated laboratory). The remaining diagnostic sample was frozen. Peripheral blood was also collected and cryopreserved 6 and 9 months after the patient achieved remission.

DNA Preparation
The leukemic genomic DNA was extracted using the Gentra Puregene kit (Qiagen) according to the manufacturer's instructions from 2.2 million acetic acid/methanol-fixed peripheral blood cells remaining after diagnostic FISH was performed. The Gentra Puregene kit uses an anionic detergent for cell lysis in the presence of a DNA stabilizer to enhance DNA quality and quantity from archived tissue samples. The remission genomic DNA was extracted from thawed peripheral blood cells using the DNeasy blood and tissue kit (Qiagen) according to the manufacturer's instructions. The GenomePlex Complete whole genome amplification (WGA) kit (Sigma, St. Louis, MO) (1) was used to amplify the genomic DNA. Control human genomic DNA was provided by the manufacturer and used as a positive control for WGA and droplet digital PCR.

PCR analysis of t(1;22) breakpoint
PCR amplification was performed to assess the breakpoint between chromosomes 1 and 22 using the primer sets published by Mercher et al. (2). PCR amplification was performed in 20 µl containing 100 ng of template DNA, 0.5 µM primers, Phusion HF buffer (1.5 mM MgCl2), 200 µM dNTPs, 3% DMSO and 0.02 U of Phusion DNA polymerase (Finnzymes, Espoo, Finland). The protocol was as follows: initial denaturation at 98 °C for 1 min; 35 cycles of 98 °C for 10 s, 60 °C for 30 s, and 72 °C for 2 min; and a final extension of 5 min at 72 °C.

Exome Capture and Next-Generation Sequencing
The cancer and remission genomic DNA was captured on a NimbleGen sequence capture human exome 2.1M liquid phase capture system (NimbleGen, Madison, WI) using the manufacturer's protocol with modifications at the Yale Center for Genome Analysis at Yale University. DNA was sheared and adaptors were ligated onto the resulting sheared fragments. Fragments were amplified using adaptor PCR primers, purified and hybridized to the capture oligonucleotides.
Eluted and purified fragments were sequenced on an Illumina Genome Analyzer II sequencing system (Illumina, Inc., San Diego, CA) as 75 bp paired-end reads according to the manufacturer's protocols. Image analysis and base calling were performed by the Illumina pipeline with default parameters.

Bioinformatics analyses
Illumina sequence reads were mapped to the hg19 human genome sequence using the BWA version 0.5.9 alignment software. Duplicate reads arising from PCR amplification were flagged using the Picard software (http://picard.sourceforge.net). Reads near insertions/deletions were remapped using GATK software version 1.1-37 to reduce genotype miscalling due to alignment errors. GATK software was then used to recalibrate the sequence base quality scores using dbSNP version 130 SNP information and sequence cycle and dinucleotide covariates. Genotype call information was generated using the samtools mpileup command (http://www.ncbi.nlm.nih.gov/pubmed/19505943). Only bases with a quality of 20 or more and reads with a mapping quality greater than 10 were used for genotype calls. The genotypes of the cancer and paired remission samples were compared using the VarScan version 2.2.7 software (http://www.ncbi.nlm.nih.gov/pubmed/22300766). Variants were filtered to require >2% frequency and a minimum coverage of 8 reads. Somatic SNPs were annotated with coding effect information using the SIFT server at http://sift.jcvi.org/www/SIFT_chr_coords_submit.html.

Droplet Digital PCR
Droplet digital PCR (ddPCR) was performed using a Bio-Rad QX100 droplet digital PCR system (Bio-Rad, Hercules, CA) to validate the variants identified in the diagnostic versus remission samples.
TaqMan primers and probes (Life Technologies, Grand Island, NY) were custom-designed for each gene. The probes carried fluorescent 6-FAM and VIC reporter labels at the 5' end for the mutant and normal alleles, respectively, and a minor groove binder/non-fluorescent quencher (MGB/NFQ) at the 3' end. The TaqMan PCR reaction mixture was assembled with 2x ddPCR mastermix (Bio-Rad), 20x TaqMan primers/probes (final concentrations of 900 nM and 250 nM, respectively) and template (10 ng genomic DNA and 5 x 10^-7 ng mutant oligos) in a final volume of 20 µl. The PCR mixture and 60 µl of droplet generation oil were loaded into the sample and oil wells, respectively, for each channel of an eight-channel droplet generator cartridge. The droplet generator produced approximately 15,000 droplets for each PCR mixture, which were transferred to a 96-well PCR plate (Eppendorf, Hauppauge, NY). The plate was heat-sealed with a foil seal (Eppendorf) and then placed on a conventional thermal cycler. Thermal cycling conditions were: 10 min at 95 °C (1 cycle); 30 s at 95 °C and 60 s at 59-63 °C (40 cycles); and 10 min at 98 °C (1 cycle). After PCR, the 96-well PCR plate was loaded onto the droplet reader for analysis. QuantaSoft analysis software was used to analyze the ddPCR data. The primers and TaqMan probes for ddPCR are listed in Supplementary Table 1.

Cell culture and transient transfection
HEK 293T cells were maintained in DMEM (Life Technologies, Carlsbad, CA) with 10% fetal bovine serum (Gemini Bio-Products, West Sacramento, CA), 2 mM L-glutamine and 1x Penicillin/Streptomycin (Life Technologies). Confluent HEK 293T cells were transfected using Lipofectamine 2000 Transfection Reagent (Life Technologies) with 24 µg of either empty vector or vector containing wild-type MMP8 or mutant MMP8. Cells were harvested 24 hours after transfection.
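
As an illustration of the droplet digital PCR quantification described earlier, the following minimal Python sketch shows how a Poisson correction can convert counts of positive droplets into copy estimates and a variant allele fraction. It is not the analysis software used in the study. The droplet volume of about 0.85 nl, the function names, and the droplet counts in the usage lines are assumptions introduced for illustration only.

```python
import math

# Assumption: nominal droplet volume in microlitres (about 0.85 nl).
# This value is not stated in the source text.
ASSUMED_DROPLET_VOLUME_UL = 0.00085


def copies_per_droplet(positive: int, total: int) -> float:
    """Poisson estimate of the mean number of target copies per droplet."""
    if total <= 0 or positive < 0 or positive >= total:
        raise ValueError("require 0 <= positive < total")
    if positive == 0:
        return 0.0
    # P(droplet is negative) = exp(-lambda), so lambda = -ln(1 - positive/total).
    return -math.log(1.0 - positive / total)


def copies_per_ul(positive: int, total: int,
                  droplet_volume_ul: float = ASSUMED_DROPLET_VOLUME_UL) -> float:
    """Target concentration in copies per microlitre of reaction."""
    return copies_per_droplet(positive, total) / droplet_volume_ul


def variant_allele_fraction(fam_positive: int, vic_positive: int, total: int) -> float:
    """Fraction of variant (FAM) copies among all (FAM + VIC) copies."""
    mutant = copies_per_droplet(fam_positive, total)
    normal = copies_per_droplet(vic_positive, total)
    if mutant + normal == 0:
        raise ValueError("no positive droplets in either channel")
    return mutant / (mutant + normal)


if __name__ == "__main__":
    # Hypothetical droplet counts, not data from the study.
    total_droplets = 15000
    print(copies_per_ul(1200, total_droplets))
    print(variant_allele_fraction(300, 1200, total_droplets))
```

The logarithmic correction matters because a droplet can contain more than one template copy, so the raw fraction of positive droplets underestimates the copy number as the template concentration rises.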

Immunoprecipitation
Transfected cells in a T75 flask were washed once with 1x phosphate buffered saline and then lysed for 20 min on ice using 810 µl EBC medium (50 mM Tris-Cl pH 7.5, 100 mM NaCl, 0.5% NP, 1x phosphatase and protease inhibitor cocktails (Roche, Nutley, NJ)) and 90 µl of 1,10-phenanthroline (Sigma, St. Louis, MO) in 0.1 N HCl. Cells were collected and centrifuged at 14,000 rpm at 4 °C for 10 min. The 850 µl of supernatant from the lysed cells was incubated with 100 µl blocking EBC (EBC medium + 5% milk) and 30 µl Anti-FLAG M2 Affinity Gel (Sigma). The mixture was incubated at 4 °C overnight. The next day, the supernatant was removed and the immunoprecipitates were washed three times with EBC medium. The remaining cell lysate was used as an input control for western blot.

Zymography
Collagen zymography was performed using 0.3% rat-tail collagen type 1 (BD, Bedford, MA) in a 7.5% polyacrylamide gel. Fifty µl of Zymogram sample buffer (Bio-Rad) was added to each immunoprecipitate and the mixture was incubated for 10 minutes at room temperature. Fifteen µl was loaded onto the collagen-polyacrylamide gel. After electrophoresis, the gel was incubated in 50 ml 1x Zymogram Renaturation buffer (Bio-Rad) for 30 min at room temperature and then equilibrated in 50 ml 1x Zymogram Development buffer (Bio-Rad) for 30 min at room temperature. The development buffer was discarded, and the gel was incubated in 50 ml of fresh 1x Zymogram Development buffer overnight at 37 °C. The gel was stained in 0.5% Coomassie Blue R-250 (Bio-Rad) for 30 min and then destained with methanol:acetic acid:dH2O (40:10:50). Student's t-test was used for statistical analysis (p<0.05).

Western Blot Analysis
Beta-mercaptoethanol was added to the remaining immunoprecipitates in zymogram sample buffer, and the whole cell lysates collected prior to immunoprecipitation were treated with denaturing Laemmli sample buffer. The samples were boiled at 98 °C for 3 min and 15 µl of each sample was loaded on a 10% Mini Protean TGX gel (Bio-Rad). After electrophoresis, proteins were transferred from the gel onto a 0.45 µm nitrocellulose membrane. The blot was blocked in 5% milk/0.05% Tween in TBS for 30 min at room temperature and then incubated with primary mouse anti-FLAG M2 antibody (Sigma) in the blocking solution at 4 °C overnight. The blot was washed 3 times in 0.05% Tween-TBS and then incubated with secondary anti-mouse horseradish peroxidase antibody for 1 hour at room temperature. After washing, the blot was developed in Western Lightning Plus-ECL solution (PerkinElmer, Waltham, MA).

Analysis of MMP8 expression in AML patients
RNA extraction, gene expression profiling, and transcriptome sequencing data have been previously published (3, 4). Sequencing data are deposited in the dbGaP database (http://www.ncbi.nlm.nih.gov/gap) under the accession number phs000413.v1.p1. Affymetrix Human Exon 1.0 ST Array data for pediatric AML profiling have been deposited in the NCBI gene expression omnibus (http://www.ncbi.nlm.nih.gov/geo/) under GSE35203. Affymetrix Human U133A Array data for AMKL profiling have been deposited in the NCBI gene expression omnibus (http://www.ncbi.nlm.nih.gov/geo/) under GSE4119.

Figure Legends

Supplementary Figure 1. Droplet digital PCR plots of MMP8 and CDK9. (A, C) Raw ddPCR dot plots show the signals from the normal and variant alleles of MMP8 and CDK9, respectively.
The VIC and FAM signals on the Y-axis indicate the normal (WT) and variant (MT) alleles, respectively, and the event number on the X-axis indicates the cumulative number of droplets generated from each sample (NTC: no template control, WT: normal human genomic DNA, MT: synthesized oligos carrying the variant, Leuk: the patient's leukemic DNA at diagnosis, and Rem: the patient's normal DNA at remission). Samples are separated by yellow lines. The pink line indicates the threshold used to exclude nonspecific signals. (B, D) The copy numbers of the normal and variant alleles of MMP8 and CDK9, respectively, were calculated using the Poisson distribution. Samples 'WT' and 'Rem' showed only VIC signal, whereas 'MT' showed only FAM signal, for both MMP8 and CDK9. Both VIC and FAM signals were observed in 'Leuk' for MMP8 (A, B) but not for CDK9 (C, D).

Supplementary Figure 2. Schematic diagram of MMP8 and functional analysis of MMP8G189D.
(A) Diagram of MMP8 in a 3D-ribbon representation. Green and grey dots represent calcium and zinc ions, respectively. (Jmol: an open-source Java viewer for chemical structures in 3D. http://www.jmol.org/) (B) Schematic diagram of MMP8 structural domains showing the variation as a red asterisk. P, leading sequences; PGBD, proteoglycan binding domain; H, hinge domain. (C) Conservation alignment view across multiple vertebrates over the region of the MMP8 variation (Chr11:102,592,138-102,592,237). The top and bottom red asterisks mark the nucleotide and amino acid variation, respectively.

Supplementary Figure 3. MMP8 Expression Levels in Acute Megakaryoblastic Leukemia. (A) Pediatric AML cases were subjected to gene expression profiling using Affymetrix Human Exon 1.0 ST Arrays. Samples were clustered using average linkage. All samples had >80% purity; AMKL cases were all >90% pure.
A detailed description of the samples included in this analysis can be found at NCBI gene expression omnibus, accession GSE35203. Briefly, AML M7 denotes AMKL specimens; CN, cytogenetically normal AML; M3, Acute Promyelocytic Leukemia; MLL, Mixed Lineage Leukemia rearranged AML; AE, AML1-ETO AML; inv16, CBFB-MYH11 AML. Samples with no distinguishing characteristics on cytogenetics are labeled as AML without modifiers. (B) A single AMKL case, SJAMLM7007, demonstrated robust MMP8 expression by transcriptome sequencing. MMP8 exons are indicated; the Y-axis shows the number of supporting reads. (C) AMKL cases were subjected to gene expression profiling using Affymetrix Human U133A Arrays. Samples were clustered using average linkage. Sample purity ranged from 12-94%. A detailed description of the samples included in this analysis can be found at NCBI gene expression omnibus, accession GSE4119. Briefly, M7 Ped denotes pediatric non-Down syndrome AMKL specimens; M7 DS, Down syndrome AMKL; M7 Adult, adult AMKL.

References

1. Arneson N, Hughes S, Houlston R, Done S. GenomePlex Whole-Genome Amplification. Cold Spring Harbor Protocols 2008; 2008(1): pdb.prot4920.
2. Mercher T, Busson-Le Coniat M, Khac FN, Ballerini P, Mauchauffé M, Bui H, et al. Recurrence of OTT-MAL fusion in t(1;22) of infant AML-M7. Genes, Chromosomes and Cancer 2002; 33(1): 22-28.
3. Bourquin JP, Subramanian A, Langebrake C, Reinhardt D, Bernard O, Ballerini P, et al. Identification of distinct molecular phenotypes in acute megakaryoblastic leukemia by gene expression profiling. Proceedings of the National Academy of Sciences of the United States of America 2006; 103(9): 3339-3344.
4. Gruber TA, Larson Gedman A, Zhang J, Koss CS, Marada S, Ta HQ, et al.
An Inv(16)(p13.3q24.3)-Encoded CBFA2T3-GLIS2 Fusion Protein Defines an Aggressive Subtype of Pediatric Acute Megakaryoblastic Leukemia. Cancer Cell 2012; 22(5): 683-697.