Using RNA-seq and targeted nucleases to identify mechanisms of drug resistance in acute myeloid leukemia

Susan K. Rathe1,*, Branden S. Moriarity1,2,3, Christopher B. Stoltenberg1, Morito Kurata1, Natalie K. Aumann, Eric P. Rahrmann1,2,3, Natashay J. Bailey1, Ellen G. Melrose1, Dominic A. Beckmann2,3, Chase R. Liska1, David A. Largaespada1,2,3,4,5

1Masonic Cancer Center, University of Minnesota, Minneapolis, MN, USA
2Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, MN, USA
3Center for Genome Engineering, University of Minnesota, Minneapolis, MN, USA
4Brain Tumor Program, University of Minnesota, Minneapolis, MN, USA
5Department of Pediatrics, University of Minnesota, Minneapolis, MN, USA

Supplementary Figures

Supplementary Figure S1: Distribution of expression comparing Ara-C resistant samples to parental samples using both microarray and RNA-seq data. (a) Dot-blot of microarray gene expression comparing B117H to B117P. (b) Dot-blot of RNA-seq gene expression comparing B117H to B117P. (c) Dot-blot of microarray gene expression comparing B140H to B140P. (d) Dot-blot of RNA-seq gene expression comparing B140H to B140P.

Supplementary Figure S2: Insertion mutation in exon 4 of Dck in B140H is also present in the genomic DNA. (a) Sanger sequencing was used to interrogate the DNA sequence using primers provided in Supplementary Table 1. The intensity of the signal for the insertion indicates that the mutation is homozygous. (b) Projected change in protein sequence for Dck in the B140H cells. Amino acids derived from alternating exons are presented in black and blue, while single red amino acids span 2 exons. The red amino acids at the end of the projected protein sequence identify the expected novel protein sequence due to the insertion mutation.

Supplementary Figure S3: Extended PCR of a C-terminus transcript segment of Dck in B117P and B117H DNA shows a deletion of approximately 1000 bp in B117H cells.
Supplementary Figure S4: KO of Dck in the B117P T6B clone deleted a 32 bp segment toward the end of exon 1 which resulted in an abnormally spliced transcript containing 34 nucleotides from intron 1, starting at the 872nd nt of intron 1. The predicted result would be a protein consisting of 34 amino acids. The novel segment of protein is presented in red.

Supplementary Figure S5: Confirmation of the CRISPR gene modification in the murine Dck locus of the B117P Dck CKO-2 clone using CEL-I assay. The region containing the gRNA target sequence was amplified by PCR using primers described in Supplementary Table S11. Induction of the Cas9 system by doxycycline (Dox +) results in a 40% gene modification.

Supplementary Figure S6: Full-length Western blots, which were cropped for (a) Figure 1d, (b) Figure 3g, and (c) Figure 4d.

Supplementary Table Legends

Supplementary Table S1: The number of paired-end 76 bp reads generated by Illumina NGS sequencing for the murine cell lines B117P, B117H, B140P, and B140H, and the ability of the reads to be mapped uniquely by TopHat.

Supplementary Table S2: Gene expression changes common to both Ara-C resistant cell lines when compared to their parental cell lines were identified by analyzing data generated from the Illumina NGS sequencing platform with the use of TopHat for genome mapping, and Cuffdiff for gene expression comparison. Included in this table are genes with a minimum fold change between 2 and 3. Genes with a greater than 3 fold change are included in Table 1. Genes in bold with * were also identified by gene expression microarray with 2-fold+ changes in expression (Rathe & Largaespada, 2010).
(1) Expression levels are expressed in fragments per kilobase of transcript per million mapped reads (FPKMs).
(2) Fold change compares the Ara-C resistant gene expression to the Ara-C sensitive parental lines. Negative values indicate a decrease in expression while positive values represent an increase in expression.
(3) The table is sorted by the minimum fold change, which is the lesser of the two fold change values.

Supplementary Table S3: Gene expression changes between the Ara-C resistant B117H cells and their parental cell line, B117P, were identified by analyzing data generated from the Illumina NGS sequencing platform with the use of TopHat for genome mapping, and Cuffdiff for gene expression comparison. Genes in bold with * were also identified by gene expression microarray with 100-fold+ changes in expression.
(1) Expression levels are expressed in fragments per kilobase of transcript per million mapped reads (FPKMs).
(2) Fold change compares the Ara-C resistant gene expression to the Ara-C sensitive parental lines. Negative values indicate a decrease in expression while positive values represent an increase in expression.

Supplementary Table S4: Gene expression changes between the Ara-C resistant B140H cells and their parental cell line, B140P, were identified by analyzing data generated from the Illumina NGS sequencing platform with the use of TopHat for genome mapping, and Cuffdiff for gene expression comparison.
(1) Expression levels are expressed in fragments per kilobase of transcript per million mapped reads (FPKMs).
(2) Fold change compares the Ara-C resistant gene expression to the Ara-C sensitive parental lines. Negative values indicate a decrease in expression while positive values represent an increase in expression.

Supplementary Table S5: The last 10 amino acids of the C-terminus of Dck are highly conserved across a number of species. The coelacanth varies by 2 amino acids, highlighted in red (Source: Ensembl).

Supplementary Table S6: MMuFLR identified potential frameshift mutations in the B140P and B140H cell lines. The mutations were screened in silico using IGV. The two mutations confirmed by IGV as being newly introduced to the B140H cells, Ccdc88b and Dck, were confirmed as heterozygous and homozygous, respectively, by Sanger sequencing.
Supplementary Table S7: MMuFLR identified potential frameshift mutations in the B117P and B117H cell lines. The mutations were screened in silico using IGV. None were found to be newly introduced to the B117H cells.

Supplementary Table S8: DeFuse was used to detect fusions. Fusions starting in coding regions and having a probability > 0.85 were confirmed using IGV.

Supplementary Table S9: Genomic targets for guide RNAs (gRNAs) used in CRISPR KO screen.

Supplementary Table S10: As controls, the MTS tetrazolium assay was used to determine the Ara-C IC50 in B117P cells transfected with various RNAi vectors.

Supplementary Table S11: Primers designed to find mutations and to verify expression levels.
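
Note on the fold change convention in Supplementary Tables S2 to S4: the legends state only that negative values indicate decreased expression, positive values indicate increased expression, and that Table S2 is sorted by the minimum of the two fold changes. The following is a minimal sketch of one common way such values could be derived from FPKMs, not the authors' pipeline. It assumes that a decrease is reported as the negated inverse ratio, that "minimum" means the value of smaller magnitude, and that a small expression floor is applied to avoid division by zero. The function names, the `floor` parameter, and the example FPKM values are all hypothetical.

```python
def signed_fold_change(resistant_fpkm, parental_fpkm, floor=0.01):
    """Signed fold change of resistant versus parental expression.

    Assumption: an increase is reported as resistant/parental (positive),
    a decrease as -(parental/resistant) (negative). The floor is a
    hypothetical guard against zero FPKM values.
    """
    r = max(resistant_fpkm, floor)
    p = max(parental_fpkm, floor)
    if r >= p:
        return r / p
    return -(p / r)


def minimum_fold_change(fc_a, fc_b):
    """Fold change of smaller magnitude when both lines agree in direction.

    Assumption: a gene counts as a common change only if both fold changes
    have the same sign; otherwise None is returned.
    """
    if (fc_a > 0) != (fc_b > 0):
        return None
    return fc_a if abs(fc_a) <= abs(fc_b) else fc_b


# Made-up FPKM values (parental, resistant) for illustration only.
b117 = signed_fold_change(resistant_fpkm=40.0, parental_fpkm=10.0)
b140 = signed_fold_change(resistant_fpkm=12.5, parental_fpkm=5.0)
common = minimum_fold_change(b117, b140)
print(b117, b140, common)
```

Under these assumptions, a gene would fall within the range described for Supplementary Table S2 when the magnitude of its minimum fold change lies between 2 and 3.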