Exome sequencing analysis of the mutational spectrum in carcinogen and genetic models of Kras-driven lung cancer

Peter Westcott, Kyle Halliwill, Minh To, David Quigley, Reyno Delrosario, Erik Fredlund, David Adams¹, and Allan Balmain
UCSF Helen Diller Family Comprehensive Cancer Center, 1450 3rd Street, San Francisco.
¹ Wellcome Trust Sanger Centre, Cambridge, England.

Why sequence tumors from mice?
• Control!
  – Timing of initiation and collection
  – Initiating gene(s), carcinogen(s)
• Can distinguish mutations involved in initiation from those involved in progression

Specific goals of this study
• Part of the MMHCC TCGA Pilot Project
• Characterize the utility of sequencing mouse tumors:
  – What is the effect of the causative carcinogen on mutation spectrum?
  – Clean genetic induction (GEM) vs. carcinogen induction?
  – What mutations arise after Kras initiation?

Exome sequencing
• Urethane: 44 lung tumors from 17 mice
• MNU: 26 lung tumors from 7 mice
• Kras+/(FVB/Ola)
• KrasLA2 (GEM): 13 lung tumors from 4 mice
• KrasLA2 (FVB/Ola), spontaneous lung tumors
• Kras+/-, Kras+/+
• Control tail DNA: 2 Kras+/+ tails

Exome sequencing
• Illumina paired-end sequencing (Wellcome Trust Sanger Centre)
• Reads aligned to the mouse genome, variants called against multiple controls, and extensive QC performed (Kyle Halliwill)
• Result: a confident list of somatic variants

Carcinogen models of Kras-driven lung cancer
Urethane (ethyl carbamate)
• Adenosine and cytidine DNA adducts lead to mispairing
  [Diagram: A -> replication -> T mispairing]
• Kras Q61L (CAA->CTA), Q61R (CAA->CGA)
• ~90% of lung tumors harbor Kras mutations
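
The codon changes on the urethane slide can be checked with a few lines of code. The sketch below is illustrative only and is not part of the study's pipeline: it translates the reference and mutant codons with the standard genetic code and reports whether the single-base change is a transition or a transversion. The function names `translate` and `describe_change` are hypothetical.

```python
# Standard genetic code (NCBI translation table 1), with bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
PURINES = {"A", "G"}


def translate(codon):
    """Return the one-letter amino acid for a three-base DNA codon."""
    i, j, k = (BASES.index(base) for base in codon.upper())
    return AMINO_ACIDS[16 * i + 4 * j + k]


def describe_change(ref_codon, alt_codon):
    """Summarize a single-base codon change (hypothetical helper)."""
    diffs = [(r, a) for r, a in zip(ref_codon, alt_codon) if r != a]
    if len(diffs) != 1:
        raise ValueError("expected exactly one base difference")
    ref_base, alt_base = diffs[0]
    # Purine<->purine or pyrimidine<->pyrimidine is a transition; otherwise a transversion.
    same_class = (ref_base in PURINES) == (alt_base in PURINES)
    kind = "transition" if same_class else "transversion"
    return (f"{translate(ref_codon)}->{translate(alt_codon)} "
            f"via {ref_base}->{alt_base} ({kind})")


print(describe_change("CAA", "CTA"))  # Q->L via A->T (transversion)
print(describe_change("CAA", "CGA"))  # Q->R via A->G (transition)
```

Both changes hit the middle adenine of codon 61, which fits the slide's point that urethane adducts form on adenosine.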
Carcinogen models of Kras-driven lung cancer
MNU (N-methyl-N-nitrosourea)
• Guanosine DNA adducts lead to G->A transitions
  [Diagram: G:G -> replication -> mispairing -> G:A]
• Kras G12D (GGT->GAT)
• ~90% of lung tumors harbor Kras mutations
• Genome-wide spectrum of these carcinogen mutations not known

Mutation spectrum
[Figure: mutation spectrum panels for Urethane, MNU, and LA2; light shade = Kras+/-]

Mutation spectrum
• Slight bias for mutations at G/C nucleotide
• Strong bias for mutations at G nucleotide with flanking G or A
• Strong bias for mutations at A/T nucleotide

Mutation spectrum: MNU G->A mutations
[Figure: average counts per tumor (axis 0 to 100) for G->A mutations with a 5' A vs. a 5' G]
• Purine bias at 5' flanking base

Mutation spectrum: Are non-carcinogen mutations separable?
[Figure: average counts per tumor (axis 0 to 80, one value labeled 670) for Urethane, MNU, and LA2, by mutation class: NCG->T, Other G->A, A->T, A->G, A->C, G->C, G->T]
• For the most part

ARE CARCINOGEN MUTATIONS RELEVANT?

Other driver mutations?
• Analysis complicated:
  – High mutation rates: MNU – 21.2/Mb, Urethane – 6.4/Mb, LA2 – 1.9/Mb
  – Correlation between gene length and mutations
• Start with variants within Vogelstein's 2013 list of drivers:
  – Selected only consequential mutations at highly conserved sites in expressed genes

Other driver mutations?

GENE     EXON_LENGTH  NONSYN_MUT
Mll2     19827        16
Sf3b1    6191         5
Crebbp   7507         4
Asxl1    6674         3
Pdgfra   6553         3
Met      6652         3
Cic      6099         3
Atm      11964        3
Arid1b   11325        3
Alk      5918         3
Gnas     3717         2
Notch2   10506        2
Arid1a   8175         2
Fgfr3    4222         2
Hnf1a    3186         2
Flt3     3656         2
Brca2    10540        2
Akt1     2640         2
Rb1      4625         2

• None of these mutations occur in LA2 tumors
• Slight enrichment for longer genes
• Modest increase in NS mutation ratio
• One S367 to F – required for autophosph. and activity
• Subclonal Myc T58P?

Conclusions
• Mutation Spectrum
  – Clear recapitulation of expected carcinogen mutations
  – GEM shows few mutations
  – Mutations highly specific and distinguishable
• Driver Mutations
  – Kras
  – Interesting candidates in carcinogen-induced tumors

Future work
• Validate top 1000 interesting variants by Sequenom (Wellcome Trust Sanger Centre).
• Optimize list of potential driver mutations (relevant sites?).
• InDel analysis.
• Array CGH (copy number analysis).
• Inverse correlation of point mutational burden and copy number changes?

Acknowledgments
Kyle Halliwill
Minh To
David Quigley
Reyno Del Rosario
Erik Fredlund
ALLAN BALMAIN
DAVID ADAMS (WELLCOME TRUST SANGER CENTRE)
$: MMHCC
$: NIH Training Grant T32 GM007175
$: NSF

Supplemental (Kyle's Pipeline)
• Capture using Agilent mouse whole exome kit
• Sequenced on Illumina HiSeq
  – Paired end, 75 bp each, average read span of 180 bp
• Converted back to FASTQ, then followed QC pipeline (next slide)

Supplemental (Kyle's Pipeline): QC and Variant Calling Strategy
• Align to mm10 with BWA
• Mark duplicates and fix mate information with Picard
• Base recalibration and realignment with GATK
• Alignment and coverage information with Picard
• Variant calling with MuTect
• Filter for depth and previously observed variants with vcftools

Supplemental (Kyle's Pipeline): Variant Calling Details
[Diagram:
  Sample .bam + Control 1 .bam -> variant calling via MuTect -> Variant List 1 .vcf
  Sample .bam + Control 2 .bam -> variant calling via MuTect -> Variant List 2 .vcf
  Variant List 1 + Variant List 2 -> intersect -> filter, annotate -> Candidate Variant List .vcf (candidate variants)]
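
The intersect step in the variant calling diagram can be sketched in a few lines. This is a minimal illustration, not the pipeline's actual code. It assumes each MuTect call set is plain-text VCF and that two calls are the same variant when chromosome, position, reference allele, and alternate allele all match. The helper names and the toy records are hypothetical.

```python
def variant_keys(vcf_text):
    """Collect (chrom, pos, ref, alt) keys from VCF-formatted text."""
    keys = set()
    for line in vcf_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        keys.add((chrom, int(pos), ref, alt))
    return keys


def intersect_calls(vcf_text_1, vcf_text_2):
    """Keep only variants called against both controls, in sorted order."""
    return sorted(variant_keys(vcf_text_1) & variant_keys(vcf_text_2))


def toy_vcf(records):
    """Build a tiny VCF-like string from tuples (hypothetical test data)."""
    header = "#CHROM\tPOS\tID\tREF\tALT"
    return "\n".join([header] + ["\t".join(map(str, r)) for r in records])


# Made-up coordinates, for illustration only.
LIST_1 = toy_vcf([("chr6", 1000, ".", "A", "T"), ("chr6", 2000, ".", "G", "A")])
LIST_2 = toy_vcf([("chr6", 1000, ".", "A", "T"), ("chr9", 3000, ".", "G", "A")])

CANDIDATES = intersect_calls(LIST_1, LIST_2)
print(CANDIDATES)  # [('chr6', 1000, 'A', 'T')]
```

Requiring a call against both controls is a conservative choice: a variant that appears in only one list is dropped before filtering and annotation.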