Towards Personalized Genomics-Guided Cancer Immunotherapy

Ion Mandoiu
Department of Computer Science & Engineering

Joint work with
Sahar Al Seesi (CSE)
Jorge Duitama (CIAT)
Fei Duan, Tatiana Blanchard, Pramod K. Srivastava (UCHC)


Mandoiu Lab

Main Research Areas:
• Bioinformatics Algorithms
• Development of Computational Methods for Next-Gen Sequencing Data Analysis

Ongoing Projects
• RNA-Seq Analysis (NSF, NIH, Life Technologies)
  - Novel transcript reconstruction
  - Allele-specific isoform expression
  - Computational deconvolution of heterogeneous samples
• Viral quasispecies reconstruction (USDA)
  - IBV evolution and vaccine optimization
• Genome assembly and scaffolding, LD-based genotype calling, local ancestry inference, metabolomics, …
  - More info & software at http://dna.engr.uconn.edu


Genomics-Guided Cancer Immunotherapy

[Figure: workflow diagram with the stages mRNA Sequencing, Tumor Specific Epitopes, Peptide Synthesis, Immune System Stimulation, T-Cell Response, Tumor Remission]

Example reads:
CTCAATTGATGAAATTGTTCTGAAACT
GCAGAGATAGCTAAAGGATACCGGGTT
CCGGTATCCTTTAGCTATCTCTGCCTC
CTGACACCATCTGTGTGGGCTACCATG
…
AGGCAAGCTCATGGCCAAATCATGAGA

Example epitopes:
SYFPEITHI
ISETDLSLL
CALRRNESL
…

Mouse Image Source: http://www.clker.com/clipart-simple-cartoon-mouse-2.html


Bioinformatics Pipeline

• Read Alignment: Hybrid alignment strategy (HardMerge)
• Data Cleaning: Clipping alignments & removal of PCR artifacts
• Variant Detection: Bayesian model based on quality scores (SNVQ)
• Haplotyping: Max-Cut algorithm (RefHap)
• Epitope Prediction: PWM and ANN algorithms (NetMHC)


Hybrid Read Alignment Approach

[Figure: mRNA reads go through Transcript Library Mapping (giving transcript mapped reads) and Genome Mapping (giving genome mapped reads); Read Merging combines the two into the final mapped reads. Image source: http://en.wikipedia.org/wiki/File:RNA-Seq-alignment.png]

• More efficient compared to spliced alignment onto genome
• Stringent filtering: reads with multiple alignments are discarded


Clipping Alignments

[Figure: percentage of reads with mismatches by read position, for Lane 1, Lane 2, and Lane 3]


Removal of PCR Artifacts


Variant Detection and Genotyping

Locus i
Reference genome:
AACGCGGCCAGCCGGCTTCTGTCGGCCAGCAGCCAGGAATCTGGAAACAATGGCTACAGCGTGC
Ri:
AACGCGGCCAGCCGGCTTCTGTCGGCCAGCCGGCAG
  CGCGGCCAGCCGGCTTCTGTCGGCCAGCAGCCCGGA
   GCGGCCAGCCGGCTTCTGTCGGCCAGCCGGCAGGGA
      GCCAGCCGGCTTCTGTCGGCCAGCAGCCAGGAATCT
          GCCGGCTTCTGTCGGCCAGCAGCCAGGAATCTGGAA
               CTTCTGTCGGCCAGCCGGCAGGAATCTGGAAACAAT
                      CGGCCAGCAGCCAGGAATCTGGAAACAATGGCTACA
                         CCAGCAGCCAGGAATCTGGAAACAATGGCTACAGCG
                         CAAGCAGCCAGGAATCTGGAAACAATGGCTACAGCG
                            GCAGCCAGGAATCTGGAAACAATGGCTACAGCGTGC


Variant Detection and Genotyping

• Pick genotype with the largest posterior probability


Accuracy as Function of Coverage


Haplotyping

• Somatic cells are diploid, containing two nearly identical copies of each autosomal chromosome
  – Novel mutations are present on only one chromosome copy
  – For epitope prediction we need to know if nearby mutations appear in phase

Locus   Mutation    Alleles
1       SNV         C,T
2       Deletion    C,-
3       SNV         A,G
4       Insertion   -,GC

Locus   Mutation    Haplotype 1   Haplotype 2
1       SNV         T             C
2       Deletion    C             -
3       SNV         A             G
4       Insertion   -             GC


RefHap Algorithm

• Reduce the problem to Max-Cut
• Solve Max-Cut
• Build haplotypes according to the cut

Locus   1  2  3  4  5
f1      *  0  1  1  0
f2      1  1  0  *  1
f3      1  *  *  0  *
f4      *  0  0  *  1

[Figure: weighted graph on fragments f1, f2, f3, f4 with edge weights 3, 1, 1, -1, -1]

h1   00110
h2   11001


Epitope Prediction

Profile weight matrix (PWM) model
C. Lundegaard et al. MHC Class I Epitope Binding Prediction Trained on Small Data Sets. In Lecture Notes in Computer Science, 3239:217-225, 2004

J.W. Yewdell, E. Reits and J. Neefjes. Making sense of mass destruction: quantitating MHC class I antigen presentation. Nature Reviews Immunology, 3:952-961, 2003

[Figure: scatter plot of SYFPEITHI Score against NetMHC Score for H2-Kd; R² = 0.5333]


Results on Tumor Data

Tumor Type                                     MethA    CMS5
RNA-Seq Reads (Million)                        105.8    23.4
Genome Mapped                                  75%      54%
Transcriptome Mapped                           83%      59%
HardMerge Mapped                               50%      36%
HardMerge Mapped Bases (Gb)                    3.18     0.41
High-Quality Heterozygous SNVs in CCDS Exons   1,504    232
Non-synonymous                                 1,160    182
Missense                                       1,096    178
Nonsense                                       63       4
No-stop                                        1
NetMHC Predicted Epitopes                      836      142

[Figure: Tnpo3 vs. Naive. Panels: Mean Tumor Diameter (mm) against Days after tumor challenge; AUC (mm2); P < 0.0001]

• Tumor rejection potential of identified epitopes currently evaluated experimentally in the Srivastava lab


Ongoing Work

• Sequencing of spontaneous tumors (TRAMP mice)
• Detecting other forms of variation: indels, gene fusions, novel transcripts
• Incorporating predictions of TAP transport efficiency and proteasomal cleavage in epitope prediction
• Integration of mass-spectrometry data
• Monitoring immune response by TCR sequencing
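
Illustration for the Variant Detection and Genotyping slides: the rule "pick genotype with the largest posterior probability" can be sketched in a few lines of Python. This is not the SNVQ model itself. It assumes a diploid locus, independent reads, a base-call error probability taken from the Phred quality score, errors spread evenly over the three other bases, and a uniform prior unless one is supplied. The names `genotype_posteriors` and `call_genotype` and the pileup format are hypothetical.

```python
import math
from itertools import combinations_with_replacement

BASES = "ACGT"

def phred_to_error(q):
    """Convert a Phred quality score to a base-call error probability."""
    return 10 ** (-q / 10)

def genotype_posteriors(pileup, prior=None):
    """Posterior over the 10 unordered diploid genotypes at one locus.

    pileup: list of (base, phred_quality) pairs for reads covering the locus.
    prior:  optional dict genotype -> strictly positive prior probability
            (assumed uniform if omitted).
    """
    genotypes = ["".join(g) for g in combinations_with_replacement(BASES, 2)]
    if prior is None:
        prior = {g: 1 / len(genotypes) for g in genotypes}
    log_post = {}
    for g in genotypes:
        lp = math.log(prior[g])
        for base, q in pileup:
            e = phred_to_error(q)
            # Each read samples one of the two alleles with probability 1/2;
            # a miscalled base is assumed equally likely to be any other base.
            p = sum(0.5 * ((1 - e) if base == allele else e / 3) for allele in g)
            lp += math.log(p)
        log_post[g] = lp
    # Normalize in log space for numerical stability.
    m = max(log_post.values())
    total = sum(math.exp(v - m) for v in log_post.values())
    return {g: math.exp(v - m) / total for g, v in log_post.items()}

def call_genotype(pileup, prior=None):
    """Return the maximum a posteriori genotype and its posterior probability."""
    post = genotype_posteriors(pileup, prior)
    best = max(post, key=post.get)
    return best, post[best]

# Six reads supporting A and five supporting C, all at quality 30.
example_pileup = [("A", 30)] * 6 + [("C", 30)] * 5
genotype, probability = call_genotype(example_pileup)
```

With a balanced mix of two bases at high quality the heterozygous genotype wins; a pileup dominated by one base yields the homozygous call, with occasional low-quality mismatches absorbed by the error term.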
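
Illustration for the RefHap Algorithm slide: the three steps (reduce to Max-Cut, solve Max-Cut, build haplotypes according to the cut) can be traced on the four-fragment example. The sketch below makes two assumptions that are not stated on the slide: the edge weight between two fragments is taken to be mismatches minus matches over the loci both cover, and Max-Cut is solved by exhaustive search, which is a stand-in chosen for clarity and only feasible for tiny inputs. All function names are hypothetical.

```python
from itertools import product

def pair_score(f, g):
    """Mismatches minus matches over loci covered by both fragments ('*' = not covered)."""
    score = 0
    for a, b in zip(f, g):
        if a != "*" and b != "*":
            score += 1 if a != b else -1
    return score

def max_cut_bruteforce(fragments):
    """Exhaustive Max-Cut; exponential time, so only suitable for tiny examples."""
    n = len(fragments)
    w = [[pair_score(fragments[i], fragments[j]) for j in range(n)] for i in range(n)]
    best_sides, best_value = None, float("-inf")
    # Fix fragment 0 on side 0 to avoid enumerating each cut twice.
    for rest in product((0, 1), repeat=n - 1):
        sides = (0,) + rest
        value = sum(w[i][j] for i in range(n) for j in range(i + 1, n)
                    if sides[i] != sides[j])
        if value > best_value:
            best_sides, best_value = sides, value
    return best_sides, best_value

def build_haplotypes(fragments, sides):
    """Majority vote per locus; side-1 fragments vote for the complementary allele."""
    h1 = []
    for locus in range(len(fragments[0])):
        votes = [0, 0]
        for frag, side in zip(fragments, sides):
            allele = frag[locus]
            if allele != "*":
                votes[int(allele) ^ side] += 1
        h1.append("1" if votes[1] > votes[0] else "0")  # ties default to 0
    h1 = "".join(h1)
    h2 = "".join("1" if c == "0" else "0" for c in h1)
    return h1, h2

fragments = ["*0110", "110*1", "1**0*", "*00*1"]  # f1..f4 from the slide
sides, value = max_cut_bruteforce(fragments)
h1, h2 = build_haplotypes(fragments, sides)
print(sides, value, h1, h2)
```

Under the assumed weighting, the pairwise scores for f1 to f4 come out as 3, 1, 1, -1, -1, the same values shown on the slide's graph. The best cut separates f1 from f2, f3, and f4, and the resulting haplotypes are h1 = 00110 and h2 = 11001, matching the slide.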
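
Illustration for the Epitope Prediction slide: a profile weight matrix scores a peptide by summing one weight per position, and candidate epitopes are found by sliding a window along the protein. The sketch below shows only that mechanism. The matrix is a toy with three positions and made-up weights, and the function names and threshold are hypothetical; it is not the NetMHC model or its parameters. Real MHC class I matrices typically cover nine positions and all twenty amino acids.

```python
def score_peptide(pwm, peptide):
    """Sum the per-position weights of a peptide under a PWM."""
    if len(peptide) != len(pwm):
        raise ValueError("peptide length must equal PWM length")
    # Residues missing from a column get a weight of 0.0 (an assumption).
    return sum(column.get(aa, 0.0) for column, aa in zip(pwm, peptide))

def scan_protein(pwm, protein, threshold):
    """Score every window of PWM length; return (start, peptide, score) hits, best first."""
    k = len(pwm)
    hits = []
    for start in range(len(protein) - k + 1):
        peptide = protein[start:start + k]
        s = score_peptide(pwm, peptide)
        if s >= threshold:
            hits.append((start, peptide, s))
    return sorted(hits, key=lambda h: -h[2])

# Toy 3-position matrix with made-up weights, for illustration only.
toy_pwm = [
    {"S": 2.0, "K": 1.0},
    {"Y": 3.0, "F": 1.0},
    {"L": 2.0, "I": 2.0},
]
hits = scan_protein(toy_pwm, "ASYLKFI", threshold=2.0)
```

In a pipeline like the one on the slides, the windows of interest would be those overlapping a somatic mutation, so that each scored peptide contains the mutated residue.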