Dealing with the heterogeneity of cancer
Dana Pe’er
Department of Biological Sciences
Center for Computational Biology and Bioinformatics

What is Cancer?
Weinberg, Cell 2000

Why these phenotypes?
• Cells only proliferate when they are told to do so.
  – Usually achieved by growth factors or cell-to-cell interaction.
• Malignant cells proliferate independently of external signals.

• Proliferation rate is controlled by external and internal signals.
• Cells that interfere with their environment receive signals to die.
• Tumors evade these signals.

• A local tumor is almost always surgically removable.
• Cancer is such a terrible disease because it metastasizes and affects other organs.

• Our chromosomes end with “telomeres”, stretches of DNA that are not fully replicated and so get shorter each time the DNA is copied.
• When they become too short, the “important” DNA can no longer be copied and the cell dies.
• Tumors activate the process that elongates telomeres (and don’t die).

• Cells need blood. More cells need more blood.
• Tumors, which spread into new areas, need new blood vessels.

• Our cells aren’t designed to proliferate indefinitely, metastasize, divide whenever they want, and ignore extracellular signals.
• Checkpoints are in place that prevent all of the above by triggering cell suicide.
• These are lost in cancer.

So what is cancer?
Weinberg, Cell 2011

The “Pathway” view of the cell
• We depict proteins and processes as “pathways”.
How a cell achieves these phenotypes
• Different types of mutations (alterations) can alter pathway activity
  – Activate an “oncogene”
  – Inhibit a “tumor suppressor”
TCGA, Nature 2008

Point mutations
• A nucleotide change can:
  – Introduce an early stop codon, making the protein nonfunctional
  – Create a constitutively active protein

DNA Copy Number Alterations
• Chunks of the genome can be amplified
  – Leading to many copies of an oncogene
  – Which leads to overexpression of the gene
• Chunks can also be lost (deleted)
  – This is one mechanism for losing a tumor suppressor

Subtypes of cancer – By expression
• Different cancers, and even subtypes of cancer, have dramatically different gene expression patterns
• These represent cellular states
Sandhu, 2010

Cancer development
[Figure labels: genetic alterations; functional drivers]

Identifying significantly recurrent alterations across samples

The Cancer Genome Atlas (TCGA)
• Characterization of 20 cancers x 1000 tumors each
• Assays include:
  – How is the DNA changing: DNA sequencing (mostly exons), copy number variation
  – How is expression different: RNA-seq, miRNAs
  – Extras: methylation, clinical annotation
• https://tcga-data.nci.nih.gov/tcga/

Prevalence of alterations by type
[Figure: two histograms of alteration frequency]
• Sequence mutations: 6 alterations in > 5% of samples
• CN alterations: 87 alterations in > 5% of samples

Distinguishing drivers from passengers
• What aberrations make a cell go bad?
• Driver aberrations significantly recur across tumors

Breast Copy Number Profile

Breast Cancer Exome Sequencing
• Total mutations: 21713
• Per patient: 48

Two forces drive copy number
I. Selection of the fittest
II. DNA secondary structure and packing
Nowell, 1976
• Our ISAR algorithm tries to identify frequent alterations driven by fitness.

ISAR
• Significance of the number of alterations should be computed locally.
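To illustrate why a local null can matter, here is a minimal sketch. It is not the ISAR algorithm itself, only a toy comparison under stated assumptions: the genome is split into equal bins, each bin has a count of altered samples, and significance is a one-sided binomial tail. The function names (`binom_sf`, `local_pvalues`, `global_pvalues`), the neighbour-window definition of "local", and the example counts are all hypothetical.

```python
from math import comb


def binom_sf(k, n, p):
    """Upper tail P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def global_pvalues(counts, n_samples):
    """Test every bin against one genome-wide background alteration rate."""
    rate = sum(counts) / (len(counts) * n_samples)
    return [binom_sf(k, n_samples, rate) for k in counts]


def local_pvalues(counts, n_samples, half_window):
    """Test every bin against the mean rate of its neighbouring bins.

    Assumption: "local" means the bins within half_window on either side,
    excluding the bin under test. Needs at least two bins.
    """
    pvals = []
    for i, k in enumerate(counts):
        lo = max(0, i - half_window)
        hi = min(len(counts), i + half_window + 1)
        neighbours = counts[lo:i] + counts[i + 1:hi]
        if not neighbours:
            raise ValueError("need at least two bins to estimate a local rate")
        rate = sum(neighbours) / (len(neighbours) * n_samples)
        pvals.append(binom_sf(k, n_samples, rate))
    return pvals


if __name__ == "__main__":
    # Hypothetical data: a quiet stretch, then a broadly unstable stretch
    # (30 of 100 samples altered) with one sharper peak (45 of 100).
    counts = [2] * 10 + [30] * 5 + [45] + [30] * 5
    glob = global_pvalues(counts, n_samples=100)
    loc = local_pvalues(counts, n_samples=100, half_window=2)
    for i in (12, 15):
        print(f"bin {i}: count={counts[i]} global p={glob[i]:.3g} local p={loc[i]:.3g}")
```

In this toy data the genome-wide null flags every bin of the unstable stretch, while the local null flags only the peak that rises above its neighbourhood, which is the intuition behind computing significance locally.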
[Figure: P-value distribution; ~8 Mbp]

ISAR regions
            # regions   # genes per region   # genes per peak
ISAR        83          14                   -
GISTIC2     33          14.39                1.18
• A better null model helps sensitivity
• ~1200 genes in ISAR regions: we need to identify drivers within these regions.
• GISTIC2 narrows down regions to deterministic peaks containing 1.18 genes on average. Problem solved?

Defining peaks: cut-off
• 9 of the 33 GISTIC2 peaks do not contain a single gene

Helios approach
[Figure, two panels. Classic approach: genome tracks for Samples 1-4 give a deterministic 0/1 decision for each of GENE1-GENE5. Helios: features (sequence, copy number, expression, shRNA) are weighted and combined into an integrative score for each of GENE1-GENE5.]
• Primary tumor data (TCGA)
• Functional assays (RNAi screens)

Helios: Data Integration
• Primary tumor (many)
• Cell line (few)
• … A ranked and scored list of driver genes
• Making use of the large-scale functional screens that are quickly accumulating
• Best of both worlds: integrating primary tumor data with functional screens on cell lines

Features: Gene expression
• Is the gene expressed?
• Diploid vs. amplified
  [Figure: CCND1 expression by CCND1 copy number (AMP vs. WT)]
• Differentially expressed in subtypes
  [Figure: FOXA1 expression by subtype (basal vs. luminal)]

Features: Sequence mutations
• Driver genes may show a footprint of point mutations
• We use the p-value of the frequency of alteration calculated by MutSig (Banerji, Nature 2012)

Training data
[Figure labels: features; classifier; labels]
• Labels: list of drivers and passengers. Too small and biased!
• Make frequency of alteration the center of the system

PLX4720-Targeted Therapy

Proteins Form a Complex Network
Chandarlapaty et al. 2011

BRAF exists in a network
[Figure labels: crosstalk; feedback]

BRAF Networks Vary Across Genetic Backgrounds
• Drastically different genetic backgrounds

Our Aims
• Identify genetic determinants and master regulators of drug resistance
• Predict additional target pathways for combinatorial drug treatment

Heterogeneity within a tumor
• If even < 1% of cells evade therapy, the tumor will recur.
• The influence of this population on any bulk assay is negligible

Mass cytometry: A powerful new technology
[Figure labels: time of flight; mass spectrometer]
• We capture the level of 45 protein epitopes simultaneously in single cells
• For tens of thousands of cells

Mass cytometry
• How do we view > 30 dimensions?
• Every pair of parameters needs its own two-dimensional plot, so n parameters give n(n-1)/2 plots:
Parameters:   4    8    14    32
Plots:        6    28   91    496

Acknowledgements
• Felix Sanchez-Garcia
• Dylan Kotliar
• Uri David Akavia
• Jose Silva (CUMC)
• Junji Matsui
• Bo-Juen Chen
• El-ad David Amir
• Jacob Levine
• Smita Krishnaswamy
• Daniel Shenfeld
• Michelle Tadmor
• Garry Nolan (Stanford)
• Sean Bendall
• Erin Simonds
• Kara Davis