Integrative Pathways to Maximally Predict Cancer Outcomes Josh Stuart Center for Biomolecular Science and Engineering, UC Santa Cruz Disclosure 1 Five3 Genomics LLC, Scientific Advisory Board Disclosure 2: Postdoc Positions Available https://genome-cancer.ucsc.edu Jing Zhu and UCSC cancer group UCSC Cancer Genomics Browser genome-cancer.ucsc.edu TCGA The Cancer Genome Atlas https://tcga1.cse.ucsc.edu I-SPY TRIAL SU2C Dream Team Breast Cancer Clinical Trial Stand Up to Cancer https://instinct1.cse.ucsc.edu https://genome-cancersu2c.cse.ucsc.edu UCSC Cancer Genomics Browser Open Source: First public deposit Oct. 2007 Public access portal: All major published studies Spin-off Browsers Immuno Browser Stem Cell Browser UCSC Cancer Genomics Browser UCSC Cancer Genome Heatmaps Breast Cancer Amplified / Deleted Regions Grade ER Survival Samples TCGA Brain Cancer Amplified / Deleted Regions Clinical Data Tumor Samples Normal Genome Position Jing Zhu and UCSC cancer group UCSC Cancer Genomics Browser TCGA ovarian Jing Zhu and UCSC cancerdata group UCSC Cancer Genomics Browser Click to sort according to clinical information TCGA ovarian Jing Zhu and UCSC cancerdata group Perform Robust Statistics across the Entire Genome, Identifying Genomic Regions to Study Further Breast Cancer Cell Lines, Amplified / Deleted Genomic Regions Low Genomic Data High Genome-wide statistics UCSC Genome Browser Quickly identify genes significantly amplified in samples with low TP53 expression, a critical tumor suppressor Clinical Data TCGA Breast Copy number variations Summary Views Heatmap View - Amplified / Deleted Regions Box Plot Summary View Proportional Summary View Similarities Shared by Multiple Cancers Pancreatic Copy Number (Jones 2008) Glioblastoma Multiforme Copy Number (TCGA 2009) Melanoma Copy Number (Lin 2008) CDKN2A/B Homozygous Deletion Jing Zhu and UCSC cancer group Statistical Comparison of Two Subsets of Samples t test stats Perform Statistics Select features Define subgroups Jing Zhu and UCSC cancer group Identify genetic markers dynamically Copy Number Data Gene Expression Data Co-amplified region ERBB2 locus Jing Zhu and UCSC cancer group Signatures now Implemented in the Browser C. Enter Signature (math formula) A. select dataset B. Signature Signature Demo E. Real-time Signature Score correlates with pathCR, ER, Her2, or prediction score in publication #ChemoTFAC30+ E2F3+ MELK+ RRM2+ BTG3- CTNND2- GAMT- METRN- ERBB4- ZNF552- CA12- KDM4B- NKAIN1- SCUBE2- KIAA1467- MAPT- FLJ10916- C. Signature Score D. Click to sort A. Enter Signature B. Name Signature, Press “Validate & Add” Button Jing Zhu and UCSC cancer group Pull up the signature genes as a gene set C. update #ChemoTFAC30+ E2F3+ MELK+ RRM2+ BTG3- CTNND2- GAMT- A. enter gene set ChemoTFAC30 ... B. save user-defined geneset Jing Zhu and UCSC cancer group Compare Multiple Signatures #Oncotype_DX#Pro #ChemoTFAC30+ E2F3+ MELK+ RRM2+ BTG3- CTNND2- GAMT- METRN- ERBB4- ZNF552- CA12- KDM4B- NKAIN1- SCUBE2- KIAA1467- MAPT- FLJ10916- BECN1- RAMP1- GFRA1- IGFBP4- FG TFAC30 OncotypeDX-like Pathways-based organization, GBM data somatic mutations germline mutations Normal Tumor Copy number variation stats Jing Zhu and cancer group Analysis like investigating a plane crash Patient Sample 1 Patient Sample 2 Patient Sample 3 Patient Sample N … Pathways as genetic unit p53 altered in 87% cases RAS altered in 88% cases RB altered in 78% cases 3The Cancer Genome Atlas, Nature, 2008 Main Approach: Detailed models of gene expression and interaction MDM2 TP53 Main Approach: Detailed models of expression and interaction Two Parts: MDM2 TP53 1. Gene Level Model (central dogma) 2. Interaction Model (regulation) Patient-specific alterations for known pathways Collect publicly available pathways (NCI, WikiPathways, KEGG, …) Convert to graphical model Infer pathway “levels” for each “concept in each patient sample concepts: gene activities, apoptosis, DNA repair, small molecules, complexes, etc. Train classifiers w/ inferred pathway levels E.g. 3-year disease-free survival Pathway Recognition Algorithm Using Data Integration on Genomic Models (PARADIGM) Gene Charlie Vaske, Steve Benz PARADIGM Gene Model to Integrate Data 3-state discrete variables relative to non-cancer, is this sample: up, same, down? Charlie Vaske, Steve Benz PARADIGM Gene-level Model relative to non-cancer, is this sample: up, same, down? 3-state discrete variables CNA \ Exp Down Same Up Down Same Up 0.90 0.05 0.01 0.09 0.90 0.09 0.01 0.05 0.90 PARADIGM Gene Model to Integrate Data Charlie Vaske, Steve Benz PARADIGM Gene Model to Integrate Data Charlie Vaske, Steve Benz PARDIGM Gene Model to Integrate Data Charlie Vaske, Steve Benz Interactions Matter Apoptosis Apoptosis Given information about the expression of TP53 alone Reasoning predicts apoptosis is in tact in these cells. Charlie Vaske, Steve Benz Interactions Matter Given the interaction and data about MDM2. apoptosis inference reversed Quantitative Output Log likelihood Ratio: log log odds of state and data prior log odds P Data | Apoptosis active, P Data, Apoptosis active | P Apoptosis active | log log P Data | Apoptosis active, = P Data, Apoptosis active | P Apoptosis active | Charlie Vaske, Steve Benz PARADIGM Interaction Components Transcriptio n Factor Kinase Target Gene Transcription al Regulation Phosphorylated gene Post-translational Modification Charlie Vaske, Steve Benz PARADIGM Interaction Components Subunit A Subunit BSubunit C Protein Complex Gene A Gene B Gene C Gene family, proteins with interchangeable function Charlie Vaske, Steve Benz PARADIGM Interaction Components Subunit A Subunit BSubunit C Noisy AND function Protein Complex Gene A Gene B Gene C Noisy OR function Gene family, proteins with interchangeable function Charlie Vaske, Steve Benz Pathway Interpretation of Omics Data PARADIGM Pathway Analysis CNA Expression p53 pathway FoxM1 network Angiopoietin receptor Tie2-mediated signaling ~100 Pathways 316 TCGA Ovarian Samples Tumor Samples Concepts Per-sample integrated pathway levels Sample 1 Sample 2 Sample 3 TCGA Ovarian Cancer Inferred Pathway Activities Pathway Concepts (867) Patient Samples (247) Ovarian: FOXM1 pathway altered in majority of serous ovarian tumors Patient Samples (247) Pathway Concepts (867) FOXM1 Transcription Network Patient Samples Expression Copy Number FOXM1 Amplification Leads to Secondary Responses Predicted by its Pathway Interactions Ovarian: CircleMap of FOXM1 Transcription Factor and Targets FOXM1 Expression vs. Fallopian Normal Copy Number Variation Each spoke of the circle represents a single sample Ovarian: FOXM1 central to DNA repair and cell proliferation Ovarian: IPLs statify by survival time Charlie Vaske, Steve Benz Ovarian: IPLs improve the accuracy of platinum response predictors SuperPathway based Predictors Merge constituent pathway models into a single superimposed pathway (SuperPathway) Single interpretation for every concept Score concepts according to whether they are predictive of an outcome Identifies Pathway Signatures of subtypes Pathway Signatures: Differential Subnetworks from a “SuperPathway” Pathway Activities Pathway Activities Pathway Signatures: Differential Subnetworks from a “SuperPathway” Pathway Activities Pathway Activities Pathway Signatures: Differential Subnetworks from a “SuperPathway” SuperPathway Activities SuperPathway Activities Pathway Signature Test Approach in Breast Cancer Cell Lines Basal cells more aggressive, stem cell like What pathways underlie this subclass of cancer? ~50 breast cancer cell lines of various subtypes Copy number and gene expression for all cells Do the pathway fingerprints of basal breast cancer give clues about drug treatment? Basal Pathway Markers Basal Pathway Markers: Bird’s eye view Each edge is supported by at least one publication 965 nodes 941 edges Blue down Red up – Infer activities in a global “super pathway” – Associate activities with drug sensitivity to find activity markers – Search for focused subnetworks with interconnected markers – Activity nodes linked to therapeutic inhibitors Dominant features: FOXA1 / FOXA2 Network • Controls transcription of ERregulated genes (1,2) • Expression associated with good prognosis (1) and luminal breast cancers (2) CHROMATIN REMODELING Inhibitor Up/ON Down/OF F 1. Eur J Cancer. 2008 Jul;44(11):1541-51 2. Clin Cancer Res. 2007 Aug 1;13(15 Pt 1):4415-21. Note PARADIGM can easily incorporate drugs into the network -allows inference -drug’s effect Dominant features: MYB/MYC/MAX Network 1. MYB is required for proliferation of ER-breast tumor cell lines 2. MYB influences angiogenesis, proliferation and apoptosis 3. MYC is known oncogene MYC/MAX COMPLEX 1. 2. Expert Opin Biol Ther. 2008 Jun;8(6):713-7. Expert Opin Ther Targets. 2003 Apr;7(2):235-48. Dominant features: ERK1/2 Network • RAF/MEK/ERK pathway frequently deregulated in breast cancers • Influences several downstream processes, including: cell cycle, adhesion, invasion, macrophage activation Networks can predict response to treatment: FOXM1/PLK/DNA Network • DNA damage network is upregulated in basal breast cancers • Basal breast cancers are sensitive to PLK inhibitors GSK-PLKi Luminal Claudinlow Basal Up Down Networks can predict response to treatment: HDAC Network • HDAC Network is downregulated in basal breast cancer cell lines • Basal/CL breast cancers are resistant to HDAC inhibitors HDAC inhibitor VORINOSTAT Network can suggest new biology • Integrins are upregulated in basal breast cancers • beta2 integrins associated with macrophage binding and stimulation • alphaMbeta2 integrin = macrophage 1 antigen = Mac-1 antigen LIGANDPHAGOCYTOSIS BINDINGTRIGGERED BY IMMUNE RESPONSE αM/β2 INTEGRIN ACTIVATION OF CASPASE ACTIVITY SIMVISTATIN (ZOCOR) Summary Top down merits: Pathway interpretation of integrated cancer data powerful for pinpointing coherent themes in tumor samples. Pathways provide robust features for sub-typing and predictor building. Local models reveal known mechanisms underlying breast cancer subtypes and possibly drug response. Open Problems (Toward Smoother Flying) Dynamically build discriminative pathways for outcome prediction (e.g. survival, treatment response)? Reconstruct “event histories” of cancer cell transformations? Simulate networks to predict points of attack? Combine top-down with bottom-up approaches: E.g. Dynamically build discriminative pathways for outcome prediction (e.g. survival, treatment response (Finding biologists who will help interpret nets!) Patient-specific Prediction of Therapeutic Targets Mechanism Proxy X X Pathways provide a proxy for mechanism Explore knock-out effects in silico Idea: Rank genes according to their KO benefit Given data for a patient Each gene’s KO changes the patients molecular signature Measure if signature becomes more like those associated with good prognosis Pathway Models as Predictors (analogy to protein homology detection) • Protein homology • Set of sequences with shared property (bind same DNA motif, homologous, etc) • PFAM database stores HMMs; each recognize a different protein domain. • Annotate new sequences by running all HMMs in DB • Patient clinical outcomes • • • Set of samples with shared property (drug response, survival, tissue of origin) Train pathway models to “recognize” each clinicallyrelevant class Analysis of new patient using all pathway models. UCSC Cancer Integration Group Steve Benz David Haussler Jing Zhu James Durbin Chris Szeto Larry Meyer Charlie Vaske Sam Ng Zack Sanborn Amie Radenbaugh Mia Grifford Ted Golstein Cancer Engineering Team Jing Zhu, Director jzhu@soe.ucsc.edu Brian Craft craft@soe.ucsc.edu Kyle Ellrott kellrott@soe.ucsc.edu Kord Kober kord@soe.ucsc.edu Acknowledgments • • • • • • • • • • • • • • • UCSC Cancer Genomics Steve Benz Sam Ng Ted Goldstein Charles Vaske Zack Sanborn Jingchun Zhu Larry Meyer Christopher Szeto James Durbin Tracy Ballinger Daniel Zerbino Mia Grifford Fan Hsu Sofie Salama • UCSC Genome Browser Staff • Cluster Admins • • • • • • • • • Collaborators The Cancer Genome Atlas Stand Up To Cancer Christopher Benz, Buck Institute Laura Esserman, UCSF Joe Gray, LBL Laura Heiser, LBL Eric Collisson, UCSF Ting Wang, Washington University • • • • • • Funding Agencies NCI/NIH NHGRI American Association for Cancer Research (AACR) UCSF Comprehensive Cancer Center California Institute for Quantitative Biosciences (QB3)