Multiscale informatics for understanding drug response Russ B. Altman, MD, PhD Departments of Bioengineering, Genetics, Medicine & (by courtesy) Computer Science Stanford University PharmGKB, http://www.pharmgkb.org/ Outline of comments today Our focus: drug action at molecular, cellular and organismal level (for pharmacogenetics, mechanism, repurposing, drug interactions) Our toolkit: informatics tools and datasets for discovery and for leveraging the investment. 1. Structural molecular data for drug repurposing 2. Expression data for understanding cellular responses to drugs 3. Text information for predicting drug interactions 4. Population and clinical data to discover drug interactions Future prospects for systems pharmacology. http://www.pharmgkb.org/ Summarizing clinical implications of molecular changes PMID: 20435227 Beta 2 Adrenergic Receptor, a G-protein coupled receptor (GPCR), frequent pharmacological target. Beta 2 Adrenergic Receptor in lipid bilayer with surrounding intraand extracellular aqueous solution. Molecular Data • Structural genomics initiative: great increase in “solved” 3D structures • Includes many complexes of proteins with small molecule ligands Can we identify use this information to predict new interactions between proteins and small molecule drugs? The FEATURE system for describing molecular environments Builds a statistical model of microenvironments in proteins, using occurrence of biophysical and biochemical properties. FEATURE1 Can detect remote similarities in environments from different proteins. Atom Type Atom Element Residue Name Residue Class Partial Charge Hydrophobicity Aromatic Etc. FEATURE microenvironments detect remote flexible FAD binding sites FEATURE “sees” similarities in diverse kinase structures (PIK3CG and SRC) = repurposing opportunity (Liu & Altman, PLoS Comp Bio, 2011). Validating kinase predictions from 40 tested predictions (11 positive & 29 negative) Pairs with high similarity Validated Highly interested Mammalian Liu & Altman, CPT: PSP, 2014 2 Moving from molecular to cellular data for drug action • Expression data universally available at Gene Expression Omnibus • Measures human gene expression in thousands of conditions, cell types • Recent examples of drug repurposing using this information. Can we understand drug action using this data? Independent Component Analysis (ICA) on all human GEO data (~9K arrays) component = hidden signal in data Independent Component Analysis Independent Component Analysis of GEO human data yields ~420 fundamental components that explain most variability in expression experiments. (Engreitz et al, J Biomed Inform. 2010) Analyzing response to a drug: gene expression in response to parthenolide • • • CD34+ cells 12 AML patients Samples before/after treatment Key components expressed at different levels in treated vs. untreated 373 = component with biggest difference Highly weighted Genes: ICOS GNA15 SOCS2 GCH1 ID3 NR4A3 PELI2 ID1 ZNF394 KHDRBS3 Diseases with high level of component 373: Pediatric T-ALL Childhood ALL AML ALLL Leukemia Inteferon treatment of Hep C Parthenolide effects NFKB signaling. Pathways suggests new diseases and new targets. Parthenolide alters expression of TNF, NFKB, kinases. Immediately suggests a way to decide if Parthenolide is appropriate for other cancers, based on their gene expression. Augmenting the network of molecular and cellular data with textual information. • PubMed now holds more than 20 million abstracts • Most biomedical knowledge stored in the published literature Can we have computers “read” PubMed abstracts and reason over them to predict new drug-drug interactions? Advances in natural language parsing enable high fidelity extraction of relations Sentence SYNTAX Word Part Of Speech The DT ABCB1 NN C3435T NN polymorphism NN influences VBZ methotrexate JJ sensitivity NN in IN rheumatoid JJ arthritis NN patients. NNS SEMANTICS Named Entity Recognition Dependency Graph Relation Extraction Relation Normalization ABCB1 polymorphism ABCB1_ Variant Genomic Variation influences AFFECTS relationship methotrexate sensitivity methotrexate_ DrugSensitivity Phenotype Dependency graph gene gene variant drug disease drug sensitivity disease population We can identify genes, drugs, phenotypes in text using these technologies.… an ontology to normalize Sentence SYNTAX Word Part Of Speech The DT ABCB1 NN C3435T NN polymorphism NN influences VBZ methotrexate JJ sensitivity NN in IN rheumatoid JJ arthritis NN patients. NNS SEMANTICS Named Entity Recognition Dependency Graph Relation Extraction Relation Normalization ABCB1 polymorphism ABCB1_ Variant Genomic Variation influences AFFECTS relationship methotrexate sensitivity methotrexate_ DrugSensitivity Phenotype gene gene variant drug drug sensitivity disease disease population Coulet, Shah, Garten, Musen, Altman, JBI, 2010. KEY: we map extracted entities to standard terminologies Sentence SYNTAX Word Part Of Speech The DT ABCB1 NN C3435T NN polymorphism NN influences VBZ methotrexate JJ sensitivity NN in IN rheumatoid JJ arthritis NN patients. NNS SEMANTICS Named Entity Recognition Dependency Graph Relation Extraction Relation Normalization ABCB1 polymorphism ABCB1_ Variant Genomic Variation influences AFFECTS relationship methotrexate sensitivity methotrexate_ DrugSensitivity Phenotype gene gene variant drug drug sensitivity disease disease population Coulet, Shah, Garten, Musen, Altman, JBI, 2010. Semantic network of 170,598 normalized relations from all PubMed abstracts. CYP3A4 ACE CYP2D 6 ABCB1 gene and verapamil drug We chain together two gene-drug relationships to create a drug-gene-drug relationship = potential drug interaction! Percha et al, Pacific Symposium on Biocomputing, 2012. Top predicted DDIs from text processing after training on known DDIs drug pair interact fluoxetine-diazepam tramadol-dextromethorphan venlafaxine-dextromethorphan naproxen-diazepam paroxetine-dextromethorphan metoprolol-dextromethorphan lisinopril-enalapril verapamil-omeprazole dextromethorphan-codeine omeprazole-naproxen sertraline-dextromethorphan omeprazole-diazepam verapamil-fluconazole verapamil-fexofenadine verapamil-atorvastatin naproxen-fluoxetine warfarin-glibenclamide warfarin-fluoxetine verapamil-carvedilol venlafaxine-tramadol verapamil-clopidogrel venlafaxine-paroxetine verapamil-simvastatin tramadol-paroxetine warfarin-ibuprofen omeprazole-dextromethorphan fluoxetine-dextromethorphan venlafaxine-metoprolol venlafaxine-sertraline tramadol-sertraline dextromethorphan-citalopram paroxetine-metoprolol tramadol-metoprolol verapamil-diazepam dextromethorphan-aripiprazole 1* 1 1 0 1 0 1-0 11* 1** 1 1 0 1 1* 1* 1 1 1 11 1 1 1 0 1 1* 1 1 1** 1 0 10 npaths avgvotes p.glm 975 536 528 806 489 490 986 469 440 911 357 872 252 248 296 626 221 272 200 145 184 132 263 133 172 150 160 132 97 97 104 126 132 300 132 59.68% 95.27% 94.99% 65.60% 95.32% 90.17% 37.46% 87.30% 86.98% 40.09% 95.02% 38.84% 97.34% 93.28% 88.32% 52.50% 92.29% 86.93% 92.99% 98.45% 94.40% 99.18% 86.04% 98.42% 94.41% 96.36% 95.16% 94.89% 97.97% 97.84% 96.35% 93.79% 93.13% 76.30% 92.67% 0.608 0.533 0.523 0.506 0.489 0.441 0.410 0.395 0.367 0.366 0.365 0.322 0.296 0.262 0.261 0.240 0.236 0.234 0.227 0.227 0.226 0.223 0.222 0.219 0.218 0.216 0.215 0.196 0.194 0.193 0.188 0.186 0.186 0.185 0.183 LAB 1 0 1 1 1 0 0 1 0 Metoprolol Class Brand Treats Effects Mech Dextromethorphan Beta blocker Class Non-opioid antitussive Brand Robitussin (+ tons of others) Treats Cough Effects Acts on central nervous system to elevate threshold for coughing Mech NMDA and glutamate antagonist Blocks dopamine reuptake site (?) Lopressor, Toprol High blood pressure Angina (chest pain) Heart attack (improves survival) Heart failure Relaxes blood vessels Slows heart rate Blocks beta receptors on sympathetic nerves Metoprolol Dextromethorphan metoprolol Drug isMetabolizedBy Gene CYP2D6… Metoprolol Dextromethorphan … CYP2D6 Gene metabolizes Drug Dextromethorphan DEEP DIVE for gene-drug-phenotype (Chris Re, CS) Using population-based and clinical data to generate hypotheses • FDA releases adverse event reports regularly, but very complex data. • Electronic Medical Records have information about patient drug exposures? Can we mine FDA and clinical EMR records to discover new drug-drug interactions? We built a statistical model able to recognize glucose-altering drugs based on their “adverse event signature” Adverse Events reported by FDA Diabetes Drugs Correlated Indications Fij = frequency of AEj for Drugi Fij = frequency of AEj for Indicationi The adverse event signature is able to recall glucose-altering drugs Model trained on single drug reports can be used to classify drug pairs. EXAMPLE: Glucose-homeostasis adverse effects Fij = frequency of AEj for Drugi Drugs Pairs Pairs of drugs that match signature: Paroxetine and Pravastatin each taken by approximately 15 million Americans. We estimate 500,000 to 800,000 take both. Pravastatin and paroxetine significantly increase blood glucose by 20 mg/dl Variable N Pravastatin and Paroxetine Combination 10 Demographics Age (mean ± SD) Gender (% Female) 59.9 ± 11.09 90.0 Race (% of group) White 50 African American 20 Hispanic 0 Other 30 Glucose (mg/dl mean ± SD) Baseline (base) 114.08 ± 14.79 After treatment(s) (post) 133.96 ± 19.54 paired t-test (t: post - base) Change (base to post) N patients with increase 0.020 19.88 ± 21.04 8 (80) Combination therapy increases glucose by 20 mg/dl; pravastatin and paroxetine effects alone are modest Site 2 (10 patients) Replication sites validate combination findings Vanderbilt (18 patients) Harvard Partners (106 patients) Combining all three sites Not a “class affect” of statins and SSRIs Effect larger in diabetics N = 177 Mice on two drugs also show increase (Tatonetti et al, Clin Pharm Ther. 2011) Web searches? Patient search for pravastatin & paroxetine and DM-related words more frequently. White et al, J Am Med Inform Assoc. 2013 May 1;20(3):4048 Conclusions • Biomedical informatics provides powerful tools for discovery at the molecular, cellular, and organismal levels. • They operate on both primary data and processed knowledge. • They can individually contribute to understanding drug action, effects and interactions. • The combination & integration of these methods is likely to be even more powerful going forward… Combining text mining + FDA data Orange highlighted predictions from text mining are also predictions in the FDA mining effort! drug pair interact fluoxetine-diazepam tramadol-dextromethorphan venlafaxine-dextromethorphan naproxen-diazepam paroxetine-dextromethorphan metoprolol-dextromethorphan lisinopril-enalapril verapamil-omeprazole dextromethorphan-codeine omeprazole-naproxen sertraline-dextromethorphan omeprazole-diazepam verapamil-fluconazole verapamil-fexofenadine verapamil-atorvastatin naproxen-fluoxetine warfarin-glibenclamide warfarin-fluoxetine verapamil-carvedilol venlafaxine-tramadol verapamil-clopidogrel venlafaxine-paroxetine verapamil-simvastatin tramadol-paroxetine warfarin-ibuprofen omeprazole-dextromethorphan fluoxetine-dextromethorphan venlafaxine-metoprolol venlafaxine-sertraline tramadol-sertraline dextromethorphan-citalopram 1* 1 1 0 1 0 1-0 11* 1** 1 1 0 1 1* 1* 1 1 1 11 1 1 1 0 1 1* 1 1 1** npaths avgvotes p.glm 975 536 528 806 489 490 986 469 440 911 357 872 252 248 296 626 221 272 200 145 184 132 263 133 172 150 160 132 97 97 104 59.68% 95.27% 94.99% 65.60% 95.32% 90.17% 37.46% 87.30% 86.98% 40.09% 95.02% 38.84% 97.34% 93.28% 88.32% 52.50% 92.29% 86.93% 92.99% 98.45% 94.40% 99.18% 86.04% 98.42% 94.41% 96.36% 95.16% 94.89% 97.97% 97.84% 96.35% 0.608 0.533 0.523 0.506 0.489 0.441 0.410 0.395 0.367 0.366 0.365 0.322 0.296 0.262 0.261 0.240 0.236 0.234 0.227 0.227 0.226 0.223 0.222 0.219 0.218 0.216 0.215 0.196 0.194 0.193 0.188 LAB 1 0 1 1 1 0 Combining molecular & cellular data Proteins with similar pockets can be associated with disease pathways for repurposing. Thus, the emerging network for drugs…. Cellular response & pathways Target structure & dynamics Drug recognition & binding Text mining of gene, drug, phenotype associations Clinical response datamining Population effect reporting Teri Klein Michelle Jesse Guy Roxana Li Mei Grace Tang Nick Tatonetti Joan Sara Caroline Beth Percha Mark Fen Tianyun Liu Ellie Blanca Jenn Jesse Engreitz Katrin Jenelle Yael Garten Konrad Ryan But the database biases can kill you. Corrects for *unmeasured* biasees Tatonetti et al, Science STM, 2012 Thanks to PharmGKB Team Teri Klein Michelle Carrillo Curators: Li Gong, Joan Hebert, Katrin Sangkuhl, Laura Hodges, Connie Oshiro, Caroline Thorn Developers: Mark Woon, Ryan Whaley, Feng Liu, Mei Gong, Rebecca Tang Data Center: Tina Zhou, TC Truong NIH: GM-61374 Key Papers PharmGKB: a logical home for knowledge relating genotype to drug response phenotype. Altman RB. Nat Genet. 2007 Apr;39(4):426. No abstract available. PMID: 17392795 [PubMed - indexed for MEDLINE] Independent component analysis: mining microarray data for fundamental human gene expression modules. Engreitz JM, Daigle BJ Jr, Marshall JJ, Altman RB. J Biomed Inform. 2010 Dec;43(6):932-44. Epub 2010 Jul 7. PMID: 20619355 Detecting drug interactions from adverse-event reports: interaction between paroxetine and pravastatin increases blood glucose levels. Tatonetti NP, Denny JC, Murphy SN, Fernald GH, Krishnan G, Castro V, Yue P, Tsao PS, Kohane I, Roden DM, Altman RB. Clin Pharmacol Ther. 2011 Jul;90(1):133-42. doi: 10.1038/clpt.2011.83. Epub 2011 May 25. Erratum in: Clin Pharmacol Ther. 2011 Sep;90(3):480. Tsau, P S [corrected to Tsao, P S]. PMID: 21613990 Using text to build semantic networks for pharmacogenomics.Coulet A, Shah NH, Garten Y, Musen M, Altman RB. J Biomed Inform. 2010 Dec;43(6):1009-19. Epub 2010 Aug 17. PMID: 20723615 [PubMed - indexed for MEDLINE] DISCOVERY AND EXPLANATION OF DRUG-DRUG INTERACTIONS VIA TEXT MINING. Percha B, Garten Y, Altman RB. Pacific Symposium on Biocomputing 2012, in press. Available at http://psb.stanford.edu/psbonline/ A novel signal detection algorithm for identifying hidden drug-drug interactions in adverse event reports. Tatonetti NP, Fernald GH, Altman RB. J Am Med Inform Assoc. 2011 Jun 14. [Epub ahead of print] PMID: 21676938 The FEATURE framework for protein function annotation: modeling new functions, improving performance, and extending to novel applications. Halperin I, Glazer DS, Wu S, Altman RB. BMC Genomics. 2008 Sep 16;9 Suppl 2:S2. PMID: 18831785