The Role of Biopathways in Drug Repositioning and Determining Side Effects
Philip E. Bourne
University of California San Diego
pbourne@ucsd.edu
Support Open Access
BioPathways 2008

What We Know & What We Don’t Know
proteome.sdsc.edu
• We know how to do functional annotation of proteins
• We know little about biopathways
• A side effect of our annotation work relates to drug repositioning
• That work highlights our need to explore pathways – this is what I hope to show today and perhaps get you interested

What Motivates Us
• The truth is we know very little about how the major drugs we take work – most drugs bind to a variety of targets with varying affinity
• We know even less about what side effects they might have
• Drug discovery seems to be approached in a very consistent and conventional way
• The cost of bringing a drug to market is ~$800M
• The cost of failure is even higher, e.g. Vioxx: $4.85Bn. Hence fail early and cheaply

What Has Evolution Taught Us?
• Global 3D similarity and sequence similarity do not tell the whole story
• Perhaps a ligand binding site is what has passed from generation to generation while virtually all other aspects of the protein have changed?

What Has Evolution Taught Us About Drug Discovery?
• If that were true and evolutionarily related ligand binding sites could be found, they presumably would exist across very diverse gene families
• From the perspective of drug discovery such sites would have significant implications

What if…
• We can characterize a protein-ligand binding site from a 3D structure (primary site) and search for that site on a proteome-wide scale?
• We could perhaps find alternative binding sites (off-targets) for existing pharmaceuticals?
• We could use it for lead optimization and possible ADME/Tox prediction

What Do Off-targets Tell Us?
• One of three things:
  1. Nothing
  2. A possible explanation for a side effect of a drug
  3. A possible repositioning of a drug to treat a completely different condition
Today I will give you examples of both 2 and 3 and illustrate how pathways come into play

Agenda
• Computational Methodology
• Side Effects - The Tamoxifen Story
• Repositioning an Existing Drug - The TB Story
• Salvaging $800M – The Torcetrapib Story
• The need to introduce pathway analysis

Need to Start with a 3D Drug-Receptor Complex - The PDB Contains Many Examples
Generic Name | Other Name | Treatment | PDBid
Lipitor | Atorvastatin | High cholesterol | 1HWK, 1HW8…
Testosterone | Testosterone | Osteoporosis | 1AFS, 1I9J ..
Taxol | Paclitaxel | Cancer | 1JFF, 2HXF, 2HXH
Viagra | Sildenafil citrate | ED, pulmonary arterial hypertension | 1TBF, 1UDT, 1XOS..
Digoxin | Lanoxin | Congestive heart failure | 1IGJ

A Reverse Engineering Approach to Drug Discovery Across Gene Families
• Characterize ligand binding site of primary target (Geometric Potential)
• Identify off-targets by ligand binding site similarity (Sequence order independent profile-profile alignment)
• Extract known drugs or inhibitors of the primary and/or off-targets
• Search for similar small molecules …
• Dock molecules to both primary and off-targets
• Statistical analysis of docking score correlations
Computational Methodology

Characterization of the Ligand Binding Site - The Geometric Potential
Conceptually similar to hydrophobicity or electrostatic potential that is dependent on both global and local environments
• Initially assign each Cα atom a value that is the distance to the environmental boundary
• Update the value with those of surrounding Cα atoms dependent on distances and orientation – atoms within a 10 Å radius define the neighbors i

$$GP = P + \sum_{i \in \text{neighbors}} \frac{P_i}{D_i + 1.0} \times \frac{\cos(\alpha_i) + 1.0}{2.0}$$

Xie and Bourne 2007 BMC Bioinformatics, 8(Suppl 4):S9

Discrimination Power of the Geometric Potential
• Geometric potential can distinguish binding and non-binding sites
[Figure: binding site and non-binding site distributions over the Geometric Potential Scale (0 to 100)]

Boundary Accuracy of Ligand Binding Site Prediction
[Figure: two histograms, Distribution (%) against Sensitivity (%) and Distribution (%) against Specificity (%)]
• ~90% of the binding sites can be identified with above 50% sensitivity
• For ~70% of the identified binding sites the specificity is above 90%

So Far…
• Geometric potential dependent on local environment of a residue – relative to other residues and the environmental boundary
• Geometric potential reasonably good at discriminating between ligand binding sites and non-ligand binding sites
• Boundary of the binding site reasonably well defined
• How to compare sites?

Local Sequence-order Independent Alignment with Maximum-Weight Sub-Graph Algorithm
[Figure: Structure A and Structure B, each showing the fragments LER and VKDL]
• Build an associated graph from the graph representations of the two structures being compared. Each node is assigned a weight from the similarity matrix
• The maximum-weight clique corresponds to the optimum alignment of the two structures
Xie and Bourne 2008 PNAS, 105(14) 5441

Similarity Matrix of Alignment
Chemical Similarity
• Amino acid grouping: (LVIMC), (AGSTP), (FYW), and (EDNQKRH)
• Amino acid chemical similarity matrix
Evolutionary Correlation
• Amino acid substitution matrix such as BLOSUM45
• Similarity score between two sequence profiles

$$d = \sum_i f_a^i S_b^i + \sum_i f_b^i S_a^i$$

f_a, f_b are the 20 amino acid target frequencies of profiles a and b, respectively
S_a, S_b are the PSSMs of profiles a and b, respectively

So What is the Potential of this Methodology?
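
The sequence-order independent alignment described in the methodology slides can be sketched in a few lines. This is a toy illustration under stated assumptions, not the published implementation: each site is a list of (residue letter, 3D coordinate) pairs, node weights use an assumed score built on the four amino acid groups quoted on the slide (2 for identical residues, 1 for the same group) rather than the profile-profile score, the 0.5 Å distance tolerance is arbitrary, and the maximum-weight clique is found by brute force, which is feasible only for a handful of residues. All function names and coordinates are hypothetical.

```python
from itertools import combinations
from math import dist

# Amino acid groups quoted on the similarity-matrix slide.
GROUPS = ["LVIMC", "AGSTP", "FYW", "EDNQKRH"]
GROUP_OF = {aa: g for g, members in enumerate(GROUPS) for aa in members}


def residue_similarity(aa_a, aa_b):
    """Assumed toy score: 2 if identical, 1 if same group, 0 otherwise."""
    if aa_a == aa_b:
        return 2
    return 1 if GROUP_OF[aa_a] == GROUP_OF[aa_b] else 0


def association_graph(site_a, site_b, tol=0.5):
    """Nodes pair similar residues; edges join pairs with matching distances."""
    nodes = []
    for i, (aa_a, _) in enumerate(site_a):
        for j, (aa_b, _) in enumerate(site_b):
            w = residue_similarity(aa_a, aa_b)
            if w > 0:
                nodes.append((i, j, w))
    edges = set()
    for p, q in combinations(range(len(nodes)), 2):
        (i1, j1, _), (i2, j2, _) = nodes[p], nodes[q]
        if i1 == i2 or j1 == j2:
            continue  # a residue may be aligned only once
        d_a = dist(site_a[i1][1], site_a[i2][1])
        d_b = dist(site_b[j1][1], site_b[j2][1])
        if abs(d_a - d_b) <= tol:
            edges.add((p, q))
    return nodes, edges


def max_weight_clique(nodes, edges):
    """Brute-force search; exponential, so only for tiny toy sites."""
    best, best_weight = (), 0
    for size in range(1, len(nodes) + 1):
        for subset in combinations(range(len(nodes)), size):
            if all(pair in edges for pair in combinations(subset, 2)):
                weight = sum(nodes[k][2] for k in subset)
                if weight > best_weight:
                    best, best_weight = subset, weight
    return [(nodes[k][0], nodes[k][1]) for k in best], best_weight


# Same three residues with the same geometry, listed in a different order.
site_a = [("L", (0.0, 0.0, 0.0)), ("E", (3.0, 0.0, 0.0)), ("R", (0.0, 4.0, 0.0))]
site_b = [("R", (0.0, 4.0, 0.0)), ("L", (0.0, 0.0, 0.0)), ("E", (3.0, 0.0, 0.0))]

nodes, edges = association_graph(site_a, site_b)
alignment, score = max_weight_clique(nodes, edges)
print(alignment, score)  # [(0, 1), (1, 2), (2, 0)] 6
```

The recovered alignment pairs L with L, E with E and R with R even though the residues appear in a different order in the two sites, which is the point of a sequence-order independent comparison. A realistic version would replace the toy score with the similarity matrix and use a dedicated clique solver.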
Finding Secondary Binding Sites for Major Pharmaceuticals
• Scan known binding sites for major pharmaceuticals bound to their receptors against the human and other “druggable” proteomes
• Try to correlate strong hits with known data from the literature, databases, clinical trials etc., and now pathways, to provide molecular evidence of secondary effects

Agenda
• Computational Methodology
• Repositioning an Existing Drug - The TB Story
• Side Effects - The Tamoxifen Story
• Salvaging $800M – The Torcetrapib Story
• The need to introduce pathway analysis

Tuberculosis (TB)
• One third of global population infected
• Kills 2 million people each year
• 95% of deaths in developing countries
• Anti-TB drugs hardly changed in 40 years
• MDR-TB and XDR-TB pose a threat to human health worldwide
• Development of novel, effective, and inexpensive drugs is an urgent priority
Repositioning an Existing Drug - The TB Story

Hypothesis Drawn from the Study of Evolution
• We were looking for connections (evolutionary linkages) across fold and functional space through an all-by-all comparison of ligand binding sites

Found..
• Evolutionary linkage between:
  – NAD-binding Rossmann fold
  – S-adenosylmethionine (SAM)-binding domain of SAM-dependent methyltransferases
• Catechol-O-methyl transferase (COMT) is a SAM-dependent methyltransferase
• Entacapone and tolcapone are used as COMT inhibitors in Parkinson’s disease treatment
• Hypothesis:
  – Further investigation of NAD-binding proteins may uncover a potential new drug target for entacapone and tolcapone

Functional Site Similarity between COMT and ENR
• Entacapone and tolcapone docked onto 215 NAD-binding proteins from different species
• M. tuberculosis enoyl-acyl carrier protein reductase ENR (InhA) discovered as potential new drug target
• ENR is the primary target of many existing anti-TB drugs but all are very toxic
• ENR catalyses the final, rate-determining step in the fatty acid elongation cycle
• Alignment of the COMT and ENR binding sites revealed similarities ...

Binding Site Similarity between COMT and ENR
[Figure: COMT with SAM (cofactor) and BIE (inhibitor); ENR with NAD (cofactor) and 641 (inhibitor)]

In Vivo Studies
• Quantitative and microplate assays of Mtb agree
• Entacapone: 80% growth inhibition with 62 µg/ml; 100% inhibition with 2x the dose
• Tolcapone: similar results
Courtesy Nancy Buchmeier

Summary of the TB Story
• Entacapone and tolcapone shown to have potential for repositioning
• Direct mechanism of action avoids M. tuberculosis resistance mechanisms
• Possess excellent safety profiles with few side effects – already on the market
• At least some in vivo support
• Assay of direct binding of entacapone and tolcapone to ENR under way

Selective Estrogen Receptor Modulators (SERM)
• One of the largest classes of drugs
• Breast cancer, osteoporosis, birth control etc.
• Amine and benzene moiety
Side Effects - The Tamoxifen Story
PLoS Comp. Biol., 2007 3(11) e217

Adverse Effects of SERMs
• cardiac abnormalities
• thromboembolic disorders
• loss of calcium homeostasis
• ?????
• ocular toxicities

Ligand Binding Site Similarity Search On a Proteome Scale
[Figure: density of binding site similarity scores (Score 0 to 80), with SERCA and ERα marked]
• Searching human proteins covering ~38% of the druggable genome against SERM binding site
• Matching Sarcoplasmic Reticulum (SR) Ca2+ ion channel ATPase (SERCA) TG1 inhibitor site
• ERα ranked top with p-value < 0.0001 from reversed search against SERCA

Structure and Function of SERCA
• Regulating cytosolic calcium levels in cardiac and skeletal muscle
• Cytosolic and transmembrane domains
• Predicted SERM binding site is located in the TM domain, inhibiting Ca2+ uptake

The Challenge
• Design modified SERMs that bind as strongly to estrogen receptors but do not have strong binding to SERCA, yet maintain other characteristics of the activity profile

Consider in any of these cases there are likely multiple secondary sites

Cholesteryl Ester Transfer Protein (CETP)
[Figure: CETP between LDL (bad cholesterol) and HDL (good cholesterol); the CETP inhibitor blocks (X) CETP]
• Collects triglycerides from very low density or low density lipoproteins (VLDL or LDL) and exchanges them for cholesteryl esters from high density lipoproteins (and vice versa)
• A long tunnel with two major binding sites.
  Docking studies suggest that it is possible that torcetrapib binds to both of them.
• The torcetrapib binding site is unknown. Docking studies show that both sites can bind to torcetrapib with a docking score around -11.0.

Docking Scores eHits/Autodock
Off-target | PDB Ids | Torcetrapib | Anacetrapib | JTT705 | Complex ligand
CETP | 2OBD | -11.675 / -5.72 | -11.375 / -8.15 | -7.563 / -6.65 | -8.324 (PCW)
Retinoid X receptor | 1YOW, 1ZDT | -11.420 / -6.600, -6.74 | -8.696 / -7.68, -7.35 | -6.276 / -7.28, -6.95 | -9.113 (POE)
PPAR delta | 1Y0S | -10.203 / -8.22 | -10.595 / -7.91 | -7.581 / -8.36 | -10.691 (331)
PPAR alpha | 2P54 | -11.036 / -6.67 | -0.835 / -7.27 | -9.599 / -7.78 | -11.404 (735)
PPAR gamma | 1ZEO | -9.515 / -7.31 | >0.0 / -8.25 | -7.204 / -8.11 | -8.075 (C01)
Vitamin D receptor | 1IE8 | >0.0 / -4.73 | >0.0 / -6.25 | -6.628 / -9.70 | -8.354 (KH1)
-7.35 Glucocorticoid Receptor 1NHZ 1P93 Fatty acid binding protein 2F73 2PY1 2NNQ >0.0/ -4.33 >0.0/-6.13 /-6.40 >0.0/ -7.81 >0.0/ -6.98 /-7.64 -7.191 / -8.49 /-6.33 /6.35 ???
T-Cell CD1B | 1GZP | -8.815 / -7.02 | -13.515 / -7.15 | -7.590 / -8.02 | -6.519 (GM2)
IL-10 receptor 1LQS / -4.59 / -6.77
GM-2 activator 2AG9 -9.345 / -6.26 -9.674 / -6.98 (3CA2+)
Cardiac troponin C 1DTL /-5.83 /-6.71 /-5.79
Cytochrome bc1 complex 1PP9 (PEG) /-6.97 /-9.07 /-6.64
1PP9 (HEM) /-7.21 /8.79 /-8.94
1V5H /-4.89 /-7.00 /-4.94
Human cytoglobin /-4.43 /-5.63 /-7.08 /-0.58 /-7.09 /-9.42 / -5.95 -8.617 / -6.17 ??? ??? (MYR) -4.16

EP distributions in binding pockets

[Diagram labels: JTT705, Torcetrapib, Anacetrapib; VDR, RAS, RXR, PPARα, PPARδ, PPARγ, FA, FABP; High blood pressure; Anti-inflammatory function; JNK/IKK pathway; JNK/NF-κB pathway; Immune response to infection]

Summary
• We have established a protocol to look for off-targets for existing therapeutics and NCEs
• Understanding these in the context of pathways would seem to be the next step towards a new understanding
• Lots of other opportunities to examine existing drugs
Bioinformatics

Final Examples..
• Donepezil for treating Alzheimer’s shows positive effects against other neurological disorders
• Orlistat used to treat obesity has proven effective against certain cancer types
• Ritonavir used to treat AIDS effective against TB
• Nelfinavir used to treat AIDS effective against different types of cancers

Acknowledgements
Eric Scheeff, Lei Xie, Li Xie, Jian Wang, Sarah Kinnings, Nancy Buchmeier

43,738 Human Proteins
• Map human proteins to drug targets with BLAST e-value < 0.001: 13,865 Human Proteins (2,002 Drug Targets)
• Map human proteins to PDB structures with >95% sequence identity: 3,158 Human Proteins (10,730 PDB Structures)
  – Remove redundant structures with 30% sequence identity: 2,586 PDB Structures
• Map drug targets to PDB structures: 1,585 PDB Structures (929 Drug Targets)
  – Cover 929/2,002 = 46.4% of drug targets structurally
  – Remove redundant structures with 30% sequence identity: 825 PDB Structures (druggable)

Lead Discovery from Fragment Assembly
• Privileged molecular moieties in medicinal chemistry
• Structural genomics and high throughput screening generate a large number of protein-fragment complexes
• Similar sub-site detection enhances the application of fragment assembly strategies in drug discovery
1HQC: Holliday junction migration motor protein from Thermus thermophilus
1ZEF: Rio1 atypical serine protein kinase from A. fulgidus

Lead Optimization from Conformational Constraints
• Same ligand can bind to different proteins, but with different conformations
• By recognizing the conformational changes in the binding site, it is possible to improve the binding specificity with conformational constraints placed on the ligand
1ECJ: amido-phosphoribosyltransferase from E. coli
1H3D: ATP-phosphoribosyltransferase from E. coli

Renin-angiotensin system (RAS)
[Diagram: Angiotensinogen, hydrolyzation (+ Renin), Angiotensin I, peptide cleavage (+ ACE), Angiotensin II; + High blood pressure; + Aldosterone secretion]

[Diagram labels: Anacetrapib, JTT705, Torcetrapib; GCR: excessive activation, inhibition of NF-κB, anti-cancer and anti-inflammatory, hypertension; Cytochrome bc1 complex: Q cycle, ATP generation, cell repair, cell death, ?, cardiac hypertrophy, hypertension]

[Diagram labels: Torcetrapib, Anacetrapib, JTT705; T-cell CD1B: CD1B + antigen, immune response to infection; Cardiac TnC: Ca2+, troponin conformation change, heart muscle contraction]
Summary

Estimated Capitalized Costs for New Chemical Entities (NCEs) Entering Each Phase
• Estimated costs for a drug withdrawal: ~60.0 million
• Phase III is most costly: fail fast, fail cheap
M. Dickson & J. P. Gagnon, Nature Reviews Drug Discovery 3 (2004) p417-429
www.pdb.org • info@rcsb.org

Implications on Drug Development
SERM | Affinity (ER Site) | Affinity (SERCA) | Affinity Difference
Bazedoxifene (BAZ) | -9.44 +/- 0.54 | -7.23 +/- 0.13 | 2.21
Lasofoxifene (LAS) | -8.66 +/- 0.40 | -6.54 +/- 0.20 | 2.12
Ormeloxifene (ORM) | -8.67 +/- 0.18 | -5.84 +/- 0.33 | 2.83
Raloxifene (RAL) | -8.08 +/- 0.64 | -5.78 +/- 0.23 | 2.30
4-hydroxytamoxifen (OHT) | -7.67 +/- 0.47 | -5.40 +/- 0.15 | 2.27
Tamoxifen (TAM) | -7.30 +/- 0.28 | -5.64 +/- 0.28 | 1.66
• Taking account of both target and off-target for lead optimization
• Drug delivery and administration regime

Improved Performance of Alignment Quality and Search Sensitivity and Specificity
[Figure, left: Frequency (%) against RMSD (Angstroms), bins <1.0 to <11.0. RMSD distribution of the aligned common fragments of ligands from 247 test cases showing four scores: amino acid grouping, chemical similarity, substitution matrix and profile-profile.]
[Figure, right: False Positive Ratio against True Positive Ratio (0 to 0.2) for the same four scores]

2D small molecule similarity between existing and potential ENR inhibitors
[Figure: density of Tanimoto coefficients for 2D similarity to entacapone and 2D similarity to tolcapone; marked ligands AYM (p=0.065) and ZAM (p=0.205)]

Docking existing and potential InhA inhibitors onto COMT and InhA
InhA inhibitor | Docking score with InhA | Docking score with COMT
468 | -6.57 +/- 1.27 | -4.42
566 | -6.24 +/- 0.92 | -3.96
641 | -6.00 +/- 1.51 | -5.92
665 | -5.18 +/- 0.72 | -4.20
744 | -6.07 +/- 1.28 | -5.47
5PP | -5.99 +/- 0.48 | -3.90
8PS | -6.51 +/- 0.95 | -4.04
GEQ | -6.29 +/- 1.61 | -4.45
Triclosan | -6.34 +/- 0.68 | -4.05
Entacapone | -4.91 +/- 0.97 | -4.49
Entacapone (N5) | -5.25 +/- 0.93 | -4.02
Entacapone (N6) | -5.10 +/- 1.03 | -3.81
Tolcapone | -5.85 +/- 0.74 | -4.68

Correlation of binding affinity profiles between COMT and InhA
• Tolcapone-like molecules
[Figure: ENR/InhA docking score and control docking score against COMT docking score; tolcapone-like molecules (R=0.35) and (R=0.36)]

Binding pose analysis of potential InhA inhibitors with InhA
[Figure labels: Asp110, Asp115, Glu210; distances 15.25 Å, 14.53 Å, 11.54 Å]

Comparison of surface electrostatic potential between COMT and InhA functional sites
• Electrostatic potentials of COMT and InhA calculated using APBS
• Predicted binding poses of entacapone and tolcapone inserted into proteins
• Qualitative similarities between COMT and InhA functional sites observed
• In both cases, nitro groups of entacapone and tolcapone associated with positively charged region of active site
[Figure: tolcapone and entacapone poses in the COMT and InhA functional sites]

Advantage to Using Ligand Site Similarity
Small Molecule Similarity
• Poor correlation between structure and activity
• Infinite chemical space
Protein Sequence/Structure Similarity
• Not adequately reflecting functional relationship
• Not directly addressing drug design problem
Protein Functional Site Similarity
• Build closer structure-function relationships
• Limit chemical space through co-evolution

Correlation of Binding Affinity Profiles between COMT and ENR
• Entacapone-like molecules
[Figure: ENR docking score and control docking score against COMT docking score, with linear regression; entacapone-like molecules (R=0.38) and (R=0.42)]
2 identical sites
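
The correlation of docking scores between two sites, reported above as R values from linear regression, can be sketched as follows. This is a minimal illustration, assuming a plain Pearson correlation and an ordinary least-squares line; the slides do not state exactly how R was computed. The input pairs are the mean InhA and COMT docking scores from the inhibitor docking table in these slides, not the entacapone-like or tolcapone-like molecule sets behind the reported R values, so the output is not expected to reproduce them. The function and variable names are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def least_squares_line(xs, ys):
    """Slope and intercept of the ordinary least-squares fit y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


# (compound, mean InhA docking score, COMT docking score) from the docking table.
SCORES = [
    ("468", -6.57, -4.42), ("566", -6.24, -3.96), ("641", -6.00, -5.92),
    ("665", -5.18, -4.20), ("744", -6.07, -5.47), ("5PP", -5.99, -3.90),
    ("8PS", -6.51, -4.04), ("GEQ", -6.29, -4.45), ("Triclosan", -6.34, -4.05),
    ("Entacapone", -4.91, -4.49), ("Entacapone (N5)", -5.25, -4.02),
    ("Entacapone (N6)", -5.10, -3.81), ("Tolcapone", -5.85, -4.68),
]

comt = [c for _, _, c in SCORES]
inha = [i for _, i, _ in SCORES]

r = pearson_r(comt, inha)
slope, intercept = least_squares_line(comt, inha)
print(f"R = {r:.2f}; InhA ~ {slope:.2f} * COMT + {intercept:.2f}")
```

A control set of unrelated molecules docked to both sites, as in the figures above, would be run through the same two functions so that the R values can be compared.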