Adventures in Computational Enzymology
John Mitchell

MACiE Database
Mechanism, Annotation and Classification in Enzymes.
http://www.ebi.ac.uk/thornton-srv/databases/MACiE/
G.L. Holliday et al., Nucl. Acids Res., 35, D515-D520 (2007)

Enzyme Nomenclature and Classification

EC Classification
Class, Subclass, Sub-subclass, Serial number
Chemical reaction
Enzyme Commission (EC) Nomenclature, 1992, Academic Press, San Diego, 6th Edition

The EC Classification
• Only deals with overall reaction.
• Reaction direction arbitrary.
• Doesn’t deal with structural and sequence information.
• Thus, cofactors and active site residues ignored.
• However, it was never intended to describe mechanism.

A New Representation of Enzyme Reactions?
• Should be complementary to, but distinct from, the EC system.
• Should take into account: Reaction Mechanism; Structure; Sequence.
• Need a database of enzyme mechanisms.

Coverage of MACiE
Representative – based on a non-homologous dataset, and chosen to represent each available EC sub-subclass.

EC level       Structures exist for   MACiE covers
EC 1.-.-.-     6                      6
EC 1.2.-.-     56                     53
EC 1.2.3.-     184                    156
EC 1.2.3.4     1312                   199

Repertoire of Enzyme Catalysis
G.L. Holliday et al., J. Molec.
Biol., 372, 1261-1277 (2007)

Repertoire of Enzyme Catalysis
[Figure: bar chart of the number of steps in MACiE by reaction type (heterolytic and homolytic elimination; electrophilic, nucleophilic and homolytic addition; electrophilic, nucleophilic and homolytic substitution), each split into unimolecular, bimolecular and intramolecular steps.]
Enzyme chemistry is largely nucleophilic.

Repertoire of Enzyme Catalysis
[Figure: bar chart of the number of steps in MACiE by reaction type: proton transfer, AdN2, E1, SN2, E2, radical reaction, tautomerisation, others.]

Residue Catalytic Propensities

Evolution of Enzyme Function
D.E. Almonacid et al., to be published

Domains
• Work with domains - evolutionary & structural units of proteins.
• Map enzyme catalytic mechanisms to domains to quantify convergent and divergent functional evolution of enzymes.

Functional Classification: EC
Chemical reaction
Enzyme Commission (EC) Nomenclature, 1992, Academic Press, San Diego, 6th Edition

Enzyme Catalysis Databases
• MACiE: G.L. Holliday et al., Nucleic Acids Res., 35, D515 (2007)
• SFLD: S.C. Pegg et al., Biochemistry, 45, 2545 (2006)
• EzCatDB: N. Nagano, Nucleic Acids Res., 33, D407 (2005)

Coverage of MACiE
Representative – based on a non-homologous dataset, and chosen to represent each available EC sub-subclass.

Coverage of SFLD
Based on a few evolutionarily related families.

Coverage of EzCatDB
But without mechanisms.

Structural Classification: CATH
Orengo, C. A., et al. Structure, 1997, 5, 1093

Dataset
To avoid the ambiguity of multi-domain structures we use only single-domain proteins.
Dataset                 Database entries   EC sub-subclasses   EC serial numbers
CATH (single-domain)    395                114                 326
Enzymes in PDB          >>799              184                 1312

Results: Convergent Evolution
Numbers of CATH code occurrences per EC number (CATH/EC reaction)

CATH level   c.-.-.-   c.s.-.-   c.s.ss.-   c.s.ss.sn
C            3.17      1.73      1.38       1.11
A            11.00     3.27      1.93       1.60
T            28.00     4.89      2.24       1.19
H            38.33     5.80      2.46       1.22

An average reaction has evolved independently in 2.46 superfamilies (H level, c.s.ss.-).

Results: Divergent Evolution
EC reactions/CATH

CATH level   c.-.-.-   c.s.-.-   c.s.ss.-   c.s.ss.sn
C            4.75      19.50     39.25      90.00
A            3.14      7.00      10.48      17.90
T            1.36      1.79      2.08       3.05
H            1.20      1.36      1.46       2.05

An average superfamily has evolved 1.46 different reactions (H level, c.s.ss.-).
Database entries/CATH: 2.18

Density Functional Theory Calculations on Dehydroquinase
Mattias Blomberg et al., to be published

DFT – System Size
• System sizes of ~100-150 atoms can be treated using DFT.
• That raises the question of how to treat the rest of the protein.

Dielectric Continuum or QM/MM?
• One approach is to cut out the active site residues and treat the rest of the protein as a dielectric continuum.
• Another approach is to treat the active site as QM and the rest of the protein using MM.
[Figure: a QM region surrounded by a dielectric continuum (ε=4), and a QM region surrounded by an MM region.]
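To give a rough feel for what a continuum with ε=4 contributes, the sketch below evaluates the Born model, the textbook special case of a dielectric continuum in which a point charge sits at the centre of a spherical cavity. This is only an illustration of the idea and not the cavity model used in the DFT calculations, which the slides do not specify; the function name, the unit charge and the 2 Å cavity radius are hypothetical choices, and 332.06 is the usual Coulomb conversion factor for kcal/mol with charges in e and distances in Å.

```python
# Illustrative sketch only: Born solvation energy of a spherical ion.
# The real active-site models use a molecule-shaped cavity; this is the
# simplest closed-form instance of a dielectric continuum.

COULOMB_KCAL = 332.06  # kcal*Angstrom/(mol*e^2), standard conversion factor


def born_solvation_energy(charge, radius_angstrom, epsilon=4.0):
    """Return the Born solvation free energy in kcal/mol.

    charge          -- net charge in units of e (hypothetical input)
    radius_angstrom -- cavity radius in Angstrom (hypothetical input)
    epsilon         -- dielectric constant of the surrounding continuum
    """
    if radius_angstrom <= 0:
        raise ValueError("cavity radius must be positive")
    if epsilon < 1:
        raise ValueError("dielectric constant must be at least 1")
    return -0.5 * COULOMB_KCAL * charge ** 2 / radius_angstrom * (1.0 - 1.0 / epsilon)


if __name__ == "__main__":
    # Compare a protein-like continuum (epsilon = 4) with a water-like one (80).
    for eps in (1.0, 4.0, 80.0):
        print(eps, born_solvation_energy(1.0, 2.0, epsilon=eps))
```

In this model the factor (1 - 1/ε) is already 0.75 at ε=4, so most of the limiting continuum stabilisation is captured at a low dielectric constant.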
Dehydroquinase - Part of the Shikimate Pathway
[Figure: Shikimate & Chorismate Pathways; Dehydroquinase (Shikimate Pathway).]

Shikimate & Chorismate Pathways
• Biosynthetic pathway for phenylalanine, tyrosine and tryptophan.
• Present in plants, microorganisms and fungi but not in mammals.
• The target for glyphosate, an important herbicide.
• Understanding the mechanisms and developing inhibitors is of great importance for the development of new herbicides, fungicides and antibiotics.

Two Types of Dehydroquinases
• Type I: E. coli and S. typhi (EC 4.2.1.10). MACiE M0054. Mechanism: cis-dehydration, imine intermediate.
• Type II: S. coelicolor, M. tuberculosis and H. pylori (EC 4.2.1.10). MACiE M0055. Mechanism: trans-dehydration, enol(ate) intermediate.

Proposed Mechanism of DHQase
[Figure: reaction scheme for the proposed mechanism; active-site residues labelled Pro15, Asn16, Tyr28, Asn79, Ala82, His106 and Arg113.]

Models of DHQase Active Site

Energetics of DHQase Model A

Does Asn16 Protonate the DHQ Enolate?

Other Things we do
Chemoinformatics for pharmaceutical design …
… using Machine Learning for prediction of solubility, bioavailability and bioactivity.

Machine Learning Methods
• Recognise patterns in data
• Similar inputs, similar outputs
• Make full use of all available information
• One application is solubility

Machine Learning Methods
• Can be used for Classification or for Regression
• Can be used with chemoinformatics, physicochemical or experimental (e.g., assay) data as descriptors

Solubility is an important issue in drug discovery and a major source of attrition.
This is expensive for the industry.
A good model for predicting the solubility of druglike molecules would be very valuable.
Drug Disc. Today, 10 (4), 289 (2005)

Random Forest
Machine Learning Method

k-Nearest Neighbours
Machine Learning Method

Winnow (“Molecular Spam Filter”)
Machine Learning Method

Future Directions

Current coverage of MACiE
Representative – based on a non-homologous dataset

Future coverage of MACiE
Adding homologues – to facilitate study of divergent evolution

Divergent Evolution using MACiE
This will use our reaction similarity work to measure changes in chemistry

Using Machine Learning Methods to calculate and predict protein-ligand binding energies
Building on our previous work …
P.M. Marsden et al., Org. Biomol. Chem., 2, 3267 (2004)

Computational Toxicology
Predicting bioavailability problems, off-target activities and side effects of drug candidates

QM, QM/MM and MD Simulation Work
• Using computational chemistry to study enzyme mechanisms
Fosfomycin Resistance Protein A

ACKNOWLEDGEMENTS
Dr Gemma Holliday
Dr Daniel Almonacid
Dr Noel O’Boyle
Dr Mattias Blomberg
Prof. Janet Thornton (EBI)
Dr Peter Murray-Rust
Dr Jochen Blumberger
Cambridge Overseas Trust

All slides after here are for information only

Similarity of Enzyme Mechanisms
N.M. O'Boyle, et al., J. Molec. Biol., 368, 1484-1499 (2007)

Measuring Similarity of Enzyme Mechanisms

Repertoire of enzyme catalysis
Elimination
• Heterolytic: Unimolecular, Bimolecular, Intramolecular
• Homolytic: Unimolecular, Bimolecular, Intramolecular
Addition
• Electrophilic: Bimolecular, Intramolecular
• Nucleophilic: Bimolecular, Intramolecular
• Homolytic: Bimolecular, Intramolecular
Substitution
• Electrophilic: Unimolecular, Bimolecular, Intramolecular
• Nucleophilic: Unimolecular, Bimolecular, Intramolecular
• Homolytic: Unimolecular, Bimolecular, Intramolecular
Ingold, C. K. Cornell University Press, 1969.
Repertoire of enzyme catalysis
“Other reactions” and Named organic reactions currently supported in MACiE
______________________________________________
Aldol Condensation       Hydride Transfer
Amadori Rearrangement    Isomerisation
A-SN1                    Michael Addition
A-SN2                    Nucleophilic Attack
A-SNi                    Pericyclic Reaction
Claisen Rearrangement    Proton Transfer
Condensation             Radical Formation
E1cb                     Radical Propagation
Group Transfer           Radical Termination
Heterolysis              Redox
Homolysis                Tautomerisation
______________________________________________

Function of catalytic residues
Functionality for amino acids currently supported in MACiE
________________________________________________
Activating residue        Proton acceptor
Charge destabiliser       Proton donor
Charge stabiliser         Proton relay
Covalently attached       Radical acceptor
Electrophile              Radical donor
Hydride relay             Radical relay
Hydrogen bond acceptor    Radical stabiliser
Hydrogen bond donor       Spectator
Leaving group             Steric hindrance
Metal ligand              Unknown function
Nucleophile               Unspecified steric role
________________________________________________

CMLReact
• Customisable mark-up language
• Allows validation
• Uses dictionary technology
• Separates content from presentation
• Open Source
• BUT still under development

An Overview of MACiE and CMLReact

Energetics of DHQase Model A
[Figure: energy profile; labels: TS1 - Proton Transfer, TS2 - Dehydration, 69/41.]
Mattias Blomberg

Model C

Models A, B & C
[Figure: panels labelled Model A, Model B and Model C.]

MD and QM/MM Calculations on Fosfomycin Resistance Protein A

Fosfomycin Resistance Protein A

Fosfomycin Resistance Proteins
• Fosfomycin inhibits the first step in bacterial cell-wall synthesis (MurA).
• FosA is a Mn(II)-dependent soluble glutathione (GSH) transferase.
• FosA homologues in pathogenic bacteria: FosB and FosX.
Impact on Pathogens
• Low toxicity and broad-spectrum activity have resulted in increased clinical use of fosfomycin.
• Fosfomycin is most commonly used to treat lower urinary tract infections.
• Fosfomycin alone or in combination with other drugs could also be useful against resistant Staphylococci and E. coli, which can cause serious infections in hospitalized patients (pneumonia, urinary tract infections, skin infections and bacteraemia).

Proposed Mechanism
• Mutations of Lys90, Tyr100 and Arg119 have a large effect on the turnover of the enzyme. These residues are all involved in the stabilization of the phosphonate group (Beharry et al., J. Biol. Chem., 2005, 17786).
• Recent docking and mutation studies indicate that Trp34, Gln36, Tyr39, Ser50, Lys90 and Arg93 are involved in the binding of GSH (Rigsby et al., Arch. Biochem. Biophys., 2007, 277).
• Tyr39 has been proposed to participate in the ionization of GSH (Rigsby et al., Arch. Biochem. Biophys., 2007, 277).

Docking of GSH in FosA
• 30 LGA dockings using AutoDock 4, 1.5 Å clustering.
• 10 structures from the lowest energy conformations.
• The GSH thiol is placed in the vicinity of FCN.

MD simulations
• Amber 9.
• FF03 force field, TIP3P water model.
• Truncated octahedron with > 10 Å of water around the solute.
• 10 Å cutoff on non-bonding interactions.
• Charges and force constants for the Mn-centre (His, Glu, Mn, FCN) calculated using Gaussian 03.
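Trajectories from a setup like this are commonly monitored through the backbone RMSD against a reference structure. The following is a minimal sketch of that quantity, assuming each frame has already been superimposed on the reference and that coordinates are plain lists of (x, y, z) tuples in Å; the function name and the toy coordinates are hypothetical, and in practice a trajectory analysis tool would handle both the fitting and the calculation.

```python
import math


def rmsd(coords, reference):
    """Root-mean-square deviation between two equally sized coordinate sets.

    Assumes the two sets list the same atoms in the same order and that
    any superposition onto the reference has already been done.
    """
    if len(coords) != len(reference) or not coords:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (x - xr) ** 2 + (y - yr) ** 2 + (z - zr) ** 2
        for (x, y, z), (xr, yr, zr) in zip(coords, reference)
    )
    return math.sqrt(squared / len(coords))


if __name__ == "__main__":
    # Toy three-atom "backbone": the frame is the reference shifted 1 A along z.
    reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
    frame = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
    print(rmsd(frame, reference))
```

Evaluating this for every saved frame gives an RMSD-versus-time trace that can be compared between simulations.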
Backbone RMSD residue 1-268
[Figure: backbone RMSD versus t (ps) for the GS- and GSH simulations.]

Distance GSH(S) – FCN (C) of the different Protonation States of GSH
GS- Leaves the Binding Pocket
[Figure: distance versus t (ps) for the GS- and GSH simulations.]

MD snapshot of FosA active site

Residues Shown to Affect FosA Activity and Interactions with the Modelled GSH

Residue   Interacting with GSH
Arg93     Yes
Lys90     Yes
Ser50     No
Tyr39     Yes
Gln36     Yes
Trp34     Yes
Gln91     No
His64     No
Tyr62     Yes
Cys48     No
Tyr128    Yes
Arg119    Yes
Trp46     No
Tyr65     No
Ser94     No
Glu95     Yes
Ser98     No
Tyr100    No
Asp103    No
His107    No
Glu110    No
Thr9      No

Comments column entries (row assignment not recoverable): FCN, Mn-ligand.
Most of the observed changes in FosA activity can be identified with the interactions with FCN or the modelled binding of GSH.

QM/MM-model of FosA
Restricted; Unrestricted

Preliminary Energetics for FosA