The Chemistry of Protein Catalysis
John Mitchell

The MACiE Database
Mechanism, Annotation and Classification in Enzymes.
http://www.ebi.ac.uk/thornton-srv/databases/MACiE/
Gemma Holliday, Daniel Almonacid, Noel O’Boyle, Janet Thornton (EBI), Peter Murray-Rust, Gail Bartlett (EBI), James Torrance, John Mitchell
G.L. Holliday et al., Nucl. Acids Res., 35, D515-D520 (2007)

Enzyme Nomenclature and Classification
EC Classification: Class, Subclass, Sub-subclass, Serial number

The EC Classification
• Only deals with overall reaction
• Reaction direction arbitrary
• Cofactors and active site residues ignored
• Doesn’t deal with structural and sequence information
• However, it was never intended to do so

A New Representation of Enzyme Reactions?
• Should be complementary to, but distinct from, the EC system
• Should take into account: reaction mechanism, structure, sequence, active site residues, cofactors
• Need a database of enzyme mechanisms

Coverage of MACiE
Representative: based on a non-homologous dataset, and chosen to represent each available EC sub-subclass.

EC level       Structures exist for   MACiE covers
EC 1.-.-.-                6                  6
EC 1.2.-.-               56                 53
EC 1.2.3.-              184                156
EC 1.2.3.4             1312                199

Repertoire of Enzyme Catalysis
G.L. Holliday et al., J. Molec. Biol., 372, 1261-1277 (2007)
G.L. Holliday et al., J. Molec. Biol., accepted (2009)

Enzyme chemistry is largely nucleophilic
[Figure: number of steps in MACiE by reaction type (heterolytic elimination, homolytic elimination, electrophilic addition, nucleophilic addition, homolytic addition, electrophilic substitution, nucleophilic substitution, homolytic substitution); legend: intramolecular, bimolecular, unimolecular.]
[Figure: number of steps in MACiE by reaction type (proton transfer, AdN2, E1, SN2, E2, radical reaction, tautomerisation, others).]

Residue Catalytic Propensities

Residue Catalytic Functions

We use a combination of bioinformatics & chemoinformatics to identify similarities between enzyme-catalysed reaction mechanisms … we align the steps of chemical reactions. Just like sequence alignment!

We can measure their similarity …
• Find only a few similar pairs
• Identify convergent evolution
• Check MACiE for duplicates

Mechanistic similarity is only weakly related to proximity in the EC classification
EC in common: 0 = -.-.-.-, 1 = c.-.-.-, 2 = c.s.-.-, 3 = c.s.ss.-

Evolution of Enzyme Function
D.E. Almonacid et al., to be published

EC is our Functional Classification
Chemical reaction
Enzyme Commission (EC) Nomenclature, 1992, Academic Press, San Diego, 6th Edition

Enzyme catalysis databases
G.L. Holliday et al., Nucleic Acids Res., 35, D515 (2007)
S.C. Pegg et al., Biochemistry, 45, 2545 (2006)
N. Nagano, Nucleic Acids Res., 33, D407 (2005)

Coverage of MACiE: Representative, based on a non-homologous dataset, and chosen to represent each available EC sub-subclass.
Coverage of SFLD: Based on a few evolutionarily related families.
Coverage of EzCatDB: But without mechanisms.

Domains
• Work with domains: evolutionary & structural units of proteins.
• Map enzyme catalytic mechanisms to domains to quantify convergent and divergent functional evolution of enzymes.

CATH is our Structural Classification
Orengo, C. A., et al. Structure, 1997, 5, 1093

Results: Convergent Evolution
Numbers of CATH code occurrences per EC number

CATH level   c.-.-.-   c.s.-.-   c.s.ss.-   c.s.ss.sn
C               3.17      1.73       1.38        1.11
A              11.00      3.27       1.93        1.60
T              28.00      4.89       2.24        1.19
H              38.33      5.80       2.46        1.22

2.46 CATH/EC reaction: an average reaction has evolved independently in 2.46 superfamilies.

Results: Divergent Evolution
EC reactions/CATH

EC level         C       A      T      H
c.-.-.-       4.75    3.14   1.36   1.20
c.s.-.-      19.50    7.00   1.79   1.36
c.s.ss.-     39.25   10.48   2.08   1.46
c.s.ss.sn    90.00   17.90   3.05   2.05

1.46 EC reactions/CATH: an average superfamily has evolved 1.46 different reactions.
2.18 database entries/CATH.

The Future …

(1) Molecular Evolution
Now we want to evolve chemical reactions in silico across chemical, or EC, space.
1. To understand and rationalise convergent and divergent biochemical evolution;
2. To better relate protein structure and function;
3. To understand the influence on networks of coupled reactions.

(2) Understanding Protein Structure
• We seek to understand the influence of folding pathway on protein structure over all time scales (including the evolutionary one).

[Figure: Protein Folding Funnel; Energy Landscape]

ACKNOWLEDGEMENTS
Dr Gemma Holliday
Dr Daniel Almonacid
Dr Noel O’Boyle
Prof. Janet Thornton (EBI)
Dr Peter Murray-Rust
Dr Florian Nigsch
Cambridge Overseas Trust
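
Sketch: CATH codes per EC number, and EC reactions per CATH code
The convergent and divergent evolution results report ratios such as "CATH codes per EC number" at different levels of each hierarchy. The slides do not spell out the exact counting procedure, so the following is only a minimal sketch under one assumption: that each value is the mean number of distinct codes from one classification (truncated to a chosen level) found for each code of the other. The function names (truncate, mean_partners) and the EC/CATH pairings in PAIRS are hypothetical and are not taken from MACiE or CATH.

```python
from collections import defaultdict
from statistics import mean


def truncate(code, level):
    """Keep the first `level` dot-separated fields of an EC or CATH code."""
    return ".".join(code.split(".")[:level])


def mean_partners(pairs, key_level, partner_level, key_index):
    """Mean number of distinct partner codes observed per key code.

    pairs         -- iterable of (ec_code, cath_code) tuples
    key_level     -- number of fields kept in the grouping code
    partner_level -- number of fields kept in the counted code
    key_index     -- 0 to group by EC (count CATH codes),
                     1 to group by CATH (count EC codes)
    """
    groups = defaultdict(set)
    for pair in pairs:
        key = truncate(pair[key_index], key_level)
        partner = truncate(pair[1 - key_index], partner_level)
        groups[key].add(partner)
    return mean(len(partners) for partners in groups.values())


# Hypothetical (EC number, CATH code) pairings, for illustration only.
PAIRS = [
    ("1.1.1.1", "3.40.50.720"),
    ("1.1.1.1", "3.90.180.10"),
    ("1.1.1.2", "3.40.50.720"),
    ("3.1.1.3", "3.40.50.1820"),
    ("3.1.1.3", "3.40.50.720"),
    ("3.4.21.4", "2.40.10.10"),
    ("3.4.21.62", "3.40.50.200"),
]

# Convergent view: distinct CATH superfamilies (level H) per EC sub-subclass.
cath_per_ec = mean_partners(PAIRS, key_level=3, partner_level=4, key_index=0)

# Divergent view: distinct EC sub-subclasses per CATH superfamily (level H).
ec_per_cath = mean_partners(PAIRS, key_level=4, partner_level=3, key_index=1)

print("CATH superfamilies per EC sub-subclass:", cath_per_ec)
print("EC sub-subclasses per CATH superfamily:", ec_per_cath)
```

Changing key_level and partner_level from 1 to 4 would fill in the other cells of each table under the same assumption; sets are used so that repeated annotations of the same pairing are counted once.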