Review TRENDS in Biochemical Sciences Vol.30 No.11 November 2005

Understanding nature’s catalytic toolkit

Alex Gutteridge and Janet M. Thornton

EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK

Enzymes catalyse numerous reactions in nature, often causing spectacular accelerations in the catalysis rate. One aspect of understanding how enzymes achieve these feats is to explore how they use the limited set of residue side chains that form their ‘catalytic toolkit’. Combinations of different residues form ‘catalytic units’ that are found repeatedly in different unrelated enzymes. Most catalytic units facilitate rapid catalysis in the enzyme active site either by providing charged groups to polarize substrates and to stabilize transition states, or by modifying the pKa values of other residues to provide more effective acids and bases. Given recent efforts to design novel enzymes, the rise of structural genomics and subsequent efforts to predict the function of enzymes from their structure, these units provide a simple framework to describe how nature uses the tools at her disposal, and might help to improve techniques for designing and predicting enzyme function.

Mechanisms of enzyme catalysis

Enzymes, and the principles by which they perform catalysis, have been the subject of intense study for over a hundred years, in which time the mechanisms of many different enzymes have been investigated in great detail. Serine proteases, for example, have been the focus of countless structural [1], kinetic [2] and theoretical [3,4] studies. Even now, however, when the general principles that govern enzyme catalysis seem to be well understood [5], new theories continue to be proposed to explain puzzling aspects of enzyme catalysis [6], and novel resources are being developed [7] to answer ongoing questions about the evolution and mechanism of enzymes.
For instance, how do enzymes catalyse the diverse range of reactions found in a cell with only a small set of different chemical groups? Of the 20 naturally occurring amino acids, only the 11 polar and charged residues are generally observed to engage directly in catalysis [8]. These residues fall into seven different chemical groups: imidazole (histidine), guanidinium (arginine), amine (lysine), carboxylate (glutamate, aspartate), amide (glutamine, asparagine), hydroxyl (serine, threonine, tyrosine) and thiol (cysteine). The structures and ionization of the polar and charged residue side chains are summarized in Box 1.

Corresponding author: Gutteridge, A. (alexg@ebi.ac.uk). Available online 7 October 2005

Of course, enzymes also use metal ions [9], cofactors [10,11] and water molecules [12] to aid catalysis. However, a source of catalytic power that does not require additional groups stems from the ability of catalytic residues to interact with each other and thus to affect each other’s chemical properties [13]. An early example of this phenomenon was observed in acetoacetate decarboxylase [14], in which two adjacent lysines mutually destabilize their protonated forms through their proximity, enabling one of them to function as a nucleophile. Similarly, the well-known serine proteases use a triad of interacting residues to perform their chemistry [2]. But not all combinations of residues are useful: some might have no effect on or even reduce the power of their component residues.

By reviewing the available structural and biochemical data, here we show which combinations of residues are used by enzymes and how their interactions affect enzyme properties. We introduce the concept of the ‘catalytic unit’: that is, simple combinations of two or more residues, such as the serine protease catalytic triad, that are repeatedly used in similar roles by different, unrelated enzymes.

The view of catalysis that we present here is undoubtedly a simplification.
We have not considered some important aspects of enzyme chemistry, such as the role of hydrophobic residues, metal ions, cofactors or water, nor have we touched on quantum effects or the importance of factors such as entropy and binding energy.

www.sciencedirect.com 0968-0004/$ - see front matter © 2005 Elsevier Ltd. All rights reserved. doi:10.1016/j.tibs.2005.09.006

Box 1. Structures and ionization of polar and charged amino acids

Only seven different polar or charged side chain terminal groups are found in proteins. The protonated forms are shown in Figure I, although at neutral pH the carboxylate and imidazole side chains will tend to be unprotonated and thus negatively charged and neutral, respectively.

Each side chain has a characteristic pKa, the pH at which half of the side chains will be protonated (Table I); however, the pKa can be altered by placing a residue in contact with other charged or uncharged groups. For example, forming an interaction between two carboxylates destabilizes their negatively charged forms and thus raises their pKa until one of them is neutral. By contrast, because opposite charges stabilize each other, placing an arginine next to a carboxylate stabilizes the two charges and thus tends to raise the pKa of the arginine and to lower the pKa of the carboxylate (Figure I). As a rule, when a group makes an interaction with a negative charge its pKa is raised (making it more likely to be protonated), whereas when it interacts with a positive charge its pKa is lowered (making it more likely to be unprotonated).

Figure I. Polar and charged side chain terminal groups (carboxylate, amide, hydroxyl, thiol, amino, imidazole and guanidinium) and carboxylate–arginine and carboxylate–carboxylate dyads found in proteins.

Table I. pKa values of polar and charged amino acids

Residue       pKa
Aspartate     3.9
Glutamate     4.3
Histidine     ~6
Cysteine      8.3
Lysine        10.8
Tyrosine      11
Arginine      12.5
Serine        ~13
Threonine     ~13
Asparagine    –
Glutamine     –

A data set of catalytic interactions

We have compiled from the literature a set of 191 enzymes to study (listed in the Supplementary Material). The set is non-redundant: that is, no two enzymes are evolutionarily related, as defined by sequence and structure comparisons in the CATH database [15]. The catalytic mechanism for each enzyme is extracted from the Catalytic Site Atlas [7] and the crystal structure is taken from the Protein Data Bank (PDB) [16]. All of the structures are high quality (resolution of 2 Å or better; R-factor of 0.3 or lower) and the catalytic residues are found in a single conformation (all atoms have an occupancy of 1).

To find catalytic units, we define an interaction as taking place between two residues if any of their side chain atoms are within 4 Å of each other. We consider only interactions between polar residues and ignore those involving hydrophobic residues. Hydrophobic residues, such as phenylalanine, tryptophan and the smaller aliphatic residues, do have important effects in many enzymes. By providing a non-polar environment, they tend to raise the pKa of acidic residues and to lower the pKa of basic ones. These effects (also known as medium or solvent effects) [17] are often spread out over the whole or parts of the active site, however, and thus can be hard to localize to a specific residue. For this reason, we consider only interactions between polar or charged residues in this analysis. We also restrict our attention to side chain groups, although main chain amides and carbonyls are also important polar groups that can be used in catalysis, most obviously in the oxyanion hole of the serine proteases.

By counting the number of different interactions, we find that, on average, each polar catalytic residue interacts with 0.4 other polar catalytic residues and 1.9 other polar residues (giving a total of 2.3 interactions per catalytic residue). By contrast, non-catalytic buried polar residues have, on average, interactions with 1.2 other polar residues, which is significantly fewer than those of the catalytic residues. Only 88 of the 191 enzymes contain one or more interactions between two of the defined catalytic residues, which seems to suggest that most catalytic residues do not require direct interactions with other catalytic residues to be active.

The fact that the catalytic residues have a larger number of interactions than non-catalytic residues suggests, however, that at least some of the interactions between catalytic and non-catalytic residues are functional. The annotation in the Catalytic Site Atlas is derived from literature searching using strict criteria for the definition of a catalytic residue [8], but it is likely that some of the secondary interactions (between residues annotated as catalytic and residues annotated as non-catalytic) do have a role in catalysis. The residues annotated as non-catalytic have not been previously identified because their effect is likely to be subtler than those of other residues that are directly involved in the mechanism. Thus, it is probably not true to suggest that catalytic residues work alone, even when no specific functional interaction has been identified.
It is also important to remember that catalytic residues work within and rely on the specific microenvironment provided by all of the other residues in the active site and the rest of the enzyme, and not only by the residues with which they interact directly.

The functions of secondary residues

The simplest of the functions performed by secondary interactions is orientation. Making bonds between residues restricts their motion and ensures that they are positioned correctly relative to the substrate. Because restricting motion reduces entropy, there is an energetic cost to this orientation. By pre-arranging the active site, this entropic cost is paid for when the enzyme folds or the substrate binds [18–20], rather than during catalysis, and thus it is beneficial to the enzyme. The individual contribution of a single residue that functions only to orientate another residue is likely to be small, but the effect of taking all such residues in an enzyme into account will be significant.

Where charged groups are required to interact with a substrate, there might well be secondary groups that stabilize the charge required by providing oppositely charged groups nearby. Often these residues have been annotated as catalytic, but in some enzymes their importance might be less obvious and they might have been overlooked.

For histidine, secondary residues that control tautomerization of the imidazole ring can be important. Histidine exists in two neutral tautomeric states that are protonated on either the Nδ or the Nε atom. Free in solution, these two forms exist in roughly equal proportions; in an enzyme, however, the presence of the correct tautomer is essential.
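The interaction criterion used to build the data set (any pair of side chain atoms within 4 Å, polar and charged residues only) can be illustrated with a minimal Python sketch. The residue dictionary layout, the function names and the set of main-chain atom names are assumptions made for illustration; they are not taken from the authors' software, and a real analysis would read coordinates from a PDB file.

```python
from itertools import combinations
from math import dist

# The 11 polar and charged residue types considered in the analysis.
POLAR_RESIDUES = {"HIS", "ARG", "LYS", "GLU", "ASP", "GLN",
                  "ASN", "SER", "THR", "TYR", "CYS"}
# Assumption: atoms with these names are main chain and are ignored.
MAIN_CHAIN_ATOMS = {"N", "CA", "C", "O"}
CUTOFF_ANGSTROM = 4.0


def side_chain_coords(residue):
    """Return the coordinates of every atom that is not a main-chain atom."""
    return [xyz for atom, xyz in residue["atoms"].items()
            if atom not in MAIN_CHAIN_ATOMS]


def residues_interact(res_a, res_b, cutoff=CUTOFF_ANGSTROM):
    """True if any pair of side chain atoms lies within the cutoff distance."""
    return any(
        dist(p, q) <= cutoff
        for p in side_chain_coords(res_a)
        for q in side_chain_coords(res_b)
    )


def polar_interactions(residues, cutoff=CUTOFF_ANGSTROM):
    """List the (id, id) pairs of interacting polar or charged residues."""
    polar = [r for r in residues if r["name"] in POLAR_RESIDUES]
    return [
        (a["id"], b["id"])
        for a, b in combinations(polar, 2)
        if residues_interact(a, b, cutoff)
    ]
```

Each residue is passed as a dictionary such as `{"id": "ASP1", "name": "ASP", "atoms": {"OD1": (1.0, 0.0, 0.0)}}`, where the identifier and coordinates are hypothetical. Hydrophobic residues are filtered out before the distance test, which mirrors the restriction to polar and charged side chains described in the text.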
The contents of nature’s toolkit

Before we look at the interactions between catalytic residues, it is helpful to see which residues are used most often in catalytic sites. Figure 1 shows the numbers of each residue that are catalytic in our data set and the catalytic propensity of each residue. The catalytic propensity of a residue is defined as the percentage of catalytic residues constituted by a particular residue type, divided by the percentage of all residues in the data set constituted by that particular residue type [8].

From Figure 1, we can immediately see the importance of histidine in enzyme catalysis. The reason for the popular use of this amino acid is that histidine is the only residue that has a pKa close to neutral and thus can easily function as an acid–base catalyst. It can also function as a nucleophile, use its charged form to stabilize charged transition states, and can both accept and donate hydrogen bonds. In addition, cysteine, which also has a pKa close to neutral, has a high catalytic propensity, although its use is much rarer overall.

The other most commonly observed residues are the charged residues glutamate, aspartate, arginine and lysine. These residues have pKa values that are far from neutral and they are harder to use in acid–base chemistry; through interactions with other residues, however, their pKa values can be altered significantly and they can also be used to provide charges that affect other residues and the substrate. The polar residues (serine, threonine, tyrosine, glutamine and asparagine) are used less often. In general, these residues are unreactive until they are primed by interaction with another residue.

Figure 1. Numbers of catalytic residues observed (a) and catalytic propensity (b) of each residue type. The catalytic propensity of a residue is defined as the percentage of catalytic residues constituted by a particular residue type, divided by the percentage of all residues in the data set constituted by that particular residue type [8].

The simplest units that these individual residues can form are interacting dyads. Figure 2a shows the number of interactions observed between each pair of residues in which both residues are annotated as catalytic. It is clear that many potential interactions, particularly those between polar residues (clustered in the bottom right of the diagram), are rarely observed in active sites. By contrast, interactions between charged residues, and to a lesser extent between polar and charged residues, are more common. It is also noticeable that some combinations (e.g. histidine–aspartate) occur much more than expected, whereas other interactions (e.g. arginine–carboxylate) are observed much less.

Figure 2b shows the number of interactions in which only one of the residues is catalytic; there are many more of these interactions and, as explained above, we expect many to be functionally important even though they have not been annotated as such. The larger numbers enable us to see more general trends: the polar–polar interactions are not only rare, they are also observed less often than we would expect. By contrast, we again see a large number of interactions between charged groups. Some (e.g. carboxylate–carboxylate and arginine–arginine interactions) are observed more often than we would expect, whereas other interactions (e.g. arginine–carboxylate) are observed less than we would expect.

Figure 2. Numbers of residue interactions in the data set. (a) Numbers of residue interactions in which both residues are annotated as catalytic in the Catalytic Site Atlas. (b) Numbers of residue interactions in which either residue is annotated as catalytic. Numbers above the diagonal lines in each box are the observed number of interactions; numbers below the diagonal lines are the expected number of interactions, which takes into account the catalytic propensity of each residue and the propensity of a given pair of residue types to interact in non-catalytic regions of protein structure. Boxes are coloured by calculating $(O - E)^2 / E$ ($\chi^2$), where O equals the observed number of interactions and E equals the expected number. The deepest blue and red boxes have $\chi^2 \geq 5$ with $O < E$ and $O > E$, respectively; boxes where $\chi^2$ is 0, or where $O < 2$, are uncoloured; and other boxes are scaled between these two extremes according to the $\chi^2$ value.

Common catalytic units

Arginine–arginine

The eight catalytic arginine–arginine interactions in the data set come from five different enzymes: arginine kinase, flavocytochrome c, phytase, undecaprenyl pyrophosphate synthase and adenylate kinase. Four of these five enzymes catalyse reactions involving phosphate chemistry. In each case, the arginines form bonds to the phosphate oxygens and polarize the phosphate, making it a better leaving group. Figure 3a shows the arrangement of arginines around a substrate analogue in adenylate kinase [21]. The arginines are close enough to destabilize each other until the negatively charged phosphate groups bind. The nearby residues Asp162 and Asp163 provide potentially stabilizing negative charges, although these charges might be more important for holding the arginines in position, rather than for directly affecting the arginine side chains.

Carboxylate–carboxylate

The effect of placing two carboxylates together is that their pKa values are raised. Thus, they tend to be protonated at a higher pH than is normal, which prevents the unfavourable interaction of two negative charges and enables a hydrogen bond to form between the two carboxylate groups. Carboxylate dyads are used in two particularly important classes of enzymes: aspartic proteases and glycosidases.

In aspartic proteases, both the carboxylates engage in acid–base chemistry. Owing to its raised pKa, one of the aspartates is protonated at the start of the reaction, enabling it to donate a proton to the substrate. The second aspartate is unprotonated and thus can accept a proton from water to form a nucleophilic OH− group, which then attacks the substrate. In the second stage, the roles of the two aspartates are reversed such that the first aspartate accepts a proton from the protonated intermediate and the second aspartate donates a proton to the leaving substrate. The active site of the aspartic protease Cardosin [22] is shown in Figure 3b.

Glycosidases, such as cellobiohydrolase Cel6A [23] (Figure 3c), also use two interacting carboxylates.
This interaction raises the pKa of Asp226, enabling it to operate as an acid–base but, in contrast to the situation in aspartic proteases, Asp180 operates as a nucleophile, rather than as a second acid–base. This difference in mechanism is due to the interaction between Asp180 and Arg179. This interaction lowers the pKa of Asp180 and prevents it from becoming protonated. In aspartic proteases, both aspartates are hydrogen-bonded to hydroxyl groups (Figure 3b), which do not alter the pKa of the carboxylates in the same way that the arginine side chain does in the glycosidase.

Figure 3. Interactions involving arginine and carboxylate. (a) The arginines in adenylate kinase (PDB code: 1ZIN) polarize the substrate phosphates (shown in stick representation below the arginines). Two aspartates stabilize the concentration of positive charge required. (b) Asp215 and Asp32 in Cardosin (PDB code: 1B5F) form an interaction that enables them both to be an acid–base. The hydroxyl groups of Thr218 and Ser35 orientate, without affecting the pKa of, the carboxyls. (c) Asp180 and Arg179 form an ion pair in cellobiohydrolase (PDB code: 1OC7) that raises the pKa of Asp226, which can then engage in acid–base chemistry. (d) Asp192 in sucrose phosphorylase (PDB code: 1R7A) acts as a nucleophile, forming a covalent bond with the substrate. The nearby residue Arg190 ensures that Asp192 is unprotonated.

Carboxylate–arginine

Placing carboxylate and arginine residues together stabilizes the charged form of each residue such that neither residue can easily gain or lose protons. Carboxylate–arginine interactions are often found where either a positive charge (from the arginine) or a negative charge (from the carboxylate) is required to polarize a substrate. For example, carboxylates are used to hold the arginines in adenylate kinase, as described above.
Carboxylate oxygens that are nucleophiles also interact with arginines. An example of this is seen in the active site of sucrose phosphorylase [24] (Figure 3d). Here, Arg190 reduces the pKa of Asp192, ensuring that it is not protonated and thus able to perform a nucleophilic attack on the substrate (an analogue of which is shown). The carboxylate–arginine interaction is also found as part of a larger unit comprising two carboxylates and an arginine, as in the glycosidases described above.

Carboxylate–lysine

Because the amino group of lysine is usually protonated, it can have a similar role to that of arginine in its interactions with carboxylate; however, lysine has a lower pKa than arginine (10 versus 12) and is found in a neutral state given the correct conditions.

Lysine-containing triads

A threonine-containing triad (analogous to the serine protease triad) is found in L-asparaginase [25] (Figure 4a). The side chain of Thr95 is used as the nucleophile and Lys168 is used as the acid–base instead of serine and histidine, respectively. Asp96 retains its role in orientating and altering the pKa of the lysine.

A second lysine–carboxylate containing triad is seen in aldo-keto reductase [26] (Figure 4b). Here, Tyr58 is not used as a nucleophile, but instead lysine lowers its pKa so that it can function as an acid, donating a proton to the substrate.

A third triad containing two lysine–carboxylate interactions is present in indole-3-glycerol-phosphate synthase [27] (Figure 4c), in which two lysines form salt bridges with a single glutamate. The charged forms of the glutamate and one of the lysines are required to stabilize charges in the transition state. The second lysine engages in general acid catalysis, and it has been speculated that its reprotonation is mediated by its involvement in the triad.
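The pKa arguments used in the preceding sections can be made quantitative with the Henderson–Hasselbalch relation, under which the fraction of a group that carries its proton at a given pH is $1 / (1 + 10^{\mathrm{pH} - \mathrm{p}K_a})$. The sketch below is illustrative only: the unperturbed pKa values are those of Table I, whereas the function name and the shifted pKa values in the demonstration are hypothetical and are not measurements from any enzyme discussed here.

```python
def fraction_protonated(pka, ph=7.0):
    """Henderson-Hasselbalch: fraction of a titratable group that is protonated."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Unperturbed side chain pKa values from Table I (entries given only
# approximately in the table are omitted).
TABLE_I_PKA = {
    "ASP": 3.9, "GLU": 4.3, "CYS": 8.3,
    "LYS": 10.8, "TYR": 11.0, "ARG": 12.5,
}

if __name__ == "__main__":
    for residue, pka in TABLE_I_PKA.items():
        print(f"{residue}: {fraction_protonated(pka):.4f} protonated at pH 7")

    # Hypothetical shifts of 3 pKa units, chosen only to show the direction
    # of the effects described in the text.
    print("Asp, pKa raised to 6.9:", round(fraction_protonated(6.9), 3))
    print("Lys, pKa lowered to 7.8:", round(fraction_protonated(7.8), 3))
```

With the unperturbed values, an aspartate is almost entirely deprotonated and a lysine almost entirely protonated at pH 7, which is why a neighbouring group that raises the carboxylate pKa, or lowers the lysine pKa, is needed before either can act as described in the dyads and triads above.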
Lysine–carboxylate in acid–base chemistry

In contrast to arginine, which seems to remain protonated at all times, lysine can gain and lose protons and is often used as an acid–base, for example in the enolase superfamily of enzymes [28]. Interactions with carboxylate groups will tend to raise the pKa of lysine, making it less capable of losing protons; however, there are enzymes in which the lysine of a lysine–carboxylate dyad is involved in acid–base chemistry. The question is, given its high pKa, how is the lysine ever deprotonated? Proton relay chains have been suggested for this role, but this idea remains speculative. In the lysine–aspartate dyad of glucosamine-6-phosphate synthase [29], the lysine has been proposed to deprotonate a substrate hydroxyl group [30] (Figure 4d).

Figure 4. Interactions involving lysine. (a) The ‘asparaginase triad’ in L-asparaginase (PDB code: 1O7J) features an aspartate–lysine pair, which is used to activate a threonine residue as a nucleophile by extracting a proton from the threonine hydroxyl. (b) Another triad containing an aspartate–lysine pair, found in aldo-keto reductase AKR11A (PDB code: 1PYF), uses tyrosine not as a nucleophile but as an acid–base. The lysine–aspartate dyad controls the pKa of the tyrosine. (c) A triad of two lysines and a glutamate is used in indole-3-glycerol phosphate synthase (PDB code: 1VC4). Lys112 acts as an acid–base and its pKa is thought to be modulated by its involvement in the triad. In addition, Glu51 and Lys53 have roles in providing electrostatic stabilization during the reaction. (d) The lysine in a simple glutamate–lysine dyad provides acid–base chemistry in glucosamine-6-phosphate synthase (PDB code: 1MOQ).

Carboxylate–histidine

The effect of the interaction between histidine and carboxylate is to raise the pKa of the histidine, helping it to function as an acid–base. A classic example of this situation is found in the serine protease triad from trypsin [31] (Figure 5a), in which the histidine–aspartate dyad is used to deprotonate Ser195. Most of the examples in our data set, however, use the histidine–carboxylate dyad to act directly on the substrate. Just such a dyad and a substrate analogue can be seen in the active site of aconitase [32] (Figure 5b).

Histidine–hydroxyl

The most well-known example of a catalytic histidine–hydroxyl interaction is found in the catalytic triad that catalyses many different reactions. In the triad, histidine extracts a proton from serine, as described above, to prime the serine as a nucleophile. However, hydroxyls can also prime histidines. In phosphotransferase [33] (Figure 5c), for example, a threonine hydroxyl hydrogen bonds with His68; this hydrogen bond forces the Nδ of His68 to be unprotonated and the Nε to be protonated. This tautomeric state is essential for His68 to prime His83 for its role as a nucleophile.

Figure 5. Interactions involving histidine. (a) In the classic serine–histidine–aspartate triad found in trypsin (PDB code: 1AVW), the aspartate–histidine dyad extracts a proton from the serine hydroxyl. (b) The aspartate–histidine dyad from aconitase (PDB code: 1C96) acts as an acid–base directly on the substrate, which is also shown. (c) A rare, functionally important histidine–histidine interaction is found in the phosphotransferase domain of glucose permease (PDB code: 1GPR). The Nε atom of His83 acts as a nucleophile in attacking phosphate, whereas His68 ensures that His83 exists in the correct tautomer and stabilizes the transition state. A threonine residue ensures that His68 is in the correct tautomeric state.

Histidine–histidine

Because histidine is the most commonly used catalytic residue, it is surprising to see only eight catalytic histidine–histidine interactions.
Furthermore, in all but one of these examples, the two histidines are close to each other but the interaction between them does not seem to be functionally important. The one exception is the previously mentioned phosphotransferase, in which His83 functions as a nucleophile that attacks a phosphate group. It is kept in its correct tautomeric state by its bond to His68.

The roles of catalytic interactions

Previous studies have shown that ~40% of catalytic residues are involved in either transition state stabilization or substrate activation [8]. These processes generally involve simply providing the appropriate charged groups or hydrogen-bonding partners around the substrate. Given this, it is not surprising that most catalytic residues interact directly with the substrate, rather than with each other. Interactions with other groups are not required for most residues to fulfil these types of role.

It also seems, however, that many important catalytic functions are best achieved by particular combinations of residues. In some catalytic units, such as carboxylate–carboxylate, the effect is to change the chemical character of a group (from negatively charged to neutral in this case), whereas in others, such as arginine–carboxylate, the effect is to enhance existing properties (by stabilizing charges). The range of functions that are performed by interacting residues in our data set is summarized in Box 2. The conclusion that we draw from this analysis is that the range of roles of interactions involving charged residues is greater than that of interactions involving polar residues. This would explain, in part, why charged residues are found to be used in catalysis more often than polar residues.

Box 2. Roles of the different catalytic dyads in this analysis

Interactions between like charges
• Provide charges, as in arginine–arginine
• Provide an acid–base (by depolarization), as in aspartate–aspartate
• Provide nucleophiles, as in lysine–lysine

Interactions between opposite charges
• Stabilize charge concentration, as in aspartate–arginine
• Provide an ion pair to depolarize a third residue, as in arginine–aspartate–aspartate
• Provide charges for transition state stabilization, as in glutamate–lysine
• Provide nucleophiles, as in arginine–aspartate
• Provide an acid–base, as in glutamate–lysine

Interactions between charged and polar residues
• Provide nucleophiles, as in lysine–threonine
• Provide an acid–base, as in lysine–tyrosine, glutamate–threonine and aspartate–histidine

Interactions between polar residues
• Provide nucleophiles, as in histidine–serine, histidine–histidine
• Provide an acid–base, as in asparagine–histidine
• Provide tautomerization, as in threonine–histidine

Charged residues have important roles in transition state stabilization and substrate polarization, but they also have the ability to modify the pKa of other residues, enabling those residues to perform functions that they would not otherwise be able to do. By contrast, apart from histidine (which is often charged in protein structures), polar residues frequently require an interaction with a charged residue to alter their chemical properties. These interactions generally involve a charged residue that primes the polar residue for action, either as a nucleophile (as in threonine–lysine) or as an acid–base (as in aspartate–histidine). The reverse situation, in which a polar residue primes a charged residue, is rarely seen, presumably because a polar residue has relatively little effect on a charged residue.

Figure 2 shows that only a few of the different dyads that could be formed in enzyme active sites are actually used in catalysis.
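The two statistics that underlie Figures 1 and 2 (the catalytic propensity of a residue type, and the observed-versus-expected score used to colour each dyad) can be written out as a short Python sketch. The function names are hypothetical, the expected count is taken as an input rather than derived, and the linear scaling of intermediate colours is an assumption: the Figure 2 legend states only that boxes are scaled between the two extremes according to the χ² value.

```python
def catalytic_propensity(n_catalytic_of_type, n_catalytic_total,
                         n_of_type, n_total):
    """Percentage of catalytic residues of a given type, divided by the
    percentage of all residues in the data set that are of that type."""
    pct_of_catalytic = 100.0 * n_catalytic_of_type / n_catalytic_total
    pct_of_all = 100.0 * n_of_type / n_total
    return pct_of_catalytic / pct_of_all


def chi_squared(observed, expected):
    """The (O - E)^2 / E term used to score one residue pair."""
    return (observed - expected) ** 2 / expected


def colour_score(observed, expected, cap=5.0):
    """Signed score in [-1, 1]: negative when O < E, positive when O > E.

    Returns 0.0 (uncoloured) when O < 2 or when O equals E. Values of
    chi-squared at or above `cap` map to the deepest colour.
    Assumption: intermediate values are scaled linearly.
    """
    if observed < 2 or expected <= 0 or observed == expected:
        return 0.0
    magnitude = min(chi_squared(observed, expected), cap) / cap
    return magnitude if observed > expected else -magnitude
```

For example, with made-up counts, `colour_score(8, 4)` gives 0.8 (observed more often than expected, χ² = 4), whereas `colour_score(2, 12)` gives -1.0 (observed much less often than expected, χ² above the cap).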
The seven combinations that we have described above (arginine–arginine, carboxylate–carboxylate, carboxylate–arginine, carboxylate–lysine, carboxylate–histidine, histidine–hydroxyl and histidine–histidine) account for ~65% of the interactions between catalytic residues and probably for an even higher proportion of the key, direct, functional interactions. Thus, although combinations of residues can produce new or enhanced chemical activity in the residue side chains, the catalytic toolkit used by enzymes seems as small as ever.

How can such a small set of tools catalyse the diverse range of reactions found in nature? The answer probably lies partly in the nature of the actual reactions that are catalysed. Results from an analysis of MACiE (a database of reaction mechanisms) suggest that most reactions can be broken down into individual steps, each of which is chemically simple. For example, 75% of reaction steps involve a straightforward proton transfer (G.L. Holliday et al., personal communication). Because there are a restricted number of these simple steps, it follows that the number of chemical groups that are required to catalyse these steps is also small.

It is also important to appreciate that the power of enzyme catalysis derives from more than just providing specific residues or residue combinations in proximity to the substrate. The repeated evolution of some units, such as the catalytic triad [2], implies that these interactions are genuinely useful to enzymes. But the catalytic power of any enzyme cannot be ascribed only to the formation of these units. Recent efforts to modify existing proteins to perform a new catalytic function have shown that many binding and/or structural residues are required for efficient catalysis, in addition to correctly placed, mechanistically important residues (which the catalytic units represent) [34].
We suggest that the real power of an enzyme lies in a combination of very ‘local’ structural features, represented by catalytic units, and more ‘global’ features of the enzyme, including the dynamics of the structure and the overall microenvironment of the active site.

To predict the catalytic function of an enzyme purely from its structure is a long-term goal that has both practical and academic interest [35]. One way to predict function is to find similarities between the active site of an unannotated enzyme and that of another annotated structure. The use of templates built from the structure of active sites to identify such similarities is well developed [36]. These templates, however, generally involve three or more residues, some of which might provide different functions within a single template. Such large templates make finding similarities between enzymes difficult because the complete active site has to be conserved, rather than the smaller functional units that we have described here.

Shaw et al. [37] have described an interesting example in which two of these smaller units, a classic Glu–Glu cellulase dyad (similar to that shown in Figure 3c) and a Ser–His–Glu triad, combine in a novel way. The triad is used to couple one of the catalytic glutamates with another deeply buried glutamate. The buried glutamate has a high pKa owing to its location in the core of the protein and thus raises the pKa of the catalytic glutamate, ensuring that it is protonated and ready to be a proton donor. Searching for these types of combinations of catalytic units could help to predict catalytic functions from structure. This example also demonstrates how catalytic units interact with, and require for their function, the global structure of the enzyme, which in this example provides the hydrophobic core required to raise the pKa of the buried glutamate.
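As a rough illustration of what searching for catalytic units might involve at its simplest, the sketch below labels a pair of interacting residues with one of the seven dyads described in this review. It is a toy lookup under stated assumptions, not the template-matching approach cited above: the names `UNIT_LABEL`, `CATALYTIC_DYADS` and `catalytic_unit` are hypothetical, and geometry, protonation state and the wider active-site environment are all ignored.

```python
# Map three-letter residue codes onto the group names used for the seven
# dyads in this review. Residues outside these groups are left unmapped.
UNIT_LABEL = {
    "ARG": "arginine",
    "LYS": "lysine",
    "HIS": "histidine",
    "ASP": "carboxylate",
    "GLU": "carboxylate",
    "SER": "hydroxyl",
    "THR": "hydroxyl",
    "TYR": "hydroxyl",
}

# The seven dyads named in the text, keyed by the unordered pair of labels.
CATALYTIC_DYADS = {
    frozenset(["arginine"]): "arginine-arginine",
    frozenset(["carboxylate"]): "carboxylate-carboxylate",
    frozenset(["carboxylate", "arginine"]): "carboxylate-arginine",
    frozenset(["carboxylate", "lysine"]): "carboxylate-lysine",
    frozenset(["carboxylate", "histidine"]): "carboxylate-histidine",
    frozenset(["histidine", "hydroxyl"]): "histidine-hydroxyl",
    frozenset(["histidine"]): "histidine-histidine",
}


def catalytic_unit(residue_a, residue_b):
    """Return the name of the dyad formed by two residue types, or None."""
    label_a = UNIT_LABEL.get(residue_a.upper())
    label_b = UNIT_LABEL.get(residue_b.upper())
    if label_a is None or label_b is None:
        return None
    return CATALYTIC_DYADS.get(frozenset((label_a, label_b)))
```

Calling `catalytic_unit("ASP", "HIS")` returns `"carboxylate-histidine"`, while a pair that is not among the seven dyads, such as `catalytic_unit("LYS", "THR")`, returns `None`. Using an unordered `frozenset` as the key means that the order in which the two residues are given does not matter.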
Concluding remarks
With the advent of structural genomics [38], the ability to predict function, including catalytic mechanisms, from enzyme structures [39] is increasingly important. In addition, the design or redesign of enzymes to bind new substrates or to perform new reactions is being actively pursued [34,40]. The catalytic units that we have described here provide a useful framework for understanding the chemistry performed by enzymes and might help to develop techniques for predicting and designing the mechanisms of enzymes.
Supplementary data
Supplementary data associated with this article can be found at doi:10.1016/j.tibs.2005.09.006
References
1 Perona, J. and Craik, C. (1997) Evolutionary divergence of substrate specificity within the chymotrypsin-like serine protease fold. J. Biol. Chem. 272, 29987–29990
2 Hedstrom, L. (2002) Serine protease mechanism and specificity. Chem. Rev. 102, 4501–4524
3 Topf, M. et al. (2002) Ab initio QM/MM dynamics simulation of the tetrahedral intermediate of serine proteases: insights into the active site hydrogen-bonding network. J. Am. Chem. Soc. 124, 14780–14788
4 Ishida, T. and Kato, S. (2004) Role of Asp102 in the catalytic relay system of serine proteases: a theoretical study. J. Am. Chem. Soc. 126, 7111–7118
5 Blow, D. (2000) So do we understand how enzymes work? Struct. Fold. Des. 8, R77–R81
6 Williams, D. et al. (2004) Understanding noncovalent interactions: ligand binding energy and catalytic efficiency from ligand-induced reductions in motion within receptors and enzymes. Angew. Chem. Int. Ed. Engl. 43, 6596–6616
7 Porter, C. et al. (2004) The catalytic site atlas: a resource of catalytic sites and residues identified in enzymes using structural data. Nucleic Acids Res. 32, D129–D133
8 Bartlett, G.J. et al. (2002) Analysis of catalytic residues in enzyme active sites. J. Mol. Biol. 324, 105–121
9 Williams, R. (2003) Metallo-enzyme catalysis. Chem.
Commun., 1109–1113
10 Mure, M. (2004) Tyrosine-derived quinone cofactors. Acc. Chem. Res. 37, 131–139
11 Murataliev, M. et al. (2004) Electron transfer by diflavin reductases. Biochim. Biophys. Acta 1698, 1–26
12 Hernick, M. and Fierke, C. (2005) Zinc hydrolases: the mechanisms of zinc-dependent deacetylases. Arch. Biochem. Biophys. 433, 71–84
13 Harris, T. and Turner, G. (2002) Structural basis of perturbed pKa values of catalytic groups in enzyme active sites. IUBMB Life 53, 85–98
14 Schmidt, D. and Westheimer, F. (1971) pK of the lysine amino group at the active site of acetoacetate decarboxylase. Biochemistry 10, 1249–1253
15 Pearl, F.M. et al. (2000) Assigning genomic sequences to CATH. Nucleic Acids Res. 28, 277–282
16 Berman, H.M. et al. (2000) The Protein Data Bank. Nucleic Acids Res. 28, 235–242
17 Hollfelder, F. et al. (2001) On the magnitude and specificity of medium effects in enzyme-like catalysts for proton transfer. J. Org. Chem. 66, 5866–5874
18 Hammes, G. (2002) Multiple conformational changes in enzyme catalysis. Biochemistry 41, 8221–8228
19 Gutteridge, A. and Thornton, J. (2004) Conformational change in substrate binding, catalysis and product release: an open and shut case? FEBS Lett. 567, 67–73
20 Gutteridge, A. and Thornton, J. (2005) Conformational changes observed in enzyme crystal structures upon substrate binding. J. Mol. Biol. 346, 21–28
21 Berry, M. and Phillips, G. (1998) Crystal structures of Bacillus stearothermophilus adenylate kinase with bound Ap5A, Mg2+ Ap5A, and Mn2+ Ap5A reveal an intermediate lid position and six coordinate octahedral geometry for bound Mg2+ and Mn2+. Proteins 32, 276–288
22 Frazao, C. et al. (1999) Crystal structure of cardosin A, a glycosylated and Arg-Gly-Asp-containing aspartic proteinase from the flowers of Cynara cardunculus L. J. Biol. Chem. 274, 27694–27701
23 Varrot, A. and Davies, G.
(2003) Direct experimental observation of the hydrogen-bonding network of a glycosidase along its reaction coordinate revealed by atomic resolution analyses of endoglucanase Cel5A. Acta Crystallogr. D 59, 447–452
24 Sprogoe, D. et al. (2004) Crystal structure of sucrose phosphorylase from Bifidobacterium adolescentis. Biochemistry 43, 1156–1162
25 Lubkowski, J. et al. (2003) Atomic resolution structure of Erwinia chrysanthemi L-asparaginase. Acta Crystallogr. D 59, 84–92
26 Ehrensberger, A. and Wilson, D. (2004) Structural and catalytic diversity in the two family 11 aldo-keto reductases. J. Mol. Biol. 337, 661–673
27 Hennig, M. et al. (2002) The catalytic mechanism of indole-3-glycerol phosphate synthase: crystal structures of complexes of the enzyme from Sulfolobus solfataricus with substrate analogue, substrate, and product. J. Mol. Biol. 319, 757–766
28 Gerlt, J. and Babbitt, P. (2001) Divergent evolution of enzymatic function: mechanistically diverse superfamilies and functionally distinct suprafamilies. Annu. Rev. Biochem. 70, 209–246
29 Teplyakov, A. et al. (1998) Involvement of the C terminus in intramolecular nitrogen channeling in glucosamine 6-phosphate synthase: evidence from a 1.6 Å crystal structure of the isomerase domain. Structure 6, 1047–1055
30 Teplyakov, A. et al. (1999) The mechanism of sugar phosphate isomerization by glucosamine 6-phosphate synthase. Protein Sci. 8, 596–602
31 Song, H. and Suh, S. (1998) Kunitz-type soybean trypsin inhibitor revisited: refined structure of its complex with porcine trypsin reveals an insight into the interaction between a homologous inhibitor from Erythrina caffra and tissue-type plasminogen activator. J. Mol. Biol. 275, 347–363
32 Lloyd, S. et al. (1999) The mechanism of aconitase: 1.8 Å resolution crystal structure of the S642A: citrate complex. Protein Sci. 8, 2655–2662
33 Liao, D. et al.
(1991) Structure of the IIa domain of the glucose permease of Bacillus subtilis at 2.2-Å resolution. Biochemistry 30, 9583–9594
34 Dwyer, M. et al. (2004) Computational design of a biologically active enzyme. Science 304, 1967–1971
35 Ringe, D. et al. (2004) Protein structure to function: insights from computation. Cell. Mol. Life Sci. 61, 387–392
36 Torrance, J. et al. (2005) Using a library of structural templates to recognise catalytic sites and explore their evolution in homologous families. J. Mol. Biol. 347, 565–581
37 Shaw, A. et al. (2002) A novel combination of two classic catalytic schemes. J. Mol. Biol. 320, 303–309
38 Todd, A. et al. (2005) Progress of structural genomics initiatives: an analysis of solved target structures. J. Mol. Biol. 348, 1235–1260
39 Jones, S. and Thornton, J. (2004) Searching for functional sites in protein structures. Curr. Opin. Chem. Biol. 8, 3–7
40 Korkegian, A. et al. (2005) Computational thermostabilization of an enzyme. Science 308, 857–860