Understanding nature's catalytic toolkit

Review
TRENDS in Biochemical Sciences
Vol.30 No.11 November 2005
Alex Gutteridge and Janet M. Thornton
EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
Enzymes catalyse numerous reactions in nature, often
causing spectacular accelerations in the catalysis rate.
One aspect of understanding how enzymes achieve
these feats is to explore how they use the limited set of
residue side chains that form their ‘catalytic toolkit’.
Combinations of different residues form ‘catalytic units’
that are found repeatedly in different unrelated
enzymes. Most catalytic units facilitate rapid catalysis
in the enzyme active site either by providing charged
groups to polarize substrates and to stabilize transition
states, or by modifying the pKa values of other residues
to provide more effective acids and bases. Given recent
efforts to design novel enzymes, the rise of structural
genomics and subsequent efforts to predict the function
of enzymes from their structure, these units provide a
simple framework to describe how nature uses the tools
at her disposal, and might help to improve techniques
for designing and predicting enzyme function.
Mechanisms of enzyme catalysis
Enzymes, and the principles by which they perform
catalysis, have been the subject of intense study for over
a hundred years, in which time the mechanisms of many
different enzymes have been investigated in great detail.
Serine proteases, for example, have been the focus of
countless structural [1], kinetic [2] and theoretical [3,4]
studies. Even now, however, when the general principles
that govern enzyme catalysis seem to be well understood
[5], new theories continue to be proposed to explain
puzzling aspects of enzyme catalysis [6], and novel
resources are being developed [7] to answer ongoing
questions about the evolution and mechanism of enzymes.
For instance, how do enzymes catalyse the diverse range
of reactions found in a cell with only a small set of different
chemical groups?
Of the 20 naturally occurring amino acids, only the 11
polar and charged residues are generally observed to
engage directly in catalysis [8]. These residues fall into
seven different chemical groups: imidazole (histidine),
guanidinium (arginine), amine (lysine), carboxylate (glutamate, aspartate), amide (glutamine, asparagine),
hydroxyl (serine, threonine, tyrosine) and thiol (cysteine).
The structures and ionization of the polar and charged
residue side chains are summarized in Box 1.
Corresponding author: Gutteridge, A. (alexg@ebi.ac.uk).
Available online 7 October 2005
Of course, enzymes also use metal ions [9], cofactors
[10,11] and water molecules [12] to aid catalysis. However,
a source of catalytic power that does not require additional
groups stems from the ability of catalytic residues to
interact with each other and thus to affect each other’s
chemical properties [13]. An early example of this
phenomenon was observed in acetoacetate decarboxylase
[14], in which two adjacent lysines mutually destabilize
their protonated forms through their proximity, enabling
one of them to function as a nucleophile. Similarly, the
well-known serine proteases use a triad of interacting
residues to perform their chemistry [2]. But not all
combinations of residues are useful: some might have no
effect on or even reduce the power of their component
residues.
By reviewing the available structural and biochemical
data, here we show which combinations of residues are
used by enzymes and how their interactions affect enzyme
properties. We introduce the concept of the ‘catalytic unit’:
that is, simple combinations of two or more residues, such
as the serine protease catalytic triad, that are repeatedly
used in similar roles by different, unrelated enzymes. The
view of catalysis that we present here is undoubtedly a
simplification. We have not considered some important
aspects of enzyme chemistry, such as the role of
hydrophobic residues, metal ions, cofactors or water, nor
have we touched on quantum effects or the importance of
factors such as entropy and binding energy.
A data set of catalytic interactions
We have compiled from the literature a set of 191 enzymes
to study (listed in the Supplementary Material). The set is
non-redundant: that is, no two enzymes are evolutionarily
related, as defined by sequence and structure comparisons
in the CATH database [15]. The catalytic mechanism for
each enzyme is extracted from the Catalytic Site Atlas [7]
and the crystal structure is taken from the Protein Data
Bank (PDB) [16]. All of the structures are high quality
(resolution of 2 Å or better; R-factor of 0.3 or lower) and the catalytic residues
are found in a single conformation (all atoms have an
occupancy of 1). To find catalytic units, we define an
interaction as taking place between two residues if any of
their side chain atoms are within 4 Å of each other. We
consider only interactions between polar residues and
ignore those involving hydrophobic residues.
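The interaction criterion above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function name, the representation of a side chain as a list of (x, y, z) tuples and the example coordinates are all assumptions, and the cutoff is treated as inclusive.

```python
import math
from itertools import product

CUTOFF_ANGSTROM = 4.0  # distance threshold quoted in the text


def residues_interact(side_chain_a, side_chain_b, cutoff=CUTOFF_ANGSTROM):
    """Return True if any atom of one side chain lies within `cutoff` Å
    of any atom of the other. Each side chain is a list of (x, y, z) tuples."""
    return any(
        math.dist(atom_a, atom_b) <= cutoff
        for atom_a, atom_b in product(side_chain_a, side_chain_b)
    )


# Hypothetical coordinates, for illustration only (not taken from any PDB entry).
asp_side_chain = [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0)]
his_side_chain = [(4.5, 0.0, 0.0), (6.0, 1.0, 0.0)]
distant_side_chain = [(10.0, 0.0, 0.0)]

print(residues_interact(asp_side_chain, his_side_chain))      # closest pair is 3.3 Å apart
print(residues_interact(asp_side_chain, distant_side_chain))  # closest pair is 8.8 Å apart
```

In a real analysis the coordinates would come from the side chain atoms of a PDB entry, after main chain atoms and hydrophobic residues have been filtered out.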
Hydrophobic residues, such as phenylalanine, tryptophan and the smaller aliphatic residues, do have
important effects in many enzymes. By providing
www.sciencedirect.com 0968-0004/$ - see front matter © 2005 Elsevier Ltd. All rights reserved. doi:10.1016/j.tibs.2005.09.006
Box 1. Structures and ionization of polar and charged amino acids
Only seven different polar or charged side chain terminal groups are found in proteins. The protonated forms are shown in Figure I, although at
neutral pH the carboxylate and imidazole side chains will tend to be unprotonated and thus negatively charged and neutral, respectively. Each side
chain has a characteristic pKa, the pH at which half of the side chains will be protonated (Table I); however, the pKa can be altered by placing a
residue in contact with other charged or uncharged groups. For example, forming an interaction between two carboxylates destabilizes their
negatively charged forms and thus raises their pKa until one of them is neutral. By contrast, because opposite charges stabilize each other, placing
an arginine next to a carboxylate stabilizes the two charges and thus tends to raise the pKa of the arginine and to lower the pKa of the carboxylate
(Figure I). As a rule, when a group interacts with a negative charge its pKa is raised (making it more likely to be protonated), whereas
when it interacts with a positive charge its pKa is lowered (making it more likely to be unprotonated).
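As an illustration of what a pKa means in practice, the Henderson–Hasselbalch relation gives the fraction of a group that is protonated at a given pH. The sketch below uses the side chain pKa values listed in Table I. The function and dictionary names are illustrative, and the value for histidine is entered as 6.0 because Table I gives it only approximately.

```python
def fraction_protonated(pka, ph=7.0):
    """Fraction of a titratable group that is protonated at the given pH
    (Henderson-Hasselbalch relation)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Side chain pKa values from Table I (histidine is approximate).
TABLE_I_PKA = {
    "Aspartate": 3.9,
    "Glutamate": 4.3,
    "Histidine": 6.0,
    "Cysteine": 8.3,
    "Lysine": 10.8,
    "Tyrosine": 11.0,
    "Arginine": 12.5,
}

for residue, pka in TABLE_I_PKA.items():
    print(f"{residue:10s} pKa {pka:4.1f}  protonated at pH 7: {fraction_protonated(pka):.4f}")
```

Raising or lowering the pKa passed to `fraction_protonated` shows how an interaction that shifts the pKa changes the protonation state at a fixed pH.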
[Figure I: structures of the carboxylate, amide, hydroxyl, thiol, amino, imidazole and guanidinium groups, and of the carboxylate–arginine and carboxylate–carboxylate dyads.]
Figure I. Polar and charged side chain terminal groups and carboxylate–arginine and carboxylate–carboxylate dyads found in proteins.
Table I. pKa values of polar and charged amino acids
Residue       pKa
Aspartate     3.9
Glutamate     4.3
Histidine     ~6
Cysteine      8.3
Lysine        10.8
Tyrosine      11
Arginine      12.5
Serine        ~13
Threonine     ~13
Asparagine    –
Glutamine     –
a non-polar environment, they tend to raise the pKa of
acidic residues and to lower the pKa of basic ones. These
effects (also known as medium or solvent effects) [17] are
often spread out over the whole or parts of the active site,
however, and thus can be hard to localize to a specific
residue. For this reason, we consider only interactions
between polar or charged residues in this analysis. We
also restrict our attention to side chain groups, although
main chain amides and carbonyls are also important polar
groups that can be used in catalysis, most obviously in the
oxyanion hole of the serine proteases.
By counting the number of different interactions, we
find that, on average, each polar catalytic residue
interacts with 0.4 other polar catalytic residues and 1.9
other polar residues (giving a total of 2.3 interactions per
catalytic residue). By contrast, non-catalytic buried polar
residues have, on average, interactions with 1.2 other
polar residues, which is significantly fewer than those of
the catalytic residues. Only 88 of the 191 enzymes contain
one or more interactions between two of the defined
catalytic residues, which seems to suggest that most
catalytic residues do not require direct interactions with
other catalytic residues to be active. The fact that the
catalytic residues have a larger number of interactions
than non-catalytic residues suggests, however, that at
least some of the interactions between catalytic and non-catalytic residues are functional.
The annotation in the Catalytic Site Atlas is derived
from literature searching using strict criteria for the
definition of a catalytic residue [8], but it is likely that
some of the secondary interactions – between residues
annotated as catalytic and residues annotated as non-catalytic – do have a role in catalysis. The residues
annotated as non-catalytic have not been previously
identified because their effect is likely to be subtler than
that of other residues that are directly involved in the
mechanism. Thus, it is probably not true to suggest that
catalytic residues work alone, even when no specific
functional interaction has been identified. It is also
important to remember that catalytic residues work
within and rely on the specific microenvironment provided
by all of the other residues in the active site and the rest of
the enzyme, and not only by the residues with which they
interact directly.
The functions of secondary residues
The simplest of the functions performed by secondary
interactions is orientation. Making bonds between residues restricts their motion and ensures that they are
positioned correctly relative to the substrate. Because
restricting motion reduces entropy, there is an energetic
cost to this orientation. By pre-arranging the active site,
this entropic cost is paid for when the enzyme folds or the
substrate binds [18–20], rather than during catalysis, and
thus it is beneficial to the enzyme. The individual
contribution of a single residue that functions only to
orientate another residue is likely to be small, but the
effect of taking all such residues in an enzyme into account
will be significant.
Where charged groups are required to interact with a
substrate, there might well be secondary groups that
stabilize the charge required by providing oppositely
charged groups nearby. Often these residues have been
annotated as catalytic, but in some enzymes their
importance might be less obvious and they might
have been overlooked.
For histidine, secondary residues that control tautomerization of the imidazole ring can be important.
Histidine exists in two neutral tautomeric states that
are protonated on either the Nδ or the Nε atom. Free in
solution, these two forms exist in roughly equal proportions; in an enzyme, however, the presence of the
correct tautomer is essential.
[Figure 1: bar charts of (a) the number of catalytic residues observed and (b) the catalytic propensity for each residue type, in the order HIS, CYS, ASP, ARG, GLU, LYS, TYR, SER, ASN, THR, GLN.]
The contents of nature’s toolkit
Before we look at the interactions between catalytic
residues, it is helpful to see which residues are used
most often in catalytic sites. Figure 1 shows the numbers
of each residue that are catalytic in our data set and the
catalytic propensity of each residue. The catalytic
propensity of a residue is defined as the percentage of
catalytic residues constituted by a particular residue
type, divided by the percentage of all residues in the data
set constituted by that particular residue type [8].
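The propensity calculation can be sketched as follows. This is a minimal illustration of the ratio defined above. The function name is illustrative and the residue counts are hypothetical, chosen only to make the arithmetic easy to follow. They are not the counts behind Figure 1.

```python
def catalytic_propensity(catalytic_counts, total_counts):
    """Percentage of catalytic residues that are of a given type, divided by
    the percentage of all residues in the data set that are of that type."""
    n_catalytic = sum(catalytic_counts.values())
    n_total = sum(total_counts.values())
    propensity = {}
    for residue, total in total_counts.items():
        if total == 0:
            continue  # residue type absent from the data set
        pct_catalytic = 100.0 * catalytic_counts.get(residue, 0) / n_catalytic
        pct_all = 100.0 * total / n_total
        propensity[residue] = pct_catalytic / pct_all
    return propensity


# Hypothetical counts, for illustration only.
catalytic = {"HIS": 6, "ASP": 3, "LEU": 1}
everything = {"HIS": 20, "ASP": 60, "LEU": 120}

for residue, value in catalytic_propensity(catalytic, everything).items():
    print(f"{residue}: {value:.2f}")
```

A propensity above 1 means that the residue type is over-represented among catalytic residues relative to its overall abundance, and a value below 1 means that it is under-represented.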
From Figure 1, we can immediately see the importance
of histidine in enzyme catalysis. The reason for the
popular use of this amino acid is that histidine is the
only residue that has a pKa close to neutral and thus can
easily function as an acid–base catalyst. It can also
function as a nucleophile, use its charged form to stabilize
charged transition states, and can both accept and donate
hydrogen bonds. In addition, cysteine, which also has a
pKa close to neutral, has a high catalytic propensity,
although its use is much rarer overall.
The other most commonly observed residues are the
charged residues glutamate, aspartate, arginine and
lysine. These residues have pKa values that are far from
neutral and they are harder to use in acid–base chemistry;
through interactions with other residues, however, their
pKa values can be altered significantly and they can also be
used to provide charges that affect other residues and
the substrate. The polar residues – serine, threonine,
tyrosine, glutamine and asparagine – are used less often.
In general, these residues are unreactive until they are
primed by interaction with another residue.
Figure 1. Numbers of catalytic residues observed (a) and catalytic propensity (b) of
each residue type. The catalytic propensity of a residue is defined as the
percentage of catalytic residues constituted by a particular residue type, divided
by the percentage of all residues in the data set constituted by that particular
residue type [8].
The simplest units that these individual residues can
form are interacting dyads. Figure 2a shows the number of
interactions observed between each pair of residues in
which both residues are annotated as catalytic. It is clear
that many potential interactions, particularly those
between polar residues (clustered in the bottom right of
the diagram), are rarely observed in active sites. By
contrast, interactions between charged residues, and to
[Figure 2: matrices (a) and (b) of observed and expected interaction counts for each pair of the residue types ASP, GLU, HIS, LYS, ARG, CYS, TYR, SER, THR, ASN and GLN.]
Figure 2. Numbers of residue interactions in the data set. (a) Numbers of residue
interactions in which both residues are annotated as catalytic in the Catalytic Site
Atlas. (b) Numbers of residue interactions in which either residue is annotated as
catalytic. Numbers above the diagonal lines in each box are the observed number
of interactions; numbers below the diagonal lines are the expected number of
interactions, which takes into account the catalytic propensity of each residue and
the propensity of a given pair of residue types to interact in non-catalytic regions of
protein structure. Boxes are coloured by calculating $(O - E)^2 / E$ ($\chi^2$), where O equals the
observed number of interactions and E equals the expected number. The deepest
blue and red boxes have $\chi^2 \geq 5$ with $O < E$ and $O > E$, respectively; boxes where $\chi^2$ is
0, or where $O < 2$, are uncoloured; and other boxes are scaled between these two
extremes according to the $\chi^2$ value.
a lesser extent between polar and charged residues, are
more common. It is also noticeable that some combinations
(e.g. histidine–aspartate) occur much more than expected,
whereas other interactions (e.g. arginine–carboxylate) are
observed much less.
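The comparison of observed and expected counts used to colour Figure 2 can be sketched as below, using the per-box statistic (O − E)²/E from the figure caption. The function names, the signed-intensity convention (negative for blue, positive for red), the linear scaling between the extremes and the example counts are assumptions made for illustration. The caption states only that intermediate boxes are scaled according to the χ² value.

```python
def chi_square_term(observed, expected):
    """Per-box statistic (O - E)^2 / E."""
    return (observed - expected) ** 2 / expected


def colour_intensity(observed, expected, cap=5.0):
    """Signed colour intensity in [-1, 1].

    Negative values stand for blue (fewer interactions than expected),
    positive values for red (more than expected), 0.0 for an uncoloured box.
    Linear scaling up to the cap is an assumption, not the authors' stated rule.
    """
    if observed < 2:
        return 0.0  # boxes with O < 2 are left uncoloured
    chi2 = chi_square_term(observed, expected)
    if chi2 == 0:
        return 0.0
    magnitude = min(chi2, cap) / cap
    return magnitude if observed > expected else -magnitude


# Hypothetical (observed, expected) pairs, for illustration only.
for o, e in [(31, 12), (2, 8), (1, 3), (5, 5)]:
    print(o, e, round(colour_intensity(o, e), 2))
```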
Figure 2b shows the number of interactions in which
only one of the residues is catalytic; there are many more
of these interactions and, as explained above, we expect
many to be functionally important even though they have
not been annotated as such. The larger numbers enable us
to see more general trends: the polar–polar interactions
are not only rare, they are also observed less often than we
would expect. By contrast, we again see a large number of
interactions between charged groups. Some (e.g. carboxylate–carboxylate and arginine–arginine interactions)
are observed more often than we would expect, whereas
other interactions (e.g. arginine–carboxylate) are
observed less than we would expect.
Common catalytic units
Arginine–arginine
The eight catalytic arginine–arginine interactions in the
data set come from five different enzymes: arginine
kinase, flavocytochrome c, phytase, undecaprenyl pyrophosphate synthase and adenylate kinase. Four of these
five enzymes catalyse reactions involving phosphate
chemistry. In each case, the arginines form bonds to the
phosphate oxygens and polarize the phosphate, making it
a better leaving group. Figure 3a shows the arrangement
of arginines around a substrate analogue in adenylate
kinase [21]. The arginines are close enough to destabilize
each other until the negatively charged phosphate groups
bind. The nearby residues Asp162 and Asp163 provide
potentially stabilizing negative charges, although these
charges might be more important for holding the
arginines in position, rather than for directly affecting
the arginine side chains.
Carboxylate–carboxylate
The effect of placing two carboxylates together is that their
pKa values are raised. Thus, they tend to be protonated at
a higher pH than is normal, which prevents the
unfavourable interaction of two negative charges and
enables a hydrogen bond to form between the two
carboxylate groups. Carboxylate dyads are used in two
particularly important classes of enzymes: aspartic
proteases and glycosidases.
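As a rough guide to the energetic scale of such pKa shifts, the standard thermodynamic relation ΔG = RT ln(10) ΔpKa converts a shift in pKa into a free energy. The sketch below is illustrative only: the function name is an assumption and the shift of 3 pKa units is hypothetical, not a value measured for any enzyme in the data set.

```python
import math

GAS_CONSTANT = 8.314462618  # J / (mol K)


def pka_shift_free_energy(delta_pka, temperature=298.15):
    """Free energy (kJ/mol) corresponding to a pKa shift of `delta_pka` units,
    using delta_G = R * T * ln(10) * delta_pKa."""
    return math.log(10) * GAS_CONSTANT * temperature * delta_pka / 1000.0


# Hypothetical shift: a carboxylate whose pKa is raised by 3 units.
print(f"{pka_shift_free_energy(3.0):.1f} kJ/mol")
```

At 298.15 K this works out to roughly 5.7 kJ/mol per pKa unit.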
In aspartic proteases, both the carboxylates engage in
acid–base chemistry. Owing to its raised pKa, one of the
aspartates is protonated at the start of the reaction,
enabling it to donate a proton to the substrate. The second
aspartate is unprotonated and thus can accept a proton
from water to form a nucleophilic OH⁻ group, which then
attacks the substrate. In the second stage, the roles of the
two aspartates are reversed such that the first aspartate
accepts a proton from the protonated intermediate and the
second aspartate donates a proton to the leaving
substrate. The active site of the aspartic protease
Cardosin [22] is shown in Figure 3b.
Glycosidases, such as cellobiohydrolase Cel6A [23]
(Figure 3c), also use two interacting carboxylates. This
interaction raises the pKa of Asp226, enabling it to operate
as an acid–base but, in contrast to the situation in aspartic
proteases, Asp180 operates as a nucleophile, rather than
as a second acid–base. This difference in mechanism is due
to the interaction between Asp180 and Arg179. This
interaction lowers the pKa of Asp180 and prevents it from
becoming protonated. In aspartic proteases, both aspartates are hydrogen-bonded to hydroxyl groups (Figure 3b),
which do not alter the pKa of the carboxylates in the same
way that the arginine side chain does in the glycosidase.
Figure 3. Interactions involving arginine and carboxylate. (a) The arginines in adenylate kinase (PDB code: 1ZIN) polarize the substrate phosphates (shown in stick representation
below the arginines). Two aspartates stabilize the concentration of positive charge required. (b) Asp215 and Asp32 in Cardosin (PDB code: 1B5F) form an interaction that
enables them both to be an acid–base. The hydroxyl groups of Thr218 and Ser35 orientate, without affecting the pKa of, the carboxyls. (c) Asp180 and Arg179 form an ion pair
in cellobiohydrolase (PDB code: 1OC7) that raises the pKa of Asp226, which can then engage in acid–base chemistry. (d) Asp192 in sucrose phosphorylase (PDB code: 1R7A)
acts as a nucleophile, forming a covalent bond with the substrate. The nearby residue Arg190 ensures that Asp192 is unprotonated.
Carboxylate–arginine
Placing carboxylate and arginine residues together
stabilizes the charged form of each residue such that
neither residue can easily gain or lose protons. Carboxylate–arginine interactions are often found where
either a positive charge (from the arginine) or a negative
charge (from the carboxylate) is required to polarize a
substrate. For example, carboxylates are used to hold the
arginines in adenylate kinase, as described above.
Carboxylate oxygens that are nucleophiles also interact
with arginines. An example of this is seen in the active site
of sucrose phosphorylase [24] (Figure 3d). Here, Arg190
reduces the pKa of Asp192, ensuring that it is not
protonated and thus able to perform a nucleophilic attack
on the substrate (an analogue of which is shown).
The carboxylate–arginine interaction is also found as
part of a larger unit comprising two carboxylates and an
arginine, as in the glycosidases described above.
Carboxylate–lysine
Because the amino group of lysine is usually protonated, it
can have a similar role to that of arginine in its
interactions with carboxylate; however, lysine has a
lower pKa than arginine (10.8 versus 12.5; Table I) and can be found in a
neutral state under the right conditions.
Lysine-containing triads
A threonine-containing triad (analogous to the serine
protease triad) is found in L-asparaginase [25] (Figure 4a).
The side chain of Thr95 is used as the nucleophile and
Lys168 is used as the acid–base instead of serine and
histidine, respectively. Asp96 retains its role in orientating and altering the pKa of the lysine.
A second lysine–carboxylate containing triad is seen in
aldo-keto reductase [26] (Figure 4b). Here, Tyr58 is not
used as a nucleophile, but instead lysine lowers its pKa so
that it can function as an acid, donating a proton to the
substrate.
A third triad containing two lysine–carboxylate interactions is present in indole-3-glycerol-phosphate synthase
[27] (Figure 4c), in which two lysines form salt bridges
with a single glutamate. The charged forms of the
glutamate and one of the lysines are required to stabilize
charges in the transition state. The second lysine engages
in general acid catalysis, and it has been speculated that
its reprotonation is mediated by its involvement in the
triad.
Lysine–carboxylate in acid–base chemistry
In contrast to arginine, which seems to remain protonated
at all times, lysine can gain and lose protons and is often
used as an acid–base, for example in the enolase superfamily of enzymes [28]. Interactions with carboxylate
groups will tend to raise the pKa of lysine, making it less
capable of losing protons; however, there are enzymes in
which the lysine of a lysine–carboxylate dyad is involved
in acid–base chemistry.
The question is, given its high pKa, how is the lysine
ever deprotonated? Proton relay chains have been
suggested for this role, but this idea remains speculative.
In the lysine–aspartate dyad of glucosamine-6-phosphate
synthase [29], the lysine has been proposed to deprotonate
a substrate hydroxyl group [30] (Figure 4d).
Figure 4. Interactions involving lysine. (a) The ‘asparaginase triad’ in L-asparaginase (PDB code: 1O7J) features an aspartate–lysine pair, which is used to activate a threonine
residue as a nucleophile by extracting a proton from the threonine hydroxyl. (b) Another triad containing an aspartate–lysine pair, found in aldo-keto reductase AKR11A (PDB
code: 1PYF), uses tyrosine not as a nucleophile but as an acid–base. The lysine–aspartate dyad controls the pKa of the tyrosine. (c) A triad of two lysines and a glutamate is
used in indole-3-glycerol phosphate synthase (PDB code: 1VC4). Lys112 acts as an acid–base and its pKa is thought to be modulated by its involvement in the triad. In addition,
Glu51 and Lys53 have roles in providing electrostatic stabilization during the reaction. (d) The lysine in a simple glutamate–lysine dyad provides acid–base chemistry in
glucosamine-6-phosphate synthase (PDB code: 1MOQ).
Figure 5. Interactions involving histidine. (a) In the classic serine–histidine–aspartate triad found in trypsin (PDB code: 1AVW), the aspartate–histidine dyad extracts a proton
from the serine hydroxyl. (b) The aspartate–histidine dyad from aconitase (PDB code: 1C96) acts as an acid–base directly on the substrate, which is also shown. (c) A rare,
functionally important histidine–histidine interaction is found in the phosphotransferase domain of glucose permease (PDB code: 1GPR). The Nε atom of His83 acts as a
nucleophile in attacking phosphate, whereas His68 ensures that His83 exists in the correct tautomer and stabilizes the transition state. A threonine residue ensures that His68
is in the correct tautomeric state.
Carboxylate–histidine
The effect of the interaction between histidine and
carboxylate is to raise the pKa of the histidine, helping it
to function as an acid–base. A classic example of this
situation is found in the serine protease triad from trypsin
[31] (Figure 5a), in which the histidine–aspartate dyad is
used to deprotonate Ser195. Most of the examples in our
data set, however, use the histidine–carboxylate dyad to
act directly on the substrate. Just such a dyad and a
substrate analogue can be seen in the active site of
aconitase [32] (Figure 5b).
Histidine–hydroxyl
The best-known example of a catalytic histidine–hydroxyl
interaction is found in the catalytic triad, which
catalyses many different reactions. In the triad, histidine
extracts a proton from serine, as described above, to prime
the serine as a nucleophile. However, hydroxyls can also
prime histidines. In phosphotransferase [33] (Figure 5c),
for example, a threonine hydroxyl hydrogen bonds with
His68; this hydrogen bond forces the Nδ of His68 to be
unprotonated and the Nε to be protonated. This tautomeric state is essential for His68 to prime His83 for its role
as a nucleophile.
Histidine–histidine
Because histidine is the most commonly used catalytic
residue, it is surprising to see only eight catalytic
histidine–histidine interactions. Furthermore, in all but
one of these examples, the two histidines are close to each
other but the interaction between them does not seem to
be functionally important. The one exception is the
previously mentioned phosphotransferase, in which
His83 functions as a nucleophile that attacks a phosphate
group. It is kept in its correct tautomeric state by its bond
to His68.
The roles of catalytic interactions
Previous studies have shown that ~40% of catalytic
residues are involved in either transition state stabilization or substrate activation [8] – processes that generally
involve simply providing the appropriate charged groups
or hydrogen-bonding partners around the substrate.
Given this, it is not surprising that most catalytic residues
interact directly with the substrate, rather than with each
other. Interactions with other groups are not required for
most residues to fulfil these types of role.
It also seems, however, that many important catalytic
functions are best achieved by particular combinations of
residues. In some catalytic units, such as carboxylate–
carboxylate, the effect is to change the chemical character
of a group (from negatively charged to neutral in this
case), whereas in others, such as arginine–carboxylate,
the effect is to enhance existing properties (by stabilizing
charges). The range of functions that are performed by
interacting residues in our data set is summarized in
Box 2.
The conclusion that we draw from this analysis is that
the range of roles of interactions involving charged
residues is greater than that of interactions involving
polar residues. This would explain, in part, why charged
residues are found to be used in catalysis more often than
polar residues.
Box 2. Roles of the different catalytic dyads in this analysis
Interactions between like charges
• Provide charges, as in arginine–arginine
• Provide an acid–base (by depolarization), as in aspartate–aspartate
• Provide nucleophiles, as in lysine–lysine
Interactions between opposite charges
• Stabilize charge concentration, as in aspartate–arginine
• Provide an ion pair to depolarize a third residue, as in arginine–aspartate–aspartate
• Provide charges for transition state stabilization, as in glutamate–lysine
• Provide nucleophiles, as in arginine–aspartate
• Provide an acid–base, as in glutamate–lysine
Interactions between charged and polar residues
• Provide nucleophiles, as in lysine–threonine
• Provide an acid–base, as in lysine–tyrosine, glutamate–threonine and aspartate–histidine
Interactions between polar residues
• Provide nucleophiles, as in histidine–serine and histidine–histidine
• Provide an acid–base, as in asparagine–histidine
• Provide tautomerization, as in threonine–histidine
Charged residues have important roles in transition
state stabilization and substrate polarization, but they
also have the ability to modify the pKa of other residues,
enabling those residues to perform functions that they
would not otherwise be able to do.
By contrast, apart from histidine (which is often
charged in protein structures), polar residues frequently
require an interaction with a charged residue to alter their
chemical properties. These interactions generally involve
a charged residue that primes the polar residue for action,
either as a nucleophile (as in threonine–lysine) or as an
acid–base (as in aspartate–histidine). The reverse situation – in which a polar residue primes a charged residue –
is rarely seen, presumably because a polar residue has
relatively little effect on a charged residue.
Figure 2 shows that only a few of the different dyads
that could be formed in enzyme active sites are actually
used in catalysis. The seven combinations that we have
described above (arginine–arginine, carboxylate–
carboxylate, carboxylate–arginine, carboxylate–lysine,
carboxylate–histidine, histidine–hydroxyl and histidine–
histidine) account for ~65% of the interactions between
catalytic residues and probably for an even higher
proportion of the key, direct, functional interactions.
Thus, although combinations of residues can produce
new or enhanced chemical activity in the residue side
chains, the catalytic toolkit used by enzymes seems as
small as ever.
How can such a small set of tools catalyse the diverse
range of reactions found in nature? The answer probably
lies partly in the nature of the actual reactions that are
catalysed. Results from an analysis of MACiE (a database
of reaction mechanisms) suggest that most reactions can
be broken down into individual steps, each of which is
chemically simple. For example, 75% of reaction steps
involve a straightforward proton transfer (G.L. Holliday et
al., personal communication). Because there are a
restricted number of these simple steps, it follows that
the number of chemical groups that are required to
catalyse these steps is also small.
It is also important to appreciate that the power of
enzyme catalysis derives from more than just
placing specific residues or residue combinations in
proximity to the substrate. The repeated evolution of
some units, such as the catalytic triad [2], implies that
these interactions are genuinely useful to enzymes. But
the catalytic power of any enzyme cannot be ascribed only
to the formation of these units. Recent efforts to modify
existing proteins to perform a new catalytic function have
shown that many binding and/or structural residues are
required for efficient catalysis, in addition to correctly
placed, mechanistically important residues (which the
catalytic units represent) [34]. We suggest that the real
power of an enzyme lies in a combination of very ‘local’
structural features, represented by catalytic units, and
more ‘global’ features of the enzyme, including the
dynamics of the structure and the overall microenvironment of the active site.
To predict the catalytic function of an enzyme purely
from its structure is a long-term goal that has both
practical and academic interest [35]. One way to predict
function is to find similarities between the active site of an
unannotated enzyme and that of another annotated
structure. The use of templates built from the structure
of active sites to identify such similarities is well
developed [36]. These templates, however, generally
involve three or more residues, some of which might
provide different functions within a single template. Such
large templates make finding similarities between
enzymes difficult because the complete active site has to
be conserved, rather than the smaller functional units
that we have described here.
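
As a rough illustration of searching for the smaller functional units rather than for whole-site templates, the following Python sketch scans a list of residues for pairs that belong to one of the seven combinations described above and that lie within a distance cutoff. It is a minimal sketch under stated assumptions and not the method used by the template approaches cited in the text: the function name find_units, the 4.0 Å cutoff, the assignment of Ser, Thr and Tyr to the 'hydroxyl' class, the use of one representative atom per residue and the coordinates in the usage example are all hypothetical.

```python
from itertools import combinations
from math import dist

# Assumed mapping of residue names to the chemical classes named in the text.
# Treating Ser, Thr and Tyr as 'hydroxyl' is an assumption of this sketch.
RESIDUE_CLASS = {
    "ASP": "carboxylate", "GLU": "carboxylate",
    "ARG": "arginine", "LYS": "lysine", "HIS": "histidine",
    "SER": "hydroxyl", "THR": "hydroxyl", "TYR": "hydroxyl",
}

# The seven two-residue combinations listed in the text.
# A same-class pair (e.g. arginine-arginine) collapses to a one-element set.
CATALYTIC_UNITS = {
    frozenset(["arginine"]),
    frozenset(["carboxylate"]),
    frozenset(["carboxylate", "arginine"]),
    frozenset(["carboxylate", "lysine"]),
    frozenset(["carboxylate", "histidine"]),
    frozenset(["histidine", "hydroxyl"]),
    frozenset(["histidine"]),
}


def find_units(residues, cutoff=4.0):
    """Return residue pairs that match a known unit and lie within `cutoff` (in Å).

    `residues` is a list of (name, number, (x, y, z)) tuples, with one
    representative functional atom per residue (an assumption of this sketch).
    """
    hits = []
    for (n1, i1, p1), (n2, i2, p2) in combinations(residues, 2):
        c1, c2 = RESIDUE_CLASS.get(n1), RESIDUE_CLASS.get(n2)
        if c1 is None or c2 is None:
            continue  # residue type not part of any unit considered here
        if frozenset([c1, c2]) in CATALYTIC_UNITS and dist(p1, p2) <= cutoff:
            hits.append((f"{n1}{i1}", f"{n2}{i2}", f"{c1}-{c2}"))
    return hits


# Hypothetical coordinates, invented purely for illustration.
example = [
    ("GLU", 139, (0.0, 0.0, 0.0)),
    ("GLU", 228, (3.0, 0.0, 0.0)),
    ("HIS", 50, (10.0, 0.0, 0.0)),
    ("SER", 80, (12.5, 0.0, 0.0)),
    ("ALA", 5, (1.0, 0.0, 0.0)),
]
hits = find_units(example)
print(hits)
```

Because each unit involves only two residues, a search of this kind does not require the complete active site to be conserved, which is the advantage over larger templates noted above. A realistic implementation would need to consider side-chain geometry and not only a single interatomic distance.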
Shaw et al. [37] have described an interesting example
in which two of these smaller units, a classic Glu–Glu
cellulase dyad (similar to that shown in Figure 3c) and a
Ser–His–Glu triad, combine in a novel way. The triad is
used to couple one of the catalytic glutamates with
another deeply buried glutamate. The buried glutamate
has a high pKa owing to its location in the core of the
protein and thus raises the pKa of the catalytic glutamate,
ensuring that it is protonated and ready to be a proton
donor. Searching for these types of combinations of
catalytic units could help to predict catalytic functions
from structure. This example also demonstrates how
catalytic units interact with, and require for their
function, the global structure of the enzyme, which in
this example provides the hydrophobic core required to
raise the pKa of the buried glutamate.
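
The link between a raised pKa and a glutamate that is 'protonated and ready to be a proton donor' follows from the Henderson-Hasselbalch relationship, under which the protonated fraction of an acid is 1 / (1 + 10^(pH - pKa)). The short Python sketch below evaluates this expression. The function name fraction_protonated and the pKa values used are illustrative assumptions and are not measurements reported by Shaw et al. [37].

```python
def fraction_protonated(pka, ph):
    """Fraction of an acid in its protonated (HA) form at a given pH,
    from the Henderson-Hasselbalch relationship."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Illustrative values only: a carboxylate with a pKa of about 4 compared
# with one whose pKa has been shifted upwards by its environment.
for pka in (4.0, 7.0, 8.0):
    print(f"pKa {pka}: fraction protonated at pH 7 = {fraction_protonated(pka, 7.0):.3f}")
```

With these illustrative numbers, a carboxylate with a pKa near 4 is almost entirely deprotonated at neutral pH, whereas one with a pKa of 8 is mostly protonated and can therefore act as a proton donor.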
Concluding remarks
With the advent of structural genomics [38], the ability to
predict function, including catalytic mechanisms, from
enzyme structures [39] is increasingly important. In
addition, the design or redesign of enzymes to bind new
substrates or to perform new reactions is being actively
pursued [34,40]. The catalytic units that we have
described here provide a useful framework for understanding the chemistry performed by enzymes and might
help to develop techniques for predicting and designing
the mechanisms of enzymes.
Supplementary data
Supplementary data associated with this article can be
found at doi:10.1016/j.tibs.2005.09.006
References
1 Perona, J. and Craik, C. (1997) Evolutionary divergence of substrate
specificity within the chymotrypsin-like serine protease fold. J. Biol.
Chem. 272, 29987–29990
2 Hedstrom, L. (2002) Serine protease mechanism and specificity. Chem.
Rev. 102, 4501–4524
3 Topf, M. et al. (2002) Ab initio QM/MM dynamics simulation of the
tetrahedral intermediate of serine proteases: insights into the active
site hydrogen-bonding network. J. Am. Chem. Soc. 124, 14780–14788
4 Ishida, T. and Kato, S. (2004) Role of Asp102 in the catalytic relay
system of serine proteases: a theoretical study. J. Am. Chem. Soc. 126,
7111–7118
5 Blow, D. (2000) So do we understand how enzymes work? Struct. Fold.
Des. 8, R77–R81
6 Williams, D. et al. (2004) Understanding noncovalent interactions:
ligand binding energy and catalytic efficiency from ligand-induced
reductions in motion within receptors and enzymes. Angew. Chem.
Int. Ed. Engl. 43, 6596–6616
7 Porter, C. et al. (2004) The catalytic site atlas: a resource of catalytic
sites and residues identified in enzymes using structural data. Nucleic
Acids Res. 32, D129–D133
8 Bartlett, G.J. et al. (2002) Analysis of catalytic residues in enzyme
active sites. J. Mol. Biol. 324, 105–121
9 Williams, R. (2003) Metallo-enzyme catalysis. Chem. Commun.,
1109–1113
10 Mure, M. (2004) Tyrosine-derived quinone cofactors. Acc. Chem. Res.
37, 131–139
11 Murataliev, M. et al. (2004) Electron transfer by diflavin reductases.
Biochim. Biophys. Acta 1698, 1–26
12 Hernick, M. and Fierke, C. (2005) Zinc hydrolases: the mechanisms of
zinc-dependent deacetylases. Arch. Biochem. Biophys. 433, 71–84
13 Harris, T. and Turner, G. (2002) Structural basis of perturbed pKa
values of catalytic groups in enzyme active sites. IUBMB Life 53,
85–98
14 Schmidt, D. and Westheimer, F. (1971) pK of the lysine amino group at
the active site of acetoacetate decarboxylase. Biochemistry 10,
1249–1253
15 Pearl, F.M. et al. (2000) Assigning genomic sequences to CATH.
Nucleic Acids Res. 28, 277–282
16 Berman, H.M. et al. (2000) The Protein Data Bank. Nucleic Acids Res.
28, 235–242
17 Hollfelder, F. et al. (2001) On the magnitude and specificity of medium
effects in enzyme-like catalysts for proton transfer. J. Org. Chem. 66,
5866–5874
18 Hammes, G. (2002) Multiple conformational changes in enzyme
catalysis. Biochemistry 41, 8221–8228
19 Gutteridge, A. and Thornton, J. (2004) Conformational change in
substrate binding, catalysis and product release: an open and shut
case? FEBS Lett. 567, 67–73
20 Gutteridge, A. and Thornton, J. (2005) Conformational changes
observed in enzyme crystal structures upon substrate binding.
J. Mol. Biol. 346, 21–28
21 Berry, M. and Phillips, G. (1998) Crystal structures of Bacillus
stearothermophilus adenylate kinase with bound Ap5A, Mg2+ Ap5A,
and Mn2+ Ap5A reveal an intermediate lid position and six coordinate
octahedral geometry for bound Mg2+ and Mn2+. Proteins 32, 276–288
22 Frazao, C. et al. (1999) Crystal structure of cardosin A, a glycosylated
and Arg-Gly-Asp-containing aspartic proteinase from the flowers of
Cynara cardunculus L. J. Biol. Chem. 274, 27694–27701
23 Varrot, A. and Davies, G. (2003) Direct experimental observation of
the hydrogen-bonding network of a glycosidase along its reaction
coordinate revealed by atomic resolution analyses of endoglucanase
Cel5A. Acta Crystallogr. D 59, 447–452
24 Sprogoe, D. et al. (2004) Crystal structure of sucrose phosphorylase
from Bifidobacterium adolescentis. Biochemistry 43, 1156–1162
25 Lubkowski, J. et al. (2003) Atomic resolution structure of Erwinia
chrysanthemi L-asparaginase. Acta Crystallogr. D 59, 84–92
26 Ehrensberger, A. and Wilson, D. (2004) Structural and catalytic
diversity in the two family 11 aldo-keto reductases. J. Mol. Biol. 337,
661–673
27 Hennig, M. et al. (2002) The catalytic mechanism of indole-3-glycerol
phosphate synthase: crystal structures of complexes of the enzyme
from Sulfolobus solfataricus with substrate analogue, substrate, and
product. J. Mol. Biol. 319, 757–766
28 Gerlt, J. and Babbitt, P. (2001) Divergent evolution of enzymatic
function: mechanistically diverse superfamilies and functionally
distinct suprafamilies. Annu. Rev. Biochem. 70, 209–246
29 Teplyakov, A. et al. (1998) Involvement of the C terminus in
intramolecular nitrogen channeling in glucosamine 6-phosphate
synthase: evidence from a 1.6 Å crystal structure of the isomerase
domain. Structure 6, 1047–1055
30 Teplyakov, A. et al. (1999) The mechanism of sugar phosphate
isomerization by glucosamine 6-phosphate synthase. Protein Sci. 8,
596–602
31 Song, H. and Suh, S. (1998) Kunitz-type soybean trypsin inhibitor
revisited: refined structure of its complex with porcine trypsin reveals
an insight into the interaction between a homologous inhibitor from
Erythrina caffra and tissue-type plasminogen activator. J. Mol. Biol.
275, 347–363
32 Lloyd, S. et al. (1999) The mechanism of aconitase: 1.8 Å resolution
crystal structure of the S642A: citrate complex. Protein Sci. 8,
2655–2662
33 Liao, D. et al. (1991) Structure of the IIa domain of the glucose
permease of Bacillus subtilis at 2.2-Å resolution. Biochemistry 30,
9583–9594
34 Dwyer, M. et al. (2004) Computational design of a biologically active
enzyme. Science 304, 1967–1971
35 Ringe, D. et al. (2004) Protein structure to function: insights from
computation. Cell. Mol. Life Sci. 61, 387–392
36 Torrance, J. et al. (2005) Using a library of structural templates to
recognise catalytic sites and explore their evolution in homologous
families. J. Mol. Biol. 347, 565–581
37 Shaw, A. et al. (2002) A novel combination of two classic catalytic
schemes. J. Mol. Biol. 320, 303–309
38 Todd, A. et al. (2005) Progress of structural genomics initiatives:
an analysis of solved target structures. J. Mol. Biol. 348,
1235–1260
39 Jones, S. and Thornton, J. (2004) Searching for functional sites in
protein structures. Curr. Opin. Chem. Biol. 8, 3–7
40 Korkegian, A. et al. (2005) Computational thermostabilization of an
enzyme. Science 308, 857–860