Glycoside hydrolases and glycosyltransferases: families and functional modules

Yves Bourne and Bernard Henrissat*

The past year has witnessed the expected increase in the number of solved structures of glycoside hydrolases and glycosyltransferases, and their constitutive modules. These structures show that, while glycoside hydrolases display an extraordinary variety of folds, glycosyltransferases and carbohydrate-binding modules appear to belong to a much smaller number of folding families.

Addresses
Architecture et Fonction des Macromolécules Biologiques, UMR6098, CNRS and Universités Aix-Marseille I and II, 31 Chemin Joseph Aiguier, 13402 Marseille cedex 20, France
*e-mail: bernie@afmb.cnrs-mrs.fr

Current Opinion in Structural Biology 2001, 11:593–600
0959-440X/01/$ - see front matter
© 2001 Elsevier Science Ltd. All rights reserved.

Abbreviations
CBM  carbohydrate-binding module
GH   glycoside hydrolase
GT   glycosyltransferase

Introduction
A captivating facet of carbohydrates and glycoconjugates is their exceptional structural and functional diversity [1], which requires a comparable multiplicity of the glycosyltransferases (GTs) and glycoside hydrolases (GHs) responsible for their biosynthesis and selective cleavage. To cope with this multiplicity, a classification of GHs into families based on amino acid similarities was introduced a decade ago [2] and is updated regularly [3,4]. In contrast to the IUB-MB enzyme nomenclature, this classification scheme was designed to integrate both structural and mechanistic features of these enzymes. Strikingly, the system based on amino acid sequence similarities (hence also reflecting similar structural features) often grouped enzymes with different substrate specificity in a single ‘polyspecific’ family. This classification system has since been extended to GTs [5]. Over the years, the number of families of GHs and GTs has grown steadily and currently there are 85 and 52 families, respectively (these families are available on the continuously updated CAZy web server at http://afmb.cnrs-mrs.fr/~pedro/CAZY/db.html). Because the structures of proteins are better conserved than their sequences, the grouping of several families in ‘clans’ or superfamilies has been introduced [6].

In addition to the multiplicity of their families, GHs and GTs also frequently display a modular structure in which the catalytic module carries one or several ancillary modules that are often but not always carbohydrate binding. In the genomic era, this modularity is of particular importance for correct open reading frame (ORF) annotation and functional prediction [7], whereas in structural biology, it can be a major hurdle for crystallisation. Here, we review recent structural advances concerning GH and GT enzymes, and their constitutive carbohydrate-binding modules (CBMs).

Table 1
Structural status of various glycoside hydrolase families*.

Family   Protein                                              Representative PDB code   Refs†
GH-3     Exo-β-1,3-1,4-glucanase (Hordeum vulgare)            1EX1
GH-4     6-P-α-glucosidase (Bacillus subtilis)                Cryst‡
GH-8     Cellulase CelC (Clostridium cellulolyticum)          1G4L
         Cellulase Cel8A (Clostridium thermocellum)           1CEM
GH-12    Cellulase (Streptomyces lividans)                    1NLR
GH-26    β-Mannanase (Pseudomonas fluorescens)                1J9Y
GH-27    N-acetyl α-galactosaminidase (chicken)               Cryst‡
GH-28    Rhamnogalacturonase (Aspergillus aculeatus)          1RMG
         Polygalacturonase (Aspergillus niger)                1CZF
         Polygalacturonase (Erwinia carotovora)               1BHE
GH-38    α-Mannosidase (Drosophila melanogaster)              1HTY                      [11••]
GH-43    Arabinanase (Pseudomonas fluorescens)                Cryst‡
GH-46    Chitosanase (Bacillus circulans)                     1QGI
         Chitosanase (Streptomyces sp.)                       1CHK
GH-47    α-Mannosidase (human)                                1FMI                      [9]
         α-Mannosidase (yeast)                                1DL2                      [10••]
         α-Mannosidase (Trichoderma reesei)                   1HCU                      (a)
GH-48    Cellulase CelF (Clostridium cellulolyticum)          1FCE
GH-53    Galactanase (Aspergillus aculeatus)                  1FHL                      (b)
GH-56    Hyaluronidase (Apis mellifera)                       1FCQ                      [12•]
GH-57    4-α-glucanotransferase (Thermococcus litoralis)      NA                        (c)
GH-65    Maltose phosphorylase (Lactobacillus brevis)         1H54                      [16••]
GH-67    α-Glucuronidase (Bacillus stearothermophilus)        NA                        (d)
GH-77    Amylomaltase (Thermus aquaticus)                     1CWY                      [15]
GH-82    ι-Carrageenase (Alteromonas fortis)                  1H80                      (e)
GH-83    Neuraminidase (Newcastle disease virus)              1E8T                      [14•]

*The families mentioned here are those solved or crystallised since the comprehensive survey of GH structures by Davies and Henrissat [8]. †References are indicated only for families for which a representative structure was solved during the past year. ‡Crystallisation reported. (a) F Van Petegem, H Contreras, R Contreras, J Van Beeumen, unpublished data. (b) C Ryttersgaard, S Larsen, unpublished data. (c) H Imamura et al., abstract 89, 4th Carbohydrate-Bioengineering Meeting, 10–13 June 2001, Stockholm, Sweden. (d) G Golan et al., abstract 108, 4th Carbohydrate-Bioengineering Meeting, 10–13 June 2001, Stockholm, Sweden. (e) G Michel, L Chantalat, O Dideberg, unpublished data (see Update). NA, PDB accession code not available at time of writing.

Figure 1
Ribbon diagrams of (a) Saccharomyces cerevisiae α-mannosidase I (PDB accession code 1DL2) and (b) Drosophila α-mannosidase II (PDB accession code 1HTY), representative of GH families GH-47 and GH-38, respectively. β Sheets are coloured in cyan and α helices are in red. Figure prepared with SPOCK [45] and Raster3D [46].

Glycoside hydrolases
The number of structure determinations of GHs continues to grow steadily. In 1995, there were 22 families of GHs with known representative structures [8].
The number of families with known structures has almost doubled since to reach 38 at the time of writing, with no less than 9 families reported since January 2000 (see Table 1). These new 3D structures have further expanded the extraordinary variety of folds exhibited by these enzymes (for a review, see [8]) and suggest that yet more folds might exist for families that are still awaiting structural determination. A good example is the α-mannosidases, which are commonly found in GH families GH-38 and GH-47 [9,10••,11••]. Remarkably, the representative structures of each family revealed a novel fold (Figure 1)! In addition, several other structures determined in 2000 have reinforced the families of folds already known for GH enzymes. For example, bee venom hyaluronidase, the first structure reported for family GH-56, resembles a classical triose phosphate isomerase, except that the barrel is composed of only seven strands [12•]. The structure of the complex of the hyaluronidase with a substrate analogue suggests a molecular mechanism involving anchimeric assistance of the N-acetyl group of the substrate in catalysis. Such a mechanism has been confirmed for family GH-20 by the structure of Streptomyces plicatus β-N-acetylhexosaminidase in complex with N-acetylglucosamine-thiazoline [13]. Despite insignificant sequence similarity with bacterial and influenza virus neuraminidases, the structure of the family GH-83 hemagglutinin-neuraminidase from Newcastle disease virus revealed a typical neuraminidase active site within a β-propeller fold [14•]. This clear resemblance allows the inclusion of family GH-83 in a superfamily called ‘clan GH-E’, which already contained families GH-33 and GH-34 (for more on clans, see [6]). In a similar vein, the first structural description of a family GH-77 member has also been accomplished with the resolution of the structure of Thermus aquaticus amylomaltase, a transglycosylating enzyme that produces amylose macrocycles. 
The (β/α)8 structure reveals a clear resemblance to family GH-13 (also known as the α-amylase superfamily) members [15].

Family GH-65 groups trehalases together with maltose phosphorylases. The structure of maltose phosphorylase from Lactobacillus brevis shows a striking resemblance to the (α/α)6-barrel structure of family GH-15 glucoamylases [16••]. This resemblance allows the creation of a new clan of GHs (‘clan GH-L’, comprising families GH-65 and GH-15) and provides a remarkable illustration of how little it takes to evolve an inverting GH into a phosphorylase by recruiting phosphate instead of water as the nucleophile in the single displacement mechanism (see also Update).

Structures have also been reported for families that already had a known structural representative, but these are too numerous to all be reported here. Amongst the most notable of these structures, one can mention a family GH-1 plant β-glucosidase, whose narrow substrate specificity for aromatic aglycones is dictated by a slot-like aglycone-binding subsite [17]. In the same family, Burmeister et al. [18•] have reported several high-resolution structures of the plant defence protein myrosinase in complex with inhibitors and ascorbate. Another interesting structure published this year was that of family GH-18 endo-β-N-acetylglucosaminidase F3 from Flavobacterium meningosepticum [19]. The structure of Bacillus agaradhaerens cellulase Cel5A (family GH-5) in complex with a substrate analogue in which a single α-1,4 glycosidic bond was incorporated into an otherwise all β-1,4-linked oligosaccharide has led to the discovery of a whole new class of cellulase inhibitors. These inhibitors have affinities that are 150 times better than that observed for an all β-linked compound [20•]. Finally, and very recently, the mechanism of hen egg-white lysozyme has been revisited by a clever alliance of mutagenesis, organic chemistry, mass spectrometry and X-ray crystallography. The outcome, that hen egg-white lysozyme does form a covalent glycosyl–enzyme intermediate using Asp52, puts an end to a long-lived controversy and discards the ion-pair intermediate hypothesis found in all textbooks [21••].

Glycosyltransferases
While only two GT structures were solved by 1995 (rabbit muscle glycogen phosphorylase and bacteriophage T4 β-glucosyltransferase), structures of representatives of nine families of GTs have now been solved in total, with structures reported for six families since January 2000 (Table 2).

Table 2
Glycosyltransferases: families and structures (May 2001).

Family   Protein                                                        Fold   Representative PDB code   References*
GT-1     GtfB (Amycolatopsis orientalis)                                GT-B   1IIR                      [29]
GT-2     SpsA (Bacillus subtilis)                                       GT-A   1QG8
GT-6     α-1,3-galactosyltransferase (bovine)                           GT-A   1FG5                      [27•]
GT-7     β-1,4-galactosyltransferase β4GalT1 (bovine)                   GT-A   1FGX
GT-8     α-1,4-galactosyltransferase LgtC (Neisseria meningitidis)      GT-A   1GA8                      [26••]
         Glycogenin (rabbit)                                                   NA                        (a)
GT-13    β-1,2-N-acetylglucosaminyltransferase GnT1 (rabbit)            GT-A   1FO8                      [24]
GT-28    β-1,4-N-acetylglucosaminyltransferase MurG                     GT-B   1FOK                      [47]
GT-35    Maltodextrin phosphorylase (E. coli)                           GT-B   1AHP
         Glycogen phosphorylase (human)                                        1EM6
         Glycogen phosphorylase (rabbit)                                       1ABB
         Glycogen phosphorylase (yeast)                                        1YGP
GT-43    β-1,3-glucuronyltransferase (human)                            GT-A   1FGG                      [25]
NC       β-Glucosyltransferase (bacteriophage T4)                       GT-B   1BGT

*References are indicated only for families for which a representative structure was solved during the past year. (a) BJ Gibbons, PJ Roach, TD Hurley, abstract W0269, Annual Meeting of the American Crystallographic Association, 21–26 July 2001, Los Angeles, CA. NA, PDB accession code not available at time of writing; NC, nonclassified.

The folds of glycosyltransferases
In marked contrast to the wide variety of folds displayed by the GHs, GTs are less ‘exciting’ if one considers only their folds.
So far, all GT structures adopt only two folds [22]. By analogy to the GH clans, we name these two folds ‘GT-A’ and ‘GT-B’ (Table 2). The GT-A fold, best represented by family GT-2 SpsA from Bacillus subtilis [23], comprises two dissimilar domains, one involved in nucleotide binding (called the SGC domain in [22]) and the other binding the acceptor. The structures of several GTs sharing this fold have been solved recently, including rabbit N-acetylglucosaminyltransferase GnT1 (family GT-13) [24] and human β-1,3-glucuronyltransferase I (family GT-43) [25]. While all previously determined GT structures were ‘inverting’ enzymes (e.g. those that produce β-bonds from α-linked nucleotide sugars), Persson et al. [26••] have accomplished remarkable work with the structural resolution of a ‘retaining’ enzyme from family GT-8, the bacterial α-1,4-galactosyltransferase LgtC from Neisseria meningitidis. Shortly after, the structure of bovine α-1,3-galactosyltransferase (family GT-6), another retaining enzyme, was also reported [27•]. Both structures adopt the GT-A fold. The GT-B fold, originally found in phage T4 DNA-glucosyltransferase [28] and characterised by two similar Rossmann fold subdomains, is also found in families GT-28 and GT-35 (Table 2). Mulichak et al. [29] have completed this folding family with the crystal structure of the family GT-1 UDP-glucosyltransferase GtfB, which is involved in the biosynthesis of the vancomycin group of antibiotics. As only two large folding superfamilies (GT-A and GT-B) have emerged so far for GTs, we have submitted the remaining, unsolved families to threading analyses [30,31]. The results suggest that the other families will fold either like GT-A (examples include GT-12, GT-21 and GT-27) or like GT-B (for instance GT-4, GT-5, GT-9, GT-19, GT-20, GT-30 and GT-33). A number of families resist the threading analyses, suggesting that other folds perhaps exist. 
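The fold assignments discussed above can be collected into a small lookup table. The Python sketch below is only an illustration: the names GT_FOLDS and fold_of, the dictionary layout and the 'solved'/'threading' labels are assumptions introduced here, whereas the family-to-fold assignments are those given in the text and in Table 2.

```python
# GT family fold assignments as stated in the text and Table 2 (May 2001).
# "solved"    = a representative 3D structure has been reported
# "threading" = fold suggested by threading analyses only
# The layout and the names below are illustrative assumptions.
# The nonclassified (NC) phage T4 beta-glucosyltransferase also adopts the
# GT-B fold but has no family number, so it is not listed here.
GT_FOLDS = {
    "GT-A": {
        "solved": ["GT-2", "GT-6", "GT-7", "GT-8", "GT-13", "GT-43"],
        "threading": ["GT-12", "GT-21", "GT-27"],
    },
    "GT-B": {
        "solved": ["GT-1", "GT-28", "GT-35"],
        "threading": ["GT-4", "GT-5", "GT-9", "GT-19", "GT-20", "GT-30", "GT-33"],
    },
}


def fold_of(family):
    """Return (fold, evidence) for a GT family, or None if it is unassigned."""
    for fold, groups in GT_FOLDS.items():
        for evidence, families in groups.items():
            if family in families:
                return fold, evidence
    return None


if __name__ == "__main__":
    for fam in ("GT-8", "GT-27", "GT-1", "GT-9", "GT-3"):
        print(fam, fold_of(fam))
```

Keeping the evidence label next to each family separates experimentally observed folds from predictions, which matters because several families still resist the threading analyses and return no assignment at all.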
The expectation that further GT folds remain to be found should be moderated by the fact that nucleotide binding may constitute an important constraint, which might prevent the proliferation of folds seen with the glycosidases. Finally, the high sensitivity of threading analyses has an unexpected drawback: the fold might be conserved, but not the function. Each of the two GT folding superfamilies has counterparts with significant structural similarity detectable by these threading analyses. Thus, a bacterial UDP-N-acetylglucosamine 2-epimerase clearly belongs to the GT-B fold [29,32]. In a similar fashion, the bacterial glucosamine-1-phosphate pyrophosphorylase GlmU shows substantial resemblance to the GT-A fold [33].

Figure 2
Stereo view of the superimposition of four structures from the GT-A fold family around the active site: SpsA in cyan (family GT-2), GnT1 in orange (family GT-13), β4GalT1 in green (family GT-7 [48]) and glucuronyltransferase 1 in magenta (family GT-43). The catalytic base appears at the top in the same colour code. The sugar-nucleotide donor around the manganese-binding region is shown at the bottom. For clarity, the ligands of the manganese ion (amino acid sidechains, solvent and nucleotide-sugar) are shown only for GnT1 (PDB accession code 1FOA).

The mechanism of glycosyltransferases
In contrast to the GH ‘clans’, in which the catalytic mechanism is strictly conserved, the two folding superfamilies of GTs group families of retaining and inverting enzymes together. Whereas the mechanism of inverting GTs appears reasonably well understood (a single displacement reaction with base activation of the acceptor), that of retaining enzymes remains poorly understood. Although a double displacement is deemed necessary, no evidence of a glycosyl–enzyme intermediate has yet been found for retaining GTs and the nature of the enzymatic nucleophile is the subject of debate [34].
By direct analogy with the GHs, aspartic acid or glutamic acid groups are excellent candidates for this task. Because these residues are sometimes not conserved in certain GT families or are sometimes not located appropriately in the active site, however, alternative nucleophiles have to be identified. The carbonyl oxygen in an amide sidechain or even perhaps in the mainchain could, in principle, also play the role of the nucleophile, as shown by the retaining N-acetylglucosaminidases, in which the enzymatic nucleophile is replaced by the acetamido group of the substrate. More structural work and mechanism-based inhibitors of retaining GTs are needed to resolve this important issue. Another important mechanistic feature emerging from 3D structures of GTs is the interplay between the donor and acceptor subsites: in some cases, it is only upon binding the nucleotide-sugar that the acceptor-binding site becomes fully functional (for instance, GnT1 and LgtC), whereas the reverse happens in other cases (for instance, GtfB). Perhaps important is the presence of a hinge region, typical of the GT-B fold, that separates the two constitutive subdomains and whose flexibility might be critical for specificity and catalysis. A beautiful example of the conformational restraints required within the acceptor-binding site should soon be demonstrated by the expected structure of rabbit glycogenin from family GT-8 (BJ Gibbons, PJ Roach, TD Hurley, abstract W0269, Annual Meeting of the American Crystallographic Association, 21–26 July 2001, Los Angeles, CA). This enzyme has the unique property of self-transferring several glucose moieties from a nucleotide-glucose donor to a tyrosine residue, forming a 10-residue α-1,4-glucan chain sufficient to initiate glycogen biosynthesis. The two main superfamilies of GTs also diverge in the utilisation of divalent cations. 
In all characterised members of the GT-A fold superfamily, there is a strong structural restraint on the coordination of a metal ion by two phosphate oxygens of the nucleotide and sidechain residues from the protein (Figure 2). These sidechains are often called the ‘DXD motif’ [35], although the aspartic acid residues in this motif can be replaced by other residues. In contrast, members of the GT-B fold superfamily do not have such a motif and no metal has been identified clearly in the 3D structures solved so far, even though some (but not all) of these enzymes have been reported to be metal-dependent (see also Update).

Structures of carbohydrate-binding modules and entire multimodular enzymes
The CBMs also form sequence-based families (27 at the time of writing) and they have also witnessed a recent acceleration in the number of structural determinations, with three families structurally depicted by 1995 and thirteen now, with five families described since January 2000 (Table 3). Because of their frequently small size, a significant number of CBM structures have been determined by NMR.

Isolated carbohydrate-binding modules
Five families of CBMs have seen their first structural characterisation recently. The xylan-binding module of xylanase A from Pseudomonas fluorescens ssp cellulosa (family CBM-10) consists of two antiparallel β sheets, one with two strands and one with three, with a short α helix across one face of the three-stranded sheet [36]. The 3D structure of the family CBM-12 module of chitinase A1 from Bacillus circulans WL-12 was determined by NMR [37]. This module, which binds chitin, has a compact twisted β-sandwich structure reminiscent of that found in family CBM-5 [38]. Family CBM-14 contains chitin-binding modules either borne by chitinases or existing in isolation, such as the chitin-binding protein tachycitin, a 73-residue antimicrobial polypeptide from Tachypleus tridentatus. The 3D structure of tachycitin shares some similarity with the chitin-binding modules of family CBM-18; this has been proposed to have arisen by convergent evolution [39]. In an elegant piece of work, Charnock and co-workers [40] have reported both the function and the X-ray structure of a family CBM-22 module that binds xylan. This work also showed that some CBM-22 family members have lost their polysaccharide-binding function. Finally, the β-sandwich structure of the C-terminal CBM-9 module of Thermotoga maritima xylanase 10A has been determined alone and in complex with cellobiose by X-ray crystallography [41•]. The stunning result is that the T. maritima CBM appears to bind selectively to the reducing ends of cellulose.

Because of their relatively small size, CBMs fold predominantly as all-β proteins. Like the hydrolases and transferases, superfamilies are also beginning to emerge for the CBMs. For instance, families CBM-4 and CBM-22 are clearly related and, based on sequence motifs, it has been proposed that they form a superfamily with families CBM-16, CBM-17 and CBM-27 [42].

Table 3
Carbohydrate-binding modules: families and structures (May 2001).

Family   Protein                                              Representative PDB code   Refs*
CBM-1    Cellulase Cel7A (Trichoderma reesei)                 1CBH
CBM-2    Xylanase Xyn10A (Cellulomonas fimi)                  1EXG
         Xylanase Xyn11A (Cellulomonas fimi)                  2XBD
CBM-3    Scaffoldin (Clostridium cellulolyticum)              1G43
         Scaffoldin (Clostridium thermocellum)                1NBC
         Cellulase Cel9A (Thermobifida fusca)                 1TF4
CBM-4    Cellulase Cel9B (Cellulomonas fimi)                  1ULO
CBM-5    Cellulase Cel5A (Erwinia chrysanthemi)               1AIW
CBM-6    Xylanase U (Clostridium thermocellum)                NA                        (a)
CBM-9    Xylanase Xyn10A (Thermotoga maritima)                1I8A                      [41•]
CBM-10   Xylanase Xyn10A (Pseudomonas fluorescens)            1QLD                      [36]
CBM-12   Chitinase A1 (Bacillus circulans)                    1ED7                      [37]
         Chitinase B (Serratia marcescens)                    1E15                      [44]
CBM-13   Xylanase (Streptomyces olivaceoviridis)              1XYF                      [43•]
         Ricin (Ricinus communis)                             2AAI
         Ebulin (Sambucus ebulus)                             1HWM                      [49]
CBM-14   Tachycitin (Tachypleus tridentatus)                  1DQC                      [39]
CBM-17   Endoglucanase EngF (Clostridium cellulovorans)       1J83                      (b)
CBM-18   Hevein (Hevea brasiliensis)                          1HEV
         Antimicrobial peptide 2 (Amaranthus caudatus)        1MMC
         Agglutinin (Triticum aestivum)                       1WGC
CBM-20   Glucoamylase (Aspergillus niger)                     1KUM
         β-Amylase (Bacillus cereus)                          1CQY
CBM-22   Xylanase Xyn10B (Clostridium thermocellum)           1DYO                      [40]

*References are indicated only for families for which a representative structure was solved during the past year. (a) M Czjzek et al., abstract 131, 4th Carbohydrate-Bioengineering Meeting, 10–13 June 2001, Stockholm, Sweden. (b) V Notenboom, B Boraston, A Freelove, D Kilburn, DR Rose, unpublished data. NA, PDB accession code not available at time of writing.

Entire multimodular enzymes
The flexibility of the linker peptides connecting the various modules makes intact modular enzymes with a catalytic module and a CBM particularly recalcitrant to crystallisation. As a consequence, there are only a very few solved structures of intact modular GHs and none of a modular GT.
In this respect, the two entire modular GHs solved this year probably represent a tour de force. The first was a xylanase from Streptomyces olivaceoviridis featuring a catalytic module from family GH-10 carrying a xylan-binding module from family CBM-13 [43•]. The other is chitinase B from Serratia marcescens comprising a family GH-18 catalytic module and a C-terminal chitinbinding module from family CBM-12 [44] (Figure 3). 598 Carbohydrates and glycoconjugates Figure 3 Ribbon diagrams of (a) S. olivaceoviridis family GH-10 xylanase (yellow) linked to a CBM from family CBM-13 (cyan) (PDB accession code 1XYF) and (b) S. marcescens family GH-18 chitinase B (yellow) linked to a CBM from family CBM-12 (cyan) (PDB accession code 1E15) via a long ordered linker (orange). Conclusions References and recommended reading While the number of enzyme families (GH or GTs) will grow relatively slowly now, a systematic analysis of protein modularity should reveal novel families of noncatalytic modules (PM Coutinho, B Henrissat, unpublished data), some of which might turn out to be CBMs. A major challenge remains the structural elucidation of multimodular enzymes: some have over ten different modules! Indeed, the adjunction of multiple CBMs to the catalytic modules is probably a convenient way to build larger active sites from pre-existing scaffolds and these extended sites allow the perception of ligand structures remote from the site of catalysis itself. Papers of particular interest, published within the annual period of review, have been highlighted as: Update The crystal structure of the muramidase from Streptomyces coelicolor has recently been determined [50]. This structure (PBD accession code 1JFX) is the first reported for a family GH-25 member. The structure of the mannanase from Pseudomonas cellulosa (family GH-26; Table 1; PBD accession code 1J9Y) has now been published [51]. 
Recent work has demonstrated that T4 phage β-glucosyltransferase, which adopts the GT-B fold, can bind metal ions near the β-phosphate of the nucleotide [52]. • of special interest •• of outstanding interest 1. Laine RA: A calculation of all possible oligosaccharide isomers both branched and linear yields 1.05 x 1012 structures for a reducing hexasaccharide: the Isomer Barrier to development of single-method saccharide sequencing or synthesis systems. Glycobiology 1994, 4:759-767. 2. Henrissat B: A classification of glycosyl hydrolases based on amino acid sequence similarities. Biochem J 1991, 280:309-316. 3. Henrissat B, Bairoch A: New families in the classification of glycosyl hydrolases based on amino acid sequence similarities. Biochem J 1993, 293:781-788. 4. Henrissat B, Bairoch A: Updating the sequence-based classification of glycosyl hydrolases. Biochem J 1996, 316:695-696. 5. Campbell JA, Davies GJ, Bulone V, Henrissat B: A classification of nucleotide-diphospho-sugar glycosyltransferases based on amino acid sequence similarities. Biochem J 1997, 326:929-939. 6. Henrissat B, Davies G: Structural and sequence-based classification of glycoside hydrolases. Curr Opin Struct Biol 1997, 7:637-644. 7. Henrissat B, Davies GJ: Glycoside hydrolases and glycosyltransferases: families, modules and implications for genomics. Plant Physiol 2000, 124:1515-1519. 8. Davies G, Henrissat B: Structures and mechanisms of glycosyl hydrolases. Structure 1995, 3:853-859. 9. Vallée F, Karaveg K, Herscovics A, Moremen KW, Howell PL: Structural basis for catalysis and inhibition of N-glycan processing class I α1,2-mannosidases. J Biol Chem 2000, 275:41287-41298. Acknowledgements We would like to thank Jim Rini, Michael Garavito, David Rose and Herman van Tilbeurgh for providing the coordinates and preprints for GnT1, GtfB, α-mannosidase II and maltose phosphorylase, respectively, before release. Glycoside hydrolases and glycosyltransferases Bourne and Henrissat 10. 
Vallée F, Lipari F, Yip P, Sleno B, Herscovics A, Howell PL: Crystal •• structure of a class I α1,2-mannosidase involved in N-glycan processing and endoplasmic reticulum quality control. EMBO J 2000, 19:581-588. The first crystal structure of S. cerevisiae α-1,2-mannosidase I reveals a novel (α/α)7-barrel fold for family GH-47. The unexpected presence of a fully ordered N-glycan within the active site of a neighbouring molecule in the crystal provides a detailed description of the catalytic mechanism involving a calcium ion 11. van den Elsen JMH, Kuntz DA, Rose DR: Structure of Golgi α •• mannosidase II: a target for inhibition of growth and metastasis of cancer cells. EMBO J 2001, 20:3008-3017. The structure of the enormous (1108 residues) Drosophila Golgi α-mannosidase II has been solved in the presence of the anticancer agent swainsonine and the inhibitor deoxymannojirimicin. The structure reveals a novel protein fold for family GH-38, consisting of an N-terminal α/β domain, a three-helical bundle and an all-β C-terminal domain forming a single compact entity. A zinc atom appears to be involved both in the substrate specificity of the enzyme and directly in the catalytic mechanism. 12. Markovic-Housley Z, Miglierini G, Soldatova L, Rizkallah PJ, Muller U, • Schirmer T: Crystal structure of hyaluronidase, a major allergen of bee venom. Structure 2000, 8:1025-1035. This first structural determination for family GH-56 not only reveals the overall topology of bee venom hyaluronidase but also suggests a molecular mechanism involving anchimeric assistance of the N-acetyl group of the substrate for catalysis. 13. Mark BL, Vocadlo DJ, Knapp S, Triggs-Raine BL, Withers SG, James MN: Crystallographic evidence for substrate-assisted catalysis in a bacterial β-hexosaminidase. J Biol Chem 2001, 276:10330-10337. 599 22. Ünligil UM, Rini JM: Glycosyltransferase structure and mechanism. Curr Opin Struct Biol 2000, 10:510-517. 23. 
Charnock SJ, Davies GJ: Structure of the nucleotide-diphosphosugar transferase, SpsA from Bacillus subtilis, in native and nucleotide-complexed forms. Biochemistry 1999, 38:6380-6385. 24. Ünligil U, Zhou S, Yuwaraj S, Sarkar M, Schachter H, Rini J: X-ray crystal structure of rabbit N-acetylglucosaminyltransferase I: catalytic mechanism and a new protein superfamily. EMBO J 2000, 19:5269-5280. 25. Pedersen LC, Tsuchida K, Kitagawa H, Sugahara K, Darden TA, Negishi M: Heparan/chondroitin sulfate biosynthesis: structure and mechanism of human glucuronyltransferase I. J Biol Chem 2000, 275:34580-34585. 26. Persson K, Ly HD, Dieckelmann M, Wakarchuk WW, Withers SG, •• Strynadka NC: Crystal structure of the retaining galactosyltransferase LgtC from Neisseria meningitidis in complex with donor and acceptor sugar analogs. Nat Struct Biol 2001, 8:166-175. The first structure of a retaining GT. This beautiful piece of work, which combines X-ray crystallography, carbohydrate chemistry and mutagenesis, also reports the first structure of a ternary complex containing both the sugar-nucleotide donor and an acceptor analogue. Although these combined approaches have not uncovered all the details of the catalytic mechanism, they have revealed that the enzyme does not have a carboxylic nucleophile equivalent to that of the retaining GHs. 14. Crennell S, Takimoto T, Portner A, Taylor G: Crystal structure of the • multifunctional paramyxovirus hemagglutinin-neuraminidase. Nat Struct Biol 2000, 7:1068-1074. The crystal structure of the family GH-83 hemagglutinin-neuraminidase from Newcastle disease virus bound to either an inhibitor or the β-anomer of sialic acid reveals a typical neuraminidase active site within a β-propeller fold. Gastinel LN, Bignon C, Misra AK, Hindsgaul O, Shaper JH, Joziasse DH: Bovine α1,3-galactosyltransferase catalytic domain structure and its relationship with ABO histo-blood group and glycosphingolipid glycosyltransferases. EMBO J 2001, 20:638-649. 
This enzyme synthesises the epitope that causes hyperacute rejection observed in pig-to-human xenotransplantation. The crystal structure of this retaining glycosyltransferase in complex with a modified UDP-Gal sugar donor allowed the authors to describe a catalytic mechanism that, unlike LgtC, involves the formation of a covalent glycosyl–enzyme intermediate. 15. Przylas I, Tomoo K, Terada Y, Takaha T, Fujii K, Saenger W, Strater N: Crystal structure of amylomaltase from Thermus aquaticus, a glycosyltransferase catalysing the production of large cyclic glucans. J Mol Biol 2000, 296:873-886. 28. Vrielink A, Ruger W, Driessen HP, Freemont PS: Crystal structure of the DNA modifying enzyme β-glucosyltransferase in the presence and absence of the substrate uridine diphosphoglucose. EMBO J 1994, 13:3413-3422. 16. Egloff MP, Uppenberg J, Haalck L, van Tilbeurgh H: Crystal structure •• of maltose phosphorylase from Lactobacillus brevis: unexpected evolutionary relationship with glucoamylases. Structure 2001, 9:689-697. Maltose phosphorylase is a family GH-65 enzyme that catalyses the conversion of maltose and inorganic phosphate into β-D-glucose-1-phosphate and glucose, without any cofactor. The 3D structure strongly suggests that this enzyme, which has evolved from family GH-14 glucoamylase, has conserved one carboxylate group for acid catalysis and has exchanged the catalytic base for a phosphate-binding pocket. 29. Mulichak AM, Losey HC, Walsh CT, Garavito RM: Structure of the UDP-glucosyltransferase GtfB that modifies the heptapeptide aglycone in biosynthesis of the vancomycin group of antibiotics. Structure 2001, 9:547-557. 17. Czjzek M, Cicek M, Zamboni V, Bevan DR, Henrissat B, Esen A: The mechanism of substrate (aglycone) specificity in β-glucosidases is revealed by crystal structures of mutant maize β-glucosidaseDIMBOA, -DIMBOAGlc, and -dhurrin complexes. Proc Natl Acad Sci USA 2000, 97:13555-13560. 18. 
Burmeister WP, Cottaz S, Rollin P, Vasella A, Henrissat B: High • resolution X-ray crystallography shows that ascorbate is a cofactor for myrosinase and substitutes for the function of the catalytic base. J Biol Chem 2000, 275:39385-39393. Several high-resolution structures of myrosinase in complex with inhibitors and/or L-ascorbate have brought a final answer to the question of the missing acid/base catalyst and the particular activation of the enzyme by ascorbate. 19. Waddling CA, Plummer TH, Tarentino AL, Van Roey P: Structural β-Nbasis for the substrate specificity of endo-β acetylglucosaminidase F3. Biochemistry 2000, 39:7878-7885. 20. Fort S, Varrot A, Schülein M, Cottaz S, Driguez H, Davies GJ: Mixed • linkage cellooligosaccharides: a new class of glycoside hydrolase inhibitors. Chem Biochem 2001, 2:319-325. A beautiful example of unexpected synergy between chemistry (and its pitfalls) X-ray crystallography has led the authors to design a clever class of inhibitors that ‘by-pass’ the catalytic subsite while maintaining binding to the surrounding subsites of endoglucanase Cel5A from B. agaradhaerens. 21. Vocadlo DJ, Davies GJ, Laine R, Withers SG: Catalysis by hen egg •• white lysozyme proceeds via a covalent intermediate. Nature 2001, 412:835-838. This paper brings the long debate on the nature of the reaction intermediate in egg-white lysozyme catalysis to an end. Most textbooks will need to be updated, with the covalent glycosyl–enzyme intermediate replacing the famous (but erroneous) ion pair. 27. • 30. Jones DT: GenTHREADER: an efficient and reliable protein fold recognition method for genomic sequences. J Mol Biol 1999, 287:797-815. 31. Kelley LA, MacCallum RM, Sternberg MJ: Enhanced genome annotation using structural profiles in the program 3D-PSSM. J Mol Biol 2000, 299:499-520. 32. Campbell RE, Mosimann SC, Tanner ME, Strynadka NC: The structure of UDP-N-acetylglucosamine 2-epimerase reveals homology to phosphoglycosyl transferases. 
Biochemistry 2000, 39:14993-15001.
33. Brown K, Pompeo F, Dixon S, Mengin-Lecreulx D, Cambillau C, Bourne Y: Crystal structure of the bifunctional N-acetylglucosamine 1-phosphate uridyltransferase from Escherichia coli: a paradigm for the related pyrophosphorylase superfamily. EMBO J 1999, 18:4096-4107.
34. Davies GJ: Sweet secrets of synthesis. Nat Struct Biol 2001, 8:98-100.
35. Wiggins CA, Munro S: Activity of the yeast MNN1 α-1,3-mannosyltransferase requires a motif conserved in many other families of glycosyltransferases. Proc Natl Acad Sci USA 1998, 95:7945-7950.
36. Raghothama S, Simpson PJ, Szabo L, Nagy T, Gilbert HJ, Williamson MP: Solution structure of the CBM10 cellulose binding module from Pseudomonas xylanase A. Biochemistry 2000, 39:978-984.
37. Ikegami T, Okada T, Hashimoto M, Seino S, Watanabe T, Shirakawa M: Solution structure of the chitin-binding domain of Bacillus circulans WL-12 chitinase A1. J Biol Chem 2000, 275:13654-13661.
38. Brun E, Moriaud F, Gans P, Blackledge MJ, Barras F, Marion D: Solution structure of the cellulose-binding domain of the endoglucanase Z secreted by Erwinia chrysanthemi. Biochemistry 1997, 36:16074-16086.
39. Suetake T, Tsuda S, Kawabata S, Miura K, Iwanaga S, Hikichi K, Nitta K, Kawano K: Chitin-binding proteins in invertebrates and plants comprise a common chitin-binding structural motif. J Biol Chem 2000, 275:17929-17932.
40. Charnock SJ, Bolam DN, Turkenburg JP, Gilbert HJ, Ferreira LM, Davies GJ, Fontes CM: The X6 ‘thermostabilizing’ domains of xylanases are carbohydrate-binding modules: structure and biochemistry of the Clostridium thermocellum X6b domain. Biochemistry 2000, 39:5013-5021.
41. • Notenboom V, Boraston AB, Kilburn DG, Rose DR: Crystal structures of the family 9 carbohydrate-binding module from Thermotoga maritima xylanase 10A in native and ligand-bound forms. Biochemistry 2001, 40:6248-6256.
The crystal structure of the C-terminal module of T.
maritima xylanase 10A is the first to be reported for family CBM-9. This work also reveals the first complex of a cellulose-binding CBM bound to cellobiose. The structure suggests that this CBM binds selectively to the reducing ends of cellulose.
42. Sunna A, Gibbs MD, Bergquist PL: Identification of novel β-mannan- and β-glucan-binding modules: evidence for a superfamily of carbohydrate-binding modules. Biochem J 2001, 356:791-798.
43. • Fujimoto Z, Kuno A, Kaneko S, Yoshida S, Kobayashi H, Kusakabe I, Mizuno H: Crystal structure of Streptomyces olivaceoviridis E-86 β-xylanase containing xylan-binding domain. J Mol Biol 2000, 300:575-585.
The crystal structure of S. olivaceoviridis xylanase A provides a first view of a multimodular enzyme from family GH-10 carrying a xylan-binding CBM from family 13. This structure offers a preview of the overall architecture of other CBM-13-containing modular enzymes, such as GTs from family GT-27.
44. van Aalten DM, Synstad B, Brurberg MB, Hough E, Riise BW, Eijsink VG, Wierenga RK: Structure of a two-domain chitotriosidase from Serratia marcescens at 1.9-Å resolution. Proc Natl Acad Sci USA 2000, 97:5842-5847.
45. Christopher JA: SPOCK: the Structural Properties Observation and Calculation Kit Program Manual. Texas: The Center for Macromolecular Design, Texas A&M University, College Station; 1998.
46. Merritt EA, Bacon DJ: Raster3D: photorealistic molecular graphics. Methods Enzymol 1997, 277:505-524.
47. Ha S, Walker D, Shi Y, Walker S: The 1.9 Å crystal structure of Escherichia coli MurG, a membrane-associated glycosyltransferase involved in peptidoglycan biosynthesis. Protein Sci 2000, 9:1045-1052.
48. Gastinel LN, Cambillau C, Bourne Y: Crystal structures of the bovine β4galactosyltransferase catalytic domain and its complex with uridine diphosphogalactose. EMBO J 1999, 18:3546-3557.
49.
Pascal JM, Day PJ, Monzingo AF, Ernst SR, Robertus JD, Iglesias R, Perez Y, Ferreras JM, Citores L, Girbes T: 2.8-Å crystal structure of a nontoxic type-II ribosome-inactivating protein, ebulin l. Proteins 2001, 43:319-326.
50. Rau A, Hogg T, Marquardt R, Hilgenfeld R: A new lysozyme fold. Crystal structure of the muramidase from Streptomyces coelicolor at 1.65 Å resolution. J Biol Chem 2001, 276:31994-31999.
51. Hogg D, Woo EJ, Bolam DN, McKie VA, Gilbert HJ, Pickersgill RW: Crystal structure of mannanase 26A from Pseudomonas cellulosa and analysis of residues involved in substrate binding. J Biol Chem 2001, 276:31186-31192.
52. Moréra S, Larivière L, Kurzeck J, Aschke-Sonnenborn U, Freemont PS, Janin J, Rüger W: High resolution crystal structures of T4 phage β-glucosyltransferase: induced fit and effect on substrate and metal binding. J Mol Biol 2001, 311:569-577.

Now in press
The work referred to in Table 1 as (G Michel, L Chantalat, O Dideberg, unpublished data) is now in press:
53. Michel G, Chantalat L, Fanchon E, Henrissat B, Kloareg B, Dideberg O: The iota-carrageenase of Alteromonas fortis: a β-helix fold-containing enzyme for the degradation of a highly polyanionic polysaccharide. J Biol Chem 2001, in press.