Biochemical and Biophysical Investigations of N-linked Glycosylation Pathways in Archaea

by

Michelle M. Chang

B.S. Chemical Biology
University of California, Berkeley, 2008

Submitted to the Department of Chemistry in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy at the Massachusetts Institute of Technology

December 2014 [February 2015]

© 2014 Massachusetts Institute of Technology. All rights reserved.

Signature of Author: Signature redacted
Department of Chemistry
December 11, 2014

Certified by: Signature redacted
Barbara Imperiali
Class of 1922 Professor of Chemistry and Biology
Thesis Supervisor

Accepted by: Signature redacted
Robert W. Field
Haslam and Dewey Professor of Chemistry
Chairman, Committee on Graduate Students

This doctoral thesis has been examined by a committee of the Department of Chemistry as follows:

Signature redacted
Alice Y. Ting
Ellen Swallow Richards Associate Professor of Chemistry
Thesis Chair

Signature redacted
Barbara Imperiali
Class of 1922 Professor of Chemistry and Biology
Thesis Supervisor

Signature redacted
Alexander M. Klibanov
Novartis Professor of Chemistry and Bioengineering

Biochemical and Biophysical Investigations of N-linked Glycosylation Pathways in Archaea

by

Michelle M. Chang

Submitted to the Department of Chemistry on December 11, 2014 in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

Abstract

Asparagine-linked glycosylation is an abundant and complex protein modification conserved among all three domains of life. Much is known about N-glycan assembly in eukaryotes and selected bacteria, in which the oligosaccharyltransferase (OTase) carries out the en bloc transfer of glycans from polyprenyl-PP-linked donors onto asparagine side chains of acceptor proteins.
The first aim of this thesis is to elucidate the biochemical details of archaeal N-linked glycosylation, specifically through in vitro analysis of the polyprenyl-P-dependent pathway of the methanogenic archaeon Methanococcus voltae. The archaeal OTase, known as AglB, utilizes an α-linked dolichyl-P-trisaccharide substrate as the glycosyl donor for transfer to the acceptor protein. This dolichyl-P-glycan is generated by an initial retaining glycosyltransferase (AglK) and elaborated by additional glycosyltransferases (AglC and AglA) to afford Dol-P-GlcNAc-Glc-2,3-diNAcA-ManNAc(6Thr)A. Despite the homology to other bacterial or eukaryotic OTases that exploit polyprenyl-PP-linked substrates, the M. voltae AglB efficiently transfers disaccharide to model peptides from the Dol-P-GlcNAc-Glc-2,3-diNAcA monophosphate. While this archaeal pathway affords the same asparagine-linked β-glycosyl amide products generated in bacteria and eukaryotes, these studies provide the first biochemical evidence revealing that despite the apparent similarities of the overall pathways, there are actually two general strategies to achieve N-linked glycoproteins across the domains of life. A second focus of this thesis involves biophysical studies to probe structural features and conformational dynamics of AglB. An intramolecular LRET experimental system was developed to report on substrate binding and the resulting structural transformations in AglB. There is a strong need for detailed studies on the mechanistic and functional significance of archaeal adaptations of N-linked glycosylation, especially exploring differences between AglB and other OTases that allow AglB to utilize these unique polyprenyl-P-linked substrates. Lastly, a cell-free expression system was established for the efficient synthesis of Alg5, a yeast dolichyl-phosphate glucosyltransferase that shares high sequence similarity to AglK, the first glycosyltransferase in the M. voltae pathway.
Dol-P-Glc was generated and examined to unambiguously characterize the stereochemistry of the product of Alg5.

Thesis Supervisor: Barbara Imperiali
Title: Class of 1922 Professor of Chemistry and Biology

Acknowledgements

I would like to start by thanking my advisor, Barbara Imperiali, for giving me the incredible opportunity to be a part of her lab and for her guidance, patience, and support throughout my graduate studies. Thank you for assembling a wonderful group of people who are always willing to collaborate and help one another become better scientists. Thank you also to my thesis committee members, Professors Alice Ting and Alex Klibanov, for their help and advice through the years.

I have had the pleasure of working with a fantastic group of labmates over the past six years. I am thankful for everyone who welcomed Mike and me into the lab in our first year. Even though I didn't overlap very long with them, Galen, Elvedin, Mark, and Nelson all helped make the lab a wonderful place in my first year. Angelyn, Meredith, Wendy, and Brenda were the four fourth-year students who helped me in countless ways as a new graduate student. I am especially thankful to Angelyn for taking the time to mentor me and teach me about the world of membrane protein preparations, OTase assays, HPLC maintenance, and so much more. I am also grateful to Meredith for her generous and sage advice and for teaching me about nanodiscs and the joy of volleyball. I am thankful for Wendy's help with my fluorescent peptide experiments. I worked on PglB with Marcie, and I am so thankful that I had her to discuss science with and for all of her help with LBTs. I appreciate being able to walk through grad school with Mike as a classmate and labmate, and I love that he hosted an annual Thanksgiving meal for our classmates.
I am happy to have shared most of my grad school duration with Vinita, who has been a great friend and someone to commiserate with in purifying our sugars and lipid-linked oligosaccharides. I am thankful to Stephanie for helping me with yeast expression and for deciphering my horrible drawings. I am delighted that Sonya joined our lab and enjoyed our time working on cell-free translation. Austin always spoke with great enthusiasm for science and never failed to ask about my day, and I am sorry that his time in our lives was cut short. I also need to acknowledge the many postdocs throughout my time, starting with Matthieu, Jay, and Cliff who were always generous with their advice. I am particularly thankful to Matthieu for teaching me about peptide synthesis. I would also like to thank postdocs Andrew, Elke, Laura, Marthe, Julie, Joris, Debasis, Silvano, and Cristina who have been brilliant scientists and great resources throughout the years. I am glad to see Monika continuing the LBT-PglB project, and I am excited to see how it compares to AglB. I am thankful for Garrett's help with Dol-P synthesis and NMR and for being my hockey and early lunch buddy. Joris has been my late lunch partner. James, Philipp, and Carsten have each been my podmate at some point, and I appreciate all of their vastly different personalities and value our various conversations about science and life. I would like to thank members of the Stubbe lab for allowing me to use their French press, as well as members of the Drennan lab for helping me with crystallography. I would also like to thank Traci, Daniel, Chiara, Seymour, and Kaspar for their time working in our lab. I have also enjoyed the time Will, Thais, Pi, Hui, and Natalie have spent in our lab as undergrads. And I would be remiss if I did not thank Debby for her work in the BIF and Elizabeth for all her work to keep the lab running. 
Lastly, I would never have made it this far if it weren't for the love and support of my family and friends. Thanks to my mom and dad who are always proud of me and concerned for my academic and personal well-being without ever pressuring me with their expectations. Thanks to my amazing sister, Liana, who is always there for me whenever I need to talk. Thanks to the MIT Graduate Christian Fellowship, Hope Fellowship Church, and Tang small group communities for fortifying my faith and for all the adventures we've had together. And thank you to my current and former roommates (Stephanie, Heather, Wei-Shan, Nan, Ange) who have made Tang a special place filled with laughter, shared experiences of failures and triumphs, rousing conversations about God and relationships, and jigsaw puzzles.

Table of Contents

Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
List of Abbreviations

Chapter 1: Introduction
    Introduction
    Eukaryotic N-linked glycosylation
    Bacterial N-linked glycosylation
    Archaeal N-linked glycosylation
    Thesis objectives
    References

Chapter 2: Identification and characterization of archaeal glycosyltransferases
    Introduction
    Results and Discussion
        AglH is not the first enzyme in the M. voltae N-linked pathway
        AglK is a specific Dol-P-GlcNAc synthase
        Dol-P-GlcNAc exhibits an α-glycosidic linkage
        AglK mutagenesis and inhibition studies
        Comparison of AglK with other glycosyltransferases
        Observation of Dol-P-glycan in M. maripaludis cells
        AglC is a UDP-Glc-2,3-diNAcA glycosyltransferase
        Identification of Dol-P-glycans in other archaeal species
    Conclusions
    Acknowledgements
    Experimental Methods
        Cloning of AglH, AglK and AglC
        Protein expression and purification
        Synthesis of (C55-60) (S)-dolichols
        Synthesis of (S)-dolichyl-phosphate
        Synthesis of (C55-60) Dolichyl-PP-GlcNAc and Dolichyl-PP-GalNAc
        Synthesis of UDP-[3H]Glc-2,3-diNAcA
        AglK and AglC glycosyltransferase assays
        AglK reaction product identification
        AglK DXD mutagenesis
        Purification of Dol-P-glycans from AglK and AglC reactions
        NMR and MS characterization of Dol-P-glycans
        Methanococcus maripaludis LLO preparation
        Pyrococcus furiosus LLO preparation
    References

Chapter 3: Characterization of an archaeal oligosaccharyl transferase
    Introduction
    Results and discussion
        Characterization of oligosaccharyl transferase activity
        AglB and its lipid-linked glycosyl donor substrate
        Investigation of the role of AglB His-597
        AglB crystallographic studies
    Conclusions
    Acknowledgements
    Experimental methods
        AglB expression and purification
        Oligosaccharyl transferase assay
        Purification of glycopeptides
        AglB H597 mutagenesis
        Purification of AglB for crystallography
        AglB peptide substrates
        AglB crystallography
    References

Chapter 4: Efforts towards establishing an experimental system for investigating AglB conformational dynamics by LRET
    Introduction
    Results and discussion
        Generation of LBT-AglB construct
        Generation of LBT-AglB cysteine mutants
        LRET with fluorophore-labeled LBT-AglB
        Luminescence studies with LBT-AglB and peptide substrate
        New peptide library for LRET experiments
        Luminescence studies with LBT-AglB and short acceptor peptides
    Conclusions
    Acknowledgements
    Experimental methods
        LBT-AglB construct cloning, expression, and purification
        LBT-AglB and LBT-Ub luminescence experiments
        LBT-AglB Cys mutagenesis
        LBT-AglB fluorophore labeling
        AglB peptide substrate synthesis
        Km determination of AglB peptide substrates
    References

Chapter 5: Characterization of Alg5 activity and product stereochemistry
    Introduction
    Results and discussion
        Alg5 expression in E. coli
        Alg5 expression in S. cerevisiae
        Cell-free translation of Alg5
    Conclusions
    Acknowledgements
    Experimental methods
        Alg5 expression in pGEX-4T3, pET-47, and pE-SUMO vectors
        Alg5 expression in pBAD vector
        Alg5 expression from pET(Alg5) plasmid
        Alg5 expression in S. cerevisiae
        Alg5 activity assay
        Cell-free expression of Alg5
        Dol-P-Glc synthesis and analysis
    References

List of Figures

Chapter 1
Figure 1-1: Schematic overview of the five different types of protein glycosylation.
Figure 1-2: N-linked glycosylation pathway in S. cerevisiae.
Figure 1-3: N-linked glycosylation pathway in C. jejuni.
Figure 1-4: N-glycan structures from Halobacterium salinarum, Thermoplasma acidophilum and Pyrococcus furiosus.
Figure 1-5: N-linked pathway in H. volcanii in high and low salinity.
Figure 1-6: N-linked glycosylation pathway in M. voltae.
Figure 1-7: N-linked glycosylation pathway in M. maripaludis.
Figure 1-8: N-glycosylation pathway in S. acidocaldarius.

Chapter 2
Figure 2-1: Representative structures of the two main GT folds.
Figure 2-2: Catalytic mechanisms in GTs.
Figure 2-3: N-linked glycosylation in Methanococcus voltae.
Figure 2-4: General assay for PGT and GT activity.
Figure 2-5: Topology prediction for M. voltae proteins AglK and AglC.
Figure 2-6: SDS-PAGE and Western blot analysis of purified AglK and AglC proteins.
Figure 2-7: AglK activity assay.
Figure 2-8: Normal phase HPLC purification of Dol-P-GlcNAc.
Figure 2-9: ESI-MS of purified Dol-P-GlcNAc.
Figure 2-10: Capillary electrophoresis of AglK reaction products.
Figure 2-11: Alignment of AglK from M. voltae with S. cerevisiae Alg5.
Figure 2-12: 1H-NMR spectrum of Dol-P-GlcNAc.
Figure 2-13: Activity assay for AglK D105A and D107A mutants.
Figure 2-14: AglK inhibition observed with UDP-(5F)-GlcNAc.
Figure 2-15: Alignment of AglK from M. voltae with AglJ from H. volcanii or MMP1170 from M. maripaludis.
Figure 2-16: N-linked glycan structures from M. voltae and M. maripaludis.
Figure 2-17: Mass spectrometry analysis of M. maripaludis total lipid extraction.
Figure 2-18: Biosynthetic pathway of UDP-ManNAc(3NAc)A.
Figure 2-19: AglC donor and acceptor substrate specificity.
Figure 2-20: Normal phase HPLC purification of Dol-P-Glc-2,3-diNAcA.
Figure 2-21: ESI-MS of Dol-P-GlcNAc-Glc-2,3-diNAcA.
Figure 2-22: Reverse phase LC/MS analysis of P. furiosus whole cell extractions confirms the presence of Dol-P-heptasaccharide and other Dol-P-glycan intermediates.
Figure 2-23: M. voltae N-linked glycosylation pathway enzymes.

Chapter 3
Figure 3-1: N-linked glycosylation pathway in Saccharomyces cerevisiae.
Figure 3-2: Subunits comprising the S. cerevisiae OT complex.
Figure 3-3: N-linked glycosylation pathway in Campylobacter jejuni.
Figure 3-4: Sequence alignment of 16 archaeal Stt3 homologs, depicting the region surrounding the critical WWDXGX motif.
Figure 3-5: Comparison of the archaeal OTases expression by Western blot.
Figure 3-6: Topology prediction for M. voltae AglB.
Figure 3-7: Purified AglB analyzed by SDS-PAGE and Western blots.
Figure 3-8: Standard assay for OTase activity.
Figure 3-9: AglB activity assay using the peptide Ac(YKYNESSYKpNF)NH2 as the acceptor substrate and Dol-P-GlcNAc or Dol-P-disaccharide as the donor substrates.
Figure 3-10: Dependence of AglB activity on divalent metal cations.
Figure 3-11: Reverse-phase HPLC purification of peptide and glycopeptide product.
Figure 3-12: ESI-MS of the glycopeptide produced by AglB.
Figure 3-13: Sequence alignment of the WWDXGX motif of representative OTases.
Figure 3-14: Activity of AglB His-597 mutants with Dol-P-GlcNAc-[3H]Glc-2,3-diNAcA and Ac(YKYNESSYKpNF)NH2 as the donor and acceptor substrates.
Figure 3-15: AglB WT, H597A, H597F, H597N, H597Y expression in membrane fractions.
Figure 3-16: AglB wild type and AglB H597Y mutant activity with Dol-P-GlcNAc-Glc-2,3-diNAcA and Und-PP-Bac-GalNAc.
Figure 3-17: Gel filtration chromatography of AglB with Triton X-100 and LMNG.
Figure 3-18: Structure of LMNG amphiphile.
Figure 3-19: AglB acceptor peptide library screen.
Figure 3-20: Crystallization of AglB in the presence of peptide.
Figure 3-21: M. voltae AglB catalyzes the oligosaccharyl transfer from a Dol-P-glycan onto the Asn residue within the NXS/T sequon of acceptor proteins.

Chapter 4
Figure 4-1: N-linked glycosylation across the three domains of life.
Figure 4-2: Jablonski diagram of a ligand-lanthanide complex.
Figure 4-3: Lanthanide binding tag.
Figure 4-4: Intramolecular LRET of AglB.
Figure 4-5: LBT-AglB purification and luminescence of LBT-AglB upon Tb3+ titration.
Figure 4-6: Membrane topology prediction plot for AglB, illustration of Cys to Ser mutants, and structure of BODIPY-TMR maleimide.
Figure 4-7: LBT-AglB-C35S/C590S purification and OTase assay.
Figure 4-8: LBT-AglB-C35S/C590S/P356C and LBT-AglB-C35S/C590S/T367C purification and labeling with BODIPY-TMR fluorophore.
Figure 4-9: Emission spectrum of LBT-AglB-T367C-BODIPY-TMR with or without gating.
Figure 4-10: Time-gated emission spectra of BODIPY-TMR labeled P356C upon Tb3+ titration and luminescence decay measured at 543 nm.
Figure 4-11: Time-gated emission spectra of BODIPY-TMR labeled T367C upon Tb3+ titration and luminescence decay measured at 543 nm.
Figure 4-12: Emission spectra and luminescence decay of LBT-AglB-T367C-BODIPY-TMR and Tb3+ ion exhibit quenching upon addition of Ac(YKYNFTSYKRR)NH2 peptide.
Figure 4-13: Loss of luminescence signal with addition of peptides Ac(YFNFTGRR)NH2 and Ac(YKYNFTSYKRR)NH2 to LBT-Ub.
Figure 4-14: AglB acceptor peptide library screen with peptides having N-terminal acetylation and/or free N-terminal amines.
Figure 4-15: Emission spectra of LBT-Ub with Tb3+ and addition of peptides Ac(TFNFTS)NH2 and Bz(TFNETS)NH2.

Chapter 5
Figure 5-1: Dolichol pathway of N-linked glycosylation in Saccharomyces cerevisiae.
Figure 5-2: Alignment of AglK from M. voltae with S. cerevisiae Alg5.
Figure 5-3: Topology prediction for S. cerevisiae Alg5.
Figure 5-4: GST-Alg5 and His6-Alg5 expression detected by SDS-PAGE and Western blot. Activity assay of GST-Alg5, His6-Alg5 constructs.
Figure 5-5: Expression of His6-SUMO-Alg5 in C41 (DE3), C43 (DE3), BL21 (DE3) RIL, Rosetta2 (DE3), and Lemo21 (DE3) cell lines.
Figure 5-6: Alg5-His10 expression levels detected at various times after induction with variable concentrations of L-arabinose.
Figure 5-7: Activity of cell lysates Alg5-His10 expressed from pBAD vector in BL21 (DE3) RIL cells at various times after induction with 0.2% L-arabinose.
Figure 5-8: TRX-His6-Alg5 expression is detected by Western blot analysis.
Figure 5-9: Alg5 expression as CEF from pET(Alg5) under different conditions. Activity assay of Alg5 CEFs.
Figure 5-10: HPLC fractions of the organic extraction of the Alg5 reaction products.
Figure 5-11: Alg5 yeast expression in yeast strains INVSc1, W303a, or W303c.
Figure 5-12: Glucosyltransferase activity assay with membrane fractions of Alg5 expression in INVSc1, W303a, and W303c.
Figure 5-13: Western blot showing cell-free expression of Alg5-NT and Alg5-CT.
Figure 5-14: Western blot of Alg5-NT and AglC expression with additives, or Alg5-NT and Alg5-CT solubilization after expression.
Figure 5-15: Glucosyltransferase activity assay with Alg5-NT and Alg5-CT.
Figure 5-16: 1H-NMR and 31P decoupled 1H-NMR spectrum of Dol-P-Glc.

List of Tables

Chapter 1
Table 1-1: Features of archaeal N-linked glycosylation in select archaeal species.

Chapter 2
Table 2-1: 1H- and 13C-NMR assignments of Dol-P-GlcNAc structure.

Chapter 4
Table 4-1: Amino acid preference at each position of X0-X1-X2-X3-X4-X5 peptide.
Table 4-2: Apparent Km values of best AglB peptide substrates.

List of Abbreviations

Standard 3-letter and 1-letter codes are used for the 20 natural amino acids. Standard 1-letter codes are used for the 4 common DNA bases.
AcCoA               acetyl-coenzyme A
ATP                 adenosine 5'-triphosphate
β-OG                n-octyl β-D-glucoside
BODIPY              4,4-difluoro-4-bora-3a,4a-diaza-s-indacene
CAM                 ceric ammonium molybdate
CE                  capillary electrophoresis
CEF                 cell envelope fraction
CHAPSO              3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate
DDM                 n-dodecyl-β-D-maltoside
DIPEA               diisopropylethylamine
DMF                 dimethylformamide
DMSO                dimethylsulfoxide
Dol-P               dolichyl-phosphate
Dol-PP              dolichyl-diphosphate
DMPC                1,2-dimyristoyl-sn-glycero-3-phosphocholine
DTT                 dithiothreitol
ε                   extinction coefficient for molar absorptivity
EDTA                ethylenediaminetetraacetic acid
ER                  endoplasmic reticulum
ESI-MS              electrospray ionization-mass spectrometry
Fmoc                9-fluorenylmethoxycarbonyl
Fos-12              n-dodecyl phosphocholine
FRET                fluorescence resonance energy transfer
GDP-Man             guanosine diphosphate mannose
GST                 glutathione S-transferase
GT                  glycosyltransferase
IPTG                isopropyl β-D-1-thiogalactopyranoside
HEPES               4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid
Km                  Michaelis constant
Kd                  dissociation constant
LB                  Luria-Bertani broth
LBT                 lanthanide binding tag
LDAO                lauryl dimethylamine oxide
LLO                 lipid-linked oligosaccharide
LMNG                lauryl maltose neopentyl glycol
LRET                luminescence resonance energy transfer
NDP                 nucleotide diphosphate
Ni-NTA              nickel nitrilotriacetic acid
NMR                 nuclear magnetic resonance
NP-HPLC             normal phase high performance liquid chromatography
OTase               oligosaccharyltransferase
PAL                 5-(4'-aminomethyl-3',5'-dimethoxyphenoxy)valeric acid
PDB                 Protein Data Bank
PEG-PS              polyethylene glycol-grafted polystyrene
PGT                 phosphoglycosyltransferase
pNF                 4-nitrophenylalanine
POPC                1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine
Pren-P              polyprenyl-phosphate
Pren-PP             polyprenyl-diphosphate
PSUP                pure solvent upper phase
PyBOP               benzotriazol-1-yl-oxytripyrrolidinophosphonium hexafluorophosphate
R0                  Förster distance
RP-HPLC             reverse phase high performance liquid chromatography
SDS-PAGE            sodium dodecyl sulfate polyacrylamide gel electrophoresis
τ                   lifetime
TFA                 trifluoroacetic acid
TIS                 triisopropylsilane
TLC                 thin layer chromatography
TMHMM               Tied Mixture Hidden Markov Model for transmembrane prediction
Triton X-100        t-octylphenoxypolyethoxyethanol
TRX                 thioredoxin
TUPS                theoretical upper phase with salt
UDP                 uridine 5'-diphosphate
UDP-Bac             UDP-2,4-diacetamido-2,4,6-trideoxy-α-D-glucose
UDP-Glc             UDP-glucose
UDP-GlcNAc          UDP-N-acetyl-D-glucosamine
UDP-Gal             UDP-galactose
UDP-GalNAc          UDP-N-acetyl-D-galactosamine
UDP-Glc-2,3-diNAcA  UDP-2,3-diacetamido-2,3-dideoxy-D-glucuronic acid
UMP                 uridine 5'-monophosphate
Und-P               undecaprenyl-phosphate
Und-PP              undecaprenyl-diphosphate
Xaa

Chapter 1: Introduction

Introduction

Post-translational modifications of proteins help to expand the proteome far beyond what is directly encoded by the genome.1 Glycosylation is one of the predominant and most complex protein modifications found in nature, occurring in all three domains of life: Eukarya, Bacteria, and Archaea.2 The enormous diversity of sugars, sugar linkages, sugar-protein linkages, and glycan size leads to a seemingly endless number of possible protein-linked glycan structures. Glycosylation can profoundly impact protein function, playing important roles in biological processes such as protein folding and stability, intracellular targeting, cellular signaling and adhesion, and the immune response.3

Five major types of protein glycosylation have been observed in nature and are classified based on the manner in which the glycan is linked to the modified protein (Figure 1-1). N-linked glycosylation involves the en bloc transfer of an oligosaccharide from a lipid carrier onto the side chain amide nitrogen of asparagine residues within a conserved sequon of acceptor proteins.4-6 O-linked glycosylation involves the transfer of monosaccharide units onto the side chain hydroxyl oxygen atom of either serine or threonine residues and, to a lesser extent, other residues such as tyrosine, hydroxylysine and hydroxyproline.
Glycosylphosphatidylinositol (GPI) anchoring involves the transfer of a GPI moiety to the C-terminus of a target protein, tethering it to the cell membrane.8 Phosphoglycosylation involves the addition of sugar 1-phosphates to the hydroxyl oxygen atom of serine or threonine residues.9 C-mannosylation involves the transfer of mannose to the indole C2 carbon atom of tryptophan residues, resulting in a C-C linkage.10,11 N-linked, O-linked, and GPI-anchored glycosylation have been well studied biochemically, while less is known about phosphoglycosylation and C-mannosylation.

Figure 1-1: Schematic overview of the five different types of protein glycosylation. Taken from Jarrell et al.5

Eukaryotic N-linked glycosylation

N-linked glycosylation is an essential part of the protein maturation process in eukaryotes and is estimated to occur in more than half of all eukaryotic proteins.12 Genetic and biochemical characterization of this process has provided the best understanding of this pathway in Saccharomyces cerevisiae (Figure 1-2A), but N-glycosylation is remarkably conserved in all eukaryotes, from yeast to man.13 The first phase of N-glycosylation occurs on the cytoplasmic face of the endoplasmic reticulum (ER) membrane. The assembly of the lipid-linked oligosaccharide (LLO) is initiated by a phosphoglycosyltransferase, followed by a series of membrane-bound glycosyltransferases in the asparagine-linked glycosylation (Alg) family that catalyze the transfer of each monosaccharide unit from nucleotide sugar donors onto a dolichyl-diphosphate (Dol-PP) carrier. Dolichol is a long chain, α-saturated linear polyisoprene that typically varies in size between 14 and 21 units depending on the cell type and species.14 After the completion of the branched heptasaccharide, the Dol-PP-GlcNAc2Man5 intermediate is flipped across the membrane to face the ER lumen in an ATP-independent manner. Yeast genetic analysis previously implicated Rft1 as the purported flippase, but recent studies show that Rft1-null cells do not stall at the flippase step, and Rft1 may instead act as a chaperone.15,16 The flipped LLO is further elaborated by the transfer of mannose and glucose residues from Dol-P-mannose and Dol-P-glucose donors, respectively. The fully assembled branched tetradecasaccharide, Glc3Man9GlcNAc2, is transferred by the oligosaccharyltransferase (OT) onto asparagine residues within the N-X-S/T sequon, where X can be any amino acid except proline (Figure 1-2B).

Figure 1-2: (A) N-linked glycosylation pathway in S. cerevisiae. (B) Structure of the eukaryotic N-linked glycan, Glc3Man9GlcNAc2.

Newly formed glycoproteins are processed by glycosidases and glycosyltransferases in the ER and Golgi apparatus before being secreted or distributed to various cellular locations. This downstream modification results in the vast diversity of N-linked glycans observed in eukaryotes.17
Although the Glc3Man9GlcNAc2 tetradecasaccharide core is widely conserved in eukaryotes, several species of protists have been found to assemble only truncated forms of the glycan.18 N-glycans play an important role in the quality control of protein folding in the ER, serve as identity and localization tags to direct proteins to the proper cellular destination, help regulate the mobility of cell membrane proteins through interactions with lattice-forming lectins, impart structural rigidity, and may provide protection against proteolysis.19

Bacterial N-linked glycosylation

The first bacterial N-linked glycosylation system was discovered in Campylobacter jejuni, a gram-negative human gut mucosal pathogen.20,21 Glycan assembly occurs on the cytoplasmic face of the periplasmic membrane by the protein glycosylation (Pgl) enzymes (Figure 1-3A). This pathway resembles the first half of the dolichol pathway in S. cerevisiae, but with several differences. A phosphoglycosyltransferase initiates the LLO assembly, and then a series of glycosyltransferases add monosaccharide units from nucleotide sugar donors onto an undecaprenyl-diphosphate (Und-PP) carrier rather than onto Dol-PP. Undecaprenol is a long chain, unsaturated polyisoprene comprising 11 isoprene units.22 Instead of the GlcNAc residue found in eukaryotic N-glycans, the first sugar in the C. jejuni glycan is N,N'-diacetylbacillosamine (Bac). After the completion of the heptasaccharide, the Und-PP-BacGalNAc5Glc intermediate is flipped across the membrane to the periplasmic face by an ABC-type transporter (PglK). GlcGalNAc5Bac is transferred by the oligosaccharyltransferase (PglB) onto asparagine residues within the extended sequon, D/E-X1-N-X2-S/T, in which an acidic residue occupies the -2 position (Figure 1-3B).
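The two sequon definitions described so far, N-X-S/T in eukaryotes and the extended D/E-X1-N-X2-S/T in C. jejuni, can be illustrated with a small pattern-matching sketch. The Python below is only an illustration: the function name `find_sequons` and both regular expressions are hypothetical and written for this example, and excluding proline at X1 and X2 of the bacterial sequon is an assumption, since the text states the proline rule only for N-X-S/T. A sequence match marks a candidate site and says nothing about whether the asparagine is actually glycosylated.

```python
import re

# N-X-S/T, where X is any residue except proline (as stated in the text).
# The lookahead lets overlapping sequons such as "NNTS" both be reported.
EUKARYOTIC_SEQUON = re.compile(r"(?=((N)[^P][ST]))")

# Extended sequon D/E-X1-N-X2-S/T with an acidic residue at the -2 position.
# ASSUMPTION: proline is also excluded at X1 and X2 in this sketch.
BACTERIAL_SEQUON = re.compile(r"(?=([DE][^P](N)[^P][ST]))")


def find_sequons(sequence, pattern):
    """Return (1-based position of the acceptor Asn, matched sequon) pairs.

    `sequence` is a protein or peptide in one-letter amino acid code.
    """
    hits = []
    for match in pattern.finditer(sequence.upper()):
        # Group 1 is the whole sequon; group 2 is the acceptor asparagine.
        hits.append((match.start(2) + 1, match.group(1)))
    return hits


if __name__ == "__main__":
    print(find_sequons("YKYNFTSYKRR", EUKARYOTIC_SEQUON))  # [(4, 'NFT')]
    print(find_sequons("GDQNATG", BACTERIAL_SEQUON))       # [(4, 'DQNAT')]
```

For example, the acceptor peptide Ac(YKYNFTSYKRR)NH2 named in the list of figures contains a single N-X-S/T sequon, NFT, with the acceptor asparagine at position 4, whereas a peptide containing N-P-T would not be reported.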
Figure 1-3: (A) N-linked glycosylation pathway in C. jejuni. (B) Structure of the N-linked glycan, GlcGalNAc5Bac.

Over one hundred glycoproteins have been identified in C. jejuni and are associated with a wide range of cellular functions.24 Mutations of the pgl locus resulted in impaired host cell adhesion and colonization, establishing a link between N-glycosylation and pathogenicity.20,25,26 Other important roles for N-linked glycosylation may include modulation of protein-protein interactions, enhancement of protein stability, and protection of C. jejuni from osmolytic stress.27,28

Archaeal N-linked glycosylation

In 1977, recognition of major differences in the structure and genetics of prokaryotes led to the classification of archaea and bacteria into two distinct domains.29 Archaea share common features with both bacteria and eukaryotes while also possessing several unique characteristics. Like bacteria, archaea are single-celled organisms with circular chromosomes and no nucleus or membrane-bound organelles, but they also possess several genes and metabolic pathways that are more closely related to those of eukaryotes, especially the enzymes involved in transcription and translation. Archaea contain ether-linked phospholipids in their cell membranes instead of the ester-linked phospholipids found in bacteria and eukaryotes. In addition, archaea are able to utilize a wide range of energy sources and frequently inhabit extreme environments.30 N-linked glycosylation is predicted to occur in nearly all of the archaeal species with sequenced genomes.31 While many archaeal N-glycans have been identified in S-layer, archaellin, and other cell-surface glycoproteins, few N-glycan structures have been fully characterized and even fewer biosynthetic pathways are understood. The structures that have been characterized over the years demonstrate the incredible variety of archaeal N-glycans (Figure 1-4).32-35 Archaeal N-linked glycan structures are highly diverse in terms of size, the degree of branching, the identity of the linking sugar, the presence of unique sugars, and further modification of sugar moieties by methylation, sulfation, and even addition of pendant amino acids. Such diversity is far beyond that found in bacteria and eukaryotes. However, the chemical structure of the glycan and the nature of the glycan linkages to protein have often remained undetermined and the pathways for N-glycan assembly unclear. The recent availability of complete genome sequences for nearly 200 archaeal species and the development of appropriate molecular tools for genetic manipulation have facilitated the detailed delineation of several of these pathways.31,36 Since the first discovery of archaeal N-linked glycoproteins in Halobacterium salinarium,37 the study of archaeal glycosylation (Agl) pathways has been largely limited to the euryarchaeal Haloferax volcanii, Methanococcus voltae, and Methanococcus maripaludis, and the crenarchaeal Sulfolobus acidocaldarius.

Figure 1-4: [...] high-mannose-type N-linked glycan of Thermoplasma acidophilum. (D) Structure of the N-linked heptasaccharide in Pyrococcus furiosus.

There are two pathways of N-glycosylation in the extreme halophile Haloferax volcanii.
Under normal high salt conditions, the process begins with the sequential addition of the first four sugar subunits of the pentasaccharide from nucleotide-activated donors to a Dol-P carrier at the cytoplasmic leaflet of the cytoplasmic membrane (Figure 1-5A).38-44 In contrast to the eukaryotic (C70-Cl o) Dol-PP lipid carriers, (C55-C60) Dol-P serves this role in H. volcanii. The Dol-P-tetrasaccharide intermediate and a separate Dol-P-mannose are flipped across the membrane, and the tetrasaccharide is transferred by the oligosaccharyltransferase (AglB) to the asparagine residue within the N-X-S/T sequon of the target S-layer glycoprotein or archaellins. The final mannose subunit is then delivered from its flipped Dol-P carrier onto the glycoprotein-linked tetrasaccharide, resulting in the final glycoprotein N-linked to the pentasaccharide. At higher salinities, both Dol-P charged with the first four pentasaccharide sugars and Dol-P charged with the final mannose sugar have been observed by mass spectrometry analysis.

Figure 1-5: (A) N-linked pathway in H. volcanii cells grown in 3.4 M NaCl. (B) A second N-linked pathway in H. volcanii cells grown in 1.75 M NaCl. The exact structures of the N-linked glycans on S-layer glycoproteins or archaellins have not yet been determined.

When H. volcanii cells are grown at lower salinity, S-layer glycoproteins display a distinct tetrasaccharide, comprising a sulfated hexose, two unmodified hexoses, and a rhamnose. Dol-P charged with this tetrasaccharide has also been observed by mass spectrometry.43 The current working model in this low-salt environment implicates an entirely different set of enzymes for glycan assembly (Figure 1-5B). A currently unidentified oligosaccharyltransferase, not AglB, is believed to be responsible for the transfer of this glycan to a different asparagine residue on the S-layer glycoprotein. This second pathway appears to be an adaptive response and is recruited when the first pathway is compromised.45

Chapters 2 and 3 of this thesis detail the biochemical characterization of the N-linked glycosylation pathway of Methanococcus voltae, a marine methanogenic mesophile. This pathway is initiated at the cytoplasmic face of the cell membrane by a series of glycosyltransferases that add monosaccharide units from nucleotide-activated sugar donors onto a Dol-P carrier (Figure 1-6A).46-49 The Dol-P-trisaccharide is then flipped across the membrane to the exterior surface of the cell, where AglB transfers the trisaccharide onto the asparagine residue within the N-X-S/T sequon of target proteins. In our studies, AglB efficiently transfers a disaccharide from a Dol-P carrier to acceptor peptides, but the glycan that is natively N-linked to archaellins and S-layer proteins is a trisaccharide, ManNAc(6Thr)A-Glc-2,3-diNAcA-GlcNAc (Figure 1-6B).

Figure 1-6: (A) N-linked glycosylation pathway in M. voltae. (B) Structure of the N-linked archaellar glycan, ManNAc(6Thr)A-Glc-2,3-diNAcA-GlcNAc.

The N-linked glycosylation pathway in another methanogen, M. maripaludis, shares similarities with that of the closely related M. voltae (Figure 1-7A).50 Chapter 2 of this thesis posits that the first glycosyltransfer step of the M. maripaludis pathway is initiated by MMP1170, an enzyme that has yet to be characterized but exhibits extremely high homology to AglK, the first glycosyltransferase of the M. voltae pathway.49 Three other glycosyltransferases add monosaccharide units from nucleotide-activated sugar donors onto a Dol-P carrier. The Dol-P-tetrasaccharide is then flipped from the cytoplasmic face of the cell membrane to the exterior, where AglB transfers the tetrasaccharide onto the asparagine residue within the N-X-S/T sequon of acceptor proteins. Strong evidence for a Dol-P carrier comes from our mass spectrometry analysis of M. maripaludis cells, in which we observed the presence of a Dol-P-trisaccharide intermediate bearing the first three sugars of the N-linked glycan identified in M. maripaludis archaellar and pilin proteins, Sug-ManNAc(3NAm6Thr)A-Glc-2,3-diNAcA-GalNAc, where the terminal Sug is 2-acetamido-2,4-dideoxy-5-O-methylhexosulo-1,5-pyranose (Figure 1-7B).

Figure 1-7: (A) N-linked glycosylation pathway in M. maripaludis. (B) Structure of the N-linked archaellar glycan, 2NAc(4deoxy)hexulose-ManNAc(3NAm6Thr)A-Glc-2,3-diNAcA-GalNAc.

Crenarchaeal N-linked glycosylation has been best studied in the thermophilic Sulfolobus acidocaldarius, but the pathway enzymes are only partially identified (Figure 1-8A).52-54 The actions of a phosphoglycosyltransferase and subsequent glycosyltransferases are thought to build up a tribranched glycan on a Dol-PP carrier, which is flipped from the cytoplasmic to the exterior surface of the cell membrane, before transfer of the glycan onto the asparagine residue of target proteins.
S-layer, archaellin, and cytochrome b558/566 proteins are N-glycosylated with a heterogeneous family of glycans, with the largest comprising GlcNAc2GlcMan2 and an unusual sulfated sugar called sulfoquinovose (Figure 1-8B). The dolichols observed in S. acidocaldarius are unusually short and display a higher degree of saturation of internal isoprene units than eukaryotic or other archaeal dolichols. While genetic complementation data suggest that Dol-PP is the lipid carrier, only Dol-P and Dol-P charged with different hexoses have ever been detected by mass spectrometric analysis of lipid extracts.53 No Dol-P or Dol-PP bearing complex glycans has ever been observed. In addition, previous misclassification of AglH as the first enzyme of the M. voltae pathway illustrates that genetic complementation studies alone are insufficient for characterizing enzyme activity.47,49 This ambiguity suggests that the S. acidocaldarius pathway could in fact be more like the euryarchaeal pathways, with Dol-P being the lipid carrier instead of Dol-PP. Interestingly, S. acidocaldarius is the only archaeal organism studied thus far in which N-linked glycosylation appears to be essential.54

Figure 1-8: (A) Current understanding of the N-glycosylation pathway in S. acidocaldarius. (B) Structure of the tribranched hexasaccharide N-glycan containing the 6-sulfoquinovose.

Progress has been made in recent years to identify highly unusual archaeal N-glycans and the enzymes involved in their biosynthesis and transfer. Much of this progress has come through deletion of genes potentially involved in a pathway and analysis of the downstream effects on protein glycosylation through mass spectrometry.
Unfortunately, there is a dearth of biochemical validation of proposed pathways, and in vitro enzymatic assays are difficult when the identities of the substrates are often unknown. Even when the glycan identity is known, questions remain about the lipid-linked carrier (Dol-P vs. Dol-PP) and the mechanism of transfer to acceptor proteins. Archaeal oligosaccharyltransferases (designated as AglBs) are easily identifiable in archaeal genomes, and many archaeal species encode multiple AglBs. The physiological significance of this observation remains unclear, as multiple AglBs could reflect differences in substrate or target preference, possibly as a function of local growth conditions. A few AglB crystal structures have been published, but there are still major gaps in our knowledge of the AglB mechanisms (Table 1-1). All known oligosaccharyltransferases contain a WWDX1GX2 motif, and sequence variability in this motif may reflect different activities of the various versions of the protein in the same species.55 It would be highly noteworthy if a link could be established between the type of AglB and the lipid-linked oligosaccharide substrates it uses.

Table 1-1: Features of archaeal N-linked glycosylation in select archaeal species, highlighting the WWDX1GX2 motif of AglB. Verified oligosaccharyltransferases are marked with (*).

Species            AglB        WWDX1GX2
H. salinarium      OE2548F     WWDYGH
H. volcanii        HVO_1530    WWDYGH
M. voltae          Mvol_1038*  WWDNGH
M. maripaludis     MMP1424*    WWDNGH
S. acidocaldarius  Saci1274    WWDYGY
P. furiosus        PF0156*     WWDYGY
                   PF0411      WWDWGH
P. horikoshii      PH0242      WWDYGY
                   PH1271      WWDWGH
A. fulgidus        AF0380*     WWDYGH
                   AF0329      WWDYGN
                   AF0040      WWDYGN

Observed LLO (in the order listed): Dol-P-HexHexA; Dol-PP-tetrasaccharide (ref. 56); Dol-P-tetrasaccharide (ref. 43); Dol-P-trisaccharide (ref. 49); Dol-P-Hex; Dol-P-heptasaccharide; (Chapter 2)
Crystal structure: PF0156 C-term (ref. 57); PH0242 C-term (ref. 58); AF0380 C-term, full; AF0329 C-term (ref. 58); AF0040 C-term (ref. 58)

Thesis objectives

The overall goal of this thesis is to gain a deeper understanding of the process of N-linked glycosylation in archaea. Chapter 2 details the identification and characterization of the activities of AglK and AglC, the first two glycosyltransferases in the M. voltae N-glycosylation pathway. Chapter 3 discusses the biochemical characterization of AglB activity using purified substrates, which represents the first biochemical demonstration of archaeal oligosaccharyltransferase activity. Chapter 4 describes our progress towards developing a new system to investigate the conformational dynamics of AglB upon substrate binding using a lanthanide-based luminescence resonance energy transfer technique. Finally, Chapter 5 describes our efforts to express Alg5, a dolichyl-phosphate glucosyltransferase, and the stereochemical analysis of the product.

References

1. Walsh, C. T., Garneau-Tsodikova, S. & Gatto, G. J. Protein Posttranslational Modifications: The Chemistry of Proteome Diversifications. Angew. Chem. Int. Ed. 44, 7342-7372 (2005).
2. Varki, A. et al. Essentials of Glycobiology. (Cold Spring Harbor Laboratory Press, 2009). at http://www.ncbi.nlm.nih.gov/books/NBK1908/
3. Varki, A. Biological Roles of Oligosaccharides - All of the Theories Are Correct. Glycobiology 3, 97-130 (1993).
4. Burda, P. & Aebi, M. The dolichol pathway of N-linked glycosylation. Biochim. Biophys. Acta BBA - Gen. Subj. 1426, 239-257 (1999).
5. Jarrell, K. F. et al. N-Linked Glycosylation in Archaea: a Structural, Functional, and Genetic Analysis. Microbiol. Mol. Biol. Rev. 78, 304-341 (2014).
6. Weerapana, E. & Imperiali, B. Asparagine-linked protein glycosylation: from eukaryotic to prokaryotic systems. Glycobiology 16, 91R-101R (2006).
7. Steen, P. V. den, Rudd, P. M., Dwek, R. A. & Opdenakker, G. Concepts and Principles of O-Linked Glycosylation. Crit. Rev. Biochem. Mol. Biol. 33, 151-208 (1998).
8. Paulick, M. G. & Bertozzi, C. R. The Glycosylphosphatidylinositol Anchor: A Complex Membrane-Anchoring Structure for Proteins. Biochemistry 47, 6991-7000 (2008).
9. Haynes, P. A. Phosphoglycosylation: A new structural class of glycosylation? Glycobiology 8, 1-5 (1998).
10. Doucey, M.-A., Hess, D., Cacan, R. & Hofsteenge, J. Protein C-Mannosylation Is Enzyme-catalysed and Uses Dolichyl-Phosphate-Mannose as a Precursor. Mol. Biol. Cell 9, 291-300 (1998).
11. Furmanek, A. & Hofsteenge, J. Protein C-mannosylation: facts and questions. Acta Biochim. Pol. 47, 781-789 (2000).
12. Apweiler, R., Hermjakob, H. & Sharon, N. On the frequency of protein glycosylation, as deduced from analysis of the SWISS-PROT database. Biochim. Biophys. Acta BBA - Gen. Subj. 1473, 4-8 (1999).
13. Lehle, L., Strahl, S. & Tanner, W. Protein Glycosylation, Conserved from Yeast to Man: A Model Organism Helps Elucidate Congenital Human Diseases. Angew. Chem. Int. Ed. 45, 6802-6818 (2006).
14. Jones, M. B., Rosenberg, J. N., Betenbaugh, M. J. & Krag, S. S. Structure and synthesis of polyisoprenoids used in N-glycosylation across the three domains of life. Biochim. Biophys. Acta BBA - Gen. Subj. 1790, 485-494 (2009).
15. Helenius, J. et al. Translocation of lipid-linked oligosaccharides across the ER membrane requires Rft1 protein. Nature 415, 447-450 (2002).
16. Jelk, J. et al. Glycoprotein Biosynthesis in a Eukaryote Lacking the Membrane Protein Rft1. J. Biol. Chem. 288, 20616-20623 (2013).
17. Helenius, A. & Aebi, M. Roles of N-Linked Glycans in the Endoplasmic Reticulum. Annu. Rev. Biochem. 73, 1019-1049 (2004).
18. Guha-Niyogi, A., Sullivan, D. R. & Turco, S. J. Glycoconjugate structures of parasitic protozoa. Glycobiology 11, 45R-59R (2001).
19. Larkin, A. & Imperiali, B. The Expanding Horizons of Asparagine-Linked Glycosylation. Biochemistry 50, 4411-4426 (2011).
20.
Szymanski, C. M., Yao, R., Ewing, C. P., Trust, T. J. & Guerry, P. Evidence for a system of general protein glycosylation in Campylobacter jejuni. Mol. Microbiol. 32, 1022-1030 (1999).
21. Young, N. M. et al. Structure of the N-Linked Glycan Present on Multiple Glycoproteins in the Gram-negative Bacterium, Campylobacter jejuni. J. Biol. Chem. 277, 42530-42539 (2002).
22. Hartley, M. D. & Imperiali, B. At the membrane frontier: A prospectus on the remarkable evolutionary conservation of polyprenols and polyprenyl-phosphates. Arch. Biochem. Biophys. 517, 83-97 (2012).
23. Alaimo, C. et al. Two distinct but interchangeable mechanisms for flipping of lipid-linked oligosaccharides. EMBO J. 25, 967-976 (2006).
24. Scott, N. E. et al. Simultaneous Glycan-Peptide Characterization Using Hydrophilic Interaction Chromatography and Parallel Fragmentation by CID, Higher Energy Collisional Dissociation, and Electron Transfer Dissociation MS Applied to the N-Linked Glycoproteome of Campylobacter jejuni. Mol. Cell. Proteomics 10, M000031-MCP201 (2011).
25. Szymanski, C. M., Burr, D. H. & Guerry, P. Campylobacter protein glycosylation affects host cell interactions. Infect. Immun. 70, 2242-2244 (2002).
26. Hendrixson, D. R. & DiRita, V. J. Identification of Campylobacter jejuni genes involved in commensal colonization of the chick gastrointestinal tract. Mol. Microbiol. 52, 471-484 (2004).
27. Larsen, J. C., Szymanski, C. & Guerry, P. N-Linked Protein Glycosylation Is Required for Full Competence in Campylobacter jejuni 81-176. J. Bacteriol. 186, 6508-6514 (2004).
28. Nothaft, H., Liu, X., McNally, D. J., Li, J. & Szymanski, C. M. Study of free oligosaccharides derived from the bacterial N-glycosylation pathway. Proc. Natl. Acad. Sci. 106, 15019-15024 (2009).
29. Woese, C. R. & Fox, G. E. Phylogenetic structure of the prokaryotic domain: the primary kingdoms. Proc. Natl. Acad. Sci. U. S. A. 74, 5088-5090 (1977).
30. Calo, D., Kaminski, L. & Eichler, J.
Protein glycosylation in Archaea: Sweet and extreme. Glycobiology 20, 1065-1076 (2010).
31. Kaminski, L., Lurie-Weinberger, M. N., Allers, T., Gophna, U. & Eichler, J. Phylogenetic- and genome-derived insight into the evolution of N-glycosylation in Archaea. Mol. Phylogenet. Evol. 68, 327-339 (2013).
32. Eichler, J. & Adams, M. W. W. Posttranslational Protein Modification in Archaea. Microbiol. Mol. Biol. Rev. 69, 393-425 (2005).
33. Mescher, M. F., Hansen, U. & Strominger, J. L. Formation of lipid-linked sugar compounds in Halobacterium salinarium. Presumed intermediates in glycoprotein synthesis. J. Biol. Chem. 251, 7289-7294 (1976).
34. Lechner, J. & Wieland, F. Structure and Biosynthesis of Prokaryotic Glycoproteins. Annu. Rev. Biochem. 58, 173-194 (1989).
35. Fujinami, D., Matsumoto, M., Noguchi, T., Sonomoto, K. & Kohda, D. Structural elucidation of an asparagine-linked oligosaccharide from the hyperthermophilic archaeon, Pyrococcus furiosus. Carbohydr. Res. 387, 30-36 (2014).
36. Leigh, J. A., Albers, S.-V., Atomi, H. & Allers, T. Model organisms for genetics in the domain Archaea: methanogens, halophiles, Thermococcales and Sulfolobales. FEMS Microbiol. Rev. 35, 577-608 (2011).
37. Mescher, M. F. & Strominger, J. L. Purification and characterization of a prokaryotic glucoprotein from the cell envelope of Halobacterium salinarium. J. Biol. Chem. 251, 2005-2014 (1976).
38. Kuntz, C., Sonnenbichler, J., Sonnenbichler, I., Sumper, M. & Zeitler, R. Isolation and characterization of dolichol-linked oligosaccharides from Haloferax volcanii. Glycobiology 7, 897-904 (1997).
39. Abu-Qarn, M. & Eichler, J. Protein N-glycosylation in Archaea: defining Haloferax volcanii genes involved in S-layer glycoprotein glycosylation. Mol. Microbiol. 61, 511-525 (2006).
40. Abu-Qarn, M. et al. Haloferax volcanii AglB and AglD are Involved in N-glycosylation of the S-layer Glycoprotein and Proper Assembly of the Surface Layer. J. Mol. Biol. 374, 1224-1236 (2007).
41.
Guan, Z., Naparstek, S., Kaminski, L., Konrad, Z. & Eichler, J. Distinct glycan-charged phosphodolichol carriers are required for the assembly of the pentasaccharide N-linked to the Haloferax volcanii S-layer glycoprotein. Mol. Microbiol. 78, 1294-1303 (2010).
42. Kaminski, L. et al. AglJ Adds the First Sugar of the N-Linked Pentasaccharide Decorating the Haloferax volcanii S-Layer Glycoprotein. J. Bacteriol. 192, 5572-5579 (2010).
43. Calo, D., Guan, Z., Naparstek, S. & Eichler, J. Different routes to the same ending: comparing the N-glycosylation processes of Haloferax volcanii and Haloarcula marismortui, two halophilic archaea from the Dead Sea. Mol. Microbiol. 81, 1166-1177 (2011).
44. Naparstek, S., Guan, Z. & Eichler, J. A predicted geranylgeranyl reductase reduces the ω-position isoprene of dolichol phosphate in the halophilic archaeon, Haloferax volcanii. Biochim. Biophys. Acta BBA - Mol. Cell Biol. Lipids 1821, 923-933 (2012).
45. Guan, Z., Naparstek, S., Calo, D. & Eichler, J. Protein glycosylation as an adaptive response in Archaea: growth at different salt concentrations leads to alterations in Haloferax volcanii S-layer glycoprotein N-glycosylation. Environ. Microbiol. 14, 743-753 (2012).
46. Voisin, S. et al. Identification and Characterization of the Unique N-Linked Glycan Common to the Flagellins and S-layer Glycoprotein of Methanococcus voltae. J. Biol. Chem. 280, 16586-16593 (2005).
47. Chaban, B., Voisin, S., Kelly, J., Logan, S. M. & Jarrell, K. F. Identification of genes involved in the biosynthesis and attachment of Methanococcus voltae N-linked glycans: insight into N-linked glycosylation pathways in Archaea. Mol. Microbiol. 61, 259-268 (2006).
48. Chaban, B., Logan, S. M., Kelly, J. F. & Jarrell, K. F. AglC and AglK Are Involved in Biosynthesis and Attachment of Diacetylated Glucuronic Acid to the N-Glycan in Methanococcus voltae. J. Bacteriol. 191, 187-195 (2009).
49. Larkin, A., Chang, M.
M., Whitworth, G. E. & Imperiali, B. Biochemical evidence for an alternate pathway in N-linked glycoprotein biosynthesis. Nat. Chem. Biol. 9, 367-373 (2013). 50. VanDyke, D. J. et al. Identification of genes involved in the assembly and attachment of a novel flagellin N-linked tetrasaccharide important for motility in the archaeon Methanococcus maripaludis.Mol. Microbiol. 72, 633-644 (2009). 51. Kelly, J., Logan, S. M., Jarrell, K. F., VanDyke, D. J. & Vinogradov, E. A novel N-linked flagellar glycan from Methanococcus maripaludis. Carbohydr. Res. 344, 648-653 (2009). 52. Peyfoon, E. et al. The S-Layer Glycoprotein of the Crenarchaeote Sulfolobus acidocaldarius Is Glycosylated at Multiple Sites with Chitobiose-Linked N-Glycans. Archaea 2010, e754101 (2010). 53. Guan, Z., Meyer, B. H., Albers, S.-V. & Eichler, J. The thermoacidophilic archaeon Sulfolobus acidocaldarius contains an unsually short, highly reduced dolichyl phosphate. Biochim. Biophys. Acta BBA - Mol. Cell Biol. Lipids 1811, 607-616 (2011). 54. Meyer, B. H. & Albers, S.-V. AgIB, catalyzing the oligosaccharyl transferase step of the archaeal N-glycosylation process, is essential in the thermoacidophilic crenarchaeon Sulfolobus acidocaldarius. MicrobiologyOpen 3, 531-543 (2014). 55. Magidovich, H. & Eichler, J. Glycosyltransferases and oligosaccharyltransferases in Archaea: putative components of the N-glycosylation pathway in the third domain of life. FEMS Microbiol. Lett. 300, 122-130 (2009). 56. Cohen-Rosenzweig, C., Guan, Z., Shaanan, B. & Eichler, J. Substrate Promiscuity: AgIB, the Archaeal Oligosaccharyltransferase, Can Process a Variety of Lipid-Linked Glycans. AppL. Environ. Microbiol. 80, 486-496 (2014). 57. Igura, M. et al. Purification, crystallization and preliminary X-ray diffraction studies of the soluble domain of the oligosaccharyltransferase STT3 subunit from the thermophilic archaeon Pyrococcusfuriosus. Acta Crystallograph. Sect. F Struct. Biol. Cryst. Commun. 63, 798-801 (2007). 58. 
Maita, N., Nyirenda, J., Igura, M., Kamishikiryo, J. & Kohda, D. Comparative Structural Biology of Eubacterial and Archaeal Oligosaccharyltransferases. J. Biol. Chem. 285, 49414950 (2010). 59. Matsumoto, S. et al. Crystal structures of an archaeal oligosaccharyltransferase provide insights into the catalytic cycle of N-linked protein glycosylation. Proc. NatL. Acad Sci. 110, 17868-17873 (2013). 32 Chapter 2: Identification and characterization of archaeal glycosyltransferases A portion of the work described in this chapter has been published in the following: Larkin, A., Chang, M.M., Whitworth, G.E., Imperiali, B. Biochemical evidence for an alternate pathway in N-linked glycoprotein biosynthesis. Nat. Chem. Biol. 2013, 9, 367-373. 33 Introduction Nature exhibits an extensive and complex variety of glycan structures in part due to the enzymatic action of glycosyltransferases (GTs), which catalyze glycan assembly through the transfer of individual monosaccharide units. Due to the diversity of monosaccharide units, multiplicity of glycosidic linkages, branching of oligosaccharides, and processing by glycosyltransferases and glycosyihydrolases, the structure of glycans on mature glycoproteins can be highly heterogeneous.' GTs most commonly use sugar nucleotide derivatives (e.g. UDPGlc, GDP-Man, CMP-NeuAc) as glycosyl donors, but they can also use lipid-phosphate sugars (e.g. Dol-P-Man) and unsubstituted phosphate sugars as activated donors. Acceptor substrates can be other sugars, proteins, lipids, nucleic acids, antibiotics, or other small molecules. 2 Currently, the Carbohydrate-Active EnZYmes (CAZy) database includes over 148,000 unique GTs that have been identified and classified into 96 families based on sequence similarities. 3 These family classifications are continually updated as new sequences are added to the CAZy database and new structures are deposited into the Protein Data Bank (PDB). 

The first X-ray crystal structure of a GT was reported for the bacteriophage T4 β-glucosyltransferase in 1994.4 There are now X-ray crystal structures available for over 100 GTs in at least 38 GT families.5 A comparison of these GT structures shows that while these enzymes have low sequence homology, they exhibit a surprisingly high structural homology, with two major structural folds, called GT-A and GT-B (Figure 2-1).2 Both GT folds are largely composed of α/β/α sandwiches. GT-A folds contain a single Rossmann fold, in which the active site is located between two β-sheets, and a conserved 'DXD' metal binding motif. GT-B folds contain two distinct Rossmann folds that are separated by a linker region and surround the catalytic site, but do not require metal ion binding for activity. The Rossmann fold is a common motif found in nucleotide-binding proteins.

Figure 2-1: Representative structures of the two main GT folds. (A) GT-A fold, inverting enzyme SpsA from Bacillus subtilis, PDB code 1QGQ. (B) GT-B fold, bacteriophage T4 β-glucosyltransferase, PDB code 1JG7. Figure modified from Lairson, et al.2

In recent years, additional folds have been proposed but still remain controversial. The GT-C fold has been predicted for GT families, including the Stt3p-like subunits of oligosaccharyltransferases (OTases), that utilize lipid-phosphate donor substrates and have 8-13 transmembrane α-helices with the active site located at the interface between the transmembrane and soluble domains.6,7 Sometimes, newly described GTs do not fit into any category, such as the CstII sialyltransferase from Campylobacter jejuni8 or the peptidoglycan glycosyltransferase PBP2 from Staphylococcus aureus.9 Most recently, a GT-D fold was proposed after the crystal structure of a previously uncharacterized 'domain of unknown function' 1792 (DUF1792), recently shown biochemically to be a glucosyltransferase, revealed that it is structurally distinct from all known GT folds and contains a previously unknown metal ion-binding site.10 As more X-ray structures are solved for different GT families, novel GT folds may arise during the structural characterization of currently orphan GT families and will certainly contribute to the understanding of the variety of folds used by nature to selectively catalyze glycosyl transfer.

Mechanistic understanding of the glycosyltransferase reaction has advanced in recent years.2,11 Glycosidic bond formation can be described as inverting or retaining, depending on whether there is a change in configuration with respect to the anomeric center of the glycosyl donor (Figure 2-2A). The mechanism of inverting GTs is believed to be similar to that of the inverting glycosylhydrolases,2 in which an active site base deprotonates the nucleophilic hydroxyl group of the glycosyl acceptor, facilitating an SN2-type direct displacement of the activating group of the glycosyl donor via an oxocarbenium ion-like transition state, which would show considerable positive charge buildup at the anomeric center (Figure 2-2B). In the case of retaining GTs, both double displacement and front-side (SNi-like) mechanisms have been proposed (Figure 2-2C). A double displacement mechanism requires two successive SN2 reactions, beginning with nucleophilic attack by an active site residue on the anomeric center of the donor substrate, leading to the formation of a covalent glycosyl-enzyme intermediate.
The glycosyl-enzyme species is then attacked by a hydroxyl group of the acceptor, assisted by its deprotonation by a catalytic base or departing UDP, resulting in net overall retention of configuration. For a front-side SNi-like mechanism, there is an interaction between the departing phosphate and the incoming acceptor nucleophile and the formation of an enzyme-stabilized oxocarbenium ion that is shielded on one face by the enzyme.13 Interestingly, both inverting and retaining enzymes are found in GT-A and GT-B folds, suggesting that there is no correlation between the overall fold of GTs and their catalytic mechanism. To date, all enzymes predicted to adopt the GT-C fold are inverting GTs.2

Figure 2-2: Catalytic mechanisms in GTs. (A) GTs catalyze the transfer of sugars with either inversion or retention of the anomeric configuration with respect to the sugar donor substrates. (B) Inverting GTs utilize a direct-displacement SN2-like mechanism with a single oxocarbenium ion-like transition state. (C) Current mechanisms for retaining GTs are the front-side SNi-like mechanism involving a short-lived oxocarbenium ion-like species and the double-displacement mechanism involving a covalent glycosyl-enzyme species. Figure adapted from Albesa-Jové, et al.12

N-linked glycosylation in all three domains of life begins with the action of a series of glycosyltransferases that build up an oligosaccharide for transfer onto the acceptor protein. In particular, eukaryotic and bacterial pathways feature the stepwise assembly of a glycan onto a polyprenyl-diphosphate carrier followed by glycan transfer onto an asparagine residue within an Asn-Xaa-Ser/Thr sequence.14 The first step in this process commits the pathway to the membrane bilayer when a phosphoglycosyltransferase (PGT) catalyzes the reaction between a nucleotide-diphosphate-activated sugar and a polyprenyl-phosphate to form the corresponding α-linked polyprenyl-diphosphate monosaccharide, which is then elaborated by various GTs to ultimately produce the glycosyl donors for the OTases, which generate a β-linked glycosyl bond to asparagine.

Unlike in bacteria, N-linked glycosylation in archaea is widespread, as evidenced by a recent bioinformatics study that revealed that 166 out of 168 sequenced archaeal genomes are predicted to encode at least one version of AglB, the archaeal OTase enzyme that is key to N-glycosylation.15 N-linked glycans that decorate experimentally characterized archaeal glycoproteins exhibit a composition and content that is more diverse than seen in bacteria or eukaryotes.16 The N-glycosylation pathway of the marine methanogen Methanococcus voltae is one of the most intensively studied in archaea, along with those of Methanococcus maripaludis and Haloferax volcanii.17,18 Methanoarchaeal N-glycosylation studies began with the elucidation of the structure of the glycan N-linked to the archaeal flagellins (archaellins) and S-layer glycoproteins of M. voltae.19 Mass spectrometry and NMR analyses were used to characterize the structure as consisting of GlcNAc acting as the linking sugar, connected to a diacetylated glucuronic acid, which is linked to an acetylated mannuronic acid that is attached via an amide bond to a threonine at the C-6 position (Figure 2-3A). N-glycan attachment is required for proper archaellar structure and cell motility.

Figure 2-3: N-linked glycosylation in Methanococcus voltae. (A) The M. voltae N-linked glycan ManNAc(6Thr)A-β1,4-Glc-2,3-diNAcA-β1,3-GlcNAc. (B) The annotated M. voltae pathway based on genetic complementation and knockout studies, as proposed by Chaban, et al.20

Once the glycan structure was identified, studies next targeted the genes encoding potential enzymes involved in the biosynthesis and attachment of N-linked glycans in M. voltae. The first to be identified in the pathway were genes for the OTase, AglB, and a glycosyltransferase, designated as AglA, which was proposed to be responsible for the transfer of the third sugar. Insertional inactivation of these genes was carried out and combined with mass spectrometry analysis and migration analysis on SDS-PAGE of the resulting archaellins and S-layer glycoproteins. The first GT in the pathway, responsible for attachment of GlcNAc to the dolichol carrier, was proposed to be encoded by Mv1751, a gene subsequently designated as aglH. AglH is the only M. voltae protein that belongs to Pfam PF00953 (GT family 4), which also includes Alg7, the N-acetylglucosamine-1-phosphate transferase in Saccharomyces cerevisiae that catalyzes the conversion of Dol-P and UDP-GlcNAc to Dol-PP-GlcNAc and UMP, which represents the first committed step in the eukaryotic N-linked glycosylation pathway.

Direct involvement of AglH in the M. voltae pathway could not be demonstrated because attempts to inactivate the aglH gene were unsuccessful. However, AglH shares 25% identity and 41% similarity with S. cerevisiae Alg7 as well as motifs conserved in GlcNAc-1-P transferase proteins. Due to their shared features, AglH was proposed to serve as a functional homolog of Alg7. Further experiments suggested that the M. voltae aglH gene was able to complement a conditional lethal mutation in the S. cerevisiae alg7 gene, supporting the proposal that AglH was the GlcNAc-1-P transferase that catalyzes the first step in the M. voltae N-linked glycosylation pathway.2 Two other GTs, designated as AglC and AglK, were proposed to be involved in the biosynthesis or transfer of the second sugar, and inactivation of either gene resulted in archaellin and S-layer proteins with significantly reduced apparent molecular masses, loss of archaellum assembly, and absence of glycan attachment.20 The proposed identity of the enzymes and the reactions they catalyze during N-glycan assembly in M. voltae are summarized in Figure 2-3B.

In our studies, we used purified enzymes and synthetic and semisynthetic substrates to biochemically validate the proposed M. voltae N-linked glycosylation pathway. Ultimately, we wanted to generate the dolichyl-linked disaccharide substrate necessary for future investigations of a representative archaeal OTase, AglB. We made several significant discoveries during the course of our in vitro studies that mandated the reassessment of the putative pathway. The functions of two GTs were biochemically characterized, and the first two steps of the M. voltae N-linked glycosylation pathway were redefined as a result of the work described in this chapter.

Results and Discussion

AglH is not the first enzyme in the M. voltae N-linked pathway

In order to investigate the biosynthesis of the M. voltae N-linked glycan donor, we first focused our attention on AglH. Because the aglH gene is reported to complement an alg7 deletion (Δalg7) in S. cerevisiae and rescue growth, AglH was thought to be the PGT responsible for the formation of Dol-PP-GlcNAc, which would be the first polyprenyl-linked intermediate in the M. voltae pathway.2 PGT or GT activity can be examined through an assay that takes advantage of a radiolabeled substrate and the differential partitioning of glycosyl donors, polyprenyl-linked acceptors, and polyprenyl-linked glycan products into aqueous and organic phases (Figure 2-4).
The transfer of radioactivity from the aqueous layer into the organic layer indicates that the radiolabeled sugar has been transferred from its aqueous-soluble nucleotide-diphosphate derivative to the polyprenyl-linked acceptor, producing a radiolabeled, organic-soluble polyprenyl-linked sugar.

Scheme: Dolichyl-P (organic soluble) + UDP-[3H]GlcNAc (aqueous soluble), in the presence of a PGT or GT, gives Dolichyl-P(P)-[3H]GlcNAc (organic soluble) + UDP or UMP (aqueous soluble).

Figure 2-4: General assay for PGT and GT activity. Radiolabeled glycosyl donor is incubated with unlabeled polyprenyl-P acceptor. At each time point, the reaction is quenched into organic solvent and extracted with aqueous solution. The radioactivity in the organic and aqueous layers is quantified by scintillation counting to determine reaction progress.

AglH activity was investigated with polyprenyl-phosphate substrates featuring both short and long dolichols, which are reported to be characteristic of archaea and eukaryotes, respectively (C55-60 vs. C85-105).24 Unlike the bacterial undecaprenol, dolichols feature an α-saturated isoprene unit, as seen in eukaryotes. Archaeal dolichols, in particular those from selected halophiles, also include an additional element of saturation at the ω-isoprene unit.25,26 The enzyme that reduces this ω-position isoprene in the short (C55 and C60) dolichyl-phosphates from H. volcanii has been identified,27 and mass spectrometry analyses reveal that both the α-saturated and α,ω-saturated short dolichols are competent intermediates for the assembly of the Dol-P-glycans. Therefore, α-saturated polyprenols were expected to be suitable for the current studies, although future studies would be needed to confirm whether methanogens such as M. voltae also feature the additional polyprenol ω-saturation.
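
The readout of the partitioning assay in Figure 2-4 reduces to a simple ratio of scintillation counts. The sketch below is illustrative only: it assumes that reaction progress is expressed as the percentage of total recovered radioactivity found in the organic layer, with an optional background subtraction. The function name, the background parameter, and the example counts are hypothetical and are not taken from the experiments described here.

```python
def percent_conversion(organic_dpm, aqueous_dpm, background_dpm=0.0):
    """Percentage of recovered radiolabel that partitioned into the organic layer.

    Assumption: the radiolabeled donor (e.g. UDP-[3H]GlcNAc) stays in the
    aqueous layer and only the polyprenyl-linked product is extracted into
    the organic layer, so organic counts report on product formation.
    """
    organic = max(organic_dpm - background_dpm, 0.0)
    aqueous = max(aqueous_dpm - background_dpm, 0.0)
    total = organic + aqueous
    if total == 0:
        raise ValueError("no counts above background in either layer")
    return 100.0 * organic / total


# Hypothetical time course: (minutes, organic-layer dpm, aqueous-layer dpm).
time_course = [(0, 120, 9880), (5, 2500, 7500), (30, 8000, 2000)]
for minutes, organic, aqueous in time_course:
    value = percent_conversion(organic, aqueous)
    print(f"{minutes:>2} min: {value:.1f}% of label in organic layer")
```

Normalizing to the sum of both layers, rather than to the input counts, makes each time point insensitive to pipetting losses during the quench, at the cost of assuming that no label is lost at the interface.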

To test the activity of AglH, the enzyme was overexpressed in Escherichia coli, isolated as a membrane fraction, and incubated with either short (C55-60) or long (C85-105) (S)-Dol-P and UDP-[3H]GlcNAc. However, even after an exhaustive screen of reaction conditions (e.g. pH, temperature, detergent, salts, and metals), we did not observe formation of Dol-PP-GlcNAc, the product that was expected based on the proposed AglH function. It is noteworthy that the authors of the complementation study suggest that AglH may also be involved in different and essential lipid-glycoconjugate biosynthetic pathways in M. voltae, including assembly of the unique GlcNAc-1-P diether glycolipid found in the plasma membrane.22,28 While the function of AglH remains unknown, we conclude that it is not the PGT that initiates N-glycan assembly in M. voltae.

AglK is a specific Dol-P-GlcNAc synthase

We next turned our attention to AglK, a 28.2 kDa protein proposed to function downstream of AglH.20 The predicted topology of AglK does not include a true transmembrane domain, but the overall structure appears to contain hydrophobic regions that associate with the membrane (Figure 2-5A).29 AglK was overexpressed in E. coli with N-terminal T7 and C-terminal His6 tags, purified from bacterial membranes in high yield, and confirmed by SDS-PAGE and Western blot analyses (Figure 2-6). The protein sedimented with the membrane fraction and required detergent for solubility.

Figure 2-5: Topology prediction generated using the TMHMM server29 for M. voltae proteins (A) AglK and (B) AglC.

Figure 2-6: Coomassie-stained SDS-PAGE and Western blot analysis of purified AglK and AglC proteins using antibodies specific to an N-terminal T7 or C-terminal His tag.

Upon incubation of AglK with short (C55-60) (S)-Dol-P and UDP-[3H]GlcNAc, we observed rapid formation of a radiolabeled organic-soluble product, characteristic of a dolichyl-linked compound. Under the defined reaction conditions, AglK was specific for UDP-GlcNAc, as we detected no turnover with radiolabeled UDP-Glc, UDP-GalNAc, UDP-Gal, or GDP-Man (Figure 2-7A). The observed in vitro preference of AglK for UDP-GlcNAc supports the assignment of this enzyme as the first glycosyltransferase in the pathway, as the N-glycan structure assembled in vivo in M. voltae shows that GlcNAc is the linking sugar attached to Asn.19 The apparent Km for the donor UDP-GlcNAc substrate at a fixed concentration of Dol-P (100 μM) was determined to be 1.1 ± 0.3 μM, which is comparable to the Km values of other known glycosyltransferases, which are typically around 1-25 μM for NDP-sugars. Under the same conditions, AglH once again demonstrated no discernible activity. Additionally, AglK showed much higher activity with the shorter, native-like (C55-60) Dol-P substrate compared with the longer (C85-105) Dol-P, indicating a strong preference of AglK for the shorter dolichols found in archaea (Figure 2-7B).26,30 The long and short (S)-dolichols used in these studies are derived from plant sources and therefore include three E-isoprenes, in contrast to the non-plant derived linear polyprenols that include two E-isoprenes and an additional Z-isoprene unit. Finally, enzymatic activity required divalent metal cations (Figure 2-7C). During the purification from the membrane, AglK retained enough bound M2+ to show activity with no additional metal in the assay. However, when AglK was stripped of all metal with EDTA during purification, it was inactive. The addition of Ca2+, Mn2+, or Mg2+ restored robust activity.

Figure 2-7: AglK activity assay, plotted against time (min). (A) Glycosyl donor specificity of AglK using (C55-60) Dol-P and radiolabeled nucleotide sugars. The assay monitors the formation of radiolabeled Dol-P-monosaccharide over time. AglH activity was tested using UDP-[3H]GlcNAc and (C55-60) Dol-P. (B) Polyprenyl-phosphate specificity of AglK with UDP-[3H]GlcNAc. (C) Metal dependency of AglK with UDP-[3H]GlcNAc as the glycosyl donor and (C55-60) Dol-P as the polyprenyl-phosphate acceptor. AglK activity was tested in the presence of various divalent cations (10 mM) and EDTA.

The (C55-60) Dol-P-linked product of the AglK reaction was purified using normal phase HPLC (Figure 2-8) and identified as Dol-P-GlcNAc by mass spectrometry (Figure 2-9). Analysis of the AglK reaction products by capillary electrophoresis also revealed that the nucleotide product of this reaction was UDP rather than UMP, providing evidence that AglK acts as a Dol-P-GlcNAc synthase and transfers GlcNAc to the Dol-P acceptor, rather than transferring GlcNAc-1-P to form Dol-PP-GlcNAc (Figure 2-10).

Figure 2-8: Normal phase HPLC purification of Dol-P-GlcNAc. (A) Radiolabeled Dol-P-[3H]GlcNAc was monitored by liquid scintillation counting of each fraction. (B) Unlabeled Dol-P-GlcNAc was monitored by TLC of each fraction.

Figure 2-9: ESI-MS (negative ion mode) of purified (S)-Dol-P-GlcNAc [M+ = 1051]. The (C55) Dol-linked product was enriched during HPLC purification of the corresponding polyprenol.

Figure 2-10: Capillary electrophoresis of AglK reaction products after aqueous extraction of the reaction mixture, used to identify the nucleotide product as UDP. In each case, (1) UDP, (2) UMP, or (3) UDP-GlcNAc was spiked into the reaction mixture to determine the identity of each peak.

Dol-P-GlcNAc exhibits an α-glycosidic linkage

AglK was previously described to function alongside AglC to produce the downstream Dol-PP-disaccharide from the Dol-PP-monosaccharide.20 However, a closer examination of the AglK sequence revealed that it exhibits a strikingly high sequence similarity (> 50%) to Alg5, the dolichyl-phosphate β-glucosyltransferase from S. cerevisiae (Figure 2-11).31

Figure 2-11: Alignment of AglK from M. voltae with Alg5, the dolichyl-phosphate glucosyltransferase from S. cerevisiae. Residues highlighted in black indicate sequence identity (25%) and those in gray and black denote sequence homology (53%). The alignment was performed using ClustalW.
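
Identity and similarity percentages such as those quoted for the AglK/Alg5 alignment can be computed from any pairwise alignment by counting columns. The following is a minimal sketch, not the procedure used to generate Figure 2-11: the similarity groups are a simple assumption standing in for the conservation classes that an alignment viewer would apply, gapped columns are excluded from the denominator (other tools normalize differently), and the two short sequences are invented for illustration.

```python
# Assumed conservative-substitution groups; real tools use their own classes
# or a scoring matrix, so percentages can differ between programs.
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "ST", "NQ", "AG"]


def _is_similar(a, b):
    """True if two different residues fall in the same assumed group."""
    return any(a in group and b in group for group in SIMILAR_GROUPS)


def identity_and_similarity(aligned_a, aligned_b):
    """Return (% identity, % similarity) over columns without gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif _is_similar(a, b):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


# Hypothetical aligned fragments (not the real AglK or Alg5 sequences).
identity, similarity = identity_and_similarity("MKDAD-LV", "MRDADGIV")
print(f"identity {identity:.1f}%, similarity {similarity:.1f}%")
```

Because similarity includes identical positions, it is always at least as large as identity, which matches the way the figure captions report the gray-plus-black fraction.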

Due to the high sequence similarity of AglK with dolichol-phosphate-mannose synthase (DPM) and dolichol-phosphate-glucose synthase (DPG), it was predicted that AglK would generate β-linked Dol-P-GlcNAc, as previous studies using a variety of methods including acid/base hydrolysis, enzymatic digestion, and mass spectrometry suggest that both Dol-P-Man and Dol-P-Glc, produced by Dpm1, the Dol-P-Man synthase, and Alg5, the Dol-P-Glc synthase, respectively, show this anomeric configuration. The unambiguous determination of the anomeric configuration in the product is essential since this stereocenter is critical in the ultimate OTase reaction catalyzed by AglB. We investigated the stereochemistry of the GlcNAc moiety bound to Dol-P using phosphorus-decoupled 1H-NMR spectroscopy, and determined the J1,2 value of the anomeric proton to be 3.4 Hz (Figure 2-12, Table 2-1). This value strongly indicates that the AglK reaction proceeds with retention of stereochemistry and the Dol-P-GlcNAc product bears an anomeric α-linkage.25,35 The glycosylated archaellins in M. voltae are known to attach the N-glycan via a β-glycosyl amide linkage,19 so this unexpected α-configuration of Dol-P-GlcNAc would actually be consistent with the later inversion of stereochemistry catalyzed by known OTases.

Figure 2-12: (A) 1H-NMR spectrum of Dol-P-GlcNAc. (B) (i) Expansion of the 31P-decoupled 1H-NMR spectrum of the anomeric proton for Dol-P-GlcNAc, a doublet with a coupling constant of 3.4 Hz (J1,2). (ii) Expansion of the 1H-NMR spectrum of the anomeric proton for Dol-P-GlcNAc, a doublet of doublets with coupling constants of 3.4 (J1,2) and 6.4 (J1,P) Hz.

Table 2-1: (A) Dol-P-GlcNAc structure. (B) 1H- and 13C-NMR assignments (ppm) and coupling constants (Hz) for the dolichyl portion of Dol-P-GlcNAc. (C) 1H- and 13C-NMR assignments (ppm) and coupling constants (Hz) for the GlcNAc portion of Dol-P-GlcNAc. [Table body not reproduced; among the GlcNAc entries, the anomeric position is assigned as H-1 5.36 ppm (J1,2 = 3.4 Hz, J1,P = 6.4 Hz) and C-1 94.2 ppm.]

AglK mutagenesis and inhibition studies

The 'DXD' motif is a short conserved motif found in many glycosyltransferase families that use nucleoside diphosphate sugars as donors and require divalent cations for activity.2 AglK possesses a 'DAD' motif, a signature of the inverting, metal ion-dependent, GT-A fold members of the GT2 family, to which DPM and DPG are assigned. Each aspartic acid residue in the 'DAD' motif was mutated to an alanine residue (D105A and D107A), and both mutants were inactive, presumably because the metal ion-binding capacity required for function was abolished (Figure 2-13).

Figure 2-13: AglK D105A and D107A mutants were both inactive compared to wild-type AglK when assayed with Dol-P and UDP-[3H]GlcNAc, confirming that binding of a divalent metal cation is necessary for activity.
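
Locating a 'DXD' motif and designing the corresponding alanine substitutions is easy to express in code. The sketch below is a hypothetical illustration: it scans a protein sequence for Asp-X-Asp triplets with a regular expression and applies a point mutation by 1-based position. The toy sequence is invented and does not reproduce the AglK sequence or its residue numbering (D105 and D107 in the text).

```python
import re

# Lookahead so that overlapping motifs (e.g. "DADAD") are all reported.
DXD_PATTERN = re.compile(r"(?=(D[A-Z]D))")


def find_dxd_motifs(sequence):
    """Return (1-based start position, motif) for every Asp-X-Asp triplet."""
    return [(m.start() + 1, m.group(1))
            for m in DXD_PATTERN.finditer(sequence.upper())]


def point_mutant(sequence, position, new_residue):
    """Replace the residue at a 1-based position, e.g. D4A -> position=4, 'A'."""
    index = position - 1
    if not 0 <= index < len(sequence):
        raise IndexError("position outside the sequence")
    return sequence[:index] + new_residue + sequence[index + 1:]


toy = "MKLDADGT"                      # hypothetical sequence with one DAD motif
print(find_dxd_motifs(toy))           # motif found at position 4
for pos in (4, 6):                    # mutate each Asp of the motif in turn
    mutant = point_mutant(toy, pos, "A")
    print(mutant, find_dxd_motifs(mutant))
```

Removing either aspartate destroys the pattern match, which parallels the logic of testing both single mutants: if the motif is functionally required, loss of either acidic residue is expected to abolish activity.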

In addition, the chemical mechanism of glycosyltransferases is thought to involve the development of substantial oxocarbenium character in the transition state (Figure 2-2).2 Sugar nucleotides with electron-withdrawing substituents are known to be inhibitors of glycosidases, and UDP-(5F)-GlcNAc (Figure 2-14A) has been shown to be a slow-binding inhibitor of MshA, a retaining glycosyltransferase.36 We used UDP-(5F)-GlcNAc as a mechanistic probe of AglK, and our studies show that AglK is indeed inhibited, further supporting the conclusion that AglK is a retaining glycosyltransferase (Figure 2-14B).

Figure 2-14: AglK inhibition observed with UDP-(5F)-GlcNAc. (A) Structure of the UDP-(5F)-GlcNAc inhibitor. (B) AglK activity over time (min) in the presence of 0, 50, and 500 μM UDP-(5F)-GlcNAc.

Comparison of AglK with other glycosyltransferases

We demonstrated that AglK is a Dol-P-GlcNAc synthase that utilizes UDP-GlcNAc and (C55-60) Dol-P to produce α-linked Dol-P-GlcNAc. This initial step immediately deviates from the biochemically characterized bacterial and eukaryotic pathways, which feature polyprenyl-diphosphate-linked intermediates, as well as from the proposed pathways in selected methanogens.37 However, the high sequence similarity of AglK and S. cerevisiae Alg5 suggests a common evolutionary origin for both enzymes, and may shed light on the intriguing observation that in eukaryotes Dol-P-monosaccharides serve as glycosyl donors for the glycosyltransferases within the ER lumen in the second phase of the dolichol pathway, compared with the nucleotide sugars that are utilized in the first phase.23

The first step in the M. voltae pathway involves an enzyme that catalyzes a rather unique transformation. This AglK reaction is the first membrane-committed step, and it may represent a hallmark for the dolichyl-monophosphate-dependent pathways.
In this context, examination of enzyme sequences and resulting pathway intermediates from other archaeal species can be informative, even if the molecular details remain unclear. In the case of H. volcanii, AglJ is predicted to carry out the first glycosyl transfer step, and exhibits 43% similarity to AglK (Figure 2-15A).38 Previous analysis of H. volcanii lipid extracts confirmed the presence of Dol-P-linked glycans, suggesting that this glycosylation pathway may be more similar to that of M. voltae than originally proposed, although the identities of the carbohydrates are different and the stereochemistry at the glycosyl phosphate linkage is not defined.38 Similarly, while the enzyme that catalyzes the first step of the M. maripaludis N-linked glycosylation pathway is not yet identified, a gene that is proposed to be involved in the process (MMP1170), based on its location in a gene cluster, encodes a protein that shows extremely high homology to AglK, with 79% similarity (Figure 2-15B).17 This suggests that the first step of the M. maripaludis pathway may involve the formation of Dol-P-GalNAc instead of the previously presumed Dol-PP-GalNAc. By analogy to the M. voltae product, the Dol-P-GalNAc product would also be expected to bear an α-linkage at the anomeric center.
52 A KQNHNYDII ISDYRDEGF AglK (1) Ag1J (1) -MADKLIYLII MPTPDAVCIL AglK (66) Ag1J (67) NNNNKVIAIKHEQ-----N QAVREAVED IQAPYVLMLD AglK (122) iW Ag1J (133) KIMKELEEESKQNNNNNNN LAEDAGAHVVVQSGSGKG VI ATIGLKAYELGAD-----I1VTIDADGQHIPDDIAKIIQP DMRPITRLNRUGNR DPLTEGYDH4GD AT YE -------- SKEYVVISIIKNPKE FTRES RAFAFIHGQIFRILSY K------------VGNLILSFITFLLGGUYV SDGFGIETEMAVECAKRUIKTTVVPTTYUPR --------------- YETCSEMIIAKKNKLNIGEVPIK NNPLFYFGSVGFASTATGLGLEYVAYEWVVRSISHEVI EQ SKSALK AglK (168) TD-SQS Ag1J (199) PDGSDT4DPIRDGGIIFELY AglK (218) TIYTEYSmAR rNVMIG---FKIFYPML4KVLD-RIEELE REQ Ag1J (265) AVVSMAGILFIVQLLMFGVLSDLILS B KEI AglK MMP1170 (1) (1) AglK MMP1170 (64) (54) AglK MMP1170 (127) KE (110) FD% AglK MMP1170 (190) (173) KS ESKQNNNNN ISE ---------- LDKtL V KN AI NNb NNK -- ---- E ELGADL KYNPKV: G M GGA INI G GGjqW KKt GT~i1 K C S GLSFT GLMFITE G Y GKVLD IIA KN IVLFKNLF F IQ I---- Figure 2-15: Alignment of AglK from M voltae with either (A) AgliJ from H. volcanii or (B) MMP 1170 from M maripaludis. Residues highlighted in black indicate sequence identity (18% for AglJ and 60% for MMP 1170) and those in gray and black denote sequence similarity (43% for Ag1J and 79% for MMP 1170). The alignments were performed using ClustalW. Observationof Dol-P-glycanin M maripaludiscells Both Dol-P and Dol-PP linked glycosyl donor substrates have been implicated in archaeal glycan assembly.3 9 For example, studies in Haloferax volcanii and Sulfolobus acidocaldarius 2 26 40 identified Dol-P-glycans as intermediates in the N-glycan assembly. , , Ideally, the next step would be to carry out a cellular lipid extract of M voltae to identify any present Dol-P-glycans or Dol-PP-glycans and their intermediates. Unfortunately, M voltae cells were unavailable; however, we were instead able to carry out mass spectrometry analysis of a total lipid extract 53 from the closely related methanogen M. maripaludis, which has been previously reported to generate a very similar N-linked glycan to that of M. 
voltae (Figure 2-16).

Figure 2-16: N-linked glycan structures from (A) M. voltae and (B) M. maripaludis.

The total lipid extract was visualized by TLC, and we noticed three promising spots in the expected range of Rf values for Dol-P-glycans (Figure 2-17A). We then used normal-phase HPLC to isolate the potential Dol-P-glycans. The mass spectrometry data indicated the presence of a component corresponding to a short (C55) Dol-P-trisaccharide ([M-2H+Na]- = 1692.0) with the dolichol bearing two sites of saturation, presumably the α- and ω-isoprene units predicted for archaeal dolichols (Figure 2-17B,C). The trisaccharide has the same mass as that calculated for the first three sugars of the N-linked glycan identified in M. maripaludis. We were not able to detect other Dol-P-glycan intermediates or any Dol-PP-glycans.

Figure 2-17: Mass spectrometry analysis of M. maripaludis total lipid extraction. (A) TLC of (C55) Dol-P (lanes 1-2), M. maripaludis lipid extraction (lanes 4-5), and co-spot (lane 3). The Dol-P-trisaccharide purified by NP-HPLC is indicated by an arrow. (B) ESI-MS (negative ion mode) of M. maripaludis lipid extraction after HPLC purification, where [M-2H+Na]- = 1692.0 and [M-2H]2- = 834.5. (C) Structure of the α- and ω-saturated Dol-P-trisaccharide, ManNAc3NAm(6Thr)A-β1,4-Glc-2,3-diNAcA-β1,3-GalNAc-P-Dol.
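The two ions reported for the Dol-P-trisaccharide should describe the same neutral molecule, and that can be checked with simple arithmetic. The sketch below assumes the singly charged ion is the sodiated species formed by loss of two protons and gain of one sodium ion, and that the doubly charged ion is formed by loss of two protons. The function names are illustrative, and the constants are standard values for the proton and for Na+ rounded to six decimals.

```python
PROTON_MASS = 1.007276      # Da, mass of H+
SODIUM_ION_MASS = 22.989221  # Da, mass of Na+ (Na atom minus one electron)


def neutral_mass_from_sodiated_anion(mz: float) -> float:
    """Neutral mass M implied by a singly charged ion formed by -2 H+ and +1 Na+."""
    return mz + 2 * PROTON_MASS - SODIUM_ION_MASS


def mz_deprotonated(neutral_mass: float, charge: int) -> float:
    """m/z of the anion formed by removing `charge` protons from M."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass - charge * PROTON_MASS) / charge


if __name__ == "__main__":
    m = neutral_mass_from_sodiated_anion(1692.0)
    print(f"implied neutral mass: {m:.2f} Da")
    print(f"predicted doubly charged ion: {mz_deprotonated(m, 2):.1f}")
```

Under these assumptions the predicted doubly charged ion rounds to 834.5, in agreement with the second value given in the figure legend.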
The findings with the M. maripaludis system strongly support the notion that, like halophiles, methanogens such as M. maripaludis and the closely related M. voltae rely on dolichol as the polyprenol anchor in N-linked glycan biosynthesis, and additionally require the monophosphate linkage rather than the diphosphate as previously proposed. In addition, the observation of this Dol-P-trisaccharide in M. maripaludis is compelling evidence that the intermediates defined through our in vitro experiments are consistent with those produced in vivo. Therefore, taken together with the conclusions of our comparative sequence analysis, it is likely that the N-linked glycosylation pathways in both H. volcanii and M. maripaludis follow a monophosphate-dependent pathway similar to that characterized in this chapter for M. voltae.

AglC is a UDP-Glc-2,3-diNAcA glycosyltransferase

In order to examine the downstream steps of glycan assembly in M. voltae, we focused our efforts on the characterization of AglC, a 37.3 kDa protein with 1-2 transmembrane domains predicted at the C-terminus (Figure 2-5B).29 AglC had previously been implicated in the assembly of the Dol-linked disaccharide in conjunction with AglK based on genetic studies.20 With the role of this enzyme in the pathway unclear in light of our characterization of AglK, we considered two possible functions for AglC. One possibility is that AglC behaves as a DPG homolog similar to AglK, utilizing UDP-Glc-2,3-diNAcA and Dol-P to generate Dol-P-Glc-2,3-diNAcA. Based on sequence analysis, AglC was also annotated as a GT2-family glycosyltransferase along with DPG and AglK. However, the sequence similarity of AglC to DPG was much lower (21%) compared with that of AglK to DPG (53%). Alternatively, AglC could act directly on UDP-Glc-2,3-diNAcA, transferring the highly modified carbohydrate to the Dol-P-GlcNAc product of AglK. To test these hypotheses, we first overexpressed AglC in E.
coli with an N-terminal T7 tag and a C-terminal His6 tag and purified it from the membrane fraction, using detergent to maintain its solubility (Figure 2-6). We also prepared UDP-[3H]Glc-2,3-diNAcA chemoenzymatically using the WbpB, WbpE, and WbpD enzymes involved in the Pseudomonas aeruginosa lipopolysaccharide biosynthetic pathway (Figure 2-18).42

Figure 2-18: Biosynthetic pathway of UDP-ManNAc(3NAc)A, a lipopolysaccharide subunit in P. aeruginosa PAO1.42 UDP-Glc-2,3-diNAcA is highlighted in blue.

We incubated AglC with UDP-[3H]Glc-2,3-diNAcA as the glycosyl donor and either Dol-P or Dol-P-GlcNAc as a glycosyl acceptor. AglC readily converted the aqueous-soluble UDP-[3H]Glc-2,3-diNAcA starting material to the organic-soluble Dol-P-disaccharide when Dol-P-GlcNAc was used as the glycosyl acceptor. Under the defined reaction conditions, we detected no turnover with radiolabeled UDP-Glc, UDP-GalNAc, or UDP-Gal, and observed only minimal turnover for UDP-GlcNAc (Figure 2-19A). This observed in vitro preference of AglC for UDP-Glc-2,3-diNAcA fully supports the assignment of this enzyme as the second glycosyltransferase in the pathway, as the N-glycan structure assembled in vivo in M. voltae shows that Glc-2,3-diNAcA is the second sugar.19 The enzyme was specific for the monophosphate-linked Dol-P-GlcNAc, as no conversion was observed with Dol-P, Dol-PP-GlcNAc, or Dol-PP-GalNAc (Figure 2-19B). These results indicate that our second scenario is correct: AglC directly transfers Glc-2,3-diNAcA from UDP-Glc-2,3-diNAcA onto the Dol-P-GlcNAc product of AglK.
Figure 2-19: AglC donor and acceptor substrate specificity. The glycosyltransferase AglC assembles the disaccharide Dol-P-GlcNAc-Glc-2,3-diNAcA from UDP-Glc-2,3-diNAcA and the AglK product Dol-P-GlcNAc. (A) Glycosyl donor specificity of AglC using Dol-P-GlcNAc and radiolabeled nucleotide sugars. The assay monitors the formation of radiolabeled Dol-P-disaccharide over time. (B) Polyprenyl acceptor specificity of AglC using UDP-[3H]Glc-2,3-diNAcA as the donor substrate.

HPLC purification (Figure 2-20) and MS analysis confirmed the identity of the resulting product as Dol-P-GlcNAc-Glc-2,3-diNAcA (Figure 2-21), indicating that AglC is indeed a Glc-2,3-diNAcA glycosyltransferase. This in vitro work substantiated the prediction from previous genetic work that AglC is involved in the formation of the second sugar, and further demonstrated that it acts alone as the GT responsible for the attachment of the second sugar.

Figure 2-20: Normal phase HPLC purification of Dol-P-Glc-2,3-diNAcA. (A) Radiolabeled Dol-P-GlcNAc-[3H]Glc-2,3-diNAcA was monitored by liquid scintillation counting of each fraction. (B) Unlabeled Dol-P-GlcNAc-Glc-2,3-diNAcA was monitored by TLC of each fraction.

Figure 2-21: ESI-MS (negative ion mode) of Dol-P-GlcNAc-Glc-2,3-diNAcA [M+ 1308.7].
The second step of the M. voltae pathway involves the glycosyltransferase AglC, which converts Dol-P-GlcNAc to Dol-P-GlcNAc-Glc-2,3-diNAcA. The study of this enzyme has been challenging due to the unique nature of the UDP-sugar substrate. Indeed, analysis of glycan structures across both the archaeal and bacterial domains reveals highly modified carbohydrates, which frequently make the unambiguous biochemical annotation of target glycosyltransferase enzymes challenging. In the case of M. voltae, while the identity and linkage of the second carbohydrate in the glycan had been known for some time through MS and NMR analysis of the glycoprotein products,19 the enzymes responsible for its biosynthesis had not been elucidated. Therefore, we took advantage of the P. aeruginosa pathway enzymes in order to obtain UDP-Glc-2,3-diNAcA, the essential glycosyl donor substrate for AglC. As more archaeal and bacterial genomes are sequenced and characterized, the accessibility of unusual carbohydrate building blocks will likely increase, enabling new discoveries and further insight into complex glycoconjugate assembly pathways.

Identification of Dol-P-glycans in other archaeal species

Finally, we turned to the hyperthermophilic archaeon Pyrococcus furiosus in hopes of identifying an additional archaeal organism exhibiting this polyprenyl-monophosphate-linked pathway. The chemical structure of the unique N-glycan attached to the asparagine residue of P. furiosus proteins is known,43 and the oligosaccharyl transfer reaction of glycan onto acceptor peptides has been demonstrated in vitro using membrane extracts as the source of crude lipid-linked oligosaccharides (LLO).44,45 Because the exact nature of the LLO is unknown, it is unclear if the glycans are transferred from Dol-P or Dol-PP carriers. Following a protocol similar to that used for total lipid extraction in H. volcanii,38 we extracted the P.
furiosus whole cell pellets and were able to identify Dol-P-heptasaccharide and shorter Dol-P-glycan intermediates by mass spectrometry (Figure 2-22). The heptasaccharide has the same mass as described for the N-glycan,46 and is linked to (C60-70) Dol-P. The predominant Dol-P-glycan species found is actually Dol-P charged with the linear pentasaccharide precursor of the branched heptasaccharide. Additionally, varying degrees of isoprene saturation are observed.

Figure 2-22: Reverse phase LC/MS analysis of P. furiosus whole cell extractions confirms the presence of Dol-P-heptasaccharide and other Dol-P-glycan intermediates. Penta-, hexa- and hepta-saccharide charged (C60-C70) Dol-P are indicated by the blue, green, and red arrows, respectively. (C65) Dol-P-heptasaccharide, [M-2H]2- = 1081.1.

At this point, even though we did not observe any Dol-PP-heptasaccharide or intermediates in the biosynthesis of such a product in our P. furiosus extractions, we cannot completely rule out the existence of Dol-PP-glycans, since our extraction and MS protocol may not have been adequate or sensitive enough to identify small quantities. However, there is currently no evidence that Dol-PP-glycans exist as substrates for the P. furiosus OTases. Instead, the Dol-P-glycans observed in our extractions are the most likely substrates for N-linked glycosylation in this archaeon. Along with the polyprenyl-monophosphate-linked pathways observed in halophilic and methanogenic archaea, this new evidence for the polyprenyl-monophosphate-linked pathway in thermophilic archaea suggests that this alternative, monophosphate-dependent pathway may be more widespread than previously thought.
Conclusions

Complex glycoproteins play central roles in biology, yet the biochemical details of the assembly and recognition of these structures are largely incomplete due to the technical challenges associated with preparing homogeneous substrates for unambiguous characterization. In this chapter, we have utilized purified substrates and enzymes along with detailed product characterization to define two key processes of N-linked glycosylation in M. voltae. The first step is the initial reaction between polyprenyl-phosphate and a UDP-activated glycosyl donor, which affords the first amphiphilic membrane-associated substrate (Figure 2-23A). The second step is the glycosyltransferase reaction, which yields the simplest competent substrate for the OTase (Figure 2-23B). The following chapter will define the culminating OTase reaction to form the N-linked glycan. The in vitro analysis of the first two key enzymes in the N-linked glycosylation pathway of M. voltae using purified components establishes the biochemical details of a second general strategy for N-linked glycosylation that utilizes polyprenyl-monophosphate instead of polyprenyl-diphosphate carriers (Figure 2-23C). These findings have enabled us to parse out key differences among N-linked protein glycosylation pathways across the three domains of life. While it is not yet known how common this strategy is among archaea, the comparative analysis of the structural and mechanistic aspects of the two strategies will certainly provide new information on this complex post-translational modification throughout evolution.
Figure 2-23: M. voltae N-linked glycosylation pathway enzymes. (A) AglK catalyzes transfer of GlcNAc to Dol-P. (B) AglC transfers Glc-2,3-diNAcA to Dol-P-GlcNAc to form a β-1,3 linkage. (C) The revised pathway shows the action of the two characterized GTs, AglK and AglC, at the cytoplasmic membrane, presumably followed by the action of a third GT, AglA. The resulting Dol-P-trisaccharide is flipped from the cytoplasm to the periplasm by a currently unidentified flippase. The glycan is then transferred en bloc to a recipient protein by the OTase, AglB. In the dolichol structure shown, n = 7-8 for short dolichols and n = 13-17 for long dolichols.

Acknowledgements

Dr. Angelyn Larkin began this project by pursuing the M. voltae N-linked pathway and made many attempts to utilize AglH, AglK, and AglC to produce Dol-PP-GlcNAc-2,3-diNAcA, which was initially thought to be the substrate for the OTase. Dr. Garrett Whitworth worked on the synthesis of Dol-P and NMR studies. I am also grateful to Dr. Jeff Simpson (DCIF, MIT) for assistance with NMR acquisition, Prof. W. Whitman (University of Georgia) for the gift of both M. voltae genomic DNA and M. maripaludis cells, Prof. E. Swiezewska (Polish Academy of Sciences) for providing a sample of racemic short dolichols for preliminary studies, Prof. M. Hartman (Virginia Commonwealth University) for providing UDP-(5F)-GlcNAc, Prof. M. Adams (University of Georgia) for P. furiosus DSM 3638 cells, and Dr. Ziqiang Guan (Duke University Medical Center) for LC/MS analysis of P. furiosus Dol-P-glycans.
Experimental Methods

Cloning of AglH, AglK and AglC

The aglK and aglC gene sequences were optimized for expression in E. coli and chemically synthesized by Genscript. The aglH gene was amplified by PCR from M. voltae PS genomic DNA using Pfu Turbo polymerase (Agilent). Both the commercial genes and PCR products were treated with the restriction endonucleases BamHI and XhoI and cloned into the same sites in the pET-24a vector (Novagen) to give encoded proteins with N-terminal T7 and C-terminal His6 tags.47

Protein expression and purification

Plasmids were transformed into E. coli BL21 (DE3) RIL cells (Agilent) using kanamycin and chloramphenicol for selection. LB media (1 L) supplemented with antibiotics was inoculated with a 5 mL starter culture and incubated at 37 °C with shaking until an optical density (600 nm) of 0.8 AU was obtained. The cultures were then cooled to 16 °C and protein expression was induced with 1 mM IPTG. After 16 hrs, the cells were harvested by centrifugation (4000 x g). For protein purification, cell pellets were resuspended in 5% of the original culture volume in buffer A (50 mM HEPES, pH 7.5/300 mM NaCl) plus 10 mM imidazole and a protease inhibitor cocktail (Calbiochem), then subjected to sonication. To prepare cell membrane fractions, the cell lysate was spun down to remove cellular debris (6000 x g, 30 min), followed by pelleting of the membranes (142,000 x g, 1 hr). The membrane pellet was then resuspended in 0.25% of the original culture volume in buffer B (50 mM HEPES, pH 7.5/150 mM NaCl) with 10 mM imidazole and stored at -80 °C. To purify proteins from crude membrane fractions, the membranes were solubilized for 1 hr in either 1% n-dodecyl-β-D-maltoside (DDM) for AglK and AglC or 1% Triton X-100 for AglB and centrifuged (142,000 x g), and the resulting supernatant was incubated with Ni-NTA resin for 3 hrs.
The resin was then washed (buffer B + 25 mM imidazole/0.05% DDM), the proteins eluted (buffer B + 250 mM imidazole/0.05% DDM), and the samples dialyzed (buffer B + 0.05% DDM). All purification steps were performed at 4 °C. AglK and AglC concentrations were quantified by UV absorbance. AglK expressed from pET-24a(+): MW = 30,619 g/mol, ε280nm = 15,480 M-1 cm-1. AglC expressed from pET-24a(+): MW = 39,529 g/mol, ε280nm = 41,630 M-1 cm-1.

Synthesis of (C55-60) (S)-dolichols

For the preparation of short chain (C55-60) (S)-dolichols, polyprenols were first extracted from the leaves of Rhus typhina (staghorn sumac),48 which affords a distribution of C50-65 isomers. The isolated mixture was further fractionated to obtain a 5:1 mixture of C55:C60 linear polyprenols. (S)-Dolichols were prepared by regioselective asymmetric hydrogenation according to the procedure of Sowa and coworkers.49,50 A sample of linear polyprenols (C55-60, 5:1; 0.08 g, 0.10 mmol) was dissolved in toluene and concentrated by rotary evaporation. [{(S)-tol-binap}RuCl2(p-cymene)] (0.01 g, 0.01 mmol) and KOH (0.001 g, 0.02 mmol) were added to the polyprenols and dried under vacuum. The flask was then backfilled with N2 and the vacuum was reestablished for 1 hr. In a separate flask, n-propanol (anhydrous, 10 mL) was added to activated 4 Å molecular sieves and sparged with N2 for 30 min. The sparged n-propanol was transferred to the flask containing polyprenols, [{(S)-tol-binap}RuCl2(p-cymene)] and KOH, which was immediately purged of atmosphere and backfilled with N2. This process was repeated three times, after which a dry stir bar was added and the reaction was allowed to proceed for 8 h under N2. NMR analysis was used to monitor the reaction. Upon completion, silica gel was added to the mixture, and flash column chromatography (toluene:EtOAc, 49:1) afforded the desired (S)-dolichols (5:1 C55-60 mixture) (0.072 g, 90% yield) as a colorless oil.
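Returning to the protein quantification by UV absorbance described in the purification procedure above, the conversion from an A280 reading to a concentration follows the Beer-Lambert law, c = A / (ε · l). The sketch below is illustrative only: it assumes a 1 cm path length, reads the extinction coefficients as 15,480 M-1 cm-1 for AglK and 41,630 M-1 cm-1 for AglC with molecular weights of 30,619 and 39,529 g/mol respectively, and uses a hypothetical absorbance reading that does not come from the thesis.

```python
# Values as read from the purification section (the AglK/AglC assignment is an assumption).
EXTINCTION_M1_CM1 = {"AglK": 15480.0, "AglC": 41630.0}
MOLECULAR_WEIGHT = {"AglK": 30619.0, "AglC": 39529.0}  # g/mol


def molar_concentration(a280: float, protein: str, path_cm: float = 1.0) -> float:
    """Beer-Lambert law: concentration in mol/L from absorbance at 280 nm."""
    if path_cm <= 0:
        raise ValueError("path length must be positive")
    return a280 / (EXTINCTION_M1_CM1[protein] * path_cm)


def mg_per_ml(a280: float, protein: str, path_cm: float = 1.0) -> float:
    """Mass concentration; mol/L times g/mol gives g/L, which equals mg/mL."""
    return molar_concentration(a280, protein, path_cm) * MOLECULAR_WEIGHT[protein]


if __name__ == "__main__":
    reading = 0.3096  # hypothetical A280 value, for illustration only
    molar = molar_concentration(reading, "AglK")
    print(f"AglK: {molar * 1e6:.1f} uM, {mg_per_ml(reading, 'AglK'):.3f} mg/mL")
```

Detergent and imidazole can contribute to absorbance at 280 nm, so a buffer-matched blank is assumed here.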
1H NMR (600 MHz, CDCl3) δ 5.17-5.07 (10 H, m), 3.75-3.62 (2 H, m), 2.12-1.94 (38 H, m), 1.68 (20 H, s), 1.60 (18 H, s), 1.40-1.10 (10 H, m), 0.92-0.82 (3 H, m).

Synthesis of (S)-dolichyl-phosphate

Dolichyl-phosphate was prepared essentially as described by Branch et al.51 A sample of short (S)-dolichols (C55-60, 0.02 g, 0.003 mmol) was dissolved in THF (anhydrous, 1 mL) under argon, then tetrazole (0.006 g, 0.008 mmol) and bis-(2-cyanoethyl)-N,N-diisopropylphosphoramidite (0.03 g, 0.009 mmol) were added and the solution was stirred for 2 hr. The solution was then cooled to -30 °C and 30% hydrogen peroxide (0.2 mL) was added. After stirring for 10 min, the mixture was warmed to room temperature. The reaction was then rapidly diluted with Na2SO3 (10%, 5 mL) and extracted with EtOAc (3 x 10 mL). The combined organic extract was washed with NaHCO3 (5%, 2 mL) and brine (5 mL), dried over MgSO4, then concentrated by rotary evaporation. The resulting oil was dissolved in MeOH (anhydrous, 5 mL) and a stoichiometric equivalent of NaOMe was introduced and allowed to stir in the mixture for 2 d. Flash column chromatography on silica with a gradient of EtOAc to EtOAc:MeOH (5:2) afforded the desired Dol-P (0.015 g, 70% yield). 1H and 31P NMR data match previously reported values.52 Dol-P quantification was carried out using a standard phosphate quantification protocol.53

A similar procedure was utilized to produce (C85-105) Dol-P. In this case the C85-105 (S)-dolichols were a kind gift from Kuraray Co. Ltd. Japan and had been prepared according to the procedure of Suzuki and co-workers by C5 homologation of C80-100 polyprenols extracted from the leaves of Ginkgo biloba.49

Synthesis of (C55-60) Dolichyl-PP-GlcNAc and Dolichyl-PP-GalNAc

The Dol-PP-GlcNAc and Dol-PP-GalNAc used in the glycosyltransferase specificity assays were synthesized by Dr. Angelyn Larkin using a procedure adapted from previously published protocols.
47,52,54

Synthesis of UDP-[3H]Glc-2,3-diNAcA

UDP-[3H]Glc-2,3-diNAcA was prepared through chemoenzymatic synthesis starting from UDP-GlcNAcA and the enzymes WbpB, WbpE, and WbpD from P. aeruginosa PAO1 as previously described with slight modification.42 After purification, the WbpB/WbpE product UDP-GlcNAc(3NH2)A (1 mM) was incubated with [3H]-AcCoA (20 Ci/mmol), 50 mM HEPES, pH 7, and WbpD (0.5 µg) in a reaction volume of 1 mL at 30 °C for 5 minutes, followed by a chase with unlabeled AcCoA (0.75 mM) for 4 hrs. The radiolabeled product was purified on a Phenomenex Synergi C18 RP-HPLC column as described.42

AglK and AglC glycosyltransferase assays

For AglK assays, dried (S)-Dol-P (5 nmol) was resuspended in DMSO (3 µL) and vortexed, followed by the addition of 0.715% DDM (7 µL), 0.5 M HEPES, pH 7.5 (10 µL), 1 M MgCl2 (1 µL), 100 mM DTT (2 µL), and AglK (0.5 µM), along with H2O for a final volume of 100 µL. The reaction was initiated by the addition of UDP-[3H]GlcNAc (5 pmol, 20 Ci/mmol) and incubated at 25 °C. Aliquots (15 µL) were removed at various time points and quenched by dilution into 2:1 CHCl3:MeOH (1.2 mL), followed by the addition of pure solvent upper phase (PSUP, 300 µL, composed of 15 mL CHCl3, 240 mL MeOH, 1.83 g KCl in 235 mL H2O). The organic layer was extracted with PSUP (3 x 300 µL) and analyzed by scintillation counting. 5 mL of EcoLite (MP Biomedicals) or 5 mL of Opti-Fluor (PerkinElmer) liquid scintillation cocktail was added to each aqueous or organic sample before analysis on a Beckman Coulter LS6500 scintillation counting system. All other nucleotide diphosphate sugars were tested as substrates under identical conditions. AglK activity was also tested with UDP-GlcNAc over a range of concentrations, in which UDP-[3H]GlcNAc (5 pmol, 20 Ci/mmol) was combined with unlabeled UDP-GlcNAc.
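The assay recipe above can be translated into final concentrations and an expected amount of radioactivity with a few lines of arithmetic. The sketch below is a hedged illustration, not part of the original protocol: it assumes simple dilution into the stated final volume, uses the standard conversion 1 Ci = 2.22 x 10^12 dpm, and assumes 100% counting efficiency, which is higher than a real tritium measurement would give. Function names are hypothetical.

```python
DPM_PER_CI = 2.22e12  # 1 Ci = 3.7e10 decays/s = 2.22e12 decays/min


def final_concentration(stock: float, volume_ul: float, total_ul: float) -> float:
    """Concentration after diluting `volume_ul` of stock into `total_ul` (same units as stock)."""
    return stock * volume_ul / total_ul


def total_dpm(amount_mol: float, ci_per_mmol: float) -> float:
    """Disintegrations per minute carried by `amount_mol` of a labeled substrate."""
    return amount_mol * 1e3 * ci_per_mmol * DPM_PER_CI


def mol_from_dpm(dpm: float, ci_per_mmol: float) -> float:
    """Moles of labeled product corresponding to a measured dpm value."""
    return dpm / (ci_per_mmol * DPM_PER_CI) / 1e3


if __name__ == "__main__":
    total = 100.0  # final reaction volume in microliters
    print("HEPES (M):  ", final_concentration(0.5, 10, total))
    print("MgCl2 (M):  ", final_concentration(1.0, 1, total))
    print("DTT (mM):   ", final_concentration(100.0, 2, total))
    print("DDM (%):    ", final_concentration(0.715, 7, total))
    print("Dol-P (uM): ", 5e-9 / (total * 1e-6) * 1e6)
    # 5 pmol of UDP-[3H]GlcNAc at 20 Ci/mmol
    print("input dpm:  ", total_dpm(5e-12, 20.0))
```

Dividing the dpm recovered in the organic phase by the input dpm gives the fractional conversion plotted in the time-course assays, again under the assumed counting efficiency.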
Preparative AglK reactions were carried out with 200 nmol (S)-Dol-P using unlabeled UDP-GlcNAc (250 nmol) in a total volume of 1 mL (x 10). For the AglK inhibition assays, UDP-(5F)-GlcNAc was added to the reaction with no enzyme pre-incubation.

For AglC assays, dried Dol-P-GlcNAc (5 nmol) was resuspended in DMSO (3 µL) and vortexed, followed by the addition of 0.715% DDM (7 µL) and the reaction components as described for the AglK reaction. AglC (0.5 µM) was introduced, and the reaction was initiated with UDP-[3H]Glc-2,3-diNAcA (1.62 nmol, 17.3 mCi/mmol) and incubated at 25 °C. All other UDP-sugars were tested at 5 pmol and 20 Ci/mmol, and dolichyl substrates were tested at 5 nmol. Aliquots were removed and processed as described above. For preparative AglC reactions, 50 nmol Dol-P-GlcNAc was utilized along with unlabeled UDP-Glc-2,3-diNAcA (50 nmol) in a total volume of 500 µL (x 10). AglK and AglC reaction conditions were optimized with a detergent (DDM, Triton X-100, LDAO, CHAPSO), pH (6, 7, 8, 9), and temperature (4, 16, 30, 37 °C) screen.

AglK reaction product identification

The AglK reaction was carried out with 200 nmol of (S)-Dol-P. The entire reaction was quenched into 2:1 chloroform:methanol (12 mL) and extracted with PSUP (3 x 3 mL). The aqueous layer was removed, concentrated, and filtered through a 0.22 µm membrane. Capillary electrophoresis (CE) of the filtered solution was performed on a Hewlett-Packard 3D CE system. A bare silica capillary (75 µm x 80 cm) was used with a detector distance of 72 cm. The running buffer was 25 mM sodium tetraborate (pH 9.5). The capillary was conditioned before each run with a 0.4 M NaOH wash for 2 min, H2O for 2 min, and running buffer for 2 min. Samples were introduced by pressure injection for 16 s at 30 mbar, and the separation was performed at 25 kV and monitored at 254 nm.

AglK DXD mutagenesis

AglK D105A and D107A were expressed as pET-24a constructs in BL21 (DE3) RIL cells, solubilized in 1% DDM, and purified in 0.05% DDM.
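QuikChange mutagenesis uses a pair of fully complementary primers, so each reverse primer should be the exact reverse complement of its forward partner. A minimal sketch of that check for the two AglK primer pairs used in this section follows; the sequences are copied from this section, while the dictionary layout and function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Forward/reverse primer pairs for the AglK DXD-motif mutants, 5' to 3'.
PRIMER_PAIRS = {
    "QC01": (
        "CATAGCAGTTACATTTGCGGCAGATGGTCAACACG",
        "CGTGTTGACCATCTGCCGCAAATGTAACTGCTATG",
    ),
    "QC02": (
        "AGTTACATTTGATGCAGCGGGTCAACACGCACCCG",
        "CGGGTGCGTGTTGACCCGCTGCATCAAATGTAACT",
    ),
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence contains characters other than A, C, G, T")
    return seq.translate(COMPLEMENT)[::-1]


def is_complementary_pair(forward: str, reverse: str) -> bool:
    """True when the reverse primer is the reverse complement of the forward primer."""
    return reverse_complement(forward) == reverse.upper()


if __name__ == "__main__":
    for name, (fwd, rev) in PRIMER_PAIRS.items():
        print(name, len(fwd), "nt, complementary:", is_complementary_pair(fwd, rev))
```

Both pairs pass this check. The two forward primers differ only in which GAT codon is replaced by GCG, which is the pattern expected for two separate aspartate-to-alanine substitutions.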
Primers used for QuikChange are as follows:
AglKfwdQC01: 5'-CATAGCAGTTACATTTGCGGCAGATGGTCAACACG-3'
AglKrevQC01: 5'-CGTGTTGACCATCTGCCGCAAATGTAACTGCTATG-3'
AglKfwdQC02: 5'-AGTTACATTTGATGCAGCGGGTCAACACGCACCCG-3'
AglKrevQC02: 5'-CGGGTGCGTGTTGACCCGCTGCATCAAATGTAACT-3'

Purification of Dol-P-glycans from AglK and AglC reactions

Dol-P mono- and disaccharides were purified using a normal phase Varian Microsorb HPLC column and separating over 21-28% buffer E, in which buffer D is 4:1 CHCl3/MeOH and buffer E is 10:10:3 CHCl3/MeOH/2 M NH4OAc.56 Elution of the Dol-P-saccharides was monitored by scintillation counting (with 5 mL of OptiFluor added) and/or analytical thin-layer chromatography on silica gel 60 F254 plates (EMD Chemicals) using a solvent system of 65:25:4:0.5 CHCl3/MeOH/H2O/NH4OH (conc) and stained with ceric ammonium molybdate (CAM).

NMR and MS characterization of Dol-P-glycans

NMR spectra were acquired using either a Bruker 600 or Varian Inova 500 MHz spectrometer equipped with a 5 mm inverse broadband gradient probe. Samples were diluted in CDCl3 (Dol-P) or 2:1 CDCl3:CD3OD (Dol-P-GlcNAc) and analyzed. The anomeric configuration of the Dol-P-GlcNAc glycosidic linkage was determined using a 31P-decoupled 1H pulse sequence. ESI-TOF mass spectra were obtained by the Mass Spectrometry Laboratory at the University of Illinois, Urbana-Champaign.

Methanococcus maripaludis LLO preparation

The total lipid extract was prepared following a protocol similar to that used in the isolation of the H. volcanii lipid fraction.38 The cell pellet (10 g) was thawed, resuspended in water (1.33 ml dH2O/g cells) and DNase (1.7 µg/ml), and stirred overnight at room temperature in a 250 ml round bottom flask. Methanol and chloroform were added to the cell extract in a ratio of 2:1:0.8 MeOH/CHCl3/cell extract. After stirring for 24 h, the mixture was centrifuged (30 min, 1000 x g, 4 °C) and the supernatant was collected and filtered through glass wool.
Chloroform and water were added to the filtrate to yield a CHCl3/dH2O/filtrate ratio of 1:1:3.8 in a separatory funnel. After separation, the lower clear organic phase, containing the total lipid extract, was collected into a round-bottom flask and evaporated in a rotary evaporator at 35 °C. For analysis of the dolichol phosphate pool, the total lipid extracts were subjected to normal phase HPLC using a YMC-Pack PVA-SIL-NP column (250 x 4.6 mm I.D., S-5 µm, 12 nm) and separating with a gradient of 0-100% buffer E over 60 minutes, in which buffer D is 4:1 CHCl3/MeOH and buffer E is 10:10:3 CHCl3/MeOH/2 M NH4OAc, flowing at 1 mL/min with 2 mL fractions collected. Each fraction was analyzed by analytical TLC using silica gel 60 F254 plates with a solvent system of 65:25:4:0.5 CHCl3/MeOH/H2O/NH4OH and stained with CAM. The desired fractions were subjected to MS analysis on a Finnigan LCQ Deca mass spectrometer coupled to an Agilent 1100 series HPLC.

Pyrococcus furiosus LLO preparation

P. furiosus DSM 3638 cells were a gift from Prof. Michael Adams (University of Georgia). The cell pellet (11 g) was added to a 250 ml round bottom flask and extracted with 2:1:0.8 methanol/chloroform/pellet (50 ml total volume) for 24 h, stirring at room temperature. The mixture was vacuum filtered through a Büchner funnel and three filter papers (Whatman Grade 1) and the filtrate was dried down using a rotary evaporator at 30 °C to afford extract 1 (52.3 mg). The remaining pellet was further extracted with 2:1 methanol/chloroform (30 ml) for an additional 24 h, stirring at room temperature. The mixture was filtered as before, and the filtrate was dried down to afford extract 2 (58.7 mg). The remaining pellet was then extracted a third time with 10:10:3 chloroform/methanol/water (69 ml total volume) for 48 h, stirring at room temperature. The mixture was transferred to 50 ml Falcon tubes and centrifuged to pellet the debris (30 min, 1000 x g).
The supernatant was transferred to another round bottom flask and dried down to afford extract 3 (119 mg). Dr. Ziqiang Guan (Duke University Medical Center) analyzed extracts 1, 2, and 3 (2 mg each) by LC/MS for the presence of lipid-linked oligosaccharides. Extract 1 contained Dol-P-HexNAc. Extract 2 contained Dol-P-HexNAc, Dol-P-pentasaccharide, Dol-P-hexasaccharide, and Dol-P-heptasaccharide. Extract 3 revealed no Dol-P-oligosaccharide species.

References

1. Varki, A. et al. Essentials of Glycobiology. (Cold Spring Harbor Laboratory Press, 2009). at <http://www.ncbi.nlm.nih.gov/books/NBK1908/>
2. Lairson, L. L., Henrissat, B., Davies, G. J. & Withers, S. G. Glycosyltransferases: Structures, Functions, and Mechanisms. Annu. Rev. Biochem. 77, 521-555 (2008).
3. Lombard, V., Ramulu, H. G., Drula, E., Coutinho, P. M. & Henrissat, B. The carbohydrate-active enzymes database (CAZy) in 2013. Nucleic Acids Res. 42, D490-D495 (2014).
4. Vrielink, A., Ruger, W., Driessen, H. P. & Freemont, P. S. Crystal structure of the DNA modifying enzyme beta-glucosyltransferase in the presence and absence of the substrate uridine diphosphoglucose. EMBO J. 13, 3413-3422 (1994).
5. Breton, C., Fournel-Gigleux, S. & Palcic, M. M. Recent structures, evolution and mechanisms of glycosyltransferases. Curr. Opin. Struct. Biol. 22, 540-549 (2012).
6. Maita, N., Nyirenda, J., Igura, M., Kamishikiryo, J. & Kohda, D. Comparative Structural Biology of Eubacterial and Archaeal Oligosaccharyltransferases. J. Biol. Chem. 285, 4941-4950 (2010).
7. Lizak, C., Gerber, S., Numao, S., Aebi, M. & Locher, K. P. X-ray structure of a bacterial oligosaccharyltransferase. Nature 474, 350-355 (2011).
8. Chiu, C. P. C. et al. Structural analysis of the sialyltransferase CstII from Campylobacter jejuni in complex with a substrate analog. Nat. Struct. Mol. Biol. 11, 163-170 (2004).
9. Lovering, A. L., Castro, L. H. de, Lim, D. & Strynadka, N. C. J.
Structural Insight into the Transglycosylation Step of Bacterial Cell-Wall Biosynthesis. Science 315, 1402-1405 (2007).
10. Zhang, H. et al. The highly conserved domain of unknown function 1792 has a distinct glycosyltransferase fold. Nat. Commun. 5, (2014).
11. Albesa-Jove, D., Giganti, D., Jackson, M., Alzari, P. M. & Guerin, M. E. Structure-function relationships of membrane-associated GT-B glycosyltransferases. Glycobiology 24, 108-124 (2014).
12. Rojas-Cervellera, V., Ardevol, A., Boero, M., Planas, A. & Rovira, C. Formation of a Covalent Glycosyl-Enzyme Species in a Retaining Glycosyltransferase. Chem. - Eur. J. 19, 14018-14023 (2013).
13. Lee, S. S. et al. Mechanistic evidence for a front-side, SNi-type reaction in a retaining glycosyltransferase. Nat. Chem. Biol. 7, 631-638 (2011).
14. Weerapana, E. & Imperiali, B. Asparagine-linked protein glycosylation: from eukaryotic to prokaryotic systems. Glycobiology 16, 91R-101 (2006).
15. Kaminski, L., Lurie-Weinberger, M. N., Allers, T., Gophna, U. & Eichler, J. Phylogenetic- and genome-derived insight into the evolution of N-glycosylation in Archaea. Mol. Phylogenet. Evol. 68, 327-339 (2013).
16. Eichler, J. Extreme sweetness: protein glycosylation in archaea. Nat. Rev. Microbiol. 11, 151-156 (2013).
17. VanDyke, D. J. et al. Identification of genes involved in the assembly and attachment of a novel flagellin N-linked tetrasaccharide important for motility in the archaeon Methanococcus maripaludis. Mol. Microbiol. 72, 633-644 (2009).
18. Abu-Qarn, M. & Eichler, J. Protein N-glycosylation in Archaea: defining Haloferax volcanii genes involved in S-layer glycoprotein glycosylation. Mol. Microbiol. 61, 511-525 (2006).
19. Voisin, S. et al. Identification and Characterization of the Unique N-Linked Glycan Common to the Flagellins and S-layer Glycoprotein of Methanococcus voltae. J. Biol. Chem. 280, 16586-16593 (2005).
20. Chaban, B., Logan, S. M., Kelly, J. F. & Jarrell, K. F.
AgIC and AgiK Are Involved in Biosynthesis and Attachment of Diacetylated Glucuronic Acid to the N-Glycan in Methanococcus voltae. J. Bacteriol. 191, 187-195 (2009). 21. Chaban, B., Voisin, S., Kelly, J., Logan, S. M. & Jarrell, K. F. Identification of genes involved in the biosynthesis and attachment of Methanococcus voltae N-linked glycans: insight into N-linked glycosylation pathways in Archaea. Mol Microbiol 61, 259-68 (2006). 22. Shams-Eldin, H., Chaban, B., Niehus, S., Schwarz, R. T. & Jarrell, K. F. Identification of the Archaeal alg7 Gene Homolog (Encoding N-Acetylglucosamine- 1-Phosphate Transferase) of the N-Linked Glycosylation System by Cross-Domain Complementation in Saccharomyces cerevisiae. J. Bacteriol. 190, 2217-2220 (2008). 23. Burda, P. & Aebi, M. The dolichol pathway of N-linked glycosylation. Biochim. Biophys. Acta BBA - Gen. Subj. 1426, 239-257 (1999). 24. Jones, M. B., Rosenberg, J. N., Betenbaugh, M. J. & Krag, S. S. Structure and synthesis of polyisoprenoids used in N-glycosylation across the three domains of life. Biochim. Biophys. Acta BBA - Gen. Subj. 1790, 485-494 (2009). 25. Kuntz, C., Sonnenbichler, J., Sonnenbichler, I., Sumper, M. & Zeitler, R. Isolation and characterization of dolichol-linked oligosaccharides from Haloferax volcanii. Glycobiology 7, 897-904 (1997). 26. Guan, Z., Naparstek, S., Kaminski, L., Konrad, Z. & Eichler, J. Distinct glycan-charged phosphodolichol carriers are required for the assembly of the pentasaccharide N-linked to the Haloferax volcanii S-layer glycoprotein. Mol. Microbiol. 78, 1294-1303 (2010). 27. Naparstek, S., Guan, Z. & Eichler, J. A predicted geranylgeranyl reductase reduces the wposition isoprene of dolichol phosphate in the halophilic archaeon, Haloferax volcanii. Biochim. Biophys. Acta BBA - Mol. Cell Biol. Lipids 1821, 923-933 (2012). 28. Ferrante, G., Ekiel, I. & Sprott, G. D. 
Structural characterization of the lipids of Methanococcus voltae, including a novel N-acetylglucosamine 1-phosphate diether. J. Biol. Chem. 261, 17062-17066 (1986). 29. Krogh, A., Larsson, B., von Heijne, G. & Sonnhammer, E. L. L. Predicting transmembrane protein topology with a hidden markov model: application to complete genomes. J. Mol. Biol. 305, 567-580 (2001). 30. Hartley, M. D. & Imperiali, B. At the membrane frontier: A prospectus on the remarkable evolutionary conservation of polyprenols and polyprenyl-phosphates. Arch. Biochem. Biophys. 517, 83-97 (2012). 31. Heesen, S. te, Lehle, L., Weissmann, A. & Aebi, M. Isolation of the ALG5 Locus Encoding the UDP-Glucose:Dolichyl-Phosphate Glucosyltransferase from Saccharomyces cerevisiae. Eur. J. Biochem. 224, 71-79 (1994). 32. Behrens, N. H. & Leloir, L. F. Dolichol Monophosphate Glucose: An Intermediate in Glucose Transfer in Liver. Proc. Natl. Acad Sci. U. S. A. 66, 153-159 (1970). 73 33. Tkacz, J. S., Herscovics, A., Warren, C. D. & Jeanloz, R. W. Mannosyltransferase Activity in Calf Pancreas Microsomes Formation From Guanosine Diphosphate-d-[14C]Mannose of a 14C-Labeled Mannolipid with Properties of Dolichyl Mannopyranosyl Phosphate. J. Biol. Chem. 249, 6372-6381 (1974). 34. Herscovics, A., Warren, C. D. & Jeanloz, R. W. Anomeric configuration of the dolichyl Dmannosyl phosphate formed in calf pancreas microsomes. J. Biol. Chem. 250, 8079-8084 (1975). 35. O'Connor, J. V., Nunez, H. A. & Barker, R. Alpha- and -beta-Glycopyranosyl phosphates and 1,2-phosphates. Assignments of conformations in solution by carbon-13 and proton NMR. Biochemistry 18, 500-507 (1979). 36. Frantom, P. A., Coward, J. K. & Blanchard, J. S. UDP-(5F)-GlcNAc Acts as a SlowBinding Inhibitor of MshA, a Retaining Glycosyltransferase. J. Am. Chem. Soc. 132, 66266627 (2010). 37. Yurist-Doutsch, S., VanDyke, D. J., Jarrell, K. F. & Eichler, J. Sweet to the extreme: protein glycosylation in Archaea. Mol. Microbiol. 68, 1079-1084 (2008). 38. 
Kaminski, L. et al. AgliJ Adds the First Sugar of the N-Linked Pentasaccharide Decorating the Haloferax volcanii S-Layer Glycoprotein. J. Bacteriol. 192, 5572-5579 (2010). 39. Jarrell, K. F. et al. N-Linked Glycosylation in Archaea: a Structural, Functional, and Genetic Analysis. Microbiol. Mol. Biol. Rev. 78, 304-341 (2014). 40. Guan, Z., Meyer, B. H., Albers, S.-V. & Eichler, J. The thermoacidophilic archaeon Sulfolobus acidocaldarius contains an unsually short, highly reduced dolichyl phosphate. Biochim. Biophys. Acta BBA - Mol. Cell Biol. Lipids 1811, 607-616 (2011). 41. Kelly, J., Logan, S. M., Jarrell, K. F., VanDyke, D. J. & Vinogradov, E. A novel N-linked flagellar glycan from Methanococcus maripaludis. Carbohydr. Res. 344, 648-653 (2009). 42. Larkin, A. & Imperiali, B. Biosynthesis of UDP-GlcNAc(3NAc)A by WbpB, WbpE, and WbpD: Enzymes in the Wbp Pathway Responsible for O-Antigen Assembly in Pseudomonas aeruginosa PAOl. Biochemistry 48, 5446-5455 (2009). 43. Fujinami, D., Matsumoto, M., Noguchi, T., Sonomoto, K. & Kohda, D. Structural elucidation of an asparagine-linked oligosaccharide from the hyperthermophilic archaeon, Pyrococcus furiosus. Carbohydr. Res. 387, 30-36 (2014). 44. Kohda, D., Yamada, M., Igura, M., Kamishikiryo, J. & Maenaka, K. New oligosaccharyltransferase assay method. Glycobiology 17, 1175-1182 (2007). 45. Igura, M. & Kohda, D. Quantitative assessment of the preferences for the amino acid residues flanking archaeal N-linked glycosylation sites. Glycobiology 21, 575-583 (2011). 46. Igura, M. et al. Structure-guided identification of a new catalytic motif of oligosaccharyltransferase. EMBO J. 27, 234-243 (2008). 47. Larkin, A. A. K. Investigation of asparagine-linked glycosylation in archaeal and bacterial at 2010). Technology, of Institute (Massachusetts systems. <http://dspace.mit.edu/handle/1721.1/62725> 48. Swiezewska, E. et al. The search for plant polyprenols. Acta Biochim. Pol. 41, 221-260 (1994). 49. Suzuki, S. et al. 
Synthesis of mammalian dolichols from plant polyprenols. Tetrahedron Lett. 24, 5103-5106 (1983). 50. Wu, R., Beauchamps, M. G., Laquidara, J. M. & Sowa, J. R. Ruthenium-Catalyzed Asymmetric Transfer Hydrogenation of Allylic Alcohols by an Enantioselective 74 51. 52. 53. 54. 55. 56. Isomerization/Transfer Hydrogenation Mechanism. Angew. Chem. Int. Ed. 51, 2106-2110 (2012). Branch, C. L., Burton, G. & Moss, S. F. An Expedient Synthesis of Allylic Polyprenyl Phosphates. Synth. Commun. 29, 2639-2644 (1999). Tai, V. W.-F. & Imperiali, B. Substrate Specificity of the Glycosyl Donor for Oligosaccharyl Transferase. J. Org. Chem. 66, 6217-6228 (2001). Chen, P. S., Toribara, T. Y. & Warner, H. Microdetermination of Phosphorus. Anal. Chem. 28, 1756-1758 (1956). Sim, M. M., Kondo, H. & Wong, C. H. Synthesis of dibenzyl glycosyl phosphites using dibenzyl N, N-diethylphosphoramidite as phosphitylating reagent: an effective route to glycosyl phosphates, nucleotides, and glycosides. J. Am. Chem. Soc. 115, 2260-2267 (1993). Folch, J., Lees, M. & Stanley, G. H. S. A Simple Method for the Isolation and Purification of Total Lipides from Animal Tissues. J. Biol. Chem. 226, 497-509 (1957). Troutman, J. M. & Imperiali, B. Campylobacter jejuni PgIH Is a Single Active Site Processive Polymerase that Utilizes Product Inhibition to Limit Sequential Glycosyl Transfer Reactions. Biochemistry 48, 2807-2816 (2009). 75 Chapter 3: Characterization of an archaeal oligosaccharyl transferase A portion of the work described in this chapter has been published in the following: Larkin, A., Chang, M.M., Whitworth, G.E., Imperiali, B. Biochemical evidence for an alternate pathway in N-linked glycoprotein biosynthesis. Nat. Chem. Biol. 2013, 9, 367-373. 76 Introduction Asparagine-linked glycosylation is an abundant and complex post-translational protein modification found in all three domains of life.' 
It plays an important role in many cellular events, including protein folding, stability, and intracellular trafficking in eukaryotes, host pathogenicity in bacteria, and flagellum assembly in archaea.2-4 Biosynthesis of asparagine-linked glycoproteins begins with the stepwise assembly of an oligosaccharide onto a membrane-bound polyprenyl-linked carrier, followed by the flipping of the glycan across the membrane, and subsequent transfer to protein (Figure 3-1). The oligosaccharyl transferase (OTase) is the key enzyme of this pathway and catalyzes the en bloc transfer of the oligosaccharide onto asparagine residues of nascent or fully folded proteins found within an N-X-S/T sequon, where X can be any amino acid except proline.1

Figure 3-1: N-linked glycosylation pathway in Saccharomyces cerevisiae. Legend: N-acetylglucosamine (GlcNAc), mannose (Man), glucose (Glc), dolichol, and dolichyl-diphosphate; n = 13-17 (length > 100 Å).

In higher eukaryotes, the oligosaccharyl transferase (OT) is a hetero-oligomeric protein complex localized in the endoplasmic reticulum (ER) membrane. In the well-studied model organism S. cerevisiae, the OT complex is composed of at least eight membrane-bound subunits (Ost1, Ost2, Ost3/6, Ost4, Ost5, Stt3, Swp1, and Wbp1), five of which (Ost1, Ost2, Stt3, Swp1, and Wbp1) are essential for cell viability (Figure 3-2).6 Except for the catalytic subunit, Stt3, the precise functions of each OT subunit are not fully understood. The subunits are thought to work together to aid substrate binding and protein folding and thereby influence glycosylation rate and efficiency, likely by slowing down the early stages of protein folding to maintain the polypeptide chain in a glycosylation-competent conformation. Some of the subunits may mediate complex formation among different OT subunits and with components of the translocon complex.
Ost1 is thought to recognize the acceptor peptide substrate and act as a chaperone to promote glycosylation of selected proteins.7 Wbp1 contains the putative binding site of the dolichyl-linked oligosaccharide.8,9 Ost3 and Ost6 are homologous proteins that exhibit oxidoreductase activity and bind specific polypeptides non-covalently and via transient disulfide bonds.10

Figure 3-2: Subunits comprising the S. cerevisiae OT complex. Predicted topologies of the individual subunits are shown, where N and C indicate the N- and C-termini of the individual proteins.

The OT catalytic subunit is called Stt3 (staurosporine and temperature sensitive 3) in eukaryotes, while the single-subunit prokaryotic homologs are called PglB (protein glycosylation B) in bacteria and AglB (archaeal glycosylation B) in archaea. Stt3 is the most highly conserved of all the subunits of the eukaryotic OT complex and is the only subunit that has homologs in bacteria and archaea. Interestingly, OTases across the three domains of life share little primary sequence identity, with the exception of the region immediately surrounding the highly conserved WWDXGX motif.11-14 However, all share a common architecture with an N-terminal region of 400 to 600 residues containing 11-13 transmembrane (TM) domains and a C-terminal globular domain of 150-500 amino acids.15 The TM helices are connected by short cytoplasmic or external loops (ELs), except for the extended external loops EL1 and EL5.14,16 The WWDXGX motif lies at the beginning of the C-terminal domain, which is located in the periplasm for PglB, in the extracellular space for AglB, or in the ER lumen for Stt3.
The bacterial N-linked glycosylation process is not common, and pglB genes have been found only in a few species of delta- and epsilon-proteobacteria.2,17 The first bacterial general system of N-linked glycosylation was reported in 1999 in the human pathogen Campylobacter jejuni.18 This pathway is similar to the N-linked glycosylation pathway in S. cerevisiae, in that an oligosaccharide is assembled onto a polyprenyl-linked carrier and then flipped across the membrane before being transferred to acceptor proteins by the OTase (Figure 3-3). One important difference is that the C. jejuni PglB is composed of just a single polypeptide instead of a multimeric complex and is sufficient to catalyze the transfer of the heptasaccharide onto fully folded acceptor proteins.19,20 Further work showed that PglB requires an extended glycosylation sequence, D/E-X-N-X-S/T, and demonstrates a relaxed oligosaccharide specificity in comparison to the yeast OT.

Figure 3-3: N-linked glycosylation pathway in Campylobacter jejuni. Legend: N-acetylglucosamine (GlcNAc), N,N'-diacetylbacillosamine (Bac), N-acetylgalactosamine (GalNAc), glucose (Glc), undecaprenol, and undecaprenyl-diphosphate; n = 7 (length < 50 Å).

Archaeal N-linked glycosylation was discovered prior to the bacterial counterparts. The first archaeal glycoproteins were observed in 1976 in the extreme halophile Halobacterium salinarum. A recent analysis of 168 sequenced archaeal genomes indicates that all but two contain an aglB gene, which suggests that, unlike in bacteria, N-linked glycosylation in archaea is widespread.21 However, in contrast to the analogous pathways in eukaryotes and bacteria, many of the details of N-linked glycan assembly and transfer to protein in archaea are still unknown.
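The two acceptor consensus sequences described above (N-X-S/T for the eukaryotic OT, and the extended D/E-X-N-X-S/T for PglB) can be illustrated with a short sequence scan. This is only a sketch: the function name and the dictionary of patterns are hypothetical, the proline exclusion is applied to the position between Asn and Ser/Thr as stated for N-X-S/T, and leaving the X between D/E and Asn unrestricted is an assumption rather than something stated in the text. The non-standard residue pNF is replaced by F purely as a placeholder.

```python
import re

# Zero-width lookaheads so that overlapping sequons are all reported.
# Each entry: (compiled pattern, offset of the acceptor Asn within the match).
PATTERNS = {
    # N-X-S/T, where X is any residue except proline.
    "N-X-S/T": (re.compile(r"(?=N[^P][ST])"), 0),
    # Extended bacterial sequon; the first X is left unrestricted (assumption).
    "D/E-X-N-X-S/T": (re.compile(r"(?=[DE].N[^P][ST])"), 2),
}


def acceptor_sites(sequence, motif):
    """Return 1-based positions of the acceptor Asn for every match of `motif`."""
    regex, asn_offset = PATTERNS[motif]
    return [m.start() + asn_offset + 1 for m in regex.finditer(sequence.upper())]


# Acceptor peptide and Asn-to-Gln control used in this chapter (pNF written as F).
for peptide in ("YKYNESSYKF", "YKYQESSYKF"):
    print(peptide, acceptor_sites(peptide, "N-X-S/T"),
          acceptor_sites(peptide, "D/E-X-N-X-S/T"))
```

Under these assumptions the acceptor peptide carries a single N-X-S/T site (NES) but no extended bacterial sequon, and the Gln control carries neither.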
Understanding OTase activity, the culminating step of N-linked glycosylation, should be easier in systems where a single enzyme rather than a large complex is involved in the catalysis. Prokaryotic OTases provide a simpler platform for exhaustive study of the oligosaccharyl transfer reaction. Significant advances in defining OTase structures using AglB and PglB have recently been reported, including crystallography studies on AglB from Pyrococcus furiosus and Archaeoglobus fulgidus26-29 and PglB from Campylobacter lari.14 In our efforts to obtain a tractable OTase with improved expression and stability for in-depth examination, we turned to archaea. Archaeal protein homologs are often used to aid protein structure and function analysis due to increased expression in heterologous systems and enhanced stability compared to their eukaryotic and bacterial counterparts.30

N-linked glycosylation in archaea bears similarities to both the bacterial and eukaryotic pathways. As in bacteria, archaea display a tremendous variety of N-linked glycans and the OTase is a single integral membrane protein.3 Similar to eukaryotes, N-linked glycoproteins are an important feature of secretory proteins, appearing in both S-layer and flagellar proteins. Also as in eukaryotes, the polyprenyl carrier is dolichol rather than the undecaprenol characteristic of bacteria.3 Dolichols are distinguished from undecaprenols by the presence of a single saturated α-isoprene unit. These attributes make the archaeal pathway an attractive system for detailed biochemical characterization.

At the onset of this work, Angelyn Larkin had identified potential archaeal OTases by using iterative searches of archaeal genome sequences with S. cerevisiae Stt3 as the search model (Figure 3-4).35 AglB from the marine methanogen Methanococcus voltae was then identified as a desirable archaeal OTase candidate for study due to the high expression levels in E.
coli (Figure 3-5).35 Furthermore, in contrast to many other unusual archaeal glycans, the identities of the individual sugars of the N-linked glycan transferred onto Asn by the OTase are known in the flagellar and S-layer glycoproteins of M. voltae.36 Chapter 2 details our efforts to biochemically elucidate the first two steps of the N-linked glycosylation pathway in M. voltae and to generate the dolichyl-linked disaccharide substrate needed to further investigate the oligosaccharyl transfer reaction. This chapter describes the biochemical characterization of the OTase, AglB, and our efforts towards its biophysical characterization through X-ray crystallography.

Figure 3-4: Sequence alignment of 16 archaeal Stt3 homologs, depicting the region surrounding the critical WWDXGX motif. Fully conserved residues are highlighted in black, while moderate conservation is indicated in gray. Sequences shown, with the position labels given in the figure: H. marismortui (551), H. volcanii (625), H. walsbyi (635), N. pharaonis (558), M. hungatei (542), M. stadtmanae (516), M. jannaschii (614), M. maripaludis (526), M. voltae (579), P. horikoshii (500), P. abyssi (501), P. furiosus (461), T. kodakarensis (481), A. fulgidus (452), D. vulgaris (499), S. solfataricus (488), C. jejuni PglB (444), and S. cerevisiae Stt3 (503). This figure is reproduced from Larkin.35

Figure 3-5: Comparison of archaeal OTase expression by Western blot. Each protein was prepared as a crude E. coli membrane fraction prior to analysis. (A) Anti-T7 and (B) Anti-His4 Western blots, indicating the N- and C-terminal tags.
(1) MW standard; (2) M. voltae; (3) M. maripaludis; (4) A. fulgidus; (5) H. marismortui; (6) C. jejuni (PglB). This figure is reproduced from Larkin.35

Results and discussion

Characterization of oligosaccharyl transferase activity

Sequence analysis of AglB in M. voltae predicted that this OTase contains 13 TM domains located at the N-terminus of the protein and a large (39.1 kDa) C-terminal soluble domain, which is similar to the overall topology observed for both Stt3 and PglB (Figure 3-6).4,14,16 AglB was overexpressed in E. coli with N-terminal T7 and C-terminal His6 tags and purified in the presence of detergent from the membrane fraction in good yield (2-5 mg/L E. coli culture) (Figure 3-7).

Figure 3-6: Topology prediction generated using the TMHMM server for M. voltae AglB. The plot distinguishes transmembrane, inside, and outside regions along the sequence.

Figure 3-7: Purified AglB analyzed by (1) Coomassie-stained SDS-PAGE, and by (2) Anti-T7 and (3) Anti-His6 Western blots, indicating the N- and C-terminal tags.

Early work in the Imperiali lab evaluating archaeal OTases for function was unsuccessful because the correct polyprenyl-linked donor substrate was not previously available and alternate substrates such as those from S. cerevisiae or C. jejuni failed to be recognized by AglB. However, with our discovery of the proper enzymatic functions of the glycosyltransferases AglK and AglC, we were able to generate Dol-P-GlcNAc-Glc-2,3-diNAcA and investigate its competence as a substrate for AglB, the predicted OTase in M. voltae. Although the full-length M. voltae N-linked glycan is a trisaccharide, previous in vitro studies have shown that disaccharide-linked substrates are generally sufficient for OTase activity.
20,38,39 The acceptor peptide substrate, Ac(YKYNESSYKpNF)NH2, where pNF is para-nitrophenylalanine, was based on the natively glycosylated FlaB2 protein sequence.36 With all the necessary substrates in hand, we were able to test AglB for activity using a standard assay based upon that originally developed for the yeast OT.20,3 The assay takes advantage of a radiolabeled substrate and the differential partitioning of polyprenyl-linked glycosyl donors, peptide acceptors, and glycopeptide products into aqueous and organic phases. The transfer of radioactivity from the organic layer into the aqueous layer indicates that the radiolabeled sugar has been transferred from the organic soluble polyprenyl-linked donor to the peptide, producing a radiolabeled, aqueous soluble glycopeptide (Figure 3-8).

Figure 3-8: Standard assay for OTase activity. OTases are incubated with the peptide substrate and the radiolabeled glycan donor. Aliquots are removed over time, quenched, and subjected to phase extraction. Glycosylation results in the transfer of radioactive counts from the organic to the aqueous phase. Reaction scheme: Dolichyl-P-GlcNAc-[3H]Glc-2,3-diNAcA (organic soluble) + peptide (aqueous soluble) → Dolichyl-P (organic soluble) + [3H]-glycopeptide (aqueous soluble), catalyzed by AglB.

When AglB was incubated with radiolabeled Dol-P-GlcNAc-[3H]Glc-2,3-diNAcA donor and Ac(YKYNESSYKpNF)NH2 acceptor peptide, efficient formation of the radiolabeled glycopeptide product was observed (Figure 3-9). AglB was specific for the Dol-P-disaccharide donor and showed minimal activity with Dol-P-GlcNAc. No glycosylation activity was detected with the negative control peptide Ac(YKYQESSYKpNF)NH2, indicating that AglB activity is dependent on the canonical NXS/T sequon.
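The phase-partitioning readout described above can be reduced to a single number per time point. The sketch below shows one plausible way to do this, assuming that conversion is expressed as the fraction of recovered counts found in the aqueous phase after background subtraction; the text does not spell out the exact data reduction used, and the function name, the background parameter, and all counts shown are hypothetical.

```python
def percent_transferred(aqueous_dpm, organic_dpm, background_dpm=0.0):
    """Percent of recovered radiolabel found in the aqueous (glycopeptide) phase.

    Assumes both phases are counted for each quenched aliquot and that the
    same background applies to both measurements.
    """
    aqueous = max(aqueous_dpm - background_dpm, 0.0)
    organic = max(organic_dpm - background_dpm, 0.0)
    total = aqueous + organic
    if total <= 0:
        raise ValueError("no counts above background in either phase")
    return 100.0 * aqueous / total


# Hypothetical time course: (minutes, aqueous dpm, organic dpm).
time_course = [(0, 50, 9950), (10, 1500, 8500), (40, 4000, 6000)]
for minutes, aq, org in time_course:
    print(f"{minutes:>3} min: {percent_transferred(aq, org):.1f}% transferred")
```

Normalizing to the sum of both phases, rather than to the nominal input, makes each aliquot self-referencing and tolerant of small pipetting differences between time points.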
Figure 3-9: AglB activity assay using the peptide Ac(YKYNESSYKpNF)NH2 (NES) as the acceptor substrate and Dol-P-GlcNAc (MS) or Dol-P-disaccharide (DS) as the donor substrates. A peptide lacking the key asparagine, Ac(YKYQESSYKpNF)NH2 (QES), was also screened. Series plotted against time (min): Dol-P-DS with NES; Dol-P-MS with NES; Dol-P-DS with QES.

The activity of AglB, like the other OTases characterized to date, required the addition of divalent metal cations (Figure 3-10). In contrast to the other OTases, AglB appears to be quite specific for Mn2+; no activity was observed in the presence of either Mg2+ or Ca2+. The C. lari PglB X-ray structure (PDB code: 3RCE) shows that three acidic side chains (Asp56, Asp154, and Glu319) coordinate a divalent metal ion, presumably Mg2+, and a fourth acidic side chain (Asp156) is close to the metal ion, with no water molecules modeled due to the limited resolution.14 In the A. fulgidus AglB X-ray structure (PDB code: 3WAJ), anomalous difference Fourier maps confirmed that the bound metal ion is Zn2+, which was derived from the crystallizing solution.15 The three ligands are the carboxyl oxygen atoms of Asp47 and Asp161, and the imidazole nitrogen of His163. Water molecules occupy the other three coordination sites. The Zn2+-bound structure is catalytically active, and the His163 residue might allow AglB to bind and use Zn2+ efficiently for catalysis. Presently, there is no X-ray structure of M. voltae AglB to identify the chelating ligands, which may provide clues as to why AglB is specific for Mn2+ in M. voltae.

Figure 3-10: AglB activity assay performed with the addition of divalent metal cations (10 mM MnCl2, MgCl2, or CaCl2), no exogenous metal ions, or 10 mM EDTA. Series plotted against time (min): Mn2+, Mg2+, Ca2+, no metal, and EDTA.
We were able to readily separate the glycopeptide product from the peptide starting material using reverse-phase HPLC (Figure 3-11), and confirmed the product by MS analysis (Figure 3-12).

Figure 3-11: Reverse-phase HPLC purification of peptide and glycopeptide product (280 nm). Traces shown: glycopeptide and peptide, plotted against time (min).

Figure 3-12: ESI-MS (negative ion mode) of the glycopeptide produced by AglB [M+ = 1875.8].

AglB and its lipid-linked glycosyl donor substrate

The final step of N-linked glycosylation involves oligosaccharide transfer from the Dol-P donor to asparagine in acceptor proteins by the OTase, AglB. The catalytic subunits of OTases from bacteria, archaea, and eukaryotes share common topologies and conserved sequences. It is remarkable that, despite this homology, AglB transfers carbohydrates to asparagine in the same conserved NXS/T sequon from a polyprenyl-monophosphate (Pren-P) linked donor rather than a polyprenyl-diphosphate (Pren-PP) linked donor, as the diphosphate moiety has been typically assigned important roles in metal binding and catalysis.14,16 In particular, the metal ion-coordinated diphosphate-linked donor would be considerably more activated than the corresponding monophosphate derivative owing to the higher affinity of the diphosphate for divalent cations, which would in turn render the metal ion-coordinated complex a better leaving group as a result of the more acidic pKa (for example, 5.3 for ADPH2--Mg2+ versus around 7.0 for typical monophosphate esters) of the departing phosphate species.40 The structural and mechanistic consequences of this significant change in the glycosyl donor substrate will be extremely interesting to pursue.

As discussed in Chapter 2, M. voltae and M. maripaludis are closely related methanogenic, euryarchaeal species with very similar N-glycan structures.
While N-linked glycosylation pathways in M. voltae and M. maripaludis appear unique due to the occurrence of Dol-P rather than Dol-PP linked glycans,41 this feature may actually be much more prevalent in archaea than originally believed. The eukaryotic OT complex transfers a tetradecasaccharide, GlcNAc2Man9Glc3, from a Dol-PP carrier to the asparagine residue of proteins (Figure 3-1). PglB in C. jejuni uses a similar Und-PP-heptasaccharide donor substrate (Figure 3-3),20 and PglB in the closely related C. lari presumably utilizes an Und-PP-hexasaccharide donor substrate lacking the branching glucose.42 It is not yet known whether AglB enzymes from A. fulgidus, P. furiosus, and most other archaeal species employ Dol-P or Dol-PP substrates because activity assays on the full-length proteins have only been carried out using crude membrane fractions as the source of the glycosyl donor.15,26 Lipid carriers that have been definitively isolated and identified in archaea are the various Dol-P charged glycans found in Haloferax volcanii,43,44 the Dol-P-HexANAc and Dol-P-(HexANAc)2 species found in H. mediterranei,45 the Dol-P-pentasaccharide found in Haloarcula marismortui,46 the Dol-P-trisaccharide found in M. maripaludis,41 and the Dol-P-heptasaccharide and intermediates found in P. furiosus (Chapter 2). Mass spectrometry analysis of a total Sulfolobus acidocaldarius lipid extract identified Dol, Dol-P, and Dol-P-hexoses but not Dol-PP or any Dol-P charged with complex glycans.47 The only archaeal Dol-PP-glycans found thus far occur in Halobacterium salinarum, in which both Dol-P-disaccharide and Dol-PP-tetrasaccharide species have been detected.45 Presently, it is not clear if the single AglB encoded by H. salinarum is responsible for processing both the Dol-P and Dol-PP charged glycan carriers. In this chapter, we demonstrate that AglB in M. voltae utilizes Dol-P-disaccharide as its glycan donor.
Taken together, there is strong evidence that N-linked glycosylation in archaea predominantly utilizes Dol-P instead of Dol-PP as the lipid carrier. PglB and Stt3 use polyprenyl-diphosphate linked glycosyl donor substrates, so AglB must be different in some way that explains the significant change in the chemical characteristics of the glycosyl donor substrate.

Investigation of the role of AglB His-597

AglB is the first biochemically characterized example of an OTase capable of transferring glycans from a polyprenyl monophosphate-linked glycan donor. These AglB proteins represent a divergent class of OTases from those found in eukaryotes and bacteria. The absence of a second phosphate in the donor, which has been typically attributed to metal ion binding, suggests that other residues within AglB must be involved in the process as compared to those described in PglB.14,16 A sequence alignment of several archaeal AglB homologs indicates that these proteins contain a few distinct primary sequence features, including a conserved histidine residue immediately following the canonical WWDXG motif, as well as additional aromatic and positively charged residues roughly 20 amino acids downstream of this sequence (Figure 3-13). While additional alignments and mutagenesis studies will be needed to confirm this, these sequence features may serve as a predictor for OTase recognition of dolichyl monophosphate-linked glycans.

Figure 3-13: Sequence alignment of the WWDXGX motif of representative OTases, highlighting the presence of the conserved His residue in various archaea (*). Sequences shown: H. marismortui, H. volcanii, M. stadtmanae, M. jannaschii, M. maripaludis, M. voltae, A. fulgidus A, A. fulgidus B, P. horikoshii A, P. horikoshii B, P. furiosus A, P. furiosus B, S. solfataricus, C. jejuni, and S. cerevisiae.

AglB in M. voltae has a histidine residue at position 597, and mutation of this conserved His to Ala, Phe, or Tyr (the residue found at this position in both PglB and Stt3) results in complete inactivation of the enzyme (Figure 3-14). Mutation of His-597 to Asn was not tractable, possibly due to premature truncation, but the other mutants were well expressed (Figure 3-15). It is tempting to propose a role for this His residue in the activation of the acceptor asparagine amide or stabilization of the reaction intermediate, as the monophosphate donor is predicted to be less reactive than the corresponding diphosphate, although further experiments are required to support this hypothesis. As depicted in Figure 3-13, certain archaea such as P. furiosus, P. horikoshii, and A. fulgidus contain additional isoforms of AglB that do not feature the conserved histidine residue. However, it is unclear what the role of these alternate isoforms might be.

Figure 3-14: Activity of AglB His-597 mutants (CWWDNGH) with Dol-P-GlcNAc-[3H]Glc-2,3-diNAcA and Ac(YKYNESSYKpNF)NH2 as the donor and acceptor substrates. Series plotted against time (min): H597A, H597F, and H597Y.

Figure 3-15: AglB WT, H597A, H597F, H597N, H597Y expression in the membrane fraction, indicated by the asterisk, and detected by anti-His4 Western blot.
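The sequence feature discussed above, the residue immediately following WWDXG, is easy to extract programmatically. The following is a minimal sketch with a hypothetical helper name; the M. voltae fragment CWWDNGH is taken from the Figure 3-14 caption, whereas the Tyr-containing fragment is an illustrative stand-in for a PglB- or Stt3-type motif and is not copied from the alignment. As the text stresses, a His at this position is only a candidate predictor of Pren-P donor use, not an established rule.

```python
import re

# WWD, any residue, G, then capture the residue that follows the motif.
WWDXG_NEXT = re.compile(r"WWD.G(.)")


def residue_after_wwdxg(sequence):
    """Return the residue immediately following the first WWDXG motif, or None."""
    match = WWDXG_NEXT.search(sequence.upper())
    return match.group(1) if match else None


fragments = {
    "M. voltae AglB (from Figure 3-14 caption)": "CWWDNGH",
    "Tyr-type motif (illustrative fragment)": "AAWWDYGYKK",
}
for name, fragment in fragments.items():
    print(f"{name}: {residue_after_wwdxg(fragment)}")
```

Applied to full-length sequences, the same helper would flag the isoforms lacking the histidine, although the H597Y result reported in this chapter shows that this single position does not by itself determine donor class.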
The hypothesis that His-597 is important for distinguishing between Pren-P and Pren-PP linked glycans led to the idea that it might be possible to convert one class of OTase to another by mutating that key histidine residue to tyrosine and using the polyprenyl diphosphate-linked donor of the other class of OTase. Therefore, AglB H597Y was tested with its native substrate, Dol-P-GlcNAc-Glc-2,3-diNAcA, and the PglB native substrate, Und-PP-diNAcBac-GalNAc (Figure 3-16). The ideal substrate for this study would have been Dol-PP-GlcNAc-Glc-2,3-diNAcA, but as this substrate was not available to us, we believed that Und-PP-diNAcBac-GalNAc would be a close enough analog. This assumption was made in light of reports of relaxed substrate specificity of AglB for lipid-linked glycans in haloarchaeal species.45 AglB H597Y demonstrated no activity with either substrate, indicating that His-597 is indeed important for AglB activity, but that the differences between the two classes of OTases are more complex than one single His residue.

Figure 3-16: AglB wild type and AglB H597Y mutant activity with (A) Dol-P-GlcNAc-Glc-2,3-diNAcA and (B) Und-PP-Bac-GalNAc. Series plotted against time (min) in each panel: AglB-WT and AglB-H597Y.

AglB crystallographic studies

In recent years, much effort has been made to solve the crystal structures of prokaryotic OTases in order to understand the critical protein to glycan bond-forming step in N-linked glycosylation. P. horikoshii and P. furiosus each contain two versions of AglB, designated AglB-L (long) and AglB-S (short), while A. fulgidus has three versions of the protein, designated AglB-L, AglB-S1, and AglB-S2.28 Crystal structures have been obtained for the C-terminal soluble domain of AglB-L of P. furiosus,26 AglB-L of P. horikoshii,28 and all three paralogs in A. fulgidus.
27,28,48 These soluble domains alone were insufficient to catalyze the OTase reaction, implying that the transmembrane regions are needed for enzymatic activity. Nevertheless, these structures led to the identification of additional motifs involved in the catalytic activity of AglB. To date, the only structures of full-length, catalytically active OTases are those reported for PglB from C. lari14 and AglB-L from A. fulgidus.15 Despite low sequence similarity, full-length PglB and AglB-L display remarkable structural similarity. The X-ray structure of full-length C. lari PglB in complex with an acceptor peptide revealed that PglB forms two cavities on opposite sides of the protein, one for peptide binding and the other presumably for binding the lipid-linked oligosaccharide. At this stage, the crystal structure of an OTase that is bound to a peptide substrate but known to utilize a Pren-P-oligosaccharide would be desirable for comparison with PglB.

Attempts to crystallize M. voltae AglB have been hindered by the fact that, while the protein expresses well, it is difficult to handle. Membrane protein crystallography is challenging because the protein must first be extracted from the lipid membrane with a mild detergent and purified to a stable, homogeneous population.49 AglB is active in Triton X-100 and n-dodecyl-β-D-maltoside (DDM), but neither detergent was suitable for our crystallography studies. We observed that AglB is solubilized well from the membrane by Triton X-100 and not at all by DDM. When AglB is solubilized with 1% Triton X-100, exchanged into 0.05% DDM, and subjected to gel filtration chromatography, the protein exists as a mixture of higher-order aggregates and elutes as a broad peak extending from the void volume to the retention time expected for monomeric AglB (Figure 3-17A).
When AglB is solubilized from the membrane with 1% lauryl maltose neopentyl glycol (LMNG), washed with 0.003% LMNG, and subjected to gel filtration chromatography, the protein elutes as one monodisperse peak at the volume expected for a large protein-micelle complex (Figure 3-17B).

Figure 3-17: Gel filtration chromatography of AglB, monitored at 280 nm. (A) Solubilized with 1% Triton X-100 and exchanged into 0.05% DDM. After the Superdex 200 16/60 column, 1.2 mg of AglB is purified per liter of cell culture. (B) Solubilized with 1% LMNG and exchanged into 0.003% LMNG. After the Superdex 200 10/300 column, 0.3 mg of AglB is purified per liter of cell culture. Arrow indicates the void volume.

LMNG is not a traditional detergent; it represents a new class of amphiphiles built around a central tetrahedral carbon atom derived from neopentyl glycol (NG), with two hydrophilic heads and two lipophilic tails (Figure 3-18). These NG-class detergents are more effective than conventional detergents for extracting, solubilizing, and stabilizing proteins from multiple membrane protein systems, and are particularly beneficial in the crystallization process because their subtle constraints on overall conformational flexibility allow dense packing when forming a micelle.50 Low critical micelle concentration (CMC) values also reduce the often-detrimental effects of excess solubilizing agent on crystallization. Triton X-100 is significantly better than LMNG at solubilizing AglB from the membrane, but it is inferior for AglB stabilization and promotes aggregation, whereas LMNG maintains AglB in monodisperse form, a crucial feature of proteins intended for crystallization. Therefore, LMNG was selected as the detergent of choice for AglB crystallography screens despite low solubilization yields and only 0.1 mg of monodisperse AglB purified per liter of cell culture.

Figure 3-18: Structure of LMNG amphiphile.
Given that the C. lari PglB did not crystallize without peptide substrate, it was important to find a good peptide substrate for AglB crystallization studies. The Ac(YKYNESSYKpNF)NH2 sequence used in the AglB activity experiments was taken from a flagellar protein sequence, with no studies undertaken to optimize the residues by position. While AglB does not appear to have any requirements other than the NXS/T sequon in its acceptor peptide, each position can be assayed with a diverse peptide library. AglB was first assayed with a peptide library consisting of 23 unique peptide substrates at 0.5 mM peptide concentration (Figure 3-19). The screen revealed which residues at which positions resulted in the highest initial rates. An additional peptide with the best residue at each position based on the library was synthesized, with arginine residues added to the C-terminus for increased solubility. The new Ac(YKYNFTSYKRR)NH2 peptide had an apparent Km of 0.96 ± 0.52 mM and is a much poorer substrate for AglB than the C. jejuni peptide is for PglB. The optimized C. jejuni peptide, Ac(DQNATpNF)NH2, has an apparent Km of 0.80 ± 0.11 µM, which corresponds to roughly thousand-fold tighter binding than the AglB peptide substrate.51 The Michaelis constant, Km, is the substrate concentration at which the reaction velocity is half-maximal and is an inverse measure of the affinity of the substrate for the enzyme: a small Km indicates high affinity, meaning that the rate approaches Vmax at lower substrate concentrations. The value of Km depends on both the enzyme and the substrate and is also a function of temperature and pH. Under the constant experimental conditions at which all of our Km values were determined, including the identical enzyme preparation, Km approximates the apparent dissociation constant, Kd.

Figure 3-19: AglB acceptor peptide library screen. Initial rates were determined at 0.5 mM peptide concentration.
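As a rough illustration of how the apparent Km values quoted above translate into reaction rates, the following Python sketch evaluates the Michaelis-Menten expression v/Vmax = [S]/(Km + [S]). Only the two Km values and the 0.5 mM screening concentration come from the text; the function name and the idea of evaluating both peptides at the same concentration are assumptions made for illustration and were not part of the original analysis.

```python
def fraction_of_vmax(substrate_mM: float, km_mM: float) -> float:
    """Michaelis-Menten rate as a fraction of Vmax: v/Vmax = [S] / (Km + [S])."""
    return substrate_mM / (km_mM + substrate_mM)


# Apparent Km values quoted in the text, expressed in mM.
KM_AGLB_PEPTIDE_MM = 0.96      # Ac(YKYNFTSYKRR)NH2 with AglB
KM_PGLB_PEPTIDE_MM = 0.80e-3   # Ac(DQNATpNF)NH2 with PglB (0.80 uM)

# Peptide concentration used in the library screen (from the text).
SCREEN_CONC_MM = 0.5

# Hypothetical comparison: both peptides evaluated at the same concentration.
aglb_fraction = fraction_of_vmax(SCREEN_CONC_MM, KM_AGLB_PEPTIDE_MM)
pglb_fraction = fraction_of_vmax(SCREEN_CONC_MM, KM_PGLB_PEPTIDE_MM)

# Ratio of the two Km values (the "thousand-fold" difference in the text).
km_ratio = KM_AGLB_PEPTIDE_MM / KM_PGLB_PEPTIDE_MM

print(f"AglB peptide at {SCREEN_CONC_MM} mM: {aglb_fraction:.2f} of Vmax")
print(f"PglB peptide at {SCREEN_CONC_MM} mM: {pglb_fraction:.3f} of Vmax")
print(f"Km ratio (AglB peptide / PglB peptide): {km_ratio:.0f}-fold")
```

Under these assumptions, the AglB peptide at the 0.5 mM screening concentration is still well below saturation, which is consistent with the text describing it as a much poorer substrate.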
Crystallization conditions for AglB were identified by initially surveying several commercially available sparse matrix screens. The Hampton MembFac crystal screen consists of 48 unique reagents selected specifically for use with detergents. Hanging drop screens were set up with 5 mg/mL AglB with and without 0.75 mM Ac(YKYNFTSYKRR)NH2 peptide. Crystals were observed after two weeks in one condition (Figure 3-20). These crystals contain AglB that was pre-incubated with peptide. Some other conditions appeared to contain microcrystals, but these did not grow large enough and failed to seed larger crystals. The remaining conditions were split between clear drops and precipitate. This indicates that 5 mg/mL is in the right protein concentration range. However, further optimization of the setup (drop size, protein-to-precipitant ratio, protein-to-peptide ratio, reservoir volume, temperature) and sampling of pH, salt, and precipitant concentrations around the initial hit condition in order to improve crystal size and quality did not prove fruitful.

Figure 3-20: Crystallization of AglB in the presence of peptide. Initial crystals of AglB in 0.1 M NaCl, 0.1 M sodium acetate trihydrate pH 4.6, 12% v/v 2-propanol after (A) 17 days and (B) 18 days.

Our one initial crystal hit appeared to have fibrils growing outwards, tended to stick to the cover slide, and appeared to be composed of stacked layers. Although this crystal exhibited less than ideal properties for good diffraction, it was looped, cryo-protected, and transported to Boston University for initial diffraction screening in the Allen lab. Unfortunately, the protein crystal was lost sometime during this process and we were unable to reproduce the crystal growth. One factor affecting reproducibility is that we cannot precisely determine the final concentration of LMNG in the purified AglB stock used to set up the crystal trays.
Given the low amount of AglB liberated in the membrane solubilization, protein loss and dilution in subsequent purification steps, and the non-reproducibility of our crystal hit, we decided to put the crystallization studies on hold. Nevertheless, these studies laid the foundation for future biophysical investigations of AglB.

Conclusions

In this chapter, the in vitro analysis of the final enzyme in the N-linked glycosylation pathway of M. voltae represents the first time an OTase has been biochemically shown to transfer a glycan from a polyprenyl-monophosphate carrier instead of the polyprenyl-diphosphate carrier observed in eukaryotes and bacteria, and traditionally presumed in archaea (Figure 3-21). Although it is not yet known how common this second general strategy for N-linked glycosylation is among archaea, there may be clues to be found in the comparative analysis of AglB and PglB structures. Membrane protein crystallography of AglB remains challenging and there is still no published structure of AglB bound to peptide or the lipid-linked oligosaccharide. Given these challenges, our immediate focus shifted to understanding the structure and function of AglB by investigating the protein dynamics associated with substrate binding using luminescence resonance energy transfer, which will be detailed in the following chapter.

Figure 3-21: M. voltae AglB catalyzes the oligosaccharyl transfer from a Dol-P-glycan onto the Asn residue within the NXS/T sequon of acceptor proteins. Scheme labels: Dol-P-GlcNAc-Glc-2,3-diNAcA; peptide with sequon (Asn-Xaa-Ser/Thr); AglB; Dol-P; M. voltae disaccharide N-linked glycopeptide.

Acknowledgements

Dr. Angelyn Larkin began this project by pursuing the M. voltae N-linked pathway and made many attempts to make a suitable substrate in order to examine AglB activity. I am also grateful to Prof. W.
Whitman (University of Georgia) for the gift of M. voltae genomic DNA, Prof. E. Swiezewska (Polish Academy of Sciences) for providing a sample of racemic short dolichols for preliminary studies, Prof. K. Allen (Boston University) for help with protein crystallography, and Li Li (DCIF, MIT) for high-resolution ESI-MS of glycopeptide.

Experimental methods

AglB expression and purification

AglB was expressed from a pET-24a(+) vector in BL21-CodonPlus (DE3) RIL cells (Agilent) using kanamycin and chloramphenicol for selection. LB medium (1 L) supplemented with antibiotics was inoculated with a 5-mL starter culture and incubated at 37 °C with shaking until an optical density at 600 nm of 0.6 to 0.8 AU was obtained. The cultures were then cooled to 16 °C and protein expression was induced with 1 mM IPTG. After 16 h, the cells were harvested by centrifugation (4,000 g). For protein purification, cell pellets were resuspended in 5% of the original culture volume in buffer A (50 mM HEPES, pH 7.5, 300 mM NaCl) containing a protease inhibitor cocktail (Calbiochem) and then subjected to sonication (50% amplitude, 1-second pulses, 3 × 1.5 min on ice). To prepare cell membrane fractions, the cell lysate was centrifuged to remove cellular debris (6,000 g, 30 min), followed by pelleting of the membranes (142,000 g, 1 h). The membrane pellet was then resuspended in 0.25% of the original culture volume in buffer B (50 mM HEPES, pH 7.5, 150 mM NaCl) plus 10 mM imidazole and stored at -80 °C. To purify proteins from crude membrane fractions, the membranes were solubilized in 1% Triton X-100 (rotating for 1 h) and centrifuged (142,000 g), and the resulting supernatant was incubated with Ni-NTA resin (rotating for 3 h). The resin was then washed (buffer B plus 25 mM imidazole and 0.05% DDM), the proteins were eluted (buffer B plus 250 mM imidazole and 0.05% DDM), and the samples were dialyzed (buffer B plus 0.05% DDM). All purification steps were performed at 4 °C.
AglB concentrations were determined by UV absorbance. AglB expressed from pET-24a(+): MW = 104,864 g/mol, ε280nm = 168,510 M⁻¹ cm⁻¹.

Oligosaccharyltransferase assay

To assay the AglB OTase, DMSO (5 µL), buffer C (50 µL, 100 mM HEPES, pH 7.5/280 mM sucrose/2.4% Triton X-100/20 mM MnCl2), and H2O (53 µL) were added to a tube containing dried Dol-P-GlcNAc-[3H]Glc-2,3-diNAcA (0.78 nmol, 17.3 mCi/mmol). Freshly purified AglB (1 µM) was then added, along with H2O for a final volume of 100 µL. The reaction was initiated by the addition of peptide, either Ac(YKYNESSYKpNF)NH2 or Ac(YKYQESSYKpNF)NH2 (0.2 µmol), which were synthesized on PAL-PEG-PS resin (Applied Biosystems) using standard Fmoc-based solid-phase peptide synthesis protocols. Aliquots (10 µL) of the reaction mixture were removed at various time points and quenched in 3:2:1 CHCl3:MeOH:4 mM MgCl2 (1.2 mL), and the resulting aqueous layer was removed. The organic layer was further extracted with theoretical upper phase plus salt (TUPS, 2 × 600 µL, composed of 2.75% CHCl3, 44% MeOH, and 1.55 mM MgCl2). The resulting aqueous layers were combined with 5 mL EcoLite (MP Biomedicals) liquid scintillation cocktail, the organic layers were combined with 5 mL OptiFluor (PerkinElmer), and both were analyzed on a Beckman Coulter LS6500 scintillation counting system. The non-zero background observed at time zero without the addition of any enzyme is due to hydrolysis of the Dol-P-GlcNAc-[3H]Glc-2,3-diNAcA substrate during purification and storage. This is a common observation with these types of substrates. In general, the sample is used only if less than 15% of it has been hydrolyzed prior to the assay, which allows a reliable range of substrate conversion rates to be observed and measured. In control reactions, we note that there is no background hydrolysis over the limited time frame of the assay.
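The quantities in the two sections above lend themselves to a short worked calculation. The Python sketch below applies the Beer-Lambert law with the stated extinction coefficient, converts the stated specific activity into total counts, and computes a background-corrected conversion from aqueous and organic counts. The function names, the example scintillation counts, and the simple background subtraction are assumptions made for illustration; the text does not specify how conversion was calculated, only that the time-zero background arises from prior hydrolysis and should be below 15%.

```python
# Values quoted in the text for AglB expressed from pET-24a(+).
MW_AGLB = 104_864.0       # g/mol
EPS280_AGLB = 168_510.0   # M^-1 cm^-1

DPM_PER_CURIE = 2.22e12   # disintegrations per minute in one curie


def mg_per_ml_from_a280(a280: float, path_cm: float = 1.0) -> float:
    """Beer-Lambert law: A = eps * c * l, then mol/L * g/mol = g/L = mg/mL."""
    molar = a280 / (EPS280_AGLB * path_cm)
    return molar * MW_AGLB


def substrate_dpm(nmol: float, mci_per_mmol: float) -> float:
    """Total radioactivity (dpm) in a given amount of labeled substrate."""
    curies = (nmol * 1e-6) * (mci_per_mmol * 1e-3)  # mmol * Ci/mmol
    return curies * DPM_PER_CURIE


def fraction_in_aqueous(aqueous_dpm: float, organic_dpm: float) -> float:
    """Fraction of counts found in the aqueous layers of one aliquot."""
    return aqueous_dpm / (aqueous_dpm + organic_dpm)


def background_corrected(fraction_t: float, fraction_t0: float) -> float:
    """Assumed correction: subtract the time-zero (hydrolysis) fraction."""
    return fraction_t - fraction_t0


# Donor substrate per 100 uL reaction, as stated in the text.
total_dpm = substrate_dpm(0.78, 17.3)

# Hypothetical counts for two 10 uL aliquots (not data from the thesis).
f_t0 = fraction_in_aqueous(aqueous_dpm=300.0, organic_dpm=2700.0)
f_t = fraction_in_aqueous(aqueous_dpm=1200.0, organic_dpm=1800.0)
substrate_usable = f_t0 < 0.15  # threshold stated in the text
conversion = background_corrected(f_t, f_t0)

print(f"A280 of 1.0 corresponds to {mg_per_ml_from_a280(1.0):.3f} mg/mL AglB")
print(f"Donor substrate per reaction: {total_dpm:.0f} dpm")
print(f"Time-zero background: {f_t0:.0%}, usable: {substrate_usable}")
print(f"Background-corrected conversion: {conversion:.0%}")
```

The example counts were chosen so that each hypothetical aliquot carries about one tenth of the total radioactivity, matching a 10 µL sample from a 100 µL reaction.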
Purification of glycopeptides

Glycopeptides were purified using a C18 reverse-phase column (YMC-Pack ODS-AQ, 120 Å, 3 µm, 100 × 3 mm), separating over 30-38% buffer G, where buffer F is H2O/0.1% TFA and buffer G is CH3CN/0.1% TFA, and monitoring at 280 nm. The glycopeptide product was further characterized by high-resolution ESI-MS.

AglB H597 mutagenesis

AglB H597A, H597N, H597F, and H597Y mutants were expressed as pET-24a(+) constructs in BL21 (DE3) RIL cells. The primers (Sigma) used for QuikChange mutagenesis are as follows:

H597A-f: 5'-GTTGGTGGGACAATGGTGCCATCTACACATGGAAAAC-3'
H597A-r: 5'-GTTTTCCATGTGTAGATGGCACCATTGTCCCACCAAC-3'
H597N-f: 5'-GTTGGTGGGACAATGGTAACATCTACACATGGAAAAC-3'
H597N-r: 5'-GTTTTCCATGTGTAGATGTTACCATTGTCCCACCAAC-3'
H597F-f: 5'-GTTGGTGGGACAATGGTTTCATCTACACATGGAAAAC-3'
H597F-r: 5'-GTTTTCCATGTGTAGATGAAACCATTGTCCCACCAAC-3'
H597Y-f: 5'-GTTGGTGGGACAATGGTTACATCTACACATGGAAAAC-3'
H597Y-r: 5'-GTTTTCCATGTGTAGATGTAACCATTGTCCCACCAAC-3'

Purification of AglB for crystallography

For solubilization of the protein from the crude membrane fraction, the membrane fraction was thawed, homogenized with 50 mM HEPES, pH 7.5, 100 mM NaCl, 10 mM imidazole, 1% LMNG (10 mL per L of cell culture), and incubated with rotation at 4 °C for 1 hour. The detergent mixture was centrifuged (145,000 g, 65 min) to pellet the crude membranes. The supernatant containing protein and detergent micelles was incubated with Ni-NTA agarose resin for 1 hour with gentle rocking and subsequently poured into a fritted PolyPrep column to collect the resin. The resin was washed with 50 mM HEPES, pH 7.5, 100 mM NaCl, 20 mM imidazole, 0.002% LMNG and then with 50 mM HEPES, pH 7.5, 100 mM NaCl, 40 mM imidazole, 0.002% LMNG. The protein was eluted with 50 mM HEPES, pH 7.5, 100 mM NaCl, 300 mM imidazole, 0.002% LMNG. Fractions containing the desired protein were identified by SDS-PAGE, combined, and concentrated using a 50 kDa cutoff centrifugal filter.
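As a consistency check on the mutagenic primers listed in the H597 mutagenesis section, the following Python sketch confirms that each reverse primer is the reverse complement of its forward partner and that the codon at the mutated position encodes the intended residue. The primer sequences are copied from the list; the reading frame is an assumption inferred from the WWDNG portion of the motif (TGG TGG GAC AAT GGT), and the helper names and the four-entry codon table are introduced only for this illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Forward and reverse primers, 5' to 3', copied from the primer list.
PRIMERS = {
    "H597A": ("GTTGGTGGGACAATGGTGCCATCTACACATGGAAAAC",
              "GTTTTCCATGTGTAGATGGCACCATTGTCCCACCAAC"),
    "H597N": ("GTTGGTGGGACAATGGTAACATCTACACATGGAAAAC",
              "GTTTTCCATGTGTAGATGTTACCATTGTCCCACCAAC"),
    "H597F": ("GTTGGTGGGACAATGGTTTCATCTACACATGGAAAAC",
              "GTTTTCCATGTGTAGATGAAACCATTGTCCCACCAAC"),
    "H597Y": ("GTTGGTGGGACAATGGTTACATCTACACATGGAAAAC",
              "GTTTTCCATGTGTAGATGTAACCATTGTCCCACCAAC"),
}

# Only the codons needed here; a full codon table is deliberately omitted.
CODON_TO_RESIDUE = {"GCC": "A", "AAC": "N", "TTC": "F", "TAC": "Y"}

# Assumption: the frame is set by the WWDNG codons at 0-based positions 2-16,
# which places the codon for residue 597 at positions 17-19.
MOTIF_DNA = "TGGTGGGACAATGGT"
CODON_START = 17


def check_primer_pair(forward: str, reverse: str):
    """Return (pair is complementary, motif is in frame, codon, residue)."""
    is_revcomp = reverse_complement(forward) == reverse
    motif_ok = forward[2:CODON_START] == MOTIF_DNA
    codon = forward[CODON_START:CODON_START + 3]
    return is_revcomp, motif_ok, codon, CODON_TO_RESIDUE.get(codon, "?")


results = {name: check_primer_pair(f, r) for name, (f, r) in PRIMERS.items()}
for name, (is_revcomp, motif_ok, codon, residue) in results.items():
    print(f"{name}: complementary={is_revcomp}, motif in frame={motif_ok}, "
          f"codon={codon} -> {residue}")
```

A check of this kind only verifies the internal consistency of the primer list; it says nothing about the wild-type template, whose sequence is not given in this section.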
The protein was then subjected to size exclusion chromatography using a Superdex 200 16/60 or Superdex 200 10/300 column (GE Healthcare) with a running buffer of 50 mM HEPES, pH 7.5, 100 mM NaCl, 0.002% LMNG. Fractions containing monodisperse protein were pooled and subjected to anion exchange chromatography using a HiTrap Q HP column (GE Healthcare) with 50 mM Tris, pH 8.5, 0.002% LMNG as the initial buffer. Protein was eluted with a gradient of 50 mM Tris, pH 8.5, 1 M NaCl, 0.002% LMNG. Fractions containing the desired protein were pooled and dialyzed against 50 mM HEPES, pH 7.5, 100 mM NaCl and then flash frozen for future use. The final LMNG concentration is somewhat ambiguous because of the unknown micellar size and the unknown permeability through the centrifugal filters used for concentrating protein. In addition to LMNG, octyl glucose neopentyl glycol (OGNG), Triton X-100, and DDM were also screened for solubilization and purification properties. All detergents were purchased from Anatrace or Sigma.

AglB peptide substrates

A peptide library consisting of 2-5 mg of 23 unique peptide substrates was synthesized by ChinaPeptides Co., Ltd and assayed with AglB at 0.5 mM peptide concentration. Two peptides were manually synthesized by standard Fmoc solid-phase peptide synthesis on a PAL-PEG-PS resin. The sequences are Ac(YKYNFTSYKRR)NH2 (ε = 3840 M⁻¹ cm⁻¹) and Ac(YKYNFTSYKpNFRR)NH2 (ε = 16340 M⁻¹ cm⁻¹). The resin was swelled in CH2Cl2 (5 min) then DMF (10 min) prior to synthesis. The terminal Fmoc group was deprotected with an excess of 20% 4-methylpiperidine (3 × 5 min). Each amino acid was coupled to the resin by incubation with the amino acid (4 equiv), PyBOP (4 equiv), and DIPEA (8 equiv) for 1 h. The resin was washed with DMF (5 × 1 min) and CH2Cl2 (5 × 1 min) between each deprotection and coupling step. All peptides were acetylated at the N-terminus by incubation with acetic anhydride (10 equiv) and pyridine (10 equiv) for 30 min.
The side chain protecting groups were removed and the peptides cleaved from the resin by exposure to a mixture of 90:5:2.5:2.5 TFA/CH2Cl2/H2O/TIS for 3 h with vigorous shaking. The resin was removed by filtration and the solvent was evaporated under a stream of nitrogen. The resulting pellet was triturated with cold Et2O before purification. The peptides were purified by C18 RP-HPLC (YMC-Pack ODS-A, 250 × 20 mm I.D., S-5 µm, 12 nm) with a gradient of 15-50% B, where solvent A is water/0.1% TFA and solvent B is acetonitrile/0.1% TFA, using a Waters 1525 system with monitoring at 228 and 280 nm. Peptides were confirmed by MS analysis on a Finnigan LCQ Deca mass spectrometer coupled to an Agilent 1100 series HPLC.

AglB crystallography

The Hampton MembFac crystal screen consists of 48 unique reagents selected specifically for use with detergents. A hanging drop screen was set up using 24-well plates with 0.5 mL MembFac reagent in each reservoir. All setup and growth occurred at room temperature. The AglB stock (in 50 mM HEPES, pH 7.5, 100 mM NaCl, and an unknown concentration of LMNG far above its CMC of 0.001% w/v) was at 5 mg/mL, and one screen was set up with peptide and one without peptide. For AglB alone, 1 µL of protein was added to 1 µL of reservoir solution. For the screen with peptide, AglB was incubated with a final concentration of 0.75 mM Ac(YKYNFTSYKRR)NH2 (stock = 72.4 mM in DMSO) for one hour, following the procedure used for the C. lari PglB crystal structure.14 Then 1 µL of protein incubated with peptide was added to 1 µL of reservoir solution. Crystals were observed in MembFac condition 4 (0.1 M NaCl, 0.1 M sodium acetate trihydrate pH 4.6, 12% v/v 2-propanol) after two weeks. These crystals contain AglB that was pre-incubated with peptide.
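The peptide pre-incubation step described above implies a small dilution calculation, sketched here in Python. The stock and final peptide concentrations, the 5 mg/mL protein concentration, and the AglB molecular weight are taken from the text; the 100 µL sample volume is a hypothetical value chosen for illustration, and the small dilution of the protein by the added peptide stock is ignored.

```python
def stock_volume_ul(final_mM: float, final_volume_ul: float, stock_mM: float) -> float:
    """Solve C1 * V1 = C2 * V2 for the volume of stock (V1) to add."""
    return final_mM * final_volume_ul / stock_mM


PEPTIDE_STOCK_MM = 72.4    # Ac(YKYNFTSYKRR)NH2 stock in DMSO (from the text)
PEPTIDE_FINAL_MM = 0.75    # final peptide concentration (from the text)
AGLB_MG_PER_ML = 5.0       # protein stock concentration (from the text)
MW_AGLB = 104_864.0        # g/mol (from the text)

SAMPLE_VOLUME_UL = 100.0   # hypothetical protein sample volume

peptide_ul = stock_volume_ul(PEPTIDE_FINAL_MM, SAMPLE_VOLUME_UL, PEPTIDE_STOCK_MM)
dmso_percent = 100.0 * peptide_ul / SAMPLE_VOLUME_UL

# 5 mg/mL equals 5 g/L; dividing by MW gives mol/L, then convert to uM.
aglb_uM = AGLB_MG_PER_ML / MW_AGLB * 1e6
peptide_molar_excess = PEPTIDE_FINAL_MM * 1000.0 / aglb_uM

print(f"Peptide stock to add: {peptide_ul:.2f} uL per {SAMPLE_VOLUME_UL:.0f} uL")
print(f"Resulting DMSO content: {dmso_percent:.2f}% v/v")
print(f"AglB concentration: {aglb_uM:.1f} uM")
print(f"Peptide-to-protein molar ratio: {peptide_molar_excess:.1f}")
```

Under these assumptions the peptide is present at roughly a 16-fold molar excess over AglB while contributing only about 1% DMSO to the sample.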
This condition was optimized by varying buffer pH (0.1 M NaAc, pH 4.0-5.0), salt concentration (0.05-0.17 M NaCl), precipitant concentration (9-15% 2-propanol), peptide concentration (0.5-1 mM), drop size (2-4 µL), and protein-to-solution ratio in the drop. Unfortunately, around half of the conditions gave light precipitate while the rest remained clear with no crystals.

References

1. Varki, A. et al. Essentials of Glycobiology. (Cold Spring Harbor Laboratory Press, 2009). at <http://www.ncbi.nlm.nih.gov/books/NBK1908/>
2. Nothaft, H. & Szymanski, C. M. Protein glycosylation in bacteria: sweeter than ever. Nat. Rev. Microbiol. 8, 765-778 (2010).
3. Eichler, J. Extreme sweetness: protein glycosylation in archaea. Nat. Rev. Microbiol. 11, 151-156 (2013).
4. Larkin, A. & Imperiali, B. The Expanding Horizons of Asparagine-Linked Glycosylation. Biochemistry 50, 4411-4426 (2011).
5. Bause, E. Structural requirements of N-glycosylation of proteins. Studies with proline peptides as conformational probes. Biochem. J. 209, 331-336 (1983).
6. Kelleher, D. J. & Gilmore, R. An evolving view of the eukaryotic oligosaccharyltransferase. Glycobiology 16, 47R-62R (2006).
7. Wilson, C. M., Roebuck, Q. & High, S. Ribophorin I regulates substrate delivery to the oligosaccharyltransferase core. Proc. Natl. Acad. Sci. 105, 9534-9539 (2008).
8. Pathak, R., Hendrickson, T. L. & Imperiali, B. Sulfhydryl modification of the yeast Wbp1p inhibits oligosaccharyl transferase activity. Biochemistry 34, 4179-4185 (1995).
9. Yan, Q., Prestwich, G. D. & Lennarz, W. J. The Ost1p Subunit of Yeast Oligosaccharyl Transferase Recognizes the Peptide Glycosylation Site Sequence, -Asn-X-Ser/Thr-. J. Biol. Chem. 274, 5021-5025 (1999).
10. Schulz, B. L. et al. Oxidoreductase activity of oligosaccharyltransferase subunits Ost3p and Ost6p defines site-specific glycosylation efficiency. Proc. Natl. Acad. Sci. 106, 11061-11066 (2009).
11. Igura, M. & Kohda, D.
Quantitative assessment of the preferences for the amino acid residues flanking archaeal N-linked glycosylation sites. Glycobiology 21, 575-583 (2011).
12. Maita, N., Nyirenda, J., Igura, M., Kamishikiryo, J. & Kohda, D. Comparative Structural Biology of Eubacterial and Archaeal Oligosaccharyltransferases. J. Biol. Chem. 285, 4941-4950 (2010).
13. Yan, Q. & Lennarz, W. J. Studies on the Function of Oligosaccharyl Transferase Subunits: Stt3p is directly involved in the glycosylation process. J. Biol. Chem. 277, 47692-47700 (2002).
14. Lizak, C., Gerber, S., Numao, S., Aebi, M. & Locher, K. P. X-ray structure of a bacterial oligosaccharyltransferase. Nature 474, 350-355 (2011).
15. Matsumoto, S. et al. Crystal structures of an archaeal oligosaccharyltransferase provide insights into the catalytic cycle of N-linked protein glycosylation. Proc. Natl. Acad. Sci. 110, 17868-17873 (2013).
16. Jaffee, M. B. & Imperiali, B. Exploiting Topological Constraints To Reveal Buried Sequence Motifs in the Membrane-Bound N-Linked Oligosaccharyl Transferases. Biochemistry 50, 7557-7567 (2011).
17. Nothaft, H. & Szymanski, C. M. Bacterial Protein N-Glycosylation: New Perspectives and Applications. J. Biol. Chem. 288, 6912-6920 (2013).
18. Szymanski, C. M., Yao, R., Ewing, C. P., Trust, T. J. & Guerry, P. Evidence for a system of general protein glycosylation in Campylobacter jejuni. Mol. Microbiol. 32, 1022-1030 (1999).
19. Wacker, M. N-Linked Glycosylation in Campylobacter jejuni and Its Functional Transfer into E. coli. Science 298, 1790-1793 (2002).
20. Glover, K. J., Weerapana, E., Numao, S. & Imperiali, B. Chemoenzymatic Synthesis of Glycopeptides with PglB, a Bacterial Oligosaccharyl Transferase from Campylobacter jejuni. Chem. Biol. 12, 1311-1316 (2005).
21. Nita-Lazar, M., Wacker, M., Schegg, B., Amber, S. & Aebi, M. The N-X-S/T consensus sequence is required but not sufficient for bacterial N-linked protein glycosylation. Glycobiology 15, 361-367 (2005).
22. Chen, M.
M., Glover, K. J. & Imperiali, B. From peptide to protein: comparative analysis of the substrate specificity of N-linked glycosylation in C. jejuni. Biochemistry 46, 5579-5585 (2007).
23. Feldman, M. F. et al. Engineering N-linked protein glycosylation with diverse O antigen lipopolysaccharide structures in Escherichia coli. Proc. Natl. Acad. Sci. U. S. A. 102, 3016-3021 (2005).
24. Mescher, M. F. & Strominger, J. L. Purification and characterization of a prokaryotic glucoprotein from the cell envelope of Halobacterium salinarium. J. Biol. Chem. 251, 2005-2014 (1976).
25. Kaminski, L., Lurie-Weinberger, M. N., Allers, T., Gophna, U. & Eichler, J. Phylogenetic- and genome-derived insight into the evolution of N-glycosylation in Archaea. Mol. Phylogenet. Evol. 68, 327-339 (2013).
26. Igura, M. et al. Structure-guided identification of a new catalytic motif of oligosaccharyltransferase. EMBO J. 27, 234-243 (2008).
27. Matsumoto, S. et al. Crystal Structure of the C-Terminal Globular Domain of Oligosaccharyltransferase from Archaeoglobus fulgidus at 1.75 Å Resolution. Biochemistry 51, 4157-4166 (2012).
28. Nyirenda, J. et al. Crystallographic and NMR Evidence for Flexibility in Oligosaccharyltransferases and Its Catalytic Significance. Structure 21, 32-41 (2013).
29. Igura, M. et al. Purification, crystallization and preliminary X-ray diffraction studies of the soluble domain of the oligosaccharyltransferase STT3 subunit from the thermophilic archaeon Pyrococcus furiosus. Acta Crystallograph. Sect. F Struct. Biol. Cryst. Commun. 63, 798-801 (2007).
30. Vieille, C. & Zeikus, G. J. Hyperthermophilic Enzymes: Sources, Uses, and Molecular Mechanisms for Thermostability. Microbiol. Mol. Biol. Rev. 65, 1-43 (2001).
31. Calo, D., Kaminski, L. & Eichler, J. Protein glycosylation in Archaea: Sweet and extreme. Glycobiology 20, 1065-1076 (2010).
32. Jarrell, K. F., Jones, G. M. & Nair, D. B.
Biosynthesis and Role of N-Linked Glycosylation in Cell Surface Structures of Archaea with a Focus on Flagella and S Layers. Int. J. Microbiol. 2010, 1-20 (2010).
33. Lechner, J., Wieland, F. & Sumper, M. Biosynthesis of sulfated saccharides N-glycosidically linked to the protein via glucose. Purification and identification of sulfated dolichyl monophosphoryl tetrasaccharides from halobacteria. J. Biol. Chem. 260, 860-866 (1985).
34. Hartley, M. D. & Imperiali, B. At the membrane frontier: A prospectus on the remarkable evolutionary conservation of polyprenols and polyprenyl-phosphates. Arch. Biochem. Biophys. 517, 83-97 (2012).
35. Larkin, A. K. Investigation of asparagine-linked glycosylation in archaeal and bacterial systems. (Massachusetts Institute of Technology, 2010). at <http://dspace.mit.edu/handle/1721.1/62725>
36. Voisin, S. et al. Identification and Characterization of the Unique N-Linked Glycan Common to the Flagellins and S-layer Glycoprotein of Methanococcus voltae. J. Biol. Chem. 280, 16586-16593 (2005).
37. Krogh, A., Larsson, B., von Heijne, G. & Sonnhammer, E. L. L. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol. 305, 567-580 (2001).
38. Sharma, C. B., Lehle, L. & Tanner, W. N-Glycosylation of yeast proteins. Characterization of the solubilized oligosaccharyl transferase. Eur. J. Biochem. FEBS 116, 101-108 (1981).
39. Imperiali, B. & Hendrickson, T. L. Asparagine-linked glycosylation: Specificity and function of oligosaccharyl transferase. Bioorg. Med. Chem. 3, 1565-1578 (1995).
40. Ramirez, F. & Marecek, J. F. Coordination of magnesium with adenosine 5'-diphosphate and triphosphate. Biochim. Biophys. Acta BBA - Bioenerg. 589, 21-29 (1980).
41. Larkin, A., Chang, M. M., Whitworth, G. E. & Imperiali, B. Biochemical evidence for an alternate pathway in N-linked glycoprotein biosynthesis. Nat. Chem. Biol. 9, 367-373 (2013).
42. Schwarz, F. et al.
Relaxed acceptor site specificity of bacterial oligosaccharyltransferase in vivo. Glycobiology 21, 45-54 (2011).
43. Kuntz, C., Sonnenbichler, J., Sonnenbichler, I., Sumper, M. & Zeitler, R. Isolation and characterization of dolichol-linked oligosaccharides from Haloferax volcanii. Glycobiology 7, 897-904 (1997).
44. Guan, Z., Naparstek, S., Kaminski, L., Konrad, Z. & Eichler, J. Distinct glycan-charged phosphodolichol carriers are required for the assembly of the pentasaccharide N-linked to the Haloferax volcanii S-layer glycoprotein. Mol. Microbiol. 78, 1294-1303 (2010).
45. Cohen-Rosenzweig, C., Guan, Z., Shaanan, B. & Eichler, J. Substrate Promiscuity: AglB, the Archaeal Oligosaccharyltransferase, Can Process a Variety of Lipid-Linked Glycans. Appl. Environ. Microbiol. 80, 486-496 (2014).
46. Calo, D., Guan, Z., Naparstek, S. & Eichler, J. Different routes to the same ending: comparing the N-glycosylation processes of Haloferax volcanii and Haloarcula marismortui, two halophilic archaea from the Dead Sea. Mol. Microbiol. 81, 1166-1177 (2011).
47. Guan, Z., Meyer, B. H., Albers, S.-V. & Eichler, J. The thermoacidophilic archaeon Sulfolobus acidocaldarius contains an unusually short, highly reduced dolichyl phosphate. Biochim. Biophys. Acta BBA - Mol. Cell Biol. Lipids 1811, 607-616 (2011).
48. Matsumoto, S., Shimada, A. & Kohda, D. Crystal structure of the C-terminal globular domain of the third paralog of the Archaeoglobus fulgidus oligosaccharyltransferases. BMC Struct. Biol. 13, 11 (2013).
49. Newby, Z. E. R. et al. A general protocol for the crystallization of membrane proteins for X-ray structural investigation. Nat. Protoc. 4, 619-637 (2009).
50. Chae, P. S. et al. Maltose-neopentyl glycol (MNG) amphiphiles for solubilization, stabilization and crystallization of membrane proteins. Nat. Methods 7, 1003-1008 (2010).
51. Chen, M. M., Glover, K. J. & Imperiali, B.
From Peptide to Protein: Comparative Analysis of the Substrate Specificity of N-Linked Glycosylation in C. jejuni. Biochemistry 46, 5579-5585 (2007).

Chapter 4: Efforts towards establishing an experimental system for investigating AglB conformational dynamics by LRET

Introduction

Asparagine-linked glycosylation is a complex protein modification that is found among all domains of life, but is rare in bacteria, abundant in archaea, and essential in eukaryotes.1 Much is known about N-glycan assembly in eukaryotes and bacteria, where a polyprenyl-diphosphate-dependent pathway is initiated by a phospho-glycosyltransferase (PGT) and elaborated by a series of glycosyltransferases (GTs) to generate a polyprenyl-PP-linked glycan. Specifically, in the Campylobacter jejuni bacterial pathway, the oligosaccharyl transferase (OTase), known as PglB, utilizes the α-linked undecaprenyl-PP-heptasaccharide substrate as the glycosyl donor and transfers the glycan to the asparagine residue of acceptor proteins.2,3 In Methanococcus voltae, the archaeal OTase, known as AglB, utilizes the α-linked dolichyl-P-trisaccharide substrate as the glycosyl donor for transfer to the acceptor protein. This Dol-P-glycan is generated by an initial "retaining-stereochemistry" glycosyltransferase (AglK) and elaborated by additional glycosyltransferases (AglC and AglA) to afford Dol-P-GlcNAc-Glc-2,3-diNAcA-ManNAc(6Thr)A.4 Despite the sequence homology to C. jejuni PglB and other bacterial or eukaryotic OTases that exploit polyprenyl-PP-linked substrates, the M. voltae AglB efficiently transfers disaccharide to model peptides from Dol-P-GlcNAc-Glc-2,3-diNAcA.
While this archaeal pathway affords the same asparagine-linked β-glycosyl amide products generated in bacteria and eukaryotes, the work in Chapters 2 and 3 provides the first biochemical evidence that, despite the apparent similarities of the overall pathways, there are actually two general strategies to achieve N-linked glycoproteins across the domains of life (Figure 4-1). There is a strong need for detailed studies on the mechanistic and functional significance of archaeal adaptations of N-linked glycosylation, in particular to probe the structural, conformational, or chemical differences among AglB, PglB, and the other OTases that allow AglB to utilize these unique dolichyl-P-linked substrates rather than the more common dolichyl-PP-linked glycans that are ubiquitous in eukaryotic organisms from yeast to man. Biophysical studies exploring peptide and Dol-P-glycan binding to AglB should prove particularly illuminating.

Figure 4-1: N-linked glycosylation across the three domains of life. The diphosphate-dependent pathway used by eukaryotes and bacteria is initiated by a PGT, which generates a polyprenyl-PP-linked glycan. The monophosphate-dependent pathway used by selected archaea is initiated by a retaining GT, which generates a polyprenyl-P-linked glycan. In both pathways, GTs then build up the glycan for transfer by an OTase to an acceptor protein. Panel titles: diphosphate-dependent pathway (bacteria and eukaryotes); monophosphate-dependent pathway (archaea). Product label: β-glycosylamide-linked glycoproteins.
New methods have been developed that exploit the unique photophysical and electronic properties of lanthanide ions to provide new insight into protein structure, function, and dynamics.5 Selected lanthanide ions emit energy via luminescence, which encompasses fluorescence and phosphorescence (Figure 4-2). In particular, terbium and europium ions are valuable for biological studies because they emit light in the visible range, emit more intensely than other ions in the series, exhibit long (millisecond) excited-state lifetimes, and have high emission quantum yields.6 Many lanthanide ions exhibit advantageous luminescence emission spectra, but because the specific f-f electronic transitions are forbidden, the lanthanide ions must first be sensitized with appropriate organic fluorophores. Tryptophan is a convenient sensitizing fluorophore for Tb3+, so this amino acid can be used to sensitize Tb3+ in peptides and proteins.

[Figure 4-2 diagram: energy levels of the ligand, the central lanthanide ion, and the acceptor, annotated with the time scales of absorption, vibrational relaxation/internal conversion, fluorescence, and phosphorescence.]

Figure 4-2: Jablonski diagram of a ligand-lanthanide complex. The solid arrows represent absorption of a photon, the wavy arrows designate non-radiative transitions to excited levels, and the dashed arrows designate radiative emission. The lanthanide ion may also transfer its energy further to a suitable acceptor fluorophore. IC = internal conversion, ILCT = intra-ligand charge transfer, ISC = intersystem crossing, LMCT = ligand-to-metal charge transfer, IET = intramolecular energy transfer, RET = resonance energy transfer. Figure modified from Vuojola, et al.
Lanthanide-binding tags (LBTs) have been developed as genetically encodable, short peptide sequences comprising 15-20 naturally occurring amino acids that bind Tb3+ ions with high affinity (single-digit to low nM range). An important early study established that the 14-residue peptide corresponding to an EF-hand motif from calmodulin could form a luminescent Tb3+ chelate when the fluorescent tryptophan residue was incorporated at position 7 of the sequence. The LBT has six metal-binding residues that form a coordination sphere around the Tb3+ ion (Asp1, Asn3, Asp5, the carbonyl oxygen of Trp7, Glu9, and Glu12), with eight total oxygens binding Tb3+ (Figure 4-3A). These ligands can be manipulated to alter the lanthanide ion specificity of the LBT. Because these tags are composed exclusively of amino acids, LBTs can easily be fused to proteins of interest using standard molecular biology techniques. Analysis of LBT peptides revealed Kd values in the low nM range and a highly ordered chelate structure that includes only peptide-based ligands, without water in the inner complexation sphere, which is critical for minimizing luminescence quenching of the lanthanide.10 In luminescence experiments, the millisecond lifetimes of Tb3+ and Eu3+ emission provide greatly increased sensitivity and make it possible to eliminate background fluorescence from typical short-lived organic fluorophores, since time-gated data acquisition can be applied for the selective detection of lanthanide-tagged species.

[Figure 4-3, panels A and B.]

Figure 4-3: Lanthanide binding tag. (A) Ribbon structure of a 2.0-Å resolution X-ray crystal structure of a 17-residue lanthanide-binding peptide complexed with a Tb3+ ion. This figure was modified from Nitz, et al.10 (B) Depiction of LRET between an LBT and BODIPY fluorophore, where a tryptophan in the LBT loop is excited at 280 nm and sensitizes the Tb3+ luminescence.
LRET only occurs when the fluorophore is incorporated into the phosphopeptide ligand specific for the LBT-labeled SH2 domain. This figure was adapted from Sculimbrene, et al.12

Resonance energy transfer (RET) relies on direct, non-radiative energy transfer from donor to acceptor, and is a distance-dependent phenomenon that can be used to measure distances between sites separated by approximately 20-100 Å, depending on the fluorescence or luminescence properties of the donors and acceptors selected for the RET system.13 Luminescence resonance energy transfer (LRET), which uses lanthanide ions such as Tb3+ and Eu3+ as donors in energy transfer to organic-based acceptors, offers several technical advantages over fluorescence resonance energy transfer (FRET), which uses either small-molecule fluorophores or encoded fluorescent proteins. FRET can report on distances from 20-80 Å, but the distances measured can be rather imprecise, especially when fluorescent proteins are used, due to their large size. The energy transfer also depends on the relative orientation in space of the transition dipoles of the donor and acceptor fluorophores. Calculations using the Förster equation assume that this orientation is completely random, which may not be the case when the fluorophores are conjugated to proteins. The unpolarized emission of lanthanides allows for specific and efficient energy transfer. The error in distances measured via LRET due to the orientation factor is essentially negligible, allowing LRET to measure distances up to 100 Å more accurately than FRET.13 In addition, unlike organic fluorophores, lanthanide ions have long luminescence lifetimes and sharp emission spectra with large Stokes shifts (> 200 nm). LRET can be used to measure intermolecular and intramolecular distances as well as to detect two interacting moieties (Figure 4-3B).
The millisecond luminescence lifetime of the lanthanide ion can be exploited by collecting the emission after a short time delay of 50 µs. This time delay eliminates any background fluorescence from direct excitation of the acceptor and any background auto-fluorescence from the sample. LRET can be used to detect and quantify binding interactions and to provide information on the distance between the binding partners.13 The distance between the donor and acceptor can be calculated from the luminescence lifetime that is derived from the decay data.

LRET experiments with LBT-tagged AglB and strategically placed fluorophore labels, in the presence and absence of peptide or lipid-linked oligosaccharide (LLO) substrates, should provide valuable information regarding substrate binding and the resulting structural fluctuations (Figure 4-4). This will supplement the current understanding from the crystal structures of full-length PglB from Campylobacter lari and full-length AglB from Archaeoglobus fulgidus, along with additional structures of the C-terminal domains of other OTases.16-18 Several key features have been identified from these crystal structures, including the novel DK or MI motifs17 found in a kinked helix adjacent to the WWDXG motif, and the [I/V]XXX[S/T][I/V]XE motif found in EL5,19 which is the external loop thought to undergo a significant conformational change upon peptide substrate binding.20 To date, there is no structural data for M. voltae AglB or for any OTase confirmed to use polyprenyl-P-linked oligosaccharide substrates.

[Figure 4-4 schematic: LBT-AglB excited at 280 nm, with Tb3+ emission bands at 490, 544, and 580 nm and LRET to the acceptor.]

Figure 4-4: Intramolecular LRET of AglB. LBT-AglB undergoes LRET upon Tb3+ ion binding and excitation with 280 nm light. The acceptor BODIPY-TMR is strategically located on AglB loop regions to detect changes upon substrate binding.
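The effect of the time delay described above can be illustrated with a small calculation. The sketch below assumes simple mono-exponential decays and uses illustrative lifetimes that are assumptions rather than measured values: 2.6 ms for a millisecond-scale Tb3+ emitter and 5 ns for a typical organic fluorophore. It only shows why a 50 µs delay removes short-lived background while keeping nearly all of the lanthanide signal.

```python
import math


def surviving_fraction(delay_s: float, lifetime_s: float) -> float:
    """Fraction of a mono-exponential emission still present after a delay."""
    return math.exp(-delay_s / lifetime_s)


GATE_DELAY_S = 50e-6       # 50 microsecond delay quoted in the text
TB_LIFETIME_S = 2.6e-3     # assumed millisecond-scale Tb3+ lifetime (illustrative)
ORGANIC_LIFETIME_S = 5e-9  # assumed nanosecond lifetime of an organic fluorophore

tb_left = surviving_fraction(GATE_DELAY_S, TB_LIFETIME_S)
organic_left = surviving_fraction(GATE_DELAY_S, ORGANIC_LIFETIME_S)

print(f"Tb3+ emission remaining after the gate: {tb_left:.3f}")
print(f"Organic fluorescence remaining after the gate: {organic_left:.3e}")
```

Under these assumptions roughly 98% of the Tb3+ emission survives the gate, whereas the nanosecond fluorescence has decayed to a level that is numerically indistinguishable from zero.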
This chapter describes progress towards the application of LRET-based approaches for investigating the conformational dynamics of AglB upon peptide or LLO substrate binding and for understanding how its overall architecture compares to that of PglB, another OTase, which uses a crucially distinct LLO substrate. Uncertainty remains about the catalytic mechanism of OTases. It is still unclear whether the substrates bind in a specific order and which chemical groups on the protein function in binding and catalysis. Also unknown are the LLO binding site and whether the peptide-bound crystal structure of C. lari PglB represents the native state of binding. If the glycan substrate binds before the peptide substrate, saturation of the enzyme with the peptide substrate alone may produce a non-native conformation. LRET studies will begin to tackle these important questions.

Results and discussion

Generation of LBT-AglB construct

For LRET studies, M. voltae AglB was generated with an N-terminal LBT coexpression tag and a C-terminal His6 tag. The LBT residues (FIDTNNDGWIEGDELLA) are encoded by a nucleotide sequence optimized for improved expression, and exhibit a high affinity for terbium (Kd = 18 nM).10 LBT-AglB-His6 in a modified pET24a vector was expressed in BL21 (DE3) RIL cells and solubilized with Triton X-100 detergent before purification and buffer exchange into n-dodecyl-β-D-maltoside (DDM) detergent (Figure 4-5A). Before commencing LRET experiments, it was important to check that appending the LBT to the protein would not negatively affect its affinity for Tb3+. Upon titration of LBT-AglB with Tb3+ ion and excitation at 280 nm, the emission spectra showed the expected increase in luminescence intensity (Figure 4-5B). The LBT was saturated with Tb3+ (Figure 4-5C), and the time-gated luminescence decay was measured (Figure 4-5D) and fit to equation (1), where I is the luminescence intensity at time t, and τ is the luminescence lifetime.
$$I(t) = I(0)\exp(-t/\tau) \tag{1}$$

Luminescence decay was detected at 543 nm and the lifetime was calculated to be 2.60 ms, which is comparable to lifetimes found for other LBT-labeled proteins.12 AglB requires Mn2+ for catalytic activity, but Mn2+ may quench lanthanide luminescence.8 When Mg2+ was used as a divalent cation substitute for Mn2+, there was no difference in luminescence intensity or in lifetime values with or without Mg2+, but it is unclear whether any Mg2+ is even bound in the metal binding site.

[Figure 4-5 panels: (A) gel image with kDa markers; (B) emission intensity vs. wavelength (nm) for 0 to 5.13 µM Tb3+; (C) emission intensity vs. Tb3+ (µM); (D) emission intensity vs. time (ms) with exponential fit.]

Figure 4-5: (A) Coomassie-stained SDS-PAGE gel of LBT-AglB purification on Ni-NTA. (B) Emission spectra of 5 µM LBT-AglB upon Tb3+ titration. (C) Luminescence intensity at 543 nm upon Tb3+ titration. (D) Luminescence decay measured at 543 nm, with τ = 2.60 ms.

For all resonance energy transfer experiments, the choice of a suitable donor-acceptor pair is critical. Sufficient overlap between the emission spectrum of the donor (Tb3+) and the absorption spectrum of the acceptor fluorophore is necessary to ensure measurable energy transfer, and therefore a value of R0 that is commensurate with the target measurements. The distance, R, between Tb3+ and the acceptor fluorophore can be calculated as a function of R0 (the Förster distance) and E (the fraction of energy transferred), as shown in equation (2). Values for E and R0 can be determined as outlined in equations (3) and (4), respectively.

$$R = R_0\left[(1/E) - 1\right]^{1/6} \tag{2}$$

$$E = 1 - \tau_{DA}/\tau_D \tag{3}$$

$$R_0 = 0.211\left(\kappa^2 \eta^{-4} Q_D J\right)^{1/6} \tag{4}$$

$$J = \frac{\sum\left[F_D(\lambda)\,\varepsilon(\lambda)\,\lambda^4\,\Delta\lambda\right]}{\sum\left[F_D(\lambda)\,\Delta\lambda\right]} \tag{5}$$

The parameter τD is the lifetime of the donor alone. For this system, it is the lifetime of Tb3+ chelated by LBT-AglB.
Similarly, τDA is the lifetime of the donor (Tb3+) in the presence of the acceptor fluorophore. R0 is unique to each donor-acceptor pair, and is the distance between the donor and the acceptor at which E = 0.5. The orientation factor, κ², is taken as 2/3, because the lanthanide donor's emission is unpolarized and the acceptor fluorophore will likely rotate completely during the lanthanide's millisecond lifetime. The refractive index, η, is 1.4 for biological samples in water. The quantum yield of the donor, QD, for the LBT is τD/τTb, where τTb is 4.75 ms. Finally, the spectral overlap term, J, can be calculated according to equation (5), where FD(λ) is the emission spectrum of the donor and ε(λ) is the extinction coefficient of the acceptor.

Generation of LBT-AglB cysteine mutants

For LRET studies with LBT-AglB, an acceptor fluorophore must be incorporated in a region that will report on substrate binding. AglB exhibits poor peptide substrate binding, with apparent Km values around 0.5 to 1 mM, which is approximately 1000-fold poorer affinity than peptide binding to C. jejuni PglB. The peptide is not a good target for fluorophore labeling because the high concentration of peptide required for binding could create high background in the LRET studies. Instead, the loop regions of LBT-AglB are good targets for fluorophore labeling. Binding of unlabeled peptide would still result in LRET and lifetime changes if the fluorophore locations are chosen in areas likely to undergo large conformational changes upon substrate binding. The fluorophore labeling strategy requires free thiols at only the desired positions. Based on the predicted membrane topology for AglB (Figure 4-6A) and alignment with C. lari PglB (PDB code: 3RCE),16 six residues were chosen for mutagenesis to cysteine, and the two native cysteine residues needed to be mutated to serine to eliminate unwanted labeling (Figure 4-6B).
The six residues chosen for mutagenesis lie within predicted dynamic loop regions but are not highly conserved, so that essential residues are left unchanged. While many fluorophores have spectral overlap with one of the three Tb3+ emission bands at 490, 544, and 580 nm, the 4,4-difluoro-4-bora-3a,4a-diaza-s-indacene (BODIPY) family of fluorophores is especially attractive because its members have high photostability at pH greater than 2, large extinction coefficients, and high quantum yields. BODIPY-TMR maleimide (Figure 4-6C) was chosen as the acceptor fluorophore because it is thiol reactive, overlaps with the Tb3+ emission at 544 nm, and has previously been used in combination with the LBT in LRET experiments.12 R0 has been determined to be 50.9 Å, which falls conveniently within the range of distances predicted in these experiments.

[Figure 4-6 panels: (A) TMHMM posterior probabilities plotted against residue number (outside, inside, transmembrane); (B) positions of C35S, C590S, Y88C, W138C, V203C, P356C, T367C, and K602C; (C) chemical structure.]

Figure 4-6: (A) Membrane topology prediction plot for AglB. (B) Illustration of Cys to Ser mutants and the addition of Cys at desired locations. (C) Structure of BODIPY-TMR maleimide. λex = 544 nm, λem = 570 nm.

The double Cys to Ser mutant (C35S/C590S) was successfully produced through two QuikChange reactions. LBT-AglB-C35S/C590S was expressed in BL21 (DE3) RIL cells and purified in the presence of DDM, as before for LBT-AglB (Figure 4-7A). The activity of the M. voltae AglB mutant was confirmed with the standard OTase assay using Ac(YKYNESSYKpNF)NH2 (pNF = para-nitrophenylalanine) and Dol-P-GlcNAc-[3H]Glc-2,3-diNAcA as standard substrates (Figure 4-7B). The background level of Dol-P-GlcNAc-[3H]Glc-2,3-diNAcA hydrolysis did not increase over the time of the assay.
Removing the two native cysteine residues did not diminish activity, so the AglB-C35S/C590S mutant is most likely still folded correctly and remains a good model for LRET studies.

[Figure 4-7 panels: (A) gel image with kDa markers, lanes P, FT, W1, W2, 1-5; (B) assay time course with time (min) on the x-axis.]

Figure 4-7: (A) Coomassie-stained SDS-PAGE gel of LBT-AglB-C35S/C590S purification on Ni-NTA. (B) OTase assay of LBT-AglB-C35S/C590S shows that the double Cys mutant is still active.

Next, the desired Cys mutants (Y88C, W138C, V203C, P356C, T367C, or K602C) were generated. All further discussion of these mutants refers to the LBT-AglB-C35S/C590S protein incorporating a single Cys mutation at the indicated residue. The P356C and T367C mutants are the most interesting, as they are located in the equivalent of the EL5 loop of C. lari PglB, which is predicted to undergo large structural changes upon peptide binding. In the PglB X-ray structure, EL5 is only partially ordered, with residues 283 through 306 disordered in the electron density map. EL5 is proposed to be involved in peptide binding and catalysis.20 Labeling at P356C and T367C would provide dynamic information that is unavailable from the static crystallographic data. Both the P356C and T367C mutants were expressed in BL21 (DE3) RIL cells and purified with DDM (Figure 4-8A). The purified mutants were labeled with BODIPY-TMR. A Ni-NTA column was used to separate free, unreacted fluorophore from labeled LBT-AglB, which was visualized by Western blot (Figure 4-8B) and under UV light (Figure 4-8C). The yield of purified LBT-AglB labeled at P356C was 0.36 mg per liter of culture, and the yield of purified LBT-AglB labeled at T367C was 0.48 mg per liter of culture. Some protein was visibly lost to the membranes of the Centricon filters during concentration. BODIPY-TMR was quantified using its extinction coefficient (ε = 60,000 M⁻¹ cm⁻¹ at 544 nm). Labeling efficiency of LBT-AglB was 61.2% at the P356 position and 42.1% at the T367 position.
An advantage of LRET over FRET is that lifetime measurements avoid the problem of labeling inefficiencies.

[Figure 4-8 panels: (A) gel images for P356C and T367C with kDa markers, lanes P, FT, W1, W2, E1-E4; (B) Western blot; (C) gel under UV light.]

Figure 4-8: (A) Coomassie-stained SDS-PAGE gel of the Ni-NTA purification of LBT-AglB-C35S/C590S/P356C and LBT-AglB-C35S/C590S/T367C. (B) Anti-His6 Western blot of purified BODIPY-TMR labeled P356C and T367C. (C) SDS-PAGE of purified, fluorescent BODIPY-TMR labeled P356C and T367C under UV light.

LRET with fluorophore-labeled LBT-AglB

One of the benefits of LRET over FRET is the ability to use gated experiments to eliminate background fluorescence from direct excitation of the acceptor fluorophore. There is a significant amount of direct excitation of BODIPY-TMR, but this can be completely eliminated from our experiments using a 50 µs gate (Figure 4-9). Therefore, any signal detected at 570 nm is entirely due to resonance energy transfer from the LBT to the BODIPY-TMR and not to direct excitation. While excitation at 280 nm is not optimal for biological samples, no deleterious effects were observed in these in vitro experiments. Additionally, the luminescence experiments are conducted with pulsed excitation instead of continual irradiation, which minimizes exposure.

[Figure 4-9 plot: emission intensity vs. wavelength (nm), with no delay and with a 50 µs delay.]

Figure 4-9: Emission spectrum of LBT-AglB-T367C-BODIPY-TMR with excitation at 280 nm, with no delay or a 50 µs delay after the flash.

For LBT-AglB-P356C-BODIPY-TMR, titration of Tb3+ ion resulted in an increase of emission intensities at the three Tb3+ emission bands, but there was also a marked increase at 570 nm, the emission maximum for BODIPY-TMR (Figure 4-10A,B). The same results were observed for LBT-AglB-T367C-BODIPY-TMR (Figure 4-11A,B). Luminescence lifetimes were also measured (Figures 4-10C and 4-11C).
All of these experiments were conducted using a 50 µs gate. In the presence of an acceptor fluorophore, the luminescence decay curve must be fit to the biexponential in equation (6).

$$I(t) = I(0)_1\exp(-t/\tau_1) + I(0)_2\exp(-t/\tau_2) \tag{6}$$

After chemical modification, there are two distinct species in the LRET experiments: the LBT-tagged protein with Tb3+ bound and no acceptor fluorophore, and the corresponding species with a fluorophore attached to the single targeted cysteine. In the biexponential decay, one component reflects the LBT donor alone and the other component reflects the LBT donor influenced by the acceptor fluorophore. The proportions of the two species can be determined from their pre-exponential amplitudes, but the lifetimes remain unchanged regardless of the extent of donor and acceptor incorporation. Donor-only and acceptor-only species do not contribute contaminating backgrounds to the sensitized-emission lifetime measurements. This lack of sensitivity to incomplete labeling would be particularly useful in cellular applications, where complete labeling and purification cannot usually be achieved.13

[Figure 4-10 panels: (A) emission intensity vs. wavelength (nm) for 0 to 5.13 µM Tb3+; (B) emission intensity vs. Tb3+ (µM); (C) emission intensity vs. time (ms) with biexponential fit.]

Figure 4-10: (A) Time-gated emission spectra of 5 µM BODIPY-TMR labeled P356C upon Tb3+ titration. LRET is detected as the absorption of BODIPY-TMR at 544 nm and fluorescence emission at 570 nm. (B) Luminescence intensity at 543 nm upon Tb3+ titration. (C) Luminescence decay measured at 543 nm, with τ1 = 2.70 ± 0.07 ms, τ2 = 1.00 ± 0.05 ms, and calculated R = 46.6 ± 1.8 Å.
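The distance reported for the P356C construct can be checked against equations (2) and (3). The sketch below is illustrative only. It assumes that the longer fitted component, τ1, is taken as the donor-only lifetime τD and the shorter component, τ2, as τDA, and it uses the R0 of 50.9 Å given earlier in the chapter; the function names are hypothetical. Under that assumption the calculation returns approximately the reported 46.6 Å.

```python
def transfer_efficiency(tau_da_ms: float, tau_d_ms: float) -> float:
    """Equation (3): E = 1 - tau_DA / tau_D."""
    return 1.0 - tau_da_ms / tau_d_ms


def donor_acceptor_distance(r0_angstrom: float, efficiency: float) -> float:
    """Equation (2): R = R0 * (1/E - 1)**(1/6)."""
    return r0_angstrom * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


R0_ANGSTROM = 50.9  # Forster distance quoted for the Tb3+ / BODIPY-TMR pair

# Fitted lifetimes (ms) for BODIPY-TMR labeled P356C.
# Assumption: tau_1 is the donor-only species, tau_2 the donor with acceptor.
tau_d_ms, tau_da_ms = 2.70, 1.00

efficiency = transfer_efficiency(tau_da_ms, tau_d_ms)
distance = donor_acceptor_distance(R0_ANGSTROM, efficiency)

print(f"E = {efficiency:.3f}")
print(f"R = {distance:.1f} angstrom")
```

Because R depends on the sixth root of (1/E - 1), modest uncertainties in the fitted lifetimes translate into comparatively small uncertainties in the distance.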
[Figure 4-11 panels: (A) emission intensity vs. wavelength (nm) for 0 to 5.13 µM Tb3+; (B) emission intensity vs. Tb3+ (µM); (C) emission intensity vs. time (ms) with biexponential fit.]

Figure 4-11: (A) Time-gated emission spectra of 5 µM BODIPY-TMR labeled T367C upon Tb3+ titration. LRET is detected as the absorption of BODIPY-TMR at 544 nm and fluorescence emission at 570 nm. (B) Luminescence intensity at 543 nm upon Tb3+ titration. (C) Luminescence decay measured at 543 nm, with τ1 = 2.54 ± 0.05 ms, τ2 = 0.86 ± 0.04 ms, and calculated R = 45.5 ± 1.5 Å.

Luminescence studies with LBT-AglB and peptide substrate

The R values determined with the two BODIPY-TMR labeled LBT-AglB proteins (R is 46.6 ± 1.8 Å for P356C and 45.5 ± 1.5 Å for T367C) show that the distances between the LBT and the two positions within the same EL5 loop are within error of each other. Both of these R values should change dramatically upon peptide binding. In order to learn about the conformational dynamics of the loop regions and their response to substrate addition, unlabeled peptide substrate was added to LBT-AglB-T367C-BODIPY-TMR saturated with Tb3+. Upon addition of 1 mM Ac(YKYNFTSYKRR)NH2 peptide, the luminescence intensity decreased sharply, by around 10-fold (Figure 4-12). Peptide binding should affect luminescence lifetimes but should leave emission intensities unchanged.

[Figure 4-12 panels: (A) emission intensity vs. wavelength (nm); (B) emission intensity vs. time (ms); traces for 0 µM Tb3+, 5 µM Tb3+, and 5 µM Tb3+ with 1 mM peptide.]

Figure 4-12: (A) Emission spectra and (B) luminescence decay of 5 µM LBT-AglB-T367C-BODIPY-TMR and 5 µM Tb3+ ion exhibit quenching upon addition of 1 mM Ac(YKYNFTSYKRR)NH2 peptide.

The quenching of luminescence signal by peptide was concerning.
After eliminating DMSO and pH as possible causes, we observed that this phenomenon was not unique to our LBT-AglB system. A model LBT-labeled ubiquitin (LBT-Ub) protein also exhibited the same quenching at high concentrations of peptide. However, the identity of the peptide did affect the extent of luminescence quenching (Figure 4-13). The longer peptide sequence, Ac(YKYNFTSYKRR)NH2 (Km = 0.86 ± 0.19 mM), is more like the native flagellar sequence found in M. voltae. The shorter peptide, Ac(YFNFTGRR)NH2 (Km = 1.45 ± 0.35 mM), is the composite best peptide from the first OTase peptide library described in the previous chapter (Figure 3-19). The Michaelis constant, Km, is the substrate concentration at which the reaction velocity is half-maximal, and is an inverse measure of the substrate affinity for the enzyme: a small Km indicates high affinity, meaning that the rate approaches Vmax at lower substrate concentrations. The value of Km depends on both the enzyme and the substrate and is also a function of temperature and pH. Under the constant experimental conditions, including the identical enzyme preparation, at which all of our Km values were determined, Km approximates the apparent dissociation constant, Kd. The higher the Km value, the higher the concentration of peptide necessary to achieve the same level of binding in the LRET studies. After a systematic study, it was established that such a large excess of peptide relative to LBT-protein perturbs the chelated Tb3+ in some manner, perhaps through competitive binding of the peptide amide backbone to Tb3+ at the very high peptide concentrations needed to saturate the AglB peptide-binding site. Consistent with this, the longer peptide substrates appear to have the largest negative effect, in keeping with the hypothesis that more amide backbone available for nonspecific binding to Tb3+ is more detrimental.
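The relationship between Km and the peptide concentration needed for binding can be made concrete with the Michaelis-Menten saturation expression. The sketch below is a rough illustration rather than an analysis from the thesis. It assumes, as stated above, that Km approximates the apparent Kd, so that [S]/(Km + [S]) estimates the fraction of enzyme with peptide bound. The function name and the chosen peptide concentrations are illustrative assumptions.

```python
def fraction_bound(substrate_mM: float, km_mM: float) -> float:
    """Michaelis-Menten saturation: v/Vmax = [S] / (Km + [S])."""
    return substrate_mM / (km_mM + substrate_mM)


# Apparent Km values (mM) quoted in the text for the two acceptor peptides.
peptide_km_mM = {
    "Ac(YKYNFTSYKRR)NH2": 0.86,
    "Ac(YFNFTGRR)NH2": 1.45,
}

# Illustrative peptide concentrations (mM), chosen only for this sketch.
concentrations_mM = (0.25, 0.5, 1.0)

for name, km in peptide_km_mM.items():
    for conc in concentrations_mM:
        occupancy = fraction_bound(conc, km)
        print(f"{name}: {conc} mM peptide -> {occupancy:.2f} of sites occupied")
```

Even at 1 mM, the tighter-binding peptide is expected to occupy only slightly more than half of the binding sites under this approximation, which is why millimolar peptide concentrations are difficult to avoid with these substrates.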
It should also be noted, however, that the longer peptide substrates generally show lower Km values. Therefore, with the current peptide substrates there is an apparent trade-off between tighter binding and greater competition for Tb3+ binding. Having an excess of Tb3+ did appear to boost emission intensity overall, but the quenching effect from abundant peptide never diminished, and the lifetime measurements were still perturbed to the extent that they were not of value.

[Figure 4-13 panels: (A, B) emission intensity vs. wavelength (nm) for 0 µM Tb3+, 20 µM Tb3+, and 20 µM Tb3+ with 0.25 or 0.5 mM peptide; (C) emission intensity vs. peptide concentration (mM) for YFNFTGRR and YKYNFTSYKRR.]

Figure 4-13: Loss of luminescence signal with addition of peptides (A) Ac(YFNFTGRR)NH2 and (B) Ac(YKYNFTSYKRR)NH2 to 10 µM LBT-Ub. (C) Comparison of emission intensities at different concentrations of peptides.

New peptide library for LRET experiments

PglB from C. jejuni has an optimized peptide substrate, Ac(DQNATpNF)NH2, with an apparent Km of 0.8 ± 0.11 µM.23 In contrast, for AglB from M. voltae, the peptide substrates identified to date have apparent Km values around 1 mM. Therefore, to populate the peptide binding site of AglB, we predict that we would need to use 2-3x Km in the LRET studies. This presents a major problem because at such high concentrations there is considerable competition for binding to Tb3+, which reduces the amount of the Tb3+-bound LBT species. One way to address this issue is to design a peptide with higher affinity, which would enable lower peptide concentrations to be used in the luminescence studies. Alternatively, another strategy to eliminate the quenching effect may be to physically isolate the LBT from peptide. For
For 127 example, this could be achieved through liposomes in which LBT-AglB has been directionally inserted with the LBT located inside and the peptide-binding region outside the liposome. Thus far, we have not explored these other strategies as we have focus on identifying an improved peptide to minimize luminescence quenching. Bacterial peptide substrates have the extended sequon requirement of D/E-X 1 -N-X 2-S/T, while eukaryotic and archaeal sequons only require N-X-S/T, where X cannot be proline. While the extended sequon in bacteria may help explain the low micromolar Km values seen for PgIB, approaching low micromolar binding in archaea should be possible. For example, in Pyrococcus furiosus, the results of library experiments revealed that the optimal acceptor sequence was PYNVTK, with a Km of 10 pM. 24 Therefore, further optimization of peptide sequences should result in an acceptor with a lower Km. The first peptide library we designed, originally conceived to find a peptide suitable for crystallography studies, included 23 peptides with the sequences based on Ac(YX 1 NX 2TX3 RR)NH 2. The fixed Tyr residue was for quantification purposes and the two Arg residues were to ensure high solubility. The N-terminus was acetylated and the Cterminus contained an amide instead of a carboxylic acid to better mimic the peptide backbone. A second peptide library consisting of 14 peptides was designed based on H2N(X 1X 2NX 3TX 4)NH 2 . Each peptide in this library contains a lysine for solubility and a tyrosine or phenylalanine for quantification. The N-terminus remained as the free amine to improve solubility, while the C-terminus was again capped as a primary amide. These sequences account for the preference of certain amino acid residues at particular positions when comparing all known glycosylated M voltae and M maripaludis flagellar sequences. This second peptide library was made by Boston Open Labs. 
As with the first peptide library, AglB was assayed with the second peptide library at 0.5 mM peptide substrate concentration. The screen revealed which residues at which positions resulted in the highest initial rates. The results of both peptide libraries are presented in Figure 4-14. In addition, a shorter peptide, Ac(SINATpNF)NH2, was determined to have a Km value of 0.42 ± 0.16 mM, which is actually lower than that of the longer peptides. This result was encouraging for our search for a tighter-binding short peptide sequence. Given that Ac(SINATpNF)NH2 is promising, the new peptides Ac(SINTSApNF)NH2, Ac(SFNTSApNF)NH2, and Ac(SINGTNpNF)NH2 were made by standard solid-phase peptide synthesis, purified by HPLC, and assayed with AglB. Unfortunately, solubility was an issue with these peptides during the AglB assay, in which conspicuous precipitation occurred at higher peptide concentrations. This rendered the Km values for these peptides with AglB unreliable. Fmoc-NH-(PEG)-COOH was appended to the N-terminus to increase the solubility of these peptides, but these peptides did not reach saturation at low millimolar concentrations.

[Figure 4-14 panels: (A, B) bar charts of initial rate for each library peptide.]

Figure 4-14: AglB acceptor peptide library screen with (A) 23 peptides with N-terminal acetylation and (B) 14 peptides with free N-terminal amines. Initial rates were determined at 0.5 mM peptide concentration.

Peptide libraries 1 and 2 provided useful information about the amino acid preference at each position according to the template X0-X1-X2-X3-X4-X5 (Table 4-1). Based on this information, two new composite peptides were designed with an acetylated, benzoylated, or free amine at the N-terminus and an amide at the C-terminus.
Benzoylation of the N-terminus of the OT peptide substrate in yeast has been shown to enhance peptide-binding affinity relative to acetylation. The six new manually synthesized peptides were NH2(TFNETS)NH2, Ac(TFNETS)NH2, Bz(TFNETS)NH2, NH2(TFNFTS)NH2, Ac(TFNFTS)NH2, and Bz(TFNFTS)NH2. These peptides were assayed with AglB and the Km values were determined (Table 4-2). The TFNFTS peptides were not very soluble, and a Km value could not be determined for the benzoylated peptide due to precipitation.

Table 4-1: Amino acid preference at each position of the X0-X1-X2-X3-X4-X5 peptide.

Position    Library 1                      Library 2
X0          --                             T > L > P
X1          F > Q > L > E > G > K          F > I ~ Y > S
X2          N                              N
X3          F > E >> Q > L > G/K           E > F; E > T; T > E; F >> S
X4          T > S                          T
X5          G > K ~ A > L ~ Q > E > F      G ~ N; S >> N

Table 4-2: Apparent Km values of best AglB peptide substrates.

Peptide               Km (mM)
Ac-YFNITIRR-NH2       high
Ac-YFNFTGRR-NH2       1.4 ± 0.3
H2N-TFNFTK-NH2        0.5 ± 0.1
H2N-TINYTK-NH2        1.2 ± 0.4
H2N-KYNTTS-NH2        1.6 ± 0.4
H2N-TFNETS-NH2        1.2 ± 0.1
Ac-TFNETS-NH2         2.4 ± 0.8
H2N-TFNFTS-NH2        0.5 ± 0.1
Ac-TFNFTS-NH2         0.5 ± 0.2
Bz-TFNETS-NH2         0.34 ± 0.08
Bz-TFNFTS-NH2         --

Luminescence studies with LBT-AglB and short acceptor peptides

The new, short peptides were tested with LBT-ubiquitin to determine whether they also promote luminescence quenching. One peptide, Ac(TFNFTS)NH2, has a relatively low Km of 0.5 mM and shows the least luminescence quenching upon addition to LBT-Ub (Figure 4-15A). Bz(TFNETS)NH2, the peptide with the lowest Km at 0.3 mM, unfortunately exhibits a high amount of luminescence quenching (Figure 4-15B). Future LRET investigations will employ Ac(TFNFTS)NH2 as the acceptor peptide substrate. While a low micromolar affinity would allow for intermolecular LRET studies with fluorophore-labeled peptide substrates, a Km of 0.5 mM is still reasonable for intramolecular LRET studies with unlabeled peptide and fluorophore-labeled loop regions of AglB.
Figure 4-15: Emission spectra of 10 µM LBT-Ub with 10 µM Tb3+ and addition of peptides (A) Ac(TFNFTS)NH2 and (B) Bz(TFNETS)NH2.

Conclusions

Biophysical techniques such as FRET and LRET are useful tools for examining protein-protein and protein-substrate interactions. This chapter describes efforts towards developing an intramolecular LRET system that can report on substrate binding and the resulting structural transformations in AglB. Six LBT-AglB constructs were successfully generated with unique Cys mutations in loop regions for fluorophore labeling. Two of these mutants were labeled with BODIPY-TMR, and LRET was successfully observed between the Tb3+-chelated LBT and the fluorophore. We have overcome difficulties with nonspecific luminescence quenching upon addition of peptide, and have now identified a favorable peptide with which to move forward with LRET experiments. The lipid-linked oligosaccharide has not yet been explored in LRET studies, but it remains a valuable target for studies with AglB in conjunction with the peptide substrate. In addition, the most insight into the AglB enzymatic mechanism may come from comparative LRET studies with the corresponding PglB system. Investigations of C. jejuni and C. lari LBT-PglB with BODIPY-TMR-labeled peptide substrates are ongoing by Dr. Monika Musial-Siwek. There are still several unanswered questions concerning conformational dynamics and the order of substrate binding that may be better explored by LRET than by crystallography. It will be especially intriguing to discover possible explanations for how AglB selectively acts upon Dol-P-glycan while PglB utilizes Und-PP-glycan.
Perhaps these experiments will elucidate differences in active site metal binding between AglB and PglB that could contribute to the discrimination of LLO substrates.

Acknowledgements

Dr. Marcie Jaffee did a substantial amount of work to begin LRET investigations in the C. jejuni LBT-PglB system. I am thankful to her for the modified LBT-pET24a vector used in these studies and for technical advice with the phosphorimeter. I am thankful to Dr. Monika Musial-Siwek for discussions about data analysis and for her continuing investigations of LBT-PglB.

Experimental methods

LBT-AglB construct cloning, expression, and purification

Dr. Marcie Jaffee provided the DNA sequence-optimized LBT-PglB/pET24a vector, from which PglB was excised and into which AglB was cloned using BamHI and XhoI restriction enzymes. The LBT-AglB/pET24a plasmid was transformed into BL21 (DE3) RIL cells (Agilent) using kanamycin and chloramphenicol for selection. LB media (1 L) supplemented with antibiotics was inoculated with a 5 mL starter culture and incubated at 37 °C with shaking until an optical density (600 nm) of 0.6-0.8 AU was obtained. The cultures were then cooled to 16 °C, and protein expression was induced with 1 mM IPTG. After 16 h, the cells were harvested by centrifugation (4000 x g). For LBT-AglB purification, cell pellets were resuspended in 5% of the original culture volume in lysis buffer (50 mM HEPES, pH 7.5/300 mM NaCl) and protease inhibitor cocktail (Calbiochem), then were lysed by sonication. To prepare the cell membrane fraction, the cell lysate was spun down to remove cellular debris (6000 x g, 30 min), followed by pelleting of the membranes (142,000 x g, 1 h). The membrane pellet was then solubilized in 0.25% of the original culture volume in 50 mM HEPES, pH 7.5/150 mM NaCl/10 mM imidazole supplemented with 1% Triton X-100 for 2 h. After a high-speed spin (142,000 x g, 65 min), the resulting supernatant was incubated with Ni-NTA for 2 h.
The resin was washed with 50 mM HEPES, pH 7.5/100 mM NaCl/20 mM imidazole/0.05% DDM and then 50 mM HEPES, pH 7.5/100 mM NaCl/35 mM imidazole/0.05% DDM. LBT-AglB was eluted with 50 mM HEPES, pH 7.5/100 mM NaCl/300 mM imidazole/0.05% DDM and exchanged into 50 mM HEPES, pH 7.5/100 mM NaCl/10% glycerol using a 50 kDa cutoff centricon filter. The DDM micelles are retained. A final exchange in a 100 kDa cutoff filter concentrated the protein and reduced the amount of DDM for luminescence experiments. All purification steps were performed at 4 °C. Protein concentration was determined by UV absorbance at 280 nm (LBT-AglB: 106,130 g/mol; ε = 174,200 M^-1 cm^-1).

LBT-AglB and LBT-Ub luminescence experiments

Luminescence experiments were conducted on a Horiba Jobin Yvon Fluoromax-P in 1 cm path-length quartz cells. Sensitization of Tb3+ was carried out by exciting tryptophan at 280 nm and recording the luminescence at 544 nm. A 315 nm long-pass filter was used to eliminate interference from harmonic doubling. To obtain titration curves, Tb3+ was added in small volumes to 150 µl of 5-10 µM LBT-AglB or LBT-Ub. LBT-AglB and other labeled constructs were diluted from concentrated stocks into buffer containing 50 mM HEPES, pH 7.4/100 mM NaCl/5% glycerol/0.01% DDM. This serves to reduce the detergent levels to near the CMC. Phosphorimeter emission acquisition parameters are as follows: excitation = 280 nm, scan start = 450 nm, scan end = 600 nm, increment = 1 nm, number of scans = 2, delay after flash = 0.05 ms, sample window = 10 ms, time per flash = 40 ms, number of flashes = 100. Phosphorimeter decay acquisition by delay parameters are as follows: initial delay = 0.05 ms, delay increments = 0.06 ms, max delay = 11.99 ms, excitation = 280 nm, emission = 544 nm, number of scans = 2, sample window = 20 ms, time per flash = 70 ms, number of flashes = 10. These data were fit using Excel or KaleidaGraph as appropriate.
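As an illustration of how decay data collected with these delay settings might be analyzed, the sketch below builds the delay grid from the stated parameters (0.05 ms initial delay, 0.06 ms increments, 11.99 ms maximum) and fits a single-exponential lifetime by log-linear least squares. It is a sketch under stated assumptions, not the fitting procedure used in this work: the single-exponential model, the 2.5 ms lifetime, and the synthetic intensities are all hypothetical. The efficiency helper uses the standard donor-lifetime relation E = 1 - τ_DA/τ_D, which is not stated in this section.

```python
import math


def delay_grid(start_ms=0.05, step_ms=0.06, stop_ms=11.99):
    """Delay times (ms) from the decay-by-delay acquisition settings."""
    n_points = int(round((stop_ms - start_ms) / step_ms)) + 1
    return [start_ms + i * step_ms for i in range(n_points)]


def fit_lifetime(delays_ms, intensities):
    """Fit I(t) = I0 * exp(-t / tau) by least squares on ln(I) versus t.

    Returns (I0, tau_ms). Assumes a single-exponential decay and
    strictly positive, background-subtracted intensities.
    """
    ys = [math.log(i) for i in intensities]
    n = len(delays_ms)
    mean_x = sum(delays_ms) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in delays_ms)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(delays_ms, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -1.0 / slope


def lret_efficiency(tau_donor_ms, tau_donor_acceptor_ms):
    """Transfer efficiency from donor lifetimes without and with acceptor."""
    return 1.0 - tau_donor_acceptor_ms / tau_donor_ms


delays = delay_grid()

# Hypothetical lifetime and synthetic, noise-free signal; NOT measured data.
TAU_TRUE_MS = 2.5
signal = [1000.0 * math.exp(-t / TAU_TRUE_MS) for t in delays]

i0_fit, tau_fit = fit_lifetime(delays, signal)
print(f"{len(delays)} delay points, fitted lifetime = {tau_fit:.2f} ms")
```

Log-linear fitting weights the noisy tail of a real decay heavily, so a nonlinear fit on the raw intensities is usually the safer choice for measured data.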
LBT-AglB Cys mutagenesis

LBT-AglB Cys mutants were expressed as LBT-pET24a constructs in BL21 (DE3) RIL cells, solubilized in 1% Triton X-100, and purified in 0.05% DDM. Primers used in the QuikChange reactions are as follows:

AglBC35S_f: 5'-CGAAGATAAAAAAGTAAAATCTGCAAAAACAATATTAATT-3'
AglBC35S_r: 5'-AATTAATATTGTTTTTGCAGATTTTACTTTTTTATCTTCG-3'
AglBC590S_f: 5'-AATAACTCAGTTGTAACTTCTTGGTGGGACAATGGTCAC-3'
AglBC590S_r: 5'-GTGACCATTGTCCCACCAAGAAGTTACAACTGAGTTATT-3'
AglBY88C_f: 5'-GCATTGGACCCTTATTATTGTTTAAGAATGTCTGAAAATT-3'
AglBY88C_r: 5'-AATTTTCAGACATTCTTAAACAATAATAAGGGTCCAATGC-3'
AglBW138C_f: 5'-CTATCTGTTGTAACAGTTTGTGTGTATCAAGTATGGCAC-3'
AglBW138C_r: 5'-GTGCCATACTTGATACACACAAACTGTTACAACAGATAG-3'
AglBV203C_f: 5'-CAATATTTGTAAAAACATGTGCAGGGTTTTCTGATACTC-3'
AglBV203C_r: 5'-GAGTATCAGAAAACCCTGCACATGTTTTTACAAATATTG-3'
AglBP356C_f: 5'-CTTCACAAACTGGTTGGTGTAACGTTTTGACCACAGTTTC-3'
AglBP356C_r: 5'-GAAACTGTGGTCAAAACGTTACACCAACCAGTTTGTGAAG-3'
AglBT367C_f: 5'-CACAGTTTCTGAGTTAGATTGTGCATCACTCGACGAAAT-3'
AglBT367C_r: 5'-ATTTCGTCGAGTGATGCACAATCTAACTCAGAAACTGTG-3'
AglBK602C_f: 5'-GGTCACATCTACACATGGTGTACTGATAGAATGGTAAC-3'
AglBK602C_r: 5'-GTTACCATTCTATCAGTACACCATGTGTAGATGTGACC-3'

LBT-AglB fluorophore labeling

LBT-AglB-C35S/C590S/P356C and LBT-AglB-C35S/C590S/T367C (1 ml, 25 µM) were reduced with 1 mM TCEP for 1 h at room temperature. BODIPY-TMR C5-maleimide was purchased from Life Technologies. Three equivalents of BODIPY-TMR maleimide (38 µl of 2 mM stock in DMSO) were added to the reduced protein and the reaction was carried out overnight at 4 °C in the dark with tumbling. The reaction was stopped by addition of 4 µl of β-mercaptoethanol to quench excess fluorophore. The reaction mixture was then batch bound to 2 ml of Ni-NTA for 1 h.
The protein was washed with 50 ml of 50 mM HEPES, pH 7.5/100 mM NaCl/10% glycerol/0.03% DDM, eluted with 50 mM HEPES, pH 7.5/100 mM NaCl/300 mM imidazole/0.03% DDM, exchanged into 50 mM HEPES, pH 7.5/100 mM NaCl/10% glycerol/0.03% DDM with a 100 kDa centricon filter, and frozen at -80 °C for luminescence experiments. Protein concentration was determined by the Bio-Rad detergent compatible protein assay. Labeling efficiency was determined by quantifying the amount of BODIPY-TMR by UV absorbance at 544 nm (ε = 60,000 M^-1 cm^-1).

AglB peptide substrate synthesis

Peptide library 2 was synthesized by Boston Open Labs (4 mg, greater than 95% purity). Standard Fmoc solid-phase peptide synthesis protocols were employed to manually synthesize all the other peptides. (SINTSApNF)NH2, (SFNTSApNF)NH2, and (SINGTNpNF)NH2 were synthesized on 0.05 mmol of PAL-PEG-PS resin (Applied Biosystems) using PyBOP (Genscript) as the activating agent. Half of the on-resin peptides were acetylated (10 equiv. Ac2O and 10 equiv. pyridine in DMF), cleaved from resin (90% TFA, 5% CH2Cl2, 2.5% H2O, 2.5% TIS), and purified by C18 RP-HPLC (YMC-Pack ODS-A, 250 x 20 mm I.D., S-5 µm, 12 nm) with a gradient of 20-50% B, where solvent A is water/0.1% TFA and solvent B is acetonitrile/0.1% TFA, using a Waters 1525 system with monitoring at 228 and 280 nm. The other half of the on-resin peptides were coupled with Fmoc-NH-(PEG)-COOH (9 atoms) (Millipore) to increase the peptide hydrophilicity, cleaved from resin, and purified in the same manner. Peptides were confirmed by MS analysis on a Finnigan LCQ Deca mass spectrometer coupled to an Agilent 1100 series HPLC. (TFNETS)NH2 and (TFNFTS)NH2 were synthesized on 0.02 mmol of PAL-PEG resin. The N-termini were either acetylated or benzoylated (10 equiv. Bz2O and 10 equiv. pyridine in DMF). Peptide cleavage, HPLC purification, and MS analysis are similar for all peptides.
Peptides were stored as 30 mM stocks in DMSO at -80 °C or 5 mM stocks in DMSO at -20 °C.

Km determination of AglB peptide substrates

For the AglB activity assay, DMSO (5 µL), buffer C (50 µL, 100 mM HEPES, pH 7.5/280 mM sucrose/2.4% Triton X-100/20 mM MnCl2), and H2O (53 µL) were added to a tube containing dried Dol-P-GlcNAc-[3H]Glc-2,3-diNAcA (0.78 nmol, 17.3 mCi/mmol). Freshly purified AglB (1 µM) was then added, along with H2O for a final volume of 100 µL. The reaction was initiated by the addition of variable amounts of peptide. Aliquots (10 µL) of the reaction mixture were removed at various time points and quenched in 3:2:1 CHCl3:MeOH:4 mM MgCl2 (1.2 mL), and the resulting aqueous layer was removed. The organic layer was further extracted with theoretical upper phase plus salt (TUPS, 2 x 600 µL, composed of 2.75% CHCl3, 44% MeOH, and 1.55 mM MgCl2). The resulting aqueous layers were combined with 5 mL EcoLite (MP Biomedicals) liquid scintillation cocktail, organic layers were combined with 5 mL OptiFluor (PerkinElmer), and both were analyzed on a Beckman Coulter LS6500 scintillation counting system. The non-zero background observed at time zero without the addition of any enzyme is due to hydrolysis of the Dol-P-GlcNAc-[3H]Glc-2,3-diNAcA substrate during purification and storage. The sample is only used if less than 15% of the sample has been hydrolyzed prior to the assay. This background level of hydrolysis does not increase over the time frame of the assay. For Km determination, the peptides were assayed at 0.025, 0.2, 0.5, 1, 2, and 3 mM and quenched at 20 min intervals for 0.025 and 0.2 mM peptide concentrations and at 10 min intervals for 0.5-3 mM peptide concentrations in order to remain in the linear range of the initial activity. The initial rates at each peptide concentration were used to plot the Michaelis-Menten curve and determine the apparent Km values with KaleidaGraph software.

References
1. Varki, A. et al. Essentials of Glycobiology. (Cold Spring Harbor Laboratory Press, 2009). at <http://www.ncbi.nlm.nih.gov/books/NBK1908/>
2. Glover, K. J., Weerapana, E., Numao, S. & Imperiali, B. Chemoenzymatic Synthesis of Glycopeptides with PglB, a Bacterial Oligosaccharyl Transferase from Campylobacter jejuni. Chem. Biol. 12, 1311-1316 (2005).
3. Glover, K. J., Weerapana, E. & Imperiali, B. In vitro assembly of the undecaprenylpyrophosphate-linked heptasaccharide for prokaryotic N-linked glycosylation. Proc. Natl. Acad. Sci. U. S. A. 102, 14255-14259 (2005).
4. Larkin, A., Chang, M. M., Whitworth, G. E. & Imperiali, B. Biochemical evidence for an alternate pathway in N-linked glycoprotein biosynthesis. Nat. Chem. Biol. 9, 367-373 (2013).
5. Bünzli, J.-C. G. Benefiting from the Unique Properties of Lanthanide Ions. Acc. Chem. Res. 39, 53-61 (2006).
6. Allen, K. N. & Imperiali, B. Lanthanide-tagged proteins - an illuminating partnership. Curr. Opin. Chem. Biol. 14, 247-254 (2010).
7. Vuojola, J. & Soukka, T. Luminescent lanthanide reporters: new concepts for use in bioanalytical applications. Methods Appl. Fluoresc. 2, 012001 (2014).
8. Franz, K. J., Nitz, M. & Imperiali, B. Lanthanide-Binding Tags as Versatile Protein Coexpression Probes. ChemBioChem 4, 265-271 (2003).
9. Nitz, M., Franz, K. J., Maglathlin, R. L. & Imperiali, B. A Powerful Combinatorial Screen to Identify High-Affinity Terbium(III)-Binding Peptides. ChemBioChem 4, 272-276 (2003).
10. Nitz, M. et al. Structural Origin of the High Affinity of a Chemically Evolved Lanthanide-Binding Peptide. Angew. Chem. Int. Ed. 43, 3682-3685 (2004).
11. Martin, L. J., Sculimbrene, B. R., Nitz, M. & Imperiali, B. Rapid Combinatorial Screening of Peptide Libraries for the Selection of Lanthanide-Binding Tags (LBTs). QSAR Comb. Sci. 24, 1149-1157 (2005).
12. Sculimbrene, B. R. & Imperiali, B. Lanthanide-Binding Tags as Luminescent Probes for Studying Protein Interactions. J. Am. Chem. Soc. 128, 7346-7352 (2006).
13. Selvin, P. R. Principles and Biophysical Applications of Lanthanide-Based Probes. Annu. Rev. Biophys. Biomol. Struct. 31, 275-302 (2002).
14. Selvin, P. R. & Hearst, J. E. Luminescence energy transfer using a terbium chelate: improvements on fluorescence energy transfer. Proc. Natl. Acad. Sci. U. S. A. 91, 10024-10028 (1994).
15. Stryer, L. Fluorescence Energy Transfer as a Spectroscopic Ruler. Annu. Rev. Biochem. 47, 819-846 (1978).
16. Lizak, C., Gerber, S., Numao, S., Aebi, M. & Locher, K. P. X-ray structure of a bacterial oligosaccharyltransferase. Nature 474, 350-355 (2011).
17. Maita, N., Nyirenda, J., Igura, M., Kamishikiryo, J. & Kohda, D. Comparative Structural Biology of Eubacterial and Archaeal Oligosaccharyltransferases. J. Biol. Chem. 285, 4941-4950 (2010).
18. Matsumoto, S. et al. Crystal structures of an archaeal oligosaccharyltransferase provide insights into the catalytic cycle of N-linked protein glycosylation. Proc. Natl. Acad. Sci. 110, 17868-17873 (2013).
19. Jaffee, M. B. & Imperiali, B. Exploiting Topological Constraints To Reveal Buried Sequence Motifs in the Membrane-Bound N-Linked Oligosaccharyl Transferases. Biochemistry 50, 7557-7567 (2011).
20. Lizak, C. et al. A Catalytically Essential Motif in External Loop 5 of the Bacterial Oligosaccharyltransferase PglB. J. Biol. Chem. 289, 735-746 (2014).
21. Jaffee, M. B. Investigation of the structure requirements for oligosaccharyl transferase function. (Massachusetts Institute of Technology, 2013). at <http://dspace.mit.edu/handle/1721.1/83769>
22. Dale, R. E., Eisinger, J. & Blumberg, W. E. The orientational freedom of molecular probes. The orientation factor in intramolecular energy transfer. Biophys. J. 26, 161-193 (1979).
23. Chen, M. M., Glover, K. J. & Imperiali, B. From Peptide to Protein: Comparative Analysis of the Substrate Specificity of N-Linked Glycosylation in C. jejuni. Biochemistry 46, 5579-5585 (2007).
24. Igura, M. & Kohda, D.
Quantitative assessment of the preferences for the amino acid residues flanking archaeal N-linked glycosylation sites. Glycobiology 21, 575-583 (2011).
25. Voisin, S. et al. Identification and Characterization of the Unique N-Linked Glycan Common to the Flagellins and S-layer Glycoprotein of Methanococcus voltae. J. Biol. Chem. 280, 16586-16593 (2005).
26. Chaban, B. et al. Systematic deletion analyses of the fla genes in the flagella operon identify several genes essential for proper assembly and function of flagella in the archaeon, Methanococcus maripaludis. Mol. Microbiol. 66, 596-609 (2007).
27. Kelleher, D. J. & Gilmore, R. An evolving view of the eukaryotic oligosaccharyltransferase. Glycobiology 16, 47R-62R (2006).

Chapter 5: Characterization of Alg5 activity and product stereochemistry

Introduction

Asparagine-linked protein glycosylation is a highly conserved process in eukaryotes from yeast to man.1-3 The core branched oligosaccharide, GlcNAc2Man9Glc3, is assembled on a dolichyl-diphosphate (Dol-PP) carrier at the membrane of the endoplasmic reticulum (ER) and is transferred to selected asparagine residues of nascent polypeptides. As illustrated in Figure 5-1, the synthesis of the lipid-linked oligosaccharide (LLO) is initiated on the cytoplasmic side of the ER membrane, where two N-acetylglucosamine (GlcNAc) and five mannose (Man) residues are assembled, using UDP-GlcNAc and GDP-Man, respectively, as nucleotide sugar donors. After translocation of the Dol-PP-GlcNAc2Man5 into the lumen of the ER, four additional mannose residues are transferred from Dol-P-Man (DPM) to the LLO. In the final steps of this biosynthetic pathway, three glucose residues are transferred from the Dol-P-Glc (DPG) donor to lipid-linked GlcNAc2Man9.
The oligosaccharyl transferase (OTase) enzyme acts on the completely assembled Dol-PP-GlcNAc2Man9Glc3 and transfers the tetradecasaccharide en bloc to asparagine residues within the Asn-Xaa-Ser/Thr sequon, where Xaa can be any amino acid except proline (Figure 5-1). Further glycan processing in the ER and Golgi remodels N-linked glycans in a protein-, cell-, and species-specific manner to generate the high structural diversity of N-linked glycans observed in eukaryotic organisms.4

Figure 5-1: Dolichol pathway of N-linked glycosylation in Saccharomyces cerevisiae. The glycan is assembled on the cytoplasmic face of the ER membrane on a dolichyl-diphosphate carrier from nucleotide sugar donors before being flipped to the ER lumen. The glycosyl donors in the lumen are Dol-P-Glc and Dol-P-Man, which are biosynthesized by Alg5 and Dpm1, respectively, as shown on the right.

The glycosyltransferases involved in the assembly of the oligosaccharide core are categorized into two classes based on the type of sugar donor, namely nucleotide-sugars and dolichyl-phosphate-sugars. Dol-P-Man and Dol-P-Glc are synthesized on the cytosolic face of the ER from Dol-P and GDP-Man or UDP-Glc, respectively, and are flipped into the ER lumen through currently uncharacterized flippase machinery.5,6 Dol-P-Man and Dol-P-Glc are used exclusively on the luminal side of the ER and serve as sugar donors in the late steps of the LLO assembly.
In yeast, the DPM synthase was designated as Dpm1 and characterized as a dolichyl-phosphate-β-D-mannosyltransferase using a variety of methods including lability to acid/base hydrolysis, enzymatic digestion, and mass spectrometry.7-10 Similarly, the yeast DPG synthase was designated as Alg5 and characterized as a dolichyl-phosphate-β-D-glucosyltransferase.11-13 Investigations into the activity of AglK, the Methanococcus voltae glycosyltransferase studied in Chapter 2, led to the observation of the high sequence similarity of AglK with DPM and DPG synthases. Alg5 from S. cerevisiae exhibits greater than 50% sequence similarity to AglK (Figure 5-2).13 However, the product of the AglK reaction was unambiguously characterized by 1H-NMR as Dol-P-GlcNAc bearing an anomeric α-linkage.14 Considering the high sequence similarity between AglK and Alg5, and given that no NMR study explicitly characterizes the stereochemistry of the Alg5 product, we set out to unequivocally establish the anomeric configuration of Dol-P-Glc to determine whether Alg5 has been historically mischaracterized as an inverting β-glucosyltransferase instead of a retaining α-glucosyltransferase. A mischaracterization could also have enormous implications for the classification of glycosyltransferases identified subsequently to the DPM and DPG synthases. This chapter describes our efforts to express high levels of Alg5 to use in the production of Dol-P-Glc for NMR studies of the anomeric center.
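Percent identity figures quoted for pairwise alignments, such as the AglK/Alg5 comparison, depend on how gapped columns are counted. The sketch below shows one common convention, given as an assumption rather than as the method used for the alignment in this chapter: identical residues divided by the number of aligned columns in which neither sequence has a gap. The two short sequences are toy strings, not fragments of AglK or Alg5. Percent similarity additionally needs a substitution matrix or residue grouping and is not computed here.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence is gapped.

    Both inputs must be rows of the same pairwise alignment (equal length).
    The choice of denominator is an assumption; alignment tools differ.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy aligned sequences (hypothetical), for illustration only.
toy_a = "MK-LIVDG"
toy_b = "MRALIV-G"
identity = percent_identity(toy_a, toy_b)
print(f"identity = {identity:.1f}% over ungapped columns")
```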
Figure 5-2: Alignment of AglK from M. voltae with Alg5, the dolichyl-phosphate β-glucosyltransferase from S. cerevisiae. Residues highlighted in black indicate sequence identity (25%) and those in gray and black denote sequence homology (53%). The alignment was performed using ClustalW.

Results and discussion

Alg5 expression in E. coli

Alg5 is a 38.3 kDa protein that is predicted to have one transmembrane domain (Figure 5-3). It has previously been overexpressed in both yeast and Escherichia coli, resulting in an increase of dolichyl-phosphate glucosyltransferase activity with UDP-Glc, whereas a deletion of the yeast gene leads to a loss of this activity and a concomitant underglycosylation of carboxypeptidase Y.13 In our studies, Alg5 activity is detected with the standard glycosyltransferase assay described in Chapter 2, which follows the transfer of radioactivity from aqueous-soluble UDP-[3H]Glc to organic-soluble Dol-P-[3H]Glc.

Figure 5-3: Topology prediction generated using the TMHMM server15 for S. cerevisiae Alg5.

The alg5 gene was cloned out of the yeast genome and expressed in the pGEX-4T3 plasmid as a GST-Alg5 fusion (64.3 kDa) and the pET-47 plasmid as His6-Alg5 (39.2 kDa) protein in E. coli BL21 (DE3) RIL cells.
Since Alg5 is a membrane protein, it was purified as a cell envelope fraction (CEF) and assayed for activity. Unfortunately, no expression of protein of the correct size was observed and no activity was detected above the background of CEF prepared from empty vector expression (Figure 5-4).

Figure 5-4: GST-Alg5 and His6-Alg5 expression detected by (A) Coomassie-stained SDS-PAGE and (B) anti-His4 Western blot. Lanes 1-3 are GST-Alg5 supernatant, CEF (dilute) and CEF (concentrated). Lanes 4-6 are His6-Alg5 supernatant, CEF (dilute) and CEF (concentrated). (C) Activity assay of GST-Alg5, His6-Alg5, and blank CEF shows no activity above background.

After the lack of Alg5 activity was documented with CEFs from both GST-Alg5 and His6-Alg5, a synthetic alg5 gene was purchased with codon optimization for E. coli expression. This gene was then cloned into the pE-SUMO vector and expressed as His6-SUMO-Alg5 (50.3 kDa) in several different cell lines. The SUMO fusion protein is regularly used to increase protein expression and solubility in E. coli.16 Once again, no expression of protein of the correct size was observed by Western blot analysis (Figure 5-5).

Figure 5-5: Expression of His6-SUMO-Alg5 in C41 (DE3), C43 (DE3), BL21 (DE3) RIL, Rosetta2 (DE3), and Lemo21 (DE3) cell lines detected by (A) Coomassie-stained SDS-PAGE and (B) anti-His4 Western blot. Two dilutions of each CEF were loaded.

Our next attempt at E. coli expression involved the use of the pBAD vector, which is ideal for tightly regulating potentially toxic or essential genes for increased protein expression levels,17 in BL21 (DE3) RIL cells.
The resulting Alg5-His10 (39.8 kDa) protein expression levels can be optimized with induction at variable L-arabinose concentrations (Figure 5-6). Expression of one protein with approximately the correct size was detected over time. Cell lysates were assayed for glucosyltransferase activity at several time points after induction with 0.2% L-arabinose, which is the concentration that appeared to afford the highest expression levels of this protein (Figure 5-7). While it appears that the cleared lysate, after 21 hours, has the highest activity, the uninduced lysate also appears to have background activity on a comparable level with the time points at 2, 5, and even 9 hours. The higher activity seen at 21 hours is probably due to the higher overall cell density and total protein in the sample. This non-specific activity was also observed for GST-Alg5, His6-Alg5, and SUMO-Alg5. These results demonstrate that Alg5-His10, if indeed expressed from the pBAD vector, is not active.

Figure 5-6: Alg5-His10 expression levels were detected by (A) Coomassie-stained SDS-PAGE and (B) anti-His6 Western blot at 0, 2, and 5 hours after induction with variable concentrations of L-arabinose (0.002%, 0.02%, 0.1%, and 0.2%). One protein that could be Alg5-His10 (39.8 kDa) is indicated by the arrow.

Figure 5-7: Activity of cell lysates of Alg5-His10 expressed from the pBAD vector in BL21 (DE3) RIL cells at various times after induction with 0.2% L-arabinose.

In addition, we used the commercial Genscript bacterial expression service with the codon-optimized alg5 gene in several different vectors and cell lines. His6-Alg5 expression failed in BL21 (DE3) and Origami B (DE3) cells. A thioredoxin protein tag was appended to the N-terminus to increase protein expression.
While TRX-His6-Alg5 expression in BL21 (DE3) cells also failed, TRX-His6-Alg5 expression in Rosetta (DE3) cells was detectable on a Western blot, but at extremely low levels that would not likely be useful for large-scale preparation of Dol-P-Glc (Figure 5-8).

Figure 5-8: TRX-His6-Alg5 expression is detected by anti-His Western blot analysis. Lane 1 is cell lysate with induction for 16 h at 15 °C. Lane 2 is cell lysate with induction for 4 h at 37 °C. The arrow indicates TRX-His6-Alg5.

While our various attempts at expressing Alg5 in E. coli have been largely unsuccessful, earlier reports by the Aebi lab show that it should be possible to generate Alg5 in E. coli and yeast. While the levels of Alg5 expression were not reported, the glucosyltransferase activity was clearly demonstrated.13 We received the original plasmid used in the initial studies identifying Alg5 activity. The pET(Alg5) plasmid is based on the pET-3b vector and results in Alg5 protein expression with no tags (38.3 kDa). The protein was expressed in BL21 (DE3) RIL and C43 (DE3) cell lines using either Aebi's original protocol or our lab's standard expression protocol. Alg5 expression was not readily visible on a Coomassie gel, and the activity assays showed only the low level of background activity observed for previous expression constructs (Figure 5-9).

Figure 5-9: (A) Alg5 expression as CEF from pET(Alg5) under different conditions evaluated by Coomassie-stained SDS-PAGE. Lane 1: BL21 (DE3) RIL, 30 °C, OD578 = 0.1, 3 mM IPTG for 3 h. Lane 2: BL21 (DE3) RIL, 16 °C, OD600 = 0.6, 1 mM IPTG, 16 h. Lane 3: C43 (DE3), 30 °C, OD578 = 0.1, 3 mM IPTG for 3 h. Lane 4: C43 (DE3), 16 °C, OD600 = 0.6, 1 mM IPTG, 16 h. (B) Activity assay of Alg5 CEFs.
The Alg5 expressed in different constructs and cell lines all exhibited background activity that is attributable to an unidentified component of the CEF instead of the formation of Dol-P-Glc. The glycosyltransferase assay was repeated with UDP-[3H]Glc and UDP-[3H]GlcNAc as the sugar donor substrates and long Dol-P (C85-105), short Dol-P (C55-60), or no Dol-P as the lipid acceptor substrate. The organic extractions of each reaction were isolated and resolved by normal-phase HPLC to determine where the radioactive counts are localized (Figure 5-10). It is clear that even when the reaction lacks Dol-P, a small amount of radiolabeled glucose is transferred to some species native to the E. coli membrane. However, there is no distinct product formation in the presence of short or long Dol-P.

Figure 5-10: HPLC fractions of the organic extraction of the Alg5 reaction with UDP-[3H]Glc or UDP-[3H]GlcNAc and short Dol-P (C55-C60), long Dol-P (C85-C105), or no Dol-P.

Alg5 expression in S. cerevisiae

Alg5 expression was also undertaken in yeast, its native organism. The alg5 gene was cloned into the pRS423 expression plasmid and expressed in S. cerevisiae strains INVSc1, W303a, and W303α. The protein expressed from this system is His14-TEV-Alg5 (42.4 kDa), with a TEV protease cleavage site to remove the N-terminal His tag. Western blots of the membrane fraction showed no obvious overexpressed protein bands of the size corresponding to His14-TEV-Alg5 (Figure 5-11). In addition, none of the strains exhibited any glucosyltransferase activity (Figure 5-12).

Figure 5-11: Alg5 yeast expression detected by (A) Ponceau S-stained SDS-PAGE and (B) anti-His4 Western blot.
Membrane fractions with Alg5 expression (+) or without Alg5 expression (-) in yeast strains INVSc1, W303a, or W303α.

Figure 5-12: Glucosyltransferase activity assay with membrane fractions of Alg5 expression in INVSc1, W303a, and W303α yeast strains with UDP-[3H]Glc and Dol-P.

Cell-free translation of Alg5

An entirely different strategy for protein expression is the cell-free transcription and translation system. Pioneered more than forty years ago, cell-free systems utilize an optimized E. coli extract, a reaction buffer containing an ATP regenerating system, and amino acids to allow high-level synthesis of a recombinant protein of interest. Cell-free expression offers several advantages over traditional cell-based expression methods.18,19 Overexpression of membrane proteins in vivo frequently results in cell toxicity, largely owing to protein aggregation and misfolding that can disrupt the integrity of the cell membrane. In vitro translation takes advantage of the highly efficient bacterial transcription and translational machinery while introducing mild detergents, natural and synthetic lipid mixtures, or nanodiscs during the reaction to alleviate aggregation and insolubility issues while maintaining translation activity.23 Recent technical advances and improvements in translation efficiency have resulted in yields that exceed a milligram of protein per milliliter of reaction mix, with easy modification of reaction conditions to favor protein folding, decreased sensitivity to product toxicity, and suitability for high-throughput strategies because of reduced reaction volumes and process time.19 Cell-free translation has been used to synthesize milligram amounts of several integral membrane proteins, and this approach is well suited to Alg5 expression, which has failed in both bacterial and yeast in vivo systems.
Expression constructs were generated by insertion of the alg5 gene, previously codon-optimized for E. coli expression, into the pEXP5-NT/TOPO and pEXP5-CT/TOPO vectors. The proteins synthesized using the Expressway™ Cell-Free E. coli Expression System are Alg5 fused to an N-terminal peptide containing a removable His6 tag and TEV protease cleavage site (Alg5-NT, 40.9 kDa) and Alg5 with a C-terminal His6 tag (Alg5-CT, 39.4 kDa). Even without the addition of detergents or lipids, both Alg5-NT and Alg5-CT were expressed and identifiable by Western blot analysis (Figure 5-13). Alg5-NT and Alg5-CT were also expressed in the presence of different detergents, including Triton X-100, n-dodecyl β-D-maltoside (DDM), n-octyl β-D-glucoside (β-OG), lauryl maltose neopentyl glycol (LMNG), and digitonin. Alternatively, Alg5-NT and Alg5-CT were expressed in the absence of detergents and then solubilized with detergents or lipids, including Triton X-100, DDM, β-OG, LMNG, n-dodecyl phosphocholine (Fos-12), sodium dodecyl sulfate (SDS), 1,2-dimyristoyl-sn-glycero-3-phosphocholine (DMPC), and 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC). In all cases, Alg5 was located mainly in the pellet, indicating that Alg5 expression is hampered by solubility and aggregation difficulties (Figure 5-14).

Figure 5-13: Anti-His6 Western blot showing cell-free expression of Alg5-NT and Alg5-CT, indicated by the arrow.

Figure 5-14: Anti-His6 Western blot of (A) Alg5-NT expression with additives, (B) Alg5-NT solubilization after expression, (C) Alg5-CT expression with additives, and (D) Alg5-CT solubilization after expression. S = supernatant, P = pellet.
Phosphatidylcholines are a major component of eukaryotic biological membranes but are absent from the membranes of most bacteria, including E. coli.²⁴ Since Alg5 is native to yeast, several phosphatidylcholines were added to the cell-free expression reaction to mimic a yeast-like membrane environment more closely than is possible with detergents or other polar lipids, which may be beneficial for robust activity. Alg5-NT and Alg5-CT exhibited the highest glucosyltransferase activity when expressed and assayed in the presence of 0.1% POPC, but the activity observed for Alg5-NT was much higher than that of Alg5-CT (Figure 5-15).

[Figure 5-15 plot: activity versus time (0-400 min) for Alg5-NT, Alg5-CT, and Alg5-NT with no Dol-P.]

Figure 5-15: Glucosyltransferase activity assay with Alg5-NT and Alg5-CT with UDP-[3H]Glc and Dol-P (C55-60) or Alg5-NT with UDP-[3H]Glc and no Dol-P. All reactions have 0.1% POPC.

Alg5-NT was expressed with POPC on a larger scale and used to prepare Dol-P-Glc for NMR studies. A maximum of 30% Dol-P turnover was achievable with Alg5-NT and excess UDP-Glc. After normal-phase HPLC, a significant amount of POPC co-purified with Dol-P-Glc. Nevertheless, we were able to investigate the stereochemistry of the glucose moiety bound to Dol-P using phosphorus-decoupled 1H-NMR spectroscopy, and determined the J1,2 value of the anomeric proton to be 7.9 Hz (Figure 5-16). A coupling constant of this magnitude is characteristic of the trans-diaxial H-1/H-2 arrangement of a β-glucopyranoside, so this value strongly indicates that the Alg5 reaction proceeds with inversion of stereochemistry and that the Dol-P-Glc product bears an anomeric β-linkage. This differs from what was observed for the AglK reaction, which produces Dol-P-GlcNAc with the anomeric proton having a J1,2 value of 3.4 Hz.
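The assignment from the two coupling constants reported here can be written out as a small decision rule. The Python sketch below is illustrative only: the function names and the cut-off values (`alpha_max`, `beta_min`) are assumptions chosen for this example, not parameters taken from this work, and the rule applies only to gluco-configured pyranosides such as Glc and GlcNAc, in which H-2 is axial. It also assumes the sugar in the UDP donor is α-linked, as it is in UDP-Glc and UDP-GlcNAc.

```python
def classify_anomer(j12_hz, alpha_max=4.5, beta_min=7.0):
    """Assign the anomer of a gluco-configured pyranoside from J(H1,H2) in Hz.

    The cut-offs are illustrative assumptions: large couplings indicate a
    trans-diaxial H-1/H-2 pair (beta), small couplings an equatorial H-1 (alpha).
    """
    if j12_hz >= beta_min:
        return "beta"
    if j12_hz <= alpha_max:
        return "alpha"
    return "ambiguous"


def transfer_outcome(product_anomer, donor_anomer="alpha"):
    """Compare donor and product anomers to label the transfer."""
    if product_anomer == "ambiguous":
        return "undetermined"
    return "retaining" if product_anomer == donor_anomer else "inverting"


# Coupling constants reported in the text (Hz).
observed = {"Dol-P-Glc (Alg5)": 7.9, "Dol-P-GlcNAc (AglK)": 3.4}

results = {}
for name, j12 in observed.items():
    anomer = classify_anomer(j12)
    results[name] = (anomer, transfer_outcome(anomer))
    print(f"{name}: J1,2 = {j12} Hz -> {anomer}, {results[name][1]}")
```

With these assumed cut-offs, the 7.9 Hz coupling falls in the β range and the 3.4 Hz coupling in the α range, which mirrors the inverting and retaining outcomes described for Alg5 and AglK.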
The β-linkage of Dol-P-Glc is, however, consistent with previous work done to characterize Dol-P-Glc through acid/base hydrolysis, enzymatic digestion, and mass spectrometry.¹¹⁻¹³ Therefore, Alg5 and AglK, though sharing high sequence similarity, do not catalyze reactions with the same stereochemical outcome.

Figure 5-16: (A) 1H-NMR spectrum of Dol-P-Glc, and (B) 31P-decoupled 1H-NMR spectrum of the anomeric proton for Dol-P-Glc, a doublet with a coupling constant of 7.9 Hz (J1,2).

Conclusions

Alg5 expression proved to be more complicated than anticipated in E. coli and S. cerevisiae. However, this led us to adopt a cell-free expression system for efficient synthesis of Alg5 with an N-terminal His6 tag. This integral membrane protein was expressed with 0.1% POPC added to the reaction for optimal glucosyltransferase activity. The Dol-P-Glc product of the Alg5 reaction was observed through NMR to have an anomeric β-linkage. This stereochemistry has long been suggested and has now been confirmed through our studies.

Acknowledgements

Dr. Angelyn Larkin provided the yeast alg5 gene in pGEX-4T3 and pET-47 vectors, as well as the pRS423 expression plasmid, sequencing primers, and S. cerevisiae strains INVSc1, W303a, and W303α. Prof. M. Aebi (ETH Zürich) generously provided the pET(Alg5) plasmid. Dr. Garrett Whitworth provided (C55-60) Dol-P and, along with Dr. Jeff Simpson (DCIF, MIT), helped with the NMR acquisitions.

Experimental methods

Alg5 expression in pGEX-4T3, pET-47, and pE-SUMO vectors

The alg5 gene was cloned out of the yeast genome and inserted into the pGEX-4T3 and pET-47 plasmids by Dr. Angelyn Larkin. The pGEX-4T3/Alg5 plasmid was transformed into E. coli BL21 (DE3) RIL cells (Agilent) using carbenicillin and chloramphenicol for selection, and the pET-47/Alg5 plasmid was transformed with kanamycin and chloramphenicol. The alg5 gene, codon-optimized for E.
coli expression, was cloned into pE-SUMO (LifeSensors) with BsmBI and XhoI restriction sites (primers listed below), and the plasmid was transformed into BL21 (DE3) RIL cells using kanamycin and chloramphenicol.

Primers used for pE-SUMO/Alg5:
Alg5fwd004: 5'-CGATCGTCTCCAGGTATGCGCGCCCTGCGC-3'
Alg5rev002: 5'-CGATCTCGAGTTAACACTTTTTGTT-3'

Alg5 was expressed from each plasmid in BL21 (DE3) RIL cells. Other cell lines tested were OverExpress C41 (DE3) and C43 (DE3) (Lucigen), Rosetta2 (DE3) (EMD Millipore), and Lemo21 (DE3) (New England Biolabs). In all cases, LB media (1 L) supplemented with antibiotics was inoculated with a 5 mL starter culture and incubated at 37 °C with shaking until an optical density (600 nm) of 0.6-0.8 AU was obtained. The cultures were then cooled to 16 °C and protein expression was induced with 1 mM IPTG. After 16 h, the cells were harvested by centrifugation (4000 x g). For protein purification, cell pellets were resuspended at 5% of the original culture volume in buffer A (50 mM HEPES, pH 7.5/300 mM NaCl) with a protease inhibitor cocktail (Calbiochem), then subjected to sonication. To prepare cell membrane fractions, the cell lysate was centrifuged to remove cellular debris (6000 x g, 30 min), followed by pelleting of the membranes (142,000 x g, 1 h). The membrane pellet was then resuspended at 0.25% of the original culture volume in buffer B (50 mM HEPES, pH 7.5/150 mM NaCl) and stored at -80 °C. Since the different Alg5 fusion proteins are all integral membrane proteins, all were isolated as a cell envelope fraction (CEF) and assayed for activity.

Alg5 expression in pBAD vector

The pBAD-mNectarine plasmid was purchased from Addgene. This plasmid is modified from pBAD/HisB (Invitrogen) with the mNectarine gene inserted between the XhoI and EcoRI sites of the vector.
The mNectarine gene can be removed using the XhoI and EcoRI restriction enzymes, or the mNectarine gene together with the preceding polyhistidine-Xpress epitope-EK recognition site region can be removed using the NcoI and EcoRI restriction enzymes. The alg5 gene had previously been codon-optimized for E. coli expression and cloned into the pUC57-kan plasmid (Genewiz). This was used as the template for amplification with the primers listed below. The forward primer contains the NcoI site, and the reverse primer contains the EcoRI site and a His10 tag. The NcoI restriction site contains an internal start codon, so two additional base pairs were included to put the gene in the correct reading frame, and the methionine codon from the start of the alg5 gene was removed. As a consequence of these cloning necessities, the final Alg5 expressed has a glycine residue inserted at position 2, following the start codon, and a C-terminal His10 tag. Alg5-His10 is 39.8 kDa and its extinction coefficient at 280 nm is 34,310 M⁻¹ cm⁻¹.

Primers for pBAD/Alg5:
Alg5fwd006: 5'-CGATCCATGGGACGCGCCCTGCGCTTCCTGAT-3'
Alg5rev003: 5'-CGATGAATTCTTAGTGATGGTGATGGTGATGGTGATGGTGATGACACTTTTTGTTATCACGGTAG-3'

A pilot expression of Alg5/pBAD in BL21 (DE3) RIL cells was conducted to determine the optimal concentration of L-arabinose for induction. A 5 mL culture was grown at 37 °C until an optical density (600 nm) of 0.6-0.8 AU was reached. Cells were induced with 0.002, 0.02, 0.1, or 0.2% L-arabinose. Aliquots of 1 mL were removed and pelleted at 0, 2, and 5 hours. Each pellet was resuspended in 100 µL of 1x SDS-PAGE buffer, boiled for 5 minutes, and run on a 12% acrylamide gel. Anti-His6 Western blot was used to determine expression levels. For the larger preparation of Alg5, Alg5/pBAD in BL21 (DE3) RIL was expressed in 1 L of LB media and 1 L of TB media.
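The primer design described for pBAD/Alg5 (an NcoI site supplying the start codon, a glycine codon created by the two frame-restoring bases, and a reverse primer encoding a His10 tag followed by a stop codon and an EcoRI site) can be checked directly from the two primer sequences. The Python sketch below is one way to do that with the standard genetic code; the helper names are hypothetical and the script only inspects the primers, not the full construct.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from the start of seq; ignore a trailing partial codon."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# Primer sequences as listed for pBAD/Alg5 (5' to 3').
forward = "CGATCCATGGGACGCGCCCTGCGCTTCCTGAT"
reverse = ("CGATGAATTCTTAGTGATGGTGATGGTGATGGTGATGGTGATG"
           "ACACTTTTTGTTATCACGGTAG")

NCOI, ECORI = "CCATGG", "GAATTC"

# Forward primer: read from the ATG embedded in the NcoI site.
fwd_protein = translate(forward[forward.find("ATG"):])

# Reverse primer: look at the coding strand and locate the ten His codons.
coding = reverse_complement(reverse)
tag_start = coding.find("CATCAC" * 5)
tag_peptide = translate(coding[tag_start:tag_start + 33])

print("NcoI in forward primer:", NCOI in forward)
print("EcoRI in reverse primer:", ECORI in reverse)
print("N-terminal residues encoded by forward primer:", fwd_protein)
print("Tag and stop encoded by reverse primer:", tag_peptide)
```

The forward primer reads Met-Gly-Arg-Ala, which shows the inserted glycine at position 2, and the coding strand of the reverse primer carries ten histidine codons directly before the stop codon.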
Cultures were grown at 37 °C until an optical density (600 nm) of 0.5 AU was reached and then induced with 0.2% L-arabinose (10 mL of 20% L-arabinose stock); the temperature was lowered to 16 °C for overnight expression and the cells were harvested the next day. Cell pellets were resuspended in 50 mL of buffer C (50 mM HEPES, pH 7.5/100 mM NaCl/1 mM MgCl2/1 mM DTT) and 50 µL of protease inhibitor cocktail. Cells were lysed by sonication (3 x 1.5 min, 40% amplitude, pulses 1 s on and 1 s off, on ice). An aliquot of this cell lysate was saved for the activity assay. The remaining lysate was centrifuged (4000 x g, 30 min, 4 °C) to remove unbroken cells and then (142,000 x g, 1 h, 4 °C) to pellet the membranes. The CEF pellet was homogenized in 1 mL of buffer D (50 mM HEPES, pH 7.5/1 mM MgCl2/1 mM DTT) and assayed for activity.

For the time course of Alg5 activity, pBAD/Alg5 in BL21 (DE3) RIL cells was grown at 37 °C until an optical density (600 nm) of 0.5 AU was reached. Cells were induced with 0.2% L-arabinose. Aliquots of 1 mL were removed and pelleted at 0, 2, 5, 9, or 21 hours. Each pellet was resuspended in 100 µL of 1x SDS-PAGE buffer, boiled for 5 minutes, and run on a 12% acrylamide gel. Anti-His6 Western blot was used to determine expression levels. For larger scale tests, 10 mL aliquots were removed and pelleted at 0, 2, 5, 9, or 21 hours. Each pellet was resuspended in 0.5 mL of buffer D with protease inhibitor. Cells were lysed by sonication with the microtip (1.5 min, 20% amplitude, 1 s on/5 s off, on ice). The lysate was cleared by centrifugation (6000 x g, 30 min), and the cleared lysate was used for activity assays.

Alg5 expression from pET(Alg5) plasmid

The pET(Alg5) plasmid was expressed in C43 (DE3) cells using the normal protocol and in BL21 (DE3) RIL cells using both the normal protocol and the protocol found in the original experiments.¹³ A 1 L culture was grown for each condition.
In the original protocol, the cultures were grown at 30 °C until an optical density (578 nm) of 0.1 AU was reached. Expression was induced with 3 mM IPTG and the cells were harvested after 3 h of expression at 30 °C. In the normal protocol, the cultures were grown at 37 °C until an optical density (600 nm) of 0.6-0.8 AU was reached and then induced with 1 mM IPTG; the temperature was lowered to 16 °C and the cells were harvested after 16 h. The CEF was prepared as described earlier.

Alg5 expression in S. cerevisiae

The alg5 gene was PCR amplified with BamHI and XhoI restriction sites on the 5' and 3' ends, respectively. Primers used are listed below. The amplified alg5 gene and the pRS423 vector were subjected to restriction digestion with BamHI and XhoI, and then ligated together with T4 DNA ligase. Colony PCR confirmed Alg5 insertion into the pRS423 vector and plasmid sequencing validated its identity.

Primers for pRS423/Alg5:
Alg5fwd005: 5'-CGATGGATCCATGAGAGCGTTGAGATTC-3'
Alg5rev003: 5'-CGATCTCGAGCTAACATTTCTTATTATC-3'

The Alg5/pRS423 plasmid was introduced into the yeast host using a lithium acetate transformation protocol with PEG 3350. The plasmid was transformed into INVSc1 (diploid strain with the genotype: MATa his3Δ1 leu2 trp1-289 ura3-52), or W303a and W303α (haploid strains with the genotype: MATa his3-11,15 trp1-1 leu2-3,112 ura3-1 ade2-1 can1-100). Cells were cultured in synthetic complete media without histidine (SC-His) to maintain selection for the plasmid containing alg5.

In our test expression, starter cultures were made with 15 mL of SC-His media inoculated with Alg5/pRS423 in INVSc1, W303a, or W303α and grown at 30 °C for two days. 1 mL of the starter culture was used to inoculate 60 mL of SC-His in a 250 mL Erlenmeyer flask. After 47 h with shaking at 200 rpm, the OD600 reached a maximum between 2.2 and 4.2. Each culture was induced with 20 mL of 4x YPG media.
After 16 h of expression at 30 °C with shaking at 200 rpm, the cells were harvested by centrifugation. We resuspended the cells in lysis buffer (50 mM Tris, pH 7.4/500 mM NaCl/20% glycerol) with protease inhibitor cocktail, using 30 mL of lysis buffer per 0.5 L of growth. We added 1 mL of resuspended cells and 1 mL of glass beads to 2 mL microcentrifuge tubes with screw caps. Cells were lysed with 6 x 1 min cycles on the bead beater (speed 400). The tubes were pierced with a needle at the bottom and centrifuged to collect the lysate away from the beads. A low-speed spin (6000 x g, 30 min) pelleted cell debris. The supernatant was transferred and subjected to a high-speed spin (142,000 x g, 1 h) to pellet the microsomes. The supernatant was discarded and the pellet was resuspended in buffer containing 50 mM Tris, pH 7.4/200 mM NaCl/10% glycerol. Additional protease inhibitors were added at this point and aliquots were frozen at -80 °C.

Alg5 activity assay

For Alg5 expressed in E. coli, the activity assay contained 5 nmol Dol-P (C55-60) or Dol-P (C85-105), resuspended in 3 µL of DMSO by vortexing, followed by addition of 10 µL of 10% Triton X-100, 8 µL of 0.5 M HEPES, pH 7.5, 1 µL of 1 M MgCl2, 1 µL of 100 mM DTT, 56 µL of dH2O, and 20 µL of cell lysate. The reaction was initiated by the addition of 1 µL of UDP-[3H]Glc (2.5 pmol, 300 kDPM). Aliquots of 15 µL were quenched at various time points into 800 µL chloroform/400 µL methanol, extracted with 3 x 300 µL PSUP, and subjected to liquid scintillation counting (Beckman Coulter LS6500). In the activity assay for Alg5 expressed in yeast, 25 µL of microsomes were added to 10 nmol Dol-P (C55 or C85-105), 3 µL of DMSO, 10 µL of 10% Triton X-100, 2 µL of 100 mM DTT, 10 µL of 0.5 M HEPES, pH 7.5, 1 µL of 1 M MgCl2, 48 µL of dH2O, and 1 µL of UDP-[3H]Glc. Aliquots were quenched into 800 µL chloroform/400 µL methanol, extracted with 3 x 300 µL PSUP, and subjected to liquid scintillation counting.
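The counts from each quenched aliquot can be converted to an amount of product using the specific activity of the donor. The Python sketch below illustrates the arithmetic for the E. coli assay (2.5 pmol and 300 kDPM of UDP-[3H]Glc in a 100 µL reaction, sampled in 15 µL aliquots). The function names, the optional background subtraction, and the example reading of 4500 DPM are assumptions made for illustration; they are not data from this work.

```python
def specific_activity(total_dpm, total_pmol):
    """DPM per pmol of donor sugar added to the reaction."""
    return total_dpm / total_pmol


def product_pmol(aliquot_dpm, aliquot_ul, reaction_ul,
                 total_dpm, total_pmol, background_dpm=0.0):
    """Estimate pmol of labeled product in the full reaction from one aliquot.

    Assumes the counted phase contains only product and that the aliquot is
    representative of the whole reaction volume.
    """
    scaled_dpm = (aliquot_dpm - background_dpm) * reaction_ul / aliquot_ul
    return scaled_dpm / specific_activity(total_dpm, total_pmol)


# Values stated for the E. coli assay.
TOTAL_DPM = 300_000.0   # 300 kDPM of UDP-[3H]Glc
TOTAL_PMOL = 2.5        # pmol of UDP-[3H]Glc
REACTION_UL = 100.0     # sum of the listed component volumes
ALIQUOT_UL = 15.0

sa = specific_activity(TOTAL_DPM, TOTAL_PMOL)

# Hypothetical aliquot reading, for illustration only.
example_dpm = 4500.0
pmol = product_pmol(example_dpm, ALIQUOT_UL, REACTION_UL, TOTAL_DPM, TOTAL_PMOL)
percent_donor_converted = 100.0 * pmol / TOTAL_PMOL

print(f"specific activity: {sa:.0f} DPM/pmol")
print(f"product: {pmol:.2f} pmol ({percent_donor_converted:.0f}% of donor)")
```

Because only labeled donor is present in this assay, the fraction of counts recovered equals the fraction of UDP-Glc converted; a background term such as the no-Dol-P control could be passed as `background_dpm`.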
Cell-free expression of Alg5

The Expressway™ Maxi Cell-Free E. coli Expression System (Invitrogen) was used to carry out in vitro translation of Alg5. First, the alg5 gene, codon-optimized for E. coli expression, was amplified by PCR using Taq polymerase to produce the alg5 gene with single deoxyadenosine overhangs at the 3'-ends of the PCR product (primers listed below).

Primers for pEXP5-NT/TOPO/Alg5:
Alg5_TOPO_f: 5'-ATGCGCGCCCTGCGCTT-3'
Alg5_NT_r: 5'-TTAACACTTTTTGTTATC-3'

Primers for pEXP5-CT/TOPO/Alg5:
Alg5_TOPO_f: 5'-ATGCGCGCCCTGCGCTTC-3'
Alg5_CT_r: 5'-ACACTTTTTGTTATCACG-3'

This alg5 gene was then cloned into the pEXP5-NT/TOPO and pEXP5-CT/TOPO vectors (Invitrogen) using TOPO TA cloning, in which an adenine overhang on the insert pairs with a thymine overhang on the vector and the two are ligated in the presence of topoisomerase. Once ligated, the plasmids were transformed into One Shot TOP10 Chemically Competent E. coli (Invitrogen), from which we generated our template DNA, concentrated to at least 500 ng/µL. 1 µg of template DNA was used per 100 µL of protein synthesis reaction.

The reactions were carried out in sterile, RNase-free 1.5 mL tubes (Eppendorf). For each small-scale reaction, 20 µL of E. coli slyD- Extract, 20 µL of 2.5X IVPS E. coli Reaction Buffer (-A.A.), 1.25 µL of 50 mM amino acids (-Met), 1 µL of 75 mM methionine, 1 µL of T7 Enzyme Mix, 1 µg of DNA template, and DNase/RNase-free distilled water to a final volume of 50 µL were combined. The sample was incubated in a shaking incubator (300 rpm) at 30 °C for 30 minutes, after which 50 µL of Feed Buffer was added (total volume = 100 µL). The Feed Buffer contains 25 µL of 2X IVPS Feed Buffer, 1.25 µL of 50 mM amino acids (-Met), 1 µL of 75 mM methionine, and DNase/RNase-free distilled water to a final volume of 50 µL. The entire sample was then shaken at 300 rpm at 30 °C for 6 hours. Protein synthesis was evaluated by Anti-His6 Western blot.
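Both the initial reaction and the feed are made up "to a final volume of 50 µL" with water, so the water volume depends on the template concentration. The Python sketch below works through that balance for the small-scale recipe and shows how the same recipe could be scaled. The template concentration of 500 ng/µL is the stated minimum and is used here only as an example; the function names are hypothetical.

```python
def template_volume_ul(mass_ug, conc_ng_per_ul):
    """Volume of template needed to deliver mass_ug at the given concentration."""
    return mass_ug * 1000.0 / conc_ng_per_ul


def water_to_volume(components_ul, final_ul):
    """Water needed to bring the listed components up to final_ul."""
    used = sum(components_ul.values())
    if used > final_ul:
        raise ValueError(f"components ({used} uL) exceed final volume ({final_ul} uL)")
    return final_ul - used


def scale(components_ul, factor):
    """Scale every component volume by the same factor."""
    return {name: vol * factor for name, vol in components_ul.items()}


# Initial reaction, volumes in uL as listed in the text.
initial = {
    "E. coli slyD- extract": 20.0,
    "2.5X IVPS reaction buffer (-A.A.)": 20.0,
    "50 mM amino acids (-Met)": 1.25,
    "75 mM methionine": 1.0,
    "T7 enzyme mix": 1.0,
    "DNA template (1 ug)": template_volume_ul(1.0, 500.0),  # assumed 500 ng/uL
}
feed = {
    "2X IVPS feed buffer": 25.0,
    "50 mM amino acids (-Met)": 1.25,
    "75 mM methionine": 1.0,
}

water_initial = water_to_volume(initial, 50.0)
water_feed = water_to_volume(feed, 50.0)
print(f"water in initial reaction: {water_initial} uL")
print(f"water in feed buffer: {water_feed} uL")

# A 1 mL reaction is ten times the 100 uL small-scale reaction.
large_initial = scale(initial, 10.0)
print(f"extract for a 1 mL reaction: {large_initial['E. coli slyD- extract']} uL")
```

A more concentrated template simply shifts volume from the template entry to water; additives such as detergent or POPC would be entered as further components before the water is calculated.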
In our screen of cell-free expression conditions with detergent, we supplemented both the initial reaction and the feed buffer with 0.2% DDM, 0.3% Triton X-100, 1% β-OG, 0.02% LMNG, or 0.5% digitonin. We spun down the reactions and ran both the supernatant and the pellet on a 12% acrylamide gel for analysis by Anti-His6 Western blot. In our experiment to determine which detergents or lipids best resolubilized the protein, we added no supplements to the initial reaction and then tried to resolubilize the resulting pellet in 2% DDM, 2% Triton X-100, 2% β-OG, 0.2% LMNG, 2% Fos-12, 2% DMPC, 2% POPC, or 2% SDS by incubation for 2 hours with shaking. The samples were then spun down and analyzed as before.

The Alg5 activity assay was modified to include 0.2% POPC. Alg5-NT and Alg5-CT were assayed with 0.2% POPC and 0.2% DMPC as well as with a range of POPC concentrations (0.01%, 0.05%, 0.1%, 0.2%, 1%). For the large preparation of Alg5-NT, three separate 1 mL reactions were set up with 0.2% POPC and scaled appropriately.

Dol-P-Glc synthesis and analysis

For the preparation of Dol-P-[3H]Glc, one 1 mL reaction contained 100 nmol of Dol-P (C55-60), 30 µL of DMSO, 10 µL of 200 mM DTT, 50 µL of 10% Triton X-100, 10 µL of 1 M MgCl2, 50 µL of 2% POPC, 50 µL of 0.5 M HEPES, pH 7.5, 18.1 µL of 13.8 mM UDP-Glc, 15 µL of UDP-[3H]Glc (37.5 pmol), 566.9 µL of dH2O, and 200 µL of cell-free reaction. The total UDP-Glc in the reaction is 250 nmol, corresponding to a concentration of 250 µM. This hot reaction serves as an estimate for the turnover of the cold reaction and generates a labeled product that can be followed easily during purification. For the preparation of unlabeled Dol-P-Glc, 14 x 1 mL reactions were carried out, each containing 100 nmol of Dol-P (C55-60), 30 µL of DMSO, 10 µL of 200 mM DTT, 50 µL of 10% Triton X-100, 10 µL of 1 M MgCl2, 50 µL of 2% POPC, 50 µL of 0.5 M HEPES, pH 7.5, 18.1 µL of 13.8 mM UDP-Glc, 581.9 µL of dH2O, and 200 µL of cell-free reaction.
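The recipe for the unlabeled reaction can be checked for volume balance and converted to final concentrations. The Python sketch below does this and also estimates an upper bound on product from the 14 reactions, using the maximum Dol-P turnover of 30% reported for Alg5-NT. The variable names are hypothetical, and the calculation ignores any lipid or buffer carried over in the 200 µL of cell-free reaction.

```python
# name: (stock concentration, unit, volume in uL), as listed for the cold reaction
STOCKS = {
    "DTT": (200.0, "mM", 10.0),
    "Triton X-100": (10.0, "%", 50.0),
    "MgCl2": (1000.0, "mM", 10.0),
    "POPC": (2.0, "%", 50.0),
    "HEPES pH 7.5": (500.0, "mM", 50.0),
    "UDP-Glc": (13.8, "mM", 18.1),
}
OTHER_VOLUMES_UL = {
    "DMSO (carrying Dol-P)": 30.0,
    "dH2O": 581.9,
    "cell-free reaction": 200.0,
}
DOL_P_NMOL = 100.0
N_REACTIONS = 14
MAX_TURNOVER = 0.30  # maximum Dol-P turnover reported for Alg5-NT

total_ul = (sum(vol for _, _, vol in STOCKS.values())
            + sum(OTHER_VOLUMES_UL.values()))

# Final concentration = stock concentration x (component volume / total volume).
final = {name: (stock * vol / total_ul, unit)
         for name, (stock, unit, vol) in STOCKS.items()}

# mM x uL = nmol
udp_glc_nmol = STOCKS["UDP-Glc"][0] * STOCKS["UDP-Glc"][2]
donor_excess = udp_glc_nmol / DOL_P_NMOL
max_product_nmol = N_REACTIONS * DOL_P_NMOL * MAX_TURNOVER

print(f"total volume: {total_ul:.1f} uL")
for name, (conc, unit) in final.items():
    print(f"{name}: {conc:.3g} {unit}")
print(f"UDP-Glc: {udp_glc_nmol:.2f} nmol ({donor_excess:.1f}-fold over Dol-P)")
print(f"upper bound on Dol-P-Glc from {N_REACTIONS} reactions: {max_product_nmol:.0f} nmol")
```

The components sum to 1 mL, the POPC added to the assay comes to 0.1%, and UDP-Glc is present at roughly 250 µM, a 2.5-fold excess over the 100 µM Dol-P.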
The total UDP-Glc in each reaction is 250 nmol (250 µM). Both hot and cold reactions were quenched into 8:4:3 chloroform/methanol/PSUP. The aqueous phase was removed and the organic phase was extracted twice more with PSUP before being dried down. The hot reaction was purified by normal-phase HPLC on a YMC-Pack PVA-SIL-NP column (250 x 4.6 mm I.D., S-5 µm, 12 nm) with a gradient of 21-28% buffer F over 60 minutes, in which buffer E is 4:1 CHCl3/MeOH and buffer F is 10:10:3 CHCl3/MeOH/2 M NH4OAc, at a flow rate of 1 mL/min with 1 mL fractions collected. Each fraction was analyzed by liquid scintillation counting to locate Dol-P-[3H]Glc. The cold reactions were purified similarly, with the desired fractions located by analytical TLC using silica gel 60 F254 plates with a solvent system of 65:25:4:0.5 CHCl3/MeOH/H2O/NH4OH and stained with ceric ammonium molybdate.

NMR spectra were acquired using a Varian Inova 500 MHz spectrometer equipped with a 5 mm inverse broadband gradient probe. Samples were diluted in 4:1 CDCl3:CD3OD and analyzed. The anomeric configuration of the Dol-P-Glc glycosidic linkage was determined using a 31P-decoupled 1H pulse sequence.

References

1. Kornfeld, R. & Kornfeld, S. Assembly of Asparagine-Linked Oligosaccharides. Annu. Rev. Biochem. 54, 631-664 (1985).
2. Lehle, L. Biosynthesis of the Core Region of Yeast Mannoproteins. Eur. J. Biochem. 109, 589-601 (1980).
3. Burda, P. & Aebi, M. The dolichol pathway of N-linked glycosylation. Biochim. Biophys. Acta BBA - Gen. Subj. 1426, 239-257 (1999).
4. Varki, A. et al. Essentials of Glycobiology. (Cold Spring Harbor Laboratory Press, 2009). at <http://www.ncbi.nlm.nih.gov/books/NBK1908/>
5. Helenius, J. et al. Translocation of lipid-linked oligosaccharides across the ER membrane requires Rft1 protein. Nature 415, 447-450 (2002).
6. Jelk, J. et al. Glycoprotein Biosynthesis in a Eukaryote Lacking the Membrane Protein Rft1. J. Biol. Chem.
288, 20616-20623 (2013).
7. Tkacz, J. S., Herscovics, A., Warren, C. D. & Jeanloz, R. W. Mannosyltransferase Activity in Calf Pancreas Microsomes: Formation From Guanosine Diphosphate-D-[14C]Mannose of a 14C-Labeled Mannolipid with Properties of Dolichyl Mannopyranosyl Phosphate. J. Biol. Chem. 249, 6372-6381 (1974).
8. Herscovics, A., Warren, C. D. & Jeanloz, R. W. Anomeric configuration of the dolichyl D-mannosyl phosphate formed in calf pancreas microsomes. J. Biol. Chem. 250, 8079-8084 (1975).
9. Orlean, P., Albright, C. & Robbins, P. W. Cloning and sequencing of the yeast gene for dolichol phosphate mannose synthase, an essential protein. J. Biol. Chem. 263, 17499-17507 (1988).
10. Maeda, Y. & Kinoshita, T. Dolichol-phosphate mannose synthase: Structure, function and regulation. Biochim. Biophys. Acta BBA - Gen. Subj. 1780, 861-868 (2008).
11. Behrens, N. H. & Leloir, L. F. Dolichol Monophosphate Glucose: An Intermediate in Glucose Transfer in Liver. Proc. Natl. Acad. Sci. U. S. A. 66, 153-159 (1970).
12. Palamarczyk, G., Drake, R., Haley, B. & Lennarz, W. J. Evidence that the synthesis of glucosylphosphodolichol in yeast involves a 35-kDa membrane protein. Proc. Natl. Acad. Sci. 87, 2666-2670 (1990).
13. Heesen, S. te, Lehle, L., Weissmann, A. & Aebi, M. Isolation of the ALG5 Locus Encoding the UDP-Glucose:Dolichyl-Phosphate Glucosyltransferase from Saccharomyces cerevisiae. Eur. J. Biochem. 224, 71-79 (1994).
14. Larkin, A., Chang, M. M., Whitworth, G. E. & Imperiali, B. Biochemical evidence for an alternate pathway in N-linked glycoprotein biosynthesis. Nat. Chem. Biol. 9, 367-373 (2013).
15. Krogh, A., Larsson, B., von Heijne, G. & Sonnhammer, E. L. L. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol. 305, 567-580 (2001).
16. Peroutka III, R. J., Orcutt, S. J., Strickler, J. E. & Butt, T. R. in Heterologous Gene Expression in E. coli (eds. Evans Jr., T. C. & Xu, M.-Q.) 15-30 (Humana Press, 2011). at <http://link.springer.com/protocol/10.1007/978-1-61737-967-3_2>
17. Guzman, L. M., Belin, D., Carson, M. J. & Beckwith, J. Tight regulation, modulation, and high-level expression by vectors containing the arabinose PBAD promoter. J. Bacteriol. 177, 4121-4130 (1995).
18. Nirenberg, M. W. & Matthaei, J. H. The dependence of cell-free protein synthesis in E. coli upon naturally occurring or synthetic polyribonucleotides. Proc. Natl. Acad. Sci. U. S. A. 47, 1588-1602 (1961).
19. Katzen, F., Chang, G. & Kudlicki, W. The past, present and future of cell-free protein synthesis. Trends Biotechnol. 23, 150-156 (2005).
20. Elbaz, Y., Steiner-Mordoch, S., Danieli, T. & Schuldiner, S. In vitro synthesis of fully functional EmrE, a multidrug transporter, and study of its oligomeric state. Proc. Natl. Acad. Sci. U. S. A. 101, 1519-1524 (2004).
21. Klammt, C. et al. High level cell-free expression and specific labeling of integral membrane proteins. Eur. J. Biochem. 271, 568-580 (2004).
22. Katzen, F. et al. Insertion of Membrane Proteins into Discoidal Membranes Using a Cell-Free Protein Expression Approach. J. Proteome Res. 7, 3535-3542 (2008).
23. Ma, Y. et al. Preparative Scale Cell-free Production and Quality Optimization of MraY Homologues in Different Expression Modes. J. Biol. Chem. 286, 38844-38853 (2011).
24. Vance, J. E. & Vance, D. E. Biochemistry of Lipids, Lipoproteins and Membranes. (Elsevier, 2008).
25. Hays, F. A., Roe-Zurz, Z. & Stroud, R. M. in Methods in Enzymology (ed. Jonathan Weissman; Christine Guthrie and Gerald R. Fink) 470, 695-707 (Academic Press, 2010).