Enolase Bridging Project
John A. Gerlt
Enzyme Function Initiative (EFI) Advisory Committee Meeting
November 30, 2011

Enolase superfamily: partition of specificity and chemistry
Capping Domain: Specificity
Barrel Domain: Acid/base chemistry

Enolase superfamily: > 20 assigned functions
[Figure: reaction schemes for the assigned functions. Enzyme and substrate labels recoverable from the slide: MR ((R)- and (S)-mandelate); Enolase (2-PGA, PEP); MAL (3-methyl Asp, mesaconate); MLE (cis,cis-muconate, muconolactone); OSBS (SHCHC, OSB); NSAR (N-Succ-L-Arg, N-Succ-D-Arg); AE Epim (L-Ala-D-Glu, L-Ala-L-Glu); FucD (L-fuconate); GlcD (D-gluconate); TarD (D-tartrate); XylD (D-xylonate); AraD (D-arabinonate); GalD (D-galactonate); RhamD (L-rhamnonate); ManD (D-mannonate); GlucD and GlucD-II (D-glucarate, L-idarate); GalrD and GalrD/TalrD (galactarate, L-talarate).]

Current directions
1. Superfamily/Genome, Protein/Structure, and Computation Cores: boundaries between functions ManD, GlucD, TarD, MR
2. Computation Core: 2PMQ by operon docking
3. Microbiology Core: New pathways and functions, focusing on Agrobacterium tumefaciens

SSN: dehydratases in EN superfamily
[Figure: sequence similarity network. Legend:]
Mandelate racemase
D-Glucarate dehydratase
D-Mannonate dehydratase
Galactarate dehydratase
D-Arabinonate dehydratase
Galactonate dehydratase
Gluconate dehydratase
D-Tartrate dehydratase
L-Fuconate dehydratase
Galactarate/L-talarate dehydratase
L-Rhamnonate dehydratase
Unknown

SSN: dehydratases in EN superfamily
D-Mannonate dehydratase (ManD)

Boundaries between functions: ManD
e-80: 35% seq. id.
e-184: 70% seq. id.
≥ 70% sequence identity: functional significance?
Activities: ManD; low ManD and/or GluD; none

Structures
Conserved active site structures: 2QJJ, 3BSM, 3DFH, 3GY1
Conserved structures, except for active site loop: protein-protein interactions?
2QJJ, 3BSM, 3DFH, 3GY1

Unknown family in the MLE subgroup: 2PMQ
[Figure: sequence similarity network. Legend:]
MLE
MLEII
MLE 2
OSBS
NSAR
NSAR 2
Dipeptide epimerase
Unknown

2PMQ: Structure with No Function (SNF) from PSI-2

Operon docking: retrospective glycolysis
C. Kalyanaraman and M. P. Jacobson. "Studying enzyme-substrate specificity in silico: A case study of the E. coli glycolysis pathway", Biochemistry, 49 (2010) 4003-4005.
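The retrospective glycolysis test reports, for each enzyme, the rank of its known substrate as a percentage of the docked ligand library. The following is a minimal sketch of how such a rank could be computed. It assumes that lower (more negative) docking scores are better and that the rank is the substrate's position in the sorted list divided by the library size. The function name, ligand names, and scores are hypothetical and are not taken from the cited study.

```python
def rank_percent(scores, substrate):
    """Return the rank of `substrate` as a percentage of the library.

    `scores` maps ligand name -> docking score; lower is assumed better.
    """
    ordered = sorted(scores, key=scores.get)  # best (lowest) score first
    position = ordered.index(substrate) + 1   # 1-based rank
    return 100.0 * position / len(ordered)


# Hypothetical toy library of four ligands with made-up scores.
toy_scores = {
    "ligand_a": -9.1,
    "ligand_b": -7.4,
    "ligand_c": -5.0,
    "ligand_d": -3.2,
}

# ligand_b is second best of four ligands.
print(rank_percent(toy_scores, "ligand_b"))
```

With a library of many thousands of metabolites, a rank below 1% would mean the known substrate sits among the top-scoring fraction of candidates.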
Enzyme         Rank (%)   PDB (HM = homology model)
PGI            0.9        HM, 64% to 2cxr
PFK            0.1        1pfk
FBP Aldolase   0.3        HM, 40% to 3elf
TIM            0.1        7tim
G3PDH          0.8        HM, 58% to 1nqa
PGK            0.5        HM, 45% to 1vpe
PGM            0.03       HM, 49% to 1ejj
Enolase        0.2        1ebj

2PMQ gene cluster (Pelagibaca bermudensis)
Transporter: A Trp "cage" for a betaine
Dioxygenase/hydroxylase: Homologues use aromatics
Closest liganded homologues:
60% with 3N0Q, unliganded
18% with 1O7N: a naphthalene dioxygenase, cocrystallized with indole

Template: 1O7N
Smaller active site: indole in 1O7N has many steric clashes

2PMQ: Docking, a small active site
Docked with 4-hydroxyproline
Top 5 docking hits
[Figure: structures of docking hits 1 to 5]

Experimental testing of 2PMQ prediction
4-OH Pro betaine: kcat/Km = 4300 M⁻¹ s⁻¹
Pro betaine: kcat/Km = 380 M⁻¹ s⁻¹
Genome context was helpful, but structures were essential
First amino acid racemase in EN superfamily

Proposed pathway for 4-OH Pro utilization
[Figure: proposed pathway, with 2PMQ acting on 4-OH Pro betaine]
Metabolomics to confirm pathway is in progress

SSN: dehydratases in EN superfamily
Nine Agrobacterium tumefaciens dehydratases
Four SNFs
Labels: FucD, 1RVK, 3DIP, 2NQL, 3TJ4

1RVK: ordered 20s loop, large active site?
Library screening: 1RVK is a novel GlucD
D-Glucarate: kcat = 0.36 s⁻¹, Km = 45 µM, kcat/Km = 7.4 × 10³ M⁻¹ s⁻¹
L-Idarate: kcat = 0.33 s⁻¹, Km = 54 µM, kcat/Km = 6.1 × 10³ M⁻¹ s⁻¹
[Figure: 1RVK-catalyzed dehydration (- H2O) of D-glucarate and L-idarate]
Does Agrobacterium utilize D-glucarate as a carbon source?

Complex with L-lyxarohydroxamate

Novel pathway for D-glucuronate catabolism?

Phenotypic/metabolomic analyses by Micro Core
1. Cosmid library (from S. Farrand, UIUC)
2. Identification of dehydratase cosmids
3. Wanner mutagenesis of cosmids in E. coli
4. Transformation and recombination of mutant cosmids into A. tumefaciens C58
5. Phenotypic analyses (BioLog)
6. Metabolomics to discover pathways
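The kinetic constants quoted in this talk for 1RVK and 2PMQ can be cross-checked with a few lines of arithmetic. The sketch below assumes that the Km values for 1RVK are in micromolar (the unit prefix appears to have been lost in extraction); the function and variable names are illustrative only. With the rounded values, the L-idarate ratio reproduces the reported 6.1 × 10³ M⁻¹ s⁻¹, while the D-glucarate ratio comes to roughly 8.0 × 10³, a little above the reported 7.4 × 10³. The difference may reflect rounding or independently fitted parameters.

```python
def catalytic_efficiency(kcat_per_s, km_micromolar):
    """Return kcat/Km in M^-1 s^-1 from kcat (s^-1) and Km (micromolar)."""
    km_molar = km_micromolar * 1e-6  # convert micromolar to molar
    return kcat_per_s / km_molar


# Values as quoted for 1RVK; Km units are ASSUMED to be micromolar.
glucarate = catalytic_efficiency(0.36, 45)
idarate = catalytic_efficiency(0.33, 54)

# Ratio of the quoted 2PMQ kcat/Km values (M^-1 s^-1):
# 4-OH Pro betaine versus Pro betaine.
fold_preference = 4300 / 380

print(f"1RVK, D-glucarate: {glucarate:.2e} M^-1 s^-1")
print(f"1RVK, L-idarate:   {idarate:.2e} M^-1 s^-1")
print(f"2PMQ preference for 4-OH Pro betaine: {fold_preference:.1f}-fold")
```

On these numbers, 1RVK treats D-glucarate and L-idarate with similar efficiency, whereas 2PMQ favors 4-OH Pro betaine over Pro betaine by about an order of magnitude.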