The GlycO Ontology in Protégé
• 3 top-level classes are defined in GlycO.
• Semantics include chemical context.
• This class inherits from 2 parents.
• The a-D-Manp residues in N-glycans are found in 8 different chemical environments.

GlycoTree: A Canonical Representation of N-Glycans

b-D-GlcpNAc-(1-6)+
b-D-GlcpNAc-(1-2)-a-D-Manp-(1-6)+
                                b-D-Manp-(1-4)-b-D-GlcpNAc-(1-4)-b-D-GlcpNAc
b-D-GlcpNAc-(1-4)-a-D-Manp-(1-3)+
b-D-GlcpNAc-(1-2)+

Semantics! We give a residue in this position the same name, regardless of the specific structure it resides in.

N. Takahashi and K. Kato, Trends in Glycosciences and Glycotechnology, 15: 235-251

GlycO Population
• The next slides show the different steps that were necessary to populate GlycO with glycan structures from multiple sources.

Ontology population workflow
[Flowchart. Elements: Semagix Freedom knowledge extractor; Instance Data; Has CarbBank ID? (YES / NO); Already in KB? (YES: next instance / NO); IUPAC to LINUCS; LINUCS to GLYDE; Compare to Knowledge Base; Insert into KB]
Ontology population workflow: output of the "IUPAC to LINUCS" step

[][Asn]{[(4+1)][b-D-GlcpNAc]{[(4+1)][b-D-GlcpNAc]{[(4+1)][b-D-Manp]{[(3+1)][a-D-Manp]{[(2+1)][b-D-GlcpNAc]{}[(4+1)][b-D-GlcpNAc]{}}[(6+1)][a-D-Manp]{[(2+1)][b-D-GlcpNAc]{}}}}}}

Ontology population workflow: output of the "LINUCS to GLYDE" step

<Glycan>
  <aglycon name="Asn"/>
  <residue link="4" anomeric_carbon="1" anomer="b" chirality="D" monosaccharide="GlcNAc">
    <residue link="4" anomeric_carbon="1" anomer="b" chirality="D" monosaccharide="GlcNAc">
      <residue link="4" anomeric_carbon="1" anomer="b" chirality="D" monosaccharide="Man">
        <residue link="3" anomeric_carbon="1" anomer="a" chirality="D" monosaccharide="Man">
          <residue link="2" anomeric_carbon="1" anomer="b" chirality="D" monosaccharide="GlcNAc">
          </residue>
          <residue link="4" anomeric_carbon="1" anomer="b" chirality="D" monosaccharide="GlcNAc">
          </residue>
        </residue>
        <residue link="6" anomeric_carbon="1" anomer="a" chirality="D" monosaccharide="Man">
          <residue link="2" anomeric_carbon="1" anomer="b" chirality="D" monosaccharide="GlcNAc">
          </residue>
        </residue>
      </residue>
    </residue>
  </residue>
</Glycan>

EnzyO
• The enzyme ontology EnzyO is highly intertwined with GlycO. While its structure is mostly that of a taxonomy, it is highly restricted at the class level and hence allows for comfortable classification of enzyme instances from multiple organisms.

Putting it together
• GlycO together with EnzyO contains all the information that is needed for the description of metabolic pathways, e.g.
N-Glycan Biosynthesis.

N-Glycosylation metabolic pathway
• N-glycan_beta_GlcNAc_9
• GNT-I attaches GlcNAc at position 2
• N-acetyl-glucosaminyl_transferase_V
• N-glycan_alpha_man_4
• GNT-V attaches GlcNAc at position 6

UDP-N-acetyl-D-glucosamine + alpha-D-Mannosyl-1,3-(R1)-beta-D-mannosyl-R2 <=> UDP + N-Acetyl-beta-D-glucosaminyl-1,2-alpha-D-mannosyl-1,3-(R1)-beta-D-mannosyl-R2

UDP-N-acetyl-D-glucosamine + G00020 <=> UDP + G00021

Pathway representation in GlycO
• Pathways do not need to be explicitly defined in GlycO. The residue, glycan, enzyme, and reaction descriptions contain all the knowledge necessary to infer pathways.

Zooming in a little …
[Figure labels: Reaction R05987; catalyzed by enzyme 2.4.1.145; adds_glycosyl_residue N-glycan_b-D-GlcpNAc_13]
• The N-glycan with KEGG ID 00015 is the substrate of reaction R05987, which is catalyzed by an enzyme of the class EC 2.4.1.145.
• The product of this reaction is the glycan with KEGG ID 00020.
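
The idea that pathways can be inferred rather than stored can be illustrated with a minimal Python sketch. It is not how GlycO or its reasoner works internally; it only assumes that each reaction description names one glycan substrate and one glycan product, and chains reactions whose product matches the next substrate. The `Reaction` class, the `infer_pathway` function, the `G`-prefixed glycan IDs for the substrate and product of R05987, and the placeholder ID `R_unnamed` (for the G00020 to G00021 reaction, whose reaction ID and enzyme the slides do not give) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Reaction:
    """Hypothetical, simplified stand-in for a reaction description."""
    reaction_id: str
    substrate: str               # glycan consumed (KEGG-style ID)
    product: str                 # glycan produced (KEGG-style ID)
    enzyme: Optional[str] = None  # EC class, if known


# Facts taken from the slides, plus one placeholder reaction ID.
REACTIONS = [
    Reaction("R05987", "G00015", "G00020", "EC 2.4.1.145"),
    Reaction("R_unnamed", "G00020", "G00021"),  # ID and enzyme not given in the source
]


def infer_pathway(start: str, reactions: List[Reaction]) -> List[Reaction]:
    """Chain reactions from `start` by matching each product to the next substrate.

    Follows only the first unvisited branch at each glycan, so it yields a
    single linear pathway rather than the full reaction network.
    """
    by_substrate: Dict[str, List[Reaction]] = {}
    for reaction in reactions:
        by_substrate.setdefault(reaction.substrate, []).append(reaction)

    pathway: List[Reaction] = []
    visited = {start}
    current = start
    while True:
        candidates = by_substrate.get(current, [])
        step = next((r for r in candidates if r.product not in visited), None)
        if step is None:
            break
        pathway.append(step)
        visited.add(step.product)  # guards against cycles from reversible reactions
        current = step.product
    return pathway


if __name__ == "__main__":
    for step in infer_pathway("G00015", REACTIONS):
        enzyme = step.enzyme or "enzyme not specified"
        print(f"{step.substrate} -> {step.product} via {step.reaction_id} ({enzyme})")
```

Because the chain is rebuilt from the individual reaction descriptions on every call, adding a new reaction extends every pathway that passes through its substrate without any pathway definition having to be edited.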