Biological Expression Language Overview
August 2012

This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/ or send a letter to Creative Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041, USA.

Contents
• BEL Statements
• BEL Statement Annotations
• BEL Terms
• BEL Functions
• BEL Relationships
• General Hints

BEL Statements
• Basic statement types:
  – Term Expression, Relationship, Term Expression
      p(HGNC:CCND1) directlyIncreases kin(p(HGNC:CDK4))
  – Term Expression alone
      complex(p(HGNC:CCND1), p(HGNC:CDK4))
• Example: a(CHEBI:corticosteroid) -> path(MESHD:"Insulin Resistance")
  – Term Expression a(CHEBI:corticosteroid): the abundance of molecules designated by the name “corticosteroid” in the CHEBI namespace
  – Relationship ->: increases
  – Term Expression path(MESHD:"Insulin Resistance"): the pathology designated by the name “Insulin Resistance” in the MESHD namespace
• Complex statement type:
  – A causal statement can be used as the target term of another causal statement
  – Term Expression, Causal Relationship, Causal Statement
      p(HGNC:CLSPN) -> (kin(p(HGNC:ATR)) => p(HGNC:CHEK1, pmod(P)))

BEL Statement Annotations
• Annotations provide information about one or more BEL Statements
    SET Citation = {"PubMed", "J Mol Med", "12682725", "2003-03-14", "Limbourg FP|Liao JK", ""}
    SET Evidence = "high-dose steroid treatment decreases vascular inflammation and ischemic tissue damage after myocardial infarction and stroke through direct vascular effects involving the nontranscriptional activation of eNOS"
    SET Species = "9606"
    SET Tissue = "Vascular System"
    SET Disease = "Stroke"
    a(CHEBI:corticosteroid) -| bp(MESHD:"Inflammation")
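The statement forms above (a term alone, or term, relationship, term, where the last term may itself be a parenthesized causal statement) can be illustrated with a minimal Python sketch. This is not part of any BEL tool: the function name split_statement and the ARROWS table are hypothetical, and the sketch assumes the relationship is the first whitespace-delimited token found outside all parentheses and quotes.

```python
# Short-form causal arrows and the long relationship names they stand for.
ARROWS = {
    "->": "increases",
    "=>": "directlyIncreases",
    "-|": "decreases",
    "=|": "directlyDecreases",
}


def split_statement(statement):
    """Split a BEL statement into (subject, relationship, object).

    Returns (term, None, None) when the statement is a single term.
    Assumption: the relationship is the first token that follows
    whitespace at parenthesis depth 0 and outside double quotes.
    """
    text = statement.strip()
    depth = 0
    in_quotes = False
    for i, ch in enumerate(text):
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            elif ch.isspace() and depth == 0:
                subject = text[:i]
                relationship, _, obj = text[i:].strip().partition(" ")
                # Long-form names such as directlyIncreases pass through unchanged.
                return subject, ARROWS.get(relationship, relationship), obj.strip()
    return text, None, None


if __name__ == "__main__":
    print(split_statement('a(CHEBI:corticosteroid) -> path(MESHD:"Insulin Resistance")'))
    outer = split_statement("p(HGNC:CLSPN) -> (kin(p(HGNC:ATR)) => p(HGNC:CHEK1, pmod(P)))")
    # The object of a complex statement is itself a statement wrapped in parentheses.
    print(split_statement(outer[2][1:-1]))
```

For a complex statement, the object comes back with its enclosing parentheses; stripping them and calling the function again yields the inner causal statement.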
BEL Terms
function(ns:value)
• BEL terms minimally have the following components:
  – Function
    • Required
    • Can be nested to create complex terms
  – Namespace Abbreviation
    • Optional
  – Value
    • Required
    • Generally found in the referenced namespace
• BEL terms using values from different namespaces can be equivalenced

BEL Terms
• Function
  – a(CHEBI:corticosteroid): function is abundance()
  – path(MESHD:"Insulin Resistance"): function is pathology()
• Namespace abbreviation
  – a(CHEBI:corticosteroid): namespace abbreviation is CHEBI
  – path(MESHD:"Insulin Resistance"): namespace abbreviation is MESHD
• Namespace value
  – a(CHEBI:corticosteroid): namespace value is corticosteroid
  – path(MESHD:"Insulin Resistance"): namespace value is "Insulin Resistance"

Equivalence of Terms
• p(EG:207): “the abundance of the protein designated by EntrezGene id 207” (human AKT1)
• p(SPAC:P31749): “the abundance of the protein designated by Swiss-Prot id P31749” (human AKT1)
• p(HGNC:AKT1): “the abundance of the protein designated by HGNC gene symbol ‘AKT1’” (human AKT1)
• All three terms can unify to p(HGNC:AKT1) in the KAM
• Terms are unified during compilation using information in the BEL namespace equivalence documents

BEL Functions
• Types of functions:
  – Abundances
  – Processes
  – Modifications of abundances
  – Activities
  – Transformations
  – List functions
• Abundances and processes are applied directly to namespace values
• All other functions are applied to abundance functions!
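To make the function(ns:value) form concrete, here is a small Python sketch that pulls the three components out of a simple, non-nested term. The names TERM_RE, SHORT_TO_LONG and parse_simple_term are hypothetical and not taken from any BEL library, and the pattern is an assumption that covers only terms whose single argument is a namespace value.

```python
import re

# Short function names used in this overview and their long forms.
SHORT_TO_LONG = {
    "a": "abundance",
    "p": "proteinAbundance",
    "g": "geneAbundance",
    "r": "rnaAbundance",
    "m": "microRNAAbundance",
    "bp": "biologicalProcess",
    "path": "pathology",
}

# function "(" [namespace ":"] value ")" where the value is either quoted
# or a run of characters without spaces, parentheses, commas, colons or quotes.
TERM_RE = re.compile(
    r'^(?P<function>\w+)\('
    r'(?:(?P<namespace>\w+):)?'
    r'(?P<value>"[^"]*"|[^\s(),:"]+)'
    r'\)$'
)


def parse_simple_term(term):
    """Return the function, namespace and value of a non-nested BEL term."""
    match = TERM_RE.match(term.strip())
    if match is None:
        raise ValueError("not a simple function(ns:value) term: %r" % term)
    function = match.group("function")
    return {
        "function": SHORT_TO_LONG.get(function, function),
        "namespace": match.group("namespace"),  # None when omitted
        "value": match.group("value").strip('"'),
    }


if __name__ == "__main__":
    print(parse_simple_term("a(CHEBI:corticosteroid)"))
    print(parse_simple_term('path(MESHD:"Insulin Resistance")'))
```

Nested terms such as kin(p(HGNC:CDK4)) are deliberately rejected with a ValueError; handling them would need a recursive parser rather than a single regular expression.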
BEL Functions - Abundances
• Abundances
  – abundance(), a()
  – geneAbundance(), g()
  – rnaAbundance(), r()
  – microRNAAbundance(), m()
  – complexAbundance(), complex()
  – compositeAbundance(), composite()

abundance(), a()
• Use abundance() to represent any abundances that are not represented by a more specific abundance type, including:
  – Chemicals
    • a(CHEBI:corticosteroid)
  – Cellular structures
    • a(GOCCTERM:"astral microtubule")
• No modification functions apply to abundance terms
• Generally, activity functions do not apply to abundance terms

geneAbundance(), g()
• Use geneAbundance terms to represent DNA
  – Can be used to represent gene amplification and deletion events
  – Used in "gene scaffolding"
    • g(HGNC:AKT1) transcribedTo r(HGNC:AKT1)
  – Use in complexes to represent binding to promoters
    • complex(p(HGNC:TP53), g(HGNC:CDKN1A))
• In BEL v1.0, the only modification function that can be applied to gene abundances is fusion()
  – g(HGNC:TMPRSS2, fusion(HGNC:ERG))
• No activity functions apply to geneAbundance terms

complexAbundance(), complex()
• Use complexAbundance() to represent molecular complexes and binding events
• complexAbundance terms can take two forms:
  – complexAbundance(ns:value)
    • Used for named complexes
    • E.g., complexAbundance(NCH:"AP-1 Complex")
  – complexAbundance(<abundance term list>)
    • Use to represent binding events or to define complexes by their components
    • The list is unordered
    • E.g., complex(p(HGNC:FOS), p(HGNC:JUN))

compositeAbundance(), composite()
• Use to represent cases where multiple abundances synergize to produce an effect
  – Composite terms should not be used if any of the abundances alone is reported to cause the effect
  – Use composite terms only as subjects of statements
  – E.g., composite(p(HGNC:TGFB1), p(HGNC:IL6))

BEL Functions - Processes
• Processes include biological phenomena that occur at the level of the cell or organism
  – biologicalProcess(), bp()
    • E.g., bp(GO:"cellular senescence")
  – pathology(), path()
    • E.g., path(MESHD:"Muscle Hypotonia")

BEL Functions - Abundance Modifications
• Modifications are functions used as arguments within abundance functions
• Currently supported modification types are:
  – Variants: use to represent protein sequence variants, generally resulting from a mutation or polymorphism
    • substitution(), truncation(), fusion()
    • E.g., p(HGNC:PIK3CA, sub(E, 545, K))
      – PIK3CA protein with glutamic acid 545 substituted with a lysine
  – Protein Modifications: use to represent post-translational modifications of proteins
    • Includes phosphorylation, ubiquitination, acetylation, glycosylation
    • proteinModification()
    • E.g., p(HGNC:HIF1A, pmod(H, N, 803))
      – Modification of HIF1A by hydroxylation at amino acid asparagine 803

BEL Functions - Activities
• Activity functions are applied to protein, complex, and RNA abundances to specify the frequency of events resulting from the molecular activity of the abundance
  – E.g., tport(complex(NCH:"EnaC Complex"))
    • Transporter activity of the EnaC sodium channel complex
• Distinguishing an abundance from its activity is particularly useful for proteins whose activities are regulated by post-translational modification
• BEL v1.0 supports 10 distinct activity functions:
  – catalyticActivity, peptidaseActivity, gtpBoundActivity, transportActivity, chaperoneActivity, transcriptionalActivity, molecularActivity, kinaseActivity, phosphataseActivity, ribosylationActivity
• molecularActivity() should be used to represent activities that are not represented by a more specific function

BEL Functions - Transformations
• Transformations are events in which one class of abundance is transformed or changed into a second class of abundance
  – Translocations
    • translocation(), tloc()
    • cellSecretion(), sec()
    • cellSurfaceExpression(), surf()
  – Reactions
    • reaction(), rxn()
  – Degradation
    • degradation(), deg()

translocation(), tloc()
• Use translocation terms to represent the movement of abundances from one cellular location to another
• E.g.,
    tport(complex(NCH:"EnaC Complex")) => \
        tloc(a(CHEBI:"sodium(1+)"), MESHCL:"Extracellular Space", \
        MESHCL:"Intracellular Space")
  – The transport activity of the EnaC Complex translocates sodium ions from the extracellular space to the intracellular space

cellSecretion(), sec()
cellSurfaceExpression(), surf()
• sec() and surf() are convenience functions for commonly used translocations

degradation(), deg()
• Generally used to indicate complete proteolysis of a protein
• Do not use to indicate proteolysis that results in functional cleavage products!
• During compilation Phase I, degradation nodes are linked to the root abundance with a directlyDecreases relationship
  – E.g., deg(p(HGNC:MAPT))
  – Compilation adds: deg(p(HGNC:MAPT)) =| p(HGNC:MAPT)

BEL Functions - List Functions
• List functions are used for:
  – Protein family assignment
    • p(PFH:"Cu-Zn SOD Family") hasMembers list(p(HGNC:SOD1), p(HGNC:SOD3))
  – Complex component assignment
    • complex(GOCCTERM:"gamma-secretase complex") hasComponents \
          list(p(HGNC:PSEN1), p(HGNC:NCSTN), p(HGNC:APH1A), p(HGNC:PSEN2))
  – Reactants and products within a reaction term
    • rxn(reactants(a(CHEBI:superoxide)), \
          products(a(CHEBI:"hydrogen peroxide")))

BEL Relationships
• Causal relationships
  – increases, directlyIncreases, decreases, directlyDecreases, rateLimitingStepOf, causesNoChange
• Correlative relationships
  – negativeCorrelation, positiveCorrelation, association
• Biomarker relationships
  – biomarkerFor, prognosticBiomarkerFor
• Assignment to groups
  – hasMember, hasComponent, hasMembers, hasComponents
• Other
  – isA, subProcessOf
• Genomic relationships
  – transcribedTo, translatedTo, orthologousTo

BEL Relationships - Compiler Inserted Relationships
• These relationships are not needed for creating BEL statements
  – Used only by the compiler
• actsIn
• hasModification
• hasProduct
• hasVariant
• reactantIn
• translocates
• includes

General BEL Hints
• BEL functions, relationships, and namespace values are all case sensitive
• Every term must have a function
  – Namespace values are always associated with an abundance or process function
  – Exception: cellular location values within a translocation function
• Namespace values with spaces or unusual characters require quotes
  – E.g., complex(GOCCTERM:"gamma-secretase complex")
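The quoting hint can be sketched as a small Python helper that builds a term and adds quotes only when needed. The helper names quote_value and make_term are hypothetical, and treating anything other than letters, digits and underscores as an "unusual character" is an assumption made for this sketch, not a rule stated in this overview.

```python
import re

# Assumption: a value made only of letters, digits and underscores is
# safe to write without quotes; anything else gets quoted.
_PLAIN_VALUE = re.compile(r"^[A-Za-z0-9_]+$")


def quote_value(value):
    """Return the namespace value, wrapped in double quotes when required."""
    if '"' in value:
        # Embedded quotes are outside the scope of this sketch.
        raise ValueError("value contains a double quote: %r" % value)
    if _PLAIN_VALUE.match(value):
        return value
    return '"' + value + '"'


def make_term(function, namespace, value):
    """Build a function(ns:value) term, quoting the value if needed."""
    return "%s(%s:%s)" % (function, namespace, quote_value(value))


if __name__ == "__main__":
    print(make_term("complex", "GOCCTERM", "gamma-secretase complex"))
    print(make_term("p", "HGNC", "AKT1"))
    print(make_term("a", "CHEBI", "sodium(1+)"))
```

Because BEL is case sensitive, the helper passes function names and values through unchanged and never normalizes their case.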