Biological Expression Language Overview

August 2012
This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy
of this license, visit http://creativecommons.org/licenses/by/3.0/ or send a letter to Creative
Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041, USA.
Contents
• BEL Statements
• BEL Statement Annotations
• BEL Terms
• BEL Functions
• BEL Relationships
• General Hints
BEL Statements
• Basic statement types:
– Term Expression, Relationship, Term Expression
• p(HGNC:CCND1) directlyIncreases kin(p(HGNC:CDK4))
– Term Expression
• complex(p(HGNC:CCND1), p(HGNC:CDK4))
• Example: a(CHEBI:corticosteroid) -> path(MESHD:"Insulin Resistance")
– Term Expression a(CHEBI:corticosteroid): the abundance of molecules designated by the name “corticosteroid” in the CHEBI namespace
– Relationship ->: increases
– Term Expression path(MESHD:"Insulin Resistance"): the pathology designated by the name “Insulin Resistance” in the MESHD namespace
• Complex statement type:
– A causal statement can be used as the target term of a causal statement
– Term Expression, Causal Relationship, Causal Statement
• p(HGNC:CLSPN) -> (kin(p(HGNC:ATR)) => p(HGNC:CHEK1, pmod(P)))
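The statement shapes above (a lone term, term-relationship-term, and a nested causal statement as the target) can be sketched in a few lines of Python. This is an illustration only, not the BEL compiler's parser. The function names are hypothetical, and the short-form table is an assumption based on BEL v1.0 (`->` increases, `=>` directlyIncreases, `-|` decreases, `=|` directlyDecreases).

```python
# Short-form relationship symbols (assumed from BEL v1.0).
SHORT_FORMS = {
    "->": "increases",
    "=>": "directlyIncreases",
    "-|": "decreases",
    "=|": "directlyDecreases",
}


def top_level_tokens(text):
    """Split on whitespace that is outside parentheses and outside quotes."""
    tokens, current, depth, in_quotes = [], [], 0, False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
        if ch.isspace() and depth == 0 and not in_quotes:
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


def parse_statement(text):
    """Return a dict with subject, relationship, and object.

    A statement that is only a term has relationship and object set to None.
    A parenthesized object is parsed recursively as a nested statement.
    """
    tokens = top_level_tokens(text.strip())
    if len(tokens) == 1:
        return {"subject": tokens[0], "relationship": None, "object": None}
    if len(tokens) != 3:
        raise ValueError(f"expected 1 or 3 top-level parts, got {len(tokens)}")
    subject, relationship, obj = tokens
    relationship = SHORT_FORMS.get(relationship, relationship)
    if obj.startswith("(") and obj.endswith(")"):
        obj = parse_statement(obj[1:-1])
    return {"subject": subject, "relationship": relationship, "object": obj}


nested = parse_statement(
    "p(HGNC:CLSPN) -> (kin(p(HGNC:ATR)) => p(HGNC:CHEK1, pmod(P)))"
)
print(nested["relationship"], nested["object"]["relationship"])
```

The sketch assumes that relationships are separated from terms by whitespace and that quotes are balanced. It does not validate function names or namespaces.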
BEL Statement Annotations
• Annotations provide information about one or more BEL
Statements
SET Citation = {"PubMed", "J Mol Med", "12682725", "2003-03-14", "Limbourg FP|Liao JK", ""}
SET Evidence = "high-dose steroid treatment decreases vascular inflammation and ischemic tissue damage after myocardial infarction and stroke through direct vascular effects involving the nontranscriptional activation of eNOS"
SET Species = "9606"
SET Tissue = "Vascular System"
SET Disease = "Stroke"
a(CHEBI:corticosteroid) -| bp(MESHD:"Inflammation")
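A minimal Python sketch of how SET annotations might be attached to the statements that follow them. It assumes each SET line fits on one line and that an annotation stays in effect until a later SET overwrites it. The overview only says annotations apply to "one or more" statements, so this scoping rule is an assumption. `read_script` and `parse_value` are hypothetical helper names.

```python
import re

# Matches lines of the form: SET Name = value
SET_LINE = re.compile(r'^SET\s+(\w+)\s*=\s*(.+)$')


def parse_value(raw):
    """Turn a {...} list into a Python list, otherwise strip surrounding quotes."""
    raw = raw.strip()
    if raw.startswith("{") and raw.endswith("}"):
        return re.findall(r'"([^"]*)"', raw)
    return raw.strip('"')


def read_script(lines):
    """Return (statement, annotations) pairs for every non-SET line."""
    annotations = {}
    results = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        match = SET_LINE.match(line)
        if match:
            name, value = match.groups()
            annotations[name] = parse_value(value)
        else:
            # Copy the dict so later SET lines do not change earlier results.
            results.append((line, dict(annotations)))
    return results


script = [
    'SET Species = "9606"',
    'SET Tissue = "Vascular System"',
    'SET Disease = "Stroke"',
    'a(CHEBI:corticosteroid) -| bp(MESHD:"Inflammation")',
]
for statement, context in read_script(script):
    print(statement, context)
```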
BEL Terms
function(ns:value)
• BEL terms minimally have the following components:
– Function
• Required
• Can be nested to create complex terms
– Namespace Abbreviation
• Optional
– Value
• Required
• Generally found in the referenced namespace
• BEL terms using values from different namespaces
can be equivalenced
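The function(ns:value) pattern can be illustrated with a small regular expression. This sketch handles only unnested terms with a single value, and the name `parse_simple_term` is hypothetical. Nested terms and terms with modification arguments would need a real parser.

```python
import re

# function ( optional-namespace : value ), where value is quoted or a bare token
TERM = re.compile(
    r'^(?P<function>\w+)\('
    r'(?:(?P<namespace>\w+):)?'
    r'(?P<value>"[^"]*"|[^\s(),"]+)'
    r'\)$'
)


def parse_simple_term(term):
    """Return (function, namespace, value); namespace is None when omitted."""
    match = TERM.match(term.strip())
    if match is None:
        raise ValueError(f"not a simple BEL term: {term!r}")
    value = match.group("value").strip('"')
    return match.group("function"), match.group("namespace"), value


print(parse_simple_term("a(CHEBI:corticosteroid)"))
print(parse_simple_term('path(MESHD:"Insulin Resistance")'))
```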
• Example: a(CHEBI:corticosteroid)
– Function: abundance()
– Namespace abbreviation: CHEBI
– Namespace value: corticosteroid
• Example: path(MESHD:"Insulin Resistance")
– Function: pathology()
– Namespace abbreviation: MESHD
– Namespace value: "Insulin Resistance"
Equivalence of Terms
• p(EG:207)
– “the abundance of the protein designated by EntrezGene id 207” (human AKT1)
• p(SPAC:P31749)
– “the abundance of the protein designated by Swiss-Prot id P31749” (human AKT1)
• p(HGNC:AKT1)
– “the abundance of the protein designated by HGNC gene symbol ‘AKT1’” (human AKT1)
• All three terms can unify to p(HGNC:AKT1) in the KAM
• Terms are unified during compilation using information in the BEL
namespace equivalence documents
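As a rough illustration of unification, the sketch below maps equivalent namespace values onto one preferred term. The hand-written `EQUIVALENCES` table is a hypothetical stand-in for the namespace equivalence documents. The real compiler's data format and its choice of preferred namespace are not described in this overview.

```python
# Hypothetical equivalence table: (namespace, value) -> preferred (namespace, value)
EQUIVALENCES = {
    ("EG", "207"): ("HGNC", "AKT1"),
    ("SPAC", "P31749"): ("HGNC", "AKT1"),
}


def unify(function, namespace, value):
    """Rewrite a term to its preferred namespace value when an equivalence is known."""
    namespace, value = EQUIVALENCES.get((namespace, value), (namespace, value))
    return f"{function}({namespace}:{value})"


terms = [("p", "EG", "207"), ("p", "SPAC", "P31749"), ("p", "HGNC", "AKT1")]
print({unify(*term) for term in terms})
```

All three input terms collapse to a single entry in the resulting set.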
BEL Functions
• Types of functions:
– Abundances
– Processes
– Modifications of abundances
– Activities
– Transformations
– List functions
• Abundances and processes are applied directly to
namespace values
• All other functions are applied to abundance functions!
BEL Functions - Abundances
• Abundances
– abundance(), a()
– geneAbundance(), g()
– rnaAbundance(), r()
– microRNAAbundance(), m()
– complexAbundance(), complex()
– compositeAbundance(), composite()
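Each abundance function in the list above has a long form and a short form. The sketch below expands the short forms inside a term, using only the pairs listed on this slide. `p()` is left untouched because its long form is not given in the list. `expand_function_names` is a hypothetical helper, and the regex would also rewrite a matching name that appears inside a quoted value, which is acceptable for illustration only.

```python
import re

# Short form -> long form, taken from the abundance function list above.
ABUNDANCE_FUNCTIONS = {
    "a": "abundance",
    "g": "geneAbundance",
    "r": "rnaAbundance",
    "m": "microRNAAbundance",
    "complex": "complexAbundance",
    "composite": "compositeAbundance",
}


def expand_function_names(term):
    """Replace known short-form function names that are followed by '('."""
    def replace(match):
        name = match.group(1)
        return ABUNDANCE_FUNCTIONS.get(name, name) + "("

    return re.sub(r'\b(\w+)\(', replace, term)


print(expand_function_names("complex(p(HGNC:TP53), g(HGNC:CDKN1A))"))
```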
abundance(), a()
• Use abundance() to represent any abundances that
are not represented by a more specific abundance
type, including:
– Chemicals
• a(CHEBI:corticosteroid)
– Cellular structures
• a(GOCCTERM:"astral microtubule")
• No modification functions apply to abundance terms
• Generally, activity functions do not apply to
abundance terms
geneAbundance(), g()
• Use geneAbundance terms to represent DNA
– Can use to represent gene amplification and deletion
events
– Used in "gene scaffolding"
• g(HGNC:AKT1) transcribedTo r(HGNC:AKT1)
– Use in complexes to represent binding to promoters
• complex(p(HGNC:TP53), g(HGNC:CDKN1A))
• In BEL v1.0, the only modification function that can
be applied to gene abundances is fusion()
– g(HGNC:TMPRSS2,fusion(HGNC:ERG))
• No activity functions apply to geneAbundance terms
complexAbundance(), complex()
• Use complexAbundance() to represent molecular
complexes and binding events
• complexAbundance terms can take two forms:
– complexAbundance(ns:value)
• Used for named complexes
• E.g., complexAbundance(NCH:"AP-1 Complex")
– complexAbundance(<abundance term list>)
• Use to represent binding events or to define complexes by
components
• Unordered list
• E.g., complex(p(HGNC:FOS),p(HGNC:JUN))
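Because the component list of a complex is unordered, two terms that list the same members in a different order denote the same complex. One way to compare them, sketched here under the assumption that sorting the member strings is an acceptable canonical order, is to normalize the list before comparing. `split_arguments` and `canonical_complex` are hypothetical helper names.

```python
def split_arguments(arg_text):
    """Split on commas that are outside parentheses and outside quotes."""
    args, current, depth, in_quotes = [], [], 0, False
    for ch in arg_text:
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
        if ch == "," and depth == 0 and not in_quotes:
            args.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    args.append("".join(current).strip())
    return args


def canonical_complex(term):
    """Return the complex term with its members sorted alphabetically."""
    prefix, _, rest = term.partition("(")
    if prefix not in ("complex", "complexAbundance") or not rest.endswith(")"):
        raise ValueError(f"not a complex term: {term!r}")
    members = sorted(split_arguments(rest[:-1]))
    return f"{prefix}({', '.join(members)})"


first = canonical_complex("complex(p(HGNC:FOS),p(HGNC:JUN))")
second = canonical_complex("complex(p(HGNC:JUN), p(HGNC:FOS))")
print(first == second)
```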
compositeAbundance(), composite()
• Use to represent cases where multiple abundances
synergize to produce an effect
– Composite terms should not be used if any of the
abundances alone are reported to cause the effect
– Use composite terms only as subjects of statements
– E.g., composite(p(HGNC:TGFB1), p(HGNC:IL6))
BEL Functions - Processes
• Processes include biological phenomena that occur
at the level of the cell or organism
– biologicalProcess(), bp()
• E.g., bp(GO:"cellular senescence")
– pathology(), path()
• E.g., path(MESHD:"Muscle Hypotonia")
BEL Functions – Abundance Modifications
• Modifications are functions used as arguments within
abundance functions
• Currently supported modification types are:
– Variants - use to represent protein sequence variants, generally
resulting from a mutation or polymorphism
• substitution(), truncation(), fusion()
• E.g., p(HGNC:PIK3CA, sub(E, 545, K))
– PIK3CA protein with glutamic acid 545 substituted with a lysine
– Protein Modifications - use to represent post-translational
modifications of proteins
• Includes phosphorylation, ubiquitination, acetylation, glycosylation
• proteinModification()
• E.g., p(HGNC:HIF1A, pmod(H, N, 803))
– Modification of HIF1A by hydroxylation at amino acid asparagine 803
BEL Functions - Activities
• Activity functions are applied to protein, complex, and RNA
abundances to specify the frequency of events resulting from
the molecular activity of the abundance
– E.g., tport(complex(NCH:"EnaC Complex"))
• Transporter activity of the EnaC sodium channel complex
• Distinguishing an activity from the underlying abundance is particularly
useful for proteins whose activities are regulated by post-translational modification
• BEL v1.0 supports 10 distinct activity functions:
– catalyticActivity, peptidaseActivity, gtpBoundActivity, transportActivity,
chaperoneActivity, transcriptionalActivity, molecularActivity,
kinaseActivity, phosphataseActivity, ribosylaseActivity
• molecularActivity() should be used to represent activities that
are not represented by a more specific function
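The rule that activity functions apply only to protein, complex, and RNA abundances can be checked mechanically. The sketch below is a simplified check, not the compiler's validation. It uses the ten long-form names listed above plus the two short forms that appear in this overview (`kin`, `tport`). The long form `proteinAbundance` for `p()` is an assumption, and `is_valid_activity` is a hypothetical name.

```python
import re

ACTIVITY_FUNCTIONS = {
    "catalyticActivity", "peptidaseActivity", "gtpBoundActivity",
    "transportActivity", "chaperoneActivity", "transcriptionalActivity",
    "molecularActivity", "kinaseActivity", "phosphataseActivity",
    "ribosylaseActivity",
}
# Only the short forms used in this overview are mapped here.
SHORT_ACTIVITIES = {"kin": "kinaseActivity", "tport": "transportActivity"}
# Protein, complex, and RNA abundance functions ("proteinAbundance" is assumed).
ALLOWED_TARGETS = {
    "p", "proteinAbundance", "complex", "complexAbundance", "r", "rnaAbundance",
}


def is_valid_activity(term):
    """True when an activity function directly wraps an allowed abundance function."""
    match = re.match(r'^(\w+)\((\w+)\(', term.strip())
    if match is None:
        return False
    outer, inner = match.groups()
    outer = SHORT_ACTIVITIES.get(outer, outer)
    return outer in ACTIVITY_FUNCTIONS and inner in ALLOWED_TARGETS


print(is_valid_activity('tport(complex(NCH:"EnaC Complex"))'))
print(is_valid_activity("kin(g(HGNC:AKT1))"))
```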
BEL Functions - Transformations
• Transformations are events in which one class of
abundance is transformed or changed into a second
class of abundance
– Translocations
• translocation(), tloc()
• cellSecretion(), sec()
• cellSurfaceExpression(), surf()
– Reactions
• reaction(), rxn()
– Degradation
• degradation(), deg()
translocation(), tloc()
• Use translocation terms to represent the movement
of abundances from one cellular location to another
• E.g., tport(complex(NCH:"EnaC Complex")) => \
tloc(a(CHEBI:"sodium(1+)"), MESHCL:"Extracellular Space", \
MESHCL:"Intracellular Space")
– The transport activity of the EnaC Complex translocates
sodium ions from extracellular to intracellular
cellSecretion(), sec()
cellSurfaceExpression(), surf()
• sec() and surf() are convenience functions for
commonly used translocations
degradation(), deg()
• Generally used to indicate complete proteolysis of a
protein
• Do not use to indicate proteolysis which results in
functional cleavage products!
• During compilation Phase I, degradation nodes are
linked to the root abundance with a
directlyDecreases relationship
– E.g., deg(p(HGNC:MAPT))
– Compilation adds:
deg(p(HGNC:MAPT)) =| p(HGNC:MAPT)
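The added edge can be derived from the degradation term itself. A minimal sketch, assuming the term is written as `deg(...)` or `degradation(...)` around a single abundance; `degradation_edge` is a hypothetical name, and this is not the compiler's actual Phase I code.

```python
def degradation_edge(term):
    """Return the directlyDecreases statement implied by a degradation term.

    Returns None when the term is not a degradation term.
    """
    for prefix in ("deg(", "degradation("):
        if term.startswith(prefix) and term.endswith(")"):
            root = term[len(prefix):-1]  # the abundance being degraded
            return f"{term} =| {root}"
    return None


print(degradation_edge("deg(p(HGNC:MAPT))"))
```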
BEL Functions – List Functions
• List functions used for:
– Protein family assignment
• p(PFH:"Cu-Zn SOD Family") hasMembers list(p(HGNC:SOD1), p(HGNC:SOD3))
– Complex component assignment
• complex(GOCCTERM:"gamma-secretase complex") hasComponents \
list(p(HGNC:PSEN1),p(HGNC:NCSTN),p(HGNC:APH1A),p(HGNC:PSEN2))
– Reactants and Products within a reaction term
• rxn(reactants(a(CHEBI:superoxide)), \
products(a(CHEBI:"hydrogen peroxide")))
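The family and component assignments above can be read as shorthand for one statement per list element. The sketch below performs that expansion under the assumption that hasMembers and hasComponents expand to individual hasMember and hasComponent statements. The overview lists both forms but does not spell out the expansion, and the helper names are hypothetical.

```python
PLURAL_TO_SINGULAR = {"hasMembers": "hasMember", "hasComponents": "hasComponent"}


def split_arguments(arg_text):
    """Split on commas that are outside parentheses and outside quotes."""
    args, current, depth, in_quotes = [], [], 0, False
    for ch in arg_text:
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
        if ch == "," and depth == 0 and not in_quotes:
            args.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    args.append("".join(current).strip())
    return args


def expand_list_statement(statement):
    """Expand 'X hasMembers list(a, b)' into ['X hasMember a', 'X hasMember b']."""
    for plural, singular in PLURAL_TO_SINGULAR.items():
        marker = f" {plural} list("
        if marker in statement and statement.endswith(")"):
            subject, _, rest = statement.partition(marker)
            return [f"{subject} {singular} {m}" for m in split_arguments(rest[:-1])]
    return [statement]


family = 'p(PFH:"Cu-Zn SOD Family") hasMembers list(p(HGNC:SOD1), p(HGNC:SOD3))'
for line in expand_list_statement(family):
    print(line)
```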
BEL Relationships
• Causal relationships
– increases, directlyIncreases, decreases, directlyDecreases,
rateLimitingStepOf, causesNoChange
• Correlative relationships
– negativeCorrelation, positiveCorrelation, association
• Biomarker relationships
– biomarkerFor, prognosticBiomarkerFor
• Assignment to groups
– hasMember, hasComponent, hasMembers, hasComponents
• Other
– isA, subProcessOf
• Genomic relationships
– transcribedTo, translatedTo, orthologousTo
BEL Relationships – Compiler Inserted Relationships
• These relationships are not needed for creating BEL
statements
– Used only by the compiler
• actsIn
• hasModification
• hasProduct
• hasVariant
• reactantIn
• translocates
• includes
General BEL Hints
• BEL functions, relationships, and namespace values
are all case sensitive
• Every term must have a function
– Namespace values are always associated with an
abundance or process function
– Exception - cellular location values within a translocation
function
• Namespace values with spaces or unusual characters
require quotes
– E.g., complex(GOCCTERM:"gamma-secretase complex")
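The quoting hint can be applied automatically when terms are generated by a program. This sketch treats anything other than ASCII letters, digits, and underscores as "unusual". That threshold is an assumption, since the overview does not define the exact character set, and `format_value` is a hypothetical helper.

```python
import re

# Values made only of these characters are written without quotes (assumed rule).
PLAIN_VALUE = re.compile(r'[A-Za-z0-9_]+')


def format_value(namespace, value):
    """Return ns:value, quoting the value when it contains spaces or unusual characters."""
    if PLAIN_VALUE.fullmatch(value) is None:
        value = f'"{value}"'
    return f"{namespace}:{value}" if namespace else value


print("complex(" + format_value("GOCCTERM", "gamma-secretase complex") + ")")
print("p(" + format_value("HGNC", "AKT1") + ")")
```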