1 SRI International Bioinformatics
Represents information about a reaction that is independent of enzymes that catalyze the reaction
Connected to enzyme(s) via enzymatic reaction frames
Classified with EC system when possible
Example: 2.7.7.7 – DNA-directed DNA polymerization
Carried out by five enzymes in E. coli
A necessary bridge between enzymes and “generic” versions of reactions
Carries information specific to an enzyme/reaction combination:
Cofactors and prosthetic groups
Alternative substrates
Links to regulatory interactions
Kinetic data (Km, Vmax, etc.)
Frame is generated when protein is associated with reaction (via protein or reaction editor)
Left/Right reflect direction of reaction as written by Enzyme Commission
Reflects systematic direction for different reaction classes
Left/Right do not necessarily correspond to physiological direction of a reaction
Get-rxn-direction(rxn)
Returns :L2R or :R2L or :BOTH or NIL
Integrates all available info about direction of this reaction
Direction explicitly curated for reaction
Direction(s) in which it occurs in all pathways in the PGDB
Direction(s) as specified in Enzymatic-Reactions
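One way to picture how several sources of direction evidence might be merged is sketched below. This is only an illustration: `combine-directions` is a hypothetical helper, not part of the Pathway Tools API, and the merge rule (any disagreement or reversible evidence yields :BOTH) is an assumption, not the documented behavior of Get-rxn-direction.

```lisp
;; Hypothetical helper, NOT part of the Pathway Tools API.
;; Each element of EVIDENCE is :L2R, :R2L, :BOTH, or NIL (no information),
;; e.g. one entry per curated direction, pathway, or enzymatic reaction.
(defun combine-directions (evidence)
  "Merge direction evidence into :L2R, :R2L, :BOTH, or NIL (assumed rule)."
  (let ((known (remove nil evidence)))
    (cond ((null known) nil)                  ; no information at all
          ((member :both known) :both)        ; reversible somewhere
          ((and (member :l2r known)
                (member :r2l known)) :both)   ; observed in both directions
          (t (first known)))))                ; all evidence agrees
```

For example, `(combine-directions '(:l2r nil :l2r))` yields :L2R under this rule, while conflicting evidence such as `'(:l2r :r2l)` yields :BOTH.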
Balance-state
EC-number
Enzymatic-reaction
Generated in protein or reaction editor
In-pathway
Generated in pathway editor
Left and Right
Reaction-Direction
:left-to-right, :right-to-left, :reversible
Spontaneous?
Enzyme
Reaction
Regulated-By
Cofactors
Kinetic slots:
Vmax, Km, Kcat, Specific-Activity, Temperature-Opt, pH-Opt
Reaction-Direction
Reaction may be reversible, but this enzyme effectively only catalyzes it in one direction.
Genes-of-reaction (rxn)
Substrates-of-reaction (rxn)
Enzymes-of-reaction (rxn)
Lacking-ec-number (organism)
Returns list of rxns with no ec numbers in that database
Get-reaction-direction-in-pathway (pwy rxn)
Reaction-type(rxn)
Returns the type of the reaction: small-molecule rxn, transport rxn, protein-small-molecule rxn (one substrate is a protein and one is a small molecule), protein rxn (all substrates are proteins), etc.
All-rxns(type)
Specify the type of reaction (see above for type)
Obtain-rxn-stats
Returns six values
Lengths of: all-rxns, transport, non-transport, etc…
Find all small-molecule reactions that have no enzyme but are not spontaneous (“orphan” reactions)
Find all reactions that consume a given compound
Find all small-molecule reactions that have no enzyme but are not spontaneous (“orphan” reactions):
(loop for r in (all-rxns :small-molecule)
      when (and (not (slot-has-value-p r 'enzymatic-reaction))
                (not (get-slot-value r 'spontaneous?)))
      collect r)
Find all reactions that consume a given compound:
(defun rxns-consuming-cpd (cpd)
  (append
   (loop for r in (get-slot-values cpd 'appears-in-left-side-of)
         for dir = (get-rxn-direction r)
         if (or (eq dir :l2r) (eq dir :both)) collect r)
   (loop for r in (get-slot-values cpd 'appears-in-right-side-of)
         for dir = (get-rxn-direction r)
         if (or (eq dir :r2l) (eq dir :both)) collect r)))
PGDBs only represent RNAs that are “terminal gene products”
tRNAs
rRNAs
Regulatory RNAs
Miscellaneous small RNAs
Slots similar to proteins
tRNAs can have an anticodon
An ordered set of interconnected, directed biochemical reactions
Reactions form a coherent unit, e.g.
Regulated as a single unit
Evolutionarily conserved across organisms as a single unit
When combined, perform a single cellular function
Historically grouped together as a unit
Includes metabolic pathways and signaling pathways
Evidence for all reactions in a single organism
Pathways can be linear, cyclical, branched, or some combination
REACTION-LIST: unordered list of reactions that comprise the pathway
PREDECESSORS: list of reaction pairs that define ordering relationships between reactions.
E.g. A --R1--> B --R2--> C
               B --R3--> D
(R2 R1) : Predecessor of R2 is R1
(R3 R1) : Predecessor of R3 is R1
(R1) : R1 has no predecessor (can be omitted)
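The predecessor pairs above are enough to recover the ordering of the pathway. The sketch below is plain Common Lisp over a literal list and does not call Pathway Tools; `successors-of` and `start-reactions` are hypothetical names chosen for illustration.

```lisp
;; Predecessor pairs for the example above, each written (reaction predecessor).
(defparameter *predecessors* '((r2 r1) (r3 r1) (r1)))

(defun successors-of (rxn predecessors)
  "Return the reactions whose predecessor is RXN."
  (loop for (r pred) in predecessors
        when (eq pred rxn) collect r))

(defun start-reactions (predecessors)
  "Return the reactions that are never listed with a predecessor."
  (let ((all (remove-duplicates
              (loop for pair in predecessors append (remove nil pair))))
        (with-pred (loop for (r pred) in predecessors
                         when pred collect r)))
    (set-difference all with-pred)))
```

Here `(successors-of 'r1 *predecessors*)` collects R2 and R3, the two branches that follow R1, and `(start-reactions *predecessors*)` identifies R1 as the only reaction without a predecessor.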
Reaction directions
Some reactions are unidirectional, but many are reversible – how do we know in which direction to draw the reaction?
Main vs. side substrates
[Diagram: compounds A, B, C and D, E, F illustrating main vs. side substrates]
Main compounds form the backbone of the pathway
Main compounds are the substrates shared between connecting reactions, plus the major inputs and outputs.
Side compounds omitted from pathway diagrams at low detail levels
Individual reactions do not necessarily have main and side compounds – a particular substrate may be either a main or a side depending on the pathway context.
Our philosophy: Enable curator to specify as little as possible. Compute as much as possible. This reduces redundancy and potential for inconsistencies.
Example:
Reactions R1: A + B = C + D
R2: B = E
Predecessors: (R2 R1)
Only substrate overlap is B
B must be a main substrate
A must be a side substrate
R1 must proceed from right to left
R2 must proceed from left to right
C + D --> B --> E, with A as a side compound of the first step
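The inference in the example above can be sketched in a few lines of Common Lisp. This is a simplified illustration, not the Pathway Tools algorithm: reactions are represented as hypothetical `(name left-side right-side)` lists, `infer-link` is an invented name, and the sketch gives up (returns NIL) unless exactly one substrate is shared.

```lisp
(defun substrates (rxn)
  "All compounds of RXN, where RXN is (name left-side right-side)."
  (append (second rxn) (third rxn)))

(defun infer-link (pred succ)
  "Return (shared-cpd pred-direction succ-direction) for PRED followed by SUCC.
Returns NIL when the overlap is not exactly one compound (ambiguous case)."
  (let ((shared (intersection (substrates pred) (substrates succ))))
    (when (= (length shared) 1)
      (let ((cpd (first shared)))
        (list cpd
              ;; PRED must produce the shared compound.
              (if (member cpd (third pred)) :l2r :r2l)
              ;; SUCC must consume the shared compound.
              (if (member cpd (second succ)) :l2r :r2l))))))
```

Calling `(infer-link '(r1 (a b) (c d)) '(r2 (b) (e)))` identifies B as the shared main substrate, with R1 running right to left and R2 left to right, matching the example.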
Unfortunately, mains, sides and reaction directions are sometimes ambiguous:
At beginnings and ends of pathways
Use heuristics to determine main/side substrates at beginnings, ends of pathways
Not always what the curator wants
Substrate overlap with both sides of a reaction, e.g. A + B = C + D
C + B = E
Solution: Additional slot PRIMARIES, should only be populated when necessary:
PRIMARIES: (R (A B) (C)) says that for reaction R, A and B are both main reactants, and C is a main product.
ENZYMES-NOT-USED: a reaction may be catalyzed by multiple enzymes, but not all the enzymes necessarily participate in a given pathway
Not present in the same compartment with rest of pathway enzymes
Down-regulated or not expressed under conditions in which pathway is active
ENZYMES-NOT-USED slot lists enzymes that are not involved in the pathway even though they catalyze one of its reactions.
LAYOUT-ADVICE: helps software draw pathway correctly, e.g. in a cyclical pathway, tells which substrate should be at the top.
HYPOTHETICAL-REACTIONS: list of reactions in the pathway that are considered hypothetical (i.e. no direct experimental evidence)
… X[n] --> X[n+1] --> X[10]
POLYMERIZATION-LINKS: specifies reactions that should be connected by a polymerization link
(X R1 R1) --- REACTANT-NAME-SLOT: N-NAME
--- PRODUCT-NAME-SLOT: N+1-NAME
CLASS-INSTANCE-LINKS: specifies when a link should be drawn between a substrate class and some instance of it (necessary only if instance is not a member of some reaction, so no predecessor relationship can be defined)
R1 --- PRODUCT-INSTANCES: X [10]
Can be used as an alternative or in addition to defining super-pathways
Link must be to or from some main substrate in the pathway
Other end of link can be a pathway, a reaction, or an arbitrary text string
Software automatically computes direction of link, but curator can override it
Collection of pathways that connect to each other via common substrates or reactions, or as part of some larger logical unit
Can contain both sub-pathways and additional connecting reactions
Can be nested arbitrarily
REACTION-LIST: a pathway ID instead of a reaction ID in this slot means include all reactions from the specified pathway
PREDECESSORS: a pathway ID instead of a tuple in this slot means include all predecessor tuples from the specified pathway
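The nesting rule for REACTION-LIST can be illustrated with a small recursive expansion. The sketch below works on toy data only: the pathway and reaction IDs, the `*reaction-lists*` table, and `expand-reaction-list` are all hypothetical, and the code assumes that super-pathways never contain themselves.

```lisp
;; Hypothetical toy data: pathway ID -> REACTION-LIST.
;; An entry that is itself a pathway ID stands for all of that pathway's reactions.
(defparameter *reaction-lists*
  '((super-pwy . (pwy-a r5 pwy-b))
    (pwy-a . (r1 r2))
    (pwy-b . (r3 pwy-c))
    (pwy-c . (r4))))

(defun expand-reaction-list (pwy)
  "Return all reactions of PWY, expanding nested pathway IDs recursively."
  (remove-duplicates
   (loop for item in (cdr (assoc pwy *reaction-lists*))
         append (if (assoc item *reaction-lists*)
                    (expand-reaction-list item)   ; item is a sub-pathway
                    (list item)))                 ; item is a plain reaction
   :from-end t))
```

Expanding `super-pwy` pulls in the reactions of `pwy-a`, the connecting reaction `r5`, and the reactions of `pwy-b` including its own sub-pathway `pwy-c`.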
Signaling pathways have different layout conventions than metabolic pathways
Layout is done manually by curator using specialized editor
Curator has more control
Lack of automatic layout algorithm means that diagram won’t update automatically when data changes.
See http://bioinformatics.ai.sri.com/ptools/ptools-resources.html
(all-pathways)
(base-pathways)
Returns list of all pathways that are not super-pathways
(genes-of-pathway pwy)
(unique-genes-of-pathway pwy)
Returns list of all genes of a pathway that are not also part of other pathways
(enzymes-of-pathway pwy)
(compounds-of-pathway pwy)
(get-reaction-list pwy)
(variants-of-pathway pwy)
Returns all pathways in the same variant class as a pathway
(get-predecessors rxn pwy), (get-successors rxn pwy)
(get-rxn-direction-in-pathway pwy rxn)
(pathway-inputs pwy), (pathway-outputs pwy)
Returns all compounds consumed (produced) but not produced (consumed) by pathway (ignores stoichiometry)
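The definition of pathway inputs and outputs given above amounts to two set differences. The following sketch shows the idea on literal data; it is not the implementation of pathway-inputs or pathway-outputs, and `pathway-io` and the `(reactants products)` representation are assumptions made for illustration.

```lisp
;; Each reaction is (reactants products), already oriented in its pathway direction.
(defun pathway-io (rxns)
  "Return two values: compounds only consumed, and compounds only produced.
Stoichiometry is ignored."
  (let ((consumed (remove-duplicates
                   (loop for r in rxns append (first r))))
        (produced (remove-duplicates
                   (loop for r in rxns append (second r)))))
    (values (set-difference consumed produced)    ; inputs
            (set-difference produced consumed)))) ; outputs
```

For the two-step pathway C + D --> A + B followed by B --> E, written as `'(((c d) (a b)) ((b) (e)))`, the inputs are C and D and the outputs are A and E; B is both produced and consumed, so it appears in neither set.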
Find all genes involved in metabolic pathways
Find all compounds that are unique to a single pathway
Find the reactions of a pathway that have multiple isozymes
Find all genes involved in metabolic pathways:
(remove-duplicates
(loop for p in (all-pathways) append (genes-of-pathway p)))
Find all compounds that are unique to a single pathway:
(loop for p in (base-pathways)
      append (loop for c in (compounds-of-pathway p)
                   when (null (remove p (pathways-of-compound c)))
                   collect (list c p)))
Find the reactions of a pathway that have multiple isozymes:
(defun rxns-w-multiple-isozymes (pwy)
  (loop for rxn in (get-reaction-list pwy)
        for enzymes = (enzymes-of-reaction rxn)
        when (> (length enzymes) 1) collect rxn))
Class Regulation with subclasses that describe different biochemical mechanisms of regulation
Slots:
Regulator
Regulated-Entity
Mode
Mechanism
Class Regulation-of-Enzyme-Activity
Each instance of the class describes one regulatory interaction
Slots:
Regulator -- usually a small molecule
Regulated-Entity -- an Enzymatic-Reaction
Mechanism -- One of:
Competitive, Uncompetitive, Noncompetitive, Irreversible, Allosteric, Unkmech, Other
Mode -- One of: + , -
Physiologically-Relevant?
Transcription-Unit
One or more genes, transcribed as a unit
Any gene, promoter, etc. can belong to multiple transcription-units
One or zero promoters
Zero or more terminators, binding-sites
Transcription-Direction: +, -
Promoter
Absolute-plus-1-pos = transcription start site
Binds-Sigma-Factor
Terminator
Rho-Dependent or Rho-Independent
Left-End-Position, Right-End-Position
Binding-Site
DNA-Binding-Site or mRNA-Binding-Site
Left-End-Position, Right-End-Position
Involved-in-Regulation
Class Transcription-Factor-Binding
Slots:
Regulator -- instance of Proteins or Complexes (a transcription-factor)
Regulated-Entity -- instance of Promoters or Transcription-Units or Genes
Mode -- One of: + , -
Associated-Binding-Site
Class Transcriptional-Attenuation
Several subclasses depending on type of attenuation
Slots common to all:
Regulator -- Depends on subtype of attenuation
Regulated-Entity -- instance of Terminators or Genes or Transcription-Units
Mode -- One of: + , -
Slots particular to one or more subclasses:
Associated-Binding-Site
Antiterminator-Start-Pos, Antiterminator-End-Pos
Pause-Start-Pos, Pause-End-Pos
Protein-Mediated-Attenuation
RNA-Mediated-Attenuation
Small-Molecule-Mediated-Attenuation
Regulator = A protein/RNA/small molecule
Leader transcript binds protein/RNA/small molecule and determines formation of terminator or antiterminator
Ribosome-Mediated-Attenuation
Regulator = charged tRNA
Ribosome pauses, determining whether terminator or antiterminator forms
RNA-Polymerase-Modification
Regulator = instance of Proteins or Complexes
Regulatory protein binds to site in transcription unit and interacts with RNA polymerase to determine termination
Rho-Blocking-Antitermination
Class Regulation-of-Translation
Several subclasses depending on type of regulation
Compound-Mediated (i.e. riboswitch)
RNA-Mediated
Protein-Mediated
Slots:
Regulator -- Depends on subtype of regulation
Regulated-Entity -- Transcription-Unit or Gene
Mode -- One of: + , -
Mechanism -- Translation-Blocking, mRNA-Degradation, and/or Translation-Attenuation
Associated-Binding-Site
(transcription-unit-genes tu)
(transcription-unit-promoter tu)
(transcription-unit-binding-sites tu)
(transcription-unit-mrna-binding-sites tu)
(transcription-unit-terminators tu)
(transcription-unit-transcription-factors tu)
(terminators-affecting-gene gene)
(containing-tus frame)
(binding-site-transcription-factors bs)
(genes-regulating-gene gene)
(genes-regulated-by-gene gene)
Includes all transcriptional or translational regulation
(direct-activators frame)
(direct-inhibitors frame)
Frame will be an enzymatic-reaction, transcription-unit, promoter, etc. – this is a low-level function
(transcription-unit-activators tu)
(transcription-unit-inhibitors tu)
Includes transcriptional and translational regulators, and regulators of promoter as well as direct regulators of tu
Find all DNA-binding-sites that regulate a gene
Find only those substrate-level regulators of an enzyme that are considered physiologically relevant
Find all inhibitors of an enzyme, including transcriptional, translational and substrate-level inhibition.
Find all DNA-binding-sites that regulate a gene:
(defun gene-binding-sites (gene)
(remove-duplicates
(loop for tu in (containing-tus gene) append (transcription-unit-binding-sites tu))))
Find only those substrate-level regulators of an enzyme that are considered physiologically relevant:
(defun relevant-regulators (enz)
  (loop for enzrxn in (get-slot-values enz 'catalyzes)
        append (loop for regframe in (get-slot-values enzrxn 'regulated-by)
                     when (get-slot-value regframe 'physiologically-relevant?)
                     collect (get-slot-value regframe 'regulator))))
Find all inhibitors of an enzyme, including transcriptional, translational and substrate-level inhibition:
(defun enzyme-inhibitors (enz)
(let* ((genes (genes-of-enzyme enz))
(tus (remove-duplicates
(loop for g in genes append (containing-tus g))))
(terminators
(remove-duplicates
(loop for g in genes append (terminators-affecting-gene g))))
(enzrxns (get-slot-values enz 'catalyzes))
)
(append (loop for enzrxn in enzrxns append (direct-inhibitors enzrxn))
(loop for tu in tus append (transcription-unit-inhibitors tu))
(loop for term in terminators append (direct-inhibitors term))
)))