Ambrosius Publishing Structural and Behavioral Properties of Biochemical Networks Herbert M. Sauro

Structural and Behavioral Properties of
Biochemical Networks
Herbert M. Sauro
University of Washington
Seattle, WA
Ambrosius Publishing
Copyright © 2010 Herbert M. Sauro. All rights reserved.
Draft Edition v0.6, first upload (January, 2011)
Published by Ambrosius Publishing
www.sysbioBooks.com
Typeset using LaTeX 2ε, TikZ, PGFPlots, WinEdt
and Math Time Professional 2 Fonts
Limit of Liability/Disclaimer of Warranty: While the author has used his best
efforts in preparing this book, he makes no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically
disclaims any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written
sales materials. The advice and strategies contained herein may not be suitable for
your situation. You should consult with a professional where appropriate. Neither
the author nor publisher shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or
other damages.
Printed in the United States of America.
Front-Cover: Protein images from RCSB Protein Data Bank and David
Goodsell © (www.pdb.org).
Contents
1 Quantitative Models
    1.1 Different Kinds of Model
    1.2 Desirable Attributes
    1.3 Variables and Parameters
    1.4 Dimensions and Units
    1.5 Model Approximations
    1.6 Types of Mathematical Models
    Exercises
2 Graphs and Networks
    2.1 Introduction to Graph Theory
    2.2 Example Network: Protein-protein Networks
    2.3 Stoichiometric Networks
3 Stoichiometric Networks
    3.1 Introduction
    3.2 Stoichiometry Matrix
    3.3 Mass-Balance Equations
    3.4 The System Equation
    Exercises
4 Flux Balance Laws
    4.1 Flux Balance Laws
    4.2 Determined Systems
    4.3 Flux Balance Analysis
    4.4 Isotopic Flux Measurements
    Exercises
5 Steady State Flux Patterns
    5.1 The Null Space
    5.2 Elementary Flux Modes
    5.3 Definition of a Pathway
    5.4 Maximum Yield Predictions
    5.5 Engineering a Pathway
    Exercises
6 Species Conservation Laws
    6.1 Moiety Conserved Cycles
    6.2 Basic Theory
    6.3 Computational Approaches
    6.4 Summary
    6.5 Behavioral Consequences
    6.6 Advanced Theory
    6.7 Numerical Methods
    6.8 Design of Simulation Software
    Exercises
    Math Exercises
References
History
1
Quantitative Models
The Oxford English dictionary defines a model in the following way:
“A simplified or idealized description or conception of a particular system, situation, or process, often in mathematical
terms, that is put forward as a basis for theoretical or empirical
understanding, or for calculations, predictions, etc.”
This definition embodies a number of critical features that define a model.
Probably the most important is that a model represents an idealized description, a simplification, of a real-world process. This is important because it allows us to comprehend the essential features of a complex process without being burdened and overwhelmed by unnecessary detail. Models are therefore not replicas of reality; they are, by design, approximations.
1.1 Different Kinds of Model
There are many different ways of approximating reality, including mathematical as well as non-mathematical approaches. In biology a very common non-mathematical way to represent cellular networks is by cartoons.
Such cartoons distill into a very concise form the results of thousands of experiments undertaken to delineate the pathways responsible for mass and signal flow. Cartoon models are useful for giving a quick snapshot of a given process; however, they have limitations as reasoning tools. To increase their reasoning power, cartoons can be converted into mathematical models. Of these there are many forms, ranging from simple graph interaction maps to sophisticated dynamic temporal and spatial models.
Models also come in two different forms, which might be called conceptual and concrete. Concrete models are proposed models of some particular real system, for example a metabolic pathway such as glycolysis or a signaling network such as the MAPK pathway. Such models are used to predict the behavior of the real pathway so that the assumptions used in constructing the model can be tested.
Alternatively, we can build conceptual models. These models are thought experiments; they allow us to investigate the properties of hypothetical networks. Conceptual models serve as test-beds for investigating basic principles of network design. A typical thought experiment might involve investigating how a particular enzyme in a linear pathway affects metabolite levels.
Conceptual models can also serve as a means to develop new hypotheses concerning the global properties of pathways. One hypothesis might concern the prediction that the control of flux in a linear pathway without regulation is always located on the enzymatic steps near the start of the pathway. Conceptual models also allow us to abstract and study generic pathway motifs. For example, what are the distinguishing behavioral properties of a branched pathway compared to a cyclic pathway? What properties does negative feedback, compared to positive feedback, confer on the dynamics of a pathway? Many of the examples we will use in this course will be conceptual. Conceptual models are not intended to describe actual biological systems but are instead employed to aid reasoning about particular aspects of biological networks. The importance of conceptual models is that understanding them makes it much easier to understand concrete models.
1.2 Desirable Attributes
What makes a good model? There is a range of properties that a good model should have, but probably the most important are accuracy, predictability and falsifiability.
A model is considered accurate if it is able to describe current experimental observations; that is, a model should be able to reproduce the current state of knowledge.
A predictive model should be able to generate insight and/or predictions that are beyond current knowledge. Without this ability a model is considerably less useful; some would even suggest useless.
Finally, a model should be falsifiable. Since no model can ever be proven to be correct, the only means to validate a model is to attempt to refute it, that is, to show through some experimental test that the model is insufficient. For example, the statement "RNA is never transcribed into DNA" can be falsified simply by finding one instance where it happens (e.g. the life cycle of HIV).
A more common means of testing a model is through 'validation'. This simply means testing whether a prediction made by the model is correct. When the model correctly makes a prediction, the model is not falsified; instead, our confidence that the model will be able to make further correct predictions is increased. Many models are of this kind: they have never been falsified but we have high confidence in them.
There are other attributes of a model that are desirable but not essential; these include being parsimonious and selective. A parsimonious model is a model that is as simple as possible, but no simpler. This is related to Occam's famous razor, which states that "Entities should not be multiplied beyond necessity" and argues that given competing and equally good models, the simplest is preferred. Finally, since no model can represent everything in a given problem, a model must be selective and represent those things most relevant to the task at hand.
1.3 Variables and Parameters
In both numerical simulations and experiments it is impossible and unnecessary to deal with the entire universe. Instead we select a region of interest: an organ, a tissue, a cell, a segment of a pathway or even a single enzyme. The region of interest will have a boundary that marks the division between the system and the surrounding environment. The choice of boundary is important: it should not be so small that the interesting behavior is no longer observable, but it must not be so large that the study becomes difficult to carry out. In any study we therefore divide quantities into two broad groups, variables and parameters.
Intensive and Extensive Properties. In science a distinction is made between physical quantities termed intensive and extensive. An intensive
property is a physical quantity whose value does not depend on the size
of the system. Examples include pressure, density, concentration and temperature. An extensive property is a physical quantity whose value does
depend on the size of the system, examples include mass, volume, energy
and entropy.
A variable is a quantity that changes during the course of a simulation or
experiment. For example, changes in the level of a phosphorylated protein,
the level of mRNA, the concentration of a metabolite or the voltage across
a membrane are all examples of variables. Variables are also called state
variables because they determine the state of the system.
If one quantity depends on the other, we call the first quantity the dependent variable; it is the controlled value. The other quantity, the cause,
is called the independent variable. Often independent variables are also
called parameters. The main characteristic of a parameter is that it is
not a function of the dependent variables and in many cases is under the
control of the experimenter. Examples of parameters include kinetic rate
constants, equilibrium constants, a clamped voltage or a constant external
molecular species that supplies a pathway. Some of these parameters can be controlled by the experimenter, for example an external concentration, while others, such as an equilibrium constant, may be very difficult or even impossible to change.
[Figure 1.1 diagram: a System containing $S_1, S_2, \ldots, S_i, \ldots$ surrounded by an Environment containing $B_1, B_2, \ldots, B_i, \ldots$]
Figure 1.1: System and Environment: $S_1, S_2, \ldots, S_i, \ldots$ are state variables that will change during the evolution of the system; $B_1, B_2, \ldots, B_i, \ldots$ are boundary variables that are clamped to certain values by the observer. The exchange arrows represent the exchange of mass between the environment and the system.
Parameters are also divided into two groups: external parameters, such as boundary species or clamped voltages, and internal parameters, such as kinetic constants. External concentrations are often called boundary species because they are at the boundary of the system and the external environment. In an experiment, boundary species are clamped by some kind of buffering mechanism. The buffering mechanism can simply be a large external reservoir, so that any exchange of mass between the system and the external environment has a negligible effect on the external concentration. Alternatively, there may be active mechanisms maintaining an external concentration. A classic example of active maintenance of an external variable is the voltage clamp used in electrophysiology. Finally, the external concentrations may simply be slow moving compared to the timescale of the model, so that over the study period the external concentrations change very little. A typical example of the latter is the study of a metabolic response over a timescale that is shorter than gene expression. This permits a modeler or experimentalist to study a metabolic pathway without considering the effect of changes in gene expression.
Figure 1.2 illustrates a simplified model of glycolysis. The corresponding Table 1.1 lists the various variables and parameters that have been identified in the model. Glucose and ethanol are assumed to be boundary variables, that is, they are controlled by the observer. This can be arranged by supplying glucose from a large volume compartment so that when it is consumed by the pathway there is only a negligible change in its concentration. Likewise we assume that ethanol is discharged into a large volume. In this highly simplified model we also make a possibly unreasonable assumption that NAD and NADH do not change appreciably during the study. Such choices are necessary when building a model. To justify this assumption we would need to carry out experiments to ascertain what actual changes occur in the NAD/NADH couple. We make an additional assumption about ATP. Since glycolysis is ostensibly the pathway for generating ATP, some way to simulate ATP consumption is necessary. This is achieved by including a single step that hydrolyzes ATP to ADP, even though we know that ATP consumption is a complex process involving many separate reactions. The response of the pathway to changing ATP demand can be simulated by perturbing the ATP demand step.
[Figure 1.2 diagram: Glucose, F-16-BisP, G3P, Pyruvate and Ethanol in sequence, with the cofactors NAD/NADH and ATP/ADP attached to individual steps and a separate ATP consumption step.]
Figure 1.2: A simplified glycolytic pathway. Many reactions have been condensed and ATP consumption has been simplified to a single process, ATP $\rightarrow$ ADP + Pi.
State Variables    System Parameters    Boundary Variables
F-16-BisP          Kinetic Constants    Glucose
G3P                Enzyme Activities    Ethanol
Pyruvate           Volume               NAD
ATP                Temperature          NADH
ADP                                     Pi
Table 1.1: Variables and parameters for the simplified glycolytic model (Figure 1.2). We assume that glucose and ethanol are clamped by the observer using large volume sinks. We assume that during the period of study the concentrations of NAD and NADH remain essentially unchanged. This latter assumption may be unreasonable and would need to be justified experimentally. F-16-BisP = Fructose-1,6-bisphosphate; G3P = Glyceraldehyde-3-Phosphate; Pi = Phosphate
1.4 Dimensions and Units
Variables and parameters that go into a model will be expressed in some standard unit of measurement. In science the recognized standard for units is the SI system. This includes units such as the meter for length, kilogram for mass, second for time, joule for energy, kelvin for temperature and the mole for amount. The mole is of particular importance because it is a means to measure the number of particles of substance irrespective of the mass of the substance itself. Thus 1 mole of glucose is the same amount as 1 mole of the enzyme glucose-6-phosphate isomerase even though the mass of each type of molecule is quite different. The actual number of particles in 1 mole is defined as the number of atoms in 12 grams of carbon-12, which has been determined empirically to be $6.0221415 \times 10^{23}$. This definition means that 1 mole of substance will have a mass in grams equal to the molecular weight of the substance. This makes it easy to calculate the number of moles using the following relation:
$$\text{moles} = \frac{\text{mass}}{\text{molecular weight}}$$
The concentration of a substance is expressed in moles per unit volume
and is usually termed the molarity. Thus a 1 molar solution means 1 mol
of substance in 1 litre of volume.
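The relations above can be sketched in a few lines of Python. This is only an illustration: the function names, the approximate molecular weight of glucose and the sample mass and volume are assumptions chosen for the example, not values taken from the text.

```python
# Avogadro's number as quoted in the text (particles per mole).
AVOGADRO = 6.0221415e23


def moles_from_mass(mass_g, molecular_weight):
    """Number of moles = mass (g) / molecular weight (g per mol)."""
    return mass_g / molecular_weight


def molarity(moles, volume_l):
    """Concentration in mol per litre."""
    return moles / volume_l


# Illustrative numbers (assumed for this sketch, not from the text).
GLUCOSE_MW = 180.16          # g/mol, approximate molecular weight of glucose
mass_g = 9.008               # grams of glucose weighed out
volume_l = 0.5               # litres of solution

n = moles_from_mass(mass_g, GLUCOSE_MW)      # about 0.05 mol
c = molarity(n, volume_l)                    # about 0.1 mol per litre
particles = n * AVOGADRO                     # about 3.0e22 molecules

print(f"{n:.3f} mol, {c:.3f} mol/l, {particles:.3e} molecules")
```

Keeping the units in the variable names and comments is a cheap way to apply the unit discipline that the rest of this section argues for.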
Dimensional analysis is a simple but effective method for uncovering mistakes when formulating kinetic models. This is particularly true for concrete models where one is dealing with actual quantities and kinetic constants. Conceptual models are more forgiving and don't usually require the same level of attention because they are much simpler. Amounts of substance are usually expressed in moles and concentrations in moles per unit volume (mol $l^{-1}$). Reaction rates can be expressed either in concentrations or amounts per unit time depending on the context (mol $l^{-1}$ $t^{-1}$ or mol $t^{-1}$). Rate constants are expressed in differing units depending on the form of the rate law: the rate constants in simple first-order kinetics are expressed per unit time ($t^{-1}$), while in second-order reactions the rate constant is expressed per concentration per unit time (mol$^{-1}$ l $t^{-1}$).
In dimensional analysis, the left and right-hand sides of an expression must have the same units (or dimensions). There are certain rules for combining units when checking consistency. Only like units can be added or subtracted; thus the expression $S + k_1$ cannot be summed because the units of $S$ are likely to be mol $l^{-1}$ and the units of $k_1$, $t^{-1}$. Even something as innocent looking as $1 + S$ can be troublesome because $S$ has units of concentration but the constant value "1" is unit-less. Quantities with different units can be multiplied or divided, with the units for the overall expression computed using the laws of exponents and treating the unit symbols as variables.
Example 1.1
Determine the overall units for the expression $k_1 S / K_m$, where the units for each variable are $k_1$: $t^{-1}\, l$; $S$: mol; and $K_m$: mol $l^{-1}$. We first write out the expression in terms of the individual units:
$$\frac{t^{-1}\; l\; \mathrm{mol}}{\mathrm{mol}\; l^{-1}}$$
By treating the symbols as algebraic variables we can see that the symbol mol cancels, and using the rules of exponents we can move the $l^{-1}$ term from the denominator to the numerator to yield:
$$t^{-1}\, l^{2}$$
In exponentials such as $\exp x$, the exponent term must be dimensionless, or at least the expression should resolve to dimensionless; thus $\exp(kt)$ is permissible but $\exp(k)$ is not. Trigonometric functions will always resolve to dimensionless quantities because the argument will be an angle, which can always be expressed as a ratio of lengths which will by necessity have the same dimension.
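The bookkeeping in Example 1.1 can be mechanized by treating a unit as a mapping from symbol to exponent. The following is a minimal sketch under that assumption; the helper names `multiply`, `invert` and `same_units` are hypothetical and introduced only for illustration.

```python
from collections import Counter


def multiply(*units):
    """Multiply units by adding exponents; symbols that cancel are dropped."""
    total = Counter()
    for unit in units:
        for symbol, power in unit.items():
            total[symbol] += power
    return {s: p for s, p in total.items() if p != 0}


def invert(unit):
    """Reciprocal of a unit: negate every exponent."""
    return {s: -p for s, p in unit.items()}


def same_units(a, b):
    """Only quantities with identical units may be added or subtracted."""
    return multiply(a) == multiply(b)


# Units from Example 1.1.
k1 = {"t": -1, "l": 1}
S = {"mol": 1}
Km = {"mol": 1, "l": -1}

result = multiply(k1, S, invert(Km))
print(result)                                  # units of k1*S/Km

# exp(k*t) is permissible because k*t is dimensionless (empty unit).
rate_constant, time = {"t": -1}, {"t": 1}
print(multiply(rate_constant, time) == {})

# S + k1 is not permissible: concentration and per-time units differ.
concentration = {"mol": 1, "l": -1}
print(same_units(concentration, rate_constant))
```

An empty dictionary stands for a dimensionless quantity, which makes the check on the argument of an exponential a simple equality test.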
1.5 Model Approximations
By their very nature, models involve making assumptions and approximations. The best modelers are those individuals who can make the most
shrewd and viable approximations without compromising the accuracy of
the model’s predictions. In many cases it is only through direct experience that a modeler will learn what are the best approximations to make.
There are however some kinds of approximations which are useful in most
problems:
• Neglecting small effects.
• Assuming that the environment is unchanged by the system.
• Replacing complex subsystems with lumped or aggregate laws.
• Sometimes it is possible to assume simple linear cause-effect relationships even though the underlying process is complex.
• Assuming that the physical characteristics of the system do not change with time.
• Neglecting noise and uncertainty.
Neglecting small effects. Neglecting small effects includes such things as changes in the local ionic strength during a catalytic reaction or small changes in pH or the volume of a cell. Sometimes, however, these small effects can be important.
Replacing complex subsystems with lumped or aggregate laws. Lumping subsystems is a commonly used technique in simplifying cellular models. The most important of these are aggregate rate laws, such as Michaelis-Menten or Hill-like equations to model cooperativity. Sometimes entire sequences of reactions can be replaced with a single rate law. However, care must be taken when selecting these approximations. In particular, aggregate rate laws do not model the effects of substrate sequestration, which in some systems, such as protein networks, can critically affect the behavior. In addition, replacing entire sequences of enzymes with one lumped rate law does not model the delay in the transmission of perturbations caused by the sequence of enzymes.
Assuming simple linear cause-effect relationships. In some cases it is possible to assume a linear cause-effect relationship between an enzyme reaction rate and the substrate concentration; this is especially true when the substrate concentration is well below the $K_m$ of the enzyme. Another common approximation is to assume that the rate of degradation of a protein is first-order even though degradation involves a highly complex process. Linear approximations can greatly simplify analytical studies of a model and in some cases they can also simplify numerical analysis.
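To see how good the linear approximation is, the sketch below compares a Michaelis-Menten rate with its low-substrate linear form. The rate law $v = V_{max} S / (K_m + S)$ and the symbol $V_{max}$ are standard but are not written out in the text, and the numerical values are assumptions chosen for illustration.

```python
def michaelis_menten(s, vmax, km):
    """Aggregate rate law v = Vmax * S / (Km + S)."""
    return vmax * s / (km + s)


def linear_approx(s, vmax, km):
    """First-order approximation v = (Vmax / Km) * S, valid when S << Km."""
    return (vmax / km) * s


# Illustrative parameter values (assumed, arbitrary units).
vmax, km = 1.0, 10.0

errors = {}
for s in (0.1, 1.0, 10.0):
    exact = michaelis_menten(s, vmax, km)
    approx = linear_approx(s, vmax, km)
    # The relative error of the linear form works out to S / Km.
    errors[s] = (approx - exact) / exact
    print(f"S = {s:5.1f}  exact = {exact:.4f}  linear = {approx:.4f}  "
          f"relative error = {errors[s]:.2%}")
```

The relative error equals $S/K_m$, so the approximation is within about 1% when the substrate is a hundredth of $K_m$ but overestimates the rate by a factor of two when $S = K_m$.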
Physical characteristics do not change with time. A modeler will often assume that the physical characteristics of a system do not change, for example the volume of a cell, the values of the rate constants or the temperature of the system. In many cases such approximations are quite reasonable.
Neglecting noise and uncertainty. Most models make two important approximations. The first is that noise in the system is either negligible or unimportant. In many non-biological systems such an approximation might be reasonable. However, biological systems operate at the molecular level. As a result, biological systems are susceptible to noise generated from thermal effects as a result of molecular collisions. For many systems the large number of particles ensures that the noise generated in this way is insignificant and in many cases can be safely ignored. For some systems, such as prokaryotic organisms, the number of particles in some of the cellular systems is very small. In such cases the effect of noise can be significant and therefore must be included as part of the model.
1.6 Types of Mathematical Models
Models can be divided up into a number of broad groups, the most notable
include:
• Nonlinear and Linear Models
• Discrete and Continuous
• Deterministic and Stochastic
Deterministic Models
A deterministic model is one where the state of the system at any time is
determined entirely by the initial conditions. This implies that the parameters and variables of the model are not subject to random fluctuations.
Repeated runs of a deterministic model with the same initial conditions
will yield identical results.
There are a number of approaches to building deterministic models in cellular biology. The most common is the use of ordinary differential equations to describe the rate of change of molecular species in time. Such
models are the primary focus of this book. Other researchers have examined the use of Boolean models and models based on partial differential
equations.
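As a small illustration of a deterministic model, the sketch below integrates a single ordinary differential equation, $dS/dt = v_{in} - kS$, with a forward-Euler scheme. The model, the parameter values and the choice of integrator are assumptions made for this example; they are not taken from the text.

```python
import math


def simulate(s0, v_in, k, t_end, dt=0.001):
    """Forward-Euler integration of dS/dt = v_in - k*S from t = 0 to t_end."""
    steps = int(round(t_end / dt))
    s = s0
    for _ in range(steps):
        s += dt * (v_in - k * s)
    return s


# Illustrative values (assumed): constant supply, first-order removal.
s0, v_in, k, t_end = 0.0, 2.0, 0.5, 10.0

run1 = simulate(s0, v_in, k, t_end)
run2 = simulate(s0, v_in, k, t_end)

# Analytical solution of the same equation, for comparison.
exact = v_in / k + (s0 - v_in / k) * math.exp(-k * t_end)

print(run1 == run2)                 # same initial conditions, same result
print(abs(run1 - exact) < 1e-3)     # Euler result is close to the exact one
```

Because nothing in the model is random, the two runs agree exactly, which is the defining property of a deterministic model described above.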
Stochastic Models
Another important class of model that is frequently used in building cellular models is the stochastic model. The deterministic model based on ordinary differential equations assumes a continuum of values for concentrations. This clearly ignores the fact that biological processes are a result of particulate interactions at the molecular level and, strictly speaking, concentrations should be described by discrete values. However, because we often deal with systems containing tens of thousands of particles, we assume that we can describe concentration as a continuous variable. For systems where the particulate number is very low, of the order of tens of particles, the use of a continuum measure is unreasonable. However, an additional and more important problem arises when dealing with low particulate numbers. At low concentrations, Brownian motion becomes a significant factor in determining reaction rates, such that when a molecule binds or is transformed, the time at which the event occurs becomes a statistical property. As a result of these factors, systems containing low particulate numbers are better modeled using a stochastic approach.
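One way to make the idea of statistically distributed event times concrete is a stochastic simulation of first-order decay, in which each molecule is removed at a random time. The sketch below uses Gillespie's direct method, a standard technique that the text does not name; the reaction, the rate constant and the molecule count are assumptions chosen for illustration.

```python
import random


def gillespie_decay(n0, k, t_end, rng):
    """Simulate A -> (nothing) with rate constant k, starting from n0 molecules.

    Returns the event times and the molecule count after each event.
    """
    t, n = 0.0, n0
    times, counts = [t], [n]
    while n > 0:
        propensity = k * n                   # total event rate at this instant
        t += rng.expovariate(propensity)     # random waiting time to next event
        if t > t_end:
            break
        n -= 1                               # one molecule is removed
        times.append(t)
        counts.append(n)
    return times, counts


# Illustrative values (assumed): a system with only tens of particles.
times_a, counts_a = gillespie_decay(20, 0.1, 50.0, random.Random(1))
times_b, counts_b = gillespie_decay(20, 0.1, 50.0, random.Random(2))

print(len(times_a) - 1, "events in run A;", len(times_b) - 1, "events in run B")
```

Unlike the deterministic case, runs started from the same initial condition but with different random seeds follow different trajectories, and the molecule count only ever takes whole-number values.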
Exercises
1. Choose from the following options. A model is:
(a) an attempt to form an exact replica of reality.
(b) something that bears no resemblance to the real system.
(c) a simplification of the real world.
2. List the three most desirable attributes of a model.
3. When we “validate” a model which of the following do we most
likely mean:
(a) We show that the model represents the truth about the real system.
(b) We increase our confidence in the model’s predictive power.
(c) We prove that the model is correct.
2
Graphs and Networks
2.1 Introduction to Graph Theory
Mathematically, a graph is described by a set of nodes (often called vertices) and a set of edges that connect the nodes. Apart from their theoretical interest to mathematicians, there are many real-world problems that
can be represented as graphs. For example, the links between web sites,
an ecological food web or a set of protein interactions are common examples of systems which can be represented by graphs. In many cases, the
nodes of a graph represent physical entities such as web sites, organisms
or proteins and edges represent the relationships between the nodes.
A typical example of a graph in cell biology is the protein interaction graph
where the set of nodes represent proteins in a cell and the edges between
two nodes represent a physical interaction between two proteins. Such
graphs are called protein-protein interaction graphs.
In practice graphs are commonly represented using either lists or matrices
(Fig. 2.1) with lists being the most concise.
Fig. 2.1 illustrates a small graph and its corresponding symmetrical adjacency matrix. If a graph has $n$ nodes, then the adjacency matrix will be a symmetrical $n \times n$ matrix. The rows and columns of the matrix correspond to the nodes in the graph. An intersection of a row and column (i.e. between two nodes) is marked by a one if a connection is present, otherwise it is marked by a zero. The number of edges that are incident on a particular node is called the degree, $k$, of the node.
[Figure 2.1 diagram: an undirected graph with nodes 1 to 6.]

Adjacency Matrix
     1  2  3  4  5  6
1    0  1  0  1  0  0
2    1  0  1  1  0  1
3    0  1  0  0  0  1
4    1  1  0  0  1  0
5    0  0  0  1  0  0
6    0  1  1  0  0  0

Adjacency List
1: (2,4)
2: (1,3,4,6)
3: (2,6)
4: (1,2,5)
5: (4)
6: (2,3)

Figure 2.1: Equivalent representations of a graph, an adjacency matrix and an adjacency list. Note that the adjacency matrix is symmetrical, i.e. $A = A^T$.
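The equivalence of the two representations in Figure 2.1 can be checked with a short script. This is a minimal sketch in plain Python; the function name `to_matrix` is a hypothetical helper introduced for the example.

```python
# Adjacency list of the graph in Figure 2.1.
adjacency_list = {
    1: [2, 4],
    2: [1, 3, 4, 6],
    3: [2, 6],
    4: [1, 2, 5],
    5: [4],
    6: [2, 3],
}


def to_matrix(adj):
    """Build the n x n adjacency matrix (list of rows) from an adjacency list."""
    nodes = sorted(adj)
    index = {node: i for i, node in enumerate(nodes)}
    n = len(nodes)
    matrix = [[0] * n for _ in range(n)]
    for node, neighbours in adj.items():
        for neighbour in neighbours:
            matrix[index[node]][index[neighbour]] = 1
    return matrix


A = to_matrix(adjacency_list)
n = len(A)

# An undirected graph gives a symmetrical matrix, A = A^T.
is_symmetric = all(A[i][j] == A[j][i] for i in range(n) for j in range(n))

# The degree of a node is the sum of its row.
degrees = [sum(row) for row in A]

for row in A:
    print(" ".join(str(x) for x in row))
print("symmetric:", is_symmetric)
print("degrees:", degrees)
```

For this small graph the list form stores 14 entries against 36 for the matrix, which illustrates why lists are the more concise representation for sparse graphs.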
Many published graphs are undirected, that is, an edge joining two nodes has no specific direction. This means it is not possible with undirected graphs to indicate flows, direction or specific dependency information. Visually an undirected edge is simply a straight line while a directed edge is usually depicted as a line with an arrow head, with the direction of the arrow head indicating the direction of dependence, Fig. 2.2.
[Figure 2.2 diagram: two panels, Undirected Graph and Directed Graph.]
Figure 2.2: Simple undirected and directed graphs.
Graphs can also be annotated, that is, both the edges and nodes can be labeled with additional information, for example whether a particular protein is essential. Graphs that have annotated edges are also called weighted graphs. Interaction graphs are useful for a number of reasons. First they provide a formal way to represent biological knowledge on a large scale. Once represented formally, such graphs can be visualized (Chapter 2) to give an overview of structure and connectivity. In addition various measures, such as the degree distribution, clustering coefficient, path length or centrality can be employed to characterize the graphs [37, 1, 38].
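Some of the measures listed above are easy to compute for a small graph. The sketch below uses the six-node graph of Figure 2.1 and assumes the commonly used definitions, which the text does not spell out: the local clustering coefficient of a node is the fraction of pairs of its neighbours that are themselves connected, and path length is the number of edges on a shortest path. The function names are hypothetical.

```python
from collections import Counter, deque

# The undirected graph of Figure 2.1 as an adjacency list.
graph = {
    1: [2, 4],
    2: [1, 3, 4, 6],
    3: [2, 6],
    4: [1, 2, 5],
    5: [4],
    6: [2, 3],
}


def degree_distribution(g):
    """Map each degree k to the number of nodes that have that degree."""
    return Counter(len(neighbours) for neighbours in g.values())


def clustering_coefficient(g, node):
    """Fraction of neighbour pairs of `node` that are directly connected."""
    neighbours = g[node]
    k = len(neighbours)
    if k < 2:
        return 0.0                      # convention for nodes of degree < 2
    links = sum(1 for i, a in enumerate(neighbours)
                for b in neighbours[i + 1:] if b in g[a])
    return 2.0 * links / (k * (k - 1))


def shortest_path_length(g, source, target):
    """Number of edges on a shortest path, found by breadth-first search."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return distance[node]
        for neighbour in g[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return None                         # target not reachable from source


print(dict(degree_distribution(graph)))
print({node: round(clustering_coefficient(graph, node), 3) for node in graph})
print(shortest_path_length(graph, 5, 3))
```

For large interaction maps a dedicated graph library would normally be used instead; the plain-Python version is only meant to show what the measures compute.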
2.2 Example Network: Protein-protein Networks
Work on uncovering protein networks has been ongoing since the 1950s and considerable detail has accumulated on many different pathways across different organisms. More recently, high-throughput techniques have been employed to describe large protein interaction maps. In this work, an interaction is defined if two proteins, A and B, are known to associate. Such information, however, generally ignores stoichiometry, mass conservation and kinetics; hence such networks will be termed non-stoichiometric networks.
Traditional methods, though laborious [13, 11], have been used extensively to gain detailed knowledge on phosphorylation sites, protein structure, the nature of membrane receptors and the constitution and function of protein complexes. More recently, high-throughput methods, though more coarse-grained, have been used to elucidate large swaths of protein-protein interaction networks. For example, in yeast, large-scale studies have identified approximately 500 different protein complexes [19, 33] and their relationships to each other.
A popular high-throughput technique that has been used to uncover protein-protein interaction networks is the Yeast two-hybrid method [18, 44] but
other methods such as phage display [59, 22] and particularly affinity purification and mass spectrometry have also been employed [19, 33]. The
Yeast two-hybrid method is based on the idea that eukaryotic transcriptional activators consist of two domains, a DNA binding domain (DB) and
an activation domain (AD). The activation domain is responsible for recruiting the RNA polymerase to begin transcription. What is remarkable is
that the two domains do not have to be covalently linked in order to function correctly but merely need to be in close proximity. It is this property
that is the basis of the Yeast two-hybrid method.
Let us assume that it is required to know whether two proteins, X and Y
interact with each other. In the two-hybrid method, protein X is fused with
the DB domain (known as the bait protein) and the second protein, Y, is
fused with the AD domain (known as the prey protein). These two fused
proteins are now expressed in Yeast and if the two proteins, X and Y, interact in some way they will also bring the DB and AD domains close to
each other resulting in an active transcriptional activator. If the gene downstream of the DNA binding sequence is a reporter gene, then the interaction
of X and Y can be detected.
A common reporter gene is the lacZ gene which codes for β-galactosidase and which produces a blue coloring in Yeast colonies through the metabolism of exogenously supplied X-gal (5-bromo-4-chloro-3-indolyl-β-D-galactoside).
There are some caveats with the Yeast two-hybrid method. Although two proteins may be observed to interact, the proteins in their natural setting may not be expressed at the same time or may be expressed but in different compartments. In addition, using the method to identify interactions between non-yeast proteins may be invalid because of the alien environment in the yeast cell. As with many high-throughput methods, caution is advised when interpreting the data.
[Figure 2.3 diagram: four panels labeled Wild Type, BD-Bait, AD-Prey and BD-AD, each showing the domains present at the reporter gene.]
Figure 2.3: Yeast two-hybrid. The wild-type transcription factor is composed of two domains, BD and AD. Both are essential for transcription. Two fusion proteins are made, BD-Bait and AD-Prey. Bait and Prey are the two proteins under investigation. If the two proteins, Bait and Prey, interact, they bring BD and AD together, resulting in a viable transcription factor that can be used to express a reporter gene.
One of the first interaction graphs to be published, built using techniques such as Yeast two-hybrid, was the protein interaction graph of Saccharomyces cerevisiae [62, 28]. The subsequent analysis of this map by Jeong et al. [29] included 1870 protein nodes and 2240 interaction edges. Such graphs give a bird's-eye view of protein interactions (Fig. 2.4).
[Figure 2.4 image: yeast protein interaction network, shown as a reproduced journal page]
Figure 2.4: The poster child of interaction networks, one of the early Yeast protein interaction networks generated from yeast two-hybrid measurements. Each node represents a protein and each edge an interaction. In addition, the graph nodes have been annotated so that red nodes indicate a lethal phenotypic effect if removed, green non-lethal, orange slow growth and yellow unknown. Adapted from Barabási and Oltvai [5] but originally published in arXiv and Nature [29].
2.3 Stoichiometric Networks
Almost all cellular processes involve some kind of chemical process such as binding or unbinding, often in particular stoichiometric amounts. In addition, such processes have direction and show conservation of mass. None of these properties are captured by the simple interaction graphs. In fact there has been some criticism [2, 34] that the simple graph models fail to capture the most important aspects of biological networks and as a result lead to misleading or unimportant conclusions. To illustrate the difficulties in representing stoichiometric networks using simple undirected graphs, consider how one might go about representing a metabolic network. The problem lies in deciding whether a node should be a reaction or a substance and thereby what an edge should be. In the literature various
approaches have been taken under the headings of substance, reaction and the less commonly used enzyme graphs [66, 12, 25]. A substance graph is constructed from nodes that correspond to the substances, and an edge exists between two substances, A and B, if there exists a reaction in which one is a substrate and the other a product. A reaction graph is one where each node corresponds to a reaction and an edge exists between two reactions if there exists a substance that is produced by one and consumed by another. Finally, an enzyme graph is one where nodes represent enzymes and an edge exists if two enzymes catalyze reactions that share a substance.
One troubling property of both substance and reaction graphs is that they are not unique, that is, different reaction schemes can generate identical substance and reaction graphs. These representations are therefore lossy. A further problem with these representations is whether to include linking substrates such as ATP and NAD. These substances can cross-link distant pathways, and their presence in or absence from a graph can have profound effects on the structural characteristics of the resulting graph [66]. In the end the choice in these matters seems at times to be ad hoc, and one wonders whether there is any sensible approach to representing cellular networks in a meaningful way using simple graphs. The same arguments apply equally to both protein and gene networks, although there is little discussion of these limitations in the literature.
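The loss of information in a substance graph can be illustrated with a short sketch. The function name `substance_graph` and the two reaction schemes below are hypothetical examples chosen for illustration; they are not taken from the cited literature. The sketch builds the undirected substance graph for a single bimolecular reaction and for a pair of unimolecular reactions, and compares the results.

```python
from itertools import product


def substance_graph(reactions):
    """Build an undirected substance graph as a set of two-element frozensets.

    Each reaction is a (substrates, products) pair. An edge joins two
    substances when one is a substrate and the other a product of the
    same reaction.
    """
    edges = set()
    for substrates, products in reactions:
        for substrate, prod in product(substrates, products):
            if substrate != prod:
                edges.add(frozenset((substrate, prod)))
    return edges


# Hypothetical scheme 1: one bimolecular reaction, A + B -> C
scheme_1 = [(["A", "B"], ["C"])]
# Hypothetical scheme 2: two unimolecular reactions, A -> C and B -> C
scheme_2 = [(["A"], ["C"]), (["B"], ["C"])]

graph_1 = substance_graph(scheme_1)
graph_2 = substance_graph(scheme_2)
same_graph = graph_1 == graph_2

print(sorted(sorted(edge) for edge in graph_1))
print("identical substance graphs:", same_graph)
```

Both schemes give the edges A-C and B-C, so the substance graph alone cannot tell whether A and B react together or separately.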
Bipartite Graphs
An improved graph model for representing cellular networks is the directed
bipartite graph. Whereas a simple graph is made from one kind of node, a
directed bipartite graph is made from two different kinds of nodes joined
by simple directed edges. An important constraint on bipartite graphs is
that like nodes cannot be connected by an edge. As a result, bipartite
graphs can easily describe cellular networks by representing substances as
one node type and reactions as the other node type. The constraint that
like nodes cannot be connected works well in this case because it makes
no sense to connect two reactions together or to connect two substances
together, but it does make sense to connect a substance to a reaction.
Formally, a bipartite graph is a graph whose nodes are separated into two disjoint (no overlap) sets, U and V, such that every edge connects a node in U to a node in V. Fig. 2.6 illustrates both an undirected and a directed bipartite graph.
Hypergraphs
In textbooks on biochemistry, cellular pathways have almost always been represented using directed hypergraphs (Fig. 2.7). Hypergraphs are graphs where the edges, now called hyperedges, can have more than two end points (recall that in a simple graph, edges have at most two end points). Hypergraphs are clearly similar to bipartite graphs: whereas in bipartite graphs the reaction node is explicit, in hypergraphs the reaction is replaced by a hyperedge, and both graph types can be considered equivalent.
Stoichiometric networks can be represented easily using either hypergraphs
or bipartite graphs. The edges in each case can be weighted to specify the
stoichiometry while directedness can be used to indicate the direction that
signifies the positive reaction rate.
To depict a bimolecular reaction such as A + B → C as a bipartite graph, four nodes are used: one node to depict the reaction, called a reaction node, and three other nodes to represent the species A, B and C, called the species nodes (Fig. 2.8). In order to indicate the positive reaction rate, the bipartite graph is directed. The individual stoichiometries can be attached to the edges (called labeled edges) while the reaction rate law can be attached to the reaction node.
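As a minimal sketch of this representation, the following code stores a directed bipartite graph with labeled edges. The class name `BipartiteNetwork`, the reaction name `R1` and the rate law text `k1*A*B` are assumptions made for illustration only; the text does not prescribe a data structure.

```python
class BipartiteNetwork:
    """Directed bipartite graph with species nodes and reaction nodes."""

    def __init__(self):
        self.species = set()      # species node names
        self.reactions = {}       # reaction node name -> rate law (free text)
        self.edges = []           # (source, target, stoichiometry) triples

    def add_reaction(self, name, reactants, products, rate_law=""):
        """Add one reaction node; reactants/products map species -> stoichiometry."""
        self.reactions[name] = rate_law
        for species, amount in reactants.items():
            self.species.add(species)
            self.edges.append((species, name, amount))   # species -> reaction
        for species, amount in products.items():
            self.species.add(species)
            self.edges.append((name, species, amount))   # reaction -> species

    def is_bipartite(self):
        """True if every edge joins a species node to a reaction node."""
        for source, target, _ in self.edges:
            species_to_reaction = source in self.species and target in self.reactions
            reaction_to_species = source in self.reactions and target in self.species
            if not (species_to_reaction or reaction_to_species):
                return False
        return True


# The bimolecular reaction A + B -> C, with a hypothetical rate law label.
network = BipartiteNetwork()
network.add_reaction("R1", {"A": 1, "B": 1}, {"C": 1}, rate_law="k1*A*B")
node_count = len(network.species) + len(network.reactions)

print(node_count, network.edges, network.is_bipartite())
```

The edge direction records the positive direction of the reaction, and the third element of each edge carries the stoichiometry, so nothing about the reaction is lost.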
In textbooks hypergraphs generally predominate as they tend to follow a visual convention used by chemists and biochemists. Occasionally bipartite graphs are also used; most notably, the KEGG database uses bipartite graphs to display metabolic pathways (Fig. 2.10).
Figure 2.5: A Small Protein-Protein Interaction Map. This image
was taken from the STRING web site (Search Tool for the Retrieval
of Interacting Genes/Proteins, http://string.embl.de/). The image displays a small segment of the protein interaction map centered
around LEU3, the transcription factor that regulates genes involved in
leucine and other branched chain amino acid biosynthesis. The number of lines between nodes indicates the number of lines of evidence
that supports the interaction. Of interest are the genes shown in green
(LEU1,LEU2,LEU4,ILV2,ILV3) related to leucine and valine biosynthesis with strong evidence supporting that claim. Such maps provide a
useful snapshot of potential interactions and may highlight relationships
that were not previously noted.
[Figure 2.6 diagram panels: Undirected Bipartite Graph; Directed Bipartite Graph]
Figure 2.6: Bipartite graph containing two kinds of node.
Figure 2.7: A directed hypergraph, commonly used in biochemistry
textbooks to depict metabolic and signaling pathways.
[Figure 2.8 diagram panels: a) Bipartite Graph, with species nodes A, B and C joined through a Reaction node; b) Hypergraph, with nodes A, B and C]
Figure 2.8: Bipartite graph and hypergraph representing a bimolecular reaction, A + B → C. The bipartite graph (a) contains two kinds of nodes: one type of node represents the molecular species (A, B, and C) while the other node type represents the reaction. In a bipartite graph, like nodes cannot connect to each other, that is, species nodes can only connect to reaction nodes. The hypergraph (b) uses hyperedges, which have multiple end points.
[Figure 2.9 diagram panels: a) Reaction Scheme; b) Hypergraph; c) Bipartite Graph; d) Substance Graph; e) Reaction Graph, with reaction nodes R1, R2 and R3]
Figure 2.9: Five different representations for the same reaction scheme. Note that the substance and reaction graphs are not unique and similar graphs could be generated from different reaction schemes.
Figure 2.10: Glycolysis Pathway depicted here from the KEGG
database is a Bipartite Graph. The two kinds of nodes in this graph
represent metabolites (e.g. Pyruvate) and reactions, for example the reaction catalyzed by pyruvate kinase. The reaction nodes are represented
as rectangles containing the Enzyme Commission number of the enzyme and the metabolites by unfilled circles.
3
Stoichiometric Networks
3.1 Introduction
Stoichiometry refers to the molar proportions of reactants and products in a chemical reaction. Given a hypothetical reaction such as:
$$3A + 4B \rightarrow 2C + D$$
with reactants A and B and products C and D, the stoichiometry is indicated by the number of participating reactant and product molecules. Thus the stoichiometry for A is three, for B four, for C two and for D one. See Chapter 1 of “Introduction to Kinetics for Systems Biology” (www.sysbiobooks.com) for a more detailed discussion of stoichiometry.
3.2 Stoichiometry Matrix
When describing multiple reactions in a network, it is convenient to represent the stoichiometries in a compact form called the stoichiometry matrix, N. This matrix is an m row by n column matrix where m is the number of species and n the number of reactions. The columns of the stoichiometry matrix correspond to the distinct chemical reactions in the network, the rows to the molecular species, one row per species. Thus the intersection of a row and column in the matrix indicates whether a certain species takes part in a particular reaction or not, and, according to the sign of the element, whether there is a net loss or gain of substance, and by the magnitude, the relative quantity of substance that takes part in that reaction. The elements of the stoichiometry matrix thus concern the relative mole amounts of chemical species that react in a particular reaction; they do not concern the rate of reaction.
For example, consider the simple chain of reactions which has five molecular species and four reactions. The four reactions are labeled v1 to v4.
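$$S_1 \xrightarrow{v_1} S_2 \xrightarrow{v_2} S_3 \xrightarrow{v_3} S_4 \xrightarrow{v_4} S_5$$
The stoichiometry matrix for this simple system is given by:
$$
N = \begin{array}{r|rrrr}
 & v_1 & v_2 & v_3 & v_4 \\ \hline
S_1 & -1 & 0 & 0 & 0 \\
S_2 & 1 & -1 & 0 & 0 \\
S_3 & 0 & 1 & -1 & 0 \\
S_4 & 0 & 0 & 1 & -1 \\
S_5 & 0 & 0 & 0 & 1
\end{array}
$$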
Entries in the stoichiometry matrix are computed as follows. Given a species Si and reaction vj, the corresponding entry in the stoichiometry matrix at ij (row i and column j) is given by the total stoichiometry for Si on the product side minus the total stoichiometry of Si on the reactant side. Thus, considering species S1 in reaction v1, we note that the total stoichiometry on the product side is zero (no S1 molecules are formed on the product side) and the total stoichiometry of S1 on the reactant side is +1. Subtracting one from the other (0 − 1) we obtain −1, which is entered into the stoichiometry matrix. This rather long-winded approach to computing stoichiometries avoids errors that arise when a species occurs as both reactant and product. For example, for a more complex reaction such as:
$$3A \rightarrow 2A + B$$
the stoichiometry entry for A is (2 − 3), that is −1, because the stoichiometry for A on the product side is 2 and on the reactant side is 3.
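The product-minus-reactant rule can be sketched in a few lines of code. The function name `stoichiometry_matrix` and the input layout (each reaction as a pair of lists of (species, amount) tuples) are assumptions made for this illustration. The sketch rebuilds the matrix for the simple chain and checks the 3A → 2A + B case.

```python
def stoichiometry_matrix(species, reactions):
    """Return N with one row per species and one column per reaction.

    Each reaction is a (reactants, products) pair, where both parts are
    lists of (species_name, amount) tuples. Each entry is the total
    amount on the product side minus the total on the reactant side.
    """
    row_of = {name: i for i, name in enumerate(species)}
    matrix = [[0] * len(reactions) for _ in species]
    for j, (reactants, products) in enumerate(reactions):
        for name, amount in products:
            matrix[row_of[name]][j] += amount
        for name, amount in reactants:
            matrix[row_of[name]][j] -= amount
    return matrix


# Simple chain S1 -> S2 -> S3 -> S4 -> S5 with reactions v1 to v4.
chain_species = ["S1", "S2", "S3", "S4", "S5"]
chain_reactions = [
    ([("S1", 1)], [("S2", 1)]),  # v1
    ([("S2", 1)], [("S3", 1)]),  # v2
    ([("S3", 1)], [("S4", 1)]),  # v3
    ([("S4", 1)], [("S5", 1)]),  # v4
]
chain_matrix = stoichiometry_matrix(chain_species, chain_reactions)

# A species on both sides of a reaction: 3A -> 2A + B.
mixed_matrix = stoichiometry_matrix(["A", "B"], [([("A", 3)], [("A", 2), ("B", 1)])])

for row in chain_matrix:
    print(row)
print(mixed_matrix)
```

Accumulating product amounts and then subtracting reactant amounts is what keeps the A entry at −1 in the second example; recording only "A is a reactant with amount 3" would give the wrong entry.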
3.3 Mass-Balance Equations
According to the law of conservation of mass, any observed net change in
the amount of a species must be due to the difference between the inward
and outward flows from the species pool (Figure 3.1).
[Figure 3.1 diagram: inflows entering and outflows leaving the species pool X, with dX/dt = Σ Inflows − Σ Outflows]
Figure 3.1: Mass Balance: The rate of change in species X is equal to the difference between the sum of the inflows and the sum of the outflows.
The equations which describe such flows are called the mass balance equations and are central to building mathematical models of cellular networks:
$$\frac{dS_i}{dt} = \sum \text{Inflows} - \sum \text{Outflows} \tag{3.1}$$
A reaction rate which contributes to a flow is given by the term cij vj, where cij is the stoichiometric coefficient and vj the reaction rate. Therefore the balance equation can be written (under constant volume conditions) in a more formal way as:
$$\frac{dS_i}{dt} = \sum_j c_{ij} v_j \tag{3.2}$$
that is, the sum over all flows into and out of a particular species pool. In the equation, Si is the concentration of species i, cij is the stoichiometric coefficient for species i with respect to reaction j and vj is the rate of reaction j. Stoichiometric coefficients for reactants are negative and for products, positive.
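Consider a simple linear chain of reactions from S1 to S5 shown in Figure 3.2. The mass-balance equations for this simple system can be written as shown in equation 3.3.
$$S_1 \xrightarrow{v_1} S_2 \xrightarrow{v_2} S_3 \xrightarrow{v_3} S_4 \xrightarrow{v_4} S_5$$
Figure 3.2: Simple Straight Chain Pathway.
$$
\begin{aligned}
\frac{dS_1}{dt} &= -v_1 & \frac{dS_2}{dt} &= v_1 - v_2 \\
\frac{dS_3}{dt} &= v_2 - v_3 & \frac{dS_4}{dt} &= v_3 - v_4 \\
\frac{dS_5}{dt} &= v_4
\end{aligned}
\tag{3.3}
$$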
Each species in the network is assigned a mass-balance equation which
accounts for the flows into and out of the species pool. For a branched
system such as the following:
[Figure 3.3 diagram: v1 feeds S1; v2 and v3 leave S1, with v3 feeding S2; v4 and v5 leave S2]
Figure 3.3: Multi-Branched Pathway.
the mass-balance equations are given by:
$$\frac{dS_1}{dt} = v_1 - v_2 - v_3 \qquad \frac{dS_2}{dt} = v_3 - v_4 - v_5$$
Finally consider a more complex pathway such as:
$$
\begin{aligned}
A + X &\xrightarrow{v_1} 2X \\
X + Y &\xrightarrow{v_2} Z \\
Z &\xrightarrow{v_3} Y + B
\end{aligned}
$$
This example is more subtle because we must be careful to take into account the stoichiometry change between the reactant and product side in the first reaction (v1). In reaction v1, the overall stoichiometry for X is +1 because two X molecules are made for every one consumed. Taking this into account the rate of change of species X can be written as:
$$\frac{dX}{dt} = -v_1 + 2v_1 - v_2$$
or more simply as v1 − v2. The full set of mass-balance equations can therefore be written as:
$$
\begin{aligned}
\frac{dA}{dt} &= -v_1 & \frac{dX}{dt} &= v_1 - v_2 \\
\frac{dY}{dt} &= v_3 - v_2 & \frac{dZ}{dt} &= v_2 - v_3 \\
\frac{dB}{dt} &= v_3
\end{aligned}
$$
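As a cross-check, the same balance equations can be read off the stoichiometry matrix of this three-reaction pathway (v1: A + X → 2X, v2: X + Y → Z, v3: Z → Y + B). The row ordering A, X, Y, Z, B is a choice made here for illustration; each entry is the product-side amount minus the reactant-side amount, so the X entry in the v1 column is 2 − 1 = +1.

$$
N = \begin{array}{r|rrr}
 & v_1 & v_2 & v_3 \\ \hline
A & -1 & 0 & 0 \\
X & 1 & -1 & 0 \\
Y & 0 & -1 & 1 \\
Z & 0 & 1 & -1 \\
B & 0 & 0 & 1
\end{array}
$$

Reading across the X row gives dX/dt = v1 − v2, and the Y row gives dY/dt = v3 − v2, in agreement with the equations obtained by inspection.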
It is therefore fairly straightforward to derive the balance equations from a visual inspection of the network. Many software tools exist that will assist in this effort by converting network diagrams, either represented visually on a computer screen or provided as a text file listing the reactions in the network, into the corresponding balance equations.
3.4 The System Equation
Equation 3.2, which describes the mass balance equation, can be reexpressed in terms of the stoichiometry matrix to form the system equation:
$$\frac{dS}{dt} = N v \tag{3.4}$$
where N is the m × n stoichiometry matrix and v is the n-dimensional rate vector, whose ith component gives the rate of reaction i as a function of the species concentrations.
Looking again at the model depicting the simple chain of reactions, the system equation can be written down as:
$$
\frac{dS}{dt} = N v =
\begin{bmatrix}
-1 & 0 & 0 & 0 \\
1 & -1 & 0 & 0 \\
0 & 1 & -1 & 0 \\
0 & 0 & 1 & -1 \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \end{bmatrix}
$$
If the stoichiometry matrix is multiplied into the rate vector, the mass-balance equations shown earlier (3.3) are recovered.
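The multiplication of the stoichiometry matrix into the rate vector can be sketched symbolically. The helper name `mass_balance_equations` is hypothetical, and the equations are produced as plain strings rather than by a computer algebra system, which is an assumption made to keep the sketch self-contained.

```python
def mass_balance_equations(species, rate_names, matrix):
    """Multiply each row of N into the rate vector and return the
    resulting mass-balance equations as strings."""
    equations = []
    for name, row in zip(species, matrix):
        terms = []
        for coeff, rate in zip(row, rate_names):
            if coeff == 0:
                continue  # this reaction does not touch the species
            sign = "-" if coeff < 0 else "+"
            size = "" if abs(coeff) == 1 else f"{abs(coeff)} "
            terms.append((sign, size + rate))
        if not terms:
            rhs = "0"
        else:
            first_sign, first_term = terms[0]
            rhs = ("-" if first_sign == "-" else "") + first_term
            for sign, term in terms[1:]:
                rhs += f" {sign} {term}"
        equations.append(f"d{name}/dt = {rhs}")
    return equations


# Stoichiometry matrix of the simple chain S1 -> S2 -> S3 -> S4 -> S5.
N = [
    [-1, 0, 0, 0],
    [1, -1, 0, 0],
    [0, 1, -1, 0],
    [0, 0, 1, -1],
    [0, 0, 0, 1],
]
equations = mass_balance_equations(
    ["S1", "S2", "S3", "S4", "S5"], ["v1", "v2", "v3", "v4"], N
)
for line in equations:
    print(line)
```

Each printed line corresponds to one row of N, which is why the rows of the stoichiometry matrix and the balance equations carry the same information.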
As already stated, the stoichiometry matrix represents the connectivity of
the network and contains information on the network’s structural characteristics. These characteristics fall into two groups, relationships among
the species and relationships among the reaction rates. Each will be considered in turn. Let us first consider the relationships among the species,
that is relationships between the rows of the stoichiometry matrix.
Exercises
1. Explain the difference between the terms: stoichiometric amount, stoichiometric coefficient, rate of change (dX/dt) and reaction rate (vi).
2. Determine the stoichiometric amount and stoichiometric coefficient for each species in the following reactions:
$$
\begin{aligned}
A &\rightarrow B \\
A + B &\rightarrow C \\
A &\rightarrow B + C \\
2A &\rightarrow B \\
3A + 4B &\rightarrow 2C + D \\
A + B &\rightarrow A + C \\
A + 2B &\rightarrow 3B + C
\end{aligned}
$$
3. Derive the set of differential equations for the following model in terms of the rates of reaction, v1, v2 and v3:
$$
\begin{aligned}
A &\xrightarrow{v_1} 2B \\
B &\xrightarrow{v_2} 2C \\
C &\xrightarrow{v_3}
\end{aligned}
$$
4. Derive the set of differential equations for the following model in terms of the rates of reaction, v1, v2 and v3:
$$
\begin{aligned}
A &\xrightarrow{v_1} B \\
2B + C &\xrightarrow{v_2} B + D \\
D &\xrightarrow{v_3} C + A
\end{aligned}
$$
5. Write out the stoichiometry matrix for the networks in questions 3 and 4.
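6. Derive the stoichiometry matrix for each of the following networks. In addition write out the mass-balance equations in each case.
(a) [Network diagram: nodes A, B, C and D; reactions v1 to v4]
(b) [Network diagram: nodes A, B and C; reactions v1 to v4]
(c)
$$
\begin{aligned}
A + X &\xrightarrow{v_1} B + Y & B + X &\xrightarrow{v_2} Y \\
B &\xrightarrow{v_3} C & C + X &\xrightarrow{v_4} D + Y \\
D + Y &\xrightarrow{v_5} X & X &\xrightarrow{v_6} Y \\
X + W &\xrightarrow{v_7} 2Y & 2Y &\xrightarrow{v_8} X + W
\end{aligned}
$$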
7. A gene G1 expresses a protein p1 at a rate v1 . p1 forms a tetramer
(4 subunits), called p14 at a rate v2 . The tetramer negatively regulates
a gene G2 . p1 degrades at a rate v3 . G2 expresses a protein, p2 at a
rate v9 . p2 is cleaved by an enzyme at a rate v4 to form two protein
domains, p21 and p22 . p21 degrades at a rate v5 . Gene G3 expresses a
protein, p3 at a rate v6 . p3 binds to p22 forming an active complex,
p4 at a rate v10 , which can bind to gene G1 and activate G1 . p4
degrades at a rate v7 . Finally, p21 can form a dead-end complex, p5 ,
with p4 at a rate v8 .
(a) Draw the network represented in the description given above.
(b) Write out the differential equation for each protein species in
the network in terms of v1, v2, ....
(c) Write out the stoichiometric matrix for the network.
4
Flux Balance Laws
The study of metabolism, that is the chemical reactions that are involved in breaking down nutrients and building up more complex molecules, was one of the earliest topics of study in biochemistry. Glycolysis, which concerns the breakdown of glucose into pyruvate, was one of the first metabolic pathways to be investigated during the early part of the 20th century. In the period since, numerous other pathways have been uncovered. One of the most widely studied organisms, E. coli, has been shown at last count to have at least 918 enzymes catalyzing a wide range of metabolic functions [31]. In any particular pathway, enzymes catalyze the conversion of substances from one form to another. The rate of conversion is often called the flux, which is simply another word for a reaction rate but refers specifically to the reaction rate through a pathway. Figure 4.1 shows a simplified metabolic map from Corynebacterium glutamicum [43]. The numbers next to the reaction steps indicate the flux through each step and show how the flow of mass through the different metabolic pathways is distributed.
The network topology has a significant bearing on how flux is distributed through a pathway. This chapter will focus on a number of areas related to this topic.
Figure 4.1: Metabolic Map of Corynebacterium glutamicum central
metabolism adapted from [43].
4.1 Flux Balance Laws
The steady state of a system is defined when the rates of change of all species are zero,
$$N v = 0$$
In addition, to distinguish the steady state from thermodynamic equilibrium it is also assumed that at steady state there is a net flow of mass between the system boundaries of the network.
Box 2.2 Steady State - Recap
The steady state is defined when all dSi/dt are equal to zero while one or more reaction rates are non-zero.
$$\frac{dS}{dt} = N v = 0, \qquad v_i \neq 0$$
By way of illustration, let us look at the very simple branched pathway shown in Figure 4.2. The stoichiometry matrix for this pathway is N = [1 −1 −1] and the balance equation at steady state is given by:
$$
\begin{bmatrix} 1 & -1 & -1 \end{bmatrix}
\begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} = 0
$$
The mass balance equation for this system at steady state is given simply by
$$v_1 - v_2 - v_3 = 0.$$
[Figure 4.2 diagram: v1 feeds S; v2 and v3 leave S]
Figure 4.2: Simple branched pathway.
Flux Distributions
A common need of metabolic engineers is to know the flux distribution throughout a reaction network. One approach to obtain this information is to measure every individual flux in the network. This can be done, at least in principle, by measuring the consumption or turnover rates of all the metabolites in the network. The easiest rates to measure are those of the reaction steps that connect directly to the external environment. Such steps might be involved in nutrient and oxygen consumption, or in carbon dioxide, ethanol or biomass production, quantities that can be measured experimentally. However, the internal fluxes that are deep inside the metabolic networks are much more difficult to measure, although the use of 13C-labeled substrates has made such measurements more accessible.
In practice it is extremely difficult to measure every reaction rate directly; instead the steady state balance equations can be exploited to reduce the number of necessary flux measurements. To illustrate, the balance equation for the simple branched pathway shows us that only two rates actually need be measured because the third can be computed. For example, if v1 and v2 were measured, the third rate, v3, could be calculated from the balance equation v3 = v1 − v2, taking note that the pathway must be in steady state. For an experimentalist this is a great benefit because it reduces the number of measurements that need to be made.
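A small numerical sketch makes the saving concrete. The steady-state balance for the simple branch, v1 − v2 − v3 = 0, rearranges to v3 = v1 − v2. The function name `third_branch_flux` and the measured values below are hypothetical and serve only as an illustration.

```python
def third_branch_flux(v1, v2):
    """Compute v3 for the simple branch at steady state.

    The balance equation v1 - v2 - v3 = 0 rearranges to v3 = v1 - v2.
    """
    return v1 - v2


# Stoichiometry matrix of the simple branch (one species, three reactions).
N = [1, -1, -1]

# Hypothetical measured fluxes, in arbitrary units.
v1_measured = 10.0
v2_measured = 4.0

v3_computed = third_branch_flux(v1_measured, v2_measured)

# Check that the full flux vector satisfies N v = 0.
fluxes = [v1_measured, v2_measured, v3_computed]
residual = sum(coeff * flux for coeff, flux in zip(N, fluxes))

print("v3 =", v3_computed)
print("N v =", residual)
```

The residual N v is zero by construction, which confirms that the computed flux is consistent with the steady-state assumption.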
One of the practical aims of flux balance analysis is to devise methods that allow all the fluxes in a pathway to be determined with the minimum effort. To devise such methods, however, a number of questions need to be answered. For example, is there a minimum number of fluxes that can be measured experimentally to fully determine all fluxes in a pathway? In the simple branch pathway (Figure 4.2) a minimum of two fluxes was required. Alternatively, it may not be possible to measure even the minimum number; in such cases can a best estimate for the flux distribution in a pathway be computed? The following sections will consider approaches to answering these questions, particularly for arbitrary networks where systematic approaches are required.
4.2 Determined Systems
Consider the more complicated pathway shown in Figure 4.3.
[Figure 4.3 diagram: species S1, S2 and S3 connected by reactions v1 to v6]
Figure 4.3: Complex branched pathway.
The stoichiometry matrix for this pathway is:
$$
N = \begin{array}{r|rrrrrr}
 & v_1 & v_2 & v_3 & v_4 & v_5 & v_6 \\ \hline
S_1 & 1 & -1 & 0 & 1 & 0 & 0 \\
S_2 & 0 & 1 & -1 & 0 & 1 & 0 \\
S_3 & 0 & 0 & 0 & -1 & -1 & 1
\end{array}
\tag{4.1}
$$
which corresponds to the following three balance equations:
$$
\begin{aligned}
v_1 - v_2 + v_4 &= 0 \\
v_2 - v_3 + v_5 &= 0 \\
v_6 - v_4 - v_5 &= 0
\end{aligned}
$$
Assume we wish to determine all the fluxes through this simple pathway. Is there a minimum number of fluxes we can measure, from which we can compute the remaining? Since there are three equations and six unknowns, at least three of the fluxes must be measured so that the number of unknowns can be reduced to three. However, of the six, which three fluxes should be measured? For example, measuring v1, v2 and v4 will not help because it is not possible to compute the others from these fluxes: v1, v2 and v4 are already tied together by the first balance equation, so measuring all three yields only two independent pieces of information. The problem arises because there are dependencies among the columns of the stoichiometry matrix.
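One way to test a candidate choice of measured fluxes is sketched below, under the assumption that the remaining fluxes can be computed uniquely only when their columns of N are linearly independent. The function names `rank`, `columns` and `can_compute_rest` are hypothetical, and the rank is found by exact Gaussian elimination with fractions rather than by a numerical library.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix by Gauss-Jordan elimination with exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    found = 0  # number of pivots found so far
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        lead = rows[found][col]
        rows[found] = [x / lead for x in rows[found]]
        for r in range(len(rows)):
            if r != found and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
        if found == len(rows):
            break
    return found


def columns(matrix, indices):
    """Extract the given columns of a matrix."""
    return [[row[i] for i in indices] for row in matrix]


# Stoichiometry matrix of the pathway in Figure 4.3 (columns v1 to v6).
N = [
    [1, -1, 0, 1, 0, 0],
    [0, 1, -1, 0, 1, 0],
    [0, 0, 0, -1, -1, 1],
]


def can_compute_rest(measured):
    """True if the unmeasured fluxes are uniquely fixed by N v = 0."""
    unknown = [j for j in range(len(N[0])) if j not in measured]
    return rank(columns(N, unknown)) == len(unknown)


measure_v1_v2_v4 = can_compute_rest({0, 1, 3})  # leaves v3, v5, v6 unknown
measure_v3_v5_v6 = can_compute_rest({2, 4, 5})  # leaves v1, v2, v4 unknown

print("measure v1, v2, v4:", measure_v1_v2_v4)
print("measure v3, v5, v6:", measure_v3_v5_v6)
```

When v1, v2 and v4 are measured, the columns for v3, v5 and v6 all have a zero in the S1 row and so have rank two, which is why those three unknowns cannot be recovered.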
In order to answer this question let us divide the fluxes into two groups, call one the measured fluxes (JM) and the other the computed fluxes (JC). The computed fluxes will be calculated from some combination of the measured fluxes. Consider the system equation at steady state:
$$N v = 0$$
Let us apply row reduction to the system equation until N is in reduced echelon form (See Box 3.2). Since the right-hand side is zero, it remains unchanged in the process. These operations lead to:
$$\begin{bmatrix} I & M \\ 0 & 0 \end{bmatrix} v = 0 \tag{4.2}$$
The process is likely to result in column as well as row exchanges, and as a result the linearly independent columns will move to the left partition forming the identity matrix and the linearly dependent columns will be found in the partition corresponding to M. Let us partition the v vector to correspond to the partitioning in the echelon matrix, so that:
$$\begin{bmatrix} I & M \\ 0 & 0 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = 0$$
which when multiplied out gives v1 = −M v2, where v1 and v2 here denote the two partitions of v rather than individual reaction rates. This tells us that the flux terms in the v1 partition correspond to the computed fluxes, JC, and v2 to the measured fluxes, JM, that is JC = −M JM.
This relation describes a set of computed fluxes, JC, as a function of a set of measured fluxes, JM, via a transformation matrix, −M. To follow conventional notation, the term −M will be renamed K0 (that is, −M = K0) so that
$$J_C = K_0 J_M \tag{4.3}$$
and equation 4.2 can be reexpressed as:
$$\begin{bmatrix} I & -K_0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} J_C \\ J_M \end{bmatrix} = 0 \tag{4.4}$$
Returning to the example shown in Figure 4.3, let us apply a series of elementary operations to the stoichiometry matrix to bring it to its reduced echelon form (Equation 4.4). Start with the stoichiometry matrix:
$$
\begin{bmatrix}
1 & -1 & 0 & 1 & 0 & 0 \\
0 & 1 & -1 & 0 & 1 & 0 \\
0 & 0 & 0 & -1 & -1 & 1
\end{bmatrix}
$$
1. Multiply the 3rd row by -1.
$$
\begin{bmatrix}
1 & -1 & 0 & 1 & 0 & 0 \\
0 & 1 & -1 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & -1
\end{bmatrix}
$$
2. Add the 2nd row to the 1st row.
$$
\begin{bmatrix}
1 & 0 & -1 & 1 & 1 & 0 \\
0 & 1 & -1 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & -1
\end{bmatrix}
$$
Box 2.1 Echelon Forms - Recap
There are two kinds of matrices that one frequently encounters in the
study of linear equations. These are the row echelon and reduced echelon forms. Both matrices are generated when solving sets of linear
equations. The row echelon form is derived using forward elimination
and the reduced echelon form by Gauss-Jordan Elimination.
A row echelon matrix is defined as having the following characteristics:
1. All rows that consist entirely of zeros are at the bottom of the matrix.
2. In each non-zero row, the first non-zero entry is a 1, the leading one.
3. The leading 1 in each row is to the right of all leading 1’s above it.
This means there will be zeros below each leading 1.
The following three matrices are examples of row echelon forms:
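$$
\begin{bmatrix} 1 & 4 & 3 & 0 \\ 0 & 0 & 1 & 7 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\qquad
\begin{bmatrix} 1 & 5 & 3 & 0 \\ 0 & 1 & 7 & 2 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\qquad
\begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \end{bmatrix}
$$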
The reduced echelon form has one additional characteristic:
4. Each column that contains a leading one has zeros above and below
it. The following three matrices are examples of reduced echelon forms:
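$$
\begin{bmatrix} 1 & 0 & 4 & 0 \\ 0 & 1 & 1 & 7 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\qquad
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\qquad
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}
$$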
Sometimes the columns of a reduced echelon matrix can be ordered such that each leading one is immediately to the right of the leading one above it. This will ensure that the leading 1's form an identity matrix at the front of the matrix. The reduced echelon form will therefore have the following general block structure:
$$\begin{bmatrix} I & A \\ 0 & 0 \end{bmatrix}$$
It is always possible to reduce any matrix to its echelon or reduced echelon form by an appropriate choice of elementary operations. The function rref() implemented in many math applications will generate a reduced row echelon form.
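3. Add the 3rd row times -1 to the 1st row.
$$
\begin{bmatrix}
1 & 0 & -1 & 0 & 0 & 1 \\
0 & 1 & -1 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & -1
\end{bmatrix}
$$
4. And finally, exchange the 3rd and 4th columns.
$$
\begin{bmatrix}
1 & 0 & 0 & -1 & 0 & 1 \\
0 & 1 & 0 & -1 & 1 & 0 \\
0 & 0 & 1 & 0 & 1 & -1
\end{bmatrix}
$$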
These operations lead to the following reduced echelon matrix (leading
ones are shown in red):
v1
Reduced Echelon D
2
4
1
0
0
v2 v4 v3
v5 v6
0
1
0
0
1
1
0
0
1
1
1
0
3
1
0 5
1
(4.5)
Note that during the reduction, the third and forth columns were exchanged.
The partition that holds the identity matrix marks the computed fluxes and
the right-hand partition which holds the K0 matrix marks the measured
fluxes. Thus the computed fluxes correspond to the independent columns
and the measured fluxes to the dependent columns. If we extract the K0
partition, equation 4.3 can be used to relate the computed to the measured
fluxes as follows:
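\[
\begin{bmatrix} v_1 \\ v_2 \\ v_4 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & -1 \\ 1 & -1 & 0 \\ 0 & -1 & 1 \end{bmatrix}
\begin{bmatrix} v_3 \\ v_5 \\ v_6 \end{bmatrix}
\tag{4.6}
\]
Or
\[
\begin{aligned}
v_1 &= v_3 - v_6 \\
v_2 &= v_3 - v_5 \\
v_4 &= v_6 - v_5
\end{aligned}
\]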
This shows that in principle only v3, v5 and v6 need be measured, from
which all remaining rates can be calculated. A visual inspection of the
pathway in Figure 4.3 confirms this: v4 can be computed
from v5 and v6; v2 can be computed from v5 and v3; and lastly, v1 can be
computed from v2 and v4.
Software tools such as PySCeS [39] can be used to automatically compute
the K0 matrix along with an appropriately reordered stoichiometry matrix.
In summary, the method outlined above enables us to derive the minimum
set of fluxes to measure in order to determine all fluxes in an arbitrary
pathway.
Linear Algebra of Determined Systems
An alternative but related approach to deriving the computed from the measured fluxes is as follows. Let us assume we can reorder the columns of
the stoichiometry matrix so that all the dependent columns are moved to
the left side of the matrix and the independent columns are moved to the
right side of the matrix. Note this is the opposite order to the columns
in equations 4.5 and 4.2. Furthermore, let us assume that the rows
have also been reordered so that the independent rows are moved to the
top and the dependent rows to the bottom of the matrix. These prerequisites mean that the stoichiometry matrix has the partitioned structure shown
in Figure 4.4.
The partition NR represents the set of independent species, and at steady
state:
\[
N_R \begin{bmatrix} J_M \\ J_C \end{bmatrix} = 0
\]
NR can be partitioned as shown in Figure 4.4:
\[
\begin{bmatrix} N_{DC} & N_{IC} \end{bmatrix}
\begin{bmatrix} J_M \\ J_C \end{bmatrix} = 0
\]
where NDC represents the set of linearly dependent columns and NIC the
set of linearly independent columns. To reemphasize, the order of
the computed and measured fluxes is exchanged compared to that shown
in equation 4.4.
Figure 4.4: Partitioned Stoichiometry Matrix: n = number of reactions; m = number of species; NDC = partition of linearly dependent
columns; NIC = partition of linearly independent columns; NR =
reduced stoichiometry matrix; N0 = partition of linearly dependent rows.
Multiplying out this equation gives NDC JM + NIC JC = 0. This
equation can be rearranged and both sides multiplied by the inverse of
NIC to obtain:
\[
J_C = -(N_{IC})^{-1} N_{DC} J_M
\tag{4.7}
\]
This result gives us a relationship between the computed and measured
fluxes. The term -(NIC)^{-1} NDC can be replaced by K0, so that JC =
K0 JM. This equation is identical to equation 4.3 but offers an alternative approach to computing K0 and is the method often cited in the literature [60, 14]. The inverse of NIC is guaranteed to exist because NIC
is square and all its rows and columns are, by construction,
linearly independent.
The equation K0 = -(NIC)^{-1} NDC can be rearranged into the following form:
\[
\begin{bmatrix} N_{DC} & N_{IC} \end{bmatrix}
\begin{bmatrix} I \\ K_0 \end{bmatrix} = 0
\tag{4.8}
\]
or more simply:
\[
N_R K = 0
\tag{4.9}
\]
This shows that the K0 matrix is related to the null space of the reordered
stoichiometry matrix. We will return to the interpretation of equation 4.9
in the next chapter.
Examples
The following examples illustrate the application of equation 4.7.
a) Consider the branched pathway shown in Figure 4.3. The columns of
the stoichiometry matrix can be reordered so that the linearly dependent
columns (NDC) are first, followed by the linearly independent columns
(NIC). Row reduction to the reduced echelon form (equation 4.4) can
be used to determine which are the linearly independent and dependent
columns (equation 4.5). In the stoichiometry matrix below, the partitions
have been exchanged relative to equation 4.5 so that the linearly dependent
columns (v3, v5, v6) are first, followed by the linearly independent
columns (v1, v2, v4):
\[
N =
\begin{array}{rrr|rrr}
v_3 & v_5 & v_6 & v_1 & v_2 & v_4 \\ \hline
0 & 0 & 0 & 1 & -1 & 1 \\
-1 & 1 & 0 & 0 & 1 & 0 \\
0 & -1 & 1 & 0 & 0 & -1
\end{array}
\]
From the reordered matrix, the NDC and NIC partitions can be extracted,
from which the dependency relations can be derived by applying equation 4.7:
\[
K_0 = -
\begin{bmatrix} 1 & -1 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix}^{-1}
\begin{bmatrix} 0 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix}
=
\begin{bmatrix} 1 & 0 & -1 \\ 1 & -1 & 0 \\ 0 & -1 & 1 \end{bmatrix}
\]
The derived K0 corresponds to the same result found in equation 4.6.
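The application of equation 4.7 to the branched pathway can be checked numerically. The sketch below is illustrative only: the helper `solve` is a hypothetical name, and the signs of the matrix entries are an assumption, chosen so that the result agrees with the relations v1 = v3 - v6, v2 = v3 - v5 and v4 = v6 - v5. The measured flux values at the end are made up for the example.

```python
from fractions import Fraction

def solve(a, b):
    """Return a^{-1} b by Gauss-Jordan elimination on the augmented matrix [a | b]."""
    n = len(a)
    aug = [[Fraction(x) for x in ra + rb] for ra, rb in zip(a, b)]
    for col in range(n):
        # Pick a row with a non-zero entry in this column and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [x / lead for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Assumed signs. Columns of N_IC are v1, v2, v4; columns of N_DC are v3, v5, v6.
n_ic = [[1, -1, 1], [0, 1, 0], [0, 0, -1]]
n_dc = [[0, 0, 0], [-1, 1, 0], [0, -1, 1]]

# Equation 4.7: K0 = -(N_IC)^{-1} N_DC
k0 = [[-x for x in row] for row in solve(n_ic, n_dc)]

# Made-up measured fluxes (v3, v5, v6) and the computed fluxes (v1, v2, v4).
j_m = [5, 1, 3]
j_c = [sum(k * m for k, m in zip(row, j_m)) for row in k0]
print(k0, j_c)
```

With these made-up measurements the computed fluxes are v1 = 2, v2 = 4 and v4 = 2.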
b) A more complex example of a pathway is shown in Figure 4.5.

Figure 4.5: Complex Network incorporating two input fluxes and two
output fluxes, coupled internally by multiple branches and one reaction
that exhibits non-unity stoichiometry (v4).

The stoichiometry matrix for this network is given by:
\[
N =
\begin{array}{c|rrrrrrrrr}
  & v_1 & v_2 & v_3 & v_4 & v_5 & v_6 & v_7 & v_8 & v_9 \\ \hline
A & 1 & -1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
B & 0 & 1 & 0 & -1 & 0 & -1 & 0 & 0 & 0 \\
C & 0 & 0 & 1 & 0 & 0 & 1 & -1 & 0 & 0 \\
D & 0 & 0 & 0 & 2 & 0 & 0 & 1 & -1 & 0 \\
E & 0 & 0 & 0 & -1 & 1 & 0 & 0 & 0 & 0 \\
F & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & -1
\end{array}
\]
and the balance equations by:
\[
\begin{aligned}
v_1 - v_2 - v_3 &= 0 \\
v_2 - v_4 - v_6 &= 0 \\
v_3 + v_6 - v_7 &= 0 \\
2 v_4 + v_7 - v_8 &= 0 \\
v_5 - v_4 &= 0 \\
v_6 - v_9 &= 0
\end{aligned}
\]
Let us reorder the columns of the stoichiometry matrix so that the linearly
dependent columns are on the left and linearly independent columns are
on the right (Figure 4.4). Note that there are no dependent rows in the
network so that there is no N0 partition in the reordered matrix. Reordering can be accomplished by carrying out a row reduction on the matrix
to reduced echelon form (equation 4.2) and recording the column changes
in the stoichiometry matrix. Note that the partitions must be exchanged
to match the structure shown in equation 5.1. The simplest reordering is
given by the following stoichiometry matrix:
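\[
N =
\begin{array}{c|rrr|rrrrrr}
  & v_7 & v_8 & v_9 & v_1 & v_2 & v_3 & v_4 & v_5 & v_6 \\ \hline
A & 0 & 0 & 0 & 1 & -1 & -1 & 0 & 0 & 0 \\
B & 0 & 0 & 0 & 0 & 1 & 0 & -1 & 0 & -1 \\
C & -1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 \\
D & 1 & -1 & 0 & 0 & 0 & 0 & 2 & 0 & 0 \\
E & 0 & 0 & 0 & 0 & 0 & 0 & -1 & 1 & 0 \\
F & 0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 1
\end{array}
\]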
The K0 matrix can be computed from the null space (4.9) of this reordered
matrix:
\[
K =
\begin{array}{c|rrr}
v_7 & 1 & 0 & 0 \\
v_8 & 0 & 1 & 0 \\
v_9 & 0 & 0 & 1 \\
v_1 & 0.5 & 0.5 & 0 \\
v_2 & -0.5 & 0.5 & 1 \\
v_3 & 1 & 0 & -1 \\
v_4 & -0.5 & 0.5 & 0 \\
v_5 & -0.5 & 0.5 & 0 \\
v_6 & 0 & 0 & 1
\end{array}
\qquad
K_0 =
\begin{array}{c|rrr}
v_1 & 0.5 & 0.5 & 0 \\
v_2 & -0.5 & 0.5 & 1 \\
v_3 & 1 & 0 & -1 \\
v_4 & -0.5 & 0.5 & 0 \\
v_5 & -0.5 & 0.5 & 0 \\
v_6 & 0 & 0 & 1
\end{array}
\]
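Equation 4.9 can be confirmed directly for this example by multiplying the reordered stoichiometry matrix by K. The following sketch assumes the column order v7, v8, v9, v1, ..., v6 and takes the signs of the entries from the balance equations of the network; the variable names are illustrative.

```python
from fractions import Fraction

# Reordered stoichiometry matrix: rows are species A..F,
# columns are v7, v8, v9 | v1, v2, v3, v4, v5, v6.
N_R = [
    [ 0,  0,  0, 1, -1, -1,  0, 0,  0],  # A
    [ 0,  0,  0, 0,  1,  0, -1, 0, -1],  # B
    [-1,  0,  0, 0,  0,  1,  0, 0,  1],  # C
    [ 1, -1,  0, 0,  0,  0,  2, 0,  0],  # D
    [ 0,  0,  0, 0,  0,  0, -1, 1,  0],  # E
    [ 0,  0, -1, 0,  0,  0,  0, 0,  1],  # F
]

half = Fraction(1, 2)
K0 = [
    [ half, half,  0],  # v1
    [-half, half,  1],  # v2
    [ 1,    0,    -1],  # v3
    [-half, half,  0],  # v4
    [-half, half,  0],  # v5
    [ 0,    0,     1],  # v6
]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
K = identity + K0  # stack I on top of K0

# Matrix product N_R K; every entry should vanish (equation 4.9).
product = [[sum(N_R[i][k] * K[k][j] for k in range(9)) for j in range(3)]
           for i in range(6)]
print(product)
```

Every entry of the product is zero, so each column of K is a steady state flux vector of the network.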
From the K0 matrix the relation between the measured and computed
fluxes can be determined. From the reordering of the stoichiometry matrix it should be apparent that the measured fluxes are v7, v8 and v9;
that is, a minimum of three fluxes must be measured in order to fully determine
the remainder. Of the three measured fluxes, v7 is the most problematic
because it is an internal flux which would not be easy to
determine experimentally.
It is, however, possible to persuade the algorithm to choose particular measured fluxes by
assigning the rightmost columns of the stoichiometry matrix to those fluxes
which are considered the easiest to measure. Edge fluxes should be moved
to the right-hand edge of the stoichiometry matrix prior to carrying out the
row reduction; this encourages a solution that uses the edge fluxes. It is,
however, not guaranteed that the measured fluxes will be
dominated by the edge fluxes, especially if it is not possible to determine
the internal fluxes from the edge fluxes. This is true in the case of the branched model in Figure 4.3, where knowing the input and output edge
fluxes is not sufficient to determine all the internal fluxes.
In the case of the more complex pathway, Figure 4.5, it is possible to move
three of the edge fluxes (v5, v8 and v9) to the right-hand side of the stoichiometry matrix. These three fluxes are sufficient to calculate all fluxes
inside the pathway.
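The stoichiometry matrix can be reordered as follows:
\[
N =
\begin{array}{c|rrr|rrrrrr}
  & v_5 & v_8 & v_9 & v_6 & v_2 & v_3 & v_4 & v_1 & v_7 \\ \hline
A & 0 & 0 & 0 & 0 & -1 & -1 & 0 & 1 & 0 \\
B & 0 & 0 & 0 & -1 & 1 & 0 & -1 & 0 & 0 \\
C & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & -1 \\
D & 0 & -1 & 0 & 0 & 0 & 0 & 2 & 0 & 1 \\
E & 1 & 0 & 0 & 0 & 0 & 0 & -1 & 0 & 0 \\
F & 0 & 0 & -1 & 1 & 0 & 0 & 0 & 0 & 0
\end{array}
\]
which yields the following K0 matrix from the null space:
\[
K_0 =
\begin{array}{c|rrr}
v_6 & 0 & 0 & 1 \\
v_2 & 1 & 0 & 1 \\
v_3 & -2 & 1 & -1 \\
v_4 & 1 & 0 & 0 \\
v_1 & -1 & 1 & 0 \\
v_7 & -2 & 1 & 0
\end{array}
\]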
In turn this gives the dependency equations using equation 4.3:
\[
\begin{aligned}
v_6 &= v_9 \\
v_2 &= v_5 + v_9 \\
v_3 &= v_8 - v_9 - 2 v_5 \\
v_4 &= v_5 \\
v_1 &= v_8 - v_5 \\
v_7 &= v_8 - 2 v_5
\end{aligned}
\]
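As a quick consistency check, the dependency equations can be evaluated for some measured values and substituted back into the balance equations. This is a sketch only: the function name and the measured values of v5, v8 and v9 are invented for illustration, and the stoichiometry matrix is written out from the balance equations of the network in Figure 4.5.

```python
# Stoichiometry matrix for the network of Figure 4.5:
# rows are species A..F, columns are v1..v9.
N = [
    [1, -1, -1,  0, 0,  0,  0,  0,  0],  # A
    [0,  1,  0, -1, 0, -1,  0,  0,  0],  # B
    [0,  0,  1,  0, 0,  1, -1,  0,  0],  # C
    [0,  0,  0,  2, 0,  0,  1, -1,  0],  # D
    [0,  0,  0, -1, 1,  0,  0,  0,  0],  # E
    [0,  0,  0,  0, 0,  1,  0,  0, -1],  # F
]

def all_fluxes(v5, v8, v9):
    """Compute every flux from the three measured fluxes."""
    v6 = v9
    v2 = v5 + v9
    v3 = v8 - v9 - 2 * v5
    v4 = v5
    v1 = v8 - v5
    v7 = v8 - 2 * v5
    return [v1, v2, v3, v4, v5, v6, v7, v8, v9]

v = all_fluxes(v5=2, v8=10, v9=3)  # made-up measurements
residual = [sum(n * x for n, x in zip(row, v)) for row in N]
print(v, residual)
```

The residual N v is zero for every species, so the computed fluxes satisfy the steady state condition.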
In summary, measuring only v5, v8 and v9 allows us to completely determine all the fluxes in the network. Unfortunately, in real systems the
internal structure of the network will be much more complex and will include many more degrees of freedom. This means that in many cases there
will be insufficient information to fully determine the internal fluxes. Such
cases are called underdetermined systems and alternative strategies must
be used to gain access to the unknown fluxes. Two common strategies for
studying underdetermined systems are flux balance analysis and
metabolic flux analysis. Flux balance analysis relies on linear programming while metabolic flux analysis uses 13C-labeled substrates to estimate
fluxes.
4.3 Flux Balance Analysis
The previous section described how one can determine the sets of computed and measured fluxes and how to calculate one set from the other. It
assumed that it was possible to measure all the measured fluxes. However,
it is often very difficult experimentally to measure all the
required fluxes. In this situation the problem becomes underdetermined and alternative strategies are required to determine the fluxes
in a pathway. One method is to use linear programming. By its nature,
linear programming only gives an estimate of the fluxes, and predictions
based on linear programming should be supported by additional measurements. Nevertheless, the approach has proved to be popular in the metabolic
community [41].
Linear Programming
Linear programming has its origins in the 1940s, when its development was motivated by a wartime need to solve complex planning
problems. Once developed, the method was rapidly taken up by many private industries as a means to determine the optimal allocation of a finite
set of resources given an objective and a set of constraints. Applications
in industry cover a wide range of areas, including airline crew scheduling, stock and bond portfolio selection, and oil refining and blending. In
the last few decades linear programming has also been employed as a
means to estimate the optimal allocation of fluxes in a metabolic pathway [67, 68, 16, 32].
Linear programming is an optimization method that requires two inputs:
a linear objective function, generally a weighted sum of measurable
quantities from a metabolic model, and a set of linear constraints. The maximizing linear programming problem can be expressed by the relations shown in equation 4.10.
\[
\begin{aligned}
\text{Maximize: } & Z = c_1 x_1 + c_2 x_2 + \cdots = c^T x \\
\text{Subject to: } & a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n \le b_1 \\
& a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n \le b_2 \\
& \qquad \vdots \\
& a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n \le b_m \\
\text{Or: } & A x \le b \\
\text{where all: } & x_i \ge 0
\end{aligned}
\tag{4.10}
\]
There are a number of algorithms that can be used to solve linear programming problems, but by far the most popular is the simplex method (not
to be confused with the simplex method developed by Nelder and Mead
for solving nonlinear optimization problems). The simplex method can be
motivated by a simple example.
Consider a pharmaceutical company that manufactures two drugs, say x
and y, from two genetically engineered organisms, A and B. Let us assume
that organism A can produce at maximum 4 kg of drug x per day and
organism B a maximum of 2 kg of y per day. Let us also assume that
the factory can only process a total of 5 kg of any drug per day due to
packaging equipment limitations. If the company can make a profit of
$100 per kg for drug x and a profit of $150 per kg for drug y, what is
the optimal rate at which each drug should be manufactured in order to
maximize profit?
This problem is sufficiently small that it can be easily solved manually.
To maximize profit, it would be prudent to first produce the maximum
amount of the most profitable drug, y, and then to use whatever spare capacity
remains in the packaging department to manufacture drug x. This would
mean producing 2 kg per day of drug y, which leaves 3 kg of capacity in
the packaging department to produce 3 kg per day of drug x. Therefore
the total profit for this scenario is 2 × 150 + 3 × 100 = $600.
The problem of drug manufacture allocation can be easily expressed as a
linear programming problem. The objective for the
problem is to maximize profit, that is:
\[
\text{Maximize: } Z = \$100\, x + \$150\, y
\]
The constraints on the problem can also be easily expressed. For example,
the quantity of drug manufactured cannot be negative, that is:
\[
x \ge 0 \quad \text{and} \quad y \ge 0
\]
In addition, the problem states that a maximum of 4 kg of x can be manufactured per day and a maximum of 2 kg of y per day, that is:
\[
x \le 4 \quad \text{and} \quad y \le 2
\]
Finally, the packaging department can only process a maximum of 5 kg
per day, that is:
\[
x + y \le 5
\]
This problem can be re-expressed in graphical form as shown in Figure 4.6.
The figure plots all the linear constraints that define the problem, including
x ≤ 4, y ≤ 2 and x + y ≤ 5. The limit of the objective function is indicated by the hashed line. Points where two or more constraints intersect
are called cornerpoints or vertices.

Figure 4.6: Linear Programming: Constraints displayed as edges on a
graph for the drug manufacturing problem. The plot shows the lines x = 0, y = 0, x = 4, y = 2 and x + y = 5, with the cornerpoints numbered.

Figure 4.7 illustrates the feasible solution region bounded by the constraints and the maximum value of the objective
function. The simplex method works by traversing the cornerpoints one by
one. The method first starts at one of the cornerpoints, say cornerpoint 1, and then attempts to move to an adjacent cornerpoint which yields a better
value for the objective function. If the method is unable to move to a better
value of the objective function, it stops and reports the last cornerpoint as the optimal
solution. For example, the value of the objective function at cornerpoint
1 is $400. An adjacent cornerpoint is cornerpoint 2. The value of
the objective function at this point is $550. Since the objective function at
the new cornerpoint is larger, the method moves to this cornerpoint. From
the second cornerpoint the method moves to the next adjacent cornerpoint,
cornerpoint 3. The value of the objective function at cornerpoint 3 is
$600. This again is larger than the value at cornerpoint 2. Once again,
the method examines the next adjacent cornerpoint, cornerpoint 4. The
value of the objective function at cornerpoint 4 is $300, which is less than the value
at cornerpoint 3. Since there are no other cornerpoints to traverse, the method stops
and assigns the optimal value of $600 to cornerpoint 3. When a single
point is located it represents a unique solution. However, it is possible
for optimum solutions to lie on a line that joins two cornerpoints, that is,
two cornerpoints yield the same value for the objective function. In higher
dimensions, optima may lie on hyperplanes connecting multiple cornerpoints. In such situations the solution is termed degenerate because there
are now an infinite number of optimal solutions, and other non-quantifiable
criteria may be used to judge the 'best' solution. For example, a degenerate solution may indicate that two different combinations of drug x and y
are equally profitable. However, one of the drugs may have toxicity issues,
in which case the optimum with the lowest level of this drug is better.
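Because the drug problem has only two variables, its cornerpoints can be listed by brute force. The sketch below is not the simplex method, which visits only a path of improving cornerpoints; it simply intersects every pair of constraint boundaries, keeps the feasible intersections and evaluates the profit at each. The function names are illustrative. Note that the origin is also a vertex of the feasible region.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint (a, b, c) means a*x + b*y <= c.
constraints = [
    (1, 0, 4),    # x <= 4
    (0, 1, 2),    # y <= 2
    (1, 1, 5),    # x + y <= 5
    (-1, 0, 0),   # x >= 0
    (0, -1, 0),   # y >= 0
]

def profit(x, y):
    """Objective function Z = 100 x + 150 y (dollars per day)."""
    return 100 * x + 150 * y

def cornerpoints(constraints):
    """Intersect every pair of constraint boundaries and keep the feasible points."""
    points = set()
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundaries never intersect
        # Cramer's rule for the two boundary lines.
        x = Fraction(c1 * b2 - c2 * b1, det)
        y = Fraction(a1 * c2 - a2 * c1, det)
        if all(a * x + b * y <= c for a, b, c in constraints):
            points.add((x, y))
    return sorted(points)

vertices = cornerpoints(constraints)
best = max(vertices, key=lambda p: profit(*p))
for x, y in vertices:
    print(f"x = {x}, y = {y}, profit = {profit(x, y)}")
print("optimum:", best, profit(*best))
```

The profits at the feasible cornerpoints (4, 0), (4, 1), (3, 2) and (0, 2) are $400, $550, $600 and $300, matching the values quoted in the traversal above, with the optimum at x = 3, y = 2.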
Another important aspect that Figure 4.7 illustrates is what would happen to
the optimal solution if the constraints change. This question leads to the
idea of sensitivity and what are called shadow prices. A shadow price is
the change in the optimal value of the objective function when a constraint is changed by one unit.
For example, what would happen to the optimal solution if the limit on the manufacture of drug y were to be increased from 2 to 3 kg per day? Sensitivity
analysis can answer such questions; it provides additional information for
interpreting the optimal solutions and for gauging how robust the solutions
are to the constraints and/or objective function.
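For the drug example the effect of relaxing the limit on drug y can be estimated by solving the problem twice and comparing the optimal profits. The following sketch uses brute-force cornerpoint enumeration rather than a true sensitivity analysis, and the function name `max_profit` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def max_profit(y_limit):
    """Best profit for the drug problem when at most y_limit kg of y can be made."""
    # Each constraint (a, b, c) means a*x + b*y <= c.
    constraints = [(1, 0, 4), (0, 1, y_limit), (1, 1, 5), (-1, 0, 0), (0, -1, 0)]
    best = None
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundaries
        x = Fraction(c1 * b2 - c2 * b1, det)
        y = Fraction(a1 * c2 - a2 * c1, det)
        if all(a * x + b * y <= c for a, b, c in constraints):
            value = 100 * x + 150 * y
            best = value if best is None else max(best, value)
    return best

# Change in the optimal profit when the limit on y is raised from 2 to 3 kg per day.
shadow_price = max_profit(3) - max_profit(2)
print(max_profit(2), max_profit(3), shadow_price)
```

Raising the limit moves the optimum from (3, 2) to (2, 3), so one kilogram of packaging capacity switches from drug x to drug y and the profit rises by $150 - $100 = $50.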
Figure 4.7: Linear Programming: Area within the confinement of the
constraints is marked as the feasible region. All potential solutions to
the problem reside in this region. Linear Programming attempts to locate the optimum solution within this region given an objective function. The simplex method moves from cornerpoint (vertex) to cornerpoint searching for the maximum value of the objective function. In this
problem, the third cornerpoint indicates the optimal solution.

The drug manufacturing example was a relatively simple problem and
could be solved without recourse to the simplex method. For problems
with more variables the number of cornerpoints rises considerably. In addition, rather than being simple two-dimensional problems, real problems
are invariably higher-dimensional. Linear programming is therefore rarely
done by hand; instead software is employed to find solutions. Given the
popularity of linear programming in general, there is a very wide range of
software tools available, including well-known tools such as Excel, Matlab and Mathematica, or more specialized tools such as LINDO (http://www.lindo.com) or CPLEX (http://www.ilog.com). However, there
is also a wide range of equally good open source alternatives. Probably the
most notable of these are the GNU Linear Programming Kit (GLPK)
or, better still, the lp_solve library by Peter Notebaert. lp_solve is notable
for a number of reasons: its licence has fewer restrictions (LGPL) and there
are language bindings that allow lp_solve to be easily called from many
different computer languages, including, for example, Java, Delphi, C#,
Matlab, Excel, Python and SciLab. Both GLPK and lp_solve have
very active community forums. One enterprising individual (Henri Gourvest)
has written an excellent graphical front end to lp_solve, called the LPSolve
IDE. This front end makes it very easy to specify the objective function
and constraints and solve the linear programming problem with the press
of a single button. Further discussion of LPSolve IDE will be given in the
next section.
Objective Functions
The choice of objective function is critical for the linear programming approach to be effective and there has been much discussion in the literature
on what a suitable objective function might be for biological systems. For
example, one of the earliest reported efforts to use linear programming in
metabolic modeling was by Fell and Small [16]. These authors investigated fat synthesis in adipose tissue and used a variety of objective functions, including minimizing the amount of glucose used per triacylglycerol
formed or maximizing the generation of NADH from the pentose pathway. The authors subsequently used the model to study how the efficiency
of conversion was affected by the availability of ATP.
One of the early attempts to determine the flux distribution in E. coli was
conducted by Palsson's group [53, 64, 65]. An objective function used in
this work involved maximizing the production of biomass, the assumption
being that growing single-celled organisms have been selected for growth
(unlike cells in multicellular organisms, where the objective function is
more obscure). In order to relate biomass to a metabolic map, the authors
obtained data [27, 65] that described how 1 gram of E. coli biomass was derived from various metabolic precursors and cofactors (see Table 4.1). The
objective function used to optimize the flux distribution was then defined
as the sum of all the fluxes that produce each of the precursors, weighted
by the amount of precursor required.
Thus, a suitable objective function may be written as:
\[
Z = 41.257\, v_{ATP} - 3.547\, v_{NADH} + 18.225\, v_{NADPH} + 0.205\, v_{G6P} + \ldots
\]

Metabolite    Demand (mmol)
ATP           41.2570
NADH          -3.5470
NADPH         18.2250
G6P            0.2050
F6P            0.0709
R5P            0.8977
E4P            0.3610
T3P            0.1290
3PG            1.4960
PEP            0.5191
PYR            2.8328
AcCoA          3.7478
OAA            1.7867
AKG            1.0789

Table 4.1: Number of mmoles of precursors and cofactors that are required to yield 1 gram of biomass of E. coli [27, 65]

The use of this objective function yielded results which overestimated
the experimentally determined glucose yield. This suggested that the stoichiometric model was missing an important component. In order to correct
the discrepancy, the authors introduced ATP maintenance into the calculation, since cells use energy not just to achieve growth but also to
maintain other non-growth functions such as maintenance of transmembrane gradients and cellular motility. The addition of ATP maintenance
into the calculation yielded better estimates for glucose yield.
Another, quite different, example of an objective function relates to the
flux balance analysis of the mycolic acid pathway in Mycobacterium tuberculosis. In the work by Raman et al. [46], the authors selected an objective
function based on maximizing the different proportions of mycolates that
make up the cell wall. Given that cell wall composition is important to the
structural integrity of the cell wall, optimal production of mycolates would
appear to be an appropriate optimum for the organism to achieve. With
the objective function set, linear programming then requires a set of linear
constraints that will restrict the limits of the objective function and allow
one to find the maximum.
Flux Balance Constraints
In addition to an objective function, linear programming also requires a
set of constraints to limit the scope of the solution space. Of these, the
most important group are the steady state constraints on the pathway, that
is N v = 0. There is one restriction on the steady state constraints: all
rates must be non-negative. This means that reversible reactions must be split
into their separate forward and reverse reactions.
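Splitting a reversible reaction amounts to duplicating its column in the stoichiometry matrix with the signs reversed, so that the net rate is the forward rate minus the reverse rate and both parts can be kept non-negative. The sketch below illustrates this on a hypothetical two-species chain; the function name and the example network are assumptions made for illustration.

```python
def split_reversible(N, reversible):
    """Replace each reversible column by a forward column and a reverse
    column, the reverse column being the negated original."""
    new_rows = []
    for row in N:
        new_row = []
        for j, coeff in enumerate(row):
            new_row.append(coeff)        # forward direction
            if j in reversible:
                new_row.append(-coeff)   # reverse direction
        new_rows.append(new_row)
    return new_rows

# Hypothetical chain: -> S1 <-> S2 ->, where the middle step (column 1) is reversible.
N = [[1, -1,  0],
     [0,  1, -1]]
N_split = split_reversible(N, reversible={1})
print(N_split)
```

The split matrix has four columns: the input step, the forward and reverse parts of the reversible step, and the output step.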
In addition to the steady state constraints, other constraints can be added
to the mix. The most common of these include constraints on the values
of the external fluxes. Such fluxes, which might include nutrient uptake
or oxygen consumption, will most likely be known and will contribute an
important source of constraints on the model.
Other constraints include thermodynamic and capacity constraints. Capacity constraints impose upper bounds on a flux (0 ≤ vi ≤ bi). Such limits
can be set by the Vmax of the enzyme catalyzing the reaction. Sometimes
lower bounds may also be set, so that in general capacity constraints are expressed
by the inequality (ai ≤ vi ≤ bi). In addition, some reaction steps under specific growth conditions may be absent altogether due to catabolite
repression; the rates through such reactions can be constrained to zero.
Thermodynamic constraints are more difficult to set and require the use of
plausible ranges for metabolite levels. Thermodynamic constraints attempt
to impose flux directions that are consistent with changes in the Gibbs free
energy across each reaction, which naturally requires knowledge of metabolite levels (ref).
Finally, measurements of some internal fluxes will sometimes be available.
Such reactions then have specific rates, which can be
added to the list of model constraints.
Through a judicious use of constraints it is possible to reduce the solution
space and thus improve the reliability of the optimized solution.
In summary, a linear programming problem for estimating the fluxes in a
metabolic pathway takes the form:
\[
\begin{aligned}
\text{Maximize: } & Z = c_i v_i + c_j v_j + \cdots \\
\text{Subject to: } & N v = 0 \\
\text{where: } & v \ge 0
\end{aligned}
\tag{4.11}
\]
Example
Consider again the network shown in Figure 4.5. Let us assume that only
v5 and v1 have been measured. Clearly there is insufficient information
to compute the remaining fluxes in the pathway without recourse to linear
programming. To solve the problem using linear programming, an objective function and a set of constraints will be required. For illustration, the
model will be optimized for maximum production of biomass and, for the
sake of argument, let us assume that fluxes v8 and v9 contribute to biomass.
The objective can then be some weighted sum of the fluxes that contribute
to biomass, that is Z = c1 v8 + c2 v9.
As for the constraints, the most important are the steady state conditions on
each of the nodes in the network. In this case the steady state constraints
are:
\[
\begin{aligned}
v_1 - v_2 - v_3 &= 0 \\
v_2 - v_6 - v_4 &= 0 \\
v_3 + v_6 - v_7 &= 0 \\
2 v_4 - v_8 + v_7 &= 0 \\
v_5 - v_4 &= 0 \\
v_6 - v_9 &= 0
\end{aligned}
\]
Two other constraints are the measured fluxes v1 and v5. For illustration assume that v1 = 10 flux units and v5 = 6 flux units. This
sets up the problem. Figure 4.8 shows a screen-shot of the LPSolve IDE
software where the problem has been set up. The following code illustrates
the problem expressed in the script language used by LPSolve. Note that the
script also imposes a lower bound on v3.

    /* Objective function */
    max: 0.5*v9 + 0.75*v8;

    /* Steady State Constraints */
    v1 - v2 - v3 = 0;       /* A */
    v2 - v4 - v6 = 0;       /* B */
    v3 + v6 - v7 = 0;       /* C */
    v6 - v9 = 0;            /* F */
    v5 - v4 = 0;            /* E */
    2 v4 - v8 + v7 = 0;     /* D */

    /* Known Flux Constraints */
    v1 = 10;
    v5 = 6;
    v3 >= 1;

Running this script through LPSolve (click the green go button in the tool
bar) yields the following computed optimal solution:
v1 = 10; v2 = 9; v3 = 1; v4 = 6; v5 = 6; v6 = 3; v7 = 4; v8 = 16; v9 = 3
The maximum value of the objective function was 13.5 (0.5 × 3 + 0.75 × 16), contributed by the fluxes exiting at v8 and v9.
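The reported solution can be checked without a linear programming solver. With v1 and v5 fixed, the balance equations leave only v3 free, so the sketch below expresses every flux in terms of v3, confirms that the solution with v3 = 1 satisfies N v = 0, and shows that the objective falls as v3 increases. The function names are illustrative and the code is a consistency check, not a replacement for LPSolve.

```python
# Stoichiometry matrix for Figure 4.5: rows A..F, columns v1..v9.
N = [
    [1, -1, -1,  0, 0,  0,  0,  0,  0],  # A
    [0,  1,  0, -1, 0, -1,  0,  0,  0],  # B
    [0,  0,  1,  0, 0,  1, -1,  0,  0],  # C
    [0,  0,  0,  2, 0,  0,  1, -1,  0],  # D
    [0,  0,  0, -1, 1,  0,  0,  0,  0],  # E
    [0,  0,  0,  0, 0,  1,  0,  0, -1],  # F
]

def fluxes(v3, v1=10, v5=6):
    """All nine fluxes given the free flux v3 and the measured v1 and v5."""
    v2 = v1 - v3        # balance on A
    v4 = v5             # balance on E
    v6 = v2 - v4        # balance on B
    v7 = v3 + v6        # balance on C
    v8 = 2 * v4 + v7    # balance on D
    v9 = v6             # balance on F
    return [v1, v2, v3, v4, v5, v6, v7, v8, v9]

def objective(v):
    """Z = 0.5 v9 + 0.75 v8, as in the LPSolve script."""
    return 0.5 * v[8] + 0.75 * v[7]

reported = fluxes(v3=1)
residual = [sum(n * x for n, x in zip(row, reported)) for row in N]
print(reported, residual, objective(reported))
print(objective(fluxes(v3=2)))  # a larger v3 gives a smaller objective
```

Since the objective decreases as v3 grows, the optimum sits at the lower bound v3 = 1, which gives Z = 13.5.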
4.4 Isotopic Flux Measurements
Figure 4.8: LPSolve IDE used to model a simple metabolic model problem.

In the previous section linear programming and its application to flux balance analysis was described as a method for estimating fluxes in underdetermined systems. The method carries with it a number of assumptions,
one in particular being the choice of objective function, which in some
systems can be difficult to describe or justify. In addition, flux balance analysis has difficulty estimating the fluxes in certain cases without more
information; in particular, the fluxes in parallel pathways, metabolic cycles
such as futile cycles, and cofactor-linked cycles cannot always be resolved
by the method (see Figure 4.9). For this reason, other more experimentally based approaches have been devised to gather data on fluxes
more directly. The most important approach by far is the use of isotopic
tracer techniques, often referred to as metabolic flux analysis or MFA.
The method proceeds in two phases, one experimental and one computational. The computational analysis is very important as the data analysis
is complex owing to the size of the data sets and the resulting combinatorial
expansion of the system equations. Let us first consider the experimental
phase.
Figure 4.9: Typical situations where linear programming based flux balance analysis cannot resolve fluxes: a) Parallel pathways; b) Metabolic
Cycles; c) Pathways with closed cofactor cycles.

Table 4.2: Isotopes commonly used in biological research.

Common Isotope    Rare Stable Isotope    Radioactive Isotope
1H                2H (0.02%)             3H
12C               13C (1.1%)             14C
14N               15N (0.37%)            13N
16O               18O (0.04%)            11O
Isotopes are atoms that have the same number of protons but differ in the
number of neutrons. For example, carbon has three naturally occurring
isotopes: the common and stable 12C (6 protons and 6 neutrons), the stable and relatively uncommon (approximately 1%) 13C (6 protons and 7 neutrons) and
trace amounts of radioactive 14C (6 protons and 8 neutrons); see Table 4.2. In
practice a given substrate, such as glucose, will be labeled, that is, one or
more of the atoms in the glucose molecule will be replaced by a different
isotope. For example, the 12C at position one might be replaced with an
atom of 13C. In this case the glucose is referred to as [1-13C]glucose to
distinguish it from natural glucose.
The main advantage of using isotopes is that they can be measured, that is,
in a mixture of labeled and unlabeled glucose it is possible to distinguish
between the two molecules. The way labeled molecules are identified depends on whether radioactive or stable isotopes are used. Radioactive isotopes can be identified by their decay emissions, for example β
decay in 14C and 3H, using scintillation counters. The advantage of using radioactive isotopes is their great sensitivity. However, they are also
difficult to handle due to the radiation hazard.
Stable isotopes can be identified by measuring the difference in mass between labeled and unlabeled molecules using mass spectrometry combined
with gas chromatography (GC/MS). Gas chromatography is used to separate the initial mixture of compounds based on differential equilibration
between a gas and solid phase. Once separated, each compound is fed into
the mass spectrometer where it is broken into fragments by
an electron beam. The fragments, now charged, are first accelerated in an
electric field and then travel through a magnetic field on a circular path. The
path that an individual fragment actually takes will depend on its charge
and mass. The end result is a mass spectrum which records the relative
proportion of the different fragments that were detected. If similar fragments contain different isotopes then different peaks will emerge in the
spectrum and the proportions of the different labeled compounds can be
determined. The introduction of high performance GC/MS in the last 10
years or so has revolutionized metabolic flux analysis and is now probably
the preferred choice for estimating fluxes.
The basis for MFA is that when a labeled substrate is fed to an organism,
the labeled atoms distribute themselves throughout the chemical composition of the organism. In microbial studies, commonly used substrates
include specifically labeled glucose such as [1-13C]glucose, uniformly labeled glucose ([U-13C]glucose) or labeled amino acids. Once administered, the labeled molecules are metabolized by the organism and, through
various metabolic processes, the atoms in the labeled substrate are rearranged by separation and recombination of molecular fragments. In addition, some labeled isotope is either lost as metabolic waste, for example
CO2, or incorporated into biomass. Assuming no further changes take place
and the substrate is constantly applied, the distribution of the isotopes will
reach what is called isotopic steady state. This can occur quite rapidly, in
about an hour. Once in isotopic steady state, GC/MS or NMR is used to
determine how the label has been distributed in the various metabolites of
interest. This is the raw data that is used to determine the fluxes through
the various pathways.
In order to understand the process of generating fluxes from the isotopic
data a number of terms must first be defined and understood.
Isotopomer. One of the most important concepts in MFA is the isotopomer.
Consider a molecule of alanine, which has three carbon atoms; there are
eight different ways to label a three-carbon alanine molecule (Figure 4.10).
As label enters the metabolic pathways from an external source there is the
potential for the label to partition itself into every possible isotopomer. In
general, for a molecule with n potentially labeled atoms there will be 2^n
different isotopomers; for example, alanine with three carbon atoms has 2^3 = 8
possible isotopomers. Most often it is the relative mole fraction of the isotopomers of a given molecular type that is considered, and the vector
that holds the fractional contribution of each isotopomer is usually called
the isotopomer distribution vector, or IDV.

Figure 4.10: Alanine is a three-carbon amino acid. If alanine were labeled with 13C, there would be eight possible labeling patterns.
These different labeled forms are called isotopomers. For a molecule
with n potentially labeled atoms, there will be 2^n possible isotopomers.

Mass Distribution Vector. Another useful concept is the mass distribution vector, often abbreviated to
MDV in the literature. An element of the mass distribution vector gives
the fraction of the molecules that belong to the group of isotopomers having the same mass. For n
potentially labeled atoms in a molecule there will be n + 1 elements in the
MDV; the additional element corresponds to the fully unlabeled molecule.
Figure 4.11 illustrates the relationship between the IDV and MDV measures. The key reason for considering these two different descriptions is
that the MDV are measurable while the IDV are on the whole more difficult
to obtain experimentally, although a careful study of the fragmentation patterns from the mass spectrometry can sometimes give information on the
IDV itself. In addition NMR can also be used to gain some information
on the relative distribution of specific isotopomers, but the MDVs are the
primary experimental data.
[Figure 4.11 data. Isotopomer fractions (IDV) for isotopomers 1 to 8: 9%, 5%, 23%, 5%, 36%, 4%, 9%, 9%. Mass distribution vector (MDV): 9%, 33%, 49%, 9%.]
Figure 4.11: This figure illustrates the relationship between the isotopomer fractions (IDV) and the mass distribution vector (MDV). The example uses a three-carbon molecule (carbons C1, C2 and C3), for which there are eight possible isotopomers. Each isotopomer makes up some fraction of the total, for example the unlabeled molecule is 9% of the total. To compute the mass distribution, we collect all isotopomers having the same number of labeled atoms. For example, the 2nd, 3rd and 4th isotopomers have one labeled atom each, therefore this group constitutes one element of the MDV, in this case 5% + 23% + 5% = 33%.
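The grouping described in Figure 4.11 can be sketched in a few lines of Python. The ordering of the isotopomers and the assignment of labels to particular carbon positions below are assumptions made for illustration; only the number of labeled atoms in each isotopomer affects the result.

```python
# Labeling patterns for a three-carbon molecule (1 = labeled, 0 = unlabeled).
# Assumed order: unlabeled, three singly labeled forms, three doubly labeled
# forms, fully labeled. The fractions are the IDV values quoted in Figure 4.11.
PATTERNS = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1),
            (1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 1)]
IDV = [0.09, 0.05, 0.23, 0.05, 0.36, 0.04, 0.09, 0.09]


def idv_to_mdv(patterns, idv):
    """Sum isotopomer fractions that share the same number of labeled atoms."""
    n_atoms = len(patterns[0])
    mdv = [0.0] * (n_atoms + 1)  # n + 1 mass classes: 0, 1, ..., n labels
    for pattern, fraction in zip(patterns, idv):
        mdv[sum(pattern)] += fraction
    return mdv


mdv = idv_to_mdv(PATTERNS, IDV)
print([round(x, 2) for x in mdv])  # [0.09, 0.33, 0.49, 0.09]
```

The mapping only goes one way: many different IDVs give the same MDV, which is why the MDV alone does not determine the individual isotopomer fractions.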
Figure 4.12 shows a simple hypothetical network that illustrates three ways to view such a network: as a stoichiometric network, as an atom transition network and as an isotopomer network. The stoichiometric network, a), is the simplest and most familiar, with six species and five connecting reactions. If we assume that the species A, B, E and F contain two atoms that could potentially be labeled, and species C and D contain one atom each that could potentially be labeled, then b) in Figure 4.12 shows the species with their atomic structure explicitly given, hence the atom transition network.
A number of assumptions are invoked in order for the subsequent analysis
to be valid. The most important is that the system is at steady state, that is
the fluxes and the isotopic distribution are steady. Some of the fluxes in the
system can be measured directly, for example most of the external fluxes
such as substrate uptake and product and biomass formation are known.
What is left are the intracellular fluxes and it is these that will be estimated
from the isotopic data.
The second phase in MFA is the computational effort. This is a fairly sophisticated and computationally demanding procedure. Here we describe the basic approach but many refinements have been introduced in recent years [70, 74, 69].
The essential idea behind the computational phase is the construction of a set of differential equations that describe the time evolution of the isotopomer distribution vector. These equations include two kinds of terms, fluxes and elements from the isotopomer distribution vector. The equations are used to predict the steady state levels of the various isotopomers, or more precisely the fractional distribution of the isotopomers at steady state. The nature of these equations will be described more fully later; for now let us designate the isotopomer distribution vector with the symbol p so that the set of differential equations can be written as:

\[ \frac{dp}{dt} = f(p, J) \]

At steady state the left-hand side is zero and the isotopomer distribution can be written, at least in principle, as a function of the fluxes, J:

\[ p = g(J) \]
Figure 4.12: Label distribution in a simple network: a) Stoichiometry network, b) Atom transition network, c) Isotopomer network. Figure adapted with permission from Weitzel et al. [69], BioMed Central.
We say in principle because the equations will tend to be non-linear, rendering an analytical solution difficult if not impossible to obtain; instead numerical methods are used to find the solution, p. Once a solution has been found, the vector p is compared to the real measurements and a difference computed. If the difference between the computed values and the measured values is small, the flux values are accepted. Otherwise the procedure makes small adjustments to the flux values and the steady state equations are solved again to obtain a new p vector. The actual strategy for adjusting the fluxes will be described later, but what we have is an iterative procedure where the flux values are adjusted until the computed values of the isotopomers match the measured values. The procedure just outlined is of course a classic optimization problem and many strategies exist for adjusting the flux values at each iteration, including gradient search methods such as Levenberg-Marquardt or, better still, evolutionary algorithms [54, 75] that are less likely to fail to converge.
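The iterative loop just described can be illustrated with a deliberately tiny example. Everything in the sketch below is hypothetical: a single pool is fed by a fully labeled flux j1 (unknown) and an unlabeled flux j2 (assumed measured), so that the predicted labeled fraction is j1/(j1 + j2). A plain gradient step stands in for the more robust optimizers mentioned above, and the function names are illustrative only.

```python
def predicted_fraction(j1, j2=1.0):
    """Toy stand-in for p = g(J): steady state labeled fraction of a pool
    fed by a fully labeled flux j1 and an unlabeled flux j2."""
    return j1 / (j1 + j2)


def fit_flux(measured, j1=0.5, rate=2.0, tol=1e-9, max_iter=10000):
    """Adjust j1 until the predicted fraction matches the measured one."""
    h = 1e-6  # step used for the finite-difference slope
    for _ in range(max_iter):
        residual = predicted_fraction(j1) - measured
        if abs(residual) < tol:
            break  # difference is small: accept the flux value
        slope = (predicted_fraction(j1 + h) - predicted_fraction(j1)) / h
        j1 -= rate * residual * slope  # gradient step on 0.5 * residual**2
    return j1


fitted = fit_flux(measured=0.6)
print(round(fitted, 4))  # 1.5
```

In a real study the single fraction would be replaced by the full set of mass distributions and the single flux by the vector of unknown intracellular fluxes.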
In practice the measured values for the isotopomer distribution are not usually available, instead the model values are converted to the mass distributions and it is these that are compared to the measured mass distributions.
One can imagine that in a large network, particularly where the metabolites have many potentially labeled atoms (say six or more carbon atoms)
then the number of isotopomers can become very large with a corresponding increase in the number of model differential equations. Large models
can have thousands of differential equations that need to be solved at each
iteration. The computational cost is therefore relatively high although with
the availability of cheap and powerful personal computers the issue is not
so significant as it used to be.
One question remains which relates to the exact nature of the model equations that are used to predict the isotopomers. Of all the steps required
during the computational phase, generating the model equations is probably the most tedious and error prone, especially given the large number
of equations that need to be deployed. With this in mind a number of authors have devised specialized software that can automate this phase and
much else. Here a brief description of the equations themselves will be
given. What may not be obvious is that the model equations do not assume any kinetics from the reaction steps themselves, that is there are no
rates that depend on Michaelis-Menten rate laws or other more complicated functions. Instead linear equations are devised that assume that the
rate of reaction between two particular labeled molecules is a linear function
of the isotopomer concentrations. This is possible because the underlying
metabolic state is assumed to be at steady state. In addition, the individual
rates are simply scaled terms containing the fluxes.
Consider the system depicted in Figure 4.13. The overall reaction is given as $A \rightarrow B \rightarrow 2C$ in the upper panel. In the lower panel we see the individual species represented by their groups of isotopomers. For simplicity the species are assumed to only contain two potentially labeled carbon atoms. The first reaction, $v_1$, swaps the carbon atoms and the second reaction, $v_2$, dissociates the species into two one-carbon units, C and D. The fractional distribution of isotopomers in the A species is given by $A_1$ and $A_2$, and in the B species by $B_1$ and $B_2$. Note that in each case the following is also true, $A_1 + A_2 = 1$ and $B_1 + B_2 = 1$.

Figure 4.13: a) Overall reaction. b) Reaction in terms of isotopomers.

At steady state the flux from species A to B and from species B to C plus D is $v_1$ and $v_2$ respectively; these are the fluxes we would like to know. However, the isotopomer computational model considers each isotopomer transition as a separate reaction such that the rate from one isotopomer to another is proportional to the fraction of that isotopomer.
For example, the rate of reaction from isotopomer $A_1$ to $B_1$ is a fraction of the overall rate, $v_1 A_1$. Likewise for the other isotopomers. For this system, the rate of change of the fractions $B_1$ and $B_2$ is then given by:

\[ \frac{dB_1}{dt} = v_1 A_1 - v_2 B_1 \]

\[ \frac{dB_2}{dt} = v_1 A_2 - v_2 B_2 \]

Note that these equations compute the rate of change of the fraction of isotopomers, not the absolute amount of isotopomers. This approach eliminates the need for a complex kinetic model, which would be extremely difficult to construct and suspect at best.
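A minimal numerical sketch of these two equations is given below, using simple Euler integration. The flux values, the labeling of A and the initial labeling of B are made-up numbers, and the equations are used exactly as written above, that is, in terms of fractions.

```python
def simulate(v1, v2, a1, b1, dt=0.01, t_end=50.0):
    """Integrate dB1/dt = v1*A1 - v2*B1 and dB2/dt = v1*A2 - v2*B2."""
    a2 = 1.0 - a1  # fractions of A sum to one
    b2 = 1.0 - b1  # fractions of B sum to one initially
    for _ in range(int(t_end / dt)):
        db1 = v1 * a1 - v2 * b1
        db2 = v1 * a2 - v2 * b2
        b1 += dt * db1
        b2 += dt * db2
    return b1, b2


# Metabolic steady state requires v1 = v2; B starts fully in the B1 form.
b1, b2 = simulate(v1=1.0, v2=1.0, a1=0.3, b1=1.0)
print(round(b1, 4), round(b2, 4))
```

With v1 equal to v2 the fractions of B relax to those of A, and their sum stays at one throughout, as expected for a linear pathway with no other source of label.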
The computational effort required to estimate the fluxes is as formidable as the experimental effort and for this reason a number of authors have devised software for the automatic construction and solution of the equations. One of the earliest and most comprehensive is the software tool 13C-FLUX (see http://www.uni-siegen.de/fb11/simtec/software/13cflux/) by Wiechert [73], who was one of the pioneers in developing the current state of MFA [72, 71, 70]. Other tools of note include FluxSimulator from Binsl [6] and FiatFlux [76].
There are many other details of MFA that have not been mentioned and the area is still under rapid development, with an ever increasing number of researchers turning to the approach to estimate fluxes [47, 50, 35, 58].
Flux ratios, negligible isotopic mass effects
No need to fit external fluxes. Cumomers allow model equations to be solved analytically. If carbon atoms are not mixed up then those fluxes cannot be easily determined. Statistics, sensitivity tests.
Exercises
5 Steady State Flux Patterns
One of the interesting aspects of the stoichiometry matrix is how the columns of the matrix constrain flux patterns, particularly at steady state. In this chapter we will be looking at two approaches that help us understand these constraints. These related approaches involve examining the null space and the elementary modes of the stoichiometry matrix.
5.1 The Null Space
The null space of the stoichiometry matrix and its transpose provides important information on the structural constraints in a network. The null
space was introduced briefly in the last chapter in the form of equation 4.9.
In this chapter we will consider more closely its physical interpretation.
Box 5.1 The Null Space
Given a matrix equation of the form $Ax = 0$, where A is an $m \times n$ matrix and x is a column vector of n elements, the solution, that is all the vectors x that satisfy this equation, is called the null space of A. The number of vectors required to fully describe the null space is called the dimension of the null space and is equal to the number of columns, n, minus the rank of the matrix, rank(A). These vectors form what is called a basis for the space and linear combinations of these vectors can generate any other vector in the null space. In order to form a basis, the vectors must also be linearly independent.
Many tools can compute the basis for the null space, for example null(A, 'r') will compute the basis in Matlab, while NullSpace[A] can be used to compute the basis in Mathematica.

Equations 5.1 and 5.2 are the null space equations that were introduced in the last chapter. The fact that the right-hand side is zero means that the null space vectors must indicate some particular aspect of the steady state.

\[ \begin{bmatrix} N_{DC} & N_{IC} \end{bmatrix} \begin{bmatrix} I \\ K_0 \end{bmatrix} = 0 \tag{5.1} \]

or more simply:

\[ N_R K = 0 \tag{5.2} \]

Equation 5.2 is a homogeneous linear equation whose solutions are given by the columns of the K matrix. The equation below illustrates the null space vectors for the complex branched pathway in Figure 4.3. The partitioning of K is shown by a horizontal line.

\[
\left[\begin{array}{rrrrrr}
0 & 0 & 0 & 1 & -1 & 1 \\
-1 & 1 & 0 & 0 & 1 & 0 \\
0 & -1 & 1 & 0 & 0 & -1
\end{array}\right]
\left[\begin{array}{rrr}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\ \hline
1 & 0 & -1 \\
1 & -1 & 0 \\
0 & -1 & 1
\end{array}\right]
= 0 \tag{5.3}
\]
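As an illustration of Box 5.1, a null space basis can also be computed by exact row reduction using only the Python standard library. The stoichiometry matrix below is an assumed reconstruction of the branched network of Figure 4.3, with columns ordered v3, v5, v6, v1, v2, v4 to match the rows of K; the function name null_space is illustrative and is not part of any library.

```python
from fractions import Fraction


def null_space(A):
    """Return a basis of the null space of A using exact Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in row] for row in A]
    m, n = len(rows), len(rows[0])
    pivots, r = [], 0
    for c in range(n):
        # Find a row at or below r with a non-zero entry in column c.
        p = next((i for i in range(r, m) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        piv = rows[r][c]
        rows[r] = [x / piv for x in rows[r]]
        for i in range(m):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == m:
            break
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        vec = [Fraction(0)] * n
        vec[free] = Fraction(1)  # one free variable set to 1, the rest to 0
        for i, pc in enumerate(pivots):
            vec[pc] = -rows[i][free]
        basis.append(vec)
    return basis


# Assumed stoichiometry (rows S1, S2, S3; columns v3, v5, v6, v1, v2, v4).
N = [[0, 0, 0, 1, -1, 1],
     [-1, 1, 0, 0, 1, 0],
     [0, -1, 1, 0, 0, -1]]
# K as given in the text (rows v3, v5, v6, v1, v2, v4).
K = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, -1], [1, -1, 0], [0, -1, 1]]

basis = null_space(N)
print(len(basis))  # dimension of the null space = columns - rank
print([[int(x) for x in vec] for vec in basis])
```

The basis returned by row reduction need not coincide with the columns of K. Any basis spans the same space, which is one way of seeing that the null space basis is not unique.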
The simplest interpretation of the K matrix is that the vectors that make up K represent possible steady state flow patterns in the network. In addition, any linear combination of the vectors is also a valid steady state flow pattern. Thus for the network shown in Figure 4.3, the null space can be shown to be:

\[
\left[\begin{array}{rrr}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
1 & 0 & -1 \\
1 & -1 & 0 \\
0 & -1 & 1
\end{array}\right]
\]
The null space contains three vectors which can be interpreted as flow patterns which satisfy the steady state condition. These flow patterns are shown in Figure 5.1 below.

[Figure 5.1: panels a), b) and c) each show one flow pattern on the network of S1, S2 and S3, alongside the K matrix with its rows labeled by reaction:]

\[
K =
\left[\begin{array}{rrr}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
1 & 0 & -1 \\
1 & -1 & 0 \\
0 & -1 & 1
\end{array}\right]
\begin{array}{l} v_3 \\ v_5 \\ v_6 \\ v_1 \\ v_2 \\ v_4 \end{array}
\]

Figure 5.1: Flow Patterns Based on the Null Space of the Stoichiometry Matrix.

Any flow pattern in vivo is some linear combination of the three basic patterns indicated in the null space. For example, the following combination is a potential flow pattern:
\[
J =
\begin{bmatrix} 1.5 \\ 0.6 \\ 0.8 \\ 0.7 \\ 0.9 \\ 0.2 \end{bmatrix}
= 1.5 \left[\begin{array}{r} 1 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \end{array}\right]
+ 0.6 \left[\begin{array}{r} 0 \\ 1 \\ 0 \\ 0 \\ -1 \\ -1 \end{array}\right]
+ 0.8 \left[\begin{array}{r} 0 \\ 0 \\ 1 \\ -1 \\ 0 \\ 1 \end{array}\right]
\]
Figure 5.2: A Possible Flow Pattern Based on the Null Space.

The one problem with this interpretation is that negative terms in the K matrix indicate that the flow is in the opposite direction to that indicated in the network diagram. Such flows might be thermodynamically unlikely. For example, pattern (c) in Figure 5.1 shows the reaction $v_1$ operating in the opposite direction to that indicated in the original Figure 4.3. It may be the case that $v_1$ is reversible, in which case pattern (c) in Figure 5.1 is a legitimate flow pattern. If, however, $v_1$ is irreversible, the flow pattern is not likely to occur in vivo. The use of elementary modes (see next section) eliminates this problem by forbidding patterns in which irreversible reactions run in reverse.
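The direction check described above is simple to automate. The sketch below flags any irreversible reaction that carries a negative flux; the reaction ordering follows the rows of K (v3, v5, v6, v1, v2, v4), and treating every reaction as irreversible is an assumption made purely for illustration.

```python
REACTIONS = ["v3", "v5", "v6", "v1", "v2", "v4"]  # row order of K


def reversed_steps(flux, irreversible):
    """List irreversible reactions that the flow pattern runs backwards."""
    return [name for name, value in zip(REACTIONS, flux)
            if name in irreversible and value < 0]


pattern_c = [0, 0, 1, -1, 0, 1]            # third column of K
combined = [1.5, 0.6, 0.8, 0.7, 0.9, 0.2]  # the combination J given above
all_irreversible = set(REACTIONS)          # assumption for this example

print(reversed_steps(pattern_c, all_irreversible))  # ['v1']
print(reversed_steps(combined, all_irreversible))   # []
```

Note that a combination of basis vectors can be feasible even when some of the individual basis vectors are not, as the second call shows.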
In summary, the null space vectors, and combinations thereof, can be interpreted as possible steady state flows through a given network. There is
however another interpretation which is also very useful with respect to
metabolic engineering. Let us consider the system equation again:
\[ N v = 0 \]

Let us assume that it is possible, by some means, to change the rates through the reactions such that the species levels remain unchanged but the flux changes. This will be justified in a later volume when we consider dynamics and control coefficients. In particular it can be shown that the unscaled concentration control coefficients and the null space are related by the expression:

\[ C^s K = 0 \tag{5.4} \]
where elements of the $C^s$ matrix equal $dS_i/dE_j$, that is, how a given enzyme, $E_j$, affects the steady state concentration of a given molecular species, $S_i$. The equation tells us that perturbations in reaction rates that match the entries in a column of K result in no changes in concentrations. The same applies to linear combinations of vectors in the K matrix.
With this in mind we can state that there is a set of perturbations, $\delta v$, that satisfies the following:

\[ N (v + \delta v) = 0 \]

which, since $N v = 0$, can be simplified to:

\[ N \delta v = 0 \]

This equation tells us that the vectors in the null space of N can be interpreted as the vector $\delta v$. That is, $\delta v$ can be interpreted as a set of disturbances to the reaction rates that leaves the steady state species levels unchanged but changes the fluxes. Such perturbations could be achieved by changing the level of gene expression at each reaction which has a non-zero entry in the K matrix. For example the first column of the K matrix in Figure 5.1 is $[1\ 0\ 0\ 1\ 1\ 0]^T$. This means that changing rates $v_3$, $v_1$ and $v_2$ by a $\delta v$ amount will leave the steady state concentrations of $S_1$, $S_2$ and $S_3$ unchanged but will increase the net flow from $v_1$ to $v_3$ by a $\delta v$ amount. In practice such changes might not be realizable but in principle one could imagine changing the enzyme activities at $v_1$, $v_2$ and $v_3$ through changes in gene expression. Since enzyme activity is proportional to the concentration of enzyme it must be true that proportional changes in an enzyme concentration, $E_i$, will lead to proportional changes in the reaction rate $v_i$, assuming other factors such as substrate and product concentrations remain unchanged. Since the latter condition can be guaranteed, to make a given relative change in $v_i$, $\delta v_i / v_i$, we need only make the same proportional change in $E_i$, that is:
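\[ \frac{\delta E_i}{E_i} = \frac{\delta v_i}{v_i} \tag{5.5} \]

From the first column of K, we can state that $\delta v_1 = \delta v_2 = \delta v_3$ or equivalently:

\[ \frac{\delta v_1}{v_1} = \frac{\delta v_2}{v_2} \frac{v_2}{v_1} = \frac{\delta v_3}{v_3} \frac{v_3}{v_1} \]

or

\[ \frac{\delta E_1}{E_1} = \frac{\delta E_2}{E_2} \frac{v_2}{v_1} = \frac{\delta E_3}{E_3} \frac{v_3}{v_1} \]

In practice, if we change the activity of enzyme $E_1$ by a percentage, $\alpha$, then the percentage changes we must make in $E_2$ and $E_3$ will equal:

\[
\frac{\delta E_1}{E_1} = \alpha, \qquad
\frac{\delta E_2}{E_2} = \alpha \frac{v_1}{v_2}, \qquad
\frac{\delta E_3}{E_3} = \alpha \frac{v_1}{v_3}
\tag{5.6}
\]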
This result indicates the changes in enzyme activity that are necessary in order to increase the flux through $v_1$, $v_2$ and $v_3$ while keeping all other fluxes and metabolite levels the same. It shows that the relative change required in each enzyme concentration depends on the flux that the particular step carries relative to $v_1$: the larger the flux through a step, the smaller the relative change required. Note that this result applies to large as well as small changes in enzyme concentrations. The ability to alter fluxes independently of metabolite concentrations is a desirable goal in metabolic engineering because when metabolites change, regulation is invoked with resulting unpredictable effects.
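Equation 5.6 can be checked numerically. The sketch below uses the flux values from the flow pattern of Figure 5.2 (v1 = 0.7, v2 = 0.9, v3 = 1.5) and an arbitrary 10% change in E1; the function name enzyme_changes is illustrative only.

```python
def enzyme_changes(alpha, fluxes, reference="v1"):
    """Relative enzyme changes from equation 5.6: alpha * v_ref / v_i."""
    v_ref = fluxes[reference]
    return {name: alpha * v_ref / v for name, v in fluxes.items()}


fluxes = {"v1": 0.7, "v2": 0.9, "v3": 1.5}
changes = enzyme_changes(0.10, fluxes)

# With rates proportional to enzyme level (equation 5.5), the new fluxes are:
new_fluxes = {name: v * (1.0 + changes[name]) for name, v in fluxes.items()}
increments = {name: new_fluxes[name] - fluxes[name] for name in fluxes}

for name in fluxes:
    print(name, round(changes[name], 4), round(increments[name], 4))
```

Each step receives the same absolute increment in rate (0.07 here), which is exactly the condition $\delta v_1 = \delta v_2 = \delta v_3$ required by the first column of K, while the relative enzyme changes differ from step to step.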
The null space basis has one disadvantage: the flow patterns that the null space basis admits may not necessarily be thermodynamically viable. In order to circumvent this problem a different approach was devised, called elementary flux modes.
5.2 Elementary Flux Modes
A closely related concept to the null space of the stoichiometry matrix is
the set of elementary flux modes. As previously discussed, the vectors
in the null space of the stoichiometry matrix can be interpreted as steady
state flow patterns in a network. However, one criticism is that the vectors
in the null space can admit patterns that are thermodynamically unlikely
(See Figure 5.1). In addition, the set of null space vectors is not unique.
Elementary flux modes avoid these issues.
Elementary flux modes are minimal realizable flow patterns through a network that can sustain a steady state. This means that elementary modes
cannot be decomposed further into simpler pathways.
Elementary flux modes provide a comprehensive description of all
metabolic routes for a group of enzymes that are stoichiometrically
and thermodynamically feasible [56]. As a result, metabolic pathways can be defined in terms of their elementary modes.
Mathematically, elementary modes are defined as follows. An elementary mode, $e_i$, is defined as a vector of fluxes, $v_1, v_2, \ldots$, such that the following three conditions are met (Table 5.1).

1. The vector must satisfy $N e_i = 0$, that is, the steady state condition.
2. For all irreversible reactions, $v_i \geq 0$. This means that all flow patterns must use reactions that proceed in their most natural direction. This makes the pathway described by the elementary mode a thermodynamically feasible pathway.
3. The vector $e_i$ must be elementary, that is, it should not be possible to generate $e_i$ by combining two other vectors that satisfy the first and second requirements using the same set of enzymes that appear as non-zero entries in $e_i$. In other words it should not be possible to decompose $e_i$ into two other pathways that can themselves sustain a steady state.
Table 5.1: Conditions necessary to define an Elementary Mode.

In the following examples, all elementary modes were computed using Metatool via JDesigner. Figure 5.3 shows the two elementary modes that exist for a simple branched pathway.

Figure 5.3: Elementary mode patterns in a simple branched pathway assuming irreversibility at all reaction steps. Highlighted reactions in bold signify steps that belong to the elementary mode.

All steps in Figure 5.3 are assumed to be irreversible. Let us show that each mode in this system satisfies the three conditions (Table 5.1). The first condition is steady state, that is for each mode $e_i$, $N e_i = 0$. The two modes are given by:

\[
\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}
\quad \text{and} \quad
\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}
\tag{5.7}
\]
By substituting each of these vectors into $N e_i = 0$, it is easy to show that condition one is satisfied. For condition two we must ensure that all reactions that are irreversible have non-negative entries in the corresponding elements of the elementary modes. Since all three reactions in the branch are irreversible and all entries in the elementary modes are non-negative, condition two is satisfied.
Finally, to satisfy condition three we must ask whether we can decompose the two elementary modes into other paths that can sustain a steady state while using the same non-zero entries in the elementary mode. In this example it is impossible to decompose the elementary modes any further without disrupting the ability to sustain a steady state. Therefore with all three conditions satisfied we can conclude that the two vectors given previously are elementary modes.
Like the basis for the null space, all possible flows through a network can be constructed from linear combinations of the elementary modes, that is:

\[ v = \sum_i \lambda_i e_i \tag{5.8} \]

where $\lambda_i \geq 0$, such that the entire space of flows through a network can be described. $\lambda_i$ must be greater than or equal to zero to ensure that irreversible steps aren't inadvertently made to go in the reverse direction. For example, the following is a possible flow in the branched pathway:

\[
v = 2.5 \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}
+ 0.5 \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}
= \begin{bmatrix} 3.0 \\ 2.5 \\ 0.5 \end{bmatrix}
\]
If one of the outflow steps in the simple branched pathway is made reversible, an additional elementary mode becomes available that represents the flow between the two outflow branches (Figure 5.4). An additional mode emerges because with only the first two modes it is impossible to represent a flow between the two branches: the scaling factor, $\lambda_i$, cannot be negative, which would be required to reverse the flow.
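For a network this small the elementary modes can be found by brute force, which gives a way of checking the counts quoted above. The sketch below is not the algorithm used by Metatool; it simply tries every subset of reactions, keeps those whose restricted null space is one-dimensional with no zero entries, and then orients the vector so that irreversible steps are non-negative. The single-species stoichiometry N = [1, -1, -1] for the branched pathway is an assumption, and all function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations


def null_space(A):
    """Null space basis of A by exact Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in row] for row in A]
    m, n = len(rows), len(rows[0])
    pivots, r = [], 0
    for c in range(n):
        p = next((i for i in range(r, m) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        piv = rows[r][c]
        rows[r] = [x / piv for x in rows[r]]
        for i in range(m):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == m:
            break
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        vec = [Fraction(0)] * n
        vec[free] = Fraction(1)
        for i, pc in enumerate(pivots):
            vec[pc] = -rows[i][free]
        basis.append(vec)
    return basis


def elementary_modes(N, reversible):
    """Enumerate elementary modes by testing every candidate support."""
    n = len(N[0])
    modes = []
    for size in range(1, n + 1):
        for support in combinations(range(n), size):
            sub = [[row[j] for j in support] for row in N]
            basis = null_space(sub)
            # Conditions 1 and 3: a unique steady state flow using all steps.
            if len(basis) != 1 or any(x == 0 for x in basis[0]):
                continue
            mode = [Fraction(0)] * n
            for j, x in zip(support, basis[0]):
                mode[j] = x
            # Condition 2: pick the orientation that respects irreversibility.
            for cand in (mode, [-x for x in mode]):
                if all(x >= 0 or j in reversible for j, x in enumerate(cand)):
                    modes.append([int(x) for x in cand])
                    break
    return modes


N = [[1, -1, -1]]  # assumed: v1 produces S1, v2 and v3 consume it
irreversible_case = elementary_modes(N, reversible=set())
reversible_case = elementary_modes(N, reversible={2})  # v3 made reversible
print(irreversible_case)
print(reversible_case)
```

With all steps irreversible the search returns the two modes of equation 5.7; making the third step reversible adds one further mode that carries flow from one outflow branch to the other. The number of subsets grows exponentially with the number of reactions, so this approach is only practical for very small examples.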
Equation 5.4 indicated that if specific perturbations are made along the route indicated by a vector in the null space, then all species remain unchanged while the net flux increases. This equation can be extended to also include elementary modes, so that if E is the matrix of elementary modes, then since an elementary mode can be generated from a suitable combination of null space vectors (personal communication: Stefan Schuster), it must be true that:

\[ C^s E = 0 \tag{5.9} \]

This is an important result because it indicates that pathways represented by individual elementary modes can also be perturbed such that species levels remain unchanged, which has a significant bearing on metabolic engineering strategies.
Figure 5.4: Elementary mode patterns in a simple branched pathway assuming reversibility at one of the outflow branches.
Cyclic Branched Model
Figure 5.6 lists the elementary modes for a cyclic branched model. Whereas the null space vectors admit flow patterns which violate thermodynamic considerations, elementary modes do not. For example, in patterns (b) and (c) in Figure 5.1, the reactions v1 and v4 are going in the reverse direction. This also means that there are likely to be more elementary mode vectors than the dimension of the null space. Figure 5.6 illustrates four elementary modes when the first reaction is considered reversible. Figure 5.5, on the other hand, shows only three elementary modes when the first reaction is assumed to be irreversible.

Figure 5.5: Elementary mode patterns in a multi-branched pathway assuming irreversibility at each reaction step.
Comment on Condition Three
Condition three in Table 5.1 requires further explanation. Condition three relates to the non-decomposability of an elementary mode and is partly what makes elementary modes interesting; the two other important features are uniqueness and thermodynamic plausibility. Decomposition implies that it is possible to represent a mode as a combination of two or more other modes. For example, a mode $e_1$ might be composed from two other modes, $e_2$ and $e_3$:

\[ e_1 = \lambda_1 e_2 + \lambda_2 e_3 \]

If a mode can be decomposed does it mean that the mode is not an elementary mode? Condition three provides a rule to determine whether a decomposition means that a given mode is an elementary mode or not. If it is only possible to decompose a given mode by introducing enzymes that are not used in the mode, then the mode is elementary. That is, is there more than one way to generate a pathway (i.e., something that can sustain a steady state) with the enzymes currently used in the mode? If so, then the mode is not elementary. To illustrate this subtle condition consider the pathway shown in Figure 5.7.

Figure 5.6: Elementary mode patterns in a multi-branched pathway assuming reversibility at the first reaction step.
Figure 5.7: Stylized Glycolytic Pathway
This pathway represents a stylized rendition of glycolysis. Two of the
steps in the network are reversible, that is step three and six are reversible
and correspond to triose phosphate isomerase and glycerol 3-phosphate
dehydrogenase respectively.
The network has four elementary flux modes, which are shown in Figure 5.8.

Figure 5.8: Stylized Glycolytic Pathway illustrating the four elementary flux modes. Elementary modes are shown as bold arrows.

The elementary flux mode vectors are shown below:

\[
\begin{bmatrix} e_1 & e_2 & e_3 & e_4 \end{bmatrix} =
\left[\begin{array}{rrrr}
1 & 1 & 1 & 0 \\
1 & 1 & 1 & 0 \\
-1 & 0 & 1 & 1 \\
0 & 1 & 2 & 1 \\
0 & 1 & 2 & 1 \\
2 & 1 & 0 & -1
\end{array}\right]
\tag{5.10}
\]
Note that it is possible to have negative entries in the set of elementary modes because they will correspond to the reversible steps. Of interest is the observation that the second vector, $e_2 = [1\ 1\ 0\ 1\ 1\ 1]^T$ (where T represents the transpose), can be formed from the sum of the first and fourth vectors (5.11). This suggests that the second vector is not an elementary mode.

\[
\underbrace{\left[\begin{array}{r} 1 \\ 1 \\ 0 \\ 1 \\ 1 \\ 1 \end{array}\right]}_{e_2}
=
\underbrace{\left[\begin{array}{r} 0 \\ 0 \\ 1 \\ 1 \\ 1 \\ -1 \end{array}\right]}_{e_4}
+
\underbrace{\left[\begin{array}{r} 1 \\ 1 \\ -1 \\ 0 \\ 0 \\ 2 \end{array}\right]}_{e_1}
\tag{5.11}
\]

However, this decomposition only works because we have introduced a new enzyme, $E_3$ (triose phosphate isomerase), which is not used in $e_2$. It is in fact not possible to decompose $e_2$ into pathways that can sustain the steady state with only the five steps, $E_1$, $E_2$, $E_4$, $E_5$ and $E_6$, used in the elementary mode. We conclude therefore that $e_2$ is an elementary mode.
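Condition three can also be tested mechanically. For a flux vector that already satisfies conditions one and two, a commonly used equivalent check is that the stoichiometry matrix restricted to the reactions used by the vector has a one-dimensional null space. The sketch below applies this check to the stylized glycolytic pathway. The stoichiometry matrix is an assumption, chosen so that the four signed vectors (1, 1, -1, 0, 0, 2), (1, 1, 0, 1, 1, 1), (1, 1, 1, 2, 2, 0) and (0, 0, 1, 1, 1, -1) are all steady state solutions; it is not taken from Figure 5.7.

```python
from fractions import Fraction


def rank(A):
    """Rank of A by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in A]
    r = 0
    for c in range(len(rows[0])):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def is_elementary(N, flux):
    """True if the steps used by flux admit exactly one steady state pattern."""
    support = [j for j, x in enumerate(flux) if x != 0]
    sub = [[row[j] for j in support] for row in N]
    return len(support) - rank(sub) == 1


# Assumed stoichiometry: four internal species, reactions 1 to 6.
N = [[1, -1, 0, 0, 0, 0],
     [0, 1, -1, 0, 0, -1],
     [0, 1, 1, -1, 0, 0],
     [0, 0, 0, 1, -1, 0]]

modes = {
    "e1": [1, 1, -1, 0, 0, 2],
    "e2": [1, 1, 0, 1, 1, 1],
    "e3": [1, 1, 1, 2, 2, 0],
    "e4": [0, 0, 1, 1, 1, -1],
}
results = {name: is_elementary(N, vec) for name, vec in modes.items()}
mixture = [a + b for a, b in zip(modes["e2"], modes["e3"])]
print(results)
print(is_elementary(N, mixture))
```

The vector that looked decomposable passes the test because the five steps it uses support only one steady state pattern, whereas a genuine mixture such as the sum of the second and third vectors uses all six steps and fails.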
5.3 Definition of a Pathway
Unlike the basis of the null space, the set of elementary modes for a given
network is unique (up to an arbitrary positive scaling factor). Given the
fundamental nature of elementary modes, particularly their uniqueness and
non-decomposability, they are a vehicle with which to define the notion
of a pathway. That is every elementary mode and every positive linear
combination of elementary modes is by definition, a pathway. A single
elementary mode can therefore be thought of as an elementary pathway.
Note that the set of elementary modes will change as the set of expressed enzymes changes during transitions from one cell state to another.
5.4 Maximum Yield Predictions
An important application of elementary modes is finding pathways that
give the maximum molar yield, that is the largest product/substrate rate
ratio:
\[ \text{Yield} = \frac{\text{Synthesis Rate of Product}}{\text{Consumption Rate of Substrate}} \tag{5.12} \]
In many situations the biosynthesis of a product can be achieved by a number of different pathway routes and the question then arises what are the
routes that achieve the maximum yield of product relative to a given starting material.
A very interesting property of elementary modes is that the highest yielding pathways in a network are always found among its elementary modes. The argument for this is as follows. Any flux distribution can be described as a non-negative linear combination of elementary modes (5.8), for example, $\lambda_1 e_1 + \lambda_2 e_2$. The yield of a given product from a given substrate is then a weighted average of the yields of the elementary modes that make up the pathway. However, a weighted average can never exceed the largest of the values being averaged, so the yield of the combination cannot be higher than that of the elementary mode in the set that has the highest yield. Hence, given that elementary modes cannot be decomposed, the elementary modes must represent the highest yielding pathways.
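Consider the network (from [61]) shown in Figure 5.9. The stoichiometry matrix for the network is given by:

\[
\begin{array}{c|rrrrrrrrr}
N & v_1 & v_2 & v_3 & v_4 & v_5 & v_6 & v_7 & v_8 & v_9 \\ \hline
S_1 & 1 & -1 & 0 & 0 & -1 & 0 & 0 & 0 & 0 \\
S_2 & 0 & 0 & 0 & 0 & 1 & -1 & -1 & 1 & 0 \\
S_3 & 0 & 1 & -1 & 0 & 0 & 1 & 0 & 0 & 0 \\
S_4 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & -1 \\
S_5 & 0 & 0 & 1 & -1 & 0 & 0 & 2 & 0 & 0
\end{array}
\tag{5.13}
\]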
Figure 5.9: Example Network to Illustrate Computation of Maximum Yields. Xo, X1, P and Q are boundary species. Reactions v6 and v8 are reversible.

It is straightforward to show using software such as JDesigner/Metatool that the network in Figure 5.9 has eight elementary modes, labeled EM1, EM2, EM3, EM4, EM5, EM6, EM7 and EM8 (see Figure 5.10).
Let us suppose that we are interested in maximizing the production of
product P from the feed substrate, Xo . Of the eight elementary modes,
only six start with Xo . However, only four of these six result in the production of product P , that is EM4 , EM6 , EM7 and EM8 .
The four elementary modes that connect Xo to P are given below:

\[
\begin{array}{c|rrrr}
 & EM_4 & EM_6 & EM_7 & EM_8 \\ \hline
v_1 & 1 & 1 & 1 & 1 \\
v_2 & 1 & 0 & 1 & 0 \\
v_3 & 1 & 0 & 0 & 1 \\
v_4 & 1 & 2 & 2 & 1 \\
v_5 & 0 & 1 & 0 & 1 \\
v_6 & 0 & 0 & -1 & 1 \\
v_7 & 0 & 1 & 1 & 0 \\
v_8 & 0 & 0 & 0 & 0 \\
v_9 & 1 & 0 & 0 & 1
\end{array}
\tag{5.14}
\]

EM    Yield
4     1
6     2
7     2
8     1
Table 5.2: Yield for each elementary mode that consumes input Xo and produces product P.
The question now is which of these elementary modes achieves the highest yield? Equation 5.12 will allow us to compute the yields for the elementary modes. Recall that the entries in the elementary mode vectors represent relative flux values and since the yield equation is a ratio of fluxes we can use the entries in the elementary modes to compute the yields for each elementary mode. For example, consider EM4. The yield for this mode is given by $v_4 / v_1 = 1/1 = 1$. Table 5.2 summarizes the yields for each of the four elementary modes.
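The yields in Table 5.2, and the weighted-average argument given earlier, can be reproduced with a short sketch. Only the v1 (substrate uptake) and v4 (product formation) entries of each mode are needed; the weights used for the mixture are arbitrary illustrative values.

```python
# (v1, v4) entries of the four modes that connect Xo to P (equation 5.14).
MODES = {"EM4": (1, 1), "EM6": (1, 2), "EM7": (1, 2), "EM8": (1, 1)}


def mode_yield(v1, v4):
    """Yield = product synthesis rate / substrate consumption rate."""
    return v4 / v1


def combined_yield(weights):
    """Yield of a non-negative combination of modes (equation 5.8)."""
    v1 = sum(w * MODES[name][0] for name, w in weights.items())
    v4 = sum(w * MODES[name][1] for name, w in weights.items())
    return v4 / v1


yields = {name: mode_yield(*entries) for name, entries in MODES.items()}
mixture = combined_yield({"EM4": 0.5, "EM6": 1.5})

print(yields)   # {'EM4': 1.0, 'EM6': 2.0, 'EM7': 2.0, 'EM8': 1.0}
print(mixture)  # 1.75
```

The mixture yield lies between the yields of its two constituent modes and cannot exceed the best of them, which is why the search for the maximum yield can be restricted to the elementary modes themselves.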
From the table (Table 5.2) it should be clear that two of the modes, EM6 and EM7, produce twice the yield of EM4 and EM8. From this information it would be logical therefore to over express the enzymes along the EM6 and EM7 pathways. However, examination of EM6 and EM7 shows that EM6 includes four enzymatic steps whereas EM7 includes five enzymatic steps. We can therefore narrow down the choice further and suggest that EM6, which has fewer steps, would be the initial target for engineering.
Having chosen the pathway to engineer we now need to determine by how
much each enzyme should be over expressed. From equation 5.6 we know
that in a branched system not every enzyme must be over expressed by
the same amount. Instead we must compute the relative over expression in
each enzyme from the known fluxes through the pathway (possibly computed using Flux Balance Analysis). We are also assured from equation
that during this engineering, none of the metabolites will change.
Flux balance analysis using linear programming can also be used to compute pathways with the highest yields by suitable adjustment of the objective function. However, elementary modes provide a systematic approach
to uncovering all high-yielding pathways [55, 56]. Linear programming
will sometimes inadvertently uncover pathways that represent elementary
modes, and the work by Varma and Palsson [64, 65] on biomass yields in
E. coli did just that.
Computing elementary modes efficiently is a non-trivial calculation but
a small number of tools are available. In particular METATOOL (4.3 series), developed by a number of authors including Thomas Pfeiffer, Stefan
Schuster, Juan Carlos Nuno and Ferdinand Moldenhauer, is highly recommended. This tool has been incorporated into the Systems Biology Workbench and can be accessed via the JDesigner application.
5.5 Engineering a Pathway
Most approaches used to engineer metabolic pathways stem from an intuitive understanding of how metabolism operates. For example, to increase
the output of some product it seems logical to first increase the level of enzymes that are involved directly in the production of the product and secondly to reduce enzyme activities of those pathways that may divert flux
away from the product pathway. This approach has been shown to work in
certain cases [51]. However the method can fail due to inadvertent changes
in metabolite levels that cause metabolites to increase to toxic levels. In
addition changes to enzyme levels can also disrupt cofactor levels such
as NAD that have a global and disruptive impact on cellular metabolism.
What is needed is a more systematic approach to engineering pathways.
Two such approaches will be described here.
The simplest approach is to use flux balance analysis. In a flux balance
model, one or more enzymatic steps can be eliminated to investigate the
effect this has on the pathway of interest. However, this approach does not take into account the effect of regulation, which may arise from changes in metabolite
levels.
Here we describe the approach developed in this chapter that uses elementary modes as its basis. The strategy is as follows:
1. Enumerate all elementary modes in the metabolic network.
2. Find all modes which end at the desired product.
3. Select one of the modes in step two for engineering. The choice of
mode will depend on the number of steps (which should be minimized), the costs involved in genetically engineering each step and
the yield that the mode can deliver.
4. Use equation 5.5 and ?? to compute the degree of over-expression
of each enzyme along the elementary mode.
In theory this strategy should work; however, there are a number of pitfalls.
These include the inability to up-regulate all the necessary enzymes and
the possibility of not being able to be precise enough when a
particular enzyme needs to be up-regulated by a specific amount.
Exercises
[Figure: eight panels labeled EM 1 to EM 8, each showing the same network of species Xo, X1, S1 to S5, P and Q connected by reactions v1 to v9.]
Figure 5.10: Example Network to Illustrate Computation of Maximum
Yields. Xo, X1, P and Q are boundary species. Reactions v6 and v8
are reversible. The network admits eight elementary modes. Each mode
is indicated in red (thickened reactions).
6 Species Conservation Laws
Many cell processes operate on different time scales. For example, metabolic
processes tend to operate on a faster scale than protein synthesis and degradation. Such time scale differences have a number of implications to model
builders, software designers and model behavior. In this chapter we will
examine these aspects in relation to species conservation laws. To introduce this topic consider a simple protein phosphorylation cycle such as
the one shown in Figure 6.1. This shows a protein undergoing phosphorylation (upper limb) and dephosphorylation (lower limb) via a kinase and
phosphatase respectively.
The depiction in Figure 6.1 is, however, a simplification. The ATP used
during phosphorylation is not shown, nor is the release of free phosphate during dephosphorylation. In addition, synthesis and degradation
of the protein are also absent. In many cases we can leave these aspects out of
the picture. ATP, for instance, is held at a relatively constant level by strong
homeostatic forces from metabolism, so that within the context of the cycle
changes in ATP aren't something we need worry about. More interesting
is that, within the time scale of phosphorylation and dephosphorylation, we
can assume that the rate of protein synthesis and degradation is negligible.
This assumption is more significant and leads to the emergence of a new
property of the cycle called moiety conservation [49].
Figure 6.1: Phosphorylation and Dephosphorylation Cycle forming a
Moiety Conservation Cycle between Unphosphorylated (left species)
and Phosphorylated protein (right species).
In chemistry a moiety is described as a subgroup of a larger molecule.
In this case the moiety is a protein. During the interconversion between
the phosphorylated and unphosphorylated protein, the amount of moiety
(protein) remains constant. More abstractly we can draw a cycle in the
following way (Figure 6.2), where S1 and S2 are the cycle species:
Figure 6.2: Simple Conserved cycle where $S_1 + S_2 = \text{constant}$.
The two species, $S_1$ and $S_2$, are conserved because the total $S_1 + S_2$ remains constant over time (at least over a time scale shorter than protein
synthesis and degradation). Such cycles are collectively called conserved
cycles.
Protein signalling pathways abound with conserved cycles such as these
although many are more complex than this and may involve multiple phosphorylation reactions. In addition to protein networks other pathways also
possess conservation cycles. One of the earliest conservation cycles to be
recognized was the adenosine triphosphate (ATP) cycle. ATP is a chain
of three phosphate residues linked to a nucleoside adenosine group, Figure 6.3.
Figure 6.3: Adenosine Triphosphate: Three phosphate groups plus an
adenosine subgroup.
The linkage between the phosphate groups involves unstable phosphoric
acid anhydride bonds and these can be cleaved by hydrolysis one at a
time leading in turn to the formation of adenosine diphosphate (ADP) and
adenosine monophosphate (AMP) respectively. The hydrolysis provides
much of the free energy to drive endergonic processes in the cell. Given
the insatiable need for energy, there is a continual and rapid interconversion between ATP, ADP and AMP as energy is released or captured. One
thing that is constant during these interconversions is the amount of adenosine group (Figure 6.4). That is adenosine is a conserved moiety. Over
longer time scales there is also the slower process of AMP degradation
and biosynthesis via the purine nucleotide pathway but for many models
we assume that this process is negligible compared to ATP turn over by
energy metabolism.
[Figure: ATP, ADP and AMP interconvert on a fast time scale; degradation and synthesis of AMP occur on a slow time scale.]
There are many other examples of conserved moieties such as enzyme/enzyme-substrate complex, NAD/NADH, phosphate and coenzyme A. In all these
cases the basic assumption is that the interconversions of the subgroups
is rapid compared to their net synthesis and degradation. We should emphasize that in reality conserved moieties do not exist since all molecular subgroups will at some point be subject to synthesis and degradation.
However, over sufficiently short time scales, the sum total of these groups
can be considered constant. In this chapter we will consider conserved
moieties in detail. In particular we will look at how to detect them in our
models, what effect they have on model dynamics and how they influence
the design of simulation software.
Figure 6.4: The adenosine moiety, indicated by the boxed molecular
group, is conserved during the interconversion of ATP, ADP and AMP.
Moiety: A subgroup of a larger molecule.
Conserved Moiety: A subgroup whose interconversion through a sequence of reactions leaves it unchanged.
6.1 Moiety Conserved Cycles
Any chemical group that is preserved during a cyclic series of interconversions is called a conserved moiety. Examples of conserved moiety
subgroups include species such as phosphate, acyl, nucleoside groups or
covalently modifiable proteins. As a moiety gets redistributed through a
network, the total amount of the moiety is constant and does not change
during the time evolution of the system. For any particular subgroup, the
total amount is determined solely by the initial conditions imposed on the
model.
Figure 6.5: Conserved Moiety in a Cyclic Network. The blue species
are modified as they traverse the reaction cycle, but the red subgroup
(small circle) remains unchanged. This creates a conserved cycle,
where the total number of moles of moiety (red subgroup) stays constant.
There are rare cases when a 'conservation' relationship arises out of a non-moiety cycle. This does not affect the mathematical analysis but only the
physical interpretation of the relationship. For example, in Figure 6.6 the
constraint $B - C = T$ applies even though there is no moiety involved.
Figure 6.6: Conservation due to stoichiometric matching. In this system, $B - C = \text{constant}$.
The presence of conserved moieties is an approximation introduced into
a model. However, over the time scale in which the conservations hold,
their existence can have a profound effect on the dynamic behavior of the
model. For example, the hyperbolic response of a simple enzyme (in the
form of enzyme conservation between $E$ and $ES$), or the sigmoid behavior
observed in protein signalling networks, is due in significant part to moiety
conservation laws (see section ??).
Figure 6.7 illustrates the simplest possible network which displays a conserved moiety: the total mass, $S_1 + S_2$, is constant during the evolution of
the network.
Figure 6.7: Simple Conserved cycle. The dotted lines signify negligible
levels of synthesis and degradation, therefore over short time scales,
$S_1 + S_2 = \text{constant}$.
The system equations for the simple conserved cycle are easily written
down as:
$$
\frac{dS_1}{dt} = v_1 - v_2 \qquad \frac{dS_2}{dt} = v_2 - v_1
$$
From these equations it should be evident that the rate of appearance of $S_1$
must equal the rate of disappearance of $S_2$, that is $dS_1/dt = -dS_2/dt$.
This means that whenever $S_1$ changes, $S_2$ must change in the opposite
direction by exactly the same amount. During a simulation the sum of $S_1$
and $S_2$ will therefore remain unchanged.
Computationally we need only explicitly evaluate one of the differential
equations because the other one can be computed from the conservation
relation. Whichever differential equation is chosen, the species
left out must be computed algebraically using the conservation law. Therefore, the system can be reduced to one differential and one linear algebraic
equation compared to the two differential equations in the original formulation.
$$
S_2 = T - S_1 \qquad \frac{dS_1}{dt} = v_1 - v_2
$$
The term $T$ in the algebraic equation shown above refers to the total amount
of $S_1$ and $S_2$. This value is computed from the initial amounts given to $S_1$
and $S_2$ at the start of a simulation.
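The reduction can be illustrated with a minimal Python sketch that integrates the cycle twice, once with both differential equations and once with a single differential equation plus the algebraic relation. The rate laws and parameter values are those quoted in the caption of Figure 6.8 (S1 -> S2 at k1*S1, S2 -> S1 at k2*S2); the starting value S2 = 0, the fixed-step Euler scheme and the function names are assumptions made here for illustration.

```python
def simulate_full(s1, s2, k1, k2, dt, n_steps):
    """Euler integration of both differential equations."""
    for _ in range(n_steps):
        forward = k1 * s1    # rate of S1 -> S2
        backward = k2 * s2   # rate of S2 -> S1
        s1, s2 = (s1 + dt * (backward - forward),
                  s2 + dt * (forward - backward))
    return s1, s2


def simulate_reduced(s1, total, k1, k2, dt, n_steps):
    """Euler integration of dS1/dt only; S2 follows from S2 = T - S1."""
    for _ in range(n_steps):
        s2 = total - s1                       # algebraic equation
        s1 = s1 + dt * (k2 * s2 - k1 * s1)    # single differential equation
    return s1, total - s1


if __name__ == "__main__":
    k1, k2 = 0.1, 0.2
    s1_0, s2_0 = 10.0, 0.0          # S2 = 0 is assumed, giving T = 10
    total = s1_0 + s2_0             # T is fixed by the initial amounts
    full = simulate_full(s1_0, s2_0, k1, k2, dt=0.01, n_steps=4000)
    reduced = simulate_reduced(s1_0, total, k1, k2, dt=0.01, n_steps=4000)
    print("full model:   ", full, "sum =", sum(full))
    print("reduced model:", reduced, "sum =", sum(reduced))
```

Both runs should agree to within rounding error, and in both the sum of the two species stays at the total set by the initial conditions.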
6.2 Basic Theory
The question we want to address here is how to determine whether a given
network contains conserved cycles and if so what are they. The key to
this question is the stoichiometry matrix, N . In the example shown in
Figure 6.7 the stoichiometry matrix is given by:
$$
N = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}
$$
[Plot: concentrations of S1 and S2 against time, from 0 to 40.]
Figure 6.8: Simulation of the simple cycle shown in Figure 6.7. The
total moiety remains constant at 10 concentration units. Model: S1 ->
S2; k1*S1; S2 -> S1; k2*S2; S1 = 10; k1=0.1; k2=0.2
The first thing to note is that since either row can be derived from the other
by multiplication by $-1$, the rows are linearly dependent
(see Box 3.0) and the rank of the matrix is therefore 1 (see Box 3.1). It is
these dependencies that appear as linear relationships between the rates of
change, $dS/dt$.
Whenever a network exhibits conserved moieties, there will be dependencies among the rows of $N$, and the rank of $N$, rank($N$), will be less than
$m$, the number of rows of $N$. The rows of $N$ can be rearranged so that
the first rank($N$) rows are linearly independent. The metabolites which
correspond to these rows are called the independent species ($S_i$). The
remaining $m - \text{rank}(N)$ rows correspond to the dependent species ($S_d$).
Box 3.0 Linear Dependence and Independence - Recap
One of the most important ideas in linear algebra is the concept of linear dependence and independence. Take three vectors, say $[1, 1, 2]$,
$[3, 0, -1]$ and $[9, 3, 4]$. If we look at these vectors carefully it should
be apparent that the third vector can be generated from a combination
of the first two, that is $[9, 3, 4] = 3[1, 1, 2] + 2[3, 0, -1]$. Mathematically we say that these vectors are linearly dependent.
In contrast, the following vectors, $[1, 1, 0]$, $[0, 1, 1]$ and $[0, 0, 1]$, are
independent because there is no combination of any two of these vectors that can
generate the third. Mathematically we say that these vectors
are linearly independent.
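As a quick check of the arithmetic in Box 3.0 (taking the third entry of the second vector as $-1$, which is what the stated combination requires):
$$
3\,[1,\,1,\,2] + 2\,[3,\,0,\,-1] = [3,\,3,\,6] + [6,\,0,\,-2] = [9,\,3,\,4]
$$
For the second set, suppose $a\,[1, 1, 0] + b\,[0, 1, 1] + c\,[0, 0, 1] = [0, 0, 0]$. The first component gives $a = 0$, the second then gives $b = 0$, and the third gives $c = 0$. Only the trivial combination produces the zero vector, which is the formal statement of linear independence.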
In the simple conserved cycle, Figure 6.7, there is one independent species,
S1 and one dependent species, S2 .
Example 6.1
Figure 6.5 illustrates a three species cycle. What is the conservation law for this
pathway? The stoichiometry matrix for this system is given by:
$$
N = \begin{array}{c|rrr}
 & v_1 & v_2 & v_3 \\ \hline
S_1 & -1 & 0 & 1 \\
S_2 & 1 & -1 & 0 \\
S_3 & 0 & 1 & -1
\end{array}
\tag{6.1}
$$
Inspection reveals that the sum of the three rows is zero, meaning that
$$
\frac{dS_1}{dt} + \frac{dS_2}{dt} + \frac{dS_3}{dt} = 0
$$
or that the total $S_1 + S_2 + S_3$ is constant. There are no other relationships between
the rows other than this one.
Example 6.2
A linear pathway has the following stoichiometry matrix:
$$
N = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \end{bmatrix}
$$
Does the pathway contain any conserved cycles? No. Neither row in the
matrix can be derived from the other by a simple operation, so the rows are linearly
independent and the pathway has no conserved cycles.
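The rank test used in these examples can be sketched in Python with exact rational arithmetic from the standard library. The function name `rank` and the sign convention chosen for the three-species cycle (each reaction consumes one species and produces the next) are assumptions of this sketch; the count of conservation laws is the number of rows minus the rank.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix by forward elimination with exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0  # number of pivot rows found so far
    for col in range(n_cols):
        # Look for a nonzero entry in this column at or below row r.
        pivot = next((i for i in range(r, n_rows) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate the entries below the pivot.
        for i in range(r + 1, n_rows):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r


def n_conservation_laws(matrix):
    """Number of dependent rows, m - rank(N)."""
    return len(matrix) - rank(matrix)


if __name__ == "__main__":
    simple_cycle = [[1, -1], [-1, 1]]                    # Figure 6.7
    three_cycle = [[-1, 0, 1], [1, -1, 0], [0, 1, -1]]   # Example 6.1 (signs assumed)
    linear_path = [[1, -1, 0], [0, 1, -1]]               # Example 6.2
    for name, n in [("simple cycle", simple_cycle),
                    ("three-species cycle", three_cycle),
                    ("linear pathway", linear_path)]:
        print(name, "rank =", rank(n), "laws =", n_conservation_laws(n))
```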
To illustrate this idea on a more complicated example, consider the pathway shown in Figure 6.9. This pathway includes four species, $S_1$, $S_2$, $E$
and $ES$.
Figure 6.9: Linked Conserved Cycles. The network rendered on the
right shows the moiety composition of the participating species.
The mass-balance equations of this model can be written down as:
$$
\frac{dE}{dt} = v_2 - v_3 \qquad
\frac{dES}{dt} = v_3 - v_2 \qquad
\frac{dS_1}{dt} = v_2 - v_1 \qquad
\frac{dS_2}{dt} = v_1 - v_3
$$
A visual inspection of the mass-balance equations reveals the following
two relationships:
$$
\begin{aligned}
\frac{dE}{dt} + \frac{dES}{dt} &= 0 \\
\frac{dES}{dt} + \frac{dS_1}{dt} + \frac{dS_2}{dt} &= 0
\end{aligned}
\tag{6.2}
$$
These relationships tell us that there are two conservation laws: the totals $E + ES$
and $ES + S_1 + S_2$ are each constant. This means that given the amount of $ES$, the amount
of $E$ can be computed. In addition, given the amount of $ES$ and $S_1$, the
amount of $S_2$ can be computed. Therefore $ES$ and $S_1$ can be designated
the independent species and $E$ and $S_2$ the dependent species. What this
means in practical terms is that in a modeling program only two differential
equations need be solved instead of four. The reduced model equations
will look like:
$$
\begin{aligned}
E &= T_1 - ES \\
S_2 &= T_2 - S_1 - ES \\
\frac{dES}{dt} &= v_3 - v_2 \\
\frac{dS_1}{dt} &= v_2 - v_1
\end{aligned}
$$
where $T_1$ is the total amount of $E$ type moiety and $T_2$ is the total amount
of $S$ type moiety.
Box 3.1 The Rank of a Matrix - Recap
Closely related to linear independence (Box 3.0) is the concept of
Rank. Consider the three vectors described in Box 3.0, $[1, 1, 2]$,
$[3, 0, -1]$ and $[9, 3, 4]$, and stack them one atop the other to form a
matrix:
$$
\begin{bmatrix} 1 & 1 & 2 \\ 3 & 0 & -1 \\ 9 & 3 & 4 \end{bmatrix}
$$
The Rank is simply the number of linearly independent vectors that
make up the matrix. In this case the Rank is 2, because there are only
two linearly independent row vectors in the matrix.
The stoichiometry matrix for the model in Figure 6.9 is given by:
$$
N = \begin{array}{c|rrr}
 & v_1 & v_2 & v_3 \\ \hline
S_2 & 1 & 0 & -1 \\
ES & 0 & -1 & 1 \\
S_1 & -1 & 1 & 0 \\
E & 0 & 1 & -1
\end{array}
\tag{6.3}
$$
Examining the stoichiometry matrix reveals conservation laws as relationships among the matrix rows. The 4th row ($E$) can be formed by multiplying the 2nd row ($ES$) by $-1$, and the 3rd row ($S_1$) can be formed by
multiplying the 1st row ($S_2$) by $-1$ and adding it to the 4th row ($E$).
These simple examples show that it is possible to derive conservation laws
by looking for dependencies among the rows of the stoichiometry matrix.
For simple cases this can be done by inspection but for large pathways this
approach is not practical. Instead a more systematic theory for deriving
the conservation laws must be developed.
6.3 Computational Approaches
There are a number of related methods for computing the conservation
laws of a given pathway. Some are simple, such as the one shortly to be
described, while others are more sophisticated and are used to determine
the conservation laws in very large stoichiometry matrices.
The easiest method to derive conservation laws is to use row reduction [42,
10, 9]. This is based on forward elimination which is the first part of
Gaussian Elimination. Gaussian Elimination is a traditional way to solve
simultaneous linear equations by eliminating one unknown at a time and
is a technique often taught in high school. Elimination is carried out by
applying a series of simple manipulations called elementary operations.
These operations include interchanging two equations (exchange), multiplying an equation through by a nonzero number (scaling) and adding an
equation one or more times to another equation (replacement). In practice
the equations are recast into a matrix form so that the elementary operations are applied to the values in the matrix where each row of the matrix
represents an equation. Thus interchanging two equations is equivalent to
swapping two rows in the matrix. The elementary operations are carried
out on the matrix until a particular arrangement, called the echelon form,
is established (See Box 3.3).
Elementary operations are often represented in matrix form and are then
called elementary matrices (See Box 3.2). Applying a particular elementary operation then becomes equivalent to multiplying by an elementary
matrix.
The technique for finding conservation laws works as follows. Consider
the network in Figure 6.9. The system equation for this network is:
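$$
\begin{bmatrix}
1 & 0 & -1 \\
0 & -1 & 1 \\
-1 & 1 & 0 \\
0 & 1 & -1
\end{bmatrix}
\begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}
=
\begin{bmatrix} dS_2/dt \\ dES/dt \\ dS_1/dt \\ dE/dt \end{bmatrix}
$$
The rows of the stoichiometry matrix are ordered $S_2$, $ES$, $S_1$, $E$.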
We will recast the equation in the following form, where an identity matrix
has been introduced on the right-hand side:
$$
N v = I \frac{dS}{dt}
$$
Written out fully the system equation will look like:
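$$
\begin{bmatrix}
1 & 0 & -1 \\
0 & -1 & 1 \\
-1 & 1 & 0 \\
0 & 1 & -1
\end{bmatrix}
\begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} dS_2/dt \\ dES/dt \\ dS_1/dt \\ dE/dt \end{bmatrix}
$$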
Let us now apply forward elimination to the stoichiometry matrix. To do
this we apply a series of elementary operations to the left-hand side such
that the stoichiometry matrix is reduced to echelon form. For consistency
we apply the same set of elementary operations to the right-hand side so
that the identity matrix records whatever operations we carried out. This
amounts to multiplying both sides by a set of elementary matrices. We
only need to reduce the matrix to its row echelon form, not to its reduced
echelon form.
Box 3.2 Elementary Matrices - Recap
Elementary row operations such as row exchange, row scaling or row
replacement can be represented by simple matrices called elementary
matrices, of Type I, II and III respectively. Elementary matrices
can be constructed from the identity matrix. For example, a scaling
operation can be represented by replacing one of the elements on the
main diagonal of an identity matrix by the scaling factor. The following
matrix represents a Type II matrix which will scale the second row of a
given matrix by the factor $k$:
$$
\begin{bmatrix} 1 & 0 & 0 \\ 0 & k & 0 \\ 0 & 0 & 1 \end{bmatrix}
$$
Type I elementary matrices exchange two rows of a target
matrix and are constructed from an identity matrix by exchanging the
same two rows. The following Type I matrix will exchange rows 2 and
3 in a target matrix:
$$
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}
$$
Type III elementary matrices add a multiple of one row of a target
matrix to another row of the same matrix. Type III matrices are constructed from an identity matrix in which a single off-diagonal element is
set to the multiplication factor, its location identifying the
two rows to combine. If the operation adds $\alpha$ times row $j$ to row
$i$, then entry $(i, j)$ of the identity matrix is
set to $\alpha$. In the following example, the Type III elementary matrix will
subtract five times the 2nd row from the 3rd row:
$$
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -5 & 1 \end{bmatrix}
$$
A particularly important property of elementary matrices is that they
can all be inverted. In addition, pre-multiplying by an elementary matrix will modify the rows of a target matrix while post-multiplying will
operate on the columns.
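The three types of elementary matrix in Box 3.2 can be tried out with a short Python sketch. The helper names (`identity`, `matmul`, `exchange`, `scaling`, `replacement`) are hypothetical, and row indices are zero-based as usual in Python, so "row 3" in the text is index 2 here.

```python
def identity(n):
    """n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain matrix product a * b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def exchange(n, i, j):
    """Type I: identity with rows i and j exchanged."""
    e = identity(n)
    e[i], e[j] = e[j], e[i]
    return e


def scaling(n, i, k):
    """Type II: identity with diagonal entry (i, i) replaced by k."""
    e = identity(n)
    e[i][i] = k
    return e


def replacement(n, i, j, alpha):
    """Type III: identity with entry (i, j) set to alpha.

    Pre-multiplying by it adds alpha times row j to row i.
    """
    e = identity(n)
    e[i][j] = alpha
    return e


if __name__ == "__main__":
    target = [[1, 2], [3, 4], [5, 6]]
    print(matmul(exchange(3, 1, 2), target))          # rows 2 and 3 swapped
    print(matmul(scaling(3, 1, 7), target))           # row 2 scaled by 7
    print(matmul(replacement(3, 2, 1, -5), target))   # row 3 minus 5 * row 2

    # Adding the 1st row to the 3rd row of the matrix in equation 6.3.
    n = [[1, 0, -1], [0, -1, 1], [-1, 1, 0], [0, 1, -1]]
    print(matmul(replacement(4, 2, 0, 1), n))
```

Because every operation is a pre-multiplication, a whole sequence of row operations collapses into a single matrix product, which is the matrix $E$ used in the derivation that follows.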
Reducing a matrix to echelon form raises the possibility of generating zero
rows in the matrix if there are dependencies in the rows (See Box 3.3).
This being the case the system equation after forward elimination can be
expressed in the following way:
$$
\begin{bmatrix} M \\ 0 \end{bmatrix} v = E \frac{dS}{dt} \tag{6.4}
$$
where the identity matrix has been shown transformed into the matrix E
which represents the product of all elementary operations that were applied
to the left-hand side. The left-hand side has itself been transformed into an
echelon form which is represented as a partitioned matrix. The E matrix
can also be partitioned row-wise to match the partitioning in the echelon
matrix, that is:
$$
\begin{bmatrix} M \\ 0 \end{bmatrix} v = \begin{bmatrix} X \\ Y \end{bmatrix} \frac{dS}{dt} \tag{6.5}
$$
Multiplying out the lower partition one obtains:
$$
Y \frac{dS}{dt} = 0 \tag{6.6}
$$
This general result is equivalent to the equations shown in 6.2, that is 6.6
represents the set of conservation laws. Determining the conservation laws
therefore involves reducing the stoichiometry matrix and extracting the
lower portion of the modified identity matrix.
Let us now proceed with an example to illustrate this method. We will
use the stoichiometry matrix from equation 6.3. For convenience the stoichiometry and identity matrix are placed next to each other in the following sequence of elementary operations. An elementary operation carried
out on the stoichiometry matrix is simultaneously applied to the identity
matrix.
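1. Stoichiometry matrix on the left and identity matrix on the right.
$$
\begin{bmatrix} 1 & 0 & -1 \\ 0 & -1 & 1 \\ -1 & 1 & 0 \\ 0 & 1 & -1 \end{bmatrix}
\qquad
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
$$
2. Add the 1st row to the third row to yield:
$$
\begin{bmatrix} 1 & 0 & -1 \\ 0 & -1 & 1 \\ 0 & 1 & -1 \\ 0 & 1 & -1 \end{bmatrix}
\qquad
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
$$
3. Add the 2nd row to the third and fourth rows to yield:
$$
\begin{bmatrix} 1 & 0 & -1 \\ 0 & -1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\qquad
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{bmatrix}
$$
4. Multiply the second row by -1 to yield the final echelon form:
$$
\begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\qquad
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{bmatrix}
$$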
The final operation achieves the goal of reducing the stoichiometry matrix
to an echelon form (in this case it happens to be a reduced echelon form).
Note that the operation has resulted in two zero rows appearing in the reduced stoichiometry matrix. These two rows correspond to the Y partition
in equation 6.5. The lower two rows can be extracted from the right-hand
matrix (what was once the identity matrix) to construct equation 6.6, thus
$$
\begin{bmatrix} 1 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} dS_2/dt \\ dES/dt \\ dS_1/dt \\ dE/dt \end{bmatrix} = 0
$$
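Or:
$$
\begin{aligned}
\frac{dS_2}{dt} + \frac{dES}{dt} + \frac{dS_1}{dt} &= 0 \\
\frac{dE}{dt} + \frac{dES}{dt} &= 0
\end{aligned}
$$
From the above equations the following conservation laws should be evident:
$$
\begin{aligned}
S_2 + ES + S_1 &= T_1 \\
ES + E &= T_2
\end{aligned}
\tag{6.7}
$$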
In summary the algorithm for deriving the conservation laws is as follows:
1. Apply elementary operations to the stoichiometry matrix until the matrix is reduced to its row echelon form. Simultaneously apply the elementary operations to an identity matrix. The size of the identity matrix should
be equal to the number of rows in the stoichiometry matrix.
2. If there are zero rows at the bottom of the reduced stoichiometry matrix
then there are conservation laws in the network otherwise there are not.
The number of conservation laws will be equal to the number of zero rows.
3. Extract the rows in the transformed identity matrix that correspond to
the position of the zero rows in the reduced stoichiometry matrix. The
extracted rows represent the conservation laws.
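The three steps above can be sketched in Python as follows. This is an illustrative implementation rather than a production routine: it uses exact fractions from the standard library to sidestep rounding, performs only forward elimination, and the function name `conservation_laws` is hypothetical.

```python
from fractions import Fraction


def conservation_laws(n_matrix):
    """Forward-eliminate [N | I] and return the identity-side rows
    that sit next to the zero rows of the reduced N."""
    m = len(n_matrix)          # number of species (rows)
    n_cols = len(n_matrix[0])  # number of reactions (columns)
    # Step 1: augment N with an m x m identity matrix.
    aug = [[Fraction(x) for x in row] +
           [Fraction(int(i == j)) for j in range(m)]
           for i, row in enumerate(n_matrix)]
    r = 0  # number of pivot rows found so far
    for col in range(n_cols):
        pivot = next((i for i in range(r, m) if aug[i][col] != 0), None)
        if pivot is None:
            continue
        aug[r], aug[pivot] = aug[pivot], aug[r]
        for i in range(r + 1, m):
            factor = aug[i][col] / aug[r][col]
            if factor != 0:
                aug[i] = [a - factor * b for a, b in zip(aug[i], aug[r])]
        r += 1
        if r == m:
            break
    # Steps 2 and 3: rows r..m-1 of the N part are now zero; the matching
    # rows of the transformed identity are the conservation laws.
    return [row[n_cols:] for row in aug[r:]]


if __name__ == "__main__":
    # Rows ordered S2, ES, S1, E as in equation 6.3.
    n = [[1, 0, -1], [0, -1, 1], [-1, 1, 0], [0, 1, -1]]
    for law in conservation_laws(n):
        print([int(x) for x in law])
```

For the matrix of equation 6.3 the two returned rows correspond to $S_2 + ES + S_1$ and $ES + E$, in agreement with equation 6.7. Columns of the result follow the original row order of the species, even if rows were swapped during elimination.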
There are two points worth making when applying this algorithm. The first
is that any row swaps made during row reduction of the stoichiometry
matrix will not translate to swaps in the names of the species on the right-hand side of the equation. This means that when reading the conservation
rows, the names on the columns are not changed by any row exchanges
in the stoichiometry matrix. The second point is that when carrying out the elementary row operations, it is recommended to eliminate,
whenever possible, terms below a leading entry by adding rather than subtracting. This will ensure that entries in the transforming identity matrix
remain positive and that the resulting conservation laws will be made up
of positive terms. Sometimes adding will not be possible and
subtractions will be necessary. This will result in negative terms appearing
in the conservation laws, which may make them more difficult to interpret
physically.
A useful strategy that can be used to avoid negative terms in the conservation equations is to order the rows of the stoichiometry matrix such that
any species that is likely to appear in more than one conservation relationship should be placed at the bottom of the stoichiometry matrix. In the
case of the previous example we would make sure that ES is located in the
bottom row of the stoichiometry matrix. This ordering ensures that the independent species (top rows) are represented by the free variables and the
dependent species (bottom rows) by the shared variables. This means that
the shared or dependent variables (i.e. complexes) will then be a function
of the free variables which is more likely to result in positive terms [52].
A more brute force method is to try all permutations of the matrix rows
until a positive set of conservation laws is found. For small models (< 10
species) this approach is a viable option.
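The brute force search over row orderings can be sketched as below. The sketch assumes the conservation laws are read from the reduced row echelon form of the augmented matrix, as an rref() call would produce; the names `rref`, `laws_by_rref` and `positive_ordering` are hypothetical, and exact fractions are used so that sign tests are reliable.

```python
from fractions import Fraction
from itertools import permutations


def rref(matrix):
    """Reduced row echelon form using exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        lead = rows[r][col]
        rows[r] = [x / lead for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return rows


def laws_by_rref(n_matrix):
    """Conservation rows from the reduced augmented matrix [N | I]."""
    m, n_cols = len(n_matrix), len(n_matrix[0])
    aug = [list(row) + [int(i == j) for j in range(m)]
           for i, row in enumerate(n_matrix)]
    reduced = rref(aug)
    # Rows whose N part is entirely zero carry the conservation laws.
    return [row[n_cols:] for row in reduced
            if all(x == 0 for x in row[:n_cols])]


def positive_ordering(n_matrix, species):
    """Try row orderings until every conservation coefficient is >= 0."""
    for perm in permutations(range(len(species))):
        laws = laws_by_rref([n_matrix[i] for i in perm])
        if all(x >= 0 for law in laws for x in law):
            return [species[i] for i in perm], laws
    return None


if __name__ == "__main__":
    n = [[1, 0, -1], [0, -1, 1], [-1, 1, 0], [0, 1, -1]]
    print(positive_ordering(n, ["S2", "ES", "S1", "E"]))
```

The number of orderings grows factorially with the number of species, which is why the text restricts this approach to small models.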
Although it is possible to manually reduce a stoichiometry matrix, it is
far easier to use specialized math software such as Scilab, Octave, Matlab
and Mathematica or even advanced modern desktop calculators. These
tools offer a command for generating the reduced row echelon form, for example rref() in Scilab, Octave and Matlab, and RowReduce in Mathematica. The
following examples will illustrate the use of the freely available Scilab
application (www.scilab.org) to compute the conservation laws.
Example 6.3
Row reduction using Scilab/Matlab. Given the following stoichiometry matrix,
use Scilab functions to row reduce and extract the conservation laws.
$$
N = \begin{array}{c|rrr}
S_2 & 1 & 0 & -1 \\
ES & 0 & -1 & 1 \\
S_1 & -1 & 1 & 0 \\
E & 0 & 1 & -1
\end{array}
$$
Enter the stoichiometry matrix into the software:
-->n = [1 0 -1; 0 -1 1; -1 1 0; 0 1 -1];
Augment the matrix with the identity matrix, this will allow us to record row
reduction operations in the identity matrix part of the augmented matrix.
-->ni = [n, eye(4,4)]
ni =
    1.    0.  - 1.    1.    0.    0.    0.
    0.  - 1.    1.    0.    1.    0.    0.
  - 1.    1.    0.    0.    0.    1.    0.
    0.    1.  - 1.    0.    0.    0.    1.
Row reduce the augmented matrix:
-->rni = rref (ni)
rni =
    1.    0.  - 1.    0.    0.  - 1.    1.
    0.    1.  - 1.    0.    0.    0.    1.
    0.    0.    0.    1.    0.    1.  - 1.
    0.    0.    0.    0.    1.    0.    1.
The left partition of the reduced matrix contains two zero rows, therefore there
are two conservation laws. These laws correspond to the two bottom rows in the
right partition. We extract the rows in the right partition to yield:
-->c = rni(3:4,4:7)
c =
    1.    0.    1.  - 1.
    0.    1.    0.    1.
The species column order is the same as the species row order in the original
matrix, that is $S_2$, $ES$, $S_1$ and $E$, therefore:
$$
\begin{aligned}
S_2 + S_1 - E &= T_1 \\
ES + E &= T_2
\end{aligned}
$$
Note the negative $E$ term in the first conservation law. At first glance this does not
appear to be the same set of conservation laws that were derived earlier. However,
if we substitute $E = T_2 - ES$ from the second equation into the first we recover
$S_1 + S_2 + ES = T$, where $T = T_1 + T_2$, showing us that the two sets are
equivalent. To avoid negative terms appearing in the conservation laws, we can use
the rule that all complex species (that is, shared species) such as $ES$ be moved to
the bottom of the matrix (see next example).
Example 6.4
Row reduction using Scilab/Matlab. Given the following stoichiometry matrix,
use Scilab functions to row reduce and extract the conservation laws. In this
example, the shared species ES has been moved to the bottom of the matrix.
$$
N = \begin{array}{c|rrr}
S_2 & 1 & 0 & -1 \\
S_1 & -1 & 1 & 0 \\
E & 0 & 1 & -1 \\
ES & 0 & -1 & 1
\end{array}
$$
The reduced augmented matrix is now:
-->rni = rref (ni)
rni =
    1.    0.  - 1.    0.  - 1.    0.  - 1.
    0.    1.  - 1.    0.    0.    0.  - 1.
    0.    0.    0.    1.    1.    0.    1.
    0.    0.    0.    0.    0.    1.    1.
Once again there are two zero rows but this time the corresponding conservation
laws all have positive entries, yielding the following equations:
$$
\begin{aligned}
S_2 + S_1 + ES &= T_1 \\
ES + E &= T_2
\end{aligned}
$$
The following Scilab/Matlab code will find the conservation laws for any
stoichiometry matrix.
// Compute Conservation Laws
// -------------------------
// Enter the stoichiometry matrix first
n = [1 0 -1; 0 -1 1; -1 1 0; 0 1 -1];
nRows = size(n, 1);
// Create the augmented matrix
ni = [n, eye(nRows,nRows)];
// Carry out row reduction
rni = rref (ni);
r = rank (n);
// Extract the conservation rows
c = rni(r+1:nRows,size(n,2)+1:size(ni,2));
// Display result
c
Figure 6.10: General purpose Scilab/Matlab code to determine conservation laws using row reduction.
Row reduction of the augmented stoichiometry matrix is probably the easiest way
to derive the conservation laws. The main advantages of this method are its simplicity and, significantly, the ability to direct the calculation by
setting the order of rows in the initial stoichiometry matrix. However, it has one
disadvantage, which is potential numerical instability for large systems. In
particular, for large genomic style stoichiometry models [40] that involve
many hundreds or even thousands of reactions and species, the method
can suffer dramatic failures due to rounding errors during row reduction.
In a subsequent section more robust methods will be described that rely
on QR factorization [63] and Singular Value Decomposition (SVD). The
main disadvantage of these other methods is that sometimes, depending on
the particular algorithm, the row order cannot be easily prescribed. In any
event there are some simple tests one can do to check that the computed
conservation laws are correct; one such test will be described next.
Null Space of $N^T$
To complete this section let us consider in more detail the algebraic nature
of the $Y$ partition in equation 6.6.
The matrix $E$, the product of the elementary matrices, reduced the stoichiometry matrix to a row echelon form, that is to:
$$
E N = \begin{bmatrix} M \\ 0 \end{bmatrix} \tag{6.8}
$$
The $E$ matrix is the same matrix that appears, partitioned, in equation 6.5, so that we
can partition $E$ row-wise into $X$ and $Y$ partitions
(equation 6.5):
$$
\begin{bmatrix} X \\ Y \end{bmatrix} N = \begin{bmatrix} M \\ 0 \end{bmatrix}
$$
From which we can immediately see that:
$$
Y N = 0
$$
Taking the transpose we obtain
$$
N^T Y^T = 0
$$
The Y partition is therefore the null space of the transpose of the stoichiometry matrix (cf. ??). This is a significant result for a number of
reasons. It gives a very concise definition of the conservation matrix but
more importantly it opens up the possibility of using other computational
approaches.
The other point of interest is that this result can be used to test whether
a set of conservation laws were correctly derived or not. To do this we
simply multiply the transpose of N by the transpose of the conservation
matrix Y and make sure the product equals zero.
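The test just described can be sketched in a few lines of Python. The helper names (transpose, matmul, is_conservation_matrix) are hypothetical and written only for this illustration; the matrix N is the example stoichiometry matrix used in the Scilab session of this section, and Y_good is the rational basis that session produces. Y_bad is a deliberately wrong candidate added to show the test failing.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    # Plain nested-list matrix product
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def is_conservation_matrix(n, y):
    """Return True when N^T Y^T is the zero matrix."""
    product = matmul(transpose(n), transpose(y))
    return all(entry == 0 for row in product for entry in row)


N = [[1, 0, -1], [-1, 1, 0], [0, 1, -1], [0, -1, 1]]
Y_good = [[1, 1, 0, 1], [0, 0, 1, 1]]   # candidate conservation rows
Y_bad = [[1, 1, 0, 0], [0, 0, 1, 1]]    # first row is not a conservation law

print(is_conservation_matrix(N, Y_good))  # True
print(is_conservation_matrix(N, Y_bad))   # False
```

With integer or rational entries the comparison with zero is exact; for a floating point conservation matrix a small tolerance would be needed instead.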
Many software packages such as Matlab, Scilab or Mathematica supply
commands to compute the null space. This makes it easy to compute the
conservation laws by simply computing the null space of the transpose of
the stoichiometry matrix. For example, the following session shows how
we can use Scilab to compute the conservation laws for the example matrix
we used in previous examples.
-->N = [1 0 -1; -1 1 0; 0 1 -1; 0 -1 1]
 N  =
    1.    0.  - 1.
  - 1.    1.    0.
    0.    1.  - 1.
    0.  - 1.    1.
-->ns = kernel (N')
 ns  =
    0.           0.6324555
    0.           0.6324555
    0.7071068  - 0.3162278
    0.7071068    0.3162278
-->// Convert the orthonormal set
-->// into a rational basis using rref
-->rref (ns')'
 ans  =
    1.    0.
    1.    0.
    0.    1.
    1.    1.
The null space command in Scilab is kernel, in Matlab it is null and in
Mathematica it is NullSpace. Like many null space commands implemented in
mathematical software, the kernel command in Scilab has the drawback of
generating an orthonormal set. In order to generate a rational basis we must
row reduce the kernel; this results in a more interpretable set of
conservation laws. In Matlab it is possible to use the modified null space
command, null (N, 'r'), which will automatically generate a rational basis
(neither Octave nor Scilab supports this format). Interestingly,
Mathematica's (7.0) null space function does generate a rational basis;
however, the algorithm that Mathematica uses is unknown, which raises its
own issues.
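The conversion from an orthonormal to a rational basis can be sketched in Python as follows. This is a minimal floating point reduced row echelon routine with partial pivoting, written for illustration only; the function name rref and the tolerance value are assumptions and are not taken from Scilab or Matlab. The input rows are the transposed kernel vectors printed in the Scilab session above.

```python
def rref(rows, tol=1e-9):
    """Reduced row echelon form of a small dense matrix (illustrative sketch)."""
    a = [[float(x) for x in row] for row in rows]
    nrows, ncols = len(a), len(a[0])
    lead = 0
    for col in range(ncols):
        if lead == nrows:
            break
        # Partial pivoting: use the largest entry in this column
        pr = max(range(lead, nrows), key=lambda r: abs(a[r][col]))
        if abs(a[pr][col]) < tol:
            continue  # no usable pivot in this column
        a[lead], a[pr] = a[pr], a[lead]
        pivot = a[lead][col]
        a[lead] = [x / pivot for x in a[lead]]
        for r in range(nrows):
            if r != lead:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[lead])]
        lead += 1
    # Flush rounding noise to exact zeros
    return [[0.0 if abs(x) < tol else x for x in row] for row in a]


# Transposed orthonormal kernel from the Scilab session
ns_t = [[0.0, 0.0, 0.7071068, 0.7071068],
        [0.6324555, 0.6324555, -0.3162278, 0.3162278]]
rational = rref(ns_t)
for row in rational:
    print([round(x, 6) for x in row])
```

Up to the rounding of the seven-digit input, the two rows are [1, 1, 0, 1] and [0, 0, 1, 1], which is the transpose of the result returned by rref (ns')' in the session.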
Given that we can now compute the conservation laws for arbitrary networks, the next question to consider is whether conservation laws have
any behavioral consequences.
6.4 Summary
Of particular interest is to compare these results with equations 6.14 and 6.15.
Whereas the flux balance relationships are derived from the stoichiometry
matrix, the moiety conservation laws are derived from the transpose of the
stoichiometry matrix. Thus to summarize:
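Moiety Conservation Laws (where $\Gamma = [-L_0 \;\; I]$ is the conservation matrix):
\[ \begin{bmatrix} -L_0 & I \end{bmatrix} \begin{bmatrix} N_R \\ N_0 \end{bmatrix} = 0, \qquad N^T \Gamma^T = 0 \]
Flux Balance Laws:
\[ \begin{bmatrix} N_{DC} & N_{IC} \end{bmatrix} \begin{bmatrix} I \\ K_0 \end{bmatrix} = 0, \qquad N_R K = 0 \]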
6.5 Behavioral Consequences
Conservation laws in general can have profound effects on the behavior
of pathway models. Two broad categories can be described: constraints, in
the form of limits on changes to species and fluxes, and behavioral
enhancements, in the form of new emergent behavior.
As discussed by Eisenthal and Cornish-Bowden [15], many traditional
drugs, for example pesticides and anti-pathogen agents, work by disrupting
either flux or metabolite levels to an extent that is harmful to the organism.
This can be achieved by either reducing an important flux to unacceptably
low levels or increasing the level of a metabolite to toxic proportions.
Conservation constraints can impose hard limits on the extent to which a drug
can influence species levels. This effect is separate from any kinetic
constraints that may exist. Thus stoichiometric analysis is an important
initial evaluation of whether manipulating a particular target might be
effective or not. For an interesting example of these constraints in
operation, the reader is referred to the work of Bakker et al. [3, 4] and
also Cornish-Bowden, Eisenthal and Hofmeyr [8, 10].
More interesting is the ability of conservation cycles to enhance the
behavioral properties of networks. We will now consider a series of example
pathways where conservation laws can have a profound effect on behavior.
We will start with a linear chain, a pathway that has no conservation
laws but provides instead a useful reference case to compare with subsequent
examples.
Linear Chain
Consider a simple linear chain where the kinetics for each reaction follows
simple first order mass-action kinetics. It is assumed that Xo and X1 are
fixed.
\[ X_o \xrightarrow{\; v_1 = k_1 X_o \;} S_1 \xrightarrow{\; v_2 = k_2 S_1 \;} S_2 \xrightarrow{\; v_3 = k_3 S_2 \;} X_1 \]
We can investigate the steady state concentrations of S1 and S2 as a function of the rate constant, k1 . Figure 6.11 shows a typical steady state plot
of S1 and S2 versus k1 . What is characteristic about this simulation is that
the concentrations, S1 and S2 show linear behavior in response to changes
in k1 .
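The linear response can be made explicit with a short derivation. At steady state the three rates are equal; the worked values below use the parameters listed in the caption of Figure 6.11 and are offered as a check on the plot rather than as part of the original text.

```latex
\begin{align*}
k_1 X_o &= k_2 S_1 = k_3 S_2 \\
S_1 &= \frac{k_1 X_o}{k_2}, \qquad S_2 = \frac{k_1 X_o}{k_3} \\
\text{With } X_o = 1,\ k_2 = 1,\ k_3 = 2:\quad S_1 &= k_1, \qquad S_2 = \tfrac{1}{2} k_1
\end{align*}
```

Both concentrations are therefore directly proportional to k1, with no upper limit.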
[Plot: steady state concentrations of S1 and S2 (vertical axis: Concentration) versus k1 (horizontal axis).]
Figure 6.11: Simulation of the simple linear chain as a function of
k1 . Model: Xo -> S1; k1*Xo; S1 -> S2; k2*S1; S2 -> X1;
k3*S2; Xo = 1; k1=0.5; k2=1; k3=2
Simple Cycle with Linear Kinetics
Instead of a simple linear chain let us now consider a cycle such as the
one shown in Figure 6.7. We will again assume that the kinetics governing
each cycle arm is simple first order mass-action kinetics.
If we plot the steady state concentration of S1 and S2 versus the kinetic
constant k1 we get the response curves shown in Figure 6.12. The responses for a cycle are quite different from a linear chain. The response
curves are in fact hyperbolic. For example, S2 rises linearly then levels off
to 10 concentration units in the limit. What is happening here is that as k1
increases more and more S1 is converted to S2 leading to a rise in S2 and
a fall in S1 . The limit is reached because there is only a limited amount of
mass in the cycle. A simple conservation law has resulted in a change in
behavior from a linear to hyperbolic behavior even though the underlying
kinetic laws are unchanged.
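The hyperbolic form follows directly from combining the conservation law with the steady state condition. In the sketch below T denotes the total mass in the cycle; taking T = 10 assumes that S2 starts at zero, which the model listing in the caption of Figure 6.12 does not state explicitly.

```latex
\begin{align*}
S_1 + S_2 &= T, \qquad k_1 S_1 = k_2 S_2 \\
S_2 &= \frac{T k_1}{k_1 + k_2}, \qquad S_1 = \frac{T k_2}{k_1 + k_2} \\
\lim_{k_1 \to \infty} S_2 &= T
\end{align*}
```

For small k1 the expression for S2 is approximately linear in k1 (slope T/k2), and it saturates at T as k1 grows, which is the behavior described above.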
[Plot: steady state concentrations of S1 and S2 (vertical axis: Concentration) versus k1 (horizontal axis).]
Figure 6.12: Simulation of the simple cycle with linear kinetics. Plot
shows the steady state concentration of each species as a function of k1 .
Model: S1 -> S2; k1*S1; S2 -> S1; k2*S2; S1=10; k1=0.1;
k2=0.4
Simple Cycle with Non-Linear Kinetics
If we now take the simple cycle model from the last section and, instead
of linear kinetics, use non-linear kinetics, for example Michaelis-Menten
kinetics on the forward and reverse arms, then additional changes in
behavior will be observed.
[Plot: steady state concentrations of S1 and S2 (vertical axis: Concentration) versus k1 (horizontal axis).]
Figure 6.13: Simulation of the simple cycle with non-linear kinetics
illustrating sigmoid or ultrasensitive behavior. Model: S1 ->
S2; k1*S1/(Km1+S1); S2 -> S1; k2*S2/(Km2+S2); S1=10;
k1=0.1; Km1=0.5; k2=0.4; Km2=0.5
The response is now sigmoidal rather than hyperbolic. The reason for
this is explained in Figure 6.14. The intersection points marked by a grey
marker represent the corresponding steady state points (v1 = v2). A
perpendicular dropped from these indicates the corresponding steady state
concentration of S1. If the activity of v1 is increased by increasing k1
by 20% then the v1 curve moves up. The left intersection point indicates
how much the steady state concentration moves as a result, shown by ΔS.
The closer the steady state point is to the saturated part of the curve, the
more the steady state will move. This shows that the response in S1 can
be very sensitive to changes in k1. Because k1 is a linear term in the rate
law we could replace it with the concentration of the enzyme implied in
the Michaelis-Menten law. In practice such a cycle could represent a
phosphorylation/dephosphorylation cycle where the implied enzyme is now a
kinase. The kinase in turn could be controlled by other processes so that
changes in the kinase activity result in sigmoid (or switch-like) behavior
in the cycle dynamics. In the literature such behavior is termed
ultrasensitivity [20, 21] and has been observed experimentally [26].
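The sigmoid response can be reproduced numerically with a few lines of Python. The sketch below finds the steady state of the cycle by bisection, using the conservation law S1 + S2 = T to eliminate S1. The function name steady_state_s2 is hypothetical; the default parameter values are those listed in the caption of Figure 6.13, with T = 10 assuming that S2 starts at zero.

```python
def steady_state_s2(k1, k2=0.4, km1=0.5, km2=0.5, total=10.0):
    """Steady state S2 of the Michaelis-Menten cycle, found by bisection."""
    def net_rate(s2):
        s1 = total - s2  # conservation law S1 + S2 = T
        return k1 * s1 / (km1 + s1) - k2 * s2 / (km2 + s2)

    # net_rate is positive at s2 = 0 and negative at s2 = total
    lo, hi = 0.0, total
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if net_rate(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for k1 in (0.1, 0.3, 0.4, 0.5, 0.8):
    print(k1, round(steady_state_s2(k1), 3))
```

With these values most of the change in S2 occurs in a narrow band of k1 around k2 = 0.4, where the two saturated rate curves cross; by symmetry S2 equals T/2 exactly when k1 = k2.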
Dual Cycle
We can also consider double cycles such as the one shown in Figure 6.15.
We can write out the stoichiometry matrix for the double cycle as:
\[ N = \begin{bmatrix} -1 & 1 & 0 & 0 \\ 1 & -1 & -1 & 1 \\ 0 & 0 & 1 & -1 \end{bmatrix} \]
From this it is possible to show that there is one conservation law given by
the relation:
\[ S_1 + S_2 + S_3 = T \]
[Plot: reaction rates v1 and v2 (vertical axis) versus S1 (horizontal axis); the v1 curve is drawn for k1 and for k1 + 20% k1, and the resulting shift in the steady state is indicated.]
Figure 6.14: Plots the two cycle rates, v1 and v2, for the simple cycle
with non-linear kinetics. Model: S1 -> S2; k1*S1/(Km1+S1);
S2 -> S1; k2*S2/(Km2+S2); S1=1; k1=1; Km1=0.05; k2=1;
Km2=0.05. The intersection points marked by a grey marker represent
the steady state points (v1 = v2). See main text for explanation.
If we assume simple linear mass-action kinetics for each of the reactions,
simulation will reveal that the concentration of S3 shows sigmoid behavior
with respect to the stimulus signal S . We can assume that the stimulus
signal, S, operates on the rate constants, k1 and k3 by the same factor,
that is an increase in S by x% results in a change in k1 and k3 by x%.
What is of interest is that we no longer need non-linear kinetics to generate
sigmoidal behavior but can instead rely on only a small increase in the
complexity of the conservation laws.
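The origin of the sigmoid response can be seen from a short steady state calculation. The symbol $\sigma$ is introduced here, as an assumption for illustration, to denote the common factor by which the stimulus scales k1 and k3; the rate laws are those given in the caption of Figure 6.15.

```latex
\begin{align*}
\sigma k_1 S_1 &= k_2 S_2, \qquad \sigma k_3 S_2 = k_4 S_3, \qquad S_1 + S_2 + S_3 = T \\
S_3 &= \frac{T\, k_1 k_3\, \sigma^2}{k_2 k_4 + k_1 k_4\, \sigma + k_1 k_3\, \sigma^2}
\end{align*}
```

At low stimulus S3 grows as the square of $\sigma$, and at high stimulus it saturates at T because of the conservation law; the combination gives the sigmoid shape even though every rate law is linear.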
The Markevich Switch
The next example will illustrate a fairly complex set of interlinked
conservation laws that leads to quite elaborate behavior. This system, first
described by Kholodenko and co-workers, will be referred to as the
Markevich Switch after the first author on the original paper [36].
[Diagram: two cycles in sequence, showing S1, S2 and S3, the reactions v1 to v4 and the stimulus S.]
Figure 6.15: Two cycles in sequence. The rate laws for each step are
given by v1 = k1 S1, v2 = k2 S2, v3 = k3 S2, v4 = k4 S3. S is the
stimulus signal which acts by increasing k1 and k3 by the same factor.
The system involves a double cycle but with secondary sequestration effects occurring on the limbs. Figure 6.16 illustrates the full pathway. The
model describes the catalysis of the conversion of S1 through two enzyme
catalyzed reactions, v1 and v2 . The individual catalytic cycles are made
explicit in this model, that is, the binding of S1 to enzyme E1 to form
complex and dissociation to form product, S2 is explicitly modeled. In addition there is the reverse conversion of S3 back to S1 , again by a sequence
of two enzyme catalyzed reactions, v3 and v4 again in explicit form. The
stimulus, S , acts by adding more total E1 to the upper limbs.
This pathway has multiple conservation laws stemming from the two different enzymes and a separate substrate cycle. These conservation laws
include:
\[ S_1 + S_2 + S_3 + ES_1 + ES_2 + ES_3 + ES_4 = T_1 \tag{6.9} \]
\[ E_1 + ES_1 + ES_2 = T_2 \tag{6.10} \]
\[ E_2 + ES_3 + ES_4 = T_3 \tag{6.11} \]
Figure 6.17 illustrates graphically the three conservation laws.
[Diagram: the Markevich pathway, showing S1, S2 and S3, the enzymes E1 and E2, the complexes ES1 to ES4, the reactions v1 to v4 and the stimulus S.]
Figure 6.16: A complex interlinked set of conserved cycles that describes the Markevich switch [36]. S controls the activity of the pathway by controlling the amount of total E1.
The behavior shown by the pathway is called bistability. That is, given a
particular set of parameters, there exist three possible steady states, two
stable and one unstable (sometimes called metastable). We can see this
depicted in the steady state plot (Figure 6.18) that shows the concentration
of S3 versus total E1 (E1 + ES1 + ES2, see equation 6.10). Over a certain
range of total E1, the curve shows three possible steady states: a high
stable state, a low stable state and an intermediate unstable state (thin
line in the graph). In principle the unstable state could be achieved and
maintained indefinitely, but random fluctuations at the molecular level
would move the network to one of the two stable steady states. The question
is how does this come about?
A major part of the answer lies in the constraints imposed by the
conservation laws. Consider the following scenario. If the activity of the two
forward limbs, v1 and v2, is increased, this will cause more S2 and S3 to
be made. These changes have a number of consequences. To begin with,
the additional S3 will bind more E2 to form the complex ES3. However,
because ES3 is linked by way of a conservation law (6.11) to the levels
of ES4 and E2, these concentrations will decline. This effectively makes
S3 compete with S2 for E2. The result is that there is less E2 to catalyze
v4, resulting in an effective inhibition of v4 by S3. This kind of
inhibition has been called apparent regulation because there is no direct
molecular mechanism involved; it is simply an effect brought about by
competitive sequestration. There are other factors in play here as well,
for example the degree of saturation (see [36] for details); however, the
constraints imposed by the conservation laws are critical to the observed
bistability.
[Diagram: three panels, numbered 1 to 3, each showing the Markevich pathway.]
Figure 6.17
Given that S2 and S3 have both increased, S1 is likely to have decreased
(6.9). If this is the case then there is less binding of S1 to E1.
This results in a greater availability of E1 which can be used to increase
v2. If we invert the logic here then we see that increases in S1 will lead
to decreases in v2. This is another example of apparent regulation due to
conservation law constraints, in this case equation (6.10). We can therefore
redraw the pathway in a more simplified way as depicted in Figure 6.19.
We can simplify this diagram even further by removing the central link,
S2, to give the diagram shown in Figure 6.20. This shows more clearly the
opposing repression loops that surround the pathway.
[Plot: steady state concentration of S3 (vertical axis) versus total E1 (horizontal axis); the two turning points are labeled SN.]
Figure 6.18: Bifurcation plot illustrating bistability in the concentration
of S3 as a function of E1. The symbol SN indicates a turning point,
i.e. a change in stability. Thick lines represent stable branches and the
thinner central line an unstable branch. Simulations were carried out by
the Oscill8 Tool (oscill8.sf.net); the model was obtained from [45]
as an SBML file via the BioModels Database
(http://www.ebi.ac.uk/biomodels-main/).
In essence, what we have here is a toggle switch. Consider the states that
can possibly exist in the pathway shown in Figure 6.20. If the concentration
of S1 is low then this relieves the inhibition on the forward limb, thus
converting S1 into S2 and maintaining S1 in the low state. S2 is now at a
higher concentration and its effect is to repress the lower limb.
This state of affairs is therefore stable. If on the other hand we start S1
at a high concentration, the reverse logic applies. The forward limb is now
repressed, thus stabilizing S1 at its high state. In contrast, S2 must now be
at a low concentration where the repression it applies to the lower limb is
released, thus stabilizing its low level.
[Diagram: S1, S2 and S3 with the reactions v1 to v4.]
Figure 6.19: Two apparent regulatory loops in the Markevich pathway.
[Diagram: S1 and S3.]
Figure 6.20: A highly simplified version of the Markevich pathway
showing the opposing repression loops that surround the pathway.
Sequestration Based Ultrasensitivity
To illustrate one last example where conservation laws contribute to new
behavior, we will look at a very simple linear pathway where there is a
dead-end leak caused by complex formation. The ultrasensitivity observed
in response to a change in the stimulus signal originates from a
combination of kinetic and conservation factors. Sigmoid behavior can be
observed in both the free species, X, and the complex, XI. Saturation
in the level of XI is due to a conservation law involving the I moiety.
To achieve a saturating effect in X, the second step, v2, should be modeled
using a Michaelis-Menten rate law (itself based on a conservation law
between free enzyme and enzyme substrate complex) and the first step, v1,
should be reversible to ensure that a steady state exists at high stimulus
levels (X would go to infinity otherwise). Figure 6.22 shows an example
simulation that illustrates ultrasensitivity in a simple sequestration model.
[Diagram: simple sequestration model, showing X, Inh and XI with the reactions v1 to v4.]
Figure 6.21: Simple Sequestration Model
6.6 Advanced Theory
In this section we will look at further aspects of conservation law analysis
using a more formal approach. In a later section we will also consider
more advanced numerical methods for computing conservation laws.
Let us begin by assuming that the rows of the stoichiometry matrix have
been arranged so that the top $m_0$ rows are the independent rows and
the bottom $m - m_0$ rows the dependent rows. If we designate the top
rows with the symbol $N_R$ and the bottom rows by $N_0$ we can write the
stoichiometry matrix as:
\[ N = \begin{bmatrix} N_R \\ N_0 \end{bmatrix} \]
where the submatrix $N_R$ is full rank, and each row of the submatrix $N_0$
is a linear combination of the rows of $N_R$. We can also reorder the
columns of the stoichiometry matrix, of which there will also be $m_0$
independent columns (column and row ranks are equal). We will denote the
partition of $N$ that contains the last $m_0$ columns the $N_C$ matrix.
Finally we will designate the partition of $N$ that includes only the
independent rows and columns the $N_{RC}$ matrix. The $N_{RC}$ matrix will
be an $m_0 \times m_0$ square matrix, and it must be invertible because
all its rows and columns are independent. The graphical depiction of this
partitioning is given in Figure ??.
If there are no conserved cycles in the network, then rank($N$) = $m$
(i.e. full rank) and $N$ equals $N_R$. Following Reder [48], Ehlde [14] and
Hofmeyr [24], we make the following construction. Since the rows of
$N_0$ are linear combinations of the rows of $N_R$ we can define a link-zero
matrix, $L_0$, which satisfies
\[ N_0 = L_0 N_R. \tag{6.12} \]
[Plot: concentrations of XI and I (vertical axis: Concentration) versus Stimulus (horizontal axis).]
Figure 6.22: Ultrasensitivity by Simple Sequestration: Xo -> X;
stimulus*(k11*Xo - k12*X); X ->; k2*X/(X + Km); X +
Inh -> XI; k3*X*Inh - k4*XI; Xo=1; k11=0.1; k12=0.5;
k2=1; k3=0.5; k4=0.1; Inh=1; Km1=0.001. Xo is fixed.
$L_0$ will have dimensions $(m - m_0) \times m_0$. We can combine $L_0$ with
the identity matrix, of dimension rank($N$), to form the $m \times m_0$ link
matrix, $L$, thus:
\[ L = \begin{bmatrix} I \\ L_0 \end{bmatrix} \]
When $N$ has full rank, that is for networks without conserved moieties,
$L$ equals the identity matrix. Using equation (6.12) and the link matrix we
can write:
\[ N = \begin{bmatrix} N_R \\ N_0 \end{bmatrix} = \begin{bmatrix} I \\ L_0 \end{bmatrix} N_R = L N_R \]
If we delete the dependent columns of $N$ and $N_R$ we obtain:
\[ N_C = L N_{RC} \quad \text{or} \quad L = N_C N_{RC}^{-1} \]
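The relation between the link matrix and the column partitions can be checked on the small example from Figure 6.10. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the first two columns are taken as the independent ones (an arbitrary choice made here; any two independent columns would serve), and the 2 by 2 inverse is written out by hand using exact fractions.

```python
from fractions import Fraction


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inverse_2x2(mat):
    # Closed-form inverse of a 2 x 2 matrix, exact arithmetic
    (a, b), (c, d) = mat
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


# Stoichiometry matrix from Figure 6.10; the top m0 = 2 rows are independent
N = [[1, 0, -1], [0, -1, 1], [-1, 1, 0], [0, 1, -1]]
m0 = 2
independent_cols = [0, 1]  # assumption: columns 1 and 2 are independent

N_C = [[row[j] for j in independent_cols] for row in N]  # m x m0
N_RC = N_C[:m0]                                          # m0 x m0
L = matmul(N_C, inverse_2x2(N_RC))                       # link matrix
L0 = L[m0:]                                              # link-zero matrix

print([[int(x) for x in row] for row in L])
```

The top block of L is the identity and the bottom block is L0. Multiplying L by the top two rows of N recovers the full stoichiometry matrix, as the relation N = L N_R requires.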
[Diagram: the stoichiometry matrix N partitioned row-wise into N_R and N_0, with the column partitions N_RC and N_C; dimension labels m, m0, n and n0.]
Figure 6.23: Partitioning of the Stoichiometry Matrix into Four Fundamental Partitions.
By partitioning the stoichiometry matrix into a dependent and independent
set we also partition the system equation. The full system equation which
describes the dynamics of the network is thus:
\[ \frac{dS}{dt} = \begin{bmatrix} I \\ L_0 \end{bmatrix} N_R v = \begin{bmatrix} dS_i/dt \\ dS_d/dt \end{bmatrix} \]
where the terms $dS_i/dt$ and $dS_d/dt$ refer to the independent and
dependent rates of change respectively. From the above equation, we see that
\[ \frac{dS_d}{dt} = L_0 \frac{dS_i}{dt}. \]
Integrating this last equation, we find
\[ S_d(t) - S_d(0) = L_0 \left[ S_i(t) - S_i(0) \right] \]
for all time $t$. Introducing the constant vector $T = S_d(0) - L_0 S_i(0)$, we
can write the above equation as
\[ \begin{bmatrix} -L_0 & I \end{bmatrix} \begin{bmatrix} S_i \\ S_d \end{bmatrix} = T \tag{6.13} \]
Recalling that $S = (S_i, S_d)$, we can introduce $\Gamma = [-L_0 \;\; I]$, and
write this concisely as
\[ \Gamma S = T \]
We will call $\Gamma$ the conservation matrix; it is equivalent to the $Y$
matrix in equation 6.6. Each row of the conservation matrix relates to a
particular conserved cycle and thus the number of rows indicates the number
of conserved cycles in the network. The elements in a particular row indicate
which metabolite species contribute to a particular cycle.
The relationship $N_0 = L_0 N_R$ can be reexpressed in the following form:
\[ \begin{bmatrix} -L_0 & I \end{bmatrix} \begin{bmatrix} N_R \\ N_0 \end{bmatrix} = 0 \tag{6.14} \]
However, since the conservation matrix is $\Gamma = [-L_0 \;\; I]$, the above
relation can be rewritten as $\Gamma N = 0$. Taking the transpose of this gives us
\[ N^T \Gamma^T = 0 \tag{6.15} \]
We have already seen this equation in a previous section (6.3); it tells us
that the conservation matrix is the null space of the transpose of the
stoichiometry matrix. An equivalent way to state this is that the conservation
matrix is the left null space of the stoichiometry matrix ($\Gamma N = 0$).
The significance of equation (6.15) is that there are many software tools
that allow one to compute the null space very easily. For example Matlab,
Mathematica, Maple, O-Matrix, Jarnac or Scilab can easily compute
the null space of a matrix and thus derive the conservation laws. Some
of these tools however, for example Scilab and Matlab, return an orthonormal
rather than a rational basis for the null space so that a second stage is
required, but this is easily accomplished with the command rref. Matlab has
a variant on the null command, null (A, 'r'), which generates what is called
a rational basis. In Scilab one would enter, cm = rref (kernel (N')'). The
final transpose that is applied is simply to reorient the conservation matrix
for better viewing. In Jarnac one would enter, cm = tr (ns (tr (N)))
and so on. One advantage to using Jarnac is that matrices are labeled with
the reaction and species names, which allows the conservation matrix to
be easily interpreted without having to manually identify the columns. In
addition Jarnac can generate a labeled stoichiometry matrix directly from a
model expressed in standard SBML.
Returning once again to the network shown in Figure 6.9, equation (6.13)
can be rearranged so that the dependent species can be computed from the
independent species, that is:
\[ S_d = L_0 S_i + T \tag{6.16} \]
The complete set of equations for this model, using equation (6.16), is
therefore:
\[ \begin{bmatrix} S_1 \\ E \end{bmatrix} = \begin{bmatrix} -1 & -1 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} S_2 \\ ES \end{bmatrix} + \begin{bmatrix} T_1 \\ T_2 \end{bmatrix} \]
\[ \begin{bmatrix} dS_2/dt \\ dES/dt \end{bmatrix} = \begin{bmatrix} 1 & 0 & -1 \\ 0 & -1 & 1 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} \tag{6.17} \]
Note that even though there appear to be four variables in this system,
there are in fact only two independent variables, {S2, ES}, and thus two
differential equations and two linear constraints. When solving the system
in time, only two differential equations need to be explicitly integrated.
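The idea of integrating only the independent species can be sketched as follows. This is an illustration under several assumptions that are not in the text: the species order (S2, ES, S1, E) and the reaction directions are read off the stoichiometry matrix of Figure 6.10, the mass-action rate laws, rate constants and initial conditions are hypothetical, and a simple fixed-step Euler scheme stands in for a proper ODE solver.

```python
def simulate(steps=2000, dt=0.01, k1=1.0, k2=0.5, k3=2.0):
    """Integrate dS2/dt and dES/dt only; recover S1 and E from Sd = L0*Si + T."""
    # Hypothetical initial conditions
    s2, es, s1, e = 0.0, 0.0, 1.0, 0.5
    L0 = [[-1.0, -1.0], [0.0, -1.0]]
    # Conserved totals: T = Sd(0) - L0 * Si(0)
    t1 = s1 - (L0[0][0] * s2 + L0[0][1] * es)
    t2 = e - (L0[1][0] * s2 + L0[1][1] * es)
    for _ in range(steps):
        # Assumed mass-action rate laws
        v1 = k1 * s1        # S1 -> S2
        v2 = k2 * es        # ES -> S1 + E
        v3 = k3 * s2 * e    # S2 + E -> ES
        # Only the two independent species are integrated
        s2 += dt * (v1 - v3)
        es += dt * (-v2 + v3)
        # Dependent species follow from the conservation laws
        s1 = L0[0][0] * s2 + L0[0][1] * es + t1
        e = L0[1][0] * s2 + L0[1][1] * es + t2
    return s2, es, s1, e, t1, t2


s2, es, s1, e, t1, t2 = simulate()
print(round(s1 + s2 + es, 6), round(e + es, 6))
```

Because the dependent species are computed algebraically at every step, the two totals cannot drift, which is one practical benefit of the reduced formulation.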
Scaled L
In metabolic control analysis [30, 48, 17] the link matrix, $L$, plays a
central role in formulating the sensitivities. In such cases the scaled
version of $L$, denoted $\mathcal{L}$, is often used.
$\mathcal{L}$ is defined as:
\[ \mathcal{L} = (D^{s})^{-1} L D^{S_I} \]
where $(D^{s})^{-1}$ is a diagonal matrix of the reciprocals of the species
and $D^{S_I}$ is a diagonal matrix of the independent species. For the
previous example, $\mathcal{L}$ would be given by:
\[ \mathcal{L} =
\begin{bmatrix} 1/S_2 & 0 & 0 & 0 \\ 0 & 1/ES & 0 & 0 \\ 0 & 0 & 1/S_1 & 0 \\ 0 & 0 & 0 & 1/E \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ -1 & -1 \\ 0 & -1 \end{bmatrix}
\begin{bmatrix} S_2 & 0 \\ 0 & ES \end{bmatrix} \]
6.7 Numerical Methods
In section 6.3, a simple method based on forward elimination
was described that could be used to derive the conservation laws. This
method has a number of advantages but for large matrices can be numerically
unstable. In this section we will review alternative methods that,
although not always as flexible as forward elimination, are well
suited for the analysis of large matrices.
These methods fall into two groups, three methods based on QR factorization and one method based on Singular Value Decomposition (SVD). The
method based on SVD is the simplest and will be described first.
SVD
Singular Value Decomposition, or SVD, is a very useful method for
decomposing a matrix so as to obtain orthonormal bases for its four
fundamental subspaces. These subspaces include the range and null space of
the matrix and of its transpose. SVD is based on the following factorization:
\[ A = U S V^T \]
where $A$ is an $m \times n$ matrix of real numbers, $U$ is an $m \times m$
orthogonal matrix, $V$ is an $n \times n$ orthogonal matrix and $S$ an
$m \times n$ diagonal matrix with entries
$\sigma_1 \ge \sigma_2 \ge \ldots \ge \sigma_p$ where $p$ is either $m$ or $n$,
whichever is the smallest ($p = \min\{m, n\}$). The numbers $\sigma_i$ are
called the singular values and are non-negative. The columns of $U$ and $V$
form the left and right-hand singular vectors.
Of more interest here is the fact that the rows of $V^T$ which do not
correspond to a non-zero singular value of $A$ form an orthonormal basis for
the null space of $A$. Therefore one way to obtain the null space of a given
matrix is to extract these lower rows from the $V^T$ matrix. The number of
rows in $V^T$ that correspond to the null space vectors will equal $n - r$
where $r$ is the rank and $n$ the number of columns of $A$. If $r = n$ then
the null space contains only the zero vector.
Example 6.5
Obtain an estimate for the null space of the transpose of the following stoichiometry matrix using SVD. Since we will be working on the transpose, the null space
vectors will represent the conservation laws.
\[ N = \begin{bmatrix} 1 & 0 & -1 \\ -1 & 1 & 0 \\ 0 & 1 & -1 \\ 0 & -1 & 1 \end{bmatrix} \]
Many math applications such as Scilab or Matlab have svd functions. Here we
will use the svd function from Scilab.
-->[U, S, V] = svd (N')
 V  =
  - 0.316229  - 0.707107    0.632456    0.
  - 0.316229    0.707107    0.632456    0.
  - 0.632456    0.        - 0.316229    0.707107
    0.632456    0.          0.316229    0.707107
 S  =
    2.236068    0.           0.           0.
    0.          1.7320508    0.           0.
    0.          0.           1.587D-16    0.
 U  =
  - 1.886D-16  - 0.8164966    0.5773503
  - 0.7071068    0.4082483    0.5773503
    0.7071068    0.4082483    0.5773503
We can extract the null space from $V^T$. The $S$ matrix has two non-zero
singular values (the third, 1.587D-16, is zero to within rounding error), so
the rank $r$ is two and $n - r = 4 - 2 = 2$. We must therefore extract the
bottom two rows of $V^T$. This gives us:
-->Vt = V';
-->kk = Vt(3:4,1:4)
 kk  =
    0.6324555    0.6324555  - 0.3162278    0.3162278
    0.           0.           0.7071068    0.7071068
SVD returns an orthonormal basis; to generate a rational basis we apply row reduction to these two rows to yield:
-->rref (kk)
 ans  =
    1.    1.    0.    1.
    0.    0.    1.    1.
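The transpose of these two vectors is the null space of $N^T$. This can be
confirmed by computing the product of $N^T$ with its null space and showing
that the product equals zero:
\[ \begin{bmatrix} 1 & -1 & 0 & 0 \\ 0 & 1 & 1 & -1 \\ -1 & 0 & -1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 1 & 1 \end{bmatrix} =
\begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} \]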
Because there are no row or column exchanges during SVD, the rows in the null
space vectors correspond to the same rows in the original matrix, N . This makes it
easy to identify the individual conservation entries in the conservation law vectors.
We can formalize the SVD algorithm using the following Scilab/Matlab
code.
// Use SVD to estimate conservation laws
// Operate on the transpose of n
[u, s, v] = svd (n');
vt = v';
nRows = size(vt, 1);
nCols = size(vt, 2);
// The rank equals the number of non-zero singular values
r = rank (n);
// Extract the bottom nRows-r orthonormal rows of vt
orthogns = vt(r+1:nRows,1:nCols);
// Row reduce, then transpose, to get a rational basis
ratns = rref (orthogns)';
// Display Result
ratns'
// Confirm it is the null space, ns should equal 0
ns = n'*ratns
Since there are no column or row exchanges during SVD, the order of
the rows in the stoichiometry matrix can be used to influence the form of
the final conservation laws. Just as with the row reduction technique, the
rows of the stoichiometry matrix should be ordered such that any shared
species (i.e. species containing more than one moiety) are located as close
to the bottom of the matrix as possible. This will ensure that negative terms
will tend not to appear in the final conservation equations.
QR Factorization
The SVD method given in the last section is an excellent choice for
determining the conservation laws. However, it has two downsides. The first
is that it is far more computationally intensive than the simple row reduction
technique described in 6.3. The second is the need to carry out a final
Gauss-Jordan elimination to obtain a rational basis for the conservation laws.
Depending on the size of the stoichiometry matrix, Gauss-Jordan elimination
can be numerically unstable.
Methods based on QR factorization have excellent stability properties and
are less computationally intensive than SVD.
The first QR method to describe is based on computing $L_0$. Any $m \times n$
matrix $A$ can be factored, with the help of a permutation matrix $P$, into a
product of two matrices $Q$ and $R$:
\[ A P = Q R \]
$Q$ is an $m \times m$ orthogonal matrix, that is $Q^T Q = I$, $R$ is an
$m \times n$ upper trapezoidal matrix and $P$ a permutation matrix. If $A$ is
the transpose of the stoichiometry matrix, $N^T$, then the permutation matrix
will also reorder the columns of $N^T$ such that the independent columns are
on the left and the dependent columns on the right. This is equivalent to
reordering the rows in $N$. This partitioning can be written as follows,
where $R$ has been partitioned to match the left side:
\[ Q^T \begin{bmatrix} N_R^T & N_0^T \end{bmatrix} = \begin{bmatrix} R_{11} & R_{12} \\ 0 & 0 \end{bmatrix} \]
Note that the permutation matrix has been absorbed into the reordered $N^T$
matrix. If we multiply out the terms we obtain:
\[ \begin{bmatrix} R_{11} \\ 0 \end{bmatrix} = Q^T N_R^T, \qquad \begin{bmatrix} R_{12} \\ 0 \end{bmatrix} = Q^T N_0^T \]
Given that $N_0 = L_0 N_R$, and therefore $N_0^T = N_R^T L_0^T$, the second
relation can be rewritten as:
\[ \begin{bmatrix} R_{12} \\ 0 \end{bmatrix} = Q^T N_R^T L_0^T \]
so that
\[ \begin{bmatrix} R_{12} \\ 0 \end{bmatrix} = \begin{bmatrix} R_{11} \\ 0 \end{bmatrix} L_0^T \]
That is
\[ R_{12} = R_{11} L_0^T \]
Since the permutation matrix post-multiplies $N^T$, the columns are
reordered; this is reflected in the column ordering of the $R$ matrix, in
which all independent columns are moved to the left and dependent columns
to the right. Row reduction of the $R$ matrix to a reduced echelon form will
therefore result in the left partition being transformed into the identity
matrix, that is $R_{11} = I$. From this it follows that the reduced right
partition is $R_{12} = L_0^T$, which is the result we seek:
\[ L_0 = R_{12}^T \]
By augmenting the $L_0$ matrix with an appropriately sized identity matrix
we can use this method to generate conservation laws in the standard
form, that is in the form $[-L_0 \;\; I]$. Note that the rows of the
stoichiometry matrix will have been reordered in the process, as determined
by the permutation matrix obtained from the QR factorization.
Therefore, unlike the row reduction technique or SVD, it is not possible
to greatly influence the kind of conservation laws generated by presetting
the row order of the stoichiometry matrix, although some flexibility still
exists. It is still advantageous to make sure that all the shared species are
in the bottom rows. The one potential problem with the method is the final
Gauss-Jordan elimination; however, the reordering of the columns will
make this less of an issue.
The Scilab/Matlab code below illustrates an implementation of this method.
It is very important to note that the species labels attached to the columns
of the conservation matrix are determined by the permutation matrix. This
part of the calculation is not shown in the following code.
// Use QR to estimate conservation laws via Lo
// Operate on the transpose of n
[qm, rm, p] = qr (n');
nRows = size(n, 1);
nCols = size(n, 2);
mo = rank (n);
m = size(n, 1);
mmo = m - mo;
// Extract the top mo (non-zero) rows of the R matrix
rt = rm(1:mo,1:nRows);
// Row reduce so that the left partition becomes the identity
rrt = rref (rt);
// The right partition is the transpose of Lo
Lo = rrt(1:mo,mo+1:nRows)';
// Display Lo
Lo
// Construct the conservation vectors and display
cm = [-Lo eye(mmo,mmo)];
cm
Example 6.6
Compute the $L_0$ matrix of the following stoichiometry matrix using QR factorization.
\[ N = \begin{bmatrix} 1 & 0 & -1 \\ -1 & 1 & 0 \\ 0 & 1 & -1 \\ 0 & -1 & 1 \end{bmatrix} \]
Many software tools offer standard QR factorization. In this example we use
Scilab.
QR factorization yields the following R matrix:
 R  =
    1.414217  - 0.707107  - 1.414217  - 0.707107
    0.          1.224745    0.        - 1.224745
    0.          0.          0.          0.
Since the rank of the stoichiometry matrix is 2, we extract the top two rows from
R and carry out a complete row reduction (for example by using the rref()
function) to yield:
 ans  =
    1.    0.  - 1.  - 1.
    0.    1.    0.  - 1.
The transpose of the $L_0$ matrix can be found in the top right corner starting at
column $m_0 + 1$ where $m_0$ equals the number of independent rows in the original
stoichiometry matrix. In this case $m_0$ equals 2, therefore the $L_0$ matrix (after
transposition) is given by:
  - 1.    0.
  - 1.  - 1.
We now combine the negative of this with the identity matrix to obtain the conservation vectors:
    1.    0.    1.    0.
    1.    1.    0.    1.
The only thing that remains is the species labeling for the conservation columns.
These can be obtained from the original stoichiometry matrix and the permutation
matrix, P. As returned by the QR factorization, P is given by:
 P  =
    1.    0.    0.    0.
    0.    0.    1.    0.
    0.    1.    0.    0.
    0.    0.    0.    1.
and the original species order was ES, E, S1, S2. The permutation matrix shows
that the new species order should be: ES, S1, E, S2.
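The labeling step can be sketched in a few lines of Python. This is only an illustration: the helper names permuted_labels and format_laws are hypothetical, and the sketch assumes the convention A P = Q R, so that column j of P has its 1 in the row of the original species that moves to position j. The conservation vectors used are those implied by the row-reduced matrix of Example 6.6.

```python
def permuted_labels(labels, p):
    """Reorder species labels using the permutation matrix p (A*p = Q*R)."""
    n = len(labels)
    order = []
    for j in range(n):
        # Row index of the single 1 in column j gives the original position
        i = next(row for row in range(n) if p[row][j] == 1)
        order.append(labels[i])
    return order

def format_laws(cm, labels):
    """Write each conservation vector as a labeled sum of species."""
    laws = []
    for row in cm:
        terms = []
        for coeff, name in zip(row, labels):
            if coeff == 0:
                continue
            terms.append(name if coeff == 1 else f"{coeff}*{name}")
        laws.append(" + ".join(terms))
    return laws

species = ["ES", "E", "S1", "S2"]          # original species order
P = [[1, 0, 0, 0],
     [0, 0, 1, 0],
     [0, 1, 0, 0],
     [0, 0, 0, 1]]
cm = [[1, 0, 1, 0],                        # conservation vectors [-L0 I]
      [1, 1, 0, 1]]

new_order = permuted_labels(species, P)
laws = format_laws(cm, new_order)
print(new_order)
print(laws)
```

With this labeling the two conservation vectors read as the total enzyme (ES + E) and the total substrate pool (ES + S1 + S2).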
The final QR method to consider is one based on rank revealing methods,
sometimes called RRQR [7]. The algebra is described in a separate chapter, but the method uses the following formula to estimate the null space:
\[
A P \begin{bmatrix} -R_{11}^{-1} R_{12} \\ I \end{bmatrix} = 0 \tag{6.18}
\]
This approach is of interest because the identity matrix in the lower partition
means that it generates a rational basis for the null space. The downside
is that it requires an inversion of R11, but since R11 is triangular it is possible to exploit widely available and efficient routines for inverting such
matrices.
Example 6.7
Use the RRQR based method to compute the null space for the transpose of the
stoichiometry matrix:
\[
N = \begin{bmatrix} 0 & 1 & -1 \\ 0 & -1 & 1 \\ 1 & -1 & 0 \\ -1 & 0 & 1 \end{bmatrix}
\]
From the last example we saw that QR factorization yielded the following R
matrix:
 R  =
    1.414217  -0.707107  -1.414217  -0.707107
    0.         1.224745   0.        -1.224745
    0.         0.         0.         0.
Since the rank of the stoichiometry matrix is 2, we can partition R into the following submatrices:
\[
R_{11} = \begin{bmatrix} 1.414217 & -0.707107 \\ 0 & 1.224745 \end{bmatrix}
\qquad
R_{12} = \begin{bmatrix} -1.414217 & -0.707107 \\ 0 & -1.224745 \end{bmatrix}
\]
We now compute the negative of R11^{-1} R12 to obtain:
\[
-R_{11}^{-1} R_{12} = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}
\]
Combining this with an appropriately sized identity matrix gives the null space:
\[
\begin{bmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}
\]
Like the previous method we need to be aware of the permutation matrix as this
will determine the labels that are associated with the rows of the null space.
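Because R11 is upper triangular, the product of its inverse with R12 never requires an explicit inverse; back substitution is enough. The Python sketch below illustrates this on the R matrix quoted in Example 6.7. The function names back_substitute and rrqr_null_space are hypothetical, and the rank is passed in rather than detected.

```python
def back_substitute(u, b):
    """Solve U x = b for an upper triangular matrix U."""
    n = len(u)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(u[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / u[i][i]
    return x

def rrqr_null_space(r, rank):
    """Null space basis [-R11^{-1} R12; I] from the R factor (rows of the result
    are in the permuted column order)."""
    ncols = len(r[0])
    r11 = [row[:rank] for row in r[:rank]]
    # One triangular solve per column of R12: R11 x = -R12[:, j]
    top_cols = [back_substitute(r11, [-r[i][j] for i in range(rank)])
                for j in range(rank, ncols)]
    basis = [[col[i] for col in top_cols] for i in range(rank)]
    for i in range(ncols - rank):
        basis.append([1.0 if i == j else 0.0 for j in range(ncols - rank)])
    return basis

R = [[1.414217, -0.707107, -1.414217, -0.707107],
     [0.0,       1.224745,  0.0,      -1.224745],
     [0.0,       0.0,       0.0,       0.0]]

null_space = rrqr_null_space(R, rank=2)
print([[round(x, 3) for x in row] for row in null_space])
```

Each column of the result is one conservation vector, with rows ordered according to the permutation returned by the factorization.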
There are also ways to obtain the conservation vectors via the Q matrix
and these are discussed in [63].
For a completely different approach to computing the conservation laws,
the reader is referred to the work by Schuster and colleagues. In this work,
convex analysis [57] is used to determine the conservation laws, primarily
to generate conservation laws that contain (where possible) only
positive entries.
Most modern simulation applications use either the simpler row reduction
technique or, more commonly in recent years, the QR factorization
technique based on estimating the L0 matrix [63].
Method          Advantages             Disadvantages
Row Reduction   a) Simple              Potential numerical instabilities
                b) Fast
                c) Row order
SVD             a) Robust              Expensive on large systems
                                       Requires one final Gauss-Jordan step
QR by L0        a) Robust              Requires one final Gauss-Jordan step
                b) Faster than SVD
QR by RRQR      a) Robust              No Gauss-Jordan step required
                b) Row order
Table 6.1: Comparison of different approaches to computing conservation laws.
6.8 Design of Simulation Software
One practical implication of moiety conservation concerns the design of
software for simulation and analysis. Two issues arise: the first concerns increasing simulation efficiency by reducing the number of differential equations, and the second concerns numerical stability, which is improved by removing the dependent species from a model.
The rule to follow is to make sure that any metabolite likely to appear in
more than one conservation relationship is placed at the beginning
of the DEC statement.
Reduced Systems
The first concern is straightforward: instead of solving the full set of system equations, many simulators solve the following reduced set:
\[
\begin{aligned}
S_d &= L_0 S_i + T \\
\frac{dS_i}{dt} &= N_R \, v(S_i, S_d)
\end{aligned} \tag{6.19}
\]
In these equations, Si is the vector of independent species, Sd the vector
of dependent species, L0 the link matrix, T the total mass vector, NR
the reduced stoichiometry matrix and v the rate vector. This modified
equation (6.19) constitutes the most general expression for a differential
equation based temporal model [24, 23]. Equation 6.6 shows a typical reduced system. Note that in these equations the dependent species are first
computed from the independent species. This is followed by the evaluation
of the reduced set of differential equations. The order is crucial. The total
amounts, T, can be computed at the start of a simulation by using equation 6.16 and the initial conditions. In multi-compartmental systems where
the size of compartments may differ, it is important to sum the amounts, not the
concentrations.
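The evaluation order can be made concrete with a minimal Python sketch for the two-species cycle S1 <-> S2, in which S1 is taken as the independent species and S2 as the dependent one, so that L0 = [-1] and T = S1 + S2. The function name, the rate constants and the use of a fixed-step Euler integrator are all assumptions made for illustration; a production simulator would use an adaptive solver.

```python
def simulate_reduced_cycle(s1_0, s2_0, k1, k2, dt=0.001, steps=20000):
    """Euler integration of the reduced model for the cycle S1 <-> S2."""
    l0 = -1.0                      # link matrix entry: S2 = l0*S1 + T
    total = s2_0 - l0 * s1_0       # T from the initial conditions (S1 + S2)
    s1 = s1_0
    for _ in range(steps):
        s2 = l0 * s1 + total       # step 1: dependent species from independent
        v1 = k2 * s2               # S2 -> S1
        v2 = k1 * s1               # S1 -> S2
        s1 += dt * (v1 - v2)       # step 2: reduced ODE with N_R = [1, -1]
    return s1, l0 * s1 + total, total

s1, s2, total = simulate_reduced_cycle(3.0, 0.0, k1=2.0, k2=1.0)
print(round(s1, 4), round(s2, 4), total)
```

Only one differential equation is integrated, and the conserved total holds by construction rather than being left to the accuracy of the integrator.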
One obvious advantage of reducing the model is that it lessens the computational burden of solving the full set of differential equations. Many
biochemical simulation packages will automatically check for moiety conservations and perform this simplification before performing any analysis
of the system equations. This is especially important for large models.
For example, in the E. coli model obtained from Palsson's web site at
http://gcrg.ucsd.edu/organisms/ecoli.html, approximately five
percent of the differential equations are redundant, that is, they can be
safely eliminated from the model by using moiety conservation constraints.
Multicompartment Systems
Up to now we have not mentioned the fact that many models may include
multiple compartments, that is, separate volume spaces where the movement of mass between volumes is via specific transporter proteins. The
literature is not very clear or extensive in discussing the modeling of multicompartment systems. However, one crucial point to bear in mind when considering conservation laws that cross compartments is that the sum must be
with respect to the total mass. For convenience, models will often assume a
unit volume for a compartment such that any conserved cycles within the
compartment are expressed as the sum of concentrations. In such situations it is easy to forget that what is actually conserved is mass, not
concentration. In general a conservation law is therefore expressed in the
form:
\[
\sum_i V_i S_i = T
\]
where Vi is the volume of the compartment in which species Si resides and Si is its concentration.
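A small Python sketch illustrates why the sum must be taken over amounts. It assumes a hypothetical two-compartment model in which a species is carried from compartment 1 (volume v1) to compartment 2 (volume v2) at a rate, in amount per unit time, of k (S1 - S2); the rate law, the volumes and the function name are illustrative assumptions only.

```python
def simulate_transport(s1, s2, v1, v2, k, dt=0.001, steps=20000):
    """Euler integration of transport between two compartments.

    s1 and s2 are concentrations; the flux is an amount per unit time, so
    each concentration changes by flux divided by its own volume.
    """
    for _ in range(steps):
        flux = k * (s1 - s2)       # amount/time leaving compartment 1
        s1 -= dt * flux / v1
        s2 += dt * flux / v2
    return s1, s2

v1, v2 = 1.0, 3.0
s1_0, s2_0 = 4.0, 0.0
s1, s2 = simulate_transport(s1_0, s2_0, v1, v2, k=1.0)

amount_before = v1 * s1_0 + v2 * s2_0      # conserved quantity
amount_after = v1 * s1 + v2 * s2
conc_sum_before = s1_0 + s2_0              # not conserved when v1 != v2
conc_sum_after = s1 + s2
print(amount_before, round(amount_after, 6))
print(conc_sum_before, round(conc_sum_after, 6))
```

The volume-weighted sum is unchanged by the simulation, whereas the plain sum of concentrations drifts whenever the compartment volumes differ.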
Numerical Stability
Although simplifying a model by eliminating the dependent species can
offer speed improvements to simulations, the most important reason for
model reduction is the gain in numerical stability. One of the most important metrics that arises often in the analysis of pathways (or any dynamical
system for that matter) is the Jacobian matrix.
The Jacobian matrix is an m × m matrix of partial derivatives of the rates
of change with respect to the species, that is:
\[
J = \frac{\partial}{\partial S} \left( \frac{dS}{dt} \right)
\]
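For example, for a simple linear chain such as:
\[
X_o \xrightarrow{\; v_1 = k_1 X_o \;} S_1 \xrightarrow{\; v_2 = k_2 S_1 \;} S_2 \xrightarrow{\; v_3 = k_3 S_2 \;} X_1
\]
the differential equations are given by:
\[
\frac{dS_1}{dt} = v_1 - v_2 \qquad \frac{dS_2}{dt} = v_2 - v_3
\]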
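The Jacobian matrix is then given by
\[
J = \begin{bmatrix}
\dfrac{\partial (v_1 - v_2)}{\partial S_1} & \dfrac{\partial (v_1 - v_2)}{\partial S_2} \\[2ex]
\dfrac{\partial (v_2 - v_3)}{\partial S_1} & \dfrac{\partial (v_2 - v_3)}{\partial S_2}
\end{bmatrix}
=
\begin{bmatrix} -k_2 & 0 \\ k_2 & -k_3 \end{bmatrix}
\]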
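The Jacobian of the linear chain can be checked numerically with forward differences, as in the following Python sketch. The rate constants, the boundary concentration Xo and the function names are assumed values chosen for illustration; because the rate laws are linear, the finite-difference estimate agrees with the analytical entries -k2, 0, k2 and -k3 up to rounding.

```python
def chain_rates(s, k1=1.0, k2=2.0, k3=3.0, xo=1.0):
    """Rates of change dS1/dt and dS2/dt for Xo -> S1 -> S2 -> X1."""
    s1, s2 = s
    v1, v2, v3 = k1 * xo, k2 * s1, k3 * s2
    return [v1 - v2, v2 - v3]

def numerical_jacobian(f, s, h=1e-6):
    """Forward-difference estimate of the Jacobian of f at s."""
    f0 = f(s)
    jac = [[0.0] * len(s) for _ in f0]
    for j in range(len(s)):
        perturbed = list(s)
        perturbed[j] += h
        fj = f(perturbed)
        for i in range(len(f0)):
            jac[i][j] = (fj[i] - f0[i]) / h
    return jac

J = numerical_jacobian(chain_rates, [0.5, 0.25])
print([[round(x, 4) for x in row] for row in J])
```

The lower triangular structure reflects the fact that S2 has no influence on the rate of change of S1 in this pathway.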
The Jacobian is used in many ancillary calculations, for example, solving
differential equations (particularly stiff equations), solving for the steady
state, calculating sensitivities, frequency analysis, certain optimization algorithms and others. In many of these cases the calculation involves the
inversion of the Jacobian. In the case of the linear pathway, there will always be an inverse so long as the rate constants are non-zero. However if
we consider a simple cycle such as the one shown below:
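\[
S_1 \; \underset{v_1 = k_2 S_2}{\overset{v_2 = k_1 S_1}{\rightleftharpoons}} \; S_2
\]
then the Jacobian matrix is given by:
\[
J = \begin{bmatrix}
\dfrac{\partial (v_1 - v_2)}{\partial S_1} & \dfrac{\partial (v_1 - v_2)}{\partial S_2} \\[2ex]
\dfrac{\partial (v_2 - v_1)}{\partial S_1} & \dfrac{\partial (v_2 - v_1)}{\partial S_2}
\end{bmatrix}
=
\begin{bmatrix} -k_1 & k_2 \\ k_1 & -k_2 \end{bmatrix}
\]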
This shows that the row dependencies in the stoichiometry matrix reappear
as dependencies in the Jacobian: the two rows of the Jacobian sum to zero. This means that the Jacobian cannot be
inverted and any calculations that require the inversion of the Jacobian will
fail. The solution is to work with the reduced model. This eliminates the
dependent species from the stoichiometry matrix, which in turn makes sure
that the Jacobian is once again invertible.
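The effect is easy to confirm numerically. The Python sketch below uses assumed rate constants and an assumed total T for the cycle S1 <-> S2; it shows that the full Jacobian has a zero determinant, while the reduced model, in which S2 = T - S1, has the non-zero scalar Jacobian -(k1 + k2) and can be solved for the steady state with a single Newton step.

```python
k1, k2 = 2.0, 1.0      # assumed rate constants
T = 3.0                # assumed total, S1 + S2

# Full model: both species treated as variables
J_full = [[-k1, k2],
          [k1, -k2]]
det_full = J_full[0][0] * J_full[1][1] - J_full[0][1] * J_full[1][0]

# Reduced model: dS1/dt = k2*(T - S1) - k1*S1
def reduced_rate(s1):
    return k2 * (T - s1) - k1 * s1

J_reduced = -(k1 + k2)

# One Newton step from S1 = 0; exact here because the rate is linear in S1
s1_guess = 0.0
s1_steady = s1_guess - reduced_rate(s1_guess) / J_reduced
s2_steady = T - s1_steady

print(det_full, J_reduced, s1_steady, s2_steady)
```

The same Newton step applied to the full model would require the inverse of J_full, which does not exist.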
Exercises
1. The network depicted below has a single conservation law, A + B + C + D = T. Using the row reduction technique described in section 6.3, prove that this conservation law is true.
[Network diagram: species A, B, C, D; reactions v1, v2, v3, v4]
2. Write a simple application in Matlab, Scilab or Octave to derive the
conservation laws for an arbitrary stoichiometry matrix.
3. Using the application written in the previous question, determine the
conservation laws for the following models and confirm that the conservation laws are true.
(a) [Network diagram: species A, B, C, D; reactions v1, v2, v3, v4]
(b) [Network diagram: species A, B, C, D; reactions v1, v2, v3]
(c) [Network diagram: species A, B, C, D; reactions v1, v2, v3, v4]

    v1: A -> B
    v2: B + C -> A + D
    v3: D -> C
4. Carry out a simulation that illustrates the high sensitivity seen in a
simple conserved cycle that uses saturable Michaelis-Menten rate
laws (See Figure 6.13).
Math Practice
1. Row reduce the following matrices to reduced echelon form.
2
1 2 2
41 2 4
1 3 9
3 2
1 2
2
4
5
3 ; 2 4
3
4 6
3 2
3 0
1 1
5
4
2 2 ; 2 4
6 3
3 6
3 2
2 9
1
5
4
3 1 ; 2
0
5 0
1
0
1
3
2
35
1
References
[1] R Albert. Scale-free networks in cell biology. J Cell Sci, 118(Pt
21):4947–4957, Nov 2005.
[2] Eric Alm and Adam P Arkin. Biological networks. Current Opinion
in Structural Biology, 13(2):193 – 202, 2003.
[3] B. M. Bakker, P. A. M. Michels, F. R. Opperdoes, and H. V. Westerhoff. What controls glycolysis in bloodstream form Trypanosoma
brucei? J. Biol. Chem., 274:14551–14559, 1999.
[4] B. M. Bakker, H. V. Westerhoff, F. R. Opperdoes, and P. A. M.
Michels. Metabolic control analysis of glycolysis in trypanosomes as
an approach to improve selectivity and effectiveness of drugs. Mol.
Biochem. Parasitology, 106:1–10, 2000.
[5] A L Barabási and Z N Oltvai. Network biology: understanding the
cell’s functional organization. Nat Rev Genet, 5(2):101–113, Feb
2004.
[6] Thomas W. Binsl, Katharine M Mullen, Ivo H.M. van Stokkum, Jaap
Heringa, and Johannes H.G.M. van Beek. Fluxsimulator: An r package to simulate isotopomer distributions in metabolic networks. Journal of Statistical Software, 18(7):1–17, 1 2007.
[7] T.F. Chan and P.C. Hansen. Some Applications of the Rank Revealing QR Factorization. SIAM Journal on Scientific and Statistical
Computing, 13:727, 1992.
[8] A. Cornish-Bowden and R. Eisenthal. Computer simulation as a tool
for studying metabolism and drug design. In A. Cornish-Bowden
and M. L. Cardenas, editors, Technological and Medical Implications
of Metabolic Control Analysis, pages 165–172. Kluwer Academic
Publishers, Dordrecht, The Netherlands, 2000.
[9] A. Cornish-Bowden, J. Hofmeyr, and M. Cardenas. Stoicheiometric
analysis in studies of metabolism. Biochemical Society Transactions,
30:43–47, 2002.
[10] A. Cornish-Bowden and J.-H. S. Hofmeyr. The role of stoichiometric analysis in studies of metabolism: An example. J. theor. Biol,
216:179–191, 2002.
[11] Marjo de Graauw, editor. Phospho-Proteomics, volume 527 of Methods in Molecular Biology. Humana Press, 2009.
[12] Y. Deville, D. Gilbert, J. van Helden, and S.J. Wodak. An overview
of data models for the analysis of biochemical pathways. Briefings
in Bioinformatics, 4(3):246–259, 2003.
[13] R. C. Dickson and M. D. Mendenhall, editors. Signal Transduction
Protocols, volume 284 of Methods in Molecular Biology. Humana
Press, 2nd edition, 2004.
[14] M. Ehlde and G. Zacchi. A general formalism for metabolic control
analysis. Chemical Engineering Science, 52:2599–2606(8), 1997.
[15] R. Eisenthal and A. Cornish-Bowden. Prospects for antiparasitic
drugs the case of Trypanosoma brucei, the causative agent of African
sleeping sickness. Journal of Biological Chemistry, 273(10):5500–
5505, 1998.
[16] D. A. Fell and J. R. Small. Fat synthesis in adipose tissue: an examination of stoichiometric constraints. Biochem. J., 238:781–786,
1986.
[17] D.A. Fell. Understanding the Control of Metabolism. Portland Press.,
London, 1997.
[18] S Fields and O Song. A novel genetic system to detect protein-protein
interactions. Nature, 340(6230):245–246, 1989.
[19] Anne-Claude Gavin, Patrick Aloy, Paola Grandi, Roland Krause,
Markus Boesche, Martina Marzioch, Christina Rau, Lars Juhl Jensen,
Sonja Bastuck, Birgit Dümpelfeld, Angela Edelmann, Marie-Anne
Heurtier, Verena Hoffman, Christian Hoefert, Karin Klein, Manuela
Hudak, Anne-Marie Michon, Malgorzata Schelder, Markus Schirle,
Marita Remor, Tatjana Rudi, Sean Hooper, Andreas Bauer, Tewis
Bouwmeester, Georg Casari, Gerard Drewes, Gitte Neubauer, Jens M
Rick, Bernhard Kuster, Peer Bork, Robert B Russell, and Giulio
Superti-Furga. Proteome survey reveals modularity of the yeast cell
machinery. Nature, 440(7084):631–636, Mar 2006.
[20] A. Goldbeter and D. E. Koshland. An amplified sensitivity arising
from covalent modification in biological systems. Proc. Natl. Acad.
Sci, 78:6840–6844, 1981.
[21] A. Goldbeter and D. E. Koshland. Ultrasensitivity in biochemical
systems controlled by covalent modification. interplay between zeroorder and multistep effects. J. Biol. Chem., 259:14441–7, 1984.
[22] C.S. Goodyear and G.J. Silverman. Phage-Display Methodology for
the Study of Protein-Protein Interactions: Overview. Cold Spring
Harbor Protocols, 2008(9), 2008.
[23] R. Heinrich and S Schuster. The Regulation of Cellular Systems.
Chapman and Hall, 1996.
[24] J.-H. S. Hofmeyr. Metabolic control analysis in a nutshell. In Proceedings of the Second International Conference on Systems Biology.
Caltech, 2001.
[25] AB Horne, TC Hodgman, HD Spence, and AR Dalby. Constructing an enzyme-centric view of metabolism. Bioinformatics,
20(13):2050–2055, 2004.
[26] C. F. Huang and J. E. Ferrell. Ultrasensitivity in the mitogen-activated
protein kinase cascade. Proc. Natl. Acad. Sci, 93:10078–10083,
1996.
[27] J. L. Ingraham. Growth of the Bacterial Cell. Sinauer Associates Inc,
1983.
[28] T. Ito, T. Chiba, R. Ozawa, M. Yoshida, M. Hattori, and Y. Sakaki.
A comprehensive two-hybrid analysis to explore the yeast protein
interactome. Proc Natl Acad Sci U S A, 98(8):4569–4574, Apr 2001.
[29] H. Jeong, S. P. Mason, A. L. Barabási, and Z. N. Oltvai. Lethality
and centrality in protein networks. Nature, 411(6833):41–42, May
2001.
[30] H. Kacser and J. A. Burns. The control of flux. In D. D. Davies,
editor, Rate Control of Biological Processes, volume 27 of Symp.
Soc. Exp. Biol., pages 65–104. Cambridge University Press, 1973.
[31] P. D. Karp, I. M. Keseler, A. Shearer, M. Latendresse, M. Krummenacker, S. M. Paley, I. Paulsen, J. Collado-Vides, S. GamaCastro, M. Peralta-Gil, A. Santos-Zavaleta, M. I. Peñaloza-Spínola,
C. Bonavides-Martinez, and J. Ingraham. Multidimensional annotation of the Escherichia coli K-12 genome. Nucleic Acids Res,
35(22):7577–7590, 2007.
[32] K J Kauffman, P Prakash, and J S Edwards. Advances in flux balance
analysis. Curr Opin Biotechnol, 14(5):491–496, Oct 2003.
[33] Nevan J Krogan, Gerard Cagney, Haiyuan Yu, Gouqing Zhong,
Xinghua Guo, Alexandr Ignatchenko, Joyce Li, Shuye Pu, Nira
Datta, Aaron P Tikuisis, Thanuja Punna, José M Peregrín-Alvarez,
Michael Shales, Xin Zhang, Michael Davey, Mark D Robinson, Alberto Paccanaro, James E Bray, Anthony Sheung, Bryan Beattie,
Dawn P Richards, Veronica Canadien, Atanas Lalev, Frank Mena,
Peter Wong, Andrei Starostine, Myra M Canete, James Vlasblom,
Samuel Wu, Chris Orsi, Sean R Collins, Shamanta Chandran, Robin
Haw, Jennifer J Rilstone, Kiran Gandi, Natalie J Thompson, Gabe
Musso, Peter St Onge, Shaun Ghanny, Mandy H Y Lam, Gareth
Butland, Amin M Altaf-Ul, Shigehiko Kanaya, Ali Shilatifard, Erin
O’Shea, Jonathan S Weissman, C. James Ingles, Timothy R Hughes,
John Parkinson, Mark Gerstein, Shoshana J Wodak, Andrew Emili,
and Jack F Greenblatt. Global landscape of protein complexes in the
yeast saccharomyces cerevisiae. Nature, 440(7084):637–643, Mar
2006.
[34] Vincent Lacroix, Ludovic Cottret, Patricia Thébault, and
Marie-France Sagot. An introduction to metabolic networks and
their structural analysis. IEEE/ACM Transactions on Computational Biology and Bioinformatics,
5(4):594–617, 2008.
[35] I.G. Libourel and Y. Shachar-Hill. Metabolic Flux Analysis in Plants:
From Intelligent Design to Rational Engineering. Annu Rev Plant
Biol, pages 625–650, Feb 2008.
[36] N. I Markevich, J B Hoek, and B. N. Kholodenko. Signaling switches
and bistability arising from multisite phosphorylation in protein kinase cascades. J. Cell Biol., 164:353–9, 2004.
[37] M. E. J. Newman. The structure and function of complex networks.
SIAM Review, 45(2):167–256, 2003.
[38] M.E.J. Newman. Power laws, Pareto distributions and Zipf’s law.
Contemporary Physics, 46(5):323, 2005.
[39] B. G. Olivier, J. M. Rohwer, and J. H. Hofmeyr. Modelling cellular
systems with pysces. Bioinformatics, 21:560–1, 2005.
[40] B. O. Palsson. Systems Biology: Properties of Reconstructed Networks. Cambridge University Press, 2007.
[41] J. A. Papin, N. D. Price, S. J. Wiback, D. A. Fell, and B. O. Palsson.
Metabolic pathways in the post-genome era. Trends Biochem Sci,
28:250–8, 2003.
[42] D.J.M.J.R. PARK. Positive compositional algorithms in chemical
reaction systems. Computers & chemistry, 12(2):175–188, 1988.
[43] S Petersen, A A de Graaf, L Eggeling, M Möllney, W Wiechert, and
H Sahm. In vivo quantification of parallel and bidirectional fluxes
in the anaplerosis of corynebacterium glutamicum. J Biol Chem,
275(46):35932–35941, Nov 2000.
[44] E. Phizicky, P.I.H. Bastiaens, H. Zhu, M. Snyder, and S. Fields.
Protein analysis on a proteomic scale. Nature, 422(6928):208–215,
2003.
[45] L. Qiao, R.B. Nachbar, I.G. Kevrekidis, S.Y. Shvartsman, and
A. Asthagiri. Bistability and oscillations in the Huang-Ferrell model
of MAPK signaling. PLoS Comput Biol, 3(9):e184, 2007.
[46] Karthik Raman, Preethi Rajagopalan, and Nagasuma Chandra. Flux
balance analysis of mycolic acid pathway: targets for anti-tubercular
drugs. PLoS Comput Biol, 1(5):e46, Oct 2005.
[47] R G Ratcliffe and Y Shachar-Hill. Measuring multiple fluxes through
plant metabolic networks. Plant J, 45(4):490–511, Feb 2006.
[48] C. Reder. Metabolic control theory: A structural approach. J. Theor.
Biol., 135:175–201, 1988.
[49] J. G. Reich and E. E. Selkov. Energy metabolism of the cell. Academic Press, London, 1981.
[50] R Rios-Estepa and B M Lange. Experimental and mathematical approaches to modeling plant metabolic networks. Phytochemistry,
68(16-18):2351–2374, Aug-Sep 2007.
[51] D K Ro, E M Paradise, M Ouellet, K J Fisher, K L Newman, J M
Ndungu, K A Ho, R A Eachus, T S Ham, J Kirby, M C Chang, S T
Withers, Y Shiba, R Sarpong, and J D Keasling. Production of the
antimalarial drug precursor artemisinic acid in engineered yeast. Nature, 440(7086):940–943, Apr 2006.
[52] H. M. Sauro and D. A. Fell. Scamp: A metabolic simulator and control analysis program. Mathl. Comput. Modelling, 15:15–28, 1991.
[53] J. M. Savinell and B. O. Palsson. Network analysis of intermediary
metabolism using linear optimization. i. development of mathematical formalism. J Theor Biol, 154(4):421–454, Feb 1992.
[54] K. Schmidt and S. H. Isaacs. An evolutionary algorithm for initial state
and parameter estimation in complex biochemical models. In Proceedings of the Sixth International Conference on Computer Applications in
Biotechnology, Garmisch-Partenkirchen, Germany, pages 239–242,
1995.
[55] S Schuster, T Dandekar, and D A Fell. Detection of elementary flux
modes in biochemical networks: a promising tool for pathway analysis and metabolic engineering. Trends Biotechnol, 17(2):53–60, Feb
1999.
[56] S. Schuster, D. A. Fell, and T. Dandekar. A general definition of
metabolic pathways useful for systematic organization and analysis
of complex metabolic networks. Nature Biotechnology, 18:326–332,
2000.
[57] S. Schuster and T. Hofer. Determining all extreme semi-positive conservation relations in chemical reaction systems: a test criterion for
conservativity. J. Chem. Soc. Faraday Trans., 87:2561–2566, 1991.
[58] J Schwender. Metabolic flux analysis as a tool in metabolic engineering of plants. Curr Opin Biotechnol, 19(2):131–137, Apr 2008.
[59] G P Smith. Filamentous fusion phage: novel expression vectors that display cloned antigens on the virion surface. Science,
228(4705):1315–1317, Jun 1985.
[60] G. N. Stephanopoulos and A. A. Aristidou. Metabolic Engineering:
Principles and Methodologies. Academic Press, 1998.
[61] Cong Trinh, Aaron Wlaschin, and Friedrich Srienc. Elementary
mode analysis: a useful metabolic pathway analysis tool for characterizing cellular metabolism. Applied Microbiology and Biotechnology, 81:813–826, 2009. 10.1007/s00253-008-1770-1.
[62] P. Uetz, L. Giot, G. Cagney, T. A. Mansfield, R. S. Judson,
J. R. Knight, D. Lockshon, V. Narayan, M. Srinivasan, P. Pochart,
A. Qureshi-Emili, Y. Li, B. Godwin, D. Conover, T. Kalbfleisch,
G. Vijayadamodar, M. Yang, M. Johnston, S. Fields, and J. M. Rothberg. A comprehensive analysis of protein-protein interactions in
saccharomyces cerevisiae. Nature, 403(6770):623–627, Feb 2000.
[63] R R Vallabhajosyula, V Chickarmane, and H M Sauro. Conservation
analysis of large biochemical networks. Bioinformatics, 22(3):346–
353, Feb 2006.
[64] A. Varma and B. O. Palsson. Metabolic capabilities of escherichia
coli: I. synthesis of biosynthetic precursors and cofactors. Journal of
Theoretical Biology, 165(4):477–502, 1993.
[65] A. Varma and B. O. Palsson. Metabolic capabilities of escherichia
coli ii. optimal growth patterns. Journal of Theoretical Biology,
165(4):503–522, 1993.
[66] A. Wagner and D. A. Fell. The small world inside large metabolic
networks. Proceedings of the Royal Society B: Biological Sciences,
268(1478):1803–1810, 2001.
[67] M. R. Watson. Metabolic maps for the apple-ii. Biochem. Soc. Trans,
12(6):1093–1094, 1984.
[68] M. R. Watson. A discrete model of bacterial metabolism. Comput
Appl Biosci, 2(1):23–27, 1986.
[69] M Weitzel, W Wiechert, and K Nöh. The topology of metabolic
isotope labeling networks. BMC Bioinformatics, 8:315–315, 2007.
[70] W Wiechert. 13c metabolic flux analysis. Metab Eng, 3(3):195–206,
Jul 2001.
[71] W. Wiechert. A gentle introduction to 13 C metabolic flux analysis.
Genet. Eng, 24, 2001.
[72] W Wiechert, M Möllney, N Isermann, M Wurzel, and A A de Graaf.
Bidirectional reaction steps in metabolic networks: Iii. explicit solution and analysis of isotopomer labeling systems. Biotechnol Bioeng,
66(2):69–85, 1999.
[73] W Wiechert, M Möllney, S Petersen, and A A de Graaf. A universal
framework for 13c metabolic flux analysis. Metab Eng, 3(3):265–
283, Jul 2001.
[74] W Wiechert and K Nöh. From stationary to instationary metabolic
flux analysis. Adv Biochem Eng Biotechnol, 92:145–172, 2005.
[75] J. Yang, S. Wongsa, V. Kadirkamanathan, S.A. Billings, and P.C.
Wright. Metabolic Flux Estimation: A Self-Adaptive Evolutionary Algorithm with Singular Value Decomposition. IEEE/ACM
Transactions on Computational Biology and Bioinformatics, pages 126–138, 2007.
[76] N Zamboni, E Fischer, and U Sauer. Fiatflux–a software for
metabolic flux analysis from 13c-glucose experiments. BMC Bioinformatics, 6:209–209, 2005.
History
1. VERSION: 0.9
Date: 2011-01-6
Author(s): Herbert M. Sauro
Title: Structural and Behavioral Properties of Biochemical Networks
Modification(s): Initial Version