Supplementary Manual - Word file (56 KB )

advertisement
SUPPLEMENTARY MATERIAL
Quantitative Prediction of Cellular Metabolism with Constraint-based Models: The
COBRA Toolbox
S.1 Installation instructions for the lp_solve linear programming solver on Windows
The instructions below assume that the current version of lp_solve is 5.5.0.10. The
instructions should hold for any other version as well. Note that lp_solve is quite slow
and is not suitable for large-scale optimization problems such as flux variability analysis
or single/double gene deletion studies.
1. Download an uncompress lp_solve_5.5.0.10_dev.zip from
https://sourceforge.net/projects/lpsolve/.
2. Copy lpsolve55.dll into the C:/WINDOWS/system32 directory or any other
location that is in the Windows path for your computer.
3. Download an uncompress lp_solve_5.5.0.10_MATLAB_exe.zip from
https://sourceforge.net/projects/lpsolve/.
4. Copy the files included in the zip file to a directory that is in your Matlab path
(for example the solvers subfolder under the main COBRA Toolbox folder).
5. Test the lp_solve solver by entering mxlpsolve in the Matlab command window.
If lp_solve libraries are correctly installed, you should see usage instructions of
the mxlpsolve interface.
6. Run ex.m from the command line in Matlab to test full functionality of lp_solve.
7. Follow troubleshooting instructions in MATLAB.htm document that is included
in lp_solve_5.5.0.10_MATLAB_exe.zip if problems occur.
S.2 Description of the model structure used by the COBRA Toolbox
The model structure used by the COBRA Toolbox contains the following fields:








rxns: A list of all of the reaction abbreviations in the same order they appear in the
stoichiometric matrix
mets: A list of all of the metabolite abbreviations in the model in the same order they
appear in the stoichiometric matrix
S: The stoichiometric matrix in sparse format
lb: The lower bound corresponding to each reaction, in order
ub: The upper bound corresponding to each reaction, in order
c: The relative weight of each reaction in the objective function—often a single one in
the position corresponding to the biomass reaction and zeros elsewhere
subSystem: The metabolic subsystem for each reaction
rules: Boolean rules for each reaction describing the gene-reaction relationship. For
example ‘gene1 and gene2’ indicate that the two gene products are part of a enzyme
comples whereas ‘gene1 or gene2’ indicate that the two gene products are isozymes
that catalyze the same reaction.


genes: The gene names of all the genes included in the model
rxnGeneMat: A matrix with as many rows as there are reactions in the model and as
many columns as there are genes in the model. The ith row and jth column contains a
one if the jth gene in genes is associated with the ith reaction in rxns and zero
otherwise.
S.3 Description of the SBML file structure (Level 2 version 1)
The SBML files generated in this work contain the following types of data and the format
of this file should be generated by using the standards outlined at
(http://sbml.org/documents/) for SBML, level 2 version 1



reactions (format: ‘R_<reaction abbreviation >’)
o reaction name, reversibility, reaction stoichiometry, gene-protein-reaction
(GPR) association, subsystem, E.C. number
metabolites (format: ‘M_<metabolite abbreviation>_<compartment
abbreviation>’)
o metabolite name, compartment, charge, formula (appended to the end of
the name, <metabolite name>_FORMULA)
a flux distribution associated with a steady-state modeling simulation
o lower bound, upper bound, objective coefficient, flux value, reduced cost
Special helpful notes on the specific SBML format developed for this toolbox:
 For a reaction, the notes field was used to provide the gene association, protein
association, subsystem and protein class
 The lower and upper bounds on a reaction, the objective coefficient for the
presented simulation, the flux value for that particular simulation and the reduced
cost are all parameters assigned to the flux value
Metabolites are listed in the <listOfSpecies> section of the SBML file in the following
format (example for dihydroxyacetone phosphate):
<listOfSpecies>
…
<species id="M_dhap_c" name="M_Dihydroxyacetone_phosphate_C3H5O6P"
compartment="Cytosol" charge="-2" boundaryCondition="false"/>
…
</listOfSpecies>
Reactions are listed in the <listOfReactions> section of the SBML file in the following
format (example for triose phosphate isomerase):
<listOfReactions>
…
<reaction id="R_TPI" name="R_triose_phosphate_isomerase" reversible="true">
<notes>
<html:p>GENE_ASSOCIATION: b3919</html:p>
<html:p>PROTEIN_ASSOCIATION: Tpi</html:p>
<html:p>SUBSYSTEM: S_GlycolysisGluconeogenesis</html:p>
<html:p>PROTEIN_CLASS: 5.3.1.1</html:p>
</notes>
<listOfReactants>
<speciesReference species="M_dhap_c" stoichiometry="1.000000"/>
</listOfReactants>
<listOfProducts>
<speciesReference species="M_g3p_c" stoichiometry="1.000000"/>
</listOfProducts>
<kineticLaw>
<math xmlns="http://www.w3.org/1998/Math/MathML">
<apply>
<ci> LOWER_BOUND </ci>
<ci> UPPER_BOUND </ci>
<ci> OBJECTIVE_COEFFICIENT </ci>
</apply>
</math>
<listOfParameters>
<parameter id="LOWER_BOUND" value="-1000"
units="mmol_per_gDW_per_hr"/>
<parameter id="UPPER_BOUND" value="1000"
units="mmol_per_gDW_per_hr"/>
<parameter id="OBJECTIVE_COEFFICIENT" value="0.000000"/>
</listOfParameters>
</kineticLaw>
</reaction>
…
<listOfReactions>
The reactants of a reaction are listed in the <listOfReactants> section whereas the
products are listed in the <listOfProducts> section. Parameters for each reaction (lower
and upper bounds, objective coefficients) are listed in the <kineticLaw> section. The
gene-protein-reaction associations and subsystems for each reaction are listed in the
<notes> section.
S.4 Adding the capability to use other LP solvers with the COBRA Toolbox
The interfaces to LP solvers are defined in the solveLPStm.m function that is called by
all other functions in the Toolbox that use LP solvers. A new solver can be added by
simply inserting a new section into the ‘switch-case’ statement that deals with selecting
the solver that the user wants to use. For example, the following segment allows the user
to use the TOMLAB CPLEX solver (Tomlab Optimization Inc., San Diego, CA) by
defining the CBTLPSOLVER global variable to be ‘tomlab_cplex’:
case ‘tomlab_cplex’
if (~isempty(csense))
b_L(csense == 'E')
b_U(csense == 'E')
b_L(csense == 'G')
b_U(csense == 'G')
b_L(csense == 'L')
b_U(csense == 'L')
else
=
=
=
=
=
=
b(csense
b(csense
b(csense
1e6;
-1e6;
b(csense
== 'E');
== 'E');
== 'G');
== 'L');
b_L = b;
b_U = b;
end
tomlabProblem = lpAssign(osense*c,A,b_L,b_U,lb,ub);
Result = tomRun('cplex', tomlabProblem, 0);
x = Result.x_k;
f = osense*Result.f_k;
stat = Result.Inform;
w = Result.v_k(1:length(lb));
y = Result.v_k((length(lb)+1):end);
if (stat == 1)
solStat = 1;
elseif (stat == 3)
solStat = 0;
elseif (stat == 2 | stat == 4)
solStat = 2;
else
solStat = -1;
end
The key parts of this interface code are converting the description of the LP problem to a
format suitable to the particular solver, calling the appropriate external functions with the
correct arguments, assigning the correct solution status to be returned to the toolbox, and
converting variables returned by the function into those used by the Toolbox. Examples
of how an interface is built for a number of solvers can be found in solveLPStm.m
function. Note that certain solvers may require calling multiple external functions. In this
case, it may be preferable to write an additional solver interface function that handles
these function calls (these can be placed in the ‘solvers’ folder).
S.5 Linear MOMA approach
The COBRA Toolbox includes an implementation of a linear version of the standard
Minimization of Metabolic Adjustment (MOMA) approach introduced by Segre et al. [1].
MOMA finds the flux distribution (vdel) and growth rate (μdel) for a gene deletion strain
by solving the following quadratic programming problem:
 del  c T v del
min
 v
del
i
 viwt

2
i
Sv
del
(S.1)
0
del
videl
 videl
, min  vi
, max
Here the wild type flux distribution vwt is found solving a separate FBA problem:
 wt  max c T v wt
Sv wt  0
v
wt
i , min
v
(S.2)
v
wt
i
wt
i , max
In the linear version of the MOMA approach (linearMOMA) the same FBA problem
(S.2) is first solved for the wild type strain model to obtain the wild type growth rate μwt.
In order to obtain the deletion strain flux distribution and growth rate, the following
optimization problem is then solved:
 del  c T v del
min
v
del
i
 viwt
i
Sv
v
del
del
i , min
0
 videl  videl
, max
(S.3)
Sv wt  0
viwt,min  viwt  viwt,max
 wt  c T v wt
This problem can be solved as a standard linear programming problem using the
approach described in [2]. The differences between MOMA and linearMOMA are 1) the
use of 1-norm objective function (absolute value) as opposed to a 2-norm objective
(Euclidean norm), and 2) solving the wild-type and deletion strain problems
simultaneously (three last constraints in S.3) in order to avoid problems with alternative
optimal flux distributions in solving the FBA problem (S.2). The linear MOMA approach
was first proposed as an alternative to MOMA by Burgard et al. [3].
References
1.
2.
3.
Segre, D., D. Vitkup, and G.M. Church, Analysis of optimality in natural and
perturbed metabolic networks. Proc Natl Acad Sci U S A, 2002. 99(23): p. 151127.
Herrgard, M.J., S.S. Fong, and B.O. Palsson, Identification of genome-scale
Metabolic Network Models Using Experimentally Measured Flux Profiles. PLoS
Computational Biology, 2006. 2(7): p. e72.
Burgard, A.P., P. Pharkya, and C.D. Maranas, Optknock: a bilevel programming
framework for identifying gene knockout strategies for microbial strain
optimization. Biotechnol Bioeng, 2003. 84(6): p. 647-57.
Download