PPTX - Utah State University

advertisement
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-1-
Tissue
Specific Models
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-2-
LEARNING OBJECTIVES
Each student should be able to:
• Describe the process of creating tissue-specific models.
• Explain the GIMME-like family of tools for creating
tissue-specific models
• Explain the iMAT-like family of tools for creating
tissue-specific models
• Explain the MBA-like family of tools for creating
tissue-specific models
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-3-
Lesson Outline
• Overview
• Creating Tissue-specific Models
 GIMME-like Family
 iMAT-like Family
 MBA-like Family
• Tissue-specific Example
 Red Blood Cells
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-4-
The multiple uses
of high-throughput
omics data in
constraint-based
models
Bordbar, A., J. M. Monk, et al. (2014). "Constraintbased models predict metabolic and associated
cellular functions." Nature reviews. Genetics 15(2):
107-120.
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-5-
Four Major Applications of Recon X
I)
Utilizing high-throughput data, Recon X can be tailored to cell
and tissue-specific networks. The process has been done both
algorithmically and manually.
II)
Similarly, Recon X has been transformed into other mammalian
reconstructions, particularly M. musculus. The high overlap of
homologous genes in Recon X with similar mammals allows for
reconstructing accurate mammalian models quickly.
III) High-throughput data can be interpreted by mapping the data
onto Recon X’s metabolic network backbone. This process has
been done to study pathological and drug-treated states.
IV)
Recon X can be used to simulate and predict phenotypes,
providing biological clues to physiology and pathology as well as
guiding experimental design.
Bordbar, A. and B. O. Palsson (2012). "Using the reconstructed genome-scale human
metabolic network to study physiology and pathology." J Intern Med 271(2): 131-141.
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-6-
Lesson Outline
• Overview
• Creating Tissue-specific Models
 GIMME-like Family
 iMAT-like Family
 MBA-like Family
• Tissue-specific Example
 Red Blood Cells
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-7-
Existing Methods for Context-specific Model Extraction
Estevez, S. R. and Z. Nikoloski (2014). "Generalized framework for context-specific metabolic model extraction methods." Frontiers in plant science 5.
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-8-
Lesson Outline
• Overview
• Creating Tissue-specific Models
 GIMME-like Family
 iMAT-like Family
 MBA-like Family
• Tissue-specific Example
 Red Blood Cells
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-9-
GIMME-like Family
• The objective employed by the GIMME-like family corresponds to the similarity of the flux
phenotype to data, which is to be maximized while guaranteeing a given Required Metabolic
Functionality (RMF), such as: growth or ATP production.
• This family reconstructs a context-specific model in two steps:
 First, it optimizes an objective function, the RMF, by using the classical linear programming
(LP) formulation of flux balance analysis which imposes mass balance and thermodynamic
constraints. This objective function is assumed to be the main cellular task in the investigated
condition.
 It then involves solving a second LP that minimizes a penalty function, corresponding to the
discrepancies between flux values and the respective transcript levels, with the additional
constraint that the flux through the previous RMF must be above a given lower bound (e.g. a
fraction of the optimum value found by flux balance analysis).
• The methods included in this family mainly differ in the way the discrepancies are minimized in the
second step, the type of high-throughput data used, and in the treatment of reversible reactions
Estevez, S. R. and Z. Nikoloski (2014). "Generalized framework for context-specific metabolic model extraction methods." Frontiers in plant science 5.
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-10-
GIMME-like Family
GIMME
GIM3E
Becker, S. A. and B. O. Palsson (2008). "Context-specific metabolic networks are
consistent with experiments." PLoS computational biology 4(5): e1000082.
Schmidt, B. J., et al. (2013). "GIM3E: condition-specific models of cellular metabolism
developed from metabolomics and expression data." Bioinformatics 29(22): 2900-2908.
•
The penalty function is termed inconsistency score.
•
GIM3E introduces several modifications to the original GIMME.
•
This function penalizes flux values of reactions whose associated
expression levels are below a user-defined cut-off (i.e., threshold). More
specifically, the inconsistency score is given by the dot product of the flux
distribution and the reaction penalty, defined as the vector difference of
the associated expression values from the threshold.
•
First, it allows integration of metabolomics data, imposing a nonzero flux
value to reactions involving a metabolite for which there is evidence of
being synthesized in an investigated condition.
•
Second, it modifies the definition of the reaction penalty; here, the
penalties for all reaction-associated genes are determined separately and
are then mapped to the reaction following the GPR rules. Moreover, the
penalties are calculated as the distance between each transcript and the
maximum expression level of the set. Consequently, after mapping
transcript penalties all reactions obtain a penalty value, rather than only
the set below the threshold which is the case in GIMME.
•
Finally, GIM3E takes into account directionality of reversible reactions by
constraining them to operate in only one direction, which is modeled by
introducing a binary variable for the direction of choice. As a result,
GIM3E is formulated as a mixed integer linear program (MILP), which is
more computationally challenging than the LP formulation of GIMME.
•
The reaction associated expression level is obtained following the standard
GPR rules (Becker and Palsson, 2008), which take into account the presence
of isoenzymes and protein complexes.
•
Although transcript profiles were used in the original formulation, a variant
called GIMMEp allows for the integration of proteomic data (Bordbar, et al.
(2012). "Model-driven multi-omic data analysis elucidates metabolic
immunomodulators of macrophage activation." Molecular Systems Biology).
•
The result of applying this algorithm is a flux distribution which ensures
that a given RMF can be carried out and is as consistent as possible to the
employed data.
Estevez, S. R. and Z. Nikoloski (2014). "Generalized framework for context-specific metabolic model extraction methods." Frontiers in plant science 5.
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-11-
Create a Tissue Model of the Eye
• Step #1 – Collect and format the gene expression data for the non-treated ARPE-19 cells
• Gene Expression Omnibus data (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE5741)
• GSM133871.CEL, GSM133872.CEL, GSM133873.CEL (Non-treatment)
• Step #2 - Convert the Affymetrix data to a format that can be used by the “createTissueSpecific()” Cobra
Toolbox function
• A script has been created by Farhad Farjood using the language R to get the Absence/Presence (AP.txt) data
and Entrez IDs (EID.txt).
• R is open source and can be downloaded for free at http://cran.rstudio.com/
• A function call getExpD() will convert to AP.txt and EID.txt to expression data that can be used by the Cobra
Toolbox
• Step #3 – Create a tissue-specific model with the Cobra Toolbox
• Use an update version of “createTissueSpecific()” called “createTissueSpecificRec()” that has been modified
by Farhad Farjood to work with Recon 1 and Recon 2.
• Step #4 – Create a specific objective function based on the cellular functions of reabsorption and secretion.
• Step #5 - Use manual assessment with primary literature to validate the physiological functions for accuracy.
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
createTissueSpecificModel
H. Scott Hinton, 2015
-12-
Create draft tissue specific model from mRNA expression data
[tissueModel,Rxns] = createTissueSpecificModel(model,expressionData,proceedExp,orphan,exRxnRemove,solver,options,funcModel)
INPUTS
model
global recon1 model
expressionData
mRNA expression Data structure
Locus
Vector containing GeneIDs
Data
Presence/Absence Calls
OUTPUTS
tissueModel
Model produced by GIMME or Shlomi,
Rxns
containing only reactions carrying flux
Statistics of test
Use: (1 - Present, 0 - Absent) when proceedExp = 1
Use: (2 - Present, 1 - Marginal, 0 - Absent)
when proceedExp = 0
Transcript
RefSeq Accession (only required if proceedExp = 0)
OPTINAL INPUTS
proceedExp
orphan
ExpressedRxns - predicted by mRNA data
UnExpressedRxns - predicted by mRNA
data unknown - unable to be predicted by
mRNA data
Upregulated - added back into model
Downregulated - removed from model
UnknownIncluded - orphans added
1 - data are processed ; 0 - data need to be processed (Default = 1)
1 - leave orphan reactions in model for Shlomi Method 0 - remove orphan reactions (Default = 1)
exRxnRemove
Names of exchange reactions to remove (Default = [])
solver
Use either 'GIMME' or 'Shlomi' to create tissue specific model (Default = 'GIMME')
options
If using GIMME, enter objectiveCol here
Default: objective function with 90% flux cutoff, written as: [find(model.c) 0.9]
funcModel
1 - Build a functional model having only reactions that can carry a flux (using FVA), 0 - skip this step (Default = 0)
http://opencobra.sourceforge.net/openCOBRA/opencobra_documentation/cobra_toolbox_2/index.html
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-13-
Build a draft tissue-specific human macrophage model from the
global human metabolic network and omics data
Download the MAT file “testTissueModel.mat” from “https://github.com/opencobra/cobratoolbox/tree/master/testing/testTissueModel.” It
contains the global human metabolic network model and a formatted expressionData structure. The model is the version of the human metabolic
network reconstruction Recon 1 that was used to create an alveolar macrophage model1 using expression data from Kazeros et al.2
> > load(‘testTissueModel.mat’)
The GIMME algorithm retains reactions from Recon 1 that are orphans or are present in the high-throughput data. The reactions with no detected
expression are minimized and those not required to retain flux through the objective reaction are removed.
> > [tissueModel,Rxns] = createTissueSpecificModel(model,expressionData);
Where tissueModel is the GIMME algorithm-derived draft model; and Rxns is a structure with lists of all the reactions. The reactions fall into the
following categories:
•
Expressed—1,769 potentially active reactions based on transcriptome data;
•
UnExpressed—497 reactions without requisite gene products based on transcriptome data;
•
Unknown—41 reactions unable to be predicted by transcriptome data;
•
Upregulated—52 UnExpressed reactions added back into model;
•
Downregulated— 0 Expressed reactions removed from model; and
•
UnknownIncluded—1,476 orphan reactions included.
1.
Bordbar, A., Lewis, N.E., Schellenberger, J., Palsson, B.O. &
Jamshidi, N. Insight into human alveolar macrophage and M.
tuberculosis interactions via metabolic reconstructions. Mol.
Syst. Biol. 6, 422 (2010).
2. Kazeros, A. et al. Overexpression of apoptotic cell removal
receptor MERTK in alveolar macrophages of cigarette
smokers. Am. J. Respir. Cell Mol. Biol. 39, 747–757 (2008).
Schellenberger, J., R. Que, et al. (2011). "Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox v2.0." Nature protocols 6(9): 1290-1307.
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-14-
Step #1 – Collect Gene Expression Files for the Non-treated ARPE-19 Cells
http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE5741
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-15-
R Filename = AP_HSH.R
source("http://bioconductor.org/biocLite.R")
biocLite("affy")
biocLite("simpleaffy")
biocLite("annotate")
biocLite("hgu133plus2.db")
Step #2 - Convert the Affymetrix Data to a Format
That can be Used by the “createTissueSpecific()”
Cobra Toolbox Function
library('affy')
library('simpleaffy')
library('annotate')
library('hgu133plus2.db')
setwd("/… /GSE5741_RAW") Set working directory
affy.data = ReadAffy()
data.AP = mas5calls(affy.data)
data.AP.calls = exprs(data.AP)
Affymetrix_ID <- row.names(data.AP.calls)
Entrez_ID <- getEG(Affymetrix_ID, annotation(affy.data))
write.table(data.AP.calls[,1:3], file="AP1.txt", row.names=FALSE,col.names=F, quote=F, sep="\t")
write.table(Entrez_ID, file="EID.txt", row.names=FALSE,col.names=F, quote=F, sep="\t")
Utah State University
BE 5500/6500
AP.txt
EID.txt
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
Step #3 – Create a
Tissue-specific
GIMME-based Model
with the Cobra Toolbox
H. Scott Hinton, 2015
-16-
Expression Data
% CreateARPE19GIMMEModel.m
clear;
load('Recon2.mat');
ExpressionData = getExpD('EID.txt','AP.txt');
Rxns
ARPE19 Model
[ARPE19_GIMME,Rxns] = createTissueSpecificRec(Recon2_2,ExpressionData,1,1,[],'GIMME',options,1);
writeCbModel(ARPE19_GIMME,'sbml','ARPE19_GIMME_Model');
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-17-
Lesson Outline
• Overview
• Creating Tissue-specific Models
 GIMME-like Family
 iMAT-like Family
 MBA-like Family
• Tissue-specific Example
 Red Blood Cells
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-18-
iMAT-like Family
• The iMAT-like family comprises three methods, iMAT (Shlomi et al., 2008), INIT (Agren et
al., 2012) and its extension, tINIT (Agren et al., 2014).
• The iMAT-like family does not assume a RMF achieved by the cell. More specifically, these
methods maximize the number of matches between reaction states (i.e., active or inactive)
and corresponding data states (i.e., expressed or not non-expressed).
• The mathematical formulation results in a MILP, in which the value of the binary variable
denotes the most concordant reaction state for a given (data) context.
• Although sharing the general strategy, iMAT, INIT and tINIT differ considerably respecting
to how they deal with data:
• iMAT integrates data in the constraints,
• INIT and tINIT do so directly in the objective function.
Estevez, S. R. and Z. Nikoloski (2014). "Generalized framework for context-specific metabolic model extraction methods." Frontiers in plant science 5.
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
iMAT-like Family
iMAT (Shlomi)
• iMAT tackles this issue through an adapted flux variability
analysis (FVA): First, it forces each reaction to be active and
evaluates the similarity, and then repeats the process in a similar
way by forcing each reaction to be inactive. The final outcome is
computed by comparing the two obtained similarities. A reaction
is termed active if its inclusion results in higher similarity to
data, and it is termed as inactive, if its inclusion decreases this
similarity. In the case that both similarities are equal, iMAT
categorizes the reaction as undetermined.
-19-
INIT
Agren, R., S. Bordel, et al. (2012). "Reconstruction of genome-scale active metabolic
networks for 69 human cell types and 16 cancer types using INIT." PLoS
computational biology 8(5): e1002518.
Shlomi, T., M. N. Cabili, et al. (2008). "Network-based prediction of human tissue-specific
metabolism." Nat Biotechnol 26(9): 1003-1010.
• The algorithm first classifies reactions into two groups based on
a previously defined threshold for the corresponding expression
data; this results in the groups of reactions with a high and low
associated expression values. It then maximizes the number of
matches between a reaction state, defined through a minimum
flux value, and the group to which the reaction belongs. Thus, if a
reaction is included in the highly expressed group, the aim is to
obtain a flux value over the minimum.
H. Scott Hinton, 2015
•
INIT was optimized to integrate evidences from the Human Protein
Atlas, although expression data are integrated when proteomic
evidences are missing.
•
INIT does not group reactions in categories in contrast to iMAT.
Instead, it adopts experimental data to weight the binary variable
of the corresponding reaction, whereby the weight is a function of
experimental data (e.g., gene expression profiles) or a set of
arbitrary numbers that quantify the color code of the entries of the
Human Protein Atlas.
•
INIT imposes a positive net production of metabolites for which
there is experimental support for that context or tissue. Hence,
when a metabolite is experimentally determined to be present, its
net production is forced to comply with a given lower bound. As a
result, INIT allows the integration of metabolomics data in a
qualitative way.
•
This method has been applied to generate a human metabolic
reaction database (“Human Metabolic Atlas2 ”) where several tissuespecific model reconstructions can be examined.
Estevez, S. R. and Z. Nikoloski (2014). "Generalized framework for context-specific metabolic model extraction methods." Frontiers in plant science 5.
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
Step #3 – Create a
Tissue-specific
Shlomi-based Model
with the Cobra Toolbox
H. Scott Hinton, 2015
-20-
Expression Data
% CreateARPE19ShlomiModel.m
clear;
load('Recon2.mat');
Rxns
Rxns.solution
ARPE19 Shlomi Model
ExpressionData = getExpD('EID.txt','AP.txt');
[ARPE19_Shlomi,Rxns] = createTissueSpecificRec(Recon2_2,ExpressionData,1,1,[],'Shlomi',options,1);
writeCbModel(ARPE19_Shlomi,'sbml','ARPE19_Shlomi_Model');
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-21-
Lesson Outline
• Overview
• Creating Tissue-specific Models
 GIMME-like Family
 iMAT-like Family
 MBA-like Family
• Tissue-specific Example
 Red Blood Cells
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-22-
MBA-like Family
• The MBA-like family is composed of MBA,mCADRE and FastCORE.
• While previous methods perform both a flux prediction and a context-specific model reconstruction, MBA-like
methods only return a context-specific model as output.
• This family a priori categorizes reactions in two sets, the core and the non-core.
• The core set includes those reactions with positive evidences (e.g., high through put data and/or well-curated
biochemical knowledge) of being active in a certain context. Once these sets are defined, the MBA-like methods
prune the GEM by eliminating non-core reactions that are unnecessary to ensure consistency in the core set, i.e.,
no blocked reaction is allowed in the final model. Thereby, all reactions must carry non-zero flux in at least one
feasible solution.
• Checking model consistency is a crucial part of these methods and also the main difference in comparison to the
other methods.
• FVA have been used to pinpoint blocked reactions, but it is computationally expensive since it requires solving two
optimization problems per reaction. Thus, the major changes in formulation are due to finding faster alternatives to
perform the same task.
• Other differences arise when defining the core set and during the pruning process.
Estevez, S. R. and Z. Nikoloski (2014). "Generalized framework for context-specific metabolic model extraction methods." Frontiers in plant science 5.
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
MBA
H. Scott Hinton, 2015
-23-
MBA-like Family
mCADRE
•
MBA divides the core set in two subcores: a set with high likelihood to be
present in the context-specific model (CH), if evidence comes from well-curated
biochemical knowledge in that particular context, and a set with moderate
likelihood (CM) if evidence comes from context-specific high-throughput data.
•
A prominent characteristic of mCADRE lies in ranking reactions of
the genome-scale reconstruction according to three scores:
expression-, connectivity-, and confidence-level-based.
•
The algorithm performs the pruning iteratively and randomly by selecting a noncore (NC) reaction to be eliminated, and checking consistency at each step: if CH
and a user-defined fraction of CM remain unblocked, MBA removes the reaction
out of the model along with CM and NC corresponding blocked reactions. This
routine is repeated until no reaction is left in NC.
•
•
The topology of the final model clearly depends on the order in which non-core
reactions are eliminated. Therefore, to remove artifacts due to the order, the
algorithm is repeated a number of times (1000) to obtain a population of
context-specific models. Later, reactions are ranked according to their
occurrence in the population and added up to CH until a consistent model is
obtained.
This ranking determines the core set of reactions as well as the
order by which non-core reactions are eliminated. The core is
determined by fixing a threshold value to the expression based
score; therefore, reactions whose values are above the threshold
are included in the core, and the rest constitute the non-core
reactions.
•
Unlike other methods, the expression-based score does not directly
consider the levels of expression. Instead, it calculates the
frequency of expressed states over a battery of transcript profiles
in the same context, and, thus, requires a previous binarization of
the expression data.
•
Reactions outside the core are then ranked according to the
connectivity-based score, which assesses the connectedness of
adjacent reactions, and the confidence level-based score, which
accounts for the type of evidences supporting a reaction in the
genome-scale reconstruction.
•
MBA proposes an alternative to FVA to check consistency in a more efficient
way: First, it solves a LP problem which maximizes the total sum of fluxes. It
then removes active reactions (i.e., carrying non-zero flux) and repeats the LP
over the remaining set of reactions. If no reaction is found to be active, FVA is
applied to each reaction to determine whether it is blocked. The process is
repeated until all reactions have been classified either as blocked or unblocked.
Estevez, S. R. and Z. Nikoloski (2014). "Generalized framework for context-specific metabolic model extraction methods." Frontiers in plant science 5.
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-24-
MBA-like Family
mCADRE
•
A prominent characteristic of mCADRE lies in ranking reactions of
the genome-scale reconstruction according to three scores:
expression-, connectivity-, and confidence-level-based.
•
This ranking determines the core set of reactions as well as the
order by which non-core reactions are eliminated. The core is
determined by fixing a threshold value to the expression based
score; therefore, reactions whose values are above the threshold
are included in the core, and the rest constitute the non-core
reactions.
•
Unlike other methods, the expression-based score does not directly
consider the levels of expression. Instead, it calculates the
frequency of expressed states over a battery of transcript profiles
in the same context, and, thus, requires a previous binarization of
the expression data.
•
Reactions outside the core are then ranked according to the
connectivity-based score, which assesses the connectedness of
adjacent reactions, and the confidence level-based score, which
accounts for the type of evidences supporting a reaction in the
genome-scale reconstruction.
•
Non-core reactions are in turn sequentially removed according to the
previous ranking, and consistency is evaluated.
•
Here, mCADRE presents two other innovations: it defines a set of
key metabolites, with positive evidences of appearing in the context
specific model reconstruction, and relaxes the stringent condition of
including all core reactions in the final model. More specifically, a
reaction can only be eliminated if it does not prevent the production
of a key metabolite and if it is unnecessary to ensure core
consistency. However, if evidence exists for the respective
transcript to be unexpressed in any of the context-specific samples,
mCADRE allows the elimination of the reaction even if it blocks some
of the core reactions.
•
To this end, two conditions have to be satisfied: (1) production of
key metabolites is not impaired and (2) the relation between the
number of blocked core and non-core reactions matches a predefined
ratio.
•
To check model consistency, mCADRE maintains the procedure
proposed in MBA, although adapted to use FastFVA instead of
maximizing the total sum of flux values.
Estevez, S. R. and Z. Nikoloski (2014). "Generalized framework for context-specific metabolic model extraction methods." Frontiers in plant science 5.
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
MBA
-25-
MBA-like Family
FastCore
•
MBA divides the core set in two subcores: a set with high likelihood to be
present in the context-specific model (CH), if evidence comes from well-curated
biochemical knowledge in that particular context, and a set with moderate
likelihood (CM) if evidence comes from context-specific high-throughput data.
•
The algorithm performs the pruning iteratively and randomly by selecting a noncore (NC) reaction to be eliminated, and checking consistency at each step: if CH
and a user-defined fraction of CM remain unblocked, MBA removes the reaction
out of the model along with CM and NC corresponding blocked reactions. This
routine is repeated until no reaction is left in NC.
•
The topology of the final model clearly depends on the order in which non-core
reactions are eliminated. Therefore, to remove artifacts due to the order, the
algorithm is repeated a number of times (1000) to obtain a population of
context-specific models. Later, reactions are ranked according to their
occurrence in the population and added up to CH until a consistent model is
obtained.
•
H. Scott Hinton, 2015
MBA proposes an alternative to FVA to check consistency in a more efficient
way: First, it solves a LP problem which maximizes the total sum of fluxes. It
then removes active reactions (i.e., carrying non-zero flux) and repeats the LP
over the remaining set of reactions. If no reaction is found to be active, FVA is
applied to each reaction to determine whether it is blocked. The process is
repeated until all reactions have been classified either as blocked or unblocked.
•
While FastCORE aims also at obtaining a minimal consistent model
containing all core reactions, typical for this family of methods, it
differs principally from MBA and mCADRE in the algorithmic
strategy.
•
Instead of eliminating one non-core reaction followed by consistency
evaluation at each step, FastCORE solves two LPs: The first LP
maximizes the cardinality of the core set of reactions, computed as
the number of reaction values above a small positive constant. On the
other hand, the second LP minimizes the cardinality outside the core
set by minimizing the L1-norm of the flux vector, under the
constraint that the entire core set must remain active.
•
These two LPs are repeatedly applied in alternating fashion until the
core set is consistent, whereby activation of all core reactions is
ensured while including a minimum set of non-core reactions in the
final model. To deal with reversible reactions, FastCORE evaluates
both directions by changing the sign of the corresponding column of
the stoichiometric matrix.
Estevez, S. R. and Z. Nikoloski (2014). "Generalized framework for context-specific metabolic model extraction methods." Frontiers in plant science 5.
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-26-
Flowchart Context-specific Reconstruction Methods
Optimal choice of methodologies
when tackling a context-specific
reconstruction problem. The
choice can be made by answering
a few questions, in a flowchart
manner, related to: demand of
model extraction and flux
prediction, knowledge on a
required metabolic functionality,
the type of experimental data
available or the computational
platform.
Estevez, S. R. and Z. Nikoloski (2014). "Generalized framework for context-specific metabolic model extraction methods." Frontiers in plant science 5.
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-27-
Summary of Methods for
Context-specific Metabolic Model Extraction.
Estevez, S. R. and Z. Nikoloski (2014). "Generalized framework for context-specific metabolic model extraction methods." Frontiers in plant science 5.
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-28-
Recon 1 Cell-specific Reconstructions
Human brain
 Lewis, N. E., G. Schramm, et al. (2010). "Large-scale in silico modeling of metabolic interactions between cell types in the human brain." Nat Biotechnol
28(12): 1279-1285.
• Liver
 Gille, C., C. Bolling, et al. (2010). "HepatoNet1: a comprehensive metabolic reconstruction of the human hepatocyte for the analysis of liver physiology."
Molecular Systems Biology 6: 411.
 Jerby, L., T. Shlomi, et al. (2010). "Computational reconstruction of tissue-specific metabolic models: application to human liver metabolism." Molecular
Systems Biology 6: 401.
• Kidney
 Chang, R. L., L. Xie, et al. (2010). "Drug off-target effects predicted using structural analysis in the context of a metabolic network model." PLoS
computational biology 6(9): e1000938.
• Aveolar macrophage
 Bordbar, A., N. E. Lewis, et al. (2010). "Insight into human alveolar macrophage and M. tuberculosis interactions via metabolic reconstructions." Molecular
Systems Biology 6: 422.
• Red Blood Cells
 Bordbar, A., N. Jamshidi, et al. (2011). "iAB-RBC-283: A proteomically derived knowledge-base of erythrocyte metabolism that can be used to simulate
its physiological and patho-physiological states." BMC systems biology 5: 110.
• Human small intestinal epithelial cells
 Sahoo, S. and I. Thiele (2013). "Predicting the impact of diet and enzymopathies on human small intestinal epithelial cells." Hum Mol Genet 22(13): 27052722.
 Platelets
 Thomas, A., S. Rahmanian, et al. (2014). "Network reconstruction of platelet metabolism identifies metabolic signature for aspirin resistance." Sci Rep 4:
3925.
•
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-29-
Recon 2 Cell-specific Reconstructions
• Using Recon 2
 Haraldsdottir, H. S., Thiele, I., Fleming, R. M. T.,"Comparative evaluation of open source software for mapping between metabolite
identifiers in metabolic network reconstructions: application to Recon 2", Journal of Cheminformatics, 6(2) (2014).
• Kidney
 Quek, L. E., S. Dietmair, et al. (2014). "Reducing Recon 2 for steady-state flux analysis of HEK cell culture." J Biotechnol 184: 172178.
• Membrane Transporters
 Sahoo, S., Aurich, M. K., Jonsson, J. J. and Thiele, I., "Membrane transporters in a human genome-scale metabolic knowledgebase
and their implications for disease", Front. Physiol., doi: 10.3389 (2014).
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-30-
‘Google Map’ of Human Metabolism
http://medicalxpress.com/news/2013-03-international-consortium-google-human-metabolism.html
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-31-
Recon 2–based Models
http://humanmetabolism.org/
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-32-
Lesson Outline
• Overview
• Creating Tissue-specific Models
 GIMME-like Family
 iMAT-like Family
 MBA-like Family
• Tissue-specific Example
 Red Blood Cells
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-33-
Red Blood Cells (erythrocytes)
• The most common type of blood cell and the vertebrate organism's principal
means of delivering oxygen (O2) to the body tissues—via blood flow through
the circulatory system. Red blood cells (RBCs) take up oxygen in the lungs or
gills and release it into tissues while squeezing through the body's capillaries.
• The cytoplasm of erythrocytes is rich in hemoglobin, an iron-containing
biomolecule that can bind oxygen and is responsible for the red color of the
cells. The cell membrane is composed of proteins and lipids, and this structure
provides properties essential for physiological cell function such as
deformability and stability while traversing the circulatory system and
specifically the capillary network.
Scanning electron micrograph of blood cells.
From left to right: human erythrocyte,
thrombocyte (platelet), leukocyte.
• In humans, mature red blood cells are flexible and oval biconcave disks. They
lack a cell nucleus and most organelles, in order to accommodate maximum
space for hemoglobin. Approximately 2.4 million new erythrocytes are
produced per second in human adults. The cells develop in the bone marrow and
circulate for about 100–120 days in the body before their components are
recycled by macrophages. Each circulation takes about 20 seconds.
Approximately a quarter of the cells in the human body are red blood cells.
https://en.wikipedia.org/wiki/Red_blood_cell
Utah State University
Red Blood Cell membrane major proteins
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-34-
Building an in silico Red Blood Cell
(Erythrocyte) Metabolic Network
• The three major required data types are: the human
genome sequence, high-throughput data (specifically,
proteomics for an enucleated cell), and primary
literature
• To build the erythrocyte network, iAB-RBC-283,
proteomics was used to remove non-erythrocyte
related open reading frames (ORFs) or genes.
• Detailed curation utilizing protein, metabolite, and
transport experimental literature was needed to
build a high-quality metabolic reconstruction.
• Without network reconstruction and rigorous
curation, the experimentally generated proteomic
data is raw and difficult to interpret.
Bordbar, A., N. Jamshidi, et al. (2011). "iAB-RBC-283: A proteomically derived
knowledge-base of erythrocyte metabolism that can be used to simulate its
physiological and patho-physiological states." BMC systems biology 5: 110.
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-35-
Functional Assessment of Model
• Flux variability analysis (FVA) was utilized to determine the functional metabolic
pathways of the erythrocyte network.
• FVA is used to define the bounding box on network capabilities (FVA determines the
minimum and maximum allowable flux through each metabolic reaction).
 Reactions with a calculated non-zero maximum or minimum have the potential to be active and
have a potential physiological function. For a reaction to have a non-zero flux, the reaction
must be linked to other metabolic reactions and pathways and plays a functional role in the
system. Thus, potentially active reactions are deemed as functional.
 Thus, we use FVA to determine the capability/capacity of the network reactions to determine
metabolic functionality.
• After determining which reactions were functional, the reaction list was perused to
determine pathway and subsystem functionality in the network.
Bordbar, A., N. Jamshidi, et al. (2011). "iAB-RBC-283: A proteomically derived knowledge-base of erythrocyte
metabolism that can be used to simulate its physiological and patho-physiological states." BMC systems biology 5: 110.
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-36-
Analyzing iAB-RBC-283 as a functional biomarker
• The Morbid Map from the Online Mendelian Inheritance in Man (OMIM) [51]
and the DrugBank [52] were downloaded from their respective databases. The
enzyme names in iAB-RBC-283 were cross-referenced against the database
entries to determine morbid SNPs in erythrocyte proteins and drugs with
protein targets in the erythrocyte. The morbid SNPs that did not have sole
pathological effects in the erythrocyte were classified using the Merck
Manual [43].
• Just as FVA can be used to assess the function of a network under a particular
set of constraints, it can also be used to assess the changes in function and
thus has applications for characterizing disease states [53] and identifying
biomarkers [54]. When simulating a morbid SNP or a drug inhibited enzyme,
the lower and upper bound constraints on the affected reaction is set to zero.
FVA is then used to characterize the exchange reactions under morbid SNP or
drug treated conditions and then compared to the normal state.
• A reaction was considered to be confidently altered if the change in the
minimum or maximum flux was 40% of the total flux span. The flux span is
defined as the absolute difference between the original (unperturbed)
maximum and minimum fluxes.
Bordbar, A., N. Jamshidi, et al. (2011). "iAB-RBC-283: A proteomically derived knowledge-base of erythrocyte
metabolism that can be used to simulate its physiological and patho-physiological states." BMC systems biology 5: 110.
Utah State University
BE 5500/6500
There are four major differences that can occur
for an exchange reaction in two different states:
i) the reaction is either active (non-zero minimum
or maximum flux) or inactive (zero minimum and
maximum flux), ii) the exchange becomes fixed in
one direction (uptake or secretion only), iii) there
is a magnitude change in exchange, iv) the
reaction is unaffected and is the same for both
states.
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-37-
Topological Map of the Human Erythrocyte Metabolic Network
• 283 genes
• 292 reactions,
• 267 metabolites
• 77 transporters
Bordbar, A., N. Jamshidi, et al. (2011). "iAB-RBC-283: A proteomically derived knowledge-base of erythrocyte
metabolism that can be used to simulate its physiological and patho-physiological states." BMC systems biology 5: 110.
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-38-
Additional File #2: Detected SNPs
and FVA Results for SNP Perturbations
OMIM is a comprehensive, authoritative compendium of human genes and
genetic phenotypes that is freely available and updated daily. OMIM is
authored and edited at the McKusick-Nathans Institute of Genetic
Medicine, Johns Hopkins University School of Medicine, under the
direction of Dr. Ada Hamosh. Its official home is omim.org.
Bordbar, A., N. Jamshidi, et al. (2011). "iAB-RBC-283: A proteomically derived knowledge-base of erythrocyte
metabolism that can be used to simulate its physiological and patho-physiological states." BMC systems biology 5: 110.
Utah State University
BE 5500/6500
A reaction was considered
to be confidently altered
if the change in the
minimum or maximum flux
was 40% of the total flux
span. The flux span is
defined as the absolute
difference between the
original (unperturbed)
maximum and minimum
fluxes.
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-39-
OMIM
Database
http://omim.org/
Amberger J, Bocchini CA, Scott AF, Hamosh A: McKusick’s
Online Mendelian Inheritance in Man (OMIM). Nucleic Acids
Res 2009, , 37 Database: D793-6.
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-40-
Additional File #3: Detected Drug Targets
and FVA Results for Drug Effect Perturbations
A reaction was considered to be
confidently altered if the change in the
minimum or maximum flux was 40% of the
total flux span. The flux span is defined as
the absolute difference between the
original (unperturbed) maximum and
minimum fluxes.
Describes relationship between RBC
genes and drugs in the drug bank
Spreadsheet also includes drug descriptions
like: Fluocinonide - A topical glucocorticoid
used in the treatment of eczema. [PubChem]
Bordbar, A., N. Jamshidi, et al. (2011). "iAB-RBC-283: A
proteomically derived knowledge-base of erythrocyte
metabolism that can be used to simulate its physiological and
patho-physiological states." BMC systems biology 5: 110.
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-41-
DrugBank
Database
http://www.drugbank.ca/
Wishart DS, Knox C, Guo AC, Cheng D, Shrivastava S, Tzur
D, Gautam B, Hassanali M: DrugBank: a knowledgebase for
drugs, drug actions and drug targets. Nucleic Acids Res
2008, , 36 Database: D901-6.
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-42-
RBC Flux Balance Analysis (Default)
EX_ac(e)
EX_acnam(e)
EX_ade(e)
EX_adn(e)
EX_co2(e)
EX_fru(e)
EX_fum(e)
EX_gal(e)
EX_gam(e)
EX_glc(e)
EX_h(e)
EX_h2o(e)
EX_hco3(e)
EX_lac-L(e)
EX_mal-L(e)
EX_man(e)
EX_nh4(e)
EX_pyr(e)
ACGAM2E
ACGAMK
3.74e-05
-3.74e-05
0.01
-0.01
-5.87112
-0.0075
-0.25
-0.3169
-1e-05
-1.12
8.79668
-6.13121
5.87112
2.92556
0.25
-0.01
4.74e-05
3.74e-05
-3.74e-05
3.74e-05
Utah State University
ACNAMt2
ACNMLr
ACt2r
ADEt
ADNt
AGDC
CO2t
ENO
FBA
FRUt1r
FUM
FUMtr
G6PDA
GALK
GALT
GALU
GALt1r
GAMt1r
GAPD
GLCt1r
3.74e-05
3.74e-05
-3.74e-05
-0.01
0.01
3.74e-05
5.87112
2.92556
1.46111
0.0075
0.25
0.25
4.74e-05
0.3169
1000
-1000
0.3169
1e-05
2.92556
1.12
H2Ot
HCO3E
HCO3_CLt
HEX1
HEX10
HEX4
HEX7
Ht
KCCt
L-LACt2r
LDH_L
MALt
MAN6PI
MANt1r
NAt
NH4t3r
NaKt
PFK
PGI
PGK
BE 5500/6500
6.13121
5.87112
-5.87112
1.12
1e-05
0.01
0.0075
5.87113
-5.87112
-2.92556
-2.92556
-0.25
0.01
0.01
8.80668
4.74e-05
2.93556
1.46111
1.4369
-2.92556
PGM
PGMT
PPM
PUNP1
PYK
PYRt2r
RPE
RPI
TALA
TKT1
TKT2
TPI
UDPG4E
UGLT
-2.92556
0.3169
0.01
0.01
2.92556
-3.74e-05
0.00666667
0.00666667
0.00333333
0.00333333
0.00333333
1.46111
-0.3169
-999.683
Objective Function
testRBC.m
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-43-
RBC Flux Balance Analysis (Default,No Loops)
EX_ac(e)
EX_acnam(e)
EX_ade(e)
EX_adn(e)
EX_arg-L(e)
EX_co2(e)
EX_fru(e)
EX_gal(e)
EX_gam(e)
EX_glc(e)
EX_h(e)
EX_h2o(e)
EX_h2o2(e)
EX_hco3(e)
EX_lac-L(e)
EX_man(e)
EX_nh4(e)
EX_o2(e)
EX_ptrc(e)
EX_pyr(e)
EX_urea(e)
3.74e-05
-3.74e-05
0.01
-0.01
-0.1152
-5.75592
-0.0075
-0.3169
-1e-05
-1.12
10
4.00359
-10
5.87112
2.2663
-0.01
4.74e-05
5
0.1152
0.659295
0.1152
Utah State University
ACGAM2E
ACGAMK
ACNAMt2
ACNMLr
ACt2r
ADEt
ADNt
AGDC
ARGN
ARGt5r
CAT
CO2t
DM_nadh
ENO
FBA
FRUt1r
G6PDA
GALK
GALt1r
GAMt1r
GAPD
-3.74e-05
3.74e-05
3.74e-05
3.74e-05
-3.74e-05
-0.01
0.01
3.74e-05
0.1152
0.1152
5
5.75592
0.659258
2.92556
1.46111
0.0075
4.74e-05
0.3169
0.3169
1e-05
2.92556
GLCt1r
H2O2t
H2Ot
HCO3E
HCO3_CLt
HEX1
HEX10
HEX4
HEX7
Ht
KCCt
L-LACt2r
LDH_L
MAN6PI
MANt1r
NAt
NH4t3r
NaKt
O2t
ORNDC
PFK
BE 5500/6500
1.12
10
-4.00359
5.87112
-5.87112
1.12
1e-05
0.01
0.0075
7.07445
-5.87112
-2.2663
-2.2663
0.01
0.01
8.80668
4.74e-05
2.93556
-5
0.1152
1.46111
PGI
PGK
PGM
PGMT
PPM
PTRCt
PUNP1
PYK
PYRt2r
RPE
RPI
TALA
TKT1
TKT2
TPI
UDPG4E
UGLT
UREAt
1.4369
-2.92556
-2.92556
0.3169
0.01
-0.1152
0.01
2.92556
-0.659295
0.00666667
0.00666667
0.00333333
0.00333333
0.00333333
1.46111
-0.3169
0.3169
-0.1152
Objective Function
testRBC.m
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
RBC
Metabolic
Map
H. Scott Hinton, 2015
-44-
Citric Acid Cycle
Pyruvate Metabolism
Glycolysis/Gluconeogenesis
Pentose Phosphate Pathway
Glutamate Metabolism
Entire RBC Network
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Constraint-based Metabolic Reconstructions & Analysis
H. Scott Hinton, 2015
-45-
Lesson Outline
• Overview
• Creating Tissue-specific Models
 GIMME-like Family
 iMAT-like Family
 MBA-like Family
• Tissue-specific Example
 Red Blood Cells
Utah State University
BE 5500/6500
Lesson: Tissue Specific Models
Download