5. Studying human metabolism

Biological Network Analysis:
Human Metabolic Network
Tomer Shlomi
Winter 2008
Lecture Outline
1. Human metabolic network reconstruction
2. Predicting tissue-specific metabolism
3. Predicting biomarkers for disease diagnosis
4. Implications of network topology for disease co-morbidity
5. Building models of tissue metabolism
1. Human metabolic network reconstruction
Why Study Human Metabolism?
• Inborn errors of metabolism cause acute symptoms and
even death at an early age
• Metabolic diseases (obesity, diabetes) are major sources
of morbidity and mortality.
• Metabolic enzymes and their regulators are gradually
becoming viable drug targets
• In-vivo studies of tissue-specific metabolic functions are
limited in scope
Network statistics
Human metabolic knowledge landscape
• Confidence scores:
– 3: biochemical or genetic evidence
– 2: physiological data or evidence from other mammalian cells
– 1: modeling evidence
– 0: unevaluated
Correlated reaction sets
• Deficiencies in enzymes belonging to the same functionally coupled
reaction set may have similar phenotypes
• Example: two genes related to the production and transport of glutathione
(an antioxidant) are both involved in hemolytic anemia (OMIM database)
Drug targets
• 3-Hydroxy-3-methylglutaryl-CoA reductase (Entrez Gene ID 3156), a
primary target of the antilipidemic statin class of drugs, belongs to this
coupled reaction set
• Other members of the set are potential candidates for treating
hyperlipidemia
[Figure: cholesterol biosynthesis]
2. Predicting tissue-specific metabolism
Human metabolism
• The first large-scale model of human metabolism (with ~2000 genes)
was published last year (Duarte et al., PNAS '07)
• Various cell-types/tissues activate different pathways
• Unknown tissue-specific metabolic objective functions
• Unknown tissue-specific metabolite uptake rates
• How to predict feasible metabolic states under various conditions?
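[Figure: toy metabolic network. Nutrients from the growth medium are converted through metabolites A–E, with cofactors (cof), into biomass and byproducts (byp); the flux through part of the network is unknown ("?").]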
Our objective
• Develop a general approach for predicting tissue-specific metabolic states
• Provide the first large-scale description of the metabolism of various
tissues
Our method
• Model integration with tissue-specific gene and protein expression
data
• Motivated by the assertion that highly expressed genes are expected
to carry metabolic flux
Enzyme expression level vs. metabolic flux level
• Changes in gene expression levels between conditions
significantly correlate with changes in predicted fluxes via
FBA:
– Schuster, et al, 2002
– Famili, et al, 2003
– Bilu, et al, 2006
• Changes in gene expression levels show high qualitative
correspondence with changes in measured fluxes:
– Daran, et al, 2004
– Fong, et al, 2004
Model integration with tissue-specific expression data
• Use expression level only as a cue for the existence of metabolic flux
• Network integration is then used to accumulate these cues into a global,
consistent metabolic state
[Figure: example network with enzymes E1–E7 and metabolites M1–M9 connecting an input to two outputs; each enzyme is marked as highly or lowly expressed.]
Inconsistencies with expression data: putative post-transcriptional regulation
• A flux activity state of a gene is defined based on the predicted flux through its
reactions
• Metabolic regulation – flux regulation via mass-action based effects
• Hierarchical regulation – flux regulation via changes in enzyme concentrations
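A minimal sketch of how an inconsistency between expression and predicted flux activity might be turned into a regulation call. The function name, the string labels, and the exact rule (lowly expressed but active = up-regulated; highly expressed but inactive = down-regulated) are assumptions made for illustration, inferred from the slide title and figure labels rather than taken from the lecture.

```python
def regulation_call(expression: str, activity: str) -> str:
    """Compare a gene's expression state with its predicted flux activity state.

    expression: "high", "low", or anything else (e.g. "moderate")  -- hypothetical labels
    activity:   "active", "inactive", or "undetermined"            -- hypothetical labels
    """
    if expression == "low" and activity == "active":
        # Flux is predicted despite low expression.
        return "post-transcriptionally up-regulated"
    if expression == "high" and activity == "inactive":
        # No flux is predicted despite high expression.
        return "post-transcriptionally down-regulated"
    if (expression, activity) in {("high", "active"), ("low", "inactive")}:
        return "consistent"
    # Moderate expression or an undetermined activity state gives no call.
    return "no call"


if __name__ == "__main__":
    for expr, act in [("low", "active"), ("high", "inactive"), ("high", "active")]:
        print(expr, act, "->", regulation_call(expr, act))
```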
[Figure: the same example network (E1–E7, M1–M9) with highly and lowly expressed enzymes; enzymes whose predicted flux activity disagrees with their expression are labeled "Up regulated" or "Down regulated".]
The computational method
• Relies on Mixed-Integer Linear Programming (MILP)
– Steady-state fluxes: $v$
– For a highly expressed reaction $i$, $a_i$ represents whether it is metabolically active
– For a lowly expressed reaction $i$, $n_i$ represents whether it is metabolically inactive
• Formulation:
$\max \sum_i a_i + \sum_i n_i$
(maximize the correlation with the expression data)
$S \cdot v = 0$; $v_{\min} \le v \le v_{\max}$
(feasible flux distribution)
if $a_i = 1$ then $v_i > 0$; if $n_i = 1$ then $v_i = 0$
(activity/inactivity constraints)
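The if-then constraints above are not linear as written. One common way to linearize them, sketched here as an assumption rather than as the formulation used in the lecture, is to use a small activity threshold $\varepsilon > 0$ and let the flux bounds act as big-M constants. The symbols $R_H$ and $R_L$ (the sets of highly and lowly expressed reactions) and $\varepsilon$ are introduced only for this sketch.

```latex
\begin{aligned}
\max_{v,\,a,\,n}\quad & \sum_{i \in R_H} a_i + \sum_{i \in R_L} n_i \\
\text{s.t.}\quad & S \cdot v = 0, \\
& v_{\min} \le v \le v_{\max}, \\
& v_i \ge \varepsilon\, a_i + v_{\min,i}\,(1 - a_i), && i \in R_H, \\
& v_{\min,i}\,(1 - n_i) \le v_i \le v_{\max,i}\,(1 - n_i), && i \in R_L, \\
& a_i,\; n_i \in \{0, 1\}.
\end{aligned}
```

With $a_i = 1$ the third constraint reduces to $v_i \ge \varepsilon$, and with $a_i = 0$ it reduces to the ordinary lower bound. With $n_i = 1$ the fourth constraint forces $v_i = 0$, and with $n_i = 0$ it reduces to the ordinary bounds.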
[Figure: the example network (E1–E7, M1–M9) annotated with the binary variables a1, a2, a3 for highly expressed reactions and n1, n2 for lowly expressed reactions.]
The computational method: considering alternative solutions
• Here, the lower pathway may be either activated or inactivated in an optimal solution,
achieving maximal correlation with the expression data
• Predict gene activity state by considering all feasible flux distributions
– A gene is predicted to be active if it cannot be inactivated in any optimal solution
– A gene is predicted to be inactive if it cannot be activated in any optimal solution
• Genes may be predicted to have an undetermined activity state
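The three-way rule above can be sketched as a comparison of two optimal objective values per gene: the best score when the gene is forced to be active and the best score when it is forced to be inactive. How those two values are obtained (presumably by re-solving the MILP twice per gene) is outside this sketch, and the function name and tolerance are assumptions.

```python
def activity_state(opt_if_active: float, opt_if_inactive: float, tol: float = 1e-9) -> str:
    """Classify a gene from the optimal objective values of two constrained problems.

    opt_if_active:   best objective with the gene forced to carry flux
    opt_if_inactive: best objective with the gene forced to carry no flux
    Use float("-inf") for an infeasible problem.
    """
    # Every solution has the gene either active or inactive,
    # so the global optimum is the larger of the two values.
    best = max(opt_if_active, opt_if_inactive)
    can_be_active = opt_if_active >= best - tol
    can_be_inactive = opt_if_inactive >= best - tol

    if can_be_active and not can_be_inactive:
        return "active"        # cannot be inactivated in any optimal solution
    if can_be_inactive and not can_be_active:
        return "inactive"      # cannot be activated in any optimal solution
    return "undetermined"      # both states occur among optimal solutions


if __name__ == "__main__":
    print(activity_state(10, 9))
    print(activity_state(8, 10))
    print(activity_state(10, 10))
```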
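[Figure: the example network (E1–E7, M1–M9) with the variables a1, a2, a3 and n1; reactions on the lower pathway are marked "?" because their activity state is undetermined.]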
Validating the method in predicting yeast metabolism
[Figure: validation workflow on the example network (E1–E7, M1–M9). Top: expression data under various media are integrated with the network and the predictions are compared with measured fluxes (Daran et al. '04). Bottom: Flux Balance Analysis (FBA) with growth (biomass) maximization under given uptake rates, used for the comparison with FBA.]
Applying the method to Human
• Employing the model of Duarte et al.
• Gene and protein expression from GeneNote and HPRD
• 10 tissues:
brain, heart, kidney, liver, lung, pancreas, prostate, spleen,
skeletal muscle and thymus.
• The activity state of 644 genes was uniquely determined in at least
one tissue, with an average of 408 genes per tissue
Cross validation
• The expression state of 80% of the genes is used as input
• Gene activity states are predicted for the 20% held-out set
• The overlap between the predicted activity state and the expression data
of the held-out set is highly significant for all tissues
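A minimal sketch of the 80/20 split and of one way the significance of the overlap could be assessed. The slides do not say which test was used; the hypergeometric tail probability below is an assumption, as are the function names and the fixed random seed.

```python
import random
from math import comb


def split_genes(genes, held_out_fraction=0.2, seed=0):
    """Randomly split genes into (input set, held-out set)."""
    shuffled = list(genes)
    random.Random(seed).shuffle(shuffled)
    n_held = round(len(shuffled) * held_out_fraction)
    return shuffled[n_held:], shuffled[:n_held]


def hypergeom_pvalue(population, successes, draws, overlap):
    """P(X >= overlap) for X ~ Hypergeometric(population, successes, draws).

    Example reading: among `population` held-out genes, `successes` are highly
    expressed, `draws` are predicted active, and `overlap` are both.
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    genes = [f"gene_{i}" for i in range(100)]  # hypothetical gene identifiers
    used, held_out = split_genes(genes)
    print(len(used), len(held_out))
    print(hypergeom_pvalue(population=20, successes=10, draws=8, overlap=7))
```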
Post-transcriptional regulation plays a major role
in tissue-specific metabolism
• 20% of the metabolic genes are predicted to be post-transcriptionally
regulated across tissues
• An average of 42 genes (3.6%) are post-transcriptionally up-regulated and 180
(15.4%) are post-transcriptionally down-regulated in each tissue
[Figure panels: up-regulation; down-regulation]
Large-scale validation
• Predicted tissue-specificity of genes, reactions, and metabolites is
significantly correlated with various independent data sources
• The high correlation is still evident when focusing on predictions of post-transcriptionally regulated genes:
– A significantly high fraction of the genes that are predicted to be post-transcriptionally up-regulated in certain tissues are known to be active there
– A significantly low fraction of the genes predicted to be down-regulated in certain tissues are
known to be active there
Metabolic disease-causing genes
• Many disease genes (OMIM) are predicted to be post-transcriptionally up-regulated specifically in tissues affected by the disease
3. Predicting biomarkers for disease diagnosis
A method for predicting metabolic
biomarkers
• Inborn errors of metabolism are commonly diagnosed via biofluid
metabolomics, identifying metabolites with altered concentrations
• Perform systematic biomarker prediction for all known genetic metabolic
disorders via a genome-scale model
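[Figure: metabolite exchange intervals. A tissue network (metabolites M1–M7, fluxes v1–v7) exchanges metabolites with the biofluids; each exchange flux has an interval on an axis running from uptake through 0 to secretion. Exchange fluxes are labeled reduced or elevated (some with high confidence) or unchanged.]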
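One way to read the exchange-interval comparison is as a rule that compares a metabolite's feasible exchange interval in the healthy state with its interval in the disease state. The rule below is a sketch under stated assumptions, not the lecture's definition: non-overlapping intervals give a high-confidence call, a one-directional shift of the bounds gives a standard call, and the sign convention is negative for uptake and positive for secretion.

```python
def classify_biomarker(healthy, disease, tol=1e-9):
    """Compare (lower, upper) exchange-flux intervals; positive flux = secretion.

    Returns (change, confidence), where change is "elevated", "reduced",
    or "unchanged", and confidence is "high", "standard", or None.
    """
    h_lo, h_hi = healthy
    d_lo, d_hi = disease

    # Disjoint intervals: the whole disease range lies above or below the healthy one.
    if d_lo > h_hi + tol:
        return ("elevated", "high")
    if d_hi < h_lo - tol:
        return ("reduced", "high")

    # Overlapping intervals: look for a shift of the bounds in one direction only.
    up = d_lo > h_lo + tol or d_hi > h_hi + tol
    down = d_lo < h_lo - tol or d_hi < h_hi - tol
    if up and not down:
        return ("elevated", "standard")
    if down and not up:
        return ("reduced", "standard")
    return ("unchanged", None)  # identical intervals, or an ambiguous change


if __name__ == "__main__":
    print(classify_biomarker((0.0, 1.0), (2.0, 3.0)))
    print(classify_biomarker((0.0, 2.0), (1.0, 3.0)))
    print(classify_biomarker((0.0, 2.0), (0.0, 2.0)))
```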
Validation via Kinetic RBC Model
• Apply the method to predict biomarkers for enzyme deficiencies in human
erythrocytes, for which a detailed kinetic model is available for validation
(Jamshidi et al. 2001).
• Kinetic simulations identified 156 biomarkers for 43 enzymatic disorders
• Our method predicts 85 biomarkers, with a precision of 0.73 and recall of 0.4.
• Our method correctly identifies alterations in extracellular metabolite
concentrations, relying solely on reaction stoichiometry and directionality data.
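For reference, precision and recall over sets of predicted and reference biomarkers can be computed as in the sketch below. The function name and the representation of a biomarker as a (disorder, metabolite) pair are assumptions made for illustration.

```python
def precision_recall(predicted: set, reference: set):
    """Return (precision, recall) of a predicted set against a reference set."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical (disorder, metabolite) pairs, not data from the lecture.
    predicted = {("d1", "m1"), ("d1", "m2"), ("d2", "m3"), ("d3", "m4")}
    reference = {("d1", "m1"), ("d2", "m3"), ("d2", "m5")}
    print(precision_recall(predicted, reference))
```

The figures quoted above are mutually consistent: 0.73 × 85 predicted biomarkers and 0.4 × 156 reference biomarkers both come to roughly 62 shared biomarkers.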
Predicting biomarkers for an array of inborn errors of metabolism
• The concentration of 223 metabolites is predicted to change as a result of 176
possible dysfunctional enzymes
• A high fraction of the disorders (42%) are predicted to have very few biomarker
changes (fewer than 6)
• Many of the disorders (45%) have a unique set of biomarker alterations; these
predictions may be used for the unique diagnosis of metabolic disorders via
biofluid metabolomics
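Checking which disorders have a unique set of biomarker alterations amounts to grouping disorders by their biomarker signature. A minimal sketch follows; the function name, the disorder and metabolite identifiers, and the encoding of an alteration as a (metabolite, direction) pair are all hypothetical.

```python
from collections import defaultdict


def uniquely_diagnosable(biomarkers_by_disorder):
    """Return disorders whose non-empty biomarker signature is shared with no other disorder.

    biomarkers_by_disorder: dict mapping disorder -> iterable of (metabolite, direction).
    """
    groups = defaultdict(list)
    for disorder, markers in biomarkers_by_disorder.items():
        groups[frozenset(markers)].append(disorder)
    return sorted(
        disorders[0]
        for signature, disorders in groups.items()
        if signature and len(disorders) == 1
    )


if __name__ == "__main__":
    example = {
        "disorder_A": [("met_1", "elevated")],
        "disorder_B": [("met_1", "elevated")],                      # same signature as A
        "disorder_C": [("met_1", "elevated"), ("met_2", "reduced")],
        "disorder_D": [],                                           # no predicted biomarkers
    }
    print(uniquely_diagnosable(example))
```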
Validating biomarker predictions
• Systematic OMIM data
– Extracted via text mining of the disease description field of the OMIM database
– Error-prone data
– Shows moderate correlation with our method's predictions
• Manual OMIM data
– Extracted via manual inspection of errors in amino-acid metabolism in the OMIM
database
– High-quality data: identifies the specific reaction affected in each disease; resolves
metabolite name ambiguities
– Shows high correlation with our method's predictions
• RAMEDIS/HMDB
– Low-quality data (correlates poorly with the OMIM data)
Validating predicted biomarkers
• Data on known biomarkers were extracted for inborn errors of amino-acid
metabolism
• The predictions are significantly correlated with the known biomarkers (p-value = $4 \cdot 10^{-13}$): precision = 0.76, recall = 0.56