Gene's flux activity states

Network-based data integration reveals
extensive post-transcriptional regulation of
human tissue-specific metabolism
Tomer Shlomi*, Moran Cabili*, Markus J. Herrgard,
Bernhard Ø. Palsson and Eytan Ruppin
* These authors contributed equally to this work
1
Metabolism
Metabolism is the totality of all the chemical
reactions that operate in a living organism.
Catabolic reactions: break down molecules and produce energy
Anabolic reactions: use energy to build up essential cell components
2
Why Study Human Metabolism?
• Inborn errors of metabolism cause acute symptoms and
even death at an early age
• Metabolic diseases (obesity, diabetes) are major sources
of morbidity and mortality
• Metabolic enzymes and their regulators are gradually
becoming viable drug targets
3
Modeling Cellular Metabolism
A Short Review
Metabolic flux: the production or elimination of a quantity of metabolite
per mass of organ or organism over a specific time frame
[Figure legend: metabolite; reaction catalyzed by an enzyme]
"...it is the concept of metabolic flux that is crucial in the translation
of genotype and environmental factors into phenotype or a threshold for
disease." (Brendan Lee, Nature 2006)
4
Constraint Based Modeling
Find a steady-state flux distribution through all
biochemical reactions
• Under the constraints:
– Mass balance: metabolite production and consumption rates are
equal
– Thermodynamic: irreversibility of reactions
– Enzymatic capacity: bounds on enzyme rates
• Successfully predicts:
constant
5
Constraint Based Modeling (CBM)
Mathematical Representation of Constraints
Example reaction (glucokinase): Glucose + ATP → Glucose-6-Phosphate + ADP
• Stoichiometric matrix S (rows: metabolites, columns: reactions): network
topology with the stoichiometry of biochemical reactions

            Glucokinase
  Glucose       -1
  ATP           -1
  G-6-P         +1
  ADP           +1

• Mass balance: S·v = 0 (a subspace of R^n)
• Thermodynamic & capacity: 0 < vi < 10 (a bounded convex cone)
• Optimization: maximize Vgrowth
Fell et al. (1986), Varma and Palsson (1993)
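The constraints above can be illustrated with a minimal sketch. The three-reaction network, its bounds, and the function names are hypothetical and not taken from the slides, and a brute-force scan over integer fluxes stands in for the linear program that a real analysis would hand to an LP solver.

```python
from itertools import product

# Hypothetical toy network (an assumption for illustration):
#   R_in: -> A,   R_conv: A -> B,   R_growth: B ->
S = [
    [1, -1, 0],   # metabolite A
    [0, 1, -1],   # metabolite B
]
# Capacity bounds 0 <= v_i <= upper[i]; uptake is limited to 4 units.
upper = [4, 10, 10]

def is_steady_state(S, v):
    """True if S·v = 0: each metabolite is produced as fast as it is consumed."""
    return all(sum(s * x for s, x in zip(row, v)) == 0 for row in S)

def max_growth(S, upper, growth_index):
    """Scan integer flux vectors within bounds and keep the best steady state."""
    best = None
    for v in product(*(range(u + 1) for u in upper)):
        if is_steady_state(S, v) and (best is None or v[growth_index] > best[growth_index]):
            best = v
    return best

best_flux = max_growth(S, upper, growth_index=2)
print(best_flux)  # (4, 4, 4)
```

In this toy case mass balance forces all three fluxes to be equal, so the uptake bound, not the growth reaction's own capacity, limits the optimum.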
6
Human Metabolic Models
• Motivated by the fact that in-vivo studies of tissue-specific
metabolic functions are limited in scope
• Individual genes and pathways (KEGG, HumanCyc)
– Detailed description of the genes, reactions, enzymes
– No connections between pathways
• Specific cell types and organelles
– Red blood cell (Wiback et al. 2002)
– Mitochondria (Vo et al. 2004)
• Large-scale human metabolic networks
– The first large-scale model of human metabolism: ~2000 genes, ~3700
reactions, 7 organelles (Duarte et al. 2007, Ma et al. 2007)
7
CBM in Human
Modeling human tissue function is problematic
• Various cell types activate different pathways
(shown in expression studies)
• Hard to formulate cellular metabolic objectives
(like biomass maximization for microbial species)
• Unknown inputs and outputs of each cell type
Can we use constraint-based modeling
to systematically predict tissue-specific metabolic behavior?
8
Our Objective:
1. A general approach for studying tissue-specific metabolic
models
2. Tissue-specific activity of metabolic genes/reactions
Our Method:
Model integration with tissue-specific gene and
protein expression data
Motivated by the assertion that genes highly expressed
in a certain tissue are likely to be active there
9
Our Method
1. Gene expression data and protein measurement data
→ highly and lowly expressed gene sets
→ (gene-to-reaction mapping)
→ highly and lowly expressed reaction sets
2. Human metabolic model (Duarte et al.)
3. New objective function: maximize consistency with expression data,
using mixed integer linear programming (MILP)
4. Determine the activity state and confidence level for each gene/reaction
10
Our Method
Determine Highly and Lowly Expressed Reaction Sets
1. Gene sets: extract the sets of enzymes whose expression is significantly
increased or decreased (GeneNote, HPRD)
2. Reaction sets: employ a detailed gene-to-reaction mapping to identify a
tissue-specific expression state for each reaction, for example:
R1 = (g1 & g2) | g3 | g4
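One way to evaluate such a rule is sketched below. The slides do not say how gene states are combined, so the three-level encoding and the min/max convention (AND as the weakest subunit of a complex, OR as the strongest isozyme) are assumptions, and all names are hypothetical.

```python
# Expression states (an illustrative encoding, not taken from the slides).
HIGH, MODERATE, LOW = 1, 0, -1

def gene_and(*states):
    """Enzyme complex: limited by its least expressed subunit."""
    return min(states)

def gene_or(*states):
    """Isozymes: the best expressed alternative determines the state."""
    return max(states)

def r1_state(g):
    """Expression state of R1 = (g1 & g2) | g3 | g4."""
    return gene_or(gene_and(g["g1"], g["g2"]), g["g3"], g["g4"])

tissue = {"g1": HIGH, "g2": LOW, "g3": MODERATE, "g4": LOW}
print(r1_state(tissue))  # 0, i.e. MODERATE
```

Here the complex g1 & g2 is held back by the lowly expressed g2, so the moderately expressed g3 sets the state of R1.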
11
Our Method
Represent Flux Consistency with Expression State
[Figure: example metabolic network with metabolites M1-M9, reactions E1-E7,
input and output fluxes, and Boolean labels H1-H3 and L1-L2; legend: highly
expressed, lowly expressed]
We look for a real-valued flux vector V.
Now add Boolean vectors H and L such that:
Hi = 1 ⇒ Vi ≠ 0 (if the enzyme associated with Vi is highly expressed)
Li = 1 ⇒ Vi = 0 (if the enzyme associated with Vi is lowly expressed)
13
Our Method
Define a New Objective function
[Figure: the example network (metabolites M1-M9, reactions E1-E7, labels
H1-H3 and L1-L2, input and output fluxes; legend: highly expressed, lowly
expressed), annotated "4 out of 5 reactions were consistent with the
expression state!"]
Use Mixed Integer Linear Programming. Define a new objective function:
MAX Σ (Hi + Li)
In practice, this maximizes the number of highly expressed reactions that
are active plus the number of lowly expressed reactions that are inactive,
i.e. it maximizes consistency with the expression data.
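The objective can be illustrated on a small example. This is a sketch under stated assumptions: the five-reaction network and its expression labels are hypothetical (not the network in the figure), and exhaustive enumeration over a coarse integer flux grid replaces the MILP solver used in practice.

```python
from itertools import product

# Hypothetical toy network (an assumption, not the network drawn on the slide).
# Columns are reactions R0..R4, rows are metabolites A, B, C:
#   R0: -> A,  R1: A -> B,  R2: B -> C,  R3: C ->,  R4: A ->
S = [
    [1, -1, 0, 0, -1],  # A
    [0, 1, -1, 0, 0],   # B
    [0, 0, 1, -1, 0],   # C
]
HIGH = {2, 3}  # reactions whose enzymes are highly expressed
LOW = {1, 4}   # reactions whose enzymes are lowly expressed

def consistency(v):
    """Sum of H_i + L_i: active high reactions plus inactive low reactions."""
    return sum(v[i] > 0 for i in HIGH) + sum(v[i] == 0 for i in LOW)

def most_consistent_flux(levels=3):
    """Scan integer fluxes 0..levels-1; keep the first steady state with the top score."""
    best_v, best_score = None, -1
    for v in product(range(levels), repeat=len(S[0])):
        balanced = all(sum(s * x for s, x in zip(row, v)) == 0 for row in S)
        if balanced and consistency(v) > best_score:
            best_v, best_score = v, consistency(v)
    return best_v, best_score

flux, score = most_consistent_flux()
print(flux, score)  # (1, 1, 1, 1, 0) 3
```

Three of the four labeled reactions end up consistent. R1 is lowly expressed yet must carry flux to feed the highly expressed R2 and R3, which is the kind of mismatch the method reads as post-transcriptional up-regulation.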
14
Our Method
Flux Activity State
• Genes' flux activity states reflect the absence or existence of
non-zero flux through the enzymatic reactions they encode
• Comparing the flux activity states with the expression
states reveals post-transcriptional regulation
[Figure: example network (metabolites M1-M9, reactions E1-E7); legend: highly
expressed, lowly expressed, up regulated, down regulated]
16
Flux Activity State
Consider Space of Possible Solutions
• For each tissue we predict sets of active and inactive genes and
reactions
• Since the MILP problem has a space of possible solutions, we
solve a set of MILP problems to determine each gene's activity:
1. Simulate a state where the gene is inactive
2. Simulate a state where the gene product is active
Estimate confidence levels from the drop in
consistency (with expression) between the two
solutions!
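The two-simulation procedure can be sketched as follows. The toy network, the expression labels, and the function names are hypothetical, the sketch works per reaction rather than per gene, and brute-force enumeration over integer fluxes stands in for each MILP solve.

```python
from itertools import product

# Hypothetical toy network (illustrative assumption), reactions R0..R4:
#   R0: -> A,  R1: A -> B,  R2: B -> C,  R3: C ->,  R4: A ->
S = [[1, -1, 0, 0, -1], [0, 1, -1, 0, 0], [0, 0, 1, -1, 0]]
HIGH, LOW = {2, 3}, {1, 4}  # highly / lowly expressed reactions

def best_score(reaction, active):
    """Best expression consistency when `reaction` is forced active or inactive.

    Brute force over integer fluxes 0..2 stands in for one MILP solve.
    """
    scores = [
        sum(v[i] > 0 for i in HIGH) + sum(v[i] == 0 for i in LOW)
        for v in product(range(3), repeat=5)
        if all(sum(s * x for s, x in zip(row, v)) == 0 for row in S)
        and (v[reaction] > 0) == active
    ]
    return max(scores)  # assumes both forced states are feasible in this toy

def activity_call(reaction):
    """Return (state, confidence), where confidence is the drop in consistency."""
    diff = best_score(reaction, True) - best_score(reaction, False)
    if diff > 0:
        return "active", diff
    if diff < 0:
        return "inactive", -diff
    return "undetermined", 0

for r in range(5):
    print(f"R{r}", activity_call(r))
```

A reaction whose two forced states reach the same consistency is left undetermined, which is in line with the activity state being determined for only a subset of the model genes.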
17
Results
Gene Tissue Specific Activity
• We applied the method described above to:
– the metabolic network model of Duarte et al.
– gene and protein expression measurements from
GeneNote and HPRD
• 10 tissues:
brain, heart, kidney, liver, lung, pancreas, prostate,
spleen, skeletal muscle and thymus
• The activity state of 781 out of 1475 model genes was
determined in at least one tissue
18
Post-transcriptional Regulation
of Metabolic Genes
• Post-transcriptional regulation plays a major role in
shaping tissue-specific metabolic behavior, affecting ~20% of the
metabolic genes per tissue
• On average, 42 genes (3.6%) are post-transcriptionally up-regulated and 180
(15.4%) are post-transcriptionally down-regulated in each tissue
[Figure legend: down-regulated, up-regulated]
19
Cross Validation Test
• We performed a five-fold cross-validation test
• 80% of the genes were used to constrain the model
• Gene activity states for a held-out set of 20% of the genes were predicted
from the expression constraints of the remaining 80%
• The overlap between the genes predicted as active and the highly
expressed genes in the held-out data was significantly high for all tissues
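A minimal sketch of the split and of one way to score the overlap. The slides do not name the statistical test, so the hypergeometric tail probability here is an assumption, as are the function names and the toy gene list.

```python
import random
from math import comb

def five_fold_splits(genes, seed=0):
    """Yield (training, held_out) gene lists; each fold holds out about 20%."""
    shuffled = genes[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::5] for i in range(5)]
    for k in range(5):
        training = [g for j, fold in enumerate(folds) if j != k for g in fold]
        yield training, folds[k]

def hypergeom_pvalue(n_total, n_high, n_pred, n_overlap):
    """P(overlap >= n_overlap) if n_pred genes were picked at random
    from n_total held-out genes, n_high of which are highly expressed."""
    top = min(n_high, n_pred)
    hits = sum(comb(n_high, k) * comb(n_total - n_high, n_pred - k)
               for k in range(n_overlap, top + 1))
    return hits / comb(n_total, n_pred)

genes = [f"g{i}" for i in range(20)]  # hypothetical gene identifiers
splits = list(five_fold_splits(genes))
print([(len(train), len(test)) for train, test in splits])

# Toy numbers: 20 held-out genes, 5 highly expressed, 5 predicted active, 4 shared.
p = hypergeom_pvalue(20, 5, 5, 4)
print(p)
```

In each fold the model would be constrained with the expression states of the training genes only, and the predicted activity of the held-out genes compared against their measured expression.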
20
Large Scale Validation
Large-Scale Mining of Tissue-Specificity Data
- Tissue specificity of genes, reactions, and metabolites is significantly
correlated with all data sources
- Tissue specificity of post-transcriptionally up-regulated elements is
significantly high!
- Tissue specificity of post-transcriptionally down-regulated elements is
significantly low!
21
Tissue-Specific Metabolite
Exchange with Biofluids
• 249 metabolites are known to be secreted or taken up by human tissues
• 54% of the metabolites are not associated with transporters and
cannot be predicted by expression data
• Transport direction cannot be inferred from the expression data
• A transporter might carry several metabolites
• Many of the known transporters are post-transcriptionally regulated
22
Metabolic Disease-Causing Genes
• 162 metabolic genes are associated with a Mendelian disease
• Prediction accuracy: precision of 49% and recall of 22%
• Post-transcriptional regulation has a significant effect on
disease-causing genes
(e.g. GBE1, which causes a glycogen storage disease, is post-transcriptionally
up-regulated in liver, heart, skeletal muscle, and brain)
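For reference, precision and recall can be computed from sets of predicted and known items as sketched below. The slide does not spell out which sets were compared, so the gene names and set sizes here are hypothetical toy values, not the study's data.

```python
def precision_recall(predicted, known):
    """Precision: share of predictions that are known to be correct.
    Recall: share of known items that were predicted."""
    true_positives = len(predicted & known)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(known) if known else 0.0
    return precision, recall

# Hypothetical toy sets, for illustration only.
predicted = {"geneA", "geneB", "geneC", "geneD"}
known = {"geneA", "geneB", "geneE", "geneF", "geneG"}

precision, recall = precision_recall(predicted, known)
print(precision, recall)  # 0.5 0.4
```

A precision well above the recall, as reported on the slide, means the predictions that are made tend to be right, while many known associations are not recovered.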
23
Summary
Methodological Standpoint
• First constraint-based modeling analysis of recently
published human metabolic networks
• First to account for post-transcriptional regulation within the
computational framework of large-scale metabolic
modeling
• Integrates expression data as part of the optimization
instead of imposing it as a constraint during a
preprocessing step (Akesson et al. 2004)
24
Summary
Main Conclusions
• Post-transcriptional regulation plays a significant role in
shaping tissue-specific metabolic behavior
• The tissue specificity of many metabolic disease-causing
genes goes markedly beyond that manifested in their
expression level, giving rise to new predictions concerning
their involvement in different tissues
• Metabolite exchange with biofluids displays a large
variance across tissues, composing a unique view of
tissue-specific uptake and secretion of hundreds of
metabolites
25
What’s Next?
• Integrate other tissue-specificity data
• Modeling of metabolic diseases
– Using various data sources (known disease-causing
genes, drug databases)
– Predict tissue-wide metabolic symptoms
– Predict metabolic response to drugs
• Predict disease biomarkers that can be identified by
biofluid metabolomics
26
Thank you!
27
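Mathematical representation of our optimization problem
$$
\begin{aligned}
\max_{v,\,y^{+},\,y^{-}} \quad & \sum_{i \in R_E} \left( y_i^{+} + y_i^{-} \right) + \sum_{i \in R_N} y_i^{+} \\
\text{s.t.} \quad & S \cdot v = 0 && (1) \\
& v_{\min} \le v \le v_{\max} && (2) \\
& v_i + y_i^{+} \left( v_{\min,i} - \varepsilon \right) \ge v_{\min,i}, \quad i \in R_E && (3) \\
& v_i + y_i^{-} \left( v_{\max,i} + \varepsilon \right) \le v_{\max,i}, \quad i \in R_E && (4) \\
& v_{\min,i} \left( 1 - y_i^{+} \right) \le v_i \le v_{\max,i} \left( 1 - y_i^{+} \right), \quad i \in R_N && (5) \\
& v \in \mathbb{R}^m, \quad y_i^{+},\, y_i^{-} \in \{0, 1\}
\end{aligned}
$$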
28