ATP - Manchester Centre for Integrative Systems Biology

Download Report

Transcript ATP - Manchester Centre for Integrative Systems Biology

Metabolic Engineering
Flux Balance Analysis
Evangelos Simeonidis
[email protected]
Simulating living cells
• Living cells contain >300 gene products
• E.coli: 4000 gene products
• yeast: 6000 gene products
• ~1000 are enzymes
• Thousands of processes occur simultaneously 
 immense complexity
How can Metabolic Engineering cope with this complexity?
Stoichiometry
• Stoichiometry, together with Mass Balances help us
overcome much of this complexity
dX i
  sip  v p   sic  vc  import  export
dt
p
c
Linear Pathway
S
v1
X
dX
0  v1  v2
dt
v2
Y
v3
P
dY
0  v22  v33
dt
 v1  v2  v3
• This system has 3 degrees of freedom = number of rates
• What will happen to the rates after time 0? After a long time?
• The system is said to be in “steady state”
Flux Analysis
S
v1
X
v2
Y
v3
P
v1  v2 , v2  v3
• The rates at steady state are commonly called “fluxes”
• Degrees of freedom = number of rates – flux constraints
• Because this is a linear pathway, there is ONE flux through the system
• Value of flux determined by kinetics; but the equality of the 3 fluxes is
determined ONLY by stoichiometry
How do we increase the yield (production) of P?
S
v1
Flux Analysis
v2
X
Y
v3
P
• Write the mass balances that are possible for this system. What changed from
the previous example?
• Degrees of freedom ?
• How do we change the yield in the first example? How do we change it in
this example?
• What is the difference from a Metabolic Engineering Perspective?
Flux Analysis and Optimisation
ATP
S
X
ADP + Pi
Y
P
• Knowing enough about the system is often difficult; assume for example that
in the above “cell” we can only measure consumption of S and production of P
• What is the relationship between the two fluxes in ss? Is there a way to
evaluate the internal fluxes?
• An assumption is needed; often this assumption is that the organism is
operating OPTIMALLY
• e.g. assume the above organism optimal with respect to energetic efficiency
Topics covered
• Constraint-based approach to metabolic modelling
• Mathematics behind FBA: Optimisation
• Mathematical examples
Introduction
• A major goal of systems biology is to relate genome
sequence to cell physiology
• This requires the identification of the components and
their interactions in the system + mathematical modelling
• Small molecule metabolism is the best described
molecular network in the cell and there are various
computational tools to model its behaviour
Metabolic network reconstructions
Network reconstruction  delineation of the chemical and
physical interactions between the components
Metabolic network reconstructions
• Automated metabolic reconstructions for > 500 organisms
based on genome sequence data (e.g. KEGG database)
• Automated reconstructions are usually not suitable for
modelling
• Manual assembly gives higher quality networks and is
based on genomic + biochemical + physiological data
High quality manual reconstructions
• Incorporate information on:
reaction reversibility
cofactor usage
transport reactions
cellular compartments (e.g. mitochondrion)
biomass composition
• Only available for well studied microbes (e.g. yeast, E.
coli and ~10 other bacteria)
High quality manual reconstructions
Example: Escherichia coli metabolic reconstruction*
• the best characterized network
• 931 reactions
• 625 different metabolites
But: 67 are dead end!
* Reed et al . (2003) Genome Biol 4: R54
How to analyse such a complex network?
ExPASy – Metabolic Pathways
Problem with kinetic modelling
A lot of data is required to parameterize large-scale models,
experimentally intractable at present.
The largest kinetic metabolic model available:
 Human red blood cell (35 enzymes)
Two modelling approaches
Mechanistic
Constraint-based
(kinetic)
(stoichiometric)
Kinetic rate equation for
each reaction + parameters
Consider all possible
behaviours of the system
(large solution space)
Simulation of system
behaviour
Imposing constraints
(physicochemical laws,
biological constraints)
Smaller allowable solution
space
Types of constraints
• Physico-chemical constraints
 mass, charge and energy conservation, laws of
thermodynamics
• Biological constraints:
 external environment, regulatory constraints
Stoichiometric modelling of metabolism
Metabolic network
Thermodynamic
constraints
(irreversibility)
Mass balance
constraint in
steady state
etc.
Maximum
enzyme capacity
Problem:
Addition of constraints reduces the allowable solution space,
but usually not to a single point (underdetermined system).
How to find a particular solution?
We can look for a solution which optimises a particular network
function (e.g. production of ATP or biomass)  FBA
Topics covered
• Constraint-based approach to metabolic modelling
• Mathematics behind FBA: Optimisation
• Mathematical examples
Optimisation and Mathematical Programming
• optimisation problem or mathematical programming problem: a
formulation in which a function is minimised by systematically
choosing the values of variables from within an allowed set
Given a function f: A  R (e.g. min x2+1)
Find an element x0 in A such that f(x0) ≤ f(x) for all x in A
• The domain A of f is called the search space, while the
elements of A are called feasible solutions
• A is specified by a set of constraints (equalities or inequalities)
• function f is called an objective function
• A feasible solution that minimizes the objective function is
called an optimal solution
Subfields
• Linear programming studies the case in which the objective
function f is linear and the constraints linear equalities and
inequalities
• Integer linear programming studies linear programs in which
some or all variables take on integer values
• Nonlinear programming studies the general case in which the
objective function or the constraints or both are nonlinear
Techniques for solving
mathematical programming problems
• There exist robust, fast numerical techniques for optimising
mathematical programming problems
–
–
–
–
–
–
–
Gradient descent (steepest descent)
Nelder-Mead method
Simplex method
Ellipsoid method
Newton's method
Quasi-Newton methods
Interior point methods
– Conjugate gradient method
Alternatives for optimisation
• Mathematical programming and its techniques for solving of
optimisation problems are powerful tools, but are not the only
solutions available. Other approaches (that usually apply numerical
analysis approximations) are:
–
–
–
–
–
–
–
–
–
–
–
Hill climbing
Simulated annealing
Quantum annealing
Tabu search
Beam search
Genetic algorithms
Ant colony optimization
Evolution strategy
Stochastic tunneling
Particle swarm optimization
Differential evolution
Linear Programming (LP)
• LP model 
objective function + linear constraints
minimise c1.x1+ c2.x2+ … +cn.xn
• extensively used optimisation
technique
subject to:
• allocation of limited resources to
competing activities in the optimal
way
• examples of application: graphs,
network flows, plant management,
economics, business management
equality constraints
ai1.x1+ ai2.x2+ … +ain.xn = ai0
inequality constraints
bi1.x1+ bi2.x2+ … +bin.xn  bi0
or in matrix form:
• most prominent method for solving:
simplex method
• Prominent solver: CPLEX
min cT.x
subject to:
A.x = a
B.x  b
Mixed Integer Programming
• If variables are required to be integer, then the problem is an integer
programming (IP) or mixed integer programming (MIP) problem
• In contrast to linear programming, which can be solved efficiently in
the worst case, integer programming problems are in the worst case
undecidable, and in many practical situations NP-hard
• MIP problems are solved using advanced algorithms such as branch
and bound or branch and cut
• LP and MILP solvers are in widespread use for optimization of various
problems in industry, such as optimization of flow in transportation
networks
–
–
–
–
CPLEX
MINTO
AIMMS
SYMPHONY
–
–
–
–
Xpress-MP
GNU Linear Programming Kit
Qoca
Cassowary constraint solver
Optimisation in FBA
• Linear Programming may be used to study the stoichiometric
constraints on metabolic networks
optimisation is used to predict metabolic flux distributions at
steady state based on the assumption of maximised growth
performance along evolution
• only stoichiometric data and cellular composition required
• valuable for identifying flux distribution boundaries for the
metabolic function of cellular systems
Application
• FBA involves carrying out a steady state analysis, using the
stoichiometric matrix (S) for the system in question
• The system is assumed to be optimised with respect to objectives such
as maximisation of biomass production or minimisation of nutrient
utilisation
• At steady state:
dx
 S v  0
dt
• The required flux distribution is the null space of S. Since the number
of fluxes typically exceeds the number of metabolites, the system is
under-determined and may be solved by selecting an optimisation
criterion, following which, the system translates into an LP problem
Mathematical Model
max
s.t.
Σj cj . vj
Σj Sij . vj = 0
Lj ≤ vj ≤ Uj
0 ≤ vj ≤ Uj
i
jreversible
jirreversible
i: metabolites
j: reactions
Sij: stoichiometric matrix
vj: reaction fluxes (mmol / gDW hr)
cj: weight
Lj, Uj: bounds
Small example for S matrix construction
AB
A+BC
B+C2A
j1
 -1
 1

 0
j3
-1
-1
1
j4
2 A
-1  B
-1  C
Exercise 1
AB
A+E2C
B+CD+E
2E+C 2A+B
j1
-1
1

0

0
 0
j2
j3
j4
2 A

0 -1 1  B
2 -1 -1 C

0 1 0D
-1 1 -2  E
-1 0
Exercise 2
Topics covered
• Constraint-based approach to metabolic modelling
• Mathematics behind FBA: optimisation
• Mathematical examples
Model Network
• Simplified model of E. coli metabolic network
Varma and Palsson, J.Theor.Biol. (1993) 165, 477
• Reactions: 35
– 17 reversible
– 18 irreversible
• Metabolites: 30
Glc
G6P
ATP
Hexp
R5P
NADPH
F6P
NADH
E4P
Fum
Succ
T3P
SuccCoA
Mal
3PG
Glx
PEP
OA
aKG
Cit
Pyr
Lac
QH2
AcCoA
Eth
Ac
ICit
FADH
Single metabolite production maximisation
Maximum stoichiometric yields (mmol/mmol Glc)
ATP
NADH
NADPH
3PG
PEP
Pyr
OA
G6P
18.67
11.57
11.00
2.00
2.00
2.00
1.50
0.91
F6P
R5P
E4P
T3P
AcCoA
aKG
SuccCoA
0.91
1.08
1.33
1.74
2.00
1.00
1.00
Glc
G6P
ATP
Hexp
R5P
NADPH
F6P
NADH
E4P
Fum
Succ
T3P
SuccCoA
Mal
3PG
Glx
PEP
OA
aKG
Cit
Pyr
Lac
QH2
AcCoA
Eth
Ac
ICit
FADH
Flux distribution map for maximal ATP production
Glc
ATP
Hexp
QH2
G6P
NADPH
NADH
F6P
Fum
Succ
T3P
Mal
SuccCoA
OA
aKG
3PG
PEP
Cit
Pyr
AcCoA
ICit
FADH
Glc
G6P
ATP
Hexp
R5P
NADPH
F6P
NADH
E4P
Fum
Succ
T3P
SuccCoA
Mal
3PG
Glx
PEP
OA
aKG
Cit
Pyr
Lac
QH2
AcCoA
Eth
Ac
ICit
FADH
Flux distribution map for maximal PEP
production
Glc
ATP
Hexp
G6P
NADPH
F6P
T3P
Mal
3PG
PEP
Pyr
OA
NADH
QH2
Biomass production maximisation
Metabolic Demands for 1g of biomass yield (mmol)
ATP
41.2570
T3P
0.1290
NADH
-3.5470
3PG
1.4960
NADPH
18.2250
PEP
0.5191
G6P
0.2050
Pyr
2.8328
F6P
0.0709
AcCoA
3.7478
R5P
0.8977
OA
1.7867
E4P
0.3610
aKgG
1.0789
Maximum biomass yield:
0.589 g DW / g Glc
Glc
G6P
ATP
Hexp
R5P
NADPH
F6P
NADH
E4P
Fum
Succ
T3P
SuccCoA
Mal
3PG
Glx
PEP
OA
aKG
Cit
Pyr
Lac
QH2
AcCoA
Eth
Ac
ICit
FADH
Optimal flux distribution for
aerobic growth on glucose
Glc
G6P
ATP
Hexp
R5P
NADH
F6P
E4P
Fum
Succ
T3P
Mal
3PG
Glx
PEP
OA
aKG
Cit
Pyr
QH2
AcCoA
ICit
FADH
FBA Summary
• Simple, no kinetic information needed
• Can be applied to large networks
• In accordance with experimental results
• Can be used for defining wider limits of metabolic
behaviour
Further reading
Bernhard O. Palsson: Systems Biology: Properties of reconstructed
networks, Cambridge University Press, 2006
Metabolic network reconstruction:
Nature Reviews Genetics (2006) 7:130-141.
Metabolic modelling approaches:
Journal of Biotechnology (2002) 94: 37-63.
Biotechnology and Bioengineering (2003) 84: 763-772
Flux Balance Analysis:
Current Opinion in Biotechnology (2003) 14: 491-496.
Nature Reviews Microbiology (2004) 2: 886-897
Mathematical programming:
Paul Williams: Model Building in Mathematical Programming