[Cdc2-Cyclin~{p1}] => Cdc2-Cyclin~{p1}.

Download Report

Transcript [Cdc2-Cyclin~{p1}] => Cdc2-Cyclin~{p1}.

Semantical and Algorithmic
Aspects of the Living
François Fages
Constraint Programming project-team,
INRIA Paris-Rocquencourt
To tackle the complexity of biological systems, investigate
• Programming Theory Concepts
• Formal Methods of Circuit and Program Verification
• Automated Reasoning Tools
Prototype Implementation in the Biochemical Abstract Machine BIOCHAM
modeling environment available at http://contraintes.inria.fr/BIOCHAM
François Fages
Rocquencourt, Sep. 2007
Systems Biology ?
“Systems Biology aims at systems-level understanding [which]
requires a set of principles and methodologies that links the
behaviors of molecules to systems characteristics and functions.”
H. Kitano, ICSB 2000
• Analyze (post-)genomic data produced with high-throughput
technologies (stored in databases like GO, KEGG, BioCyc, etc.);
• Integrate heterogeneous data about a specific problem;
• Understand and Predict behaviors or interactions in large networks of
genes and proteins.
Systems Biology Markup Language (SBML) : exchange format for
reaction models
François Fages
Rocquencourt, Sep. 2007
Issue of Abstraction
Models are built in Systems Biology with two contradictory perspectives :
François Fages
Rocquencourt, Sep. 2007
Issue of Abstraction
Models are built in Systems Biology with two contradictory perspectives :
1) Models for representing knowledge : the more concrete the better
François Fages
Rocquencourt, Sep. 2007
Issue of Abstraction
Models are built in Systems Biology with two contradictory perspectives :
1) Models for representing knowledge : the more concrete the better
2) Models for making predictions : the more abstract the better !
François Fages
Rocquencourt, Sep. 2007
Issue of Abstraction
Models are built in Systems Biology with two contradictory perspectives :
1) Models for representing knowledge : the more concrete the better
2) Models for making predictions : the more abstract the better !
These perspectives can be reconciled by organizing formalisms and
models into hierarchies of abstractions.
To understand a system is not to know everything about it but to know
abstraction levels that are sufficient for answering questions about it
François Fages
Rocquencourt, Sep. 2007
Semantics of Living Processes ?
Formally, “the” behavior of a system depends on our choice of observables.
?
?
Mitosis movie [Lodish et al. 03]
François Fages
Rocquencourt, Sep. 2007
Boolean Semantics
Formally, “the” behavior of a system depends on our choice of observables.
Presence/absence of molecules
Boolean transitions
0
François Fages
1
Rocquencourt, Sep. 2007
Continuous Differential Semantics
Formally, “the” behavior of a system depends on our choice of observables.
Concentrations of molecules
Rates of reactions
x
François Fages
ý
Rocquencourt, Sep. 2007
Stochastic Semantics
Formally, “the” behavior of a system depends on our choice of observables.
Numbers of molecules
Probabilities of reaction
n
François Fages

Rocquencourt, Sep. 2007
Temporal Logic Semantics
Formally, “the” behavior of a system depends on our choice of observables.
Presence/absence of molecules
Temporal logic formulas
F
François Fages
x
Fx
F (x ^ F ( x ^ y))
FG (x v y)
…
Rocquencourt, Sep. 2007
Constraint Temporal Logic Semantics
Formally, “the” behavior of a system depends on our choice of observables.
Concentrations of molecules
Constraint LTL temporal formulas
F
François Fages
x> 1
F (x >0.2)
F (x >0.2 ^ F (x<0.1 ^ y>0.2))
FG (x>0.2 v y>0.2)
…
Rocquencourt, Sep. 2007
Language-based Approaches to Cell Systems Biology
Qualitative models: from diagrammatic notation to
• Boolean networks [Kaufman 69, Thomas 73]
• Petri Nets [Reddy 93, Chaouiya 05]
• Process algebra π–calculus [Regev-Silverman-Shapiro 99-01, Nagasali et al. 00]
• Bio-ambients [Regev-Panina-Silverman-Cardelli-Shapiro 03]
• Pathway logic [Eker-Knapp-Laderoute-Lincoln-Meseguer-Sonmez 02]
• Reaction rules [Chabrier-Fages 03] [Chabrier-Chiaverini-Danos-Fages-Schachter 04]
Quantitative models: from ODEs and stochastic simulations to
• Hybrid Petri nets [Hofestadt-Thelen 98, Matsuno et al. 00]
• Hybrid automata [Alur et al. 01, Ghosh-Tomlin 01] HCC [Bockmayr-Courtois 01]
• Stochastic π–calculus [Priami et al. 03] [Cardelli et al. 06]
• Reaction rules with continuous time dynamics [Fages-Soliman-Chabrier 04]
François Fages
Rocquencourt, Sep. 2007
Overview of the Talk
1. Rule-based Modeling of Biochemical Systems
1. Syntax of molecules, compartments and reactions
2. Semantics at three abstraction levels: boolean, differential, stochastic
3. Cell cycle control models
2. Temporal Logic Language for Formalizing Biological Properties
1. CTL for the boolean semantics
2. Constraint LTL for the differential semantics
3. PCTL for the stochastic semantics
3. Automated Reasoning Tools
1. Inferring kinetic parameter values from Constraint-LTL specification
2. Inferring reaction rules from CTL specification
L. Calzone, N. Chabrier, F. Fages, S. Soliman. TCSB VI, LNBI 4220:68-94. 2006.
François Fages
Rocquencourt, Sep. 2007
Molecules of the living
Small molecules: covalent bonds 50-200 kcal/mol
• 70% water
• 1% ions
• 6% amino acids (20), nucleotides (5),
• fats, sugars, ATP, ADP, …
Macromolecules: hydrogen bonds, ionic, hydrophobic, Waals 1-5 kcal/mol
• 20% proteins (50-104 amino acids)
• RNA (102-104 nucleotides AGCU)
• DNA (102-106 nucleotides AGCT)
François Fages
Rocquencourt, Sep. 2007
Structure Levels of Proteins
1) Primary structure: word of n amino acids residues (20n possibilities)
linked with C-N bonds
Example: INRIA
Isoleucine-asparagiNe-aRginine-Isoleucine-Alanine
François Fages
Rocquencourt, Sep. 2007
Structure Levels of Proteins
1) Primary structure: word of n amino acids residues (20n possibilities)
linked with C-N bonds
Example: INRIA
Isoleucine-asparagiNe-aRginine-Isoleucine-Alanine
2) Secondary: word of m a-helix, b-strands, random coils,… (3m-10m)
stabilized by hydrogen bonds H---O
François Fages
Rocquencourt, Sep. 2007
Structure Levels of Proteins
1) Primary structure: word of n amino acids residues (20n possibilities)
linked with C-N bonds
Example: INRIA
Isoleucine-asparagiNe-aRginine-Isoleucine-Alanine
2) Secondary: word of m a-helix, b-strands, random coils,… (3m-10m)
stabilized by hydrogen bonds H---O
3) Tertiary 3D structure: spatial folding
stabilized by
hydrophobic
interactions
François Fages
Rocquencourt, Sep. 2007
Syntax of proteins
Cyclin dependent kinase 1
(free, inactive)
Complex Cdk1-Cyclin B
(low activity)
Cdk1
Cdk1–CycB
Phosphorylated form
Cdk1~{thr161}-CycB
at site threonine 161
(high activity)
mitosis promotion factor
François Fages
Rocquencourt, Sep. 2007
BIOCHAM Syntax of Objects
E == compound | E-E | E~{p1,…,pn}
Compound: molecule, #gene binding site, abstract @process…
- : binding operator for protein complexes, gene binding sites, …
Associative and commutative.
~{…}: modification operator for phosphorylated sites, …
Set of modified sites (Associative, Commutative, Idempotent).
O == E | E::location
Location: symbolic compartment (nucleus, cytoplasm, membrane, …)
S == _ | O+S
+ : solution operator (Associative, Commutative, Neutral _)
François Fages
Rocquencourt, Sep. 2007
Elementary Reaction Rules
Complexation: A + B => A-B
Decomplexation A-B => A + B
cdk1+cycB => cdk1–cycB
François Fages
Rocquencourt, Sep. 2007
Elementary Reaction Rules
Complexation: A + B => A-B
Decomplexation A-B => A + B
cdk1+cycB => cdk1–cycB
Phosphorylation: A =[C]=> A~{p}
Dephosphorylation A~{p} =[C]=> A
Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB
Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB
François Fages
Rocquencourt, Sep. 2007
Elementary Reaction Rules
Complexation: A + B => A-B
Decomplexation A-B => A + B
cdk1+cycB => cdk1–cycB
Phosphorylation: A =[C]=> A~{p}
Dephosphorylation A~{p} =[C]=> A
Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB
Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB
Synthesis: _ =[C]=> A.
_ =[#E2-E2f13-Dp12]=> CycA
Degradation: A =[C]=> _.
cycE =[@UbiPro]=> _
(not for cycE-cdk2 which is stable)
François Fages
Rocquencourt, Sep. 2007
Elementary Reaction Rules
Complexation: A + B => A-B
Decomplexation A-B => A + B
cdk1+cycB => cdk1–cycB
Phosphorylation: A =[C]=> A~{p}
Dephosphorylation A~{p} =[C]=> A
Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB
Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB
Synthesis: _ =[C]=> A.
_ =[#E2-E2f13-Dp12]=> CycA
Degradation: A =[C]=> _.
cycE =[@UbiPro]=> _
(not for cycE-cdk2 which is stable)
Transport: A::L1 => A::L2
Cdk1~{p}-CycB::cytoplasm => Cdk1~{p}-CycB::nucleus
François Fages
Rocquencourt, Sep. 2007
From Syntax to Semantics
R ::= S=>S | S =[O]=> S | S <=> S | S <=[O]=> S
where A =[C]=> B stands for A+C => B+C
A <=> B stands for A=>B and B=>A, etc.
| kinetic for R
(import/export SBML format)
In SBML : no semantics (exchange format)
François Fages
Rocquencourt, Sep. 2007
From Syntax to Semantics
R ::= S=>S | S =[O]=> S | S <=> S | S <=[O]=> S
where A =[C]=> B stands for A+C => B+C
A <=> B stands for A=>B and B=>A, etc.
| kinetic for R
(import/export SBML format)
In SBML : no semantics (exchange format)
In BIOCHAM : three abstraction levels
1. Boolean Semantics: presence-absence of molecules
1. Concurrent Transition System (asynchronous, non-deterministic)
François Fages
Rocquencourt, Sep. 2007
From Syntax to Semantics
R ::= S=>S | S =[O]=> S | S <=> S | S <=[O]=> S
where A =[C]=> B stands for A+C => B+C
A <=> B stands for A=>B and B=>A, etc.
| kinetic for R
(import/export SBML format)
In SBML : no semantics (exchange format)
In BIOCHAM : three abstraction levels
1. Boolean Semantics: presence-absence of molecules
1. Concurrent Transition System (asynchronous, non-deterministic)
2. Differential Semantics: concentration
1. Ordinary Differential Equations or Hybrid system (deterministic)
François Fages
Rocquencourt, Sep. 2007
From Syntax to Semantics
R ::= S=>S | S =[O]=> S | S <=> S | S <=[O]=> S
where A =[C]=> B stands for A+C => B+C
A <=> B stands for A=>B and B=>A, etc.
| kinetic for R
(import/export SBML format)
In SBML : no semantics (exchange format)
In BIOCHAM : three abstraction levels
1. Boolean Semantics: presence-absence of molecules
1. Concurrent Transition System (asynchronous, non-deterministic)
2. Differential Semantics: concentration
1. Ordinary Differential Equations or Hybrid system (deterministic)
3. Stochastic Semantics: number of molecules
1. Continuous time Markov chain
François Fages
Rocquencourt, Sep. 2007
1. Differential Semantics
Associates to each molecule its concentration [Ai]= | Ai| / volume ML-1
volume of diffusion …
François Fages
Rocquencourt, Sep. 2007
1. Differential Semantics
Associates to each molecule its concentration [Ai]= | Ai| / volume ML-1
volume of compartment
Compiles a set of rules { ei for Si=>S’I }i=1,…,n (by default ei is MA(1))
into the system of ODEs (or hybrid automaton) over variables {A1,…,Ak}
dA/dt = Σ i=1 ri(A)*ei - Σ j=1 lj(A)*ej
where ri(A) (resp. li(A)) is the stoichiometric coefficient of A in Si (resp. S’i)
multiplied by the volume ratio of the location of A.
n
François Fages
n
Rocquencourt, Sep. 2007
1. Differential Semantics
Associates to each molecule its concentration [Ai]= | Ai| / volume ML-1
volume of compartment
Compiles a set of rules { ei for Si=>S’I }i=1,…,n (by default ei is MA(1))
into the system of ODEs (or hybrid automaton) over variables {A1,…,Ak}
dA/dt = Σ i=1 ri(A)*ei - Σ j=1 lj(A)*ej
where ri(A) (resp. li(A)) is the stoichiometric coefficient of A in Si (resp. S’i)
multiplied by the volume ratio of the location of A.
n
n
volume_ratio (15,n),(1,c).
mRNAcycA::n <=> mRNAcycA::c.
means 15*Vn = Vc and is equivalent to 15*mRNAcycA::n <=> mRNAcycA::c.
François Fages
Rocquencourt, Sep. 2007
Numerical Integration
• Adaptive step size 4th order Runge-Kutta can be weak for stiff systems
• Rosenbrock implicit method using the Jacobian matrix ∂x’i/∂xj
computes a (clever) discretization of time
and a time series of concentrations and their derivatives
(t0, X0, dX0/dt), (t1, X1, dX1/dt), …, (tn, Xn, dXn/dt), …
at discrete time points
François Fages
Rocquencourt, Sep. 2007
2. Stochastic Semantics
Associate to each molecule its number |Ai| in its location of volume Vi
François Fages
Rocquencourt, Sep. 2007
2. Stochastic Semantics
Associate to each molecule its number |Ai| in its location of volume Vi
Compile the rule set into a continuous time Markov chain
over vector states X=(|A1|,…, |Ak|)
and where the transition rate τi for the reaction ei for Si=>S’I
(giving probability after normalization) is obtained from ei by replacing
concentrations by molecule numbers
François Fages
Rocquencourt, Sep. 2007
2. Stochastic Semantics
Associate to each molecule its number |Ai| in its location of volume Vi
Compile the rule set into a continuous time Markov chain
over vector states X=(|A1|,…, |Ak|)
and where the transition rate τi for the reaction ei for Si=>S’I
(giving probability after normalization) is obtained from ei by replacing
concentrations by molecule numbers
Stochastic simulation [Gillespie 76, Gibson 00]
computes realizations as time series (t0, X0), (t1, X1), …, (tn, Xn), …
François Fages
Rocquencourt, Sep. 2007
3. Boolean Semantics
Associate to each molecule a Boolean denoting its presence/absence in
its location
François Fages
Rocquencourt, Sep. 2007
3. Boolean Semantics
Associate to each molecule a Boolean denoting its presence/absence in
its location
Compile the rule set into an asynchronous transition system
François Fages
Rocquencourt, Sep. 2007
3. Boolean Semantics
Associate to each molecule a Boolean denoting its presence/absence in
its location
Compile the rule set into an asynchronous transition system where a
reaction like A+B=>C+D is translated into 4 transition rules taking into
account the possible complete consumption of reactants:
A+BA+B+C+D
A+BA+B +C+D
A+BA+B+C+D
A+BA+B+C+D
François Fages
Rocquencourt, Sep. 2007
3. Boolean Semantics
Associate to each molecule a Boolean denoting its presence/absence in
its location
Compile the rule set into an asynchronous transition system where a
reaction like A+B=>C+D is translated into 4 transition rules taking into
account the possible complete consumption of reactants:
A+BA+B+C+D
A+BA+B +C+D
A+BA+B+C+D
A+BA+B+C+D
Necessary for over-approximating the possible behaviors under the
stochastic/discrete semantics (abstraction N  {zero, non-zero})
François Fages
Rocquencourt, Sep. 2007
Hierarchy of Semantics
abstraction
Boolean model
Theory of abstract Interpretation
[Cousot Cousot POPL’77]
[Fages Soliman TCSc’07]
Discrete model
Differential model
Stochastic model
concretization
François Fages
Syntactical
model
Models for answering queries:
The more abstract the better
Optimal abstraction w.r.t. query
Rocquencourt, Sep. 2007
Query: what are the stationary states ?
Boolean circuit analysis
abstraction
abstraction
Boolean model abstraction
Discrete model abstraction
Differential model
Stochastic model
concretization
François Fages
Syntactical
model
Discrete circuit analysis
Jacobian circuit analysis
Positive circuits are a
necessary condition
for multistationarity
[Thomas 73] [de Jong 02]
[Soulé 03]
[Remy Ruet Thieffry 05]
Rocquencourt, Sep. 2007
Type Inference / Type Checking
abstraction
[Fages Soliman CMSB’06]
Boolean model
Discrete model
Differential model
Influence graph of proteins
Stochastic model
concretization
François Fages
Syntactical
model
Protein functions
(kinase, phosphatase,…)
Compartments topology
Rocquencourt, Sep. 2007
Type Inference / Type Checking
abstraction
[Fages Soliman CMSB’06]
Boolean model
Discrete model
Influence graph of proteins
(activation/inhibition)
Differential model
Influence graph of proteins
Stochastic model
concretization
François Fages
Syntactical
model
Protein functions
(kinase, phosphatase,…)
Compartments topology
Rocquencourt, Sep. 2007
Cell Cycle: G1  DNA Synthesis  G2  Mitosis
G1: CdK4-CycD
Cdk6-CycD
Cdk2-CycE
François Fages
S: Cdk2-CycA
G2,M: Cdk1-CycA
Cdk1-CycB (MPF)
Rocquencourt, Sep. 2007
Example: Cell Cycle Control Model [Tyson 91]
MA(k1) for
MA(k2) for
MA(K7) for
_ => Cyclin.
Cyclin => _.
Cyclin~{p1} => _.
MA(k8) for
MA(k9) for
Cdc2 => Cdc2~{p1}.
Cdc2~{p1} =>Cdc2.
MA(k3) for Cyclin+Cdc2~{p1} => Cdc2~{p1}-Cyclin~{p1}.
MA(k4p) for Cdc2~{p1}-Cyclin~{p1} => Cdc2-Cyclin~{p1}.
k4*[Cdc2-Cyclin~{p1}]^2*[Cdc2~{p1}-Cyclin~{p1}] for
Cdc2~{p1}-Cyclin~{p1} =[Cdc2-Cyclin~{p1}] => Cdc2-Cyclin~{p1}.
MA(k5) for Cdc2-Cyclin~{p1} => Cdc2~{p1}-Cyclin~{p1}.
MA(k6) for Cdc2-Cyclin~{p1} => Cdc2+Cyclin~{p1}.
François Fages
Rocquencourt, Sep. 2007
Interaction Graph
François Fages
Rocquencourt, Sep. 2007
Stochastic Simulation
François Fages
Rocquencourt, Sep. 2007
Differential Simulation
François Fages
Rocquencourt, Sep. 2007
Boolean Simulation
François Fages
Rocquencourt, Sep. 2007
François Fages
Rocquencourt, Sep. 2007
Mammalian Cell Cycle Control Map [Kohn 99]
François Fages
Rocquencourt, Sep. 2007
Transcription of Kohn’s Map
_ =[ E2F13-DP12-gE2 ]=> cycA.
...
cycB =[ APC~{p1} ]=>_.
cdk1~{p1,p2,p3} + cycA => cdk1~{p1,p2,p3}-cycA.
cdk1~{p1,p2,p3} + cycB => cdk1~{p1,p2,p3}-cycB.
...
cdk1~{p1,p3}-cycA =[ Wee1 ]=> cdk1~{p1,p2,p3}-cycA.
cdk1~{p1,p3}-cycB =[ Wee1 ]=> cdk1~{p1,p2,p3}-cycB.
cdk1~{p2,p3}-cycA =[ Myt1 ]=> cdk1~{p1,p2,p3}-cycA.
cdk1~{p2,p3}-cycB =[ Myt1 ]=> cdk1~{p1,p2,p3}-cycB.
...
cdk1~{p1,p2,p3} =[ cdc25C~{p1,p2} ]=> cdk1~{p1,p3}.
cdk1~{p1,p2,p3}-cycA =[ cdc25C~{p1,p2} ]=> cdk1~{p1,p3}-cycA.
cdk1~{p1,p2,p3}-cycB =[ cdc25C~{p1,p2} ]=> cdk1~{p1,p3}-cycB.
165 proteins and genes, 500 variables, 800 rules [Chiaverini Danos 02]
François Fages
Rocquencourt, Sep. 2007
Overview of the Talk
1. Rule-based Modeling of Biochemical Systems
1. Syntax of molecules, compartments and reactions
2. Semantics at three abstraction levels: boolean, differential, stochastic
3. Cell cycle control models
2. Temporal Logic Language for Formalizing Biological Properties
1. CTL for the boolean semantics
2. Constraint LTL for the differential semantics
3. PCTL for the stochastic semantics
3. Automated Reasoning Tools
1. Inferring kinetic parameter values from Constraint-LTL specification
2. Inferring reaction rules from CTL specification
François Fages
Rocquencourt, Sep. 2007
A Logical Paradigm for Systems Biology
Biological model = Transition System
Biological property = Temporal Logic Formula
Biological validation = Model-checking
Formalize properties of the biological system in:
• Computation Tree Logic CTL for the boolean semantics
• Linear Time Logic with numerical constraints for the concentration semantics
• Probabilistic CTL with numerical constraints for the stochastic semantics
Evaluate the formulas by model checking techniques
[Lincoln et al. PSB’02] [Chabrier Fages CMSB’03] [Bernot et al. TCS’04] …
François Fages
Rocquencourt, Sep. 2007
A Logical Paradigm for Systems Biology
Biological model = Transition System
Biological property = Temporal Logic Formula
Biological validation = Model-checking
In the Biochemical Abstract Machine environment,
Model:
- Boolean
- Differential
- Stochastic
(SBML)
François Fages
BIOCHAM
- simulation
Rocquencourt, Sep. 2007
A Logical Paradigm for Systems Biology
Biological model = Transition System
Biological property = Temporal Logic Formula
Biological validation = Model-checking
In the Biochemical Abstract Machine environment,
Model:
- Boolean
- Differential
- Stochastic
(SBML)
François Fages
BIOCHAM
- simulation
- query evaluation
Biological Properties:
- CTL
- LTL with constraints
- PCTL with constraints
Rocquencourt, Sep. 2007
A Logical Paradigm for Systems Biology
Biological model = Transition System
Biological property = Temporal Logic Formula
Biological validation = Model-checking
In the Biochemical Abstract Machine environment,
Model:
- Boolean
- Differential
- Stochastic
(SBML)
François Fages
BIOCHAM
- simulation
- query evaluation
Biological Properties:
- CTL
- LTL with constraints
- PCTL with constraints
Rocquencourt, Sep. 2007
A Logical Paradigm for Systems Biology
Biological model = Transition System
Biological property = Temporal Logic Formula
Biological validation = Model-checking
In the Biochemical Abstract Machine environment,
Model:
- Boolean
- Differential
- Stochastic
(SBML)
François Fages
BIOCHAM
- simulation
- query evaluation
- rule learning
- parameter search
Biological Properties:
- CTL
- LTL with constraints
- PCTL with constraints
Rocquencourt, Sep. 2007
2.1 Computation Tree Logic CTL
Extension of propositional (or first-order) logic with operators for time and
choices [Clarke et al. 99]
Choice
E
exists
A
always
X
next time
EX(f)
¬ AX(¬ f)
AX(f)
F
finally
EF(f)
¬ AG(¬ f)
AF(f)
G
globally
EG(f)
¬ AF(¬ f)
AG(f)
Time
U
until
François Fages
E (f1 U f2)
Non-determinism E, A
AG
A (f1 U f2)
EU
F,G,U
Time
EF
Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (1/3)
About reachability:
• Can the cell produce some protein P? reachable(P)==EF(P)
François Fages
Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (1/3)
About reachability:
• Can the cell produce some protein P? reachable(P)==EF(P)
• Can the cell produce P, Q and not R? reachable(P^Q^R)
François Fages
Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (1/3)
About reachability:
• Can the cell produce some protein P? reachable(P)==EF(P)
• Can the cell produce P, Q and not R? reachable(P^Q^R)
• Can the cell always produce P? AG(reachable(P))
François Fages
Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (1/3)
About reachability:
• Can the cell produce some protein P? reachable(P)==EF(P)
• Can the cell produce P, Q and not R? reachable(P^Q^R)
• Can the cell always produce P? AG(reachable(P))
About pathways:
• Can the cell reach a (partially described) set of states s while passing
by another set of states s2?
EF(s2^EFs)
François Fages
Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (1/3)
About reachability:
• Can the cell produce some protein P? reachable(P)==EF(P)
• Can the cell produce P, Q and not R? reachable(P^Q^R)
• Can the cell always produce P? AG(reachable(P))
About pathways:
• Can the cell reach a (partially described) set of states s while passing
by another set of states s2?
EF(s2^EFs)
• Is it possible to produce P without Q? E(Q U P)
François Fages
Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (1/3)
About reachability:
• Can the cell produce some protein P? reachable(P)==EF(P)
• Can the cell produce P, Q and not R? reachable(P^Q^R)
• Can the cell always produce P? AG(reachable(P))
About pathways:
• Can the cell reach a (partially described) set of states s while passing
by another set of states s2?
EF(s2^EFs)
• Is it possible to produce P without Q? E(Q U P)
• Is (set of) state s2 a necessary checkpoint for reaching (set of) state s?
checkpoint(s2,s)== E(s2U s)
François Fages
Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (1/3)
About reachability:
• Can the cell produce some protein P? reachable(P)==EF(P)
• Can the cell produce P, Q and not R? reachable(P^Q^R)
• Can the cell always produce P? AG(reachable(P))
About pathways:
• Can the cell reach a (partially described) set of states s while passing
by another set of states s2?
EF(s2^EFs)
• Is it possible to produce P without Q? E(Q U P)
• Is (set of) state s2 a necessary checkpoint for reaching (set of) state s?
checkpoint(s2,s)== E(s2U s)
• Is s2 always a checkpoint for s? AG(s -> checkpoint(s2,s))
François Fages
Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (2/3)
About stationarity:
• Is a (set of) state s a stable state? stable(s)== AG(s)
François Fages
Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (2/3)
About stationarity:
• Is a (set of) state s a stable state? stable(s)== AG(s)
• Is s a steady state (with possibility of escaping) ? steady(s)==EG(s)
François Fages
Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (2/3)
About stationarity:
• Is a (set of) state s a stable state? stable(s)== AG(s)
• Is s a steady state (with possibility of escaping) ? steady(s)==EG(s)
• Can the cell reach a stable state s? EF(stable(s)) not in LTL
François Fages
Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (2/3)
About stationarity:
• Is a (set of) state s a stable state? stable(s)== AG(s)
• Is s a steady state (with possibility of escaping) ? steady(s)==EG(s)
• Can the cell reach a stable state s? EF(stable(s)) not in LTL
• Must the cell reach a stable state s? AG(stable(s))
François Fages
Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (2/3)
About stationarity:
• Is a (set of) state s a stable state? stable(s)== AG(s)
• Is s a steady state (with possibility of escaping) ? steady(s)==EG(s)
• Can the cell reach a stable state s? EF(stable(s)) not in LTL
• Must the cell reach a stable state s? AG(stable(s))
• What are the stable states? Not expressible in CTL. Needs to combine
CTL with search (e.g. constraint programming [Thieffry et al. 05] )
François Fages
Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (3/3)
About oscillations:
• Can the system exhibit a cyclic behavior w.r.t. the presence of P ?
oscil(P)== EG(F P ^ F P)
CTL* formula that can be approximated in CTL by
oscil(P)== EG((P  EF P) ^ (P  EF P))
(necessary but not sufficient condition for oscillation)
François Fages
Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (3/3)
About oscillations:
• Can the system exhibit a cyclic behavior w.r.t. the presence of P ?
oscil(P)== EG((P  EF P) ^ (P  EF P))
(necessary but not sufficient condition)
• Can the system loops between states s and s2 ?
loop(P,Q)== EG((s  EF s2) ^ (s2  EF s))
François Fages
Rocquencourt, Sep. 2007
Symbolic Model-Checking
Still for finite Kripke structures, use boolean constraints to represent
1. sets of states as a boolean constraint c(V)
2. the transition relation as a boolean constraint r(V,V’)
Binary Decision Diagrams BDD [Bryant 85] provide canonical forms to
Boolean formulas (decide Boolean equivalence)
(x⋁¬y)⋀(y⋁¬z)⋀(z⋁¬x)
and
(x⋁¬z)⋀(z⋁¬y)⋀(y⋁¬x)
are equivalent, they
have the same BDD(x,y,z)
François Fages
Rocquencourt, Sep. 2007
Mammalian Cell Cycle Control Map [Kohn 99]
François Fages
Rocquencourt, Sep. 2007
Cell Cycle Model-Checking (with BDD NuSMV)
biocham: check_reachable(cdk46~{p1,p2}-cycD~{p1}).
Ei(EF(cdk46~{p1,p2}-cycD~{p1})) is true
biocham: check_checkpoint(cdc25C~{p1,p2}, cdk1~{p1,p3}-cycB).
Ai(!(E(!(cdc25C~{p1,p2}) U cdk1~{p1,p3}-cycB))) is true
biocham: nusmv(Ai(AG(!(cdk1~{p1,p2,p3}-cycB) -> checkpoint(Wee1, cdk1~{p1,p2,p3}-cycB))))).
Ai(AG(!(cdk1~{p1,p2,p3}-cycB)->!(E(!(Wee1) U cdk1~{p1,p2,p3}-cycB)))) is false
biocham: why.
-- Loop starts here
cycB-cdk1~{p1,p2,p3} is present
cdk7 is present
cycH is present
cdk1 is present
Myt1 is present
cdc25C~{p1} is present
rule_114 cycB-cdk1~{p1,p2,p3}=[cdc25C~{p1}]=>cycB-cdk1~{p2,p3}.
cycB-cdk1~{p2,p3} is present
cycB-cdk1~{p1,p2,p3} is absent
rule_74 cycB-cdk1~{p2,p3}=[Myt1]=>cycB-cdk1~{p1,p2,p3}.
cycB-cdk1~{p2,p3} is absent
cycB-cdk1~{p1,p2,p3} is present
François Fages
Rocquencourt, Sep. 2007
Mammalian Cell Cycle Control Benchmark
500 variables, 2500 states. 800 rules.
BIOCHAM NuSMV model-checker time in sec. [Chabrier et al. TCS 04]
Initial state G2
Query:
Time:
compiling
29 s
Reachability G1
EF CycE
2s
Reachability G1
EF CycD
1.9 s
Reachability G1
EF PCNA-CycD
1.7 s
EF ( Cdc25~{Nterm}
U Cdk1~{Thr161}-CycB)
EG ( (CycA => EF  CycA) ^
( CycA => EF CycA))
2.2 s
Checkpoint
for mitosis complex
Oscillation
François Fages
31.8 s
Rocquencourt, Sep. 2007
2.2 LTL with Constraints for the Differential Semantics
• Constraints over concentrations and derivatives as FOL formulae over
the reals:
• [M] > 0.2
• [M]+[P] > [Q]
• d([M])/dt < 0
François Fages
Rocquencourt, Sep. 2007
LTL with Constraints for the Differential Semantics
• Constraints over concentrations and derivatives as FOL formulae over
the reals:
• [M] > 0.2
• [M]+[P] > [Q]
• d([M])/dt < 0
• Linear Time Logic LTL operators for time X, F, U, G
•
•
•
•
•
F([M]>0.2)
FG([M]>0.2)
F ([M]>2 & F (d([M])/dt<0 & F ([M]<2 & d([M])/dt>0 & F(d([M])/dt<0))))
oscil(M,n) defined as at least n alternances of sign of the derivative
Period(A,75)=  t v F(T = t & [A] = v & d([A])/dt > 0 & X(d([A])/dt < 0)
& F(T = t + 75 & [A] = v & d([A])/dt > 0 & X(d([A])/dt < 0)))…
François Fages
Rocquencourt, Sep. 2007
How to Evaluate a Constraint LTL Formula ?
• Consider the ODE’s of the concentration semantics dX/dt = f(X)
François Fages
Rocquencourt, Sep. 2007
How to Evaluate a Constraint LTL Formula ?
• Consider the ODE’s of the concentration semantics dX/dt = f(X)
• Numerical integration methods produce a discretization of time
(adaptive step size Runge-Kutta or Rosenbrock method for stiff syst.)
• The trace is a linear Kripke structure:
(t0,X0,dX0/dt), (t1,X1,dX1/dt), …, (tn,Xn,dXn/dt), …
over concentrations and their derivatives at discrete time points
• Evaluate the formula on that Kripke structure with a model checking alg.
François Fages
Rocquencourt, Sep. 2007
Simulation-Based Constraint LTL Model Checking
Hypothesis 1: the initial state is completely known
Hypothesis 2: the formula can be checked over a finite period of time [0,T]
1. Run the numerical integration from 0 to T
producing values at a finite sequence of time points
2. Iteratively label the time points with the sub-formulae of f that are true:
Add f to the time points where a FOL formula f is true,
Add F f (X f) to the (immediate) previous time points labeled by f,
Add f1 U f2 to the predecessor time points of f2 while they satisfy f1,
(Add G f to the states satisfying f until T)
Model checker and numerical integration methods implemented in Prolog
François Fages
Rocquencourt, Sep. 2007
Constraint-LTL Instanciation Algo. [Fages Rizk CMSB’07]
François Fages
Rocquencourt, Sep. 2007
2.3 PCTL Model Checker for the Stochastic Semantics
Compute the probability of realisation of a TL formula (with constraints) by
Monte Carlo method
Perform several stochastic simulations
Evaluate the probability of realization of the TL formula
Costly…
PRISM [Kwiatkowska et al. 04] : PCTL model checker based on BDDs or
using Monte Carlo method.
François Fages
Rocquencourt, Sep. 2007
Overview of the Talk
1. Rule-based Modeling of Biochemical Systems
1. Syntax of molecules, compartments and reactions
2. Semantics at three abstraction levels: boolean, differential, stochastic
3. Cell cycle control models
2. Temporal Logic Language for Formalizing Biological Properties
1. CTL for the boolean semantics
2. Constraint LTL for the differential semantics
3. PCTL for the stochastic semantics
3. Automated Reasoning Tools
1. Inferring kinetic parameter values from Constraint-LTL specification
2. Inferring reaction rules from CTL specification
François Fages
Rocquencourt, Sep. 2007
Example: Cell Cycle Control Model [Tyson 91]
MA(k1) for
MA(k2) for
MA(K7) for
_ => Cyclin.
Cyclin => _.
Cyclin~{p1} => _.
MA(k8) for
MA(k9) for
Cdc2 => Cdc2~{p1}.
Cdc2~{p1} =>Cdc2.
MA(k3) for Cyclin+Cdc2~{p1} => Cdc2~{p1}-Cyclin~{p1}.
MA(k4p) for Cdc2~{p1}-Cyclin~{p1} => Cdc2-Cyclin~{p1}.
k4*[Cdc2-Cyclin~{p1}]^2*[Cdc2~{p1}-Cyclin~{p1}] for
Cdc2~{p1}-Cyclin~{p1} =[Cdc2-Cyclin~{p1}] => Cdc2-Cyclin~{p1}.
MA(k5) for Cdc2-Cyclin~{p1} => Cdc2~{p1}-Cyclin~{p1}.
MA(k6) for Cdc2-Cyclin~{p1} => Cdc2+Cyclin~{p1}.
François Fages
Rocquencourt, Sep. 2007
3.1 Inferring Parameters from Temporal Properties
biocham: learn_parameter([k3,k4],[(0,200),(0,200)],20,
oscil(Cdc2-Cyclin~{p1},3),150).
François Fages
Rocquencourt, Sep. 2007
3.1 Inferring Parameters from Temporal Properties
biocham: learn_parameter([k3,k4],[(0,200),(0,200)],20,
oscil(Cdc2-Cyclin~{p1},3),150).
First values found :
parameter(k3,10).
parameter(k4,70).
François Fages
Rocquencourt, Sep. 2007
3.1 Inferring Parameters from Temporal Properties
biocham: learn_parameter([k3,k4],[(0,200),(0,200)],20,
oscil(Cdc2-Cyclin~{p1},3) & F([Cdc2-Cyclin~{p1}]>0.15), 150).
First values found :
parameter(k3,10).
parameter(k4,120).
François Fages
Rocquencourt, Sep. 2007
3.1 Inferring Parameters from LTL Specification
biocham: learn_parameter([k3,k4],[(0,200),(0,200)],20,
period(Cdc2-Cyclin~{p1},35), 150).
First values found:
parameter(k3,10).
parameter(k4,280).
François Fages
Rocquencourt, Sep. 2007
Linking the Cell and Circadian Cycles through Wee1
Cell cycle
Circadian cycle Leloup and Goldbeter (1999)
Wee1P
BMAL1/CLOCK
Wee1
PER/CRY
preMPF
MPF
....
....
Cdc25P
Cdc25
....
....
[L. Calzone, S. Soliman 2006]
L
François Fages
Rocquencourt, Sep. 2007
PCN
Wee1m
BN
Wee1
MPF
Cdc25
François Fages
Rocquencourt, Sep. 2007
Condition on Wee1/Cdc25 for the Entrainment in Period
entrainment
entrainment
Entrainment in period constraint expressed in LTL with the period formula
François Fages
Rocquencourt, Sep. 2007
3.2. Inferring Rules from Temporal Properties
Given
• a BIOCHAM model (background knowledge)
• a set of properties formalized in temporal logic
learn
• revisions of the reaction model, i.e. rules to delete and rules to add
such that the revised model satisfies the properties
François Fages
Rocquencourt, Sep. 2007
Model Revision from Temporal Properties
• Background knowledge T: BIOCHAM model
• reaction rule language: complexation, phosphorylation, …
• Examples φ: biological properties formalized in temporal logic language
•
•
•
•
Reachability
Checkpoints
Stable states
Oscillations
• Bias R: Reaction rule patterns or parameter ranges
• Kind of rules to add or delete
Find a revision T’ of T such that T’ |= φ
François Fages
Rocquencourt, Sep. 2007
Model Revision Algorithm
General idea of constraint programming: replace a generate-and-test
algorithm by a constrain-and-generate algorithm.
Anticipate whether one has to add or remove a rule.
• Positive ECTL formula: if false, remains false after removing a rule
• EF(φ) where φ is a boolean formula (pure state description)
• Oscil(φ)
• Negative ACTL formula: if false, remains false after adding a rule
• AG(φ) where φ is a boolean formula,
• Checkpoint(a,b): ¬E(¬aUb)
• Remove a rule on the path given by the model checker (why command)
• Unclassified CTL formulae
François Fages
Rocquencourt, Sep. 2007
Example: Cell Cycle Control Model [Tyson 91]
MA(k1) for
MA(k2) for
MA(K7) for
_ => Cyclin.
Cyclin => _.
Cyclin~{p1} => _.
MA(k8) for
MA(k9) for
Cdc2 => Cdc2~{p1}.
Cdc2~{p1} =>Cdc2.
MA(k3) for Cyclin+Cdc2~{p1} => Cdc2~{p1}-Cyclin~{p1}.
MA(k4p) for Cdc2~{p1}-Cyclin~{p1} => Cdc2-Cyclin~{p1}.
k4*[Cdc2-Cyclin~{p1}]^2*[Cdc2~{p1}-Cyclin~{p1}] for
Cdc2~{p1}-Cyclin~{p1} =[Cdc2-Cyclin~{p1}] => Cdc2-Cyclin~{p1}.
MA(k5) for Cdc2-Cyclin~{p1} => Cdc2~{p1}-Cyclin~{p1}.
MA(k6) for Cdc2-Cyclin~{p1} => Cdc2+Cyclin~{p1}.
François Fages
Rocquencourt, Sep. 2007
Automatic Generation of True CTL Properties
Ei(reachable(Cyclin)))
Ei(reachable(!(Cyclin))))
Ai(oscil(Cyclin)))
Ei(reachable(Cdc2~{p1})))
Ei(reachable(!(Cdc2~{p1}))))
Ai(oscil(Cdc2~{p1})))
Ai(AG(!(Cdc2~{p1})->checkpoint(Cdc2,Cdc2~{p1}))))
Ei(reachable(Cdc2-Cyclin~{p1,p2})))
Ei(reachable(!(Cdc2-Cyclin~{p1,p2}))))
Ai(oscil(Cdc2-Cyclin~{p1,p2})))
Ei(reachable(Cdc2-Cyclin~{p1})))
Ei(reachable(!(Cdc2-Cyclin~{p1}))))
Ai(oscil(Cdc2-Cyclin~{p1})))
Ai(AG(!(Cdc2-Cyclin~{p1})->checkpoint(Cdc2-Cyclin~{p1,p2},Cdc2-Cyclin~{p1})))
Ei(reachable(Cdc2)))
Ei(reachable(!(Cdc2))))
Ai(oscil(Cdc2)))
Ei(reachable(Cyclin~{p1})))
Ei(reachable(!(Cyclin~{p1}))))
Ai(oscil(Cyclin~{p1})))
Ai(AG(!(Cyclin~{p1})->checkpoint(Cdc2-Cyclin~{p1},Cyclin~{p1}))))
François Fages
Rocquencourt, Sep. 2007
Rule Deletion
biocham: delete_rules(Cdc2 => Cdc2~{p1}).
biocham: check_all.
First formula not satisfied
Ei(EF(Cdc2-Cyclin~{p1}))
François Fages
Rocquencourt, Sep. 2007
Model Revision from Temporal Properties
biocham: revise_model.
Rules to delete:
Rules to add:
Cdc2 => Cdc2~{p1}.
François Fages
Rocquencourt, Sep. 2007
Model Revision from Temporal Properties
biocham: revise_model.
Rules to delete:
Rules to add:
Cdc2 => Cdc2~{p1}.
biocham: learn_one_addition.
(1) Cdc2 => Cdc2~{p1}.
(2) Cdc2 =[Cdc2]> Cdc2~{p1}.
(3) Cdc2 =[Cyclin]> Cdc2~{p1}.
François Fages
Rocquencourt, Sep. 2007
Conclusion
• Temporal logic with constraints is powerful enough to express both
qualitative and quantitative biological properties of systems
François Fages
Rocquencourt, Sep. 2007
Conclusion
• Temporal logic with constraints is powerful enough to express both
qualitative and quantitative biological properties of systems
• Three levels of abstraction in BIOCHAM :
Boolean semantics
CTL formulas (rule learning)
Differential semantics LTL with constraints over reals (parameter search)
Stochastic semantics Probabilistic CTL with integer constraints
François Fages
Rocquencourt, Sep. 2007
Conclusion
• Temporal logic with constraints is powerful enough to express both
qualitative and quantitative biological properties of systems
• Three levels of abstraction in BIOCHAM :
Boolean semantics
CTL formulas (rule learning)
Differential semantics LTL with constraints over reals (parameter search)
Stochastic semantics Probabilistic CTL with integer constraints
• Parameter search from temporal properties proved useful and
complementary to bifurcation theory tools (Xppaut)
François Fages
Rocquencourt, Sep. 2007
Conclusion
• Temporal logic with constraints is powerful enough to express both
qualitative and quantitative biological properties of systems
• Three levels of abstraction in BIOCHAM :
Boolean semantics
CTL formulas (rule learning)
Differential semantics LTL with constraints over reals (parameter search)
Stochastic semantics Probabilistic CTL with integer constraints
• Parameter search from temporal properties proved useful and
complementary to bifurcation theory tools (Xppaut)
• Rule inference from temporal properties still in infancy, to be optimized
and improved by types (e.g. protein functions, computed by abstract
interpretation)
François Fages
Rocquencourt, Sep. 2007
Collaborations
STREP APrIL2 : Luc de Raedt, Univ. Freiburg, Stephen Muggleton, IC ,…
• Learning in a probabilistic logic setting (finished)
ARC MOCA :
• modularity, compositionality and abstraction
NoE REWERSE : semantic web, François Bry, Münich, Rolf Backofen,
• Connecting Biocham to gene and protein ontologies (types)
STREP TEMPO : Cancer chronotherapies, INSERM Villejuif, F. Lévi;
Bang: J. Clairambault, Contraintes: S. Soliman
• Coupled models of cell cycle, circadian cycle, cytotoxic drugs.
INRA Tours : E. Reiter, D. Heitzler, Sysiphe : F. Clément
• Model of FSH signalling.
François Fages
Rocquencourt, Sep. 2007