REVerse Engineering Algorithm
Download
Report
Transcript REVerse Engineering Algorithm
Modeling Genetic Network:
Boolean Network
Yongyeol Ahn
2004.08.18. KAIST
Genetic Network
• Genes interact with each other via
proteins, RNAs and themselves.
Main Objectives
• To infer genetic network from biological
data
• To explain and predict the behaviors of
genetic regulatory network
Modeling Genetic Networks
• Statistics rules!
Bayesian network
Hmm.. We need some dynamics.
• Let’s be realistic!
Differential equation approach
• Simple is the best!
Boolean network
Bayesian Network :
Information Theory
• Shannon entropy:
• Joint probability: Pr(x,y)
• Conditional probability:
Pr(y|x) = Pr(x,y)/Pr(x)
• Mutual information:
Bayesian Network
• Find a directed acyclic graph which shows the
relationships of nodes well.
Xi : Expression level
Differential Equation Approach
C
‘The Clock’
1
A
10
50
50
500
A
Gene A
2
1
50
+
0.2
1
R
0.5
5
50
0.01
A
A
Gene R
1
100
A
Differential Equation Approach
80
0.8
Expressed
genes
0.6
mRNAs
60
40
0.4
A
20
0.2
30
40
50
1500
30
60
C
2000
1000
R
C
R
40
50
60
2000
1500
A
1000
500
500
30
40
50
60
250
500
750
1000
1250
1500
1750
R
Why Boolean network?
• It tells about the dynamics (vs. bayesian
network)
• ‘Gene switch’ : There are attempts to make a
‘genetic computer’ using genetic ‘logic gates’.
Binary state approximation is fine.
• In many cases, the exact timing may not be
important.
• Simple, general, easy to implement, …
The Boolean Network
• Nodes, Directed links
• Synchronous dynamics
• Binary states: ‘on’ or ‘off’
• A node’s state is determined by states
of other nodes which have a link to the
node(by assigned boolean functions).
Example
0
1
2
Node 0
Node 2
1
0
0
1
1
1
0
1
2
0
1
0
1
1
0
1
1
1
0
0
1
2
0
1
0
0
0
0
0
0
0
1
1
1
t
Boolean Network Variations
• Multi-Valued model
• Different updating scheme
(asynchronous, …)
• Probabilistic model
Classification of boolean
networks
(Gershenson2004)
Tools
• DDLab
– http://www.ddlab.com
• RBNLab
– http://rbn.sourceforge.net
• BN/PBN toolbox:
– http://www2.mdanderson.org/app/ilya/PBN/PBN.htm
(Classical)
Random Boolean Network
• Parameter: N, K, p
– N: number of nodes
– K: average in-degree
– p: probability of ‘1’ in each boolean function
• Large ensemble
,state space(2^N)
So big!
Very high standard deviations
Phase Transition
• Stable (K<=2)
• Critical
• Chaotic (K>=3)
• Visualization method
– Active nodes: ‘green’
– Frozen nodes: ‘red’
Phase Transition
• Islands
– Chaotic: green sea percolate & red islands
– Stable: frozen red sea & green islands
• Robustness
– Chaotic: damage spreads
– Stable: robust
• Convergence and divergence of traj.
(Lyapunov exponent)
– Chaotic: similar states tend to diverge
– Stable: tend to converge
Loops Trees
• For active dynamics, network needs
Loops.
• Loops activate other parts (trees).
• Active wave propagates from loops to
trees.
G – Density
• G-density : Garden of Eden states
density
• Ordered: very high G-density, high indegree frequency
• Critical: power-law in-degree
distribution
• Chaotic: lower G-density,
Analytical Result of Phase
Transition
• Derrida’s annealed approx. : Assuming
connections and boolean functions are
randomly reshuffled at each time step.
• Define overlap = 1 – Normalized
Hamming distance between two states
• What will happen at tinf ?
Analytical Result of Phase
Transition
For a network with in-degree k,
1
1
k
x(t 1) x (t ) (1 x (t )) (1 x k (t ))
2
2
k
Transforming with Hamming distance and consider bias
d (t 1) 2 p(1 p)[1 (1 d (t )) k ]
Derrida Curves
d (t 1)
K=5
K=2
Critical connectivity
k [2 p(1 p)]
d (t )
1
Phase diagram
(In practice, the size of the network can
play a role in the phase transitions)
Topology of boolean network
• In reality, genetic networks have very
inhomogeneous degree distribution
• Using Derrida’s annealed approximation,
the phase diagram for scale-free
network can be obtained.
Derrida’s Annealed Approx. For
Power-law Degree Distribution
• By the assumption, x(t) obeys the
equation
Where,
Contd.
1
x(t 1) (1 x k P I (k ))
2
k 1
1
d (t 1) (1 (1 d (t )) k P I (k ))
2
k 1
generalize for arbitrary p,
d (t 1) 2 p (1 p )(1 (1 d (t )) k P I (k ))
k 1
x(t 1) 1 2 p(1 p)(1 x (t ) P I (k ))
k
k 1
Contd.
The phase transitio n condition : 2 p(1 p ) k I 1
1
Let PI (k )
k ,
( )
( c 1)
2 p(1 p)
1
( c )
Topology of boolean network:
Scale-free boolean network
Attractors
The Number of Attractors
• The number of attractors grows faster
than any power law with system size.
(Samuelsson2003)
The Length of Attractor
• For K=1, root(N/2)
• At critical phase, it is long believed that
the length proportional to root N (
Kauffman argued that this is related to
the number of cell types )
• But it is linear
Applications
•
•
•
•
Reverse Engineering
Morphogenesis model
Segment polarity development
Yeast transcriptional network
Reverse engineering: REVEAL
• REVerse Engineering Algorithm
• It finds a minimal solution for a boolean
network given any set of time-series.
• Use entropy, mutual information
Neutral Mutation and Punctuated
Equilibrium (Bornholdt1998)
• The model evolves under robustness
principle (look for silent mutations)
• Threshold networks (restricted set of
the boolean networks)
Weight = ± 1, 0
Evolutionary Rule
• Create a daughter network by ‘adding’,
‘removing’, ‘adding and removing a
weight in the coupling matrix’ at
random. (each p = 1/3)
• With a random initial state, if mother &
daughter reach the same attractor,
replace the mother with the daughter.
In other case, keep the mother network.
Punctuated Equilibrium
• The evolution shows punctuated network
connectivity (lifetime ~ 1/t^2)
• Evolved networks have much shorter
attractors, large frozen components
Model for Morphogenesis(Sole2003)
• Modeling an organism with one
dimensional cell array.
• Each cells have the same set of genes
and hormones.
• Genes interact within the cell.
• Hormones communicate with
neighboring cells.
• Threshold model.
Morphogenesis Model
Development
Adaptive Walks
• ‘Toward more complex organism’
• Complexity measure: the number of cell
types
• Rule
– Evolve many organism in parallel
– Addition, Removal, randomization of link,
link’s weight (each p=1/3)
– Check the complexity
Logarithmic Increase of the
Number of Patterns
• Consistent with
Kauffman’s
‘rugged landscape’
explanation of
Cambrian
Explosion
Segment Polarity Network in
Fly (Albert2003)
• The genetic
network of Fly
development
• This network is
simulated with
ODE(Dassow,
2002) and has
shown a good
result.
Boolean Network
Construction Rules
• The effect of transcriptional activators and inhibitors
is never additive, but rather, inhibitors are dominant.
• Transcription and translation are ON/OFF functions of
the state
• If transcription/translation is ON, mRNAs/proteins are
synthesized in one time step
• mRNAs decay in one time step if not transcribed
• Transcription factors and proteins undergoing posttranslational modification decay in one time step if
their mRNA is not present.
Constructed Boolean Network
Results
• Stable state is same as the real fly.
• Essential gene deletion results also
agree with real data.
• There are six distinct steady states in
the model. Three of these are wellknown experimentally
Yeast transcriptional
network(Kauffman2003)
• For a given network structure, they generate boolean
network ensembles with nested canalyzing functions.
• The ensemble of networks are very stable.
Canalyzing Function
• Canalyzing boolean function has at least
one input which the output value is
fixed.
• In most cases, genetic networks consist
of canalyzing functions.
Yeast Transcriptional Network
• Find topological
transition point
(to determine the
confidence p value)
• Remove genes that
have no output to other
genes
Yeast Transcriptional
Network
• For a given network
structure..
– Functions with null
hypothesis
– Functions based on
literature (canalyzing)
Nested Canalyzing Function
• Inputs im, outputs Om, in degree k
• Assume i1 is canalyzing input, then we
can define a new rule without i1 (with
indegree k-1)
• In most cases, this new rule is also
canalyzing.
Conclusion
• Boolean network model is simple,
abstract, general.
• But, it’s powerful.