File - Department of Computer Science
Download
Report
Transcript File - Department of Computer Science
Attractor Detection
and
Control of Boolean Networks
Tatsuya Akutsu
Bioinformatics Center
Institute for Chemical Research
Kyoto University
Contents
Boolean Network
Attractor Detection
Definition and Algorithms
Control of Boolean Network
Definition and DP algorithm
Integer Programming-based Approach
PBN and its Control
Conclusion
Acknowledgment
Tamura Takeyuki, Morihiro Hayashida [Kyoto U.]
Masaki Yamamoto [Kwansei Gakuin U.]
Wai-Ki Ching, Shuqin Zhang, Xi Chen [U. Hong
Kong]
Michael Ng [Hong Kong Baptist U.]
Avraham A. Melkman [Ben-Gurion University of
the Negev]
Boolean Network
Boolean Network
Mathematical model of genetic networks
node⇔gene
Regulation rules
State of node: 1 (active) / 0 (inactive)
Boolean function (AND, OR, NOT …)
Edge from y to x ⇔ y directly controls x
Synchronized update
Almost the same as digital circuits (with
clocks)
[Kauffman, The Origin of Order, 1993]
Example of Boolean Network
Boolean Network
State Transition Table
A
B
C
A’ = B
B’ = A and C
C’ = not A
time t
time t+1
A B C
A’ B’ C’
0
0
0
0
1
1
1
1
0
0
1
1
0
0
1
1
0
0
1
1
0
0
1
1
0
1
0
1
0
1
0
1
INPUT
0
0
0
0
0
1
0
1
1
1
1
1
0
0
0
0
OUTPUT
Example of state transition:
111 ⇒ 110 ⇒ 100 ⇒ 000 ⇒ 001 ⇒ 001 ⇒ 001 ⇒ 。。。
Why Boolean Networks ?
Criticism that BN is too simplified
Unless simplified, difficult for theoretical analysis,
inference, and control
Maybe useful for qualitative analyses
One of most simple non-linear models
though complex models can be used for simulation
Negative results on BN suggest negative results on
more general (non-linear) models
Almost the same as digital circuits
Theories and techniques in computer science can
be utilized
Our Focus: Time Complexity
Many problems for BN are NP-hard
NP-hard means that there is no polynomial time
algorithm (unless P=NP)
It will take O(2n) time or more if we use
naïve methods
But, we want to solve much better
Because we can solve the cases of
n=300 for O(1.1n)
n=600 for O(1.05n)
Important for coping with large-scale
networks
Attractor Detection
Attractor (1)
Steady state
Different attractors ⇔
Different cell types
Example
011 ⇒ 101 ⇒ 010 ⇒
101 ⇒ 010 ⇒…
111 ⇒ 110 ⇒ 100 ⇒
000 ⇒ 001 ⇒ 001 ⇒
001 ⇒ …
State Transition Table
time t
time t+1
A B C
A’ B’ C’
0
0
0
0
1
1
1
1
0
0
1
1
0
0
1
1
0
0
1
1
0
0
1
1
0
1
0
1
0
1
0
1
INPUT
0
0
0
0
0
1
0
1
1
1
1
1
0
0
0
0
OUTPUT
Attractor (2)
time t
time t+1
A B C
A’ B’ C’
0
0
0
0
1
1
1
1
0
0
1
1
0
0
1
1
0
0
1
1
0
0
1
1
0
1
0
1
0
1
0
1
INPUT
0
0
0
0
0
1
0
1
1
1
1
1
0
0
0
0
OUTPUT
111
011
110
100
000
001
101
010
N-K Model (Kauffman Network)
N: Number of nodes (We use n instead of N)
K: Indegree
Indegree = the number of input edges = the number
of genes directly affecting node v
Each node has (maximum or average) indegree K
Boolean function assigned to each node is randomly
selected
indegree=2
v
indegree=3
v
Distribution of Attractors in N-K Model
Classical conjecture
O( n )
Some results suggest that this conjecture may
not be true
The number of attractors is
Superpolynomial growth ( > nγ for any γ) of the number
of attractors
(Samuelsson & Troein, PRL, 2003)
Superpolynomial growth of the average size of attractors
(Drossel et al., PRL, 2005)
No conclusive result is known
Singleton Attractor (or Point Attractor)
Biological interpretation of attractors
Different attractors ⇔ Different cell types
Point attractor
Attractor with period 1
Corresponding to a steady state
Definition: v(t ) (v1 (t ), , vn (t ))
satisfying
v (t 1) v (t )
(or,
v (1) v (0) )
Attractor Detection
Input: Boolean Network
Output: Point Attractor (if any)
Previous Works and Our Works
Around
O(2n ) time is enough
since there are 2n global states
Several heuristics, but no theoretical guarantee
[Irons, Pysica D, 2006], [Devloo et al., Bull. Math. Biol. 2003], …
Detection of a singleton attractor is NP-hard
[Akutsu et al., GIW 1998]
We developed algorithms with average case
theoretical bounds
[Zhang et al., EURASIP JBSB 2007]
We developed algorithms for singleton attractor detection
O(1.587 n ) time algorithm for AND-OR BNs
[Melkman, Tamura & Akutsu, 2010]
O(1.799 n ) time algorithm for nested canalyzing BNs
[Akutsu, Melkman, Tamura & Yamamoto, 2011]
Reduction from BN-ATTRACTOR to SAT
Detection of Singleton Attractor with Max.
Indegree K (K+1)-SAT (Boolean SATisfiability problem)
vj
vk
vi
Basic Idea of Our Algorithms
Assigning x=0 eliminates three nodes
Assigning x=1 eliminates two nodes
n
f
(
n
)
f
(
n
2
)
f
(
n
3
)
f
(
n
)
1
.
325
⇒
⇒
⇒ need additional work using SAT ⇒ O(1.587 n )
y
z
OR
0
0
OR
OR
0
1
x
OR
OR
w
OR
1
Summary of Attractor Detection Algorithms
Singleton Attractors
K=2
K=3
Recursive
(Ave.
Time)
O(1.19n)
O(1.27n)
SAT based
(detection)
O(1.323n)
O(1.474n)
Our
algorithms
(detection)
O((1.323-δ)n)
(δ=0.00004)
Cyclic Attractors
AND/OR of
literals
(any K)
Canalyzing
(any K)
AND/OR of literals
(Planar, any K)
N/A
N/A
N/A
O(1.587n) O(1.799n)
O((1+ε)n)
(Recursive, Average Case)
K=2
K=3
K=4
K=5
period=2
O(1.57n)
O(1.70n)
O(1.78n)
O(1.83n)
period=3
O(1.72n)
O(1.86n)
O(1.92n)
O(1.95n)
Control of Boolean Network
Control Theory for Biological Systems
One of the main targets of Systems Biology
Though control theory is well established for linear
systems, biological systems have non-linear
components
May lead to new drugs and treatment methods
Introduction of 4 genes turns normal cells into
induced pluripotent stem cells (iPS cells)
制御
Control
がん細胞
Cancer
Cell
正常細胞
Normal
Cell
Definition of BN-Control
Input
Internal nodes: v1 ,…, vn
Initial state:
v0
External nodes: u1 ,…, um
Desired state:
vM
BN
Output
Sequence of states of external nodes: u(0), u(1), …, u(M)
v(0)=v0, v(M)=vM
(leading to the desired state at time M)
[Akutsu et al.,
J. Theo. Biol. 2007]
BN-Control: Related Works
Datta et al. defined a problem of control of PBN
(Probabilistic Extension of BN) and proposed a dynamic
programming based method [Machine Learning, 52:169-191, 2003]
They also proposed various extensions
But, their method must handle 2n×2n matrices
BN-Control (also PBN-Control) is NP-hard
BN-Control can be solved in polynomial time if the
network has a tree structure
[Akutsu et al., JTB 2007]
Practical approach based on Model Checking/SAT
[Langmund & Jha, APBC 2008, JBCB 2009]
Theoretical studies using Semi-Tensor Product
[Cheng, 2009, 2010, …]
Dynamic Programming for Control of BN
BN version of the algorithm by Datta et al.
DP table: D[b1 , b2 ,, bn , t ]
takes 1 if there is a control seq. leading to the
target state
can be computed by
1, if [b1 ,, bn ] v M
D[b1 , b2 ,, bn , M ]
0, otherwise
1, if there is (c, x) such that
D[b1 , b2 , , bn , t 1]
D[c1, ,cn ,t ] 1 and c f (b, x)
0, otherwise
Illustration of DP Algorithm
D[1,1,1, 2] =1
D[0,0,0, 2] = 0
D[b1 , b2 , , bn , t 1]
1, if there is (c, x) such that
D[c1, ,cn ,t ] 1 and c f (b, x)
0, otherwise
u1=1, u2=1
D[0,1,1, 3] = 1
DP
Computation
But, the size of
DP table is
exponential
Integer Linear ProgrammingBased Approach
Integer Programming
Linear Programming (LP)
Maximize (or minimize) an objective linear function under
constraints of linear inequalities
Integer Linear Programming (ILP)
LP + constraints that specified variables must take integer value
Several efficient solvers: CPLEX, Gurobi
Used for solving various NP-hard problems
maxize
2x 3y
subject to
100 x 50 y 500
100 x 20 y 100
x 0,
y 0, x, y : integers
ILP Representation of Boolean Functions
Variables: either 0 or 1
(i.e., integer between 0 and 1)
zAND
x AND y
z x, z y , z x y 1
z x OR y
OR
z x, z y, z x y
NOT
z NOT x
z 1 x
We applied this methodology to BN-control.
[Akutsu et al., IEEE CDC 2009]
Result on Attractor Detection
Data: randomly generated BNs
with cases of indegree=2 and indegree=3
n: #nodes
3GHz Xeon CPU + ILOG CPLEX
Result: quite fast if indegree=2
Result on BN-Control
Data: randomly generated BNs
with cases of indegree=2 and indegree=3
n: #internal nodes, m: #external nodes, M: #steps
Result: fast if indegree=2
but, not so fast if indegree=3
PBN and its Control
Probabilistic Boolean Network (PBN)
[Shmulevich et al., 2002]
Multiple control rules (boolean functions) for each
node
Control rule is selected randomly at each t
according to a given probability distribution
Almost equivalent to Dynamic Bayesian Network
Pros: Capable of noise. Can be modeled as Markov
process.
Cons:Not scalable since it takes O(2n) or more time for
almost all problems on PBN
B
A(t+1) = B(t) AND C(t) with Prob.=0.6
A
C
A(t+1) = B(t) OR (NOT C(t)) with Prob.=0.4
Example of PBN
State Transition Diagram
PBN
(only for half of nodes)
One of 4(=2×1×2) BNs is randomly selected at each time setp
BN vs. PBN
BN: 1 outgoing edge
PBN: multiple outgoing edges (with probabilities)
101
101
0.1
0.3 0.2 0.4
BN1 BN2 BN3
001
BN
001
011
101
PBN
BN4
110
PBN-CONTROL: Model
Probabilistic Boolean network (PBN, an extension of Boolean
network)
v(t ) (v1 (t ), , vn (t )) {0,1}n
Global state at time t:
Probabilistic regulation rule is given as a 2n×2n matrix A
A can be controlled by m boolean variables u(t ) (u1 (t ), , um (t ))
Pr( v (t 1) w | v(t ) x ) A(u(t )) w, x
Cost functions
Ct(v, u): cost for applying control u for global state v at time t
C(v):
cost for final global state v
[Datta et al., Machine Learning, 2003]
PBN-CONTROL: Problem and Algorithm
Problem:
Given initial state v(0), control rule A(u(t)), target time M ,
and cost functions,
Find a first control action u(0) minimizing
M
E C ( v( M )) Ct ( v(t ), u(t ))
t 0
s.t.
Pr( v (t 1) | v (t )) A(u(t )) v ( t 1), v ( t )
Can be solved by dynamic programming
J M* ( v) C ( v)
111
*
*
J t ( v) min m Ct ( v, u) A(u) v ,x J t 1 (x)
u{0 ,1}
x 000
[Datta et al., Machine Learning, 2003]
Hardness Results
Control of BN is NP-complete [Akutsu et al., JTB 07]
Integer linear programming (ILP)-based
[Akutsu et al., IEEE CDC 09]
method for control of BN
p
Control of PBN is harder than NP ( 2 -hard)
Such technique as ILP, SAT cannot be
utilized
[Chen et al., BIBM 2010]
PSPACE
2p
Control of BN
ILP
SAT
NP
?
Control of PBN
Conclusion
Conclusion
Boolean network
Attractor Detection/Enumeration
NP-hard
Integer Linear Programming-based Approach
NP-hard
Much better than naïve O(2n) bound for several cases
Control of Boolean Networks
A discrete model of a genetic network
Similar to digital circuits
Simple, Flexible for modifications/extensions
Control of Probabilistic Boolean Networks
p -hard ⇒ SAT or IP cannot be utilized
2
Future Work
Development of Non-trivial Algorithms for
Periodic Attractor Detection
In progress
Control of Boolean Network
Break O(2n) bound !
Control of PBN
p
How to cope with 2 -hardness
Development of Hybrid Model/Theory Combining
Boolean and Linear Models
Thank you !