Pathology - specific Gene Discovery Program

download report

Transcript Pathology - specific Gene Discovery Program

Abstract Processes Go Live:
Representing Biomolecular Process
Networks with Process Algebra
Ehud Shapiro
Joint work with Aviv Regev, Bill Silverman, Naama Barkai
The Cell
DNA, RNA, and Ribosomes creating
Proteins
Ribosomes translate RNA to Proteins
RNA Polymerase transcribes DNA to RNA
Ribosomes in operation
25 nm
(= protein)
Computationally: A stateless string transducer from the RNA alphabet of nucleic acids
to the Protein alphabet of amino acids
Biochemical
Pathways
Types of biological knowledge
•
Sequence: Sequence of the genome
(106 - 109 -long sequences of 4 nucleic acids) and
identity of its protein products
(102 - 104 -long sequences of 22 amino acids)
•
•
Structure: 3D structure of biomolecules
•
Molecular interactions: Inter-molecular
interaction capabilities
Network behavior: Behavior of biomolecular
networks
Computer representation
and generation of biological knowledge
Knowledge
Type
Computer
Usage
Sequnece
Essential
Sequence databases
and search algorithms
Structure
Essential
3D resolved structures
and deduced models
Molecular
interaction
Some
Derived from sequence
and structure
similarities
Network
behavior
Almost none
WHY ?
Formal representation languages for
biological knowledge
• Sequence: Strings
• Structure: 3D modelling language
• Molecular interaction: ?
• Network behavior: ?
Computer use is key driver of explosive
growth in sequence and structure
branches of biology
Our goal:
Formal language for representing
biomolecular networks
Formal language for representing
biomolecular process networks
• Enables objective repository
 Deposited knowledge can be easily shared
and critically evaluated.
• Enables computer manipulation
 Analysis and discovery
Distant goal (10-25yrs): Full simulations of
virtual cells and of virtual organisms, derived
from such knowledge repositories
Our approach:
Represent
biomolecular networks
as
computational process networks
The molecule as a computational process
Example: The enzyme ERK1 Ser/Thr kinase
•
Detect: Binds proteins with Ser and Thr
amino acids
•
Modify: Adds phosphate group to these
amino acids
•
Be Modified: Can be bound and
phosphorylated by other proteins and
eznymes
The molecule as a computational process
Structure
Function
NH2
Nt lobe
Binding MP1
molecules
Regulatory T-loop:
Change conformation
p-Y
Catalytic
p-T
core
Ct lobe
COOH
Kinase site:
Phosphorylate Ser/Thr residues
(PXT/SP motifs)
ATP binding site:
Bind ATP, and use it for
phsophorylation
Binding to
substrates
The correspondence between molecular and
computational processes
Molecular
Computational
Multiple molecules,
with separate domains
Concurrent
computational
processes
Molecular interaction
Communication
The effect of interaction is to change state of
the interacting components, thus changing their
future interaction capabilities
Our approach, specifically:
Represent
biomolecular networks
as
Stochastic p-Calculus programs
Process algebras (calculi)
• Research direction began in the late 70’s
• Guarded commands (Dijkstra), CSP, Occam
(Hoare), CCS (Milner)
• Culminated in the p-Calculus in the late 80’s
• Except for Occam, no viable
implementations (viewed as mathematical
tools, not “real” programming languages)
The p-calculus
(Milner, Walker and Parrow, 1989)
•
A program specifies a network of interacting
processes
•
Processes are defined by their potential
communication activities
•
Communication occurs via channels, defined
by names
•
Communication content: Channel names
(mobility, reconfiguration)
Syntax: Channels
All communication events, input or output, occur on channels
Channel names
x,y
Input
x?y
Output
x!y
Restriction
(new x)
Receiving a channel
name y on a
channel x
Sending a channel
name y on a
channel x
The scope of channels
may be restricted
Syntax: Processes
Processes are composed of communication events
and of other processes
Process
names
P,Q
Empty
process
0
Normal
process
Summed
process
Parallel
composition
(PAR)
No current or future
activity
Input or output
preceding (guarding)
process P
p . P + p . Q Two mutual exclusive
processes
p.P
P|Q
Two processes occur in
parallel
The p-calculus: Reduction rules
COMM:
Ready to
send z
on x
Ready to
receive
y
on x
Actions consumed;
Alternative
choices discarded
( … + x ! z . Q ) | (… + x ? y . P)  Q | P {z/y}
z
replaces
y in P
Principles for mapping molecules to pcalculus
A
Domain = Process
SYSTEM ::= R_GENE | A_GENE | R | R | A | ...
A ::= (new internal_channels)
(BINDING_DOMAIN |CATALYTIC_DOMAIN)
A
R
Residue stretches =
Global (free) channel names and co-names
BINDING_DOMAIN (rbs )::=
rbs ? {e} . BOUND_DOMAIN (e )
R_GENE
Principles for mapping molecules to pcalculus
A
Molecular integrity (molecule) =
Local channels as unique identifiers
A ::= (new e)
(BINDING_DOMAIN |CATALYTIC_DOMAIN)
A
R
Molecule binding = Exporting local channels
rbs ! {e} . e ! { … } |
rbs ? {cross_e} . cross_e ? {…}
Principles for mapping molecules to pcalculus
Y
Molecular interaction and modification =
Communication and change of channel names
tyr ! p-tyr . CATALYTIC_DOMAIN | … + tyr ? tyr’ . BINDING_DOMAIN

CATALYTIC_DOMAIN | BINDING_DOMAIN {p-tyr / tyr }
Y
Quantitative aspects
•
•
•
~109 protein molecules within the cell
Packed tightly in space
Important proteins in only small amounts
(100’s,1000’s)
Stochastic effects
on molecular interaction
Stochastic p-calculus
(Priami, 1995)
•
Every channel x or internal communication
t attached with a delay parameter d
•
Delay for each communication is chosen
from an exponential distribution with d
•
At each time step all enabled
communications occur
A simple example of stochastic
molecular processes
Degradation
machinery
R
R
Synthesis
machinery
sPR
R_GENE
•
Fast synthesis of protein
R from single gene R
•
Slow degradation of
each protein R
•
Result: Steady state level
of protein R
A simple example of stochastic
molecular processes: p-calculus code
Degradation
machinery
TEST::=
R_GENE | SYNTHESIS | DEGRADATION
R
R
R_GENE::=
spR(1,0) ? [] . ( R_GENE | R )
R::=
degR(100,0) ? [] . 0
Synthesis
machinery
SYNTHESIS::=
spR(1,0) ! [] . SYNTHESIS
sPR
R_GENE
DEGRADATION::=
degpR(100,0) ! [] . DEGRADATION
A simple example of stochastic
molecular processes: spiFCP simulation
Degradation
machinery
R
R
Synthesis
machinery
sPR
R_GENE
The spiFCP simulation system
•
Based on the Logix system (Flat Concurrent
Prolog)
•
•
•
Supports synchronous interaction
Appropriate insulated surface syntax (PiFCP)
Compiler: Generate FCP computational processes
from input p-calculus (in PiFCP syntax) code
 Each channel to an FCP message stream
 Each process to an FCP process
The spiFCP simulation system
•
Regular version
 Step-by-step execution and tracing, at process and
channel level
•
Stochastic version
 A Scheduler mechanism ensures behavior
 Monitoring time evolution of quantities of process
instances
Circadian Clocks: Implementations
J. Dunlap, Science (1998) 280 1548-9
The circadian clock machinery
(Barkai and Leibler, Nature 2000)
A
R
Induced
A
fast
fast
mA
mR
Repressed
R
PA
PR
Hysteretic Oscillator
The circadian clock machinery
(Barkai and Leibler, Nature 2000)
A
degradation
R
A
R
UTRA
transcription
PA
UTRR
translation
A_RNA
degradation
translation
transcription
PR
A_GENE
R_RNA
R_GENE
Figure 1
The circadian clock machinery:
Stochastic aspects
(Barkai and Leibler, Nature 2000)
Appropriate behavior requires different rates
•
•
•
•
Basal transcription A >>> R
Promoted transcription A > R
Degradation A > R degradation
R binding to A (repression) >> A binding to
pA or pR (promotion)
The machinery in p-calculus: “A” molecules
Agene_a::= PROMOTED_A + BASAL_A
PROMOTED_A::= pA ? {e} . ACTIVATED_TRANSCRIPTION_A(e)
BASAL_A::= bA ? []. ( Agene_a | AmRNA_a)
ACTIVATED_TRANSCRIPTION_A::=
t1 . (ACTIVATED_TRANSCRIPTION_A | AmRNA_a) + e ? [] . Agene_a
AmRNA_a::= TRANSLATION_A + DEGRADATION_mA
TRANSLATION_A::= utrA ? [] . (AmRNA_a | Aprot_A)
DEGRADATION_mA::= degmA ? [] . 0
Aprot_A::= (new e1,e2,e3) PROMOTION_A-R + BINDING_R + DEGRADATION_A
PROMOTION_A-R ::= pA ! {e2} . e2 ! [] . Aprot_A +
pR ! {e3} . e3 ! [] . Aprot_A
BINDING_R ::= rbs ! {e1} . BOUND_Aprot_A
BOUND_Aprot_A::= e1 ! []
. Aprot_A
DEGRADATION_A::= degpA ? [] . 0
+ degpA ? [] . e1 ! [] . 0
The machinery in p-calculus: “R” molecules
Rgene_r::= PROMOTED_R + BASAL_R
PROMOTED_R::= pR ? {e} . ACTIVATED_TRANSCRIPTION_R(e)
BASAL_R::= bR ? []. ( Rgene_r | RmRNA_r)
ACTIVATED_TRANSCRIPTION_R::=
t2 . (ACTIVATED_TRANSCRIPTION_R | RmRNA_r) + e ? [] . Rgene_r
RmRNA_r::= TRANSLATION_R + DEGRADATION_mR
TRANSLATION_R::= utrR ? [] . (RmRNA_r | Rprot_R)
DEGRADATION_mR::= degmR ? [] . 0
Rprot_R::= BINDING_R + DEGRADATION_A
BINDING_A ::= rbs ? {e} . BOUND_Rprot_R
BOUND_Rprot_R::= e1 ! []
. Rprot_R
DEGRADATION_R::= degpR ? [] . 0
A mRNA
spiFCP simulation
R mRNA
A-R complex
Free A protein
Free R protein
Utilizing Concurrency Theory to assign
function to biomolecular ensembles
•
Semantic concept: Two processes are
equivalent if can be exchanged within a
context without changing system behavior
•
Build two representations in the p-calculus
 molecular level (implementation)
 functional module level (specification)
•
Show the equivalence of both representations
 by computer simulation
 by formal verification
The circadian clock specification:
Hysteresis module
(Barkai and Leibler, Nature 2000)
A
ON
Fast
Fast
OFF
R
Internal action
Rapid accumulation of A
(creation+degradation)
Slow accumulation of A
Communication
(reaction)
Transition
Inhibition of A
(binding with R)
OFF  ON:
CA>= Threshold1
ON  OFF:
CA<= Threshold2
Hysteresis module
• “R” processes remain intact
• All “A” processes replaced by a single
process, the H-MODULE
H-MODULE::= (new e1, e2, e3)
ON_H-MODULE(CA)
Hysteresis module (ON)
ON_H-MODULE(CA)::=
{CA<=T1} . OFF_H-MODULE(CA) +
{CA>T1} .
(rbs ! {e1} . ON_DECREASE +
e1 ! [] . ON_H_MODULE +
pR ! {e2} . (e2 ! [] .0 | ON_H_MODULE) +
t1 . ON_INCREASE )
ON_INCREASE::= {CA++} . ON_H-MODULE
ON_DECREASE::= {CA--} . ON_H-MODULE
Hysteresis module (OFF)
OFF_H-MODULE(CA)::=
{CA>T2} . ON_H-MODULE(CA) +
{CA<=T2} .
(rbs ! {e1} . OFF_DECREASE +
e1 ! [] . OFF_H_MODULE +
t2 . OFF_INCREASE )
OFF_INCREASE::= {CA++} . OFF_H-MODULE
OFF_DECREASE::= {CA--} . OFF_H-MODULE