BioUML extensible workbench for systems biology Some

ISB
BioUML
Fedor Kolpakov
Institute of Systems Biology
(spin-off of DevelopmentOnTheEdge.com)
Laboratory of Bioinformatics,
Design Technological Institute of Digital Techniques
Novosibirsk, Russia
Agenda
• Part 1: overview of BioUML workbench
• Coffee break
• Part 2: new concepts and possibilities
(versions 0.8.0 – 0.8.3)
• Further development
• Questions and discussion
Part 1: overview of BioUML workbench
Overview
• Main concepts
• Meta model
• Architecture overview
• Diagram types
• Database module
concepts
• Full text search
• Graph search
• Simulation engine
• BioUML server
• BMOND/Biopath database
Live demonstration:
• Installation of BioUML
workbench
• Creating and simulating
simple model
• SBML - Biomodels module
• BioPAX import
• BMOND database
web interface
• JavaScript shell
Part 2: new concepts and possibilities
Overview
• Reconstruction as
solitaire game
• Levels of biological
information
• BioHub concept
• Composite database
module
• Composite diagram
• Experiment concept
• Graphic notation editor
• Microarray data analysis
Live demonstration
• Loading database
modules from server
• Text search
• Graph search
• Creating of composite
database module
• Creating of composite
diagram
• Experiment
• Graphic notation editor
• Microarray data analysis
Useful resources
http://www.biouml.org/demo
Flash movies that demonstrate how to work with the BioUML workbench
User guide, >200 pages
- HTML version: http://www.biouml.org/user/help/index.html
- MS Word document: http://www.biouml.org/download/0.7.8/manual.doc
http://bmond.biouml.org
Examples of pathway annotation:
BMOND – Biological Models aNd Diagrams database
Part 1
Overview
of BioUML workbench
Main BioUML concepts and ideas
• Visual modeling
o Meta model – a problem-domain-neutral level of
abstraction that describes a system as a
compartmentalized graph.
o Diagram type concept – formally defines a graphical
notation and integrates it into the BioUML
workbench.
o Automated code generation for model simulation.
• Database module concept – allows a developer to
incorporate databases on biological pathways into the
BioUML workbench, taking into account the peculiarities
of each database.
• Plug-in based architecture (Eclipse platform runtime from
IBM).
Biological databases
Data search and retrieval
Formal description of the structure of a
biological system
Visual modeling
Automated code generation for
simulation of model behavior:
– MATLAB code: simulation using MATLAB.
JMatLink allows the BioUML workbench to start
MATLAB and retrieve simulation results.
– Java code: Java simulation plug-in.
Contains ODE solvers ported from odeToJava
and methods for hybrid model support.
– … code
Meta model
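Example:
a system of two chemical reactions
A (initial value 100) -> R1 -> B (initial value 0) -> R2 -> C (initial value 0)
R1 contributes -k1[A] to A and k1[A] to B;
R2 contributes -k2[B] to B and k2[B] to C.
k1 – reaction rate constant for R1
k2 – reaction rate constant for R2
Corresponding mathematical model:
\[
\begin{aligned}
\frac{dA}{dt} &= -k_1 [A] \\
\frac{dB}{dt} &= k_1 [A] - k_2 [B] \\
\frac{dC}{dt} &= k_2 [B]
\end{aligned}
\]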
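The model above can be integrated numerically in a few lines. The following is a minimal sketch, not the code BioUML generates: it uses a hand-written classical Runge-Kutta (RK4) step, and the values k1 = 0.5, k2 = 0.2, the time span and the step size are assumptions chosen only for illustration. The initial values (100, 0, 0) are taken from the example.

```python
def derivatives(state, k1, k2):
    """Right-hand side of the model: (dA/dt, dB/dt, dC/dt)."""
    a, b, _c = state
    return (-k1 * a, k1 * a - k2 * b, k2 * b)


def rk4_step(state, dt, k1, k2):
    """Advance the state by one classical Runge-Kutta (RK4) step."""
    def shifted(base, slope, factor):
        return tuple(x + factor * s for x, s in zip(base, slope))

    s1 = derivatives(state, k1, k2)
    s2 = derivatives(shifted(state, s1, dt / 2), k1, k2)
    s3 = derivatives(shifted(state, s2, dt / 2), k1, k2)
    s4 = derivatives(shifted(state, s3, dt), k1, k2)
    return tuple(
        x + dt / 6 * (p + 2 * q + 2 * r + s)
        for x, p, q, r, s in zip(state, s1, s2, s3, s4)
    )


def simulate(k1=0.5, k2=0.2, t_end=10.0, dt=0.01):
    """Integrate from A=100, B=0, C=0; k1, k2, t_end, dt are assumed values."""
    state = (100.0, 0.0, 0.0)
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, k1, k2)
    return state


if __name__ == "__main__":
    a, b, c = simulate()
    print(f"A={a:.4f} B={b:.4f} C={c:.4f} total={a + b + c:.4f}")
```

Because every term removed from one species is added to another, the sum A + B + C stays at 100 throughout the run, which is a quick sanity check on any simulator for this system.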
Meta-model: example of formal description of
a system of two chemical reactions
[Figure: three linked views of the same system.
1. Description of system components in the database: flat-file entries (ID, CC, ..., //) for the substances A, B, C and for the reactions R1 (A->B) and R2 (B->C).
2. System structure described as a graph: A (100) -> R1 -> B (0) -> R2 -> C (0).
3. Mathematical model of the system: the graph edges carry the terms -k1[A], k1[A], -k2[B], k2[B].]
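The step from the graph view to the mathematical model can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not BioUML's code generator: the dictionaries `SPECIES` and `REACTIONS` and the function `generate_equations` are hypothetical names, and edges are assumed to carry a signed stoichiometric coefficient.

```python
from collections import defaultdict

# Hypothetical graph description: species nodes with initial values, and
# reaction nodes with a rate law and signed stoichiometry on each edge.
SPECIES = {"A": 100.0, "B": 0.0, "C": 0.0}
REACTIONS = {
    "R1": {"rate": "k1*[A]", "edges": {"A": -1, "B": +1}},
    "R2": {"rate": "k2*[B]", "edges": {"B": -1, "C": +1}},
}


def generate_equations(species, reactions):
    """Collect, for every species, the rate terms of its adjacent reactions."""
    terms = defaultdict(list)
    for reaction in reactions.values():
        for name, stoich in reaction["edges"].items():
            sign = "-" if stoich < 0 else "+"
            coeff = "" if abs(stoich) == 1 else f"{abs(stoich)}*"
            terms[name].append(f"{sign} {coeff}{reaction['rate']}")
    equations = {}
    for name in species:
        rhs = " ".join(terms[name]) or "0"
        # Drop a leading "+ " so the right-hand side reads naturally.
        equations[name] = f"d[{name}]/dt = {rhs.lstrip('+ ')}"
    return equations


if __name__ == "__main__":
    for line in generate_equations(SPECIES, REACTIONS).values():
        print(line)
```

The same traversal works for any number of reactions, which is why describing the system as a graph is enough to derive the equations automatically.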
The suggested approach can be applied for modeling
biological systems using:
– Systems of ordinary differential equations
– Systems of differential-algebraic equations
– State and transition diagrams
– Hybrid models
– Boolean and logical networks
– Petri nets
– Markov chains
– Stochastic models
– Cellular automata
– …
Some limitations
– Spatial models
– PDE
– …
BioUML architecture
Plug-in based architecture
A plug-in is the smallest unit of BioUML workbench function that can be developed
and delivered separately into BioUML workbench. A plug-in is described in an XML
manifest file, called plugin.xml. The parsed contents of plug-in manifest files are made
available programmatically through a plug-in registry API provided by Eclipse
runtime.
- extension points are well-defined function points in the system where other
plug-ins can contribute functionality.
- extension is a specific contribution to an extension point. Plug-ins can define
their own extension points, so that other plug-ins can integrate tightly with them.
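The relationship between extension points and extensions can be illustrated with a toy registry. This sketch is not the Eclipse plug-in registry API and does not parse plugin.xml; the class `PluginRegistry`, its methods and the identifiers used in the usage example are all hypothetical and serve only to show the contract described above.

```python
from collections import defaultdict


class PluginRegistry:
    """Toy registry: plug-ins declare extension points and contribute to them."""

    def __init__(self):
        self._points = set()
        self._extensions = defaultdict(list)

    def declare_extension_point(self, point_id):
        """A plug-in announces a place where others may add functionality."""
        self._points.add(point_id)

    def contribute(self, point_id, plugin_id, extension):
        """A plug-in adds an extension to a previously declared point."""
        if point_id not in self._points:
            raise KeyError(f"unknown extension point: {point_id}")
        self._extensions[point_id].append((plugin_id, extension))

    def extensions(self, point_id):
        """Return all (plugin_id, extension) pairs contributed to a point."""
        return list(self._extensions.get(point_id, []))


if __name__ == "__main__":
    registry = PluginRegistry()
    # Hypothetical identifiers, for illustration only.
    registry.declare_extension_point("example.diagramType")
    registry.contribute("example.diagramType", "example.sbml", "SbmlDiagramType")
    print(registry.extensions("example.diagramType"))
```

The host never needs to know its contributors in advance; it only asks the registry what has been contributed to a given point.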
[Figure: BioUML plug-in architecture. Each plug-in consists of plugin.xml and Java jar files and runs on the Eclipse platform runtime.
Modules: Standard module, GeneNet module, KEGG/pathways module, TRANSPATH module, SBML module. A module provides a database adapter that maps a database to Java objects (Gene, Protein, …), a query engine, and diagram types (semantic map, pathway, pathway simulation).
Meta model: graph structure, executable model, ModuleType (diagram types, data categories, query engine) and DiagramType (semantic controller, diagram view builder, diagram filter).
Workbench UI: perspectives, views, editors, menus, toolbars, etc.; diagram view part, diagram editor part, diagram editor; analysis tools, simulation tools, other tools.]
Formal description and modeling of biological systems
require coordinated efforts of different groups of
researchers:
• programmers – they should provide computer tools
for this task.
• problem domain experts – they should specify what
should be described and how.
• experimenters and annotators – they should describe
the corresponding data following these rules.
• mathematicians – they should provide methods for
model analysis and simulation.
The BioUML architecture separates these tasks so that each
can be solved effectively by the corresponding group of
researchers, and provides a simple contract for how these
groups and the corresponding software parts should
communicate.
Diagram types
Diagram type concept
A diagram type defines:
· types of biological components and their
interactions that can be shown on the diagram;
· diagram view builder – generates the view
for each diagram element, taking into account
problem domain peculiarities;
· semantic controller – maintains the semantic integrity of
the diagram during editing;
· filters – hide or highlight diagram elements
according to some selection criteria.
Reconstruction and formal description of
biological systems using different diagram types
Diagram types, in order of increasing formality and detail:
1. Semantic network
2. Pathway diagram
(semantic network + gene network or metabolic pathway)
3. Metabolic pathway
4. Gene network
5. Pathway simulation (mathematical model)
Data used, from least to most formal:
- Semi-structured data
- Structured data (reactions and their components)
- Kinetic data (kinetic laws, constants, initial values)
Graphic notation
Stimulus activating NF-kappaB
(semantic network, ontology)
NF-kappaB family
(semantic network, ontology)
Function of human DNA methyltransferases
(pathway diagram)
The biosynthesis of catecholamines
(metabolic pathway)
Cell cycle model of mammalian G1/S
transition control with E2F feedback loops
(pathway simulation diagram)
DGR0356 “NF-kB model” (Hoffmann et al., 2002)
NF-kB dynamics in nucleus and cytoplasm before and after
TNF-alpha stimulation (Hoffmann et al., 2002)
Regulation of caspase-3 activation and degradation
(Stucki and Simon, 2005 )
Database module concept
The database module concept
allows a developer to define new
diagram types and incorporate
other databases on biological
pathways into the BioUML
framework.
The database module defines the
mapping of database content
into diagram elements and the
diagram types that can be used
with the database.
The module also provides a query engine
that can be used by the BioUML
workbench to find interacting
components of the system.
BioUML database modules
BioUML standard module
Databases
• EBI databases: Ensembl, UniProt, ChEBI, GeneOntology
• Biopath/BMOND (http://biopath.biouml.org)
• KEGG/Ligand (http://www.kegg.com)
• TRANSPATH (http://www.biobase.de)
• GeneNet (http://wwwmgs.bionet.nsc.ru)
Formats
• SBML – Systems Biology Markup Language, levels 1 and 2
(http://www.sbml.org)
• CellML – Cell Markup Language (http://www.cellml.org)
• BioPax – Biological Pathways Exchange (http://www.biopax.org)
• PSI-MI
• OBO
• GXL - Graph eXchange Language (http://www.gupro.de/GXL)
KEGG pathway
CellML model
SBML model
Full text search
User interface for full text search: 1) pop-up menu; 2) menu buttons for selected
entity; 3) full text search pane.
Full text search (uses Lucene engine)
Graph search
Graph search engine
Simulation engine
%script for 'CellCycle_1991Gol' model simulation
%constants declaration
global Reaction1_vi Reaction2_kd Reaction4_K1 Reaction4_Kc Reaction4_VM1 Reaction5_K3 Reaction5_VM3 Reaction6_K2 Reaction6_V2 Reaction7_K4 Reaction7_V4
Reaction1_vi = 0.023
Reaction2_kd = 0.00333
Reaction4_K1 = 0.1
Reaction4_Kc = 0.3
Reaction4_VM1 = 0.5
Reaction5_K3 = 0.1
Reaction5_VM3 = 0.2
Reaction6_K2 = 0.1
Reaction6_V2 = 0.167
Reaction7_K4 = 0.1
Reaction7_V4 = 0.1
%Model rate variables and their initial values
y = []
y(1) = 0.0 % y(1) - $cytoplasm.C
y(2) = 0.0 % y(2) - $cytoplasm.EmptySet
y(3) = 0.0 % y(3) - $cytoplasm.M
y(4) = 0.0 % y(4) - $cytoplasm.X
%numeric equation solving
[t,y] = ode23('CellCycle_1991Gol_dy',[0 100],y)
%plot the solver output
plot(t,y(:,1),'-',t,y(:,2),'-',t,y(:,3),'-',t,y(:,4),'-')
title ('Solving Goldbeter problem')
ylabel ('y(t)')
xlabel ('x(t)')
legend('$cytoplasm.C','$cytoplasm.EmptySet','$cytoplasm.M','$cytoplasm.X');
Function to calculate dy/dt for the
model
function dy = CellCycle_1991Gol_dy(t, y)
% Calculates dy/dt for 'CellCycle_1991Gol' model.
%constants declaration
global Reaction1_vi Reaction2_kd Reaction4_K1 Reaction4_Kc Reaction4_VM1 Reaction5_K3 Reaction5_VM3 Reaction6_K2 Reaction6_V2 Reaction7_K4 Reaction7_V4
% rules to calculate the reaction rates used in the equations
rateOfReaction1 = Reaction1_vi;
rateOfReaction4 = ((1 - y(3))*Reaction4_VM1*y(1))/((1 + Reaction4_K1 - y(3))*(Reaction4_Kc + y(1)));
rateOfReaction5 = (Reaction5_VM3*(1 - y(4))*y(3))/(1 + Reaction5_K3 - y(4));
rateOfReaction6 = (y(3)*Reaction6_V2)/(Reaction6_K2 + y(3));
rateOfReaction7 = (Reaction7_V4*y(4))/(Reaction7_K4 + y(4));
rateOfReaction2 = y(1)*Reaction2_kd;
% calculates dy/dt for 'CellCycle-1991Gol.xml' model
% (one row per variable: C, EmptySet, M, X)
dy = [ + rateOfReaction1 - rateOfReaction2
- rateOfReaction1 - rateOfReaction4 - rateOfReaction5 + rateOfReaction6 + rateOfReaction7 + rateOfReaction2
+ rateOfReaction4 - rateOfReaction6
+ rateOfReaction5 - rateOfReaction7]
Results of SBML semantic tests
BioModels –
comparison of BioUML simulation
results with those of other simulators
http://www.biouml.org/_biomodels/
Simulators comparison criteria
Passed – a CSV file was generated by the simulator
Interval criteria:
no difference – 0.999 * min < x < 1.001 * max, or
x < ZERO and max < ZERO
small difference – 0.5 * min < x < 1.5 * max
significant difference – otherwise
or
Median criteria:
no difference – abs((x – median)/median) < 0.01, or
x < ZERO and median < ZERO
small difference – abs((x – median)/median) < 0.5
significant difference – otherwise
x – variable value provided by the compared simulator
min, max, median – calculated from the values provided by the other simulators with
which the specified simulator is being compared.
Implementation note: if the result file was not generated by BioUML, the other
simulators can still be compared with each other.
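The two criteria above translate directly into code. The following is a minimal sketch: the function names are hypothetical, the value of `ZERO` is an assumption (the slides do not give it), and the guard for a zero median is an added assumption to avoid division by zero.

```python
from statistics import median

ZERO = 1e-20  # assumed threshold; the slides do not state its value


def interval_criterion(x, others):
    """Classify x against the min/max of values from the other simulators."""
    lo, hi = min(others), max(others)
    if 0.999 * lo < x < 1.001 * hi or (x < ZERO and hi < ZERO):
        return "no difference"
    if 0.5 * lo < x < 1.5 * hi:
        return "small difference"
    return "significant difference"


def median_criterion(x, others):
    """Classify x against the median of values from the other simulators."""
    med = median(others)
    if x < ZERO and med < ZERO:
        return "no difference"
    if med == 0:
        # Assumption: a relative deviation from a zero median is undefined,
        # so a non-zero x is reported as a significant difference.
        return "significant difference"
    deviation = abs((x - med) / med)
    if deviation < 0.01:
        return "no difference"
    if deviation < 0.5:
        return "small difference"
    return "significant difference"


if __name__ == "__main__":
    reference = [1.0, 1.0, 1.1]
    for value in (1.0005, 1.3, 5.0):
        print(value, interval_criterion(value, reference),
              median_criterion(value, reference))
```

Applied to each variable at each time point of the generated CSV files, the worse of the per-point classifications could then be reported for the whole test; how results are aggregated is not stated in the slides.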
BioUML Enterprise Edition:
BioUML server
BioUML EE architecture
[Figure: BioUML EE architecture.
Client side: web browser; BioUML workbench with a database module.
Server side: servlet container (Tomcat) running the BioUML servlet and BeanExplorer Enterprise Edition; JDBC DB module; JDBC access to a MySQL database; Lucene full text search engine.]
BMOND
Biological MOdels aNd Diagrams
database
(former name – Biopath)
BMOND system architecture
[Figure: BMOND system architecture.
Client side: web browser; BioUML workbench with the Biopath module.
Server side: servlet container (Tomcat) running BeanExplorer Enterprise Edition; JDBC access to the Biopath MySQL database.]
Figure 4. G1/S entry model (Kel et al., 2000) described using BioUML technology.
BMOND web interface
live demonstration
http://bmond.biouml.org
- Interface overview
- View diagrams
- View diagram components
- List of diagram components
- Categories (classification)
- Filter
- Dynamic columns
- Web forms for components editing
Part 2
New concepts and possibilities
Metaphor: biological systems reconstruction
as solitaire (patience) game
Desk – BioUML editor
Solitaire – biological pathway
Cards – biological objects
(genes, proteins, lipids, etc.)
Pack of cards – different
biological databases
Levels of biological information
Main ideas for data integration and pathway reconstruction:
- avoid duplication of information
- classify components of biological pathways by levels
- each next level should refer to, but not duplicate, information from
previous levels
- use free EBI databases whenever possible.
[Figure: levels of biological information.
Level 1: Catalogs of biological objects: Ensembl, UniProt, ChEBI, GO.
Level 2: Pathways, models: BMOND, LipidNet, GeneModels, with wiki and classifications (lipids, genes); refers to Level 1.
Level 3: Problem specific: Cyclonet (wiki; leads, actions, targets) and UbiProt (wiki; classifications: E1, E2, E3, …); refers to the lower levels.]
Add-on technology
This approach should help us to overcome difficulties in using external
catalogs when an external catalog does not contain a needed entity (for
example, a gene or a substance) or when we would like to add some
information to an existing entity description.
Example for BMOND2, gene: a special table allows us to add a new entity to
BMOND2 if that entity is missing from the corresponding external catalog.
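The add-on idea can be sketched as a merge of a read-only external record with a local add-on record. This is an illustration under assumptions, not the BMOND2 implementation: the dictionaries stand in for the external catalog and the add-on table, and all identifiers and field names are hypothetical.

```python
# Hypothetical stand-ins for an external catalog (read-only) and the local
# add-on table; the identifiers and fields are invented for illustration.
EXTERNAL_GENES = {
    "EXT_GENE_1": {"title": "gene one", "synonyms": ["g1"]},
}
ADDON_GENES = {
    "EXT_GENE_1": {"synonyms": ["geneOne"], "comment": "local annotation"},
    "LOCAL_0001": {"title": "gene missing from the external catalog"},
}


def get_gene(gene_id, external=EXTERNAL_GENES, addon=ADDON_GENES):
    """Return the external record extended with local add-on information."""
    base = external.get(gene_id)
    extra = addon.get(gene_id)
    if base is None and extra is None:
        return None
    # Copy the external record so the catalog itself is never modified.
    merged = {
        key: (list(value) if isinstance(value, list) else value)
        for key, value in (base or {}).items()
    }
    for key, value in (extra or {}).items():
        if isinstance(value, list) and isinstance(merged.get(key), list):
            # List-valued fields are extended rather than overwritten.
            merged[key] = merged[key] + [v for v in value if v not in merged[key]]
        else:
            merged[key] = value
    return merged


if __name__ == "__main__":
    print(get_gene("EXT_GENE_1"))
    print(get_gene("LOCAL_0001"))
```

Both cases from the text are covered: extra information is layered on an existing entity, and an entity absent from the catalog is served from the add-on table alone.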
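[Figure: add-on technology for genes. A BioUML Java object for a gene is built by SQL query from the Ensembl gene catalog together with the gene add-on table (classification, synonyms, description, DB references, literature references). The object is shown in the BeanExplorer web interface and indexed as a Lucene document.]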
BioHub
BioHub concept
• BioHUB – an approach to linking information from different databases.
Main usage:
– binding microarray (omics) data to pathway diagrams
– graph search
– DBReferences editor
– microarray (omics) data analysis
• Follows the MIRIAM standard:
– References to database objects
– Relationships between biological objects
• Simple Java API
BioHub structure
Entities
- DB_ID
- version
- ID
- AC
- species
- description
- key words
Databases
- DB_ID
- name
- description
- URL
- url_patern_ID
- url_patern_AC
Relations
- DB_ID_1
- DB_version_1
- ID_1
- DB_ID_2
- DB_version_2
- ID_2
- relation
- evidence
- comment
RelationTypes
- relation
- description
- backwardRelation
- comment
RelationInfo
- DB_ID_1
- DB_ID_2
- relation
- comment
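A relational sketch of part of this structure, using Python's built-in sqlite3 module, shows how a hub lookup might work. The table and column names follow the lists above; the column types, the example rows, the relation name "encodes" and the function names are assumptions for illustration and not the actual BioHub schema or Java API.

```python
import sqlite3


def create_hub():
    """Create an in-memory hub with two of the tables listed above."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Databases (
            DB_ID TEXT PRIMARY KEY, name TEXT, description TEXT, URL TEXT);
        CREATE TABLE Relations (
            DB_ID_1 TEXT, DB_version_1 TEXT, ID_1 TEXT,
            DB_ID_2 TEXT, DB_version_2 TEXT, ID_2 TEXT,
            relation TEXT, evidence TEXT, comment TEXT);
    """)
    return con


def load_example(con):
    """Insert invented placeholder rows (not real database identifiers)."""
    con.executemany(
        "INSERT INTO Databases (DB_ID, name) VALUES (?, ?)",
        [("genes", "example gene catalog"), ("proteins", "example protein catalog")],
    )
    con.executemany(
        "INSERT INTO Relations VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
        [
            ("genes", "1", "GENE_1", "proteins", "1", "PROT_1", "encodes", None, None),
            ("genes", "1", "GENE_2", "proteins", "1", "PROT_2", "encodes", None, None),
        ],
    )


def find_related(con, db_id, entity_id, target_db):
    """Return (ID, relation) pairs in target_db linked to the given entity."""
    rows = con.execute(
        "SELECT ID_2, relation FROM Relations "
        "WHERE DB_ID_1 = ? AND ID_1 = ? AND DB_ID_2 = ? ORDER BY ID_2",
        (db_id, entity_id, target_db),
    )
    return rows.fetchall()


if __name__ == "__main__":
    hub = create_hub()
    load_example(hub)
    print(find_related(hub, "genes", "GENE_1", "proteins"))
```

A lookup of this kind is what binding microarray identifiers to diagram nodes or following references in a graph search would rely on.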
Linking with experimental data and results of analysis
[Figure: the three levels of biological information (Level 1: catalogs of biological objects: Ensembl, UniProt, ChEBI, GO; Level 2: pathways, models: BMOND, LipidNet, GeneModels; Level 3: problem specific: Cyclonet, UbiProt), with BioHUB linking the biological objects to experimental data and results of analysis: OMICS data, results of analysis, MSigDB, GeneAtlas, NCI60.]
Linking with external databases
[Figure: the same scheme, with BioHUB additionally linking the biological objects to external databases (KEGG, LipidMap, LipidBank, Reactome, …) as well as to experimental data and results of analysis (OMICS data, results of analysis, MSigDB, GeneAtlas, NCI60).]
Coloring diagram according to microarray data.
Each bar corresponds to one value from
corresponding microarray series.
Coloring diagram according to omics data
BioHub usage: graph search engine
Composite database module
Flash movie: XML_module.exe
Composite database module
A composite database module is defined formally as an XML
document. It allows one to:
• specify dependencies on other database modules
• specify data types that can be used from external database
modules
• describe dynamic properties for add-on technology
• specify which dynamic properties can be added to data
types from external modules. This information is stored
in the local module and merged dynamically with information
from external modules. In this way the user can add information
to external catalogs like Ensembl, UniProt, etc.
• specify data types used by the local module
• specify diagram types used by the local module
• specify the QueryEngine
DTD
<!ELEMENT dbModule (jdbcConnection, properties?, dependencies?, types?)>
<!ATTLIST dbModule
    name             CDATA   #REQUIRED
    title            CDATA   #REQUIRED
    description      PCDATA
    version          CDATA   "0.8.0"
    type             CDATA   text|SQL
    databaseType     CDATA
    databaseVersion  CDATA
    databaseName     CDATA
>
<!ELEMENT jdbcConnection>
<!ATTLIST jdbcConnection
    name             CDATA   #REQUIRED
    jdbcDriverClass  CDATA   #REQUIRED
    jdbcURL          CDATA   #REQUIRED
    jdbcUser         CDATA
    jdbcPassword     CDATA
>
<!-- ================================================================ -->
<!-- Properties - definition of properties for all types of diagram   -->
<!-- elements used by the graphic notation.                           -->
<!--                                                                  -->
<!-- Possible property types:                                         -->
<!-- - simple types: boolean, int, double, String                     -->
<!-- - array                                                          -->
<!-- - composite                                                      -->
<!-- ================================================================ -->
<!ELEMENT properties (property*)>
<!ELEMENT property (tags?)>
<!ATTLIST property
    name               CDATA   #REQUIRED
    type               CDATA   #REQUIRED
    short-description  CDATA   #IMPLIED
    value              CDATA
>
<!ELEMENT tags (tag+)>
<!ELEMENT tag>
<!ATTLIST tag
    name   CDATA   #REQUIRED
    value  CDATA   #IMPLIED
>
<!ELEMENT propertyRef>
<!ATTLIST propertyRef
    name   CDATA   #REQUIRED
    value  CDATA
>
<!-- ================================================================ -->
<!-- Dependencies from other databases and modules                    -->
<!-- Graphic notations can be defined in the specialized module       -->
<!-- ================================================================ -->
<!ELEMENT dependencies (dbModule*, graphicNotation*)>
<!ELEMENT dbModule (externalType+)>
<!ATTLIST dbModule
    name  CDATA   #REQUIRED
>
<!ELEMENT externalType (propertyRef*)>
<!ATTLIST externalType
    name      CDATA   #REQUIRED
    readOnly  CDATA   true|false
>
<!ELEMENT graphicNotation>
<!ATTLIST graphicNotation
    name   CDATA   #REQUIRED
    type   CDATA   Java|XML
    class  CDATA
    path   CDATA
>
<!-- ================================================================ -->
<!-- Internal data types for this module                              -->
<!-- Description of internal type should provide all information to   -->
<!-- create corresponding DataCollection                              -->
<!-- ================================================================ -->
<!ELEMENT types (internalType*)>
<!ELEMENT internalType (querySystem, propertyRef*)>
<!ATTLIST internalType
    section      CDATA   #REQUIRED
    name         CDATA   #REQUIRED
    class        CDATA   #REQUIRED
    transformer  CDATA   #REQUIRED
>
<!ELEMENT querySystem (index*)>
<!ATTLIST querySystem
    class          CDATA   #REQUIRED
    luceneIndexes  CDATA
>
<!ELEMENT index>
<!ATTLIST index
    class  CDATA   #REQUIRED
    table  CDATA
>
Editor for composite database module
Current status:
Implemented:
• Database modules (initial version):
Ensembl, UniProt, ChEBI, GO,
IntAct, Reactome, BioModels
• Composite module (external references)
– Defined as XML
– Composite module editor
• Selecting and loading modules from server
In progress:
• BioHUB
• Protein state concept
• Add-on technology
• BMOND2 – redesigned version of BMOND.
From huge theory to practical output
Automated language
translation
Practical output
• electronic
dictionaries
• spell checkers
Biological data
integration
Practical output
• catalogs (Ensembl,
UniProt, ChEBI)
• controlled vocabularies,
ontologies
• hubs
Model composition
Composite diagram: main concepts
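[Figure: composite diagram built from blocks and a subdiagram.
block (EModel): dx/dt = f1, dy/dt = f2, z = f3
block 2 (EModel): dx/dt = f5, dy/dt = f6 + z, k + z + f4 = 0
block 3 (EModel): dx/dt = f5, dy/dt = f6 + block2.k, k + z + f4 = 0 (indirect link to block 2)
subdiagram (EModel): species s1, s2, s3, s4 and reaction R; a subdiagram element can participate directly in a reaction.
Connections between blocks link the variables x, y, e; a directed connection may apply a transformation function f(x). A second input to the same variable is marked as forbidden.]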
Flat model:
Before MATLAB or Java code
generation, the composite model
is transformed into a flat model
and the usual generation routines
are used.
Block types:
1) block – only mathematical
equations. Used mainly for
physiological models;
2) subdiagram – another
diagram.
Connection types:
1) directed – input → output.
A transformation function can
be used;
2) undirected – contact.
Indicates that two nodes in
the model are the same entity.
Semantic constraints:
There are semantic
constraints, for example:
a block can have only one
input for each variable.
Two inputs are forbidden
for the same variable.
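The semantic constraint on inputs is easy to check mechanically. The following is a minimal sketch, assuming that directed connections are available as (source, target) pairs and that variables are named "block.variable"; both the representation and the function name are hypothetical.

```python
from collections import Counter


def check_single_input(connections):
    """Return the target variables that receive more than one directed input.

    connections: iterable of (source, target) pairs, each written in the
    assumed form "block.variable".
    """
    counts = Counter(target for _source, target in connections)
    return sorted(target for target, n in counts.items() if n > 1)


if __name__ == "__main__":
    links = [
        ("block1.x", "block2.z"),
        ("block3.y", "block2.z"),  # second input to block2.z: forbidden
        ("block1.y", "block3.k"),
    ]
    print(check_single_input(links))
```

An editor could run such a check whenever a connection is drawn and refuse the edit if the returned list is not empty.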
Experiment
Experiment
A virtual experiment frequently requires modifying
the initial model.
Typical modifications (changes) are:
• changing initial values
• changing model parameters to imitate different
conditions or mutations
• deleting some model elements to imitate knock-out
mutations
• adding events to imitate external influences on the
model
To avoid duplicating the model for each virtual experiment, we
introduce the “changes” concept.
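One way to picture the “changes” concept is to store an experiment as a list of changes and derive the modified model on demand. The sketch below is an assumption-laden illustration, not BioUML's data model: the model layout, the change kinds and all names are hypothetical, chosen to mirror the four typical modifications listed above.

```python
import copy

# Hypothetical base model; only this copy is stored.
BASE_MODEL = {
    "initial_values": {"A": 100.0, "B": 0.0, "C": 0.0},
    "parameters": {"k1": 0.5, "k2": 0.2},
    "elements": {"R1", "R2"},
    "events": [],
}

# A virtual experiment is just a list of changes.
KNOCKOUT_EXPERIMENT = [
    {"kind": "set_initial_value", "name": "A", "value": 50.0},
    {"kind": "set_parameter", "name": "k1", "value": 1.0},
    {"kind": "delete_element", "name": "R2"},
    {"kind": "add_event", "event": "at t = 10 set B to 0"},
]


def apply_changes(model, changes):
    """Derive the experiment's model; the base model is left untouched."""
    variant = copy.deepcopy(model)
    for change in changes:
        kind = change["kind"]
        if kind == "set_initial_value":
            variant["initial_values"][change["name"]] = change["value"]
        elif kind == "set_parameter":
            variant["parameters"][change["name"]] = change["value"]
        elif kind == "delete_element":
            variant["elements"].discard(change["name"])
        elif kind == "add_event":
            variant["events"].append(change["event"])
        else:
            raise ValueError(f"unknown change kind: {kind}")
    return variant


if __name__ == "__main__":
    print(apply_changes(BASE_MODEL, KNOCKOUT_EXPERIMENT))
```

Only the short change list has to be saved per experiment, so a correction to the base model automatically reaches every experiment derived from it.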
Graphic notation
formal definition as XML
document
http://www.biouml.org/sbgn.shtml
Flash movie: Graphic_Notations_Editor.exe
Graphic notation
versus graph layout
• allows editing of a diagram
• allows creation of a new diagram
• different graphic notations can be applied
to the same SBML model
• allows SBGN to be defined formally and used in
SBML models
• allows a graphic notation to be reused by many
tools
Graphic notation can be defined formally as an XML document
• properties – formal definition of properties that can be used as properties of nodes and
edges (for example, title, multimer, etc.). Definition of a property includes:
– name
– type
– short description
– controlled vocabulary (optional)
• node types – definition of a node includes:
– name
– icon
– properties
– view function (JavaScript)
– short description
• edge types – definition of an edge includes:
– name
– icon
– properties
– view function (JavaScript)
– short description
• semantic controller – defines rules for semantic control of diagram integrity. For this
purpose it defines the following functions:
– canAccept (JavaScript)
– isResizable (JavaScript)
– move (JavaScript)
• examples – a set of diagrams that can be used as test cases, legend and examples for the
graphic notation. DML (Diagram Markup Language) is used for this purpose.
Basic software architecture for rendering of
biological models according to specified
graphic notation and layout information
[Figure: rendering architecture. The rendering engine builds a diagram from initial data through three APIs: Model API (SBML, BioPAX, …), Notation API (graphic notation) and Layout API (layout information). JavaScript functions build node/edge views and perform semantic control; they use a JavaScript API for data access and the rendering API, a JavaScript API for creating primitives similar to the SBML layout extension.]
Formal definition of graphic
notation as XML document and
integration with SBML format
Graphic notation components    Defined as
Object types                   XML
Object properties              XML
User defined properties        XML
Rules for visualization        JavaScript
Rules for semantic control     JavaScript
Test cases                     XML
SBML: <annotations>, <annotations>, <annotations>; model, module
Graphic notation editor
main concepts
• graphic notation is defined formally as XML document
• graphic notation editor provides user friendly interface for
XML document editing
• SBGN graphic notation (prototype) is implemented
• BioUML workbench allows diagrams to be created and edited
using a graphic notation defined as an XML document
• The graphic notation editor may be useful to the SBGN
community for:
– improving the SBGN specification
– testing the SBGN specification by creating different diagrams
Details:
http://www.biouml.org/sbgn.shtml
BioUML workbench
Select ‘Data’ tab to see the tab with a list
with available graphic notations
Click right mouse button on
selected graphic notation to open it
Graphic Notation Editor
Graphic Notation Editor
Main sections of formal definition
of graphic notation
List of specific properties
that are used by graphic notation
Properties editor
User can click right mouse button
on Properties node to create new property
Nodes – contains list of all node types
used by graphic notation
For each node type user can define:
- name
- properties
- icon
- view function (JavaScript)
By clicking right mouse button on “Nodes”
user can create new node type
In the same way the user can define
an edge type:
- name
- properties
- icon
- view function (JavaScript)
“Examples” node
contains a set of diagrams
that demonstrate usage of the
graphic notation.
The user can create and edit
such diagrams.
When the user selects an
element on the diagram, they
can edit:
- object properties
- the JavaScript that builds a
view for the selected diagram
element
“Semantic controller” node
contains list of JavaScript functions that
provide semantic constraints and semantic
integrity of the diagram.
Graphic notation defined as XML document can
be used by BioUML workbench to create
corresponding diagram.
Graphic Notation Editor
SBGN examples
created in BioUML
Skins
Microarray plug-in
(alpha version)
Microarray plug-in
- Import microarray data in tab delimited format
- Show data as a table
- Filter data by different criteria
Microarray data analysis
- Revealing up/down regulated genes
- Meta-analyses
- Binding with diagram nodes by ID
- Coloring diagrams
- JavaScript functions
- Data manipulation (filter, join, intersect, trim, etc.)
- Statistical analysis
Microarray plug-in
Current work:
- Powerful user interface for coloring diagrams
- Support of other formats for microarray data and results
of analyses
- Sophisticated binding algorithm using different database
references and ID (gene hub)
Further work:
- Server module that will provide access to ArrayExpress
data
BioUML workbench.
Data tab contains section “Microarray”.
User can import microarray data in tab delimited
format into this section.
Possibility to filter probe sets:
- by column values
- selecting only those probe sets that can be linked to
the specified diagram
Microarray analysis
Coloring diagram according to microarray data.
Each bar corresponds to one value from
corresponding microarray series.
Coloring diagram according to omics data
Further development:
Protein state
BioUML workbench: further development
• Protein states
• Complexes
• Improving team work on annotation
– Login, single sign on
– Editing history (what data were modified, by whom and when)
– Passing of changes from server to client
• Sequence analysis and visualization
• Agent based modeling
Protein state
Modification
• The functions of macromolecular entities (mainly proteins)
are often determined not only by their primary sequences,
but by chemical modifications they have undergone.
• In BMOND2 unmodified and modified forms of a protein
refer to the same entity in UniProt database
• List of possible modifications is extracted from UniProt
Feature Table
• BMOND2 modifications table
– allows description of modifications that are not described in UniProt.
These modifications are automatically added to the protein
referred to from BMOND2.
• Modification type – controlled vocabulary that describes
possible modification types (for example, phosphorylation,
acetylation, ubiquitination)
• To take protein modifications into account, the State concept
is used.
UniProt Feature Table
FT   CHAIN         1    561       Cytosolic purine 5'-nucleotidase.
FT                                /FTId=PRO_0000064389.
FT   REGION      202    210       Substrate binding (Potential).
FT   COMPBIAS    549    561       Asp/Glu-rich (acidic).
FT   ACT_SITE     52     52       Nucleophile.
FT   ACT_SITE     54     54       Proton donor.
FT   METAL        52     52       Magnesium.
FT   METAL        54     54       Magnesium (via carbonyl oxygen).
FT   METAL       351    351       Magnesium.
FT   BINDING     127    127       Allosteric activator 1.
FT   BINDING     154    154       Allosteric activator 2.
FT   BINDING     354    354       Allosteric activator 2.
FT   BINDING     436    436       Allosteric activator 1; via carbonyl
FT                                oxygen.
FT   BINDING     453    453       Allosteric activator 2.
FT   MOD_RES     527    527       Phosphoserine (By similarity).
FT   VARIANT       3      3       T -> A (in dbSNP:rs10883841).
FT                                /FTId=VAR_024244.
FT   VARIANT     136    136       Q -> R (in dbSNP:rs12262171).
FT                                /FTId=VAR_030242.
Modification
• position
• amino acid
• modification type (controlled vocabulary)
• evidence:
experimental, by similarity, predicted
• comment
• publication reference
State concept
• State – describes states of all amino acids available for
modifications
• possible values:
– ? – unknown, not specified
– * – any
– - – unmodified
– p – phosphorylated
– ac – acetylated
– … – from controlled vocabulary
• Protein states are described in BMOND2 states table
• Reaction – user should specify protein state
• Diagram – user should specify protein state
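A protein state of this kind can be pictured as a mapping from modifiable positions to state codes. The sketch below is a hypothetical illustration: the dictionary representation, the function `matches`, and in particular the matching rules (a position missing from the actual state counts as "?", and "*" in a requirement accepts anything) are assumptions that the slides do not specify.

```python
# State codes from the slide; further codes come from the controlled vocabulary.
KNOWN_CODES = {"?", "*", "-", "p", "ac"}


def matches(required, actual):
    """Check whether an actual protein state satisfies a required state.

    Both arguments map an amino acid position to a state code.
    Assumed rules: "*" in the requirement accepts any state; a position
    missing from the actual state is treated as "?" (unknown) and only
    satisfies "*" or "?".
    """
    for position, wanted in required.items():
        if wanted == "*":
            continue
        found = actual.get(position, "?")
        if found != wanted:
            return False
    return True


if __name__ == "__main__":
    # Position 527 is the phosphoserine site from the feature table above.
    observed = {527: "p", 3: "-"}
    print(matches({527: "p"}, observed))  # requirement satisfied
    print(matches({527: "-"}, observed))  # requirement not satisfied
```

A reaction that requires a phosphorylated form would then be checked against the state recorded for the protein on the diagram.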
State table
• module (database)
• id
• state – short name (like TRANSPATH)
• position
• modification
SBGN
Mapping: BMOND2 -- SBGN
modification – state variable
state – state of macromolecule
Complex concept
Complex concept
• A complex is a biochemical entity composed of other
biochemical entities, whether macromolecules, small
molecules, multimers, or themselves complexes.
• A complex is specified as a set of units
• Complex modifications
– all possible modifications of its units
(some of them cannot occur due to physical
interactions between units; how can we take this into
account?)
• Complex state
– var. 1 – list of modifications for its units
– var. 2 – list of states for its units
Complex tables
• Complex
– ID
– title (short name)
– complete name
– species
– synonyms
– comment
• References:
– States
– Synonyms
– Structure
– DBReferences
– Publications
• Complex Units
– complexDB
– complexID
– unitDB
– unitID
– multimer
SBGN
Reaction
• Reaction components
– component identification
• DB
• id
• [state]
• [compartment]
• Reaction
– [compartment]
• Reaction dialog
– species state
– species compartment
– reaction compartment
• Tables
– Reaction
• compartment
– Reaction components
• state
• compartment
Diagrams
• Macromolecule state
– “New diagram element” dialog
• Graphic notation
– BioUML
• states – right label, one modification
• complexes
– SBGN skin
Acknowledgements
This work was partially supported by the following grants:
• European Committee grant №037590 “Net2Drug”
• Siberian Branch of Russian Academy of Sciences
(interdisciplinary projects № 46)
• Volkswagen-Stiftung (I/75941),
• INTAS Nr. 03-51-5218
• RFBR Nr. 04-04-49826-а
The author is grateful to Alexander Kel and Sergey Zhatchenko
for useful comments, discussions and technical support.
Software developers
Nikita Tolstyh
Mikhail Puzanov
Sergey Lapukhov
Ilya Kiselev
Alexander Magdysyuk
Denis Ryumin
Vlad Zhvaleev
Alexandr Koshukov
Vasiliy Hudyakov
Igor Tyazhev
Sergey Graschenko
Oleg Onegov
Annotators
Ruslan Sharipov
Ivan Yevshin
Elena Cheremushkina
Ekaterina Kalashnikova