http://www.dgemap.org/
Jano van Hemert
3-year EU-funded design study
Goal: design the organisational & collaborative structures, ethical framework, and molecular genetic & informatics technologies necessary for a new research infrastructure which will accelerate an integrated European approach to gene expression in early human development
Consortium
National e-Science Centre
MRC Human Genetics Unit
University of Newcastle
4 work packages
Three major goals
1. Facilitate collaboration across multiple laboratories
2. Improve ways of handling spatial-temporal data from gene expression studies
3. Provide integration with other technologies and databases to help biologists advance their studies
Laboratory process
From 2D sections to 3D models
Framework:
Edinburgh Mouse Atlas
Space and Anatomy
anatomical name
Gene Expression Database
• Query: by both space and text...
EMAGE: Query by space
Silicon processes
Data mining
Human-mouse link
Visualisation
Other data sources
(OMIM, GDX, …)
The Developing Human e-Portal
Web services exist
Where do workflows fit in?
• Advanced queries incorporating other DBs
– Linking genes with diseases (OMIM)
– Genetic pathways (KEGG)
• Mouse-human interoperability
– Using anatomical terms
– Using direct 3D to 3D model mapping
– Using spatial-temporal ontologies
• Data mining and processes
– Hierarchical Clustering
– Association rules
Mouse-human interoperability
Hierarchical clustering
‘McMahon’ Data TS17
Myt1l
Dlx5
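The slides do not say which distance measure or linkage was used for the clustering shown here. As a minimal sketch only, the following assumes each gene's expression pattern is a set of atlas region identifiers, compares patterns by Jaccard distance, and merges clusters by average linkage. The gene names, region numbers, and function names are all hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

Cluster = Tuple[str, ...]


def jaccard_distance(a: Set[int], b: Set[int]) -> float:
    """1 - |A ∩ B| / |A ∪ B|; two empty patterns count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def agglomerate(patterns: Dict[str, Set[int]]) -> List[Tuple[Cluster, Cluster, float]]:
    """Average-linkage agglomerative clustering.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    in the order the merges happened.
    """
    clusters: List[Cluster] = [(name,) for name in sorted(patterns)]
    merges: List[Tuple[Cluster, Cluster, float]] = []

    def linkage(c1: Cluster, c2: Cluster) -> float:
        # Average pairwise distance between members of the two clusters.
        dists = [jaccard_distance(patterns[a], patterns[b]) for a in c1 for b in c2]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        # Pick the closest pair of clusters and merge them.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        merges.append((clusters[i], clusters[j], linkage(clusters[i], clusters[j])))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical patterns: gene name -> set of region identifiers where it is expressed.
PATTERNS = {
    "geneA": {1, 2, 3},
    "geneB": {1, 2, 3, 4},
    "geneC": {7, 8},
    "geneD": {8, 9},
}
MERGES = agglomerate(PATTERNS)
```

The merge history is what a dendrogram viewer would draw: genes with overlapping patterns join early at small distances, and unrelated groups join last.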
Let biologists cluster data
Clustering: viewing the output
What are association rules?
• Based on a set of transactions
• We want to derive rules of the form X => Y
• Meaning, if X happens then Y happens
• X and Y are sets of items appearing in the transactions
• The rules come with numbers to express their quality with respect to the set of transactions (most common: support and confidence)
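Support and confidence have standard definitions: the support of an itemset is the fraction of transactions containing it, and the confidence of X => Y is support(X ∪ Y) / support(X). A minimal sketch of both, using hypothetical item names and a toy transaction set rather than any data from the talk:

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(x: Iterable[str], y: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """support(X ∪ Y) / support(X); defined as 0.0 when X never occurs."""
    x_items = frozenset(x)
    support_x = support(x_items, transactions)
    if support_x == 0:
        return 0.0
    return support(x_items | frozenset(y), transactions) / support_x


# Toy transactions (hypothetical items "a", "b", "c").
TRANSACTIONS = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b"}),
    frozenset({"a", "c"}),
    frozenset({"b", "c"}),
]

# Rule {a, b} => {c}: X occurs in 2 of 4 transactions, X ∪ Y in 1 of 4.
SUPPORT_X = support({"a", "b"}, TRANSACTIONS)
CONFIDENCE_RULE = confidence({"a", "b"}, {"c"}, TRANSACTIONS)
```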
Association Rules
• In the context of gene expression: if Gene1 and Gene2 then Gene3, where a transaction equals a set of genes expressing together at the same time in the same anatomical component
• Alternative: if Component1 then Component2 and Component3, where a transaction equals a number of components expressing the same gene at the same time
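The two transaction definitions above amount to grouping the same expression records by different keys. The sketch below assumes each record is a (gene, component, stage) triple; that record layout and all names and values in it are hypothetical, not the EMAGE schema.

```python
from collections import defaultdict
from typing import DefaultDict, FrozenSet, List, NamedTuple, Set, Tuple


class Expression(NamedTuple):
    """One hypothetical annotation: a gene expressed in a component at a stage."""
    gene: str
    component: str
    stage: str


def gene_transactions(records: List[Expression]) -> List[FrozenSet[str]]:
    """One transaction per (component, stage): the genes expressed there."""
    groups: DefaultDict[Tuple[str, str], Set[str]] = defaultdict(set)
    for r in records:
        groups[(r.component, r.stage)].add(r.gene)
    return [frozenset(genes) for genes in groups.values()]


def component_transactions(records: List[Expression]) -> List[FrozenSet[str]]:
    """One transaction per (gene, stage): the components expressing that gene."""
    groups: DefaultDict[Tuple[str, str], Set[str]] = defaultdict(set)
    for r in records:
        groups[(r.gene, r.stage)].add(r.component)
    return [frozenset(components) for components in groups.values()]


# Hypothetical records; gene, component, and stage labels are placeholders.
RECORDS = [
    Expression("GeneA", "comp1", "stage1"),
    Expression("GeneB", "comp1", "stage1"),
    Expression("GeneA", "comp2", "stage1"),
    Expression("GeneA", "comp1", "stage2"),
]
BY_GENE = gene_transactions(RECORDS)
BY_COMPONENT = component_transactions(RECORDS)
```

Rules mined from the first grouping relate genes to genes; rules mined from the second relate anatomical components to components.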
Association Rules Results
Association rules with a minimum confidence of 90%

Rule                  Support  Confidence
Wnt1, Bmp4 => Shh     0.053    0.91
Vcam1 => Kdr          0.057    0.93
Emx2 => Otx2          0.054    0.95
Otx1, Pax6 => Otx2    0.051    0.92

Transaction: genes expressing in the same anatomical component in the same Theiler stage
Source: the EMAGE database, using the editorial spatial annotations extracted on 2006/08/28
Techno-fact: extracted using web services called from a Perl script…
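The Perl script and web services used to extract these rules are not reproduced here. As an illustration of the filtering step only, this Python sketch enumerates candidate rules by brute force (not an optimised algorithm such as Apriori) and keeps those that meet a minimum support and confidence. The transactions, item names, thresholds, and the `mine_rules` function are hypothetical.

```python
from itertools import combinations
from typing import FrozenSet, List, Tuple

Rule = Tuple[Tuple[str, ...], str, float, float]  # (antecedent, consequent, support, confidence)


def mine_rules(transactions: List[FrozenSet[str]], min_support: float,
               min_confidence: float, max_antecedent: int = 2) -> List[Rule]:
    """Brute-force search for rules X => y with a single-item consequent."""
    n = len(transactions)
    if n == 0:
        return []

    def supp(items: FrozenSet[str]) -> float:
        return sum(1 for t in transactions if items <= t) / n

    all_items = sorted(set().union(*transactions))
    rules: List[Rule] = []
    for size in range(1, max_antecedent + 1):
        for x in combinations(all_items, size):
            x_set = frozenset(x)
            support_x = supp(x_set)
            if support_x == 0:
                continue  # antecedent never occurs, confidence undefined
            for y in all_items:
                if y in x_set:
                    continue
                support_xy = supp(x_set | {y})
                conf = support_xy / support_x
                if support_xy >= min_support and conf >= min_confidence:
                    rules.append((x, y, support_xy, conf))
    return rules


# Toy transactions with placeholder gene names.
TOY = [
    frozenset({"g1", "g2", "g3"}),
    frozenset({"g1", "g2", "g3"}),
    frozenset({"g1", "g3"}),
    frozenset({"g2"}),
]
RULES = mine_rules(TOY, min_support=0.5, min_confidence=0.9)
```

Raising `min_confidence` trades the number of rules for their reliability; a low `min_support` lets rules through that rest on only a handful of transactions.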
Perl script
Main issues while using Taverna
• Need for more data mangling functions
• Need for more data formatting controls
• Pipelining and memory concerns
• Library of useful translation services
• Interaction Plug-in Architecture…?
• What about Axis version 2?
Thanks for your attention
Susan Lindsay
Demetrius Vouyiouklis
Marie-Laure Muiras
Xunxian Wang
Mark Scott
Alina Andras
Jano van Hemert
Malcolm Atkinson
Yin Chen
Richard Baldock
Simon Woods
Ken Taylor