European Molecular Biology Laboratory, European Bioinformatics Institute
http://www.embl.de/
from 1974
http://www.ebi.ac.uk/
from 1996
The European Molecular Biology Laboratory
(EMBL) is supported by sixteen countries.
It consists of the main Laboratory in Heidelberg
(Germany), Outstations in Hamburg (Germany),
Grenoble (France) and Hinxton (U.K.), and an
external Research Programme in Monterotondo
(Italy).
The EBI Mission
To provide Bioinformatics Facilities for the
Scientific Community
To become a flagship laboratory for research
in bioinformatics
To provide bioinformatics training
To help disseminate standards &
technologies
Role of Bioinformatics
To Support Experimental Biology
To Collect and Archive Data
To provide Framework and Integration
To give Easy Access to Data
To make New Discoveries through Data
Analysis
To predict through modelling
To facilitate application and exploitation of
academic research in Medicine, Agriculture,
Health and Environment
Dramatic Changes in Biology over the last 5
years
Data Explosion & New Types of Data
Move towards High-Throughput Biology
Move towards Systems Biology
Much larger community – often naïve
users
Growth of Applied Biology – molecular
medicine, agriculture, food, environmental
sciences
[Figure: Bioinformatics linking genomes, literature, expression profiling, proteome data, metabolic data, comparative genomics, biochemistry and mutant/RNAi data to hypotheses and in silico models]
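Molecules to Cells to Organisms
[Figure: panels labelled Protein, E. coli Genome, Genomes and Systems Biology. The Systems Biology panel is a signalling pathway diagram with the labels Input, Adaptor, CheA, CheW, CheY, CheZ, CheB, CheR, FliM, Methyl, ATP, ADP, Pi and Output]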
Molecular Basis of Disease
p53 tumour suppressor core domain: cancers of many types
Cu-Zn Superoxide Dismutase: autosomal dominant amyotrophic lateral sclerosis
From Structure to Functional Annotation
Linking to domain data: eFamily
Sequence mapping: SIFTS
Ligand data: MSDchem
Electron density visualisation: AstexViewer
MSDPro, MSDlite
Active sites: MSDsite
Fold matching: SSM
Biological assemblies: PQS
Surface matching
From Structure To Biochemical Function
Gene → Protein → 3D Structure → Function
Given a protein structure:
Where is the functional site?
What is the multimeric state of the protein?
Which ligands bind to the protein?
What is the biochemical function?
High throughput
A new sequence every 4 seconds
600 000 web requests a day
100 000 users
5-10 core databases
20 000 000 cross-references
About 160 other databases
Data Growth
[Chart: web requests per day (excluding Ensembl), Dec-00 to Dec-02; vertical axis from 0 to 500 000]
ftp (million files; Terabytes)
Year    Values
2001    4.5     11914
2002    5.6     11809
2003    13.5    43860
2004    17.3    60508
2005    26.3    85396
Web Servers Requests
Year    Millions    Exact count
2002    118         118631650
2003    255         255399724
2004    354         354235704
2005    482         482076196
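The yearly web server request totals for 2002 to 2005 can be turned into year-on-year growth rates with a few lines of arithmetic. The following is a minimal sketch, not part of the original slides; the function name `yearly_growth` is a hypothetical helper, and only the four request counts come from the slide.

```python
# Request totals per year, copied from the slide.
requests = {
    2002: 118631650,
    2003: 255399724,
    2004: 354235704,
    2005: 482076196,
}


def yearly_growth(series):
    """Return percentage growth for each year relative to the previous year."""
    years = sorted(series)
    return {
        year: (series[year] / series[prev] - 1) * 100
        for prev, year in zip(years, years[1:])
    }


growth = yearly_growth(requests)
for year, pct in growth.items():
    print(f"{year}: {pct:+.1f}% compared with {year - 1}")
```

On these figures the requests roughly double from 2002 to 2003 and keep growing by more than a third in each of the following two years.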
Distinct hosts served
Year    Hosts      Number of users (millions)
2002    1586883    1.5
2003    2784974    2.7
2004    3656109    3.6
2005    3919564    3.9
dynamic pages domains (2005)
 1.  .uk (United Kingdom)                 21.14%
 2.  .com (Commercial)                    17.16%
 3.  [unknown domain]                     13.37%
 4.  [unresolved numerical addresses]     11.05%
 5.  .edu (USA Higher Education)           5.29%
 6.  .net (Networks)                       5.27%
 7.  .fr (France)                          4.76%
 8.  .it (Italy)                           4.68%
 9.  .de (Germany)                         2.81%
10.  .nl (Netherlands)                     2.00%
The Services of the EBI
Nucleotide sequences
Genes
Transcription information
Protein sequences
Protein families
Macromolecular structures
Molecular interactions
Pathways
Metabolic information
Scientific Literature
Structure of EBI: Services
[Organisation chart. Labels: Database Integration and External Services; group leaders Lopez, Apweiler, Stoesser, Stoehr, Zhu, Henrick, Brazma, Birney]
Structure of EBI: Research
[Organisation chart. Labels as extracted:]
Text Mining: Schuhmann
Structural Proteomics, Computational Genomics: Ouzounis, Thornton, ?
Neuroinformatics: Le Novere
Phylogeny & Evolution: Goldman
EBI Databases
EMBL-Bank: DNA sequences
SWISS-PROT + TrEMBL: Protein Sequences
EMSD: Macromolecular Structure Data
Array-Express: Microarray Expression Data
EnsEMBL: Human Genome Gene Annotation
IntAct: Protein Interactions
GKB: Pathways
Integration
Integrative science demands
integrative resources
EBI databases have a backbone of integrative
links
20 000 000 cross-references support trans-database navigation
Is this good enough?
sparse and coarse-grained
not straightforward to use
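To illustrate what navigation through cross-references means in practice, the sketch below follows links from one record to records in other databases. It is an illustration only: the `XREFS` table, the record identifiers and the `reachable` function are hypothetical, and nothing here reflects the actual EBI data model or any EBI API. Only the database names are taken from the slides.

```python
from collections import deque

# Hypothetical cross-reference table: each (database, identifier) record
# points to records in other databases. Identifiers are placeholders,
# not real accession numbers.
XREFS = {
    ("EMBL-Bank", "DNA0001"): [("SWISS-PROT", "PROT0001")],
    ("SWISS-PROT", "PROT0001"): [("EMSD", "STRUCT0001"), ("IntAct", "INT0001")],
    ("EMSD", "STRUCT0001"): [],
    ("IntAct", "INT0001"): [],
}


def reachable(start, xrefs):
    """List every record reachable from start, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        record = queue.popleft()
        order.append(record)
        for target in xrefs.get(record, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


for database, identifier in reachable(("EMBL-Bank", "DNA0001"), XREFS):
    print(database, identifier)
```

A sparse table like this one shows the limitation raised on the slide: a record is only reachable if someone has recorded the link, and the link says nothing about which part of the record it refers to.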
Major efforts involved in integration
InterPro: database of protein families, domains
and functional sites.
Integr8: data integration project co-ordinated by
the EBI, to provide an integrated layer for the
exploitation of genomic and proteomic data.
GRID technologies
European Patent Office
Support the inclusion of sequence data in
the public databases
Development of tools to capture sequence
data
Run their searches at the EBI
(similar arrangements in USA and Japan
ensure exchange)
Analogous systems being developed for
structure information
Industry Support
Current successful Industry programme for
Pharma
Quarterly meetings
R&D Training - workshops
Industry Forum
Funded by subscriptions
New SME programme under development
New Data
Expression Data
ChIP-on-chip
Proteomic Data
Metabolome Data
Human Variation
Atlases
Disease Links
Electron tomographs
??
http://www.ebi.ac.uk/2can/
The Magic Search Box