Introduction to EMBL-EBI - European Bioinformatics Institute


EBI resources introductory course
Pablo Porras Millán
www.ebi.ac.uk
Schedule
8:30 - 9:30 Intro to EBI
9:30 - 10:00 Expectations assessment
10:00 - 11:30 Browsing the genome and exploring sequences: DNA & RNA services (Ensembl, Ensembl Genomes, ENA)
11:30 - 12:00 Break
12:00 - 12:30 Studying expression profiles: Gene expression services (ArrayExpress and Expression Atlas)
12:30 - 13:30 Understanding proteins: Resources for identification and annotation (GO, UniProt & InterPro)
13:30 - 14:30 Lunch
14:30 - 15:30 Proteomics and systems: From mass spectrometry data to models (PRIDE, IntAct, Reactome & BioModels)
15:30 - 16:00 Break
16:00 - 16:30 Small molecules bioinformatics (ChEMBL, ChEBI, MetaboLights)
16:30 - 17:00 Expectations re-assessments, Q&A
The EMBL-European Bioinformatics Institute
The hub for bioinformatics in Europe
What is EMBL-EBI?
• Part of the European Molecular
Biology Laboratory
• International, non-profit research
institute
• Europe’s hub for biological data,
services and research
The European Molecular Biology Laboratory
• Heidelberg: basic research, administration, EMBO
• Hamburg: structural biology
• Hinxton, Cambridge: bioinformatics
• Grenoble: structural biology
• Monterotondo, Rome: mouse biology
EMBL staff: 1500 people, >60 nationalities
EMBL-EBI’s mission
• Provide freely available data and bioinformatics services
to all facets of the scientific community in ways that
promote scientific progress
• Contribute to the advancement of biology through basic
investigator-driven research in bioinformatics
• Provide advanced bioinformatics training to scientists at
all levels, from PhD students to independent investigators
• Help disseminate cutting-edge technologies to industry
• Coordinate biological data provision throughout Europe
EMBL member states
Austria, Belgium, Croatia,
Denmark, Finland, France,
Germany, Greece, Iceland,
Ireland, Israel, Italy, Luxembourg,
the Netherlands, Norway,
Portugal, Spain, Sweden,
Switzerland and the United
Kingdom
Associate member state:
Australia
Services
Data and tools for molecular life science
www.ebi.ac.uk/services
What services do we provide?
Labs around the world send us their data and we…
• Archive it
• Analyse it
• Share it with other data providers
• Classify it
…provide tools to help researchers use it
A virtuous circle
Bioinformatics underpins research
Literature
Genomes
Protein sequence and
proteomes
Nucleotide sequence
Protein structure
Gene expression
Chemical entities
Protein families, domains and
motifs
Protein-protein
interactions
Systems
Pathways
Standards – international collaborations
Genomics Standards Consortium (GSC)
http://gensc.org
Genome annotation
www.geneontology.org
Nucleotide sequence
www.insdc.org
Functional Genomics Data Society
www.fged.org
Cheminformatics
www.ebi.ac.uk/chebi
Protein sequence
www.uniprot.org
HUPO Proteomics Standards Initiative (PSI)
www.psidev.info/
Protein structure
www.wwpdb.org
Pathways
www.reactome.org
www.biopax.org
Metabolomics Standards Initiative (MSI)
www.metabolomicssociety.org
Systems modelling standards
www.sbml.org
EMBL-EBI users: a one-day snapshot
Key facts about our services
• Freely available
• A comprehensive collection of molecular databases
• Globally coordinated data collection and dissemination
• Produced in collaboration with other world leaders:
• NCBI (US)
• National Institute of Genetics (Japan)
• SIB Swiss Institute of Bioinformatics (Switzerland)
• Wellcome Trust Sanger Institute (UK)
Data resources
• DNA & RNA: genes, genomes & variation
• Gene expression: RNA, protein & metabolite expression
• Proteins: sequences, families & motifs
• Structures: molecular & cellular structures
• Systems: reactions, interactions & pathways
• Chemical biology: chemogenomics & metabolomics
• Ontologies: taxonomies & controlled vocabularies
• Literature: scientific publications & patents
• Other software: cross-domain tools & resources
The EBI Search Service
Gene and protein summaries
Explore the data and
return easily to
your results
Data organised by:
• gene
• expression
• protein
• structure
• literature
Species selector
allows for easy
comparison
Bioinformatics tools
• Over 100 analysis tools
• Results enriched with data from EBI resources
• Nucleotide sequence search, e.g. BLAST nucleotide
• Protein sequence search, e.g. BLAST protein, PSI-Search
• Multiple sequence alignment, e.g. Clustal Omega, MUSCLE
• Pairwise sequence alignment, e.g. Needle
• Protein functional analysis, e.g. InterProScan
• Functional genomics tools, e.g. Expression Atlas
• Molecular structure analysis, e.g. PDBeFold
• Text mining, e.g. EBIMed, Whatizit
Programmatic access: EBI Web Services
• Run tasks on EBI servers, using EBI
data
• Ideal for large scale analyses, repetitive
tasks and internal pipelines
• Integration of EBI resources and data
• EBI Search, tools, data retrieval
• Same programs, data and results
enrichment as running via the web pages
• www.ebi.ac.uk/tools/webservices
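As a rough illustration of the programmatic route described on this slide, the Python sketch below builds a query URL for the EBI Search REST interface and can optionally fetch it. The base URL, the `uniprot` domain name, and the parameter names `query`, `format` and `size` are assumptions drawn from the public EBI Search documentation, not from these slides, so check them against the current documentation before relying on them. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL of the EBI Search REST interface (not stated in the slides).
EBI_SEARCH_BASE = "https://www.ebi.ac.uk/ebisearch/ws/rest"


def build_search_url(domain, query, size=5):
    """Return a URL asking EBI Search for up to `size` JSON hits in `domain`."""
    params = urllib.parse.urlencode(
        {"query": query, "format": "json", "size": size}
    )
    return f"{EBI_SEARCH_BASE}/{urllib.parse.quote(domain)}?{params}"


def fetch_entries(domain, query, size=5):
    """Run the search over HTTP and return the decoded JSON document."""
    url = build_search_url(domain, query, size)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Building the URL needs no network access; fetch_entries() does.
    print(build_search_url("uniprot", "BRCA2"))
```

Wrapping the call in a function like this is what makes the "repetitive tasks and internal pipelines" use case practical: the same request can be looped over many queries instead of being typed into a web form.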
Research
Data-driven discovery
PhD and postdoctoral programmes
www.ebi.ac.uk/research
Research themes
Proteins, structures & chemical biology
• Alex Bateman
• Gerard Kleywegt
• John Overington
• Christoph Steinbeck
• Sarah Teichmann
• Janet Thornton
Genes & gene expression
• Paul Bertone
• Ewan Birney
• Alvis Brazma
• Anton Enright
• Paul Flicek
• Nick Goldman
Systems biology
• Pedro Beltrao
• John Marioni
• Julio Saez-Rodriguez
• Oliver Stegle
Research leaders
Ewan Birney
Anton Enright
Paul Flicek
Alvis Brazma
Nick Goldman
Janet Thornton
Sarah Teichmann
Alex Bateman
Julio Saez-Rodriguez
Oliver Stegle
Christoph Steinbeck
John Marioni
Paul Bertone
Pedro Beltrao
Gerard Kleywegt
John Overington
Examples of EMBL-EBI research
• What is the molecular basis of ageing?
• What makes a stem cell decide to become skin or muscle?
• Which of these proteins will make good targets for drugs?
• How do the neurons of someone with Parkinson’s disease signal differently from healthy neurons?
• Which of these changes to a genome’s structure drive cancer?
PhDs and Postdocs
• EMBL International PhD programme:
www.embl.de/training/eipp
• Postdoctoral positions available from: www.ebi.ac.uk/jobs
• Postdoctoral fellowships:
• EIPOD EMBL sponsored: interdisciplinary
• ESPOD EBI–Sanger: combined experimental/computational
User training
For scientists working at all levels
www.ebi.ac.uk/training
Bioinformatics training
Train at EMBL-EBI
Gain hands-on experience in our state-of-the-art facilities.
Train at your place
Choose the training that’s right for you and your colleagues - and our experts will come to you.
Train online
Learn in your own time, at your own pace with our freely available online courses.
www.ebi.ac.uk/training
Train online
• Free online courses
• Learn in your own time,
at your own pace
• Created for life-science
researchers
• No previous knowledge
of bioinformatics
needed
www.ebi.ac.uk/training/online
Interactions with industry
Support and collaboration
www.ebi.ac.uk/industry
The EMBL-EBI Industry Programme
• Helping industry make the most of
innovations in bioinformatics
• Neutral ground for members to explore
developments and concepts
• Pre-competitive collaboration
• Standards development
• Technical development
• Input into services development
“The Programme’s regular
meetings foster inter-company
interactions as we collaborate
on special projects and liaise on
other industry initiatives.”
- Bertram
Industry Programme members
• Astellas Pharma Inc.
• AstraZeneca
• Bayer Pharma AG
• Boehringer Ingelheim
• Bristol-Myers Squibb
• Eli Lilly and Company
• F. Hoffmann-La Roche
• GlaxoSmithKline
• Johnson & Johnson Pharmaceutical R&D
• Merck Serono S.A.
• Nestlé Institute of Health Sciences
• Novartis Pharma AG
• Novo Nordisk
• Syngenta
• Sanofi-Aventis Recherche & Développement
• UCB
• Unilever
With thanks to our funders
• EMBL member states
• The European Commission
• The Wellcome Trust
• Research Councils UK
• US National Institutes of Health
Supported by the European Community's Seventh
Framework Programme (FP7/2007-2013) under grant
agreement for Affinomics (FP7-241481).
A brief introduction to standards and data integration
Lazebnik, Biochemistry (Mosc). 2004, PMID: 15627398
[Figure: "The biologist’s model", with parts labelled "Undoubtedly Most Important Component", "Most Important Component", "Serendipitously Recovered Component" and "Really Important Component", shown alongside "A model that reflects reality".]
Standards
Images from:
http://archive.nrc-cnrc.gc.ca/eng/projects/inms/mass.html
http://commons.wikimedia.org/wiki/File:Ce-logo.jpg
http://www.nmpdr.org/FIG/wiki/view.cgi/FIG/FastaFormat
Standards in bioinformatics
• Common identifiers
• Controlled vocabularies /
ontologies
• Common formats
• Common schemas
• Minimum information guidelines
• Common query interfaces
[Diagram labels: controlled vocabulary, schema, format, reporting guideline, data distribution, identifiers.]
The problem of data integration
[Diagram with two panels, "Reality" and "Ideally", showing how the user reaches databases (DB) through interfaces (I).]
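[Figure credited to Tim Hubbard: "Utility of Bioinformatics", relating scientific impact to the state of bioinformatics provision, with the labels "Too little bioinformatics", "Too many databases" and "Too diverse interfaces".]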
Data integration
Combining data residing in different sources … providing users with a unified view of these data.
[Diagram with three panels, "Ideally", "Compromise" and "Reality", showing how the user reaches databases (DB) through interfaces (I).]
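One way to picture a "unified view" is a merge keyed on a shared identifier. The Python sketch below is purely illustrative: the two in-memory "sources", their field names, and the accessions `ACC1` to `ACC3` are hypothetical placeholders and do not reflect any real EBI schema or data.

```python
from collections import defaultdict

# Two hypothetical sources describing entries under a common accession.
sequence_db = [
    {"accession": "ACC1", "name": "protein one", "length": 120},
    {"accession": "ACC2", "name": "protein two", "length": 340},
]
interaction_db = [
    {"accession": "ACC1", "partner": "ACC2"},
    {"accession": "ACC3", "partner": "ACC1"},
]


def unified_view(*sources, key="accession"):
    """Merge records from several sources into one dict per shared identifier.

    Later sources overwrite earlier ones when field names clash.
    """
    merged = defaultdict(dict)
    for source in sources:
        for record in source:
            merged[record[key]].update(record)
    return dict(merged)


view = unified_view(sequence_db, interaction_db)
```

The merge works only because both sources agree on the `accession` key, which is why common identifiers come first in any list of bioinformatics standards.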
The danger with standards…
From xkcd: http://xkcd.com/927/
Collaboration among data providers
• More data coverage
• Less redundancy
• Less inconsistency
• Better data management
Access, exchange, sharing, portability, interoperability, annotation, comparison, verification, representation, integration, reusability.
Nucleotide sequences: INSDC (EMBL, NCBI, DDBJ)
Molecular interactions: IMEx (IntAct, MINT, DIP, InnateDB, …)
Protein identifications: ProteomeXchange (PRIDE, PeptideAtlas, GPMDB, Tranche, …)
Example of community development of standards: PSI-MI
http://www.psidev.info/MI
• Work group of the Proteomics Standards Initiative
• Community coordination to ensure deposition of
Molecular Interaction data in public repositories
• Concentrating on …
• Annotation and representation of published MI data
• Accessibility of MI data to the user community
Controlled vocabulary: PSI-MI CV
Data format/schema: PSI-MI XML, PSI-MITAB
Reporting guideline: MIAPE, MIMIx, IMEx
Data distribution: PSICQUIC
Scoring: PSISCORE
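To make the idea of a common format concrete, here is a minimal Python sketch that reads one row of PSI-MITAB. It assumes the 15-column tab-separated MITAB 2.5 layout, with "-" for an empty field and "|" between multiple values; the column names are my own shorthand, and the example record is made up for illustration (the identifiers are placeholders, not real entries).

```python
# Shorthand names for the 15 columns of the assumed MITAB 2.5 layout
# (later MITAB versions append further columns, which are ignored here).
MITAB25_COLUMNS = [
    "id_a", "id_b", "alt_ids_a", "alt_ids_b", "aliases_a", "aliases_b",
    "detection_method", "first_author", "publication", "taxid_a", "taxid_b",
    "interaction_type", "source_db", "interaction_id", "confidence",
]


def parse_mitab_line(line):
    """Split one MITAB 2.5 row into a dict mapping column name to a list.

    Naive sketch: it splits on every '|', so it would mis-handle a quoted
    value that itself contains a '|'.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < len(MITAB25_COLUMNS):
        raise ValueError(f"expected at least 15 columns, got {len(fields)}")
    return {
        name: [] if value == "-" else value.split("|")
        for name, value in zip(MITAB25_COLUMNS, fields)
    }


# Made-up record for illustration only; the identifiers are placeholders.
example = "\t".join([
    "uniprotkb:PLACEHOLDER_A", "uniprotkb:PLACEHOLDER_B", "-", "-", "-", "-",
    'psi-mi:"MI:0018"(two hybrid)', "Doe et al. (2000)", "pubmed:0000000",
    "taxid:9606(human)", "taxid:9606(human)",
    'psi-mi:"MI:0915"(physical association)', 'psi-mi:"MI:0469"(IntAct)',
    "intact:EBI-0000000", "-",
])
record = parse_mitab_line(example)
```

Because every provider that follows the format emits the same columns in the same order, a single parser like this can in principle read interaction data from any of them, which is what a shared distribution service such as PSICQUIC builds on.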
Thank you!
www.ebi.ac.uk
Twitter: @emblebi
Facebook: EMBLEBI
YouTube: EMBLMedia