ebi_vickyschneider_part2_bioquest2011

Learning and exploring Life science
through the EBI resources and tools
BIOQUEST workshop_2011
Vicky Schneider,
EMBL-EBI Training Programme Project leader
[email protected]
Services
www.ebi.ac.uk/services
Principles of service provision
@ Patrick Hoesly
Accessibility
Compatibility
Portability
Comprehensive
Quality
Databases: molecules to systems
Genomes: Ensembl, Ensembl Genomes, EGA
Nucleotide sequence: ENA
Functional genomics: ArrayExpress, Expression Atlas
Literature and ontologies: CiteXplore, GO
Protein families, motifs and domains: InterPro
Macromolecular: PDBe
Protein activity: IntAct, PRIDE
Pathways: Reactome
Protein Sequences: UniProt
Chemical entities: ChEBI
Chemogenomics: ChEMBL
Systems: BioModels, BioSamples
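The category-to-resource pairing on this slide can be written down as a simple lookup table. The following is a minimal sketch in Python, assuming the grouping reads as shown on the slide; the names `EBI_DATABASES` and `find_resources` are hypothetical and are not part of any EBI tool.

```python
# Category -> resources, copied from the slide (the grouping is an assumption
# about how the flattened slide text should be read).
EBI_DATABASES = {
    "Genomes": ["Ensembl", "Ensembl Genomes", "EGA"],
    "Nucleotide sequence": ["ENA"],
    "Functional genomics": ["ArrayExpress", "Expression Atlas"],
    "Literature and ontologies": ["CiteXplore", "GO"],
    "Protein families, motifs and domains": ["InterPro"],
    "Macromolecular": ["PDBe"],
    "Protein activity": ["IntAct", "PRIDE"],
    "Pathways": ["Reactome"],
    "Protein Sequences": ["UniProt"],
    "Chemical entities": ["ChEBI"],
    "Chemogenomics": ["ChEMBL"],
    "Systems": ["BioModels", "BioSamples"],
}


def find_resources(keyword):
    """Return resources whose category name contains the keyword (case-insensitive)."""
    keyword = keyword.lower()
    found = []
    for category, resources in EBI_DATABASES.items():
        if keyword in category.lower():
            found.extend(resources)
    return found


print(find_resources("protein"))
```

A lookup like this only answers "which resource covers this kind of data?"; it says nothing about how the resources link to each other.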
Database collaborations
Standards development – international collaborations
Genomics Standards Consortium (GSC): http://gensc.org
Genome annotation: www.geneontology.org
Protein sequence: www.uniprot.org
Nucleotide sequence: www.insdc.org
Functional Genomics Data Society: www.fged.org
Cheminformatics: www.ebi.ac.uk/chebi
HUPO Proteomics Standards Initiative (PSI): www.psidev.info/
Pathways: www.reactome.org, www.biopax.org
Metabolomics Standards Initiative (MSI): www.metabolomicssociety.org
Protein structure: www.wwpdb.org
Systems modelling standards: www.sbml.org
New search service
Access from the EBI’s homepage
Species selector allows for easy comparison
Data organised according to:
• gene
• expression
• protein
• structure
• literature
Explore data, return easily to your results
Goals of the new EBI Search
• Relevant to ‘wet-lab’ biologists
• Organises information based around a single gene
(or a small number of genes)
• User-expectation centric (not database centric)
• Smooth transition to the detailed information in
many of EBI’s core databases
• NOT for bioinformaticians:
does not provide programmatic access
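The "user-expectation centric (not database centric)" goal can be illustrated by grouping results under the kind of information a biologist is looking for rather than under the database they came from. This is a conceptual sketch only, not a client for the EBI Search: the mapping `CATEGORY_OF`, the function `group_by_category` and the record identifiers are all hypothetical.

```python
from collections import defaultdict

# Assumed mapping from source database to the result categories named on the
# search slide (gene, expression, protein, structure, literature).
CATEGORY_OF = {
    "Ensembl": "gene",
    "Expression Atlas": "expression",
    "UniProt": "protein",
    "PDBe": "structure",
    "CiteXplore": "literature",
}


def group_by_category(hits):
    """Group (database, record_id) hits by category instead of by database."""
    grouped = defaultdict(list)
    for database, record_id in hits:
        grouped[CATEGORY_OF.get(database, "other")].append(record_id)
    return dict(grouped)


# Placeholder identifiers, not real accessions.
hits = [
    ("UniProt", "protein-1"),
    ("Ensembl", "gene-1"),
    ("UniProt", "protein-2"),
    ("PDBe", "structure-1"),
]
print(group_by_category(hits))
```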
Quick databases tour
Genomes 1: Ensembl
Chromosomes
Genes
Genomic alignments
Pick a genome
Synteny
Variations
Variation Effect Predictor
Gene trees
Gene families
User Upload
Genomes 2: Ensembl Genomes
Genome portals for the five kingdoms of life
Interface uses Ensembl technology
Variation data for plant, metazoan and fungal species
Multi-way comparison of whole bacterial chromosomes
Pan-taxonomic comparative analysis
Nucleotides: European Nucleotide Archive
(ENA)
The ENA has a three-tiered data
architecture.
It consolidates information from
EMBL-Bank, the European Trace
Archive (containing raw data from
electrophoresis-based sequencing
machines) and the Sequence Read
Archive (containing raw data from
next-generation sequencing
platforms).
Figure adapted from: Cochrane, G. et al. Public Data
Resources as the Foundation for a Worldwide
Metagenomics Data Infrastructure. In: Metagenomics:
Theory, Methods and Applications (Chapter 5), Caister
Academic Press, Universidad Nacional de Cordoba,
Argentina. Ed. D. Marco (2010).
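The split between the two raw-data archives described above can be expressed as a small lookup. This is a minimal sketch in Python based only on the slide text; `Platform`, `ARCHIVE_FOR_RAW_DATA` and `archive_for` are hypothetical names and do not correspond to any ENA interface.

```python
from enum import Enum


class Platform(Enum):
    """Kinds of sequencing machine mentioned on the slide."""
    ELECTROPHORESIS = "electrophoresis-based sequencing"
    NEXT_GENERATION = "next-generation sequencing"


# Which ENA component holds raw data from which kind of machine,
# as stated on the slide.
ARCHIVE_FOR_RAW_DATA = {
    Platform.ELECTROPHORESIS: "European Trace Archive",
    Platform.NEXT_GENERATION: "Sequence Read Archive",
}


def archive_for(platform):
    """Return the ENA component that stores raw data from this platform."""
    return ARCHIVE_FOR_RAW_DATA[platform]


print(archive_for(Platform.NEXT_GENERATION))
```

EMBL-Bank, the third component named on the slide, is left out of the lookup because the slide does not describe it as a raw-data store.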
Transcriptomes: ArrayExpress
Expand results
ArrayExpress Archive: browse experiments
Search by keyword
Spreadsheets describing the sample properties
Transcriptomes: Gene Expression Atlas
Atlas: browse changes in gene expression
Gene page
Experiment page
Search by gene or biological condition
Some data sources for annotation
Input sources for UniProtKB
GO: Functional info
PRIDE: Protein identification data
InterPro: Protein families and domains
IntAct: Molecular interactions
IntEnz: Enzymes
HAMAP: Microbial protein families
RESID: Post-translational modifications
• Manual curation
• Literature-based annotation
• Sequence analysis
InterPro classification
Signal prediction
UniProt
• Automated annotation
Transmembrane prediction
Other predictions
Protein classification
Protein families, motifs and domains: InterPro
Powerful tool for protein
classification, integrating several
methods into one resource
Compare methods of protein
signature prediction
Visualise the taxonomic range
for a protein signature
View architectures of proteins
containing a signature
Proteomics services
PRIDE: protein identifications
from proteomics experiments
IntAct: molecular interactions
ChEBI: small molecules
IntEnz: enzyme classification
Structures: PDBe
Chemical entities: ChEBI
Download flat files, database dumps and the ChEBI Ontology for local installation
View relationships in the ChEBI Ontology
Link to other databases
View mappings to other databases
View structure, nomenclature, formula and more
Chemogenomics: ChEMBL
ChEMBL database
Neglected Tropical Disease (NTD) archive
ChEMBL
Browse targets
Target search
Kinase SARfari
Search results
Compound search
GPCR SARfari
Pathways: Reactome
Compare events in different species
View expression values overlaid on a pathway
Link to source databases
Interaction overlay on a pathway diagram
Export pathway to your favourite modelling software
User support
• E-mail support – www.ebi.ac.uk/support
• Online help pages – www.ebi.ac.uk/help
• 2Can bioinformatics user support – www.ebi.ac.uk/2Can
• eLearning Portal – coming soon ([email protected])