FunCat Manchester
FunCat™, a controlled vocabulary encompassing the biology of prokaryotes, plants and animals from cellular to systemic level
Dr. Dieter Maier
Manchester Ontologies Workshop 23/24.3.02
Biomax Informatics AG, Lochhamer Str. 11, 82152 Martinsried, Germany
Biomax Informatics AG
Bioinformatics designed with you in mind.
Outline
• Objectives
• Structure
• Content
• Development
• Use
Objectives
• Automatic data management
• No prior knowledge of vocabulary required
• Group genes by functional categories
• Extensible
• Organism independent
• Compatible with other ontologies
Disclaimer
What the FunCat is not:
- A tool for the complete description of functions at the single-gene level
Structure
• Organized hierarchically
• Related functions grouped on different levels
• Internally consistent
=> Provides a data warehouse
- overview of the available selection
- progress from general to specific
- infer from specific to general
Hierarchical structure
Transcription
  rRNA-transcription
    rRNA-processing
  tRNA-transcription
  mRNA-transcription
    mRNA-processing
      5´-end processing
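A minimal sketch of how such a hierarchy supports moving from general to specific and inferring from specific to general. The parent links below are an assumption read off the category names on this slide; the dictionary and function names are hypothetical and not part of FunCat itself.

```python
# Child -> parent links for the example hierarchy.
# Assumption: the nesting is inferred from the category names on the slide.
PARENT = {
    "rRNA-transcription": "Transcription",
    "rRNA-processing": "rRNA-transcription",
    "tRNA-transcription": "Transcription",
    "mRNA-transcription": "Transcription",
    "mRNA-processing": "mRNA-transcription",
    "5´-end processing": "mRNA-processing",
}


def ancestors(category):
    """Infer from specific to general: list the enclosing categories, nearest first."""
    chain = []
    while category in PARENT:
        category = PARENT[category]
        chain.append(category)
    return chain


def descendants(category):
    """Progress from general to specific: list every category below the given one."""
    found = []
    for child, parent in PARENT.items():
        if parent == category:
            found.append(child)
            found.extend(descendants(child))
    return found


print(ancestors("5´-end processing"))
print(descendants("mRNA-transcription"))
```

A gene annotated with a specific category is implicitly a member of every category returned by `ancestors`, which is what makes grouping on different levels possible.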
Content
• Covers cellular processes, systemic physiology, development and anatomy from procaryotes to humans
• 25 main categories with ~1500 subcategories
• Categories are independent of organism
• Genes can belong to multiple categories
Biological process: 1061
- Metabolism: 247
- Energy: 60
- Cell cycle and DNA processing: 54
- Transcription: 31
- Protein synthesis (Translation): 11
- Protein fate (folding, modification, destination): 25
- Cellular transport: 32
- Cellular communication: 47
- Cell rescue, defense and virulence: 50
- Regulation / interaction with cellular environment: 45
- Cell fate: 54
- Systemic regulation / interaction with environment: 89
- Development (systemic): 51
- Transposable elements, viral and plasmid proteins: 8
- Control of cellular organisation: 57
- Cell type differentiation: 69
- Tissue differentiation: 40
- Organ differentiation: 91
Localisation: 256
- Subcellular localisation: 63
- Cell type localisation: 69
- Tissue localisation: 41
- Organ localisation: 91
Molecular function: 122
- Enzymatic activity => EC ~ 4400
- Protein activity regulation: 23
- Protein with binding function / cofactor requirement: 49
- Transport facilitation: 49
Development
• Historical
• Pathways
• Thesaurus
• Complex relations
Structural development
• Proven flexibility – easy to extend
• Stable overall structure
• Compatible with other ontologies like
- Enzyme Catalogue
- Gene Ontology
- EcoCyc
Development in numbers
                   S. cerevisiae   Plant (A. thaliana)   Animals (Human)
                                   and Procaryotes
Year:              1996            1998                  2001
Main categories:   16              20                    25
Depth:             4               6                     6
Total:             182             528                   1448
Integrating pathways into processes
- Hierarchical structure allows:
  - Univocal attribution
  - Test for completeness
  - Test for consistency
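The three checks can be illustrated with a small sketch. This is one possible reading of the terms, not the procedure used for FunCat: "univocal" is taken to mean that each pathway step is attached to exactly one category, "complete" that no step is left unattached, and "consistent" that every category used exists in the hierarchy. The function name, step names and data layout are hypothetical.

```python
from collections import Counter


def check_pathway_mapping(pathway_steps, assignments, known_categories):
    """Check a pathway-to-category mapping.

    pathway_steps:    list of step names in the pathway
    assignments:      list of (step, category) pairs
    known_categories: set of category names present in the hierarchy
    """
    per_step = Counter(step for step, _ in assignments)
    return {
        # Univocal attribution: no step is attached to more than one category.
        "ambiguous": sorted(s for s, n in per_step.items() if n > 1),
        # Completeness: every step of the pathway is attached somewhere.
        "missing": sorted(s for s in pathway_steps if s not in per_step),
        # Consistency: every category used actually exists in the hierarchy.
        "unknown_categories": sorted(
            {c for _, c in assignments if c not in known_categories}
        ),
    }


# Hypothetical toy pathway with three steps.
report = check_pathway_mapping(
    pathway_steps=["step 1", "step 2", "step 3"],
    assignments=[
        ("step 1", "Metabolism"),
        ("step 2", "Metabolism"),
        ("step 2", "Energy"),
    ],
    known_categories={"Metabolism", "Energy"},
)
print(report)
```

In this toy input, "step 2" is reported as ambiguous and "step 3" as missing, while no unknown categories are found.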
Integrating additional information
• Create a dynamic ontology from existing ontologies, keywords and linguistic extraction of descriptors from the literature
• Semiautomatic mapping of the dynamic ontology to FunCat
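A rough sketch of what a semiautomatic mapping step could look like: a program proposes the closest FunCat category for each extracted term, and a curator confirms or rejects the proposal. Plain string similarity from Python's standard `difflib` module stands in here for whatever matching the real system uses; the category subset, the cutoff and the function name are assumptions made for illustration.

```python
import difflib

# A few main categories taken from the Content slide (subset for illustration).
FUNCAT_CATEGORIES = [
    "Metabolism",
    "Energy",
    "Transcription",
    "Protein synthesis (Translation)",
    "Cellular transport",
    "Cellular communication",
]


def suggest_mapping(term, categories=FUNCAT_CATEGORIES, cutoff=0.6):
    """Propose the closest category for a term, or None if nothing is similar enough.

    Returns (category, score); the score lets a curator review weak matches first.
    """
    lowered = {c.lower(): c for c in categories}
    matches = difflib.get_close_matches(term.lower(), list(lowered), n=1, cutoff=cutoff)
    if not matches:
        return None, 0.0
    score = difflib.SequenceMatcher(None, matches[0], term.lower()).ratio()
    return lowered[matches[0]], score


# Hypothetical descriptors extracted from keywords or literature.
for term in ["transcription", "cell transport", "qqq"]:
    print(term, "->", suggest_mapping(term))
```

Terms for which no category is proposed, or whose score is low, are the ones left for manual mapping.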
Enabling complex relations
• Intensify multidimensionality
• Enable if ... then ... relations
Use
• Manual annotation
• Automatic annotation
• Data mining
Manual annotation
- multidimensional
- stepwise
Four dimensions
Manual annotation
• 17 manually annotated genomes (5 eucaryotes, 12 procaryotes)
• H. sapiens, A. thaliana, S. cerevisiae, N. crassa, proprietary: A. niger
• B. subtilis, T. acidophilum, Listeria, 6 public procaryotes in progress, proprietary: C. glutamicum, C. pneumoniae, 1 undisclosed
• Used for annotation of transcriptomes
Automatic Annotation
Sequence similarity to manually annotated proteins (distinguishing experimentally verified from similarity-associated function):
- H. sapiens
- A. thaliana
- S. cerevisiae
- B. subtilis
- T. acidophilum
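The idea of similarity-based annotation transfer can be sketched as follows. This is an illustration under stated assumptions, not the actual pipeline: the hit list format, the 40 % identity threshold and all identifiers are hypothetical. The point shown is that every transferred category is marked as similarity-derived, so that it stays distinguishable from experimentally verified function.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    category: str
    evidence: str  # "experimental" or "similarity"


def transfer_annotations(hits, reference, min_identity=40.0):
    """Copy categories from manually annotated proteins to similar query proteins.

    hits:      list of (query_id, reference_id, percent_identity) tuples
    reference: dict reference_id -> list of Annotation
    The threshold is an arbitrary illustrative value.
    """
    transferred = {}
    for query_id, ref_id, identity in hits:
        if identity < min_identity or ref_id not in reference:
            continue
        for ann in reference[ref_id]:
            # Whatever the evidence on the reference protein, the query
            # protein only receives the category by similarity.
            transferred.setdefault(query_id, []).append(
                Annotation(ann.category, "similarity")
            )
    return transferred


# Hypothetical reference annotations and similarity hits.
reference = {
    "refA": [Annotation("Metabolism", "experimental")],
    "refB": [Annotation("Energy", "similarity")],
}
hits = [("q1", "refA", 85.0), ("q2", "refB", 30.0), ("q3", "refX", 90.0)]
result = transfer_annotations(hits, reference)
print(result)
```

Here only `q1` receives an annotation: `q2` falls below the threshold and `q3` hits a protein without manual annotation.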
PEDANT Genome Database
Currently more than 170 genomes (600 000 ORFs)
[Figure: phylogenetic tree with three domains]
Bacteria: Green non-sulfur bacteria, Gram positives, Proteobacteria, Cyanobacteria, Flavobacteria, Thermotogales
Archaea: Methanosarcina, Methanobacterium, Extreme halophiles, Methanococcus, Thermoproteus, Pyrodictium
Eucarya: Entamoeba, Slime molds, Animals, Fungi, Plants, Ciliates, Flagellates, Trichomonades, Microsporida, Diplomonades
Data mining
• Retrieval
• Visualisation
• Mining
• Integration
Queries using the FunCat:
Group level
- Looking for groups of genes:
Single molecule level
- Retrieving protein entries:
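Both query levels can be sketched on a toy annotation table. The gene identifiers and the data layout (each gene carries one or more category paths) are hypothetical; the category names come from earlier slides, and the nesting of the transcription categories is an assumption based on their names.

```python
# Gene -> list of category paths, from most general to most specific.
# Genes can belong to multiple categories.
ANNOTATIONS = {
    "geneA": [("Transcription", "mRNA-transcription", "mRNA-processing")],
    "geneB": [("Transcription", "tRNA-transcription"), ("Metabolism",)],
    "geneC": [("Energy",)],
}


def genes_in_category(path):
    """Group level: all genes annotated at or below the given category path."""
    path = tuple(path)
    return sorted(
        gene
        for gene, paths in ANNOTATIONS.items()
        if any(p[: len(path)] == path for p in paths)
    )


def categories_of(gene):
    """Single molecule level: all category paths recorded for one entry."""
    return [" > ".join(p) for p in ANNOTATIONS.get(gene, [])]


print(genes_in_category(("Transcription",)))
print(categories_of("geneB"))
```

A query for a general category also returns genes annotated only with one of its more specific subcategories, because the match is on the leading part of the path.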
The human FunCat
[Figure: the human FunCat, with the categories Cell cycle, Transcription, Translation, Energy, Metabolism, Protein fate, Intracellular transport, Signalling, Cell physiology, Defense and Unclassified]
Comparing genomes
Sequence similarity: "functional homology"
Identification of organism specific functions
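Because the categories are independent of organism, two annotated genomes can be compared category by category. The sketch below uses invented toy data and hypothetical function names: it computes the percentage of genes per category and lists the categories that occur in one genome but not in the other.

```python
from collections import Counter


def category_shares(gene_categories):
    """Map each category to the percentage of genes annotated with it.

    gene_categories: dict gene_id -> set of category names
    """
    total = len(gene_categories)
    if total == 0:
        return {}
    counts = Counter(cat for cats in gene_categories.values() for cat in cats)
    return {cat: 100.0 * n / total for cat, n in counts.items()}


def specific_categories(genome_a, genome_b):
    """Return the categories used in genome_a but absent from genome_b."""
    cats_a = set().union(*genome_a.values())
    cats_b = set().union(*genome_b.values())
    return sorted(cats_a - cats_b)


# Hypothetical toy annotations, not real genome data.
human = {"h1": {"Metabolism"}, "h2": {"Organ differentiation", "Transcription"}}
bacillus = {"b1": {"Metabolism"}, "b2": {"Transcription"}}

print(category_shares(human))
print(specific_categories(human, bacillus))
```

Since a gene can belong to several categories, the percentages of one genome do not have to add up to 100.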
Comparing H.sapiens – B.subtilis
[Figure: bar chart comparing H. sapiens and B. subtilis; vertical axis from 0 to 30]
Integrative analysis
[Figure labels: Protein-protein interaction data; Protein expression data; Gene expression data; Functional catalogue]
Topological clustering (SOM)
Distribution of the genes
Limitations
Co-expression is no proof of functional association.
Integrate evidence from multiple sources.
Integration with annotation
Analyse gene expression data using integration with annotation catalogues:
- Functional catalogue
- Phenotypes
- Interaction
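One common way to combine expression clusters with a functional catalogue is an over-representation test: for each category, ask whether a cluster contains more of its genes than expected by chance. The slides do not say which statistic was used, so the hypergeometric test below is a swapped-in, generic technique; the function names and the toy data are hypothetical.

```python
from collections import Counter
from math import comb


def enrichment_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K belong to the category."""
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def enriched_categories(cluster, annotations):
    """Rank the categories found in a cluster by hypergeometric p-value.

    cluster:     iterable of gene ids (e.g. one expression cluster)
    annotations: dict gene_id -> set of categories for all annotated genes
    """
    N = len(annotations)
    cluster = [g for g in cluster if g in annotations]
    n = len(cluster)
    background = Counter(c for cats in annotations.values() for c in cats)
    in_cluster = Counter(c for g in cluster for c in annotations[g])
    results = {
        c: enrichment_pvalue(k, n, background[c], N) for c, k in in_cluster.items()
    }
    return sorted(results.items(), key=lambda item: item[1])


# Hypothetical toy data: ten genes, five per category.
annotations = {f"g{i}": {"Energy"} for i in range(5)}
annotations.update({f"g{i}": {"Metabolism"} for i in range(5, 10)})

print(enriched_categories(["g0", "g1"], annotations))
```

A low p-value only indicates that a category is over-represented in the cluster; it should be weighed together with other evidence.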
Functional projection
Looking at the gene lists
FunCat
Tool to structure information
Tool to connect information
Thank you!