From_GO_to_OBOFoundr.. - Buffalo Ontology Site

What is an ontology and
Why should you care?
Barry Smith
http://ontology.buffalo.edu/smith
with thanks to
Jane Lomax, Gene Ontology Consortium
1
You’re interested in which genes control heart muscle development
17,536 results
2
Microarray data shows changed expression of thousands of genes.
How will you spot the patterns?
[Figure: clustered microarray expression data. Axis: time. Samples: attacked, control. Tree: Pearson. Gene list: all genes (14010). Cluster labels: Defense response; Immune response; Response to stimulus; Toll regulated genes; JAK-STAT regulated genes; Puparial adhesion; Molting cycle; hemocyanin; Amino acid catabolism; Lipid metabolism; Peptidase activity; Protein catabolism]
3
Ontologies provide a way to capture
and represent all this knowledge in a
computable form
4
Uses of ‘ontology’ in PubMed abstracts
5
By far the most successful: The Gene Ontology
6
7
Definitions
8
Gene products involved in cardiac muscle development in humans
9
Term Search Results
10
Hierarchical view representing
relations between represented types
11
How GO can be used to help analyse
microarray data
• Treat samples
• Collect mRNA
• Label
• Hybridize
• Scan
• Normalize
• Select differentially regulated genes
• Understand the biological phenomena involved
12
Traditional analysis operates via
literature search for each successive gene
Gene 1
Apoptosis
Cell-cell signaling
Protein phosphorylation
Mitosis
…
Gene 3
Growth control
Gene 4
Mitosis
Nervous system
Oncogenesis
Pregnancy
Protein phosphorylation
Oncogenesis
…
Mitosis
…
Gene 2
Growth control
Mitosis
Oncogenesis
Protein phosphorylation
…
Gene 100
Positive control of cell proliferation
Mitosis
Oncogenesis
Glucose transport
…
13
But by using GO annotations,
this work has already been done
GO:0006915 : apoptosis
14
GO allows grouping by process
Apoptosis
Gene 1
Gene 53
Positive control of cell proliferation
Gene 7
Gene 3
Gene 12
…
Mitosis
Gene 2
Gene 5
Gene 45
Gene 7
Gene 35
…
Glucose transport
Gene 7
Gene 3
Gene 6
…
Growth
Gene 5
Gene 2
Gene 6
…
Allows us to ask meaningful questions of
microarray data, e.g. which genes are involved
in the same process, with same/different
expression patterns?
15
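The grouping by process shown on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene names and process labels are toy placeholders in the style of the slide, not real GO annotation data, and `group_by_process` is a hypothetical helper name.

```python
from collections import defaultdict

# Toy gene -> process annotations (illustrative placeholders, not real GO data).
annotations = {
    "Gene 1": ["Apoptosis", "Mitosis"],
    "Gene 2": ["Mitosis", "Growth"],
    "Gene 5": ["Mitosis", "Growth"],
    "Gene 7": ["Mitosis", "Glucose transport"],
}


def group_by_process(gene_to_processes):
    """Invert a gene -> processes mapping into process -> sorted list of genes."""
    grouped = defaultdict(list)
    for gene, processes in gene_to_processes.items():
        for process in processes:
            grouped[process].append(gene)
    return {process: sorted(genes) for process, genes in grouped.items()}


by_process = group_by_process(annotations)
for process in sorted(by_process):
    print(process, "->", ", ".join(by_process[process]))
```

Once genes are indexed by process, questions such as "which genes share a process?" become simple dictionary lookups rather than a literature search per gene.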
How does the
Gene Ontology work?
16
1. It provides a controlled
vocabulary
contributing to the cumulativity of
scientific results achieved by distinct
research communities
(if we all use kilograms, meters,
seconds … , our results are calibrated)
17
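One way to picture what a controlled vocabulary buys: different groups may describe the same process in different words, and mapping every variant to one shared identifier makes their results comparable. The sketch below is a minimal illustration in Python. Only the identifier GO:0006915 (apoptosis) comes from these slides; the label variants and the names `LABEL_TO_TERM` and `normalize` are invented for the example.

```python
# Hypothetical free-text labels mapped to one controlled term ID.
# Only GO:0006915 (apoptosis) appears in the slides; the variants are invented.
LABEL_TO_TERM = {
    "apoptosis": "GO:0006915",
    "apoptotic cell death": "GO:0006915",
    "type i programmed cell death": "GO:0006915",
}


def normalize(label):
    """Return the controlled term ID for a free-text label, or None if unknown."""
    return LABEL_TO_TERM.get(label.strip().lower())


lab_a = normalize("Apoptosis")
lab_b = normalize("  apoptotic cell death ")
print(lab_a == lab_b)  # both labs end up reporting the same term
```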
2. It provides a tool for
algorithmic reasoning
18
Hierarchical view representing
relations between represented types
19
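A hierarchy of types is what makes algorithmic reasoning possible: a gene annotated to a specific type can be retrieved by a query for any more general type. The Python sketch below illustrates this with a toy is_a hierarchy. The three type labels echo labels from the microarray slide, but the hierarchy is deliberately simplified, and the gene names and the helper names `ancestors` and `genes_for` are hypothetical.

```python
# Toy is_a hierarchy (child -> parents); simplified and illustrative only.
IS_A = {
    "Immune response": ["Response to stimulus"],
    "Defense response": ["Response to stimulus"],
    "Response to stimulus": [],
}

# Toy direct annotations (gene names are hypothetical).
DIRECT = {
    "Gene A": {"Immune response"},
    "Gene B": {"Defense response"},
    "Gene C": {"Response to stimulus"},
}


def ancestors(term, is_a):
    """Return every type reachable from term by following is_a links."""
    seen = set()
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, []))
    return seen


def genes_for(term, direct, is_a):
    """Genes annotated to term directly or to any of its subtypes."""
    return sorted(
        gene
        for gene, terms in direct.items()
        if any(t == term or term in ancestors(t, is_a) for t in terms)
    )


print(genes_for("Response to stimulus", DIRECT, IS_A))
```

A query for the general type returns all three toy genes, while a query for "Immune response" returns only the gene annotated to it.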
The massive quantities of
annotations to gene products
in terms of the GO allow a
new kind of research
20
Uses of GO in studies of
• pathways associated with heart failure development
correlated with cardiac remodeling (PMID 18780759)
• sex-specific pathways in early cardiac response to
pressure overload in mice (PMID 18665344)
• molecular signature of cardiomyocyte clusters derived
from human embryonic stem cells (PMID 18436862)
• contrast between cardiac left ventricle and diaphragm
muscle in expression of genes involved in
carbohydrate and lipid metabolism (PMID 18207466)
• immune system involvement in abdominal aortic
aneurysms in humans (PMID 17634102)
• …
21
But GO covers only three
sorts of biological entities
– cellular components
– molecular functions
– biological processes
and does not provide representations
of disease-related phenomena
22
How can we extend the GO to
help integrate complex
representations of reality
help human beings find things in
complex representations of reality
help computers reason with complex
representations of reality
in other areas of biomedicine?
23
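RELATION TO TIME (columns): CONTINUANT (INDEPENDENT, DEPENDENT), OCCURRENT
GRANULARITY (rows): ORGAN AND ORGANISM, CELL AND CELLULAR COMPONENT, MOLECULE
ORGAN AND ORGANISM
Independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
Dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
Occurrent: Biological Process (GO)
CELL AND CELLULAR COMPONENT
Independent continuant: Cell (CL); Cellular Component (FMA, GO)
Dependent continuant: Cellular Function (GO)
MOLECULE
Independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
Dependent continuant: Molecular Function (GO)
Occurrent: Molecular Process (GO)
The Open Biomedical Ontologies (OBO) Foundry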
24
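RELATION TO TIME (columns): CONTINUANT (INDEPENDENT, DEPENDENT), OCCURRENT
GRANULARITY (rows): ORGAN AND ORGANISM, CELL AND CELLULAR COMPONENT, MOLECULE
ORGAN AND ORGANISM
Independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
Dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
Occurrent: Organism-Level Process (GO)
CELL AND CELLULAR COMPONENT
Independent continuant: Cell (CL); Cellular Component (FMA, GO)
Dependent continuant: Cellular Function (GO)
Occurrent: Cellular Process (GO)
MOLECULE
Independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
Dependent continuant: Molecular Function (GO)
Occurrent: Molecular Process (GO)
initial OBO Foundry coverage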
25
CRITERIA
• openness
• common formal language
• collaborative development
• evidence-based maintenance
• identifiers
• versioning
• textual and formal definitions
26
CRITERIA
• COMMON ARCHITECTURE: The ontology uses
common formal relations
• ORTHOGONALITY: One ontology for each
domain
27
LEADERSHIP
• Michael Ashburner, Suzanna Lewis, Chris
Mungall (GO Consortium)
• Alan Ruttenberg (Science Commons, OWL
Working Group, HCLS/Semantic Web)
• Richard Scheuermann (ImmPort, CTSA)
• Barry Smith
28
OBO Foundry provides
• tested guidelines enabling new groups to
develop the ontologies they need in ways which
counteract forking and dispersion of effort
• an incremental bottom-up approach to
evidence-based terminology practices in
medicine that is rooted in basic biology
• automatic web-based linkage between medical
terminologies and biological knowledge
resources
29