CL_review-RS - The OBO Foundry

OBO Foundry Workshop 2009
Cell Ontology (CL)
Preliminary review
19. Versioning
– format-version: 1.2
– date: 09:12:2008 10:05
– auto-generated-by: OBO-Edit 2.000-beta51
– default-namespace: cell
– remark: $Revision: 1.38 $ Drafted by Jonathan Bard,
Michael Ashburner, David States, Seung Y. Rhee, and
Pascal Gaudet. Incorporating terms and synonyms from the
eVOC cell ontology of Janet Kelso, Win Hide et al.
http://www.sanbi.ac.za/evoc/ontologies.html.
Hematopoietic cell terms revised by Alexander Diehl, MGI,
The Jackson Laboratory. Contact Oliver Hofmann,
[email protected], at SANBI, University of the Western
Cape.
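The version information above can be read programmatically. The following is a minimal sketch, assuming the header is a run of `tag: value` lines that ends at the first stanza (for example `[Term]`) and that the date uses the dd:MM:yyyy HH:mm layout seen above. The function name `parse_obo_header` and the shortened sample text are illustrative, not part of any CL tooling.

```python
from datetime import datetime


def parse_obo_header(text):
    """Return the tag-value pairs that precede the first stanza."""
    header = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):  # first stanza, e.g. [Term], ends the header
            break
        if not line or line.startswith("!"):  # blank line or OBO comment
            continue
        tag, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            header[tag.strip()] = value.strip()  # a repeated tag keeps its last value
    return header


# Shortened sample based on the header shown on the slide.
SAMPLE = """format-version: 1.2
date: 09:12:2008 10:05
auto-generated-by: OBO-Edit 2.000-beta51
default-namespace: cell
remark: $Revision: 1.38 $

[Term]
id: CL:0000000
name: cell
"""

header = parse_obo_header(SAMPLE)
released = datetime.strptime(header["date"], "%d:%m:%Y %H:%M")
print(header["format-version"], released.date())
```

Because the sketch stores the header in a plain dictionary, a tag that occurs more than once would keep only its last value; a list per tag would be needed to preserve repeats.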
General CL Attributes
1. Clearly defined domain - naturally-occurring and
experimentally-derived cell types from all of the
biological kingdoms – eukaryotic, prokaryotic and
plant. (Not cell lines)
2. Relevant to biomedical research – yes
3. OBO Foundry principles being followed - for the most
part, yes.
13. Openly available – yes.
14. Syntactical correctness - available in .obo format.
18. Uniqueness of all identifiers and preferred terms - terms and identifiers appear to be unique in the OBO
Foundry space.
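A uniqueness check of the kind described in item 18 could be sketched as follows. This is an illustration only: the helper names and the `EX:` identifiers are hypothetical, and the sample deliberately contains one repeated identifier and one repeated name so that the check has something to report.

```python
from collections import Counter


def term_field_values(text, field):
    """Collect the values of one tag (e.g. 'id' or 'name') from [Term] stanzas."""
    values, in_term = [], False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            in_term = (line == "[Term]")  # ignore [Typedef] and other stanzas
        elif in_term and line.startswith(field + ":"):
            values.append(line[len(field) + 1:].strip())
    return values


def duplicates(values):
    """Return the values that occur more than once, sorted."""
    return sorted(v for v, n in Counter(values).items() if n > 1)


# Hypothetical stanzas; the EX: identifiers are not real ontology terms.
SAMPLE = """[Term]
id: EX:0000001
name: example cell A

[Term]
id: EX:0000002
name: example cell B

[Term]
id: EX:0000002
name: example cell A
"""

dup_ids = duplicates(term_field_values(SAMPLE, "id"))
dup_names = duplicates(term_field_values(SAMPLE, "name"))
print("duplicate ids:", dup_ids)
print("duplicate names:", dup_names)
```

Checking uniqueness across the whole OBO Foundry space would mean running the same comparison over the concatenated term lists of all member ontologies.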
CL Modularity & Interoperability
4. Modular – yes.
5. Interoperation with other ontologies – Not well documented (general
problem for OBO ontologies), but yes.
• Use of CL by other ontologies
– The Gene Ontology (GO) is actively using the CL as a source for the redefinition of GO terms as cross-products (~25%-30% of GO annotations are
associated with a CL term).
– ZFIN and the Infectious Disease Ontology (IDO) are using CL for cross-product term generation.
– Gautheret group (William Ritchie) at the Université Paris-Sud used CL to
derive a Cell-line ontology.
– The Plant Ontology (CSHL) and eVOC (SANBI) appear to be users as well
(http://nar.oxfordjournals.org/cgi/content/full/36/suppl_1/D449).
• Use of other ontologies by CL
– The CL is attempting to utilize other OBO Foundry ontologies as appropriate.
For example, the CL has begun to use terms from the Protein Ontology in its
cell type definitions.
Domain Coverage
6. Adequate coverage of defined domain - A wide
variety of cell type terms are included. The
hematopoietic branch appears to be especially
well developed. The domain appears to be
well covered.
Outreach
8. Collaborative development through the engagement of relevant
domain stakeholders and developers of neighboring ontologies - The
development of the CL appears to be in transition. Although this
needs to be verified, it appears that the CL development coordinator
is transitioning from Oliver Hofmann to Alexander Diehl. As part of
this transition, Dr. Diehl has actively engaged the immunology
research community to flesh out the hematopoietic cell branch. The
end result has been a major improvement in both the content and the
structure of this branch. Similar outreach to other relevant
communities is encouraged.
9. Tracker for submissions of new terms and errors - yes, through
Sourceforge.
10. Help desk and responsiveness - there are 52 requests that remain
open at the Sourceforge site. This may reflect the recent transition
in development leadership and inadequate funding, but remains a
concern nonetheless.
Use
22. Degree to which the ontology is being used in data annotation - this
is difficult to assess at the moment (general problem for OBO
ontologies).
– On one hand,
• Mouse Genome Informatics has been using the CL in conjunction with the GO
for around 5 years, and the GO Consortium is promoting the use of the CL in
GO annotation by other MODs.
• According to Melissa Haendel (Univ. Oregon), 5892 expression and 3384
phenotype annotations use CL terms in the ZFIN database.
• 32 literature references mention the CL either formally by citation or
informally in text.
– On the other hand, some resources have been waiting with great
anticipation for certain parts of the ontology to be fleshed out.
• For example, the Immunology Database and Analysis Portal
(www.immport.org) is planning on linking the hematopoietic branch of the CL
to the output of analytical algorithms being used to identify cell populations in
high-dimensional flow cytometry data.
Structure
12. Classification principles stated – no (general problem for OBO
ontologies).
21. RO relations used - entire ontology based on is_a and derives_from
relations. In general, the relations are properly applied, with some
exceptions (see 23 below).
7. Inference support in structure - the ontology does support inferencing
based on its structure, but suffers from the challenge of multiple
inheritance (see 23 below).
23. Multiple and inconsistent inheritance - in some cases, cell branches
are defined based on function (e.g. barrier cell), in other cases
based on their anatomic location (e.g. circulating cell), and in other
cases based on their embryonic origin (e.g. ectodermal cells). This
inevitably leads to multiple inheritance.
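Terms affected by multiple inheritance can be listed mechanically. The sketch below, under the assumption that each parent is recorded as an `is_a:` line inside a `[Term]` stanza, collects the parents of every term and reports those with more than one. The `EX:` identifiers and the child term are hypothetical; only the parent names are taken from the examples in item 23.

```python
def is_a_parents(text):
    """Map each term id to the list of its is_a parent ids."""
    parents, current, in_term = {}, None, False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            in_term, current = (line == "[Term]"), None
        elif in_term and line.startswith("id:"):
            current = line[3:].strip()
            parents[current] = []
        elif current is not None and line.startswith("is_a:"):
            # Drop the trailing "! parent name" comment, keep the id.
            parents[current].append(line[5:].split("!")[0].strip())
    return parents


# Hypothetical stanzas: one child classified by function and by origin.
SAMPLE = """[Term]
id: EX:0000001
name: barrier cell

[Term]
id: EX:0000002
name: ectodermal cell

[Term]
id: EX:0000003
name: hypothetical ectodermal barrier cell
is_a: EX:0000001 ! barrier cell
is_a: EX:0000002 ! ectodermal cell
"""

multi = {t: p for t, p in is_a_parents(SAMPLE).items() if len(p) > 1}
for term, term_parents in multi.items():
    print(term, "has", len(term_parents), "is_a parents:", term_parents)
```

A count like this only flags where multiple inheritance occurs; deciding whether the parents mix classification axes (function, location, origin) still needs curator review.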
Multiple & Inconsistent Inheritance
Metadata & Definitions
11. Ontology metadata – some metadata items found on the
Sourceforge tracker; the Cell Ontology has a wiki page
(http://bioontology.org/wiki/index.php/CL:Main_Page),
although it has not been heavily used.
15. Clarity and precision of definitions - some definitions are
missing. The definitions that are found are often derived from
relevant reference publications, but many of the current
definitions need to be improved. Some of the definitions in
the hematopoietic branch are Aristotelian, which has helped
improve the accuracy of the hierarchy in this branch.
20. Completeness of ontology term metadata - ~30% of terms
lack definitions. No other metadata is included.
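A figure such as the ~30% above can be reproduced with a simple count. This is a minimal sketch, assuming a term counts as defined when its `[Term]` stanza contains a `def:` line; it does not exclude obsolete terms, which a real audit might. The function name and the `EX:` sample terms are hypothetical.

```python
def count_missing_definitions(text):
    """Return (total_terms, terms_without_def) for [Term] stanzas."""
    total = missing = 0
    in_term = has_def = False
    # The sentinel line closes the final stanza so it is counted too.
    for raw in text.splitlines() + ["[EOF]"]:
        line = raw.strip()
        if line.startswith("["):
            if in_term:
                total += 1
                missing += not has_def  # True counts as 1
            in_term, has_def = (line == "[Term]"), False
        elif in_term and line.startswith("def:"):
            has_def = True
    return total, missing


# Hypothetical stanzas: two defined terms and one without a definition.
SAMPLE = """[Term]
id: EX:0000001
name: example cell A
def: "A hypothetical definition." [EX:ref]

[Term]
id: EX:0000002
name: example cell B

[Term]
id: EX:0000003
name: example cell C
def: "Another hypothetical definition." [EX:ref]
"""

total, missing = count_missing_definitions(SAMPLE)
print(f"{missing}/{total} terms lack a definition ({missing / total:.0%})")
```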
CL Summary
Domain is well delineated and coverage is
relatively complete; ontology is open and
available in common syntax
General problems of incomplete metadata, no
statement of classification principles, and lack of
direct funding support.
Specific problems of multiple and inconsistent
inheritance, lack of term metadata, and lack of
ongoing curation.
Definitions