Chemical Ontologies

I533 Seminar
21.Feb.2006
Kent Holaday
Overview
• Define Ontology
• Knowledge Representations
• Applications of Ontologies
• Chemical Ontologies
Ontology Defined
Merriam-Webster Online Dictionary
1 : a branch of metaphysics concerned with the
nature and relations of being
2 : a particular theory about the nature of being
or the kinds of existents
Ontology Defined (cont.)
Google Definitions on the web
• An ontology is a controlled vocabulary that describes
objects and the relations between them in a formal way,
and has a grammar for using the vocabulary terms to
express something meaningful within a specified domain
of interest.
Source: members.optusnet.com.au/~webindexing/Webbook2Ed/glossary.htm
• Ontology is the newest label attached to some KOSs.
Ontologies are being developed as specific concept
models by the Knowledge Management community.
They can represent complex relationships between
objects, and include the rules and axioms missing from
semantic networks. Ontologies that describe knowledge
in a specific area are often connected with systems for
data mining and knowledge management.
Source: www.und.nodak.edu/dept/library/Departments/abc/SACSEM-SemInGlossary.htm
Knowledge Representations
INFORMAL
• Tagging
• Folksonomies
Examples
• del.icio.us
• flickr
• Yahoo My Web 2.0
FORMAL
• Lists
• Thesauri
• Taxonomies
• Ontologies
Examples
• IUPAC, MeSH, LCSH, XML schema/DTD
Source: http://www.biowisdom.com/ontology/faq_q1.htm
IUPAC Nomenclature
• Compendium of Chemical Terminology
carbon
Element number 6 of the periodic table of elements (electronic ground state
1s² 2s² 2p²).
For a description of the various types of carbon as a solid the term carbon
should be used only in combination with an additional noun or a clarifying
adjective.
See also amorphous carbon, carbon fibres, carbon material, glasslike
carbon, graphitic carbon, non-graphitic carbon, pyrolytic carbon.
1995, 67, 479
• Nomenclature of Inorganic Compounds
• Nomenclature of Organic Chemistry
http://www.iupac.org/publications/books/seriestitles/nomenclature.html
MeSH Tree Structures 2006
1. Anatomy [A]
2. Organisms [B]
3. Diseases [C]
4. Chemicals and Drugs [D]
   o Inorganic Chemicals [D01] +
   o Organic Chemicals [D02] +
   o Heterocyclic Compounds [D03] +
   o Polycyclic Compounds [D04] +
   o Macromolecular Substances [D05] +
   o Hormones, Hormone Substitutes, and Hormone Antagonists [D06] +
   o Enzymes and Coenzymes [D08] +
   o Carbohydrates [D09] +
   o Lipids [D10] +
   o Amino Acids, Peptides, and Proteins [D12] +
   o Nucleic Acids, Nucleotides, and Nucleosides [D13] +
   o Complex Mixtures [D20] +
   o Biological Factors [D23] +
   o Biomedical and Dental Materials [D25] +
   o Pharmaceutical Preparations [D26] +
   o Chemical Actions and Uses [D27] +
5. Analytical, Diagnostic and Therapeutic Techniques and Equipment [E]
6. Psychiatry and Psychology [F]
7. Biological Sciences [G]
8. Physical Sciences [H]
9. Anthropology, Education, Sociology and Social Phenomena [I]
10. Technology and Food and Beverages [J]
11. Humanities [K]
12. Information Science [L]
13. Persons [M]
14. Health Care [N]
15. Publication Characteristics [V]
16. Geographic Locations [Z]
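The bracketed codes above act as hierarchical tree numbers: a subtree code such as D03 extends its category letter D. The following is a minimal Python sketch of walking up such a hierarchy, assuming deeper levels are written as dot-separated groups. The number D03.111.222 is a made-up placeholder, and the helper names are hypothetical, not part of any MeSH tool.

```python
# A few entries copied from the slide; the dictionary is only used for labels.
MESH_LABELS = {
    "D": "Chemicals and Drugs",
    "D01": "Inorganic Chemicals",
    "D02": "Organic Chemicals",
    "D03": "Heterocyclic Compounds",
}


def mesh_parent(tree_number):
    """Return the parent tree number, or None for a top-level category."""
    if "." in tree_number:
        # Drop the last dot-separated group: "D03.111.222" -> "D03.111"
        return tree_number.rsplit(".", 1)[0]
    if len(tree_number) > 1:
        # A subtree code such as "D03" sits directly under its category letter.
        return tree_number[0]
    return None


def mesh_ancestors(tree_number):
    """List all ancestors, nearest first."""
    result = []
    current = mesh_parent(tree_number)
    while current is not None:
        result.append(current)
        current = mesh_parent(current)
    return result


# "D03.111.222" is a placeholder number, not a real MeSH entry.
for code in mesh_ancestors("D03.111.222"):
    print(code, MESH_LABELS.get(code, "(label not listed here)"))
```

Because the position in the tree is encoded in the identifier itself, finding all descendants of a node reduces to a prefix match on the tree number.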
Visual vs Linguistic
Caffeine
CAS: 58-08-2
C8H10N4O2
Synonyms:
• 3,7-dihydro-1,3,7-trimethyl-1H-purine-2,6-dione
• Methyltheobromin
• guaranine
Source: Krallinger, M. et al. (2005) Text-mining approaches in molecular biology and biomedicine. DDT 10(6) 440
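One way to bridge the many linguistic names of a compound is to map each synonym to a single identifier such as the CAS number. The sketch below is an assumption-laden illustration, not a method taken from the cited paper: it lowercases names for lookup and verifies the CAS check digit, which is the weighted sum of the preceding digits (weights 1, 2, 3, ... counted from the right) modulo 10. The function names are hypothetical.

```python
def cas_is_valid(cas):
    """Check the format and the check digit of a CAS Registry Number."""
    parts = cas.split("-")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        return False
    if not (2 <= len(parts[0]) <= 7 and len(parts[1]) == 2 and len(parts[2]) == 1):
        return False
    digits = parts[0] + parts[1]
    # Weight the digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(i * int(d) for i, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(parts[2])


# Synonyms as listed on the slide, all pointing at the same CAS number.
CAFFEINE_CAS = "58-08-2"
SYNONYM_TO_CAS = {
    name.lower(): CAFFEINE_CAS
    for name in [
        "Caffeine",
        "3,7-dihydro-1,3,7-trimethyl-1H-purine-2,6-dione",
        "Methyltheobromin",
        "guaranine",
    ]
}


def lookup_cas(name):
    """Return the CAS number for a known synonym, or None."""
    return SYNONYM_TO_CAS.get(name.strip().lower())


print(cas_is_valid(CAFFEINE_CAS))   # 8*1 + 0*2 + 8*3 + 5*4 = 52, and 52 % 10 == 2
print(lookup_cas("Guaranine"))
```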
Data Sources
Structured
• Medline
• SwissProt
• ChemID Plus
• Medline Plus
• Chemical Abstracts
• NCBI databases
• Misc. databases
Unstructured
• Text documents
• Journal articles
• Lab notebooks
• Web pages
• Database BLOBs
• Email
Source: Gardner, S. (2005) Ontologies and semantic data integration. DDT 10(14) 1004
Semantic Web
Figure 1: The Semantic Web "layer cake" as presented by Tim Berners-Lee.
Source: Hendler, J. (2001) Agents and the semantic web. http://www.cs.umd.edu/users/hendler/AgentWeb.html
Chemical Ontology
• Describe chemical objects and relationships
• Enable search across multiple data sources
• Bridge some of the gap between graphical and
linguistic representations
Fragment of Chemical Ontology
grouped_by_chemistry
  molecules
    organic molecules
      heterocyclic compounds
        bridged-ring heterocyclic compounds
          morphinans
            morphine
        two-ring heterocyclic compounds
          isoquinolines
            isoquinoline alkaloids
              morphinans
                morphine
[Figure: structure drawings of morphine and morphinan, linked by IsA (morphine IsA morphinan)]
Source: Ennis, M. (2004) ChEBI A Dictionary of Chemical Entities with an Associated Ontology.
SOFG-2, Philadelphia, October 23-26 2004
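In this fragment morphinans appears under two branches, so the IsA hierarchy is a directed acyclic graph rather than a strict tree. The following Python sketch shows how class membership can be answered by following IsA links transitively. The parent assignments are an assumption read off the term names on the slide, and the function names are hypothetical.

```python
# child -> list of direct IsA parents (assumed nesting, see lead-in)
IS_A = {
    "morphine": ["morphinans"],
    "morphinans": ["bridged-ring heterocyclic compounds", "isoquinoline alkaloids"],
    "isoquinoline alkaloids": ["isoquinolines"],
    "isoquinolines": ["two-ring heterocyclic compounds"],
    "bridged-ring heterocyclic compounds": ["heterocyclic compounds"],
    "two-ring heterocyclic compounds": ["heterocyclic compounds"],
    "heterocyclic compounds": ["organic molecules"],
    "organic molecules": ["molecules"],
}


def isa_ancestors(term):
    """Return every class reachable from term via one or more IsA links."""
    seen = set()
    stack = [term]
    while stack:
        for parent in IS_A.get(stack.pop(), []):
            if parent not in seen:  # a term with two parents is visited once
                seen.add(parent)
                stack.append(parent)
    return seen


def is_a(term, cls):
    """True if term is (transitively) a kind of cls."""
    return cls in isa_ancestors(term)


print(is_a("morphine", "isoquinolines"))
print(is_a("morphine", "bridged-ring heterocyclic compounds"))
print(is_a("isoquinolines", "morphinans"))
```

A search for "isoquinolines" or for "bridged-ring heterocyclic compounds" would both retrieve morphine under this model, which is the practical benefit of allowing multiple parents.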
ChEBI: What is it?
• Chemical Entities of Biological Interest – an
EBI database/dictionary of ‘biochemical
compounds’ and other chemical entities of
biochemical interest with an associated
ontology
• ChEBI’s goal is to provide a standard
terminology for (bio)chemical compounds that
should ultimately be used in biological databases
Source: Ennis, M. (2004) ChEBI A Dictionary of Chemical Entities with an Associated Ontology.
SOFG-2, Philadelphia, October 23-26 2004
Relationships in ChEBI ontology
Current
• IsA : inherited from Chemical Ontology - class to class;
instance to class
To be implemented…
• IsPartOf - group to molecule; group to group; group to
class
• IsEnantiomerOf - molecule to molecule; cycles allowed
• IsTautomerOf - molecule to molecule; cycles allowed
• IsConjugateBaseOf/IsConjugateAcidOf - molecule to
molecule (e.g. anion to acid)
• IsParentHydrideOf - molecule to molecule (later?)
Source: Ennis, M. (2004) ChEBI A Dictionary of Chemical Entities with an Associated Ontology.
SOFG-2, Philadelphia, October 23-26 2004
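The relationship types above differ in how they behave: IsEnantiomerOf and IsTautomerOf hold in both directions (hence "cycles allowed"), while IsConjugateBaseOf and IsConjugateAcidOf are inverses of each other. A minimal Python sketch of a store that records the implied reverse statement automatically follows. The class and the example molecules (L-alanine, D-alanine, acetate, acetic acid) are illustrative assumptions and do not come from ChEBI or the slides.

```python
from collections import defaultdict

# Relations that hold in both directions.
SYMMETRIC = {"is_enantiomer_of", "is_tautomer_of"}
# Relations whose reverse direction has a different name.
INVERSE = {
    "is_conjugate_base_of": "is_conjugate_acid_of",
    "is_conjugate_acid_of": "is_conjugate_base_of",
}


class RelationStore:
    """Hypothetical in-memory store of (subject, relation, object) statements."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add(self, subject, relation, obj):
        self._edges[(subject, relation)].add(obj)
        if relation in SYMMETRIC:
            self._edges[(obj, relation)].add(subject)
        elif relation in INVERSE:
            self._edges[(obj, INVERSE[relation])].add(subject)

    def objects(self, subject, relation):
        """Return a copy of the objects related to subject by relation."""
        return set(self._edges.get((subject, relation), set()))


store = RelationStore()
store.add("L-alanine", "is_enantiomer_of", "D-alanine")
store.add("acetate", "is_conjugate_base_of", "acetic acid")

print(store.objects("D-alanine", "is_enantiomer_of"))
print(store.objects("acetic acid", "is_conjugate_acid_of"))
```

IsA and IsPartOf are deliberately left directional in this sketch, since a class is not a kind of its own member and a molecule is not part of its own group.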
[Figure: structure drawings illustrating the relationships is_a, is_enantiomer_of, is_tautomer_of, is_part_of and is_conjugate_base_of among Amino acid, L-Amino acid and D-Amino acid]
Source: Ennis, M. (2004) ChEBI A Dictionary of Chemical
Entities with an Associated Ontology. SOFG-2,
Philadelphia, October 23-26 2004
Source: http://www.cse.buffalo.edu/~rapaport/663/F03/ontology.html