
Pubcrawler
Semantic Web
“The Semantic Web will bring structure to the meaningful content of Web pages, creating an environment where software agents roaming from page to page can readily carry out sophisticated tasks for users.”
Semantic Web
Two major components:
Agents - software designed to execute searches without direction from a human.
• Flexible - server down, look for alternate resource.
• Persistent - works without supervision, as needed.
Ontology - structured language
Ontology
That department of metaphysics which investigates and explains the nature and essential properties and relations of all beings, as such, or the principles and causes of being. (Webster's Revised Unabridged Dictionary)
Ontology
Structured, hierarchical, controlled vocabulary that describes the concepts or knowledge regarding a particular domain.
Why develop an ontology?
To share common understanding of the structure of information.
• Everyone agrees that the terms of the ontology describe the domain of knowledge.
Adapted from: Ontology Development 101: A Guide to Creating Your First Ontology, Natalya F. Noy and Deborah L. McGuinness
http://www.ksl.stanford.edu/people/dlm/papers/ontology101/ontology101-noy-mcguinness.html
To allow reuse of domain knowledge.
• Ontologies describing gene functions can be combined with an ontology describing the sequence of genes.
To make domain assumptions explicit.
• Dehydrogenases ARE enzymes.
To separate domain knowledge from operational knowledge.
• For example, domain knowledge about the function of enzymes is kept separate from the reaction mechanisms of enzymes.
To analyze domain knowledge.
• Formalizing knowledge into the defined relationships of an ontology permits computer science to help analyze data.
Ontology
Describes the concepts or knowledge regarding a particular domain.
An ontology is comprised of
Classes
• concepts that encompass the domain of interest.

Function of a gene product
Subclasses may exist -- enzyme
Properties of the classes
• specific properties

dehydrogenase
Restrictions on the properties
• only certain classes of dehydrogenases exist
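The three parts listed above (classes, properties of the classes, restrictions on the properties) can be sketched in a few lines of Python. This is only an illustration, not the representation used by any particular ontology tool: the `OntologyClass` type, the `substrate` property, and its allowed values are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class OntologyClass:
    """A concept in the ontology, optionally a subclass of another concept."""
    name: str
    parent: "OntologyClass | None" = None
    # Property name -> set of allowed values (the restriction on that property).
    restrictions: dict = field(default_factory=dict)

    def is_a(self, other: "OntologyClass") -> bool:
        """True if this class is `other` or a (transitive) subclass of it."""
        node = self
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False

    def allows(self, prop: str, value: str) -> bool:
        """Check a value against the nearest restriction found up the hierarchy."""
        node = self
        while node is not None:
            if prop in node.restrictions:
                return value in node.restrictions[prop]
            node = node.parent
        return True  # no restriction declared anywhere, so any value is accepted


# Class hierarchy taken from the slides; the restriction values are hypothetical.
gene_product_function = OntologyClass("function of a gene product")
enzyme = OntologyClass("enzyme", parent=gene_product_function)
dehydrogenase = OntologyClass(
    "dehydrogenase",
    parent=enzyme,
    restrictions={"substrate": {"alcohol", "lactate"}},  # assumed for illustration
)

print(dehydrogenase.is_a(enzyme))                  # dehydrogenases ARE enzymes
print(dehydrogenase.allows("substrate", "water"))  # rejected by the restriction
```

Keeping the restriction on the class rather than on each instance is what lets software reject a value without a human checking it.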
Ontology
Typically, instances of the domain are kept separate from the ontology.
• Liver alcohol dehydrogenase is an instance.
• An ontology combined with specific instances is a knowledge base (as distinct from a database).
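A minimal sketch of the ontology/instance split, assuming plain dictionaries are an acceptable stand-in for a real knowledge-base store. The names `ONTOLOGY`, `INSTANCES`, `classes_of`, and `instances_of` are hypothetical; the class names and the instance come from the slides.

```python
# Ontology: subclass -> parent class (the "is a" links).
ONTOLOGY = {
    "dehydrogenase": "enzyme",
    "enzyme": "function of a gene product",
}

# Instances are stored separately from the ontology: instance -> class.
INSTANCES = {
    "liver alcohol dehydrogenase": "dehydrogenase",
}


def classes_of(instance):
    """Return every class an instance belongs to, most specific first."""
    chain = []
    cls = INSTANCES.get(instance)
    while cls is not None:
        chain.append(cls)
        cls = ONTOLOGY.get(cls)
    return chain


def instances_of(cls):
    """Return all instances whose class chain includes `cls`."""
    return [name for name in INSTANCES if cls in classes_of(name)]


print(classes_of("liver alcohol dehydrogenase"))
print(instances_of("enzyme"))
```

The two dictionaries together play the role of the knowledge base: the instance table alone would be an ordinary database, and it is the ontology that lets a query for "enzyme" find liver alcohol dehydrogenase.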
Taxonomy
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo sapiens

* Gorilla
  * Gorilla gorilla (gorilla)
    * Gorilla gorilla beringei (mountain gorilla)
    * Gorilla gorilla gorilla (lowland gorilla)
    * Gorilla gorilla graueri
* Homo
  * Homo sapiens (human)
    * Homo sapiens neanderthalensis
* Pan (chimpanzees)
  * Pan paniscus (pygmy chimpanzee)
  * Pan troglodytes (chimpanzee)
    * Pan troglodytes schweinfurthii
    * Pan troglodytes troglodytes
    * Pan troglodytes vellerosus
    * Pan troglodytes verus
* Pongo
  * Pongo pygmaeus (orangutan)
    * Pongo pygmaeus abelii (Sumatran orangutan)
    * Pongo pygmaeus pygmaeus (Bornean orangutan)
  * Pongo sp.
Gene Ontology
Parent-Child Relationships
A child is a subset of a parent's elements.
The cell component term Nucleus has 5 children:
• Nucleoplasm
• Nuclear envelope
• Nucleolus
• Chromosome
• Perinuclear space
From: Karen Christie, Ceri Van Slyke and Petra Fey
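The statement that a child is a subset of its parent's elements can be checked with a small sketch. The parent-child links are the five listed on the slide; the gene product names and their annotations are hypothetical, and `with_ancestors` and `products_in` are illustrative helpers rather than part of any Gene Ontology tool.

```python
# Parent term -> child terms, as listed on the slide.
CHILDREN = {
    "nucleus": [
        "nucleoplasm",
        "nuclear envelope",
        "nucleolus",
        "chromosome",
        "perinuclear space",
    ],
}

# Inverted map: child term -> parent term.
PARENT = {child: parent for parent, kids in CHILDREN.items() for child in kids}

# Hypothetical gene products, each annotated to one cell component term.
ANNOTATIONS = {
    "proteinX": "nucleolus",
    "proteinY": "chromosome",
    "proteinZ": "nucleus",
}


def with_ancestors(term):
    """Return the term together with every term above it in the hierarchy."""
    terms = set()
    while term is not None:
        terms.add(term)
        term = PARENT.get(term)
    return terms


def products_in(term):
    """Gene products annotated to `term` or to any of its descendants."""
    return {gp for gp, t in ANNOTATIONS.items() if term in with_ancestors(t)}


print(len(CHILDREN["nucleus"]))    # number of children of Nucleus
print(products_in("nucleolus"))    # products located in the child term
print(products_in("nucleus"))      # the parent includes the child's products
```

Everything found in the nucleolus is also reported for the nucleus, which is the subset relationship the slide describes.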
Gene Ontology
“Tree” Relationships
Derivation of Romance languages from Latin.
From R.A. Hall Jr., Introductory Linguistics; originally published by Chilton Books,
now distributed by Rand McNally & Co.
Gene Ontology
Apoptosis
• Utility: in a microarray experiment, we want to know all genes involved in apoptosis.
• Determine the fold change in gene expression for all genes involved in apoptosis.
• Report all genes involved in apoptosis that change at least 2-fold.
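The three steps above can be sketched as follows. All gene names, annotations, and expression values are made up for illustration, and the sketch assumes that "change at least 2-fold" covers both increases and decreases, so it compares the absolute log2 ratio against the threshold.

```python
import math

# Hypothetical gene -> ontology term annotations (illustration only).
ANNOTATIONS = {
    "geneA": {"apoptosis"},
    "geneB": {"apoptosis", "signal transduction"},
    "geneC": {"cell cycle"},
    "geneD": {"apoptosis"},
}

# Hypothetical gene -> (control, treated) signal intensities.
EXPRESSION = {
    "geneA": (100.0, 450.0),
    "geneB": (200.0, 210.0),
    "geneC": (50.0, 400.0),
    "geneD": (300.0, 100.0),
}


def changed_genes(term, min_fold=2.0):
    """Return {gene: treated/control ratio} for genes annotated to `term`
    whose expression changes at least `min_fold` in either direction."""
    threshold = math.log2(min_fold)
    hits = {}
    for gene, terms in ANNOTATIONS.items():
        if term not in terms:
            continue  # step 1: keep only genes involved in the term
        control, treated = EXPRESSION[gene]
        ratio = treated / control  # step 2: fold change
        if abs(math.log2(ratio)) >= threshold:  # step 3: at least min_fold
            hits[gene] = ratio
    return hits


for gene, ratio in changed_genes("apoptosis").items():
    print(f"{gene}: {ratio:.2f}x")
```

Here geneC changes strongly but is skipped because it is not annotated to apoptosis, while geneD is reported even though its expression goes down.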
MeSH Ontology
Medical Subject Headings - provides indexing for PubMed.
Can be used to generate complex queries in a simple fashion.
Do not need to remember all these terms.
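One way to picture how a hierarchy saves the user from remembering every term: a single broad heading is expanded into all of its narrower headings before searching. The sketch below uses a toy hierarchy that is not copied from the actual MeSH tree, and `NARROWER`, `expand`, and `build_query` are hypothetical names. The query string is a plain OR list, not PubMed's exact search syntax.

```python
# Toy hierarchy: heading -> narrower headings (assumed for illustration,
# not taken from the real MeSH tree).
NARROWER = {
    "Liver Diseases": ["Hepatitis", "Liver Cirrhosis"],
    "Hepatitis": ["Hepatitis A", "Hepatitis B"],
}


def expand(term):
    """Return the term followed by all of its narrower terms, depth first."""
    terms = [term]
    for child in NARROWER.get(term, []):
        terms.extend(expand(child))
    return terms


def build_query(term):
    """Join the expanded terms into one OR query."""
    return " OR ".join(f'"{t}"' for t in expand(term))


print(build_query("Liver Diseases"))
```

The user types one heading and the hierarchy supplies the other four.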
Food
  Fruits
    Apples
    Oranges
    Pears
  Vegetables
  Meats
System would “know” that apples, oranges and pears are all fruits, AND that they are edible.
Food
  Fruits
    Apples
    Oranges
    Pears
  Vegetables
  Meats
    Chicken
“Know” that chicken is not a fruit
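What it means for a system to "know" these facts can be sketched with a subclass table and a short lookup. The hierarchy is the one on the slides; the `SUBCLASS_OF` table and the `is_a` helper are hypothetical names for this example.

```python
# Subclass -> parent class, following the Food hierarchy on the slides.
SUBCLASS_OF = {
    "Fruits": "Food",
    "Vegetables": "Food",
    "Meats": "Food",
    "Apples": "Fruits",
    "Oranges": "Fruits",
    "Pears": "Fruits",
    "Chicken": "Meats",
}


def is_a(term, ancestor):
    """Walk up the hierarchy from `term` and report whether `ancestor` is reached."""
    while term is not None:
        if term == ancestor:
            return True
        term = SUBCLASS_OF.get(term)
    return False


print(is_a("Apples", "Fruits"))   # apples are fruits
print(is_a("Apples", "Food"))     # and therefore edible
print(is_a("Chicken", "Fruits"))  # chicken is not a fruit
print(is_a("Chicken", "Food"))    # but it is still food
```

Nothing in the table says directly that apples are food. The system derives it by following the links upward.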
Food
  Fruits
    Apples
    Oranges
    Pears
  Vegetables
  Meats
    Chicken
Additional subclasses
Instances