OAT - The Ontology Annotation Tree browser tool
Motivation for OAT
1. Submit probe sets
The microarray technique has gained enormous popularity among both
the major pharmaceutical companies and academic institutions. Despite
all the benefits of the technique, we have to face the problem that the
amount of information resulting from a microarray analysis is of such
magnitude that it is difficult to get an overview of the data. The need
to condense the information from an analysis is therefore evident. To
address this challenge we have developed the Ontology Annotation
Tree browser tool (OAT-tool). The OAT-tool uses two ontologies,
Medical Subject Headings (MeSH) and Gene Ontology (GO), to represent
the information.
Technical Components
The OAT-tool runs on a web server. It is written in Perl using CGI.
All data are stored in the OAT database (OATdb), and for each query
a Query Specific database (QSdb) is generated. The information flow
of OAT’s three main scripts is outlined below to the right.
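The idea of a per-query database can be illustrated with a minimal sketch. It is written in Python rather than OAT's Perl, and it assumes a single hypothetical table `annotation(probe_set, term)`; the actual OATdb and QSdb schemas are not described here, so the table, column and function names below are illustrative only.

```python
import sqlite3

def build_query_db(oat_db, probe_sets):
    """Copy the annotation rows for the queried probe sets into a fresh,
    query-specific database (a stand-in for QSdb)."""
    qs_db = sqlite3.connect(":memory:")
    qs_db.execute("CREATE TABLE annotation (probe_set TEXT, term TEXT)")
    placeholders = ",".join("?" for _ in probe_sets)
    rows = oat_db.execute(
        "SELECT probe_set, term FROM annotation "
        f"WHERE probe_set IN ({placeholders})",
        list(probe_sets),
    ).fetchall()
    qs_db.executemany("INSERT INTO annotation VALUES (?, ?)", rows)
    qs_db.commit()
    return qs_db

# Toy stand-in for OATdb; the identifiers and annotations are made up.
oat_db = sqlite3.connect(":memory:")
oat_db.execute("CREATE TABLE annotation (probe_set TEXT, term TEXT)")
oat_db.executemany(
    "INSERT INTO annotation VALUES (?, ?)",
    [("ps_1", "Apoptosis"), ("ps_2", "Apoptosis"), ("ps_3", "Cell Cycle")],
)

qs_db = build_query_db(oat_db, ["ps_1", "ps_3"])
query_rows = sorted(
    qs_db.execute("SELECT probe_set, term FROM annotation").fetchall()
)
```

Keeping the query-specific rows in their own database means that later browsing and reporting steps only have to read the small subset, not the full data store.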
2. Browse the Ontology
The information flow of OAT
Checkbox for terms that are to be included in the report
(b)
Significance of the annotation
The source of information
The gene information is collected from Affymetrix™ files. Ontology
information is collected from MeSH and GO. Annotations are collected
from EMBL/MEDLINE and the Gene Ontology Annotation Campaign
(GOAC).
OATdb
treesetup.cgi
(c)
(d)
(e)
QSdb
Number of probe sets annotated with this term
MeSH is the National Library of Medicine's controlled vocabulary
thesaurus. Thesauri are carefully constructed sets of terms often
connected by broader-than, narrower-than, and related links. These
links show the relationships between related terms and provide a
hierarchical structure that permits searching at various levels of
specificity from narrower to broader. There are more than 19,000
terms in MeSH.
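How narrower-than links support searching at different levels of specificity can be shown with a small sketch in Python. The thesaurus fragment below is a hand-made toy, and `narrower_closure` is a hypothetical helper name; this is not how MeSH itself is stored or distributed.

```python
def narrower_closure(term, narrower):
    """Return every term reachable from `term` via narrower-than links,
    in breadth-first order (the starting term itself is not included)."""
    found = []
    queue = [term]
    while queue:
        current = queue.pop(0)
        for child in narrower.get(current, []):
            if child not in found:
                found.append(child)
                queue.append(child)
    return found

# Toy fragment: each key lists its directly narrower terms.
narrower = {
    "Neoplasms": ["Neoplasms by Site"],
    "Neoplasms by Site": ["Breast Neoplasms", "Skin Neoplasms"],
}

expanded = narrower_closure("Neoplasms", narrower)
```

A search on a broad term can then be widened to include everything in `expanded`, while a search on a leaf term stays specific.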
The goal of the Gene Ontology™ Consortium is to produce a dynamic
controlled vocabulary that can be applied to all organisms. The terms
are structured in three ontologies: Molecular Function, Biological
Process and Cellular Component.
(a)
(g)
treebrowser.cgi
(f)
Number of probe sets below
Number of annotations below
Link to MeSH description
3. Make a report
Work flow of the OAT-tool
1. For each query of probe sets we extract a subset of the
ontology and assign the annotated genes to the different
terms in the ontology.
2. The user browses the ontology hierarchically, from
the top-level terms down to the more detailed ones.
3. When a satisfactory level of detail is reached, the user
can summarize the information in a report that can be
used for further studies.
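The first step of the work flow, together with the per-term counts shown in the tree browser, can be sketched as follows. This is a minimal Python illustration under stated assumptions, not OAT's implementation: the ontology is given as a hypothetical `parents` mapping from a term to its broader terms, and the toy hierarchy and probe set identifiers are made up.

```python
from collections import defaultdict

def ancestors(term, parents):
    """Return the term together with all of its broader terms."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen

def summarize(parents, annotations):
    """Map each term in the query-specific subset of the ontology to
    (probe sets annotated with the term, probe sets annotated below it)."""
    direct = defaultdict(set)
    below = defaultdict(set)
    for probe_set, terms in annotations.items():
        for term in terms:
            direct[term].add(probe_set)
            for broader in ancestors(term, parents) - {term}:
                below[broader].add(probe_set)
    # Only terms touched by the query end up in the subset.
    subset = set(direct) | set(below)
    return {t: (len(direct.get(t, ())), len(below.get(t, ()))) for t in subset}

# Toy ontology (term -> broader terms) and toy annotations.
parents = {
    "Apoptosis": ["Cell Death"],
    "Cell Death": ["Biological Process"],
    "Cell Cycle": ["Biological Process"],
    "Mitosis": ["Cell Cycle"],
}
annotations = {
    "ps_1": ["Apoptosis"],
    "ps_2": ["Apoptosis", "Cell Cycle"],
    "ps_3": ["Cell Death"],
}

counts = summarize(parents, annotations)
```

Terms that no queried probe set reaches, such as "Mitosis" in this toy data, are left out of the result, which is what keeps the browsable tree small.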
Anders Bresell (1), Bo Servenius (2)
1. Department of Computer and Information Science, Linköpings Universitet, Linköping, Sweden.
2. Department of Molecular Sciences, AstraZeneca R&D Lund, Lund, Sweden.
URL: http://bioinfo.selu.astrazeneca.net/~ext_abl/html/atb/
Link to Affymetrix DB
Link to MeSH DB
Link to MEDLINE
report.cgi
(h)
(j)
(i)
annotationReport.cgi
(k)
geneReport.cgi
At the home page of OAT, (a) a link to the query form is found.
(b) The probe sets and the tree option are sent to the tree setup
script. The relevant information for the query is (c) extracted from
OATdb and (d) stored in QSdb. The tree browser script (e) reads data
from QSdb and (f) visualises it as a web page. For each modification
of the tree visualisation, (g) the tree browser script is reloaded with
the new data. By marking terms and clicking on the submit button,
(h) the report information is sent to (i) a redirection script. The
report scripts, one for sorting the data by gene and one for sorting
by term string, (j) read data from QSdb and (k) generate a report
web page.
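The reporting end of this flow, where a redirection step hands the marked terms to one of two report scripts, can be sketched in Python as below. The function names, the `sort_by` parameter and the row format are assumptions made for illustration, and probe set identifiers stand in for genes; the sketch does not reproduce report.cgi, annotationReport.cgi or geneReport.cgi.

```python
def term_report(rows, marked_terms):
    """List the marked terms alphabetically, each with its probe sets."""
    report = {}
    for probe_set, term in rows:
        if term in marked_terms:
            report.setdefault(term, []).append(probe_set)
    return [(term, sorted(report[term])) for term in sorted(report)]

def gene_report(rows, marked_terms):
    """List the probe sets alphabetically, each with its marked terms."""
    report = {}
    for probe_set, term in rows:
        if term in marked_terms:
            report.setdefault(probe_set, []).append(term)
    return [(ps, sorted(report[ps])) for ps in sorted(report)]

def make_report(rows, marked_terms, sort_by):
    """Dispatch to the requested report builder (the redirection step)."""
    builders = {"term": term_report, "gene": gene_report}
    return builders[sort_by](rows, marked_terms)

# Toy query-specific rows (probe set, term) and the terms the user marked.
rows = [
    ("ps_1", "Apoptosis"),
    ("ps_2", "Apoptosis"),
    ("ps_2", "Cell Cycle"),
    ("ps_3", "Cell Death"),
]
marked = {"Apoptosis", "Cell Cycle"}

by_term = make_report(rows, marked, "term")
by_gene = make_report(rows, marked, "gene")
```

Both builders read the same rows and differ only in the grouping key, which is why a single dispatch step in front of them is enough.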
A more detailed description of the work is given in "Interpretation
of microarray expression data using ontology browsing", a master's
thesis report (LiTH-IDA-Ex-02/75) from Linköpings Universitet by
Anders Bresell.