WormBase Advisory Board Meeting RNAi

WormBase -- one Web site, many roles
PATO, December 2006
Caenorhabditis elegans
Diverse data (from 3,739 papers)
[Bar chart: counts by data type, y-axis 0 to 1200. Bar values as extracted: 1129, 834, 610, 598, 529, 503, 479, 419, 351, 344, 326, 278, 193, 150, 130, 124, 85, 58, 58, 57, 43, 26, 20, 12. The rotated category labels (among them RNAi, antibody, mapping data, covalent modification, microarray, functional complementation, SNPs) could not be reliably matched to individual bars.]
Prior to July 2006:
≈ 127 phenotype objects in WormBase.
≈ three-tiered organization (specialization_of or
generalization_of)
≈ redundancy existed between terms
≈ no phenotype term definitions, references
≈ many RNAi experiments annotated to
‘Unclassified’ phenotype term
≈ ‘Not’ phenotype associations were not
captured
≈ Phenotype vocabulary was not used for
annotation of alleles and transgene objects
A controlled and structured
vocabulary for phenotypes:
≈ allows complex data queries, and expedites
analysis of genes that act in the same
processes or pathways.
≈ helps to integrate a massive array of data
from many different sources into a common
body of knowledge.
≈ provides the option of linking phenotype data
with other data in WormBase or with data
from other databases.
≈ facilitates communication within and outside
of the C. elegans community
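As a minimal sketch of the kind of query a structured vocabulary makes possible, the Python below finds every gene annotated to a term or to any of its more specific descendants. The is_a edges, the gene names, and the helper functions are hypothetical illustrations, not WormBase data or code; only the term name early_embryonic_lethal appears in these slides.

```python
# Hypothetical is_a edges (child -> parent), for illustration only.
IS_A = {
    "early_embryonic_lethal": "embryonic_lethal",
    "late_embryonic_lethal": "embryonic_lethal",
    "embryonic_lethal": "lethal",
}

# Hypothetical gene-to-term annotations (gene names are placeholders).
ANNOTATIONS = {
    "gene-a": {"early_embryonic_lethal"},
    "gene-b": {"late_embryonic_lethal"},
    "gene-c": {"lethal"},
}


def lineage(term):
    """Return the term followed by each ancestor reachable through is_a."""
    chain = [term]
    while term in IS_A:
        term = IS_A[term]
        chain.append(term)
    return chain


def genes_for(term):
    """Genes annotated to `term` itself or to any descendant of it."""
    return sorted(
        gene
        for gene, terms in ANNOTATIONS.items()
        if any(term in lineage(t) for t in terms)
    )


print(genes_for("embryonic_lethal"))
```

Querying the parent term picks up genes annotated only to its children, which a flat list of free-text descriptions could not do.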
Expansion of the phenotype
ontology, source for term names:
≈ text descriptions in WormBase
≈ free text phenotype descriptions associated with
alleles
≈ text associated with RNAi objects annotated to
‘Unclassified’ phenotype
≈ prior phenotype terms in WormBase
≈ GO ontology
≈ WormBase anatomy ontology
≈ Life stage ontology
Term names and synonyms reflect the language of researchers.
The WormBase phenotype ontology
is a pre-coordinated ontology:
1,348 terms; ~20% of terms are defined
Current term usage:
40% used for annotation
60% not associated with an annotation
RNAi-phenotype data:
≈ 272,759 total RNAi-phenotype
connections
≈ 63,439 RNAi experiments
≈ 19,692 genes associated with phenotypes
via RNAi experiments:
≈ 19,185 genes connected via “Not” associations
≈ 4,577 genes connected directly
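The figures above support two quick derived numbers, worked through in the sketch below. It assumes that the 19,692 total is the union of the two gene subsets, so that a gene with both a "Not" and a direct association is counted once in the total and once in each subtotal; the slides do not state this explicitly.

```python
# Counts taken from the slide.
total_connections = 272_759
experiments = 63_439
genes_total = 19_692
genes_not = 19_185
genes_direct = 4_577

# Average number of RNAi-phenotype connections per experiment.
mean_connections = total_connections / experiments

# Inclusion-exclusion, ASSUMING genes_total is the union of the two subsets:
# genes that carry both a "Not" association and a direct association.
genes_both = genes_not + genes_direct - genes_total

print(f"{mean_connections:.2f} connections per experiment")
print(f"{genes_both} genes with both kinds of association")
```

Under that assumption, roughly 4.3 connections are recorded per experiment and 4,070 genes have both "Not" and direct associations.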
Allele-phenotype data:
≈ Most phenotype connections are to knockout
alleles (NBP).
≈ Ongoing:
≈ Continuing to collect phenotype data from the
community.
≈ Starting to annotate early papers describing large
collections of mutants -> many high-level phenotype
annotations.
≈ Starting to annotate new papers.
Currently, 4,401 total allele-phenotype
connections to 2,585 alleles, defining 1,296
genes.
Lots of RNAi data -> dense
early_embryonic_lethal node:
Vague collections of phenotypes
present challenges for
ontology/annotation:
pleiotropic_defects_severe_early_emb:
“Often multiple pronuclei, aberrant cytoplasmic texture, drop
in overall pace of development, osmotic sensitivity.”
complex_phenotype_early_emb
“Complex combination of defects that does not match other
class definitions.”
Looking ahead to an entity-quality
compatible schema:
≈ Within OBO-Edit we store relevant GO
term names within primary names or
synonym names (GO ID stored in
relevant dbxref field)
≈ Phenotype ontology is developed using
existing anatomy and life stage term
names
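To illustrate what moving from pre-coordinated terms toward an entity-quality representation could look like, here is a rough Python sketch. The EQ class, the decomposition table, and the entity, quality, and stage strings are all hypothetical placeholders chosen for readability; they are not actual PATO, GO, anatomy, or life stage identifiers, and this is not the WormBase schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EQ:
    """A post-composed description: what is affected, and how."""
    entity: str                  # e.g. an anatomy or GO term name (placeholder)
    quality: str                 # e.g. a PATO-style quality (placeholder)
    stage: Optional[str] = None  # optional life stage term name (placeholder)


# Hypothetical mapping from pre-coordinated term names to EQ descriptions.
DECOMPOSITION = {
    "early_embryonic_lethal": EQ("organism", "lethal", stage="early embryo"),
    "transgene_expression_abnormal": EQ("transgene expression", "abnormal"),
}


def decompose(term_name):
    """Return the EQ description for a pre-coordinated term, if one is known."""
    return DECOMPOSITION.get(term_name)


print(decompose("early_embryonic_lethal"))
```

Building term names from existing GO, anatomy, and life stage names, as described above, is what would make such a decomposition table feasible to derive later.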
Phenotype data integration:
≈ Phenotype annotations are associated with
molecular information for alleles, transgenes,
and RNAi objects that permit mapping these
objects to the genome.
≈ High-level phenotype annotations associated
with RNAi objects are automatically converted
to GO terms (RNAi2GO) and associated with
gene objects.
≈ Phenotype annotations describing gene
regulation (‘transgene_expression_abnormal’)
linked with detailed gene regulation
information.
≈ Phenotypes are linked to life stage and anatomy
terms.
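The automatic conversion of high-level RNAi phenotypes to GO terms can be pictured as a lookup applied per gene, as in the sketch below. This is not the RNAi2GO implementation: the mapping table, the gene names, the choice to skip "Not" associations, and the function name are all assumptions made for illustration.

```python
# Illustrative phenotype -> GO mapping. The pairing is an ASSUMPTION,
# not the actual RNAi2GO table.
PHENOTYPE_TO_GO = {
    "embryonic_lethal": "GO:0009792",  # embryo development ending in birth or egg hatching
    "sterile": "GO:0000003",           # reproduction
}


def rnai_to_go(annotations):
    """Convert (gene, phenotype, is_not) RNAi annotations to gene -> GO IDs.

    ASSUMPTIONS: "Not" associations are skipped, and phenotypes without an
    entry in the mapping table are ignored.
    """
    result = {}
    for gene, phenotype, is_not in annotations:
        go_id = PHENOTYPE_TO_GO.get(phenotype)
        if is_not or go_id is None:
            continue
        result.setdefault(gene, set()).add(go_id)
    return {gene: sorted(ids) for gene, ids in result.items()}


# Hypothetical input; gene names are placeholders.
example = [
    ("gene-a", "embryonic_lethal", False),
    ("gene-a", "sterile", False),
    ("gene-b", "embryonic_lethal", True),       # "Not" association, skipped
    ("gene-c", "complex_phenotype_early_emb", False),  # no mapping, skipped
]
print(rnai_to_go(example))
```

Restricting the table to high-level terms keeps the conversion conservative, since vague or highly specific phenotype classes have no obvious GO counterpart.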
RNAi summary on gene page:
Sample detailed RNAi report:
Sample allele report:
Immediate future plans:
≈ Ontology:
≈ Define terms, further refine ontology
(expansion will be dictated by community
feedback and curation needs)
≈ Solicit more expert community feedback
≈ Web site:
≈ Enhance phenotype search tools
Ontology browser to be integrated
into WormBase:
http://elbrus.caltech.edu/cgi-bin/igor/ontology/ontology.cgi
WormBase = ~30 people, 4 centers
Cold Spring Harbor Laboratory: Payan Canaran, Jack Chen, Tristan Fiedler, Todd Harris, Sheldon McKay, Will Spooner, Lincoln Stein
Washington University at St. Louis: Tamberlyn Bieri, Darin Blasiar, Phil Ozersky, John Spieth
California Institute of Technology: Igor Antoshechkin, Carol Bastiani, Juancarlos Chan, Wen Chen, Ranjana Kishore, Raymond Lee, Hans-Michael Müller, Cecilia Nakamura, Andrei Petcherski, Gary Schindelman, Erich Schwarz, Paul Sternberg, Kimberly Van Auken, Daniel Wang, Xiaodong Wang
Wellcome Trust Sanger Institute: Paul Davis, Richard Durbin, Michael Han, Anthony Rogers, Mary Ann Tuli, Gary Williams
Other acknowledgements:
≈ NIH/NHGRI
≈ C. elegans research community