Hong - Gene Ontology Consortium

Manually curated and computationally predicted GO annotations at the Saccharomyces Genome Database
http://www.yeastgenome.org/
Eurie L. Hong, Ph.D.
Department of Genetics • Stanford University School of Medicine
Scientific community
Integrated data
Analysis tools
Data from traditional experiments
Data from high-throughput experiments
Genome sequence
CHS6/YJL099W Locus Summary Page
Nomenclature
Summary of published data
Curated data from published literature
Sequence Information
Links to other databases
Links to SGD tools and other databases
Data from high-throughput experiments
Accessing the data via files
ftp://ftp.yeastgenome.org/yeast/
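A minimal sketch of reading annotation rows from a downloaded file, assuming the file is a tab-delimited gene association file whose first nine columns follow the GO annotation file layout (database, object ID, symbol, qualifier, GO ID, reference, evidence code, with/from, aspect). The slides do not name a specific file, so the rows below are made-up placeholders rather than real SGD records, and `parse_gaf` is a hypothetical helper name.

```python
import csv
import io

# Names for the first nine columns of a gene association file (assumed layout).
GAF_COLUMNS = [
    "db", "db_object_id", "symbol", "qualifier", "go_id",
    "reference", "evidence", "with_from", "aspect",
]

def parse_gaf(handle):
    """Yield one dict per annotation row, skipping '!' comment lines."""
    reader = csv.reader(
        (line for line in handle if not line.startswith("!")),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,
    )
    for row in reader:
        if len(row) < len(GAF_COLUMNS):
            continue  # ignore blank or truncated rows
        yield dict(zip(GAF_COLUMNS, row))

# Placeholder rows for illustration only; not real SGD annotations.
SAMPLE = (
    "! example comment line\n"
    "SGD\tS000000001\tGENE1\t\tGO:0000001\tSGD_REF:1\tIDA\t\tC\n"
    "SGD\tS000000002\tGENE2\t\tGO:0000002\tSGD_REF:2\tIEA\t\tF\n"
)

annotations = list(parse_gaf(io.StringIO(SAMPLE)))
for a in annotations:
    print(a["symbol"], a["go_id"], a["evidence"], a["aspect"])
```

The same generator works on an opened file handle in place of the in-memory sample.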
Display of GO Annotations
Status of GO Annotations at SGD
All protein and RNA gene products have been annotated with GO terms
All GO annotations are manually curated from literature (no IEA)
Genes without published characterization data
Molecular Function: 2112 genes (33.6% of all genes)
Biological Process: 1448 genes (23.0% of all genes)
Cellular Component: 864 genes (13.7% of all genes)
from Genome Snapshot 8/23/2006
Sources of Computationally Predicted GO Annotations
1. InterPro domain matches in S. cerevisiae proteins
source: GOA project
2. Integrated analysis of multiple datasets
source: publications, external databases
CHS6/YJL099W Locus Summary Page
Identifying Types of GO Annotations
CHS6/YJL099W GO Annotation Page
Core GO Annotations
GO Annotations from Large Scale Experiments
Computationally Predicted GO Annotations
Changes to GO Term Finder
Current functionality
Specify background set
Refine annotations used by annotation source or evidence codes
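The two options above can be illustrated with a small enrichment sketch. It assumes a hypergeometric test, a common choice for term enrichment that the slides do not name, and it works on toy `(gene, term, evidence_code)` tuples. The function names, gene names, and term names are hypothetical and do not reflect the GO Term Finder interface.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when drawing n genes from N, of which K carry the term."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def enrichment(query, background, annotations, allowed_evidence):
    """Return {term: p-value}, counting only annotations whose evidence
    code is allowed and whose gene is in the chosen background set."""
    background = set(background)
    query = set(query) & background
    genes_by_term = {}
    for gene, term, evidence in annotations:
        if evidence in allowed_evidence and gene in background:
            genes_by_term.setdefault(term, set()).add(gene)
    return {
        term: hypergeom_pvalue(
            len(genes & query), len(query), len(genes), len(background)
        )
        for term, genes in genes_by_term.items()
        if genes & query
    }

# Toy data: T1 has two curated annotations and one predicted (IEA) one.
toy_annotations = [
    ("g1", "T1", "IDA"), ("g2", "T1", "IMP"),
    ("g3", "T1", "IEA"), ("g4", "T2", "IEA"),
]
background = ["g1", "g2", "g3", "g4", "g5", "g6"]

curated_only = enrichment(["g1", "g2"], background, toy_annotations, {"IDA", "IMP"})
with_iea = enrichment(["g1", "g2"], background, toy_annotations, {"IDA", "IMP", "IEA"})
```

Changing either the background set or the allowed evidence codes changes the counts that enter the test, so the same query list can give different p-values.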
Improving GO Annotations
Computationally predicted GO annotations
Manually curated GO annotations
1. Computational predictions may indicate publications that were overlooked
2. Review inconsistencies between computationally predicted and manually curated GO annotations to improve mappings and manually curated annotations
3. Review inconsistencies between computationally predicted and manually curated GO annotations to improve ontology
Additional Annotations Using InterPro2GO
Information added to genes with no published characterization data
Molecular Function: 468 genes
Biological Process: 316 genes
Cellular Component: 207 genes
from gene_association.goa_uniprot 7/2006
Preliminary Comparison: Cellular Component Annotations
5946 IEA
9059 IC+IDA+IEP+IGI+IMP+IPI+ISS+NAS+RCA+TAS
InterPro2GO annotation is ancestor of curated annotation: 43%
InterPro2GO annotation matches curated annotation: 15%
InterPro2GO annotation for an unknown: 4%
Other: 38%
  Shared parent is root term: 2%
  Shared parent is child of root term: 18%
  Other shared parent term: 18%
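The comparison categories can be sketched as a classification of one predicted term against one curated term in a small ontology graph. The slides do not give the exact rules used, so the logic below is an assumption: "shared parent" is taken to mean the most specific term that both annotations descend from. The toy ontology, term names, and function names are hypothetical.

```python
def ancestors(term, parents):
    """All terms reachable by following parent links, excluding the term itself."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.add(t)
            stack.extend(parents.get(t, ()))
    return seen

def classify(predicted, curated, parents, root, unknown=frozenset()):
    """Assign a predicted/curated term pair to one comparison category."""
    if curated in unknown:
        return "predicted annotation for an unknown"
    if predicted == curated:
        return "predicted matches curated"
    cur_anc = ancestors(curated, parents)
    if predicted in cur_anc:
        return "predicted is ancestor of curated"
    shared = (ancestors(predicted, parents) | {predicted}) & (cur_anc | {curated})
    if not shared:
        return "no shared term"
    # Keep only the most specific shared terms (not an ancestor of another one).
    specific = {
        t for t in shared
        if not any(t in ancestors(s, parents) for s in shared)
    }
    if specific == {root}:
        return "shared parent is root term"
    if all(root in parents.get(t, ()) for t in specific):
        return "shared parent is child of root term"
    return "other shared parent term"

# Toy ontology (child -> parents); not the real GO structure.
PARENTS = {
    "intracellular": ["root"], "membrane": ["root"], "unknown_cc": ["root"],
    "organelle": ["intracellular"], "cytoplasm": ["intracellular"],
    "nucleus": ["organelle"], "golgi": ["organelle"],
    "plasma_membrane": ["membrane"],
}

for pred, cur in [("golgi", "golgi"), ("organelle", "nucleus"),
                  ("nucleus", "plasma_membrane"), ("cytoplasm", "nucleus"),
                  ("nucleus", "golgi"), ("nucleus", "unknown_cc")]:
    print(pred, cur, "->", classify(pred, cur, PARENTS, "root", {"unknown_cc"}))
```

Tallying these labels over all predicted/curated pairs for a gene set would give a breakdown of the same shape as the percentages above.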
Summary
1. Currently, all GO annotations for S. cerevisiae gene products are manually curated from literature
2. SGD will incorporate computationally predicted GO annotations that will provide additional information for a gene product’s role in biology
3. Computationally predicted GO annotations will be used to refine and improve manually curated GO annotations at SGD