Harrisslides

Using The Gene Ontology:
Gene Product Annotation
GO Project Goals
• Compile structured vocabularies describing aspects of molecular biology
• Describe gene products using vocabulary terms (annotation)
• Develop tools:
  • to query and modify the vocabularies and annotations
  • annotation tools for curators
GO Data
GO provides two bodies of data:
• Terms with definitions and cross-references
• Gene product annotations with supporting data
The Three Ontologies
• Molecular Function: elemental activity or task
  e.g. nuclease, DNA binding, transcription factor
• Biological Process: broad objective or goal
  e.g. mitosis, signal transduction, metabolism
• Cellular Component: location or complex
  e.g. nucleus, ribosome, origin recognition complex
DAG Structure
Directed acyclic graph: each child
may have one or more parents
The True Path Rule
Every path from a node back to the
root must be biologically accurate
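A minimal sketch of what the DAG and the True Path Rule imply in practice: if every path to the root is biologically accurate, an annotation to a term also holds for all of that term's ancestors. The term names and parent links below are a toy graph assumed for illustration, not copied from the ontology files, and `ancestors` is a hypothetical helper.

```python
from typing import Dict, List, Set

# Toy child -> parents map. Links are illustrative assumptions only.
PARENTS: Dict[str, List[str]] = {
    "biological process": [],
    "cell cycle": ["biological process"],
    "mitosis": ["cell cycle"],
    "chromosome condensation": ["biological process"],
    # A child may have more than one parent (DAG, not a tree).
    "mitotic chromosome condensation": ["mitosis", "chromosome condensation"],
}


def ancestors(term: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Collect every term reachable by following parent links upward."""
    seen: Set[str] = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


# Under the True Path Rule, an annotation to the child term is also a
# valid statement about each of these ancestor terms.
implied = ancestors("mitotic chromosome condensation", PARENTS)
print(sorted(implied))
```

Because a term can have several parents, the traversal keeps a `seen` set so that shared ancestors are visited only once.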
GO Annotation
• Association between gene product and
applicable GO terms
• Provided by member databases
• Made by manual or automated methods
DAG Structure
Annotate to any level within DAG:
• mitosis: S.c. NNF1
• mitotic chromosome condensation: S.c. BRN1, D.m. barren
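Annotating at different levels affects what a query returns. The sketch below assumes that "mitotic chromosome condensation" is a child of "mitosis" (as the slide's diagram suggests) and that a query on a term should also return gene products annotated to its descendants. `gene_products_for` is a hypothetical helper, not part of any GO tool.

```python
from typing import Dict, List, Set

# Child -> parents; the single link here is an assumption for illustration.
PARENTS: Dict[str, List[str]] = {
    "mitosis": [],
    "mitotic chromosome condensation": ["mitosis"],
}

# Direct annotations, taken from the slide.
DIRECT: Dict[str, Set[str]] = {
    "mitosis": {"S.c. NNF1"},
    "mitotic chromosome condensation": {"S.c. BRN1", "D.m. barren"},
}


def gene_products_for(
    term: str,
    parents: Dict[str, List[str]],
    direct: Dict[str, Set[str]],
) -> Set[str]:
    """Return gene products annotated to the term or to any descendant."""
    found = set(direct.get(term, set()))
    for child, child_parents in parents.items():
        if term in child_parents:
            found |= gene_products_for(child, parents, direct)
    return found


broad = gene_products_for("mitosis", PARENTS, DIRECT)
narrow = gene_products_for("mitotic chromosome condensation", PARENTS, DIRECT)
print(sorted(broad))
print(sorted(narrow))
```

The broad query picks up all three gene products, while the narrow query returns only the two annotated to the more specific term.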
GO Annotation: Data
• Database object: gene or gene product
• GO term ID
• Reference: publication or computational method
• Evidence supporting annotation
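The four pieces of data above can be pictured as one record per annotation. This is a sketch under stated assumptions, not the consortium's file format: the class name, field names, and all example values (including the placeholder ID `GO:0000000`) are hypothetical. The only check made is that a GO term ID has the form `GO:` followed by seven digits.

```python
import re
from dataclasses import dataclass

# GO term IDs are written as "GO:" followed by seven digits.
GO_ID_PATTERN = re.compile(r"^GO:\d{7}$")


@dataclass(frozen=True)
class Annotation:
    """One gene product annotation (hypothetical record layout)."""

    db_object: str  # gene or gene product identifier
    go_id: str      # GO term ID
    reference: str  # publication or computational method
    evidence: str   # evidence supporting the annotation

    def __post_init__(self) -> None:
        if not GO_ID_PATTERN.match(self.go_id):
            raise ValueError(f"not a GO term ID: {self.go_id!r}")
        for name in ("db_object", "reference", "evidence"):
            if not getattr(self, name):
                raise ValueError(f"{name} must not be empty")


# Placeholder values only; none of these identify a real record.
example = Annotation(
    db_object="ExampleDB:gene1",
    go_id="GO:0000000",
    reference="Example reference",
    evidence="IDA",
)
print(example)
```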
GO Evidence Codes
IDA - Inferred from Direct Assay
IMP - Inferred from Mutant Phenotype
IGI - Inferred from Genetic Interaction
IPI - Inferred from Physical Interaction
IEP - Inferred from Expression Pattern
TAS - Traceable Author Statement
NAS - Non-traceable Author Statement
IC - Inferred by Curator
ISS - Inferred from Sequence or Structural Similarity
IEA - Inferred from Electronic Annotation
ND - Not Determined
GO Evidence Codes: Sources
• From primary literature: IDA, IMP, IGI, IPI, IEP
• From reviews or introductions: TAS, NAS
• Automated: IEA
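For filtering annotations by how they were made, the evidence codes can be kept in a small lookup table. The grouping by source below is a reading of the slide's layout (experimental codes from primary literature, author statements from reviews or introductions, IEA automated) and should be treated as an assumption; `source_of` is a hypothetical helper.

```python
from typing import Dict, Optional, Set

EVIDENCE_CODES: Dict[str, str] = {
    "IDA": "Inferred from Direct Assay",
    "IMP": "Inferred from Mutant Phenotype",
    "IGI": "Inferred from Genetic Interaction",
    "IPI": "Inferred from Physical Interaction",
    "IEP": "Inferred from Expression Pattern",
    "TAS": "Traceable Author Statement",
    "NAS": "Non-traceable Author Statement",
    "IC": "Inferred by Curator",
    "ISS": "Inferred from Sequence or Structural Similarity",
    "IEA": "Inferred from Electronic Annotation",
    "ND": "Not Determined",
}

# Assumed grouping, based on the labels shown on the slide.
SOURCE_GROUPS: Dict[str, Set[str]] = {
    "primary literature": {"IDA", "IMP", "IGI", "IPI", "IEP"},
    "reviews or introductions": {"TAS", "NAS"},
    "automated": {"IEA"},
}


def source_of(code: str) -> Optional[str]:
    """Return the source label for a code, or None if the slide gives none."""
    if code not in EVIDENCE_CODES:
        raise KeyError(f"unknown evidence code: {code}")
    for source, codes in SOURCE_GROUPS.items():
        if code in codes:
            return source
    return None


print(source_of("IMP"))
print(source_of("IEA"))
```

Codes without a label on the slide (IC, ISS, ND) return `None` here rather than being assigned a guessed source.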
GO Annotation: Methods
• Manual
• Automated
  • sequence similarity
  • transitive annotation
  • nomenclature, other text matching
Literature-Based Manual Annotation:
Experimental Evidence Codes
Lecoq, K., et al. (2001) YLR209C encodes Saccharomyces cerevisiae purine nucleoside phosphorylase. J. Bacteriol. 183(16): 4910-4913.
Experiment 1 - Purification and enzyme assay
Purified His-tagged Ylr209cp; can convert various nucleoside substrates to bases + Pi; inosine and guanosine are substrates
Experiment 2 - Knockout of YLR209C
Null mutant excretes inosine and guanosine into medium (compounds in medium separated by chromatography and identified by HPLC separation profiles)
Annotations from this paper:
• Experiment 1: evidence code IDA; FUNCTION: purine nucleoside phosphorylase
• Experiment 2: evidence code IMP; PROCESS: purine nucleoside catabolism
• This paper has no data for cellular component.
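The outcome of this worked example can be written down as two records and checked per ontology. This is only a sketch: the tuple layout and the helper `by_aspect` are hypothetical, and term names are used in place of GO IDs because the slides do not give the IDs.

```python
from typing import Dict, List, Tuple

ASPECTS = ("function", "process", "component")

# (gene, ontology, term name, evidence code), as read from the example.
annotations: List[Tuple[str, str, str, str]] = [
    ("YLR209C", "function", "purine nucleoside phosphorylase", "IDA"),
    ("YLR209C", "process", "purine nucleoside catabolism", "IMP"),
]


def by_aspect(
    records: List[Tuple[str, str, str, str]],
) -> Dict[str, List[Tuple[str, str]]]:
    """Group (term, evidence) pairs under each of the three ontologies."""
    grouped: Dict[str, List[Tuple[str, str]]] = {a: [] for a in ASPECTS}
    for _gene, aspect, term, evidence in records:
        grouped[aspect].append((term, evidence))
    return grouped


summary = by_aspect(annotations)
# Ontologies for which this paper supplied no annotation.
missing = [aspect for aspect in ASPECTS if not summary[aspect]]
print(summary)
print(missing)
```

The empty "component" entry mirrors the slide's note that this paper has no data for cellular component.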
Automated Annotation: InterPro Example
• Run InterProScan to link YFP and InterPro entry
• InterPro2go links InterPro entries and GO terms
• Infer GO term for YFP from the other two links
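The inference step amounts to joining two mappings: protein to InterPro entry (from InterProScan) and InterPro entry to GO term (from InterPro2go). The sketch below uses in-memory dictionaries with placeholder identifiers; it does not read real InterProScan output or the real InterPro2go file, and `infer_go_terms` is a hypothetical helper.

```python
from typing import Dict, Set

# Placeholder data. "IPR000000", "IPR999999" and "GO:0000000" are not real
# records; they only stand in for the identifiers each resource would supply.
protein_to_interpro: Dict[str, Set[str]] = {
    "YFP": {"IPR000000"},
    "other protein": {"IPR999999"},  # entry with no GO mapping below
}
interpro_to_go: Dict[str, Set[str]] = {
    "IPR000000": {"GO:0000000"},
}


def infer_go_terms(
    proteins: Dict[str, Set[str]],
    mapping: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Follow protein -> InterPro entry -> GO term and collect the GO terms."""
    inferred: Dict[str, Set[str]] = {}
    for protein, entries in proteins.items():
        terms: Set[str] = set()
        for entry in entries:
            terms |= mapping.get(entry, set())
        inferred[protein] = terms
    return inferred


inferred = infer_go_terms(protein_to_interpro, interpro_to_go)
print(inferred)
```

Annotations produced this way are automated, so they would carry the IEA evidence code rather than an experimental one.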
AmiGO Browser
• detailed view of term
• gene products annotated to term
GO Annotation: Contributors
• FlyBase
• Saccharomyces Genome Database
• Mouse Genome Informatics
• The Arabidopsis Information Resource
• Swiss-Prot/TrEMBL/InterPro
• WormBase
• DictyBase
• Gramene
• Compugen, Inc.
• Pathogen Sequencing Unit (Sanger Institute)
• PomBase (Sanger Institute)
• Rat Genome Database
• The Institute for Genomic Research
GO Annotation: Organisms
• Fruit fly (Drosophila melanogaster)
• Budding yeast (Saccharomyces cerevisiae)
• Fission yeast (Schizosaccharomyces pombe)
• Human (Homo sapiens)
• Mouse (Mus musculus)
• Rice (Oryza sativa)
• Rat (Rattus norvegicus)
• Tsetse fly (Glossina morsitans)
• Caenorhabditis elegans
• Arabidopsis thaliana
• Vibrio cholerae
• Dictyostelium discoideum
Current GO Annotations
www.geneontology.org
• FlyBase & Berkeley Drosophila Genome Project
• Saccharomyces Genome Database
• Mouse Genome Informatics
• The Arabidopsis Information Resource
• Swiss-Prot/TrEMBL/InterPro
• WormBase
• DictyBase
• Gramene
• Compugen, Inc.
• Pathogen Sequencing Unit (Sanger Institute)
• PomBase (Sanger Institute)
• Rat Genome Database
• Genome Knowledge Base (CSHL)
• The Institute for Genomic Research
The Gene Ontology Consortium is supported
by NHGRI grant HG02273 (R01). The Gene
Ontology project thanks AstraZeneca for
financial support. The Stanford group
acknowledges a gift from Incyte Genomics.
Conference:
Standards and Ontologies for
Functional Genomics (SOFG)
Towards unified ontologies for describing biology
and biomedicine
17 – 20 November 2002
Hinxton Hall Conference Centre
Hinxton, Cambridge, UK
www.wellcome.ac.uk/hinxton/sofg