Protein Ontology (PRO)


User Community Interactions
- Impact on PRO
Protein Ontology 3rd Annual Meeting (2010)
Darren A. Natale, Ph.D.
Protein Science Associate Team Lead, PIR
Research Assistant Professor, GUMC
PRO Users (Current & Potential)
- a few examples
Ontology Providers: Dendritic Cell Ontology (DC_CL)
Semantic Resources: Royal Society of Chemistry (RSC); Science Collaboration Framework; Semantic Web Applications in Neuromedicine (SWAN)
Process-Modeling Resources: Reactome, MouseCyc; EcoCyc; Pathway Logic
Molecule-Modeling Resources: Database of Protein Disorder (DisProt); International Union of Basic and Clinical Pharmacology (IUPHAR)
Ontology Providers
- Dendritic Cell Ontology 
Describes different types of cells, sometimes using
surface-expressed proteins to distinguish them
[Term]
id: DC_CL:0000003
name: conventional dendritic cell
def: "conventional dendritic cell is_a leukocyte that
has_high_plasma_membrane_amount_relative_to_leukocyte CD11c and
lacks_plasma_membrane_part CD19, CD3, CD34, and CD56." [AMM:amm]
comment: Immunological Reviews 2007 219: 118-142
intersection_of: CL:0000738 ! leukocyte
intersection_of: has_high_plasma_membrane_amount_relative_to_leukocyte PRO:000001013 ! CD11c
intersection_of: lacks_plasma_membrane_part PRO:000001002 ! CD19
intersection_of: lacks_plasma_membrane_part PRO:000001003 ! CD34
intersection_of: lacks_plasma_membrane_part PRO:000001024 ! CD56
intersection_of: lacks_plasma_membrane_part PRO:000001027 ! CD3
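To make the logical definition above concrete, here is a minimal Python sketch that groups the `intersection_of` lines of the stanza by relation, which shows how the cell type depends on PRO terms. The parser is a hand-rolled assumption for this one stanza, not a general OBO reader, and the `is_a` key used for the genus is only a label chosen for this illustration.

```python
from collections import defaultdict

# The stanza from the slide (def and comment lines omitted for brevity).
STANZA = """\
[Term]
id: DC_CL:0000003
name: conventional dendritic cell
intersection_of: CL:0000738 ! leukocyte
intersection_of: has_high_plasma_membrane_amount_relative_to_leukocyte PRO:000001013 ! CD11c
intersection_of: lacks_plasma_membrane_part PRO:000001002 ! CD19
intersection_of: lacks_plasma_membrane_part PRO:000001003 ! CD34
intersection_of: lacks_plasma_membrane_part PRO:000001024 ! CD56
intersection_of: lacks_plasma_membrane_part PRO:000001027 ! CD3
"""


def parse_intersections(stanza):
    """Group intersection_of targets by relation.

    A line with no relation names the genus; it is filed under 'is_a'.
    """
    grouped = defaultdict(list)
    for line in stanza.splitlines():
        tag, sep, value = line.partition(": ")
        if tag != "intersection_of" or not sep:
            continue
        value = value.split(" ! ")[0].strip()  # drop the trailing name comment
        parts = value.split()
        if len(parts) == 1:
            grouped["is_a"].append(parts[0])
        else:
            grouped[parts[0]].append(parts[1])
    return dict(grouped)


relations = parse_intersections(STANZA)
# Every PRO term this cell-type definition depends on.
pro_ids = sorted(
    target
    for targets in relations.values()
    for target in targets
    if target.startswith("PRO:")
)
```

Listing `pro_ids` is one simple way to see which PRO terms a downstream ontology such as DC_CL relies on.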
PRO Content
- before
Focused on mouse and human proteins
Strictly species-neutral
Small-scale
Strictly single-molecule
Protein forms were highly specific
No derivation information
Semantic Resources
- Royal Society of Chemistry 
Seeks to tag chemistry articles with links to further
information about the tagged entities, including
proteins.
Example: p21ras, heregulin (plus many others that PRO
lacked).
Highlighted the need for a large-scale expansion of terms.
PRO Content
- after
Focused on mouse and human proteins
Strictly species-neutral
Small-scale
Strictly single-molecule
Protein forms were highly specific
No derivation information
Semantic Resources
- Science Collaboration Framework 
Contains linkouts to model organisms, genes, cell lines,
and – most pertinent to PRO – antibodies against
proteins
Example: antibodies against HSP70 proteins, which might
be cross-reactive against several species, but might also
react only against HSP70 from a single species.
Highlighted the need to add species-specific terms to PRO.
This required the addition of NCBI taxon terms.
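The antibody use case can be sketched as follows: a cross-reactive antibody is attached to the species-neutral term, while an antibody that reacts with a single species is attached to the species-specific term. This is only an illustration under stated assumptions. The class, the function, and the `EX:` identifiers are hypothetical placeholders and not PRO terms; the two NCBI Taxonomy IDs (9606 for human, 10090 for mouse) are the only real identifiers used.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set

HUMAN, MOUSE = 9606, 10090  # NCBI Taxonomy IDs


@dataclass(frozen=True)
class ProteinTerm:
    term_id: str                    # hypothetical placeholder, not a real PRO ID
    name: str
    taxon_id: Optional[int] = None  # None marks a species-neutral term


def term_for_antibody(neutral: ProteinTerm,
                      by_taxon: Dict[int, ProteinTerm],
                      reactive_taxa: Set[int]) -> ProteinTerm:
    """Pick the most specific term that still covers everything the antibody binds."""
    if len(reactive_taxa) == 1:
        (taxon,) = reactive_taxa
        return by_taxon.get(taxon, neutral)
    # Cross-reactive (or unknown reactivity): fall back to the species-neutral term.
    return neutral


hsp70 = ProteinTerm("EX:0000001", "HSP70")
hsp70_by_taxon = {
    HUMAN: ProteinTerm("EX:0000002", "HSP70 (human)", HUMAN),
    MOUSE: ProteinTerm("EX:0000003", "HSP70 (mouse)", MOUSE),
}

human_only = term_for_antibody(hsp70, hsp70_by_taxon, {HUMAN})
cross_reactive = term_for_antibody(hsp70, hsp70_by_taxon, {HUMAN, MOUSE})
print(human_only.name)
print(cross_reactive.name)
```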
PRO Content
- after
Focused on mouse and human proteins
Strictly species-neutral
Small-scale
Strictly single-molecule
Protein forms were highly specific
No derivation information
Semantic Resources
- AlzSWAN 
Models the proteins involved in the etiology of
Alzheimer disease, and in particular the proteolytic
cleavage products of amyloid beta A4 protein.
Example: amyloid beta A4 protein is cleaved into a
number of pieces, but not all isoforms can give rise to
all pieces.
Highlighted the need for the ability to indicate precursors for
certain proteolytic cleavage products.
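One way to picture the precursor requirement is a mapping from each cleavage product to the isoforms it can be derived from. The sketch below assumes such a mapping; the isoform and fragment names are placeholders, not actual amyloid beta A4 isoforms or fragments.

```python
from typing import Dict, FrozenSet, List

# cleavage product -> isoforms it can be derived from (placeholder names)
DERIVES_FROM: Dict[str, FrozenSet[str]] = {
    "fragment 1": frozenset({"isoform A", "isoform B"}),
    "fragment 2": frozenset({"isoform A"}),
}


def products_of(isoform: str) -> List[str]:
    """All cleavage products that the given isoform can give rise to."""
    return sorted(
        product
        for product, precursors in DERIVES_FROM.items()
        if isoform in precursors
    )


def can_derive(product: str, isoform: str) -> bool:
    """True if the product lists the isoform as a possible precursor."""
    return isoform in DERIVES_FROM.get(product, frozenset())


print(products_of("isoform A"))
print(products_of("isoform B"))
```

Recording the link on the product side keeps the statement "not all isoforms can give rise to all pieces" explicit: a product simply omits the isoforms that cannot produce it.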
PRO Content
- after
Focused on mouse and human proteins
Strictly species-neutral
Small-scale
Strictly single-molecule
Protein forms were highly specific
No derivation information
Process Resources
- Reactome, MouseCyc 
Components may be protein complexes whose
functions depend on modification state, or could have
different components depending on the species
Example: CDK2/cyclin A is activated by phosphorylation of
T160 while T14 phosphorylation inactivates the kinase.
Example: Mediator complex in humans has 36 subunits
while yeast has 21.
Highlighted the need for specific complexes with respect to
both species and modification state.
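The CDK2/cyclin A example suggests that a complex term needs to carry both a species and a modification state. A minimal sketch, assuming a hypothetical `Cdk2CyclinAForm` class, that the phosphosites are recorded on the CDK2 subunit, and that T14 phosphorylation overrides T160 when both are present (a simplification made for this illustration):

```python
from dataclasses import dataclass
from typing import FrozenSet

HUMAN = 9606  # NCBI Taxonomy ID


@dataclass(frozen=True)
class Cdk2CyclinAForm:
    taxon_id: int
    cdk2_phosphosites: FrozenSet[str] = frozenset()

    @property
    def is_active(self) -> bool:
        # Assumption: active only with T160 phosphorylated and T14 unphosphorylated.
        sites = self.cdk2_phosphosites
        return "T160" in sites and "T14" not in sites


unmodified = Cdk2CyclinAForm(HUMAN)
activated = Cdk2CyclinAForm(HUMAN, frozenset({"T160"}))
inhibited = Cdk2CyclinAForm(HUMAN, frozenset({"T160", "T14"}))

for form in (unmodified, activated, inhibited):
    print(sorted(form.cdk2_phosphosites), form.is_active)
```

Because the dataclass is frozen and compares by value, the three forms are distinct entities even though they share species and subunit composition, which is the distinction the slide asks for.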
PRO Content
- after
Focused on mouse and human proteins
Strictly species-neutral
Small-scale
Strictly single-molecule
Protein forms were highly specific
No derivation information
Process Resources
- EcoCyc 
A rich source of protein forms and pathway information,
similar to MouseCyc, except focused on Escherichia
coli
Example: NarQ Nitrate/Nitrite-Dependent Two-Component
Regulatory System (not present in eukaryotes).
Reaction: ATP + NarQ -> ADP + NarQ phosphohistidine
Highlighted the need for bacterial protein representation.
PRO Content
- after
Focused on mouse and human proteins
Strictly species-neutral
Small-scale
Strictly single-molecule
Protein forms were highly specific
No derivation information
Process Resources
- Pathway Logic 
Contains pathway descriptions whose components
were known to be the product of a certain gene and
known to be modified, but the position of the
modification and the splice form were not known or
were not important
Example: phosphorylated BMX protein in EGF-mediated
signaling pathway.
Highlighted the need for general modified forms along the
lines of protein -> phosphorylated protein.
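The "protein -> phosphorylated protein" idea can be sketched as a chain of increasingly general terms, where an annotation uses the most specific term the available knowledge supports. The function name and the term labels below are assumptions for illustration, not PRO names or identifiers.

```python
from typing import List, Optional


def is_a_chain(gene: str,
               modification: Optional[str] = None,
               site: Optional[str] = None) -> List[str]:
    """Return term labels from most specific to most general.

    A site is only meaningful when a modification is given, so it is
    ignored otherwise.
    """
    chain = [gene]
    if modification is not None:
        chain.insert(0, f"{modification} {gene}")
        if site is not None:
            chain.insert(0, f"{modification} {gene} at {site}")
    return chain


# Pathway Logic case: BMX is known to be phosphorylated, position unknown.
print(is_a_chain("BMX", "phosphorylated")[0])
# If a position were known (placeholder label), a more specific term would apply.
print(is_a_chain("BMX", "phosphorylated", "site X"))
```

With only the gene and the modification type known, the first element of the chain is the general modified form, which is the kind of term Pathway Logic needed.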
PRO Content
- after
Focused on mouse and human proteins
Strictly species-neutral
Small-scale
Strictly single-molecule
Protein forms were highly specific
No derivation information
BFO
SO
MOD (conversion to ChEBI is under review)
GO
ChEBI
OBI
NCBI taxon
Relationship between UniProt & PRO
Precise protein form (species-specific) – isoform 1 of human Smad2 phosphorylated on serine 144
Not yet found anywhere as an independent entry, though PRO will contain such forms; sometimes described in UniProtKB within a larger entry
Precise protein form (species-neutral) – isoform 1 of Smad2 from any species phosphorylated on the serine analogous to serine 144 of human Smad2 isoform 1
Currently covered by PRO
All protein products from a particular gene (species-specific)
“Whenever possible, all the protein products encoded by one gene in a given species are described in a single UniProtKB/Swiss-Prot entry”
Currently covered by UniProtKB; will need to be included in or referred to by PRO
How to do it?
All protein products from a particular gene (species-neutral)
Currently covered by PRO