tutorial7_09

Tutorial 7
Protein and Function Databases
Today’s menu:
-UniProt - SwissProt/TrEMBL
-PROSITE
-Pfam
-Gene Ontology
Hypothetical proteins
Characterized proteins
UniProt
http://www.uniprot.org/
The Universal Protein Resource (UniProt) is a central
repository of protein sequence, function, classification
and cross-reference data. It was created by joining the
information contained in Swiss-Prot and TrEMBL.
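The two sections can be told apart in a UniProtKB FASTA header: it starts with "sp" for Swiss-Prot (reviewed) entries and "tr" for TrEMBL (unreviewed) entries. A minimal sketch of reading such a header follows; the function name parse_uniprot_header is made up for illustration, and only the most common header layout is assumed.

```python
import re

# UniProtKB FASTA headers look like:
#   >sp|ACCESSION|ENTRY_NAME Protein name OS=Organism ...
# "sp" marks Swiss-Prot (reviewed), "tr" marks TrEMBL (unreviewed).
HEADER = re.compile(r"^>(sp|tr)\|([^|]+)\|(\S+)\s*(.*)$")
SECTIONS = {"sp": "Swiss-Prot", "tr": "TrEMBL"}


def parse_uniprot_header(line):
    """Split a UniProtKB FASTA header into its main fields (illustrative helper)."""
    match = HEADER.match(line.strip())
    if match is None:
        raise ValueError("not a UniProtKB FASTA header: " + line)
    db, accession, entry_name, rest = match.groups()
    # The protein name runs up to the first " OS=" tag, if there is one.
    description = rest.split(" OS=")[0]
    return {
        "section": SECTIONS[db],
        "accession": accession,
        "entry_name": entry_name,
        "description": description,
    }


example = (">sp|P69905|HBA_HUMAN Hemoglobin subunit alpha "
           "OS=Homo sapiens OX=9606 GN=HBA1 PE=1 SV=2")
record = parse_uniprot_header(example)
print(record["section"], record["accession"], record["description"])
```

A header starting with "tr" would be reported as TrEMBL in the same way.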
Pfam
• http://pfam.sanger.ac.uk/
• Pfam is a database of multiple
alignments of protein domains
or conserved protein regions.
One more example
ls mode: a hit is reported if it globally aligns to the seed
fs mode: a hit is reported if it locally aligns to the seed
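The difference between the two modes can be pictured with a toy rule: a global (ls) hit must cover the model from its first to its last position, while a local (fs) hit may cover only part of it. This is only a sketch of the idea; the function name and the 1-based model coordinates are assumptions, and real searches also depend on scores and thresholds.

```python
def reported_in(model_start, model_end, model_length):
    """Return the modes in which a hit could be reported (toy rule).

    model_start / model_end are assumed to be 1-based positions of the
    aligned region on the model; model_length is the model's full length.
    """
    modes = ["fs"]  # a local (fragment) match is enough for fs mode
    if model_start == 1 and model_end == model_length:
        modes.append("ls")  # ls mode needs the whole model to be aligned
    return modes


print(reported_in(1, 100, 100))   # full-length match
print(reported_in(20, 80, 100))   # fragment match
```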
Description
Structure info
Gene Ontology
Links
What kind of domains can we find
in Pfam?
Trusted Domains
Repeats and Motifs
Fragment Domains
Nested Domains
Disulfide bonds
Important residues
(e.g. active sites)
Transmembrane domains
Context domains: domains that, despite not
scoring above the family threshold, are expected
to be real based on the other domains found in
the protein
Signal peptides
(indicate a protein that will be secreted)
Low complexity regions
Coiled coils
(two or three alpha helices that
wind around each other)
PROSITE
• http://www.expasy.org/tools/scanprosite
• PROSITE is a database of
protein domains and motifs that
can be searched with either
regular expression patterns or
sequence profiles.
Search Results
Domain architecture
PRATT
Make a pattern from FASTA format sequences
http://www.expasy.ch/tools/pratt/
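To get a feeling for what "making a pattern" means, here is a much simpler toy than PRATT itself: it assumes the sequences are already aligned and of equal length, and writes one PROSITE-style element per column. The function name pattern_from_aligned and the max_set parameter are made up for this sketch; PRATT works on unaligned sequences and uses its own search procedure.

```python
def pattern_from_aligned(seqs, max_set=2):
    """Build a PROSITE-style pattern from equal-length, pre-aligned sequences.

    Toy rule per column: one residue -> that letter; up to max_set
    different residues -> [..] set; anything more variable -> x.
    """
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must all have the same length")
    elements = []
    for column in zip(*seqs):
        residues = sorted(set(column))
        if len(residues) == 1:
            elements.append(residues[0])
        elif len(residues) <= max_set:
            elements.append("[" + "".join(residues) + "]")
        else:
            elements.append("x")
    return "-".join(elements)


print(pattern_from_aligned(["ACDA", "AGDA"]))          # A-[CG]-D-A
print(pattern_from_aligned(["ACDA", "AGDA", "ATDA"]))  # A-x-D-A
```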
PRATT
Greed, Overlap and Include
Search A-x(1,3)-A on ABACADAEAFA
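This search can be tried out with ordinary regular expressions. The sketch below assumes that "greedy" corresponds to greedy regex quantifiers and that "overlap" means reporting a match from every possible start position. The helper names prosite_to_regex and scan are made up for illustration, only a small part of the pattern syntax is handled, and the "include" option is not modeled, so the real tool may behave differently in detail.

```python
import re

# One pattern element: x, a residue, [allowed] or {forbidden},
# optionally followed by a repeat count such as (3) or (1,3).
ELEMENT = re.compile(r"(x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?$")


def prosite_to_regex(pattern, greedy=True):
    """Translate a simple PROSITE pattern into a Python regex string."""
    parts = []
    for element in pattern.split("-"):
        match = ELEMENT.match(element)
        if match is None:
            raise ValueError("unsupported element: " + element)
        residue, low, high = match.groups()
        if residue == "x":
            residue = "."                          # any residue
        elif residue.startswith("{"):
            residue = "[^" + residue[1:-1] + "]"   # none of these residues
        if low is not None:
            residue += "{" + low + ("," + high if high else "") + "}"
            if high and not greedy:
                residue += "?"                     # shortest match first
        parts.append(residue)
    return "".join(parts)


def scan(pattern, sequence, greedy=True, overlap=True):
    """Return the matched substrings of sequence."""
    regex = prosite_to_regex(pattern, greedy)
    if overlap:
        # A lookahead lets a match start at every position.
        return [m.group(1) for m in re.finditer("(?=(" + regex + "))", sequence)]
    return [m.group(0) for m in re.finditer(regex, sequence)]


seq = "ABACADAEAFA"
print(scan("A-x(1,3)-A", seq, greedy=True, overlap=False))   # ['ABACA', 'AEAFA']
print(scan("A-x(1,3)-A", seq, greedy=False, overlap=False))  # ['ABA', 'ADA', 'AFA']
print(scan("A-x(1,3)-A", seq, greedy=True, overlap=True))
# ['ABACA', 'ACADA', 'ADAEA', 'AEAFA', 'AFA']
```

With greedy matching each hit stretches x(1,3) as far as it can; without overlap the search resumes only after the end of the previous hit, which is why fewer hits are reported.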
Gene Ontology (GO)
• It is a database of biological processes,
molecular functions and cellular components.
• GO contains neither sequence information nor gene
or protein descriptions.
• GO is linked to gene and protein databases.
• The GO database is structured hierarchically, much like a tree
Three principal branches
http://www.geneontology.org/amigo/
The GO structure is a
Directed Acyclic Graph (DAG):
a term can have more than one parent
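The practical consequence of a graph rather than a strict tree is that one term can be reached from the root by several paths. A small sketch with a toy graph follows; the four terms and their links are chosen for illustration only and are not read from the GO database, and the function name ancestors is an assumption.

```python
# Toy fragment of a GO-like graph: each term maps to its parent terms.
# Illustrative only; not loaded from the real ontology.
PARENTS = {
    "biological_process": [],
    "metabolic process": ["biological_process"],
    "cellular process": ["biological_process"],
    # Two parents: this is what makes the structure a DAG, not a tree.
    "cellular metabolic process": ["metabolic process", "cellular process"],
}


def ancestors(term, parents=PARENTS):
    """Collect every term reachable by following parent links upwards."""
    seen = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


print(sorted(ancestors("cellular metabolic process")))
```

A gene annotated with the child term is implicitly annotated with all of these ancestors.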
Important: note the source
of each GO entry
GO sources
ISS: Inferred from Sequence/Structural Similarity
IDA: Inferred from Direct Assay
IPI: Inferred from Physical Interaction
TAS: Traceable Author Statement
NAS: Non-traceable Author Statement
IMP: Inferred from Mutant Phenotype
IGI: Inferred from Genetic Interaction
IEP: Inferred from Expression Pattern
IC: Inferred by Curator
ND: No Data available
IEA: Inferred from Electronic Annotation
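One way to act on the source of an annotation is to keep only entries backed by an experiment. A minimal sketch follows; the annotation tuples (gene, term, evidence code) are invented example data, and the grouping of IDA, IPI, IMP, IGI and IEP as experimental codes follows the usual GO classification.

```python
# The evidence codes listed above.
EVIDENCE_CODES = {
    "ISS": "Inferred from Sequence/Structural Similarity",
    "IDA": "Inferred from Direct Assay",
    "IPI": "Inferred from Physical Interaction",
    "TAS": "Traceable Author Statement",
    "NAS": "Non-traceable Author Statement",
    "IMP": "Inferred from Mutant Phenotype",
    "IGI": "Inferred from Genetic Interaction",
    "IEP": "Inferred from Expression Pattern",
    "IC": "Inferred by Curator",
    "ND": "No Data available",
    "IEA": "Inferred from Electronic Annotation",
}

# Codes that rest on an experiment.
EXPERIMENTAL = {"IDA", "IPI", "IMP", "IGI", "IEP"}


def experimental_only(annotations):
    """Keep (gene, term, code) tuples whose code is experimental."""
    return [a for a in annotations if a[2] in EXPERIMENTAL]


# Invented example annotations, for illustration only.
annotations = [
    ("geneA", "term 1", "IDA"),
    ("geneA", "term 2", "IEA"),
    ("geneB", "term 1", "ISS"),
    ("geneB", "term 3", "IMP"),
]

for gene, term, code in experimental_only(annotations):
    print(gene, term, code, "=", EVIDENCE_CODES[code])
```

IEA entries are assigned automatically without curator review, so they are the first to be filtered out when reliability matters.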