Day 2: Protein Sequence Analysis
1. Physico-chemical properties.
2. Cellular localization.
3. Signal peptides.
4. Transmembrane domains.
5. Post-translational modifications.
6. Motifs & domains.
7. Secondary structure.
8. Other resources.
ExPASy (Expert Protein Analysis System)
Swiss Institute of Bioinformatics (SIB).
Dedicated to the analysis of protein sequences and structures.
Many of the programs for protein sequence analysis can be accessed
via ExPASy.
1) Physico-chemical properties:
ProtParam tool
o molecular weight
o theoretical pI (the pH at which the protein carries no net electrical charge)
o amino acid composition
o atomic composition
o extinction coefficient
o estimated half-life
o instability index
o aliphatic index
o grand average of hydropathicity (GRAVY)
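Several of these values follow directly from the amino acid composition. The following is a minimal sketch, not ProtParam's own code: it assumes the Kyte-Doolittle hydropathy scale for GRAVY and the usual aliphatic index formula (mole percent Ala + 2.9 x Val + 3.9 x (Ile + Leu)). The function names are illustrative only.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def composition(seq):
    """Return the mole percent of each residue present in seq."""
    seq = seq.upper()
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}

def gravy(seq):
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    seq = seq.upper()
    return sum(KD[aa] for aa in seq) / len(seq)

def aliphatic_index(seq):
    """Relative volume occupied by aliphatic side chains (Ala, Val, Ile, Leu)."""
    pct = composition(seq)
    return (pct.get("A", 0.0)
            + 2.9 * pct.get("V", 0.0)
            + 3.9 * (pct.get("I", 0.0) + pct.get("L", 0.0)))
```

A positive GRAVY suggests an overall hydrophobic protein and a negative one a hydrophilic protein. The sketch only handles the 20 standard one-letter codes and an unknown residue raises a KeyError.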
2) Cellular localization:
Proteins destined for particular subcellular localizations have distinct
amino acid properties, particularly in their N-terminal regions.
These properties can be used to predict whether a protein is localized in
the cytoplasm, nucleus or mitochondria, is retained in the ER, or is
destined for the lysosome (vacuole) or the peroxisome.
PSORT
The end of the output gives the percentage likelihood of each subcellular
localization.
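As a toy illustration of sequence-based sorting signals, the sketch below checks for two well-known C-terminal signals: the KDEL/HDEL ER retention signal and the SKL peroxisomal targeting signal (PTS1). This is not how PSORT works internally. The function name and the labels it returns are made up for this example, and a real predictor weighs many more features.

```python
def c_terminal_signals(seq):
    """Return a list of simple C-terminal sorting signals found in seq."""
    seq = seq.upper()
    hits = []
    # KDEL (and HDEL in yeast) at the extreme C-terminus marks ER retention.
    if seq.endswith(("KDEL", "HDEL")):
        hits.append("ER retention (KDEL/HDEL-like)")
    # A C-terminal SKL tripeptide is the classic peroxisomal signal (PTS1).
    if seq.endswith("SKL"):
        hits.append("peroxisome (PTS1-like SKL)")
    return hits
```

An empty list means only that neither of these two signals is present, not that the protein is cytoplasmic.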
3) Signal peptides:
Proteins destined for secretion or for operation within the endoplasmic
reticulum or lysosomes, and many transmembrane proteins, are
synthesized with leading (N-terminal) signal peptides of 13 to 36 residues.
The SignalP WWW server can be used to predict the presence and location
of signal peptide cleavage sites in your proteins.
It is useful to know whether your protein has a signal peptide, as this
indicates that it may be secreted from the cell.
Proteins in their active form will have had their signal peptides removed.
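A very crude screen for a possible signal peptide looks for a hydrophobic core near the N-terminus. The sketch below is only a heuristic under stated assumptions and is not SignalP's method: the hydrophobic residue set and the minimum core length of 7 are arbitrary choices for illustration, and the 36-residue window is taken from the upper length quoted above. It does not predict the cleavage site.

```python
# Residues treated as hydrophobic for this heuristic (an assumption).
HYDROPHOBIC = set("AVLIMFWC")

def longest_hydrophobic_run(region):
    """Length of the longest uninterrupted stretch of hydrophobic residues."""
    best = run = 0
    for aa in region:
        run = run + 1 if aa in HYDROPHOBIC else 0
        best = max(best, run)
    return best

def looks_like_signal_peptide(seq, n_region=36, min_core=7):
    """True if seq starts with Met and has a hydrophobic core in its N-terminus."""
    seq = seq.upper()
    core = longest_hydrophobic_run(seq[:n_region])
    return seq.startswith("M") and core >= min_core
```

For example, `looks_like_signal_peptide("MKKLLLALLLAVSAQAADE")` returns True because of the nine-residue hydrophobic stretch, whereas a highly charged N-terminus returns False.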
4) Transmembrane domains:
The TMpred program predicts membrane-spanning regions
and their orientation.
The algorithm is based on the statistical analysis of TMbase, a database of
naturally occurring transmembrane proteins.
The presence of transmembrane domains indicates that the protein is
membrane-bound, for example located on the cell surface.
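TMpred relies on statistics derived from TMbase, which cannot be reproduced here. As a simpler stand-in, the sketch below uses a Kyte-Doolittle sliding-window hydropathy scan. The window of 19 residues and the threshold of 1.6 are the values commonly quoted for that method and are assumptions for this example. It reports candidate segments only and says nothing about orientation.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the sequence."""
    scores = [KD[aa] for aa in seq.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]

def candidate_tm_segments(seq, window=19, threshold=1.6):
    """Return merged (start, end) spans, 0-based and half-open,
    covered by windows whose mean hydropathy reaches the threshold."""
    segments = []
    for i, avg in enumerate(hydropathy_profile(seq, window)):
        if avg >= threshold:
            start, end = i, i + window
            if segments and start <= segments[-1][1]:
                # Overlaps or touches the previous span: extend it.
                segments[-1] = (segments[-1][0], end)
            else:
                segments.append((start, end))
    return segments
```

A sequence shorter than the window yields no segments. Merged spans are usually somewhat longer than the true membrane-spanning region because flanking windows also pass the threshold.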
5) Post-translational modifications:
After translation has occurred, proteins may undergo a number of
post-translational modifications.
These can include cleavage of the pro-region to release the active
protein, removal of the signal peptide, and numerous covalent
modifications such as acetylation, glycosylation, hydroxylation,
methylation and phosphorylation.
Post-translational modifications may alter the molecular weight of your
protein and thus its position on a gel.
Many programs are available for predicting the presence of
post-translational modifications. We will take a look at one for the
prediction of mucin-type O-glycosylation sites in mammalian proteins.
These programs work by looking for consensus sites; just because a
site is found does not mean that a modification definitely occurs.
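To make the idea of a consensus site concrete, the sketch below scans for the N-glycosylation sequon N-X-S/T (X is any residue except proline). This example was chosen because it has a simple, well-defined consensus; O-glycosylation does not, which is why dedicated predictors are used for it. The function name is illustrative.

```python
import re

# Lookahead so that overlapping sequons (e.g. N-N-T-S) are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")

def n_glyc_candidates(seq):
    """Return 1-based positions of Asn residues in N-X-[S/T] sequons (X != Pro)."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]
```

Every position returned is only a candidate: a match shows that the consensus is present, not that the site is actually glycosylated in the cell.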
6) Motifs and Domains:
Motifs and domains give you information on the function of your
protein.
Search the protein against one of the motif or profile databases.
One option is ProfileScan, which allows you to search both the PROSITE
and Pfam databases simultaneously.
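PROSITE patterns are written in a compact syntax that maps closely onto regular expressions. The sketch below converts a simple pattern into a Python regex. It is an illustrative converter covering only the common elements (x, [..], {..}, repeat counts and the < and > anchors), not a full implementation of the PROSITE syntax, and it has nothing to do with how ProfileScan handles profiles.

```python
import re

def prosite_to_regex(pattern):
    """Convert a simple PROSITE pattern such as 'N-{P}-[ST]-{P}' to a regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        anchor_start = element.startswith("<")   # '<' anchors to the N-terminus
        anchor_end = element.endswith(">")       # '>' anchors to the C-terminus
        element = element.strip("<>")
        # Split an optional repeat count, e.g. 'x(4)' or 'x(2,4)'.
        m = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, repeat = m.group(1), m.group(2)
        if core == "x":                          # any residue
            regex = "."
        elif core.startswith("{"):               # excluded residues
            regex = "[^" + core[1:-1] + "]"
        else:                                    # single letter or [allowed set]
            regex = core
        if repeat:
            regex += "{" + repeat + "}"
        parts.append(("^" if anchor_start else "") + regex + ("$" if anchor_end else ""))
    return "".join(parts)

# Example: the ATP/GTP-binding site motif A (P-loop) pattern.
p_loop = re.compile(prosite_to_regex("[AG]-x(4)-G-K-[ST]"))
hit = p_loop.search("MTEYKLVVVGAGGVGKSALT")
```

Here `hit.group()` is the matched stretch of the test sequence. Short patterns like this match by chance fairly often, so a hit is a hint about function rather than proof.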
7) Secondary Structure Prediction:
WHY:
– If protein structure, even secondary structure, can be accurately
predicted from the now abundantly available gene and protein
sequences, those sequences become far more valuable for drug
design and for understanding the genetic basis of disease, the
role of protein structure in enzymatic, structural and signal
transduction functions, and basic physiology from the molecular to
the cellular to the fully systemic level.
JPRED works by combining a number of modern, high-quality
prediction methods to form a consensus.
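The idea of forming a consensus can be sketched as a per-residue majority vote over the outputs of several predictors. This is a simplified illustration and not JPRED's actual combination scheme. The state letters (H for helix, E for strand, C for coil) and the rule that ties fall back to coil are assumptions made for this example.

```python
from collections import Counter

def consensus(predictions, tie="C"):
    """Per-residue majority vote over equal-length H/E/C prediction strings."""
    lengths = {len(p) for p in predictions}
    if len(lengths) != 1:
        raise ValueError("all predictions must have the same length")
    result = []
    for column in zip(*predictions):
        ranked = Counter(column).most_common()
        top_state, top_count = ranked[0]
        # If two states share the highest count, fall back to the tie state.
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        result.append(tie if tied else top_state)
    return "".join(result)
```

For three hypothetical predictions "HHHEC", "HHCEC" and "HCCEE", the function returns "HHCEC". Using an odd number of methods reduces the number of ties.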
Essentially, protein secondary structure consists of 3 major
conformations:
o α-helix
o β-pleated sheet (parallel or anti-parallel)
o coil conformation