Protein analysis and proteomics
Friday, 27 January 2006
Introduction to Bioinformatics
DA McClellan
Perspectives on a protein:
[1] Protein families
[2] Physical properties
[3] Protein localization
[4] Protein function
Fig. 8.1
Page 224
Perspective 1:
Protein families
(domains and motifs)
Page 225
Definitions
Signature:
• a protein category such as a domain or motif
Domain:
• a region of a protein that can adopt a 3D structure
• a characteristic fold or functional region
• a family (superfamily) is a group of proteins that share a
domain
• examples:
zinc finger domain
immunoglobulin domain
Motif (or fingerprint):
• a short, conserved region of a protein
• typically 10 to 20 contiguous amino acid residues
Page 225
15 most common domains (human)
Domain                        Proteins
Zn finger, C2H2 type          1093
Immunoglobulin                1032
EGF-like                      471
Zn-finger, RING               458
Homeobox                      417
Pleckstrin-like               405
RNA-binding region RNP-1      400
SH3                           394
Calcium-binding EF-hand       392
Fibronectin, type III         300
PDZ/DHR/GLGF                  280
Small GTP-binding protein     261
BTB/POZ                       236
bHLH                          226
Cadherin                      226
Table 8-3
Page 227
Source: Integr8 program at www.ebi.ac.uk/proteome/
Definition of a domain
According to InterPro at EBI (http://www.ebi.ac.uk/interpro/):
A domain is an independent structural unit, found alone
or in conjunction with other domains or repeats.
Domains are evolutionarily related.
According to SMART (http://smart.embl-heidelberg.de):
A domain is a conserved structural entity with distinctive
secondary structure content and a hydrophobic core.
Homologous domains with common functions usually
show sequence similarities.
Tables 8-1,8-2
Page 226
Varieties of protein domains
Extending along the length of a protein
Occupying a subset of a protein sequence
Occurring one or more times
Fig. 8.2
Page 228
Example of a protein with domains:
Methyl CpG binding protein 2 (MeCP2)
MBD
TRD
The protein includes a methylated DNA binding domain
(MBD) and a transcriptional repression domain (TRD).
MeCP2 is a transcriptional repressor.
Mutations in the gene encoding MeCP2 cause Rett
syndrome, a neurological disorder that primarily affects girls.
Page 227
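The idea that a domain occupies a subset of a sequence, and that several proteins can share one, can be sketched as a small data structure. This is a minimal illustration only: the class names, the second protein, the protein lengths, and all residue coordinates are placeholders chosen for the example and are not taken from the slides or from a database entry.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


@dataclass(frozen=True)
class Protein:
    name: str
    length: int
    domains: tuple[Domain, ...]

    def coverage(self) -> float:
        """Fraction of residues lying inside at least one domain."""
        covered = set()
        for d in self.domains:
            covered.update(range(d.start, d.end + 1))
        return len(covered) / self.length


def shared_domains(a: Protein, b: Protein) -> set[str]:
    """Names of domains present in both proteins."""
    return {d.name for d in a.domains} & {d.name for d in b.domains}


# Placeholder coordinates and lengths, for illustration only.
mecp2 = Protein("MeCP2", 486, (Domain("MBD", 78, 162), Domain("TRD", 207, 310)))
other = Protein("hypothetical MBD protein", 400, (Domain("MBD", 10, 94),))

print(shared_domains(mecp2, other))
print(round(mecp2.coverage(), 3))
```

Sharing a domain name here says nothing about whether the rest of the two sequences is related, which is the question the next slides raise.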
Result of an MeCP2 blastp search:
A methyl-binding domain shared by several proteins
Fig. 8.3
Page 228
Are proteins that share only a domain homologous?
Fig. 8.3
Page 228
ProDom entry for HIV-1 pol shows many related proteins
Fig. 8.7
Page 231
Proteins can have both domains and patterns (motifs)
Pattern (several residues)
Domain (aspartyl protease)
Pattern (several residues)
Domain (reverse transcriptase)
Fig. 8.7
Page 231
Fig. 8.8
Page 232
Definition of a motif
A motif (or fingerprint) is a short, conserved region
of a protein. Its size is often 10 to 20 amino acids.
Simple motifs include transmembrane domains and
phosphorylation sites. These do not imply homology
when found in a group of proteins.
PROSITE (www.expasy.org/prosite) is a dictionary of
motifs (>1300 entries as of 9/05). In PROSITE,
a pattern is a qualitative motif description (a protein
either matches a pattern, or not). In contrast, a profile
is a quantitative motif description. We will encounter
profiles in Pfam, ProDom, SMART, and other databases.
Page 231-233
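Because a pattern is qualitative (a sequence either matches or it does not), it maps naturally onto a regular expression. The sketch below translates a subset of PROSITE-style pattern syntax into a Python regex. The function names are hypothetical, the test sequence is made up, and the N-glycosylation site pattern N-{P}-[ST]-{P} is used only as a familiar example; it does not appear in the slides. Profiles, being quantitative, would need a scoring matrix instead and are not covered here.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        # Separate the residue specification from an optional repeat, e.g. x(2,4).
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.groups()
        if core == "x":                  # any residue
            core = "."
        elif core.startswith("{"):       # any residue except those listed
            core = "[^" + core[1:-1] + "]"
        # "[ST]" and single letters are already valid regex fragments.
        repeat = ""
        if low is not None:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def find_matches(pattern: str, sequence: str) -> list:
    """Return 1-based start positions of all (possibly overlapping) matches."""
    regex = re.compile("(?=" + prosite_to_regex(pattern) + ")")
    return [m.start() + 1 for m in regex.finditer(sequence)]


# Made-up sequence; the third Asn is followed by Pro and so does not match.
print(find_matches("N-{P}-[ST]-{P}.", "MNGTANASNPSK"))  # [2, 6]
```

A short pattern like this matches many unrelated proteins by chance, which is consistent with the point above that simple motifs do not imply homology.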
Perspective 2:
Physical properties of proteins
Page 233
Posttranslational modifications:
Fig. 8.9
Page 234
Fig. 8.11
Page 235
Fig. 8.12
Page 236
Fig. 8.13
Page 238
Syntaxin, SNAP-25 and VAMP are three proteins that
interact via coiled-coil domains
Introduction to Perspectives 3 and 4:
Gene Ontology (GO) Consortium
Page 237
The Gene Ontology Consortium
An ontology is a description of concepts. The GO
Consortium compiles a dynamic, controlled vocabulary
of terms related to gene products.
There are three organizing principles:
Molecular function
Biological process
Cellular component
You can visit GO at http://www.geneontology.org.
There is no centralized GO database. Instead, curators
of organism-specific databases assign GO terms
to gene products for each organism.
Page 237
GO terms are assigned to Entrez Gene entries
Fig. 8.14
Page 241
The Gene Ontology Consortium: Evidence Codes
IC    Inferred by curator
IDA   Inferred from direct assay
IEA   Inferred from electronic annotation
IEP   Inferred from expression pattern
IGI   Inferred from genetic interaction
IMP   Inferred from mutant phenotype
IPI   Inferred from physical interaction
ISS   Inferred from sequence or structural similarity
NAS   Non-traceable author statement
ND    No biological data
TAS   Traceable author statement
Table 8-7
Page 240
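One way to picture a GO annotation is as a record linking a gene product to a term, tagged with one of the evidence codes listed above. The sketch below is an assumption-laden illustration: the record layout, the single-letter aspect codes (F, P, C for the three ontologies), the grouping of IDA, IEP, IGI, IMP and IPI as experimental, and the three example annotations are all chosen for the example rather than taken from the slides or from a real annotation file.

```python
from dataclasses import dataclass

EVIDENCE_CODES = {
    "IC": "Inferred by curator",
    "IDA": "Inferred from direct assay",
    "IEA": "Inferred from electronic annotation",
    "IEP": "Inferred from expression pattern",
    "IGI": "Inferred from genetic interaction",
    "IMP": "Inferred from mutant phenotype",
    "IPI": "Inferred from physical interaction",
    "ISS": "Inferred from sequence or structural similarity",
    "NAS": "Non-traceable author statement",
    "ND": "No biological data",
    "TAS": "Traceable author statement",
}

# Assumption: codes treated here as backed by a laboratory experiment.
EXPERIMENTAL = {"IDA", "IEP", "IGI", "IMP", "IPI"}


@dataclass(frozen=True)
class Annotation:
    gene_product: str
    aspect: str    # "F" (function), "P" (process) or "C" (component)
    term: str
    evidence: str

    def __post_init__(self):
        if self.evidence not in EVIDENCE_CODES:
            raise ValueError(f"unknown evidence code: {self.evidence}")
        if self.aspect not in {"F", "P", "C"}:
            raise ValueError(f"unknown aspect: {self.aspect}")


def experimentally_supported(annotations):
    """Keep only annotations whose evidence code is in EXPERIMENTAL."""
    return [a for a in annotations if a.evidence in EXPERIMENTAL]


# Illustrative records only; not copied from any database.
records = [
    Annotation("RBP", "F", "retinol binding", "IDA"),
    Annotation("RBP", "C", "extracellular space", "IEA"),
    Annotation("RBP", "P", "visual perception", "IMP"),
]

for a in experimentally_supported(records):
    print(a.term, "-", EVIDENCE_CODES[a.evidence])
```

Filtering on the evidence code is a common first step when deciding how much weight an annotation deserves, since IEA assignments are made without curator review.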
Perspective 3:
Protein localization
Page 242
Protein localization
Proteins may be localized to intracellular compartments,
cytosol, the plasma membrane, or they may be secreted.
Many proteins shuttle between multiple compartments.
A variety of algorithms predict localization, but localization
is essentially a cell biological question.
Page 242
PSORT: searches for sorting signals that are characteristic
of proteins localized to particular cellular compartments
Fig. 8.15
Page 242
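The general idea of searching a sequence for sorting signals can be sketched with a few regular expressions. This is a deliberately naive toy and not PSORT's actual method: the rule table, the function name, and the test sequences are hypothetical, and the three signals used (a C-terminal KDEL retention signal, a C-terminal SKL peroxisomal signal, and a K-[KR]-x-[KR] nuclear localization consensus) are textbook examples that are not listed in the slides.

```python
import re

# Each rule pairs a simplified signal with the compartment it suggests.
SORTING_RULES = [
    (re.compile(r"KDEL$"), "endoplasmic reticulum"),  # C-terminal retention signal
    (re.compile(r"SKL$"), "peroxisome"),              # C-terminal PTS1-like signal
    (re.compile(r"K[KR].[KR]"), "nucleus"),           # monopartite NLS consensus
]


def predict_compartments(sequence: str) -> list:
    """Return compartments whose simplified signal occurs in the sequence."""
    seq = sequence.upper()
    return [place for rule, place in SORTING_RULES if rule.search(seq)]


# Made-up test sequences.
for name, seq in [
    ("er_like", "MAAAAKDEL"),
    ("perox_like", "MGGGSKL"),
    ("nuclear_like", "MPKKKRKVGG"),
    ("no_signal", "MAAAA"),
]:
    print(name, predict_compartments(seq))
```

Real predictors weigh many more features (signal peptides, transmembrane segments, amino acid composition), and a matched signal is only a hypothesis until the protein is localized experimentally.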
Fig. 8.16
Page 244
Localization of 2,900 yeast proteins
Michael Snyder and colleagues incorporated epitope
tags into thousands of S. cerevisiae cDNAs,
and systematically localized proteins (Kumar et al., 2002).
See http://ygac.med.yale.edu for a database including
2,900 fluorescence micrographs.
Page 243
Perspective 4:
Protein function
Page 243
Protein function
Function refers to the role of a protein in the cell.
We can consider protein function from a variety
of perspectives.
Page 243
1. Biochemical function
(molecular function)
RBP binds retinol,
could be a carrier
Fig. 8.17
Page 245
2. Functional assignment
based on homology
RBP could be a carrier too
Other carrier proteins
Fig. 8.17
Page 245
3. Function
based on structure
RBP forms a calyx
Fig. 8.17
Page 245
4. Function based on
ligand binding specificity
RBP binds vitamin A
Fig. 8.17
Page 245
5. Function based on
cellular process
DNA
RNA
RBP is abundant,
soluble, secreted
Fig. 8.17
Page 245
6. Function based
on biological process
Analyze a gene knockout phenotype;
RBP is essential for vision
Fig. 8.17
Page 245
7. Function based on “proteomics”
or high throughput “functional genomics”
High throughput analyses show...
RBP levels elevated in renal failure
RBP levels decreased in liver disease
Fig. 8.17
Page 245
Functional assignment of enzymes:
the EC (Enzyme Commission) system
Oxidoreductases    1,003
Transferases       1,076
Hydrolases         1,125
Lyases             356
Isomerases         156
Ligases            126
Updated 9/04, 9/05
Table 8-8
Page 246
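In the EC system the first field of a four-field EC number gives the enzyme class, in the order the six classes are listed above. A minimal sketch of reading that field follows; the function name is hypothetical, the two EC numbers in the usage lines (alcohol dehydrogenase and HIV-1 retropepsin) are outside examples not taken from the slides, and only the six classes named in the table are handled.

```python
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
}


def ec_class(ec_number: str) -> str:
    """Return the top-level class name for an EC number such as 'EC 3.4.23.16'."""
    text = ec_number.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    fields = text.split(".")
    if len(fields) != 4:
        raise ValueError(f"expected four dot-separated fields: {ec_number!r}")
    try:
        return EC_CLASSES[int(fields[0])]
    except (ValueError, KeyError):
        raise ValueError(f"unrecognized EC class in {ec_number!r}") from None


print(ec_class("EC 1.1.1.1"))   # alcohol dehydrogenase
print(ec_class("3.4.23.16"))    # HIV-1 retropepsin
```

The remaining three fields narrow the assignment to subclass, sub-subclass, and the individual enzyme entry, so the same parsing approach extends to finer levels.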
Functional assignment of proteins:
Clusters of Orthologous Groups (COGs)
Information storage and processing
Cellular processes
Metabolism
Poorly characterized
See Chapter 14 for COGs at NCBI
Table 8-9
Page 247