Lecture 11 - University of Warwick


CH 908: MASS SPECTROMETRY
LECTURE 11
Proteomics
Outline
• Definitions and concepts
• Top-down vs. Bottom-up proteomics
• Importance of separation in proteomics
• 2D Gel Electrophoresis
• MS analysis: Peptide Mass Fingerprinting
• For MS/MS sequencing: see lectures 8-10
• Scoring algorithms (MOWSE, Mascot)
• Database searching: importance of constraints
• Importance of high resolution and high mass accuracy for protein ID
• (Post-translational modifications)
• (Phosphorylation)
GENOME AND PROTEOME
• Genome is the complete genetic information (either
DNA or, in some viruses, RNA) of an organism
• The proteome is the entire set of proteins expressed
by a genome of a cell, tissue or organism.
More specifically, it is the set of proteins expressed in
a given type of cells at a given time under defined
conditions.
A CONCEPT OF PROTEOMICS
[Figure: GENOME (gene; exons 1-3) => 1) transcription and 2) alternative splicing => TRANSCRIPTOME (mRNA) => translation => PROTEOME (protein isoforms 12 and 13) => post-translational modification => active protein isoforms (12A, 12B, 12C, …; 13A, 13B, 13C, …) => INTERACTOME (protein-protein, protein-DNA and protein-small molecule interactions)]
PROTEOMICS
• Proteomics – the study of the proteome using the
technologies of large-scale protein separation
and identification.
• The term “proteomics” was coined in 1994 by
Marc Wilkins who defined it as "the study of
proteins, how they're modified, when and where
they're expressed, how they're involved in
metabolic pathways and how they interact with
one another."
WHY STUDY PROTEINS?
• Proteins are mediators of the cell
functions
• Differences in protein expression or
degree of PTM may denote a disease
• Proteins are drug/therapeutic targets
• => more efficient drug discovery
CASE STUDY –
GLEEVEC/GLIVEC
• In the 1980s, it was discovered that a single defective
protein (a tyrosine kinase enzyme) could cause chronic
myeloid leukemia (CML)
• By the mid-to-late 1990s, a powerful, specific inhibitor of
the abnormal cell pathway was developed from a
promising but weak-acting inhibitor class of molecules
• Blocking the pathway prevents the uncontrolled
proliferation of white blood cells
• The drug is Glivec (imatinib; Novartis)
CHALLENGES: GENOMICS VS. PROTEOMICS
Genome (DNA)
• Static (no change with time)
• Can be amplified (PCR)
• Little sample complexity (4 base pairs, very similar, same order of concentration)
• Good solubility
Proteome (proteins)
• Dynamic (highly variable with time; many proteomes for one genome)
• Cannot be amplified
• High sample complexity (wide variety of physical and chemical properties; concentrations can differ by 9 orders of magnitude)
• Various solubility; some proteins are insoluble in water
COMPLEXITY OF PROTEOMIC SAMPLES
• Concentration range: 9-10 orders of magnitude
Example: human plasma: albumin ~ 50% (~ 30-40
mg/ml); cytokines and growth factors, distinct clonal
immunoglobulins (millions, single copies, <<pg/ml)
• Protein isoforms, post-translational modifications
- >250 different kinds of covalent PTM’s;
- examples: phosphorylation, glycosylation, etc.;
- a protein may have multiple PTM sites and various
degrees of modification
=> different proteins by phys.-chem. properties
COMPLEXITY OF PROTEOMIC
SAMPLES: HUMAN PLASMA
Anderson, N. L. (2002)
Mol. Cell. Proteomics 1: 845-867
COMPLEXITY OF PROTEOMIC SAMPLES: PROTEIN
HETEROGENEITY (1)
http://www.cs.helsinki.fi/bioinformatiikka/mbi/courses/06-07/proteomics/slides/lecture1.ppt
PROTEOMICS GOALS
• Identification of all proteins in a proteome
• Search for new, hypothetical or predicted proteins
• Analysis of differential expression between 2,3,...
different conditions (protein up- or downregulation)
• Identification of post-translational modifications
• Characterization of proteins by function, pathway,
cellular location, etc.
• Study of protein-protein interactions
PROTEOMICS: VARIOUS APPROACHES
“Bottom-up”
• Proteins → separate, digest → peptides → separate → MS, MS/MS → database search, analyze → proteins identified
• Proteins → digest, separate → peptides → MS, MS/MS → database search, analyze → proteins identified (“Shotgun proteomics”)
• Mass spectrometers: IT/TQ/TOF/ICR...
“Top-down”
• Proteins → separate, MS, MS/MS, analyze, (database search) → proteins identified
• Mass spectrometer: FT ICR
Both strategies are complementary
“TOP-DOWN PROTEOMICS”
• Analysis of intact proteins from complex biological
systems
• FT-ICR MS (Fourier-Transform Ion Cyclotron
Resonance Mass Spectrometry) is the technique of
choice
• MS/MS: ECD (electron capture dissociation) and ETD
(electron transfer dissociation) typically provide more
uniform dissociation than conventional CID
(collisionally induced dissociation), while preserving
the labile modifications
“TOP-DOWN” APPROACH
Williams, E. R. et al. http://www.cchem.berkeley.edu/erwgrp/science_old.html
“TOP-DOWN” APPROACH:
A DESCENDANT OF HIGH RESOLUTION TANDEM MASS
SPECTROMETRY OF LARGE BIOMOLECULES (EARLY 1990S)
Kelleher, N. L., C. A. Costello, et al. (1995). "Thiaminase I (42 Kda) Heterogeneity, Sequence
Refinement, and Active Site Location From High-Resolution Tandem Mass Spectrometry." Journal of the American Society for Mass
Spectrometry 6(10): 981-984.
TOP-DOWN APPROACH: THE IMPORTANCE OF
HIGH RESOLVING POWER
∆m = 71 Da at M ~ 42 kDa => a resolving power of M/∆m ~ 600 is necessary
Resolving power up to 3 000 000 is achievable with FTICR MS
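The resolving power arithmetic on this slide can be checked with a short sketch. The 42 kDa mass is taken from the title of the cited thiaminase I paper and 71 Da is the slide's ∆m; the function names are illustrative only.

```python
def required_resolving_power(mass_da: float, delta_m_da: float) -> float:
    """Resolving power M/dm needed to separate two species dm apart at mass M."""
    return mass_da / delta_m_da


def smallest_resolvable_difference(mass_da: float, resolving_power: float) -> float:
    """Smallest mass difference (Da) distinguishable at mass M for a given M/dm."""
    return mass_da / resolving_power


# M = 42 kDa (thiaminase I), dm = 71 Da (values from the slide and its citation).
needed = required_resolving_power(42_000.0, 71.0)
# Same protein at the FTICR resolving power quoted on the slide.
finest = smallest_resolvable_difference(42_000.0, 3_000_000.0)

print(f"required M/dm ~ {needed:.0f}")
print(f"at M/dm = 3,000,000 the resolvable difference is ~ {finest:.3f} Da")
```

The first value is the "~ 600" quoted above; the second shows how much headroom a resolving power of 3 000 000 leaves at the same mass.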
Kelleher, N. L., C. A. Costello, et al. (1995). "Thiaminase I (42 Kda) Heterogeneity, Sequence
Refinement, and Active Site Location From High-Resolution Tandem Mass Spectrometry." Journal of the American Society for Mass
Spectrometry 6(10): 981-984.
TOP-DOWN PROTEOMICS: PRO’S AND CON’S
Advantages
• 100% protein sequence coverage is possible => identification of
protein isoforms, proteolytic processing events, and PTMs;
• De-novo sequencing;
• Large protein masses are more "information rich", which improves
the quality of the information and decreases false positives; it is almost impossible to mis-assign with top-down;
• Large proteins often lose signal peptides or methionines, or are
otherwise proteolytically modified after translation, so that they
do not correlate with genome sequence databases (the databases
need updating to reflect this);
• Localization of non-covalently bound ligands is possible
TOP-DOWN PROTEOMICS: PRO’S AND CON’S
Disadvantages
• Limited sensitivity and throughput
• Pure samples are required
• Insoluble proteins cannot be analyzed
• Expensive instrumentation, expert level users
“BOTTOM-UP” APPROACH
• Bottom-up proteomics is a common method to
identify proteins and characterize their amino acid
sequences and PTMs by enzymatic digestion of
proteins prior to analysis by mass spectrometry
• The proteins may first be purified (e.g., by gel electrophoresis), resulting
in one or a few proteins in each enzymatic digest.
• Alternatively, the crude protein extract is digested
directly, followed by one or more dimensions of
separation of the peptides by liquid chromatography
coupled to mass spectrometry (“shotgun proteomics”)
“BOTTOM-UP” PROTEOMICS
FLOWCHART
M. L. Fournier, J. M. Gilmore, S. A. Martin-Brown, and M.P. Washburn
Multidimensional Separations-Based Shotgun Proteomics
Chem. Rev. 2007, 107, 3654-3686
BOTTOM-UP PROTEOMICS: PRO’S AND CON’S
Advantages
• Less sophisticated
instrumentation and
expertise
• High throughput
• More info about proteins
with “extreme” phys.-chem. properties
(hydrophobic, high/low
MW, acidic/basic)
Disadvantages
• Confidence in protein ID
strongly depends on
restriction criteria
(subjective; potential bias)
• Since protein ID is often
done by 1-2 peptides,
PTM and isoform
information is often lost
SEPARATION IN PROTEOMICS
It is impossible to resolve all species in a proteomics
sample using only one separation method
Multidimensional separation - two or more independent
(“orthogonal”) separation techniques coupled together
for the analysis of a single sample.
Separation method                            Separation by:
Reversed phase                               Hydrophobicity
Ion exchange, IsoElectroFocusing (IEF)       Net charge, Isoelectric point
Size exclusion, SDS Gel Electrophoresis      Size, molecular weight
Affinity chromatography                      Specific functional groups
Visualization of peak capacity in both 1D and 2D separations
Identical samples of six species are separated by 1D and 2D techniques. Although the column shown for 1D
separation has a theoretical peak capacity of eight (indicated by the boxes below the column), the 1D technique is
able to clearly resolve only four distinct peaks. The addition of a second chromatographic dimension greatly
improves the theoretical peak capacity (8 × 8 = 64) as shown in the boxes below the columns. The second column is
able to improve the separation of overlapped peaks so that clearly resolved peaks from all six species can be
clearly identified.
Published in: Marjorie L. Fournier; Joshua M. Gilmore; Skylar A. Martin-Brown; Michael P. Washburn; Chem. Rev. 2007, 107, 3654-3686.
DOI: 10.1021/cr068279a Copyright © 2007 American Chemical Society
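The product rule behind the caption's 8 × 8 = 64 can be written as a one-line sketch. It assumes ideal, fully orthogonal dimensions, so it gives a theoretical upper bound rather than what a real separation delivers; the function name is illustrative.

```python
from math import prod


def combined_peak_capacity(capacities):
    """Theoretical peak capacity of coupled orthogonal dimensions (product rule)."""
    return prod(capacities)


one_d = combined_peak_capacity([8])      # single column with peak capacity 8
two_d = combined_peak_capacity([8, 8])   # two orthogonal columns, 8 each

print(one_d, two_d)
```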
PROTEOMICS CLASSIC: 2D GEL ELECTROPHORESIS
[Figure: 2D gel image of brain proteins, about 3000 spots after Coomassie staining. First dimension: immobilized pH gradient (pI axis); second dimension: SDS PAGE.]
Proteomics in brain research: potentials and limitations
Gert Lubec, Kurt Krapfenbauer and Michael Fountoulakis
Progress in Neurobiology, Volume 69, Issue 3, February 2003, Pages 193-211
STEPS IN 2D GE
• Cell disruption
• Protein solubilization
• Prefractionation (optional, yet recommended)
• Isoelectric focusing (IEF)
• Equilibration of IEF strip
• SDS PAGE
• Detection of protein spots
• Image analysis and spot picking
• Protein spot identification
PROTEIN EXTRACTION
• Detergents: solubilize membrane proteins-separation from lipids
• Reductants: Reduce S-S bonds
• Denaturing agents: Disrupt protein-protein interactions-unfold
proteins
• Enzymes: Digest contaminating molecules (nucleic acids etc)
• Protease inhibitors
http://cbt20.files.wordpress.com/2009/04/proteomics-seminar-bio-rad-2.pdf
PREPARATIVE IEF
The protein mixture is
injected into the focusing
chamber
Proteins are focused as in
standard IEF
Vacuum assisted aspiration into
sample tubes
The pH gradient is achieved with
soluble ampholytes
Large amount of proteins (up to 3g protein)
http://cbt20.files.wordpress.com/2009/04/proteomics-seminar-bio-rad-2.pdf
GEL STAINS - SUMMARY
Stain                  Sensitivity (ng/spot)   Advantages
Coomassie R-250        50-100                  Simple, fast, consistent
Colloidal Coomassie    5-10                    Simple, fast
Silver stain           1-4                     Very sensitive, awkward
Copper stain           5-15                    Reversible, 1 reagent, negative stain
Zinc stain             5-15                    Reversible, simple, fast, high contrast neg. stain
SYPRO ruby             1-10                    Very sensitive, fluorescent
[Figure: example gels stained with Ruby red, Silver and Coomassie blue]
1) David Wishart, University of Alberta, Edmonton, AB;
2) http://cbt20.files.wordpress.com/2009/04/proteomics-seminar-bio-rad-2.pdf
PROTEIN DIGESTION
Why digest the protein?
• Peptides are easier to work with compared to proteins (smaller, easier to
solubilize, etc.)
• Peptide fragments of between 6 – 20 amino acids are ideal for MS analysis and
database comparisons (m/z 700 to 2000 – ideal mass range for most mass
analyzers)
• Proteins are cleaved at certain specific amino acid residues (a constraint for
database searching)
Trypsin:
- Cleaves at basic arginine (R) or lysine (K) amino acid residues
=> each proteolytic fragment will contain a basic residue, a site of proton
attraction, and thus is eminently suitable for positive ionisation MS;
- Generates peptide fragments of optimal length for MS;
- Robust and easy to use; good activity in gel, in solution and when immobilized on
a column or beads
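The cleavage rule described above can be sketched as an in-silico digest. This is a minimal illustration, not a validated tool: the slide states only cleavage at K and R; the commonly used exception "not before proline" and the `missed_cleavages` option are assumptions added here, and the sequence is a made-up toy example.

```python
import re


def tryptic_digest(sequence, missed_cleavages=0):
    """Return tryptic peptides of `sequence`, in order of start position.

    Cleaves C-terminal to K or R. Assumption (not on the slide): no cleavage
    when the next residue is P. Up to `missed_cleavages` skipped sites allowed.
    """
    # Zero-width split after K/R unless followed by P; drop the empty tail.
    pieces = [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]
    peptides = []
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(pieces[i:j + 1]))
    return peptides


toy = "AAKPGRTTKEE"  # hypothetical sequence, for illustration only
full = tryptic_digest(toy)
partial = tryptic_digest(toy, missed_cleavages=1)
print(full)
print(partial)
```

Because every peptide except the C-terminal one ends in K or R, the enzyme choice acts as a search constraint: only peptides consistent with this rule need to be considered during database matching.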
PEPTIDE LENGTH AND NUMBER OF PEPTIDES GENERATED
DEPENDING ON ENZYME USED FOR DIGESTION
Other enzymes with more or less specific cleavage:
Advantages of a new proteomic approach that uses accurate mass measurements, LC retention time,
isoelectric point and dual enzymatic digestion. Petritis K. et. al., Biological Sciences Division, Pacific
Northwest National Laboratory, Richland, WA 99352; ASMS'2007 poster presentation
http://www.chem.agilent.com/Library/posters/Public/Petritis_ASMS_2007.pdf
PROTEIN IDENTIFICATION BY MASS SPECTROMETRY: PEPTIDE
MASS FINGERPRINTING (PMF)
[Figure: MALDI-TOF mass spectrum of a tryptic digest with labeled peptide peaks; x-axis: mass (m/z), y-axis: % intensity.]
PMF: MALDI-TOF MS SPECTRA OF TRYPTIC DIGESTS OF TWO
PROTEINS
[Figure: two MALDI-TOF spectra with labeled peptide peaks; x-axis: mass (m/z), y-axis: % intensity.]
PMF: PROTEIN IDENTIFICATION
Identification is possible for single proteins or
mixtures of a small number of proteins (e.g., in-gel digests)
PEPTIDE MASS FINGERPRINTING: DATABASE SEARCH
RESULTS
Sequence coverage map
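The core idea of a PMF database search can be sketched as follows: compute theoretical [M+H]+ values for each candidate protein's peptides and count how many observed peaks fall within a mass tolerance. This is plain match counting under stated assumptions, not the MOWSE or Mascot scoring; the protein names, peptide lists and peak list are hypothetical.

```python
# Standard monoisotopic residue masses (Da), rounded to 5 decimals.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def mh_plus(peptide):
    """Monoisotopic m/z of the singly protonated peptide, [M+H]+."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def count_matches(observed_mz, peptides, tol_da=0.1):
    """Number of observed peaks within tol_da of any theoretical peptide."""
    theoretical = [mh_plus(p) for p in peptides]
    return sum(any(abs(obs - t) <= tol_da for t in theoretical)
               for obs in observed_mz)


def rank_candidates(observed_mz, database, tol_da=0.1):
    """Rank candidate proteins by number of matched peaks (highest first)."""
    scores = {name: count_matches(observed_mz, peps, tol_da)
              for name, peps in database.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical two-protein "database" of tryptic peptides.
database = {
    "protein_A": ["AAKPGR", "TTK", "EEVLR"],
    "protein_B": ["GGSK", "MMWR", "TTK"],
}
# Simulated peak list: three protein_A peptides with small errors plus noise.
observed = [mh_plus("AAKPGR") + 0.03, mh_plus("TTK") - 0.02,
            mh_plus("EEVLR") + 0.05, 1234.5]
ranking = rank_candidates(observed, database)
print(ranking)
```

Widening `tol_da` lets more random peptides match, which is why mass accuracy and other constraints matter for the confidence of the identification.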
PEPTIDE SEQUENCING BY MS/MS
Peptide backbone
Various types of fragment ions
MS AND MS/MS FOR PROTEIN IDENTIFICATION
[Figure: MS spectrum (peptide masses) and MS/MS spectrum of a selected peptide; x-axis: mass (m/z), y-axis: % intensity.]
PEPTIDE SEQUENCE FROM MS/MS SPECTRUM: EGVYVHPV
[Figure: MS/MS spectrum of the peptide EGVYVHPV; x-axis: m/z (0-1000), y-axis: absolute intensity × 1000. Peaks are labeled as a-, b- and y-type fragment ions and internal fragments (YVH, VH, VY, GVY, GVYV, VHP).]
Labeled peaks (m/z): b1 234.952, b2 291.945, b3 391.055, b4 554.090, b5 653.172, b6 790.271, b7 887.378; a2 264.966, a3 363.046, a4 526.081, a5 625.165, a6 762.266; y2 214.972, y3 352.069, y4 451.120, y5 614.446, y7 770.388, y8 1004.481.
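Reading a sequence from an MS/MS spectrum amounts to matching the mass differences between consecutive ions of one series to amino acid residue masses. The sketch below applies this to the b-ion m/z values labeled in the EGVYVHPV spectrum (b1 to b7, as read from the slide). The 0.1 Da tolerance and the function name are assumptions; leucine and isoleucine are isobaric and would be reported together.

```python
# Standard monoisotopic residue masses (Da), rounded to 5 decimals.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def read_ladder(ion_mz, tol_da=0.1):
    """Assign a residue to each gap between consecutive ions of one series.

    Returns one entry per gap: the matching residue letter(s) joined by '/',
    or '?' when no residue mass fits within tol_da.
    """
    residues = []
    for lower, upper in zip(ion_mz, ion_mz[1:]):
        delta = upper - lower
        hits = [aa for aa, mass in RESIDUE_MASS.items()
                if abs(mass - delta) <= tol_da]
        residues.append("/".join(hits) if hits else "?")
    return residues


# b1..b7 as labeled in the slide's spectrum of EGVYVHPV.
b_ions = [234.952, 291.945, 391.055, 554.090, 653.172, 790.271, 887.378]
tag = "".join(read_ladder(b_ions))
print(tag)
```

The ladder yields the internal stretch of the peptide; the first residue is not recovered this way because there is no ion below b1 to take a difference against.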
DATABASE SEARCHING…
=> Identified proteins
Publicly Available Bioinformatic Tools for Proteomics
(Name: Resources; World-Wide Web address)
• NCBI: National resource for molecular biology information. A comprehensive, non-identical protein database maintained by NCBI for use with their search tools BLAST and Entrez; http://www.ncbi.nlm.nih.gov/
• SwissProt: A comprehensive, annotated, non-identical protein sequence database maintained by Swiss Institute of Bioinformatics; http://www.expasy.org/sprot
• ExPASy: Proteomics server with a variety of tools; http://www.expasy.org/
• Swiss2DPAGE: 2-D gel database of various organisms; http://www.expasy.ch/ch2d/
• ProFound (PROWL): Protein chemistry and mass spectrometry resource; http://prowl.rockefeller.edu/
• Protein Prospector: Peptide mass search tools from UCSF; http://prospector.ucsf.edu/prospector/mshome.htm
• MASCOT: Probability-based search algorithm for peptide and protein identification using MS data; http://www.matrixscience.com
• SEQUEST: Search algorithm for ESI-MS/MS data; http://fields.scripps.edu/sequest
DATABASE SEARCH: SCORING
• MOWSE algorithm (used in Protein Prospector,
Mascot)
• There is no threshold for a reliable MOWSE score
• It mainly ranks candidate proteins by how likely
they are to be correct => the goal is to separate
real matches from random ones;
• Mascot uses a probability based implementation of
the MOWSE algorithm
The MOWSE score is described in the paper: Pappin et al, Current Biology, 1993, Vol 3, No 6,
pp 327-332
http://prospector.ucsf.edu/prospector/mshome.htm
http://www.matrixscience.com/help/scoring_help.html
DATABASE SEARCH: MASCOT
Mascot score = -10*log10(P),
where P is the absolute probability that the observed match is a
random event
- A probability of 10^-20 thus becomes a score of 200;
• The higher the score (i.e., the lower P), the less chance that a match is random, therefore,
the more chance that it is real;
• A commonly accepted threshold is that an event is significant if it
would be expected to occur at random with a frequency of less
than 5%
- Mascot report: "Scores greater than ... are significant (p<0.05)".
• => The size of the database searched becomes important!
• Constraint parameters will decrease the size of the database thus
increasing the level of significance
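The score definition and its dependence on database size can be illustrated numerically. The score function follows the formula on the slide. The threshold function is a simplified assumption, not Mascot's documented implementation: it treats a match as significant when the expected number of equally good random matches among N candidate entries stays below 5%, i.e. P < 0.05/N.

```python
import math


def mascot_score(p_random):
    """Score = -10*log10(P), with P the probability of a random match."""
    return -10.0 * math.log10(p_random)


def significance_threshold(n_candidates, alpha=0.05):
    """Assumed threshold: score above which P < alpha / n_candidates."""
    return -10.0 * math.log10(alpha / n_candidates)


score = mascot_score(1e-20)
large_db = significance_threshold(1_000_000)   # unconstrained search
small_db = significance_threshold(10_000)      # after taxonomy/pI/MW constraints

print(f"score for P = 1e-20: {score:.0f}")
print(f"threshold, 1,000,000 candidates: {large_db:.1f}")
print(f"threshold, 10,000 candidates:    {small_db:.1f}")
```

Under this assumption, each tenfold reduction of the candidate list lowers the threshold by 10 score units, which is the sense in which search constraints increase the level of significance.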
DIFFERENCE BETWEEN A SIGNIFICANT MATCH
AND A CORRECT MATCH
(1) Scores greater than 67 are significant (p<0.05) => the highest score is highly significant
(2) Mass tolerance increased from ±0.1 Da to ±1.0 Da => the best match is still correct, but it is barely significant
(3) Mass tolerance increased to ±2.0 Da => the correct match remains at the top of the list, but because the score is << the significance threshold, there could be no confidence in this match
CONFIDENCE IN DATABASE IDENTIFICATION: THE
IMPORTANCE OF SEARCH CONSTRAINTS
SEARCH CONSTRAINTS
“Classic”
• MW, mass accuracy
• pI (protein)
• Enzyme (specificity)
• Species (taxonomy)
• Instrument (=> type of ions in MS/MS)
Proposed
• Retention time (RP HPLC)
• pI (peptide)
• ...
IMPORTANCE OF MASS ACCURACY: ACCURATE MASS TAGS
(AMT)
• Richard Smith, Pacific-Northwest National Laboratory: “Utility of
Accurate Mass Tags for Proteome-Wide Protein Identification” T.
P. Conrads, G.A. Anderson, T.D. Veenstra, L. Paša-Tolić, and
R.D. Smith Anal. Chem., 2000, 72 (14), pp 3349–3354
• Accurate mass tag (AMT) - the mass of a single peptide measured
with such high mass accuracy that it allows unambiguous
protein identification
• Analysis of all the predicted proteins and tryptic peptides
generated from the theoretical ORFs in Saccharomyces
cerevisiae (yeast; 6117 proteins) and Caenorhabditis elegans
(19 098 proteins) to determine the mass accuracy needed for
unambiguous protein identification on a proteome-wide basis
• The results indicate that the mass measurement accuracy (MMA) required is presently
achievable using FTICR mass spectrometry
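The AMT idea can be illustrated with a small sketch: a peptide mass is usable as a tag only if no other peptide in the predicted proteome lies within the measurement tolerance. The code below computes a ppm error and the fraction of masses that are unique at a given accuracy. The mass list is a hypothetical toy set, not data from the cited study.

```python
def ppm_error(measured, theoretical):
    """Mass measurement error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def fraction_unique(masses, ppm):
    """Fraction of masses with no neighbour within +/- ppm of their own mass."""
    ordered = sorted(masses)
    unique = 0
    for i, mass in enumerate(ordered):
        tol = mass * ppm * 1e-6
        clear_below = i == 0 or mass - ordered[i - 1] > tol
        clear_above = i == len(ordered) - 1 or ordered[i + 1] - mass > tol
        if clear_below and clear_above:
            unique += 1
    return unique / len(ordered)


# Hypothetical peptide masses (Da): two near-isobaric, one close, one distant.
toy_masses = [1000.0000, 1000.0005, 1000.0200, 1500.0000]

for accuracy in (100, 10, 0.1):
    share = fraction_unique(toy_masses, accuracy)
    print(f"{accuracy:>5} ppm: {share:.0%} unique")
```

In this toy set the share of unique masses rises as the tolerance tightens, which mirrors the trend the study reports for predicted tryptic peptides.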
IMPORTANCE OF MASS ACCURACY (CONTINUED)
• Low ppm (i.e., ~1 ppm) level measurements have practical utility for
analysis of small proteomes;
• Up to 85% of the peptides predicted from these organisms can function
as AMTs at sub-ppm mass accuracy levels attainable using FT ICR MS;
• Additional constraints should enable even more complex proteomes to be
studied at more modest mass measurement accuracies
[Figure panels: yeast, 6117 potential proteins; C. elegans, 19 098 potential proteins]
• As the number of potential proteins increases, the identification of
proteins solely on the basis of molecular mass becomes more demanding
IMPORTANCE OF MASS ACCURACY AND RESOLVING POWER
Calculated mass spectra for a mixture of all possible 10-mer
polypeptides at resolutions (R) (and mass measurement accuracies) of 10^3
(1000 ppm) (A), 10^4 (100 ppm) (B), 10^5 (10 ppm) (C), and 10^7 (0.1 ppm) (D).
The number of peptides shown is the number that remain unresolved and would
not be distinguishable at the corresponding resolution (i.e., 1 ppm
corresponds to a resolution of 10^6, although it should be noted that
attaining the necessary level of MMA is generally much more
important than resolution for the use of the AMT concept).
Percent unique tryptic fragments (potential accurate
mass tags) as a function of tryptic fragment mass at four
different levels of mass measurement accuracy for the
predicted proteins of yeast (A) and C. elegans (B).
ADDITIONAL CONSTRAINT: PHOSPHOPEPTIDES
Distinctively large mass defect of phosphorus relative to H, C, and O (~0.3 Da) has
the net result of off-setting the average mass of phosphopeptides to slightly lower
mass than unmodified peptides of the same nominal molecular weight, often
marking a peptide as phosphorylated simply on the basis of its mass.
Predicted percent unique
yeast phosphopeptide
fragments (potential
accurate mass tags) as a
function of phosphopeptide
fragment mass at four
different levels of mass
measurement accuracy.
Self Assessment Questions
• What’s the difference between ‘top-down’ and
‘bottom-up’ proteomics? Which works better?
Why?
• Chromatography is critical in proteomics; what
types of chromatography are most used?
• What’s the dynamic range of proteins in a cell?
• How does accurate mass help in a proteomic
experiment?
Percentage of unique peptides as a function of peptide MW under different
theoretical conditions of mass accuracy (ppm), retention time (RT), isoelectric
point (pI), and “in-solution fragmentation” (ISF)
•ISF: LysC for the 1st digestion and a combination of trypsin and chymotrypsin for the 2nd digestion.
•ISF in combination with accuracies of 5 ppm mass, +/- 0.05% RT prediction and +/- 0.5 pH units IEF provides
enough specificity for peptides with MW > 1000 Da to be identified with high confidence.
•>91% of the peptides with MW > 1000 Da are unique, while >99% of the peptides with MW > 1500 Da
are unique.
POST-TRANSLATIONAL MODIFICATIONS (PTM’S): SOME FACTS
• >250 different kinds of covalent PTM’s are known to date;
• Covalent PTM: derivatization of individual amino acid residues;
• May occur at any stage of protein biosynthesis, but only
after the formation of the aminoacyl-tRNA;
• Serve for purposes of regulation of all biochemical processes in
cells;
• Procaryotes and eucaryotes use different PTM’s as
regulation mechanisms
REGULATORY PTM’S
• Should be reversible by nature
Main regulatory types known to date:
• Phosphorylation;
• Acetylation;
• Methylation;
• SS/SH interconversions;
• Glycosylation
• Ubiquitination
• Adenylation;
• Uridylation
CHEMISTRY OF PHOSPHORYLATION
• Hydroxy amino acids - main acceptors of phosphate groups;
• ~98% of phosphorylated amino acids in proteins are Ser-P;
• >99% are Ser-P and Thr-P together;
• Tyr-P content is <0.01% (though it is very important)
PHOSPHOPROTEINS: A CHALLENGE FOR MS ANALYSIS
• Low abundance (partially phosphorylated proteins; phosphorylation
stoichiometry range: 4.5-100%)
• Ionization efficiency for phosphopeptides is ~1/10 of unmodified
peptides (phosphate is in anionic form => poor proton acceptor)
• => Low sensitivity
• Multiply phosphorylated peptides carry a substantial (-) charge
=>3 phosphates - no signal !
• Poor fragmentation (main fragmentation channel – loss of
phosphate, not informative about the sequence):
- abundant [MH-H3PO4]+ peak for Ser-P and Thr-P;
- abundant [MH-HPO3]+ peak for Tyr-P;
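The two neutral-loss peaks named above can be located in a spectrum by subtracting the mass of the lost group, divided by the charge state, from the precursor m/z. The sketch below uses standard monoisotopic masses for H3PO4 and HPO3; the precursor value and function name are hypothetical.

```python
# Monoisotopic masses (Da) computed from H 1.007825, O 15.994915, P 30.973762.
H3PO4 = 97.976897   # loss typical of Ser-P and Thr-P
HPO3 = 79.966332    # loss typical of Tyr-P


def neutral_loss_mz(precursor_mz, charge, loss_da):
    """m/z of the fragment after a neutral loss; the charge state is unchanged."""
    return precursor_mz - loss_da / charge


precursor = 1000.0  # hypothetical phosphopeptide precursor m/z
for z in (1, 2):
    ser_thr = neutral_loss_mz(precursor, z, H3PO4)
    tyr = neutral_loss_mz(precursor, z, HPO3)
    print(f"z={z}: -H3PO4 at {ser_thr:.4f}, -HPO3 at {tyr:.4f}")
```

A dominant peak at one of these positions flags a phosphopeptide but, as the slide notes, says little about the sequence.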
Two-dimensional (2D) peptide separation methods for shotgun
proteomics analysis
(a) This method couples two liquid chromatography separations. In the first dimension, peptides are separated on
the basis of charge or affinity and in the second dimension, on the basis of hydrophobicity. The two liquid
chromatography separation methods can be coupled in offline or online modes. The online modes can be
performed by MudPIT or a column-switching system.
(b) This method couples a first separation based on the isoelectric point and a second separation by liquid
chromatography based on hydrophobicity. In the first dimension peptides can be separated by isoelectric
focusing through electrophoresis on immobilized pH gradient gel strips (IPG-IEF) or in solution, by capillary
electrophoresis (CIEF) or free-flow electrophoresis (FFE-IEF). The IPG-IEF or FFE-IEF and CIEF systems are
respectively coupled with the liquid chromatography method in off-line and on-line modes.
(c) A third separation method couples liquid chromatography separation based on hydrophobicity with capillary
zone electrophoresis. The separation systems are interfaced in an online mode.
Published in: Marjorie L. Fournier; Joshua M. Gilmore; Skylar A. Martin-Brown; Michael P. Washburn; Chem. Rev. 2007, 107, 3654-3686.
DOI: 10.1021/cr068279a Copyright © 2007 American Chemical Society
“DIVIDE AND CONQUER”!
• Separation is key
• All data will go through database search, therefore...
• => Gather as much information about a sample as possible:
- Sample: species, organelle, fraction;
- Proteins: MW (m/z), the higher the mass accuracy, the better; pI;
- Peptides: MW (m/z), the higher the mass accuracy, the better;
retention time (new algorithms); pI; enzyme used for digestion;
specific functional groups or amino acids (affinity separation);
• Restrictions on database search (constraints) will limit choices
and increase confidence in identification!
(discussed further later)
DIFFERENTIAL SOLUBILIZATION
Protein sample
1) Extraction with 40 mM Tris base => supernatant: Fraction 1
2) Pellet: extraction with 8 M urea, 4% CHAPS => supernatant: Fraction 2
3) Pellet: extraction with 5 M urea, 2 M thiourea, 2% CHAPS, 2% SB3 => supernatant: Fraction 3
4) Pellet: extract with SDS => Fraction 4
David Wishart, University of Alberta, Edmonton, AB
SUBCELLULAR FRACTIONATION
Human mitochondrial proteins
David Wishart, University of Alberta, Edmonton, AB
Human nuclear proteins
ISOELECTRIC FOCUSING
• Separation of proteins by isoelectric point (pI) in pH gradient
strips (ampholytes, immobilines)
• pI is the pH at which the net charge of a protein is zero (defined by
the pKa values of the charged side chain groups of its amino acids)
• A protein moves in the pH gradient until it reaches the point where
its net charge becomes 0
• Carried out in Immobiline gels or in (semi-) preparative IEF
devices
• Advantages: proteins in native state; proteins concentrated;
pI data; preparative IEF is possible (high protein loads)
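The definition of pI as the pH of zero net charge can be turned into a rough estimate: sum the Henderson-Hasselbalch charge of every ionizable group and search for the pH where the total crosses zero. The pKa values below are assumed, illustrative numbers (published pKa sets differ and give somewhat different results), the terminal groups are included in addition to the side chains, and the sequences are toy examples.

```python
# Assumed, illustrative pKa values; real pKa sets vary between sources.
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERM_PKA = 9.0
C_TERM_PKA = 2.3


def net_charge(sequence, ph):
    """Net charge of a peptide/protein at a given pH (Henderson-Hasselbalch)."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))    # N-terminal amine
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))   # C-terminal carboxyl
    for aa in sequence:
        if aa in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[aa] - ph))
    return charge


def isoelectric_point(sequence, tol=1e-4):
    """Bisection for the pH where the net charge changes sign (pH 0 to 14)."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid      # still positive: pI lies at higher pH
        else:
            high = mid     # negative or zero: pI lies at lower pH
    return (low + high) / 2.0


acidic, basic = "DDEEDE", "KKRRKR"  # hypothetical toy peptides
print(f"pI acidic ~ {isoelectric_point(acidic):.2f}")
print(f"pI basic  ~ {isoelectric_point(basic):.2f}")
```

The bisection mirrors what happens in the strip: a species keeps migrating while its net charge is non-zero and stops where the charge vanishes.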
COMPLEXITY OF PROTEOMIC SAMPLES: PROTEIN
HETEROGENEITY (1)
[Figure: γ-enolase, panels A and B]
Partial 2D-gel images showing γ-enolase from human brain. The
protein is represented by one spot when IEF was performed on pH 3-10 non-linear IPG strips (A),
and by six spots when IEF was performed on pH 4-7 strips (B).
Better resolution – more information!
Proteomics in brain research: potentials and limitations
Gert Lubec, Kurt Krapfenbauer and Michael Fountoulakis
Progress in Neurobiology, Volume 69, Issue 3, February 2003, Pages 193-211
COMPLEXITY OF PROTEOMIC SAMPLES: PROTEIN
HETEROGENEITY (2)
Glial fibrillary acidic protein (GFAP) is
considered to be a specific marker in
Alzheimer’s disease. The determination
of GFAP expression is confounded by
the many isoforms and post-translationally
modified forms observed in brain
Figure: Two-dimensional gel
analysis of the thalamus brain
region from a patient with
Alzheimer’s disease. The spots
indicated represent GFAP. This
protein is usually represented by
more than 50 spots.
Proteomics in brain research: potentials and limitations
Gert Lubec, Kurt Krapfenbauer and Michael Fountoulakis
Progress in Neurobiology, Volume 69, Issue 3, February 2003, Pages 193-211
2D GE
Advantages
• Provides a hard-copy record of separation; a map of intact proteins which reflects changes in protein expression, isoforms or PTM
• Separation of up to 9000 different proteins (~2000 routinely)
• Able to resolve proteins whose pI differs by around 0.001 pH units
• Detect and quantify <1 ng of protein per spot
• Highly reproducible
• Provides accurate info for database searching (Mw, pI and PTMs)
• Inexpensive
Disadvantages
• Limited pI range (2 to 11, but 4-8 routinely)
• Proteins >150 kD are not seen in 2D gels
• Membrane (hydrophobic) proteins are underrepresented (>30% of all proteins)
• Only detects high abundance proteins (top 30% typically)
• Multiple proteins in one spot (~30% of all spots)
• Time consuming
David Wishart, University of Alberta, Edmonton, AB