Lecture 10 Mass Spectrometry Interpretation

Lecture 10
Interpretation of Mass Spectra
A. Peptide Mass Fingerprinting
B. MS/MS sequencing
Oct 2010 SDMBT
1
General workflow for proteomic analysis
Sample
Sample preparation (gives a protein mixture)
Sample separation and visualisation
Comparative analysis
Digestion (gives peptides)
Mass spectrometry (gives MS data)
Database search
Protein identification
2
Peptide Mass Fingerprinting (PMF)
Experimentally: protein separated on 2D-gel, then tryptic digest, then peptides measured on MALDI-TOF
Match against a virtual tryptic digest of all known proteins
Trypsin cuts at the C-terminal side of lysine and arginine, so the sizes of the peptides are unique for each protein
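A virtual tryptic digest can be sketched in a few lines of Python. This is only a minimal illustration, assuming no missed cleavages, no modifications and singly protonated [M+H]+ ions. The residue masses are taken from the residue-mass table in this lecture, and the sequence `GAKSVRAG` is a made-up example, not a real protein.

```python
# Residue masses in Da, as listed in the residue-mass table of this lecture.
RESIDUE_MASS = {
    "G": 57.02, "A": 71.04, "S": 87.03, "P": 97.05, "V": 99.07,
    "T": 101.05, "C": 103.01, "I": 113.08, "L": 113.08, "N": 114.04,
    "D": 115.03, "Q": 128.06, "K": 128.09, "E": 129.04, "M": 131.04,
    "H": 137.06, "F": 147.07, "R": 156.10, "Y": 163.06, "W": 186.08,
}
WATER = 18.01   # H2O added back to the summed residue masses
PROTON = 1.01   # assumption: singly protonated [M+H]+ ions


def tryptic_digest(sequence):
    """Cut after every K or R (assumption: complete digest, no missed cleavages)."""
    peptides, current = [], ""
    for aa in sequence:
        current += aa
        if aa in "KR":
            peptides.append(current)
            current = ""
    if current:  # C-terminal peptide that does not end in K or R
        peptides.append(current)
    return peptides


def mh_mass(peptide):
    """[M+H]+ of a peptide: sum of residue masses + water + one proton."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


# Hypothetical sequence, for illustration only.
for pep in tryptic_digest("GAKSVRAG"):
    print(pep, round(mh_mass(pep), 2))
```

Running the same function over every sequence in a database gives the list of theoretical peptide masses that the experimental peaks are compared against.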
3
Peptide Mass Fingerprinting (PMF)
Recall: Tryptic digest of β-casein
Major peaks at: 646, 742, 748, 780, 830, 2186
King’s College London (Pierce)
4
Peptide Mass Fingerprinting (PMF)
Virtual peptide digest
Amino acid sequence from GenBank
http://www.ncbi.nlm.nih.gov/entrez/
Peptide Mass Fingerprinting (PMF)
Virtual peptide digest
Convert to FASTA format
http://bioinformatics.org/sms2/genbank_fasta.html
Peptide Mass Fingerprinting (PMF)
Virtual peptide digest
http://www.expasy.org/
Peptide Mass Fingerprinting (PMF)
Virtual peptide digest
Results of virtual tryptic digest
Compare with experimental peaks in MALDI-TOF of tryptic digest
Major peaks at: 646, 742, 748, 780, 830, 2186
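The comparison between experimental peaks and a virtual digest can be sketched as below. The experimental peaks are the ones listed on this slide. The theoretical masses are illustrative placeholders, not the output of a real virtual digest, and the 0.5 Da tolerance is an assumption.

```python
def match_peaks(experimental, theoretical, tolerance=0.5):
    """Pair each experimental peak with the first theoretical mass within tolerance (Da)."""
    matches = []
    for exp in experimental:
        for theo in theoretical:
            if abs(exp - theo) <= tolerance:
                matches.append((exp, theo))
                break  # one match per experimental peak is enough
    return matches


experimental = [646, 742, 748, 780, 830, 2186]      # peaks from the slide
theoretical = [646.3, 748.4, 830.4, 1013.5, 2186.2]  # placeholder values

matches = match_peaks(experimental, theoretical)
print(len(matches), "of", len(experimental), "peaks matched")
```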
Peptide mass fingerprinting (PMF)
Peptide masses are matched against theoretical digests of proteins in databases
Matches are ranked by the number of matching peptides
Confidence in the identity is given by
• a large gap in the number of matching peptides between the 1st and 2nd ranked protein
• good coverage of the 1st ranked protein with the experimental results
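The ranking idea can be sketched as follows. This is a simplified illustration, not the scoring scheme of any particular search engine. The protein names and their theoretical masses are hypothetical, and the 0.5 Da tolerance is an assumption.

```python
def rank_candidates(peaks, candidates, tolerance=0.5):
    """Rank proteins by how many experimental peaks match one of their peptides."""
    scores = []
    for name, masses in candidates.items():
        hits = sum(any(abs(p - m) <= tolerance for m in masses) for p in peaks)
        scores.append((name, hits))
    # Highest number of matching peptides first.
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores


peaks = [646, 742, 748, 780, 830, 2186]
candidates = {  # hypothetical database entries
    "protein_A": [646.3, 742.4, 748.4, 830.4, 2186.2],
    "protein_B": [646.1, 901.5, 1200.6],
    "protein_C": [500.3, 780.4],
}

ranked = rank_candidates(peaks, candidates)
gap = ranked[0][1] - ranked[1][1]  # gap between 1st and 2nd ranked protein
print(ranked, "gap:", gap)
```

A large gap, as in this toy case, is what gives confidence in the top-ranked protein.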
9
Peptide Mass Fingerprinting (PMF)
Variables for database search
Choice of database (public or private)
Species of origin
Molecular weight and pI range
Enzyme used for digest
Modifications (reduction, alkylation, phosphorylation)
Tolerance
10
PMF using MS-FIT
http://prospector.ucsf.edu/
11
PMF using MS-FIT
Choice of database
Choice of enzyme
12
PMF using MS-FIT
Tolerance
Choice of modifications
Peaks entered here
13
Peptide Mass Fingerprinting (PMF) results for tryptic digest of β-casein
Same protein across 4 similar species
14
Peptide Mass Fingerprinting (PMF) results for tryptic digest of β-casein
Does this agree with the protein's position on the 2D-gel?
Note: you do not need to match all peaks, or cover the whole protein, to identify the protein!
15
Limitations of PMF
This method assumes that databases are complete, but the genomes of only some organisms are completely sequenced, so high-confidence matches might not be available
But homology between organisms allows for good results
No information about the amino acid sequence, only the identity of the protein. The amino acid sequence in slide 15 is only the ‘predicted sequence’ based on the virtual digest.
16
Peptide Mass Fingerprinting (PMF)
A database search is only as good as the database and the input data, e.g. MALDI spectra often have peaks due to trypsin autolysis and the digestion of contaminating keratin
(Promega)
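One way to clean the input before a search is to drop peaks that fall on known contaminant masses. The sketch below assumes a short, user-supplied contaminant list. The two values shown are commonly quoted [M+H]+ values for porcine trypsin autolysis peptides and should be checked against the enzyme supplier's data. Keratin peptide masses would be added to the same list.

```python
# Assumption: commonly quoted [M+H]+ values for porcine trypsin autolysis
# peptides; verify against your own enzyme supplier before relying on them.
TRYPSIN_AUTOLYSIS = [842.51, 2211.10]


def remove_contaminants(peaks, contaminants=TRYPSIN_AUTOLYSIS, tolerance=0.2):
    """Keep only peaks that are not within tolerance (Da) of a contaminant mass."""
    return [
        p for p in peaks
        if all(abs(p - c) > tolerance for c in contaminants)
    ]


# Hypothetical peak list containing two autolysis peaks.
cleaned = remove_contaminants([646.3, 842.5, 830.4, 2211.1])
print(cleaned)
```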
17
Peptide Mass Fingerprinting (PMF)
If the MS is too noisy…..
Real world MS data
(L&T Inc)
18
Peptide Mass Fingerprinting (PMF)
Exercise: Identify this protein
19
MS/MS sequencing
Fragmentation of peptides causes cleavages along the peptide backbone
Comparing the peaks in an MS-MS spectrum allows, in theory, possible amino acid sequences to be determined manually (slides 21-33)
Sequences are matched to databases to determine the identity and sequence of proteins (slides 34 onward)
This adds another layer of certainty to the identification of the peptide, and hence of the protein
20
MS/MS sequencing
TRYPTIC PEPTIDES IN MS/MS
Proteins are digested into peptides by trypsin, which cuts on the C-terminal side of R/K
The C-terminal residue is therefore always arginine (R) or lysine (K)
By convention, the N-terminal of the peptide is drawn on the left
All tryptic peptides have a similar structure because they are all produced by trypsin
When the peptides are ionised they usually carry a 2+ charge, one charge on either end of the peptide
MS/MS fragmentation of the peptide in 6 ways leads to …..
By convention, ion fragments are called….
IMPORTANT
Although there are 6 possible ways, b and y ions are generally the most common
In general it is not always possible to predict what sort of ions will be produced
Explain: how does the ionised peptide break up?
In theory 8 y-ions and b-ions are possible, but not all may be observed
Left-hand side: N-terminus
Right-hand side: C-terminus
Residue mass of amino acid
C-terminal fragment: residue mass + 19
N-terminal fragment: residue mass + 1
In practice, not all y and b ions are observed (this cannot be predicted)
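The "+1" and "+19" rules can be turned into a small calculator for the full b and y ion series of a peptide. This is a sketch using the nominal offsets from the slide and the residue masses from the table in this lecture. The peptide `GAK` is a made-up example.

```python
# Residue masses in Da, as listed in the residue-mass table of this lecture.
RESIDUE_MASS = {
    "G": 57.02, "A": 71.04, "S": 87.03, "P": 97.05, "V": 99.07,
    "T": 101.05, "C": 103.01, "I": 113.08, "L": 113.08, "N": 114.04,
    "D": 115.03, "Q": 128.06, "K": 128.09, "E": 129.04, "M": 131.04,
    "H": 137.06, "F": 147.07, "R": 156.10, "Y": 163.06, "W": 186.08,
}


def b_ions(peptide):
    """b1, b2, ...: running sum of residue masses from the N-terminus, + 1."""
    total, ions = 0.0, []
    for aa in peptide[:-1]:  # the full-length peptide is not a fragment
        total += RESIDUE_MASS[aa]
        ions.append(round(total + 1, 2))
    return ions


def y_ions(peptide):
    """y1, y2, ...: running sum of residue masses from the C-terminus, + 19."""
    total, ions = 0.0, []
    for aa in reversed(peptide[1:]):
        total += RESIDUE_MASS[aa]
        ions.append(round(total + 19, 2))
    return ions


print(b_ions("GAK"))  # [58.02, 129.06]
print(y_ions("GAK"))  # [147.09, 218.13]
```

A peptide of 9 residues gives 8 b ions and 8 y ions by this calculation, which is the theoretical maximum mentioned above.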
MS/MS sequencing
Difference between y ions = residue mass (see next page)
Just looking at the y ions
y-ions contain the C-terminus

ion   m/z     difference   residue
y2    246.2
y3    303.3   57.1         Gly (G)
y4    374.2   70.9         Ala (A)
y5    477.0   102.8        Cys (C)
y6    534.3   57.3         Gly (G)
y7    605.3   71.0         Ala (A)

therefore … AGCAG….CO2H
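Reading a sequence from the y-ion differences can be sketched as below, using the y2 to y7 masses from this slide (246.2 to 605.3) and the residue masses from the table in this lecture. The 0.5 Da tolerance is an assumption chosen to suit the one-decimal peak values.

```python
# Residue masses in Da, as listed in the residue-mass table of this lecture.
RESIDUE_MASS = {
    "G": 57.02, "A": 71.04, "S": 87.03, "P": 97.05, "V": 99.07,
    "T": 101.05, "C": 103.01, "I": 113.08, "L": 113.08, "N": 114.04,
    "D": 115.03, "Q": 128.06, "K": 128.09, "E": 129.04, "M": 131.04,
    "H": 137.06, "F": 147.07, "R": 156.10, "Y": 163.06, "W": 186.08,
}


def residues_for(delta, tolerance=0.5):
    """All one-letter codes whose residue mass is within tolerance of delta."""
    hits = [aa for aa, m in RESIDUE_MASS.items() if abs(m - delta) <= tolerance]
    return "/".join(hits) if hits else "?"


def read_y_ladder(y_masses, tolerance=0.5):
    """y_masses in ascending order; returns residues written N- to C-terminus."""
    steps = [residues_for(hi - lo, tolerance)
             for lo, hi in zip(y_masses, y_masses[1:])]
    # Each larger y ion adds one residue on the N-terminal side, so reverse.
    return list(reversed(steps))


y_masses = [246.2, 303.3, 374.2, 477.0, 534.3, 605.3]  # y2 ... y7 from the slide
print("".join(read_y_ladder(y_masses)))  # AGCAG
```

A difference that matches more than one residue is reported as, for example, `I/L`, and a difference that matches none is reported as `?`.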
Residue masses of amino acids
Residue mass = Molecular weight of amino acid – 18 (2xH + 1xO)
Note: some have very similar molecular weights

letter  name            mass, Da    letter  name            mass, Da
G       glycine         57.02       D       aspartic acid   115.03
A       alanine         71.04       Q       glutamine       128.06
S       serine          87.03       K       lysine          128.09
P       proline         97.05       E       glutamic acid   129.04
V       valine          99.07       M       methionine      131.04
T       threonine       101.05      H       histidine       137.06
C       cysteine        103.01      F       phenylalanine   147.07
I       isoleucine      113.08      R       arginine        156.10
L       leucine         113.08      Y       tyrosine        163.06
N       asparagine      114.04      W       tryptophan      186.08

(N.S. Weld)
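The note about very similar masses can be checked directly from the table. The sketch below lists every pair of residues that cannot be told apart at a given mass tolerance. The 0.05 Da tolerance is an assumption for illustration.

```python
from itertools import combinations

# Residue masses in Da, copied from the table above.
RESIDUE_MASS = {
    "G": 57.02, "A": 71.04, "S": 87.03, "P": 97.05, "V": 99.07,
    "T": 101.05, "C": 103.01, "I": 113.08, "L": 113.08, "N": 114.04,
    "D": 115.03, "Q": 128.06, "K": 128.09, "E": 129.04, "M": 131.04,
    "H": 137.06, "F": 147.07, "R": 156.10, "Y": 163.06, "W": 186.08,
}


def indistinguishable_pairs(tolerance):
    """Pairs of residues whose masses differ by no more than tolerance (Da)."""
    return [
        (a, b)
        for (a, mass_a), (b, mass_b) in combinations(RESIDUE_MASS.items(), 2)
        if abs(mass_a - mass_b) <= tolerance
    ]


print(indistinguishable_pairs(0.05))  # [('I', 'L'), ('Q', 'K')]
```

Isoleucine and leucine have identical residue masses, while glutamine and lysine differ by only 0.03 Da in this table.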
26
MS/MS sequencing
Just looking at the b ions
b-ions contain the N-terminus

ion   m/z     labelled difference   residue
b2    170.9
b3    242.0   71.1                  Ala (A)
b4    299.2   57.2                  Gly (G)
b5    402.0   102.8                 Cys (C)
b6    472.5   70.3                  Ala (A)
b7    530.2   57.5                  Gly (G)
b8    600.7   70.5                  Ala (A)

therefore … NH2-…….AGCAGA
MS/MS sequencing
Combine the results…..
from y-ions: …….AGCAG….CO2H
from b-ions: NH2-…….AGCAGA….
Partial sequence: NH2-….AGCAGA….CO2H
You need to know how to interpret the MS/MS spectrum: which peaks are y- and b-ions? Which are y2, y3 etc.?
It is difficult to tell the amino acids at the beginning and the end
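The difficulty at the two ends can be illustrated with the smallest ions of the worked example: b2 (170.9) and y2 (246.2) each contain two residues, and several pairs fit the same mass. The sketch below enumerates the candidate pairs using the residue masses from this lecture and the nominal +1 and +19 offsets. The 0.3 Da tolerance is an assumption, and the `last` filter is an added illustration of the rule that a tryptic peptide ends in K or R.

```python
from itertools import product

# Residue masses in Da, as listed in the residue-mass table of this lecture.
RESIDUE_MASS = {
    "G": 57.02, "A": 71.04, "S": 87.03, "P": 97.05, "V": 99.07,
    "T": 101.05, "C": 103.01, "I": 113.08, "L": 113.08, "N": 114.04,
    "D": 115.03, "Q": 128.06, "K": 128.09, "E": 129.04, "M": 131.04,
    "H": 137.06, "F": 147.07, "R": 156.10, "Y": 163.06, "W": 186.08,
}


def terminal_pairs(ion_mass, offset, tolerance=0.3, last=None):
    """Ordered residue pairs whose summed mass + offset matches ion_mass.

    last: optional string of allowed residues for the second position.
    """
    target = ion_mass - offset
    pairs = []
    for a, b in product(RESIDUE_MASS, repeat=2):
        if last is not None and b not in last:
            continue
        if abs(RESIDUE_MASS[a] + RESIDUE_MASS[b] - target) <= tolerance:
            pairs.append(a + b)
    return pairs


y2_candidates = terminal_pairs(246.2, 19, last="KR")  # C-terminal pair
b2_candidates = terminal_pairs(170.9, 1)              # N-terminal pair
print(y2_candidates)
print(b2_candidates)
```

Under these assumptions more than one pair fits each ion, so the terminal residues stay ambiguous without additional peaks such as b1 or y1.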
MS/MS sequencing
Useful numbers and Hints for MS-MS spectra
ym ions: add all m residue masses + 19
bn ions: add all n residue masses + 1
cn ions: add all n residue masses + 18
zm ions: add all m residue masses + 2
an ions: add all n residue masses - 27
xm ions: add all m residue masses + 45
MS/MS sequencing
Where do these numbers come from?
[Figure: structure of a protonated peptide and its b-ion (b1)]
Definition of residue mass of amino acid = molecular weight of amino acid – 18 (2xH + 1xO)
The b ion has 1 extra hydrogen compared to the “residue mass of amino acid”
MS/MS sequencing
Where do these numbers come from?
[Figure: structure of a protonated peptide and its y-ion (y2), with the residue masses of Gly and Lys marked]
Residue mass of Gly + Lys + 2xH + 1xH + 1xO = sum of residue masses + 19
MS/MS sequencing
Draw the a, b, c and x, y, z ions from this dipeptide and calculate the m/z ratios
[Figure: structure of the dipeptide]
MS/MS sequencing
Draw the a, b, c and x, y, z ions from this tripeptide and calculate the m/z ratios
[Figure: structure of the tripeptide]
MS/MS sequencing
Experimentally: peptide after ionisation by MALDI or ESI, then fragmentation, giving fragment peptides
Match against virtual fragmentation
34
e.g. peptide from human catalase: LSQEDPDYGIR
Protein Prospector – MS-Product
http://prospector.ucsf.edu/
Paste amino acid sequence
All predicted a, b, y ions etc.
MS-MS data – amino acid sequence – protein identification
e.g. if the MS-MS of a peptide of mass 1292.61 has the following peaks:
1179.53, 1092.50, 964.44, 835.39, 720.37, 623.31, 508.29, 345.22, 288.20, 175.12
The first number (1292.61) must be the mass of the peptide + 1, i.e. [M+H]+
In ESI-MS a tryptic peptide is usually 2+, so the ion is actually [M+2H]2+
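The relation between [M+H]+ and [M+2H]2+ can be sketched as a small conversion. The example takes 1292.61 as the [M+H]+ value, as stated above, and uses 1.00728 Da for the proton mass. The function names are illustrative.

```python
PROTON = 1.00728  # mass of a proton in Da


def neutral_mass_from_mh(mh):
    """Neutral peptide mass M from the singly protonated [M+H]+ value."""
    return mh - PROTON


def mz(neutral_mass, charge):
    """m/z of the ion [M + zH]z+ for a given charge z."""
    return (neutral_mass + charge * PROTON) / charge


m = neutral_mass_from_mh(1292.61)
print(round(mz(m, 1), 2))  # 1292.61, the [M+H]+ ion
print(round(mz(m, 2), 2))  # 646.81, the [M+2H]2+ ion seen in ESI-MS
```

So a doubly charged tryptic peptide appears at roughly half its [M+H]+ value in an ESI spectrum, and the observed m/z has to be converted back before it is entered as a singly charged mass.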
MS/MS sequencing
Output – protein identified
MS/MS sequencing
Each of the fragments is identified as a y or b ion, so the user does not have to assign the peaks or work out residue masses
MS/MS sequencing
More complex example…..
MS-MS of a peptide with mass 1217.58, with peaks at:
1088.54, 975.46, 847.40, 746.35, 631.32, 457.28, 358.21, 243.13, 300.16, 371.19
Result: yeast alcohol dehydrogenase
One y ion and all except 3 b ions were deliberately left out
The protein can still be identified even though the information is incomplete
All peaks identified as y or b ions