Peptide fragmentation
understanding how to interpret mass spectra
for peptide identification
Arjen Scholten
UU, Biomolecular Mass Spectrometry & Proteomics Group
Netherlands Proteomics Centre
Utrecht University
Peptide fragmentation by MS
CID: Collision Induced Dissociation
Peptide fragmentation by MS
[Figure: MS/MS spectrum of an unknown peptide ("?"); relative abundance (0-100) vs m/z (400-1600). Peptide shown: IMNTFSVVPSPK]
The ability to identify peptides from
mass spectra in a reliable manner is the
foundation of any proteomic experiment
Peptide structure
Peptides (and proteins) are linearly arranged chains of amino acids
General formula: H-[NH-CHR-CO]n-OH
Written out: H2N-CHR1-CO-NH-CHR2-CO- ... -NH-CHR(n-1)-CO-NH-CHRn-COOH
Terminology
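Amino acid: H2N-CHR-COOH
Residue (amino acid minus H2O): -NH-CHR-CO-
Tetrapeptide: H2N-CHR1-CO-NH-CHR2-CO-NH-CHR3-CO-NH-CHR4-COOH
N-terminus: the free amino end (H2N-, residue 1)
Peptide bond: the CO-NH bond linking two adjacent residues
C-terminus: the free carboxyl end (-COOH, residue 4)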
Ionisation of peptides
• Positive ion mode
– M + H+ → [M+H]+
– M + nH+ → [M+nH]n+
– M + Na+ → [M+Na]+
• Negative ion mode
– M → [M-H]- + H+
– M + Cl- → [M+Cl]-
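The ion species listed above translate directly into m/z values once the neutral monoisotopic mass M is known. The following is a minimal Python sketch, not part of the original slides: the function name `ion_mz` and the ion masses used (proton, Na+, Cl-) are assumptions introduced here for illustration.

```python
# Ion masses in Da. These values are assumptions (not given on the slides).
PROTON = 1.007276        # H+
SODIUM_ION = 22.98922    # Na+
CHLORIDE_ION = 34.96940  # Cl-


def ion_mz(m, species, n=1):
    """Return m/z for a neutral monoisotopic mass m and one of the listed species."""
    if species == "[M+H]+":
        return m + PROTON
    if species == "[M+nH]n+":
        return (m + n * PROTON) / n
    if species == "[M+Na]+":
        return m + SODIUM_ION
    if species == "[M-H]-":
        return m - PROTON
    if species == "[M+Cl]-":
        return m + CHLORIDE_ION
    raise ValueError(f"unknown species: {species}")


# M = 1000.0 is an arbitrary illustrative mass, not a value from the lecture.
for species in ("[M+H]+", "[M+Na]+", "[M-H]-", "[M+Cl]-"):
    print(species, round(ion_mz(1000.0, species), 4))
print("[M+2H]2+", round(ion_mz(1000.0, "[M+nH]n+", n=2), 4))
```

Note that a doubly protonated peptide appears at roughly half the m/z of the singly protonated one, which is why the charge state must be known before a precursor mass can be interpreted.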
Ionisation of peptides
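[Scheme: protonation of a tripeptide. H2N-CHR1-CO-NH-CHR2-CO-NH-CHR3-COOH + H+ gives the protonated peptide, drawn as +H3N-CHR1-CO-NH-CHR2-CO-NH-CHR3-COOH with further possible charge sites marked along the backbone]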
Ionisation of peptides
[Scheme: the protonated tripeptide drawn in several forms that differ only in where the proton sits: on the N-terminal amine (+H3N-) or on one of the backbone amide groups]
Mass determination of peptides
Neutral peptide: M = 1 (N-terminal H) + mass residue 1 + mass residue 2 + ... + mass residue n-1 + mass residue n + 16 + 1 (C-terminal OH)
Protonated peptide: [M+H]+ = M + 1 (H+)
Exact masses: H = 1.0078250, O = 15.994915
Residue masses
Amino acid      Code  Residue mass (Da)
Glycine         G     57.02147
Alanine         A     71.03712
Serine          S     87.03203
Proline         P     97.05277
Valine          V     99.06842
Threonine       T     101.04768
Cysteine        C     103.00919
Isoleucine      I     113.08407
Leucine         L     113.08407
Asparagine      N     114.04293
Aspartic acid   D     115.02695
Glutamine       Q     128.05858
Lysine          K     128.09497
Glutamic acid   E     129.0426
Methionine      M     131.04049
Histidine       H     137.05891
Phenylalanine   F     147.06842
Arginine        R     156.10112
Tyrosine        Y     163.06333
Tryptophan      W     186.07932
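The mass rule and the residue masses above can be combined in a few lines. This is a minimal Python sketch, assuming the slide's own bookkeeping (one H for the N-terminus, OH for the C-terminus, one more H for the proton); the function names `peptide_mass` and `mh_plus` are hypothetical, and the test peptide YQLDDDASVR is the one identified in the worked example at the end of the lecture.

```python
# Monoisotopic residue masses (Da), copied from the residue mass table.
RESIDUE_MASS = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277,
    "V": 99.06842, "T": 101.04768, "C": 103.00919, "I": 113.08407,
    "L": 113.08407, "N": 114.04293, "D": 115.02695, "Q": 128.05858,
    "K": 128.09497, "E": 129.0426, "M": 131.04049, "H": 137.05891,
    "F": 147.06842, "R": 156.10112, "Y": 163.06333, "W": 186.07932,
}
H = 1.0078250   # hydrogen, value given on the slide
O = 15.994915   # oxygen, value given on the slide


def peptide_mass(sequence):
    """Neutral mass M: N-terminal H + all residues + C-terminal OH."""
    return H + sum(RESIDUE_MASS[aa] for aa in sequence) + O + H


def mh_plus(sequence):
    """[M+H]+ using the slide's bookkeeping: one more H for the proton.

    Assumption: the electron mass (about 0.0005 Da) is ignored, as on the slide.
    """
    return peptide_mass(sequence) + H


print(round(mh_plus("YQLDDDASVR"), 2))
```

For YQLDDDASVR this evaluates to about 1181.54, close to the [M+H]+ of 1181.56 derived from the measured precursor in the worked example.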
Biemann notation of peptide fragment ions
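[Scheme: protonated tetrapeptide H2N-CHR1-CO-NH-CHR2-CO-NH-CHR3-CO-NH-CHR4-COOH + H+, with the three backbone cleavage sites per residue labelled]
N-terminal fragments: a1-a3 (cleavage of the CHR-CO bond), b1-b3 (cleavage of the CO-NH peptide bond), c1-c3 (cleavage of the NH-CHR bond)
C-terminal fragments: x3-x1, y3-y1, z3-z1 (the complementary pieces from the same three bonds)
• Low-energy fragmentation breaks the weakest bond: the peptide bond
• Yields primarily b and y ions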
Peptide fragmentation
Any of the peptide bonds may break; it is hard to predict which ones will.
Peptide: S-G-F-L-E-E-D-E-L-K

MW     ion   N-terminal fragment   C-terminal fragment   ion   MW
88     b1    S                     GFLEEDELK             y9    1080
145    b2    SG                    FLEEDELK              y8    1022
292    b3    SGF                   LEEDELK               y7    875
405    b4    SGFL                  EEDELK                y6    762
534    b5    SGFLE                 EDELK                 y5    633
663    b6    SGFLEE                DELK                  y4    504
778    b7    SGFLEED               ELK                   y3    389
907    b8    SGFLEEDE              LK                    y2    260
1020   b9    SGFLEEDEL             K                     y1    147
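The b and y series tabulated above follow from running sums over the residue masses. A minimal Python sketch under stated assumptions: the helper names `b_ions` and `y_ions` are hypothetical, and the proton mass of 1.007276 Da is a value assumed here rather than taken from the slides.

```python
# Monoisotopic residue masses (Da), copied from the residue mass table.
RESIDUE_MASS = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277,
    "V": 99.06842, "T": 101.04768, "C": 103.00919, "I": 113.08407,
    "L": 113.08407, "N": 114.04293, "D": 115.02695, "Q": 128.05858,
    "K": 128.09497, "E": 129.0426, "M": 131.04049, "H": 137.05891,
    "F": 147.06842, "R": 156.10112, "Y": 163.06333, "W": 186.07932,
}
WATER = 18.010565   # H2O, from the H and O masses given on the slides
PROTON = 1.007276   # assumed proton mass, not stated on the slides


def b_ions(sequence):
    """Singly charged b1..b(n-1): N-terminal residues plus a proton."""
    total, ions = PROTON, []
    for aa in sequence[:-1]:
        total += RESIDUE_MASS[aa]
        ions.append(total)
    return ions


def y_ions(sequence):
    """Singly charged y1..y(n-1): C-terminal residues plus H2O plus a proton."""
    total, ions = WATER + PROTON, []
    for aa in reversed(sequence[1:]):
        total += RESIDUE_MASS[aa]
        ions.append(total)
    return ions


peptide = "SGFLEEDELK"
for i, (b, y) in enumerate(zip(b_ions(peptide), y_ions(peptide)), start=1):
    print(f"b{i} {b:8.2f}   y{i} {y:8.2f}")
```

The table lists integer masses; the sketch keeps decimals, so a value that falls near a half-integer (such as y8 here) may round differently from the slide.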
Peptide fragmentation
[Figure: schematic spectrum of S-G-F-L-...-D-E-L-K with interleaved b-ion (b1-b5) and y-ion (y1-y5) peaks along the m/z axis]
The sequence can be read from the distances between peaks.
Peptide fragmentation
Peptide sequence: HLVDEPQNLIK

No   Residue   b-ion   m/z       y-ion   m/z
1    H         b1      138.06    -       -
2    L         b2      251.15    y10     1168.65
3    V         b3      350.21    y9      1055.57
4    D         b4      465.24    y8      956.50
5    E         b5      594.28    y7      841.47
6    P         b6      691.34    y6      712.43
7    Q         b7      819.39    y5      615.38
8    N         b8      933.44    y4      487.32
9    L         b9      1046.52   y3      373.28
10   I         b10     1159.61   y2      260.19
11   K         -       -         y1      147.11

[Figure: MS/MS spectrum of HLVDEPQNLIK; relative abundance (0-100) vs m/z (200-1300). Labelled peaks (m/z): 223.31, 251.16, 260.28, 332.17, 350.24, 394.77, 465.22, 523.69, 594.19, 644.57, 712.36, 761.52, 792.87, 841.40, 933.37, 956.39, 1029.41, 1055.44, 1143.58, 1168.58, 1239.37]
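Reading a sequence from peak distances can be sketched as follows, using the calculated y-ion values for HLVDEPQNLIK listed above. This is an illustrative Python sketch, not the lecturer's method: the function name `read_ladder` and the 0.02 Da tolerance are assumptions, and real spectra (such as the ion-trap spectrum shown) need a wider tolerance and contain gaps.

```python
# Monoisotopic residue masses (Da), copied from the residue mass table.
RESIDUE_MASS = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277,
    "V": 99.06842, "T": 101.04768, "C": 103.00919, "I": 113.08407,
    "L": 113.08407, "N": 114.04293, "D": 115.02695, "Q": 128.05858,
    "K": 128.09497, "E": 129.0426, "M": 131.04049, "H": 137.05891,
    "F": 147.06842, "R": 156.10112, "Y": 163.06333, "W": 186.07932,
}


def read_ladder(peaks, tol=0.02):
    """Assign a residue to each gap between consecutive peaks of one ion series.

    Isobaric residues (I and L) cannot be told apart and are reported as "I/L";
    a gap that matches no residue within tol is reported as "?".
    """
    peaks = sorted(peaks)
    calls = []
    for low, high in zip(peaks, peaks[1:]):
        diff = high - low
        hits = [aa for aa, m in RESIDUE_MASS.items() if abs(diff - m) <= tol]
        calls.append("/".join(hits) if hits else "?")
    return calls


# Calculated y1..y10 of HLVDEPQNLIK, taken from the table.
y_series = [147.11, 260.19, 373.28, 487.32, 615.38,
            712.43, 841.47, 956.50, 1055.57, 1168.65]
print(read_ladder(y_series))
```

Because y-ions grow from the C-terminus, the calls come out in reverse sequence order, starting next to the C-terminal K.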
Peptide fragmentation
Outstanding questions:
• Many fragments can occur; which ones do occur (mass)?
• What is the frequency of their occurrence (peak intensity)?
• How does fragmentation behavior depend on peptide sequence?
• What is the influence of the number of (positive) charges?
• How can we use all this to reliably determine a peptide sequence from a spectrum?
• Can we predict the fragmentation of any peptide (mass and intensity), and thus its spectrum?
Commonly observed fragment ions
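[Scheme: protonated tetrapeptide (residues 1-4, N-terminus to C-terminus) and its commonly observed fragment ions]
Sequence ions:
• b2: N-terminal fragment H2N-CHR-CO-NH-CHR-CO+
• a2: N-terminal fragment H2N-CHR-CO-NH+=CHR
• y2: C-terminal fragment +H3N-CHR-CO-NH-CHR-COOH
Composition ions:
• Immonium ions H2N+=CHR, one per residue (R1, R2, R3, R4)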
Formation of sequence ions
[Scheme: formation of y2]
Tetrapeptide protonated at an amide nitrogen (between residues 2 and 3)
→ neutral dipeptide (residues 1-2) + protonated dipeptide +H3N-CHR3-CO-NH-CHR4-COOH = y2
Formation of sequence ions
[Scheme: formation of b2 and a2]
Tetrapeptide protonated at an amide nitrogen (between residues 2 and 3)
→ protonated dipeptide H2N-CHR1-CO-NH-CHR2-CO+ = b2, plus neutral dipeptide H2N-CHR3-CO-NH-CHR4-COOH
b2 - CO → H2N-CHR1-CO-NH+=CHR2 = a2
Formation of sequence ions
Linear b-ions are unstable
• Linear b-ions are unstable and form stable cyclic structures
• Alternatively, they decompose to a-ions
• Because of this, b1 ions are never observed
[Scheme: the linear b2 ion either cyclises to the cyclic b2 ion or loses CO to give the a2 ion]
Double cleavages lead to internal fragments
• Double cleavage: two separate backbone cleavage events take place
• EITHER a combination of b-type and y-type cleavage
• OR a combination of a-type with y-type cleavage
e.g. product of b3 and y(n-1) cleavage: H2N-CHR2-CO-NH-CHR3-CO+
Double cleavages: immonium ions
• Most common double cleavage ions are immonium ions
• Correspond to y/a double cleavages
• Contain a single amino acid side chain (R)
• Observed in the low m/z region of the spectrum
• Indicative of the presence of specific residue(s) in the peptide
[Scheme: tripeptide H2N-CHR-CO-NH-CHR-CO-NH-CHR-COOH (residues 1-3) and its immonium ions i1, i2 and i3: H2N+=CHR1, H2N+=CHR2 and H2N+=CHR3]
Residue masses
Amino acid      Code  Residue mass (Da)  Immonium ion (Da)
Glycine         G     57.021             30.034
Alanine         A     71.037             44.050
Serine          S     87.032             60.044
Proline         P     97.052             70.065
Valine          V     99.068             72.081
Threonine       T     101.047            74.060
Cysteine        C     103.009            76.022
Isoleucine      I     113.084            86.097
Leucine         L     113.084            86.097
Asparagine      N     114.042            87.055
Aspartic acid   D     115.026            88.039
Glutamine       Q     128.058            101.071
Lysine          K     128.094            101.108
Glutamic acid   E     129.042            102.055
Methionine      M     131.040            104.053
Histidine       H     137.058            110.072
Phenylalanine   F     147.068            120.081
Arginine        R     156.101            129.114
Tyrosine        Y     163.063            136.076
Tryptophan      W     186.079            159.092
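Each immonium mass in the table is the residue mass minus CO plus a proton, so the column can be regenerated and used to look up low-m/z peaks. A minimal Python sketch; `immonium_mz`, `residues_from_immonium`, the 0.02 Da tolerance and the proton mass are assumptions made here, not part of the slides.

```python
# Monoisotopic residue masses (Da), copied from the first residue mass table.
RESIDUE_MASS = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277,
    "V": 99.06842, "T": 101.04768, "C": 103.00919, "I": 113.08407,
    "L": 113.08407, "N": 114.04293, "D": 115.02695, "Q": 128.05858,
    "K": 128.09497, "E": 129.0426, "M": 131.04049, "H": 137.05891,
    "F": 147.06842, "R": 156.10112, "Y": 163.06333, "W": 186.07932,
}
PROTON = 1.007276   # assumed proton mass, not stated on the slides
CO = 27.994915      # carbon (12.000000) plus the oxygen mass from the slides


def immonium_mz(aa):
    """Immonium ion H2N+=CHR: residue mass minus CO plus a proton."""
    return RESIDUE_MASS[aa] - CO + PROTON


def residues_from_immonium(peaks, tol=0.02):
    """Residues whose immonium ion matches any of the given low-m/z peaks."""
    found = set()
    for peak in peaks:
        for aa in RESIDUE_MASS:
            if abs(immonium_mz(aa) - peak) <= tol:
                found.add(aa)
    return sorted(found)


# 136.08 is the low-mass peak from the worked example later in the lecture.
print(residues_from_immonium([136.08]))
```

A match only says that the residue is present somewhere in the peptide; it carries no positional information, and I and L remain indistinguishable.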
Peptide fragmentation
• It is the charge (proton) on a peptide that drives fragmentation
• In proteomics, tryptic peptides are analyzed; these terminate in Arg or Lys
• Arg and Lys are basic, so they sequester the proton, which then cannot easily migrate along the peptide to drive fragmentation
• Solution:
1. choose a precursor with extra proton(s), e.g. [M+2H]2+
2. introduce extra energy
3. rely on the peptide providing an extra proton (from acidic groups, e.g. Asp or Glu)
Fragmentation of doubly protonated peptides
[Scheme: [M+2H]2+ tetrapeptide with one proton at the N-terminus (+H3N-) and one on the side chain of the C-terminal residue (R4-NH3+). The N-terminal proton moves to a backbone amide nitrogen; cleavage of that peptide bond gives two singly charged fragments, b2 and y2]
Fragmentation of triply protonated peptides
[Scheme: [M+3H]3+ tetrapeptide with protons at the N-terminus, on a backbone amide nitrogen and on the side chain of the C-terminal residue (R4-NH3+). Cleavage gives either b2+ and y2++, or b2++ and y2+]
Fragmentation of multiply protonated peptides
• Products of an n+ peptide have charges up to (n-1)+
• Which fragments and charges occur largely depends on the sequence (= chemical composition) of the peptide
• Tryptic peptides typically yield strong y-ion series, because the charge sits on the C-terminal residue
• The fragmentation pattern of a peptide carrying 1, 2 or 3 charges differs greatly
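Since fragments of multiply protonated peptides can themselves carry more than one charge, it helps to convert between charge states before comparing peaks. A minimal Python sketch; the function names `to_charge` and `to_singly` and the proton mass are assumptions, and the numbers in the usage lines are taken from the worked example later in the lecture.

```python
PROTON = 1.007276   # assumed proton mass, not stated on the slides


def to_charge(mz_singly, z):
    """m/z of the z-fold protonated ion, given the m/z of the singly protonated ion."""
    return (mz_singly + (z - 1) * PROTON) / z


def to_singly(mz, z):
    """m/z of the singly protonated ion, given the m/z observed at charge z."""
    return mz * z - (z - 1) * PROTON


# Precursor of the worked example: [M+H]+ = 1181.56 observed as a 2+ ion.
print(round(to_charge(1181.56, 2), 2))
# A doubly charged fragment at 181.11 corresponds to a singly charged ion near 361.2.
print(round(to_singly(181.11, 2), 2))
```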
Different fragmentation of 1, 2 and 3+ charged peptide
[Figure: MS/MS spectra of the peptide TCVADESHAGCEK fragmented from three precursor charge states]
1+: m/z parent 1463.52
2+: m/z parent 732.28
3+: m/z parent 488.54
Preferential cleavage of peptides
Breci et al, Anal Chem 2003
Preferential cleavage of peptides
• Preferential cleavage is generally observed N-terminal to Pro, since its amide N is more basic than those of other amino acids
• Fragmentation at the Xxx-Pro bond is predictable; this information may be used to improve the identification of peptides
Huang et al, Anal Chem 2005
Preferential cleavage of peptides
• Find patterns in fragmentation behavior in peptides sharing certain sequence motifs
• May help to identify or confirm peptide identity
[Figure: composition of the 28 330 peptide MS/MS spectral database]
Huang et al, Anal Chem 2005
Preferential cleavage of peptides
Motif:
[…P…noH…R] 2+
[…P…noH…K] 2+
Relative intensities of y-ions
Huang et al, Anal Chem 2005
Preferential cleavage of peptides
Motif:
[…P…noH…R] +
[…P…noH…K] +
Relative intensities of y-ions
Huang et al, Anal Chem 2005
Preferential cleavage of peptides
Motif:
[…noP…noH…R/K] 2+
Relative intensities of y-ions
Huang et al, Anal Chem 2005
Preferential cleavage of peptides
Conclusions:
• Distinct fragmentation of 1+ peptides ending in K and R, related to
basicity (R>K) and mobility of proton (K>R)
• Cleavage C-terminal to acidic residues (D, E) dominates spectra
from 1+ peptides with localized proton (….R)
• Cleavage N-terminal to Proline dominates spectra from 1+ peptides
with mobile proton (….K)
• Cleavage N-terminal to Proline dominates spectra from 2+ peptides
irrespective of proton mobility (….K/R)
• In spectra from 2+ peptides lacking Pro, fragmentation is much more
heterogeneous
Preferential cleavage of peptides
Although these observations are interesting and
helpful, there is NO rule that says that all
structurally useful ions that could be formed will
indeed be formed.
Summary
• Sequence-informative ions: y and b ions
• y-ions contain the C-terminus
• b-ions contain the N-terminus
• b-ions are often accompanied by a-ions, 28 Th smaller
• b1-ion is unstable and thus not observed
• Internal fragments are produced by double cleavages (usually b/y or a/y), mostly in ≥2+ peptides
• Immonium ions give information on amino acid composition, not sequence
• Strong cleavage N-terminal to Pro (1+ and 2+ peptides) and C-terminal to Asp or Glu (1+ peptides)
• Products of 2+ peptides are usually (but not exclusively) singly charged
How to interpret peptide fragmentation spectra
1. Estimate the number of amino acids in the peptide from the precursor mass ('average' amino acid mass 100 Da)
2. Determine which amino acids are present from immonium ions in the low m/z region (if present)
3. Usually it is easiest to start looking for y-ions:
   i) starting from the intact peptide mass downwards, look for mass differences corresponding to a residue mass
   ii) starting from y1 upwards; y1 for C-terminal Arg at m/z = 175, Lys = 147
4. Then look for complementary b-ions (with a-ion 28 Th lower)
5. Try to correlate b and y ions; remember, for a peptide consisting of n amino acids: b(n-m) + y(m) = [M+H]+ + 1
6. Only rarely can a full peptide sequence be determined from the spectrum
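The correlation in step 5 can be checked mechanically: every b/y pair from the same backbone cleavage sums to [M+H]+ plus one proton. A minimal Python sketch using the calculated b and y values for HLVDEPQNLIK shown earlier; the function name `complementary_pairs`, the 0.05 Da tolerance, the proton mass and the [M+H]+ value of 1305.72 (calculated here from the residue masses, not stated on the slides) are all assumptions.

```python
PROTON = 1.007276   # assumed proton mass, not stated on the slides


def complementary_pairs(peaks, mh, tol=0.05):
    """Pairs of peaks whose m/z values sum to [M+H]+ + 1 (singly charged ions only)."""
    target = mh + PROTON
    peaks = sorted(peaks)
    pairs = []
    for i, low in enumerate(peaks):
        for high in peaks[i + 1:]:
            if abs(low + high - target) <= tol:
                pairs.append((low, high))
    return pairs


# Calculated ions of HLVDEPQNLIK, taken from the table earlier in the lecture.
b_series = [138.06, 251.15, 350.21, 465.24, 594.28,
            691.34, 819.39, 933.44, 1046.52, 1159.61]
y_series = [147.11, 260.19, 373.28, 487.32, 615.38,
            712.43, 841.47, 956.50, 1055.57, 1168.65]

# [M+H]+ = 1305.72 is an assumption: it is calculated, not quoted from a slide.
for low, high in complementary_pairs(b_series + y_series, mh=1305.72):
    print(low, high)
```

A peak that has a complementary partner is more likely to be a genuine b or y ion than one that stands alone, which is why the pairing is a useful sanity check during manual interpretation.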
Other aspects of peptide fragmentation
NB: Only some aspects of peptide fragmentation have
been discussed here. Other important issues are:
• Fragmentation of negatively charged ions
• Impact of posttranslational modifications on fragmentation behavior
• High and low energy CID; fragmentation modes other than CID (e.g. ECD, ETD)
• These may be very helpful tools to identify peptides that are inaccessible by the 'straightforward' approach
Bioinformatics and peptide identification
Searching a protein database
• Matching fragmentation spectra to masses that can be formed from tryptic peptides in a particular protein database (e.g. human proteome)
• Only takes into account masses, not the intensities that can be expected (due to preferential cleavage)
• Examples: Sequest (http://fields.scripps.edu/sequest/), Mascot (http://www.matrixscience.com), X!Tandem (http://www.thegpm.org/TANDEM/), Phenyx (http://www.phenyx-ms.com/), etc.
• More details: lecture by Monique Slijper
[Figure: raw, uninterpreted MS/MS spectra are matched against a sequence database]
Sequence Database
>SEQ1
CVVEELCPTPEGKDIGES
VDLLKLQWCWENGTLRSL
DCDVVS
>SEQ2
DLRSWTVRIDALNHGVKP
HPPNVSVVDLTNR
>
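The first step of such a search, generating candidate tryptic peptides and filtering them by precursor mass, can be sketched in Python using the two database entries shown above. This is an illustration only and not how any of the named search engines is implemented: the function names, the 0.02 Da tolerance, the proton mass and the "no cleavage before Pro" rule are assumptions, and fragment-ion scoring is left out entirely.

```python
# Monoisotopic residue masses (Da), copied from the residue mass table.
RESIDUE_MASS = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277,
    "V": 99.06842, "T": 101.04768, "C": 103.00919, "I": 113.08407,
    "L": 113.08407, "N": 114.04293, "D": 115.02695, "Q": 128.05858,
    "K": 128.09497, "E": 129.0426, "M": 131.04049, "H": 137.05891,
    "F": 147.06842, "R": 156.10112, "Y": 163.06333, "W": 186.07932,
}
WATER = 18.010565   # H2O, from the H and O masses given on the slides
PROTON = 1.007276   # assumed proton mass, not stated on the slides

DATABASE = {
    "SEQ1": "CVVEELCPTPEGKDIGESVDLLKLQWCWENGTLRSLDCDVVS",
    "SEQ2": "DLRSWTVRIDALNHGVKPHPPNVSVVDLTNR",
}


def tryptic_digest(protein):
    """Cleave C-terminal to K or R, except when the next residue is P (assumed rule)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        is_last = i == len(protein) - 1
        if aa in "KR" and not is_last and protein[i + 1] != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def mh_plus(peptide):
    """Singly protonated monoisotopic mass [M+H]+."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def match_precursor(mh_observed, database, tol=0.02):
    """Candidate (protein, peptide) pairs whose [M+H]+ lies within tol of the observation."""
    hits = []
    for name, protein in database.items():
        for peptide in tryptic_digest(protein):
            if abs(mh_plus(peptide) - mh_observed) <= tol:
                hits.append((name, peptide))
    return hits


print(tryptic_digest(DATABASE["SEQ2"]))
print(match_precursor(648.35, DATABASE))
```

In a real search the surviving candidates would then be ranked by how well their calculated fragment masses match the MS/MS spectrum.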
Bioinformatics and peptide identification
Spectral matching
• Create a database of spectra and their associated peptide identity
• Match experimental spectra to spectra in the database
• Advantages: faster than database search; potential to find (unexpected) modifications when spectra are similar, but not identical
• Disadvantage: building a database that is sufficiently comprehensive
[Figure: raw, uninterpreted MS/MS spectra are matched against a spectral database]
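The slides do not say how two spectra are compared. One common choice, assumed here purely for illustration, is a normalised dot product (cosine similarity) between binned intensity vectors. In the Python sketch below the function names, the 1 Th bin width and the toy library with its intensities are all hypothetical; only the m/z values are borrowed from spectra shown in this lecture.

```python
import math


def binned(spectrum, width=1.0):
    """Sum intensities of (m/z, intensity) peaks into bins of the given width."""
    bins = {}
    for mz, intensity in spectrum:
        key = int(mz // width)
        bins[key] = bins.get(key, 0.0) + intensity
    return bins


def cosine(spectrum_a, spectrum_b, width=1.0):
    """Normalised dot product of two binned spectra (1.0 = same pattern, 0.0 = no overlap)."""
    a, b = binned(spectrum_a, width), binned(spectrum_b, width)
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(query, library):
    """Name of the library entry most similar to the query spectrum."""
    return max(library, key=lambda name: cosine(query, library[name]))


# Toy library: intensities are made up for illustration.
library = {
    "HLVDEPQNLIK": [(251.15, 35.0), (712.43, 30.0), (956.50, 40.0), (1055.57, 100.0)],
    "YQLDDDASVR": [(175.12, 20.0), (547.28, 60.0), (662.31, 80.0), (890.42, 100.0)],
}
query = [(251.16, 38.0), (712.36, 31.0), (956.39, 37.0), (1055.44, 100.0)]
print(best_match(query, library))
```

Because intensities enter the score, this kind of matching uses exactly the information that a mass-only database search ignores.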
Bioinformatics and peptide identification
De novo sequencing
• Identify peptides directly from the spectrum
• Advantage: independent of any database; applicable to species with unsequenced genomes
• Disadvantage (so far): accuracy, sensitivity
Example: spectral interpretation
[Figure: MS/MS spectrum, annotated step by step on successive slides. Labelled peaks (m/z): 136.08, 175.13, 181.11, 274.13, 292.17, 432.30, 500.80, 547.35, 662.40, 777.44, 890.54, 1001.59]
1. Precursor mass: 591.28 (2+) → [M+H]+ = 1181.56 → 10-11 amino acids
2. Differences between the high-mass peaks match residue masses:
   547.35 - 432.30 = 115.0 (D)
   662.40 - 547.35 = 115.0 (D)
   777.44 - 662.40 = 115.0 (D)
   890.54 - 777.44 = 113.1 (I/L)
   Other differences marked on the slide (140.13, 157.2, 111.0) match no single residue mass.
3. 175.13 is y1 for a C-terminal R.
4. 274.13 - 175.13 = 99.0 (V). The gap 432.30 - 274.13 = 158.1 fits two residues: A+S or G+T. The peak at 181.11 is y3 (2+); it splits the gap into 87.0 (S) and 71.0 (A).
5. Remaining mass above 890.54: 291.0. The peak at 136.08 is the immonium ion of Y.
6. 291.0 = 163.1 + 128.1? → QY or YQ.
   890.5 + 163.1 = 1053.6
   890.5 + 128.1 = 1018.6
   1018.6 - 17 (NH3) = 1001.6, which matches the peak at 1001.59 (y-17), so Q comes first after 890.54.
7. 292.17 is b2 (calculated b2: 292.1; 'b1: 164.1').
8. Result: YQLDDDASVR, porin (Drosophila melanogaster)
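The final assignment can be cross-checked by calculating the y-ions of YQLDDDASVR and comparing them with the labelled peaks. A minimal Python sketch; the function name `annotate_y`, the 0.3 Da tolerance (chosen here to suit the mass accuracy apparent in this spectrum) and the proton mass are assumptions, and only singly and doubly charged y-ions are considered.

```python
# Monoisotopic residue masses (Da), copied from the residue mass table.
RESIDUE_MASS = {
    "G": 57.02147, "A": 71.03712, "S": 87.03203, "P": 97.05277,
    "V": 99.06842, "T": 101.04768, "C": 103.00919, "I": 113.08407,
    "L": 113.08407, "N": 114.04293, "D": 115.02695, "Q": 128.05858,
    "K": 128.09497, "E": 129.0426, "M": 131.04049, "H": 137.05891,
    "F": 147.06842, "R": 156.10112, "Y": 163.06333, "W": 186.07932,
}
WATER = 18.010565   # H2O, from the H and O masses given on the slides
PROTON = 1.007276   # assumed proton mass, not stated on the slides

# Labelled peaks of the example spectrum.
OBSERVED = [136.08, 175.13, 181.11, 274.13, 292.17, 432.30,
            500.80, 547.35, 662.40, 777.44, 890.54, 1001.59]


def annotate_y(sequence, peaks, tol=0.3):
    """Map each observed peak to the y-ions (1+ or 2+) that fall within tol of it."""
    labels = {}
    total = WATER + PROTON
    for n, aa in enumerate(reversed(sequence[1:]), start=1):
        total += RESIDUE_MASS[aa]          # singly charged y(n)
        for z in (1, 2):
            theoretical = (total + (z - 1) * PROTON) / z
            for peak in peaks:
                if abs(peak - theoretical) <= tol:
                    name = f"y{n}" if z == 1 else f"y{n} ({z}+)"
                    labels.setdefault(peak, []).append(name)
    return labels


for peak, names in sorted(annotate_y("YQLDDDASVR", OBSERVED).items()):
    print(peak, ", ".join(names))
```

Ambiguous matches are reported rather than resolved: at this tolerance the peak at 274.13 is compatible with both y2 and doubly charged y5. Peaks that are not y-ions (the immonium ion at 136.08, b2 at 292.17, the y-17 peak at 1001.59) are left unlabelled by this sketch.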
[Figures: exercise spectra labelled Protein 3_2, Protein 4_2 and Protein 8_1; one peak is labelled 175.12]
Answers:
Peptide 3_2: NSIVVIDATPFR
Peptide 4_2: AAYFGFYDTAR
Peptide 8_1: SIQDLTVTGTEPGQVSSR
'Ion-trap': VTLGTQPTVLR