Data Processing
Metabolomics Workshop
I Orbi 4 - 2012
The world leader in serving science
Metabolomics Profiling: Current Practice and Challenges
• Goals
1) Qualitative & quantitative assessment of the biochemical composition of the samples
2) Differential analysis between sample groups
3) Identify compounds responsible for changes
• Challenges
• Complexity of biological samples
• Diversity of small molecule metabolites (LC challenge)
• Wide range of concentration
• Multiple sources of variability
• Lack of standards
• Incomplete information – majority of components in LC/MS are unknowns
• Structure elucidation of unknowns is expensive
Metabolomics Profiling: Current Practice and Challenges
[Slide overlays on the same list of challenges: "… Not typical for metabolomics" and "…. more realistic situation (without proper tools)"]
Cornerstones of Metabolomics Workflow
1. Sample preparation
2. HPLC separation
3. Mass Spectrometry detection
4. Data processing and reporting
Chromatography for Metabolomics
• Lipids
• Typically not a big problem
• RPLC works pretty well (C18, C8, C30 for isomer)
• Polar metabolites
• Miscellaneous types of compounds
• Some are “difficult” compounds – some amino acids, nucleotides
• Various and good LC methods are available – but usually not for all the
compounds together
• GOAL – a single LC method for all polar metabolites
challenge …
Chromatography for Metabolomics - Polar Metabolites
• RPLC (C18)
• Good reproducibility
• No or limited retention for very polar compounds
• Ion pairing
• Significant ion suppression
• Possible decrease of dynamic range of Orbitrap (filling trap with ion pairing ions)
• Ion Chromatography
• Good for specific group of compounds (e.g. organic acids, nucleotides, …)
• Not a broad (“universal”) method for all metabolites
Profiling of Organic Acid Metabolites by Capillary IC
[Figure: capillary IC chromatograms, one trace per compound: Glycolate, Oxalate, Glycerate, Fumarate, Lactate, Malate, 2-Hydroxyisobutyrate, 2-Hydroxybutyrate, Tartrate, Pyruvate, 2-Oxoglutarate, Glutarate-d6, cis-Aconitate, trans-Aconitate, Citrate, isoCitrate, Succinate. Number pairs printed on the traces: 75 73, 89 61, 115 71, 105 75, 89 43, 133 115, 149 87, 103 57, 87 43, 145 101, 173 85, 137 93, 191 111, 117 73. Axis ticks run from 2 to 14 and from 10 to 24.]
Chromatography for Metabolomics - Polar Metabolites
• RPLC (C18)
• Good reproducibility
• No or limited retention for very polar compounds
• Ion pairing
• Significant ion suppression
• Possible decrease of dynamic range of Orbitrap (filling trap with ion pairing agent)
• Poor retention time reproducibility
• Ion Chromatography
• Good for specific group of compounds (e.g. organic acids, nucleotides, …)
• Not a broad (“universal”) method for all metabolites
• HILIC
• Good retention for polar compounds
• Compounds eluting in organic solvent
• Good potential to be a “universal” method for polar metabolites
• Historically, methods have been tricky to develop and reproduce
HILIC LC-MS for Metabolomics
• Luna NH2 100 Å, 3 µm, 150 x 2 mm
• Column temp.: 15 °C
• Biological matrix samples
• reconstituted in 50% ACN
• Inj. vol. 1 µL
• A = 5 mM AcONH4, pH 9.9
• B = ACN
• Q Exactive
• Pos/Neg switching
• R = 70k
HILIC for Metabolomics: Amino Acids
(most [M+H]+)
[Figure: extracted ion chromatograms, RT: 0.00 - 20.00, SM: 5G; x-axis: Time (min). Additional unlabelled peak apexes at 5.47, 8.25 and 15.17.]
• Leu/Ile (separated) – RT 6.02, AA 5248539477
• Tyr – RT 7.76, AA 650551637
• Gly – RT 8.01, AA 187502422
• Gln – RT 8.34, AA 1128162804
• Ser – RT 8.61, AA 57259176
• Arg – RT 8.76, AA 844873242
• Glu – RT 12.89, AA 774302105
• pSer [M-H]- – RT 12.90, AA 8223136
HILIC for Metabolomics: Nucleotides
[Figure: extracted ion chromatograms, RT: 4.00 - 28.00, SM: 5G; x-axis: Time (min). Additional unlabelled peak apex at 18.86.]
• dAMP – RT 9.29, AA 3915504
• AMP – RT 15.23, AA 20152930
• GMP – RT 16.66, AA 365001
• ADP – RT 18.57, AA 8162921
• CTP [M-H]- – RT 20.92, AA 885386
• UTP [M-H]- – RT 21.32, AA 2203457
• ATP [M-H]- – RT 21.62, AA 3344803
• GTP [M-H]- – RT 23.02, AA 78634
HILIC for Metabolomics: Sugar Phosphates & Organic Acids
(all [M-H]-)
[Figure: extracted ion chromatograms, RT: 0.00 - 28.02, SM: 5G; x-axis: Time (min). Additional unlabelled peak apexes at 11.38 and 12.19.]
• Pyruvate – RT 5.24, AA 1910441
• Malate – RT 14.86, AA 300254421
• Glucose-6-phosphate – RT 14.88, AA 3970448
• α-Ketoglutarate – RT 15.01, AA 66545477
• Oxaloacetate – RT 15.11, AA 242102
• Citrate – RT 17.57, AA 178250854
• 3-Phosphoglycerate – RT 17.68, AA 224903
• Glyceraldehyde-3-phosphate – RT 17.97, AA 166714
Cornerstones of Metabolomics Workflow
1. Sample preparation
2. HPLC separation
3. Mass Spectrometry detection
4. Data processing and reporting
Good HR/AM Mass Spectrometer for Metabolomics?
• Accurate mass stability
• Robust accuracy over extended periods – (set it and forget it)
• Ability to do pos/neg switching within a run and maintain accuracy
• Can save 50% of analysis time
• Speed
• Compatibility with the most demanding UHPLC separations
• Resolution
• Primary discriminator for the analytes of interest - (more is better)
• Want as much as we can get without compromising sensitivity
• Sensitivity
• As good if not better than a triple quad
Alternating Polarity Switching – Cycle time (R = 70k)
[Figure: two panels from Metabolomics_sample_141, FTMS p ESI Full ms [70.00-1000.00]; x-axis: Time (min), y-axis: Relative Abundance; individual scans are labelled with their retention times]
• Glycine panel, RT: 7.64 - 8.42
• TIC, NL: 8.45E8
• [Gly+H]+, m/z = 76.0389-76.0397 (FTMS +), NL: 1.85E7
• [Gly-H]-, m/z = 74.0244-74.0252 (FTMS -), NL: 6.06E6
• Glutamate panel, RT: 12.43 - 13.52
• TIC, NL: 9.33E8
• [Glu+H]+, m/z = 148.0597-148.0611 (FTMS +), NL: 6.77E7
• [Glu-H]-, m/z = 146.0452-146.0466 (FTMS -), NL: 7.63E7
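As a rough illustration of how a cycle time can be read off such a trace, the sketch below takes the retention-time labels printed on one glutamate trace (12.91 to 13.03 min) and assumes they mark consecutive scans of a single polarity. The function names and the 0.2 min peak width are hypothetical and only serve the example.

```python
from statistics import median

# Retention-time labels (min) printed on one trace of the glutamate panel.
# Assumption: they mark consecutive scans of the same polarity.
GLU_SCAN_LABELS_MIN = [12.91, 12.93, 12.95, 12.97, 12.99, 13.01, 13.03]


def cycle_time_seconds(scan_times_min):
    """Median spacing between consecutive scan times, in seconds."""
    gaps = [later - earlier
            for earlier, later in zip(scan_times_min, scan_times_min[1:])]
    return median(gaps) * 60.0


def points_across_peak(peak_width_min, cycle_s):
    """Approximate number of scans falling within a peak of the given width."""
    return peak_width_min * 60.0 / cycle_s


cycle_s = cycle_time_seconds(GLU_SCAN_LABELS_MIN)
# Hypothetical chromatographic peak width of 0.2 min at the base.
n_points = points_across_peak(0.2, cycle_s)
print(f"cycle time ~{cycle_s:.2f} s, ~{n_points:.0f} points across a 0.2 min peak")
```

Under these assumptions each polarity is sampled roughly every 1.2 s, so the number of points per peak depends mainly on the chromatographic peak width.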
SIM
Increase sensitivity of your Q Exactive for
targeted metabolomics
…. significantly (if not dramatically)
Animation of SIM on Q Exactive
*3:30 min
Full-MS vs. SIM in Metabolomics
[Figure: extracted ion chromatograms of an identical sample acquired in Full-scan mode (rpmi_468_1004, FTMS ESI Full ms [70.00-1000.00], RT: 0.00 - 28.01, SM: 5G) and in SIM mode (RPMI_468_SIM_3001, FTMS ESI SIM msx, RT: 0.00 - 28.00, SM: 3G); x-axis: Time (min)]
• NAD+ (+), m/z 664.11308-664.11972: Full-scan RT 12.72, AA 6955648, NL 7.50E5; SIM RT 12.33, AA 40294435, NL 4.27E6
• NADH (+), m/z 666.12872-666.13538: Full-scan NL 0 (no peak, marked "!"); SIM RT 15.24, AA 716959, NL 1.66E5
• NADP+ (-), m/z 742.06447-742.07189: Full-scan RT 17.52, AA 55919, NL 1.06E4; SIM RT 17.43, AA 867351, NL 1.27E5
• NADPH (-), m/z 371.53641-371.54013: Full-scan RT 20.21, AA 87582, NL 1.38E4; SIM RT 20.15, AA 119607, NL 1.25E4
• Diphosphoglycerate (-), m/z 264.95067-264.95331: Full-scan RT 23.76, AA 224530, NL 2.48E4; SIM RT 23.82, AA 686997, NL 4.25E4
NADH is observed when using multiplexing SIM
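To put a number on the SIM gain, the peak areas (AA) printed on the chromatograms can be compared directly. The sketch below is only an illustration: the assignment of each area to a compound follows the m/z windows in the figure and is an assumption, and the names `PEAK_AREAS` and `sim_gain` are hypothetical.

```python
# Peak areas (AA) as printed on the Full-scan and SIM chromatograms.
# Assumption: each area is assigned to a compound by its m/z window.
# None means no peak was integrated in that mode.
PEAK_AREAS = {
    "NAD+": (6955648, 40294435),
    "NADH": (None, 716959),
    "NADP+": (55919, 867351),
    "NADPH": (87582, 119607),
    "Diphosphoglycerate": (224530, 686997),
}


def sim_gain(full_scan_area, sim_area):
    """Fold change of the SIM peak area over the Full-scan peak area.

    Returns None when the compound was not detected in Full-scan mode,
    because a ratio is undefined in that case.
    """
    if not full_scan_area:
        return None
    return sim_area / full_scan_area


gains = {name: sim_gain(full, sim) for name, (full, sim) in PEAK_AREAS.items()}
for name, gain in gains.items():
    if gain is None:
        print(f"{name}: detected only with SIM")
    else:
        print(f"{name}: {gain:.1f}x larger peak area with SIM")
```

Peak area ratios are only a proxy for sensitivity; signal-to-noise would be the stricter comparison, but noise values are not given on the slide.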
Q Exactive (Exactive Plus) settings for
Metabolomics LC-MS assays
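Q Exactive Settings for Pos-Neg switching Full-MS
[Screenshots: method settings for negative (-) and positive (+) polarity]
QE Settings for Pos-Neg switching SIM dd-MS2
[Screenshots: method settings for (-) and (+) polarity]
QE Settings for Pos-Neg switching SIM dd-MS2
[Screenshots: method settings for (-) and (+) polarity, with inclusion mass list]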
Things to Consider When Setting Your Orbi for Metabolomics
• Q Exactive, Exactive Plus, Orbi Velos/Elite
• Tune parameters:
• S-lens: Lower the setting to ~30 - 40 % (default = 50%)
• To avoid in-source fragmentation of fragile analytes
• Lower Tube lens values for LTQ Orbi (XL), Exactive
• LTQ Orbitraps
• Very small ions and larger ions together
• 2 scan events: 1) 70 - 200 Da, 2) 150 - 1000 Da
• HESI 2 ion probe settings
• Aux gas heat (vaporizer) can be set high (500 °C, depending on LC flow)
• Capillary temperature set mid-low: ~275 °C for Q Exactive, Exactive Plus, Orbi Velos/Elite; ~250 °C for LTQ Orbi (XL), Exactive
Targeted Screening and Quantitation for
Metabolomics
TraceFinder
TraceFinder 2.1
• Quantitation and targeted screening software platform
• 1) Method setup, 2) acquisition, 3) data processing, 4) reporting
• All-in-one package
• Fast method development for integrated multi-residue analysis
• All Thermo MS platforms supported – Orbitraps, IT, TSQ, GC-MS
• Automated report generation
• In the next version (very close future)
• HR/AM product-ion database
• HR/AM MS/MS library (spectral matching)
• Isotopic pattern matching (… of precursor ions in MS)
TraceFinder - Compound Datastore
TraceFinder: Method Development - Compound Identification and Detection
[Screenshot callouts: Target Compounds, XIC, Integration Parameters]
TraceFinder Method Development: Calibration Setup
TraceFinder Batch View: Sample Sequence
[Screenshot callouts: Sequence List; Select compounds to be reported]
Reports
• Built in
• Possibility for custom made
TraceFinder Data Review: Quan Results
[Screenshot callouts: Compound List, Sequence List, XIC, Cal. curve]
SIEVE
Differential Analysis Software
SIEVE 2.0: The Differential Analysis Software
• For label free, semi-quantitative differential analysis
• Aids discovery of molecular changes between states
• Relative quant and trend plots over multiple sample groups
• Identification using ChemSpider or local database search
• PCA and other statistical analysis tools
New algorithm for small molecules
Proteomics, metabolomics and lipidomics workflows
SIEVE 2.0 - Background Subtraction (New)
Sample - Solvent blank = Analyte signals
~98% of lower intensity signals are eliminated
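The idea of blank subtraction can be sketched as a simple ratio filter. This is not SIEVE's actual algorithm, which the slides do not describe; it only illustrates the rule stated later in the parameter notes, that with a multiplier of 10 a peak has to be 10X more intense in the sample than in the blank to be kept. The feature labels and intensities below are hypothetical.

```python
def subtract_background(sample, blank, multiplier=10.0):
    """Keep features whose sample intensity is at least `multiplier`
    times their intensity in the solvent blank.

    `sample` and `blank` map a feature label to an intensity. Features
    absent from the blank are treated as having zero background.
    """
    kept = {}
    for feature, intensity in sample.items():
        background = blank.get(feature, 0.0)
        if intensity > 0 and intensity >= multiplier * background:
            kept[feature] = intensity
    return kept


# Hypothetical intensities, for illustration only.
sample = {"feature_A": 5.0e6, "feature_B": 2.0e5, "feature_C": 8.0e4}
blank = {"feature_B": 1.5e5, "feature_C": 1.0e3}

analyte_signals = subtract_background(sample, blank, multiplier=10.0)
print(sorted(analyte_signals))
```

Here feature_B is dropped because it is barely above its blank level, while feature_A (absent from the blank) and feature_C (80X the blank) are kept.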
Anatomy of a UHPLC/Orbitrap Data Set
• >1,000,000 data points
• ~100,000 extracted ion peaks
• Peak area ranges ~ 7 orders
• Much irrelevant data
• Much redundant data
• High quality data from the Orbitrap mass analyzer allows for more precise automated data processing
[Figure: mass spectrum, m/z 100 to 1000, with an m/z window of ± 1 Da (853.0 to 855.0) showing peaks at 852.9720, 853.4727, 853.9745 and 854.4817, z = +2]
[Figure: charge state distribution: Z=1 (73%), Z=2 (12%), Z=3 (10%), Other (5%)]
[Figure: adduct assignments (%). Adducts listed: [M+H]+, [M+Na]+, [M-H2O+H]+, [2M+H]+, [M+NH4]+, [2M+Na]+, [M+K]+, [M-2(H2O)+H]+, [M+CH3CN+H]+. Percentages shown: 100, 12.1, 8.3, 4.7, 3.8, 3.1, 2.7, 2.5, 2.1]
Need to be able to reduce the data to chemical entities
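One step in reducing raw peaks to chemical entities is assigning a charge state from the isotope spacing. The sketch below is a minimal illustration, not the vendor algorithm: it assumes adjacent isotope peaks differ by one 13C for 12C substitution (about 1.00335 Da), so the m/z spacing is about 1.00335/z. The function name is hypothetical; the peak lists are the m/z values printed on the slides.

```python
C13_C12_DELTA = 1.00335  # mass difference between 13C and 12C, in Da


def charge_from_isotope_spacing(mz_peaks):
    """Estimate the charge state from the mean spacing of isotope peaks."""
    peaks = sorted(mz_peaks)
    gaps = [upper - lower for lower, upper in zip(peaks, peaks[1:])]
    mean_gap = sum(gaps) / len(gaps)
    return round(C13_C12_DELTA / mean_gap)


# Isotope cluster shown in the m/z 853 to 855 window of the slide.
doubly_charged = [852.9720, 853.4727, 853.9745, 854.4817]
# [M+H]+ isotope cluster from the declustering example (m/z 524.3703 upward).
singly_charged = [524.3703, 525.3730, 526.3756, 527.3784, 528.3811]

print(charge_from_isotope_spacing(doubly_charged))
print(charge_from_isotope_spacing(singly_charged))
```

A spacing of about 0.5 m/z gives z = 2 for the first cluster, matching the "z = +2" label on the slide, and a spacing of about 1.0 gives z = 1 for the second.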
SIEVE 2.0 - Component Detection – Declustering (New)
Adducts, fragments and multimers
• [M+H]+: 524.3703, z=1, I=4.2E+08, 100%
• [M+Na]+: 546.3517, z=1, I=1.0E+08, 24.6% (21.9816 above [M+H]+)
• [M+K]+: 562.3232, z=1, I=1.1E+06, 0.3% (37.9554 above [M+H]+)
Isotopic peaks of [M+H]+
• A+1: 525.3730, I=1.2E+08, 28.9%
• A+2: 526.3756, I=2.3E+07, 5.5%
• A+3: 527.3784, I=3.0E+06, 0.7%
• A+4: 528.3811, I=3.9E+05, 0.1%
Isotopic peaks of [M+Na]+
• A+1: 547.3535, I=2.9E+07, 27.8%
• A+2: 548.3577, I=5.6E+06, 5.4%
• A+3: 549.3595, I=9.0E+05, 0.9%
Constituents are represented by base component
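The grouping of adducts under one base component can be illustrated with a mass-difference check. This is a simplified sketch rather than SIEVE's declustering routine: it only tests whether co-eluting ions sit at a known adduct shift from the [M+H]+ ion, within a ppm tolerance. The shift table, the 10 ppm default and the function name are assumptions made for the example.

```python
# Monoisotopic m/z shifts relative to [M+H]+ for singly charged adducts.
ADDUCT_SHIFTS = {
    "[M+NH4]+": 17.026549,  # NH4+ minus H+
    "[M+Na]+": 21.981944,   # Na+ minus H+
    "[M+K]+": 37.955881,    # K+ minus H+
}


def annotate_adducts(base_mz, candidate_mzs, tol_ppm=10.0):
    """Label candidate ions that match a known adduct of the base [M+H]+ ion."""
    annotations = {}
    for mz in candidate_mzs:
        for adduct, shift in ADDUCT_SHIFTS.items():
            expected = base_mz + shift
            error_ppm = abs(mz - expected) / expected * 1e6
            if error_ppm <= tol_ppm:
                annotations[mz] = adduct
    return annotations


# m/z values from the declustering slide; 530.0000 is a made-up non-adduct ion.
component = annotate_adducts(524.3703, [546.3517, 562.3232, 530.0000])
print(component)
```

A real implementation would also require the ions to share a retention time apex and would handle fragments, multimers and isotopes, which this sketch leaves out.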
Component Detection – D4-Succinic Acid
One Compound:
31 ions
18 Adducts
13 Isotopomers
Example of a Component
m/z 232.1541, RT = 3.54 min
[Figure: traces for Rat O, Rat L and blank]
Accurate Mass Identification
Component MW → ChemSpider web service or local database (.csv) → list of candidates
Local database (.csv) with columns MolWt, Expression and Name (MolWt and Name shown):
MolWt      Name
290.079    L-Epicatechin
306.074    Epigallocatechin
314.01     D-glycoside of vanillin
380.1254   Vellokaempferol 3-5-dimethyl ether
382.1047   Velloquercetin 4 -methyl ether
426.0945   Epigallocatechin 3-O-(4-hydroxybenzoate)
436.1153   Epigallocatechin 3-O-cinnamate
450.0793   Quercetin 4 -galactoside
468.1051   Epigallocatechin 3-O-caffeate
472.1      Epigallocatechin 3-O-(3-O-methylgallate)
477.1266   Isorhamnetin 7-alpha-D-Glucosamine;Quercetin 3 -methyl ether 7-alpha-D-Glucosamine;7-[(2-Amino-2-deoxy-alpha-D-glucopyranosyl)oxy]-3
478.0742   Quercetin 7-glucuronide
486.1157   Epigallocatechin 3-O-(3-5-di-O-methylgallate)
494.0691   Myricetin 3-glucuronide
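A local database search of this kind amounts to a tolerance lookup on molecular weight. The sketch below is a generic illustration, not SIEVE's DBLookup code: the two-column CSV layout, the 10 ppm tolerance, the function names and the query mass are assumptions, while the four database rows are taken from the list above.

```python
import csv
import io

# A few rows from the local database shown on the slide.
# Assumption: a simple two-column CSV layout (MolWt, Name).
LOCAL_DB_CSV = """MolWt,Name
290.079,L-Epicatechin
306.074,Epigallocatechin
478.0742,Quercetin 7-glucuronide
494.0691,Myricetin 3-glucuronide
"""


def load_database(csv_text):
    """Parse the CSV text into a list of (molecular weight, name) tuples."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(float(row["MolWt"]), row["Name"]) for row in reader]


def find_candidates(component_mw, database, tol_ppm=10.0):
    """Return (name, MolWt, ppm error) for entries within the mass tolerance."""
    hits = []
    for mol_wt, name in database:
        error_ppm = (component_mw - mol_wt) / mol_wt * 1e6
        if abs(error_ppm) <= tol_ppm:
            hits.append((name, mol_wt, round(error_ppm, 1)))
    return hits


database = load_database(LOCAL_DB_CSV)
# Hypothetical component MW, chosen to fall close to one database entry.
candidates = find_candidates(306.0752, database)
print(candidates)
```

Restricting the lookup further by retention time or elemental composition would be additional filters applied to the same hit list.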
SIEVE 2.1 Features
• Improved peak detection & integration
• Complement native algorithm with PPD
• Elemental composition
• ChemSpider, DBLookup search
• RT & Formula in DBLookup
• Local database searches can now be
limited by retention time and elemental
composition
• Enhanced filtering capability
• Pathway mapping
• KEGG visualization of full experimental
results
Pathway Annotation in Sieve 2.1
Case Study
Metabolomics Application - ZDF Rat Serum
• ZDF Lean rat serum (n=3)
• ZDF Obese rat serum (n=3)
• Water blanks (n=3)
• ESI Positive Ion full scan LC-MS
• Q Exactive 70K resolution
• Goals:
• Find the exact monoisotopic MW of components that are statistically
significantly different between the Lean and Obese groups.
• Determine putative IDs by an exact mass/formula search of the Human Metabolome DB using ChemSpider
ZDF Rat Serum – SIEVE 2.0 Analysis
UHPLC Conditions: Accela 1250
• Samples: 50 µL serum precipitated with 150 µL cold methanol, 0.1% formic acid
• Internal Standard: 5 µg/mL of d5-Hippuric acid (200 µL)
• Column: Hypersil GOLD aQ 2.1 x 150 mm, 1.9 µm, 50 °C
• Mobile Phase: A: 0.1% formic acid in Water, B: 0.1% formic acid in Acetonitrile
• Injection: 3 µL
Time (min)   % A   % B   Flow (µL/min)
0.00         100   0     600
6.00         80    20    600
8.00         60    40    600
12.00        5     95    600
14.00        5     95    600
14.10        100   0     600
17.50        100   0     600
Q Exactive Conditions:
• Full scan MS: 70K resolution, ESI+ and ESI-, m/z 65-850, 0-14 min
• AGC Target = 1.0 E+6, Max IT = 120 ms
• Source: Vaporizer temp = 400 °C, S-Lens = 35
m/z 232.1541, RT = 3.54 min, p = 2.7 E-6, 3.0-fold change
[Figure: OBESE vs. LEAN groups]
ChemSpider ID of C4 Carnitine, m/z 232.1541
Examples of Significant Changes in ZDF Rats
[Figure: four panels: C4 Carnitine (p = 2.73 E-6), Leucine, Uric Acid and Uracil. Other p-values shown on the panels: 1.35 E-6, 1.40 E-4, 2.49 E-6]
Lean vs. Obese Rat Serum by SIEVE 2.0
[Figure: PCA results, Obese and Lean groups]
Significant Changes from Obese vs. Lean ZDF Rats
[Table screenshot: ChemSpider IDs]
• Ratios in red are downregulated in Obese rats
• Ratios in green are upregulated in Obese rats
• Branched amino acids & acylcarnitines are known hallmarks of Type II Diabetes in the literature [1]
1. Muoio, D.M.; Newgard, C.B. Nature Rev. Mol. Cell Biol. 2008, 9, 193-205.
Sieve Parameters
What Do Those Parameters Mean?
SIEVE Parameters
• Mass range
• RT range
• Change to false and rerun alignment if aligned data does not look good
• I don’t change these values
• Multiplier for background removal: in this case a peak would have to be 10X more intense in the sample than in the blank to be observed
• Minimum intensity threshold
• 10 ppm is fine; I do not change it
• Minimum number of scans across the peak
SIEVE Parameters
• Be sure to change this when moving from pos to neg mode
• Used in background subtraction routine
• I do not change
• Exclusion list set for your own setup
• No need to change. This is part of the algorithm.
• Not used
• Electronic noise removal – do not change
• How close the adduct/dimer apexes need to be to be considered part of the component. Can be changed, but 5 scans seems to work
• These parameters are not used for component detection
• Database search parameters
Things to Remember for Sieve
• Sieve is performing an in-depth analysis of large data sets. It will take a little time for it to process.
• A faster computer will help. Sieve is 64-bit compatible and does multi-threading.
• Remember to look at the alignment.
• Make sure you are using the correct adduct table.
• Run blanks: all samples labeled as blanks will be used in background subtraction