AIChE Annual Meeting 2005

Identification of Important Signaling Proteins and Stimulants for the Production of Cytokines in RAW 264.7 Macrophages
Sylvain Pradervand1, Mano Ram Maurya2, Shankar Subramaniam1,2
1San Diego Supercomputer Center
2Department of Bioengineering
University of California, San Diego
AIChE Annual Meeting, Wednesday, November 02, 2005
Outline
• Production and release of cytokines in macrophages
• Identification of significant correlations between signaling proteins and cytokines
• Quantitative input/output modeling using Principal Component Regression (PCR)
• Results
• Summary and conclusions
Cytokine Production and Release in Macrophages
Cytokines

Proteins for communication between immune cells

Important players of the immune system

Apoptosis of infected cells

Initiation of inflammation

Control of inflammation
Secreted by immune cells

Complex signaling activities followed by gene-expression
Cytokine Production and Release in Macrophages
Diagram: multiple stimuli (ligands) → macrophage, with a complex signaling network of signaling proteins (phosphoproteins) → cytokines (paracrine cytokines, endocrine cytokines)
Each signaling protein or 2nd messenger is a marker of a pathway; the overall network is very complex
Acknowledgement: http://www.biocarta.com/pathfiles/h_gpcrPathway.asp
Clustering and Correlation Analysis
Hierarchical clustering (using R)

Reveals which inputs (ligands) have similar effect on the
signaling proteins and on cytokine release
Correlation analysis (using R)

Neyman-Pearson correlation method
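The correlation step was run in R; as a rough illustration only, the sketch below computes an ordinary Pearson correlation coefficient between one signaling-protein response and one cytokine response across ligand conditions. That the slide's correlation is the standard Pearson coefficient is an assumption, and the function name `pearson_r` and the toy numbers are hypothetical, not taken from the data set.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical log2 responses across five ligand conditions (illustrative only).
phospho_response = [0.1, 0.8, 1.5, 2.1, 2.9]
cytokine_release = [0.0, 0.5, 1.1, 1.8, 2.2]
r = pearson_r(phospho_response, cytokine_release)
```

A significance filter would then keep only the protein/cytokine pairs whose coefficient passes a chosen test; the test itself is not shown here.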
Signaling Pathway Activations and Cytokine Release
Figure: ligands against signaling proteins/2nd messengers and cytokines; the scale runs from decrease to increase, and Toll-like receptor (TLR) ligands are marked
Toll-like receptors: pattern recognition receptors (PRRs) that bind pathogen-associated molecular patterns, for immediate action without antibodies
Correlation between Signaling Activity and Cytokines
Significant correlations are displayed
Figure panels: with Toll-like receptor (TLR) ligands and without TLR ligands; cytokines against signaling proteins, with positive and negative correlations marked
• With TLR ligands (TLRL), the pattern of correlations is similar for many cytokines
• TLRL are dominant; the effect of the other ligands is less visible
• Without TLRL data, only a few positive correlations remain, most of them involving TNF
• Without TLRL data, STATs show stronger correlations
Further (Quantitative) Analysis
Hierarchical clustering and correlation analysis are only
qualitative
Detailed signaling map is not available

Develop simplified linear input/output models

Elucidate common and different signaling modules

Predict cytokine release
Further (Quantitative) Analysis
Two-part model: Y = Y1 + Y2
Part-I: Capture most of the output as Y1 = X*B1 (PP-model); the X's are highly correlated
Part-II: Model the residual, Y - Y1, as Y2 = L*B2
• Use principal component regression (PCR)
• PLS not used since the number of data points >> the number of outputs
PCR-Based Approach
Estimate B s/t Y = X*B, using known X and Y
X (m x n1): input or predictor
Y (m x n2)
V (n1 x k): matrix of eigenvectors of cov(X)
T (m x k): matrix of latent variables (k latent variables)
Q (k x n2)
Y = X*B, T = X*V, Y = T*Q, so Y = X*V*Q and B = V*Q
1. Calculate T = X*V
2. Calculate Q with least squares
3. Predict Y: Yp = T*Q
4. Repeat the procedure for the residual, Y - Yp, with L as input
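The four steps above can be sketched with NumPy. This is a minimal illustration on synthetic data, not the authors' code: the helper names `pcr_fit` and `pcr_predict`, the mean-centering of X and Y, and the toy matrices are all assumptions made for the example.

```python
import numpy as np

def pcr_fit(X, Y, k):
    """Principal component regression with k latent variables.

    Returns (B, x_mean, y_mean). Mean-centering is an assumption of this sketch.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean
    # Eigenvectors of cov(X); eigh returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    V = eigvecs[:, np.argsort(eigvals)[::-1][:k]]   # n1 x k
    T = Xc @ V                                      # step 1: T = X*V
    Q, *_ = np.linalg.lstsq(T, Yc, rcond=None)      # step 2: least squares
    return V @ Q, x_mean, y_mean                    # B = V*Q

def pcr_predict(X, B, x_mean, y_mean):
    """Step 3: predicted output Yp for inputs X."""
    return (np.asarray(X, dtype=float) - x_mean) @ B + y_mean

# Synthetic stand-ins: X ~ signaling proteins, L ~ ligand indicators.
rng = np.random.default_rng(0)
m = 30
X = rng.normal(size=(m, 4))
X[:, 3] = X[:, 0] + 0.01 * rng.normal(size=m)       # highly correlated predictors
L = rng.integers(0, 2, size=(m, 3)).astype(float)
Y = (X[:, :2] @ np.array([[1.0], [0.5]])
     + 0.3 * L[:, [0]]
     + 0.05 * rng.normal(size=(m, 1)))

B1, xm, ym = pcr_fit(X, Y, k=3)                     # Part I: Y1 = X*B1
Y1 = pcr_predict(X, B1, xm, ym)
B2, lm, rm = pcr_fit(L, Y - Y1, k=3)                # step 4: residual on L
Y2 = pcr_predict(L, B2, lm, rm)
Yp = Y1 + Y2
```

Dropping the smallest-eigenvalue component (k=3 of 4 here) is what keeps the fit stable when columns of X are nearly collinear.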
Statistical Significance of the Coefficients
Most coefficients are non-zero
Identify the significant coefficients
• Estimate the coefficients for many random models
• Randomly shuffle Y (the data points) to obtain Ys, then calculate the coefficients
• Calculate the standard deviation of the random coefficients ($\sigma_j$)
• Calculate the ratio: $r_j = b_j / \sigma_j$
• Significance test at the 95% confidence level: $r_{th} = 1.96$
• Null hypothesis true if $r_j < r_{th}$
Use a higher threshold for the residuals: $r_{th} = \sqrt{2} \times 1.96 = 2.77$
• $\sqrt{2}$ is the standard deviation of the difference of two samples from N(0,1)
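A shuffle-based significance ratio of this kind might look as follows. This is a sketch on synthetic data: the helper names, the use of the absolute ratio (so that negative coefficients can also pass the threshold), and the toy inputs are assumptions, not details given on the slide.

```python
import numpy as np

def pcr_coefficients(X, y, k):
    """PCR coefficients for a single output y (mean-centered; an assumption)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    V = eigvecs[:, np.argsort(eigvals)[::-1][:k]]
    q, *_ = np.linalg.lstsq(Xc @ V, yc, rcond=None)
    return V @ q

def significance_ratios(X, y, k, n_shuffles=1000, seed=0):
    """r_j = b_j / sigma_j, with sigma_j estimated from models fit to shuffled y."""
    rng = np.random.default_rng(seed)
    b = pcr_coefficients(X, y, k)
    shuffled = np.array([pcr_coefficients(X, rng.permutation(y), k)
                         for _ in range(n_shuffles)])
    sigma = shuffled.std(axis=0, ddof=1)
    return b / sigma

# Synthetic example: only the first predictor truly drives the output.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 3))
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=60)
r = significance_ratios(X, y, k=3)
significant = np.abs(r) >= 1.96     # 95% threshold from the slide
```

For a model of the residuals, the same comparison would use the higher threshold of 2.77.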
Cytokines Regulatory Signals
JNK, p38, NF-kB strongest coefficients
ERK1/2 and RSK similar profile
cAMP the only significant negative
Cytokines Regulatory Signals Without TLRL Data
cAMP kept its negative strength
STATs became more significant
Remaining positive coefficients: p38 (G-CSF and
TNF), RSK (TNF)
Cytokines Regulatory Signals in Residuals
Only few ligands statistically significant
Cytokines Regulatory Signals in Residuals Without TLRL Data
IL-4 is strong for IL-1a, IL-6 and IL-10
2MA is strong for G-CSF and TNFa
G-CSF and TNFa have a similar pattern of coefficients
Minimal PCR Model
• Many predictors are flagged as significant because they correlate with other important predictors
  - Identifies most known pathways, but with a high false-positive rate
• Identify the necessary and sufficient set of signaling pathways that would predict cytokine release
• Generate minimal models
  - Find the smallest number of predictors whose fit is statistically the same as that of the full model
  - Must be better than a zero-predictor (average) model
  - Use an F-test for each of these
Procedure for PCR Minimal Models
F-test:
Full (detailed) model with all significant predictors ($e_d$)
Decreasing number of predictors: keep eliminating the least significant predictor ($R_1$ increases, $R_2$ decreases)
As good as the full model: $R_1 = e_r^2 / e_d^2 < \mathrm{finv}(p, d_r, d_d)$
$p = 1 - \alpha$, $\alpha = 0.05$, $p = 0.95$
Better than the trivial model: $R_2 = e_0^2 / e_r^2 > \mathrm{finv}(p, d_0, d_r)$
$p = 0.68$ for the residuals
Initial minimal model
If more than one predictor is left, use combinatorial selection (integer programming) for exhaustive testing
Final minimal model
Zero-predictor model ($e_0$)
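The two ratio tests can be written as a small check. This is only a sketch under stated assumptions: the direction of each inequality is inferred from the slide labels ("as good as the full model", "better than the trivial model"), the critical values are taken as precomputed inputs (for example from an inverse F distribution at the chosen p and degrees of freedom), and the function name and numbers are hypothetical.

```python
def minimal_model_checks(e_reduced_sq, e_full_sq, e_zero_sq,
                         f_crit_full, f_crit_zero):
    """Evaluate the two F-ratio tests for one reduced model.

    e_*_sq are squared-error terms of the reduced, full and zero-predictor
    models; f_crit_* are critical values supplied by the caller (assumed to
    come from finv(p, ...) with the appropriate degrees of freedom).
    """
    r1 = e_reduced_sq / e_full_sq      # grows as predictors are removed
    r2 = e_zero_sq / e_reduced_sq      # shrinks as predictors are removed
    return {
        "R1": r1,
        "R2": r2,
        "as_good_as_full": r1 < f_crit_full,
        "better_than_trivial": r2 > f_crit_zero,
    }

# Hypothetical numbers, for illustration only.
result = minimal_model_checks(e_reduced_sq=1.2, e_full_sq=1.0, e_zero_sq=6.0,
                              f_crit_full=1.8, f_crit_zero=2.0)
accept = result["as_good_as_full"] and result["better_than_trivial"]
```

Predictor elimination would continue while both checks hold, and the last model that passes becomes the initial minimal model.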
Combined Minimal Model and Validation
Integrate validation with the model development
Build a network combining the results from the models with and without TLRL data
Pathways: p38, cAMP, NF-kB, JNK, STAT1
Ligands: others
10 regulatory modules
• JNK/NF-kB translates TLRL dependency
• p38/PAF: post-transcriptional controls?
• STAT1 affects the chemokines
• cAMP is anti-correlated (inhibitory?)
Validation with the Literature
Cytokine   False Positive    False Negative
G-CSF      0                 0
IL-1a      0                 0
IL-6       5% (2 extra)      0
IL-10      2.5% (1 extra)    40% (missed 2)
MIP-1a     0                 33% (missed 2)
RANTES     0                 0
TNFa       0                 17% (missed 1)
• With minimal model: overall 1.2% false positive rate (FPR) and 13% false negative rate (FNR)
• With full model: overall 11% FPR (10 times higher) and 3% FNR (4 times lower)
• Relative gain with minimization: a factor of 2.5
New Hypothesis for G-CSF from Network Reconstruction
All known regulatory pathways found
New hypothesis: p38 involved in posttranscriptional regulation of G-CSF
(stimulates production of neutrophils)?
Network diagram, ligands (receptors): ISO (Adrb2); LPS, P2C, P3C, R-848 (TLR2/1, TLR2/6, TLR4, TLR7); 2MA (P2X, P2Y). Pathways: p38, NF-kB, JNK. Output: G-CSF
Summary
Ligand screen data set → Statistical analysis → Modeling → Collection of hypotheses → Design in vitro assays
Cytokine Production and Release in Macrophages
Cytokines

Messenger proteins for communication between immune cells

Important players of the immune system
Glossary of Cytokine names
IL: interleukin
TGF: transforming growth factor
TNF: tumor necrosis factor
GM-CSF: granulocyte/macrophage colony stimulating
factor, also M-CSF and G-CSF
MIP: macrophage inflammatory protein
RANTES: Regulated on Activation, Normal T Expressed
and Secreted (also known as CCL5, binds to CCR5 which
is a coreceptor of HIV, thus blocks HIV from entering the
cell)
Glossary of Ligand Names
• GM-CSF: granulocyte/macrophage colony stimulating factor, also M-CSF and G-CSF
• IL: interleukin
• IFN: interferon (induces cells to resist viral replication)
• C5a: cleavage product from C5 (a protein of the complement pathway/system)
• R-848: Resiquimod, a potent antiviral agent
• LPS: lipopolysaccharide
• P2C: PAM2CSK4 (synthetic diacylated lipopeptide; AfCS)
• 2MA: 2-Methylthio-ATP, a synthetic analog of ATP (acts through P2X (ligand-gated) and P2Y (GPCR))
• LPA: Lysophosphatidic acid (derived from phospholipid)
• UDP: Uridine diphosphate (a nucleotide)
• S1P: Sphingosine-1-phosphate
• PAF: Platelet activating factor, a proinflammatory phospholipid
• ISO: Isoproterenol
• PGE: Prostaglandin E2, a lipid product of arachidonic acid metabolism; has an immunosuppressive effect
Glossary of Signaling Proteins
• cAMP
• Akt: protein kinase B
• ERK & JNK: MAPKs (from Wikipedia: to date, four distinct groups of MAPKs have been characterized in mammals: (1) extracellular signal-regulated kinases (ERKs), (2) c-Jun N-terminal kinases (JNKs), (3) p38 isoforms, and (4) ERK5)
• RSK: ribosomal S6 kinase
• GSK: Glycogen synthase kinase-3 (overexpressed in Alzheimer’s disease)
• NF-kB
• p40Phox (Neutrophil cytosolic factor 4; an oxidoreductase)
• SMAD: SMAD-1 is the human homologue of Drosophila Mad (Mad = Mothers against decapentaplegic)
• STAT: Signal Transducer and Activator of Transcription
• Rps6: ribosomal protein S6
Measurement of Signaling Proteins and Cytokines
2nd messengers
• Enzyme-linked immunoassay to measure cAMP concentrations
• Fluorescent dye to measure intracellular free calcium
Signaling proteins
• Immunoblots to detect phosphorylation of signaling proteins
Responses
• Agilent inkjet-deposited presynthesized oligo arrays to assess gene expression
• Multiplex suspension array system to measure concentrations of cytokines in the extracellular medium
Data are log-transformed after subtraction of the basal response observed in control data
Stimulation by single or double ligands at a fixed strength
Procedure for Normalization of Data
Data processing
Signaling proteins
• Log2(fold-change), with fold-change = response / basal response
• Except for cAMP, for which the basal was subtracted before taking log2
Cytokines
• Log2(response - basal + 1); basal is close to 0
• Signal-to-noise ratio (SNR) calculated
• Cytokine not analyzed if SNR < 5
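The two normalizations and the SNR cut-off can be written directly. This is a minimal sketch: the function names are hypothetical, the cAMP variant is left out because the slide does not spell out its exact form, and how the SNR itself is computed is not stated, so it is taken as a given input.

```python
import math

def normalize_signaling(response, basal):
    """Signaling protein: log2 of the fold-change over the basal response."""
    return math.log2(response / basal)

def normalize_cytokine(response, basal):
    """Cytokine: log2(response - basal + 1); the +1 keeps a zero change at 0."""
    return math.log2(response - basal + 1.0)

def keep_cytokine(snr, threshold=5.0):
    """A cytokine is analyzed only if its signal-to-noise ratio reaches the threshold."""
    return snr >= threshold

# Illustrative values only.
fold = normalize_signaling(response=4.0, basal=1.0)   # 2.0
cyto = normalize_cytokine(response=7.0, basal=0.0)    # 3.0
```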
Main Regulation at the mRNA Level
Most of the regulatory mechanisms act at the gene-transcription level
Except for IL-1a, good overall correlation, with coefficients > 0.9: 0.92 (MIP-1a) to 0.99 (IL-10)
Statistical Analysis of Ligand Interactions
Is there a more-than-additive effect of ligands on cytokine release (output)?
Use of a linear model (a similar model with fewer terms was used to identify significant ligands in single-ligand data):
$Y_{hijklt} = \mu + L1_h + L2_i + T_j + E_k + G_{l(k)} + (L1L2)_{hi} + \dots$
Terms: constant term ($\mu$); effect of ligand 1 ($L1_h$); effect of ligand 2 ($L2_i$); time effect ($T_j$); gel effect ($G_{l(k)}$); nonlinear term ($(L1L2)_{hi}$); random error
The ligand-effect terms are either 0 or non-zero but fixed, since the data correspond to a fixed strength of the stimulus
Null hypothesis: no synergism (more-than-additive effect) of the ligands on cytokine release, i.e., $(L1L2)_{hi} = 0$
• Used ANOVA (Analysis of Variance)
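To make the nonlinear term concrete, the sketch below estimates it for a single ligand pair as the difference between the double-ligand response and the sum of the two single-ligand responses, relative to the unstimulated control. It is a simplified stand-in, not the ANOVA itself: time, experiment and gel effects are ignored, no significance test is performed, and the function name and replicate values are hypothetical.

```python
from statistics import mean

def interaction_estimate(y_none, y_l1, y_l2, y_both):
    """Point estimate of the L1*L2 term in a 2x2 design.

    Each argument is a list of replicate responses (already normalized).
    A value near 0 is consistent with purely additive ligand effects.
    """
    return mean(y_both) - mean(y_l1) - mean(y_l2) + mean(y_none)

# Hypothetical normalized responses (illustrative only).
additive = interaction_estimate([0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0])
synergistic = interaction_estimate([0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [5.0, 5.0])
```

Here `additive` evaluates to 0 and `synergistic` to 2; whether a non-zero estimate is statistically significant is what the ANOVA decides.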
An Example of Hypothesis from Interaction Analysis
IL-4 enhances STAT1a/b activation by IFNg
Diagram: IFNg and IL-4 act through Pathway A and Pathway B, combining (+) on STAT1a/b
Examples of Hypothesis from Interaction Analysis
Gs ligands enhance G-CSF, IL-1, IL-6, IL-10
releases by TLRL
TLRL
ISO/PGE
+
Pathway A
Pathway B
G-CSF, IL-1a
IL-6, IL-10
AIChE Annual Meeting 2005
Examples of Hypothesis from Interaction Analysis
Synergism between IL-6 and TLRL on IL-10 release is mediated via an ERK1/2-dependent pathway
Diagram: TLRL and IL-6 act through ERK1/2 and Pathway B, combining (+) on IL-10
PCR-Based Approach
Estimate B s/t Y = X*B, using known X and Y
X (m x n1): input or predictor
Y (m x n2)
V (n1 x k): matrix of eigenvectors of cov(X)
T (m x k): matrix of latent variables (k latent variables)
Q (k x n2)
Y = X*B, T = X*V, Y = T*Q, so Y = X*V*Q and B = V*Q
1. $V_k = [V_1\ V_2 \dots V_k]$; $\Lambda_k$ = matrix of eigenvalues = $\mathrm{diag}[\lambda_1\ \lambda_2 \dots \lambda_k]$
2. Calculate $T = X V$; $T^{\top} T = \Lambda_k (m-1)$; $\Lambda_k^{-1} = \mathrm{diag}(1/\lambda_1, \dots, 1/\lambda_k)$
3. Calculate Q with the least-squares method: $Q = \frac{\Lambda_k^{-1}}{m-1}\, T^{\top} Y$
4. Calculate $B = V Q$ and the predicted Y: $Y_p = T Q = T\, \frac{\Lambda_k^{-1}}{m-1}\, T^{\top} Y$
5. Repeat the procedure for the residual, $Y - Y_p$, with L as input
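The closed form for Q follows from the orthogonality of the latent variables. A short derivation, assuming X is mean-centered so that cov(X) = XᵀX/(m-1), and that the columns of V_k are orthonormal eigenvectors:

```latex
\begin{aligned}
\operatorname{cov}(X)\,V_k &= V_k \Lambda_k, \qquad V_k^{\top} V_k = I_k,\\
T^{\top} T &= V_k^{\top} X^{\top} X\, V_k
            = (m-1)\, V_k^{\top} \operatorname{cov}(X)\, V_k
            = (m-1)\,\Lambda_k,\\
Q &= (T^{\top} T)^{-1} T^{\top} Y
   = \frac{\Lambda_k^{-1}}{m-1}\, T^{\top} Y .
\end{aligned}
```

Because $T^{\top} T$ is diagonal, the least-squares step needs no matrix inversion beyond taking reciprocals of the retained eigenvalues.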
Statistical Significance of the Coefficients
Most coefficients ($b_j$, with respect to the jth input) are non-zero
Identify the significant coefficients
• Estimate the coefficients for a random model
• Randomly shuffle Y (the data points) to obtain Ys, then calculate the coefficients
• Repeat many times (1000 times)
• Calculate the standard deviation of the random coefficients ($\sigma_j$)
  Approximation: $\sigma_j \approx \sqrt{\mathrm{diag}\left(V\,(\Lambda_k (m-1))^{-1}\,V^{\top}\right)_j}\ \cdot \mathrm{std}(Y_j)$
• Calculate the ratio: $r_j = b_j / \sigma_j$
• Significance test at a confidence level of 95%: $r_{th} = 1.96$
• Null hypothesis (coefficient not significant) true if $r_j < r_{th}$
Use a higher threshold for the residuals: $r_{th} = \sqrt{2} \times 1.96 = 2.77$
• $\sqrt{2}$ is the standard deviation of the difference of two samples from N(0,1)
Cytokines Regulatory Signals
Average of the ratios over models with different numbers of predictors, chosen to capture 80%-95% of the variation in the input data
JNK, p38, NF-kB strongest coefficients
ERK1/2 and RSK similar profile
cAMP the only significant negative
Minimal PCR Model
• Many predictors are flagged as significant because of their correlation with other important predictors
  - Identifies most of the known pathways but results in a high number of false positives
• Identification of the necessary and sufficient set of signaling pathways that would predict cytokine release
  - Algorithm to generate minimal models
  - Essential idea: from the list of significant predictors, find the smallest number of predictors whose fit is statistically equal to the fit of the full model
  - The minimal model should be better than a zero-predictor (average of the output) model
  - Use an F-test for each of these
Procedure for PCR Minimal Models
F-test:
Full (detailed) model with all significant predictors ($e_d$)
If the full model itself is no better than the trivial model: accept the trivial model
Decreasing number of predictors: keep eliminating the least significant predictor ($R_1$ increases, $R_2$ decreases)
Intermediate model-1 ($e_1$), intermediate model-2 ($e_2$)
As good as the full model: $R_1 = e_r^2 / e_d^2 < \mathrm{finv}(p, d_r, d_d)$
$p = 1 - \alpha$, $\alpha = 0.05$, $p = 0.95$
Better than the trivial model: $R_2 = e_0^2 / e_r^2 > \mathrm{finv}(p, d_0, d_r)$
$p = 0.68$
Initial minimal model
If more than one predictor is left, use combinatorial selection (integer programming) for exhaustive testing
Final minimal model
Zero-predictor (average output) model ($e_0$)
New hypothesis for IL-1a from network reconstruction
All known regulatory pathways found
New hypothesis: IFNg regulates IL-1a through an IRFs-dependent pathway?
New hypothesis: IL-4 regulates IL-1a through a STAT6 pathway?
Network diagram, ligands (receptors): IFNg (IFNGR); LPS, P2C, P3C, R-848 (TLR2/1, TLR2/6, TLR4, TLR7); IL-4 (IL-4R). Pathways: JNK, NF-kB. Output: IL-1a
New hypothesis for TNFa from network reconstruction
All known regulatory pathways except ERK1/2 found
New hypothesis: M-CSF-specific pathway regulates TNFa?
Network diagram, ligands (receptors): M-CSF (CSF-1R); LPS, P2C, P3C, R-848 (TLR2/1, TLR2/6, TLR4, TLR7); 2MA, UDP (P2X, P2Y); ISO (Adrb2); IFNg (IFNGR). Pathways: p38, JNK, NF-kB, cAMP. Output: TNFa
New hypothesis for RANTES from network reconstruction
All known regulatory pathways found
New hypothesis: synergism between an LPS-specific pathway (IRF-1?) and NF-kB in RANTES regulation?
Network diagram, ligands (receptors): IFNb (IFNAR); R-848, P2C, P3C (TLR2/1, TLR2/6, TLR7); LPS (TLR4). Pathways: STAT1, NF-kB, JNK. Output: RANTES
Similar hypothesis for IL-6 release
Validation with the Literature
Classification of each predictor as (our model, literature):
• True positive, identified by both: (1,1)
• False negative: (0,1)
• False positive: (1,0)
• True negative, identified by neither: (0,0)
Count ERK1/2 as 1, STAT1a/b as 1, JNK sh/lg as 1, GSK 3a/3b as 1
18 PP and 22 ligands as total (false positive + none) = 40
Total true negative = (negative identified + not-identified-but-reported-in-literature)
• With minimal model: overall 1.2% false positive rate (FPR) and 13% false negative rate (FNR)
• With full model: overall 11% FPR (10 times higher) and 3% FNR (4 times lower)
• Relative gain with minimization: a factor of 2.5
Validation with the Literature
Full model missed only cAMP for IL-10, but it has more false positives
FPR (type-I error) = FP/(FP + none (true negative))
FNR (type-II error) = FN/(true positives = positives_identified + FN)
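The two rate definitions can be checked against the per-cytokine counts tabulated for the minimal model in this deck (both / our model only / literature only / none, out of 40 predictors). The sketch below is illustrative; the function and variable names are hypothetical.

```python
# Minimal-model counts per cytokine: (both, model_only, literature_only, none).
minimal_model_counts = {
    "G-CSF":  (3, 0, 0, 37),
    "IL-1a":  (2, 0, 0, 38),
    "IL-6":   (4, 2, 0, 34),
    "IL-10":  (3, 1, 2, 34),
    "MIP-1a": (4, 0, 2, 34),
    "RANTES": (3, 0, 0, 37),
    "TNFa":   (5, 0, 1, 34),
}

def error_rates(both, model_only, literature_only, none):
    """FPR = FP / (FP + TN); FNR = FN / (TP + FN)."""
    fpr = model_only / (model_only + none)
    fnr = literature_only / (both + literature_only)
    return fpr, fnr

rates = {name: error_rates(*counts) for name, counts in minimal_model_counts.items()}
mean_fpr = sum(fpr for fpr, _ in rates.values()) / len(rates)
mean_fnr = sum(fnr for _, fnr in rates.values()) / len(rates)
print(f"{mean_fpr:.3f} {mean_fnr:.3f}")  # prints 0.012 0.129
```

Averaging the per-cytokine rates reproduces the overall 1.2% FPR and 13% FNR quoted for the minimal model.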
Validation with the Literature
Minimal model (total # predictors: 40)
Cytokine   Both   Our model only   Literature only   None   False positive   False negative
G-CSF      3      0                0                 37     0                0
IL-1       2      0                0                 38     0                0
IL-6       4      2                0                 34     0.055555556      0
IL-10      3      1                2                 34     0.028571429      0.4
MIP-1a     4      0                2                 34     0                0.333333333
RANTES     3      0                0                 37     0                0
TNFa       5      0                1                 34     0                0.166666667
Overall                                                     0.012018141      0.128571429

Full model (total # predictors: 40)
Cytokine   Both   Our model only   Literature only   None   False positive   False negative
G-CSF      3      4                0                 33     0.108108108      0
IL-1       2      4                0                 34     0.105263158      0
IL-6       4      3                0                 33     0.083333333      0
IL-10      4      6                1                 29     0.171428571      0.2
MIP-1a     6      3                0                 31     0.088235294      0
RANTES     3      5                0                 32     0.135135135      0
TNFa       6      3                0                 31     0.088235294      0
Overall                                                     0.111391271      0.028571429
In the full (detailed) model, the only false negative is cAMP for IL-10, so the overall FNR is 0.2/7 = 2.9%