Normalization Methods for Two

Using lowess curves for dealing
with technical variation in cDNA
microarrays
Adapted from
Dan Nettleton
Iowa State University
Copyright © 2007 Dan Nettleton
1
We can now measure the amount of mRNA (a surrogate for
protein) being produced at a particular point in time
for thousands of genes in a single organism using
microarrays.
Usually, we compare the gene
expression across groups:
Cancer vs. Healthy
Wild type vs. Mutant
High yield vs. low yield
We’re looking for genes that
show a big difference.
2
Usually, we expect most of the thousands of genes to be
similar between the two groups (standard functioning of the
organism), with only a few being very different.
We collect one sample from
group 1 and dye it green.
We collect one sample from
group 2 and dye it red.
The samples are mixed and
placed on the same array.
A yellow spot suggests similar
expression in the two samples.
3
We don’t want any of our found differences to be due to
technology (we want found differences to be due to biology).
Therefore, we usually have to make some adjustments to our
data.
Here is a side-by-side
boxplot of the red and
green expression
values from a slide:
4
Here is the log(red) expression plotted against the log(green)
expression level… notice the deviation from the diagonal.
Since we expect most genes to have equal red and green
expression, the bulk of the points should fall near the
45° line, or diagonal (shown in blue).
The technology is introducing some bias, which shows up as
a deviation from the diagonal.
[Figure: Slide 1 Log Signal Means. x-axis: Log Green; y-axis: Log Red.]
5
We can rotate the previous plot to show the expected
"mean curve" (blue line) as a horizontal line:
Now, we will use a
lowess curve to
straighten out the
plot as the biology
suggests it should
be…
[Figure: M vs. A Plot for Slide 1 Log Signal Means. y-axis: M = Log Red - Log Green; x-axis: A = (Log Green + Log Red) / 2.]
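The M and A definitions on this slide can be sketched in a few lines of Python. This is only an illustration: the function name ma_transform, the base-2 logarithm, and the intensity values are assumptions, since the slides do not state which log base or software was used.

```python
import math

def ma_transform(red, green):
    """Return (M, A) lists from raw red and green spot intensities.

    M = log(red) - log(green), A = (log(green) + log(red)) / 2.
    Base-2 logs are an assumption; the slides only say "log".
    """
    m_values, a_values = [], []
    for r, g in zip(red, green):
        log_r, log_g = math.log2(r), math.log2(g)
        m_values.append(log_r - log_g)
        a_values.append((log_g + log_r) / 2)
    return m_values, a_values

# Made-up intensities for three spots (illustrative only).
red = [1024.0, 512.0, 4096.0]
green = [1024.0, 2048.0, 1024.0]
m_values, a_values = ma_transform(red, green)
```

A spot with equal red and green signal gets M = 0, which is why the expected mean curve becomes a horizontal line in the rotated plot.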
6
The lowess fitted curve:
[Figure: M vs. A Plot for Slide 1 Log Signal Means with lowess fit (f=0.40). y-axis: M = Log Red - Log Green; x-axis: A = (Log Green + Log Red) / 2.]
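To make the idea of a lowess fit concrete, here is a simplified sketch of locally weighted linear regression of M on A. It is not the routine used to draw the slide: it omits the robustness iterations of full lowess, it is quadratic in the number of spots, and the names lowess_fit and tricube are hypothetical. The parameter f is assumed to be the fraction of points used in each local fit, matching the f=0.40 in the figure title.

```python
import math

def tricube(u):
    """Tricube kernel weight for a scaled distance u (zero at u >= 1)."""
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0

def lowess_fit(a_values, m_values, f=0.40):
    """Return a locally weighted linear fit of M on A at each A value.

    Simplified sketch: one pass, no robustness iterations.
    """
    n = len(a_values)
    k = min(n, max(2, math.ceil(f * n)))  # points per local fit
    fitted = []
    for a0 in a_values:
        dists = [abs(a - a0) for a in a_values]
        h = sorted(dists)[k - 1]  # distance to the k-th closest point (a0 counts)
        if h > 0:
            w = [tricube(d / h) for d in dists]
        else:
            # All k nearest points share the same A: weight only those ties.
            w = [1.0 if d == 0 else 0.0 for d in dists]
        sw = sum(w)
        a_bar = sum(wi * a for wi, a in zip(w, a_values)) / sw
        m_bar = sum(wi * m for wi, m in zip(w, m_values)) / sw
        sxx = sum(wi * (a - a_bar) ** 2 for wi, a in zip(w, a_values))
        sxy = sum(wi * (a - a_bar) * (m - m_bar)
                  for wi, a, m in zip(w, a_values, m_values))
        # Fall back to the weighted mean when the local A values do not vary.
        slope = sxy / sxx if sxx > 1e-12 else 0.0
        fitted.append(m_bar + slope * (a0 - a_bar))
    return fitted
```

A larger f uses more neighbours per local fit and gives a smoother curve; a smaller f follows local wiggles more closely.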
7
Each point will be adjusted with respect to the
lowess curve (the fitted mean curve)…
[Figure: Adjust M Values. y-axis: M = Log Red - Log Green; x-axis: A = (Log Green + Log Red) / 2.]
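One common way to adjust each point with respect to the fitted curve is to subtract the curve's height at that point's A value from its M value. The sketch below assumes that reading; the function name adjust_m and all numbers are made up for illustration.

```python
def adjust_m(m_values, fitted_values):
    """Subtract the fitted curve height from each spot's M value."""
    return [m - fit for m, fit in zip(m_values, fitted_values)]

# Made-up M values and made-up curve heights at the same three spots.
m_values = [0.9, 0.2, -0.3]
fitted_values = [0.4, 0.1, -0.5]
adjusted_m = adjust_m(m_values, fitted_values)
```

After this step, a spot that sat exactly on the curve has an adjusted M of zero, so the bulk of the points is centred on the horizontal line.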
8
The adjustment gives the final adjusted expression values
to be used in the analysis.
[Figure: M vs. A Plot after Adjustment. y-axis: M = Adjusted Log Red - Adjusted Log Green; x-axis: A = (Adjusted Log Green + Adjusted Log Red) / 2.]
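The two axis definitions can be solved for the adjusted channel values. The sketch below assumes that A is left unchanged by the adjustment, which is consistent with the definitions on this slide but is not stated explicitly; the function name recover_channels and the numbers are hypothetical.

```python
def recover_channels(adjusted_m, a):
    """Return (adjusted log red, adjusted log green) for one spot.

    Solves M = red - green and A = (green + red) / 2 for the two
    log channel values, holding A fixed (an assumption).
    """
    log_red = a + adjusted_m / 2
    log_green = a - adjusted_m / 2
    return log_red, log_green

# Made-up spot: adjusted M of 0.5 at A = 10.0.
log_red_adj, log_green_adj = recover_channels(0.5, 10.0)
```

By construction, the difference of the two returned values is the adjusted M and their average is A.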
9