paperreview20081017

Paper Review on Cross-species Microarray Comparison
Hong Lu
2008-10-14
Title: Conservation of Regional Gene
Expression in Mouse and Human Brain
Authors: Strand AD, Olson JM, et al.
Year: 2007
Journal: PLoS Genetics
Purpose
In-species comparison:
To find expression differences that distinguish resistant
from sensitive tissues and cell types.
Cross-species comparison:
To provide a framework to explore the ability of
mouse to model diseases of the human brain.
Materials
                  Group I                   Group II      Total
Tissues           3: caudate, cerebellum,   2: caudate,
                  motor cortex              cerebellum
Persons
  Men             8                         7             15
  Women           4                         2             6
  Total           12                        9             21
Total slides      12 x 3 = 36               9 x 2 = 18    54
Age (years)
  Range           36 ~ 77                   22 ~ 72       22 ~ 77
  Mean            58                        49            54
Array: Affymetrix HG-U133A, 22,283 probesets
Species           Human                     Mouse (C57BL)
Tissues           3: caudate, cerebellum,   3: caudate, cerebellum,
                  motor cortex              motor cortex
Samples
  Male            8                         1
  Female          4                         5
  Total           12                        6
Total slides      12 x 3 = 36               6 x 3 = 18
Age
  Range           36 ~ 77 (years)           35 (days)
  Mean            58 (years)                35 (days)
Affymetrix array  HG-U133A                  MOE_430A_2
Probesets #       22,283                    22,690
Microarray analysis
1) Normalize the CEL files with Robust Multi-array Average (RMA).
2) Fit a linear model for each of the three tissue pairs with
LIMMA (Bioconductor package):
gene expression ≈ donor + tissue type
• Caudate/Cerebellum
• BA4 Cortex/Cerebellum
• BA4 Cortex/Caudate
3) Get the log ratio, paired t-statistic and p-value for each probeset.
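Step 3 can be illustrated with a minimal sketch. It assumes the signals are already on the log2 scale (as RMA output is) and that each donor contributes one array per tissue. It computes an ordinary paired t-statistic, which is not necessarily identical to what LIMMA reports; the function name and the numbers are made up for illustration.

```python
import math
from statistics import mean, stdev

def paired_log_ratio_and_t(tissue_a, tissue_b):
    """Mean log ratio (A/B) and ordinary paired t-statistic for one probeset.

    tissue_a and tissue_b hold log2 signals, in the same donor order.
    """
    if len(tissue_a) != len(tissue_b) or len(tissue_a) < 2:
        raise ValueError("need at least two matched donors")
    # On the log2 scale a ratio becomes a per-donor difference.
    diffs = [a - b for a, b in zip(tissue_a, tissue_b)]
    log_ratio = mean(diffs)
    std_err = stdev(diffs) / math.sqrt(len(diffs))
    return log_ratio, log_ratio / std_err

# Hypothetical log2 signals for three donors (not data from the paper).
caudate = [8.0, 9.0, 10.0]
cerebellum = [6.0, 7.5, 8.5]
log_ratio, t_stat = paired_log_ratio_and_t(caudate, cerebellum)
```

Pairing by donor is what the "donor" term in the model accounts for: each donor's baseline cancels out in the difference.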
Sample result (human)
              Score                               Caudate/Cerebellum
Probeset ID   Caudate  Cerebellum  Motor cortex   Log ratio  t     P.value   …
215241_at     106.05   -89.15      -16.9          6.08       65.1  1.65E-21  …
220313_at     103.2    -62.01      -41.19         5.95       71.9  3.13E-22  …
207307_at     93.7     -51.66      -42.04         5.04       71.9  3.16E-22  …
Caudate score = t-score(Caudate/Cerebellum) + t-score(Caudate/BA4 Cortex)
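The score formula can be sketched in a few lines. The sketch assumes the cerebellum and motor cortex (BA4) scores are defined analogously to the caudate score, and that reversing a comparison only flips the sign of its t-score. The slide gives only the Caudate/Cerebellum t-score for 215241_at (65.1); the other two inputs below are back-calculated from the three scores shown for that probeset, so they are derived values rather than reported ones.

```python
def regional_scores(t_caud_cb, t_ba4_cb, t_ba4_caud):
    """Sum, for each region, the t-scores of its two pairwise comparisons.

    Arguments are the t-scores for Caudate/Cerebellum, BA4 Cortex/Cerebellum
    and BA4 Cortex/Caudate. t(B/A) is taken to be -t(A/B).
    """
    return {
        "caudate": t_caud_cb + (-t_ba4_caud),
        "cerebellum": (-t_caud_cb) + (-t_ba4_cb),
        "BA4 cortex": t_ba4_cb + t_ba4_caud,
    }

# 65.1 is from the sample result; 24.05 and -40.95 are back-calculated.
scores = regional_scores(65.1, 24.05, -40.95)
```

Under these assumptions the three scores of a probeset always sum to zero, which matches every row of the sample result.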
Different Regions of the Brain Show Many Statistically Significant
Differentially Expressed Genes
To select sets of genes whose expression was highly
enriched in one of the three regions
Caudate
Cerebellum
BA4 Cortex
1) Require p < 0.001 and log ratio ≥ 1 in both relevant pair-wise comparisons.
2) Sum the log ratios of the two relevant comparisons; for example,
log2(BA4/caudate) + log2(BA4/cerebellum) scores candidate BA4 genes.
3) Order the probesets by the sum of log ratios.
4) Cull probesets whose summed regional score is > 2 in more than one
region.
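The four selection steps can be sketched as a single filter. The data layout, the function names and the example probesets (`p1` to `p4`) are hypothetical, chosen only to exercise each rule; the thresholds are the ones listed above.

```python
REGIONS = ("caudate", "cerebellum", "BA4")

def oriented(log_ratios, a, b):
    """Return log2(a/b) from a dict that stores each pair in one orientation."""
    return log_ratios[(a, b)] if (a, b) in log_ratios else -log_ratios[(b, a)]

def select_enriched(probesets, p_cut=0.001, lr_cut=1.0, cull_cut=2.0):
    """probesets maps ID -> (log_ratios, p_values).

    log_ratios: (a, b) -> log2(a/b); p_values: frozenset({a, b}) -> p-value.
    """
    others = {r: [o for o in REGIONS if o != r] for r in REGIONS}
    hits = {r: [] for r in REGIONS}
    for pid, (log_ratios, p_values) in probesets.items():
        # Step 2: summed log ratio of the two comparisons involving each region.
        score = {r: sum(oriented(log_ratios, r, o) for o in others[r])
                 for r in REGIONS}
        # Step 4: cull probesets scoring > 2 in more than one region.
        if sum(s > cull_cut for s in score.values()) > 1:
            continue
        for r in REGIONS:
            # Step 1: both comparisons must pass the p and log-ratio cutoffs.
            if all(oriented(log_ratios, r, o) >= lr_cut
                   and p_values[frozenset((r, o))] < p_cut
                   for o in others[r]):
                hits[r].append((score[r], pid))
    # Step 3: order each region's list by summed log ratio, highest first.
    return {r: [pid for _, pid in sorted(v, reverse=True)]
            for r, v in hits.items()}

def ratios(c_cb, b_cb, b_c):
    return {("caudate", "cerebellum"): c_cb, ("BA4", "cerebellum"): b_cb,
            ("BA4", "caudate"): b_c}

sig = {frozenset(pair): 1e-6 for pair in ratios(0, 0, 0)}
weak = dict(sig)
weak[frozenset(("caudate", "cerebellum"))] = 0.01

example = {
    "p1": (ratios(6.0, 2.0, -4.0), sig),   # caudate-enriched
    "p2": (ratios(1.5, 0.2, -1.3), sig),   # caudate-enriched, lower score
    "p3": (ratios(5.0, 4.0, -1.0), sig),   # scores > 2 in two regions: culled
    "p4": (ratios(3.0, 0.0, -3.0), weak),  # fails the p-value cutoff
}
selected = select_enriched(example)
```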
Table 3: Selected Regionally Enriched Genes in Human and Mouse Brain
Tissues
Gene Expression Variation between Tissues
and Individuals
gene expression ≈ donor + tissue type
Within-tissue variance vs. between-tissue variance
The variance for each probeset was calculated across the n samples,
where x_i is the RMA signal for the probeset on array i.
The between-tissue variability was greater for 89% of the
human probesets and 85% of the mouse probesets.
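As a rough illustration of the comparison, the sketch below computes both quantities for one probeset. It assumes the usual sample variance (divisor n - 1), takes within-tissue variance as the average of the per-tissue variances, and takes between-tissue variance as the variance of the tissue means. These definitions are assumptions made for the sketch, and the signals are invented.

```python
from statistics import mean, variance

def tissue_variances(signals_by_tissue):
    """Return (within, between) variance for one probeset.

    signals_by_tissue maps tissue name -> list of RMA signals.
    """
    # Within: average of the sample variances inside each tissue.
    within = mean([variance(v) for v in signals_by_tissue.values()])
    # Between: sample variance of the per-tissue mean signals.
    between = variance([mean(v) for v in signals_by_tissue.values()])
    return within, between

# Hypothetical RMA signals for one probeset (not data from the paper).
probeset = {
    "caudate": [10.0, 10.2, 9.8],
    "cerebellum": [6.0, 6.2, 5.8],
    "motor cortex": [8.0, 8.2, 7.8],
}
within, between = tissue_variances(probeset)
```

A probeset like this one, where `between` exceeds `within`, would count toward the 89% (human) or 85% (mouse) figure.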
Conclusion:
Compared to expression dictated by regional identity,
age and gender appear to have either effects of small
magnitude, or effects of large magnitude on only a small
fraction of genes, even in humans.
Cross-Species Comparison of Regional Gene
Expression
What’s the relationship between mouse probesets
and human probesets?
ENSEMBL
Mouse probesets → Mouse ENSEMBL identities
(example: 1415688_at)
Human probesets → Human ENSEMBL identities
(example: 209141_at)
dN/dS
dN = number of nonsynonymous substitutions /
number of nonsynonymous sites
dS = number of synonymous substitutions / number
of synonymous sites
dN/dS was estimated with codeml (PAML
package, pair-wise maximum likelihood method)
using the F3×4 codon frequency model.
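The two definitions translate directly into code. This is only the plain ratio-of-proportions arithmetic; codeml estimates dN and dS by maximum likelihood under a codon model, so its values will generally differ. The counts are invented for illustration.

```python
def dn_ds(nonsyn_subs, nonsyn_sites, syn_subs, syn_sites):
    """Return (dN, dS, dN/dS) from raw substitution and site counts."""
    if nonsyn_sites <= 0 or syn_sites <= 0:
        raise ValueError("site counts must be positive")
    dn = nonsyn_subs / nonsyn_sites
    ds = syn_subs / syn_sites
    if ds == 0:
        raise ValueError("dN/dS is undefined when dS is zero")
    return dn, ds, dn / ds

# Hypothetical counts for one ortholog pair.
dn, ds, ratio = dn_ds(nonsyn_subs=14, nonsyn_sites=700,
                      syn_subs=30, syn_sites=300)
```

A ratio well below 1, as here, indicates that amino-acid-changing substitutions are rarer than silent ones, i.e. the protein sequence is conserved.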
Select 2,998 one-to-one orthologous pairs.
Compute the normalized Euclidean distance between
all possible non-self pairs of tissues,
where there are g probesets and x and y are any two mouse or human samples.
Euclidean distances between regions were calculated using the mean RMA
probeset signals for each tissue.
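A minimal sketch of the distance computation follows. The exact normalization is not given here, so the sketch assumes the squared differences are averaged over the g probesets before taking the square root; the tissue labels and signals are hypothetical.

```python
from itertools import combinations
from math import sqrt

def mean_profile(arrays):
    """Average several arrays of one tissue, probeset by probeset."""
    return [sum(col) / len(col) for col in zip(*arrays)]

def normalized_euclidean(x, y):
    """Euclidean distance between two profiles, normalized by g (assumed)."""
    if len(x) != len(y) or not x:
        raise ValueError("profiles must be non-empty and of equal length")
    g = len(x)
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / g)

# Hypothetical mean signals over four orthologous probesets.
profiles = {
    "human caudate": [8.0, 6.0, 7.0, 9.0],
    "mouse caudate": [8.0, 6.0, 7.0, 9.0],
    "human cerebellum": [6.0, 8.0, 9.0, 7.0],
}
# Distances between all non-self pairs of tissues.
distances = {
    (a, b): normalized_euclidean(profiles[a], profiles[b])
    for a, b in combinations(profiles, 2)
}
```

Dividing by g keeps distances comparable when the number of probesets differs between analyses.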
Conclusion: Orthologous Brain Regions between Species Are
More Similar to Each Other than to Different Regions within a
Species
Analysis of GO categories
Human: 70.6% of the probesets had an assigned GO category.
Mouse: 66.2% of the probesets had an assigned GO category.
For each GO category, compare
(a) the total number of probes in that category
vs.
(b) the number of probes appearing on a list of
differentially expressed probes (p < 0.05).
If a or b < 10: Fisher's exact test.
Otherwise: Pearson chi-square.
This detects which categories are over-represented.
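The test choice can be sketched with the standard library alone. The sketch reads (a) as K, the number of probes in the category, and (b) as k, the number of those probes on the differentially expressed list; n is the list size and N the number of annotated probes. That reading, the function names and the one-sided form of Fisher's test are assumptions.

```python
from math import comb, erfc, sqrt

def fisher_upper_tail(k, K, n, N):
    """One-sided Fisher p-value: P(X >= k) under the hypergeometric null."""
    tail = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)

def chi_square_2x2(k, K, n, N):
    """Pearson chi-square (no continuity correction) and its 1-df p-value."""
    in_on, in_off = k, K - k                 # category: on list / off list
    out_on, out_off = n - k, N - K - n + k   # outside the category
    chi2 = (N * (in_on * out_off - in_off * out_on) ** 2
            / (K * (N - K) * n * (N - n)))
    # For 1 degree of freedom the upper tail equals erfc(sqrt(chi2 / 2)).
    return chi2, erfc(sqrt(chi2 / 2))

def enrichment_p(k, K, n, N, small=10):
    """Apply the slide's rule: Fisher if a or b < 10, else chi-square."""
    if K < small or k < small:
        return "fisher", fisher_upper_tail(k, K, n, N)
    return "chi-square", chi_square_2x2(k, K, n, N)[1]

# Hypothetical counts: 50 probes in the category, 40 of them on a
# 50-probe list drawn from 100 annotated probes.
test_used, p_value = enrichment_p(k=40, K=50, n=50, N=100)
```

Note that the chi-square p-value is two-sided, so a small value signals either over- or under-representation; the direction has to be checked from the counts.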
Conclusion: Mouse and Human Brain Regions Share a Higher
Number of Overrepresented Functional Groups than Would Be
Expected by Chance
Relationships between Tissue-Specific Expression,
Conservation of Sequence, and Conservation of
Expression
(A)
X-axis: dN/dS ratios, least conserved (left) to most conserved (right).
Y-axis: Correlation coefficient between human and mouse log ratios.
(B)
X-axis: The percent nucleotide identity, low (left) to high (right).
Y-axis: Correlation coefficient between human and mouse log ratios.
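One way such a plot could be produced is sketched below: rank the ortholog pairs by dN/dS, split them into bins, and compute the Pearson correlation between human and mouse log ratios within each bin. The binning scheme, the choice of Pearson correlation and the data are assumptions for illustration, not details taken from the paper.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def correlation_by_bin(orthologs, n_bins):
    """orthologs: list of (dn_ds, human_log_ratio, mouse_log_ratio).

    Bins run from least conserved (highest dN/dS) to most conserved.
    """
    ranked = sorted(orthologs, key=lambda o: o[0], reverse=True)
    size = len(ranked) // n_bins
    result = []
    for b in range(n_bins):
        # The last bin absorbs any remainder.
        end = (b + 1) * size if b < n_bins - 1 else len(ranked)
        chunk = ranked[b * size:end]
        result.append(pearson([h for _, h, _ in chunk],
                              [m for _, _, m in chunk]))
    return result

# Invented values, chosen so each bin has an exact correlation.
orthologs = [
    (0.9, 1.0, 3.0), (0.8, 2.0, 2.0), (0.7, 3.0, 1.0),  # least conserved
    (0.3, 1.0, 2.0), (0.2, 2.0, 4.0), (0.1, 3.0, 6.0),  # most conserved
]
per_bin = correlation_by_bin(orthologs, n_bins=2)
```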
Conclusion: Genes with High Variance across Tissues Have Greater
Conservation of Nucleotide Sequence
Conclusion
1) In-species comparison:
The different brain regions have distinctly
different expression profiles.
2) Cross-species comparison:
Region-specific genes are conserved at both
the sequence and gene expression levels.
(positively correlated)
Advantages and shortcomings?
Thanks