Comm meeting 03

Is Gene Position Adaptively Favored?
1
Why do we care?
• Genomic clusters of genes
  • Yeast
    • 98% of genes in metabolic pathways cluster (Lee & Sonnhammer)
    • 25% of cell-cycle-dependent genes are directly adjacent (Cho et al.)
  • Drosophila
    • 20% of genes in co-expression clusters (Spellman et al.)
2
Why do we care?
• Genomic clusters of genes
  • Arabidopsis
    • 43% of genes in metabolic pathways cluster (Lee & Sonnhammer)
    • Blocks of neighboring genes co-express (Williams & Bowles)
    • Clusters of up to 20 genes, 100 kb
• Evolutionary process?
3
Idea to test
• Gene clusters occur due to evolutionary processes
• Look for evidence of selection on clusters
4
Method
• Plot co-expression vs recombination rate for each gene, compare to selection models
[Diagram labels: co-expression; recombination rate; chromosomal position]
5
Hill-Robertson Effect
• Reduction in the efficacy of selection
• In regions of low recombination, mutations are linked
• Linkage lowers the effective population size for the region
• Slightly deleterious mutations are more likely to be fixed by drift in small populations
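The last point can be illustrated with Kimura's diffusion approximation for the fixation probability of a new mutation. This is a minimal sketch and not part of the original analysis: it assumes a diploid population whose census size equals its effective size, and the selection coefficient `s = -0.001` is a made-up value chosen only for illustration.

```python
import math

def fixation_probability(s, n_e):
    """Kimura's diffusion approximation for a single new mutation.

    Assumes a diploid population with census size equal to the
    effective size n_e (an assumption of this sketch).
    s is the selection coefficient; negative means deleterious.
    """
    if s == 0:
        return 1.0 / (2 * n_e)  # neutral case: initial frequency 1/(2N)
    return (1 - math.exp(-2 * s)) / (1 - math.exp(-4 * n_e * s))

s = -0.001  # hypothetical mildly deleterious mutation
for n_e in (100, 1000, 10000):
    # Ratio to the neutral expectation: values near 1 mean drift dominates,
    # values near 0 mean selection removes the mutation efficiently.
    relative = fixation_probability(s, n_e) / fixation_probability(0, n_e)
    print(n_e, relative)
```

The ratio falls sharply as `n_e` grows, which is the sense in which a reduced effective population size in low-recombination regions weakens selection against mildly deleterious mutations.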
6
The test
• Selection:
  • Relatively neutral - clusters scattered
  • Positive - skew toward high-recombination regions
  • Negative - skew toward low-recombination regions
[Diagram labels: co-expression; recombination rate; chromosomal position]
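One way the skew could be checked is a permutation test on the recombination rates of clustered versus non-clustered genes. The slides do not say which statistic was used, so the following is only a sketch under that assumption; the function names, the `alpha` cutoff, and the example rates are all hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_test(clustered, background, n_perm=2000, seed=0):
    """Compare mean recombination rate of clustered vs background genes.

    Returns (observed difference in means, two-sided permutation p-value).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = mean(clustered) - mean(background)
    pooled = list(clustered) + list(background)
    k = len(clustered)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = mean(pooled[:k]) - mean(pooled[k:])
        if abs(diff) >= abs(observed):
            hits += 1
    # Add-one correction keeps the p-value strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)

def interpret(observed, p_value, alpha=0.05):
    """Map the result onto the three outcomes listed on the slide."""
    if p_value >= alpha:
        return "scattered"
    if observed > 0:
        return "skew toward high recombination"
    return "skew toward low recombination"

# Made-up recombination rates, for illustration only.
clustered = [4.1, 3.8, 4.5, 3.9, 4.2, 4.4]
background = [1.0, 1.2, 0.9, 1.4, 1.1, 1.3, 0.8, 1.5]
obs, p = permutation_test(clustered, background)
print(interpret(obs, p))
```

A permutation test is used here because it makes no assumption about the distribution of recombination rates; a rank correlation between per-gene co-expression and recombination rate would be a reasonable alternative.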
7
Method
• Plot co-expression vs recombination rate for each gene, compare to selection models
[Diagram labels: recombination rate; chromosomal position]
8
Recombination rate
• Estimated for 17.8k genes by Jianhua Hu
[Figure: genetic distance (cM) vs chromosome position]
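A recombination rate can be read off a plot of genetic distance against physical position as the local slope. The slides do not describe the estimation method that was actually used, so the sketch below is only an assumption: it takes the slope between adjacent markers, and the marker positions are made up.

```python
def recombination_rates(markers):
    """Estimate recombination rate per interval between adjacent markers.

    markers: list of (physical_position_bp, genetic_position_cM),
             sorted by physical position.
    Returns a list of (start_bp, end_bp, rate_in_cM_per_bp).
    """
    rates = []
    for (bp0, cm0), (bp1, cm1) in zip(markers, markers[1:]):
        if bp1 <= bp0:
            raise ValueError("markers must be strictly increasing in bp")
        rates.append((bp0, bp1, (cm1 - cm0) / (bp1 - bp0)))
    return rates

def rate_at(rates, position_bp):
    """Look up the rate for the first interval containing position_bp."""
    for start, end, rate in rates:
        if start <= position_bp <= end:
            return rate
    return None  # position lies outside the mapped region

# Hypothetical markers: (bp, cM)
markers = [(0, 0.0), (1_000_000, 5.0), (3_000_000, 9.0)]
rates = recombination_rates(markers)
print(rate_at(rates, 2_000_000))
```

Each gene can then be assigned the rate of the interval that contains its chromosomal position. Real marker data are noisy, so a smoothed fit would probably be preferable to raw adjacent-marker slopes.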
9
Recombination rate
[Figure: Arabidopsis Recombination Rates per Chromosome. Five panels, Chromosomes 1 to 5. x-axis: physical distance (bp), 0 to 25,000,000. y-axis: recombination rate (cM/bp), 0.0e+00 to 1.0e-05.]
10
Method
• Plot co-expression vs recombination rate for each gene, compare to selection models
[Diagram labels: co-expression; chromosomal position]
11
Expression Data
• NASC Expression data
  • Affymetrix arrays of Arabidopsis genes
  • ~22,700 genes
  • 59 datasets, 534 arrays
• AtGenExpress data
  • >500 datasets, 1300 arrays
12
Co-expression measurement
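• Measures of co-expression
  • r – Pearson / Spearman correlation
    • Pearson: linear relationship between scaled expression profiles; Spearman: the same on ranks (monotonic relationship)
    • Shows positive or negative correlation (-1..1)
  • Euclidean distance
    • Distance between two expression profiles
    • Measures dissimilarity rather than correlation; smaller means more similar (0..1 if normalized)
• Calculate for neighboring genes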
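The three measures above can be sketched in plain Python and applied to each pair of adjacent genes. This is an illustration only: the gene names and expression values are made up, and the choice of Pearson for the neighbor scan is an assumption, not something the slides specify.

```python
import math

def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError for a constant profile."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def euclidean(x, y):
    """Unnormalized Euclidean distance between two profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def neighbor_coexpression(profiles, measure=pearson):
    """profiles: list of (gene_id, expression_values) in chromosomal order."""
    return [(g1, g2, measure(e1, e2))
            for (g1, e1), (g2, e2) in zip(profiles, profiles[1:])]

# Hypothetical genes and expression values across four arrays.
profiles = [
    ("geneA", [1.0, 2.0, 3.0, 4.0]),
    ("geneB", [2.1, 3.9, 6.2, 8.0]),
    ("geneC", [5.0, 1.0, 4.0, 2.0]),
]
for g1, g2, score in neighbor_coexpression(profiles):
    print(g1, g2, round(score, 3))
```

Passing `measure=spearman` or `measure=euclidean` swaps the statistic without changing the neighbor scan. For Euclidean distance a lower score means more similar profiles, the opposite direction to the two correlations.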
13
Other factors
• Factors to consider:
  • Tandem duplicates – likely to co-express
  • Array conditions
14
Method
• Plot co-expression vs recombination rate for each gene, compare to selection models
[Diagram labels: co-expression; recombination rate; chromosomal position]
15
Regional transcription
• Mechanisms
• Matrix attachment sites
• Change chromatin loops (3D structure)
• Insulators, Boundary elements
16