RISE AND FALL OF GENE FAMILIES Dynamics of Their Expansion

Evolution of Plant Stress Responsiveness:
Genome-wide and Gene Family Level Analysis
Shin-Han Shiu
Department of Plant Biology
Outline

Major interests and why

Gene families and stress responsiveness:
 The interplay between gene family expansion, duplication
mechanism, and the elusive selection pressure

The Receptor Kinase family as an example
 One of the biggest plant gene families and their involvement in plant
biotic interactions

If there is enough time, the short story on plant pseudogenes
 When can you call a gene a pseudogene?
Major interests
Molecular evolutionary patterns
Source of selection pressure: abiotic and biotic stress conditions
Target of selection: duplicate genes
Genetic basis of adaptation
Where do all these duplicates come from?

Whole genome duplication

Tandem duplication

Segmental duplication

Replicative transposition
Measuring Lineage-specific Gain

Orthologous group and lineage-specific gain
 Reconcile species and gene trees

Enrichment: $\log\left(N_{\mathrm{Obs}} / N_{\mathrm{Exp}}\right)$
Expansion at the orthologous group level
[Figure: log(freq) against log(OG size)]
Two major patterns in OG expansion
Convergent expansion
Single lineage expansion
[Figure: heat maps of enrichment, log2(Obs/Exp), on a color scale from −0.7 to 0.7. Axes give the number of lineage-specific gains (0, 1, 2, 3, 4, 5, >6) in Arabidopsis thaliana against Poplar, Rice, and Moss.]
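The enrichment score log(N_Obs / N_Exp) behind these heat maps can be sketched as follows. The slides do not say how N_Exp was obtained, so this sketch assumes that the expected count for a cell comes from the two lineages gaining genes independently (product of the marginal counts divided by the total). That independence model, the base-2 logarithm, and the function names are assumptions for illustration only.

```python
import math
from collections import Counter

def log2_enrichment(n_obs, n_exp):
    """Return log2(N_obs / N_exp); positive values mean more than expected."""
    if n_obs <= 0 or n_exp <= 0:
        raise ValueError("counts must be positive")
    return math.log2(n_obs / n_exp)

def gain_enrichment(pairs):
    """pairs holds one (gains_in_lineage_a, gains_in_lineage_b) tuple per OG.

    ASSUMPTION: the expected count of a cell is row_total * col_total / n,
    i.e. gains in the two lineages are independent of each other.
    """
    n = len(pairs)
    cell = Counter(pairs)
    row = Counter(a for a, _ in pairs)
    col = Counter(b for _, b in pairs)
    return {
        (a, b): log2_enrichment(count, row[a] * col[b] / n)
        for (a, b), count in cell.items()
    }

# Toy data: most OGs gain in both lineages or in neither.
toy_pairs = [(0, 0)] * 4 + [(1, 1)] * 4 + [(0, 1)] + [(1, 0)]
scores = gain_enrichment(toy_pairs)
```

In the toy data the diagonal cells score above zero and the off-diagonal cells below zero, which is the signature a convergent-expansion pattern would leave under this model.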
Expansion patterns and duplication mechanisms

Comparison of ratios between tandem and non-tandem genes
 e.g. for A-M orthology
              Convergent   Single-lineage
Tandem        756          848
Non-tandem    4500         2918
Ratio         0.17         0.30

Tandem/non-tandem ratio by expansion pattern

OG type   Method for defining OG   Convergent          Single-lineage      P value
A-M       Similarity               0.17 (756/4500)   < 0.30 (848/2918)     2.2×10^-23
A-M       Tree                     0.16 (831/5297)   < 0.40 (1443/3566)    9.4×10^-88
A-R       Similarity               0.31 (959/3115)   < 0.47 (644/1375)     3.1×10^-12
A-R       Tree                     0.27 (844/3073)   < 0.50 (1631/3294)    2.3×10^-33
A-P       Similarity               0.29 (1141/3944)  < 0.60 (741/1234)     7.2×10^-38
A-P       Tree                     0.26 (1014/3930)  < 0.64 (1578/2452)    1.0×10^-83
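The slides do not state which test produced these P values. As a sketch under that caveat, the snippet below recomputes the tandem/non-tandem ratios for the A-M similarity-based counts and applies a Pearson chi-square test (1 degree of freedom, no continuity correction) to the 2×2 table. The choice of test and the function names are assumptions, not the author's stated method.

```python
import math

def tandem_ratio(tandem, non_tandem):
    """Ratio of tandem to non-tandem genes within one expansion pattern."""
    return tandem / non_tandem

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]].

    ASSUMPTION: the test itself is our choice for illustration.
    With 1 degree of freedom the p-value is erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# A-M orthology, similarity-based OGs (counts from the table).
conv_tandem, conv_non_tandem = 756, 4500
single_tandem, single_non_tandem = 848, 2918

ratio_conv = tandem_ratio(conv_tandem, conv_non_tandem)
ratio_single = tandem_ratio(single_tandem, single_non_tandem)
chi2, p_value = chi_square_2x2(
    conv_tandem, conv_non_tandem, single_tandem, single_non_tandem
)
```

With these counts the statistic is close to 99 and the p-value is on the order of 10^-23, the same order as the value tabulated for that row.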
Summary I

Duplicate gene turnover
 Although some duplicates are retained for millions of years, the
majority are lost over a time scale of hundreds of millions of years.

The degree of lineage-specific expansion is similar at the family
level but with substantial variation

Expansion patterns fall into two major categories
 Convergent expansion
 Single lineage expansion

Orthologous group with single lineage expansion
 Tend to be enriched in tandemly repeated genes
Expansion of responsive genes and conditions

Genes in expanded OGs tend to be enriched in stress-responsive
genes
[Table: statistical tests per stress condition, for up regulation and down regulation, in A-M, A-R, and A-P OGs. Each OG type has two test columns: Exp and T/N.]
Abiotic stress conditions: UV-B, Wounding, Cold (4°C), Heat, Drought, Salt, Osmotic
Biotic stress conditions: AvrRpm1, DC3000, Flg22, GST-NPP1, HrcC-, HrpZ, P. infestans, Psph
+: significant at the 5% level
Stress responsiveness and duplication mechanisms

Enrichment of tandemly over non-tandemly expanded genes
under biotic conditions
[Table: statistical tests per stress condition, for up regulation and down regulation, in A-M, A-R, and A-P OGs. Each OG type has two test columns: Exp and T/N.]
Abiotic stress conditions: UV-B, Wounding, Cold (4°C), Heat, Drought, Salt, Osmotic
Biotic stress conditions: AvrRpm1, DC3000, Flg22, GST-NPP1, HrcC-, HrpZ, P. infestans, Psph
+: significant at the 5% level
T: tandem >> non-tandem
N: non-tandem >> tandem
Summary II

Over the course of plant evolution, retention rate:
 Stress response genes >> genome average

True for genes up-regulated in both biotic and abiotic stress
conditions

Influence of duplication mechanism, particularly for biotic
stress conditions, retention rate:
 Tandem >> non-tandem

However, genes responsive to biotic stimuli are not necessarily
tandem
 Depends on their location in the signaling network
 e.g. Plant receptor kinase: biotic -> tandem
 e.g. Transcription factors -> non-tandem, presumably WGD
Functional evolution of duplicate genes

Question: What is the fate of duplicate genes?

Address this question in the context of stress responsiveness.
Reconstruction of ancestral functions
Step 1: construct the phylogeny of genes
Step 2: map current functions
Step 3: reconstruct the function of ancestral genes
Branch-based analysis

Ancestral state determined by BayesTraits
 An MCMC/ML combination to estimate the trait values at ancestral
nodes
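The slides rely on BayesTraits for this step. Purely to illustrate what estimating a trait value at an ancestral node means, the sketch below uses Fitch parsimony instead, a much simpler technique that is not the MCMC/ML approach named above. The tree encoding, gene names, and the coding of states (1 up-regulated, 0 not responsive, -1 down-regulated) are assumptions for illustration.

```python
def fitch(node, leaf_states):
    """Bottom-up Fitch pass over a rooted binary tree.

    node is either a leaf name (str) or a tuple of two child nodes.
    Returns the set of equally parsimonious states at this node.
    ASSUMPTION: this stands in for, and is not equivalent to, BayesTraits.
    """
    if isinstance(node, str):
        return {leaf_states[node]}
    left, right = (fitch(child, leaf_states) for child in node)
    common = left & right
    # Keep the shared states if any exist, otherwise keep all candidates.
    return common if common else left | right

# Hypothetical gene tree: g1 and g2 are sisters, g3 is the outgroup.
gene_tree = (("g1", "g2"), "g3")
current_states = {"g1": 1, "g2": 1, "g3": 0}

inner_states = fitch(gene_tree[0], current_states)
root_states = fitch(gene_tree, current_states)
```

Here the ancestor of g1 and g2 is resolved as up-regulated, while the root stays ambiguous between the two states; a probabilistic method would instead assign a probability to each.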
Relative abundance of functional evolution classes

Retention > loss >> gain >> switch
[Figure: relative abundance of ancestral -> current state transitions (states 1, 0, -1), including no change (0 -> 0), gain, and switch]
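A minimal sketch of how each branch could be assigned to a class once ancestral and current states are known. It assumes the coding 1 = up-regulated, 0 = not responsive, -1 = down-regulated; the class names follow the slides, but the exact rules and the function name are assumptions.

```python
from collections import Counter

def classify_branch(ancestral, current):
    """Assign one branch to a functional evolution class.

    ASSUMPTION: states are coded 1 (up), 0 (not responsive), -1 (down).
    """
    if ancestral == current:
        return "no change" if ancestral == 0 else "maintenance"
    if current == 0:
        return "loss"      # e.g. 1 -> 0
    if ancestral == 0:
        return "gain"      # e.g. 0 -> 1
    return "switch"        # 1 -> -1 or -1 -> 1

# Hypothetical (ancestral, current) states for five branches.
branches = [(1, 1), (1, 0), (0, 1), (1, -1), (1, 1)]
class_counts = Counter(classify_branch(a, c) for a, c in branches)
```

Dividing each count by the total number of branches gives the relative abundance of each class.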
Evolution of stress responsiveness over Ks
Relative abundance of functional evolution classes
[Figure: N_Maintenance, N_Loss, and N_Switch relative to N = N_M + N_L + N_S, plotted against Ks; panels: Abiotic stress, Biotic stress]
Relative abundance of functional evolution classes
[Figure: N_Maintenance, N_Gain, and N_Switch relative to N = total, plotted against Ks; panels: Abiotic stress, Biotic stress]
What is the nature of functional switch?

Switch: e.g. 1 to -1
 Up regulation in the ancestral state
 But down regulated in the current state

Does a switch involve a one-step or a two-step process?

It seems to be the two-step case (loss of the ancestral response, then gain of the opposite one), since:
 Loss rate >> gain rate >> switch rate
Summary III

Branch-based analysis
 Maintenance > loss > gain > switch

Retention of stress responsiveness
 Ks < 0.8, continued loss
 Ks > 0.8, nearly constant
 Responsiveness gain has a similar trajectory

Loss > gain > switch
 Suggests a two-step evolutionary process of functional switch.
Functional evolution in the context of duplicate pairs

Functional partition
Relative abundance of functional evolution classes

Looking at each gene pair under each condition
 Partition is the most abundant class
[Figure: bar chart (scale 0 to 100) of gene-pair classes: lost (PL), redundant (RET), sub (PAR), neo (NEO)]
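The pair-level classes can be sketched for a single gene pair under a single condition. The slides do not give the exact rules, so the definitions below are assumptions: a duplicate "keeps" the ancestral response when its state equals the ancestral state, and states are coded 1 (up-regulated), 0 (not responsive), -1 (down-regulated). The function name is hypothetical.

```python
def classify_pair(ancestral, dup1, dup2):
    """Classify one duplicate pair under one condition.

    ASSUMPTION: RET = both copies keep the ancestral response,
    PAR = exactly one keeps it, PL = neither keeps it,
    NEO = a response appears where the ancestor had none.
    Returns None when ancestor and both copies are not responsive.
    """
    if ancestral == 0:
        return "NEO" if (dup1 != 0 or dup2 != 0) else None
    kept = int(dup1 == ancestral) + int(dup2 == ancestral)
    return {2: "RET", 1: "PAR", 0: "PL"}[kept]

# Hypothetical states for one pair across four conditions.
observations = [(1, 1, 0), (1, 1, 1), (-1, 0, 0), (0, 1, 0)]
pair_classes = [classify_pair(a, d1, d2) for a, d1, d2 in observations]
```

Under these rules a copy that switches direction counts as not keeping the ancestral response; a finer scheme could track such switches separately.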
Functional partition of gene pairs
Partition tends to be extremely asymmetric
[Figure: frequency of gene pairs against Sp (0 to 1), grouped by number of conditions (2-5, 6-10, 11-15, >15); example response patterns over informative conditions illustrate Sp = 0 and Sp = 1]
Statistical significance of asymmetry
Partition tends to be extremely asymmetric. Why?
[Figure: observed and expected frequency of LR, computed from Gene 1 conditions and Gene 2 conditions]
Stress responsiveness of genes over multiple conditions

Over-representation of broadly responsive genes
ABIOTIC CONDITIONS
BIOTIC CONDITIONS
Enrichment of cis elements under each condition
Some cis-elements are in genes that are broadly responsive
[Figure: cis-elements by stress conditions; legend: Up-regulated, Down-regulated, Both, Neither]
Breadth in responsiveness vs. cis-element complexity
Positive correlation between these two factors
 Spearman rank: ρ = 0.51, p < 2.2e-16
[Figure: Number of cis-element types against Number of responsive conditions]
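For reference, a Spearman rank correlation such as the one reported above can be computed as the Pearson correlation of the ranks, with tied values sharing their average rank. The sketch below uses only the standard library; the per-gene counts are made-up toy values, not data from the talk, and the function names are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# Toy values: responsive conditions and cis-element types per gene.
n_conditions = [1, 2, 2, 3]
n_cis_types = [1, 2, 3, 4]
rho = spearman_rho(n_conditions, n_cis_types)
```

The p-value quoted on the slide needs a separate significance test, which this sketch does not attempt.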
Expression and cis-element asymmetries
Some cis-elements are in genes that are broadly responsive
 Spearman rank: ρ = 0.62, p < 2.2e-16
[Figure: Asymmetry (cis-element) against Asymmetry (up-regulation)]
Summary IV

Sub- vs. neofunctionalization
 Sub > retention > neo > parallel loss

Subfunctionalization asymmetry
 Significant asymmetry

Explanation
 Some genes are controlled by multi-responsive elements
 Differential loss/gain of cis-elements
Acknowledgement

Lab members
 Melissa Lehti-Shiu
 Gaurav Moghe
 Cheng Zou

Past member
 Kosuke Hanada, RIKEN

Collaborators
 Jeff Conner
 Gregg Howe, PRL
 Rong Jin, CSE
 Doug Schemske
 Mike Thomashow, PRL

Funding:

