Operating genetic material
The Structure, Function, and
Evolution of Biological Systems
Instructor: Van Savage
Spring 2010 Quarter
4/6/2010
Crash Course in Evolutionary Theory
What is fitness and what does it
describe?
Ability of an entity to survive and propagate forward
in time. It is inherently a dynamic (time-evolving)
property. Can assign fitness to
1. Individuals
2. Genes
3. Phenotypes
4. Behaviors
5. Strategies (economic, cultural, games, etc.)
6. Tumor cells and tumor treatment
7. Antibiotic resistance
8. Language
Special case of Price’s Theorem
We will learn full version in much greater detail soon.
$$\Delta p = \Delta \bar{g} = \frac{\mathrm{Cov}(w, g)}{\bar{w}}$$
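A minimal numerical sketch of this special case, assuming offspring inherit the gene value g exactly (no transmission bias). The population values and the function name `price_delta` are hypothetical and chosen only for illustration. It compares the directly computed change in the mean of g with Cov(w, g) divided by mean fitness.

```python
def price_delta(w, g):
    """Return (direct change in mean g, Cov(w, g) / mean w) for one round of selection."""
    n = len(w)
    w_bar = sum(w) / n
    g_bar = sum(g) / n
    # Mean of g after selection: each individual is weighted by its fitness.
    g_bar_next = sum(wi * gi for wi, gi in zip(w, g)) / sum(w)
    # Population covariance, E(wg) - E(w)E(g).
    cov = sum(wi * gi for wi, gi in zip(w, g)) / n - w_bar * g_bar
    return g_bar_next - g_bar, cov / w_bar

# Hypothetical population: g = 1 marks carriers of an allele, w is offspring number.
g = [1, 1, 0, 0, 0]
w = [2.0, 1.5, 1.0, 1.0, 0.5]
direct, from_cov = price_delta(w, g)
print(direct, from_cov)
```

The two numbers agree because, without transmission bias, the covariance term is the only term in the theorem.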
Additional effects for more than two
loci
1. Recombination—breaking, rejoining, and rearranging
of genetic material. Major extra source of variation.
2. Epistasis—interactions between loci
(i.e., non-independence). Fitness effects of alleles affect
each other in a non-additive way.
Recombination
Can understand all of this again in terms of covariance.
Covariance of A and B implies effect of recombination.
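Zero covariance implies no effect of recombination.
$$\mathrm{Cov}(A,B) = E(AB) - E(A)E(B) = x_{11} - p_1 q_1 = D$$
D is the measure of gametic disequilibrium, and time
evolution can be expressed in terms of this and the
recombination rate r:
$$x'_{ij} = x_{ij} - (-1)^{i+j} r D$$
$$D' = D(1-r)$$
The sign of the rD term is “-” if i = j and “+” if i does not equal j, which is what makes D decay.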
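A short sketch of the recursion without selection, assuming random mating and two alleles per locus. The starting gamete frequencies and r = 0.2 are hypothetical. The sign of the rD term is chosen so that the result agrees with D' = D(1-r).

```python
def disequilibrium(x):
    """D = Cov(A, B) = x11 - p1*q1 for gamete frequencies x[(i, j)]."""
    p1 = x[(1, 1)] + x[(1, 2)]   # frequency of allele A_1
    q1 = x[(1, 1)] + x[(2, 1)]   # frequency of allele B_1
    return x[(1, 1)] - p1 * q1

def recombine(x, r):
    """One generation of recombination at rate r, no selection."""
    D = disequilibrium(x)
    # Gametes with i == j lose r*D; gametes with i != j gain r*D.
    return {(i, j): f - (-1) ** (i + j) * r * D for (i, j), f in x.items()}

# Hypothetical starting frequencies with an excess of A1B1 and A2B2 gametes.
x = {(1, 1): 0.4, (1, 2): 0.1, (2, 1): 0.1, (2, 2): 0.4}
r = 0.2
D0 = disequilibrium(x)
history = [D0]
for _ in range(5):
    x = recombine(x, r)
    history.append(disequilibrium(x))
print(history)
```

Each entry of `history` is the previous one multiplied by (1 - r), and the gamete frequencies still sum to 1.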
Recombination with selection
Must assign fitness and then use formulas and do algebra
similar to what we have been doing.
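$$\Delta x_{ij} = \frac{1}{\bar{w}}\left[\mathrm{Cov}(w, g_{ij}) \pm r D\, w_{1122}\right] = \frac{1}{\bar{w}}\left[\mathrm{Cov}(w, g_{ij}) \pm r\, w_{1122}\, \mathrm{Cov}(A_1, B_1)\right]$$
Additional term captures effects of recombination
and whether it slows or speeds up evolution.
The sign is “-” if i = j and “+” if i does not equal j.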
Epistasis
Interactions among fitness effects for different alleles
$$\mathrm{Cov}(w_x, w_y) = w_{xy} - w_x w_y$$
Can now see covariance plays central role in all of
evolution. In fact, it is as central as fitness itself.
If no interaction, then the covariance is 0:
$$w_{xy} = w_x w_y$$
This is known as additive
(or sometimes multiplicative).
Additive
Choose relative fitness so that the wild type fitness is 1,
and look at exponential (continuous) versions
$$w_{WT} = 1 = e^{0}$$
Still assuming a mutation is deleterious, we look
at combined effects of two mutations
$$w_x = 1 - s_x \sim e^{-s_x} \quad \text{and} \quad w_y = 1 - s_y \sim e^{-s_y}$$
$$w_x w_y \sim e^{-s_x} e^{-s_y} = e^{-(s_x + s_y)} \sim 1 - (s_x + s_y)$$
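A quick numerical check of why the multiplicative and additive descriptions coincide for small effects. The selection coefficients below are hypothetical. The product of fitnesses, the exponential form, and the first-order additive form differ only at second order in s.

```python
import math

def fitness(s):
    """Fitness of a single deleterious mutation with selection coefficient s."""
    return 1.0 - s

sx, sy = 0.02, 0.03                        # hypothetical small selection coefficients
product = fitness(sx) * fitness(sy)        # multiplicative expectation w_x * w_y
exponential = math.exp(-(sx + sy))         # continuous (exponential) version
additive = 1.0 - (sx + sy)                 # additive form, first order in s

print(product, exponential, additive)
print(product - additive)                  # equals sx * sy, a second-order term
```

For larger s the three forms separate, so the choice of null model starts to matter when classifying a pair as interacting.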
Non-Additive
$$w_{xy} \neq w_x w_y$$
Synergistic (negative epistasis): $w_{xy} < w_x w_y$
Antagonistic (positive epistasis): $w_{xy} > w_x w_y$
What is the distribution of these effects?
What fraction of mutation pairs are antagonistic?
What fraction of mutation pairs are synergistic?
Graphical representation
[Figure: graphical representation relating w, g, g_A, and g_B.]
Modeling more than two mutations
If all mutations have the same deleterious effect s, and
k_L mutations are lethal, then the fitness after k mutations is
$$w = e^{-ks} \sim (1-s)^{k} \sim 1 - ks = 1 - \frac{k}{k_L}$$
How can we modify this for epistasis?
$$w_{epi} = 1 - s k^{1+\epsilon} \sim (1-s)^{k^{1+\epsilon}} \sim e^{-s k^{1+\epsilon}}$$
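A small sketch of how an epistasis exponent bends the fitness decay, assuming the exponential form w_epi = exp(-s k^(1+eps)). The symbol `eps`, its sign convention, and the parameter values are assumptions made for illustration. Under this convention eps = 0 recovers the multiplicative case, eps > 0 makes each extra mutation worse than the last, and eps < 0 makes it milder.

```python
import math

def w_epi(k, s, eps=0.0):
    """Fitness after k mutations, assuming w = exp(-s * k**(1 + eps))."""
    return math.exp(-s * k ** (1.0 + eps))

s = 0.05  # hypothetical per-mutation effect
curves = {
    "multiplicative": [w_epi(k, s, 0.0) for k in range(11)],
    "eps_positive": [w_epi(k, s, 0.3) for k in range(11)],
    "eps_negative": [w_epi(k, s, -0.3) for k in range(11)],
}
for name, ws in curves.items():
    print(name, [round(w, 3) for w in ws])
```

All three curves agree at k = 0 and k = 1 and separate only once mutations accumulate, which is why a single-mutant measurement cannot reveal epistasis.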
What about these forms for epistasis?
$$w_{epi} = (1 - ks)^{1+\epsilon}$$
or
$$w_{epi} = 1 - \left(\frac{k}{k_L}\right)^{1+\epsilon}$$
where $k_L$ is the lethal number of mutations.
Modeling more than two mutations
$$w_{epi} = 1 - s k^{1+\epsilon} \sim (1-s)^{k^{1+\epsilon}} \sim e^{-s k^{1+\epsilon}}$$
How do we interpret synergy and
antagonism?
Mutations to steps in sequence are antagonistic (green).
Mutations to steps in parallel can be synthetically lethal
if they knock out a loop, which is extreme synergy (red), or
multiplicative (black).
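A toy illustration of the series versus parallel intuition. This is an assumed caricature, not a model from the lecture. It treats fitness as pathway flux: steps in sequence are limited by the slowest step, and redundant parallel branches add up to a fixed demand. The function names and capacities are hypothetical.

```python
def series_flux(capacities):
    """Steps in sequence: the slowest step limits the whole pathway."""
    return min(capacities)

def parallel_flux(capacities, demand=1.0):
    """Redundant parallel branches: fluxes add, capped by the demand."""
    return min(demand, sum(capacities))

def epistasis(w_xy, w_x, w_y):
    """Deviation of the double mutant from the multiplicative expectation."""
    return w_xy - w_x * w_y

# Series pathway: wild-type capacity 1 per step, each mutation halves one step.
series_eps = epistasis(
    series_flux([0.5, 0.5]),   # double mutant
    series_flux([0.5, 1.0]),   # mutant x
    series_flux([1.0, 0.5]),   # mutant y
)

# Parallel pathway: each mutation knocks out one of two redundant branches.
parallel_eps = epistasis(
    parallel_flux([0.0, 0.0]),  # double mutant: both branches gone
    parallel_flux([0.0, 1.0]),  # mutant x: the other branch compensates
    parallel_flux([1.0, 0.0]),  # mutant y
)
print(series_eps, parallel_eps)
```

In the series case the second mutation adds no further damage, so the double mutant beats the product (antagonistic). In the parallel case each single knockout is silent but the pair is lethal (extreme synergy).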
Recent papers using models of epistasis:
Lenski, Ofria, Collier, Adami
Definition of digital organism
Carrying capacity of 3600 individuals
Probability of point mutation is 0.0075 per instruction copied
and probability of insertion and deletion is 0.05 per division
Each generation is 5-10 updates, and each update is the execution
of 30 instructions per individual on average
Start with genome length of 20 instructions
28 different types of instructions (like amino acids)
Phenotypic rewards are multiplicative
Instructions are mathematical operations
Advantages of digital organisms
1. Allows us to choose conditions and seek generalizations
beyond organic life forms
2. Allows us to perform experiments, in terms of time scales
and numbers, that are unattainable with real systems
3. Use evolving programs to solve computational problems
Complex organisms
Selection criteria:
1. Baseline allocation of CPU time is proportional to
genome size
Why? A larger genome does not necessarily imply an organism that is more complex or better
at “solving problems” or getting resources. See next plot.
2. Certain mathematical operations, which require
novel combinations of instructions (e.g., performing an XOR
operation using NAND operations), are rewarded with
additional CPU time.
This is a type of selection. Solving a computational problem is like solving a fitness problem or
getting more resources.
Simple organisms
Selection criteria starting with complex organisms:
1. Baseline allocation of CPU time is independent of
genome size
2. Mathematical operations are not rewarded with
additional CPU time
Removes a type of selection. Does not seem biological. But, there is still selection for
shorter replication time, so some biological analogy.
Complex vs. Simple organisms
Complex organisms average genome size=91.3 instructions
(Because of assumption 1 or 2?)
Simple organisms average genome size=19.8 instructions
Simple organisms have more lethal mutations
Tests for epistasis
Decay test—see whether successive mutations (1-10)
become increasingly worse (synergy), better (antagonistic),
or are multiplicative.
$$w_{epi} = 1 - s k^{1+\epsilon} \sim (1-s)^{k^{1+\epsilon}} \sim e^{-s k^{1+\epsilon}}$$
Both show an average type of antagonistic epistasis, but
complex organisms show it more strongly. Why do you think?
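One way a decay test could be scored is sketched below, assuming the exponential form w = exp(-s k^(1+eps)). Taking logs twice gives ln(-ln w) = ln s + (1 + eps) ln k, so the exponent comes out of an ordinary straight-line fit. This is not necessarily the fitting procedure used in the paper. The data are synthetic and noise-free, generated from the assumed model, and `fit_decay` is a hypothetical helper.

```python
import math

def fit_decay(ks, ws):
    """Least-squares fit of ln(-ln w) = ln s + (1 + eps) * ln k; returns (s, eps).

    Requires 0 < w < 1 for every data point.
    """
    xs = [math.log(k) for k in ks]
    ys = [math.log(-math.log(w)) for w in ws]
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return math.exp(intercept), slope - 1.0

# Synthetic data from the assumed model (not real measurements).
true_s, true_eps = 0.1, -0.2
ks = list(range(1, 11))                      # 1 to 10 successive mutations
ws = [math.exp(-true_s * k ** (1 + true_eps)) for k in ks]
s_hat, eps_hat = fit_decay(ks, ws)
print(s_hat, eps_hat)
```

A fitted exponent below 1 means later mutations hurt less than earlier ones. With real data the mean fitness at each k would replace the synthetic `ws`.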
Tests for epistasis
Pair test—Explicitly calculate the double mutant
fitness (Wxy) and compare it to the product of the single mutant
fitnesses (WxWy) for all pairs (double mutants).
$$\mathrm{Cov}(w_x, w_y) = w_{xy} - w_x w_y$$
Simple organisms have more lethals.
Both have more antagonistic than synergistic interactions
Simple organisms actually have a higher proportion of
non-lethal, antagonistic interactions.
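The pair test can be sketched as a simple tally. The fitness triples below are hypothetical, and the tolerance `tol` used to call a pair multiplicative is an assumption (a real analysis would set it from measurement noise).

```python
from collections import Counter

def classify(w_xy, w_x, w_y, tol=1e-6):
    """Label a double mutant by comparing w_xy with the product w_x * w_y."""
    eps = w_xy - w_x * w_y
    if abs(eps) <= tol:
        return "multiplicative"
    return "antagonistic" if eps > 0 else "synergistic"

# Hypothetical measurements: (w_x, w_y, w_xy) for each pair of mutations.
pairs = [
    (0.90, 0.80, 0.72),   # exactly the product
    (0.90, 0.80, 0.80),   # double mutant fitter than expected
    (0.90, 0.80, 0.50),   # double mutant less fit than expected
    (0.70, 0.60, 0.00),   # lethal double mutant
    (0.95, 0.90, 0.90),   # second mutation adds no further cost
]
counts = Counter(classify(w_xy, w_x, w_y) for w_x, w_y, w_xy in pairs)
fractions = {label: n / len(pairs) for label, n in counts.items()}
print(fractions)
```

Lethal double mutants could be tallied separately, as in the comparison above, by checking for w_xy = 0 before classifying.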
Recent papers using models of epistasis:
Segre, DeLuna, Church, Kishony
Quantitative Epistatic Interactions
[Figure: perturbations X and Y acting on a phenotype (growth rate), with the interaction classified as synergy, antagonism, or suppression.]
See also:
Boone, Science (2004)
Weissman, Cell (2006)
Boeke, Nature Genetics (2003); Cell (2006)
Giaever, Nature Genetics (2007)
Quantitative genetic interactions
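[Figure: proliferation rates (# cells vs. time) for WT, X, Y, and XY. Example: with WT = 1, X = 0.9, and Y = 0.8, no interaction predicts XY = 0.72; aggravating interactions fall below this (down to 0) and alleviating interactions lie above it (e.g., 0.8).]
~0.5% of gene pairs show a functional association.
How much information is concealed within the
remaining 99.5% of gene interactions?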
Interactions between mutations in yeast metabolism using Flux
Balance Analysis
(Varma and Palsson, 1994; Famili et al., 2003)
Rate of biomass production (growth rate)
• 829 metabolic reactions
• 343,206 gene pairs
Flux Balance Analysis
• Computational model of metabolism
• Growth rate predictions for wild-type and
deletion mutants
• Main Assumptions:
– Steady-state
– Mass-conservation
– Optimality
• Developed and experimentally verified in
E. coli and yeast by Palsson et al.:
Nature 2002; PNAS 2003; Nat. Genetics 2004
[Figure: metabolic network converting nutrients through metabolites A, B, C, and D into biomass (new cells).]
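The three assumptions above are usually combined into a linear program. The formulation below is the standard textbook way of writing it, given as a sketch. The symbols S (stoichiometric matrix), v (vector of reaction fluxes), and the flux bounds are notation introduced here, not taken from the slides.

```latex
\begin{aligned}
\max_{v}\quad & v_{\text{biomass}} && \text{(optimality: maximize growth rate)} \\
\text{subject to}\quad & S\,v = 0 && \text{(steady state and mass conservation)} \\
& v_j^{\min} \le v_j \le v_j^{\max} && \text{(nutrient uptake and capacity limits)}
\end{aligned}
```

In this picture a deletion mutant is modeled by forcing the flux through the affected reaction to zero and solving again. The mutant's fitness is then its optimal biomass flux divided by the wild-type value.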
Measures of epistasis
Since covariance is as fundamental as fitness, why not
define relative covariance instead of relative fitness? We
define it relative to a tri-modally binned covariance that
itself varies, so relative to a shifting baseline.
Absolute covariance
$$\epsilon = \mathrm{Cov}(w_x, w_y) = w_{xy} - w_x w_y$$
Relative covariance
$$\tilde{\epsilon} = \frac{\mathrm{Cov}(w_x, w_y)}{|\mathrm{BinnedCov}(w_x, w_y)|} = \frac{w_{xy} - w_x w_y}{|\tilde{w}_{xy} - w_x w_y|}$$
Measures of epistasis
Additive
$$\mathrm{BinnedCov}(w_x, w_y) = w_{xy} - w_x w_y = 0$$
Synergistic is binned as synthetic lethal ($\tilde{w}_{xy} = 0$)
$$\mathrm{BinnedCov}(w_x, w_y) = 0 - w_x w_y = -w_x w_y$$
Antagonistic is binned as buffering ($\tilde{w}_{xy} = w_x < w_y$)
$$\mathrm{BinnedCov}(w_x, w_y) = w_x - w_x w_y$$
Can think of this as relative fitness, $w^{r}_{xy} = w_{xy}/(w_x w_y)$, being relative to
the product of fitnesses and again a shifting baseline
Additive
$$\mathrm{BinnedCov}(w_x, w_y) = 0$$
Synergistic is binned as synthetic lethal
$$\tilde{\epsilon} = w^{r}_{xy} - 1 \sim -1$$
Antagonistic is binned as buffering
$$\tilde{\epsilon} = \frac{w^{r}_{xy} - 1}{\tilde{w}^{r}_{xy} - 1} \sim 1$$
Measures of epistasis
What do we expect for the distribution of the new measure
of epistasis?
If multiplicative, then relative covariance should still be 0.
For antagonistic, since ε is unimodal, if wx-wxwy is constant,
then relative covariance is unimodal. If wx-wxwy is unimodal
in the same form as ε, then relative covariance should be 1.
For synergistic, since ε is unimodal, if wxwy is constant,
then relative covariance is unimodal. If wxwy is unimodal
in the same form as ε, then relative covariance should be -1.
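A minimal sketch of the relative (scaled) covariance. It assumes that pairs below the multiplicative expectation are scaled against the synthetic-lethal reference (binned double-mutant fitness 0) and pairs above it against the buffering reference (binned double-mutant fitness equal to the smaller single-mutant fitness). The function name and example values are hypothetical.

```python
def scaled_epistasis(w_xy, w_x, w_y):
    """Relative covariance: (w_xy - w_x*w_y) / |reference - w_x*w_y|."""
    expected = w_x * w_y
    eps = w_xy - expected                  # absolute covariance
    if eps == 0:
        return 0.0
    # Shifting baseline: the reference depends on the direction of the deviation.
    reference = min(w_x, w_y) if eps > 0 else 0.0
    denom = abs(reference - expected)
    if denom == 0:
        return float("nan")                # no room to deviate in that direction
    return eps / denom

print(scaled_epistasis(0.72, 0.9, 0.8))    # close to the product
print(scaled_epistasis(0.00, 0.9, 0.8))    # synthetic lethal
print(scaled_epistasis(0.80, 0.9, 0.8))    # complete buffering
```

The three example pairs land near 0, at -1, and at 1. That is the tri-modal binning the scaling is meant to expose, and intermediate interactions fall in between.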
Measures of epistasis
How does ε vary with relative covariance?
$$\tilde{\epsilon} \sim \epsilon^{2}$$
Will accentuate differences in distribution
Measures of epistasis—based on FBA
predictions in yeast
Sort of unimodal distribution goes to trimodal distribution.
Opposite of Lenski et al. because synergy is enriched. Why?
Measures of epistasis—RNA viruses
A bit more continuous for real data. We will see more real
data later on.
Higher level epistasis—interactions
among functional groups rather than loci
Interactions are mostly monochromatic. No reason a priori
that this should be, except it signifies functional organization.
$$\mathrm{Cov}(W_{ModA}, W_{ModB}) = W_{ModA,ModB} - W_{ModA} W_{ModB}$$
Can we do the reverse and cluster
monochromatically to find functional groups?
1. Construct a network of all pairwise interactions.
2. Start with each gene in its own group.
3. Cluster by pairs if they interact with other genes in the same way.
4. Require monochromaticity: each group must interact with all
other groups in the same way.
5. Within a group there is no requirement for monochromaticity.
6. Make cluster sizes as large as possible.
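The clustering steps can be sketched as a greedy merge. This is an illustrative simplification, not the published algorithm. Two groups are merged only if they never disagree about the color of their interactions with any third group and share at least one interaction partner. Greedy merging does not guarantee the largest possible clusters. The gene names, module labels, and interaction signs are hypothetical toy data.

```python
from itertools import combinations

def colors(g, h, sign):
    """Nonzero interaction signs observed between genes of group g and group h."""
    return {sign.get(frozenset((a, b)), 0) for a in g for b in h} - {0}

def can_merge(g1, g2, groups, sign):
    """True if g1 and g2 interact with every other group in the same way."""
    shared = False
    for h in groups:
        if h in (g1, g2):
            continue
        c1, c2 = colors(g1, h, sign), colors(g2, h, sign)
        if len(c1 | c2) > 1:
            return False                   # merged group would see h in two colors
        shared = shared or bool(c1 and c2)
    return shared

def cluster(genes, sign):
    groups = [frozenset([g]) for g in genes]   # each gene starts in its own group
    while True:
        pair = next((p for p in combinations(groups, 2)
                     if can_merge(p[0], p[1], groups, sign)), None)
        if pair is None:
            return groups
        groups = [g for g in groups if g not in pair] + [pair[0] | pair[1]]

# Toy data: four hidden modules; +1 = antagonistic, -1 = synergistic between modules.
module_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B",
             "c1": "C", "c2": "C", "d1": "D", "d2": "D"}
module_sign = {frozenset("AB"): 1, frozenset("CD"): 1, frozenset("AD"): 1,
               frozenset("AC"): -1, frozenset("BD"): -1, frozenset("BC"): -1}
sign = {frozenset((x, y)): module_sign[frozenset((module_of[x], module_of[y]))]
        for x, y in combinations(module_of, 2) if module_of[x] != module_of[y]}

modules = cluster(list(module_of), sign)
print([sorted(m) for m in modules])
```

On this toy network the merge stops at the four planted modules, because every further merge would make some pair of groups interact in mixed colors.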
Next class we will move on to more theory for evolution of
epistasis, synergy, and antagonism as well as evolution of
resistance in antibiotics.
First Homework set is due in two weeks (April 20, 2010).