Transcript Linkage

Linkage
What is Linkage?
• Linkage is defined genetically: the failure of two genes to assort
independently.
• Linkage occurs when two genes are close to each other on the
same chromosome.
• Two genes on the same chromosome are called syntenic, whether or not
they are linked.
• Linked genes are syntenic, but syntenic genes are not always
linked. Genes far apart on the same chromosome assort
independently: they are not linked.
• Linkage is based on the frequency of crossing over between the two
genes. Crossing over occurs in prophase of meiosis 1, where
homologous chromosomes break at identical locations and rejoin
with each other.
Discovery of Linkage
• In 1900, Mendel’s work was re-discovered, and
scientists were testing his theories with as many
different genes and organisms as possible.
• William Bateson and R.C. Punnett were working
with several traits in sweet peas, notably a gene
for purple (P) vs. red (p) flowers, and a gene for
long pollen grains (L) vs. round pollen grains (l).
Bateson and Punnett’s Results
• PP LL x pp ll
• selfed F1: Pp Ll
• F2 results in table
• Very significant deviation
from expected Mendelian
9:3:3:1 ratio: chi-square = 132.6,
with 3 d.f. Critical chi-square
value (p = 0.05) = 7.815.
• The null hypothesis for
chi square test with 2
genes is that the genes
assort independently.
These genes do not
assort independently.
Phenotype   obs   exp ratio   exp num
P_ L_       284   9/16        215
P_ ll        21   3/16         71
pp L_        21   3/16         71
pp ll        55   1/16         24
total       381   1           381
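The chi-square value quoted for the F2 data can be checked with a short sketch. It assumes the rounded expected counts from the table (215, 71, 71, 24); the function and variable names are illustrative only.

```python
def chi_square(observed, expected):
    """Pearson chi-square statistic: sum of (obs - exp)^2 / exp."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# F2 counts in the order P_ L_, P_ ll, pp L_, pp ll
observed = [284, 21, 21, 55]
# Expected counts as rounded in the table (9:3:3:1 split of 381 offspring)
expected = [215, 71, 71, 24]
CRITICAL_VALUE_3DF = 7.815  # critical value quoted in the slide for 3 d.f.

chi2 = chi_square(observed, expected)
# True means the null hypothesis of independent assortment is rejected.
reject_independent_assortment = chi2 > CRITICAL_VALUE_3DF
print(round(chi2, 1), reject_independent_assortment)
```

With unrounded expected counts (381 x 9/16, and so on) the statistic comes out slightly higher, but the conclusion is the same.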
Bateson and Punnett’s Genes in a Test Cross
• Purpose of a test cross: the offspring
phenotypes appear in the same ratio
as the gametes in the parent being
tested.
• Here, we want to see how many
gametes are in the original parental
configuration (PL or pl) and how many
are in the recombinant configuration
(Pl or pL). The parental types have the
same combination of alleles that were
in the original parents, and the
recombinant types have a combination
of the mother’s and father’s alleles.
• Original parents: PP LL x pp ll
• F1 test cross: Pp Ll x pp ll
Phenotype       obs
purple long     392
purple round    116
red long        127
red round       365
total          1000
More Test Cross
• Parentals: 392 PL + 365 pl = 757.
757/1000 total offspring = 75.7% parental
• Recombinant: 116 Pl + 127 pL = 243.
243 /1000 = 24.3% recombinant.
• If the genes were unlinked, 50% would be recombinant. These
genes are linked, with 24.3% recombination between the P gene
and the L gene.
• If the genes were right on top of each other, that is, the two
phenotypes were both caused by the same gene (pleiotropy), then
there would be 0% recombination between them.
• The percentage of recombinants is always between 0% and 50%,
and the percentage of parentals is always between 50% and 100%.
Better Symbolism
• We have been following Mendel’s tradition in writing the two alleles for each
gene together, as in PP LL x pp ll.
• Now we need to start paying attention to the fact that genes are on
chromosomes.
• If one parent contributes a P L chromosome and the other parent
contributes a p l chromosome, we write the heterozygote as PL/pl.
• Homozygotes (for all genes on that chromosome) are written without the
slash: the pp ll homozygote used in the test cross is written as p l.
Coupling vs. Repulsion
• The original test cross we did was PL/pl x p l. Among
the offspring, PL and pl were parental types, and pL and
Pl were the recombinant types. There was 24.3%
recombination between the genes.
• The condition of having the dominant alleles for both
genes on the same parental chromosome, with both
recessives on the other parental chromosome, is called
“coupling”: the P and L genes are “in coupling phase”.
• The opposite condition, having one dominant and one
recessive on each parental chromosome, is called
“repulsion”. Thus, if the original parents were P l x p L,
their offspring would have the genes in repulsion phase:
Pl / pL.
Test Cross in Repulsion
• Now do the test cross in repulsion: Pl / pL x p l
• Here, the parental types are P l and p L, and the recombinant
types are P L and p l.
• The numbers of offspring in each type are quite different from the
originals.
• However, the percentage of recombinants is the same: 24.3%.
• 123 P L + 120 p l = 243 recombinant offspring.
243 / 1000 total offspring = 24.3%
• The percentage of recombination depends on the distance between
the genes on the chromosome, and NOT on which alleles are on
which chromosome.
phenotype    obs
PL           123
Pl           372
pL           385
pl           120
total       1000
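The coupling and repulsion test crosses can be compared side by side in a small sketch. It assumes, as the slides do, that the two largest offspring classes are the parental types; the function name and dictionary layout are illustrative.

```python
def recombination_percent(counts):
    """Percent recombinant gametes from two-gene test-cross counts.

    counts maps a gamete genotype such as "PL" to its offspring number.
    Assumption: the two most frequent classes are the parental types.
    """
    ranked = sorted(counts, key=counts.get, reverse=True)
    parentals, recombinants = ranked[:2], ranked[2:]
    total = sum(counts.values())
    percent = 100 * sum(counts[g] for g in recombinants) / total
    return sorted(parentals), percent

coupling = {"PL": 392, "Pl": 116, "pL": 127, "pl": 365}   # PL/pl x p l
repulsion = {"PL": 123, "Pl": 372, "pL": 385, "pl": 120}  # Pl/pL x p l

coupling_parentals, coupling_rf = recombination_percent(coupling)
repulsion_parentals, repulsion_rf = recombination_percent(repulsion)
print(coupling_parentals, coupling_rf)
print(repulsion_parentals, repulsion_rf)
```

The parental classes differ between the two crosses, but the recombination percentage does not.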
Process of Recombination
• From an evolutionary point of view, the purpose of sex is
to re-shuffle the combinations of alleles so the offspring
receive a different set of alleles than their parents had.
Natural selection then causes offspring with good
combinations to survive and reproduce, while offspring
with bad combinations don’t pass them on.
• Genes are on chromosomes. Meiosis is a mechanism
for re-shuffling the chromosomes: each gamete gets a
mixture of paternal and maternal chromosomes.
• However, chromosomes are long and contain many
genes. To get individual genes re-shuffled, there needs
to be a mechanism of recombining genes that are on the
same chromosome. This mechanism is called “crossing
over”.
More Recombination
• Crossing over occurs in prophase of meiosis 1, when the
homologous chromosomes “synapse”, which means to
pair closely with each other. DNA strands from the two
chromosomes are matched with each other.
• During synapsis, an enzyme, “recombinase”, attaches to
each chromosome at several randomly chosen points.
The recombinase breaks both DNA molecules at the
same point, and re-attaches them to opposite partners.
• The result of crossing over can be seen in the
microscope as prophase continues, as X-shaped
structures linking the homologues.
• The genetic consequence of crossing over is that each
chromosome that goes into a gamete is a combination of
maternal and paternal chromosomes.
Recombination Process
The Process of Recombination,
animated version
http://www.youtube.com/watch?v=BhJf9MHHmc4
Linkage Mapping
• Each gene is found at a fixed position on a particular chromosome. Making
a map of their locations allows us to identify and study them better. In
modern times, we can use the locations to clone the genes so we can better
understand what they do and why they cause genetic diseases when
mutated.
• The basis of linkage mapping is that since crossing over occurs at random
locations, the closer two genes are to each other, the less likely it is that a
crossover will occur between them. Thus, the percentage of gametes that
had a crossover between two genes is a measure of how far apart those
two genes are.
• This was pointed out by T. H. Morgan and Alfred Sturtevant, who produced
the first Drosophila gene map in 1913. Morgan was the founder of Drosophila
genetics, and in his honor a recombination map unit is called a centiMorgan
(cM).
• A map unit, or centiMorgan, corresponds to recombination between 2 genes
in 1% of the gametes.
Three Point Cross
• The easiest way to map genes is to compare them in
groups of 3. This allows both the distances between
them and their order to be determined. Further genes
can be added to the map by using overlapping groups of
3.
• Mapping is usually done in test crosses. One parent is
heterozygous for two versions of the chromosome being
mapped, and the other parent is homozygous for the
recessive mutants being mapped. Recombination in the
heterozygous parent gives different combinations of
alleles, which are counted.
• Note that recombination also occurs in the homozygous
recessive parent, but it has no effect on the alleles in the
offspring because it is homozygous.
Data
• In corn, c gives a green plant body, while its wildtype allele c+
gives a purple plant body.
• bz (bronze) gives brown seeds, while the wildtype allele bz+ gives
purple seeds.
• wx (waxy) gives waxy endosperm in the seeds; wx+ gives starchy
endosperm.
• The genes are arranged on the chromosome in the order c-bz-wx.
• The cross: c bz wx / + + + x c bz wx.
• Note the +’s are the dominant wildtype alleles of the
corresponding gene.
phenotype   count
c bz wx       318
+ + +         324
c bz +        105
+ + wx        108
c + +          18
+ bz wx        20
c + wx          4
+ bz +          3
total         900
Notes on the Data
• Offspring classes are arranged in reciprocal pairs:
between its two members, each pair has 1 copy of the
mutant allele and 1 copy of the wildtype allele for each
gene. The counts are roughly equal for reciprocal pairs,
because they are both products of the same crossing over
events.
• Parentals are the largest groups: c bz wx and +
+ +.
• Double crossovers, one between c and bz and
another between bz and wx, are the smallest
groups.
Calculating Map Distances
• Basic process: determine the percentage of
offspring that had a crossover between each
pair of genes.
• 1. Examine c and bz first. Parental configuration
was c bz and + +. Therefore, the recombinant
configurations are c + and + bz.
• Count recombinants, ignoring the other gene
(wx): 18 c + + , 20 + bz wx, 4 c + wx, 3 + bz +.
Total is 45 recombinants out of 900 total
offspring. 45 / 900 = 0.05. Need to multiply by
100 to get percentage: 0.05 x 100 = 5.0 map
units between c and bz.
More Calculating Map Distances
• 2. Next examine bz and wx. Parentals are bz wx and +
+, so recombinants are bz + and + wx.
• Ignoring c, the count of recombinants is: 105 c bz +,
108 + + wx, 4 c + wx, 3 + bz +. Total = 220
recombinants. 220 / 900 = 0.244. 0.244 x 100 = 24.4
map units between bz and wx.
• 3. Now do c and wx. Parentals are c wx and + +, so
recombinant offspring are c + and + wx.
• Ignoring bz, the recombinants are 105 c bz +, 108 + +
wx, 18 c + + , 20 + bz wx. Total = 251. 251/ 900 x 100 =
27.9 map units.
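The three pairwise counts above can be reproduced with a short sketch. An offspring is recombinant for a pair of genes when one position matches the chosen parental chromosome and the other does not. The function name and the tuple encoding of genotypes are illustrative assumptions.

```python
# Test-cross offspring counts, alleles listed in the order c, bz, wx
counts = {
    ("c", "bz", "wx"): 318, ("+", "+", "+"): 324,
    ("c", "bz", "+"): 105, ("+", "+", "wx"): 108,
    ("c", "+", "+"): 18, ("+", "bz", "wx"): 20,
    ("c", "+", "wx"): 4, ("+", "bz", "+"): 3,
}
parental = ("c", "bz", "wx")  # the reciprocal parental, + + +, is implied

def map_distance(counts, parental, i, j):
    """Percent of offspring recombinant between gene positions i and j."""
    total = sum(counts.values())
    recombinants = 0
    for genotype, n in counts.items():
        # Parental for this pair: both positions match the same parental
        # chromosome (both match, or both differ from the one given).
        same_i = genotype[i] == parental[i]
        same_j = genotype[j] == parental[j]
        if same_i != same_j:
            recombinants += n
    return 100 * recombinants / total

c_bz = map_distance(counts, parental, 0, 1)
bz_wx = map_distance(counts, parental, 1, 2)
c_wx = map_distance(counts, parental, 0, 2)
print(round(c_bz, 1), round(bz_wx, 1), round(c_wx, 1))
```

Because the comparison is made against a parental chromosome rather than against "all mutant" or "all wildtype", the same function also works when some alleles are in repulsion.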
Map of c, bz, and wx
• All 3 genes are in the
proper order, and all 3
distances between
pairs of genes are
shown.
• Note that the distances
don’t add up: 5.0 + 24.4
= 29.4, but the c--wx
distance is 27.9. This is
due to double
crossovers, which we
will discuss next.
Double Crossovers and Mapping
• A double crossover is two crossovers both occurring between the two genes
being examined. The first crossover changes the parental configuration of
alleles to the recombinant configuration. The second crossover changes
the recombinant configuration back to the parental. The net result is that
the genes are in the parental configuration, same as if no crossovers had
occurred.
• Thus, any even number of crossovers is the same as 0 crossovers, and any
odd number is the same as 1 crossover.
• Since you only see the offspring and not the actual crossovers, it is very
easy to undercount the number that occurred.
• Consider the c bz wx cross. If you were just looking at c and wx, and hadn’t
examined bz, the c + wx and + bz + offspring would be parental and count
as 0 crossovers. The only way you know that 2 crossovers occurred is by
examining bz. Perhaps other crossovers also occurred that we didn’t
detect, since we didn’t examine any other genes in between bz and wx.
• This is the main reason why the c--bz and bz--wx distances didn’t add up
to the c--wx distance: double crossovers were counted as parentals.
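A small arithmetic sketch shows how the undercount arises in the c bz wx data. Counting each double-crossover offspring twice (it carries two crossovers) brings the c--wx distance into agreement with the sum of the two shorter intervals. Variable names are illustrative.

```python
total = 900
single_c_bz = 18 + 20      # crossover only in the c--bz interval
single_bz_wx = 105 + 108   # crossover only in the bz--wx interval
doubles = 4 + 3            # one crossover in each interval

# Looking only at c and wx: double crossovers look parental and are missed.
observed_c_wx = 100 * (single_c_bz + single_bz_wx) / total

# Each double-crossover offspring carries two crossovers, so count it twice.
corrected_c_wx = 100 * (single_c_bz + single_bz_wx + 2 * doubles) / total

# Sum of the two short intervals; each one already includes the doubles once.
sum_of_intervals = 100 * ((single_c_bz + doubles) + (single_bz_wx + doubles)) / total

print(round(observed_c_wx, 1), round(corrected_c_wx, 1), round(sum_of_intervals, 1))
```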
Mapping Function
• The further apart 2 genes are, the more likely it is that undetected
double crossovers will occur between them.
• For this reason, gene maps are created using short intervals.
• One consequence: the maximum percentage of recombinant offspring is 50%,
but many chromosomes are several hundred map units long. For instance, the
human chromosome 13 is 125 map units long. Genes on opposite ends of this
chromosome would have a 50% frequency of recombinant offspring.
• Mapping function: number of actual crossovers on x-axis, frequency of
recombinant offspring on y-axis.
• In general,
– for less than 20% recombinants, the number of recombinants is roughly
equal to the number of crossovers;
– for 20-50% recombinants, the number of recombinants is significantly
less than the number of crossovers;
– for 50% recombinants, the number of crossovers is not determinable.
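The slides do not say which mapping function is plotted. As one possible illustration, the sketch below uses Haldane's mapping function, a standard model that assumes crossovers occur independently (no interference). That choice and the function name are assumptions, not something taken from the slides.

```python
import math

def haldane_recombinant_fraction(map_units):
    """Expected recombinant fraction for a true distance in map units.

    Assumption: Haldane's model (independent crossovers, no interference).
    The slides do not specify which mapping function they plot.
    """
    morgans = map_units / 100
    return 0.5 * (1 - math.exp(-2 * morgans))

# Short distances track the true distance closely; long ones level off below 50%.
for distance in (5, 25, 50, 125):
    percent = 100 * haldane_recombinant_fraction(distance)
    print(distance, round(percent, 1))
```

Under this model the observed recombinant percentage stays close to the true distance for short intervals and approaches, but never exceeds, 50% for long ones.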
Interference
• There is a second issue with double crossovers:
interference.
• Interference is the inability of 2 crossovers to
occur very close to each other. Think of the
chromosome as a thick rope: it is impossible to
bend it too tightly.
• It is possible to measure the amount of
interference, by comparing the actual number of
double crossovers to the number that you would
expect based on the number of single
crossovers that occurred.
Measuring Interference
• Consider our c--bz--wx example. The c--bz interval was 5.0 map units, or
5% recombinants between those 2 genes. The probability of a crossover
between c and bz is 0.05, which you get by dividing the map distance by
100 to put it on a 0-1 scale instead of a 0-100 scale. The bz--wx interval
was 24.4 map units, or 24.4% recombinants, or a 0.244 probability.
• If a crossover between c and bz had no effect on the possibility of a
crossover between bz and wx (i.e. no interference), then the chance of a
c--bz crossover AND a bz--wx crossover would be the product of their
individual probabilities.
• That is, the expected frequency of double crossovers would be 0.05 * 0.244
= 0.0122. Since there were 900 offspring, the expected number of double
crossover offspring would be 0.0122 * 900 = 10.98.
• The observed number of double crossovers was 7. The coefficient of
coincidence is the ratio of observed double crossovers to expected: 7 /
10.98 = 0.638. The interference is 1 minus the coefficient of coincidence, =
1 - 0.638 = 0.362. This means that about 36% of the expected double
crossovers are not occurring due to interference.
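The interference calculation for the c--bz--wx example can be written as a short sketch. The function name and the order of its return values are illustrative choices.

```python
def interference(dist_1, dist_2, total_offspring, observed_2co):
    """Return (expected 2CO, coefficient of coincidence, interference).

    dist_1 and dist_2 are the two adjacent map distances in map units.
    """
    expected_2co = (dist_1 / 100) * (dist_2 / 100) * total_offspring
    coincidence = observed_2co / expected_2co
    return expected_2co, coincidence, 1 - coincidence

# c--bz = 5.0 map units, bz--wx = 24.4 map units, 900 offspring, 4 + 3 doubles
expected_2co, coincidence, interf = interference(5.0, 24.4, 900, 4 + 3)
print(round(expected_2co, 2), round(coincidence, 3), round(interf, 3))
```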
Interference Formulas
$$\text{exp 2CO} = \frac{\text{map dist I}}{100} \times \frac{\text{map dist II}}{100} \times \text{total offspring}$$

$$\text{C. of C.} = \frac{\text{obs 2CO}}{\text{exp 2CO}}$$

$$I = 1 - \text{C. of C.}$$
Gene Order
• Another problem that can arise is that when
mapping genes you usually don’t know their
order. You need to infer the order from the data.
• Also, not all the alleles are necessarily going to
be in coupling. Sometimes some alleles are in
repulsion.
• Getting the gene order is a matter of comparing
the alleles present on the parental
chromosomes to those on the double crossover
chromosomes.
Example
• Note that the offspring counts
are arranged in reciprocal
pairs. Each member of the
pair has the opposite alleles,
and the counts for the two
members of the pair are
approximately equal.
• The parental class of offspring
is the largest: a + d, and + b
+. No crossovers have
occurred here: the original
parents had these
combinations of alleles.
• The double crossovers class
(2CO) is the smallest class: +
b d and a + +.
• The other two pairs are the two
single crossover classes.
offspring phenotype   count
a + d                   510
+ b +                   498
a + +                     3
+ b d                     1
a b +                    61
+ + d                    59
a b d                    35
+ + +                    37
total                  1204
Getting Gene Order
• Since we don’t know (or care about) the orientation of
these genes relative to the chromosome as a whole,
there are only 3 possible orders. These are based on
which gene is in the middle.
• The genes could be a--b--d, b--a--d, or a--d--b.
• Note that a--b--d is identical to d--b--a, etc.
• The order in which the genes are listed in the offspring
counts has nothing to do with their order on the
chromosome!!! The gene order is unknown at this point.
• To determine gene order, set up the parental
chromosomes in the F1, then see what the resulting
double crossover offspring would look like. If the
observed 2CO’s match the expected, then you have the
correct gene order. If not, try a different order.
Gene Order
• 1. Try a--b--d. The F1 chromosomes would be a + d/ +
b +. A double crossover would give a b d and + + + 2CO
offspring. This does not match the observed a + + and
+ b d.
• 2. Try b--a--d. The F1 chromosomes are + a d / b + +.
A double crossover would give b a + and + + d 2CO
offspring. This does not match the observed b + d and +
a +.
• 3. Try a--d--b. The F1 chromosomes are a d + / + +
b. A double crossover gives a + + and + d b, which
matches the observed 2CO offspring. Therefore, a--d--b
is the correct order; d is in the middle.
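Trying each order by hand amounts to asking which gene switched partners between the parental and double-crossover classes. A minimal sketch of that comparison, with an illustrative function name and tuple encoding, might look like this:

```python
def middle_gene(genes, parental, double_crossover):
    """Return the gene whose allele switched partners in the 2CO class.

    genes: gene names in the (arbitrary) order used to list the alleles.
    parental, double_crossover: one chromosome from each class,
    written in that same listing order.
    """
    differs = [p != d for p, d in zip(parental, double_crossover)]
    if sum(differs) == 2:
        # The 2CO chromosome was compared with the "wrong" member of the
        # parental pair; flipping gives the comparison with its reciprocal.
        differs = [not x for x in differs]
    if sum(differs) != 1:
        raise ValueError("classes are not consistent with a double crossover")
    return genes[differs.index(True)]

# Parental a + d and double crossover a + + from the example data
middle = middle_gene(["a", "b", "d"], ("a", "+", "d"), ("a", "+", "+"))
print(middle)
```

The same function gives bz for the corn data, where the parental class is c bz wx and the double-crossover class is c + wx.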
Gene Distances
• a--d. Parentals are a d and + +, so
recombinants are a + and + d. There are 61 +
59 + 3 + 1 = 124 recombinant offspring. Total
offspring = 1204. So map distance = 124/1204 *
100 = 10.3 map units.
• d--b. Parentals are d + and + b, so recombinants
are d b and + +. There are 35 + 37 + 3 + 1 = 76
of them. 76 / 1204 * 100 = 6.3 map units.
• a--b. Parentals are a + and + b, so recombinants
are a b and + +. There are 61 + 59 + 35 + 37 =
192 of them. 192 / 1204 * 100 = 15.9 map units.
Interference and Map
• There were 3 + 1 = 4 observed
2CO’s.
• Expected 2CO:
= (10.3/100) * (6.3/100) * 1204
= 7.81
• Coef. of Coincidence = obs 2CO
/ exp 2CO
= 4 / 7.81
= 0.512
• Interference = 1 - C. of C.
= 1 - 0.512
= 0.488
Steps to Solving 3-Point Cross Problems
• 1. Organize the data into reciprocal pairs. For example, a + c and + b + are
a reciprocal pair, and so are a b c and + + +. The number of offspring for
each member of a pair will be similar.
• 2. Determine which pair is the parental class: it is the LARGEST class.
• 3. Determine which pair is the double crossover (2CO) class: the
SMALLEST class.
• 4. Determine which gene is in the middle. If you compare the parentals with
the 2COs, the gene which switched partners is in the middle. If you think
two genes have switched sides, it is the other gene that is in the middle.
• 5. Determine map distances for all three pairs of genes. Count the number
of offspring that have had a crossover between the genes of interest, then
divide by the total offspring and multiply by 100.
• 6. Figure the expected double crossovers: (map distance for interval I / 100
) * (map distance for interval II / 100 ) * (total offspring) = exp 2CO.
• 7. Figure the interference as: I = 1 - (obs 2CO / exp 2CO). The observed
2CO class comes from the data.
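Steps 1 to 3 can be sketched in code using the a, b, d example data. The function name, the tuple encoding of genotypes, and the use of "+" for every wildtype allele are illustrative assumptions.

```python
def classify_classes(counts, genes):
    """Group three-point test-cross classes into reciprocal pairs.

    counts maps an allele tuple such as ("a", "+", "d") to its offspring
    count; genes lists the mutant symbols in the same position order.
    Returns (parental pair, single-crossover pairs, double-crossover pair),
    each pair given as ((genotype, reciprocal genotype), combined count).
    """
    def reciprocal(genotype):
        # Swap mutant and wildtype at every gene.
        return tuple(g if allele == "+" else "+" for allele, g in zip(genotype, genes))

    pairs, seen = [], set()
    for genotype in counts:
        if genotype in seen:
            continue
        partner = reciprocal(genotype)
        seen.update({genotype, partner})
        pairs.append(((genotype, partner), counts[genotype] + counts.get(partner, 0)))
    pairs.sort(key=lambda item: item[1], reverse=True)  # largest pair first
    return pairs[0], pairs[1:-1], pairs[-1]

counts = {
    ("a", "+", "d"): 510, ("+", "b", "+"): 498,
    ("a", "+", "+"): 3, ("+", "b", "d"): 1,
    ("a", "b", "+"): 61, ("+", "+", "d"): 59,
    ("a", "b", "d"): 35, ("+", "+", "+"): 37,
}
parental, singles, double = classify_classes(counts, ("a", "b", "d"))
print(parental, double)
```

The parental and double-crossover pairs found this way are the inputs needed for step 4.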
One More Example
• Here are some test cross data for a 3 point
cross. Determine the order of the genes, make
a map showing all map distances, and
determine the interference value.
• pr = purple eyes
• bl = black body
• dp = dumpy bristles
• Answer will be given in class.
Data
phenotype             count
wild type                60
purple                  756
dumpy                    23
black                   417
purple dumpy            414
purple black             21
dumpy black             750
purple dumpy black       59
total                  2500