Lecture 1 Human Genetics


Great Dane x Mexican Chihuahua
F1 Big (Great Danes)
3 Big : 1 Small
The Genius of Mendel
• Highly inbred strains of peas
• Differed by single character
Round x Wrinkled
(WT x mutant)
F1 All Round
F2 5474 Round
1850 Wrinkled
2.96:1 (3:1)
Needs Statistics
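A minimal sketch of the statistics the slide asks for: a Pearson chi-square test of the F2 counts against the 3:1 expectation. The function name and the use of a fixed 5% critical value are choices made for this illustration, not something taken from the lecture.

```python
def chi_square(observed, expected_ratio):
    """Pearson chi-square statistic for observed counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Counts from the slide: 5474 round, 1850 wrinkled; Mendelian expectation is 3:1.
chi2 = chi_square([5474, 1850], [3, 1])

# 3.841 is the standard 5% critical value for 1 degree of freedom.
CRITICAL_5PCT_DF1 = 3.841
consistent_with_3_to_1 = chi2 < CRITICAL_5PCT_DF1

print(round(chi2, 3), consistent_with_3_to_1)
```

The statistic falls far below the critical value, so the observed 2.96:1 ratio gives no reason to reject 3:1.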
Mapping in Drosophila
Ly Sb br
+ + +
Lots of variation in people
There must be a genetic component
How do we assign “traits” to genes?
Ultimately want a molecular description
Start with inherited diseases
Pedigrees: Mendel’s First Law
Autosomal Dominant Disorder
Autosomal Recessive
Disease is apparent because of consanguinity (III 5 & 6)
Population Genetics
Science of Intraspecific Variation
Phenotypic
GENOTYPIC
• Genotypic Variation: Alleles, Polymorphism
– Ultimate Source of Variation: Mutation
• Dynamics of Variation during Population History
– Changes in Allele Frequencies due to
• Drift
• Selection
– Persistence of Allele Combinations due to Linkage
• Linkage Disequilibrium
Some Basics 1
1. Only refer to one strand, and don’t confuse strands with alleles
Allele 1: GATCACA = TGTGATC (the same allele, read from the opposite strand)
Allele 2: GATTACA = TGTAATC (the same allele, read from the opposite strand)
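The strand point can be checked mechanically: reading the opposite strand 5' to 3' gives the reverse complement, so two sequences that are reverse complements of each other describe one allele, not two. This is a small sketch; the helper names are hypothetical.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def same_allele(seq_a, seq_b):
    """True if the two sequences are identical or are the two strands of one allele."""
    return seq_a == seq_b or seq_a == reverse_complement(seq_b)

print(reverse_complement("GATCACA"))
print(same_allele("GATTACA", "TGTAATC"))
print(same_allele("GATCACA", "TGTAATC"))
```

The last call compares the two different alleles, which differ at one base (C versus T) whichever strand is read.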
2. Context is unimportant (unless we have linkage…next)
AGACAGAAAGGAAAAGAACCTTCCATTTTTGGCTGTGCCAAGAAGCTCAGAAAGG[T]GATAATATAAAAAATATATAGTTAATTGGGAATTGAATTTACAAAATACATTGTG
AGACAGAAAGGAAAAGAACCTTCCATTTTTGGCTGTGCCAAGAAGCTCAGAAAGG[C]GATAATATAAAAAATATATAGTTAATTGGGAATTGAATTTACAAAATACATTGTG
Allele 1: T
Allele 2: C
Some Basics 2
3. Because mutations are rare events, the vast majority of variation is BINARY at the base pair level.
Allele 1 (T): CAAAGGAAAAGAATGCCTTCCATTTTTGGCTGTGCCAAGAAGCTCAGAAAGG[T]GATAATATAAAAAATATATAGTTAATTGGGAATTGAATTTACAAAATACATT
Allele 2 (C): CAAAGGAAAAGAATGCCTTCCATTTTTGGCTGTGCCAAGAAGCTCAGAAAGG[C]GATAATATAAAAAATATATAGTTAATTGGGAATTGAATTTACAAAATACATT
4. Linkage makes things more complicated but only if you actually care
about linkage: Linkage equilibrium/disequilibrium.
[Figure: Haplotypes 1, 2 and 3 drawn over the same stretch of sequence, GAAAGGAAAAGAAGATTT / CTTCC[1396bp]GAAGCTCAGAAAGG / GATAATATAAAAAATAT[2502bp]TTGGGAATTTACA / AATAC, with three variable sites of 2 alleles each. Each haplotype carries its own combination of alleles across the three sites.]
Some Basics 3
5. Alleles have frequencies in the population (which sum to 1)
Frequency of Allele 1 (T) = 0.59: p = 0.59 (frequency of major allele)
Frequency of Allele 2 (C) = 0.41: q = 0.41
6. We’ll be talking about diploids, and genotype probabilities (which sum to 1) can be calculated from allele frequencies.
(And vice versa; and under certain assumptions)
Prob. of having:
Genotype   Probability   Formula
T,T        0.35          $p^2$
T,C        0.48          $2pq$
C,C        0.17          $q^2$
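The genotype probabilities above follow directly from the allele frequencies, and the allele frequency can be recovered from the genotype frequencies. A minimal sketch, assuming Hardy-Weinberg proportions hold; the function names are illustrative only.

```python
def hardy_weinberg(p):
    """Genotype probabilities for a two-allele locus with major-allele frequency p."""
    q = 1.0 - p
    return {"T,T": p * p, "T,C": 2 * p * q, "C,C": q * q}

def allele_freq_from_genotypes(genotypes):
    """Frequency of T: every T,T carries two T alleles, every T,C carries one."""
    return genotypes["T,T"] + 0.5 * genotypes["T,C"]

genotypes = hardy_weinberg(0.59)
for name, prob in genotypes.items():
    print(name, round(prob, 2))
print("p recovered:", round(allele_freq_from_genotypes(genotypes), 2))
```

Going from genotypes back to alleles needs no assumptions at all; it is only the forward direction that relies on Hardy-Weinberg conditions.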
What about two different genes?
Consider two genes A and B that each have two alleles
Aa
Bb
Allelic frequencies are 0.5
(At the “A” locus A=0.5, a= 0.5)
(At the “B” locus B=0.5 and b=0.5)
For A and a, genotype frequencies = $p^2 + 2pq + q^2$
AA, Aa and aa individuals = 0.25 + 0.5 + 0.25
The same for BB, Bb and bb
How many AA BB individuals? 0.25 x 0.25 = 0.0625
How many aa Bb individuals? 0.25 x 0.50 = 0.125
Both genes are in “equilibrium”. (Hardy and Weinberg)
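The two-gene arithmetic can be sketched as a product of single-locus frequencies. This assumes the two loci are independent of each other (no linkage disequilibrium), which is what lets the frequencies multiply; the function names are hypothetical.

```python
from itertools import product

def single_locus(p, big, small):
    """Hardy-Weinberg genotype frequencies for one locus with alleles big/small."""
    q = 1.0 - p
    return {big + big: p * p, big + small: 2 * p * q, small + small: q * q}

def two_locus(p_a, p_b):
    """Joint genotype frequencies, assuming the A and B loci are independent."""
    locus_a = single_locus(p_a, "A", "a")
    locus_b = single_locus(p_b, "B", "b")
    return {
        ga + " " + gb: fa * fb
        for (ga, fa), (gb, fb) in product(locus_a.items(), locus_b.items())
    }

joint = two_locus(0.5, 0.5)
print(joint["AA BB"], joint["aa Bb"])
```

The nine joint genotype frequencies sum to 1, just as the three single-locus frequencies do.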
Hardy Weinberg is the Population Equivalent of the Punnett Square
        A     a
A       AA    Aa
a       Aa    aa
$(p + q)^2 = p^2 + 2pq + q^2$
Mutation Rate per Generation
How often per generation does this happen?
(1 generation)
Average Mutation Rates in Mammals
Point substitution (nuc)     $0.5 \times 10^{-8}$ per base pair
Microdeletion (1-10bp)       ~$10^{-9}$ per base pair
Microinsertion (1-10bp)      ~$0.5 \times 10^{-9}$ per base pair
Mobile element insertion     ~$10^{-11}$
Inversion                    ?? much rarer
Exceptions
Hypermutable sites (CpGs)    C->T = 10x avg point rate
Simple Sequence Repeats      10-1000x indel rate (some $10^{-4}$!)
Mitochondrial DNA            10-100x nuclear point rate
Haploid Human Genome is ~$3 \times 10^9$ base pairs
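Combining a per-base-pair rate with a genome size gives the expected number of new mutations per generation. A rough sketch: the point substitution rate is the one quoted on the slide, while the genome size is passed in as a parameter, and the value used in the example (3 x 10^9 bp, a commonly cited approximate haploid size) is an assumption of this illustration.

```python
POINT_SUBSTITUTION_RATE = 0.5e-8  # per base pair per generation, from the slide

def expected_new_mutations(rate_per_bp, genome_size_bp, ploidy=2):
    """Expected count of new mutations: rate x base pairs x number of genome copies."""
    return rate_per_bp * genome_size_bp * ploidy

# Assumed haploid genome size for the example only.
ASSUMED_HAPLOID_GENOME_BP = 3e9

per_gamete = expected_new_mutations(POINT_SUBSTITUTION_RATE, ASSUMED_HAPLOID_GENOME_BP, ploidy=1)
per_zygote = expected_new_mutations(POINT_SUBSTITUTION_RATE, ASSUMED_HAPLOID_GENOME_BP, ploidy=2)
print(per_gamete, per_zygote)
```

Under these figures every newborn carries on the order of tens of new point substitutions, which is why it matters that most of them fall in non-coding DNA.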
Most of the DNA is non-coding
Introns, intergenic regions, LINEs, SINEs, etc.
At the DNA level, there can be tremendous variation with no phenotypic consequences
Remember the LacI gene (the repressor)
Nonsense mutations at every codon
Substitute every amino acid at every position
White means no phenotype
Lesson: most mutations in coding regions are silent
Drift vs. Selection
The two forces that determine the fate of alleles in a population
• Drift
– Change in allele frequencies due to sampling
• Selection
– Change in allele frequencies due to function
Genetic Drift
This is like 107 independent populations
Gen 0
For every bottle: after eggs hatch pick 8 male larvae and 8 female larvae, stick in a new bottle. Repeat for 19 generations.
Gen 19
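The bottle experiment can be imitated with a simple simulation. This sketch swaps in the idealized Wright-Fisher model (each generation's 32 gene copies are drawn at random from the previous generation's allele frequency) and is not a reconstruction of the actual experiment. The starting frequency of 0.5, the seed, and all function names are assumptions made here.

```python
import random

def wright_fisher(n_individuals, p0, generations, rng):
    """Final allele frequency after random sampling of 2N gene copies each generation."""
    copies = 2 * n_individuals
    p = p0
    for _ in range(generations):
        # Each gene copy in the new generation is the allele with probability p.
        count = sum(1 for _ in range(copies) if rng.random() < p)
        p = count / copies
    return p

def simulate_bottles(n_bottles=107, n_individuals=16, p0=0.5, generations=19, seed=1):
    """One final allele frequency per bottle; 16 individuals = 8 males + 8 females."""
    rng = random.Random(seed)
    return [wright_fisher(n_individuals, p0, generations, rng) for _ in range(n_bottles)]

final = simulate_bottles()
fixed_or_lost = sum(1 for p in final if p in (0.0, 1.0))
print(f"{fixed_or_lost} of {len(final)} bottles fixed or lost the allele")
```

No selection is modeled, so any spread in the final frequencies comes from sampling alone.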
Genetic Drift: Size Matters
4 populations
2 at N=25
2 at N=250
From Li (1997) Molecular Evolution, Sinauer Press
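Why size matters can be put in numbers: under binomial sampling of 2N gene copies, the standard deviation of the allele-frequency change in one generation is sqrt(p(1-p)/(2N)). A short sketch for the two population sizes in the figure; the starting frequency of 0.5 is an assumption of this example.

```python
import math

def drift_sd_per_generation(p, n_individuals):
    """Standard deviation of the one-generation change in allele frequency due to sampling."""
    return math.sqrt(p * (1.0 - p) / (2 * n_individuals))

for n in (25, 250):
    print(n, round(drift_sd_per_generation(0.5, n), 4))
```

A tenfold larger population fluctuates about sqrt(10), roughly 3.2 times, less per generation.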
Selection & Fitness
“Absolute Fitness” = “Viability” = # of survivors / total # progeny produced
= P(survival until mean reproductive age)
If Fitness depends on Genotype, then we have (natural) Selection
Selection vs Drift Recap
From the perspective of disease severity:
Given a particular selection coefficient (picture severity of
disease), selection is only effective in a population whose size
is large enough to overcome the effect of drift.
From the perspective of population size:
Given a particular population size, only alleles that bear a large
enough selection coefficient (picture severity of disease) will be
strongly selected against.
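Both perspectives can be read off one comparison. The slides give no formula, so this sketch uses a widely quoted order-of-magnitude rule of thumb, stated here as an assumption: selection tends to dominate drift when the selection coefficient |s| is well above 1/(2N). The function names are hypothetical.

```python
def selection_is_effective(s, n_individuals):
    """Rule-of-thumb check only: True when |s| exceeds 1/(2N)."""
    return abs(s) > 1.0 / (2 * n_individuals)

def minimum_population_size(s):
    """Population size at which |s| equals 1/(2N) under the same rule of thumb."""
    return 1.0 / (2 * abs(s))

# Same selection coefficient, two population sizes.
print(selection_is_effective(0.01, 25), selection_is_effective(0.01, 250))
# Same population size, two selection coefficients.
print(selection_is_effective(0.001, 250), selection_is_effective(0.1, 250))
print(minimum_population_size(0.01))
```

The threshold is a guide to orders of magnitude, not a sharp boundary.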
Linkage disequilibrium: the big (and oversimplified) picture
A new mutation! (on the "red" chromosome)
Eager geneticist obtains samples from multiple affected individuals
• Small number (maybe one) of ancestral disease-causing mutations
• Isolation of chromosome bearing disease-causing mutation
• "Reasonable" opportunity for recombination during population history
• (Think Finland: 1000 founders 2000 years ago; consistent expansion)
• Few (maybe none) reoccurrences of disease-causing mutation
LD and time: history at work
Do we care about:
The age of the mutation or the age of the founding population?
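The role of time can be made concrete with the standard result that linkage disequilibrium between two sites decays by a factor of (1 - r) per generation, where r is the recombination fraction between them. The sketch below applies it to the Finland example; the 25-year generation time is an assumption made here, as are the function name and the chosen values of r.

```python
def ld_remaining(r, generations):
    """Fraction of the initial LD (D_t / D_0) left after the given number of generations."""
    return (1.0 - r) ** generations

ASSUMED_YEARS_PER_GENERATION = 25
GENERATIONS = 2000 // ASSUMED_YEARS_PER_GENERATION  # 2000 years since founding

for r in (0.001, 0.01, 0.1):
    print(r, round(ld_remaining(r, GENERATIONS), 4))
```

Markers very close to the mutation keep most of their association over this span, while distant ones lose nearly all of it, which is what makes the founding history usable for mapping.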
Two common types of DNA variants
DNA haplotype
• Haplotype = a series of marker alleles on a chromosome (DNA molecule)
• E.g.: DNA sequence, a series of SNPs or microsatellites along a chromosome.
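With haplotypes written as strings of marker alleles, linkage disequilibrium between two sites can be measured as D = f(AB) - f(A) x f(B), the standard two-site coefficient. A minimal sketch, assuming two-character haplotypes at two biallelic SNPs; the sample data and the function name are invented for illustration.

```python
from collections import Counter

def ld_coefficient(haplotypes):
    """D = f(AB) - f(A) * f(B), taking the first haplotype's alleles as A and B."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    allele_a, allele_b = haplotypes[0][0], haplotypes[0][1]
    f_ab = counts[allele_a + allele_b] / n
    f_a = sum(c for h, c in counts.items() if h[0] == allele_a) / n
    f_b = sum(c for h, c in counts.items() if h[1] == allele_b) / n
    return f_ab - f_a * f_b

# Made-up samples: complete association versus all four combinations equally common.
print(ld_coefficient(["TA"] * 5 + ["CG"] * 5))
print(ld_coefficient(["TA", "TG", "CA", "CG"]))
```

D is zero at linkage equilibrium, when knowing the allele at one site says nothing about the other.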