Transcript A 1

Computational Genetics
Lecture 1
Background Readings: Chapters 2 & 3 of An introduction to Genetics,
Griffiths et al. 2000, Seventh Edition (CS/Fishbach/Other libraries).
This class has been edited from several sources, primarily from Terry Speed’s homepage at
Stanford and the Technion course “Introduction to Genetics”. Changes made by Dan Geiger.
Human Genome
Most human cells contain 46 chromosomes:
22 pairs of chromosomes, named autosomes.
2 sex chromosomes (X, Y): XY in males, XX in females.
Genetic Information
Gene – the basic unit of genetic information. Genes determine
the inherited characters.
Genome – the collection of genetic information.
Chromosomes – storage units of genes.
Sexual Reproduction
[Figure: meiosis produces the gametes (egg and sperm), which unite to form the zygote.]
The Double Helix
[Figure: the double helix. Source: Alberts et al.]
Chromosome Logical Structure
Marker – Genes, SNP, Tandem repeats.
Locus – location of markers.
Allele – one variant form of a marker.
Locus 1 – possible alleles: A1, A2
Locus 2 – possible alleles: B1, B2, B3
Alleles - the ABO locus example
Phenotype    Genotype
A            A/A, A/O
B            B/B, B/O
AB           A/B
O            O/O
O is recessive to A.
A is dominant over O.
A and B are codominant.
Multiple alleles: A,B,O.
Trait = Character = Phenotype
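The genotype-to-phenotype table above can be sketched as a small function. This is only an illustration; the name abo_phenotype is made up for this example and does not come from the lecture.

```python
# Map an unordered ABO genotype to its blood-group phenotype.
# A and B are codominant; O is masked whenever A or B is present.

def abo_phenotype(allele1: str, allele2: str) -> str:
    alleles = {allele1, allele2}
    if not alleles <= {"A", "B", "O"}:
        raise ValueError(f"unknown ABO allele in {allele1}/{allele2}")
    expressed = alleles - {"O"}          # O is recessive
    if not expressed:
        return "O"                       # only O/O shows the O phenotype
    return "".join(sorted(expressed))    # "A", "B" or "AB"


for genotype in [("A", "A"), ("A", "O"), ("B", "O"), ("A", "B"), ("O", "O")]:
    print("/".join(genotype), "->", abo_phenotype(*genotype))
```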
X-linked
[Figure: genotypes and the corresponding phenotypes.]
b – dominant allele. Namely, (b,b) and (b,w) are Black.
w – recessive allele. Namely, only (w,w) is White.
This is an example of an X-linked trait/character.
For males, b alone is Black and w alone is White.
There is no homologous gene on the Y chromosome.
Mendel’s Work
Modern genetics began with Mendel’s experiments on garden
peas (although the ramifications of his work were not realized
during his lifetime). He studied seven contrasting pairs of
characters, including:
The form of ripe seeds: round, wrinkled
The color of the seed albumen: yellow, green
The length of the stem: long, short
Mendel, Gregor. 1866. Experiments on
Plant Hybridization. Transactions of the
Brünn Natural History Society.
Mendel’s first law
Characters are controlled by pairs of genes which
separate during the formation of the reproductive
cells (meiosis).
[Diagram: an Aa individual produces gametes A and a.]
P: AA X aa
F1: Aa

F1 X F1: Aa X Aa
Gametes: A, a (from each parent)
        A    a
   A    AA   Aa
   a    Aa   aa
F2: 1 AA : 2 Aa : 1 aa
Phenotype: 3 A : 1 a

Test cross: Aa X aa
Gametes: A, a (from Aa); a (from aa)
        A    a
   a    Aa   aa
Phenotype: 1 A : 1 a
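The two crosses above can be reproduced by enumerating the gametes of each parent. The following is a minimal sketch; the function name cross and the two-letter genotype encoding are choices made for this example only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1: str, parent2: str) -> dict:
    """Offspring genotype probabilities for a one-locus cross.

    Each parent is a two-letter genotype such as "Aa"; each of its two
    alleles is passed on with probability 1/2 (Mendel's first law).
    """
    counts = Counter(
        "".join(sorted(g1 + g2))   # "aA" and "Aa" are the same genotype
        for g1, g2 in product(parent1, parent2)
    )
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


print(cross("AA", "aa"))   # P cross, gives the F1 generation
print(cross("Aa", "Aa"))   # F1 X F1, gives the F2 generation
print(cross("Aa", "aa"))   # test cross
```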
Mendel’s second law
When two or more pairs of genes segregate
simultaneously, they do so independently.
Parent genotype: A a; B b
Gamete AB: P_AB = P_A x P_B
Gamete Ab: P_Ab = P_A x P_b
Gamete aB: P_aB = P_a x P_B
Gamete ab: P_ab = P_a x P_b
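As a small illustration of the product rule for independently segregating loci, the gamete probabilities of a doubly heterozygous parent can be enumerated as follows. The function name gamete_probabilities is hypothetical, and the sketch assumes each allele at a locus is transmitted with probability 1/2.

```python
from fractions import Fraction
from itertools import product


def gamete_probabilities(locus1: str, locus2: str) -> dict:
    """Gamete probabilities for two independently segregating loci.

    Each locus is given as a two-letter genotype, e.g. "Aa" and "Bb".
    """
    half = Fraction(1, 2)
    probs = {}
    for allele1, allele2 in product(locus1, locus2):
        gamete = allele1 + allele2
        # Independence: multiply the per-locus transmission probabilities.
        probs[gamete] = probs.get(gamete, 0) + half * half
    return probs


print(gamete_probabilities("Aa", "Bb"))
```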
Recombination Phenomenon
(Happens during meiosis)
[Figure: recombination in a male or female parent; the resulting haplotypes are carried by the gametes (egg or sperm).]
The recombination fraction between two loci on the same chromosome is the
probability that they end up in regions of different colors.
Example: ABO, AK1 on Chromosome 9
[Figure: pedigree of five individuals (1 to 5) typed at the ABO locus (phenotypes A and O) and the AK1 locus (genotypes A1/A1, A1/A2, A2/A2), with the phase inferred and one recombinant offspring marked.]
The Hardy-Weinberg law of population genetics permits calculation of
genotype frequencies from allele frequencies.
P(a) = frequency of allele “a” in the population
P(ab) = 2P(a)P(b) for a heterozygote a/b, and P(aa) = P(a)^2 for a homozygote.
Hardy-Weinberg equilibrium corresponds to a random union of
The recombination fraction is 12/100 in males and 20/100 in females.
One centimorgan means one recombination every 100 meioses.
One centimorgan corresponds to approximately 1 million nucleotides (with
large variance), depending on location and sex.
Conventions
Maximum Likelihood Principle
What is the probability of the data
for this pedigree, assuming a
recessive mutation?
What is the probability of the data
for this pedigree, assuming a
dominant mutation?
Maximum likelihood principle: Choose the model that
maximizes the probability of the data.
Linkage Equilibrium
Linkage equilibrium: the haplotype frequency is the
product of the underlying alleles’ frequencies (independence).
Exceptions occur for tightly linked loci.
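Under linkage equilibrium, haplotype frequencies follow directly from allele frequencies. The sketch below uses the allele names A1, A2 and B1, B2, B3 from the earlier slide; the numeric allele frequencies are invented purely for illustration.

```python
def equilibrium_haplotype_freqs(freqs1: dict, freqs2: dict) -> dict:
    """Haplotype frequencies under linkage equilibrium.

    Each haplotype frequency is the product of its two allele frequencies.
    """
    return {
        (allele1, allele2): p * q
        for allele1, p in freqs1.items()
        for allele2, q in freqs2.items()
    }


# Hypothetical allele frequencies (not taken from the lecture).
locus1 = {"A1": 0.6, "A2": 0.4}
locus2 = {"B1": 0.5, "B2": 0.3, "B3": 0.2}

haplotypes = equilibrium_haplotype_freqs(locus1, locus2)
for haplotype, freq in haplotypes.items():
    print(haplotype, round(freq, 3))
```

Observed haplotype frequencies that differ clearly from these products would indicate a departure from linkage equilibrium.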
One locus: founder probabilities
Founders are individuals whose parents are not in the pedigree. They may or
may not be typed (namely, have their genotype measured). Either way, we need to
assign probabilities to their actual or possible genotypes.
This is usually done by assuming Hardy-Weinberg equilibrium (H-W). If the
frequency of D is .01, then H-W says:
[Pedigree: founder 1 is Dd.]
pr(Dd) = 2 x .01 x .99
Genotypes of founder couples are (usually) treated as independent.
[Pedigree: founder 1 (pop) is Dd, founder 2 (mom) is dd.]
pr(pop Dd, mom dd) = (2 x .01 x .99) x (.99)^2
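The founder probabilities above can be checked numerically. This is a minimal sketch assuming Hardy-Weinberg equilibrium and independent founders; the function name hardy_weinberg is chosen for this example.

```python
def hardy_weinberg(freq_D: float) -> dict:
    """Genotype probabilities at a two-allele locus under H-W equilibrium."""
    freq_d = 1.0 - freq_D
    return {
        "DD": freq_D ** 2,
        "Dd": 2 * freq_D * freq_d,   # two ways to be heterozygous
        "dd": freq_d ** 2,
    }


founder = hardy_weinberg(0.01)

# Single founder with genotype Dd.
pr_pop_Dd = founder["Dd"]

# Founder couple, treated as independent: pop Dd and mom dd.
pr_couple = founder["Dd"] * founder["dd"]

print(pr_pop_Dd, pr_couple)
```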
One locus: transmission probabilities
Children get their genes from their parents’ genes,
independently, according to Mendel’s laws; also
independently for different children.
[Pedigree: pop 1 is Dd, mom 2 is Dd, kid 3 is dd.]
pr(kid 3 dd | pop 1 Dd & mom 2 Dd)
= 1/2 x 1/2
One locus: transmission probabilities - II
[Pedigree: parents 1 and 2 are both Dd; their children are 3 (dd), 4 (Dd) and 5 (DD).]
pr(3 dd & 4 Dd & 5 DD | 1 Dd & 2 Dd)
= (1/2 x 1/2) x (2 x 1/2 x 1/2) x (1/2 x 1/2).
The factor 2 comes from summing over the two mutually
exclusive and equiprobable ways 4 can get a D and a d.
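The transmission probabilities for this family can be reproduced by counting the equiprobable ways each child genotype can arise. The sketch below is illustrative; the function name transmission and the two-letter genotype encoding are assumptions of the example.

```python
from fractions import Fraction
from itertools import product


def transmission(child: str, father: str, mother: str) -> Fraction:
    """pr(child genotype | parental genotypes) under Mendel's first law."""
    target = sorted(child)
    # Each of the 4 (paternal allele, maternal allele) pairs has probability 1/4.
    ways = sum(1 for f, m in product(father, mother) if sorted(f + m) == target)
    return Fraction(ways, 4)


# Children are independent given the parents, so the probabilities multiply.
joint = Fraction(1)
for kid in ["dd", "Dd", "DD"]:          # children 3, 4 and 5
    joint *= transmission(kid, "Dd", "Dd")

print(transmission("dd", "Dd", "Dd"), joint)
```

The factor 2 for child 4 appears here as two matching allele pairs out of four.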
One locus: penetrance probabilities
Pedigree analyses usually suppose that, given the genotype at all loci,
and in some cases age and sex, the chance of having a particular
phenotype depends only on genotype at one locus, and is independent
of all other factors: genotypes at other loci, environment, genotypes and
phenotypes of relatives, etc.
Complete penetrance:
pr(affected | DD) = 1
Incomplete penetrance:
pr(affected | DD) = .8
One locus: penetrance - II
Age- and sex-dependent penetrance (liability classes):
pr(affected | DD, male, 45 y.o.) = .6
One locus: putting it all together
[Pedigree: parents 1 and 2 are both Dd; their children are 3 (dd), 4 (Dd) and 5 (DD).]
Assume penetrances pr(affected | dd) = .1, pr(affected | Dd) = .3,
pr(affected | DD) = .8, and that allele D has frequency .01.
The probability of the data for this pedigree, assuming penetrances of
.1 for dd and .3 for Dd, is the product:
(2 x .01 x .99 x .7) x (2 x .01 x .99 x .3) x (1/2 x 1/2 x .9) x (2
x 1/2 x 1/2 x .7) x (1/2 x 1/2 x .8)
Here .7 = 1 - .3 and .9 = 1 - .1 are the probabilities of being unaffected.
This is a function of the penetrances. By the maximum likelihood
principle, the values of the penetrances that maximize this
probability are the ML estimates.
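The product above can be written as a function of the two penetrances and maximized numerically. The following is a rough sketch: the affection statuses are read off the factors of the product (an assumption of this example), and the grid search is a crude stand-in for a proper optimizer.

```python
def pedigree_likelihood(pen_dd, pen_Dd, pen_DD=0.8, freq_D=0.01):
    """Probability of the pedigree data as a function of the penetrances.

    Assumed statuses (inferred from the factors .7, .3, .9, .7, .8):
    one Dd parent unaffected, the other affected; child 3 (dd) and
    child 4 (Dd) unaffected; child 5 (DD) affected.
    """
    hw_Dd = 2 * freq_D * (1 - freq_D)         # H-W pr(Dd) for a founder
    parent1 = hw_Dd * (1 - pen_Dd)            # Dd founder, unaffected
    parent2 = hw_Dd * pen_Dd                  # Dd founder, affected
    child3 = (0.5 * 0.5) * (1 - pen_dd)       # dd child, unaffected
    child4 = (2 * 0.5 * 0.5) * (1 - pen_Dd)   # Dd child, unaffected
    child5 = (0.5 * 0.5) * pen_DD             # DD child, affected
    return parent1 * parent2 * child3 * child4 * child5


# Value at the penetrances assumed in the text.
likelihood = pedigree_likelihood(pen_dd=0.1, pen_Dd=0.3)

# Crude grid search for the ML estimates of the two penetrances.
grid = [i / 100 for i in range(101)]
ml_pen_dd, ml_pen_Dd = max(
    ((x, y) for x in grid for y in grid),
    key=lambda pair: pedigree_likelihood(*pair),
)
print(likelihood, ml_pen_dd, ml_pen_Dd)
```

With a single small pedigree such estimates are very rough; in practice the likelihood would be combined over many families.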
Tutorial #2
by Ma’ayan Fishelson
Crossing Over
Sometimes in meiosis, homologous chromosomes
exchange parts in a process called crossing-over.
New combinations are obtained, called the
crossover products.
Recombination During Meiosis
Recombinant gametes
Linkage
Two genes on separate chromosomes assort independently at meiosis.
Two genes far apart on the same chromosome can also assort independently at meiosis.
Two genes close together on the same chromosome pair do not assort independently at meiosis.
A recombination frequency well below 50% between two genes shows that they are linked.
Two Loci Inheritance
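[Figure: two-locus pedigree of six individuals. 1: A A / B B; 2: a a / b b; 3: A a / B b; 4: A a / b b; offspring 5 and 6: a a / b b and A a / B b, with one offspring marked Recombinant.]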
Linkage Maps
Let U and V be 2 genes on the same chromosome.
In every meiosis, chromatids cross over at random along the chromosome.
If the chromatids cross over between U & V, then a recombinant is produced.
The farther apart U & V are, the greater the chance that a crossing over
occurs between them, and hence the greater the chance of recombination
between them.
Recombination Fraction
• The recombination fraction θ between two loci
is the fraction of meioses in which a recombination
occurs between the two loci.
• θ is a monotone, nonlinear function of the
physical distance separating the loci
on the chromosome.
(Linkage) 0 ≤ θ = P(Recombination) ≤ 0.5 (No linkage)
Centimorgan (cM)
1
cM (or 1 genetic map unit, m.u.) is the distance
between genes for which the recombination
frequency is 1%.
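To illustrate how the recombination fraction can be a monotone, nonlinear function of distance that never exceeds 0.5, here is a sketch using Haldane's map function. This particular function is not named in the lecture; it is brought in as one common choice, and it assumes crossovers occur without interference.

```python
import math


def haldane_theta(distance_cM: float) -> float:
    """Recombination fraction for a map distance under Haldane's map function.

    Assumes no interference; distance is given in centimorgans.
    """
    d = distance_cM / 100.0                   # convert cM to Morgans
    return 0.5 * (1.0 - math.exp(-2.0 * d))


for cM in [0, 1, 10, 50, 100, 300]:
    print(cM, "cM ->", round(haldane_theta(cM), 4))
```

For short distances the value is close to 1% per cM, and for long distances it levels off below 0.5, matching the no-linkage bound.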
Interference
Crossovers in adjacent chromosome regions are
usually not independent. This interaction is called
interference.
A crossover in one region usually decreases the
probability of a crossover in an adjacent region.
Interference (I) = 1 - (observed # of double recombinants) / (expected # of double recombinants)
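The interference formula can be sketched numerically as follows. The example assumes that the expected number of double recombinants is computed under independence, as the product of the two recombination fractions times the number of offspring; all counts are invented for illustration.

```python
def interference(observed_doubles: int, n_offspring: int,
                 theta1: float, theta2: float) -> float:
    """Interference I = 1 - observed / expected double recombinants.

    theta1 and theta2 are the recombination fractions of two adjacent
    regions; under independence the expected number of double
    recombinants is theta1 * theta2 * n_offspring.
    """
    expected_doubles = theta1 * theta2 * n_offspring
    return 1.0 - observed_doubles / expected_doubles


# Hypothetical data: 1000 offspring, 8 double recombinants observed.
I = interference(observed_doubles=8, n_offspring=1000, theta1=0.1, theta2=0.2)
print(I)
```

A value of I near 0 would mean the crossovers behave independently, while a value near 1 would mean a crossover in one region almost completely suppresses one in the other.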
Building Genetic Maps
At first: only genes with variant alleles producing
detectably different phenotypes were used as markers
for mapping.
Problem: the chromosomal intervals between the genes
were too large, so the resolution of the maps wasn’t high
enough.
Solution: use of molecular markers (a site of
heterozygosity for some type of silent DNA variation
not associated with any measurable phenotypic
variation).
Linkage Mapping by Recombination in Humans
Problems:
It’s impossible to make controlled crosses in humans.
Human progenies are rather small.
The human genome is immense. The distances between genes are large on average.
Lod Score for Linkage Testing by
Pedigrees
The results of many identical matings are combined to get
a more reliable estimate of the recombination fraction.
1. Calculate the probabilities of obtaining a set of results in a family
on the basis of (a) independent assortment and (b) a specific
degree of linkage.
2. Calculate the Lod score = log10(b/a).
A Lod score of 3 is considered convincing
support for a specific recombination fraction.
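The two steps above can be sketched for the simplest case. This example assumes phase-known, fully informative meioses, so that the probability of the results at recombination fraction theta is theta^r x (1 - theta)^(n - r); the function name lod_score and the counts are illustrative only.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """Lod score log10(b/a) for r recombinants out of n scored meioses.

    (a) independent assortment corresponds to theta = 0.5;
    (b) linkage at the given theta, with 0 < theta < 1.
    """
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    nonrecombinants = meioses - recombinants
    log_b = (recombinants * math.log10(theta)
             + nonrecombinants * math.log10(1.0 - theta))
    log_a = meioses * math.log10(0.5)
    return log_b - log_a


# Hypothetical family data: 1 recombinant among 10 meioses.
for theta in [0.05, 0.1, 0.2, 0.3, 0.5]:
    print(theta, round(lod_score(1, 10, theta), 3))
```

Because lod scores are logarithms, scores from independent families can be added, which is how results from many matings are combined.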