Transcript probability

Probability
• Probability: what is the chance that a given
event will occur? For us, what is the chance that
a child, or a family of children, will have a given
phenotype?
• Probability is expressed in numbers between 0
and 1. Probability = 0 means the event never
happens; probability = 1 means it always
happens.
• The total probability of all possible events always
sums to 1.
Definition of Probability
• The probability of an event equals the number of times it
happens divided by the number of opportunities.
• These numbers can be determined by experiment or by
knowledge of the system.
• For instance, consider rolling a die (singular of dice). The chance
of rolling a 2 is 1/6, because there is a 2 on one face and
a total of 6 faces. So, assuming the die is balanced, a 2
will come up 1 time in 6.
• It is also possible to determine probability by experiment:
if the die were unbalanced (loaded = cheating), you
could roll it hundreds or thousands of times to get the
actual probability of getting a 2. For a fair die, the
experimentally determined number should be quite close
to 1/6, especially with many rolls.
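The experimental approach can be sketched in Python. This is only an illustration, assuming a fair die simulated with the standard random module; the function name estimate_probability, the fixed seed, and the number of rolls are choices made for this sketch, not part of the lecture.

```python
import random

def estimate_probability(target, rolls, seed=0):
    """Estimate the chance of rolling `target` on a simulated fair die."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    hits = sum(1 for _ in range(rolls) if rng.randint(1, 6) == target)
    # number of times it happened / number of opportunities
    return hits / rolls

estimate = estimate_probability(target=2, rolls=60_000)
print("estimated:", estimate, "exact:", 1 / 6)
```

With more rolls the estimate should settle closer to 1/6; with only a handful of rolls it can be far off.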
The AND Rule of Probability
• The probability of 2 independent events both happening
is the product of their individual probabilities.
• Called the AND rule because “this event happens AND
that event happens”.
• For example, what is the probability of rolling a 2 on one
die and a 2 on a second die? For each event, the
probability is 1/6, so the probability of both happening is
1/6 x 1/6 = 1/36.
• Note that the events have to be independent: they can’t
affect each other’s probability of occurring. An example
of non-independence: you have a hat with a red ball and
a green ball in it. The probability of drawing out the red
ball is 1/2, same as the chance of drawing a green ball.
However, once you draw the red ball out, the chance of
getting another red ball is 0 and the chance of a green
ball is 1.
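The AND rule, and the way it breaks down for non-independent events, can be checked by listing every outcome. The sketch below uses exact fractions; the variable names are made up for this illustration, and drawing both balls without putting the first one back is the assumed reading of the hat example.

```python
from fractions import Fraction
from itertools import permutations, product

# Two dice: 36 equally likely (die A, die B) pairs.
outcomes = list(product(range(1, 7), repeat=2))
p_counted = Fraction(sum(1 for o in outcomes if o == (2, 2)), len(outcomes))
p_and_rule = Fraction(1, 6) * Fraction(1, 6)
print(p_counted, p_and_rule)  # 1/36 1/36

# Hat with one red and one green ball, drawn without replacement.
draws = list(permutations(["red", "green"]))  # the 2 possible orders
p_red_then_red = Fraction(sum(1 for d in draws if d == ("red", "red")), len(draws))
naive_and_rule = Fraction(1, 2) * Fraction(1, 2)  # wrong here: draws are not independent
print(p_red_then_red, naive_and_rule)  # 0 1/4
```

The two dice agree with the AND rule because one die cannot affect the other; the hat does not, because the first draw changes what is left for the second.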
The OR Rule of Probability
• The probability that either one of 2
mutually exclusive events (events that
cannot both happen at once) will occur is
the sum of their separate probabilities.
• For example, the chance of rolling either a
2 or a 3 on a die is 1/6 + 1/6 = 1/3.
NOT Rule
• The chance of an event not happening is 1
minus the chance of it happening.
• For example, the chance of not getting a 2
on a die is 1 - 1/6 = 5/6.
• This rule can be very useful. Sometimes
complicated problems are greatly
simplified by examining them backwards.
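A short sketch of the OR and NOT rules with exact fractions. The "at least one 2 in three rolls" question is a hypothetical example added here to show what working backwards looks like; it does not come from the slides.

```python
from fractions import Fraction
from itertools import product

p_two = Fraction(1, 6)
p_two_or_three = Fraction(1, 6) + Fraction(1, 6)  # OR rule: 2 or 3 on one die
p_not_two = 1 - p_two                             # NOT rule: anything but a 2

# Backwards: "at least one 2 in three rolls" is NOT "no 2 on any of the three rolls".
p_at_least_one_two = 1 - p_not_two ** 3

# Check by listing all 6 * 6 * 6 = 216 possible triples of rolls.
triples = list(product(range(1, 7), repeat=3))
p_counted = Fraction(sum(1 for t in triples if 2 in t), len(triples))

print(p_two_or_three, p_not_two, p_at_least_one_two, p_counted)
# 1/3 5/6 91/216 91/216
```

Working forwards would mean adding up "exactly one 2", "exactly two 2s", and "three 2s" separately; the NOT rule gets the same answer in one step.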
Combining the Rules
• More complicated situations involve combining the AND
and OR rules.
• It is very important to keep track of the individuals
involved and not allow them to be confused. This is the
source of most people’s problems with probability.
• What is the chance of rolling 2 dice and getting a 2 and a
5? The trick is, there are 2 ways to accomplish this: a 2
on die A and a 5 on die B, or a 5 on die A and a 2 on die
B. Each possibility has a 1/36 chance of occurring, and
you want either one or the other of the 2 events, so the
final probability is 1/36 + 1/36 = 2/36 = 1/18.
Getting a 7 on Two Dice
• There are 6 different ways of
getting two dice to sum to 7:
• In each case, the probability of
getting the required number on
a single die is 1/6.
• To get both numbers (so they
add to 7), the probability uses
the AND rule: 1/6 x 1/6 = 1/36.
• To sum up the 6 possibilities,
use the OR rule: only 1 of the 6
events can occur, but you don’t
care which one.
• 6/36 = 1/6
die A   die B   prob
1       6       1/36
2       5       1/36
3       4       1/36
4       3       1/36
5       2       1/36
6       1       1/36
total           6/36
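The count of 6 ways to roll a 7 can be reproduced by listing all die A / die B pairs. This is a sketch using exact fractions; the variable names are chosen for this illustration only.

```python
from fractions import Fraction
from itertools import product

# Every (die A, die B) pair whose faces add to 7.
sevens = [(a, b) for a, b in product(range(1, 7), repeat=2) if a + b == 7]

p_each = Fraction(1, 6) * Fraction(1, 6)  # AND rule for one specific pair
p_seven = p_each * len(sevens)            # OR rule: add the 6 exclusive pairs

print(sevens)   # [(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)]
print(p_seven)  # 1/6
```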
Probability and Genetics
• The probability that any individual child
has a certain genotype is calculated using
Punnett squares.
• We are interested in calculating the
probability of a given distribution of
phenotypes in a family of children.
• This is calculated using the rules of
probability.
Sex Ratio in a Family of 3
• Assume that the probability of
a boy = 1/2 and the probability
of a girl = 1/2.
• Enumerate each child
separately for each of the 8
possible families.
• Each family has a probability
of 1/8 of occurring ( 1/2 x 1/2 x
1/2).
• Chance of 2 boys + 1 girl.
There are 3 families in which
this occurs: BBG, BGB, and
GBB. Thus, the chance is 1/8
+ 1/8 + 1/8 = 3/8.
child #1   child #2   child #3
B          B          B
B          B          G
B          G          B
B          G          G
G          B          B
G          B          G
G          G          B
G          G          G
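The list of 8 families can be generated rather than written by hand. A minimal sketch, assuming boys and girls are equally likely as stated above; the variable names are invented for this example.

```python
from fractions import Fraction
from itertools import product

# Every birth order for 3 children: BBB, BBG, BGB, ...
families = ["".join(f) for f in product("BG", repeat=3)]
p_family = Fraction(1, 2) ** 3  # AND rule: each specific family is 1/8

two_boys_one_girl = [f for f in families if f.count("B") == 2]
p_two_boys = p_family * len(two_boys_one_girl)  # OR rule over the matching families

print(families)
print(two_boys_one_girl, p_two_boys)  # ['BBG', 'BGB', 'GBB'] 3/8
```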
Different Probabilities for Different
Phenotypes
• Once again, a family of 3
children, but this time the
parents are heterozygous
for Tay-Sachs, a recessive
genetic disease. Each child
thus has a 3/4 chance of
being normal (TT or Tt) and
a 1/4 chance of having the
disease (tt).
• Now, the chances for
different kinds of families
are different.
• Chance of all 3 normal =
27/64. Chance of all 3 with
disease = 1/64.
child #1   child #2   child #3   total prob
T (3/4)    T (3/4)    T (3/4)    27/64
T (3/4)    T (3/4)    t (1/4)    9/64
T (3/4)    t (1/4)    T (3/4)    9/64
T (3/4)    t (1/4)    t (1/4)    3/64
t (1/4)    T (3/4)    T (3/4)    9/64
t (1/4)    t (1/4)    T (3/4)    3/64
t (1/4)    T (3/4)    t (1/4)    3/64
t (1/4)    t (1/4)    t (1/4)    1/64
Different Probabilities for Different
Phenotypes, p. 2
• Chance of 2 normal + 1 with disease: there are 3
families of this type, each with probability 9/64. So, 9/64
+ 9/64 + 9/64 = 27/64.
• Chance of “at least” one normal child. This means 1
normal or 2 normal or 3 normal. Need to figure each
part separately, then add them.
--1 normal + 2 diseased = 3/64 + 3/64 + 3/64 = 9/64.
--2 normal + 1 diseased = 27/64 (see above)
-- 3 normal = 27/64
--Sum = 9/64 + 27/64 + 27/64 = 63/64.
• This could also be done with the NOT rule: “at least 1
normal” is the same as “NOT all 3 diseased”. The
chance of all 3 diseased is 1/64, so the chance of at least 1
normal is 1 - 1/64 = 63/64.
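Both routes to 63/64 can be checked by enumerating the 8 families with their unequal probabilities. This is a sketch only; the labels "N" (normal) and "D" (has the disease) and the helper family_prob are names made up for this illustration.

```python
from fractions import Fraction
from itertools import product

P = {"N": Fraction(3, 4), "D": Fraction(1, 4)}  # per-child phenotype chances

def family_prob(family):
    """AND rule: multiply the chances of each child in birth order."""
    prob = Fraction(1)
    for child in family:
        prob *= P[child]
    return prob

# OR rule: add up the families that share the same number of normal children.
by_normal_count = {k: Fraction(0) for k in range(4)}
for fam in product("ND", repeat=3):
    by_normal_count[fam.count("N")] += family_prob(fam)

at_least_one_added = by_normal_count[1] + by_normal_count[2] + by_normal_count[3]
at_least_one_not_rule = 1 - by_normal_count[0]

print(by_normal_count[2])                         # 27/64
print(at_least_one_added, at_least_one_not_rule)  # 63/64 63/64
```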
Larger Families: Binomial
Distribution
• The basic method of examining all
possible families and counting the
ones of the proper type gets
unwieldy with big families.
• The binomial distribution is a
shortcut method based on the
expansion of the equation
below, where p = probability of one
event (say, a normal child), and q =
probability of the alternative event
(mutant child). n is the number of
children in the family.
• Since p + q = 1, and 1 raised to any
power (multiplied by itself) is always
equal to 1, this equation describes the
probability of any size family.
(p + q)^n = 1
Binomial for a Family of 2
• The expansion of the binomial for n = 2 is shown. The 3 terms
represent the 3 different kinds of families: p^2 is families with 2
normal children, 2pq is the families with 1 normal and 1 mutant child,
and q^2 is the families with 2 mutant children.
• The coefficients in front of these terms: 1, 2, and 1, are the number
of different families of the given type. Thus there are 2 different
families with 1 normal plus 1 mutant child: normal born first and
mutant born second, or mutant born first and normal born second.
• As before, p = 3/4 and q = 1/4.
• Chance of 2 normal children = p^2 = (3/4)^2 = 9/16.
• Chance of 1 normal plus 1 mutant = 2pq = 2 * 3/4 * 1/4 = 6/16 = 3/8.
p^2 + 2pq + q^2
Binomial for a Family of 3
• Here, p^3 is a family of 3 normal children, 3p^2q is 2 normal
plus 1 affected, 3pq^2 is 1 normal plus 2 affected, and q^3
is 3 affected.
• The exponents on the p and q represent the number of
children of each type.
• The coefficients are the number of families of that type.
• Chance of 2 normal + 1 affected is described by the term
3p^2q. Thus, 3 * (3/4)^2 * 1/4 = 27/64. Same as we got by
enumerating the families in a list.
p^3 + 3p^2q + 3pq^2 + q^3 = 1
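The four terms of the n = 3 expansion can be evaluated numerically to confirm that they match the enumerated families and add up to 1. A minimal sketch with p = 3/4 and q = 1/4 as before; the dictionary layout is just one convenient way to label the terms.

```python
from fractions import Fraction

p, q = Fraction(3, 4), Fraction(1, 4)

terms = {
    "3 normal": p**3,
    "2 normal + 1 affected": 3 * p**2 * q,
    "1 normal + 2 affected": 3 * p * q**2,
    "3 affected": q**3,
}

for label, value in terms.items():
    print(label, value)
print("total", sum(terms.values()))  # total 1
```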
Larger Families
• To write the terms of the binomial expansion for
larger families, you need to get the exponents
and the coefficients.
• Exponents are easy: you just systematically vary
the exponents on p and q so they always add to
n. Start with p^n q^0 (remembering that anything to
the 0 power = 1), do p^(n-1) q^1, then p^(n-2) q^2, etc.
• Coefficients require a bit more work. There are
several methods for finding them. I am going to
show you Pascal’s Triangle, but other methods
are also commonly used.
Pascal’s Triangle
• Pascal’s Triangle is a simple
way of finding the coefficients
for the binomial.
• Start by writing the coefficients
for n = 1: 1 1.
• Below this, the coefficients for
n = 2 are found by putting 1’s
on the outside and adding up
adjacent coefficients from the
line above: 1, 1 + 1 = 2, 1.
• Next line goes the same way:
write 1’s on the outsides, then
add up adjacent coefficients
from the line above: 1, 1+2 = 3,
2+1 = 3, 1.
• For n = 5, coefficients are 1, 5,
10, 10, 5, 1.
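The row-by-row construction can be sketched as a short function. The name pascal_row is invented for this example, and it assumes n is at least 1, matching the starting row 1 1 used above.

```python
def pascal_row(n):
    """Binomial coefficients for (p + q)^n, built row by row (n >= 1)."""
    row = [1, 1]  # coefficients for n = 1
    for _ in range(n - 1):
        # add up adjacent coefficients from the line above
        inner = [a + b for a, b in zip(row, row[1:])]
        # put 1's on the outside
        row = [1] + inner + [1]
    return row

for n in range(1, 6):
    print(n, pascal_row(n))
# the last line printed is: 5 [1, 5, 10, 10, 5, 1]
```

The skipped row for n = 4 comes out as 1, 4, 6, 4, 1, which is what the n = 5 row is built from.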
More Pascal’s Triangle
• Now apply the coefficients to the terms. For n = 5, the
terms with appropriate exponents are p^5, p^4q, p^3q^2, p^2q^3,
pq^4, and q^5.
• The coefficients are 1, 5, 10, 10, 5, 1. So the final
equation is p^5 + 5p^4q + 10p^3q^2 + 10p^2q^3 + 5pq^4 + q^5 = 1.
• Using this: what is the chance of 3 normal plus 2
affected children? The relevant term is 10p^3q^2. The
exponents on p and q determine how many of each kind
of child is involved. The coefficient, 10, says that there
are 10 families of this type on the list of all possible
families.
• So, the chance of the desired family is 10p^3q^2 = 10 *
(3/4)^3 * (1/4)^2 = 10 * 27/64 * 1/16 = 270/1024.
More with this Example
• What is the chance of 1 normal plus 4 affected? The
relevant term is 5pq^4. So, the chance is 5 * (3/4) * (1/4)^4
= 5 * 3/4 * 1/256 = 15/1024.
• What is the chance of 1 normal or 2 normal? Sum of the
probabilities for 5pq^4 (1 normal) and 10p^2q^3 (2 normal) =
15/1024 + 90/1024 = 105/1024.
• What is the chance of at least 4 normal? This means 4
normal or 5 normal. Add them up.
• What is the chance of at least 1 normal? Easiest to do
with the NOT rule: 1 - chance of all affected.
• What is the chance that child #3 is normal? Trick
question. For any individual child, the probability is
always the simple probability from the Punnett square:
3/4 in this case.
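The questions for the family of 5 can be worked through in a few lines of Python. Note the swap: this sketch gets the coefficients from the standard library function math.comb ("n choose k") instead of Pascal's Triangle; both give 1, 5, 10, 10, 5, 1 for n = 5. The function name prob_normal is made up for this illustration.

```python
from fractions import Fraction
from math import comb

def prob_normal(k, n=5, p=Fraction(3, 4)):
    """Chance of exactly k normal children in a family of n: one binomial term."""
    q = 1 - p
    return comb(n, k) * p**k * q**(n - k)

three_normal = prob_normal(3)                 # 270/1024, printed reduced as 135/512
one_normal = prob_normal(1)                   # 15/1024
one_or_two = prob_normal(1) + prob_normal(2)  # 15/1024 + 90/1024 = 105/1024
at_least_four = prob_normal(4) + prob_normal(5)
at_least_one = 1 - prob_normal(0)             # NOT rule: 1 - chance of all affected

print(three_normal, one_normal, one_or_two, at_least_four, at_least_one)
# 135/512 15/1024 105/1024 81/128 1023/1024
```

Fraction always prints in lowest terms, so 270/1024 shows up as 135/512 and 648/1024 as 81/128.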