Example - University of Toronto

Download Report

Transcript Example - University of Toronto

Example
• Human males have one X-chromosome and one Y-chromosome,
while females have two X-chromosomes each chromosome being
inherited from one parent.
• Hemophilia is a disease that exhibits X-chromosome-linked
recessive inheritance, meaning that a male who inherits the gene that
cause the disease on the X-chromosome is affected, while a female
carrying the gene on only one of her two X-chromosomes is not
affected. The disease is generally fatal for woman who inherit two
such genes, and this is very rare, since the frequency of occurrence
of the gene is low in human populations.
week 2
1
• Consider a woman who has an affected brother, this implies that her
mother must be a carrier of the hemophilia gene with one ‘good’ and
one ‘bad’ X-chromosome. We are also told that her father is not
affected; therefore the woman herself has a fifty-fifty chance of
having the gene. The unknown quantity of interest is the state of the
women and it had two values: the woman is either a carrier of the
gene (θ=1) or not (θ=0). Based on the information provided so far,
the prior distribution for the unknown θ can simply be expressed as
Pr(θ = 1) = Pr(θ = 0) =½.
• The data used to update this prior information consist of the
affection status of the woman’s sons. Suppose she has two sons,
neither of whom is affected. Let yi = 1 or 0 denote an affected or
unaffected son respectively. We assume the two sons are not
identical twins and so their outcomes conditional on θ are
independent. The likelihood function is then…
week 2
2
• Bayes’ rule can be used to combine the information in the data with
the prior distribution; in particular we are interested in the posterior
probability that the woman is a carrier. This is given by…
• Intuitively it is clear that if a woman has unaffected sons, it is less
probable that she is a carrier, and Bayes’ rule provides a formal
mechanism for determining the extent of the correction.
week 2
3
Single-parameter models
• We now consider four fundamental and widely used onedimensional models. That is, models that have only a single
scalar parameter.
• These models are the binomial, normal, Poisson and
exponential.
week 2
4
Example: Bernoulli Model
• Suppose we observe a sample x1, x2 , ..., xn  from the Bernoulli(θ)
distribution with   0,1 unknown.
• In this model the aim is to estimate an unknown population
proportion from the result of a sequence of ‘Bernoulli trials’.
• The likelihood function for this model is:
• Suppose we choose the prior distribution on θ to be the Beta(α, β)
distribution.
• The posterior distribution is then…
week 2
5
• As a specific case, suppose we observe nx  10in a sample of n = 40
and that α = β = 1 (i.e. we have a uniform prior on θ). Then the
posterior of θ is given by the Beta(11,31) distribution. The plot of
the posterior and prior density looks like
Scatterplot of Prior, Posterior vs theta
Variable
Prior
Posterior
6
5
Y-Data
4
3
2
1
0
0.0
0.2
0.4
0.6
0.8
1.0
theta
• The spread of the posterior distribution gives us some idea of how
precise any probability statements we make about θ can be. Note how
much information the data have added as reflected in the above graph.
week 2
6
Example: Location Normal Model
• Suppose that x1, x2 , ..., xn  is a sample from an N  , 02  distribution where
  R is unknown and is known. The likelihood function is given by…
•
Suppose we take the prior distribution of μ to be the N 0 , 02  for some
specified choices of μ0 and  02 . The posterior density is then
proportional to…
• Expending the above term we get that the posterior distribution of μ is
the following normal distribution
1
 1


n
n
N   2  2   20  2
 0  0   0  0

week 2
1
1
n  
x ,  2  2 
   0  0  
7
• Note that the posterior mean is a weighted average of the prior mean
μ0 and the sample mean.
• Further, the posterior variance is smaller than the variance of the
sample mean. So if the information expressed by the prior is
accurate, inference about μ based on the posterior will be more
accurate than those based on the sample mean alone.
2
• Note, the more diffuse the prior is, i.e., the larger  0 is, the less
influence the prior has.
week 2
8