Transcript Week 3
The Likelihood Function - Introduction
• Recall: a statistical model for some data is a set $\{f_\theta : \theta \in \Omega\}$ of
distributions, one of which corresponds to the true unknown
distribution that produced the data.
• The distribution $f_\theta$ can be either a probability density function or a
probability mass function.
• The joint probability density function or probability mass function
of iid random variables X1, …, Xn is
$$f_\theta(x_1, \ldots, x_n) = \prod_{i=1}^{n} f_\theta(x_i).$$
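As a concrete illustration (mine, not on the slide): for iid Bernoulli(θ) tosses, the joint pmf factors as
$$f_\theta(x_1, \ldots, x_n) = \prod_{i=1}^{n} \theta^{x_i}(1-\theta)^{1-x_i} = \theta^{\sum_i x_i}(1-\theta)^{n - \sum_i x_i}, \qquad x_i \in \{0, 1\}.$$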
The Likelihood Function
• Let x1, …, xn be sample observations taken on corresponding random
variables X1, …, Xn whose distribution depends on a parameter θ. The
likelihood function defined on the parameter space Ω is given by
$$L(\theta \mid x_1, \ldots, x_n) = f_\theta(x_1, \ldots, x_n).$$
• Note that for the likelihood function we are fixing the data, x1,…, xn,
and varying the value of the parameter.
• The value L(θ | x1, …, xn) is called the likelihood of θ. It is the
probability (or, in the continuous case, the density) of observing the data
values we observed, given that θ is the true value of the parameter. It is
not the probability of θ given that we observed x1, …, xn.
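A small numeric illustration (mine, not from the slides): for n = 2 Bernoulli(θ) tosses with observed data (x1, x2) = (1, 0),
$$L(\theta \mid 1, 0) = \theta(1-\theta),$$
so L(0.5 | 1, 0) = 0.25 is the probability of observing the sequence (1, 0) when θ = 0.5; it is not a statement that θ = 0.5 has probability 0.25.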
Examples
• Suppose we toss a coin n = 10 times and observe 4 heads. With no
knowledge whatsoever about the probability of getting a head on a
single toss, the appropriate statistical model for the data is the
Binomial(10, θ) model. The likelihood function is given by
$$L(\theta \mid x = 4) = \binom{10}{4} \theta^4 (1-\theta)^6, \qquad \theta \in [0, 1].$$
• Suppose X1, …, Xn is a random sample from an Exponential(θ)
distribution. Using the rate parameterization, $f_\theta(x) = \theta e^{-\theta x}$ for x > 0, the likelihood function is
$$L(\theta \mid x_1, \ldots, x_n) = \prod_{i=1}^{n} \theta e^{-\theta x_i} = \theta^n e^{-\theta \sum_{i=1}^{n} x_i}.$$
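A minimal computational sketch (mine, not from the lecture) of the coin example above; it evaluates the Binomial(10, θ) likelihood on a grid to show that the data stay fixed while θ varies. The grid resolution and names are illustrative assumptions.

# A hedged sketch: evaluating the coin-example likelihood over a grid of
# candidate parameter values (data fixed, theta varying).
from math import comb

n, x = 10, 4  # 10 tosses, 4 heads observed

def likelihood(theta):
    # L(theta | x) = C(n, x) * theta^x * (1 - theta)^(n - x)
    return comb(n, x) * theta**x * (1 - theta)**(n - x)

grid = [i / 100 for i in range(1, 100)]  # theta in {0.01, ..., 0.99}
best = max(grid, key=likelihood)
print(best, likelihood(best))  # peaks near theta = 0.4 = x/n

The maximizing value x/n anticipates the maximum likelihood estimator, though that is not the point of this slide.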
Sufficiency - Introduction
• A statistic that summarizes all the information in the sample about
the target parameter is called a sufficient statistic.
• An estimator $\hat{\theta}$ is sufficient if we get as much information about θ
from $\hat{\theta}$ as we would from the entire sample X1, …, Xn.
• A sufficient statistic T(x1, …, xn) for a model is any function of the
data x1, …, xn such that once we know the value of T(x1, …, xn),
then we can determine the likelihood function.
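As a hedged illustration (mine): in the coin-tossing model, once we know T(x1, …, xn) = Σxi, the number of heads, the likelihood is determined:
$$L(\theta \mid x_1, \ldots, x_n) = \theta^{T}(1-\theta)^{n-T},$$
which depends on the data only through T, not on the order of the tosses.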
Sufficient Statistic
• A sufficient statistic is a function T(x1, …, xn) defined on the
sample space, such that whenever T(x1, …, xn) = T(y1, …, yn), then
$$L(\theta \mid x_1, \ldots, x_n) = c \, L(\theta \mid y_1, \ldots, y_n)$$
for some constant c > 0 that does not depend on θ.
• Typically, T(x1, …, xn) will be of lower dimension than x1, …, xn, so
we can consider replacing x1, …, xn by T(x1, …, xn) as a data
reduction, which simplifies the analysis.
• Example…
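One hedged example of the kind the slide leaves for class (mine): for an Exponential(θ) sample with rate parameterization, T(x1, …, xn) = Σxi satisfies the definition, since T(x1, …, xn) = T(y1, …, yn) gives
$$L(\theta \mid x_1, \ldots, x_n) = \theta^n e^{-\theta \sum_i x_i} = \theta^n e^{-\theta \sum_i y_i} = L(\theta \mid y_1, \ldots, y_n),$$
i.e., the definition holds with c = 1.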
Minimal Sufficient Statistics
• A minimal sufficient statistic T for a model is any sufficient
statistic such that once we know a likelihood function L(θ | x1, …, xn)
for the model and data, then we can determine T(x1, …, xn).
• A relevant likelihood function can always be obtained from the
value of any sufficient statistic T, but if T is minimal sufficient as
well, then we can also obtain the value of T from any likelihood
function.
• It can be shown that a minimal sufficient statistic gives the
maximal reduction of the data.
• Example…
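Continuing the hedged Exponential(θ) example (mine): T = Σxi is also minimal sufficient, because the likelihood $L(\theta \mid x_1, \ldots, x_n) = \theta^n e^{-\theta \sum_i x_i}$ determines Σxi; for instance, its maximizer is θ = n / Σxi, from which Σxi can be recovered.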
Alternative Definition of Sufficient Statistic
• Let X1, …, Xn be a random sample from a distribution with unknown
parameter θ. The statistic T(x1, …, xn) is said to be sufficient for θ if
the conditional distribution of X1, …, Xn given T does not depend on θ.
• This definition is much harder to work with, as the conditional
distribution of the sample X1, …, Xn given the sufficient statistic T is
often hard to derive.
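A classic hedged illustration (mine, not derived on the slide): for a Poisson(θ) sample with T = ΣXi, the conditional distribution of (X1, …, Xn) given T = t is Multinomial(t; 1/n, …, 1/n):
$$P(X_1 = x_1, \ldots, X_n = x_n \mid T = t) = \frac{t!}{x_1! \cdots x_n!} \left(\frac{1}{n}\right)^{t}, \qquad \sum_i x_i = t,$$
which is free of θ, so T is sufficient by this definition.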
Factorization Theorem
• Let T be a statistic based on a random sample X1, …, Xn. Then T is a
sufficient statistic for θ if and only if the likelihood function can be factored as
$$L(\theta \mid x_1, \ldots, x_n) = g(T(x_1, \ldots, x_n); \theta) \, h(x_1, \ldots, x_n),$$
i.e., into two nonnegative functions: one that depends on the data only
through T(x1, …, xn) and on θ, and one that depends only on the data
x1, …, xn.
• Proof:
Examples
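Since the examples themselves were not transcribed, here is a hedged worked example of the usual sort (mine): for a random sample from N(θ, 1),
$$L(\theta \mid x_1, \ldots, x_n) = (2\pi)^{-n/2} \exp\Big(-\tfrac{1}{2}\sum_{i=1}^{n}(x_i - \theta)^2\Big) = \exp\Big(-\tfrac{n}{2}(\bar{x} - \theta)^2\Big) \cdot (2\pi)^{-n/2} \exp\Big(-\tfrac{1}{2}\sum_{i=1}^{n}(x_i - \bar{x})^2\Big),$$
where the first factor is g(T; θ) with T = $\bar{x}$ and the second is h(x1, …, xn), so by the Factorization Theorem the sample mean is sufficient for θ.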
Minimum Variance Unbiased Estimator
• The MVUE for θ is the unbiased estimator with the smallest possible
variance: we look among all unbiased estimators for the one with
the smallest variance.
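A small hedged comparison (mine): for a Bernoulli(θ) sample, both $\hat{\theta}_1 = X_1$ and $\hat{\theta}_2 = \bar{X}$ are unbiased for θ, but
$$\mathrm{Var}(\hat{\theta}_1) = \theta(1-\theta) \ \geq\ \frac{\theta(1-\theta)}{n} = \mathrm{Var}(\hat{\theta}_2),$$
so $\bar{X}$ is the better of the two; the MVUE is the estimator that wins every such comparison.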
The Rao-Blackwell Theorem
• Let $\hat{\theta}$ be an unbiased estimator for θ such that $\mathrm{Var}(\hat{\theta}) < \infty$. If T is a
sufficient statistic for θ, define $\hat{\theta}^* = E(\hat{\theta} \mid T)$. Then, for all θ,
$$E(\hat{\theta}^*) = \theta \quad \text{and} \quad \mathrm{Var}(\hat{\theta}^*) \le \mathrm{Var}(\hat{\theta}).$$
• Proof:
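Since the proof is left blank in the transcript, here is a hedged sketch of the standard argument (mine): unbiasedness follows from the tower property,
$$E(\hat{\theta}^*) = E\big(E(\hat{\theta} \mid T)\big) = E(\hat{\theta}) = \theta,$$
and the variance claim follows from the decomposition
$$\mathrm{Var}(\hat{\theta}) = \mathrm{Var}\big(E(\hat{\theta} \mid T)\big) + E\big(\mathrm{Var}(\hat{\theta} \mid T)\big) \ \geq\ \mathrm{Var}(\hat{\theta}^*).$$
Sufficiency of T is what guarantees that $\hat{\theta}^* = E(\hat{\theta} \mid T)$ does not depend on θ and is therefore a genuine estimator. As an example (mine): in the Bernoulli(θ) model with $\hat{\theta} = X_1$ and T = ΣXi, symmetry gives E(X1 | T = t) = t/n, so Rao-Blackwellization turns X1 into $\bar{X}$.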