Statistical Model - University of Toronto

Statistical Model
• A statistical model for some data is a set of distributions, { fθ : θ ∈ Ω },
one of which corresponds to the true unknown distribution that
produced the data.
• The statistical model corresponds to the information a statistician
brings to the application about what the true distribution is or at least
what he or she is willing to assume about it.
• The variable θ is called the parameter of the model, and the set Ω is
called the parameter space.
• From the definition of a statistical model, we see that there is a
unique value θ ∈ Ω such that fθ is the true distribution that generated
the data. We refer to this value as the true parameter value.
STA248 week 3
1
Examples
• Suppose there are two manufacturing plants for machines. It is
known that the life lengths of machines built by the first plant have
an Exponential(1) distribution, while machines manufactured by the
second plant have life lengths distributed Exponential(1.5). You
have purchased five of these machines and you know that all five
came from the same plant but do not know which plant. Further, you
observe the life lengths of these machines, obtaining a sample
(x1, …, x5) and want to make inference about the true distribution of
the life lengths of these machines.
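As a quick sketch of this example (assuming the rate parameterization of the exponential, and made-up lifetimes that are not from the slides), we can compare how probable the observed sample is under each of the two candidate distributions:

```python
import math

# Hypothetical lifetimes for the five machines (not from the slides)
x = [0.2, 0.8, 1.1, 0.4, 2.3]

def joint_density(rate, data):
    # i.i.d. Exponential(rate) sample: product of rate * exp(-rate * x_i)
    return math.prod(rate * math.exp(-rate * xi) for xi in data)

f1 = joint_density(1.0, x)   # plant 1: Exponential(1)
f2 = joint_density(1.5, x)   # plant 2: Exponential(1.5)
plant = 1 if f1 > f2 else 2  # pick the distribution under which the data are more probable
```

For this particular made-up sample the Exponential(1) joint density is larger, so plant 1 is the more plausible origin; this kind of comparison is formalized by the likelihood function later in the section.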
• Suppose we have observations of heights in cm of individuals in a
population and we feel that it is reasonable to assume that the
distribution of height in the population is normal with some
unknown mean and variance. The statistical model in this case is
{ fμ,σ² : (μ, σ²) ∈ Ω }, where Ω = R×R+ and R+ = (0, ∞).
Goals of Statistics
• Estimate unknown parameters of underlying probability
distribution.
• Measure errors of these estimates.
• Test whether the data give evidence that parameters are (or are
not) equal to a certain value, or that the probability distribution
has a particular form.
Point Estimation
• Most statistical procedures involve estimation of the unknown value
of the parameter of the statistical model.
• A point estimator of the parameter θ is a function of the underlying
random variables and so it is a random variable with a distribution
function.
• A point estimate of the parameter θ is a function of the data; it is a
statistic. For a given sample, an estimate is a number.
• Notation…
What Makes a Good Estimator?
• Unbiased
• Consistent
• Minimum variance
• With a known probability distribution
Properties of Point Estimators - Unbiased
• Let ˆ be a point estimator for a parameter θ. Then ˆ is an unbiased
estimator if E ˆ   .

• There may not always exist an unbiased estimator for θ.
• ˆ unbiased for θ, does not mean g ˆ is unbiased for g(θ).
Example - Common Point Estimators
• A natural estimate for the population mean μ is the sample mean
(whatever the underlying distribution). The sample mean is an
unbiased estimator of the population mean.
• There are two common estimators for the population variance …
Claim
• Let X1, X2,…, Xn be a random sample of size n from a population
with mean μ and variance σ². The sample variance s² is an
unbiased estimator of the population variance σ².
• Proof…
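In addition to the algebraic proof, a small simulation (a sketch with arbitrary parameter choices) illustrates the claim: averaging the divisor-(n−1) sample variance over many samples recovers σ², while the divisor-n alternative is biased downward:

```python
import random

random.seed(0)
mu, sigma2, n, reps = 0.0, 4.0, 5, 20000

avg_s2 = 0.0  # running average of s^2 (divisor n-1)
avg_v = 0.0   # running average of the divisor-n variant
for _ in range(reps):
    xs = [random.gauss(mu, sigma2 ** 0.5) for _ in range(n)]
    xbar = sum(xs) / n
    ss = sum((xi - xbar) ** 2 for xi in xs)
    avg_s2 += ss / (n - 1) / reps
    avg_v += ss / n / reps

# avg_s2 should land near sigma2 = 4; avg_v near sigma2*(n-1)/n = 3.2
```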
Example
• Suppose X1, X2,…, Xn is a random sample from the U(0, θ) distribution.
Let θ̂ = X(n), the sample maximum. Find the density of θ̂ and its mean.
Is θ̂ unbiased?
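A quick simulation (with arbitrary choices of θ and n) suggests the answer: the mean of the sample maximum sits at nθ/(n+1), strictly below θ, so this estimator is biased (though the bias vanishes as n grows):

```python
import random

random.seed(1)
theta, n, reps = 10.0, 5, 50000

# Average of the sample maximum over many simulated samples
avg_max = sum(max(random.uniform(0, theta) for _ in range(n))
              for _ in range(reps)) / reps

expected = n * theta / (n + 1)  # E[X_(n)] = n*theta/(n+1), here 8.33...
```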
Asymptotically Unbiased Estimators
Bˆ   0
• An estimator is asymptotically unbiased if lim
n
• Example:
Consistency
• An estimator θ̂n is a consistent estimator of θ if θ̂n → θ in
probability as n → ∞, i.e., if θ̂n converges in probability to θ.
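As an illustration (a sketch using the sample mean of a Uniform(0, 1) population, whose true mean is 0.5), consistency shows up as the deviation from the true value shrinking when the sample size grows:

```python
import random

random.seed(2)
mu = 0.5  # true mean of Uniform(0, 1)

def sample_mean(n):
    # mean of a fresh simulated sample of size n
    return sum(random.random() for _ in range(n)) / n

# deviation |xbar_n - mu| for increasing sample sizes
devs = {n: abs(sample_mean(n) - mu) for n in (10, 1000, 100000)}
```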
Minimum Variance
• An estimator for θ is a function of underlying random variables and
so it is a random variable and has its own probability distribution
function.
• This probability distribution is called the sampling distribution of
the estimator.
• We can use the sampling distribution to get the variance of an estimator.
• A better estimator has smaller variance; if it is unbiased, it is more
likely to produce estimates close to the true value of the parameter.
• The standard deviation of the sampling distribution of an estimator
is usually called the standard error of the estimator.
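For instance (a sketch with arbitrary parameter choices), the standard error of the sample mean is σ/√n, and simulating its sampling distribution reproduces that value:

```python
import random

random.seed(4)
sigma, n, reps = 2.0, 25, 10000

# Simulate the sampling distribution of the sample mean
means = []
for _ in range(reps):
    xs = [random.gauss(0.0, sigma) for _ in range(n)]
    means.append(sum(xs) / n)

m = sum(means) / reps
se_sim = (sum((v - m) ** 2 for v in means) / reps) ** 0.5
se_theory = sigma / n ** 0.5  # sigma/sqrt(n) = 0.4
```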
Examples
How to find estimators?
• There are two main methods for finding estimators:
1) Method of moments.
2) The method of Maximum likelihood.
• Sometimes the two methods will give the same estimator.
Method of Moments
• The method of moments is a very simple procedure for finding an
estimator for one or more parameters of a statistical model.
• It is one of the oldest methods for deriving point estimators.
• Recall: the kth moment of a random variable is
μk = E(X^k).
These will very often be functions of the unknown parameters.
• The corresponding kth sample moment is the average
mk = (1/n) Σ xi^k, i = 1, …, n.
• The method of moments estimators are the solutions to the
equations μk = mk, for k = 1, 2, …, matching as many moments as
there are unknown parameters.
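A one-parameter sketch: for the U(0, θ) model seen earlier, μ1 = E(X) = θ/2, so equating the first population and sample moments gives θ̂ = 2·m1. With simulated data (arbitrary choice of θ):

```python
import random

random.seed(3)
theta, n = 10.0, 1000
xs = [random.uniform(0, theta) for _ in range(n)]

m1 = sum(xs) / n    # first sample moment (the sample mean)
theta_hat = 2 * m1  # solve mu_1 = theta/2 = m1 for theta
```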
Examples
The Likelihood Function
• Let x1, …, xn be sample observations taken on corresponding random
variables X1, …, Xn whose distribution depends on a parameter θ. The
likelihood function defined on the parameter space Ω is given by
L | x1 ,..., xn   f x1 ,..., xn .
• Note that for the likelihood function we are fixing the data, x1,…, xn,
and varying the value of the parameter.
• The value L(θ | x1, …, xn) is called the likelihood of θ. It is the
probability of observing the data values we observed given that θ is the
true value of the parameter. It is not the probability of θ given that we
observed x1, …, xn.
Maximum Likelihood Estimators
• In the likelihood function, different values of θ will attach different
probabilities to a particular observed sample.
• The likelihood function, L(θ | x1, …, xn), can be maximized over θ,
to give the parameter value that attaches the highest possible
probability to a particular observed sample.
• We can maximize the likelihood function to find an estimator of θ.
• This estimator is a statistic – it is a function of the sample data. It is
denoted by θ̂.
The log likelihood function
• l(θ) = ln(L(θ)) is the log likelihood function.
• Both the likelihood function and the log likelihood function attain
their maximum at the same value θ̂.
• It is often easier to maximize l(θ).
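As a sketch (an Exponential(λ) sample with made-up values, rate parameterization): the log likelihood l(λ) = n·ln λ − λ·Σxi is much easier to work with than L itself, and maximizing it numerically agrees with the closed-form maximizer λ̂ = 1/x̄:

```python
import math

# Hypothetical sample from an Exponential(lambda) model
xs = [0.2, 0.8, 1.1, 0.4, 2.3]
n, s = len(xs), sum(xs)

def loglik(lam):
    # l(lambda) = n*ln(lambda) - lambda*sum(x_i)
    return n * math.log(lam) - lam * s

lam_closed = n / s  # closed-form maximizer: 1/xbar

# Crude numeric check: maximize l over a fine grid
grid = [k / 1000 for k in range(1, 5000)]
lam_grid = max(grid, key=loglik)
```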
Examples
Properties of MLE
• Maximum likelihood estimators (MLEs) are consistent.
• The MLE of any parameter is asymptotically unbiased.
• The MLE has variance that is asymptotically nearly as small as can
be achieved by any estimator.
• The distribution of MLEs is approximately Normal (asymptotically).