Transcript Document

Statistical Decision Making
• Almost all problems in statistics can be formulated as a problem of making a decision.
• That is, given some data observed from some phenomenon, a decision will have to be made about that phenomenon.
Decisions are generally broken into two types:
• Estimation decisions, and
• Hypothesis Testing decisions.
Probability theory plays a very important role both in making these decisions and in assessing the error made by them.
Definition:
A random variable X is a numerical quantity that is determined by the outcome of a random experiment.
Example:
An individual is selected at random from a population, and
X = the weight of the individual.
The probability distribution of a (continuous) random variable is described by its probability density curve f(x), i.e. a curve which has the following properties:
1. f(x) is always positive.
2. The total area under the curve f(x) is one.
3. The area under the curve f(x) between a and b is the probability that X lies between the two values.
[Figure: a probability density curve f(x), plotted over x = 0 to 120.]
Examples of some important Univariate distributions
1. The Normal distribution
A common probability density curve is the “Normal” density curve - symmetric and bell shaped.
Comment: If μ = 0 and σ = 1 the distribution is called the standard normal distribution.
[Figure: two Normal density curves, one with μ = 50 and σ = 15, the other with μ = 70 and σ = 20, plotted over x = 0 to 120.]
xm 
2
f(x) 

1
e
2s
2s
2
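As a quick numerical check, here is a minimal Python sketch (assuming NumPy and SciPy are available; μ = 50 and σ = 15 are arbitrary choices) that implements this density and verifies two of the density-curve properties listed earlier: total area one, and P(a < X < b) as an area under the curve.

```python
import numpy as np
from scipy.integrate import quad

def normal_pdf(x, mu=50.0, sigma=15.0):
    """The normal density f(x) given above."""
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)

# Property 2: the total area under f(x) is one.
total, _ = quad(normal_pdf, -np.inf, np.inf)
print(total)                      # ~1.0

# Property 3: P(a < X < b) is the area between a and b.
prob, _ = quad(normal_pdf, 35, 65)
print(prob)                       # ~0.683 (within one sigma of mu)
```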
2. The Chi-squared distribution with ν degrees of freedom
f(x) = \frac{1}{2^{\nu/2}\,\Gamma(\nu/2)}\, x^{(\nu-2)/2}\, e^{-x/2} \quad \text{if } x \ge 0
[Figure: the chi-squared density curve, plotted for x up to 14.]
Comment: If z1, z2, ..., zn are independent random variables each having a standard normal distribution, then
U = z_1^2 + z_2^2 + \cdots + z_n^2
has a chi-squared distribution with n degrees of freedom.
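A minimal simulation sketch of this fact (Python, assuming NumPy; n and the number of replications are arbitrary): sum squared standard normals and compare the empirical mean and variance of U with the known chi-squared values E[U] = n and Var[U] = 2n.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5                                  # degrees of freedom
z = rng.standard_normal((100_000, n))  # 100,000 draws of (z1, ..., zn)
U = (z ** 2).sum(axis=1)               # U = z1^2 + ... + zn^2

print(U.mean(), U.var())               # ~n = 5 and ~2n = 10
```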
3. The F distribution with ν1 degrees of freedom in the numerator and ν2 degrees of freedom in the denominator
f(x) = K\, x^{\nu_1/2 - 1} \left(1 + \frac{\nu_1}{\nu_2}\, x\right)^{-(\nu_1 + \nu_2)/2} \quad \text{if } x \ge 0
where K = \frac{\Gamma\!\left(\frac{\nu_1 + \nu_2}{2}\right)}{\Gamma\!\left(\frac{\nu_1}{2}\right)\Gamma\!\left(\frac{\nu_2}{2}\right)} \left(\frac{\nu_1}{\nu_2}\right)^{\nu_1/2}
[Figure: the F density curve, plotted over x = 0 to 6.]
Comment: If U1 and U2 are independent random variables each having a chi-squared distribution with ν1 and ν2 degrees of freedom respectively, then
F = \frac{U_1/\nu_1}{U_2/\nu_2}
has an F distribution with ν1 degrees of freedom in the numerator and ν2 degrees of freedom in the denominator.
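The same kind of check works here (Python sketch, assuming SciPy; the degrees of freedom are arbitrary choices): form the ratio of scaled chi-squared variables and test it against the F distribution with a Kolmogorov-Smirnov test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
nu1, nu2 = 4, 9
U1 = stats.chi2.rvs(nu1, size=50_000, random_state=rng)
U2 = stats.chi2.rvs(nu2, size=50_000, random_state=rng)
F = (U1 / nu1) / (U2 / nu2)

# A large p-value here means no evidence against F ~ F(nu1, nu2).
print(stats.kstest(F, "f", args=(nu1, nu2)))
```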
4. The t distribution with ν degrees of freedom
f(x) = K \left(1 + \frac{x^2}{\nu}\right)^{-(\nu + 1)/2}
where K = \frac{\Gamma\!\left(\frac{\nu + 1}{2}\right)}{\Gamma\!\left(\frac{\nu}{2}\right)\sqrt{\nu\pi}}
[Figure: the t density curve, plotted over x = -4 to 4.]
Comment: If z and U are independent random variables, where z has a standard Normal distribution and U has a chi-squared distribution with ν degrees of freedom, then
t = \frac{z}{\sqrt{U/\nu}}
has a t distribution with ν degrees of freedom.
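And once more for the t distribution (Python sketch, assuming NumPy and SciPy; ν = 7 is arbitrary): build t from z and U exactly as above and compare a few empirical quantiles with the theoretical ones.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
nu = 7
z = rng.standard_normal(100_000)
U = stats.chi2.rvs(nu, size=100_000, random_state=rng)
t = z / np.sqrt(U / nu)

for q in (0.05, 0.5, 0.95):
    print(q, np.quantile(t, q), stats.t.ppf(q, nu))  # empirical vs theoretical quantile
```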
The Sampling distribution of a statistic
A random sample from a probability distribution with density function f(x) is a collection of n independent random variables x1, x2, ..., xn, each with the probability distribution described by f(x).
If, for example, we collect a random sample of individuals from a population and
– measure some variable X for each of those individuals,
– then the n measurements x1, x2, ..., xn will form a set of n independent random variables with a probability distribution equivalent to the distribution of X across the population.
A statistic T is any quantity computed from the random observations x1, x2, ..., xn.
• Any statistic will necessarily also be a random variable and therefore will have a probability distribution described by some probability density function fT(t).
• This distribution is called the sampling distribution of the statistic T.
• This distribution is very important if one is using this statistic in a statistical analysis.
• It is used to assess the accuracy of a statistic if it is used as an estimator.
• It is used to determine thresholds for acceptance and rejection if it is used for Hypothesis testing.
Some examples of Sampling distributions of statistics
Distribution of the sample mean for a sample from a Normal population
Let x1, x2, ..., xn be a sample from a normal population with mean μ and standard deviation σ, and let
\bar{x} = \frac{\sum_i x_i}{n}
Then \bar{x} has a normal sampling distribution with mean
\mu_{\bar{x}} = \mu
and standard deviation
\sigma_{\bar{x}} = \sigma/\sqrt{n}
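A brief simulation sketch of this result (Python, assuming NumPy; μ = 50, σ = 15 and n = 25 are arbitrary choices): draw many samples, compute each sample mean, and compare the empirical mean and standard deviation of the sample means with μ and σ/√n.

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma, n = 50.0, 15.0, 25
samples = rng.normal(mu, sigma, size=(200_000, n))
xbar = samples.mean(axis=1)            # one sample mean per row

print(xbar.mean())                     # ~mu = 50
print(xbar.std(ddof=1))                # ~sigma/sqrt(n) = 3
```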
Distribution of the z statistic
Let x1, x2, ..., xn be a sample from a normal population with mean μ and standard deviation σ, and let
z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}
Then z has a standard normal distribution.
Comment:
Many statistics T have a normal distribution with mean μT and standard deviation σT. Then
z = \frac{T - \mu_T}{\sigma_T}
will have a standard normal distribution.
Distribution of the χ2 statistic for the sample variance
Let x1, x2, ..., xn be a sample from a normal population with mean μ and standard deviation σ. Let
s^2 = \frac{\sum_i (x_i - \bar{x})^2}{n - 1} = sample variance
and
s = \sqrt{\frac{\sum_i (x_i - \bar{x})^2}{n - 1}} = sample standard deviation
Let
\chi^2 = \frac{\sum_i (x_i - \bar{x})^2}{\sigma^2} = \frac{(n-1)s^2}{\sigma^2}
Then χ2 has a chi-squared distribution with ν = n − 1 degrees of freedom.
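A sketch checking this numerically (Python, assuming NumPy; the parameters are arbitrary): compute (n − 1)s²/σ² over many samples and compare its empirical mean and variance with the chi-squared values ν and 2ν for ν = n − 1.

```python
import numpy as np

rng = np.random.default_rng(4)
mu, sigma, n = 50.0, 15.0, 10
samples = rng.normal(mu, sigma, size=(200_000, n))
s2 = samples.var(axis=1, ddof=1)           # sample variances
chi2_stat = (n - 1) * s2 / sigma ** 2

print(chi2_stat.mean(), chi2_stat.var())   # ~(n-1) = 9 and ~2(n-1) = 18
```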
[Figure: the chi-squared density curve, plotted over x = 0 to 24.]
Distribution of the t statistic
Let x1, x2, ..., xn be a sample from a normal population with mean μ and standard deviation σ, and let
t = \frac{\bar{x} - \mu}{s/\sqrt{n}}
Then t has Student’s t distribution with ν = n − 1 degrees of freedom.
Comment:
If an estimator T has a normal distribution with mean μT and standard deviation σT, and if sT is an estimator of σT based on ν degrees of freedom, then
t = \frac{T - \mu_T}{s_T}
will have Student’s t distribution with ν degrees of freedom.
[Figure: the t density curve compared with the standard normal density curve.]
Point estimation
• A statistic T is called an estimator of the parameter θ if its value is used as an estimate of the parameter θ.
• The performance of an estimator T will be determined by how “close” the sampling distribution of T is to the parameter, θ, being estimated.
• An estimator T is called an unbiased estimator of θ if μT, the mean of the sampling distribution of T, satisfies μT = θ.
• This implies that in the long run the average value of T is θ.
• An estimator T is called the Minimum Variance Unbiased estimator of θ if T is an unbiased estimator and it has the smallest standard error σT amongst all unbiased estimators of θ.
• If the sampling distribution of T is normal, the standard error of T is extremely important. It completely describes the variability of the estimator T.
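A small sketch illustrating unbiasedness (Python, assuming NumPy; parameters arbitrary): the sample variance s² with divisor n − 1 is an unbiased estimator of σ², while the divisor-n version is biased downward.

```python
import numpy as np

rng = np.random.default_rng(5)
sigma, n = 15.0, 10
samples = rng.normal(50.0, sigma, size=(200_000, n))

print(samples.var(axis=1, ddof=1).mean())  # ~sigma^2 = 225 (unbiased)
print(samples.var(axis=1, ddof=0).mean())  # ~(n-1)/n * sigma^2 = 202.5 (biased low)
```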
Interval Estimation (confidence intervals)
• Point estimators give only single values as an estimate. There is no indication of the accuracy of the estimate.
• The accuracy can sometimes be measured and shown by displaying the standard error of the estimate.
• There is, however, a better way: using confidence interval estimates.
• The unknown parameter is estimated with a range of values that has a given probability of capturing the parameter being estimated.
• The interval TL to TU is called a (1 − α) × 100% confidence interval for the parameter θ if the probability that θ lies in the range TL to TU is equal to 1 − α.
• Here TL and TU are
– statistics
– random numerical quantities calculated from the data.
Examples
Confidence interval for the mean of a Normal population (based on the z statistic):
T_L = \bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \quad \text{to} \quad T_U = \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}
is a (1 − α) × 100% confidence interval for μ, the mean of a normal population.
Here z_{α/2} is the upper α/2 × 100% percentage point of the standard normal distribution.
More generally, if T is an unbiased estimator of the parameter θ and has a normal sampling distribution with known standard error σT, then
T_L = T - z_{\alpha/2}\,\sigma_T \quad \text{to} \quad T_U = T + z_{\alpha/2}\,\sigma_T
is a (1 − α) × 100% confidence interval for θ.
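A sketch computing this z-based interval (Python, assuming NumPy and SciPy; the simulated data and σ = 15 are stand-ins for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
sigma, n, alpha = 15.0, 40, 0.05
x = rng.normal(50.0, sigma, size=n)      # stand-in data with known sigma

z = stats.norm.ppf(1 - alpha / 2)        # upper alpha/2 percentage point, ~1.96
half_width = z * sigma / np.sqrt(n)
print(x.mean() - half_width, x.mean() + half_width)  # 95% CI for mu
```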
Confidence interval for the mean of a Normal population (based on the t statistic):
T_L = \bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}} \quad \text{to} \quad T_U = \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}}
is a (1 − α) × 100% confidence interval for μ, the mean of a normal population.
Here t_{α/2} is the upper α/2 × 100% percentage point of the Student’s t distribution with ν = n − 1 degrees of freedom.
More generally, if T is an unbiased estimator of the parameter θ and has a normal sampling distribution with estimated standard error sT, based on ν degrees of freedom, then
T_L = T - t_{\alpha/2}\,s_T \quad \text{to} \quad T_U = T + t_{\alpha/2}\,s_T
is a (1 − α) × 100% confidence interval for θ.
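And the t-based version, for when σ is unknown (same sketch assumptions as above):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n, alpha = 15, 0.05
x = rng.normal(50.0, 15.0, size=n)       # stand-in data, sigma treated as unknown

t = stats.t.ppf(1 - alpha / 2, df=n - 1)
half_width = t * x.std(ddof=1) / np.sqrt(n)
print(x.mean() - half_width, x.mean() + half_width)  # 95% CI for mu
```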
Common Confidence intervals

Situation: Sample from the Normal distribution with unknown mean and known variance (estimating μ) (n large)
Confidence interval: \bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}

Situation: Sample from the Normal distribution with unknown mean and unknown variance (estimating μ) (n small)
Confidence interval: \bar{x} \pm t_{\alpha/2}\,s/\sqrt{n}

Situation: Estimation of a binomial probability p
Confidence interval: \hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}

Situation: Two independent samples from the Normal distribution with unknown means and known variances (estimating μ1 − μ2) (n, m large)
Confidence interval: \bar{x} - \bar{y} \pm z_{\alpha/2}\sqrt{\sigma_x^2/n + \sigma_y^2/m}

Situation: Two independent samples from the Normal distribution with unknown means and unknown but equal variances (estimating μ1 − μ2) (n, m small)
Confidence interval: \bar{x} - \bar{y} \pm t_{\alpha/2}\,s_{Pooled}\sqrt{1/n + 1/m}

Situation: Estimation of the difference between two binomial probabilities, p1 − p2
Confidence interval: \hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2}\sqrt{\hat{p}_1(1-\hat{p}_1)/n_1 + \hat{p}_2(1-\hat{p}_2)/n_2}
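As one worked instance from this table (Python sketch, assuming NumPy and SciPy; the counts are invented), a 95% confidence interval for a binomial probability p:

```python
import numpy as np
from scipy import stats

successes, n, alpha = 37, 120, 0.05
p_hat = successes / n
z = stats.norm.ppf(1 - alpha / 2)
half_width = z * np.sqrt(p_hat * (1 - p_hat) / n)
print(p_hat - half_width, p_hat + half_width)
```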
Multiple Confidence intervals
In many situations one is interested in estimating not only a single parameter, θ, but a collection of parameters, θ1, θ2, θ3, ....
A collection of intervals, TL1 to TU1, TL2 to TU2, TL3 to TU3, ..., is called a set of (1 − α) × 100% multiple confidence intervals if the probability that all the intervals capture their respective parameters is 1 − α.
Hypothesis Testing
• Another important area of statistical
inference is that of Hypothesis Testing.
• In this situation one has a statement
(Hypothesis) about the parameter(s) of the
distributions being sampled and one is
interested in deciding whether the statement
is true or false.
• In fact there are two hypotheses
– The Null Hypothesis (H0) and
– the Alternative Hypothesis (HA).
• A decision will be made either to
– Accept H0 (Reject HA) or to
– Reject H0 (Accept HA).
• The following table gives the different possibilities for the decision and the different possibilities for the correctness of the decision:
          | H0 is true       | H0 is false
Accept H0 | Correct Decision | Type II error
Reject H0 | Type I error     | Correct Decision
• Type I error - The Null Hypothesis H0 is rejected when it is true.
• The probability that a decision procedure makes a type I error is denoted by α, and is sometimes called the significance level of the test.
• Common significance levels that are used are α = 0.05 and α = 0.01.
• Type II error - The Null Hypothesis H0 is accepted when it is false.
• The probability that a decision procedure makes a type II error is denoted by β.
• The probability 1 − β is called the Power of the test and is the probability that the decision procedure correctly rejects a false Null Hypothesis.
A statistical test is defined by
• 1. Choosing a statistic for making the decision to Accept or Reject H0. This statistic is called the test statistic.
• 2. Dividing the set of possible values of the test statistic into two regions - an Acceptance Region and a Critical Region.
• If upon collection of the data and evaluation
of the test statistic, its value lies in the
Acceptance Region, a decision is made to
accept the Null Hypothesis H0.
• If upon collection of the data and evaluation
of the test statistic, its value lies in the
Critical Region, a decision is made to reject
the Null Hypothesis H0.
• The probability of a type I error, α, is usually set at a predefined level by choosing the critical thresholds (boundaries between the Acceptance and Critical Regions) appropriately.
• The probability of a type II error, β, is decreased (and the power of the test, 1 − β, is increased) by
1. Choosing the “best” test statistic.
2. Selecting the most efficient experimental design.
3. Increasing the amount of information (usually by increasing the sample sizes involved) on which the decision is based.
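To make α and power concrete, here is a simulation sketch (Python, assuming NumPy and SciPy; μ0 = 50, σ = 15, n = 25 and the alternative mean 60 are invented): run a two-sided z-test on many simulated datasets and estimate the rejection rate when H0 is true (the type I error rate) and when it is false (the power).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
mu0, sigma, n, alpha = 50.0, 15.0, 25, 0.05
z_crit = stats.norm.ppf(1 - alpha / 2)

def reject_rate(true_mu, reps=100_000):
    """Fraction of simulated two-sided z-tests that reject H0: mu = mu0."""
    x = rng.normal(true_mu, sigma, size=(reps, n))
    z = (x.mean(axis=1) - mu0) / (sigma / np.sqrt(n))
    return (np.abs(z) > z_crit).mean()

print(reject_rate(50.0))   # H0 true: type I error rate, ~alpha = 0.05
print(reject_rate(60.0))   # H0 false: power against mu = 60, ~0.92
```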
Some common Tests

Situation: Sample from the Normal distribution with unknown mean and known variance (testing μ) (n large)
Test statistic: z = \frac{\sqrt{n}(\bar{x} - \mu_0)}{\sigma}
H0: μ = μ0.  HA: μ ≠ μ0, Critical Region: z < -z_{α/2} or z > z_{α/2};  HA: μ > μ0, Critical Region: z > z_α;  HA: μ < μ0, Critical Region: z < -z_α.

Situation: Sample from the Normal distribution with unknown mean and unknown variance (testing μ) (n small)
Test statistic: t = \frac{\sqrt{n}(\bar{x} - \mu_0)}{s}
H0: μ = μ0.  HA: μ ≠ μ0, Critical Region: t < -t_{α/2} or t > t_{α/2};  HA: μ > μ0, Critical Region: t > t_α;  HA: μ < μ0, Critical Region: t < -t_α.

Situation: Testing of a binomial probability p
Test statistic: z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}
H0: p = p0.  HA: p ≠ p0, Critical Region: z < -z_{α/2} or z > z_{α/2};  HA: p > p0, Critical Region: z > z_α;  HA: p < p0, Critical Region: z < -z_α.

Situation: Two independent samples from the Normal distribution with unknown means and known variances (testing μ1 − μ2) (n, m large)
Test statistic: z = \frac{\bar{x} - \bar{y}}{\sqrt{\sigma_x^2/n + \sigma_y^2/m}}
H0: μ1 = μ2.  HA: μ1 ≠ μ2, Critical Region: z < -z_{α/2} or z > z_{α/2};  HA: μ1 > μ2, Critical Region: z > z_α;  HA: μ1 < μ2, Critical Region: z < -z_α.

Situation: Two independent samples from the Normal distribution with unknown means and unknown but equal variances (testing μ1 − μ2)
Test statistic: t = \frac{\bar{x} - \bar{y}}{s_{Pooled}\sqrt{1/n + 1/m}}
H0: μ1 = μ2.  HA: μ1 ≠ μ2, Critical Region: t < -t_{α/2} or t > t_{α/2};  HA: μ1 > μ2, Critical Region: t > t_α;  HA: μ1 < μ2, Critical Region: t < -t_α.

Situation: Testing the difference between two binomial probabilities, p1 − p2
Test statistic: z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})(1/n_1 + 1/n_2)}}
H0: p1 = p2.  HA: p1 ≠ p2, Critical Region: z < -z_{α/2} or z > z_{α/2};  HA: p1 > p2, Critical Region: z > z_α;  HA: p1 < p2, Critical Region: z < -z_α.
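As a worked instance of the second row of this table (Python sketch, assuming NumPy and SciPy; the data are invented), compute the one-sample t statistic by hand and cross-check it against scipy.stats.ttest_1samp:

```python
import numpy as np
from scipy import stats

x = np.array([52.1, 48.3, 55.0, 49.7, 51.2, 53.8, 47.9, 50.4])
mu0, alpha = 50.0, 0.05

t = np.sqrt(len(x)) * (x.mean() - mu0) / x.std(ddof=1)
t_crit = stats.t.ppf(1 - alpha / 2, df=len(x) - 1)
print(t, t_crit, abs(t) > t_crit)    # reject H0 iff |t| > t_crit

print(stats.ttest_1samp(x, mu0))     # same t statistic, plus a two-sided p-value
```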
The p-value approach to
Hypothesis Testing
In hypothesis testing we need
1. A test statistic
2. A Critical and Acceptance region
for the test statistic
The Critical Region is set up under the sampling distribution of the test statistic, with area α (0.05 or 0.01) above the critical region. The critical region may be one tailed or two tailed.
The Critical region:
[Figure: the standard normal density with a two-tailed critical region - area α/2 in each tail; Reject H0 if z < -z_{α/2} or z > z_{α/2}, Accept H0 otherwise.]
P[Accept H0 when true] = P[-z_{α/2} ≤ z ≤ z_{α/2}] = 1 − α
P[Reject H0 when true] = P[z < -z_{α/2} or z > z_{α/2}] = α
A test is carried out by
1. Computing the value of the test statistic
2. Making the decision:
a. Reject if the value is in the Critical region, and
b. Accept if the value is in the Acceptance region.
The value of the test statistic may be in the Acceptance region but close to being in the Critical region, or it may be in the Critical region but close to being in the Acceptance region.
To measure this we compute the p-value.
Definition – Once the test statistic has been computed from the data, the p-value is defined to be:
p-value = P[the test statistic is as or more extreme than the observed value of the test statistic]
“More extreme” means giving stronger evidence for rejecting H0.
Example – Suppose we are using the z-test for the mean μ of a normal population and α = 0.05. Then z_{0.025} = 1.960, so the critical region is to reject H0 if z < -1.960 or z > 1.960.
Suppose z = 2.3; then we reject H0.
p-value = P[the test statistic is as or more extreme than the observed value of the test statistic]
= P[z > 2.3] + P[z < -2.3]
= 0.0107 + 0.0107 = 0.0214
[Figure: the standard normal density with the two tail areas beyond -2.3 and 2.3 shaded as the p-value.]
If the value of z = 1.2, then we accept H0.
p-value = P[the test statistic is as or more extreme than the observed value of the test statistic]
= P[z > 1.2] + P[z < -1.2]
= 0.1151 + 0.1151 = 0.2302
There is a 23.02% chance that the test statistic is as or more extreme than 1.2. That is fairly high, hence 1.2 is not very extreme.
[Figure: the standard normal density with the two tail areas beyond -1.2 and 1.2 shaded as the p-value.]
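Both p-values are easy to reproduce (Python sketch, assuming SciPy):

```python
from scipy import stats

for z_obs in (2.3, 1.2):
    p = 2 * stats.norm.sf(abs(z_obs))   # two-tailed: P[z > |z_obs|] + P[z < -|z_obs|]
    print(z_obs, round(p, 4))           # 2.3 -> 0.0214, 1.2 -> ~0.2301
```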
Properties of the p-value
1. If the p-value is small (< 0.05 or < 0.01), H0 should be rejected.
2. The p-value measures the plausibility of H0.
3. If the test is two tailed, the p-value should be two tailed.
4. If the test is one tailed, the p-value should be one tailed.
5. It is customary to report p-values when reporting the results. This gives the reader some idea of the strength of the evidence for rejecting H0.
Multiple testing
Quite often one is interested in performing a collection (family) of tests of hypotheses:
1. H0,1 versus HA,1.
2. H0,2 versus HA,2.
3. H0,3 versus HA,3.
etc.
• Let α* denote the probability that at least one type I error is made in the collection of tests that are performed.
• The value of α*, the family type I error rate, can be considerably larger than α, the type I error rate of each individual test.
• The value of the family error rate, α*, can be controlled by altering the thresholds of each individual test appropriately.
• A testing procedure of this nature is called a Multiple testing procedure.
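A sketch of the problem and of one standard way of altering the thresholds - the Bonferroni correction, which the slides do not name but which is the simplest such adjustment (Python, assuming SciPy; k = 20 independent tests is an arbitrary example):

```python
from scipy import stats

alpha, k = 0.05, 20

# Family type I error rate if k independent tests are each run at level alpha:
alpha_star = 1 - (1 - alpha) ** k
print(alpha_star)                                # ~0.64, far above 0.05

# Bonferroni: run each test at level alpha/k to keep alpha* at most alpha.
z_single = stats.norm.ppf(1 - alpha / 2)         # per-test threshold, ~1.96
z_bonf = stats.norm.ppf(1 - alpha / (2 * k))     # adjusted threshold, ~3.02
print(z_single, z_bonf)
```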
Summary: choice of analysis by type of Dependent and Independent variables

Dependent: Categorical
– Independent Categorical: Multiway frequency Analysis (Log Linear Model)
– Independent Continuous: Discriminant Analysis
– Independent Continuous & Categorical: Discriminant Analysis

Dependent: Continuous
– Independent Categorical: ANOVA (single dep var), MANOVA (mult dep var)
– Independent Continuous: MULTIPLE REGRESSION (single dep variable), MULTIVARIATE MULTIPLE REGRESSION (multiple dependent variables)
– Independent Continuous & Categorical: ANACOVA (single dep var), MANACOVA (mult dep var)

Dependent: Continuous & Categorical
– Independent Categorical: ??
– Independent Continuous: ??
– Independent Continuous & Categorical: ??