Transcript Lecture 4
Sampling from an MVN Distribution
BMTRY 726
1/17/2014
Sample Mean Vector
• We can estimate a sample mean for X1, X2, …, Xn
$$\bar{X} = \frac{1}{n}\left(X_1 + X_2 + \cdots + X_n\right) = \begin{pmatrix} \frac{1}{n}\sum_{j=1}^{n} X_{1j} \\ \frac{1}{n}\sum_{j=1}^{n} X_{2j} \\ \vdots \\ \frac{1}{n}\sum_{j=1}^{n} X_{pj} \end{pmatrix} = \begin{pmatrix} \bar{X}_1 \\ \bar{X}_2 \\ \vdots \\ \bar{X}_p \end{pmatrix} = \frac{1}{n}\sum_{j=1}^{n} X_j$$
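As a small illustrative sketch (Python with numpy is used here purely for illustration; the course itself works in SAS and R), the column-by-column computation above can be written directly:

```python
import numpy as np

# n = 4 observations on p = 2 traits; each row is one observation X_j'
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0],
              [7.0, 8.0]])

n = X.shape[0]

# xbar = (1/n) * sum_j X_j, i.e. the mean of each trait (column)
xbar = X.sum(axis=0) / n

# same result via the built-in column mean
assert np.allclose(xbar, X.mean(axis=0))
```

Here `xbar` is the p-vector of trait means, i.e. $(\bar{X}_1, \ldots, \bar{X}_p)'$.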
Sample Mean Vector
• Now we can estimate the mean of our sample
• But what about the properties of $\bar{X}$?
– It is an unbiased estimate of the mean
– It is a sufficient statistic
– Also, the sampling distribution is:
$$\bar{X} \sim N_p\left(\mu, \tfrac{1}{n}\Sigma\right)$$
Sample Covariance
• And the sample covariance for X1, X2, …, Xn
$$S = \frac{1}{n-1}\sum_{j=1}^{n}\left(x_j - \bar{x}\right)\left(x_j - \bar{x}\right)' = \begin{pmatrix} s_{11} & s_{12} & \cdots & s_{1p} \\ s_{21} & s_{22} & \cdots & s_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ s_{p1} & s_{p2} & \cdots & s_{pp} \end{pmatrix}$$

• Sample variance

$$s_{ii} = s_i^2 = \frac{1}{n-1}\sum_{j=1}^{n}\left(x_{ij} - \bar{x}_i\right)^2$$

• Sample covariance

$$s_{ik} = \frac{1}{n-1}\sum_{j=1}^{n}\left(x_{ij} - \bar{x}_i\right)\left(x_{kj} - \bar{x}_k\right)$$
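The outer-product form of S can be sketched in numpy as follows (illustrative only; the centered data matrix trick below is an assumption about implementation, not something from the lecture):

```python
import numpy as np

# Rows are observations, columns are traits
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 5.0],
              [4.0, 4.0]])
n = X.shape[0]
xbar = X.mean(axis=0)

# S = 1/(n-1) * sum_j (x_j - xbar)(x_j - xbar)'
# D stacks the centered rows (x_j - xbar)', so D'D is the sum of outer products
D = X - xbar
S = D.T @ D / (n - 1)

# np.cov uses the same n-1 divisor; rowvar=False treats rows as observations
assert np.allclose(S, np.cov(X, rowvar=False))
```

Note the n − 1 divisor, which is what makes S unbiased for Σ.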
Sample Mean Vector
• So we can also estimate the variance of our sample
• And like $\bar{X}$, S also has some nice properties
– It is an unbiased estimate of the variance
– It is also a sufficient statistic
– It is also independent of X
• But what about the sampling distribution of S?
Wishart Distribution
• Given $Z_1, Z_2, \ldots, Z_n \sim NID_p\left(0, \Sigma\right)$, the distribution of $\sum_{j=1}^{n} Z_j Z_j'$ is called a Wishart distribution with n degrees of freedom.

• $A = (n-1)S = \sum_{j=1}^{n}\left(x_j - \bar{x}\right)\left(x_j - \bar{x}\right)'$ has a Wishart distribution with n − 1 degrees of freedom
• The density function is
$$W_{n-1}\left(A \mid \Sigma\right) = \frac{\left|A\right|^{(n-p-2)/2}\, e^{-\frac{1}{2}\operatorname{tr}\left(A\Sigma^{-1}\right)}}{2^{p(n-1)/2}\, \pi^{p(p-1)/4}\, \left|\Sigma\right|^{(n-1)/2} \prod_{i=1}^{p} \Gamma\!\left(\frac{n-i}{2}\right)}$$

where A and Σ are positive definite
Wishart cont’d
• The Wishart distribution is the multivariate analog of the
central chi-squared distribution.
– If $A_1 \sim W_q\left(A_1 \mid \Sigma\right)$ and $A_2 \sim W_r\left(A_2 \mid \Sigma\right)$ are independent, then $A_1 + A_2 \sim W_{q+r}\left(A_1 + A_2 \mid \Sigma\right)$
– If $A \sim W_n\left(A \mid \Sigma\right)$, then $CAC'$ is distributed $W_n\left(CAC' \mid C\Sigma C'\right)$
– The distribution of the (i, i) element of A is $a_{ii} = \sum_{j=1}^{n}\left(x_{ij} - \bar{x}_i\right)^2 \sim \sigma_{ii}\,\chi^2_{n-1}$
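The defining construction (a sum of outer products of mean-zero normal vectors) is easy to simulate; the sketch below is illustrative only, with an arbitrary Σ chosen for the example:

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 3, 50
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])

# Draw Z_1, ..., Z_n ~ N_p(0, Sigma); rows of Z are the Z_j'
Z = rng.multivariate_normal(np.zeros(p), Sigma, size=n)

# A = sum_j Z_j Z_j' is Wishart-distributed with n degrees of freedom
A = Z.T @ Z

# A is symmetric and (with probability 1 when n >= p) positive definite
assert np.allclose(A, A.T)
assert np.all(np.linalg.eigvalsh(A) > 0)
```

Repeating this many times would give an empirical picture of the Wishart distribution of A.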
Large Sample Behavior
• Let X1, X2, …, Xn be a random sample from a population with
mean μ and covariance Σ (not necessarily normally distributed), where

$$\mu = \begin{pmatrix} \mu_1 \\ \vdots \\ \mu_p \end{pmatrix} \quad \text{and} \quad \Sigma = \begin{pmatrix} \sigma_{11} & \cdots & \sigma_{1p} \\ \vdots & \ddots & \vdots \\ \sigma_{p1} & \cdots & \sigma_{pp} \end{pmatrix}$$

Then $\bar{X}$ and S are consistent estimators for μ and Σ. This means

$$\bar{X} \xrightarrow{P} \mu \quad \text{and} \quad S \xrightarrow{P} \Sigma \quad \text{as } n \to \infty$$
Large Sample Behavior
• If we have a random sample X1, X2, …, Xn from a
population with mean μ and covariance Σ, we can apply
the multivariate central limit theorem as well
• The multivariate CLT says
$$\sqrt{n}\left(\bar{X} - \mu\right) \xrightarrow{d} N_p\left(0, \Sigma\right)$$

and

$$n\left(\bar{X} - \mu\right)' S^{-1} \left(\bar{X} - \mu\right) \;\dot{\sim}\; \chi^2_p$$
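A quick simulation sketch of the CLT statistic (Python for illustration; the shifted-exponential population below is an arbitrary non-normal choice, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 500, 2
mu = np.array([1.0, -2.0])

# Non-normal population with mean mu: independent shifted exponentials
X = rng.exponential(scale=1.0, size=(n, p)) - 1.0 + mu

xbar = X.mean(axis=0)
S = np.cov(X, rowvar=False)

# CLT-based statistic: n (xbar - mu)' S^{-1} (xbar - mu), approx. chi^2_p
# solve(S, v) computes S^{-1} v without forming the inverse explicitly
T = n * (xbar - mu) @ np.linalg.solve(S, xbar - mu)
assert T >= 0.0
```

Over many replications, T should look approximately chi-squared with p = 2 degrees of freedom despite the non-normal population.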
Checking Normality Assumptions
• Check univariate normality for each component of X
– Normal probability plots (i.e. Q-Q plots)
– Tests:
• Shapiro-Wilk
• Correlation
• EDF
• Check bivariate (and higher)
– Bivariate scatter plots
– Chi-square probability plots
Univariate Methods
• If X1, X2,…, Xn are a random sample from a p-dimensional
normal population, then the data for the ith trait are a random
sample from a univariate normal distribution (from result 4.2)
• Q-Q plot
(1) Order the data

$$x_{i(1)} \le x_{i(2)} \le \cdots \le x_{i(n)}$$

(2) Compute the quantiles $q_{(1)} \le q_{(2)} \le \cdots \le q_{(n)}$ according to

$$P\left(Z \le q_{(j)}\right) = \int_{-\infty}^{q_{(j)}} \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, dz = \frac{j - \frac{1}{2}}{n}, \quad j = 1, 2, \ldots, n$$

(3) Plot the pairs of observations

$$\left(x_{i(1)}, q_{(1)}\right), \left(x_{i(2)}, q_{(2)}\right), \ldots, \left(x_{i(n)}, q_{(n)}\right)$$
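The three steps above can be sketched with the standard library alone (Python for illustration; `NormalDist.inv_cdf` plays the role of the normal quantile function):

```python
from statistics import NormalDist

# Step 1: order the data for one trait
x = sorted([2.1, -0.4, 0.3, 1.2, -1.5])
n = len(x)

# Step 2: standard normal quantiles with P(Z <= q_(j)) = (j - 1/2)/n
q = [NormalDist().inv_cdf((j - 0.5) / n) for j in range(1, n + 1)]

# Step 3: the Q-Q plot is the scatter of the pairs (x_(j), q_(j))
pairs = list(zip(x, q))

# Sanity checks: this plotting rule gives quantiles symmetric about 0
assert abs(q[0] + q[-1]) < 1e-12
assert abs(q[n // 2]) < 1e-12  # middle quantile is the median, 0
```

If the data are normal, the plotted pairs should fall close to a straight line.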
Correlation Tests
• Shapiro-Wilk test
• An alternative is a modified version of the Shapiro-Wilk test
• It uses the correlation coefficient from the Q-Q plot
$$r_Q = \frac{\sum_{j=1}^{n}\left(x_{i(j)} - \bar{x}_i\right)\left(q_{(j)} - \bar{q}\right)}{\sqrt{\sum_{j=1}^{n}\left(x_{i(j)} - \bar{x}_i\right)^2}\, \sqrt{\sum_{j=1}^{n}\left(q_{(j)} - \bar{q}\right)^2}}$$
• Reject normality if rQ is too small (values in Table 4.2)
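The coefficient $r_Q$ is just the sample correlation between the ordered data and the plotting quantiles; a minimal sketch (illustrative Python, with a hypothetical helper `rq`):

```python
import numpy as np
from statistics import NormalDist

def rq(x):
    """Correlation between ordered data and standard normal quantiles."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    q = np.array([NormalDist().inv_cdf((j - 0.5) / n)
                  for j in range(1, n + 1)])
    xc, qc = x - x.mean(), q - q.mean()
    return (xc @ qc) / np.sqrt((xc @ xc) * (qc @ qc))

# A sample that exactly matches the normal quantiles gives r_Q = 1
probe = [NormalDist().inv_cdf((j - 0.5) / 10) for j in range(1, 11)]
assert abs(rq(probe) - 1.0) < 1e-9
```

Strongly skewed data would produce a noticeably smaller $r_Q$, which is compared against the critical values in Table 4.2.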
Empirical Distribution Tests
• Anderson-Darling and Kolmogorov-Smirnov statistics measure
how much the empirical distribution function (EDF)

$$F_n\left(x_i\right) = \frac{\#\left\{\text{observations} \le x_i\right\}}{n}$$
differs from the hypothesized distribution $F\left(x, \theta\right)$, using $\hat{\theta}$ to estimate θ
• For a univariate normal distribution
$$\theta = \left(\mu, \sigma^2\right), \quad \hat{\theta} = \left(\bar{x}, s^2\right), \quad \text{and} \quad F\left(x, \hat{\theta}\right) = \Phi\!\left(\frac{x - \bar{x}}{s}\right)$$
• Large values for either statistic indicate observed data were
not sampled from the hypothesized distribution
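Both EDF statistics are available in scipy; a hedged sketch (illustrative Python with a simulated sample, and note that the tabulated KS critical values are not exact when θ is estimated from the data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(loc=5.0, scale=2.0, size=200)

# Anderson-Darling test for normality with estimated mean and sd
ad = stats.anderson(x, dist='norm')

# Kolmogorov-Smirnov against N(xbar, s), plugging in the estimates
ks = stats.kstest(x, 'norm', args=(x.mean(), x.std(ddof=1)))

# Large statistic values would indicate a poor fit to the normal
assert ad.statistic >= 0
assert 0 <= ks.statistic <= 1
```

Here the data really are normal, so neither statistic should be extreme.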
Multivariate Methods
• You can generate bivariate plots of all pairs of traits and look
for unusual observations
• A chi-square plot checks for normality in p > 2 dimensions
(1) For each observation compute
$$d_j^2 = \left(x_j - \bar{x}\right)' S^{-1} \left(x_j - \bar{x}\right), \quad j = 1, 2, \ldots, n$$

(2) Order these values from smallest to largest

$$d_{(1)}^2 \le d_{(2)}^2 \le \cdots \le d_{(n)}^2$$

(3) Calculate quantiles for the chi-squared distribution with p d.f.

$$q_{(1)} \le q_{(2)} \le \cdots \le q_{(n)}, \quad \text{where} \quad P\left(\chi_p^2 \le q_{(j)}\right) = \frac{j - \frac{1}{2}}{n}$$
Multivariate Methods
(4) Plot the pairs

$$\left(d_{(1)}^2, q_{(1)}\right), \left(d_{(2)}^2, q_{(2)}\right), \ldots, \left(d_{(n)}^2, q_{(n)}\right)$$

(Plot: $q_{(j)}$ on the horizontal axis, $d_{(j)}^2$ on the vertical.)
Do the points deviate too much from a straight line?
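The four steps of the chi-square plot can be sketched as follows (illustrative Python with simulated MVN data; numpy and scipy assumed):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, p = 100, 3
X = rng.multivariate_normal(np.zeros(p), np.eye(p), size=n)

xbar = X.mean(axis=0)
S = np.cov(X, rowvar=False)

# Step 1: squared distances d_j^2 = (x_j - xbar)' S^{-1} (x_j - xbar)
D = X - xbar
d2 = np.einsum('ij,ij->i', D @ np.linalg.inv(S), D)  # row-wise quadratic forms

# Step 2: order them; step 3: matching chi-square(p) quantiles
d2_sorted = np.sort(d2)
q = stats.chi2.ppf((np.arange(1, n + 1) - 0.5) / n, df=p)

# Step 4: the chi-square plot is the scatter of (q_(j), d2_(j));
# for MVN data the points should fall near a straight line
assert np.all(d2 >= 0)
assert np.all(np.diff(q) > 0)
```

Marked curvature or isolated points far from the line suggest non-normality or outliers.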
Things to Do with non-MVN Data
• Apply normal based procedures anyway
– Hope for the best….
– Resampling procedures
• Try to identify a more appropriate multivariate
distribution
• Nonparametric methods
• Transformations
• Check for outliers
Transformations
• The idea of transformations is to re-express the data
to make it more normal looking
• Choosing a suitable transformation can be guided by
– Theoretical considerations
• Count data can often be made to look more normal by
using a square root transformation
– The data themselves
• If the choice is not particularly clear consider power
transformations
Power Transformations
• Commonly used, but note that they are defined only for
positive variables
• Defined by a parameter λ as follows:

$$y_j = \begin{cases} x_j^{\lambda} & \text{if } \lambda \ne 0 \\ \ln x_j & \text{if } \lambda = 0 \end{cases}$$
• So what do we use?
– For right-skewed data consider λ < 1 (fractions, 0, negative
numbers…)
– For left-skewed data consider λ > 1
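A minimal sketch of the power transform (illustrative Python; the helper name `power_transform` is hypothetical):

```python
import math

def power_transform(x, lam):
    """Power transform: x**lam for lam != 0, ln(x) for lam = 0.

    Defined only for positive x, as noted above."""
    if x <= 0:
        raise ValueError("power transforms require positive data")
    return math.log(x) if lam == 0 else x ** lam

# lam < 1 pulls in a long right tail (e.g. square root for counts)
assert power_transform(9.0, 0.5) == 3.0
assert abs(power_transform(math.e, 0) - 1.0) < 1e-12
```

The λ = 0 case is defined as the log because $\left(x^{\lambda} - 1\right)/\lambda \to \ln x$ as λ → 0.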
Power Transformations
• Box-Cox transformations are a popular modification of the
power transformations, where
$$y_j = \begin{cases} \dfrac{x_j^{\lambda} - 1}{\lambda} & \text{if } \lambda \ne 0 \\ \ln x_j & \text{if } \lambda = 0 \end{cases}$$

• Box-Cox transformations determine the best λ by
maximizing:

$$\ell(\lambda) = -\frac{n}{2}\ln\!\left[\frac{1}{n}\sum_{j=1}^{n}\left(y_j - \bar{y}\right)^2\right] + \left(\lambda - 1\right)\sum_{j=1}^{n}\ln x_j, \quad \text{where } \bar{y} = \frac{1}{n}\sum_{j=1}^{n} y_j$$
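The profile log-likelihood above can be maximized with a crude grid search; a hedged sketch (illustrative Python, with made-up data and the hypothetical helpers `boxcox` / `boxcox_loglik`):

```python
import numpy as np

def boxcox(x, lam):
    """Box-Cox transform of positive data x for a given lambda."""
    x = np.asarray(x, dtype=float)
    return np.log(x) if lam == 0 else (x ** lam - 1.0) / lam

def boxcox_loglik(x, lam):
    """Profile log-likelihood l(lambda) to be maximized over lambda."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    y = boxcox(x, lam)
    return (-(n / 2.0) * np.log(np.mean((y - y.mean()) ** 2))
            + (lam - 1.0) * np.sum(np.log(x)))

# Crude grid search for the maximizing lambda
x = np.array([0.5, 1.1, 2.3, 4.9, 10.2, 21.0])  # right-skewed toy data
grid = np.linspace(-2, 2, 401)
lam_hat = grid[np.argmax([boxcox_loglik(x, l) for l in grid])]
assert -2.0 <= lam_hat <= 2.0
```

In practice one would use a built-in routine (e.g. `scipy.stats.boxcox`) rather than a hand-rolled grid search.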
Transformations
• Note, in the multivariate setting, this would be
considered for every trait
• However… normality of each individual trait does not
guarantee joint normality
• We could iteratively try to search for the best
transformations for joint and marginal normality
– May not really improve our results substantially
– And often univariate transformations are good enough in
practice
• Be very cautious about rejecting normality
Next Time
• Examples of normality checks in SAS and R
• Begin our discussion of statistical inference for
MV vectors