Multilevel Modeling
using Stata
Andrew Hicks
CCPR Statistics and Methods Core
Workshop based on the book:
Multilevel and Longitudinal Modeling Using Stata (Second Edition)
by Sophia Rabe-Hesketh and Anders Skrondal
Within-Subject Dependence
[Figure: Mini Wright measurements (vertical axis, 200 to 700) for subject IDs 1 to 17 (horizontal axis), plotted separately for Occasion 1 and Occasion 2]
Within-Subject Dependence: We can predict the occasion 2 measurement if
we know the subject's occasion 1 measurement.
Between-Subject Heterogeneity: Large differences between subjects
(compare subjects 9 and 15)
Within-subject dependence is due to between-subject heterogeneity
Standard Regression Model
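Measurement of subject j on occasion i:
$y_{ij} = \beta + \xi_{ij}$
$\beta$: Population Mean
$\xi_{ij}$: Residuals (error terms)
Independent over subjects and occasions
[Figure: each residual $\xi_{ij}$ shown as the deviation of a measurement from the population mean $\beta$]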
Clearly ignores information about
within-subject dependence
Variance Component Model
$y_{ij} = \beta + \xi_{ij}$
$y_{ij} = \beta + \zeta_j + \epsilon_{ij}$
Random Intercept ($\zeta_j$): deviation of subject
j's mean from overall mean $\beta$
Within-subject residual ($\epsilon_{ij}$): deviation of
observation i from subject j's mean
[Figure: subject j's mean $\beta + \zeta_j$ lies $\zeta_j$ away from the overall mean $\beta$; the residuals $\epsilon_{1j}$ and $\epsilon_{2j}$ are the deviations of the two measurements from the subject's mean]
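Variance Component Model
$y_{ij} = \beta + \zeta_j + \epsilon_{ij}$
$\zeta_j \sim N(0, \psi)$
$\epsilon_{ij} \sim N(0, \theta)$
$\mathrm{Var}(y_{ij}) = \mathrm{Var}(\beta) + \mathrm{Var}(\zeta_j) + \mathrm{Var}(\epsilon_{ij})$, with $\mathrm{Var}(\beta) = 0$
$\mathrm{Var}(y_{ij}) = \psi + \theta$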
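As a rough illustration of how the two variance components could be recovered from data, here is a minimal Python sketch using one-way ANOVA moment estimators for balanced data. This is not the estimation method Stata uses; the function name `variance_components` and the toy measurements are invented for illustration only.

```python
from statistics import mean

def variance_components(clusters):
    """Moment (one-way ANOVA) estimates of psi and theta for balanced data.

    clusters: one list of measurements per subject, all of equal length n >= 2.
    Returns (psi_hat, theta_hat).
    """
    J = len(clusters)
    n = len(clusters[0])
    if J < 2 or n < 2 or any(len(c) != n for c in clusters):
        raise ValueError("need >= 2 subjects with the same number (>= 2) of occasions")

    subject_means = [mean(c) for c in clusters]
    grand_mean = mean(subject_means)  # equals the overall mean when data are balanced

    # Within-subject mean square: estimates theta = Var(epsilon_ij).
    msw = sum((y - m) ** 2
              for c, m in zip(clusters, subject_means)
              for y in c) / (J * (n - 1))
    # Between-subject mean square: its expectation is n * psi + theta.
    msb = n * sum((m - grand_mean) ** 2 for m in subject_means) / (J - 1)

    theta_hat = msw
    psi_hat = max((msb - msw) / n, 0.0)  # a negative estimate is truncated at zero
    return psi_hat, theta_hat

# Hypothetical data: three subjects, two occasions each (not from the workshop).
toy = [[10, 12], [20, 18], [30, 32]]
psi_hat, theta_hat = variance_components(toy)
print(psi_hat, theta_hat, psi_hat + theta_hat)
```

In this toy example almost all of the estimated total variance comes from the between-subject component, which is the pattern the Mini Wright plot suggests.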
Variance Component Model
$y_{ij} = \beta + \zeta_j + \epsilon_{ij}$
Proportion of Total Variance due to subject differences:
$\frac{\mathrm{Var}(\zeta_j)}{\mathrm{Var}(y_{ij})} = \frac{\psi}{\psi + \theta} = \rho$
Intraclass Correlation: within cluster correlation
$\mathrm{Cor}(y_{1j}, y_{2j}) = \rho$
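A short derivation may help show why the variance proportion is also the within-cluster correlation. It assumes, as is standard for this model, that the random intercept and the within-subject residuals are mutually independent, so the only thing two measurements on the same subject share is the random intercept:

```latex
\begin{aligned}
\mathrm{Cov}(y_{1j}, y_{2j})
  &= \mathrm{Cov}(\beta + \zeta_j + \epsilon_{1j},\; \beta + \zeta_j + \epsilon_{2j})
   = \mathrm{Var}(\zeta_j) = \psi \\
\mathrm{Cor}(y_{1j}, y_{2j})
  &= \frac{\mathrm{Cov}(y_{1j}, y_{2j})}
          {\sqrt{\mathrm{Var}(y_{1j})}\,\sqrt{\mathrm{Var}(y_{2j})}}
   = \frac{\psi}{\psi + \theta} = \rho
\end{aligned}
```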
Random or Fixed Effect?
Since every subject has a different effect $\zeta_j$, we can think of
subject as a categorical explanatory variable. Since the effect
of each subject is random, we have been using a random effect model:
$y_{ij} = \beta + \zeta_j + \epsilon_{ij}$, $\zeta_j \sim N(0, \psi)$
What if we want each effect to belong to a specific subject and to be
treated as a fixed unknown parameter? Then we would use a fixed effect model:
$y_{ij} = \beta + \alpha_j + \epsilon_{ij}$, $\sum_{j=1}^{J} \alpha_j = 0$
. xtreg wm, fe
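To make the sum-to-zero constraint concrete, here is a minimal Python sketch of the least-squares estimates for the fixed effect model without covariates. It is an illustration only and makes no claim about how Stata's xtreg computes or reports its results; the function name `fixed_effects` and the toy data are hypothetical.

```python
from statistics import mean

def fixed_effects(clusters):
    """Least-squares estimates of beta and alpha_j subject to sum(alpha_j) = 0.

    Without covariates the model fits every subject mean exactly, so
    beta_hat + alpha_hat[j] equals subject j's mean, and the constraint
    makes beta_hat the unweighted mean of the subject means.
    """
    subject_means = [mean(c) for c in clusters]
    beta_hat = mean(subject_means)
    alpha_hat = [m - beta_hat for m in subject_means]
    return beta_hat, alpha_hat

# Hypothetical data: three subjects, two occasions each (not from the workshop).
toy = [[10, 12], [20, 18], [30, 32]]
beta_hat, alpha_hat = fixed_effects(toy)
print(beta_hat, alpha_hat, sum(alpha_hat))
```

Each estimated effect here describes one specific subject in the dataset, which is the contrast with the random effect model, where only the variance of the subject effects is a model parameter.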
Random or Fixed Effect?
random effect model:
if the interest concerns the population of clusters
"generalize the potential effect"
e.g. the nurse giving the drug
fixed effect model:
if we are interested in the "effect" of the specific clusters in a particular
dataset
"replicable in life"
e.g. the actual drug
Random Intercept Model
with Covariates
without covariates:
$y_{ij} = \beta + \xi_{ij}$
$y_{ij} = \beta + \zeta_j + \epsilon_{ij}$
Random Intercept Model
with Covariates
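with covariates:
$y_{ij} = \beta_1 + \beta_2 x_{2ij} + \cdots + \beta_p x_{pij} + \xi_{ij}$
$y_{ij} = \beta_1 + \beta_2 x_{2ij} + \cdots + \beta_p x_{pij} + \zeta_j + \epsilon_{ij}$
$= (\beta_1 + \zeta_j) + \beta_2 x_{2ij} + \cdots + \beta_p x_{pij} + \epsilon_{ij}$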
$\zeta_j$ is a random parameter: it is not estimated along with the fixed parameters $\beta_1, \ldots, \beta_p$;
instead, its variance $\psi$ is estimated together with the variance $\theta$ of $\epsilon_{ij}$
Ecological Fallacy
occurs when between-cluster relationships differ substantially
from within-cluster relationships.
• Can be caused by cluster-level confounding
For example, mothers who smoke during pregnancy may also adopt
other behaviors such as drinking and poor nutritional intake, or have lower
socioeconomic status and be less educated. These variables adversely affect
birthweight and have not been adequately controlled for. In these cases the
covariate is correlated with the error term (endogeneity).
• Because of this, the between-effect may be an overestimate of the
true effect.
• In contrast, for within-effects each mother serves as her own control,
so within-mother estimates may be closer to the true causal effect.
How to test for endogeneity?
Use the Hausman test to compare two alternative estimators of $\beta$
(the random-effects estimator and the fixed-effects estimator)
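The gap between between-cluster and within-cluster relationships can be illustrated with a minimal Python sketch for a single covariate. The toy data are invented so that the two slopes have opposite signs; the function name `within_and_between_slopes` is hypothetical, the between slope is an unweighted regression on cluster means, and the sketch does not implement the Hausman test itself.

```python
from statistics import mean

def within_and_between_slopes(clusters):
    """clusters: list of (x_values, y_values) pairs, one pair per cluster.

    Returns (within_slope, between_slope):
    within  = slope of y on x after subtracting each cluster's own means,
    between = slope of the cluster means of y on the cluster means of x.
    """
    sxy_w = sxx_w = 0.0
    x_means, y_means = [], []
    for xs, ys in clusters:
        xbar, ybar = mean(xs), mean(ys)
        x_means.append(xbar)
        y_means.append(ybar)
        # Deviations from the cluster's own means: each cluster is its own control.
        sxy_w += sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
        sxx_w += sum((x - xbar) ** 2 for x in xs)

    gx, gy = mean(x_means), mean(y_means)
    sxy_b = sum((xm - gx) * (ym - gy) for xm, ym in zip(x_means, y_means))
    sxx_b = sum((xm - gx) ** 2 for xm in x_means)
    return sxy_w / sxx_w, sxy_b / sxx_b

# Hypothetical data: y falls with x inside every cluster,
# yet clusters with higher mean x also have higher mean y.
toy = [
    ([1, 2, 3], [5, 4, 3]),
    ([4, 5, 6], [8, 7, 6]),
    ([7, 8, 9], [11, 10, 9]),
]
within_slope, between_slope = within_and_between_slopes(toy)
print(within_slope, between_slope)  # prints -1.0 1.0
```

A large discrepancy of this kind between the two estimates is the situation in which a cluster-level confounder should be suspected.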
Random-coefficient model
We've already considered random intercept models, where the intercept
is allowed to vary over clusters after controlling for covariates.
What if we would also like the coefficients (or slopes) to vary across
clusters?
Models that involve both random intercepts and random slopes are
called Random Coefficient Models
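Random-coefficient model
Random Intercept Model:
$y_{ij} = \beta_1 + \beta_2 x_{ij} + \zeta_j + \epsilon_{ij}$
Random Coefficient Model:
$y_{ij} = \beta_1 + \beta_2 x_{ij} + \zeta_{1j} + \zeta_{2j} x_{ij} + \epsilon_{ij}$
$\zeta_{1j}$: cluster-specific random intercept
$\zeta_{2j}$: cluster-specific random slope
$y_{ij} = (\beta_1 + \zeta_{1j}) + (\beta_2 + \zeta_{2j}) x_{ij} + \epsilon_{ij}$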
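The regrouping in the last equation says that each cluster has its own regression line. A minimal Python sketch, with parameter values invented purely for illustration, checks that the additive form and the cluster-specific-line form give the same response:

```python
def y_additive(beta1, beta2, zeta1, zeta2, x, eps):
    """Fixed part plus random intercept and random slope, written term by term."""
    return beta1 + beta2 * x + zeta1 + zeta2 * x + eps

def y_cluster_line(beta1, beta2, zeta1, zeta2, x, eps):
    """Same model written as a cluster-specific line plus residual."""
    intercept = beta1 + zeta1  # cluster j's own intercept
    slope = beta2 + zeta2      # cluster j's own slope
    return intercept + slope * x + eps

# Hypothetical values: a cluster that starts lower than average (zeta1 < 0)
# but has a steeper slope than average (zeta2 > 0).
params = dict(beta1=2.0, beta2=0.5, zeta1=-1.0, zeta2=0.25, x=4.0, eps=0.0)
additive = y_additive(**params)
line = y_cluster_line(**params)
print(additive, line)
```

With these values the cluster's line has intercept 1.0 and slope 0.75, compared with the population-average line's intercept 2.0 and slope 0.5.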