Y ij - Kevin E. O`Grady, PhD
Mixing it up: Mixed Models
Tracy Tomlinson
December 11, 2009
Outline
What are fixed effects?
What are random effects?
How do I know if my effects are fixed or random?
Why do I care about fixed and random effects?
Mixed models
SAS and mixed models
SPSS and mixed models
Fixed Effects
Specific levels of interest of a factor are
selected
May use all levels or a subset of levels
These are the specific levels of interest
Interest in comparing these levels
Inference only for these levels
Random Effects
Levels of a factor selected from a probability
distribution
Interested in the extent to which the random
factor accounts for variance in the dependent
variable
May be a control variable or a variable of interest
Rather than being interested in the individual
means across the levels of the fixed factor,
we are interested in the variance of means
across the levels of a random factor
Fixed Effects: One Factor
Running a clinical trial in which a drug is
administered at four different dose levels
Model Equation:
$Y_{ij} = \mu + \alpha_i + e_{ij}$
$i$ corresponds to 1, 2, 3, or 4 dose levels
$\alpha_i$ is the effect of the drug (at dose level $i$) on the mean
Random Effects: One Factor
Clinical trial using a new drug at 20 different
clinics in DC selected at random
Model Equation:
$Y_{ij} = \mu + a_i + e_{ij}$
Where $i$ corresponds to the 20 clinics
Where $\mu$ represents the mean of all dosages in the
population, not just the observed study
The effects $a_i$ are random variables with mean 0
and variance $\sigma_a^2$
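One consequence of this model is that the variance of an observation splits into a between-clinic part and a within-clinic part. The sketch below assumes, beyond what the slide states, that the clinic effects $a_i$ and the errors $e_{ij}$ are independent and that the errors have variance $\sigma^2$; the symbol $\rho$ for the intraclass correlation is introduced here for illustration.

```latex
\begin{aligned}
\operatorname{Var}(Y_{ij}) &= \operatorname{Var}(a_i) + \operatorname{Var}(e_{ij}) = \sigma_a^2 + \sigma^2 \\
\operatorname{Cov}(Y_{ij}, Y_{ij'}) &= \sigma_a^2 \qquad (j \neq j', \text{ same clinic } i) \\
\rho &= \frac{\sigma_a^2}{\sigma_a^2 + \sigma^2}
\end{aligned}
```

Read this way, $\rho$ is the share of the total variance that is attributable to differences among clinics.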
Fixed Versus Random
Fixed: Levels of a factor chosen of specific interest
Random: Levels of a factor selected from a probability distribution
Fixed: Interested in the means across the chosen levels of the factor
Random: Interested in the variance of means across the levels of the factor
Determining Fixed and Random Effects
YOU determine what effects you have!
As the researcher you select your levels of
interest:
Are the specific levels of interest?
Are you interested in comparing group means?
Did you sample the levels from a larger
population?
YOU determine the population!
Fixed or Random Effect?
Is it reasonable to assume that the levels of the factor
come from a probability distribution?
No: Fixed Factor
Yes: Random Factor
Fixed or Random Effect?
Do you care about comparing the specific factor
level means?
Yes: Fixed Factor
No: Random Factor
Why does it Matter, or Does it?
Assumptions about random effects differ from
those for fixed effects
Error terms are different depending on fixed
versus random effects
Random effects have additional error terms
beyond $\sigma^2$
The effects $a_i$ are random variables with mean 0
and variance $\sigma_a^2$
Important for inferential statistics
Fixed and Random Effects Error Terms
Fixed effects error = $MS_{error}$ = $MS_{within}$ (an estimate of $\sigma^2$)
Random effects error: Depends on the
nature of the random and fixed effects
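Two Factor Model
A fixed, B fixed
$Y_{ij} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ij}$
One error term: $MS_{within}$ ($\sigma_e^2$)
A fixed, B random
$Y_{ij} = \mu + \alpha_i + b_j + (ab)_{ij} + e_{ij}$
A random, B fixed
$Y_{ij} = \mu + a_i + \beta_j + (ab)_{ij} + e_{ij}$
A random, B random
$Y_{ij} = \mu + a_i + b_j + (ab)_{ij} + e_{ij}$
Two error terms (models with a random factor):
1) $MS_{within}$ ($\sigma_e^2$)
2) $MS_{ab}$ ($\sigma_{ab}^2$)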
Two Factor Model
In all cases the highest level interaction term is tested
against $MS_{within}$ ($\sigma_e^2$)
Two Factor Model: A fixed, B fixed
$Y_{ij} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ij}$
All effects tested against $MS_{within}$ ($\sigma_e^2$)
Two Factor Model: A fixed, B random
$Y_{ij} = \mu + \alpha_i + b_j + (ab)_{ij} + e_{ij}$
Fixed effect A tested against $MS_{ab}$ ($\sigma_{ab}^2$)
Random effect B tested against $MS_{within}$ ($\sigma_e^2$)
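The choice of error terms follows from the expected mean squares. As a sketch for the A fixed, B random case, assume a balanced design with $a$ levels of A, $b$ levels of B, and $n$ observations per cell, and the restricted form of the mixed model in which the interaction effects sum to zero over the levels of the fixed factor. The symbols $a$, $b$, $n$, and $\sigma_b^2$ (the variance of the $b_j$) are introduced here and are not on the slide.

```latex
\begin{aligned}
E(MS_A) &= \sigma_e^2 + n\,\sigma_{ab}^2 + \frac{nb \sum_{i} \alpha_i^2}{a - 1} \\
E(MS_B) &= \sigma_e^2 + na\,\sigma_b^2 \\
E(MS_{ab}) &= \sigma_e^2 + n\,\sigma_{ab}^2 \\
E(MS_{within}) &= \sigma_e^2
\end{aligned}
```

When all $\alpha_i = 0$, $E(MS_A)$ reduces to $E(MS_{ab})$, so $F = MS_A / MS_{ab}$. When $\sigma_b^2 = 0$, $E(MS_B)$ reduces to $E(MS_{within})$, so $F = MS_B / MS_{within}$. Under the unrestricted form of the model, $E(MS_B)$ also contains $n\,\sigma_{ab}^2$, and B would then be tested against $MS_{ab}$ instead.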
Two Factor Model: A random, B fixed
$Y_{ij} = \mu + a_i + \beta_j + (ab)_{ij} + e_{ij}$
Random effect A tested against $MS_{within}$ ($\sigma_e^2$)
Fixed effect B tested against $MS_{ab}$ ($\sigma_{ab}^2$)
Two Factor Model: A random, B random
$Y_{ij} = \mu + a_i + b_j + (ab)_{ij} + e_{ij}$
Random effects A and B tested against $MS_{ab}$ ($\sigma_{ab}^2$)
Interaction tested against $MS_{within}$ ($\sigma_e^2$)
Fixed and Random Effects and Effect Size
Effect size estimates assume effects are
fixed or random
Fixed: [effect size formula not recoverable]
Random: [effect size formula not recoverable]
Why does it Matter, or Does it?
Interpretation differs
Fixed effects interpretations constrained to the levels of the
factor(s) in the study
Random effects interpretations have a broader generalization to the
population of interest
Mixed Models
Contains both fixed and random effects
Randomized blocks designs
Nested/ Hierarchical designs
Split-plot designs
Clustered designs
Repeated measures
Analysis comparable to:
One way ANOVA
Two way ANOVA
ANCOVA
Linear Mixed Model (LMM)
• Handles data where observations are not independent
• LMM correctly models correlated errors, whereas procedures in
the general linear model family (GLM) usually do not
• Nature [structure] of the correlation must be correctly modeled for
the tests of mean differences to be unbiased
• LMM is a further generalization of GLM to better support
analysis of a continuous dependent variable for:
• Random effects
• Hierarchical effects
• Repeated measures
Mixed Models Randomized Blocks Example
Testing four drugs and assigning n subjects to one
of 4 groups carefully matched by demographic
variables. Each person in the four groups gets one
of the drugs.
Fixed effect of treatment
Random effect of blocks
$Y_{ij} = \mu + \alpha_i + b_j + e_{ij}$
$\mu$ and $\alpha_i$ represent unknown fixed parameters: the intercept and
the four drug treatment effects
$b_j$ and $e_{ij}$ are random variables representing blocks and
error
$b_j$ assumed to have a variance of $\sigma_b^2$
Error ($e_{ij}$) assumed to have a variance of $\sigma^2$
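The random block effect is what induces correlation among observations in the same block. A sketch of the implied covariance structure, assuming in addition that the block effects and the errors are mutually independent (an assumption not stated on the slide):

```latex
\begin{aligned}
\operatorname{Var}(Y_{ij}) &= \sigma_b^2 + \sigma^2 \\
\operatorname{Cov}(Y_{ij}, Y_{i'j}) &= \sigma_b^2 \qquad (i \neq i', \text{ same block } j) \\
\operatorname{Cov}(Y_{ij}, Y_{i'j'}) &= 0 \qquad (j \neq j', \text{ different blocks})
\end{aligned}
```

Every pair of observations within a block has the same covariance, which is the compound symmetry pattern.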
Mixed Models Hierarchical Example
Using four dosage levels of a drug in 20 clinics. In each
clinic each patient was randomly assigned to one of the 4
dose levels.
$Y_{ijk} = \mu + \alpha_i + b_j + c_{ij} + e_{ijk}$
Where $\alpha_i$, $b_j$, and $c_{ij}$ are the effects due to drug dose i, clinic
j, and clinic-by-dose interaction
IF you assume that the 20 clinics are not sampled, this
experiment may now have only fixed effects
$Y_{ijk} = \mu + \alpha_i + \beta_j + c_{ij} + e_{ijk}$
Mixed Models Repeated Measures and Split-Plot
Three drug treatments randomly assigned to
subjects with subjects observed at 1, 2, …, 7, and
8 hours post-treatment
$Y_{ijk} = \mu + \alpha_i + s(\alpha)_{ij} + \tau_k + (\alpha\tau)_{ik} + e_{ijk}$
Where $\alpha$ represents treatment effects
$\tau$ represents time (or hour) effects
$s(\alpha)$ represents the random subject within
treatment effects
PROC MIXED
Specifically designed to fit mixed effect models.
It can model:
Random and mixed effect data
Repeated measures
Spatial data
Data with heterogeneous variances and autocorrelated
observations
The MIXED procedure is more general than GLM in
the sense that it gives a user more flexibility in
specifying the correlation structures, particularly
useful in repeated measures and random effect
models
GLM uses OLS estimation
Mixed uses ML, REML, or MIVQUE0 estimation
PROC MIXED v. GLM
The PROC MIXED syntax is similar to the syntax of
PROC GLM.
The random effects and repeated statements are
used differently
Random effects are not listed in the model statement for
MIXED
GLM has MEANS and LSMEANS statements
MIXED has only the LSMEANS statement
General SAS Mixed Model Syntax
PROC MIXED options;
CLASS variable-list;
MODEL dependent = fixed effects / options;
RANDOM random effects / options;
REPEATED repeated effects / options;
CONTRAST 'label' fixed-effect values | random-effect values / options;
ESTIMATE 'label' fixed-effect values | random-effect values / options;
LSMEANS fixed-effects / options;
MAKE 'table' OUT= SAS-data-set < options >;
RUN;
PROC MIXED statement
PROC MIXED options;
PROC MIXED noclprint covtest ;
The NOCLPRINT option prevents the printing of the CLASS level
information
The first time you run the program you probably don’t want to include
noclprint
When there are lots of group units, use NOCLPRINT to suppress the
printing of group names.
The COVTEST option tells SAS that you would like hypothesis tests
for the variance and covariance components.
CLASS statement
CLASS variable-list;
CLASS IDpatient IDclinic;
The CLASS statement indicates that IDpatient and IDclinic are
classification variables whose values do not contain
quantitative information
The variables that we want SAS to treat as categorical
variables go here.
Variables that are characters (e.g., city names) must be on
this line (it won’t run otherwise).
MODEL statement
MODEL dependent = fixed effects/options;
MODEL dosage = /solution;
If you have no fixed effects you would have no independent variables
listed in the model statement
/solution option asks SAS to print the estimates for the fixed effects
Intercept is included as a default in SAS
If you would like to fit a model without the intercept you add the
/NOINT option to the model statement
Random statement
RANDOM intercept /sub=IDclinic;
By default there is always at least one random effect, usually the
lowest-level residual
You can specify the intercept on the RANDOM statement: This
indicates the presence of a second random effect and the intercept
should be treated not only as a fixed effect but also as a random effect
The SUB option on the RANDOM statement specifies the multilevel
structure, indicating how the individuals (level-1 units) are divided into
higher level groups (level-2 units)
General SPSS Mixed Model Syntax
Mixed dependent with independent
/print=solution
/fixed=fixed effects
/random intercept random effects | subject(random effect grouping) covtype (options)
/repeated = repeated effect | subject(repeated effect grouping) covtype (options).
Recap Main Points Slide
Fixed versus random effects
Error terms
Effect size
Interpretation/inference
Sampling
Independence of observations
SAS and SPSS syntax for random and
mixed models
For your Reading Pleasure
Schabenberger, O. (2006). SAS system for mixed
models (Second Edition). Cary, NC: SAS Institute.
McCullagh, P., & Nelder, J. A. (1989). Generalized
linear models (Second Edition). New York: Chapman
and Hall.
McCulloch, C., & Searle, S. (2008). Generalized, linear,
and mixed models (Second Edition). New York: Wiley.
Verbeke, G. E., & Molenberghs, G. (1997). Linear
mixed models in practice: A SAS-oriented approach.
New York: Springer.
Fahrmeir, L., & Tutz, G. (1994). Multivariate statistical
modeling based on generalized linear models.
Heidelberg: Springer-Verlag.
Lindsey, J. (1993).
Data Example with PROC MIXED
High School and Beyond data example (Bryk &
Raudenbush, 1992)
7,185 students in 160 schools
MATHACH: Student level (level-1) outcome is
math achievement
SES: Student level (level-1) covariate is socioeconomic status
MEANSES: School-level (level-2) covariate of
mean SES for the school
SECTOR: School-level (level-2) covariate of
school type (dummy coded, public= 0 and
Catholic=1)
Singer (1998)
Random Effects Model
Unconditional means model examining the
variation of MATHACH across schools
One-way random effects ANOVA model
Model Equation
$Y_{ij} = \mu + \alpha_j + r_{ij}$
SAS syntax
proc mixed;
class school;
model mathach = ;
random school;
run;
MIXED Model Two-Level Approach
Level 1: a student's outcome ($Y_{ij}$) is expressed as the sum of an
intercept for the student's school ($\beta_{0j}$) and a random error term ($r_{ij}$)
associated with the ith student in the jth school
$Y_{ij} = \beta_{0j} + r_{ij}$
Level 2: We express the school level intercept as the sum of an
overall mean ($\gamma_{00}$) and a series of random deviations from that mean
($u_{0j}$).
$\beta_{0j} = \gamma_{00} + u_{0j}$
This leads to a model equation of
$Y_{ij} = \gamma_{00} + u_{0j} + r_{ij}$
SAS syntax
proc mixed noclprint covtest;
class school;
model mathach = /solution;
random intercept/sub=school;
run;
Mixed Model Two-Level Approach Output
Variance component for between school: $\tau_{00}$
Variance component for within school: $\sigma^2$
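To make the two variance components concrete, here is a minimal sketch in Python (used instead of SAS only so that it can be run anywhere). It estimates the between-school and within-school components by the ANOVA method of moments for a balanced design and then forms the intraclass correlation. The function names `variance_components` and `icc` and the tiny data set are hypothetical and are not part of the High School and Beyond example; PROC MIXED itself uses REML by default, not this estimator.

```python
from statistics import mean


def variance_components(groups):
    """ANOVA (method-of-moments) estimates for a balanced one-way
    random effects model. `groups` is a list of equally sized lists,
    one list of outcomes per school. Returns (between, within)."""
    k = len(groups)          # number of schools
    n = len(groups[0])       # observations per school
    if k < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need >= 2 balanced groups with >= 2 observations each")

    grand_mean = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]

    # Mean square between schools and mean square within schools
    ms_between = n * sum((m - grand_mean) ** 2 for m in group_means) / (k - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    ) / (k * (n - 1))

    within = ms_within
    # Truncate at zero: a variance cannot be negative
    between = max((ms_between - ms_within) / n, 0.0)
    return between, within


def icc(between, within):
    """Intraclass correlation: share of variance lying between schools."""
    return between / (between + within)


if __name__ == "__main__":
    # Hypothetical scores for three schools, two students each
    scores = [[1, 3], [5, 7], [9, 11]]
    between, within = variance_components(scores)
    print(between, within, icc(between, within))
```

The same ratio can be formed by hand from the two rows of the covariance parameter estimates that PROC MIXED prints for this model.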
SPSS for Random Effects Model
SAS syntax
proc mixed noclprint covtest;
class school;
model mathach = /solution;
random intercept/sub=school;
run;
SPSS syntax
mixed mathach
/print=solution
/random intercept | subject(school).
SPSS output
Including Level-2 Predictors
School-level (level-2) predictor of MEANSES
Level 1: a student's outcome ($Y_{ij}$) is expressed as the sum of an intercept
for the student's school ($\beta_{0j}$) and a random error term ($r_{ij}$) associated
with the ith student in the jth school
$Y_{ij} = \beta_{0j} + r_{ij}$
Level 2: We express the school level intercept as the sum of an overall
mean ($\gamma_{00}$), MEANSES and a series of random deviations from that
mean ($u_{0j}$).
$\beta_{0j} = \gamma_{00} + \gamma_{01}\text{MEANSES}_j + u_{0j}$
This leads to a model equation of
$Y_{ij} = [\gamma_{00} + \gamma_{01}\text{MEANSES}_j] + [u_{0j} + r_{ij}]$
First bracket: Fixed Effects. Second bracket: Random Effects.
SAS syntax
proc mixed noclprint covtest;
class school;
model mathach = meanses/solution ddfm=bw;
random intercept/sub=school;
run;
ddfm=bw: Compute df for fixed effects with “between/within”
SAS Output
Random effect: $\tau_{00}$
Random effect: $\sigma^2$
Fixed effect: $\gamma_{00}$
Fixed effect: $\gamma_{01}$
SPSS for Including Level-2 Predictor
SAS syntax
proc mixed noclprint covtest;
class school;
model mathach = meanses/solution ddfm=bw;
random intercept/sub=school;
run;
SPSS syntax
mixed mathach with meanses
/print=solution
/fixed = meanses
/random intercept | subject(school).
SPSS output
Including Level-1 Predictors
Student-level (level-1) predictor of SES
Level 1: a student's outcome ($Y_{ij}$) is expressed as a function of an intercept for the
student's school ($\beta_{0j}$), individual CSES (SES centered by MEANSES), and a
random error term ($r_{ij}$) associated with the ith student in the jth school
$Y_{ij} = \beta_{0j} + \beta_{1j}\text{CSES}_{ij} + r_{ij}$
Level 2: We express the school level intercept and the school level CSES slope each as
the sum of an overall mean ($\gamma_{00}$, $\gamma_{10}$) and a series of random deviations
from that mean ($u_{0j}$, $u_{1j}$).
$\beta_{0j} = \gamma_{00} + u_{0j}$
$\beta_{1j} = \gamma_{10} + u_{1j}$
This leads to a model equation of
$Y_{ij} = [\gamma_{00} + \gamma_{10}\text{CSES}_{ij}] + [u_{0j} + u_{1j}\text{CSES}_{ij} + r_{ij}]$
First bracket: Fixed Effects. Second bracket: Random Effects.
SAS syntax
proc mixed noclprint covtest noitprint;
class school;
model mathach = cses/solution ddfm=bw notest;
random intercept cses/sub=school type = un;
run;
noitprint: Don’t print iteration page
type = un: Unstructured specification
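CSES is described above as SES centered by MEANSES, that is, each student's SES minus the mean SES of that student's school. A minimal sketch of that centering step, written in Python only so that it can be run without SAS; the function name `group_mean_center`, the school labels, and the SES values are hypothetical, and the sketch recomputes the school mean from the rows it is given.

```python
from collections import defaultdict


def group_mean_center(records):
    """Group-mean center SES within school.

    `records` is a list of (school, ses) pairs. Returns one dict per
    input row with the school mean (meanses) and the centered value
    (cses = ses - meanses)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for school, ses in records:
        sums[school] += ses
        counts[school] += 1

    # Mean SES per school, computed from the rows supplied
    meanses = {school: sums[school] / counts[school] for school in sums}

    return [
        {
            "school": school,
            "ses": ses,
            "meanses": meanses[school],
            "cses": ses - meanses[school],
        }
        for school, ses in records
    ]


if __name__ == "__main__":
    # Hypothetical data: two schools, two students each
    rows = group_mean_center([("A", 1.0), ("A", 3.0), ("B", -1.0), ("B", 0.0)])
    for row in rows:
        print(row)
```

After centering, CSES sums to zero within every school, so the intercept for a school refers to a student at that school's own average SES.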
SAS Output
Including Level-1 Predictors in SPSS
$Y_{ij} = [\gamma_{00} + \gamma_{10}\text{CSES}_{ij}] + [u_{0j} + u_{1j}\text{CSES}_{ij} + r_{ij}]$
SAS syntax
proc mixed noclprint covtest noitprint;
class school;
model mathach = cses/solution ddfm=bw notest;
random intercept cses/sub=school type = un;
run;
SPSS syntax
mixed mathach with cses
/print=solution
/fixed = cses
/random intercept cses | subject(school) covtype(un).
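The unstructured specification (`type=un` in SAS, `covtype(un)` in SPSS) leaves every element of the 2 × 2 covariance matrix of the school-level random intercept and slope free to be estimated. A sketch in the $\tau$ notation commonly used for these models; the symbols $\tau_{01}$ and $\tau_{11}$ and the normality assumption are introduced here and do not appear on the slides.

```latex
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix}
\sim N\!\left(
\begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} \tau_{00} & \tau_{01} \\ \tau_{01} & \tau_{11} \end{pmatrix}
\right),
\qquad r_{ij} \sim N(0, \sigma^2)
```

Here $\tau_{00}$ is the variance of the school intercepts, $\tau_{11}$ the variance of the school CSES slopes, and $\tau_{01}$ their covariance, so this model has four covariance parameters in total.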
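Including Level-1 AND Level-2 Predictors
Model with the effect of students' SES (CSES) and school
SES (MEANSES) with a second level-2 factor of
SECTOR
$Y_{ij} = \beta_{0j} + \beta_{1j}\text{CSES}_{ij} + r_{ij}$
$\beta_{0j} = \gamma_{00} + \gamma_{01}\text{MEANSES}_j + \gamma_{02}\text{SECTOR}_j + u_{0j}$
$\beta_{1j} = \gamma_{10} + \gamma_{11}\text{MEANSES}_j + \gamma_{12}\text{SECTOR}_j + u_{1j}$
This leads to a model equation of
$Y_{ij} = \gamma_{00} + \gamma_{01}\text{MEANSES}_j + \gamma_{02}\text{SECTOR}_j + \gamma_{10}\text{CSES}_{ij} + \gamma_{11}\text{MEANSES}_j(\text{CSES}_{ij}) + \gamma_{12}\text{SECTOR}_j(\text{CSES}_{ij}) + u_{0j} + u_{1j}(\text{CSES}_{ij}) + r_{ij}$
SAS syntax
proc mixed noclprint covtest noitprint;
class school;
model mathach = meanses sector cses meanses*cses
sector*cses/solution ddfm=bw notest;
random intercept cses/type=un sub=school;
run;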
SAS Output
Including Level-1 and Level-2 Predictors in SPSS
$Y_{ij} = \gamma_{00} + \gamma_{01}\text{MEANSES}_j + \gamma_{02}\text{SECTOR}_j + \gamma_{10}\text{CSES}_{ij} + \gamma_{11}\text{MEANSES}_j(\text{CSES}_{ij}) + \gamma_{12}\text{SECTOR}_j(\text{CSES}_{ij}) + u_{0j} + u_{1j}(\text{CSES}_{ij}) + r_{ij}$
SAS syntax
proc mixed noclprint covtest noitprint;
class school;
model mathach = meanses sector cses meanses*cses
sector*cses/solution ddfm=bw notest;
random intercept cses/type=un sub=school;
run;
SPSS syntax
mixed mathach with meanses sector cses
/print=solution
/fixed = meanses sector cses meanses*cses sector*cses
/random intercept cses | subject(school) covtype(un).