Y ij - Kevin E. O`Grady, PhD

Mixing it up: Mixed Models
Tracy Tomlinson
December 11, 2009
Outline
• What are fixed effects
• What are random effects
• How do I know if my effects are fixed or random
• Why do I care about fixed and random effects
• Mixed models
• SAS and mixed models
• SPSS and mixed models
Fixed Effects
• Specific levels of interest of a factor are selected
   • May use all levels or a subset of levels
• These are the specific levels of interest
   • Interest in comparing these levels
   • Inference only for these levels
Random Effects
• Levels of a factor are selected from a probability distribution
• Interested in the extent to which the random factor accounts for variance in the dependent variable
   • May be a control variable or a variable of interest
• Rather than being interested in the individual means across the levels of a fixed factor, we are interested in the variance of means across the levels of a random factor
Fixed Effects: One Factor
• Running a clinical trial in which a drug is administered at four different dose levels
• Model equation: $Y_{ij} = \mu + \alpha_i + e_{ij}$
   • $i = 1, 2, 3, 4$ indexes the dose levels
   • $\alpha_i$ is the effect of dose level $i$ on the mean
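A minimal sketch of the fixed-effects model above, assuming a balanced design: the overall mean is estimated by the grand mean, and each dose effect by the deviation of that dose group's mean from the grand mean. The function name `dose_effects` and the response values are invented for illustration and do not come from the slides.

```python
from statistics import mean

def dose_effects(groups):
    """Estimate mu and alpha_i for a balanced one-way fixed-effects layout.

    groups: dict mapping dose label -> list of responses (equal group sizes assumed).
    """
    all_values = [y for ys in groups.values() for y in ys]
    mu_hat = mean(all_values)
    # Each effect is the group mean's deviation from the grand mean.
    alpha_hat = {dose: mean(ys) - mu_hat for dose, ys in groups.items()}
    return mu_hat, alpha_hat

# Hypothetical responses for four dose levels (illustrative numbers only).
data = {1: [4.0, 6.0], 2: [6.0, 8.0], 3: [9.0, 11.0], 4: [11.0, 13.0]}
mu_hat, alpha_hat = dose_effects(data)
```

With equal group sizes the estimated effects sum to zero, which is the usual identifying constraint for a fixed factor.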
Random Effects: One Factor
• Clinical trial using a new drug at 20 different clinics in DC selected at random
• Model equation: $Y_{ij} = \mu + a_i + e_{ij}$
   • where $i$ indexes the 20 clinics
   • where $\mu$ represents the mean response in the whole population of clinics, not just those observed in the study
   • the effects $a_i$ are random variables with mean 0 and variance $\sigma_a^2$
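For a random factor the quantity of interest is the variance of the clinic effects rather than any single clinic mean. The sketch below uses the classical method-of-moments (ANOVA) estimator for a balanced one-way layout, which is an assumption made here for simplicity; the slides later rely on ML/REML in PROC MIXED instead. The function name and the three-clinic data set are hypothetical.

```python
from statistics import mean

def variance_components(groups):
    """Method-of-moments estimates for a balanced one-way random-effects model.

    groups: list of lists, one inner list of responses per clinic (equal sizes).
    Returns (between-clinic variance estimate, within-clinic variance estimate).
    """
    k = len(groups)       # number of clinics
    n = len(groups[0])    # observations per clinic
    grand = mean(y for g in groups for y in g)
    group_means = [mean(g) for g in groups]
    ms_between = n * sum((m - grand) ** 2 for m in group_means) / (k - 1)
    ms_within = sum(
        (y - m) ** 2 for g, m in zip(groups, group_means) for y in g
    ) / (k * (n - 1))
    # E(MS_between) = sigma_e^2 + n * sigma_a^2, so solve for sigma_a^2;
    # negative estimates are truncated at zero.
    sigma2_a = max((ms_between - ms_within) / n, 0.0)
    return sigma2_a, ms_within

# Hypothetical data: three clinics, two patients each (illustrative only).
clinics = [[1.0, 3.0], [5.0, 7.0], [9.0, 11.0]]
sigma2_a_hat, sigma2_e_hat = variance_components(clinics)
```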
Fixed Versus Random
• Fixed
   • Levels of a factor chosen because they are of specific interest
   • Interested in the means across the chosen levels of the factor
• Random
   • Levels of a factor selected from a probability distribution
   • Interested in the variance of means across the levels of the factor
Determining Fixed and Random Effects
• YOU determine what effects you have!
• As the researcher you select your levels of interest:
   • Are the specific levels themselves of interest?
   • Are you interested in comparing group means?
   • Did you sample the levels from a larger population?
• YOU determine the population!
Fixed or Random Effect?
• Is it reasonable to assume that the levels of the factor come from a probability distribution?
   • No: fixed factor
   • Yes: random factor
Fixed or Random Effect?
• Do you care about comparing the specific factor level means?
   • Yes: fixed factor
   • No: random factor
Why does it Matter, or Does it?
• Assumptions about random effects differ from those for fixed effects
• Error terms are different depending on fixed versus random effects
   • Random effects have additional error terms beyond $\sigma^2$
   • The effects $a_i$ are random variables with mean 0 and variance $\sigma_a^2$
• Important for inferential statistics
Fixed and Random Effects Error Terms
• Fixed effects error = $MS_{error}$ = $MS_{within}$, which estimates $\sigma^2$
• Random effects error: depends on the nature of the random and fixed effects
Two Factor Model
• A fixed, B fixed: $Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}$
   • One error term: $MS_{within}$ ($\sigma_e^2$)
   • All effects are tested against $MS_{within}$
• A fixed, B random: $Y_{ijk} = \mu + \alpha_i + b_j + (ab)_{ij} + e_{ijk}$
   • Two error terms: $MS_{within}$ ($\sigma_e^2$) and $MS_{ab}$ ($\sigma_{ab}^2$)
   • Fixed effect A is tested against $MS_{ab}$
   • Random effect B is tested against $MS_{within}$
• A random, B fixed: $Y_{ijk} = \mu + a_i + \beta_j + (ab)_{ij} + e_{ijk}$
   • Two error terms: $MS_{within}$ ($\sigma_e^2$) and $MS_{ab}$ ($\sigma_{ab}^2$)
   • Random effect A is tested against $MS_{within}$
   • Fixed effect B is tested against $MS_{ab}$
• A random, B random: $Y_{ijk} = \mu + a_i + b_j + (ab)_{ij} + e_{ijk}$
   • Two error terms: $MS_{within}$ ($\sigma_e^2$) and $MS_{ab}$ ($\sigma_{ab}^2$)
   • Main effects A and B are tested against $MS_{ab}$
• In all cases the highest-level interaction term is tested against $MS_{within}$
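To make the choice of error term concrete, the sketch below computes F ratios for a balanced two-factor design in which A is fixed, once with B fixed and once with B random. The function name `f_ratios` and the mean-square values are hypothetical; only the denominators follow the slides.

```python
def f_ratios(ms_a, ms_b, ms_ab, ms_within, b_random):
    """F ratios for a balanced two-factor design with factor A fixed.

    b_random=False: A fixed, B fixed  -> every effect is tested against MS_within.
    b_random=True:  A fixed, B random -> A is tested against MS_ab;
                                         B and the interaction against MS_within.
    """
    denom_a = ms_ab if b_random else ms_within
    return {
        "A": ms_a / denom_a,
        "B": ms_b / ms_within,
        "AB": ms_ab / ms_within,
    }

# Hypothetical mean squares (illustrative numbers only).
ms = dict(ms_a=40.0, ms_b=30.0, ms_ab=10.0, ms_within=5.0)
fixed = f_ratios(**ms, b_random=False)  # A: 8.0, B: 6.0, AB: 2.0
mixed = f_ratios(**ms, b_random=True)   # A: 4.0, B: 6.0, AB: 2.0
```

The same data give a smaller F for the fixed factor A once B is treated as random, which is why the fixed/random decision matters for inference.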
Fixed and Random Effects and Effect Size
• Effect size estimates depend on whether effects are assumed to be fixed or random
   • Fixed: $\eta^2$, $\omega^2$
   • Random
Why does it Matter, or Does it? (continued)
• Interpretation differs
   • Fixed effects interpretations are constrained to the levels of the factor(s) in the study
   • Random effects interpretations generalize more broadly to the population of interest
Mixed Models
• Contains both fixed and random effects
   • Randomized blocks designs
   • Nested/hierarchical designs
   • Split-plot designs
   • Clustered designs
   • Repeated measures
• Analysis comparable
   • One-way ANOVA
   • Two-way ANOVA
   • ANCOVA
Linear Mixed Model (LMM)
• Handles data where observations are not independent
• LMM correctly models correlated errors, whereas procedures in
the general linear model family (GLM) usually do not
• Nature [structure] of the correlation must be correctly modeled for
the tests of mean differences to be unbiased
• LMM is a further generalization of GLM to better support analysis of a continuous dependent variable for:
   • Random effects
   • Hierarchical effects
   • Repeated measures
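In matrix form, a linear mixed model is commonly written as below. The symbols $\mathbf{X}$, $\mathbf{Z}$, $\mathbf{G}$, and $\mathbf{R}$ do not appear on the slides; they are standard textbook notation, introduced here as an assumption, for the fixed-effects design matrix, the random-effects design matrix, and the covariance matrices of the random effects and the errors.

```latex
\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{u} + \mathbf{e},
\qquad \mathbf{u} \sim N(\mathbf{0}, \mathbf{G}),
\quad \mathbf{e} \sim N(\mathbf{0}, \mathbf{R}),
\qquad \operatorname{Var}(\mathbf{y}) = \mathbf{Z}\mathbf{G}\mathbf{Z}^{\top} + \mathbf{R}.
```

The ordinary general linear model is the special case with no random effects ($\mathbf{Z} = \mathbf{0}$) and independent, equal-variance errors ($\mathbf{R} = \sigma^2\mathbf{I}$), so correlated observations enter only through $\mathbf{G}$ and $\mathbf{R}$.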
Mixed Models Randomized Blocks Example
• Testing four drugs: subjects are grouped into blocks of four, carefully matched on demographic variables, and each person in a block receives one of the four drugs.
   • Fixed effect of treatment
   • Random effect of blocks
• $Y_{ij} = \mu + \alpha_i + b_j + e_{ij}$
   • $\mu$ and $\alpha_i$ represent unknown fixed parameters: the intercept and the four drug treatment effects
   • $b_j$ and $e_{ij}$ are random variables representing blocks and error
   • $b_j$ is assumed to have mean 0 and variance $\sigma_b^2$
   • The error $e_{ij}$ is assumed to have mean 0 and variance $\sigma^2$
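A short worked consequence of the randomized blocks model, assuming (as is conventional, though not stated on the slide) that the block effects $b_j$ and the errors $e_{ij}$ are mutually independent, with variances $\sigma_b^2$ and $\sigma^2$: observations in the same block share $b_j$ and are therefore correlated, while observations in different blocks are not.

```latex
\begin{aligned}
\operatorname{Var}(Y_{ij}) &= \sigma_b^2 + \sigma^2, \\
\operatorname{Cov}(Y_{ij}, Y_{i'j}) &= \sigma_b^2 \quad (i \neq i'), \\
\operatorname{Cov}(Y_{ij}, Y_{i'j'}) &= 0 \quad (j \neq j'), \\
\operatorname{Corr}(Y_{ij}, Y_{i'j}) &= \frac{\sigma_b^2}{\sigma_b^2 + \sigma^2} \quad (i \neq i').
\end{aligned}
```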
Mixed Models Hierarchical Example
• Using four dosage levels of a drug in 20 clinics. In each clinic, each patient was randomly assigned to one of the 4 dose levels.
• $Y_{ijk} = \mu + \alpha_i + b_j + c_{ij} + e_{ijk}$
   • where $\alpha_i$, $b_j$, and $c_{ij}$ are the effects due to drug dose $i$, clinic $j$, and the clinic-by-dose interaction
• If you assume that the 20 clinics were not sampled from a larger population, the experiment has only fixed effects:
• $Y_{ijk} = \mu + \alpha_i + \beta_j + c_{ij} + e_{ijk}$
Mixed Models Repeated Measures and Split-Plot
• Three drug treatments randomly assigned to subjects, with subjects observed at 1, 2, …, 7, and 8 hours post-treatment
• $Y_{ijk} = \mu + \alpha_i + s(\alpha)_{ij} + \tau_k + (\alpha\tau)_{ik} + e_{ijk}$
   • where $\alpha_i$ represents treatment effects
   • $\tau_k$ represents time (or hour) effects
   • $s(\alpha)_{ij}$ represents the random subject-within-treatment effects
PROC MIXED
• Specifically designed to fit mixed effect models.
• It can model:
   • Random and mixed effect data
   • Repeated measures
   • Spatial data
   • Data with heterogeneous variances and autocorrelated observations
• The MIXED procedure is more general than GLM in the sense that it gives the user more flexibility in specifying correlation structures, which is particularly useful in repeated measures and random effect models
   • GLM uses OLS estimation
   • MIXED uses ML, REML, or MIVQUE0 estimation
PROC MIXED v. GLM
• The PROC MIXED syntax is similar to the syntax of PROC GLM.
• The RANDOM and REPEATED statements are used differently
   • Random effects are not listed in the MODEL statement for MIXED
• GLM has MEANS and LSMEANS statements
   • MIXED has only the LSMEANS statement
General SAS Mixed Model Syntax
PROC MIXED options;
  CLASS variable-list;
  MODEL dependent = fixed-effects / options;
  RANDOM random-effects / options;
  REPEATED repeated-effects / options;
  CONTRAST 'label' fixed-effect values | random-effect values / options;
  ESTIMATE 'label' fixed-effect values | random-effect values / options;
  LSMEANS fixed-effects / options;
  MAKE 'table' OUT=SAS-data-set <options>;
RUN;
PROC MIXED statement
PROC MIXED options;
PROC MIXED noclprint covtest;
• The NOCLPRINT option prevents the printing of the CLASS level information
   • The first time you run the program you probably don’t want to include NOCLPRINT
   • When there are lots of group units, use NOCLPRINT to suppress the printing of group names.
• The COVTEST option tells SAS that you would like hypothesis tests for the variance and covariance components.
CLASS statement
CLASS variable-list;
CLASS IDpatient IDclinic;
• The CLASS statement indicates that IDpatient and IDclinic are classification variables whose values do not contain quantitative information
• The variables that we want SAS to treat as categorical variables go here.
• Variables that are characters (e.g., city names) must be on this line (it won’t run otherwise).
MODEL statement
MODEL dependent = fixed-effects / options;
MODEL dosage = / solution;
• If you have no fixed effects, no independent variables are listed in the MODEL statement
• The SOLUTION option asks SAS to print the estimates for the fixed effects
• The intercept is included by default in SAS
   • To fit a model without the intercept, add the NOINT option to the MODEL statement
RANDOM statement
RANDOM intercept / sub=IDclinic;
• By default there is always at least one random effect, usually the lowest-level residual
• You can specify the intercept on the RANDOM statement: this indicates the presence of a second random effect, so the intercept is treated not only as a fixed effect but also as a random effect
• The SUB option on the RANDOM statement specifies the multilevel structure, indicating how the individuals (level-1 units) are divided into higher-level groups (level-2 units)
General SPSS Mixed Model Syntax
mixed dependent with independent
  /print=solution
  /fixed=fixed effects
  /random intercept random effects | subject(random effect grouping) covtype(options)
  /repeated=repeated effect | subject(repeated effect grouping) covtype(options).
Recap Main Points Slide
• Fixed versus random effects
   • Error terms
   • Effect size
   • Interpretation/inference
• Sampling
• Independence of observations
• SAS and SPSS syntax for random and mixed models
Data Example with PROC MIXED
• High School and Beyond data example (Bryk & Raudenbush, 1992)
   • 7,185 students in 160 schools
   • MATHACH: Student-level (level-1) outcome is math achievement
   • SES: Student-level (level-1) covariate is socioeconomic status
   • MEANSES: School-level (level-2) covariate of mean SES for the school
   • SECTOR: School-level (level-2) covariate of school type (dummy coded, public = 0 and Catholic = 1)
Source: Singer (1998)
Random Effects Model
• Unconditional means model examining the variation of MATHACH across schools
• One-way random effects ANOVA model
• Model equation
   • $Y_{ij} = \mu + \alpha_j + r_{ij}$
• SAS syntax
proc mixed;
  class school;
  model mathach = ;
  random school;
run;
MIXED Model Two-Level Approach
• Level 1: the student's outcome ($Y_{ij}$) is expressed as the sum of an intercept for the student's school ($\beta_{0j}$) and a random error term ($r_{ij}$) associated with the $i$th student in the $j$th school
$Y_{ij} = \beta_{0j} + r_{ij}$
• Level 2: we express the school-level intercept as the sum of an overall mean ($\gamma_{00}$) and a series of random deviations from that mean ($u_{0j}$).
$\beta_{0j} = \gamma_{00} + u_{0j}$
• This leads to a model equation of
$Y_{ij} = \gamma_{00} + u_{0j} + r_{ij}$
• SAS syntax
proc mixed noclprint covtest;
  class school;
  model mathach = / solution;
  random intercept / sub=school;
run;
Mixed Model Two-Level Approach Output
[SAS output: variance component between schools, $\tau_{00}$; variance component within schools, $\sigma^2$]
SPSS for Random Effects Model
SAS syntax
proc mixed noclprint covtest;
  class school;
  model mathach = / solution;
  random intercept / sub=school;
run;
SPSS syntax
mixed mathach
  /print=solution
  /random intercept | subject(school).
SPSS output
Including Level-2 Predictors
• School-level (level-2) predictor of MEANSES
• Level 1: the student's outcome ($Y_{ij}$) is expressed as the sum of an intercept for the student's school ($\beta_{0j}$) and a random error term ($r_{ij}$) associated with the $i$th student in the $j$th school
$Y_{ij} = \beta_{0j} + r_{ij}$
• Level 2: we express the school-level intercept as the sum of an overall mean ($\gamma_{00}$), the effect of MEANSES, and a series of random deviations from that mean ($u_{0j}$).
$\beta_{0j} = \gamma_{00} + \gamma_{01} MEANSES_j + u_{0j}$
• This leads to a model equation of
$Y_{ij} = [\gamma_{00} + \gamma_{01} MEANSES_j] + [u_{0j} + r_{ij}]$
   • The first bracket holds the fixed effects; the second holds the random effects
• SAS syntax (DDFM=BW computes the df for fixed effects with the “between/within” method)
proc mixed noclprint covtest;
  class school;
  model mathach = meanses / solution ddfm=bw;
  random intercept / sub=school;
run;
SAS Output
[SAS output: random effects $\tau_{00}$ and $\sigma^2$; fixed effects $\gamma_{00}$ and $\gamma_{01}$]
SPSS for Including Level-2 Predictor
SAS syntax
proc mixed noclprint covtest;
  class school;
  model mathach = meanses / solution ddfm=bw;
  random intercept / sub=school;
run;
SPSS syntax
mixed mathach with meanses
  /print=solution
  /fixed = meanses
  /random intercept | subject(school).
SPSS output
Including Level-1 Predictors
• Student-level (level-1) predictor of SES
• Level 1: the student's outcome ($Y_{ij}$) is expressed as a function of an intercept for the student's school ($\beta_{0j}$), the student's SES centered at the school mean (CSES = SES − MEANSES), and a random error term ($r_{ij}$) associated with the $i$th student in the $j$th school
$Y_{ij} = \beta_{0j} + \beta_{1j} CSES_{ij} + r_{ij}$
• Level 2: we express the school-level intercept and slope each as the sum of an overall mean ($\gamma_{00}$, $\gamma_{10}$) and a series of random deviations from that mean ($u_{0j}$, $u_{1j}$).
$\beta_{0j} = \gamma_{00} + u_{0j}$
$\beta_{1j} = \gamma_{10} + u_{1j}$
• This leads to a model equation of
$Y_{ij} = [\gamma_{00} + \gamma_{10} CSES_{ij}] + [u_{0j} + u_{1j} CSES_{ij} + r_{ij}]$
   • The first bracket holds the fixed effects; the second holds the random effects
• SAS syntax (NOITPRINT: don't print the iteration page; TYPE=UN: unstructured covariance specification)
proc mixed noclprint covtest noitprint;
  class school;
  model mathach = cses / solution ddfm=bw notest;
  random intercept cses / sub=school type=un;
run;
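The centered predictor CSES is each student's SES minus the mean SES of that student's school. A minimal sketch of this group-mean centering step follows; the record layout (dicts with `school` and `ses` keys), the function name, and the four sample records are assumptions made for illustration, not the High School and Beyond data.

```python
from collections import defaultdict
from statistics import mean

def center_within_school(records):
    """Return copies of the records with MEANSES and CSES added.

    records: list of dicts with keys 'school' and 'ses'.
    CSES is SES minus the mean SES of the student's own school.
    """
    by_school = defaultdict(list)
    for r in records:
        by_school[r["school"]].append(r["ses"])
    school_mean = {s: mean(v) for s, v in by_school.items()}
    return [
        {
            **r,
            "meanses": school_mean[r["school"]],
            "cses": r["ses"] - school_mean[r["school"]],
        }
        for r in records
    ]

# Hypothetical students in two schools (illustrative values only).
students = [
    {"school": "A", "ses": 1.0},
    {"school": "A", "ses": 3.0},
    {"school": "B", "ses": -1.0},
    {"school": "B", "ses": 0.0},
]
centered = center_within_school(students)
```

After centering, CSES averages zero within every school, so the school intercept can be read as the expected outcome for a student at that school's own mean SES.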
SAS Output
Including Level-1 Predictors in SPSS
$Y_{ij} = [\gamma_{00} + \gamma_{10} CSES_{ij}] + [u_{0j} + u_{1j} CSES_{ij} + r_{ij}]$
SAS syntax
proc mixed noclprint covtest noitprint;
  class school;
  model mathach = cses / solution ddfm=bw notest;
  random intercept cses / sub=school type=un;
run;
SPSS syntax
mixed mathach with cses
  /print=solution
  /fixed = cses
  /random intercept cses | subject(school) covtype(un).
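With a random intercept and a random CSES slope, the unstructured option (`type=un` in SAS, `covtype(un)` in SPSS) estimates a separate variance for each random effect plus their covariance. Writing $u_{0j}$ and $u_{1j}$ for the school-level deviations in intercept and slope, this can be sketched as below; the symbols $\tau_{01}$ and $\tau_{11}$ are not on the slides and follow common multilevel notation as an assumption.

```latex
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix}
\sim N\!\left(
\begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} \tau_{00} & \tau_{01} \\ \tau_{01} & \tau_{11} \end{pmatrix}
\right),
\qquad r_{ij} \sim N(0, \sigma^2).
```

Three school-level covariance parameters ($\tau_{00}$, $\tau_{01}$, $\tau_{11}$) are therefore estimated in addition to the residual variance $\sigma^2$.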
Including Level-1 AND Level-2 Predictors
• Model with the effect of student SES (CSES) and school SES (MEANSES), with a second level-2 factor of SECTOR
$Y_{ij} = \beta_{0j} + \beta_{1j} CSES_{ij} + r_{ij}$
$\beta_{0j} = \gamma_{00} + \gamma_{01} MEANSES_j + \gamma_{02} SECTOR_j + u_{0j}$
$\beta_{1j} = \gamma_{10} + \gamma_{11} MEANSES_j + \gamma_{12} SECTOR_j + u_{1j}$
• This leads to a model equation of
$Y_{ij} = \gamma_{00} + \gamma_{01} MEANSES_j + \gamma_{02} SECTOR_j + \gamma_{10} CSES_{ij} + \gamma_{11} MEANSES_j (CSES_{ij}) + \gamma_{12} SECTOR_j (CSES_{ij}) + u_{0j} + u_{1j} (CSES_{ij}) + r_{ij}$
• SAS syntax
proc mixed noclprint covtest noitprint;
  class school;
  model mathach = meanses sector cses meanses*cses sector*cses / solution ddfm=bw notest;
  random intercept cses / type=un sub=school;
run;
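To see how the level-2 predictors enter both the intercept and the CSES slope, the sketch below evaluates only the fixed part of the combined model for a single student. The function name, the dictionary layout of the coefficients, and the coefficient values are all hypothetical; they are not estimates from the High School and Beyond data.

```python
def fixed_part(gamma, meanses, sector, cses):
    """Fixed portion of the combined two-level model for one student.

    gamma: dict of fixed coefficients keyed '00', '01', '02', '10', '11', '12'.
    """
    # Level-2 equation for the school intercept (without its random deviation).
    intercept = gamma["00"] + gamma["01"] * meanses + gamma["02"] * sector
    # Level-2 equation for the school CSES slope (without its random deviation).
    slope = gamma["10"] + gamma["11"] * meanses + gamma["12"] * sector
    return intercept + slope * cses

# Hypothetical coefficients (illustrative only, not estimates from the data).
gamma = {"00": 12.0, "01": 5.0, "02": 1.0, "10": 3.0, "11": 1.0, "12": -2.0}
public = fixed_part(gamma, meanses=0.0, sector=0, cses=1.0)    # 12 + 3*1 = 15.0
catholic = fixed_part(gamma, meanses=0.0, sector=1, cses=1.0)  # 13 + 1*1 = 14.0
```

The cross-level terms (`meanses*cses` and `sector*cses` in the MODEL statement) correspond to the `'11'` and `'12'` coefficients here: they change the slope on CSES rather than the intercept.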
SAS Output
Including Level-1 and Level-2 Predictors in SPSS
$Y_{ij} = \gamma_{00} + \gamma_{01} MEANSES_j + \gamma_{02} SECTOR_j + \gamma_{10} CSES_{ij} + \gamma_{11} MEANSES_j (CSES_{ij}) + \gamma_{12} SECTOR_j (CSES_{ij}) + u_{0j} + u_{1j} (CSES_{ij}) + r_{ij}$
• SAS syntax
proc mixed noclprint covtest noitprint;
  class school;
  model mathach = meanses sector cses meanses*cses sector*cses / solution ddfm=bw notest;
  random intercept cses / type=un sub=school;
run;
• SPSS syntax
mixed mathach with meanses sector cses
  /print=solution
  /fixed = meanses sector cses meanses*cses sector*cses
  /random intercept cses | subject(school) covtype(un).