Y ij - Kevin E. O`Grady, PhD

Mixing it up: Mixed Models
Tracy Tomlinson
December 11, 2009
What are fixed effects
What are random effects
How do I know if my effects are fixed or random
Why do I care about fixed and random effects
Mixed models
SAS and mixed models
SPSS and mixed models
Fixed Effects
• Specific levels of interest of a factor are chosen
• May use all levels or a subset of levels
• These are the specific levels of interest
• Interest in comparing these levels
• Inference only for these levels
Random Effects
• Levels of a factor selected from a probability distribution
• Interested in the extent to which the random factor accounts for variance in the dependent variable
• May be a control variable or a variable of interest
• Rather than being interested in the individual means across the levels of the fixed factor, we are interested in the variance of means across the levels of a random factor
Fixed Effects: One Factor
• Running a clinical trial in which a drug is administered at four different dose levels
• Model Equation:
$Y_{ij} = \mu + \alpha_i + e_{ij}$
• $i$ corresponds to 1, 2, 3, or 4 dose levels
• $\alpha_i$ is the effect of the drug on the mean at dose level $i$
Random Effects: One Factor
• Clinical trial using a new drug at 20 different clinics in DC selected at random
• Model Equation:
$Y_{ij} = \mu + a_i + e_{ij}$
• Where $i$ corresponds to the 20 clinics
• Where $\mu$ represents the mean of all dosages in the population, not just the observed study
• The effects $a_i$ are random variables with mean 0 and variance $\sigma_a^2$
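As a minimal sketch of what estimating the between-clinic variance can look like, a balanced one-way random effects layout allows a method-of-moments (ANOVA) estimate: the between-level variance is (MS_between - MS_within) / n, where n is the number of observations per level. The clinic data and the function names below are hypothetical and chosen only for illustration; SAS PROC MIXED estimates variance components by REML by default, not by this shortcut.

```python
def one_way_mean_squares(groups):
    """Return (ms_between, ms_within) for balanced one-way data."""
    k = len(groups)        # number of levels (e.g. clinics)
    n = len(groups[0])     # observations per level
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes a balanced design")
    group_means = [sum(g) / n for g in groups]
    grand_mean = sum(group_means) / k
    ss_between = n * sum((m - grand_mean) ** 2 for m in group_means)
    ss_within = sum((y - m) ** 2
                    for g, m in zip(groups, group_means) for y in g)
    return ss_between / (k - 1), ss_within / (k * (n - 1))


def variance_components(groups):
    """Method-of-moments estimates of (sigma2_a, sigma2_e)."""
    n = len(groups[0])
    ms_between, ms_within = one_way_mean_squares(groups)
    # A negative moment estimate is truncated at zero.
    sigma2_a = max(0.0, (ms_between - ms_within) / n)
    return sigma2_a, ms_within


# Hypothetical outcomes for three clinics with three patients each
clinics = [[10, 12, 14], [20, 22, 24], [30, 32, 34]]
sigma2_a, sigma2_e = variance_components(clinics)
print("between-clinic variance:", sigma2_a)
print("within-clinic variance:", sigma2_e)
```

Here the clinic means differ far more than patients within a clinic do, so most of the variance is attributed to the random clinic factor.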
Fixed Versus Random
• Fixed
  • Levels of a factor chosen for their specific interest
  • Interested in the means across the chosen levels of the factor
• Random
  • Levels of a factor selected from a probability distribution
  • Interested in the variance of means across the levels of the factor
Determining Fixed and Random
• YOU determine what effects you have!
• As the researcher you select the levels of your factors
  • Are the specific levels of interest?
  • Are you interested in comparing group means?
  • Did you sample the levels from a larger population?
• YOU determine the population!
Fixed or Random Effect?
• Is it reasonable to assume that the levels of the factor come from a probability distribution?
  • No: Fixed Factor
  • Yes: Random Factor
• Do you care about comparing the specific factor level means?
  • Yes: Fixed Factor
  • No: Random Factor
Why does it Matter, or Does it?
• Assumptions about random effects differ from those for fixed effects
• Error terms are different depending on fixed versus random effects
  • Random effects have additional error terms beyond $\sigma^2$
  • The effects $a_i$ are random variables with mean 0 and variance $\sigma_a^2$
• Important for inferential statistics
Fixed and Random Effects Error
• Fixed effects error = $MS_{error}$ = $MS_{within}$, which estimates $\sigma^2$
• Random effects error: depends on the nature of the random and fixed effects
Two Factor Model
(Fixed effects are written with Greek letters and random effects with Latin letters; $k$ indexes the observations within a cell.)
• A fixed, B fixed
  $Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}$
• A fixed, B random
  $Y_{ijk} = \mu + \alpha_i + b_j + (ab)_{ij} + e_{ijk}$
• A random, B fixed
  $Y_{ijk} = \mu + a_i + \beta_j + (ab)_{ij} + e_{ijk}$
• A random, B random
  $Y_{ijk} = \mu + a_i + b_j + (ab)_{ij} + e_{ijk}$
Error terms
• A fixed, B fixed: one error term, $MS_{within}$ ($\sigma_e^2$)
• Any model with a random factor: two error terms
  1) $MS_{within}$ ($\sigma_e^2$)
  2) $MS_{ab}$ ($\sigma_{ab}^2$)
Which error term tests which effect
• In all cases the highest level term (the A by B interaction) is tested against $MS_{within}$
• A fixed, B fixed: all effects tested against $MS_{within}$
• A fixed, B random: fixed effect A tested against $MS_{ab}$; random effect B tested against $MS_{within}$
• A random, B fixed: random effect A tested against $MS_{within}$; fixed effect B tested against $MS_{ab}$
• A random, B random: main effects A and B tested against $MS_{ab}$; the interaction is tested against $MS_{within}$
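The choice of error term can be sketched as a small lookup. This assumes a balanced two-factor design with replication and the usual restricted-model expected mean squares, under which a main effect is tested against MS_ab whenever the other factor is random, and the interaction is always tested against MS_within. The function name and the mean squares in the usage lines are hypothetical.

```python
def f_ratios(ms_a, ms_b, ms_ab, ms_within, a_random, b_random):
    """F ratios for a balanced two-factor design with interaction.

    Assumes the restricted mixed model: the denominator for a main
    effect is MS_ab when the OTHER factor is random, else MS_within.
    """
    denom_a = ms_ab if b_random else ms_within
    denom_b = ms_ab if a_random else ms_within
    return {
        "A": ms_a / denom_a,
        "B": ms_b / denom_b,
        "AB": ms_ab / ms_within,  # highest-order term: always MS_within
    }


# Hypothetical mean squares, for illustration only
ms = dict(ms_a=40.0, ms_b=30.0, ms_ab=10.0, ms_within=5.0)
both_fixed = f_ratios(**ms, a_random=False, b_random=False)
a_fixed_b_random = f_ratios(**ms, a_random=False, b_random=True)
both_random = f_ratios(**ms, a_random=True, b_random=True)
print(both_fixed)
print(a_fixed_b_random)
print(both_random)
```

With the same mean squares, the F ratio for A drops when B is treated as random, because the denominator now includes the interaction variance. This is one concrete reason the fixed versus random decision matters for inference.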
Fixed and Random Effects and Effect Size
• Effect size estimates assume effects are fixed or random
  • Fixed
    • $\eta^2$
    • $\omega^2$
  • Random
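A minimal sketch of how effect size estimates differ by effect type, assuming the fixed-effects measures intended here are eta-squared and omega-squared and the random-effects measure is the intraclass correlation (the slide text does not name them, so this is an assumption). The formulas are the standard one-way ANOVA versions for balanced data; the sums of squares in the usage lines are hypothetical.

```python
def eta_squared(ss_between, ss_total):
    """Proportion of sample variance explained (fixed effect)."""
    return ss_between / ss_total


def omega_squared(ss_between, ss_total, df_between, ms_within):
    """Less biased estimate of explained variance (fixed effect)."""
    return (ss_between - df_between * ms_within) / (ss_total + ms_within)


def intraclass_correlation(ms_between, ms_within, n_per_group):
    """ANOVA estimate of sigma2_a / (sigma2_a + sigma2_e) (random effect)."""
    return (ms_between - ms_within) / (ms_between + (n_per_group - 1) * ms_within)


# Hypothetical one-way table: 3 groups, 3 observations per group
ss_between, ss_within = 600.0, 24.0
df_between, df_within = 2, 6
ms_between = ss_between / df_between
ms_within = ss_within / df_within

eta2 = eta_squared(ss_between, ss_between + ss_within)
omega2 = omega_squared(ss_between, ss_between + ss_within, df_between, ms_within)
icc = intraclass_correlation(ms_between, ms_within, n_per_group=3)
print(eta2, omega2, icc)
```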
Why does it Matter, or Does it?
• Assumptions about random effects differ from those for fixed effects
• Error terms are different depending on fixed versus random effects
  • Random effects have additional error terms beyond $\sigma^2$
  • The effects $a_i$ are random variables with mean 0 and variance $\sigma_a^2$
• Important for inferential statistics
• Interpretation differs
  • Fixed effects interpretations are constrained to the levels of the factor(s) in the study
  • Random effects interpretations generalize more broadly, to the population of interest
Mixed Models
• Contains both fixed and random effects
  • Randomized blocks designs
  • Nested/hierarchical designs
  • Split-plot designs
  • Clustered designs
  • Repeated measures
• Analysis comparable to
  • One-way ANOVA
  • Two-way ANOVA
Linear Mixed Model (LMM)
• Handles data where observations are not independent
• LMM correctly models correlated errors, whereas procedures in
the general linear model family (GLM) usually do not
• Nature [structure] of the correlation must be correctly modeled for
the tests of mean differences to be unbiased
• LMM is a further generalization of GLM to better support analysis of a continuous dependent variable for:
• Random effects
• Hierarchical effects
• Repeated measures
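To make the non-independence concrete, here is a minimal sketch of the covariance structure implied by a single shared random effect (for example a random intercept per cluster): every pair of observations in the same cluster shares the between-cluster variance, which gives a compound-symmetric covariance matrix. The function names and the variance values are hypothetical.

```python
def random_intercept_covariance(n_obs, var_between, var_within):
    """Covariance matrix of n_obs observations from one cluster.

    Diagonal: var_between + var_within. Off-diagonal: var_between.
    """
    return [
        [var_between + (var_within if i == j else 0.0) for j in range(n_obs)]
        for i in range(n_obs)
    ]


def implied_correlation(var_between, var_within):
    """Correlation between two observations in the same cluster."""
    return var_between / (var_between + var_within)


# Hypothetical variance components
cov = random_intercept_covariance(3, var_between=2.0, var_within=6.0)
rho = implied_correlation(2.0, 6.0)
for row in cov:
    print(row)
print("within-cluster correlation:", rho)
```

A procedure that assumes independent errors treats the off-diagonal entries as zero, which is the misspecification the linear mixed model is meant to avoid.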
Mixed Models Randomized Blocks Example
• Testing four drugs and assigning n subjects to one of 4 groups carefully matched by demographic variables. Each person in the four groups gets one of the drugs.
• Fixed effect of treatment
• Random effect of blocks
$Y_{ij} = \mu + \alpha_i + b_j + e_{ij}$
• $\mu$ and $\alpha_i$ represent unknown fixed parameters: the intercept and the four drug treatment effects
• $b_j$ and $e_{ij}$ are random variables representing blocks and error
• $b_j$ assumed to have a variance of $\sigma_b^2$
• Error ($e_{ij}$) assumed to have a variance of $\sigma^2$
Mixed Models Hierarchical
• Using four dosage levels of a drug in 20 clinics. In each clinic each patient was randomly assigned to one of the 4 dose levels.
$Y_{ijk} = \mu + \alpha_i + b_j + c_{ij} + e_{ijk}$
• Where $\alpha_i$, $b_j$, and $c_{ij}$ are the effects due to drug dose $i$, clinic $j$, and clinic-by-dose interaction
• If you assume that the 20 clinics are not sampled, this experiment may now have only fixed effects
$Y_{ijk} = \mu + \alpha_i + \beta_j + c_{ij} + e_{ijk}$
Mixed Models Repeated Measures and Split-Plot
• Three drug treatments randomly assigned to subjects, with subjects observed at 1, 2, …, 7, and 8 hours post-treatment
$Y_{ijk} = \mu + \alpha_i + s(\alpha)_{ij} + \tau_k + (\alpha\tau)_{ik} + e_{ijk}$
• Where $\alpha$ represents treatment effects
• $\tau$ represents time (or hour) effects
• $s(\alpha)$ represents the random subject within treatment effects
SAS and mixed models
• PROC MIXED is specifically designed to fit mixed effect models.
• It can model:
  • Random and mixed effect data
  • Repeated measures
  • Spatial data
  • Data with heterogeneous variances and autocorrelated observations
• The MIXED procedure is more general than GLM in the sense that it gives a user more flexibility in specifying the correlation structures, which is particularly useful in repeated measures and random effect models
• GLM uses OLS estimation
• MIXED uses ML, REML, or MIVQUE0 estimation
• The PROC MIXED syntax is similar to the syntax of PROC GLM
• The random effects and repeated statements are used differently
  • Random effects are not listed in the model statement for MIXED
  • GLM has MEANS and LSMEANS statements
  • MIXED has only the LSMEANS statement
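The difference between ML and REML can be illustrated in the simplest possible case, an intercept-only model with independent normal errors. There the ML variance estimate divides the residual sum of squares by n, while the REML estimate divides by n minus the number of fixed-effect parameters (here 1). This sketch only shows that special case with hypothetical data; it is not how PROC MIXED computes estimates for general mixed models, which requires iteration.

```python
def ml_and_reml_variance(y):
    """Return (ML, REML) variance estimates for an intercept-only model."""
    n = len(y)
    mean = sum(y) / n
    rss = sum((v - mean) ** 2 for v in y)   # residual sum of squares
    ml = rss / n            # ignores the df used to estimate the mean
    reml = rss / (n - 1)    # adjusts for the one fixed effect (the mean)
    return ml, reml


# Hypothetical sample
ml, reml = ml_and_reml_variance([2.0, 4.0, 6.0, 8.0])
print("ML:", ml, "REML:", reml)
```

The ML estimate is smaller because it does not account for the degrees of freedom spent on the fixed effects; the gap shrinks as n grows.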
General SAS Mixed Model
PROC MIXED options;
CLASS variable-list;
MODEL dependent = fixed-effects / options;
RANDOM random-effects / options;
REPEATED repeated-effects / options;
CONTRAST 'label' fixed-effect values | random-effect values / options;
ESTIMATE 'label' fixed-effect values | random-effect values / options;
LSMEANS fixed-effects / options;
MAKE 'table' OUT=SAS-data-set <options>;
PROC MIXED statement
PROC MIXED options;
PROC MIXED noclprint covtest;
• The NOCLPRINT option prevents the printing of the CLASS level information
  • The first time you run the program you probably don’t want to include this option
  • When there are lots of group units, use NOCLPRINT to suppress the printing of group names.
• The COVTEST option tells SAS that you would like hypothesis tests for the variance and covariance components.
CLASS statement
CLASS variable-list;
CLASS IDpatient IDclinic;
• The CLASS statement indicates that IDpatient and IDclinic are classification variables whose values do not contain quantitative information
• The variables that we want SAS to treat as categorical variables go here.
• Variables that are characters (e.g., city names) must be on this line (it won’t run otherwise).
MODEL statement
MODEL dependent = fixed-effects / options;
MODEL dosage = / solution;
• If you have no fixed effects you would have no independent variables listed in the model statement
• The SOLUTION option asks SAS to print the estimates for the fixed effects
• The intercept is included by default in SAS
  • If you would like to fit a model without the intercept, add the NOINT option to the model statement
Random statement
RANDOM intercept / sub=IDclinic;
• By default there is always at least one random effect, usually the lowest-level residual
• You can specify the intercept on the RANDOM statement: this indicates the presence of a second random effect, and the intercept should be treated not only as a fixed effect but also as a random effect
• The SUB= (SUBJECT=) option on the RANDOM statement specifies the multilevel structure, indicating how the individuals (level-1 units) are divided into higher level groups (level-2 units)
General SPSS Mixed Model
MIXED dependent WITH independent
/FIXED = fixed effects
/RANDOM = INTERCEPT random effects | SUBJECT(random effect grouping) COVTYPE(options)
/REPEATED = repeated effect | SUBJECT(repeated effect grouping) COVTYPE(options).
Recap Main Points Slide
 Fixed versus random effects
 Error terms
 Effect size
 Interpretation/inference
 Sampling
 Independence of observations
 SAS and SPSS syntax for random and
mixed models
Data Example with PROC MIXED
• High School and Beyond data example (Bryk & Raudenbush, 1992)
• 7,185 students in 160 schools
• MATHACH: Student-level (level-1) outcome is math achievement
• SES: Student-level (level-1) covariate is socioeconomic status
• MEANSES: School-level (level-2) covariate of mean SES for the school
• SECTOR: School-level (level-2) covariate of school type (dummy coded, public = 0 and Catholic = 1)
Singer (1998)
Random Effects Model
• Unconditional means model examining the variation of MATHACH across schools
• One-way random effects ANOVA model
• Model Equation
$Y_{ij} = \mu + \alpha_j + r_{ij}$
• SAS syntax
proc mixed;
class school;
model mathach = ;
random school;
run;
Mixed Model Two-Level Approach
• Level 1: the student's outcome ($Y_{ij}$) is expressed as the sum of an intercept for the student's school ($\beta_{0j}$) and a random error term ($r_{ij}$) associated with the $i$th student in the $j$th school
$Y_{ij} = \beta_{0j} + r_{ij}$
• Level 2: we express the school-level intercept as the sum of an overall mean ($\gamma_{00}$) and a series of random deviations from that mean ($u_{0j}$)
$\beta_{0j} = \gamma_{00} + u_{0j}$
• This leads to a model equation of
$Y_{ij} = \gamma_{00} + u_{0j} + r_{ij}$
• SAS syntax
proc mixed noclprint covtest;
class school;
model mathach = /solution;
random intercept/sub=school;
run;
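Mixed Model Two-Level Approach Output
[SAS output; annotations mark the estimate for between-school variance, $\tau_{00}$, and the estimate for within-school variance, $\sigma^2$]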
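The two variance estimates in this output are usually combined into an intraclass correlation, the share of the total variance in MATHACH that lies between schools. A short worked derivation, assuming the notation $\tau_{00}$ for the variance of the school-level deviations and $\sigma^2$ for the variance of the student-level errors $r_{ij}$, with the two assumed independent:

```latex
\begin{aligned}
\operatorname{Var}(Y_{ij}) &= \tau_{00} + \sigma^2 \\
\operatorname{Cov}(Y_{ij},\, Y_{i'j}) &= \tau_{00} \qquad (i \neq i') \\
\rho &= \frac{\tau_{00}}{\tau_{00} + \sigma^2}
\end{aligned}
```

A value of $\rho$ near 0 means schools barely differ in mean achievement; a value near 1 means almost all of the variation is between schools.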
SPSS for Random Effects Model
SAS syntax
proc mixed noclprint covtest;
class school;
model mathach = /solution;
random intercept/sub=school;
run;
SPSS syntax
mixed mathach
/random intercept | subject(school).
SPSS output
Including Level-2 Predictors
• School-level (level-2) predictor of MEANSES
• Level 1: the student's outcome ($Y_{ij}$) is expressed as the sum of an intercept for the student's school ($\beta_{0j}$) and a random error term ($r_{ij}$) associated with the $i$th student in the $j$th school
$Y_{ij} = \beta_{0j} + r_{ij}$
• Level 2: we express the school-level intercept as the sum of an overall mean ($\gamma_{00}$), MEANSES, and a series of random deviations from that mean ($u_{0j}$)
$\beta_{0j} = \gamma_{00} + \gamma_{01}MEANSES_j + u_{0j}$
• This leads to a model equation of
$Y_{ij} = [\gamma_{00} + \gamma_{01}MEANSES_j] + [u_{0j} + r_{ij}]$
  • The first bracket holds the fixed effects; the second holds the random effects
• SAS syntax (ddfm=bw computes the df for fixed effects with the between/within method)
proc mixed noclprint covtest;
class school;
model mathach = meanses/solution ddfm=bw;
random intercept/sub=school;
run;
SAS Output
[SAS output; annotations mark the random effects $\tau_{00}$ and $\sigma^2$ and the two fixed effects]
SPSS for Including Level-2 Predictors
SAS syntax
proc mixed noclprint covtest;
class school;
model mathach = meanses/solution ddfm=bw;
random intercept/sub=school;
run;
SPSS syntax
mixed mathach with meanses
/fixed = meanses
/random intercept | subject(school).
SPSS output
Including Level-1 Predictors
• Student-level (level-1) predictor of SES
• Level 1: the student's outcome ($Y_{ij}$) is expressed as a function of an intercept for the student's school ($\beta_{0j}$), individual CSES (SES centered by MEANSES), and a random error term ($r_{ij}$) associated with the $i$th student in the $j$th school
$Y_{ij} = \beta_{0j} + \beta_{1j}CSES_{ij} + r_{ij}$
• Level 2: we express the school-level intercept as the sum of an overall mean ($\gamma_{00}$) and a series of random deviations from that mean ($u_{0j}$), and the school-level slope in the same way
$\beta_{0j} = \gamma_{00} + u_{0j}$
$\beta_{1j} = \gamma_{10} + u_{1j}$
• This leads to a model equation of
$Y_{ij} = [\gamma_{00} + \gamma_{10}CSES_{ij}] + [u_{0j} + u_{1j}CSES_{ij} + r_{ij}]$
  • The first bracket holds the fixed effects; the second holds the random effects
• SAS syntax (noitprint: don’t print the iteration page)
proc mixed noclprint covtest noitprint;
class school;
model mathach = cses/solution ddfm=bw notest;
random intercept cses/sub=school type=un;
run;
SAS Output
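The predictor CSES used above is SES centered by the school mean (MEANSES). As a minimal sketch of that preprocessing step, the following computes each school's mean SES and the centered value for every student. The record layout, the function name, and the data are hypothetical; the actual High School and Beyond file is not reproduced here.

```python
from collections import defaultdict


def group_mean_center(records):
    """records: list of (school_id, ses) pairs.

    Returns a list of (school_id, ses, meanses, cses) tuples, where
    meanses is the school's mean SES and cses = ses - meanses.
    """
    totals = defaultdict(lambda: [0.0, 0])   # school -> [sum, count]
    for school, ses in records:
        totals[school][0] += ses
        totals[school][1] += 1
    means = {school: total / count for school, (total, count) in totals.items()}
    return [(school, ses, means[school], ses - means[school])
            for school, ses in records]


# Hypothetical students in two schools
students = [("A", 1.0), ("A", 3.0), ("B", -1.0), ("B", 0.0), ("B", 1.0)]
centered = group_mean_center(students)
for row in centered:
    print(row)
```

After centering, CSES averages zero within every school, so the level-1 slope reflects only within-school differences in SES, and the school mean can enter separately as a level-2 predictor.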
Including Level-1 Predictors in SPSS
$Y_{ij} = [\gamma_{00} + \gamma_{10}CSES_{ij}] + [u_{0j} + u_{1j}CSES_{ij} + r_{ij}]$
SAS syntax
proc mixed noclprint covtest noitprint;
class school;
model mathach = cses/solution ddfm=bw notest;
random intercept cses/sub=school type=un;
run;
SPSS syntax
mixed mathach with cses
/fixed = cses
/random intercept cses | subject(school) covtype(un).
Including Level-1 AND Level-2 Predictors
• Model with the effect of the student's SES (CSES) and school SES (MEANSES), with a second level-2 factor of SECTOR
$Y_{ij} = \beta_{0j} + \beta_{1j}CSES_{ij} + r_{ij}$
$\beta_{0j} = \gamma_{00} + \gamma_{01}MEANSES_j + \gamma_{02}SECTOR_j + u_{0j}$
$\beta_{1j} = \gamma_{10} + \gamma_{11}MEANSES_j + \gamma_{12}SECTOR_j + u_{1j}$
• This leads to a model equation of
$Y_{ij} = \gamma_{00} + \gamma_{01}MEANSES_j + \gamma_{02}SECTOR_j + \gamma_{10}CSES_{ij} + \gamma_{11}MEANSES_j(CSES_{ij}) + \gamma_{12}SECTOR_j(CSES_{ij}) + u_{0j} + u_{1j}(CSES_{ij}) + r_{ij}$
• SAS syntax
proc mixed noclprint covtest noitprint;
class school;
model mathach = meanses sector cses meanses*cses sector*cses/solution ddfm=bw notest;
random intercept cses/type=un sub=school;
run;
SAS Output
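The combined model equation follows from substituting the two level-2 equations into the level-1 equation. A short worked version of that step, writing the intercept and slope as $\beta$, the fixed coefficients as $\gamma$, and the school-level deviations as $u$ (this symbol choice is an assumption that follows the usual multilevel notation):

```latex
\begin{aligned}
Y_{ij} &= \beta_{0j} + \beta_{1j}\,CSES_{ij} + r_{ij} \\
       &= \left(\gamma_{00} + \gamma_{01}MEANSES_j + \gamma_{02}SECTOR_j + u_{0j}\right) \\
       &\quad + \left(\gamma_{10} + \gamma_{11}MEANSES_j + \gamma_{12}SECTOR_j + u_{1j}\right) CSES_{ij} + r_{ij} \\
       &= \underbrace{\gamma_{00} + \gamma_{01}MEANSES_j + \gamma_{02}SECTOR_j + \gamma_{10}CSES_{ij}
          + \gamma_{11}MEANSES_j\,CSES_{ij} + \gamma_{12}SECTOR_j\,CSES_{ij}}_{\text{fixed part}} \\
       &\quad + \underbrace{u_{0j} + u_{1j}\,CSES_{ij} + r_{ij}}_{\text{random part}}
\end{aligned}
```

The six fixed terms map one-to-one onto the intercept and the five effects in the MODEL statement (meanses, sector, cses, meanses*cses, sector*cses), and the two school-level random terms map onto `random intercept cses`.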
Including Level-1 and Level-2 Predictors in SPSS
$Y_{ij} = \gamma_{00} + \gamma_{01}MEANSES_j + \gamma_{02}SECTOR_j + \gamma_{10}CSES_{ij} + \gamma_{11}MEANSES_j(CSES_{ij}) + \gamma_{12}SECTOR_j(CSES_{ij}) + u_{0j} + u_{1j}(CSES_{ij}) + r_{ij}$
• SAS syntax
proc mixed noclprint covtest noitprint;
class school;
model mathach = meanses sector cses meanses*cses sector*cses/solution ddfm=bw notest;
random intercept cses/type=un sub=school;
run;
SPSS syntax
mixed mathach with meanses sector cses
/fixed = meanses sector cses meanses*cses sector*cses
/random intercept cses | subject(school) covtype(un).