Presentation Slides


Sharon Wolf
NYU Abu Dhabi Additional Insights Summer Training Institute
June 15, 2015
1

Conceptual overview

Analytic considerations
  • Power/Minimum detectable differential effects
  • Cross-level interactions in MLM
  • Centering variables

Recommendations and tips
2
When is the story in the subgroups?
3
• Guide questions about how to target resources most efficiently:
  • How widespread are the effects of an intervention?
  • Is the intervention effective for a specific subgroup?
  • Is the intervention effective for any subgroup?
• Exploratory* versus confirmatory subgroup findings
4

Two examples from welfare reform in the United
States and the different policy implications.

Michalopoulos & Schwartz (2000) assessed two types of subgroups:
1. A range of person-level subgroups (e.g., education level, prior employment experience, and risk of depression).
2. The nature of the program and program office practices.
5
•
Characteristics believed to be related to the
need for a particular intervention or the
likelihood of benefiting from it.
• Demographic characteristics – e.g., gender, age,
education level
• Risk factors - past smoking, drug abuse, severity of
disease, poverty status
• Combinations of characteristics – e.g., gender and
age; cumulative levels of risk/risk index
6

Exogenous to the intervention: not affected by the
intervention or correlated with its receipt (all
pre-random-assignment characteristics).

Endogenous to the intervention: affected by the
intervention or correlated with its receipt (e.g.,
dosage of the intervention). Valid causal inferences
are much more difficult.
  • Gambia: higher “dosage” (i.e., higher attendance) → more learning?
  • Increased attendance could bring less advantaged
    students into the intervention group, biasing the average
    treatment effect (ATE) downward.
7

Exploratory subgroup analyses
  • Provide a basis for hypothesis generation
  • Essential step in the scientific method
  • Should be considered suggestive

Confirmatory subgroup analyses
  • Appropriate basis for testing hypotheses
  • Provide strong evidence if findings are (a) consistent
    with existing findings, (b) large enough in magnitude to
    be meaningful, and (c) robust.
Bloom & Michalopoulos, 2010
8

Internal contextual considerations
  • Features of findings internal to a given study
  • E.g., pattern across all outcomes for a particular
    subgroup in a study

External contextual considerations
  • Features of findings external to a given study
  • E.g., consistency with prior study findings
9
1. What is the impact of the program for each subgroup?
2. What are the relative impacts of the program across subgroups?
10
Minimum detectable differential effects
11
1. Did the program work for a particular subgroup?
  • Assess impacts separately for this subgroup
  • Assess power to detect impacts for this subgroup
2. Were the effects different for particular subgroups?
  • Assess impacts using a cross-level interaction
  • Assess power to detect a cross-level interaction
12

Minimum Detectable Effect Size (MDES): the
smallest true effect, in standard deviations of
the outcome, that is detectable for a given
level of power and statistical significance.
Accepted parameters:
  • Power: 80%
  • Statistical significance level: 0.05
13





ρ = intraclass correlation
δ = MDES
λ = non-centrality parameter
J = number of clusters
n = number of units per cluster
14

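Sampling variance of the standardized effect estimate (the MDES is proportional to its square root):

Main effect:
$$\sigma^2_{\delta} = \frac{4\,[\rho + (1-\rho)/n]}{J}$$

Main effect with covariate:
$$\sigma^2_{\delta|W} = \frac{4\,[(1-R^2_{|W})\,\rho + (1-\rho)/n]}{J}$$

Cluster-level moderator:
$$\sigma^2_{\delta|W,S} = \frac{16\,[(1-R^2_{|W,S})\,\rho + (1-\rho)/n]}{J}$$

Individual-level moderator:
$$\sigma^2_{\delta|X} = \frac{16\,(1-R^2_{|X})\,(1-\rho)/n}{J}$$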

The number of clusters (highest level units) is more
important than the size of the cluster (lower level
units) in reducing the MDES.

A higher intra-cluster correlation (ICC) increases
the MDES (i.e., if τ00 is relatively large).

The proportion of variance in the outcome you
can predict with L1 and L2 variables (i.e., $R^2_{|X}$ and
$R^2_{|W}$) reduces the MDES.
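
The three points above can be sketched numerically. This is a minimal illustration, assuming the main-effect variance expression 4[(1 − R²|W)ρ + (1 − ρ)/n]/J for a balanced cluster-randomized design and a normal-approximation multiplier (about 2.8 for 80% power and a two-sided 0.05 test). The function names and the design values (J, n, ρ) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def multiplier(power=0.80, alpha=0.05):
    """Normal-approximation multiplier: z(1 - alpha/2) + z(power)."""
    z = NormalDist()
    return z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)


def mdes_main_effect(J, n, rho, r2_w=0.0, power=0.80, alpha=0.05):
    """Approximate MDES for a cluster-level treatment main effect.

    J = number of clusters, n = units per cluster, rho = ICC,
    r2_w = share of between-cluster variance explained by a covariate.
    """
    variance = 4 * ((1 - r2_w) * rho + (1 - rho) / n) / J
    return multiplier(power, alpha) * sqrt(variance)


# Hypothetical design values, for illustration only.
baseline = mdes_main_effect(J=40, n=20, rho=0.15)
more_clusters = mdes_main_effect(J=80, n=20, rho=0.15)    # double J
bigger_clusters = mdes_main_effect(J=40, n=40, rho=0.15)  # double n
higher_icc = mdes_main_effect(J=40, n=20, rho=0.30)
with_covariate = mdes_main_effect(J=40, n=20, rho=0.15, r2_w=0.5)

for label, value in [
    ("baseline", baseline),
    ("double the clusters", more_clusters),
    ("double the cluster size", bigger_clusters),
    ("higher ICC", higher_icc),
    ("with L2 covariate", with_covariate),
]:
    print(f"{label:>24}: MDES = {value:.3f}")
```

Under these assumed values, doubling the number of clusters lowers the MDES more than doubling the cluster size does, a higher ICC raises it, and a predictive covariate lowers it.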
16

Maintains a significant portion of power
because the number of clusters (or L2 units)
remains the same.

The only statistical difference between the
subexperiment and the full experiment is the
number of L1 units per cluster.
17
18

Minimum Detectable Effect Size Difference
(MDESD): the smallest true difference in
program impacts between two subgroups,
in standard deviations of the outcome, that is
detectable for a given level of power and
statistical significance.
Accepted parameters:
  • Power: 80%
  • Statistical significance level: 0.05
19

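Sampling variance of the standardized effect estimate (the MDESD is proportional to its square root):

Main effect:
$$\sigma^2_{\delta} = \frac{4\,[\rho + (1-\rho)/n]}{J}$$

Main effect with covariate:
$$\sigma^2_{\delta|W} = \frac{4\,[(1-R^2_{|W})\,\rho + (1-\rho)/n]}{J}$$

Cluster-level moderator:
$$\sigma^2_{\delta|W,S} = \frac{16\,[(1-R^2_{|W,S})\,\rho + (1-\rho)/n]}{J}$$

Individual-level moderator:
$$\sigma^2_{\delta|X} = \frac{16\,(1-R^2_{|X})\,(1-\rho)/n}{J}$$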

Within-level variance becomes increasingly important.
Implications include:
  • The number of cases per cluster (lower-level units)
    becomes more important for increasing power.
  • The intra-cluster correlation (ICC) becomes less
    influential in affecting power (though still important).
  • The proportion of variance in the outcome you can
    predict with L1 variables (not L2; i.e., $R^2_{|X}$) increases
    power.
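
A minimal sketch of these implications, assuming the moderator variance expressions 16[(1 − R²|W,S)ρ + (1 − ρ)/n]/J (cluster-level moderator) and 16(1 − R²|X)(1 − ρ)/(nJ) (individual-level moderator). It also assumes a binary moderator that splits the sample evenly, a balanced design, and a normal-approximation multiplier. Function names and design values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def multiplier(power=0.80, alpha=0.05):
    """Normal-approximation multiplier: z(1 - alpha/2) + z(power)."""
    z = NormalDist()
    return z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)


def mdesd_cluster_moderator(J, n, rho, r2_ws=0.0):
    """Approximate MDESD when the moderator is a cluster-level (L2) variable."""
    variance = 16 * ((1 - r2_ws) * rho + (1 - rho) / n) / J
    return multiplier() * sqrt(variance)


def mdesd_individual_moderator(J, n, rho, r2_x=0.0):
    """Approximate MDESD when the moderator is an individual-level (L1) variable.

    Only the within-cluster variance (1 - rho) enters this expression.
    """
    variance = 16 * (1 - r2_x) * (1 - rho) / n / J
    return multiplier() * sqrt(variance)


# Hypothetical design: 40 clusters, ICC of 0.15.
for n in (20, 40):
    l2 = mdesd_cluster_moderator(J=40, n=n, rho=0.15)
    l1 = mdesd_individual_moderator(J=40, n=n, rho=0.15)
    print(f"n = {n}: cluster-level MDESD = {l2:.3f}, individual-level MDESD = {l1:.3f}")
```

Under these assumptions, doubling n shrinks the individual-level MDESD by a factor of √2, while the cluster-level MDESD barely moves because the ρ term does not depend on n.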
21
Assessing individual-level moderation in
cluster-randomized trials using multi-level
models
22
1. Lower-level direct effects. Does an L1 predictor X (e.g.,
   student gender) have a relationship with the L1 outcome
   variable Y (e.g., student reading)?
2. Cross-level direct effects. Does an L2 predictor (e.g.,
   school treatment status) have a relationship with an L1
   outcome variable Y (e.g., student reading)?
3. Cross-level interaction effects. Does the nature or
   strength of the relationship between an L1 variable (e.g.,
   gender) and the outcome (e.g., reading) change as a
   function of a higher-level variable (e.g., school treatment
   status)?
23
Level 1: 𝑌𝑖𝑗 = 𝛽0𝑗 + 𝑟𝑖𝑗
Level 2: 𝛽0𝑗 = 𝛾00 + 𝛾01 𝑇𝑗 + 𝑢0𝑗
Yij = outcome for individual i in cluster j
Tj = 1 for program-group members, 0 for control-group members
γ00 = mean outcome for the control group
γ01 = true program impact
rij = error component for individual i from cluster j
u0j = error component for cluster j
24
𝑌𝑖𝑗 = 𝛾00 + 𝛾01 𝑇𝑗 + 𝑢0𝑗 + 𝑟𝑖𝑗
25

Level 2 (e.g., school treatment status) and Level 1
(e.g., student gender) variables interacting to
produce an effect on the outcome (e.g., student
reading scores).

In terms of your impact estimation equation:
  • (a) Add a Level 1 predictor (the moderator).
  • (b) Expand the Level 2 model to include a fixed slope (β1j).
  • (c) Add a Level 2 predictor (treatment status) to the slope.
26
Level 1: 𝑌𝑖𝑗 = 𝛽0𝑗 + 𝛽1𝑗 𝑀𝑖𝑗 + 𝑟𝑖𝑗   (added L1 predictor: the moderator)
Level 2: 𝛽0𝑗 = 𝛾00 + 𝛾01 𝑇𝑗 + 𝑢0𝑗
         𝛽1𝑗 = 𝛾10 + 𝛾11 𝑇𝑗   (expanded L2 slope equation, with the L2 predictor added to the slope)
27
Level 1: 𝑌𝑖𝑗 = 𝛽0𝑗 + 𝛽1𝑗 𝑀𝑖𝑗 + 𝑟𝑖𝑗
Level 2: 𝛽0𝑗 = 𝛾00 + 𝛾01 𝑇𝑗 + 𝑢0𝑗
         𝛽1𝑗 = 𝛾10 + 𝛾11 𝑇𝑗   (𝛾11 is the coefficient for the cross-level interaction)
γ00 = mean outcome for the control group when Mij = 0
γ01 = estimated program impact when Mij = 0
γ10 = main effect of the moderating variable when Tj = 0
γ11 = moderated effect (i.e., the interaction)
28
𝑌𝑖𝑗 = 𝛾00 + 𝛾01 𝑇𝑗 + 𝛾10 𝑀𝑖𝑗 + 𝛾11 𝑇𝑗 𝑀𝑖𝑗 + 𝑢0𝑗 + 𝑟𝑖𝑗
29

“Simple Regression Equation”: Calculate the
expected values of Yij under different conditions of
Tj and Mij

For continuous moderators, plot at values of one
standard deviation below the mean, the mean,
and one standard deviation above the mean for M.

It may also be useful to choose additional values
that may be informative in specific contexts.
30
E(Yij | Mij ,Tj) =
γ00 + γ01(Tj )+ γ10(Mij )+ γ11(Mij)(Tj)
Under control conditions:
E(Yij | Mij ,Tj = 0) = γ00 + γ10(Mij)
Under treatment conditions:
E(Yij | Mij ,Tj = 1) =
γ00 + γ01 + γ10(Mij) + γ11 (Mij)
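
A short sketch of how these expected values might be tabulated for a continuous moderator at one standard deviation below the mean, the mean, and one standard deviation above. The coefficient values, the moderator mean and SD, and the function names are all hypothetical; in practice they would come from the fitted model and the sample.

```python
def expected_outcome(g00, g01, g10, g11, m, t):
    """E(Y | M = m, T = t) from the combined fixed-effects equation."""
    return g00 + g01 * t + g10 * m + g11 * m * t


def impact_at(g01, g11, m):
    """Treatment impact at moderator value m: E(Y | m, T=1) - E(Y | m, T=0)."""
    return g01 + g11 * m


# Hypothetical fixed-effect estimates and moderator distribution.
g00, g01, g10, g11 = 50.0, 4.0, 2.0, 1.5
m_mean, m_sd = 0.0, 1.0

for label, m in [("-1 SD", m_mean - m_sd), ("mean", m_mean), ("+1 SD", m_mean + m_sd)]:
    control = expected_outcome(g00, g01, g10, g11, m, t=0)
    treated = expected_outcome(g00, g01, g10, g11, m, t=1)
    print(f"M at {label}: control = {control:.1f}, treatment = {treated:.1f}, "
          f"impact = {impact_at(g01, g11, m):.1f}")
```

The gap between the treatment and control lines at any value of M is γ01 + γ11·M, so γ11 is the amount by which the impact changes per unit of the moderator.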
31
If we need it
32
Implications for interpreting effect
estimates and detecting impact variation
33

How do you want to interpret the intercept in
your model? The coefficients?
  • Example: School diversity/cultural awareness program
    H1: Improved sense of belonging for minority students
    (L1 moderator).
    H2: Improved sense of belonging for minority students in
    less diverse schools (L1 & L2 moderators).

The distribution of the moderator variable
across clusters needs to be considered.
34

CGM = Centering at the grand mean
  • Deviations calculated from the sample mean for
    all individuals
  • Center L1 variables using all individuals; L2 variables using all clusters

CWC = Centering within clusters
  • Also known as group-mean centering
  • Deviations calculated around the mean of the
    cluster j to which case i belongs
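
A minimal sketch of the two centering schemes on a toy data set. The school names and ages are made up for illustration.

```python
from statistics import mean

# Hypothetical ages for students in two schools (clusters).
ages = {
    "school_a": [6.0, 7.0, 8.0],
    "school_b": [9.0, 10.0, 11.0],
}

# CGM: subtract the mean of all individuals from every score.
grand_mean = mean(a for values in ages.values() for a in values)
cgm = {s: [a - grand_mean for a in values] for s, values in ages.items()}

# CWC: subtract each cluster's own mean from the scores in that cluster.
cwc = {s: [a - mean(values) for a in values] for s, values in ages.items()}

for school in ages:
    print(school, "CGM:", cgm[school], "CWC:", cwc[school])
    # CGM scores keep the between-school difference in their cluster means;
    # CWC scores have a mean of zero in every cluster.
    print("  cluster mean of CGM scores:", mean(cgm[school]))
    print("  cluster mean of CWC scores:", mean(cwc[school]))
```

In this toy example the oldest student in school_a (age 8) gets a CWC score of +1 while the youngest student in school_b (age 9) gets −1, so CWC reorders scores across the sample, whereas CGM only shifts every score by the same constant.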
35
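[Figure: Y (outcome) plotted against X (predictor) for Cluster 1, Cluster 2, and Cluster 3. The distribution of M is highly variable across clusters.]
36
[Figure: CGM. Y (outcome) plotted against X (predictor), with M marked.]
37
[Figure: CGM, continued. Y (outcome) plotted against X (predictor), with M marked.]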
38

CGM does not affect the rank order of scores on the
variable. The complex, multilevel association
between the L1 and L2 variables is unaffected.

Yields scores that are correlated with variables at
both levels of the hierarchy. (This is a critical
difference from CWC.)

Produces an interaction coefficient (γ11) that is a
weighted combination of the within- and between-cluster regression coefficients.
39
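[Figure: CWC. Y (outcome) plotted against X (predictor), with M1, M2, and M3 marked. The distribution of M is highly concentrated within clusters.]
40
[Figure: CWC, continued. Y (outcome) plotted against X (predictor), with M1, M2, and M3 marked.]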
41



CWC affects the rank order of scores of variables
within the sample.
Produces scores that are uncorrelated with Level
2 variables (because the centered scores have a mean
of zero within every cluster).
Produces an interaction coefficient (γ11) that is
an unbiased estimate of the Level 1 association.
  • γ11 is a pure estimate of the cross-level interaction,
    no longer confounded with the Level 2 interaction.
42
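[Figure: Y (outcome) plotted against X (predictor). The distribution of M is even across clusters.]
43
[Figure: CGM. Y (outcome) plotted against X (predictor), with M marked.]
44
[Figure: CWC. Y (outcome) plotted against X (predictor), with M1, M2, and M3 marked.]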
45

Centering will affect estimates more if the predictor
variable is not evenly distributed across clusters.

Cross-level interaction term using CGM will provide a
coefficient estimate that is a mix of the L1 and L2
effects.

Cross-level interaction term using CWC will provide a
pure estimate of the L1 relationship.

Decisions on how to center depend on your data and
your research question (!!).
46

Predictor: Treatment status (L2)

Individual level moderator: Student age (L1)
(continuous)

Outcome: Reading score

Some options on how to center the data and what
it means for interpreting your moderated effect…
47
Level 1: 𝑌𝑖𝑗 = 𝛽0𝑗 + 𝛽1𝑗 𝑎𝑔𝑒𝑖𝑗 + 𝑟𝑖𝑗
Level 2: 𝛽0𝑗 = 𝛾00 + 𝛾01 𝑇𝑗 + 𝑢0𝑗
         𝛽1𝑗 = 𝛾10 + 𝛾11 𝑇𝑗
  • γ00 is the average school mean reading score for the control
    group when age = 0.
  • γ10 is a composite of the within-school and between-school
    age-reading relationships.
  • γ11 is a composite of the interaction of treatment with the
    within-school age-reading relationship and with the
    between-school age-reading relationship.

48
Level 1: $Y_{ij} = \beta_{0j} + \beta_{1j}(age_{ij} - \overline{age}) + r_{ij}$
Level 2: $\beta_{0j} = \gamma_{00} + \gamma_{01} T_j + u_{0j}$
         $\beta_{1j} = \gamma_{10} + \gamma_{11} T_j$
Here $\overline{age}$ is the grand mean of age.
  • γ00 is the average school mean reading score for the control
    group at the grand-mean age.
  • γ10 is a composite of the within-school and between-school
    age-reading relationships.
  • γ11 is still a composite of the interaction of treatment with
    the within-school age-reading relationship and with the
    between-school age-reading relationship.

49
Level 1: $Y_{ij} = \beta_{0j} + \beta_{1j}(age_{ij} - \overline{age}_j) + r_{ij}$
Level 2: $\beta_{0j} = \gamma_{00} + \gamma_{01} T_j + \gamma_{02}(\overline{age}_j - \overline{age}) + u_{0j}$
         $\beta_{1j} = \gamma_{10} + \gamma_{11} T_j$
Here $\overline{age}_j$ is the mean age in school j and $\overline{age}$ is the grand mean of age.
γ00 is the average school mean reading score across the
schools for the control group.
γ02 is the average change in school mean reading score for a
1-unit increase in school mean age across schools (the
between-school age relationship).
γ10 is the within-school age-reading relationship for the
control group.
γ11 is the interaction between treatment and the within-school
age-reading relationship.
50
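Level 1: $Y_{ij} = \beta_{0j} + \beta_{1j}(age_{ij} - \overline{age}_j) + r_{ij}$
Level 2: $\beta_{0j} = \gamma_{00} + \gamma_{01} T_j + \gamma_{02}(\overline{age}_j - \overline{age}) + \gamma_{03}(\overline{age}_j - \overline{age})\,T_j + u_{0j}$
         $\beta_{1j} = \gamma_{10} + \gamma_{11} T_j$
Here $\overline{age}_j$ is the mean age in school j and $\overline{age}$ is the grand mean of age.
γ00 is the average school mean reading fluency across the schools for
the control group.
γ02 is the average change in school mean reading fluency for a 1-unit
increase in school mean age across schools (the between-school age
relationship).
γ10 is the within-school age-reading relationship for the control group.
γ03 is the moderated relationship between treatment and the
between-school age-reading relationship.
γ11 is the moderated relationship between treatment and the
within-school age-reading relationship.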
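
For reference, substituting the Level 2 equations into the Level 1 equation gives a single combined equation for this last model. This is a worked substitution under the model as specified here, with $\overline{age}_j$ denoting the school mean age and $\overline{age}$ the grand mean (notation assumed for readability):

```latex
\begin{aligned}
Y_{ij} ={}& \gamma_{00} + \gamma_{01} T_j
  + \gamma_{02}\,(\overline{age}_j - \overline{age})
  + \gamma_{03}\,(\overline{age}_j - \overline{age})\,T_j \\
 &+ \gamma_{10}\,(age_{ij} - \overline{age}_j)
  + \gamma_{11}\,T_j\,(age_{ij} - \overline{age}_j)
  + u_{0j} + r_{ij}
\end{aligned}
```

Written this way, the between-school terms (γ02, γ03) and the within-school terms (γ10, γ11) appear as separate predictors, which is what lets the two interactions be estimated separately.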
51
52
• Distortions to statistical inferences can occur when
  multiple related hypothesis tests are conducted.
• Suggested approaches:
  1. Explicitly distinguish between exploratory and
     confirmatory findings.
  2. Minimize the number of confirmatory hypothesis tests
     conducted by a given study.
  3. Create an omnibus hypothesis test about the intervention’s
     effects that considers all outcome measures and
     subgroups together (e.g., a composite measure of individual
     outcomes).
  4. Consider family-wise error correction (reduces statistical
     power considerably).
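
As one illustration of approach 4, here is a minimal sketch of two common family-wise error corrections, Bonferroni and Holm. The slides do not name a specific correction, so the choice of these two is an assumption, and the p-values are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha / (number of tests)."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


def holm(p_values, alpha=0.05):
    """Holm step-down procedure: compare sorted p-values to alpha / (m - rank)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return reject


# Hypothetical p-values for four subgroup impact tests.
p_values = [0.004, 0.015, 0.030, 0.200]
print("uncorrected:", [p <= 0.05 for p in p_values])
print("Bonferroni: ", bonferroni(p_values))
print("Holm:       ", holm(p_values))
```

With these made-up p-values, three tests pass the uncorrected 0.05 threshold, two survive Holm, and one survives Bonferroni, which shows the loss of power the slide warns about.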
53
1. Calculate ρ for all levels.
2. Determine your research question and the relevant approach to
   assessing subgroup effects.
3. Calculate the power needed to detect a subgroup effect (either
   for a particular subgroup, or for a cross-level interaction,
   depending on your research question).
4. Rescale (i.e., center) predictor variables as needed.
5. Assess the practical significance of your findings (i.e., calculate
   effect sizes).
6. Report results regarding each step of the model-building
   process, including all coefficients, standard errors, and variance
   components.
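
For step 1, a minimal sketch of the intraclass correlation in a two-level model, assuming ρ = τ00 / (τ00 + σ²), where τ00 is the between-cluster variance and σ² is the within-cluster variance from an unconditional (intercept-only) model. The variance components below are hypothetical.

```python
def intraclass_correlation(tau00, sigma2):
    """Share of total outcome variance that lies between clusters."""
    return tau00 / (tau00 + sigma2)


# Hypothetical variance components from an intercept-only two-level model.
tau00 = 15.0   # between-cluster (L2) variance
sigma2 = 85.0  # within-cluster (L1) variance

rho = intraclass_correlation(tau00, sigma2)
print(f"ICC = {rho:.2f}")
```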
54

Aguinis, H., Gottfredson, R. K., & Culpepper, S. A. (2013). Best-practice
recommendations for estimating cross-level interaction effects using multilevel
modeling. Journal of Management, 0149206313478188.

Bloom, H. S. (Ed.). (2005). Learning more from social experiments:
Evolving analytic approaches. Russell Sage Foundation.

Bloom, H. S., & Michalopoulos, C. (2010). When is the story in the
subgroups? MDRC Working Paper.

Enders, C. K., & Tofighi, D. (2007). Centering predictor variables in
cross-sectional multilevel models: A new look at an old issue. Psychological
Methods, 12(2), 121.

Mathieu, J. E., Aguinis, H., Culpepper, S. A., & Chen, G. (2012).
Understanding and estimating the power to detect cross-level
interaction effects in multilevel modeling. Journal of Applied Psychology,
97(5), 951.



55