the twins effect 2

Testing causality
Charlotte Huppertz
VU University Medical Center Amsterdam
Netherlands Twin Register
Rationale
• Correlation does not equal causation!
• Alternative explanation: underlying factors
• Example: „Exercising relieves depressive symptoms“
Descriptives
• Exercisers are on average happier and less anxious
and depressed
• This holds in women and men of all ages
[Figure: six bar charts comparing non-exercisers and exercisers on Life satisfaction (SWLS), Happiness (SHS), Depression (BDI), Anxious depression (YASR), Anxiety (STAI) and Neuroticism (EPQ); every panel carries an asterisk.]
Possible explanations
[Diagram: Genes (50%) → Exercise behaviour; Genes (35-50%) → Depressive symptoms; a shared set of Genes pointing to both traits: "Pleiotropy"?]
Ways to falsify causality
≠ proving causality
• Gold standard: experimentation
– Randomize individuals to different conditions
– Expose them to a treatment vs. control
– Assess post-treatment differences
• Problems
– Really random samples?
– Generalizable samples? (e.g. students; exercisers)
– Usually requires a lot of effort/money!
– Not everything can be manipulated! (e.g. effect of childhood maltreatment on depression)
Ways to falsify causality
1) The MZ twin intrapair differences model
2) Bivariate genetic models
3) (Mendelian randomization)
1) The MZ twin intrapair differences model
Control for: shared genes, shared environment
Within-pair differences in Phenotype 1 should be associated
with within-pair differences in Phenotype 2.
How to do the calculation?
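One possible way to do the calculation, sketched in Python: for every MZ pair, subtract twin 2's score from twin 1's score on each phenotype, then correlate the two sets of difference scores. The data and the function names `pearson` and `intrapair_difference_correlation` are invented for illustration and do not come from the slides.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation of two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def intrapair_difference_correlation(pairs):
    """pairs: list of ((p1_twin1, p2_twin1), (p1_twin2, p2_twin2)) for MZ pairs."""
    d1 = [t1[0] - t2[0] for t1, t2 in pairs]  # within-pair difference, phenotype 1
    d2 = [t1[1] - t2[1] for t1, t2 in pairs]  # within-pair difference, phenotype 2
    return pearson(d1, d2)

# Hypothetical MZ pairs: (weekly exercise hours, depression score) per twin.
mz_pairs = [
    ((5.0, 4.0), (1.0, 8.0)),
    ((2.0, 6.0), (3.0, 5.0)),
    ((0.0, 9.0), (4.0, 6.0)),
    ((6.0, 3.0), (6.0, 4.0)),
    ((3.0, 7.0), (1.0, 8.0)),
]
r_diff = intrapair_difference_correlation(mz_pairs)
print(round(r_diff, 3))
```

In these made-up data the twin who exercises more tends to report fewer symptoms, so the correlation is negative, which would be compatible with a causal effect. A correlation near zero would argue against one, because shared genes and shared environment are already controlled for within MZ pairs.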
2) Bivariate genetic models
All genetic and environmental factors that influence P1 will also,
through the causal chain, influence P2 (If A -> B and B -> C, then A -> C).
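The logic of the causal chain can be made concrete with a small sketch. It assumes a simple linear model that is not spelled out on the slides: P1 = a11·A1 + e11·E1 and P2 = b·P1 plus influences specific to P2. Under that assumption the cross-trait genetic and environmental paths are both b times the corresponding P1 path. The function name and all numbers are hypothetical.

```python
def implied_cross_paths(a11, e11, b):
    """Cross-trait paths implied when P1 causes P2 with effect b.

    Assumes P1 = a11*A1 + e11*E1 and P2 = b*P1 + (influences specific to P2).
    """
    return {"a21": b * a11, "e21": b * e11}

# Hypothetical values, chosen only for illustration.
paths = implied_cross_paths(a11=0.7, e11=0.7, b=-0.3)
print(paths)

# Under causality both cross-trait paths are non-zero and share the sign of b.
both_present = paths["a21"] != 0 and paths["e21"] != 0
print(both_present)
```

If one of the two cross-trait paths can be dropped from a fitted model without loss of fit, the data are hard to reconcile with this causal model.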
Path diagram
[Path diagram, built up in three steps: (1) an ACE model for one trait at Time Point 1 and Time Point 2 in Twin1 and Twin2; (2) the same model with Exercise beh. and Depr. symp. as the two traits; (3) the reduced AE model without C. Across twins, the A factors correlate 0.5 (DZ) or 1 (MZ) and the C factors correlate 1.]
Let's start at the beginning...
Univariate twin model
[Path diagram: Exercise Twin1 and Exercise Twin2 each load on a latent A factor (path a) and a latent E factor (path e); the two A factors correlate 1 (MZ) or 0.5 (DZ).]
A = genes, E = unique environment
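What the univariate AE model implies for twin resemblance can be sketched in a few lines of Python. The trait variance is a² + e² and the covariance between twins is a² for MZ pairs and 0.5·a² for DZ pairs. The function name and the path values are hypothetical.

```python
def expected_twin_stats(a, e, r_a):
    """Variance, twin covariance and twin correlation implied by an AE model.

    r_a is the correlation between the twins' A factors: 1 for MZ, 0.5 for DZ.
    """
    variance = a ** 2 + e ** 2
    covariance = r_a * a ** 2
    return variance, covariance, covariance / variance

a, e = 0.8, 0.6  # hypothetical path coefficients
_, _, r_mz = expected_twin_stats(a, e, r_a=1.0)
_, _, r_dz = expected_twin_stats(a, e, r_a=0.5)
print(round(r_mz, 2), round(r_dz, 2))
```

With these values the expected MZ correlation is 0.64 and the expected DZ correlation is half of that, 0.32, which is the pattern an AE model predicts.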
Univariate twin model
[Path diagram: the genetic path is relabelled a11 (A → Exercise) in Twin1 and Twin2; the A factors correlate 1 or 0.5.]
Bivariate twin model
[Path diagram, built up in two steps: (1) separate A factors for Exercise (path a11) and Depr. S. (path a22) in each twin; (2) a cross-trait path a21 is added from the Exercise A factor to Depr. S. Corresponding A factors of Twin1 and Twin2 correlate 1 or 0.5.]
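Bivariate twin model
[Path diagrams for Exercise behavior and Depressive symptoms: a Cholesky decomposition (factors A1, A2, E1, E2; paths a11, a22, e11, e22; cross-trait paths annotated with "!" and "?") next to a correlated-factors model (paths a1, a2, e1, e2; factor correlations r_genetic and r_environ).]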
Be careful: order can matter!
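One reading of this warning, sketched below under that assumption: in a Cholesky decomposition the path coefficients depend on which trait is entered first, while the implied genetic correlation does not. The helper names and the path values are hypothetical.

```python
import math

def genetic_covariance(a11, a21, a22):
    """Genetic (co)variances implied by Cholesky paths: (var1, cov12, var2)."""
    return a11 ** 2, a11 * a21, a21 ** 2 + a22 ** 2

def cholesky_paths(var_first, cov, var_second):
    """Cholesky paths for a 2x2 covariance matrix, first variable entered first."""
    p11 = math.sqrt(var_first)
    p21 = cov / p11
    p22 = math.sqrt(var_second - p21 ** 2)
    return p11, p21, p22

# Hypothetical paths with exercise entered first.
var_ex, cov, var_dep = genetic_covariance(0.7, 0.2, 0.6)
exercise_first = cholesky_paths(var_ex, cov, var_dep)
depression_first = cholesky_paths(var_dep, cov, var_ex)

r_g = cov / math.sqrt(var_ex * var_dep)
print([round(p, 3) for p in exercise_first])
print([round(p, 3) for p in depression_first])
print(round(r_g, 3))
```

The cross-trait path differs between the two orderings even though both describe the same genetic covariance matrix, so individual Cholesky paths should be interpreted with the variable order in mind.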
Practical
• Open „BivariateCholesky.R“
• Simulated data
• Zygos: 1 = MZM, 2 = DZM
• TASKS:
1) Find four errors in the script
2) Run the main model & write out the path diagram including the estimated parameters
3) Drop the cross-trait genetic path (provided) – conclusion?
4) Drop the cross-trait environmental path (not provided) – conclusion?
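Tasks 3 and 4 come down to comparing a full model with a nested model in which one path is fixed to zero. A common way to do this is a likelihood ratio test with one degree of freedom; the sketch below shows only the arithmetic, in Python rather than R. The -2 log-likelihood values are invented and are not output from BivariateCholesky.R.

```python
import math

def lrt_one_df(minus2ll_full, minus2ll_reduced):
    """Likelihood ratio test for a nested model that fixes one parameter to zero.

    Returns the chi-square statistic and its p-value for 1 degree of freedom.
    """
    chi2 = minus2ll_reduced - minus2ll_full
    # Survival function of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(max(chi2, 0.0) / 2.0))
    return chi2, p_value

# Hypothetical -2 log-likelihoods, NOT taken from the practical.
chi2, p = lrt_one_df(minus2ll_full=4000.00, minus2ll_reduced=4003.84)
print(round(chi2, 2), round(p, 3))
```

A large p-value means the path can be dropped without a significant loss of fit; a small one means the path is needed to describe the data.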
1) Four errors
2) The path diagram
[Path diagram for Exercise Twin1 and Depr. S. Twin1 with A and E factors; estimated path coefficients as extracted from the slide: 0.46, 0.21, 0.22, 0.05, 1.11, 1.07.]
3) Output dropping A
Conclusion?
4) Output dropping E
Conclusion?
Conclusion
[Path diagram as in 2), with the 0.05 path marked n.s. (not significant).]
→ Not compatible with a causal effect!
→ In „real life“: check power!
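Result bivariate twin model
[Path diagram: correlated-factors model for Exercise behavior and Depressive symptoms, with genetic factors A1 and A2 (paths a1, a2) correlated r_genetic and unique environmental factors E1 and E2 (paths e1, e2) correlated r_environ.]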
Result MZ twin intrapair differences model
• The twin who exercises more is not less anxious/depressed
Conclusion
• The association between exercise and depressive
symptoms is best explained by the same genetic
vulnerability!
“Our results signal psychiatrists and epidemiologists that
the small but robust cross-sectional and longitudinal
correlations between voluntary exercise behavior and
mental health should be interpreted with caution.”
Exercise behavior & other phenotypes
“The genetic factors influencing exercise participation and self-rated health
partially overlap (r = 0.36) and this overlap fully explains their phenotypic
correlation.” (de Moor et al., 2006, EurJEpid)
“Exercise participation is associated with higher levels of life satisfaction and
happiness. This association is non-causal and appears to be mediated by
genetic factors that influence both exercise behavior and well-being.” (Stubbe
et al., 2007, PrevMed)
“Regular exercise is associated with reduced anxious and depressive
symptoms in the population at large, but the association is not because of
causal effects of exercise.” (de Moor et al., 2008, ArchGenPsychiatry)
“Exercise behavior is associated with fewer internalizing problems and higher
levels of SWB. The association largely reflects the effects of common genetic
factors on these traits.” (Bartels et al., 2012, FronGen)
Genetic correlation
Cross-trait correlations
[Path diagrams for Exercise behavior and Depressive symptoms: the Cholesky decomposition (paths a11, a22, e11, e22 and the cross-trait paths) next to the correlated-factors model (paths a1, a2, e1, e2 with r_genetic and r_environ).]
How can we calculate a genetic correlation?
[Path diagram: bivariate Cholesky decomposition for Exercise Twin1 and Depr. S. Twin1 with genetic paths a11, a21, a22 and environmental paths e11, e21, e22.]
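Write out the formulae using a11, a21 and a22!
[Path diagram: A factors with path a11 to Exercise Twin1 and paths a21 and a22 to Depr. S. Twin1.]
\[ r_{xy} = \frac{\mathrm{cov}_{xy}}{sd_x \cdot sd_y}, \qquad sd = \sqrt{\mathrm{var}} \]
\[ r_g = \frac{a_{11} \cdot a_{21}}{\sqrt{a_{11}^2} \cdot \sqrt{a_{21}^2 + a_{22}^2}} = \frac{a_{21} a_{11}}{\sqrt{a_{11}^2 \cdot (a_{21}^2 + a_{22}^2)}} \]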
What does a rg of 1 mean?
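The genetic correlation can be checked numerically from the three genetic paths. This is a minimal sketch: the genetic covariance is a11·a21, the genetic variance of the first trait is a11², and that of the second trait is a21² + a22². The function name and the path values are hypothetical.

```python
import math

def genetic_correlation(a11, a21, a22):
    """Genetic correlation implied by the genetic paths of a bivariate Cholesky model."""
    cov_g = a11 * a21                        # genetic covariance
    sd_g1 = math.sqrt(a11 ** 2)              # genetic SD of trait 1
    sd_g2 = math.sqrt(a21 ** 2 + a22 ** 2)   # genetic SD of trait 2
    return cov_g / (sd_g1 * sd_g2)

# Hypothetical path coefficients.
rg_partial = genetic_correlation(0.7, 0.2, 0.6)
rg_complete = genetic_correlation(0.7, 0.2, 0.0)
print(round(rg_partial, 3), round(rg_complete, 3))
```

In the second call a22 is zero, so every genetic influence on the second trait also acts on the first, and the genetic correlation is 1 regardless of how small a21 is.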
There is even more information
in such a bivariate model
Explained covariance
Can the explained covariance be close to zero when
the genetic correlation is 1?
[Path diagram: correlated-factors model for Exercise behavior and Depressive symptoms, with A1 and A2 (paths a1, a2) correlated r_genetic and E1 and E2 (paths e1, e2) correlated r_environ.]
Importance is determined by both the genetic correlation and the heritability of each phenotype!
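This point can be illustrated with the standard expression for the part of a phenotypic correlation that runs through shared genes: the square root of the first heritability, times the genetic correlation, times the square root of the second heritability. The sketch assumes standardized phenotypes; the function name and the numbers are hypothetical.

```python
import math

def genetic_contribution(h2_x, h2_y, r_g):
    """Part of the phenotypic correlation between x and y that is due to shared genes.

    h2_x and h2_y are heritabilities (proportions of variance), r_g the genetic correlation.
    """
    return math.sqrt(h2_x) * r_g * math.sqrt(h2_y)

# Moderate heritabilities with a modest genetic correlation (hypothetical).
moderate = genetic_contribution(h2_x=0.5, h2_y=0.4, r_g=0.3)
# Perfect genetic correlation but very low heritabilities (hypothetical).
perfect_but_small = genetic_contribution(h2_x=0.04, h2_y=0.04, r_g=1.0)
print(round(moderate, 3), round(perfect_but_small, 3))
```

A genetic correlation of 1 therefore does not guarantee a sizeable genetic contribution to the observed association: with low heritabilities the contribution stays close to zero.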
3) Mendelian randomization
Problems with the twin approach
• Latent, unmeasured genetic and environmental effects
• A, C, D and E cannot easily be estimated simultaneously
• E includes error
• Needs very large twin samples
Mendelian randomization
• Testing causality based on measured DNA
• Apart from that, similar to the bivariate model:
„A genetic variant that influences an exposure
variable (such as exercise behavior) should also,
through the causal chain, predict an outcome variable
(e.g. depressive symptoms)!“
• “Randomization to genotype” at conception
Advantages & problems
• Based on measured variants
• Can be applied to any large population-based samples
• Solid associations between genetic markers and exposure variable first need to be established („genetic instrument“)
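The idea can be tried out on simulated data. The sketch below uses the Wald ratio, a simple Mendelian randomization estimator that the slides do not name: the variant-outcome slope divided by the variant-exposure slope. The allele frequency, effect sizes, sample size and function names are all invented, and the simulation assumes the variant affects the outcome only through the exposure.

```python
import random

def slope(x, y):
    """Least-squares slope of y regressed on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate(n=20000, causal_effect=0.0, seed=1):
    """Simulate a variant, a confounded exposure and an outcome (all hypothetical)."""
    rng = random.Random(seed)
    g, x, y = [], [], []
    for _ in range(n):
        variant = sum(rng.random() < 0.3 for _ in range(2))  # 0, 1 or 2 alleles
        confounder = rng.gauss(0, 1)                         # unmeasured
        exposure = 0.5 * variant + confounder + rng.gauss(0, 1)
        outcome = causal_effect * exposure - confounder + rng.gauss(0, 1)
        g.append(variant)
        x.append(exposure)
        y.append(outcome)
    return g, x, y

# No causal effect of the exposure on the outcome in this simulation.
g, x, y = simulate()
observational = slope(x, y)               # biased by the confounder
wald_ratio = slope(g, y) / slope(g, x)    # variant-based estimate
print(round(observational, 2), round(wald_ratio, 2))
```

In this setup the observational slope is clearly negative although the true causal effect is zero, while the variant-based estimate stays close to zero. The second bullet under problems still applies: the estimate is only as good as the variant-exposure association it relies on.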
Take home messages
• Necessary conditions for causality based on twin
data:
– In MZ twins, differences in trait A need to be associated
with differences in trait B
– In a bivariate Cholesky decomposition, the cross-trait paths
need to be significant
• Mendelian randomization as a means to test
causality in the general population
• Physical exercise is not the elixir to happiness ;)