A 1 - QIMR Genetic Epidemiology

Detection of gene-gene interactions in
genome-wide association studies
Manuel A R Ferreira
Center for Human Genetic Research
Massachusetts General Hospital
Harvard Medical School
What is epistasis or GxG?
[Figure: phenotype (y-axis, 0 to 5) plotted against genotype at locus A (aa, Aa, AA), with one line per genotype at locus B (BB, Bb, bb).]
Epistasis is defined as the extent to which the joint contribution of two loci to a phenotype deviates from that expected under a purely additive model.
Fisher (1918)
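The definition can be illustrated with a small calculation. The sketch below assumes a 3×3 table of mean phenotypes (rows: genotype at locus B, columns: genotype at locus A) and takes "additive" to mean additive across the two loci; both tables are hypothetical values, not data from the talk. Any non-zero deviation from the main-effects fit indicates epistasis.

```python
def interaction_deviations(means):
    """Cell-by-cell deviations from a two-way main-effects (additive) fit.

    means[i][j] is the mean phenotype for row genotype i and column genotype j.
    For an equally weighted table the additive prediction for a cell is
    row mean + column mean - grand mean.
    """
    n_rows, n_cols = len(means), len(means[0])
    grand = sum(sum(row) for row in means) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in means]
    col_means = [sum(means[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    return [[means[i][j] - row_means[i] - col_means[j] + grand
             for j in range(n_cols)]
            for i in range(n_rows)]


# Hypothetical tables: rows are bb, Bb, BB; columns are aa, Aa, AA.
additive_table = [[0, 1, 2],
                  [1, 2, 3],
                  [2, 3, 4]]
epistatic_table = [[1, 1, 1],
                   [1, 2, 3],
                   [1, 3, 5]]

additive_dev = interaction_deviations(additive_table)
epistatic_dev = interaction_deviations(epistatic_table)

for row in epistatic_dev:
    print(row)
```

In the additive table every deviation is zero; in the second table the effect of locus A grows with the number of B alleles, so the deviations are non-zero.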
Is it expected to be important for complex traits/diseases?
Not much evidence
Growing evidence
Model organisms
Model organisms
Xu & Jia 2007 Genetics 175, 1955-1963 [Barley]
Brem et al. 2005 Nature 436: 701–703 [Yeast]
Zeng et al. 2000 Genetics 154, 299–310 [Drosophila]
Li et al. 1997 Genetics 145, 453–465 [Rice]
Flint et al. 2004 Mamm. Genome 15, 77–82 [Mice]
Montooth et al. 2003 Genetics 165, 623–635 [Drosophila]
Carlborg et al. 2003 Genome Res. 13, 413–421 [Chicken]
Humans
Shimomura et al. 2001 Genome Res. 11, 959–980 [Mice]
Maller et al. 2006 Nat Genet. 38:1055-9
Humans
Schadt & Lum 2006 J Lipid Res 47: 2601–2613
(Gjuvsland et al. 2006 Genetics 175: 411–420)
Can we detect it in genome-wide association studies?
Technically challenging
Astronomical number of tests (how to perform, analyze and correct for them, power)
Plausible for certain models of interaction
Marchini et al. 2005 Nat Genet 37, 413–417
Evans et al. 2006 PLoS Genet 2, e157
No reports as yet (in humans)
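To give a sense of scale for the "astronomical number of tests", the following sketch counts all SNP pairs for a genome-wide panel and derives a Bonferroni threshold. The panel size of 500,000 SNPs and the use of Bonferroni correction are assumptions made for illustration; the slides do not specify either.

```python
def pairwise_tests(n_snps):
    """Number of distinct SNP pairs among n_snps markers."""
    return n_snps * (n_snps - 1) // 2


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error at alpha."""
    return alpha / n_tests


n_snps = 500_000  # hypothetical panel size
n_tests = pairwise_tests(n_snps)
threshold = bonferroni_threshold(0.05, n_tests)

print(f"{n_tests:,} pairwise tests, per-test threshold {threshold:.2e}")
```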
Traditional methods to detect epistasis
1. Regression
Flexible framework
Slow
y = m1.LocusA + m2.LocusB + m3. (LocusA × LocusB)
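The regression model above can be sketched in a few lines. This is a minimal illustration, assuming a quantitative phenotype, genotypes coded as allele counts (0, 1, 2), ordinary least squares, and an added intercept m0 that the slide omits; a disease phenotype would use logistic regression instead. The toy data are generated from known coefficients so the fit can be checked.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def fit_interaction_model(locus_a, locus_b, y):
    """Least-squares fit of y = m0 + m1*A + m2*B + m3*(A*B).

    Returns [m0, m1, m2, m3]; m3 is the interaction (epistasis) coefficient.
    """
    design = [[1.0, a, b, a * b] for a, b in zip(locus_a, locus_b)]
    p = 4
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Toy data: one observation for each of the nine two-locus genotypes.
locus_a = [a for a in (0, 1, 2) for b in (0, 1, 2)]
locus_b = [b for a in (0, 1, 2) for b in (0, 1, 2)]
y = [1.0 + 0.5 * a + 0.25 * b + 2.0 * a * b
     for a, b in zip(locus_a, locus_b)]

m0, m1, m2, m3 = fit_interaction_model(locus_a, locus_b, y)
print(m0, m1, m2, m3)
```

Fitting one such model per SNP pair is what makes the regression approach flexible but slow in a genome-wide setting.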
2. “Linkage Disequilibrium” or allelic-association
Powerful (eg. case-only)
Less flexible, phasing
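[Figure: two 2×2 tables of allele counts, labelled "+" and "-", each cross-classifying alleles A1/A2 at one locus against alleles B1/B2 at the other.]

      B1   B2
A1    a    c
A2    b    d

OR = (a × d) / (b × c)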
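As an illustration of the allelic-association approach, the sketch below computes the odds ratio for one 2×2 table of allele counts and a z statistic for OR = 1. It assumes the cell layout a = A1B1, b = A2B1, c = A1B2, d = A2B2 and Woolf's standard error for log(OR); the counts are hypothetical. In a case-only design, a departure from OR = 1 is read as evidence of interaction only under the assumption that the two loci are independent in the general population.

```python
from math import log, sqrt


def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table with cells a, b, c, d: (a*d) / (b*c)."""
    return (a * d) / (b * c)


def log_or_z(a, b, c, d):
    """z statistic for H0: OR = 1, using Woolf's standard error of log(OR)."""
    standard_error = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log(odds_ratio(a, b, c, d)) / standard_error


# Hypothetical allele counts observed in cases only.
a, b, c, d = 120, 80, 80, 120
case_only_or = odds_ratio(a, b, c, d)
case_only_z = log_or_z(a, b, c, d)

print(case_only_or, round(case_only_z, 2))
```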
3. Transmission distortion
More robust
Less powerful
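[Figure: family trios (genotypes AA, Aa and Aa at locus A) stratified by the proband's genotype at locus B, with transmission of 50% in BB probands, 52% in Bb probands and 56% in bb probands.]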
All allele-based!
New methods
1. Allele-based test
Faster than standard tests (e.g. logistic regression); useful for whole-genome screens
[Figure: the 3×3 table of two-locus genotype counts (A1A1, A1A2, A2A2 by B1B1, B1B2, B2B2) is collapsed over B and over A into a 2×2 table of allele counts.]

      B1   B2
A1    a    c
A2    b    d

OR = (a × d) / (b × c)
ORcases ≠ ORcontrols
Test for epistasis
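One way to sketch this test is shown below. It is an assumption-laden illustration rather than the exact procedure from the talk: genotype tables are collapsed by counting every pairing of an A allele with a B allele within each individual (four pairings per person, ignoring phase), and the two log odds ratios are compared with a z statistic using Woolf-type variances. All counts are hypothetical.

```python
from math import log, sqrt


def collapse_to_alleles(genotype_counts):
    """Collapse a 3x3 genotype table into 2x2 allele counts.

    genotype_counts[i][j] is the number of individuals carrying i copies of
    allele A1 and j copies of allele B1 (i, j in 0..2).
    Returns (a, b, c, d) for the allele pairs A1B1, A2B1, A1B2, A2B2.
    """
    a = b = c = d = 0
    for i in range(3):
        for j in range(3):
            n = genotype_counts[i][j]
            a += n * i * j
            b += n * (2 - i) * j
            c += n * i * (2 - j)
            d += n * (2 - i) * (2 - j)
    return a, b, c, d


def log_or_and_variance(a, b, c, d):
    """log(OR) and its approximate (Woolf) variance for a 2x2 table."""
    return log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def epistasis_z(case_counts, control_counts):
    """z statistic for H0: OR in cases equals OR in controls."""
    log_or_cases, var_cases = log_or_and_variance(
        *collapse_to_alleles(case_counts))
    log_or_controls, var_controls = log_or_and_variance(
        *collapse_to_alleles(control_counts))
    return (log_or_cases - log_or_controls) / sqrt(var_cases + var_controls)


# Hypothetical genotype counts (rows: copies of A1, columns: copies of B1).
cases = [[30, 10, 5],
         [10, 20, 10],
         [5, 10, 30]]
controls = [[10, 10, 10],
            [10, 10, 10],
            [10, 10, 10]]

z = epistasis_z(cases, controls)
print(collapse_to_alleles(cases), collapse_to_alleles(controls), round(z, 2))
```

Because the collapsed counts treat the four allele pairings from one person as separate observations, the variance is only approximate; the appeal of the approach is that it needs no iterative model fitting.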
New methods
[Figure: Gene 1 and Gene 2 with their SNPs (labelled A to F and 1 to 5). Testing SNP pairs between the two genes takes 35 allele-based tests; testing the two genes as units takes a single gene-based test.]
New methods
2. Gene-based test
[Figure: Gene 1 and Gene 2 with their SNPs (labelled A to F and 1 to 5).]
Reduces the number of tests, captures “haplotypic” variation, and allows analysis of pathways or networks
Case-only sample
Powerful
Less robust
1. Canonical correlation analysis of Gene 1 and Gene 2
p canonical correlations
2. Estimate the significance of all correlations using Bartlett’s (1941) test
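Step 2 can be sketched as follows, assuming the canonical correlations have already been estimated in step 1. The function uses the usual chi-square approximation attributed to Bartlett for the hypothesis that all canonical correlations are zero; the sample size, SNP counts and correlation values are hypothetical, and the p-value lookup against a chi-square distribution is left out to keep the example dependency-free.

```python
from math import log


def bartlett_statistic(canonical_correlations, n_samples,
                       n_snps_gene1, n_snps_gene2):
    """Bartlett's chi-square approximation for H0: all correlations are zero.

    Returns (statistic, degrees_of_freedom), where
    statistic = -(n - 1 - (p + q + 1) / 2) * sum(ln(1 - r_i^2))
    and degrees_of_freedom = p * q for p and q SNPs in the two genes.
    """
    scale = n_samples - 1 - (n_snps_gene1 + n_snps_gene2 + 1) / 2
    statistic = -scale * sum(log(1 - r * r) for r in canonical_correlations)
    return statistic, n_snps_gene1 * n_snps_gene2


# Hypothetical case-only sample: 1000 cases, 2 SNPs in Gene 1, 3 in Gene 2,
# giving min(2, 3) = 2 canonical correlations.
statistic, degrees_of_freedom = bartlett_statistic(
    [0.3, 0.1], n_samples=1000, n_snps_gene1=2, n_snps_gene2=3)

print(round(statistic, 2), degrees_of_freedom)
```

The statistic would then be compared with a chi-square distribution on the returned degrees of freedom.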
Case-control sample
Flexible
Less powerful
1. Canonical correlation analysis of Gene 1 and Gene 2
Store composite variables for Gene 1 and Gene 2 associated with the largest canonical correlation
2. Test for interaction between these composite variables using standard linear or logit regression
y = m1.Gene1 + m2.Gene2 + m3. (Gene1 × Gene2)
New methods

3. Performance
[Figure: type-1 error (α = 0.05), with labels Gene 1 and Gene 2.]
New methods

3. Performance
[Figure: power (α = 0.05), with labels Gene 1 and Gene 2.]
http://pngu.mgh.harvard.edu/~purcell/plink/
Application to a bipolar disease GWAS
Poster 560
New Bioinformatic and Computational Methods
3.15 – 5.00pm
Acknowledgements
MGH
University College London
Shaun Purcell
Pamela Sklar
Mark Daly
Ed Scolnik
Laurie Weiss
Douglas Ruderfer
Yan Meng
Jennifer Stone
Matt Ogdie
Hugh Gurling
STEP-BD
Jordan Smoller
Roy Perlis
Vishwajit Nimgaonkar
Nan Laird
Matt McQueen
Steve Faraone
WTCCC
Nick Craddock
Funding
NHMRC Sidney Sax post-doctoral fellowship