IBD Estimation in Pedigrees

Connecting biobanks: adding value in the genetics of complex traits
The Australian Twin Collections Biobank
Nick Martin
Queensland Institute of Medical Research
Brisbane
MRC CAiTE Symposium
Bristol
January 12, 2011
My brief…
• how biobanks can be beneficial for researchers
• what’s happening and what has been accomplished
• some results of projects I’m involved in
How beneficial biobanks can be for research[ers] (1)
1 page of authors and affiliations!
How beneficial biobanks can be for research[ers] (2)
2 pages of authors and affiliations!
Australian Twin Registry
• Founded 1978
• Voluntary enrolment: schools, media, etc.
• ~30,000 pairs enrolled (~15% of all pairs)
• Two adult cohorts studied
  • 1893-1964 (5967 pairs), 1965-1971 (4629 pairs)
  • Typical of the population with respect to psychiatric symptoms, personality, social class and education (females)
  • Males slightly more educated and middle class
• New cohort of ~8000 pairs (born 1972-85)
Timetable of Questionnaires and Interviews
[Figure: timeline (1980, 1985, 1990, 1995, 2000) with rows for Cohort 1, Cohort 2, Siblings and Parents. Wave labels in extraction order: N23, A, D (3808 p / 576 s); N12, A, D (3051 p / 468 s); DSM-IIIR MD, PD (2456 p / 771 s); N23, CIDI (894); N12, A, D (1279 p / 558 s); N12, A, D (2270 p / 518 s); N23, CIDI (404); N12 (5375); N23, CIDI (1172); N12 (6014); 765]
QIMR GenEpi core interests
Quantitative phenotypes related to disease risk:
• Metabolic / cardiovascular risks
Biochemical test results
Lipids
Glucose, insulin
Urate, CRP, ferritin
Liver enzymes GGT, ALT, AST, BCHE
• Personality, depression, anxiety, cognition, MRI, taste, smell
• Addictions (alcohol, nicotine, cannabis, opioids, gambling)
• Melanoma; endometriosis; asthma; migraine; twinning
Data (Twins and families)
• Biochemical phenotypes
• GWAS
N ≈ 19,000 adults
N ≈ 2,500 adolescents
N ≈ 20,000
ENGAGE participation
• Meta-analysis of lipids, urate, alcohol, liver function tests, glucose
• Meta-analysis of iron markers, transferrin isoforms
Queensland Twin Registry
Adolescent twins + sibs
Phenotypes measured on teenage twins
Key: ✓ included; - no information
12yrs 14yrs 16yrs
Sun exposure
Sun protective behaviour
Mole counts and locations
Melanoma family history
Mosquito bite susceptibility
Mouth ulcers
Sociodemographic Variables
Eye, hair and skin colour
Personality (JEPQ, NEO)
Acne
Height, weight
Blood pressure
Fingerprints, handprints
12yrs 14yrs 16yrs
Photoaging (skin mould)
Visual acuity
AutoRefractometry (myopia)
ENT (grommets, T&A)
Asthma, eczema
Laterality (hand, eye, foot)
Hand preference (peg board)
Binocular rivalry (bipolar)
Computer Use
Reading Ability (CCRT)
Cognitive Ability (IQ – MAB)
Information Processing (IT)
Working Memory (DRT)
ERPs (DRT)
EEG (power, coherence)
Academic achievement (QCST)
Taste (PTC, bitter, sweet)
Smell (BSIT, NatGeo)
Psychiatric signs (SPHERE)
Relationships
Leisure activity
Blood phenotypes
12yrs 14yrs 16yrs
Haemoglobin
Red blood cell count
Packed cell volume
Mean corpuscular volume
Platelet count
White blood cell count
Neutrophils
Monocytes
Eosinophils
Basophils
Total lymphocytes
CD3+ T-cells
CD4+ helper T-cells
CD8+ cytotoxic T-cells
CD19+ B cells
CD56+ natural killer cells
CD4+/CD8+ T-cell ratio
Blood groups (ABO, MNS, Rh)
Serum biochemistry
12yrs 14yrs 16yrs
Cholesterol, HDL, LDL
Triglyceride
Apolipoproteins A1, A2, B, E
Lp(a)
Glucose, Insulin
Ca, PO4
Creatinine
Urea, Uric acid
Alkaline phosphatase
Albumin, Bilirubin
AST, ALT, GGT
Fe, Ferritin, Transferrin
Heavy metals (Pb, As etc)
Population 21 million
Area 7.7 million km²
External blood collection:
Labmailer Process
Preparing Labmailers
Preparing FTA cards
Biobottle Box
Incoming Blood Samples
Receipting the blood sample
Standard blood collection and processing
Samples are collected in the following tubes:
2 x EDTA
1 x SERUM
1 x ACD
1 x PAX
MNC Processing
4 x Red Blood Cells
4 x Plasma
2 x Buffy Coats
Buccal Extraction
4 x Serum
The 2 x EDTA & 1 x SERUM tubes are centrifuged at 3000 rpm for 10 min and then the fractions are collected. All fractions & 1 x Buffy Coat are stored in the -80°C freezers
1 x Buffy Coat Extraction
1 x BUCCAL
Stored in Freezer for later RNA work
Average DNA Yield per buffy coat (10 ml EDTA blood collection)
Mean = 171.291
Std. Dev = 68.5431
N = 3,554
Genetic Epidemiology
Frozen sample inventory
Fraction           Number of Samples
Plasma             128,012
Buffy Coats        101,333
Red Blood Cells    130,668
Serum               97,677
Buccals              5,591
FO Plasma            7,815
FO BC                7,387
FO RBC               7,500
Total              485,983
Genetic Epidemiology
DNA sample inventory
Fraction                    Number of Samples
DNA Dilutions at 50 ng/µl    44,926
DNA Stocks                   50,719
DNA Other                    16,443
Total                       112,088
GWAS studies at QIMR
Study          Subjects        N        Platform       Site      Funding
CVD Risk       Adult MZ ff     923      Illumina 317k  Helsinki  EU
Migraine+ Nic  Adult twins     1,234    Illumina 610k  deCode    NHMRC
Alcohol (1)    Adult twins     2,736    Illumina 370k  deCode    NIH
Alcohol (2)    Adult sibships  4,477    Illumina 370k  CIDR      NIH
Depression     Adult cases     (1,257)  Affy 6.0       TGen      GAIN
Endometriosis  Adult cases     2,383    Illumina 660k  deCode    Wellcome
Adolescent     Twin families   4,556    Illumina 610k  deCode    NHMRC+
Asthma/Angst   Twin families   1,766    Illumina 610k  Brown     NHMRC
TOTAL                          19,257
Australia’s changing ethnic composition
Published Genome-Wide Associations through 6/2010
904 published GWA at p < 5×10⁻⁸ for 165 traits
NHGRI GWA Catalog
www.genome.gov/GWAStudies
(Most) genetic effects are modest
• Genetic risks for complex traits are modest
• A genetic risk (OR) of 1.3 (2% variance) is large
• Most genetic risks are in the 1.1 to 1.2 range or less (<1% variance)
• This is true for most complex diseases (e.g. alcoholism, schizophrenia, bipolar disorder, lung cancer) and traits (height, BMI, lipids)
BUT not always… (use your Biobank!)
Serum Bilirubin
• a waste product of the normal breakdown of red blood cells
• excreted from the body after being conjugated with glucuronic acid by the UGT (Uridine Diphosphate Glucuronyltransferase) enzyme
• a diagnostic marker of liver and blood disorders
• acts as an antioxidant: an increase in bilirubin levels is associated with a reduced risk of cardiovascular diseases
Bilirubin in adolescents
rs2070959
Measure  Allele  Effect (b)  SE    R²   P value
Age 12   A       -0.58       0.04  21%  3E-59
Age 14   A       -0.71       0.05  23%  1E-50
Age 16   A       -0.97       0.06  29%  4E-65
Age 18   A       -0.72       0.09  24%  5E-15
Mean     A       -0.76       0.03  28%  2.1E-115
Genetics of Iron Status
What genes affect iron status (e.g. serum iron, transferrin, saturation, ferritin), and the risk of either deficiency or overload in the general population?
GWAS (N = 8942)
[Figure: GWAS plots for serum iron, transferrin, Tf saturation and ferritin. Labelled peaks in extraction order: HFE (P = 5E-38); TF (P = 3E-104); TMPRSS6 (P = 7E-27); HFE (P = 1E-73); HFE (P = 8E-83); HFE (P = 4E-12); TMPRSS6 (P = 2E-27); ZNF521, Zinc Finger Protein 521 (P = 4E-08)]
Large effects of TF and HFE variants
             TF mutation (rs3811647)            HFE mutation (rs1800562)
Measure      Effect        % var  p             Effect        % var  p
Iron         -.01±.10 SD   0      .81           .66±.10 SD    10     3.5 × 10⁻¹¹
Transferrin  .46±.06 SD    13     3 × 10⁻¹⁵     -.68±.10 SD   9      1.1 × 10⁻¹⁰
Saturation   -.17±.06 SD   2      .002          .80±.10 SD    13     4.3 × 10⁻¹⁵
Ferritin     -.13±.06 SD   1      .03           .44±.11       4      4.5 × 10⁻⁵
ENGAGE meta-analysis to find more iron metabolism genes
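How a per-allele effect in SD units relates to "% var" can be sketched with the usual additive-model approximation, variance explained = 2p(1-p)β². This relation is an assumption added here, not something stated on the slides, and the slides do not report allele frequencies, so the frequencies below are hypothetical placeholders and the output is not expected to reproduce the slide percentages.

```python
def variance_explained(beta_sd: float, freq: float) -> float:
    """Fraction of trait variance explained by one biallelic SNP.

    Assumes an additive model, Hardy-Weinberg proportions, and a per-allele
    effect `beta_sd` expressed in phenotypic standard deviations.
    """
    if not 0.0 < freq < 1.0:
        raise ValueError("allele frequency must be strictly between 0 and 1")
    return 2.0 * freq * (1.0 - freq) * beta_sd ** 2


if __name__ == "__main__":
    # Effect sizes come from the slide; the allele frequencies are
    # HYPOTHETICAL, because the slide does not give them.
    examples = [("TF on transferrin", 0.46, 0.30),
                ("HFE on saturation", 0.80, 0.10)]
    for label, beta, freq in examples:
        pct = 100 * variance_explained(beta, freq)
        print(f"{label}: {pct:.1f}% of variance (assumed freq {freq})")
```

The point of the sketch is the scaling: variance explained grows with the square of the effect, so a common variant with a 0.5 SD effect explains far more than the typical GWAS hit.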
Butyrylcholinesterase (BCHE)
• Enzyme found in plasma
• Rare variants in BCHE extensively studied because of pharmacogenetic effects
• Evidence of involvement with T2DM, CVD, Alzheimer disease (questionable)
Correlations ≥ 0.25 for:
BMI
Blood pressure
ApoB
ApoE
Total cholesterol
Triglycerides
GGT
+ significant but smaller correlations for ALT, AST, HDL-C, LDL-C, urate.
Cholinesterase
GWAS Meta-Analysis (3 studies, total N = 8781)
QQ Plots
Before and After Adjustment for the BCHE K Variant: many other variants contributing…
Ingenuity Pathway Analysis on all butyrylcholinesterase GWAS data
All SNPs with p ≤ 0.001 (Total 5662, of which 2003 mapped to 440 genes)
CD4+/CD8+ ratio
h² = 0.84 (0.79–0.87)
Not only blood variables show large SNP effects...
Hair curliness: straight, wavy, curly
λ = 1.00008
GWAS for curliness in three independent Australian cohorts
P = 10⁻³¹
Other peaks
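The λ quoted on the curliness slide is presumably the genomic inflation factor. A minimal sketch of one common way to compute it from GWAS p-values follows, assuming two-sided tests with 1 degree of freedom; the function name and the toy p-values are illustrative and are not taken from the study.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a 1-df chi-square, obtained from the normal 75th percentile.
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Return lambda: median observed 1-df chi-square over its null median."""
    chi2 = []
    for p in p_values:
        if not 0.0 < p < 1.0:
            raise ValueError("p-values must lie strictly between 0 and 1")
        z = _STD_NORMAL.inv_cdf(p / 2.0)  # two-sided p to a (negative) z score
        chi2.append(z * z)
    return median(chi2) / CHI2_1DF_MEDIAN


if __name__ == "__main__":
    # Evenly spaced p-values mimic a perfectly null scan.
    null_scan = [(i + 0.5) / 1000 for i in range(1000)]
    print(f"lambda for a null scan: {genomic_inflation(null_scan):.4f}")
```

A value close to 1, as on the slide, suggests little inflation from stratification or cryptic relatedness; values well above 1 would call for correction.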
GWAS for hair curliness
~6% variance
Trichohyalin is expressed in hair root sheaths
Heterogeneity of gene effects by age, and sex...and environment?
Liver function: gamma glutamyl transferase (GGT)
Several significant hits in the combined data, but not the expected one on Chr. 22
?
Heterogeneity between adult and adolescent results at this locus!
Multiple SNPs show heterogeneity between adult and adolescent results for GGT
Melanocytic naevi (common moles)
The largest risk factor for melanoma
QIMR GWAS for total, flat and raised naevi
IRF4
MTAP
Note inverse association signals for MTAP and IRF4 with flat and raised naevi
Mole count: Interaction of IRF4 genotype with age
American Journal of Human Genetics 87, 6–16, 2010
Teenage acne
• 4 point rating (none to severe)
• 3 sites – face, chest, back
• at age 12 and 14
• at age 16 face only
• How to combine these 7 measures ?
• Lots of missingness
• Item response modelling in WinBUGS enables Bayesian estimation of liability, allowing for twin relatedness and adjusting for age, sex
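How seven partly missing 4-point ratings can be combined into one liability score can be sketched with a graded-response item model and a N(0, 1) prior evaluated on a grid. This is a simplified stand-in, not the WinBUGS model from the talk: it ignores twin relatedness and the age and sex adjustments, uses the same hypothetical item parameters for every measure, and all names are illustrative.

```python
import math


def category_probs(theta, discrimination, thresholds):
    """Graded-response probabilities for one ordinal item.

    P(Y >= k) = logistic(discrimination * (theta - thresholds[k - 1])), with
    thresholds in increasing order. Returns [P(Y = 0), ..., P(Y = K)].
    """
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-discrimination * (theta - b)))
            for b in thresholds]
    cum.append(0.0)
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]


def posterior_mean_liability(ratings, discrimination, thresholds):
    """Posterior mean of liability under a N(0, 1) prior, on a fixed grid.

    `ratings` holds one score (0..K) per measure; None marks a missing
    rating, which simply contributes nothing to the likelihood.
    """
    grid = [i / 20.0 for i in range(-80, 81)]  # theta from -4 to 4
    weights = []
    for theta in grid:
        log_w = -0.5 * theta * theta  # log prior, up to a constant
        probs = category_probs(theta, discrimination, thresholds)
        for r in ratings:
            if r is not None:
                log_w += math.log(probs[r])
        weights.append(math.exp(log_w))
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)


if __name__ == "__main__":
    # Hypothetical item parameters; 0 = none ... 3 = severe.
    a, b = 1.5, [-0.5, 0.5, 1.5]
    # Seven measures: face/chest/back at 12 and 14, face at 16; two missing.
    print(posterior_mean_liability([2, 1, None, 3, 2, None, 3], a, b))
```

Missing ratings drop out of the likelihood rather than being imputed, which is the property that makes this kind of model attractive when there is a lot of missingness.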
GWAS for Acne: different genes for males and females?
[Figure panels: Joint (F+M), Females, Males]
Gene-environment interaction
• Is sensitivity to the environment a function of genotype?
• For MZ twins, |twin1 - twin2| is a pure measure of e
• Does |twin1 - twin2| vary systematically between genotypes?
• A direct test of G x E
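The MZ-difference test above can be sketched as a regression of the absolute within-pair difference on genotype (both twins share it). The simulation below is purely illustrative: allele frequency, effect size and sample size are hypothetical, and a real analysis would likely need a test that is robust to the skewed distribution of absolute differences.

```python
import math
import random


def simulate_mz_pairs(n_pairs, allele_freq, sd_per_allele, seed=1):
    """Simulate MZ pairs whose unique-environment SD depends on genotype.

    Returns (genotypes, abs_differences). All parameters are illustrative.
    """
    rng = random.Random(seed)
    genotypes, abs_diffs = [], []
    for _ in range(n_pairs):
        g = sum(rng.random() < allele_freq for _ in range(2))  # 0, 1 or 2
        shared = rng.gauss(0.0, 1.0)      # genes + shared environment
        e_sd = 1.0 + sd_per_allele * g    # genotype-dependent sensitivity
        twin1 = shared + rng.gauss(0.0, e_sd)
        twin2 = shared + rng.gauss(0.0, e_sd)
        genotypes.append(g)
        abs_diffs.append(abs(twin1 - twin2))
    return genotypes, abs_diffs


def regress(y, x):
    """Ordinary least squares of y on x; returns (slope, t statistic)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / se


if __name__ == "__main__":
    g, d = simulate_mz_pairs(n_pairs=1800, allele_freq=0.3, sd_per_allele=0.5)
    slope, t = regress(d, g)
    print(f"slope of |twin1 - twin2| on genotype: {slope:.3f} (t = {t:.1f})")
```

Because the shared component cancels in the difference, a non-zero slope points to genotype-dependent sensitivity to the unique environment rather than to a mean effect of the SNP.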
GenomEUtwin
Systematic GWA search for GxE using MZ twins
• 1800 MZ female pairs aged 30-70 from AU, UK, NL, DK, SE
• GWAS using Illumina 317k array
• Focus on CVD risk factors (lipids), but other phenotypes as well (including depression)
Genome-wide association scan of MZ pair mean levels of HDL cholesterol
GWAS of MZ pair |differences| of HDL cholesterol
1800 MZ female pairs from GenomEUtwin
A gene for environmental sensitivity on Chr 16?
Adding value to your Biobank (1)
- expression and epigenetic data
eQTL Study
Study Design
• Gene expression profiles for ~980 individuals
• Individuals from 3 ‘family’ groups
• Only PAX gene expression generated
• Expression levels generated using Illumina HumanHT-12 v4.0 chips
[Diagram: 980 individuals with PAX expression data in three groups (Full Families: Parents + Offspring (MZ / DZ / Sib); MZ and DZ twin pairs; MZ, DZ and Sib), split ~2/3 and ~1/3 of samples]
Expression levels can be correlated with all other phenotypes
Methylation levels
Study Design
• From the same individuals as the full expression study
• Whole genome methylation levels determined
• Using Illumina methylation 450k chips
[Diagram: 980 individuals with methylation data in three groups (Full Families: Parents + Offspring (MZ / DZ / Sib); MZ and DZ twin pairs; MZ, DZ and Sib), split ~2/3 and ~1/3 of samples]
Methylation levels can be correlated with expression…and with MZ discordance!
MZ pairs discordant for SLE
- widespread methylation differences
Changes in the pattern of DNA methylation associate with twin discordance in systemic lupus erythematosus.
Javierre BM et al. Genome Res. 20: 170-179, 2010
Adding value to your Biobank (2)
- keep adding new phenotypes!
Ratio of 2nd to 4th finger length
Associated with:
testosterone exposure
aggression
ADHD
homosexuality
fertility
others
Multivariate Genetic Analyses of the 2D:4D Ratio: Examining the Effects of Hand and Measurement Technique in Data from 757 Twin Families.
Sarah E. Medland and John C. Loehlin
Twin Research and Human Genetics 11: 335–341, 2008
LIN28B SNP associated with:
2D:4D ratio
Age of menarche
Menopause
Height
A Variant in LIN28B Is Associated with 2D:4D Finger-Length Ratio, a Putative Retrospective Biomarker of Prenatal Testosterone Exposure
Sarah E. Medland… David M. Evans
Am J Human Genetics 86, 519–525, 2010
Large consortia…..
Twin Imaging Study (TIMS)
• Brisbane Adolescent Twin database (>700 scanned)
• Data acquisition: 4 Tesla Bruker Medspec scanner, CMR, UQ
  • MRI
  • DTI (HARDI)
  • fMRI (n-back)
  • resting-fMRI
• Processing and analysis:
  • MRI: UCLA
  • DTI (HARDI): UCLA
  • fMRI (n-back): UQ
  • resting-fMRI: UQ + NYU
http://enigma.loni.ucla.edu/
Adding value to your Biobank (3)
- sequencing!
Whole-genome sequencing
Why?
Discover novel, rare variants with potential relevance for disease, including CNVs.
These can then be imputed/genotyped and tested for association in large cohorts.
Pilot study: first look at data
14 cases + 1 control (including trio) sequenced with deep coverage using HiSeq.
Cases with strong family history, severe disease and other co-morbid phenotypes.
~97% concordance of sequence with KGP imputation (610k)
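A concordance figure like the one above can be computed, in its simplest form, as the share of identical genotype calls among sites called by both methods. The sketch below assumes that definition, which the slide does not spell out; the variant IDs and genotype coding are hypothetical.

```python
def genotype_concordance(sequenced, imputed):
    """Fraction of sites with identical calls, over sites called in both.

    Each argument maps a variant ID to a genotype coded as the count of
    alternate alleles (0, 1 or 2); None marks a missing call.
    """
    shared = [v for v in sequenced
              if sequenced[v] is not None and imputed.get(v) is not None]
    if not shared:
        raise ValueError("no overlapping called sites")
    matches = sum(sequenced[v] == imputed[v] for v in shared)
    return matches / len(shared)


if __name__ == "__main__":
    # Toy calls for one individual; IDs are placeholders, not real variants.
    seq = {"var1": 0, "var2": 1, "var3": 2, "var4": None}
    imp = {"var1": 0, "var2": 2, "var3": 2, "var4": 1, "var5": 0}
    print(f"concordance: {genotype_concordance(seq, imp):.2f}")
```

Overall concordance is dominated by common, well-imputed sites, so a per-frequency breakdown is usually more informative for the rare variants that sequencing is meant to find.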
Acknowledgements
• Twins and their families for their participation
• John Whitfield, Peter Visscher, David Duffy, Grant Montgomery, Dale Nyholt
• Dixie Statham, Ann Eldridge, Marlene Grace, Anjali Henders, Megan Campbell and Leanne Wallace for the data collection and sample processing
• Allan McRae, Manuel Ferreira, Brian McEvoy, Scott Gordon, Sarah Medland, Gu Zhu, Beben Benyamin, Rita Middelberg, Margie Wright for helping with data & analysis
• Harry Beeby and David Smyth for IT support
• Collaborators:
  • Netherlands Twin Registry: Gonneke Willemsen, Jouke-Jan Hottenga, Eco de Geus, Brenda Penninx, Dorret Boomsma
  • UK Twin Registry: Tim Spector, Massimo Mangino
  • ALSPAC Study: David Evans, George Davey Smith
  • Sanger Institute / U Helsinki: Aarno Palotie, Leena Peltonen
  • University of Queensland: Ian Frazer, Rick Sturm, Greig de Zubicaray
  • Washington University, St. Louis: Andrew Heath, Pam Madden