
“10,001 Dalmatians” research programme:
Discovery of genetic variants that control human quantitative traits and predispose to diseases
Igor Rudan, Mladen Boban, Tatijana Zemunik, Gordan Lauc,
Zoran Đogaš, Stipan Janković, Ivica Grković, Ana Marušić,
Janoš Terzić, Rosanda Mulić, Vjekoslav Krželj,
Lina Zgaga, Zrinka Biloglav, Ivana Kolčić, Marina Pehlić,
Grgo Gunjača, Danijela Budimir, Ozren Polašek
2001: the human genome sequence was published
Main expectation (general public, investors, researchers, pharma and biotech industries):
Linking genes with diseases and development of new treatments and “personalized medicine”. The race towards this goal began (each group with its own approach).
Main idea:
1) Find “markers” in the genome
and “tag” the whole genome as
densely as possible;
2) Find consistent associations
between some of those markers
and disease phenotypes
3) Find genes in proximity of
implicated markers – they are
“disease genes”
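Step 2 above comes down to comparing marker allele counts between cases and controls. The following is a minimal sketch, assuming a simple 2x2 allelic chi-square test (the slides do not name a specific test) and hypothetical allele counts; the function name is illustrative only.

```python
import math


def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Returns (chi2, p_value, odds_ratio). All counts are hypothetical inputs.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_sums = [case_alt + case_ref, ctrl_alt + ctrl_ref]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    total = sum(row_sums)
    if 0 in row_sums or 0 in col_sums:
        raise ValueError("every row and column needs at least one observation")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))

    # Odds ratio of carrying the alternative allele in cases vs controls.
    if case_ref == 0 or ctrl_alt == 0:
        odds_ratio = float("inf")
    else:
        odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    return chi2, p_value, odds_ratio


# Hypothetical marker: alternative allele seen 60/100 times in cases, 40/100 in controls.
print(allelic_association(60, 40, 40, 60))
```

A marker that passes such a test only points to a region; step 3 (finding the gene nearby) is a separate task.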
[Figure: CASES (“affected”) vs CONTROLS (“unaffected”); STR marker A, STR markers B, C…; disease gene (mutation) vs disease gene (wild type)]
Short tandem repeats (STR), e.g. (TA)x4 or (CTG)x7: hundreds of STRs across the genome
- STR marker maps were not dense, but they were still very useful to “pick” genes that caused monogenic (Mendelian) diseases
Problems with genome-wide linkage
analyses using genome-wide STR maps:
1) STR markers and diseases were not always 100% linked because of incomplete penetrance of causal mutations or genetic heterogeneity of the disease: low study power
2) STR markers and disease genes were not always 100% linked because of recombination (crossing over) between them: low study power
3) Even when the marker closest to a disease gene was found with nearly 100% certainty, it still took years to find all candidate genes in regions of up to 10 megabases (or more) and sequence them all to find the exact causal mutation
4) Good ideas:
- Choose to study phenotypes that are precisely measurable and in good correlation with genotypes
- Use populations with large linkage disequilibrium
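Linkage disequilibrium (LD) between a marker and a disease variant, and its erosion by recombination (problem 2 above), can be made concrete with the standard two-locus measures. The sketch below assumes two biallelic loci and hypothetical haplotype and allele frequencies; the function names are illustrative.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B (hypothetical inputs)
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("allele frequencies must be strictly between 0 and 1")
    d = p_ab - p_a * p_b
    # D' rescales D by its maximum possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


def ld_after_generations(d0, recomb_fraction, generations):
    """Expected D after some generations of random mating: D_t = D_0 * (1 - r)^t."""
    return d0 * (1.0 - recomb_fraction) ** generations


# Hypothetical example: two alleles that always travel together on the same haplotype.
print(ld_stats(0.5, 0.5, 0.5))
# The same association after 20 generations with a recombination fraction of 1%.
print(ld_after_generations(0.25, 0.01, 20))
```

Under this simple model a young or isolated population has had fewer generations for LD to decay, which is the rationale for the second "good idea" above.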
Strategy (1): In 1999 our group proposed to rely on isolated populations (for increased LD) and a pedigree-based approach (adds information)
Nat Genet 1999; 23: 397-404
Strategy (2): Our group proposed a highly polygenic
model for complex traits and diseases in 2003
Genetics 2003; 163: 1011-1021
Trends Genet 2003; 19: 97-106
Our understanding of complex traits and diseases (diagram, top to bottom; ENVIRONMENT is shown acting at each of the first three levels):
COMPLEX DISEASE PHENOTYPE
QUANTITATIVE TRAIT LEVEL (e.g. CHOLESTEROL, BLOOD PRESSURE)
“-OMICS” LEVEL (PROTEOMICS, LIPIDOMICS, GLYCOMICS, METABOLOMICS)
GWAS: MOST POWER & FUNCTIONAL RELEVANCE
HIGHLY POLYGENIC GENETIC BASIS (FEW RARE VARIANTS WITH LARGE EFFECTS AND MANY COMMON WITH SMALL EFFECTS)
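The contrast between rare variants with large effects and common variants with small effects can be illustrated with the textbook expression for the trait variance explained by one variant, 2p(1-p)β². This is a sketch under stated assumptions: an additive model, Hardy-Weinberg proportions, a trait standardized to unit variance, and hypothetical frequencies and effect sizes that do not come from the slides.

```python
def variance_explained(allele_freq, beta):
    """Fraction of trait variance explained by one additive variant.

    Assumes Hardy-Weinberg proportions and a trait scaled to unit variance,
    so beta is the per-allele effect in standard deviation units.
    """
    return 2.0 * allele_freq * (1.0 - allele_freq) * beta ** 2


def total_variance_explained(variants):
    """Sum over independent variants given as (allele_freq, beta) pairs."""
    return sum(variance_explained(p, b) for p, b in variants)


# Hypothetical variants: one common with a small effect, one rare with a large effect.
common_small = variance_explained(0.50, 0.10)
rare_large = variance_explained(0.01, 0.50)
print(common_small, rare_large)

# Hypothetical polygenic architecture: 200 common variants of small effect.
print(total_variance_explained([(0.5, 0.05)] * 200))
```

In this toy example the rare variant has a fivefold larger effect, yet both explain roughly half a percent of the variance, which is why many loci are needed to account for a highly polygenic trait.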
Strategy (3): Our group proposed to measure a large number of QTs (closer to genes - power, more chance, later - networks)
Quantitative traits: More than 100 selected initially
ANTHROPOMETRIC MEASURES: Body height; Body weight; Bicondylar brachial width; Abdomen circumference; Hip circumference; Brachial circumference; Biceps skinfold; Triceps skinfold; Subscapular skinfold; Suprailiac skinfold; Abdomen skinfold; Head circumference
PHYSIOLOGICAL MEASURES: Systolic blood pressure (1&2); Diastolic blood pressure (1&2); Impedance - body resistance; Impedance - body reactance; Ankle-brachial BP index; Spirometry - FVC; Spirometry - FEV1; Spirometry - PEF; Spirometry - FEF25; Spirometry - FEF50; Peak flow; Bone mineral density
ELECTROCARDIOGRAM: ECG (30 sec, digital)*; P duration; PR interval; QRS duration; QT interval; QTc interval; P axis; QRS axis; T axis
COGNITIVE & SLEEP TRAITS: Eysenck Personality Inventory; Digit-symbol test; Mill-Hill vocabulary; Standard Progressive Matrices; Controlled Oral Word Association; Wechsler Memory Scale; Munich Chronotype Questionnaire; GHQ-30
EYE MEASURES: Retinal art:ven diameter ratio; Retinal art length:diameter ratio; Retinal art branching angle; Retinal arteriolar tortuosity; Retinal arterial junction exponent; Intraocular pressure, OD, OS; Fundus photography; Autorefractor measurements; Intraocular length measurement
LIFESTYLE: Family disease history; Birth weight; Medical/surgical history; Menstruation, menarche, HRT; Rose Angina questionnaire*; Claudication questionnaire*; Respiratory questionnaire; Physical activity; Smoking; Alcohol; Diet; Socioeconomic status
BIOCHEMICAL MEASURES: Creatinine; Uric acid; Total cholesterol; Triglycerides; HDL; LDL; Calcium; Phosphorus; Albumin; HbA1c; Glucose
MARKERS OF INFLAMMATION: Fibrinogen; von Willebrand's factor; D-dimers; CRP; tPA inhibitor
LIPIDOMICS: A large number (several hundred) of circulating lipid metabolites, e.g. 132 phospholipids, 70 sphingolipids, fatty acids, apolipoproteins, etc.
GLYCOMICS: 16 main groups of N-glycans, 4 additional groups based on number of antennas, and 3 derived variables
URINE TRAITS: A larger number of traits quantitated in urine samples that are biomedically relevant
GENOTYPING: Cohort 1: STR typing; Cohort 2: 800 STR and 317,000 SNP; Cohort 3: 370,000 SNP + CNV; Cohort 4: 370,000 SNP + CNV
Strategy (4): Finding money to start a large cohort
Grants awarded 2000-2007 (£4.0 M):
- 2000-2002: The British Council
- 2001-2004: The Wellcome Trust
- 2002-2003: Medical Research Council UK (1/3)
- 2002-2006: Ministry of Science and Technology, Croatia
- 2003-2005: The Royal Society, UK
- 2003-2004: National Institutes of Health, USA
- 2003-2005: Medical Research Council UK (2/3)
- 2006-2009: EU FP6 EUROSPAN
- 2005-2010: Medical Research Council UK (3/3)
- 2007-2012: Ministry of S & T, Croatia (The Croatian Biobank)

COHORT 1 (1001 examinees), “Susak-10”: served to choose the most appropriate population (2001-2002)
2003: The choice of further populations was based
on demography data and population genetic studies
2003: The populations were extremely differentiated (based on analysis of 26 STR markers); LD studies were conducted using 8 STR markers on Xq13-12
COHORT 2 (1024 examinees), “Vis”: genotyped with (i) 800 STRs and (ii) Illumina 317 k (2003-2005)
COHORT 3 (969 examinees), “Korcula”: genotyped with Illumina 370 k CNV (2006-2007)
COHORT 4 (1001 examinees), “Split”: outbred population genotyped with Illumina 370 k CNV (2008-2009)
Year 2005: BAD YEAR
We used an 800-STR marker scan and analysed the data using genome-wide linkage analysis.
What did we find?
ABSOLUTELY NOTHING.
Other approaches (e.g. candidate genes and
case-control studies)?
NO REPLICATIONS FOR ANY OF THE
THOUSANDS OF REPORTED
ASSOCIATIONS (…OK, MAYBE 4-5 MAX.)
The HapMap project
Aimed to define “blocks” of the genome between “recombination hotspots” and to tag each block with one of more than 10 million predicted SNPs: new GWAS based on SNPs
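The idea of tagging can be sketched as a selection problem: pick a small set of SNPs so that every other SNP is in strong LD with at least one chosen tag. The code below is a simple greedy heuristic written for illustration, not the HapMap project's own procedure; the SNP names, the pairwise r² values and the 0.8 threshold are all assumptions.

```python
def greedy_tag_snps(r2, threshold=0.8):
    """Greedy tag-SNP selection.

    r2: dict mapping each SNP name to a dict of pairwise r^2 values with
        other SNPs (missing pairs are treated as r^2 = 0).
    Returns the list of chosen tag SNPs in the order they were picked.
    """
    untagged = set(r2)
    tags = []

    def covered_by(snp):
        # SNPs still untagged that this SNP would capture (itself included).
        return {t for t in untagged
                if t == snp or r2[snp].get(t, 0.0) >= threshold}

    while untagged:
        # Pick the SNP capturing the most untagged SNPs; ties go to the
        # alphabetically first name so the result is deterministic.
        best = max(sorted(r2), key=lambda s: len(covered_by(s)))
        tags.append(best)
        untagged -= covered_by(best)
    return tags


# Hypothetical block: s1, s2 and s3 are correlated; s4 is independent.
example_r2 = {
    "s1": {"s2": 0.90, "s3": 0.85},
    "s2": {"s1": 0.90, "s3": 0.50},
    "s3": {"s1": 0.85, "s2": 0.50},
    "s4": {},
}
print(greedy_tag_snps(example_r2))
```

With these made-up values, genotyping two SNPs would capture all four, which is the economy that made dense genome-wide SNP arrays practical.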
Year 2006:
TECHNOLOGICAL BREAKTHROUGH!
Affymetrix Inc. and Illumina Inc.:
Dense genome-wide scans using hundreds of thousands of SNP markers (“tagging SNPs” from the HapMap project)
Year 2007:
THE “BRAVE NEW WORLD” STUDY
(WTCCC, Nature, June 07, 2007)
2006-2007: First analyses of data using SNPs
Results of GWAS of QTs with “disease risk” studied
Nat Genet 2008; 40: 437-442
Nat Genet 2009; 41: 47-55
2008: uric acid & gout
2009: lipid levels & coronary heart disease
2010: fasting glucose & diabetes type 2 Nat Genet 2010
2010: FVC, FEV1 & chr. lung disease Nat Genet 2010
2010: creatinine & chr. kidney disease Nat Genet 2010
(2011: blood pressure & stroke) JAMA 2011 ?
(2011: CFH & age-related mac. degeneration) Lancet ?
Results of GWAS of QTs without disease risk links
PLoS Genet 2009; 5: e1000504
PLoS Genet 2009; 5: e1000539
2009: smoking initiation and intensity Nat Genet 2010
2009: clotting factors VII, VIII & vWF Circulation 2010
(2010: sleep duration and latency) Nat Genet 2010 ?
(2010: human height, weight, WHR) 3 x Nat Genet 2010 ?
(2010: global lipids) Nature 2010 ?
(2010: cognitive traits) 2 x Nat Genet 2010 ?
(2010: ECG, urine, CRP, HbA1c, ABPI, P, cortisol…)
Strategy (5): Next moves (plan for 2010-2012)
1. GWAS of -OMICS (“1 level down from QT”) &
functional follow-up & systems biology / pathways
2. Development of novel methods for analysis of the
effect of CNV and rare variants on human QTs
3. Expand the number of phenotypes measured in
plasma in at least 3,000 examinees (e.g. ILs, etc.)
4. Whole-genome sequence for 1,000 examinees &
the new round of consortia participation
Results of GWAS of LIPIDOMICS traits
PLoS Genet 2009
Forthcoming (2010): GWAS of 132 circulating
phospholipids (PLoS Genetics)
Further interest of our group: GWAS of
glycomics, proteomics, other metabolomics
and functional follow-up
Progress in GLYCOMICS: dependent on measurement
Rudd PM et al. (Natl. Inst. Bioprocessing Res. Train.): refined chromatography approaches for analysis of glycosylation
High-performance liquid chromatography (HPLC):
- Glycoproteins immobilized
- Glycans released
- Fluorescent labels attached
- Labelled sugars run on a normal phase HPLC column
- Resulting peaks correlated to a pre-run dextran ladder
Nature 2009; 457: 617-620
CROATIAN CENTRE FOR GLOBAL HEALTH
“GlycoBioGen”: A consortium led by a collaboration of Scottish, Croatian & Irish institutions
Quantitation of glycans in human plasma:
J Proteome Res 2009; 8: 694-701
• Separation of plasma N-glycans into 16 chromatographic peaks using the HPLC method (GP1-GP16): area under each peak measured as a QT
• Unusual biological variability at population level
• Significant effects of age, gender, environmental factors
• Highly varying heritabilities
• Striking correlations with other biochemical QTs
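Treating the area under each chromatographic peak as a quantitative trait can be sketched as numerical integration of the detector signal over each peak's time window. The example below uses the trapezoid rule and expresses each peak as a percentage of the total area; the normalization step, the function names and the signal values are assumptions made for illustration, not the published protocol.

```python
def trapezoid_area(times, signal):
    """Area under a sampled signal using the trapezoid rule."""
    if len(times) != len(signal):
        raise ValueError("times and signal must have the same length")
    area = 0.0
    for i in range(len(times) - 1):
        width = times[i + 1] - times[i]
        area += width * (signal[i] + signal[i + 1]) / 2.0
    return area


def relative_peak_areas(peaks):
    """Express each peak's area as a percentage of the summed area.

    peaks: dict mapping a peak name (e.g. "GP1") to a (times, signal) pair
           covering that peak's retention-time window.
    """
    areas = {name: trapezoid_area(t, s) for name, (t, s) in peaks.items()}
    total = sum(areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in areas.items()}


# Hypothetical two-peak chromatogram (arbitrary time and fluorescence units).
example_peaks = {
    "GP1": ([0.0, 1.0, 2.0], [0.0, 2.0, 0.0]),
    "GP2": ([2.0, 3.0, 4.0], [0.0, 6.0, 0.0]),
}
print(relative_peak_areas(example_peaks))
```

Each resulting percentage could then enter a GWAS like any other quantitative trait, after adjustment for covariates such as age and gender.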
Results of GWAS (Vis island, Croatia):
• FUT8: associated with GP1 in 1,000 subjects (p = 5.09 × 10⁻⁸ to 7.07 × 10⁻⁸)
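To put p-values of this size in context, a genome-wide scan has to correct for the number of markers tested. The sketch below assumes a plain Bonferroni correction at a family-wise error rate of 0.05; the slides do not state which threshold was applied, and the commonly quoted 5 × 10⁻⁸ corresponds to roughly one million independent tests.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def passes_threshold(p_value, threshold):
    """True if the p-value is strictly below the threshold."""
    return p_value < threshold


# Threshold if only the 317,000 SNPs typed in the Vis cohort are counted.
array_threshold = bonferroni_threshold(0.05, 317_000)
# Conventional threshold, assuming about one million independent tests.
conventional_threshold = bonferroni_threshold(0.05, 1_000_000)

for label, threshold in [("317k array", array_threshold),
                         ("conventional", conventional_threshold)]:
    print(label, threshold, passes_threshold(5.09e-8, threshold))
```

Under these assumptions the reported FUT8 signal clears the array-based threshold and sits right at the edge of the conventional one, so replication in other cohorts would be the natural next step.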
“Missing heritability”:
• CNVs (copy number variants):
• Nature (April 2010), WTCCC: didn’t find any associations with disease at all
• Rare variants:
• “Moving frames” method (by Eleftheria Zeggini at Sanger, Hinxton, Cambridge): MAGIC, DIAGRAM & SPIROMETA
• “Exome sequencing” (4-10x)
• “Deep whole-genome sequencing” (48x)
“Expand phenotypes”:
• Gordan: N-glycans
• Zoran: CRD series
• Tatijana and Vesela: T4, TSH
• Mladen: markers of oxidative stress?
• Janoš: proteomics?
• Rosanda: anti-HBV antigens?
• Ana: interleukins, CD4?
“Whole-genome sequence era”:
• Wellcome Trust Sanger Institute, Hinxton, Cambridge: agreement that 400 of the first 2,500 examinees with WGS will be Croatians (Korcula)
• Why? Genealogies (expanding the number through “imputation”) and dense phenotyping (hundreds of QTs)
• Project will start: end of 2011
• Value for us: GBP 4 million at present time; should get us into the “next wave” of consortia work; needs Vesna Boraska etc.