Media:GWAS_projects_intro
Modeling genetic and phenotypic data with the
use of statistics
Discovery of phenotypes influenced by the season of birth
Can environment modify genetic effects on human anthropometric traits?
Genetics of liver abnormalities in obese subjects
Genetics of liver markers and their interaction with obesity
“Solving biological problems that require Maths”
Projects supervised by:
Zoltán Kutalik
Diana Marek
Murielle Bochud
Pedro Marques-Vidal
Aims of the projects
– Raise awareness of concrete clinical research involving genetic
concepts and data, as well as phenotypes, measured in a population
• Genotyping data (SNPs)
• Phenotypes
• Interactions with environmental factors
• Detection of association between a SNP and the variation of a
phenotype
– Put into practice the mathematical/statistical theories used to
model the biological question
• Linear and logistic regression
• Use of Matlab®
Data: CoLaus (Cohort Lausanne)
6,189 individuals
Genotypes: 500,000 SNPs
Phenotypes: 159 measurements, 144 questions
Collaboration with:
Vincent Mooser (GSK), Peter Vollenweider & Gerard Waeber (CHUV)
Genetic variants: SNPs
(Single Nucleotide Polymorphisms)
ATTGCAATCCGTGG...ATCGAGCCA…TACGATTGCACGCCG…
ATTGCAAGCCGTGG...ATCTAGCCA…TACGATTGCAAGCCG…
ATTGCAAGCCGTGG...ATCTAGCCA…TACGATTGCAAGCCG…
ATTGCAATCCGTGG...ATCGAGCCA…TACGATTGCACGCCG…
ATTGCAAGCCGTGG...ATCTAGCCA…TACGATTGCAAGCCG…
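As a small illustration of what the aligned sequences above show, the following Python sketch lists the positions at which the five sequences differ. The variable names are hypothetical, the elided stretches are kept exactly as on the slide (they compare as identical), and Python is used only for illustration, since the projects themselves use Matlab.

```python
# The five sequences from the slide; "..." and "…" mark elided stretches.
sequences = [
    "ATTGCAATCCGTGG...ATCGAGCCA…TACGATTGCACGCCG…",
    "ATTGCAAGCCGTGG...ATCTAGCCA…TACGATTGCAAGCCG…",
    "ATTGCAAGCCGTGG...ATCTAGCCA…TACGATTGCAAGCCG…",
    "ATTGCAATCCGTGG...ATCGAGCCA…TACGATTGCACGCCG…",
    "ATTGCAAGCCGTGG...ATCTAGCCA…TACGATTGCAAGCCG…",
]

# A column is polymorphic when the sequences do not all carry the same letter.
snp_positions = [
    i for i, column in enumerate(zip(*sequences)) if len(set(column)) > 1
]

# Alleles observed at each polymorphic position, one letter per sequence.
alleles = {i: "".join(seq[i] for seq in sequences) for i in snp_positions}

for position, letters in alleles.items():
    print(position, letters)
```

Each printed line gives a string offset within the slide text (not a genomic coordinate) and the letter carried by each of the five sequences at that offset.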
What is association?
Genetic variation yields phenotypic variation
[Figure: a chromosome carrying SNPs and a trait variant; distributions of “trait” for the population with one allele and for the population with the other allele]
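Association using regression
[Figure: phenotype plotted against genotype / coded genotype]
Regression formalism
$$ f(y_i) = \beta \, x_i + \varepsilon_i $$
$f$: (monotonic) transformation
$y_i$: phenotype (response variable) of individual i
$x_i$: coded genotype (feature) of individual i
$\beta$: effect size (regression coefficient)
$\varepsilon_i$: error (residual)
$p(\beta = 0)$: p-value for the null hypothesis of no effect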
Goal: Find the effect size that best explains all (potentially
transformed) phenotypes as a linear function of the
genotypes, and estimate the p-value, i.e. the probability of seeing
data at least this extreme under the null hypothesis (no effect)
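The goal stated above can be sketched for a single SNP as follows. This is a minimal Python illustration rather than the Matlab workflow used in the projects, and it rests on several assumptions that are not on the slides: the genotype is coded as a 0/1/2 allele count, the transformation is the identity, an intercept is fitted, the p-value uses a normal approximation to the t-statistic (reasonable only for large samples), and the data are simulated with an arbitrary effect size of 0.3.

```python
import math
import random


def snp_association(genotypes, phenotypes):
    """Least-squares fit of phenotype = intercept + beta * genotype + error.

    Returns (beta, standard_error, p_value). The two-sided p-value relies on
    a normal approximation, an assumption made to stay within the stdlib.
    """
    n = len(genotypes)
    mean_x = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in genotypes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(genotypes, phenotypes))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_x
    rss = sum(
        (y - intercept - beta * x) ** 2 for x, y in zip(genotypes, phenotypes)
    )
    standard_error = math.sqrt(rss / (n - 2) / sxx)
    z = beta / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return beta, standard_error, p_value


# Simulated example (hypothetical numbers, not CoLaus data).
random.seed(0)
genotypes = [
    (random.random() < 0.3) + (random.random() < 0.3) for _ in range(5000)
]
phenotypes = [0.3 * g + random.gauss(0.0, 1.0) for g in genotypes]

beta, standard_error, p_value = snp_association(genotypes, phenotypes)
print(f"beta = {beta:.3f}, se = {standard_error:.3f}, p = {p_value:.2e}")
```

With real data, a monotonic transformation of the phenotype (for example a logarithm) would be applied before calling the function.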
Whole Genome Association
Current microarrays probe ~1M SNPs!
Standard approach:
Evaluate the significance of association
for each SNP independently
[Figure: Manhattan plot; x-axis: chromosome & position, y-axis: significance]
[Figure: quantile-quantile plot; x-axis: expected significance, y-axis: observed significance]
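The coordinates of a quantile-quantile plot can be computed as in the sketch below. It assumes that "significance" means -log10(p), which is the usual convention but is not stated on the slide, and that under the null hypothesis the i-th smallest of m p-values is expected near i / (m + 1). The function name and the simulated null p-values are hypothetical.

```python
import math
import random


def qq_points(p_values):
    """Return (expected, observed) significance pairs, most significant first."""
    m = len(p_values)
    observed = sorted(p_values)
    return [
        (-math.log10((i + 1) / (m + 1)), -math.log10(p))
        for i, p in enumerate(observed)
    ]


# Simulated null p-values: 1 - random() lies in (0, 1], so log10 is defined.
random.seed(2)
null_p_values = [1.0 - random.random() for _ in range(1000)]
points = qq_points(null_p_values)

for expected, observed in points[:3]:
    print(f"expected {expected:.2f}, observed {observed:.2f}")
```

When no SNP is associated, the points are expected to lie close to the diagonal; a tail rising above the diagonal at the most significant end suggests true associations.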
GWA screens include a large number of statistical tests!
• Huge burden of correcting for multiple testing!
• Can detect only highly significant associations
($p < \alpha / \#(\text{tests}) \sim 10^{-7}$)
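The threshold above is consistent with, for example, α = 0.05 and 500,000 tests, since 0.05 / 500,000 = 10^-7. The sketch below applies the same correction to a small simulated scan in which each SNP is tested independently. Everything in it is an illustrative assumption: the number of SNPs, the single causal SNP, its effect size, and the normal approximation used for the p-value.

```python
import math
import random


def association_p_value(x, y):
    """Two-sided p-value for the slope of y on x, via the correlation r.

    Uses t = r * sqrt((n - 2) / (1 - r^2)) with a normal approximation.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return math.erfc(abs(t) / math.sqrt(2.0))


# Simulated genotypes: n_snps rows, one 0/1/2 allele count per individual.
random.seed(1)
n_individuals, n_snps, causal = 2000, 200, 17
genotypes = [
    [(random.random() < 0.3) + (random.random() < 0.3) for _ in range(n_individuals)]
    for _ in range(n_snps)
]
phenotype = [0.4 * g + random.gauss(0.0, 1.0) for g in genotypes[causal]]

alpha = 0.05
threshold = alpha / n_snps  # Bonferroni: alpha divided by the number of tests
p_values = [association_p_value(snp, phenotype) for snp in genotypes]
hits = [j for j, p in enumerate(p_values) if p < threshold]
print("threshold:", threshold, "significant SNPs:", hits)
```

The Bonferroni correction is conservative: it controls the chance of any false positive across the whole scan, at the price of only detecting strong associations.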
Discovery of phenotypes influenced
by the season of birth
• Background: It has been shown in model organisms, e.g. mouse, that
the perinatal photoperiod can have a long-term influence on behaviour and
on the regulation of the circadian clock genes.
• Goal: The goal of this project is to use the Cohorte Lausannoise (CoLaus)
data to discover phenotypes with statistical evidence of being influenced by
the season of birth of the individual. Special emphasis will be on
psychological traits.
• Mathematical tools: Statistics. The students will learn how to use Matlab to
read in large data sets and conduct linear and logistic regression analysis.
• Biological or Medical aspects: The effect of imprinting on complex human
traits is poorly understood; we aim to elucidate one particular aspect of it.
• Supervisors: Zoltán Kutalik & Diana Marek
• References:
Ciarleglio CM, Axley JC, Strauss BR, Gamble KL, McMahon DG.
Perinatal photoperiod imprints the circadian clock. Nat Neurosci. 2011 Jan;
14(1):25-7. doi:10.1038/nn.2699. PMID: 21131951.
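For a binary phenotype, the logistic regression mentioned under "Mathematical tools" could look like the following sketch. It is a Python illustration under stated assumptions, not the projects' Matlab code: season of birth is reduced to a hypothetical winter indicator (December to February), the data are simulated with an arbitrary effect, the model has a single predictor, and the p-value is a Wald test with a normal approximation.

```python
import math
import random


def fit_logistic(x, y, n_iter=25):
    """Fit P(y = 1) = 1 / (1 + exp(-(a + b * x))) by Newton-Raphson.

    Returns (a, b, se_b), where se_b is the standard error of b taken from
    the inverse of the information matrix at the last iteration.
    """
    a = b = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient with respect to a
            g1 += (yi - p) * xi     # gradient with respect to b
            h00 += w                # entries of the information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b, math.sqrt(h00 / det)


# Simulated example: hypothetical binary trait, more frequent in winter births.
random.seed(4)
winter_born, trait = [], []
for _ in range(6000):
    month = random.randint(1, 12)
    winter = 1.0 if month in (12, 1, 2) else 0.0
    prob = 1.0 / (1.0 + math.exp(-(-1.0 + 0.5 * winter)))
    winter_born.append(winter)
    trait.append(1.0 if random.random() < prob else 0.0)

a, b, se_b = fit_logistic(winter_born, trait)
p_value = math.erfc(abs(b / se_b) / math.sqrt(2.0))
print(f"log-odds ratio = {b:.3f}, se = {se_b:.3f}, p = {p_value:.2e}")
```

When many phenotypes are screened this way, the resulting p-values need the same multiple-testing correction as a genome-wide scan.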
Can environment modify genetic effects
on human anthropometric traits?
• Background: Large studies (including hundreds of thousands of
individuals) identified genetic factors influencing human height, body mass
index (BMI) and waist-to-hip ratio (WHR). It is currently unknown whether
the effects of the discovered genetic variants are modified by environmental
factors.
• Goal: The goal of this project is to use the Cohorte Lausannoise (CoLaus)
data to find environmental factors (e.g. smoking, alcohol consumption,
physical activity) that modify genetic effects influencing human height, BMI
and WHR.
• Mathematical tools: Statistics. The students will learn how to use Matlab to
read in large data sets including genetic data, and to conduct linear and
logistic regression and interaction analysis.
• Biological or Medical aspects:
• Supervisors: Zoltán Kutalik & Diana Marek
• References:
Lango Allen et al. Hundreds of variants clustered in genomic loci and
biological pathways affect human height. Nature. 2010 Oct
14;467(7317):832-8.
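The interaction analysis named under "Mathematical tools" is commonly set up by adding a genotype × environment product term to the regression and testing whether its coefficient differs from zero. The sketch below shows one way to do this in Python. The model form, the binary exposure, the simulated effect sizes and the helper names are all assumptions made for illustration; the p-value again uses a normal approximation.

```python
import math
import random


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    k = len(matrix)
    aug = [
        list(row) + [1.0 if i == j else 0.0 for j in range(k)]
        for i, row in enumerate(matrix)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def fit_ols(rows, y):
    """Ordinary least squares; returns (coefficients, standard_errors)."""
    n, k = len(rows), len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    inv = invert(xtx)
    coef = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    rss = sum(
        (yi - sum(c * v for c, v in zip(coef, r))) ** 2 for r, yi in zip(rows, y)
    )
    sigma2 = rss / (n - k)
    return coef, [math.sqrt(sigma2 * inv[i][i]) for i in range(k)]


# Simulated data: columns are intercept, genotype, exposure, genotype * exposure.
random.seed(3)
rows, y = [], []
for _ in range(4000):
    g = (random.random() < 0.3) + (random.random() < 0.3)  # coded genotype 0/1/2
    e = 1.0 if random.random() < 0.4 else 0.0               # hypothetical exposure
    rows.append([1.0, float(g), e, g * e])
    y.append(0.2 * g + 0.1 * e + 0.3 * g * e + random.gauss(0.0, 1.0))

coef, se = fit_ols(rows, y)
z_interaction = coef[3] / se[3]
p_interaction = math.erfc(abs(z_interaction) / math.sqrt(2.0))
print(f"interaction = {coef[3]:.3f}, se = {se[3]:.3f}, p = {p_interaction:.2e}")
```

A non-zero interaction coefficient means the genetic effect differs between exposed and unexposed individuals, which is the kind of modification the project sets out to look for.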