Gene Hunting in Complex Diseases (CL)
MUS2046 Genetics in Medicine
Finding disease genes
Cathryn Lewis
Professor of Genetic Epidemiology
and Statistics
Introduction to genetics: 1
DNA structure
Introduction to genetics: 2
DNA sequence
www.onlineeducation.net/dna
Introduction to genetics: 3
DNA differences
What makes us different?
These differences control our hair
colour, our height, and the diseases
we will get
Complex disease: contributions from genetic
and environmental factors
[Figure: many genes (Gene1 to Gene8) and environmental factors (Env1 to Env4) each contribute to disease]
Examples: asthma, breast cancer, heart disease, autism, arthritis,
migraine, obesity, diabetes, stroke
Most diseases that have a major economic, social and health burden
Complex Diseases
• Raised risk in families
– But increase in risk may be slight compared with population risk
– Can be measured by the sibling relative risk
• No clear mode of inheritance
• Multiple genes
• Environmental effects
• Gene-gene, gene-environment interactions
Examples
Inflammatory bowel disease, multiple sclerosis, depression, asthma,
rheumatoid arthritis, diabetes, heart disease, ...
Most diseases that have a considerable public health impact
Genetic Association Study
A genetic association study tests whether the presence of a specific
genetic variant correlates with a trait of interest (such as risk of
disease)
• A SNP has two alleles: C, T
• Each individual has a genotype
at this SNP
– CC, CT or TT
Genetic variation:
Single nucleotide polymorphism (SNP)
....TGGACATGCA....
....TGGACCTGCA....
Alleles A and C are present in the population
Genotype : carried by an individual, on paternal
and maternal inherited chromosomes
Genotype AA: ....TGGACATGCA.... / ....TGGACATGCA....
Genotype AC: ....TGGACATGCA.... / ....TGGACCTGCA....
Genotype CC: ....TGGACCTGCA.... / ....TGGACCTGCA....
Genetic Association Study
A genetic association study tests whether
the presence of a specific genetic variant
correlates with a trait of interest (e.g.
presence/absence of disease)
Identifying SNPs that increase risk of disease
Genotype SNP with A, C alleles: AA, AC, CC
Cases – affected with disease
Controls – not affected with disease
More AC and CC genotypes in cases than in controls
Indicates that carrying C allele increases risk of disease
Case control studies
• Compare frequency of SNP alleles or genotypes in a series of cases
and controls
Cases
– Diagnosed with disease
– Ascertainment - through hospital or community?
– Define criteria for inclusion in study
Controls
– Unaffected with disease (supernormal controls)
– Randomly ascertained (e.g. blood donors)
– Both types of controls are valid
Important to match cases and controls on genetic ancestry
– if not, genetic differences between cases and controls may reflect
their ancestry, not their disease status
Association of PTPN22 mutation with
rheumatoid arthritis (RA)
Steer et al., Arthritis Rheum, 2005
• RA is a complex disease with a sibling relative risk of approximately 3, and
a strong HLA effect
• PTPN22 encodes a protein tyrosine phosphatase which interacts with the
negative regulatory kinase Csk to inhibit T cell signalling and activation
• The R620W mutation was shown in other studies to increase risk of RA
• Association study of R620W performed in London RA patients
• 302 RA cases (hospital-ascertained) and 374 controls, all of European
ancestry
Association of PTPN22 mutation with RA
Genotype            Controls (n=374)   Cases (n=302)   Odds ratio (95% CI)
CC                  312 (83%)          218 (72%)       1
CT                  61 (16%)           72 (24%)        1.7 (1.2-2.5)
TT                  1 (0.3%)           12 (4%)         17.2 (3.8-77.8)
Freq. of T allele   8.4%               15.9%
Significant difference in allele frequency (p = 3 x 10^-5)
Odds ratio of CT genotype compared to CC genotype
= 312 x 72 / (218 x 61)
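The CT-versus-CC odds ratio above can be checked with a short Python sketch. The confidence interval here uses the Woolf (log-scale) approximation, which is an assumption: the slide does not say how the published intervals were computed, and for sparse cells such as the TT row this approximation need not reproduce the published interval. The function name `odds_ratio_ci` is illustrative only.

```python
import math

def odds_ratio_ci(ctrl_ref, ctrl_alt, case_ref, case_alt, z=1.96):
    """Odds ratio of the 'alt' genotype versus the 'ref' genotype,
    with an approximate 95% CI (Woolf method, an assumption here)."""
    odds_ratio = (ctrl_ref * case_alt) / (case_ref * ctrl_alt)
    # Standard error of log(OR): square root of the summed reciprocal counts
    se_log_or = math.sqrt(1 / ctrl_ref + 1 / ctrl_alt + 1 / case_ref + 1 / case_alt)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Genotype counts from the table: controls CC=312, CT=61; cases CC=218, CT=72
or_ct, lo_ct, hi_ct = odds_ratio_ci(312, 61, 218, 72)
print(f"CT vs CC: OR = {or_ct:.1f} ({lo_ct:.1f}-{hi_ct:.1f})")
```

With the counts in the table this prints an odds ratio of 1.7 with interval 1.2-2.5, matching the CT row.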
Association of PTPN22 mutation with RA
Allele              Controls (n=374)   Cases (n=302)   Allelic odds ratio (95% CI)
C                   685 (92%)          508 (84%)       1
T                   63 (8%)            96 (16%)        2.05 (1.47-2.88)
TT genotype         1 (0.3%)           12 (4%)         17.2 (3.8-77.8)
Freq. of T allele   8.4%               15.9%
Under a multiplicative model, odds ratios for genotypes CC, CT, TT are 1, r, r^2
Here, OR for CC = 1 (baseline), OR for CT = 2.05, OR for TT = 2.05^2 = 4.20
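As a rough check on the allele table, the sketch below derives allele counts from the genotype counts on the previous slide (each person carries two alleles), then computes the T allele frequencies and the allelic odds ratio r. The helper name `allele_counts` is hypothetical.

```python
def allele_counts(n_cc, n_ct, n_tt):
    """Return (C count, T count): CC carries 2 C, CT carries 1 C and 1 T, TT carries 2 T."""
    return 2 * n_cc + n_ct, n_ct + 2 * n_tt

# Genotype counts (CC, CT, TT) from the PTPN22 genotype table
ctrl_c, ctrl_t = allele_counts(312, 61, 1)    # controls
case_c, case_t = allele_counts(218, 72, 12)   # cases

freq_t_controls = ctrl_t / (ctrl_c + ctrl_t)
freq_t_cases = case_t / (case_c + case_t)

# Allelic odds ratio r (T versus C), and r^2 for the TT genotype
# under the multiplicative model
allelic_or = (ctrl_c * case_t) / (case_c * ctrl_t)
or_tt_model = allelic_or ** 2

print(f"T frequency: controls {freq_t_controls:.1%}, cases {freq_t_cases:.1%}")
print(f"Allelic OR r = {allelic_or:.2f}, model OR for TT (r^2) = {or_tt_model:.1f}")
```

This gives 685 C and 63 T alleles in controls, 508 C and 96 T in cases, T frequencies of 8.4% and 15.9%, and r = 2.05, so r^2 is about 4.2.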
Rheumatoid arthritis: contributions from genetic
and environmental factors
[Figure: genetic factors (HLA, PTPN22, CTLA4, CD40, TNFAIP3, STAT4, TRAF1, other genes) and environmental factors (sex, age, smoking, others unknown) contributing to rheumatoid arthritis]
Over 100 genes have now been identified that are associated
with rheumatoid arthritis
Genome-wide association studies (GWAS)
• SNP chips from Illumina and Affymetrix will
genotype up to 1 million SNPs across the genome
• Capture most of the variation across the genome
WTCCC (2007) Nature 447: 661-78
Steps in WGA study
• Design study
• Collect samples
• Define phenotypes
• Type DNA on whole-genome panel
• Quality Control (QC)
• SNP-by-SNP analysis
• Interpret results
• Replicate, perform meta-analysis
GWAS analysis methods
• SNP-by-SNP analysis against phenotype
– Analysis of genotype counts
– Regression analysis of quantitative trait
– Logistic regression of case-control status on SNP genotype,
ancestry covariates, phenotypic covariates, environmental
factors ....
– Problems of multiple testing with 500K SNPs
Multiple Testing Approaches
How to make sense of ½ million p-values?
[Figure: histogram "Distribution under the Null" (x-axis: p-values, 0.0 to 1.0; y-axis: frequency, 0 to 25000); annotation: 25000 p-values<0.05]
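The 25000 figure follows from simple arithmetic: under the null, p-values are uniform, so about 5% of half a million tests fall below 0.05 by chance. The sketch below shows this, together with a Bonferroni-corrected threshold. Bonferroni is used here as one common illustration and is an assumption, not necessarily the approach taken in the lecture; the seed is arbitrary.

```python
import random

n_snps = 500_000   # number of SNPs tested
alpha = 0.05       # nominal significance level

# Expected number of p-values below alpha if no SNP is truly associated
expected_false_positives = n_snps * alpha

# Bonferroni correction: divide alpha by the number of tests
bonferroni_threshold = alpha / n_snps

# Simulate p-values under the null (uniform on [0, 1)) and count "hits"
random.seed(1)  # arbitrary seed, for repeatability only
null_p_values = [random.random() for _ in range(n_snps)]
observed_hits = sum(p < alpha for p in null_p_values)

print(f"Expected p-values < {alpha}: {expected_false_positives:.0f}")
print(f"Simulated p-values < {alpha}: {observed_hits}")
print(f"Bonferroni threshold: {bonferroni_threshold:.0e}")
```

The simulated count varies with the seed but stays close to 25000, and the Bonferroni threshold for 500K tests is 1 x 10^-7.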
QQ (quantile-quantile) plot
Elevation above line implies observed results more
significant than expected: true signal or artefact?
[Figure: observed log(p-values), ordered (y-axis) against expected log(p-values), ordered (x-axis)]
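A minimal sketch of how the points of a QQ plot could be computed. It assumes the common convention of plotting -log10(p) and of taking the expected ordered p-values under the null as i/(n+1); the function name `qq_points` is hypothetical, and p-values must be greater than zero.

```python
import math

def qq_points(p_values):
    """Return (expected, observed) pairs of -log10(p), both in ascending order.

    Expected values assume uniform p-values under the null,
    with the i-th smallest p-value expected at i / (n + 1).
    """
    n = len(p_values)
    observed = sorted(-math.log10(p) for p in p_values)
    expected = sorted(-math.log10((i + 1) / (n + 1)) for i in range(n))
    return list(zip(expected, observed))

# Toy example: p-values that sit exactly on the null expectation
n = 9
null_like = [(i + 1) / (n + 1) for i in range(n)]
for exp_val, obs_val in qq_points(null_like):
    print(f"expected {exp_val:.3f}  observed {obs_val:.3f}")
```

Points on the diagonal mean the observed p-values behave as expected under the null; points rising above it at the upper end correspond to the elevation described on the slide.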
Manhattan plot
-log(p-value) of each SNP plotted.
Thresholds at genome-wide significance (5 x 10^-8) and
suggestive significance (5 x 10^-6)
SNP rs7517810 (lies near gene TNFSF18)
Risk allele, T, has frequency 0.246
Odds ratio for this allele is 1.22
SNP risk information: Crohn’s disease
• SNP name : rs7517810
– Gives information on location, which gene(s) the SNP is in (or near)
• Allele frequency: allele T has frequency 0.246
– What are the frequencies of the two alleles?
– Can use to calculate genotype frequencies
• Odds ratio: 1.22 for allele T
– How much does carrying the ‘risk’ allele increase your risk of disease?
Genotype        Frequency   Odds ratio   Underlying model for odds ratio
CC (baseline)   0.57        1.00         1
CT              0.37        1.22         r
TT              0.06        1.49         r^2
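The genotype table can be reproduced from the two numbers given for rs7517810 (T allele frequency 0.246, odds ratio 1.22). The sketch assumes Hardy-Weinberg proportions for the genotype frequencies and the multiplicative model (1, r, r^2) for the odds ratios; the function name `genotype_table` is illustrative.

```python
def genotype_table(risk_allele_freq, allelic_or):
    """Genotype frequencies (Hardy-Weinberg, an assumption) and
    odds ratios (multiplicative model: 1, r, r^2) for a C/T SNP with risk allele T."""
    q = risk_allele_freq   # frequency of risk allele T
    p = 1 - q              # frequency of allele C
    return {
        "CC": (p * p, 1.0),
        "CT": (2 * p * q, allelic_or),
        "TT": (q * q, allelic_or ** 2),
    }

table = genotype_table(0.246, 1.22)
for genotype, (frequency, odds_ratio) in table.items():
    print(f"{genotype}: frequency {frequency:.2f}, odds ratio {odds_ratio:.2f}")
```

This gives frequencies 0.57, 0.37, 0.06 and odds ratios 1.00, 1.22, 1.49, as in the table.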
Regional association plot for SNPs on chr 19
associated with LDL cholesterol levels
• Strongest association with rs73015013
• Other SNPs also have significant
evidence of association
• SNPs are highly correlated (red), so
picking up same information
• Which is the relevant gene? The most
strongly associated SNPs do not lie in the
gene
• SNPs probably affect regulation of LDLR
gene (strong functional candidate gene)
Mendelian disorder
Complex disease
www.genome.gov/GWAStudies
How can we use genetics in a clinical setting?
• Disease risk estimation
Can we identify individuals at high risk of disease?
– offer appropriate screening protocols for early diagnosis
– ‘healthy’ living, reducing the environmental risk component
– preventative therapy?
• Diagnosis and prognosis
– Using genetics to help in diagnosis, avoiding expensive clinical tests
– Predict future disease path and treat accordingly
• Personalised medicine: pharmacogenetics and therapeutics
– Using genetic profiles to identify the most effective drug or therapy
– Avoid drugs likely to have major side-effects
• Gain insight into disease pathways from knowledge of gene function
– Deeper understanding of disease mechanism and prevention
– New targets for drug development
Can we use risk SNPs to identify individuals at
high risk of disease?
Risk prediction
1. Theory
Using breast cancer risk SNPs to estimate the
distribution of risks in the population
2. Practice
Using a cohort study to assess the predictive ability
of T2D SNPs
Disease prediction from genetic association studies
Test 500k SNPs across genome for
differences between cases and controls
Identify panels of SNPs that control risk of
disease
Each SNP: odds ratio of disease,
frequency in population
For any individual, can calculate genetic risk
profile across these SNPs
Can we use the low risk genes to predict a woman’s
risk of breast cancer?
[Figure: low-risk genes (MAP3K1, ZNF365, TOX3, FGFR2, ...) each contributing to breast cancer risk; example per-gene odds ratios 1.35, 1.20, 1, 1.25, 1, 0.74, 0.86]
Multiply odds ratios from each gene to give overall relative risk = 1.30
(slight increase in risk, compared to average risk of 1)
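The multiplication can be sketched in a few lines of Python using the per-gene odds ratios shown on the slide. Multiplying assumes the genes act independently on risk, which is a simplifying assumption; which odds ratio belongs to which gene is not recoverable from the slide, so the list is unlabeled.

```python
import math

# Per-gene odds ratios as shown on the slide (gene labels not recoverable)
odds_ratios = [1.35, 1.20, 1, 1.25, 1, 0.74, 0.86]

# Combined relative risk: product of the individual odds ratios,
# assuming independent (multiplicative) effects
combined_risk = math.prod(odds_ratios)

print(f"Combined relative risk: {combined_risk:.2f}")
```

The product of these rounded values comes to about 1.29, in line with the relative risk of 1.30 quoted on the slide.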
Distribution of genetic risk in the population
Decreased risk: carry few risk alleles
Baseline risk
Increased risk: carry many risk alleles
5% and 1% of population at highest risk
How useful is this information for:
• Screening?
• Therapeutic interventions?
• Lifestyle management?
Can we use risk SNPs to identify individuals at
high risk of disease?
Risk prediction
1. Theory
Using breast cancer risk SNPs to estimate the
distribution of risks in the population
2. Practice
Using a cohort study to assess the predictive ability
of T2D SNPs
How do genes and environmental/clinical risk factors
help predict individuals who develop type 2 diabetes?
• Non-genetic risk factors: Framingham risk scores
– Age, BMI, cholesterol, family history, blood pressure, fasting glucose
• Genetic risk factors: 20 SNPs
• 5535 healthy individuals: 303 developed T2D over next 10
years
Fig 2 Percentage of participants in each gene count score category among those who developed type 2
diabetes and those who remained free from diabetes.
Talmud P J et al. BMJ 2010;340:bmj.b4838
©2010 by British Medical Journal Publishing Group
Fig 1 Receiver operating characteristics curves for gene count score alone (area under curve 0.54, 95% CI
0.50 to 0.58), Framingham offspring risk score (area under curve 0.78, 0.75 to 0.82), and gene count score
incorporated into Framingham offspring risk score (area under curve 0.78, 0.75 to 0.81).
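To make the area under the curve concrete: it is the probability that a randomly chosen person who developed disease has a higher score than a randomly chosen person who did not, with ties counted as half. The sketch below computes it by direct pair counting. The scores are hypothetical toy values, not data from Talmud et al., and the function name `auc` is illustrative.

```python
def auc(scores_cases, scores_noncases):
    """Area under the ROC curve by pair counting:
    fraction of (case, non-case) pairs where the case scores higher,
    with ties counted as half."""
    wins = 0.0
    for case_score in scores_cases:
        for noncase_score in scores_noncases:
            if case_score > noncase_score:
                wins += 1.0
            elif case_score == noncase_score:
                wins += 0.5
    return wins / (len(scores_cases) * len(scores_noncases))

# Hypothetical gene count scores (number of risk alleles carried)
developed_t2d = [21, 19, 23, 20]
remained_free = [20, 18, 22, 19]

print(f"AUC = {auc(developed_t2d, remained_free):.2f}")
```

An AUC of 0.5 means the score separates the two groups no better than chance, and 1.0 means perfect separation, which puts the 0.54 reported for the gene count score alone in context.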
Finding genes for complex disorders – how are we doing?
Identified SNPs only account for a small proportion of the genetic
contribution to disease
Disease                            Number of loci   Proportion of heritability explained   Heritability measure
Age-related macular degeneration   5                50%                                    Sibling recurrence risk
Crohn's disease                    32               20%                                    Genetic risk (liability)
Systemic lupus erythematosus       6                15%                                    Sibling recurrence risk
Type 2 diabetes                    18               6%                                     Sibling recurrence risk
HDL cholesterol                    7                5.2%                                   Residual phenotypic variance
Height                             40               5%                                     Phenotypic variance
Manolio et al., Nat Genet, 2009
How do we find the missing heritability?
• Genotype denser SNPs genome-wide
• Identify the causal variant (not necessarily the SNP on the
GWAS chip)
• Account for gene-gene, gene-environment interactions
• Epigenetics?
• Systems biology approach: genotype, gene
expression, proteomics, epigenetics,
environment, clinical data
DIRECT-TO-CONSUMER GENETIC
TESTING
23andme.com
• In November 2013, the US FDA banned 23andme from giving
information about disease risks
• Currently gives information on ancestry only
What did 23andme test for?
Disease risk (122)
Abdominal Aortic Aneurysm
Age-related Macular Degeneration
Alcohol Dependence
Alopecia Areata
Alzheimer's Disease
Alzheimer's Disease: Preliminary Research
Ankylosing Spondylitis
Asthma
Atopic Dermatitis
Atrial Fibrillation
Atrial Fibrillation: Preliminary Research
Attention-Deficit Hyperactivity Disorder
Back Pain
Basal Cell Carcinoma
Behçet's Disease
Bipolar Disorder
Bipolar Disorder: Preliminary Research
Bladder Cancer
Brain Aneurysm
Carrier Status (53)
ARSACS
Agenesis of the Corpus Callosum with Peripheral Neuropathy (ACCPN)
Alpha-1 Antitrypsin Deficiency
Autosomal Recessive Polycystic Kidney Disease
BRCA Cancer Mutations (Selected)
Beta Thalassemia
Bloom's Syndrome
Canavan Disease
Congenital Disorder of Glycosylation Type 1a (PMM2-CDG)
Connexin 26-Related Sensorineural Hearing Loss
Cystic Fibrosis
D-Bifunctional Protein Deficiency
DPD Deficiency
Dihydrolipoamide Dehydrogenase Deficiency
Traits (60)
Adiponectin Levels
Alcohol Flush Reaction
Asparagus Metabolite Detection
Avoidance of Errors
Biological Aging
Birth Weight
Bitter Taste Perception
Blood Glucose
Breast Morphology
Breastfeeding and IQ
C-reactive Protein Level
Caffeine Consumption
Childhood and Adolescent Growth
Chronic Hepatitis B
Earwax Type
Eating Behavior
Eye Color
Eye Color: Preliminary Research
Finger Length Ratio
Food Preference
Freckling
HDL ("Good") Cholesterol
Drug Response (24)
Abacavir Hypersensitivity
Alcohol Consumption, Smoking and Risk of Esophageal Cancer
Antidepressant Response
Beta-Blocker Response
Caffeine Metabolism
Clopidogrel (Plavix®) Efficacy
Floxacillin Toxicity
Fluorouracil Toxicity
Hepatitis C Treatment Side Effects
Heroin Addiction
Lumiracoxib (Prexige®) Side Effects
Metformin Response
Naltrexone Treatment Response
Oral Contraceptives, Hormone Replacement Therapy and Risk of Venous Thromboembolism
Phenytoin (Dilantin®)
Absolute risk
Relative risk
Traits
Parkinson’s, Alzheimer’s: Locked
23andme: psoriasis
Psoriasis
Gene     SNP          My genotype   Adjusted OR
HLA-C    rs10484554   CT            1.83
IL12B    rs3212227    TT            1.16
IL23R    rs11209026   GG            1.06
Type 1 diabetes
“ …. better off spending their money on a gym
membership or personal trainer”
Hunter, Khoury & Drazen,
N Engl J Med, 2008
RBI: Risk .... burden ... intervention
• Is the risk conferred large?
Is it really worth worrying about a relative risk
of 1.05?
• Is the disorder or trait severe?
Schizophrenia? Baldness? High blood
pressure? Breast cancer?
• Is there anything we can do about it?
Prevention? Early diagnosis?
Empowerment or endangerment?
Summary
• Scientific strides in identifying the inherited genetic variants that
affect disease risk
• Gives biological insights into the disease
• Very limited disease prediction available from current findings
– Incomplete knowledge of polygenic component of disease
– Causal genetic variants are unknown
• Better prediction comes from
– Family history
– Environmental risk factors (smoking, body mass index)
– Pre-clinical factors (blood pressure, cholesterol levels)