Introduction to Medical Genetics

Applied research in human
genetics
Weibin Shi
Michele Sale
The central focus of human
genetics research:
Identification of genes
that cause disease
Which polymorphisms in
Which genes in
Which individuals
Exposed to which environmental factors
Increase risk of developing disease?
Defining what to study

As in any biomedical study, need to precisely
define the disease under study

Define primary phenotype and secondary
phenotypes

Understanding risk factors

Genetic or Environmental?
• Ethnic differences
• Age/gender influence
Refining whether the disease under study is genetic
• Family studies: familial aggregation
• Twin studies: concordance rate of the disorder in monozygotic (MZ) twins vs. the rate in dizygotic (DZ) twins
• Adoption studies: disease frequency in adoptees’ biological vs. adoptive parents or siblings
• Ethnic differences
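As a rough illustration of the twin-study comparison, the sketch below computes a pairwise concordance rate (concordant pairs divided by all pairs with at least one affected twin) for MZ and DZ twins. The pair counts are hypothetical and chosen only for the example; a higher MZ than DZ concordance is the pattern that would suggest a genetic contribution.

```python
def pairwise_concordance(concordant_pairs, discordant_pairs):
    """Fraction of twin pairs in which both twins are affected,
    among pairs with at least one affected twin."""
    total = concordant_pairs + discordant_pairs
    if total == 0:
        raise ValueError("need at least one twin pair")
    return concordant_pairs / total

# Hypothetical counts, for illustration only (not from the lecture).
mz_rate = pairwise_concordance(concordant_pairs=30, discordant_pairs=70)
dz_rate = pairwise_concordance(concordant_pairs=10, discordant_pairs=90)

print(f"MZ concordance: {mz_rate:.2f}")
print(f"DZ concordance: {dz_rate:.2f}")
print(f"MZ/DZ ratio:    {mz_rate / dz_rate:.1f}")
```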
Best Proof of All?
Connect genetic variation to the disease!
But, how do we
find the gene?
Linkage analysis and Association
analysis are effective in identifying
Mendelian disorder genes but are
less effective in identifying
complex disease genes
Complex diseases are often caused by
multiple genes and environmental factors
Difficulties of genetic studies of complex disease in humans
• Heterogeneity of human populations
• Several to many genes involved
• Modest effects for any single gene
• Environmental influences
Mouse model of human genetic
disease
Advantages over other mammals:
-Small size (<40 g), short generation time (8-9 wks), large litter size (5-10 pups)
-Numerous inbred strains and gene-targeted mice
-Easy control of environmental factors
Mouse genome shares great similarity
with the human genome
Mouse-Human Comparison
2.5 vs. 3.2 billion bp long
> 99% of genes have homologs
> 95% of genome “syntenic” (relative gene-order
conservation)
Variation among mouse strains in susceptibility
to diet-induced atherosclerosis
Atherosclerotic Vascular Disease
Terminology
• Discrete/qualitative trait - traits that are present or absent.
• Continuous/quantitative trait - traits that have measurable characteristics across a range of values. This class includes the vast majority of diseases afflicting humans.
Quantitative trait locus (QTL)
analysis
Gene 1
Gene 2
Gene 3
Gene 4
Gene 5
Gene 6
QTL analysis starts with selection of two phenotypically different strains
[Cross diagram: C3H x B6 -> F1; F1 x F1 -> … -> F2]
All F2s are analyzed for trait values
All F2s are typed for genetic markers spanning the whole genome
Statistical analysis
Map Manager QTXb20 (http://mapmgr.roswellpark.org/) and R/qtl (http://www.biostat.jhsph.edu/~kbroman/software) are available for testing the association of a phenotype with each marker.
The logarithm of the odds (LOD) score is used to define the significance of the association of a genetic marker with a trait.
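To make the LOD score concrete, here is a minimal sketch of single-marker regression in an F2 cross. It assumes a normally distributed trait and uses the common formula LOD = (n/2) * log10(RSS0/RSS1), where RSS0 is the residual sum of squares around the overall mean and RSS1 is the residual sum of squares around the genotype-group means. The genotype labels and trait values are made up for illustration, and this is not necessarily the exact computation performed by Map Manager QTX or R/qtl.

```python
import math
from collections import defaultdict

def marker_lod(genotypes, phenotypes):
    """LOD score for one marker by marker regression (normal model).

    Compares a null model (one overall mean) with a model that gives
    each genotype class its own mean.
    """
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)

    groups = defaultdict(list)
    for g, y in zip(genotypes, phenotypes):
        groups[g].append(y)

    rss_full = 0.0
    for ys in groups.values():
        group_mean = sum(ys) / len(ys)
        rss_full += sum((y - group_mean) ** 2 for y in ys)

    return (n / 2) * math.log10(rss_null / rss_full)

# Hypothetical F2 data: BB, BH, HH are illustrative genotype labels.
genotypes = ["BB"] * 4 + ["BH"] * 4 + ["HH"] * 4
linked_trait = [10, 12, 11, 13, 15, 17, 16, 18, 20, 22, 21, 23]
unlinked_trait = [10, 12, 11, 13] * 3

lod_linked = marker_lod(genotypes, linked_trait)
lod_unlinked = marker_lod(genotypes, unlinked_trait)
print(f"LOD (trait tracks genotype): {lod_linked:.2f}")
print(f"LOD (trait ignores genotype): {lod_unlinked:.2f}")
```

In a genome scan the same calculation would be repeated at every marker, and the resulting LOD curve compared against a genome-wide significance threshold.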
Genome-wide scan for
atherosclerotic lesions
Interval mapping provides the best estimate of the location of genes
affecting atherosclerotic lesions
Dissect major QTL by
construction and analysis of
congenic strains
Congenic strain: identical
to an inbred strain except
for a differential
chromosomal segment
Sequence Comparison
• If crosses include those of sequenced strains, search database for polymorphisms of positional candidate genes in the QTL regions.
  15 common inbred strains (B6, AJ, 129, DBA, C3H …) now available at MGI, NCBI, and Ensembl
• Re-sequence coding and promoter regions of strong candidate genes.
Gene expression database
• Where is your gene expressed?
  http://www.informatics.jax.org/javawi2/servlet/WIFetch?page=expressionQF
• Is there microarray data for your gene?
  http://www.ncbi.nlm.nih.gov/geo/
Conduct functional studies to prove the identity
of promising candidate genes
Test the significance of QTL genes found in mouse
by association analysis using human populations
Table 2. Genotyping results for genes in the human Chr 1 region homologous to the mouse Ath1 locus
(Affected and Control columns: rare alleles, % (total alleles))

Gene(a)          RefSNP ID(b)  SNP(c)  Position in gene (bp in Ensembl)    Affected     Control      P
PIGC             rs1063412     C/T     Exon 2 coding (169650343)           40.7 (684)   41.7 (734)   0.69
C1orf9(d)        rs1053381     A/G     3' UTR (169819913)                  6.3 (694)    8.0 (672)    0.22
TNFSF6 (FASL)    rs763110      C/T     687 bp upstream (169866874)         31.6 (728)   32.0 (744)   0.87
Intergenic       rs983514      A/T     (170111828)                         3.0 (708)    3.5 (714)    0.57
TNFSF18          rs1883477     A/G     Intron 1 (170258429)                19.0 (694)   18.3 (706)   0.72
TNFSF4 (OX40L)   rs1234315     C/T     1,992 bp upstream (170417839)(f)    45.9 (754)   43.3 (778)   0.31
                 rs3850641     A/G     Intron 1 (170415208)(e)             15.5 (766)   12.1 (784)   0.05
                 rs1234313     A/G     Intron 1 (170405623)(e)             29.6 (766)   33.4 (784)   0.11
                 rs3861950     C/T     Intron 2 (170395668)(e)             33.4 (710)   30.4 (746)   0.23
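The P values in a table like this can be approximated from the reported percentages and allele totals. The sketch below rebuilds the 2x2 allele-count table for rs3850641 (15.5% of 766 alleles in affected subjects vs. 12.1% of 784 in controls) and applies a Pearson chi-square test with one degree of freedom. The choice of test is an assumption, since the slide does not say which test the authors used, and the counts are rounded back from percentages.

```python
import math

def allele_counts(rare_pct, total_alleles):
    """Recover approximate (rare, common) allele counts from a percentage."""
    rare = round(rare_pct / 100 * total_alleles)
    return rare, total_alleles - rare

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
    [[a, b], [c, d]]; returns (statistic, p-value) for 1 df."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper tail equals erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

case_rare, case_common = allele_counts(15.5, 766)
ctrl_rare, ctrl_common = allele_counts(12.1, 784)
chi2, p_value = chi_square_2x2(case_rare, case_common, ctrl_rare, ctrl_common)

print(f"chi-square = {chi2:.2f}, P = {p_value:.3f}")
```

Under these assumptions the result lands close to the reported P of 0.05 for this SNP.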
Applied research in
human genetics
Michèle Sale, Ph.D.
Center for Public Health Genomics
[email protected]
Tel: 982-0368
National DNA Day!
• April 25
• Commemorates the discovery of the structure of DNA in 1953 and the sequencing of the human genome 50 years later
Genetic Information Nondiscrimination Act of 2007 (GINA)
• A version first introduced in 1995
• GINA would:
  - Prohibit access to individuals' personal genetic information by insurance companies making health coverage plan enrollment decisions, and by employers making hiring decisions;
  - Prohibit insurance companies from requesting that applicants for group or individual health coverage plans be subjected to genetic testing or screening, and prohibit them from discriminating against health plan applicants based on individual genetic information; and
  - Prohibit employers from using genetic information to refuse employment, and prohibit them from collecting employees' personal genetic information without their explicit consent.
• Nearly 40 states have had individual forms of the legislation in place
• Passed by House:
  - April 25, 2007 (420-3), and again
  - March 7, 2008 (as part of the Paul Wellstone Mental Health and Addiction Equity Act, 268-148)
Some examples from
GWAS for type 2
diabetes
The first type 2 diabetes GWAS papers…
• Sladek et al. A genome-wide association study identifies novel risk loci for type 2 diabetes. Nature. 2007 Feb 22; 445:881-5.
• Frayling et al. A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science. 2007 May 11; 316:889-94.
• Steinthorsdottir et al. A variant in CDKAL1 influences insulin response and risk of type 2 diabetes. Nat Genet. 2007 Jun; 39:770-5.
• Wellcome Trust Case Control Consortium. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007 Jun 7; 447:661-78.
• Saxena et al. Genome-wide association analysis identifies loci for type 2 diabetes and triglyceride levels. Science. 2007 Jun 1; 316:1331-6.
• Zeggini et al. Replication of genome-wide association signals in UK samples reveals risk loci for type 2 diabetes. Science. 2007 Jun 1; 316:1336-41.
• Scott et al. A genome-wide association study of type 2 diabetes in Finns detects multiple susceptibility variants. Science. 2007 Jun 1;
Diabetes Genetics Initiative of Broad Institute of Harvard
and MIT, Lund University, and Novartis Institutes of
BioMedical Research, Science 2007 Jun
1;316(5829):1331-6
Association results from
WTCCC replication study
Zeggini, E. et al. Replication of genome-wide association signals in UK samples reveals risk loci
for type 2 diabetes. Science 316, 1336–1341 (2007).
Frayling TM. Nat Rev Genet 2007 Sep; 8:657-62
Transcription-factor 7-like 2 (TCF7L2)
• Major new diabetes gene
• Identified as a diabetes gene by Grant et al. Nat Genet 2006 March; 38: 320-323
• Not previously suspected to be involved in diabetes
• Known to influence levels of at least 60 other genes!
• Shown to have a role in insulin secretion (Lyssenko et al. J Clin Invest. 2007 Aug; 117:2155-63)
Replicated GWAS diabetes genes

Gene                   Chr  Reference
Previously known diabetes genes
TCF7L2                 10   Sladek, Steinthorsdottir, Scott
PPARG                  3    WTCCC, Scott
KCNJ11                 11   WTCCC, Scott, Saxena
Novel diabetes genes
SLC30A8                8    Sladek, Scott, Zeggini, Saxena
IGF2BP2                3    Saxena, WTCCC, Scott, Zeggini
CDKAL1                 6    Steinthorsdottir, Scott, Zeggini, Saxena
HHEX/IDE               10   Sladek, Scott, Zeggini, Saxena
CDKN2A/CDKN2B region   9    Saxena, WTCCC, Scott
FTO                    16   WTCCC, Scott, Zeggini

Frayling TM. Nat Rev Genet 2007 Sep; 8:657-62
Effect sizes of 11 confirmed
diabetes variants
Frayling TM. Nat Rev Genet
2007 Sep; 8:657-62
TCF7L2 results
SNP: rs7903146

Population                Case frequency  Control frequency  P-value        Odds ratio
Iceland                   39%             30%                1.6 x 10^-9    1.50
Denmark                   36%             27%                0.0018         1.46
U.S. (Caucasians)         40%             28%                1.6 x 10^-7    1.71
U.K. (Caucasians)         38%             31%                1.3 x 10^-11   1.35
Finland                   22%             18%                0.00042        1.33
France                    43%             31%                6.0 x 10^-35   1.69
Netherlands               37%             29%                4.4 x 10^-5    1.41
Europe (Caucasians)       36%             28%                <0.0001        1.54
U.K. (Indian)             34%             27%                0.002          1.53
U.S. (African American)   37%             28%                4.1 x 10^-6    1.51
West Africa               41%             21%                0.0021         1.45
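The odds ratios in the TCF7L2 table can be roughly checked from the case and control allele frequencies alone. The sketch below computes an unadjusted allelic odds ratio for three of the populations and sets it beside the reported value. Frequencies on the slide are rounded to whole percentages and the published odds ratios may come from adjusted models, so only approximate agreement should be expected.

```python
def allelic_odds_ratio(case_freq, control_freq):
    """Unadjusted odds ratio for carrying the risk allele,
    from allele frequencies in cases and controls."""
    case_odds = case_freq / (1 - case_freq)
    control_odds = control_freq / (1 - control_freq)
    return case_odds / control_odds

# (case frequency, control frequency, odds ratio reported on the slide)
populations = {
    "Iceland": (0.39, 0.30, 1.50),
    "U.S. (Caucasians)": (0.40, 0.28, 1.71),
    "France": (0.43, 0.31, 1.69),
}

computed = {}
for name, (case_f, ctrl_f, reported) in populations.items():
    computed[name] = allelic_odds_ratio(case_f, ctrl_f)
    print(f"{name}: computed {computed[name]:.2f}, reported {reported:.2f}")
```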
But this variant is rarer in East Asian and Native American populations
SNP: rs7903146

Population                Case frequency  Control frequency  P-value        Odds ratio
Iceland                   39%             30%                1.6 x 10^-9    1.50
Denmark                   36%             27%                0.0018         1.46
U.S. (Caucasians)         40%             28%                1.6 x 10^-7    1.71
U.K. (Caucasians)         38%             31%                1.3 x 10^-11   1.35
Finland                   22%             18%                0.00042        1.33
France                    43%             31%                6.0 x 10^-35   1.69
Netherlands               37%             29%                4.4 x 10^-5    1.41
Europe (Caucasians)       36%             28%                <0.0001        1.54
U.K. (Indian)             34%             27%                0.002          1.53
U.S. (African American)   37%             28%                4.1 x 10^-6    1.51
West Africa               41%             21%                0.0021         1.45
Mexico                    19%             16%                0.16           1.25
Hong Kong (Chinese)       3%              2%                 0.42           1.27

• However, other variants in the same gene are associated with diabetes
Investigation of “European” diabetes alleles in African Americans

Gene             SNP          European reported  Admixture-adjusted  Admixture-adjusted
                              risk allele        additive P-value    OR (95% CI)
PKN2             rs6698181    T                  0.388               1.08 (0.91-1.29)
IGF2BP2          rs4402960    T                  0.803               0.98 (0.87-1.11)
FLJ39370         rs17044137   A                  0.747               0.98 (0.86-1.12)
CDKAL1           rs10946398   C                  0.110               1.11 (0.98-1.26)
CDKAL1           rs7754840    C                  0.136               1.10 (0.97-1.25)
SLC30A8          rs13266634   C                  0.543*              1.46 (0.43-4.89)
CDKN2B/CDKN2A    rs564398     T                  0.320*              2.99 (0.34-25.98)
CDKN2B/CDKN2A    rs10811661   T                  0.128*              0.18 (0.02-1.64)
IDE/KIF11/HHEX   rs1111875    C                  0.767               1.02 (0.88-1.19)
IDE/KIF11/HHEX   rs5015480    C                  0.400               0.95 (0.83-1.08)
IDE/KIF11/HHEX   rs7923837    G                  0.303*              1.87 (0.57-6.12)
Intragenic       rs9300039    C                  0.029*              0.42 (0.19-0.91)
LOC387761        rs7480010    G                  0.084               1.18 (0.98-1.44)
EXT2/ALX4        rs1113132    C                  0.221*              0.47 (0.14-1.57)
EXT2/ALX4        rs11037909   T                  0.511               0.94 (0.79-1.13)
EXT2/ALX4        rs3740878    A                  0.129*              0.46 (0.17-1.26)
FTO              rs8050136    A                  0.783               1.02 (0.90-1.15)
TCF7L2*          rs7903146    T                  1.59 x 10^-6        1.39 (1.21-1.60)

*Dominant model (<10 counts for minor allele homozygote)
Lewis et al. Diabetes 2008 (in press)
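An odds ratio and its 95% confidence interval carry enough information to approximate the P-value. The sketch below assumes a Wald test on the log odds ratio: the standard error is taken from the width of the interval on the log scale, and the two-sided P-value follows from the normal distribution. It is applied to the TCF7L2 and FTO rows. Because the reported values are rounded and the study's own model is not described on the slide, the results should only be expected to be of the same order as the reported P-values.

```python
import math

def wald_p_from_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Approximate z statistic and two-sided P-value from an odds ratio
    and its 95% confidence interval (Wald test on the log scale)."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

z_tcf7l2, p_tcf7l2 = wald_p_from_ci(1.39, 1.21, 1.60)  # TCF7L2 rs7903146
z_fto, p_fto = wald_p_from_ci(1.02, 0.90, 1.15)        # FTO rs8050136

print(f"TCF7L2: z = {z_tcf7l2:.2f}, P = {p_tcf7l2:.2e}")
print(f"FTO:    z = {z_fto:.2f}, P = {p_fto:.2f}")
```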
Allele frequencies differ

Columns: risk allele frequency (RAF) in controls and cases for the African American data and for the reported European data, followed by power to detect association in African Americans.

                                           African American    Reported European   Power to detect
Gene             SNP          European     RAF       RAF       RAF       RAF       α=0.05   α=0.10
                              risk allele  controls  cases     controls  cases
PKN2             rs6698181    T            0.153     0.156     0.290     0.320     0.237    0.345
IGF2BP2          rs4402960    T            0.525     0.528     0.304     0.341     0.555    0.675
FLJ39370         rs17044137   A            0.329     0.326     0.230     0.270     0.060    0.115
CDKAL1           rs10946398   C            0.582     0.615     0.319     0.361     0.427    0.522
CDKAL1           rs7754840    C            0.585     0.616     0.360     0.387     0.427    0.552
SLC30A8          rs13266634   C            0.914     0.916     0.609     0.649     0.169    0.263
CDKN2B/CDKN2A    rs564398     T            0.934     0.943     0.558     0.595     0.140    0.225
CDKN2B/CDKN2A    rs10811661   T            0.933     0.927     0.850     0.872     0.304    0.422
IDE/KIF11/HHEX   rs1111875    C            0.766     0.774     0.522     0.546     0.371    0.495
IDE/KIF11/HHEX   rs5015480    C            0.633     0.621     0.425     0.379     0.470    0.595
IDE/KIF11/HHEX   rs7923837    G            0.917     0.929     0.597     0.622     0.143    0.229
Intragenic       rs9300039    C            0.889     0.884     0.892     0.924     0.584    0.701
LOC387761        rs7480010    G            0.858     0.890     0.301     0.336     0.062    0.117
EXT2/ALX4        rs1113132    C            0.915     0.920     0.733     0.763     0.475    0.600
EXT2/ALX4        rs11037909   T            0.862     0.859     0.729     0.760     0.913    0.953
EXT2/ALX4        rs3740878    A            0.907     0.914     0.728     0.760     0.760    0.846
FTO              rs8050136    A            0.446     0.452     0.398     0.455     0.711    0.808
TCF7L2*          rs7903146    T            0.284     0.354     0.181     0.227     0.997    0.999

Lewis et al. Diabetes 2008 (in press)
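Power figures like these depend on the risk allele frequency in the new population, the effect size carried over from the original report, and the sample size. The sketch below uses a normal approximation for a two-sided comparison of allele frequencies between cases and controls. The sample size (2,000 alleles per group) is a hypothetical placeholder because the slide does not give the study's sample sizes, and the method is a generic approximation rather than the calculation used by Lewis et al., so the output will not reproduce the table.

```python
import math
from statistics import NormalDist

def case_frequency(control_freq, odds_ratio):
    """Risk allele frequency in cases implied by the control frequency
    and an allelic odds ratio."""
    odds = odds_ratio * control_freq / (1 - control_freq)
    return odds / (1 + odds)

def allelic_power(control_freq, odds_ratio, n_case_alleles,
                  n_control_alleles, alpha=0.05):
    """Approximate power of a two-sided test comparing allele
    frequencies in cases and controls (normal approximation)."""
    nd = NormalDist()
    p0 = control_freq
    p1 = case_frequency(p0, odds_ratio)
    se = math.sqrt(p1 * (1 - p1) / n_case_alleles
                   + p0 * (1 - p0) / n_control_alleles)
    shift = abs(p1 - p0) / se
    z_crit = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf(shift - z_crit) + nd.cdf(-shift - z_crit)

# Effect size implied by the reported European frequencies for rs7903146.
reported_or = (0.227 / (1 - 0.227)) / (0.181 / (1 - 0.181))

# African American control frequency from the table; sample size is hypothetical.
power_05 = allelic_power(0.284, reported_or, 2000, 2000, alpha=0.05)
power_10 = allelic_power(0.284, reported_or, 2000, 2000, alpha=0.10)
print(f"power at alpha=0.05: {power_05:.3f}")
print(f"power at alpha=0.10: {power_10:.3f}")
```

The same function shows why power drops for variants whose risk allele is near fixation in the target population: the absolute frequency difference between cases and controls shrinks for a given odds ratio.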
Can genetic
information change
practice in the clinic?
Neonatal diabetes
• Mutations of the ATP-sensitive inwardly-rectifying potassium channel subunit Kir6.2 (KCNJ11) gene cause 30-58% of cases of diabetes diagnosed in patients under six months of age
• The majority of cases (80-90%) are de novo mutations, so won’t be identified on the basis of family history
Neonatal diabetes – KCNJ11 mutations
Pearson ER et al. N Engl J Med 2006, 355 (5), 467-477
• In the beta-cell, glucose metabolism increases intracellular ATP production from ADP
• This leads to the closure of ATP-sensitive potassium channels and membrane depolarization
• Subsequent activation of voltage-dependent calcium channels and influx of calcium results in insulin granule exocytosis
• Patients with KCNJ11 mutations have KATP channels with decreased sensitivity to ATP
  - Channels remain open in the presence of glucose
  - Reducing insulin secretion
Neonatal diabetes
Pearson ER et al. N Engl J Med 2006, 355 (5), 467-477
• Since patients present with hyperglycemia, undetectable C-peptide, and frequently have ketoacidosis (30%), they are often initially treated with insulin
• A study of 49 patients showed that 90% could successfully be treated with
Pharmacogenetics
Cytochrome P450 table
Stamer and Stuber. Genetic factors in pain and its treatment. Curr Opin Anaesthesiol. 2007 Oct;20(5):478-84.
• http://medicine.iupui.edu/flockhart/table.htm
Lanfear and McLeod. Pharmacogenetics: using DNA to optimize drug therapy. Am Fam Physician. 2007 Oct 15;76(8):1179-82.
Clinical trials
• Genetic testing may allow selective recruitment of participants in whom drug is expected to be most efficacious
• Lower costs to bring drug to market
• Will it be approved for a select genetic group?
Ethical issues
• Privacy
• Insurance
  - Health
  - Life
  - Disability
• Employment
You can’t change your genes –
Why does genetics matter?
• Identify new pathways involved in disease predisposition
• New “druggable” targets
• More specific diagnosis
• Pharmacogenetics
  - Identify genetic factors that influence an individual’s response to a particular therapy
  - Selection of therapies
  - Clinical trial design
• Outcomes
  - Recovery rates
  - Long-term sequelae
• Era of “personalized medicine”
You can’t change your genes –
Why does genetics matter?
• Better prediction of who is at greatest risk and targeted early intervention
PREVENTION
J. Craig Venter
Results from Venter’s Genome
• After QC filtering, 4.1 Million variants, 1.288M are novel to dbSNP (30%)
• SNPs, indels, inversions, segmental duplication, and more complex variation
• 78% of 4.1M are SNPs; the other 22% cover 9Mb of variant bases
• 62 Copy Number Variants = 10Mb
• Total of variation = 0.5% of genome
• Heterozygous Indels range from 1 - 321 bp
Levy et al, PLoS Biology, 2007
J. Craig Venter
Carries:
• A gene variant linked to moist ear wax production
• Genes linked to both heart disease (SORL1) and longevity
• Genes linked to
  - Alzheimer’s (APOE)
  - Macular degeneration
  - High cholesterol
• Carries up to seven gene types linked to tobacco addiction
‘Project Jim’
1.3 percent of Watson’s genome did not match the existing reference genome.
> 600,000 novel SNPs
< 68,000 insertions and deletions compared to the reference sequence, 3bp - 7kbases
Bio-IT World June 2007
• http://www.personalgenomes.org/
• 23andMe - Genetics Just Got Personal.
• Navigenics Home
