PGx - BASS VIII, December 2001 - American Statistical Association


Complex Adaptive Systems and Human Health:
Statistical Approaches in Pharmacogenomics
Kim E. Zerba, Ph.D.
Bristol-Myers Squibb
FDA/Industry Statistics Workshop
Statistics: From Theory to Regulatory Acceptance
18-19 September 2003
Bethesda, Maryland
Disclaimer
The views presented are my own and do not necessarily represent those of Bristol-Myers Squibb.
Outline
• Complex Adaptive Systems and Human Health
• Approach and Some Key Statistical Issues with Genetic Polymorphisms in Pharmacogenomics
• Where Do We Go from Here?
The Genetic Paradigm
[Figure: diagram of the genetic paradigm linking Gene (DNA), RNA, Protein, Phenotype, and Disease.]
Non-Infectious Human Disease Load
• Complex, multifactorial: > 98%
• Simple, monogenic: < 2%
Complex Adaptive Systems and Human Health
[Figure: the individual in a time-space continuum. Labels: unique genome type (initial conditions); unique environmental history; fields of blood pressure regulation, haemostasis, carbohydrate metabolism, and lipid metabolism; physiological fitness and health of the individual now; future norm of reaction; risk of disease (- to +).]
• Each individual is a complex adaptive system and the fundamental unit of organization
• Health or disease is an emergent feature based on interactions among many agents, including genes and environments
• Agents participate in dynamic network and are not direct causes
• Network organized hierarchically and heterarchically into fields
• Fields are domains of relational order among agents
• Stronger relationships within fields, weaker relationships among fields
• Unique genome type provides initial conditions and capacity for change
• Context and time are key to understanding influence of genetic variation
See Zerba and Sing, 1993, Current Opinion in Lipidology 4: 152-162, and Zerba et al., 2000, Human Genetics 107: 466-475, for more detail.
Complex Adaptive Systems Approach to PGx
[Figure: diagram relating Genes, Biomarkers, and Endpoints; the links among them are numbered 1, 2, and 3, and each is marked "?".]
Some Key Statistical Issues for Pharmacogenomics Studies Using Genetic Polymorphisms
• Gene/Polymorphism Selection
• Linkage Disequilibrium
• Admixture and Population Stratification
• Invariance
• Context Dependence
• Time
Gene/Polymorphism Selection
• Genome Scan
 – Genes not identified a priori
 – Genotyping
  – 25K - 500K polymorphisms genotyped for each subject (not practical yet)
 – DNA Pooling
  – 25K - 1.5 million polymorphisms
  – Case-control allele frequency differences for each polymorphism
• Candidate Genes
Candidate Genes
[Figure: a candidate gene region containing F, an unknown and unmeasured functional polymorphism linked to phenotype P, and S_i, one of numerous non-functional polymorphisms.]
Assume that any association of S_i with phenotype, P, is because of linkage disequilibrium between F and S_i:
$P_{FS} = p_F p_S + D_{FS}$
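The relation above can be rearranged to give the disequilibrium coefficient, D_FS = P_FS - p_F p_S. The following is a minimal sketch of that arithmetic; the function name and all frequencies are hypothetical values chosen for illustration, not data from the talk.

```python
def ld_coefficient(p_fs: float, p_f: float, p_s: float) -> float:
    """Return D_FS = P_FS - p_F * p_S for one pair of alleles."""
    return p_fs - p_f * p_s


# Hypothetical frequencies, for illustration only.
p_f = 0.3    # frequency of the functional allele F
p_s = 0.4    # frequency of the marker allele S_i
p_fs = 0.20  # frequency of the haplotype carrying both F and S_i

d_fs = ld_coefficient(p_fs, p_f, p_s)
print(f"D_FS = {d_fs:.3f}")

# Under linkage equilibrium the haplotype frequency is just the product,
# so D_FS is zero and S_i carries no information about F.
d_equilibrium = ld_coefficient(p_f * p_s, p_f, p_s)
print(f"D_FS at equilibrium = {d_equilibrium:.3f}")
```

A non-zero D_FS is what lets a measured, non-functional marker stand in for the unmeasured functional polymorphism.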
Admixture
[Figure: SNP alleles (+ and -) sampled from Population I (p+ = 0.8), Population II (p+ = 0.2), and the admixed population (a = proportion of population I = 0.5; p+ = 0.5).]
Consider two subpopulations, I and II:
For each subpopulation, there is linkage equilibrium between a disease allele, F, and a marker allele, S,
$P_{F_I S_I} = p_{F_I} p_{S_I}$; $P_{F_{II} S_{II}} = p_{F_{II}} p_{S_{II}}$; $D_{F_I S_I} = D_{F_{II} S_{II}} = 0$.
Admixture
In the admixed population (I + II), there is linkage disequilibrium between F and S,
$P_{FS} = p_F p_S + a(1-a)(p_{F_I} - p_{F_{II}})(p_{S_I} - p_{S_{II}})$
where the second term is the linkage disequilibrium, $a(1-a)$ comes from the subpopulation proportions, $(p_{F_I} - p_{F_{II}})$ is the disease allele frequency difference, and $(p_{S_I} - p_{S_{II}})$ is the marker allele frequency difference.
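The admixture formula can be checked numerically. The sketch below computes the disequilibrium term two ways: from the closed-form expression, and directly from the pooled haplotype and allele frequencies. The mixing proportion a = 0.5 and marker frequencies 0.8 and 0.2 follow the admixture figure; the disease allele frequencies (0.8 and 0.2) and the function names are assumptions made for illustration.

```python
def admixture_ld(a, p_f1, p_f2, p_s1, p_s2):
    """Closed form: D = a(1-a)(pF_I - pF_II)(pS_I - pS_II)."""
    return a * (1 - a) * (p_f1 - p_f2) * (p_s1 - p_s2)


def direct_ld(a, p_f1, p_f2, p_s1, p_s2):
    """Same quantity from pooled frequencies, D = P_FS - p_F * p_S.

    Each subpopulation is in linkage equilibrium, so its F-S haplotype
    frequency is the product of its own allele frequencies.
    """
    p_fs = a * p_f1 * p_s1 + (1 - a) * p_f2 * p_s2
    p_f = a * p_f1 + (1 - a) * p_f2
    p_s = a * p_s1 + (1 - a) * p_s2
    return p_fs - p_f * p_s


a = 0.5                 # proportion of population I (from the figure)
p_s1, p_s2 = 0.8, 0.2   # marker allele frequencies (from the figure)
p_f1, p_f2 = 0.8, 0.2   # disease allele frequencies (assumed for illustration)

d_formula = admixture_ld(a, p_f1, p_f2, p_s1, p_s2)
d_direct = direct_ld(a, p_f1, p_f2, p_s1, p_s2)
print(f"closed form: {d_formula:.4f}, direct: {d_direct:.4f}")
```

If either allele has the same frequency in both subpopulations, one of the difference terms is zero and mixing creates no disequilibrium.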
Admixture and Population Stratification
• Admixture linkage disequilibrium dissipates quickly in a randomly mating population
• Common clinical trial feature: > 1 ethnic group
 – Population stratification
• Ethnicity is a confounder
 – Population stratification can create linkage disequilibrium just as admixture does, but the resulting association is spurious
 – Type I or Type II error inflation
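The statement that admixture disequilibrium dissipates quickly can be illustrated with the standard population-genetics result that, under random mating, D shrinks by a factor of (1 - r) each generation, where r is the recombination fraction between the two loci. This is a textbook relation rather than something stated on the slides, and the starting value and function name below are illustrative assumptions.

```python
def ld_after(d0: float, r: float, generations: int) -> float:
    """Disequilibrium after t generations of random mating: D_t = D_0 (1 - r)^t."""
    return d0 * (1 - r) ** generations


d0 = 0.09  # illustrative starting disequilibrium created by admixture

# r = 0.5: unlinked loci; r = 0.01: tightly linked loci.
for r in (0.5, 0.01):
    trajectory = [ld_after(d0, r, t) for t in range(6)]
    print(f"r = {r}: " + ", ".join(f"{d:.4f}" for d in trajectory))
```

For unlinked loci the disequilibrium halves every generation, which is the sense in which it dissipates quickly; for tightly linked loci the decay is much slower.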
False-Positive Endpoint Association Example
Ethnic group: not considered in analysis

                     + Carriers   Non-carriers
Ethnic Group I
  Cases                  13            3
  Controls                3            1
Ethnic Group II
  Cases                   0            2
  Controls                2            6
Margin
  Cases                  13            5
  Controls                5            7
Margin OR = 3.6

• Unbalanced design
 – Unequal numbers of each group: aI = 0.67
 – Marker allele: p+ = 0.8 in ethnic group I; p+ = 0.2 in ethnic group II
 – Disease risk: pF = 0.8 for ethnic group I; pF = 0.2 for ethnic group II
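The marginal odds ratio in this example can be reproduced from the counts, and comparing it with a stratified estimate shows why ignoring ethnic group misleads. The sketch below assumes the second column of each table holds non-carriers. The Mantel-Haenszel estimator is a standard stratified method that the slides do not mention; it is used here only as one possible way to adjust for the stratification, and the function names are hypothetical.

```python
def odds_ratio(a, b, c, d):
    """OR for a 2x2 table: cases (a carriers, b non-carriers), controls (c, d)."""
    return (a * d) / (b * c)


def mantel_haenszel_or(tables):
    """Mantel-Haenszel common odds ratio across strata of 2x2 tables."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den


# Counts from the example: (case carriers, case non-carriers,
#                           control carriers, control non-carriers)
group_1 = (13, 3, 3, 1)
group_2 = (0, 2, 2, 6)

# Collapsing over ethnic group gives the marginal table.
pooled = tuple(x + y for x, y in zip(group_1, group_2))

pooled_or = odds_ratio(*pooled)
stratified_or = mantel_haenszel_or([group_1, group_2])

print(f"marginal table: {pooled}")
print(f"marginal OR: {pooled_or:.2f}")
print(f"Mantel-Haenszel OR: {stratified_or:.2f}")
```

The marginal odds ratio is well above 1, while the stratified estimate is not, so the apparent association comes from the two groups differing in both marker frequency and disease risk.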
Population Genetic Structure and the Search for Functional Mutations: Quantitative Traits
[Figure: phenotype (biomarker) distributions by SNP genotype (AA, Aa, aa), annotated with FREQUENCY and SCALE, each marked "?"; the SNP is labelled "Functional Mutation?".]
FREQUENCY and SCALE contribute to inferences about SNP-phenotype associations:
Analysis of Variance Approach
$SSR = \sum_i f_i (\bar{Y}_i - \bar{Y})^2$
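To see how frequency and scale each enter the sum of squares, the sketch below evaluates SSR for a few hypothetical cases. It reads f_i as the relative frequency of genotype i, Y_i as the genotype mean, and Y as the frequency-weighted overall mean; these readings, the Hardy-Weinberg genotype frequencies, and all numbers are assumptions made for illustration.

```python
def ssr(freqs, means):
    """SSR = sum_i f_i (Y_i - Y)^2, with Y the frequency-weighted overall mean."""
    grand_mean = sum(f * y for f, y in zip(freqs, means))
    return sum(f * (y - grand_mean) ** 2 for f, y in zip(freqs, means))


def hardy_weinberg(p):
    """Genotype frequencies (AA, Aa, aa) for allele frequency p of A."""
    return [p * p, 2 * p * (1 - p), (1 - p) ** 2]


# Hypothetical genotype means for AA, Aa, aa (e.g. a biomarker in mg/dL).
means = [190.0, 200.0, 210.0]

ssr_common = ssr(hardy_weinberg(0.5), means)    # both alleles common
ssr_rare = ssr(hardy_weinberg(0.95), means)     # same scale, one allele rare
ssr_wide = ssr(hardy_weinberg(0.5), [180.0, 200.0, 220.0])  # same frequency, larger scale

print(f"common allele: {ssr_common:.2f}")
print(f"rare allele:   {ssr_rare:.2f}")
print(f"larger scale:  {ssr_wide:.2f}")
```

The same genotype effects explain far less variation when one allele is rare, so a weak SNP-phenotype association may reflect frequency rather than a small effect.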
Population Stratification and Genotype Frequencies
[Figure: genotype frequencies (P_AA, P_Aa, P_aa) at a SNP with alleles A and a (frequencies p_A, p_a), shown for Ethnic Group I, Ethnic Group II, and as average genotype frequencies.]
• Stratification can result in decreased heterozygote frequencies relative to expectation:
$P_{Aa} = 2 p_A p_a - 2 D_A$ ($D_A$ positive in example)
• Population stratification can result in overestimation of quantitative phenotypic variation associated with genetic variation relative to Hardy-Weinberg equilibrium expectation
[Figure: plot against D_A (0 to +), with the vertical axis running from - through 0 to +.]
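The heterozygote deficit can be worked through with two groups that are each in Hardy-Weinberg equilibrium. The sketch below pools them, compares the pooled heterozygote frequency with the Hardy-Weinberg expectation at the pooled allele frequency, and solves P_Aa = 2 p_A p_a - 2 D_A for D_A. The group allele frequencies (0.8 and 0.2), the equal mixing proportion, and the names are assumptions for illustration.

```python
def hwe_genotypes(p_a):
    """Hardy-Weinberg genotype frequencies (AA, Aa, aa) for allele frequency p_a of A."""
    return (p_a ** 2, 2 * p_a * (1 - p_a), (1 - p_a) ** 2)


mix = 0.5          # assumed proportion of ethnic group I in the sample
p1, p2 = 0.8, 0.2  # assumed frequency of allele A in groups I and II

g1 = hwe_genotypes(p1)
g2 = hwe_genotypes(p2)

# Average genotype frequencies in the stratified sample.
pooled = tuple(mix * x + (1 - mix) * y for x, y in zip(g1, g2))

# Pooled allele frequency of A: homozygotes plus half the heterozygotes.
p_pooled = pooled[0] + pooled[1] / 2

expected_het = 2 * p_pooled * (1 - p_pooled)
observed_het = pooled[1]

# Rearranged from P_Aa = 2 pA pa - 2 D_A.
d_a = (expected_het - observed_het) / 2

print(f"observed heterozygotes: {observed_het:.2f}")
print(f"expected under HWE:     {expected_het:.2f}")
print(f"D_A:                    {d_a:.2f}")
```

In this setup D_A equals mix(1 - mix)(p1 - p2)^2, the variance in allele frequency between the groups, so it is positive whenever the groups differ.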
Invariance, Context and Time
An example from Apolipoprotein E Biology
• Molecular weight: 34 kD
• Synthesized in most organs
 – liver, brain, gonads, kidney, spleen, muscle
• Key physiological role in lipid transport
 – ligand for the LDL (ApoB-E) receptor
• Structural gene on chromosome 19
 – polymorphic with three common alleles

[Figure: ApoE gene, 5’ to 3’, with SNPs at amino acid (AA) positions 112 and 158.]
Allele   AA 112   AA 158
ε2       Cys      Cys
ε3       Cys      Arg
ε4       Arg      Arg
Note: combination of SNPs involved
Invariance
[Figure: cholesterol (mg/dL; axis from -20 to 10) by ApoE allele (ε2, ε3, ε4) in five populations: Quebec, Canada (N = 201); Nancy, France (N = 223); Munster, Germany; Helsinki, Finland; and Rochester, MN, USA, with N = 226, N = 1000, and N = 207 among the latter three.]
From Sing et al. (1996) Genetic architecture of common multifactorial diseases, pp. 211-232. In: Chadwick and Cardew (eds.) Variation in the human genome, Ciba Foundation Symposium 197, John Wiley & Sons, New York.
Context and Time
Changes in ApoE Additive Genetic Variance with Age
Rochester, MN; Males, N = 1035
[Figure, first panel: additive genetic variance, $\sigma^2_A$ (variance x 10^-4, axis from 0 to 16), against age window midpoint (years, 10 to 70).]
[Figure, second panel: bootstrap significance tests, with age window midpoint (years, 10 to 70) on both axes. One symbol (+) marks 0.05 < P < 0.10; a second symbol marks P < 0.05.]
From Zerba et al. 1996, Genetics 143: 463-478.
Where Do We Go From Here?
Some Additional Statistical Challenges
• Study design in genetic setting
• Genetic stratification
• Genomic control
• Ascertainment bias correction in choice of which polymorphisms to study
• Contexts/Interactions: which ones are important?
• New analytical methods needed
 – Combinations of SNPs within and among genes and environments may be involved
 – Haplotype Reconstruction
 – Combinatorial Partitioning
• Missing genotypes for individual polymorphisms
• Sampling vs technical variability in DNA pooling studies
• Multiplicity: p-value adjustment not a trivial problem
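As a point of reference for the multiplicity challenge, the sketch below applies two standard family-wise adjustments, Bonferroni and Holm, to a small set of p-values. Neither method is named on the slides, and the p-values are hypothetical. Both adjustments ignore correlation among polymorphisms, such as that induced by linkage disequilibrium, which is one reason the problem is harder than these simple corrections suggest.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply each by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1);
        # the running maximum keeps the adjusted values monotone.
        running_max = max(running_max, (m - rank) * p_values[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted


# Hypothetical p-values, one per polymorphism tested.
p_values = [0.001, 0.012, 0.03, 0.04, 0.2]

bonf = bonferroni(p_values)
holm_adj = holm(p_values)

for p, b, h in zip(p_values, bonf, holm_adj):
    print(f"raw {p:.3f}  Bonferroni {b:.3f}  Holm {h:.3f}")
```

Holm is never more conservative than Bonferroni, but with tens of thousands of polymorphisms either adjustment becomes severe.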