Advisory Committee update meeting

RARE GERMLINE VARIABILITY IN
PEDIATRIC LEUKEMIA.
Cancer Biology Series
January 29, 2013
Todd Druley, MD, PhD
Assistant Professor of Pediatrics and Genetics
Presenter Disclosure Information
Todd E. Druley, M.D., Ph.D.
Druley Lab / WUSM CGSSB
In compliance with ACCME policy, WU requires the following disclosures to the session audience:
Research Support/P.I.
No relevant conflicts of interest to declare
Employee
No relevant conflicts of interest to declare
Consultant
No relevant conflicts of interest to declare
Major Stockholder
No relevant conflicts of interest to declare
Speakers’ Bureau
No relevant conflicts of interest to declare
Scientific Advisory Board
No relevant conflicts of interest to declare
Why study rare variation?
• Whole genomes show 2-4 million variants PER PERSON!
• Only about 25 – 33% of these are common (>2% MAF).
• There are roughly 22,000 human genes
• This equals ~40,000,000 nucleotides total for all of our genes.
• ~1.5 % of the entire genome
• If 2 individual genomes differ by:
• 2M x 0.67 = 1,340,000 nucleotides
• There are 1.8 x 10^12 possible combinations between the two genomes!!
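The arithmetic on this slide can be reproduced in a few lines. This is a minimal sketch: the variable names are illustrative, the 2 million figure is the low end of the range quoted above, and reading the "possible combinations" as every rare site in one genome paired with every rare site in the other is an assumption that happens to match the slide's number.

```python
# Reproduce the back-of-envelope numbers from the slide.
variants_per_genome = 2_000_000   # low end of the 2-4 million range
rare_fraction = 0.67              # i.e. roughly one third of variants are common

rare_variants = round(variants_per_genome * rare_fraction)

# Assumed reading: pair every rare site in one genome with every rare
# site in the other genome.
pairwise_combinations = rare_variants ** 2

print(f"{rare_variants:,} rare variants per genome")
print(f"{pairwise_combinations:.1e} pairwise combinations")
```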
Common vs. Rare Variants
• Critical differences between common and rare variant analysis include:
• Rare variants have greater effect sizes [average OR=3.7]
(Bodmer Nat Genet 2008)
• Disruptive rare variants are more likely to act dominantly
(Fearnhead Cell Cycle 2005)
• Rare variants are individually rare, but collectively common when collapsed
(binned) within a genetic locus or metabolic pathway
(Cohen Science 2004; Ji Nat Genet 2008)
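Collapsing can be illustrated with a toy example. The sketch below is not the method of the cited papers. It assumes a hypothetical list of (sample, gene, MAF) calls and treats variants at or below 2% MAF as rare, then counts per gene how many samples carry at least one rare variant.

```python
from collections import defaultdict

# Hypothetical variant calls: (sample_id, gene, minor_allele_frequency).
calls = [
    ("s1", "GENE_A", 0.001),
    ("s2", "GENE_A", 0.004),
    ("s3", "GENE_A", 0.250),   # common variant, excluded below
    ("s3", "GENE_B", 0.003),
    ("s4", "GENE_A", 0.002),
    ("s4", "GENE_A", 0.0005),  # second rare variant in the same sample
]

RARE_MAF = 0.02  # assumed cutoff: variants at or below 2% MAF count as rare


def collapse_by_gene(variant_calls, max_maf=RARE_MAF):
    """Return {gene: set of samples carrying >= 1 rare variant in that gene}."""
    carriers = defaultdict(set)
    for sample, gene, maf in variant_calls:
        if maf <= max_maf:
            carriers[gene].add(sample)
    return dict(carriers)


carriers = collapse_by_gene(calls)
for gene, samples in sorted(carriers.items()):
    print(gene, len(samples), sorted(samples))
```

Each variant here is seen in a single sample, yet three of the four samples carry a rare variant in GENE_A once the calls are binned by gene.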
[Figure from Antonarakis SE et al. Nature Rev Genet 2009, with the labels “Private” and “We’re operating here”]
Example: Cystic Fibrosis
• Originally thought that only the ΔF508 mutation was causative for CF.
• Sequencing of the CFTR gene was initiated.
• Now over 1000 mutations in CFTR have been documented.
• Cause various severities of cystic fibrosis.
http://www.ccb.sickkids.ca/index.php/cystic-fibrosis-mutation-database.html
Complex diseases demonstrating increased rare variation
AJHG 80, 779-791; 2007
• Obesity
• High Cholesterol
• Sequenced two groups of 128 individuals each
• Psychiatric illness, cancer, autoimmune disorders, heart disease, height, extreme longevity, many others…
What about pediatric cancer?
• “Early onset cancer” is defined as cancer diagnosed before 50 years of age
• Germline “cancer causing gene alleles” (TP53, APC, BRCA1): average age of disease onset is in the 20s
• Cannot explain the incidence of pediatric cancer by somatic mutation.
• Epi studies have failed to explain exposures causing these cancers.
• Almost all pediatric cancer patients have a negative family history.
• So why do we see ~3 children/week with a new cancer??
Infant acute leukemia – worst outcomes
• ~50% mortality, 67% with MLL-rearrangements
• MLL regulates developmental transcription (HOX genes)
• Survivors often left with developmental problems
• COG AE24 “Epidemiology of Infant Leukemia”
• Largest case-control study to date looking for pre/perinatal
exposures associated with infant leukemia
• Topoisomerase II inhibitor exposure during pregnancy
• Only associated with AML, but didn’t impact survival
• Ross JA, J Natl Cancer Inst Monogr 2008
Pilot exome sequencing experiment
• GERMLINE exome sequencing from 25 pairs of mothers and
infants with MLL-negative acute leukemia
• Julie Ross, PhD (PI) and Amy Linabery, PhD.
• We are looking at genes with rare variants in affected infants, but
also inherited from mothers
• These parents typically don’t have leukemia or other cancers.
• We hypothesize a combinatorial effect from parental variants contributes
to the early onset/short latency of leukemia.
Demographics
25 pairs of Caucasian mothers and infants: 12 ALL, 13 AML
Table 1.
                                ALL                 AML
Sex
  Boys                          4                   6
  Girls                         8                   7
Avg age at diagnosis (months)   8.3 (0.6 - 11.4)    5.3 (1.6 - 11.4)
Avg maternal age (years)        31.9 (21.3-40.6)    33.4 (25.4-41.8)
No. mothers >35 yrs             3                   5
Validated bioinformatics
• We analyzed exome data using a validated bioinformatic pipeline:
• Align using Novoalign
• Call variants with SAMtools
• Sensitivity = 97%
• Specificity = 99.8%
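For reference, sensitivity and specificity are computed from true/false positive and negative counts. The counts below are made up for illustration and are not the validation data behind the 97% and 99.8% figures.

```python
def sensitivity(tp, fn):
    """Fraction of true variant sites that were called."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-variant sites that were correctly left uncalled."""
    return tn / (tn + fp)


# Hypothetical confusion-matrix counts, for illustration only.
tp, fn = 970, 30
tn, fp = 9980, 20

print(f"sensitivity = {sensitivity(tp, fn):.1%}")
print(f"specificity = {specificity(tn, fp):.1%}")
```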
Variant calls in COSMIC genes
• Prioritize by comparing our variant calls in genes already associated
with hematologic malignancies in the COSMIC database.
• http://www.sanger.ac.uk/genetics/CGP/cosmic/
1. ALL (126 ALL-associated genes)
Infants = 695 total variants (481 known, 214 novel)
Mothers = 728 total (588 known, 140 novel – 65%)
2. AML (657 AML-associated genes)
Infants = 5517 total (3961 known, 1556 novel)
Mothers = 4735 total (4264 known, 471 novel – 30%)
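The totals and percentages on this slide can be checked directly. The percentages beside the maternal novel counts match the ratio of maternal to infant novel variants. That reading is an assumption based on the arithmetic, not something the slide states.

```python
# Counts copied from the slide: (known, novel) per group.
counts = {
    "ALL": {"infants": (481, 214), "mothers": (588, 140)},
    "AML": {"infants": (3961, 1556), "mothers": (4264, 471)},
}

totals = {}
ratios = {}
for leukemia, groups in counts.items():
    for who, (known, novel) in groups.items():
        totals[(leukemia, who)] = known + novel
        print(f"{leukemia} {who}: total = {known + novel}")
    infant_novel = groups["infants"][1]
    mother_novel = groups["mothers"][1]
    # Assumed interpretation: maternal novel count relative to infant novel count.
    ratios[leukemia] = mother_novel / infant_novel
    print(f"{leukemia}: maternal/infant novel = {ratios[leukemia]:.0%}")
```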
Permutation testing
Average: ALL = 5 variant genes/infant, AML = 6 variant genes/infant
[Figure: two panels, each labeled “Null distribution”]
Both sets of infants have a statistically significant (P < 10^-7) enrichment
of novel, non-synonymous, deleterious germline variants in genes
associated with hematopoietic malignancies (COSMIC).
Mark Valentine
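A permutation test of this kind can be sketched as follows. This is an illustrative outline, not the analysis actually run: the gene universe, the variant-bearing genes, and the gene-set size are all synthetic, and the statistic is simply the number of variant-bearing genes that fall inside the gene set of interest.

```python
import random


def permutation_p_value(variant_genes, gene_set, universe, n_perm=10_000, seed=0):
    """Empirical P for seeing >= the observed number of variant genes in gene_set.

    The null distribution is built by drawing random gene sets of the same
    size from the universe and counting how many variant genes each contains.
    """
    rng = random.Random(seed)
    variant_genes = set(variant_genes)
    observed = len(variant_genes & set(gene_set))
    universe = list(universe)
    hits = 0
    for _ in range(n_perm):
        random_set = rng.sample(universe, len(gene_set))
        if len(variant_genes.intersection(random_set)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible P of zero.
    return observed, (hits + 1) / (n_perm + 1)


# Synthetic example: 2,000 genes, 100 carry a variant, 50 are "leukemia genes".
universe = [f"G{i}" for i in range(2000)]
variant_genes = universe[:100]
leukemia_genes = universe[:20] + universe[1000:1030]  # 20 of the 50 carry a variant

observed, p = permutation_p_value(variant_genes, leukemia_genes, universe)
print(observed, p)
```

With n_perm permutations the smallest P this estimator can report is 1/(n_perm + 1), so an empirical P below 10^-7 would need more than ten million permutations or a parametric approximation to the null.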
Validation
• No significant enrichment in randomly chosen gene sets in
infants
• No significant enrichment in random or leukemia gene sets in
Caucasian unaffected exomes
• Unlikely to see the same novel variant in only related mother :
infant pairs by chance.
• 45% in ALL; 23% in AML
• Consistent with maternal totals of 65% & 30%, respectively
• Sanger validation of other variants is ongoing
micro-RNA regulation?
• Many variant candidate genes are regulated by MIRs independently
associated with leukemia and cell cycle regulation:
Nick Sanchez
Pathway Analysis
• ABC transporters
• Developmental defects
• Chloride channel regulator activity
• Transcription factor dysregulation
• YY1, Cdx, HNF1, MAF, E2A
• TDG glycosylase mediated binding and cleavage of a thymine,
uracil or ethenocytosine opposite a guanine
Implications / Conclusions
• Supports the hypothesis that infants with leukemia are born with a
putatively functional enrichment of variation in genes associated
with leukemogenesis.
• Infants with AML have an excess of novel, nonsynonymous,
deleterious variation not from mother.
• Paternal age = de novo mutation during spermatogenesis?
• De novo mutation during embryogenesis?
• Can we identify discrete biological/developmental and regulatory
mechanisms leading to early onset leukemia?
• MIRs
• ABC transporters
• Specific transcription factors
Future work:
SHORT TERM:
1. Complete the bioinformatic analysis
2. Compare to existing data (TARGET and PCGP)
3. Exome sequencing of 25 MLL-positive pairs
LONG TERM:
1. Validate results in a second cohort of triads
2. Establish model systems to study complex genetic interactions
3. Integrate information into clinical trials?
High-risk pediatric ALL: Pooled sequencing
1. Patient germline (N=96)
2. Patient leukemia (N=96)
3. Unaffected controls (N=93)
55 genes per pool
Candidate genes for pooled sequencing
• 55 genes selected for pooled sequencing
• 43 were identified near significant tagged-SNPs on the prior array (asterisks)
• Various cellular functions
• All genes have been published in relation to pediatric ALL
Function: JAK tyrosine kinases; Oxidative stress response; DNA repair; Folate metabolism; Steroid metabolism; B-cell development; Hematopoietic stem cell differentiation; Cell cycle signaling; Thiopurine metabolism; Drug efflux; Cytochromes
Gene Name: JAK2 *, MLH1 *, MSH3, PAX5 *, NAT1 *, TCF3/E2A *, NAT2 *, EBF1, NQO1, LEF1 *, MPO, IKZF1 *, MTHFR, IKZF3 *, ETV6/TEL *, TYMS *, BTG1 *, GGH, ERG *, CCND1 *, ATM *, TPMT *, CCNC *, TNFAIP1, FHIT, IL10 *, BLNK *, NR3C1 *, HOXA7 *, PTPN11 *, HOXA9 *, NRAS *, HOXA10 *, KRAS2 *, FLT3 *, JAK3, RFC1 *, JAK1 *, HOXB4 *, AML1/RUNX1 *, RB1 *, MLL *, TP53 *, PBX1 *, MDM2 *, MEIS1 *, CDKN2A isoform 1 (P16ink4a) *, CDKN2A isoform 4 (P14arf) *, CDKN1A (P21cip1), CDK4, CDK6 *, ABCB1 *, CYP1A1 *, CYP2E1, CYP3A5 *
Pooled sequencing pilot project
• Sequenced 94.5% of coding regions from all three pools.
• 420 kb per person = 1.2 x 10^8 total bases covered
              Total Variants    Coverage/Allele
Unaffected    4209              80-fold
Germline      3929              86-fold
Leukemia      3822              101-fold
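The total-bases figure follows from the pool sizes given on the earlier slide (96 + 96 + 93). A quick check, with variable names chosen for illustration:

```python
# Pool sizes from the "Pooled sequencing" slide.
pool_sizes = {"germline": 96, "leukemia": 96, "unaffected": 93}
target_kb_per_person = 420

people = sum(pool_sizes.values())
total_bases = people * target_kb_per_person * 1000

print(people)                 # 285
print(f"{total_bases:.1e}")   # 1.2e+08
```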
SPLINTER v. GoldenGate MAF comparison
• Validation at 384 base positions by custom Illumina GoldenGate array
[Figure: three scatter plots of SPLINTER MAF against GoldenGate MAF, one each for the GERMLINE, CONTROL, and LEUKEMIA pools; R² values shown are 0.9179, 0.9139, and 0.9456]
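An R² like those reported can be computed as the squared Pearson correlation between the two sets of allele-frequency estimates. The sketch below uses made-up MAF values. Whether the reported R² values came from a fit of exactly this form is an assumption.

```python
from math import sqrt


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy / sqrt(sxx * syy)) ** 2


# Hypothetical per-position MAF estimates from the two platforms.
pooled_maf = [0.01, 0.05, 0.12, 0.20, 0.33, 0.45]
array_maf = [0.02, 0.04, 0.10, 0.23, 0.30, 0.47]

print(round(r_squared(pooled_maf, array_maf), 3))
```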
Overlap
• 49% of called variants are unique to the ALL Germline pool
• Only 2.5% of Leukemia variants were NOT seen in the Germline pool (97.5% overlap)
                                          Germline pool:        Leukemia pool:
                                          NOT in Unaffected     NOT in Germline
                                                                (somatic mutations)
Total variants                            1915 (49%)            96 (2.5%)
Coding substitutions                      233 (12%)             19 (20%)
  • [22 novel mutations in UTRs]
  • [5 within putative splice site]
Novel                                     175 (75%)             15 (79%)
  • Non-synonymous                        162 (70%)             14 (74%)
  • Synonymous                            71                    5
Damaging (per SIFT)                       89 (38%)              9 (47%)
                                          • 84 missense         • all missense
                                          • 5 nonsense
Coding Insertions/Deletions               9                     7
Causes protein dysfunction (per SIFT)?    6                     7
                                          • 3 MLL, 1 ATM,       • 6 MLL, 1 TCF3
                                            1 PAX5, 1 LEF1
[11 in UTR or splice site]
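The two comparisons here are set differences between pools. A minimal sketch, assuming each variant is keyed by (chromosome, position, alternate allele). The variants listed are invented for illustration.

```python
# Hypothetical variant sets per pool, keyed by (chrom, pos, alt allele).
unaffected = {("chr9", 100, "A"), ("chr11", 250, "T")}
germline = {("chr9", 100, "A"), ("chr11", 300, "G"), ("chr12", 40, "C")}
leukemia = {("chr9", 100, "A"), ("chr11", 300, "G"), ("chr19", 75, "T")}

# Variants seen in patient germline but not in unaffected controls.
germline_only = germline - unaffected
# Variants seen in leukemia but not in germline: candidate somatic mutations.
candidate_somatic = leukemia - germline

overlap = len(leukemia & germline) / len(leukemia)
print(sorted(germline_only))
print(sorted(candidate_somatic))
print(f"{overlap:.0%} of leukemia variants also seen in germline")
```

On the reported totals, 1915 of 3929 germline variants (49%) and 96 of 3822 leukemia variants (2.5%) fall in these two differences.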
Visualizing the dataset
[Figure: Leukemia SNPs (x), Germline SNPs (+), and Control SNPs (Δ) plotted along the amplicons, with a track for conservation across species (high to low)]
1. No variants in control group
2. Multiple variants in affected germline
3. Overlap with highly conserved region
Joe Giacalone
Mark Valentine
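The three criteria listed for this figure amount to a simple filter over amplicons. The sketch below is hypothetical: the record fields, the conservation score scale, and the thresholds (at least two germline variants, conservation of 0.8 or higher) are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Amplicon:
    name: str
    control_variants: int
    germline_variants: int
    conservation: float  # assumed scale: 0 (low) to 1 (high)


def is_candidate(a, min_germline=2, min_conservation=0.8):
    """Apply the three criteria with assumed thresholds."""
    return (
        a.control_variants == 0                  # 1. no variants in control group
        and a.germline_variants >= min_germline  # 2. multiple germline variants
        and a.conservation >= min_conservation   # 3. highly conserved region
    )


amplicons = [
    Amplicon("amp1", control_variants=0, germline_variants=3, conservation=0.95),
    Amplicon("amp2", control_variants=1, germline_variants=4, conservation=0.90),
    Amplicon("amp3", control_variants=0, germline_variants=1, conservation=0.99),
    Amplicon("amp4", control_variants=0, germline_variants=2, conservation=0.40),
]

candidates = [a.name for a in amplicons if is_candidate(a)]
print(candidates)
```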
Exome variant server overlay
Drew Hughes
All looking at known ancestral polymorphisms and the incidence of acute
leukemia.
• None involve sequencing to demonstrate novel/rare variants in the same
genes.
Overexpressed genes:
1. ATM
2. CDKN1A
3. CYP1A1
4. CYP3A5
5. IKZF1
6. MDM2
7. MLL
8. MTHFR
9. NAT2
10. NQO1
11. PAX5
12. PTPN11
13. TCF3
14. TPMT
6 of 14 overexpressed genes (43%)
are involved in drug metabolism.
Additional gene expression profiles
• Similar expression
differences in 18
additional genes (5
overexpressed
CYPs).
• All genes possess
≥1 novel coding
variant in P9906
patients.
• No clear connection
between genetic
variation and gene
expression.
Drew Hughes
Implications / Conclusions:
• Overexpression of specific genes involved in metabolism of anti-leukemia
agents identifies a subgroup of children with inferior EFS.
• Private sequence variation in drug/energy metabolism genes is not
coupled to expression profiles, but may predispose to leukemia or
modulate therapeutic response through defective metabolism.
• Pathogenesis vs. pharmacogenomics?
• Therapeutic implications:
• Can look for these genomic signatures at diagnosis; existing precedent
• Dose modification or direct to bone marrow transplant
Future work:
1. Validation and identification of individual profiles.
   • Delve more into the underexpressed genes as well.
2. Analyze sequencing results of ~700 additional drug/energy metabolism genes.
3. Functional iPSC-based assays from patient fibroblasts.
4. Introduction into immune-deficient mice for functional study.
Acknowledgements & Funding
Wash U:
• Bob Hayashi
• Alan Schwartz
• Rob Mitra
• F. Sessions Cole
COG:
• Julie Ross
• Logan Spector
• Mignon Loh
• Rick Harvey
1K08CA140720-01A1
Druley Lab:
• Nick Sanchez
• Mark Valentine
• Joe Giacalone
• Drew Hughes
• Andrew Young
Eli Seth Matthews Leukemia Foundation™