PARADIGM-Shift Calculation Overview


Predicting the impact of mutations using pathway-guided integrative genomics
AACR Annual Meeting 2012
Major Symposium: Designing Rational Combination
Therapy for Cancer
Josh Stuart, UC Santa Cruz
Apr 3, 2012
Disclosure
SAB for Five3 Genomics
Overview of pathway-guided approach
Integrate many data sources to gain accurate view of
how genes are functioning in pathways
Predict the functional consequences of mutations by
quantifying the effect on the surrounding pathway
Use pathway signatures to implicate mutations in
novel genes to (re-)focus targeting
Identify critical “Achilles Heels” in the pathways that
distinguish a particular sub-type
Flood of Data Analysis Challenges
Genomics, Functional Genomics, Metabolomics, Epigenomics = Multiple, Possibly Conflicting Signals
Data types: Structural Variation, Copy Number Alterations, Exome Sequences, Expression, DNA Methylation
This is What it Does to You
Analysis of disease samples like automotive repair (or detective work or other sleuthing)
Sleuths use as much knowledge as possible.
Patient Sample 1
Patient Sample 2
Patient Sample 3
…
Patient Sample N
Much Cell Machinery Known:
Gene circuitry now available.
Curated and/or Collected
Reactome
KEGG
Biocarta
NCI-PID
Pathway Commons
…
Integration key to interpret gene function
Expression not always an indicator of activity
Downstream effects often provide clues
Expression of 3 transcription factors:
TF 1: high expression. Inference: TF is ON (expression reflects activity)
TF 2: high expression. Inference: TF is OFF (high expression but inactive)
TF 3: low expression. Inference: TF is ON (low expression but active)
Integration key to interpret gene function
Need multiple data modalities to get it right.
Expression -> TF ON
BUT, targets are amplified
Copy Number -> TF OFF
Lowers our belief in active TF because explained away by cis evidence.
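The "explaining away" effect on this slide can be illustrated with a tiny two-cause model. This is a minimal sketch, not the PARADIGM model: the priors, the noisy-OR form, and every probability below are hypothetical values chosen only for illustration.

```python
from itertools import product

# Hypothetical priors (illustrative only, not taken from the talk).
P_TF_ACTIVE = 0.3   # prior belief that the TF is active
P_AMPLIFIED = 0.2   # prior belief that the targets are copy-number amplified

def p_high_expr(tf_active, amplified):
    """P(targets highly expressed | causes), modeled as a noisy-OR."""
    p_not = 0.95            # 5% chance of high expression with no cause
    if tf_active:
        p_not *= 0.1        # an active TF drives expression 90% of the time
    if amplified:
        p_not *= 0.2        # amplification (cis effect) does so 80% of the time
    return 1.0 - p_not

def posterior_tf_active(amplified_observed=None):
    """P(TF active | targets highly expressed [, observed amplification])."""
    num = den = 0.0
    for tf, amp in product([0, 1], repeat=2):
        if amplified_observed is not None and amp != amplified_observed:
            continue
        joint = ((P_TF_ACTIVE if tf else 1 - P_TF_ACTIVE)
                 * (P_AMPLIFIED if amp else 1 - P_AMPLIFIED)
                 * p_high_expr(tf, amp))
        den += joint
        if tf:
            num += joint
    return num / den

expression_only = posterior_tf_active()
with_copy_number = posterior_tf_active(amplified_observed=1)
print(f"P(TF on | high expression)            = {expression_only:.3f}")
print(f"P(TF on | high expression, amplified) = {with_copy_number:.3f}")
```

Once amplification is observed, it accounts for much of the high target expression, so the posterior belief in an active TF drops.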
Probabilistic Graphical Models:
A Language for Integrative Genomics
Nir Friedman, Science (2004) - Review
• Generalize HMMs, Kalman Filters, Regression, Boolean Nets, etc.
• Language of probability ties together multiple aspects of gene function & regulation
• Enable data-driven discovery of biological mechanisms
• Foundation: J. Pearl, D. Heckerman, E. Horvitz, G. Cooper, R. Shachter, D. Koller, N. Friedman, M. Jordan, …
• Bioinformatics: D. Pe’er, A. Hartemink, E. Segal, E. Schadt, …
Integration Approach: Detailed models of gene expression and interaction
MDM2, TP53
Two Parts:
1. Gene Level Model (central dogma)
2. Interaction Model (regulation)
PARADIGM Gene Model to Integrate Data
1. Central Dogma-Like Gene Model of Activity
2. Interactions that connect to specific points in gene regulation map
Vaske et al. 2010. Bioinformatics
Charlie Vaske
Steve Benz
Integrated Pathway Analysis for Cancer
Cohort
Multimodal Data
Pathway Model
of Cancer
Inferred Activities
CNV
mRNA
meth
…
Integrated dataset for downstream analysis
Inferred activities reflect neighborhood of influence
around a gene.
Can boost signal for survival analysis and mutation
impact
TCGA Ovarian Cancer
Inferred Pathway Activities
Pathway Concepts (867)
Patient Samples (247)
TCGA Network. 2011. Nature (led by Paul Spellman)
Ovarian: FOXM1 pathway altered
in majority of serous ovarian tumors
Patient Samples (247)
Pathway Concepts (867)
FOXM1 Transcription Network
TCGA Network. 2011. Nature (led by Paul Spellman)
FOXM1 central to cross-talk between
DNA repair and cell proliferation
in Ovarian Cancer
TCGA Network. 2011. Nature (led by Paul Spellman)
PATHMARK: Identify Pathway-based “markers” that underlie sub-types
Insight from contrast
Identify sub-pathways that distinguish patient sub-types (e.g. mutant vs. non-mutant, response to drug, etc.)
Predict mutation impact on pathway “neighborhood”
Identify master control points for drug targeting.
Predict outcomes with quantitative simulations.
Sam Ng
Ted Goldstein
Pathway signatures of mutations to reveal
therapeutic candidates
Mutated genes are the focus of many targeted approaches.
Some patients with “right” mutation don’t respond. Why?
Many cancers have one of several “novel” mutations. Can
these be targeted with current approaches?
Pathway-motivated approaches:
• Distinguish gain-of-function from loss-of-function.
• Compare novel signatures
Sam Ng
Poster # 2985
PARADIGM-Shift: Pathway context of GOF and LOF events
Use pathways to predict the impact of observed mutations in
patient tumors
Legend: High Inferred Activity, Low Inferred Activity
FG (focus gene): Predicted Loss-Of-Function
FG (focus gene): Predicted Gain-Of-Function
Sam Ng
PARADIGM-Shift Predicting the Impact of Mutations On Genetic Pathways
Legend: High Inferred Activity, Low Inferred Activity
Inference using all neighbors of the FG (mutated gene)
Inference using downstream neighbors of the FG
Inference using upstream neighbors of the FG
SHIFT
Sam Ng
PARADIGM-Shift Calculation Overview
1. Identify Local Neighborhood of the FG
2a. Regulators Run: infer FG activity from its regulators
2b. Targets Run: infer FG activity from its targets
3. Calculate Difference between the two FG inferences: the P-Shift Score (LOF in this example)
Sam Ng
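The difference step can be sketched in a few lines. This is a minimal illustration rather than the PARADIGM-Shift implementation. It assumes the score is the targets-run inference minus the regulators-run inference, a sign convention chosen so that loss-of-function comes out negative, in line with the negative signal score the talk reports for RB1. The activity values and the 0.5 calling threshold are hypothetical.

```python
from typing import Dict

def p_shift_scores(targets_run: Dict[str, float],
                   regulators_run: Dict[str, float]) -> Dict[str, float]:
    """Per-sample difference between the two inferences of the focus gene (FG).

    Assumed convention: targets-run minus regulators-run, so an FG whose
    targets look less active than its regulators predict scores below zero.
    """
    return {sample: targets_run[sample] - regulators_run[sample]
            for sample in targets_run if sample in regulators_run}

# Hypothetical inferred FG activities for three samples.
regulators_run = {"s1": 0.8, "s2": 0.7, "s3": 0.1}
targets_run = {"s1": -0.6, "s2": 0.6, "s3": 0.9}

scores = p_shift_scores(targets_run, regulators_run)

THRESHOLD = 0.5  # hypothetical cutoff, for illustration only
calls = {
    sample: "LOF-like" if score < -THRESHOLD
    else "GOF-like" if score > THRESHOLD
    else "no shift"
    for sample, score in scores.items()
}
print(calls)
```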
P-Shift Predicts RB1 Loss-of-Function in GBM
Tracks: Shift Score, PARADIGM downstream, PARADIGM upstream, Expression, Mutation (RB1)
Sam Ng
RB1 Network (GBM)
Focus Gene Key: P-Shift, T-Run, R-Run, Expression, Mutation
Neighbor Gene Key: Activity, Expression, RB1 Mutation
Sam Ng
RB1 Discrepancy Scores distinguish
mutated vs non-mutated samples
Signal Score (t-statistic) = -5.78
Sam Ng
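A signal score of this kind can be computed as a two-sample t-statistic over per-sample shift scores. The sketch below uses Welch's unequal-variance form as an assumption, since the slide does not say which t-test was used, and the shift scores are made-up numbers.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t-statistic comparing the means of two lists of scores."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))

# Hypothetical P-Shift scores for samples with and without a mutation in the FG.
mutated = [-1.2, -0.9, -1.5, -0.7]
non_mutated = [0.1, -0.2, 0.3, 0.0, 0.2, -0.1]

signal = welch_t(mutated, non_mutated)
print(f"signal score (t-statistic) = {signal:.2f}")
```

A strongly negative value means the mutated samples have lower shift scores than the rest, which is the loss-of-function direction under the sign convention assumed here.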
RB1 discrepancy distinction is significant
Given the same network topology, how likely would we be to call a gain/loss of function?
Background model: permute gene labels in our dataset
Compare observed signal score to signal scores (SS) obtained from background model
Observed SS
Background SS
Sam Ng
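The background comparison can be sketched as a label-permutation test. This is a toy stand-in under stated assumptions: the "shift" here is a crude mean of target values minus mean of regulator values rather than a PARADIGM inference, the t-statistic is Welch's form, and the data, gene names (`g0`, `g1`, ...), and permutation count are all synthetic.

```python
import random
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))

def signal_score(data, regulators, targets, mutated):
    """Crude stand-in: per-sample (mean target - mean regulator) value,
    then a t-statistic between mutated and non-mutated samples."""
    n_samples = len(next(iter(data.values())))
    shift = [mean(data[g][i] for g in targets) - mean(data[g][i] for g in regulators)
             for i in range(n_samples)]
    mut = [shift[i] for i in range(n_samples) if i in mutated]
    rest = [shift[i] for i in range(n_samples) if i not in mutated]
    return welch_t(mut, rest)

def background_scores(data, regulators, targets, mutated, n_perm, rng):
    """Keep the topology fixed but permute which gene's data sits under each label."""
    genes = list(data)
    scores = []
    for _ in range(n_perm):
        shuffled = genes[:]
        rng.shuffle(shuffled)
        permuted = {g: data[h] for g, h in zip(genes, shuffled)}
        scores.append(signal_score(permuted, regulators, targets, mutated))
    return scores

# Synthetic dataset: 30 genes x 20 samples; targets are depressed in mutated samples.
rng = random.Random(0)
genes = [f"g{i}" for i in range(30)]
mutated = set(range(6))
data = {g: [rng.gauss(0, 1) for _ in range(20)] for g in genes}
regulators, targets = ["g0", "g1"], ["g2", "g3", "g4"]
for g in targets:
    for i in mutated:
        data[g][i] -= 2.0

observed = signal_score(data, regulators, targets, mutated)
background = background_scores(data, regulators, targets, mutated, n_perm=200, rng=rng)
# Empirical two-sided p-value with add-one smoothing.
p_value = (1 + sum(abs(b) >= abs(observed) for b in background)) / (1 + len(background))
print(f"observed SS = {observed:.2f}, empirical p = {p_value:.3f}")
```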
TP53 Network
Sam Ng
Gain-of-Function (LUSC)
Tracks: P-Shift Score, PARADIGM downstream, PARADIGM upstream, Expression, Mutation (NFE2L2)
Sam Ng
NFE2L2 Network (LUSC)
Focus Gene Key: P-Shift, T-Run, R-Run, Expression, Mutation
Neighbor Gene Key: Activity, Expression, RB1 Mutation
NFE2L2
Sam Ng
Discrepancy scores are sensitive
RB1: Signal Score (t-statistic) = -5.78
TP53: Signal Score (t-statistic) = -10.94
NFE2L2: Signal Score (t-statistic) = 4.985
Observed SS
Background SS
Sam Ng
Expect passenger mutations to lack shifts
Is the discrepancy specific?
Negative control: calculate scores for
“passenger” mutations
Passengers: insignificant by MutSig (p > 0.10), well-represented in our pathways
Discrepancy of these “neutral” mutations should be close to what’s expected by chance (from permuted)
Sam Ng
Discrepancies of Passenger Mutations
are NOT distinctive
Sam Ng
Pathway Discrepancy
LUSC
PARADIGM-Shift gives orthogonal view of
the importance of mutations in LUSC
HIF3A (n=7)
TBC1D4 (n=9) (AKT signaling)
NFE2L2 (n=29)
MAP2K6 (n=5)
MET (n=7) (gefitinib resistance)
GLI2 (n=10) (SHH signaling)
CDKN2A (n=30)
EIF4G1 (n=20)
AR (n=8)
Enables probing into infrequent events
Can detect non-coding mutation impact (pseudo FPs)
Can detect presence of pathway compensation for those
seemingly functional mutations (pseudo FPs)
Extend beyond mutations
Limited to genes w/ pathway representation
Sam Ng
Defining Pathway Signatures for Mutations and Sub-Types
Build a signature for every mutation and
tumor/clinical event.
Correlate every signature to each other.
Reveals common molecular similarities between
different divisions of patient subgroups
Mutations in novel genes may “phenocopy”
mutations in known genes
Ted Goldstein
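Correlating every signature with every other one can be sketched as a pairwise Pearson correlation over vectors of pathway-concept scores. The choice of Pearson correlation is an assumption (the slide does not name a measure), and the signature names and values below are hypothetical.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length score vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical signatures: one score per pathway concept, same order in each.
signatures = {
    "known gene mutation": [1.2, -0.8, 0.5, -1.1],
    "novel gene mutation": [1.0, -0.9, 0.4, -1.0],
    "clinical sub-type": [-0.7, 1.1, -0.2, 0.9],
}

correlations = {
    (a, b): pearson(signatures[a], signatures[b])
    for a, b in combinations(signatures, 2)
}
for pair, r in correlations.items():
    print(pair, round(r, 3))
```

A high correlation between a novel-gene signature and a known-gene signature is the kind of "phenocopy" pattern the slide describes.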
PathMark: Differential Subnetworks from a
“SuperPathway”
Pathway Activities
Pathway Activities
Ted Goldstein Sam Ng
PathMark: Differential Subnetworks from a
“SuperPathway”
SuperPathway Activities
SuperPathway Activities
Pathway
Signature
Ted Goldstein Sam Ng
PIL: Pathway-informed Learning
• Traditional methods treat each gene as a separate feature
• Use features reflecting overall pathway activity
• A smaller number of features is now fed to predictors
Predictor
Artem Sokolov
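One simple way to turn gene-level features into pathway-level features is to average the genes in each pathway. This is only an illustrative assumption: the slide does not state how PIL derives its pathway features, and the pathway names, gene names, and values below are hypothetical.

```python
def pathway_features(expression, pathways):
    """Collapse per-gene values into one feature per pathway (mean of member genes).

    Genes missing from the expression profile are skipped; pathways with no
    measured genes are omitted.
    """
    features = {}
    for name, genes in pathways.items():
        values = [expression[g] for g in genes if g in expression]
        if values:
            features[name] = sum(values) / len(values)
    return features

# Hypothetical gene-level profile for one sample and two made-up pathways.
expression = {"geneA": 2.0, "geneB": 1.0, "geneC": -1.0, "geneD": -3.0}
pathways = {
    "pathway_1": ["geneA", "geneB"],
    "pathway_2": ["geneC", "geneD", "geneE"],  # geneE is not measured
}

features = pathway_features(expression, pathways)
print(features)
```

Four gene-level features become two pathway-level features, which is the reduction in feature count the slide refers to.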
Basal vs. Luminal
Recursive feature elimination: we train an SVM, drop the least important half of features and recurse
The number of times each feature survived the elimination across 100 random splits of data
Artem Sokolov
Methotrexate Sensitivity
Non sub-type specific drug
Pathway involving the target of the drug.
Artem Sokolov
Triple Negative Breast Pathway Markers
Identified from 50 Cell Lines
980 pathway concepts
1048 interactions
One large highly-connected component (size and connectivity significant according to permutation analysis)
Characterized by several “hubs”:
IL23/JAK2/TYK2
P53 tetramer
HIF1A/ARNT
ER
FOXA1
Myc/Max
Higher activity in ER-
Lower activity in ER-
Sam Ng, Ted Goldstein
Identify master controllers using
SPIA (signaling pathway impact analysis)
Google PageRank for Networks
Determines effect of a given pathway on each node
Calculates perturbation factor for each node in the network
Takes into account regulatory logic of interactions.
Impact factor: $$\mathrm{IF}(g_i) = s(g_i) + \sum_{j=1}^{n} \beta_{ij} \cdot \frac{\mathrm{IF}(g_j)}{N_{up}(g_j)}$$
Google’s PageRank
Yulia Newton
Slight Trick: Run SPIA in reverse
Reverse edges in Super Pathway
High scoring genes now those at the “top” of the pathway
PageRank finds highly referenced
Reverse to find highly referencing
Yulia Newton
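The reverse-edge trick can be sketched with a small fixed-point iteration in the spirit of the impact factor: each gene's score is its own signal plus a share of the scores flowing in along its incoming edges. This is a toy sketch, not the SPIA package. The function names, the edge weights (all +1 here), the four-gene network, and the signal values are hypothetical, and the normalizer is taken to be the number of outgoing edges of the contributing gene in whichever graph is being scored.

```python
def impact_factors(nodes, edges, signal, n_iter=100):
    """Iterate IF(dst) = signal(dst) + sum over edges src->dst of
    beta * IF(src) / out_degree(src).

    Converges exactly on acyclic graphs; graphs with cycles may need damping.
    """
    out_degree = {n: 0 for n in nodes}
    for src, _dst, _beta in edges:
        out_degree[src] += 1
    score = {n: signal.get(n, 0.0) for n in nodes}
    for _ in range(n_iter):
        updated = {n: signal.get(n, 0.0) for n in nodes}
        for src, dst, beta in edges:
            updated[dst] += beta * score[src] / out_degree[src]
        score = updated
    return score

def reverse_edges(edges):
    """Flip every edge so that scores flow from targets back to regulators."""
    return [(dst, src, beta) for src, dst, beta in edges]

# Hypothetical pathway: A regulates B and C, which both regulate D.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B", 1.0), ("A", "C", 1.0), ("B", "D", 1.0), ("C", "D", 1.0)]
signal = {"B": 1.0, "C": 1.0, "D": 1.0}  # made-up per-gene signal s(g)

forward = impact_factors(nodes, edges, signal)
backward = impact_factors(nodes, reverse_edges(edges), signal)
print("forward top:", max(forward, key=forward.get))
print("reverse top:", max(backward, key=backward.get))
```

On the forward graph the most downstream gene accumulates the highest score. After reversal the highest score lands on the upstream gene, which is the "master controller" candidate in this toy network.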
Master Controller Analysis on Breast Cell
Lines
Basal
Luminal
Yulia Newton
Master regulators predict response to drugs:
PLK3 predicted as a target for basal breast
• DNA damage network is
upregulated in basal
breast cancers
• Basal breast cancers are
sensitive to PLK inhibitors
GSK-PLKi
Luminal
Claudin-low
Basal
Ng, Goldstein
Heiser et al. 2011 PNAS
Up
Down
HDAC inhibitors predicted for luminal breast
• HDAC Network is downregulated in basal breast
cancer cell lines
• Basal/CL breast cancers are
resistant to HDAC inhibitors
HDAC inhibitor
VORINOSTAT
Heiser et al. 2011 PNAS
Ng, Goldstein
PARADIGM in TCGA patient BRCA tumors
Christina Yau, Buck Inst.
Connect genomic alterations to downstream
expression/activity
?
• What circuitry connects mutations to transcriptional changes?
– Mutations -> general (epi-) genomic perturbation
– Expression -> activity
• Mutation/perturbation and expression/activity treated as heat diffusing on a network
– HotNet, Vandin F, Upfal E, B.J. Raphael, 2008.
– HotNet used in ovarian to implicate Notch pathway
• Find subnetworks that link genetic to mRNA and protein-level changes.
Evan Paull
HotLink
Gene Activity
(Expression, RPPA, PARADIGM)
Genomic Perturbations
(Mutations, Methylation, Focal Copy Number)
?
Evan Paull
HotLink
Genomic Perturbations
(Mutations, Methylation, Focal Copy Number)
1. Add heat
2. Diffuse heat
Gene Activity
(Expression, RPPA, PARADIGM)
3. Cut out linkers
Evan Paull
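The three steps can be sketched on a toy network. This is not the HotLink or HotNet implementation: a random walk with restart stands in for the heat kernel, "linkers" are interpreted as non-source genes that receive heat from both the perturbation side and the activity side, and the network, restart probability, and threshold are all hypothetical.

```python
def diffuse(adjacency, sources, restart=0.3, n_iter=200):
    """Random walk with restart: heat starts on `sources`, and at every step each
    node passes (1 - restart) of its heat equally to its neighbors."""
    initial = {n: (1.0 / len(sources) if n in sources else 0.0) for n in adjacency}
    heat = dict(initial)
    for _ in range(n_iter):
        updated = {n: restart * initial[n] for n in adjacency}
        for node, neighbors in adjacency.items():
            share = (1 - restart) * heat[node] / len(neighbors)
            for neighbor in neighbors:
                updated[neighbor] += share
        heat = updated
    return heat

# Hypothetical undirected network: MUT - X - Y - EXPR, plus Z hanging off MUT.
adjacency = {
    "MUT": ["X", "Z"],
    "X": ["MUT", "Y"],
    "Y": ["X", "EXPR"],
    "EXPR": ["Y"],
    "Z": ["MUT"],
}
perturbed = {"MUT"}   # genomic perturbation (e.g. a mutated gene)
active = {"EXPR"}     # gene with altered activity

# 1. Add heat and 2. diffuse it, once from each side.
heat_from_perturbation = diffuse(adjacency, perturbed)
heat_from_activity = diffuse(adjacency, active)

# 3. Cut out linkers: non-source genes that are warm from both directions.
THRESHOLD = 0.05  # hypothetical cutoff
linkers = sorted(
    n for n in adjacency
    if n not in perturbed | active
    and min(heat_from_perturbation[n], heat_from_activity[n]) > THRESHOLD
)
print(linkers)
```

Here `X` and `Y` lie on the path between the perturbed and the active gene and are kept, while the side branch `Z` receives heat from only one side and is dropped.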
HotLink
Genomic Perturbations
(Mutations, Methylation, Focal Copy Number)
Gene Activity
(Expression, RPPA, PARADIGM)
Linking Sub-Pathway
Evan Paull
TCGA Interlinking Network
Basal-LumA
HotLink
Basal LumA
MAP, PI3K, AKT
Basal-LumA
HotLink Map
Basal LumA
TP53, RB1
MYC, FOXM1
AKT/PI3K
MAPK8, MAPK14 (p38alpha)
identified as mediators.
TP53, RB1
Basal LumA
MYC Neighborhood
Double feedback loop involving TP53,
RB1, CDK4, FOXM1, and MYC
Basal LumA
AKT signaling
Basal LumA
Mutation Association to Pathways
What pathway activities is a mutation’s presence associated with?
Can we classify mutations based on these associations?
PARADIGM Signatures
Mutations
(Note: CRC figure below; soon for BRCA)
• APC and other Wnt
• TGFB Pathway mutations
• PIK3CA, RTK pathway, KRAS
• Evidence for AHNAK2 acting PIK3CA-like?
Ted Goldstein
Summary
• Modeling information flow on known pathways gives view of gene activity.
• Patient stratification into pathway-based subtypes
• Sub-networks provide pathway-based signatures of sub-types and mutations.
• Loss- and gain-of-function predicted from pathway neighbors for even rare mutations.
• Identify interlinking genes associated with mutations to implicate additional targets even in LOF cases
• E.g. Target MYC-related pathways in certain TP53-deficient cells?
• Current work: Use pathways to now simulate the effects of specific knock-downs.
Acknowledgments
UCSC Integrative Genomics Group
• Marcos Woehrmann
• Sam Ng
• Dan Carlin
• Evan Paull
• Ted Goldstein
• James Durbin
• Artem Sokolov
• Yulia Newton
• Chris Szeto
• Chris Wong
• David Haussler
UCSC Cancer Genomics
• Kyle Ellrott
• Brian Craft
• Chris Wilks
• Amie Radenbaugh
• Mia Grifford
• Sofie Salama
• Steve Benz
• Jing Zhu
UCSC Genome Browser Staff
• Mark Diekhans
• Melissa Cline
• Jorge Garcia
• Erich Weiler
Buck Institute for Aging
• Chris Benz
• Christina Yau
• Sean Mooney
• Janita Thusberg
Collaborators
• Joe Gray, LBL
• Laura Heiser, LBL
• Eric Collisson, UCSF
• Nuria Lopez-Bigas, UPF
• Abel Gonzalez, UPF
Broad Institute
• Gaddy Getz
• Mike Noble
• Daniel DeCara
Funding Agencies
• NCI/NIH
• SU2C
• NHGRI
• AACR
• UCSF Comprehensive Cancer Center
• QB3
UCSC Cancer Browser
genome-cancer.ucsc.edu
Jing Zhu