peer12 - Computer Science, Columbia University


Dealing with the heterogeneity of cancer
Dana Pe’er
Department of Biological Sciences
Center for Computational Biology and Bioinformatics
What is Cancer?
Hanahan and Weinberg, Cell 2000
Why these phenotypes?
• Cells only proliferate when they are told to do
so.
– Usually achieved by growth factors or cell-to-cell
interaction.
• Malignant cells proliferate independent of
external signals
• Proliferation rate is controlled by external and
internal signals.
• Cells that interfere with their environment
receive signals to die
• Tumors evade these signals
• A local tumor is almost always surgically
removable.
• Cancer is such a terrible disease because it
metastasizes and affects other organs
• Our chromosomes end with “telomeres”, stretches of
DNA that are not fully replicated and so get
shorter each time the DNA is copied.
• When they become too short, the “important”
DNA can no longer be copied and the cell dies
• Tumors activate the process that elongates
telomeres (and don’t die).
• Cells need blood. More cells need more blood
• Tumors, which spread into new areas, need
new blood vessels
• Our cells aren’t designed to proliferate
indefinitely, metastasize, divide whenever
they want and ignore extracellular signals
• There are checkpoints in place that prevent all
of the above by triggering cell suicide (apoptosis).
• These are lost in cancer.
So what is cancer?
Weinberg, Cell 2011
The “Pathway” view of the cell
• We depict proteins and processes as
“pathways”.
How a cell achieves these phenotypes
• Different types of mutations (alterations) can
alter pathway activity
– Activate “Oncogene”
– Inhibit “Tumor
suppressor”
TCGA, Nature 2008
Point mutations
• Nucleotide change can lead to:
– An early stop codon – making a protein nonfunctional
– Create a constitutively active protein
DNA Copy Number Alterations
• Chunks of the genome can be amplified
– Leading to many copies of an oncogene
– Which leads to overexpression of the gene
• Chunks can also be lost (deleted)
– And that is one mechanism to lose a tumor
suppressor
Subtypes of cancer – By expression
• Different cancers, and
even subtypes of cancer,
have dramatically
different gene expression
patterns
• These represent cellular
states
Sandhu, 2010
Cancer development
Genetic alterations
alterations
functional
drivers
Identifying significantly
recurrent alterations
across samples
The Cancer Genome Atlas (TCGA)
• Characterization of 20 cancers x 1000 tumors each
• Assays include:
– How is the DNA changing: DNA sequencing (mostly exome),
copy number variation
– How is expression different: RNA-seq, miRNAs
– Extras: methylation, clinical annotation
• https://tcga-data.nci.nih.gov/tcga/
Prevalence of alterations by type
[Figure: frequency of sequence mutations; 6 alterations in > 5% of samples]
[Figure: frequency of CN alterations; 87 alterations in > 5% of samples]
Distinguishing drivers from passengers
What Aberrations
Make a Cell Go Bad?
Driver Aberrations:
Significantly Recur Across Tumors
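One simple way to make "significantly recur" concrete is a binomial tail test: given an assumed background alteration rate, how surprising is it to see a gene altered in at least k of n tumors? The sketch below is only an illustration; the uniform background rate and the counts are hypothetical, and real tools use richer background models.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical numbers: 100 tumors, assumed 2% background rate per gene.
n_tumors, background_rate = 100, 0.02
for altered in (2, 5, 10):
    p_value = binomial_tail(altered, n_tumors, background_rate)
    print(f"altered in {altered}/{n_tumors} tumors: p = {p_value:.3g}")
```

The more tumors share an alteration, the smaller the tail probability, which is the intuition behind calling recurrent alterations candidate drivers.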
Breast Copy Number Profile

Breast Cancer Exome
Sequencing
Total mutations: 21713
 Per patient: 48

Two forces drive copy number
I. Selection
of the Fittest
II. DNA secondary
structure and packing

Nowell, 1976
Our ISAR algorithm
tries to identify
frequent alterations
driven by fitness.
ISAR

Significance of number of alterations should be
computed locally.
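To illustrate what "computed locally" can mean, the sketch below scores a locus against its genomic neighborhood rather than against the whole genome. It is not the ISAR algorithm: the empirical p-value, the function name, and the toy data are assumptions for illustration, and the 8 Mbp window simply borrows the ~8Mbp figure shown on the slide.

```python
def local_empirical_pvalue(positions, freqs, index, window_bp=8_000_000):
    """Fraction of loci within window_bp of the target whose alteration
    frequency is at least as high as the target's (target included)."""
    center = positions[index]
    neighbors = [f for pos, f in zip(positions, freqs)
                 if abs(pos - center) <= window_bp]
    at_least = sum(1 for f in neighbors if f >= freqs[index])
    return at_least / len(neighbors)


# Toy chromosome: one locus per Mbp, quiet background of 5%.
positions = [i * 1_000_000 for i in range(40)]
freqs = [0.05] * 40
for i in range(20, 30):
    freqs[i] = 0.25   # broadly altered region (high local background)
freqs[5] = 0.15       # isolated peak in a quiet region

print(local_empirical_pvalue(positions, freqs, 5))    # stands out locally
print(local_empirical_pvalue(positions, freqs, 25))   # typical for its region
```

Genome-wide, the locus at 25 Mbp has the higher frequency, yet it is unremarkable within its neighborhood, while the lower peak at 5 Mbp stands out against its quiet surroundings.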
[Figure: P-value distribution; ~8Mbp]
ISAR regions
            # regions   # genes per region   # genes per peak
ISAR        83          14
GISTIC2     33          14.39                1.18
A better null model helps sensitivity
~1200 genes in ISAR regions: we need to identify drivers
within these regions.
GISTIC2 narrows down regions to deterministic peaks
containing 1.18 genes. Problem solved?
Defining peaks: cut-off
9 of the 33 GISTIC2 peaks do
not contain a single gene
Helios approach
Sample 1
Sample 2
Sample 3
Sample 4
Genome
deterministic
0/1 decision
GENE1
GENE2
GENE3
GENE4
GENE5
Classic
Approach
Features
Sequence
Weight
and
combine
Genome
Integrative
Score
Copy Number
GENE1
GENE2
Expression
GENE3
GENE4
shRNA
GENE5
Primary tumor data (TCGA)
Functional assays (RNAi screens)
Helios: Data Integration
Primary tumor (many)
Cell Line (few)
…
A ranked and
scored list of
driver genes


Making use of the large-scale functional screens that are
quickly accumulating
Best of both worlds: Integrating primary tumor data with
functional screens on cell lines
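The "weight and combine" idea can be sketched as a weighted average of per-gene feature scores, followed by ranking. This is not the Helios model itself: the feature values, the weights, and the function name below are all hypothetical, and how the real method sets its weights is not given in these slides.

```python
# Hypothetical per-gene feature scores, each scaled to [0, 1].
features = {
    "GENE1": {"copy_number": 0.9, "expression": 0.8, "sequence": 0.1, "shrna": 0.7},
    "GENE2": {"copy_number": 0.9, "expression": 0.1, "sequence": 0.0, "shrna": 0.1},
    "GENE3": {"copy_number": 0.8, "expression": 0.9, "sequence": 0.6, "shrna": 0.9},
}

# Hypothetical weights; here the functional screen counts double.
weights = {"copy_number": 1.0, "expression": 1.0, "sequence": 1.0, "shrna": 2.0}


def integrative_score(gene_features, weights):
    """Weighted average of the feature scores for one gene."""
    total_weight = sum(weights.values())
    return sum(weights[name] * gene_features[name] for name in weights) / total_weight


ranked = sorted(features,
                key=lambda gene: integrative_score(features[gene], weights),
                reverse=True)
for gene in ranked:
    print(gene, round(integrative_score(features[gene], weights), 2))
```

Unlike a 0/1 decision per gene, every gene keeps a continuous score, so GENE2 (amplified but neither expressed nor essential in the screen) drops to the bottom of the list instead of being called a driver on copy number alone.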
Features: Gene expression

Is the gene expressed?

Diploid vs. amplified:
CCND1 CN
AMP
WT
CCND1 EXP

Differentially expressed in subtypes:
SUBTYPE
FOXA1 EXP
BASAL
LUMINAL
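The diploid-versus-amplified comparison can be sketched as a two-group test on expression values. The example below uses a permutation test from the standard library only; the expression numbers are invented for illustration and the choice of test is an assumption, not something stated in the talk.

```python
import random
from statistics import mean


def permutation_pvalue(group_a, group_b, n_permutations=10_000, seed=0):
    """One-sided p-value for mean(group_a) - mean(group_b) being this large
    if group labels were assigned at random."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        if mean(pooled[:n_a]) - mean(pooled[n_a:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical log-expression of a gene in amplified vs. diploid (WT) tumors.
amplified = [8.1, 7.9, 8.4, 8.8, 7.6]
wild_type = [5.2, 5.9, 6.1, 5.5, 6.3, 5.8]

p_value = permutation_pvalue(amplified, wild_type)
print(f"p = {p_value:.4f}")
```

The same function applies to the subtype comparison by passing, for example, luminal and basal expression values as the two groups.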
Features: Sequence mutations


Driver genes may show a footprint of point
mutations
We use the p-value of alteration frequency calculated
by MutSig (Banerji, Nature 2012)
Training data
Features
Classifier
Labels
List of drivers
and
passengers
Too small and biased!
Make frequency
of alteration the
center of the
system
PLX4720-Targeted Therapy
Proteins Form a Complex Network
Chandarlapaty et al. 2011
Crosstalk
BRAF exists in a network
Feedback
BRAF
Networks Vary Across
Genetic Backgrounds
Drastically different genetic backgrounds
Our Aims


Identify genetic determinants and master regulators
of drug resistance
Predict additional target pathways for
combinatorial drug treatment.
Heterogeneity within a tumor


If even < 1% of cells
evade therapy, tumor
will recur.
The influence of this
population on any bulk
assay is negligible
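A short arithmetic sketch shows why a bulk assay barely registers such a population. A bulk measurement is roughly the cell-fraction-weighted average of the per-population signal; the marker levels below are hypothetical.

```python
def bulk_signal(rare_fraction, rare_level, common_level):
    """Bulk measurement as a weighted average over two cell populations."""
    return rare_fraction * rare_level + (1 - rare_fraction) * common_level


# Hypothetical marker: 10-fold higher in the resistant cells (10.0 vs 1.0).
without_rare = bulk_signal(0.00, rare_level=10.0, common_level=1.0)
with_rare = bulk_signal(0.01, rare_level=10.0, common_level=1.0)

print(without_rare, with_rare)
print(f"relative change: {(with_rare - without_rare) / without_rare:.0%}")
```

Even a 10-fold difference in a 1% subpopulation moves the bulk value by only about 9%, which is easily lost in measurement noise; single-cell measurements avoid this averaging.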
Mass cytometry: A powerful new technology
Time of flight Mass spectrometer


We capture the level of 45 protein
epitopes simultaneously in single
cells
For tens of thousands of cells
Mass cytometry
How do we view > 30 dimensions?
Parameters: 4, 8, 14, 32
Plots: 6, 28, 91, 496
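The plot counts on this slide match the number of distinct pairs of parameters, n(n-1)/2, that is, one two-dimensional scatter plot per pair. A quick check, with a helper name chosen here for illustration:

```python
from math import comb


def pairwise_plots(n_parameters: int) -> int:
    """Number of distinct two-parameter scatter plots for n parameters."""
    return comb(n_parameters, 2)


for n in (4, 8, 14, 32):
    print(n, pairwise_plots(n))
```

The count grows quadratically, so inspecting every pairwise plot by eye quickly stops being practical as the number of measured parameters rises.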
Acknowledgements
Felix Sanchez-Garcia
Dylan Kotliar
Uri David Akavia
Jose Silva (CUMC)
Junji Matsui
Bo-Juen Chen
El-ad David Amir
Jacob Levine
Smita Krishnaswamy
Daniel Shenfeld
Michelle Tadmor
Garry Nolan (Stanford)
Sean Bendall
Erin Simonds
Kara Davis