TakedaPharmaGroup_30Sept2010
Genome Sciences Centre
BC Cancer Agency, Vancouver, BC, Canada
ALEXA-Seq analysis reveals breast cell type-specific mRNA isoforms
www.AlexaPlatform.org
Malachi Griffith
30 Sept. 2010
1
In most genes, transcript diversity is generated
by alternative expression
Gene expression
Types of alternative expression
2
Transcript variation is important to the
study of human disease
• Alternative expression generates
multiple distinct transcript variants
from most human loci
• Specific transcript variants may
represent useful therapeutic targets
or diagnostic markers
(Venables, 2006)
3
Massively parallel RNA sequencing
Tissues/Cell Lines: Luminal, Myoepithelial, vHMECs, hESCs
• Isolate RNAs
• Generate cDNA, fragment, size select, add linkers
• Sequence ends
• Map to genome, transcriptome, and predicted exon junctions
• Discover isoforms and measure abundance
263 million paired reads
21 billion bases of sequence
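As a rough consistency check on these two totals, the average read length can be estimated from them. This is a minimal sketch that assumes "paired reads" counts read pairs, each contributing two sequenced ends; the slide does not state this explicitly, and the variable names are illustrative.

```python
# Totals as reported on the slide.
paired_reads = 263_000_000      # read pairs (assumed: each pair has two ends)
total_bases = 21_000_000_000    # bases of sequence

# Assumption: both ends of every pair contribute to the base total.
reads = paired_reads * 2
mean_read_length = total_bases / reads

print(f"Approximate mean read length: {mean_read_length:.1f} bases")
```

Under that assumption the estimate comes out close to 40 bases per read, which sits inside the 36-mer to 75-mer range reported for the processed libraries.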
4
Pipeline overview
5
What is an ALEXA-Seq sequence ‘feature’?
Summary of features for human:
~4 million total (14% ‘known’)
37k genes
62k transcripts
278k exons
2,210k exon junctions
407k alternative exon boundaries
560k intron regions
227k intergenic regions
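The per-type counts can be tallied to see how they relate to the stated total. The sketch below simply copies the slide's figures (in thousands) into a dictionary; the dictionary name is illustrative.

```python
# Feature counts in thousands, copied from the slide.
feature_counts_k = {
    "genes": 37,
    "transcripts": 62,
    "exons": 278,
    "exon junctions": 2210,
    "alternative exon boundaries": 407,
    "intron regions": 560,
    "intergenic regions": 227,
}

total_k = sum(feature_counts_k.values())
print(f"Total features: {total_k:,}k (~{total_k / 1000:.1f} million)")

# Share of the total contributed by each feature type, largest first.
for name, count in sorted(feature_counts_k.items(), key=lambda kv: -kv[1]):
    print(f"{name:>28}: {count / total_k:6.1%}")
```

The listed types sum to 3,781k, which matches the "3.8 million features" quoted for the output and the rounded "~4 million" here. Exon junctions account for more than half of all features.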
6
Data analyzed to date
• ALEXA-Seq processing: 19 projects
– REMC + 18 others
• 105 libraries (200+ lanes)
• 3.9 billion paired-end reads
• 36-mers to 75-mers
7
Output
• Expression, differential expression and alternative expression
values for 3.8 million features for each library processed
• Library quality analysis
• Number of features expressed (above background)
– Genes, transcripts, exon regions, junctions, etc.
• Differential gene expression
– Ranked lists
• Alternative expression
– Ranked lists
– Alternative isoforms involving exon skipping, alternative transcript
initiation sites, etc.
– Known or predicted novel isoforms
• Candidate peptides
– Ranked lists
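To illustrate how alternative expression can be separated from differential gene expression, here is a minimal sketch of a splicing-index style value: the log2 change of a feature (for example an exon) minus the log2 change of its gene. This is a generic formulation offered as an assumption; the slides do not give the exact calculation used by ALEXA-Seq, and the function names, pseudocount, and input values are all hypothetical.

```python
import math

def log2_ratio(a: float, b: float, pseudocount: float = 1.0) -> float:
    """log2 of (a / b), with a pseudocount to avoid division by zero."""
    return math.log2((a + pseudocount) / (b + pseudocount))

def splicing_index(feature_a, gene_a, feature_b, gene_b):
    """Feature-level change after subtracting the gene-level change.

    Values near 0 mean the feature simply follows its gene (differential
    expression only); large absolute values suggest alternative expression.
    """
    return log2_ratio(feature_a, feature_b) - log2_ratio(gene_a, gene_b)

# Hypothetical expression values for one exon and its gene in two libraries.
si_follows_gene = splicing_index(feature_a=200, gene_a=200, feature_b=50, gene_b=50)
si_exon_skipped = splicing_index(feature_a=5, gene_a=200, feature_b=180, gene_b=190)
print(si_follows_gene, si_exon_skipped)
```

In the first call the exon changes exactly as much as its gene, so the index is zero. In the second the gene is nearly unchanged while the exon drops sharply, giving a strongly negative index of the kind an exon-skipping event would produce.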
8
ALEXA-Seq data browser
(using REMC analysis as an example)
• Goals
– Visualization, interpretation, design of validation
experiments, distribute results to internal/external
collaborators
• What kinds of questions does ALEXA-Seq
allow us to ask/answer?
• http://www.alexaplatform.org/alexa_seq/Breast/Summary.htm
9
Is the RNA-Seq library suitable for
alternative expression analysis?
• Library summary
• Read quality
• Tag redundancy
• End bias
• Mapping rates
• Signal-to-noise
• hnRNA & gDNA contamination
• Features detected
10
Is my favorite gene expressed? Is it alternatively expressed?
11
What are the most highly expressed
genes, exons, etc. in each library?
• Expression
• Differential expression
• Alternative expression
• Provided for each feature type (gene, exon, junction, etc.)
• Ranked lists of events
12
e.g. most highly expressed genes
13
What are the top DE and AE genes for
each tissue comparison?
• Candidate genes
• Each comparison
• DE or AE events
• Gains or Losses
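A ranked list of gains and losses for one comparison could be built along the following lines. This is only a sketch: the gene names, expression values, pseudocount, and 2-fold cutoff are hypothetical, and the slides do not specify the scoring or thresholds ALEXA-Seq applies.

```python
import math

# Hypothetical gene-level expression values (arbitrary units) for one comparison.
expression = {
    # gene: (library_A, library_B)
    "GENE_A": (840.0, 2.0),
    "GENE_B": (15.0, 300.0),
    "GENE_C": (100.0, 95.0),
    "GENE_D": (64.0, 4.0),
}

PSEUDOCOUNT = 1.0       # avoids log of zero; the value is an assumption
MIN_ABS_LOG2_FC = 1.0   # illustrative cutoff: at least a 2-fold change

def log2_fold_change(a: float, b: float) -> float:
    return math.log2((a + PSEUDOCOUNT) / (b + PSEUDOCOUNT))

scored = {gene: log2_fold_change(a, b) for gene, (a, b) in expression.items()}

# Gains: higher in library A, strongest first. Losses: lower in A, strongest first.
gains = sorted((g for g, fc in scored.items() if fc >= MIN_ABS_LOG2_FC),
               key=lambda g: scored[g], reverse=True)
losses = sorted((g for g, fc in scored.items() if fc <= -MIN_ABS_LOG2_FC),
                key=lambda g: scored[g])

print("Gains: ", gains)
print("Losses:", losses)
```

Genes whose change falls below the cutoff, such as GENE_C here, appear in neither list. The same ranking could be applied to exons or junctions by swapping in feature-level values.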
14
Summary page for vHMECs vs. Luminal
15
Candidate features gained in vHMECs
vHMECs vs. Luminal
CD10
16
Which exons/junctions and corresponding
peptides might be suitable for antibody design?
17
Candidate peptides gained in vHMECs
vHMECs vs. Luminal
18
Example housekeeping gene
(Actin; no change)
19
CD10 (used to sort myoepithelial cells)
Myoepithelial & vHMECs
Luminal
422-fold higher in Myoepithelial than Luminal
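Fold changes of this size are often easier to compare on a log2 scale. The short sketch below converts the reported value; the use of log2 here is a common convention and an assumption, not something stated on the slide.

```python
import math

fold_change = 422  # Myoepithelial vs. Luminal, as reported on the slide
log2_fc = math.log2(fold_change)
print(f"{fold_change}-fold = log2 fold change of {log2_fc:.2f}")
```

A 422-fold difference corresponds to a log2 fold change of roughly 8.7, that is, between eight and nine doublings.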
20
CD227 (used to sort luminal epithelial cells)
Luminal
Myoepithelial
21
Differential gene expression of CASP14
(Caspase 14 gained in vHMECs)
22
Novel skipping of PTEN exon 6
23
Exon 12 skipping of DDX5 (p68)
24
Tissue specific isoforms of CA12
Myoepithelial
vHMECs
Luminal
25
Alternative first exons of INPP4B
26
Alternative first exons of SERPINB7
27
FERM domain-containing proteins are alternatively expressed *
* (FRMD6, FRMD4A, FRMD4B are AE) (FRMD3, FRMD8 are DE)
28
Novel isoforms observed only in
vHMECs
E6-E10
E7-E10
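The sketch below shows one way such junction labels could be enumerated and interpreted. It assumes that a label like E6-E10 denotes exon 6 joined directly to exon 10, and that candidate junctions are generated for every ordered exon pair within a gene. Neither assumption is stated on the slide, and the function names and the 10-exon gene are hypothetical.

```python
from itertools import combinations

def candidate_junctions(n_exons: int):
    """All exon pairs (i, j) with i < j for a gene with n_exons exons."""
    return list(combinations(range(1, n_exons + 1), 2))

def junction_label(i: int, j: int) -> str:
    return f"E{i}-E{j}"

def skipped_exons(i: int, j: int):
    """Exons omitted when exon i is joined directly to exon j."""
    return list(range(i + 1, j))

junctions = candidate_junctions(10)   # hypothetical 10-exon gene
print(len(junctions))                  # n * (n - 1) / 2 pairs
for i, j in [(6, 10), (7, 10)]:
    print(junction_label(i, j), "skips exons", skipped_exons(i, j))
```

Under these assumptions, E6-E10 would skip exons 7 to 9 and E7-E10 would skip exons 8 and 9. Pairwise enumeration also grows quadratically with exon count, which would be in keeping with junctions being by far the largest feature class.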
29
How reliable are predictions from
ALEXA-Seq?
• Are novel junctions real?
– What proportion validate by RT-PCR and Sanger
sequencing?
• Are differential/alternative expression changes
observed between tissues accurate?
– How well do DE values correlate with qPCR?
• To answer these questions we performed ~400
validations of ALEXA-Seq predictions from a
comparison of two cell lines…
30
Validation (qualitative)
33 of 189 assays shown. Overall validation rate = 85%
31
Validation (quantitative)
qPCR of 192 exons identified as alternatively expressed by ALEXA-Seq
Validation rate = 88%
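The two validation sets can be pooled to see how they relate to the roughly 400 validations mentioned earlier. This is a sketch only: the per-set rates are rounded on the slides, so the validated counts are estimates, and the labels and variable names are illustrative.

```python
# Assay counts and validation rates as reported on the two validation slides.
assay_sets = {
    "qualitative": (189, 0.85),
    "quantitative (qPCR)": (192, 0.88),
}

total_assays = sum(n for n, _ in assay_sets.values())
# Rates are rounded on the slides, so the validated count is an estimate.
estimated_validated = sum(n * rate for n, rate in assay_sets.values())
pooled_rate = estimated_validated / total_assays

print(f"{total_assays} assays, pooled validation rate ~{pooled_rate:.1%}")
```

The two sets total 381 assays, in line with "~400 validations", and the pooled estimate falls between 86% and 87%, consistent with the 86% figure quoted in the conclusions.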
32
Conclusions
• The ALEXA-Seq approach provides a comprehensive
global transcriptome profile
– Input: paired-end RNA sequence data
– Output: expression, differential expression, alternative
expression, candidate peptides, etc.
• Detection of both known and novel isoforms
– Subset that differ between conditions
• Predictions are highly accurate
– 86% validation rate by RT-PCR, qPCR and Sanger
sequencing
• www.AlexaPlatform.org
33
Acknowledgements
Griffith M, Griffith OL, Morin RD, Tang MJ, Pugh TJ, Ally A, Asano JK, Chan SY, Li I, McDonald H,
Teague K, Zhao Y, Zeng T, Delaney AD, Hirst M, Morin GB, Jones SJM, Tai IT, Marra MA.
Alternative expression analysis by RNA sequencing. In review (Nature Methods).
Supervisor: Marco Marra
Committee: Joseph Connors, Stephane Flibotte, Steve Jones, Gregg Morin
Bioinformatics: Obi Griffith, Ryan Morin, Rodrigo Goya, Allen Delaney, Gordon Robertson, Richard Corbett
Sequencing: Martin Hirst, Thomas Zeng, Yongjun Zhao, Helen McDonald
Laboratory: Trevor Pugh, Tesa Severson
Neuroblastoma: Olena Morozova, Marco Marra
Morgen: Pamela Hoodless, Jacquie Schein, Inanc Birol, Gordon Robertson, Shaun Jackman
5-FU resistance: Michelle Tang, Isabella Tai, Marco Marra
Iressa and Sutent: Obi Griffith, Steven Jones
Multiple Myeloma: Rodrigo Goya, Marco Marra
Lymphoma: Ryan Morin, Marco Marra
34