PowerPoint - Oregon State University

Computational methods
to quantify transcriptome
changes in bacteria
Rebecca Pankow
Mentor: Dr. Jeff Chang
Botany and Plant Pathology
Oregon State University
What makes a pathogen?
• Overcome host defenses
• Manipulate host cell
• Survive in host environment
Infections caused by
Pseudomonas syringae
Hypothesis
Genes that are expressed in conditions that
mimic the plant are candidates for host-associated genes.
Experimental Setup
Grow P. syringae in
KB (rich media)
No virulence gene
expression
Grow P. syringae in minimal media:
simulates environment of plant host
Virulence gene
expression
Identify differential expression of genes
How to identify expressed genes?
DNA
mRNA
protein
Transcriptome: all mRNAs in a cell at a given time
completely
sequenced genome
TAATTCTCGTTATCGTCCGG
ATTAAGAGCAATAGCAGGCC
AGAGCAATAGCA
aligning back
sequenced
transcriptome
AGAGCAATAGCA
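The "aligning back" step on this slide can be sketched as an exact string search. This is only an illustration under stated assumptions: the function name find_read_positions is hypothetical, only exact matches are considered, and the strand and read are the short sequences shown on the slide rather than real data.

```python
def find_read_positions(reference, read):
    """Return every 0-based start index where `read` occurs exactly in `reference`."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        # Step forward by one base so overlapping occurrences are also found.
        start = reference.find(read, start + 1)
    return positions


# Strand and read exactly as written on the slide (illustrative only).
strand = "ATTAAGAGCAATAGCAGGCC"
read = "AGAGCAATAGCA"
print(find_read_positions(strand, read))  # [4]
```

A real pipeline would search both strands of the full genome and would likely use an indexed aligner rather than a linear scan.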
How to quantify transcriptome changes?
mRNAs in transcriptome
Next-Generation Illumina IIG Genome Sequencer
ACATAGGAGCTAGATAGCTATGCATCGATCGACATG
GATCGACATGAGAGTTACGAGTAGACTGAGAGATAT
CTGAGAGATATGTTTACCCAGATTACTCTCCGATGC
GATCGACATGAGAGTTACGAGTAGACTGAGAGATAT
36 base-long reads (36-mers)
Computational Pipeline
Processed 36-mers
TGTTTACCCAGATTACTCTCCGATGCCAGGGAGAAT
GATCGACAGATGCATGTTTACCCAGATTACTCTCCG
ACATAGGAGCTAGATAGCTATGCATCGATCGACAGA
GATCGACAGATGCATGTTTACCCAGATTACTCTCCG
Align to ref. genome
Signal Processing
…0010100234201231201001022410301022040102020…
Graph signal
# reads that
map to
coordinates
genome coordinates of a potential transcription unit
Not very informative!
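One way to build such a raw signal is to count, for each genome coordinate, how many reads map there. The sketch below assumes that each read is recorded by its 0-based start coordinate; the slides do not say whether the count is per start position or per covered base, and the name read_start_signal is hypothetical.

```python
def read_start_signal(genome_length, start_positions):
    """Count how many reads start at each genome coordinate.

    Assumption: a read is represented only by its 0-based start coordinate.
    """
    signal = [0] * genome_length
    for pos in start_positions:
        # Ignore coordinates that fall outside the genome.
        if 0 <= pos < genome_length:
            signal[pos] += 1
    return signal


# Four hypothetical reads on a 6-base stretch of genome.
print(read_start_signal(6, [0, 2, 2, 5]))  # [1, 0, 2, 0, 0, 1]
```

Counts of this kind jump between 0 and small integers from one base to the next, which is why the raw graph is hard to read.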
Signal Processing
Using sliding window approach to minimize noise
old signal
Set "sliding window" = 15
Sum of reads in successive sliding windows = 19, 20, 22, …
processed signal: 19, 20, 22, …
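The sliding window sum can be sketched as follows. The window width of 15 comes from the slide; the function name sliding_window_sums and the choice to emit one sum per fully contained window are assumptions made for illustration.

```python
def sliding_window_sums(signal, window=15):
    """Sum the signal inside a window that slides one coordinate at a time."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(signal) < window:
        return []
    current = sum(signal[:window])
    sums = [current]
    for i in range(window, len(signal)):
        # Add the value entering the window and drop the one leaving it.
        current += signal[i] - signal[i - window]
        sums.append(current)
    return sums


# Small hypothetical signal with a window of 3 to keep the example readable.
print(sliding_window_sums([1, 0, 2, 3, 1], window=3))  # [3, 5, 6]
```

Dividing each sum by the window width would rescale the processed signal to the range of the original counts.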
Resulting signal
old signal
scaled and processed signal
More informative, but signal is jagged
Smoothing the Signal
Iteration of the sliding window
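Iterating the sliding window can be sketched as a repeated moving average. This is a minimal sketch, not the pipeline's actual code: the names moving_average and smooth, the centered window, and the shortened windows at the edges are all assumptions.

```python
def moving_average(signal, window):
    """Centered moving average; the window is shortened at the signal edges.

    The effective width is 2 * (window // 2) + 1, so an even window is
    widened by one coordinate.
    """
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


def smooth(signal, window=15, iterations=2):
    """Apply the moving average repeatedly; each pass removes more jaggedness."""
    result = list(signal)
    for _ in range(iterations):
        result = moving_average(result, window)
    return result


# A single hypothetical spike spreads out and flattens with each pass.
print(smooth([0, 0, 9, 0, 0], window=3, iterations=1))  # [0.0, 3.0, 3.0, 3.0, 0.0]
print(smooth([0, 0, 9, 0, 0], window=3, iterations=2))  # [1.5, 2.0, 3.0, 2.0, 1.5]
```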
Deconvoluting Signal
Changes in the signal found by using the sliding
window on the first and second derivatives of
the signal.
Deconvoluting Signal
• Refine signal divisions by looking in-between
previous divisions
• Categorize signal divisions as increasing,
decreasing, or flat
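Categorizing the signal into increasing, decreasing, and flat divisions can be sketched with a discrete first derivative (the difference between neighboring points). The sketch is simplified: it uses only the first difference, whereas the slides also mention the second derivative and a sliding window. The name classify_segments and the tolerance parameter are hypothetical.

```python
def classify_segments(signal, tolerance=0.0):
    """Split a signal into runs labeled increasing, decreasing, or flat.

    Each segment is (label, start_index, end_index), inclusive, in signal
    coordinates. A step counts as flat when its change is within `tolerance`.
    """
    segments = []
    for i in range(len(signal) - 1):
        delta = signal[i + 1] - signal[i]  # discrete first derivative
        if delta > tolerance:
            label = "increasing"
        elif delta < -tolerance:
            label = "decreasing"
        else:
            label = "flat"
        if segments and segments[-1][0] == label:
            # Same trend as the previous step: extend the current segment.
            segments[-1] = (label, segments[-1][1], i + 1)
        else:
            segments.append((label, i, i + 1))
    return segments


# Hypothetical smoothed signal: rise, plateau, fall.
print(classify_segments([0, 1, 2, 2, 2, 1, 0]))
# [('increasing', 0, 2), ('flat', 2, 4), ('decreasing', 4, 6)]
```

Applying the same differencing to the list of deltas gives a discrete second derivative, which could be used to refine where one division ends and the next begins.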
Processing Empirical Data
Next-Generation Illumina IIG Genome Sequencer
ACATAGGAGCTAGATAGCTATGCATCGATCGACATG
GATCGACATGAGAGTTACGAGTAGACTGAGAGATAT
CTGAGAGATATGTTTACCCAGATTACTCTCCGATGC
GATCGACATGAGAGTTACGAGTAGACTGAGAGATAT
36 base-long reads (36-mers)
Problems
Mistakes in sequencing can be made!
ACATAGGAGCTAGATAGCTATGCATCGATCGACATG
GATCGACATGAGAGTTACGAGTAGACTGAGAGATAT
CTGAGAGATATGTTTACCCAGATTACTCTCCGATGC
GATCGACATGAGAGTTACGAGTAGACTGAGAGATAT
30% of reads match P. syringae genome
Solution
Account for mismatches by treating
each base in a 36-mer as a wildcard
ACATAGGAGCTAGATAGCTATGCATCGATCGACATG
_CATAGGAGCTAGATAGCTATGCATCGATCGACATG
A_ATAGGAGCTAGATAGCTATGCATCGATCGACATG
AC_TAGGAGCTAGATAGCTATGCATCGATCGACATG
36-mers containing wildcards are
mapped back to the original genome
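The wildcard idea can be sketched by generating one variant of the read per position, with that base replaced by "_", and searching the genome for each variant. This is an illustrative brute-force version under stated assumptions: the function names are hypothetical, and a production pipeline would use an indexed search instead of scanning every coordinate.

```python
def wildcard_variants(read, wildcard="_"):
    """Return one copy of the read per position, with that base masked."""
    return [read[:i] + wildcard + read[i + 1:] for i in range(len(read))]


def pattern_matches_at(genome, pattern, start, wildcard="_"):
    """True if the pattern matches the genome at `start`, ignoring wildcards."""
    window = genome[start:start + len(pattern)]
    return all(p == wildcard or p == g for p, g in zip(pattern, window))


def map_with_one_wildcard(genome, read):
    """Return sorted 0-based positions where the read matches the genome
    once any single base is treated as a wildcard (at most one mismatch)."""
    hits = set()
    for pattern in wildcard_variants(read):
        for start in range(len(genome) - len(read) + 1):
            if pattern_matches_at(genome, pattern, start):
                hits.add(start)
    return sorted(hits)


# Hypothetical genome; the read has one sequencing error (C instead of G).
print(map_with_one_wildcard("ACGTACGT", "ACCT"))  # [0, 4]
```

Because exactly one base is masked in each variant, a read maps if it differs from the genome at no more than one position.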
Conclusions
• Computational pipeline developed to
– Generate and smooth signal
– Divide signal into sections that are going up,
down, or are flat
• 30% of reads from transcriptome map back to
original genome
Future Work
Quantify changes in bacterial transcriptome
under different treatments
Acknowledgements
Jeff Chang
Jason Cumbie
Jeff Kimbrel
Bill Thomas
Cait Thireault
Allison Smith
Ryan Lilley
Phillip Hillenbrand
Jayme Stout
HHMI/USDA
Kevin Ahern