Computational methods to quantify transcriptome changes in bacteria Rebecca Pankow Mentor: Dr. Jeff Chang Botany and Plant Pathology Oregon State University.

Post on 22-Dec-2015


Computational methods to quantify transcriptome changes in bacteria

Rebecca Pankow
Mentor: Dr. Jeff Chang

Botany and Plant Pathology
Oregon State University

What makes a pathogen?

Infections caused by Pseudomonas syringae

• Overcome host defenses
• Manipulate host cell
• Survive in host environment

Hypothesis

Genes that are expressed in conditions that mimic the plant are candidates for host-associated genes.

Experimental Setup

Grow P. syringae in KB (rich media)

No virulence gene expression

Grow P. syringae in minimal media: simulates environment of plant host

Virulence gene expression

Identify differential expression of genes

How to identify expressed genes?

Transcriptome: all mRNAs in a cell at a given time

DNA → mRNA → protein

sequenced transcriptome

completely sequenced genome

aligning back

AGAGCAATAGCA

TAATTCTCGTTATCGTCCGGATTAAGAGCAATAGCAGGCC

AGAGCAATAGCA
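The "aligning back" step can be illustrated with a minimal sketch that looks for exact occurrences of a read in the reference sequence. It uses the read and genome fragment shown on the slide; the function name `find_exact_matches` is hypothetical, and the sketch assumes forward-strand, exact matching only, which is a simplification and not necessarily how the actual pipeline aligns reads.

```python
def find_exact_matches(read, genome):
    """Return every 0-based start position where `read` occurs in `genome`.

    Assumption: exact, forward-strand matching only (no reverse complement).
    """
    positions = []
    start = genome.find(read)
    while start != -1:
        positions.append(start)
        # Continue one base further so overlapping occurrences are also found.
        start = genome.find(read, start + 1)
    return positions


# Sequences taken from the slide above.
genome = "TAATTCTCGTTATCGTCCGGATTAAGAGCAATAGCAGGCC"
read = "AGAGCAATAGCA"

print(find_exact_matches(read, genome))  # [24]
```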

How to quantify transcriptome changes?

Next-Generation Illumina IIG Genome Sequencer

ACATAGGAGCTAGATAGCTATGCATCGATCGACATGGATCGACATGAGAGTTACGAGTAGACTGAGAGATATCTGAGAGATATGTTTACCCAGATTACTCTCCGATGCGATCGACATGAGAGTTACGAGTAGACTGAGAGATAT

mRNAs in transcriptome

36 base-long reads (36-mers)

Computational Pipeline

TGTTTACCCAGATTACTCTCCGATGCCAGGGAGAAT GATCGACAGATGCATGTTTACCCAGATTACTCTCCG ACATAGGAGCTAGATAGCTATGCATCGATCGACAGAGATCGACAGATGCATGTTTACCCAGATTACTCTCCG

Processed 36-mers

Align to ref. genome

Signal Processing

genome coordinates of a potential transcription unit

# reads that map to coordinates

Graph signal

Not very informative!

…0010100234201231201001022410301022040102020…
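One way to build such a raw signal is to count, for each genome coordinate in a region, how many reads map there. The sketch below is an illustration under assumptions: it counts read start positions (the slides do not say whether starts or full read coverage are counted), and the function name `read_start_signal` and the example positions are hypothetical.

```python
def read_start_signal(start_positions, region_start, region_end):
    """Count mapped reads per coordinate in [region_start, region_end).

    Assumption: each read contributes one count at its start coordinate.
    """
    signal = [0] * (region_end - region_start)
    for pos in start_positions:
        if region_start <= pos < region_end:
            signal[pos - region_start] += 1
    return signal


# Hypothetical read start coordinates, for illustration only.
starts = [2, 4, 4, 9]
print(read_start_signal(starts, 0, 6))  # [0, 0, 1, 0, 2, 0]
```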

Signal Processing

Using sliding window approach to minimize noise

Set each position of the processed signal = sum of reads in the sliding window over the old signal

“sliding window” = 15

First window sums: 19, 20, 22, …
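The sliding-window sum can be sketched as follows, using the raw signal from the earlier slide and a window of 15. The function name `sliding_window_sums` is hypothetical, and anchoring each window at its left edge is an assumption; the slides do not state how windows are positioned relative to the output coordinate.

```python
def sliding_window_sums(signal, window=15):
    """Return the sum of `signal` over every full window of length `window`."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(signal) < window:
        return []
    current = sum(signal[:window])
    sums = [current]
    for i in range(window, len(signal)):
        # Slide one position: add the entering value, drop the leaving one.
        current += signal[i] - signal[i - window]
        sums.append(current)
    return sums


# Raw per-coordinate read counts as shown on the "Signal Processing" slide.
raw = [int(c) for c in "0010100234201231201001022410301022040102020"]
print(sliding_window_sums(raw)[:3])  # [19, 20, 22]
```

The first three sums agree with the values 19, 20, and 22 shown on the slide.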

Resulting signal

old signal

scaled and processed signal

More informative, but signal is jagged

Smoothing the Signal

Iteration of the sliding window
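Iterating the window can be sketched as repeated passes of a windowed average, where each pass smooths the output of the previous one. This is an illustrative sketch, not the pipeline's code: it uses a centered moving average instead of a raw sum so the values stay on the same scale between passes, and the names `moving_average` and `smooth` and the default of three passes are assumptions.

```python
def moving_average(signal, window):
    """One smoothing pass: centered moving average, truncated at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


def smooth(signal, window=15, passes=3):
    """Apply the moving average `passes` times in a row."""
    result = list(signal)
    for _ in range(passes):
        result = moving_average(result, window)
    return result


# A single spike is spread over its neighbours after one pass.
print(smooth([0, 0, 9, 0, 0], window=3, passes=1))  # [0.0, 3.0, 3.0, 3.0, 0.0]
```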

Deconvoluting Signal

Changes in the signal are found by using the sliding window on the first and second derivatives of the signal.

Deconvoluting Signal

• Refine signal divisions by looking in-between previous divisions
• Categorize signal divisions as increasing, decreasing, or flat
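The categorization step can be sketched by labelling each step of a smoothed signal from its first difference and then merging consecutive steps that share a label. This is a simplified illustration: it uses only the first difference (not the second derivative or the windowed refinement described above), and the names `classify_slopes` and `segment` and the `tolerance` threshold are hypothetical.

```python
from itertools import groupby


def classify_slopes(signal, tolerance=0.5):
    """Label each step between adjacent points as increasing, decreasing, or flat."""
    labels = []
    for prev, curr in zip(signal, signal[1:]):
        delta = curr - prev  # discrete first derivative
        if delta > tolerance:
            labels.append("increasing")
        elif delta < -tolerance:
            labels.append("decreasing")
        else:
            labels.append("flat")
    return labels


def segment(signal, tolerance=0.5):
    """Merge runs of identical labels into (start, end, label) divisions.

    A division (start, end, label) spans signal[start] through signal[end].
    """
    segments = []
    start = 0
    for label, run in groupby(classify_slopes(signal, tolerance)):
        length = len(list(run))
        segments.append((start, start + length, label))
        start += length
    return segments


for division in segment([0, 0, 0, 2, 4, 6, 6, 6, 3, 0]):
    print(division)
```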

Processing Empirical Data

Next-Generation Illumina IIG Genome Sequencer

ACATAGGAGCTAGATAGCTATGCATCGATCGACATGGATCGACATGAGAGTTACGAGTAGACTGAGAGATATCTGAGAGATATGTTTACCCAGATTACTCTCCGATGCGATCGACATGAGAGTTACGAGTAGACTGAGAGATAT

36 base-long reads (36-mers)

Problems

Mistakes in sequencing can be made!

ACATAGGAGCTAGATAGCTATGCATCGATCGACATGGATCGACATGAGAGTTACGAGTAGACTGAGAGATATCTGAGAGATATGTTTACCCAGATTACTCTCCGATGCGATCGACATGAGAGTTACGAGTAGACTGAGAGATAT

30% of reads match the P. syringae genome

Solution

Account for mismatches by treating each base in a 36-mer as a wildcard

ACATAGGAGCTAGATAGCTATGCATCGATCGACATG

_CATAGGAGCTAGATAGCTATGCATCGATCGACATG

A_ATAGGAGCTAGATAGCTATGCATCGATCGACATG

AC_TAGGAGCTAGATAGCTATGCATCGATCGACATG

36-mers containing wildcards are mapped back to the original genome
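The wildcard idea can be sketched as follows: generate one variant per position with that base replaced by a wildcard, then search the genome for each variant. This is a minimal illustration that assumes forward-strand matching and uses a regular expression for the search; the function names and the short example read (a 12-mer with one deliberate error, not a real 36-mer) are hypothetical.

```python
import re


def wildcard_variants(read, wildcard="_"):
    """Return one variant per position, with that base replaced by a wildcard."""
    return [read[:i] + wildcard + read[i + 1:] for i in range(len(read))]


def map_with_wildcard(read, genome, wildcard="_"):
    """Return sorted start positions where `read` matches with at most one mismatch."""
    hits = set()
    for variant in wildcard_variants(read, wildcard):
        # The wildcard matches any single base; the lookahead lets
        # overlapping matches be reported as well.
        pattern = re.compile("(?=" + variant.replace(wildcard, ".") + ")")
        for match in pattern.finditer(genome):
            hits.add(match.start())
    return sorted(hits)


genome = "TAATTCTCGTTATCGTCCGGATTAAGAGCAATAGCAGGCC"
read_with_error = "AGAGCTATAGCA"  # differs from the genome at one base

print(genome.find(read_with_error))                # -1: no exact match
print(map_with_wildcard(read_with_error, genome))  # [24]
```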

Conclusions

• Computational pipeline developed to
  – Generate and smooth signal
  – Divide signal into sections that are going up, down, or are flat

• 30% of reads from transcriptome map back to original genome

Future Work

Quantify changes in bacterial transcriptome under different treatments

Acknowledgements

Jeff Chang

Jason Cumbie
Jeff Kimbrel
Bill Thomas
Cait Thireault
Allison Smith
Ryan Lilley
Phillip Hillenbrand
Jayme Stout

HHMI/USDA
Kevin Ahern
