8/10/2019 RNAseq Pevsner
Outline
RNAseq: design principles
How RNAseq works
Data analysis
RNAseq: experimental principles
Experimental design principles: randomization, replication
RNAseq-specific effects
Sequencing depth
Paired-end sequencing
Biases of NGS
Sample size calculation
Validation
Fang Z, Cui X. Brief. Bioinf. (2011) 12:280
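To make the randomization principle concrete, here is a minimal Python sketch of blocked randomization: samples are shuffled within each group and dealt round-robin to processing batches, so that no batch contains only one genotype. The function name, sample IDs, group labels, and batch count are all hypothetical and not taken from the slides.

```python
import random
from collections import defaultdict

def randomize_to_batches(samples, n_batches, seed=0):
    """Assign (sample_id, group) pairs to batches, balancing groups.

    Samples are shuffled within each group, then dealt round-robin,
    so each batch receives a similar number of samples per group.
    """
    rng = random.Random(seed)  # fixed seed keeps the design reproducible
    by_group = defaultdict(list)
    for sample_id, group in samples:
        by_group[group].append(sample_id)

    batches = [[] for _ in range(n_batches)]
    for group in sorted(by_group):
        ids = by_group[group]
        rng.shuffle(ids)
        for i, sample_id in enumerate(ids):
            batches[i % n_batches].append((sample_id, group))
    return batches

# Hypothetical design: 4 knockout and 4 wild-type samples, 2 isolation dates.
samples = [(f"ko_{i}", "ko") for i in range(4)] + [(f"wt_{i}", "wt") for i in range(4)]
for day, batch in enumerate(randomize_to_batches(samples, 2), start=1):
    print("isolation day", day, batch)
```

Each isolation day then contains two knockout and two wild-type samples, so genotype is not confounded with processing date.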
Experimental design for RNAseq
[Figure: Disc1 k/o and wt samples are tagged with barcodes 1 to 9, pooled and sequenced, then demultiplexed]
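The barcode, pool, and demultiplex scheme can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular sequencer's software does it: the barcode sequences, sample names, and the one-mismatch tolerance are assumptions made for the example.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def demultiplex(reads, barcode_to_sample, max_mismatches=1):
    """Sort pooled reads into per-sample bins by their index (barcode) read.

    reads: iterable of (index_seq, read_seq) pairs.
    A read is assigned only if exactly one barcode lies within
    max_mismatches; otherwise it goes to 'undetermined'.
    """
    bins = {sample: [] for sample in barcode_to_sample.values()}
    bins["undetermined"] = []
    for index_seq, read_seq in reads:
        hits = [
            sample
            for barcode, sample in barcode_to_sample.items()
            if len(barcode) == len(index_seq)
            and hamming(barcode, index_seq) <= max_mismatches
        ]
        target = hits[0] if len(hits) == 1 else "undetermined"
        bins[target].append(read_seq)
    return bins

# Hypothetical barcodes and reads.
barcodes = {"ACGTAC": "ko_1", "TGCATG": "wt_1"}
pooled = [("ACGTAC", "AAAA"), ("TGCATC", "CCCC"), ("GGGGGG", "TTTT")]
print(demultiplex(pooled, barcodes))
# {'ko_1': ['AAAA'], 'wt_1': ['CCCC'], 'undetermined': ['TTTT']}
```

Allowing a mismatch rescues reads with a sequencing error in the index, but it only works when the barcodes differ from each other at enough positions.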
Experimental design for RNAseq
[Figure: sample groups labeled Disc1 k/o, wt, plus virus, wt, Pcm1 k/o, and wt, annotated with the design considerations below]
Date of RNA isolation: best to coordinate
Sample size: can vary, but try for n>3
Barcoding strategy
Depth of coverage
Outline
RNAseq: design principles
How RNAseq works
Data analysis
Data generation:
mRNA or RNA
Remove contaminant DNA
Fragment RNA
Reverse transcribe to cDNA
Ligate sequence adaptors
Select size range
Sequence cDNA ends
Data analysis:
Raw reads
Remove artifacts
Correct errors
Assemble transcripts
Post-process transcripts
Align reads to transcripts to quantify expression
Martin JA, Wang Z NRG (2011) 12:671
RNAseq single reads: map to genomic DNA,
detect alternative splicing events
Ozsolak F, Milos PM NRG (2011) 12:87
Ozsolak F, Milos PM NRG (2011) 12:87
RNAseq paired-end reads: map to genomic DNA,
get better map of transcript structure
and of chimeric sequences
Martin JA, Wang Z NRG (2011) 12:671
Reference-based transcriptome assembly
(a) Splice-align reads to the genome
Martin JA, Wang Z NRG (2011) 12:671
Reference-based transcriptome assembly
(b) Build graph of alternative splicing events
Martin JA, Wang Z NRG (2011) 12:671
Reference-based transcriptome assembly
(c) Traverse graph to assemble variants
Martin JA, Wang Z NRG (2011) 12:671
Reference-based transcriptome assembly
(d) Assemble isoforms
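Steps (b) to (d) can be illustrated with a toy splice graph in Python: exons are nodes, observed splice junctions are directed edges, and each path from a first exon to a last exon is a candidate isoform. This is a simplified sketch that enumerates every path in an acyclic graph; real assemblers select a subset of paths supported by the read data. The exon names and function name are hypothetical.

```python
def enumerate_isoforms(junctions, first_exons, last_exons):
    """List every path through a splice graph.

    junctions: iterable of (upstream_exon, downstream_exon) edges.
    The graph is assumed to be acyclic, which holds when exons are
    ordered by genomic position along one strand.
    """
    graph = {}
    for upstream, downstream in junctions:
        graph.setdefault(upstream, []).append(downstream)

    isoforms = []

    def walk(exon, path):
        if exon in last_exons:
            isoforms.append(path)
        for nxt in graph.get(exon, []):
            walk(nxt, path + [nxt])

    for start in first_exons:
        walk(start, [start])
    return isoforms

# Hypothetical gene with a skippable middle exon.
junctions = [("E1", "E2"), ("E2", "E3"), ("E1", "E3")]
print(enumerate_isoforms(junctions, ["E1"], {"E3"}))
# [['E1', 'E2', 'E3'], ['E1', 'E3']]
```

The two paths correspond to the exon-inclusion and exon-skipping variants of the same gene.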
Quality metrics for assessing transcriptome assemblies
Accuracy: % of correctly assembled bases, estimated using references
Completeness: % of expressed reference transcripts covered by all the assembled transcripts
Contiguity: % of expressed reference transcripts covered by a single, longest-assembled transcript
Chimerism: % of chimeras due to misassemblies (spanning two or more different reference genes)
Variant resolution: % of transcript variants resolved
Martin JA, Wang Z NRG (2011) 12:671
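As a worked illustration of two of these metrics, the Python sketch below computes completeness and contiguity from alignments of assembled transcripts to reference transcripts. The input layout, the function names, and the 80% coverage threshold used to call a reference transcript "covered" are assumptions for this example; the slide does not specify them.

```python
def covered_length(intervals):
    """Total number of bases covered by a set of half-open (start, end) intervals."""
    total, current_end = 0, None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            total += end - start          # new, non-overlapping block
            current_end = end
        elif end > current_end:
            total += end - current_end    # extend the current block
            current_end = end
    return total

def completeness_and_contiguity(references, threshold=0.8):
    """references: {name: (reference_length, [(start, end), ...])}.

    Each interval is the span of one assembled transcript aligned to
    that reference. Returns (completeness %, contiguity %).
    """
    complete = contiguous = 0
    for length, intervals in references.values():
        if covered_length(intervals) / length >= threshold:
            complete += 1                 # covered by all assembled transcripts together
        longest = max((end - start for start, end in intervals), default=0)
        if longest / length >= threshold:
            contiguous += 1               # covered by one assembled transcript alone
    n = len(references)
    return 100.0 * complete / n, 100.0 * contiguous / n

# Hypothetical alignments for four expressed reference transcripts.
refs = {
    "tx1": (1000, [(0, 500), (400, 900)]),  # complete only when pieces are combined
    "tx2": (1000, [(0, 950)]),              # complete and contiguous
    "tx3": (1000, [(0, 300)]),              # poorly covered
    "tx4": (1000, []),                      # not assembled
}
print(completeness_and_contiguity(refs))  # (50.0, 25.0)
```

The gap between the two numbers shows how fragmented the assembly is: tx1 is recovered, but only in pieces.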
Ozsolak F, Milos PM NRG (2011) 12:87
Emerging technologies for single-cell gene
expression profiling
Outline
RNAseq: design principles
How RNAseq works
Data analysis
Bowtie: short read aligner
TopHat: aligns RNAseq reads to the genome, finds splice sites
Cufflinks: assembles transcripts, finds differentially expressed transcripts
CummeRbund: explores data
Trapnell C, Nature Prot. (2012) 7:563
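The first two steps of this workflow can be sketched as command lines assembled in Python. The sketch only builds the argument lists and does not run anything. The file names, index name, and output directories are hypothetical; the options used (-p for threads, -o for output directory, -G for a reference annotation) are standard TopHat and Cufflinks options, but check the manual for the version you have installed.

```python
def tophat_cmd(genome_index, reads, out_dir, threads=8, annotation_gtf=None):
    """Build a TopHat command: splice-aware alignment of reads to a genome index."""
    cmd = ["tophat", "-p", str(threads), "-o", out_dir]
    if annotation_gtf:
        cmd += ["-G", annotation_gtf]   # optional known-gene annotation
    return cmd + [genome_index] + list(reads)

def cufflinks_cmd(alignments_bam, out_dir, threads=8):
    """Build a Cufflinks command: assemble transcripts from aligned reads."""
    return ["cufflinks", "-p", str(threads), "-o", out_dir, alignments_bam]

# Hypothetical paired-end sample.
align = tophat_cmd("genome", ["sample_1.fq", "sample_2.fq"], "sample_thout",
                   annotation_gtf="genes.gtf")
# TopHat writes its alignments to accepted_hits.bam inside the output directory.
assemble = cufflinks_cmd("sample_thout/accepted_hits.bam", "sample_clout")

print(" ".join(align))
print(" ".join(assemble))
```

With the tools installed, each list could be passed to subprocess.run(cmd, check=True), so that a failed alignment stops the pipeline before assembly starts.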
Galaxy for RNAseq analysis:
web-based collection of tools for bioinformatics analysis
Galaxy for RNAseq analysis
Tools panel includes TopHat and Cufflinks for RNAseq analysis
Galaxy for RNAseq analysis
History panel shows data files for analysis (can be included in transparent, reproducible workflows)
Galaxy for RNAseq analysis
Display panel shows data files; here, FASTQ files from RNAseq with raw sequence data and base quality scores
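To show what a FASTQ record contains, here is a minimal Python reader that returns each read's name, sequence, and per-base quality scores. It assumes simple four-line records and Phred+33 quality encoding (the common modern convention); older files may use a different offset. The function name and the example record are made up for illustration.

```python
import io

def read_fastq(handle):
    """Yield (name, sequence, qualities) from a four-line-per-record FASTQ file."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break                                   # end of file
        sequence = handle.readline().rstrip("\n")
        handle.readline()                           # '+' separator line, ignored
        quality_line = handle.readline().rstrip("\n")
        # Phred+33: quality score = ASCII code of the character minus 33
        qualities = [ord(char) - 33 for char in quality_line]
        yield header[1:], sequence, qualities       # drop the leading '@'

# Hypothetical single-read file held in memory.
example = io.StringIO("@read1\nACGT\n+\nII5!\n")
for name, sequence, qualities in read_fastq(example):
    print(name, sequence, qualities)
# read1 ACGT [40, 40, 20, 0]
```

A score of 40 corresponds to an estimated error probability of 1 in 10,000 for that base call, 20 to 1 in 100, and 0 to an essentially uninformative call.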