Streaming algorithms for real-time analysis of Oxford Nanopore sequencing data
Minh Duc Cao
Institute for Molecular Bioscience, The University of Queensland
London Calling 2016, May 27, 2016
Streaming algorithms
Real-time analysis:
  Answer the biological questions quickly, e.g., infection diagnosis
  Run sequencing only until the answers are obtained
  Decide complementary experiments
  Save time, save money
Streaming algorithms:
  Process data input as a stream
  Continuously make inferences and update the confidence level
  Are robust to noise and scalable to massive data sets
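The idea of updating a confidence level as data streams in, and stopping once the answer is clear, can be illustrated with a minimal Python sketch. It is not the method used in this work: the names `wilson_lower_bound` and `stream_typing`, the use of a Wilson score interval, and the 0.5 threshold are all assumptions chosen for illustration. Each incoming read is assumed to have already been assigned a label (for example a species name).

```python
import math
from collections import Counter
from typing import Iterable, Iterator, Tuple


def wilson_lower_bound(successes: int, total: int, z: float = 1.96) -> float:
    """Lower end of the Wilson score interval for a binomial proportion."""
    if total == 0:
        return 0.0
    p = successes / total
    denom = 1 + z * z / total
    centre = p + z * z / (2 * total)
    margin = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total))
    return (centre - margin) / denom


def stream_typing(labels: Iterable[str], threshold: float = 0.5
                  ) -> Iterator[Tuple[int, str, float]]:
    """Consume one label per read and yield (reads_seen, top_label, lower_bound).

    Hypothetical stopping rule: stop as soon as the lower confidence bound
    on the leading label's proportion exceeds `threshold`.
    """
    counts: Counter = Counter()
    for n, label in enumerate(labels, start=1):
        counts[label] += 1
        top, hits = counts.most_common(1)[0]
        bound = wilson_lower_bound(hits, n)
        yield n, top, bound  # report the current belief after every read
        if bound > threshold:
            return  # confident enough: no need to read further


if __name__ == "__main__":
    reads = iter(["K. pneumoniae"] * 8 + ["E. coli"] * 2)
    for seen, top, bound in stream_typing(reads):
        print(seen, top, round(bound, 3))
```

Because `stream_typing` is a generator, it never needs the whole data set in memory and leaves the rest of the input unread once it stops, which mirrors "run sequencing only until the answers are obtained".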
Real-time analysis workflow
[Workflow diagram]
DNA extraction (2 hours) -> Library preparation (2.5 hours) -> Sequencing setup (0.5 hours)
Run simultaneously: MinION sequencing -> Basecalling (Metrichor) -> Fastq extraction -> Scaffold assemblies, Species typing, Strain typing, Resistance profile
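As a worked example of the timing on this slide, the three preparation steps add up to 5 hours before any sequencing data arrives. The short Python sketch below does that arithmetic; the function name `hours_to_answer` is hypothetical, and it assumes the downstream analyses keep pace with sequencing so that no further delay is added.

```python
# Preparation times in hours, as shown on the workflow slide.
PREP_HOURS = {
    "DNA extraction": 2.0,
    "Library preparation": 2.5,
    "Sequencing setup": 0.5,
}


def hours_to_answer(sequencing_hours: float) -> float:
    """Elapsed time from sample to answer.

    Assumption: analysis runs simultaneously with sequencing, so the only
    costs are the preparation steps plus the sequencing time itself.
    """
    return sum(PREP_HOURS.values()) + sequencing_hours


if __name__ == "__main__":
    print(hours_to_answer(0.5))
```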
Fastq extraction
[Workflow diagram, as on the previous slide: DNA extraction (2 hours), Library preparation (2.5 hours), Sequencing setup (0.5 hours), then MinION sequencing, Basecalling (Metrichor), Fastq extraction, Scaffold assemblies, Species typing, Strain typing, Resistance profile]
(Cao et al, 2015): Bioinformatics, DOI: 10.1093/bioinformatics/btv658
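Downstream analyses can only run in real time if sequence records are handed over one at a time as they appear. The Python sketch below shows one way to read four-line FASTQ records incrementally from any text stream. It is an illustration only, not the implementation described in the cited paper; `FastqRecord` and `read_fastq` are hypothetical names.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    quality: str


def read_fastq(stream: TextIO) -> Iterator[FastqRecord]:
    """Yield four-line FASTQ records one at a time from a text stream."""
    while True:
        header = stream.readline()
        if not header:
            return  # end of stream
        header = header.rstrip("\r\n")
        if not header:
            continue  # tolerate blank lines between records
        sequence = stream.readline().rstrip("\r\n")
        separator = stream.readline().rstrip("\r\n")
        quality = stream.readline().rstrip("\r\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


if __name__ == "__main__":
    demo = io.StringIO("@read1\nACGT\n+\nIIII\n")
    for record in read_fastq(demo):
        print(record.name, len(record.sequence))
```

The same generator works on `sys.stdin`, so it can sit in a shell pipeline between an extraction tool and an analysis tool.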
Scaffold and complete genome assemblies
[Pipeline diagram]
Inputs: stream of long reads; pre-assemblies
Aligning (BWA-MEM) -> stream of alignment records -> pairing -> stream of bridges -> connecting -> extending scaffolds
Other labels: repeats; continuing process; output in real-time
(Cao et al, 2016): bioRxiv, DOI: 10.1101/054783
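The pairing and connecting steps can be sketched in Python under heavy simplification. This is not the algorithm of the cited preprint: it ignores orientation, distances and repeats, and only shows the streaming pattern. Assumptions: each alignment record is reduced to a `(read_id, contig_id)` pair, a "bridge" is any long read seen on two different contigs, and two contigs are joined once `min_support` reads bridge them. The name `stream_scaffolds` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def stream_scaffolds(alignments: Iterable[Tuple[str, str]], min_support: int = 2
                     ) -> Iterator[List[List[str]]]:
    """Yield the current scaffold groups each time two scaffolds are joined."""
    parent: Dict[str, str] = {}  # union-find forest over contig ids

    def find(contig: str) -> str:
        parent.setdefault(contig, contig)
        while parent[contig] != contig:
            parent[contig] = parent[parent[contig]]  # path halving
            contig = parent[contig]
        return contig

    contigs_by_read = defaultdict(set)  # read id -> contigs it has hit so far
    support = defaultdict(int)          # bridge (contig pair) -> number of reads

    for read_id, contig in alignments:
        find(contig)  # register the contig
        seen = contigs_by_read[read_id]
        if contig in seen:
            continue  # same read hitting the same contig again: no new bridge
        merged = False
        for other in seen:  # pairing: this read bridges `contig` and `other`
            bridge = frozenset((contig, other))
            support[bridge] += 1
            root_a, root_b = find(contig), find(other)
            if support[bridge] >= min_support and root_a != root_b:
                parent[root_a] = root_b  # connecting: join the two scaffolds
                merged = True
        seen.add(contig)
        if merged:  # output in real time: report the updated scaffolds
            groups = defaultdict(list)
            for c in list(parent):
                groups[find(c)].append(c)
            yield sorted(sorted(g) for g in groups.values())
```

Consuming the generator with a plain `for` loop gives an updated view of the scaffolds after every join, without waiting for the run to finish.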
MinION sequencing
Sequence two K. pneumoniae strains with the MinION:
A glimpse of R9 flowcells
Time (mins)  Data (reads)  Genes
10           1230          ermA, ermG, spc, norA
15           2152          Van, VanH, VanS
20           2968          dfrA, aac6, VanX, tetU, aadA, sul1, sul3, aadD
30           3964          fusB, mphA
40           5754          VanY, cat, VanZ, VanA
50           6804          tetM, tetS
120          9426          aph6, msrC, mecA
150          13198         catpC221, fosA, blaOXA
240          15138         blaCTX, cfr
480          20412         blaZ, vgaA
780          26184         vgaALC, ermC, aac3, blaCMY, blaLAT, blaBIL
1020         36040         dfrA29
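To see how quickly the gene list grows, the table can be turned into a running total. The Python sketch below does this. It assumes that each row lists genes newly reported at that time point and that the read count is cumulative; the boundaries between run-together gene names are an interpretation of the slide. The names `DETECTIONS` and `cumulative_genes` are hypothetical.

```python
from typing import List, Tuple

# (time in minutes, reads so far, genes reported at that time), from the table.
DETECTIONS: List[Tuple[int, int, List[str]]] = [
    (10, 1230, ["ermA", "ermG", "spc", "norA"]),
    (15, 2152, ["Van", "VanH", "VanS"]),
    (20, 2968, ["dfrA", "aac6", "VanX", "tetU", "aadA", "sul1", "sul3", "aadD"]),
    (30, 3964, ["fusB", "mphA"]),
    (40, 5754, ["VanY", "cat", "VanZ", "VanA"]),
    (50, 6804, ["tetM", "tetS"]),
    (120, 9426, ["aph6", "msrC", "mecA"]),
    (150, 13198, ["catpC221", "fosA", "blaOXA"]),
    (240, 15138, ["blaCTX", "cfr"]),
    (480, 20412, ["blaZ", "vgaA"]),
    (780, 26184, ["vgaALC", "ermC", "aac3", "blaCMY", "blaLAT", "blaBIL"]),
    (1020, 36040, ["dfrA29"]),
]


def cumulative_genes(detections: List[Tuple[int, int, List[str]]]
                     ) -> List[Tuple[int, int, int]]:
    """Return (minutes, reads, genes reported so far) for every time point."""
    total = 0
    rows = []
    for minutes, reads, genes in detections:
        total += len(genes)
        rows.append((minutes, reads, total))
    return rows


if __name__ == "__main__":
    for minutes, reads, total in cumulative_genes(DETECTIONS):
        print(f"{minutes:>5} min  {reads:>6} reads  {total:>3} genes")
```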
Summary and outlook
Scaffold and complete bacterial assemblies with < 30-fold coverage
Identify pathogen species and strain with 1000 reads (< 0.5 hours sequencing)
Detect antibiotic resistance profile in a few hours of sequencing
We expect the times to be significantly shortened:
  Higher throughput with upcoming models: MinION MkII, PromethION
  Quicker library preparation