11/13/18
NGS: current and future platforms
Natacha Couto
Department of Medical Microbiology
University Medical Center Groningen, RUG
Groningen, 21-24th October 2018
Disclosure slide for speaker at further training events
(Potential) conflict of interest: None
Potentially relevant company relationships in connection with event: Nanopore, Qiagen (no personal benefits)
Sponsorship or research funding: Nanopore, Qiagen, Roche
How does overclustering affect sequencing data?
• Overclustering affects sequencing data in the following ways:
• Lower Q30 scores: Overloaded signal intensities decrease the ratio of base intensity to background for each base. This often makes base calling ambiguous and lowers data quality.
• Lower clusters passing filter: The percentage of clusters passing filter (%PF) indicates the signal purity of each cluster. Overclustered flow cells typically have more overlapping clusters. This leads to poor template generation, which in turn decreases %PF.
• Lower data output: Reduced yield (gigabases [Gb] per flow cell) is a byproduct of lower %PF.
• Inaccurate demultiplexing: Index reads usually have low diversity by design, which can lead to poor base calling. Overclustering makes poor base calling more likely, which in turn can lead to demultiplexing failures.
• Complete run failure: In cases of extreme overclustering, focusing can fail and the run may terminate at any cycle.
Illumina Technical Note – Optimizing Cluster Density on Illumina Sequencing Systems.
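The link between %PF and data output can be illustrated with a rough calculation. This is only a sketch: it assumes yield is approximately the number of clusters passing filter multiplied by the number of sequenced cycles, and the function name `yield_gb` and all input numbers are hypothetical, not values from the technical note.

```python
def yield_gb(raw_clusters: int, pf_fraction: float, cycles: int) -> float:
    """Approximate run yield in gigabases.

    Assumption: every cluster passing filter contributes one base per cycle.
    """
    if not 0.0 <= pf_fraction <= 1.0:
        raise ValueError("pf_fraction must be between 0 and 1")
    return raw_clusters * pf_fraction * cycles / 1e9


# Hypothetical runs with the same raw cluster count and cycle number,
# differing only in the fraction of clusters passing filter.
well_clustered = yield_gb(25_000_000, 0.90, 300)
overclustered = yield_gb(25_000_000, 0.60, 300)

print(f"Well clustered: {well_clustered:.2f} Gb")
print(f"Overclustered:  {overclustered:.2f} Gb")
```

With everything else held equal, the drop in %PF translates directly into a proportional drop in yield.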
• Different Illumina sequencers have been shown to have different error profiles, which may influence the final sequences.
• We therefore evaluated the use of shotgun metagenomics and bioinformatics analyses to type DENV directly from serum and plasma samples.
• To optimize the workflow, we evaluated the effect on sequence data quality of: i) DNase I treatment to decrease the human DNA background; ii) two different library preparation methods; and iii) two sequencing platforms.
Schirmer et al. BMC Bioinf. 2016, 17: 125; Lizarazo et al. Under review 2018.
MiSeq vs NextSeq
Table 2. Sequence quality of the 4 runs performed using two different library preparation kits and two sequencing platforms.
Abbreviations: Gbp, giga base pair; PF, passing filter; Q30, quality score with base call accuracy of 99.9% (1 incorrect base in 1000 base calls).
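The Q30 definition above follows from the standard Phred relation Q = -10 · log10(P), where P is the probability that a base call is wrong. The sketch below shows that conversion and how a Q30 fraction could be computed from a FASTQ quality string. It assumes the common Phred+33 ASCII encoding; the function names and the example quality string are hypothetical.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Error probability for a Phred quality score: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Phred quality score for an error probability: Q = -10 * log10(P)."""
    return -10 * math.log10(p)


def decode_fastq_quality(qual_string: str, offset: int = 33) -> list[int]:
    """Decode a FASTQ quality string (assumed Phred+33) into integer scores."""
    return [ord(ch) - offset for ch in qual_string]


def fraction_q30(quals: list[int]) -> float:
    """Fraction of bases with a quality score of at least 30."""
    if not quals:
        return 0.0
    return sum(q >= 30 for q in quals) / len(quals)


# Q30 corresponds to 1 expected error in 1000 base calls.
print(phred_to_error_prob(30))

# Hypothetical four-base read: 'I' = Q40, '?' = Q30, '5' = Q20.
scores = decode_fastq_quality("II?5")
print(scores, fraction_q30(scores))
```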
1. Each nucleotide carries a specific fluorescent dye.
2. Once the correct nucleotide is added to the sequence by the DNA polymerase, it emits light at a specific wavelength.
3. The detector inside the zero-mode waveguide (ZMW, a nanophotonic visualization chamber) captures the light emitted.
Study 1
• The objective was to investigate the utility of Pacific Biosciences circular consensus sequencing (CCS) reads for metagenomic projects.
• They compared the application and performance of both PacBio CCS and Illumina HiSeq data with assembly and taxonomic binning algorithms using metagenomic samples representing a complex microbial community.
www.nature.com/scientificreports
Improved metagenome assemblies and taxonomic binning using long-read circular consensus sequence data
J. A. Frank, Y. Pan, A. Tooming-Klunderud, V. G. H. Eijsink, A. C. McHardy, A. J. Nederbragt & P. B. Pope
[...] function within microbial communities. Here we investigate the utility of Pacific Biosciences long and [...] SMRT cells produced approximately [...] Mb of CCS reads from a biogas reactor microbiome sample that averaged [...] nt in length and [...]% accuracy. CCS data assembly generated a comparative number of large contigs greater than [...] kb, to those assembled from a ~[...]x larger HiSeq dataset (~[...] Gb) produced from the same sample (i.e. approximately [...]% of total contigs). Hybrid assemblies produced significant enhancements in taxonomic binning and genome reconstruction of two dominant [...]
Levy and Myers. Annu. Rev. Genomics Hum. Genet. 2016, 17: 95-115.
Sequencing platforms
Manufacturer        Amplification     Detection   Chemistry
Oxford Nanopore     Single molecule   Nanopore    Nanopore
Roche Genia         Single molecule   Nanopore    Nanopore
Quantum Biosystems  Single molecule   Nanogate    Nanogate
Sequencing with a nanopore
• A nanopore is a pore of nanometer size.
• Nanopores can be divided into three categories:
• Biological – also called transmembrane protein channels, usually inserted into a substrate (membrane). They have a well-defined and highly reproducible size and structure.
• Solid-state – synthetic nanopores. They have several advantages over their biological counterparts, such as chemical, thermal, and mechanical stability, size adjustability, and ease of integration.
• Hybrid – a mixture of both, taking advantage of the features of biological and solid-state nanopores.
• However, you need the right equipment, for example the MinIT (Nanopore).
• You also need bioinformatic skills.
NGS for Tuberculosis
• Routine full characterization of Mycobacterium tuberculosis (TB) is culture-based, taking many weeks.
• Whole-genome sequencing (WGS) can generate antibiotic susceptibility profiles to inform treatment, augmented with strain information for global surveillance.
• Such data could be transformative if provided at or near the point of care.
Votintseva et al. J. Clin. Microbiol. 2017, 55(5):1285–98.
• Initial evaluation with Illumina, followed by Nanopore sequencing.
• With Illumina MiSeq/MiniSeq, the workflow from patient sample to results could be completed in 44 h/16 h at a reagent cost of £96/£198 per sample.
• For Nanopore, the estimated turnaround time to detection of resistance was 7.5 h (full profile 5 h later).
• Antibiotic susceptibility predictions were fully concordant.
Votintseva et al. J. Clin. Microbiol. 2017, 55(5):1285–98.
• They designed an adapter to the short, highly conserved termini of the influenza A virus genome to target the (-) sense RNA into a protein nanopore on the Oxford Nanopore MinION sequencing platform.
• The researchers used total RNA extracted from the allantoic fluid of chicken eggs infected with influenza rA/Puerto Rico/8/1934 (H1N1) virus (EID50 6.8 × 10^9).
• They demonstrated successful sequencing of the coding complete influenza A virus genome with 100% nucleotide coverage, 99% consensus identity, and 99% of reads mapped to influenza A virus.
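As an illustration of what metrics such as nucleotide coverage and consensus identity express, the sketch below computes both from a reference and a consensus sequence that are already aligned position by position. The definitions used here (coverage as the fraction of reference positions with a called base, identity as the fraction of called bases that match) are assumptions made for the example and not necessarily the definitions used in the study. The function name and the toy sequences are hypothetical.

```python
def coverage_and_identity(reference: str, consensus: str) -> tuple[float, float]:
    """Return (coverage, identity) for two pre-aligned, equal-length sequences.

    Assumption: uncovered positions in the consensus are marked with any
    character other than A, C, G, T or U (for example 'N' or '-').
    """
    if len(reference) != len(consensus):
        raise ValueError("sequences must be aligned to the same length")
    covered = 0
    matches = 0
    for ref_base, cons_base in zip(reference.upper(), consensus.upper()):
        if cons_base in "ACGTU":
            covered += 1
            if ref_base == cons_base:
                matches += 1
    coverage = covered / len(reference) if reference else 0.0
    identity = matches / covered if covered else 0.0
    return coverage, identity


# Toy example: 10 reference positions, one uncovered, one mismatch.
cov, ident = coverage_and_identity("ACGTACGTAC", "ACGTACGAAN")
print(f"coverage = {cov:.1%}, identity = {ident:.1%}")
```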
www.nature.com/scientificreports
Direct RNA Sequencing of the Coding Complete Influenza A Virus Genome
Matthew W. Keller, Benjamin L. Rambo-Martin, Malania M. Wilson, Callie A. Ridenour, Samuel S. Shepard, Thomas J. Stark, Elizabeth B. Neuhaus, Vivien G. Dugan, David E. Wentworth & John R. Barnes
For the first time, a coding complete genome of an RNA virus has been sequenced in its original form. Previously, RNA was sequenced by the chemical degradation of radiolabeled RNA, a difficult method that produced only short sequences. Instead, RNA has usually been sequenced indirectly by copying it into cDNA, which is often amplified to dsDNA by PCR and subsequently analyzed using a variety of DNA sequencing methods. We designed an adapter to short highly conserved termini of the influenza A virus genome to target the (-) sense RNA into a protein nanopore on the Oxford Nanopore MinION sequencing platform. Utilizing this method with total RNA extracted from the allantoic fluid of influenza rA/Puerto Rico/8/1934 (H1N1) virus infected chicken eggs (EID50 6.8 × 10^9), we demonstrate successful sequencing of the coding complete influenza A virus genome with 100% nucleotide coverage, 99% consensus identity, and 99% of reads mapped to influenza A virus. By utilizing the same methodology one can redesign the adapter in order to expand the targets to include viral mRNA and (+) sense cRNA, which are essential to the viral life cycle, or other pathogens. This approach also has the potential to identify and quantify splice variants and base modifications, which are not practically measurable with current methods.
Influenza Division, National Center for Immunization and Respiratory Diseases (NCIRD), Centers for Disease Control and Prevention (CDC), Atlanta, Georgia, USA. Matthew W. Keller and Benjamin L. Rambo-Martin contributed equally. Correspondence and requests for materials should be addressed to J.R.B. (email: fzq @cdc.gov)
Received: 23 April 2018
Accepted: 5 September 2018
Published online: 26 September 2018
Keller et al. Sci. Rep. 2018, 8:14408.
Results
RNA calibration strand: enolase II mRNA.
Sequencing RNA from crude versus purified influenza rA/Puerto Rico/8/1934 (H1N1) virus.