Rapid Nucleic Acid Isolation Method for Next-generation Sequencing Applications

Brown MT, Ferguson TM, Doebler R
Claremont BioSolutions LLC, Upland, CA

The introduction of the MinION sequencer has contributed significantly to advances in next-generation sequencing (NGS), giving researchers a field-portable NGS tool capable of ultra-long DNA reads. As applications for the MinION continue to expand rapidly, sample preparation continues to be overlooked and consequently lags behind. To facilitate upstream nucleic acid extraction, Claremont BioSolutions (CBIO) has developed novel sample preparation methods for the rapid isolation of DNA or RNA. Compact and field-portable, the technology can be used to isolate nucleic acids from difficult eukaryotes and prokaryotes present in complex sample matrices (e.g., stool, sputum, blood, soil, and tissue) in as little as 15 minutes. Initial testing of DNA isolated from E. coli and human pancreatic tumor tissue shows that the DNA samples are compatible with the MinION and can achieve read lengths >100 Kb. CBIO’s technology is customizable to the application and can be integrated into a cartridge format for automated sample preparation. Because they can be customized for a variety of bench and field applications, CBIO’s sample preparation methods appear well suited for use with the MinION sequencing platform.

Background

Conclusions

For applications that require ultra-long sequence reads, including the detection of structural variants and antibiotic resistance islands, the MinION platform offers significant promise. In our preliminary testing we were able to sequence ultra-long DNA (>100 Kb) from both E. coli and pancreatic tumor tissue samples purified using DNAexpress™ technology and align the reads with reference genomes. The system was relatively straightforward to set up and very mobile, which fits well with our applications.
While we did observe significant loss of ultra-long DNA during library preparation, there appear to be steps that can be employed to minimize loss and/or shearing. Overall, we felt the MinION sequencer offered a convenient and effective mobile solution for our downstream NGS needs and validated our rapid method for ultra-long DNA isolation from bacteria and tissue samples.

Funding for this poster was made possible in part by NIH Phase I SBIR Grant 5R43GM109502-2.

Abstract

Figure 1. Rapid Sample Preparation

Results

Sample Preparation

E. coli Cell Lysis
Method 1: 5 × 10^8 CFU were lysed by passing the cells through an OmniLyse® device (Figure 3A) for 2 minutes using a 3 V battery pack.
Method 2: 5 × 10^8 CFU were incubated in DNAexpress buffer containing 50 mg of lysozyme for 10 min at 37 °C.

Tissue Sample Preparation
Frozen mouse and human tissue samples (10 mg) were thawed in DNAexpress buffer and disaggregated using a microHomogenizer™ device (Figure 3B) for 2 minutes using a 1.5 V battery pack.

Methodology

Figure 4. PFGE Analysis of Purified DNA
L = Lambda hindi marker (48 Kb to 1 Mb)
Lane 1. E. coli DNA isolated using DNAexpress™ kit following lysis using OmniLyse
Lane 2. E. coli DNA isolated using DNAexpress™ column following lysozyme digestion and prototype DNAexpress column
Lane 3. E. coli DNA isolated using DNAexpress™ kit following lysozyme digestion (no bead beating) and modified DNAexpress column
Lane 4. DNA isolated from tissue, post-microHomogenizer and modified DNAexpress
Lane 5. DNA isolated from mouse lung, post-microHomogenizer and modified DNAexpress

MinION Sequencing
• Nanopore sequencing technology is set to revolutionize the field of genomics by offering long-read sequencing in a rapid and cost-effective platform.
In contrast to second-generation short-read NGS technology, nanopore technology allows long-read sequencing of genomic DNA.
• Possible improvements in: (1) detection of cancer-associated structural variants (Norris, 2016), (2) characterization of emerging antibiotic-resistant bacterial strains (Ashton, 2015; Karlsson, 2015), or (3) adoption of NGS in clinical settings
• To achieve long-read sequencing, tools to rapidly and effectively isolate high molecular weight DNA for downstream library preparation are needed

Claremont BioSolutions’ Sample Preparation (Figures 1-2)
• CBIO’s sample preparation technology offers a rapid upstream method to isolate nucleic acid that is compatible with downstream PCR, isothermal amplification, and lateral flow detection
• Compact and battery powered, CBIO’s technology is adaptable to the bench, the biological safety hood, or field applications, including space
• CBIO’s technology can be used to isolate total nucleic acid, RNA, and DNA (including ultra-long DNA)

References
Ashton, P.M., et al. (2015). MinION nanopore sequencing identifies the position and structure of a bacterial antibiotic resistance island. Nature Biotechnology, 33(3), 296-300.
Frith, M.C., et al. (2010). Parameters for accurate genome alignment. BMC Bioinformatics, 11, 80.
Karlsson, E., et al. (2015). Scaffolding of a bacterial genome using MinION nanopore sequencing. Scientific Reports, 5.
Li, H., et al. (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics, 25(16), 2078-2079.
Norris, A.L., et al. (2016). Nanopore sequencing detects structural variants in cancer. Cancer Biol Ther, 17(3), 246-253.
Schonfeld, J. (2015). Quantitative PCR tools for spaceflight studies of gene expression aboard the International Space Station. NASA Facts, FS-2015-06-01-ARC.
Watson, M., et al. (2015). poRe: an R package for the visualization and analysis of nanopore sequencing data. Bioinformatics, 31(1), 114-115.

Figure 2.
Integration and Automation of Sample Preparation
CBIO SimplePrep® Technology: automated lysis and DNA extraction from hard-to-lyse samples in 6 minutes
NASA Ames WetLab-2 SPM: integrates CBIO’s technology for rapid extraction of RNA from biological samples in space; currently on the ISS to isolate RNA for gene expression studies (Schonfeld, 2015)

                      E. coli       Panc
Total reads           6,529         28,510
2D workflow success   847           2,325
Total 2D yield        21 Mb         36 Mb
Longest 2D            121,683 bp    445,574 bp*
Peak 2D score         9.9           9.7
Median 2D score       8.6           8.9

Figure 3. ClaremontBio’s (A) OmniLyse® and (B) microHomogenizer™ Devices

DNA size was analyzed using pulsed-field gel electrophoresis (PFGE; Fig. 4) and linear stretching methods (data not shown).
• DNA isolated using DNAexpress from OmniLyse-treated cells averaged ~40 Kb
• DNA isolated using DNAexpress from samples prepared using enzymatic lysis or homogenization was ultra-long (avg. ~200 Kb)

[Workflow diagram labels: OmniLyse; microHomogenizer; DNAexpress/RNAexpress or PureLyse (3 min lysis/extraction)]

MinION Library Prep
DNA from E. coli and human pancreatic tumor tissue was chosen for MinION analysis, and libraries were prepared with 1 μg of purified DNA. As per the MinION genomic library protocol, the DNA was first treated using an FFPE repair kit (NEB). All steps of the genomic DNA sequencing protocol were followed except the Covaris g-tube shearing step, which was omitted to keep the DNA ultra-long. Wide-bore tips were used in the preparation of the pancreatic tissue library to minimize DNA shearing.

MinION Run and Data Analysis
The MinION R7 flow cells were run for >36 hours on MinKNOW software, and base calling was performed using Metrichor™ software (Oxford Nanopore). Fast5 files were converted to fastq and fasta files, and histogram analysis was performed using poRe (Watson, 2015). Reads were aligned to reference sequences using LAST (Frith, 2010) and converted to BAM format using maf-convert and SAMtools (Li, 2009).
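The read-length summary step described above (fastq conversion followed by histogram analysis) can be illustrated with a minimal sketch. This is not poRe and not the pipeline used for the poster; it is a plain-Python stand-in that assumes a simple four-line-per-record fastq file. The function names, the bin width, and the toy reads are all hypothetical.

```python
from io import StringIO
from statistics import mean


def read_lengths(handle):
    """Yield the sequence length of each record in a 4-line-per-record fastq stream."""
    while True:
        header = handle.readline()
        if not header:            # end of file
            break
        if not header.strip():    # tolerate blank lines between records
            continue
        if not header.startswith("@"):
            raise ValueError(f"unexpected fastq header: {header!r}")
        sequence = handle.readline().strip()
        handle.readline()         # '+' separator line (ignored)
        handle.readline()         # quality string (ignored here)
        yield len(sequence)


def summarize(lengths, bin_width=10_000):
    """Return read count, total yield, mean and longest read, plus a length histogram."""
    lengths = list(lengths)
    if not lengths:
        raise ValueError("no reads found")
    histogram = {}
    for n in lengths:
        bin_start = (n // bin_width) * bin_width
        histogram[bin_start] = histogram.get(bin_start, 0) + 1
    return {
        "reads": len(lengths),
        "total_bases": sum(lengths),
        "mean_length": mean(lengths),
        "longest": max(lengths),
        "histogram": dict(sorted(histogram.items())),
    }


# Toy input (hypothetical reads, far shorter than real nanopore reads).
toy_fastq = "@read1\nACGTACGT\n+\nIIIIIIII\n@read2\nACGT\n+\nIIII\n"
stats = summarize(read_lengths(StringIO(toy_fastq)), bin_width=5)
print(stats)
```

For a real file, the same functions could be called on an opened fastq handle, with a bin width such as 10 Kb so that the >100 Kb tail of the distribution is easy to see.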
DNA Isolation and Analysis
Cell lysates, described above, were mixed with DNAexpress binding buffer, and the DNA was loaded onto CBIO’s fast-flow DNAexpress™ column. Bound DNA was washed twice and eluted in TE buffer as per the kit protocol. Purified DNA was quantified using a Tecan M200 PRO NanoQuant instrument. DNA size was analyzed using pulsed-field gel electrophoresis as shown in Figure 4.

Table 1. Metrichor Data Output

Isolated DNA from E. coli and human pancreatic tumor tissue was prepared using the genomic library protocol. We observed a loss of 80-90% of the ultra-long DNA during library prep (DNA was recovered from AMPure beads after library prep). The final yield was 197 ng of DNA for the E. coli sample and 72 ng for the pancreatic sample. During the sequencing runs, MinKNOW reported 139 active pores in the E. coli flow cell and 995 active pores in the pancreatic tissue flow cell. Both flow cells were run for >36 hours. We observed no issues during the E. coli DNA run, but MinKNOW reported “No data received” errors during the pancreatic DNA run. We stopped the run, rebooted the system, and restarted MinKNOW. This happened twice; after each restart we were able to continue collecting data, including ultra-long DNA sequence reads.

Figure 5. MinION Run Data Analysis. 2D histogram analysis for E. coli (A) and human pancreatic tumor tissue (C) 2D “pass” data sets. Plot of read length vs. quality score for E. coli (B) and pancreatic tumor tissue (D). 2D reads are plotted in red (qscore >9) and green (qscore <9), and 1D reads are plotted in blue.

Figure 6. Alignment of E. coli and Pancreatic Tissue Reads with Reference Sequence. (A) Blastn analysis of an ultra-long 2D read from E. coli (102 Kb) showing 83% alignment with the reference E. coli K-12 genome. (B) Alignment of the E. coli 2D data set with the K-12 genome generated 90% coverage (2X) in NCBI Genome Workbench.
(C) Blastn analysis of a long 1D read (86 Kb) from the pancreatic tissue data set showed alignment with a gene for SLIT-ROBO Rho GTPase Activating Protein 2C from human chromosome 1. The query sequence contained a possible 18 Kb insert when aligned with the genome.

*Panc: the longest 2D read of 445 Kb appears to be the addition of the template (276 Kb) and complement strand (207 Kb).

[Figure 5 panels B and D: sequence length (bp) vs. mean quality score, 100 Kb marked. Figure 6 panel: coverage (X).]

Sizing of DNAexpress™ kit purified DNA

MinION Sequencing and Data Analysis
From the Metrichor analysis, both the E. coli and pancreatic MinION runs generated 2D and 1D reads >100 Kb (Figure 5 A-D). The mean read size was 10 Kb for the E. coli sample and 4 Kb for the pancreatic sample, substantially less than the DNA sizes observed in the PFGE analysis. The difference may be due to shearing during library preparation or to loss of larger DNA (entanglement with beads). 2D reads from both data sets aligned to the reference genomes with ~80-85% identity, while template read alignments were below 80% identity.
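As an illustration of what a percent-identity figure such as "~80-85%" measures, the sketch below computes identity from a gapped pairwise alignment as matching columns divided by total alignment columns. This is one common definition and is only an assumption here; the poster does not state how LAST or Blastn identity was calculated, and the function name and toy sequences are hypothetical.

```python
def percent_identity(query_aln, ref_aln):
    """Percent identity of two aligned, equal-length strings ('-' marks a gap).

    Assumed definition: identical non-gap columns / all alignment columns.
    """
    if len(query_aln) != len(ref_aln):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    columns = 0
    for q, r in zip(query_aln.upper(), ref_aln.upper()):
        if q == "-" and r == "-":
            continue              # skip columns that are gaps in both rows
        columns += 1
        if q == r:                # both are the same base (neither is a gap here)
            matches += 1
    if columns == 0:
        raise ValueError("alignment has no usable columns")
    return 100.0 * matches / columns


# Toy alignment (hypothetical): one gap and one mismatch across 10 columns.
toy_query = "ACGT-ACGTA"
toy_ref = "ACGTTACCTA"
identity = percent_identity(toy_query, toy_ref)
print(f"{identity:.1f}% identity")
```

Under this definition, insertions and deletions lower identity just as mismatches do, so the figure reflects every type of disagreement between the read and the reference.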