For Research Use Only. Not for use in diagnostic procedures. © Copyright 2021 by Pacific Biosciences of California, Inc. All rights reserved. Pacific Biosciences, the Pacific Biosciences logo, PacBio, SMRT, SMRTbell, Iso-Seq, and Sequel are trademarks of Pacific Biosciences. BluePippin and SageELF are trademarks of Sage Science. NGS-go and NGSengine are trademarks of GenDx. FEMTO Pulse and Fragment Analyzer are trademarks of Advanced Analytical Technologies. All other trademarks are the sole property of their respective owners.

Towards Isoform Resolution Single-Cell Transcriptomics for Clinical Applications Using Highly Accurate Long-Read Sequencing

Abstract #: 1873

Elizabeth Tseng 1, Jason G. Underwood 1, Arjun Scott Nanda 2, Vijay Ramani 2, Scott N. Furlan 3
1 PacBio, 1305 O’Brien Drive, Menlo Park, CA 94025
2 UCSF, San Francisco, CA
3 Fred Hutchinson Cancer Research Center, Seattle, WA

Improving scIso-Seq Throughput on PacBio Systems

PacBio Sequencing & Deconcatenation

Single-Cell Deconvolution With Short or Long Reads

• The PacBio Iso-Seq method generates full-length transcript sequences up to ~15 kb with high accuracy (>99.9%)
• 10X single-cell systems produce ~50% TSO-TSO artifact cDNA
• Using TSO artifact depletion and cDNA concatenation, we achieve a ~6X throughput increase, or 8-9 million full-length cDNA molecules per SMRT Cell 8M, for the 10X single-cell platform
• We applied this throughput-improvement method to 10X single-cell libraries sequenced on PacBio Sequel II systems
• We demonstrated cell barcode (BC) concordance with matching short-read libraries
• Full-length isoform information revealed distinct expression levels in T cells not observable through 3’ tagging methods

scIso-Seq Throughput Improvement Methodology

Metric                                                      Sample A     Sample B
HiFi Reads                                                  2,557,092    3,174,724
Reads with cDNA primers                                     2,151,948    2,726,226
Deconcatenated cDNAs                                        7,853,190    8,519,673
Hypothetical cDNAs w/out TSO depletion and concatenation    1,075,974    1,363,113
Effective Throughput Increase                               ~7.2X        ~6.2X
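The relationship between the rows of the throughput table can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' pipeline: it assumes the "hypothetical" row is the number of reads with cDNA primers scaled by the non-artifact fraction (taken from the "~50% TSO-TSO artifact" figure above), with one cDNA per read, and that the effective increase is the ratio of deconcatenated cDNAs to that hypothetical count. The function and variable names are made up for this example.

```python
# Counts copied from the throughput table. The 0.5 artifact fraction is an
# ASSUMPTION based on the "~50% TSO-TSO artifact cDNA" statement.
TSO_ARTIFACT_FRACTION = 0.5

SAMPLES = {
    "Sample A": {"reads_with_primers": 2_151_948, "deconcatenated_cdnas": 7_853_190},
    "Sample B": {"reads_with_primers": 2_726_226, "deconcatenated_cdnas": 8_519_673},
}


def hypothetical_cdnas(reads_with_primers, artifact_fraction=TSO_ARTIFACT_FRACTION):
    """Usable cDNAs expected with one cDNA per read and no artifact depletion."""
    return round(reads_with_primers * (1 - artifact_fraction))


def throughput_increase(deconcatenated_cdnas, reads_with_primers):
    """Ratio of observed deconcatenated cDNAs to the hypothetical baseline."""
    return deconcatenated_cdnas / hypothetical_cdnas(reads_with_primers)


def mean_cdnas_per_read(deconcatenated_cdnas, reads_with_primers):
    """Average number of cDNAs recovered from each primer-containing read."""
    return deconcatenated_cdnas / reads_with_primers


if __name__ == "__main__":
    for name, s in SAMPLES.items():
        base = hypothetical_cdnas(s["reads_with_primers"])
        gain = throughput_increase(s["deconcatenated_cdnas"], s["reads_with_primers"])
        per_read = mean_cdnas_per_read(s["deconcatenated_cdnas"], s["reads_with_primers"])
        print(f"{name}: baseline={base:,} gain={gain:.2f}X cDNAs/read={per_read:.2f}")
```

Under these assumptions the baseline counts match the "Hypothetical cDNAs" row exactly, and the computed ratios land close to the ~7.2X and ~6.2X reported in the table.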
[Figure: Distribution of Concatemers per Long Read (Sample A)]

[Figure: Sample A (5’ library) read schema]

[Figure: Transcript Classification using SQANTI3]

[Figure: Cell BC concordance, PacBio vs. Illumina. Knee plot, all BC. Annotations: 15388 short reads/cell; 936 cDNAs/cell; short reads call 8386 single cells (10X Cell Ranger)]

[Figure: UMAP of single cells from Long Reads (axes lrUMAP_1, lrUMAP_2) and Short Reads (axes srUMAP_1, srUMAP_2). Cell type legend: B Memory, B Naive, Basophils, CD14 Mono, CD16 Mono, CD4 Memory, CD4 Naive, CD8 Effector, CD8 Memory, CD8 Naive, CD8 TRB−V9, cDC, ISG15_High Treg, MAIT, Multiplets, Neutrophil, NK, pDC, Proliferating, RBC, Treg]

Assigning Isoforms to Single Cells

ISG20: Interferon Stimulated Exonuclease Gene 20

[Figure: ISG20 Isoforms Assigned to Single Cells (axes lrUMAP_1, lrUMAP_2). Legend: Multiple, None, PB.30915.37, PB.30915.4, PB.30915.5, PB.30915.7. Annotations: T cell lineages; T cells express 4 common isoforms]

[Figure: The Complete Diversity of ISG20 Isoforms Expressed in CD4 Naïve Cells. Annotations: CD8 Naïve Cells Prefer from the Downstream TSS; GENCODE Reference]