AJ Trio GIAB January 2016 BioNano Genomics Irys: Genome Maps for Sequence Assembly and Structural Variation
©2015 BioNano Genomics
Irys® Overview
Workflow:
Start with long, non-amplified native genomic DNA
Label sequence-specific sites (e.g., nickase motifs)
Linearize and image
Convert images to digitized molecules:
• Convert label locations to distances between labels
• Create molecular barcodes (100 kb to >1 Mb)
Assemble the molecular barcodes into consensus maps/contigs:
• Map lengths can be as long as 30 Mb
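The digitization step above can be sketched as follows. This is a minimal illustration, not the Irys software: the function name `label_distances` and the label positions are hypothetical, and positions are assumed to be already measured in kb along one linearized molecule.

```python
def label_distances(positions_kb):
    """Return the distances between consecutive labels, in kb.

    The resulting list of inter-label distances is the molecule's "barcode".
    """
    ordered = sorted(positions_kb)
    return [round(b - a, 3) for a, b in zip(ordered, ordered[1:])]


# Hypothetical label positions along one imaged molecule, in kb.
molecule = [12.4, 30.1, 41.7, 88.0, 120.5]
barcode = label_distances(molecule)
print(barcode)
```

A molecule with n labels yields n - 1 distances, which is why long molecules with many labels give more distinctive barcodes.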
Applications:
For SV discovery/detection, compare the maps to a reference or gold standard and look for changes in the patterns:
• Shifts in barcode patterns reveal insertions (additions), deletions (subtractions), inversions (re-orientations), and translocations of genome segments
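One way to picture how a shift in the barcode pattern becomes an insertion or deletion call is the sketch below. It assumes the labels of the sample map and the reference have already been matched one to one, which is the hard part and is not shown. The function names and the 1 kb threshold are illustrative assumptions, and inversions and translocations are not handled.

```python
def classify_interval(ref_kb, sample_kb, min_sv_kb=1.0):
    """Classify the size change between one pair of matched labels."""
    delta = sample_kb - ref_kb
    if delta >= min_sv_kb:
        return "insertion", delta
    if delta <= -min_sv_kb:
        return "deletion", -delta
    return "none", 0.0


def call_svs(ref_intervals, sample_intervals, min_sv_kb=1.0):
    """Compare matched inter-label distances (equal-length lists, in kb)."""
    calls = []
    for index, (ref, sample) in enumerate(zip(ref_intervals, sample_intervals)):
        kind, size = classify_interval(ref, sample, min_sv_kb)
        if kind != "none":
            calls.append((index, kind, round(size, 1)))
    return calls


# Hypothetical matched intervals: reference vs. sample genome map, in kb.
reference = [17.7, 11.6, 46.3, 32.5]
sample = [17.8, 16.9, 46.2, 20.0]
print(call_svs(reference, sample))
```

Small differences (here 0.1 kb) fall below the threshold and are treated as measurement noise rather than structural variation.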
For genome finishing, the maps serve as a scaffold:
• Sequencing contigs are converted in silico into molecular barcodes by highlighting the same sequence motifs
• These sequencing-based barcodes are then aligned to the BioNano maps
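The in silico conversion of a sequence contig into a barcode can be sketched as a motif search on both strands. The motif `GCTCTTC` is assumed here to be the BspQI recognition sequence. The function names and the toy contig are hypothetical, and real pipelines also account for label resolution limits, which this sketch ignores.

```python
def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))


def motif_sites(sequence, motif="GCTCTTC"):
    """Return sorted 0-based start positions of the motif on either strand."""
    sequence = sequence.upper()
    sites = set()
    for pattern in {motif, revcomp(motif)}:
        start = sequence.find(pattern)
        while start != -1:
            sites.add(start)
            start = sequence.find(pattern, start + 1)
    return sorted(sites)


# Toy contig: forward motif, reverse-strand motif, forward motif.
contig = "AAGCTCTTCAAAAAGAAGAGCTTTTGCTCTTCGG"
sites = motif_sites(contig)
intervals = [b - a for a, b in zip(sites, sites[1:])]
print(sites, intervals)
```

The list of intervals plays the same role as the inter-label distances measured on imaged molecules, so the two can be aligned against each other.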
Example of a Typical Irys Raw Data Generation and Next Generation De Novo Genome Map Assembly
Irys® Applied
Nicking enzyme                   BspQI
Data input (molecules >150 kb)   256 Gb
Single molecule N50              235 kb
Genome map N50                   1.59 Mb
Number of genome maps            2494
Total length                     2.75 Gb
0.75 Mb
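The N50 values reported above follow the standard definition: the length L such that molecules (or maps) of length L or longer contain at least half of the total length. A minimal sketch, with hypothetical map lengths:

```python
def n50(lengths):
    """Return L such that items of length >= L cover at least half the total."""
    if not lengths:
        raise ValueError("n50 requires at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical genome map lengths in Mb (total 11.5 Mb).
maps_mb = [4.0, 3.0, 2.0, 1.0, 1.0, 0.5]
print(n50(maps_mb))
```

Because N50 is weighted by length, it is typically larger than the mean: in the table above, 2.75 Gb over 2494 maps gives a mean of about 1.10 Mb against a genome map N50 of 1.59 Mb.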
Haplotype-Aware Assembly
Raw data (molecules >150 kb)      Father         Mother         Son
BNG* data input                   268 Gb (87X)   289 Gb (93X)   340 Gb (110X)
Single molecule N50               304 kb         261 kb         265 kb

Assembly stats
Number of genome maps             2050           2119           2415
Genome map assembly size**        5.24 Gb        5.22 Gb        5.03 Gb
Genome map size N50               4.46 Mb        3.93 Mb        3.43 Mb
Number of maps aligned to hg19    1939           2079           2319
% genome maps aligned to hg19     95%            98%            96%
% overlap with hg19               90%            90%            89%

SV calls (>1 kb)
Deletion                          1215           1192           1158
Insertion                         2468           2440           2417

* BNG: BioNano Genomics
** Diploid assembly
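The coverage and alignment figures in the table can be cross-checked with simple arithmetic. The sketch below copies the values from the table. The helper names are hypothetical, and the genome size is derived from the reported numbers rather than stated in the source.

```python
# Values copied from the trio table above.
trio = {
    "Father": {"input_gb": 268, "coverage_x": 87, "maps": 2050, "aligned": 1939},
    "Mother": {"input_gb": 289, "coverage_x": 93, "maps": 2119, "aligned": 2079},
    "Son": {"input_gb": 340, "coverage_x": 110, "maps": 2415, "aligned": 2319},
}


def implied_genome_size_gb(input_gb, coverage_x):
    """Genome size implied by coverage = data input / genome size."""
    return input_gb / coverage_x


def percent_aligned(aligned, maps):
    """Share of genome maps that align to the reference, in percent."""
    return 100 * aligned / maps


for name, row in trio.items():
    size = implied_genome_size_gb(row["input_gb"], row["coverage_x"])
    pct = percent_aligned(row["aligned"], row["maps"])
    print(f"{name}: implied genome size {size:.2f} Gb, {pct:.0f}% maps aligned")
```

All three samples imply a haploid genome size of roughly 3.1 Gb, and the computed alignment percentages round to the 95%, 98%, and 96% shown in the table.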
Structural Variation Heredity Venn Diagram
[Figure: Venn diagrams; panels: Insertion (> 1 kb) and Deletion (> 1 kb)]
Cross-validation of Various SV Calls
• BioNano SV calls validate a high proportion of NGS deletion calls from various methods and of insertion calls from the “CSHL assembly” set, but many BioNano SV calls, especially insertions, are not detectable by NGS.
[Figure: bar charts, y-axis 0% to 100%. Panels: Deletion and Insertion. Categories in each panel: BNG 2-5 kb, NGS 2-5 kb, BNG >5 kb, NGS >5 kb. Series: PBHoney tails, PBHoney spots (+/-5 kb buffer in the Insertion panel), CSHL assembly-based, CSHL Sniffles, Parliament PacBio, Parliament assembly.]
Cross-validation of BioNano SV Calls (NIST GIAB-AJ Trio) Using a Compilation of SV Calls from Other NGS Methods
• Compared against a merged NGS call set, BNG validates most NGS deletion and insertion calls larger than 2 kb. NGS validates 2-5 kb BNG SVs effectively but has low concordance/sensitivity for insertions of any size.
Only call sets with SV size information were included. Concordance is based on >= 1 bp overlap.
* 5 call sets based on PBHoney-tails, CSHL, and Parliament output. PBHoney-spots output was not included.
** 2 call sets based on CSHL output.
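The ">= 1 bp overlap" concordance rule stated above can be sketched as an interval comparison. This is an illustration under assumptions, not the analysis code used for the slides: calls are represented as hypothetical `(chrom, start, end)` tuples with half-open coordinates, and both lists are assumed to hold calls of the same SV type.

```python
def overlap_bp(a, b):
    """Overlap in bp between two half-open intervals (chrom, start, end)."""
    if a[0] != b[0]:
        return 0
    return max(0, min(a[2], b[2]) - max(a[1], b[1]))


def concordant(calls, other_calls, min_overlap_bp=1):
    """Return the calls that overlap any call in other_calls by >= min_overlap_bp."""
    return [
        call
        for call in calls
        if any(overlap_bp(call, other) >= min_overlap_bp for other in other_calls)
    ]


# Hypothetical call sets (not data from the slides).
bng_calls = [("chr1", 1000, 4000), ("chr1", 10000, 16000), ("chr2", 500, 3000)]
ngs_calls = [("chr1", 3999, 5000), ("chr1", 16000, 17000), ("chr3", 500, 3000)]

supported = concordant(bng_calls, ngs_calls)
print(supported, f"{100 * len(supported) / len(bng_calls):.0f}% supported")
```

With a 1 bp threshold, calls that merely touch at a boundary (the second pair) do not count, while a single shared base (the first pair) does. This makes the criterion permissive, which matters when breakpoints from different technologies are imprecise.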
[Figure: bar charts, y-axis 0% to 100%. Panels: Insertion** and Deletion*. Categories in each panel: 2-5 kbp and > 5 kbp. Series: NGS cross-validated by BNG; BNG cross-validated by NGS.]
Cross-validation of BioNano SV Calls (NIST GIAB-AJ Trio) Using a Compilation of SV Calls from Other NGS Methods
BNG SV size    Overlap with NGS   # SVs (BNG)   % BNG supported   Overlap with NGS   # SVs (BNG)   % BNG supported
1 – 2 kb       204                243           84%               186                689           27%
2 – 5 kb       250                290           86%               219                625           35%
5 – 100 kb     203                312           65%               56                 313           18%
100 kb – Up    15                 29            52%               3                  11            27%
Total          669                869           77%               455                1570          29%
Insertion   Deletion
*Parliament and PBHoney tail calls don’t estimate size
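The "% BNG supported" column in the table above is consistent with the ratio of the overlap count to the number of BNG SVs in each size bin. A short check, using the first group of three numeric columns copied from the table (the helper name is hypothetical):

```python
# (overlap with NGS, number of BNG SVs) per size bin, first column group.
first_group = {
    "1 - 2 kb": (204, 243),
    "2 - 5 kb": (250, 290),
    "5 - 100 kb": (203, 312),
    "100 kb - Up": (15, 29),
    "Total": (669, 869),
}


def percent_supported(overlap, total):
    """Percent of BNG SVs in a bin that overlap at least one NGS call."""
    return round(100 * overlap / total)


for size_bin, (overlap, total) in first_group.items():
    print(f"{size_bin}: {percent_supported(overlap, total)}%")
```

The same ratio reproduces the second column group as well, for example 455 of 1570 gives 29%.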
[Figure: scatter plot titled "BNG vs All 6 NGS based"; x-axis: BNG, y-axis: NGS (all methods), both from 0 to 50,000; series: Deletions and Insertions.]
Published BioNano CEPH Trio SV Calls
UGT2B17: Medically Relevant Deletion of a Large Gene Paralog
[Figure: genome maps for son, mother, and father shown against hg38 chr4; gene track: Homo sapiens UDP glucuronosyltransferase 2 family, polypeptide B17 (UGT2B17), mRNA]
Detection of D4Z4 Chr10 CNV in Subtelomeric Region
• Each repeat unit contains the two-homeobox gene DUX4, a transcriptional activator
• Paralog of the D4Z4 region on Chr4, deletion of which can cause facioscapulohumeral muscular dystrophy (< 5 D4Z4 repeat units)
Labeling motif: nicking enzyme BspQI
[Figure: genome map tracks: Mom hap1, Mom hap2, Son hap1, Son hap2, Dad hap1, Dad hap2, hg19]
Son hap1 Inherited from Mom, Demonstrated by Pileup Support from Long Single Molecules
[Figure: single-molecule pileups; tracks: Son hap1, Mom hap1/2, Son hap2]
Son hap2 Inherited from Dad hap1, Demonstrated by Pileup Support from Long Single Molecules
[Figure: single-molecule pileups; tracks: Son hap2, Dad hap1, Son hap1]
Evaluation of Conflicting Alignments and Sequence Assembly Error Correction
[Figure: panels: PacBio Only and BioNano Only; tracks: Hybrid scaffold, Hybrid, NGS, Genome maps]
Weak sequence evidence and conflicting RH map support a sequence chimera.
*Bickhart and Rosen, USDA
Summary
• Fully de novo genome map assembly for genome structure
• Validation of sequence assembly by orthogonal verification
• Hybrid scaffolding of sequence assemblies
• Structural variation detection
• Benchmarking tool for genome assembly and structural variation by genome map and single molecule alignment
Acknowledgments
• NIST-GIAB
− Justin Zook
− Marc Salit
• BioNano Genomics
− Han Cao
− Alex Hastie
− Zeljko Dzakula
− Ernest Lam
− Tiffany Liang
− Andy Pang
− Thomas Anantharaman
− Khoa Pham
− Will Stedman
• Mt. Sinai School of Medicine
− Ali Bashir
• Duke University
− Eric Jarvis
• USDA
− Derek Bickhart
− Ben Rosen