HORIZON DISCOVERY Understanding and Controlling for Sample and Platform Biases in NGS Assays
Aug 13, 2015
For Research Use Only
What is the impact of assay failure in your laboratory and how do you monitor for it?
Clinical Application of Next Generation Sequencing
Using just one sample, one workflow can test for mutation status across multiple genes
The Sources of Variability in the Next Generation Sequencing Workflow
[Figure: Quantitative Multiplex measured with an AmpliSeq Panel in three laboratories. Series: Horizon, Partner A, Partner B, Partner C. Variants: BRAF V600E, KIT D816V, EGFR ΔE746-A750, EGFR L858R, EGFR T790M, EGFR G719S, KRAS G13D, KRAS G12D, NRAS Q61K, PIK3CA H1047R, PIK3CA E545K. Axis scale: 0 to 30.]
Next-Generation Sequencing Introduction
Also known as high-throughput or massively-parallel sequencing
• Allows us to address questions that require a lot of data
• Has been applied to scientific questions across industries
  o Pharma
  o Biotech
  o Biofuels
  o Agriculture
  o Food Science
  o Archeology
  o Medicine
  o Personalized Medicine
Next-Generation Sequencing Introduction
• RNA transcriptomics
• DNA metagenomics
• DNA epigenomics
• DNA resequencing
• DNA de novo assembly
• And more…
We will focus on:
• Biological Sample
• Library Preparation
• Sequencing Platform
• Informatics Pipeline
  o View our previous webinar for more on informatics
NGS Workflow – Reference Materials
Source of Error: Biological Sample
Potential sources of bias/error include:
• User errors
  o Exogenous DNA contamination
  o Mislabelling
• Heterogeneous sample
  o Non-tumor cells
  o Mixed-cell populations (xenografts)
• Limited sample availability
  o Low quantity
• Degradation/fragmentation
  o FFPE
  o cfDNA
Formalin Compromised DNA Reference Standards
• Multiple formats for Quantitative Multiplex Reference Standard
• 11 validated positive mutations
• Frequency range: 24%-1%
• HD-C749 (Formalin-Compromised DNA): mild formalin treatment, low-level degradation
  o Lanes 2 and 4 on right
• HD-C751 (Formalin-Compromised DNA): harsh formalin treatment, highly degraded
  o Lanes 3 and 5 on right
The Quantitative Multiplex also comes in the following formats:
• HD701 (DNA): high molecular weight DNA extracted directly from cells
• HD200 (FFPE): mild formalin fixation, embedded in paraffin; once extracted, shows little degradation
[Figure: Genomic DNA ScreenTape assay, lanes 1-5, size axis in bp]
Formalin-Compromised Multiplex Reference Standard
HD-C751 HD-C749
How does formalin treatment affect downstream analysis?
Amplification bias may not be detected without appropriate controls.
Formalin-Compromised Multiplex Reference Standard
Determined ratio per batch for HD-C749 and HD-C751
Variant           Expected  "Acceptable    HD-C749  HD-C749  HD-C749  HD-C751  HD-C751  HD-C751
                  Ratio     Range"         Batch 1  Batch 2  Batch 3  Batch 1  Batch 2  Batch 3
EGFR G719S        25%       22.1%-27%      23.4%    23.8%    23.4%    24.1%    22.7%    23.2%
PIK3CA H1047R     18%       14%-21%        19.6%    20.0%    18.8%    20.7%    20.4%    20.7%
KRAS G13D         15%       12%-18%        13.8%    14.8%    12.9%    15.3%    17.8%    14.0%
NRAS Q61K         13%       10%-15%        10.4%    10.1%    12.0%    12.8%    13.5%    13.2%
BRAF V600E        11%       8.6%-12.8%     12.4%    12.5%    11.9%    12.3%    11.6%    12.7%
PIK3CA E545K      9%        7.2%-10.8%     8.0%     8.1%     8.8%     10.7%    13.1%    13.0%
KIT D816V         10%       8%-12%         10.5%    10.2%    10.2%    10.5%    21.9%    20.1%
KRAS G12D         6%        4.8%-7.2%      5.9%     6.0%     5.3%     7.2%     5.9%     7.2%
EGFR L858R        3%        2.1%-3.9%      3.2%     3.3%     3.3%     3.4%     4.3%     3.5%
EGFR ∆E746-A750   2%        1.4%-2.6%      1.9%     2.0%     1.9%     1.9%     3.3%     3.2%
EGFR T790M        1%        0.7%-1.3%      1.3%     1.3%     1.0%     1.1%     1.6%     1.2%
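The comparison this table invites can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides. It assumes the first three determined-ratio columns belong to HD-C749 and the last three to HD-C751, and that the "acceptable range" bounds are inclusive. The names `TABLE` and `out_of_range` are hypothetical.

```python
# Values transcribed from the table above (allele frequency, %).
# variant -> (range low, range high, HD-C749 batches 1-3, HD-C751 batches 1-3)
# Assumption: column order and inclusive bounds are as described in the lead-in.
TABLE = {
    "EGFR G719S": (22.1, 27.0, (23.4, 23.8, 23.4), (24.1, 22.7, 23.2)),
    "PIK3CA H1047R": (14.0, 21.0, (19.6, 20.0, 18.8), (20.7, 20.4, 20.7)),
    "KRAS G13D": (12.0, 18.0, (13.8, 14.8, 12.9), (15.3, 17.8, 14.0)),
    "NRAS Q61K": (10.0, 15.0, (10.4, 10.1, 12.0), (12.8, 13.5, 13.2)),
    "BRAF V600E": (8.6, 12.8, (12.4, 12.5, 11.9), (12.3, 11.6, 12.7)),
    "PIK3CA E545K": (7.2, 10.8, (8.0, 8.1, 8.8), (10.7, 13.1, 13.0)),
    "KIT D816V": (8.0, 12.0, (10.5, 10.2, 10.2), (10.5, 21.9, 20.1)),
    "KRAS G12D": (4.8, 7.2, (5.9, 6.0, 5.3), (7.2, 5.9, 7.2)),
    "EGFR L858R": (2.1, 3.9, (3.2, 3.3, 3.3), (3.4, 4.3, 3.5)),
    "EGFR dE746-A750": (1.4, 2.6, (1.9, 2.0, 1.9), (1.9, 3.3, 3.2)),
    "EGFR T790M": (0.7, 1.3, (1.3, 1.3, 1.0), (1.1, 1.6, 1.2)),
}


def out_of_range(table):
    """Return (variant, standard, batch, value) for each value outside its range."""
    flagged = []
    for variant, (low, high, mild, harsh) in table.items():
        for standard, batches in (("HD-C749", mild), ("HD-C751", harsh)):
            for batch, value in enumerate(batches, start=1):
                if not (low <= value <= high):
                    flagged.append((variant, standard, batch, value))
    return flagged


flagged = out_of_range(TABLE)
for variant, standard, batch, value in flagged:
    print(f"{variant}: {standard} batch {batch} = {value}% is outside the range")
```

Under these assumptions, every HD-C749 value falls inside its range, and the flagged values all come from the harshly treated HD-C751 (for example KIT D816V at 21.9% against a range of 8%-12%).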
Bias/Errors in Library Preparation
Robasky, K. et al. The role of replicates for error mitigation in next-generation sequencing. Nature Rev. Genet. 15, 56-62 (2014).
Sequencing Library Preparation
Enrichment options:
• whole-genome (not enriched)
• whole-exome capture
• custom capture
• capture-based panels
• off-the-shelf amplicon panels
• custom amplicon panels
Goal: Use a reference standard that reflects your actual sample.
Source of Error: Library Preparation
Errors arising from sequencing library preparation include:
• Uneven sequencing coverage
• Sequence changes
• Length biasing/preferential amplification
• Primer bias
• Mispriming
• Multiple Displacement Amplification (MDA)
• Incorporation of errors
From NuGEN
Structural Multiplex Reference Standard
Variant Type     Mutation               Expected Fractional Abundance (%) or CNV
SNV High GC      GNA11 Q209L            5.6
SNV High GC      AKT1 E17K              5.6
SNV Low GC       KRAS G13D              5.6
SNV Low GC       Pi3Ka E545K            5.6
Long Insertion   EGFR V769 ins          5.6
Long Deletion    EGFR (delE746-A750)    5.3
Fusion           ROS1 translocation     5.6
Fusion           RET translocation      5.6
CNV              MET amplification      4.5 x amplification
CNV              MYC amplification      9.5 x amplification
SNP              EGFR_G719S             5.3
Short Deletion   MET_p.V237fs           4.8*
SNV High GC      NOTCH1_p.P668S         5.0
Short Deletion   FLT3_p.S985fs          5.6
Short Deletion   BRCA2_p.A1689fs        5.6
Short Deletion   FBXW7_p.G667fs         5.6
*This product is part of our early access program. It is the responsibility of the individual laboratory to determine expected results specific to its assay.
Platform Bias – Overview
3 Common Platforms: Illumina, Ion Torrent, PacBio
Common sources of bias/error include:
• User error: sample overloading
• Machine failure: laser, hard drive, software, fluidics failures
• Nucleotide malfunction: fluorophore quenching, nucleotide damage, signal overlap
• Sequence context errors: high GC content, low-complexity regions, homopolymers
• Dephasing: incomplete extension, addition of multiple nucleotides
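Two of the sequence-context features listed above, GC content and homopolymers, are easy to compute for a target region. The sketch below is illustrative only and is not from the original slides. The function names and the thresholds `gc_high` and `max_run` are hypothetical and would need tuning for a given platform and assay.

```python
from itertools import groupby


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def longest_homopolymer(seq):
    """Return (base, length) of the longest single-base run, ('', 0) if empty."""
    best = ("", 0)
    for base, run in groupby(seq.upper()):
        length = sum(1 for _ in run)
        if length > best[1]:
            best = (base, length)
    return best


def flag_region(seq, gc_high=0.65, max_run=6):
    """Flag a region as error-prone using hypothetical, assay-specific thresholds."""
    gc = gc_fraction(seq)
    base, run = longest_homopolymer(seq)
    return {
        "gc_fraction": gc,
        "longest_run": (base, run),
        "high_gc": gc >= gc_high,
        "long_homopolymer": run >= max_run,
    }


example = "ACG" + "T" * 6 + "AC"
print(flag_region(example))
```

Regions flagged this way are candidates for closer inspection when a reference standard shows a variant drifting from its expected frequency.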
Platform Bias – Illumina
Images from Illumina.
Platform Bias – Ion Torrent
Illustration: James Provost, http://spectrum.ieee.org/biomedical/devices/the-gene-machine-and-me
[Figure: error rate versus homopolymer length]
Platform Bias – PacBio
Single Molecule Real Time (SMRT) Sequencing
Image from PacBio.
Platform Bias – How can replicates help?
Cross Platform Replicates
DNA samples from blood and saliva were sequenced on two different platforms (Illumina and Complete Genomics), which resulted in 88.1% concordance of single-nucleotide variants (SNVs) across replicates.
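One way to express concordance between two replicate call sets is the share of variants found in both out of all variants found in either. The slide does not say how the 88.1% figure was computed, so the definition below (intersection over union) is an assumption. The function name and the example variants are hypothetical.

```python
def snv_concordance(calls_a, calls_b):
    """Intersection-over-union concordance of two sets of SNV calls.

    Each call is a hashable key such as (chromosome, position, ref, alt).
    Two empty call sets are treated as fully concordant.
    """
    a, b = set(calls_a), set(calls_b)
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


# Hypothetical calls from the same sample on two platforms.
platform_1 = {
    ("chr7", 100, "T", "A"),
    ("chr12", 200, "C", "T"),
    ("chr1", 300, "T", "G"),
    ("chr3", 400, "A", "G"),
}
platform_2 = {
    ("chr7", 100, "T", "A"),
    ("chr12", 200, "C", "T"),
    ("chr1", 300, "T", "G"),
    ("chr4", 500, "G", "A"),
}

concordance = snv_concordance(platform_1, platform_2)
print(f"Concordance: {concordance:.1%}")
```

Calls seen on only one platform are the natural starting point for deciding whether a discrepancy is a platform artifact or a missed true variant.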
Value of Replicates – Biological and Technical
Robasky, K. et al. The role of replicates for error mitigation in next-generation sequencing. Nature Rev. Genet. 15, 56-62 (2014).
R = replicates
Value of Technical Replicates – Process Noise
Quantitative Multiplex Reference Standard
Observed mutant ratio, %, by platform: QX100 Droplet Digital PCR (Internal QC) and Ampliseq Cancer Hotspot Panel v2*
Gene     Mutation        Specification  QX100 ddPCR  Ampliseq CHP v2*  COV
BRAF     V600E           10.5           10.2         10.3              0.01
KIT      D816V           10.0           10.4         10.1              0.01
EGFR     ΔE746 - A750    2.0            2.0          Not detected      -
EGFR     L858R           3.0            2.7          2.4               0.07
EGFR     T790M           1.0            0.9          Not detected      -
EGFR     G719S           24.5           24.4         24.8              0.01
KRAS     G13D            15.0           16.1         15.5              0.03
KRAS     G12D            6.0            5.0          5.1               0.03
NRAS     Q61K            12.5           12.8         12.6              0.01
PIK3CA   H1047R          17.5           18.6         17.9              0.01
PIK3CA   E545K           9.0            8.9          8.8               0.01
*Average of 8 runs, average coverage 2000x
Available as gDNA or FFPE ready for extraction
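Process noise across technical replicates is commonly summarized as a coefficient of variation, which is presumably what the COV column reports (an assumption, since the slide does not define it). The sketch below computes it from per-run allele frequencies. The slides give only averages, so the eight run values are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean of replicate measurements."""
    if len(values) < 2:
        raise ValueError("need at least two replicate measurements")
    average = mean(values)
    if average == 0:
        raise ValueError("mean is zero, coefficient of variation is undefined")
    return stdev(values) / average


# Hypothetical allele frequencies (%) for one variant across eight runs.
runs = [10.1, 10.4, 10.3, 10.2, 10.5, 10.3, 10.2, 10.4]
cov = coefficient_of_variation(runs)
print(f"mean = {mean(runs):.2f}%, COV = {cov:.3f}")
```

Tracking this value per variant over time gives a baseline against which a drifting run can be recognized.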
NGS Workflow
Source of Error: Bioinformatics
http://www.horizondx.com/bioinformatics-webinar.html
Next-Generation Sequencing – Wrap up
Horizon Discovery – Your Partners in Personalized Medicine
Powering Genomic Research and Translational Medicine, from Sequence to Treatment
Horizon’s mission is to be a fully integrated life science company that provides enabling products, services and research programs to clients engaged at every stage of the healthcare continuum from sequence to treatment
Horizon’s Range of Products/Services
Routinely monitor the performance of your workflows and assays with independent external controls
What extraction and quantification methods are you using?
What is the limit of detection of your workflow?
Is the impact of formalin treatment interesting to you?
What is the impact of assay failure in your laboratory and how do you monitor for it?
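For the limit-of-detection question, a rough first estimate comes from sampling statistics alone: at a given depth, how likely is it that enough variant reads are observed to make a call? The sketch below uses a simple binomial model. It ignores sequencing error, amplification bias and caller-specific filters, so it is an optimistic bound rather than a validated limit of detection. The function names and the read threshold of 5 are hypothetical.

```python
from math import exp, lgamma, log


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials, computed in log space."""
    if p <= 0.0:
        return 1.0 if k == 0 else 0.0
    if p >= 1.0:
        return 1.0 if k == n else 0.0
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log(1.0 - p))
    return exp(log_pmf)


def detection_probability(depth, allele_fraction, min_alt_reads):
    """Probability of seeing at least min_alt_reads variant reads at this depth."""
    missed = sum(binomial_pmf(k, depth, allele_fraction)
                 for k in range(min(min_alt_reads, depth + 1)))
    return min(1.0, max(0.0, 1.0 - missed))


# A 1% variant at 2000x and at 100x, requiring at least 5 supporting reads.
p_deep = detection_probability(2000, 0.01, 5)
p_shallow = detection_probability(100, 0.01, 5)
print(f"2000x: {p_deep:.4f}, 100x: {p_shallow:.4f}")
```

In practice the observed limit is set by the whole workflow, which is why it is measured with a reference standard carrying low-frequency variants rather than derived from depth alone.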
How to Test the Robustness and Sensitivity of your Workflow and Assay
[Figure: reference standards arranged by Sample Complexity and Sample Features]
• Structural Standard: DNA
• QMRS: DNA and FFPE
• GIAB: FFPE
• Gene-Specific Multiplex: DNA and FFPE
• Tru-Q: DNA
Your Horizon Contact:
Natalie LaFranzo, PhD
US Customer/Technical Support
[email protected]
+1-844-655-7800
Horizon Discovery, 7100 Cambridge Research Park, Waterbeach, Cambridge, CB25 9TL, United Kingdom
t + 44 (0)1223 655580
f + 44 (0)1223 655581
e [email protected]
www.horizondiscovery.com