Use of GIAB NA12878 at Invitae
• Performance assessment
  − Complement to known-pathogenic control samples (e.g. Coriell/GeT-RM, NIBSC). These control samples are most relevant to our product, but only ~1 variant / sample, and a limited # of such samples are available.
  − GIAB boosts n greatly, though variants aren’t generally clinically relevant
    o We also use: Mike Eberle’s NA12878 calls; internally constructed truth set for CEPH 1463 family & NA19240
  − Validation docs, performance assessment of genes with poor coverage with control samples, upcoming publications
04/13/2023
Integrating NIST Call Sets into a Validation Workflow
Validation Report
False Positive Ratio: FPR = FP/(FP + TN)
False Discovery Rate: FDR = FP/(FP + TP)
Sensitivity: Sens. = TP/(TP + FN)
Specificity: Spec. = TN/(FP + TN)
Balanced Accuracy: (Sens. + Spec.)/2
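The report metrics above can be computed directly from the four confusion-matrix counts. The following is a minimal sketch, not the code behind any report shown in these slides; the function name `validation_metrics` and the example counts are assumptions made up for illustration.

```python
def validation_metrics(tp, fp, tn, fn):
    """Return the validation-report metrics from confusion-matrix counts.

    Hypothetical helper: the name and return format are illustrative only.
    """
    def ratio(num, den):
        # A ratio with an empty denominator is undefined; report it as None.
        return num / den if den else None

    sens = ratio(tp, tp + fn)
    spec = ratio(tn, fp + tn)
    if sens is None or spec is None:
        balanced = None
    else:
        balanced = (sens + spec) / 2

    return {
        "FPR": ratio(fp, fp + tn),
        "FDR": ratio(fp, fp + tp),
        "Sensitivity": sens,
        "Specificity": spec,
        "Balanced Accuracy": balanced,
    }


# Made-up counts for illustration only (not taken from any report).
metrics = validation_metrics(tp=90, fp=5, tn=9895, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Note that FPR and Specificity share the denominator FP + TN, so they always sum to 1. Because true negatives usually outnumber variant sites by orders of magnitude in a targeted panel, FDR is often the more informative of the two error measures.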
Nephropathology Associates’ Kidney Disease Gene Panel: Excerpts from an NA12878 Validation Report
• Data provided by Marjorie Beggs (Nephropathology Associates)
• 301 genes from 13 renal disease categories
• Agilent oligo-capture followed by MiSeq 2x150 sequencing
• Genotypes/probabilities determined with a modified version of the MAQ variant caller (Li et al., 2008)
Summary of all targeted positions:
  In Standard VCF: 614
  Not in Standard VCF: 803980
  Total: 804594

Summary of targeted zero coverage positions in experiment:
  In Standard VCF: 3
  Not in Standard VCF: 5100
  Total: 5103
* Only positions with a depth greater than or equal to this value will be included in the calculation.
** The minimum value for a position to be included as a variant.
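As a quick check on the two summaries above, the totals can be verified and the share of targeted positions with zero coverage worked out. This is only a sketch: it assumes the zero-coverage counts are a subset of the corresponding "all targeted positions" counts, and the dictionary names are invented for illustration.

```python
# Counts copied from the two summaries in the validation report excerpt.
all_targeted = {"in_standard": 614, "not_in_standard": 803980, "total": 804594}
zero_coverage = {"in_standard": 3, "not_in_standard": 5100, "total": 5103}

for name, table in (("all targeted", all_targeted), ("zero coverage", zero_coverage)):
    # Each reported total should equal the sum of its two rows.
    assert table["in_standard"] + table["not_in_standard"] == table["total"], name

# Assumption: zero-coverage positions are a subset of the targeted positions,
# so dividing row by row gives the uncovered fraction of each category.
zero_fraction = {key: zero_coverage[key] / all_targeted[key] for key in all_targeted}

for key, frac in zero_fraction.items():
    print(f"{key}: {frac:.4%} of targeted positions had zero coverage")
```

Under that assumption, 3 of the 614 standard-VCF variant positions could not be assessed at all in this experiment, independent of the depth threshold chosen.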
Ion Benchmarking I
Ion Benchmarking II
Ion Benchmarking III
Background
• Clinical laboratory – Division of Genomic Diagnostics, certified by regulatory agencies (CAP).
• CWES test requires stringent validation per CAP criteria to establish performance metrics of the test.
Utilizing NIST data in validation of CWES Test
• Sequence and call variants of NA12878 at CHOP
• CHOP ROI: Agilent SureSelect V5+ (SSV5+) baits file
• Compare CHOP dataset to NIST data set for concordance
NIST Data Set Details:
* High quality reference data set on NA12878 (Dec. 2013)
* NIST’s high-confidence regions of interest (ROI)
* Variants called in 219,222 regions on hg19 assembly
*: National Institute of Standards and Technology
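The comparison described above (lab calls versus the NIST call set, restricted to positions covered by both the capture baits and the NIST high-confidence regions) can be sketched as follows. This is an illustrative outline, not CHOP's actual workflow: the function names, the toy data, the 0-based half-open interval convention, and exact matching on (chrom, pos, ref, alt) are all assumptions. Real comparisons also need consistent variant normalization, which is not shown.

```python
def intersect(a, b):
    """Intersect two interval lists of (chrom, start, end), half-open.

    Assumes each list is sorted by (chrom, start) and non-overlapping.
    """
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        (ca, sa, ea), (cb, sb, eb) = a[i], b[j]
        if ca == cb:
            start, end = max(sa, sb), min(ea, eb)
            if start < end:
                out.append((ca, start, end))
        # Advance whichever interval finishes first.
        if (ca, ea) <= (cb, eb):
            i += 1
        else:
            j += 1
    return out


def in_regions(variant, regions):
    """True if the variant position falls inside any region (linear scan)."""
    chrom, pos, _ref, _alt = variant
    return any(c == chrom and s <= pos < e for c, s, e in regions)


def concordance(lab_calls, nist_calls, lab_baits, nist_confident):
    """Count concordant and discordant calls inside the shared ROI."""
    roi = intersect(lab_baits, nist_confident)
    lab = {v for v in lab_calls if in_regions(v, roi)}
    nist = {v for v in nist_calls if in_regions(v, roi)}
    return {
        "TP": len(lab & nist),   # called by the lab and present in NIST
        "FP": len(lab - nist),   # called by the lab only
        "FN": len(nist - lab),   # present in NIST, missed by the lab
        "roi_bases": sum(e - s for _, s, e in roi),
    }


# Toy data, invented for illustration.
baits = [("chr1", 100, 200), ("chr1", 300, 400)]
confident = [("chr1", 150, 350)]
lab_calls = [("chr1", 160, "A", "G"), ("chr1", 310, "C", "T"), ("chr1", 390, "G", "A")]
nist_calls = [("chr1", 160, "A", "G"), ("chr1", 320, "T", "C"), ("chr1", 120, "G", "C")]

result = concordance(lab_calls, nist_calls, baits, confident)
print(result)
```

Calls outside the shared ROI are excluded from both sides before counting, so a lab call in a region where NIST makes no high-confidence claim is not scored as a false positive.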
Analytical Validation of Clinical Whole-Exome Sequencing (CWES) Test
LabCorp (Kyle Hart)
• We are using this data to validate our variant identification pipelines, which are based on the Qiagen/CLC software and Illumina sequence data
• We are seeking high clinical sensitivity to minimize false negatives, and we have a variety of strategies to rescue un-callable segments and confirm called variants prior to reporting to increase specificity.
NHGRI (Nancy Hansen)
• We have a variant analysis pipeline which analyzes whole exome sequence data (Illumina HiSeq2000/2500) for SNPs and small indels
• We are using the GIAB variant dataset to assess the accuracy of our pipeline and compare it to other publicly available pipelines.