Page 1
Welcome to the PHC Webinar Series
© 2010 College of American Pathologists. All rights reserved. 1
This lecture on “Next-Generation Sequencing for the
Clinical Laboratory” is given by
Karl V. Voelkerding, MD, FCAP
Your host is Jill Kaufman, PhD.
For comments about this webinar
or suggestions for upcoming
webinars, please contact
Jill Kaufman at [email protected]
THE WEBINAR WILL BEGIN MOMENTARILY. ENJOY!
Page 2
Karl Voelkerding, MD, FCAP
• Professor of Pathology at the
University of Utah
• Medical Director for Genomics and
Bioinformatics at the ARUP
Laboratories
• Past President of the Association for
Molecular Pathology
• Board certified in Clinical and
Molecular Genetic Pathology
• His research interests include
translation of nucleic acid based
technologies into diagnostics with
a current focus on complex
genetic analyses by next
generation sequencing
Page 3
Next-Generation Sequencing for the Clinical Laboratory
Karl V. Voelkerding, MD, FCAP
July 20, 2011 www.cap.org
Page 4
Disclaimer
The College does not permit reproduction of any substantial portion of the material in this Webinar without its written authorization. The College hereby authorizes attendees of the CAP Webinar to use the pdf presentation solely for educational purposes within their own institutions. The College prohibits use of the material in the Webinar – and any unauthorized use of the College’s name or logo – in connection with promotional efforts by marketers of laboratory equipment, reagents, materials, or services.
Opinions expressed by the speaker are the speaker’s own and do not necessarily reflect an endorsement by CAP of any organizations, equipment, reagents, materials or services used by participating laboratories.
Page 5
Disclosure
• I have nothing to disclose.
Page 6
Outline
• Progression: Gene Panels to Genomes
• Next Generation Sequencing Technology
• Bioinformatics
Page 7
Whole Exome
Whole Genome
Multi-Gene Diagnostics
Increasing Complexity
Progression
Page 8
Multiple Genes
Multi-Gene Diagnostics
Clinical Phenotype
Locus Heterogeneity Allelic Heterogeneity
Mutational Spectrum
Page 9
Cardiomyopathies
Hypertrophic
Dilated
Arrhythmias
Mitochondrial Disorders
Mitochondrial Genome
Nuclear Genes > 100 Genes
X-Linked Mental Retardation ~ 95 Genes
10-35 Genes
Multi-Gene Diagnostics
Page 10
Hearing Loss Retinopathies
Structure/Function Complexes and Signaling Pathways
Multi-Gene Diagnostics
Metabolic Disorders
Oncology
Page 11
More Comprehensive Approach
Diagnosis Prognosis Treatment Counseling
Multi-Gene Diagnostics
Page 12
Sanger Sequencing of Individual Genes
Multi-Gene Resequencing Microarrays
Next Generation Sequencing
Technical Options
Multi-Gene Diagnostics
Scanning and Sequencing
Page 13
Outline
• Progression: Gene Panels to Genomes
• Next Generation Sequencing Technology
• Bioinformatics
Page 14
Sanger Sequencing
Electrophoretic separation of chain termination products
PCR followed by Cycle Sequencing with dNTPs/ddNTPs
Page 15
Next Generation Sequencing
Massively parallel configuration
Sequence DNA fragment library in situ in a flow cell
A T C G
Page 16
Process
Genomic DNA or Enriched Target Genes
Fragmentation (Nebulization, Sonication, Acoustic Wave): 150-500 bp
End Repair and Adapter Ligation
+/- PCR
“Fragment Library”
Adapter - Fragment A - Adapter
Adapter - Fragment B - Adapter
Adapter - Fragment C - Adapter
Page 17
Process
“Fragment Library”: A, B, C
Clonal Amplification of Each Fragment
Emulsion Bead PCR or Surface Clusters
Sequencing of Clonal Amplicons in a Flow Cell
Page 18
Sequencing of Clonal Amplicons in a Flow Cell
Generation of Luminescent or Fluorescent Images
Conversion to Sequence
Pyrosequencing
454
Reversible Dye Terminators
Illumina
Sequencing by Ligation
SOLiD
Process
Page 19
Qualitative and Quantitative Information
Coverage
Ref Seq
G>A
Illumina
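The coverage and variant percentage shown for a position can be sketched as a simple tally of the bases that aligned reads report there. This is a minimal illustration, assuming the read bases at one reference position have already been collected into a string; the function name and the example pileup are hypothetical and not taken from any sequencing software.

```python
from collections import Counter


def summarize_position(ref_base, read_bases):
    """Return coverage, per-base percentages and non-reference percentage
    for the read bases observed at a single reference position."""
    counts = Counter(base.upper() for base in read_bases)
    coverage = sum(counts.values())
    if coverage == 0:
        return {"coverage": 0, "percent": {}, "variant_percent": 0.0}
    percent = {base: 100.0 * n / coverage for base, n in counts.items()}
    non_reference = coverage - counts.get(ref_base.upper(), 0)
    return {
        "coverage": coverage,
        "percent": percent,
        "variant_percent": 100.0 * non_reference / coverage,
    }


# Hypothetical pileup: 20 reads over a reference G, 9 of them reporting A.
example = summarize_position("G", "G" * 11 + "A" * 9)
print(example["coverage"], example["variant_percent"])
```

A variant percentage near 50% is what a heterozygous germline change would be expected to show, which is why coverage and percentage are read together.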
Page 20
Multi-Gene Diagnostics
Genomic DNA
Enrichment
Target Genes
NGS Library Preparation
Next Generation Sequencing
Interpretation
Bioinformatics
Page 21
Gene Enrichment Approaches
Genomic DNA
Amplification Based
PCR or LR-PCR
RainDance ePCR
Fluidigm
HaloGenomics
Array Capture Based
Solid Surface or In Solution
Enriched Genes
NGS
Page 22
Gene Enrichment Approaches
Genomic DNA
Amplification Based
PCR or LR-PCR
RainDance ePCR
Fluidigm
HaloGenomics
Advantage: Enrichment Specificity
Drawbacks: Not as Scalable; Instrument and Chip Costs
Array Capture Based
Solid Surface or In Solution
Advantage: Scalable to Exome
Drawbacks: Homologous Sequence Capture; Manually Complex
Page 23
Multi-Gene Diagnostics
Genomic DNA
Enrichment
Target Genes
NGS Library
Preparation
Next Generation Sequencing
Interpretation
Bioinformatics
Page 24
NGS Library Preparation
Fragment End Repair
Adapter Ligation
Fragmentation
Beckman SPRI-TE
1-10 Samples
Automation
Genomic DNA
Page 25
Multi-Gene Diagnostics
Genomic DNA
Enrichment
Target Genes
NGS Library Preparation
Next Generation Sequencing
Interpretation
Bioinformatics
Page 26
Roche 454
Illumina
Life Tech
GS FLX
Genome Analyzer
SOLiD
First Wave
GS
Junior
SOLiD 5500
SOLiD 5500xl
Helicos
Pacific Biosciences
HeliScope
Second Wave
SMRT
HiSeq
GAIIx
GAIIe
HiScanSQ
Ion Torrent
Third Wave
Next Generation Sequencing Technology
MiSeq
Page 27
Illumina HiSeq 2000
Independent Flow Cells
8 Lanes per Flow Cell
Multiple Panel Samples per Lane
1-2 Genomes per 8 Lanes
1-3 Exomes per Lane
Advantage: High Throughput
Drawbacks:
Batching
Sample Coordination
Page 28
New Platforms
Lower Throughput - Faster TAT
“Random Access”
Illumina MiSeq: Reversible Dye Terminators
Ion Torrent PGM: Monitors H+ Release
Page 29
Pyrophosphate
Hydrogen Ion
Phosphodiester Bond
Formation
Page 30
Multi-Gene Diagnostics
Genomic DNA
Enrichment
Target Genes
NGS Library Preparation
Next Generation Sequencing
Interpretation
Bioinformatics
Imperfect
Page 31
Multi-Gene Diagnostics – Parallel Testing
Genomic DNA
Enrichment
Target Genes
NGS Library Prep
Next Generation Sequencing
Interpretation
Bioinformatics
Genomic DNA
PCR
“Difficult Gene Regions”
Big Dye Terminators
Sanger Sequencing
Bioinformatics
Interpretation
Page 32
Sanger
Variant g.34142190T>C in TPM1
Reference
LR-PCR
47%
Re-Sequencing – Cardiomyopathy Genes
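A variant label such as g.34142190T>C packs a genomic position, a reference base and an observed base into one string. The sketch below shows one way to split such a label for downstream comparison between platforms. It assumes only simple single-base substitutions written in this style; the regular expression and function name are illustrative, not part of any standard library for variant nomenclature.

```python
import re

# Matches simple genomic substitutions such as "g.34142190T>C".
# Thousands separators in the position (e.g. "43,615,633") are tolerated.
VARIANT_RE = re.compile(r"^g\.([\d,]+)([ACGT])>([ACGT])$")


def parse_substitution(text):
    """Split a simple genomic substitution label into its parts."""
    match = VARIANT_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a simple genomic substitution: {text!r}")
    position = int(match.group(1).replace(",", ""))
    return {"position": position, "ref": match.group(2), "alt": match.group(3)}


tpm1 = parse_substitution("g.34142190T>C")
print(tpm1)
```

Insertions, deletions and other variant classes would need additional patterns; they are deliberately left out of this sketch.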
Page 33
Whole Exome
Multi-Gene
Diagnostics
Increasing Complexity
Progression
Page 34
Human Exome
~ 30 Megabases (~ 1% of the genome)
~ 180,000 exons (~ 20,500 genes)
Harbors “Majority” of Mendelian Mutations
“Journey to the Center of the Genome”
Gene Discovery
~ 40 Publications
July 2011
Page 35
Library Preparation
Next Generation Sequencing Library
Exome Enriched Library
Bioinformatics Analysis
Next Generation
Sequencing
Genomic DNA
Hybridize to Exome Capture Probes
Page 36
MYH7
MYBPC3
TNNT2
TNNI3
CSRP3
TPM1
MYL2
ACTC1
MYL3
PRKAG2
PLN
TNNC1
TTN
MYH6
TCAP
CAV3
Exome Sequencing – Cardiomyopathy Genes
Page 37
Exome
47%
Sanger
Variant g.34142190T>C in TPM1
Reference
LR-PCR
47%
Page 38
Whole Exome
Whole Genome
Multi-Gene
Diagnostics
Increasing Complexity
Progression
Page 39
Genomic DNA
Fragmentation
Process
Covaris
Acoustic Wave
300bp
QC
Page 40
Process
Library Preparation - Illumina
Fragment End Repair
QC
Adapter Ligation
Beckman SPRI-TE
PCR
BioAnalyzer
Page 41
Process
PCR Amplified Library
Gel Electrophoresis
QC
Size Selection
Gel Purification
~50bp
Page 42
Process
Gel Purified Library
Quantitative PCR
QC
Dilution/Denaturation
Flow Cell Cluster Generation
5-7pM
Illumina cBot
Page 43
Process
Sequencing HiSeq 2000
“First Base” Report
QC
Cluster Densities/Intensities
2 x 100bp Paired End
Target
~ 100+ Gb
Real Time Analysis - Cycle 25+
QC
FastQ Files
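The ~100+ Gb target and the 2 x 100bp read format can be related to expected depth with simple arithmetic. The sketch below is a back-of-the-envelope estimate only: the 3.1 Gb genome size is an assumption introduced here, and real usable depth is lower after duplicate removal and alignment filtering.

```python
def mean_depth(yield_bases, genome_size_bases):
    """Average fold coverage if the yield were spread evenly over the genome."""
    return yield_bases / genome_size_bases


def read_pairs_needed(yield_bases, read_length, paired=True):
    """Number of fragments (read pairs when paired=True) needed for a given yield."""
    bases_per_fragment = read_length * (2 if paired else 1)
    return yield_bases / bases_per_fragment


TARGET_YIELD = 100e9   # ~100 Gb target from the slide
GENOME_SIZE = 3.1e9    # assumed human genome size, not stated in the slides

print(round(mean_depth(TARGET_YIELD, GENOME_SIZE), 1))
print(read_pairs_needed(TARGET_YIELD, read_length=100))
```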
Page 44
Whole Genome Sequencing
Chr 10: g.43,615,633C>G in RET
Page 45
Mother
Son
Whole Genome Sequencing
Chr X: 3bp deletion in FOXP3
Page 46
Outline
• Progression: Gene Panels to Genomes
• Next Generation Sequencing Technology
• Bioinformatics
Page 47
Next Generation Sequencing Bioinformatics
Base Quality Scores
Conversion to Bases
Signal to Noise
Giga-Terabyte
Image Capture
Image Processing
Sequence Read Files
Variant
Identification
Annotation
Alignment
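Base quality scores are commonly reported on the Phred scale, where Q = -10 log10(p) and p is the estimated probability that the base call is wrong. The sketch below converts between the two and decodes a FASTQ quality string. The ASCII offset of 33 is an assumption: some instrument software versions have used an offset of 64, so the offset should be checked against the files being analyzed.

```python
def phred_to_error_probability(q):
    """Convert a Phred quality score to the estimated base-call error probability."""
    return 10 ** (-q / 10)


def decode_quality_string(quality, offset=33):
    """Decode a FASTQ quality string into a list of Phred scores.

    offset=33 is assumed here; pass offset=64 for files that use that encoding.
    """
    return [ord(ch) - offset for ch in quality]


scores = decode_quality_string("I5!")
print(scores)
print([phred_to_error_probability(q) for q in scores])
```

Under this scale, Q20 corresponds to a 1 in 100 chance of an incorrect base and Q30 to 1 in 1,000.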
Page 48
Variant Q Score
Flanking Q Scores
Forward and
Reverse Reads
Variant
Coverage
Variant
Percentage
Unique Reads
Duplicates
Software(s) Parameters
Target Specific
Issues
Enrichment
Co-Capture
Alignment Considerations
Pre-Filter
Mismatches
Gaps
Application
Dependent
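Several of the considerations on this slide (variant Q score, coverage from unique reads, variant percentage, support on both forward and reverse reads) can be combined into a pre-filter. The sketch below is illustrative only: the class, its field names and every threshold are hypothetical, and appropriate values are application dependent, as the slide notes.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    variant_q: int          # quality score of the variant base
    coverage: int           # unique (de-duplicated) reads covering the site
    variant_reads_fwd: int  # variant-supporting reads on the forward strand
    variant_reads_rev: int  # variant-supporting reads on the reverse strand

    @property
    def variant_percent(self):
        if self.coverage == 0:
            return 0.0
        supporting = self.variant_reads_fwd + self.variant_reads_rev
        return 100.0 * supporting / self.coverage


def passes_prefilter(call, min_q=20, min_coverage=10, min_percent=20.0):
    """Apply hypothetical thresholds; real cutoffs must be validated per assay."""
    return (
        call.variant_q >= min_q
        and call.coverage >= min_coverage
        and call.variant_percent >= min_percent
        and call.variant_reads_fwd > 0   # require support on both strands
        and call.variant_reads_rev > 0
    )


both_strands = VariantCall(variant_q=30, coverage=40, variant_reads_fwd=10, variant_reads_rev=9)
one_strand = VariantCall(variant_q=30, coverage=40, variant_reads_fwd=19, variant_reads_rev=0)
print(passes_prefilter(both_strands), passes_prefilter(one_strand))
```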
Page 49
Alignment Software
Academic Software
• Command Line
• Free
• Genome community support
maq assemble [-sp] [-m maxmis] [-Q maxerr] [-r hetrate] [-t coef] [-q minQ] [-N nHap] out.cns in.ref.bfa in.aln.map 2> out.cns.log
Commercial Software
• Feature rich user interface
• $$$
• Company support
Page 50
BFAST (UCLA)
BWA (Sanger Institute)
SAMtools (Sanger Institute)
Academic Software
Commercial Software
Alignment Software
Page 51
Next Generation Sequencing Bioinformatics
Base Quality Scores
Conversion to Bases
Signal to Noise
Giga-Terabyte
Image Capture
Image Processing
Sequence Read Files
Variant
Identification
Annotation
Alignment
Page 52
dbSNP
OMIM
Locus Specific Databases
Literature and Internet
Functional Prediction Programs
Variant Annotation
1000 Genomes Project
Human Gene Mutation Database
PolyPhen
SIFT
Page 53
TCGAAGTCTGCCTAGCTGT
CCGTACGTCTGATGCGTA
Manually
Complex
Automation
Convergence of Chemistry + Bioinformatics
Large
Cognitive
Component
Pipelines
Page 54
Variant Calling
(15,000-20,000)
Filter Out
Common
Variants
(750-1000)
Variant
Annotation
Genes/Regions
Family/SNP
Arrays (10-200)
Nonsense
Splicing &
Frame shifts
Missense:
Protein Function
Predictions
Functional
Relevance
VAAST (Variant Annotation, Analysis, and Search Tool)
Candidate Genes
Approach 3
Approach 1
Approach 2
Evolutionary
Constraint
Exome Sequencing
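The "Filter Out Common Variants" step can be sketched as a lookup against a table of population allele frequencies. This is a minimal illustration under stated assumptions: the variant keys, the frequency table and the 1% cutoff are all hypothetical, and in practice the frequencies would come from resources such as dbSNP or the 1000 Genomes Project.

```python
def filter_common(variants, population_frequency, max_frequency=0.01):
    """Keep variants that are absent from the frequency table or rarer
    than max_frequency (a hypothetical cutoff)."""
    kept = []
    for variant in variants:
        frequency = population_frequency.get(variant, 0.0)
        if frequency <= max_frequency:
            kept.append(variant)
    return kept


# Hypothetical calls keyed by (chromosome, position, reference, alternate).
calls = [
    ("chr1", 1000, "A", "G"),
    ("chr2", 2000, "C", "T"),
    ("chr3", 3000, "G", "A"),
]
# Hypothetical population allele frequencies.
known = {
    ("chr1", 1000, "A", "G"): 0.35,
    ("chr3", 3000, "G", "A"): 0.002,
}

rare = filter_common(calls, known)
print(rare)
```

The remaining variants would then move on to annotation and the functional or family-based approaches shown on the slide.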
Page 55
Reads
Pre-existing
variant files,
chip, bead
genotype
data
BWA
Merge Alignments (SAMtools)
Duplicate reads removed (Picard)
GATK
INDEL base-quality recalibrated alignments
Variant calling (SAMtools)
VCF files
GVF Files
STEP II
STEP I
Courtesy
Mark Yandell
Martin Reese
VAAST
Page 56
VAAST – Probabilistic Candidate Gene Finder
Allele Frequencies
Cases and Controls
AA Substitution Analysis
Model Variant Severity
Combined Likelihood
Framework
Identify Aberrant Variant Combinations
Compromise Gene Function
Page 57
VAAST – Probabilistic Candidate Gene Finder
Cases, Gene X:
0.5:A 0.5:T
0.9:C 0.1:T
0.8:G 0.2:T S
*
0.9:C 0.1:T
0.2:A 0.8:G
0.1:G 0.9:C
Controls, Gene X:
0.5:A 0.5:T
0.9:C 0.1:T
0.9:C 0.1:T
0.2:A 0.8:G
0.6:G 0.4:C
Page 58
Exome Sequence Data
= Affected
VAAST – Miller Kindred Quartet
Miller + PCD
VAAST
Two Siblings
Page 59
VAAST – Miller Kindred Quartet
Miller: DHODH
PCD: DNAH5
GWS Alpha is 2.4E-6
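A genome-wide significance threshold of this size is what a Bonferroni correction gives when a conventional alpha of 0.05 is divided across every gene tested. The slides do not state how the value was derived, so the calculation below is an assumption; it uses the ~20,500 gene count from the Human Exome slide.

```python
def bonferroni_alpha(family_alpha, number_of_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return family_alpha / number_of_tests


# Assumed inputs: 0.05 overall alpha, one test per gene, ~20,500 genes.
gws_alpha = bonferroni_alpha(0.05, 20500)
print(f"{gws_alpha:.1e}")
```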
Page 60
Exome Sequence Data
= Affected
VAAST – Miller Kindred Quartet
Miller + PCD
VAAST
Two Siblings
Parents
Page 61
VAAST – Miller Kindred Quartet
Two
One
Page 62
Patient/Family
Consent
Structural Variation
Analysis
Whole
Exome/Genome
Sequencing
Variant
Identification
Annotation
Genetic
Functional Studies
Foundation for Diagnostics
Genomics Clinical Research Program
Page 63
Summary
• Progression: Gene Panels to Genomes
• Next Generation Sequencing Technology
• Bioinformatics
Page 64
Acknowledgements
ARUP Laboratories Institute for Clinical and Experimental Pathology
Genomics-Bioinformatics
Rebecca Margraf Jacob Durtschi
Emily Coonrod Perry Ridge
U of Utah Genetics
Mark Yandell Lynn Jorde
Omicia
Martin Reese
[email protected]
Page 65
Next in the Series of Free PHC Webinars
• How to Have Successful Patient Interactions,
Wednesday, August 17th, 11:00 am-12:00 pm CT
o Mary Ann Abrams, MD, MPH & Barbara Savage, MT(ASCP)
• Go to www.cap.org/institute For All Upcoming Webinars!
• Past Webinars Available Now Online at www.cap.org/institute
o Accountable Care Organizations
o Whole Genome Analysis as a Universal Diagnostic
o How to Build and Fund a Financially Viable Molecular Lab
o Cancer: The Critical Role of Pathology
o Molecular Markers in Breast Cancer
o Bethesda System: Integrating Cytology and HPV Molecular Testing
o Molecular Diagnosis for Lung Cancer Patients
o Molecular Diagnosis for Colorectal Cancer Patients
Page 66
CAP Events of Interest
• Don’t Forget to Register for CAP’11 – THE
Pathologists’ Meeting – September 11 – 14, 2011
held at the Gaylord Texan in Grapevine, Texas!
–Go to www.cap.org/CAP11 or call
1-800-967-4548. International attendees please
call 1-847-996-5891.
Page 67
For more information go to
www.cap.org/CAP11
Tuesday, Sept 12th:
TP120 Breakfast Workshop – Hot Topics in Pathology: What Every Community Pathologist Should Know About Clinical Requests for Molecular Tests (6:30-7:45 am)
Faculty: Samuel K. Caughron, MD, FCAP; Frederick L. Kiechle, MD, PhD, FCAP; Michael S. Brown, MD, FCAP
ST109 Companion Diagnostics for Targeted Therapy in Cancer (2:00-5:30 pm)
Faculty: Sanja Dacic, MD, PhD, FCAP; David Hicks, MD, FCAP; Jeffrey Kant, MD, PhD, FCAP
Wednesday, Sept 13th:
ST110 Direct-to-Consumer Genetic Testing: Staying Ahead of Patients in This Current Trend (8:00-9:00 am)
Faculty: Nazneen Aziz, PhD; Elizabeth A. Mansfield, PhD
ST111 What’s in It for Me? Using Technology to Become a Diagnostic Hero (8:00-11:30 am)
Faculty: Kenneth J. Bloom, MD, FCAP; John W. Turner, MD, FCAP