Science Webinar Series: DNA Target Enrichment Strategies
10 June, 2009
Participating Experts:
Daniel Turner, Ph.D., Wellcome Trust Sanger Institute, Cambridge, UK
Kelly Frazer, Ph.D., Scripps Genomic Medicine, San Diego, CA
Brought to you by the Science/AAAS Business Office
Sponsored by: www.opengenomics.com/SureSelect
Transcript
Enrichment of sequencing targets from the human genome
Kelly A Frazer, PhD
Director, Genomic Biology
Scripps Genomic Medicine
June 10, 2009
What is targeted sequencing?
• Define sequence targets: select regions of genomic DNA
• Target‐enriched samples
• Sequence
Next‐Gen Sequencing
• Low cost per nucleotide of raw sequence ($0.00001 per base).
• Best suited for generating large amounts of raw sequence data per sample (10^9 nucleotides per day).
Still too costly and too low‐throughput to perform whole‐genome sequencing on many different DNA samples.
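The cost argument above can be made concrete with a little arithmetic. The sketch below is illustrative only: the per-base cost is the slide's figure, the daily output is read as 10^9 nucleotides, and the genome size (about 3 billion bases), the 30x mean depth, and the 3.5 Mb target set are assumptions chosen for the example, not values from the talk.

```python
# Back-of-the-envelope raw sequencing cost and instrument time.
COST_PER_BASE_USD = 0.00001   # from the slide
BASES_PER_DAY = 1e9           # slide figure, read as 10^9 nucleotides per day
GENOME_SIZE_BP = 3e9          # assumption: approximate human genome size
TARGET_SIZE_BP = 3.5e6        # assumption: an example 3.5 Mb target set
COVERAGE = 30                 # assumption: illustrative mean depth


def raw_cost_and_days(region_bp, coverage):
    """Return (cost in USD, instrument days) to sequence a region at a given depth."""
    bases = region_bp * coverage
    return bases * COST_PER_BASE_USD, bases / BASES_PER_DAY


genome_cost, genome_days = raw_cost_and_days(GENOME_SIZE_BP, COVERAGE)
target_cost, target_days = raw_cost_and_days(TARGET_SIZE_BP, COVERAGE)

print(f"Whole genome: ${genome_cost:,.0f}, {genome_days:.1f} days per sample")
print(f"Target set:   ${target_cost:,.0f}, {target_days:.3f} days per sample")
```

Under these assumptions a whole genome costs about $900,000 and 90 instrument days per sample, against about $1,050 for the target set. The target figure ignores off-target reads, so real costs depend on enrichment specificity.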
Why perform targeted sequencing?
To efficiently use current technologies for population‐based sequencing studies, it is necessary to enrich for specific loci in the human genome.
Population Sequence Studies
• Sequence‐based association studies
Healthy elderly cohort versus individuals with age‐related diseases
• Functional annotation of genomic intervals
9p21 interval associated with CAD and T2D
Sample enrichment methods
• PCR – enriches target sequences with high specificity but is difficult to scale
• Hybridization‐based methods – long oligonucleotides in solution allow for efficient capture of ~3.5 Mb of sequence targets
• Microdroplet PCR – encapsulation of PCR reactions allows for simultaneous amplification of ~4,000 targeted elements
Important parameters
• Efficiency of assay design
– The fraction of targeted base pairs for which an assay can be designed
• Specificity of target enrichment
– The fraction of high quality reads that map directly on the targeted sequences
• Coverage uniformity across targeted sequences
– If coverage differs greatly, one has to sequence deeply to adequately cover underrepresented bases
• Reproducibility across technical replicates & samples
• Systematic allelic biases resulting in drop‐out effects
– Errors of this nature result in high rates of incorrectly called heterozygous variant sites
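The first two parameters are simple fractions, and a minimal sketch may help pin down the definitions. The function names, the read representation (chromosome and mapped position), and the half-open interval convention are assumptions made for this example. A real pipeline would work from alignment files and apply its own read-quality filter first.

```python
def design_efficiency(designable_bp, targeted_bp):
    """Fraction of targeted base pairs for which an assay could be designed."""
    if targeted_bp <= 0:
        raise ValueError("targeted_bp must be positive")
    return designable_bp / targeted_bp


def specificity(read_positions, targets):
    """Fraction of reads that map inside any target interval.

    read_positions: list of (chrom, pos) for reads that already passed filtering.
    targets: list of (chrom, start, end) intervals, half-open [start, end).
    """
    if not read_positions:
        raise ValueError("no reads supplied")
    on_target = 0
    for chrom, pos in read_positions:
        # Linear scan is fine for a sketch; large target sets need an index.
        if any(c == chrom and start <= pos < end for c, start, end in targets):
            on_target += 1
    return on_target / len(read_positions)


# Hypothetical toy data: one target and four filtered reads.
toy_targets = [("chr9", 100, 200)]
toy_reads = [("chr9", 100), ("chr9", 250), ("chr1", 50), ("chr9", 199)]
toy_specificity = specificity(toy_reads, toy_targets)
print(toy_specificity)
```

In the toy data two of the four reads fall inside the target, so the specificity is 0.5.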
The solution hybridization‐based method is well suited for enriching loci at the megabase scale from the human genome for population sequence studies
Microdroplet PCR Workflow
Primer library – up to 4000 different elements
Fragmented genomic DNA template
Primer design efficiency
• 47 genes – 435 exons
– 29 from ENCODE intervals
– 8 TRP channel superfamily
– 11 deep venous thrombosis
• 457 amplicons of varying sizes (119‐956 bp) and GC content (33‐74%)
Successfully designed PCR assays for all exons
Specificity of target enrichment
• 78% of filtered reads successfully mapped to a targeted amplicon
• Off‐target reads aligned across the genome in a random fashion, suggesting that background sequence is due to non‐specific genomic DNA carryover rather than off‐target amplification
Coverage uniformity across targeted sequences
Normalized coverage – the observed coverage of each base divided by the mean coverage of all targeted bases
89.6% of all bases fell within ¼ to 4 times the mean coverage
99.6% of all bases covered by at least one read
Only one amplicon completely failed
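The uniformity figures above follow from the normalization described on this slide. A minimal sketch of the calculation, assuming per-base read depths are available as a plain list; the function name and the toy depths are hypothetical, and the ¼ to 4 times window is the one quoted on the slide.

```python
def uniformity_summary(depths, low=0.25, high=4.0):
    """Summarize coverage uniformity over targeted bases.

    depths: per-base read depth for every targeted base.
    Returns (normalized depths, fraction within [low, high] x mean,
             fraction of bases covered by at least one read).
    """
    if not depths:
        raise ValueError("no depths supplied")
    mean_depth = sum(depths) / len(depths)
    if mean_depth == 0:
        raise ValueError("mean coverage is zero; nothing to normalize")
    # Normalized coverage: observed depth divided by the mean over all targeted bases.
    normalized = [d / mean_depth for d in depths]
    within = sum(low <= n <= high for n in normalized) / len(depths)
    covered = sum(d >= 1 for d in depths) / len(depths)
    return normalized, within, covered


# Hypothetical depths for six targeted bases (mean depth is 50).
toy_depths = [0, 10, 20, 30, 40, 200]
toy_norm, toy_within, toy_covered = uniformity_summary(toy_depths)
print(toy_norm, toy_within, toy_covered)
```

In the toy data four of six bases fall inside the window and five of six are covered by at least one read.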
Reproducibility of coverage
Sample‐to‐sample r² ~0.96
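The slide does not say exactly what was correlated. One plausible reading, taken here as an assumption, is the squared Pearson correlation of coverage for the same targets in two samples. A minimal sketch with a hypothetical function name:

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical per-amplicon mean coverage in two samples.
sample_a = [120, 80, 200, 45, 160]
sample_b = [110, 95, 210, 40, 150]
print(round(r_squared(sample_a, sample_b), 3))
```

Values near 1 indicate that targets with high or low coverage in one sample behave the same way in the other.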
Variant calling accuracy: comparison to microarray genotypes
Accuracy was similar in ENCODE versus non‐ENCODE interval variants and between samples of African and European ancestry, indicating that allelic biases are minimal
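A comparison of this kind can be sketched as a concordance rate over sites genotyped by both platforms. The function name, the site identifiers, and the two-letter genotype encoding are assumptions for illustration. The talk does not describe how the comparison was implemented.

```python
def genotype_concordance(sequencing_calls, array_calls):
    """Fraction of shared sites where sequencing and array genotypes agree.

    Both arguments map a site identifier to an unphased genotype string
    such as "AG". Allele order is ignored, so "AG" matches "GA".
    """
    shared = sorted(set(sequencing_calls) & set(array_calls))
    if not shared:
        raise ValueError("no sites in common")
    agree = sum(
        sorted(sequencing_calls[site]) == sorted(array_calls[site])
        for site in shared
    )
    return agree / len(shared)


# Hypothetical calls. At site2 the array reports a heterozygote but sequencing
# reports a homozygote, the pattern expected from allelic drop-out.
toy_seq = {"site1": "AG", "site2": "GG", "site3": "CT"}
toy_array = {"site1": "AG", "site2": "AG", "site3": "TC"}
toy_concordance = genotype_concordance(toy_seq, toy_array)
print(toy_concordance)
```

Computing the rate separately for subsets of sites or samples, for example by interval type or ancestry group, allows the kind of comparison the slide reports.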
The microdroplet PCR process is extremely efficient with almost 100% of all primer pairs successful. The data generated is well suited for performing population‐based sequence studies.
Selecting a method
• Study design
– Known functional elements or entire intervals
– Total amount of targeted sequences
– Number of samples
• Sequencing Technology
Acknowledgements
STSI/Scripps Genomic Medicine
Ryan Tewhey
Kazu Nakano
Wendy Wang
Sarah Murray
Olivier Harismendy
Eric Topol