1
ASHG Redux 2008
• Session -- Using DNA sequence to detect variation related to disease
– Richard Wilson – WashU – deep sequencing of cancer tumors (AML) identified variations in 8 genes
– Richard Gibbs – Baylor College of Medicine – "Complete Genomics" – genome for < $5,000
• Accurate sequencing by hybridization for DNA diagnostics and individual genomics, Drmanac, et al., Nature Biotechnology
2
ASHG Redux
• Session -- Using DNA sequence to detect variation related to disease
– Michael Stratton – Wellcome Trust Sanger Institute – genomic sequencing of breast cancer cell lines
• Copy number variations ("structural variants")
• "genomic shards" – 305 rearrangements in breast cancer cell line
• Difficult to assemble with short-read technology
genes
• Bisulphite treatment – convert all unmethylated C's to U (uracil) -- then sequence, and all methylated C sites are ID'ed
• Drawback – harsh, fragments DNA
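The bisulphite read-out can be sketched in a few lines of Python. This is a toy illustration, not a real methylation caller: the function names, the example sequence, and the methylated positions are all made up, and the sketch ignores PCR (where U is read as T), the reverse strand, and incomplete conversion.

```python
def bisulfite_convert(seq, methylated_positions):
    """Convert every unmethylated C to U; methylated C's are protected."""
    out = []
    for i, base in enumerate(seq):
        if base == "C" and i not in methylated_positions:
            out.append("U")
        else:
            out.append(base)
    return "".join(out)


def call_methylated_sites(reference, converted):
    """A reference C that still reads as C after treatment was methylated."""
    return [
        i
        for i, (ref_base, conv_base) in enumerate(zip(reference, converted))
        if ref_base == "C" and conv_base == "C"
    ]


reference = "ACGTCCGATCGA"  # hypothetical sequence
methylated = {1, 5}         # hypothetical methylated C positions (0-based)

converted = bisulfite_convert(reference, methylated)
sites = call_methylated_sites(reference, converted)
print(converted)
print(sites)
```

Comparing the treated read with the untreated reference is what makes the methylated sites stand out: only the protected C's survive as C.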
– High density HapMap of Humans, Dogs, and Cattle
• Genotyped 900 dogs w/ Affy 2.0 array at 61,344 SNPs
• Dogs have very uniform phylogenetic tree with breed specific
• preliminary sequencing
• finishing (not always performed -- coverage)
• annotating
• The "dideoxy method"
• Need (for DNA replication):
– DNA, DNA polymerase, primers, deoxyribonucleotide triphosphates (dNTPs) (G, T, A, C) (one with radioactive atoms), dideoxyribonucleotide triphosphates (ddNTPs)
12
Dideoxy Method Obsolete?
• Next-generation sequencing technology
– Cost per nucleotide down by factor of 100-1000
– Cost per run is still very high
– Expen$ive for validation on an individual basis
– Dideoxy method is very mature, very well understood
13
dideoxy method
• Under normal DNA polymerization, dNTPs are added to the end of the elongating strand of DNA.
• If a ddNTP is incorporated, the elongation terminates -- the ddNTP also carries a "label" -- radioactive isotope or fluorescent dye
• This is performed in 4 different containers (test tubes), each containing a different ddNTP: ddATP, ddGTP, ddCTP, or ddTTP.
• Therefore, every fragment in a given tube terminates with the same ddNTP
• Run these out on a gel; the smallest fragments migrate fastest.
• Expose to x-ray film (or scan with laser), read gel
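The four-tube logic can be sketched as a small Python simulation. It is idealized: it assumes termination happens at every possible position in each tube, and the function names and example strand are invented for illustration.

```python
BASES = "ACGT"


def chain_termination_fragments(new_strand):
    """Return {base: [fragment lengths]} for the four ddNTP tubes.

    In the tube containing ddX, elongation can stop wherever the newly
    synthesized strand has X, so the fragment lengths mark the positions of X.
    """
    tubes = {base: [] for base in BASES}
    for position, base in enumerate(new_strand, start=1):
        tubes[base].append(position)
    return tubes


def read_gel(tubes):
    """Read the bands from the fastest (shortest) fragment to the slowest."""
    bands = [
        (length, base)
        for base, lengths in tubes.items()
        for length in lengths
    ]
    return "".join(base for _, base in sorted(bands))


strand = "GATTACA"  # hypothetical newly synthesized strand
tubes = chain_termination_fragments(strand)
print(tubes)
print(read_gel(tubes))
```

Sorting by fragment length plays the role of the gel, and the tube (lane) a band sits in tells which base ends that fragment.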
14
Figure 2.1
15
Figure 2.2
16
Comment
• Note -- this is pretty awful work
• The gel material is toxic
• Working with radioactive molecules
• Slow and tedious
• reading bands on glass
• capturing/entering data
• 500 bases took 24 hours (16,438 years to do the human genome with this method)
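The 16,438-year figure follows from simple arithmetic. A sketch, assuming a 3-billion-base genome read exactly once with no overlap between reads, and nonstop work all year (the slide does not state these assumptions, and the function name is made up):

```python
HOURS_PER_YEAR = 24 * 365


def years_to_sequence(genome_bases, bases_per_run, hours_per_run):
    """Years of continuous work needed to read the genome once."""
    runs = genome_bases / bases_per_run
    return runs * hours_per_run / HOURS_PER_YEAR


# Assumed genome size: 3 billion bases; 500 bases per 24-hour gel.
manual_years = years_to_sequence(3_000_000_000, 500, 24)
print(round(manual_years))
```

In practice each base is read several times over for coverage, so the real figure would be larger still.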
17
Automated sequencing
• Leroy Hood -- developed nonradioactive dideoxy method
• Each ddNTP is "labeled" with a different fluorescent dye
• 1 lane could be used instead of 4 (why?)
• A laser excites the dye and the band can be "read", indicating which ddNTP terminated the sequence
• The intensities of these bands are now captured and graphed -- in what is called a chromatogram
• Lane in a gel is replaced with a capillary
• Can run 96 or 384 capillaries at a time (Applied Biosystems)
• A run is approximately 1 hour
• 500 bases * 384 cap ==> 651 days
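Because the dye color, rather than the lane, identifies the terminating ddNTP, all four reactions can share one lane. A minimal sketch of reading such a trace in Python, assuming one intensity value per dye at each peak; the channel order, the numbers, and the function name are hypothetical, and real base callers also handle noise, peak spacing, and quality scores:

```python
DYES = "ACGT"  # one dye channel per ddNTP; this channel order is an assumption


def call_bases(trace):
    """trace: list of peaks, each a 4-tuple of intensities in DYES order."""
    calls = []
    for peak in trace:
        # The strongest channel at this peak names the terminating ddNTP.
        strongest = max(range(len(DYES)), key=lambda channel: peak[channel])
        calls.append(DYES[strongest])
    return "".join(calls)


# Hypothetical peak intensities, ordered from shortest to longest fragment.
trace = [
    (5, 3, 90, 4),   # strongest in the G channel
    (88, 6, 2, 7),   # strongest in the A channel
    (4, 5, 3, 95),   # strongest in the T channel
    (6, 80, 9, 2),   # strongest in the C channel
]
print(call_bases(trace))
```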
18
Box 2.1 Table
19
Choosing genomes
• Big 7
– human, mouse, yeast, E. coli, fly, worm, Arabidopsis
• medical applications
– Pseudomonas aeruginosa (CF infection), mosquito, trypanosomes, HIV
• evolutionary significance
– microbes, archaea, chimp, gorilla, fugu fish
gene duplications, gene families, micro-RNAs, methylation, phosphorylation, tissue-specific alternative splicing, copy number variations (CNVs, also called "structural variations"), differential expression, gene function, ????
27
Gene Identification
• Gene prediction (ORF finding)
– was a hot topic
– cooled when it became clear that EST sequencing was far superior
– EST sequencing in human (and some model organisms -- rat, mouse, others) was very extensive -- millions of sequencing reads
– The most effective approach to gene finding was the overlaying of EST sequences onto genomic sequence (but note you need both).
– Gene prediction was 40-60% accurate at best
– Gene prediction has made a bit of a resurgence because of the cost savings of "in silico" gene finding
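For a sense of what ORF finding means at its simplest, here is a naive Python scan for ATG...stop stretches. It is only a sketch: it looks at the forward strand, ignores introns and splicing (which is part of why pure prediction struggles in eukaryotes), and the function name, the `min_codons` threshold, and the example sequence are all invented for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end) index pairs of ATG...stop stretches.

    Scans the three forward reading frames only. `end` is exclusive and
    includes the stop codon.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


seq = "CCATGAAATTTGGGTAACC"  # hypothetical sequence
orfs = find_orfs(seq)
print(orfs)
print([seq[start:end] for start, end in orfs])
```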
28
Pseudogenes
• text -- mammalian genomes contain approximately 225 bp of pseudogenes per kb