Permissions: you are free to blog or live-blog about this presentation as long as you attribute the work to its authors
Korea Center for Disease Control & Prevention
Next-generation genomics: an integrative approach
Chang Bum Hong
Division of Structural and Functional Genomics, Center for Genome Sciences, NIH
APPLICATIONS OF NEXT-GENERATION SEQUENCING
2011
• Genome structural variation discovery and genotyping
• RNA sequencing: advances, challenges and opportunities
• Charting histone modifications and the functional organization of mammalian genomes
2010
• Evaluating genome-scale approaches to eukaryotic DNA replication
• Advances in understanding cancer genomes through second-generation sequencing
• Genome-wide allele-specific analysis: insights into regulatory variation
• Next-generation genomics: an integrative approach
• Uncovering the roles of rare variants in common disease through whole-genome sequencing
• Principles and challenges of genome-wide DNA methylation analysis
• Prokaryotic transcriptomics: a new view on regulation, physiology and pathogenicity
• Sequencing technologies - the next generation
• RNA processing and its regulation: global insights into biological networks
2009
• The complex eukaryotic transcriptome: unexpected pervasive transcription and novel small RNAs
• ChIP-seq: advantages and challenges of a maturing technology
• Insights from genomic profiling of transcription factors
• RNA-Seq: a revolutionary tool for transcriptomics
DNA
RNA
Protein
Complete genome resequencing
Targeted genomic resequencing
De novo sequencing
Translated into proteins
DNA being transcribed into RNA
Phenotype
Disease
Chromatin immunoprecipitation sequencing
Sequencing of bisulfite-treated DNA
• We define this as the use of established sequencing platforms, including:
• Illumina/Solexa Genome Analyzer
• Roche/454 Genome Sequencer
• Applied Biosystems SOLiD
• Helicos and Pacific Biosciences
HiSeq 2000
5500xl SOLiD System
MiSeq
Ion Personal Genome Machine
Genome Sequencer FLX System
GS Junior
HeliScope Single Molecule Sequencer
PacBio RS
Jay Flatley Greg Lucier Stephen Quake
Jim Watson Craig Venter
John West
Former Illumina CEO
Founder of Helicos
Life Technologies CEO
Illumina CEO
?
BGI: 1 x 454, 27 x SOLiD3/4, 128 x Illumina HiSeq
Broad Institute: 94 x Illumina GA2, 10 x 454, 8 x SOLiD3/4, 1 x HeliScope, 1 x Polonator, 1 x PacBio
Next Generation Genomics: World Map of High-throughput Sequencers
http://pathogenomics.bham.ac.uk/hts/
GMI at Seoul National University College of Medicine: 10 x Illumina GA2
Macrogen: 10 x Illumina GA2, 1 x 454, 2 x SOLiD3/4
NICEM: Illumina GA2, 454
Gachon University of Medicine and Science: Illumina GA2, 2 x SOLiD3/4
KRIBB: 1 x Illumina GA2
• Next-next....-generation: how many ‘next’s are there?
• First Generation: automated version of Sanger sequencing (DNA-sequencing method invented by Fred Sanger in the 1970s)
• Second Generation
• Roche/454 sequencing machine from 454 Life Sciences (2005)
• 450 bases per read / $0.02 per 1000 bases / 2 days per Gb
• Solexa from Illumina (2006)
• 75 bases per read / $0.01 per 1000 bases / 0.5 days per Gb
• SOLiD from Applied Biosystems (2006)
• 50 bases per read / $0.001 per 1000 bases / 0.5 days per Gb
• Next-Next-Gen - Third Generation?
• HiSeq 2000 from Illumina - 0.04 days per Gb
• Helicos HeliScope
• Pacific Biosciences SMRT
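To put the per-platform figures above on a common scale, the sketch below converts the quoted price per 1000 bases into a price per gigabase and estimates the cost and run time of sequencing a genome at a chosen coverage. The platform numbers are the ones quoted on the slide; the 3.1 Gb human genome size, the 30x coverage in the usage line, and the function names are assumptions added for illustration only.

```python
# Figures quoted on the slide:
# (read length in bases, USD per 1000 bases, days per Gb)
PLATFORMS = {
    "Roche/454": (450, 0.02, 2.0),
    "Illumina/Solexa": (75, 0.01, 0.5),
    "SOLiD": (50, 0.001, 0.5),
}

# Assumption (not from the slides): approximate human genome size in Gb.
HUMAN_GENOME_GB = 3.1


def cost_per_gb(usd_per_kb: float) -> float:
    """Convert a price per 1000 bases into a price per gigabase (1e9 bases)."""
    return usd_per_kb * 1e9 / 1e3


def run_estimate(platform: str, coverage: float,
                 genome_gb: float = HUMAN_GENOME_GB):
    """Return (total USD, total days) to sequence genome_gb at the given coverage."""
    _, usd_per_kb, days_per_gb = PLATFORMS[platform]
    total_gb = genome_gb * coverage  # raw sequence that must be produced
    return total_gb * cost_per_gb(usd_per_kb), total_gb * days_per_gb


if __name__ == "__main__":
    for name in PLATFORMS:
        usd, days = run_estimate(name, coverage=30)
        print(f"{name}: about ${usd:,.0f} and {days:.1f} days for 30x")
```

The estimate ignores library preparation, failed runs and multiplexing, so it is only a rough way to compare the quoted rates.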
Sequencing technologies
Shendure & Ji, 2008
Michael L. Metzker, 2010
Sequencing technologies: Feature generation
Sequencing technologies: Sequencing by synthesis
Michael L. Metzker, 2010
• Sequencing
• How deep?
• Single reads, paired reads or both
• Alignment
• Reference alignment, assembly or both
• Experiment-specific analysis
• A ‘one-size-fits-all’ program does not exist
NGS typical procedure
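The "How deep?" question is commonly answered with the average depth of coverage, C = N x L / G, where N is the number of reads, L the read length and G the genome size. The sketch below is a minimal illustration of that formula and its inverse; the function names and example numbers are hypothetical, and the calculation assumes reads are spread evenly and all of them map.

```python
import math


def depth_of_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average depth C = N * L / G."""
    return num_reads * read_length / genome_size


def reads_needed(target_depth: float, read_length: int, genome_size: int,
                 paired: bool = False) -> int:
    """Reads (or read pairs, if paired=True) needed to reach target_depth."""
    # A read pair contributes two reads' worth of bases per fragment.
    bases_per_fragment = read_length * (2 if paired else 1)
    return math.ceil(target_depth * genome_size / bases_per_fragment)


if __name__ == "__main__":
    # Hypothetical example: a 3 Mb bacterial genome sequenced with 75-base reads.
    print(reads_needed(30, 75, 3_000_000))               # single reads
    print(reads_needed(30, 75, 3_000_000, paired=True))  # read pairs
```

Real experiments usually plan for more than this minimum, since duplicates, low-quality reads and uneven coverage all reduce the usable depth.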
• Sequence assembly
• Whole Genome Assembly (Reference, De novo)
• Transcriptome Assembly
• Short Sequence Alignment
• Single read
• Paired read
• Genomic Variation Detection
• Detection of Single Nucleotide Polymorphism (SNP)
• Detection of Alternative Splicing Events
• Detection of major/minor transcript isoforms
Applications
Shendure & Ji, 2008
Applications
Bioinformatics tools
Shendure & Ji, 2008
• Sequence Reads
• fastq
• fasta
• Alignment
• Sequence Alignment Map (SAM)
• BAM (Binary Alignment Map)
• Variation
• VCF (Variant Call Format)
File Format
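SAM and VCF are both tab-delimited text formats with a fixed set of leading columns, so a few fields can be pulled out with plain string handling. The sketch below is illustrative only: the record types, function names and the two example lines are made up for this note, and real pipelines would normally rely on a dedicated library rather than hand-written parsing.

```python
from typing import List, NamedTuple


class SamRecord(NamedTuple):
    qname: str   # read name
    flag: int    # bitwise flag
    rname: str   # reference sequence name
    pos: int     # 1-based leftmost mapping position
    mapq: int    # mapping quality
    cigar: str   # alignment description
    seq: str     # read sequence


class VcfRecord(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alts: List[str]


def parse_sam_line(line: str) -> SamRecord:
    """Parse the mandatory columns of one SAM alignment line (not a '@' header)."""
    f = line.rstrip("\n").split("\t")
    return SamRecord(f[0], int(f[1]), f[2], int(f[3]), int(f[4]), f[5], f[9])


def is_unmapped(rec: SamRecord) -> bool:
    """Flag bit 0x4 marks a read that could not be aligned."""
    return bool(rec.flag & 0x4)


def parse_vcf_line(line: str) -> VcfRecord:
    """Parse CHROM, POS, REF and ALT from one VCF data line (not a '#' header)."""
    chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
    return VcfRecord(chrom, int(pos), ref, alt.split(","))


if __name__ == "__main__":
    # Hypothetical example lines, not taken from the presentation.
    sam = "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII"
    vcf = "chr1\t12345\t.\tA\tG,T\t50\tPASS\t."
    print(parse_sam_line(sam))
    print(parse_vcf_line(vcf))
```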
Data: Sequence Reads
Data: Sequence Reads
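A FASTQ record is four lines: an "@" header, the sequence, a "+" separator, and one quality character per base. The sketch below reads such records, decodes the quality string and converts a record to FASTA. It assumes the Phred+33 (Sanger) encoding and unwrapped four-line records; some older Illumina pipelines used an offset of 64, so the offset is a parameter. All names here are hypothetical.

```python
import io
from typing import Iterator, List, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:            # end of file
            return
        if not header.strip():    # skip stray blank lines
            continue
        seq = handle.readline().rstrip("\n")
        handle.readline()         # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        yield header.rstrip("\n")[1:], seq, qual


def phred_scores(qual: str, offset: int = 33) -> List[int]:
    """Decode a quality string into per-base Phred scores."""
    return [ord(c) - offset for c in qual]


def to_fasta(name: str, seq: str) -> str:
    """FASTA keeps only the name and the bases; quality is dropped."""
    return f">{name}\n{seq}\n"


if __name__ == "__main__":
    example = io.StringIO("@read1\nACGT\n+\nII5!\n")
    for name, seq, qual in read_fastq(example):
        print(name, seq, phred_scores(qual))
        print(to_fasta(name, seq), end="")
```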
A challenge: a call for a new compression algorithm
Compression of genomic sequences in FASTQ format
Sebastian Deorowicz et al., 2011
Data: Sequence Reads
Compression tool    Compression time    Size
gzip                14s                 28M
bzip2               9.75s               23M
dsrc                1.36s               21M
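A comparison of this kind can be reproduced in outline with the general-purpose compressors in the Python standard library. The sketch below is not the benchmark from the slide: it uses synthetic random reads rather than real data, and it covers only gzip and bzip2, since DSRC is a separate FASTQ-specific tool that is not part of the standard library. Timings and sizes will differ from the table.

```python
import bz2
import gzip
import random
import time
from typing import Dict, Tuple


def make_fastq(n_reads: int = 2000, read_len: int = 75, seed: int = 0) -> bytes:
    """Build a synthetic FASTQ file in memory (random bases and qualities)."""
    rng = random.Random(seed)
    records = []
    for i in range(n_reads):
        seq = "".join(rng.choice("ACGT") for _ in range(read_len))
        qual = "".join(chr(33 + rng.randint(20, 40)) for _ in range(read_len))
        records.append(f"@read{i}\n{seq}\n+\n{qual}\n")
    return "".join(records).encode("ascii")


def benchmark(data: bytes) -> Dict[str, Tuple[float, int]]:
    """Return {tool: (seconds to compress, compressed size in bytes)}."""
    tools = (
        ("gzip", gzip.compress, gzip.decompress),
        ("bzip2", bz2.compress, bz2.decompress),
    )
    results = {}
    for name, compress, decompress in tools:
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        # Both tools are lossless: the round trip must give back the input.
        assert decompress(packed) == data
        results[name] = (elapsed, len(packed))
    return results


if __name__ == "__main__":
    data = make_fastq()
    print(f"original: {len(data)} bytes")
    for tool, (seconds, size) in benchmark(data).items():
        print(f"{tool}: {seconds:.3f}s, {size} bytes")
```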
• ChIP-Seq
• lets you assay where and how strongly a protein binds to DNA, such as a transcription factor bound at the start site of a gene, or histones of a certain type
• RNA-Seq
• Mapping transcription start sites
• Characterization of alternative splicing patterns
• Gene fusion detection
• Estimation of transcript abundance from the depth of coverage of the mapped reads
Example of Applications
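For the last RNA-Seq point, one widely used way to turn read counts into an abundance estimate is RPKM (reads per kilobase of transcript per million mapped reads), which corrects for both transcript length and sequencing depth. The sketch below shows that normalization as an example; the slides do not say which measure the presenter had in mind, and the function names and numbers are hypothetical.

```python
def rpkm(read_count: int, transcript_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def mean_depth(read_count: int, read_length: int, transcript_length_bp: int) -> float:
    """Average number of reads covering each base of the transcript."""
    return read_count * read_length / transcript_length_bp


if __name__ == "__main__":
    # Hypothetical example: 500 reads of 75 bases on a 2 kb transcript,
    # out of 10 million mapped reads in the whole experiment.
    print(rpkm(500, 2000, 10_000_000))
    print(mean_depth(500, 75, 2000))
```

Dividing by transcript length matters because a long transcript collects more reads than a short one at the same expression level.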
ChIP-Seq
Barski A & Zhao K, 2009
Chromatin immunoprecipitation (ChIP)
Kharchenko et al., 2008
Shirley et al., 2009
ChIP-Seq
Shirley et al., 2009
ChIP-Seq Software packages
Shirley et al., 2009
RNA-Seq
Zhong Wang, 2009
RNA-Seq (De novo transcriptome assembly)
RNA-Seq(Transcriptome resequencing)
RNA-Seq
RNA-Seq mapping of short reads in exon-exon junctions
RNA-Seq mapping of short reads over exon-exon junctions. Depending on where each end maps, it can be defined as a trans or a cis event.
from wikipedia.org
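When a spliced aligner writes its results in SAM format, a read that spans an exon-exon junction carries an N operation in its CIGAR string, which marks a region skipped on the reference (the intron). The sketch below recovers the skipped intervals from a mapping position and a CIGAR string. It is a minimal illustration with hypothetical names and does not attempt the cis/trans classification mentioned in the caption.

```python
import re
from typing import List, Tuple

# One CIGAR element: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def junctions(pos: int, cigar: str) -> List[Tuple[int, int]]:
    """Return (start, end) of each region skipped by an N operation.

    Coordinates are 1-based and inclusive on the reference; pos is the
    leftmost mapped position of the read, as in the SAM POS column.
    """
    ref_pos = pos
    found = []
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op == "N":
            found.append((ref_pos, ref_pos + n - 1))
        # Only these operations advance the position on the reference.
        if op in "MDN=X":
            ref_pos += n
    return found


if __name__ == "__main__":
    # Hypothetical read: 30 bases in one exon, a 500-base intron, 45 bases in the next exon.
    print(junctions(100, "30M500N45M"))
```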
RNA-Seq Software packages
Shirley et al., 2009
• Genes in DNA are transcribed into RNA
• might be spliced
• transported to an appropriate cellular compartment
• translated into proteins
• Regulated at many levels
• DNA methylation
• chromatin modification
• binding of transcription factors to the DNA
• binding of splicing factors to the RNA and RNA transport
DNA encodes heritable traits
• What types of genomic data sets are available?
• Why perform integrative genomic analysis?
• Approaches to an integrative analysis
• Using large-scale data sets for integrative analysis