1 Next Generation Sequencing Itai Sharon November 11th, 2009 Introduction to Bioinformatics

Dec 21, 2015

Page 1

Next Generation Sequencing

Itai Sharon
November 11th, 2009
Introduction to Bioinformatics

Page 2

Sequencing the Human Genome

[Figure: log10(price) versus year, 2000 to 2010]

• 2001: Human Genome Project, 2.7G$, 11 years
• 2001: Celera, 100M$, 3 years
• 2007: 454, 1M$, 3 months
• 2008: ABI SOLiD, 60K$, 2 weeks
• 2009: Illumina, Helicos, 40-50K$
• 2010: 5K$, a few days?
• 2012: 100$, <24 hrs?
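The vertical axis of this plot is log10(price). As a small sketch, the axis values can be recomputed from the price points listed on this slide. The dictionary name and the shortened labels are illustrative only, and the 2009 entry is left out because the slide gives a range (40-50K$) rather than a single value.

```python
import math

# Price points as listed on the slide, in US dollars.
# Labels are shortened for illustration; the 2010 and 2012 values
# are the slide's own projections (marked with "?" there).
PRICES_USD = {
    "2001 Human Genome Project": 2.7e9,
    "2001 Celera": 100e6,
    "2007 454": 1e6,
    "2008 ABI SOLiD": 60e3,
    "2010 (projected)": 5e3,
    "2012 (projected)": 100,
}


def log10_price(usd):
    """Return the base-10 logarithm of a price, as plotted on the y axis."""
    return math.log10(usd)


for label, usd in PRICES_USD.items():
    print(f"{label:28s} log10(price) = {log10_price(usd):.2f}")
```

Each unit on this axis is a tenfold change in price, so the drop from the Human Genome Project to the 2008 figure spans more than four orders of magnitude.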

Page 3

In this Talk:

• Sequencing 1.0: Sanger
• Assembly
• Next generation sequencing (NGS)
• NGS applications
• Future directions

Page 4

Genome Sequencing

• Goal: determine the order of nucleotides across a genome

• Problem: current DNA sequencing methods can handle only short stretches of DNA at once (<1-2Kbp)

• Solution: sequence the small pieces, then use computers to assemble them
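To make the "assemble the small pieces" step concrete, here is a minimal sketch of a greedy overlap assembler. It is not the method used by any particular assembler. The function names and the minimum overlap of 3 bases are assumptions made for illustration. The reads are the six complete example reads from the Genome Sequencing slide; the truncated read ("CATA…") is left out.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when the best overlap is shorter than `min_len`.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the largest overlap.

    Stops when one sequence is left or no pair overlaps by `min_len`.
    Returns the list of remaining sequences (contigs).
    """
    reads = list(reads)
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i == j:
                    continue
                k = overlap(a, b, min_len)
                if k > best_k:
                    best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to merge: remaining sequences are separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


example_reads = [
    "ACGTGGTAA", "CGTATACAC", "TAGGCCATA",
    "GTAATGGCG", "CACCCTTAG", "TGGCGTATA",
]
print(greedy_assemble(example_reads))
```

On these six reads the sketch rebuilds the 33-base sequence shown on the slide. Real genomes contain repeats and sequencing errors, which is where a purely greedy strategy tends to go wrong.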


Page 5

Genome Sequencing

Genome:
ACGTGACCGGTACTGGTAACGTACACCTACGTGACCGGTACTGGTAACGTACGCCTACGTGACCGGTACTGGTAACGTATACACGTGACCGGTACTGGTAACGTACACCTACGTGACCGGTACTGGTAACGTACGCCTACGTGACCGGTACTGGTAACGTATACCTCT...

Short fragments of DNA:
[Figure: the genome cut into many short fragments]

Short DNA sequences:
ACGTGGTAA CGTATACAC TAGGCCATA GTAATGGCG CACCCTTAG TGGCGTATA CATA…

Sequenced genome:
ACGTGGTAATGGCGTATACACCCTTAGGCCATA

Page 6

Sanger Sequencing

Page 7

Sanger Sequencing

Page 8

Sanger Sequencing

• Advantages
– Long reads (~900bp)
– Suitable for small projects

• Disadvantages
– Low throughput
– Expensive

Page 9

Assembly

Cut DNA into larger pieces (2Kbp, 15Kbp) and sequence both ends of each piece (Fleischmann et al., 1994)

[Figure: contig 1 and contig 2 linked by 2Kbp mates and 15Kbp mates; each mate pair consists of two ~500 bp reads separated by ~(length - 1,000) bp]

• Resolving repeats
• Better assembly of contigs, gap length estimation
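Gap length estimation from mate pairs comes down to simple arithmetic: if the two reads of a pair land on different contigs and the piece length is known, the part of the piece not accounted for by either contig is the gap. The sketch below assumes an idealized case (exact piece length, reads pointing toward the gap). All names and the coordinate convention are hypothetical.

```python
from statistics import mean


def gap_from_mate_pair(insert_len, contig1_len, mate1_start, mate2_end):
    """Estimate the gap between contig 1 and contig 2 from one mate pair.

    insert_len  -- length of the sequenced piece (e.g. 2,000 or 15,000 bp)
    contig1_len -- length of contig 1
    mate1_start -- 0-based position on contig 1 where the first read starts
    mate2_end   -- position on contig 2 (exclusive) where the second read ends
    """
    bases_on_contig1 = contig1_len - mate1_start  # piece covers contig 1 to its end
    bases_on_contig2 = mate2_end                  # piece covers contig 2 from its start
    return insert_len - bases_on_contig1 - bases_on_contig2


def estimate_gap(insert_len, contig1_len, mate_pairs):
    """Average the per-pair estimates over all pairs that span the gap."""
    return mean(
        gap_from_mate_pair(insert_len, contig1_len, start, end)
        for start, end in mate_pairs
    )


# Two 2Kbp mate pairs spanning the same gap after a 10,000 bp contig.
print(estimate_gap(2000, 10_000, [(9200, 700), (9500, 1100)]))
```

In practice piece lengths vary around their nominal size, so averaging over many spanning pairs is what makes the estimate usable.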

Page 10

Assembly: How Much DNA?

• Low coverage
– Input: a few pieces to assemble
– Output: many contigs, many gaps

• High coverage
– Input: many pieces to assemble
– Output: a few contigs, a few gaps

(Lander and Waterman, 1988)
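The coverage trade-off can be put into numbers with the standard Lander-Waterman approximations: coverage c = N·L/G for N reads of length L on a genome of length G, an expected uncovered fraction of about e^(-c), and an expected number of contigs of about N·e^(-c·σ), where σ = 1 - T/L and T is the minimum overlap needed to join two reads. The sketch below is illustrative only. The function name is hypothetical, and the genome size (1.8 Mbp) and read length (800 bp) are example values borrowed from elsewhere in these slides.

```python
import math


def lander_waterman(genome_len, read_len, n_reads, min_overlap=0):
    """Return (coverage, expected contigs, expected uncovered fraction).

    Uses the Lander-Waterman approximations, which assume reads are
    placed uniformly at random along the genome.
    """
    coverage = n_reads * read_len / genome_len
    sigma = 1 - min_overlap / read_len
    expected_contigs = n_reads * math.exp(-coverage * sigma)
    uncovered_fraction = math.exp(-coverage)
    return coverage, expected_contigs, uncovered_fraction


for n_reads in (2_250, 11_250, 22_500):
    c, contigs, uncovered = lander_waterman(1_800_000, 800, n_reads)
    print(f"coverage {c:4.1f}x: ~{contigs:7.1f} contigs, "
          f"{uncovered:.5%} of the genome uncovered")
```

Under these assumptions, 1x coverage leaves hundreds of contigs, while 10x coverage leaves roughly one.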

Page 11

Sanger Sequencing

[Timeline: 1980, 1990, 2000]

• 1982: lambda virus, DNA stretches up to 30-40Kbp (Sanger et al.)
• 1994: H. influenzae, 1.8 Mbp (Fleischmann et al.)
• 2001: H. sapiens, D. melanogaster, 3 Gbp (Venter et al.)
• 2007: Global Ocean Sampling Expedition, ~3,000 organisms, 7Gbp (Venter et al.)

Page 12

Next Generation Sequencing: Why Now?

Page 13

High Parallelism is Achieved in Polony Sequencing

[Figure panels: Polony, Sanger]

Page 14

Generation of Polony array: DNA Beads (454, SOLiD)

DNA Beads are generated using Emulsion PCR

Page 15

Generation of Polony array: DNA Beads (454, SOLiD)

DNA Beads are placed in wells

Page 16

Generation of Polony array: Bridge-PCR (Solexa)

DNA fragments are attached to array and used as PCR templates

Page 17

Sequencing: Pyrosequencing (454)

Complementary strand elongation: DNA Polymerase
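In pyrosequencing, the four nucleotides are offered to the polymerase one type at a time, and the light signal in each flow grows with the number of bases incorporated. The toy simulation below illustrates that idea only. The function name, the flow order "TACG", and the noise-free integer signals are assumptions, not details taken from the slide.

```python
from itertools import cycle, islice


def flowgram(read, flow_order="TACG", n_flows=12):
    """Simulate ideal pyrosequencing signals for a synthesized strand.

    `read` is the sequence of the strand being synthesized (5' to 3').
    For each nucleotide flow, the signal is the number of consecutive
    bases of that type incorporated at the current position.
    """
    signals = []
    pos = 0
    for base in islice(cycle(flow_order), n_flows):
        run = 0
        while pos < len(read) and read[pos] == base:
            run += 1
            pos += 1
        signals.append((base, run))
    return signals


print(flowgram("TTACCG", n_flows=4))
```

A homopolymer such as "TT" shows up as one flow with a doubled signal rather than as two separate events. In real instruments this makes long homopolymer runs harder to call exactly.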

Page 18

Sequencing: Fluorescently labeled Nucleotides (Solexa)

Complementary strand elongation: DNA Polymerase

Page 19

Sequencing: Fluorescently Labeled Nucleotides (ABI SOLiD)

Complementary strand elongation: DNA Ligase

Page 20

Sequencing: Fluorescently Labeled Nucleotides (ABI SOLiD)

5 reading frames, each position is read twice
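The slide states only that each position is read twice. One common way to describe this is two-base ("color space") encoding, where every color reports on a pair of adjacent bases, so each base contributes to two colors. The sketch below uses the usual formulation in which the color is the XOR of 2-bit base codes (A=0, C=1, G=2, T=3). Function names are hypothetical, and decoding assumes the first base is known.

```python
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def encode_colors(seq):
    """Encode a base sequence as one color (0-3) per adjacent base pair."""
    return [BASE_TO_BITS[a] ^ BASE_TO_BITS[b] for a, b in zip(seq, seq[1:])]


def decode_colors(first_base, colors):
    """Rebuild the base sequence from a known first base and the colors."""
    bases = [first_base]
    for color in colors:
        previous = BASE_TO_BITS[bases[-1]]
        bases.append(BITS_TO_BASE[previous ^ color])
    return "".join(bases)


colors = encode_colors("ATGGA")
print(colors)
print(decode_colors("A", colors))
```

Because every base is covered by two colors, a true single-base variant changes two adjacent colors, whereas an isolated color change is more likely a measurement error.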

Page 21

Single Molecule Sequencing: HeliScope

Page 22

Technology Summary

Platform   Read length   Sequencing technology   Throughput (per run)   Cost (per 1 Mbp)*
Sanger     ~800bp        Sanger                  400Kbp                 500$
454        ~400bp        Polony                  500Mbp                 60$
Solexa     75bp          Polony                  20Gbp                  2$
SOLiD      75bp          Polony                  60Gbp                  2$
Helicos    30-35bp       Single molecule         25Gbp                  1$

*Source: Shendure & Ji, Nat Biotech, 2008
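As a rough worked example, the table's throughput and cost figures can be turned into the cost and number of runs needed to cover a 3 Gbp genome. The sketch assumes 1x coverage unless stated otherwise and ignores everything except the per-Mbp cost. The dictionary and function names are illustrative.

```python
# Throughput per run (bp) and cost per Mbp (US dollars), from the table.
PLATFORMS = {
    "Sanger":  {"run_bp": 400_000,        "usd_per_mbp": 500},
    "454":     {"run_bp": 500_000_000,    "usd_per_mbp": 60},
    "Solexa":  {"run_bp": 20_000_000_000, "usd_per_mbp": 2},
    "SOLiD":   {"run_bp": 60_000_000_000, "usd_per_mbp": 2},
    "Helicos": {"run_bp": 25_000_000_000, "usd_per_mbp": 1},
}


def cost_usd(genome_bp, coverage, usd_per_mbp):
    """Cost of sequencing `coverage` x `genome_bp` bases."""
    return genome_bp * coverage * usd_per_mbp / 1_000_000


def runs_needed(genome_bp, coverage, run_bp):
    """Number of instrument runs, rounded up to a whole run."""
    total_bp = genome_bp * coverage
    return -(-total_bp // run_bp)  # integer ceiling division


HUMAN_GENOME_BP = 3_000_000_000  # 3 Gbp, as on the Sanger timeline slide

for name, spec in PLATFORMS.items():
    cost = cost_usd(HUMAN_GENOME_BP, 1, spec["usd_per_mbp"])
    runs = runs_needed(HUMAN_GENOME_BP, 1, spec["run_bp"])
    print(f"{name:8s} ~{cost:>12,.0f}$  {runs:>5d} run(s) at 1x coverage")
```

Real projects need many-fold coverage, so these numbers scale up linearly with the coverage chosen.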

Page 23

What, When and Why

• Sanger: small projects (less than 1Mbp)

• 454: de novo sequencing, metagenomics

• Solexa, SOLiD, HeliScope:
– Gene expression, protein-DNA interactions
– Resequencing

Page 24

Applications

Page 25

Applications

Page 26

Where Do We Go from Here?

• Higher throughput, longer reads (Pacific BioSciences)

• Computational bottleneck

• Shift to sequencing-based technologies

• Will it help to cure cancer?