Genome sequencing MUPGRET Workshop Joe Polacco

Genome sequencing MUPGRET Workshop Joe Polacco. Size of human genome 23 pairs of chromosomes 3.1 billion bp If code written in NYC phone books and stacked.

Dec 22, 2015

Transcript
Page 1:

Genome sequencing

MUPGRET Workshop
Joe Polacco

Page 2:

Size of human genome
23 pairs of chromosomes
3.1 billion bp
If code written in NYC phone books and stacked up, would reach top of Washington Monument.

Page 3:

Human Genome Project
Began as an academic effort
Initially involved 5 research centers in US and England.
Soon joined by Celera, a spin-off company.

Page 4:

Some surprises
Initial estimate 100,000 to 150,000 genes but found to be 35,000 to 50,000. (C. elegans ~19,000 genes)
Fraction of genome that codes for protein originally estimated as 5% but found to be 1.5%.

Page 5:

Some completely sequenced genomes
Mycoplasma genitalium: 578,000 bp, 400 genes
Haemophilus influenzae: 1,830,138 bp, 1738 genes
E. coli: 4,639,221 bp, 4377 genes
S. cerevisiae: 12 x 10^6 bp, 5885 genes

Page 6:

More genomes
C. elegans: 95.5 x 10^6 bp, 19,820 genes
D. melanogaster: 1.8 x 10^8 bp, 13,601 genes
A. thaliana: 1.17 x 10^8 bp, 25,498 genes

Page 7:

More genomes
M. musculus: 3 x 10^9 bp, ~30,000 genes
H. sapiens: 3.3 x 10^9 bp, 30,000-50,000 genes
O. sativa: 4.3 x 10^8 bp, 30,000-63,000 genes
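As a rough comparison, the sizes and gene counts on the three genome slides can be turned into an average spacing of one gene per so many bp. The sketch below is illustrative only. It uses the slide figures as given and, where a slide gives a range, takes the midpoint, which is an assumption made here and not a number from the slides. The names `GENOMES` and `bp_per_gene` are hypothetical.

```python
# Genome sizes (bp) and gene counts copied from the slides.
# Where a slide gives a range, the midpoint is used (an assumption).
GENOMES = {
    "M. genitalium": (578_000, 400),
    "H. influenzae": (1_830_138, 1_738),
    "E. coli": (4_639_221, 4_377),
    "S. cerevisiae": (12e6, 5_885),
    "C. elegans": (95.5e6, 19_820),
    "D. melanogaster": (1.8e8, 13_601),
    "A. thaliana": (1.17e8, 25_498),
    "M. musculus": (3e9, 30_000),
    "H. sapiens": (3.3e9, 40_000),  # midpoint of 30,000-50,000
    "O. sativa": (4.3e8, 46_500),   # midpoint of 30,000-63,000
}


def bp_per_gene(size_bp, genes):
    """Average number of base pairs per gene."""
    return size_bp / genes


if __name__ == "__main__":
    for name, (size_bp, genes) in GENOMES.items():
        print(f"{name:16s} {bp_per_gene(size_bp, genes):>10,.0f} bp per gene")
```

With these inputs the bacterial genomes come out at roughly one gene per 1,000 to 1,500 bp, while the human figure (using the assumed midpoint) is 82,500 bp per gene, in line with the small protein-coding fraction noted earlier.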

Page 8:

The beginning
Human genome project initially discussed at a UC-Santa Cruz meeting in 1985.

Page 9:

What were the concerns?
What will it do to biology?
How will we pay for it?
Is this really science?
Why bother to sequence it all?
  all vs. just the genes (skim sequencing)

Page 10:

Dept. of Energy
Initially funded project in 1987.
$5.3 million
Study radiation-induced mutations, repair and effect on humans.

Page 11:

NIH
Joined in 1988.
James Watson leader
3% of research budget devoted to examining the ethical, legal, and social implications of gene research (ELSI)

Page 12:

Other genomes
Parallel sequencing of E. coli, S. cerevisiae, C. elegans, D. melanogaster, and M. musculus
Why?
  Work out the technology and methods

Page 13:

Watson’s vision
Sequence it all, not just genes.
Use genetic maps and markers to help assemble the pieces.

Page 14:

Academic players
Wash U
Baylor
Whitehead
Wellcome Trust
Joint Genome Institute (DOE Center)

Page 15:

$1 to 10 cents a finished bp
automated processing of cloned DNA
automated DNA sequencing
computer system to support sequence data
algorithms to assess sequence fidelity, assemble sequences, and “find” genes.
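To see why the per-base price matters, the two prices on this slide can be multiplied by the 3.1 billion bp genome size given earlier. This is only back-of-the-envelope arithmetic: it assumes a single finished pass over the genome and ignores everything else a real budget would include. The names `GENOME_BP` and `project_cost` are hypothetical.

```python
GENOME_BP = 3.1e9  # human genome size in bp, from the slides


def project_cost(cost_per_bp_usd, genome_bp=GENOME_BP):
    """Total cost in dollars of one finished pass at a given per-bp price."""
    return cost_per_bp_usd * genome_bp


if __name__ == "__main__":
    for price in (1.00, 0.10):
        billions = project_cost(price) / 1e9
        print(f"${price:.2f} per bp -> ${billions:.2f} billion")
```

At $1 per finished bp the total is $3.1 billion; at 10 cents it is $310 million.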

Page 16:

Maps
Thomas Hunt Morgan (early 1900s): low resolution phenotypic markers
1970s: restriction maps
1980s: RFLPs
1989: Maynard Olson, Leroy Hood, Charles Cantor, and David Botstein: sequence itself is a marker! (STS)

Page 17:

PCR
Polymerase Chain Reaction
http://www.dnai.org/b/index.html
  Techniques > Amplifying > Making copies of DNA
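The amplification in PCR is exponential: in the ideal case each cycle doubles the number of copies of the target. A minimal sketch of that arithmetic follows. The function name `pcr_copies` and the `efficiency` parameter are illustrative assumptions, and the model ignores the plateau that real reactions reach.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Copies after `cycles` rounds; each round multiplies by (1 + efficiency).

    efficiency=1.0 is ideal doubling. Real reactions are less efficient and
    eventually plateau, which this toy model does not capture.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(f"{n} cycles: {pcr_copies(1, n):,.0f} copies from one template")
```

Under ideal doubling, a single template molecule becomes 2^30, or about 1.07 billion, copies after 30 cycles.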

Page 18:

The PCR revolution
1985: Kary Mullis, Cetus Corporation
No need to send clones back and forth
Allowed automated DNA sequencing
No need for large clone repository for all human genes
Unrestricted access to genes via public sequence databases.

Page 19:

Kary Mullis talks about PCR
http://www.dnai.org/b/index.html
  Techniques > Amplifying > Interviews > Making DNA copies
  Techniques > Amplifying > Interviews > Naming PCR

Page 20:

Sequencing: the old way
Maxam and Gilbert or Sanger methods
http://www.dnai.org/b/index.html
  Techniques > Sorting and Amplifying > Early DNA sequencing
http://www.dnai.org/b/index.html
  Techniques > Sorting and Amplifying > Interviews > Dideoxy method of sequencing

Page 21:

Automated Sequencing
Automation made possible by new dye chemistry developed by Leroy Hood and Lloyd Smith at Cal. Inst. Tech. in 1986.
http://www.dnai.org/b/index.html
  Techniques > Sorting and Amplifying > Cycle Sequencing

Page 22:

Inside the automated sequencer
Collaboration with ABI produced first automated sequencer.
Laser detection of each bp.
http://www.dnai.org/b/index.html
  Techniques > Sorting and Amplifying > Interviews > Making sequencing automated
  Techniques > Sorting and Amplifying > Interviews > Inside an automated sequencer

Page 23:

Sequencing
Detect all 4 nucleotides in one lane, so quadrupled the output from a single sequencing gel.
DuPont dye terminators: dye is attached to the terminating nucleotide, so all four nucleotides can be labeled in the same sequencing reaction.
Capillary eliminated need to cast gels.

Page 24:

Sequencing the Genome an Overview

Show sequencing.exe file containing movie about sequencing the human genome.

Page 25:

Two approaches to sequence the genome
Hierarchical shotgun clone libraries
  Use map to pick pieces of genome in order, break them, sequence and reassemble. (Watson)
Whole genome shotgun
  Break up genomic DNA randomly, sequence several genome equivalents, and reassemble. (Venter)
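"Several genome equivalents" is needed because randomly placed reads leave gaps. A common approximation (Poisson statistics, in the style of the Lander-Waterman model) is that with coverage c, meaning total bases sequenced divided by genome length, the expected fraction of the genome hit by at least one read is 1 - e^(-c). The sketch below assumes uniformly random reads and ignores repeats and cloning bias. The function names and the 500 bp read length are illustrative choices, not figures from the slides.

```python
import math


def coverage(num_reads, read_len, genome_len):
    """Genome equivalents sequenced: total read bases / genome length."""
    return num_reads * read_len / genome_len


def fraction_covered(c):
    """Expected fraction of the genome covered by at least one read,
    assuming reads land uniformly at random (Poisson approximation)."""
    return 1.0 - math.exp(-c)


if __name__ == "__main__":
    genome_len = 3.1e9   # human genome size from the slides
    read_len = 500       # assumed read length, for illustration only
    for num_reads in (6_200_000, 31_000_000, 62_000_000):
        c = coverage(num_reads, read_len, genome_len)
        print(f"{c:4.1f}x coverage -> {fraction_covered(c):.4%} of genome covered")
```

One genome equivalent covers only about 63% of the sequence under this model, while five equivalents cover more than 99%, which is why shotgun projects oversample.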

Page 26:

Hierarchical Shotgun Clone Libraries
Top-down strategy
Ordered library of clones based on large scale maps.
Subclone larger inserts into sequencing vector.
Reassemble sequence based on order.

Page 27:

ESTs
Expressed sequence tags
Reverse transcribe mRNA and sequence.
Venter used nonspecific primer to randomly amplify 150-400 bp fragments of genes.
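The first step, reverse transcribing mRNA, yields a DNA strand complementary to the message, and an EST is a short stretch read from that cDNA. A toy sketch of the idea is below. The function names are hypothetical, the example sequence is made up, and the fragment is far shorter than the 150-400 bp on the slide.

```python
# RNA base -> complementary DNA base
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(mrna):
    """First-strand cDNA, written 5'->3', for an mRNA written 5'->3'.

    The cDNA is complementary and antiparallel to the mRNA, so the
    message is read backwards and each base is complemented.
    """
    return "".join(COMPLEMENT[base] for base in reversed(mrna.upper()))


def est_from(cdna, start, length):
    """A tag of `length` bases starting at `start` (toy stand-in for an EST)."""
    return cdna[start:start + length]


if __name__ == "__main__":
    mrna = "AUGGCCAAGUUC"  # made-up message, for illustration only
    cdna = reverse_transcribe(mrna)
    print("cDNA:", cdna)
    print("tag: ", est_from(cdna, 2, 6))
```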

Page 28:

Patent controversy
NIH announced it would seek a patent on Venter’s ESTs.
Very controversial since functionally unknown.
More appropriate to private company.
Watson said it was “sheer lunacy” and resigned due to conflict with Bernadine Healy, NIH director.

Page 29:

More patent
Many biotech companies arose at the time to mine ESTs and applied for patents on the genes for diagnostics and pharmaceuticals.
NIH withdrew patent application.
ESTs must be novel to be patented.
ESTs must be useful to be patented.

Page 30:

The result
No patents granted thus far on genes without known function.

Page 31:

Whole genome shotgun
Break the genome into a bunch of pieces, often by mechanical shearing.
Sequence pieces and reassemble.
Weber (Marshfield Medical Research Foundation) and Myers (U of AZ) proposed method to speed sequencing.
1998: Venter, who had left NIH in 1992, became head of Celera and promised to sequence human genome in 3 years for $300 million.
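The "break, sequence, reassemble" idea can be illustrated with a toy greedy assembler that repeatedly merges the two fragments with the longest suffix-to-prefix overlap. This is a teaching sketch only, not the method used by Celera or the public project. It assumes error-free reads from one strand and a sequence without repeats longer than the minimum overlap, and it cuts fragments at fixed positions, whereas real shearing is random. All names (`shear`, `overlap`, `greedy_assemble`) and the example sequence are made up.

```python
def shear(genome, read_len, step):
    """Cut overlapping fragments at fixed offsets (real shearing is random)."""
    return [genome[i:i + read_len]
            for i in range(0, len(genome) - read_len + 1, step)]


def overlap(a, b, min_len):
    """Length of the longest suffix of `a` equal to a prefix of `b`,
    or 0 if the best match is shorter than `min_len`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=4):
    """Merge the best-overlapping pair until no overlaps remain.
    Returns the list of resulting contigs."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to join: a gap in coverage
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]
    return contigs


if __name__ == "__main__":
    genome = "ACGTTGCAATCGGATCCTAG"  # made-up 20 bp "genome"
    reads = shear(genome, read_len=8, step=4)
    shuffled = [reads[2], reads[0], reads[3], reads[1]]
    print(greedy_assemble(shuffled, min_len=4))
```

Greedy merging like this breaks down on repeated sequence, which is one reason the whole genome approach was controversial for a genome as repetitive as the human one.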

Page 32:

Accelerated the public project.
Whole genome method was tested by sequencing 120 Mbp of Drosophila genome.