Human Genome Project Seminal achievement. Scientific milestone. Scientific implications. Social implications.

Post on 22-Dec-2015


Human Genome Project

• Seminal achievement.
• Scientific milestone.
• Scientific implications.
• Social implications.

HGP: Background

• International Human Genome Sequencing Consortium:
– Proposed 1985, endorsed in 1988.
– 20 governmental groups.
– “Public project.”
• Craig Venter & Celera Genomics:
– Founded 1998.
– Goal: sequence in 3 years.
– Technology: automation, computers.
– Had access to public project’s data.
• Race ends in a tie, Feb. 2001: the public consortium publishes in Nature, Celera in Science.

International Human Genome Sequencing Consortium

• Approach was conservative and methodical.
• Had to wait for technology.
• First produced a clone-based physical map of the genome that would serve as a scaffold for the later sequence data:
– Break genome into chunks of DNA whose position on the chromosome is known from maps; clone the chunks into bacteria using BACs (bacterial artificial chromosomes).
– Digest the BAC-cloned chunks of DNA into small fragments.
– Sequence the small fragments.
– Stitch the fragment sequences together to assemble the sequence of each BAC clone.
– Assemble the genome sequence from the BAC clone sequences, using the clone-based physical map.

Celera

• Approach using "shotgun sequencing" (no organized map).

• Shreds genome randomly into small fragments with no idea of where they are physically located.

• Clones and sequences fragments.
• Uses computer to stitch together genome by matching overlapping ends of sequenced fragments.
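
The "stitch together by matching overlapping ends" step can be illustrated with a toy greedy overlap merge. This is only a minimal sketch under simplifying assumptions (error-free reads, a single strand, no repeats); it is not Celera's actual assembler, and the function names and the short example reads are hypothetical.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when the best match is shorter than `min_len`.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two fragments that share the longest overlap."""
    # Drop exact duplicates and fragments fully contained in another fragment.
    frags = [f for f in dict.fromkeys(fragments)
             if not any(f != g and f in g for g in fragments)]
    while True:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return frags  # nothing left to merge; may be several contigs
        merged = frags[best_i] + frags[best_j][best_len:]
        frags = [f for k, f in enumerate(frags)
                 if k not in (best_i, best_j)] + [merged]


# Hypothetical reads "shredded" from the made-up sequence ATGGCGTGCAATTCGG.
reads = ["ATGGCGT", "GCGTGCA", "GCAATT", "ATTCGG"]
contigs = greedy_assemble(reads)
print(contigs)
```

With real data, repeated sequence makes many overlaps ambiguous, which is one reason a greedy merge like this would not be enough for a whole genome.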

Timeline

• Genome sequencing driven by technology.
– 1985: 500 base pairs per day by hand.
– 1985-86: PCR and automated DNA sequencing.
– 1992: BACs.
– 2000: 1000 bases per second.
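
To put the two throughput figures in perspective, the arithmetic below estimates how long a single pass over the genome would take at each rate. It is a back-of-the-envelope sketch: it assumes the ~3.2 Gb genome size quoted in this deck, one machine or person working around the clock, and no redundant coverage (real projects read each base several times). The function name is hypothetical.

```python
GENOME_BP = 3.2e9          # approximate human genome size quoted in this deck
SECONDS_PER_DAY = 86_400


def days_to_sequence(bases_per_day: float,
                     genome_bp: float = GENOME_BP,
                     coverage: float = 1.0) -> float:
    """Days needed to read `genome_bp` bases `coverage` times over."""
    return genome_bp * coverage / bases_per_day


# 1985: 500 base pairs per day by hand.
manual_years = days_to_sequence(500) / 365.25

# 2000: 1000 bases per second, assumed to run continuously.
automated_days = days_to_sequence(1000 * SECONDS_PER_DAY)

print(f"By hand (1985 rate): about {manual_years:,.0f} years")
print(f"Automated (2000 rate): about {automated_days:.0f} days")
```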

Waiting for Technology

• Eyes on the human genome.

• While waiting for technology other genomes were sequenced.

Current Status

• Human genome ~3.2 Gb.
• “Rough draft” sequence of the human genome.
• Have sequenced 90% of the 2.5 Gb of gene-rich (euchromatic) DNA.
• What is considered finished?
– Fewer than 1 base in 10,000 is incorrectly assigned.
– More than 95% of the euchromatic regions are assigned.
– Each gap is smaller than 150 kb.
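
The three "finished" criteria above can be written as a simple check. This is an illustrative sketch only: the `AssemblyStats` record and `is_finished` function are hypothetical names, and the example numbers are made up except for the 90% coverage figure quoted in this deck.

```python
from dataclasses import dataclass


@dataclass
class AssemblyStats:
    error_rate: float           # fraction of bases incorrectly assigned
    euchromatin_covered: float  # fraction of euchromatic sequence assigned
    largest_gap_bp: int         # size of the largest remaining gap, in bases


def is_finished(stats: AssemblyStats) -> bool:
    """Apply the three thresholds listed on the slide."""
    return (
        stats.error_rate < 1 / 10_000          # fewer than 1 error per 10,000 bases
        and stats.euchromatin_covered > 0.95   # more than 95% of euchromatin assigned
        and stats.largest_gap_bp < 150_000     # every gap smaller than 150 kb
    )


# Hypothetical draft: 90% coverage (from the slide); other values are invented.
draft = AssemblyStats(error_rate=1 / 1_000, euchromatin_covered=0.90,
                      largest_gap_bp=500_000)
# Hypothetical assembly that meets all three thresholds.
polished = AssemblyStats(error_rate=1 / 100_000, euchromatin_covered=0.99,
                         largest_gap_bp=50_000)

print(is_finished(draft), is_finished(polished))
```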

Access to Information

• All public project data on the Internet.

• NCBI Website: www.ncbi.nlm.nih.gov.
– Human genome database.
– Sequence and mapping tools.

Database Search Example

• The genome database has many tools to locate a gene of interest or search for potential traits of the gene.

• Example: chromosomal map search result for the "breast cancer-causing gene" BRCA2 (figure not reproduced in this transcript).

Early Statistics

• Only 28% of the genome is transcribed into RNA.

• Only 1.1%-1.4% of the genome actually encodes protein (about 5% of the transcribed sequence).

• Surprises:
– More junk DNA.
– Fewer genes.

Junk DNA

• No apparent direct biological function.
• Long stretches of repeated sequence.
• Hot area of investigation.
• Human genome has far more repeat DNA than any other sequenced organism (over half of the genome).
• Parasitic elements: about 45% of the genome derives from selfish, parasitic DNA:
– Transposable elements.
– May play role in evolution.

Gene Count

• Many fewer genes than expected (about half):
– Only 35,000-45,000 genes vs. previously predicted 100,000.
– Only about twice as many as a nematode or a fruit fly.
– Twice the gene count does not mean twice the complexity.
– Alternative splicing: vertebrate genes are more innovative in how they are assembled.
– Protein domains are mixed more creatively and in larger numbers by vertebrates.

• Genes elusive.

Genetic Variation

• The International Single Nucleotide Polymorphism (SNP) Map.
– Compiled 1.4 million SNPs (single-base pair differences between individuals).
• Investigate:
– Disease resistance.
– Response to therapeutics.
– Evolution.
– Natural selection.
– Individual traits.
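
A SNP is a position where two otherwise matching sequences differ by one base. The sketch below lists such positions for two already-aligned sequences of equal length. It is a simplified illustration, not how the SNP Map was built: the function name is hypothetical, the sequences are invented (they are not BRCA2 or any real locus), and insertions and deletions are ignored.

```python
def find_snps(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch.

    Assumes the two sequences are already aligned and have equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, ref_base, alt_base)
        for pos, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != alt_base
    ]


# Invented sequences that differ at a single position.
reference = "ATGCCTATTGGA"
individual = "ATGCCTGTTGGA"

snps = find_snps(reference, individual)
print(snps)
```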

Gene Variation Example

• Mutations in “breast cancer gene” BRCA2.
• Chromosomal location and beginning sequence with one of the mapped variations.

Future Directions

• Fill gaps (refinement).
• Bioinformatics.
• Sequence additional genomes.
– For comparison.
– Upcoming: mouse, fish, dogs, kangaroo, chimpanzee (most valuable).
• Proteomics.
• Gene and Protein Chips (Microarrays).
