CSCE555 Bioinformatics
Lecture 2
Meeting: MW 4:00PM-5:15PM SWGN2A21
Instructor: Dr. Jianjun Hu
Course page: http://www.scigen.org/csce555
University of South Carolina
Department of Computer Science and Engineering
2008 www.cse.sc.edu
Roadmap
DNA, Chromosomes, Genomes
Genome Sequencing and whole genomes
DNA Sequence Representation, Models
Sequence Retrieval, Manipulation
Basic Analysis and Questions of Genomes
Summary
Tools to Learn Concepts Quickly
Wikipedia.org
◦ Searching for “Genome” brings up a lot of related information
◦ In Google, type “keywords wiki”
Google search tips
◦ Find info from university websites: Genome site:edu
◦ Find info as PowerPoint files: Genome tutorial filetype:ppt
DNA
Deoxyribonucleic acid (DNA) is a nucleic acid that contains the genetic instructions used in the development and functioning of all known living organisms.
Backbone: sugars and phosphate groups
DNA is a long polymer of simple units called nucleotides
Complementary base pairing: A-T, C-G
Can you write a program to output the complementary sequence?
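One possible answer to the exercise, as a minimal Python sketch. It assumes the input contains only the letters A, T, C and G (upper or lower case); any other character is passed through unchanged. The function names `complement` and `reverse_complement` are illustrative choices, not part of the lecture.

```python
# Watson-Crick pairing from the slide: A pairs with T, C pairs with G.
COMPLEMENT = str.maketrans("ATCGatcg", "TAGCtagc")


def complement(seq: str) -> str:
    """Return the base-by-base complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Return the complementary strand read in the opposite direction."""
    return complement(seq)[::-1]


if __name__ == "__main__":
    print(complement("GACTTAGATC"))          # CTGAATCTAG
    print(reverse_complement("GACTTAGATC"))  # GATCTAAGTC
```

The reverse complement is included because the two strands of DNA run in opposite directions, so the partner strand is usually written reversed.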
Genome of organisms
The genome of an organism is the complete DNA sequence of one set of chromosomes
Sequencing: Basic Ideas
Current lab techniques can sequence only small DNA pieces (say 700 base pairs).
◦ Use restriction enzymes to cut DNA into pieces
◦ Sort pieces of different sizes using gel electrophoresis and use the sorting to read them
Mapping and Walking
◦ Sequence one piece, get 700 letters, make a primer that allows you to read the next 700, and work sequentially down the clone
◦ Estimate for human genome sequencing using this method: 100 years
Shotgun sequencing (introduced by Sanger et al. 1977) for sequencing genomes
◦ Obtain random sequence reads from a genome
◦ Assemble them into contigs on the basis of sequence overlaps
Straightforward for simple genomes (with no or few repeat sequences): merge reads containing overlapping sequence
Shotgun sequencing is more challenging for complex (repeat-rich) genomes: two approaches
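The "merge reads containing overlapping sequence" step can be illustrated with a toy greedy assembler. This is only a sketch under simplifying assumptions: reads are error-free, all come from the same strand, and overlaps shorter than `min_len` (a hypothetical parameter) are ignored. Real assemblers are far more elaborate, and repeats make the greedy choice unreliable.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that is a prefix of b (0 if below min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of reads with the largest overlap into a contig."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # no overlaps left: these are the final contigs
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs) if idx not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    reads = ["TAGATCAGG", "CAGGAAACTG", "GACTTAGAT"]
    print(greedy_assemble(reads))  # ['GACTTAGATCAGGAAACTG']
```

With a repeat-rich genome, several reads can overlap equally well in more than one place, so the largest-overlap rule may join the wrong pieces. That ambiguity is what makes complex genomes harder.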
How Sequencing Works
Beckman CEQ 8000
Sequencing small DNA pieces
Use DNA cloning or PCR to make multiple copies.
Put the copies in 4 test tubes marked G, A, T and C.
In test tube G, use a restriction enzyme that cuts at G.
Do the same for the other test tubes.
Run gel electrophoresis separately on the contents of each test tube.
The results are shown in the table below.
Reading the table, G has lengths 1, 7, 12, 13, 19; A has lengths 2, 6, 8, 11, 14, 15, 16; T has lengths 4, 5, 9, 18; and C has lengths 3, 10, 17.
Reading the lengths in increasing order gives us the sequence.
Length  G  A  T  C
     1  G
     2     A
     3           C
     4        T
     5        T
     6     A
     7  G
     8     A
     9        T
    10           C
    11     A
    12  G
    13  G
    14     A
    15     A
    16     A
    17           C
    18        T
    19  G
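The table-reading step can be written as a short Python sketch. It assumes each fragment length appears in exactly one lane and that no length is missing; the function name `read_gel` is made up for this illustration.

```python
def read_gel(lanes: dict[str, list[int]]) -> str:
    """Rebuild a sequence from the fragment lengths observed in each lane."""
    base_at: dict[int, str] = {}
    for base, lengths in lanes.items():
        for n in lengths:
            if n in base_at:
                raise ValueError(f"length {n} appears in lanes {base_at[n]} and {base}")
            base_at[n] = base
    total = max(base_at)
    missing = [n for n in range(1, total + 1) if n not in base_at]
    if missing:
        raise ValueError(f"no band for lengths {missing}")
    # A fragment of length n ends at position n, so lane order gives the bases.
    return "".join(base_at[n] for n in range(1, total + 1))


# Fragment lengths from the slide.
lanes = {
    "G": [1, 7, 12, 13, 19],
    "A": [2, 6, 8, 11, 14, 15, 16],
    "T": [4, 5, 9, 18],
    "C": [3, 10, 17],
}

if __name__ == "__main__":
    print(read_gel(lanes))  # GACTTAGATCAGGAAACTG
```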
Methods for very large scale sequencing
A hierarchical approach
◦ Map on a large scale (physical mapping), sequence specific clones whose position in the genome is known
Shotgun sequencing
◦ “Tear up” the genome and sequence random fragments until it is done
Sequence tagged connectors (STC)
◦ Sequence the ends of many clones and use this info to pick overlapping clones
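To get a feel for what "sequence random fragments until it is done" costs, here is a small simulation sketch. It is not from the lecture and makes strong assumptions: reads have a fixed length, start positions are uniformly random, and the genome is treated as circular so that every position is equally likely to be covered. The function name and parameters are hypothetical.

```python
import random


def reads_until_covered(genome_len: int, read_len: int, seed: int = 0) -> int:
    """Count random reads drawn until every base is covered at least once."""
    if not 0 < read_len <= genome_len:
        raise ValueError("read_len must be between 1 and genome_len")
    rng = random.Random(seed)
    covered = [False] * genome_len
    uncovered = genome_len
    n_reads = 0
    while uncovered:
        start = rng.randrange(genome_len)
        for offset in range(read_len):
            pos = (start + offset) % genome_len  # wrap around: circular genome
            if not covered[pos]:
                covered[pos] = True
                uncovered -= 1
        n_reads += 1
    return n_reads


if __name__ == "__main__":
    n = reads_until_covered(genome_len=10_000, read_len=700, seed=1)
    print(f"{n} reads, about {n * 700 / 10_000:.1f}x total coverage")
```

Running it with different seeds shows that random sampling needs many more reads than the genome length divided by the read length, because many reads land on regions that are already covered.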