Lecture 8 Plant Genomics I
Genome sequencing and analyses
1. Sequencing methods
2. Sequence annotation and analyses
3. Genome structure
4. Arabidopsis genome sequencing
5. Other plant genome sequencing efforts
Assigned reading
-Chapter 7, 322-325, 328-329
-Nature vol. 408, pages 792-795 (Dec. 14, 2000): “Now for the hard ones” and “A green chapter in the book of life”
Hierarchical sequencing: generate and align large BAC or P1 clones, then fragment and sequence a subset of clones
Shotgun sequencing: fragment and sequence the entire genome
Adapted from Fig. 2.7, Gibson and Muse
Sanger sequencing method
Chain termination with a specific ddNTP (dideoxynucleotide)
Gel lanes: A T G C
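The chain-termination idea can be illustrated with a toy simulation. This is only a sketch: it treats the input string as the newly synthesized strand, assumes every possible termination product is present, and the function names `terminated_lengths` and `read_gel` are made up for illustration.

```python
def terminated_lengths(new_strand):
    """For each ddNTP lane, list the fragment lengths that end in that base."""
    lanes = {"A": [], "T": [], "G": [], "C": []}
    for length, base in enumerate(new_strand, start=1):
        # A fragment of this length exists in the lane whose ddNTP matches the base.
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read the sequence by ordering all bands from shortest to longest."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


lanes = terminated_lengths("GATTACA")
print(lanes)
print(read_gel(lanes))
```

Each lane holds only the fragments that stopped at one kind of base, so reading the bands across all four lanes from shortest to longest recovers the sequence.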
BLAST: Basic Local Alignment Search Tool
Performs pairwise comparisons of sequences, seeking regions of local similarity rather than an optimal global alignment between two sequences
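To make "local similarity" concrete, here is a minimal sketch of a local alignment score computed by dynamic programming (the Smith-Waterman approach). This is not how BLAST itself works internally, since BLAST uses a faster heuristic, and the scoring values (match 2, mismatch -1, gap -2) are arbitrary assumptions chosen for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best score of any local alignment between sequences a and b."""
    previous_row = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current_row = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = previous_row[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Flooring at 0 lets an alignment start anywhere, which is what makes it local.
            current_row[j] = max(0, diagonal, previous_row[j] + gap, current_row[j - 1] + gap)
            best = max(best, current_row[j])
        previous_row = current_row
    return best


# Two sequences that share only an internal region (ACGTACG).
print(local_alignment_score("TTTTACGTACGT", "GGGACGTACGGG"))
```

A global alignment would be penalized for the unrelated flanking bases; the local score reflects only the shared region.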
Blast Search
NCBI (http://www.ncbi.nlm.nih.gov/)
Searching sequence databases
Query: the submitted sequence
GenBank accession number: every sequence submitted to GenBank has an assigned number
E-value: the number of alignments scoring at least as well as this hit that would be expected by chance in a database of this size; the smaller the E-value, the more significant the match.
Scores: based on a scoring matrix that rewards matches and penalizes mismatches and gaps according to set rules for sequence alignment.
Blast result
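The relation between score and E-value can be sketched with the Karlin-Altschul formula, E = K·m·n·e^(−λS), where m and n are the query and database lengths. The values of `lam` and `k` below are illustrative placeholders, not parameters taken from the lecture, and real BLAST also applies length corrections that this sketch omits.

```python
import math


def bit_score(raw_score, lam, k):
    """Normalize a raw alignment score so it is independent of the scoring system."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value(raw_score, query_len, db_len, lam, k):
    """Expected number of chance hits scoring at least raw_score."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


lam, k = 0.267, 0.041  # assumed, illustrative statistical parameters
for score in (40, 80, 120):
    print(score, round(bit_score(score, lam, k), 1), e_value(score, 300, 1_000_000, lam, k))
```

The E-value falls exponentially as the score rises and grows in proportion to the size of the database searched.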
Genome Annotation
Ab initio gene discovery: GeneFinder, Grail, Genie, Genscan, HMMgene, FGENES
EST (Expressed Sequence Tag) sequencing
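As a rough illustration of what "ab initio" means (predicting genes from the sequence alone), the sketch below scans the forward strand for open reading frames. It is far simpler than the statistical models used by the gene-finding programs named above, it ignores introns and the reverse strand, and the function name and `min_codons` threshold are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end) positions of forward-strand ORFs, end exclusive.

    An ORF runs from an ATG to the first in-frame stop codon; min_codons
    counts the start and stop codons as well.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


sequence = "CCATGAAATTTTAGCC"
for start, end in find_orfs(sequence):
    print(start, end, sequence[start:end])
```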
Some genome sizes
Organism      Genome size (kb)
E. coli       4.5 x 10^3
Yeast         1.2 x 10^4
C. elegans    9.7 x 10^4
Arabidopsis   1.2 x 10^5
Drosophila    1.8 x 10^5
Mung bean     4.5 x 10^5
Rice          5.0 x 10^5
Tomato        1.0 x 10^6
Soya bean     1.1 x 10^6
Potato        1.8 x 10^6
Human         3.2 x 10^6
Maize         6.6 x 10^6
Wheat         1.6 x 10^7
-Among multicellular eukaryotes, gene number is nearly constant irrespective of genome size: 25,000 in Arabidopsis, 30,000-40,000 in human, 19,099 in C. elegans, 13,600 in Drosophila.
-The bigger the genome, the more repetitive DNA it contains; this mismatch between genome size and gene number is the C-value paradox
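A quick calculation makes the paradox visible. The sketch below divides the genome sizes by the gene counts quoted in this lecture; the human gene count of 35,000 is an assumed midpoint of the quoted 30,000-40,000 range.

```python
# Genome sizes (kb) and gene counts as quoted in the lecture.
GENOME_SIZE_KB = {
    "Arabidopsis": 1.2e5,
    "C. elegans": 9.7e4,
    "Drosophila": 1.8e5,
    "Human": 3.2e6,
}
GENE_COUNT = {
    "Arabidopsis": 25_000,
    "C. elegans": 19_099,
    "Drosophila": 13_600,
    "Human": 35_000,  # assumption: midpoint of the 30,000-40,000 range
}


def kb_per_gene(species):
    """Average amount of genomic DNA per gene, in kb."""
    return GENOME_SIZE_KB[species] / GENE_COUNT[species]


for species in GENOME_SIZE_KB:
    print(f"{species}: {kb_per_gene(species):.1f} kb per gene")
```

Gene counts differ by less than threefold across these four species, while the DNA available per gene differs by more than tenfold; most of the extra DNA is not genes.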
Arabidopsis genome sequence completed in 2000, published in 5 installments
See “Arabidopsis Genome Initiative, 2000 (pdf)”
-115 Mb, 25,500 predicted genes
-Whole genome duplication (2X) followed by extensive shuffling of chromosomal regions and gene loss
-The majority of the genes can be assigned to just 11,000 families, which might represent the minimal complexity or “toolkit” needed to support complex multicellularity. Animal and plant genomes might have evolved from this toolkit
-Distinctive features of the plant genome:
  ~ 800 genes are of plastid descent
  ~ 10% of the genome is transposable elements
  ~ plant-specific genes:
    Enzymes for cell wall biosynthesis, photosynthesis, secondary metabolites
    Phototropic and gravitropic responses
    Transport proteins for nutrients, ions, toxic compounds, and metabolites between cells
    Pathogen resistance genes
Synteny: Colinearity of loci (genes) among different plant species
i.e. evolutionarily conserved organization and arrangement of single-copy genes
Also see Fig. 7.28 of our text book
20 of the 54 genes in a 340 kb stretch of the rice genome (top) retain the same order in five different 80-200 kb regions of the Arabidopsis genome
Figure legend: genes on different strands; interspersed, unrelated genes
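One simple way to quantify collinearity is to ask how many shared genes appear in the same relative order in two regions. The sketch below does this with a longest-common-subsequence computation; it is an illustrative approach, not the method behind the figure, and the gene labels are hypothetical.

```python
def collinear_genes(order_a, order_b):
    """Return a longest list of shared genes that keeps the same order in both regions."""
    rows, cols = len(order_a) + 1, len(order_b) + 1
    table = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            if order_a[i - 1] == order_b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Trace back through the table to recover the genes themselves.
    i, j, shared = len(order_a), len(order_b), []
    while i > 0 and j > 0:
        if order_a[i - 1] == order_b[j - 1]:
            shared.append(order_a[i - 1])
            i, j = i - 1, j - 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return shared[::-1]


# Hypothetical gene orders; x1 and x2 stand for interspersed, unrelated genes.
rice_region = ["g1", "g2", "g3", "g4", "g5", "g6"]
arabidopsis_region = ["g1", "x1", "g3", "g4", "x2", "g6"]
print(collinear_genes(rice_region, arabidopsis_region))
```

Genes missing from one region or interrupted by unrelated genes do not break the collinearity; only a change in relative order does.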
Grasses, Legumes, and Solanaceae
-Whole genome sequencing: rice, maize, and alfalfa
-Comparative genome methods: synteny
-EST projects: for the rest of the crop plants
-High-resolution genetic maps: for the rest of the crop plants