Genome Biology and Biotechnology
2. The genome structures of invertebrates
Prof. M. Zabeau
Department of Plant Systems Biology
Flanders Interuniversity Institute for Biotechnology (VIB)
University of Gent
International course 2005
Sequenced genomes of invertebrates
• Most of them are associated with transposons of C. elegans, which are probably no longer active in the genome
– Local repeat sequences
• Tandem, inverted, or simple sequence repeats
Reprinted from: The C. elegans Sequencing Consortium, Science, 282, 2012 (1998)
Chromosome Structure and Organization
¤ The genome structure is remarkably uniform
– Gene density is fairly constant across the chromosomes
– No localized centromeres
• Like in yeast, but in contrast to all other eukaryotes
¤ Differences between the central portion and the arms of the chromosomes
– The conserved eukaryotic genes are in the central portion
– Repetitive DNA is more prevalent in the arms
– Meiotic recombination is much higher on the chromosome arms
– These differences suggest that DNA in the arms might be evolving more rapidly than in the central regions
Reprinted from: The C. elegans Sequencing Consortium, Science, 282, 2012 (1998)
Distribution of sequence elements on Chromosome I
Reprinted from: The C. elegans Sequencing Consortium, Science, 282, 2012 (1998)
Figure tracks along chromosome I (arm, central part, arm): TTAGGC repeats, tandem repeats, inverted repeats, yeast similarities, EST matches, predicted genes
Conclusions
¤ The complete sequence of the C. elegans genome has
– provided a basis for the discovery of all the genes of a multicellular eukaryotic organism
• First gene inventory of a multicellular eukaryote
¤ C. elegans is a very effective model organism for
– eukaryotic gene analysis: widely used for functional genomics
– human disease gene research
– nematode pest control research
Reprinted from: The C. elegans Sequencing Consortium, Science, 282, 2012 (1998)
The Genome Sequence of Caenorhabditis briggsae: A Platform for Comparative Genomics
¤ Paper presents
– high-quality draft (> 10-fold coverage) sequence of C. briggsae
– Comparative genome analysis of C. briggsae and C. elegans
• The two species diverged ~100 million years ago
• morphologically indistinguishable
• same chromosome number (5 autosomes plus X) and similar genome size (104 and 100 Mb)
– Comparisons of the genomes of related species allow
• More precise annotation of protein-coding genes
• Discovery of noncoding genes, regulatory sequences and “unknown” functional elements
Stein et al., PLoS Biol 1: 166-192 (2003)
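As a rough illustration of what "> 10-fold coverage" means for a draft sequence, the sketch below uses the idealized Poisson (Lander-Waterman style) approximation, in which the expected fraction of bases covered by no read at coverage c is exp(-c). This model is an assumption added here, not something stated in the paper; the function name is hypothetical, and only the 104 Mb genome size comes from the slide.

```python
import math

GENOME_SIZE_MB = 104  # C. briggsae genome size quoted on the slide


def uncovered_fraction(coverage: float) -> float:
    """Expected fraction of bases hit by zero reads.

    Assumes reads land uniformly at random (idealized Poisson model),
    which real shotgun libraries only approximate.
    """
    return math.exp(-coverage)


def expected_uncovered_bp(coverage: float, genome_size_mb: float) -> float:
    """Expected number of uncovered bases for a genome of the given size."""
    return uncovered_fraction(coverage) * genome_size_mb * 1e6


for c in (1, 5, 10):
    print(
        f"coverage {c:>2}x: uncovered fraction {uncovered_fraction(c):.2e}, "
        f"about {expected_uncovered_bp(c, GENOME_SIZE_MB):,.0f} bp"
    )
```

Under these assumptions, 10-fold coverage leaves only a few thousand bases of a 104 Mb genome unsampled. Real drafts have more gaps than this, because repeats and cloning bias are not captured by the model.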
Colinearity of the C. briggsae and C. elegans Genomes
Chromosome Structure and Organization
¤ The centers contain orthologous (1) and essential genes (2)
– Very long synteny blocks
¤ The arms contain orphan genes (3) and repetitive elements (4)
– Short synteny blocks
– The arms of the chromosomes are evolving more rapidly than the centers
• Sequence contained 128 physical gaps and 1630 sequence gaps
– Some regions were of poor sequence quality
– Demonstrated that whole-genome shotgun sequencing can be used for large eukaryotic genomes
• Adams et al., Science, 287, 2185 (2000)
¤ Finished sequence (2002)
– BAC clone sequencing and gap filling
– Sequence contains 7 physical gaps and 37 sequence gaps
– Very accurate sequence: error rate of < 1/100,000
• Celniker et al., Genome Biol. 3: research0079.1–0079.14 (2002)
The Drosophila Genome
¤ The (female) Drosophila genome is ~176 Mb in size
– Euchromatic part: 117 Mb completely sequenced
– heterochromatic part: partly (~20 Mb) sequenced (unassembled)
• Female: estimated at ~59 Mb
• Male: the 40 Mb Y chromosome is completely heterochromatic
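The size figures on this slide add up as follows. This is a minimal sketch using only the numbers quoted above; the variable names are chosen here for illustration.

```python
# Sizes in Mb, as quoted on the slide (all approximate)
EUCHROMATIN_MB = 117                # completely sequenced
HETEROCHROMATIN_FEMALE_MB = 59      # estimated
HETEROCHROMATIN_SEQUENCED_MB = 20   # sequenced but unassembled

# Female genome size is the sum of the two compartments
female_total_mb = EUCHROMATIN_MB + HETEROCHROMATIN_FEMALE_MB

# Share of the female genome that is euchromatic
euchromatic_fraction = EUCHROMATIN_MB / female_total_mb

# Share of the female heterochromatin that has been sequenced
heterochromatin_sequenced_fraction = (
    HETEROCHROMATIN_SEQUENCED_MB / HETEROCHROMATIN_FEMALE_MB
)

print(f"female genome: {female_total_mb} Mb")
print(f"euchromatic share: {euchromatic_fraction:.1%}")
print(f"heterochromatin sequenced: {heterochromatin_sequenced_fraction:.1%}")
```

So about two thirds of the female genome is euchromatic, and roughly one third of the female heterochromatin had been sequenced.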
Euchromatin and Heterochromatin
¤ Euchromatin
– Gene rich portion of the genome
– Condenses during mitosis and decondenses thereafter
– Portion of the genome that can be cloned stably in BACs
¤ Heterochromatin
– Consists mainly of simple sequence repeats (satellite DNAs), transposable elements, and tandem arrays of rRNA genes
– Remains condensed after mitosis
– Gene poor portion of the genome
– Contains elements required for centromere function
¤ Euchromatin - heterochromatin transition
– is gradual at the molecular level
Reprinted from: Celniker et al., Genome Biol. 3: research0079.1–0079.14 (2002)
Figure labels: transposons, centromere
Gene Content of the Drosophila Genome
¤ Annotation of the draft genome sequence
– Predicted 13,601 genes
• >10,000 genes (>75%) supported by EST and protein matches
• This annotation was incomplete
– Large number of sequence gaps and sequencing errors
¤ Annotation of the finished genome sequence
– Predicted nearly the same number of genes: 13,676
• Majority (85%) of the gene models revised
– Improved using a collection of 250,000 ESTs and full-length cDNAs
– Found only 17 pseudogenes (far fewer than in C. elegans)
– Heterochromatic part may contain ~500 genes
• The 20 Mb sequenced contains ~300 protein coding genes
– Reannotation reveals many complex gene models
• genes that do not fit the simple 5’UTR – exons – 3’UTR
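The gene counts above can be turned into rough gene densities to show how gene-poor the heterochromatin is. This is a back-of-the-envelope sketch: it assumes, for simplicity, that all 13,676 predicted genes lie in the 117 Mb of euchromatin (the slides do not say this), and the helper name is hypothetical.

```python
# Figures quoted on the slides (approximate)
FINISHED_GENE_COUNT = 13_676        # genes predicted on the finished sequence
EUCHROMATIN_MB = 117                # sequenced euchromatin
HETEROCHROMATIN_GENES = 300         # genes in the sequenced heterochromatin
HETEROCHROMATIN_SEQUENCED_MB = 20   # sequenced heterochromatin


def genes_per_mb(gene_count: int, size_mb: float) -> float:
    """Average gene density in genes per megabase."""
    return gene_count / size_mb


# Assumption: every predicted gene is counted as euchromatic
euchromatin_density = genes_per_mb(FINISHED_GENE_COUNT, EUCHROMATIN_MB)
heterochromatin_density = genes_per_mb(
    HETEROCHROMATIN_GENES, HETEROCHROMATIN_SEQUENCED_MB
)
fold_difference = euchromatin_density / heterochromatin_density

print(f"euchromatin:     {euchromatin_density:.0f} genes/Mb")
print(f"heterochromatin: {heterochromatin_density:.0f} genes/Mb")
print(f"fold difference: {fold_difference:.1f}")
```

Under that assumption the euchromatin carries roughly 117 genes per Mb against about 15 genes per Mb in the sequenced heterochromatin, close to an eightfold difference.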
Conservation of gene segments
¤ Sequence conservation in noncoding regions
– Is insufficient for the identification of regulatory sequences
– Multiple genome sequence alignments will be needed
The Mosquito Genome Sequence
¤ The draft genome spans 278 Mb
– Covers the entire genome including the heterochromatic DNA
– Mosquitoes have larger genomes than Drosophila
• estimates from 250 to 500 Mb
• Transposable elements constitute ~16% of the genome
– Drosophila experienced a recent genome size reduction
¤ The predicted number of genes is ~14,000
– Very similar to Drosophila
¤ The comparison of the Anopheles and Drosophila genomes and proteomes reveals
– considerable similarities and numerous differences
– Reflects selection and adaptation to different ecologies and
(~6,000)
• Exhibit 1:1 relationship
• Genes with conserved function
– Paralogs: ~12%
• Duplicated genes
– Homologs: ~25%
• Unclear relationship
– Orphans: 11% to 18%
• New genes
• Rapidly evolving genes
The core of conserved proteins
¤ Dynamics of Gene Structure in a span of 250 My
– Exon lengths and intron frequencies are similar
– Introns in Drosophila have half the length of those in Anopheles
• systematic reduction of noncoding regions in Drosophila
– Only 50% of the introns are perfectly conserved
• one intron gain or loss per gene per 125 My
– Intron sequences diverge rapidly
• sequence similarity in <2% of the equivalent introns
~1000 microsynteny blocks
• 2-3 genes per block (cf. fish-human)
¤ Macrosynteny
– Both species have five major chromosomal arms
– Clear 1:1 homologies between the chromosomal arms
• Inversions much more frequent than translocations
The Draft Genome of Ciona intestinalis: Insights into chordate and vertebrate origins
¤ Paper presents
– Draft genome sequence of Ciona intestinalis, an ancestral chordate
– Chordates appear in the fossil record at the Cambrian