Genome Evolution in Yeast
Gilles Fischer
27th January 2009 | European Course on
INTRODUCTION:
Comparative genomics
Yeasts as model organisms
GENOME EVOLUTION:
DNA duplications
Chromosome dynamics
Nucleotide composition
A brief introduction to the field of Comparative Genomics
Comparing genomes is a very old idea…
Vendrely and Vendrely (1950):
"There is no doubt that the systematic study of the absolute
deoxyribonucleic acid content of the nucleus, across many
animal species, could provide interesting suggestions
regarding the problem of evolution"
DNA carries the genetic information: Avery (1944) and Hershey-Chase (1952)
Jacques Monod:
"Anything that is true of E. coli is true of the elephant"
[Figure: sequence relatedness scale running from identical through divergent to different, as a function of time or quantity of evolutionary changes]
Looking for differences
Looking for similarities
NEED FOR ADEQUATELY RELATED ORGANISMS
A brief introduction to the field of Comparative Genomics
Looking for differences
Looking for similarities
Genome sequences
Bio-informatics
Rules governing genome evolution
Mechanistic hypotheses
Genetic screens
Functional genomics
Experimental Biology
Molecular mechanisms
SMALL GENOMES AND EXPERIMENTALLY TRACTABLE
• Eukaryotic micro-organisms classified in the kingdom Fungi
• About 1,500 species currently described (an estimated 1% of all yeast species)
• Yeasts are unicellular, typically measuring 3–4 µm in diameter, although some reach over 40 µm
• Saccharomyces cerevisiae has been used in baking and in fermenting alcoholic beverages for thousands of years
• Other species of yeast, such as Candida albicans, are opportunistic human pathogens
• Yeasts have recently been used to generate electricity in microbial fuel cells and to produce ethanol for the biofuel industry
• Yeasts are found in both divisions Ascomycota and Basidiomycota
• The budding yeasts ("true yeasts") are classified in the Saccharomycotina subphylum
Organisms with small genomes, phylogenetically related and
experimentally tractable = YEASTS
A brief introduction to the field of Yeast Genomics
[Figure: The Tree of Eukaryotes (Keeling et al., 2005)]
The genome of S. cerevisiae
The first eukaryotic genome sequence:
"Life with 6000 genes", Science (1996)
André Goffeau
8 years, 120 labs, 641 people
Saccharomycotina:
Saccharomyces paradoxus, Saccharomyces mikatae, Saccharomyces cerevisiae, Saccharomyces kudriavzevii, Saccharomyces bayanus,
Saccharomyces pastorianus, Saccharomyces exiguus, Saccharomyces servazzii, Saccharomyces castellii, Candida glabrata, Vanderwaltozyma polyspora, Zygosaccharomyces rouxii, Lachancea thermotolerans, Lachancea waltii, Lachancea kluyveri,
Kluyveromyces lactis, Kluyveromyces marxianus, Eremothecium gossypii, Saccharomycodes ludwigii, Brettanomyces bruxellensis, Pichia angusta, Candida lusitaniae, Debaryomyces hansenii, Pichia stipitis, Pichia sorbitophila, Candida guilliermondii, Candida tropicalis, Candida parapsilosis, Lodderomyces elongisporus, Candida albicans, Candida dubliniensis, Arxula adeninivorans, Yarrowia lipolytica
Schizosaccharomyces pombe
A brief introduction to the field of Yeast Genomics
Whole Genome Duplication
Extensive loss of transposable elements and spliceosomal introns
Gain of mating type cassettes and small centromeres
frequent tandem duplications
Gain of Megasatellites
Gain of HO gene
Genome annotation
species | # genes | # tRNA | # introns | size (Mb) | # chr
Saccharomyces cerevisiae | 5769 | 274 | 287 | 12.1 | 16
Candida glabrata | 5204 | 207 | 131 | 12.3 | 13
Zygosaccharomyces rouxii | 4998 | 272 | 167 | 9.8 | 7
Lachancea kluyveri | 5308 | 258 | 322 | 11.3 | 8
Lachancea thermotolerans | 5104 | 231 | 286 | 10.4 | 8
Kluyveromyces lactis | 5084 | 162 | 175 | 10.7 | 6
Debaryomyces hansenii | 6273 | 200 | 475 | 12.1 | 7
Yarrowia lipolytica | 6434 | 510 | 1070 | 20.5 | 6
Lachancea kluyveri: WashU sequencing center, M. Johnston
A brief introduction to the field of Yeast Genomics
Evolutionary scale
[Figure: two trees compared on the same scale of mean amino acid identity (%). Yeasts: Yarrowia lipolytica, Saccharomyces cerevisiae, Candida glabrata, Lachancea kluyveri, Debaryomyces hansenii, Kluyveromyces lactis, Lachancea thermotolerans, Zygosaccharomyces rouxii; identity values 100*, 65, 60, 51, 48; divergence estimates 100 MYr, 100-300 MYr and 300-1000 MYr (Berbee and Taylor, 2006; James et al., 2006). Chordates: Homo sapiens, Mus musculus, Takifugu rubripes, Tetraodon nigroviridis, Ciona intestinalis; identity values 100*, 90, 70, 50; divergence estimates 100 MYr, 450 MYr and 550 MYr.]
* Dujon et al., Nature, 2004 and Jaillon et al., Nature, 2004
A brief introduction to the field of Yeast Genomics
Genome redundancy
[Figure: mean family size (axis from 1.10 to 1.40) for YALI, SACE, LAKL, DEHA, KLLA, LATH, ZYRO and CAGL, shown next to the species tree (Yarrowia lipolytica, Saccharomyces cerevisiae, Candida glabrata, Lachancea kluyveri, Debaryomyces hansenii, Kluyveromyces lactis, Lachancea thermotolerans, Zygosaccharomyces rouxii), with the WGD marked]
Wolfe and Shields, 1997
- high level of redundancy (in all eukaryotic phyla)
- gene order changes (differential loss of duplicates, translocation breakpoints)
- several mechanisms of duplication
A brief introduction to the field of Yeast Genomics
Yeast Genomes
- Small, compact and specialized:
  - small intergenic sequences
  - few transposable elements
  - few introns
  - limited RNA interference
- Large evolutionary scale
- High level of genome redundancy
- Numerous evolutionary novelties in all clades
- High number of sequenced genomes
===> good model organisms to study genome evolution
Most eukaryotic genomes contain a high proportion of duplicated genes
species | S. c. | A. t. | C. e. | D. m. | H. s.
duplicated genes | 43% | 65% | 49% | 40% | 50%
Fates of the copies after duplication:
- Pseudogenization: loss of function (most frequent fate)
- Neofunctionalization: gain of a new function
- Conservation: gene dosage increase, genetic robustness
- Degeneration-Complementation: specialization of the 2 copies
===> Strong evolutionary potential
Genome evolution: DNA duplications
Adaptive value of DNA duplications:
Adaptation to sulfate-limited conditions in chemostats for 200 generations (Gresham et al., PLoS Genet 2008):
- duplications detected by CGH
- SDs containing between 1 and 22 genes
- no homology at the junctions (microhomologies)
A duplication assay:
- RPL20B (chr. XV) and RPL20A (chr. XIII) ==> WT growth rate
- RPL20B (chr. XV) and rpl20A∆ deletion (chr. XIII) ==> slow growth
- rpl20A∆ and ??? RPL20B ==> WT growth rate
- Serial cultures: 3 days, YPD, 30°C, and so on…
Genome evolution: DNA duplications
A duplication assay: molecular characterization of segmental duplications:
- Karyotype and hybridization with an RPL20B probe [Figure: karyotypes with chromosome labels]
- Comparative Genomic Hybridization [Figure: 143 kb segment, RPL20B]
- Molecular combing: direct tandem
- PCR and sequence of the junction: AACCTAGAGCTT(GTT)14GTGGATTGTTT
Despite the selection of a single gene duplication event, only large segmental duplications were recovered
Genome evolution: DNA duplications
Molecular mechanisms (Koszul et al., EMBO J., 2004):
group | strain | rate of SDs (/cell/division) | fold vs WT | type of SDs: inter-chr. | type of SDs: intra-chr. | breakpoints: LTRs (%) | breakpoints: microhomologies/microsatellites (%)
- | WT | 10^-7 | 1 | 42 | 6 | 48 | 52
REPLICATION | pol32∆ | 0 | <0.07 | - | - | - | -
REPLICATION | clb5∆ | 7 x 10^-5 | 730 | 66 | 3 | 62 | 38
REPLICATION | CPT | 3 x 10^-5 | 320 | 22 | 0 | 54 | 56
DSB REPAIR | rad52∆ | 3 x 10^-7 | 3 | 70 | 1 | 0 | 100
DSB REPAIR | rad52∆ rad1∆ dnl4∆ | 8 x 10^-8 | 0.8 | 15 | 0 | 0 | 100
Breakpoint sequences: LTRs (300 bp); microhomologies (2 to 11 bp); microsatellites (poly A/T or trinucleotide repeats)
[Figure: replication timing profile, time (min) from 0 to 40 (Raghuraman et al., Science, 2001); legend: late-replicating regions, tRNAs, LTRs, microsatellites]
a connection with replication?
Replication-based mechanisms: Clb5
strain | rate of SDs (/cell/division) | fold vs WT | type of SDs: inter-chr. | type of SDs: intra-chr. | breakpoints: LTRs (%) | breakpoints: microhomologies/microsatellites (%)
WT | 10^-7 | 1 | 42 | 6 | 48 | 52
clb5∆ | 7 x 10^-5 | 730 | 66 | 3 | 62 | 38
In clb5∆:
- defect in the firing of late replication origins (Schwob et al., 1993)
- S-phase lasts twice as long (Epstein et al., 1992)
- Rad9-dependent activation of the replication checkpoint, indicative of DNA damage (Gibson et al., 2004)
- RPL20B lies in a Clb5-dependent region (CDR; McCune et al., 2008)
=> replication perturbations strongly induce SD formation
Bloom and Cross, 2007
Replication-based mechanisms: Pol32
Pol32 (Nick McElhinny, Cell 2008)
strain | rate of SDs (/cell/division) | fold vs WT
pol32∆ | 0 | <0.07
Pol32 is required for initiating the BIR reaction (Lydeard et al., 2007)
=> SDs are generated through replication-based mechanisms
Replication-based mechanisms: broken forks as precursor lesions leading to SDs
[Figure: Top1 and CPT]
strain | rate of SDs (/cell/division) | fold vs WT | type of SDs: inter-chr. | type of SDs: intra-chr. | breakpoints: LTRs (%) | breakpoints: microhomologies/microsatellites (%)
WT | 10^-7 | 1 | 42 | 6 | 48 | 52
CPT | 3 x 10^-5 | 320 | 22 | 0 | 54 | 56
=> broken forks promote SD formation
The DSB repair pathways
- NHEJ (Dnl4): no homology, simple religation
- Resection, followed by:
  - MMEJ (Rad1): microhomologies (5-12 bp)
  - HR (Rad52): >30 bp of homology; SSA, BIR (Pol32), SDSA and DSBR (Rad51)
Two different replication-based mechanisms
strain | rate of SDs (/cell/division) | fold vs WT | type of SDs: inter-chr. | type of SDs: intra-chr. | breakpoints: LTRs (%) | breakpoints: microhomologies/microsatellites (%)
WT | 10^-7 | 1 | 42 | 6 | 48 | 52
rad52∆ | 3 x 10^-7 | 3 | 70 | 1 | 0 | 100
- SDs with LTRs at the breakpoints: HR-dependent
- SDs with microhomologies or microsatellites at the breakpoints: HR-independent
=> HR-mediated SDs result from Rad51-independent BIR
=> Non HR-mediated SDs result from ?
[Figure: the DSB repair pathways (Dnl4; resection; Rad52, Rad1), with the three pathways crossed out]
HR requires Rad52; MMEJ requires Rad1; NHEJ requires Dnl4
strain | rate of SDs (/cell/division) | fold vs WT | type of SDs: inter-chr. | type of SDs: intra-chr. | breakpoints: LTRs (%) | breakpoints: microhomologies/microsatellites (%)
WT | 10^-7 | 1 | 42 | 6 | 48 | 52
rad52∆ | 3 x 10^-7 | 3 | 70 | 1 | 0 | 100
rad52∆ rad1∆ dnl4∆ | 8 x 10^-8 | 0.8 | 15 | 0 | 0 | 100
SDs are still formed in the absence of all known DSB repair pathways
=> existence of a new DSB repair pathway?
MMIR: microhomology/microsatellite-induced replication
Sequences found at breakpoints:
- microhomologies between 2 and 11 bp
- poly (A/T)13-23
- trinucleotide repeats (GTT)3-20
Formation of chimeric genes at breakpoints (in 13 out of 26 junctions)
Extremely high density of microhomologies and microsatellites in the genome,
often intragenic
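The microsatellite classes listed above can be searched for in a junction sequence with regular expressions. The following is only a minimal sketch: the function name and the reading of "poly (A/T)13-23" as a run of A or a run of T are assumptions made for illustration, not the method used in the study.

```python
import re
from typing import List, Tuple

# Hypothetical motif table based on the ranges quoted on the slide.
# Each entry: label -> (compiled pattern, repeat unit length in nt).
MOTIFS = {
    "poly(A)": (re.compile(r"A{13,23}"), 1),
    "poly(T)": (re.compile(r"T{13,23}"), 1),
    "(GTT)n": (re.compile(r"(?:GTT){3,20}"), 3),
}

def find_microsatellites(seq: str) -> List[Tuple[str, int, int]]:
    """Return (motif label, 0-based start, number of repeat units) hits.

    Runs longer than the upper bound are reported truncated at that bound.
    """
    seq = seq.upper()
    hits = []
    for label, (pattern, unit_len) in MOTIFS.items():
        for match in pattern.finditer(seq):
            copies = (match.end() - match.start()) // unit_len
            hits.append((label, match.start(), copies))
    return sorted(hits, key=lambda hit: hit[1])

# Junction sequence shown earlier in the talk: AACCTAGAGCTT(GTT)14GTGGATTGTTT
junction = "AACCTAGAGCTT" + "GTT" * 14 + "GTGGATTGTTT"
print(find_microsatellites(junction))
```

Short microhomologies (2 to 11 bp) are not covered here, since finding them requires aligning the two parental sequences on either side of the junction.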
A new pathway? MMIR
Microhomology/Microsatellite-Induced Replication
- independent of all known DSB repair pathways (HR, NHEJ, MMEJ)
- dependent on Pol32
- replication template switching between microhomologies and microsatellites
Genome evolution: DNA duplications
Conclusions
- SDs are spontaneously generated at high frequency: 10^-7 SD/cell/division for the RPL20B locus
- SDs arise from two alternative replication-based mechanisms: BIR and MMIR
- MMIR represents a new mechanism different from known DSB repair pathways (HR, NHEJ):
  - occurs between microhomologies (2 to 11 nt) and microsatellites (poly A/T, trinucleotide repeats)
  - independent of Rad52
  - requires Pol32
- MMIR induces the formation of chimeric genes at the rearrangement junctions
Genome evolution: DNA duplications
In human, FoSTeS/MMBIR (Hastings et al., Nature Reviews Genetics, 2009):
Complex structural variations:
- Lissencephaly (Nagamani et al., J. Med Genet 2009)
- Miller-Dieker syndrome
- Charcot-Marie-Tooth disease (Lupski and Chance, 2005)
- Pelizaeus-Merzbacher disease (Lee et al., Cell 2007)
- XLMR syndrome (Bauters et al., Genome Res 2008)
- SDs and CNVs (Kim et al., Genome Res 2008)
Genome evolution: Chromosome Dynamics
Types of rearrangements: translocations, inversions, duplications, deletions
[Figure: comparison of the chromosomes of Species 1 and Species 2 to estimate rates of rearrangements]
- Duplications: high evolutionary potential (creation of new genes, adaptation, specialization,…)
- Translocations, inversions, deletions: very low evolutionary potential? (Loss of genes, deregulation of gene expression, modification of sub-nuclear architecture,…)
S. paradoxus
S. kudriavzevii
S. cariocanus
S. mikatae
S. bayanus
S. cerevisiae
Saccharomyces sensu stricto complex:
- monophyletic group
- very closely related species
- hybrids viable but sterile
- 16 chromosomes
Genome evolution: Chromosome Dynamics
[Figure: species tree (Yarrowia lipolytica, Debaryomyces hansenii, Kluyveromyces lactis, Lachancea kluyveri, Lachancea thermotolerans, Zygosaccharomyces rouxii, Candida glabrata, S. bayanus, S. cerevisiae) with the sensu stricto group expanded: S. cerevisiae, S. paradoxus, S. cariocanus, S. mikatae, S. kudriavzevii, S. bayanus]
Fischer et al., Nature 2000
Genome evolution: Chromosome Dynamics
[Figure: sensu stricto tree with values (4), (4), (2), (0), (0) on the branches]
only few translocations:
• low reorganization
• recombination between repeated sequences
• no chromosomal speciation
• variable rate of rearrangements?
[Figure: S. cerevisiae chr VIII compared with the corresponding regions in S. bayanus, C. glabrata, K. lactis, D. hansenii and Y. lipolytica; values shown: 98%, 88%, 77%, 11%, 5%. Species tree: Yarrowia lipolytica, S. cerevisiae, S. bayanus (sensu stricto), Candida glabrata, Lachancea kluyveri, Debaryomyces hansenii, Kluyveromyces lactis, Lachancea thermotolerans, Zygosaccharomyces rouxii]
Fischer et al., PLoS Genet 2006
F. Brunet
Genome evolution: Chromosome Dynamics
[Figure: pairwise genome comparisons among Saccharomyces cerevisiae, Candida glabrata, Lachancea kluyveri, Lachancea thermotolerans and Zygosaccharomyces rouxii]
at genome scale:
S. cerevisiae vs C. glabrata: "UNSTABLE" GENOMES
- comprehensive reshuffling
- 509 translocations, 104 inversions
- no homologous chromosomes
L. kluyveri vs L. thermotolerans: "STABLE" GENOMES
- moderate reshuffling
- 91 translocations, 22 inversions
- large chromosomal segments (up to 670 kb)
Mean amino acid identity: 58%
Mean amino acid identity: 65%
Quantitative estimation of the relative genome stability: GOC (gene order conservation; Rocha, Trends Genet, 2003)
[Figure: for a pair of neighboring orthologues in species 1, are the orthologues also neighbors in species 2? If yes: +1; if no: 0. Both gene maps carry the label "=5".]
- GOC = (# neighboring orthologues) / (total # orthologues)
- GOL: Gene Order Loss = 1 - GOC
- Rate of rearrangements (mean rate) = GOL / phylogenetic distance
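The GOC, GOL and rate definitions above can be illustrated with a short computation. This is a minimal sketch under stated assumptions: genomes are given as lists of chromosomes (each an ordered list of gene names), orthologues are one-to-one, and two orthologues count as neighbors when no other gene with an orthologue lies between them. The function names and the neighbor criterion are hypothetical and may differ from the criterion used in the original analysis.

```python
from typing import Dict, FrozenSet, Iterable, List, Set

def adjacent_pairs(chromosomes: List[List[str]], keep: Iterable[str]) -> Set[FrozenSet[str]]:
    """Unordered pairs of consecutive genes, ignoring genes without orthologues."""
    keep_set = set(keep)
    pairs = set()
    for chrom in chromosomes:
        genes = [g for g in chrom if g in keep_set]
        for left, right in zip(genes, genes[1:]):
            pairs.add(frozenset((left, right)))
    return pairs

def gene_order_conservation(genome1: List[List[str]],
                            genome2: List[List[str]],
                            orthologs: Dict[str, str]) -> float:
    """GOC = (# neighboring orthologue pairs conserved) / (total # orthologues)."""
    pairs1 = adjacent_pairs(genome1, orthologs.keys())
    pairs2 = adjacent_pairs(genome2, orthologs.values())
    conserved = sum(
        1 for pair in pairs1
        if frozenset(orthologs[g] for g in pair) in pairs2
    )
    return conserved / len(orthologs)

def rearrangement_rate(goc: float, phylogenetic_distance: float) -> float:
    """Rate of rearrangements = GOL / phylogenetic distance, with GOL = 1 - GOC."""
    return (1.0 - goc) / phylogenetic_distance

# Toy example: one inversion of the last two genes in species 2.
species1 = [["a1", "a2", "a3", "a4"]]
species2 = [["b1", "b2", "b4", "b3"]]
orthologs = {"a1": "b1", "a2": "b2", "a3": "b3", "a4": "b4"}
goc = gene_order_conservation(species1, species2, orthologs)
print(goc, rearrangement_rate(goc, 0.25))
```

In the toy example two of the three adjacencies survive the inversion, so the rate reflects a single rearrangement scaled by the chosen (arbitrary) distance of 0.25.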
Genome evolution: Chromosome Dynamics
[Figure: rearrangement branch rates on the species tree (Yarrowia lipolytica, Saccharomyces cerevisiae, Candida glabrata, Lachancea kluyveri, Debaryomyces hansenii, Kluyveromyces lactis, Lachancea thermotolerans, Zygosaccharomyces rouxii; WGD marked); branch values shown: 1.5, 1.3, 2.7, 1.7, 1.7, 1.7, 0.6, 0.9, 0.4, 0.3, 0.0, 0.4]
[Figure: species instability scale (axis from 0.3 to 0.7) for S. cerevisiae, C. glabrata, Z. rouxii, K. lactis, L. kluyveri, L. thermotolerans and D. hansenii]
Genome evolution: Chromosome Dynamics
Fischer et al. , PLoS Genet 2006
[Figure: summary on the species tree (Y. lipolytica, S. cerevisiae, S. bayanus (sensu stricto), Candida glabrata, Lachancea kluyveri, Debaryomyces hansenii, Kluyveromyces lactis, Lachancea thermotolerans, Zygosaccharomyces rouxii). Labels: low, moderate and massive levels of rearrangement; stable genomes; unstable genome; differential gene loss; TGA expansion; no synteny]
Genome evolution: Chromosome Dynamics
High level of chromosome plasticity Hundreds of translocations and inversions
Gene order is not very constrained
Highly variable rates of chromosome rearrangements between lineages but also within a given lineage
Is there a selective advantage associated to these rearrangements? Are they accumulated by genetic drift?
usually considered as deleterious
few examples of the adaptative role of rearrangements (proliferation of cancer cells (O’Neil and Look, 2007), growth advantage of translocated yeast cells (Colson et al, 2004), adaptative gene loss (Domergue, 2005).
Creation of genetic novelties requires chromosome plasticity?
ConclusionsGenome evolution: Chromosome Dynamics
Base substitution mutations:
- C→T transitions: cytosine deamination (Kreutzer and Essigmann, PNAS, 1998)
- G→T transversions: 8-oxo-guanine (Shibutani et al., Nature, 1991)
=> Global AT-enrichment
Biased Gene Conversion (BGC):
- AT→GC mutations (Duret and Galtier, Annu Rev Genomics Hum Genet, 2009)
=> Global GC-enrichment
Marsolier-Kergoat and Yeramian, Genetics, 2009: not in yeast?
species | GC%
Saccharomyces cerevisiae | 38.3
Candida glabrata | 38.8
Zygosaccharomyces rouxii | 39.1
Lachancea kluyveri | 41.5
Lachancea thermotolerans | 47.3
Kluyveromyces lactis | 38.8
Debaryomyces hansenii | 36.3
Yarrowia lipolytica | 49.0
Eremothecium gossypii | 52.0
The Génolevures Consortium, Genome Res., 2009
Genome evolution: Nucleotide composition
[Figure: GC% (axis from 20 to 80) along the chromosomes, positions in Mb: Zygosaccharomyces rouxii (chromosomes A to G, mean 39.1%), Lachancea thermotolerans (chromosomes A to H, mean 47.3%) and Lachancea kluyveri (chromosomes A to H, mean 41.5%). In L. kluyveri, the C-left region (1 Mb) reaches 52.9%.]
• global GC increase
DNA, GC% in C-left / GC% out of C-left: 46.1 / 37.4; 54.2 / 42.0; 46.8 / 36.5
• strong bias in codon usage
codon position (RNA) | GC% in C-left | GC% out of C-left
1st | 53.3 | 46.4
2nd | 41.0 | 37.0
3rd | 68.3 | 42.7
• bias in protein composition
amino acid | A | G | P | R | I | N | K | F
GC% in synonymous codons | 84 | 84 | 84 | 72 | 11 | 16 | 16 | 16
relative use in C-left | 1.3 | 1.2 | 1.1 | 1.2 | 0.7 | 0.8 | 0.9 | 0.9
Payen et al., Genome Res., 2009
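GC% profiles along a chromosome and GC% per codon position, as reported for C-left, can be computed along the following lines. This is a minimal sketch assuming plain DNA strings; the function names and the window and step sizes are illustrative choices, not those of Payen et al.

```python
from typing import Dict, List, Tuple

def gc_percent(seq: str) -> float:
    """GC% of a sequence, counting only unambiguous A, C, G, T bases."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    if total == 0:
        return float("nan")
    return 100.0 * (seq.count("G") + seq.count("C")) / total

def windowed_gc(seq: str, window: int, step: int) -> List[Tuple[int, float]]:
    """(start position, GC%) for successive windows along a chromosome."""
    return [
        (start, gc_percent(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]

def gc_by_codon_position(cds: str) -> Dict[int, float]:
    """GC% at the 1st, 2nd and 3rd codon positions of an in-frame coding sequence."""
    usable = len(cds) - len(cds) % 3  # drop a trailing partial codon, if any
    return {pos + 1: gc_percent(cds[pos:usable:3]) for pos in range(3)}

print(windowed_gc("GGGGAAAA", window=4, step=4))
print(gc_by_codon_position("ATGGCC"))
```

Comparing the output of `gc_by_codon_position` for genes inside and outside a region gives the kind of per-position contrast shown in the table above.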
Genome evolution: Nucleotide composition
Phylogeny:
Alignments of universally conserved proteins:
• 17 families (6688 residues) outside C-left
• 19 families (4631 residues) in C-left
[Figure: phylogenetic trees of E. gossypii, K. lactis, L. thermotolerans, L. waltii, L. kluyveri, Z. rouxii, C. glabrata and S. cerevisiae; node support values between 96 and 100; scale bar 0.05]
• C-left has the same phylogenetic origin as the rest of the genome
Payen et al., Genome Res., 2009
Genome evolution: Nucleotide composition
Synteny:
[Figure: synteny between LAKL_C and L. thermotolerans chromosomes (LATH_G, LATH_E, LATH_C, LATH_A, LATH_F) and L. waltii scaffolds (LAWA_S27, LAWA_S56, LAWA_S55, LAWA_S33); 670 kb]
C-left shares a common ancestral origin with the genomes of L. waltii (LAWA) and L. thermotolerans (LATH)
Genome evolution: Nucleotide composition
Replication:
- Design of custom microarrays (Agilent 2 x 105k): 200 bp fragments
- Time course analysis of copy number variation during S-phase (G1, S, G2; DNA labelled with Cy3 and Cy5)
[Figure: replication profiles of ChrA, ChrB, ChrC and ChrD]
Genome evolution: Nucleotide composition
Conclusions
The C-left region:
• shows a global GC increase (codon usage bias and protein composition bias)
• harbors a normal gene density
• has a phylogenetic origin consistent with the rest of the genome
• presents a very high level of synteny conservation with sister species genomes
• encompasses the MAT locus but has lost the silent cassettes HMR and HML
• is devoid of transposable elements (203 insertions in the rest of the genome)
• harbors the same compositional bias in all 11 L. kluyveri strains tested
• has a modified replication program (more origins and delayed firing)
=> a cause or a consequence of the unusual GC composition?
• Meiotic recombination and BGC?
L. kluyveri offers a unique opportunity to understand the mechanisms of evolution of genome nucleotide composition
Thank you
- Plateforme Puces ADN, Génopole Pasteur: Odile Sismeiro, Jean-Yves Coppée
- Génopole Pasteur-Ile de France: Christiane Bouchier, Lionel Frangeul
- Centre National de Séquençage, Evry: Jean Weissenbach, Patrick Wincker
- Génolevures consortium: Jean-Luc Souciet (Univ. Louis Pasteur, Strasbourg)
- Unité de Génétique Moléculaire des Levures, Institut Pasteur: Celia Payen, Romain Koszul
- Unité de Génomique des Microorganismes, équipe Biologie des Génomes: Nicolas Agier, Guénola Drillon