Causes of insertion sequences abundance in prokaryotic genomes? A problem of size
Marie Touchon, E.P.C Rocha
Atelier de BioInformatique, Université Pierre et Marie Curie, Paris
Unité Génétique des Génomes Bactériens, Institut Pasteur, Paris
IS elements :
- code only for the information allowing their mobility
Ability to generate mutations :
- by insertion within genes
- by activating genes when inserted upstream
- by generating extensive DNA rearrangements
They have been found to shuttle adaptive traits such as :
- antibiotic resistance
- virulence
- new metabolic capabilities
Their exact nature is still debated : Selfish/Advantageous?
- genomic parasites
- beneficial agents
Causes of insertion sequences abundance in prokaryotic genomes?
Reasons largely unknown and widely speculated upon
Hypotheses :
- IS family specificity
- Genome size
- Frequency of horizontal gene transfer
- Pathogenicity
- Type of ecological associations
- Human sedentarisation
The current availability of hundreds of genomes renders testable many of these hypotheses.
IS elements Identification :
Problem : IS annotations are heterogeneous, inaccurate or insufficient
Solution : Reannotation of ISs using a comparative study, adopting the nomenclature defined by Chandler (1998)
- ISs have one or two consecutive ORFs encoding the transposase protein
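The "one or two consecutive ORFs" criterion above can be sketched as a simple grouping pass over an ordered gene list. This is only an illustration, not the reannotation pipeline used in the study: the tuple layout, the `is_transposase` flag, the same-strand requirement and the `max_gap` threshold are all assumptions introduced here.

```python
def group_transposase_orfs(orfs, max_gap=50):
    """Group transposase ORFs into IS candidates of one or two ORFs.

    orfs: list of (start, end, strand, is_transposase), sorted by start.
    max_gap: hypothetical maximum distance (bp) between two ORFs of the
             same IS; a negative distance means the ORFs overlap.
    """
    candidates = []
    i = 0
    while i < len(orfs):
        if not orfs[i][3]:          # not a transposase ORF: skip it
            i += 1
            continue
        group = [orfs[i]]
        if i + 1 < len(orfs):
            nxt = orfs[i + 1]
            same_strand = nxt[2] == orfs[i][2]
            close = nxt[0] - orfs[i][1] <= max_gap
            if nxt[3] and same_strand and close:
                group.append(nxt)   # second consecutive transposase ORF
        candidates.append(group)
        i += len(group)
    return candidates


# Hypothetical toy annotation (not real coordinates).
toy_orfs = [
    (100, 400, "+", False),
    (500, 800, "+", True),
    (790, 1700, "+", True),
    (3000, 3900, "-", True),
    (5000, 5600, "+", False),
]
toy_candidates = group_transposase_orfs(toy_orfs)
```

With this toy input the pass yields one two-ORF candidate and one single-ORF candidate. A real reannotation would also need to confirm family membership by sequence comparison, which this sketch does not attempt.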
This effect is unlikely to explain the variability of ISs
The effect of genome size
Wilcoxon test : p<0.0001
Spearman’s r=0.63, p<0.0001
Strong association between Genome size and IS number (and density)
The larger the genome, the more IS elements it contains
N= 64 198
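For readers unfamiliar with the statistic quoted above, here is a minimal sketch of Spearman's rank correlation: rank both variables (ties get the mean rank), then take the Pearson correlation of the ranks. The function names and the toy values are assumptions made for illustration; they are not the data behind r=0.63.

```python
def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1   # sorted positions i..j are tied
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data, NOT the genomes analysed in the talk.
genome_size_mb = [0.6, 1.1, 1.8, 2.5, 4.6, 5.5, 9.0]
is_count = [0, 0, 3, 2, 40, 25, 60]
rho = spearman_rho(genome_size_mb, is_count)
```

Because the statistic works on ranks, it captures any monotonic trend between genome size and IS number without assuming linearity.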
The effect of horizontal gene transfer
Lists of orthologs (strains A, B, C)
Putative orthologs : reciprocal best hits, proteins with >90% similarity and <20% length difference.
Strain-specific region : region exclusive to a strain, containing at least ten consecutive genes without an ortholog
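The two definitions above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the input dictionaries, the function names, and the choice to measure length difference relative to the longer protein are hypothetical, and the region scan ignores wrap-around on circular chromosomes.

```python
def reciprocal_best_hits(best_ab, best_ba, lengths,
                         min_similarity=90.0, max_len_diff=0.20):
    """Return {gene_a: gene_b} for pairs that are each other's best hit.

    best_ab / best_ba: gene -> (best hit in the other strain, similarity %).
    lengths: gene -> protein length.
    """
    orthologs = {}
    for gene_a, (gene_b, similarity) in best_ab.items():
        back = best_ba.get(gene_b)
        if back is None or back[0] != gene_a:
            continue                      # not reciprocal
        la, lb = lengths[gene_a], lengths[gene_b]
        len_diff = abs(la - lb) / max(la, lb)   # assumed definition
        if similarity > min_similarity and len_diff < max_len_diff:
            orthologs[gene_a] = gene_b
    return orthologs


def strain_specific_regions(gene_order, genes_with_ortholog, min_run=10):
    """Return (first, last) index pairs of runs of at least min_run
    consecutive genes that have no ortholog in the compared strains."""
    regions, start = [], None
    for idx, gene in enumerate(gene_order):
        if gene not in genes_with_ortholog:
            if start is None:
                start = idx               # a run begins here
        else:
            if start is not None and idx - start >= min_run:
                regions.append((start, idx - 1))
            start = None
    if start is not None and len(gene_order) - start >= min_run:
        regions.append((start, len(gene_order) - 1))
    return regions


# Hypothetical toy genome of 30 genes.
toy_order = [f"g{i}" for i in range(30)]
toy_conserved = {f"g{i}" for i in list(range(5)) + list(range(15, 20)) + [25]}
toy_regions = strain_specific_regions(toy_order, toy_conserved)
```

In the toy genome only genes g5 to g14 form a run of ten or more genes without an ortholog; the shorter runs near the end are ignored.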
Prophage-Database (Nestle, Casjens, 2003)
HGT-Database (Garcia-Vallve,2003)
E. coli O157:H7 Sakai
The effect of horizontal gene transfer
Wilcoxon test : p<0.0001
5.2%
11.4%
t-test : p<0.001
ISs are ~4 times more concentrated in HGT regions
Genomes lacking ISs have less HGT
Spearman’s r=0.31, p>0.1 (NS)
HGT may be a determinant of the presence of ISs, but not of their abundance
Spearman’s r=0.84, p<0.0001
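A statement such as "ISs are ~4 times more concentrated in HGT regions" is a ratio of two densities. The sketch below shows one way to compute it; the slides do not say whether HGT regions were compared with the rest of the genome or with the whole genome, so the "rest of the genome" baseline, the function name and the numbers are assumptions.

```python
def is_density_ratio(is_in_hgt, hgt_length_bp, is_total, genome_length_bp):
    """IS density inside HGT regions divided by IS density elsewhere.

    Raises ZeroDivisionError if no IS lies outside the HGT regions.
    """
    density_hgt = is_in_hgt / hgt_length_bp
    density_rest = (is_total - is_in_hgt) / (genome_length_bp - hgt_length_bp)
    return density_hgt / density_rest


# Hypothetical genome: 5 Mb, 10% of it in HGT regions, 26 ISs of which 8 in HGT.
toy_ratio = is_density_ratio(8, 500_000, 26, 5_000_000)
```

Working with densities rather than raw counts matters because HGT regions cover only a small share of most genomes.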
The effect of horizontal gene transfer
HGT is a necessary but not sufficient condition for the presence of ISs
The intensity of HGT is not a significant determinant of IS abundance
IS family diversity in HGT regions is almost as high as in the entire genome
The effect of pathogenicity
Yersinia pestis (plague)
Shigella flexneri, sonnei (dysentery)
Bordetella pertussis (whooping cough)
4.3 3.6
Wilcoxon test : p>0.5
N = 100 153
IS=0 8% 17% 55% 100%
Wilcoxon test : p<0.001
No association between the presence of ISs and pathogenicity
Strong association between the frequency of ISs and the facultative character of the ecological associations
The effect of the type of ecological association
Stepwise multiple regression (response : Number of ISs)
Covariate                 Cumulative R2
Genome size               0.4
Ecological association    0.47
Frequency HGT             0.47
Genome size is the most important variable
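To make the "Cumulative R2" column concrete, here is a minimal sketch of forward stepwise regression: at each step the covariate that most increases R2 is added, and the R2 of the growing model is recorded. The slides do not state which stepwise procedure or software was used, so the greedy forward rule, the lack of a stopping criterion, the purely numeric covariates and the toy data are all assumptions. Collinear covariates would make the normal equations singular and are not handled.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def r_squared(columns, y):
    """R2 of a least-squares fit of y on the given columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)] for a in range(k)]
    xty = [sum(row[a] * y[i] for i, row in enumerate(design)) for a in range(k)]
    beta = solve(xtx, xty)
    mean_y = sum(y) / n
    ss_res = sum((y[i] - sum(bj * xj for bj, xj in zip(beta, design[i]))) ** 2
                 for i in range(n))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - ss_res / ss_tot


def forward_stepwise(covariates, y):
    """Return [(covariate name, cumulative R2)] in order of greedy selection."""
    remaining, chosen, path = dict(covariates), [], []
    while remaining:
        best = max(remaining, key=lambda name: r_squared(chosen + [remaining[name]], y))
        chosen.append(remaining.pop(best))
        path.append((best, r_squared(chosen, y)))
    return path


# Hypothetical toy data, NOT the values from the study.
toy_covariates = {
    "genome_size_mb": [1.0, 2.0, 3.5, 4.0, 5.5, 6.0, 8.0],
    "hgt_fraction": [0.02, 0.10, 0.05, 0.12, 0.08, 0.15, 0.11],
}
toy_is_counts = [1, 4, 9, 12, 30, 28, 60]
toy_path = forward_stepwise(toy_covariates, toy_is_counts)
```

Cumulative R2 can only stay level or rise as covariates are added, which is why a covariate that adds nothing, like the last row of the table above, leaves the value unchanged.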
Kruskal-Wallis test : p>0.5 (NS)
We removed genomes lacking ISs (possibly under sexual isolation)
Lifestyle is a non-significant determinant
The effect of human sedentarisation (Mira et al.,2006)
1) Genomes with many ISs are from prokaryotes associated with humans or domesticated animals and plants.
2) Large intra-genomic IS expansions are recent.
Kruskal-Wallis test : p>0.5 (NS)
not / directly / indirectly
No evidence that man-related prokaryotes have more ISs.
Genome size explains ~40% of the variance in IS abundance
The smaller the genome, the lower the number and also the density of ISs
- Selection could favor small genomes : optimal use of resources and shorter replication time (an increase in genome size caused by ISs could be counter-selected)
- ISs are selected to generate genetic variation (such selection should be stronger in larger genomes)
Genomes with fewer ISs correspond to the slowest-growing prokaryotes
Wilcoxon test : p<0.05
[Figure : density of ISs (/Mb) by growth, fast vs. slow]
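The two-group comparisons in this talk (here, IS density in fast versus slow growers) rely on the Wilcoxon rank-sum test. A minimal sketch follows, using the normal approximation without tie or continuity correction; a statistics package would normally be used instead, and the function name and toy densities are assumptions.

```python
import math


def rank_sum_test(sample_a, sample_b):
    """Two-sided Wilcoxon rank-sum test (normal approximation).

    Returns (U statistic for sample_a, z score, approximate p-value).
    """
    pooled = sorted(sample_a + sample_b)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + j) / 2 + 1   # mean rank for tied values
        i = j + 1
    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = sum(rank_of[v] for v in sample_a)
    u = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))       # two-sided normal tail
    return u, z, p


# Hypothetical IS densities (/Mb), NOT the values from the study.
toy_fast = [4.1, 6.3, 2.8, 7.9, 5.5, 3.9]
toy_slow = [0.0, 1.2, 0.4, 2.1, 0.9, 3.0]
toy_u, toy_z, toy_p = rank_sum_test(toy_fast, toy_slow)
```

The normal approximation is rough for samples this small; exact tables or a library routine are preferable in practice.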
Transposition inactivates genes with high probability
The total number of essential genes : ~300
+ 200-300 genes are nearly ubiquitous
The abundance of IS elements in genomes could be mostly a question of space for transposition events that are not highly deleterious
500 nearly essential genes
- Selection against transposition in genomes with higher density of deleterious transposition targets
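The "question of space" argument can be illustrated with simple arithmetic: if about 500 genes are nearly essential regardless of genome size, they make up a much larger share of the targets in a small genome. The sketch below assumes roughly 1000 genes per Mb, a round figure chosen here for illustration and not taken from the slides; the function name is also hypothetical.

```python
NEARLY_ESSENTIAL_GENES = 500   # from the slides: ~300 essential + 200-300 nearly ubiquitous
GENES_PER_MB = 1000            # assumption: roughly one gene per kb


def nearly_essential_fraction(genome_size_mb,
                              genes_per_mb=GENES_PER_MB,
                              nearly_essential=NEARLY_ESSENTIAL_GENES):
    """Share of genes that are nearly essential, capped at 1.0."""
    total_genes = genome_size_mb * genes_per_mb
    return min(1.0, nearly_essential / total_genes)


# Share of highly deleterious gene targets for a range of genome sizes (Mb).
for size_mb in (0.6, 1.0, 2.0, 5.0, 10.0):
    print(f"{size_mb:>5} Mb  {nearly_essential_fraction(size_mb):.0%}")
```

Under these assumptions half of the genes in a 1 Mb genome are costly to disrupt, against one in ten for a 5 Mb genome, which is consistent with stronger selection against transposition in small genomes.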
One explanation fits the available data well
Conclusions
High diversity of ISs found within strains or closely related species
The number of ISs evolves so fast that there is no historical correlation
HGT may be a determinant of the presence of ISs, but not of their abundance
Surprisingly, genome size alone is the best predictor of IS number and density
Selection against transposition in genomes with a higher density of deleterious transposition targets
[Figure labels : Bordetella bronchiseptica, Bordetella parapertussis]
Impacts of IS abundance?
IS expansion :
- increases the rate of genome rearrangements
- increases the number of pseudogenes