Revisiting the taxonomy and evolution of pathogenicity of the … · 2020. 3. 14. · Leptospirosis is an emerging zoonotic disease of worldwide distribution that affects more than
RESEARCH ARTICLE
Revisiting the taxonomy and evolution of
pathogenicity of the genus Leptospira through
the prism of genomics
Antony T. Vincent1☯, Olivier Schiettekatte2,3☯, Cyrille Goarant4, Vasantha Kumari Neela5,
Eve Bernet1,2, Roman Thibeaux4, Nabilah Ismail6, Mohd Khairul Nizam Mohd Khalid7,
Strains used in this study are available at the National Reference Center for Leptospirosis,
Institut Pasteur, Paris, France. Type strains of new Leptospira species were also deposited in
the DSMZ-German Collection of Microorganisms (www.dsmz.de) and the National Collabo-
rating Centre for Reference and Research on Leptospirosis, Amsterdam, The Netherlands
(http://leptospira.amc.nl/leptospira-library/), except for species Leptospira kobayashii, Leptospira ryugenii, Leptospira ellinghausenii, and Leptospira johnsonii which were deposited in the
CIP-Collection of Institut Pasteur (www.pasteur.fr/fr/crbip) and Japan Collection of Microor-
ganisms (http://jcm.brc.riken.jp/en/).
Ethics statement
Collection of the strains was conducted according to the Declaration of Helsinki. A written
informed consent from patients was not required as the study was conducted as part of routine
surveillance of the national reference center and no additional clinical specimens were col-
lected for the purpose of the study. Cultures originating from human samples were anon-
ymized. Approval for bacterial isolation from soil and water was not required as the study was
conducted as part of investigations of leptospirosis outbreaks. For New Caledonia, approval
for bacterial isolation from the natural environment was obtained from the South Province
(reference 1689–2017) and North Province (reference 60912-2002-2017).
Protocols for animal experiments conformed to the guidelines of the Animal Care and Use
Committees of the Institut Pasteur (Comité d’éthique d’expérimentation animale CETEA #
2016–0019), approved by the French Ministry of Agriculture. All animal procedures carried out
in our study were performed in accordance with the European Union legislation for the pro-
tection of animals used for scientific purposes (Directive 2010/63/EU).
Whole-genome sequencing
In this study, the DNA of a total of 90 Leptospira strains was sequenced (S1 Table), including
the type strain L. idonii Eri-1T [13], whose genome sequence was not available. Genomic DNA
was prepared by centrifugation of exponential-phase cultures and extraction with a MagNA
Pure 96 Instrument (Roche). Next-generation sequencing was performed by the Mutualized
Platform for Microbiology (P2M) at Institut Pasteur, using the Nextera XT DNA Library Prep-
aration kit (Illumina), the NextSeq 500 sequencing systems (Illumina), and the CLC Genomics
Workbench 9 software (Qiagen) for de novo assemblies. The draft genomes with 50x minimum
coverage were used for subsequent analysis and they were submitted to GenBank; accession
numbers are available in S1 Table. The genomic DNA of L. kobayashii E30T, L. ryugenii YH101T, L. ellinghausenii E18T, and L. johnsonii E8T was sequenced at the Sequencing facility
at the University of Hokkaido (Japan) (Masuzawa et al. submitted).
Phylogenetic analyses
A delineation of the species for the genome sequences was performed by Average Nucleo-
tide Identity (ANI) using pyani version 0.2.7 (https://github.com/widdowquinn/pyani).
Subsequently, one genome per species was chosen and added to reference genomes available
in GenBank, in order to compose a dataset of 64 genomes of Leptospira. The genome
sequences of Turneriella parva DSM 2152 (GenBank Assembly # GCA_000266885.1) and
Leptonema illini DSM 21528 (GenBank Assembly # GCA_000243335.1) were added as out-
group for phylogenetic analysis. All 66 genomic sequences were annotated with Prokka
version 1.12 [18]. The orthology between the coding sequences was inferred with
GET_HOMOLOGUES version 20092018 using the COG and OMCL algorithms [19].
Sequences of 1371 orthologous genes that are in single copy and in the softcore (present in
A collection of environmental isolates from Asia (Japan and Malaysia), Africa (Algeria and the
island of Mayotte, a French overseas department in the Indian Ocean), Europe (France) and
Oceania (New Caledonia) were included in this study. Leptospira isolates were retrieved from
water and soil samples from different sites from 2008 to 2017. Except for the Japan isolates
[17], environmental isolates reported in this study were not described previously.
The DNA of a total of 90 isolates was sequenced using Illumina technology (S1 Table). The
90 Leptospira strains had an average genome size of 4,128,000 ± 221,345 bp. The largest
genome was 4,993,538 bp, belonging to Leptospira putramalaysiae strain SSW20. The smallest
genome was the genome of Leptospira fletcheri strain SSW15T, 3,733,663 bp in size. The GC
content of the genomes in this study ranged from 37.06% to 47.70%. The average genome assembly
contained 49 ± 50 contigs (S1 Table).
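As a rough illustration of how the per-assembly figures reported above (genome size, GC content, contig count, and a "mean ± SD" summary) can be derived, here is a minimal Python sketch. The function names and the toy contigs are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def assembly_stats(contigs):
    """Return total length (bp), GC content (%) and contig count for one assembly."""
    total = sum(len(c) for c in contigs)
    gc = sum(c.upper().count("G") + c.upper().count("C") for c in contigs)
    return {
        "length": total,
        "gc_percent": 100.0 * gc / total,
        "contigs": len(contigs),
    }


def summarize(values):
    """Mean and sample standard deviation (assumed estimator for 'mean ± SD')."""
    return mean(values), stdev(values)


# Toy contigs for illustration only (hypothetical, not real Leptospira sequence).
toy_contigs = ["ATGCGC", "ATAT"]
stats = assembly_stats(toy_contigs)
print(stats)
```

In practice the contigs would be read from the draft assembly FASTA files, and `summarize` would be applied to the 90 per-genome values.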
Phylogenomics and identification of 30 new Leptospira species
The complete genome sequences of the 90 strains described in this study were compared to the
previously published genome sequences from the already known Leptospira species and strain
GWTS#1, which was wrongly assigned to the species Leptospira alstonii [33, 34] (S1 Table). As a
note, we sequenced the genome of L. idonii strain Eri-1T because a genome sequence was not
available for this species [13]. Also, the genome of the recently described species Leptospira macculloughii [15] was excluded from further analysis in our list of reference genomes as the
genome of L. macculloughii was the result of a mixed culture of L. meyeri and L. levettii.
The results obtained from pairwise comparisons of the 124 genome sequences are summarized
as a matrix in the S2 Table. Using an ANI cutoff of 95%, generally used as the metric to delineate
bacterial species [35], we established the existence of 64 different species of Leptospira. These spe-
cies include 34 previously described species, 4 new species from Japan (Masuzawa et al. submit-
each of the 64 Leptospira species (Fig 1). The figure confirms 64 well-delineated species of Leptospira. The ANI values are consistent with their phylogenetic relationships. The interspecies
ANI values ranged from ~69% to ~94%.
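The species delineation step can be sketched as follows. This is a minimal illustration, not the procedure actually used: it assumes that pairwise ANI values (for example, exported from pyani) are already available as a dictionary, and it groups genomes by single linkage at the 95% cutoff, which is an assumption on our part. The genome labels and ANI values are invented.

```python
def species_clusters(genomes, ani, cutoff=95.0):
    """Group genomes whose pairwise ANI is >= cutoff (single linkage, union-find)."""
    parent = {g: g for g in genomes}

    def find(g):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for (a, b), value in ani.items():
        if value >= cutoff:
            parent[find(a)] = find(b)

    clusters = {}
    for g in genomes:
        clusters.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical genomes and ANI percentages, for illustration only.
genomes = ["A", "B", "C", "D"]
ani = {
    ("A", "B"): 98.2, ("A", "C"): 81.0, ("A", "D"): 80.0,
    ("B", "C"): 80.5, ("B", "D"): 79.9, ("C", "D"): 96.1,
}
clusters = species_clusters(genomes, ani)
print(clusters)
```

With interspecies ANI values well below the cutoff, as reported here (~69% to ~94%), the choice of linkage rule should rarely matter, but borderline values close to 95% would need manual review.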
Four of the 30 new species, isolated from the natural environment in Malaysia, Mayotte,
and New Caledonia and in small mammals in Ireland, were classified within the lineage of
pathogens. Ten novel species, isolated from Malaysia, Mayotte, Japan and New Caledonia,
were identified as part of the intermediates. Twelve novel species, isolated from Malaysia,
Mayotte, Japan, and New Caledonia, were assigned to the saprophytes. Finally, four novel spe-
cies, isolated from Japan, Algeria, and France, were positioned in a clade sister to the one
formed by saprophytes together with L. idonii. Using this large number of new species, we
could refine the different clades and identify two major clades and four subclades in the
Fig 1. Phylogenetic tree based on the sequences of 1371 genes inferred as orthologous. The matrix represents the calculated ANIb values for all the genomic
sequences. The branches are colored according to their belonging to the four main subclades: P1 (red), P2 (purple), S1 (green) and S2 (blue). The bootstrap value is
indicated for a single node (that corresponding to the separation between L. biflexa strain Patoc1 and L. bouyouniensis strain 201601297) since all the others have the
maximum value of 100. A circle of color, according to the legend, represents the geographical origin of each of the new species described by this study. Node 1
indicates the node from which the pathogenic species most frequently involved in human disease descend.
To better define the genomic characteristics of the four subclades, we compared different
general features as shown in Fig 4. Members of the subclade P1 are often significantly more
divergent than those composing the other subclades. In general, genomes of species belonging
Fig 3. Pan-genome distribution in four categories (cloud, shell, soft core and core) for species from subclades (A) P1, (B) P2, (C) S1 and (D) S2. Analyses done
with GET_HOMOLOGUES (using 17 genomes for P1, 21 for P2, 21 for S1 and 5 for S2) showing the U-shaped distribution of pan-genome from the four groups.
However, strains of the P1 group show asymmetry by having four times more single-species genes than core genes.
to P1 tend to be larger, have a higher and scattered GC content (common to P2), harbor more
genes encoding tRNAs (common to P2), have a lower coding ratio, and a higher number of
pseudogenes. For several of the features investigated, the subclade P1 presented a scattered dis-
tribution. Other studies have already noticed the presence of subgroups of species and unusual
discrepancies in some genomic characteristics [11, 15]. To verify if this could be seen in our
analyses we separated the subclade P1 into two groups with one group being the species that
diverged after a specific node of evolution (node 1 in Fig 1 that separates species frequently
involved in infections). Indeed, the two groups differ significantly in GC %,
coding ratio and percentage of pseudogenes.
To further characterize the subclades, the CDSs of the different genomes were grouped into
functional categories to assess potential enrichments in one of them (Fig 5). A total of 16 cate-
gories involved in known functions are significantly enriched in at least one subclade
(p < 0.05). The two groups of the subclade P1 can also be significantly separated for eight out
of the 16 categories. It is interesting to note that around a third of the CDSs were not assigned
to functional categories for all subclades and that species belonging to the subclade P1 harbor
significantly more unassigned CDSs. These proteins have no similarity with functional catego-
ries including COGs with unknown function. This may represent remnants of pseudogenes,
wrong annotation, as well as proteins restricted to the Leptospira genus.
Finally, we investigated the number of lipoproteins and the distribution of known virulence
factors from the updated genus (Figs 6 and 7). Interestingly, lipoproteins, which are membrane
Fig 4. Distribution of (A) total length, (B) GC %, (C) number of tRNA genes, (D) number of CDSs, (E) coding % and (F) pseudogenes % (values in log) in the
four major subclades. The points representing the genome-specific values of the species that diverged after node 1 in Fig 1 (L. interrogans, L. kirschneri, L. noguchii, L.
santarosai, L. mayottensis, L. borgpetersenii, L. alexanderi and L. weilii) are in red. The "*" represents the level of significance between the different groups: * P ≤ 0.05, **
P ≤ 0.01, *** P ≤ 0.001, and **** P ≤ 0.0001. The level of significance between the two P1 groups separated by node 1 is represented by the same code, but for the sake
proteins, are coded by a lower number of genes in the P1 subclade, in comparison to the other
subclades; this is particularly true for the species that diverged after node 1 (most virulent
Fig 5. Distribution in functional categories of the predicted CDSs (%). The points representing the genome-specific values of the species that diverged after node
1 in Fig 1 (L. interrogans, L. kirschneri, L. noguchii, L. santarosai, L. mayottensis, L. borgpetersenii, L. alexanderi and L. weilii) are in red. The "*" represents the level of
significance between the different groups: * P ≤ 0.05, ** P ≤ 0.01, *** P ≤ 0.001, and **** P ≤ 0.0001. The level of significance between the two pathogenic groups is
represented by the same code, but for the sake of clarity the symbol is "a". Only functional categories showing significant difference are shown.
species) within the P1 subclade (L. interrogans, L. kirschneri, L. noguchii, L. santarosai, L.
mayottensis, L. borgpetersenii, L. alexanderi and L. weilii) (Fig 6). In contrast, as expected, it is
possible to observe a gradient in the repertoire of genes encoding proteins known to be
involved in virulence (Fig 7A), with the species of subclades P1 and P2 having the most genes
encoding virulence factors and those of S1 and S2 the fewest. However, it is interesting to
note that the distribution of the gene coding for KatE catalase (LA1859), which is an important
virulence factor in animal models [38], is more heterogeneous than previously suspected [10],
as it is possible to confidently find a homologous copy in genomes of some strains belonging
to subclades P2, S1 and S2. Several PFAM domains are known to be associated with proteins
involved in Leptospira virulence [10]. As expected, it has been possible to find a much larger
number of these domains in the P1 species, more particularly in the species that diverged after
node 1 with a high level of pathogenicity in humans (Fig 7B).
16S rRNA data is insufficient to robustly distinguish Leptospira species
Phylogenetic reconstruction based on 16S rRNA gene sequences is a widely used approach to
infer relationships between bacteria. Nevertheless, the high conservation of rRNA reduces its
discriminatory power and 16S rRNA sequences may not be sufficient to distinguish related
bacterial species. In the light of the robustly updated genus, we investigated the resolving
power of the 16S rRNA sequences for Leptospira. We found that i) L. johnsonii, L. saintgironsiae and L. neocaledonica, ii) L. langatensis and L. sarikeiensis, iii) L. haakeii and L.
selangorensis, iv) L. venezuelensis and L. andrefontaineae, v) L. congkakensis, L. mtsangambouensis and L. noumeaensis, vi) L. ellinghausenii and L. montravelensis, and vii) L. kemamanensis and L. bouyouniensis have 100% identical 16S rRNA sequences (Fig 8). A phylogenetic
Fig 6. Distribution of genes encoding lipoproteins. The "*" represents the level of significance between the different
groups: * P ≤ 0.05, ** P ≤ 0.01, *** P ≤ 0.001, and **** P ≤ 0.0001. The level of significance between the two
pathogenic P1 groups (before and after node 1) is represented by the same code, but for the sake of clarity the symbol is
"a". The points representing the genome-specific values of the species that diverged after node 1 in Fig 1 (L. interrogans, L. kirschneri, L. noguchii, L. santarosai, L. mayottensis, L. borgpetersenii, L. alexanderi and L. weilii) are in red.
analysis with these sequences and others available in GenBank made it possible to recover the separation
of the species into four large subclades P1, P2, S1 and S2 (Fig 8). Although less discriminating
Fig 8. Phylogenetic tree based on the 16S rRNA and ppk sequences to evaluate the diversity within the Leptospira genus. In addition to the 16S rRNA sequences
from the 64 genomes investigated in the present study, those from uncultured strains from the Peruvian Amazon (Clade C) [26] and from insectivorous bats from
eastern China [27] were added. The branches are colored according to their belonging to the four main subclades: P1 (red), P2 (purple), S1 (green) and S2 (blue),
while the strains of the “clade C” are in black. For the sake of clarity, the bootstrap values are only indicated for the nodes that correspond to the major splits. A tree
constructed with the ppk gene sequences is included in the dashed box for comparison. In this case, all bootstrap values less than 100 are indicated at the different
than the phylogenetic analysis with softcore genes, 16S rRNA analysis reveals the
potential diversity that remains to be explored in Leptospira. In this sense, sequences from bats
from China [27] clustered within P1, and the long branches in some subclades suggest
that some of these strains could correspond to unknown, potentially novel species yet to be iso-
lated. A striking result was the high diversity of sequences recovered from the environment of
the Peruvian Amazon and composing the previously named “clade C” [26]. The “clade C” is
predicted to be sister to the S clade (Fig 8).
We searched the core genome for the gene that would yield a topology
closest to that inferred with all softcore genes. A total of 553 phylogenetic trees (from the 553
genes of the core genome in single copy) were compared to the softcore tree. We found that
the ppk gene (LA3459 in L. interrogans), encoding a polyphosphate kinase of 712 aa in L. interrogans, made it possible to reproducibly obtain the tree with the lowest Robinson-Foulds distance.
The tree generated from the sequences of the ppk gene effectively makes it possible to
recover the monophyly of the four subclades (P1, P2, S1 and S2) (Fig 8).
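To make the selection criterion concrete, the sketch below computes the Robinson-Foulds distance between two unrooted trees as the number of non-trivial bipartitions found in one tree but not the other, and then picks the single-gene tree closest to a reference tree. It is a self-contained toy, not the tooling used in the study: trees are written as nested tuples, and the leaf labels and gene names (`gene_a`, `gene_b`) are hypothetical.

```python
def leaf_sets(tree, out):
    """Append the leaf set under every internal node to `out`; return this node's leaves."""
    if isinstance(tree, str):
        return frozenset([tree])
    leaves = frozenset().union(*(leaf_sets(child, out) for child in tree))
    out.append(leaves)
    return leaves


def splits(tree):
    """Non-trivial bipartitions of an unrooted tree, each stored as the side
    that does not contain a fixed reference leaf."""
    clades = []
    all_leaves = leaf_sets(tree, clades)
    ref = min(all_leaves)
    result = set()
    for clade in clades:
        side = all_leaves - clade if ref in clade else clade
        if 1 < len(side) < len(all_leaves) - 1:
            result.add(side)
    return result


def robinson_foulds(t1, t2):
    """Size of the symmetric difference between the two bipartition sets."""
    return len(splits(t1) ^ splits(t2))


# Hypothetical reference tree (standing in for the softcore tree) and gene trees.
reference = ((("A", "B"), "C"), ("D", "E"))
gene_trees = {
    "gene_a": ((("A", "B"), "C"), ("D", "E")),
    "gene_b": ((("A", "C"), "B"), ("D", "E")),
}
distances = {name: robinson_foulds(reference, t) for name, t in gene_trees.items()}
best_gene = min(distances, key=distances.get)
print(distances, best_gene)
```

For real data, trees would be parsed from Newick files and both trees must share the same set of leaves for the distance to be meaningful.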
Discussion
In this study, 90 genomes of Leptospira strains collected from soil and water samples from 18
different sites across four continents were sequenced. The genome relatedness between these
environmental isolates and representative strains of each of the known species of Leptospira allowed us to identify 30 new species. We propose to reclassify species of the Leptospira genus
into 4 subclades, called P1, P2, S1 and S2, instead of the clusters historically named as sapro-
phytes (S1 and S2), intermediates (P2) and pathogens (P1).
Traditionally, classification of bacteria is performed on the basis of their phenotypic charac-
teristics, such as Gram staining, growth requirements, and biochemical tests. Low phenotypic
diversity within the Leptospira genus precludes the use of differential growth characteristics
for differentiation of Leptospira at the species level. Only a few phenotypic tests such as viru-
lence in animal models, growth rate at 30˚C, growth at 37˚C or 14˚C and growth in the pres-
ence of the purine analogue 8-azaguanine can be used to separate the P1 (former pathogens)
from S1 (former saprophytes). Modern microbial taxonomy is primarily based on 16S rRNA
gene relationships, enabling strain identification at the level of species in most cases. The 16S
phylogenetic analysis of the present study, while making it possible to visualize the general diversity of
the genus Leptospira, shows the weakness of this gene for robust and precise phylogenetic
inference. For example, it was impossible to recover the monophyly of subclade S2 with
respect to S1. The 16S sequences are often highly conserved, and therefore often lack sufficient
variable characters to make a robust phylogenetic inference at the species level [10]. In addi-
tion, the two copies of the 16S gene in Leptospira genomes may be divergent and come from
horizontal transfers, and thus bias phylogenetic reconstruction (reviewed in [10]). The ability
to achieve robust phylogenetic classification from a single gene is, however, important in a
diagnostic context where it may be unrealistic to effectively and rapidly perform a phylogeny
based on several hundred genes. We therefore looked for a candidate gene among the core
genome. It turned out that the ppk gene, encoding a polyphosphate kinase, makes it possible to
reproducibly recover a topology very similar to that obtained from softcore genes (Fig 8). Previous
studies in other bacteria, such as "Candidatus Accumulibacter" [39] and Microbacterium [40], have shown that the ppk gene evolves rapidly, allowing phylogenetic reconstructions.
The advent of high-throughput DNA sequencing has changed our view of bacterial taxon-
omy. This is particularly true for fastidious bacteria such as Leptospira. With the increase in
available sequences, genome-wide comparisons can be highly discriminative allowing precise
taxonomic classification. Among the various in silico genome-to-genome comparison methods
studied, the ANI, AAI, and POCP values were shown to yield good correlation with phylogenetic
studies and the traditional DNA-DNA hybridization values [35, 37, 41, 42]. Where possible,
the genome of a suspected new species should be sequenced and its ANI calculated against our curated
species database, which is publicly available (http://fveyrier.profs.inrs.ca/Download/Dataset.zip). This
method clearly avoids misidentification of species (as demonstrated in this study by using the
NCBI public database) and can enable the identification of new species (species delineated at an ANI cut-off of 95%). The
use of ANI values can also delineate some clear subgroups within the four subclades. Interest-
ingly enough, the subclades P1 and P2 seem to be constituted by multiple small subgroups,
representing a high level of diversity. As a note, the segmentation of the subclade P1 into groups
has already been described [9, 11, 15, 43]. Also, the species forming the new subclade S2 are
clearly among the most diverse in ANI values, consistent with the long branches in the phylo-
genetic tree.
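The identification workflow suggested in this paragraph (compare a newly sequenced genome against one reference genome per species and apply the 95% ANI cut-off) could look like the following minimal sketch. The ANI values are invented and assumed to have been computed beforehand with an ANI tool; the function name and the species labels are hypothetical.

```python
def assign_species(ani_to_references, cutoff=95.0):
    """Return (species, ANI) for the best reference hit, or (None, ANI) when the
    best hit falls below the cutoff, i.e. a candidate new species."""
    best_species, best_ani = max(ani_to_references.items(), key=lambda kv: kv[1])
    if best_ani >= cutoff:
        return best_species, best_ani
    return None, best_ani


# Hypothetical ANI values (%) of two query genomes against reference species.
query_known = {"species_1": 97.8, "species_2": 82.4, "species_3": 79.1}
query_novel = {"species_1": 88.3, "species_2": 91.6, "species_3": 78.0}

known_call = assign_species(query_known)
novel_call = assign_species(query_novel)
print(known_call, novel_call)
```

A `None` result only flags the genome as a candidate; confirming a new species would still require the additional analyses described in this study.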
Our study identified a total of 64 species with four new species (L. gomenensis, L. putramalaysiae, L. dzianiensis, and L. tipperaryensis) in the P1 subclade. We also identified ten new species
in subclade P2 (old “intermediate” group). Finally, sixteen new Leptospira species isolated
from the natural environment belonged to subclades S1 and S2. We showed that species of the
new subclade S2 possess phenotypic characteristics of saprophytes S1, which is consistent with
their phylogenetic position. Leptospira species are considered ubiquitous, as they are found in
a wide variety of environments including surface water and soil, and they are found in mammals
but also in birds, amphibians, and reptiles [2, 44, 45]. Recent isolation of 12 novel species from
tropical soils in areas of endemic leptospirosis in New Caledonia suggests that soils are an
important niche for the genus [14, 15]. Our study, where we collected soil and water samples
from a wide range of ecosystem types (tropical forests, temperate and Mediterranean freshwa-
ters) worldwide, further supports that this genus is highly diverse and Leptospira spp. are
found in abundance in both soil and water throughout the different continents. Among the
sequenced environmental isolates, several saprophytic species were found on different continents.
For example, L. meyeri was isolated in France, New Caledonia, and Malaysia; L. bandrabouensis in Mayotte and New Caledonia (S3 Table). The mechanisms of dispersion of these
non-pathogenic species with no known animal reservoirs remain to be determined, especially in
the context of tropical islands.
The evolution of the Leptospira genus is still puzzling. The current hypothesis is that the Leptospira genus is broadly found in soil and water and that symbiosis of leptospires, including commensals
or pathogens, with eukaryotes emerged from free-living ancestral species in a stepwise
and independent manner, as suggested by different accessory genes [15]. The genomic analyses
presented in the present study allow a better understanding of the evolution of the species
forming the different clades and subclades. An open pan-genome is typical of bacteria living
sympatrically with other species and with a high rate of horizontal gene transfer [46], a feature
of soil microbiota. It was already known that the genus has an open pan-genome [9]. With
more species, we were able to refine this pan-genome in the different subclades and demonstrated
that the P1 subclade has the most open pan-genome. This result is corroborated by the
fact that the pan-genome distribution of species belonging to the P1 subclade is asymmetrically U-shaped,
with many genes specifically found in single species. This suggests a massive reworking
of the cellular functions in this subclade by multiple horizontal gene transfers that could
have allowed a change in the ecological niche occupied, from that of a free-living to that of a non-obligatory
symbiotic (commensal or pathogen) organism. This correlates with a generally larger genome
of species from the P1 subclade. Although the reason is not yet completely clear, it is possible
to think that the large range of potential hosts that can be infected by these species requires
some specificity, and that horizontal gene transfers can be one of the methods allowing a fast
adaptation to these hosts. More interestingly, previous studies have defined groups within the
pathogens or subclade P1 on the basis of virulence (outcome in patients and/or virulence in
the hamster model) and phylogenomic analysis [11, 15]. Thus subgroups containing the species
L. interrogans, L. kirschneri and L. noguchii on one hand and L. santarosai, L. mayottensis, L. borgpetersenii, L. alexanderi and L. weilii on the other hand are most often associated with
severe infections in humans. These species diverged after a specific node of evolution (node 1
in Fig 1). The other species in subclade P1 were isolated from the environment with the exception
of L. alstonii and L. tipperaryensis, which were isolated from amphibians in China [47] and
shrews in Ireland [33, 34], respectively. Although these other species are in the P1 subclade,
they failed to induce disease or colonization in animal models like other Leptospira species
tested in the P2, S1 and S2 subclades [15]. It is striking to note that species that diverged after
node 1 harbor a lower percentage of coding sequences, a very high percentage of pseudogenes
(as compared to other species) and an enrichment of genes in the category of replication,
recombination and repair that includes transposase and integrase. It has been shown that
mobile elements in L. borgpetersenii are likely involved in the genomic decay of the pathogen
through recombination events and inactivation of genes [48]. The same study postulated that
these IS-mediated events increased the dependence of L. borgpetersenii on its hosts as several
genes involved in tolerance to nutrient deprivation were altered. In the present study, we also
found that species that diverged after node 1 tend to be depleted in several functional categories
compared to the other species. The mechanisms of such decay remain complicated to
study given that insertion sequences are one of the main genomic determinants that cause
contig breakages during the de novo assembly process [49]. Nevertheless, this phenomenon is
often associated with ecological specialization and host dependence [50], which could suggest
that after an ongoing ecological niche switch from a free-living to a symbiotic lifestyle (concomitant
with gene expansion), this group of bacteria is now stabilizing and restricting its lifestyle to
specific niches.
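The U-shaped pan-genome distribution discussed in this paragraph can be illustrated with a short sketch that counts, for each gene cluster, the number of genomes in which it occurs and then bins clusters into core, soft core, shell and cloud. The thresholds used here (soft core at 95% of genomes or more, cloud at two genomes or fewer) are assumptions chosen for illustration and may differ from the definitions applied by GET_HOMOLOGUES; the cluster and genome names are hypothetical.

```python
from collections import Counter


def occupancy_histogram(clusters):
    """Map 'number of genomes containing a cluster' to 'number of such clusters'."""
    return Counter(len(set(genomes)) for genomes in clusters.values())


def classify(n_present, n_genomes, softcore_fraction=0.95, cloud_max=2):
    """Assign one cluster to a pan-genome compartment (assumed thresholds)."""
    if n_present == n_genomes:
        return "core"
    if n_present >= softcore_fraction * n_genomes:
        return "soft core"
    if n_present <= cloud_max:
        return "cloud"
    return "shell"


# Hypothetical presence lists: gene cluster -> genomes in which it was found.
toy_clusters = {
    "c1": ["g1", "g2", "g3", "g4"],
    "c2": ["g1"],
    "c3": ["g2"],
    "c4": ["g1", "g2", "g3"],
    "c5": ["g3"],
}
histogram = occupancy_histogram(toy_clusters)
labels = {name: classify(len(set(g)), 4) for name, g in toy_clusters.items()}
print(histogram, labels)
```

A U-shaped histogram has high counts at both ends (clusters found in one genome and clusters found in all genomes); the asymmetry described for P1 corresponds to a left-hand peak that is much taller than the right-hand one.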
In conclusion, the present study unveils the diversity of the Leptospira genus and the evolution
of species from this genus. In the future, understanding how speciation occurs in the environment
should increase our knowledge of the evolution of pathogens and acquisition of
virulence factors. The increasing availability of Leptospira genomes that are representative of
the diversity within the genus has created new opportunities for reconstructing bacterial evolution.
Nevertheless, the description of several potentially infectious Leptospira species raises
questions about their implications for public health, and diagnostic tools should be updated to
take into account the new species described in the present study in order to evaluate their association
with infection of both animals and humans and their role in clinical disease.
Supporting information
S1 File. ANI analyses with all the genomes available in GenBank and the genomes of our
dataset.
(XLSX)
S1 Table. Information on the 124 genomes investigated in this study (including accession
numbers).
(XLSX)
S2 Table. ANI analysis with the 124 genomes of the present study.
(XLSX)
S3 Table. Leptospira species isolated in at least two different countries.
S4 Table. Phenotypic analysis of representative species of subclade S2.
(XLSX)
S1 Fig. Phylogenetic tree based on the sequences of 1371 genes inferred as orthologous.
The matrix represents the calculated AAI values for all the genomic sequences. The branches
are colored according to their belonging to the four main subclades: P1 (red), P2 (purple), S1
(green) and S2 (blue). The bootstrap value is indicated for a single node (that corresponding to
the separation between L. biflexa strain Patoc 1 and L. bouyouniensis strain 201601297) since
all the others have the maximum value of 100. A circle of color, according to the legend, repre-
sents the geographical origin of each of the new species described by this study.
(TIF)
S2 Fig. Phylogenetic tree based on the sequences of 1371 genes inferred as orthologous.
The matrix represents the calculated POCP values for all the genomic sequences. The branches
are colored according to their belonging to the four main subclades: P1 (red), P2 (purple), S1
(green) and S2 (blue). The bootstrap value is indicated for a single node (that corresponding to
the separation between L. biflexa strain Patoc 1 and L. bouyouniensis strain 201601297) since
all the others have the maximum value of 100. A circle of color, according to the legend, repre-
sents the geographical origin of each of the new species described by this study.
(TIF)
S3 Fig. Transmission electron microscopy of representative species of subclades P1 (L. interrogans), P2 (L. licerasiae), S1 (L. biflexa) and S2 (L. kobayashii, L. ognonensis, and L. ilyithenensis). Exponential phase cultures of L. kobayashii strain E30T, L. ilyithenensis strain
201400974T, L. ognonensis strain 201702476T, L. biflexa strain Patoc1, L. licerasiae strain
Var010T and L. interrogans strain L495 were allowed to adsorb onto a carbon-coated copper
grid. Samples were fixed with 2% glutaraldehyde, washed in distilled water and negatively
stained with 4% uranyl acetate. After drying, grids were observed under a FEI Tecnai T12
Transmission Electron Microscope with an acceleration voltage of 120 kV. Electron micro-
graphs were taken at a magnification of 2,900 on ten isolated representative cells of one strain
of each described species. Measurements were done using ImageJ software.
(TIFF)
Acknowledgments
We thank Vincent Enouf and the team of core facility P2M (Institut Pasteur, Mutualized Plat-
form for Microbiology) for genomic sequencing. We also thank Nathalie Armatys, Celine Lor-
ioux, Farida Zinini, Dominique Girault and Marie-Estelle Soupe-Gilbert for technical
assistance with the cultures of Leptospira, Sabine Henry, Geoffroy Liegeon, Marie-Estelle
Soupe-Gilbert and Emilie Bierque for environmental sampling, Robert Gaultney for animal
experiments, and Chantal Bizet from the Collection of Institut Pasteur (CIP) for providing reference
strains. We are also grateful to Prof. Aharon Oren for revising the names of novel Leptospira species and Jarlath Nally for the name of L. tipperaryensis.
Author Contributions
Conceptualization: Pascale Bourhy, Frederic J. Veyrier, Mathieu Picardeau.
Data curation: Antony T. Vincent, Olivier Schiettekatte, Mathieu Picardeau.
Formal analysis: Antony T. Vincent, Olivier Schiettekatte, Eve Bernet, Toshiyuki Masuzawa,
Pascale Bourhy, Frederic J. Veyrier, Mathieu Picardeau.