Complete genome sequence of the entomopathogenic and metabolically versatile soil bacterium Pseudomonas entomophila
Nicolas Vodovar1, David Vallenet2, Stephane Cruveiller2, Zoe Rouy2, Valerie Barbe2, Carlos Acosta1, Laurence Cattolico2, Claire Jubin2, Aurelie Lajus2, Beatrice Segurens2, Benoît Vacherie2, Patrick Wincker2, Jean Weissenbach2, Bruno Lemaitre1, Claudine Medigue2 & Frederic Boccard1
Pseudomonas entomophila is an entomopathogenic bacterium that, upon ingestion, kills Drosophila melanogaster as well as
insects from different orders. The complete sequence of the 5.9-Mb genome was determined and compared to the sequenced
genomes of four Pseudomonas species. P. entomophila possesses most of the catabolic genes of the closely related strain
P. putida KT2440, revealing its metabolically versatile properties and its soil lifestyle. Several features that probably contribute
to its entomopathogenic properties were disclosed. Unexpectedly for an animal pathogen, P. entomophila is devoid of a type III
secretion system and associated toxins but rather relies on a number of potential virulence factors such as insecticidal toxins,
proteases, putative hemolysins, hydrogen cyanide and novel secondary metabolites to infect and kill insects. Genome-wide
random mutagenesis revealed the major role of the two-component system GacS/GacA that regulates most of the potential
virulence factors identified.
Pseudomonas spp. are ubiquitous Gram-negative bacteria that colonize and survive in numerous ecological niches including soil, water and plant surfaces. This versatility is reflected by the sizes of their genomes, which contain large sets of genes involved in carbon source utilization and adaptation. In 2001, we isolated a bacterial strain closely related to the saprophytic soil bacterium Pseudomonas putida, Pseudomonas entomophila, which triggers a systemic immune response in D. melanogaster after ingestion1. P. entomophila is highly pathogenic for both D. melanogaster larvae and adults. Its persistence in larvae leads to a massive destruction of gut cells1.
Entomopathogenic bacteria such as the Gram-negative bacteria Photorhabdus luminescens, Xenorhabdus nematophilus, Yersinia pestis, Serratia marcescens and Serratia entomophila and the Gram-positive bacterium Bacillus thuringiensis have developed different strategies to interact with and kill insects2. Some gene products derived from these bacteria, as well as the bacteria themselves, have been used to generate biopesticides3. The ability of P. entomophila to orally infect and kill larvae of insect species belonging to different orders makes it a promising model for the study of host-pathogen interactions and for the development of biocontrol agents against insect pests. To unravel features contributing to P. entomophila’s entomopathogenic properties, we have determined its complete genome sequence and performed a genome-wide screen for mutants affected in their ability to trigger an immune response and lethality in D. melanogaster.
RESULTS
Genome features and comparative genomics
The P. entomophila genome is composed of a single circular chromosome of 5,888,780 base pairs (Fig. 1). Among 5,169 coding sequences identified, 3,466 genes (67%) have been assigned a predicted function (Table 1). The P. entomophila genome is smaller than the six other Pseudomonas genomes that have been published (Table 1): the human opportunistic pathogen P. aeruginosa PAO1 (ref. 4), the three P. syringae pathovars5–7, the plant commensal P. fluorescens Pf-5 (ref. 8) and the saprophytic soil bacterium P. putida KT2440 (ref. 9).
GC skew analysis and the predicted location of the origin of replication oriC near dnaA and of the chromosome dimer resolution dif site in PSEEN2780 revealed the presence of two replichores of similar size, contrary to the unbalanced replichores found in the genomes of P. putida KT2440 (ref. 10) and P. aeruginosa PAO1 (ref. 4) (see Supplementary Fig. 1 online). BLAST comparisons of genomes from the five Pseudomonas representative species identified a set of 2,065 genes that constitutes the Pseudomonas core genome. Based on this analysis, we identified 1,002 genes unique to the P. entomophila genome. We found that, consistent with the close relatedness between P. entomophila and P. putida1, 70.2% of P. entomophila genes (3,630) have orthologs in the P. putida genome, of which more than 96% are found in synteny (see Supplementary Table 1 online). The smaller size of the P. entomophila genome compared to that of other Pseudomonas does not seem to originate from reductive evolution.
©2006 Nature Publishing Group http://www.nature.com/naturebiotechnology
Received 30 January; accepted 7 April; published online 14 May 2006; doi:10.1038/nbt1212
1Centre de Genetique Moleculaire, Centre National de la Recherche Scientifique, 91198 Gif-sur-Yvette, France. 2Genoscope, Centre National de Sequencage and CNRS-UMR8030, 2 rue Gaston Cremieux, 91057 Evry Cedex, France. Correspondence should be addressed to F.B. ([email protected]).
NATURE BIOTECHNOLOGY ADVANCE ONLINE PUBLICATION 1
A R T I C L E S
Indeed, the 50 genes of P. entomophila present in other Pseudomonas but absent from P. putida belong to functional classes as diverse as the 34 genes of P. putida present in other Pseudomonas but absent from P. entomophila. Furthermore, comparison of gene contents in P. entomophila and P. putida indicates that the higher number of species-specific genes in P. putida (1,774 versus 1,539) largely results from the presence of a higher number of paralogous genes (Fig. 2 and Supplementary Table 2 online). Comparison of the chromosome structures of P. entomophila and P. putida KT2440 and scatter plot analysis of syntenic regions of the two strains revealed frequent genetic inversions that reverse the genomic sequence symmetrically across oriC as observed in other bacterial genera11 (Fig. 2 and Supplementary Fig. 2 online). The same rearrangement profile was observed when comparing the P. entomophila genome with those of other Pseudomonas spp., even though the levels of orthology and of synteny were lower (see Supplementary Table 1 and Supplementary Fig. 2 online). A search for repetitive extragenic palindromic sequences (REPs) identified 943 REPs similar to those found in the genomes of P. putida KT2440 (ref. 12) and P. fluorescens Pf-5 (ref. 8). The genome of P. entomophila has been remodeled by genetic mobile elements and bacteriophage insertions considerably less than the genomes of other environmental pseudomonads such as P. putida KT2440 and P. syringae pv. tomato DC3000 (Fig. 1). Particularly notable are three clustered prophages related to FluMu phage, a pyocin-like phage and a lambdoid phage; they are inserted between recA and mutS, as observed for FluMu phage in the P. fluorescens Pf-5 genome. Also of particular interest are two putative prophages inserted in genes encoding 4.5S RNA and tmRNA, respectively. The genome of P. entomophila contains only nine genes encoding transposase-like proteins including three that are remnant or inactive. Unlike the genomes of P. putida KT2440 and P. syringae pv. tomato DC3000, the genome of P. entomophila is devoid of type II introns.
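The effect of an inversion that is symmetric across oriC can be illustrated with a minimal sketch. It assumes, purely for illustration, that oriC lies at coordinate 0 of the circular chromosome; the function names and the 500-kb half-width are hypothetical and are not taken from the analysis reported here. A gene inside such an inverted segment keeps its distance from the origin but moves to the other replichore, which is the kind of displacement that gives an X-shaped pattern in a scatter plot of syntenic regions.

```python
GENOME_LEN = 5_888_780  # chromosome length in bp, as reported in the text


def invert_position(pos, start, end, genome_len=GENOME_LEN):
    """Map a coordinate through an inversion of the clockwise arc start -> end.

    The arc may wrap through coordinate 0. Positions outside the arc are
    returned unchanged.
    """
    seg_len = (end - start) % genome_len
    offset = (pos - start) % genome_len
    if offset > seg_len:
        return pos % genome_len
    return (end - offset) % genome_len


def distance_from_origin(pos, genome_len=GENOME_LEN):
    """Shortest distance (bp) from pos to coordinate 0 on the circle."""
    pos %= genome_len
    return min(pos, genome_len - pos)


if __name__ == "__main__":
    half_width = 500_000  # hypothetical: inversion spans 500 kb on each side of oriC
    start, end = GENOME_LEN - half_width, half_width
    for gene_pos in (100_000, 400_000, 2_000_000):
        new_pos = invert_position(gene_pos, start, end)
        print(gene_pos, "->", new_pos, "| distance to oriC:", distance_from_origin(new_pos))
```

Under these assumptions a position x inside the segment maps to genome length minus x, so its distance to the origin is unchanged, whereas positions outside the segment are untouched.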
Toxins against insects
We used several criteria to uncover genes that may contribute to the entomopathogenic properties of P. entomophila: specificity to the P. entomophila genome, localization within genomic islands that suggest recent lateral acquisitions (based on break of the synteny, GC content and absence of REPs) and similarity to genes associated with virulence in other systems (Table 2).
Particularly striking are three genes absent from other Pseudomonas genomes that encode proteins related to insecticidal toxin complexes that have been found only in entomopathogenic enterobacteria such as Photorhabdus luminescens, Serratia entomophila, Xenorhabdus nematophilus or in Yersinia spp.13,14. Three basic types of genetic
Figure 1 Circular representation of the P. entomophila genome. The outer
scale indicates coordinates in base pairs (bp). Circles 1 and 2 (from outside
to inside) show predicted coding regions transcribed clockwise and
counterclockwise, respectively. Coding sequences are color coded by role
categories: salmon, amino acid biosynthesis; light blue, biosynthesis of
cofactors, prosthetic groups and carriers; light green, cell envelope; red,
cellular processes; brown, central intermediary metabolism; yellow, DNA
metabolism; green, energy metabolism; purple, fatty acid and phospholipid
metabolism; violet, mobile and extrachromosomal element functions; pink,
protein synthesis and fate; orange, purines, pyrimidines, nucleosides and
nucleotides; navy blue, regulatory functions and signal transduction; lime
green, secondary metabolite biosynthesis; gray, transcription; teal, transport
and binding proteins; black, unknown function and hypothetical proteins.
Circle 3 shows rRNA genes in salmon, tRNA genes in green and
miscellaneous RNA genes in blue. Circle 4 shows transposase genes,
putative prophages and gene clusters encoding secondary metabolites coded
by colored symbols as follows: green arrowheads, transposases; gray,
putative prophages; red, pyoverdine synthesis; light blue, cluster involved in
lipopeptide II biosynthesis; violet, acinetobactin-like siderophore synthesis;
light green, cluster involved in lipopeptide III biosynthesis; navy blue,
cluster and isolated genes involved in lipopeptide I biosynthesis; pink,
hydrogen cyanide production; brown, polyketide synthesis. Circle 5 shows
the distribution of REPs. These repeats are scattered all over the genome
and were found either as single elements, in paired elements or in clusters
of up to six elements in alternating orientation. Circle 6 shows G+C in
relation to the mean G+C in a 1,000-bp window. Circle 7 shows GC skew in
a 1,000-bp window.
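The quantities plotted in circles 6 and 7 can be approximated with a short sketch. It assumes non-overlapping windows and the usual definition of GC skew, (G − C)/(G + C); the function names are illustrative, and the cumulative-skew helper reflects a common convention for locating candidate origin and terminus regions rather than a procedure stated in the caption.

```python
def window_stats(seq, window=1000):
    """Return (gc_fraction, gc_skew) for each non-overlapping window."""
    stats = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window].upper()
        g = chunk.count("G")
        c = chunk.count("C")
        gc_fraction = (g + c) / window
        gc_skew = (g - c) / (g + c) if (g + c) else 0.0
        stats.append((gc_fraction, gc_skew))
    return stats


def gc_deviation(stats):
    """G+C of each window relative to the mean G+C over all windows."""
    mean_gc = sum(gc for gc, _ in stats) / len(stats)
    return [gc - mean_gc for gc, _ in stats]


def cumulative_skew_extrema(stats):
    """Window indices at which the cumulative GC skew is minimal and maximal."""
    total = 0.0
    cumulative = []
    for _, skew in stats:
        total += skew
        cumulative.append(total)
    return cumulative.index(min(cumulative)), cumulative.index(max(cumulative))


if __name__ == "__main__":
    # Toy sequence: a G-rich half followed by a C-rich half, 4-bp windows.
    toy = "GGGC" * 3 + "GCCC" * 3
    toy_stats = window_stats(toy, window=4)
    print(gc_deviation(toy_stats))
    print(cumulative_skew_extrema(toy_stats))
```

In many bacterial chromosomes the skew changes sign near the origin and the terminus of replication, which is why the extrema of the cumulative skew are often used as a first guess for the replichore boundaries.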
Table 1 General features of genomes of representative Pseudomonas species
General features                        Pe      Pp^a    Pf^a    Pa^a    Pst^a
Size (Mb)                               5.9     6.2     7.1     6.3     6.4
GC (%)                                  64.2    61.6    63.3    66.6    58.4
No. of CDS                              5169    5420    6144    5570    5615
Coding (%)                              89.1    87.7    88.8    89      86.8
rRNA operons                            7       7       5       4       5
tRNA                                    78      74      71      63      63
Proteins with predicted function (%)    67.1    65.8    62.2    54.2    61.0
Proteins without predicted function
  Conserved hypothetical proteins (%)   25.3    19.1    32.5    13.8    17.0
  Hypothetical proteins (%)             7.5     15.1    5.3     31.9    22.0
^a The distributions of ORFs for the published chromosomes are derived from the original annotation. These numbers, particularly those of hypothetical and conserved hypothetical proteins, may be different from numbers obtained with updated BLAST searches and annotations. Features of the genomes of P. syringae pv. syringae B728a (6.1 Mb) and P. syringae pv. phaseolicola 1448A (5.9 Mb) are not indicated. CDS, coding sequences; Pe, P. entomophila; Pa, Pseudomonas aeruginosa; Pp, Pseudomonas putida; Pf, Pseudomonas fluorescens Pf-5; Pst, Pseudomonas syringae pv. tomato DC3000.
elements encode insecticidal toxin complexes: tcdA-, tcdB- and tccC-like genes. The P. entomophila genome encodes three TccC-type insecticidal toxins (PSEEN2485, PSEEN2697, PSEEN2788) (see Supplementary Fig. 3 online). In addition to these three insecticidal toxins, the P. entomophila genome, like that of P. syringae, encodes proteins more distantly related to TccC-type toxins (PSEEN701 and PSEEN702) and to TcdB-type toxins (PSEEN1172). The three P. entomophila insecticidal toxins likely play a major role in the pathogenicity of P. entomophila as TccC and TcdB proteins have been shown to have entomocidal activity15,16, even though the molecular mechanisms remain to be characterized. These findings highlight the efficient spreading of toxin-complex gene homologs in insect-interacting soil bacteria belonging to different genera.
Bacterial hemolysins are exotoxins that attack blood cell membranes and cause cell rupture by poorly defined mechanisms17. Contrary to the other Pseudomonas tested, P. entomophila secretes a strong diffusible hemolytic activity (see Supplementary Fig. 4 online) that may also be involved in pathogenicity against D. melanogaster. We identified three genes unique to P. entomophila that may be responsible for this activity (Table 2). The gene encoding PSEEN3925, a putative repeats-in-toxin (RTX) protein, is clustered with genes encoding a type I secretion system. PSEEN0968 and PSEEN3843 are proteins related to outer membrane autotransporters that have been associated with virulence in other bacteria. A number of lipases have also been shown to confer hemolytic activity. The P. entomophila genome encodes four lipases that are absent from P. putida KT2440 and that may contribute to its hemolytic activity (PSEEN709, PSEEN1065, PSEEN2195, PSEEN3432). Interestingly, the gene encoding a lysophospholipase (PSEEN709) is found in a genomic islet associated with two genes encoding proteins related to insecticidal toxins.
Proteases constitute another important group of extracellular, biologically active substances that are thought to contribute to the virulence of bacterial species. P. entomophila encodes three serine proteases (PSEEN3027, PSEEN3028, PSEEN4433) and an alkaline protease (PSEEN1550) absent from P. putida KT2440. These four genes are located at synteny break points between the genomes of P. entomophila and other Pseudomonas spp. PSEEN1550 is the homolog of the alkaline protease AprA, which has been shown to be involved in various virulence processes among different species18. AprA likely plays a key role in virulence because pathogenicity is affected in mutants defective in PrtR, the predicted transcriptional regulator of aprA (see below).
Pathogenic bacteria rely on a variety of cell surface–associated virulence factors that allow adhesion to the host surface and promote effective colonization. Filamentous hemagglutinin-like adhesins are broadly important virulence factors in both plant and animal pathogens. The genome of P. entomophila encodes three proteins (PSEEN0141, PSEEN2177, PSEEN3946) that are predicted to be involved in adhesion and cluster with genes encoding type I or two-partner secretion system proteins (Table 2). We also noticed the presence of two putative autotransporter proteins with a pertactin-type adhesion domain.
Toxins against competitors
In addition to the putative toxins described above that may be crucial for its entomopathogenic properties, P. entomophila carries a number of genes specifying diverse traits that may be required not only for interaction with insects but also for its lifestyle in soil, aquatic or rhizosphere environments (see Supplementary Fig. 5 online).
Fluorescent pseudomonads are characterized by the production of pyoverdines, a diverse class of siderophores containing a chromophore linked to a small peptide of varying length and composition synthesized by nonribosomal peptide synthases19. In P. entomophila, the two gene clusters that encode proteins required for pyoverdine biosynthesis and uptake (PSEEN1813-PSEEN1815 and PSEEN3224-3234) present a general organization similar to that found in other fluorescent pseudomonads20. We also identified a gene cluster responsible
Figure 2 Comparison of the P. entomophila and P. putida genomes.
(a) Regions of significant sequence identity between the nucleotide
sequence of P. entomophila (top) and P. putida KT2440 (bottom). Colinear
regions are connected by red lines and inverted regions by blue lines.
The display was generated using Artemis Comparison Tool (freely available
at http://www.sanger.ac.uk/Software/ACT/). (b) Specific gene content
comparison of the genomes of P. entomophila and P. putida KT2440.
Specific genes of P. entomophila (Pe) and of P. putida KT2440 (Pp) with no
ortholog in the other species are indicated in blue and green respectively,
and are classified according to role categories as described in Figure 1.
Two genes were considered as orthologs when their products share more
than 60% identity over more than 80% of their length. Duplicated genes
indicated by light colors were detected by using a constraint of 35% identity
over more than 80% of the length of the protein. Aa, amino acid
biosynthesis; Bc, biosynthesis of cofactors, prosthetic groups and carriers;
Ce, cell envelope; Cp, cellular processes; Ci, central intermediary
metabolism; Dm, DNA metabolism; Em, energy metabolism; Fam, fatty acid
and phospholipid metabolism; Me, mobile and extrachromosomal element
functions; P, protein synthesis and fate; Pp, purines, pyrimidines,
nucleosides and nucleotides; Rf, regulatory functions and signal transduction; Sm,
secondary metabolite biosynthesis; T, transcription; Tb, transport and
binding proteins; Uf, unknown function and hypothetical proteins.
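The two thresholds given in this legend can be written as a simple filter. The sketch below is an interpretation rather than the authors' pipeline: it assumes that "80% of their length" must hold for both proteins of a pair, treats both cut-offs as strict inequalities, and uses hypothetical names (Alignment, is_ortholog, is_duplicate). The 0.90 query coverage in the last example is an invented value; the 54% identity and 67% coverage are the figures reported for PSEEN0141 and PP0168 in Table 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    identity_pct: float  # percent identity over the aligned region
    query_cov: float     # fraction of the query protein covered (0 to 1)
    subject_cov: float   # fraction of the subject protein covered (0 to 1)


def meets(aln, min_identity_pct, min_cov=0.80):
    """True if identity and coverage of both proteins exceed the cut-offs."""
    return (aln.identity_pct > min_identity_pct
            and min(aln.query_cov, aln.subject_cov) > min_cov)


def is_ortholog(aln):
    """Between-genome criterion: >60% identity over >80% of the length."""
    return meets(aln, 60.0)


def is_duplicate(aln):
    """Within-genome criterion: >35% identity over >80% of the length."""
    return meets(aln, 35.0)


if __name__ == "__main__":
    print(is_ortholog(Alignment(72.0, 0.95, 0.90)))
    print(is_duplicate(Alignment(40.0, 0.90, 0.85)))
    # PSEEN0141 versus PP0168: 54% identity, 67% of PP0168 aligned.
    print(is_ortholog(Alignment(54.0, 0.90, 0.67)))
```

With these cut-offs the PSEEN0141/PP0168 pair fails the ortholog test, in line with PSEEN0141 being listed as specific to P. entomophila.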
Table 2 Gene/gene products potentially involved in P. entomophila-D. melanogaster interaction
Gene/gene product^a,b,c     Function                                                    Ps.^d
Adhesion
PSEEN0141^a                 Putative surface adhesion protein                           54% PP0168^e
PSEEN2177^a                 Putative filamentous hemagglutinin                          51% PFL4237
PSEEN3946^a                 Putative filamentous hemagglutinin                          41% PA0041
PSEEN3161                   Putative autotransporter, pertactin-like protein            63% PP3069
PSEEN4310^a                 Putative autotransporter, pertactin-like protein            42% PSPTO2225
Proteases
aprA^c                      Alkaline metalloprotease                                    72% PSPTO3332
PSEEN3027^b                 Putative autotransporter, SSP-h1 serine protease            68% PSPTO1650
PSEEN3028^b                 Putative autotransporter, serine protease                   64% PA3535
PSEEN4433^a                 Putative subtilisin-like serine protease                    Absent
Lipases
PSEEN0709^b                 Lysophospholipase                                           76% PA2540
PSEEN1065^b                 Phospholipase C                                             62% PFL0888
PSEEN2195                   Triacylglycerol lipase                                      64% Pf B52 (P21773)^g
PSEEN3432^a,b               Lipase class3                                               48% Pfo0149
Toxins
hcnABC^c                    Hydrogen cyanide production                                 76% PA2193 (hcnA)
PSEEN0132/3332/3042-5^a,b   Cluster involved in lipopeptide I biosynthesis              See text^h
PSEEN2138-56^a,b            Cluster involved in lipopeptide II biosynthesis             Absent
PSEEN2716-20^b              Cluster involved in lipopeptide III biosynthesis            77% Pfo2266 (2717)
PSEEN5524-36^a,b            Cluster involved in polyketide biosynthesis                 Absent
PSEEN0701^a,b               Protein related to TccC-type insecticidal toxin             Absent^f
PSEEN0702^a,b               Protein related to TccC-type insecticidal toxin             Absent^f
PSEEN1172^a                 Protein related to TcdB-type insecticidal toxin             Absent^f
PSEEN2485^a,c               TccC-type insecticidal toxin                                Absent
PSEEN2697^a,b,c             TccC-type insecticidal toxin                                Absent
PSEEN2788^a,b,c             TccC-type insecticidal toxin                                Absent
PSEEN3326^a,b               Putative toxin (cytolethal distending toxin B domains)      Absent
PSEEN3925-9^a               Putative RTX toxin and type I secretion system              Absent
Miscellaneous
PSEEN0968^a,b               Putative autotransporter with unknown passenger domain      Absent
PSEEN3843^a                 Putative autotransporter with unknown passenger domain      53% PSPTO0714
Noninfectious and nonlethal Tn5 derivatives^i
gacS(5)                     Sensor histidine kinase                                     88% PP1650
gacA(2)                     Response regulator, LuxR family                             98% PP4099
bioC(1)                     Biotin biosynthesis                                         86% PP0365
PSEEN5207(1)-8(2)           Putative amino acid ABC transporter                         97%/82% PP0283-2
PSEEN4425(2)                CHP^j, CAIB/BAIF family                                     62% PFL4631
Infectious and nonlethal Tn5 derivatives^i
prtR(3)                     Transmembrane transcriptional regulator                     74% PP2889
algR(2)                     Transcriptional regulator involved in alginate production   91% PP0185
PSEEN0132(3)-3(1)           NRPS loading protein, CHP (operonic)                        59%/75% PSPTO5546-7
PSEEN0389(1)                Putative chorismate mutase, operonic with glnA, ntrBC       44% PFL0385
^a Gene products specific to P. entomophila and not found in other Pseudomonas species (constraint of 60% identity over more than 80% of the protein length).
^b Unusual GC content (differing by more than 1 s.d. from the average GC) likely due to recent lateral transfer.
^c Gene products or predicted domains associated with virulence in other systems.
^d Sequence identity between the protein encoded by P. entomophila and the best BLAST hit among proteins from other Pseudomonas. PP, P. putida KT2440; PA, P. aeruginosa PAO1; PSPTO, P. syringae pv. tomato DC3000; PFL, P. fluorescens Pf-5; Pfo, P. fluorescens PfO-1; Pf, P. fluorescens.
^e PSEEN0141 and PP0168 are aligned only on 67% of PP0168 length.
^f <40% identity.
^g TrEMBL accession number.
^h This cluster and similarity with that of P. fluorescens Pf-5 are discussed in Supplementary Figure 5.
^i Numerals in parentheses indicate the number of independent Tn5 insertions.
^j Conserved hypothetical protein.
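The GC criterion used in the table footnotes (GC content differing by more than 1 s.d. from the average) can be sketched as follows. The sketch assumes that the average and the standard deviation are taken over the per-gene GC percentages supplied and that the population standard deviation is meant; neither detail is stated in the footnote. The gene names and values in the example are invented.

```python
from statistics import mean, pstdev


def gc_percent(seq):
    """GC content of a nucleotide sequence, in percent."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def unusual_gc(gene_gc, n_sd=1.0):
    """Names of genes whose GC% lies more than n_sd s.d. from the mean."""
    values = list(gene_gc.values())
    mu = mean(values)
    sd = pstdev(values)  # assumption: population s.d. over the supplied genes
    return [name for name, gc in gene_gc.items() if abs(gc - mu) > n_sd * sd]


if __name__ == "__main__":
    # Invented GC percentages for five hypothetical genes.
    toy = {"geneA": 64.0, "geneB": 65.0, "geneC": 63.0, "geneD": 64.0, "geneE": 52.0}
    print(unusual_gc(toy))
```

A 1-s.d. cut-off is permissive, so a flag of this kind is best read together with the other criteria used here (synteny breaks and absence of REPs).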
for the synthesis of a siderophore related to acinetobactin and containing a salicylamide moiety21 (Supplementary Fig. 5).
Five gene clusters that direct the production of secondary metabolites have been identified (see Supplementary Fig. 5). PSEEN5520-PSEEN5522 are responsible for hydrogen cyanide production that is involved in Caenorhabditis elegans killing by P. aeruginosa22 and in the suppression of soil-borne plant pathogens by certain Pseudomonas species23. The genome of P. entomophila contains four clusters of genes predicted to encode three different lipopeptides and a polyketide (Table 2 and Supplementary Fig. 5).
Regulation of virulence revealed by a genome-wide mutagenesis
To directly identify factors that modulate the interaction between P. entomophila and D. melanogaster, we generated a Tn5-derived library of variants that were individually screened for their infectious and pathogenic properties. Among the 7,500 clones, we isolated 23 mutants whose growth was not affected and that displayed attenuated infectious and/or pathogenic properties (Table 2). Mapping of the mini-Tn5 insertion sites directly identified only a putative lipopeptide as a virulence factor. No other genes predicted to be virulence factors were identified, indicating a likely redundancy. By contrast, a number of insertions affected regulators that likely modulate the expression of such virulence factors. Seven independent insertions inactivated the two-component system GacS/GacA involved in the regulation of various processes, including virulence in different species, and resulted in the inability of these mutants to induce an immune response. P. entomophila gac mutants are defective in secretion of protease and hemolysin (data not shown) and do not persist in the gut of D. melanogaster1, indicating the pivotal role of GacS/GacA in modulating the entomopathogenic properties of that strain. As observed in other Pseudomonas species23, the GacS/GacA two-component system probably regulates P. entomophila virulence genes at a post-transcriptional level via the two identified small noncoding RsmY and RsmZ RNAs that alleviate post-transcriptional repression by RsmA and RsmE homologs. Three independent insertions in the prtR gene reduce the pathogenic properties of P. entomophila but retain the capacity to induce an immune response. In P. fluorescens LS107d2 (ref. 24), PrtR and PrtI regulate the transcription of the aprA-inh-aprDEF operon, suggesting that P. entomophila relies on AprA protease to fully express its pathogenic properties in D. melanogaster. Two independent insertions that had the same consequences for the interaction with D. melanogaster have been found in algR. In P. aeruginosa, AlgR regulates a number of processes including fimbrial biogenesis, biofilm formation and cyanide production25,26. Altogether, genetic analysis indicates that GacA is a master regulator of the interaction and that the PrtR and AlgR regulators seem to play secondary roles in the infection process.
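The insertion counts listed in Table 2 can be tallied against the numbers quoted in this section. The sketch below is only a bookkeeping aid: the dictionary layout and function names are illustrative, and reading the entries "PSEEN5207(1)-8(2)" and "PSEEN0132(3)-3(1)" as separate counts for PSEEN5207/PSEEN5208 and PSEEN0132/PSEEN0133 is an interpretation of the table notation.

```python
# Independent Tn5 insertions per gene, as read from Table 2.
INSERTIONS = {
    "noninfectious and nonlethal": {
        "gacS": 5, "gacA": 2, "bioC": 1,
        "PSEEN5207": 1, "PSEEN5208": 2, "PSEEN4425": 2,
    },
    "infectious and nonlethal": {
        "prtR": 3, "algR": 2,
        "PSEEN0132": 3, "PSEEN0133": 1, "PSEEN0389": 1,
    },
}
CLONES_SCREENED = 7_500  # size of the screened library, from the text


def total_insertions(table=INSERTIONS):
    """Total number of independent insertions across both phenotype classes."""
    return sum(sum(genes.values()) for genes in table.values())


def hit_rate_percent(hits, screened=CLONES_SCREENED):
    """Share of screened clones that were retained, in percent."""
    return 100.0 * hits / screened


if __name__ == "__main__":
    gac = INSERTIONS["noninfectious and nonlethal"]
    print("total insertions:", total_insertions())
    print("gacS + gacA:", gac["gacS"] + gac["gacA"])
    print("hit rate (%):", round(hit_rate_percent(total_insertions()), 2))
```

Read this way, the table accounts for the 23 mutants and the seven GacS/GacA insertions mentioned above, which corresponds to roughly 0.3% of the clones screened.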
Metabolism, transport and regulation
The P. entomophila genome encodes most of the central metabolic pathways found in the other Pseudomonas including the pentose phosphate pathway, the Entner-Doudoroff pathway and the tricarboxylic acid cycle. Consistent with Pseudomonas metabolism, P. entomophila has an incomplete Embden-Meyerhof-Parnas pathway owing to the absence of 6-phosphofructokinase, and relies on a complete Entner-Doudoroff route for hexose utilization. The P. entomophila genome harbors several genes that encode hydrolytic activities such as chitinases, lipases and proteases as well as a set of 19 uncharacterized hydrolases, which are potentially involved in the degradation of polymers found in the soil. However, contrary to phytopathogenic strains such as P. syringae5–7, the genome of P. entomophila is devoid of genes encoding enzymes capable of degrading plant cell walls. This is consistent with the observation that this species is not pathogenic for plants (M. Arlat, Institut National de la Recherche Agronomique, Castanet, France, personal communication).
The P. entomophila genome also contains determinants for the catabolism of various aromatic compounds (see Supplementary Fig. 6 online) and long-chain carbohydrates. P. entomophila shares several gene clusters with P. putida27 that are involved in the degradation of various classes of aromatic compounds including benzoate and quinate, 4-hydroxybenzoate, phenylacetaldehyde and phenylalkanoate as well as phenylalanine and tyrosine. The P. entomophila genome contains two additional catabolic gene clusters present in the genome of P. aeruginosa PAO1 that encode determinants for the degradation of 3-hydroxybenzoate through gentisate28 and for the meta-cleavage of homoprotocatechuate29,30.
Consistent with the size of its genome, P. entomophila possesses more than 535 transporter-encoding genes. Remarkably, no genes encoding a type III or type IV secretion system, present in numerous Gram-negative bacterial pathogens31, were found in P. entomophila. The high numbers of transcriptional regulators (more than 300) and genes whose products are involved in signal transduction suggest that P. entomophila is able to adapt to substantial substrate variations in its habitats.
The soil and entomopathogenic lifestyle of P. entomophila
The metabolic properties of P. entomophila predicted from its genome suggest that this strain is a ubiquitous, metabolically versatile bacterium that may colonize diverse habitats including soil, rhizosphere and aquatic systems as shown for P. putida KT2440. However, in contrast to P. putida, P. entomophila contains a number of genes that are predicted, or have been shown, to be important for virulence. The expression of these factors is under the control of the major regulator GacA and presumably allows this strain to exploit new niches and interact with various insects, particularly D. melanogaster (Fig. 3).
[Figure 3 labels. Gut regions: esophagus, proventriculus, gc, midgut; PM, mv, Ep. Steps: 1, ingestion (0 h); 2, resistance to oxidative burst (catalases, SOD, GST); 3, persistence (2–3 h; Gac-dependent PPF); 4, immune response escape (3–6 h; PrtR, AprA, AlgR); 5, pathogenicity and death (12–24 h; Tc toxins, proteases, hemolysin, HCN, lipopeptides).]
Figure 3 Steps in the interaction between P. entomophila and
D. melanogaster. Five different steps are shown: 1. ingestion of
P. entomophila through the esophagus; 2. resistance to oxidative
stress in response to an oxidative burst in the gut; 3. persistence of
P. entomophila in the gut; 4. escape from immune response effectors;
5. pathogenicity and lethal outcome of the interaction after important
modifications of the midgut physiology including microvilli disruption,
cell destruction (indicated by a *) and in some cases peritrophic matrix
disorganization (indicated by a dashed line). Red indicates important
steps in the infection process. Blue indicates newly identified proteins that
could be involved at these steps in the process. Time scale is indicated in
brackets. Ep, epithelial cell; mv, microvilli; PM, peritrophic matrix; gc,
gastric cecum.
In D. melanogaster, an environment hostile for microbial colonization is maintained in the gut by secretion of antimicrobial factors such as lysozymes32,33 and other digestive enzymes. Recently, it has been shown that a unique epithelial oxidative burst limits microbial proliferation in the gut34; resistance to oxidative stress might therefore be a prerequisite for D. melanogaster gut colonization. The P. entomophila genome encodes 40 proteins that are predicted to be involved in resistance to oxidative stress including four catalases, two superoxide dismutases, three hydroperoxide reductases and eleven glutathione-S-transferases. It is noteworthy that resistance to oxidative stress is probably not sufficient for colonization as other Pseudomonas species that possess a large repertoire of oxidant-detoxifying proteins are not able to persist in the gut of D. melanogaster1. This assumption is further reinforced by the observation that P. entomophila gacA mutants were not less resistant to peroxide, hypochlorite or paraquat (data not shown). As the P. entomophila-D. melanogaster interaction is specific, P. entomophila infectivity likely involves the expression of a specific gene enabling this strain to persist in the D. melanogaster gut, as shown for the Erwinia carotovora Evf factor35. Because P. entomophila does not contain any evf-related genes, we cannot predict candidates for this putative persistence promoting factor (ppf in Fig. 3). Nonetheless, this gene is likely regulated by the GacS/GacA two-component system: the gacA::Tn5 or gacS::Tn5 mutants do not persist in the gut and P. entomophila cells are infectious only at stationary phase, concomitant with Gac activation of virulence genes (data not shown). It is striking to note that in both P. entomophila and E. carotovora35, genes required to interact with D. melanogaster are under the control of global regulators, that is, GacA and Hor, respectively, revealing the branching of virulence genes in a complex regulatory network.
Infection of D. melanogaster by P. entomophila is accompanied by blockage of food-uptake1. This phenomenon is also observed in the interaction between Serratia entomophila and the grass grub Costelytra zealandica or between Yersinia pestis and the flea. The processes used to effect food blockage seem to be different in the two systems; Y. pestis relies on phospholipase synthesis and biofilm formation36,37 whereas the mechanism used by S. entomophila remains unknown38. Genes responsible for the anti-feeding determinants of S. entomophila have a prophage origin and no related genes were identified in the genome of P. entomophila. Since algR mutants still provoke food-uptake blockage, biofilm formation is probably not essential for D. melanogaster infection by P. entomophila.
The persistence of P. entomophila in the larval gut triggers both a local and systemic immune response1. The P. entomophila level remains high in wild-type larvae, similar to that observed in a relish mutant unable to induce an immune response1, suggesting that this strain is able to escape the D. melanogaster immune response. Biofilm formation might protect P. entomophila cells from immune effectors or persistence of bacteria might result from the degradation of effectors. The defects observed with prtR mutants indicated that AprA may degrade antimicrobial peptides, as indicated by recent in vivo studies39, and consequently disable the immune response.
Twelve hours after D. melanogaster ingests the bacteria, physiological modifications to the fly caused by P. entomophila are dramatic and the expression of 205 D. melanogaster genes is modified1 (Fig. 3). These changes probably result from the action of virulence factors such as proteases, hemolysins, insecticidal toxin-like proteins, secondary metabolites or hydrogen cyanide. However, lethality starts to be apparent after 16 h, indicating that this late gene expression will have no effect on the fatal outcome of the interaction.
DISCUSSION
The complete genome sequence of P. entomophila provides insight into this organism's entomopathogenic lifestyle. Combined with a genetic approach, it has revealed potential virulence factors along with regulators that modulate their expression. This study also revealed that P. entomophila is the first Pseudomonas strain to be pathogenic in a multicellular organism and at the same time to be devoid of a type III secretion system. Its potential to use various plant-derived compounds including aromatic molecules, and its antibiotic- and oxidative stress-resistance capacities suggest that P. entomophila is a commensal bacterium. As this strain is not a plant pathogen, it may have potential to control insects. Unexpectedly for an environmental isolate, P. entomophila has a genome that contains a limited number of bacteriophages and transposons. This may contribute to its relatively small size compared to other Pseudomonas genomes. Finally, the complete genome sequence of P. entomophila provides a framework for further studies to characterize its pathogenic properties and for a host-pathogen system in which both organisms are amenable to genetic and genomic analysis.
METHODS
Genome sequencing, assembly and annotation. The complete genome sequence of P. entomophila L48 was determined using the whole-genome shotgun method (10× coverage, using two plasmid libraries and one BAC library to order contigs). Finishing was performed by PCR amplification from contig extremities. After a first round of annotation, regions of lower quality as well as regions with putative frameshifts were resequenced from PCR amplification of the dubious regions. Using the AMIGene software (Annotation of MIcrobial Genes)40, a total of 5,279 coding sequences (CDSs) were predicted (and assigned a unique identifier prefixed with "PSEEN"), and submitted to automatic functional annotation: exhaustive BLAST searches against the UniProt databank were performed to determine significant homology. Protein motifs and domains were documented using the InterPro databank. In parallel, genes coding for enzymes were classified using the PRIAM software41. TMHMM v2.0 was used to identify transmembrane domains42, and SignalP 3.0 was used to predict signal peptide regions43. Finally, tRNAs were identified using tRNAscan-SE44. Sequence data for comparative analyses were obtained from the NCBI databank (RefSeq section). Putative orthologs and synteny groups (that is, conservation of the chromosomal colocalization between pairs of orthologous genes from different genomes) were computed between P. entomophila and all the other complete genomes as previously described45. Manual validation of the automatic annotation was performed using the MaGe (Magnifying Genomes) interface, which allows graphic visualization of the P. entomophila annotations enhanced by a synchronized representation of synteny groups in other genomes chosen for comparisons45. All the data (that is, syntactic and functional annotations, and results of comparative analysis) were stored in a relational database, called EntomoScope. This database is publicly available via the MaGe interface at http://www.genoscope.cns.fr/agc/mage/.
Bacterial mutagenesis and screening. Random mutagenesis was performed by biparental mating using P. entomophila1 and Escherichia coli S17.1-λpir46 carrying the pUT-Tn5-Tc suicide plasmid as previously described47. A total of 7,500 TcR colonies obtained from several independent conjugations were screened individually as previously described35. Transconjugants that displayed attenuated virulence were subjected to several secondary screenings by natural infection as previously described1. Insertion sites were determined using two different methods. First, genomic DNA was digested by PstI or NotI/PstI and ligated into pUC18 and pBlueScript, respectively. Clones that contained the mini-transposon and its flanking sequences were selected by plating the E. coli BW25142 transformants on tetracycline (10 µg/ml). One flanking region was sequenced from the Tc gene using the oligonucleotide (Tc-F) 5′-TCGTCGACAAGCTTCGG-3′. Some insertion sites were determined by inverse PCR. Genomic DNA was digested by either PstI or EagI, self-ligated and amplified using the oligonucleotides Tc-F and 5′-AGATCTGATCAAGAGACAT-3′
for PstI-digested DNA or 5′-GGCGGCCCTATACCTTGTCTG-3′ (Tet-end) and 5′-CATAATGGGGAAGGCCAT-3′ for EagI-digested DNA, respectively. One flanking region was sequenced using the oligonucleotides Tc-F or Tet-end. Insertion sites were confirmed by amplifying the region overlapping the insertion site. Southern blot analysis was carried out to verify that the selected clones carried only a single copy of the transposon.
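The fragment that is self-ligated and amplified in the PCR-based mapping step is bounded by the restriction sites nearest to the insertion. The following Python sketch illustrates that idea only; it is not the authors' procedure. It assumes a linear sequence string and the standard recognition sequences of PstI (CTGCAG) and EagI (CGGCCG). The function name `flanking_fragment` and the toy sequence are hypothetical, and restriction sites located inside the transposon itself are ignored.

```python
# Recognition sequences of the two enzymes named in the Methods.
PSTI_SITE = "CTGCAG"
EAGI_SITE = "CGGCCG"


def flanking_fragment(genome: str, insertion_pos: int, site: str = PSTI_SITE):
    """Return (start, end) of the restriction fragment spanning insertion_pos.

    Assumptions (illustrative only): the sequence is treated as linear, the
    exact cut position inside the recognition site is ignored, and a missing
    site on either side falls back to the end of the sequence.
    """
    genome = genome.upper()
    # Nearest site lying entirely upstream of the insertion point.
    left = genome.rfind(site, 0, insertion_pos)
    # Nearest site starting at or downstream of the insertion point.
    right = genome.find(site, insertion_pos)
    start = left if left != -1 else 0
    end = right + len(site) if right != -1 else len(genome)
    return start, end


# Toy example: two PstI sites flank a hypothetical insertion at position 12.
toy = "AAA" + PSTI_SITE + "T" * 8 + PSTI_SITE + "AAA"
print(flanking_fragment(toy, 12))  # (3, 23)
```

The length of the returned interval gives a rough idea of whether a given enzyme yields a fragment short enough to amplify, which is one plausible reason for trying both PstI and EagI.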
Accession numbers. The P. entomophila nucleotide sequence and annotation data have been deposited in the EMBL databank under accession number CT573326.
Note: Supplementary information is available on the Nature Biotechnology website.
ACKNOWLEDGMENTS
This work was supported by CNRS (Programme Sequencage a grande echelle), by IFR115 and by MRT/ACI IMPBio 2004 'MicroScope.' We thank Celia Floquet and Camille Jourlain for technical assistance, Matthieu Arlat for plant assays and helpful discussions, Alexandra Gruss, Linda Sperling and Sean Kennedy for critical reading of the manuscript, and Olivier Espeli for expert annotation. N.V. was supported by a doctoral fellowship from the Association Vaincre la Mucoviscidose and the Association pour la Recherche sur le Cancer.
AUTHOR CONTRIBUTIONS
N.V., V.B., P.W., B.S., J.W., B.L., C.M. and F.B. designed research; N.V., D.V., S.C., Z.R., V.B., C.A., L.C., C.J., A.L., B.V. and F.B. performed research; N.V., D.V., S.C., V.B., C.A., C.M. and F.B. contributed new reagents/analytic tools; N.V., D.V., S.C., Z.R., V.B., C.A., L.C., C.J., A.L., B.V., B.L., C.M. and F.B. analyzed data; and N.V. and F.B. wrote the paper.
COMPETING INTERESTS STATEMENT
The authors declare that they have no competing financial interests.
Published online at http://www.nature.com/naturebiotechnology/
Reprints and permissions information is available online at http://npg.nature.com/reprintsandpermissions/
1. Vodovar, N. et al. Drosophila host defense after oral infection by an entomopathogenic Pseudomonas species. Proc. Natl. Acad. Sci. USA 102, 11414–11419 (2005).
2. Waterfield, N.R., Wren, B.W. & ffrench-Constant, R.H. Invertebrates as a source of emerging human pathogens. Nat. Rev. Microbiol. 2, 833–841 (2004).
3. Chattopadhyay, A., Bhatnagar, N.B. & Bhatnagar, R. Bacterial insecticidal toxins. Crit. Rev. Microbiol. 30, 33–54 (2004).
4. Stover, C.K. et al. Complete genome sequence of Pseudomonas aeruginosa PAO1, an opportunistic pathogen. Nature 406, 959–964 (2000).
5. Buell, C.R. et al. The complete genome sequence of the Arabidopsis and tomato pathogen Pseudomonas syringae pv. tomato DC3000. Proc. Natl. Acad. Sci. USA 100, 10181–10186 (2003).
6. Feil, H. et al. Comparison of the complete genome sequences of Pseudomonas syringae pv. syringae B728a and pv. tomato DC3000. Proc. Natl. Acad. Sci. USA 102, 11064–11069 (2005).
7. Joardar, V. et al. Whole-genome sequence analysis of Pseudomonas syringae pv. phaseolicola 1448A reveals divergence among pathovars in genes involved in virulence and transposition. J. Bacteriol. 187, 6488–6498 (2005).
8. Paulsen, I.T. et al. Complete genome sequence of the plant commensal Pseudomonas fluorescens Pf-5. Nat. Biotechnol. 23, 873–878 (2005).
9. Nelson, K.E. et al. Complete genome sequence and comparative analysis of the metabolically versatile Pseudomonas putida KT2440. Environ. Microbiol. 4, 799–808 (2002).
10. Weinel, C., Nelson, K.E. & Tummler, B. Global features of the Pseudomonas putida KT2440 genome sequence. Environ. Microbiol. 4, 809–818 (2002).
11. Eisen, J.A., Heidelberg, J.F., White, O. & Salzberg, S.L. Evidence for symmetric chromosomal inversions around the replication origin in bacteria. Genome Biol. 1, RESEARCH0011 (2000).
12. Aranda-Olmedo, I., Tobes, R., Manzanera, M., Ramos, J.L. & Marques, S. Species-specific repetitive extragenic palindromic (REP) sequences in Pseudomonas putida. Nucleic Acids Res. 30, 1826–1833 (2002).
13. Waterfield, N.R., Bowen, D.J., Fetherston, J.D., Perry, R.D. & ffrench-Constant, R.H. The tc genes of Photorhabdus: a growing family. Trends Microbiol. 9, 185–191 (2001).
14. Bowen, D. et al. Insecticidal toxins from the bacterium Photorhabdus luminescens. Science 280, 2129–2132 (1998).
15. Joo Lee, P. et al. Cloning and heterologous expression of a novel insecticidal gene (tccC1) from Xenorhabdus nematophilus strain. Biochem. Biophys. Res. Commun. 319, 1110–1116 (2004).
16. Waterfield, N., Hares, M., Yang, G., Dowling, A. & ffrench-Constant, R. Potentiation and cellular phenotypes of the insecticidal Toxin complexes of Photorhabdus bacteria. Cell. Microbiol. 7, 373–382 (2005).
17. Wilson, M., McNab, R. & Henderson, B. Bacterial Disease Mechanisms (Cambridge University Press, Cambridge, UK, 2002).
18. Miyoshi, S. & Shinoda, S. Microbial metalloproteases and pathogenesis. Microbes Infect. 2, 91–98 (2000).
19. Meyer, J.M. Pyoverdines: pigments, siderophores and potential taxonomic markers of fluorescent Pseudomonas species. Arch. Microbiol. 174, 135–142 (2000).
20. Ravel, J. & Cornelis, P. Genomics of pyoverdine-mediated iron uptake in pseudomonads. Trends Microbiol. 11, 195–200 (2003).
21. Mercado-Blanco, J. et al. Analysis of the pmsCEAB gene cluster involved in biosynthesis of salicylic acid and the siderophore pseudomonine in the biocontrol strain Pseudomonas fluorescens WCS374. J. Bacteriol. 183, 1909–1920 (2001).
22. Gallagher, L.A. & Manoil, C. Pseudomonas aeruginosa PAO1 kills Caenorhabditis elegans by cyanide poisoning. J. Bacteriol. 183, 6207–6214 (2001).
23. Haas, D. & Defago, G. Biological control of soil-borne pathogens by fluorescent pseudomonads. Nat. Rev. Microbiol. 3, 307–319 (2005).
24. Burger, M., Woods, R.G., McCarthy, C. & Beacham, I.R. Temperature regulation of protease in Pseudomonas fluorescens LS107d2 by an ECF sigma factor and a transmembrane activator. Microbiology 146, 3149–3155 (2000).
25. Lizewski, S.E. et al. Identification of AlgR-regulated genes in Pseudomonas aeruginosa by use of microarray analysis. J. Bacteriol. 186, 5672–5684 (2004).
26. Whitchurch, C.B. et al. Phosphorylation of the Pseudomonas aeruginosa response regulator AlgR is essential for type IV fimbria-mediated twitching motility. J. Bacteriol. 184, 4544–4554 (2002).
27. Jimenez, J.I., Minambres, B., Garcia, J.L. & Diaz, E. Genomic analysis of the aromatic catabolic pathways from Pseudomonas putida KT2440. Environ. Microbiol. 4, 824–841 (2002).
28. Liu, D.Q., Liu, H., Gao, X.L., Leak, D.J. & Zhou, N.Y. Arg169 is essential for catalytic activity of 3-hydroxybenzoate 6-hydroxylase from Klebsiella pneumoniae M5a1. Microbiol. Res. 160, 53–59 (2005).
29. Prieto, M.A., Diaz, E. & Garcia, J.L. Molecular characterization of the 4-hydroxyphenylacetate catabolic pathway of Escherichia coli W: engineering a mobile aromatic degradative cluster. J. Bacteriol. 178, 111–120 (1996).
30. Thotsaporn, K., Sucharitakul, J., Wongratana, J., Suadee, C. & Chaiyen, P. Cloning and expression of p-hydroxyphenylacetate 3-hydroxylase from Acinetobacter baumannii: evidence of the divergence of enzymes in the class of two-protein component aromatic hydroxylases. Biochim. Biophys. Acta 1680, 60–66 (2004).
31. Hueck, C.J. Type III protein secretion systems in bacterial pathogens of animals and plants. Microbiol. Mol. Biol. Rev. 62, 379–433 (1998).
32. Hultmark, D. Insect lysozymes. EXS 75, 87–102 (1996).
33. Regel, R., Matioli, S.R. & Terra, W.R. Molecular adaptation of Drosophila melanogaster lysozymes to a digestive function. Insect Biochem. Mol. Biol. 28, 309–319 (1998).
34. Ha, E.M. et al. An antioxidant system required for host protection against gut infection in Drosophila. Dev. Cell 8, 125–132 (2005).
35. Basset, A., Tzou, P., Lemaitre, B. & Boccard, F. A single gene that promotes interaction of a phytopathogenic bacterium with its insect vector, Drosophila melanogaster. EMBO Rep. 4, 205–209 (2003).
36. Hinnebusch, B.J. et al. Role of Yersinia murine toxin in survival of Yersinia pestis in the midgut of the flea vector. Science 296, 733–735 (2002).
37. Darby, C., Ananth, S.L., Tan, L. & Hinnebusch, B.J. Identification of gmhA, a Yersinia pestis gene required for flea blockage, by using a Caenorhabditis elegans biofilm system. Infect. Immun. 73, 7236–7242 (2005).
38. Hurst, M.R., Glare, T.R. & Jackson, T.A. Cloning Serratia entomophila antifeeding genes–a putative defective prophage active against the grass grub Costelytra zealandica. J. Bacteriol. 186, 5116–5128 (2004).
39. Liehl, P., Blight, M., Vodovar, N., Boccard, F. & Lemaitre, B. Prevalence of local immune response against oral infection in a Drosophila/Pseudomonas infection model. PLoS Pathog., in the press.
40. Bocs, S., Cruveiller, S., Vallenet, D., Nuel, G. & Medigue, C. AMIGene: Annotation of MIcrobial Genes. Nucleic Acids Res. 31, 3723–3726 (2003).
41. Claudel-Renard, C., Chevalet, C., Faraut, T. & Kahn, D. Enzyme-specific profiles for genome annotation: PRIAM. Nucleic Acids Res. 31, 6633–6639 (2003).
42. Krogh, A., Larsson, B., von Heijne, G. & Sonnhammer, E.L. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol. 305, 567–580 (2001).
43. Bendtsen, J.D., Nielsen, H., von Heijne, G. & Brunak, S. Improved prediction of signal peptides: SignalP 3.0. J. Mol. Biol. 340, 783–795 (2004).
44. Lowe, T.M. & Eddy, S.R. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25, 955–964 (1997).
45. Vallenet, D. et al. MaGe: a microbial genome annotation system supported by synteny results. Nucleic Acids Res. 34, 53–65 (2006).
46. Miller, V.L. & Mekalanos, J.J. A novel suicide vector and its use in construction of insertion mutations: osmoregulation of outer membrane proteins and virulence determinants in Vibrio cholerae requires toxR. J. Bacteriol. 170, 2575–2583 (1988).
47. de Lorenzo, V., Herrero, M., Jakubzik, U. & Timmis, K.N. Mini-Tn5 transposon derivatives for insertion mutagenesis, promoter probing, and chromosomal insertion of cloned DNA in gram-negative eubacteria. J. Bacteriol. 172, 6568–6572 (1990).
[Supplementary Figure 1 (image not reproduced): six panels plotting GC skew (y axis) against position in bp (x axis) for P. entomophila, P. putida KT2440, P. aeruginosa PAO1, P. syringae pv. tomato DC3000, P. syringae pv. syringae B728a and P. fluorescens Pf-5, with the positions of dnaA and dif marked on each panel.]
Supplementary Figure 1 Comparison of the replichore organization in selected Pseudomonas
genomes. Replichores were mapped by GC skew analysis using a 2000-bp window. The dnaA gene
close to oriC and the chromosome dimer resolution dif site are shown in red and green, respectively.
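As an illustration of the windowed GC skew computation that the caption refers to, here is a minimal Python sketch. It assumes the conventional definition, skew = (G − C) / (G + C), and non-overlapping windows; the caption states only the 2000-bp window size, so the formula, the window step and the function name `gc_skew` are assumptions rather than the authors' exact procedure.

```python
def gc_skew(seq: str, window: int = 2000):
    """Return a list of (window_start, skew) pairs for non-overlapping windows.

    skew = (G - C) / (G + C), the conventional definition (an assumption here).
    Windows with no G or C are reported with a skew of 0.0, and a trailing
    partial window is dropped.
    """
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        skew = (g - c) / (g + c) if (g + c) else 0.0
        values.append((start, skew))
    return values


def sign_changes(values):
    """Return window starts where the skew changes sign between adjacent windows.

    On a circular bacterial chromosome such changes are expected near the
    replication origin and terminus, which is how replichores are delimited.
    """
    changes = []
    for (_, prev), (start, cur) in zip(values, values[1:]):
        if prev * cur < 0:
            changes.append(start)
    return changes
```

Applied to a complete chromosome sequence, the two sign changes with the largest surrounding skew amplitude would be candidate positions for oriC (near dnaA) and the terminus (near dif).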
Supplementary information for Vodovar et al. "Complete genome sequence of the entomopathogenic
and metabolically versatile soil bacterium Pseudomonas entomophila"
[Supplementary Figure 2 (image not reproduced): panels A to D show pairwise nucleotide comparisons of P. entomophila L48 (coordinates 1 to 5,888,780) with P. putida KT2440, P. fluorescens Pf-5, P. aeruginosa PAO1 and P. syringae pv. tomato DC3000; panels A' to D' show the corresponding synteny plots. The comparison genome axes end at 6,397,126, 6,264,403, 7,074,893 and 6,181,863; the assignment of these end coordinates to individual genomes is not recoverable from the extracted text.]
Supplementary Figure 2 Comparison of the P. entomophila genome with that of other
Pseudomonas species and visualization of P. entomophila genomic synteny compared to
selected Pseudomonas. Regions of significant sequence identity between the nucleotide
sequence of P. entomophila (top) and of different Pseudomonas spp. (bottom): P. putida
KT2440 (a), P. fluorescens Pf-5 (b), P. aeruginosa PAO1 (c) or P. syringae pv. tomato
DC3000 (d), connected by red (collinear regions) and blue (inverted regions) lines. Axes
represent the portions coded for in the order in which they occur in the chromosomes. The
display was generated using the Artemis Comparison Tool (freely available at
www.sanger.ac.uk/Software/ACT). Visualization of P. entomophila genomic synteny
compared to P. putida KT2440 (a'), P. fluorescens Pf-5 (b'), P. aeruginosa PAO1 (c') or
P. syringae pv. tomato DC3000 (d'). Squares represent groups of synteny, with their size
proportional to their length. Groups of synteny correspond to groups of genes shared by the
two genomes (> 60% identity on > 80% of their length at the protein level) that display
similar organization, allowing for up to five insertion or deletion events.
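The synteny-group criterion in the caption can be sketched in Python as follows. This is a simplified illustration, not the MaGe algorithm (ref. 45): it assumes each ortholog pair is described by the rank of the gene along each genome, percent identity and percent aligned length, and it interprets "five insertion or deletion events" as at most five intervening genes between consecutive members of a group. The names `Ortholog` and `synteny_groups` are hypothetical.

```python
from typing import List, NamedTuple


class Ortholog(NamedTuple):
    index_a: int      # rank of the gene along genome A
    index_b: int      # rank of its ortholog along genome B
    identity: float   # percent identity at the protein level
    coverage: float   # percent of the protein length aligned


def synteny_groups(pairs: List[Ortholog],
                   min_identity: float = 60.0,
                   min_coverage: float = 80.0,
                   max_gap: int = 5) -> List[List[Ortholog]]:
    """Group ortholog pairs whose neighbours are close in both genomes."""
    # Keep only pairs above the identity and coverage thresholds,
    # sorted by position along genome A.
    kept = sorted(p for p in pairs
                  if p.identity > min_identity and p.coverage > min_coverage)
    groups: List[List[Ortholog]] = []
    for pair in kept:
        if groups:
            last = groups[-1][-1]
            # Number of intervening genes in each genome; abs() lets
            # inverted segments stay in the same group.
            gap_a = pair.index_a - last.index_a - 1
            gap_b = abs(pair.index_b - last.index_b) - 1
            if 0 <= gap_a <= max_gap and 0 <= gap_b <= max_gap:
                groups[-1].append(pair)
                continue
        groups.append([pair])
    return groups
```

Single-gene groups are returned as well; a caller that wants only true synteny blocks could discard groups of length one. Statistics such as the number, maximal size and average size of groups then follow directly from the returned lists.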
[Supplementary Figure 3 (image not reproduced): phylogenetic tree. Tip labels: Se SepA (pADAP_54), Se SepC (pADAP_57), Pl TccC1 (plu4167), Pl TccC3 (plu0967), Pl TccC4 (plu0976), Pl TccC5 (plu0964), Pl TccC6 (plu4182), Pl TccC7 (plu4488), PSEEN2485, PSEEN2697, PSEEN2788, YPO2312, YPO3673, YPO3674, PSPTO1231, PSPTO4340, PSPTO4343, PSPTO4344, PSPPH_2571 and PSPPH_4042. Subgroup labels: "Serratia entomophila, Pseudomonas entomophila, Photorhabdus luminescens, Yersinia YPO2312 subgroup", "Yersinia spp. subgroup" and "Pseudomonas syringae subgroup". Bootstrap values cannot be assigned to nodes from the extracted text.]
Supplementary Figure 3 Phylogeny of representative TccC-type toxins. The tree was
reconstructed using the NJ method. The number shown next to each node indicates the bootstrap
values of 1,000 replicates. The sequence of Serratia entomophila SepA protein was used as
outgroup. The tree uncovers three major subgroups of TccC-type toxins and reveals that the P.
entomophila toxins are related to Yersinia spp. toxin YPO2312, to the S. entomophila SepC toxins
and to Photorhabdus luminescens TccC toxins (subgroup blue). The TccC-type toxins identified in
the genomes of the three P. syringae (green) are more distantly related. Labels indicate the
GenBank locus tags except for the S. entomophila Sep proteins and the P. luminescens TccC
proteins where both the name of the protein and the locus tag are shown (locus tags in brackets).
YPO: Yersinia pestis, PSPTO: P. syringae pv. tomato DC3000 and PSPPH: P. syringae pv.
phaseolicola 1448A.
[Supplementary Figure 4 (image not reproduced): hemolysis plate with spots numbered 1 to 29, corresponding to the strains listed below.]
1: P. fluorescens pv. lomagnae; 2: P. cedrina; 3: P. libanensis; 4: P. mandelii;
5: P. corrugata; 6: P. fluorescens biovar 1; 7: P. fluorescens biovar 2;
8: P. fluorescens biovar 3; 9: P. fluorescens biovar 4; 10: P. fluorescens biovar 5;
11: P. marginalis pv. marginalis; 12: P. rhodesiae; 13: P. tolaasii; 14: P. veronii;
15: P. putida KT2440; 16: P. putida; 17: P. putida biovar b; 18: P. monteilii;
19: P. mosselii; 20: P. cichorii; 21: P. fuscovaginae; 22: P. chlororaphis;
23: P. aeruginosa PAO1; 24: P. gingeri; 25: P. brassicacearum; 26: P. jessenii;
27: P. agarici; 28: P. asplenii; 29: P. entomophila.
Supplementary Figure 4 P. entomophila secretes a highly diffusible hemolytic activity compared
to the other Pseudomonas strains tested. These strains have been previously described1 except
P. jessenii (CFBP4842) and P. asplenii (CFBP2063), which come from the Collection Française de
Bactéries Phytopathogènes. In the conditions tested (29°C on Trypticase soy broth containing sheep
erythrocytes (bioMérieux, Marcy l'Etoile)), only slight hemolytic activity is observed for some
strains (for example P. cedrina), whereas hemolytic activity catalyzed by phospholipase C of
P. aeruginosa PAO1 is repressed2.
1. Vodovar, N. et al. Drosophila host defense after oral infection by an entomopathogenic
Pseudomonas species. PNAS 102, 11414-11419 (2005).
2. Vasil, M.L., Berka, R.M., Gray, G.L. & Nakai, H. Cloning of a phosphate-regulated
hemolysin gene (phospholipase C) from Pseudomonas aeruginosa. J Bacteriol 152, 431-440
(1982).
[Supplementary Figure 5 (image not reproduced): gene cluster maps; "//" marks a break between non-contiguous regions.]
1. Pyoverdine: PSEEN1813-PSEEN1815 // PSEEN3223-PSEEN3234
2. Acinetobactin-like siderophore: PSEEN2492-PSEEN2507
3. Lipopeptide I: PSEEN0132 // PSEEN3042-PSEEN3045 // PSEEN3332
4. Lipopeptide II: PSEEN2138-PSEEN2156
5. Lipopeptide III: PSEEN2716-PSEEN2720
6. HCN and PKS: PSEEN5520-PSEEN5536
Supplementary Figure 5 Gene clusters involved in siderophore and secondary metabolite
biosynthesis in P. entomophila. Each cluster identified on the genome is represented and the
genes are colored according to their assigned function as described in Figure 1: salmon, amino
acid biosynthesis; red, cellular processes; brown, central intermediary metabolism; navy blue,
regulatory functions and signal transduction; lime green, secondary metabolite biosynthesis;
teal, transport and binding proteins; black, unknown function and hypothetical proteins.
Concerning the siderophore related to acinetobactin, the gene cluster (PSEEN2492-
PSEEN2507) encodes determinants for both the synthesis of salicylic acid (SA) (PSEEN2504-
2507; similar to the pmsCEAB cluster from P. fluorescens WCS374 (ref. 3)) and for the
nonribosomal synthesis and transport of a siderophore (PSEEN2492-PSEEN2503; similar to
the gene cluster involved in acinetobactin biosynthesis in Acinetobacter baumannii4).
Contrary to acinetobactin, the siderophore produced is thought to contain a salicylamide
moiety and might resemble pseudomonine from P. fluorescens WCS374, as the gene
clusters for SA and siderophore biosynthesis seem to be linked3.
The cluster predicted to encode determinants for the synthesis of lipodecapeptide I
contains three nonribosomal peptide synthetase (NRPS)-encoding genes (PSEEN3332,
PSEEN3044-45) similar to those of P. fluorescens Pf-5 (PFL2144-2146) that are predicted to
be involved in cyclic lipodecapeptide biosynthesis5. The ortholog of PFL2145 is found apart
from the two others and PSEEN3045 corresponds to a complete duplication of PFL2146. As
observed in the genome of P. fluorescens Pf-5, this cluster lacks an initial loading module;
this function may be carried by PSEEN0132, which specifies an NRPS loading module. The
cluster predicted to encode determinants responsible for the synthesis of lipopeptide II
(PSEEN2139-PSEEN2156) is 32 kb long and encodes several NRPSs and polyketide synthases
(PKSs), which may be involved in the production of a novel uncharacterized lipopeptide. The
cluster predicted to encode determinants responsible for the synthesis of lipopeptide III
contains three genes (PSEEN2716-PSEEN2720) similar to Psyr1792-4 from P. syringae B728a.
They encode two NRPSs and a hybrid NRPS/PKS. As P. entomophila is not pathogenic
for plants (M. Arlat, unpublished data), this lipopeptide is probably not a phytotoxin but may
rather be involved in the suppression of plant diseases by competing with plant pathogens.
The cluster predicted to encode determinants responsible for the synthesis of a polyketide is
adjacent to genes encoding hydrogen cyanide synthase, spans over 21 kb and contains genes
encoding five PKSs and six proteins related to polyketide biosynthesis (PSEEN5524-5536).
3. Mercado-Blanco, J. et al. Analysis of the pmsCEAB gene cluster involved in
biosynthesis of salicylic acid and the siderophore pseudomonine in the biocontrol strain
Pseudomonas fluorescens WCS374. J Bacteriol 183, 1909-1920 (2001).
4. Dorsey, C.W., Tolmasky, M.E., Crosa, J.H. & Actis, L.A. Genetic organization of an
Acinetobacter baumannii chromosomal region harbouring genes related to siderophore
biosynthesis and transport. Microbiology 149, 1227-1238 (2003).
5. Paulsen, I.T. et al. Complete genome sequence of the plant commensal Pseudomonas
fluorescens Pf-5. Nat Biotechnol 23, 873-878 (2005).
[Supplementary Figure 6 (image not reproduced; chemical structures are not recoverable from the extracted text). Panel A: linear chromosome map (scale 0 to 5 Mb) with the labeled gene clusters pcaRKcatIJpcaFTBDC, maiA tyrB1, paaXYphaApaaBCDpcaF-2paaFGLIJKactPphaKpaaN, mhbDBIM, hpaG1G2EDFHI, C1-hpaH C2-hpaH, pcaGH, phhRABC, tyrB2, fahA hmgA, benFEcatACBbenKDCBA, fadB1xFxAxB2xDx, fadAB and quiA. Panel B: pathway diagram. Compounds labeled: 4-hydroxybenzoate, protocatechuate, β-carboxy-cis,cis-muconate, γ-carboxymucolactone, benzylamine, benzoate, benzoate diol, catechol, cis,cis-muconate, mucolactone, β-ketoadipate enol-lactone, β-ketoadipate, β-ketoadipate-CoA, acetyl-CoA, succinyl-CoA, quinate, 3-hydroxybenzoate, gentisate, maleylpyruvate, fumarylpyruvate, fumarate, pyruvate, phenylalkanoate, phenylacetate, phenylacetyl-CoA, phenylethylamine, phenylacetaldehyde, phenylalanine, tyrosine, 4-hydroxyphenylpyruvate, homogentisate, maleylacetoacetate, fumarylacetoacetate, acetoacetate, 4-hydroxyphenylacetate, homoprotocatechuate, 2-hydroxy-5-carboxymethylmuconate semialdehyde, 5-carboxymethyl-2-hydroxymuconate, 5-carboxy-2-oxohept-3-enedioate, 2-hydroxyhepta-2,4-dienedioate, 2-oxohept-3-enedioate, 2,4-dihydroxyhept-2-enedioate, succinate semialdehyde and succinate. Enzymes labeled: PobA, PcaGH, PcaB, PcaC, PcaD, CatIJ, PcaF, BenABC, BenD, CatA, CatB, CatC, QuiA, MhbB, MhbD, MhbI, MhbM, PhhAB, TyrB1, TyrB2, Hpd, HmgA, Mai, Fah, Fad, Pad, PaaF, PaaGLIJK, PaaN, PhaA-PaaBCD-PcaF-2, C1-hpah, C2-hpah, HpaD, HpaE, HpaF, HpaG, HpaH, HpaI and GabD; unassigned steps are marked "?".]
Supplementary Figure 6 Catabolic pathways for aromatic compounds identified in the P.
entomophila genome. (a) The genes involved are positioned on a linear map of the
chromosome and display a scattered organization with the exception of a 60-kb cluster (yellow
box). Unlike in the genome of Acinetobacter sp. ADP1 (ref. 6), most of these genes are dispersed
throughout the genome with the exception of a 57-kb region that contains all the determinants
of the catechol and the homoprotocatechuate pathways along with several oxidases,
dehydrogenases and oxygenases that might be involved in aromatic compound degradation.
(b) Pathways similar to those found in P. putida KT2440 include the catechol and the
protocatechuate pathways that lead to the β-ketoadipate pathway, as well as the phenylacetate
and the homogentisate pathways. The pathways involved in phenylpropanoid utilization
(vanillate, coumarate, ferulate, caffeate) are absent even though several aldehyde
dehydrogenases that might convert particular phenylpropanoids (e.g. coniferyl aldehyde,
PSEEN0293) were identified. On the other hand, the P. entomophila genome contains two
additional catabolic gene clusters. The first one (PSEEN2593-2596), which is also present in
the genome of P. aeruginosa PAO1, is similar to the mhbDBIM operon of Klebsiella
pneumoniae M5a1 that encodes determinants for the degradation of 3-hydroxybenzoate
through gentisate7. The second one (PSEEN3092-3098) is similar to the hpaRAGEDFHI
operon of Escherichia coli W that encodes the meta-cleavage pathway of
homoprotocatechuate8. This operon is also present in the genomes of P. aeruginosa and P.
fluorescens. Unlike E. coli W, P. entomophila does not contain the hpaBC operon, whose
products are involved in the first step of 4-hydroxyphenylacetate catabolism, but contains genes
(PSEEN3104-3105) similar to the hpaH(C1-C2) operon of Acinetobacter baumannii that
encodes the same activity as hpaBC9.
6. Barbe, V. et al. Unique features revealed by the genome sequence of Acinetobacter sp.
ADP1, a versatile and naturally transformation competent bacterium. Nucleic Acids Res
32, 5766-5779 (2004).
7. Liu, D.Q., Liu, H., Gao, X.L., Leak, D.J. & Zhou, N.Y. Arg169 is essential for catalytic
activity of 3-hydroxybenzoate 6-hydroxylase from Klebsiella pneumoniae M5a1.
Microbiol Res 160, 53-59 (2005).
8. Prieto, M.A., Diaz, E. & Garcia, J.L. Molecular characterization of the 4-
hydroxyphenylacetate catabolic pathway of Escherichia coli W: engineering a mobile
aromatic degradative cluster. J Bacteriol 178, 111-120 (1996).
9. Thotsaporn, K., Sucharitakul, J., Wongratana, J., Suadee, C. & Chaiyen, P. Cloning and
expression of p-hydroxyphenylacetate 3-hydroxylase from Acinetobacter baumannii:
evidence of the divergence of enzymes in the class of two-protein component aromatic
hydroxylases. Biochim Biophys Acta 1680, 60-66 (2004).
Supplementary Table 1 Comparison between the genome of P. entomophila and that of other pseudomonads

                                    Pp (a)    Pf (a)    Pa (a)    Pst (a)
No. orthologous genes (%)           3630      3301      2683      2603
% of orthologs in synteny           96.9      93.6      90.3      93.5
No. synteny groups (b)              227       376       399       295
Maximal size of synteny groups      211       62        61        67
Average size of synteny groups      14.6      8.3       6.7       8.5

(a) Pp: P. putida KT2440; Pa: P. aeruginosa PAO1; Pf: P. fluorescens Pf-5; Pst: P. syringae pv. tomato DC3000.
(b) Groups of synteny correspond to groups of genes shared by the two genomes (> 60% identity on > 80% of their length at the protein level) that display similar organization, allowing for up to five insertion or deletion events.
Supplementary Table 2 Gene comparison between P. entomophila and P. putida KT2440

                                                              Pe (c)         Pp (d)
Total genes                                                   5169           5404
Common genes (a)                                              3630           3630
Number of duplicated genes (a/b)                              262 / 1271     389 / 1585
Specific genes                                                1539           1774
Number of specific genes duplicated in the genome (a/b)       139 / 410      237 / 635
% of specific duplicated genes among specific genes (a)       81%            76%

(a) Number of common or duplicated genes using a constraint of 60% identity over 80% of the length of the protein.
(b) Number of common or duplicated genes using a constraint of 35% identity over 80% of the length of the protein.
(c) Pe: P. entomophila. (d) Pp: P. putida KT2440.
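The gene counts in Supplementary Table 2 can be cross-checked with simple arithmetic: specific genes should equal total genes minus common genes. The short Python sketch below performs that check and derives the share of each genome that is common to both. The dictionary layout and function names are illustrative assumptions; only the numbers come from the table, and the derived percentages are not stated in the original.

```python
# Counts copied from Supplementary Table 2 (60% identity / 80% length criterion).
TABLE2 = {
    "P. entomophila": {"total": 5169, "common": 3630, "specific": 1539},
    "P. putida KT2440": {"total": 5404, "common": 3630, "specific": 1774},
}


def row_is_consistent(row: dict) -> bool:
    """Check that specific genes = total genes - common genes."""
    return row["total"] - row["common"] == row["specific"]


def shared_percent(row: dict) -> float:
    """Percentage of the genome's genes that are common to both species."""
    return 100.0 * row["common"] / row["total"]


for name, row in TABLE2.items():
    print(f"{name}: consistent={row_is_consistent(row)}, "
          f"shared={shared_percent(row):.1f}%")
```

Both rows pass the consistency check, and roughly 70% of P. entomophila genes and 67% of P. putida KT2440 genes fall in the common set under this criterion.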