
Mori et al. BMC Bioinformatics 2010, 11:332
http://www.biomedcentral.com/1471-2105/11/332

SOFTWARE (Open Access)

VITCOMIC: visualization tool for taxonomic compositions of microbial communities based on 16S rRNA gene sequences

Hiroshi Mori, Fumito Maruyama and Ken Kurokawa*

Abstract

Background: Understanding the community structure of microbes is typically accomplished by sequencing 16S ribosomal RNA (16S rRNA) genes. These community data can be represented by constructing a phylogenetic tree and comparing it with other samples using statistical methods. However, owing to high computational complexity, these methods are insufficient to effectively analyze the millions of sequences produced by new sequencing technologies such as pyrosequencing.

Results: We introduce a web tool named VITCOMIC (VIsualization tool for Taxonomic COmpositions of MIcrobial Community) that can analyze millions of bacterial 16S rRNA gene sequences and calculate the overall taxonomic composition for a microbial community. The 16S rRNA gene sequences of genome-sequenced strains are used as references to identify the nearest relative of each sample sequence. With this information, VITCOMIC plots all sequences in a single figure and indicates relative evolutionary distances.

Conclusions: VITCOMIC yields a clear representation of the overall taxonomic composition of each sample and facilitates an intuitive understanding of differences in community structure between samples. VITCOMIC is freely available at http://mg.bio.titech.ac.jp/vitcomic/.

Background

The number of sequenced bacterial genomes has increased rapidly and now exceeds 1,000 [1]; however, we have little information regarding environmental microbes, largely because the majority of them are unculturable [2]. The taxonomic composition of a microbial community can provide important clues to better understand its structure and ecology [3]. Analysis using 16S rRNA genes is a frequently used method to obtain the taxonomic composition of a microbial community [4,5]. Features of 16S rRNA genes include essentiality for all Bacteria and Archaea, mosaic structures of highly conserved regions and variable regions [6,7], and little possibility for horizontal gene transfer [8]. Moreover, the availability of numerous tools and databases specific for the 16S rRNA genes has potentiated taxonomic analyses [9-12].

Ultra-deep sequencing of microbial communities using a massively parallel pyrosequencer has recently uncovered relatively rare species in communities [5,13-15]. However, the enormous amounts of sequencing data produced by recent pyrosequencing studies are difficult to effectively analyze using existing computational tools (Additional file 1) [16]. For example, the overall taxonomic composition of each sample is traditionally presented graphically in phylogenetic trees [9,17]. However, graphical representation and comparison of overall taxonomic compositions for pyrosequencing data is difficult due to the high computational complexity involved in constructing multiple alignments and phylogenetic trees from millions of sequences [16,18]. Therefore, researchers tend to use a compressed representation of taxonomic composition such as a bar graph or pie chart of the phylum-level composition. Unfortunately, these compressed representations make it difficult to show differences among microbial communities, especially differences attributable to minority taxa [19].

* Correspondence: [email protected]
Department of Biological Information, Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, 4259 B-36, Nagatsuta-cho, Midori-ku, Yokohama 226-8501, Japan
Full list of author information is available at the end of the article

© 2010 Mori et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

To address deficiencies in the analysis of taxonomic compositions of microbial communities, we developed a rapid visualization tool, named VITCOMIC, that presents overall taxonomic compositions based on large datasets of 16S rRNA genes from microbial communities. VITCOMIC can facilitate intuitive understanding of microbial communities and compare taxonomic compositions between communities.

Implementation

Creation of a reference 16S rRNA gene database and their distance matrix

The reference 16S rRNA gene sequence database was constructed using 16S rRNA gene sequences from genome-sequenced strains. These data are suitable as reference data because they are accurate and have well-defined taxonomic information. Genomic sequences of Bacteria and Archaea were obtained from the NCBI Genome Database [20] in September 2009. The 16S rRNA genes of each strain were detected using RNAmmer [21]. One 16S rRNA gene was randomly sampled per species because there are only small sequence differences among 16S rRNA genes within the same genome and the same species [22,23]. A total of 601 16S rRNA gene sequences from 601 species of Bacteria and Archaea were obtained. To calculate phylogenetic distances among them, all sequences were aligned using MAFFT 6.713 with default parameters [24]. After constructing multiple alignments, genetic distances between sequences with Kimura's two-parameter model of base substitution [25] were calculated using the dnadist program in PHYLIP 3.69 [26]. The phylogenetic tree was constructed using the neighbor-joining method in the neighbor program in PHYLIP 3.69. The phylum-level taxonomy of the species was obtained from the NCBI Taxonomy Database [27].
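To illustrate the distance calculation, the following minimal Python sketch computes the Kimura two-parameter distance for one pair of aligned sequences. It is not the dnadist implementation: the function name `k2p_distance` and the handling of gaps and ambiguous bases (such columns are simply skipped) are assumptions made for this example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption: columns with a gap or an ambiguous base are skipped.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A <-> G and C <-> T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    # d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q)
    # math.log raises ValueError when the sequences are too divergent
    # for the model (argument <= 0).
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)
```

Applying such a function to every pair of the 601 aligned reference sequences would yield a distance matrix of the kind used for neighbor-joining; in practice, dnadist offers additional options that this sketch does not cover.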

Sample data for testing VITCOMIC

We used human gut microbiome data from Turnbaugh et al. [15] to test VITCOMIC. In their study, each individual was categorized as obese, lean, or overweight using body mass index. DNA was extracted from the feces of each individual, and the V2 variable regions of 16S rRNA genes were PCR amplified prior to pyrosequencing using a 454 GS FLX system [28]. We used the sequences from obese and lean individuals. The obese sample consisted of 704,369 sequences from 196 individuals; the lean sample consisted of 291,993 sequences from 61 individuals.

Inference of a nearest relative for each sequence

Using the human gut microbiome data, we conducted BLASTN searches against the reference 16S rRNA gene database to determine a nearest relative for each sample sequence. The nearest relative is the evolutionarily nearest database sequence of each sample sequence. In general, the reference sequence with the highest BLAST score is chosen as the nearest relative in sequence analyses [29]. However, because the 16S rRNA gene has mosaic structures of highly conserved regions and variable regions [6,7], the alignments created by BLAST are often divided by variable regions [30]. In this case, BLAST reports a separate score for each divided alignment, so the overall score between the sample and database sequences cannot be obtained from the highest-scoring alignment alone. To overcome this problem, we calculated a total BLAST score for alignments derived from the same pair of sample and database sequences. As illustrated in Figure 1A, the total BLAST score is calculated by summing the BLAST scores of the three divided alignments from the same pair of sample and database sequences (250 + 220 + 300 = 770). To identify the nearest relative of the sample sequence, the total BLAST score is calculated against each database sequence. After the total BLAST scores are compared across database sequences, the database sequence with the highest total BLAST score is adopted as the nearest relative of the sample sequence.
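The total-score procedure described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not VITCOMIC's code: the `Hsp` record, the function name `nearest_relatives`, and the tie-breaking rule (the first database sequence encountered wins) are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Hsp(NamedTuple):
    """One divided alignment reported by BLAST (hypothetical record layout)."""
    query: str    # sample sequence ID
    subject: str  # database (reference) sequence ID
    score: float  # BLAST score of this alignment


def nearest_relatives(hsps: Iterable[Hsp]) -> dict[str, tuple[str, float]]:
    """Map each sample sequence to (nearest relative, total BLAST score)."""
    # totals[query][subject] accumulates the scores of all divided alignments
    # that belong to the same sample/database sequence pair.
    totals: dict[str, dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for hsp in hsps:
        totals[hsp.query][hsp.subject] += hsp.score

    # The database sequence with the highest total score is the nearest relative.
    # Assumption: on ties, max() keeps the first subject encountered.
    return {
        query: max(by_subject.items(), key=lambda item: item[1])
        for query, by_subject in totals.items()
    }


hits = [
    Hsp("read_1", "Species_A", 250),
    Hsp("read_1", "Species_A", 220),
    Hsp("read_1", "Species_A", 300),
    Hsp("read_1", "Species_B", 400),  # best single alignment, lower total
]
result = nearest_relatives(hits)
```

In this example, Species_B has the highest single alignment score, yet Species_A is chosen because its three alignments sum to 770, matching the arithmetic of Figure 1A. The Species_B hit is invented for the illustration.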

Alignments shorter than 50 bp were excluded to avoid inaccurate alignments. Because variable regions are nearly neutral, false alignments between a variable region and a conserved region or other variable regions are sometimes constructed and included in calculations of total BLAST scores (Figure 1B). Calculating total BLAST scores therefore requires a function that we call the "alignments consistency check". The alignments consistency check detects false alignments using the positions of the aligned regions in the sample sequence and the matched database sequence. Normally, the order of aligned regions in the sample sequence is consistent with that in the matched database sequence (Figure 1A). In contrast, most pairs of sequences that contain false alignments are not consistent with respect to the order of aligned regions (Figure 1B). The alignments consistency check detects such breaks in consistency and excludes these pairs of sample and database sequences from the calculation of total BLAST scores.
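A minimal Python sketch of the length filter and the order-based consistency check follows. The text does not specify how VITCOMIC implements the check, so the `Region` record, the use of the query span as the alignment length, and the assumption that all alignments lie on the same strand with increasing coordinates are choices made only for this example.

```python
from typing import NamedTuple, Sequence

MIN_ALIGNMENT_LENGTH = 50  # alignments shorter than 50 bp are excluded


class Region(NamedTuple):
    """Coordinates of one alignment (hypothetical record layout)."""
    q_start: int  # start on the sample sequence
    q_end: int    # end on the sample sequence
    s_start: int  # start on the database sequence
    s_end: int    # end on the database sequence


def drop_short(regions: Sequence[Region],
               min_length: int = MIN_ALIGNMENT_LENGTH) -> list[Region]:
    """Keep alignments whose span on the sample sequence is long enough."""
    # Assumption: the query span stands in for the alignment length.
    return [r for r in regions if r.q_end - r.q_start + 1 >= min_length]


def is_order_consistent(regions: Sequence[Region]) -> bool:
    """True if regions appear in the same order on both sequences."""
    by_query = sorted(regions, key=lambda r: r.q_start)
    subject_starts = [r.s_start for r in by_query]
    # After sorting by sample position, database positions must also increase.
    return all(a < b for a, b in zip(subject_starts, subject_starts[1:]))


normal = [Region(1, 100, 10, 109), Region(150, 260, 160, 270),
          Region(300, 420, 310, 430)]
broken = [Region(1, 100, 10, 109), Region(150, 260, 310, 420),
          Region(300, 420, 160, 280)]
```

Here `normal` corresponds to the situation in Figure 1A and `broken` to Figure 1B; a sample and database pair for which `is_order_consistent` returns `False` would be left out of the total-score calculation. The coordinates are invented.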

Graphical representation of the taxonomic composition of the sample

After determining the nearest relative of each sample sequence, an average similarity between the sample sequence and the nearest relative was calculated from each set of alignments (Figure 1A). Information on the nearest relative and the average similarity is represented as a circle plot (Figures 2 and 3). In the figures, each species name in the reference 16S rRNA gene database is placed outside the outermost circle, ordered by phylogenetic relatedness. Physical distances between neighboring species in the plot indicate genetic distances of 16S rRNA genes between them. The font color of each species name corresponds to its phylum. Large circles indicate boundaries of BLAST average similarities (innermost circle starting at 80%, followed by 85, 90, 95 and 100% similarity to the database sequence). Small colored dots represent the average similarity of each sequence to its nearest relative species. The size of these dots indicates the relative abundance of sequences in the sample. The figure produced by VITCOMIC contains four categories of dot size that indicate the relative abundance of the sample sequence: smallest dot < 1%; second smallest dot < 5%; third smallest dot < 10% (largest dot in Figures 2 and 3); and the largest dot > 10%. The results are output as a PostScript file that can be viewed at high resolution. The overall workflow of VITCOMIC is described in Figure 4. The input file of VITCOMIC is basically a result file of BLAST against our reference 16S rRNA gene sequence database. Our reference database can be downloaded from the VITCOMIC web site http://mg.bio.titech.ac.jp/vitcomic/. When analyzing small amounts of data (fewer than 100,000 sequences), the multi-FASTA file before BLAST is accepted as the input file. The VITCOMIC web site contains detailed instructions for users.
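The plotting rules in this section can be approximated by the Python sketch below. Several details are assumptions, because the text does not state them: the average is taken as an unweighted mean of the per-alignment similarities, boundary values (exactly 1%, 5%, or 10%) are assigned as shown in the comments, and the radius is interpolated linearly between the 80% and 100% circles. All function names and the default radii are hypothetical.

```python
from typing import Sequence


def average_similarity(similarities: Sequence[float]) -> float:
    """Unweighted mean of per-alignment similarities (assumed weighting)."""
    if not similarities:
        raise ValueError("at least one alignment is required")
    return sum(similarities) / len(similarities)


def dot_size_category(relative_abundance: float) -> int:
    """Return 1 (smallest dot) to 4 (largest dot) for an abundance in percent."""
    if relative_abundance < 1.0:
        return 1
    if relative_abundance < 5.0:
        return 2
    if relative_abundance <= 10.0:  # assumption: exactly 10% stays in category 3
        return 3
    return 4


def radius_for_similarity(similarity: float,
                          inner_radius: float = 1.0,
                          outer_radius: float = 5.0) -> float:
    """Place a dot between the 80% (inner) and 100% (outer) circles."""
    # Assumption: values outside 80-100% are clamped to the nearest circle.
    clamped = min(max(similarity, 80.0), 100.0)
    fraction = (clamped - 80.0) / 20.0
    return inner_radius + fraction * (outer_radius - inner_radius)
```

For example, a cluster with 90% average similarity would be drawn halfway between the innermost and outermost circles under this linear mapping.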

Figure 1 Calculation of total BLAST scores and average similarities. (A) A diagram of calculated total BLAST scores and average similarities between the sample and database sequences. (B) An example of a collapse of the alignment consistency caused by a false alignment. [Panel labels: A, normal case; B, strange case, in which alignments consistency is broken.]

Comparison of taxonomic compositions between samples

To compare taxonomic compositions between samples, VITCOMIC clusters sample sequences using single-linkage clustering with 99% similarity as follows. When a sample sequence is assigned to a reference species with a certain average similarity as described above, VITCOMIC rounds the average similarity down to an integer. If the rounded average similarity and the matched reference species are identical between sample sequences, VITCOMIC clusters these sequences together. For example, if one sequence is assigned to Bacillus subtilis with 98.8% average similarity and another sequence is assigned to B. subtilis with 98.1% average similarity, VITCOMIC places both sequences in the B. subtilis 98% cluster. After applying this single-linkage clustering based on reference sequences with 99% similarity to each sample, VITCOMIC compares the clustering results to identify common clusters between samples. When a cluster assigned to the same reference species and sequence similarity exists in both samples, the cluster is designated as a common cluster between samples. Using information on common clusters between samples, VITCOMIC creates a merged plot such as the one shown in Figure 5. Gray dots indicate common clusters between the obese and lean samples, green dots indicate specific clusters of the obese samples, and orange dots indicate specific clusters of the lean samples.
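The rounding-based clustering and the identification of common clusters can be sketched as follows. This is an illustrative reading of the description above, not VITCOMIC's source code; the function names and the representation of a cluster as a `(species, integer similarity)` pair are assumptions.

```python
import math
from collections import Counter
from typing import Iterable


def cluster_sample(assignments: Iterable[tuple[str, float]]) -> Counter:
    """Count sequences per (reference species, floored similarity) cluster."""
    # Each assignment is (nearest relative, average similarity in percent).
    return Counter(
        (species, math.floor(similarity)) for species, similarity in assignments
    )


def compare_clusters(sample_a: Counter, sample_b: Counter):
    """Return (common, only in A, only in B) cluster keys."""
    keys_a, keys_b = set(sample_a), set(sample_b)
    return keys_a & keys_b, keys_a - keys_b, keys_b - keys_a


obese = cluster_sample([
    ("Bacillus_subtilis", 98.8),
    ("Bacillus_subtilis", 98.1),
    ("Escherichia_coli", 99.5),
])
lean = cluster_sample([
    ("Bacillus_subtilis", 98.4),
    ("Akkermansia_muciniphila", 97.2),
])
common, obese_only, lean_only = compare_clusters(obese, lean)
```

With these invented assignments, the two B. subtilis sequences at 98.8% and 98.1% fall into the same 98% cluster, and that cluster is common to both samples. In a merged plot, the three returned sets would correspond to the gray, green, and orange dots, respectively.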

For statistical comparison of taxonomic compositions between samples, VITCOMIC calculates three types of similarity indices for taxonomic compositions between samples using the clustering result (Jaccard index, Lennon index, and Yue and Clayton theta index) [31]. These indices are shown in the lower-right portion of the merged plot (Figure 5).
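As a rough guide to what these indices measure, the sketch below computes them from per-cluster sequence counts. The formulas are the commonly cited forms and are an assumption about what VITCOMIC computes: Jaccard as shared clusters over all clusters, Lennon in its similarity form a / (a + min(b, c)), where a is the number of shared clusters and b and c are the clusters unique to each sample, and Yue and Clayton theta from relative abundances. The function names are hypothetical.

```python
from typing import Hashable, Mapping

Counts = Mapping[Hashable, int]  # cluster key -> number of sequences


def jaccard_index(a: Counts, b: Counts) -> float:
    """Shared clusters divided by all clusters observed in either sample."""
    keys_a, keys_b = set(a), set(b)
    union = keys_a | keys_b
    return len(keys_a & keys_b) / len(union) if union else 0.0


def lennon_index(a: Counts, b: Counts) -> float:
    """Similarity form a / (a + min(b, c)) (assumed variant)."""
    keys_a, keys_b = set(a), set(b)
    shared = len(keys_a & keys_b)
    denominator = shared + min(len(keys_a - keys_b), len(keys_b - keys_a))
    return shared / denominator if denominator else 0.0


def yue_clayton_theta(a: Counts, b: Counts) -> float:
    """Abundance-based similarity: sum(ab) / (sum(a^2) + sum(b^2) - sum(ab))."""
    total_a, total_b = sum(a.values()), sum(b.values())
    if total_a == 0 or total_b == 0:
        return 0.0
    rel_a = {k: v / total_a for k, v in a.items()}
    rel_b = {k: v / total_b for k, v in b.items()}
    cross = sum(rel_a[k] * rel_b[k] for k in rel_a.keys() & rel_b.keys())
    sq_a = sum(v * v for v in rel_a.values())
    sq_b = sum(v * v for v in rel_b.values())
    return cross / (sq_a + sq_b - cross)


sample_1 = {"x": 2, "y": 2}
sample_2 = {"y": 1, "z": 3}
```

The Jaccard and Lennon forms here use only the presence or absence of clusters, whereas theta also weights clusters by their relative abundance, which is why the three values can differ considerably for the same pair of samples.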

Results

Using VITCOMIC, the overall taxonomic compositions of both the obese and lean samples could be clearly visualized (Figure 2 = obese; Figure 3 = lean). Large colored dots indicate relatively abundant taxa in each sample (relative abundance > 1%). These large colored dots are distributed almost identically between obese and lean samples and are located at related species of Clostridium, Eubacterium, and Bacteroides. These taxa are abundant in the normal human gut microbiome [32]. Small dots located at the outermost circle indicate strains closely related to the genome-sequenced strains. These strains are Escherichia coli and Proteus mirabilis in Proteobacteria, Enterococcus faecalis and the group of Lactobacillus in Firmicutes, groups of Bifidobacterium and Propionibacterium in Actinobacteria, and Akkermansia muciniphila in Verrucomicrobia. It is well established that some of these strains inhabit the human gut, whereas others do not [33-39]. In Figures 2 and 3, several dots are distributed on the 80-90% lines, indicating that several taxa distantly related to genome-sequenced strains inhabit the human gut. These results were consistent with the study of Turnbaugh et al. [15].

Figure 2 Mapping result for the human gut sample from obese individuals. [Circle plot: names of the reference species arranged around the outermost circle; concentric circles labeled 80, 85, 90, 95 and 100; phylum color legend: Proteobacteria, Chlorobi, Bacteroidetes, Acidobacteria, Elusimicrobia, candidatus division TG1, Spirochaetes, Fusobacteria, Firmicutes, Tenericutes, Thermotogae, Dictyoglomi, Chlamydiae, Cyanobacteria, Actinobacteria, Gemmatimonadetes, Verrucomicrobia, Chloroflexi, Deinococcus-Thermus, Nitrospirae, Planctomycetes, Aquificae, Euryarchaeota, Nanoarchaeota, Crenarchaeota, Thaumarchaeota; dot-size legend ranging from relative abundance <= 1% to relative abundance > 10%.]

Differences between the obese and lean samples are clearly evident in Figure 5, which was created by the comparing function of VITCOMIC. Gray dots indicate common taxa between the obese and lean samples; green dots indicate specific taxa of the obese samples, and orange dots indicate specific taxa of the lean samples. The majority of taxa appear to be common between obese and lean samples, although certain taxa could be specific to the obese or lean sample (for example, the phylum Actinobacteria in the obese sample as described in the study of Turnbaugh et al. [15]). Figure 6 presents a higher reso-

Figure 3 Mapping result for the human gut sample from lean individuals.








onas_denitrificans

Sulfurovum

_sp.A

rcobacter_butzleri

Cam

pylobacter_hominis

Cam

pylobacter_concisus

Cam

pylobacter_curvus

Cam

pylobacter_fetus

Cam

pylobacter_lari

Cam

pylobacter_jejuni


Nitratiruptor_sp.

Salinibacter_ruber

Candidatus_A

moebophilus


Candidatus_S

ulcia

Gram

ella_forsetii














Chlorobium_lim

icola



Chlorobium_tepidum


Leptospira_biflexa





Borrelia_duttonii

Borrelia_turicatae

Borrelia_hermsii

Borrelia_afzelii

Borrelia_garinii


Treponema_denticola

Treponema_pallidum








Myxococcus_xanthus










Geobacter_sp.


Geobacter_lovleyi








uncultured_Termite



Aster_yellows











Paenibacillus_sp.



Geobacillus_sp.



Exiguobacterium_sp.




Bacillus_halodurans

Bacillus_clausii

Bacillus_pumilus


Bacillus_subtilis





Bacillus_anthracis

Bacillus_cereus

Listeria_welshimeri

Listeria_innocua









Leuconostoc_citreum


Oenococcus_oeni


s

Lactobacillus_ferm

entum


Lactobacillus_casei

Lactobacillus_sakei

Pediococcu

s_pentosa

ceus

Lactobacill

us_brevis

Lactobacillu

s_plantarum

Lactobacill

us_delbrueck

ii

Lactobacill

us_helve

ticus

Lactobacill

us_acid

ophilus

Lactobacil

lus_gass

eri

Lactobacil

lus_johnso

nii

Lactoco

ccus_

lactis

Streptoco

ccus_

mutans

Streptoco

ccus_

thermophilu

s

Streptoco

ccus_

gordonii

Streptoco

ccus_

sanguinis

Streptoco

ccus_

pneumoniae

Streptoco

ccus_

suis

Streptoco

ccus_

equi

Streptoco

ccus_

dysgalacti

ae

Streptoco

ccus_

agalactiae

Streptoco

ccus

_uberis

Streptoc

occu

s_pyo

genes

Clostri

dium

_cell

ulolyt

icum

Clostri

dium

_the

rmoc

ellum

Clostri

dium

_kluy

veri

Clostri

dium

_nov

yi

Clostri

dium

_beij

erinc

kii

Clostri

dium

_per

fring

ens

Clost

ridiu

m_b

otul

inum

Clost

ridiu

m_t

etan

i

Clost

ridiu

m_a

ceto

buty

licum

Desul

foto

mac

ulum

_red

ucen

s

Helio

bact

eriu

m_m

odes

tical

dum

Desul

fitob

acte

rium

_haf

nien

se

Nat

rana

erob

ius_

ther

mop

hilu

s

Sym

biob

acte

rium

_the

rmop

hilu

m

Synt

roph

omon

as_w

olfe

i

Car

boxy

doth

erm

us_h

ydro

geno

form

ans

Moo

rella

_the

rmoa

cetica

Pel

otom

acul

um_t

herm

opro

pion

icum

Can

dida

tus_

Des

ulfo

rudi

s

Ther

moa

naer

obac

ter_

pseu

deth

anol

icus

Ther

moa

naer

obac

ter_

sp.

Ther

moa

naer

obac

ter_

teng

cong

ensi

s

Ther

mod

esul

fovi

brio

_yel

low

ston

ii

Deh

aloc

occo

ides

_eth

enog

enes

Deh

aloc

occo

ides

_sp.

Gem

mat

imon

as_a

uran

tiaca

Rub

roba

cter

_xyl

anop

hilu

s

Bifi

doba

cter

ium

_ani

mal

is

Bifi

doba

cter

ium

_ado

lesc

entis

Bifi

doba

cter

ium

_lon

gum

Str

epto

myc

es_a

verm

itilis

Str

epto

myc

es_g

riseu

s

Str

epto

myc

es_c

oelic

olor

Kin

eoco

ccus

_rad

ioto

lera

ns

Beu

tenb

ergi

a_ca

vern

ae

Mic

roco

ccus

_lut

eus

Ren

ibac

teriu

m_s

alm

onin

arum

Art

hrob

acte

r_au

resc

ens

Art

hrob

acte

r_ch

loro

phen

olic

us

Art

hrob

acte

r_sp

.

Koc

uria

_rhi

zoph

ila

Cla

viba

cter

_mic

higa

nens

is

Leifs

onia

_xyl

i

Tro

pher

yma_

whi

pple

i

Aci

doth

erm

us_c

ellu

loly

ticus

Fra

nkia

_aln

iF

rank

ia_s

p.

Sal

inis

pora

_are

nico

la

Sal

inis

pora

_tro

pica

The

rmob

ifida

_fus

ca

Noc

ardi

oide

s_sp

.

Pro

pion

ibac

teriu

m_a

cnes

Cor

yneb

acte

rium

_kro

ppen

sted

tii

Cor

yneb

acte

rium

_effi

cien

s

Cor

yneb

acte

rium

_glu

tam

icum

Cor

yneb

acte

rium

_aur

imuc

osum

Cor

yneb

acte

rium

_dip

hthe

riae

Cor

yneb

acte

rium

_jei

keiu

m

Cor

yneb

acte

rium

_ure

alyt

icum

Sac

char

opol

yspo

ra_e

ryth

raea

Rho

doco

ccus

_jos

tiiR

hodo

cocc

us_o

pacus

Rho

doco

ccus

_ery

thro

polis

Noc

ardi

a_fa

rcin

ica

Myc

obac

teriu

m_a

bsce

ssus

Myc

obac

teriu

m_v

anba

alen

iiM

ycob

acte

rium

_gilv

umM

ycob

acte

rium

_sm

egm

atis

Myc

obac

teriu

m_s

p.M

ycob

acte

rium

_lep

rae

Myc

obac

teriu

m_a

vium

Myc

obac

teriu

m_m

arin

umM

ycob

acte

rium

_ulc

eran

sM

ycob

acte

rium

_bov

isM

ycob

acte

rium

_tub

ercu

losi

s

The

rmom

icro

bium

_ros

eum

Her

peto

siph

on_a

uran

tiacu

s

Chl

orof

lexu

s _ag

greg

ans

Chl

o rof

lexu

s_sp

.C

hlo r

ofle

xus_

aura

ntia

cus

Ros

eifle

xus_

cast

enho

lzii

Ros

eifle

xus_

sp.

The

rmus

_the

rmop

hilu

sD

eino

cocc

us_g

eoth

erm

alis

Dei

noco

ccus

_des

erti

Dei

noco

ccus

_rad

iodu

rans

Ana

eroc

ellu

m_t

herm

ophi

lum

Cal

dice

llulo

siru

ptor

_sac

char

olyt

icus

Dic

tyog

lom

us_t

urgi

dum

Dic

tyog

lom

us_t

herm

ophi

lum

Cop

roth

erm

obac

ter_

prot

eoly

ticus

Pet

roto

ga_m

obili

sK

osm

otog

a_ol

earia

Fer

vido

bact

eriu

m_n

odos

um

The

rmos

ipho

_afr

ican

us

The

rmos

ipho

_mel

anes

iens

is

The

rmot

oga_

letti

ngae

The

rmot

oga_

petr

ophi

la

The

rmot

oga_

neap

olita

na

The

rmot

oga_

sp.

The

rmot

oga_

mar

itim

aA

kker

man

sia_

muc

inip

hila

Opi

tutu

s_te

rrae

Met

hyla

cidi

philu

m_i

nfer

noru

m

Can

dida

tus_

Pro

toch

lam

ydia

Chl

amyd

ophi

la_p

neum

onia

e

Chl

amyd

ophi

la_f

elis

Chl

amyd

ophi

la_a

bortu

s

Chl

amyd

ophi

la_c

avia

e

Chl

amyd

ia_m

urid

arum

Chl

amyd

ia_t

rach

omat

is

Hal

othe

rmot

hrix

_ore

nii

Glo

eoba

cter

_vio

lace

us

Ther

mos

ynec

hoco

ccus

_elo

ngat

us

Acar

yoch

loris

_mar

ina

Tric

hode

smiu

m_e

ryth

raeu

m

Cya

noth

ece_

sp.

Nos

toc_

punc

tifor

me

Anab

aena

_var

iabi

lis

Nos

toc_

sp.

Syne

choc

ystis

_sp.

Mic

rocy

stis

_aer

ugin

osa

Syne

choc

occu

s_el

onga

tus

Syne

choc

occu

s_sp

.

Proc

hlor

ococ

cus_

mar

inus

Rhodo

pire

llula

_bal

tica

Perse

phon

ella

_mar

ina

Sulfu

rihyd

roge

nibi

um_a

zore

nse

Sulfu

rihyd

roge

nibi

um_s

p.

Hydro

geno

bacu

lum_s

p.

Aquife

x_ae

olicu

s

Therm

ofilum

_pen

dens

Caldivirg

a_maquilingensis

Pyrobacu

lum_islandicu

m

Pyrobacu

lum_calid

ifontis

Pyrobacu

lum_arsenatic

um

Pyrobacu

lum_aerophilu

m

Thermoproteus_

neutrophilu

s

Ignicocc

us_hosp

italis

Desulfu

rococc

us_ka

mchatke

nsis

Staphylotherm

us_marin

us

Hyperth

ermus_

butylicu

s

Aeropyrum_pernix

Metallosp

haera_sedula

Sulfolobus_

acidoca

ldarius

Sulfolobus_tokodaii


Sulfolobus_solfa

taricus



us





Pyrococcus_abyssi

Pyrococcus_furiosus




































Page 6 of 9

lution view of the region related to Actibobacteria in Fig-ure 5.
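The color coding used in the merged plot can be illustrated with a minimal sketch. It assumes that each sample has already been summarized as a mapping from a taxon identifier to a sequence count; the function name `color_taxa` and the input format are hypothetical and are not taken from the VITCOMIC source code.

```python
def color_taxa(obese_counts, lean_counts):
    """Assign a dot color to every taxon observed in at least one sample.

    Inputs are dicts mapping a taxon identifier to a sequence count
    (a hypothetical format chosen for this sketch). The colors follow the
    description of Figure 5: gray = common, green = obese-specific,
    orange = lean-specific.
    """
    # A taxon counts as present only if at least one sequence maps to it.
    obese = {taxon for taxon, n in obese_counts.items() if n > 0}
    lean = {taxon for taxon, n in lean_counts.items() if n > 0}

    colors = {}
    for taxon in sorted(obese | lean):
        if taxon in obese and taxon in lean:
            colors[taxon] = "gray"
        elif taxon in obese:
            colors[taxon] = "green"
        else:
            colors[taxon] = "orange"
    return colors
```

Presence or absence is the only criterion in this sketch, so a taxon supported by a single sequence in one sample is still reported as specific to that sample.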

Discussion
VITCOMIC can easily visualize the overall taxonomic compositions of large amounts of 16S rRNA gene-based community analysis data. Traditional visualization methods that rely on constructing phylogenetic trees require considerable computation time when analyzing large amounts of data [16]. Even if researchers are able to construct a phylogenetic tree, the tree itself can be difficult to analyze because it may contain too many branches [29]. By contrast, taxonomic assignments based on BLAST are fast and can be highly parallelized [40]. Although several highly accurate taxonomic assignment tools have been developed [41,42], the accuracy of BLAST-based taxonomic assignments is also well validated [29,43]. In addition, calculating total BLAST scores and applying the alignments consistency check improve the accuracy of the assignment, especially when long sequences are examined. Longer sequences containing more variable regions will generate a greater number of alignment divisions. The alignments consistency check may be necessary for studies using pyrosequencers because recently developed pyrosequencers have increased the read length to over 400 bp [44]. Although taxonomic assignment using only genome-sequenced species as references would not yield the best assignment compared with assignment using a larger database that contains uncultured bacteria [12,45], the provisional taxonomy provided by VITCOMIC is accurate enough for visual comparisons of taxonomic composition between samples.
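The idea of selecting the nearest relative by total BLAST score can be sketched as follows. This is a simplified illustration, not the VITCOMIC implementation: it assumes BLAST results in the standard 12-column tabular format (query, subject, percent identity, alignment length, ..., bit score), sums the bit scores of all alignments between a query and a reference, and reports a length-weighted average identity. The length weighting and the function name `nearest_relatives` are assumptions, and the alignments consistency check is omitted.

```python
from collections import defaultdict


def nearest_relatives(blast_lines):
    """Pick, for each query, the reference with the highest total bit score.

    blast_lines: iterable of tab-separated lines in the 12-column BLAST
    tabular format. Returns {query: (subject, total_score, avg_identity)}.
    """
    # (query, subject) -> [sum of bit scores, sum of identity*length, sum of lengths]
    totals = defaultdict(lambda: [0.0, 0.0, 0])
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        identity, length, bitscore = float(fields[2]), int(fields[3]), float(fields[11])
        acc = totals[(query, subject)]
        acc[0] += bitscore
        acc[1] += identity * length
        acc[2] += length

    best = {}
    for (query, subject), (score, ident_len, length) in totals.items():
        # Keep the first reference seen when total scores tie.
        if query not in best or score > best[query][1]:
            best[query] = (subject, score, ident_len / length)
    return best
```

Summing over all alignments matters for long reads: a reference that matches a query in two separate segments can outrank a reference with a single, slightly better local alignment.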

Compared with other tools, the most distinctive function of VITCOMIC is the simultaneous visualization and comparison of taxonomic compositions between samples (Additional file 1). Comparing taxonomic compositions between samples from different microbial communities is an effective means to better understand similarities and differences between microbial communities [10]. However, the comparison of several microbial communities can be difficult given a large number of sequences [16]. VITCOMIC can simultaneously visualize large amounts of data by merging sequence data from several community analysis projects (Additional files 2, 3, and 4). Additional file 2 visualizes 139,356 16S rRNA gene sequences obtained from various soils [13]. Additional file 3 presents seawater microbial community data derived from 452 different 16S rRNA gene surveys containing 11,144,358 sequences, which were obtained from the NCBI Sequence Read Archive [46]. Additional file 4 presents data for the human microbial communities derived from 60 different 16S rRNA gene surveys containing 4,363,040 sequences, which were obtained from the NCBI Sequence Read Archive. Although detailed comparisons among samples from different microbial communities are difficult due to the large number of sequences and differing primers, VITCOMIC showed that overall taxonomic compositions and abundant taxa are distinctly different between environments.

Figure 4 VITCOMIC flow chart. Reference construction: NCBI Genome Sequence database; RNAmmer; extract a single 16S rRNA sequence per species; reference 16S rRNA sequence database; MAFFT; dnadist in PHYLIP; neighbor in PHYLIP; reference tree; taxonomy assigned from the NCBI Taxonomy Database. Sample analysis: sample 16S rRNA sequence data (user-uploaded data); BLASTN; alignments consistency check; extract hits of each query with max total scores (nearest relative); create a plot using BLAST average similarities and names of the nearest relative.

VITCOMIC uses only the 16S rRNA gene sequences from 601 genome-sequenced bacteria as references. We built the reference database from these 601 species because of the quality and quantity of the associated biological information. These sequences are derived from genome-sequenced species, from which we can generally obtain much information about ecophysiology (i.e., metabolic potentials, habitats, gene repertoires). Therefore, by adopting genome-sequenced species as the reference database, we can infer a range of biological information for each taxon found in a 16S rRNA gene-targeted analysis by analyzing the genomic information of the nearest genome-sequenced species. These features provide valuable initial knowledge for a subsequent metagenomic analysis. To address the increasing number of genome-sequenced species, the reference database of VITCOMIC will be updated periodically.

Figure 5 Merged results for the obese and lean human gut samples. Jaccard index = 0.5739; Lennon index = 0.8437; Yue & Clayton theta = 0.9339.
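The three similarity values reported with the merged plot (Jaccard index, Lennon index, and Yue & Clayton theta) can be illustrated with a short sketch. The formulas below are the commonly used textbook definitions and are an assumption here; the exact variants computed by VITCOMIC may differ (for example, in how taxa are binned or in estimator corrections). Inputs are hypothetical dicts mapping a taxon identifier to a sequence count.

```python
def _present(counts):
    """Taxa with at least one sequence."""
    return {taxon for taxon, n in counts.items() if n > 0}


def jaccard(counts_a, counts_b):
    """Shared taxa divided by all taxa observed in either sample."""
    a, b = _present(counts_a), _present(counts_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def lennon(counts_a, counts_b):
    """Shared taxa divided by (shared + the smaller count of unique taxa)."""
    a, b = _present(counts_a), _present(counts_b)
    shared = len(a & b)
    denom = shared + min(len(a - b), len(b - a))
    return shared / denom if denom else 0.0


def yue_clayton_theta(counts_a, counts_b):
    """Abundance-based similarity: sum(p*q) / (sum(p^2) + sum(q^2) - sum(p*q))."""
    total_a, total_b = sum(counts_a.values()), sum(counts_b.values())
    if total_a == 0 or total_b == 0:
        return 0.0
    taxa = set(counts_a) | set(counts_b)
    # Relative abundance of each taxon in each sample.
    p = {t: counts_a.get(t, 0) / total_a for t in taxa}
    q = {t: counts_b.get(t, 0) / total_b for t in taxa}
    cross = sum(p[t] * q[t] for t in taxa)
    denom = sum(v * v for v in p.values()) + sum(v * v for v in q.values()) - cross
    return cross / denom
```

The first two indices use only presence or absence, whereas theta also weights taxa by relative abundance. This is one way a pair of samples can share few rare taxa and still score high on theta, as long as the abundant taxa are common to both.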

Conclusions
Using a phylogenetic relationship with genome-sequenced strains, VITCOMIC clearly presents the overall taxonomic composition of 16S rRNA gene-based microbial community analysis data. VITCOMIC facilitates an intuitive understanding of differences in community structure between samples.

Availability and requirements
• Project name: VITCOMIC
• Project home page: http://mg.bio.titech.ac.jp/vitcomic/
• Operating system(s): Platform independent
• Programming language: Perl
• Other requirements: None
• License: GNU GPL
• Any restrictions to use by non-academics: None

Additional material

Authors' contributions
HM and KK designed the study. HM developed the method and performed the analyses. FM and KK provided advice on method design and analyses. HM drafted the manuscript, and FM and KK critically revised it. All authors read and approved the final manuscript.

Acknowledgements
We thank Hiroyuki Toh, Tetsuya Hayashi and Takehiko Itoh for helpful discussions. This work was supported by a Grant-in-Aid from the Institute for Bioinformatics Research and Development, the Japan Science and Technology Agency (BIRD-JST) and a Grant-in-Aid for Scientific Research (C: 22592032).

Author Details
Department of Biological Information, Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, 4259 B-36, Nagatsuta-cho, Midori-ku, Yokohama 226-8501, Japan

References
1. Liolios K, Chen IM, Mavromatis K, Tavernarakis N, Hugenholtz P, Markowitz VM, Kyrpides NC: The Genomes On Line Database (GOLD) in 2009: status of genomic and metagenomic projects and their associated metadata. Nucleic Acids Res 2010, 38:D346-D354.

2. Rappé MS, Giovannoni SJ: The uncultured microbial majority. Annu Rev Microbiol 2003, 57:369-394.

3. Pace NR: A molecular view of microbial diversity and the biosphere. Science 1997, 276:734-740.

4. Eckburg PB, Bik EM, Bernstein CN, Purdom E, Dethlefsen L, Sargent M, Gill SR, Nelson KE, Relman DA: Diversity of the human intestinal microbial flora. Science 2005, 308:1635-1638.

5. Sogin ML, Morrison HG, Huber JA, Welch DM, Huse SM, Neal PR, Arrieta JM, Herndl GJ: Microbial diversity in the deep sea and the underexplored "rare biosphere". Proc Natl Acad Sci USA 2006, 103:12115-12120.

Additional file 1 Comparison of VITCOMIC's features relative to existing commonly used 16S rRNA gene analysis tools.

Additional file 2 Mapping result for the soil microbial community analyses data. The soil microbial community analyses data were derived from 4 different soils and included 139,356 16S rRNA gene sequences [13].

Additional file 3 Mapping result for the seawater microbial community analyses data. The seawater microbial community analyses data derived from 452 experiments that included 11,144,358 sequences were obtained from the NCBI Sequence Read Archive on December 16, 2009.

Additional file 4 Mapping result for the human microbial community analyses data. The human microbial community analyses data derived from 60 experiments that included 4,363,040 sequences were obtained from the NCBI Sequence Read Archive on December 16, 2009.

Received: 17 February 2010 Accepted: 18 June 2010 Published: 18 June 2010

This article is available from: http://www.biomedcentral.com/1471-2105/11/332

© 2010 Mori et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

BMC Bioinformatics 2010, 11:332

Figure 6 High-resolution view of the region containing the phylum Actinobacteria in Figure 5.



http://www.biomedcentral.com/content/supplementary/1471-2105-11-332-S1.XLS

http://www.biomedcentral.com/content/supplementary/1471-2105-11-332-S2.PDF




6. Van de Peer Y, Chapelle S, De Wachter R: A quantitative map of nucleotide substitution rates in bacterial rRNA. Nucleic Acids Res 1996, 24:3381-3391.

7. Mears JA, Cannone JJ, Stagg SM, Gutell RR, Agrawal RK, Harvey SC: Modeling a minimal ribosome based on comparative sequence analysis. J Mol Biol 2002, 321:215-234.

8. Jain R, Rivera MC, Lake JA: Horizontal gene transfer among genomes: the complexity hypothesis. Proc Natl Acad Sci USA 1999, 96:3801-3806.

9. Ludwig W, Strunk O, Westram R, Richter L, Meier H, Buchner A, Lai T, Steppi S, Jobb G, Förster W, Brettske I, Gerber S, Ginhart AW, Gross O, Grumann S, Hermann S, Jost R, König A, Liss T, Lüssmann R, May M, Nonhoff B, Reichel B, Strehlow R, Stamatakis A, Stuckmann N, Vilbig A, Lenke M, Ludwig T, Bode A, Schleifer KH: ARB: a software environment for sequence data. Nucleic Acids Res 2004, 32:1363-1371.

10. Lozupone C, Knight R: UniFrac: a new phylogenetic method for comparing microbial communities. Appl Environ Microbiol 2005, 71:8228-8235.

11. Schloss PD, Handelsman J: Introducing DOTUR, a computer program for defining operational taxonomic units and estimating species richness. Appl Environ Microbiol 2005, 71:1501-1506.

12. Cole JR, Wang Q, Cardenas E, Fish J, Chai B, Farris RJ, Kulam-Syed-Mohideen AS, McGarrell DM, Marsh T, Garrity GM, Tiedje JM: The Ribosomal Database Project: improved alignments and new tools for rRNA analysis. Nucleic Acids Res 2009, 37:D141-D145.

13. Roesch LF, Fulthorpe RR, Riva A, Casella G, Hadwin AK, Kent AD, Daroub SH, Camargo FA, Farmerie WG, Triplett EW: Pyrosequencing enumerates and contrasts soil microbial diversity. ISME J 2007, 1:283-290.

14. Armougom F, Raoult D: Exploring microbial diversity using 16S rRNA high-throughput methods. J Comput Sci Syst Biol 2009, 2:69-92.

15. Turnbaugh PJ, Hamady M, Yatsunenko T, Cantarel BL, Duncan A, Ley RE, Sogin ML, Jones WJ, Roe BA, Affourtit JP, Egholm M, Henrissat B, Heath AC, Knight R, Gordon JI: A core gut microbiome in obese and lean twins. Nature 2009, 457:480-484.

16. Sun Y, Cai Y, Liu L, Yu F, Farrell ML, McKendree W, Farmerie W: ESPRIT: estimating species richness using large collections of 16S rRNA pyrosequences. Nucleic Acids Res 2009, 37:e76.

17. Letunic I, Bork P: Interactive Tree Of Life (iTOL): an online tool for phylogenetic tree display and annotation. Bioinformatics 2007, 23:127-128.

18. Kemena C, Notredame C: Upcoming challenges for multiple sequence alignment methods in the high-throughput era. Bioinformatics 2009, 25:2455-2465.

19. Bent SJ, Forney LJ: The tragedy of the uncommon: understanding limitations in the analysis of microbial diversity. ISME J 2008, 2:689-695.

20. NCBI Genome Database [ftp://ftp.ncbi.nih.gov/genbank/genomes/Bacteria/]

21. Lagesen K, Hallin P, Rødland EA, Staerfeldt HH, Rognes T, Ussery DW: RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res 2007, 35:3100-3108.

22. Acinas SG, Marcelino LA, Klepac-Ceraj V, Polz MF: Divergence and redundancy of 16S rRNA sequences in genomes with multiple rrn operons. J Bacteriol 2004, 186:2629-2635.

23. Yarza P, Richter M, Peplies J, Euzeby J, Amann R, Schleifer KH, Ludwig W, Glöckner FO, Rosselló-Móra R: The All-Species Living Tree project: a 16S rRNA-based phylogenetic tree of all sequenced type strains. Syst Appl Microbiol 2008, 31:241-250.

24. Katoh K, Toh H: Improved accuracy of multiple ncRNA alignment by incorporating structural information into a MAFFT-based framework. BMC Bioinformatics 2008, 9:212.

25. Kimura M: A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J Mol Evol 1980, 16:111-120.

26. Felsenstein J: PHYLIP-Phylogeny inference package (Version 3.2). Cladistics 1989, 5:164-166.

27. NCBI Taxonomy Database [http://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html/index.cgi]

28. Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen YJ, Chen Z, Dewell SB, Du L, Fierro JM, Gomes XV, Godwin BC, He W, Helgesen S, Ho CH, Irzyk GP, Jando SC, Alenquer ML, Jarvie TP, Jirage KB, Kim JB, Knight JR, Lanza JR, Leamon JH, Lefkowitz SM, Lei M, Li J, Lohman KL, Lu H, Makhijani VB, McDade KE, McKenna MP, Myers EW, Nickerson E, Nobile JR, Plant R, Puc BP, Ronan MT, Roth GT, Sarkis GJ, Simons JF, Simpson JW, Srinivasan M, Tartaro KR, Tomasz A, Vogt KA, Volkmer GA, Wang SH, Wang Y, Weiner MP, Yu P, Begley RF, Rothberg JM: Genome sequencing in microfabricated high-density picolitre reactors. Nature 2005, 437:376-380.

29. Hamady M, Lozupone C, Knight R: Fast UniFrac: facilitating high-throughput phylogenetic analyses of microbial communities including analysis of pyrosequencing and PhyloChip data. ISME J 2010, 4:17-27.

30. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol 1990, 215:403-410.

31. Chao A, Chazdon RL, Colwell RK, Shen TJ: Abundance-based similarity indices and their estimation when there are unseen species in samples. Biometrics 2006, 62:361-371.

32. Kurokawa K, Itoh T, Kuwahara T, Oshima K, Toh H, Toyoda A, Takami H, Morita H, Sharma VK, Srivastava TP, Taylor TD, Noguchi H, Mori H, Ogura Y, Ehrlich DS, Itoh K, Takagi T, Sakaki Y, Hayashi T, Hattori M: Comparative metagenomics revealed commonly enriched gene sets in human gut microbiomes. DNA Res 2007, 14:169-181.

33. Paulsen IT, Banerjei L, Myers GS, Nelson KE, Seshadri R, Read TD, Fouts DE, Eisen JA, Gill SR, Heidelberg JF, Tettelin H, Dodson RJ, Umayam L, Brinkac L, Beanan M, Daugherty S, DeBoy RT, Durkin S, Kolonay J, Madupu R, Nelson W, Vamathevan J, Tran B, Upton J, Hansen T, Shetty J, Khouri H, Utterback T, Radune D, Ketchum KA, Dougherty BA, Fraser CM: Role of mobile DNA in the evolution of vancomycin-resistant Enterococcus faecalis. Science 2003, 299:2071-2074.

34. Brüggemann H, Henne A, Hoster F, Liesegang H, Wiezer A, Strittmatter A, Hujer S, Dürre P, Gottschalk G: The complete genome sequence of Propionibacterium acnes, a commensal of human skin. Science 2004, 305:671-673.

35. Derrien M, Collado MC, Ben-Amor K, Salminen S, de Vos WM: The Mucin degrader Akkermansia muciniphila is an abundant resident of the human intestinal tract. Appl Environ Microbiol 2008, 74:1646-1648.

36. Morita H, Toh H, Fukuda S, Horikawa H, Oshima K, Suzuki T, Murakami M, Hisamatsu S, Kato Y, Takizawa T, Fukuoka H, Yoshimura T, Itoh K, O'Sullivan DJ, McKay LL, Ohno H, Kikuchi J, Masaoka T, Hattori M: Comparative genome analysis of Lactobacillus reuteri and Lactobacillus fermentum reveal a genomic island for reuterin and cobalamin production. DNA Res 2008, 15:151-161.

37. Oshima K, Toh H, Ogura Y, Sasamoto H, Morita H, Park SH, Ooka T, Iyoda S, Taylor TD, Hayashi T, Itoh K, Hattori M: Complete genome sequence and comparative analysis of the wild-type commensal Escherichia coli strain SE11 isolated from a healthy adult. DNA Res 2008, 15:375-386.

38. Pearson MM, Sebaihia M, Churcher C, Quail MA, Seshasayee AS, Luscombe NM, Abdellah Z, Arrosmith C, Atkin B, Chillingworth T, Hauser H, Jagels K, Moule S, Mungall K, Norbertczak H, Rabbinowitsch E, Walker D, Whithead S, Thomson NR, Rather PN, Parkhill J, Mobley HL: Complete genome sequence of uropathogenic Proteus mirabilis, a master of both adherence and motility. J Bacteriol 2008, 190:4027-4037.

39. Sela DA, Chapman J, Adeuya A, Kim JH, Chen F, Whitehead TR, Lapidus A, Rokhsar DS, Lebrilla CB, German JB, Price NP, Richardson PM, Mills DA: The genome sequence of Bifidobacterium longum subsp. infantis reveals adaptations for milk utilization within the infant microbiome. Proc Natl Acad Sci USA 2008, 105:18964-18969.

40. Mathog DR: Parallel BLAST on split databases. Bioinformatics 2003, 19:1865-1866.

41. Wang Q, Garrity GM, Tiedje JM, Cole JR: Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl Environ Microbiol 2007, 73:5261-5267.

42. Clemente JC, Jansson J, Valiente G: Accurate taxonomic assignment of short pyrosequencing reads. Pac Symp Biocomput 2010:3-9.

43. Wu D, Hartman A, Ward N, Eisen JA: An automated phylogenetic tree-based small subunit rRNA taxonomy and alignment pipeline (STAP). PLoS One 2008, 3:e2566.

44. Roche 454 sequencer web page [http://454.com/]

45. Desantis TZ, Hugenholtz P, Larsen N, Rojas M, Brodie EL, Keller K, Huber T, Dalevi D, Hu P, Andersen GL: Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Appl Environ Microbiol 2006, 72:5069-5072.

46. NCBI Sequence Read Archive [http://www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi]

doi: 10.1186/1471-2105-11-332

Cite this article as: Mori et al., VITCOMIC: visualization tool for taxonomic compositions of microbial communities based on 16S rRNA gene sequences. BMC Bioinformatics 2010, 11:332













