Clinical Metagenomics for Rapid Detection of Enteric Pathogens and Characterization of the Intestinal Microbiome


Dr. Rita R. Colwell

University of Maryland, College Park

Johns Hopkins University Bloomberg School of Public Health

CosmosID Inc.

History/Cholera

Metagenomics

Microbiomes in Health and Disease

Demo

A Timeline of Microbiology

Culture - Numerical Taxonomy

Nucleic Acid (Base Composition)

Density Gradient Hybridization

Fluorescent Antibody Microscopy

Polymerase Chain Reaction (PCR)

Next Gen Sequencing

Metagenomics

Timeline axis: 1960, 1965, 1970, 1975, 1985, 1996, 2000, 2008

First Numerical Approaches to Microbial Taxonomy






Culture Limitations

A curiosity: not all bacteria grow in laboratory culture

Solving the Mystery of the “Viable But Non-Culturable (VBNC)” Bacteria







JD Oliver, Journal of Microbiology, 2005 Feb; 43 Spec No: 93-100

Over 400 papers have appeared on the VBNC phenomenon, and over 1,000 papers describe various aspects of it.

Bacteria Described as Entering the VBNC State

Cholera: A Global Disease

• Acute water-related diarrheal disease

• Seventh pandemic started in the 1960s

• Occurs in more than 50 countries affecting approximately 7 million people

• Bengal Delta is known as “native homeland” of cholera outbreaks

• Since cholera bacteria exist naturally in aquatic habitats and there is evidence of new biotypes emerging, it is highly unlikely that cholera will be eradicated, but it clearly can be controlled by provision of safe drinking water.

The Copepod Vector

Copepods were found to carry approximately 10,000 to 50,000 CFU of V. cholerae per copepod







A Simple Solution for Cholera Prevention: Sari Filtration

The move to genomics…early sixties to the present

Source: The Institute for Genomic Research

Vibrio cholerae

Sequenced and published in 2000

Small Chromosome

Large Chromosome







Genomic reassortment, not serotype, defines epidemic clones


Role of Microbiome in Health and Wellness

• Acne
• Alzheimer’s Disease
• Antibiotic-associated Diarrhea
• Atherosclerosis & Arthritis
• Asthma/Allergies
• Attention Deficit Hyperactivity Disorder
• Autism
• Autoimmune Diseases (Multiple Sclerosis, Lupus, Rheumatoid Arthritis)
• Bipolar Disorder
• Cancer
• Chronic Fatigue / Fibromyalgia
• Coeliac Disease
• Crohn’s Disease
• Cystic Fibrosis
• Dental Cavities
• Depression and Anxiety
• Diabetes Type 1 & 2
• Epilepsy
• Eczema
• Irritable Bowel Syndrome
• Gastric Ulcers
• Malnutrition
• Narcolepsy
• Obesity
• Parkinson’s Disease
• Ulcerative Colitis

How It Works

Biological specimen → Community DNA → DNA Sequencing → Raw Sequence Reads → GenBook® Biomarker Matching → Identified Bacteria → Microbial Identification & Pathogen Characterization

Reference resources: GenBook® Database; GenBook® AR/VF Library (markers shown: TetR, CIPR, mecA, ctxA)

CosmosID for Automated Metagenomics in the 21st Century

CosmosID Analyzes Microbial DNA sequences

• Growing proprietary database of 65,000 microbial genomes (bacteria, viruses, fungi and parasites)

• In minutes, our software solution delivers unrivaled specificity and sensitivity

• Microbial identification at subspecies/strain level

• Relative abundance of the microbial community

• Presence of antibiotic resistance and virulence factors

• Sequencing platform agnostic

• Can handle both short-read and long-read NGS data

• Illumina, ThermoFisher, Pacific Biosciences and Oxford Nanopore

• Commercial products (software and databases are well maintained and continuously updated)

Infectious Disease – Rapid Evolution

§ Previously recognized pathogens are evolving faster.

§ New, potentially dangerous pathogens are emerging every year.

§ Nosocomial and mixed microbial infections are dramatically increasing.

§ Many acute infectious diseases have unknown or poorly known etiology

§ Resident microflora in health and wellness

Source: Clinical Infectious Diseases 2013;57(S3):S139–70

[Figure: benchmark of metagenomic classifiers (CosmosID, One Codex, PhyloSift, CLARK, CLARK-S, Kraken, NBC, LMAT, MetaPhlAn, DIAMOND-MEGAN, BLAST-MEGAN; filtered and unfiltered variants). Panels A to D: F1-score (presence/absence) on the HC1 and ds.soil datasets, axis 0.0 to 1.0; percent (0 to 100) per tool at subspecies, species and genus level. Metrics shown: F1-score, precision, recall, AUPR.]

CosmosID Performance

Unpublished Data:

• 34 datasets of varying complexity and diversity

• 12 tools
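The presence/absence metrics named in the benchmark figure (precision, recall, F1-score) can be illustrated with a minimal sketch. The taxa below are made-up illustration values, not benchmark data, and the function name `presence_absence_scores` is hypothetical; the benchmark's own scoring code is not shown in the slides.

```python
def presence_absence_scores(predicted, truth):
    """Precision, recall and F1 for a set of called taxa against the known set."""
    predicted, truth = set(predicted), set(truth)
    tp = len(predicted & truth)   # taxa correctly reported
    fp = len(predicted - truth)   # taxa reported but not in the sample
    fn = len(truth - predicted)   # taxa in the sample but not reported
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: four organisms truly present, three reported by a tool.
truth = {"Vibrio cholerae", "Escherichia coli",
         "Shigella flexneri", "Campylobacter jejuni"}
predicted = {"Vibrio cholerae", "Escherichia coli", "Salmonella enterica"}

precision, recall, f1 = presence_absence_scores(predicted, truth)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

F1 is the harmonic mean of precision and recall, so a tool that reports many false positives or misses many true taxa is penalized either way.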


Collaboration with Chris Mason, Weill Cornell Medicine

Accuracy of Relative Abundance

[Figure: percent difference of estimated to true abundance, per tool, axis from −100 to 800. Tools: BLAST-MEGAN, BLAST-MEGAN-Liberal, CLARK, CLARK-S, CosmosID, CosmosID-filtered, DIAMOND, Kraken, Kraken-filtered, LMAT, MetaPalette, MetaPalette-Specific, MetaPhlAn2, NBC, One Codex, One Codex-filtered, PhyloSift, PhyloSift-filtered. Panels: BioPool & NARG1 samples; HC and LC samples. Panel group labels: Synthetic Datasets, Biological Datasets.]

Detection is one problem; abundance is much harder
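As a rough illustration of the quantity plotted in the abundance figure, the sketch below computes a percent difference between an estimated and a true relative abundance. The exact formula used for the figure is not given in the slides; this assumes (estimated − true) / true × 100, which is consistent with the axis starting at −100 (an organism that is present but not reported at all). The function name and the numbers are hypothetical.

```python
def percent_difference(estimated, true):
    """Signed percent difference of an abundance estimate from the true value.

    Assumed definition: 100 * (estimated - true) / true.
    -100 means the organism was missed entirely; 0 means a perfect estimate.
    """
    if true <= 0:
        raise ValueError("true abundance must be positive")
    return 100.0 * (estimated - true) / true


# Hypothetical organism at 10% true relative abundance.
true_abundance = 10.0
for estimate in (0.0, 8.0, 10.0, 20.0, 80.0):
    diff = percent_difference(estimate, true_abundance)
    print(f"estimated {estimate:5.1f}% -> {diff:+.0f}% difference")
```

Because the measure is relative to the true value, overestimates are unbounded while underestimates bottom out at −100, which is one reason abundance errors can look far worse than detection errors.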

Sample to Analysis with CLC + CosmosID Plugin

Biological specimens (e.g., stool, CSF) → Sequencing → WGS FASTA or FASTQ file → CosmosID CLC Plugin (best match against reference genomes X, Y, Z)

CosmosID CLC Plugin output:

• Microbial identification (subspecies & strain)

• Antibiotic resistance

• Virulence factors

• Relative abundance

• Bacteria, fungi, protists, viruses

Further analysis using CLC:

• Functional

• Mapping

• Assembly

Curated Genome Databases

§ Most comprehensive and largest curated databases (>65,000 genomes)

§ Organized as phylogenetic trees

§ Two types of biomarkers:

§ Unique to the organism, and

§ Shared across the phylogenetic lineage in the tree


CosmosID Use Cases

Some Applications:

• Microbiome Research
• Clinical Metagenomics
• Emerging and Re-emerging Pathogen Discovery
• Polymicrobial Infection Dynamics
• Hospital-associated Infections
• Outbreak Investigation
• Food Safety
• Functional Food
• Human Microbiome
• Home Microbiome
• Subway Microbiome
• Animal Microbiome
• Pharmaceuticals R&D
• Oil Metagenome
• Environmental Screening
• Cosmetics
• Clinical Trial

Analyzed >30K Biological Samples


Microbiomes in Health and Disease: Microbiome Analysis of Acute Diarrheal Patients Compared with Healthy Individuals

Microbiome of Acute Diarrheal Patients Compared with Healthy Individuals

In collaboration with the National Institute of Cholera and Enteric Diseases (NICED), Calcutta, India

Total # of NICED samples: 74

• Indian Healthy Control (HC): 20

• Sick with Unknown Etiology (UE): 28

• Sick with Known Etiology (KE): 26

Healthy Human Microbiome Project (HMP): 20

Enteric Pathogens Monitored By NICED

Bacteria: Vibrio cholerae, Vibrio parahaemolyticus, Vibrio fluvialis, Aeromonas spp., Campylobacter jejuni, Campylobacter coli, Shigella, Salmonella, Escherichia coli

Viruses: Rotavirus, Adenovirus, Norovirus, Sapovirus, Astrovirus

Parasites: Giardia lamblia, Cryptosporidium parvum, Entamoeba histolytica, Blastocystis hominis

A Subpopulation is Overrepresented in Diarrheal Patients Compared to Healthy Individuals

γ-Proteobacteria, Firmicutes, Bacteroidetes

[Figure: family-level heatmap of taxa (e.g., Enterobacteriaceae, Bacteroidaceae, Lachnospiraceae, Ruminococcaceae, Prevotellaceae) across individual samples, with samples grouped as HC, HMP, KE and UE and row blocks labeled Diarrheal Patients and Healthy Individuals.]

Many pathogens can readily be identified from disease patients

[Figure: heatmap of individual samples (labeled HC, KE or UE) against Escherichia coli reference strains and Shigella sp. D9 (organism identification) and against E. coli gene markers such as ampC, cat, lta, sta, aggR and papC (accessory gene characterization). Sample groups marked: Unknown Etiology, Known Etiology, Healthy Controls.]

Identify Organism, Characterize for Accessory Genes

Unknown Etiology samples predominantly contain members of the E. coli superfamily

[Figure: NICED PCA, Bray-Curtis distance; samples colored by treatment group (HMP, HC, UE, KE); axes from -1 to 1.]

Microbiome of Healthy People in India Different From That of Western Europeans

Healthy Control, Unknown Etiology, Known Etiology

HMP
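The ordination on this slide is based on Bray-Curtis distance between samples. A minimal sketch of that dissimilarity for two taxonomic profiles follows; the family-level abundances are made-up illustration values, not study data, and `bray_curtis` is a hypothetical helper rather than the tool actually used for the analysis.

```python
def bray_curtis(profile_a, profile_b):
    """Bray-Curtis dissimilarity between two abundance profiles.

    Profiles are dicts mapping taxon name -> abundance; a taxon missing
    from one profile counts as abundance 0. Returns a value in [0, 1],
    where 0 means identical profiles and 1 means no shared taxa.
    """
    taxa = set(profile_a) | set(profile_b)
    numerator = sum(abs(profile_a.get(t, 0.0) - profile_b.get(t, 0.0)) for t in taxa)
    denominator = sum(profile_a.get(t, 0.0) + profile_b.get(t, 0.0) for t in taxa)
    if denominator == 0:
        raise ValueError("both profiles are empty")
    return numerator / denominator


# Hypothetical relative abundances (fractions summing to 1 per sample).
healthy = {"Bacteroidaceae": 0.5, "Lachnospiraceae": 0.3, "Ruminococcaceae": 0.2}
diarrheal = {"Enterobacteriaceae": 0.7, "Bacteroidaceae": 0.2, "Vibrionaceae": 0.1}

distance = bray_curtis(healthy, diarrheal)
print(f"Bray-Curtis distance: {distance:.2f}")
```

A matrix of such pairwise distances is what an ordination method then projects onto two axes, so samples with similar community composition land close together.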

Number of Individuals with AMR genes present in microbiome

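[Figure: number of individuals in each group (Unknown Etiology, Known Etiology, Healthy or Asymptomatic Control) whose microbiome carries genes for each antibiotic class: beta-lactamase, tetracycline, sulphonamide, rifampicin, quinolone, fosfomycin, nitroimidazole, phenicol, macrolide, trimethoprim, aminoglycoside. Color scale from 0 to 15.]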

Genes which match at > 50% coverage

HMP samples had no genes present which matched at this level of coverage
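The slides do not say how coverage was computed. One common reading, assumed here, is the fraction of a reference gene's bases covered by at least one aligned read, with a gene counted as present when that fraction exceeds 0.5. The sketch below uses hypothetical alignment intervals and a hypothetical helper name.

```python
def covered_fraction(gene_length, hits):
    """Fraction of a gene covered by at least one alignment.

    hits: list of (start, end) half-open intervals in gene coordinates.
    Overlapping intervals are counted once.
    """
    covered = 0
    current_end = 0  # rightmost position already counted
    for start, end in sorted(hits):
        start = max(start, current_end, 0)
        end = min(end, gene_length)
        if end > start:
            covered += end - start
            current_end = end
    return covered / gene_length


COVERAGE_THRESHOLD = 0.5  # "> 50% coverage" from the slide

# Hypothetical read alignments on two 1,000 bp resistance genes.
gene_a_hits = [(0, 300), (200, 450), (800, 900)]
gene_b_hits = [(0, 100)]

fraction_a = covered_fraction(1000, gene_a_hits)
fraction_b = covered_fraction(1000, gene_b_hits)
print(fraction_a, fraction_a > COVERAGE_THRESHOLD)
print(fraction_b, fraction_b > COVERAGE_THRESHOLD)
```

Requiring coverage across most of the gene, rather than a single matching read, reduces calls that come from short conserved regions shared with unrelated genes.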

Predominance of genes related to carbohydrate metabolism

[Figure: average abundance (%) of functional groups, shown in two panels (Level 1 abundance > 5% and Level 1 abundance < 5%), each broken down into subcategories. Top-level groups shown include Amino Acids and Derivatives; Carbohydrates; Cell Wall and Capsule; Cofactors, Vitamins, Prosthetic Groups, Pigments; DNA Metabolism; Membrane Transport; Protein Metabolism; and unclassified. Axis: Average Abundance (%), ticks at 5, 10, 15.]

Qiagen Functional analysis – Reinforces the Community Composition

• Beta-diversity analysis

• Permutation analysis (significance of clustering)

• Differential abundance analysis

• Evaluation of differentially abundant functions

[Figure: HMP versus Indian healthy samples, clustering based on Pfam and clustering based on Gene Ontology (GO).]

Summary

• Microbial communities found in the healthy volunteers suggest that the microbiome of healthy humans of Indian descent is markedly different from that of humans of Western European descent.

• The Indian population may tolerate low numbers of pathogenic microorganisms that might indicate a “disease state” in people of Western European descent.

• Metadata revealed that patients who exhibited profound watery diarrhea carried in their microbiome pathogens primarily of the Escherichia coli complex, namely pathogenic E. coli and Shigella species.

• Multiple pathogens can readily be identified from diseased patients.

• The microbial community of the Indian population encodes an alarming number of antibiotic resistance genes.

• Functional analysis of the Indian microbiome indicates a predominance of carbohydrate metabolism genes.

• An overabundance of cyto-/hemolysis genes observed in unknown-etiology samples helps explain the diseased state.

Pilot Studies Underway: NGS based (culture free) direct detection

Study Area: Sample Type

• Wound and surgical infections: tissues, aspirates, swabs

• Infective endocarditis: cardiac valves

• Broad-range pathogen detection: CSF and biopsies

• HCAP: BAL specimens

• Necrotizing fasciitis: tissues (muscle, lung, liver)

• Neonatal sepsis: blood, CSF, urine

• Orthopedic infection: pure isolates

• Cystic fibrosis and UTI: stool and urine

• Healthcare-associated infections (strain subtyping and molecular epidemiology): hospital isolates

• HAI (infection control and source tracking): isolates and biofilms

• Broad-range pathogen detection: blood, ulcer, isolates

• Empyema: pleural effusion

• Neutropenic infection & aseptic meningitis: blood, CSF, throat swab, etc.

• Strain ID and sub-typing: clinical isolates

• Prosthetic joint infection: tissue, swab

• Lyme disease: blood, CSF

CosmosID is working with the top research hospitals in the United States and Europe

Enabling End to End Microbiome Research with Great Partners

Sample Collection & Biobanking

DNA Isolation

Sequencing

High Resolution Microbial Characterization and Identification

Functional Analysis

Sequencing Partners

Sample

Action


How CosmosID Works

Reference database

Metagenomic Sample

Unique and Shared Regions Identified

Sample Matched with Database

Identification:

Staphylococcus aureus subsp. aureus USA300 TCH959

Propionibacterium acnes KPA171202

Enterococcus faecalis OG1RF
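The idea of unique and shared regions can be illustrated with a toy k-mer sketch. This is not CosmosID's algorithm or database, which the slides describe only at a high level; it assumes biomarkers are short k-mers, that "unique" means present in exactly one reference genome and "shared" means present in every genome of a lineage. All sequences, names and functions below are hypothetical.

```python
def kmers(sequence, k):
    """All distinct substrings of length k in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def build_biomarkers(genomes, k):
    """Split reference k-mers into per-genome unique sets and a lineage-shared set."""
    sets = {name: kmers(seq, k) for name, seq in genomes.items()}
    unique = {}
    for name, own in sets.items():
        others = set().union(*(s for n, s in sets.items() if n != name))
        unique[name] = own - others          # found in this genome only
    shared = set.intersection(*sets.values())  # found in every genome
    return unique, shared


def match_sample(reads, unique, shared, k):
    """Fraction of each genome's unique markers, and of shared markers, seen in the reads."""
    sample = set().union(*(kmers(read, k) for read in reads))
    strain_scores = {name: len(sample & marks) / len(marks)
                     for name, marks in unique.items() if marks}
    lineage_score = len(sample & shared) / len(shared) if shared else 0.0
    return strain_scores, lineage_score


K = 4  # toy value; real k-mer lengths would be much longer
# Two hypothetical strains of one lineage sharing the prefix ACGTACGG.
reference = {"strain_A": "ACGTACGGTTCA", "strain_B": "ACGTACGGAAGC"}
unique_markers, shared_markers = build_biomarkers(reference, K)

reads = ["GTACGGTT", "CGGTTCA"]  # hypothetical reads from a sample
strain_scores, lineage_score = match_sample(reads, unique_markers, shared_markers, K)
print(strain_scores, lineage_score)
```

In this toy case the shared markers place the sample in the lineage while the unique markers separate the two strains, which mirrors the two biomarker types described for the curated databases.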

Thank you!
