www.sciencetranslationalmedicine.org/cgi/content/full/10/472/eaap8914/DC1

Supplementary Materials for

Gut microbiota composition and functional changes in inflammatory bowel disease and irritable bowel syndrome

Arnau Vich Vila, Floris Imhann, Valerie Collij, Soesma A. Jankipersadsing, Thomas Gurry, Zlatan Mujagic, Alexander Kurilshikov, Marc Jan Bonder, Xiaofang Jiang, Ettje F. Tigchelaar, Jackie Dekens, Vera Peters,

Michiel D. Voskuil, Marijn C. Visschedijk, Hendrik M. van Dullemen, Daniel Keszthelyi, Morris A. Swertz, Lude Franke, Rudi Alberts, Eleonora A. M. Festen, Gerard Dijkstra, Ad A. M. Masclee, Marten H. Hofker,

Ramnik J. Xavier, Eric J. Alm, Jingyuan Fu, Cisca Wijmenga, Daisy M. A. E. Jonkers, Alexandra Zhernakova, Rinse K. Weersma*

*Corresponding author. Email: [email protected]

Published 19 December 2018, Sci. Transl. Med. 10, eaap8914 (2018)

DOI: 10.1126/scitranslmed.aap8914

The PDF file includes:

Materials and Methods
Fig. S1. Comparison of microbial richness between cohorts.
Fig. S2. Venn diagram of overlapping taxa between IBD and clinical IBS.
Fig. S3. Cohorts, sample collection, and sample processing algorithm.
Fig. S4. Principal coordinate analysis plot on Bray-Curtis dissimilarities of controls.
Fig. S5. Phenotype data processing algorithm.
Fig. S6. Metagenomic sequencing data pipeline.
Fig. S7. Overview of statistical analyses.
Fig. S8. Prediction model to distinguish cohort of origin in disease.
Fig. S9. Prediction model to distinguish cohort of origin in controls.
References (37–51)

Other Supplementary Material for this manuscript includes the following: (available at www.sciencetranslationalmedicine.org/cgi/content/full/10/472/eaap8914/DC1)

Table S1 (Microsoft Excel format). Summary statistics of phenotypes.
Table S2 (Microsoft Excel format). Summary statistics of gut microbiome taxonomy.
Table S3 (Microsoft Excel format). Variables included in linear models case-control analyses.
Table S4 (Microsoft Excel format). Taxonomy results of CD versus controls.
Table S5 (Microsoft Excel format). Taxonomy results of UC versus controls.
Table S6 (Microsoft Excel format). Taxonomy results of IBS-GE versus controls.
Table S7 (Microsoft Excel format). Taxonomy results in the overlap of IBD and IBS-GE.
Table S8 (Microsoft Excel format). Taxonomy results of IBS-POP versus controls.
Table S9 (Microsoft Excel format). Taxonomy results of all diseases versus controls.
Table S10 (Microsoft Excel format). Strain diversity results of all diseases versus controls.
Table S11 (Microsoft Excel format). Bacterial growth rate results of all diseases versus controls.
Table S12 (Microsoft Excel format). Prediction accuracy of all prediction models.
Table S13 (Microsoft Excel format). Top 20 gut microbiome features in the prediction model.
Table S14 (Microsoft Excel format). Summary statistics of gut microbiome MetaCyc function.
Table S15 (Microsoft Excel format). Pathway results of all diseases versus controls.
Table S16 (Microsoft Excel format). Virulence factor results of all diseases versus controls.
Table S17 (Microsoft Excel format). Antibiotic resistance gene results of all diseases versus controls.
Table S18 (Microsoft Excel format). Associated phenotypes on gene richness, gut microbiome composition, and Shannon index in CD.
Table S19 (Microsoft Excel format). Associated phenotypes on gene richness, gut microbiome composition, and Shannon index in UC.
Table S20 (Microsoft Excel format). Associated phenotypes on gene richness, gut microbiome composition, and Shannon index in IBS-GE.
Table S21 (Microsoft Excel format). Associated phenotypes on gene richness, gut microbiome composition, and Shannon index in IBS-POP.
Table S22 (Microsoft Excel format). Correlation phenotypic factors with an FDR <0.1 from the Adonis analysis in CD.
Table S23 (Microsoft Excel format). Correlation phenotypic factors with an FDR <0.1 from the Adonis analysis in UC.
Table S24 (Microsoft Excel format). Correlation phenotypic factors with an FDR <0.1 from the Adonis analysis in IBS-GE.
Table S25 (Microsoft Excel format). Correlation phenotypic factors with an FDR <0.1 from the Adonis analysis in IBS-POP.
Table S26 (Microsoft Excel format). Variables included in multivariate linear models within disease cohorts.
Table S27 (Microsoft Excel format). Taxonomy results within the CD univariate model.
Table S28 (Microsoft Excel format). Taxonomy results within the CD multivariate model.
Table S29 (Microsoft Excel format). Taxonomy results within the UC univariate model.
Table S30 (Microsoft Excel format). Taxonomy results within the UC multivariate model.
Table S31 (Microsoft Excel format). Taxonomy results within the IBS-GE univariate model.
Table S32 (Microsoft Excel format). Taxonomy results within the IBS-GE multivariate model.
Table S33 (Microsoft Excel format). Taxonomy results within the IBS-POP univariate model.
Table S34 (Microsoft Excel format). Taxonomy results within the IBS-POP multivariate model.
Table S35 (Microsoft Excel format). Pathway results within the CD univariate model.
Table S36 (Microsoft Excel format). Pathway results within the CD multivariate model.
Table S37 (Microsoft Excel format). Pathway results within the UC univariate model.
Table S38 (Microsoft Excel format). Pathway results within the UC multivariate model.
Table S39 (Microsoft Excel format). Pathway results within the IBS-GE univariate model.
Table S40 (Microsoft Excel format). Pathway results within the IBS-GE multivariate model.
Table S41 (Microsoft Excel format). Pathway results within the IBS-POP univariate model.
Table S42 (Microsoft Excel format). Pathway results within the IBS-POP multivariate model.
Table S43 (Microsoft Excel format). Cohort-associated taxa and IBD versus IBS taxonomical associations.

SUPPLEMENTARY MATERIALS

Materials and Methods

I. Cohorts, sample collection, sample processing, and metagenomic sequencing

Stool samples and phenotypic data of three cohorts were collected and uniformly processed

using the algorithm depicted in Fig. S3.

A. Cohorts

In this study, we used data and biomaterials from three cohorts from the Netherlands:

Cohort 1 - LifeLines DEEP

The first cohort is the LifeLines DEEP cohort, which is a subset of the Dutch general population

cohort, LifeLines. Both LifeLines and LifeLines DEEP have been previously described (37, 38).

In summary, LifeLines is a three-generation cohort that comprises approximately 167,000

participants residing in the three northern provinces of the Netherlands. All participants will be

followed up prospectively for at least 30 years. Participants regularly undergo physical

examinations and fill in extensive questionnaires. In addition, blood and urine samples are

collected. Each participant is asked to fill in health, lifestyle, and quality-of-life questionnaires

every 1.5 years, whereas each participant is invited for a follow-up visit to a LifeLines clinic

every 5 years (37).

LifeLinesDEEP comprises approximately 1,500 LifeLines participants. The aim of the

LifeLinesDEEP cohort is to investigate different -omics layers. Therefore, additional

biomaterials were collected, including fecal samples. Participants who consented to giving a

fecal sample were also asked to fill in extensive questionnaires on their gastrointestinal (GI)

health (38).

Cohort 2 - University Medical Center Groningen IBD

The second cohort is the University Medical Center Groningen IBD (UMCG IBD) cohort.

Patients with inflammatory bowel disease (IBD) were recruited at the specialized IBD outpatient

clinic of the Department of Gastroenterology and Hepatology, UMCG, as described previously

(7). The IBD diagnosis was made based on accepted radiological, endoscopic and

histopathological evaluation. All patients were 18 years or older at the time of fecal sample

collection.

Cohort 3 - Maastricht IBS

The third cohort is the Maastricht IBS (MIBS) cohort and comprises Irritable Bowel Syndrome

(IBS) patients and healthy controls. IBS patients were recruited at the out-patient department of

the Gastroenterology-Hepatology division of the Maastricht University Medical Center+, a

secondary and tertiary referral center, and via general practitioners from the Maastricht area. All

IBS patients were diagnosed by a gastroenterologist after an extensive work-up that usually

included a colonoscopy. Healthy controls were age- and sex-matched and had a medical

examination to exclude any gastrointestinal disorders and determine current or previous

gastrointestinal complaints. In addition, patients were asked to fill in extensive questionnaires on

gastrointestinal health (39).

B. Analysis groups

The three cohorts were combined and the samples were subsequently divided into analysis

groups. Samples were exclusively assigned to one of the following four analysis groups:

Analysis group 1 - Population controls

The first analysis group comprised participants from both the LifeLinesDEEP cohort as well as

from the MIBS cohort. From LifeLinesDEEP, participants with self-reported IBD and self-

reported IBS were excluded, and fecal samples from 926 participants were analyzed. From the

MIBS cohort, 144 healthy controls were included. After removing samples with a read count of

less than 10 million after metagenomic sequencing (n=45), analysis group 1 - population controls

consisted of 1025 samples (893 from LifeLinesDEEP, and 132 from MIBS). Comparability

between controls was assessed by a principal coordinate analysis on Bray-Curtis dissimilarities.

No significant differences were observed in the first three principal coordinates (Wilcoxon test:

p= 0.07, p=0.95, p=0.56, respectively) (Fig S4). However, when testing individual microbial

feature associations between controls from the LifeLinesDEEP cohort versus controls from the

Maastricht cohort, the relative abundances of 42 taxa were found to be statistically significantly

different between cohorts (FDR<0.01) (Table S43). In order to remove any batch effect in the

case-control analyses, a cohort covariate was forced into the linear models.

Analysis group 2 - IBD patients diagnosed by a gastroenterologist (IBD)

The second analysis group comprised 427 patients with IBD, i.e. the entire cohort UMCG IBD.

Patients were excluded from analyses when they had a stoma, a pouch or a short bowel (n=47).

Moreover, two samples were excluded due to accidental sampling of peri-anal abscess content

instead of stool. After removing samples with a read count less than 10 million reads after

metagenomic sequencing (n=23), analysis group 2 - IBD consisted of 355 IBD patients

comprising 208 patients with CD, 126 patients with UC, and 21 patients with IBD-Undetermined

(IBDU).

Analysis group 3 - IBS patients diagnosed by a gastroenterologist (IBS-GE)

The third analysis group comprised 188 IBS patients from the MIBS cohort, who were diagnosed

by their treating gastroenterologist based on the ROME III criteria and the exclusion of other

GI diseases (39). This group is referred to as IBS-gastroenterologist (IBS-GE). To exclude any

other organic diseases, additional tests were performed when deemed necessary by the

gastroenterologist, i.e. biopsies from endoscopy, abdominal imaging, blood, breath, and fecal

analyses. After removing samples with a read count less than 10 million reads after metagenomic

sequencing (n=7), analysis group 3 - IBS-GE consisted of 181 patients. In this group, patients

were also assigned to IBS subtypes according to the ROME III criteria, indicating the most

predominant bowel habit of the patients: 65 patients with IBS diarrhea (IBS‐D), 33 patients with

IBS constipation (IBS‐C), 73 patients with IBS mixed stool pattern (IBS‐M) and 10 patients with

the IBS unspecified subtype (IBS‐U).

Analysis group 4 - IBS patients diagnosed by self-filled-in ROME III questionnaire (IBS-POP)

The fourth group comprised 242 IBS patients from the LifeLines DEEP cohort. After the

completed ROME III questionnaires had been evaluated, these patients met the ROME III

criteria for IBS. This group is referred to as IBS population (IBS-POP). After removing patients

with a read count less than 10 million reads after metagenomic sequencing (n=11), analysis

group 4 - IBS-POP consisted of 231 IBS patients. Patients were assigned to IBS subtypes based

on predominant bowel habits according to the self-reported ROME III criteria and to the self-

reported Bristol Stool Form Scale information (40): 42 patients with IBS diarrhea (IBS‐D), 65

patients with IBS constipation (IBS‐C), 111 patients with IBS mixed stool pattern (IBS‐M) and

13 patients with the IBS unspecified subtype (IBS‐U).

C. Fecal sample collection

All participants of the LifeLines DEEP cohort and the UMCG IBD cohort were asked to produce

a fecal sample at home and place it in their home freezer (-20°C) within 15 minutes after

production. Subsequently, a nurse visited all participants to pick up the fecal samples on dry ice

and transfer them to the laboratory of the Department of Genetics, UMCG. Aliquots were made

and these were stored at -80°C until further processing.

All participants of the MIBS cohort were asked to produce a fecal sample at home and

place it in their home refrigerator (4°C) and bring it to the hospital within 24 hours. Here, the

samples were collected and aliquots were made. The aliquots were then stored at -80°C. Next,

aliquots of the samples were shipped on dry ice to the Department of Genetics, UMCG, for

further processing.

D. DNA extraction from fecal samples

All fecal sample aliquots remained frozen until microbial DNA extraction. Fecal DNA isolation

was performed using the AllPrep DNA/RNA Mini Kit (Qiagen; cat. # 80204) with the addition

of mechanical lysis as previously described (8).

E. Metagenomic sequencing

After fecal sample collection and DNA extraction, fecal DNA was sent to the Broad Institute of

Harvard and MIT in Cambridge, Massachusetts, USA, where library preparation and whole-genome

shotgun sequencing were performed on the Illumina HiSeq platform. From the raw

metagenomic sequencing data, low quality reads were discarded by the sequencing facility using

an in-house pipeline. Samples with a read depth of less than 10 million reads were excluded from

subsequent analyses. Next, quality trimming and adapter removal were performed using

Trimmomatic (v.0.32), setting the minimum length to 70% of the total input read length (41).

F. Fecal biomarker measurements

In this study, we used three fecal biomarkers: fecal calprotectin, chromogranin A (CgA), and

human-β-defensin-2 (HBD-2). Fecal calprotectin levels were available for all three cohorts. The

levels are a marker for inflammation in the gastrointestinal tract (42). Fecal calprotectin levels

were measured at the UMCG using a commercial enzyme-linked immunosorbent assay (ELISA,

Bühlmann Laboratories, Switzerland). CgA and HBD-2 were available for patients in the

analysis groups IBS-GE and IBS-POP. CgA marks the activation of the neuroendocrine system

(27) and HBD-2 marks the defense mechanisms against invasion of microbes (43). As described

previously, CgA and HBD-2 were measured at the ‘Medische Laboratoria Dr. Stein & Collegae’

(the Netherlands) using a commercial radioimmunoassay (RIA, Euro-Diagnostica, Sweden) and

a commercial enzyme-linked immunosorbent assay (ELISA, Immunodiagnostik AG, Germany),

respectively (39).

II. PROCESSING OF PHENOTYPE DATA

The phenotypes and summary statistics of all 1792 individuals are presented in Table S1. The

phenotype data processing algorithm is depicted in Fig. S5.

A. Phenotypes of Analysis group 1 - Population controls

In population controls, 25 phenotypes were collected and subsequently used for all case-control

analyses (Table S3): 4 intrinsic factors, 19 medication categories, and 2 smoking factors. These

phenotypes have been related to the gut microbiome composition in the general population (10).

In addition, these phenotypes were available for all four analysis groups (10, 39).

B. Phenotypes of Analysis group 2 - IBD patients

For patients with IBD, 159 phenotypes were collected and subsequently associated with the gut

microbiome composition. These factors included 5 intrinsic factors, 20 IBD-specific phenotypes,

45 medication categories, 11 extra-intestinal manifestations (EIM), 2 smoking factors, 5 types of

GI-tract surgery, and 71 dietary factors. Except for the dietary factors, all factors were extracted

from the IBD-specific electronic patient records of the IBD Center at the Department of

Gastroenterology and Hepatology of the UMCG as was previously described (7). The dietary

factors were extracted from a food frequency questionnaire (FFQ) in which IBD patients could

record in what frequency and quantity each food item was consumed. Based on these data, the

food intake was calculated in grams per day per food group by the University of Wageningen,

Wageningen, the Netherlands.

C. Phenotypes of Analysis group 3 - IBS patients diagnosed by a gastroenterologist (IBS-

GE) and Analysis group 4 - IBS patients diagnosed by self-filled-in ROME III

questionnaire (IBS-POP)

In both the IBS-GE and IBS-POP analysis groups, we analyzed 63 phenotypes and their

associations with the gut microbiome composition. These phenotypes included 3 intrinsic

factors, 13 IBS-specific phenotypes, 45 medication categories and 2 smoking factors. The factors

were extracted from questionnaires as was previously described (38, 39). In both analysis groups,

patients kept a diary in which they recorded their bowel movement frequency and stool type. The

latter was assessed using the Bristol Stool Form Scale. In addition, patients kept a diary for one

week to score the following gastro-intestinal symptoms: abdominal pain, distention, belching,

constipation, diarrhea, flatulence, and nausea. Mean scores were calculated and used for further

analyses.

D. IBD - Disease activity

Two methods were used to define disease activity in patients with IBD. The first method was

based on standardized clinical scores for disease severity: the Harvey-Bradshaw index (HBI) for

patients with CD and the Simple Clinical Colitis Activity Index (SCCAI) score for patients with

UC. An HBI > 4 or an SCCAI > 2.5 was considered active disease. The second method was

based on fecal calprotectin measurements: a level >200 mg/kg feces was considered active IBD.
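
The two definitions of disease activity can be written as simple threshold rules. The following is a minimal sketch, not the code used in the study: the function and constant names are hypothetical, and returning None for missing scores or for diagnoses without a clinical score (for example IBDU) is an assumption made for illustration.

```python
from typing import Optional

# Thresholds as stated in the text; the names are illustrative only.
HBI_THRESHOLD = 4.0               # Harvey-Bradshaw index, used for CD
COLITIS_INDEX_THRESHOLD = 2.5     # Simple Clinical Colitis Activity Index, used for UC
CALPROTECTIN_THRESHOLD = 200.0    # fecal calprotectin in mg/kg feces


def clinically_active(diagnosis: str, score: Optional[float]) -> Optional[bool]:
    """Method 1: activity from the clinical score that matches the diagnosis."""
    if score is None:
        return None  # assumption: missing scores stay undefined
    if diagnosis == "CD":
        return score > HBI_THRESHOLD
    if diagnosis == "UC":
        return score > COLITIS_INDEX_THRESHOLD
    return None  # assumption: no clinical score rule for other diagnoses


def calprotectin_active(calprotectin_mg_per_kg: Optional[float]) -> Optional[bool]:
    """Method 2: activity from the fecal calprotectin level."""
    if calprotectin_mg_per_kg is None:
        return None
    return calprotectin_mg_per_kg > CALPROTECTIN_THRESHOLD
```

Both rules use strict inequalities, so a score exactly at the threshold counts as inactive.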

III. PROCESSING OF METAGENOMIC SEQUENCING DATA

The metagenomic reads were processed in five ways, using the algorithm described in Fig S6.

A. Taxonomy

To assess the microbial composition of fecal samples, a custom database was created. Available

reference genomes for 3693 bacteria, 5490 viruses, 207 archaea, 183 fungi, 70 protozoa, and the

human genome (build 37) were downloaded from the RefSeq NCBI database (accession date:

June 3, 2016). The database was built using Kraken (kraken-build) (33), creating a 250 GB

database. To optimize the memory usage and running-time, the original database was reduced by

selecting 2.1 million k-mer/taxon pairs (kraken-build-shrink). Next, individual metagenomic

reads of each sample were classified using Kraken default settings. Microbial relative

abundances at genus- and species-level were estimated using Bracken. To exclude false

positives, only taxa with number of counts greater than 0.01% of the total aligned reads were

considered during the reassignment process. In addition, the relative abundance of the human

intracellular pathogen Toxoplasma gondii was correlated with the number of human reads per

sample. The relative abundance of Toxoplasma gondii was highly correlated with the relative

abundance of human reads (r² = 0.94, Pearson correlation). Reads matching this taxon were

excluded from further analysis due to the potential misclassification, as was previously described

(44). In total, 1668 taxonomies were reported, comprising 26 phyla, 53 classes, 109 orders, 180

families, 467 genera and 1237 species.

B. Strain diversity

The strain diversity of bacteria was estimated by assessing the genetic heterozygosity. In order to

compute the heterozygosity of polymorphic loci within individual bacterial species, we first

identified sequencing reads belonging to particular species by mapping each sample’s

metagenome to 31 AMPHORA genes (14) from a set of 649 non-redundant reference genomes

from the Human Microbiome Project (15). Reads mapping to one of these genes in a given

species with greater than 90% identity were associated with that species. All loci that were

polymorphic in at least one sample were aggregated from all samples into a set of SNPs to

consider. To distinguish truly polymorphic loci from sequencing error artifacts, we required a

locus to have at least two read counts in the minor allele to be considered polymorphic in a given

sample. Reads overlapping with these SNPs were aggregated by species, and the probability that

any two reads mapping to the same locus have different alleles was computed. Heterozygosity

within that species was then calculated from the mean of all SNP heterozygosities within each

genome. Heterozygosity values are therefore only defined in subjects in which the species is

present and which recruit sufficient reads to compute SNP allelic frequencies. The code is

publicly available: https://github.com/thomasgurry/strains/blob/master/heterozygosity.py
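
The heterozygosity calculation described above can be illustrated as follows. This is a minimal sketch, not the authors' published script. It assumes allele read counts per locus are already available as dictionaries, that the "probability that any two reads have different alleles" is computed for two distinct reads drawn without replacement, and that the minor allele is the second most frequent allele at a locus. All function names are hypothetical.

```python
from typing import Dict, List, Optional

MIN_MINOR_ALLELE_READS = 2  # from the text: at least two reads in the minor allele


def is_polymorphic(allele_counts: Dict[str, int]) -> bool:
    """A locus counts as polymorphic in a sample if the minor allele has >= 2 reads."""
    counts = sorted(allele_counts.values(), reverse=True)
    return len(counts) >= 2 and counts[1] >= MIN_MINOR_ALLELE_READS


def locus_heterozygosity(allele_counts: Dict[str, int]) -> Optional[float]:
    """Probability that two distinct reads at this locus carry different alleles."""
    n = sum(allele_counts.values())
    if n < 2:
        return None  # not enough reads to compare a pair
    same_allele_pairs = sum(c * (c - 1) for c in allele_counts.values())
    return 1.0 - same_allele_pairs / (n * (n - 1))


def species_heterozygosity(loci: List[Dict[str, int]]) -> Optional[float]:
    """Mean heterozygosity over all SNP loci of one species in one sample."""
    values = [h for h in (locus_heterozygosity(c) for c in loci) if h is not None]
    if not values:
        return None  # undefined when no locus recruits enough reads
    return sum(values) / len(values)
```

For example, a locus with two reads of each of two alleles gives 2/3, a locus with four identical reads gives 0, and their mean is 1/3.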

C. Function

Before functional prediction, metagenomic reads belonging to the human genome (build 37)

were removed using Bowtie2 (v.2.1.0) (45). Microbial community functional profiling was

conducted using HUMAnN2 (v 0.4.0) (http://huttenhower.sph.harvard.edu/humann2), which

aligns reads to a customized pan-genome database (ChocoPhlAn). The abundances of more than

5 million UniRef (UniRef50) gene families were obtained. These gene families were grouped

into 784 pathways using the multi-organism database MetaCyc (46) as a reference database.

MetaCyc-pathways were quantile normalized and scaled. Extra information about biological

classification and taxonomical range of the pathways was retrieved from the MetaCyc website.

D. Bacterial growth rates

Bacterial growth rates were determined using the peak-to-trough-ratio algorithm (v1.1) designed

by Korem et al (16). Default settings, generated predictions (-preds generate), and the provided

database were used. Growth rate values for 100 bacterial species were obtained. Due to problems

with the supplied code, the read mapping against the reference database was performed as a

separate step using the same parameters as described in the original publication.

E. Abundance of antibiotic resistance genes and virulence factors

The abundance of antibiotic resistance genes and virulence factors was detected and quantified in

each sample. For this, we used DIAMOND (version 0.8.2) (47) to align metagenomic sequence

data against protein reference databases. First, 2171 antibiotic resistance gene protein sequences

were downloaded from the Comprehensive Antibiotic Resistance Database (CARD, v 1.1.1,

https://card.mcmaster.ca/) (35), and 2581 protein sequences of the core virulence factor dataset

were downloaded from the Virulence Factors Database (VFDB, version August 19, 2016,

http://www.mgc.ac.cn/VFs/) (36). Second, for each sample, metagenomic sequence reads were

aligned to both reference protein databases using DIAMOND. Alignment was only considered

valid if both paired-reads aligned to the same protein sequence. In the cases of multiple matches,

only the best match was kept. This was specified by using the DIAMOND alignment option “-k

1”. Protein abundances were quantified as counts per million (CPM), calculated as the raw

valid counts (number of valid alignments) divided by the library size and multiplied by one

million. After excluding antibiotic resistance genes present in less than 5% of the samples, the

abundances of 384 antibiotic resistance genes and 658 virulence factor proteins were analyzed.
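
The pairing rule and the CPM normalization can be sketched as below. This is an illustrative example under stated assumptions, not the pipeline used in the study: it assumes the best hit per read has already been parsed into dictionaries mapping a read-pair identifier to a protein identifier, and the function names are hypothetical.

```python
from collections import Counter
from typing import Dict


def count_valid_alignments(best_hit_read1: Dict[str, str],
                           best_hit_read2: Dict[str, str]) -> Counter:
    """Count read pairs whose two mates have the same best-hit protein."""
    counts: Counter = Counter()
    for pair_id, protein in best_hit_read1.items():
        if best_hit_read2.get(pair_id) == protein:
            counts[protein] += 1
    return counts


def counts_per_million(valid_counts: Dict[str, int], library_size: int) -> Dict[str, float]:
    """CPM = valid counts / library size * 1,000,000."""
    if library_size <= 0:
        raise ValueError("library size must be positive")
    return {protein: count * 1_000_000 / library_size
            for protein, count in valid_counts.items()}
```

With a library of 10 million reads, 5 valid alignments to a protein correspond to 0.5 CPM.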

F. Data filtering

Identified taxonomies, pathways, antibiotic resistance genes and virulence factors were excluded

from analyses if they were present in less than 5% of the samples, or if the average relative

abundance was lower than 0.001% in the non-zero values. Additionally, non-microbial MetaCyc

pathways were not considered in our analyses. Growth rate analysis was confined to bacterial

species in which ratios could be calculated in at least 25% of the samples.

After filtering, a total of 479 taxa, 104 bacterial strain richness estimates, 405 MetaCyc pathways,

41 bacterial species growth rates, 384 antibiotic resistance genes, and 658 virulence factors were

considered for analysis.
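
The prevalence and abundance filter can be expressed as a single predicate per feature. The sketch below is illustrative only. It assumes relative abundances are given in percent, one value per sample, and the function name and parameter names are hypothetical.

```python
from typing import Sequence


def keep_feature(abundances_percent: Sequence[float],
                 min_prevalence: float = 0.05,
                 min_mean_percent: float = 0.001) -> bool:
    """Keep a feature present in >= 5% of samples whose mean non-zero
    relative abundance is >= 0.001%."""
    nonzero = [a for a in abundances_percent if a > 0]
    if not nonzero:
        return False  # absent from every sample (or no samples at all)
    prevalence = len(nonzero) / len(abundances_percent)
    mean_nonzero = sum(nonzero) / len(nonzero)
    return prevalence >= min_prevalence and mean_nonzero >= min_mean_percent
```

Note that the mean is taken over the non-zero values only, so a rare but abundant feature is removed by the prevalence rule rather than by the abundance rule.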

G. Microbial composition measurements

With the R package ‘vegan’ (version 2.4-1) (48), exploratory analyses of the microbiome

composition were performed using 319 non-redundant taxonomical end-points. End-points were

defined as the lower non-redundant taxonomical levels. Taxonomical richness per sample was

estimated by calculating the Shannon diversity index using the diversity function

(index="shannon"). Differences in microbial diversity between phenotypes were tested using the

non-parametric Wilcoxon rank-sum test. Differences were considered significant at p-value < 0.05.

Microbial composition dissimilarities between samples were represented as Bray-Curtis

dissimilarities using the vegdist function (method="bray"). To estimate the gene richness, the

number of unique UniRef gene families was counted.
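
For reference, the two composition measures can be written out directly. This is a minimal Python sketch of the standard definitions, not a reimplementation of the vegan package: the Shannon index uses the natural logarithm of proportions, and the Bray-Curtis dissimilarity is the sum of absolute differences divided by the sum of all abundances. Function names are hypothetical.

```python
import math
from typing import Sequence


def shannon_index(abundances: Sequence[float]) -> float:
    """Shannon diversity H = -sum(p_i * ln(p_i)) over non-zero proportions."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("sample has no counts")
    return abs(sum((a / total) * math.log(a / total) for a in abundances if a > 0))


def bray_curtis(x: Sequence[float], y: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity between two samples over the same features."""
    if len(x) != len(y):
        raise ValueError("samples must have the same number of features")
    denominator = sum(x) + sum(y)
    if denominator <= 0:
        raise ValueError("both samples are empty")
    return sum(abs(a - b) for a, b in zip(x, y)) / denominator
```

Four equally abundant taxa give a Shannon index of ln(4), and two samples that share no taxa have a Bray-Curtis dissimilarity of 1.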

IV. STATISTICAL ANALYSIS

Using the processed phenotype and metagenomic data, we performed the following statistical

analyses represented in Fig S7.

A. Correlation structures

Correlations between phenotypes were calculated to exclude highly associated factors from our

multivariate models. If two phenotypes were found to be highly correlated (Spearman

coefficient, r > 0.8 and FDR<0.1) one representative phenotype would be selected (Tables S22-

S25).
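
The selection of one representative among highly correlated phenotypes can be sketched as follows. This is an illustrative example, not the procedure used in the study (which relied on corr.test in R). It covers only the correlation threshold; the FDR condition is not implemented here. The greedy rule of keeping the first phenotype encountered is an assumption, as are all function names.

```python
import math
from typing import Dict, List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: constant phenotypes are treated as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman coefficient as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def select_representatives(phenotypes: Dict[str, List[float]],
                           threshold: float = 0.8) -> List[str]:
    """Keep a phenotype only if its correlation with every kept one is <= threshold."""
    kept: List[str] = []
    for name, values in phenotypes.items():
        if all(spearman(values, phenotypes[k]) <= threshold for k in kept):
            kept.append(name)
    return kept
```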

In addition, the contribution of each taxonomical level to the abundances of pathways,

antibiotic resistance genes and virulence factors was assessed by correlating abundances. Tables

S13-S15 list the taxonomies most correlated with functional pathways, antibiotic resistance

genes (ARs), and virulence factors (VFs), respectively.

Correlation structures were calculated using the corr.test function of the R package psych

(version 1.7.3.21) (49). Spearman correlation and Benjamini and Hochberg adjusted p-values

were evaluated (method="spearman", adjust="BH").

B. Association between microbial composition and phenotypes in disease cohorts

Associations between microbial composition measurements and phenotypic data were tested in

each disease cohort. Spearman correlations were used to associate phenotypes with taxonomical

and gene family richness.

The proportion of variance explained in the inter-individual distances (Bray-Curtis

dissimilarities) per phenotype was tested using the adonis function in the vegan package.

Significance was calculated using 1000 permutations.

In all the analyses, p-values were corrected for multiple testing using the Benjamini and

Hochberg method implemented in the p.adjust function in the R package stats (50). FDR<0.01

was used as the significance threshold. Only those phenotypes that were significantly correlated

with variation of the microbial composition (Table S26) were considered in the individual

taxonomy and pathways associations analyses (described in section IV C of Supplementary

methods).
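
The Benjamini and Hochberg correction applied throughout can be illustrated with a short sketch. This is a plain Python rendering of the standard procedure for illustration, not the R code used in the study, and the function name is hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return BH-adjusted p-values in the same order as the input."""
    n = len(p_values)
    # Walk from the largest to the smallest p-value, keeping a running minimum
    # so that adjusted values never increase as the raw p-value decreases.
    order = sorted(range(n), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, index in enumerate(order):
        rank = n - position  # rank of this p-value in ascending order
        running_min = min(running_min, p_values[index] * n / rank)
        adjusted[index] = running_min
    return adjusted
```

For the p-values 0.01, 0.04, 0.03 and 0.005, the adjusted values are 0.02, 0.04, 0.04 and 0.02, so none would pass a threshold of FDR<0.01.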

C. Individual taxonomy and pathways association analysis

The association analysis between the specific taxonomies and phenotypic factors was performed

by using the statistical framework Multivariate Association with Linear Model (MaAsLin)

(https://huttenhower.sph.harvard.edu/maaslin) as was previously described (51). MaAsLin was

also used to perform the association analysis between specific pathways and phenotypic factors.

Three different models were used to assess: a) dysbiosis in IBD and IBS compared to population

controls, b) individual associations between phenotypes and microbiome signatures, and c)

independent microbiome-phenotype signals in patients with IBD and IBS.

a) Microbial composition and function of IBD patients and IBS patients were compared to those

of population controls. To standardize the case-control analysis between cohorts, the

selection of factors used for correction was based on their availability in the 3 cohorts

(LifeLinesDEEP, UMCG IBD and MIBS). The effects of 25 previously identified (10)

microbiome-related factors were taken into account (Table S3). In addition, we added a

geographical covariate to remove the batch effect between the samples collected in

Groningen (Northern part of the Netherlands) and the samples collected in Maastricht

(Southern part of the Netherlands). These factors were forced as covariates using the –F

option in MaAsLin. Disease phenotypes (i.e. IBD and IBS) were tested as Boolean factors. All

results were considered significant when FDR<0.01.

b) The effects of the selected phenotypic factors (described in Supplementary methods section

IV B and in Table S26) on the gut microbiome composition were tested univariately (i.e. one

factor at a time) in each disease cohort. In these analyses, the effects of the microbiome-

independent factors age, sex and sequencing read depth were taken into account by

forcing them as covariates in each analysis (option –F in MaAsLin). FDR<0.01 was used as

significance threshold.

c) All selected phenotypic factors were added in the same multivariate model to identify which

factors influence the specific taxonomies and pathways in the disease cohorts (Table S26).

No variables were forced; the boosting option in MaAsLin was activated to perform selection

within the metadata. FDR<0.01 was used as significance threshold.

D. Comparisons of antibiotic resistance mechanisms, virulence factors, growth rates and

heterozygosity

Differences in inter-species richness, growth rates, and the abundance of antibiotic resistance

mechanisms and virulence factors were calculated between population control samples and

disease analysis groups (i.e. UMCG IBD, IBS-POP, IBS-GE). Differences were evaluated using

MaAsLin and forcing the geographical cohort as a covariate and turning off the data

transformation step (-l none). FDR<0.01 was used as significance threshold.

V. PREDICTION MODELS

For building the prediction models, we used analysis group 2 (IBD patients diagnosed by a

gastroenterologist) and analysis group 3 (IBS patients diagnosed by a gastroenterologist). We

applied a 10-fold cross-validation. For each fold, 9 out of 10 data blocks were used to build the

model, and the last block was used to estimate goodness-of-fit represented as AUC (area under

curve). Models were fit by elasticnet linear models of binomial family from the R package

“glmnet”. The mixing parameter alpha was fixed at 0.5, while the penalization parameter lambda

was estimated by a nested 5-fold cross-validation within each training set. Taxonomical

abundances at genus and species level from Bracken were used to represent taxonomic

microbiome features in disease prediction. Pathway abundance data from HUMAnN2 were used

to represent microbiome functional features. Both taxonomical and pathway abundances were

log-transformed. For highly correlated features (Spearman R > 0.9), only one representative

taxonomy was retained.
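
The cross-validation layout and the AUC estimate can be sketched independently of the model itself. The example below is a minimal illustration, not the analysis code: it does not fit an elastic net, the splitter is not stratified by disease status (an assumption, since the text does not say), and the function names are hypothetical. The same splitter could be reused with k=5 on each training set to mimic the nested cross-validation used to choose lambda.

```python
import random
from typing import List, Sequence, Tuple


def k_fold_splits(n_samples: int, k: int = 10, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Shuffle sample indices and return (train, test) index lists for each fold."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    blocks = [indices[i::k] for i in range(k)]
    splits = []
    for held_out in range(k):
        test = sorted(blocks[held_out])
        train = sorted(i for b, block in enumerate(blocks) if b != held_out for i in block)
        splits.append((train, test))
    return splits


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve: the fraction of (case, control) pairs in which
    the case receives the higher score, counting ties as one half."""
    cases = [s for label, s in zip(labels, scores) if label == 1]
    controls = [s for label, s in zip(labels, scores) if label == 0]
    if not cases or not controls:
        raise ValueError("both classes are required")
    wins = sum(1.0 if c > d else 0.5 if c == d else 0.0
               for c in cases for d in controls)
    return wins / (len(cases) * len(controls))
```

In each fold, a model would be fit on the train indices and scored on the test indices, and the AUC would be computed from the held-out predictions.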

For each fold, seven models were built in which baseline predictors (age, sex and BMI)

were not penalized:

1. Baseline model: Disease ~ Age + Sex + BMI

2. Calprotectin model: Disease ~ Age + Sex + BMI + Calprotectin

3. Microbiome model: Disease ~ Age + Sex + BMI + Microbiome(taxonomy)

4. Microbiome model: Disease ~ Age + Sex + BMI + Microbiome(pathways)

5. Microbiome and calprotectin model: Disease ~ Age + Sex + BMI + Calprotectin + Microbiome(taxonomy)

6. [...] Microbiome(pathways)

7. [...] Microbiome(taxonomy+pathways)
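For reference, the penalized objective minimized by glmnet for a binomial model can be written as below. The symbols N, x_i, y_i, beta and v_j are notation introduced here, and representing the unpenalized baseline predictors (age, sex and BMI) by per-coefficient penalty factors v_j = 0 is an assumption about how "not penalized" was implemented.

```latex
\min_{\beta_0,\,\beta}\;
  -\frac{1}{N}\sum_{i=1}^{N}\Bigl[\,y_i\,(\beta_0 + x_i^{\top}\beta)
  - \log\bigl(1 + e^{\beta_0 + x_i^{\top}\beta}\bigr)\Bigr]
  \;+\; \lambda \sum_{j=1}^{p} v_j
  \Bigl[\tfrac{1-\alpha}{2}\,\beta_j^{2} + \alpha\,\lvert\beta_j\rvert\Bigr],
\qquad \alpha = 0.5
```

With alpha fixed at 0.5 the ridge and lasso terms are weighted equally, and lambda controls the overall strength of the penalty on the microbiome and calprotectin coefficients.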

To verify that the disease classification model robustly differentiates IBD vs. IBS, and that

the prediction is not inflated by differences between cohorts (Groningen and Maastricht), we

applied it to the microbiome profiles of healthy control samples from both cohorts. We then

performed additional reciprocal validation using microbiome profiles of healthy individuals to

build a model for cohort prediction. While the reciprocal model slightly exceeded the baseline

model in predicting IBD/IBS of the disease patient samples (AUC_IBD/IBS=0.61,

AUC_base-model=0.54) (Figure S8), the disease prediction model completely fails to predict the

cohort origin of the control samples (AUC_cohort=0.54, AUC_base-model=0.56) (Figure S9). In agreement with our

association study results, this shows that differences in microbiome profiles associated with

disease are mostly independent of possible cohort differences and batch effects.
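To make the AUC values quoted above easier to interpret, the sketch below computes AUC as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The labels and scores are invented for illustration, and this pairwise formulation is one standard way to compute AUC, not necessarily the routine used in the study.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly;
    tied scores count as half a correct pair."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Illustrative data only: 1 = one class, 0 = the other.
labels = [1, 1, 1, 0, 0, 0]
informative_scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
uninformative_scores = [0.5] * 6

auc_informative = auc(labels, informative_scores)
auc_uninformative = auc(labels, uninformative_scores)
```

A score that carries no information about the class gives an AUC of 0.5, which is why values such as 0.54 and 0.56 indicate near-chance prediction.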

Models 1, 3 and 5 were also applied to the microbiome profiles of healthy individuals from

the LifeLinesDEEP and Maastricht IBS cohorts to estimate the AUC of predicting origin. When

estimating AUC, LifeLinesDEEP and Maastricht IBS origin were encoded as Groningen-

controls and Maastricht-controls to represent the origin of the samples. To perform reciprocal

validation, models 1, 3 and 5 were recalculated using the microbiome profiles/baseline

phenotypes of healthy individuals as predictors and the cohort as an outcome following the same

mathematical approach and 10-fold cross-validation design. These models were then applied to

the microbiome profiles of IBD/IBS patients to see if we could predict their cohort of origin.
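The cross-validation design used for these models (10 outer folds, with a nested 5-fold split inside each training set for tuning lambda) can be sketched as an index plan. The function names, the seed handling, and the sample count of 100 are all hypothetical; the actual fold assignment in the published code may differ.

```python
import random


def k_fold_indices(indices, k, seed):
    """Shuffle the indices and deal them into k nearly equal folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_cv_plan(n_samples, outer_k=10, inner_k=5, seed=1):
    """Yield (train, test, inner_folds) for each outer fold."""
    outer = k_fold_indices(range(n_samples), outer_k, seed)
    for fold_id, test in enumerate(outer):
        train = [i for j, fold in enumerate(outer) if j != fold_id for i in fold]
        # Inner folds are drawn from the training set only, so the held-out
        # block never influences the choice of the penalty parameter.
        inner = k_fold_indices(train, inner_k, seed + fold_id + 1)
        yield train, test, inner


# Hypothetical sample count, chosen only to make the fold sizes easy to read.
plan = list(nested_cv_plan(n_samples=100))
```

Each element of `plan` holds 90 training indices, 10 test indices, and the five inner folds that partition the training indices.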

All model AUCs are presented in Table S12. To compare goodness-of-fit, the two-sided

paired Wilcoxon rank-sum test was used (Table S12). The lists of the top-20 features for models

3 and 5 are presented in Table S13. The effect size is the mean effect size across 10 folds.

Next, these features were sequentially added to the unpenalized logistic regression model

to estimate the per-feature improvement in prediction. To do so, we performed a second 10-fold

cross-validation with a different seed and report the mean training and test AUC for the

iteratively built models.
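The mean effect size across folds and the selection of top features could be computed along the following lines. This is a sketch under stated assumptions: features are ranked by the absolute value of their mean coefficient, a feature not selected in a fold contributes a coefficient of zero, and the feature names and values are invented.

```python
def mean_effects(fold_coefs):
    """Average each feature's coefficient over all folds; a feature that
    was not selected in a fold contributes 0 for that fold (assumption)."""
    names = sorted({name for fold in fold_coefs for name in fold})
    k = len(fold_coefs)
    return {n: sum(fold.get(n, 0.0) for fold in fold_coefs) / k for n in names}


def top_features(fold_coefs, n=20):
    """Return the n features with the largest absolute mean effect."""
    means = mean_effects(fold_coefs)
    return sorted(means.items(), key=lambda kv: abs(kv[1]), reverse=True)[:n]


# Hypothetical coefficients from two folds (the study used 10).
folds = [
    {"taxon_X": 0.8, "taxon_Y": -0.2},
    {"taxon_X": 0.6, "pathway_Z": -1.0},
]
top = top_features(folds, n=2)
```

In this toy example `taxon_X` (mean 0.7) and `pathway_Z` (mean -0.5) rank ahead of `taxon_Y` (mean -0.1).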

The code is publicly available at https://github.com/alexa-kur/ibd_ibs_pred

Fig. S1. Comparison of microbial richness between cohorts. Decreased alpha diversity

(Shannon Index) in CD (median 3.18 [1.37 to 3.89]), IBS-GE (median 3.18 [1.90 to 4.09]) and

UC (median 3.35 [1.15 to 3.97]) compared to controls (median 3.41 [1.99 to 4.18]). IBS: irritable

bowel syndrome, CD: Crohn’s disease, UC: ulcerative colitis.
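As a reminder of what the Shannon index in Fig. S1 measures, a minimal sketch is given below. It assumes the natural logarithm, which the caption does not state, and the abundance vectors are illustrative only.

```python
import math


def shannon_index(abundances):
    """Shannon diversity H = -sum(p * ln p) over taxa with non-zero abundance."""
    total = sum(abundances)
    proportions = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in proportions)


# Illustrative communities (not study data).
even_community = [25, 25, 25, 25]     # four equally abundant taxa
dominated_community = [97, 1, 1, 1]   # one taxon dominates

h_even = shannon_index(even_community)
h_dominated = shannon_index(dominated_community)
```

An even community of four taxa reaches the maximum ln(4), while a community dominated by one taxon scores much lower, which is the sense in which the index decreases in the disease groups.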

Fig. S2. Venn diagram of overlapping taxa between IBD and clinical IBS. Venn diagram: 24

taxa overlap between IBD and IBS-GE. IBD: inflammatory bowel disease, IBS-GE:

irritable bowel syndrome diagnosed by a gastroenterologist.

Fig. S3. Cohorts, sample collection, and sample processing algorithm.

Fig. S4. Principal coordinate analysis plot on Bray-Curtis dissimilarities of controls.

Maastricht healthy controls (red) and LifeLines healthy controls (black). Similar distribution

gradients between cohorts are indicative of comparability.
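The Bray-Curtis dissimilarity underlying the principal coordinate analysis in Fig. S4 can be illustrated as follows. The abundance profiles are invented, and returning 0.0 for two empty profiles is a convention chosen here.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance profiles of equal length:
    sum of absolute differences divided by the sum of all abundances."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator if denominator else 0.0


# Illustrative relative-abundance profiles for two samples (not study data).
sample_1 = [0.5, 0.5]
sample_2 = [0.25, 0.75]

d_partial = bray_curtis(sample_1, sample_2)
d_identical = bray_curtis(sample_1, sample_1)
d_disjoint = bray_curtis([1, 0], [0, 1])
```

The value ranges from 0 for identical profiles to 1 for profiles that share no taxa; a principal coordinate analysis is then run on the matrix of pairwise values.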

Fig. S5. Phenotype data processing algorithm.

Fig. S6. Metagenomic sequencing data pipeline.

Fig. S7. Overview of statistical analyses.

Fig. S8. Prediction model to distinguish cohort of origin in disease. ROC curves showing the

prediction model trained to distinguish between cohorts (red line). When applied to disease

cohorts, the prediction model fails to distinguish between IBD and IBS (blue line, AUC=0.61)

but shows a higher predictive value than the base model based on sex, BMI and age (black line,

AUC=0.54). MB: microbiome; Base: Age+Sex+BMI.

[ROC plot. Panel title: Cohort prediction. x-axis: False positive rate (0.0 to 1.0); y-axis: True positive rate (0.0 to 1.0). Legend: MB, Base, IBD/IBS.]

Fig. S9. Prediction model to distinguish cohort of origin in controls. ROC curves showing

application of the prediction model described in the main manuscript to cohort/geographic origin

of the samples (Maastricht controls vs. Groningen controls). The mean AUC value did not

exceed the AUC value of the base model (AUCcohorts=0.54, AUCbase=0.56), which demonstrates

that any batch effect in the IBD/IBS classification model is negligible. (MB: microbiome [red

curve]; Calprot: Faecal Calprotectin [blue curve]; Base: Age+Sex+BMI [black curve]; Cohort:

Groningen or Maastricht samples [light blue curve]).
