International Human Microbiome Standards

Grant Agreement: HEALTH-F4-2010-261376

DELIVERABLE REPORT

Work package WP3 – Improved standards for sequencing

Work package leader Partner 5 – CEA Genoscope

Deliverable D3.2 – Improved standards for sequencing

Delivery date* 01/08/2013

Dissemination level** PU (Public)

* Please refer to the IHMS Calendar on the IHMS intranet.

** Please highlight the dissemination level appropriate for the deliverable. You can find the corresponding information in the IHMS Calendar.

Summary report

From January 2012 until now, we have received a further 22 DNA extractions from faecal samples from the INRA partner and 217 DNA extractions from the other partners. All the samples have been treated according to our validated pipeline, which includes: i) sample quality control at arrival; ii) Illumina sequencing library preparation, using our standardized protocol, from samples which passed the QC; iii) 100 bp paired-end sequencing of each library; iv) sequence quality control and validation; v) data delivery to partner 7.

To help establish standards for the faecal sample extraction protocol, particular attention has been paid to checking the quality of the DNA samples. In this report we describe the analysis applied for sample QC and the exclusion criteria used. All the INRA samples passed the QC and were sequenced. Of the other 217 samples, 192 passed the QC and were sequenced. All sequencing data have been transferred to partner 7 and analyses are in progress.


sd3.2.1 – Improved inventory of standards for genomic sequencing

sd3.2.2 – Improved standards and recommendation for metagenomic long contiguous reference sequence

In the period January 2012 – January 2013, the INRA partner sent 239 DNA extractions to Genoscope. The INRA partner extracted 22 of them using the same protocol applied for extraction of the 20 samples previously processed. The other 217 samples were extracted by the other IHMS partners, starting from aliquots of the same two faecal samples (A and B) and using their own extraction protocols.

Upon arrival at Genoscope, all the samples were recorded in our LIMS for internal follow-up at every stage of processing. They were stored at –20 °C until processing according to our well-established and standardized pipeline, described below.

[Figure: processing pipeline diagram. Legend: marked steps define stopping points; the experiment must fulfil well-defined criteria, otherwise it is stopped.]


i) Sample quality control

Our standardized protocol for genomic DNA quality control was first applied to all the samples. An SOP for genomic DNA QC is given in Appendix 1. We recommend that laboratories performing extractions use this protocol to evaluate DNA quality.

Briefly, the protocol includes two steps:

- Quantity evaluation: the sample is quantified by two independent measurements with the Qubit BR assay kit, and a mean concentration is calculated. The library preparation protocol established for the IHMS project requires 250 ng of input DNA. In our standard procedure, if the total DNA quantity is less than 500 ng (twice the minimal quantity), the sample is not valid and the QC ends at this stage. In the context of this project, we decided to also check the quality of samples whose quantity was insufficient (<250 ng) to prepare the library.

- Quality evaluation: samples are loaded on a 0.4 % agarose gel and run at 100 V for one hour. A photo is taken and DNA quality is checked visually. If RNA contamination is present, an RNAse treatment is applied to the sample, after which the sample goes through the QC again from the beginning. DNA integrity is then checked visually. For standard paired-end library preparation, DNA passes the QC if the majority of the DNA is located in a tight band at high molecular weight. However, we wanted to check the quality of the IHMS samples much more carefully in order to produce as much information as possible about DNA quality. This should help to evaluate the different extraction protocols used by the IHMS partners and to establish a standardized protocol that produces good-quality DNA. To this end, we took advantage of a gel image analysis system available in our laboratory (GeneTools, Syngene), which can calculate the % of DNA present in size ranges selected by the user. Using the sizes of the DNA ladder bands as reference, we chose four size ranges: > 9 kb, between 9 and 5 kb, between 5 and 1.8 kb, and < 1.8 kb. We manually delimited these size regions on each gel image and the analysis system calculated the % of DNA in each region. We combined the results of the software analysis with our visual interpretation of the images and finally established five DNA quality categories:

Qualitative classification (colour code):

Group 1: Very good quality DNA. Optimal for sequencing.

Group 2: Majority of high molecular weight DNA. Good for sequencing.

Group 3: Presence of degraded DNA mostly > 1.8 kb. Acceptable for standard PE sequencing (not for MatePair).

Group 4: Presence of degraded DNA with most fragments < 1.8 kb. Not suitable for standard sequencing.

Group 5: Totally degraded DNA. Not acceptable for sequencing.
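
The quantity evaluation described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `quantity_check`, the status labels and the example readings are assumptions made here, while the 250 ng and 500 ng thresholds are the ones quoted in the text.

```python
from statistics import mean

# Thresholds taken from the text above.
REQUIRED_INPUT_NG = 250.0    # input DNA needed for the IHMS library protocol
STANDARD_MINIMUM_NG = 500.0  # twice the required input (standard QC cut-off)


def quantity_check(conc_1, conc_2, volume_ul):
    """Summarise two independent Qubit readings (ng/µl) for one sample.

    Returns (mean concentration in ng/µl, total quantity in ng, status).
    The status labels are hypothetical names chosen for this sketch.
    """
    mean_conc = mean([conc_1, conc_2])
    total_ng = mean_conc * volume_ul
    if total_ng < REQUIRED_INPUT_NG:
        status = "insufficient for library"
    elif total_ng < STANDARD_MINIMUM_NG:
        status = "below standard minimum"
    else:
        status = "sufficient"
    return mean_conc, total_ng, status


# Hypothetical readings, not taken from the report.
print(quantity_check(190.0, 188.0, 49.5))
```

Keeping the two thresholds separate mirrors the distinction made in the text between the standard cut-off and the smaller quantity that is still enough to attempt a library.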


Here is an example of the QC on a subset of 10 IHMS samples. For each sample, an undiluted 1 µl aliquot and a 1:10 diluted aliquot were loaded on the agarose gel.

Qualitative analysis: % DNA per size range and RNA contamination. Quantitative analysis: reported and measured volume and quantity. Qualitative classification: shown by colour code in the original table (not reproduced).

Genoscope  Sample   % DNA   % DNA   % DNA     % DNA    RNA    Reported     Reported       Measured     Measured       Validation
ID         ID       >9 kb   5-9 kb  1.8-5 kb  <1.8 kb  cont.  volume (µl)  quantity (ng)  volume (µl)  quantity (ng)  decision
ES         A1-002   87.57    7.50    1.44      3.50    -      50           19177          49.50         9356          Valid
ET         A1-052   91.96    5.52    0.38      2.14    -      50           15318          50.40         9097          Valid
EV         A1-102   86.52    9.96    1.20      2.32    -      50           13065          55.40         8487          Valid
FA         A1-152   88.05    9.18    1.46      1.31    -      50           18372          50.20        14427          Valid
FB         B1-002   75.43   22.03    1.13      1.41    -      50           13983          48.00         8112          Valid
FC         B1-052   75.77   21.04    0.05      3.14    -      50           22250          47.00        15566          Valid
FD         B1-102   70.30   29.09    0.00      0.61    -      50           17150          51.80        13126          Valid
FE         B1-152    1.54   21.53   46.52     30.41    -      50           50272          60.50        19481          Valid
FF         C1-002    1.19    1.21    1.72     95.88    -      50           15649          50.00          988          Invalid
FG         C1-022    0.29    0.71    2.82     96.18    -      50           16398          51.50          883          Invalid


Based on this classification, all INRA samples were classified in the first group. The following table summarizes the classification results for the remaining 217 samples:

Very good quality DNA. Optimal for sequencing: 76
Majority of high molecular weight DNA. Good for sequencing: 56
Presence of degraded DNA mostly > 1.8 kb. Acceptable for standard PE sequencing (not for MatePair): 45
Presence of degraded DNA with most fragments < 1.8 kb. Not suitable for standard sequencing: 19
Totally degraded DNA. Not acceptable for sequencing: 21

Total sequenced libraries: 192
Total invalid samples (including 4 samples with good quality but insufficient quantity): 25

Although, according to our QC criteria for sample exclusion, samples classed in Group 4 should not have been processed further, we decided, in agreement with the project coordinator, to process them anyway in order to establish whether low DNA quality affects library preparation and sequence data results.

In the end, of the 217 samples analysed in this way at the QC stage, 192 were considered valid and were then used to prepare libraries.
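
The counts reported above can be cross-checked with a short tally. The sketch below assumes one reading of the table that is consistent with its totals: the invalid samples are the totally degraded ones (Group 5) plus the 4 good-quality samples with insufficient quantity. Variable names are illustrative.

```python
# Samples per quality group, as listed in the classification table.
group_counts = {1: 76, 2: 56, 3: 45, 4: 19, 5: 21}

# Good-quality samples excluded only because of insufficient quantity.
insufficient_quantity = 4

total_received = sum(group_counts.values())

# Assumption: invalid = totally degraded (Group 5) + insufficient quantity.
invalid = group_counts[5] + insufficient_quantity
sequenced = total_received - invalid

print(total_received, invalid, sequenced)
```

Under this reading the Group 4 samples are counted among the sequenced libraries, which matches the decision to process them despite their low quality.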

ii) Illumina library preparation and QC

Library preparation was performed according to the protocol described in the D3.1 report. An SOP for library preparation is included in Appendix 2.

All the samples were successfully processed.

iii) Sequencing

Each indexed library was sequenced on one eighth of an Illumina HiSeq2000 lane in order to obtain at least 20 million reads per sample. Standard Illumina operating procedures were followed for cluster generation and the sequencing run.

iv) Data QC

Raw fastq files coming from the sequencer are processed by the Genoscope internal pipeline schematized below.


First, a read quality check is performed on a subsample of the reads in order to detect possible biases in library construction or sequencing problems. After manual validation of the sequencing run, the whole read dataset is processed to remove adapters and low-quality nucleotides from both ends (the low-quality threshold is fixed at 20). The cleaned reads (fastx_clean) then go through the following steps: i) removal of the sequence between the second unknown nucleotide (N) and the end of the read; ii) discarding of reads shorter than 30 nucleotides after trimming; iii) removal of reads, together with their mates, that map onto run quality control sequences (PhiX genome) with at most 2 mismatches. QC charts and contamination screening are then performed on a subset of the clean reads.
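
As a rough illustration of the trimming and length-filtering rules described above, the following Python sketch processes a single read. It is not the Genoscope pipeline: the function name `clean_read` is hypothetical, Phred+33 quality encoding is assumed, bases with a quality below 20 are treated as low quality, and adapter removal and PhiX screening are left out.

```python
def clean_read(seq, qual, min_quality=20, min_length=30, phred_offset=33):
    """Trim one read; return (seq, qual) or None if the read is discarded.

    Assumes Phred+33 encoded quality strings (an assumption of this sketch).
    """
    scores = [ord(ch) - phred_offset for ch in qual]

    # Trim low-quality nucleotides from both ends.
    start, end = 0, len(seq)
    while start < end and scores[start] < min_quality:
        start += 1
    while end > start and scores[end - 1] < min_quality:
        end -= 1
    seq, qual = seq[start:end], qual[start:end]

    # Remove everything from the second unknown nucleotide (N) onwards.
    first_n = seq.find("N")
    if first_n != -1:
        second_n = seq.find("N", first_n + 1)
        if second_n != -1:
            seq, qual = seq[:second_n], qual[:second_n]

    # Discard reads shorter than the minimum length after trimming.
    if len(seq) < min_length:
        return None
    return seq, qual


# Hypothetical read: 5 low-quality bases at each end of a 40-base read.
trimmed = clean_read("A" * 40, "#" * 5 + "I" * 30 + "#" * 5)
print(len(trimmed[0]))
```

In a paired-end setting a discarded read would normally be handled together with its mate; that bookkeeping is outside the scope of this sketch.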

[Figure: Genoscope read-processing pipeline. Elements shown: Raw Fastq; checkReadsQuality (20000 reads); Adaptors; fastx_clean; Cleaned Fastq; decontamFastq; checkContamination; checkReadsQuality (20000 reads). Criteria listed: composition bias, N distribution, quality, primer search; adaptors < 0.5, quality > 20, N < 2, length >= 30; PhiX, other …]


APPENDIX 1 : Genomic DNA QC using standard electrophoresis

Summary

This protocol describes how to evaluate the quality and quantity of genomic DNA samples using a standard agarose gel and a Qubit™ fluorometer.

Reagents and consumables

Reagent / consumable                                      Supplier
Seakem Agarose                                            Biorad
50x TBE buffer                                            Biorad
SYBR® Safe DNA gel stain (10,000X concentrate in DMSO)    Invitrogen
5x loading dye                                            General lab supplier
RNAse A 100 mg/ml                                         Qiagen
0.1x TE buffer                                            General lab supplier
Resuspension buffer (10 mM Tris-HCl, pH 7.5)              General lab supplier
Agilent DNA HS kit                                        Agilent
Quant-iT™ dsDNA BR assay kit                              Life Technologies
DNA molecular weight marker II (0.1 – 23 kb)              Roche

Equipment

Equipment                                                       Supplier
Mini horizontal device 15-wells combs                           Biorad
Mini horizontal gel electrophoresis device with 7x10 cm tray    Biorad
Gel imager system                                               Different lab suppliers
Qubit™ fluorometer 1.0 or 2.0                                   Life Technologies


Procedure

Upon arrival, store the sample at –20 °C until use.

STEP 1: gDNA quantification using the Qubit™ fluorometer

Use the Quant-iT™ dsDNA BR assay kit, following the manufacturer's instructions for the kit and the Qubit fluorometer. Perform two independent measurements using 1 µl of the DNA sample for each measurement. Calculate the mean concentration in ng/µl.

STEP 2: gDNA integrity check by agarose gel electrophoresis

All reagents and stock solutions should be prepared prior to the start of the procedure.

Gel & Sample Preparation

a) Cast a ~40 ml 0.6% Seakem agarose gel with 1X TBE and 10 µl SYBR® Safe DNA gel stain (10,000X concentrate in DMSO). Use a narrow-well comb.

b) For each sample to be tested, prepare two clean labelled tubes:

Tube 1: transfer 1 µl DNA and add 5 µl H2O and 2 µl 5x loading dye.

Tube 2: prepare a 1:10 dilution of the initial sample in TE buffer and use 1 µl of the dilution. Add 5 µl H2O and 2 µl 5x loading dye.

Gel Electrophoresis

a) Load the gel, leaving an empty well between two samples. Load 100-150 ng of the DNA molecular weight marker II in the two wells located on the left and right edges of the gel.

b) Run the gel for 30 min at ~100 V in 1X TBE buffer.

c) Remove the gel from the gel box and image it. This first image capture makes it easier to evaluate the presence of RNA contamination.

d) Return the gel to the gel box and run again for 30 min at 100 V.

e) Remove the gel from the gel box and image it.

DNA QC Gel Analysis

Evaluate genomic DNA integrity and RNA contamination

a) RNA contamination

If RNA is massively present in the sample (visible as a cloud at < 1 kb and/or two bands at ~5 kb and 1.8 kb corresponding to rRNA), treat the initial sample with RNAse A: use 1 µl RNAse A for each 100 µl of sample, incubate 90 min at 37 °C and reload 1 µl of the treated sample on the gel. If the RNA has disappeared, perform a new quantification by Qubit assay as previously described. If RNA is still present, treat the sample with RNAse A again.

b) DNA integrity

The majority of the DNA should appear as a tight band > 23 kb. A smear means that the DNA is partially degraded. If no tight high molecular weight band is visible and DNA is present only in the smear, the degradation is massive and the DNA is not suitable for sequencing. If a quantification software system is available, refer to the software instructions to analyze DNA quality on gels.

If the DNA is to be used for long mate-pair library construction, it needs to be of high molecular weight: the DNA band should be above the 23 kb band. It is highly recommended to check the integrity of the DNA by pulsed-field electrophoresis to properly determine the molecular weight.


APPENDIX 2: Library Preparation Recommendations for Illumina sequencing of metagenomic samples

Summary

The purpose of this procedure is to generate a DNA library with a 180-480 bp insert size for 100 bp paired-end sequencing on the Illumina HiSeq2000. Starting material is 500 ng of genomic DNA extracted from faecal samples. Genomic DNA is sheared into smaller fragments with a Covaris instrument, and barcoded adapters are added so that the DNA can be hybridized to a flow cell before being loaded on the HiSeq instrument. During library preparation, end repair, A-tailing, adaptor ligation and size selection are performed by a semi-automated instrument, the SPRI-TE instrument supplied by Beckman Coulter.

Reagents and consumables

Reagent / consumable                              Supplier
6-mm × 16-mm AFA microtubes and snap caps         Covaris
LoBind tubes, 1.5 mL                              Eppendorf
Agencourt AMPure XP beads                         Beckman Coulter
SPRIworks Fragment Library System I               Beckman Coulter
Platinum Pfx Taq Polymerase kit                   Life Technologies
0.1x TE buffer
Resuspension buffer (10 mM Tris-HCl, pH 7.5)      General lab supplier
Agilent DNA HS kit                                Agilent
Quant-iT dsDNA HS assay kit                       Life Technologies
Illumina adapters                                 Bioo Scientific
Illumina Library quantification kit               KAPA Biosystems

Equipment

Equipment                          Supplier
Covaris AFA™ Ultrasonicator        Covaris
SPRI-TE Instrument                 Beckman Coulter
2100 Bioanalyzer                   Agilent
Thermal cycler                     General lab supplier
Qubit fluorometer or equivalent    Life Technologies


Procedure

STEP 1: DNA fragmentation using Covaris

Fragment DNA using the S2 or E210 system. Follow the manufacturer's recommendations for correct use of the instrument.

a) Allow the Covaris chiller to reach 4 °C, and degas for at least 30 min (for S2) or 1 h (for E210).

b) During this time, prepare the DNA sample: dilute 500 ng DNA to 130 µl with 0.1x TE buffer and transfer the DNA sample to a 100-µl Covaris microtube, keeping the cap on the tube.

c) Insert the microtube into the holder (S2) or the rack (E210) and, for fragment sizes in the range of 200 bp, run the Covaris with the following settings:

Duty cycle: 10%
Intensity: 5
Cycles per burst: 200
Time: 120 sec

d) Transfer the processed sample to a 2 ml screw-cap tube supplied with the SPRIworks Fragment Library System I.

e) QC step: remove 1 µl of the sample to test the fragment size on a High Sensitivity DNA Chip on the Bioanalyzer. The expected DNA fragment range is 100 bp to 1 kb with a peak around 400 bp.
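
The dilution in step b) depends on the measured sample concentration. A minimal sketch of the calculation follows; the function name `covaris_dilution` and the example concentration are assumptions, while the 500 ng input and 130 µl final volume come from the procedure.

```python
def covaris_dilution(conc_ng_per_ul, input_ng=500.0, final_volume_ul=130.0):
    """Return (DNA volume, 0.1x TE volume) in µl for the shearing sample."""
    dna_ul = input_ng / conc_ng_per_ul
    if dna_ul > final_volume_ul:
        # The sample is too dilute to fit the required input in the final volume.
        raise ValueError("sample concentration too low for the requested input")
    return dna_ul, final_volume_ul - dna_ul


# Hypothetical sample at 50 ng/µl.
print(covaris_dilution(50.0))
```

A sample below about 3.85 ng/µl (500 ng / 130 µl) cannot reach the required input without concentrating it first, which is why the sketch raises an error in that case.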


STEP 2: SPRI TE run

This step includes end repair, A-tailing, adaptor ligation and size selection, performed using the preloaded SPRIworks reagent cartridge and the SPRI-TE instrument.

a) Remove SPRIworks Fragment Library I cartridges from –20 °C storage and allow them to thaw during shearing. Remove one cartridge for each library to be constructed. Thaw cartridges at room temperature for approximately one hour, or until all contents are completely thawed.

b) Set up the SPRI-TE instrument and prepare the reagent rack, following the manufacturer's instructions included with the SPRIworks Fragment Library System I. Use barcoded Illumina-compatible adapters (they can be home-made or purchased from various suppliers, such as Bioo Scientific).

**Note: depending on the initial adapter concentration, dilute the adapters with resuspension buffer to adjust for 500 ng input DNA. Excess adapters can interfere with sequencing. The adapters may have to be titrated relative to the starting material. If using 15 µM adapters, dilute them 1:10 for about 500 ng input DNA.

c) Start the run by selecting the 300-600 bp size selection option.

d) At the end of the run, retrieve the tube containing the library and clean up the reaction using AMPure XP beads. This step additionally removes fragments < 300 bp and remaining adapter dimers that can interfere with PCR enrichment. Before performing the clean-up, review the manufacturer's AMPure XP handling recommendations (Beckman Coulter).

Measure the library volume and adjust it to 50 µl with resuspension buffer. Add 32.5 µl (0.65 volumes) AMPure XP beads and mix by short vortexing. Incubate for 5 minutes, then bind the beads and remove the supernatant. Add 500 µl 70% ethanol (made fresh each time), incubate for 30 seconds and remove. Repeat the wash once. Let the pellet dry completely (5-10 minutes), then elute in 40 µl resuspension buffer.

e) Remove 1 µl from the sample and perform a Qubit quantification.
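
The bead volume and adapter dilution used in this step are simple ratios. The sketch below spells them out, using the ratios and concentrations given in the procedure; the function names are illustrative only.

```python
def bead_volume_ul(sample_volume_ul, bead_ratio):
    """AMPure XP bead volume for a given sample volume and bead:sample ratio."""
    return sample_volume_ul * bead_ratio


def diluted_concentration(stock_um, dilution_factor):
    """Concentration (µM) after a 1:dilution_factor dilution of the stock."""
    return stock_um / dilution_factor


# Post-ligation clean-up: 0.65 volumes on a 50 µl library.
print(bead_volume_ul(50.0, 0.65))

# 15 µM adapter stock diluted 1:10 for about 500 ng input DNA.
print(diluted_concentration(15.0, 10))
```

Expressing the bead volume as a ratio makes it easy to recompute if the library volume differs from 50 µl.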

STEP 3: PCR enrichment

Perform enrichment of the library using Platinum Pfx Taq Polymerase (Life Technologies) and the P5 and P7 primers. Other protocols suitable for use with the Illumina HiSeq2000 may also be used.

Primer P5: 5' AATGATACGGCGACCACCGAG

Primer P7: 5' CAAGCAGAAGACGGCATACGAG

This protocol is based on 10 ng of ligated DNA as template for 12 cycles of enrichment.


a) Combine and mix the following components in two sterile 0.2 ml tubes:

Ligated DNA (10 ng)                x µl
Pfx amplification buffer (10x)     5 µl
P5 primer 50 µM                    1 µl
P7 primer 50 µM                    1 µl
MgSO4 50 mM                        2 µl
dNTP 10 mM                         2 µl
Pfx Platinum Taq polymerase        0.8 µl
H2O                                38.2 - x µl
Total volume                       50 µl

b) Amplify using the following PCR cycling conditions:

30 sec at 98 °C
[10 sec at 98 °C, 30 sec at 60 °C, 30 sec at 72 °C] 12 cycles total
5 min at 72 °C
Hold at 4 °C

c) Clean up the reaction using AMPure XP beads (Agencourt).

Add 40 µl (0.8 volumes) AMPure XP beads and mix by short vortexing. Incubate for 5 minutes, then bind the beads and remove the supernatant. Add 500 µl 70% ethanol (made fresh each time), incubate for 30 seconds and remove. Repeat once. Let the pellet dry completely (5-10 minutes), then elute in 30 µl resuspension buffer.
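
The volume of ligated DNA (x) in the PCR mix depends on the library concentration measured after the SPRI-TE run. The following sketch computes it, taking the water volume as whatever completes the stated 50 µl total. The function name `pcr_mix` and the example concentration are assumptions; the component volumes are those listed in step a).

```python
# Fixed component volumes (µl) from the reaction table in step a).
FIXED_COMPONENTS_UL = {
    "Pfx amplification buffer (10x)": 5.0,
    "P5 primer (50 µM)": 1.0,
    "P7 primer (50 µM)": 1.0,
    "MgSO4 (50 mM)": 2.0,
    "dNTP (10 mM)": 2.0,
    "Pfx Platinum Taq polymerase": 0.8,
}
TOTAL_VOLUME_UL = 50.0
LIGATED_DNA_NG = 10.0


def pcr_mix(library_conc_ng_per_ul):
    """Return (ligated DNA volume, water volume) in µl for one reaction."""
    dna_ul = LIGATED_DNA_NG / library_conc_ng_per_ul
    water_ul = TOTAL_VOLUME_UL - sum(FIXED_COMPONENTS_UL.values()) - dna_ul
    if water_ul < 0:
        # The library is too dilute to fit 10 ng into the reaction volume.
        raise ValueError("library concentration too low for a 50 µl reaction")
    return dna_ul, water_ul


# Hypothetical library at 2 ng/µl.
print(pcr_mix(2.0))
```

Deriving the water volume from the total, rather than hard-coding it, keeps the mix at 50 µl whatever the library concentration, which is the volume the 0.8-volume bead clean-up in step c) assumes.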

STEP 4: Quantitative and qualitative assessment of the library

The sample must be accurately quantified in order to optimize yield. This step is absolutely crucial to the success of any experiment.

a) Measure the concentration with the Qubit using the HS kit.

b) Run 1 ng of the sample on the Bioanalyzer High Sensitivity DNA Chip.

c) Quantify the library by qPCR. The unknown library is compared to a previously analyzed library for which the optimal cluster density has been achieved.