NGS Applications II (UEB-UAT Bioinformatics Course - Session 2.1.3 - VHIR, Barcelona)

Post on 10-May-2015

DESCRIPTION

Course: Bioinformatics for Biomedical Research (2014). Session: 2.1.3 - Next Generation Sequencing. Technologies and Applications. Part III: NGS Applications II. Statistics and Bioinformatics Unit (UEB) & High Technology Unit (UAT) from Vall d'Hebron Research Institute (www.vhir.org), Barcelona.

Transcript

Vall d’Hebron Institut de Recerca (VHIR)

Rosa Prieto, Head of the High Tech Unit

rosa.prieto@vhir.org

15/05/2014

Institut d’Investigació Sanitària acreditat per l’Instituto de Salud Carlos III (ISCIII)

NEXT GENERATION SEQUENCING TECHNOLOGIES AND APPLICATIONS

COURSE ON BIOINFORMATICS FOR BIOMEDICAL RESEARCH

Index

1. INTRODUCTION TO NGS
2. NGS TECHNOLOGY OVERVIEW
3. NGS APPLICATIONS OVERVIEW
4. WHAT IS NEXT IN SEQUENCING TECHNOLOGIES?

NGS applications

-Amplicon sequencing
-Targeted DNA resequencing
-Exome sequencing
-Whole genome sequencing
-Metagenomics
-RNA sequencing
-Targeted RNA resequencing
-Epigenomics
-Sequencing of free DNA-RNA (plasma/serum)

Metagenomics is the study of a collection of genetic material (genomes) from a mixed community of organisms. Metagenomics usually refers to the study of microbial communities.

What can we study?

•The biosphere contains between 10^30 and 10^31 microbial genomes, at least 2–3 orders of magnitude more than the number of plant and animal cells combined.
•Microbes associated with the human body outnumber human cells by at least a factor of ten.
•The vast majority cannot be cultured.

Metagenomics

(16S rRNA)

The 16S rRNA gene is composed of highly conserved regions interspersed with more variable regions, allowing PCR primers to be designed that are complementary to universally conserved regions flanking variable regions.
Wu et al. BMC Microbiol. 2010; 10: 206.
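The primer design idea above can be illustrated with a minimal Python sketch. It assumes a toy template and two short toy primers (not real 16S primers); the function names `primer_regex`, `reverse_complement` and `extract_amplicon` are hypothetical and introduced only for illustration. Degenerate positions are expanded with the standard IUPAC nucleotide codes, and the region between the two conserved primer sites is returned as the amplicon.

```python
import re

# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Complement table that also handles the degenerate codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def primer_regex(primer):
    """Translate a (possibly degenerate) primer into a regular expression."""
    return "".join("[" + IUPAC[base] + "]" for base in primer.upper())


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written with IUPAC codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def extract_amplicon(template, forward, reverse):
    """Return the region flanked by both primers (primers included), or None."""
    template = template.upper()
    fwd = re.search(primer_regex(forward), template)
    if fwd is None:
        return None
    # The reverse primer anneals to the opposite strand, so its reverse
    # complement is what appears on the template strand.
    downstream = template[fwd.end():]
    rev = re.search(primer_regex(reverse_complement(reverse)), downstream)
    if rev is None:
        return None
    return template[fwd.start(): fwd.end() + rev.end()]


# Toy data (assumption): a made-up template and made-up primers.
toy_template = "AAGTGCCAGTTACGATTAGGGTAGTCCAA"
print(extract_amplicon(toy_template, "GTGYCAG", "GGACTAC"))
```

The sketch only finds exact matches to the degenerate pattern; real primer evaluation would also have to tolerate mismatches and consider annealing conditions.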

Unidirectional sequencing

Types of metagenomics studies using NGS

-Population screening and diversity
-Genome assembly
-Gene prediction and annotation
-Functional genomics
-Ecology
-Taxonomy

Sampling and pyrosequencing methods for characterizing bacterial communities in the human gut using 16S sequence tags.

Wu et al. BMC Microbiol. 2010; 10: 206.

This is a study of methods for surveying bacterial communities in human feces using 454/Roche pyrosequencing of 16S rRNA gene tags.

It compares different methods of sample storage (no effect), DNA extraction and purification (great effect), sets of primers for amplification of several variable regions (effect) and GS FLX vs. GS FLX Titanium sequencing (no effect).

Composition of the gut microbiome in the ten subjects studied.

We did find that the choice of 16S rRNA gene region used for analysis had a noticeable effect, with the V6-V9 region representing an outlier. The V6-V9 primers consistently showed the lowest percentage of taxonomic assignments at the genus level. We note that our choice of V6-V9 primer and sequencing direction did not cover the V6 regions efficiently.

NIH Human Microbiome Project

“our other genome”

MetaHIT Project

•To establish associations between the genes of the human intestinal microbiota and our health and disease.
•Focused on two disorders of increasing importance in Europe, Inflammatory Bowel Disease (IBD) and obesity.

Intestinal microbiota deep-sequencing for patient stratification:
•rich microbiota
•poor microbiota (obesity, metabolic disturbance, weight increase)

The obese individuals among the lower bacterial richness group also gain more weight over time. Only a few bacterial species are sufficient to distinguish between individuals with high and low bacterial richness, and even between lean and obese participants. Our classifications based on variation in the gut microbiome identify subsets of individuals in the general white adult population who may be at increased risk of progressing to adiposity-associated co-morbidities.
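The idea of stratifying individuals by microbiota richness can be sketched as follows. This is a simplified illustration, not the method used in the study quoted above: richness is taken here as the number of taxa observed in a sample, the read counts are invented toy values, and both the threshold and the function names (`observed_richness`, `stratify_by_richness`) are hypothetical.

```python
def observed_richness(counts, min_reads=1):
    """Number of taxa supported by at least `min_reads` reads."""
    return sum(1 for reads in counts.values() if reads >= min_reads)


def stratify_by_richness(samples, threshold):
    """Label each sample 'rich' or 'poor' using a richness threshold.

    The threshold is an assumption chosen by the analyst; it is not a
    value taken from the slides.
    """
    groups = {}
    for name, counts in samples.items():
        is_rich = observed_richness(counts) >= threshold
        groups[name] = "rich" if is_rich else "poor"
    return groups


# Toy read counts per genus (invented numbers, for illustration only).
toy_samples = {
    "subject_A": {"Bacteroides": 120, "Faecalibacterium": 80,
                  "Roseburia": 40, "Akkermansia": 10},
    "subject_B": {"Bacteroides": 300, "Faecalibacterium": 0,
                  "Roseburia": 0, "Akkermansia": 2},
}
print(stratify_by_richness(toy_samples, threshold=3))
```

In practice richness estimates depend strongly on sequencing depth, so samples are usually compared at equal depth before any threshold is applied.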

The first Genomics technique: microarrays

One gene at a time

Many genes at the same time

PRE-GENOMICS ERA

GENOMICS ERA

Description of two-colour arrays

What is a microarray?

SOLID SURFACE

PROBES

SAMPLE(TARGET)

Fluorescence scanning

Image analysis

Raw data

Wang et al., Nat. Rev. Genetics 10 (2009)

500 pg total RNA; 100 pg total RNA (Illumina), 10 pg (ultra-low input, Illumina), 500 pg (Roche)

RNAseq vs microarrays for transcriptome analysis

•Much more sensitive than microarrays
•Higher dynamic range
•Real count of sequences vs. fluorescence intensities
•All RNA species can be sequenced (microarray probes are more focused on coding genes)
•Available for all kinds of organisms
•Protocols optimized for very low input
•Cost is falling rapidly
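Because RNAseq yields read counts rather than fluorescence intensities, expression values can be compared between libraries after a simple depth normalization. The sketch below shows counts per million (CPM), one common normalization, on invented toy counts; the gene names and the function name `counts_per_million` are hypothetical.

```python
def counts_per_million(counts):
    """Scale raw read counts so that each library sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library has no mapped reads")
    return {gene: reads * 1e6 / total for gene, reads in counts.items()}


# Toy library (invented numbers): raw reads assigned to three genes.
toy_counts = {"geneA": 500, "geneB": 1500, "geneC": 0}
print(counts_per_million(toy_counts))
```

CPM corrects only for sequencing depth; it does not account for transcript length or for differences in library composition.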

RNAseq library construction

Very high dynamic range (10^5 to 10^7)

Total RNAseq

Nat. Rev. Genetics 2009

More than 95% of the transcripts will be ribosomal

•Poly A+ selection for mRNAseq: 1st strand synthesis is done on oligo-dT attached to magnetic beads.

PROs: very effective at removing ribosomal species. Less sequencing is required for the same coverage compared to total RNA.

CONs: RNA quality is an issue (degraded RNA makes it difficult to sequence the 5’ end). Many RNA species get lost (non-coding, miRNA…)
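The sequencing saving from removing ribosomal RNA can be made concrete with a little arithmetic. The sketch below assumes that every read is either ribosomal or informative, takes the 95% ribosomal fraction quoted for total RNA, and uses an invented 5% residual ribosomal fraction after poly A+ selection; the target of 20 million informative reads and the function name `total_reads_needed` are also assumptions for illustration.

```python
import math


def total_reads_needed(informative_reads, rrna_fraction):
    """Reads to sequence so that the non-ribosomal part reaches the target."""
    if not 0 <= rrna_fraction < 1:
        raise ValueError("rrna_fraction must be in [0, 1)")
    return math.ceil(informative_reads / (1 - rrna_fraction))


target = 20_000_000  # desired non-ribosomal reads (assumed value)

# Total RNA without depletion: 95% of reads are ribosomal.
print(total_reads_needed(target, 0.95))   # roughly 400 million reads

# After poly A+ selection, with an assumed 5% of ribosomal reads left.
print(total_reads_needed(target, 0.05))   # roughly 21 million reads
```

Under these assumptions the unselected library needs about twenty times more sequencing for the same number of informative reads.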

•Standard library construction does not preserve directionality (but protocols are available to generate libraries that do preserve strandedness). This may be particularly useful for finding unannotated genes and ncRNAs and for de novo sequencing.

•Small RNAseq requires specific isolation and RNA library construction protocols.

•FFPE or very poor quality samples can also be sequenced using specific kits and protocols that do not rely on polyA tails.

•Illumina and Ion Torrent sell specific kits for all these kinds of RNA libraries.
•Targeted RNA custom panels also exist.

Other kinds of RNA libraries

Third generation sequencing: PacBio RSII

•AMPLIFICATION OF SAMPLE IS NOT REQUIRED (LOW INPUT, AVOIDS BIAS, MORE UNIFORM COVERAGE, ANALYSIS OF HETEROGENEOUS SAMPLES)

•SMRT Technology (Single Molecule Real Time): a highly processive DNA polymerase plus fluorescently labeled, phospholinked nucleotides recorded in real time → direct observation of nucleotide incorporation

•Long reads (6-10 kb), a small number of reads up to 18 kb

•Single reads show a very high error rate (15%, compared to 0.1-1% for other platforms), but the errors are stochastic, so accuracy is improved by circular consensus sequencing (a high-quality consensus sequence)

•Amplification not required (avoids bias, more uniform coverage)

•Quick delivery of results (runs last from 30 min to 3 hr)

•No problem with GC-rich regions. The modification status of the template nucleotides (5-mC, 5-hmC) can be detected
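Why stochastic errors can be averaged out by circular consensus sequencing can be illustrated with a simple probability model. The sketch below is a deliberate simplification and not the actual consensus algorithm: it assumes each pass over a position is wrong independently with the same probability, and it counts the consensus as wrong whenever more than half of the passes are wrong, which overstates the error because wrong passes would rarely all agree. The function name `majority_error` is hypothetical.

```python
from math import comb  # available in Python 3.8 and later


def majority_error(per_pass_error, passes):
    """Probability that more than half of the passes are wrong at a position.

    Binomial tail: sum over k > passes/2 of C(n, k) * p^k * (1 - p)^(n - k).
    Use an odd number of passes so that ties cannot occur.
    """
    p = per_pass_error
    first_majority = passes // 2 + 1
    return sum(
        comb(passes, k) * p**k * (1 - p) ** (passes - k)
        for k in range(first_majority, passes + 1)
    )


# 15% single-pass error rate, as quoted in the slide above.
for n in (1, 3, 5, 9, 15):
    print(n, majority_error(0.15, n))
```

Under these assumptions the error bound drops from 15% for a single pass to about 6% with three passes, and keeps falling as more passes are added.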

http://smrt.med.cornell.edu/Strategies.html

2016: end of 454 commercialization and support by Roche

https://ncifrederick.cancer.gov/atp/cms/wp-content/uploads/2011/10/pacbio_technology_backgrounder.pdf

Oxford Nanopore Technologies

https://www.nanoporetech.com/technology/the-minion-device-a-miniaturised-sensing-system/the-minion-device-a-miniaturised-sensing-system

Third generation sequencing: nanopore technology

https://www.nanoporetech.com/technology/introduction-to-nanopore-sensing/introduction-to-nanopore-sensing

GridION

Expected to be released in late Nov. 2014

1000$ genome for everybody

??

•18 Tb/run, 2x150 bp length
•Human sequencing only
•Bioinformatics/interpretation not included

In:
-Macrogen (Seoul)
-Broad Institute in Cambridge (Massachusetts)
-Garvan Institute (Sydney)

Human genomes at 30x coverage
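What 30x coverage means in terms of reads can be worked out with the usual average-coverage formula, coverage = (number of reads × read length) / genome size. The sketch below assumes a human genome size of roughly 3.2 Gb (an approximate figure not given in the slides) and 2x150 bp paired-end reads; the function names are hypothetical.

```python
import math

HUMAN_GENOME_BP = 3_200_000_000  # assumed approximate genome size


def mean_coverage(read_pairs, read_length, genome_size=HUMAN_GENOME_BP):
    """Average depth for paired-end reads: 2 * pairs * length / genome size."""
    return 2 * read_pairs * read_length / genome_size


def read_pairs_for_coverage(target, read_length, genome_size=HUMAN_GENOME_BP):
    """Read pairs needed to reach a target average depth."""
    return math.ceil(target * genome_size / (2 * read_length))


pairs = read_pairs_for_coverage(30, 150)
print(pairs)                       # read pairs for 30x with 2x150 bp reads
print(mean_coverage(pairs, 150))   # back-check of the resulting depth
```

With these assumptions, 30x corresponds to about 96 Gb of sequence per genome, or about 320 million read pairs. This is an average only; real coverage varies along the genome and some reads are lost to duplicates and filtering.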

2012

2014

And now….. what?

-Sequencing capabilities have increased dramatically, so obtaining Tb of sequence is no longer an issue.

-Issues to deal with:

Data management

Clinical information

VHIR’s HIGH TECHNOLOGY UNIT (UAT)

•Genomics
•Metabolomics
•Cytomics
•Microscopy

•Statistics and Bioinformatics Unit

Unitat d’Alta Tecnologia (UAT)
VHIR - Mediterrània Building - Ground floor

uat@vhir.org

We offer a set of high-tech services that support teaching activities and research activities in the biomedical field:
