Pathogen Profiling Pipeline: A Metagenomics Tool for Rapid Identification of Pathogens from Clinical Specimens. Tom Matthews, National Microbiology Laboratory, Public Health Agency of Canada, [email protected]. Presented June 27, 2009 at the M3 SIG, ISMB/ECCB 2009.

Pathogen Profiling Pipeline

Dec 05, 2014


Metagenomic sample analysis pipeline.
Talk given at the M3 SIG at ISMB/ECCB 2009 in Stockholm, Sweden.
Transcript
Page 1: Pathogen Profiling Pipeline

June 27, 2009 | Pathogen Profiling Pipeline | M3 SIG – ISMB/ECCB 2009

Pathogen Profiling Pipeline

Tom Matthews
National Microbiology Laboratory
Public Health Agency of Canada

[email protected]

A Metagenomics Tool for Rapid Identification of Pathogens from Clinical Specimens

Page 2: Pathogen Profiling Pipeline

Introduction

● For novel or emerging diseases, classical pathogen identification may not always produce results

● Advances in next-gen sequencing technology
  ● Characterize samples at the genomic level

● Pathogen Profiling Pipeline
  ● Bioinformatics pipeline
  ● Analysis of host and microbial nucleic acids

Page 3: Pathogen Profiling Pipeline

Features

● Nucleotide and protein BLAST analysis
● Unbiased analysis of input reads
● Clustered execution
● Web front-end
● Custom analysis pipelines
● Easily viewed results

Page 4: Pathogen Profiling Pipeline

Filtering Overview

● BLAST analysis performed against reference sequence database
● Assigns hits according to cut-off criteria
● Calculates equivalent hits
● Clustered BLAST and filtering

Page 5: Pathogen Profiling Pipeline

Last Common Ancestor Estimation

● Uses equivalent hits for LCA calculation

● User specifies equivalent hit percentage cutoff

● NCBI taxonomy database for ancestor lookup

● Walks up taxonomy tree to find lowest intersection of all leaf nodes

● Unbiased approach

[Figure: example taxonomy tree with leaf nodes Vaccinia, Camelpox, Taterapox, and Variola under the common ancestor Orthopoxvirus]
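
The tree walk described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline's own code: `PARENT` is a small, simplified, hand-written stand-in for the NCBI taxonomy (intermediate ranks are skipped), and the function names `lineage` and `last_common_ancestor` are hypothetical.

```python
# Toy child -> parent map standing in for the NCBI taxonomy.
# Assumption: a real run would build this from the NCBI taxonomy database;
# intermediate ranks are deliberately omitted here to keep the example small.
PARENT = {
    "Vaccinia": "Orthopoxvirus",
    "Camelpox": "Orthopoxvirus",
    "Taterapox": "Orthopoxvirus",
    "Variola": "Orthopoxvirus",
    "Orthopoxvirus": "Poxviridae",
    "Influenza A": "Orthomyxoviridae",
    "Poxviridae": "Viruses",
    "Orthomyxoviridae": "Viruses",
    "Viruses": "root",
}


def lineage(taxon, parent=PARENT):
    """Return the path from a taxon up to the root, starting with the taxon."""
    path = [taxon]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def last_common_ancestor(taxa, parent=PARENT):
    """Return the lowest node shared by the lineages of all given taxa."""
    taxa = list(taxa)
    if not taxa:
        raise ValueError("need at least one taxon")
    # Collect the nodes that appear in every lineage.
    shared = set(lineage(taxa[0], parent))
    for taxon in taxa[1:]:
        shared &= set(lineage(taxon, parent))
    # Walking up from one leaf, the first shared node is the lowest one.
    for node in lineage(taxa[0], parent):
        if node in shared:
            return node
    return None  # no common node (e.g. a taxon missing from the map)
```

With the four poxviruses from the figure as equivalent hits, the walk stops at Orthopoxvirus; adding an unrelated hit such as Influenza A pushes the estimate up to a much broader node, which is why the equivalent-hit cutoff matters.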

Page 6: Pathogen Profiling Pipeline

Filtering Outputs

● Hits – High-scoring reads that pass the filtering cutoffs

● Equivalent Hits – BLAST hits whose bitscore falls within an assigned percentage of the top hit's bitscore

● Last Common Ancestors – Calculated (estimated) LCA of all the equivalent hits

● Unassigned – Passed to the next pipeline step
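
A minimal sketch of the equivalent-hit selection follows. It assumes that "within an assigned percentage" means a bitscore of at least `top * (1 - percent / 100)`; the slides do not give the exact formula, so treat that reading, the `(subject, bitscore)` tuple layout, and the function name as assumptions.

```python
def equivalent_hits(hits, percent_cutoff):
    """Keep hits whose bitscore is within percent_cutoff of the top bitscore.

    hits: list of (subject_id, bitscore) tuples for a single read.
    percent_cutoff: user-chosen percentage, e.g. 5 for "within 5%".
    Assumption: the threshold is top_bitscore * (1 - percent_cutoff / 100).
    """
    if not hits:
        return []
    top = max(score for _, score in hits)
    threshold = top * (1.0 - percent_cutoff / 100.0)
    return [(subject, score) for subject, score in hits if score >= threshold]


example = [("Vaccinia", 200.0), ("Camelpox", 195.0), ("Variola", 150.0)]
selected = equivalent_hits(example, 5)  # keeps the Vaccinia and Camelpox hits
```

The subjects that survive this step are the leaf nodes that would feed the last common ancestor estimate.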

Page 7: Pathogen Profiling Pipeline

Example Analysis Method

● BLAST reads against host database

● Remove host reads

● BLAST unassigned against reference database

● Filter hits vs. unassigned

● Repeat...

● Post analysis

[Figure: workflow diagram. Box labels: Sample reads; BLAST and filtering (repeated for each database); Host genome; Viral genome; Bacterial genome; Protozoan genome; Fungal genome; Non-host reads; Pool results; Unique organisms in sample]
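
The staged method on this slide (assign what you can, pass the rest on) can be sketched as a simple loop. This is an illustration only: `run_stages` is a hypothetical name, each `classify` callback stands in for a full BLAST-and-filter step, and the dictionaries in the demo are toy stand-ins for reference databases. No BLAST is actually run.

```python
def run_stages(reads, stages):
    """Pass reads through an ordered list of (stage_name, classify) pairs.

    classify(read) returns a label when the read is assigned at that stage,
    or None to leave it unassigned. Only unassigned reads reach the next stage.
    Returns (assigned, remaining): assigned maps stage name to a list of
    (read, label) pairs; remaining holds reads no stage could assign.
    """
    assigned = {}
    remaining = list(reads)
    for name, classify in stages:
        unassigned = []
        for read in remaining:
            label = classify(read)
            if label is None:
                unassigned.append(read)
            else:
                assigned.setdefault(name, []).append((read, label))
        remaining = unassigned
    return assigned, remaining


# Toy "databases": read ID -> organism (hypothetical data for illustration).
host_db = {"read1": "Homo sapiens"}
viral_db = {"read2": "Vaccinia"}

stages = [("host", host_db.get), ("viral", viral_db.get)]
assigned, leftover = run_stages(["read1", "read2", "read3"], stages)
```

Putting the host stage first means host reads are removed before the more specific databases are searched, which mirrors the order of the bullets above.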

Page 8: Pathogen Profiling Pipeline

Pipeline Construction

Page 9: Pathogen Profiling Pipeline

Pipeline Construction

Page 10: Pathogen Profiling Pipeline

Pipeline Construction

Page 11: Pathogen Profiling Pipeline

Pipeline Construction

Page 12: Pathogen Profiling Pipeline

Pipeline Construction

Page 13: Pathogen Profiling Pipeline

Pipeline Execution

● Custom execution manager
● Computes dependencies and monitors running jobs
● Distributes jobs across Linux cluster
● Facilitates unattended clustered executions
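
The slides do not describe how the execution manager is implemented. As a rough sketch of the dependency computation only, the standard-library `graphlib.TopologicalSorter` (Python 3.9+) can group jobs into batches whose members could run in parallel. The job names and the `execution_batches` helper are hypothetical, and nothing here submits work to a cluster.

```python
from graphlib import TopologicalSorter


def execution_batches(dependencies):
    """Group jobs into batches that can run concurrently.

    dependencies: dict mapping each job to the set of jobs it depends on.
    Returns a list of batches (each a sorted list of job names); every job
    in a batch has all of its dependencies in earlier batches.
    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        # A real manager would dispatch these jobs and call done() as each
        # one finishes; here they are marked complete immediately.
        sorter.done(*ready)
    return batches


# Hypothetical job graph for a host-subtraction run.
jobs = {
    "host_blast": set(),
    "host_filter": {"host_blast"},
    "viral_blast": {"host_filter"},
    "bacterial_blast": {"host_filter"},
    "pool_results": {"viral_blast", "bacterial_blast"},
}
batches = execution_batches(jobs)
```

In this example the viral and bacterial searches land in the same batch, since both depend only on the host filtering step.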

Page 14: Pathogen Profiling Pipeline

Reports

Page 15: Pathogen Profiling Pipeline

Drill Down Reports

Page 16: Pathogen Profiling Pipeline

Abundance View

● Displays abundance of taxonomic hits
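
How the abundance view computes its numbers is not shown in the slides. One simple interpretation, assumed here, is a count of reads per assigned taxon together with each taxon's share of all assigned reads; the function name and sample labels are hypothetical.

```python
from collections import Counter


def abundance(assignments):
    """Summarize taxon labels as (taxon, read_count, fraction) rows.

    assignments: iterable of taxon labels, one per assigned read.
    Rows are ordered from most to least abundant.
    """
    counts = Counter(assignments)
    total = sum(counts.values())
    return [(taxon, n, n / total) for taxon, n in counts.most_common()]


rows = abundance(["Vaccinia", "Vaccinia", "Influenza A", "Vaccinia"])
```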

Page 17: Pathogen Profiling Pipeline

Example Run

● Mouth swab input samples
● Two pools:
  ● Samples spiked with Vaccinia and Influenza A
  ● Background reference sample

Page 18: Pathogen Profiling Pipeline

Example Run

Page 19: Pathogen Profiling Pipeline

Example Run

Page 20: Pathogen Profiling Pipeline

Example Run

Page 21: Pathogen Profiling Pipeline

Example Run

Page 22: Pathogen Profiling Pipeline

Wrap-up

● Unbiased analysis of input reads
● Custom analysis pipelines
● Last common ancestor calculation
● Clustered execution
● Multiple report views
● Exportable results

Page 23: Pathogen Profiling Pipeline

Acknowledgements

● Gary Van Domselaar
● Morag Graham
● Shaun Tyler
● Heather Kent
● Kim Melnychuk
● Christine Bonner
● Geoff Peters
● Philip Mabon