Bioinformatic Analyses of Whole-Genome Sequence Data in a Public Health Laboratory. InFORM 2017, Garden Grove, CA. Dr. Kelly F. Oakeson Ph.D.

Genome Sequence Data in a Public Laboratory - APHL Home · Analysis Workflow Sequence QC High Quality Sequence De novo Genome Assembly Draft Genome Sequence Annotation Draft Genome

Dec 10, 2018

Page 1

Bioinformatic Analyses of Whole-Genome Sequence Data in a Public Health Laboratory

InFORM 2017

Garden Grove, CA

Dr. Kelly F. Oakeson Ph.D.

Page 2
Page 3

UPHL Bioinformatic Workflow

Computational Requirements & Throughput

Oakeson KF, Wagner JM, Mendenhall M, Rohrwasser A, Atkinson-Dunn R. Bioinformatic Analyses of Whole-Genome Sequence Data in a Public Health Laboratory. Emerging Infect Dis. 2017 Sep;23(9):1441–5.

Page 4

Analysis Workflow

• Sequence QC
• High Quality Sequence
• De novo Genome Assembly
• Draft Genome Sequence
• Annotation
• Draft Genome Annotation
• Phylogenetic Relationships
• Phylogenetic Tree Construction

Page 5

Analysis Workflow

• Sequence QC
• High Quality Sequence

Page 6

Sequence QC with SeqyClean

Ilya Y. Zhbannikov, Samuel S. Hunter, James A. Foster, and Matthew L. Settles. 2017. SeqyClean: A Pipeline for High-throughput Sequence Data Preprocessing. In Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics (ACM-BCB '17). ACM, New York, NY, USA, 407-416. DOI: https://doi.org/10.1145/3107411.3107446

Page 7

Analysis Workflow

• Sequence QC
• High Quality Sequence
• De novo Genome Assembly
• Draft Genome Sequence

Page 8

De novo Genome Assembly with SPAdes

• Determine Sequence Overlap
• Assembled Overlapping Sequence
• Assembled Draft Genome Sequence

Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, et al. SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing. J Comput Biol. 2012 May;19(5):455–77.
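
The overlap-and-merge idea named on this slide can be illustrated with a toy sketch. This is not how SPAdes works internally (SPAdes is a de Bruijn graph assembler); it is a deliberately simple greedy merge, and the function names, the example reads, and the `min_overlap` threshold are all hypothetical choices made for illustration.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads: list[str], min_overlap: int = 3) -> list[str]:
    """Repeatedly merge the two sequences that share the longest overlap."""
    contigs = list(reads)
    while len(contigs) > 1:
        best = (0, -1, -1)  # (overlap size, index of left read, index of right read)
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i != j:
                    size = overlap_length(left, right, min_overlap)
                    if size > best[0]:
                        best = (size, i, j)
        size, i, j = best
        if size == 0:
            break  # nothing overlaps any more; remaining contigs stay separate
        merged = contigs[i] + contigs[j][size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)] + [merged]
    return contigs


# Three made-up reads that tile a short sequence.
reads = ["ATGGCGT", "GCGTACC", "ACCTGAA"]
contigs = greedy_assemble(reads)
print(contigs)  # ['ATGGCGTACCTGAA']
```

Real sequencing data contain errors, repeats, and millions of reads, which is why production assemblers rely on graph-based methods rather than pairwise greedy merging.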

Page 9

Analysis Workflow

• Sequence QC
• High Quality Sequence
• De novo Genome Assembly
• Draft Genome Sequence
• Annotation
• Draft Genome Annotation

Page 10

Draft Genome Annotation with Prokka

Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinformatics. 2014 Jul 15;30(14):2068–9.

Page 11

De novo Genome Assembly & Annotation

Page 12

Analysis Workflow

• Sequence QC
• High Quality Sequence
• De novo Genome Assembly
• Draft Genome Sequence
• Annotation
• Draft Genome Annotation
• Phylogenetic Relationships
• Phylogenetic Tree Construction

Page 13

Phylogenetic Analysis with Roary

Page AJ, Cummins CA, Hunt M, Wong VK, Reuter S, Holden MTG, et al. Roary: rapid large-scale prokaryote pan genome analysis. Bioinformatics. 2015 Nov 15;31(22):3691–3.

Page 14

Phylogenetic Analysis with RAxML

Stamatakis A. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics. 2006;22(21):2688–90.
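
The tools named on the preceding slides (SeqyClean, SPAdes, Prokka, Roary, RAxML) could be chained roughly as in the sketch below. It only builds command lines and does not run anything. The slides do not give the actual commands, so the flag choices, the directory layout, the assumed SeqyClean output suffixes (`_PE1.fastq`, `_PE2.fastq`), the `raxmlHPC` binary name, and both function names are assumptions made for illustration; check each tool's own documentation before use.

```python
import shlex
from pathlib import Path


def per_sample_commands(sample, r1, r2, workdir="results", threads=8):
    """Build (but do not run) QC, assembly, and annotation commands for one isolate."""
    out = Path(workdir)
    qc_prefix = out / "qc" / sample            # hypothetical layout
    assembly_dir = out / "assembly" / sample
    annotation_dir = out / "annotation" / sample
    return {
        # Sequence QC: paired-end reads in, quality-trimmed reads out.
        "sequence_qc": ["seqyclean", "-1", r1, "-2", r2, "-qual", "-o", str(qc_prefix)],
        # De novo assembly of the cleaned pairs (output names assumed).
        "assembly": ["spades.py", "-1", f"{qc_prefix}_PE1.fastq",
                     "-2", f"{qc_prefix}_PE2.fastq",
                     "-t", str(threads), "-o", str(assembly_dir)],
        # Annotation of the draft genome contigs.
        "annotation": ["prokka", "--outdir", str(annotation_dir), "--prefix", sample,
                       "--cpus", str(threads), str(assembly_dir / "contigs.fasta")],
    }


def cohort_commands(samples, workdir="results", threads=8):
    """Build the steps that need every annotated isolate at once."""
    out = Path(workdir)
    roary_dir = out / "roary"
    gffs = [str(out / "annotation" / s / f"{s}.gff") for s in samples]
    return {
        # Pan-genome analysis with a core gene alignment (-e), aligned with MAFFT (-n).
        "pan_genome": ["roary", "-e", "-n", "-p", str(threads), "-f", str(roary_dir), *gffs],
        # Maximum likelihood tree from the core gene alignment.
        "tree": ["raxmlHPC", "-s", str(roary_dir / "core_gene_alignment.aln"),
                 "-n", "core_tree", "-m", "GTRGAMMA", "-p", "12345"],
    }


sample_cmds = per_sample_commands("isolate1", "isolate1_R1.fastq", "isolate1_R2.fastq")
cohort_cmds = cohort_commands(["isolate1", "isolate2"])
for stage, cmd in {**sample_cmds, **cohort_cmds}.items():
    print(f"{stage}: {shlex.join(cmd)}")
```

Keeping the per-isolate steps separate from the cohort-level steps mirrors the workflow: QC, assembly, and annotation run once per isolate, while the pan-genome and tree steps need all annotated genomes together.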

Page 15

Phylogenetic Analysis

Page 16

Campylobacter jejuni

• May 2014, three confirmed cases of C. jejuni infections
• Identical PFGE patterns
• All patients reported raw milk consumption from dairy “A”
• Additional cases identified during May and June
• Outbreak investigation initiated June 10, 2014
• Total of 99 cases

Davis KR, Dunn AC, Burnett C, McCullough L, Dimond M, Wagner J, et al. Campylobacter jejuni Infections Associated with Raw Milk Consumption--Utah, 2014. MMWR Morb Mortal Wkly Rep. 2016 Apr 1;65(12):301–5.

Page 17

Campylobacter jejuni

PFGE

• PFGE performed on 79 isolates
• 61 patient derived isolates
• 18 isolates derived from bulk milk storage tanks
• 76 of 79 isolates have indistinguishable SmaI PFGE patterns
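
As a quick consistency check on the counts above, the tally can be worked through in a few lines. The variable names are illustrative; the numbers are the ones stated on the slide.

```python
# Isolate counts as stated on the slide.
patient_isolates = 61
milk_tank_isolates = 18
indistinguishable = 76

total_isolates = patient_isolates + milk_tank_isolates   # 61 + 18
differing = total_isolates - indistinguishable           # isolates with a different pattern
share = indistinguishable / total_isolates

print(f"Total isolates typed by PFGE: {total_isolates}")
print(f"Indistinguishable SmaI patterns: {indistinguishable}/{total_isolates} ({share:.1%})")
print(f"Isolates with a differing pattern: {differing}")
```

The patient and milk-tank isolates sum to the 79 typed isolates, and 76 of 79 corresponds to about 96.2% sharing one pattern.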

Page 18
Page 19

Salmonella enterica

• Complex Multi-state Outbreak
• Associated with Rotisserie Chicken
• Five Distinct PFGE Patterns
• 88 Isolates in Total
• 80 Patient Derived Isolates
• 8 Environmental Isolates
• Sequence Data Obtained From SRA

Page 20
Page 21

Analysis Pipeline

• Sequence QC
• High Quality Sequence
• De novo Genome Assembly
• Draft Genome Sequence
• Annotation
• Draft Genome Annotation
• Phylogenetic Relationships
• Signatures of Selection
• Phylogenetic Tree Construction
• Phylogenetic Analysis

Page 22

Molecular Evolutionary Analysis

Page 23

Thank You

Utah Public Health Laboratory

Robyn Atkinson-Dunn

Andy Rohrwasser

Jenni Wagner

Erik Poole