João André Carriço, PhD
Microbiology Institute/Institute for Molecular Medicine
Faculty of Medicine, University of Lisbon, Portugal
Making Use of NGS Data: from Reads to Trees and Annotations
http://im.fm.ul.pt
http://imm.fm.ul.pt
http://www.joaocarrico.info
WORKSHOP 24: NGS FOR MICROBIAL GENOMIC SURVEILLANCE AND MORE - ONE TECHNOLOGY FITS ALL
Making Use of NGS Data: From Reads to Trees and Annotations
To know more: presentation in the session "Controversies in interpreting whole genome sequence data": http://eccmidlive.org/#resources/how-can-we-design-actionable-virulome-databases
Martin Sergeant, Mark Achtman, Nabil-Fareed Alikhan, Zhemin Zhou
Sequenced my strain…now what?
To know more: http://www.slideshare.net/nickloman/eccmid-2015-so-i-have-sequenced-my-genome-what-now
Reads (FASTQ files)
Contigs (FASTA files)
Annotated contigs (GBK/GFF files)
Roary: Pan Genome Analysis
Enterobase BIGSdb
Nullarbor
PHYLOViZ: Tree + metadata visualization
Microreact.org: Tree + metadata + visualization
Prokka
De novo assembler
Prokka: Genome annotation made easy, by Torsten Seemann (slides by Torsten)
Genome annotation: adding biological information to the sequence by describing features
To know more: http://www.slideshare.net/torstenseemann/prokka-rapid-bacterial-genome-annotation-abphm-2013
Available at: https://github.com/tseemann/prokka
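To make the annotation step concrete, here is a minimal sketch of how one assembly could be passed to Prokka from Python. It assumes Prokka is installed and that the `--outdir` and `--prefix` options behave as described in the Prokka README; the file names and the helper functions are hypothetical and are not part of the original slides.

```python
import shutil
import subprocess
from pathlib import Path


def build_prokka_command(contigs, outdir, prefix):
    """Return the argument list for annotating one assembly with Prokka."""
    return [
        "prokka",
        "--outdir", str(outdir),  # directory that will hold the output files
        "--prefix", str(prefix),  # base name shared by the output files
        str(contigs),             # input contigs in FASTA format
    ]


def annotate(contigs, outdir, prefix):
    """Run Prokka if it is on PATH; otherwise only report the command."""
    cmd = build_prokka_command(contigs, outdir, prefix)
    if shutil.which("prokka") is None:
        print("prokka not found on PATH; would run:", " ".join(cmd))
        return None
    subprocess.run(cmd, check=True)
    # Assumption: the GFF annotation is written as <outdir>/<prefix>.gff
    return Path(outdir) / f"{prefix}.gff"


# Hypothetical file names, for illustration only
print(build_prokka_command("contigs.fasta", "annotation", "strain1"))
```

The resulting GFF file is the kind of annotated assembly that Roary takes as input.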
Roary: Pan genome analysis, by Andrew Page
Available at: https://sangerpathogens.github.io/Roary/
Core genome
Accessory genome
Pan-genome
Roary
Inputs: Annotated de novo assemblies (GFF files)
• Typically from the annotation pipeline
Outputs:
• Spreadsheet with presence and absence of genes
• Multi-FASTA alignment of core genes so you can build a tree without a reference
• Multi-FASTA alignments for each gene
• Plots for the open/closed genome, unique genes
• Integrates with Phandango so you can visualise all structural variation
• QC report from Kraken to help identify suspect samples
(Slide by Andrew Page)
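As an illustration of how the presence/absence spreadsheet could be used downstream, the sketch below counts how many strains carry each gene. It assumes a simplified tab-delimited layout (a header row starting with `Gene`, one column per strain, cells `1` or `0`); the real Roary file may carry extra columns, and the gene and strain names here are hypothetical.

```python
import csv
import io


def count_gene_presence(handle):
    """Count, for each gene, the number of strains that carry it.

    Assumed layout: tab-delimited text, header row "Gene" followed by one
    column per strain, and one row per gene with 1 (present) or 0 (absent).
    Returns (number_of_strains, {gene: number_of_strains_with_gene}).
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    strains = header[1:]
    counts = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        gene, cells = row[0], row[1:]
        counts[gene] = sum(1 for cell in cells if cell.strip() == "1")
    return len(strains), counts


# Hypothetical three-strain matrix, for illustration only
example = io.StringIO(
    "Gene\tstrainA\tstrainB\tstrainC\n"
    "gyrA\t1\t1\t1\n"
    "blaTEM\t0\t1\t0\n"
)
n_strains, counts = count_gene_presence(example)
print(n_strains, counts)
```

Per-gene counts like these are what a core/accessory split is based on.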
Roary outputs
Core (n or n-1 strains)
Soft-Core (n-2 or n-3 strains)
Shell (8(?) to n-3 strains)
Cloud (<8(?) strains)
Core genome: Core + Soft-Core
Accessory genome: Shell + Cloud
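The categories above can be sketched as a small classifier. This is only an illustration of the thresholds as written on the slide: the cut-off of 8 strains is marked "(?)" there, so it is kept as a parameter, and the overlap at n-3 between Soft-Core and Shell is resolved in favour of Soft-Core, which is an assumption. Function and gene names are hypothetical.

```python
def classify_gene(n_present, n_strains, cloud_cutoff=8):
    """Assign a gene to core, soft-core, shell or cloud.

    n_present: number of strains carrying the gene
    n_strains: total number of strains (n)
    cloud_cutoff: genes in fewer strains than this are "cloud"
                  (8 on the slide, marked as uncertain there)
    """
    if not 0 <= n_present <= n_strains:
        raise ValueError("n_present must be between 0 and n_strains")
    if n_present >= n_strains - 1:   # n or n-1 strains
        return "core"
    if n_present >= n_strains - 3:   # n-2 or n-3 strains (assumed tie-break)
        return "soft-core"
    if n_present >= cloud_cutoff:    # cut-off up to n-4 strains
        return "shell"
    return "cloud"


def split_pan_genome(counts, n_strains, cloud_cutoff=8):
    """Split genes into (core genome, accessory genome) lists."""
    core_genome, accessory_genome = [], []
    for gene, n_present in counts.items():
        category = classify_gene(n_present, n_strains, cloud_cutoff)
        if category in ("core", "soft-core"):
            core_genome.append(gene)
        else:
            accessory_genome.append(gene)
    return core_genome, accessory_genome


# Hypothetical counts for 20 strains, for illustration only
counts = {"geneA": 20, "geneB": 18, "geneC": 12, "geneD": 2}
print(split_pan_genome(counts, n_strains=20))
```

With few strains the Shell category can be empty under these thresholds, which is one reason to treat the cut-offs as adjustable.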
Roary outputs
iCANDY output of presence and absence of genes in accessory genome. S. Weltevreden & public S. enterica genomes
(Slide by Andrew Page)
Nullarbor: Complete pipeline from reads to reports, by Torsten Seemann
The objective is to automate analysis for everyday use in public health labs and research settings
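As a rough sketch of what automating the pipeline's input could look like, the code below writes a tab-separated sample sheet (isolate ID, forward reads, reverse reads) and assembles a Nullarbor command line. The sample sheet layout and the `--name`, `--ref`, `--input` and `--outdir` options are taken from my reading of the Nullarbor README and should be checked against the installed version; all file names and helper functions are hypothetical.

```python
import io


def write_sample_sheet(samples, handle):
    """Write one line per isolate: ID, R1 path, R2 path (tab-separated)."""
    for isolate_id, r1, r2 in samples:
        handle.write(f"{isolate_id}\t{r1}\t{r2}\n")


def build_nullarbor_command(name, reference, sample_sheet, outdir):
    """Return the argument list for a Nullarbor run (options assumed)."""
    return [
        "nullarbor.pl",
        "--name", name,           # project name used in the report
        "--ref", reference,       # reference genome
        "--input", sample_sheet,  # tab-separated sample sheet
        "--outdir", outdir,       # directory for the results
    ]


# Hypothetical isolates and paths, for illustration only
samples = [
    ("isolate1", "isolate1_R1.fastq.gz", "isolate1_R2.fastq.gz"),
    ("isolate2", "isolate2_R1.fastq.gz", "isolate2_R2.fastq.gz"),
]
sheet = io.StringIO()
write_sample_sheet(samples, sheet)
print(sheet.getvalue(), end="")
print(" ".join(build_nullarbor_command("outbreak", "ref.fasta", "samples.tab", "results")))
```

The command is only built, not executed, so the sketch can be adapted before anything is run on real data.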
microreact.org
Available at: http://microreact.org/
Presentation in the session "Harnessing whole genome sequence data for public health applications": Novel open access tools for WGS-based pathogen surveillance and the identification of high-risk clones
Take home messages
• Huge variety of software and database solutions
• There is no single One-Size-Fits-All solution (job security for bioinformaticians)
• Different questions require different approaches
• Always question the results and data provenance
ECCMID 2015 Meet-the-expert session on "What bioinformatic tools should I use for analysis of High Throughput Sequencing data for molecular diagnostics?"
Nick Loman: http://www.slideshare.net/nickloman/eccmid-2015-meettheexpert-bioinformatics-tools
João André Carriço: http://www.slideshare.net/joaoandrecarrico/eccmid-meet-theexpert2015
More references/presentations
Acknowledgments
UMMI Members: Bruno Gonçalves, Mário Ramirez, José Melo-Cristino
INESC-ID: Alexandre Francisco, Cátia Vaz, Marta Nascimento
FP7 PathoNGenTrace (http://www.patho-ngen-trace.eu/): Dag Harmsen (Univ. Muenster), Stefan Niemann (Research Center Borstel), Keith Jolley, James Bray and Martin Maiden (Univ. Oxford), Joerg Rothganger (RIDOM), Hannes Pouseele (Applied Maths)
Genome Canada IRIDA project, Integrated Rapid Infectious Disease Analysis (www.irida.ca): Franklin Bristow, Thomas Matthews, Aaron Petkau, Morag Graham and Gary Van Domselaar (NLM, PHAC), Ed Taboada and Peter Kruczkiewicz (Lab Foodborne Zoonoses, PHAC), Fiona Brinkman (SFU), William Hsiao (BCCDC)