Annotating Metagenomes Using the NMPDR Rob Edwards Department of Computer Sciences, San Diego State University Mathematics and Computer Sciences Division, Argonne National Laboratory ASM General Meeting, Boston. www.nmpdr.org www.theseed.org See also poster: B-179 (126B) Aziz et al
Annotating Metagenomes Using the NMPDR
Rob Edwards
Department of Computer Sciences, San Diego State University
Mathematics and Computer Sciences Division, Argonne National Laboratory
ASM General Meeting, Boston.
www.nmpdr.org www.theseed.org
See also poster: B-179 (126B), Aziz et al.
How much has been sequenced?
[Figure: number of known sequences by year, with markers for the first bacterial genome, 100 bacterial genomes, 1,000 bacterial genomes, and environmental sequencing.]
How much will be sequenced?
[Figure labels: 100 people; everybody in Boston; everybody in USA; all cultured Bacteria; one genome from every species; most major microbial environments.]
The Problem
How do you generate consistent and accurate annotations for metagenomes?
The SEED Family
Annotations using subsystems
FIG has developed the notion of a subsystem: a generalization of “pathway” to a collection of functional roles jointly involved in a biological process or complex.
Subsystems have been extended into FIGfams: protein families whose members perform the same function.
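One way to picture a subsystem as a data structure is a named set of functional roles plus, for each genome, the genes assigned to each role. The sketch below is illustrative only: the class name, method names, role names, and gene identifiers are all hypothetical and are not taken from the SEED or NMPDR code base.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Subsystem:
    """Hypothetical model: a named collection of functional roles."""
    name: str
    roles: List[str]
    # genome -> role -> gene identifiers annotated with that role
    assignments: Dict[str, Dict[str, List[str]]] = field(default_factory=dict)

    def assign(self, genome: str, role: str, gene_id: str) -> None:
        """Record that a gene in a genome fills one of the subsystem's roles."""
        if role not in self.roles:
            raise ValueError(f"{role!r} is not a role of subsystem {self.name!r}")
        self.assignments.setdefault(genome, {}).setdefault(role, []).append(gene_id)

    def missing_roles(self, genome: str) -> List[str]:
        """Roles of the subsystem with no assigned gene in the given genome."""
        present = self.assignments.get(genome, {})
        return [role for role in self.roles if role not in present]


# Made-up example data, for illustration only.
demo = Subsystem("Example subsystem", ["role A", "role B", "role C"])
demo.assign("genome_1", "role A", "gene_0001")
demo.assign("genome_1", "role C", "gene_0042")
print(demo.missing_roles("genome_1"))
```

Listing the roles that have no gene in a genome is the kind of question a populated subsystem makes easy to ask; how the SEED actually stores and queries this information is not described in these slides.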
Subsystems make up metabolism
[Figure: metabolism overview. Source: Wikipedia, Metabolism portal, http://en.wikipedia.org/wiki/Portal:Metabolism]
SEED Viewer
Populated Subsystem
Subsystems Are Not Just Pathways
• predicted or measured co-regulation
• genome context (virulence islands, prophages, conserved gene clusters)
• virulence mechanism
• cellular localization
• enzymatic activity
• common phenotype
• combinations of criteria
Automated Annotations of Complete Genomes
• Automated, user-initiated processing
• Takes 1-7 hours depending on size and complexity of the genome
• ~1,500 external submissions, including 150 genomes not yet publicly released.
• Reannotation of >500 genomes complete
• 789 users, 160 organizations, 25 countries.
http://rast.nmpdr.org/
Automated Annotations of Complete Metagenomes
MG-RAST Server
Accurate and consistent annotations in a few days
Automatic metabolic reconstruction
Freely available after registration
http://metagenomics.theseed.org/
Metagenome Annotation
Automated pipeline: upload sequences in FASTA format, with or without