Comprehensive Microbial Resource
www.tigr.org/CMR
Bioinformatics Visualization Workshop
Owen White
May 30, 2002
Dec 22, 2015
Curation
Genome Annotation: Michelle Gwinn, Bob Dodson, Bob DeBoy, James Kolonay, Bill Nelson, Ramana Madupu, Sean Daugherty, Maureen Beanan, Scott Durkin, Lauren Brinkac
Bioinformatics Engineers: Jeremy Peterson, Lowell Umayam, Samuel Angiuoli
TIGRFAMs/Groups: Dan Haft, Jeremy Selengut
Maria Ermolaeva (Operons/Terminators)
Erik Ferlanti (All vs. All)
Faculty: Jonathan Eisen (DNA repair), Ian Paulsen (transporters), Steven Salzberg
Collaborators: Swiss-prot, Monica Riley, The open source crowd, Art Delcher (Glimmer)
Retrieval
[Figure: caudal fin shapes (heterocercal, forked, lunate, emarginate, truncate, rounded, pointed). Source: http://web.pdx.edu/~bowersn/bi399/lecture2.html]
[Figure labels: Caudal Fins, Dorsal Spines, Dorsal Rays]
Retrieval across data types.
Typical annotation datatypes
clone_info: Tracks information related to the parent nucleotide assembly, including its annotation status, the institution from which the sequence was derived, and whether it is part of a larger assembly such as a chromosome.
asm_feature: All major features of the parent assembly are stored here, including annotated genes, predicted genes, repetitive elements, splice sites, and all underlying components of a gene (models, transcript exons, and cds exons).
phys_ev: Attribute for each gene component within the asm_feature table. For example, each predicted and annotated gene has a model and multiple exons stored in the asm_feature table. Linking a feature to phys_ev identifies the type of feature present, i.e. glimmer, genscan+, genemarkHMM, or working (annotation). This becomes important if a single feature in the asm_feature table is shared by multiple model types.
feat_link: This table is key to the principles behind representing gene models in the database. All parent and child relationships are defined here.
evidence: The main repository for all sequence database search results. Also, it retains information regarding gene model attributes such as the best blast match and all Pfam matches.
ident: Stores attributes for the highest element of the gene component hierarchy, the transcriptional unit. Gene names, loci, EC symbols, and other attributes are available.
role_link: The role category assignments for each gene are available here. Roles include examples such as ‘transcription’, ‘DNA synthesis’, ‘translation’, ‘DNA repair’, ‘amino acid metabolism’, etc.
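The parent/child idea behind feat_link can be illustrated with a small in-memory database. This is only a sketch: the table names come from the slide, but every column name (feat_name, feat_type, parent_feat, child_feat, ev_type) and every row is a hypothetical stand-in, not the real schema.

```python
import sqlite3


def build_demo_db():
    """Create a toy database; all column names and rows are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE asm_feature (feat_name TEXT PRIMARY KEY, feat_type TEXT);
        CREATE TABLE feat_link (parent_feat TEXT, child_feat TEXT);
        CREATE TABLE phys_ev (feat_name TEXT, ev_type TEXT);
        """
    )
    # One transcriptional unit, one gene model, two exons (made-up data).
    conn.executemany(
        "INSERT INTO asm_feature VALUES (?, ?)",
        [("TU1", "TU"), ("M1", "model"), ("E1", "exon"), ("E2", "exon")],
    )
    # feat_link holds every parent -> child relationship.
    conn.executemany(
        "INSERT INTO feat_link VALUES (?, ?)",
        [("TU1", "M1"), ("M1", "E1"), ("M1", "E2")],
    )
    # phys_ev records which method produced each component.
    conn.executemany(
        "INSERT INTO phys_ev VALUES (?, ?)",
        [("M1", "glimmer"), ("E1", "glimmer"), ("E2", "glimmer")],
    )
    return conn


def children_of(conn, parent):
    """Return (name, type, evidence type) for each direct child of parent."""
    rows = conn.execute(
        """
        SELECT f.feat_name, f.feat_type, p.ev_type
        FROM feat_link AS l
        JOIN asm_feature AS f ON f.feat_name = l.child_feat
        LEFT JOIN phys_ev AS p ON p.feat_name = f.feat_name
        WHERE l.parent_feat = ?
        ORDER BY f.feat_name
        """,
        (parent,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    print(children_of(demo, "M1"))
```

Keeping the relationships in a separate link table, rather than a parent column on each feature, is what would let one exon be shared by several models, which is the situation the phys_ev note describes.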
Omniome Content, Genes
Total # of genes: 132,998 from world-wide effort (43,311 TIGR projects).
36,274 w/ genetic names.
15,098 genes placed into 5,451 paralogous families.
413 rRNAs.
1311 tRNAs.
49 sRNAs.
293 IS elements.
Omniome Content
Evidence:
1073 distinct EC#s, assigned to 17308 genes
Rows of allVall data: 3,996,851
Rows of HMM TIGRFAM data: 91,550
Rows of HMM Pfam data: 131,963
Rows of COG data: 149,940
Rows of Interpro data: 175,760
Rows of Prosite data: 53,132
Rows of BER data: 91,899
TIGRFAM Matrix
The Genome Browser: Linear Display of DNA Molecules
Genome vs. Genome Protein Hits
MUMmer: The Whole Genome Alignment Tool
Role Category Graph
Multi-Genome Query Tool
Query across all genomes based on different properties:
MW, pI, membrane spanning regions
Taxon, Paralogous families, TIGRFAMs, Role Category
Best Match to: organism, locus, kingdom, etc.
“Genes with >5 membrane spanning regions and MW 36,000-51,000d.”
“E. coli genes with best match to Archaeoglobus involved in DNA metabolism.”
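A query like the first example can be thought of as a set of optional filters combined with AND. The sketch below is not the CMR implementation; the Gene record, its field names, and the sample values are all hypothetical and serve only to show the shape of such a query.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    # Hypothetical record; field names are illustrative, not the CMR schema.
    locus: str
    organism: str
    mol_weight: float  # molecular weight in daltons
    tm_regions: int    # predicted membrane spanning regions


def query_genes(genes, more_than_tm=None, mw_range=None, organism=None):
    """Return genes passing every filter that is set (filters are ANDed)."""
    hits = []
    for gene in genes:
        # Strictly greater than, matching the ">5" wording of the example.
        if more_than_tm is not None and gene.tm_regions <= more_than_tm:
            continue
        # Molecular weight range is treated as inclusive at both ends.
        if mw_range is not None and not (mw_range[0] <= gene.mol_weight <= mw_range[1]):
            continue
        if organism is not None and gene.organism != organism:
            continue
        hits.append(gene)
    return hits


if __name__ == "__main__":
    # Made-up genes for demonstration only.
    sample = [
        Gene("X0001", "Organism A", 42000.0, 7),
        Gene("X0002", "Organism A", 42000.0, 5),
        Gene("Y0001", "Organism B", 60000.0, 9),
    ]
    for gene in query_genes(sample, more_than_tm=5, mw_range=(36000, 51000)):
        print(gene.locus)
```

Leaving a filter as None skips it, so the same function covers single-property and multi-property queries.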
Pseudo-Restriction Digest and Linear Depiction of Cuts
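An in-silico digest of this kind can be sketched as a search for a recognition site followed by a fragment-length calculation. The code below is a minimal illustration, not the CMR tool: the function names are hypothetical, only the given strand is searched (sufficient for palindromic sites such as EcoRI's GAATTC, which is cut after the first G), and the molecule is assumed to be linear and non-empty.

```python
def cut_positions(sequence, site, cut_offset):
    """Return 0-based cut coordinates for every occurrence of site.

    cut_offset is the number of bases from the start of the site to the cut.
    """
    seq = sequence.upper()
    site = site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + cut_offset)
        start = seq.find(site, start + 1)  # step by one to catch overlaps
    return positions


def fragment_lengths(sequence, positions):
    """Fragment sizes for a linear molecule cut at the given positions."""
    bounds = [0] + sorted(positions) + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def linear_map(sequence, positions, width=60):
    """Draw the molecule as a line of '-' with '|' marking each cut."""
    scale = width / len(sequence)  # assumes a non-empty sequence
    line = ["-"] * width
    for pos in positions:
        line[min(width - 1, int(pos * scale))] = "|"
    return "".join(line)


if __name__ == "__main__":
    dna = "AAAGAATTCCCCGAATTCTT"  # made-up sequence with two EcoRI sites
    cuts = cut_positions(dna, "GAATTC", cut_offset=1)
    print(cuts, fragment_lengths(dna, cuts))
    print(linear_map(dna, cuts, width=len(dna)))
```

A circular molecule would need the first and last fragments merged, which this sketch deliberately leaves out.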
Position effect: