Dec 20, 2015
Genomics
Genomics is the study of an organism's genome and the function of its genes.
Well over 200 microbial genomes have been completely sequenced.
Key question: how can we use this rich source of information?
Functional genomics: from single genes to all genes
• GENOME (DNA): organisation (HT sequencing)
• TRANSCRIPTOME (RNA): expression (DNA arrays)
• PROTEOME (protein): synthesis/structure (2D gels, MS, NMR, X-ray)
• METABOLOME (metabolism): flux (NMR, kinetics, model)
All four levels contribute to FUNCTION.
Reading the genome map
Steps
1. Determine the complete DNA sequence
2. Predict genes
3. Translate genes to proteins
4. Predict functions of proteins
5. Reconstruct metabolic pathways
6. Predict regulatory elements
7. Reconstruct regulatory networks
Next: experimental confirmation by transcriptomics, proteomics, metabolomics
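Steps 2 and 3 above (predict genes, translate genes to proteins) can be illustrated with a deliberately naive sketch. It assumes that a gene is simply an open reading frame running from ATG to the first in-frame stop codon on the forward strand, and it uses the standard genetic code. Real gene finders also scan the reverse strand and use statistical models. The function names `translate` and `find_orfs` and the `min_codons` threshold are illustrative choices, not part of the original slides.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate(dna):
    """Translate a DNA string codon by codon; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def find_orfs(dna, min_codons=10):
    """Return (start, end, protein) for forward-strand ORFs (assumed model).

    start is the 0-based index of the ATG, end is one past the stop codon,
    and protein excludes the stop. min_codons is a hypothetical cut-off.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, translate(dna[start:pos])))
                start = None
    return orfs


# Toy example: one short ORF in the third reading frame.
print(find_orfs("CCATGAAATTTGGGTAACC", min_codons=3))  # [(2, 17, 'MKFG')]
```

The later steps (function prediction, pathway and network reconstruction) depend on database searches and are not covered by this sketch.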
Raw sequence data: a bacterial sequence of 2,000,000 to 5,000,000 nucleotides
AAACACTTAGACAATCAATATAAAGATGAAGTGAACGCTCTTAAAGAGAAGTTGGAAAACTTGCAGGAACAAATCAAAGATCAAAAAAGGATAGAAGAACAAGAAAAACCAC...
(excerpt; the slide repeats this block of sequence)
A virtual cell: overview of predicted pathways
Genomics: from sequence to predicted function
What do we want to learn?
Overview of:
• complete repertoire of genes and proteins
• complete metabolic network
• complete regulatory network
• diversity and evolution
Systems biology: understand how a whole cell works
Genome content
              bacteria   yeast    worm     fly      man
Size (Mb)     2          12       97       137      3,500
Total genes   2,000      6,300    19,000   14,000   30,000 ?
[Figure: % genes per genome; the remainder is labelled "junk ?"]
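The genome sizes and gene counts above imply very different gene densities. The short calculation below makes that explicit. It is a sketch that takes the slide's figures at face value; the human gene count is marked as uncertain on the slide, and the dictionary and function names are illustrative.

```python
# (size in Mb, total genes), copied from the slide; "man" is uncertain there.
GENOMES = {
    "bacteria": (2, 2_000),
    "yeast": (12, 6_300),
    "worm": (97, 19_000),
    "fly": (137, 14_000),
    "man": (3_500, 30_000),
}


def genes_per_mb(size_mb, total_genes):
    """Average number of genes per megabase of genome."""
    return total_genes / size_mb


density = {name: genes_per_mb(*values) for name, values in GENOMES.items()}

for name, value in density.items():
    print(f"{name:9s} {value:8.1f} genes/Mb")
```

With these numbers the bacterial genome carries about 1,000 genes per Mb (roughly one gene per kb), while the human genome carries fewer than 10 genes per Mb, which is what motivates the "junk ?" question.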
Microbial genomes
Microbial genome sequencing
• 1995-2000: mainly pathogenic bacteria
• 2000-present: genomes of many food-relevant micro-organisms
  - Lactic acid bacteria
  - Food spoilage bacteria
Genome Sequencing Projects
[Figure: genome sequencing projects over time; axis ticks 1997, 2000, 2003]
2005: 250 complete genomes, 600 million bases, 600 thousand proteins
Microbial genomes (status Sept. 2004)
                    Archaea      Bacteria
sequenced genomes   23           236
size range (Mb)     0.5-5.8      0.6-9.1
genes               540-4500     470-8300
% GC                31-68        22-72
Coding density is ~85-90%
Average of ~1 gene per 1 kb
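Two of the quantities in this overview are easy to compute. The sketch below shows how % GC could be calculated from a sequence, and how the rule of thumb of about 1 gene per kb gives a rough gene count from a genome size. Both function names are illustrative, and the gene estimate is only as good as the rule of thumb itself.

```python
def gc_content(seq):
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T")
    gc = sum(base in "GC" for base in counted)
    return 100.0 * gc / len(counted)


def estimate_gene_count(size_mb, genes_per_kb=1.0):
    """Rough gene count, assuming ~1 gene per kb (rule of thumb)."""
    return round(size_mb * 1000 * genes_per_kb)


print(gc_content("ATGCGC"))        # share of G+C in a toy sequence
print(estimate_gene_count(2.0))    # a 2 Mb genome
```

Ambiguous symbols such as N are ignored in `gc_content` so that unresolved bases do not bias the percentage.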
Bacterial genomes
Chromosomes: 1, circular, 0.6 - 9 Mb
Plasmids: 0-10, circular, 1 - 250 kb
Exceptions
Linear chromosomes
• Borrelia burgdorferi 0.91 Mb
• Rickettsia typhi 1.11 Mb
• Desulfotalea psychrophila 3.52 Mb
• Streptomyces coelicolor 8.67 Mb
Two chromosomes
• Ralstonia solanacearum 3.72 and 2.09 Mb
• Agrobacterium tumefaciens 2.84 and 2.07 Mb
• Vibrio cholerae 2.96 and 1.07 Mb
• Brucella melitensis 2.12 and 1.18 Mb
• Deinococcus radiodurans 2.65 and 0.41 Mb
Biological Databases
Database types:
• sequence: EMBL, GenBank
• annotation: SwissProt
• enzyme: Enzyme, Brenda
• genome: Entrez, EBI-Genome Reviews
• structure: PDB, SCOP
• pathway: KEGG, EcoCyc
• organism: FlyBase, WormBase
• organizational: Pfam, COG
Summarized each year in Nucleic Acids Res., January issue
Genome Databases
Main databases
• NCBI Entrez: www.ncbi.nlm.nih.gov/genomes/lproks.cgi
• EBI Genome Reviews: www.ebi.ac.uk/genomes/bacteria.html
• TIGR Comprehensive Microbial Resource (CMR): www.tigr.org/tdb
• Integrated Genomics GOLD: www.genomesonline.org
• CBS Genome Atlas: www.cbs.dtu.dk/services/GenomeAtlas
Genome Databases
Specialized databases
• Sanger Institute (UK): own genomes, many pathogenic bacteria. www.sanger.ac.uk/projects
• Pasteur Institute (France): own genomes, many pathogenic bacteria. www.pasteur.fr/english.html
• MIPS (Germany): PEDANT, all genomes. http://pedant.gsf.de/
• DOE-JGI (USA): own genomes, many microbial and environmental. http://genome.jgi-psf.org/microbial/
Genome Databases
Overviews of databases
• ABIM (France): organism databases. www.up.univ-mrs.fr/~wabim/english/genome.html
Complete Genomes
• COGENT (COmplete GENome Tracking: a flexible data environment for computational genomics), EBI (UK)
• Complete genomes, NCBI (Haemophilus influenzae, E. coli, Mycoplasma genitalium)
• Completed Genomes at the EBI, EBI (UK)
• Completed microbial genomes, InfoBioGen (France)
• Completed microbial genomes, TIGR
• Completely sequenced genomes, Rockefeller (USA)
• EMGLib (completely sequenced bacterial genomes and the yeast genome), PBIL (France)
• Fully Sequenced Genomes Present In The Public DataBases, GOLD (USA)
• Integr8 (integrated views of complete genomes and proteomes), EBI (UK)
• PEDANT (Protein Extraction, Description, and Analysis Tool), MIPS (more than 200 genomes) (Germany)
• E. coli: Wisconsin (USA), see also Entrez (NCBI)
• Rfam (annotating non-coding RNAs in complete genomes), Saint Louis (USA), see also Sanger (UK)
• SACSO (Systematic Analysis of Completely Sequenced Organisms), Pasteur (France)
• TIGR: Genomes (USA)
• Yeast Genome Project (complete genome), MIPS (Germany), see also Genome Viewer, see also Comprehensive Yeast Genome Database
Genome Databases
Comparative genomics databases
• ERGO (USA): comparative genomics analysis. http://ergo.integratedgenomics.com/ERGO
• STRING (Germany): Search Tool for the Retrieval of Interacting Genes/Proteins. http://www.bork.embl-heidelberg.de/STRING
Genome Databases
Metabolic pathway-genome databases (PGDB)
• KEGG (Japan): Kyoto Encyclopedia of Genes and Genomes. http://www.genome.jp/kegg/kegg2.html
• EcoCyc (USA): E. coli metabolic pathways (highly curated). http://www.ecocyc.org
• BioCyc: collection of PGDBs. http://www.biocyc.org
Modeling metabolic networks: what are the questions?
• modeling the components and their wiring (roadmap)
• modeling regulatory interactions (traffic lights)
• modeling fluxes and dynamics (traffic)
• predictive modeling: rational design (solve traffic jams)
• “genomics modeling”: provide biological interpretation of omics data
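The "roadmap" and "traffic" questions can be made concrete with a very small sketch. One common way to write down the wiring of a network is a stoichiometric matrix S (metabolites by reactions), and a flux distribution v is at steady state when S·v = 0. The three-reaction network below is entirely hypothetical, chosen only for illustration; it is not taken from the slides and does not represent any real organism.

```python
# Hypothetical toy network: uptake -> A, A -> B, B -> export.
REACTIONS = ["uptake_A", "A_to_B", "export_B"]
METABOLITES = ["A", "B"]

# Stoichiometric matrix: one row per metabolite, one column per reaction.
# +1 means the reaction produces the metabolite, -1 means it consumes it.
S = [
    [1, -1, 0],   # A: made by uptake, used by A_to_B
    [0, 1, -1],   # B: made by A_to_B, used by export
]


def net_production(matrix, fluxes):
    """Net production rate of each metabolite, i.e. the product S·v."""
    return [sum(c * v for c, v in zip(row, fluxes)) for row in matrix]


def is_steady_state(matrix, fluxes, tol=1e-9):
    """True when no metabolite accumulates or depletes (S·v = 0)."""
    return all(abs(rate) <= tol for rate in net_production(matrix, fluxes))


balanced = [2.0, 2.0, 2.0]     # same flux through every step
jammed = [2.0, 1.0, 1.0]       # A_to_B too slow: A accumulates

print(is_steady_state(S, balanced))
print(dict(zip(METABOLITES, net_production(S, jammed))))
```

In the traffic metaphor, the second flux vector is a traffic jam: metabolite A accumulates because the downstream step cannot keep up with uptake.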