Big Picture
• Of ≈1.7 million species classified so far, roughly 6000 are microbes
• The true number of microbial species is certainly far larger than 6000
• “Imagine if our entire understanding of biology was based on a visit to the zoo. That’s where we’ve been in microbiology.” – Norman Pace, Univ of Colorado, Boulder
Diversity of bacteria and archaea
• Only ~1% of all microbial species can be cultured
• 97% of prokaryotic isolates in stock centers are from just 4 phyla:
– Proteobacteria (Escherichia, Helicobacter,
Current estimates of microbial community diversity
• Curtis et al. (2002, PNAS) estimated up to 160 species in a typical milliliter of seawater, and between 6,400 and 38,000 in a typical gram of soil.
What if we combined environmental sampling and shotgun sequencing?
• How many genomes would be sampled and from what organisms?
• How many novel genes would be discovered?
• How many genomes could we completely assemble?
Environmental Genome Shotgun Sequencing of the Sargasso Sea
Venter JC, et al
Methods: Sargasso
• Performed whole-genome shotgun sequencing of surface water samples from Sargasso Sea
• Samples were filtered to isolate microbes
• Created genomic libraries with 2 to 6 kb inserts
• Sequenced plasmid clones
• Resulted in >1.5 Gbp of microbial DNA sequence
The Sargasso Sea
• A sea with no coastline (bounded by ocean currents)
– It moves!
– Generally between the West Indies and the Azores
– Water is very placid (the ‘doldrums’)
– Covered by a lens of warm, nutrient-poor water and a vast mat of algae (Sargassum)
• As simple a microbial community as is likely to be found in the ocean
The Sargasso Sea
Sampling scheme
• 1700 liters of surface water sampled from four different sites in Feb or May
• Filters retained only cells in the 0.1-3.0 micron range
– Excluded dissolved DNA and free virus
– Excluded most eukaryotes
Lots and lots of sequences
• 2 million cloned fragments 2-6 kbp in size were sequenced
• This yielded 1.6 billion base pairs total
– 1 billion bp non-redundant
– For comparison, the human genome is 3 billion bp
Assembly issues
• Organisms differ in
– abundance
– genome size
• Cannot rely on assumption that coverage is uniformly random
• Some contigs will have extremely deep coverage, which is a challenge for assembly algorithms
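The effect of unequal abundance on coverage can be illustrated with a small sketch. It assumes an idealized model in which each organism contributes DNA in proportion to (cell abundance × genome size) and reads are drawn at random from that pool. The organism names and numbers are hypothetical and are not taken from the study.

```python
def expected_coverage(total_bases, community):
    """Expected fold coverage per organism under random sampling of pooled DNA.

    community maps a name to (relative_cell_abundance, genome_size_bp).
    All values used below are illustrative assumptions.
    """
    # DNA contributed by each organism is proportional to cells x genome size.
    dna_weight = {name: abund * size for name, (abund, size) in community.items()}
    total_weight = sum(dna_weight.values())

    coverage = {}
    for name, (abund, size) in community.items():
        bases_from_organism = total_bases * dna_weight[name] / total_weight
        coverage[name] = bases_from_organism / size  # fold coverage of that genome
    return coverage


# Hypothetical two-member community sharing 1.6 billion sequenced bases.
community = {
    "abundant_small": (0.9, 2_000_000),
    "rare_large": (0.1, 6_000_000),
}
coverage = expected_coverage(1.6e9, community)
for name, depth in coverage.items():
    print(f"{name}: {depth:.1f}X")
```

Under this model, depth of coverage tracks cell abundance, so a 9:1 abundance ratio gives a 9:1 coverage ratio, while genome size determines how much of the sequencing budget each organism consumes.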
Assembly results
• Assembly was only successful in the February sample
– 64,000 scaffolds, most less than 10 kbp
– 500,000 clones did not assemble
• Of those with 3X or greater coverage
– About ½ could be classified taxonomically
• 21 scaffolds with greater than 14X coverage
– SNPs occur at about 1 per 10 kbp, suggesting genetic diversity within ‘species’
• Only two genomes were fully assembled, and then only with the aid of an existing reference sequence for both
Unexpected sequences
• Relatives of Burkholderia and Shewanella, typical of much more nutrient-rich environments.
– Probable contaminants (at least the Burkholderia)
• At least two abundant Archaeal organisms, typical of much greater depths (200 meters)
• At least 10 mega-plasmids, many with genes related to trace metal utilization or toxicity
• Not too surprising
– Some phage genomes, presumably integrated
– About 70 different eukaryote species (based mainly on the presence of 18S rDNA)
How many new genes are discovered?
• 1.2 million genes identified
– Equal to the number of genes submitted to the Swissprot/TrEMBL database over the last 8 years!
• Interesting findings
– Ammonium oxidation in Archaea, which was previously unknown
– Widespread presence of genes allowing unconventional forms of phosphate uptake
– Only ~37 Rubisco sequences were found, but ~800 proteorhodopsin-like genes
Organism Identification
• Focused analysis on scaffolds with at least 3X coverage depth
• This amounts to 333 scaffolds; 2226 contigs; 30.9 Mbp; 25% of the data set
• Used oligonucleotide frequencies, depth of coverage, and similarity to previously sequenced genomes to separate some sequences into organism “bins”
• Identified several populations related to known species
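Binning by oligonucleotide frequency can be sketched as follows. This is a generic illustration, not the study's actual procedure: it assumes tetranucleotides (k = 4) and a plain Euclidean distance between frequency profiles, and the function names are made up for this example. Scaffolds from the same organism would be expected to have similar profiles.

```python
from collections import Counter
from itertools import product
import math

K = 4  # tetranucleotides; the choice of k is an assumption for this sketch
ALL_KMERS = ["".join(p) for p in product("ACGT", repeat=K)]


def kmer_profile(seq):
    """Return the normalized frequency of every k-mer over A/C/G/T in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + K] for i in range(len(seq) - K + 1))
    # Ignore k-mers containing ambiguous bases such as N.
    total = sum(counts[kmer] for kmer in ALL_KMERS)
    if total == 0:
        return [0.0] * len(ALL_KMERS)
    return [counts[kmer] / total for kmer in ALL_KMERS]


def profile_distance(p, q):
    """Euclidean distance between two k-mer frequency profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Toy sequences: two AT-rich fragments and one GC-rich fragment.
at_rich_1 = kmer_profile("ATATTAATATTTAATATAAT")
at_rich_2 = kmer_profile("TATAATTTATATTAATATTA")
gc_rich = kmer_profile("GCGCCGGCGCCCGGCGCGGC")
print(profile_distance(at_rich_1, at_rich_2) < profile_distance(at_rich_1, gc_rich))
```

In practice such profiles would be combined with depth of coverage and similarity to known genomes, as the slide notes, since composition alone rarely separates closely related organisms.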
Prochlorococcus gene conservation
Phylotypes
Photosynthesis in Sargasso Sea
• Thought to be dominated by the cyanobacteria Prochlorococcus and Synechococcus
• But, >90% of cyanobacteria scaffolds appear to be Prochlorococcus
• Could be due to the gradient sampled and the larger size of Synechococcus
Bacteriorhodopsin
• Transmembrane protein that is a green light-driven proton pump
• Protons are pumped out of the cell, then flow back in through ATP synthase to create ATP
• Some rhodopsins found on scaffolds of organisms previously unknown to contain them
Rhodopsin-like Sequences
• Identified 13 subfamilies of rhodopsin-like genes
• Four families of proteins from cultured organisms and nine families from uncultured organisms
• Expression levels of these genes are unknown
Problems
• Large data dump in NCBI angered some
• Unclear how effective filtering was (some apparent eukaryotic DNA found)
• Some questioning of how samples were collected
• Had some trouble getting permits from countries to collect ocean water samples
How many species are there?
• Number of distinct SSU genes
– 1164 in February
– 248 in May
– Dominated by proteobacteria
– 148 are new phylotypes (at 97% identity)
• Can we estimate how many remain to be sampled?
– From a model of assembly completeness, one can estimate 1,800 to 48,000 species in the combined sample
– With 5-10 fold deeper coverage, one can estimate that 50 genomes could be fully assembled
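Why deeper coverage should complete more genomes can be seen from the standard idealized Poisson (Lander-Waterman) model. It is shown here only as an illustration and is not necessarily the model used in the study. It assumes reads land uniformly at random on a genome, so a base sequenced at mean depth c is missed with probability e^(-c).

```python
import math


def fraction_covered(depth):
    """Expected fraction of a genome hit by at least one read at mean depth `depth`."""
    return 1.0 - math.exp(-depth)


def depth_for_fraction(fraction):
    """Mean depth needed so that `fraction` of the genome is expected to be covered."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    return -math.log(1.0 - fraction)


for depth in (0.5, 1, 3, 5, 10):
    print(f"{depth:>4}X -> {fraction_covered(depth):.4f} of genome covered")
```

The curve saturates quickly, so a genome sitting at 1X (about 63% covered in this model) moves close to complete with a 5-10 fold increase in sequencing, whereas rarer organisms remain fragmentary.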
Patchiness
• Patchiness is well documented in marine macro-organisms in the open ocean, but is not known for microbial communities
• Of the species represented by assemblies with more than 50 fragments, more than half differed in abundance among sites