Nitrogen fixing potential in extreme environments
Author: Sorek Abramovich, Reut
Publication Date: 2013
DOI: https://doi.org/10.26190/unsworks/16288
License: https://creativecommons.org/licenses/by-nc-nd/3.0/au/
Link to license to see what you are allowed to do with this resource.
Downloaded from http://hdl.handle.net/1959.4/52826 in https://unsworks.unsw.edu.au on 2022-08-16
A thesis in fulfilment of the requirements for the degree of
Doctor of Philosophy
School of Biotechnology and Biomolecular Sciences
The University of New South Wales
Sydney, Australia
March 2013
ORIGINALITY STATEMENT
‘I hereby declare that this submission is my own work and to the best of my
knowledge it contains no materials previously published or written by another
person, or substantial proportions of material which have been accepted for the
award of any other degree or diploma at UNSW or any other educational
institution, except where due acknowledgement is made in the thesis. Any
contribution made to the research by others, with whom I have worked at
UNSW or elsewhere is explicitly acknowledged in the thesis. I also declare that
the intellectual content of this thesis is the product of my own work, except to
the extent that assistance from others in the project's design and conception or
in style, presentation and linguistic expression is acknowledged.’
Signed ……………………………………………..............
Date ……………………………………………...........
COPYRIGHT STATEMENT
‘I hereby grant the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole or part in the University libraries in all forms of media, now or here after known, subject to the provisions of the Copyright Act 1968. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation. I also authorise University Microfilms to use the 350 word abstract of my thesis in Dissertation Abstract International (this is applicable to doctoral theses only). I have either used no substantial portions of copyright material in my thesis or I have obtained permission to use copyright material; where permission has not been granted I have applied/will apply for a partial restriction of the digital copy of my thesis or dissertation.'
Signed ……………………………………………...........................
Date ……………………………………………...........................
AUTHENTICITY STATEMENT
‘I certify that the Library deposit digital copy is a direct equivalent of the final officially approved version of my thesis. No emendation of content has occurred and if there are any minor variations in formatting, they are the result of the conversion to digital format.’
Signed ……………………………………………...........................
Date ……………………………………………...........................
Abstract
Biological nitrogen fixation is a key process in providing accessible nitrogen to Earth’s biosphere. This process has been studied in various habitats, yet extreme environments remain relatively unexplored. The nifH gene codes for the Fe protein component of nitrogenase, the enzyme that carries out nitrogen fixation. Our aims in this study were to assess diazotrophic diversity, richness and community structure in three unique environments and to analyse potential adaptations in Fe protein composition and structure. Our methods included terminal restriction fragment length polymorphism (T-RFLP) analysis of 16S rDNA, PCR amplification of the nifH gene, statistical t-test analysis of amino acid compositions, a novel evolutionary analysis and 3D modelling with the I-TASSER web server.
Boulder Clay and Amorphous Glacier are two ice-free areas in Terra Nova Bay, Antarctica, which differ in their geological origins and physico-chemical properties. DNA yields from ice-core samples ranged from 0.29 ng L-1 in Amorphous Glacier to 88 ng L-1 in Boulder Clay. Bray-Curtis cluster analysis suggested that Boulder Clay bacterial profiles were similar to each other but clustered separately from those of Amorphous Glacier.
The hypersaline (>70 ppt) bays of Shark Bay, Western Australia, are home to stromatolite microbial mats. The microbial diversity of diazotrophs in samples from two different years, 1996 and 2004, was investigated. Our analysis indicated that columnar stromatolites included a common persisting cyanobacterial diazotroph, a Cyanothece or Xenococcus. Both samples contained novel nifH gene sequences of low similarity to uncultured nifH clones from saline to hypersaline environments, and their inferred NifH amino acid sequences were highly similar to unicellular, non-heterocystous Cyanobacteria and γ, -Proteobacteria sequences.
Paralana’s hot radon springs (PHS, 57°C) are situated in South Australia. Phylogenetic analysis indicated a rich and diverse group of NifH amino acid sequences from the α-, γ- and δ-Proteobacteria, Chloroflexi and Cyanobacteria. These results suggested that aerobic and anaerobic bacteria with a conventional Mo nitrogenase might be involved in nitrogen fixation.
Our bioinformatic analysis suggested that halophilic adaptations, with an increase in salt bridges and acidic residues and a decrease in bulkier hydrophobic amino acids, did occur in stromatolite diazotrophs, and that partial thermophilic adaptations, mainly an increase in salt bridges, Pro and charged residues, did occur in the PHS diazotrophs. These studies provide new insight into the ongoing evolution of nitrogen fixation in extreme environments.
Acknowledgments
I would like to thank my supervisors - Prof. Brett A. Neilan, Dr. Michelle Gehringer and Dr. Brendan P. Burns - for their support and advice during my PhD studies. I have benefited from their advice, and followed their wise counsel. I would like to thank Dr. Sohail Siddiqui, Prof. Aharon Oren and Prof. Nir Ben Tal for their support and invaluable suggestions.
The Australian Centre for Astrobiology was a creative hub for me and other students, a place to exchange ideas, thoughts and avenues of exploration into the biggest mysteries of life. I would like in particular to thank the director, Prof. Malcolm Walter, for his ongoing support of my efforts, and thank Carol, Jessica, Maria, Tamsyn, David and Ivan for creative conversations during my research career at the centre.
I would also like to thank my friends and colleagues at the Blue Green Groove Machine lab for their patience, help and suggestions. I could not have come this far without their knowledge. My special thanks go to: Anne D.J., Michelle A, Kristin, Falicia, Alex, Hannah, Ivan, Jasper, Troco, Shane, Stefan, Jae, Frank, Tim, Maria, Sarah, Tamsyn, Angie, Rati, Julia, Shauna, Leanne, Will and Alper.
The Mars Society of Australia (MSA) is a group of intelligent and devoted people. My 2009 field trip to the Paralana Hot Springs in South Australia, with NASA’s Spaceward Bound program, was very special thanks to their efforts and hard work. I salute you: David Cooper, David Wilson, Jon Clarke, Guy Murphy, Mark Gargano, Eriita Jones, Marcia Tanner and Shaun Strong. I am also indebted to Dr. Chris McKay and Prof. Penelope Boston for enlightened conversations and field trip advice and help.
Thank you, my coffee break friends: Rhea, Shahar, Nitzan, Eldad and Mikayla.
To my ever loving husband, Aviv - Thank You, my No. 1.
To my parents & brother, Aryeh & Channa & Shachar - Thank you for inspirational stories.
To my first born daughter, Eleanor - You were the best surprise I have ever received.
May your life be interesting and filled with joy. One last statement if I may -
“The time has come for humanity to journey to Mars.” (The Mars Society founding declaration, University of Colorado, Boulder, Colorado, United
States, 1998)
List of Publications Abramovich, R.S., Pomati, F., Jungblut, A.D., Guglielmin, M., and Neilan, B.A. (2012) T-RFLP Fingerprinting Analysis of Bacterial Communities in Debris Cones, Northern Victoria Land, Antarctica. Permafrost and Periglacial Processes 23: 244-248.
Contributions to academic conferences Abramovich, R.S., Burns, B.P., and Neilan, B.A. Temporal Biodiversity of Potential Diazotrophs in Stromatolites, Shark Bay, Western Australia. Australian Mars Exploration Conference. July, 17-19th 2009, Adelaide, South Australia. Abramovich, R.S., Burns, B.P., and Neilan, B.A. Nitrogen fixation potential in stromatolites, Shark Bay, Western Australia. The 9th Australian Space Science Conference. 28 - 30th, September 2009, Sydney, Australia. Abramovich, R.S., Gehringer, M.M., and Neilan, B.A. Biodiversity of Potential Diazotrophs in Microbial Communities of Stromatolites at Shark Bay, Western Australia. Sydney Astronomy and Astrophysics Student Symposium. 18th, June 2010, Sydney, Australia. Abramovich, R.S., Gehringer, M.M., and Neilan, B.A. Biodiversity of Potential Diazotrophs in Microbial Communities of a Radon Hot Spring in the Flinders Ranges and Stromatolites at Shark Bay. The 8th International Congress on Extremophiles. 12-14th, September 2010, Azores, Portugal. Abramovich, R.S., Gehringer, M.M., and Neilan, B.A. Biological nitrogen fixation potential in stromatolites, Shark Bay, Western Australia. The 16th SUNFix Symposium. 25th of June 2010, Sydney, Australia. Abramovich, R.S., Gehringer, M.M., Burns, B.P., and Neilan, B.A. Biodiversity of Potential Diazotrophs in Stromatolites of Shark Bay and a Radon Hot Spring. The Australian Society for Microbiology, Annual Scientific Meeting. 4-8th, July 2010, Sydney, Australia.
List of Acronyms and Abbreviations
ARA       Acetylene reduction assay
ATCC      American Type Culture Collection
ATP       Adenosine triphosphate
BLAST     Basic local alignment search tool
bp        Base pairs
BSA       Bovine serum albumin
cDNA      Complementary deoxyribonucleic acid
Chla      Chlorophyll a
DMSO      Dimethyl sulfoxide
DNA       Deoxyribonucleic acid
dNTP      Deoxyribonucleotide triphosphate
DTT       Dithiothreitol
EDTA      Ethylenediaminetetraacetic acid
EPS       Exopolysaccharide
FISH      Fluorescence in situ hybridisation
g         Gram
µg        Microgram
GC-MS     Gas chromatography-mass spectrometry
GTP       Guanosine-5'-triphosphate
h         Hour
IPTG      Isopropyl-β-D-thiogalactoside
kb        Kilobase
kDa       Kilodalton
km        Kilometre
km2       Square kilometre
L         Litre
µL        Microlitre
LB        Luria-Bertani
m         Metre
m.b.s.l.  Metres below surface level
µM        Micromolar
min       Minute
ml        Millilitre
mm        Millimetre
MQ        Milli-Q
mRNA      Messenger RNA
NCBI      National Center for Biotechnology Information
nd        Not detected
ºC        Degrees Celsius
ORF       Open reading frame
OTU       Operational Taxonomic Unit
PCC       Pasteur Culture Collection (France)
PCR       Polymerase chain reaction
PDB       Protein Data Bank
pmol      Picomole
rDNA      Ribosomal deoxyribonucleic acid
RDP       Ribosomal Database Project
RFLP      Restriction fragment length polymorphism
RNA       Ribonucleic acid
rpm       Revolutions per minute
rRNA      Ribosomal ribonucleic acid
RT        Room temperature
RT-PCR    Reverse Transcriptase PCR
s         Second
SD        Standard deviation
SDS       Sodium dodecyl sulphate
SRB       Sulphate reducing bacteria
SSU       Small sub-unit
TAP       T-RFLP Analysis Program
T-RFLP    Terminal Restriction Fragment Length Polymorphism
UTCC      University of Toronto Culture Collection
UTEX      University of Texas Culture Collection
UV        Ultraviolet light
1.1 The extremophiles
1.2 Nitrogen significance and source
1.3 Nitrogenase structure and function
1.3.1 Fe protein structure and function
1.3.2 MoFe protein structure and function
1.3.3 Nitrogenase modus operandi
5.3 Results
5.3.1 Evolution, composition and structure of the Cluster III Fe protein
5.3.2 Evolution, composition and structure of the Cluster I Fe protein
5.3.3 Comparative analysis of cluster I and cluster III Fe proteins
5.4 Discussion
5.4.1 Methodology
5.4.2 Evolution, composition and structure in cluster I & III
6.3 Results
6.3.1 Potential halophilic adaptations in the Fe protein
6.3.2 Potential thermophilic adaptations in the Fe protein
Methanoplanus, Methanosarcina and Methanothermus (Gary Stacey, 1992; Dixon and Kahn,
2004). Additional processes are involved in the nitrogen cycle on Earth and provide oxidized
and reduced forms of nitrogen. Aerobic nitrification converts ammonia into oxidized varieties,
using ammonia and nitrite oxidation pathways (NH4+ / NH3 -> NO2- -> NO3-). Denitrification
converts oxidized forms to dinitrogen (NO3- -> NO2- -> NO -> N2O -> N2), as does anaerobic
ammonium oxidation (ANAMMOX) by the Planctomycetes phylum and members of the
Crenarchaeota (Francis et al., 2007).
1.3 Nitrogenase structure and function
All diazotrophic micro-organisms have in common an enzyme, nitrogenase, which
comprises about 10% of the total cellular proteins (Burns et al., 1972). It is an ATP-hydrolyzing
complex of two proteins: dinitrogenase, an α2β2 heterotetramer in which α is encoded by the nifD
gene and β by the nifK gene, and dinitrogenase reductase, a γ2 homodimer encoded by the nifH gene
(Georgiadis et al., 1992; Dilworth et al., 1993). These components are sometimes referred to as
the MoFe protein and Fe protein, respectively.
Furthermore, during the last two decades crystallographic structures of nitrogenase have
emerged, leading to new 3D structural models and new insights into its
mechanism. Currently there are 36 3D structures of nitrogenase in the Research Collaboratory
for Structural Bioinformatics Protein Data Bank (H.M. Berman, 2003). The first were
crystallographic structures of nitrogenase reductase from Azotobacter vinelandii and
Clostridium pasteurianum at 2.9 and 3.0 Å resolution, respectively (Georgiadis et al., 1992;
Kim et al., 1993). Since then, 34 structures of nitrogenase have been resolved from A. vinelandii, C.
pasteurianum, Klebsiella pneumoniae and Azospirillum brasilense at 1.16 to 3.2 Å
resolution (H.M. Berman, 2000; H.M. Berman, 2003). The following paragraphs briefly describe
the structure and function of the individual components of nitrogenase.
1.3.1 Fe protein structure and function
Research based on crystallographic structures, genetic and molecular methodologies has
revealed that the Fe protein, a ~60 kDa protein, has several functionalities: it binds
MgATP/MgADP (each monomer contains an ATP-binding site in a single domain) and is
required for the initial biosynthesis of the FeMo cofactor and its insertion into the MoFe protein
(Burgess and Lowe, 1996). It also transfers electrons from a suitable donor (such as reduced
ferredoxin or flavodoxin) to the dinitrogenase. The homodimer is composed of two polypeptide
chains linked by a single redox-active Fe4S4 cluster that can reach three oxidation states
(Howard and Rees, 1996; see Figure 1). The nucleotides are essential for the electron transfer
because they induce conformational changes which result in receptive iron atoms in the clusters.
The Fe protein reflects these multiple functionalities in its complex structure and
motifs: eight parallel beta-sheets flanked by nine alpha-helices, a nucleotide binding fold
(Walker et al., 1982) and two switch regions, designated Switch I and Switch II by
Schlessman et al. (1998), which interact with the gamma-phosphate group of the bound MgATP and
facilitate the conformational changes (Jang et al., 2000; Jang et al., 2004).
Figure 1. General view of the Fe protein. The two polypeptide chains are linked by a single redox-active Fe4S4 cluster - chains F (red) and E (blue). Secondary structure depicted as determined by Tezcan et al. (2005). A1 - Fe4S4 cluster centred view, B1 - view centres on the cleft between the two chains. A2, B2 - same viewing angles, only the PCR amplified region of NifH in each chain is coloured. From Azotobacter vinelandii (PDB ID: 2AFH).
1.3.2 MoFe protein structure and function
This ~250 kDa component is encoded by the nifD and nifK genes and contains two types of clusters:
P clusters and FeMo cofactors (Kim and Rees, 1994). The α subunit contains a FeMo cofactor,
typically a MoFe7S9 metal cluster (see Figure 2). Some organisms contain nitrogenases wherein
molybdenum is replaced by either iron or vanadium. Homocitrate and two residues, His and
Cys, coordinate the FeMo cofactor in the protein (Burgess and Lowe, 1996). Each P cluster
contains eight iron atoms and seven sulphides linked to the protein by six Cys residues. The
clusters serve as a conduit for electron transfer from the Fe protein to the FeMo cofactor to
which N2 has been hypothesized to bind (Howard and Rees, 1996).
Figure 2. General overview of α2β2 heterotetramer MoFe protein from Klebsiella pneumoniae (PDB ID: 1H1L), the FeMo cofactor (with the homocitrate molecule close by), cation binding site and the P cluster are marked in the image (Hawkes et al., 1984).
1.3.3 Nitrogenase modus operandi
Three events of electron transfer are involved in the nitrogenase modus operandi: (1) reduction
of Fe protein through an electron transfer from a suitable donor – ferredoxin or flavodoxin, (2)
transfer of the electron to MoFe protein, (3) electron transfer from the active site within MoFe
protein (presumably FeMo cofactor) to the substrate. For each mole of dinitrogen reduced, 2 mol of
ammonia and 1 mol of H2 are formed. A total of 8 electrons are thus consumed (Burgess and Lowe,
1996). For every electron transferred in this fashion, 2 MgATP are hydrolyzed to MgADP.
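The stoichiometry described above can be tallied with a short sketch. It assumes only the figures given in the text (8 electrons per N2, 2 MgATP per electron, 2 NH3 and 1 H2 per N2); the function and constant names are illustrative and do not come from the thesis.

```python
# Overall reaction implied by the figures in the text:
#   N2 + 8 H+ + 8 e- + 16 MgATP -> 2 NH3 + H2 + 16 MgADP + 16 Pi
ELECTRONS_PER_N2 = 8   # 6 e- for N2 -> 2 NH3, plus 2 e- for the obligatory H2
ATP_PER_ELECTRON = 2   # MgATP hydrolysed per electron transferred
NH3_PER_N2 = 2
H2_PER_N2 = 1


def nitrogenase_budget(mol_n2: float) -> dict:
    """Return moles of products and inputs for `mol_n2` moles of N2 reduced."""
    electrons = ELECTRONS_PER_N2 * mol_n2
    return {
        "NH3": NH3_PER_N2 * mol_n2,
        "H2": H2_PER_N2 * mol_n2,
        "electrons": electrons,
        "MgATP": ATP_PER_ELECTRON * electrons,
    }


budget = nitrogenase_budget(1.0)
print(budget)
```

Under these assumptions, fixing one mole of N2 costs 16 mol of MgATP, which is why nitrogen fixation is regarded as energetically expensive.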
Chroococcidiopsis, Dermocarpa, Myxosarcina, Xenococcus, Pleurocapsa group
environmental adaptations (Fani et al., 2000). The inconsistencies with 16S rDNA phylogeny
are usually explained as lateral gene transfer and loss of genes due to loss of function
(Hartmann and Barnum, 2010). In general, the evolution of the nif genes is a
complicated matter that is not yet entirely resolved.
Figure 3. General overview of an unrooted nifH gene tree topology modified from Zehr et al. (2003a) with four major clusters I-IV.
It is of interest to review what is known of nitrogen fixation in extreme environments. The
following paragraphs provide background on nitrogen fixation in relation to cryospheric,
hypersaline and high temperature environments, from a microbiology point of view.
1.5 Psychrophilic diazotrophs
Nitrogen fixation has been studied in Antarctica for several decades now. Early studies in the
1960s detected nitrogen fixation by Cyanobacteria, mainly by the genera Anabaena, Calothrix and
Nostoc, and to a lesser extent by other genera, Stigonema and Tolypothrix (Smith and Russell,
1982). Nitrogen fixation was usually detected between 4 and 10°C, around midday, and was rarely
detected during winter or below 0°C (Stewart, 1970b; Davey and Marchant, 1983). More
recently, N2 fixation was found to represent between 6.3% and 33% of total N incorporated by
the microbial component in ponds or soils in Antarctica (Fernandez-Valiente et al., 2001), with the
higher end of contributed N reported from microbial mat studies, mostly from surface layers and
during day time (Fernandez-Valiente et al., 2007), supporting heterocystous Cyanobacteria as
the substantial providers of reduced nitrogen in the Antarctic ecosystem (Vincent et al., 1993).
Recent studies have also reported unicellular (Gloeocapsa, Synechococcus) and filamentous
non-heterocystous (Oscillatoria, Phormidium) Cyanobacteria as active nitrogen fixers, usually
under dark conditions and at substantially lower optimal temperatures than tropical or temperate
strains (Pandey et al., 2004). These Cyanobacteria were not considered true psychrophiles, since
their nitrogen fixation optima were in the range of 15-25°C, and they were not able to grow at 0°C or
at subzero temperatures (Pikuta et al., 2007).
While Cyanobacteria were the dominant active nitrogen fixers reported in most studies, other
potential diazotrophs have been reported from the Proteobacteria, Verrucomicrobia, Firmicutes,
Spirochaetes and Bacteroidetes in Antarctica.
Representatives of these major phyla were also found in other cryospheric environments such as
ice shelves, sub-glacial lakes and streams, as well as fjords and deep sea basalt flows (Priscu et
al., 1998; Carpenter et al., 2000; Bowman et al., 2003; Gaidos et al., 2004; Liu et al., 2006;
Perreault et al., 2007; Jungblut and Neilan, 2010).
The bacterial diversity in polar permafrost is considered high as nearly 40 genera have been
isolated or cloned from Arctic and Antarctic permafrost so far (Gilichinsky et al., 2007;
Gilichinsky et al., 2008), some of which are diazotrophs. The various genera identified in these
regions include: Acinetobacter, Bradyrhizobium, Comamonas, Lysobacter, Methylobacterium,
Pseudomonas and Sphingomonas of the Proteobacteria, Bacillus, Clostridium, Paenibacillus,
Planococcus and Sporosarcina from Firmicutes, Flavobacterium and Pedobacter from
Bacteroidetes and Arthrobacter, Brevibacterium, Corynebacterium, Kocuria, Micrococcus,
Rhodococcus and Streptomyces from the phylogenetic group Actinobacteria (Soina et al., 1995;
Shi et al., 1997; Zhou et al., 1997; Kochkina et al., 2001; Steven et al., 2006; Vishnivetskaya et
al., 2006; Steven et al., 2007; Mindlin et al., 2008; Niederberger et al., 2008).
The permafrost environment itself is characterized by temperatures below or equal to 0°C for at
least two consecutive years (Muller, 1947) and severe environmental conditions such as extreme
cold, high salt concentrations and low nutrient supply (Friedmann et al., 1993; Aislabie et al.,
2006; Barrett et al., 2006). Permafrost covers more than 25% of the Earth’s landmass, yet its
microbiology remains largely unexplored. Relatively little is known of Antarctic permafrost
(Gilichinsky et al., 2007; Cannone et al., 2008; Niederberger et al., 2008) and most current data
originate mainly from Siberian permafrost studies (Shi et al., 1997; Bakermans et al., 2003;
Vishnivetskaya et al., 2006).
A characteristic of cryospheric environments is that they usually have low bacterial content and
their bacteria are not easy to culture (Christner et al., 2005; Miteva, 2008). They are therefore
well suited to the application of molecular techniques for exploring their bacterial communities, diversity
and richness. Terminal Restriction Fragment Length Polymorphism (T-RFLP) is a DNA
fingerprinting method that also enables one to produce bacterial community profiles and match
bacterial genera to specific terminal restriction fragments (T-RFs) after digestion of
fluorescently labelled 16S rRNA amplicons with specific restriction enzymes (Liu et al., 1997;
Marsh et al., 2000; Derakshani et al., 2001).
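The principle behind T-RFLP can be illustrated with a minimal in-silico digestion. This is a sketch under stated assumptions rather than the procedure used in this thesis: the enzyme (MspI, site CCGG, cutting after the first C) is chosen only for illustration, the amplicons are hypothetical toy sequences, and the function names are invented here.

```python
from collections import Counter

# Assumption: MspI (recognition site CCGG, cut after the first C) is used purely
# as an illustration; the enzymes used in the thesis are not given in this section.
SITE = "CCGG"
CUT_OFFSET = 1


def terminal_fragment_length(amplicon, site=SITE, cut_offset=CUT_OFFSET):
    """Length (bp) of the 5'-labelled terminal fragment after digestion.

    Only the fragment carrying the fluorescent label is detected, so only the
    first site from the labelled end matters. An amplicon lacking the site
    stays uncut and is detected at its full length.
    """
    position = amplicon.upper().find(site)
    if position == -1:
        return len(amplicon)
    return position + cut_offset


def trf_profile(amplicons, site=SITE, cut_offset=CUT_OFFSET):
    """Relative abundance of each T-RF length in a pool of amplicons."""
    counts = Counter(terminal_fragment_length(a, site, cut_offset) for a in amplicons)
    total = sum(counts.values())
    return {length: count / total for length, count in counts.items()}


# Hypothetical toy amplicons (real 16S rRNA amplicons are hundreds of bp long).
pool = ["ATGCATCCGGTT", "ATGCATCCGGAA", "AACCGGTTTTCC", "ATATATATATAT"]
profile = trf_profile(pool)
print(profile)
```

Sequences that share the position of their first restriction site yield the same T-RF, which is why one peak can correspond to several genera and why matching T-RFs to taxa remains tentative.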
T-RFLP has been used widely in microbial ecology studies of temperate zones and in versatile
environments such as marine and lake sediments, soils, plant roots and more (Clement et al.,
1998; Marsh, 1999; Liesack and Dunfield, 2004). However, to date it has rarely been used for
community analysis of permafrost or glacial environments. Bhatia et al. (2006) employed this
method to explore the relationship between supra-, sub-, and pro-glacial bacterial communities
of the John Evans Glacier (Canada) but did not identify any bacteria. T-RFLP is a quick and
sensitive molecular technique for exploring possible bacterial genotypes in a given
environmental sample, enabling future studies to target specific groups or genes.
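Community profiles of this kind are commonly compared with a dissimilarity index; the abstract of this thesis mentions Bray-Curtis cluster analysis. The sketch below shows the standard Bray-Curtis dissimilarity between two T-RF profiles. The peak values are hypothetical, and the function is an illustration, not the software used in the study.

```python
def bray_curtis(profile_a, profile_b):
    """Bray-Curtis dissimilarity between two abundance profiles.

    Profiles map a T-RF length to an abundance (e.g. peak height or area).
    The result is 0 for identical profiles and 1 for profiles sharing no T-RF.
    """
    lengths = set(profile_a) | set(profile_b)
    difference = sum(abs(profile_a.get(k, 0) - profile_b.get(k, 0)) for k in lengths)
    total = sum(profile_a.values()) + sum(profile_b.values())
    if total == 0:
        raise ValueError("both profiles are empty")
    return difference / total


# Hypothetical peak heights keyed by T-RF length in bp.
sample_1 = {7: 10, 3: 5}
sample_2 = {7: 10, 12: 5}
print(bray_curtis(sample_1, sample_2))
```

A matrix of such pairwise values is what a clustering method would then group into a dendrogram.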
1.6 Halophilic diazotrophs
It is of interest to look into nitrogen fixation and halophilicity in two aspects - halophilic
diazotrophs, and nitrogen fixation in hypersaline environments in general.
Halophilic micro-organisms require salt in the medium for optimal growth and can be divided into
slightly, moderately or extremely halophilic (2–5%, 5–20% and a minimum of 20–30% NaCl
in the medium, respectively). Halophiles can be found in the Archaea, Bacteria and Eukaryota
domains (DasSarma and Arora, 2006; Ma et al., 2010). Hypersaline environments are generally
defined as containing salt in higher concentration than sea water (3.5% total dissolved salts, or
35 PSU). Halophilic micro-organisms have been detected in and isolated from solar saltern ponds,
the Great Salt Lake (USA), the Dead Sea (Israel), African soda lakes, Hamelin Pool (Western
Australia), deep-sea brines and many other localities worldwide (Oren, 2002; DasSarma and
Arora, 2006; Ma et al., 2010; Goh et al., 2011).
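The categories above can be written as a small classifier. This is a sketch: the text gives overlapping range boundaries (5% and 20% each appear in two categories), so the half-open intervals used here are an assumption, as are the function names and the label for organisms below 2% NaCl.

```python
SEAWATER_SALINITY_PERCENT = 3.5  # total dissolved salts, as given in the text


def halophile_class(optimal_nacl_percent: float) -> str:
    """Classify an organism by the NaCl concentration (%) of its growth optimum.

    Assumption: boundaries are treated as half-open intervals, [2, 5), [5, 20)
    and >= 20, because the source ranges share their end points.
    """
    if optimal_nacl_percent < 0:
        raise ValueError("NaCl concentration cannot be negative")
    if optimal_nacl_percent < 2:
        return "not halophilic by this scheme"
    if optimal_nacl_percent < 5:
        return "slightly halophilic"
    if optimal_nacl_percent < 20:
        return "moderately halophilic"
    return "extremely halophilic"


def is_hypersaline(total_dissolved_salts_percent: float) -> bool:
    """True when salinity exceeds that of sea water (3.5%)."""
    return total_dissolved_salts_percent > SEAWATER_SALINITY_PERCENT


print(halophile_class(10))   # an optimum of 10% NaCl
print(is_hypersaline(7.4))   # 74 parts per thousand expressed as a percentage
```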
Moderately halophilic diazotrophs exist amongst the Cyanobacteria and other prokaryotes (see
Table 3). Very few extremely halophilic bacteria possess nif genes, and none have been studied
extensively in terms of their nitrogen fixation capabilities. The nif genes of Halorhodospira
halophila (γ-proteobacteria) have been mapped, and its nitrogenase has been shown to be active
and to mediate hydrogen production (Tsuihiji et al., 2006). nifH genes were also reported from
H. abdelmalekii and H. halochloris (Tourova et al., 2007) of the Ectothiorhodospiraceae family
(γ-proteobacteria; Chromatiales), which also includes additional halophilic diazotrophic genera,
Ectothiorhodospira and Thiorhodospira. These species are slightly to moderately halophilic,
and their nitrogen fixation capacity remains largely unexplored (Hirschler-Réa et al., 2003;
Imhoff, 2006).
None of the known nitrogen-fixing Archaea, which are members of the Methanococcales,
Methanomicrobiales and Methanobacteriales, are halophiles (Leigh, 2000).
Nitrogen fixation studies in the Dead Sea (347 g l-1 salinity) have not been conducted to date,
though several halophilic micro-organisms with potential for diazotrophy were isolated. The
halophile Rhodospirillum sodomense has been isolated from the Dead Sea, but it lacked the
nitrogenase activity usually found in the family of Rhodospirillaceae (Madigan et al., 1984;
Mack et al., 1993). Another moderate halophile from the Dead Sea, Ectothiorhodospira
marismortui, was able to grow on N2, but very poorly or not at all (Oren et al., 1989). The Dead
Sea represents the most saline environment known to date, and thus the upper salinity limits for
nitrogen fixation.
However, even though extremely halophilic diazotrophs seem to be rare, studies
of other hypersaline environments clearly indicate that nitrogen fixation occurs under stressful
conditions. A few investigations in such environments revealed different dynamics of nitrogen
fixation (Pinckney et al., 1995; Paerl et al., 2003; Yannarell et al., 2007).
A microbial mat from a tropical hypersaline lagoon (74‰ salinity) exhibited higher nitrogen
fixation rates once exposed to lower salinity levels, from 74 to 37‰ (Pinckney et al., 1995).
Additional experiments have reported similar results (Paerl et al., 2003; Yannarell et al., 2007)
with the interesting addition that non-cyanobacterial diazotrophs were more sensitive to salinity
changes than cyanobacterial diazotrophs (Yannarell et al., 2006). Nitrogen fixation rates were
rather similar during dark and light periods, until oxygenic photosynthesis was blocked, which
caused a big spike in nitrogen fixation rates under light conditions (Pinckney and Paerl, 1997).
These results suggested that halophilic anaerobic phototrophic diazotrophs were as important to
nitrogen fixation as Cyanobacteria, yet they are more sensitive to changes in salinity, and
hence their composition may vary. In another study, hypersaline (78-90‰) Microcoleus
chthonoplastes-dominated microbial mats showed high nitrogen fixation rates during night time
and low fixation rates during the day (Omoregie et al., 2004b).
Table 3. Representatives of moderately halophilic diazotrophs.
Cyanobacteria: Halothece, Microcoleus chthonoplastes, Oscillatoria limnetica, O. neglecta, O. salina, Phormidium ambiguum, Synechococcus
Chloroflexi: Chloroflexus aurantiacus
Bacteroidetes/Chlorobi group: Chlorobium limicola, C. phaeobacteroides
Proteobacteria: Alkalilimnicola halodurans, Desulfovibrio halophilus, Ectothiorhodospira, Halomonas maura, Marichromatium purpuratum, Rhodospirillum salexigens, Thiocapsa roseopersicina, Thiorhodococcus minor, Thiorhodospira sibirica
References: (Madigan et al., 1984; Yakimov et al., 2001; Oren, 2002; Argandoña et al., 2005; DasSarma and Arora, 2006; Imhoff, 2006; Tsuihiji et al., 2006; Tourova et al., 2007).
This established that the active nitrogen fixers were non-heterocystous Cyanobacteria
(Plectonema boryanum, Halothece, Phormidium spp.) and a halophilic anaerobic sulphate reducer
similar to Desulfovibrio spp. (Omoregie et al., 2004b). This suggested that a lack of oxygen
enabled more diazotrophs to actively fix nitrogen.
Halophilic Bacteria and Archaea adapt to saline conditions mainly via ‘salt in’ or ‘salt out’
strategies, as well as cell membrane and proteomic modifications (Pikuta et al., 2007). With the first
strategy, a halophile tends to accumulate salt ions (K+, Cl-, Na+) in high concentrations within
the cytoplasm, thus creating an internal osmotic pressure to counterbalance the environmental
stress (Oren, 1986, 1999). Due to the high concentration of salt ions, intracellular electrostatic
charges of the enzymes change significantly and require further adaptations in enzyme structure
and composition to maintain activity and bind water molecules and ions efficiently (Rengpipat
et al., 1988; Madern et al., 2000). Oren (1999) states that the ‘salt in’ strategy has been found
to date only in the orders Halobacteriales (Archaea) and Haloanaerobiales (Bacteria). In the
second strategy, ‘salt out’, a halophile synthesises and accumulates organic compatible
solutes such as betaines, ectoines, N-acetylated diamino acids and N-derivatized carboxamides
of glutamine in order to maintain an osmotic balance (Galinski and Trüper, 1994). It is
suggested that these low molecular weight osmolytes interact with water molecules via their
hydrophilic and hydrophobic regions and counteract the ionic imbalance, yet the exact
mechanism of their interaction with proteins is still under investigation (Galinski,
1993; Oren, 1999).
In halophilic Archaea, membrane modifications may include specific transport systems to
accommodate the import or export of salt ions into or out of the cytoplasm, bacteriorhodopsin (as a
light-driven proton pump, to expel salt ions) and a high content of glycerol isopranoid ether lipids to
maintain membrane integrity under high salt concentrations (Yamauchi et al., 1992;
Gambacorta et al., 1995; van de Vossenberg et al., 1998; van de Vossenberg et al., 1999;
Gliozzi et al., 2002).
Theoretically, proteins in micro-organisms which employ several of these strategies would not
require specific adaptations in order to compete with the salt ions for water molecules. Yet, genetic
analysis of several halophilic bacterial genomes, known to employ compatible solutes for stress
management, has clearly indicated changes in the genetic code and in protein residue
composition in comparison to non-halophilic bacterial proteins (Severin et al., 1992; Galinski
and Trüper, 1994; Oren, 1999; Paul et al., 2008; Rhodes et al., 2010), suggesting there are
specific genetic variations for proteins coping with salt-induced stress conditions. The main
finding from metagenomic studies of halophilic micro-organisms indicated that halophilic
proteins possessed more acidic residues (Asp, Glu) on the protein exterior than in their interior
or in the active site (Lanyi, 1974; Rao and Argos, 1981; Madern et al., 1995; Madern et al.,
2000; Fukuchi et al., 2003).
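The acidic-residue observation can be illustrated with a simple composition check. The sketch below is only a whole-sequence proxy: it counts Asp (D) and Glu (E) over an entire amino acid sequence and cannot distinguish exterior from interior residues, which would require structural data. The function name and the example sequences are hypothetical and are not taken from the studies cited above.

```python
def acidic_fraction(protein_sequence):
    """Return the fraction of Asp (D) and Glu (E) residues in a protein sequence.

    Assumption: the sequence uses one-letter amino acid codes. Gaps and other
    non-letter characters are ignored.
    """
    residues = [aa for aa in protein_sequence.upper() if aa.isalpha()]
    if not residues:
        raise ValueError("empty protein sequence")
    acidic = sum(1 for aa in residues if aa in "DE")
    return acidic / len(residues)


# Hypothetical toy sequences, for illustration only.
toy_halophile_like = "MDEDELAKDEEGSD"
toy_non_halophile_like = "MKKLLAGSTVKAGS"

print(acidic_fraction(toy_halophile_like))
print(acidic_fraction(toy_non_halophile_like))
```

A comparison of this kind would only be suggestive; the surface-specific enrichment reported in the literature requires mapping residues onto a protein structure.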
1.7 Thermophilic diazotrophs
The hot geysers of California were the first terrestrial environment in which a thermophilic
Chlamydobacteriales was discovered in 1866 (Brewer, 1866; Edwards, 1868). Since then, our
knowledge has expanded the known temperature boundaries for life. High temperatures can
degrade chlorophyll (>75°C), proteins, nucleic acids (>70°C) and increase the fluidity of
membranes, and yet thermophilic Archaea and Bacteria can survive and grow at high
temperatures. They can be further divided into moderately thermophilic micro-organisms, which have a
growth optimum at 50°–60°C, thermophilic micro-organisms, with an optimum higher than
70°C, and hyperthermophilic, with an optimum higher than 80°C (Rothschild and Mancinelli,
2001; Pikuta et al., 2007).
Microbial mats in hot environments, mainly hot springs, have been studied with regard to their
diazotrophic capabilities. A few decades of research into microbial mats from Yellowstone
National Park have described the nitrogen fixation dynamics and participants within a wide
temperature range (16°-82°C) in this unique environment (Stewart, 1970a; Miyamoto et al.,
1979). Within the mats, nitrogen fixation occurs in various layers, during daytime and night.
During daytime, it was established that heterocystous Cyanobacteria Mastigocladus laminosus
and members of the genus Calothrix were the active nitrogen fixers, at 55° and 40°C
respectively, in mid layers of the mats (Stewart, 1970a; Miyamoto et al., 1979). Under dark
conditions, 14 morphologically diverse sulphate-reducing anaerobic diazotrophs were fixing
nitrogen at temperatures of 30°-60°C (Wickstrom, 1984). Unicellular Synechococcus spp.
have also been identified as active nitrogen fixers at 60°C, while nifHDK gene transcripts were
high during sunset and nil when light levels were high and the mat oxic (Steunou et al., 2006).
Accordingly, nitrogenase activity (via acetylene reduction) was highest during night time, when
the mat was anoxic. It would appear then, that in the hot springs of Yellowstone National Park,
unicellular Cyanobacteria Synechococcus in the mats' upper layers, as well as heterocystous
Mastigocladus in mid layers, fix atmospheric nitrogen with temporal differences. Heterotrophic
bacteria fix nitrogen during night time, when oxygen levels are low (Steunou et al., 2006;
Steunou et al., 2008; Hamilton et al., 2011b).
Roseiflexus spp. have been identified as potential diazotrophs in this system (Klatt et al., 2011)
and recently, a diverse array of nifH phylotypes has been reported from 57 springs, including
springs at 89°C, in Yellowstone National Park (Hamilton et al., 2011a). The most frequently recurring
phylotypes were identified as Mastigocladus laminosus strain CCMEE 5201, Synechococcus sp.
JA-3-3Ab (Cyanobacteria), Burkholderia tropica, B. xenovorans LB400, and Dechloromonas
sp. SIUL (β-Proteobacteria). Aquificae, α-γ- -Proteobacteria and Verrucomicrobia diazotrophic
representatives were less frequent (Hamilton et al., 2011a). The maximum rates of nitrogen
fixation were recorded at 82°C and pH 2.5 by an isolated anaerobic single nifH phylotype,
related to Leptospirillum ferrooxidans (Hamilton et al., 2011b). This is the highest recorded
temperature for nitrogen fixation by a bacterial species. Bacterial hyperthermophiles,
Hydrogenobacter thermophilus strain TK-6 and Thermocrinis albus DSM 14484 (Aquificales),
possess a nifH gene copy in their respective genomes (NC_013799, CP001931), yet taxonomic
studies of these species and others in the Aquificales order have not indicated they were actively
fixing atmospheric nitrogen (Kawasumi et al., 1984; Huber et al., 1998; Eder and Huber, 2002).
In the Archaea, the highest temperature for nitrogen fixation was recorded at 92°C, by a
Methanocaldococcus jannaschii-like isolate (Mehta and Baross, 2006) with a nifH gene copy
most similar to Methanothermococcus thermolithotrophicus, the only other thermophilic
archaeon known to fix nitrogen at high temperatures (Belay et al., 1984).
Thermophiles accumulate compounds, such as amino acids and sugars (and their derivatives), as
well as mannosylglycerate and glucosylglycerate, in response to stress conditions (Borges et al.,
2002). At high temperatures these compounds were found to protect enzymes from denaturing
or aggregating, thus demonstrating their multipurpose function, under heat stress as well as for
osmotic balance under saline stress (Empadinhas and da Costa, 2010). In addition, proteins from
thermophiles have several characteristics which enable them to function under normally
damaging temperatures; extremely thermostable enzymes can remain active above 85°C (Pikuta
et al., 2007). These features include changes in the primary, secondary and tertiary structural
hierarchies, which produce a compact thermophilic protein, highly complex, relatively short in
length and more hydrophobic in nature, in comparison to mesophilic or non-thermophilic
homologs (Jaenicke and Böhm, 1998; Haney et al., 1999). A higher percentage of charged
and Gln) and more salt bridges provide a network of ionic bonds and hence stability to the
tertiary structure (Daniel et al., 2008; Somero, 2003). Additional features reported included:
shortening of the N- and C-termini, increased amounts of Pro, decreased Gly content, fewer
and smaller internal cavities and higher degrees of oligomerisation. Thermostable enzymes are
thus more rigid, and require higher temperatures to denature and become inactive
(Jaenicke and Böhm, 1998; Somero, 2003; Greaves and Warwicker, 2009).
1.8 Analyzing nitrogen fixation
There are several molecular and chemical methods available to analyze nitrogen fixation and
collect relevant data. Dinitrogen fixation rates are usually measured by two techniques - 15N2
uptake and the Acetylene Reduction Assay (ARA) (Stewart, 1967; Stewart, 1973). Potential and
active nitrogen fixers are usually determined by extraction of DNA and RNA from an
environmental sample or bacteria of choice, followed by Polymerase Chain Reaction (PCR)
amplification and analysis (Muyzer et al., 1993).
The molecular approach of analysing nitrogen fixation via DNA or RNA extractions is quite
robust and reliable, with few known disadvantages. In general, even though DNA-based
methods are considered better in exploring natural microbial diversity than classic culturing
techniques (Amann et al., 1995; Head et al., 1998), there are several possible biases generated
by DNA extraction methods and PCR kinetics which might affect the objective representation
of an uncultured environmental microbial community. Adsorption of DNA to soil particles or
mucilaginous polysaccharides produced by many micro-organisms can inhibit DNA extraction
(Frostegard et al., 1999; Tillett and Neilan, 2000). The PCR process may be faulty at the
selection stage e.g., higher binding efficiencies to GC rich templates, or at the drift
(amplification) stage, resulting in a 1:1 product ratio bias, due to quick amplification of an
initially higher concentrated template. This would then result in a biased view of the original
sample DNA content and composition (Suzuki and Giovannoni, 1996; Polz and Cavanaugh,
1998). Additional problems in the PCR process (mostly relating to 16S rDNA amplification)
include, for instance, PCR chimeras, bias due to PCR cycling conditions and limitations involving
primer design (Wilson and Blitchington, 1996; Marchesi et al., 1998; Qiu et al.,
2001).
These problems can, however, be circumvented, and molecular techniques to identify
diazotrophs in environmental samples, via amplification of the nifH gene specifically, have been
successfully implemented and reviewed by the scientific community for at least two decades
(Zehr and McReynolds, 1989). Specifically, the nested PCR approach, targeting the nifH gene, has
been successfully tried and implemented in environmental studies of aqueous origins (marine,
fresh water, ice, snow, salt pans, etc) and terrestrial origins (soil, rhizosphere, rocks, etc) under a
wide range of physical and chemical conditions (Zehr et al., 1995; Affourtit et al., 2001; Brown
et al., 2003; Mehta et al., 2003; Short and Zehr, 2005; Izquierdo and Nüsslein, 2006; Jungblut
and Neilan, 2010; Singh et al., 2010).
There are over 38,000 matches of the nifH gene currently in the NCBI GenBank database (as of
December, 2011), making it a favourable reference gene for use in phylogenetic and genetic
studies. The partially amplified portion of the nifH gene encodes the nitrogenase Fe protein and
provides insights into the function and structure of this important protein.
1.9 Research aims
It is thus evident that a wide variety of diazotrophs in microbial mats participate in diel cycles of
nitrogen fixation, under stressful conditions. While the general dynamics remain similar, the
diazotrophic participants are different per extreme environment, and most probably represent an
optimal adaptation to the respective environment.
I chose to identify potential diazotrophs from three different environments: Antarctic
permafrost, halophilic microbial mats from Western Australia and thermophilic microbial
population from a hot and slightly radioactive spring in South Australia. I also have assessed
their adaptation to environmental conditions via changes to the Fe protein, as manifested in the
nifH gene.
We aimed to assess the diversity and potential for diazotrophs in Boulder Clay and Amorphous
Glacier, two ice-free areas in Terra Nova Bay, Antarctica. I have employed molecular and
computational methods which included environmental DNA extraction, amplification of the
bacterial 16S rDNA and Terminal Restriction Fragment Length Polymorphism (T-RFLP)
analysis, followed by an in-depth analysis with the T-RFLP Analysis Program (TAP). This
allowed for a diversity and structure analysis, with preliminary results as to the identity of the
diazotrophs at these unique sites.
The question of nitrogen fixation in the Shark Bay environment has never been addressed
before. I chose to employ a molecular approach which included environmental DNA extraction,
PCR amplification of the nifH gene followed by clone libraries, restriction fragment length
polymorphism, DNA sequencing and phylogenetic sequence analysis (Zehr et al., 1998;
Omoregie et al., 2004c). I was able to characterise diazotrophs in samples obtained in two
different years, and assess the diversity and structural changes to the bacterial community as
well as potential halophilic adaptations in the Fe protein of the stromatolites.
Paralana Hot Springs (55.6°C), a hot spring in South Australia, was previously investigated for its
bacterial community (Anitori et al., 2002), but nothing is known with regard to the diazotrophic
diversity. I used the same research procedure as described for the Shark Bay environment. I was
able to compare the diazotrophic community characteristics to other thermal microbial systems
and assess potential thermal adaptations in the Fe protein of the springs’ diazotrophs.
Specific research aims were -
1. Estimation of bacterial diversity and identification of potential nitrogen fixers using T-RFLP
community analysis and PCR amplification of the nifH gene in glacial and permafrost
formations in Northern Victoria Land, Antarctica (chapter 2).
2. Assessment of diazotrophic diversity, richness and community structure in stromatolites in
Shark Bay, Western Australia from two different years (1996 & 2004, chapter 3).
3. Assessment of diazotrophic diversity, richness and community structure in Paralana Hot
Springs, South Australia (chapter 4).
4. Analysis of molecular data from aims 2 and 3, and investigation of potential adaptations in the Fe
protein composition and structure (chapter 5).
Overall, extreme environments harbour novel solutions for biotechnology, as well as analogous
conditions to environments on other worlds. The overall objective of this thesis was to
contribute information with regard to diazotrophs in extreme environments and how they adapt to
their environment.
Chapter 2 T-RFLP analysis of potential diazotrophs in glacial and
permafrost formations in Northern Victoria Land, Antarctica.
_______________________________________________
2.1 Introduction
Antarctica has been the focus of microbial research for some time now, due to its extreme
climate and pristine conditions. Until a few decades ago, glacial formations and permafrost
areas on the Antarctic continent were seen as abiotic systems. However, new data are
emerging that indicate microorganisms live within cryospheric geological features. Diverse
bacterial compositions have been described from recent and ancient permafrost (Rivkina et al.,
2004, and references within). Bacteria were found in ice cores from Lake Vostok, Mizuho Base
in the Enderby Land Mountains, and the Yamato Mountains in Dronning Maud (Christner et al.,
2001; Segawa et al., 2010). Nitrospira isolates, for instance, were detected in Luther Vale soil
samples, in Northern Victoria Land and also in sediment cores at 761 m.b.s.l. from the Mertz
Glacier Polynya (MGP), Antarctica (Bowman and McCuaig, 2003; Aislabie et al., 2009).
Antarctic microbial populations have changed our views of the continent as abiotic, and
substantial research has identified mainly Proteobacteria members, as well as Firmicutes,
Cytophaga-Flavobacteria-Bacteroidetes (CFB group), Actinobacteria and Deinococcus
members, that successfully function under cold and desiccation stress (see also
chapter 1, section 1.5). Our study focused on two localities in the Terra Nova Bay area,
Northern Victoria Land (see figure 1). Past microbial studies in this area analysed various
ecological niches, such as soil and seawater from coastal and terrestrial stations (Nicolaus et al.,
1991; Nicolaus et al., 1996; Bargagli et al., 2004; Pepi et al., 2005). Some 140 bacterial isolates
were identified and characterised using molecular tools, such as 16S rDNA amplification,
fluorescence in situ hybridization (FISH), clone libraries and culture-dependent methods
(Michaud et al., 2004; Yakimov et al., 2004; Lo Giudice et al., 2007). Spore-forming Bacilli
species were identified from a seawater sample in Rod Bay, and Alicyclobacillus has been
isolated from geothermal soils on Mount Melbourne (Nicolaus et al., 1998; Pepi et al., 2005).
Burkholderia, a cold-tolerant, hydrocarbon-degrading soil bacterium, was also found in sea water
samples from Rod Bay (Yakimov et al., 2004). In addition, clones affiliated with
Burkholderiales were found in soil samples from a Northern Victoria Land locality
(Niederberger et al., 2008). Pseudomonas, a Gram-negative, aerobic bacterium known to inhabit
cold marine ecosystems, was detected in sea water samples from Santa Maria Novella and Rod
Bay (Yakimov et al., 2004; Lo Giudice et al., 2007). These studies revealed that diverse
communities exist in this area, composed principally of Proteobacteria, Bacteroidetes,
Firmicutes and Actinobacteria bacterial groups. It is unknown whether diazotrophic
communities exist in the Terra Nova Bay area, and only several genera from these studies are
known to have the nifH gene (see table 1). Interestingly, no representative from the
Cyanobacteria phylum has been reported from the Terra Nova Bay studies so far.
Table 1. Potential nitrogen fixers in the Terra Nova Bay area; see references in text.
Phylum | Genus
α-Proteobacteria | Loktanella, Sulfitobacter, Methylobacterium, Paracoccus,
In general, relatively few micro-organisms are culturable (Amann et al., 1995) and due to the
low bacterial content in polar ice core samples and difficulties in culturing them (Christner et
al., 2005; Miteva, 2008), investigating bacterial content in ice cores requires the use of highly
sensitive techniques. Terminal Restriction Fragment Length Polymorphism (T-RFLP) is a
sensitive, affordable and applicable method used mainly for estimating the diversity of bacterial
communities.
Briefly, this method amplifies 16S rDNA templates of a target community using PCR (Clement
et al., 1998) with one primer carrying a fluorescent label. Fragmentation of the amplicons by
endonuclease restriction enzymes produces a population of fluorescently labelled terminal
fragments (‘T-RF’, length in base pairs). The fluorescent PCR products are detected using
sequencing electrophoresis technologies and are visualized as peaks - each peak represents a
fragment, post-digestion (Marsh, 1999; Blackwood et al., 2003). The general assumption in this
method is that the height of a peak represents the abundance of a fragment. The more of
fragment X that is present, the stronger the detected signal and the higher the peak (Marsh,
2005). It should be noted that the method provides a quantitative and detailed view of the PCR
product pool derived from a community, and does not accurately reflect the native community
structure (Moeseneder et al., 1999).
This method has been employed successfully in cryospheric environments. A T-RFLP analysis
of John Evans Glacier reported 142 T-RFs from 141 DNA preparations with HaeIII digestion,
suggesting a relatively low number of T-RFs for each sample preparation from the
glacier (Bhatia et al., 2006). An ice core taken from Lake Vostok, at 3589 m depth, produced 12
fragments after a universal bacterial 16S rDNA amplification and digestion (Priscu et al., 1999).
T-RFLP was also used in an extensive study of the microbial diversity in lithic niches
(sandstones, quartz, soil, etc), in the McKelvey Valley, McMurdo Dry Valleys and revealed a
complex community of bacteria and eukaryota (Pointing et al., 2009).
Biases involved in environmental DNA extraction and in primer annealing to different
templates during PCR mean that certain DNA sequences (or T-RFs) are preferentially retrieved
from a sample (Liesack and Dunfield, 2004). Therefore a particular T-RF cannot be compared
to a different T-RF in a single profile. However, it is possible to compare a T-RF to itself over
different samples, and so also a T-RF pair match can be compared to itself over different
samples (Osborn et al., 2000). In the end, the list of T-RFs in a sample is a profile of a bacterial
community present in the environmental sample.
The study area is located in Northern Victoria Land, Antarctica, close to Mario Zucchelli station
(74° 41′ 36.96″ S, 164° 6′ 42.12″ E), previously known as Terra Nova Bay station. In general,
the climate is cold with a mean annual temperature of -14°C (Frezzotti et al., 2001) and mean
monthly air temperature ranging between -26°C and 0°C. Average precipitation is 270 mm/year
water equivalent in snow (Piccardi et al., 1994). Two sites in the study area were the focus of
research - “Amorphous Glacier” (74°41’25’’ S, 164°00’ E) and “Boulder Clay” (74°44’45’’ S,
164°01’17’’ E), which are two small ice-free areas characterised by debris cones (Guglielmin et
al., 2002; Guglielmin and French, 2004).
Although in close proximity, Amorphous Glacier is above the Pleistocene grounding line and
Holocene in age, whereas Boulder Clay is below the grounding line, with sediments likely of a
glacial-marine origin and dated to the Late Pleistocene (Orombelli et al., 1991). These novel
sites have been extensively studied for their isotopic composition, mechanisms of ice
distribution and formations (Guglielmin et al., 1997; Gragnani et al., 1998; French and
Guglielmin, 1999a; Guglielmin and French, 2004) yet to date, their microbial and diazotrophic
aspects remain unknown. We therefore proceeded to test if bacterial DNA could be obtained
from ice and permafrost cores of the Amorphous Glacier and Boulder Clay areas, and whether
bacterial community profiles differ between these two distinct sites by way of terminal-
restriction fragment length polymorphism (T-RFLP) analysis.
Knowledge of microbial life existing in ice does not only improve our understanding of the
taxonomic diversity, richness and biogeography of cold-adapted microorganisms, but also
assists in evaluating the metabolic requirements for survival and proliferation of life in the
cryosphere, and in defining the actual limits of life.
Figure 1 Antarctic study sites. Left: location of the two study sites (Boulder Clay and Amorphous Glacier). Upper right pane: View of the perennially frozen lake and debris cone in Amorphous Glacier (Guglielmin et al., 2002). Lower right pane: Frost mound at Boulder Clay (Guglielmin and French, 2004). Reproduced with permission from author.
2.2 Materials and methods
2.2.1 Study sites
Amorphous Glacier is located west of Mario Zucchelli Station (MZS) between 250 and 290 m
above sea level (see figure 1). The summit of the cone is partially collapsed and its debris cover
consists of 70-80% of light grey granitic gravel, with some granite boulders being more than 1
m in diameter. Ice within it represents congelation ice derived from ground waters formed under
different thermodynamic conditions (Guglielmin et al., 2002). The age of the cone is relatively
recent within the Holocene. The ice core stratigraphy has revealed several layers, based on
crystallographic characteristics (C-axes, bubble density, crystal size) and chemical analyses
(Guglielmin et al., 2002). These layers are summarised in table 2.
Boulder Clay site is located south of Mario Zucchelli Station (MZS) in an ice-free area, 205 m
above sea level (Guglielmin and French, 2004). The mean annual air temperature is -13.8°C and
the mean annual ground temperature at the surface (2 cm depth) is -16.1°C and at the permafrost
table (30 cm depth) -16.5°C. The mean annual temperature of the deepest monitored layer (3.6
m, within the ice), is -17°C (Guglielmin and Cannone, 2012).
In the Boulder Clay area, an ablation till of late-glacial age overlies a body of buried glacier ice
(Guglielmin et al., 1997; Gragnani et al., 1998; Guglielmin and French, 2004), and surface
features include perennially ice-covered ponds with icing blisters and frost mounds (French and
Guglielmin, 2000), frost-fissure polygons and debris islands (French and Guglielmin, 1999b).
The age of the frost blister is younger than 1020 ± 70 BP, while the till that generally covers
the surface of the Boulder Clay area is attributed to the Late Pleistocene and in particular to the
Ross Sea I glaciations (Orombelli et al., 1991). The analysed frost mound formed during the
late Holocene, in the middle of a perennially ice-covered lake, which is located on the
sublimation till, overlying the buried Pleistocene relict glacier ice (Guglielmin et al., 2009).
2.2.2 Ice core collection
Two ice cores were obtained during the austral summer in 1996 (Guglielmin et al., 2002) with
slow rotary drilling equipment without any chemical solutions, antifreeze liquid or any drilling
fluid in order to minimize possible contamination. A 237 cm long ice core was extracted from
the debris cone of Amorphous Glacier (AM), placed in polyethylene bags and stored at MZS
at -25°C (Guglielmin et al., 2002). The Boulder Clay (BC) core was 375
cm long and sampled from a shallow perennially-frozen pond through the underlying sediment
into the moraine-covered glacial ice. Cores were transported to Milan-Bicocca University, Italy,
and stored at -40°C for further processing. Both cores contained several distinct layers (table 2).
Amorphous Glacier was previously characterised chemically and isotopically (Guglielmin et al.,
2002).
2.2.3 Sample preparation
Samples were aseptically cut from the ice cores in a -40°C room and stored on dry ice in a
-40°C room, by a former member of the lab. Internal parts of the cores were cut by an electric
saw (repetitively washed with ethanol) and stored in sterile falcon tubes after the surface was
washed with 70% ethanol. BC samples contained a mixture of ice, stones and shells due to their
glacial-marine origin. These samples were crushed with an ethanol-washed hammer. Two
duplicates from each sample were taken and stored in sterile falcon tubes for further
amplification and T-RFLP analysis.
2.2.4 DNA extraction and amplification
Samples were thawed overnight at 4°C and always kept in the dark. AM samples were then
filtered through a sterile 0.22 μm membrane (Millipore). The flow-through was collected in
sterile Falcon tubes, lyophilised and resuspended in 1 mL sterile buffer.
Filters were washed with 1 mL TNE buffer to recover bacterial cells. DNA was extracted from
the filter and flow-through fractions using a previously described protocol (Burns et al.,
2004) with a modified incubation step with proteinase K (10 mg ml-1) and SDS (10%) to give a
final concentration of 100 μg ml-1 proteinase K in 0.5% SDS, for 1 h at 37°C, and finally
resuspended in 50 μL sterile water.
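The stock-to-final dilutions in the modified incubation step follow from C1V1 = C2V2. The sketch below works through that arithmetic for the stated stock and final concentrations. The lysis volume of 500 μL is a hypothetical value chosen for illustration, as the text does not state the incubation volume, and the function name is likewise an assumption.

```python
def stock_volume_needed(final_conc, stock_conc, final_volume):
    """Volume of stock to add so that C1 * V1 = C2 * V2.

    final_conc and stock_conc must share one unit; the result is in the
    unit of final_volume.
    """
    if stock_conc <= 0:
        raise ValueError("stock concentration must be positive")
    return final_conc * final_volume / stock_conc


# Hypothetical incubation volume (not given in the text).
lysis_volume_ul = 500.0

# Proteinase K: 10 mg/ml stock (= 10,000 ug/ml) to 100 ug/ml final (a 1:100 dilution).
proteinase_k_ul = stock_volume_needed(100.0, 10_000.0, lysis_volume_ul)

# SDS: 10% stock to 0.5% final (a 1:20 dilution).
sds_ul = stock_volume_needed(0.5, 10.0, lysis_volume_ul)

print(proteinase_k_ul)  # 5.0 uL of proteinase K stock
print(sds_ul)           # 25.0 uL of SDS stock
```

The dilution ratios (1:100 and 1:20) hold for any incubation volume; only the absolute volumes depend on the assumed 500 μL.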
BC samples (400 mL) were added to 500 mL TNE buffer and DNA was extracted as described above.
All BC and AM samples were resuspended in a final volume of 50 μL sterile Milli-Q water.
The DNA concentration was measured using
NanoDrop ND-1000 Spectrophotometer. The presence of bacterial DNA, as well as the quality
of extracted DNA and the presence of PCR inhibitors, was tested by universal bacterial 16S
rDNA PCR using unlabelled 27F and 1494R primers. To amplify 16S rDNA fragment for the
T-RFLP analysis, PCR was performed with a labelled universal forward primer 27F (6-FAM,
carboxyfluorescein, 5’ AGAGTTTGATCCTGGCTCAG) and universal reverse primer 1494R
(5’ TACGGCTACCTTGTTACGAC) in a 50 μL reaction (1X reaction buffer, 0.2 mM dNTP’s
each, 0.25 mM MgCl2, 0.2 μM primers, 0.8 U Taq polymerase). After an initial denaturing step
at 92ºC for 2 min, 30 cycles of amplification followed (92ºC for 20 sec, 50ºC for 30 sec, 72ºC
for 1 min), concluding with an extension step at 72ºC for 7 min. DNA extracted from a microbial
mat sample from Brack Pond (McMurdo Ice Shelf, Antarctica) was used as a positive control,
while filter-sterilized water was used as a negative control.
Table 2. Description of ice core sections and layers from Amorphous Glacier and Boulder Clay samples. n.d. - no data.
Sample | Ice core section (cm, depth) | El. Cond. (μS cm-1, 20°C) | pH (20°C) | Cl- (μeq L-1) | SO4 2- (μeq L-1) | Layer description (Guglielmin et al., 2002)
Amorphous Glacier
AM-18 | 0-22 | 124 | 6.48 | 983.14 | 125.26 | Active layer composed of loose sandy gravel with fine material increasing with depth.
AM-3 | 75-79 | 19.5 | 5.73 | 155.45 | 10.77 | Massive ice with high bubble density, elongated and big crystals; chemicals maximum concentration peaked in sinusoidal cycles every 60 cm in depth.
AM-21 | 265-272 | 59.95 | 6.71 | 347.4 | 7.62 | Massive ice with an intermediate bubble density, less elongated and smaller crystals; sinusoidal chemicals cycles were not present.
Boulder Clay
BC-1 | 0-15 | n.d. | n.d. | n.d. | n.d. | Dynamic active layer in a small debris cone (0-30 cm depth changes).
BC-T | 325-330 | n.d. | n.d. | n.d. | n.d. | Massive ice and brine pockets within the ice (from a frost mound in perennial frozen pond).
BC-B | 370-375 | n.d. | n.d. | n.d. | n.d. | Massive ice and brine pockets within the ice (from a frost mound in perennial frozen pond).
Data taken from (Abramovich et al., 2012).
Positive results from the PCR were verified by 2% agarose gel electrophoresis and ethidium
bromide staining prior to UV transillumination. DNA concentration was measured using
NanoDrop ND-1000 Spectrophotometer. Samples were kept in the dark at all times.
2.2.5 Terminal Restriction Fragment Length Polymorphism (T-RFLP)
Quadruplicates of AM-3, BC-1, BC-T and BC-B samples were analysed as well as triplicates of
Brack Pond mat sample. Approximately 150 ng of each FAM-labelled PCR product was
digested with 6 U of the restriction endonuclease MspI or 3 U of ScrFI (New England Biolabs).
Digestions were carried out in a total volume of 10 μL overnight at 37ºC following the
manufacturer’s instructions.
The size of each Terminal Restriction Fragment (“T-RF”) was determined according to the
GeneScan™ 1200 LIZ® size standard on an ABI 3730 Capillary sequencer (Applied
Biosystems Inc.) with an acceptable error of ± 0.5 bp and also analysed using Peak Scanner™
Software Version 1.0 (Applied Biosystems Inc). T-RFs were visualized as peaks in
GeneScan™, which are characterised by width (base pairs) and height (arbitrary fluorescence
units as a linear representation of the abundance of a specific T-RF in the PCR pool). Height is
therefore a qualified estimation of the original amount of a specific DNA fragment in a sample,
prior to the PCR process (Marsh, 2005). Here, the absolute peak height was not used as a
measure of bacterial abundance, since PCR fragment levels could have originated from process
biases (Suzuki and Giovannoni, 1996).
Little background noise was evident in the electropherograms, affording an unambiguous
selection of valid T-RFs with a minimum height of 35 fluorescent units (Liesack and Dunfield,
2004). T-RFs over 35 fluorescent units in intensity and present in at least two replicates were
selected for further analysis. For comparative analysis, T-RFs within an electropherogram were
normalized to the total height of that sample (Dunbar et al., 2000) and T-RFs with a relative
height of less than 1% of the total height were excluded from further analysis. T-RFs with peak
heights determined to be off-scale by GeneScan™ were also excluded from further analysis,
unless present in other replicates at lower heights, in which case these T-RFs were adjusted to
the lower height value (Dunbar et al., 2001).
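The selection and normalisation rules described above can be sketched as follows. This is a minimal illustration rather than the workflow actually used: it assumes each replicate is a mapping of T-RF size (bp) to peak height, that "minimum height" means at least 35 units, and that the total height of a replicate is summed over peaks passing that threshold. The handling of off-scale peaks is omitted, and all names and example values are hypothetical.

```python
from collections import Counter

MIN_HEIGHT = 35        # minimum peak height (fluorescence units), from the text
MIN_REPLICATES = 2     # a T-RF must occur in at least two replicates
MIN_RELATIVE = 0.01    # T-RFs below 1% of total height are excluded


def filter_and_normalise(replicates):
    """Apply height, reproducibility and relative-height filters to T-RF replicates.

    replicates: list of dicts mapping T-RF size (bp) -> peak height.
    Returns one dict per replicate mapping T-RF size -> relative height.
    """
    # 1. Discard peaks below the minimum height (assumed to be an inclusive cut-off).
    kept = [{size: h for size, h in rep.items() if h >= MIN_HEIGHT}
            for rep in replicates]

    # 2. Retain only T-RFs seen in at least MIN_REPLICATES replicates.
    occurrences = Counter(size for rep in kept for size in rep)
    reproducible = {size for size, n in occurrences.items() if n >= MIN_REPLICATES}

    # 3. Normalise each peak to the total height of its replicate, then drop
    #    reproducible T-RFs that fall below the relative-height threshold.
    profiles = []
    for rep in kept:
        total = sum(rep.values())
        profile = {}
        for size, height in rep.items():
            if size in reproducible and total > 0:
                relative = height / total
                if relative >= MIN_RELATIVE:
                    profile[size] = relative
        profiles.append(profile)
    return profiles


# Hypothetical replicate data, for illustration only.
example = [
    {81: 600, 145: 390, 200: 40, 300: 10},
    {81: 500, 145: 500, 410: 50},
]
for profile in filter_and_normalise(example):
    print(sorted(profile.items()))
```

In this example T-RFs 200 and 410 are dropped because each appears in only one replicate, and T-RF 300 is dropped because it falls below the height threshold.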
Identical T-RFs in replicates were aligned and grouped after manually inspecting
electropherograms. Assigning a specific size to a group of similar T-RFs was based on
averaging their sizes. Similarly, assignment of relative height to a group of similar T-RFs, was
based on averaged normalized relative height values. Only a few T-RFs were separated by 1
base pair but were shown to be identical peaks after manual inspection of electropherograms.
These included T-RFs 80, 81, 82 and 145, 146 that were collectively assigned as size 81 and
145, respectively.
2.2.6 T-RFLP profiles
The presence of similar T-RFs in each profile was the basis for the community comparison
between AM-3, BC-1, BC-T and BC-B (Dunbar et al., 2000). The T-RF list of each sample
was considered a community profile and the similarity between Boulder Clay and Amorphous
Glacier samples was assessed.
A binary data set was created based on presence or absence of T-RFs from all samples.
Bray-Curtis analysis was performed on presence-absence data using the PAleontological STatistics
program (PAST) (Hammer et al., 2001). The Venn diagram was calculated with the online
program Venny (Oliveros, 2007).
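For presence-absence data, the Bray-Curtis similarity reduces to 2C / (A + B), where A and B are the numbers of T-RFs in each profile and C is the number they share (equivalent to the Sørensen-Dice coefficient). The sketch below illustrates that calculation only; it is not the PAST implementation, and the function name and example T-RF sizes are hypothetical.

```python
def bray_curtis_similarity(profile_a, profile_b):
    """Bray-Curtis similarity for presence-absence T-RF profiles.

    profile_a, profile_b: iterables of T-RF sizes (bp) present in each sample.
    Returns a value between 0 (no shared T-RFs) and 1 (identical profiles).
    """
    a, b = set(profile_a), set(profile_b)
    if not a and not b:
        raise ValueError("both profiles are empty")
    shared = len(a & b)
    return 2 * shared / (len(a) + len(b))


# Hypothetical T-RF lists, for illustration only.
sample_1 = [81, 145, 200]
sample_2 = [81, 145, 300, 410]

print(bray_curtis_similarity(sample_1, sample_2))  # 2 * 2 / (3 + 4) = 4/7
```

The corresponding dissimilarity is one minus this value.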
2.2.7 T-RFLP Analysis Program (TAP)
T-RFs from all profiles were matched to the in silico digestions performed by the TAP software
on 16S rRNA genes present in the Ribosomal Database Project release 9, update 57 (Marsh et
al., 2000; Cole et al., 2003). The software predicted terminal restriction fragments after taking
into account the PCR primer binding sites and the restriction enzyme cleavage sites (MspI and
ScrFI). This produced a database containing a list of T-RFs and the corresponding bacteria,
divided into phyla, genera and species, which were then manually matched to the T-RFs
observed in each Antarctic sample.
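The principle of an in silico digestion can be illustrated with a short sketch that finds the first restriction site in an amplicon and reports the length of the labelled terminal fragment. This is not the TAP implementation: the function name and the toy sequence are hypothetical, the amplicon is assumed to start at the 5' end of the labelled primer, and the cut positions used (C^CGG for MspI, CC^NGG for ScrFI) are stated here as assumptions rather than taken from the text.

```python
import re

# Recognition pattern and number of bases 5' of the cut within the site.
ENZYMES = {
    "MspI": (re.compile(r"CCGG"), 1),          # C^CGG
    "ScrFI": (re.compile(r"CC[ACGT]GG"), 2),   # CC^NGG
}


def terminal_fragment_length(amplicon, enzyme):
    """Length (bp) of the 5' terminal fragment, or None if the enzyme does not cut."""
    pattern, offset = ENZYMES[enzyme]
    match = pattern.search(amplicon.upper())
    if match is None:
        return None
    return match.start() + offset


# Toy amplicon (hypothetical, not a real 16S rRNA gene sequence).
toy = "ATGCATGCAT" + "CCGG" + "ATATAT" + "CCAGG" + "TTTT"
msp_length = terminal_fragment_length(toy, "MspI")
scrf_length = terminal_fragment_length(toy, "ScrFI")
```

Applying such a function to every sequence in a reference database yields a table of predicted T-RF sizes per taxon, against which observed T-RFs can be matched.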
2.2.8 RDP 9, TAP and T-RFLP databases
T-RFs from all profiles were used for putative bacterial identification, based on the list
produced by the TAP software (section 2.2.7). We compared three 16S rDNA databases in terms of
their phylogenetic composition: RDP (release 9, update 61), RDP 9 after TAP performed an in
silico digestion with MspI and ScrFI, and a third database based on the sample profiles
from the T-RFLP analysis. The first two databases provided a reference point for the third
database in terms of taxa distribution.
2.2.9 PCR amplification of nifH genes
PCR amplification of nifH genes could not be carried out, mainly due to lack of source material
after optimising our methodology as described in sections 2.2.4 - 2.2.5.
2.3 Results and discussion
Molecular fingerprinting analysis based on the bacterial 16S rDNA allows us to determine the
presence of bacteria in environmental samples and their community profiles (Marsh et al.,
2000). We obtained DNA with concentrations ranging from 0.29 to 88.02 ng mL⁻¹, with the
highest concentration from sample BC-1 (table 3). DNA yields and bacterial cell counts would,
however, be required to determine if the different DNA concentrations are due to changes in the
distribution of bacteria in the ice cores. DNA did not degrade under specified storage conditions
and the partial 16S rDNA was successfully amplified from the samples BC-1, BC-T, BC-B and
AM-3. Table 3 summarises DNA yields and results of 16S rDNA amplification from the study
site and figure 2 presents representative electropherograms from each sample. The failure of
amplification from samples AM-18 and AM-21 could be due to a combination of low DNA
concentration and degraded DNA (Rivkina et al., 2004), as we did not detect PCR
inhibition in the extracted nucleic acids.
Table 3. DNA yields and results of 16S rDNA amplification from Amorphous Glacier and Boulder Clay samples. +, successful amplification; -, no amplification.
Table 4. Number and relative peak abundances (%) of T-RFs > 40 bp, following MspI and ScrFI digestions of ice core samples and a 1% relative height threshold. The most abundant T-RF in a sample is marked bold.
Figure 3 (A) Venn diagram illustrating the number of T-RFs per bacterial profile. (B) Bray-Curtis cluster analysis of 16S rDNA T-RF profiles from Amorphous Glacier (AM) and Boulder Clay (BC) obtained from 1000 bootstraps.
We proceeded to evaluate diversity without implementing a 1% relative height threshold, in
order to gain more data points, and we assessed each digestion separately (table 5). According
to the ScrFI digestion results and similarity comparison, the majority of T-RFs detected in
Boulder Clay were shared between its profiles (BC-1, BC-T and BC-B). Eighty-two and 88% of
BC-1 T-RFs were shared with BC-T and BC-B, respectively (ScrFI digestion). In addition, 93%
and 100% of BC-T T-RFs were shared with BC-1 and BC-B, respectively. Most T-RFs (88-
100%) from all Boulder Clay profiles were detected in AM-3 profile which had about 3 times
more T-RFs (76) in total than other profiles.
The shared T-RFs between BC-1, BC-B, BC-T and AM-3 amounted to a fifth (20%, 22%) of
the AM-3 profile; therefore about 80% of AM-3 DNA fragments differed from the contents of the
Boulder Clay ice cores, according to the ScrFI digestion results.
MspI digestion produced more T-RFs for each profile and therefore fewer T-RFs were shared
between profiles. Twenty-one and 34% of BC-1 T-RFs were shared with BC-T and BC-B,
respectively. BC-T shared T-RFs at 65% and 62% with BC-1 and BC-B respectively, yet 73%
of BC-T T-RFs were shared with AM-3. Additionally, BC-1 shared 35% of T-RFs with AM-3
while BC-B shared 42% T-RFs with AM-3, altogether suggesting that Boulder Clay DNA
fragments were also present in AM-3.
Table 5. Cross-profile analysis based on shared T-RFs between AM-3, BC-1, BC-T and BC-B.
ScrFI digestion
Ice core profiles   BC-1 (17)(a)   BC-T (15)   BC-B (19)   AM-3 (76)
BC-1                -              93          79          20
BC-T                82(b)          -           79          20
BC-B                88             100         -           22
AM-3                88             100         89          -
MspI digestion
Ice core profiles   BC-1 (82)      BC-T (26)   BC-B (57)   AM-3 (94)
BC-1                -              65          49          31
BC-T                21             -           28          20
BC-B                34             62          -           26
AM-3                35             73          42          -
a The total number of T-RFs counted for a specific profile is displayed in brackets. T-RF count was done prior to implementing 1% relative height threshold to produce as much data points as possible for the analysis.
b Total number of T-RFs varied between profiles. The result in each cell is the percentage of shared T-RFs between two profiles, with respect to the column profile.
Additionally, as observed in the ScrFI digestion analysis, the shared T-RFs between BC-1, BC-
B, BC-T and AM-3 still amounted to a relatively small portion of the AM-3 profile (31%, 20%
and 26%), even though the number of T-RFs in Boulder Clay profiles was considerably higher
(82, 26, 57) in comparison to the ScrFI digestion.
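Because each cell of Table 5 is expressed with respect to the column profile, the table is not symmetric. A minimal sketch of the calculation follows. The function name and the placeholder T-RF labels are assumptions; the counts (17 and 15 T-RFs, 14 of them shared) were chosen only because they reproduce the BC-1/BC-T cells of the ScrFI digestion, and the figure of 14 shared T-RFs is an inference rather than a value reported in the text.

```python
def percent_shared(row_profile, column_profile):
    """Percentage of the column profile's T-RFs that also occur in the row profile."""
    column = set(column_profile)
    if not column:
        raise ValueError("column profile is empty")
    return 100 * len(set(row_profile) & column) / len(column)


# Placeholder T-RF labels: 17 for BC-1, 15 for BC-T, 14 in common (assumed).
bc1 = set(range(100, 117))
bct = set(range(103, 118))

bct_row_bc1_column = percent_shared(bct, bc1)  # shared / 17
bc1_row_bct_column = percent_shared(bc1, bct)  # shared / 15
print(round(bct_row_bc1_column), round(bc1_row_bct_column))  # 82 93
```

The same number of shared T-RFs thus gives a different percentage depending on which profile is used as the denominator, which explains why a profile with many T-RFs, such as AM-3, shows low values in its own column.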
In summary, with or without a 1% relative height threshold, AM-3 was the most diverse sample
and differed from Boulder Clay samples, and few DNA fragments were shared between all sites.
Additionally, Boulder Clay samples were similar to one another, and clustered separately from
the AM-3 ice-core sample. Amorphous Glacier and Boulder Clay differ lithologically and in
their geological ages (Holocene vs. Late Pleistocene), and therefore most probably support
different microbial populations.
2.3.2 In silico database composition
The in silico process of the TAP program is based on the RDP 16S rRNA sequences database
(Marsh et al., 2000; Cole et al., 2007). It was of interest to compare the outcome of the in silico
digestion to the composition and size of the original RDP database. If the in silico digestion
produced a seriously skewed representation of the bacterial taxa in the original RDP database,
the T-RFLP profiling would be biased as well and would be only partially
representative of the bacteria in the samples.
Figure 4. Phylogenetic composition of the databases. (A) Bacterial phylogenetic composition of RDP 9 16S rRNA gene sequence database; (B) Bacterial phylogenetic composition based on TAP in silico digestion with ScrFI and MspI; (C) Bacterial phylogenetic composition from all digested samples, after T-RFs were assigned bacterial identification.
The RDP 9 (release 61) database contained a total of 180,642 bacterial sequences (figure 4, A).
Of these, 33.5% were affiliated with Proteobacteria, 31.9% with Firmicutes, 12.7% with
Bacteroidetes (CFB group) and 8.8% with Actinobacteria (Cole et al., 2007; Cole et al.,
2009). Another 31 phyla were present in the database, 26 of which each accounted for less than
1% of the total number of sequences, while Acidobacteria, Cyanobacteria, Spirochaetes,
Verrucomicrobia and unclassified bacteria each accounted for slightly more than 1%.
The second database was produced in silico by the TAP program and contained 30,781
sequences (figure 4, B). Major groups included Proteobacteria (34.8%), Firmicutes
(32.7%), Bacteroidetes (14%, CFB group) and Actinobacteria (8.4%), similar to the RDP 9
database distribution. An additional 28 phyla were present in this database.
Cyanobacteria and unclassified bacteria each exceeded 1%, while 26 other phyla had lower
proportions. In terms of size, the TAP database was only 17% of the RDP 9 database, yet its
composition was similar to that of the RDP 9 database. Therefore, the
TAP program produced a representative database for the downstream process.
The third database, based on the T-RFLP analysis (figure 4, C), contained fewer bacterial
sequences than the TAP or RDP 9 databases and varied in the composition of the phylogenetic
groups. It contained potential cryospheric bacteria from the analysed samples, and consisted of
650 sequences in total. Four major phylogenetic groups were represented: Proteobacteria
(39.1%), Firmicutes (22.2%), Actinobacteria (13.2%) and Bacteroidetes (CFB group) (12.5%).
Acidobacteria, Cyanobacteria, Planctomycetes, Spirochaetes and unclassified bacteria were also
present with > 1% sequence abundance in the database and eight additional phyla were present
with < 1%.
Thus the T-RFLP database, generated in this study, contained seventeen phyla vs. the 32 and 35
phyla identified in the TAP and RDP 9 databases, respectively, with proportional shifts within
the four major phylogenetic groups - an increase in the Proteobacteria and Actinobacteria
sequences, a decrease in the Firmicutes, and no substantial changes within the Bacteroidetes
(CFB group).
We then continued to further analyse the T-RFLP profiles of each sample in order to gain an
overview of putative phylotypes (Pointing et al., 2009). The TAP database contained all the
possible T-RFs produced specifically by the MspI and ScrFI restriction enzymes; we
therefore normalized the individual profile of each sample to the TAP database (table 6). An
average ratio value above 1 indicated a higher portion of a specific phylum relative to the
original TAP database. Across all samples, for instance, there was a higher portion of
Proteobacteria, Firmicutes, Actinobacteria and Nitrospira (1.12, 1.03, 1.04 and 13.43, average
ratios respectively) in the samples than in the original TAP database. Conversely, Bacteroidetes
(CFB group), Cyanobacteria, and unclassified bacteria had an average ratio < 1 (0.47, 0.44 and
0.78, respectively) indicating a lower portion of these phyla in each profile, compared to the
phyla) and it had the largest number of phylogenetic groups compared to BC-T, BC-1 and BC-B
(Table 6). Cross-profile analysis also suggested that Amorphous Glacier shared common T-RFs
with Boulder Clay (table 5), and the TAP in silico projection indicated that these common
T-RFs were probably related to Proteobacteria, Firmicutes, Bacteroidetes (CFB group),
Actinobacteria, Cyanobacteria and Nitrospira.
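The normalization of each profile to the TAP database amounts to dividing the fraction of a phylum in a profile by its fraction in the TAP database and then averaging the resulting ratios across profiles. The sketch below uses the TAP fractions reported for the four major groups; the function names and the two example profiles are hypothetical and do not reproduce Table 6.

```python
from statistics import mean

# Fractions of the four major groups in the TAP database, as reported in the text.
TAP = {
    "Proteobacteria": 0.348,
    "Firmicutes": 0.327,
    "Bacteroidetes": 0.14,
    "Actinobacteria": 0.084,
}


def phylum_ratios(profile_fractions, reference=TAP):
    """Ratio of each phylum's fraction in a profile to its fraction in the reference."""
    return {p: f / reference[p] for p, f in profile_fractions.items() if p in reference}


def average_ratios(profiles, reference=TAP):
    """Average the per-profile ratios over the profiles in which a phylum occurs."""
    per_profile = [phylum_ratios(p, reference) for p in profiles.values()]
    phyla = set().union(*per_profile)
    return {ph: mean(r[ph] for r in per_profile if ph in r) for ph in phyla}


# Hypothetical phylum fractions for two profiles (not the thesis data).
example_profiles = {
    "profile 1": {"Proteobacteria": 0.40, "Bacteroidetes": 0.07},
    "profile 2": {"Proteobacteria": 0.38, "Firmicutes": 0.33},
}
averaged = average_ratios(example_profiles)
```

An averaged ratio above 1 indicates over-representation relative to the TAP database and a ratio below 1 under-representation. A phylum found in a single profile simply keeps that profile's ratio.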
Except for OP10 and the Thermotogae groups, all other phyla associated with AM-3 have
previously been found in Antarctic lakes, microbial mats and other cryospheric
environments, such as alpine permafrost and Siberian permafrost-affected soils (Franzmann and
Dobson, 1992; Brambilla et al., 2001; Bowman et al., 2003; Sheridan et al., 2003; Miteva et al.,
2004; Bai et al., 2006; Wagner et al., 2009).
Table 6. Distribution of potential phyla groups within each profile based on TAP database of 16S rDNA sequences.
a A partial list of phyla present in the TAP database post in silico MspI and ScrFI digestion.
b The ratio of each phylum in the TAP database.
c The ratio of each phylum in the T-RFLP profiles: AM, BC-T, BC-B and BC-1.
d The ratio of each phylum was divided by the ratio of its corresponding phylum in the TAP database. The resulting ratios were then averaged.
e “(X)” not an average, value based on one profile only.
2.3.3 Amorphous Glacier and Boulder Clay cryospheric bacteria
Phyla level analysis provided a broad overview and we proceeded to review the data at the
genus level. This would further correlate our results to current microbial cryospheric data, and
in particular, to findings from the Terra Nova Bay area.
A genus was denoted ‘cryospheric’ based on published reports of sequence data with 95% or
higher 16S rDNA sequence similarity to a specific genus isolated from cold environments
(Everett et al., 1999). This was deemed necessary in order to narrow down the possible
diazotrophic candidates. After applying the 95% sequence similarity criterion, 65 genera were
excluded from the analysis (data not shown) while 38 genera met the criterion.
The mean number of T-RFs following MspI digestion was 10.4 ± 1.7 for an individual profile
and a total of 26 different 16S rDNA T-RFs were observed from AM-3, BC-1, BC-T, BC-B
combined. From these 26 T-RFs, 16 T-RFs were singular peaks (appeared in one profile only)
with relative heights between 1.2% and 39.1% of the total fluorescence (T-RF 553 in AM-3
profile, table 4).
Five T-RFs were observed after MspI digestion in most profiles (T-RFs 30, 31, 35, 38 and 81).
However, the first four of these T-RFs were considered primer dimers and discarded from
further analysis. T-RF 81 had an average relative height of 8.8% and its in silico
identification (TAP Projected T-RF, TPT) was Bacilli spp. (Firmicutes) and the Nitrospira class.
Clones of both taxa have been reported from cryospheric environments (Bakermans et al., 2003;
Gilichinsky et al., 2007; Steven et al., 2007), and more specifically from Rod Bay (Terra Nova
Bay area) and Northern Victoria Land (Yakimov et al., 2004; Aislabie et al., 2009). Two T-RFs
following MspI digestion (73 and 145, table 4) were observed in 60% of the profiles with
relative heights below 4%, indicating that these fragments were not abundant after the PCR
amplification process. Only T-RF 145 had an in silico bacterial match, to various phylogenetic
groups: Gammaproteobacteria, Firmicutes and Actinobacteria.
The mean number of T-RFs following ScrFI digestion was 8.6 ± 2 per profile and a total of 13
different 16S rDNA T-RFs were observed across all profiles. Three out of the 13 T-RFs were
singular peaks. Four T-RFs following ScrFI digestion (43, 76, 81 and 145) were observed in
more than 80% of profiles and were associated in-silico with Proteobacteria, Actinobacteria,
Firmicutes, Bacteroidetes (CFB group) and green sulphur bacteria. T-RF 81 was the most
abundant fragment with an average relative height of 30.4 ± 16.2 %. Its TAP Projected T-RF
(TPT) was associated in-silico with nine different genera. Alicyclobacillus has been isolated
from Mount Melbourne, Northern Victoria Land (Nicolaus et al., 1998; Pepi et al., 2005) and
Bacillus is a common find in polar environments, as mentioned previously. Burkholderia and
Pseudomonas were reported previously from Rod Bay and Santa Maria Novella (Yakimov et
al., 2004; Lo Giudice et al., 2007). In addition, clones affiliated with Burkholderiales were
recently found in soil samples in Northern Victoria Land (Niederberger et al., 2008). Of the
remaining five genera associated with T-RF 81, three (Rikenella, Terrimonas and
Sporolactobacillus) have not been detected in cryospheric environments, yet Leptospirillum and
Rhodanobacter were reported previously in the cryospheric literature (Spirina et al., 2003;
Vishnivetskaya et al., 2006).
Three unique T-RFs were observed only in the AM-3 profile after ScrFI digestion (18, 116 and
796) with respective relative heights of 1.3%, 1.3% and 3.6%. T-RF 116 did not have an in
silico bacterial match. T-RF 796 was associated with the genera Sporolactobacillus and
Streptococcus from the Bacilli class. Sporolactobacillus has not been detected in cryospheric
environments to date, while Streptococcus has been discussed above. In silico matches to all
T-RFs within samples are listed in table 7.
A total of 20 putative diazotrophic genera were identified based on the analysis (table 7), 8 in
Boulder Clay samples and 17 in Amorphous Glacier samples, none relating to Cyanobacteria.
Of these diazotrophic bacteria, some have been previously reported from the Terra Nova Bay
area (table 1): Arcobacter, Burkholderia, Halomonas, Methylobacterium and Pseudomonas
(Proteobacteria), Bacillus and Paenibacillus (Firmicutes), and Arthrobacter (Actinobacteria).
Table 7. Genera and diazotrophic cryospheric bacteria in AM-3, BC-T, BC-B and BC-1. Shaded rows indicate genera previously found in Terra Nova Bay area.
Potential diazotroph N/Y (a)  Genera  AM-3  BC-T  BC-B  BC-1  Cryospheric References (b)
Proteobacteria
N Acidovorax + (Priscu et al., 1999; Liu et al., 2006; Lo Giudice et al., 2007)
Y Aeromonas + (Gilichinsky et al., 2007)
Y Afipia + (Priscu et al., 1999)
N Alteromonas + + (Feller et al., 1992; Gauthier et al., 1995) Now Pseudoalteromonas haloplanktis
Y Arcobacter + (Bowman and McCuaig, 2003; Yakimov et al., 2004)
Y Bradyrhizobium + (Zhou et al., 1997; Sheridan et al., 2003; Xiao et al., 2007)
Y Burkholderia + + (Christner et al., 2000; Yakimov et al., 2004)
N Coxiella + (Bowman and McCuaig, 2003)
Y Delftia + (Gaidos et al., 2004; Skidmore et al., 2005; Xiao et al., 2007)
Y Desulfobacterium + (Ravenschlag et al., 1999; Bowman and McCuaig, 2003)
N Erythrobacter + (Yakimov et al., 2004)
N Gallionella + + (Skidmore et al., 2005)
Y Halomonas + (Bowman et al., 1997; Xiang et al., 2004; Yakimov et al., 2004)
N Hydrogenophaga + (Liu et al., 2006; Amato et al., 2007)
N Lysobacter + (Vishnivetskaya et al., 2006; Steven et al., 2007)
N Marinobacter + + + + (Brinkmeyer et al., 2003; Lysnes et al., 2004; Yakimov et al., 2004)
Y Mesorhizobium + (Bowman and McCuaig, 2003)
Y Methylobacterium + (Yakimov et al., 2004; Miteva and Brenchley, 2005; Zhang et al., 2007b)
Y Pelobacter + (Brambilla et al., 2001; Sjöling and Cowan, 2003)
Y Pseudomonas + + + (Michaud et al., 2004; Yakimov et al., 2004; Lo Giudice et al., 2007)
N Shewanella + + + (Bowman et al., 2003; Yakimov et al., 2004; Lo Giudice et al., 2007)
N Sphingobium + (Xiang et al., 2005)
Total for each sample 20 3 6 3
Firmicutes
Y Bacillus + + + + (Nicolaus et al., 1996; Yakimov et al., 2004; Steven et al., 2007)
Y Clostridium + (Brambilla et al., 2001; Gilichinsky et al., 2005; Vishnivetskaya et al., 2006)
N Lactobacillus + (Segawa et al., 2005; Sundset et al., 2007)
Y Paenibacillus + (Bargagli et al., 2004; Pepi et al., 2005; Mindlin et al., 2008)
N Streptococcus + (Gaidos et al., 2004; Segawa et al., 2005)
N Syntrophococcus + (Sundset et al., 2007)
Total for each sample 6 1 1 1
Bacteroidetes
N Algoriphagus (Van Trappen et al., 2004)
N Bacteroides + + (Sheridan et al., 2003)
N Flavobacterium + (Bowman et al., 1997; Gilichinsky et al., 2007)
N Hymenobacter + (Hirsch et al., 1998)
Total for each sample 2 1 1 0
Actinobacteria
Y Arthrobacter + (Bargagli et al., 2004; Michaud et al., 2004; Lo Giudice et al., 2007)
N Corynebacterium + (Gilichinsky et al., 2007; Lo Giudice et al., 2007)
Y Curtobacterium + (Miteva and Brenchley, 2005)
Y Frankia + + (Christner et al., 2000)
N Janibacter + (Michaud et al., 2004; Miteva et al., 2004; Lo Giudice et al., 2007)
Y Micrococcus + (Petrova et al., 2002; Xiang et al., 2005; Steven et al., 2007)
N Mycobacterium + (Miteva et al., 2004)
N Nocardiopsis + (Abyzov et al., 1983)
N Rhodococcus + (Michaud et al., 2004; Yakimov et al., 2004; Lo Giudice et al., 2007)
Y Streptomyces + + (Kochkina et al., 2001; Zhang et al., 2002; Mannisto and Haggblom, 2006)
Total for each sample 8 0 4 0
Verrucomicrobia
N Prosthecobacter + (Bowman and McCuaig, 2003)
Total for each sample 1 0 0 0
Deinococcus-Thermus
N Thermus + (Sheridan et al., 2003)
Total for each sample 1 0 0 0
a Y denotes a genus with species which possess a copy of nifH gene, based on NCBI genomic databases.
b An abbreviated list of cryospheric references. Where possible, only references which presented clones with 95% and higher 16S rDNA sequence similarity to a specific genus were included.
2.4 Concluding remarks
Boulder Clay and Amorphous Glacier are two ice-free areas in Terra Nova Bay, Antarctica,
which differ in their geological origins and physico-chemical properties. In this study they
were assessed, for the first time, for their microbial content and biodiversity. To gather first
evidence for the bacterial communities in these glacial zones, we carried out terminal-restriction
fragment length polymorphism (T-RFLP) analysis on 16S rDNA using a universal bacterial
amplification protocol on two permafrost cores.
Microbial diversity differed between Boulder Clay and Amorphous Glacier and between the
different layers of Boulder Clay. Bray-Curtis cluster analysis suggested that the Boulder Clay
bacterial profiles were similar to each other, but clustered separately from the Amorphous
Glacier bacterial profile. With our current data it is not possible to ascertain definitively
whether the difference in geological age, or other properties that distinguish one site from the
other, is responsible for these results.
Our analysis suggested that the microbial population of the Boulder Clay active layer was less
diverse than that of the other layers at this site. This may be due to two factors: (a) vertical
water movement, which permitted micro-organisms to penetrate into deeper permafrost layers
rather than remain on the surface; and (b) hypersaline brine pockets that remained liquid at
very low temperatures, providing basic conditions for the survival of the microbial community.
Another finding of this analysis was that the Amorphous Glacier sample included potentially 38
cryospheric genera. Boulder Clay and Amorphous Glacier possibly shared in common the
following genera: Gallionella, Burkholderia (Betaproteobacteria), Alteromonas, Marinobacter,
group), Frankia and Streptomyces (high G+C Gram-positive). Each of these phylotypes includes
species that are psychrotolerant to psychrophilic and microaerophilic. These phylotypes were
either detected in marine environments or shown to be tolerant to NaCl salt stress, which is not
surprising considering the connection between salt tolerance and cold resistance in bacteria in
terms of survival mechanisms (Dawson and Gibson, 1987; Deming, 2002; Yakimov et al.,
2004; Lo Giudice et al., 2007; Pumbwe et al., 2007).
The Amorphous Glacier sample also potentially included 20 nitrogen-fixing genera, based solely
on the known presence of a nifH gene copy in their genomes. Unfortunately, we were unable to
further characterise this community due to the lack of source material after a lengthy
optimisation of the T-RFLP analysis.
This preliminary work suggested the presence of a common group of cold- and desiccation-
resistant bacteria, some of which might be nitrogen fixers. In general, our molecular analysis
provided relatively few data points, and the bacterial identification is by no means
definitive; further sampling and analysis would therefore be required. Such research
would help confirm the community composition and correlate it with the geological and habitat
characteristics of Amorphous Glacier and Boulder Clay.
Figure 1 Shark Bay, Western Australia. Image Google Earth. Salinity values are in parts per thousand (ppt), modified from O’Leary (2008).
Chapter 3 Diazotrophic diversity in columnar stromatolites of Shark Bay, Western Australia
3.1 Introduction
Biologically accessible nitrogen is essential for a thriving microbial community. Potential
nitrogen fixers (diazotrophs) have not been investigated to date in the columnar
stromatolites of Shark Bay, Western Australia. Nothing is known of the nitrogen cycle in Shark
Bay’s modern stromatolite community, and it is of interest to compare the characteristics of
Shark Bay’s diazotrophic community to those of other comparable hypersaline microbial systems.
The aim of this study was to identify and characterise the diazotrophic community in columnar
stromatolites from Hamelin Pool in Shark Bay (figure 1).
Shark Bay is a 14,000 km² World Heritage site off the central coast of Western Australia
(24°-26° S, 113°-114° E). It is a semi-enclosed embayment comprising two long, narrow
reaches: Freycinet Reach (105 km long and 20–35 km wide) and Hopeless Reach (35 km long and
40 km wide). These reaches are separated by the Peron Peninsula and have a mean water depth
of 10 m (Smith and Atkinson, 1983). Shallow, carbonate-rich banks exist in both reaches,
effectively creating bays 1-2 m deep with relatively low oceanic water influx, which are
regulated mainly by diurnal to semi-diurnal tidal processes (Burling et al., 2003).
Faure Sill, a major sea grass-covered sand bank, divides
Hopeless Reach into two unequally sized bays -
L'Haridon Bight and Hamelin Pool. The seawater
temperature within the bay varies between 17°C in
August and a maximum of 27°C in February (Bureau of
Meteorology, 2011). The low oceanic circulation within the bay and the low, intermittent
rainfall (<200 mm y⁻¹), together with the high evaporation rates (>2000 mm y⁻¹), form a NW-SE
salinity gradient from oceanic to hypersaline levels: 35-40 ppt (oceanic), 40-56 ppt
(metahaline) and 56-70 ppt (hypersaline), the latter being almost twice that of seawater
(O'Leary et al., 2008).
Figure 2 Three stromatolite morphotypes from Shark Bay (A) columnar (B) smooth (C) pustular
Within the shallow, hypersaline pools of Shark Bay lie vast banks of benthic microbial deposits
known as stromatolites (Riding, 1999; Jahnert and Collins, 2011). The word stromatolite is
derived from the Latin word stroma, meaning bed covering, and the Greek word strōmat,
meaning spreading out, as well as lithos, meaning stone. Geologists have identified fossilized
stromatolites in the rock record for more than 200 years (Walter, 1976). The oldest
fossilized stromatolite was identified in the Dresser Formation of the Pilbara subgroup, Western
Australia, dated at 3.496 Ga, from the Archaean eon (Schopf, 2006). Very few stromatolite
structures have been found in Archaean rocks (which are less well preserved in the rock record),
while most structures to date have been found in rocks from the Proterozoic and Phanerozoic
eons, 2.5 Ga to the present (Bertrand-Sarfati and Walter, 1976; Krylov et al., 1976;
Serebryakov and Walter, 1976a, b; Schopf, 2006).
The ancient origins of stromatolitic deposits have led to Shark Bay modern stromatolites being
referred to as “living fossils”, and marked them as important to our understanding of the origin
of life on Earth. Shark Bay entered UNESCO’s World Heritage List in 1991. As stated in the
nomination, the foremost justification for the inclusion was that the
Hamelin Pool stromatolites represented an ancient life form in
existence, and Hamelin Pool would be the classic site for study of these
“living fossils” (UNESCO, 1991).
There are five main stromatolite morphologies known to exist in Shark
Bay - pustular, smooth, cerebroid, microbial pavement, and columnar or
colloform microbial deposits (Hoffman, 1976; Jahnert and Collins,
2011). Pustular, smooth mats and columnar morphotypes are the three
best known and studied (Logan, 1961; Logan et al., 1974; Hoffman and
Walter, 1976; Playford et al., 1976; Burns et al., 2004; Allen et al.,
2009; Burns et al., 2009). Pustular mats are irregular, coarsely
fenestrated, non-laminated mats, usually found in the upper intertidal
zone (figure 2, C). Smooth mats in contrast, are finely laminated, with
distinct, well-defined layers, usually found in the lower intertidal zone.
Columnar stromatolites are usually found in the intertidal to subtidal
zones and exist down to a depth of 1-2 m below the sea surface. Club-shaped, with a coarsely
laminated internal texture, they are highly calcified and reach up to 1.5 m in height, with
spherical tops (Hoffman, 1976; Playford et al., 1976; Jahnert and Collins, 2011).
The various stromatolite morphologies represent an interplay between environmental and
microbial processes. Environmental factors have been suggested to control the external
morphological development of a stromatolite structure. These factors include, for instance,
wave energy, tidal amplitude, water level and turbulence, sand waves, sediment grain size and
hard or soft substrates (Logan, 1961; Logan et al., 1964; Logan and Cebulski, 1970;
Logan et al., 1970; Playford et al., 1976; Dupraz et al., 2009). The active microbial
component, in turn, produces repetitive cemented grain layers and internal laminae, mainly by
precipitating aragonite microcrystals and repeatedly trapping and binding sediment particles
(Andres and Reid, 2006). Biotic and abiotic factors thus come together to promote the
outgrowth of stromatolites on both the micro- and macro-scale.
Microbiologists took a keen interest in the microbial component of the stromatolites and
investigated it using microscopy, culturing techniques and, more recently, molecular
methodologies. Table 1 provides a summary of the microorganisms identified in
stromatolites from Hamelin Pool and from Guerrero Negro, Baja California Sur, Mexico, a highly
similar saline environment discussed later in this chapter.
Combining microscopic and molecular tools has provided researchers with a qualitative and
quantitative view of the microbial communities residing in stromatolite mats. In the past,
microscopic observations in the stromatolite mats identified mainly cyanobacterial species -
Microcoleus chthonoplastes, Entophysalis deusta, Schizothrix fuscescens and Leptolyngbya spp.
(Logan, 1961; Golubic and Walter, 1976) and there is no record of their nitrogen fixation
potential or actual rates, under local conditions. PCR based research has increased the number
of identified microorganisms several fold with members of the Archaea, Bacteria and Eukaryota
detected in stromatolite samples, as can be seen in table 1. Classification of the functional
groups in stromatolites has identified Archaea as involved in fermentation (mainly
methanogenesis), cyanobacteria as oxygenic photosynthetic producers and nitrogen fixers, and
diatoms as oxygenic photosynthesisers (Paerl et al., 2000; Des Marais, 2003; Dupraz and
Visscher, 2005). Aerobic heterotrophs, anoxygenic phototrophs, sulphate reducers and sulphide
oxidizers from the Proteobacteria, Actinobacteria, Firmicutes and Bacteroidetes groups are
apparently involved in several overlapping processes: fermentation, denitrification, nitrogen
fixation and sulphur reduction/oxidation, which are all tightly bound to the light levels and
oxygen/sulphur profiles within the mats (Paerl et al., 2000; Des Marais, 2003; Dupraz and
Visscher, 2005; Goh et al., 2008; Allen et al., 2009; Burns et al., 2009).
Table 1. Stromatolite related microorganisms (genus level) from published studies. Potential diazotrophs which contain nifH gene are highlighted in bold.
Environment: Hamelin Pool, Shark Bay (a); Guerrero Negro, Baja California Sur, Mexico (b); Hamelin Pool, Shark Bay (c)
* 16S rDNA sequence similarity was 90% - 95% to a designated genus. Their identification should therefore be cautiously accepted (Everett et al., 1999).
(a) Data collected with microscopic methods only (Logan, 1961; Bauld et al., 1986).
(b) Data collected with microscopic and molecular methods (Javor and Castenholz, 1981; Risatti et al., 1994; López-Cortés et al., 2001; Ley et al., 2006).
(c) Data collected by microscopic and molecular methods (Bauld et al., 1986; Burns et al., 2004; Papineau et al., 2005; Goh et al., 2006; Allen et al., 2008; Allen et al., 2009).
Diazotrophs have not been investigated to date in columnar stromatolites. In general, the
identification of nitrogen fixers has advanced considerably with the introduction of molecular
techniques. These techniques are based on DNA and RNA extraction, as well as on the polymerase
chain reaction (PCR; Mullis and Erlich, 1988), and are considered better suited to exploring
natural microbial diversity (Amann et al., 1995; Head et al., 1998). The possible biases
generated by DNA extraction methods and PCR kinetics were briefly discussed in the introduction
chapter (section 1.8.2) and have been addressed in this study (see the methods section).
The aim of this study was to use molecular methodology to ascertain and characterise
diazotrophs in columnar stromatolites. Important nitrogen fixers in stromatolites from other
environments were usually cyanobacterial representatives from the Nostocales, Chroococcales
and Oscillatoriales, as well as anaerobic, sulphate-reducing δ-Proteobacteria representatives
(Steppe et al., 2001; Fourçans et al., 2004; Jenkins et al., 2004; Omoregie et al., 2004a;
Omoregie et al., 2004b; Jungblut et al., 2005; Yannarell et al., 2006; Desnues et al., 2007;
Leuko et al., 2007).
Nothing is known of the nitrogen cycle in Shark Bay’s stromatolitic community, and it is of
interest to compare Shark Bay’s diazotrophic community characteristics to other comparable
hypersaline microbial systems in order to broaden our understanding of the nitrogen fixation
processes occurring within extant stromatolites and by extrapolation, processes which might
have occurred in extinct stromatolites.
Figure 3 Left: Map of Shark Bay region. Inset image shows Shark Bay’s location on the west coast of Australia. Image copyright GeoScience Australia. Right: Low tide at Hamelin Pool, Telegraph station, showing columnar stromatolites. Image copyright Torben Rübke, 2006.
3.2 Materials and methods
3.2.1 Sample collection and sample sites
Sampling was conducted by former lab students in the intertidal region of Hamelin Pool at
Telegraph Station at low tide in Shark Bay, Western Australia in 1996 and May 2004
(26°24’03” S, 114°09’36.1” E, figure 3).
Intertidal columnar stromatolite pieces were collected by cracking the stromatolite with a
geological hammer about 2 cm from the top of the stromatolite or collecting a whole small
stromatolite. All samples were placed in sterile specimen bags and kept at 4 °C during transport
back to the laboratory, where they were stored in the dark at 4 °C until further processing. All
samples were collected and handled with sterile instruments throughout the course of the study.
No other environmental data were collected.
3.2.2 DNA isolation and PCR amplification of nifH genes
Within two weeks of storage at 4 °C, samples were processed and DNA was extracted in order to avoid
potential DNA degradation. A rock hammer washed with 70% ethanol and flamed was used to
break small chunks out of stromatolite specimens from 2004 and 1996. Approximately 1 cm³
fragments of the stromatolite were ground to a fine paste using a sterile mortar and pestle.
Genomic DNA was extracted by the method of Neilan (1995). Approximately 100 mg of fine
paste was transferred to a 1.5 ml eppendorf tube and suspended in 567 μL TE buffer (10 mM
Tris-HCl, 1 mM EDTA, pH 8.0), to which 30 μL 10% SDS and 3 μL Proteinase K (10 mg ml⁻¹)
were added. The samples were incubated at 37 °C for 3 h with intermittent shaking.
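As a quick check on the lysis mix described above, the final reagent concentrations can be worked out from the stated volumes. The sketch below is illustrative only: it assumes the stated volumes are simply additive (600 μL in total) and ignores the volume contributed by the roughly 100 mg of stromatolite paste. The function name is hypothetical.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Dilution rule C1 * V1 = C2 * V2, solved for C2."""
    return stock_conc * stock_volume_ul / total_volume_ul

# Volumes stated in the protocol (microlitres).
te_buffer_ul, sds_ul, proteinase_k_ul = 567, 30, 3

# Assumption: volumes are additive and the paste volume is negligible.
total_ul = te_buffer_ul + sds_ul + proteinase_k_ul

sds_percent = final_concentration(10, sds_ul, total_ul)                  # stock: 10 % SDS
proteinase_k_mg_ml = final_concentration(10, proteinase_k_ul, total_ul)  # stock: 10 mg/ml

print(f"total volume: {total_ul} uL")
print(f"SDS: {sds_percent:.2f} %")
print(f"Proteinase K: {proteinase_k_mg_ml * 1000:.0f} ug/ml")
```

Under these assumptions the lysate contains about 0.5% SDS and about 50 μg ml⁻¹ Proteinase K.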
An additional step of 5 cycles of freezing at -40 °C and thawing was added to ensure complete
cell lysis. Next, 100 μL of 5 M NaCl was added to the lysate and mixed thoroughly before the
addition of 80 μL CTAB solution (10% w/v cetyltrimethylammonium bromide in 0.7 M NaCl),
and the mixture was incubated at room temperature overnight (Wilson, 2001).
An equal volume of phenol: chloroform: isoamyl alcohol (25:24:1) was added to the supernatant
and mixed thoroughly before centrifugation at 14,000 g for 5 min at RT. The top layer of the
supernatant was transferred to a fresh tube and the DNA precipitated in 50% isopropanol and
0.4 M potassium acetate. The samples were incubated at RT for 30 min or at 4 °C overnight, and
then centrifuged at 14,000 g for 5 min to pellet the DNA. The supernatant was discarded and the
pellet washed with 70% ethanol, air dried and resuspended in 30 μL of sterile MilliQ water.
Two replicates of genomic DNA extracted from the 1996 or 2004 samples (5 ng μL⁻¹) were used in a nested PCR
to amplify the nitrogenase gene nifH (Omoregie et al., 2004c). The first PCR in the nested
approach was performed using 0.3 units of Taq polymerase (Sigma-Aldrich, St. Louis, MO) in a
20 μL reaction mix containing 2.5 mM MgCl2, 1x Taq-Polymerase reaction buffer, 0.2 mM
dNTPs (Fisher Biotec, WA, Australia), and 2 pM of each of the primers NifH3 (5' ATR TTR
TTN GCN GCR TA 3') and NifH4 (5' TTY TAY GGN AAR GGN GG 3') (Zani et al., 2000), 1
μL of genomic DNA (5 ng μL⁻¹) and sterile MilliQ water to a total volume of 20 μL. Thermal
cycling was performed in a GeneAmp PCR System 2400 Thermocycler (Perkin Elmer,
Norwalk, CT). Thermal cycling conditions for the amplification of bacterial nifH genes were as
follows: an initial denaturation step at 94 °C for 4 min was followed by 30 cycles of DNA
denaturation at 94 °C for 1 min, primer annealing at 55 °C for 1 min and strand extension at 72 °C
for 1 min, with a final extension step at 72 °C for 7 min. Two microliters of the first PCR
reaction were used for a second amplification round using primers NifH1 (5'-TGY GAY CCN
AAR GCN GA-3') and NifH2 (5'-ADN GCC ATC ATY TCN CC-3') (Zehr and McReynolds,
1989). The reaction mix and amplification protocol were as described above except for
increasing the annealing temperature to 57°C. All PCR experiments included a negative control
reaction without DNA template, and a positive control using DNA from the reference strain
Nostoc PCC 7120.
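The four nifH primers listed above contain IUPAC ambiguity codes (R, Y, D, N), so each primer name stands for a pool of oligonucleotides. As a minimal sketch, the size of each pool can be counted by multiplying the number of bases each code represents. The dictionary and function names below are hypothetical; the primer sequences are those given in the text.

```python
# Number of bases each IUPAC nucleotide code can stand for.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}

def degeneracy(primer):
    """Return how many distinct oligonucleotides a degenerate primer represents."""
    total = 1
    for base in primer.replace(" ", "").upper():
        total *= IUPAC_DEGENERACY[base]
    return total

# Primer sequences as written in the methods (5' to 3').
PRIMERS = {
    "NifH3": "ATR TTR TTN GCN GCR TA",
    "NifH4": "TTY TAY GGN AAR GGN GG",
    "NifH1": "TGY GAY CCN AAR GCN GA",
    "NifH2": "ADN GCC ATC ATY TCN CC",
}

for name, sequence in PRIMERS.items():
    print(name, degeneracy(sequence))
```

By this count NifH1, NifH3 and NifH4 are each 128-fold degenerate and NifH2 is 96-fold degenerate.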
PCR products were visualised on 1% and 2% agarose gels (molecular biology grade, Progen
Pharmaceuticals, QLD, Australia) with 1x TAE-buffer and stained by ethidium bromide (1 μg
ml⁻¹) for 10-15 min. Nucleic acids were visualised via UV transillumination (Gel Doc 2000,
BioRad, Hercules, CA) using QuantityOne 4.1R software (BioRad, Hercules, CA) and raw
images were exported in jpeg format for later visualisation.
3.2.3 Clone libraries and Restriction Fragment Length Polymorphism (RFLP)
Fresh PCR products (containing an A-overhang at the 3’ end) of the nifH gene amplification
were ligated into the pGEM-T Easy vector (Promega, Madison, WI) according to the
manufacturer’s instructions. From each clone library, at least 50 clones containing inserts were
selected and amplified using the vector specific primers MpF and MpR. PCR products of the
correct size were precipitated by transferring the remaining reaction mixture to a 1.5 ml
eppendorf tube, adding a double volume of ice-cold 100% ethanol, and incubating on ice for
15 min. The samples were centrifuged at 14,000 g for 15 min, the supernatant discarded and the
pellet washed once with 200 μL freshly made 70% ethanol. The resultant pellets were dried
using a SpeedVac vacuum centrifuge (Thermo Fisher Scientific Inc., Waltham, MA) or left with
open caps under aluminium foil at room temperature, after which they were resuspended in
10-15 μL sterile MilliQ water. To verify that the PCR product had not been lost during the ethanol
precipitation, the cleaned PCR products were visualised on 1% agarose gels (molecular biology
grade, Progen Pharmaceuticals, QLD, Australia) with 1x TAE-buffer, stained by ethidium
bromide (1 μg ml⁻¹) for 10-15 min and visualised via UV transillumination (Gel Doc 2000,
BioRad) using QuantityOne 4.1R software (BioRad).
Each clone was subjected to duplicate Restriction Fragment Length Polymorphism (RFLP)
analysis using restriction enzymes ScrFI and MspI (New England Biolabs, Ipswich, MA)
separately. Each digest reaction contained 3 μL PCR product, 1 μL of the corresponding
enzyme buffer, 2 units of restriction enzyme and sterile MilliQ water to a total volume of 10 μL.
The digests were incubated at 37°C overnight. Clones’ RFLP patterns were analysed manually
after electrophoresis on 2% agarose gels as previously described. At least one clone from each
unique RFLP pattern was sequenced.
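The RFLP patterns described above were read manually from gels. For illustration only, the expected fragment lengths of a digest can also be predicted in silico from a sequence. The sketch below assumes the standard recognition sites of the two enzymes (MspI cuts C^CGG, ScrFI cuts CC^NGG), a linear molecule and complete digestion. The toy sequence and all names are hypothetical and are not taken from the clone libraries.

```python
import re

# Recognition pattern and cut offset (bases from the start of the site).
# Assumed specificities: MspI C^CGG, ScrFI CC^NGG.
SITES = {
    "MspI": ("CCGG", 1),
    "ScrFI": ("CC[ACGT]GG", 2),
}

def fragment_lengths(sequence, enzyme):
    """Return fragment lengths, in 5' to 3' order, after a complete digest."""
    pattern, offset = SITES[enzyme]
    sequence = sequence.upper()
    # A lookahead is used so that overlapping sites are all found.
    cuts = [m.start() + offset for m in re.finditer(f"(?=({pattern}))", sequence)]
    bounds = [0] + cuts + [len(sequence)]
    return [end - start for start, end in zip(bounds, bounds[1:])]

# Hypothetical 21-base toy sequence with one MspI site and one ScrFI site.
toy = "AAAACCGGTTTTCCAGGAAAA"
print("MspI :", fragment_lengths(toy, "MspI"))
print("ScrFI:", fragment_lengths(toy, "ScrFI"))
```

Clones that give identical fragment lists for both enzymes would fall into the same RFLP pattern, which is the grouping criterion applied manually in this study.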
3.2.4 DNA sequencing
Sequencing of selected clones was carried out using the PRISM Big Dye cycle sequencing
system with MpF or MpR primers and 3-49 ng of the precipitated product.
The sequencing reaction products were transferred to a 1.5 ml tube and precipitated by the
addition of 16 μL sterile MilliQ water and 64 μL 95% ethanol and mixed thoroughly. After
incubation at RT for 15 minutes, the samples were centrifuged at 16,000 g for 20 min and all the
supernatant was removed. The pellet was washed and dried as above. The sample was submitted
for automated sequencing at The Ramaciotti Centre for Gene Function Analysis, UNSW, using
the Applied Biosystems 3730 DNA Analyser (Foster City, CA) and analysed with Applied
Biosystems Sequencing Analysis 5.1.1 software provided by Applied Biosystems.
3.2.5 Phylogenetic sequence analysis
Sequence chromatograms were manually checked for signal quality with “ABI and SCF Trace
Viewer” embedded in “BioEdit Sequence Alignment Editor” software version 7.0.5.3 (Hall,
1999). Sequences with high background signal noise were discarded from further analysis.
The 2004 and 1996 stromatolite clone library sequences were initially batch edited with
Sequences of the nifH clones are available under GenBank accession numbers JF826460-
JF826496.
3.3 Results and discussion
3.3.1 General methodology consideration
In order to minimize potential biases, this study used a DNA extraction method known to
produce high-quality DNA without skewing the original composition of the
microbial community in the sample (Leuko et al., 2007; Goh et al., 2008). PCR amplification was
limited to 30 cycles in order to avoid introducing amplification biases, and the nifH primer sets
used in this study have been shown not to cause major bias in the PCR amplification process
(Diallo et al., 2008). To ensure that no false-positive results arose from contaminated PCR
reagents, all reactions and gel visualizations included negative controls, and BLAST results
were screened against sequences known to arise from such contamination (Zehr et al., 2003b;
Goto et al., 2005).
BLAST and BLASTX analyses are useful tools for taxonomical identification in terms of
evolutionary interpretations, under certain known limitations (States and Botstein, 1991).
Taxonomical affiliation based on 16S rDNA sequences generally assumes that a 1 to 1.5 %
sequence difference is appropriate for defining strains within the same species, and a 2 to 5 %
difference for species within the same genus (Clarridge, 2004). Translated NifH sequences are
more informative, as they relate to the protein itself, a 3-D entity composed
of primary, secondary and higher levels of structure and subject to biochemical
influences (Stormo, 2002). The selective pressure to adapt to a micro-environment
while retaining a specific functionality can produce nucleotide sequences that are only distantly
related, yet whose amino acid translations contain homologous regions reflecting
structural and functional similarities (Sander and Schneider, 1991). A 2 to 8 % difference in the
NifH amino acid sequences represents variation in the amino acid sequence that relates to
structural changes. Thus, a 2 to 8 % difference was considered appropriate for a positive
identification of a Fe protein, based on past studies analysing structural homologies and
sequence similarities (Sander and Schneider, 1991; Hobohm et al., 1992).
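To make the percentage differences discussed above concrete, the following minimal sketch computes the percent difference between two aligned amino acid sequences. It is not the procedure used in this study, where similarities were taken from BLAST and BLASTX output. It assumes the two sequences are already aligned to equal length and that columns containing a gap are skipped. The function name and the short toy fragments are hypothetical.

```python
def percent_difference(seq_a, seq_b, gap="-"):
    """Percent of mismatching residues over ungapped aligned columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are ignored
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * mismatches / compared

# Hypothetical toy fragments, not real clone sequences.
print(percent_difference("CDPKADSTRL", "CDPKADSTKL"))  # one mismatch in ten columns
print(percent_difference("CDPKA-STRL", "CDPKADSTRL"))  # gap column skipped
```

On sequences of about 120 residues, as analysed here, a 2 to 8 % difference corresponds to roughly 2 to 10 mismatching residues.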
In this study, BLAST and BLASTX results passed significant statistical thresholds: BLAST
expect (E) values ranged from e-88 to e-168 and BLASTX E-values ranged from
e-27 to e-60 (Ladunga, 2002a, b; McGinnis and Madden, 2004). In addition, NifH translated
sequences were longer than 100 residues and the remaining hits for each sequence with lower E-
values were also identified as Fe protein (NifH). We therefore assumed our translated sequences
were homologs of the nitrogenase Fe protein component (Rychlewski et al., 2000) and
attributed sequence differences to biological adaptations to ecological constraints, or inter-
species differentiation.
3.3.2 2004 clone library BLAST & BLASTX analysis
NifH genes were present and amplified from the total DNA extracted from the 2004 stromatolite
samples (figure 4). In total, 38 clones containing the correctly sized insert (350 bp) were
obtained and analysed (figure 5). RFLP analysis was performed on 30 random positive clones
from the 2004 library, which grouped into five patterns (see figure 6 and table 2).
Figure 4 Products obtained during the amplification of nifH from stromatolite DNA extractions using nested primers. Left pane: amplification products after the first step of nifH PCR amplification; right pane: amplification products after the second step. Lane 1: 2004 stromatolite; 2: 1996 stromatolite; 3, 4: positive control Nostoc PCC 7120; 5: negative control sterile MilliQ H2O; M: 0.5 μg μL⁻¹ GeneRuler™ DNA Ladder Mix (Fermentas, Ontario).
Figure 5 Example of a 1% agarose gel showing the PCR amplification of 13 clones with the nifH insert before (left) and after (right) PCR product clean up procedure. Expected band size of a pGEM vector without nifH gene insert: 236 bp. Expected band size with nifH insert present: 586 bp. Lanes 1 - 13: 2004 stromatolite clones (white colonies); Lane B: blue colony product (negative control); M: 0.5 μg μL⁻¹ GeneRuler™ DNA ladder Mix (Fermentas, Ontario).
Figure 6 2% agarose gel showing RFLP patterns using ScrFI (bottom) and MspI (top) restriction enzymes on 12 clones from 2004 stromatolite library. M: 0.5μg/μL GeneRuler™ DNA ladder Mix (Fermentas).
Table 2 Modified representations of the RFLP digestion patterns of the 2004 stromatolite clones. Gel lanes: 1, 2, 3, 5, 6, 7, 8, 9, 10, 12, 13.
All these species are strict anaerobes, sulphide or sulphate reducers, isolated from marine and
freshwater sediments, and are considered mesophilic (Schink, 1992; Sakaguchi et al., 2002;
Cravo-Laureau et al., 2004; Sorokin et al., 2008). Nitrogenase activity has yet to be
characterised in these genera, though nifH DNA fragments have been amplified from mostly
marine sediments (Zadorina et al., 2009; Bertics et al., 2010; Quaiser et al., 2010).
Several individual NifH clones had 97% sequence similarity to Halorhodospira halophila SL1
and Teredinibacter turnerae T7901. H. halophila is an anaerobic halophilic phototroph with
nitrogenase activity under light conditions, and its nifH DNA fragments have been amplified
from various environments, usually at relatively low sequence similarities (Tsuihiji et al., 2006;
Falcón et al., 2007; Zadorina et al., 2009; Ma et al., 2010). T. turnerae is a mesophilic
endosymbiotic γ-Proteobacterium isolated from molluscs (Bivalvia: Teredinidae) which is able
to fix nitrogen under microaerobic conditions (Distel et al., 2002). Cyanothece sp. CCY0110
(ZP_01727765) was the only cyanobacterial match in the 1996 clone library, with 95% average
sequence similarity.
3.3.4 BLAST and BLASTX comparative analysis
Uncultured nifH sequences from microbial mats in Guerrero Negro (GN), Mexico, dominated by
a Lyngbya sp., were a common match in both clone libraries based on BLAST analysis
(Moisander et al., 2006). Common nifH sequences suggest, to a certain extent, that the same
diazotrophs were present in the 1996 and 2004 stromatolite communities, but do not imply that
they employed similar nitrogen fixation patterns.
The GN microbial mats were collected from a tidal flat that underwent alternating
desiccation and wetting periods depending on tidal flooding; they were therefore subjected to alternating
levels of salinity (seawater to hypersaline), in a similar fashion to the intertidal columnar
stromatolites in Hamelin Pool. Additional cyanobacterial species and purple and colourless sulphur
bacteria were also identified there (Omoregie et al., 2004b). Given the varying environmental
salinity levels and the methodological differences between our study and the GN mat studies, it was
not surprising that the clones were only 86%-87% similar to the nifH nucleotide sequences of the
GN mat samples (Moisander et al., 2006). The
1996 and 2004 stromatolite nifH clones may represent novel sequences due to local adaptations
to their own environment and its specific characteristics, such as salinity levels and nutrient
dynamics.
Based on the BLASTX results, Cyanothece sp. CCY0110 (ZP_01727765, Cyanobacteria)
emerged as the common match for both clone libraries at 93% and 95% average sequence
similarity. This was a major component of the 2004 clone library but less prevalent in the 1996
clone library (only 8.11% of the clones). Cyanothece sp. CCY0110 is frequently found in and
cultured from various hypersaline and marine environments (Garcia-Pichel et al., 1998;
López-Cortés et al., 2001) and, as mentioned previously, is known to fix nitrogen under dark, aerobic
conditions.
3.3.5 Phylogenetic analysis
Most of the reference NifH sequences were obtained from the Swiss-Prot database (Boeckmann
et al., 2003), in which they were manually annotated, reviewed and verified, thus providing a
reliable genetic framework into which we integrated the clone sequences and additional
BLASTX hits (see appendix A, table A-5). The LG model (Le-Gascuel) is an improved model
over WAG and JTT in estimating amino acid substitution rates and in general provides better
tree topologies and likelihood probabilities (Le and Gascuel, 2008; Guindon et al., 2010). The
LG model takes into consideration not only variations in amino acid substitutions per site but
also whether a site is slow or fast to change due to evolutionary constraints. Using deduced
amino acid sequences instead of nucleotides might cause loss of some information in regards to
synonymous vs. non-synonymous substitutions, which then might provide a different
evolutionary presentation of the nifH gene. Yet, because nifH is a coding gene for a protein, it is
logical to view the code at the amino acid level, where it will be subjected to far more selective
pressure arising from physical and chemical conditions within the cell. The resulting branch
support values in this study were satisfactory and provided a reliable representation of the
possible evolution of nifH genes amongst Archaea and Eubacteria (Posada et al., 2009).
For this analysis, a total of 232 NifH amino acid sequences with an average length of 120
residues, were subjected to a maximum likelihood analysis. This produced a phylogenetic tree
with four major clusters, corresponding to NifH designated clusters I-IV (Chien and Zinder,
1996; Zehr et al., 2003a; Raymond et al., 2004a), plus two smaller clusters, one affiliated with
Desulfuromonadales representatives from the -Proteobacteria and another cluster of
Roseiflexus spp. NifH amino acid sequences (93 and 96 branch support values, respectively,
figure 7).
Briefly, cluster I contained the conventional Mo-containing NifH sequences, most of them
affiliated with Proteobacteria, Cyanobacteria and Firmicutes (figure 8). Cluster I contained a
total of 46 stromatolite NifH clones and had a branch support value of 88 within the entire NifH
tree. Cluster II included phylotypes with an alternative nitrogenase containing Fe instead of Mo
or V. These included archaeal methane producers, and alternative nifH genes (nifH2, nifH3)
from Firmicutes, α, -Proteobacteria and Spirochaetes. Cluster III included NifH sequences of
anaerobic diazotrophs with conventional nitrogenase (mostly Mo), mainly from the δ-Proteobacteria,
Spirochaetes, Chlorobi group, Firmicutes and Archaea (figure 9). Cluster III
contained a total of 18 stromatolite NifH clones and had a support value of 75 within the entire
NifH tree. NifH Cluster IV was very divergent and included mostly strictly anaerobic archaeal
genera, some with alternative nifH genes: Methanopyrus, Methanosarcina, Methanobrevibacter,
Methanothermobacter (nifH2), Methanobacterium, Methanocaldococcus and Methanococcus.
The only exception to the Archaea was an alternative nifH gene copy of Rhodobacter capsulatus
(nifH2), a phototrophic purple non-sulphur α-Proteobacterium.
Figure 7 A phylogenetic tree based on Maximum-likelihood analysis of partial NifH amino acid sequences. Sequences determined in this study were given an alphanumeric prefix RSAYYYY and are marked bold; number of clones is in parenthesis; the scale bar represents the number of substitutions per 100 bases.
Eleven NifH clones from the 1996 stromatolite library were affiliated with the closest out-group
to cluster I - Pelobacter carbinolicus DSM 2380 (δ-Proteobacteria, figure 8).
Three sub-clusters in cluster I, 1-Cyan-A/B/C, were treated as one sub-cluster designated “1B”
by Zehr et al. (2003a) and included only cyanobacterial NifH sequences. The sub-cluster
1-Cyan-B had a branch support value of 89 and included unicellular Cyanobacteria: Cyanothece,
Gloeothece and a very divergent Cyanobacterium UCYN-A NifH sequence. It is unclear why
these clustered separately from the Cyanothece and Gloeothece NifH sequences in 1-Cyan-A and
1-Cyan-C. The entire 2004 stromatolite clone library, 38 clones in total, clustered closely to a
NifH sequence of Xenococcus PCC 7305, in the cyanobacterial cluster 1-Cyan-B (accession ID
O08262, 108 amino acids in length). Three 1996 stromatolite clones clustered separately from the
2004 sequences, but in the same sub-cluster. Xenococcus sp. NifH fragments were identified in
marine sponges, coral reef lagoon seawater samples from Heron Island, Australia, in microbial
mats from Guerrero Negro (GN) salt ponds in Baja California, and additionally from core
samples of marine stromatolites from Highborne Cay, Bahamas (Steppe et al., 2001; Omoregie
et al., 2004c; Hewson et al., 2007; Mohamed et al., 2008b). In these studies nitrogenase activity
was highest during the dark period, but the activity was not attributed to a specific bacterial
group. The specific strain Xenococcus PCC 7305 is known to fix nitrogen anaerobically (Fay, 1992;
CRBIP, 2007), though its sequence clustered with the aerobic diazotrophic Cyanothece spp.
Two other sub-clusters included five 1996 stromatolite clones - 1-Prot-γ-B and 1-Prot-γ-C,
with branch support values >74. Two clones were closely affiliated with Marichromatium
purpuratum, also known as Chromatium purpuratum, which is a halophilic, anaerobic,
phototrophic purple sulphur γ-Proteobacterium with high G+C content (68.9%) and 25-35 °C
optimal growth temperature, usually found in anoxic marine sediments, marine sponges and
other marine invertebrates (Proctor, 1997; Imhoff et al., 1998). Accordingly, its NifH fragments
have been found in water surface samples from a river estuary, Hawaiian corals and in a tropical
intertidal lagoon (Affourtit et al., 2001; Bauer et al., 2008; Olson et al., 2009). The above clones
were originally matched in BLASTX as the halophilic γ-Proteobacterium Halorhodospira
halophila SL1 at 93% sequence similarity (table 4). H. halophila and M. purpuratum NifH
sequences share a high level of sequence similarity - 91%, confirmed by another published
analysis of H. halophila NifH sequence which positioned it within the same cluster as M.
purpuratum (Imhoff et al., 1998; Tsuihiji et al., 2006; Bertics et al., 2010). This ‘mismatch’
between BLASTX match and the phylogenetic affiliation was due to the different basic
assumptions employed in BLAST and BLASTX algorithms vs. the phylogenetic modelling.
BLAST and BLASTX are statistical methods designed to ‘fish out’ significant matches from
huge databases, without any evolutionary framework or assumptions (Altschul et al., 1997;
Ladunga, 2002b). Hence, because these two clones had almost the same sequence length as H.
halophila SL1, 120 vs. 121 residues, while M. purpuratum had only 109 residues and was short
of two known conserved motifs – “CDPKAD” at the beginning of the NifH partial sequence and
“GEMMAL/M” further along the sequence - BLASTX analysis chose H. halophila SL1 as the
best ‘correct’ match for these clones.
Phylogenetic models, on the other hand, incorporate evolutionary assumptions into their
algorithms such as time reversibility, amino acid substitution matrices, base frequencies,
proportion of invariable sites and more (Sullivan and Joyce, 2005). Therefore, the few
non-conserved residues between the clone sequences and those of H. halophila SL1 and M. purpuratum
resulted in these two clones clustering with M. purpuratum instead of H. halophila SL1,
regardless of the length issue.
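The two conserved motifs named above, “CDPKAD” and “GEMMAL/M”, can be located in a translated sequence with a simple pattern search, which makes it easy to see whether a partial sequence is truncated before or after them. The sketch below is illustrative only; the function name and both toy fragments are hypothetical and are not real NifH sequences from this study.

```python
import re

# The two conserved motifs mentioned in the text; "L/M" means L or M.
MOTIFS = {
    "CDPKAD": re.compile("CDPKAD"),
    "GEMMAL/M": re.compile("GEMMA[LM]"),
}

def motif_positions(protein):
    """Return the 0-based start positions of each motif in a protein sequence."""
    protein = protein.upper()
    return {
        name: [match.start() for match in pattern.finditer(protein)]
        for name, pattern in MOTIFS.items()
    }

# Hypothetical toy fragments, for illustration only.
full_fragment = "CDPKADSTRLGGEMMALYAANNI"
short_fragment = "STRLILHAKAQ"

print(motif_positions(full_fragment))
print(motif_positions(short_fragment))
```

A reference sequence that returns empty lists for both motifs, as the shorter M. purpuratum fragment would, aligns over fewer positions, which is consistent with BLASTX ranking the longer H. halophila SL1 sequence first.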
Figure 8 Cluster I phylogenetic tree, based on Maximum-likelihood analysis of NifH partial amino acid sequences. Sequences obtained in this study were given an alphanumeric prefix RSAYYYY and are marked bold; branch support values (approximate likelihood-ratio test, aLRT) are shown for key branches; only values > 50 were considered significant. Text boxes contain the cluster designations, with the closest sub-cluster designation as per Zehr et al. (2003a) in parentheses. ‘1’ = cluster I, Prot=Proteobacteria, Cyan=Cyanobacteria, Firm=Firmicutes. The scale bar represents the number of substitutions per 100 bases. The out-group was Desulfuromonadales (δ-Proteobacteria) NifH sequences from the Geobacter and Pelobacter genera.
Figure 8 sub-cluster labels: 1-Prot-αβ (1J, 1K); 1-Cyan-A (1B); 1-Cyan-B (1B); 1-Cyan-C (1B); 1-Firm-A (1D); 1-Prot-γ-A (1P); 1-Prot-γ-B (1M); 1-Prot-γ-C (1H, 1T, 1l, 1U).
Additionally, three 1996 clones clustered with Teredinibacter turnerae T7901 in a Vibrio
spp. cluster (92% amino acid sequence similarity, table 4). T. turnerae is an endosymbiotic
γ-proteobacterium isolated from molluscs (Bivalvia: Teredinidae) that can fix nitrogen under
microaerobic conditions at seawater salinity level (Fiore et al., 2010). This genus clusters with
Pseudomonas spp. based on its 16S rDNA sequence, yet its NifH amino acid sequence clustered
with Vibrio spp. rather than Pseudomonas (Distel et al., 2002). There are a few bivalve species
living in Hamelin Pool (and Shark Bay in general), hence it is reasonable to assume T. turnerae
integrated structurally within the columnar stromatolite. The abundant bivalves Fragum
hamelini Iredale, Fragum erugatum and the small bivalve Irus irus (Linné), which is found at
the sides of many subtidal stromatolites, have been reported in this area (Hoffman and Walter,
1976; Playford et al., 1976; Flint and Abeysinghe, 2000/07), yet this is the first report of a T.
turnerae NifH fragment in a stromatolite microbial mat.
As mentioned earlier, cluster III contained 18 stromatolite NifH clones, entirely from the 1996
stromatolite clone library, which clustered in two sub-clusters: 3-Prot-δ-A and 3-Prot-δ-B, each
with branch support values >89 (figure 9).
Nine clones clustered with Desulfovibrio gigas (P71156) and Desulfonatronospira
thiodismutans ASO3-1 (D6SLD2), both δ-Proteobacteria, sulphate reducers and strict anaerobes. D.
thiodismutans ASO3-1 is an obligately alkaliphilic (optimum pH 10) bacterium with moderate
salinity tolerance and a maximum growth temperature of 43 °C (Sorokin et al., 2008). It has not
been detected in Shark Bay or in other marine microbial mats to date, perhaps because the
sequence is relatively new in the databases (first entry 10 August 2010), and
additional confirmation may therefore follow. A single clone was closely affiliated with D. gigas, whose
nifH DNA fragments were found in plant rhizospheres, marine sediment samples and in a few
cyanobacterial mats (Zehr et al., 1995; Moisander et al., 2007). The same clone had 97%
BLASTX sequence similarity to D. magneticus RS-1 (Table 4), which the phylogenetic analysis
had assigned to a different sub-cluster, 3-Prot-δ-GS. Ten residues were not conserved between
D. magneticus and the 1996 stromatolite clone and were sufficient for the phylogenetic model to
place the clone sequence with D. gigas instead of D. magneticus. Desulfovibrio spp. seem to fix
nitrogen within marine sediment microcosms regardless of light conditions (Postgate et al.,
1988; Kent et al., 1989; Musat et al., 2006). Some studies suggested that the genus fixed
nitrogen mainly during dark periods in marine intertidal microbial mats (Zehr et al., 1995;
Steppe and Paerl, 2002).
Additionally, nine 1996 stromatolite clones clustered with an alkene-degrading, sulphate-
reducing bacterium - Desulfatibacillum alkenivorans strain AK-01 (B8FAC4) which was first
isolated from oil-polluted sediments of a sewage plant (Cravo-Laureau et al., 2004). Related
nifH sequences were reported in low abundance from a low temperature, acidic peat bog and in
a ghost shrimp benthic burrow within intertidal lagoon waters (Zadorina et al., 2009; Bertics et
al., 2010).
Briefly summarising, our phylogenetic analysis indicated that 100% of the 2004 and 22% of the
1996 stromatolite clone libraries were affiliated with cluster I. Almost 30% of the 1996
stromatolite clone library sequences were associated with an out-group to cluster I, which was
composed of Desulfuromonadales representatives from the δ-Proteobacteria, and an additional
48% were affiliated with cluster III. Neither clone library had representatives in cluster II or
cluster IV as designated by Zehr et al. (2003a). When the two libraries are combined into a unified
representation of potential diazotrophs in columnar stromatolites, cluster I clones would represent 61% and
cluster III clones would represent 39% of the diazotrophic community composition.
Additionally, a NifH sequence of Xenococcus PCC 7305, in the cyanobacterial sub-cluster
1-Cyan-B, was a common phylogenetic affiliation for both clone libraries. This may indicate, as
with the previous BLAST and BLASTX analyses, that this was the common diazotrophic species
in columnar stromatolites. The few inconsistencies between the BLAST or BLASTX results and
the phylogenetic assignments emphasize the importance of applying at least two different
methods on the same batch of nucleotide or amino acid sequences, in order to gain an unbiased
view of the possible outcomes from the original sequences.
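The percentages in this summary can be re-derived from the clone counts given earlier in this section: 38 clones in the 2004 library, all in cluster I; 46 cluster I clones in total, hence 8 from 1996; 11 clones from 1996 in the Desulfuromonadales out-group; and 18 clones from 1996 in cluster III. The sketch below is a plain arithmetic check. The 1996 library total of 37 clones is inferred from these counts rather than quoted directly, and the variable names are hypothetical.

```python
# Clone counts per phylogenetic group, taken from the text of this section.
clone_counts = {
    "2004": {"cluster I": 38, "out-group": 0, "cluster III": 0},
    "1996": {"cluster I": 8, "out-group": 11, "cluster III": 18},
}

def percentages(counts):
    """Convert a dict of counts into a dict of percentages of the total."""
    total = sum(counts.values())
    return {group: 100 * n / total for group, n in counts.items()}

per_library = {year: percentages(counts) for year, counts in clone_counts.items()}

# Pool both libraries into one combined community.
combined = {
    group: sum(library[group] for library in clone_counts.values())
    for group in clone_counts["1996"]
}
combined_pct = percentages(combined)

for year, pct in per_library.items():
    print(year, {group: round(value, 1) for group, value in pct.items()})
print("combined", {group: round(value, 1) for group, value in combined_pct.items()})
```

Under these counts, the combined 39% figure is reproduced when the out-group clones are counted together with cluster III (29 of 75 clones); cluster III on its own would be 24%. Whether that grouping was intended is an assumption made here.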
Figure 9 sub-cluster labels: 3-Firm-Arch (3C, 3D, 3A); 3-Prot-δ-GS (3L, 3T); 3-Prot-δ-B (3B, 3E, 3L); 3-Prot-δ-A (3P); 3-Spiro-A (3L).
Figure 9 Cluster III phylogenetic tree based on Maximum-likelihood analysis of partial NifH amino acid sequences. Sequences determined in this study were given a prefix RSA and are marked bold; branch support values (approximate likelihood-ratio test, aLRT) are shown for key branches; only values > 50 were considered significant. Spir=Spirochaetes, Arch=Archaea. The scale bar represents the number of substitutions per 100 bases.
3.3.6 Coverage, diversity and community structure
Before analyzing richness, diversity and structure, it was necessary to ascertain whether the
clone library coverage was sufficient to provide a reliable assessment of the above
factors. The program “Mothur” employs molecular distance matrices in order to calculate
various ecological parameters and coverage estimates, and has been successfully used in
microbial ecological studies (Schloss et al., 2009). The Mothur software version used in this
study did not provide a sub-program to calculate distances of amino acid sequences, so after
aligning sequences with “Muscle” and confirming alignment quality against known NifH
reference sequences, the Probability Matrix from Blocks (PMB, Veerassamy et al., 2003) as
implemented in “PHYLIP Protdist”, version 3.67 (Felsenstein, 2007), was used for that purpose.
PMB is derived from the popular BLOSUM matrices for amino acid substitutions and from the
Blocks database (Henikoff et al., 1999; Henikoff et al., 2000). This matrix takes into
consideration aligned ungapped conserved regions and adjusts amino acid substitution scores
based on evolutionary assumptions (e.g. evolutionary distances are additive in a linear fashion).
The resulting model is firmly grounded in empirical data, as it included the NifH/BchL/ChlL
family, and was suitable for use with NifH sequences which have several conserved blocks in
the sequence.
Statistical analyses of the clone libraries from 1996 and 2004 stromatolites are presented in
tables 5-7, as well as collector curves for coverage of all libraries (figure 10). These curves
represent the frequency data for each distance level (0.01-0.12, figure 10) plotted against the
number of unique sequences or species observed. In other words, the data is based on the
number of observed OTUs as a function of distance between sequences and the number of
sequences sampled. Therefore, when a curve reaches an asymptote, it means that no more unique
sequences were observed for a specific distance level, that full species coverage was attained, and
that there is no need to sample the clone library any further (Schloss et al., 2004).
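The idea behind a collector curve can be sketched in a few lines: sequences are drawn one at a time and the running number of distinct OTUs is recorded. This is a simplified illustration, not Mothur's implementation; it assumes each sequence has already been assigned an OTU label at a given distance level, and it uses a single random sampling order rather than an average over many. The function name and the toy labels are hypothetical.

```python
import random

def collectors_curve(otu_labels, seed=0):
    """Cumulative number of distinct OTUs seen as sequences are sampled."""
    order = list(otu_labels)
    random.Random(seed).shuffle(order)  # one random sampling order (assumption)
    seen = set()
    curve = []
    for label in order:
        seen.add(label)
        curve.append(len(seen))
    return curve

# Hypothetical OTU assignments for ten sequences.
example_labels = ["A"] * 5 + ["B"] * 3 + ["C", "D"]
curve = collectors_curve(example_labels)
print(curve)
```

A curve that stops rising well before the last sequence is sampled corresponds to the asymptote described above; a curve still climbing at the end indicates that further sampling would reveal additional OTUs.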
The estimated clone library coverage was 73% for 99% phylotype cutoff and up to 97%
coverage for 87% phylotype cutoff, regardless of the sampling year (table 5). Phylotype cutoff
of 99% meant sequences were at a maximum distance of 1% from one another. Therefore, the
coverage of potential diazotrophs was comprehensive and the clone libraries were representative
of the diazotrophic diversity in our samples. At 100% phylotype cutoff (unique sequences), 34
OTUs were identified and the Chao1 non-parametric richness estimate
was 121.75 (63.88-291.73, 95% CI), indicating that if the libraries were sampled to completion,
between 29 and 257 additional NifH species would be expected. The Shannon-Wiener index of diversity
(H’) was 2.72 (2.37-3.07, 95% CI).
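The three estimators used above can be sketched in a few lines of Python. This is an illustration only: the OTU abundance counts are hypothetical, the Chao1 function uses the bias-corrected form, which is an assumption about the variant applied, and the Shannon-Wiener index is computed with natural logarithms.

```python
import math


def goods_coverage(counts):
    """Good's coverage: 1 - (number of singleton OTUs / number of sequences)."""
    n = sum(counts)
    singletons = sum(1 for c in counts if c == 1)
    return 1.0 - singletons / n


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    s_obs = len(counts)
    f1 = sum(1 for c in counts if c == 1)  # OTUs seen exactly once
    f2 = sum(1 for c in counts if c == 2)  # OTUs seen exactly twice
    return s_obs + f1 * (f1 - 1) / (2.0 * (f2 + 1))


def shannon(counts):
    """Shannon-Wiener index H' = -sum(p_i * ln(p_i))."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts)


# Hypothetical OTU abundances (number of clones per OTU).
counts = [10, 5, 2, 2, 1, 1, 1]
print(goods_coverage(counts), chao1(counts), shannon(counts))
```

Many singletons relative to doubletons push the Chao1 estimate well above the observed OTU count, which is the pattern reported here for the 100% phylotype cutoff.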
Between the 100% and 87% phylotype cutoffs, 8.82% to 55.55% of the OTUs, respectively, were
shared between the clone libraries (table 5), indicating common OTUs of NifH sequences in the
2004 and 1996 clone libraries. At the 87% phylotype cutoff, Yue and Clayton’s non-parametric
estimator for similarity (θ) was 0.85, indicating a high proportional similarity between the clone
libraries. At the 100% phylotype cutoff, the similarity (θ) between the libraries was lower,
estimated at 0.3.
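For readers unfamiliar with the θ measure, the following sketch computes it from OTU counts in two libraries, as the sum of products of relative abundances divided by the sum of squared differences plus that same sum of products. The function name and the example counts are hypothetical and do not reproduce the values in table 5.

```python
def yue_clayton_theta(counts_a, counts_b):
    """Yue and Clayton's theta from two {otu: count} dictionaries."""
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    otus = set(counts_a) | set(counts_b)
    shared = 0.0  # sum of a_i * b_i
    diff_sq = 0.0  # sum of (a_i - b_i)^2
    for otu in otus:
        a = counts_a.get(otu, 0) / total_a
        b = counts_b.get(otu, 0) / total_b
        shared += a * b
        diff_sq += (a - b) ** 2
    return shared / (diff_sq + shared)


# Hypothetical libraries: mostly the same dominant OTU, one OTU unique to B.
lib_a = {"otu1": 8, "otu2": 2}
lib_b = {"otu1": 6, "otu2": 2, "otu3": 2}
theta = yue_clayton_theta(lib_a, lib_b)
print(round(theta, 3))
```

Identical communities give θ = 1 and communities with no shared OTUs give θ = 0, so values such as 0.85 indicate that the dominant OTUs occur in similar proportions in both libraries.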
Table 5 Shared coverage, observed richness, diversity & similarity estimators, based on NifH translated amino acid sequences from both clone libraries. Phylotype cutoff (%)
Abbreviations: CI, confidence interval; OTUs, operational taxonomic units. (a) The coverage index was calculated by the method of Good (1953). (b) The richness index was calculated by the method of Chao et al. (1993). (c) The diversity index was calculated by the method of Shannon–Wiener (Krebs, 1989). (d) Number of shared OTUs between libraries. (e) Yue and Clayton’s community overlap measure, based on the proportions of shared OTUs (Yue and Clayton, 2005).
Statistical analysis by Libshuff (Singleton et al., 2001) confirmed that the clone libraries were
not significantly different from one another (significance >0.025, table 6). The marginal
significance (0.01-0.05) given by the parsimony method (P-test) and the weighted UniFrac test
(Martin, 2002; Lozupone and Knight, 2005) indicated that the structural similarity between the
communities might not have occurred by chance, which can be interpreted to mean that the communities
were not significantly different from one another (table 6).
(a) Libshuff analysis calculated using the Cramer-von Mises test statistic with 10,000 randomisations by the method of Schloss et al. (2004). (b) Parsimony statistical test (P-test) with 100 permutations by the method of (Martin, 2002) corrected for multiple comparisons using the Bonferroni correction. (c) UniFrac statistical test (P-test) with 100 permutations by the method of (Lozupone and Knight, 2005), corrected for multiple comparisons using the Bonferroni correction. * Marginal significance 0.01-0.05 as calculated by UniFrac (Lozupone et al., 2006).
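The Bonferroni correction mentioned in the table notes can be illustrated with a short sketch. It assumes an overall significance level of 0.05 and two pairwise comparisons, which would be consistent with the 0.025 threshold quoted above, although that link is an assumption; the p-values are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the per-test threshold and the Bonferroni-adjusted p-values."""
    m = len(p_values)
    threshold = alpha / m  # each test must pass alpha / m
    adjusted = [min(1.0, p * m) for p in p_values]
    return threshold, adjusted


# Hypothetical p-values for two library comparisons (A vs B, B vs A).
threshold, adjusted = bonferroni([0.03, 0.6])
print(threshold, adjusted)
```

Comparing each raw p-value with `threshold` is equivalent to comparing each adjusted p-value with the overall level of 0.05.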
According to the collector’s curve analysis of OTUs and based on the furthest-neighbour
algorithm and a distance precision of 0.01 (Schloss and Handelsman, 2005), the number of
unique NifH amino acid sequences began to stabilize and reach an asymptote at the 99%
phylotype cutoff, in each clone library (table 7, figure 10). This meant full species coverage was
attained at that cutoff. The 99% phylotype cutoff meant that all sequences were at a maximum
distance of 1% from one another. At the 99% phylotype cutoff, the 2004 clone library sequences
were grouped into 6 OTUs only, and grouped into one OTU at 93% phylotype cutoff (0.07
distance), indicating the relatively low diversity of the 2004 clone library. The 1996 stromatolite
sequences grouped into 20 OTUs at the 99% phylotype cutoff, and even at the 88% cutoff (0.12
distance) were still not grouped into one collective OTU, as occurred with the 2004 stromatolite
sequences. This indicated a higher diversity and potential richness of the NifH sequences from
the 1996 clone library compared with the 2004 clone library.
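The grouping of sequences into OTUs at a given phylotype cutoff can be sketched as furthest-neighbour (complete linkage) clustering on a pairwise distance matrix. The implementation below is a naive illustration written for clarity, not the program used in this study, and the four-sequence distance matrix is hypothetical.

```python
def furthest_neighbour_otus(dist, cutoff):
    """Cluster sequences so that all members of an OTU are within `cutoff`."""
    clusters = [{i} for i in range(len(dist))]

    def linkage(a, b):
        # Furthest neighbour: distance between the two most distant members.
        return max(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        if d > cutoff:
            break  # no remaining pair can be merged at this cutoff
        clusters[x] |= clusters[y]
        del clusters[y]
    return clusters


# Hypothetical pairwise distances between four aligned sequences.
dist = [
    [0.00, 0.01, 0.05, 0.10],
    [0.01, 0.00, 0.06, 0.11],
    [0.05, 0.06, 0.00, 0.12],
    [0.10, 0.11, 0.12, 0.00],
]
for cutoff in (0.01, 0.07, 0.12):
    print(cutoff, len(furthest_neighbour_otus(dist, cutoff)))
```

As the cutoff is relaxed the number of OTUs falls, which is the behaviour described above for both libraries: the 2004 sequences collapsed into a single OTU at a distance of 0.07, whereas the 1996 sequences did not do so even at 0.12.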
Figure 10 Collector’s curves for taxa (defined here as OTUs), with phylotype cut-offs of 99% (0.01) - 88% (0.12), based on NifH translated amino acid sequences. (A) 1996 stromatolite clone library (B) 2004 stromatolite clone library.
Table 7 Coverage, observed phylotype richness and diversity indices for each clone library, based on NifH translated amino acid sequences.
Phylotype cutoff (%)   Coverage (%) (a)   OTUs   Chao1 richness (95% CI) (b)   H’ (95% CI) (c)
98                     97.37              5      5.00 (5-5.00)                 0.85 (0.50-1.19)
96                     97.37              3      3.00 (3-3.00)                 0.55 (0.30-0.81)
95                     100.00             2      2.00 (2-2.00)                 0.44 (0.24-0.63)
Abbreviations: CI, confidence interval; OTUs, operational taxonomic units. (a) The coverage index was calculated by the method of Good (1953). (b) The richness index was calculated by the method of Chao et al. (1993). (c) The diversity index was calculated by the method of Shannon–Wiener (Krebs, 1989).
The 1996 columnar stromatolite diazotrophic community grouped into 18 OTUs at the 98%
phylotype cutoff, and the Chao1 non-parametric richness estimate
was 40 (23.58-104.73, 95% CI, table 7), indicating that if the library were sampled to completion,
between 5 and 86 more NifH species would be obtained. The 2004 clone library included 5
OTUs at the 98% phylotype cutoff, and the Chao1 richness estimate
was also 5 (95% CI of 5-5), which indicated that at this specific cutoff, all NifH species
were sampled to completion. The Shannon-Wiener index of diversity (H’) was 2.55 and 0.85 for
the 1996 and 2004 clone libraries, respectively. With an estimated coverage of >67% for both
libraries at the 98% phylotype cutoff, it was clear that the 2004 clone library was far less diverse
and less rich in NifH species than the 1996 clone library.
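The ranges of additional species quoted in this section follow from subtracting the observed OTU count from the bounds of the Chao1 confidence interval. The helper below is a hypothetical illustration of that arithmetic; rounding down to whole species is an assumption that reproduces the figures given in the text.

```python
import math


def additional_species_range(observed_otus, ci_low, ci_high):
    """Whole species still expected, from the Chao1 95% CI bounds."""
    low = max(0, math.floor(ci_low - observed_otus))
    high = max(0, math.floor(ci_high - observed_otus))
    return low, high


# Values reported in the text for the 98% phylotype cutoff.
print(additional_species_range(18, 23.58, 104.73))  # 1996 library
print(additional_species_range(5, 5.0, 5.0))        # 2004 library
# Values reported for both libraries combined at the 100% cutoff.
print(additional_species_range(34, 63.88, 291.73))
```

A range of zero, as for the 2004 library, corresponds to a library that has been sampled to completion at that cutoff.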
The differences in diversity and richness estimators between the libraries
might originate from different environmental conditions at the time of sampling. Mean annual rainfall
in Hamelin Pool from 1990 to 2010 was 199.7 mm y-1, and the mean for the month of May alone
was 29.8 mm (Bureau of Meteorology, 2011). In 2004 and 1996 there was no substantial
deviation from this mean in May, yet 1996 had much higher rainfall occurring throughout the
year. Hamelin Pool experienced far more rainfall in February, June, July, August and October
1996, culminating in a total of 299.2 mm rainfall (50% increase).
This would have changed the local water budget, usually dominated by evaporation of
freshwater and influx of saline oceanic waters (Smith and Atkinson, 1983). Increase of fresh
water would probably have lowered salinity levels, washed additional nutrients into the bay and
changed Hamelin Pool’s water chemistry. These conditions would further influence microbial
community composition, allowing proliferation of new phylogenetic groups to participate in
new biochemical processes and niches, as has been evident in other hypersaline microbial mats
under similar conditions (Yannarell et al., 2006). As expected from arid and dry conditions,
2004 library NifH sequences were affiliated with Cyanobacteria, a resilient group of
microorganisms which flourish (sometimes exclusively) under various stressful conditions
(Paerl et al., 2000; Pandey et al., 2004; Yannarell et al., 2007). The 1996 library included far
more non-cyanobacterial nitrogenase sequences, which was also the case for a sample, taken
during the wet season, of a hypersaline microbial mat from Salt Pond, San Salvador Island,
Bahamas (Yannarell et al., 2006). Currently we do not have additional environmental data to
further support the above suggestion or offer an alternative explanation.
3.3.7 Nitrogen fixation potential in Shark Bay
Past studies of the stromatolite bacterial communities in Hamelin Pool, Shark Bay, have
suggested the presence of several possible diazotrophs based on 16S rDNA molecular analyses
and culturing efforts (table 1). Bacterial matches between those studies and this study included
uncultured clones of the sulphate reducer Desulfatibacillum alkenivorans and clones with less
than 90% sequence similarity to Desulfovibrio africanus and P. carbinolicus DSM 2380,
sampled from smooth and pustular mats in the same locality (Allen, 2006; Allen et al., 2009).
In addition, cyanobacterial matches to this study included Xenococcus, Oscillatoria and
Cyanothece isolates at 92% - 93% sequence similarity, below the acceptable threshold of 95%
sequence similarity for a positive genus identification (Everett et al., 1999; Clarridge, 2004).
Xenococcus spp. were isolated from pustular and smooth mats and Cyanothece and Oscillatoria
spp. were isolated from columnar stromatolites in the past (Burns et al., 2004; Goh et al., 2008).
Since a few stromatolite NifH clones were affiliated in the phylogenetic analysis with
Marichromatium purpuratum, it is worth mentioning that an obligate halophilic
diazotrophic strain of Chromatium vinosum (also known as Allochromatium vinosum, from the
same family - Chromatiaceae), was isolated from surface deposits of columnar stromatolites, in
the intertidal zone of Hamelin Pool (Bauld et al., 1986).
While taking into consideration all the possible diazotrophic genera in Hamelin Pool
stromatolites (table 1, underlined names), this study has confirmed δ-Proteobacteria and
Cyanobacteria representatives were present in columnar stromatolites. More specifically,
Desulfatibacillum and Chroococcales, Oscillatoriales and Pleurocapsales members -
Cyanothece, Xenococcus and Oscillatoria were identified. Additional potential diazotrophs
were novel discoveries that had not been identified before: Cyanobacterium UCYN-A and the
γ-Proteobacteria members Teredinibacter and Halorhodospira in cluster I, the δ-Proteobacteria
representatives Desulfovibrio and Desulfonatronospira in cluster III, and Pelobacter in the
out-group to cluster I.
Because the nifH gene is present in a relatively limited number of Eubacteria genomes, in
comparison to 16S rDNA, we would definitely not expect diversity based on nifH to exceed
diversity estimates based on 16S rDNA analysis. Diversity estimates would be lower also
because they were based on amino acid sequences, not nucleotides. A 4% distance within a
group of amino acid sequences might underestimate a more diverse population of nucleotide
sequences. However, for OTU based analysis purposes, we can go forward, bearing this
assumption in mind while discussing molecular diversity based on NifH translated sequences
vs. 16S rDNA.
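The reason amino acid based estimates can understate nucleotide diversity is that synonymous substitutions change the DNA sequence without changing the encoded protein. The sketch below shows this with two short hypothetical sequences (not NifH sequences from this study) and a partial codon table that covers only the codons used.

```python
# Partial standard codon table: only the codons used in this example.
CODONS = {
    "GGT": "G", "GGC": "G",
    "AAA": "K", "AAG": "K",
    "TCT": "S", "TCC": "S",
    "GAA": "E", "GAG": "E",
}


def translate(nt):
    """Translate an in-frame nucleotide string using the partial table."""
    return "".join(CODONS[nt[i:i + 3]] for i in range(0, len(nt), 3))


def p_distance(a, b):
    """Proportion of aligned positions that differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


seq1 = "GGTAAATCTGAA"
seq2 = "GGCAAGTCCGAA"  # three third-position (synonymous) changes

print(p_distance(seq1, seq2))                        # nucleotide distance
print(p_distance(translate(seq1), translate(seq2)))  # amino acid distance
```

Here the nucleotide sequences differ at a quarter of their positions while the translated sequences are identical, so the two would fall into one OTU at any amino acid cutoff.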
Previous molecular analyses of 16S rDNA, from smooth and pustular stromatolite mats,
generated bacterial clone libraries that were fairly similar to one another in terms of their
richness and diversity (Allen et al., 2009), yet columnar intertidal stromatolite clone libraries
had lower estimates of diversity and OTU richness (Allen et al., 2009). At the 98% phylotype
cutoff, bacterial smooth mat sequences were grouped into 111 OTUs, with a Chao1 richness
estimator of 6216, and 4.71 for Shannon-Wiener index of diversity (H’). At the same cutoff
(98%), pustular mat sequences were grouped into 110 OTUs, Chao1 = 3053, H’ = 4.7.
Columnar intertidal stromatolite sequences, on the other hand, grouped into 34 OTUs, Chao1 =
45.2 and H’ = 2.89 at the same cutoff level, which indicated a substantial drop in richness and
species diversity. Additional 16S rDNA-based studies confirmed relatively low richness and
diversity estimators for the bacteria within columnar stromatolites from Shark Bay (Papineau et
al., 2005; Goh et al., 2008). This consistent finding can be attributed to the fact that columnar
stromatolites contain lower biomass in general and higher net carbon precipitation, and
therefore undergo lithification, producing less space and volume in which microorganisms can
live (Dupraz and Visscher, 2005).
The shared estimators for richness and diversity from both NifH clone libraries were slightly
lower compared to the above mentioned 16S rDNA-based analysis of bacterial communities
(Burns et al., 2004; Allen et al., 2009). At a 98% phylotype threshold, NifH sequences were
grouped into 23 OTUs, with a Chao1 richness estimator of 38.6 and H’ = 2.38 (table 5).
Because our analysis was based on amino acid sequences, it underestimated, to a certain degree,
the true diazotrophic diversity within stromatolites. However, our NifH estimates were on the
same scale as 16S rDNA-based richness and diversity estimation, in columnar stromatolites and
it is possible that the nifH DNA fragments were as diverse and abundant as the 16S rDNA
fragments. A cautious conclusion, based solely on diversity and richness calculations, would be
that the bacterial community in columnar stromatolites specifically is composed mostly of
diazotrophic species, and may exhibit spatial and temporal differentiation with regard to nitrogen
fixation.
Uncultured nifH clones from Guerrero Negro (GN) salt ponds were a common finding in our
BLAST analysis (see tables 3 & 4). Microbial mats from the Guerrero Negro in Baja California,
Mexico, provide a well-studied system which is similar, in certain characteristics, to the Shark
Bay system. Furthermore, in order to provide a likely depiction of the active and potential
nitrogen fixers in columnar stromatolites, we reviewed findings from our study, the GN studies
and former 16S rDNA-based analyses of the Hamelin Pool stromatolites (table 8).
The GN study site is set in a hyperarid climate (sporadic rainfall of 35 mm yr-1), with mean
monthly maximum high temperature of 29°C (Summers et al.) and high evaporation rates (1500
mm yr-1) (Jørgensen and Des Marais, 1990). A gentle tide of 0.5 – 1 m floods onto narrow,
shallow trenches and creates a natural large marsh land with shallow pools and hypersaline
evaporitic ponds (80‰ - 108‰ salinity), in which cyanobacterial mats prosper (Fryberger et al.,
1990; Jørgensen and Des Marais, 1990). While the environmental characteristics are similar in
general to those of Shark Bay’s Hamelin Pool, the mat morphologies differ, as columnar
stromatolites (also known as ‘stromatolite heads’) are not present in the Guerrero Negro study
site (Javor and Castenholz, 1981; Hoehler et al., 2001).
Generally, bacterial communities were found to be similar between Hamelin Pool (HP) and
Guerrero Negro (GN) in terms of taxonomy based on 16S rDNA analysis, but they were not
identical. Some of the most abundant bacterial divisions in HP mats were also abundant in GN
mats – mainly α-Proteobacteria, Bacteroidetes, Planctomycetes and -Proteobacteria (Ley et al.,
2006; Goh et al., 2008; Allen et al., 2009).
Table 8 Potential diazotrophs in Hamelin Pool (HP) and Guerrero Negro (GN), based on 16S rDNA or nifH genes molecular analysis.
Common potential diazotrophs in GN and HP
  Based on 16S rDNA (a,b): Chroococcidiopsis, Cyanothece*, Gloeocapsa*, Halothece, Leptolyngbya, Lyngbya, Microcoleus, Oscillatoria, Phormidium*, Synechocystis
  Based on nifH gene (c,d): Cyanothece*, Desulfovibrio*, Myxosarcina*
Unique potential diazotrophs in GN or HP
  Based on 16S rDNA, HP (a): Bacillus, Desulfatibacillum, Gloeothece*, Halomonas, Methanosarcina*, Myxosarcina*, Pleurocapsa*, Pseudoalteromonas, Pseudomonas, Rhodobacter, Rhodopseudomonas, Rhodospirillum*, Stanieria*, Symploca, Synechococcus*, Vibrio, Xenococcus*
  Based on 16S rDNA, GN (b): Chlorobium, Desulfobacter, Desulfobacterium, Desulfococcus, Desulfovibrio, Pseudanabaena
  Based on nifH gene, HP (c): Cyanobacterium UCYN-A*, Desulfatibacillum*, Desulfonatronospira, Halorhodospira, Marichromatium, Oscillatoria*, Pelobacter, Teredinibacter, Xenococcus
  Based on nifH gene, GN (d): Anabaena, Azotobacter*, Burkholderia*, Clostridium*, Dermocarpa, Desulfonema*, Halothece, Klebsiella*, Plectonema, Synechocystis*
* 16S rDNA and nifH genes sequence similarity was less than 95% to a designated genus. (a) Data collected from the following references: Burns et al., 2004; Papineau et al., 2005; Goh et al., 2006; Allen et al., 2008; Allen et al., 2009. (b) Data collected from the following references: Risatti et al., 1994; López-Cortés et al., 2001; Ley et al., 2006. (c) Data from this study. (d) Data collected from the following references: Omoregie et al., 2004a; Omoregie et al., 2004c. It does not include results from greenhouse experiments.
The majority of potential diazotrophs in GN mats were affiliated with cluster III representatives
(δ-Proteobacteria and Firmicutes), yet also included cluster I representatives such as
Cyanobacteria and β- and γ-Proteobacteria (table 8 and references within). There were no representatives
of Pelobacter spp. or associations with cluster II or cluster IV. Common potential diazotrophs in
HP and GN, based on 16S rDNA, included 10 cyanobacterial representatives from cluster I
(Chroococcales, Oscillatoriales and Pleurocapsales groups), while unique GN potential
diazotrophs included six genera, mainly from δ-Proteobacteria, cluster III.
There were fewer common diazotrophs in HP and GN, based on nifH gene studies. These
included Cyanothece, Myxosarcina (cluster I) and Desulfovibrio genera (cluster III). The GN
site had 10 unique potential diazotrophs, and our study has identified 9 unique potential
diazotrophs in HP, all of which were affiliated with cluster I or III.
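The counts of common and unique genera quoted above can be reproduced from the nifH columns of table 8 with simple set operations. The sketch below uses the genus names as listed in the table; the variable names are illustrative only.

```python
# nifH-based potential diazotroph genera, as listed in table 8.
hp_nifh = {
    "Cyanothece", "Desulfovibrio", "Myxosarcina",
    "Cyanobacterium UCYN-A", "Desulfatibacillum", "Desulfonatronospira",
    "Halorhodospira", "Marichromatium", "Oscillatoria", "Pelobacter",
    "Teredinibacter", "Xenococcus",
}
gn_nifh = {
    "Cyanothece", "Desulfovibrio", "Myxosarcina",
    "Anabaena", "Azotobacter", "Burkholderia", "Clostridium", "Dermocarpa",
    "Desulfonema", "Halothece", "Klebsiella", "Plectonema", "Synechocystis",
}

common = hp_nifh & gn_nifh     # genera detected at both sites
hp_unique = hp_nifh - gn_nifh  # genera detected only in Hamelin Pool
gn_unique = gn_nifh - hp_nifh  # genera detected only in Guerrero Negro

print(sorted(common))
print(len(hp_unique), len(gn_unique))
```

The intersection contains the three shared genera, and the two differences contain the 9 HP-only and 10 GN-only genera reported in the text.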
Following reverse transcriptase PCR analysis in the GN mats, it was concluded that actual
nitrogen fixers during night time were Halothece sp. strain MPI96P605, Myxosarcina strain
ATCC 29409 and NifH2 of Anabaena variabilis ATCC 29413 from cluster I. Only one
genus from cluster III was identified as an active nitrogen fixer - Desulfovibrio (Omoregie et al.,
2004a). Halothece, Synechocystis, and Phormidium were detected in HP based on past 16S
rDNA analysis, and Myxosarcina and Desulfovibrio were detected in HP columnar
stromatolites based on this study using nifH gene analysis. This would point to a potentially
similar pattern of nitrogen fixation.
In regards to community diversity and richness, the GN system, based on 16S rDNA, was
estimated to harbour almost twice the number of bacterial species - 10,000 vs. 6216 in HP
smooth or pustular mats (Ley et al., 2006; Goh et al., 2008; Allen et al., 2009). However,
diazotroph-related estimators of richness and diversity, based on nifH gene, were not available
for the GN mats and we therefore cannot compare this specific aspect. Though nitrogenase
activity was not measured in columnar stromatolites, in GN mats nitrogenase activity was
restricted mostly to the upper 5 mm and peaked during night time (9-37 µmol C2H4 m-2 h-1,
0:00-6:00), with almost no activity during the day time (Omoregie et al., 2004b).
In summary, based on the available data, Hamelin Pool columnar stromatolites and GN mats
harbour similar diazotrophic species. These include δ-Proteobacteria and Cyanobacteria
representatives from cluster I and cluster III of the nifH phylogenetic tree. It is plausible that the
nitrogenase activity in columnar stromatolites in HP would peak during night time in the upper
layers of the mat, and that the actual nitrogen fixers would be Desulfovibrio, Myxosarcina,
Xenococcus spp. and perhaps also Halothece, Synechocystis, and Phormidium, as they were
previously identified in Hamelin Pool (table 8) and were active nitrogen fixers in GN mats.
It remains to be seen if future samples from columnar stromatolites, under different
environmental conditions, would reveal additional diazotrophs and their activity pattern.
3.4 Concluding remarks
Columnar stromatolites are one of five well known morphologies of modern stromatolites in
Shark Bay, usually found in shallow hypersaline waters. In order to assess this complex
microbial mat community, this study used DNA-based, culture independent, molecular
techniques and provided a novel view of the microbial diazotrophic communities within
columnar stromatolites.
Sequence analysis has provided statistically significant taxonomical identification and an
evolutionary representation of the nifH genes in this community. Our analysis indicated
columnar stromatolites, sampled from different years, included a common persisting
cyanobacterial diazotroph of the genus Cyanothece or Xenococcus (tables 3 & 4, figure 8).
The diazotrophic community structure did not vary significantly between the temporal samples
according to our statistical tests (table 6). Diversity and richness did vary between the samples,
probably due to environmental shifts which affected seawater salinity levels and allowed for
diverse microbial groups to proliferate in 1996 (table 7). Both samples contained novel nifH
gene nucleotide sequences with low similarity scores to uncultured nifH clones from saline to
hypersaline environments, and translated NifH sequences with high similarity to unicellular,
non-heterocystous Cyanobacteria and γ- and δ-Proteobacteria NifH sequences.
NifH clone sequences were mainly affiliated with cluster I and to a lesser extent with cluster
III, suggesting that aerobic and anaerobic bacteria with conventional Mo nitrogenase might be
involved in the nitrogen fixation process. Not a single clone was affiliated with cluster II or
cluster IV, while several clones were affiliated with a δ-Proteobacteria out-group to cluster I,
represented by P. carbinolicus DSM 2380. Taking into consideration past studies of this
community and of similar microbial mats in hypersaline environments, such as those present in the
Guerrero Negro (GN) salt ponds, we suggest that columnar intertidal stromatolites are less diverse
and less rich in microbial species relative to other mat morphologies, and that most of these species
retain nitrogen fixation capabilities. Additionally, it would seem that marine-based diazotrophic
bacteria are capable of enduring hypersaline conditions, and it remains to be seen what their
adaptive mechanisms are.
In conclusion, Shark Bay, a UNESCO World Heritage site, continuously provides researchers
with fascinating endemic microbiological subjects that bridge our current era with Archaean
fossil records of early organic life on planet Earth. This furthers our understanding of how life
began, evolved and survived dynamic environmental conditions, on a geological scale.
Chapter 4 The bacterial diazotrophic community in a radon hot
spring, South Australia.
4.1 Introduction
Paralana Hot Springs (PHS) are situated in Mt. Painter, near the town Arkaroola, on the north
eastern side of the Flinders Ranges, South Australia (30°10’35”S, 139°26’26”E, figure 1). The
climate is arid, with an average annual rainfall of 20.3 mm, with an extreme of 1270 mm in
1974 (Sprigg, 1984). The maximum temperatures at Arkaroola can exceed 30°C during the
summer months and minimum temperatures can fall below 10°C during May to September
(Bureau of Meteorology, 2011). There are several water sources in the Paralana fault area; PHS
is the only radioactive spring in the area, and includes two connected oval shaped pools and a
draining creek. Pool 1 is the hot source pool and, depending on the time of year and flooding
events, tends to vary in size, depth and temperature (2 - 9 m2, 30 cm – 80 cm deep,
48°C – 63°C; Mawson, 1927; Grant, 1938; Long et al., 2001; Anitori et al., 2002). Pool 2 is the
larger of the two, deeper and cooler (50-80 m2, 1 to 4.5 m deep, 40.2°C - 48°C, respectively)
with neutral pH (7). Both pools include microbial components, which manifest as floating
microbial mats, of emerald-green colour, as well as dark benthic mats, mainly in pool 2 (Anitori
et al., 2002).
The PHS system, though unique in its characteristics, is not an isolated or a closed ecosystem. It
is subject to external inputs from its surrounding fauna and flora due to floods and long-standing
human interest in the springs for cultural and medicinal values (Sprigg, 1984).
Underground water circulates through the hot, radioactive rocks underlying the Mt. Painter
Domain, and then flows near a localized radiogenic source, relatively close to the surface
(Brugger et al., 2005). Hence, water is discharged at the hot source pool, at relatively high
temperatures (56°C - 63°C), and very high radon levels (29,000 Bq/L, in the gas bubbles), with
traces of radiogenic helium.
Figure 1: Arkaroola and Paralana Hot Springs locations in South Australia. Main satellite image by Google Earth, inset map source: Australian Bureau of Meteorology.
In general, most of the PHS studies have focused on their geology, mineralization processes,
hydrothermal activity and geochemistry characteristics, while rarely analyzing the springs’
4.2.3 Clone library and Restriction Fragment Length Polymorphism (RFLP)
Freshly amplified PCR products of the nifH gene, containing an
A-overhang at the 3’ end, were ligated into the pCR2.1 vector and transformed using the TOPO TA Cloning kit
(Invitrogen Corporation, Carlsbad, CA) according to the manufacturer’s instructions. From each
clone library, at least 50 positive (white) clones, with the correct insert size - 350 bp, were
selected and their inserts amplified using the vector specific primers MpF and MpR. PCR
products of the correct size from positive clones were cleaned and visualized as described
previously, in section 4.2.2.
Each clone was subjected to Restriction Fragment Length Polymorphism (RFLP) analysis and
was screened twice, using restriction enzymes ScrFI and MspI (New England Biolabs, Ipswich,
MA) separately. Each digest reaction contained 1.5 μL of PCR product, 2 μL of the appropriate
enzyme buffer, 1 U of restriction enzyme and sterile MilliQ water to a total volume of 20 μL,
and was incubated at 37°C overnight. The RFLP patterns were analysed manually after
electrophoresis on 2% and 3% agarose gels (molecular biology grade, Progen Pharmaceuticals,
QLD, Australia) with 1x TAE-buffer, stained by ethidium bromide (1 μg ml-1) for 10-15 min
and visualized as described previously in section 4.2.2.
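The RFLP screen can be mimicked in silico by cutting a sequence at each enzyme's recognition site and comparing fragment lengths. The sketch below is illustrative only: the recognition sites (ScrFI, CC^NGG; MspI, C^CGG) are stated from general knowledge rather than from this thesis, only the top-strand cut position is modelled, and the test sequence is hypothetical rather than a nifH insert.

```python
import re

# Recognition pattern and top-strand cut offset within the site (assumed).
ENZYMES = {
    "ScrFI": ("CC[ACGT]GG", 2),  # CC^NGG
    "MspI": ("CCGG", 1),         # C^CGG
}


def digest(seq, enzyme):
    """Return fragment lengths, in sequence order, after a complete digest."""
    pattern, cut_offset = ENZYMES[enzyme]
    # A lookahead is used so that overlapping sites are all detected.
    cuts = [m.start() + cut_offset
            for m in re.finditer("(?=" + pattern + ")", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical 23 bp sequence with one MspI site and one ScrFI site.
seq = "AAAACCGGTTTTTTCCAGGAAAA"
print(digest(seq, "MspI"))
print(digest(seq, "ScrFI"))
```

Clones that yield the same fragment lengths with both enzymes would fall into the same RFLP group, which is the basis on which representatives were chosen for sequencing.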
4.2.4 DNA sequencing
Sequencing of selected clones was carried out using the PRISM Big Dye cycle sequencing
system with MPF or MPR primers (3.2 M) and 3-60 ng of cleaned PCR product. After
sequencing reactions had been performed, the reaction was cleaned up and analysed as
described previously in chapter 3, section 3.2.4.
4.2.5 Phylogenetic analysis
Phylogenetic analysis was carried out as described previously in chapter 3, section 3.2.5.
4.2.6 Diversity, richness and coverage analysis
NifH translated nucleotide sequences of 137 bp average lengths were aligned using the
computer package “Muscle” version 3.8.31 and the clone library sampling coverage, diversity
and richness were calculated as described previously in chapter 3, section 3.2.6.
4.2.7 Accession numbers
Sequences of the nifH clones are available under GenBank accession numbers KC295666-
KC295692.
4.3 Results and discussion
4.3.1 BLAST & BLASTX comparative analysis
nifH genes were present in, and were amplified from, the Paralana hot source pool samples (pool 1,
figure 4).
Figure 4 Products obtained after the second step of the PCR amplification of nifH from PHS hot source pool DNA extractions using nested primers. Lane 1: PHS hot source pool; 2: negative control sterile MilliQ H2O; 3: positive control Nostoc PCC 7120; M: 0.5 μg μL-1 GeneRuler™ DNA Ladder Mix (Fermentas, Ontario).
Seventy-six clones containing the correctly sized insert (350 bp) were obtained and analysed.
RFLP analysis was performed on 64 random positive clones with the nifH nucleotide insert,
which grouped them into 7 groups (figure 5). Initially, three representatives of each RFLP
pattern were selected for sequencing, and due to high sequence variation and diversity,
additional positive clones were sequenced directly without further restriction enzyme treatment,
until clone library coverage was deemed sufficient for diversity and richness analysis.
Figure 5. 3% agarose gel showing RFLP patterns using ScrFI restriction enzyme on 9 positive clones from PHS library. M: 0.5μg/μL GeneRuler™ DNA ladder Mix (Fermentas, Ontario).
BLAST and BLASTX results passed significance thresholds for all clones, with BLAST expect
(E) values ranging from e-33 to e-180 and BLASTX E values ranging from e-47 to e-64.
Two representative PHS nifH clones were related to Geobacter lovleyi strain SZ, with high
sequence similarity in the BLAST results (accession CP001089, 99% sequence
similarity, table 2). At 98% sequence similarity there were few clones related to Mastigocladus
cluster 1-Cyan-B). One clone clustered closely to the Nostocales and Mastigocladus laminosus
(Q47917 accession ID), in sub cluster 1-Cyan-C. Three clones were closely related to the
Burkholderia spp. (β-Proteobacteria, 1-Prot-αβ sub cluster). A single clone, RSA193-HSP09,
nestled individually between the 1-Prot-αβ sub cluster and Paenibacillus azotofixans (Firmicutes,
Q9AKT8). The BLASTX analysis indicated this clone was related to Azospirillum sp. B510
(α-Proteobacteria) at 92% sequence similarity (table 2). In the sub cluster 1-Prot-β -A, two clones
were affiliated with Azoarcus communis, and a single clone with Dechloromonas aromatica
strain RCB (β-Proteobacteria, Q79AX4 and Q47G67, respectively). In the sub cluster 1-Prot- -C,
two clones were closely related to the Azotobacter spp.
Figure 10: Cluster I and Cluster III positions within the three main clusters of the nifH phylogenetic tree. Topology was based on Maximum-likelihood analysis of nifH amino acid sequences. Cluster I was outgrouped by Nitrospirae, and Cluster III was outgrouped by Roseiflexus spp. (Chloroflexi).
During the BLASTX analysis, several PHS clones had 85 - 100% sequence similarity to
unverified NifH sequences (table 2). These included for instance, Thermodesulfovibrio
yellowstonii DSM 11347 and NifH sequences from δ-Proteobacteria – Pelobacter and
Geobacter spp. Though these sequences were not manually annotated or verified in the Swiss-
Prot database, they were nevertheless integrated into the phylogenetic analysis to provide an
unbiased view (figure 11). Four NifH clones clustered with T. yellowstonii, a thermophilic
sulphate-reducing organism isolated from a thermal vent in Yellowstone Lake in Wyoming,
USA (Henry et al., 1994). An additional five NifH clones clustered with the
Desulfuromonadales order as these clones were matched to Geobacter lovleyi strain SZ NifH
sequences, at 99-100 % sequence similarity (CP001089, YP_001951460, YP_001950896, table
2). T. yellowstonii, Geobacter spp. NifH sequences and affiliated clones, clustered separately
from one another, forming distinct groups outside of cluster I (figure 11).
NifH cluster III, which included the anaerobic diazotrophs (figure 12), contained five sub
clusters with support values >72. A total of 16 NifH PHS clones were affiliated with
δ-Proteobacteria, Spirochaetes, Firmicutes and Bacteroidetes. Six clones formed a tight group
within sub cluster 3-Prot- -B (δ-Proteobacteria), remotely related to NifH sequences from
Desulfobulbus propionicus DSM 2033 and Desulfatibacillum alkenivorans AK-01 (88%
BLASTX sequence similarity, table 2). An additional three clones in this sub cluster were
Figure 11: Phylogenetic distribution of cluster I based on Maximum-likelihood analysis of partial NifH amino acid sequences. Sequences determined in this study were given an alphanumeric prefix RSAX-HSP09 and are marked bold; the number of clones for each sequence is in parentheses. Branch support values are shown for key branches; only values > 50 were considered significant. Text boxes contain the designation of clusters, with the closest sub cluster nomination as per Zehr et al. (2003a) in parentheses. Prot=Proteobacteria, Cyan=Cyanobacteria, Firm=Firmicutes. The scale bar represents the number of substitutions per 100 bases. The outgroup was Desulfuromonadales (δ-Proteobacteria) nifH sequences from the Geobacter and Pelobacter genera.
closely related to Desulfovibrio gigas (P71156), while in sub cluster 3-Spir-A, five clones
were affiliated with Treponema and Spirochaeta spp. (Spirochaetes).
A single clone (RSA205-HSP09) nestled individually between two sub clusters, 3-Firm-Arch
and 3-Prot- -B; the BLASTX analysis suggested it was remotely related to
Paludibacter propionicigenes WB4, Bacteroidetes, with only 79% NifH sequence similarity. An
additional singular clone was affiliated with Thermincola sp. JR (YP_003639458) and the
family Peptococcaceae (Firmicutes) in the sub cluster 3-Firm-Arch.
Overall, Cyanobacteria and -Proteobacteria contributed the main NifH sequences to the clone
library (figure 13). The Spirochaetes affiliated clones were detected only during the
phylogenetic analysis, while in the BLASTX analysis those sequences were matched to δ-
Proteobacteria and Bacteroidetes representatives, at 88% sequence similarity. Other shifts
occurred within the assignments to Firmicutes and α- and -Proteobacteria, as would be
expected, since the phylogenetic analysis employs different assumptions and algorithms in its
calculation, in comparison to the BLASTX analysis (see chapter 3, section 3.3.5, for further
discussion regarding this point).
[Figure 11 tree not reproduced. Cluster labels in the figure: 1-Cyan-B & C (1B); 1-Firm-A (1D); 1-Prot-β-A (1P); 1-Prot- -C (1H, 1T, 1L, 1U, 1M); 1-Prot-αβ (1J, 1K); 1-Cyan-A (1B).]
[Figure 12 tree not reproduced. Cluster labels in the figure: 3-Firm-Arch (3C, 3D, 3A); 3-Prot- -GS (3L, 3T); 3-Prot-δ-B (3B, 3E, 3L); 3-Prot- -A (3P, ...); 3-Spir-A (3P, 3L).]
Figure 12: Phylogenetic distribution of PHS hot source clones in cluster III based on Maximum-likelihood analysis of NifH partial amino acid sequences. Sequences from this study (alphanumeric prefix RSAX-HSP09) are marked bold. Branch support values are shown for key branches; only values > 50 were considered significant. Spir=Spirochaetes, Arch=Archaea. The scale bar represents the number of substitutions per 100 bases.
4.3.3 Coverage, diversity and community richness
The “Mothur” program was employed to calculate the ecological parameters and coverage
estimators, as detailed in the previous chapter. As evident from the collector's curve
(figure 14), the number of unique NifH amino acid sequences did not reach a plateau at the 99%
phylotype cutoff (based on the furthest neighbour algorithm and a distance precision of 0.01;
Schloss and Handelsman, 2005), but did so at the 93% phylotype cutoff. The coverage estimated
by the method of Good (1953) was above 75% at the 98% phylotype cutoff. Therefore, the coverage
of potential diazotrophs in the PHS hot source pool was sufficient but not complete, and the clone
library was mostly representative of the diazotrophic diversity. The coverage and collector's
curves suggested high diazotrophic diversity and potential richness of the NifH species present in
the hot source pool.
Figure 13: Phyla percentile representation from PHS hot source pool clone library between NifH cluster I (clear slices) and cluster III (shaded slices). Pane A) Phyla distribution based on the BLASTX results. Pane B) Phyla distribution based on the phylogenetic analysis. In bold – The two main phyla per cluster.
Figure 14: Collector’s curves for taxa (OTUs) with minimum thresholds of 99(0.01), 98(0.02), 97(0.03), 96(0.04), 95% phylotype cutoff and lower, based on NifH partial amino acid sequences.
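The idea behind a collector's curve can be sketched in a few lines of Python. This is an illustration only, assuming each clone has already been assigned an OTU label at a given phylotype cutoff; the function name and the toy labels are hypothetical and are not taken from the Mothur workflow or from the PHS data.

```python
import random

def collectors_curve(otu_labels, seed=0):
    """Return the cumulative number of distinct OTUs seen after each clone.

    otu_labels: one OTU label per sequenced clone (hypothetical input).
    The sampling order is shuffled with a fixed seed so the result is reproducible.
    """
    order = list(otu_labels)
    random.Random(seed).shuffle(order)
    seen = set()
    curve = []
    for label in order:
        seen.add(label)
        curve.append(len(seen))
    return curve

# Toy example: 10 clones assigned to 4 OTUs (made-up labels, not PHS data).
toy_labels = ["A", "A", "B", "A", "C", "B", "A", "D", "A", "B"]
curve = collectors_curve(toy_labels)
print(curve)
```

A curve that is still rising at its right-hand end indicates that further sampling would probably recover additional OTUs, whereas a flat end corresponds to a plateau.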
The PHS hot source pool diazotrophic community included 20 OTUs (Operational Taxonomic
Units) at a 98% phylotype threshold, and the number of species estimated by the Chao1 non-
parametric richness estimator was 33.75 (23.40-75.55, 95% CI), indicating that sampling to
completion would yield roughly 14 (from the point estimate) to 56 (from the upper confidence
bound) additional NifH species (table 3, at 98% phylotype cutoff). The Shannon-Wiener index of
diversity (D) ranged from 2.35 to 3.12 between the 91% and 100% phylotype cutoffs.
Table 3: Coverage, observed phylotype richness and diversity indices for PHS hot source pool clone libraries, based on NifH partial amino acid sequences
Abbreviations: CI, confidence interval; OTUs, operational taxonomic units. (a) The coverage index was calculated by the method of Good (1953). (b) The richness index was calculated by the method of Chao et al. (1993). (c) The diversity index was calculated by the method of Shannon–Wiener (Krebs, 1989).
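The three estimators named above can be written out as a short Python sketch. It is not the Mothur implementation: the bias-corrected form of Chao1 and the natural logarithm in the Shannon-Wiener index are assumptions made here, and the OTU abundance vector is a made-up example rather than the PHS clone library.

```python
import math

def goods_coverage(counts):
    """Good's coverage: 1 - (number of singleton OTUs / total number of clones)."""
    n = sum(counts)
    if n == 0:
        raise ValueError("no clones in the sample")
    singletons = sum(1 for c in counts if c == 1)
    return 1.0 - singletons / n

def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1*(F1-1) / (2*(F2+1)),
    where F1 and F2 are the numbers of singleton and doubleton OTUs."""
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return s_obs + f1 * (f1 - 1) / (2.0 * (f2 + 1))

def shannon(counts):
    """Shannon-Wiener diversity: -sum(p_i * ln p_i) over observed OTUs."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

# Hypothetical abundances: number of clones in each of 7 OTUs.
toy_counts = [5, 3, 2, 2, 1, 1, 1]
print(goods_coverage(toy_counts), chao1(toy_counts), shannon(toy_counts))
```

In this sketch, the difference between the Chao1 estimate and the observed OTU count is the number of OTUs expected to be found with further sampling, which is how the richness estimate is read in the text above.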
4.3.4 Nitrogen fixation in Paralana Hot Springs
The diversity analysis demonstrated high diazotrophic diversity and richness in the PHS hot
source pool. The number of NifH clones analysed and sequenced in this study (76) is
the highest number of NifH clones analysed from a single hot spring to date (Hamilton
et al., 2011a). There is a strong potential for active nitrogen fixers, yet our attempts to identify
actively transcribing species were unsuccessful, owing mainly to limited material availability (see
section 4.2.1).
The DNA extraction and nifH gene amplification were successful. In summary, the best matches
in the BLAST analysis were to the G. lovleyi strain SZ nifH gene, M. laminosus CCMEE 5198 and
a few uncultured nifH clones from a sea water sample from the Mediterranean Sea (Man-
Aharonovich et al., 2007). The heterocystous cyanobacterium M. laminosus CCMEE 5198 is a
moderately thermophilic bacterium that is found in many hot springs worldwide (Miller et al.,
2007) and was previously reported from PHS (Anitori et al., 2002). However, the BLASTX
analysis recognised this specific clone as the heterocystous Nostoc sp. PCC 7120, at 97%
sequence similarity. Even at lower similarities in the BLASTX results, M. laminosus nifH
sequence was not suggested as a possible match (data not shown), and we concluded that most
probably this clone was indeed a NifH sequence from Nostoc sp. PCC 7120. Additionally, a
common match in the BLAST and BLASTX analyses was the mesophilic, strictly anaerobic G.
lovleyi, with high sequence similarity scores. G. lovleyi is a known metal reducer and
dechlorinating agent that has been studied extensively for its capabilities in bioremediation of
pollutants (Sung et al., 2006). This finding in the hot source pool of PHS is of interest,
especially if future work can verify it is an active nitrogen fixer. Cyanobacteria and -
Proteobacteria were the main diazotrophic taxa present in PHS hot source pool, according to the
BLASTX analysis.
Phylogenetic analysis provided additional interesting results as well, mainly in relation to the
cluster I vs. cluster III affiliations. The overall tree topology included high likelihood branches,
which resulted from choosing verified reference NifH sequences to work with, as well as
a highly optimised amino acid substitution matrix and phylogeny algorithms. The tree topology in
general was similar to previously reported NifH phylogeny trees (Zehr et al., 1997; Zehr et al.,
2003a), and PHS hot source pool sequences were divided between two main clusters, I and III,
and additional out groups.
The PHS -Proteobacteria NifH representatives were closely related to Burkholderia and
Azospirillum in cluster I. These genera are considered mesophilic and are routinely found to fix
nitrogen in rhizosphere and soil environments (Okon, 1985; Garrity et al., 2005). Finding such
traces is not surprising in the hot source pool, as it is an open pool, subjected to various
interventions from nearby soil areas. The DNA fragments may represent adjacent bacteria,
which landed in the sampling area, but do not actively fix nitrogen. The β-Proteobacteria
representatives were related to Azoarcus and Dechloromonas from cluster I, and BLAST
analysis indicated that their closest matches were originally isolated from an estuary, an Antarctic
microbial mat (Moisander et al., 2007; Jungblut and Neilan, 2010) and a Yellowstone National
Park hot spring (Hall et al., 2008). This could point to potential active nitrogen fixers, which
have the capability to adapt to various temperatures. The γ-Proteobacteria NifH clones were
affiliated with the Mo-dependent Azotobacter, a well studied mesophilic soil bacterium that fixes
nitrogen under microaerobic conditions, with an optimal diazotrophic growth pH of 7.0-7.5 (Dixon and
Kahn, 2004; Garrity et al., 2005), again pointing to the possible introduction of this genus from
nearby soil or rhizosphere areas. Half of the hot source pool cluster I NifH sequences were
affiliated with the Cyanobacteria, a well known resilient group of microorganisms that has been
reported from virtually every extreme environment on Earth; they are the best candidates to be the
active nitrogen fixers in this unique ecosystem (Whitton and Potts, 2000; Pandey et al., 2004; Thomas,
2005; Kaštovský and Johansen, 2008).
Sulphate reducers were another prominent finding in the hot source pool and were affiliated
with cluster III. The PHS δ-Proteobacteria NifH representatives were closely related to the
anaerobic Desulfobulbus, Desulfatibacillum and Desulfovibrio genera. Desulfobulbus spp. are
found in diverse environments, including deep sea methane vents and arsenic-rich, ferruginous
shallow marine hydrothermal sediments (Pernthaler et al., 2008; Handley et al., 2010).
Desulfatibacillum spp. were recently found in oil deposits and wellheads from high-
temperature oil wells (74 °C), and nifH fragments were also found in acidic, low temperature
peat bogs (Zadorina et al., 2009; Yamane et al., 2011). D. gigas has rarely been reported from
thermal environments, yet it is known to fix nitrogen (Gall, 1963; Riederer-Henderson and
Wilson, 1970; Steppe and Paerl, 2002).
PHS NifH clones were affiliated also with bacteria from the Spirochaetes group. Treponema
and Spirochaeta spp., which are obligate anaerobes, are commonly found in hot and thermal
environments, with an optimum growth range of up to 60°C in certain species (Patel et al.,
1985; Paster et al., 1991; Weller et al., 1992). They are known contributors to the global
nitrogen cycle, with high N2 fixation rates of up to 5 ng of N2 per hour (Lilburn et al., 2001). A
single clone was affiliated with an anaerobic Thermincola member of the Firmicutes phylum.
The closest phylogenetic relatives of this genus, Desulfosporosinus and Desulfotomaculum, were
found to fix nitrogen in soil and termite guts (Postgate, 1982; Roesch et al., 2010). It is of
interest to note that this thermophilic, alkali-tolerant genus was isolated from a hot spring in the
Lake Baikal region (Sokolova et al., 2005), and its nitrogen fixation capabilities under various
temperature conditions are currently unknown.
Two out groups to cluster I were T. yellowstonii and Geobacter spp. NifH sequences (figure
11). Certain strains of Geobacter were shown to fix atmospheric nitrogen under anaerobic
conditions (Bazylinski et al., 2000; Methé et al., 2005); however, this is the first report of
a Geobacter nifH gene fragment from a hot environment. Potential nifH sequences
were identified in a few other Geobacter spp. genomes, and they are at different stages of
verification in the databases, hence most were not included in the tree (these were: G.
sulfurreducens, G. bemidjiensis, G. metallireducens, G. sp. M21, G. sp. FRC-32, G.
uraniireducens, G. sp. M18 and G. daltonii; NCBI nucleotide database, 2012). Furthermore, a
thermophilic isolate of Geothermobacter ehrlichii, of the same family (Geobacteraceae), was
isolated from hydrothermal vents and grew between 35°C and 65°C, with an optimum growth
temperature of 55°C, suggesting thermophilic adaptations are quite possible within members of
the Geobacteraceae (Kashefi et al., 2003). The above data suggest a thermophilic nitrogen fixer
of the Geobacter genus might be active in the PHS hot source pool.
The thermophilic T. yellowstonii DSM 11347 NifH sequence was obtained from a complete
genome sequence project, directly submitted to NCBI databases (Genbank ID CP001147.1,
bioproject ID PRJNA30733) and is unverified by any other source. To our knowledge, this is
the first report of NifH from this species, from a hot environment. Nothing is known of its true
nitrogen fixation capabilities. It is also interesting that this thermophilic group of sulphate
reducers (Geobacter, Thermodesulfovibrio) does not fall within cluster III with other sulphate
reducers. We expect that, as additional genome sequencing projects are completed, a
thermophilic NifH cluster will establish itself separately from other NifH clusters.
This is mainly because the temperature regime would impose changes on the nitrogenase
characteristics, in order for it to remain functional under high temperatures. These changes will
probably be reflected in the amino acid sequence and the nifH gene sequence, effectively
producing a new cluster in the tree topology.
In the past, the PHS system was found to harbour high bacterial diversity based on a 16S rDNA
molecular analysis (Anitori et al., 2002). In the same study, 180 different RFLP patterns were
detected across all samples, and the Shannon-Wiener diversity estimator ranged from 0.57-3.85,
with most samples showing values higher than 2.5. Our study echoed that diversity with a high
Shannon-Wiener diversity range of 2.35 - 3.12. Only one study has provided this specific
estimator for thermophilic diazotrophs, and at the moment, these are the highest values reported
from a thermophilic environment. Hydrothermal vents in the 20°C to 78°C temperature range
yielded diversity estimators of 1.8 to 2.2 (Mehta et al., 2003), while non-thermophilic studies
produced diversity estimators as high as 2.92 and as low as 1.02 in comparison (Izquierdo and
Nüsslein, 2006; Roesch et al., 2010). Diversity studies in the geothermal springs of Yellowstone
National Park (Hamilton et al., 2011a) have not provided this specific diversity estimator for
the NifH clones, yet 13 hot springs were found to harbour 2-12 unique phylotypes at a sequence
identity threshold of 99%. In this study, at the same sequence identity threshold, the unique
OTU number was 24 (table 3), pointing to a potentially higher diazotrophic diversity.
A substantial increase in -Proteobacteria sequences was evident in our study, in comparison to
the published 16S rDNA analysis (Anitori et al., 2002). Also, we have identified
representatives of the Spirochaetes for the first time, and did not find any NifH Chloroflexi
related clones, though that group of bacteria was reported previously (Anitori et al., 2002).
There were no exact taxonomical matches in the -, -Proteobacteria groups between the
studies. However, there were several 16S rDNA sequences affiliated, at 95% similarity, with T.
islandicus, originally isolated from Icelandic hot springs (Sonne-Hansen and Ahring, 1999) and
of the same genus as T. yellowstonii strain DSM 11347, which was detected in our study
(Anitori et al., 2002). In a similar fashion, a few 16S rDNA sequences were also remotely related
to Pelobacter carbinolicus DSM 2380 and to Desulfuromonas spp. (87% and 84%, respectively),
from the Desulfuromonadales order. Our study has reported NifH clones affiliated with P.
carbinolicus DSM 2380 and Geobacter spp. from the same order.
We did not measure nitrogen fixing rates in this study, and we were unable to confirm active
nitrogen fixers. However, the literature points to common species that are repeatedly detected in
thermal springs around the world, some of which are known to actively fix nitrogen. For instance, our
analysis and the previous 16S rDNA analysis (Anitori et al., 2002), suggested heterocystous
were present in the hot source pool. Both species are aerobic nitrogen fixers, usually during
light periods (Stewart, 1973), and they were also detected in hot springs in Japan, at 70°C,
though it was not mentioned if they actively fixed nitrogen (Watanabe and Yamamoto, 1971).
These facts make them likely candidates to be active nitrogen fixers in the PHS system. In a
similar fashion, unicellular, filamentous and non-heterocystous Cyanobacteria found in our
study, such as Oscillatoria sp. PCC 6506 and Cyanothece sp. CCY0110, tend to fix nitrogen
aerobically during dark periods in order to avoid potential oxygen damage to the nitrogenase
complex (Stal and Krumbein, 1987; Reddy et al., 1993; Schneegurt et al., 1994; Berman-Frank
et al., 2003). A large thermophilic Oscillatoriales group was present at the Zerka Ma’in hot
springs at 59°C - 63°C (Ionescu et al., 2010). An interesting finding in a sulphide-rich hot
spring microbial mat (54°C), included a thermophilic Oscillatoria terebriformis, which was
found to move vertically along the sulphide gradients in the mat, from oxic to anoxic conditions
(Richardson and Castenholz, 1987). In addition, some strains of the Oscillatoria exhibited
reduced nitrogenase activity during light periods (in vivo) when grown heterotrophically, yet
when grown anaerobically, they were able to fix nitrogen during the light period as well (Stal
and Heyer, 1987; Gallon et al., 1991). The nitrogen fixing and motility capabilities of the
Oscillatoria group, and its repeated presence in hot environments, suggest that the
thermophilic Oscillatoria spp. present in the PHS hot source pool are likely candidates
to be active nitrogen fixers. We would also suggest that anaerobic sulphate reducing δ-
Proteobacteria, Desulfovibrio spp., would potentially be active nitrogen fixers in the hot
source pool. Evidence for anaerobic nitrogen fixation has been reported from various hot
sources: a 63°C hot spring in Jordan (Steppe and Paerl, 2002) and 50°C to 60°C alkaline springs in
Yellowstone National Park (Wickstrom, 1984; Oren et al., 2009).
Though few , , and -Proteobacteria have been detected in other thermophilic environments
(Ward et al., 1998; Ferris et al., 2001; Miller et al., 2009), PHS Proteobacteria NifH clones
were mainly associated with the plant rhizosphere, and the extent of their active nitrogen
contribution to the PHS system remains to be seen.
4.4 Concluding remarks
In summary, PHS hot source pool NifH clones partially matched a past study based on 16S
rDNA (Anitori et al., 2002). NifH clones were affiliated with the Oscillatoriales, Chroococcales,
Nostocales (Cyanobacteria), as well as with P. carbinolicus and Thermodesulfovibrio spp. (δ-
Proteobacteria and Nitrospirae, respectively). Cluster III NifH clones were related to members
of the δ-Proteobacteria (mostly SRB), Spirochaetes, Bacteroidetes and Firmicutes, none of
which were identified in the original 16S rDNA bacterial community study (Anitori et al.,
2002).
BLAST and BLASTX identified diazotrophs that might be active in nitrogen fixation in this
system. We would suggest that thermophilic Oscillatoria spp. and anaerobic sulphate
reducing δ-Proteobacteria of the Desulfovibrio genus would potentially be the active nitrogen
fixers in the hot source pool.
As with other culture independent studies, we assumed that not all bacteria which have the nifH
genes actually express them and fix N2. Nevertheless, we would like to suggest nitrogen fixation
does occur in the hot source pool, mainly because N2 levels in the spring waters were higher,
compared to the local atmospheric composition (Brugger et al., 2005).
In summary, the hot source pool in Paralana Hot Springs supports a diverse and rich
diazotrophic community. Our study has not only identified potential nitrogen fixers, it has also
expanded our basic knowledge of the microbial community composition and of its potential
nitrogen fixation dynamics.
Chapter 5 Structural and evolutionary adaptations in the Fe protein component of the nitrogenase
5.1 Introduction
The background question propelling our efforts throughout this chapter was whether there
were changes to the inferred NifH sequences obtained from hypersaline and thermal
environments (chapters 3 & 4) that reflect adaptations of the Fe protein to these
environments.
In order to remain active and functional under various physical conditions, it is essential for any
protein to adapt to its immediate surroundings (Jaenicke and Böhm, 1998; Somero, 2003;
Bolhuis et al., 2008). There are several possible pathways for adaptation; a protein may be
protected from inactivation by “external” factors, such as being enclosed within a cell or
organelle (a heterocyst for example). Micro-conditions surrounding the protein can also be
controlled, either with heat/cold shock proteins, or by organic compatible solutes or by a
heterotrophic existence, thus preventing exposure to unfavourable conditions and inactivation
(Des Marais, 1995; Fields, 2001; Pikuta et al., 2007). The amino acid composition within a
protein was found to change under stressful conditions such as high salinity, pressure, extreme
temperatures and pH (Madern et al., 1995; Jaenicke, 1996; Groudieva et al., 2004; Siddiqui and
Thomas, 2008; Greaves and Warwicker, 2009). It is therefore of interest to examine the
potential adaptations in the Fe protein in response to stressful environmental conditions, and
to gain a better understanding of the mechanistic solutions originating from genetic code
permutations.
The Fe protein, encoded by the nifH gene, has been phylogenetically classified within the
family of the Mrp/MinD proteins, as part of the SIMIBI class within the GTPase super class
group of proteins, which includes translation factors, signal recognition particle (SRP)
GTPases, and several families of ATPases (Leipe et al., 2002). GTPase proteins include several
conserved elements - a repetitive α/β secondary structure, an N-terminal Walker A motif, also
known as a P-loop, which structurally forms a loop and binds the -phosphate of a nucleotide to
facilitate hydrolysis (Walker et al., 1982). In addition, GTPases also include the Walker B
Figure 1. Amplified regions of NifH in the Fe protein (highlighted in blue and red). MoFe chains A - D are shown with minimal backbone atom display, the Fe protein chains E and F are shown in grey ribbons, except for the amplified NifH regions, residues 37-155. Space filled atoms are displayed for the Calcium ions, Fe7MoNS9 and Fe8S7 clusters in the MoFe protein, and the Fe4S4 cluster in the Fe protein. Image based on 2AFH PDB file (Tezcan et al., 2005).
motif, which binds via a water molecule to the MgATP, and includes conserved Asp and Gly
residues, preceded by four hydrophobic residues (Peters et al., 1995). Two switch regions,
known as Switch I and Switch II, were named by analogy with the homologous regions in ras
P21 proteins (Lanzilotta et al., 1996; Jang et al., 2000; Jang et al., 2004) and are vital to the
conformational change upon nucleotide binding. The Fe protein is a dimer, structurally
composed of eight beta sheets and alpha helices (Schlessman et al., 1998; Tezcan et al.,
2005), with a 4Fe:4S metallo cluster nestled in between (see also introduction chapter, section
1.3.1, for further details).
The Fe protein structure has been studied quite extensively due to its role in dinitrogen fixation
(Howard and Rees, 1996; Peters and Szilagyi, 2006). Molecular phylogenetic studies utilising
nifH gene primers (Zehr and McReynolds, 1989; Omoregie et al., 2004b) amplify only part
of the gene, corresponding to residues 37-155 (residue numbering according to the P00456 Swiss-
Prot ID sequence, see figure 1). This part contains information on switches I and II, the Walker
B motif and residues which coordinate the metallo cluster and interact with the second
component of the nitrogenase, the MoFe protein (see figure 2). The amplified section does not
cover the nucleotide binding fold, the Walker A motif (Walker et al., 1982). Within the amplified
region, there are known loops which can undergo conformational variations, plus several
conserved residues which form multiple hydrogen bonds via interaction with conserved water
molecules, and also NH-S bonds between the amide groups and sulfur atoms, specifically
around the 4Fe:4S cluster (Georgiadis et al., 1992; Schlessman et al., 1998; Chiu et al., 2001).
Figure 2. Known functional regions in the amplified regions of NifH in the Fe protein. Switch I region is highlighted in orange, switch II in forest green, Walker B motif in blue, and residues which interact with the MoFe protein are coloured red. Q54, part of the Q-loop motif (see main text in section 5.4.2), is in pink. For visualization purposes, MoFe chains A and B are presented in minimal wire, and the image was cropped. Fe protein chain E was omitted from the image (2AFH PDB file; Tezcan et al., 2005).
In order to elucidate structural deviations relating to potential environmental adaptation, it was
imperative to obtain a known Fe protein structure, which would represent each of cluster I and
III individually. Since 1992 the crystallographic structures of the Fe protein have provided new
insights on its mechanism and structure (Georgiadis et al., 1992; Kim et al., 1993; Peters and
Szilagyi, 2006). Twenty Fe protein structures have been resolved in the range of 2.1 - 3.2 Å from
Azotobacter vinelandii, phylogenetically affiliated with cluster I (P00459 Swiss-Prot ID, H.M.
Berman, 2003). The best refined model, 2AFH at 2.1 Å (P00459 Swiss-Prot ID), was chosen as
the reference structure for clones affiliated with cluster I (Tezcan et al., 2005). However, only
two resolved structures have emerged from a bacterium affiliated with cluster III,
Clostridium pasteurianum, and these structures were determined at 1.93 and 3.00 Å resolution
(Kim et al., 1993; Schlessman et al., 1998). The more refined structure, designated 1CP2
(P00456 Swiss-Prot ID), was chosen as the reference structure for this study, for clones
affiliated with cluster III. These two Fe protein models, 2AFH and 1CP2, are from mesophilic
bacteria and share 69% overall amino acid sequence identity and 73.5% sequence identity in the
amplified region of nifH specifically (Burgess et al., 1980; Zehr and McReynolds, 1989;
Schlessman et al., 1998; Omoregie et al., 2004a).
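For readers unfamiliar with how a percent identity such as those quoted above is obtained, a minimal Python sketch follows. It assumes the two sequences are already aligned and counts only columns where neither sequence has a gap; other tools use different denominators, so this convention is an assumption, and the two short sequences are invented for illustration (they are not 2AFH or 1CP2).

```python
def percent_identity(seq_a, seq_b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

# Invented aligned fragments: 5 ungapped columns, 4 of them identical.
print(percent_identity("MKV-LA", "MRVGLA"))
```

Restricting the two input strings to the columns of the amplified region would give the region-specific value in the same way.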
In order to detect amino acid substitutions in a sequence and changes in the Fe protein structure,
two different bioinformatic tools were employed with 1CP2, 2AFH and NifH clones from this
study and existing databases. ConSurf is a bioinformatic tool which identifies functional regions
in proteins, by taking into consideration their phylogenetic background and similarities between
amino acids (Glaser et al., 2003; Landau et al., 2005). After estimating the level of conservation
of each amino acid in a set of sequences, a representative colour scheme is projected onto a
protein 3D visualized structure, thus helping researchers to identify areas highly conserved and
functionally important, but also areas of medium to high variability (Pupko et al., 2002;
Goldenberg et al., 2008). ConSurf is ranked among the best bioinformatic tools currently
available for identifying important functional sections in proteins (Chung et al., 2005;
Ashkenazy et al., 2010; Mooney et al., 2011). ConSurf has been employed in the past in the
analysis of various proteins which included an iron-sulfur cluster, or supervised the biogenesis
of such clusters, for instance - the cytosolic iron-sulfur assembly protein (Cia1, Srinivasan et al.,
2007), the nitrogenase molybdenum-iron protein (Chung et al., 2006), the Iron–Sulfur Cluster
Assembly proteins (IscU, IscS, Ramelot et al., 2004; Shi et al., 2010), reverse-acting
Dissimilatory sulphite reductase (DsrAB, Grimm et al., 2010), and an ATPase component in the
biosynthesis of Fe–S clusters (SufC, SufE, Goldsmith-Fischman et al., 2004).
In most cases, these studies used ConSurf in addition to an analysis of the resolved protein
structure, to highlight regions of strict conservation and point out or confirm their specific
functionality (Ramelot et al., 2004; Li et al., 2009). At times, ConSurf has been used without
any accompanying biochemical analysis, as a prediction tool to help researchers
find, among other things, protein-protein interaction sites and ligand binding sites, provide data for
future mutational or structural studies, and assign domain functions to the ever increasing
number of hypothetical proteins (Bell and Ben Tal, 2003; Chung et al., 2005; Ashkenazy et al.,
2010). Thus, ConSurf analysis can be used to distinguish and illuminate conserved important
functional zones in families of proteins, and help in deducing lineage specific adaptations
(Glaser et al., 2005), even when a known 3D crystallographic structure is unavailable (Razia et
al., 2010; Kumar et al., 2012).
In our analysis, multiple alignments of each NifH cluster were compiled from reviewed NifH
sequences obtained from the Swiss-Prot database (Boeckmann et al., 2003): 58 and 32 reference
sequences for cluster I and cluster III, respectively. Using reference sequences reduced
background noise in the data, as there are many NifH sequences available in the databases,
isolated from various sources under various conditions. Furthermore, it was assumed that
genetic changes would manifest mainly in the non-conserved regions of the protein, and
therefore each multiple alignment was further split into two: a set of multiple sequences with
conserved residues only, and another set with variable residues only. The distinction between
variable and conserved residues was based on the ConSurf analysis, detailed in section 5.2.1.
The dichotomy between conserved and non-conserved residues enabled us to analyse shifts in the
amino acid composition for each segment.
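The split into conserved and variable columns, followed by a composition count, can be sketched as below. This is not the procedure of section 5.2.1: it assumes one ConSurf-style conservation grade per alignment column (1 = most variable, 9 = most conserved), and the cutoff `conserved_min`, the function names and the toy alignment are all hypothetical.

```python
from collections import Counter

def split_columns(alignment, grades, conserved_min=7):
    """Split aligned sequences into conserved-only and variable-only versions.

    alignment: list of equal-length aligned sequences.
    grades: one conservation grade per column (assumed scale 1..9).
    conserved_min: hypothetical cutoff; columns graded >= this are 'conserved'.
    """
    conserved_cols = [i for i, g in enumerate(grades) if g >= conserved_min]
    variable_cols = [i for i, g in enumerate(grades) if g < conserved_min]
    conserved = ["".join(seq[i] for i in conserved_cols) for seq in alignment]
    variable = ["".join(seq[i] for i in variable_cols) for seq in alignment]
    return conserved, variable

def composition(sequences):
    """Fraction of each amino acid across a set of sequences, ignoring gaps."""
    counts = Counter(ch for seq in sequences for ch in seq if ch != "-")
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}

# Toy alignment of three sequences with made-up grades per column.
toy_alignment = ["ACDE", "ACDK", "GCDE"]
toy_grades = [2, 9, 8, 3]
conserved_set, variable_set = split_columns(toy_alignment, toy_grades)
print(conserved_set, variable_set)
print(composition(variable_set))
```

Comparing the two composition dictionaries between reference sequences and clones is one simple way to look for the compositional shifts described above.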
Currently, the ConSurf web server requires a 3D structure of a protein, written as a PDB file, for
visualising the end result (Glaser et al., 2003). This, and our aim to detect structural shifts in the
clones, directed us to use predicted structures modelled by the iterative threading
assembly refinement (I-TASSER) server, which predicts 3D protein models (Zhang, 2008,
2009). Briefly, the server first assesses the possible secondary structure of a given sequence
against a representative PDB template library, using a 70% cutoff criterion for the pair-wise
comparison, and a combination of alignment programs, such as Needleman-Wunsch
(Needleman and Wunsch, 1970), Smith-Waterman (Pearson, 1991), etc, to propose a potential
secondary structure for the sequence (Zhang, 2008, 2009). The potential structure is then
divided into continuous segments of good quality structural alignments, and unaligned
fragmented sections, usually loop regions, which require a different method for structural
refinement. After additional spatial characteristics are calculated and averaged across a cluster
of potential structures, the modelling process is repeated to produce the best structural
candidate. In the second round, additional algorithms are used, such as TM-align (Zhang and
Skolnick, 2005) for structural alignment, and other software to add backbone atoms and side
chain rotamers, eventually producing a PDB file for downstream applications (Roy et al., 2010).
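Of the alignment methods named above, the Needleman-Wunsch global alignment is the simplest to illustrate. The sketch below computes only the optimal global alignment score by dynamic programming; the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration and are not the profile-based scoring used inside I-TASSER.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of sequences a and b (linear gap penalty)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against an empty sequence costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            score[i][j] = max(diag, up, left)
    return score[-1][-1]

# Three matches and one gap under the assumed scoring.
print(needleman_wunsch_score("ACGT", "AGT"))
```

A traceback through the same matrix would recover the alignment itself; it is omitted here to keep the sketch short.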
The I-TASSER server provides different scores to evaluate the quality of its models. ‘C-score’
is a confidence score for estimating the quality of predicted models by the I-TASSER server.
The C-score is calculated based on the significance of threading template alignments and the
convergence parameters of the structure assembly simulations. It is typically in the range of
[-5, 2], where a higher C-score signifies a model with higher confidence and vice versa
(Zhang and Skolnick, 2007; Roy et al., 2010). A template modelling score (TM-score) is a scale
for measuring the topological similarity between two structures (Zhang and Skolnick, 2004), a
TM-score >0.5 would indicate that a model had correct topology and a TM-score below 0.17
would indicate random similarity. Root mean square deviation (RMSD) is another score
provided by the server, which is a well known standard for measuring the accuracy of structure
modelling, when the native structure is known (Kabsch, 1976, 1978; Carugo, 2003). The lower
the RMSD score, the better is the match between structures (i.e., smaller deviations).
I-TASSER consistently ranks as the best method in the Critical Assessment of Structure Prediction
(CASP) experiments for predicting protein structures (Zhang, 2007, 2009; Roy et al.,
2010).
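The two structural similarity measures described above can be written out as a short Python sketch. It assumes the two structures have already been superposed and that residue pairs are already matched; the full TM-score additionally searches for the superposition that maximises the sum, which is not done here. The length-dependent scale d0 follows the commonly cited form from Zhang and Skolnick (2004), and the function names are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between matched (x, y, z) points, in input units."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))

def tm_score(distances, target_length):
    """TM-score for one fixed superposition.

    distances: distance (in Angstrom) between each aligned residue pair.
    target_length: number of residues in the target (native) structure.
    """
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    if d0 <= 0:
        raise ValueError("target too short for this d0 formula")
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length

# One point displaced by a 3-4-5 triangle gives an RMSD of 5.
print(rmsd([(0.0, 0.0, 0.0)], [(3.0, 4.0, 0.0)]))
# A perfect match over half of a 100-residue target gives a TM-score of 0.5.
print(tm_score([0.0] * 50, 100))
```

Because each residue pair contributes at most 1 to the sum and the sum is divided by the target length, the TM-score stays between 0 and 1 and is less sensitive to a few badly placed loops than RMSD is.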
Whether or not they are accompanied by a functional or biochemical analysis, these two
bioinformatics tools can provide powerful insight and novel information on protein structure
and conservation. In one study on the membrane associated thioredoxins of the Arabidopsis
thaliana plant, ConSurf was used to highlight two conserved amino acids, Gly and Cys, in the
N-terminal extension of the protein. These were then mutated to Ala, for further functional and
structural analysis (Meng et al., 2010). Subsequently, I-TASSER was used to predict the protein
and the mutant variants’ 3D structures, which enabled the researchers to show structural
modifications due to the changes in those specific amino acids. Their mutational and
biochemical study supported the ConSurf and I-TASSER results. In another study, a large scale
ConSurf analysis of the NS1 and NS2 amino acid sequences, from influenza A virus, was
projected onto I-TASSER models of NS1 and NS2, to highlight novel potential binding sites for
drugs (Darapaneni et al., 2009). A few additional studies have employed both tools,
and we expect more studies using ConSurf and I-TASSER to emerge (Jimenez-Lopez et al.,
2010; Meng and Feldman, 2010; Aluri and Terli, 2012; Bhat et al., 2012).
To our knowledge, this is the first time these tools have been used to analyse the Fe
protein component of nitrogenase. We therefore first evaluated the
novel methodology and only then analysed our data with the established protocol.
Specific aims:
1. The two main clusters in the NifH phylogenetic tree are cluster I and cluster III. Most of our
previous findings (detailed in chapters 3 and 4) were affiliated with these clusters. Therefore,
our first aim in this chapter was to characterise conservation patterns and amino acid
distribution in NifH sequences from cluster I and cluster III.
2. Evaluate the novel methodology with regard to known structural and functional regions of the Fe
protein.
5.2 Material and methods
5.2.1 Evolutionary conservation
Evolutionarily conserved and non-conserved residues for clusters I and III and affiliated clones
were calculated by the “ConSurf” program (Pupko et al., 2002; Glaser et al., 2003; Landau et
al., 2005; Goldenberg et al., 2008). Pre-compiled multiple alignments were built using
MUSCLE (Edgar, 2004), manually checked, and submitted to ConSurf online web server for
analysis (http://ConSurf.tau.ac.il/). Specific parameters were chosen - homologues were
collected from Swiss-Prot (Boeckmann et al., 2003), PSI-BLAST E-value: 0.0001, no. of PSI-
BLAST iterations: 1, maximal % ID Between Sequences: 100, minimal % ID for homologs: 72.
A phylogenetic tree was constructed with the method of Neighbour Joining and ML distance,
and the method of maximum likelihood and the LG protein substitution model were chosen for
the conservation scores (Le and Gascuel, 2008; Posada et al., 2009). After the initial collection
of homologues, only cluster I or cluster III specific sequences from known organisms were
chosen (58 and 32 sequences, respectively) and submitted for further analysis with ConSeq
(Berezin et al., 2004).
5.2.2 Residue composition
The aligned NifH sequences from known organisms affiliated with cluster I or cluster III and
the inferred NifH sequences from clones were subjected to residue composition calculation. The
average ratio of the 20 amino acids in the partial NifH sequences was calculated for each
multiple alignment using MEGA 5 software (Tamura et al., 2007; Kumar et al., 2008; Tamura
et al., 2011). In each multiple alignment, the average ratio of the amino acids was calculated
separately for the conserved and variable sections of the NifH sequence (conserved residues
(mainly Ala, 8), 114 (Tyr, 8), 125 (Tyr, 8), 178 (Val, 8), 201 (Ala, 8), 253 (L/M/K, score 7) and
261 (highly variable, score 1).
Figure 3 Multiple alignment of NifH complete sequences (N=32) from known organisms affiliated with cluster III coloured by ConSurf. Scale bar colours represent scores 1-9, variable residues in turquoise and completely conserved residues in maroon. The first line shows the residue number, the second line shows consensus, and third line shows the evolutionary score. The first reference sequence, input-pdb-seqres_A is NifH chain A from 1CP2 pdb file, the rest are NifH sequences affiliated with cluster III, see section 5.2.1. Thin line marks the amplified region. Figure continues in the next pages.
Because the nifH gene primers, used throughout this study (Zehr and McReynolds, 1989;
Omoregie et al., 2004b), amplify only part of the gene, the rest of our analyses referred only to
the amplified section in order to obtain specifics to compare against the clones’ inferred
sequences.
With regard to the average residue composition of the cluster III alignment, it was important to clarify
whether the amino acid populations followed a Gaussian (normal) distribution in order to
perform a comparative analysis, such as a t-test analysis or ANOVA, on their residue
summarises our analysis of amino acid composition in cluster III sequences.
In cluster III, 12 amino acids passed the normality test and eight did
not. Trp rarely appeared and did not pass the normality test because there was no
distribution to observe. Asp, Glu and Gly did converge to a bell-shaped curve, yet the bell-shaped
curve was not the best fit (figure 4). Phe, Val and Ser did not pass the
normality test mainly because their distributions revolved around a few discrete values and did not
exhibit a normal distribution in our data set. Gly was the most common amino acid in cluster III
sequences (mean value 11.58%), while Glu composition varied the most, with the highest
standard deviation in the group (1.22).
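The per-amino-acid mean and standard deviation reported here can be illustrated with a short Python sketch. This is not the MEGA 5 procedure: the function names are hypothetical, gaps and non-standard symbols are simply ignored, and the sample standard deviation is assumed.

```python
from statistics import mean, stdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Percentage of each of the 20 standard amino acids in one sequence.

    Gap characters and non-standard symbols are ignored.
    """
    residues = [r for r in sequence.upper() if r in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard residues")
    return {aa: 100.0 * residues.count(aa) / len(residues) for aa in AMINO_ACIDS}


def summarise(sequences):
    """Mean and sample standard deviation of each amino acid's percentage
    across a set of sequences (at least two sequences are required)."""
    per_sequence = [composition(s) for s in sequences]
    summary = {}
    for aa in AMINO_ACIDS:
        values = [c[aa] for c in per_sequence]
        summary[aa] = (mean(values), stdev(values))
    return summary


# Toy input: two made-up four-residue fragments, not real NifH sequences.
for aa, (avg, sd) in summarise(["GGAA", "GAAA"]).items():
    if avg > 0:
        print(aa, round(avg, 2), round(sd, 2))
```

The list of per-sequence percentages for one amino acid is also the sample that a normality test would be applied to.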
Table 1. Amino acids composition in the amplified region of NifH, cluster III sequences from known organisms (N=32).

Amino Acid  Mean (%)  Std. Deviation  Passed normality test (alpha=0.05)? (a)  P-value Summary (b)
Ala         7.59      0.67            Yes
Cys         2.37      0.31            Yes
Asp         6.01      0.92            No                                       ****
Glu         8.41      1.22            No                                       *
Phe         2.40      0.56            No                                       **
Gly         11.58     0.57            No                                       *
His         0.88      0.41            No                                       **
Ile         7.48      0.63            Yes
Lys         6.34      0.58            Yes
Leu         8.07      1.05            Yes
Met         3.96      0.53            Yes
Asn         3.92      0.75            Yes
Pro         3.19      0.21            Yes
Gln         3.23      0.49            Yes
Arg         4.98      0.84            Yes
Ser         3.51      0.87            No                                       **
Thr         4.93      0.40            Yes
Val         7.50      0.69            No                                       **
Trp         0.14      0.21            No                                       *
Tyr         3.50      0.52            Yes

(a) D'Agostino & Pearson omnibus normality test (D'Agostino, 1986). (b) **** P< 0.0001 extremely significant, *** 0.0001 <P< 0.001 very significant, ** 0.001 <P< 0.01, * 0.01 <P< 0.05, significant. N - Not significant. n/a - not applicable.
Figure 4 Upper pane: The distribution shape of three amino acids based on their composition pattern in the NifH sequence: Asp, Glu and Gly. Goodness of fit (Robust sum of square):Asp-6.22, Glu-6.04, Gly-8.03. Lower pane: The distribution shape of three amino acids based on their composition pattern in the NifH sequence: Ser, Val and Phe. Goodness of fit (Robust sum of square): Ser-6.7, Val-6.83, Phe- curve did not converge. Y axis represents the relative frequency of the X axis values. X axis represents the range of the composition values of each amino acid in the NifH amplified region.
Completely conserved Gly and Ala, in the amplified section of NifH, had interesting structural
characteristics (Table 2). Their secondary structure, based on 1CP2 resolved structure, was
characterised by high curvature sections (bends, ‘S’), H-bonded turns (‘T’, just before or after a
helix, usually) and 3-helix turns (‘G’ - three residues per turn), and few were present in
unidentified or coil regions (Table 2, G or A score 9). Conserved Gly or Ala residues were
adjacent to important functional domains, such as the nucleotide or MoFe binding sites, the Fe
protein inter-subunit interaction region and the switch I & II regions. Solvent accessibility
analysis suggested that some of the Gly and Ala residues were accessible to the solvent, at positions 41,
50, 64, 76, 86-87, 91, 110 and 139. However, other conserved Gly or Ala residues were buried,
usually within conserved structural motifs.
Table 2 1CP2 Fe protein partial NifH sequence, conservation scores, secondary structure and solvent accessibility. Conserved Ala (A) and Gly (G) residues are highlighted.

1CP2 position markers          40 | 60 | 80 | (^)
Partial NifH Sequence (a)      CDP KADSTRLLLGGLAQKSVLDT LREEGEDVELDSILKEGYGG IRCVESGGPEPGVGCAGRGI
Conservation Score (b)         999 99999984919719688699 99189757161131419111 41889999999999999999
Secondary Structure (e) *      E-T TS-SSHHHHTS-----HHHH HHHHGGG--HHHH-EE-GGG -EEEE------TTSS-HHHH
Secondary Structure (e) ***    --- ---HHHHHH-------HHHH HHHH-----HHHHHHH---- EEEEE--------------H
Solvent Accessibility (c) †    bee eeebbebbbeebeeeebbee beeeeeebebeebbeeeeee beeeeeeeeeeeebbbbebb
Solvent Accessibility (c) ††   --e --e-e---e-ee-e--ee-- eeee-ee--eeee--ee--e -e-e----e-e--ee--e--
Solvent Accessibility (c) †††  b-b ---bb--b--------bb-- b-----b-b--bb------- b-bbbb---bbbbb----bb
Binding to MoFe Protein (d)    --- -----------------LD- LR---ED------------- --------P---V-C--R--
Nucleotide Binding Site (d)    -D- K-DS---------------- -------------------- --------------------
Chains Interface (d)           --- K-D--R-----L-------- -------------------- ---------EP-V--A----
Structural Motifs (d)          SWITCH1------------------ -------------------- --------------------
(a) Corresponding partial NifH amino acid sequence P00456 accession ID, chain A.
(b) Residue conservation scores, calculated by ConSurf with Maximum likelihood and LG protein substitution model (Pupko et al., 2002; Glaser et al., 2003; Landau et al., 2005; Goldenberg et al., 2008).
(e) Secondary structure: ‘H’ - helix; ‘T’ - hydrogen bonded turn; ‘S’ - bend; ‘E’ - extended beta sheet; ‘G’ - 3-helix (three residues per turn); ‘-’ unknown/random coil. * Based on 3D coordinates calculated by DSSP as implemented in The Protein Data Bank (Kabsch and Sander, 1983; H.M. Berman, 2003). *** Predicted secondary structure by I-TASSER online server, based on P00459 NifH sequence (Zhang, 2008; Roy et al., 2010). ‘H’ - helix, ‘E’ - extended beta sheet, ‘-’ unknown.
(c) Solvent accessibility: † Buried (b) or exposed (e) residue; calculated by ConSeq online server (Sridharan et al., 1992; Pollastri et al., 2002; Berezin et al., 2004). †† Solvent accessibility calculated by WHAT IF program, ‘-’ unknown, (e) exposed - a residue that is clearly solvent accessible, more exposed than 102 Angstrom, or more than 33% of its accessibility in the unfolded state (Vriend, 1990). ††† Predicted solvent accessibility calculated by I-TASSER server. ‘b’ - buried residue (0); ‘e’ highly exposed residue (7-9); ‘-’ varying degrees of exposure (Chen and Zhou, 2005; Wu and Zhang, 2008).
(d) Residues interacting with structural components in nitrogenase (Schlessman et al., 1998).
(^) Cysteine residues which coordinate the metallo cluster are marked C.
Table 2 1CP2 sequence continued. Conserved Ala (A) and Gly (G) residues are highlighted.

1CP2 position markers          100 | 120 | (^) 140 | 154 |
Partial NifH Sequence          ITSINMLEQLGAYTDDLDYV FYDVLGDVVCGGFAMPIREG KAQEIYIVASGEMMAL
Conservation Score             99886379279861119967 88999999999999999949 9919799959797983
Secondary Structure *          HHHHHHHHTT----TT-SEE EEEEE-SS-STTTTHHHHTT S--EEEEEE-SSHHHH
Secondary Structure ***        HHHHHHHHHHHH------EE EEE-----EEE---EE---- ---EEEEEE--HHHHH
Solvent Accessibility †        bbbbbbbeebeebeeebebb bbbbbbbbbbbbbbbebeee ebeebbbbbbbebbbb
Solvent Accessibility ††       -e-ee-eeee-eeee-e--- -----------------ee- e-e--------eee--
Solvent Accessibility †††      bbb--b-e------e-b-bb bbbb-b--bbbb-bb--b-- -b--bbbbb----bbb
Binding to MoFe protein        IT--N---Q----------- ---------C----M--RE- ----------------
Nucleotide Binding Site        -------------------- --D---D------------- -----------E-MA-
Chains Interface               -------------------- ----L-DVVC--FA------ -----------EMM--
Structural Motifs              -------------------- -S W I T C H2------- ----------------
Figure 5 Superimposition of the I-TASSER model of the amplified NifH sequence based on P00456, and the crystallographic structures of 1CP2. Sections with RMSD > 1Å are highlighted with colour.
The I-TASSER analysis of 1CP2 chain A sequence, produced one 3D model, with an estimated
accuracy of RMSD 1.9±1.5 Å, C-score of 2.13 and 0.99±0.04 TM-score. The PDB templates,
identified by the various server software modules in the threading stage, were PDB 1CP2 chain
A and 2AFH chains E & A, the latter receiving lower sequence identity percentages. The top
ranking EC predicted number was 1.18.6.1 (nitrogenase) with a TM-score of 0.9782 and RMSD
0.66 Å, with 100% sequence identity to the query sequence, an EC-score of 4.5881 and a PDB
hit to 1CP2 chain B (the dimer chain in the Fe protein). The most structurally similar protein to
the I-TASSER model was actually identified as the 2AFH chain E, with a TM-score of 0.9929
and RMSD 0.55 Å, with 69% sequence identity.
Superimposing the I-TASSER model over the known x-ray crystallographic structure of 1CP2
(see section 5.2.3 for specific parameters), highlighted which amino acids were positioned
imprecisely by the I-TASSER server. The overall RMSD of superimposing both structures,
predicted and known, was 0.529 Å. Four sections had RMSD values higher than 1 Å (figure 5):
164, 170-171, 187, 213, 223, 262 and 266. The less conserved positions were 159 (mainly Y,
score 8), 166 was an exchange between positively charged amino acids (K/R, 6), position 215
was mainly Asn (N, 8) and position 219 (another exchange - R/H, 6). Additionally, positions
187-190 included a motif which started with a positively charged amino acid, and ended with a
negatively charged amino acid, with a hydrophobic or uncharged residue inserted in between
(R,9; N/Q/K,1; T/V,6; D,8).
A similar motif was present in positions 221-224, starting with a negatively charged amino acid,
followed by a hydrophobic residue and ending with a positively charged amino acid (E, 9; L/I,
8; R, 9; R/K, 7). In addition, 26 residues that participated as hydrogen bonding partners with
water molecules (Schlessman et al., 1998), were completely conserved, with only two residues
scoring 8 - position 143 which was mainly Lys and position 169 that had an exchange between
Val and Leu.
Figure 6 Multiple alignment of 58 NifH complete sequences (N=58) from known organisms affiliated with cluster I, coloured by ConSurf. Colours represent scores 1-9, variable residues in turquoise, average conservation in white and completely conserved residues in maroon. The first line shows the residue number, the second line shows consensus, and third line shows the evolutionary score by ConSurf. The first sequence in the alignment, Input_pdb_ATOM_E, is the complete NifH sequence of chain E from 2AFH PDB file. The rest are NifH sequences affiliated with cluster I, see section 5.2.1. Thin black line marks the amplified region by the nifH gene PCR primers. Figure continues in the next pages.
In a similar fashion to the residue composition analysis previously done on the cluster III
alignment, table 3 summarises our analysis of amino acids composition and distribution in
cluster I sequences. In cluster I, 13 amino acids were found to pass the “omnibus K2” normality
test (alpha=0.05), and seven amino acids did not pass. These amino acids did not pass the
normality test due to three different reasons: a. an amino acid very rarely appeared in a
sequence, hence there was no distribution to observe (Trp), b. the distribution revolved around a
few discrete values and did not exhibit normal distribution (Cys, Asp, Phe, Asn, Pro), and c. two
distributions were observed instead of just one in the data set (Arg). Figure 7 shows examples of
a Gaussian non-linear regression analysis for points b & c, for Arg, Cys and Phe. Gly was the
most common amino acid in cluster I sequences (mean value 10.05%), while Leu composition
varied the most, with the highest standard deviation in the group (1.12).
Table 3. Amino acids composition in the amplified region of NifH sequences from known organisms affiliated with cluster I (N=58).

Amino Acid  Mean (%)  Std. Deviation  Passed normality test (alpha=0.05)? (a)  P-value Summary (b)
Ala         9.56      0.78            Yes
Cys         2.09      0.49            No                                       ****
Asp         5.33      0.63            No                                       **
Glu         9.13      0.86            Yes
Phe         1.99      0.36            No                                       ***
Gly         10.05     0.47            Yes
His         1.51      0.52            Yes
Ile         7.94      0.62            Yes
Lys         5.13      0.84            Yes
Leu         8.59      1.12            Yes
Met         4.00      0.75            Yes
Asn         4.25      0.55            No                                       ***
Pro         2.89      0.15            No                                       ***
Gln         3.62      0.76            Yes
Arg         4.77      0.73            No                                       ***
Ser         4.07      0.72            Yes
Thr         4.76      0.67            Yes
Val         6.99      0.90            Yes
Trp         0.04      0.11            No                                       ****
Tyr         3.32      0.29            Yes

(a) D'Agostino & Pearson omnibus normality test (D'Agostino, 1986). (b) **** P< 0.0001 extremely significant, *** 0.0001 <P< 0.001 very significant, ** 0.001 <P< 0.01, * 0.01 <P< 0.05, significant. N - not significant. n/a - not applicable.
Figure 7 The distribution shape of three amino acids based on their composition pattern in the NifH sequence: Arg, Cys and Phe. Goodness of fit (Robust sum of square): Arg-7.21, Cys-6.01, Phe-5.54. Y axis represents the relative frequency of the X axis values. X axis represents the range of the composition values of each amino acid in the NifH amplified region.
In a similar fashion to 1CP2 structural analysis, the conserved Gly and Ala of the amplified
region of NifH in 2AFH, appeared in alpha helices (‘H’), regions with high curvature (bends,
‘S’) and H-bonded turns (‘T’, just before or after a helix, usually) and 3-helix turns (‘G’ - three
residues per turn), with only a few present in unidentified or coil regions (G or A with
conservation score 9, table 4). Conserved Gly or Ala residues in 2AFH were adjacent to
important functional domains, similar to the 1CP2 findings. Solvent accessibility analysis
suggested that some of the Gly or Ala residues were accessible to the solvent (positions
42, 65, 89-90 and 114), while others remained buried.
The I-TASSER server provided five potential models for the 2AFH chain E sequence, and their
C-scores ranged from 2.12 to -5. ‘Model1’ had the best C-score (2.12), with RMSD 2.0±1.6 Å and a
TM-score of 0.99±0.04. The templates identified in the threading stage were 2AFH chains E and
A, 1CP2 chain A, 1DE0 chain A and 2NIP chain A (additional Fe proteins from A. vinelandii).
Only chains E and A from 2AFH had 100% sequence identity, while the rest had varying
degrees of sequence identity. The top ranking EC predicted number was 1.18.6.1 (nitrogenase)
with a TM-score of 0.8922, RMSD 1.7 Å, 98% sequence identity to the query sequence, an EC-
score of 4.0401 and a PDB hit to 1N2C chain E (a nitrogenase complex from A. vinelandii). The
most structurally similar protein to the first I-TASSER model was identified as 2AFH chain E,
according to its TM-score of 0.9897 (RMSD 0.54 Å, 100% sequence identity).
Superimposing the I-TASSER model with the known x-ray crystallographic coordinates of
2AFH (see section 5.2.3 for specific parameters), highlighted only two residues that were not
precisely positioned (figure 8). These were G96 and E116 (numbering according to P00459
sequence), at RMSD values of 1.311 and 1.168 Å, respectively, while the overall RMSD was
0.349 Å.
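The per-residue screening described above (an overall RMSD plus a list of residues deviating by more than 1 Å) can be sketched in Python as follows. This is an illustration only, not the software used for the superimposition: it assumes both structures are already superimposed, takes one Cα coordinate per residue, and uses made-up coordinates and a hypothetical function name.

```python
import math


def flag_deviations(model, reference, threshold=1.0):
    """Compare two already superimposed structures residue by residue.

    model, reference: dicts mapping a residue label to its (x, y, z)
    C-alpha coordinate. Returns the overall RMSD over shared residues and
    a dict of residues whose deviation exceeds the threshold (angstroms).
    """
    shared = [label for label in model if label in reference]
    if not shared:
        raise ValueError("the two structures share no residue labels")
    deviations = {k: math.dist(model[k], reference[k]) for k in shared}
    overall = math.sqrt(sum(d * d for d in deviations.values()) / len(deviations))
    flagged = {k: d for k, d in deviations.items() if d > threshold}
    return overall, flagged


# Made-up coordinates for two residues; the labels are only examples.
model = {"G96": (0.0, 0.0, 0.0), "R97": (1.0, 0.0, 0.0)}
reference = {"G96": (0.0, 0.0, 1.3), "R97": (1.0, 0.0, 0.2)}
overall, flagged = flag_deviations(model, reference)
print(round(overall, 3), sorted(flagged))
```

A low overall RMSD can therefore coexist with a handful of flagged residues, which is the pattern reported for both Fe protein models.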
Figure 8 Superimposition of the I-TASSER model based on P00459 sequence, amplified section, and the crystallographic structures of 2AFH chain E. Sections with RMSD >1 Å are highlighted with colour.
Table 4 2AFH Fe protein partial NifH section, conservation scores, secondary structure and solvent accessibility. Conserved Ala (A) and Gly (G) residues are highlighted.

2AFH position markers          40 | 60 | 80 | (^)
Partial NifH Sequence (a)      CDPKADSTRLILHSKAQNTIM EMAAEAGTVEDLELEDVLKA GYGGVKCVESGGPEPGVGCA
Conservation Score (b)         999999999979619793947 38991195999582139111 91115197989999979999
Secondary Structure (e) *      E-S-SSSSHHHH--SS--HHH HHHHTTSSGGG--HHHH-EE -GGG-EEEE-----TTT--H
Secondary Structure (e) ***    ---EEEEEE--------HHHH HHHHHHHH---EEEEE---- --HHHHHHH------HHHHH
Solvent Accessibility (c) †    beeeeebbebbbebebeebbb ebbbeeeebeebebeebbee beeebebeeeeeeeeeebbb
Solvent Accessibility (c) ††   --ee----e---e-ee-e--- e--ee----ee-eeee--ee -eee-e-------e--e---
Solvent Accessibility (c) †††  b-----bb-bbb-------bb ----e------b-b--bb-- ----b-bb---------bbb
Binding to MoFe Protein (d)    --------------------M E-AA---TVED---------- ------------P---V-C-
Nucleotide Binding Site (d)    -D-K-DS-------------- --------------------- --------------------
Chains Interface (d)           ---K-D--R-----K------ --------------------- -------------EP-V--A
Structural Motifs (d)          SWITCH1---------------- --------------------- --------------------
(a) Partial NifH amino acid sequence from 2AFH crystallographic 3D structure and P00459 sequence accession ID, chain E.
(b) Residue conservation scores, calculated by ConSurf with Maximum likelihood and LG protein substitution model (Pupko et al., 2002; Glaser et al., 2003; Landau et al., 2005; Goldenberg et al., 2008).
(e) Secondary structure: ‘H’ - helix; ‘T’ - hydrogen bonded turn; ‘S’ - bend; ‘E’ - extended beta sheet; ‘G’ - 3-helix (three residues per turn); ‘-’ unknown/random coil. * Based on 3D coordinates calculated by DSSP as implemented in The Protein Data Bank (Kabsch and Sander, 1983; H.M. Berman, 2003). *** Predicted secondary structure by I-TASSER online server, based on P00459 NifH sequence (Zhang, 2008; Roy et al., 2010). ‘H’ - helix, ‘E’ - extended beta sheet, ‘-’ unknown.
(c) Solvent accessibility: † Buried (b) or exposed (e) residue; calculated by ConSeq online server (Sridharan et al., 1992; Pollastri et al., 2002; Berezin et al., 2004). †† Solvent accessibility calculated by WHAT IF program, ‘-’ unknown, (e) exposed - a residue that is clearly solvent accessible, more exposed than 102 Angstrom, or more than 33% of its accessibility in the unfolded state (Vriend, 1990). ††† Predicted solvent accessibility calculated by I-TASSER server. ‘b’ - buried residue (0); ‘e’ highly exposed residue (7-9); ‘-’ varying degrees of exposure (Chen and Zhou, 2005; Wu and Zhang, 2008).
(d) Residues interacting with structural components in nitrogenase (Schlessman et al., 1998).
(^) Cysteine residues which coordinate the metallo cluster are marked C.
Table 4 2AFH sequence continued. Conserved Ala (A) and Gly (G) residues are highlighted.

2AFH position markers          100 | 120 | (^) 140 | 158 |
Partial NifH Sequence          GRGVITAINFLEEEGAYEDD LDFVFYDVLGDVVCGGFAMP IRENKAQEIYIVCSGEMMAM
Conservation Score             99989969889992799113 18897999999999999999 99668999999939999999
Secondary Structure *          HHHHHHHHHHHHHTT-SSTT -SEEEEEEE-SS--TTTTHH HHTT---EEEEEE-SSHHHH
Secondary Structure ***        HHH--------HHHHHHH-- --EEEEE------------- -HHHHHHHHHH---------
Solvent Accessibility †        bebebbbbbbbeeeeeeeee bebbbbbbbbbbbbbbbbbe beeeebeebbbbbbbebbbb
Solvent Accessibility ††       ------------e----eee -e------------------ -eee--e---------ee--
Solvent Accessibility †††      --bbbbbb-bb--------- b-bbbbbb-bbbbbb-bb-- b----b--bbbbbb---bbb
Binding to MoFe protein        -R--IT--N---E------- -------------C----M- -RE-----------------
Nucleotide Binding Site        -------------------- ------D---D--------- ---------------E-MA-
Chains Interface               -------------------- --------L-DVVC--FA-- ---------------EMM--
Structural Motifs              -------------------- -S W I T C H 2 - --------------------
5.3.3 Comparative analysis of cluster I and cluster III Fe proteins
We projected our results from the evolutionary analysis onto the relevant Fe protein structures,
2AFH chain E for cluster I and 1CP2 chain A for cluster III (figure 9). Big blocks of completely
conserved residues were evident in the interior of the Fe protein, in the vicinity of the metallo
cluster, as expected. Completely conserved residues were found throughout the structure, also
toward its exterior, in a more fragmented fashion, alongside less conserved residues.
The secondary structure was similar between 2AFH & 1CP2 (figure 10). The overall RMSD
when superimposing 2AFH & 1CP2 crystallographic structures was 0.67 Å (without the 13
residues in the C-terminus of 2AFH), which meant most Cα atoms of the amino acids were
positioned fairly similarly in both proteins. However, there were regions where the RMSD was
higher than 1 Å, as can be seen in figure 11. Six sections with RMSD values higher than 1 Å
were present in the amplified region (Table 5).
Table 5 2AFH & 1CP2 regions of RMSD >1 Å. In bold, residues in the amplified region of NifH.
Residue positions*  2AFH sequence (a)  Conservation score (b)  1CP2 sequence  Conservation score  Main secondary structure (c)  RMSD (Å)
26-28               AEM                841                     HAM            111                 helix&turn                    1.513
50-53               SKAQ               1979                    GLAQ           9719                coil                          2.143
58-68               EMAAEAGTVEDLE      3899119599958           DTLREEGEDVE    99991897571         helix&turn                    2.341
87-90               GPEP               9999                    GPEP           9999                coil                          1.295
93                  G                  9                       G              9                   bend                          1.998
96-97               GR                 99                      GR             99                  helix                         1.001
108-116             EEGAYEDDL          927991131               QLGAYTDDL      279861119           coil&turn                     1.379
184-188             RNTDR              91684                   RKVAN          91971               coil&bend                     3.473
198                 N                  1                       K              1                   helix                         1.073
262-269             EELLMEFG           91695149                EEILMQYG       91641119            helix&turn                    3.083
* Position number according to 1CP2, chain A sequence P00456. (a) The amino acid in each position in the respective Fe protein, 2AFH or 1CP2. (b) ConSurf conservation scores, 1-9, non conserved to completely conserved, respectively (c) Secondary structure calculated by DSSP as implemented in The Protein Data Bank.
These six regions included coil, turns and parts of alpha helices as their secondary structure
(tables 2 & 4). They included residues involved in intersubunit interactions (51, 89-90), binding
to the MoFe protein (58-68, 88, 97, 108-116) and coordination of the metallo cluster (87-90, 93,
96-97).
Figure 9 Conservation pattern of the Fe protein. Top image: Superimposed 1CP2 and 2AFH Fe proteins at opposite angles, composed from completely conserved residues only (score 9). The metallo cluster is represented with space filled atoms, yellow for Sulphur atoms, orange for the Fe atoms. Bottom image: Conservation scores projected onto individual Fe protein structures. Left Fe protein is 2AFH chain E, and right Fe protein is 1CP2 chain A. Coloured ribbons represent less conserved residues in the protein (scores 1-8, turquoise - pink), while the wire is composed only from completely conserved residues (score 9, maroon) in each cluster.
For the amino acid composition analysis, we performed two-tailed unpaired t-tests to identify
significant differences in amino acid composition between clusters I and III, in the
variable and conserved sections of the NifH sequences. Most amino acids in both clusters
passed the normality tests (section 5.3.1), which made them suitable for t-tests (Heeren and
D'Agostino, 1987).
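As an illustration of the unpaired comparison, the sketch below computes Welch's unequal-variance t statistic and the Welch-Satterthwaite degrees of freedom from summary statistics alone. It is not the statistics package used in this chapter, the function name is hypothetical, and the inputs are the rounded Tyr values reported in Table 7 (C1: 2.6, SD 1.9, N=58; C3: 6.2, SD 1.4, N=32), so the result is only approximate.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom,
    computed from group means, standard deviations and sample sizes."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Tyr composition in the variable sections (rounded values from Table 7).
t, df = welch_t(2.6, 1.9, 58, 6.2, 1.4, 32)
print(f"t = {t:.2f}, df = {df:.1f}")
```

With these inputs the statistic is roughly -10 on about 81 degrees of freedom, a very large effect that is consistent with the **** entry for Tyr. Converting t to a two-tailed P value additionally requires the t distribution, which is omitted here.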
The composition analysis in the conserved vs. variable sections in the partial NifH sequence,
revealed some interesting similarities and differences between cluster I (C1) and cluster III (C3,
figure 12) sequences. Under the conserved section, Cys, Leu, Pro and Thr compositions were
similar in both clusters, while His, Asn and Trp were absent (Table 6). However, Ala and Gly
differed substantially in the conserved region, as evident from their relatively high SD values
(3.5 and 4.9, respectively, Table 6). In the variable sections, the composition of six amino acids
- Ala, Asp, Glu, Ile, Arg and Val, showed no statistically significant differences between the
clusters (Table 7). In addition, Pro was nonexistent in cluster I, and rarely present in cluster III,
while Leu was the most common amino acid in both clusters (composition mean 11 and 15, C1
and C3, respectively, Table 7), followed by Glu (11, 12). Phe, Gly, His, Lys, Asn and Ser
content decreased significantly in cluster III (Table 7), while Cys, Leu, Met, Gln, Thr and Tyr
compositions increased significantly.

Table 6 Amino acid mean composition in the conserved sections of partial NifH sequences from known organisms affiliated with cluster I (C1, N=58) and cluster III (C3, N=32). Shaded cells denote highest standard deviation (SD) values.

Amino Acid  C1-Conserved (%)  C3-Conserved (%)  Mean (SD)*
Ala         11                6.1               8.6(3.5)
Cys         5.3               4.6               5(0.49)
Asp         6.6               9.1               7.9(1.8)
Glu         9.2               7.6               8.4(1.1)
Phe         1.3               1.5               1.4(0.14)
Gly         14                21                18(4.9)
His         0.0               0.0               0(0)
Ile         6.6               5.9               6.3(0.49)
Lys         2.7               3.0               2.9(0.21)
Leu         5.6               6.1               5.9(0.35)
Met         4.9               3.1               4(1.3)
Asn         0.0               0.0               0(0)
Pro         5.3               6.1               5.7(0.57)
Gln         2.6               1.5               2.1(0.78)
Arg         3.9               6.1               5(1.6)
Ser         2.7               4.5               3.6(1.3)
Thr         3.9               4.5               4.2(0.42)
Val         11                7.7               9.4(2.3)
Trp         0.0               0.0               0(0)
Tyr         3.9               1.5               2.7(1.7)

*Mean and Standard Deviation of C1 and C3 conserved values.
Table 7 Amino acid mean composition in variable sections of partial NifH sequences from known organisms affiliated with cluster I (C1, N=58) and cluster III (C3, N=32).

Amino Acid  C1-Variable (a)  C3-Variable  t-test P value summary (b)
Ala         6.9(1.8)         6.3(2.3)     N
Cys         0.81(1.5)        2.5(0.86)    ****
Asp         8.5(2.7)         7.7(2.2)     N
Glu         11(3)            12(3.5)      N
Phe         4.6(2.3)         3(1.2)       ****
Gly         7.5(1.8)         5.7(2)       ****
His         2.8(1.7)         0.29(0.68)   ****
Ile         5.8(2.2)         6.3(2.1)     N
Lys         5.7(2.6)         4.6(1.5)     **
Leu         11(2.3)          15(3)        ****
Met         3.3(1.6)         4.2(1.5)     *
Asn         6.7(2.3)         3.2(1.6)     ****
Pro         0(0)             0.35(0.73)   n/a
Gln         1.4(1.8)         2.9(1.3)     ****
Arg         2.2(2.5)         2(1.3)       N
Ser         7.7(3.3)         3.6(1.9)     ****
Thr         3.2(2.9)         4.9(2)       **
Val         8(2.3)           8.5(2.1)     N
Trp         0.16(0.58)       0.57(0.86)   *
Tyr         2.6(1.9)         6.2(1.4)     ****

(a) The mean composition of each amino acid and its standard deviation. (b) Indicates if the means were significantly different (P<0.05) according to unpaired t-tests with Welch’s correction for unequal variances. **** P< 0.0001 extremely significant, *** 0.0001 <P< 0.001 very significant, ** 0.001 <P< 0.01, * 0.01 <P< 0.05, significant. N - not significant. n/a - not applicable.
Figure 10 Superimposed structures of 1CP2 chain A and 2AFH chain E crystallographic structures highlighting similarities in their secondary structure in the NifH amplified region. Overall RMSD 0.67 Å. Helices in blue, coil in light grey and beta sheets in red. The metallo cluster position is based on 2AFH PDB file atom coordinates (orange for Fe, and yellow for S atoms).
Figure 11 Superimposed structures of 1CP2 chain A and 2AFH chain E crystallographic structures highlighting specific sections with RMSD values >1 Å. Left: Conserved and non conserved regions in which RMSD values >1 Å, highlighted with the ConSurf colour scheme. Therefore, completely conserved regions are in maroon (score ‘9’), and non-conserved are coloured turquoise to pink (scores ‘1-8’). Right: The same regions in which RMSD >1 Å, highlighted per protein structure, hence, 1CP2 is in dark blue and 2AFH is highlighted in dark green.
Figure 12 The amino acids mean composition in the partially amplified NifH sequence, from known organisms affiliated with cluster I (C1) and cluster III (C3), divided to variable vs. conserved regions. Error bars are SD.
Salt bridges analysis, based on the crystallographic structures of 1CP2 and 2AFH, revealed that most
of the bridges included highly conserved residues mainly in coil regions, with few residues in helices
or beta sheets (Table 8).
While the residues within the beta sheets were buried, almost all the other residues were exposed to
the solvent according to our previous analysis (tables 2 & 4). Four common bridges were completely
conserved in both structures, yet two unique salt bridges in 2AFH and three in 1CP2 had low
conservation scores, suggesting these specific bridges were not present in all the sequences from
cluster I or cluster III. Three unique, highly conserved salt bridges connected the two
subunits of 2AFH, yet similar intersubunit bridges were not found in 1CP2, even when our
distance criterion was relaxed to allow a distance of 7 Å between participating atoms.
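The distance criterion used above can be illustrated with a minimal sketch. It assumes a common convention in which a salt bridge is any pair of a side-chain oxygen of Asp/Glu and a side-chain nitrogen of Lys/Arg lying within the cutoff; the exact definition applied by the WHAT IF server is not reproduced here, and the function name, atom lists and coordinates are hypothetical.

```python
import math

# Side-chain atoms treated as charged (an assumed convention, not WHAT IF's own list).
ACIDIC_ATOMS = {"ASP": ("OD1", "OD2"), "GLU": ("OE1", "OE2")}
BASIC_ATOMS = {"LYS": ("NZ",), "ARG": ("NE", "NH1", "NH2")}


def find_salt_bridges(atoms, cutoff=4.0):
    """Return (acid, base, distance) for residue pairs within `cutoff` angstroms.

    `atoms` is a list of (chain, resname, resnum, atomname, (x, y, z)).
    Only the shortest O...N distance per residue pair is reported.
    """
    acids = [a for a in atoms if a[3] in ACIDIC_ATOMS.get(a[1], ())]
    bases = [a for a in atoms if a[3] in BASIC_ATOMS.get(a[1], ())]
    best = {}
    for ch_a, res_a, num_a, _, xyz_a in acids:
        for ch_b, res_b, num_b, _, xyz_b in bases:
            d = math.dist(xyz_a, xyz_b)
            if d <= cutoff:
                key = (f"{res_a}{num_a}{ch_a}", f"{res_b}{num_b}{ch_b}")
                best[key] = min(d, best.get(key, d))
    return sorted((a, b, round(d, 2)) for (a, b), d in best.items())


# Hypothetical coordinates for illustration only (not taken from 1CP2 or 2AFH).
toy_atoms = [
    ("E", "ASP", 125, "OD1", (0.0, 0.0, 0.0)),
    ("E", "ASP", 125, "OD2", (1.0, 1.0, 0.0)),
    ("E", "LYS", 15, "NZ", (3.0, 0.0, 0.0)),
    ("F", "LYS", 52, "NZ", (0.0, 6.0, 0.0)),
]

strict = find_salt_bridges(toy_atoms, cutoff=4.0)   # 4 A criterion
relaxed = find_salt_bridges(toy_atoms, cutoff=7.0)  # relaxed 7 A criterion
```

In this toy case the intersubunit pair appears only under the relaxed cutoff, which is the kind of sensitivity to the criterion discussed in the text.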
Table 8 Potential salt bridges, with a maximum interatomic distance of 4 Å, in the amplified NifH region of the Fe protein 1CP2 or 2AFH. Shaded rows represent common salt bridges.
(a) 1CP2 analysis by WHAT IF; salt bridges were not detected between the Fe protein subunits A & B. (b) Positioning was manually corrected for minor shifts per alignment. (c) Conservation scores were based on the individual ConSurf analyses of cluster III, stromatolite clones affiliated with cluster III (S3), cluster I and its affiliated stromatolite clones (S1). Scores ranged from 1 (non-conserved) to 9 (completely conserved). “-” denotes that a score was not calculated. (d) 2AFH analysis by WHAT IF; salt bridges were detected between subunits E & F, and are designated where relevant. * Yellow background denotes a residue in an α-helix, and green denotes a residue within a β-sheet. No background colour means random coil or unknown structure.
5.4 Discussion
Cluster I and cluster III NifH sequences from known organisms were analysed for their
conservation patterns, amino acid composition and structural shifts in representative Fe
proteins. Two new bioinformatic tools were evaluated: ConSurf and the I-TASSER web server
for structural prediction.
5.4.1 Methodology
Our methods included a statistical t-test analysis of the amino acid composition, a ConSurf
evolutionary analysis, and I-TASSER modelling of the Fe proteins. In general, these
methods performed well, subject to the following limitations.
Amino acid composition analysis is commonly used to identify unique patterns and characterise a
designated group of sequences. The number and length of the sequences may vary, from
a few dozen sequences to hundreds, as may the statistical tools employed,
such as Chi square tests, the significance (‘R’) formula based on standard deviation, cluster
analysis and more (Bohm and Jaenicke, 1994; Fukuchi and Nishikawa, 2001; Fukuchi et al.,
2003; Paul et al., 2008). In our analysis, the amino acid composition was compared between
NifH clusters defined from known reference NifH sequences and our specific NifH
clone sets obtained from the Shark Bay stromatolites and the Paralana Hot Springs. While most
amino acids passed the normality tests, Trp, Phe and Asp from both cluster I and cluster III did
not (tables 1 & 3).
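As an illustration of the composition analysis described above, the following minimal sketch computes the percentage of each amino acid per sequence and the mean over a group of sequences. The function names and the toy sequences are hypothetical, and the handling of gaps (ignored) is an assumption rather than the procedure used in this study.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """Percentage of each standard amino acid in one sequence.

    Gap characters and ambiguous symbols are ignored (an assumption).
    """
    counts = Counter(c for c in sequence.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}


def mean_composition(sequences):
    """Mean percentage of each amino acid over a group of sequences."""
    per_seq = [composition(s) for s in sequences]
    return {aa: sum(c[aa] for c in per_seq) / len(per_seq) for aa in AMINO_ACIDS}


# Hypothetical aligned fragments, for illustration only.
toy_group = ["GKGG-IGKST", "GKGAIGKSTA"]
group_mean = mean_composition(toy_group)
```

The per-sequence percentages, rather than the group mean alone, are what a subsequent normality test or t-test would take as input.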
However, we continued the statistical analysis with all the amino acids, because the
t-test is robust to violations of normality (Heeren and D'Agostino, 1987). The
shape of the distribution of amino acid composition in proteins is a complex matter. A normal
distribution has been suggested in the past (Smith, 1966; Nishikawa et al., 1983; Gerstein,
1998), but the question remains open. Nevertheless, we would have liked to see more amino acids pass
the normality test. Extending the amplified region of the nifH gene would provide longer
sequences, with more occurrences of each amino acid, and the shape of the distribution would
eventually become clearer. However, because some of these amino acids appear mostly in the
conserved regions of the sequence, they would always attain the same values regardless of the
data set size, and a different statistical approach should perhaps be used in those cases.
The limitation of the ConSurf analysis was tightly related to the quality of the multiple alignment,
more so than to its size. The multiple alignment quality is at the core of the ConSurf analysis
(Glaser et al., 2005). In our study we used the Muscle alignment software, and visually checked
the alignments. As reflected in independent benchmark testing of multiple alignment tools,
MAFFT and Muscle produce outputs of similar quality and both perform better than ClustalX
(Nuin et al., 2006). Large blocks of conserved motifs throughout our alignments were always
correctly aligned; however, whenever there was a single insertion, Muscle tended to position it
somewhat arbitrarily. Thus an insertion near two identical residues could take three different forms
(-xx, x-x or xx-), and affect how ConSurf computes conservation for these specific positions. If the
gap is placed inconsistently next to two identical residues, these highly conserved residues
will ‘lose’ their specific positioning within the multiple alignment, and will be marked as variable
although they are not. In a highly conserved region of the Fe protein, any minute modification to
the sequence could represent an adaptation. On a large scale, these mini-modifications might get
lost or overlooked, yet in our alignments they were observed and corrected.
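The effect of inconsistent gap placement can be shown with a toy example. The sketch below uses simple per-column identity, which is only a stand-in: ConSurf scores positions from evolutionary rates, not from identity, so the numbers are illustrative. The alignments and the function name are hypothetical.

```python
from collections import Counter


def column_identity(alignment):
    """Fraction of sequences sharing the most common symbol in each column.

    Gaps ('-') are counted as a symbol of their own.
    """
    n = len(alignment)
    return [Counter(col).most_common(1)[0][1] / n for col in zip(*alignment)]


# Toy alignments (hypothetical). One sequence carries an extra residue 'A'
# next to two identical residues 'GG'; the other three need a gap.
consistent = ["MK-GGL", "MK-GGL", "MK-GGL", "MKAGGL"]  # gap always placed as -GG
shifted = ["MK-GGL", "MKG-GL", "MKGG-L", "MKAGGL"]     # gap placed as -GG, G-G, GG-

id_consistent = column_identity(consistent)
id_shifted = column_identity(shifted)
```

With consistent gap placement both Gly columns remain fully identical, whereas the shifted placement lowers both to 0.75 even though no residue has changed, which is why such positions were inspected and corrected manually.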
In addition, positions occupied by functionally similar amino acids, e.g. Glu or Asp, will exhibit
higher rates of change than positions which require both the function and the structure of
the amino acid to be exactly the same for the protein to function at all. Hence, positions
which accept functionally similar but structurally non-identical amino acids will alternate
between those optional amino acids. Because ConSurf uses the “rate4site” algorithm, which scores
positions based on their mutational rates, such alternating positions will be scored as relatively
variable rather than conserved, and will be given lower scores. Across the NifH sequences from
known organisms in cluster III, positions with alternating Glu or Asp received a range of scores
(1, 4, 5 and 7; figure 2, table 3). Determining exactly how a specific score was assigned to these
positions requires an in-depth inspection of the algorithm, taking into account the
maximum likelihood method, the Le and Gascuel substitution matrix and the effect of the total number
of sequences in the alignment on the calculation.
The limitation of the I-TASSER modelling method was identified after we employed the
RMSD calculation method (Kabsch and Sander, 1983), as implemented in the UCSF Chimera
software (Meng et al., 2006). We performed RMSD analysis on the resolved structures of 1CP2
chain A and 2AFH chain E, and obtained an independent measure of their structural
differences (Table 5). This base analysis was later compared with our analysis of each
resolved structure against its model predicted by the I-TASSER server (figures 5 & 8). The base
analysis indicated where authentic structural changes occur between 1CP2 and 2AFH
(Table 5). Six of those sections were in the amplified region of NifH, and relatively exposed to
the solvent. They are known to undergo conformational changes upon nucleotide binding, or
upon forming a docking complex with the MoFe protein (Georgiadis et al., 1992; Tezcan et al.,
2005).
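The RMSD comparison can be sketched as follows, assuming the two structures have already been superimposed and their Cα atoms matched one-to-one (the superposition step itself, performed in this study with UCSF Chimera, is not reproduced). Function names, residue numbers and coordinates are hypothetical.

```python
import math


def per_residue_deviation(coords_a, coords_b):
    """Distance (in angstroms) between matched C-alpha atoms of two superimposed structures."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be matched one-to-one")
    return [math.dist(a, b) for a, b in zip(coords_a, coords_b)]


def rmsd(deviations):
    """Root mean square of the per-residue deviations."""
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


def shifted_positions(residue_numbers, deviations, threshold=1.0):
    """Residue numbers whose deviation exceeds the threshold."""
    return [n for n, d in zip(residue_numbers, deviations) if d > threshold]


# Hypothetical, already superimposed C-alpha coordinates (illustration only).
residues = [112, 113, 114, 115]
ca_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
ca_b = [(0.0, 0.5, 0.0), (3.8, 0.0, 1.5), (7.6, 2.0, 0.0), (11.4, 0.0, 0.5)]

deviations = per_residue_deviation(ca_a, ca_b)
overall = rmsd(deviations)
flagged = shifted_positions(residues, deviations)  # positions deviating by > 1 A
```

A low overall RMSD can coexist with a few positions that deviate by more than 1 Å, which is why the sections were examined individually rather than through the overall value alone.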
The RMSD analysis of the 1CP2 and 2AFH structures against their predicted I-TASSER models
allowed us to isolate any differences introduced by the I-TASSER process. The four sections in
1CP2 which were positioned imprecisely correlated entirely with our base analysis between the
two Fe proteins. We therefore assumed the misplacement was caused by the TM-align
procedure on the 1CP2/P00456 sequence, which identified 2AFH chain E as the best structure
to model after, even though its sequence identity to P00456 was only 69%. Hence, in the resulting
predicted model for 1CP2/P00456, the Cα atoms of seven amino acids were placed at a distance
from the resolved structure, and were mainly based on the 2AFH chain E structural alignment. The
only two residues that were slightly off in the 2AFH chain E model reflect the ab initio
modelling procedure I-TASSER employs for loop regions, which has a lower success rate than
comparative modelling against a known sequence and structural template (Zhang et al.,
2003; Moult, 2005).
5.4.2 Evolution, composition and structure in cluster I & III
The ConSurf analysis of cluster I and cluster III confirmed that most residues with important
functional roles were completely or highly conserved in both clusters. The scoring system itself,
based on maximum likelihood and the LG amino acid substitution matrix, produced a colour
scheme for the multiple alignment of each cluster (figures 3 & 6).
In general, score 8 was assigned by ConSurf to positions in which only one sequence in the
entire multiple alignment included a change of residue, while scores 6-7 were usually assigned
to exchanges between a positively charged Arg/Lys and a negatively charged Asp/Glu, or an
exchange between Gln/Thr residues with polar uncharged side chains. The latter type of
exchange points to preservation of function over structure. Lower scores were given to
exchanges of hydrophobic residues, mostly of the bulkier type (Leu/Met/Tyr/Phe). In general,
60% of cluster I residues scored 8-9 and ~11% scored 6-7, altogether suggesting that both
function and structure were highly conserved throughout cluster I. Cluster III residues scored
differently (54% scored 8-9 and 14.5% scored 6-7), suggesting that cluster III, while still
quite conserved in function and structure, is more prone to changes in its amino acid
composition than cluster I.
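The percentages quoted above correspond to a simple binning of per-position scores. A minimal sketch, using hypothetical scores and a hypothetical function name, and assuming the three score ranges shown:

```python
def score_distribution(scores, bins=((8, 9), (6, 7), (1, 5))):
    """Percentage of alignment positions whose conservation score falls in each bin."""
    total = len(scores)
    return {
        f"{low}-{high}": 100.0 * sum(low <= s <= high for s in scores) / total
        for low, high in bins
    }


# Hypothetical per-position scores (1 = variable, 9 = completely conserved).
toy_scores = [9, 9, 8, 9, 7, 6, 9, 3, 1, 9]
distribution = score_distribution(toy_scores)
```

Comparing such distributions between clusters gives the relative measure of conservation discussed in the text; it does not replace inspection of which residues occupy each bin.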
Completely conserved residues in both clusters were found in the switch I & II regions, the
4Fe:4S metallocluster, and the Walker A motif (GKGGIGGKST), as expected. In addition,
regions that were involved in at least two functional roles scored 9. For example, positions 86-99,
which were involved in MoFe binding and in the intersubunit interface, included
completely conserved blocks of residues (residue numbering according to 1CP2; Schlessman et
al., 1998). Another feature completely conserved in both clusters was Q54, as
part of a Q-loop motif (Yang et al., 2011). This motif is usually found within the ATP-binding
cassette (ABC) transporters, which also include the multidrug resistance protein MRP. The
Q-loop motif is integral to the binding of the nucleotide via its metal cofactor (Yang et al., 2011),
and its presence is not surprising considering that NifH has been affiliated phylogenetically with
the Mrp/MinD protein family, within the SIMIBI class of the P-loop GTPases (Leipe et al.,
2002). Other motifs were not completely conserved in both clusters. The Walker B hhhhDxxG
motif was partially conserved (‘h’ denotes a hydrophobic residue; residues 122-129 in
1CP2). The DxxG motif was DVLG in both clusters; however, in cluster I the preceding ‘hhhh’
included Ser/Cys/Phe (position 123), while the remaining positions were hydrophobic residues, as
expected. In cluster III, the ‘hhhh’ motif was conserved and the residue exchanges were solely
hydrophobic in nature (Val/Phe/Tyr/Ile). These structural and functional motifs are involved in
nucleotide binding as well.
The Fe protein, in general, has retained the Asn residue which is part of an Nxxx motif, a
variation on the original NKXD motif in P-loop GTPases (Leipe et al., 2002). This Asn is thought to
stabilize the guanine nucleotide binding site, and to confer specificity for GTP binding (Bourne
et al., 1991). It was not completely conserved in cluster III, in which position N106 scored only
6, as some sequences included an Asp or Gly at this position. In cluster I, N107
scored 8, as almost the entire alignment maintained this specific residue at this position.
The salt bridges in the amplified NifH region were mainly composed of exposed and
completely conserved residues, located chiefly in the coil regions. 2AFH included a larger
number of completely conserved bridges than the 1CP2 structure. The unique bridge in
2AFH, Glu92-Lys170, was observed previously (Schlessman et al., 1998); however, our
analysis indicated that two additional bridges might be present, Glu265E-Lys52F from chain E to
chain F and Glu277F-Lys52E from chain F to chain E, both of which were highly conserved (Table
8). Our analysis detected Asp129-Lys41 not as a bridge between the two subunits in 2AFH
but as a bridge within subunit E, yet it was reported as an intersubunit bridge in another Fe
protein structure, 1NIP (Schlessman et al., 1998). This pair of residues may therefore be
multifunctional, which would explain their complete conservation. Some studies have indicated
that the highly conserved Arg100 (cluster I numbering) is part of a salt bridge with Glu120 in
the alpha chain of the MoFe protein (Georgiadis et al., 1992; Burgess and Lowe, 1996), and
replacing this residue produced salt sensitivity and partial functionality of the Fe protein (Peters
et al., 1995).
Our analysis indicated this residue may also interact with Glu156 of the beta chain in the MoFe
protein. In both bridges, the distance calculated by the WHAT IF program for the participating
atoms was larger than 4.1 Å, suggesting that 4.0 Å as the maximum distance criterion
between atoms for an established salt bridge was not ideal. Glu110, Arg140 and Lys143 have also
been implicated in mutational studies as important for the protein to function under
saline conditions, as replacing them caused salt sensitivity and various degrees of uncoupling of
MgATP hydrolysis (Peters et al., 1995). The exact role of these residues is not yet
determined, though some have suggested they play a crucial role during docking
with the MoFe protein (Tezcan et al., 2005). The highly conserved salt bridge between Asp125
and Lys15 (2AFH numbering, Table 8) has been confirmed; it is known to connect
the Walker A motif (Lys15) and the switch II region, and is crucial for the conformational
changes the Fe protein undergoes once a nucleotide is bound (Georgiadis et al., 1992; Lanzilotta
et al., 1995). In addition, Asp39 and Lys15 represent a potentially important additional salt bridge
between the Walker A motif and the switch I region, which requires further study. The other salt
bridges suggested by our analysis of 1CP2 and 2AFH should be further characterised,
particularly those involving highly conserved residues such as Asp129-Lys41, Glu146-Arg3,
Glu154-Lys10 and the intersubunit salt bridges (Table 8). Because most salt bridges were
exposed to the solvent, it is possible they switch between partners within the Fe protein
and partners in the MoFe protein upon docking.
The ConSurf analysis not only confirmed the functional and structural elements in cluster I &
III, it also provided an additional layer of information, mainly with regard to residues which
maintained functionality but not structure. This was complemented by the amino acid
composition analysis.
The compositions of most amino acids in cluster I and cluster III sequences
passed robust normality tests (tables 1 and 3). They were therefore considered to follow
a Gaussian distribution, and suitable for parametric statistical tests. A two tailed unpaired t-test
analysis of the amplified region of NifH, divided into conserved vs. variable segments,
found compositional shifts in both segments (tables 7 & 8). In the conserved
region, only Ala and Gly showed any considerable shift in composition between the two
clusters, while the other amino acids remained very similar in composition, as would be expected
from conserved regions.
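The unpaired comparison can be sketched as below. The sketch assumes Welch's correction for unequal variances, as mentioned in the table notes of the following chapter, and computes only the t statistic and the Welch-Satterthwaite degrees of freedom. The function name and the sample values are hypothetical and are not data from this study.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's unpaired t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    se2_a = variance(sample_a) / n_a  # squared standard error of each mean
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical per-sequence Ala percentages for two groups (illustration only).
ala_group_1 = [10.0, 12.0, 11.0, 13.0, 9.0]
ala_group_2 = [6.0, 7.0, 5.0, 6.0]

t_stat, dof = welch_t(ala_group_1, ala_group_2)
```

The two tailed P value follows from the Student t distribution with `dof` degrees of freedom; a statistics package can supply it, for example `scipy.stats.ttest_ind(a, b, equal_var=False)`, which performs the same test.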
The variation between Ala and Gly residues in the conserved region might be an indication of
the relative effect of Ala versus Gly on helix stability within the Fe protein. Other studies
indicated these amino acids stabilised helices, but at different locations along the helix: Gly at
the N- and C-termini, and Ala in internal positions (Chakrabartty et al., 1991; Serrano et al.,
1992b; Serrano et al., 1992a). It is thought that this specific exchange of Gly and Ala impacts solvent
accessibility, and influences the exposure of hydrophobic surfaces to the solvent. Conserved
Gly or Ala residues in 2AFH or 1CP2 were positioned adjacent to important functional
domains, and most were accessible to the solvent, making them suitable for providing minor
adjustments to the solvent accessibility of the functional residues (tables 2 & 4).
Our composition analysis of the variable sections (scores 1-8) indicated that cluster III
increased its hydrophobic content and thus reduced the overall solvent-accessible surface area
(Moret and Zebende, 2007), producing a more compact Fe protein in general (Table 7).
In addition, although our analysis suggested that the variable sections differ substantially in
their amino acid compositions between cluster I and III, there were underlying common trends.
Leu was the most frequent amino acid in both clusters, and charged amino acids, such as
Asp, Glu and Arg, did not differ significantly in their composition. This was also true for several
hydrophobic amino acids, mainly Ala, Val and Ile. The fact that these groups showed no significant
change in composition, although located in the variable section of the NifH sequence, points to
their involvement in a functional role via their side chains, perhaps in a similar fashion to
Arg100, Arg140 and Lys143 (2AFH sequence numbering). Mutational studies revealed that these
conserved amino acids provided essential ionic support during complex formation with the
MoFe protein (Peters et al., 1995); perhaps Asp, Glu and Arg in the variable segments provide
similar support.
Prior to analysing a clone based Fe protein structure, it was of interest to check how the
I-TASSER server would perform on the 1CP2 and 2AFH sequences vs. their known X-ray
crystallographic structures, in order to independently gauge the server's performance. Overall, the
I-TASSER models were in good agreement with the crystallographic structures of the
2AFH and 1CP2 Fe proteins (figures 5 & 8). The performance of the I-TASSER server was
reasonably accurate, taking into consideration known server limitations, such as an average error of
0.08 for the TM-score and 2 Å for RMSD (Zhang, 2008). All models used in this study had
TM-scores higher than 0.5, C-scores in the range of 1.34-2.14 and RMSD values of 1.7-2.3 Å.
The lower RMSD of the 2AFH I-TASSER model relative to the 1CP2 model was most probably due to the
fact that there are more resolved Fe protein structures from A. vinelandii (20 in total) than from C.
pasteurianum (2). The resulting 2AFH model was therefore more accurate than the 1CP2
model, with an RMSD value of 0.689 Å between 1CP2 and its I-TASSER model, and 0.529 Å
between 2AFH and its I-TASSER model. The TM-align software (Zhang and Skolnick, 2005)
ranked 2AFH chain E as the best structural match to the sequence of 1CP2 chain A, even
though the sequence identity was only 0.69, and we suspected this introduced bias in the
predicted models of the cluster III clones.
In total there were seven mismatched positions in the 1CP2 model, meaning that in the
predicted model the Cα atoms of seven amino acids were placed at a distance from the resolved
structure. According to our previous analyses (Table 2) these residues were exposed to the
solvent, and participated in two coil sections, one hydrogen bond turn and one beta sheet,
respectively, and were completely conserved, except for Leu52 and Thr115-Asp116 (ConSurf
scores 7, and 1-1, respectively). As expected, there were only two minor mismatched positions in
the predicted model for 2AFH chain E. According to our previous analyses, G96 is completely
conserved in cluster I, located at the end of an α-helix, buried and close to a loop region (Table
4), while Glu116 is a highly variable position (score 1; E/D/V/S, sometimes absent), exposed to
the solvent, and part of a bend near the end of an α-helix. Coil and loop regions are notoriously
hard to model accurately (Moult, 2005), so these results were not surprising.
When the actual crystallographic structures of 1CP2 and 2AFH were superimposed,
the mismatched residues in the amplified region of NifH included most of the positions we
reported as mismatches between 1CP2 and its I-TASSER model (Table 5, figure 5). This
suggested again that the lack of additional structures affiliated with cluster III in the PDB
template library most probably caused a slight bias in the I-TASSER process. However,
because our analysis identified those positions, we will be able to inspect them carefully in
future analyses.
Projecting the ConSurf evolutionary scheme onto the resolved structures of 1CP2 chain A and
2AFH chain E demonstrated that, in the amplified section of the NifH sequence, the most
conserved regions were switch I and II and the residues coordinating the 4Fe:4S metallocluster (see
figure 9). Combining the RMSD analysis with the ConSurf evolutionary scheme suggested that
structural shifts occur within specific regions, which included both highly conserved and
non-conserved residues (figure 11). The region 113-118 was particularly prone to insertions and
structural shifts, and this region is chiefly involved in docking between the Fe
protein and the MoFe protein of the nitrogenase (Peters et al., 1995; Tezcan et al., 2005). In
general, our RMSD analysis was in agreement with the RMSD analysis presented by
Schlessman et al. (1998) on the 1CP2 Fe protein structure and the A. vinelandii Fe protein denoted
Av2, at 2.13 Å resolution.
5.5 Concluding remarks
NifH sequences from known organisms affiliated with cluster I or cluster III were analysed in
terms of their conservation patterns, amino acid composition and existing and potential
structural attributes. Our methods included a statistical t-test analysis of the amino acid
composition, a novel ConSurf evolutionary analysis, and the use of the I-TASSER web server.
These methods performed well in general, with few limitations.
The results suggested that cluster III was slightly less conserved than cluster I, and
contained more hydrophobic residues. A possible role for the Ala and Gly residues as
interchangeable stabilisers of the alpha helices in the Fe protein was suggested as well.
The main known difference between cluster I and cluster III is that the latter includes strictly
anaerobic species, while cluster I includes both aerobic and anaerobic species (see section 1.4,
chapter 1). Our analysis highlights the underlying changes which facilitate this
specialisation in cluster III diazotrophs.
Chapter 6 Halophilic and thermophilic adaptations in the Fe protein
_______________________________________________
6.1 Introduction
Clones obtained from columnar stromatolites (chapter 3) and Paralana Hot Springs (chapter 4)
were phylogenetically affiliated mainly with cluster I and cluster III of the nifH phylogenetic
tree (Zehr et al., 2003a; Raymond et al., 2004a). We expected that halophilic adaptations would
manifest themselves to some extent in the nifH genes from columnar stromatolites of Shark
Bay, because representatives of the halophilic Halobacteriales have previously been detected in
stromatolites (Goh et al., 2006; Allen et al., 2008; Allen et al., 2009), as have
Haloanaerobiales in Guerrero Negro microbial mats (Ley et al., 2006). The
archaeon Halococcus hamelinensis, isolated from Shark Bay stromatolite mats, has been found
to employ mainly glycine betaine as an osmolyte (Goh et al., 2011), while 18 cyanobacterial
isolates from the Oscillatoriales, Chroococcales and Pleurocapsales orders have been found to
accumulate predominantly various saccharides, glycine betaine and trimethylamine-N-oxide
(Goh et al., 2010). While halophilic archaeal diazotrophs were not detected in our
analysis (chapter 3), we did detect representatives of the Cyanobacteria. Thus, Shark Bay hosts
potential nitrogen fixers with known halophilic adaptive strategies.
Halophilic adaptations may include an increase in acidic residues (Asp, Glu), a decrease in large
hydrophobic residues and their replacement with small hydrophobic residues such as Ala, Gly
and Val, and a lower Lys content, alongside an increase in salt bridges within monomers and
between subunits (Lanyi, 1974; Rao and Argos, 1981; Madern et al., 1995; Madern et al., 2000;
Fukuchi et al., 2003). The main ‘threat’ to a protein under saline conditions is the excess of salt
ions in the solvent, which prevents proper bonding with the water molecules and promotes
aggregation (Bolhuis et al., 2008). The increase in negative charges in a protein, through the increase
in acidic residues, acts as a charged screen against the salt ions and attracts water molecules
to the protein (Bolhuis et al., 2008). Other studies suggested that the salt bridges were stabilized
at times by the solvent salt ions, thus harnessing the solvent to preserve the protein structure and
function (Eisenberg, 1995; Madern et al., 2000). The change in hydrophobicity helps the protein
to remain flexible under saline conditions and prevents aggregation (Jaenicke and Böhm, 1998;
Madern et al., 2000). Together, these changes provide different mechanisms which enable a protein to
function under extreme saline conditions, such as those surrounding Shark Bay stromatolites.
Similar information about known thermophilic diazotrophs in Paralana Hot Springs (PHS) is
scarce. However, reports of active diazotrophs in hot springs and hydrothermal vents (Mehta and
Baross, 2006; Hamilton et al., 2011b) and recent analyses of thermophilic proteins (Siddiqui
and Thomas, 2008) suggest that a thermophilic diazotroph with unique adaptations might
reside in PHS. Thermophilic adaptations usually include an increase in charged amino acids
and some hydrophobic amino acids (Ile, Met, Val, Tyr), as well as an increase in Pro and a
decrease in Gly content (Kumar and Nussinov, 2001; Somero, 2003). A decrease in uncharged
polar amino acids such as Ser, Thr, Asn and Gln was also observed in various thermophilic
proteins (Georlette et al., 2003; Daniel et al., 2008). Structural adaptations may involve an
increase in salt bridges within monomers and between subunits, and a decrease in protein
size, usually through the removal of loop regions and sections of the N- and C-termini (Fields, 2001;
Daniel et al., 2008). The increase in charged residues and salt bridges strengthens the ionic networks
which stabilize the protein at higher temperatures and prevent unfolding. Removal of Asn and
Gln stabilizes the protein in general, as these amino acids tend to deamidate at higher
temperatures (Kumar and Nussinov, 2001). The increase in hydrophobic residues, specifically at
the core of the protein, enhances hydrophobic interactions and increases its overall
thermostability. In general, thermophilic proteins strengthen their hydrophobic, electrostatic and van der
Waals interactions and hydrogen bonds to prevent unfolding at higher temperatures, and in the process
become compact and rigid relative to mesophilic and psychrophilic proteins (Siddiqui and
Cavicchioli, 2006; Daniel et al., 2008).
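The compositional indicators listed in the two paragraphs above can be summarised per sequence, as in the following minimal sketch. The grouping of residues follows the text, except that the charged group is assumed here to be Asp, Glu, Lys and Arg (His is not counted); the group labels, function name and toy sequence are hypothetical.

```python
# Residue groups named in the text; "charged" is an assumption (His excluded).
INDICATOR_GROUPS = {
    "acidic": "DE",             # Asp, Glu
    "lys": "K",
    "small_hydrophobic": "AGV",  # Ala, Gly, Val
    "charged": "DEKR",
    "imvy": "IMVY",             # Ile, Met, Val, Tyr
    "pro": "P",
    "gly": "G",
    "polar_uncharged": "STNQ",   # Ser, Thr, Asn, Gln
}


def indicator_fractions(sequence):
    """Fraction of residues in each indicator group (gaps are ignored)."""
    residues = [c for c in sequence.upper() if c.isalpha()]
    if not residues:
        raise ValueError("sequence contains no residues")
    return {
        name: sum(c in members for c in residues) / len(residues)
        for name, members in INDICATOR_GROUPS.items()
    }


# Hypothetical 16-residue fragment, for illustration only.
toy_sequence = "MDEKRAGVSTNQIPLY"
fractions = indicator_fractions(toy_sequence)
```

Such fractions are descriptive only; whether a shift between a clone set and its reference cluster is meaningful still depends on the statistical comparison of the per-sequence values.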
Our aim was to assess halophilic and thermophilic adaptations in the inferred NifH sequences
from columnar stromatolites of Shark Bay and from the microbial communities at Paralana Hot
Springs, respectively.
6.2 Material and methods
6.2.1 Evolutionary conservation
Analysed as described in section 5.2.1, chapter 5.
6.2.2 Residue composition
Analysed as described in section 5.2.2, chapter 5.
6.2.3 Statistical analysis
Analysed as described in section 5.2.3, chapter 5.
6.2.4 Distance matrices
Subsets of the individual multiple alignments were converted to PHYLIP format using Readseq
(Gilbert, 2003) on the EMBL-EBI web server (EMBL-European Bioinformatics Institute) and
submitted to “PHYLIP Protdist” version 3.67 (Felsenstein, 2007), available via the Mobyle web
portal (http://mobyle.pasteur.fr/cgi-bin/portal.py#forms::protdist), to create distance matrices as
described previously (chapter 2, section 2.2.6). The 1CP2 and 2AFH NifH sequences were extracted
from their respective PDB files by the WHAT IF web server version 8.0 (Vriend, 1990) and
trimmed to include only one copy of NifH for this calculation.
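The idea of a pairwise distance matrix can be illustrated with the uncorrected p-distance, i.e. the proportion of differing residues over aligned columns. This is a simplified stand-in: Protdist computes model-based, corrected distances, so the values below would not match its output. Sequence names, sequences and function names are hypothetical.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing residues over columns where neither sequence has a gap."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    return sum(a != b for a, b in pairs) / len(pairs)


def distance_matrix(alignment):
    """Pairwise p-distances for a dict mapping sequence name to aligned sequence."""
    names = list(alignment)
    return {(m, n): p_distance(alignment[m], alignment[n]) for m in names for n in names}


# Hypothetical aligned fragments (names and sequences invented for illustration).
toy_alignment = {
    "ref": "GKGGIGKSTT",
    "cloneA": "GKGGIGKSTS",
    "cloneB": "GKGG-GKATS",
}
matrix = distance_matrix(toy_alignment)

# Clones within a chosen maximum distance of the reference (threshold is illustrative).
close_to_ref = [n for n in toy_alignment if n != "ref" and matrix[("ref", n)] <= 0.15]
```

A threshold on the distance to a reference sequence, as in the last line, is one simple way to select clones for further structural modelling.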
6.2.5 Structural characteristics
3D crystallographic representatives of the Fe protein from the mesophilic Azotobacter vinelandii,
PDB file ID 2AFH (Burgess et al., 1980; Tezcan et al., 2005), and Clostridium pasteurianum,
PDB file ID 1CP2 (Schlessman et al., 1998), were chosen in order to assess potential structural
changes in the clone libraries in relation to cluster I and cluster III, respectively. Structural
characteristics were calculated as described in section 5.2.4, chapter 5. Protein images were
created using the UCSF Chimera program, version 1.6.2 (Pettersen et al., 2004). The I-TASSER
online server provided 3D models for chosen clone sequences, which were superimposed on
the 1CP2 or 2AFH structure using the “MatchMaker” option in the UCSF Chimera software (Pettersen et
al., 2004; Meng et al., 2006) with the Smith-Waterman (Pearson, 1991) alignment algorithm and
other options left at their defaults (BLOSUM 62 substitution matrix and 30% weighting of the
secondary structure term). Salt bridges were defined by the WHAT IF web server (Vriend,
1990), version 10.1a (http://swift.cmbi.ru.nl/servers/html/index.html), and restricted to an
interatomic distance of less than 4.0 Å, as described in section 5.2.4, chapter 5.
Figure 2 Mean amino acid composition in the partial NifH sequence of cluster III (C3) and affiliated stromatolite clones (S3), divided into variable vs. conserved regions of NifH. Error bars are SD.
Figure 3 Mean amino acid composition in the partial region of NifH from cluster I (C1) and affiliated stromatolite clones (S1), divided into variable vs. conserved regions of NifH. Error bars are SD.
Table 1 The amino acid mean composition in the conserved sections in cluster I (C1, N=58), cluster III (C3, N=32) and affiliated clones (S1=44, S3=18). Shaded cells denote high Standard Deviation values (SD).

Amino Acid     Ala    Cys    Asp    Glu    Phe    Gly    His    Ile    Lys    Leu
C1-Conserved   11     5.3    6.6    9.2    1.3    14     0.0    6.6    2.7    5.6
S1-Conserved   12     3.6    5.9    9.5    2.4    14     1.2    4.8    3.6    8.4
SD             0.71   1.2    0.49   0.21   0.78   0.0    0.85   1.3    0.64   2.0
C3-Conserved   6.1    4.6    9.1    7.6    1.5    21     0.0    5.9    3.0    6.1
S3-Conserved   4.1    4.2    5.5    8.0    1.4    16     0.0    6.9    1.3    17
SD             1.4    0.28   2.5    0.28   0.071  3.5    0.0    0.71   1.2    7.7

Amino Acid     Met    Asn    Pro    Gln    Arg    Ser    Thr    Val    Trp    Tyr
C1-Conserved   4.9    0.0    5.3    2.6    3.9    2.7    3.9    11     0.0    3.9
S1-Conserved   3.50   0.0    4.8    2.4    2.4    3.6    2.4    11     0.0    4.8
SD             0.99   0.0    0.35   0.14   1.1    0.64   1.1    0.0    0.0    0.64
C3-Conserved   3.1    0.0    6.1    1.5    6.1    4.5    4.5    7.7    0.0    1.5
S3-Conserved   2.4    1.4    5.5    1.4    2.8    5.3    4.2    7.0    0.0    5.5
SD             0.49   0.99   0.42   0.071  2.3    0.57   0.21   0.49   0.0    2.8

Table 2 Amino acid compositions in cluster I (C1, N=58), cluster III (C3, N=32) and affiliated clones (S1=44, S3=18) in the variable portions of the amplified section in NifH.

Amino Acid     Ala    Cys    Asp    Glu    Phe    Gly    His    Ile    Lys    Leu
(a) Mean ratios, with standard deviation in parentheses. C1 - cluster I, C3 - cluster III, S1 - stromatolite clones affiliated with cluster I and S3 - stromatolite clones affiliated with cluster III. (b) Two tailed t-test P value summary. Means were significantly different (P<0.05) according to unpaired t-tests with Welch’s correction for unequal variances. **** P<0.0001, extremely significant; *** 0.0001<P<0.001, very significant; ** 0.001<P<0.01, * 0.01<P<0.05, significant. N - not significant. n/a - not applicable. ‘↑’ - an increased amount of amino acid, ‘↓’ - a decreased amount of amino acid.
An increase in the Leu composition within the conserved segments was also evident in the affiliated
stromatolite clones of cluster I (figure 3, table 1). All other amino acids in the conserved segments had
low standard deviation values and did not vary in their composition, relative to the reference cluster. In
the variable segments, significant ratio changes were found in 16 amino acids. A significant
increase of Glu, Ile, Leu, Asn, Arg and Thr (table 2) and a significant decrease of Ala, Cys, Asp, Phe,
Gly, His, Lys, Gln, Val and Tyr were observed. Pro and Trp were absent, and the Met and Ser
compositions did not vary significantly in the clones affiliated with cluster I.
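The composition figures reported here are means and standard deviations of per-sequence percentages over a chosen set of alignment columns. The sketch below illustrates one way such values could be computed. It is not the procedure used in this work: the function names, the toy alignment and the column indices are all hypothetical, and gaps are simply skipped.

```python
from statistics import mean, stdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def percent_composition(sequence, columns):
    """Percent of each amino acid among the residues at the given 0-based columns."""
    # Gap characters and other non-standard symbols are skipped.
    residues = [sequence[i] for i in columns if sequence[i] in AMINO_ACIDS]
    if not residues:
        raise ValueError("no residues in the selected columns")
    return {aa: 100.0 * residues.count(aa) / len(residues) for aa in AMINO_ACIDS}


def composition_summary(sequences, columns):
    """Mean and sample SD of the per-sequence percentages for each amino acid."""
    per_seq = [percent_composition(seq, columns) for seq in sequences]
    return {
        aa: (mean([p[aa] for p in per_seq]), stdev([p[aa] for p in per_seq]))
        for aa in AMINO_ACIDS
    }


# Hypothetical three-sequence alignment; columns 0 and 3 play the role of
# the "variable" segment, columns 1 and 2 the "conserved" segment.
toy_alignment = ["AELG", "AELL", "GELA"]
variable_summary = composition_summary(toy_alignment, [0, 3])
conserved_summary = composition_summary(toy_alignment, [1, 2])
```

Calling the summary once per segment, as above, keeps the variable and conserved columns separate, which mirrors the split used in tables 1 and 2.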
Potential structural changes to the Fe protein were assessed by submitting 20 stromatolite NifH
clones to the I-TASSER server for model prediction. The clones formed two groups of ten, each at
a maximum distance of 0.23 from the 1CP2 or 2AFH sequences (see table 3). The RMSD
values and evolutionary conservation of the I-TASSER models were analysed relative to the 1CP2 or
2AFH structures.
The 10 stromatolite clones affiliated with cluster III were at a distance of 0.14-0.23 from the 1CP2
sequence and their model RMSD values, according to I-TASSER, ranged from 0.56 (RSA13796)
to 1.05 Å (table 3). Seven sections in the clones' Fe protein models showed an
average RMSD value higher than 1 Å, and positions 50-51 and 113-115 presented average RMSD
values higher than 2 Å, indicating a larger shift in the structural alignment compared to the 1CP2 chain
A structure (table 4, figure 5). These two sections were composed of conserved and non conserved
residues, were exposed to the solvent, and their predicted secondary structures included coils, bends and
turns. The 10 stromatolite clones affiliated with cluster I were at a distance of 0.08-0.22 from the
2AFH sequence (table 3), and their RMSD values ranged from 0.36 (RSA13396) to 0.61 Å. Three
sections showed RMSD values above 1 Å and, unlike in the previous analysis of cluster III
affiliated clones, none had values above 2 Å (figure 6). According to the analysis of 2AFH and its
affiliated clones, two Gly residues showed minor structural shifts; one Gly was denoted buried and the
other exposed to the solvent, while the third section, 115-118, was mostly exposed (table 5). The
secondary structure of these sections was characterised as coil, turns and bends, and they included
conserved and non conserved residues, according to the ConSurf analysis.
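The sections with RMSD above 1 Å or 2 Å can be pictured as runs of consecutive positions whose deviation from the reference exceeds a cutoff. The following is a minimal sketch of that idea, assuming the model and reference coordinates (for example Cα atoms) are already superimposed and paired position by position. The coordinates, function names and numbering are hypothetical and do not reproduce the I-TASSER or TM-align output.

```python
import math


def per_residue_deviation(model_ca, reference_ca):
    """Distance (Å) between paired atoms of two already-superimposed structures."""
    if len(model_ca) != len(reference_ca) or not model_ca:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    return [math.dist(a, b) for a, b in zip(model_ca, reference_ca)]


def overall_rmsd(deviations):
    """Root mean square of the per-residue deviations."""
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


def sections_above(deviations, cutoff=1.0, first_position=1):
    """Contiguous position ranges (inclusive) whose deviation exceeds the cutoff."""
    sections, start = [], None
    for offset, d in enumerate(deviations):
        position = first_position + offset
        if d > cutoff and start is None:
            start = position
        elif d <= cutoff and start is not None:
            sections.append((start, position - 1))
            start = None
    if start is not None:
        sections.append((start, first_position + len(deviations) - 1))
    return sections


# Hypothetical coordinates for four consecutive positions starting at 50.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(0.0, 0.0, 0.0), (1.0, 1.5, 0.0), (2.0, 0.0, 0.0), (3.0, 2.5, 0.0)]
deviations = per_residue_deviation(model, reference)
shifted_1A = sections_above(deviations, cutoff=1.0, first_position=50)
shifted_2A = sections_above(deviations, cutoff=2.0, first_position=50)
```

Averaging the per-position deviations over several clone models before calling `sections_above` would give values comparable in spirit to the "Average RMSD" rows of tables 4 and 5.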
2AFH      0     2.12  0.9897  0.54  1
RSA13596  0.08  2.14  0.9926  0.56  0.96
RSA13396  0.09  1.82  0.996   0.36  0.949
RSA10296  0.12  1.81  0.994   0.41  0.935
RSA11596  0.13  2.15  0.9953  0.43  0.95
RSA11496  0.17  1.79  0.989   0.61  0.916
RSA10196  0.18  1.76  0.989   0.44  0.913
RSA7904   0.19  1.85  0.99    0.43  0.916
RSA6104   0.2   1.57  0.986   0.42  0.867
RSA6704   0.2   1.85  0.989   0.48  0.912
RSA6904   0.22  1.76  0.987   0.5   0.889
(*) Distances as calculated by PHYLIP “Protdist” version 3.67 (Felsenstein, 2007).
(a) Confidence score for estimating the quality of the predicted top model by I-TASSER (Roy et al., 2010).
(b) TM-score of the structural alignment between the query structure and known structures in the PDB library (Zhang and Skolnick, 2005).
(c) The overall RMSD between residues that were structurally aligned by TM-align (Zhang and Skolnick, 2005).
(d) The percentage sequence identity in the structurally aligned region (Roy et al., 2010).
Table 4 Residue characteristics of stromatolite clones affiliated with cluster III with regions of RMSD >1 Å.
Residue positions*           41-42      50-51      62-63  65-68         89     91-93      113-115
Amino acid(a)                AD         GL         EE     EDVE          E      GVG        TDD
Conservation score(b)        99         97         18     7571          9      999        111
Main secondary structure(c)  bend/coil  bend/coil  helix  3-helix turn  coil   turn/bend  coil/turn
Solvent accessibility(d)     ee         ee         ee     ee--          e      ---        eee
Average RMSD (Å) (e)         1.245      2.252      1.316  1.142         1.053  1.412      2.328
* Position number according to 1CP2, chain A sequence, P00456 accession ID. (a) The amino acid in each position in the respective 1CP2 sequence. (b) ConSurf conservation scores for cluster III, 1-9, non-conserved to completely conserved, respectively. (c) The secondary structure based on the crystallographic structure of 1CP2. (d) Solvent accessibility - Buried (b) or exposed (e) residue, (-) varying degrees of exposure. See Table 2 for full details. (e) Average RMSD values for specific positions, in which the clones presented RMSD values >1 Å, relative to the 1CP2 structure.
Figure 5 Two opposite angles of 1CP2 Fe protein superimposed with ten stromatolite clones I-TASSER models. Magenta highlights areas where RMSD >1 Å. The largest structural shifts, where RMSD >2 Å and the site of the two residue insertion are in red, see table 12 for further details.
Figure 6 Two different angles of 2AFH Fe protein superimposed with its closest ten stromatolite clones I-TASSER models. Magenta highlights areas where RMSD >1 Å, as per table 13.
Table 5 Residue characteristics of stromatolite clones affiliated with cluster I with regions of RMSD >1 Å.
Residue positions*           94     96     115-118
Amino acid(a)                G      G      YEDD
Conservation score(b)        7      9      9113
Main secondary structure(c)  turn   coil   bend & turn
Solvent accessibility(d)     e      b      eeee
Average RMSD (Å) (e)         1.379  1.028  1.016
* Position number according to 2AFH, P00459 sequence accession ID, chain E. (a) The amino acid in each position in the respective sequence. (b) ConSurf conservation scores for cluster I, 1-9, non-conserved to completely conserved, respectively. (c) The secondary structure based on the crystallographic structure of 2AFH.
(d) Solvent accessibility - Buried (b) or exposed (e) residue, (-) varying degrees of exposure. See Table 4 for full details. (e) Average RMSD values for specific positions, in which the clones presented RMSD values >1 Å, relative to the 2AFH structure.
According to the analysis we performed on six stromatolite clones (three for each cluster), there was a
total of 11 common salt bridges for the stromatolite clones and 1CP2 or 2AFH (table 14), and 15
unique salt bridges which were not detected in 1CP2 or 2AFH, under the enforced 4 Å limit on the
interatomic distance between the side chain oxygen atoms in Asp or Glu and the side chain nitrogen
atoms in Arg, Lys or His (table 6). Two unique salt bridges were highly conserved in S3 and S1, and were at
positions Asp42-Arg45 and Asp128-Lys9 (residue numbering according to S3, underlined in table 6).
The additional negative residues that sometimes appear in the region of 113-115 in stromatolites
(tables 4 & 5) seem to strengthen ionic bonds, mainly but not exclusively with Lys32 and Lys84. Salt
bridges in S3, corresponding to the above mentioned region, were detected in our analysis but the
interatomic distances ranged from 4.5 to 6.99 Å, and were therefore not specified in table 6. These
salt bridges included Asp residues which interacted mainly with Lys30 and Met1, residues which
scored 1 and 9 for conservation, respectively, in cluster III. The salt bridges also included Lys residues
in this region that interacted with Glu113, but at a distance of 6.52 Å, and therefore were not specified
in table 6.
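The distance criterion described above (side chain oxygens of Asp or Glu within 4 Å of side chain nitrogens of Arg, Lys or His) can be sketched as a simple pairwise search. This is only an illustration of the criterion and is not the WHAT IF implementation. The atom tuples and coordinates are hypothetical, while the atom names follow the standard PDB naming for these side chains.

```python
import math

# Standard PDB names of the charged side chain atoms considered here.
ACID_OXYGENS = {"ASP": {"OD1", "OD2"}, "GLU": {"OE1", "OE2"}}
BASE_NITROGENS = {"ARG": {"NE", "NH1", "NH2"}, "LYS": {"NZ"}, "HIS": {"ND1", "NE2"}}


def find_salt_bridges(atoms, cutoff=4.0):
    """Return (acid, base, distance) for residue pairs within the cutoff.

    `atoms` is a list of (residue_name, residue_number, atom_name, (x, y, z)).
    For each residue pair, only the shortest O-N distance is reported.
    """
    acids = [a for a in atoms if a[2] in ACID_OXYGENS.get(a[0], ())]
    bases = [a for a in atoms if a[2] in BASE_NITROGENS.get(a[0], ())]
    best = {}
    for acid_name, acid_num, _, acid_xyz in acids:
        for base_name, base_num, _, base_xyz in bases:
            distance = math.dist(acid_xyz, base_xyz)
            key = ((acid_name, acid_num), (base_name, base_num))
            if distance <= cutoff and distance < best.get(key, float("inf")):
                best[key] = distance
    return sorted((acid, base, round(d, 2)) for (acid, base), d in best.items())


# Hypothetical coordinates: one pair inside the 4 Å limit, one outside it.
toy_atoms = [
    ("ASP", 42, "OD1", (0.0, 0.0, 0.0)),
    ("ASP", 42, "OD2", (0.0, 1.0, 0.0)),
    ("ARG", 45, "NH1", (2.65, 0.0, 0.0)),
    ("LYS", 30, "NZ", (5.0, 0.0, 0.0)),
]
bridges = find_salt_bridges(toy_atoms)
```

Raising `cutoff` (for instance to 7 Å) would also list the longer-range pairs of the kind mentioned in the text but left out of table 6.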
Table 6 Potential salt bridges, with maximum interatomic distance of 4 Å, in the amplified NifH region of the Fe protein. Shaded rows represent common salt bridges present in the representative clones and the selected structure, 1CP2 or 2AFH.
(a) 1CP2 analysis by WHAT IF, salt bridges were not detected between the Fe protein subunits A & B. (b) Positioning was manually corrected for minor shifts per alignment.
(c) Conservation score was based on the individual analysis of ConSurf on cluster I, affiliated stromatolite clones (S1), cluster III and stromatolite clones affiliated with cluster III (S3). Scores ranged from 1 to 9, non-conserved to completely conserved, respectively. “-” score was not calculated. (d) Based on the WHAT IF analysis on the I-TASSER PDB files of cluster III stromatolite clones: RSA14196, RSA11996 and RSA98963. (e) 2AFH analysis by WHAT IF, salt bridges were detected between subunits E & F, and are designated where relevant. (f) Based on the WHAT IF analysis on the I-TASSER PDB files of cluster I stromatolite clones: RSA13596, RSA7904 and RSA6904. * A yellow background colour denotes a residue in an α-helix structure, and a green colour denotes a residue within a β-sheet. No background colour means random coil or unknown structure.
6.3.2 Potential thermophilic adaptations in the Fe protein
The conservation analysis of the Paralana Hot Springs (PHS) clones revealed that 54% of the
amplified region sequence scored 8 & 9, while 13% scored 6 & 7. Several sections were completely
conserved: the nucleotide binding site, the intersubunit interface within the Fe protein, the MoFe binding
residues, the metallocluster and the two switch regions. Positions 80 and 116 had additional variants
in comparison to cluster I & III (figure 7, highlighted residues in bold). These positions were highly
variable in PHS and the original clusters; however, in the PHS clones several sequences included a
Cys at position 80 and several included a Lys at position 116, and neither variant was
present in these positions in the original clusters. Position N106 was completely conserved throughout
the PHS alignment, though not so in the original clusters.
A statistical analysis of the amino acid composition revealed significant shifts
in the variable segments of the NifH sequences in Paralana Hot Springs (PHS) clones affiliated with
cluster III. There were significant ratio changes in 10 amino acids: a significant increase in Asp, Phe,
Pro, Arg and Val, and a significant decrease in Ala, Glu, Leu, Ser and Tyr (figure 8, table 7). The
composition of Cys, Gly, His, Ile, Lys, Met, Asn, Gln and Thr did not change significantly. In the
conserved segment, Gly, Leu and Tyr content varied, according to their SD values of 2.1, 2.8 and 2.3,
respectively (table 8), while SD values for the other amino acids ranged from 0.0 to 1.2. PHS clones
affiliated with cluster I showed a significant increase in the content of Cys, Ile, Leu, and Arg in the
variable section (figure 9, table 7, P1:C1). Ala, Glu, Gly, Lys, Asn, and Gln content decreased
significantly, while eight other amino acids did not change significantly. In the conserved segment,
Glu content increased in P1 clones (SD = 2.7, table 8), compared to cluster I, while SD for the other
amino acids remained low and ranged from 0 to 1.4.
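The significance calls in table 7 come from unpaired t-tests with Welch's correction. As a rough illustration of what that test computes, the sketch below derives the Welch t statistic and the Welch-Satterthwaite degrees of freedom from summary statistics (mean, SD, N). The function name is hypothetical and the example uses the Pro values listed for P3 and C3 in table 7; converting t and the degrees of freedom into a two-tailed P value requires the Student t distribution and is not attempted here.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and degrees of freedom from summary statistics."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Pro in the variable sections: P3 = 3.95 (SD 2.20, N=16), C3 = 0.35 (SD 0.73, N=32).
t_pro, df_pro = welch_t(3.95, 2.20, 16, 0.35, 0.73, 32)
```

Because the two groups differ in both size and variance, the degrees of freedom come out well below the pooled value of N1 + N2 - 2, which is the point of the correction.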
Figure 9 The amino acids mean composition in the amplified NifH amino acid sequence of cluster I (C1) and affiliated PHS clones (P1). Divided into variable vs. conserved regions of NifH. Error bars are SD.
Figure 8 The amino acids mean composition in the amplified NifH amino acid sequence of cluster III (C3) and affiliated PHS clones (P3). Divided into variable vs. conserved regions of NifH. Error bars are SD.
Table 7 Amino acid compositions in cluster I (C1, N=58), cluster III (C3, N=32), and affiliated clones (P1=20, P3=16) in the variable sections of NifH amino acid sequences.
Amino Acid  Ala          Cys          Asp          Glu          Phe          Gly          His          Ile          Lys          Leu
C3(a)       6.3(2.3)     2.5(0.86)    7.7(2.2)     12(3.5)      3(1.2)       5.7(2)       0.29(0.68)   6.3(2.1)     4.6(1.5)     15(3)
P3          4.91(1.66)   1.74(1.40)   10.62(2.70)  8.44(3.68)   4.51(1.31)   7.30(4.20)   0.17(0.69)   6.60(1.92)   6.38(3.80)   12.74(2.28)
C1          6.9(1.8)     0.81(1.5)    8.5(2.7)     11(3)        4.6(2.3)     7.5(1.8)     2.8(1.7)     5.8(2.2)     5.7(2.6)     11(2.3)
P1          5.88(1.03)   2.72(1.11)   8.45(3.58)   6.16(3.52)   3.71(2.56)   5.62(1.53)   2.17(1.29)   10.93(2.72)  4.30(2.12)   13.42(4.31)
C3:C1(b)    N            ****         N            N            ****         ****         ****         N            **           ****
P3:C3       *            N            **           **           ***          N            N            N            N            **
P1:C1       **           ****         N            ****         N            ****         N            ****         *            *
Amino Acid  Met          Asn          Pro          Gln          Arg          Ser          Thr          Val          Trp          Tyr
C3          4.2(1.5)     3.2(1.6)     0.35(0.73)   2.9(1.3)     2(1.3)       3.6(1.9)     4.9(2)       8.5(2.1)     0.57(0.86)   6.2(1.4)
P3          4.73(1.40)   2.21(3.40)   3.95(2.20)   3.25(2.17)   3.40(2.44)   1.52(2.21)   4.50(1.60)   11.94(3.62)  0(0)         1.09(1.46)
C1          3.3(1.6)     6.7(2.3)     0(0)         1.4(1.8)     2.2(2.5)     7.7(3.3)     3.2(2.9)     8(2.3)       0.16(0.58)   2.6(1.9)
P1          4.02(1.41)   3.90(1.45)   0.14(0.62)   0.15(0.68)   6.95(3.19)   6.96(3.19)   4.03(2.39)   7.74(3.42)   0(0)         2.75(1.75)
C3:C1       *            ****         n/a          ****         N            ****         **           N            *            ****         +
P3:C3       N            N            ****         N            *            **           N            **           n/a          ****
P1:C1       N            ****         n/a          ****         ****         N            N            N            n/a          N
(a) Mean ratios and standard deviation in parenthesis. C1-Cluster I, C3-Cluster III, P1-PHS clones affiliated with cluster I, P3-PHS clones affiliated with cluster III.
(b) Two tailed t-test P value summary. Means are significantly different (P<0.05) according to unpaired t-tests with Welch’s correction for unequal variances. **** P< 0.0001 extremely significant, *** 0.0001 <P< 0.001 very significant, ** 0.001 <P< 0.01, * 0.01 <P< 0.05, significant. N - not significant. n/a - not applicable. ‘ ’ - increased amount of amino acid, ‘ ’ - decreased amount of amino acid.
Table 8 The amino acid mean composition in the conserved sections of cluster I (C1, N=58), cluster III (C3, N=32) and affiliated clones (P1=20, P3=16). Shaded cells denote high Standard Deviation values (SD).
Amino Acid    Ala   Cys   Asp   Glu   Phe   Gly   His   Ile   Lys   Leu
C1-Conserved  11    5.3   6.6   9.2   1.3   14    0.0   6.6   2.7   5.6
P1-Conserved  11    3.6   5.9   13    2.4   16    1.2   4.7   2.4   6.4
SD            0.0   1.2   0.49  2.7   0.78  1.4   0.85  1.3   0.21  0.57
C3-Conserved  6.1   4.6   9.1   7.6   1.5   21    0.0   5.9   3.0   6.1
P3-Conserved  7.2   4.9   8.2   8.5   1.3   18    0.0   4.7   3.6   10
SD            0.78  0.21  0.64  0.64  0.14  2.1   0.0   0.85  0.42  2.8
Amino Acid    Met   Asn   Pro   Gln   Arg   Ser   Thr   Val   Trp   Tyr
C1-Conserved  4.9   0.0   5.3   2.6   3.9   2.7   3.9   11    0.0   3.9
P1-Conserved  3.2   1.2   4.7   2.4   2.4   3.4   3.6   9.7   0.0   3.5
SD            1.2   0.85  0.42  0.14  1.1   0.49  0.21  0.92  0.0   0.28
C3-Conserved  3.1   0.0   6.1   1.5   6.1   4.5   4.5   7.7   0.0   1.5
P3-Conserved  2.4   1.3   3.8   1.3   4.7   4.6   4.8   6.0   0.0   4.8
SD            0.49  0.92  1.6   0.14  0.99  0.071 0.21  1.2   0.0   2.3
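The SD rows in table 8 appear to be numerically consistent with the sample standard deviation of the two mean values above them (reference cluster and affiliated clones). This is an inference from the tabulated numbers rather than something stated in the text, and the helper name below is hypothetical. A short check on the Glu, Gly, Leu and Tyr entries:

```python
from statistics import stdev


def pair_sd(reference_mean, clone_mean):
    """Sample standard deviation of two values, equal to |a - b| / sqrt(2)."""
    return stdev([reference_mean, clone_mean])


glu_sd = round(pair_sd(9.2, 13), 1)   # C1 vs P1, conserved Glu
gly_sd = round(pair_sd(21, 18), 1)    # C3 vs P3, conserved Gly
leu_sd = round(pair_sd(6.1, 10), 1)   # C3 vs P3, conserved Leu
tyr_sd = round(pair_sd(1.5, 4.8), 1)  # C3 vs P3, conserved Tyr
```

Under this reading, an SD of about 2 corresponds to a difference of roughly three percentage points between the reference cluster and its affiliated clones.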
To find out how changes in amino acid content might have influenced the Fe
protein structure (if at all), 18 PHS clones, 9 from each cluster, were submitted to
I-TASSER in order to create structural models (table 9). The distance of the nine PHS clones
affiliated with cluster III from the 1CP2 sequence ranged from 0.13 (RSA207) to 0.16, and
the RMSD values of their models ranged from 0.45 (RSA227) to 0.89 Å, according to the I-TASSER
results.
There were seven sections in the P3 clones' Fe protein models with an average RMSD value higher
than 1 Å, and positions 50-51 and 112-115 presented average RMSD values higher than 2 Å
(figure 10, table 10). These two sections were exposed to the solvent and composed of
conserved and non conserved residues. Their predicted secondary structures included coils,
bends and turns. The distance of the nine PHS clones affiliated with cluster I from the 2AFH
sequence ranged from 0.02 (RSA173) to 0.21, and the RMSD values of their I-TASSER models
ranged from 0.34 (RSA159) to 0.56 Å (table 9). The only section which varied structurally,
108-113, included conserved and non conserved residues, most of which were exposed to the
solvent (figure 11, table 11). The predicted structure included parts of a helix and a turn.
Table 9 PHS clones chosen for structural analysis.
Sequence ID  Distance(e)  C-score(a)  TM(b)   RMSD (Å)(c)  IDEN(d)
2AFH         0            2.12        0.9897  0.54         1
RSA173Par0   0.02         1.84        0.994   0.43         0.841
RSA159Par0   0.03         1.86        0.992   0.34         0.987
RSA194Par0   0.09         1.86        0.989   0.48         0.954
RSA208Par0   0.11         1.87        0.946   0.41         0.996
RSA192Par0   0.12         1.87        0.99    0.44         0.946
RSA203Par0   0.17         1.83        0.988   0.52         0.916
RSA221Par0   0.17         1.85        0.989   0.48         0.924
RSA213Par0   0.18         1.85        0.992   0.38         0.916
RSA163Par0   0.21         1.85        0.987   0.56         0.895
(a) Confidence score for estimating the quality of the predicted top model by I-TASSER (Roy et al., 2010).
(b) TM-score of the structural alignment between the query structure and known structures in the PDB library (Zhang and Skolnick, 2005).
(c) The overall RMSD between residues that were structurally aligned by TM-align (Zhang and Skolnick, 2005).
(d) The percentage sequence identity in the structurally aligned region (Roy et al., 2010).
(e) Distances as calculated by PHYLIP “Protdist” version 3.67 (Felsenstein, 2007).
Table 10 Residue characteristics of PHS clones affiliated with cluster III with regions of RMSD >1 Å.
Residue positions*           50-51      62-63  65-67              89    91-93      112-115    152-153
Amino acid(a)                GL         EE     EDV                E     GVG        YTDD       MM
Conservation score(b)        97         18     757                9     999        6111       79
Main secondary structure(c)  bend/coil  helix  3-helix-turn/coil  coil  turn/bend  coil/turn  helix
Solvent accessibility(d)     ee         ee     ee-                e     ---        eeee       eb
Average RMSD (Å) (e)         2.32       1.23   1.28               1.24  1.37       2.3        1.58
* Position number according to 1CP2, chain A sequence, P00456 accession ID. (a) The amino acid in each position in the respective 1CP2 sequence. (b) ConSurf conservation scores for cluster III, 1-9, non-conserved to completely conserved, respectively. (c) The secondary structure based on the crystallographic structure of 1CP2. (d) Solvent accessibility - Buried (b) or exposed (e) residue, (-) varying degrees of exposure. See Table 2 for full details. (e) Average RMSD values for specific positions, in which the clones presented RMSD values > 1 Å, relative to the 1CP2 structure.
Table 11 Residue characteristics of PHS clones affiliated with cluster I with regions of RMSD >1 Å.
Residue positions*           108-113
Amino acid(a)                FLEEEG
Conservation score(b)        899927
Main secondary structure(c)  helix&turn
Solvent accessibility(d)     bbeeee
Average RMSD (Å) (e)         1.7
* Position number according to 2AFH, P00459 sequence accession ID, chain E. (a) The amino acid in each position in the respective sequence. (b) ConSurf conservation scores for cluster I, 1-9, non-conserved to completely conserved, respectively. (c) The secondary structure based on the crystallographic structure of 2AFH. (d) Solvent accessibility - Buried (b) or exposed (e) residue, (-) varying degrees of exposure. See Table 4 for full details. (e) Average RMSD values for specific positions, in which the clones presented RMSD values > 1 Å, relative to the 2AFH structure.
Figure 10 Two opposite angles of 1CP2 Fe protein superimposed with PHS clones I-TASSER models. Magenta highlights areas where RMSD >1 Å. The largest structural shifts, where RMSD >2 Å and the site of the two residue insertion are in red, see table 18.
Figure 11 2AFH Fe protein superimposed with its closest PHS clones I-TASSER models. Magenta highlights areas where RMSD >1 Å, as per table 19.
According to the analysis we performed on six PHS clones (three for each cluster), there was a
total of seven common salt bridges for PHS clones and 1CP2 or 2AFH and 12 unique salt
bridges which were not detected in 1CP2 or 2AFH (Table 12). Two unique salt bridges were
highly conserved in P3 and P1, and were at positions Asp42-Arg45 and Glu151-Arg184 (P3
residue numbering respectively, underlined in Table 12). The Asp or Glu residues and most of
their positive partners were highly conserved in the unique salt bridges. The additional negative
residues in the region of 112-115 (cluster III, table 10) seem to strengthen ionic bonds, mainly
but not exclusively with Lys33 and Met1. Salt bridges in P3, corresponding to the above mentioned
region, were detected in our analysis but their interatomic distances ranged from 4.1 to 6.88
Å and were therefore not specified in Table 12. The negative residues interacted mainly with
Arg81, Lys113 and His26, all of which scored 1 for conservation in cluster III.
Table 12 Potential salt bridges, with maximum interatomic distance of 4 Å, in the amplified NifH region of the Fe protein. Shaded rows represent common salt bridges present in the representative clones and the selected structure, 1CP2 or 2AFH.
         Residue  Position(b)  Residue  Position  Distance (Å)  Conservation scores (c)
1CP2(a)  ASP      38           LYS      14        3.07          9,9
         GLU      62*          LYS      54        2.77          1,6
         GLU      75           ARG      81        3.84          1,1
         GLU      107          LYS      140       3.96          9,9
         ASP      115          ARG      81        3.22          1,1
         ASP      122*         LYS      14        2.95          9,9
         GLU      143          ARG      2         3.58          9,9
P3(d)    ASP      42           ARG      45        2.65          9,9
         ASP      58           LYS      54        3.3           9,9
         ASP      58           ARG      61        3.25          9,9
         ASP      62           LYS      54        3.61          7,9
         GLU      89           ARG      61        3.19          9,9
         GLU      107          LYS      140       2.73          9,9
         ASP      126          LYS      40        2.84          9,9
         GLU      145          ARG      2         3.22          9,-
         GLU      151          ARG      184       3.04          9,-
2AFH(e)  ASP      39           LYS      15        3.09          9,9
E↔F      GLU      92           LYS      170       2.8           9,9
         GLU      110          LYS      143       3.24          9,8
         ASP      118          LYS      32        3.38          3,7
         ASP      125          LYS      15        3.14          9,9
         ASP      129          LYS      41        2.74          9,9
         GLU      141          ARG      140       2.92          6,9
         GLU      146          ARG      3         2.88          9,9
         GLU      154          LYS      10        3.97          9,9
         GLU      229          HIS      50        2.53          2,6
E→F      GLU      265          LYS      52        3.68          9,9
F→E      GLU      277          LYS      52        2.86          -,9
P1(f)    GLU      29           ARG      82        3.7           -,1
         ASP      44           ARG      47        2.84          9,9
         ASP      70           ARG      65        2.65          9,2
         GLU      111          LYS      144       2.82          9,9
         ASP      117          LYS      33        3.97          7,-
         GLU      118          MET      1         2.77          7,-
         ASP      119          LYS      33        3.25          7,-
         ASP      126          LYS      16        3.82          8,-
         GLU      147          ARG      4         2.66          9,-
         GLU      155          ARG      188       2.96          9,-
(a) 1CP2 analysis by WHAT IF, salt bridges were not detected between the Fe protein subunits A & B.
(b) Positioning was manually corrected for minor shifts per alignment.
(c) Conservation score was based on the individual analysis of ConSurf on cluster I and its affiliated clones (P1), cluster III, PHS clones affiliated with cluster III (P3). Scores ranged from 1 to 9, non-conserved to completely conserved, respectively. “-” score was not calculated.
(d) Based on the WHAT IF analysis on the I-TASSER PDB files of cluster III PHS clones: RSA227Par09, RSA158Par09 and RSA207Par09.
(e) 2AFH analysis by WHAT IF, salt bridges were detected between subunits E & F, and are designated where relevant.
(f) Based on the WHAT IF analysis on the I-TASSER PDB files of cluster I PHS clones: RSA163Par09, RSA208Par09 and RSA194Par09.
* Yellow background colour denotes a residue in an α-helix structure, and green colour denotes a residue within a β-sheet. No background colour means random coil or unknown structure.
6.4 Discussion
6.4.1 Halophilic adaptations
The stromatolite inferred NifH sequences were subjected to analyses of conservation patterns,
amino acids composition and structural shifts, in comparison to cluster I and cluster III.
According to our analyses, most of the amplified region was conserved in the stromatolite
clones. A smaller portion of the alignment was completely conserved than in the
original clusters: 70% and 66% of cluster I and cluster III positions in the same region,
respectively, scored 8 & 9, compared with 50% in the stromatolite alignment. Highly variable
positions with scores lower than 6 were also more prevalent in the
sequence alignment of columnar stromatolites (41%) than in clusters I and III
(21% each). This was expected as the stromatolite multiple alignment was combined from
clones affiliated with both clusters, hence their multiple alignment included more variable
residues per position.
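The percentages quoted above are simple tallies over the per-position ConSurf scores (1-9). A minimal sketch of that tally follows, with a hypothetical list of scores and a hypothetical function name. The thresholds mirror the ones used in the text: scores of 8 and 9 count as highly conserved and scores below 6 as highly variable.

```python
def score_fractions(scores, high=(8, 9), variable_below=6):
    """Percent of positions scoring 8 or 9, and percent scoring below 6."""
    if not scores:
        raise ValueError("no conservation scores given")
    n = len(scores)
    high_pct = 100.0 * sum(s in high for s in scores) / n
    variable_pct = 100.0 * sum(s < variable_below for s in scores) / n
    return high_pct, variable_pct


# Hypothetical scores for ten alignment positions.
toy_scores = [9, 9, 8, 7, 6, 5, 3, 1, 9, 8]
high_pct, variable_pct = score_fractions(toy_scores)
```

Positions scoring 6 or 7 fall into neither group, which is why the two percentages need not add up to 100.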
Two characteristics were checked in order to find whether the stromatolite sequences, as a
group, had a pattern regardless of their cluster affiliation: firstly, whether the completely
conserved sections matched the corresponding segments in cluster I or III and secondly, whether
the variability of residues, per position, was within the known variants we found for cluster I &
III previously. Completely conserved sections did match the corresponding segments in cluster I
and cluster III and included the important functional regions, and in this regard they did not
provide any new information. With regard to the second point, 16 positions out of the 119
residues of the partial gene region (on average) matched our second criterion, and included
residue variants in the stromatolite clones which were different from those present in cluster I or
III. However, 13 variants were present in only one sequence (RSA152) out of the whole set, and
therefore were discarded from further analysis and discussion. Four positions were found to
include patterns unique to the stromatolites, as a group. They suggested bias towards Leu and
Asn, and conservation of function over structure.
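The second criterion described above amounts to a column-by-column comparison: a residue counts as a new variant when it occurs in the clones but never in the reference cluster sequences at that position, and variants supported by a single sequence are set aside. The sketch below illustrates this under the assumption that all sequences come from one alignment of equal length. The function name, the toy sequences and the `min_sequences` threshold are hypothetical.

```python
from collections import Counter


def novel_variants(clone_seqs, reference_seqs, min_sequences=2):
    """Map 0-based column -> {residue: count} for residues seen only in the clones.

    A residue is kept when it never occurs in the reference sequences at that
    column and is present in at least `min_sequences` clone sequences.
    """
    result = {}
    for column in range(len(clone_seqs[0])):
        reference_residues = {seq[column] for seq in reference_seqs}
        counts = Counter(seq[column] for seq in clone_seqs)
        kept = {
            residue: n
            for residue, n in counts.items()
            if residue != "-"
            and residue not in reference_residues
            and n >= min_sequences
        }
        if kept:
            result[column] = kept
    return result


# Hypothetical aligned fragments: L at column 1 is new and seen twice,
# while N (column 1) and Q (column 2) are new but seen in one clone only.
reference = ["AKD", "AKE"]
clones = ["ALD", "ALE", "ANQ"]
supported = novel_variants(clones, reference)
all_new = novel_variants(clones, reference, min_sequences=1)
```

Setting `min_sequences=1` lists every new variant, including the singletons that were discarded from further discussion in the text.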
The amino acids composition analysis enabled us to detect shifts relative to cluster I or cluster
III. In the clones affiliated with cluster III, the slight decrease in the charged amino acids in the
conserved section, Arg and Asp, could be interpreted as an adaptive strategy to minimise
interference from salt ions within the core of the protein, and the addition of hydrophobic
elements such as Tyr and Leu would minimise accessible surface area to the solvent, within the
important functional buried sites of the protein (Moret and Zebende, 2007). Additionally, in the
variable section there was an increase in positively and negatively charged amino acids (Asp,
Glu, Arg, Lys), an increase in small amino acids (Gly, Pro) and small hydrophobic amino acid
(Val), as well as a decrease in bulkier hydrophobic amino acids (Ile, Leu, Met, Tyr, Cys).
Additional findings included an increase of Leu in the conserved section of stromatolites
regardless of cluster affiliation (table 1). The common finding in the variable sections included
five amino acids whose composition varied significantly in the stromatolites, but not so in the
reference clusters (table 2). Interestingly, Glu and Arg content increased in stromatolite clones
affiliated with cluster I (S1) and cluster III (S3), yet Asp, Ile and Val did not have a joint trend
for S1 and S3, and displayed different shifts (table 2). Asp and Val decreased in S1 and
increased in S3, while Ile increased in S1 and decreased in S3. Tyr and Cys decreased in S1 and
S3, and the most prevalent amino acid was Glu (17/15 for S1/S3, respectively, table 2). The
interplay previously observed in cluster I & III, between Ala and Gly amino acids, was not
present in the stromatolites.
Metagenomic studies of halophilic bacteria, Archaea and also of total bacterial DNA from saline
and hypersaline environmental samples, revealed several re-occurring genomic themes, though
not all of them were absolute for all protein families across all halophiles (Fukuchi et al., 2003;
Paul et al., 2008; Rhodes et al., 2010).
The shift in hydrophobic residues has been partially attributed to the GC-rich DNA that
halophiles possess, which affects codon usage (Paul et al., 2008). Rao and Argos (1981)
reported that in chloroplast-type 2Fe-2S ferredoxins from two halophiles, large hydrophobic
and aliphatic residues such as Ile, Leu, Phe and Met were replaced by smaller residues (Ala,
Gly and Val) to reduce overall protein bulkiness and promote a tight configuration which is less
accessible to the solvent, while the overall hydrophobicity remained relatively unchanged in
comparison to the non halophilic proteins (Rao and Argos, 1981). The increase in charged
amino acid frequency has been reported as a halophilic mechanism, for instance, to produce
an excess of negative charges which act as a charged screen against salt ions, attract water
molecules and enable the protein to remain active in saline conditions up to 4 M NaCl.
According to Lanyi (1974), there is also an excess of small amino acids with short side chains
(Gly, Ala). Fukuchi et al. (2003) performed a statistical analysis on 126 proteins from
Halobacterium sp. NRC-1 and three other halophiles and found an abundance of acidic residues
on the external surface of halophilic proteins vs. non halophiles, while the internal composition
did not change significantly.
Paul et al. (2008) presented data from which it was concluded that halophilic proteins, in
general, are less hydrophobic than non halophilic proteins. Similar findings were reported from
a statistical review of 26 halophilic enzymes by Madern et al. (1995) with the additional finding
of lower Lys content (a feature which was mentioned by Eisenberg (1995) as well). However, it
should be noted that some analyses have shown that the composition of Arg and Lys is dictated
solely by the G+C content in the DNA, and has nothing to do with their charge or other
biochemical properties (Cambillau and Claverie, 2000).
In order to achieve a reliable structural analysis, we chose the 10 stromatolite clone sequences
that were relatively close, distance-wise, to the 1CP2/P00456 or 2AFH/P00459 sequences. The
minimum distance for a stromatolite clone to 1CP2 was 0.14 in our dataset (table 3). The
maximum distance was 0.23 and we therefore expected that some structural changes would be
evident. The highest RMSD value for a clone model with cluster I was 0.61 Å (RSA114) while
the highest value for a clone model of cluster III was 1.05 Å (RSA9196), indicative of the
uncertainties in modelling a cluster III Fe protein when I-TASSER does not have a robust
number of resolved cluster III structures to rely on. Our analysis suggested two sections
participated in structural shifts in the cluster III affiliated clones. These two sections included a
residue involved in the Fe protein dimer interface (Leu51), and a hydrogen bonding partner with
water molecules (Thr113; Schlessman et al., 1998).
The region of 112-115 in the S3 clones was always elongated by two charged residues. The two
main forms in this region were AESEE or EEDKK in S3 clones, which formed unfavourable
salt bridges (interatomic distance >4 Å, table 6). A quick glance at the cluster III alignment
(figure 2, section 5.3.1, chapter 5) revealed that an insertion of two residues tends to occur
in these positions, and the specific KK or EE type of insertion was also present in the NifH
sequences of Desulfovibrio magneticus strain ATCC 700980 (NIFH_DESMR_1_271 sequence
ID), Desulfovibrio gigas (NIFH_DESGI_1_271) and Desulfatibacillum alkenivorans strain AK-
01 (NIFH_DESAA_1_271). Therefore while the insertion of the charged residues was not
unique or endemic to the stromatolite group, it was definitely a stabilising adaptation for saline
conditions, as these specific species are known to withstand saline conditions (Cravo-Laureau et
al., 2004; Garrity et al., 2005). An increase of salt bridges was reported within monomers and at
the inter subunits interfaces of halophilic proteins, as a stabilising mechanism (Eisenberg, 1995;
Madern et al., 2000). These studies suggested that the salt bridges were formed either by an
Arg residue which interacted with the acidic residues, or by the solvent ions such as chloride
and sodium to which the salt bridges were bound. Our analysis of potential salt bridges revealed
that while the Asp or Glu residues were always highly conserved in S3 or S1 in general, their
positive partners were sometimes highly variable. The mechanism described by Madern et al.
(2000) would therefore fit the conservation we see of acidic residues in S3 or S1 and would
allow for flexible interactions with positive ions from the solvent.
Our analysis suggested three sections were involved in structural shifts in the affiliated clones of
cluster I. The Gly residues, in positions 94 and 96, support a functionally important Val residue
in between (V95, table 4, section 5.3.2, chapter 5), which participates both in binding to the
MoFe protein and in the dimer interaction within the protein. In addition, C97, right after G96,
coordinates the metallocluster through hydrogen bonding between the main chain amide, the sulfur
atoms in the cluster and the thiol group of the Cys (NH-S bond). This loop region has previously been
found to exhibit variation in conformation, and though one residue was denoted
buried and the other exposed to the solvent, the entire cluster area is considered accessible to the
solvent in general (Schlessman et al., 1998). The region of YEDD (115-118) is mostly non-
conserved and exposed to the solvent, and position 117 sometimes included Asp or Asn in the
S1 clones. The Asn is a rather unique choice for some of the stromatolites, since in 13
sequences in cluster I alignment, an insertion of D/E/S/V was evident as well (figure 5, section
5.3.2, chapter 5), but Asn was never present. These 13 sequences were of NifH from
Pseudomonas stutzeri, Azotobacter chroococcum and A. vinelandii. Most of the sequences in
cluster I did not have an additional residue in position 117 (figure 5, section 5.3.2, chapter 5).
According to our previous analysis with 1CP2 and 2AFH resolved crystallographic structures -
the residues corresponding to positions 51-52, 89, 93 and 113-114 also presented high RMSD
values (table 5, section 5.3.3, chapter 5). Hence, we believe that the
structural shifts in positions 51-52, 89, 91-93, and 113-114 in cluster III clones plausibly originated from
the methodology used by I-TASSER. The process relies on available protein structures, which
at the moment are mostly Fe proteins from A. vinelandii, and therefore may not authentically reflect
potential shifts in the structure of Fe proteins from the clones. In a similar fashion,
two out of the three sections observed when analysing 2AFH and related stromatolite clones
(table 5, figure 6) were also detected previously and can be attributed to the I-TASSER process,
and may not represent authentic structural shifts. On the other hand, positions 41-42, 62-63, 65-
68, and 115 (with the insertion of two additional residues), may authentically represent
structural shifts in stromatolite affiliated with cluster III. Altogether these findings suggest that
structural shifts occur in the stromatolite Fe proteins, in addition to the possible bias introduced
by the I-TASSER procedure.
In summary, based on amino acid composition and structural analysis, the overall results
suggest halophilic adaptations were present in the inferred NifH sequences of the stromatolites.
6.4.2 Thermophilic adaptations
The Paralana Hot Springs NifH clones (PHS) were subjected to analyses of conservation
patterns, amino acid composition and structural shifts, in comparison to cluster I and cluster III.
Our analysis demonstrated that most of the amplified region was conserved in the PHS clones,
yet less so than in cluster I and cluster III. Seventy percent and 66% of
cluster I and cluster III positions in the same region, respectively, scored 8 or 9. In addition, the
sequence alignment of PHS clones included more highly variable positions with scores below 6.
This was expected, as the PHS multiple alignment was compiled from clones affiliated with
both clusters; hence it included more variable residues per position. We
also looked for unique residue variants within the PHS multiple alignment that differed from the
variants of the original clusters.
In the variable region of PHS alignment, 13 positions were found to include amino acid variants
which were not present in cluster I or III. However, except for positions 80 and 116, the variants
were present in only one sequence out of the whole set, and therefore were discarded from
further analysis and discussion. N106 was completely conserved in PHS, in contrast to the
original clusters (figure 7). It is currently unknown how these changes would affect the Fe
protein function in PHS clones.
Hyperthermophilic proteins usually display an increase in charged (Arg, Lys, Glu, Asp) and
some hydrophobic amino acids (Ile, Met, Val, Tyr), accompanied by a decrease in uncharged
polar residues such as Ser, Thr, Asn and Gln, with no significant variation for His, Pro, Gly or
Cys (Cambillau and Claverie, 2000; Daniel et al., 2008). Other studies reported slightly
different results: an increase in Glu, Ile, Val, Tyr, accompanied by decreases in Ala, His, Gln
and Thr (Fukuchi and Nishikawa, 2001; Singer and Hickey, 2003).
In general, the increase in charged amino acids results in chains of ion pairs, which enhance
stability at high temperatures. Asn and Gln are sensitive to temperature fluctuations, due to the
increased rate of deamidation at high temperatures; hence decreasing their presence promotes
overall stability at high temperatures. Hydrophobic interactions in the protein also affect its stability:
increasing the core hydrophobicity produces a small and compact core, which stabilises the
protein at higher temperatures (Siddiqui and Cavicchioli, 2006; Siddiqui and Thomas, 2008).
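The composition trends described above can be illustrated with a short sketch that tallies the fraction of residues in each class for a single sequence. The residue groupings follow the classes named in the text; the function name, the one-letter encoding and the handling of alignment gaps are assumptions made for illustration only, and this is not the procedure used in the thesis.

```python
from collections import Counter

# Residue classes as named in the text (one-letter codes).
# The exact class boundaries are an illustrative assumption.
CHARGED = set("RKED")          # Arg, Lys, Glu, Asp
UNCHARGED_POLAR = set("STNQ")  # Ser, Thr, Asn, Gln
HYDROPHOBIC = set("IMVY")      # Ile, Met, Val, Tyr


def class_fractions(sequence):
    """Return the fraction of residues that fall in each class."""
    seq = sequence.upper().replace("-", "")  # drop alignment gaps
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    total = len(seq)

    def fraction(group):
        return sum(counts[aa] for aa in group) / total

    return {
        "charged": fraction(CHARGED),
        "uncharged_polar": fraction(UNCHARGED_POLAR),
        "hydrophobic": fraction(HYDROPHOBIC),
    }
```

Comparing these fractions between a set of clone sequences and a reference cluster would give a first indication of the direction of a composition shift, before any statistical testing.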
The amino acid composition analysis enabled us to detect composition shifts relative to cluster
I or cluster III. Clones affiliated with cluster III (P3) had an increase in positively and negatively
charged amino acids (Asp, Arg), and an increase in small or hydrophobic amino acids (Phe, Val,
Pro) but also a decrease in hydrophobic residues such as Tyr, Leu and Ala, as well as the
negatively charged Glu (table 7, figure 8).
The fluctuations in the conserved sections, relative to cluster III, point to an increase in the Leu
and Tyr content, and a decrease in the Gly content. Therefore, there might be an interplay between
the external, variable sections and the conserved interior. In the interior, a slight increase in large
hydrophobic residues would help to minimise the surface area accessible to the solvent within the
functionally important buried sites of the protein (Jaenicke and Böhm, 1998; Haney et al., 1999).
In common with the P3 clones, the variable sections of the P1 clones included an increase in Arg, and a decrease
in Ala and Glu amino acids (table 7, figure 9). P1 clones also decreased in other charged amino
acids and uncharged polar residues. Glu increased in the conserved sections of P1 but this was
not observed in the P3 clones (table 8). The interplay previously observed in cluster I & III,
between Ala and Gly amino acids in the conserved regions, was not detected in PHS clones.
To achieve a reliable structural analysis, we chose 18 PHS clones at a maximum
distance of 0.21 from the 1CP2/P00456 or 2AFH/P00459 sequences (table 9). The highest RMSD
value for a clone model with cluster I was 0.56 Å (RSA163), while the highest value for a clone
model of cluster III was 0.9 Å (RSA215, RSA195), indicative of the uncertainties in modelling
a cluster III Fe protein with the current low number of available resolved structures of Fe
proteins from this cluster. According to our previous analysis with the 1CP2 and 2AFH resolved
crystallographic structures (table 5, section 5.3.3, chapter 5), the residues corresponding to
positions 50-51, 62-63, 65-67, 89, 91-93 and 112-115 presented high RMSD values, and
therefore some of the reported shifts in P3 are a result of the I-TASSER process and may not
represent authentic shifts.
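For readers unfamiliar with the RMSD values quoted above, the following minimal sketch shows how a per-residue deviation and an overall RMSD can be computed from two sets of matched coordinates. It assumes the two structures have already been superposed and that each residue is represented by a single atom (for example the Cα); the function names are hypothetical, and the thesis does not state that its values were computed this way.

```python
import math


def per_residue_deviation(coords_a, coords_b):
    """Distance (in the input units, e.g. Å) between matched residues.

    Assumes both coordinate lists are already superposed and that
    index i in one list corresponds to index i in the other.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    return [math.dist(a, b) for a, b in zip(coords_a, coords_b)]


def rmsd(coords_a, coords_b):
    """Root mean square deviation over all matched residues."""
    deviations = per_residue_deviation(coords_a, coords_b)
    if not deviations:
        raise ValueError("no coordinates supplied")
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))
```

Positions whose per-residue deviation stands out in a comparison of two resolved structures, as well as in a comparison of a model against its template, are the ones that cannot be confidently attributed to the clone sequence itself.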
These results were similar to our findings with the partial NifH sequences of the stromatolite clones.
In the P3 clones, the region of 112-115 was sometimes elongated by two residues, and our
analysis suggested a salt bridge might be established at times (table 12, figure 10). Three main
alternatives for this section were observed in the clone sequences - KMD/EESQE/DADKK. For
some of the PHS clones affiliated with cluster I, no insertion was evident at all, and one of the
Asp residues would change to Gly, while another negative residue would be omitted at times
(table 11, figure 11). P1 and P3 did not share the exact same modification, but they did share the
same region in which this modification occurred.
In summary, based on amino acid composition and structural analysis, the results suggested
that thermophilic adaptations were only partially present in the inferred NifH sequences of PHS clones.
6.5 Concluding remarks
NifH sequences from the Shark Bay hypersaline environment and Paralana Hot Springs were
analysed in terms of their conservation patterns, amino acid composition and existing and
potential structural attributes. Our methods included a statistical t-test analysis of the amino acid
composition, a novel ConSurf evolutionary analysis, and the use of the I-TASSER web server
for 3D modelling of the amplified region of NifH.
Our results were explained in light of the methodology limitations as discussed previously, in
section 5.4.1, chapter 5.
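The t-test mentioned above compares the mean content of an amino acid in one group of sequences with that in another. A minimal sketch of such a comparison is given below using Welch's unequal-variance form; the choice of Welch's variant, the function name and the input layout (one composition value per sequence) are assumptions for illustration, since the exact variant used is not restated here.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and its approximate degrees of freedom.

    Each sample is a list of per-sequence values, e.g. the percentage
    of Arg in each clone of a group (hypothetical input layout).
    """
    na, nb = len(sample_a), len(sample_b)
    if na < 2 or nb < 2:
        raise ValueError("each sample needs at least two values")
    va, vb = variance(sample_a), variance(sample_b)  # sample variances
    se2 = va / na + vb / nb                          # squared standard error
    if se2 == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df
```

The returned statistic would then be compared against the t distribution with the returned degrees of freedom to obtain a p-value; that step is omitted to keep the sketch free of external dependencies.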
The results suggested that to a certain degree, halophilic adaptations, with an increase in salt
bridges, charged residues and a decrease in bulkier hydrophobic amino acids, did occur. The
changes were less apparent in the clones affiliated with cluster I than in the clones affiliated
with cluster III, which may be an indication of some measure of protection of the protein from
the environment in the cluster I affiliated clones (see table 13).
The NifH protein sequences from Paralana Hot Springs were subjected to a similar analysis.
The results suggested that to a limited degree, some of the known thermophilic adaptations - an
increase in salt bridges, charged residues and Pro, were present in the sequences; however other
known features were not detected, including an increase in several hydrophobic amino acids and
a decrease in uncharged polar residues. These conflicting results may be indicative of a
changing temperature regime in the hot spring, as different temperatures were reported in the
past (Mawson, 1927; Grant, 1938; Long et al., 2001; Anitori et al., 2002), or of additional
environmental factors such as salinity, coming into play (see table 13). These factors require
further confirmation.
Some of our findings can only be confirmed once an Fe protein structure has been
determined for representative microorganisms isolated from the investigated environments.
Table 13 Summarising halophilic and thermophilic findings from this study.

Halophilic adaptations*                           Stromatolites NifH clones
More Asp or Glu, Ala, Gly or Val(a)               S3: Glu, Asp, Gly, Val; S1: Glu
Less Lys, Ile or Leu or Phe or Met(a)             S3: Ile, Leu, Phe, Met; S1: Lys, Phe
More salt bridges(b)                              +

Thermophilic adaptations**                        PHS NifH clones
More Ile or Tyr, Arg or Glu, Pro or Lys(a)        P3: Arg, Pro; P1: Ile, Arg, Pro
Less Gly or Met or Gln or Thr or Asn or Ser(a)    P3: Ser; P1: Gly, Gln, Asn, Ser
More salt bridges(b)                              +
High Arg/Lys ratio (>1)(a)                        P3: 0.53; P1: 1.61

* Specific halophilic adaptations (Eisenberg, 1995; Madern et al., 2000; Bolhuis et al., 2008).
** Specific thermophilic adaptations (Haney et al., 1999; Daniel et al., 2008).
(a) Specific changes in the amino acids composition and whether they appeared in the variable sections of the NifH sequence of the stromatolite clones (S1, S3), or the PHS clones (P1, P3). Changes were in comparison to cluster I (S1, P1 vs. C1) or cluster III (S3, P3 vs. C3) values.
(b) See tables 14 & 20 - salt bridges calculated by WHAT IF, version 10.1a (Rodriguez et al., 1998).
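The Arg/Lys ratio listed in Table 13 can be illustrated with a short sketch. It pools the Arg and Lys counts over a group of sequences before dividing; whether the thesis pooled counts or averaged per-sequence ratios is not stated here, so the pooling, like the function name, is an assumption.

```python
def arg_lys_ratio(sequences):
    """Pooled Arg/Lys ratio over a group of one-letter protein sequences.

    Counts are summed over all sequences before dividing
    (an illustrative assumption, not a documented procedure).
    """
    arg = sum(seq.upper().count("R") for seq in sequences)
    lys = sum(seq.upper().count("K") for seq in sequences)
    if lys == 0:
        raise ValueError("no Lys residues; ratio is undefined")
    return arg / lys
```

A value above 1 is the threshold quoted in the table as indicative of a thermophilic adaptation.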
Chapter 7 Conclusions & future work
_______________________________________________
“Nothing in biology makes sense except in the light of evolution” is a statement that still holds
true decades later (Dobzhansky, 1973). Genetic studies have revealed that the nifH
gene is present in numerous bacteria and Archaea and is relatively common in a vast number of
genomes (Gary Stacey, 1992; Berman-Frank et al., 2003; Raymond et al., 2004a). This in turn
suggests that the gene has been present in the genetic code for a long time, perhaps even since
the Last Universal Common Ancestor (LUCA) (Fani et al., 2000; Leipe et al., 2002; Latysheva
et al., 2012).
As stated in the beginning of this thesis - nitrogen fixation is one of the most important
biochemical processes. Our main aim in this work was to study microbial communities involved
in this process, which reside in unique, sometimes extreme, environments. We then analysed the
modifications in the NifH sequences we obtained from the molecular work, and assessed
whether unique adaptations of the Fe protein were evident. Our non-molecular methods
included a statistical t-test analysis of amino acid compositions, and a novel combination of an
evolutionary analysis and protein 3D models.
It would appear then, that from the early beginning of life on Earth, the nifH gene had been
translated into a functional protein, under various environmental conditions (Leigh, 2000).
According to some recent studies, it would seem that phylogenetic trees based on functional
genes, such as the nifH gene, represent microbial communities better than taxonomy-based
phylogenetic trees, as they reflect the immediate environment in which the micro-organisms live
(Burke et al., 2011; Hamilton et al., 2011a). It is reasonable to assume that proteins would be
optimised to ensure survival in a specific environmental setting, and that micro-evolution would
match specific ecological niches (Taroncher-Oldenburg et al., 2003). Findings of this nature
suggest that the different clusters in the phylogenetic tree would actually represent past
adaptations to environmental changes regardless of taxonomical relations (Burke et al., 2011;
Hamilton et al., 2011a) and would actually represent conditions currently influencing the
composition of the genetic code.
In other words - phylogenetic affiliations would correlate best with specific physical and
chemical influences, during a specific time frame, and not necessarily with taxonomical groups,
and in addition, functional genes such as nifH would not be identical within the same species if
its members reside in different environments. Altogether this suggests that a linear story for the
evolution of the nifH gene (or other functional gene), is highly unlikely. Published phylogenetic
analyses of nifH, and also related nif operon genes, seem to support this avenue of thought
(Gary Stacey, 1992; Fani et al., 2000; Leipe et al., 2002; Berman-Frank et al., 2003; Raymond
et al., 2004a; Latysheva et al., 2012).
A possible interpretation of the current known topology of the nifH tree (four clusters, cluster I
and III as the main clades, see chapters 1-3) would be that the main clusters most probably
represent an adaptation to the presence, or lack of, oxygen. In turn, this would set the clusters’
time of branching at around 2.22-2.45 billion years ago, at the great oxidation event (Brocks
et al., 1999; Anbar et al., 2007). The current tree topology may thus represent not only a
specific and dramatic change that happened in the past, at some point in time, but also an
ongoing global setting - still affecting genomes across a wide range of geographical locations.
We would argue that any functional gene phylogenetic tree should be searched for a similar
topology, and if found, one could assume that the ‘great divide’ would have been set around the
time of the great oxidation event.
We have assessed in this study, for the first time, bacterial profiles from two Antarctic sites, in
the Terra Nova Bay area (Abramovich et al., 2012). In order to gather evidence for the bacterial
communities in these glacial zones, we carried out a terminal-restriction fragment length
polymorphism (T-RFLP) analysis on 16S rDNA using a universal bacterial amplification
protocol on two permafrost cores (Marsh, 1999). Bray-Curtis cluster analysis suggested Boulder
Clay bacterial profiles were similar to each other, but clustered separately from the Amorphous
Glacier bacterial profile (Hammer et al., 2001). Amorphous Glacier was potentially rich in
microbial species and the two sites differed in their microbial diversity. Permafrost and icy
environments are difficult to work with (Miteva, 2008), but they are present on Mars and other
objects in the solar system (McKay et al., 1991; Friedmann, 1993; Ostroumov and Siegert,
1996). Icy environments on Earth are therefore important analogue sites for astrobiological
research, if we aim to learn and adapt technologies to find life elsewhere in the universe (Soina
et al., 1995).
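The Bray-Curtis comparison of T-RFLP profiles referred to above rests on a simple dissimilarity measure between two abundance vectors. The sketch below shows that measure in its usual form; the function name and the representation of a profile as a list of peak abundances aligned by fragment length are assumptions for illustration and do not reproduce the software settings used in the study.

```python
def bray_curtis(profile_a, profile_b):
    """Bray-Curtis dissimilarity between two abundance profiles.

    Each profile is a list of non-negative abundances (e.g. T-RFLP
    peak areas), with position i referring to the same fragment in
    both lists. Returns 0 for identical profiles and 1 for profiles
    that share no fragments.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same length")
    numerator = sum(abs(a - b) for a, b in zip(profile_a, profile_b))
    denominator = sum(a + b for a, b in zip(profile_a, profile_b))
    if denominator == 0:
        raise ValueError("both profiles are empty")
    return numerator / denominator
```

A matrix of such pairwise values is what a cluster analysis would then group, so that samples with low mutual dissimilarity fall into the same branch.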
Our study is the first to confirm the presence of nifH genes in columnar stromatolites from Shark
Bay, Western Australia (chapter 3). Shark Bay, a UNESCO World Heritage site, provides
researchers with fascinating endemic microbiological subjects, which bridge our current era
with Archaean fossil records from the beginning of life on Earth. These “living fossils” are
important to our understanding of the origin of life on Earth, as their remnants are consistently
being found in the Earth’s geological records, the oldest to date found in 3.49 Ga Archean rocks
(Walter, 1976; Schopf, 2006).
Our findings partially matched former taxonomical findings on the stromatolites based on
studies which utilized mainly 16S rDNA and culturing analyses. Common potential diazotrophs
included cyanobacterial species and Desulfatibacillum of the δ-Proteobacteria (Goh et al., 2008;
Allen et al., 2009; Burns et al., 2009). The two stromatolite samples, from different years,
differed in their species diversity and richness, and we suggested this was related to the
environmental events that occurred at the time of sampling. Our results indicated that columnar
stromatolites and the salt ponds of Guerrero Negro, Mexico, harbour similar diazotrophic
species, mainly from the δ-Proteobacteria and Cyanobacteria groups. However, the stromatolites
included unique species, such as non-heterocystous Cyanobacteria and γ, δ-Proteobacteria NifH
sequences, which were not present in the Guerrero Negro salt ponds. A new clade was an out-
group to cluster I, and centred on the δ-proteobacterium, Pelobacter carbinolicus DSM 2380
and affiliated NifH clones.
In a different part of Australia, the diazotrophic community of a hot and slightly radioactive
spring was investigated for the first time (chapter 4). Our findings included diazotrophs from the
Cyanobacteria, Nitrospirae, Spirochaetes, Bacteroidetes, Firmicutes and δ-Proteobacteria
groups, few of which were reported by a former taxonomical study utilising a 16S rDNA
analysis (Anitori et al., 2002). These diazotrophs were mainly affiliated with cluster I and
cluster III of the NifH phylogeny tree; however, two new clades were established as out groups
to cluster I. These clades included NifH clones closely related to Thermodesulfovibrio
yellowstonii DSM 11347 (Nitrospirae), several Geobacter spp. and P. carbinolicus DSM 2380
(δ-Proteobacteria).
The number of NifH clones analysed and sequenced in this study (76) represents the highest
number of NifH clones from a single hot spring to be analysed to date (Hamilton et al., 2011).
According to our richness and diversity analysis, the diazotrophic community was more diverse
and included more NifH species than Shark Bay columnar stromatolites and should be further
investigated and sampled. Hydrothermal systems in general produce habitable
microenvironments (Jannasch and Wirsen, 1981; Sogin et al., 2006), and there is evidence to
suggest their existence on Mars, Europa, Enceladus and other solar bodies (McCollom, 1999;
Vance et al., 2007; Glein et al., 2008; Skok et al., 2010), making Paralana’s active
amagmatic hydrothermal system an interesting analogue site for astrobiology research.
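As an illustration of what a richness and diversity comparison between clone libraries involves, the sketch below computes observed richness and the Shannon index from the number of clones assigned to each sequence type. The Shannon index is used here only as a common example of a diversity measure; the specific indices applied in the study are described in the relevant chapters, and the function names and input layout are hypothetical.

```python
import math


def richness(counts):
    """Number of sequence types observed at least once."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over observed types.

    counts holds the number of clones assigned to each sequence type
    (an assumed input layout).
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("no clones counted")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)
```

Higher richness together with a higher index for one library would be consistent with a more diverse community, subject to the usual caveat that the libraries must be of comparable size.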
Our bioinformatics approach paved the way for future research to use the nifH gene as a
reference point for analysis of genomic and protein modifications (chapters 5 & 6). While our
data sets were small, they allowed for an in-depth analysis of our methodology and its limitations.
The results were limited by the nature of our datasets, and yet showed great promise as specific
adaptations were detected in the NifH sequences from Shark Bay and Paralana Hot Springs,
supporting the notion of dynamic evolution in their respective environments.
Future work
Molecular and bioinformatics tools were our main methodologies in this study. Future
researchers may want to focus not only on potential diazotrophs but also on identifying the
actual nitrogen fixers in these unique environments. The new out groups of the NifH
phylogenetic tree reported in this study represent adaptation to high temperatures and high
salinity, but it is unclear if they are active agents in fixing atmospheric nitrogen.
Assessment of actual nitrogenase activity can be achieved with reverse transcriptase PCR and
quantitative reverse transcriptase PCR, and with acetylene reduction assays. These
methodologies would shed light on the key players in the N2 fixation cycle.
Whole genome amplification could also be used to increase the DNA concentrations recovered
from the environment for downstream PCR analysis. Such research will confirm the presence
and viability of psychrophilic, thermophilic and halophilic bacterial phyla, and correlate the
community composition with the geological and habitat characteristics. Proteomics studies
would link nitrogen fixation key enzymes and genes to other biochemical processes, such as
photosynthesis (oxygenic and anoxic) or sulphate reduction and oxidation, and would provide
comparable data with other microbial systems. Measurements of 15N uptake on a micron scale
within, for example, the stromatolite mats’ upper layers (down to 5-8 mm depth), would provide
a reliable portrait of the nitrogen budget within the layered microbial mats and within the
different types of stromatolite mats.
Additionally, as most of what is currently known about the nitrogenase activity is derived from
studies based on Cyanobacteria, nitrogenase activity should be explored and characterised in
diazotrophic sulphate reducing bacteria (SRB) and other anaerobic bacteria.
Future work may also include analysing the new phylogenetic out groups presented in this study
(chapters 3 and 4). Our methodology can be applied to these sequences, and the results compared to our
current body of work and to distinct thermophilic or halophilic NifH
sequences, or perhaps GTPases from thermophilic and halophilic genomes. This, in turn, will not
only clarify which evolutionary steps bring forth thermophilic or halophilic
adaptations, across taxonomical groups and across protein families, but it will clarify whether
taxonomy trumps functionality for this type of gene (see the opening paragraphs in this chapter).
In addition, comparing these out group sequences to cluster I and cluster III affiliated clones
might reveal a gradient of adaptations in the protein composition and structure, thus
illuminating the entire range of adaptations possible to diazotrophs in a specific environment.
Elongation of the amplified region of the nifH gene via the PCR process would be very
beneficial to our analysis, and will enable researchers to confirm or reject our current analysis,
mainly with regard to the amino acid compositions and content in the conserved vs. non
conserved regions of the Fe protein. Additional characteristics that can be assessed in regard to
potential adaptations include (briefly): aromatic interactions, hydrogen bonds, disulfide bridges,
surface accessibility of certain amino acids, electrostatic interactions in the core vs. protein
surface and thermodynamic and protein activity properties.
In summary, this study has enhanced our knowledge of microbiological agents which survive
successfully in extreme environments. These environments are worthy of our attention as they
provide analogous sites for research aimed at finding evidence for life elsewhere in the solar
system. Given enough time to adapt, these successful micro-organisms could survive rigorous
conditions outside of Earth’s protective shell, promoting an optimistic view of finding micro-
organisms elsewhere in the solar system.
References
Abascal, F., Zardoya, R., and Posada, D. (2005) ProtTest: selection of best-fit models of protein evolution. Bioinformatics 21: 2104-2105. Abramovich, R.S., Pomati, F., Jungblut, A.D., Guglielmin, M., and Neilan, B.A. (2012) T-RFLP Fingerprinting Analysis of Bacterial Communities in Debris Cones, Northern Victoria Land, Antarctica. Permafrost Periglac 23: 244-248. Abyzov, S.S., Filippova, S.N., and Kuznetsov, V.D. (1983) Nocardiopsis antarcticus-A new species of actinomyces isolated from the ice sheet of the Central Antarctica glacier. Izv Akad Nauk Ser Biol 4: 559-568. Adams, D.G. (2000) Heterocyst formation in cyanobacteria. Curr Opin Microbiol 3: 618-624. Affourtit, J., Zehr, J., and Paerl, H. (2001) Distribution of nitrogen-fixing microorganisms along the Neuse River Estuary, North Carolina. Microb Ecol 41: 114-123. Aislabie, J., Jordan, S., Ayton, J., Klassen, J.L., Barker, G.M., and Turner, S. (2009) Bacterial diversity associated with ornithogenic soil of the Ross Sea region, Antarctica. Can J Microbiol 55: 21-36. Aislabie, J.M., Chhour, K.L., Saul, D.J., Miyauchi, S., Ayton, J., Paetzold, R.F., and Balks, M.R. (2006) Dominant bacteria in soils of Marble Point and Wright Valley, Victoria Land, Antarctica. Soil Biol Biochem 38: 3041-3056. Akaike, H. (2002) A new look at the statistical model identification. Automatic Control, IEEE Transactions on 19: 716-723. Allen, M., Goh, F., Burns, B., and Neilan, B. (2009) Bacterial, archaeal and eukaryotic diversity of smooth and pustular microbial mat communities in the hypersaline lagoon of Shark Bay. Geobiology 7: 82-96. Allen, M.A. (2006) An Astrobiology-Focused Analysis of Microbial Mat Communities from Hamelin Pool, Shark Bay, Western Australia. In School of Biotechnology and Biomolecular Sciences. Sydney: The University of New South Wales, p. 243. Allen, M.A., Goh, F., Leuko, S., Igo, A.E., Mizuki, T., Usami, R. et al. 
(2008) Haloferax elongans sp nov and Haloferax mucosum sp nov., isolated from microbial mats from Hamelin Pool, Shark Bay, Australia. Int J Syst Evol Microbiol 58: 798-802. Altschul, S., Madden, T., Schaffer, A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389. Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. (1990) Basic local alignment search tool. J. Mol. Biol 215: 403-410.
Aluri, S., and Terli, R. (2012) Three dimensional modelling of beta endorphin and its interaction with three opioid receptors. Journal of Computational Biology and Bioinformatics Research 4: 51-57. Amann, R.I., Ludwig, W., and Schleifer, K.H. (1995) Phylogenetic identification and in situ detection of individual microbial cells without cultivation. Microbiol Rev 59: 143-169. Amato, P., Hennebelle, R., Magand, O., Sancelme, M., Delort, A.M., Barbante, C. et al. (2007) Bacterial characterization of the snow cover at Spitzberg, Svalbard. FEMS Microbiol Ecol 59: 255-264. Anbar, A.D., Duan, Y., Lyons, T.W., Arnold, G.L., Kendall, B., Creaser, R.A. et al. (2007) A whiff of oxygen before the great oxidation event? Science 317: 1903-1906. Andres, M.S., and Pamela Reid, R. (2006) Growth morphologies of modern marine stromatolites: A case study from Highborne Cay, Bahamas. Sediment Geol 185: 319-328. Anisimova, M., and Gascuel, O. (2006) Approximate likelihood-ratio test for branches: A fast, accurate, and powerful alternative. Syst Biol 55: 539. Anitori, R.P., Trott, C., Saul, D.J., Bergquist, P.L., and Walter, M.R. (2002) A culture-independent survey of the bacterial community in a radon hot spring. Astrobiology 2: 255-270. Apweiler, R., Martin, M., O’Donovan, C., Magrane, M., Alam-Faruque, Y., Antunes, R. et al. (2010) The universal protein resource (UniProt) in 2010. Nucleic Acids Res 38: D142-D148. Argandoña, M., Fernández Carazo, R., Llamas, I., Martínez Checa, F., Caba, J.M., Quesada, E., and Moral, A. (2005) The moderately halophilic bacterium Halomonas maura is a free living diazotroph. FEMS Microbiol Lett 244: 69-74. Ashkenazy, H., Erez, E., Martz, E., Pupko, T., and Ben-Tal, N. (2010) ConSurf 2010: calculating evolutionary conservation in sequence and structure of proteins and nucleic acids. Nucleic Acids Res 38: W529-W533. Bai, Y., Yang, D., Wang, J., Xu, S., Wang, X., and An, L. 
(2006) Phylogenetic diversity of culturable bacteria from alpine permafrost in the Tianshan Mountains, northwestern China. Res Microbiol 157: 741-751. Bakermans, C., Tsapin, A.I., Souza-Egipsy, V., Gilichinsky, D.A., and Nealson, K.H. (2003) Reproduction and metabolism at -10°C of bacteria isolated from Siberian permafrost. Environ Microbiol 5: 321-326. Bardavid, R., Ionescu, D., Oren, A., Rainey, F., Hollen, B., Bagaley, D. et al. (2007) Selective enrichment, isolation and molecular detection of Salinibacter and related extremely halophilic Bacteria from hypersaline environments. Hydrobiologia 576: 3-13. Bargagli, R., Skotnicki, M.L., Marri, L., Pepi, M., Mackenzie, A., and Agnorelli, C. (2004) New record of moss and thermophilic bacteria species and physico-chemical properties of geothermal soils on the northwest slope of Mt. Melbourne (Antarctica). Polar Biol 27: 423-431.
Barrett, J.E., Virginia, R.A., Wall, D.H., Cary, S.C., Adams, B.J., Hacker, A.L., and Aislabie, J.M. (2006) Co-variation in soil biodiversity and biogeochemistry in northern and southern Victoria Land, Antarctica. Antarct Sci 18: 535-548. Bauer, K., Díez, B., Lugomela, C., Seppälä, S., Borg, A., and Bergman, B. (2008) Variability in benthic diazotrophy and cyanobacterial diversity in a tropical intertidal lagoon. FEMS Microbiol Ecol 63: 205-221. Bauld, J., Favinger, J.L., Madigan, M.T., and Gest, H. (1986) Obligately halophilic Chromatium vinosum from Hamelin Pool, Shark Bay, Australia. Curr Microbiol 14: 335-339. Bazylinski, D.A., Dean, A.J., Schüler, D., Phillips, E.J.P., and Lovley, D.R. (2000) N2 dependent growth and nitrogenase activity in the metal metabolizing bacteria, Geobacter and Magnetospirillum species. Environ Microbiol 2: 266-273. Belay, N., Sparling, R., and Daniels, L. (1984) Dinitrogen fixation by a thermophilic methanogenic bacterium. Bell, R.E., and Ben Tal, N. (2003) In silico identification of functional protein interfaces. Comp Funct Genomics 4: 420-423. Berezin, C., Glaser, F., Rosenberg, J., Paz, I., Pupko, T., Fariselli, P. et al. (2004) ConSeq: the identification of functionally and structurally important residues in protein sequences. Bioinformatics 20: 1322. Berg, J.M., Tymoczko, J.L., and Stryer, L. (2002) Biochemistry. New York:: W. H. Freeman and Co. Bergman, B., Gallon, J.R., Rai, A.N., and Stal, L.J. (1997) N2 Fixation by non-heterocystous cyanobacteria. In, pp. 139-185. Berman-Frank, I., Lundgren, P., and Falkowski, P. (2003) Nitrogen fixation and photosynthetic oxygen evolution in cyanobacteria. Res Microbiol 154: 157-164. Bertics, V., Sohm, J., Treude, T., Chow, C., Capone, D., Fuhrman, J., and Ziebis, W. (2010) Burrowing deeper into benthic nitrogen cycling: the impact of bioturbation on nitrogen fixation coupled to sulfate reduction. Mar Ecol Prog Ser 409: 1-15. Bertrand-Sarfati, J., and Walter, M.R. 
(1976) Chapter 5.2 An Attempt to Classify Late Precambrian Stromatolite Microstructures. In Developments in Sedimentology: Elsevier, pp. 251-259. Bhat, W.W., Lattoo, S.K., Razdan, S., Dhar, N., Rana, S., Dhar, R.S. et al. (2012) Molecular cloning, bacterial expression and promoter analysis of squalene synthase from Withania somnifera (L.) Dunal. Gene. Bhatia, M., Sharp, M., and Foght, J. (2006) Distinct Bacterial Communities Exist beneath a High Arctic Polythermal Glacier. Appl Environ Microbiol 72: 5838-5845. Blackwood, C.B., Marsh, T., Kim, S.-H., and Paul, E.A. (2003) Terminal Restriction Fragment Length Polymorphism Data Analysis for Quantitative Comparison of Microbial Communities. Appl Environ Microbiol 69: 926-932. Blight, P.G. (1977) Uraniferous Metamorphics and "younger" Granites of the Paralana Area, Mount Painter Province, South Australia: A Petrographical and Geochemical Study: Department of Geology, University of Adelaide.
Boeckmann, B., Bairoch, A., Apweiler, R., Blatter, M.C., Estreicher, A., Gasteiger, E. et al. (2003) The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res 31: 365-370. Bohm, G., and Jaenicke, R. (1994) Relevance of sequence statistics for the properties of extremophilic proteins. International Journal of Peptide and Protein Research 43: 97-106. Bohme, H. (1998) Regulation of nitrogen fixation in heterocyst-forming cyanobacteria. Trends Plant Sci 3: 346-351. Bolhuis, A., Kwan, D., and Thomas, J. (2008) Halophilic Adaptations of Proteins. In Protein adaptation in extremophiles. Siddiqui, K.S., and Thomas, T. (eds): Nova Science Publishers, Inc., pp. 71-104. Bonin, P., and Michotey, V. (2006) Nitrogen budget in a microbial mat in the Camargue (southern France). MARINE ECOLOGY-PROGRESS SERIES- 322: 75. Bothe, H., Tripp, H., and Zehr, J. (2010) Unicellular cyanobacteria with a new mode of life: the lack of photosynthetic oxygen evolution allows nitrogen fixation to proceed. Arch Microbiol: 1-8. Bourne, H.R., Sanders, D.A., and McCormick, F. (1991) The GTPase superfamily: conserved structure and molecular mechanism. Bowman, J.P., and McCuaig, R.D. (2003) Biodiversity, Community Structural Shifts, and Biogeography of Prokaryotes within Antarctic Continental Shelf Sediment. Appl Environ Microbiol 69: 2463-2483. Bowman, J.P., McCammon, S.A., Brown, M.V., Nichols, D.S., and McMeekin, T.A. (1997) Diversity and association of psychrophilic bacteria in Antarctic sea ice. Appl Environ Microbiol 63: 3068-3078. Bowman, J.P., McCammon, S.A., Gibson, J.A.E., Robertson, L., and Nichols, P.D. (2003) Prokaryotic Metabolic Activity and Community Structure in Antarctic Continental Shelf Sediments. Appl Environ Microbiol 69: 2448-2462. Brambilla, E., Hippe, H., Hagelstein, A., Tindall, B.J., and Stackebrandt, E. (2001) 16S rDNA diversity of cultured and uncultured prokaryotes of a mat sample from Lake Fryxell, McMurdo Dry Valleys, Antarctica. 
Extremophiles 5: 23-33. Brewer, W. (1866) Note on the organisms of the geysers of California. Am. J. Sci 92: 429. Brinkmeyer, R., Knittel, K., Jurgens, J., Weyland, H., Amann, R., and Helmke, E. (2003) Diversity and Structure of Bacterial Communities in Arctic versus Antarctic Pack Ice. Appl Environ Microbiol 69: 6610-6619. Brocks, J., Logan, G., Buick, R., and Summons, R. (1999) Archean molecular fossils and the early rise of eukaryotes. Science 285: 1033. Brown, I.I., Bryant, D.A., Casamatta, D., Thomas-Keprta, K.L., Sarkisova, S.A., Shen, G. et al. (2010) Polyphasic Characterization of a Thermotolerant Siderophilic Filamentous Cyanobacterium That Produces Intracellular Iron Deposits. Appl Environ Microbiol 76: 6664.
Brown, M., Friez, M., and Lovell, C. (2003) Expression of nifH genes by diazotrophic bacteria in the rhizosphere of short form Spartina alterniflora. FEMS Microbiol Ecol 43: 411-417. Brugger, J., Long, N., McPhail, D.C., and Plimer, I. (2005) An active amagmatic hydrothermal system: The Paralana hot springs, Northern Flinders Ranges, South Australia. Chemical Geology 222: 35-64. Bureau of Meteorology, C.o.A. (2011). Climate Data Online [WWW document]. URL http://www.bom.gov.au/climate/data/. Burgess, B.K., and Lowe, D.J. (1996) Mechanism of Molybdenum Nitrogenase. Chem Rev 96: 2983-3012. Burgess, B.K., Jacobs, D.B., and Stiefel, E.I. (1980) Large-scale purification of high activity Azotobacter vinelandii nitrogenase. Biochimica et Biophysica Acta (BBA)-Enzymology 614: 196-209. Burke, C., Steinberg, P., Rusch, D., Kjelleberg, S., and Thomas, T. (2011) Bacterial community assembly based on functional genes rather than species. Proceedings of the National Academy of Sciences 108: 14288-14293. Burling, M., Pattiaratchi, C., and Ivey, G. (2003) The tidal regime of Shark Bay, Western Australia. Estuarine, Coastal and Shelf Science 57: 725-735. Burns, B., Goh, F., Allen, M., and Neilan, B. (2004) Microbial diversity of extant stromatolites in the hypersaline marine environment of Shark Bay, Australia. Environ Microbiol 6: 1096-1101. Burns, B., Anitori, R., Butterworth, P., Henneberger, R., Goh, F., Allen, M. et al. (2009) Modern analogues and the early history of microbial life. Precambrian Res 173: 10-18. Burns, R.C., Hardy, R.W.F., and Anthony San, P. (1972) Purification of nitrogenase and crystallization of its Mo-Fe protein. In Methods in Enzymology: Academic Press, pp. 480-496. Cambillau, C., and Claverie, J.-M. (2000) Structural and genomic correlates of hyperthermostability. J Biol Chem 275: 32383-32386. Cannone, N., Wagner, D., Hubberten, H., and Guglielmin, M.
(2008) Biotic and abiotic factors influencing soil properties across a latitudinal gradient in Victoria Land, Antarctica. Geoderma 144: 50-65. Carpenter, E.J., Lin, S., and Capone, D.G. (2000) Bacterial Activity in South Pole Snow. Appl Environ Microbiol 66: 4514-4517. Carugo, O. (2003) How root-mean-square distance (rmsd) values depend on the resolution of protein structures that are compared. Journal of applied crystallography 36: 125-128. Caspi, R., and Karp, P.D. (2002) Using the MetaCyc Pathway Database and the BioCyc Database Collection: John Wiley & Sons, Inc. Cavicchioli, R. (2002) Extremophiles and the search for extraterrestrial life. Astrobiology 2: 281-292. Chakrabartty, A., Schellman, J.A., and Baldwin, R.L. (1991) Large differences in the helix propensities of alanine and glycine.
Chao, A., and Yang, M.C.K. (1993) Stopping Rules and Estimation for Recapture Debugging with Unequal Failure Rates. In: Biometrika Trust, pp. 193-201. Chen, H., and Zhou, H.X. (2005) Prediction of solvent accessibility and sites of deleterious mutations from protein sequence. Nucleic Acids Res 33: 3193. Chevenet, F., Brun, C., Bañuls, A., Jacq, B., and Christen, R. (2006) TreeDyn: towards dynamic graphics and annotations for analyses of trees. BMC Bioinformatics 7: 439. Chien, Y., and Zinder, S. (1996) Cloning, functional organization, transcript studies, and phylogenetic analysis of the complete nitrogenase structural genes (nifHDK2) and associated genes in the archaeon Methanosarcina barkeri 227. J Bacteriol 178: 143. Chiu, H.J., Peters, J.W., Lanzilotta, W.N., Ryle, M.J., Seefeldt, L.C., Howard, J.B., and Rees, D.C. (2001) MgATP-bound and nucleotide-free structures of a nitrogenase protein complex between the Leu 127 -Fe-protein and the MoFe-protein. Biochemistry 40: 641-650. Christner, B.C., Mosley-Thompson, E., Thompson, L.G., and Reeve, J.N. (2005) Classification of Bacteria from Polar and Nonpolar Glacial Ice. In Life in Ancient Ice. Castello, J.D., and Rogers, S.O. (eds). Princeton, New Jersey: Princeton University Press, pp. 227-239. Christner, B.C., Mosley-Thompson, E., Thompson, L.G., Zagorodnov, V., Sandman, K., and Reeve, J.N. (2000) Recovery and Identification of Viable Bacteria Immured in Glacial Ice. Icarus 144: 479-485. Chung, J., Wang, W., and Bourne, P. (2006) Exploiting sequence and structure homologs to identify protein-protein binding sites. PROTEINS-NEW YORK- 62: 630. Chung, J.L., Wang, W., and Bourne, P.E. (2005) Exploiting sequence and structure homologs to identify protein–protein binding sites. Proteins: Structure, Function, and Bioinformatics 62: 630-640. Clarridge, J.E., III (2004) Impact of 16S rRNA Gene Sequence Analysis for Identification of Bacteria on Clinical Microbiology and Infectious Diseases. Clin. Microbiol. Rev. 17: 840-862. 
Clement, B.G., Kehl, L.E., DeBord, K.L., and Kitts, C.L. (1998) Terminal restriction fragment patterns (TRFPs), a rapid, PCR-based method for the comparison of complex bacterial communities. J Microbiol Methods 31: 135-142. Cole, J., Wang, Q., Cardenas, E., Fish, J., Chai, B., Farris, R. et al. (2009) The Ribosomal Database Project: improved alignments and new tools for rRNA analysis. Nucleic Acids Res 37: D141. Cole, J.R., Chai, B., Farris, R.J., Wang, Q., McGarrell, D.M., Bandela, A.M. et al. (2007) The ribosomal database project (RDP-II): introducing myRDP space and quality controlled public data. Nucleic Acids Res 35: D169. Cole, J.R., Chai, B., Marsh, T.L., Farris, R.J., Wang, Q., Kulam, S.A. et al. (2003) The Ribosomal Database Project (RDP-II): previewing a new autoaligner that allows regular updates and the new prokaryotic taxonomy. Nucleic Acids Res 31: 442-443. Colon-Lopez, M., Sherman, D., and Sherman, L. (1997) Transcriptional and translational regulation of nitrogenase in light-dark-and continuous-light-grown cultures
of the unicellular cyanobacterium Cyanothece sp. strain ATCC 51142. J Bacteriol 179: 4319. Costello, E., Halloy, S., Reed, S., Sowell, P., and Schmidt, S. (2009) Fumarole-Supported Islands of Biodiversity within a Hyperarid, High-Elevation Landscape on Socompa Volcano, Puna de Atacama, Andes. Appl Environ Microbiol 75: 735. Cravo-Laureau, C., Matheron, R., Joulian, C., Cayol, J.-L., and Hirschler-Rea, A. (2004) Desulfatibacillum alkenivorans sp. nov., a novel n-alkene-degrading, sulfate-reducing bacterium, and emended description of the genus Desulfatibacillum. Int J Syst Evol Microbiol 54: 1639-1642. CRBIP, T. (2007) Centre de Ressources Biologiques de l'Institut Pasteur. In: Institut Pasteur. D'Agostino, R.B. (1986) Tests for Normal Distribution. In Goodness-of-fit techniques. D'Agostino, R.B., and Stephens, M.A. (eds). New York, NY, USA: Marcel Dekker, Inc. Daniel, R.M., Danson, M.J., Hough, D.W., Lee, C.K., Peterson, M.E., and Cowan, D.A. (2008) Enzyme stability and activity at high temperatures. In Protein Adaptation in Extremophiles. Siddiqui, K.S., and Thomas, T. (eds). New York, NY: Nova Science Publishers, Inc, pp. 1-34. Darapaneni, V., Prabhaker, V.K., and Kukol, A. (2009) Large-scale analysis of influenza A virus sequences reveals potential drug target sites of non-structural proteins. J Gen Virol 90: 2124-2133. DasSarma, S., and Arora, P. (2006) Halophiles. eLS. Davey, A., and Marchant, H.J. (1983) Seasonal Variation in Nitrogen Fixation by Nostoc-Commune Vaucher at the Vestfold Hills Antarctica. Phycologia 22: 377-386. Davila, A.F., Gómez-Silva, B., De los Rios, A., Ascaso, C., Olivares, H., McKay, C.P., and Wierzchos, J. (2008) Facilitation of endolithic microbial survival in the hyperarid core of the Atacama Desert by mineral deliquescence. Journal of Geophysical Research 113: G01028. Davila, A.F., Duport, L.G., Melchiorri, R., Jänchen, J., Valea, S., de los Rios, A. et al. (2010) Hygroscopic Salts and the Potential for Life on Mars. 
Astrobiology 10: 617-628. Deming, J.W. (2002) Psychrophiles and polar regions. Curr Opin Microbiol 5: 301-309. Derakshani, M., Lukow, T., and Liesack, W. (2001) Novel bacterial lineages at the (sub) division level as detected by signature nucleotide-targeted recovery of 16S rRNA genes from bulk soil and rice roots of flooded rice microcosms. Appl Environ Microbiol 67: 623-631. Des Marais, D. (1995) The biogeochemistry of hypersaline microbial mats. Adv Microb Ecol 14: 251. Des Marais, D.J. (2003) Biogeochemistry of Hypersaline Microbial Mats Illustrates the Dynamics of Modern Microbial Ecosystems and the Early Evolution of the Biosphere. Biol Bull 204: 160-167. Desnues, C., Michotey, V., Wieland, A., Zhizang, C., Fourçans, A., Duran, R., and Bonin, P. (2007) Seasonal and diel distributions of denitrifying and bacterial
communities in a hypersaline microbial mat (Camargue, France). Water Res 41: 3407-3419. Diallo, M.D., Reinhold-Hurek, B., and Hurek, T. (2008) Evaluation of PCR primers for universal nifH gene targeting and for assessment of transcribed nifH pools in roots of Oryza longistaminata with and without low nitrogen input. FEMS Microbiol Ecol 65: 220-228. Dilworth, M.J., Eldridge, M.E., and Eady, R.R. (1993) The molybdenum and vanadium nitrogenases of Azotobacter chroococcum: effect of elevated temperature on N2 reduction. Biochem J 289: 395. Distel, D.L., Morrill, W., MacLaren-Toussaint, N., Franks, D., and Waterbury, J. (2002) Teredinibacter turnerae gen. nov., sp. nov., a dinitrogen-fixing, cellulolytic, endosymbiotic gamma-proteobacterium isolated from the gills of wood-boring molluscs (Bivalvia: Teredinidae). Int J Syst Evol Microbiol 52: 2261-2269. Dixon, R., and Kahn, D. (2004) Genetic regulation of biological nitrogen fixation. Nature Reviews Microbiology 2: 621-631. Dobzhansky, T. (1973) Nothing in biology makes sense except in the light of evolution. American Biology Teacher 35: 125-129. Dunbar, J., Ticknor, L.O., and Kuske, C.R. (2000) Assessment of Microbial Diversity in Four Southwestern United States Soils by 16S rRNA Gene Terminal Restriction Fragment Analysis. Appl Environ Microbiol 66: 2943-2950. Dunbar, J., Ticknor, L.O., and Kuske, C.R. (2001) Phylogenetic Specificity and Reproducibility and New Method for Analysis of Terminal Restriction Fragment Profiles of 16S rRNA Genes from Bacterial Communities. Appl Environ Microbiol 67: 190-197. Dupraz, C., and Visscher, P. (2005) Microbial lithification in marine stromatolites and hypersaline mats. Trends Microbiol 13: 429-438. Dupraz, C., Reid, R., Braissant, O., Decho, A., Norman, R., and Visscher, P. (2009) Processes of carbonate precipitation in modern microbial mats. Earth-Sci Rev 96: 141-162. Eder, W., and Huber, R. 
(2002) New isolates and physiological properties of the Aquificales and description of Thermocrinis albus sp. nov. Extremophiles 6: 309-318. Edgar, R.C. (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucl. Acids Res. 32: 1792-1797. Edwards, A.M. (1868) Original Communications: On the Occurrence of Living Forms in the Hot Waters of California. Quarterly Journal of Microscopical Science 2: 247-250. Edwards, D., Stajich, J.E., and Hansen, D. (2009) Bioinformatics: Tools and Applications: Springer. Eisenberg, H. (1995) Life in unusual environments: progress in understanding the structure and function of enzymes from extreme halophilic bacteria. Archives of Biochemistry and Biophysics 318: 1-5. Eisenberg, H., Mevarech, M., and Zaccai, G. (1992) Biochemical, structural, and molecular genetic aspects of halophilism. Advances in protein chemistry 43: 1-62.
Empadinhas, N., and da Costa, M.S. (2010) Diversity and biosynthesis of compatible solutes in hyper/thermophiles. Int Microbiol 9: 199-206. Everett, K.D.E., Bush, R.M., and Andersen, A.A. (1999) Emended description of the order Chlamydiales, proposal of Parachlamydiaceae fam. nov. and Simkaniaceae fam. nov., each containing one monotypic genus, revised taxonomy of the family Chlamydiaceae, including a new genus and five new species, and standards for the identification of organisms. Int J Syst Bacteriol 49: 415-440. Falcón, L., Cerritos, R., Eguiarte, L., and Souza, V. (2007) Nitrogen fixation in microbial mat and stromatolite communities from Cuatro Cienegas, Mexico. Microb Ecol 54: 363-373. Fani, R., Gallo, R., and Liò, P. (2000) Molecular Evolution of Nitrogen Fixation: The Evolutionary History of the nifD, nifK, nifE, and nifN Genes. J Mol Evol 51: 1-11. Farnelid, H., Öberg, T., and Riemann, L. (2009) Identity and dynamics of putative N2 fixing picoplankton in the Baltic Sea proper suggest complex patterns of regulation. Environmental Microbiology Reports 1: 145-154. Fay, P. (1992) Oxygen relations of nitrogen fixation in cyanobacteria. Microbiology and Molecular Biology Reviews 56: 340. Feller, G., and Gerday, C. (2003) Psychrophilic enzymes: hot topics in cold adaptation. Nature Reviews Microbiology 1: 200-208. Feller, G., Lonhienne, T., Deroanne, C., Libioulle, C., Van Beeumen, J., and Gerday, C. (1992) Purification, characterization, and nucleotide sequence of the thermolabile alpha-amylase from the antarctic psychrotroph Alteromonas haloplanctis A23. J Biol Chem 267: 5217-5221. Felsenstein, J. (1981) Evolutionary trees from DNA sequences: A maximum likelihood approach. J Mol Evol 17: 368-376. Felsenstein, J. (2007) PHYLIP (phylogeny inference package) version 3.67. Distributed by the author. Department of Genome Sciences, University of Washington, Seattle, USA. Fernandez-Valiente, E., Quesada, A., Howard-Williams, C., and Hawes, I. 
(2001) N2-Fixation in Cyanobacterial Mats from Ponds on the McMurdo Ice Shelf, Antarctica. Microb Ecol 42: 338-349. Fernandez-Valiente, E., Camacho, A., Rochera, C., Rico, E., Vincent, W.F., and Quesada, A. (2007) Community structure and physiological characterization of microbial mats in Byers Peninsula, Livingston Island (South Shetland Islands, Antarctica). In, pp. 377-385. Ferris, M., Nold, S., Santegoeds, C., and Ward, D. (2001) Examining bacterial population diversity within the Octopus Spring microbial mat community. Thermophiles: Biodiversity, Ecology and Evolution: 51–64. Fields, P.A. (2001) Review: Protein function at thermal extremes: balancing stability and flexibility. Comparative Biochemistry and Physiology-Part A: Molecular & Integrative Physiology 129: 417-431. Fiore, C.L., Jarett, J.K., Olson, N.D., and Lesser, M.P. (2010) Nitrogen fixation and nitrogen transformations in marine symbioses. Trends Microbiol 18: 455-463.
Fleming, H., and Haselkorn, R. (1973) Differentiation in Nostoc muscorum: Nitrogenase is Synthesized in Heterocysts. In, pp. 2727-2731. Flint, D.J., and Abeysinghe, P.B. (2000/07) Geology and mineral resources of the Gascoyne Region: Western Australia Geological Survey. In. Perth: Western Australia Geological Survey, p. 29. Fourçans, A., Oteyza, T., Wieland, A., Solé, A., Diestra, E., Bleijswijk, J. et al. (2004) Characterization of functional bacterial groups in a hypersaline microbial mat community (Salins de Giraud, Camargue, France). FEMS Microbiol Ecol 51: 55-70. Francis, C.A., Beman, J.M., and Kuypers, M.M.M. (2007) New processes and players in the nitrogen cycle: the microbial ecology of anaerobic and archaeal ammonia oxidation. The ISME Journal 1: 19-27. Franzmann, P.D., and Dobson, S.J. (1992) Cell wall-less, free-living spirochetes in Antarctica. FEMS Microbiol Lett 97: 289-292. French, H., and Guglielmin, M. (1999a) Observations on the ice-marginal, periglacial geomorphology of Terra Nova Bay, northern Victoria Land, Antarctica. Permafrost Periglac 10: 331-347. French, H.M., and Guglielmin, M. (1999b) Observations on the Ice-Marginal, Periglacial Geomorphology of Terra Nova Bay, Northern Victoria Land, Antarctica. Permafrost Periglac 10: 331-347. French, H.M., and Guglielmin, M. (2000) Frozen Ground Phenomena in the Vicinity of Terra Nova Bay, Northern Victoria land, Antarctica: A Preliminary Report. Geografiska Annaler: Series A, Physical Geography 82: 513-526. Frezzotti, M., Salvatore, M., Vittuari, L., Grigioni, P., and De Silvestri, L. (2001) Satellite Image Map - Northern Foothills and Inexpressible Island Area (Victoria Land, Antarctica). Ter Ant Rep 6: 1-8. Friedmann, E.I. (1993) Extreme environments and exobiology. G Bot Ital 127: 369-376. Friedmann, E.I., Kappen, L., Meyer, M.A., and Nienow, J.A. (1993) Long-term productivity in the cryptoendolithic microbial community of the Ross Desert, Antarctica. Microb Ecol 25: 51-69. 
Frostegard, A., Courtois, S., Ramisse, V., Clerc, S., Bernillon, D., Le Gall, F. et al. (1999) Quantification of bias related to the extraction of DNA directly from soils. Appl Environ Microbiol 65: 5409. Fryberger, S., Krystinik, L., and Schenk, C. (1990) Tidally flooded back-barrier dunefield, Guerrero Negro area, Baja California, Mexico. Sedimentology 37: 23-43. Fukuchi, S., and Nishikawa, K. (2001) Protein surface amino acid compositions distinctively differ between thermophilic and mesophilic bacteria. J Mol Biol 309: 835-843. Fukuchi, S., Yoshimune, K., Wakayama, M., Moriguchi, M., and Nishikawa, K. (2003) Unique amino acid composition of proteins in halophilic bacteria. J Mol Biol 327: 347-357.
Gaidos, E., Lanoil, B., Thorsteinsson, T., Graham, A., Skidmore, M., Han, S.K. et al. (2004) A Viable Microbial Community in a Subglacial Volcanic Crater Lake, Iceland. Astrobiology 4: 327-344. Galinski, E.A. (1993) Compatible solutes of halophilic eubacteria: molecular principles, water-solute interaction, stress protection. Cell Mol Life Sci 49: 487-496. Galinski, E.A., and Trüper, H.G. (1994) Microbial behaviour in salt-stressed ecosystems. FEMS Microbiol Rev 15: 95-108. Gall, J.L. (1963) A new species of Desulfovibrio. J Bacteriol 86: 1120. Gallon, J.R. (2001) N-2 fixation in phototrophs: adaptation to a specialized way of life. Plant and Soil 230: 39-48. Gallon, J.R., Hashem, M.A., and Chaplin, A.E. (1991) Nitrogen fixation by Oscillatoria spp. under autotrophic and photoheterotrophic conditions. Microbiology 137: 31. Gambacorta, A., Gliozzi, A., and Rosa, M. (1995) Archaeal lipids and their biotechnological applications. World Journal of Microbiology and Biotechnology 11: 115-131. Garcia-Pichel, F., Nübel, U., and Muyzer, G. (1998) The phylogeny of unicellular, extremely halotolerant cyanobacteria. Arch Microbiol 169: 469-482. Garrity, G.M., Brenner, D.J., Krieg, N.R., and Staley, J.R. (2005) Bergey's Manual of Systematic Bacteriology, Volume Two: The Proteobacteria, Parts A - C: Springer - Verlag. Gary Stacey, R.H.B., Harold J. Evans (1992) Biological nitrogen fixation New York Chapman & Hall. Gauthier, G., Gauthier, M., and Christen, R. (1995) Phylogenetic analysis of the genera Alteromonas, Shewanella, and Moritella using genes coding for small-subunit rRNA sequences and division of the genus Alteromonas into two genera, Alteromonas (emended) and Pseudoalteromonas gen. nov., and proposal of twelve new species combinations. Int J Syst Bacteriol 45: 755-761. Georgiadis, M., Komiya, H., Chakrabarti, P., Woo, D., Kornuc, J., and Rees, D. (1992) Crystallographic structure of the nitrogenase iron protein from Azotobacter vinelandii. Science 257: 1653. 
Georlette, D., Damien, B., Blaise, V., Depiereux, E., Uversky, V.N., Gerday, C., and Feller, G. (2003) Structural and Functional Adaptations to Extreme Temperatures in Psychrophilic, Mesophilic, and Thermophilic DNA Ligases. J Biol Chem 278: 37015-37023. Gerstein, M. (1998) How representative are the known structures of the proteins in a complete genome? A comprehensive structural census. Folding and Design 3: 497-512. Gilbert, D. (2003) Sequence File Format Conversion with Command Line Readseq. Current Protocols in Bioinformatics. Gilichinsky, D., Vishnivetskaya, T., Petrova, M., Spirina, E., Mamykin, V., and Rivkina, E. (2008) Bacteria in Permafrost. In Psychrophiles: from Biodiversity to Biotechnology. Margesin, R., Schinner, F., Marx, J.-C., and Gerday, C. (eds). Berlin, Germany: Springer, pp. 83-102.
Gilichinsky, D., Rivkina, E., Bakermans, C., Shcherbakova, V., Petrovskaya, L., Ozerskaya, S. et al. (2005) Biodiversity of cryopegs in permafrost. FEMS Microbiol Ecol 53: 117-128. Gilichinsky, D.A., Wilson, G.S., Friedmann, E.I., McKay, C.P., Sletten, R.S., Rivkina, E.M. et al. (2007) Microbial Populations in Antarctic Permafrost: Biodiversity, State, Age, and Implication for Astrobiology. Astrobiology 7: 275-311. Glaser, F., Rosenberg, Y., Kessel, A., Pupko, T., and Ben-Tal, N. (2005) The ConSurf-HSSP database: the mapping of evolutionary conservation among homologs onto PDB structures. Proteins 58: 610–617. Glaser, F., Pupko, T., Paz, I., Bell, R.E., Bechor-Shental, D., Martz, E., and Ben-Tal, N. (2003) ConSurf: identification of functional regions in proteins by surface-mapping of phylogenetic information. Bioinformatics 19: 163. Gliozzi, A., Relini, A., and Chong, P.L.G. (2002) Structure and permeability properties of biomimetic membranes of bolaform archaeal tetraether lipids. J Membr Sci 206: 131-147. Goh, F., Barrow, K.D., Burns, B.P., and Neilan, B.A. (2010) Identification and regulation of novel compatible solutes from hypersaline stromatolite-associated cyanobacteria. Arch Microbiol: 1-8. Goh, F., Jeon, Y.J., Barrow, K., Neilan, B.A., and Burns, B.P. (2011) Osmoadaptive Strategies of the Archaeon Halococcus hamelinensis Isolated from a Hypersaline Stromatolite Environment. Astrobiology 11: 529-536. Goh, F., Leuko, S., Allen, M., Bowman, J., Kamekura, M., Neilan, B., and Burns, B. (2006) Halococcus hamelinensis sp. nov., a novel halophilic archaeon isolated from stromatolites in Shark Bay, Australia. Int J Syst Evol Microbiol 56: 1323. Goh, F., Allen, M., Leuko, S., Kawaguchi, T., Decho, A., Burns, B., and Neilan, B. (2008) Determining the specific microbial populations and their spatial distribution within the stromatolite ecosystem of Shark Bay. The ISME Journal 3: 383-396. Goldenberg, O., Erez, E., Nimrod, G., and Ben-Tal, N. 
(2008) The ConSurf-DB: pre-calculated evolutionary conservation profiles of protein structures. Nucleic Acids Res. Goldsmith-Fischman, S., Kuzin, A., Edstrom, W.C., Benach, J., Shastry, R., Xiao, R. et al. (2004) The SufE Sulfur-acceptor Protein Contains a Conserved Core Structure that Mediates Interdomain Interactions in a Variety of Redox Protein Complexes. J Mol Biol 344: 549-565. Golubic, S., and Walter, M.R. (1976) Chapter 4.1 Organisms that Build Stromatolites. In Developments in Sedimentology: Elsevier, pp. 113-126. Good, I.J. (1953) THE POPULATION FREQUENCIES OF SPECIES AND THE ESTIMATION OF POPULATION PARAMETERS. In, pp. 237-264. Goto, M., Ando, S., Hachisuka, Y., and Yoneyama, T. (2005) Contamination of diverse nifH and nifH-like DNA into commercial PCR primers. FEMS Microbiol Lett 246: 33-38. Gragnani R, Guglielmin M, Stenni B, Longinelli A, Smiraglia C, and L, C. (1998) Origins of the ground ice in the ice-free lands of the Northern Foothills (Northern
Victoria Land, Antarctica). Lewkowicz, A.G., and Allard, M. (eds). Yellowknife, Canada: Collection Nordicana, pp. 335-340. Grant, K. (1938) The Radio-activity and Composition of the Water and Gases of the Paralana Hot Spring. Trans. Roy. Soc. SA 62: 2. Greaves, R.B., and Warwicker, J. (2009) Stability and solubility of proteins from extremophiles. Biochem Biophys Res Commun 380: 581-585. Grimm, F., Cort, J.R., and Dahl, C. (2010) DsrR, a novel IscA-like protein lacking iron-and Fe-S-binding functions, involved in the regulation of sulfur oxidation in Allochromatium vinosum. J Bacteriol 192: 1652-1661. Groudieva, T., Kambourova, M., Yusef, H., Royter, M., Grote, R., Trinks, H., and Antranikian, G. (2004) Diversity and cold-active hydrolytic enzymes of culturable bacteria associated with Arctic sea ice, Spitzbergen. Extremophiles 8: 475-488. Guglielmin, M., and French, H.M. (2004) Ground ice in the Northern Foothills, northern Victoria Land, Antarctica. Ann Glaciol 39: 495-500. Guglielmin, M., and Cannone, N. (2012) A permafrost warming in a cooling Antarctica? Clim Change 111: 177-195. Guglielmin, M., Biasini, A., and Smiraglia, C. (1997) The Contribution of Geoelectrical Investigations in the Analysis of Periglacial and Glacial Landforms in Ice Free Areas of the Northern Foothills (Northern Victoria Land, Antarctica). Geogr Ann Ser A PhyGeogr: 17-24. Guglielmin, M., Camusso, M., Polesello, S., Valsecchi, S., and Teruzzi, M. (2002) A Note on the Ice Crystallography and Geochemistry of a Debris Cone, Northern Foothills, Antarctica. Permafrost Periglac 13: 77-82. Guindon, S., and Gascuel, O. (2003) A Simple, Fast, and Accurate Algorithm to Estimate Large Phylogenies by Maximum Likelihood. Syst Biol 52: 696-704. Guindon, S., Dufayard, J., Lefort, V., Anisimova, M., Hordijk, W., and Gascuel, O. (2010) New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst Biol 59: 307. H.M. Berman, K.H., H. 
Nakamura (2003) Announcing the worldwide Protein Data Bank. Nature Structural Biology 10: 980. H.M.Berman, J.W., Z.Feng, G.Gilliland, T.N.Bhat, H.Weissig, I.N.Shindyalov, P.E.Bourne (2000) The Protein Data Bank. Nucleic Acids Res 28: 235-242. Hall, J.R., Mitchell, K.R., Jackson-Weaver, O., Kooser, A.S., Cron, B.R., Crossey, L.J., and Takacs-Vesbach, C.D. (2008) Molecular Characterization of the Diversity and Distribution of a Thermal Spring Microbial Community using rRNA and Metabolic Genes. Appl Environ Microbiol. Hall, T.A. (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucl Acid Symp Ser 41: 95-98. Hamilton, T.L., Boyd, E.S., and Peters, J.W. (2011a) Environmental constraints underpin the distribution and phylogenetic diversity of nifH in the Yellowstone geothermal complex. Microb Ecol 61: 860-870.
Hamilton, T.L., Lange, R.K., Boyd, E.S., and Peters, J.W. (2011b) Biological nitrogen fixation in acidic high-temperature geothermal springs in Yellowstone National Park, Wyoming. Environ Microbiol 13: 2204-2215. Hammer, Ø., Harper, D.A.T., and Ryan, P.D. (2001) PAST: paleontological statistics software package for education and data analysis. Palaeontologia electronica 4: 9. Handley, K.M., Boothman, C., Mills, R.A., Pancost, R.D., and Lloyd, J.R. (2010) Functional diversity of bacteria in a ferruginous hydrothermal sediment. The ISME Journal. Haney, P.J., Badger, J.H., Buldak, G.L., Reich, C.I., Woese, C.R., and Olsen, G.J. (1999) Thermal adaptation analyzed by comparison of protein sequences from mesophilic and extremely thermophilic Methanococcus species. Proceedings of the National Academy of Sciences 96: 3578. Hartmann, L.S., and Barnum, S.R. (2010) Inferring the evolutionary history of Mo-dependent nitrogen fixation from phylogenetic studies of nifK and nifDK. J Mol Evol: 1-16. Hawkes, T.R., McLean, P.A., and Smith, B.E. (1984) Nitrogenase from nifV mutants of Klebsiella pneumoniae contains an altered form of the iron-molybdenum cofactor. Biochem J 217: 317. Head, I., Saunders, J., and Pickup, R. (1998) Microbial evolution, diversity, and ecology: a decade of ribosomal RNA analysis of uncultivated microorganisms. Microb Ecol 35: 1-21. Heeren, T., and D'Agostino, R. (1987) Robustness of the two independent samples t test when applied to ordinal scaled data. Stat Med 6: 79-90. Henikoff, J.G., Greene, E.A., Pietrokovski, S., and Henikoff, S. (2000) Increased coverage of protein families with the blocks database servers. Nucleic Acids Res 28: 228. Henikoff, S., Henikoff, J.G., and Pietrokovski, S. (1999) Blocks+: a non-redundant database of protein alignment blocks derived from multiple compilations. Bioinformatics 15: 471. Henry, E., Devereux, R., Maki, J., Gilmour, C., Woese, C., Mandelco, L. et al.
(1994) Characterization of a new thermophilic sulfate-reducing bacterium. Arch Microbiol 161: 62-69. Herbert, R.A., and Sharp, R. (1992) Molecular biology and biotechnology of extremophiles: Blackie and Son Ltd. Hewson, I., Moisander, P.H., Morrison, A.E., and Zehr, J.P. (2007) Diazotrophic bacterioplankton in a coral reef lagoon: phylogeny, diel nitrogenase expression and response to phosphate enrichment. ISME J 1: 78-91. Hirsch, P., Ludwig, W., Hethke, C., Sittig, M., Hoffmann, B., and Gallikowski, C.A. (1998) Hymenobacter roseosalivarius gen. nov., sp. nov. from continental Antartica soils and sandstone: bacteria of the Cytophaga/Flavobacterium/Bacteroides line of phylogenetic descent. Syst Appl Microbiol 21: 374-383. Hirschler-Réa, A., Matheron, R., Riffaud, C., Mouné, S., Eatock, C., Herbert, R.A. et al. (2003) Isolation and characterization of spirilloid purple phototrophic bacteria
forming red layers in microbial mats of Mediterranean salterns: description of Halorhodospira neutriphila sp. nov. and emendation of the genus Halorhodospira. Int J Syst Evol Microbiol 53: 153-163. Hobohm, U., Scharf, M., Schneider, R., and Sander, C. (1992) Selection of representative protein data sets. Protein Sci 1: 409-417. Hoehler, T.M., Bebout, B.M., and Des Marais, D.J. (2001) The role of microbial mats in the production of reduced gases on the early Earth. Nature 412: 324-327. Hoffman, P. (1976) Stromatolite Morphogenesis in Shark Bay, Western Australia. Developments in Sedimentology 20: 261-271. Hoffman, P., and Walter, M.R. (1976) Chapter 6.1 Stromatolite Morphogenesis in Shark Bay, Western Australia. In Developments in Sedimentology: Elsevier, pp. 261-271. Howard, J.B., and Rees, D.C. (1996) Structural Basis of Biological Nitrogen Fixation. Chem Rev 96: 2965-2982. Huber, R., Eder, W., Heldwein, S., Wanner, G., Huber, H., Rachel, R., and Stetter, K.O. (1998) Thermocrinis ruber gen. nov., sp. nov., a pink-filament-forming hyperthermophilic bacterium isolated from Yellowstone National Park. Appl Environ Microbiol 64: 3576-3583. Imhoff, J., Suling, J., and Petri, R. (1998) Phylogenetic relationships among the Chromatiaceae, their taxonomic reclassification and description of the new genera Allochromatium, Halochromatium, Isochromatium, Marichromatium, Thiococcus, Thiohalocapsa and Thermochromatium. Int J Syst Evol Microbiol 48: 1129. Imhoff, J.F. (2006) The family Ectothiorhodospiraceae. The Prokaryotes: 874-886. Imshenetsky, A.A., Abyzov, S.S., Voronov, G.T., Kuzjurina, L.A., Lysenko, S.V., Sotnikov, G.G., and Fedorova, R.I. (1967) Exobiology and the effect of physical factors on micro-organisms. Life Sci Space Res 5: 250-260. Ionescu, D., Hindiyeh, M., Malkawi, H., and Oren, A. (2010) Biogeography of thermophilic cyanobacteria: insights from the Zerka Ma'in hot springs (Jordan). FEMS Microbiol Ecol 72: 103-113. 
Israel, G., Cabane, M., Coll, P., Coscia, D., Raulin, F., and Niemann, H. (1999) The Cassini-Huygens ACP experiment and exobiological implications. Adv Space Res 23: 319-331. Izquierdo, J.A., and Nüsslein, K. (2006) Distribution of extensive nifH gene diversity across physical soil microenvironments. Microb Ecol 51: 441-452. Jaenicke, R. (1996) How Do Proteins Acquire Their Three-Dimensional Structure and Stability? Naturwissenschaften 83: 544-554. Jaenicke, R., and Böhm, G. (1998) The stability of proteins in extreme environments. Curr Opin Struct Biol 8: 738-748. Jahnert, R.J., and Collins, L.B. (2011) Significance of subtidal microbial deposits in Shark Bay, Australia. Mar Geol. Jang, S.B., Seefeldt, L.C., and Peters, J.W. (2000) Insights into nucleotide signal transduction in nitrogenase: structure of an iron protein with MgADP bound. Biochemistry 39: 14745-14752.
Jang, S.B., Jeong, M.S., Seefeldt, L.C., and Peters, J.W. (2004) Structural and biochemical implications of single amino acid substitutions in the nucleotide-dependent switch regions of the nitrogenase Fe protein from Azotobacter vinelandii. Journal of Biological Inorganic Chemistry 9: 1028-1033. Jannasch, H.W., and Wirsen, C.O. (1981) Morphological survey of microbial mats near deep-sea thermal vents. Appl Environ Microbiol 41: 528-538. Javor, B.J., and Castenholz, R.W. (1981) Laminated microbial mats, laguna Guerrero Negro, Mexico. Geomicrobiol J 2: 237-273. Dawson, J.O., and Gibson, A.H. (1987) Sensitivity of selected Frankia isolates from Casuarina, Allocasuarina and North American host plants to sodium chloride. Physiol Plant 70: 272-278. Jenkins, B.D., Steward, G.F., Short, S.M., Ward, B.B., and Zehr, J.P. (2004) Fingerprinting diazotroph communities in the Chesapeake Bay by using a DNA macroarray. Appl Environ Microbiol 70: 1767-1776. Jimenez-Lopez, J.C., Gachomo, E.W., Seufferheld, M.J., and Kotchoni, S.O. (2010) The maize ALDH protein superfamily: linking structural features to functional specificities. BMC Struct Biol 10: 43. Jørgensen, B., and Des Marais, D. (1990) The diffusive boundary layer of sediments: Oxygen microgradients over a microbial mat. Limnol Oceanogr 35: 1343-1355. Jungblut, A.D., and Neilan, B.A. (2010) NifH gene diversity and expression in a microbial mat community on the McMurdo Ice Shelf, Antarctica. Antarct Sci 22: 117-122. Jungblut, A.D., Hawes, I., Mountfort, D., Hitzfeld, B., Dietrich, D.R., Burns, B.P., and Neilan, B.A. (2005) Diversity within cyanobacterial mat communities in variable salinity meltwater ponds of McMurdo Ice Shelf, Antarctica. Environ Microbiol 7: 519-529. Kabsch, W. (1976) A solution for the best rotation to relate two sets of vectors. Acta Crystallographica Section A: Crystal Physics, Diffraction, Theoretical and General Crystallography 32: 922-923. Kabsch, W. 
(1978) A discussion of the solution for the best rotation to relate two sets of vectors. Acta Crystallographica Section A: Crystal Physics, Diffraction, Theoretical and General Crystallography 34: 827-828. Kabsch, W., and Sander, C. (1983) Dictionary of protein secondary structure: pattern recognition of hydrogen bonded and geometrical features. Biopolymers 22: 2577-2637. Kashefi, K., Holmes, D.E., Baross, J.A., and Lovley, D.R. (2003) Thermophily in the Geobacteraceae: Geothermobacter ehrlichii gen. nov., sp. nov., a Novel Thermophilic Member of the Geobacteraceae from the "Bag City" Hydrothermal Vent. Appl Environ Microbiol 69: 2985. Kaštovský, J., and Johansen, J.R. (2008) Mastigocladus laminosus (Stigonematales, Cyanobacteria): phylogenetic relationship of strains from thermal springs to soil-inhabiting genera of the order and taxonomic implications for the genus. Phycologia 47: 307-320.
Katoh, K., and Toh, H. (2008) Recent developments in the MAFFT multiple sequence alignment program. Briefings in Bioinformatics 9: 286-298. Katoh, K., Misawa, K., Kuma, K., and Miyata, T. (2002) MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res 30: 3059. Kawasumi, T., Igarashi, Y., Kodama, T., and Minoda, Y. (1984) Hydrogenobacter thermophilus gen. nov., sp. nov., an Extremely Thermophilic, Aerobic, Hydrogen-Oxidizing Bacterium. Int J Syst Bacteriol 34: 5-10. Kent, H., Buck, M., and Evans, D. (1989) Cloning and sequencing of the nifH gene of Desulfovibrio gigas. FEMS Microbiol Lett 61: 73-78. Kim, J., and Rees, D.C. (1994) Nitrogenase and biological nitrogen fixation. Biochemistry 33: 389-397. Kim, J., Woo, D., and Rees, D. (1993) X-ray crystal structure of the nitrogenase molybdenum-iron protein from Clostridium pasteurianum at 3.0-Å resolution. Biochemistry 32: 7104-7115. Klatt, C.G., Wood, J.M., Rusch, D.B., Bateson, M.M., Hamamura, N., Heidelberg, J.F. et al. (2011) Community ecology of hot spring cyanobacterial mats: predominant populations and their functional potential. The ISME Journal 5: 1262-1278. Klopprogge, K., Grabbe, R., Hoppert, M., and Schmitz, R.A. (2002) Membrane association of Klebsiella pneumoniae NifL is affected by molecular oxygen and combined nitrogen. Arch Microbiol 177: 223-234. Kochkina, G.A., Ivanushkina, N.E., Karasev, S.G., Gavrish, E.Y., Gurina, L.V., Evtushenko, L.I. et al. (2001) Survival of Micromycetes and Actinobacteria under Conditions of Long-Term Natural Cryopreservation. Microbiology 70: 356-364. Krebs, C. (1989) Ecological methodology: Harper & Row New York. Krylov, I.N., Semikhatov, M.A., and Walter, M.R. (1976) Appendix II Table of Time-Ranges of the Principal Groups of Precambrian Stromatolites. In Developments in Sedimentology: Elsevier, pp. 693-694. Kumar, M., Ahmad, S., Ahmad, E., Saifi, M.A., and Khan, R.H. 
(2012) In Silico Prediction and Analysis of Caenorhabditis EF-hand Containing Proteins. PloS one 7: e36770. Kumar, S., and Nussinov, R. (2001) How do thermophilic proteins deal with heat? Cell Mol Life Sci 58: 1216-1233. Kumar, S., Nei, M., Dudley, J., and Tamura, K. (2008) MEGA: a biologist-centric software for evolutionary analysis of DNA and protein sequences. Briefings in bioinformatics 9: 299. Ladunga, I. (2002a) Finding Similar Nucleotide Sequences Using Network BLAST Searches: John Wiley & Sons, Inc. Ladunga, I. (2002b) Finding Homologs in Amino Acid Sequences Using Network BLAST Searches: John Wiley & Sons, Inc. Landau, M., Mayrose, I., Rosenberg, Y., Glaser, F., Martz, E., Pupko, T., and Ben-Tal, N. (2005) ConSurf 2005: the projection of evolutionary conservation scores of residues on protein structures. Nucleic Acids Res 33: W299.
Lanyi, J. (1974) Salt-dependent properties of proteins from extremely halophilic bacteria. Microbiology and Molecular Biology Reviews 38: 272. Lanzilotta, W.N., Ryle, M.J., and Seefeldt, L.C. (1995) Nucleotide Hydrolysis and Protein Conformational Changes in Azotobacter vinelandii Nitrogenase Iron Protein: Defining the Function of Aspartate 129. Biochemistry 34: 10713-10723. Lanzilotta, W.N., Fisher, K., and Seefeldt, L.C. (1996) Evidence for electron transfer from the nitrogenase iron protein to the molybdenum-iron protein without MgATP hydrolysis: characterization of a tight protein-protein complex. Biochemistry 35: 7188-7196. Larkin, M.A., Blackshields, G., Brown, N.P., Chenna, R., McGettigan, P.A., McWilliam, H. et al. (2007) Clustal W and Clustal X version 2.0. Bioinformatics 23: 2947. Latysheva, N., Junker, V.L., Palmer, W.J., Codd, G.A., and Barker, D. (2012) The evolution of nitrogen fixation in cyanobacteria. Bioinformatics 28: 603-606. Le, S.Q., and Gascuel, O. (2008) An Improved General Amino Acid Replacement Matrix. Mol Biol Evol 25: 1307-1320. Leigh, J.A. (2000) Nitrogen fixation in methanogens: the archaeal perspective. Curr Issues Mol Biol 2: 125-131. Leipe, D.D., Wolf, Y.I., Koonin, E.V., and Aravind, L. (2002) Classification and evolution of P-loop GTPases and related ATPases. J Mol Biol 317: 41-72. Leuko, S., Goh, F., Allen, M., Burns, B., Walter, M., and Neilan, B. (2007) Analysis of intergenic spacer region length polymorphisms to investigate the halophilic archaeal diversity of stromatolites and microbial mats. Extremophiles 11: 203-210. Ley, R., Harris, J., Wilcox, J., Spear, J., Miller, S., Bebout, B. et al. (2006) Unexpected diversity and complexity of the Guerrero Negro hypersaline microbial mat. Appl Environ Microbiol 72: 3685. Li, X.-D., Huergo, L.F., Gasperina, A., Pedrosa, F.O., Merrick, M., and Winkler, F.K. 
(2009) Crystal Structure of Dinitrogenase Reductase-activating Glycohydrolase (DRAG) Reveals Conservation in the ADP-Ribosylhydrolase Fold and Specific Features in the ADP-Ribose-binding Pocket. J Mol Biol 390: 737-746. Liesack, W., and Dunfield, P.F. (2004) T-RFLP Analysis: A Rapid Fingerprinting Method for Studying Diversity, Structure, and Dynamics of Microbial Communities. In Environmental Microbiology: Methods and Protocols. Spencer, J.F.T., and Ragout de Spencer, A.L. (eds). Totowa, New Jersey: Springer, pp. 23-38. Lilburn, T., Kim, K., Ostrom, N., Byzek, K., Leadbetter, J., and Breznak, J. (2001) Nitrogen fixation by symbiotic and free-living spirochetes. Science 292: 2495. Liu, W.T., Marsh, T.L., Cheng, H., and Forney, L.J. (1997) Characterization of microbial diversity by determining terminal restriction fragment length polymorphisms of genes encoding 16S rRNA. Appl Environ Microbiol 63: 4516-4522. Liu, Y., Yao, T., Jiao, N., Kang, S., Zeng, Y., and Huang, S. (2006) Microbial community structure in moraine lakes and glacial meltwaters, Mount Everest. FEMS Microbiol Lett 265: 98-105.
Lo Giudice, A., Brilli, M., Bruni, V., De Domenico, M., Fani, R., and Michaud, L. (2007) Bacterium-bacterium inhibitory interactions among psychrotrophic bacteria isolated from Antarctic seawater (Terra Nova Bay, Ross Sea). FEMS Microbiol Ecol 60: 383-396. Logan, B. (1961) Cryptozoon and associate stromatolites from the recent, Shark Bay, Western Australia. The Journal of Geology 69: 517-533. Logan, B., and Cebulski, D. (1970) Sedimentary environments of Shark Bay, Western Australia. Am. Assoc. Pet. Geol. Mem 13: 1-37. Logan, B., Rezak, R., and Ginsburg, R. (1964) Classification and environmental significance of algal stromatolites. The Journal of Geology 72: 68-83. Logan, B., Hoffman, P., and Gebelein, C. (1974) Algal mats, cryptalgal fabrics and structures. Hamelin Pool, Western Australia: American Association of Petroleum Geologists Memoir 22: 140-194. Logan, B., Davies, G., Read, J., and Cebulski, D. (1970) Carbonate sedimentation and environments, Shark Bay, Western Australia: AAPG. Long, N., McPhail, D., Brugger, J., and Plimer, I. (2001) Geochemical and thermal characterisation of the Paralana Hot Springs, northern Flinders Ranges, South Australia: Geological Society of Australia; 1999, pp. 35-35. López-Cortés, A., García-Pichel, F., Nübel, U., and Vázquez-Juárez, R. (2001) Cyanobacterial diversity in extreme environments in Baja California, Mexico: a polyphasic study. Int Microbiol 4: 227-236. Lozupone, C., and Knight, R. (2005) UniFrac: a new phylogenetic method for comparing microbial communities. Appl Environ Microbiol 71: 8228. Lozupone, C., Hamady, M., and Knight, R. (2006) UniFrac – An online tool for comparing microbial community diversity in a phylogenetic context. BMC Bioinformatics 7: 371. Lysnes, K., Thorseth, I.H., Steinsbu, B.O., Ovreas, L., Torsvik, T., and Pedersen, R.B. (2004) Microbial community diversity in seafloor basalt from the Arctic spreading ridges. FEMS Microbiol Ecol 50: 213-230. 
Ma, Y., Galinski, E.A., Grant, W.D., Oren, A., and Ventosa, A. (2010) Halophiles 2010: Life in Saline Environments. Appl Environ Microbiol 76: 6971. Mack, E.E., Mandelco, L., Woese, C.R., and Madigan, M.T. (1993) Rhodospirillum sodomense, sp. nov., a Dead Sea Rhodospirillum species. Arch Microbiol 160: 363-371. Madern, D., Pfister, C., and Zaccai, G. (1995) Mutation at a single acidic amino acid enhances the halophilic behaviour of malate dehydrogenase from Haloarcula marismortui in physiological salts. Eur J Biochem 230: 1088-1095. Madern, D., Ebel, C., and Zaccai, G. (2000) Halophilic adaptation of enzymes. Extremophiles 4: 91-98. Madigan, M., Cox, S.S., and Stegeman, R.A. (1984) Nitrogen fixation and nitrogenase activities in members of the family Rhodospirillaceae. J Bacteriol 157: 73-78.
Man-Aharonovich, D., Kress, N., Zeev, E.B., Berman-Frank, I., and Beja, O. (2007) Molecular ecology of nifH genes and transcripts in the eastern Mediterranean Sea. In, pp. 2354-2363. Mannisto, M.K., and Haggblom, M.M. (2006) Characterization of psychrotolerant heterotrophic bacteria from Finnish Lapland. Syst Appl Microbiol 29: 229-243. Marchesi, J.R., Sato, T., Weightman, A.J., Martin, T.A., Fry, J.C., Hiom, S.J., and Wade, W.G. (1998) Design and evaluation of useful bacterium-specific PCR primers that amplify genes coding for bacterial 16S rRNA. Appl Environ Microbiol 64: 795. Marsh, T.L. (1999) Terminal restriction fragment length polymorphism (T-RFLP): An emerging method for characterizing diversity among homologous populations of amplification products. Curr Opin Microbiol 2: 323-327. Marsh, T.L. (2005) Culture-independent microbial community analysis with terminal restriction fragment length polymorphism. Methods Enzymol 397: 308-329. Marsh, T.L., Saxman, P., Cole, J., and Tiedje, J. (2000) Terminal Restriction Fragment Length Polymorphism Analysis Program, a Web-Based Research Tool for Microbial Community Analysis. Appl Environ Microbiol 66: 3616-3620. Marteinsson, V.T., Birrien, J.-L., Reysenbach, A.-L., Vernet, M., Marie, D., Gambacorta, A. et al. (1999) Thermococcus barophilus sp. nov., a new barophilic and hyperthermophilic archaeon isolated under high hydrostatic pressure from a deep-sea hydrothermal vent. Int J Syst Bacteriol 49: 351-359. Martin, A.P. (2002) Phylogenetic approaches for describing and comparing the diversity of microbial communities. Appl Environ Microbiol 68: 3673. Mawson, D. (1927) The Paralana hot spring. Trans R Soc S Aust 20: 391–397. McGinnis, S., and Madden, T.L. (2004) BLAST: at the core of a powerful and diverse set of sequence analysis tools. Nucleic Acids Res 32: W20. McGuinness, L.M., Salganik, M., Vega, L., Pickering, K.D., and Kerkhof, L.J. 
(2006) Replicability of Bacterial Communities in Denitrifying Bioreactors as Measured by PCR/T-RFLP Analysis. Environ Sci Technol 40: 509-515. McKay, C.P., Friedmann, E.I., and Meyer, M.A. (1991) From Siberia to Mars. Planet Rep Mar-Apr: 8-11. Mehta, M.P., and Baross, J.A. (2006) Nitrogen fixation at 92 C by a hydrothermal vent archaeon. Science 314: 1783-1786. Mehta, M.P., Butterfield, D.A., and Baross, J.A. (2003) Phylogenetic diversity of nitrogenase (nifH) genes in deep-sea and hydrothermal vent environments of the Juan de Fuca Ridge. Appl Environ Microbiol 69: 960. Meng, E., Pettersen, E., Couch, G., Huang, C., and Ferrin, T. (2006) Tools for integrated sequence-structure analysis with UCSF Chimera. BMC Bioinformatics 7: 339. Meng, L., and Feldman, L.J. (2010) CLE14/CLE20 peptides may interact with CLAVATA2/CORYNE receptor-like kinases to irreversibly inhibit cell division in the root meristem of Arabidopsis. Planta 232: 1061-1074. Meng, L., Wong, J.H., Feldman, L.J., Lemaux, P.G., and Buchanan, B.B. (2010) A membrane-associated thioredoxin required for plant growth moves from cell to cell,
suggestive of a role in intercellular communication. Proceedings of the National Academy of Sciences 107: 3900-3905. Methé, B.A., Webster, J., Nevin, K., Butler, J., and Lovley, D.R. (2005) DNA microarray analysis of nitrogen fixation and Fe (III) reduction in Geobacter sulfurreducens. Appl Environ Microbiol 71: 2530. Michaud, L., Cello, F., Brilli, M., Fani, R., Giudice, A., and Bruni, V. (2004) Biodiversity of cultivable psychrotrophic marine bacteria isolated from Terra Nova Bay (Ross Sea, Antarctica). FEMS Microbiol Lett 230: 63-71. Miller, S.R., Castenholz, R.W., and Pedersen, D. (2007) Phylogeography of the thermophilic cyanobacterium Mastigocladus laminosus. Appl Environ Microbiol 73: 4751. Miller, S.R., Strong, A.L., Jones, K.L., and Ungerer, M.C. (2009) Bar-coded pyrosequencing reveals shared bacterial community properties along the temperature gradients of two alkaline hot springs in Yellowstone National Park. Appl Environ Microbiol 75: 4565. Mindlin, S., Soina, V., Petrova, M., and Gorlenko, Z. (2008) Isolation of antibiotic resistance bacterial strains from Eastern Siberia permafrost sediments. Russ J Genet 44: 27-34. Mishustin, E.N., and Shilnikova, V.K. (1971) Biological fixation of atmospheric nitrogen. London: Macmillan.420. Miteva, V. (2008) Bacteria in Snow and Glacier Ice. In Psychrophiles: from Biodiversity to Biotechnology. Margesin, R., Schinner, F., Marx, J.-C., and Gerday, C. (eds). Berlin, Germany: Springer pp. 31-50. Miteva, V.I., and Brenchley, J.E. (2005) Detection and Isolation of Ultrasmall Microorganisms from a 120,000-Year-Old Greenland Glacier Ice Core. Appl Environ Microbiol 71: 7806-7818. Miteva, V.I., Sheridan, P.P., and Brenchley, J.E. (2004) Phylogenetic and Physiological Diversity of Microorganisms Isolated from a Deep Greenland Glacier Ice Core. Appl Environ Microbiol 70: 202-213. Miyamoto, K., Hallenbeck, P.C., and Benemann, J.R. 
(1979) Nitrogen fixation by thermophilic blue-green algae (cyanobacteria): temperature characteristics and potential use in biophotolysis. Appl Environ Microbiol 37: 454. Moeseneder, M.M., Arrieta, J.M., Muyzer, G., Winter, C., and Herndl, G.J. (1999) Optimization of Terminal-Restriction Fragment Length Polymorphism Analysis for Complex Marine Bacterioplankton Communities and Comparison with Denaturing Gradient Gel Electrophoresis. In, pp. 3518-3525. Mohamed, N., Colman, A., Tal, Y., and Hill, R. (2008a) Diversity and expression of nitrogen fixation genes in bacterial symbionts of marine sponges. Environ Microbiol 10: 2910-2921. Mohamed, N.M., Colman, A.S., Tal, Y., and Hill, R.T. (2008b) Diversity and expression of nitrogen fixation genes in bacterial symbionts of marine sponges. Environ Microbiol 10: 2910-2921.
Moisander, P.H., Morrison, A.E., Ward, B.B., Jenkins, B.D., and Zehr, J.P. (2007) Spatial-temporal variability in diazotroph assemblages in Chesapeake Bay using an oligonucleotide nifH microarray. Environ Microbiol 9: 1823-1835. Moisander, P.H., Shiue, L., Steward, G.F., Jenkins, B.D., Bebout, B.M., and Zehr, J.P. (2006) Application of a nifH oligonucleotide microarray for profiling diversity of N2-fixing microorganisms in marine microbial mats. Environ Microbiol 8: 1721-1735. Mooney, C., Davey, N., Martin, A., Walsh, I., Shields, D.C., and Pollastri, G. (2011) In silico protein motif discovery and structural analysis. In Methods in molecular biology. Yu, B., and Hinchcliffe, M. (eds). Clifton, NJ: Springer Science+Business Media, pp. 341-353. Moret, M., and Zebende, G. (2007) Amino acid hydrophobicity and accessible surface area. Physical Review E 75: 011920. Moses, J., Fouchet, T., Bézard, B., Gladstone, G., Lellouch, E., and Feuchtgruber, H. (2005) Photochemistry and diffusion in Jupiter's stratosphere: Constraints from ISO observations and comparisons with other giant planets. J. Geophys. Res 110: E08001. Motulsky, H., and Christopoulos, A. (2004) Fitting models to biological data using linear and nonlinear regression: a practical guide to curve fitting: Oxford University Press, USA. Motulsky, H.J., and Brown, R.E. (2006) Detecting outliers when fitting data with nonlinear regression–a new method based on robust nonlinear regression and the false discovery rate. BMC Bioinformatics 7: 123. Moult, J. (2005) A decade of CASP: progress, bottlenecks and prognosis in protein structure prediction. Curr Opin Struct Biol 15: 285-289. Muller, S.W. (1947) Permafrost or, Permanently frozen ground and related engineering problems (Strategic engineering study). Ann Arbor, Michigan: Edwards Brothers.231. Mullis, K., and Erlich, H. (1988) Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239: 487–491. 
Musat, F., Harder, J., and Widdel, F. (2006) Study of nitrogen fixation in microbial communities of oil-contaminated marine sediment microcosms. Environ Microbiol 8: 1834-1843. Muyzer, G., de Waal, E.C., and Uitterlinden, A.G. (1993) Profiling of complex microbial populations by denaturing gradient gel electrophoresis analysis of polymerase chain reaction-amplified genes coding for 16S rRNA. Appl Environ Microbiol 59: 695-700. Nakazawa, H., Arakaki, A., Narita-Yamada, S., Yashiro, I., Jinno, K., Aoki, N. et al. (2009) Whole genome sequence of Desulfovibrio magneticus strain RS-1 revealed common gene clusters in magnetotactic bacteria. Genome Res 19: 1801. NASA (2012). Missions. URL http://science.nasa.gov/earth-science/missions/ Needleman, S.B., and Wunsch, C.D. (1970) A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol 48: 443-453.
Neilan, B.A. (1995) Identification and Phylogenetic Analysis of Toxigenic Cyanobacteria by Multiplex Randomly Amplified Polymorphic DNA PCR. In, pp. 2286-2291. Néron, B., Ménager, H., Maufrais, C., Joly, N., Maupetit, J., Letort, S. et al. (2009) Mobyle: a new full web bioinformatics framework. Bioinformatics 25: 3005. Nicolaus, B., Lama, L., Esposito, E., Manca, M., Gambacorta, A., and Prisco, G. (1996) “Bacillus thermoantarcticus” sp. nov., from Mount Melbourne, Antarctica: a novel thermophilic species. Polar Biol 16: 101-104. Nicolaus, B., Improta, R., Manca, M.C., Lama, L., Esposito, E., and Gambacorta, A. (1998) Alicyclobacilli from an unexplored geothermal soil in Antarctica: Mount Rittmann. Polar Biol 19: 133-141. Nicolaus, B., Marsiglia, F., Esposito, E., Trincone, A., Lama, L., Sharp, R. et al. (1991) Isolation of five strains of thermophilic eubacteria in Antarctica. Polar Biol 11: 425-429. Niederberger, T.D., McDonald, I.R., Hacker, A.L., Soo, R.M., Barrett, J.E., Wall, D.H., and Cary, S.C. (2008) Microbial community composition in soils of Northern Victoria Land, Antarctica. Environ Microbiol 10: 1713-1724. Nishikawa, K., Kubota, Y., and Tatsuo, O. (1983) Classification of proteins into groups based on amino acid composition and other characters. I. Angular distribution. Journal of biochemistry 94: 981-995. Nuin, P.A.S., Wang, Z., and Tillier, E.R.M. (2006) The accuracy of several multiple sequence alignment programs for proteins. BMC Bioinformatics 7: 471. Néron, B., Tufféry, P., and Letondal, C. (2005) Mobyle: a Web portal framework for bioinformatics analyses. Network Tools and Applications in Biology (poster), Naples, Italy. O'Leary, M.J., Hearty, P.J., and McCulloch, M.T. (2008) U-series evidence for widespread reef development in Shark Bay during the last interglacial. Palaeogeography, Palaeoclimatology, Palaeoecology 259: 424-435. Okon, Y. (1985) Azospirillum as a potential inoculant for agriculture. Trends Biotechnol 3: 223-228. Oliveros, J. 
(2007) VENNY. An interactive tool for comparing lists with Venn Diagrams. In: BioinfoGP, CNB-CSIC. URL http://bioinfogp.cnb.csic.es/tools/venny/index.html [accessed on 30 April 2009]. Olson, N., Ainsworth, T., Gates, R., and Takabayashi, M. (2009) Diazotrophic bacteria associated with Hawaiian Montipora corals: diversity and abundance in correlation with symbiotic dinoflagellates. J Exp Mar Biol Ecol 371: 140-146. Omoregie, E., Crumbliss, L., Bebout, B., and Zehr, J. (2004a) Determination of nitrogen-fixing phylotypes in Lyngbya sp. and Microcoleus chthonoplastes cyanobacterial mats from Guerrero Negro, Baja California, Mexico. Appl Environ Microbiol 70: 2119. Omoregie, E.O., Crumbliss, L.L., Bebout, B.M., and Zehr, J.P. (2004b) Comparison of diazotroph community structure in Lyngbya sp. and Microcoleus chthonoplastes
dominated microbial mats from Guerrero Negro, Baja, Mexico. FEMS Microbiol Ecol 47: 305-318. Omoregie, E.O., Crumbliss, L.L., Bebout, B.M., and Zehr, J.P. (2004c) Determination of nitrogen-fixing phylotypes in Lyngbya sp. and Microcoleus chthonoplastes cyanobacterial mats from Guerrero Negro, Baja California, Mexico. Appl Environ Microbiol 70: 2119-2128. Oren, A. (1986) Intracellular salt concentrations of the anaerobic halophilic eubacteria Haloanaerobium praevalens and Halobacteroides halobius. Can J Microbiol 32: 4-9. Oren, A. (1999) Bioenergetic aspects of halophilism. Microbiology and Molecular Biology Reviews 63: 334. Oren, A. (2002) Diversity of halophilic microorganisms: environments, phylogeny, physiology, and applications. J Ind Microbiol Biotechnol 28: 56-63. Oren, A., Kessel, M., and Stackebrandt, E. (1989) Ectothiorhodospira marismortui sp. nov., an obligately anaerobic, moderately halophilic purple sulfur bacterium from a hypersaline sulfur spring on the shore of the Dead Sea. Arch Microbiol 151: 524-529. Oren, A., Ionescu, D., Hindiyeh, M., and Malkawi, H. (2009) Morphological, phylogenetic and physiological diversity of cyanobacteria in the hot springs of Zerka Ma. BioRisk 3: 69. Orombelli, G., Baroni, C., and Denton, G. (1991) Late Cenozoic glacial history of the Terra Nova Bay region, northern Victoria Land, Antarctica. Geogr Fis Din Quat 13: 139-163. Osborn, A.M., Moore, E.R.B., and Timmis, K.N. (2000) An evaluation of terminal restriction fragment length polymorphism (T-RFLP) analysis for the study of microbial community structure and dynamics. Environ Microbiol 2: 39-50. Ostroumov, V., and Siegert, C. (1996) Exobiological aspects of mass transfer in microzones of permafrost deposits. Adv Space Res 18: 79-86. Paerl, H.W., Pinckney, J.L., and Steppe, T.F. (2000) Cyanobacterial-bacterial mat consortia: examining the functional unit of microbial survival and growth in extreme environments. Environ Microbiol 2: 11-26. 
Paerl, H.W., Steppe, T.F., Buchan, K.C., and Potts, M. (2003) Hypersaline cyanobacterial mats as indicators of elevated tropical hurricane activity and associated climate change. AMBIO: A Journal of the Human Environment 32: 87-90. Pandey, K.D., Shukla, S.P., Shukla, P.N., Giri, D.D., Singh, J.S., Singh, P., and Kashyap, A.K. (2004) Cyanobacteria in Antarctica: ecology, physiology and cold adaptation. Cell Mol Biol (Noisy-le-grand) 50: 575-584. Papineau, D., Walker, J., Mojzsis, S., and Pace, N. (2005) Composition and structure of microbial communities from stromatolites of Hamelin Pool in Shark Bay, Western Australia. Appl Environ Microbiol 71: 4822. Paster, B., Dewhirst, F., Weisburg, W., Tordoff, L., Fraser, G., Hespell, R. et al. (1991) Phylogenetic analysis of the spirochetes. J Bacteriol 173: 6101. Patel, B., Morgan, H., and Daniel, R. (1985) Thermophilic anaerobic spirochetes in New Zealand hot springs. FEMS Microbiol Lett 26: 101-106.
Pätzold, M., Häusler, B., Bird, M., Tellmann, S., Mattei, R., Asmar, S. et al. (2007) The structure of Venus’ middle atmosphere and ionosphere. Nature 450: 657-660. Paul, S., Bag, S.K., Das, S., Harvill, E.T., and Dutta, C. (2008) Molecular signature of hypersaline adaptation: insights from genome and proteome composition of halophilic prokaryotes. Genome Biol 9: R70. Pearson, W.R. (1991) Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms. Genomics 11: 635-650. Pennisi, E. (1997) Biotechnology: in industry, extremophiles begin to make their mark. Science 276: 705. Pepi, M., Agnorelli, C., and Bargagli, R. (2005) Iron Demand by Thermophilic and Mesophilic Bacteria Isolated from an Antarctic Geothermal Soil. BioMetals 18: 529-536. Pernthaler, A., Dekas, A.E., Brown, C.T., Goffredi, S.K., Embaye, T., and Orphan, V.J. (2008) Diverse syntrophic partnerships from deep-sea methane vents revealed by direct cell capture and metagenomics. Proceedings of the National Academy of Sciences 105: 7052. Perreault, N.N., Andersen, D.T., Pollard, W.H., Greer, C.W., and Whyte, L.G. (2007) Characterization of the Prokaryotic Diversity in Cold Saline Perennial Springs of the Canadian High Arctic. Appl Environ Microbiol 73: 1532-1543. Peters, J., Fisher, K., and Dean, D. (1995) Nitrogenase structure and function: a biochemical-genetic perspective. Annual Reviews in Microbiology 49: 335-366. Peters, J.W., and Szilagyi, R.K. (2006) Exploring new frontiers of nitrogenase structure and mechanism. Curr Opin Chem Biol 10: 101-108. Petrova, M.A., Mindlin, S.Z., Gorlenko, Z.M., Kalyaeva, E.S., Soina, V.S., and Bogdanova, E.S. (2002) Mercury-Resistant Bacteria from Permafrost Sediments and Prospects for their Use in Comparative Studies of Mercury Resistance Determinants. Russ J Genet 38: 1330-1334. Pettersen, E.F., Goddard, T.D., Huang, C.C., Couch, G.S., Greenblatt, D.M., Meng, E.C., and Ferrin, T.E. 
(2004) UCSF Chimera - A Visualization System for Exploratory Research and Analysis. J Comput Chem 25: 1605-1612. Piccardi, G., Udisti, R., and Casella, F. (1994) Seasonal trends and chemical composition of snow at Terra Nova Bay (Antarctica). Int J Environ Anal Chem 55: 219-234. Pierrehumbert, R.T. (2011) Infrared radiation and planetary temperature. Physics Today 64: 33. Pikuta, E.V., Hoover, R.B., and Tang, J. (2007) Microbial extremophiles at the limits of life. Crit Rev Microbiol 33: 183-209. Pinckney, J., Paerl, H.W., and Bebout, B.M. (1995) Salinity control of benthic microbial mat community production in a Bahamian hypersaline lagoon. J Exp Mar Biol Ecol 187: 223-237.
Pinckney, J.L., and Paerl, H.W. (1997) Anoxygenic Photosynthesis and Nitrogen Fixation by a Microbial Mat Community in a Bahamian Hypersaline Lagoon. Appl Environ Microbiol 63: 420-426. Playford, P.E., Cockbain, A.E., and Walter, M.R. (1976) Chapter 8.2 Modern Algal Stromatolites at Hamelin Pool, A Hypersaline Barred Basin in Shark Bay, Western Australia. In Developments in Sedimentology: Elsevier, pp. 389-411. Pointing, S.B., Chan, Y., Lacap, D.C., Lau, M.C.Y., Jurgens, J.A., and Farrell, R.L. (2009) Highly specialized microbial diversity in hyper-arid polar desert. Proceedings of the National Academy of Sciences 106: 19964-19969. Polański, A., and Kimmel, M. (2007) Bioinformatics: Springer. Pollastri, G., Baldi, P., Fariselli, P., and Casadio, R. (2002) Prediction of coordination number and relative solvent accessibility in proteins. Proteins: Structure, Function, and Bioinformatics 47: 142-153. Polz, M.F., and Cavanaugh, C.M. (1998) Bias in template-to-product ratios in multitemplate PCR. Appl Environ Microbiol 64: 3724-3730. Posada, D., Guindon, S., Delsuc, F., Dufayard, J.-F., and Gascuel, O. (2009) Estimating Maximum Likelihood Phylogenies with PhyML. In Bioinformatics for DNA Sequence Analysis. Posada, D. (ed): Humana Press, pp. 113-137. Postgate, J., Kent, H., and Robson, R. (1988) Nitrogen fixation by Desulfovibrio. The Nitrogen and Sulphur Cycles: 457–471. Postgate, J.R. (1982) The fundamentals of nitrogen fixation: Cambridge Univ Pr. Postgate, J.R. (1987) Nitrogen Fixation: Cambridge University Press. Priscu, J.C., Fritsen, C.H., Adams, E.E., Giovannoni, S.J., Paerl, H.W., McKay, C.P. et al. (1998) Perennial Antarctic lake ice: an oasis for life in a polar desert. Science 280: 2095-2098. Priscu, J.C., Adams, E.E., Lyons, W.B., Voytek, M.A., Mogk, D.W., Brown, R.L. et al. (1999) Geomicrobiology of Subglacial Ice Above Lake Vostok, Antarctica. Science 286: 2141. Proctor, L.M. 
(1997) Nitrogen-fixing, photosynthetic, anaerobic bacteria associated with pelagic copepods. Aquat Microb Ecol 12: 105-113. Pumbwe, L., Skilbeck, C.A., and Wexler, H.M. (2007) Impact of Anatomic Site on Growth, Efflux-Pump Expression, Cell Structure, and Stress Responsiveness of Bacteroides fragilis. Curr Microbiol 55: 362-365. Pupko, T., Bell, R.E., Mayrose, I., Glaser, F., and Ben-Tal, N. (2002) Rate4Site: an algorithmic tool for the identification of functional regions in proteins by surface mapping of evolutionary determinants within their homologues. Bioinformatics 18: S71-77. Qiu, X., Wu, L., Huang, H., McDonel, P.E., Palumbo, A.V., Tiedje, J.M., and Zhou, J. (2001) Evaluation of PCR-Generated Chimeras, Mutations, and Heteroduplexes with 16S rRNA Gene-Based Cloning. Appl Environ Microbiol 67: 880-887.
Quaiser, A., Zivanovic, Y., Moreira, D., and López-García, P. (2010) Comparative metagenomics of bathypelagic plankton and bottom sediment from the Sea of Marmara. The ISME Journal. Ramelot, T.A., Cort, J.R., Goldsmith-Fischman, S., Kornhaber, G.J., Xiao, R., Shastry, R. et al. (2004) Solution NMR structure of the iron–sulfur cluster assembly protein U (IscU) with zinc bound at the active site. J Mol Biol 344: 567-583. Ramsden, J. (2009) Bioinformatics: an introduction: Springer. Rao, J., and Argos, P. (1981) Structural stability of halophilic proteins. Biochemistry 20: 6536-6543. Rasche, M.E., and Seefeldt, L.C. (1997) Reduction of Thiocyanate, Cyanate, and Carbon Disulfide by Nitrogenase: Kinetic Characterization and EPR Spectroscopic Analysis. Biochemistry 36: 8574-8585. Ravenschlag, K., Sahm, K., Pernthaler, J., and Amann, R. (1999) High Bacterial Diversity in Permanently Cold Marine Sediments. Appl Environ Microbiol 65: 3982-3989. Raymond, J., Siefert, J.L., Staples, C.R., and Blankenship, R.E. (2004a) The Natural History of Nitrogen Fixation. Mol Biol Evol 21: 541-554. Raymond, J., Siefert, J.L., Staples, C.R., and Blankenship, R.E. (2004b) The Natural History of Nitrogen Fixation. In, pp. 541-554. Razia, M., Raja, K., Padmanaban, K., Sivaramakrishnan, S., and Chellapandi, P. (2010) A Phylogenetic Approach for Assigning Function of Hypothetical Proteins in Photorhabdus luminescens Subsp. laumondii TT01 Genome. J Comput Sci Syst Biol 3: 21-29. Reddy, K., Haskell, J., Sherman, D., and Sherman, L. (1993) Unicellular, aerobic nitrogen-fixing cyanobacteria of the genus Cyanothece. J Bacteriol 175: 1284. Rengpipat, S., Lowe, S., and Zeikus, J. (1988) Effect of extreme salt concentrations on the physiology and biochemistry of Halobacteroides acetoethylicus. J Bacteriol 170: 3065. Rhodes, M.E., Fitz-Gibbon, S.T., Oren, A., and House, C.H. (2010) Amino acid signatures of salinity on an environmental scale with a focus on the Dead Sea. Environ Microbiol 12: 2613-2623. 
Richardson, L.L., and Castenholz, R.W. (1987) Diel vertical movements of the cyanobacterium Oscillatoria terebriformis in a sulfide-rich hot spring microbial mat. Appl Environ Microbiol 53: 2142. Riding, R. (1999) The term stromatolite: towards an essential definition. Lethaia 32: 321-330. Riederer-Henderson, M.A., and Wilson, P. (1970) Nitrogen fixation by sulphate-reducing bacteria. Microbiology 61: 27. Ríos, A., Valera, S., Ascaso, C., Davila, A., Kastovsky, J., McKay, C.P. et al. (2010) Comparative analysis of the microbial communities inhabiting halite evaportes of the Atacama Desert. International microbiology: official journal of the Spanish Society for Microbiology 13: 79-89.
216
Risatti, J., Capman, W., and Stahl, D. (1994) Community structure of a microbial mat: the phylogenetic dimension. Proceedings of the National Academy of Sciences of the United States of America 91: 10173. Rodriguez, R., Chinea, G., Lopez, N., Pons, T., and Vriend, G. (1998) Homology modeling, model and software evaluation: three related resources. Bioinformatics 14: 523-528. Roesch, L.F.W., Fulthorpe, R.R., Jaccques, R.J.S., Bento, F.M., and de Oliveira Camargo, F.A. (2010) Biogeography of diazotrophic bacteria in soils. World Journal of Microbiology and Biotechnology: 1-6. Rothschild, L.J., and Mancinelli, R.L. (2001) Life in extreme environments. Nature 409: 1092-1101. Roy, A., Kucukural, A., and Zhang, Y. (2010) I-TASSER: a unified platform for automated protein structure and function prediction. Nature protocols 5: 725-738. Rychlewski, L., Li, W., Jaroszewski, L., and Godzik, A. (2000) Comparison of sequence profiles. Strategies for structural predictions using sequence information. Protein Sci 9: 232-241. Sakaguchi, T., Arakaki, A., and Matsunaga, T. (2002) Desulfovibrio magneticus sp. nov., a novel sulfate-reducing bacterium that produces intracellular single-domain-sized magnetite particles. Int J Syst Evol Microbiol 52: 215. Sander, C., and Schneider, R. (1991) Database of homology derived protein structures and the structural meaning of sequence alignment. Proteins: Structure, Function, and Bioinformatics 9: 56-68. Schaller, R.R. (1997) Moore's law: past, present and future. Spectrum, IEEE 34: 52-59. Schink, B. (1992) The genus Pelobacter. The Prokaryotes: 3393–3399. Schleifer, K.-H. (2004) Microbial Diversity: Facts, Problems and Prospects. Syst Appl Microbiol 27: 3-9. Schlessman, J.L., Woo, D., Joshua-Tor, L., Howard, J.B., and Rees, D.C. (1998) Conformational variability in structures of the nitrogenase iron proteins from Azotobacter vinelandii and Clostridium pasteurianum1. J Mol Biol 280: 669-685. Schloss, P., and Handelsman, J. 
(2006a) Introducing TreeClimber, a test to compare microbial community structures. Appl Environ Microbiol 72: 2379. Schloss, P.D., and Handelsman, J. (2005) Introducing DOTUR, a Computer Program for Defining Operational Taxonomic Units and Estimating Species Richness. Appl Environ Microbiol 71: 1501-1506. Schloss, P.D., and Handelsman, J. (2006b) Introducing SONS, a Tool for Operational Taxonomic Unit-Based Comparisons of Microbial Community Memberships and Structures. Appl Environ Microbiol 72: 6773-6779. Schloss, P.D., Larget, B.R., and Handelsman, J. (2004) Integration of Microbial Ecology and Statistics: a Test To Compare Gene Libraries. Appl Environ Microbiol 70: 5485-5492. Schloss, P.D., Westcott, S.L., Ryabin, T., Hall, J.R., Hartmann, M., Hollister, E.B. et al. (2009) Introducing mothur: Open-Source, Platform-Independent, Community-
217
Supported Software for Describing and Comparing Microbial Communities. Appl Environ Microbiol 75: 7537-7541. Schneegurt, M.A., Sherman, D.M., Nayar, S., and Sherman, L.A. (1994) Oscillating behavior of carbohydrate granule formation and dinitrogen fixation in the cyanobacterium Cyanothece sp. strain ATCC 51142. J Bacteriol 176: 1586. Schopf, J.W. (2006) Fossil evidence of Archaean life. Philosophical Transactions of the Royal Society B: Biological Sciences 361: 869-885. Segawa, T., Miyamoto, K., Ushida, K., Agata, K., Okada, N., and Kohshima, S. (2005) Seasonal Change in Bacterial Flora and Biomass in Mountain Snow from the Tateyama Mountains, Japan, Analyzed by 16S rRNA Gene Sequencing and Real-Time PCR. Appl Environ Microbiol 71: 123-130. Serebryakov, S.N., and Walter, M.R. (1976a) Chapter 10.8 Distribution of Stromatolites in Riphean Deposits of the Uchur-Maya Region of Siberia. In Developments in Sedimentology: Elsevier, pp. 613-614, 615-620, 621-633. Serebryakov, S.N., and Walter, M.R. (1976b) Chapter 6.4 Biotic and Abiotic Factors Controlling the Morphology of Riphean Stromatolites. In Developments in Sedimentology: Elsevier, pp. 321-336. Serrano, L., Sancho, J., Hirshberg, M., and Fersht, A.R. (1992a) [alpha]-Helix stability in proteins:: I. Empirical correlations concerning substitution of side-chains at the N and C-caps and the replacement of alanine by glycine or serine at solvent-exposed surfaces. J Mol Biol 227: 544-559. Serrano, L., Neira, J.L., Sancho, J., and Fersht, A.R. (1992b) Effect of alanine versus glycine in α-helices on protein stability. Severin, I., and Stal, L.J. (2010) NifH expression by five groups of phototrophs compared with nitrogenase activity in coastal microbial mats. FEMS Microbiol Ecol 73: 55-67. Severin, J., Wohlfarth, A., and Galinski, E.A. (1992) The predominant role of recently discovered tetrahydropyrimidines for the osmoadaptation of halophilic eubacteria. Journal of general microbiology 138: 1629. 
Sheridan, P.P., Miteva, V.I., and Brenchley, J.E. (2003) Phylogenetic Analysis of Anaerobic Psychrophilic Enrichment Cultures Obtained from a Greenland Glacier Ice Core. Appl Environ Microbiol 69: 2153-2160. Shi, R., Proteau, A., Villarroya, M., Moukadiri, I., Zhang, L., Trempe, J.F. et al. (2010) Structural basis for Fe–S cluster assembly and tRNA thiolation mediated by IscS protein–protein interactions. PLoS Biol 8: e1000354. Shi, T., Reeves, R.H., Gilichinsky, D.A., and Friedmann, E.I. (1997) Characterization of Viable Bacteria from Siberian Permafrost by 16S rDNA Sequencing. Microb Ecol 33: 169-179. Short, S.M., and Zehr, J.P. (2005) Quantitative analysis of nifH genes and transcripts from aquatic environments. Methods Enzymol 397: 380-394. Siddiqui, K.S., and Cavicchioli, R. (2006) Cold-adapted enzymes. Annu. Rev. Biochem. 75: 403-433.
218
Siddiqui, K.S., and Thomas, T. (2008) Protein adaptation in extremophiles: Nova Biomedical. Singer, G., and Hickey, D.A. (2003) Thermophilic prokaryotes have characteristic patterns of codon usage, amino acid composition and nucleotide content. Gene 317: 39. Singh, C., Soni, R., Jain, S., Roy, S., and Goel, R. (2010) Diversification of nitrogen fixing bacterial community using nifH gene as a biomarker in different geographical soils of Western Indian Himalayas. J Environ Biol. Singleton, D.R., Furlong, M.A., Rathbun, S.L., and Whitman, W.B. (2001) Quantitative Comparisons of 16S rRNA Gene Sequence Libraries from Environmental Samples. Appl Environ Microbiol 67: 4374-4376. Sjöling, S., and Cowan, D.A. (2003) High 16S rDNA bacterial diversity in glacial meltwater lake sediment, Bratina Island, Antarctica. Extremophiles 7: 275-282. Skidmore, M., Anderson, S.P., Sharp, M., Foght, J., and Lanoil, B.D. (2005) Comparison of Microbial Community Compositions of Two Subglacial Environments Reveals a Possible Role for Microbes in Chemical Weathering Processes. Appl Environ Microbiol 71: 6986-6997. Smith, A.B. (1992) Geology of the Yudnamutana Gorge, Paralana Hot Springs Area and Genesis of Mineralization at the Hodgkinson Prospect, Mount Painter Province, South Australia: University of Adelaide, Dept. of Geology and Geophysics. Smith, M.H. (1966) The amino acid composition of proteins. J Theor Biol 13: 261-282. Smith, S., and Atkinson, M. (1983) Mass balance of carbon and phosphorus in Shark Bay, Western Australia. Limnol Oceanogr 28: 625-639. Smith, V.R., and Russell, S. (1982) Acetylene reduction by bryophyte-cyanobacteria associations on a Subantarctic island. Polar Biol V1: 153-157. Sogin, M.L., Morrison, H.G., Huber, J.A., Welch, D.M., Huse, S.M., Neal, P.R. et al. (2006) Microbial diversity in the deep sea and the underexplored “rare biosphere”. Proceedings of the National Academy of Sciences 103: 12115-12120. 
Soina, V.S., Vorobiova, E.A., Zvyagintsev, D.G., and Gilichinsky, D.A. (1995) Preservation of cell structures in permafrost: A model for exobiology. Adv Space Res 15: 237-242. Sokolova, T.G., Kostrikina, N.A., Chernyh, N.A., Kolganova, T.V., Tourova, T.P., and Bonch-Osmolovskaya, E.A. (2005) Thermincola carboxydiphila gen. nov., sp. nov., a novel anaerobic, carboxydotrophic, hydrogenogenic bacterium from a hot spring of the Lake Baikal area. Int J Syst Evol Microbiol 55: 2069. Somero, G. (2003) Protein adaptations to temperature and pressure: complementary roles of adaptive changes in amino acid sequence and internal milieu* 1. Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology 136: 577-591. Sonne-Hansen, J., and Ahring, B. (1999) Thermodesulfobacterium hveragerdense sp. nov., and Thermodesulfovibrio islandicus sp. nov., two thermophilic sulfate reducing bacteria isolated from a Icelandic hot spring. Syst Appl Microbiol 22: 559-564. Sorokin, D.Y., Tourova, T.P., Henstra, A.M., Stams, A.J.M., Galinski, E.A., and Muyzer, G. (2008) Sulfidogenesis under extremely haloalkaline conditions by
219
Desulfonatronospira thiodismutans gen. nov., sp. nov., and Desulfonatronospira delicata sp. nov. - a novel lineage of Deltaproteobacteria from hypersaline soda lakes. Microbiology 154: 1444-1453. Spirina, E., Cole, J., Chai, B., Gilichinksy, D., and Tiedje, J. (2003) New high throughput approach to study ancient microbial phylogenetic diversity in permafrost. In Geophysical Research Abstracts. Nice, France: Copernicus Publications. Sprigg, R.C. (1984) Arkaroola-Mount Painter in the northern Flinders Ranges, SA: the last billion years: Arkaroola. Sridharan, S., Nicholls, A., and Honig, B. (1992) A new vertex algorithm to calculate solvent accessible surface areas. Biophys. J 61: A174. Srinivasan, V., Netz, D.J.A., Webert, H., Mascarenhas, J., Pierik, A.J., Michel, H., and Lill, R. (2007) Structure of the yeast WD40 domain protein Cia1, a component acting late in iron-sulfur protein biogenesis. Structure 15: 1246-1257. Stal, L., and Krumbein, W. (1987) Temporal separation of nitrogen fixation and photosynthesis in the filamentous, non-heterocystous cyanobacterium Oscillatoria sp. Arch Microbiol 149: 76-80. Stal, L.J., and Heyer, H. (1987) Dark anaerobic nitrogen fixation (acetylene reduction) in the cyanobacterium Oscillatoria sp. FEMS Microbiol Lett 45: 227-232. States, D.J., and Botstein, D. (1991) Molecular Sequence Accuracy and the Analysis of Protein Coding Regions. Proceedings of the National Academy of Sciences of the United States of America 88: 5518-5522. Steppe, T., and Paerl, H. (2002) Potential N2 fixation by sulfate-reducing bacteria in a marine intertidal microbial mat. Aquat Microb Ecol 28: 1-12. Steppe, T.F., Pinckney, J.L., Dyble, J., and Paerl, H.W. (2001) Diazotrophy in Modern Marine Bahamian Stromatolites. Microb Ecol 41: 36-44. Steunou, A.S., Bhaya, D., Bateson, M.M., Melendrez, M.C., Ward, D.M., Brecht, E. et al. 
(2006) In situ analysis of nitrogen fixation and metabolic switching in unicellular thermophilic cyanobacteria inhabiting hot spring microbial mats. Proceedings of the National Academy of Sciences of the United States of America 103: 2398-2403. Steunou, A.S., Jensen, S.I., Brecht, E., Becraft, E.D., Bateson, M.M., Kilian, O. et al. (2008) Regulation of nif gene expression and the energetics of N2 fixation over the diel cycle in a hot spring microbial mat. The ISME Journal 2: 364-378. Steven, B., Briggs, G., McKay, C.P., Pollard, W.H., Greer, C.W., and Whyte, L.G. (2007) Characterization of the microbial diversity in a permafrost sample from the Canadian high Arctic using culture-dependent and culture-independent methods. FEMS Microbiol Ecol 59: 513-523. Stewart, W. (1970a) Nitrogen fixation by blue-green algae in Yellowstone thermal areas. Stewart, W. (1973) Nitrogen fixation by photosynthetic microorganisms. Annual Reviews in Microbiology 27: 283-316. Stewart, W.D.P. (1967) Nitrogen Turnover in Marine and Brackish Habitats II. Use of 15N in Measuring Nitrogen Fixation in the Field. In, pp. 385-407.
220
Stewart, W.D.P. (1970b) Algal fixation of atmospheric nitrogen. Plant and Soil 32: 555-588. Stormo, G.D. (2002) An Introduction to Sequence Similarity (“Homology”) Searching: John Wiley & Sons, Inc. Stöver, B., and Müller, K. (2010) TreeGraph 2: Combining and visualizing evidence from different phylogenetic analyses. BMC Bioinformatics 11: 7. Sullivan, J., and Joyce, P. (2005) Model selection in phylogenetics. Annual Review of Ecology, Evolution, and Systematics 36: 445. Summers, M.L., Wallis, J.G., Campbell, E.L., and Meeks, J.C. (1995) Genetic evidence of a major role for glucose-6-phosphate dehydrogenase in nitrogen fixation and dark growth of the cyanobacterium Nostoc sp. strain ATCC 29133. J Bacteriol 177: 6184. Sundset, M., Præsteng, K., Cann, I., Mathiesen, S., and Mackie, R. (2007) Novel Rumen Bacterial Diversity in Two Geographically Separated Sub-Species of Reindeer. Microb Ecol 54: 424-438. Sung, Y., Fletcher, K.E., Ritalahti, K.M., Apkarian, R.P., Ramos-Hernández, N., Sanford, R.A. et al. (2006) Geobacter lovleyi sp. nov. strain SZ, a novel metal-reducing and tetrachloroethene-dechlorinating bacterium. Appl Environ Microbiol 72: 2775. Suzuki, M.T., and Giovannoni, S.J. (1996) Bias caused by template annealing in the amplification of mixtures of 16S rRNA genes by PCR. Appl Environ Microbiol 62: 625-630. Tamura, K., Dudley, J., Nei, M., and Kumar, S. (2007) MEGA4: molecular evolutionary genetics analysis (MEGA) software version 4.0. Mol Biol Evol 24: 1596. Tamura, K., Peterson, D., Peterson, N., Stecher, G., Nei, M., and Kumar, S. (2011) MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. Taroncher-Oldenburg, G., Griner, E.M., Francis, C.A., and Ward, B.B. (2003) Oligonucleotide microarray for the study of functional gene diversity in the nitrogen cycle in the environment. Appl Environ Microbiol 69: 1159-1171. 
Tezcan, F.A., Kaiser, J.T., Mustafi, D., Walton, M.Y., Howard, J.B., and Rees, D.C. (2005) Nitrogenase complexes: multiple docking sites for a nucleotide switch protein. Science 309: 1377-1380. The UniProt, C. (2008) The Universal Protein Resource (UniProt). Nucl. Acids Res. 36: D190-195. Thomas, D.N. (2005) Photosynthetic microbes in freezing deserts. Trends Microbiol 13: 87-88. Thomas, M., and Walter, M.R. (2002) Application of hyperspectral infrared analysis of hydrothermal alteration on Earth and Mars. Astrobiology 2: 335-351. Tillett, D., and Neilan, B.A. (2000) Xanthogenate nucleic acid isolation from cultured and environmental cyanobacteria. Journal of Phycology 36: 251-258. Tiquia, S.M., Lloyd, J., Herms, D.A., Hoitink, H.A.J., and Michel, F.C. (2002) Effects of mulching and fertilization on soil nutrients, microbial activity and
221
rhizosphere bacterial community structure determined by analysis of TRFLPs of PCR-amplified 16S rRNA genes. Appl Soil Ecol 21: 31-48. Tourova, T.P., Spiridonova, E.M., Berg, I.A., Slobodova, N.V., Boulygina, E.S., and Sorokin, D.Y. (2007) Phylogeny and evolution of the family Ectothiorhodospiraceae based on comparison of 16S rRNA, cbbL and nifH gene sequences. Int J Syst Evol Microbiol 57: 2387. Tripp, H., Bench, S., Turk, K., Foster, R., Desany, B., Niazi, F. et al. (2010) Metabolic streamlining in an open-ocean nitrogen-fixing cyanobacterium. Nature 464: 90-94. Tsuihiji, H., Yamazaki, Y., Kamikubo, H., Imamoto, Y., and Kataoka, M. (2006) Cloning and characterization of nif structural and regulatory genes in the purple sulfur bacterium, Halorhodospira halophila. J Biosci Bioeng 101: 263-270. UNESCO (1991) World heritage nomination - IUCN summary, 578: Shark Bay (Australia). In. van de Vossenberg, J.L.C.M., Driessen, A.J.M., and Konings, W.N. (1998) The essence of being extremophilic: the role of the unique archaeal membrane lipids. Extremophiles 2: 163-170. van de Vossenberg, J.L.C.M., Driessen, A.J.M., Grant, D., and Konings, W.N. (1999) Lipid membranes from halophilic and alkali-halophilic Archaea have a low H+ and Na+ permeability at high salt concentration. Extremophiles 3: 253-257. van den Burg, B. (2003) Extremophiles as a source for novel enzymes. Curr Opin Microbiol 6: 213-218. Van Trappen, S., Vandecandelaere, I., Mergaert, J., and Swings, J. (2004) Algoriphagus antarcticus sp. nov., a novel psychrophile from microbial mats in Antarctic lakes. Int J Syst Evol Microbiol 54: 1969-1973. Veerassamy, S., Smith, A., and Tillier, E. (2003) A transition probability model for amino acid substitutions from blocks. J Comput Biol 10: 997-1010. Vincent, W., Castenholz, R., Downes, M., and H-Williams, C. (1993) Antarctic cyanobacteria: Light, nutrients, and photosynthesis in the microbial mat environment. Journal of Phycology 29: 745-755. 
Vishnivetskaya, T.A., Petrova, M.A., Urbance, J., Ponder, M., Moyer, C.L., Gilichinsky, D.A., and Tiedje, J.M. (2006) Bacterial Community in Ancient Siberian Permafrost as Characterized by Culture and Culture-Independent Methods. Astrobiology 6: 400-414. Vriend, G. (1990) WHAT IF: a molecular modeling and drug design program. J Mol Graphics 8: 52-56. Wagner, D., Kobabe, S., and Liebner, S. (2009) Bacterial community structure and carbon turnover in permafrost-affected soils of the Lena Delta, northeastern Siberia. Can J Microbiol 55: 73-83. Walker, J.E., Saraste, M., Runswick, M.J., and Gay, N.J. (1982) Distantly related sequences in the alpha-and beta-subunits of ATP synthase, myosin, kinases and other ATP-requiring enzymes and a common nucleotide binding fold. The EMBO journal 1: 945.
222
Walter, M. (1976) Stromatolites. New York: Elsevier.20.1-790. Ward, D.M., Ferris, M.J., Nold, S.C., and Bateson, M.M. (1998) A natural view of microbial biodiversity within hot spring cyanobacterial mat communities. Microbiology and Molecular Biology Reviews 62: 1353. Watanabe, A., and Yamamoto, Y. (1971) Algal nitrogen fixation in the tropics. Plant and Soil 35: 403-413. Welch, B.L. (1947) The generalization ofstudent's' problem when several different population variances are involved. Biometrika 34: 28-35. Weller, R., Bateson, M.M., Heimbuch, B.K., Kopczynski, E.D., and Ward, D.M. (1992) Uncultivated cyanobacteria, Chloroflexus-like inhabitants, and spirochete-like inhabitants of a hot spring microbial mat. Appl Environ Microbiol 58: 3964. Whitton, B.A., and Potts, M. (2000) The Ecology of Cyanobacteria Their Diversity in Time and Space: Kluwer Academic Publishers. Wickstrom, C.E. (1984) Discovery and evidence of nitrogen fixation by thermophilic heterotrophs in hot springs. Curr Microbiol 10: 275-280. Wilson, K. (2001) Preparation of genomic DNA from bacteria. In Current Protocols in Molecular Biology. F. M. Ausubel, R.B., R. E. Kingston, D. D. Moore, J.G. Seidman, J. A. Smith, K. Struhl (ed). New York: John Wiley & Sons Inc, p. Unit 2.4. Wilson, K.H., and Blitchington, R.B. (1996) Human colonic biota studied by ribosomal DNA sequence analysis. Appl Environ Microbiol 62: 2273. Wu, S., and Zhang, Y. (2008) MUSTER: improving protein sequence profile–profile alignments by using multiple sources of structure information. Proteins: Structure, Function, and Bioinformatics 72: 547-556. Xiang, S., Yao, T., An, L., Xu, B., and Wang, J. (2005) 16S rRNA Sequences and Differences in Bacteria Isolated from the Muztag Ata Glacier at Increasing Depths. Appl Environ Microbiol 71: 4619-4627. Xiang, S.R., Yao, T.D., An, L.Z., Xu, B.Q., Li, Z., Wu, G.J. et al. (2004) Bacterial diversity in Malan ice core from the Tibetan Plateau. Folia Microbiol 49: 269-275. 
Xiao, X., Li, M., You, Z., and Wang, F. (2007) Bacterial communities inside and in the vicinity of the Chinese Great Wall Station, King George Island, Antarctica. Antarct Sci 19: 11-16. Yakimov, M.M., Giuliano, L., Chernikova, T.N., Gentile, G., Abraham, W.R., Timmis, K., and Golyshin, P. (2001) Alcalilimnicola halodurans gen. nov., sp. nov., an alkaliphilic, moderately halophilic and extremely halotolerant bacterium, isolated from sediments of soda-depositing Lake Natron, East Africa Rift Valley. Int J Syst Evol Microbiol 51: 2133-2143. Yakimov, M.M., Gentile, G., Bruni, V., Cappello, S., D'Auria, G., Golyshin, P.N., and Giuliano, L. (2004) Crude oil-induced structural shift of coastal bacterial communities of rod bay (Terra Nova Bay, Ross Sea, Antarctica) and characterization of cultured cold-adapted hydrocarbonoclastic bacteria. FEMS Microbiol Ecol 49: 419-432. Yamane, K., Hattori, Y., Ohtagaki, H., and Fujiwara, K. (2011) Microbial diversity with dominance of 16S rRNA gene sequences with high GC contents at 74 and 98° C subsurface crude oil deposits in Japan. FEMS Microbiol Ecol.
223
Yamauchi, K., Doi, K., Kinoshita, M., Kii, F., and Fukuda, H. (1992) Archaebacterial lipid models: highly salt-tolerant membranes from 1, 2-diphytanylglycero-3-phosphocholine. Biochimica et Biophysica Acta (BBA)-Biomembranes 1110: 171-177. Yang, R., Hou, Y., Campbell, C.A., Palaniyandi, K., Zhao, Q., Bordner, A.J., and Chang, X. (2011) Glutamine residues in Q-loops of multidrug resistance protein MRP1 contribute to ATP binding via interaction with metal cofactor. Biochimica et Biophysica Acta (BBA)-Biomembranes 1808: 1790-1796. Yannarell, A.C., Steppe, T.F., and Paerl, H.W. (2006) Genetic Variance in the Composition of Two Functional Groups (Diazotrophs and Cyanobacteria) from a Hypersaline Microbial Mat. Appl Environ Microbiol 72: 1207-1217. Yannarell, A.C., Steppe, T.F., and Paerl, H.W. (2007) Disturbance and recovery of microbial community structure and function following Hurricane Frances. Environ Microbiol 9: 576-583. Yue, J., and Clayton, M. (2005) A similarity measure based on species proportions. Communications in Statistics-Theory and Methods 34: 2123-2131. Zadorina, E., Slobodova, N., Boulygina, E., Kolganova, T., Kravchenko, I., and Kuznetsov, B. (2009) Analysis of the diversity of diazotrophic bacteria in peat soil by cloning of the nifH gene. Microbiology 78: 218-226. Zani, S., Mellon, M.T., Collier, J.L., and Zehr, J.P. (2000) Expression of nifH Genes in Natural Microbial Assemblages in Lake George, New York, Detected by Reverse Transcriptase PCR. Appl Environ Microbiol 66: 3119-3124. Zehr, J., Bench, S., Carter, B., Hewson, I., Niazi, F., Shi, T. et al. (2008) Globally distributed uncultivated oceanic N2-fixing cyanobacteria lack oxygenic photosystem II. Science 322: 1110. Zehr, J.P., and McReynolds, L.A. (1989) Use of degenerate oligonucleotides for amplification of the nifH gene from the marine cyanobacterium Trichodesmium thiebautii. Appl Environ Microbiol 55: 2522-2526. Zehr, J.P., Mellon, M.T., and Hiorns, W.D. 
(1997) Phylogeny of cyanobacterial nifH genes: evolutionary implications and potential applications to natural assemblages. Microbiology 143: 1443-1450. Zehr, J.P., Mellon, M.T., and Zani, S. (1998) New Nitrogen-Fixing Microorganisms Detected in Oligotrophic Oceans by Amplification of Nitrogenase (nifH) Genes. Appl Environ Microbiol 64: 5067. Zehr, J.P., Jenkins, B.D., Short, S.M., and Steward, G.F. (2003a) Nitrogenase gene diversity and microbial community structure: a cross-system comparison. Environ Microbiol 5: 539-554. Zehr, J.P., Crumbliss, L.L., Church, M.J., Omoregie, E.O., and Jenkins, B.D. (2003b) Nitrogenase genes in PCR and RT-PCR reagents: implications for studies of diversity of functional genes. BioTechniques 35: 996-1002, 1004-1005. Zehr, J.P., Mellon, M., Braun, S., Litaker, W., Steppe, T., and Paerl, H.W. (1995) Diversity of Heterotrophic Nitrogen Fixation Genes in a Marine Cyanobacterial Mat. Appl Environ Microbiol 61: 2527-2532.
224
Zeitlin, C., Cleghorn, T., Cucinotta, F., Saganti, P., Andersen, V., Lee, K. et al. (2004) Overview of the Martian radiation environment experiment. Adv Space Res 33: 2204-2210. Zhang, L., Hurek, T., and Reinhold-Hurek, B. (2007a) A nifH-based oligonucleotide microarray for functional diagnostics of nitrogen-fixing microorganisms. Microb Ecol 53: 456-470. Zhang, S., Hou, S., Ma, X., Qin, D., and Chen, T. (2007b) Culturable bacteria in Himalayan ice in response to atmospheric circulation. Biogeosci Disc 3: 765-778. Zhang, X., Yao, T., Ma, X., and Wang, N. (2002) Microorganisms in a high altitude Glacier Ice in Tibet. Folia Microbiol 47: 241-245. Zhang, Y. (2007) Template based modeling and free modeling by I TASSER in CASP7. Proteins: Structure, Function, and Bioinformatics 69: 108-117. Zhang, Y. (2008) I-TASSER server for protein 3 D structure prediction. BMC Bioinformatics 9: 40. Zhang, Y. (2009) I-TASSER: fully automated protein structure prediction in CASP8. Proteins: Structure, Function, and Bioinformatics 77: 100-113. Zhang, Y., and Skolnick, J. (2004) Scoring function for automated assessment of protein structure template quality. Proteins: Structure, Function, and Bioinformatics 57: 702-710. Zhang, Y., and Skolnick, J. (2005) TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res 33: 2302-2309. Zhang, Y., and Skolnick, J. (2007) Scoring function for automated assessment of protein structure template quality. Proteins 68: 1020. Zhang, Y., Kolinski, A., and Skolnick, J. (2003) TOUCHSTONE II: a new approach to ab initio protein structure prediction. Biophys J 85: 1145. Zhang, Y., Dong, J., Yang, Z., Zhang, S., and Wang, Y. (2008) Phylogenetic diversity of nitrogen-fixing bacteria in mangrove sediments assessed by PCR–denaturing gradient gel electrophoresis. Arch Microbiol 190: 19-28. Zhou, J., Davey, M.E., Figueras, J.B., Rivkina, E., Gilichinsky, D., and Tiedje, J.M. 
(1997) Phylogenetic diversity of a bacterial community determined from Siberian tundra soil DNA. Microbiology 143: 3913-3919.
225
Appendix A
1996 and 2004 Stromatolite nifH BLAST and BLASTX matches
Table A-1: Stromatolite 2004 clones BLAST results, presenting the highest sequence similarity match for each clone. 38 clones of 2004 stromatolites were analysed.
Sequence file ID   Clone ID   Nearest relative in GenBank*       Accession no.   Similarity (%)
RSA75_04           C35        Uncultured bacterium nifH clone    DQ338103        86
RSA76_04           C36        “                                  DQ140596        100
RSA77_04           C37        “                                  DQ338103        87
RSA78_04           C39        “                                  EU594188        85
RSA79_04           C40        “                                  DQ338103        87
RSA80_04           C41        “                                  EF174826        93
RSA81_04           C42        “                                  EF174812        89
RSA82_04           C43        “                                  DQ338103        87
RSA83_04           C44        “                                  DQ338103        87
RSA84_04           C45        “                                  EU594188        84
RSA85_04           C47        “                                  EF174812        89
RSA86_04           C48        “                                  DQ338103        86
RSA87_04           C49        “                                  EU594145        87
RSA88_04           C50        “                                  DQ338103        87
*Only a single match is shown. There could be two or more identical high scores for each record here.
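The selection rule described in the caption and footnote (report the highest-similarity match per clone, bearing in mind that several matches can tie at the top score) can be sketched as follows. This is only an illustrative sketch, not the procedure used in the thesis: the function name `top_hits`, the tuple layout (query ID, subject ID, percent identity) and the example records are all hypothetical placeholders.

```python
from collections import defaultdict


def top_hits(hits):
    """Return, for each query, the best identity and every subject that reaches it.

    `hits` is an iterable of (query_id, subject_id, percent_identity) tuples
    (an assumed layout, chosen for illustration only).
    """
    best = {}                    # query_id -> highest identity seen so far
    winners = defaultdict(list)  # query_id -> subjects sharing that identity
    for query_id, subject_id, identity in hits:
        if query_id not in best or identity > best[query_id]:
            # A strictly better hit replaces all earlier candidates.
            best[query_id] = identity
            winners[query_id] = [subject_id]
        elif identity == best[query_id]:
            # A tie at the top score is kept rather than discarded.
            winners[query_id].append(subject_id)
    return {q: (best[q], winners[q]) for q in best}


# Hypothetical placeholder records, not data from the thesis.
example_hits = [
    ("cloneA", "hit1", 86.0),
    ("cloneA", "hit2", 86.0),
    ("cloneA", "hit3", 80.0),
    ("cloneB", "hit4", 100.0),
]
result = top_hits(example_hits)
```

Keeping the tied subjects in a list makes it explicit when the single match reported for a clone is one of several equally good candidates.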
Table A-2: Stromatolite 1996 clones BLAST results, presenting only the highest sequence similarity match for each clone. 37 clones of 1996 stromatolites were analysed.
Sequence file ID   Clone ID   Nearest relative in GenBank*       Accession no.   Similarity (%)
RSA139_96          GC54       “                                  DQ078042        89
RSA141_96          GC56       “                                  DQ338071        85
RSA143_96          GC58       “                                  DQ078042        89
RSA147_96          GC57       “                                  DQ338014        88
RSA148_96          GC1        “                                  DQ078042        89
RSA150_96          GC3        “                                  DQ078042        89
RSA152_96          GC6        “                                  DQ338014        86
*Only a single match is shown. There could be two or more identical high scores for each record here.
Table A-3: Stromatolite 2004 clones BLASTX results, presenting only the highest sequence similarity match for each clone.
Sequence file ID   Clone ID   Nearest bacterial nitrogenase iron protein match in GenBank
(a) ‘Reviewed’ status indicates sequences that were manually annotated and reviewed in the Swiss-Prot database. Reviewed sequences are reliable as they were inferred from homology studies, and there is evidence at transcript and protein level of their existence. ‘Unreviewed’ status indicates sequences that were automatically annotated and were not reviewed (TrEMBL database). They are mostly derived from prediction studies and have far less verification at the protein and transcript levels.