Nitrogen fixing potential in extreme environments Author: Sorek Abramovich, Reut Publication Date: 2013 DOI: https://doi.org/10.26190/unsworks/16288 License: https://creativecommons.org/licenses/by-nc-nd/3.0/au/ Link to license to see what you are allowed to do with this resource. Downloaded from http://hdl.handle.net/1959.4/52826 in https:// unsworks.unsw.edu.au on 2022-08-16

Nitrogen fixing potential in extreme environments

Reut Sorek Abramovich

A thesis in fulfilment of the requirements for the degree of

Doctor of Philosophy

School of Biotechnology and Biomolecular Sciences

The University of New South Wales

Sydney, Australia

March 2013

ORIGINALITY STATEMENT

‘I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at UNSW or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UNSW or elsewhere is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project's design and conception or in style, presentation and linguistic expression is acknowledged.’

Signed ……………………………………………..............

Date ……………………………………………...........

COPYRIGHT STATEMENT

‘I hereby grant the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole or part in the University libraries in all forms of media, now or hereafter known, subject to the provisions of the Copyright Act 1968. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation. I also authorise University Microfilms to use the 350 word abstract of my thesis in Dissertation Abstracts International (this is applicable to doctoral theses only). I have either used no substantial portions of copyright material in my thesis or I have obtained permission to use copyright material; where permission has not been granted I have applied/will apply for a partial restriction of the digital copy of my thesis or dissertation.’

Signed ……………………………………………...........................

Date ……………………………………………...........................

AUTHENTICITY STATEMENT

‘I certify that the Library deposit digital copy is a direct equivalent of the final officially approved version of my thesis. No emendation of content has occurred and if there are any minor variations in formatting, they are the result of the conversion to digital format.’

Signed ……………………………………………...........................

Date ……………………………………………...........................

Abstract

Biological nitrogen fixation is a key process in providing accessible nitrogen to Earth’s biosphere. This process has been studied in various habitats, yet extreme environments remain relatively unexplored. The nifH gene codes for the Fe protein component of the nitrogenase, the enzyme that carries out nitrogen fixation. Our aims in this study were to assess diazotrophic diversity, richness and community structure in three unique environments and to analyse potential adaptations in the Fe protein composition and structure. Our methods included terminal-restriction fragment length polymorphism (T-RFLP) analysis of 16S rDNA, PCR amplification of the nifH gene, statistical t-test analysis of amino acid compositions, a novel evolutionary analysis and 3D modelling with the I-TASSER web server. Boulder Clay and Amorphous Glacier are two ice-free areas in Terra Nova Bay, Antarctica, which differ in their geological origins and physico-chemical properties. DNA yields from ice-core samples ranged from 0.29 ng L⁻¹ in Amorphous Glacier to 88 ng L⁻¹ in Boulder Clay. Bray-Curtis cluster analysis suggested that Boulder Clay bacterial profiles were similar to each other but clustered separately from those of Amorphous Glacier. The hypersaline (>70 ppt) bays of Shark Bay, Western Australia, are home to stromatolite microbial mats. The microbial diversity of diazotrophs from two different years, 1996 and 2004, was investigated. Our analysis indicated that columnar stromatolites included a common persisting cyanobacterial diazotroph, a Cyanothece or Xenococcus. Both samples contained novel nifH gene sequences of low similarity to uncultured nifH clones from saline to hypersaline environments, and their inferred NifH amino acid sequences were highly similar to unicellular, non-heterocystous Cyanobacteria and γ, -Proteobacteria sequences. Paralana’s hot radon springs (PHS, 57 °C) are situated in South Australia. Phylogenetic analysis indicated a rich and diverse group of NifH amino acid sequences from the α-, γ- and δ-Proteobacteria, Chloroflexi and Cyanobacteria phyla. These results suggested that aerobic and anaerobic bacteria with conventional Mo nitrogenase might be involved in nitrogen fixation. Our bioinformatic analysis suggested that halophilic adaptations, with an increase in salt bridges and acidic residues and a decrease in bulkier hydrophobic amino acids, did occur in stromatolite diazotrophs, and that partial thermophilic adaptations, mainly an increase in salt bridges, Pro and charged residues, did occur in the PHS diazotrophs. These studies provide new insight into the ongoing evolution of nitrogen fixation in extreme environments.

Acknowledgments

I would like to thank my supervisors, Prof. Brett A. Neilan, Dr. Michelle Gehringer and Dr. Brendan P. Burns, for their support and advice during my PhD studies. I have benefited from their advice, and followed their wise counsel. I would like to thank Dr. Sohail Siddiqui, Prof. Aharon Oren and Prof. Nir Ben Tal, for their support and invaluable suggestions.

The Australian Centre for Astrobiology was a creative hub for me and other students, a place to exchange ideas, thoughts and avenues of exploration into the biggest mysteries of life. I would like in particular to thank the director, Prof. Malcolm Walter, for his ongoing support of my efforts, and thank Carol, Jessica, Maria, Tamsyn, David and Ivan for creative conversations during my research career at the centre.

I would also like to thank my friends and colleagues at the Blue Green Groove Machine lab, for their patience, help and suggestions. I could not have come this far without their knowledge. My special thanks go to: Anne D.J., Michelle A, Kristin, Falicia, Alex, Hannah, Ivan, Jasper, Troco, Shane, Stefan, Jae, Frank, Tim, Maria, Sarah, Tamsyn, Angie, Rati, Julia, Shauna, Leanne, Will and Alper.

The Mars Society of Australia (MSA) is a group of intelligent and devoted people. My 2009 field trip to the Paralana Hot Springs in South Australia, with NASA’s Spaceward Bound program, was very special thanks to their efforts and hard work. I salute you: David Cooper, David Wilson, Jon Clarke, Guy Murphy, Mark Gargano, Eriita Jones, Marcia Tanner and Shaun Strong. I am also indebted to Dr. Chris McKay and Prof. Penelope Boston for enlightened conversations and field trip advice and help.

Thank you my coffee break friends: Rhea, Shahar, Nitzan, Eldad and Mikayla. To my ever loving husband, Aviv: Thank You, my No. 1. To my parents and brother, Aryeh, Channa and Shachar: Thank you for inspirational stories. To my first born daughter, Eleanor: You were the best surprise I’ve ever received. May your life be interesting and filled with joy.

One last statement if I may:

“The time has come for humanity to journey to Mars.” (The Mars Society founding declaration, University of Colorado, Boulder, Colorado, United States, 1998)

List of Publications

Abramovich, R.S., Pomati, F., Jungblut, A.D., Guglielmin, M., and Neilan, B.A. (2012) T-RFLP Fingerprinting Analysis of Bacterial Communities in Debris Cones, Northern Victoria Land, Antarctica. Permafrost and Periglacial Processes 23: 244-248.

Contributions to academic conferences

Abramovich, R.S., Burns, B.P., and Neilan, B.A. Temporal Biodiversity of Potential Diazotrophs in Stromatolites, Shark Bay, Western Australia. Australian Mars Exploration Conference. July, 17-19th 2009, Adelaide, South Australia.

Abramovich, R.S., Burns, B.P., and Neilan, B.A. Nitrogen fixation potential in stromatolites, Shark Bay, Western Australia. The 9th Australian Space Science Conference. 28 - 30th, September 2009, Sydney, Australia.

Abramovich, R.S., Gehringer, M.M., and Neilan, B.A. Biodiversity of Potential Diazotrophs in Microbial Communities of Stromatolites at Shark Bay, Western Australia. Sydney Astronomy and Astrophysics Student Symposium. 18th, June 2010, Sydney, Australia.

Abramovich, R.S., Gehringer, M.M., and Neilan, B.A. Biodiversity of Potential Diazotrophs in Microbial Communities of a Radon Hot Spring in the Flinders Ranges and Stromatolites at Shark Bay. The 8th International Congress on Extremophiles. 12-14th, September 2010, Azores, Portugal.

Abramovich, R.S., Gehringer, M.M., and Neilan, B.A. Biological nitrogen fixation potential in stromatolites, Shark Bay, Western Australia. The 16th SUNFix Symposium. 25th of June 2010, Sydney, Australia.

Abramovich, R.S., Gehringer, M.M., Burns, B.P., and Neilan, B.A. Biodiversity of Potential Diazotrophs in Stromatolites of Shark Bay and a Radon Hot Spring. The Australian Society for Microbiology, Annual Scientific Meeting. 4-8th, July 2010, Sydney, Australia.

List of Acronyms and Abbreviations

ARA        Acetylene reduction assay
ATCC       American Type Culture Collection
ATP        Adenosine triphosphate
BLAST      Basic local alignment search tool
bp         Base pairs
BSA        Bovine serum albumin
cDNA       Complementary deoxyribonucleic acid
Chla       Chlorophyll a
DMSO       Dimethyl sulfoxide
DNA        Deoxyribonucleic acid
dNTP       Deoxyribonucleotide triphosphate
DTT        Dithiothreitol
EDTA       Ethylenediaminetetraacetic acid
EPS        Exopolysaccharide
FISH       Fluorescence in situ hybridisation
g          Gram
µg         Microgram
GC-MS      Gas chromatography-mass spectrometry
GTP        Guanosine-5'-triphosphate
h          Hour
IPTG       Isopropyl-β-D-thiogalactoside
kb         Kilobase
kDa        Kilodalton
km         Kilometre
km²        Square kilometre
L          Litre
µL         Microlitre
LB         Luria-Bertani
m          Metre
m.b.s.l.   Metres below surface level
µM         Micromolar
min        Minute
ml         Millilitre
mm         Millimetre
MQ         Milli-Q
mRNA       Messenger RNA
NCBI       National Center for Biotechnology Information
nd         Not detected
ºC         Degrees Celsius
ORF        Open reading frame
OTU        Operational Taxonomic Unit
PCC        Pasteur Culture Collection (France)
PCR        Polymerase chain reaction
PDB        Protein Data Bank
pmol       Picomole
rDNA       Ribosomal deoxyribonucleic acid
RDP        Ribosomal Database Project
RFLP       Restriction fragment length polymorphism
RNA        Ribonucleic acid
rpm        Revolutions per minute
rRNA       Ribosomal ribonucleic acid
RT         Room temperature
RT-PCR     Reverse Transcriptase PCR
s          Second
SD         Standard deviation
SDS        Sodium dodecyl sulphate
SRB        Sulphate reducing bacteria
SSU        Small sub-unit
TAP        T-RFLP Analysis Program
T-RFLP     Terminal Restriction Fragment Length Polymorphism
UTCC       University of Toronto Culture Collection
UTEX       University of Texas Culture Collection
UV         Ultraviolet light

TABLE OF CONTENTS

Chapter 1 Introduction
  1.1 The extremophiles
  1.2 Nitrogen significance and source
  1.3 Nitrogenase structure and function
    1.3.1 Fe protein structure and function
    1.3.2 MoFe protein structure and function
    1.3.3 Nitrogenase modus operandi
  1.4 Diazotroph phylogeny
    1.4.1 Cyanobacteria
    1.4.2 Other prokaryotic diazotrophs
  1.5 Psychrophilic diazotrophs
  1.6 Halophilic diazotrophs
  1.7 Thermophilic diazotrophs
  1.8 Analyzing nitrogen fixation
  1.9 Research aims

Chapter 2 T-RFLP analysis of potential diazotrophs in glacial and permafrost formations in Northern Victoria Land, Antarctica
  2.1 Introduction
  2.2 Materials and methods
    2.2.1 Study sites
    2.2.2 Ice core collection
    2.2.3 Sample preparation
    2.2.4 DNA extraction and amplification
    2.2.5 Terminal Restriction Fragment Length Polymorphism (T-RFLP)
    2.2.6 T-RFLP profiles
    2.2.7 T-RFLP Analysis Program (TAP)
    2.2.8 RDP 9, TAP and T-RFLP databases
    2.2.9 PCR amplification of nifH genes
  2.3 Results and discussion
    2.3.1 Amorphous Glacier and Boulder Clay T-RFLP Profiles
    2.3.2 In silico database composition
    2.3.3 Amorphous Glacier and Boulder Clay cryospheric bacteria
  2.4 Concluding remarks

Chapter 3 Diazotrophic diversity in columnar stromatolites of Shark Bay, Western Australia
  3.1 Introduction
  3.2 Materials and methods
    3.2.1 Sample collection and sample sites
    3.2.2 DNA isolation and PCR amplification of nifH genes
    3.2.3 Clone libraries and Restriction Fragment Length Polymorphism (RFLP)
    3.2.4 DNA sequencing
    3.2.5 Phylogenetic sequence analysis
    3.2.6 Diversity, richness and coverage estimators
    3.2.7 Accession numbers
  3.3 Results and discussion
    3.3.1 General methodology consideration
    3.3.2 2004 clone library BLAST & BLASTX analysis
    3.3.3 1996 clone library BLAST & BLASTX analysis
    3.3.4 BLAST and BLASTX comparative analysis
    3.3.5 Phylogenetic analysis
    3.3.6 Coverage, diversity and community structure
    3.3.7 Nitrogen fixation potential in Shark Bay
  3.4 Concluding remarks

Chapter 4 The bacterial diazotrophic community in a radon hot spring, South Australia
  4.1 Introduction
  4.2 Materials and methods
    4.2.1 Sample collection
    4.2.2 DNA isolation and PCR amplification of nifH genes
    4.2.3 Clone library and Restriction Fragment Length Polymorphism (RFLP)
    4.2.4 DNA sequencing
    4.2.5 Phylogenetic analysis
    4.2.6 Diversity, richness and coverage analysis
    4.2.7 Accession numbers
  4.3 Results and discussion
    4.3.1 BLAST & BLASTX comparative analysis
    4.3.2 Phylogenetic analysis
    4.3.3 Coverage, diversity and community richness
    4.3.4 Nitrogen fixation in Paralana Hot Springs
  4.4 Concluding remarks

Chapter 5 Structural and evolutionary adaptations in the Fe protein component of the nitrogenase
  5.1 Introduction
  5.2 Material and methods
    5.2.1 Evolutionary conservation
    5.2.2 Residue composition
    5.2.3 Statistical analysis
    5.2.4 Structural characteristics
  5.3 Results
    5.3.1 Evolution, composition and structure of the Cluster III Fe protein
    5.3.2 Evolution, composition and structure of the Cluster I Fe protein
    5.3.3 Comparative analysis of cluster I and cluster III Fe proteins
  5.4 Discussion
    5.4.1 Methodology
    5.4.2 Evolution, composition and structure in cluster I & III
  5.5 Concluding remarks

Chapter 6 Halophilic and thermophilic adaptations in the Fe protein
  6.1 Introduction
  6.2 Material and methods
    6.2.1 Evolutionary conservation
    6.2.2 Residue composition
    6.2.3 Statistical analysis
    6.2.4 Distance matrices
    6.2.5 Structural characteristics
  6.3 Results
    6.3.1 Potential halophilic adaptations in the Fe protein
    6.3.2 Potential thermophilic adaptations in the Fe protein
  6.4 Discussion
    6.4.1 Halophilic adaptations
    6.4.2 Thermophilic adaptations
  6.5 Concluding remarks

Chapter 7 Conclusions & future work

References

Appendix A

Chapter 1 Introduction

_______________________________________________

1.1 The extremophiles

Today, it is clear that micro-organisms are among Earth’s most extraordinary life forms, employing complex strategies to withstand harsh conditions that we, as a human species, cannot endure (Schleifer, 2004). Records from the beginning of the 19th century describe bacteria capable of withstanding acidic conditions, thriving in hot geysers or enduring 0°C (Pikuta et al., 2007, and references within). Since then it has been found that extreme geochemical and physical conditions, such as acidity, high salinity, intense radiation and extremes in temperature or pressure, do not prevent microbial life from thriving (Rothschild and Mancinelli, 2001). Bacteria living in such conditions have been termed extremophiles, and have been shown to be useful industrial agents (Herbert and Sharp, 1992; Pennisi, 1997; van den Burg, 2003), as well as model organisms in astrobiology research (Imshenetsky et al., 1967; Friedmann, 1993; Cavicchioli, 2002).

Since the 1950s, more than 40 successful robotic missions have been led by NASA and other space agencies, providing new knowledge regarding atmospheric and geological processes on other bodies in the solar system (NASA, 2012). These missions revealed a wide range of environmental conditions, including very high and very low temperatures (730 K on Venus; 110 K on Jupiter), high pressure (90 atm on Venus) and intense radiation (a galactic cosmic ray dose of 0.3-0.4 Sv yr⁻¹ on Mars), to name a few variables (Zeitlin et al., 2004; Moses et al., 2005; Pätzold et al., 2007; Pierrehumbert, 2011).

While Earth-like life forms have not yet been detected anywhere else in the solar system (nor elsewhere in the universe), there are extreme environments on Earth which are teeming with microbial life. A hydrothermal vent 3550 m deep on the Mid-Atlantic Ridge provided us with Thermococcus barophilus, a barophilic, hyperthermophilic archaeon that grows optimally at 358 K and 396 atmospheres (Marteinsson et al., 1999). Salinibacter, an obligate halophile which grows optimally with 200 - 300 g l⁻¹ salt, was isolated from a saltern crystallizer pond (Bardavid et al., 2007). Active endolithic Cyanobacteria (Chroococcidiopsis) and heterotrophic bacteria live in halite crust in the hyperarid Atacama Desert ( 2 mm y⁻¹), under conditions of extreme dryness and radiation (Davila et al., 2008; Ríos et al., 2010). Hygroscopic salts (such as sodium chloride and magnesium chloride) were found on Mars, and are considered a potential niche to support endolithic microbial communities, even under Martian conditions (Davila et al., 2010). Earth’s extreme environments, analogous to environments on distant planets, are worthy of intense research because investigating their ecological systems expands our knowledge and our chances of finding Earth-like life on other solar system bodies.
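The temperatures and pressures quoted in the two preceding paragraphs are given in kelvin and atmospheres. As a reading aid, the following minimal Python sketch converts them to degrees Celsius and megapascals. The function names and display labels are illustrative assumptions and are not part of the thesis; only the numerical values come from the text.

```python
# Unit conversions for the temperatures and pressures quoted in the text.
# Helper names and labels are hypothetical; only the input values are from the thesis.

ATM_TO_MPA = 0.101325  # one standard atmosphere in megapascals (exact by definition)


def kelvin_to_celsius(kelvin: float) -> float:
    """Convert a temperature from kelvin to degrees Celsius."""
    return kelvin - 273.15


def atm_to_mpa(atm: float) -> float:
    """Convert a pressure from standard atmospheres to megapascals."""
    return atm * ATM_TO_MPA


# (label, converted value, unit) for each figure quoted in the text.
conditions = [
    ("Venus temperature (730 K)", kelvin_to_celsius(730), "°C"),
    ("Jupiter temperature (110 K)", kelvin_to_celsius(110), "°C"),
    ("Venus pressure (90 atm)", atm_to_mpa(90), "MPa"),
    ("T. barophilus optimum temperature (358 K)", kelvin_to_celsius(358), "°C"),
    ("T. barophilus optimum pressure (396 atm)", atm_to_mpa(396), "MPa"),
]

for label, value, unit in conditions:
    print(f"{label}: {value:.1f} {unit}")
```

On this conversion, the growth optimum reported for Thermococcus barophilus corresponds to roughly 85 °C and 40 MPa.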

The environmental limits within which life, especially microbial life, can thrive are consequently being constantly redefined. Amongst the known biochemical pathways, nitrogen fixation is one of the most important, as it is a fundamental process of acquiring an abiotic element and integrating it into complex biological and ecological systems.

1.2 Nitrogen significance and source

Biological nitrogen fixation (BNF) is an important process for all life on Earth. Nitrogen is present in amino acids, purines, pyrimidines and other important biological molecules (Postgate, 1982). BNF is coupled to important biochemical pathways such as photosynthesis and represents a direct input of one of Earth’s most abundant atmospheric elements, N, into living organisms and the biosphere in general (Postgate, 1987). Most of the biosphere cannot access the atmospheric source directly: although N₂ is the most abundant gas in Earth’s atmosphere (78%), it is extremely unreactive because the triple bond between the N atoms has a high bond energy of 225 kcal mol⁻¹ (Howard and Rees, 1996). Most organisms require nitrogen to be reduced to ammonia before they can integrate this important element into various biosynthetic pathways (Berg et al., 2002; Berman-Frank et al., 2003).

Non-biological nitrogen fixation occurs via atmospheric ionization caused by lightning and UV radiation, and in an industrial process devised by F. Haber in 1910 and later developed for commercial purposes by C. Bosch (Kim and Rees, 1994). Lightning and UV radiation discharge enough electrons and energy to break the triple bond and form nitrogen oxides, while the industrial process uses an iron catalyst (at 200-500 atmospheres and 330-800°C), followed by the addition of hydrogen to form ammonia (Mishustin and Shilnikova, 1971; Postgate, 1982; Postgate, 1987; Howard and Rees, 1996; Berg et al., 2002).
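To put the bond energy of 225 kcal mol⁻¹ quoted above into more commonly tabulated units, the short Python sketch below converts it to kJ mol⁻¹ and to electronvolts per bond. The function names are illustrative assumptions; the conversion factors are the defined values of the thermochemical calorie, the Avogadro constant and the elementary charge.

```python
# Convert the N2 triple-bond energy quoted in the text (225 kcal/mol)
# into kJ/mol and into eV per individual bond.
# Function names are hypothetical; constants are exact defined values.

KCAL_TO_KJ = 4.184                    # thermochemical calorie: 1 kcal = 4.184 kJ
AVOGADRO = 6.02214076e23              # particles per mole (SI defining constant)
JOULE_PER_EV = 1.602176634e-19        # joules per electronvolt (SI defining constant)


def kcal_per_mol_to_kj_per_mol(energy_kcal: float) -> float:
    """Convert a molar energy from kcal/mol to kJ/mol."""
    return energy_kcal * KCAL_TO_KJ


def kj_per_mol_to_ev_per_bond(energy_kj: float) -> float:
    """Convert a molar energy in kJ/mol to the energy of one bond in eV."""
    joules_per_bond = energy_kj * 1000.0 / AVOGADRO
    return joules_per_bond / JOULE_PER_EV


n2_bond_kcal = 225.0  # value quoted in the text (Howard and Rees, 1996)
n2_bond_kj = kcal_per_mol_to_kj_per_mol(n2_bond_kcal)
n2_bond_ev = kj_per_mol_to_ev_per_bond(n2_bond_kj)

print(f"N2 bond energy: {n2_bond_kj:.1f} kJ/mol, {n2_bond_ev:.2f} eV per bond")
```

The result, roughly 941 kJ mol⁻¹ or just under 10 eV per bond, gives a sense of why both the atmospheric and the industrial routes need such energetic conditions.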

BNF capabilities are found in micro-organisms from two domains, the Archaea and the Bacteria. The major nitrogen fixing phylogenetic groups in the Bacteria are the green sulfur bacteria, Firmicutes, Cyanobacteria and Proteobacteria. In the Archaea, several genera have been found to fix nitrogen: Halobacterium, Methanobacterium, Methanococcus, Methanolobus, Methanoplanus, Methanosarcina and Methanothermus (Gary Stacey, 1992; Dixon and Kahn, 2004). Additional processes are involved in the nitrogen cycle on Earth and provide oxidized and reduced forms of nitrogen. Aerobic nitrification converts ammonia into oxidized forms, using the ammonia and nitrite oxidation pathways (NH₄⁺/NH₃ → NO₂⁻ → NO₃⁻). Denitrification converts oxidized forms to dinitrogen (NO₃⁻ → NO₂⁻ → NO → N₂O → N₂), as does anaerobic ammonium oxidation (ANAMMOX) by the Planctomycetes phylum and members of the Crenarchaeota (Francis et al., 2007).

1.3 Nitrogenase structure and function

All diazotrophic micro-organisms have one enzyme in common, the nitrogenase, which comprises about 10% of the total cellular proteins (Burns et al., 1972). Nitrogenase is an ATP-hydrolyzing complex of two proteins: dinitrogenase, an α₂β₂ heterotetramer in which α is encoded by the nifD gene and β by the nifK gene, and dinitrogenase reductase, a γ₂ homodimer encoded by the nifH gene (Georgiadis et al., 1992; Dilworth et al., 1993). These components are sometimes referred to as the MoFe protein and Fe protein, respectively.

During the last two decades crystallographic structures of nitrogenase have emerged, leading to new 3D structural models and new insights into its mechanism. Currently there are 36 3D structures of nitrogenase in the Research Collaboratory for Structural Bioinformatics Protein Data Bank (H.M. Berman, 2003). The first were crystallographic structures of nitrogenase reductase from Azotobacter vinelandii and Clostridium pasteurianum at 2.9 and 3.0 Å resolution, respectively (Georgiadis et al., 1992; Kim et al., 1993). Since then, 34 structures of nitrogenase have been resolved from A. vinelandii, C. pasteurianum, Klebsiella pneumoniae and Azospirillum brasilense at 1.16 to 3.2 Å resolution (H.M. Berman, 2000; H.M. Berman, 2003). The following paragraphs briefly describe

the structure and function of the individual components of nitrogenase.

1.3.1 Fe protein structure and function

Research based on crystallographic structures, genetic and molecular methodologies has

revealed that the Fe protein, a ~60 kDa protein, has several functionalities: it binds

MgATP/MgADP (each monomer contains an ATP-binding site in a single domain) and is

required for the initial biosynthesis of the FeMo cofactor and its insertion into the MoFe protein

(Burgess and Lowe, 1996). It also transfers electrons from a suitable donor (such as reduced

ferredoxin or flavodoxin) to the dinitrogenase. The homodimer is composed of two polypeptide

chains linked by a single redox-active Fe4S4 cluster that can reach three oxidation states

(Howard and Rees, 1996, see figure 1). The nucleotides are essential for the electron transfer

because they induce conformational changes which result in receptive iron atoms in the clusters.

The Fe protein structure reflects these multiple functionalities via its complex structure and

motifs: eight parallel beta-sheets flanked by nine alpha-helices, a nucleotide binding fold

(Walker et al., 1982) and two switch regions, designated Switch I and Switch II by Schlessman et al. (1998), which interact with the gamma-phosphate group of the bound MgATP and

facilitate the conformational changes (Jang et al., 2000; Jang et al., 2004).

Figure 1. General view of the Fe protein. The two polypeptide chains are linked by a single redox-active Fe4S4 cluster - chains F (red) and E (blue). Secondary structure depicted as determined by Tezcan et al. (2005). A1 - Fe4S4 cluster centred view, B1 - view centres on the cleft between the two chains. A2, B2 - same viewing angles, only the PCR amplified region of NifH in each chain is coloured. From Azotobacter vinelandii (PDB ID: 2AFH).

1.3.2 MoFe protein structure and function

This ~250 kDa component is encoded by the nifD and nifK genes and contains two types of clusters: P clusters and FeMo cofactors (Kim and Rees, 1994). The α subunit contains a FeMo cofactor, typically a MoFe7S9 metal cluster (see figure 2). Some organisms contain nitrogenases in which molybdenum is replaced by either iron or vanadium. Homocitrate and two residues, His and

Cys, coordinate the FeMo cofactor in the protein (Burgess and Lowe, 1996). Each P cluster

contains eight iron atoms and seven sulphides linked to the protein by six Cys residues. The

clusters serve as a conduit for electron transfer from the Fe protein to the FeMo cofactor to

which N2 has been hypothesized to bind (Howard and Rees, 1996).

Figure 2. General overview of α2β2 heterotetramer MoFe protein from Klebsiella pneumoniae (PDB ID: 1H1L), the FeMo cofactor (with the homocitrate molecule close by), cation binding site and the P cluster are marked in the image (Hawkes et al., 1984).

1.3.3 Nitrogenase modus operandi

Three events of electron transfer are involved in the nitrogenase modus operandi: (1) reduction

of Fe protein through an electron transfer from a suitable donor – ferredoxin or flavodoxin, (2)

transfer of the electron to MoFe protein, (3) electron transfer from the active site within MoFe

protein (presumably the FeMo cofactor) to the substrate. For each mol of dinitrogen reduced, 2 mol of ammonia and 1 mol of H2 are formed, consuming a total of 8 electrons (Burgess and Lowe, 1996). For every electron utilized in this fashion, 2 mol of MgATP are hydrolyzed to MgADP.

Reaction formula:

N2 + 8 H+ + 8 e- + 16 MgATP → 2 NH3 + H2 + 16 MgADP + 16 Pi
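The stoichiometry of the reaction formula can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original study; the function and constant names are illustrative assumptions, and the coefficients are the idealized values quoted above (8 electrons per N2, 2 MgATP per electron).

```python
# Idealized coefficients per mol of N2, taken from the reaction formula above.
ELECTRONS_PER_N2 = 8
ATP_PER_ELECTRON = 2
ELECTRONS_TO_NH3 = 6  # N2 + 6 e- + 6 H+ -> 2 NH3; the remaining 2 e- form H2


def nitrogenase_budget(mol_n2):
    """Return mol of reactants consumed and products formed for mol_n2 mol of N2.

    Hypothetical helper for illustration; assumes the ideal stoichiometry only.
    """
    electrons = ELECTRONS_PER_N2 * mol_n2
    atp = ATP_PER_ELECTRON * electrons
    return {
        "electrons": electrons,
        "protons": electrons,  # one H+ is consumed per electron
        "MgATP": atp,
        "NH3": 2 * mol_n2,
        "H2": 1 * mol_n2,
        "MgADP": atp,
        "Pi": atp,
    }


def hydrogen_electron_fraction():
    """Fraction of the electron flux that ends up in H2 rather than NH3."""
    return (ELECTRONS_PER_N2 - ELECTRONS_TO_NH3) / ELECTRONS_PER_N2


budget = nitrogenase_budget(1)
print(budget)
print("MgATP per NH3:", budget["MgATP"] / budget["NH3"])
print("Electron fraction to H2:", hydrogen_electron_fraction())
```

Under these assumptions, 8 MgATP are hydrolyzed per ammonia formed, and a quarter of the electrons are diverted to hydrogen production.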

The first step in nitrogenase operation is the formation of a complex between the two component proteins, a reduced dinitrogenase reductase (with MgATP bound) and dinitrogenase. One electron is then transferred to the P cluster, and 2 MgATP are hydrolyzed to 2 MgADP and 2 Pi. The next step is a slow dissociation of dinitrogenase reductase from dinitrogenase. This is usually the rate limiting step, responsible for the slow turnover time of 1.25 s (Howard and Rees, 1996). Dinitrogenase reductase is now bound with MgADP and free from the complex. It is first reduced again before the 2 MgADP and 2 Pi are released, and then 2 MgATP bind again (quite

rapidly). These steps will be repeated until 8 electrons are transferred to dinitrogenase in order

to reduce N2 to 2 NH3 (and form H2). Electrons and protons are then transferred within the

dinitrogenase (in a way not entirely known) to the active site (presumably the FeMo cofactor) to

form ammonia and hydrogen as mentioned above (Postgate, 1987; Raymond et al., 2004b). In

addition, oxygen inhibits synthesis of nitrogenase in many diazotrophs and exerts different

effects on the individual nitrogenase components. Whereas both the P and Fe4S4 clusters are

inhibited by oxygen, the Fe4S4 cluster is irreversibly damaged in vitro. Inhibition has also been

associated with the presence of reactive oxygen species (ROS) (Postgate, 1987; Berman-Frank

et al., 2003).

Nitrogenase is also a non-exclusive enzyme, capable of reducing other molecules besides

dinitrogen. Some of these are listed in table 1 (Burns et al., 1972; Rasche and Seefeldt, 1997).

Table 1. List of molecules reduced by nitrogenase.

Substrate                      Formula    Products
Acetylene                      C2H2       C2H4
Nitrous oxide                  N2O        N2 + H2O
Azide                          N3-        N2 + NH3
Cyanide                        CN-        CH4 + NH3 + CH3NH2 + traces of C2H4 and C2H6
Methyl isocyanide              CH3NC      CH4 + C2H6 + C2H4 + C3H6 + C3H8 + CH3NH2
1-Propyne, 1-Butyne (C4H6)                reduced to corresponding alkenes

Generally, proteins from extremophiles must adapt in order to retain their functionality under extremes of temperature, pH, salinity and more (Siddiqui and Thomas, 2008). Changes to the primary structure, for instance in amino acid composition, or changes at higher structural levels provide structural stability under such conditions. While an

extremophilic nitrogenase, or one of its individual components, is yet to be isolated and

characterized in depth, other proteins from halophiles, thermophiles and some psychrophiles

(Eisenberg et al., 1992; Madern et al., 2000; Feller and Gerday, 2003; Georlette et al., 2003)

have been assessed and provide a starting point to look at the Fe protein and its possible

adaptations to extreme conditions. Advances in computing power (Schaller, 1997) mean that structural analysis based on bioinformatics and molecular results provides an increasing

number of plausible models to work with (Polański and Kimmel, 2007; Edwards et al., 2009;

Ramsden, 2009).

1.4 Diazotroph phylogeny

Genetic research on nitrogen fixation genes originally focused on Klebsiella pneumoniae nif

genes (Postgate, 1987; Glenn and Dilworth, 1991; Gary Stacey, 1992). Comparing the nif gene

structure of K. pneumoniae to other diazotrophs like Azotobacter vinelandii, Clostridium

pasteurianum or Anabaena spp. revealed a high level of nucleotide conservation between nifH

genes in the different diazotrophs. Originally, 20 nif genes were identified, arranged in eight

transcriptional units: nifJ, nifHDKTY, nifENX, nifUSVWZ, nifM, nifF, nifLA, nifBQ (Renato et

al., 2000). Transcription of the nif operons is prevented in the presence of oxygen or sources of combined nitrogen, and when the temperature rises above an organism-specific threshold (Fay, 1992; Klopprogge et al., 2002; Steunou et al., 2006). The system is regulated by

nifL,A products and additional genes, ntrA,B,C and glnB,D (Bohme, 1998).

1.4.1 Cyanobacteria

Cyanobacteria have been of special focus with regard to the genetic basis for nitrogen fixation, as they are known to exhibit high fixation rates and contribute substantially to the global biological nitrogen fixation budget (Gallon, 2001; Berman-Frank et al., 2003). Their ability to fix dinitrogen has been studied extensively, and some cyanobacterial groups synthesize nitrogenase exclusively in specialized heterocyst cells (Fleming and Haselkorn, 1973, see table 2). Some heterocystous Cyanobacteria are not exclusive: in Anabaena variabilis ATCC 29413, for example, the nif genes are organized in two clusters, nif1, which is expressed only in heterocysts, and nif2, which is expressed in vegetative cells only under anaerobic conditions and is also expressed in heterocysts (Fleming and Haselkorn, 1973; Bohme, 1998; Adams, 2000). Also, nifH and nifD are contiguous and separated from nifK by 11 kb. During heterocyst differentiation, excision of the 11 kb fragment (by the xisA gene product) restores the nifHDK operon and synthesis of the nitrogenase subunits begins (Bohme, 1998). However, it seems that the nifHDK genes are contiguous in non heterocystous strains (Berman-Frank et al., 2003). Most non heterocystous Cyanobacteria can fix nitrogen only under micro-oxic or anoxic conditions, which occur, for instance, when photosynthesis is not active and therefore oxygen is not produced, i.e. during dark periods (Bergman et al., 1997).

Table 2. Nitrogen fixing Cyanobacteria genera, heterocystous and non heterocystous.

N2 fixing heterocystous Cyanobacteria
  Nostocales: Anabaena, Aphanizomenon, Calothrix, Cylindrospermum, Nodularia, Nostoc, Scytonema
  Stigonematales: Chlorogloeopsis, Fischerella, Geitleria, Stigonema

N2 fixing non heterocystous Cyanobacteria
  Aerobic: Synechocystis group, Gloeothece, Cyanothece group, Gloeocapsa group, Synechococcus group, Trichodesmium, Oscillatoria
  Micro-oxic or anoxic conditions: Pseudanabaena, Lyngbya, Phormidium, Plectonema, Oscillatoria
  Anoxic: Chroococcidiopsis, Dermocarpa, Myxosarcina, Xenococcus, Pleurocapsa group

1.4.2 Other prokaryotic diazotrophs

BNF is not restricted to the Cyanobacteria and is present in several other prokaryotic phyla. Phylogenetic analysis based on nif genes and nif gene homologs depicts the diazotrophic community as five distinct groups (I-V) (see figure 3, Raymond et al., 2004a). Groups I and II are diazotrophs with a molybdenum dependent nitrogenase, an active nitrogenase with molybdenum in the dinitrogenase component, operative under aerobic and anaerobic conditions. These groups include members of the Cyanobacteria, Proteobacteria, Firmicutes (Clostridia), Actinobacteria (Frankia) and Archaea (methanogens). Group III includes the molybdenum independent nitrogenases, active alternative nitrogenases which can use iron or vanadium as a cofactor in the metal clusters. This group includes strictly anaerobic Proteobacteria, Spirochetes, Chlorobia and Archaea members. Group IV comprises organisms which have the genes but do not fix dinitrogen, and are mostly Archaea. Group V is a diverse group of organisms which do not fix dinitrogen, yet possess genes homologous to nifH and nifD which encode protochlorophyllide reductase and chlorophyllide reductase. These enzymes are analogues of nitrogenase and are related to pigment biosynthesis. nifH gene phylogeny analyses support the above grouping in general (Chien and Zinder, 1996; Zehr et al., 2003a; Moisander et al., 2006; Zhang et al., 2007a).

Phylogenetic studies based on nifH, D, K, E, N gene sequences yielded tree topologies that were fairly similar to 16S rDNA phylogeny (Zani et al., 2000; Jenkins et al., 2004). Additionally, nifH was found to be highly conserved across diverse taxa in general, as well as being part of a conserved nifHDK transcriptional operon (Omoregie et al., 2004b).

These recurring results, from genetic and evolutionary studies, support a vertical type of gene transfer from a common archaeal ancestor, followed by loss of gene activity due to

environmental adaptations (Fani et al., 2000). The inconsistencies with 16S rDNA phylogeny

are usually explained by lateral gene transfer and by loss of genes due to loss of function (Hartmann and Barnum, 2010). In general, the evolutionary history of the nif genes is complicated and not yet entirely resolved.

Figure 3. General overview of an unrooted nifH gene tree topology modified from Zehr et al. (2003a) with four major clusters I-IV.

It is of interest to review what is known of nitrogen fixation in extreme environments. The

following paragraphs provide background on nitrogen fixation in relation to cryospheric,

hypersaline and high temperature environments, from a microbiology point of view.

1.5 Psychrophilic diazotrophs

Nitrogen fixation has been studied in Antarctica for several decades. Early studies in the 1960s detected nitrogen fixation by Cyanobacteria, mainly by the genera Anabaena, Calothrix and Nostoc, and to a lesser extent by the genera Stigonema and Tolypothrix (Smith and Russell, 1982). Nitrogen fixation was usually detected between 4 and 10°C, during midday, and was rarely detected during winter or below 0°C (Stewart, 1970b; Davey and Marchant, 1983). More recently, N2 fixation was found to represent between 6.3% and 33% of total N incorporated by the microbial component in ponds or soils in Antarctica (Fernandez-Valiente et al., 2001), with the higher end of contributed N reported from microbial mat studies, mostly from surface layers and during daytime (Fernandez-Valiente et al., 2007), supporting heterocystous Cyanobacteria as

the substantial providers of reduced nitrogen in the Antarctic ecosystem (Vincent et al., 1993).

Recent studies have also reported unicellular (Gloeocapsa, Synechococcus) and filamentous non-heterocystous (Oscillatoria, Phormidium) Cyanobacteria as active nitrogen fixers, usually under dark conditions and at substantially lower optimal temperatures than tropical or temperate strains (Pandey et al., 2004). These Cyanobacteria were not considered true psychrophiles, since their nitrogen fixation optima were in the range of 15-25°C, and they were not able to grow at 0°C or at subzero temperatures (Pikuta et al., 2007).

While Cyanobacteria were the dominant active nitrogen fixers reported in most studies, other potential diazotrophs have been reported from the Proteobacteria, Verrucomicrobia, Firmicutes, Spirochaetes and Bacteroidetes in Antarctica. Representatives of these major phyla were also found in other cryospheric environments such as ice shelves, sub-glacial lakes and streams, as well as fjords and deep sea basalt flows (Priscu et

al., 1998; Carpenter et al., 2000; Bowman et al., 2003; Gaidos et al., 2004; Liu et al., 2006;

Perreault et al., 2007; Jungblut and Neilan, 2010).

The bacterial diversity in polar permafrost is considered high as nearly 40 genera have been

isolated or cloned from Arctic and Antarctic permafrost so far (Gilichinsky et al., 2007;

Gilichinsky et al., 2008), some of which are diazotrophs. The various genera identified in these

regions include: Acinetobacter, Bradyrhizobium, Comamonas, Lysobacter, Methylobacterium,

Pseudomonas and Sphingomonas of the Proteobacteria, Bacillus, Clostridium, Paenibacillus,

Planococcus and Sporosarcina from Firmicutes, Flavobacterium and Pedobacter from

Bacteroidetes and Arthrobacter, Brevibacterium, Corynebacterium, Kocuria, Micrococcus,

Rhodococcus and Streptomyces from the phylogenetic group Actinobacteria (Soina et al., 1995;

Shi et al., 1997; Zhou et al., 1997; Kochkina et al., 2001; Steven et al., 2006; Vishnivetskaya et

al., 2006; Steven et al., 2007; Mindlin et al., 2008; Niederberger et al., 2008).

The permafrost environment itself is characterized by temperatures below or equal to 0°C for at

least two consecutive years (Muller, 1947) and severe environmental conditions such as extreme

cold, high salt concentrations and low nutrient supply (Friedmann et al., 1993; Aislabie et al.,

2006; Barrett et al., 2006). Permafrost covers more than 25% of the Earth’s landmass, yet its

microbiology remains largely unexplored. Relatively little is known of Antarctic permafrost

(Gilichinsky et al., 2007; Cannone et al., 2008; Niederberger et al., 2008) and most current data

originate mainly from Siberian permafrost studies (Shi et al., 1997; Bakermans et al., 2003;

Vishnivetskaya et al., 2006).

A characteristic of cryospheric environments is that they usually have low bacterial content, and their bacteria are not easy to culture (Christner et al., 2005; Miteva, 2008). They are therefore well suited to the application of molecular techniques for exploring the diversity and richness of their bacterial communities. Terminal Restriction Fragment Length Polymorphism (T-RFLP) is a DNA fingerprinting method that produces bacterial community profiles and matches bacterial genera to specific terminal restriction fragments (T-RFs) after digestion of

fluorescently labelled 16S rRNA amplicons with specific restriction enzymes (Liu et al., 1997;

Marsh et al., 2000; Derakshani et al., 2001).
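The principle behind matching genera to T-RFs can be illustrated in silico. The sketch below is an assumption-laden illustration rather than the procedure used in any of the cited studies: the sequences are made-up toy strings, the function names are hypothetical, and only the forward strand of an amplicon labelled at its 5' end is considered. The enzyme shown is MspI, which recognises CCGG and cuts after the first C.

```python
from collections import Counter


def terminal_fragment_length(amplicon, site, cut_offset):
    """Predict the length of the 5' terminal restriction fragment (T-RF).

    amplicon   -- amplicon sequence written 5'->3', labelled at its 5' end
    site       -- recognition sequence of the restriction enzyme
    cut_offset -- number of bases of the site that remain on the 5' fragment
    Returns the full amplicon length when the site is absent (uncut amplicon).
    """
    pos = amplicon.upper().find(site.upper())
    if pos == -1:
        return len(amplicon)
    return pos + cut_offset


def community_profile(amplicons, site, cut_offset):
    """Count how many amplicons yield each predicted T-RF length."""
    return Counter(
        terminal_fragment_length(seq, site, cut_offset) for seq in amplicons
    )


# Toy sequences (hypothetical, far shorter than real 16S rRNA gene amplicons).
toy_amplicons = {
    "taxon_A": "ATGCATTACCGGTTAGC",
    "taxon_B": "ATGCCGGATTACGTTAGC",
    "taxon_C": "ATGCATTAGGTTAGCAA",  # no CCGG site, stays uncut
}

for name, seq in toy_amplicons.items():
    print(name, terminal_fragment_length(seq, "CCGG", 1))

print(sorted(community_profile(toy_amplicons.values(), "CCGG", 1).items()))
```

In a real experiment the fragment sizes are measured by electrophoresis of the fluorescently labelled fragments, and different taxa can share the same T-RF length, so a predicted length narrows down candidates rather than identifying a genus outright.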

T-RFLP has been used widely in microbial ecology studies of temperate zones and in diverse

environments such as marine and lake sediments, soils, plant roots and more (Clement et al.,

1998; Marsh, 1999; Liesack and Dunfield, 2004). However, to date it has been rarely used for

community analysis of permafrost or glacial environments. Bhatia et al. (2006) employed this

method to explore the relationship between supra-, sub-, and pro-glacial bacterial communities

of the John Evans Glacier (Canada) but did not identify any bacteria. T-RFLP is a quick and

sensitive molecular technique for exploring possible bacterial genotypes in a given

environmental sample, enabling future studies to target specific groups or genes.

1.6 Halophilic diazotrophs

It is of interest to look into nitrogen fixation and halophilicity in two aspects - halophilic

diazotrophs, and nitrogen fixation in hypersaline environments in general.

Halophilic micro-organisms require salt in the medium for optimal growth and can be divided into slightly, moderately or extremely halophilic (2-5%, 5-20% and a minimum of 20-30% NaCl in the medium, respectively). Halophiles can be found in the Archaea, Bacteria and Eukaryota domains (DasSarma and Arora, 2006; Ma et al., 2010). Hypersaline environments are generally defined as containing salt in higher concentration than sea water (3.5% total dissolved salts, or 35 PSU). Halophilic micro-organisms have been detected in and isolated from solar saltern ponds, the Great Salt Lake (USA), the Dead Sea (Israel), African soda lakes, Hamelin Pool (Western Australia), deep-sea brines, and many other localities worldwide (Oren, 2002; DasSarma and

Arora, 2006; Ma et al., 2010; Goh et al., 2011).
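The salinity categories above can be written as a simple lookup. This is a minimal sketch under stated assumptions: the function names are hypothetical, the quoted ranges share their boundary values (5% and 20%), so the boundaries are assigned here to the higher category, and anything at or above 20% NaCl is treated as extremely halophilic.

```python
SEAWATER_PERCENT = 3.5  # total dissolved salts in sea water, as quoted above


def classify_halophile(nacl_percent):
    """Classify by the NaCl concentration (% in medium) needed for optimal growth.

    Assumption: shared boundaries (5% and 20%) fall into the higher category.
    """
    if nacl_percent >= 20:
        return "extremely halophilic"
    if nacl_percent >= 5:
        return "moderately halophilic"
    if nacl_percent >= 2:
        return "slightly halophilic"
    return "below the halophile ranges quoted"


def per_mille_to_percent(salinity_per_mille):
    """Convert a salinity in parts per thousand to percent."""
    return salinity_per_mille / 10.0


def is_hypersaline(total_salts_percent):
    """An environment is hypersaline when it is saltier than sea water."""
    return total_salts_percent > SEAWATER_PERCENT


print(classify_halophile(3))    # slightly halophilic
print(classify_halophile(12))   # moderately halophilic
print(classify_halophile(25))   # extremely halophilic
print(is_hypersaline(per_mille_to_percent(74)))  # True
```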

Moderate diazotrophic halophiles exist amongst the Cyanobacteria and other prokaryotes (see

table 3). Very few extreme halophilic bacteria possess nif genes, and none have been studied

extensively in terms of their nitrogen fixation capabilities. Halorhodospira halophila (γ-proteobacteria) nif genes have been mapped and its nitrogenase shown to be active and to mediate hydrogen production (Tsuihiji et al., 2006). nifH genes were also reported from H. abdelmalekii

and H. halochloris (Tourova et al., 2007) of the Ectothiorhodospiraceae family (γ-proteobacteria; Chromatiales), which also includes additional halophilic diazotrophic genera, Ectothiorhodospira and Thiorhodospira. These species are slightly to moderately halophilic,

and their nitrogen fixation capacity remains largely unexplored (Hirschler-Réa et al., 2003;

Imhoff, 2006).

None of the known nitrogen fixing Archaea, members of the Methanococcales, Methanomicrobiales and Methanobacteriales, are halophiles (Leigh, 2000).

Nitrogen fixation studies in the Dead Sea (347 g l-1 salinity) have not been conducted to date, though several halophilic micro-organisms with potential for diazotrophy have been isolated. A halophilic Rhodospirillum sodomense has been isolated from the Dead Sea, but it lacked the nitrogenase activity usually found in the family Rhodospirillaceae (Madigan et al., 1984; Mack et al., 1993). Another moderate halophile from the Dead Sea, Ectothiorhodospira marismortui, grew on N2 very poorly or not at all (Oren et al., 1989). The Dead Sea represents the most saline environment known to date, and thus the upper salinity limit for nitrogen fixation.

However, even though an extremely halophilic diazotroph seems to be rare, studies of other hypersaline environments clearly indicate that nitrogen fixation occurs under such stressful conditions. A few investigations in such environments have revealed different dynamics of nitrogen fixation (Pinckney et al., 1995; Paerl et al., 2003; Yannarell et al., 2007).

A microbial mat in a tropical hypersaline lagoon (74‰ salinity) exhibited higher nitrogen fixation rates once introduced to lower salinity levels, from 74 to 37‰ (Pinckney et al., 1995). Additional experiments have reported similar results (Paerl et al., 2003; Yannarell et al., 2007), with the interesting addition that non-cyanobacterial diazotrophs were more sensitive to salinity changes than cyanobacterial diazotrophs (Yannarell et al., 2006). Nitrogen fixation rates were rather similar during dark and light periods until oxygenic photosynthesis was blocked, which caused a sharp increase in nitrogen fixation rates under light conditions (Pinckney and Paerl, 1997). These results suggested that halophilic anaerobic phototrophic diazotrophs were as important to nitrogen fixation as Cyanobacteria, yet more sensitive to changes in salinity, and hence their composition may vary. In another study, hypersaline (90-78‰) microbial mats dominated by Microcoleus chthonoplastes showed high nitrogen fixation rates during night time and low fixation rates during the day (Omoregie et al., 2004b).

This study established that the active nitrogen fixers were non heterocystous Cyanobacteria (Plectonema boryanum, Halothece, Phormidium spp.) and a halophilic anaerobic sulphate reducer similar to Desulfovibrio spp. (Omoregie et al., 2004b). This suggested that lack of oxygen enabled more diazotrophs to actively fix nitrogen.

Table 3. Representatives of moderately halophilic diazotrophs.

Cyanobacteria: Halothece, Microcoleus chthonoplastes, O. limnetica, O. salina, Oscillatoria neglecta, Phormidium ambiguum, Synechococcus
Chloroflexi: Chloroflexus aurantiacus
Bacteroidetes/Chlorobi group: Chlorobium limicola, C. phaeobacteroides
Proteobacteria: Alkalilimnicola halodurans, Desulfovibrio halophilus, Ectothiorhodospira, Halomonas maura, Marichromatium purpuratum, Rhodospirillum salexigens, Thiocapsa roseopersicina, Thiorhodococcus minor, Thiorhodospira sibirica

References: (Madigan et al., 1984; Yakimov et al., 2001; Oren, 2002; Argandoña et al., 2005; DasSarma and Arora, 2006; Imhoff, 2006; Tsuihiji et al., 2006; Tourova et al., 2007).

Halophilic Bacteria and Archaea adapt to saline conditions mainly via ‘salt in’ or ‘salt out’ strategies, together with cell membrane and proteomic modifications (Pikuta et al., 2007). With the first strategy, a halophile accumulates salt ions (K+, Cl-, Na+) in high concentrations within the cytoplasm, thus creating an internal osmotic pressure to counterbalance the environmental stress (Oren, 1986, 1999). Due to the high concentration of salt ions, the intracellular electrostatic charges of enzymes change significantly, and further adaptations in enzyme structure and composition are required to maintain activity and to bind water molecules and ions efficiently (Rengpipat et al., 1988; Madern et al., 2000). Oren (1999) states that the ‘salt in’ strategy has been found to date only in the orders Halobacteriales (Archaea) and Haloanaerobiales (Bacteria). In the second strategy, ‘salt out’, a halophile synthesises and accumulates organic compatible solutes such as betaines, ectoines, N-acetylated diamino acids and N-derivatized carboxamides of glutamine in order to maintain osmotic balance (Galinski and Trüper, 1994). It is suggested that these low molecular weight osmolytes interact with water molecules via their

hydrophilic and hydrophobic regions and counteract the ionic imbalance, yet the exact mechanism of their interaction with proteins is still under investigation (Galinski,

1993; Oren, 1999).

In halophilic Archaea, membrane modifications may include specific transport systems to accommodate the import or export of salt ions, bacteriorhodopsin (a light driven proton pump, used to expel salt ions) and a high content of glycerol isopranoid ether lipids to maintain membrane integrity under high salt concentrations (Yamauchi et al., 1992;

Gambacorta et al., 1995; van de Vossenberg et al., 1998; van de Vossenberg et al., 1999;

Gliozzi et al., 2002).

Theoretically, proteins in micro-organisms which employ several of these strategies would not require specific adaptations in order to compete with the salt ions for water molecules. Yet genetic analysis of several halophilic bacterial genomes, from organisms known to employ compatible solutes for stress management, has clearly indicated changes at the genetic level and in protein residue composition in comparison to non-halophilic bacterial proteins (Severin et al., 1992; Galinski and Trüper, 1994; Oren, 1999; Paul et al., 2008; Rhodes et al., 2010), suggesting there are specific genetic variations for proteins coping with salt induced stress conditions. The main finding from metagenomic studies of halophilic micro-organisms was that halophilic proteins possess more acidic residues (Asp, Glu) on the protein exterior than in their interior or in the active site (Lanyi, 1974; Rao and Argos, 1981; Madern et al., 1995; Madern et al., 2000; Fukuchi et al., 2003).

1.7 Thermophilic diazotrophs

The hot geysers of California were the first terrestrial environment in which a thermophilic

Chlamydobacteriales was discovered in 1866 (Brewer, 1866; Edwards, 1868). Since then, the known temperature boundaries for life have expanded. High temperatures can degrade chlorophyll (>75°C), proteins and nucleic acids (>70°C) and increase the fluidity of membranes, and yet thermophilic Archaea and Bacteria can survive and grow at high temperatures. They can be further divided into moderately thermophilic micro-organisms, which have a growth optimum at 50-60°C, thermophilic micro-organisms, with an optimum higher than 70°C, and hyperthermophilic micro-organisms, with an optimum higher than 80°C (Rothschild and Mancinelli,

2001; Pikuta et al., 2007).
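The temperature classes quoted above leave the 60-70°C interval unassigned, which becomes visible when the scheme is written out. The following is an illustrative sketch only; the function name is hypothetical and optima outside the quoted ranges are reported as unclassified rather than guessed.

```python
def classify_thermophile(growth_optimum_c):
    """Classify an organism by its growth optimum in degrees Celsius.

    Uses only the ranges quoted in the text; optima that fall outside them
    (for example between 60 and 70 degrees) are returned as unclassified.
    """
    if growth_optimum_c > 80:
        return "hyperthermophilic"
    if growth_optimum_c > 70:
        return "thermophilic"
    if 50 <= growth_optimum_c <= 60:
        return "moderately thermophilic"
    return "unclassified by this scheme"


for optimum in (37, 55, 65, 75, 85):
    print(optimum, classify_thermophile(optimum))
```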

Microbial mats in hot environments, mainly hot springs, have been studied in regards to their

diazotrophic capabilities. A few decades of research into microbial mats from Yellowstone

National Park have portrayed the nitrogen fixation dynamics and participants within a wide temperature range (16-82°C) in this unique environment (Stewart, 1970a; Miyamoto et al., 1979). Within the mats, nitrogen fixation occurs in various layers, during both day and night. During daytime, the heterocystous Cyanobacteria Mastigocladus laminosus and members of the genus Calothrix were established as the active nitrogen fixers, at 55°C and 40°C respectively, in the mid layers of the mats (Stewart, 1970a; Miyamoto et al., 1979). Under dark conditions, 14 morphologically diverse sulphate reducing anaerobic diazotrophs fixed nitrogen at temperatures of 30-60°C (Wickstrom, 1984). Unicellular Synechococcus spp. have also been identified as active nitrogen fixers at 60°C; nifHDK gene transcripts were high during sunset and nil when light levels were high and the mat was oxic (Steunou et al., 2006). Accordingly, nitrogenase activity (measured via acetylene reduction) was highest during night time, when the mat was anoxic. It would appear, then, that in the hot springs of Yellowstone National Park, unicellular Synechococcus in the upper layers of the mats, as well as heterocystous Mastigocladus in the mid layers, fix atmospheric nitrogen with temporal differences. Heterotrophic bacteria fix nitrogen during night time, when oxygen levels are low (Steunou et al., 2006; Steunou et al., 2008; Hamilton et al., 2011b).

Roseiflexus spp. have been identified as potential diazotrophs in this system (Klatt et al., 2011) and, recently, a diverse array of nifH phylotypes has been reported from 57 springs, including springs at 89°C, in Yellowstone National Park (Hamilton et al., 2011a). The most frequently recurring phylotypes were identified as Mastigocladus laminosus strain CCMEE 5201, Synechococcus sp. JA-3-3Ab (Cyanobacteria), Burkholderia tropica, B. xenovorans LB400, and Dechloromonas sp. SIUL (β-Proteobacteria). Aquificae, α-γ- -Proteobacteria and Verrucomicrobia diazotrophic representatives were less frequent (Hamilton et al., 2011a). The maximum rates of nitrogen fixation were recorded at 82°C and pH 2.5 by an isolated anaerobic single nifH phylotype, related to Leptospirillum ferrooxidans (Hamilton et al., 2011b). This is the highest recorded temperature for nitrogen fixation by a bacterial species. The bacterial hyperthermophiles Hydrogenobacter thermophilus strain TK-6 and Thermocrinis albus DSM 14484 (Aquificales) possess a nifH gene copy in their respective genomes (NC_013799, CP001931), yet taxonomic studies of these species and others in the order Aquificales have not indicated that they actively fix atmospheric nitrogen (Kawasumi et al., 1984; Huber et al., 1998; Eder and Huber, 2002). In the Archaea, the highest temperature for nitrogen fixation was recorded at 92°C, by a Methanocaldococcus jannaschii-like isolate (Mehta and Baross, 2006) with a nifH gene copy most similar to that of Methanothermococcus thermolithotrophicus, the only other known thermophilic archaeon to fix nitrogen at high temperatures (Belay et al., 1984).

Thermophiles accumulate compounds, such as amino acids and sugars (and their derivatives), as

well as mannosylglycerate and glucosylglycerate, in response to stress conditions (Borges et al.,

2002). Under high temperatures it was found these compounds protect enzymes from denaturing

or aggregating, thus demonstrating their multipurpose function under heat stress as well as in osmotic

adjustment under saline stress (Empadinhas and da Costa, 2010). In addition, proteins from

thermophiles have several characteristics which enable them to function under normally

damaging temperatures; extremely thermostable enzymes can remain active above 85°C (Pikuta

et al., 2007). These features include changes in the primary, secondary and tertiary structural

hierarchies, which produce a compact thermophilic protein, highly complex, relatively short in

length and more hydrophobic in nature, in comparison to mesophilic or non-thermophilic

homologs (Jaenicke and Böhm, 1998; Haney et al., 1999). A higher percentage of charged

amino acids (Glu, Lys, Arg), accompanied by fewer uncharged polar residues (Ser, Thr, Asn,

and Gln) and more salt bridges provide a network of ionic bonds and hence stability to the

tertiary structure (Daniel et al., 2008; Somero, 2003). Additional features reported included:

shortening of the N- and C-terminals, increased amounts of Pro, decreased Gly content, fewer

and smaller internal cavities and higher degrees of oligomerisation. Thermostable enzymes are

thus more rigid, and need higher melting temperatures to denature and become inactive

(Jaenicke and Böhm, 1998; Somero, 2003; Greaves and Warwicker, 2009).
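
The compositional trends described above can be screened for in a protein sequence with a few lines of code. The following is a minimal sketch, not a method used in this thesis: the function name `composition_summary` and the example peptide are hypothetical, and the residue groupings simply follow the text (charged: Glu, Lys, Arg; uncharged polar: Ser, Thr, Asn, Gln).

```python
from collections import Counter

# Residue groups as listed in the text (one-letter codes).
CHARGED = set("EKR")          # Glu, Lys, Arg
UNCHARGED_POLAR = set("STNQ") # Ser, Thr, Asn, Gln


def composition_summary(seq: str) -> dict:
    """Return the fraction of residues in each group for one protein sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    counts = Counter(seq)

    def fraction(residues):
        return sum(counts[r] for r in residues) / n

    return {
        "charged": fraction(CHARGED),
        "uncharged_polar": fraction(UNCHARGED_POLAR),
        "pro": counts["P"] / n,
        "gly": counts["G"] / n,
    }


# Hypothetical 10-residue peptide, for illustration only.
summary = composition_summary("EKRSTNQPGA")
print(summary)
```

Comparing such fractions between a thermophilic protein and its mesophilic homolog gives only a first indication; structural features such as salt bridges and internal cavities need a three-dimensional model.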

1.8 Analyzing nitrogen fixation

There are several molecular and chemical methods available to analyze nitrogen fixation and

collect relevant data. Dinitrogen fixation rates are usually measured by two techniques - 15N2

uptake and the Acetylene Reduction Assay (ARA) (Stewart, 1967; Stewart, 1973). Potential and

active nitrogen fixers are usually determined by extraction of DNA and RNA from

an environmental sample or bacterium of choice, followed by Polymerase Chain Reaction (PCR)

amplification process and analysis (Muyzer et al., 1993).

The molecular approach of analysing nitrogen fixation via DNA or RNA extractions is quite

robust and reliable, with few known disadvantages. In general, even though DNA-based

methods are considered better in exploring natural microbial diversity than classic culturing

techniques (Amann et al., 1995; Head et al., 1998), there are several possible biases generated

by DNA extraction methods and PCR kinetics which might affect the objective representation

of an uncultured environmental microbial community. Adsorption of DNA to soil particles or

mucilaginous polysaccharides produced by many micro-organisms can inhibit DNA extraction

(Frostegard et al., 1999; Tillett and Neilan, 2000). The PCR process may be faulty at the

selection stage e.g., higher binding efficiencies to GC rich templates, or at the drift

(amplification) stage, resulting in a 1:1 product ratio bias, due to quick amplification of an

initially higher concentrated template. This would then result in a biased view of the original

sample DNA content and composition (Suzuki and Giovannoni, 1996; Polz and Cavanaugh,

1998). Additional problems in the PCR process (mostly relating to 16S rDNA amplification)

include, for instance, PCR chimeras, bias due to PCR cycling conditions and limitations in

primer design (Wilson and Blitchington, 1996; Marchesi et al., 1998; Qiu et al.,

2001).

These problems can, however, be circumvented, and molecular techniques to identify

diazotrophs in environmental samples, via amplification of the nifH gene specifically, have been

successfully implemented and reviewed by the scientific community for at least two decades

(Zehr and McReynolds, 1989). Specifically, the nested PCR approach, targeting nifH gene, has

been successfully tried and implemented in environmental studies of aqueous origins (marine,

fresh water, ice, snow, salt pans, etc) and terrestrial origins (soil, rhizosphere, rocks, etc) under a

wide range of physical and chemical conditions (Zehr et al., 1995; Affourtit et al., 2001; Brown

et al., 2003; Mehta et al., 2003; Short and Zehr, 2005; Izquierdo and Nüsslein, 2006; Jungblut

and Neilan, 2010; Singh et al., 2010).

There are over 38,000 matches of the nifH gene currently in the NCBI GenBank database (as of

December, 2011), making it a favourable reference gene for use in phylogenetic and genetic

studies. The partially amplified portion of the nifH gene encodes the nitrogenase Fe protein and

provides insights into the function and structure of this important protein.

1.9 Research aims

It is thus evident that a wide variety of diazotrophs in microbial mats participate in diel cycles of

nitrogen fixation, under stressful conditions. While the general dynamics remain similar, the

diazotrophic participants are different per extreme environment, and most probably represent an

optimal adaptation to the respective environment.

I chose to identify potential diazotrophs from three different environments: Antarctic

permafrost, halophilic microbial mats from Western Australia and a thermophilic microbial

population from a hot and slightly radioactive spring in South Australia. I have also assessed

their adaptation to environmental conditions via changes to the Fe protein, as manifested in the

nifH gene.

We aimed to assess the diversity and potential for diazotrophs in Boulder Clay and Amorphous

Glacier, two ice-free areas in Terra Nova Bay, Antarctica. I have employed molecular and

computational methods which included environmental DNA extraction, amplification of the

bacterial 16S rDNA and Terminal Restriction Fragment Length Polymorphism (T-RFLP)

analysis, followed by an in-depth analysis with the T-RFLP Analysis Program (TAP). This

allowed for a diversity and structure analysis, with preliminary results as to which

diazotrophs are present in these unique sites.

The question of nitrogen fixation in the Shark Bay environment has never been addressed

before. I chose to employ a molecular approach which included environmental DNA extraction,

PCR amplification of the nifH gene followed by clone libraries, restriction fragment length

polymorphism, DNA sequencing and phylogenetic sequence analysis (Zehr et al., 1998;

Omoregie et al., 2004c). I was able to characterise diazotrophs in samples obtained in two

different years, and assess the diversity and structural changes to the bacterial community as

well as potential halophilic adaptations in the Fe protein of the stromatolites.

Paralana Hot Springs (55.6°C), a hot spring in South Australia, has previously been investigated for its

bacterial community (Anitori et al., 2002), but nothing is known regarding its diazotrophic

diversity. I used the same research procedure as described for the Shark Bay environment. I was

able to compare the diazotrophic community characteristics to other thermal microbial systems

and assess potential thermal adaptations in the Fe protein of the springs’ diazotrophs.

Specific research aims were -

1. Estimation of bacterial diversity and identification of potential nitrogen fixers using T-RFLP

community analysis and PCR amplification of the nifH gene in glacial and permafrost

formations in Northern Victoria Land, Antarctica (chapter 2).

2. Assessment of diazotrophic diversity, richness and community structure in stromatolites in

Shark Bay, Western Australia from two different years (1996 & 2004, chapter 3).

3. Assessment of diazotrophic diversity, richness and community structure in Paralana Hot

Springs, South Australia (chapter 4).

4. Analysis of molecular data from aims 2 and 3, and investigation of potential adaptations in the Fe

protein composition and structure (chapter 5).

Overall, extreme environments harbour novel solutions for biotechnology, as well as analogous

conditions to environments on other worlds. The overall objective of this thesis was to

contribute information in regards to diazotrophs in extreme environments and how they adapt to

their environment.

Chapter 2 T-RFLP analysis of potential diazotrophs in glacial and

permafrost formations in Northern Victoria Land, Antarctica.

_______________________________________________

2.1 Introduction

Antarctica has been the focus of microbial research for some time now, due to its extreme

climate and pristine conditions. Until a few decades ago, glacial formations and permafrost

areas on the Antarctic continent have been seen as abiotic systems. However, new data are

emerging that indicate microorganisms live within cryospheric geological features. Diverse

bacterial compositions have been described from recent and ancient permafrost (Rivkina et al.,

2004, and references within). Bacteria were found in ice cores from Lake Vostok, Mizuho Base

in the Enderby Land Mountains, and the Yamato Mountains in Dronning Maud (Christner et al.,

2001; Segawa et al., 2010). Nitrospira isolates, for instance, were detected in Luther Vale soil

samples, in Northern Victoria Land and also in sediment cores at 761 m.b.s.l. from the Mertz

Glacier Polynya (MGP), Antarctica (Bowman and McCuaig, 2003; Aislabie et al., 2009).

Antarctic microbial populations have changed our view of the continent as abiotic, and

substantial research has identified mainly Proteobacteria members, as well as Firmicutes,

Cytophaga-Flavobacteria-Bacteroidetes (CFB group), Actinobacteria and Deinococcus

members, that function successfully under cold and desiccation stress (see also

chapter 1, section 1.5). Our study focused on two localities in the Terra Nova Bay area,

Northern Victoria Land (see figure 1). Past microbial studies in this area analysed various

ecological niches, such as soil and seawater from coastal and terrestrial stations (Nicolaus et al.,

1991; Nicolaus et al., 1996; Bargagli et al., 2004; Pepi et al., 2005). Some 140 bacterial isolates

were identified and characterised using molecular tools, such as 16S rDNA amplification,

fluorescence in situ hybridization (FISH), clone libraries and culture-dependent methods

(Michaud et al., 2004; Yakimov et al., 2004; Lo Giudice et al., 2007). Spore-forming Bacilli

species were identified from a seawater sample in Rod Bay, and Alicyclobacillus has been

isolated from geothermal soils on Mount Melbourne (Nicolaus et al., 1998; Pepi et al., 2005).

Burkholderia, a cold-tolerant, hydrocarbon-degrading soil bacterium, was also found in sea water

samples from Rod Bay (Yakimov et al., 2004). In addition, clones affiliated with

Burkholderiales were found in soil samples from a Northern Victoria Land locality

(Niederberger et al., 2008). Pseudomonas, a Gram-negative, aerobic bacterium known to inhabit

cold marine ecosystems, was detected in sea water samples from Santa Maria Novella and Rod

Bay (Yakimov et al., 2004; Lo Giudice et al., 2007). These studies revealed that diverse

communities exist in this area, composed principally of Proteobacteria, Bacteroidetes,

Firmicutes and Actinobacteria bacterial groups. It is unknown whether diazotrophic

communities exist in the Terra Nova Bay area, and only several genera from these studies are

known to have the nifH gene (see table 1). Interestingly, no representative from the

Cyanobacteria phylum has been reported from the Terra Nova Bay studies so far.

Table 1. Potential nitrogen fixers in Terra Nova Bay area, see references in text.

Phylum              Genus
α-Proteobacteria    Loktanella, Sulfitobacter, Methylobacterium, Paracoccus, Sphingomonas
β-Proteobacteria    Burkholderia
γ-Proteobacteria    Stenotrophomonas, Halomonas, Pseudomonas
Firmicutes          Bacillus, Paenibacillus
Actinobacteria      Micrococcus, Arthrobacter, Microbacterium

In general, relatively few micro-organisms are culturable (Amann et al., 1995) and due to the

low bacterial content in polar ice core samples and difficulties in culturing them (Christner et

al., 2005; Miteva, 2008), investigating bacterial content in ice cores requires the use of highly

sensitive techniques. Terminal Restriction Fragment Length Polymorphism (T-RFLP) is a

sensitive, affordable and applicable method used mainly for estimating the diversity of bacterial

communities.

Briefly, this method amplifies 16S rDNA templates of a target community using PCR (Clement

et al., 1998) with one primer carrying a fluorescent label. Fragmentation of the amplicons by

endonuclease restriction enzymes produces a population of fluorescently labelled terminal

fragments (‘T-RF’, length in base pairs). The fluorescent PCR products are detected using

sequencing electrophoresis technologies and are visualized as peaks - each peak represents a

fragment, post-digestion (Marsh, 1999; Blackwood et al., 2003). The general assumption in this

method is that the height of a peak represents the abundance of a fragment: the more of

fragment X that is present, the stronger the detected signal and the higher the peak (Marsh,

2005). It should be noted that the method provides a quantitative and detailed view of the PCR

product pool derived from a community, and does not accurately reflect the native community

structure (Moeseneder et al., 1999).

This method has been employed successfully in cryospheric environments. A T-RFLP analysis

of John Evans Glacier reported 142 T-RFs from 141 DNA preparations with HaeIII digestion,

suggesting a relatively low number of T-RFs for each sample preparation from the

glacier (Bhatia et al., 2006). An ice core taken from Lake Vostok, at 3589 m depth, produced 12

fragments after a universal bacterial 16S rDNA amplification and digestion (Priscu et al., 1999).

T-RFLP was also used in an extensive study of the microbial diversity in lithic niches

(sandstones, quartz, soil, etc), in the McKelvey Valley, McMurdo Dry Valleys and revealed a

complex community of bacteria and eukaryota (Pointing et al., 2009).

Biases involved in environmental DNA extraction and in primer annealing to different

templates during PCR mean that certain DNA sequences (or T-RFs) are preferentially retrieved

from a sample (Liesack and Dunfield, 2004). Therefore a particular T-RF cannot be compared

to a different T-RF in a single profile. However, it is possible to compare a T-RF to itself over

different samples, and so also a T-RF pair match can be compared to itself over different

samples (Osborn et al., 2000). In the end, the list of T-RFs in a sample is a profile of a bacterial

community present in the environmental sample.

The study area is located in Northern Victoria Land, Antarctica, close to Mario Zucchelli station

(74° 41′ 36.96″ S, 164° 6′ 42.12″ E), previously known as Terra Nova Bay station. In general,

the climate is cold with a mean annual temperature of -14°C (Frezzotti et al., 2001) and mean

monthly air temperature ranging between -26°C and 0°C. Average precipitation is 270 mm/year

water equivalent in snow (Piccardi et al., 1994). Two sites in the study area were the focus of

research - “Amorphous Glacier” (74°41’25’’ S, 164°00’ E) and “Boulder Clay” (74°44’45’’ S,

164°01’17’’ E), which are two small ice-free areas characterised by debris cones (Guglielmin et

al., 2002; Guglielmin and French, 2004).

Although in close proximity, Amorphous Glacier is above the Pleistocene grounding line and

Holocene in age, whereas Boulder Clay is below the grounding line, with sediments likely of a

glacial-marine origin and dated to the Late Pleistocene (Orombelli et al., 1991). These novel

sites have been extensively studied for their isotopic composition, mechanisms of ice

distribution and formations (Guglielmin et al., 1997; Gragnani R et al., 1998; French and

Guglielmin, 1999a; Guglielmin and French, 2004) yet to date, their microbial and diazotrophic

aspects remain unknown. We therefore proceeded to test if bacterial DNA could be obtained

from ice and permafrost cores of the Amorphous Glacier and Boulder Clay areas, and whether

bacterial community profiles differ between these two distinct sites by way of terminal-

restriction fragment length polymorphism (T-RFLP) analysis.

Knowledge of microbial life existing in ice does not only improve our understanding of the

taxonomic diversity, richness and biogeography of cold-adapted microorganisms, but also

assists in evaluating the metabolic requirements for survival and proliferation of life in the

cryosphere, and in defining the actual limits of life.

Figure 1 Antarctic study sites. Left: location of the two study sites (Boulder Clay and Amorphous Glacier). Upper right pane: View of the perennially frozen lake and debris cone in Amorphous Glacier (Guglielmin et al., 2002). Lower right pane: Frost mound at Boulder Clay (Guglielmin and French, 2004). Reproduced with permission from author.

2.2 Materials and methods

2.2.1 Study sites

Amorphous Glacier is located west of Mario Zucchelli Station (MZS) between 250 and 290 m

above sea level (see figure 1). The summit of the cone is partially collapsed and its debris cover

consists of 70-80% of light grey granitic gravel, with some granite boulders being more than 1

m in diameter. Ice within it represents congelation ice derived from ground waters formed under

different thermodynamic conditions (Guglielmin et al., 2002). The age of the cone is relatively

recent within the Holocene. The ice core stratigraphy has revealed several layers, based on

crystallographic characteristics (C-axes, bubble density, crystal size) and chemical analyses

(Guglielmin et al., 2002). These layers are summarised in table 2.

Boulder Clay site is located south of Mario Zucchelli Station (MZS) in an ice-free area, 205 m

above sea level (Guglielmin and French, 2004). The mean annual air temperature is -13.8°C and

the mean annual ground temperature at the surface (2 cm depth) is -16.1°C and at the permafrost

table (30 cm depth) -16.5°C. The mean annual temperature of the deepest monitored layer (3.6

m, within the ice), is -17°C (Guglielmin and Cannone, 2012).

In the Boulder Clay area, an ablation till of late-glacial age overlies a body of buried glacier ice

(Guglielmin et al., 1997; Gragnani R et al., 1998; Guglielmin and French, 2004), and surface

features include perennially ice-covered ponds with icing blisters and frost mounds (French and

Guglielmin, 2000), frost-fissure polygons and debris islands (French and Guglielmin, 1999b).

The frost blister is younger than 1020 ± 70 BP, while the till that generally covered

the surface of the Boulder Clay area is attributed to the Late Pleistocene and in particular to the

Ross Sea I glaciations (Orombelli et al., 1991). The analysed frost mound formed during the

late Holocene, in the middle of a perennially ice-covered lake, which is located on the

sublimation till, overlying the buried Pleistocene relict glacier ice (Guglielmin et al., 2009).

2.2.2 Ice core collection

Two ice cores were obtained during the austral summer in 1996 (Guglielmin et al., 2002) with

slow rotary drilling equipment without any chemical solutions, antifreeze liquid or any drilling

fluid in order to minimize possible contamination. A 237 cm long ice core was extracted from

the debris cone of Amorphous Glacier (AM), placed in polyethylene bags and stored at MZS

at -25°C (Guglielmin et al., 2002). The Boulder Clay (BC) core was 375

cm long and sampled from a shallow perennially-frozen pond through the underlying sediment

into the moraine-covered glacial ice. Cores were transported to Milan-Bicocca University, Italy,

and stored at -40°C for further processing. Both cores contained several distinct layers (table 2).

Amorphous Glacier was previously characterised chemically and isotopically (Guglielmin et al.,

2002).

2.2.3 Sample preparation

Samples were aseptically cut from the ice cores in a -40°C room and stored on dry ice in a

-40°C room, by a former member of the lab. Internal parts of the cores were cut by an electric

saw (repetitively washed with ethanol) and stored in sterile falcon tubes after the surface was

washed with 70% ethanol. BC samples contained a mixture of ice, stones and shells due to their

glacial-marine origin. These samples were crushed with an ethanol washed hammer. Two

duplicates from each sample were taken and stored in sterile falcon tubes for further

amplification and T-RFLP analysis.

2.2.4 DNA extraction and amplification

Samples were thawed overnight at 4°C and always kept in the dark. AM samples were then

filtered through a sterile 0.22 μm membrane (Millipore). The flow-through was collected in

sterile Falcon tubes, lyophilised and resuspended in 1 mL sterile buffer.

Filters were washed with 1 mL TNE buffer to recover bacterial cells. DNA was extracted from

the filter and flow-through fractions using a protocol as previously described (Burns et al.,

2004) with a modified incubation step with proteinase K (10 mg ml-1) and SDS (10%) to give a

final concentration of 100 μg ml-1 proteinase K in 0.5% SDS, for 1 h at 37°C, and finally

resuspended in 50 μL sterile water.
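
The stock and final concentrations quoted for the proteinase K and SDS step follow the usual dilution relation C1·V1 = C2·V2. The sketch below works through that arithmetic; the reaction volume of 500 μL is a hypothetical value chosen for illustration, since the incubation volume is not stated here, and `stock_volume` is an illustrative helper name.

```python
def stock_volume(stock_conc: float, final_conc: float, final_volume: float) -> float:
    """Volume of stock needed so that final_volume has concentration final_conc.

    Uses C1 * V1 = C2 * V2; both concentrations must share the same unit.
    """
    if final_conc > stock_conc:
        raise ValueError("final concentration cannot exceed stock concentration")
    return final_conc * final_volume / stock_conc


reaction_ul = 500.0  # hypothetical total incubation volume (assumption)

# Proteinase K: 10 mg/ml stock = 10,000 ug/ml, target 100 ug/ml (1:100 dilution).
proteinase_k_ul = stock_volume(10_000.0, 100.0, reaction_ul)

# SDS: 10% stock, target 0.5% (1:20 dilution).
sds_ul = stock_volume(10.0, 0.5, reaction_ul)

print(proteinase_k_ul, sds_ul)  # 5.0 25.0
```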

BC samples (400 mL) were added to 500 mL TNE buffer, and DNA was extracted as described

above. All BC and AM samples were resuspended in a final volume of 50 μL sterile Milli-Q

water. The DNA concentration was measured using

NanoDrop ND-1000 Spectrophotometer. The presence of bacterial DNA, as well as the quality

of extracted DNA and the presence of PCR inhibitors, was tested by universal bacterial 16S

rDNA PCR using unlabelled 27F and 1494R primers. To amplify 16S rDNA fragment for the

T-RFLP analysis, PCR was performed with a labelled universal forward primer 27F (6-FAM,

carboxyfluorescein, 5’ AGAGTTTGATCCTGGCTCAG) and universal reverse primer 1494R

(5’ TACGGCTACCTTGTTACGAC) in a 50 μL reaction (1X reaction buffer, 0.2 mM of each

dNTP, 0.25 mM MgCl2, 0.2 μM primers, 0.8 U Taq polymerase). After an initial denaturing step

at 92ºC for 2 min, 30 cycles of amplification followed (92ºC for 20 sec, 50ºC for 30 sec, 72ºC

for 1 min), concluding with an extension step at 72ºC for 7 min. DNA extraction of a microbial

mat sample from Brack Pond (McMurdo Ice Shelf, Antarctica) was used as a positive control

while filter sterilized water was used as a negative.

Table 2. Ice core sections and layers description from Amorphous Glacier and Boulder Clay samples. n.d. - no data.

Sample   Ice core section   El. Cond.          pH       Cl-         SO4 2-      Layer description (Guglielmin et al., 2002)
         (cm, depth)        (μS cm-1, 20°C)    (20°C)   (μeq L-1)   (μeq L-1)

Amorphous Glacier
AM-18    0-22               124                6.48     983.14      125.26      Active layer composed of loose sandy gravel with fine material increasing with depth.
AM-3     75-79              19.5               5.73     155.45      10.77       Massive ice with high bubble density, elongated and big crystals; chemicals maximum concentration peaked in sinusoidal cycles every 60 cm in depth.
AM-21    265-272            59.95              6.71     347.4       7.62        Massive ice with an intermediate bubble density, less elongated and smaller crystals; sinusoidal chemicals cycles were not present.

Boulder Clay
BC-1     0-15               n.d.               n.d.     n.d.        n.d.        Dynamic active layer in a small debris cone (0-30 cm depth changes).
BC-T     325-330            n.d.               n.d.     n.d.        n.d.        Massive ice and brine pockets within the ice (from a frost mound in perennial frozen pond).
BC-B     370-375            n.d.               n.d.     n.d.        n.d.        Massive ice and brine pockets within the ice (from a frost mound in perennial frozen pond).

Data taken from (Abramovich et al., 2012).

Positive results from the PCR were verified by 2% agarose gel electrophoresis and ethidium

bromide staining prior to UV transillumination. DNA concentration was measured using

NanoDrop ND-1000 Spectrophotometer. Samples were kept in the dark at all times.

2.2.5 Terminal Restriction Fragment Length Polymorphism (T-RFLP)

Quadruplicates of AM-3, BC-1, BC-T and BC-B samples were analysed as well as triplicates of

Brack Pond mat sample. Approximately 150 ng of each FAM-labelled PCR product was

digested with 6 U of the restriction endonuclease MspI or 3 U of ScrFI (New England Biolabs).

Digestions were carried out in a total volume of 10 μL overnight at 37ºC following the

manufacturer’s instructions.

The size of each Terminal Restriction Fragment (“T-RF”) was determined according to the

GeneScan™ 1200 LIZ® size standard on an ABI 3730 Capillary sequencer (Applied

Biosystems Inc.) with an acceptable error of ± 0.5 bp and also analysed using Peak Scanner™

Software Version 1.0 (Applied Biosystems Inc). T-RFs were visualized as peaks in

GeneScan™, which are characterised by width (base pairs) and height (arbitrary fluorescence

units as a linear representation of the abundance of a specific T-RF in the PCR pool). Height is

therefore a qualified estimation of the original amount of a specific DNA fragment in a sample,

prior to the PCR process (Marsh, 2005). Here, the absolute peak height was not used as a

measure of bacterial abundance, since PCR fragment levels could have originated from process

biases (Suzuki and Giovannoni, 1996).

Little background noise was evident in the electropherograms, affording an unambiguous

selection of valid T-RFs with a minimum height of 35 fluorescent units (Liesack and Dunfield,

2004). T-RFs over 35 fluorescent units in intensity and present in at least two replicates were

selected for further analysis. For comparative analysis, T-RFs within an electropherogram were

normalized to the total height of that sample (Dunbar et al., 2000) and T-RFs with a relative

height of less than 1% of the total height were excluded from further analysis. T-RFs with peak

heights determined to be off-scale by GeneScan™ were also excluded from further analysis,

unless present in other replicates at lower heights, in which case these T-RFs were adjusted to

the lower height value (Dunbar et al., 2001).
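
The peak-selection rules described above (minimum height, presence in at least two replicates, normalisation to total height and a 1% relative-height cut-off) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the procedure actually scripted for this study: `filter_profile` is a hypothetical name, a height equal to the threshold is treated as valid, the total height is taken over the peaks that passed the first two filters, and off-scale peaks are not handled.

```python
from collections import Counter


def filter_profile(replicates, min_height=35.0, min_replicates=2, min_rel=0.01):
    """Filter and normalise T-RF peaks.

    replicates: list of dicts mapping T-RF size (bp) -> peak height
                (fluorescence units), one dict per replicate.
    Returns one dict per replicate mapping T-RF size -> relative height.
    """
    # 1. Drop peaks below the minimum height.
    kept = [{s: h for s, h in rep.items() if h >= min_height} for rep in replicates]

    # 2. Keep only T-RFs seen in at least `min_replicates` replicates.
    presence = Counter(size for rep in kept for size in rep)
    valid = {size for size, n in presence.items() if n >= min_replicates}

    profiles = []
    for rep in kept:
        peaks = {s: h for s, h in rep.items() if s in valid}
        total = sum(peaks.values())
        # 3. Normalise to the total height of the replicate.
        rel = {s: h / total for s, h in peaks.items()} if total else {}
        # 4. Exclude T-RFs below the relative-height cut-off.
        profiles.append({s: r for s, r in rel.items() if r >= min_rel})
    return profiles


# Hypothetical peak heights, for illustration only.
example = [
    {81: 400.0, 145: 100.0, 300: 20.0},
    {81: 300.0, 145: 100.0, 200: 50.0},
]
print(filter_profile(example))
```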

Identical T-RFs in replicates were aligned and grouped after manually inspecting

electropherograms. Assigning a specific size to a group of similar T-RFs was based on

averaging their sizes. Similarly, assignment of relative height to a group of similar T-RFs, was

based on averaged normalized relative height values. Only a few T-RFs were separated by 1

base pair but were shown to be identical peaks after manual inspection of electropherograms.

These included T-RFs 80, 81, 82 and 145, 146 that were collectively assigned as size 81 and

145, respectively.
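
Grouping of near-identical fragment sizes can be approximated in code, although in this study the grouping was confirmed by manual inspection of the electropherograms. The sketch below is a simple single-linkage binning under an assumed tolerance of 1 bp; `bin_trfs` is a hypothetical helper, and it reports the mean size of each bin rather than the integer labels assigned in the text.

```python
def bin_trfs(sizes, tolerance=1.0):
    """Group T-RF sizes whose neighbours differ by at most `tolerance` bp.

    Returns a list of (mean_size, members) tuples, sorted by size.
    Note: chaining means 80, 81, 82 fall into one bin even though
    80 and 82 differ by more than the tolerance.
    """
    bins = []
    for size in sorted(sizes):
        if bins and size - bins[-1][-1] <= tolerance:
            bins[-1].append(size)   # extend the current bin
        else:
            bins.append([size])     # start a new bin
    return [(sum(b) / len(b), b) for b in bins]


print(bin_trfs([145, 80, 82, 146, 81, 553]))
# [(81.0, [80, 81, 82]), (145.5, [145, 146]), (553.0, [553])]
```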

2.2.6 T-RFLP profiles

The presence of similar T-RFs in each profile was the basis for the community comparison

between AM-3, BC-1, BC-T and BC-B (Dunbar et al., 2000). The T-RFs list of each sample

was considered a community profile and the similarity between Boulder Clay and Amorphous

Glacier samples was assessed.

A binary data set was created based on presence or absence of T-RFs from all samples. Bray-

Curtis analysis was performed on presence-absence data using PAleontological STatistics

program (PAST) (Hammer et al., 2001). The Venn diagram was calculated with the online

program Venny (Oliveros, 2007).
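
For presence-absence data, the Bray-Curtis similarity between two profiles reduces to twice the number of shared T-RFs divided by the total number of T-RFs in both profiles. The analysis here was run in PAST; the sketch below only illustrates the calculation, and the profiles in it are hypothetical, not the observed data.

```python
from itertools import combinations


def bray_curtis_similarity(a, b):
    """Bray-Curtis similarity for two presence-absence profiles (sets of T-RF sizes)."""
    a, b = set(a), set(b)
    if not a and not b:
        raise ValueError("both profiles are empty")
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical profiles, for illustration only.
profiles = {
    "S1": {81, 145, 553},
    "S2": {81, 145, 200, 300},
    "S3": {81, 145, 200},
}

for x, y in combinations(sorted(profiles), 2):
    print(x, y, round(bray_curtis_similarity(profiles[x], profiles[y]), 3))
```

A matrix of such pairwise values is the input to the cluster analysis; the clustering method itself is a separate choice.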

2.2.7 T-RFLP Analysis Program (TAP)

T-RFs from all profiles were matched to the in silico digestions performed by the TAP software,

on 16S rRNA genes present in the Ribosomal Database Project release 9, update 57 (Marsh et

al., 2000; Cole et al., 2003). The software produced terminal restriction fragments, after taking

into account the PCR primer binding sites, and the restriction enzyme excision sites (MspI and

ScrFI), producing a database which contained a list of T-RFs and bacteria divided into phyla,

genera and species, which were then manually matched to the T-RFs observed in each Antarctic

sample.
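
The principle of an in silico digestion can be illustrated with a short sketch. This is not the TAP implementation: it assumes an exact match of the labelled 27F primer (sequence from section 2.2.4), measures the fragment from the 5' end of the primer, and uses MspI (recognition site CCGG, cut after the first C) as the default enzyme. The function name and the test sequence are hypothetical.

```python
FORWARD_PRIMER = "AGAGTTTGATCCTGGCTCAG"  # labelled 27F primer (section 2.2.4)


def terminal_fragment_length(seq, site="CCGG", cut_offset=1, primer=FORWARD_PRIMER):
    """Predict the labelled terminal fragment length (bp) for one 16S rDNA sequence.

    site / cut_offset describe the enzyme: MspI recognises CCGG and cuts
    after the first base (C^CGG), hence cut_offset=1.
    Returns None if the primer or the restriction site is not found.
    """
    seq = seq.upper()
    start = seq.find(primer)          # exact primer match only (assumption)
    if start == -1:
        return None
    pos = seq.find(site, start)       # first site downstream of the primer start
    if pos == -1:
        return None
    return pos - start + cut_offset   # length from labelled 5' end to the cut


# Hypothetical sequence: 2 nt leader, primer (20 nt), 10 nt spacer, MspI site.
toy = "TT" + FORWARD_PRIMER + "ACGTACGTAC" + "CCGG" + "TTTT"
print(terminal_fragment_length(toy))  # 31
```

Real tools also tolerate primer mismatches and account for partial sequences, which this sketch ignores.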

2.2.8 RDP 9, TAP and T-RFLP databases

T-RFs from all profiles were used for putative bacterial identification, based on the list the TAP

software produced (section 2.2.7). We compared three 16S rDNA databases in terms of their

phylogenetic composition: RDP (release 9, update 61), RDP 9 after TAP performed an in silico

digestion with MspI and ScrFI, and a third database which was based on the sample profiles

from the T-RFLP analysis. The first two databases provided a reference point for the third

database in terms of taxa distribution.

2.2.9 PCR amplification of nifH genes

PCR amplification of nifH genes could not be carried out, mainly due to lack of source material

after optimising our methodology as described in sections 2.2.4 - 2.2.5.

2.3 Results and discussion

Molecular fingerprinting analysis based on the bacterial 16S rDNA allows us to determine the

presence of bacteria in environmental samples and their community profiles (Marsh et al.,

2000). We obtained DNA with concentrations ranging from 0.29 to 88.02 ng μL-1, with the

highest concentration from sample BC-1 (table 3). DNA yields and bacterial cell counts would,

however, be required to determine if the different DNA concentrations are due to changes in the

distribution of bacteria in the ice cores. DNA did not degrade under specified storage conditions

and the partial 16S rDNA was successfully amplified from the samples BC-1, BC-T, BC-B and

AM-3. Table 3 summarises DNA yields and results of 16S rDNA amplification from the study

site and figure 2 presents representative electropherograms from each sample. Reasons for the

failure of any amplification from the samples AM-18 and AM-21 could be due to a combination

of low DNA concentration and degraded DNA (Rivkina et al., 2004), as we did not detect PCR

inhibition in the extracted nucleic acids.

Table 3. DNA yields and results of 16S rDNA amplification from Amorphous Glacier and Boulder Clay samples. +, successful amplification; -, no amplification.

Sample   Ice core section (cm, depth)   DNA (ng μL-1)   Amplification 16S rDNA

Amorphous Glacier
AM-18    0-22                           0.29            -
AM-3     75-79                          2.44            +
AM-21    265-272                        2.47            -

Boulder Clay
BC-1     0-15                           88.02           +
BC-T     325-330                        3.48            +
BC-B     370-375                        9.29            +

In general, the number of T-RFs from the ice core samples was lower in comparison to T-RFLP

studies from non-cryospheric environments. For example, a bioreactor study reported 69 T-RFs

(McGuinness et al., 2006) and a cucumber roots study reported about 32 T-RFs using MspI

digestion (Tiquia et al., 2002).

However, the number of T-RFs from this study is within the range of results from other

cryospheric environments (Priscu et al., 1999; Bhatia et al., 2006). It would seem T-RFLP

studies of icy environments tend to produce relatively low numbers of T-RFs, suggesting

restricted bacterial diversity in these environments.

Peaks that accounted for about a third of the total fluorescence (peak height) in any profile were

usually small T-RFs (38 base pairs or lower), while longer fragments were generally one fifth or

less of the total fluorescence in any profile. Small fragments were not marked as out of scale,

and were also present in replicates. They may be the result of primer dimers (Liesack and

Dunfield, 2004; Marsh, 2005) or have originated from unidentified bacteria. In this study, they

were excluded from downstream analysis as they were considered most probably primer dimers

which did not reflect true bacterial community diversity.

2.3.1 Amorphous Glacier and Boulder Clay T-RFLP Profiles

T-RFLP analysis identified 18 T-RFs from MspI and ScrFI digestions (table 4). Four T-RFs

were found in all sites. There were 11 unique T-RFs and the majority were detected in the AM

sample (figure 3A). The BC-B and BC-1 ice-core samples had one and three unique T-RFs,

respectively, but no unique T-RFs were identified in BC-T, which also only had five T-RFs in

total. BC-T and BC-B bacterial community profiles were most similar to one another and both

were more similar to BC-1 than to Amorphous (table 4, figure 3).

The relative peak height of T-RFs can indicate their relative abundance within the bacterial

communities. In the bacterial profiles analysed here, 88 per cent of T-RFs from the MspI

digestion had a relative peak abundance of less than 10 per cent (table 4). The greatest peak

abundance was T-RF size 553 from AM-3 sample, with 39.1 per cent. In the ScrFI digestion, T-

RF size 81 was the most abundant fragment, with a relative abundance of 32.5 per cent (BC-B

and BC-1 samples), and 76 per cent of T-RFs had a relative peak abundance below 10 per cent.

This could suggest that some taxa within the bacterial communities may dominate the overall

abundance of the community profiles. Bray-Curtis similarity cluster analysis (figure 3B)

suggested that Boulder Clay T-RF profiles were similar to each other but clustered separately

from the AM-3 ice-core sample. Two possible explanations for these results are the brine

pockets in Boulder Clay, with high salt concentrations created due to partially melted ice with

hypersaline water intrusions, and the penetration of bacteria from the top to lower layers via

liquid water or micro-channels in the ice.
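
The proportions of low-abundance T-RFs quoted above can be checked directly against table 4. The following is a minimal Python sketch, not the software used in this study. It assumes that the relative peak abundances are simply pooled across the four profiles for each digestion; `percent_below` and `relative_abundance` are hypothetical helper names introduced for illustration only.

```python
# Relative peak abundances (%) transcribed from table 4, one value per
# detected T-RF per sample. Sample identity is not needed for this count.
MSPI = [1.3, 1.3, 1.6, 2.0, 5.6, 6.7, 10.1, 7.8, 1.6, 1.2,
        1.2, 1.7, 1.3, 2.4, 1.3, 39.1, 1.7]
SCRFI = [3.1, 1.3, 1.6, 1.2, 1.3, 1.2, 1.6, 15.6, 16.2, 32.5, 32.4,
         1.3, 2.7, 2.5, 2.1, 3.1, 3.6]


def relative_abundance(peak_heights):
    """Convert raw peak heights to percentages of the total fluorescence."""
    total = sum(peak_heights)
    return [100.0 * height / total for height in peak_heights]


def percent_below(values, cutoff=10.0):
    """Percentage of entries whose relative abundance is below `cutoff`."""
    below = sum(1 for value in values if value < cutoff)
    return 100.0 * below / len(values)


print("MspI :", round(percent_below(MSPI)), "% of T-RFs below 10%")
print("ScrFI:", round(percent_below(SCRFI)), "% of T-RFs below 10%")
```

Under this pooling assumption, 15 of the 17 MspI entries and 13 of the 17 ScrFI entries fall below the 10 per cent cutoff, which rounds to the 88 and 76 per cent given in the text.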

Figure 2. T-RFLP electropherograms after MspI digestion. A, AM-3; B, BC-T; C, BC-B; D, Brack Pond; E, BC-1.

Table 4. Number and relative peak abundances (%) of T-RFs > 40 bp, following MspI and ScrFI digestions of ice core samples and a 1% relative height threshold. The most abundant T-RF in a sample is marked bold.

T-RFs (bp)    Relative peak abundance (%)
              Columns: AM-3, BC-T, BC-B, BC-1

MspI digestion
43            +(1.3)
48            +(1.3)
73            +(1.6)  +(2.0)
81            +(5.6)  +(6.7)  +(10.1)  +(7.8)
145           +(1.6)  +(1.2)
147           +(1.2)
148           +(1.7)
149           +(1.3)
279           +(2.4)
538           +(1.3)
553           +(39.1)
1205          +(1.7)
Sum           6  2  4  5

ScrFI digestion
43            +(3.1)  +(1.3)  +(1.6)
76            +(1.2)  +(1.3)  +(1.2)  +(1.6)
81            +(15.6)  +(16.2)  +(32.5)  +(32.4)
116           +(1.3)
145           +(2.7)  +(2.5)  +(2.1)  +(3.1)
796           +(3.6)
Sum           6  3  4  4

Total         12  5  8  9

Figure 3. (A) Venn diagram illustrating the number of T-RFs per bacterial profile. (B) Bray-Curtis cluster analysis of 16S rDNA T-RF profiles from Amorphous Glacier (AM) and Boulder Clay (BC) obtained from 1000 bootstraps.
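
The Bray-Curtis measure behind panel B can be illustrated with a short Python sketch. This is an illustration under stated assumptions, not the analysis that produced the figure: it uses only the ScrFI values that pass the 1% threshold in table 4, it applies no bootstrapping, and the assignment of the three values in the 43 bp row to AM-3, BC-B and BC-1 is inferred from the per-sample T-RF counts. The function names are hypothetical. Because only a subset of the data is used, the values are not expected to reproduce the published dendrogram.

```python
# ScrFI relative peak abundances (%) from table 4, keyed by T-RF size (bp).
# Assumption: the three values in the 43 bp row belong to AM-3, BC-B and
# BC-1, inferred from the per-sample T-RF counts (6, 3, 4, 4).
PROFILES = {
    "AM-3": {43: 3.1, 76: 1.2, 81: 15.6, 116: 1.3, 145: 2.7, 796: 3.6},
    "BC-T": {76: 1.3, 81: 16.2, 145: 2.5},
    "BC-B": {43: 1.3, 76: 1.2, 81: 32.5, 145: 2.1},
    "BC-1": {43: 1.6, 76: 1.6, 81: 32.4, 145: 3.1},
}


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two profiles (0 means identical)."""
    sizes = set(a) | set(b)
    difference = sum(abs(a.get(s, 0.0) - b.get(s, 0.0)) for s in sizes)
    total = sum(a.values()) + sum(b.values())
    return difference / total


def similarity(a, b):
    """Bray-Curtis similarity expressed as a percentage."""
    return 100.0 * (1.0 - bray_curtis(a, b))


for first, second in [("BC-B", "BC-1"), ("AM-3", "BC-B")]:
    value = similarity(PROFILES[first], PROFILES[second])
    print(first, "vs", second, "similarity (%):", round(value, 1))
```

In this reduced data set the two Boulder Clay profiles compared are far more similar to each other than either is to AM-3, which is the direction of the separation described in the text.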

We proceeded to evaluate diversity without implementing a 1% relative height threshold, in

order to gain more data points, and we assessed each digestion separately (table 5). According

to the ScrFI digestion results and similarity comparison, the majority of T-RFs detected in

Boulder Clay were shared between its profiles (BC-1, BC-T and BC-B). Eighty-two and 88% of

BC-1 T-RFs were shared with BC-T and BC-B, respectively (ScrFI digestion). In addition, 93%

and 100% of BC-T T-RFs were shared with BC-1 and BC-B, respectively. Most T-RFs

(88-100%) from all Boulder Clay profiles were detected in the AM-3 profile, which had four to

five times as many T-RFs (76) in total as the other profiles.

The T-RFs shared between BC-1, BC-B, BC-T and AM-3 amounted to about a fifth (20-22%) of

the AM-3 profile; therefore about 80% of AM-3 DNA fragments differed from the Boulder Clay

ice core contents, according to the ScrFI digestion results.

MspI digestion produced more T-RFs for each profile and therefore fewer T-RFs were shared

between profiles. Twenty-one and 34% of BC-1 T-RFs were shared with BC-T and BC-B,

respectively. BC-T shared T-RFs at 65% and 62% with BC-1 and BC-B respectively, yet 73%

of BC-T T-RFs were shared with AM-3. Additionally, BC-1 shared 35% of T-RFs with AM-3

while BC-B shared 42% T-RFs with AM-3, altogether suggesting that Boulder Clay DNA

fragments were also present in AM-3.

Table 5. Cross-profile analysis based on shared T-RFs between AM-3, BC-1, BC-T and BC-B.

ScrFI digestion
Ice core profiles    BC-1 (17)(a)    BC-T (15)    BC-B (19)    AM-3 (76)
BC-1                 -               93           79           20
BC-T                 82(b)           -            79           20
BC-B                 88              100          -            22
AM-3                 88              100          89           -

MspI digestion
Ice core profiles    BC-1 (82)       BC-T (26)    BC-B (57)    AM-3 (94)
BC-1                 -               65           49           31
BC-T                 21              -            28           20
BC-B                 34              62           -            26
AM-3                 35              73           42           -

a The total number of T-RFs counted for a specific profile is displayed in brackets. T-RF count was done prior to implementing the 1% relative height threshold to produce as many data points as possible for the analysis.

b Total number of T-RFs varied between profiles. The result in each cell is the percentage of shared T-RFs between two profiles, with respect to the column profile.
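
The percentages in table 5 are asymmetric because each is expressed relative to the column profile. A minimal Python sketch of that calculation follows. The fragment sizes are hypothetical placeholders, since the raw T-RF lists are not given here; they are chosen only so that the two sets contain 17 and 15 T-RFs with 14 in common, a shared count that is itself an assumption consistent with the ScrFI entries for BC-1 and BC-T. The function name `shared_percent` is also hypothetical.

```python
def shared_percent(column_profile, other_profile):
    """Percentage of T-RFs in `column_profile` also found in `other_profile`."""
    shared = column_profile & other_profile
    return 100.0 * len(shared) / len(column_profile)


# Hypothetical fragment sizes (bp), for illustration only.
bc1 = set(range(100, 117))            # 17 T-RFs
bct = set(range(100, 114)) | {500}    # 15 T-RFs, 14 of them also in bc1

print("Share of BC-1 T-RFs found in BC-T (%):", round(shared_percent(bc1, bct)))
print("Share of BC-T T-RFs found in BC-1 (%):", round(shared_percent(bct, bc1)))
```

The same shared count therefore gives 82% when expressed against the 17 T-RFs of BC-1 and 93% when expressed against the 15 T-RFs of BC-T.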

Additionally, as observed in the ScrFI digestion analysis, the shared T-RFs between BC-1, BC-

B, BC-T and AM-3 still amounted to a relatively small portion of the AM-3 profile (31%, 20%

and 26%), even though the number of T-RFs in Boulder Clay profiles was considerably higher

(82, 26, 57) in comparison to the ScrFI digestion.

In summary, with or without a 1% relative height threshold, AM-3 was the most diverse sample

and differed from Boulder Clay samples, and few DNA fragments were shared between all sites.

Additionally, Boulder Clay samples were similar to one another, and clustered separately from

the AM-3 ice-core sample. Amorphous Glacier and Boulder Clay differ lithologically and in

their geological ages (Holocene vs. Late Pleistocene), and therefore most probably support

different microbial populations.

2.3.2 In silico database composition

The in silico process of the TAP program is based on the RDP 16S rRNA sequences database

(Marsh et al., 2000; Cole et al., 2007). It was of interest to compare the outcome of the in silico

digestion to the composition and size of the original RDP database. If the in silico digestion

produced a seriously skewed representation of the bacterial taxa in the original RDP database,

the T-RFLP profiling would be biased as well and would be only partially

representative of the bacteria in the samples.

Figure 4. Phylogenetic composition of the databases. (A) Bacterial phylogenetic composition of the RDP 9 16S rRNA gene sequence database; (B) bacterial phylogenetic composition based on TAP in silico digestion with ScrFI and MspI; (C) bacterial phylogenetic composition from all digested samples, after T-RFs were assigned bacterial identification.

The RDP 9 (release 61) database contained a total of 180,642 bacterial sequences (figure 4, A).

Of these, 33.5% were affiliated with Proteobacteria, 31.9% with Firmicutes,

12.7% with Bacteroidetes (CFB group) and 8.8% with Actinobacteria (Cole et al., 2007; Cole et al.,

2009). Another 31 phyla were present in the database, 26 of which each accounted for less than 1% of

the sequences in the database, while Acidobacteria, Cyanobacteria, Spirochaetes,

Verrucomicrobia and unclassified bacteria each accounted for slightly more than 1%.

The second database was produced in silico by the TAP program and contained 30,781

sequences (figure 4, B). Major groups included Proteobacteria (34.8%), Firmicutes

(32.7%), Bacteroidetes (14%, CFB group) and Actinobacteria (8.4%), similar to the RDP 9

database distribution. An additional 28 phyla were present in this database.

Cyanobacteria and unclassified bacteria each exceeded 1%, while 26 other phyla had lower proportions.

In size, the TAP database was only 17% of the RDP 9 database, yet

in composition it was similar to the RDP 9 database. Therefore, the

TAP program produced a representative database for the downstream process.

The third database, based on the T-RFLP analysis (figure 4, C), contained fewer bacterial

sequences than the TAP or RDP 9 databases and varied in the composition of the phylogenetic

groups. It contained potential cryospheric bacteria from the analysed samples, and consisted of

650 sequences in total. Four major phylogenetic groups were represented: Proteobacteria

(39.1%), Firmicutes (22.2%), Actinobacteria (13.2%) and Bacteroidetes (CFB group) (12.5%).

Acidobacteria, Cyanobacteria, Planctomycetes, Spirochaetes and unclassified bacteria were also

present with > 1% sequence abundance in the database and eight additional phyla were present

with < 1%.

Thus the T-RFLP database, generated in this study, contained seventeen phyla vs. the 32 and 35

phyla identified in the TAP and RDP 9 databases, respectively, with proportional shifts within

the four major phylogenetic groups - an increase in the Proteobacteria and Actinobacteria

sequences, a decrease in the Firmicutes, and no substantial changes within the Bacteroidetes

(CFB group).
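
The comparison of the three databases can be restated as simple arithmetic. The sketch below, written in Python for illustration, uses only the sequence totals and the percentages of the four major groups quoted in this section; the dictionary layout and the helper name `shift` are assumptions made for the example.

```python
# Percentages of the four major groups in each database, as quoted in the text.
MAJOR_GROUPS = {
    "Proteobacteria": {"RDP": 33.5, "TAP": 34.8, "T-RFLP": 39.1},
    "Firmicutes":     {"RDP": 31.9, "TAP": 32.7, "T-RFLP": 22.2},
    "Bacteroidetes":  {"RDP": 12.7, "TAP": 14.0, "T-RFLP": 12.5},
    "Actinobacteria": {"RDP": 8.8,  "TAP": 8.4,  "T-RFLP": 13.2},
}

RDP_SEQUENCES = 180642
TAP_SEQUENCES = 30781


def shift(phylum, source, target):
    """Change in percentage points for `phylum` from `source` to `target`."""
    groups = MAJOR_GROUPS[phylum]
    return round(groups[target] - groups[source], 1)


# Size of the TAP database relative to RDP 9, in per cent.
size_ratio = 100.0 * TAP_SEQUENCES / RDP_SEQUENCES
print("TAP size as % of RDP 9:", round(size_ratio))

for phylum in MAJOR_GROUPS:
    print(phylum,
          "| RDP to TAP:", shift(phylum, "RDP", "TAP"),
          "| TAP to T-RFLP:", shift(phylum, "TAP", "T-RFLP"))
```

The differences between RDP 9 and TAP stay within about 1.3 percentage points for all four groups, whereas the T-RFLP database gains 4.3 and 4.8 points for Proteobacteria and Actinobacteria and loses 10.5 points for Firmicutes relative to TAP.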

We then continued to further analyse the T-RFLP profiles of each sample in order to gain an

overview of putative phylotypes (Pointing et al., 2009). Because the TAP database contained all the

possible T-RFs arising specifically from the MspI and ScrFI restriction enzymes, we

normalized the individual profile of each sample to the TAP database (table 6). An

average ratio above 1 indicated a higher proportion of a specific phylum relative to the

original TAP database. Across all samples, for instance, there was a higher proportion of

Proteobacteria, Firmicutes, Actinobacteria and Nitrospira (1.12, 1.03, 1.04 and 13.43, average

ratios respectively) in the samples than in the original TAP database. Conversely, Bacteroidetes

(CFB group), Cyanobacteria, and unclassified bacteria had an average ratio < 1 (0.47, 0.44 and

0.78, respectively) indicating a lower portion of these phyla in each profile, compared to the

TAP database.
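
The normalisation described above can be written out as a short calculation. The Python sketch below is an illustration rather than the procedure actually scripted for this study: it divides the percentage of a phylum in each profile by its percentage in the TAP database and averages the resulting ratios, using the values for the four major groups in table 6. The names `TAP_PERCENT`, `PROFILE_PERCENT` and `average_ratio` are hypothetical.

```python
# Percentage of each phylum in the TAP database (table 6).
TAP_PERCENT = {
    "Proteobacteria": 34.8,
    "Firmicutes": 32.7,
    "Bacteroidetes": 14.0,
    "Actinobacteria": 8.4,
}

# Percentage of each phylum in the profiles AM-3, BC-T, BC-B and BC-1 (table 6).
PROFILE_PERCENT = {
    "Proteobacteria": [42.1, 51.2, 33.6, 28.6],
    "Firmicutes": [17.5, 32.6, 25.6, 58.9],
    "Bacteroidetes": [13.9, 5.8, 4.8, 1.8],
    "Actinobacteria": [16.3, 7.0, 8.0, 3.6],
}


def average_ratio(profile_percents, tap_percent):
    """Mean of profile/TAP ratios over the profiles in which the phylum occurs."""
    ratios = [percent / tap_percent for percent in profile_percents]
    return sum(ratios) / len(ratios)


for phylum, tap_percent in TAP_PERCENT.items():
    ratio = average_ratio(PROFILE_PERCENT[phylum], tap_percent)
    print(phylum, round(ratio, 2))
```

For these four groups the calculation returns 1.12, 1.03, 0.47 and 1.04, matching the average ratios in table 6. Phyla with very small TAP percentages, such as Nitrospira, cannot be reproduced exactly from the tabulated values, presumably because the TAP percentages are rounded.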

Generally, the Amorphous Glacier (AM-3) sample retained 16 phyla (of the 32 originally projected by

TAP) and had the largest number of phylogenetic groups compared to BC-T, BC-1 and BC-B

(table 6). Cross-profile analysis also suggested that Amorphous Glacier shared common T-RFs

with Boulder Clay (table 5), and the TAP in silico projection indicated that these common T-RFs were

probably related to Proteobacteria, Firmicutes, Bacteroidetes (CFB group), Actinobacteria,

Cyanobacteria and Nitrospira.

Except for OP10 and the Thermotogae groups, all other phyla associated with AM-3, have

previously been found in Antarctic lakes and microbial mats and other cryospheric

environments - alpine permafrost and Siberian permafrost-affected soils (Franzmann and

Dobson, 1992; Brambilla et al., 2001; Bowman et al., 2003; Sheridan et al., 2003; Miteva et al.,

2004; Bai et al., 2006; Wagner et al., 2009).

Table 6. Distribution of potential phyla groups within each profile based on TAP database of 16S rDNA sequences.

Phylum(a)                TAP database        Individual T-RFLP profiles (%)(c)       Average
                         sequences (%)(b)    Columns: AM-3, BC-T, BC-B, BC-1         ratio(d)
Proteobacteria           34.8                42.1  51.2  33.6  28.6                  1.12
Firmicutes               32.7                17.5  32.6  25.6  58.9                  1.03
Bacteroidetes            14.0                13.9  5.8  4.8  1.8                     0.47
Actinobacteria           8.4                 16.3  7.0  8  3.6                       1.04
Cyanobacteria            3.6                 1.6  1.6                                0.44
Unclassified bacteria    2.0                 1.4  1.8                                0.78
Acidobacteria            0.9                 2.0                                     (2.24)(e)
Chloroflexi              0.6                 0.6                                     (0.99)
Verrucomicrobia          0.6                 0.6                                     (1.06)
Spirochaetes             0.5                 24                                      (45.32)
Planctomycetes           0.5                 1.2                                     (2.48)
Deinococcus-Thermus      0.2                 1.0                                     (4.30)
Nitrospira               0.2                 0.8  3.5  2.4  5.4                      13.43
Fusobacteria             0.2                 0.4                                     (2.40)
Thermotogae              0.1                 0.4                                     (3.39)
Chlorobi                 0.05                0.2                                     (4.07)
OP10                     0.05                0.2                                     (4.07)

a A partial list of phyla present in the TAP database post in-silico MspI and ScrFI digestion. b The ratio of each phylum in the TAP database. c The ratio of each phylum in the T-RFLP profiles: AM, BC-T, BC-B and BC-1. d The ratio of each phylum was divided by the ratio of its corresponding phylum in the TAP database. The resulting ratios were then averaged.

e “(X)” not an average, value based on one profile only.

2.3.3 Amorphous Glacier and Boulder Clay cryospheric bacteria

Phyla level analysis provided a broad overview and we proceeded to review the data at the

genus level. This would further correlate our results to current microbial cryospheric data, and

in particular, to findings from the Terra Nova Bay area.

A genus was denoted ‘cryospheric’ based on published reports of sequence data with 95% or

higher 16S rDNA sequence similarity to a specific genus isolated from cold environments

(Everett et al., 1999). This was deemed necessary in order to narrow down the possible

diazotrophic candidates, and after applying the 95% sequence similarity criteria, 65 genera were

excluded from the analysis (data not shown) while 38 genera passed the criteria.

The mean number of T-RFs following MspI digestion was 10.4 ± 1.7 for an individual profile

and a total of 26 different 16S rDNA T-RFs were observed from AM-3, BC-1, BC-T, BC-B

combined. From these 26 T-RFs, 16 T-RFs were singular peaks (appeared in one profile only)

with relative heights between 1.2% and 39.1% of the total fluorescence (T-RF 553 in AM-3

profile, table 4).

Five T-RFs were observed after MspI digestion in most profiles (T-RFs 30, 31, 35, 38 and 81).

However, the first four of these T-RFs were considered primer dimers and discarded from

further analysis. T-RF 81 had an average relative height of 8.8% and an in silico (TAP

Projected T-RF, TPT) identification of Bacilli spp. (Firmicutes) and the Nitrospira class.

Clones of both taxa were reported from cryospheric environments (Bakermans et al., 2003;

Gilichinsky et al., 2007; Steven et al., 2007), and more specifically from Rod Bay (Terra Nova

Bay area) and Northern Victoria Land (Yakimov et al., 2004; Aislabie et al., 2009). Two T-RFs

following MspI digestion (73 and 145, table 4) were observed in 60% of the profiles with

relative heights below 4%, indicating fragments were not abundant after PCR amplification

process. Only T-RF 145 had an in-silico bacterial match to various phylogenetic groups -

Gammaproteobacteria, Firmicutes and Actinobacteria.

The mean number of T-RFs following ScrFI digestion was 8.6 ± 2 per profile and a total of 13

different 16S rDNA T-RFs were observed across all profiles. Three out of the 13 T-RFs were

singular peaks. Four T-RFs following ScrFI digestion (43, 76, 81 and 145) were observed in

more than 80% of profiles and were associated in-silico with Proteobacteria, Actinobacteria,

Firmicutes, Bacteroidetes (CFB group) and green sulphur bacteria. T-RF 81 was the most

abundant fragment with an average relative height of 30.4 ± 16.2 %. Its TAP Projected T-RF

(TPT) was associated in-silico with nine different genera. Alicyclobacillus has been isolated

from Mount Melbourne, Northern Victoria Land (Nicolaus et al., 1998; Pepi et al., 2005) and

Bacillus is a common find in polar environments, as mentioned previously. Burkholderia and

Pseudomonas were reported previously from Rod Bay and Santa Maria Novella (Yakimov et

al., 2004; Lo Giudice et al., 2007). In addition, clones affiliated with Burkholderiales were

recently found in soil samples in Northern Victoria Land (Niederberger et al., 2008). Of the

remaining five genera associated with T-RF 81, three (Rikenella, Terrimonas and

Sporolactobacillus) have not been detected in cryospheric environments, yet Leptospirillum and

Rhodanobacter were reported previously in the cryospheric literature (Spirina et al., 2003;

Vishnivetskaya et al., 2006).

Three unique T-RFs were observed only in AM-3 profile after ScrFI digestion (18, 116 and

796) with respective relative heights of 1.3%, 1.3% and 3.6%. T-RF 116 did not have an in-

silico bacterial match. T-RF 796 was associated with Sporolactobacillus and Streptococcus

genera from the Bacilli class. Sporolactobacillus was not detected or found in cryospheric

environments to date, while Streptococcus has been discussed above. In silico matches to all T-

RFs within samples, are listed in table 7.

A total of 20 putative diazotrophic genera were identified based on the analysis (table 7), 8 in

Boulder Clay samples and 17 in Amorphous Glacier samples, none relating to Cyanobacteria.

Of these diazotrophic bacteria, some have been previously reported from the Terra Nova Bay

area (table 1): Arcobacter, Burkholderia, Halomonas, Methylobacterium and Pseudomonas

(Proteobacteria), Bacillus and Paenibacillus (Firmicutes), and Arthrobacter (Actinobacteria).
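
The tally of putative diazotrophic genera can be reproduced from the genera flagged "Y" in table 7. The Python sketch below is a bookkeeping illustration only: it counts the flagged genera by phylum and checks that the genera named above as previously reported from the Terra Nova Bay area are all among them. The variable names are hypothetical, and no per-sample presence data are used.

```python
# Genera flagged "Y" (potential diazotroph) in table 7, grouped by phylum.
DIAZOTROPHS = {
    "Proteobacteria": [
        "Aeromonas", "Afipia", "Arcobacter", "Bradyrhizobium", "Burkholderia",
        "Delftia", "Desulfobacterium", "Halomonas", "Mesorhizobium",
        "Methylobacterium", "Pelobacter", "Pseudomonas",
    ],
    "Firmicutes": ["Bacillus", "Clostridium", "Paenibacillus"],
    "Actinobacteria": [
        "Arthrobacter", "Curtobacterium", "Frankia", "Micrococcus", "Streptomyces",
    ],
}

# Diazotrophic genera named in the text as reported from Terra Nova Bay.
TERRA_NOVA_BAY = {
    "Arcobacter", "Burkholderia", "Halomonas", "Methylobacterium",
    "Pseudomonas", "Bacillus", "Paenibacillus", "Arthrobacter",
}

counts = {phylum: len(genera) for phylum, genera in DIAZOTROPHS.items()}
total = sum(counts.values())
all_genera = {genus for genera in DIAZOTROPHS.values() for genus in genera}

print("Putative diazotrophic genera per phylum:", counts)
print("Total:", total)
print("Terra Nova Bay genera all flagged:", TERRA_NOVA_BAY <= all_genera)
```

The flagged genera number 12 Proteobacteria, 3 Firmicutes and 5 Actinobacteria, giving the total of 20 stated in the text.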

Table 7. Genera and diazotrophic cryospheric bacteria in AM-3, BC-T, BC-B and BC-1. Shaded rows indicate genera previously found in Terra Nova Bay area.

Potential diazotroph N/Y(a)  Genera  AM-3 BC-T BC-B BC-1  Cryospheric References(b)

Proteobacteria
N  Acidovorax  +  (Priscu et al., 1999; Liu et al., 2006; Lo Giudice et al., 2007)
Y  Aeromonas  +  (Gilichinsky et al., 2007)
Y  Afipia  +  (Priscu et al., 1999)
N  Alteromonas  + +  (Feller et al., 1992; Gauthier et al., 1995) Now Pseudoalteromonas haloplanktis
Y  Arcobacter  +  (Bowman and McCuaig, 2003; Yakimov et al., 2004)
Y  Bradyrhizobium  +  (Zhou et al., 1997; Sheridan et al., 2003; Xiao et al., 2007)
Y  Burkholderia  + +  (Christner et al., 2000; Yakimov et al., 2004)
N  Coxiella  +  (Bowman and McCuaig, 2003)
Y  Delftia  +  (Gaidos et al., 2004; Skidmore et al., 2005; Xiao et al., 2007)
Y  Desulfobacterium  +  (Ravenschlag et al., 1999; Bowman and McCuaig, 2003)
N  Erythrobacter  +  (Yakimov et al., 2004)
N  Gallionella  + +  (Skidmore et al., 2005)
Y  Halomonas  +  (Bowman et al., 1997; Xiang et al., 2004; Yakimov et al., 2004)
N  Hydrogenophaga  +  (Liu et al., 2006; Amato et al., 2007)
N  Lysobacter  +  (Vishnivetskaya et al., 2006; Steven et al., 2007)
N  Marinobacter  + + + +  (Brinkmeyer et al., 2003; Lysnes et al., 2004; Yakimov et al., 2004)
Y  Mesorhizobium  +  (Bowman and McCuaig, 2003)
Y  Methylobacterium  +  (Yakimov et al., 2004; Miteva and Brenchley, 2005; Zhang et al., 2007b)
Y  Pelobacter  +  (Brambilla et al., 2001; Sjöling and Cowan, 2003)
Y  Pseudomonas  + + +  (Michaud et al., 2004; Yakimov et al., 2004; Lo Giudice et al., 2007)
N  Shewanella  + + +  (Bowman et al., 2003; Yakimov et al., 2004; Lo Giudice et al., 2007)
N  Sphingobium  +  (Xiang et al., 2005)
Total for each sample  20 3 6 3

Firmicutes
Y  Bacillus  + + + +  (Nicolaus et al., 1996; Yakimov et al., 2004; Steven et al., 2007)
Y  Clostridium  +  (Brambilla et al., 2001; Gilichinsky et al., 2005; Vishnivetskaya et al., 2006)
N  Lactobacillus  +  (Segawa et al., 2005; Sundset et al., 2007)
Y  Paenibacillus  +  (Bargagli et al., 2004; Pepi et al., 2005; Mindlin et al., 2008)
N  Streptococcus  +  (Gaidos et al., 2004; Segawa et al., 2005)
N  Syntrophococcus  +  (Sundset et al., 2007)
Total for each sample  6 1 1 1

Bacteroidetes
N  Algoriphagus  (Van Trappen et al., 2004)
N  Bacteroides  + +  (Sheridan et al., 2003)
N  Flavobacterium  +  (Bowman et al., 1997; Gilichinsky et al., 2007)
N  Hymenobacter  +  (Hirsch et al., 1998)
Total for each sample  2 1 1 0

Actinobacteria
Y  Arthrobacter  +  (Bargagli et al., 2004; Michaud et al., 2004; Lo Giudice et al., 2007)
N  Corynebacterium  +  (Gilichinsky et al., 2007; Lo Giudice et al., 2007)
Y  Curtobacterium  +  (Miteva and Brenchley, 2005)
Y  Frankia  + +  (Christner et al., 2000)
N  Janibacter  +  (Michaud et al., 2004; Miteva et al., 2004; Lo Giudice et al., 2007)
Y  Micrococcus  +  (Petrova et al., 2002; Xiang et al., 2005; Steven et al., 2007)
N  Mycobacterium  +  (Miteva et al., 2004)
N  Nocardiopsis  +  (Abyzov et al., 1983)
N  Rhodococcus  +  (Michaud et al., 2004; Yakimov et al., 2004; Lo Giudice et al., 2007)
Y  Streptomyces  + +  (Kochkina et al., 2001; Zhang et al., 2002; Mannisto and Haggblom, 2006)
Total for each sample  8 0 4 0

Verrucomicrobia
N  Prosthecobacter  +  (Bowman and McCuaig, 2003)
Total for each sample  1 0 0 0

Deinococcus-Thermus
N  Thermus  +  (Sheridan et al., 2003)
Total for each sample  1 0 0 0

a Y denotes a genus with species which possess a copy of nifH gene, based on NCBI genomic databases.

b An abbreviated list of cryospheric references. Where possible, only references which presented clones with 95% and higher 16S rDNA sequence similarity to a specific genus were included.

2.4 Concluding remarks

Boulder Clay and Amorphous Glacier are two ice-free areas in Terra Nova Bay, Antarctica,

that differ in their geological origins and physico-chemical properties. Here they were

assessed, for the first time, for their microbial content and biodiversity. In order to gather first

evidence for the bacterial communities in these glacial zones, we carried out terminal-restriction

fragment length polymorphism (T-RFLP) analysis on 16S rDNA using a universal bacterial

amplification protocol on two permafrost cores.

Microbial diversity differed between Boulder Clay and Amorphous Glacier and between the

different layers of Boulder Clay. Bray-Curtis cluster analysis suggested Boulder Clay bacterial

profiles were similar to each other, but clustered separately from the Amorphous Glacier bacterial

profile. With our current data it is not possible to ascertain definitively whether the difference in

geological age, or in other properties that distinguish one site from the other, is responsible for

these results.

Our analysis suggested that the microbial population of the Boulder Clay active layer was less

diverse than the other layers at this site. This may be due to two factors: (a) vertical water

movement permitted micro-organisms to penetrate into deeper permafrost layers rather than remain

on the surface, and (b) hypersaline brine pockets remained liquid at very low temperatures,

providing basic conditions for the survival of the microbial community.

Another finding of this analysis was that the Amorphous Glacier sample included potentially 38

cryospheric genera. Boulder Clay and Amorphous Glacier possibly shared in common the

following genera: Gallionella, Burkholderia (Betaproteobacteria), Alteromonas, Marinobacter,

Pseudomonas, Shewanella (Gammaproteobacteria), Bacillus (Gram positive), Bacteroides (CFB

group), Frankia and Streptomyces (high G+C Gram-positive). Each of these phylotypes includes

psychrotolerant to psychrophilic and microaerophilic species. These phylotypes were either

detected in marine environments, or proven to be tolerant to NaCl salt stress, which is not

surprising considering the connection between salt tolerance and cold resistance in bacteria in terms

of survival mechanisms (Dawson and Gibson, 1987; Deming, 2002; Yakimov et al.,

2004; Lo Giudice et al., 2007; Pumbwe et al., 2007).

Amorphous Glacier sample also included potentially 20 nitrogen fixing genera, based solely on

the known presence of a nifH gene copy in their genomes. Unfortunately, we were unable to

further characterise this community due to the lack of source material after a lengthy optimising

process of the T-RFLP analysis.

This preliminary work suggested the presence of a common group of cold and desiccation

resistant bacteria, some of which might be nitrogen fixers. In general, our molecular analysis

provided relatively few data points, and the bacterial identification is by no means

definitive; confirming it would require further sampling and analysis. Such research

would help confirm and correlate the community composition with the geological and habitat

characteristics of Amorphous Glacier and Boulder Clay.

Figure 1 Shark Bay, Western Australia. Image Google Earth. Salinity values are in parts per thousand (ppt), modified from O’Leary (2008).

Chapter 3 Diazotrophic diversity in columnar stromatolites of Shark Bay, Western Australia.

_______________________________________________

3.1 Introduction

Biologically accessible nitrogen is essential for a thriving microbial community.

Potential nitrogen fixers (diazotrophs) have not been investigated to date in columnar

stromatolites in Shark Bay, Western Australia. Nothing is known of the nitrogen cycle in Shark

Bay’s modern stromatolite community, and it is of interest to compare Shark Bay’s diazotrophic

community characteristics to other comparable hypersaline microbial systems. The aim of this

study was to ascertain and characterise the diazotrophic community in columnar stromatolites

from Hamelin Pool in Shark Bay (figure 1).

Shark Bay is a 14,000 km2 world heritage site, off the

central coast in Western Australia (24°-26° S 113°-114°

E). It is a semi-enclosed embayment comprised of two

long, narrow reaches: Freycinet Reach is 105 km long

and 20–35 km wide, and Hopeless Reach (35 km long,

40 km wide). The reaches are separated by the Peron

Peninsula and have a mean water depth of 10 m (Smith and

Atkinson, 1983). Shallow, carbonate-rich banks exist in

both reaches, effectively creating bays 1-2 m deep,

with relatively low oceanic water influx, which are

regulated mainly by diurnal to semi-diurnal tidal

processes (Burling et al., 2003).

Faure Sill, a major sea grass-covered sand bank, divides

Hopeless Reach into two unequally sized bays -

L'Haridon Bight and Hamelin Pool. The seawater

temperature within the bay varies between 17°C in

August and a maximum of 27°C in February (Bureau of

Meteorology, 2011). The low oceanic circulation within the bay, the low intermittent

rainfall (<200 mm y-1) and the high evaporation rates (>2000 mm y-1) form a NW-SE salinity

gradient from oceanic to hypersaline levels: 35-40 ppt (oceanic), 40-56 ppt (metahaline) and

56-70 ppt (hypersaline), almost twice that of sea water (O'Leary et al., 2008).

Figure 2 Three stromatolite morphotypes from Shark Bay (A) columnar (B) smooth (C) pustular

Within Shark Bay’s shallow, hypersaline pools exist vast banks of benthic microbial deposits

known as stromatolites (Riding, 1999; Jahnert and Collins, 2011). The word stromatolite is

derived from the Latin word stroma, meaning bed covering, and the Greek word strōmat,

meaning spreading out, as well as lithos, meaning stone. Geologists have identified fossilized

records of stromatolites in the rock record for more than 200 years (Walter, 1976). The oldest

fossilized stromatolite was identified in the Dresser Formation of the Pilbara subgroup, Western

Australia, dated at 3.496 Ga, from the Archaean eon (Schopf, 2006). Very few stromatolite

structures have been found in Archaean rocks (which are less preserved in the rock record), while

most structures to date have been found in rocks from the Proterozoic and Phanerozoic eons, 2.5

Ga to the present (Bertrand-Sarfati and Walter, 1976; Krylov et al., 1976; Serebryakov and

Walter, 1976a, b; Schopf, 2006).

The ancient origins of stromatolitic deposits have led to Shark Bay modern stromatolites being

referred to as “living fossils”, and marked them as important to our understanding of the origin

of life on Earth. Shark Bay entered UNESCO’s World Heritage List in 1991. As stated in the

nomination, the foremost justification for the inclusion was that the

Hamelin Pool stromatolites represented an ancient life form in

existence, and Hamelin Pool would be the classic site for study of these

“living fossils” (UNESCO, 1991).

There are five main stromatolite morphologies known to exist in Shark

Bay - pustular, smooth, cerebroid, microbial pavement, and columnar or

colloform microbial deposits (Hoffman, 1976; Jahnert and Collins,

2011). Pustular, smooth mats and columnar morphotypes are the three

best known and studied (Logan, 1961; Logan et al., 1974; Hoffman and

Walter, 1976; Playford et al., 1976; Burns et al., 2004; Allen et al.,

2009; Burns et al., 2009). Pustular mats are irregular, coarsely

fenestrated, non-laminated mats, usually found in the upper intertidal

zone (figure 2, C). Smooth mats in contrast, are finely laminated, with

distinct, well-defined layers, usually found in the lower intertidal zone.

Columnar stromatolites are usually found in the intertidal to subtidal

zones and exist down to a depth of 1-2 m below the sea surface.

Club-shaped, with a coarsely laminated internal texture, they are highly calcified and reach up to

1.5 m in height with spherical tops (Hoffman, 1976; Playford et al., 1976; Jahnert and Collins, 2011).

The various stromatolite morphologies represent an interplay between environmental and

microbial processes. Environmental factors have been suggested to control the external

morphological development of a stromatolite structure. These factors include, for instance,

wave energy, tide amplitudes, water levels and turbulence, sand waves, sediment grain size

and hard or soft substrates (Logan, 1961; Logan et al., 1964; Logan and Cebulski, 1970;

Logan et al., 1970; Playford et al., 1976; Dupraz et al., 2009). The active microbial

component, in turn, produces repetitive cemented grain layers and internal laminae, mainly by

precipitating aragonite micro crystals and repeatedly trapping and binding sediment particles

(Andres and Reid, 2006). Biotic and abiotic factors thus come together to promote the

outgrowth of stromatolites both on a micro- and macro-scale.

Microbiologists took a keen interest in the microbial component of the stromatolites and

investigated them using microscopic, culturing techniques, and more recently, molecular

methodologies. Table 1 provides a summarised view of the microbial agents identified in

stromatolites from Hamelin Pool and Guerrero Negro, Baja California Sur, Mexico, a highly

similar saline environment, discussed later in this chapter.

Combining microscopic and molecular tools has provided researchers with a qualitative and

quantitative view of the microbial communities residing in stromatolite mats. In the past,

microscopic observations in the stromatolite mats identified mainly cyanobacterial species -

Microcoleus chthonoplastes, Entophysalis deusta, Schizothrix fuscescens and Leptolyngbya spp.

(Logan, 1961; Golubic and Walter, 1976) and there is no record of their nitrogen fixation

potential or actual rates, under local conditions. PCR based research has increased the number

of identified microorganisms several fold with members of the Archaea, Bacteria and Eukaryota

detected in stromatolite samples, as can be seen in table 1. Classification of the functional

groups in stromatolites has identified Archaea as involved in fermentation (mainly

methanogenesis), cyanobacteria as oxygenic photosynthetic producers and nitrogen fixers, and

diatoms as oxygenic photosynthesisers (Paerl et al., 2000; Des Marais, 2003; Dupraz and

Visscher, 2005). Aerobic heterotrophs, anoxygenic phototrophs, sulphate reducers and sulphide

oxidizers from the Proteobacteria, Actinobacteria, Firmicutes and Bacteroidetes groups are

apparently involved in several overlapping processes: fermentation, denitrification, nitrogen

fixation and sulphur reduction/oxidation, which are all tightly bound to the light levels and

oxygen/sulphur profiles within the mats (Paerl et al., 2000; Des Marais, 2003; Dupraz and

Visscher, 2005; Goh et al., 2008; Allen et al., 2009; Burns et al., 2009).

Table 1. Stromatolite-related microorganisms (genus level) from published studies. Potential diazotrophs which contain the nifH gene are highlighted in bold.

Hamelin Pool, Shark Bay (a)
Bacteria: Allochromatium, Entophysalis, Microcoleus, Leptolyngbya, Schizothrix
Archaea: -

Guerrero Negro, Baja California Sur, Mexico (b)
Bacteria: Aphanothece, Chlorobium, Chloroflexus, Chlorothrix, Chromatium, Chroococcidiopsis, Cyanothece, Desulfobacter, Desulfobacterium, Desulfococcus, Desulfovibrio, Euhalothece, Gloeocapsa, Halospirulina, Halothece, Leptolyngbya, Lyngbya, Microcoleus, Oscillatoria, Phormidium, Pseudanabaena, Spirulina, Synechocystis
Archaea: -

Hamelin Pool, Shark Bay (c)
Bacteria: Acidobacterium*, Alcanivorax, Alteromonas, Arthrospira*, Bacillus, Cellulomonas, Chroococcidiopsis, Chroococcus, Cyanothece*, Cytophaga, Dermocarpella, Desulfatibacillum, Euhalothece, Gloeocapsa*, Gloeothece*, Halobacillus, Halomicronema, Halomonadaceae, Halomonas, Halothece, Idiomarina, Leptolyngbya, Flexistipes*, Lyngbya, Marinobacter, Marinomonas, Microcoleus, Myxosarcina*, Nitrococcus*, Oscillatoria, Phormidium, Planococcus, Planctomyces, Pleurocapsa*, Pontibacillus, Porphyrobacter, Prochloron, Pseudoalteromonas, Pseudomonas, Rhodomicrobium, Rhodopseudomonas, Rhodospirillum*, Rhodovibrio, Rhodobacter, Roseobacter, Salinimonas, Spirulina*, Stanieria*, Symploca, Synechococcus*, Synechocystis, Vibrio, Virgibacillus, Xenococcus
Archaea: Halococcus, Haloferax, Halobacterium*, Methanosarcina*, Halogeometricum, Nitrosopumilus, Cenarchaeum, Fusobacterium*

* 16S rDNA sequence similarity was 90% - 95% to a designated genus. Their identification should therefore be cautiously accepted (Everett et al., 1999).
(a) Data collected with microscopic methods only (Logan, 1961; Bauld et al., 1986).
(b) Data collected with microscopic and molecular methods (Javor and Castenholz, 1981; Risatti et al., 1994; López-Cortés et al., 2001; Ley et al., 2006).
(c) Data collected by microscopic and molecular methods (Bauld et al., 1986; Burns et al., 2004; Papineau et al., 2005; Goh et al., 2006; Allen et al., 2008; Allen et al., 2009).

Diazotrophs have not been investigated to date in columnar stromatolites. In general, identifying

nitrogen fixers has advanced considerably with the introduction of molecular techniques. These

techniques are based on DNA and RNA extractions, as well as on the polymerase chain reaction

(PCR, Mullis and Erlich, 1988) and are considered better in exploring natural microbial

diversity (Amann et al., 1995; Head et al., 1998). The possible biases generated by DNA

extraction methods and PCR kinetics were briefly discussed in the introduction chapter, section

1.8.2 and have been addressed in this study (see the methods section).

The aim of this study was to use molecular methodology to ascertain and characterise

diazotrophs in columnar stromatolites. Important nitrogen fixers in stromatolites from other

environments were usually cyanobacterial representatives from the Nostocales, Chroococcales

and Oscillatoriales, as well as anaerobic, sulphate reducing δ-Proteobacteria representatives

(Steppe et al., 2001; Fourçans et al., 2004; Jenkins et al., 2004; Omoregie et al., 2004a;

Omoregie et al., 2004b; Jungblut et al., 2005; Yannarell et al., 2006; Desnues et al., 2007;

Leuko et al., 2007).

Nothing is known of the nitrogen cycle in Shark Bay’s stromatolitic community, and it is of

interest to compare Shark Bay’s diazotrophic community characteristics to other comparable

hypersaline microbial systems in order to broaden our understanding of the nitrogen fixation

processes occurring within extant stromatolites and by extrapolation, processes which might

have occurred in extinct stromatolites.

Figure 3 Left: Map of Shark Bay region. Inset image shows Shark Bay’s location on the west coast of Australia. Image copyright GeoScience Australia. Right: Low tide at Hamelin Pool, Telegraph station, showing columnar stromatolites. Image copyright Torben Rübke, 2006.

3.2 Materials and methods

3.2.1 Sample collection and sample sites

Sampling was conducted by former lab students in the intertidal region of Hamelin Pool at

Telegraph Station at low tide in Shark Bay, Western Australia in 1996 and May 2004

(26°24’03” S, 114°09’36.1” E, figure 3).

Intertidal columnar stromatolite pieces were collected by cracking the stromatolite with a

geological hammer about 2 cm from the top of the stromatolite or collecting a whole small

stromatolite. All samples were placed in sterile specimen bags and kept at 4 °C during transport

back to the laboratory, where they were stored in the dark at 4 °C until further processing. All

samples were collected and handled with sterile instruments throughout the course of the study.

No other environmental data was collected.

3.2.2 DNA isolation and PCR amplification of nifH genes

Within two weeks of 4 °C storage, samples were processed and DNA extracted in order to avoid

potential DNA degradation. A rock hammer washed with 70% ethanol and flamed was used to

break small chunks out of stromatolite specimens from 2004 and 1996. Approximately 1 cm³

fragments of the stromatolite were ground to a fine paste using a sterile mortar and pestle.

Genomic DNA was extracted by the method of Neilan (1995). Approximately 100 mg of fine

paste was transferred to a 1.5 ml eppendorf tube and suspended in 567 μL TE buffer (10 mM

Tris-HCl, 1 mM EDTA, pH 8.0), to which 30 μL 10% SDS and 3 μL Proteinase K (10 mg ml-1)

were added. The samples were incubated at 37 °C for 3 h with intermittent shaking.
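The working concentrations implied by the volumes above can be checked with a short calculation. This is a minimal sketch, assuming that the roughly 100 mg of stromatolite paste contributes no appreciable liquid volume; the function and variable names are illustrative only.

```python
# Final concentrations in the lysis mix described above.
# Assumption: the ~100 mg of stromatolite paste adds negligible liquid volume.

def final_concentration(stock, added_ul, total_ul):
    """Dilute a stock concentration by (added volume / total volume)."""
    return stock * added_ul / total_ul

te_ul, sds_ul, prot_k_ul = 567.0, 30.0, 3.0
total_ul = te_ul + sds_ul + prot_k_ul  # combined liquid volume in uL

# 10% (w/v) SDS stock and 10 mg/ml (10,000 ug/ml) Proteinase K stock.
sds_percent = final_concentration(10.0, sds_ul, total_ul)
prot_k_ug_per_ml = final_concentration(10_000.0, prot_k_ul, total_ul)

print(f"total volume: {total_ul:.0f} uL")
print(f"SDS: {sds_percent:.2f} % (w/v)")
print(f"Proteinase K: {prot_k_ug_per_ml:.0f} ug/ml")
```

Under that assumption the mix comes to 600 μL, with SDS at 0.5% (w/v) and Proteinase K at 50 μg ml-1.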

An additional step of 5 cycles of freezing at -40 °C and thawing was added to ensure complete

cell lysis. Next, 100 μL of 5 M NaCl was added to the lysate and mixed thoroughly before the

addition of 80 μL CTAB solution (10% w/v cetyltrimethylammonium bromide in 0.7 M NaCl),

and the mixture was incubated at room temperature overnight (Wilson, 2001).

An equal volume of phenol: chloroform: isoamyl alcohol (25:24:1) was added to the supernatant

and mixed thoroughly before centrifugation at 14,000 g for 5 min at RT. The top layer of the

supernatant was transferred to a fresh tube and the DNA precipitated in 50% isopropanol and

0.4 M potassium acetate. The samples were incubated at RT for 30 min or at 4 °C overnight, and

then centrifuged at 14,000 g for 5 min to pellet the DNA. The supernatant was discarded and the

pellet washed with 70% ethanol, air dried and resuspended in 30 μL of sterile MilliQ water.

Two replicates of 1996 or 2004 extracted genomic DNA (5 ng μL-1) were used in a nested PCR

to amplify the nitrogenase gene nifH (Omoregie et al., 2004c). The first PCR in the nested

approach was performed using 0.3 units of Taq polymerase (Sigma-Aldrich, St. Louis, MO) in a

20 μL reaction mix containing 2.5 mM MgCl2, 1x Taq-Polymerase reaction buffer, 0.2 mM

dNTPs (Fisher Biotec, WA, Australia), and 2 pM of each of the primers NifH3 (5' ATR TTR

TTN GCN GCR TA 3') and NifH4 (5' TTY TAY GGN AAR GGN GG 3') (Zani et al., 2000), 1

μL of genomic DNA (5 ng μL-1) and sterile MilliQ water to a total volume of 20 μL. Thermal

cycling was performed in a GeneAmp PCR System 2400 Thermocycler (Perkin Elmer,

Norwalk, CT). Thermal cycling conditions for the amplification of bacterial nifH genes were as

follows: an initial denaturation step at 94°C for 4 min was followed by 30 cycles of DNA

denaturation at 94°C for 1 min, primer annealing at 55°C for 1 min and strand extension at 72°C

for 1 min, with a final extension step at 72°C for 7 min. Two microliters of the first PCR

reaction were used for a second amplification round using primers NifH1 (5'-TGY GAY CCN

AAR GCN GA-3') and NifH2 (5'-ADN GCC ATC ATY TCN CC-3') (Zehr and McReynolds,

1989). The reaction mix and amplification protocol were as described above except for

increasing the annealing temperature to 57°C. All PCR experiments included a negative control

reaction without DNA template, and a positive control using DNA from the reference strain

Nostoc PCC 7120.
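Because all four nifH primers contain IUPAC ambiguity codes, each primer name stands for a pool of oligonucleotides. The sketch below counts the size of that pool for the primer sequences quoted above; the helper name `degeneracy` is illustrative and not part of any published protocol.

```python
# Number of distinct bases represented by each IUPAC nucleotide code.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}

def degeneracy(primer):
    """Return how many unique oligonucleotides a degenerate primer represents."""
    total = 1
    for base in primer.replace(" ", "").upper():
        total *= IUPAC_DEGENERACY[base]
    return total

# Primer sequences as given in the text (5' to 3').
PRIMERS = {
    "NifH3": "ATR TTR TTN GCN GCR TA",
    "NifH4": "TTY TAY GGN AAR GGN GG",
    "NifH1": "TGY GAY CCN AAR GCN GA",
    "NifH2": "ADN GCC ATC ATY TCN CC",
}

for name, seq in PRIMERS.items():
    length = len(seq.replace(" ", ""))
    print(f"{name}: {length} nt, degeneracy {degeneracy(seq)}")
```

NifH1, NifH3 and NifH4 each expand to 128 sequences and NifH2 to 96, so each individual oligonucleotide is present at only a small fraction of the nominal primer concentration.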

PCR products were visualised on 1% and 2% agarose gels (molecular biology grade, Progen

Pharmaceuticals, QLD, Australia) with 1x TAE-buffer and stained by ethidium bromide (1 μg

ml-1) for 10-15 min. Nucleic acids were visualised via UV transillumination (Gel Doc 2000,

BioRad, Hercules, CA) using QuantityOne 4.1R software (BioRad, Hercules, CA) and raw

images were exported in jpeg format for later visualisation.

3.2.3 Clone libraries and Restriction Fragment Length Polymorphism (RFLP)

Fresh PCR products (containing an A-overhang at the 3’ end) of the nifH gene amplification

were ligated into the pGEM-T Easy vector (Promega, Madison, WI) according to the

manufacturer’s instructions. From each clone library, at least 50 clones containing inserts were

selected and amplified using the vector specific primers MpF and MpR. PCR products of the

correct size were precipitated by transferring the remaining reaction mixture to a 1.5 ml

eppendorf tube, adding a double volume of ice-cold 100% ethanol, and then incubated on ice for

15 min. The samples were centrifuged at 14,000 g for 15 min, the supernatant discarded and the

pellet washed once with 200 μL freshly made 70% ethanol. The resultant pellets were dried

using a SpeedVac vacuum centrifuge (Thermo Fisher Scientific Inc., Waltham, MA) or left with

open caps under aluminium foil at room temperature, after which they were resuspended in 10-

15 μL sterile MilliQ water. To verify that the PCR product had not been lost during the ethanol

precipitation, the cleaned PCR products were visualised on 1% agarose gels (molecular biology

grade, Progen Pharmaceuticals, QLD, Australia) with 1x TAE-buffer, stained by ethidium

bromide (1 μg ml-1) for 10-15 min and visualised via UV transillumination (Gel Doc 2000,

BioRad) using QuantityOne 4.1R software (BioRad).

Each clone was subjected to duplicate Restriction Fragment Length Polymorphism (RFLP)

analysis using restriction enzymes ScrFI and MspI (New England Biolabs, Ipswich, MA)

separately. Each digest reaction contained 3 μL PCR product, 1 μL of the corresponding

enzyme buffer, 2 units of restriction enzyme and sterile MilliQ water to a total volume of 10 μL.

The digests were incubated at 37°C overnight. Clones’ RFLP patterns were analysed manually

after electrophoresis on 2% agarose gels as previously described. At least one clone from each

unique RFLP pattern was sequenced.
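In this study the RFLP patterns were compared manually. Purely as an illustration of the grouping logic, the sketch below bins clones by their combined MspI and ScrFI band sets and picks one representative per pattern. The clone names and band sizes are hypothetical, and exact matching of band sizes is a simplification of what can be read from a gel.

```python
from collections import defaultdict

# Hypothetical band sizes (bp) per clone; illustrative values, not thesis data.
clone_patterns = {
    "clone01": {"MspI": (350,), "ScrFI": (300, 200)},
    "clone02": {"MspI": (200,), "ScrFI": (300, 200)},
    "clone03": {"MspI": (350,), "ScrFI": (300, 200)},
    "clone04": {"MspI": (200,), "ScrFI": (600,)},
}

def group_by_pattern(patterns):
    """Group clones that share identical band sets for both enzymes."""
    groups = defaultdict(list)
    for clone, digests in sorted(patterns.items()):
        key = (tuple(sorted(digests["MspI"])), tuple(sorted(digests["ScrFI"])))
        groups[key].append(clone)
    return dict(groups)

groups = group_by_pattern(clone_patterns)

# At least one clone from each unique pattern goes forward to sequencing.
representatives = sorted(members[0] for members in groups.values())
print(len(groups), "unique patterns; sequence:", representatives)
```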

3.2.4 DNA sequencing

Sequencing of selected clones was carried out using the PRISM Big Dye cycle sequencing

system with MPF or MPR primers and 3-49 ng of the precipitated product.

The sequencing reaction products were transferred to a 1.5 ml tube and precipitated by the

addition of 16 μL sterile MilliQ water and 64 μL 95% ethanol and mixed thoroughly. After

incubation at RT for 15 minutes, the samples were centrifuged at 16,000 g for 20 min and all the

supernatant was removed. The pellet was washed and dried as above. The sample was submitted

for automated sequencing at The Ramaciotti Centre for Gene Function Analysis, UNSW, using

the Applied Biosystems 3730 DNA Analyser (Foster City, CA) and analysed with Applied

Biosystems Sequencing Analysis 5.1.1 software provided by Applied Biosystems.

3.2.5 Phylogenetic sequence analysis

Sequence chromatograms were manually checked for signal quality with “ABI and SCF Trace

Viewer” embedded in “BioEdit Sequence Alignment Editor” software version 7.0.5.3 (Hall,

1999). Sequences with high background signal noise were discarded from further analysis.

The 2004 and 1996 stromatolite clone library sequences were initially batch edited with

“BioEdit” and a text editor “Crimson Editor” version 3.70 (freeware, Copyright © Ingyu Kang).

Any remaining nucleotides from the cloning vector were removed from the 5’ and 3’ ends of the

sequences, which were then temporarily realigned in a default fashion in “BioEdit”.

Sequence homologies were obtained using a nucleotide query (Altschul et al., 1990) with

“BLASTN” version 2.2.18 and translated nucleotide query “BLASTX” version 2.2.24 (Altschul

et al., 1997) from the National Center for Biotechnology Information (NCBI) website. BLAST

results were screened against 42 sequences known to arise mainly from PCR reagents

contamination - AY225105–AY225107, AY333089–AY333101, AB198366 - AB198391 (Zehr

et al., 2003b; Goto et al., 2005).
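The three accession ranges listed above expand to exactly 42 entries, which matches the number of contaminant sequences stated. A minimal sketch of such a screen follows; the function names are illustrative, and treating each range as a run of consecutive accession numbers is an assumption.

```python
import re

def expand_accession_range(first, last):
    """Expand e.g. AY225105..AY225107 into every accession in between."""
    m1 = re.fullmatch(r"([A-Z]+)(\d+)", first)
    m2 = re.fullmatch(r"([A-Z]+)(\d+)", last)
    if not m1 or not m2 or m1.group(1) != m2.group(1):
        raise ValueError("accessions must share the same letter prefix")
    width = len(m1.group(2))
    start, stop = int(m1.group(2)), int(m2.group(2))
    return [f"{m1.group(1)}{n:0{width}d}" for n in range(start, stop + 1)]

# Ranges quoted in the text (Zehr et al., 2003b; Goto et al., 2005).
CONTAMINANT_RANGES = [
    ("AY225105", "AY225107"),
    ("AY333089", "AY333101"),
    ("AB198366", "AB198391"),
]

contaminants = {
    accession
    for first, last in CONTAMINANT_RANGES
    for accession in expand_accession_range(first, last)
}

def is_contaminant_hit(hit_accession):
    """True when a BLAST hit is one of the reagent-contaminant sequences."""
    # Drop a version suffix such as ".1" before comparing.
    return hit_accession.split(".")[0] in contaminants

print(len(contaminants), "contaminant accessions")
```

A clone whose best hit returns `True` from `is_contaminant_hit` would be flagged for exclusion before any further analysis.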

The reference NifH amino acid sequences for the alignment and phylogenetic analyses were

imported from The Universal Protein Resource – UniProt (The UniProt, 2008; Apweiler et al.,

2010) see Appendix A, table A-5.

Multiple alignments of the nifH gene nucleotide and amino acid sequences were carried out

separately, initially using three different software packages: ClustalX 2.0.11 (Larkin et al.,

2007), Muscle 3.8.31 (MUltiple Sequence Comparison by Log-Expectation; Edgar, 2004) as

implemented on the EMBL-EBI website, and MAFFT 6 (Multiple sequence Alignment based on

Fast Fourier Transform; Katoh et al., 2002; Katoh and Toh, 2008). After comparing the output

of all three packages, Muscle was chosen as the best alignment tool for nifH gene sequences, based on

a visual check of the resulting alignments as well as the bootstrapping values of the NJ

phylogenetic tree branches. As reflected in a benchmark testing of these multiple alignment

tools (Nuin et al., 2006), MAFFT and Muscle produced similar quality outputs and both were

better than ClustalX software results.

The appropriate amino acid substitution model for phylogenetic inference of nifH genes was

obtained using “ProtTest” version 2.4 (Abascal et al., 2005). “ProtTest” proposed the best

protein evolutionary model based on the smallest Akaike Information Criterion (AIC) score

(Akaike, 2002). Phylogenetic trees were then created by the maximum likelihood approach

(Felsenstein, 1981), using the LG substitution matrix (Le and Gascuel, 2008) and the approximate

Likelihood-Ratio Test for branch support (aLRT SH-like; Posada et al., 2006) with “PhyML”

version 3.0 (Guindon and Gascuel, 2003; Guindon et al., 2010) as implemented on the

http://www.atgc-montpellier.fr/phyml web site.
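Model selection by AIC, as described above, reduces to computing AIC = 2k - 2 ln L for each candidate model and keeping the smallest score. The sketch below illustrates this with hypothetical log-likelihoods and parameter counts; it does not reproduce ProtTest's output for the nifH alignment.

```python
def aic(log_likelihood, n_free_parameters):
    """Akaike Information Criterion: AIC = 2k - 2 ln L."""
    return 2 * n_free_parameters - 2 * log_likelihood

# Hypothetical fits: (model name, ln L, number of free parameters).
candidate_fits = [
    ("JTT", -5120.4, 75),
    ("WAG", -5098.7, 75),
    ("LG", -5060.2, 75),
    ("LG+G", -5011.9, 76),
]

scores = {name: aic(lnl, k) for name, lnl, k in candidate_fits}
best_model = min(scores, key=scores.get)

# List the candidates from best (lowest AIC) to worst.
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name:5s} AIC = {score:.1f}")
print("lowest AIC:", best_model)
```

The extra parameter of a more complex model is only rewarded when it improves the log-likelihood by more than one unit, which is the trade-off the criterion is designed to capture.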

“TreeDyn” version 198.3 (Chevenet et al., 2006), “MEGA 4” version 4.0.2 (Tamura et al.,

2007), “TreeGraph 2” version 2.0.45 (Stöver and Müller, 2010) and “Adobe Photoshop

Elements” version 8 (Copyright © 1990-2009 Adobe Systems Incorporated), were used for

phylogenetic tree visualisation and final modifications.

3.2.6 Diversity, richness and coverage estimators

The NifH inferred sequences were aligned using “Muscle” version 3.8.31 and the molecular

distances were calculated with the Probability Matrix from Blocks (PMB; Veerassamy et al.,

2003) model implemented in “PHYLIP Protdist” version 3.67 (Felsenstein, 2007) available at

the Mobyle web portal - http://mobyle.pasteur.fr/cgi-bin/portal.py#welcome (Néron et al., 2005;

Néron et al., 2009).

Distance matrices generated by the above procedure were used in “Mothur” version 1.15.0 to

group sequences into Operational Taxonomic Units (OTUs) of 88 % - 100 % phylotype cutoff

thresholds, using the furthest-neighbour algorithm (Schloss et al., 2009). OTUs were then used

in calculating collector’s curves and various estimators relating to clone library sampling

coverage, diversity, richness, as well as shared estimators between clone libraries and structural

similarities between 1996 and 2004 communities. Coverage of the clone libraries was calculated

by “DOTUR” (Schloss and Handelsman, 2005) and the method of Good (1953). Richness was

calculated by the method of Chao (Chao and Yang, 1993). Diversity index (H) was calculated

by the method of Shannon–Wiener (Krebs, 1989). The programs “∫-LIBSHUFF”,

“TreeClimber” and “UniFrac” were used for cross communities comparisons (Singleton et al.,

2001; Schloss et al., 2004; Lozupone and Knight, 2005; Schloss and Handelsman, 2006a;

Schloss and Handelsman, 2006b).
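The coverage, richness and diversity estimators named above can be sketched from a list of OTU abundances. The OTU counts below are hypothetical, and Chao1 is shown in a commonly used bias-corrected form, which is an assumption and may differ in detail from the estimator of Chao and Yang (1993) or from the output of “Mothur” and “DOTUR”.

```python
import math

def goods_coverage(otu_counts):
    """Good's coverage: C = 1 - n1 / N, where n1 is the number of singleton OTUs."""
    total = sum(otu_counts)
    singletons = sum(1 for c in otu_counts if c == 1)
    return 1.0 - singletons / total

def chao1(otu_counts):
    """Bias-corrected Chao1 richness: S_obs + n1 (n1 - 1) / (2 (n2 + 1))."""
    observed = sum(1 for c in otu_counts if c > 0)
    n1 = sum(1 for c in otu_counts if c == 1)
    n2 = sum(1 for c in otu_counts if c == 2)
    return observed + n1 * (n1 - 1) / (2 * (n2 + 1))

def shannon_index(otu_counts):
    """Shannon-Wiener diversity: H = -sum(p_i ln p_i) over observed OTUs."""
    total = sum(otu_counts)
    return -sum((c / total) * math.log(c / total) for c in otu_counts if c > 0)

# Hypothetical clone library: 38 clones distributed over 6 OTUs.
otu_counts = [20, 10, 4, 2, 1, 1]

print(f"Good's coverage: {goods_coverage(otu_counts):.3f}")
print(f"Chao1 richness:  {chao1(otu_counts):.1f}")
print(f"Shannon H:       {shannon_index(otu_counts):.3f}")
```

Each estimator depends on the OTU cutoff used upstream, so values obtained at different phylotype thresholds are not directly comparable.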

3.2.7 Accession numbers

Sequences of the nifH clones are available under GenBank accession numbers JF826460-

JF826496.

3.3 Results and discussion

3.3.1 General methodology consideration

In order to minimize potential biases, this study used a DNA extraction method known to

produce high-quality DNA extracts without skewing the original composition of the

microbial community in the sample (Leuko et al., 2007; Goh et al., 2008). PCR amplification was

limited to 30 cycles in order to avoid introducing amplification biases, and the nifH primer sets

used in this study have been shown not to cause major bias in the PCR amplification process

(Diallo et al., 2008). To ensure no false-positive results were created due to contaminated PCR

reagents, all reactions and gel visualizations included negative controls, and BLAST results

were screened against sequences known to arise from such contamination (Zehr et al., 2003b;

Goto et al., 2005).

BLAST and BLASTX analyses are useful tools for taxonomical identification in terms of

evolutionary interpretations, under certain known limitations (States and Botstein, 1991).

Taxonomical affiliation based on 16S rDNA sequences, generally assumes a 1 to 1.5 %

sequence difference is appropriate for defining strains within the same species, and a 2 to 5 %

difference for species within the same genus (Clarridge, 2004). Translated NifH sequences are

much more interesting and informative as they relate to the protein itself, a 3-D entity composed

of primary, secondary and higher levels of structural compilations and subject to biochemical

influences (Stormo, 2002). The selective pressure to adapt functionality to a micro-environment

and yet retain a specific functionality, can produce nucleotide sequences which are distantly

related, but the amino acid code would contain homologous coding regions which reflect

structural and functional similarities (Sander and Schneider, 1991). A 2 to 8 % difference in the

NifH amino acid sequences represents variations on the amino acid sequence, which relates to

structural changes. Thus, a 2 to 8 % difference was considered appropriate for a positive

identification of a Fe protein, based on past studies analysing structural homologies and

sequence similarities (Sander and Schneider, 1991; Hobohm et al., 1992).
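As a minimal illustration of how a percentage difference between two aligned NifH fragments can be computed and compared with the 8% upper bound discussed above, consider the sketch below. The two fragments are hypothetical, and skipping gapped columns is an assumption rather than the scoring used by BLASTX.

```python
def percent_difference(aligned_a, aligned_b, gap="-"):
    """Percentage of mismatched positions between two aligned sequences.

    Columns where either sequence has a gap are skipped (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared

# Hypothetical 25-residue fragments that differ at two positions.
query = "GKGGIGKSTTSQNTLAALAEMGKKV"
reference = "GKGGIGKSTTTQNTLAALAEAGKKV"

difference = percent_difference(query, reference)
print(f"difference: {difference:.1f} %")
print("within the 8 % upper bound:", difference <= 8.0)
```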

In this study, BLAST and BLASTX results passed significant statistical thresholds: BLAST

expected (E) value range was e-88 – e-168 and BLASTX E-values ranged

e-27 – e-60 (Ladunga, 2002a, b; McGinnis and Madden, 2004). In addition, NifH translated

sequences were longer than 100 residues and the remaining hits for each sequence with lower E-

values were also identified as Fe protein (NifH). We therefore assumed our translated sequences

were homologs of the nitrogenase Fe protein component (Rychlewski et al., 2000) and

attributed sequence differences to biological adaptations to ecological constraints, or inter-

species differentiation.

3.3.2 2004 clone library BLAST & BLASTX analysis

NifH genes were present and amplified from the total DNA extracted from the 2004 stromatolite

samples (figure 4). In total, 38 clones containing the correctly sized insert (350bp) were

obtained and analysed (figure 5). RFLP analysis was performed on 30 random positive clones

from the 2004 library, which grouped into five patterns (see figure 6 and table 2).

Figure 4 Products obtained during the amplification of nifH from stromatolite DNA extractions using nested primers. Left pane: amplification products after the first step of nifH PCR amplification; right pane: amplification products after the second step. Lane 1: 2004 stromatolite; 2: 1996 stromatolite; 3, 4: positive control Nostoc PCC 7120; 5: negative control sterile MilliQ H2O; M: 0.5 μg μL-1 GeneRuler™ DNA Ladder Mix (Fermentas, Ontario).

Figure 5 Example of a 1% agarose gel showing the PCR amplification of 13 clones with the nifH insert before (left) and after (right) PCR product clean up procedure. Expected band size of a pGEM vector without nifH gene insert – 236 bp. Expected band size with nifH insert present – 586 bp. Lanes 1 - 13: 2004 stromatolite clones (white colonies); Lane B: blue colony product (negative control); M: 0.5 μg μL-1 GeneRuler™ DNA ladder Mix (Fermentas, Ontario).

Figure 6 2% agarose gel showing RFLP patterns using ScrFI (bottom) and MspI (top) restriction enzymes on 12 clones from 2004 stromatolite library. M: 0.5μg/μL GeneRuler™ DNA ladder Mix (Fermentas).

Table 2 Modified representations of the 2004 stromatolite clones RFLP digestion patterns.

Gel lanes 1 2 3 5 6 7 8 9 10 12 13

MspI 350* 350 350 200 200 200 200 200 200 200 200 200 200

ScrFI

600 600 550 500 350 350 300 300 300 300 300 300 300 300 200* 200 200 200 200 200 200 200 200 200 200

* Band size in basepairs.

A minor portion of the 2004 NifH clone library (table 3) was identified in the BLAST analysis

as an uncultured cyanobacterial nifH clone from a benthic hypersaline microbial mat in San

Salvador Island, Bahamas, with 98% and 100% sequence similarity (accession ID DQ140596;

Yannarell et al., 2006). Half of the clone library was identified as uncultured nifH clone

sequences obtained from the microbial mats of a natural marsh in Guerrero Negro (GN),

Mexico (Moisander et al., 2006), with sequence similarity of 87%. The vast majority of the

clone library had less than 90% sequence similarity to uncultured cyanobacterial clones from

saline to hypersaline environments in the BLAST analysis, indicating potential novel nifH

genes.

Translated NifH sequences from the 2004 clone library were identified by the BLASTX

analysis as cyanobacterial NifH amino acid sequences affiliated with Chroococcales and

Oscillatoriales, at 90%-94% sequence similarity (Table 3). The vast majority of the clones were

identified as NifH sequences of Cyanothece spp. (strains CCY0110, ATCC 51142 and PCC

7425), with an average sequence similarity of 93%, and the remainder of the library were

affiliated with uncultivated Cyanobacterium UCYN-A and Oscillatoria PCC 6506 with 91% and

94% sequence similarity, respectively.

This finding suggests nitrogen fixation might occur during night time in columnar stromatolites,

as transcriptional nitrogenase studies of Cyanothece sp., strain ATCC 51142, revealed this strain

fixed nitrogen diurnally, usually under dark, aerobic conditions, and that the nitrogenase was

degraded during light periods (Colon-Lopez et al., 1997). Filamentous Oscillatoria spp.

also exhibited a diurnal rhythm in N2 fixation, with aerobic nitrogen fixation detected only

during dark periods (Stal and Krumbein, 1987). Therefore, the group of unicellular, filamentous

and non-heterocystous Cyanobacteria that includes Oscillatoria sp. PCC 6506 and Cyanothece

spp. would probably fix nitrogen aerobically during dark periods in stromatolites, and thus

avoid potential oxygen damage to the nitrogenase complex (Reddy et al., 1993; Schneegurt et

al., 1994; Berman-Frank et al., 2003). In addition, unicellular Cyanothece spp. and

Cyanobacterium UCYN-A were shown to have similar nifH sequences (Zehr et al., 2008). The

oceanic unicellular Cyanobacterium UCYN-A finding is of interest as it can fix N2 during

daylight hours as well, without producing oxygen and causing redox damage to nitrogenase

(Bothe et al., 2010).

Its genome contains only photosystem I with no trace of photosystem II, associated pigments or

carbon fixation genes (Zehr et al., 2008). In addition, it was recently found that UCYN-A lacks

many metabolic pathways and relies on other bacteria for the provision of essential amino acids

and other important compounds (Tripp et al., 2010) and therefore the presence of UCYN-A

within the stromatolite community is reasonable, as it is an anoxygenic photosynthetic

heterotrophic microorganism living within a complex community and in contact with seawater,

as is the case of the intertidal columnar stromatolites. However, the identification of this strain

was not absolute in this study, as sequence similarity was only 91% when compared to

Cyanobacterium UCYN-A NifH amino acid sequence. It remains to be seen if the newly added

UCYN-A genome will be identified not only in oceanic environments, but also in other

microbial mats studies.

Table 3 BLAST and BLASTX analysis of 2004 stromatolite nifH clone library. Total of 38 clones were analysed.

BLAST Analysis

Accession ID | No. of clones (%) | Average sequence similarity (%) | Source (a)
DQ338040 DQ338103 | 50.0 | 87 | Natural marsh microbial mats, Mexico: Guerrero Negro, Baja California (Moisander et al., 2006)
EU594141 EU594212 | 28.95 | 86 | Marine sponges bacterial symbionts (Mohamed et al., 2008a)
EF174812 EF174826 | 10.53 | 90 | Marine water, Heron Reef Lagoon, Great Barrier Reef (Hewson et al., 2007)
DQ140596 | 5.26 | 99 | Benthic, hypersaline microbial mat, San Salvador Island, Bahamas (Yannarell et al., 2006)
AY450628 | 2.63 | 87 | Lyngbya mats of an intertidal flat, Mexico: Guerrero Negro, Baja California (Omoregie et al., 2004a)
U73133 | 2.63 | 89 | Myxosarcina PCC 7312 (Zehr et al., 1997)

BLASTX Analysis

Accession ID | No. of clones (%) | Average sequence similarity (%) | Phylum | Closest NifH-deduced amino acid sequence
ZP_01727765 | 52.63 | 93 | Cyanobacteria | Cyanothece sp. CCY0110
YP_001801976 | 26.32 | 93 | Cyanobacteria | Cyanothece strain ATCC 51142
YP_002483083 | 13.16 | 93 | Cyanobacteria | Cyanothece PCC 7425
YP_003421696 | 5.26 | 91 | Cyanobacteria | Cyanobacterium UCYN-A
ZP_07112556 | 2.63 | 94 | Cyanobacteria | Oscillatoria PCC 6506

(a) Unless specifically specified, all matches were to uncultured bacterial nifH clones.
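The percentages in Table 3 can be converted back to whole-clone counts given the library size of 38. The sketch below does this for the BLASTX rows; the counts are back-calculated by rounding and are an inference from the reported percentages, not values stated in the table.

```python
def clones_from_percent(percent, library_size):
    """Back-calculate a whole-clone count from a reported percentage."""
    return round(percent * library_size / 100.0)

LIBRARY_SIZE = 38  # total clones analysed in the 2004 library

# BLASTX percentages as reported in Table 3.
blastx_percentages = {
    "Cyanothece sp. CCY0110": 52.63,
    "Cyanothece strain ATCC 51142": 26.32,
    "Cyanothece PCC 7425": 13.16,
    "Cyanobacterium UCYN-A": 5.26,
    "Oscillatoria PCC 6506": 2.63,
}

counts = {
    name: clones_from_percent(percent, LIBRARY_SIZE)
    for name, percent in blastx_percentages.items()
}

for name, n_clones in counts.items():
    print(f"{name}: {n_clones} clones")
print("total:", sum(counts.values()))
```

The rounded counts sum to the full library, which is a quick consistency check on the reported percentages.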

3.3.3 1996 clone library BLAST & BLASTX analysis

In total, 100 clones were screened and 37 positive clones, containing the correctly sized insert

(350bp), were obtained and analysed from the 1996 stromatolite clone library. Clones were

sequenced directly without performing RFLP analysis until sufficient coverage (>50%) of the

clone library was attained.

About a third of the clone library matched uncultured nifH clones from microbial mats of

Guerrero Negro (GN), Mexico, with an average sequence similarity of 86% (Table 4), based on

the BLAST analysis. A similar number of clones were identified as uncultured nifH clones from

natural seawater sediments contaminated with crude oil from the Gulf of Mexico, at an average

sequence similarity of 86%. Less than a fifth of the 1996 stromatolite clone library had 96% -

98% sequence similarity to uncultured nifH clones from a benthic hypersaline microbial mat of a salt pond

in Salins-de-Giraud, Camargue, France, and uncultured clones from a natural marsh in GN,

Mexico (AM286438 and DQ821946). Overall, the vast majority of the 1996 stromatolites clone

library had less than 90% sequence similarity to known sequences in the databases. Most of the

BLAST matches were to uncultured nifH clones from saline to hypersaline environments,

similar to the BLAST analysis of the 2004 clone library.

Translated NifH sequences of the 1996 clone library clustered into three different phyla and

seven different diazotrophic genera based on BLASTX analysis (table 4). The majority of

sequences in this clone library were affiliated with the δ-Proteobacteria group, followed by γ-

Proteobacteria and Cyanobacteria representatives.

Almost a third of the clone library matched Pelobacter carbinolicus DSM 2380 with an average

96% sequence similarity; another 24.32% of the clone library matched Desulfatibacillum

alkenivorans AK-01 (93% similarity) followed by matches to Desulfonatronospira

thiodismutans ASO3-1 with 89% average sequence similarity. Desulfovibrio magneticus RS-1

matches constituted a minute portion of the clone library, yet had a relatively high sequence

similarity of 97%.

Table 4 BLAST and BLASTX analysis of 1996 stromatolite nifH clone library. Total of 37 clones were analysed.

BLAST Analysis

Accession ID | No. of clones (%) | Average sequence similarity (%) | Source (a)
DQ338014 DQ338071 | 35.14 | 86 | Natural marsh microbial mats, Mexico: Guerrero Negro, Baja California (Moisander et al., 2006)
DQ078021 DQ078042 | 29.73 | 88 | Oil contaminated marine sediments (Musat et al., 2006)
HM750443 HM750759 | 13.51 | 89 | Rhizosphere of a salt marsh (unpubl.)
DQ821946 | 8.11 | 98 | Natural marsh microbial mats, Mexico: Guerrero Negro, Baja California (Moisander et al., 2007)
AM286438 | 5.41 | 96 | Saline pond benthic microbial mat, France: Salins-de-Giraud, Camargue (Bonin and Michotey, 2006)
GU193021 | 5.41 | 84 | Intertidal microbial mat (unpubl.)
AP010904 | 2.70 | 85 | Desulfovibrio magneticus RS-1 (Nakazawa et al., 2009)

BLASTX Analysis

Accession ID | No. of clones (%) | Average sequence similarity (%) | Phylum | Closest NifH-deduced amino acid sequence
YP_357508 | 29.73 | 96 | δ-Proteobacteria | Pelobacter carbinolicus DSM 2380
YP_002430688 | 24.32 | 93 | δ-Proteobacteria | Desulfatibacillum alkenivorans AK-01
ZP_07015343 | 21.62 | 89 | δ-Proteobacteria | Desulfonatronospira thiodismutans ASO3-1
ZP_01727765 | 8.11 | 95 | Cyanobacteria | Cyanothece sp. CCY0110
YP_001001870 | 5.41 | 93 | γ-Proteobacteria | Halorhodospira halophila SL1
YP_003073074 | 5.41 | 92 | γ-Proteobacteria | Teredinibacter turnerae T7901
YP_002953433 | 2.70 | 97 | δ-Proteobacteria | Desulfovibrio magneticus RS-1

(a) Unless specifically specified, all matches were to uncultured bacterium nifH clones.

All these species are strict anaerobes, sulphide or sulphate reducers, isolated from marine and

freshwater sediments, and are considered mesophilic (Schink, 1992; Sakaguchi et al., 2002;

Cravo-Laureau et al., 2004; Sorokin et al., 2008). Nitrogenase activity has yet to be

characterised in these genera, though nifH DNA fragments have been amplified from mostly

marine sediments (Zadorina et al., 2009; Bertics et al., 2010; Quaiser et al., 2010).

Several individual NifH clones had 97% sequence similarity to Halorhodospira halophila SL1

and Teredinibacter turnerae T7901. H. halophila is an anaerobic halophilic phototroph with

nitrogenase activity under light conditions, and its nifH DNA fragments have been amplified

from various environments, usually at relatively low sequence similarities (Tsuihiji et al., 2006;

Falcón et al., 2007; Zadorina et al., 2009; Ma et al., 2010). T. turnerae is a mesophilic

endosymbiotic γ-Proteobacterium isolated from molluscs (Bivalvia: Teredinidae) which is able

to fix nitrogen under microaerobic conditions (Distel et al., 2002). Cyanothece sp. CCY0110

(ZP_01727765) was the only cyanobacterial match in the 1996 clone library, with 95% average

sequence similarity.

3.3.4 BLAST and BLASTX comparative analysis

A common match in both clone libraries, based on BLAST analysis, was to uncultured nifH

sequences from microbial mats in Guerrero Negro (GN), Mexico, dominated by a Lyngbya sp.

(Moisander et al., 2006). Common nifH sequences suggest, to a certain extent, that the same

diazotrophs were present in the 1996 and 2004 stromatolite communities, but do not imply that

they employed similar nitrogen fixation patterns.

The GN microbial mats were collected from a tidal flat, which underwent alternating

desiccation/wetting periods pending tidal flooding; they were therefore subjected to alternating

levels of salinity (sea water-hypersalinity), in a similar fashion to the intertidal columnar

stromatolites in Hamelin Pool. Additional cyanobacterial species, purple and colourless sulphur

bacteria were also identified (Omoregie et al., 2004b) and it was not surprising that the clone

sequences were only 86%-87% similar to the nifH nucleotide sequences of the GN mat

samples (Moisander et al., 2006) after taking into consideration the varying environmental

salinity levels and methodological differences between our study and the GN mats studies. The

1996 and 2004 stromatolite nifH clones may represent novel sequences due to local adaptations

to their own environment and its specific characteristics such as salinity levels, and nutrient

dynamics.

Based on the BLASTX results, Cyanothece sp. CCY0110 (ZP_01727765, Cyanobacteria)

emerged as the common match for both clone libraries at 93% and 95% average sequence

similarity. This was a major component of the 2004 clone library but less prevalent in the 1996

clone library (only 8.11% of the clones). Cyanothece sp. CCY0110 is frequently found in, and

cultured from, various hypersaline and marine environments (Garcia-Pichel et al., 1998; López-Cortés

et al., 2001) and, as mentioned previously, is known to fix nitrogen under dark, aerobic

conditions.

3.3.5 Phylogenetic analysis

Most of the reference NifH sequences were obtained from the Swiss-Prot database (Boeckmann

et al., 2003), in which they were manually annotated, reviewed and verified, thus providing a

reliable genetic framework into which we integrated the clone sequences and additional

BLASTX hits (see appendix A, table A-5). The LG model (Le-Gascuel) is an improved model

over WAG and JTT in estimating amino acid substitution rates and in general provides better

tree topologies and likelihood probabilities (Le and Gascuel, 2008; Guindon et al., 2010). The

LG model takes into consideration not only variations in amino acid substitutions per site but

also whether a site is slow or fast to change due to evolutionary constraints. Using deduced

amino acid sequences instead of nucleotides might cause loss of some information with regard to

synonymous vs. non-synonymous substitutions, which then might provide a different

evolutionary presentation of the nifH gene. Yet, because nifH is a coding gene for a protein, it is

logical to view the code at the amino acid level, where it will be subjected to far more selective

pressure arising from physical and chemical conditions within the cell. The resulting branch

support values in this study were satisfactory and provided a reliable representation of the

possible evolution of nifH genes amongst Archaea and Eubacteria (Posada et al., 2009).

For this analysis, a total of 232 NifH amino acid sequences with an average length of 120

residues, were subjected to a maximum likelihood analysis. This produced a phylogenetic tree

with four major clusters, corresponding to NifH designated clusters I-IV (Chien and Zinder,

1996; Zehr et al., 2003a; Raymond et al., 2004a), plus two smaller clusters, one affiliated with

Desulfuromonadales representatives from the δ-Proteobacteria and another cluster of

Roseiflexus spp. NifH amino acid sequences (93 and 96 branch support values, respectively,

figure 7).

Briefly, cluster I contained the conventional Mo-containing NifH sequences, most of them

affiliated with Proteobacteria, Cyanobacteria and Firmicutes (figure 8). Cluster I contained a

total of 46 stromatolite NifH clones and had a branch support value of 88 within the entire NifH

tree. Cluster II included phylotypes with an alternative nitrogenase containing Fe instead of Mo

or V. These included archaeal methane producers, and alternative nifH genes (nifH2, nifH3)

from Firmicutes, α, -Proteobacteria and Spirochaetes. Cluster III included NifH sequences of

anaerobic diazotrophs with conventional nitrogenase (mostly Mo), mainly from the δ-Proteobacteria,

Spirochaetes, Chlorobi group, Firmicutes and Archaea (figure 9). Cluster III

contained a total of 18 stromatolite NifH clones and had a support value of 75 within the entire

NifH tree. NifH Cluster IV was very divergent and included mostly strictly anaerobic archaeal

genera, some with alternative nifH genes: Methanopyrus, Methanosarcina, Methanobrevibacter,

Methanothermobacter (nifH2), Methanobacterium, Methanocaldococcus and Methanococcus.

The only exception to the Archaea was an alternative nifH gene copy of Rhodobacter capsulatus

(nifH2), a phototrophic purple non-sulphur α-Proteobacterium.

Figure 7 A phylogenetic tree based on Maximum-likelihood analysis of partial NifH amino acid sequences. Sequences determined in this study were given an alphanumeric prefix RSAYYYY and are marked bold; number of clones is in parenthesis; the scale bar represents the number of substitutions per 100 bases.

Eleven NifH clones from the 1996 stromatolite library were affiliated with the closest out-group

to cluster I - Pelobacter carbinolicus DSM 2380 (δ-Proteobacteria, figure 8).

Three sub-clusters in cluster I, 1-Cyan-A/B/C, were treated as one sub-cluster designated “1B”

by Zehr et al. (2003) and included only cyanobacterial NifH sequences. The sub-cluster 1-Cyan-B

branch support value was 89 and included unicellular Cyanobacteria: Cyanothece,

Gloeothece and a very divergent Cyanobacterium UCYN-A NifH sequence. It is unclear why

they clustered separately from Cyanothece and Gloeothece NifH sequences in 1-Cyan-A and

1-Cyan-C. The entire 2004 stromatolite clone library, 38 clones in total, clustered closely to a

NifH sequence of Xenococcus PCC 7305, in the cyanobacterial cluster 1-Cyan-B (O08262

accession ID, 108AA length). Three 1996 stromatolite clones clustered separately from the

2004 sequences, but in the same sub-cluster. Xenococcus sp. NifH fragments were identified in

marine sponges, coral reef lagoon seawater samples from Heron Island, Australia, in microbial

mats from Guerrero Negro (GN) salt ponds in Baja California, and additionally from core

samples of marine stromatolites from Highborne Cay, Bahamas (Steppe et al., 2001; Omoregie

et al., 2004c; Hewson et al., 2007; Mohamed et al., 2008b). In these studies nitrogenase activity

was highest during the dark period, but the activity was not attributed to a specific bacterial

group. The specific strain Xenococcus PCC 7305 is known to fix nitrogen anaerobically (Fay, 1992;

CRBIP, 2007), though its sequence clustered with the aerobic diazotrophic Cyanothece spp.

Two other sub-clusters included five 1996 stromatolite clones - 1-Prot-γ-B and 1-Prot-γ-C,

with branch support values >74. Two clones were closely affiliated with Marichromatium

purpuratum, also known as Chromatium purpuratum, which is a halophilic purple sulphur

anaerobic γ-Proteobacterium, phototrophic, with high G+C content (68.9%) and 25-35 °C

optimal growth temperature, usually found in anoxic marine sediments, marine sponges and

other marine invertebrates (Proctor, 1997; Imhoff et al., 1998). Accordingly, its NifH fragments

have been found in water surface samples from a river estuary, Hawaiian corals and in a tropical

intertidal lagoon (Affourtit et al., 2001; Bauer et al., 2008; Olson et al., 2009). The above clones

were originally matched in BLASTX as the halophilic γ-Proteobacterium Halorhodospira

halophila SL1 at 93% sequence similarity (table 4). H. halophila and M. purpuratum NifH

sequences share a high level of sequence similarity - 91%, confirmed by another published

analysis of H. halophila NifH sequence which positioned it within the same cluster as M.

purpuratum (Imhoff et al., 1998; Tsuihiji et al., 2006; Bertics et al., 2010). This ‘mismatch’

between BLASTX match and the phylogenetic affiliation was due to the different basic

assumptions employed in BLAST and BLASTX algorithms vs. the phylogenetic modelling.

BLAST and BLASTX are statistical methods designed to ‘fish out’ significant matches from

huge databases, without any evolutionary framework or assumptions (Altschul et al., 1997;

Ladunga, 2002b). Hence, because these two clones had almost the same sequence length as H.

halophila SL1, 120 vs. 121 residues, while M. purpuratum had only 109 residues and was short

of two known conserved motifs – “CDPKAD” at the beginning of the NifH partial sequence and

“GEMMAL/M” further along the sequence - BLASTX analysis chose H. halophila SL1 as the

best ‘correct’ match for these clones.

Phylogenetic models, on the other hand, incorporate evolutionary assumptions into their

algorithms such as time reversibility, amino acid substitution matrices, base frequencies,

proportion of invariable sites and more (Sullivan and Joyce, 2005). Therefore, the few non-conserved

residues between the clone sequences and H. halophila SL1 and M. purpuratum

resulted in these two clones clustering with M. purpuratum instead of H. halophila SL1,

regardless of the length issue.

Figure 8 Cluster I phylogenetic tree, based on Maximum-likelihood analysis of NifH partial amino acid sequences. Sequences obtained in this study were given an alphanumeric prefix RSAYYYY and are marked bold; branch support values (approximate likelihood-ratio test, aLRT) are shown for key branches; only values > 50 were considered significant. Text boxes contain the designation of clusters and, in parentheses, the closest sub-cluster nomination as per Zehr et al. (2003a). ‘1’ - cluster I, Prot=Proteobacteria, Cyan=Cyanobacteria, Firm=Firmicutes. The scale bar represents the number of substitutions per 100 bases. Out-group was Desulfuromonadales (δ-Proteobacteria) NifH sequences from Geobacter and Pelobacter genera.

[Figure 8 sub-cluster labels: 1-Prot-αβ (1J, 1K); 1-Cyan-A (1B); 1-Cyan-B (1B); 1-Cyan-C (1B); 1-Firm-A (1D); 1-Prot-γ-A (1P); 1-Prot-γ-B (1M); 1-Prot-γ-C (1H, 1T, 1l, 1U)]

Additionally, three 1996 clones clustered with Teredinibacter turnerae T7901 in a Vibrio

spp. cluster (92% amino acid sequence similarity, table 4). T. turnerae is an endosymbiotic γ-proteobacterium

isolated from molluscs (Bivalvia: Teredinidae), that can fix nitrogen under

microaerobic conditions, at seawater salinity level (Fiore et al., 2010). This genus clusters with

Pseudomonas spp. based on its 16S rDNA sequence, yet its NifH amino acid sequence clustered

with Vibrio spp. rather than Pseudomonas (Distel et al., 2002). There are a few bivalve species

living in Hamelin Pool (and Shark Bay in general), hence it is reasonable to assume T. turnerae

integrated structurally within the columnar stromatolite. The abundant bivalves Fragum

hamelini Iredale, Fragum erugatum and the small bivalve Irus irus (Linné), which is found at

the sides of many subtidal stromatolites, have been reported in this area (Hoffman and Walter,

1976; Playford et al., 1976; Flint and Abeysinghe, 2000/07), yet this is the first report of a T.

turnerae NifH fragment in a stromatolite microbial mat.

As mentioned earlier, cluster III contained 18 stromatolite NifH clones, entirely from the 1996

stromatolite clone library, which clustered in two sub-clusters: 3-Prot-δ-A and 3-Prot-δ-B, each

with branch support values >89 (figure 9).

Nine clones clustered with Desulfovibrio gigas (P71156) and Desulfonatronospira

thiodismutans ASO3-1 (D6SLD2) - δ-Proteobacteria, sulphate reducers and strict anaerobes. D.

thiodismutans ASO3-1 is an obligatory alkaliphilic (optimum pH 10) bacterium with moderate

salinity acceptance and maximum growth temperature of 43 °C (Sorokin et al., 2008). It has not

been detected in Shark Bay or in other marine microbial mats to date, perhaps because the

sequence is relatively new in the databases (first entry 10 August 2010) and therefore

additional confirmation may follow. A single clone was closely affiliated with D. gigas, whose

nifH DNA fragments were found in plant rhizospheres, marine sediment samples and in a few

cyanobacterial mats (Zehr et al., 1995; Moisander et al., 2007). The same clone had 97%

BLASTX sequence similarity to D. magneticus RS-1 (table 4), which the phylogenetic analysis

had assigned to a different sub-cluster, 3-Prot-δ-GS. Ten residues were not conserved between

D. magneticus and the 1996 stromatolite clone and were sufficient for the phylogenetic model to

place the clone sequence with D. gigas instead of D. magneticus. Desulfovibrio spp. seem to fix

nitrogen within marine sediment microcosms regardless of light conditions (Postgate et al.,

1988; Kent et al., 1989; Musat et al., 2006). Some studies suggested that the genus fixed

nitrogen mainly during dark periods in marine intertidal microbial mats (Zehr et al., 1995;

Steppe and Paerl, 2002).

Additionally, nine 1996 stromatolite clones clustered with an alkene-degrading, sulphate-

reducing bacterium - Desulfatibacillum alkenivorans strain AK-01 (B8FAC4) which was first

isolated from oil-polluted sediments of a sewage plant (Cravo-Laureau et al., 2004). Related

nifH sequences were reported in low abundance from a low temperature, acidic peat bog and in

a ghost shrimp benthic burrow within intertidal lagoon waters (Zadorina et al., 2009; Bertics et

al., 2010).

Briefly summarising, our phylogenetic analysis indicated that 100% of the 2004 and 22% of the

1996 stromatolite clone libraries were affiliated with cluster I. Almost 30% of the 1996

stromatolite clone library sequences were associated with an out-group to cluster I, which was

composed of Desulfuromonadales representatives from the δ-Proteobacteria, and an additional

48% were affiliated with cluster III. Neither clone library had representatives in cluster II or

cluster IV as designated by Zehr et al. (2003a). Combined to a unified representation of

potential diazotrophs in columnar stromatolites, cluster I clones would represent 61% and

cluster III clones would represent 39% of the diazotrophic community composition.

Additionally, a NifH sequence of Xenococcus PCC 7305, in the cyanobacterial sub-cluster 1-

Cyan-B, was a common phylogenetic affiliation for both clone libraries. This may indicate, as

with the previous BLAST and BLASTX analyses, that this was the common diazotrophic species

in columnar stromatolites. The few inconsistencies between the BLAST or BLASTX results and

the phylogenetic assignments emphasize the importance of applying at least two different

methods on the same batch of nucleotide or amino acid sequences, in order to gain an unbiased

view of the possible outcomes from the original sequences.
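The percentages in the summary above can be re-derived from the clone counts given in this section. The sketch below is only an arithmetic check: the 1996 cluster I count (8) is inferred here as the 46 cluster I clones minus the 38 clones of the 2004 library, and the library sizes (37 and 38 clones) follow from summing the counts. Under these assumed counts, cluster III alone accounts for 24% of the combined 75 clones, and the 39% quoted above corresponds to cluster III together with the out-group to cluster I.

```python
# Clone counts per group, read from the text of this section.
# Assumption: the 1996 cluster I count (8) is inferred as 46 - 38.
CLONES = {
    "1996": {"cluster I": 8, "out-group": 11, "cluster III": 18},
    "2004": {"cluster I": 38, "out-group": 0, "cluster III": 0},
}


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


lib_1996 = CLONES["1996"]
n_1996 = sum(lib_1996.values())  # total clones in the 1996 library
n_all = sum(sum(lib.values()) for lib in CLONES.values())  # both libraries

# Share of each group within the 1996 library.
share_1996 = {g: round(percent(n, n_1996), 1) for g, n in lib_1996.items()}

# Share of each group when both libraries are pooled.
combined = {
    g: round(percent(sum(lib[g] for lib in CLONES.values()), n_all), 1)
    for g in lib_1996
}

# Cluster III plus the out-group, pooled over both libraries.
cluster3_plus_outgroup = round(
    percent(lib_1996["cluster III"] + lib_1996["out-group"], n_all), 1
)

print(share_1996)
print(combined)
print(cluster3_plus_outgroup)
```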

Figure 9 Cluster III phylogenetic tree based on Maximum-likelihood analysis of partial NifH amino acid sequences. Sequences determined in this study were given a prefix RSA and are marked bold; branch support values (approximate likelihood-ratio test, aLRT) are shown for key branches; only values > 50 were considered significant. Spir=Spirochaetes, Arch=Archaea. The scale bar represents the number of substitutions per 100 bases.

[Figure 9 sub-cluster labels: 3-Firm-Arch (3C, 3D, 3A); 3-Prot-δ-GS (3L, 3T); 3-Prot-δ-B (3B, 3E, 3L); 3-Prot-δ-A (3P); 3-Spiro-A (3L)]

3.3.6 Coverage, diversity and community structure

Before analyzing richness, diversity and structure, it was necessary to ascertain whether the

clone library coverage was sufficient to provide a reliable assessment of the above

factors. The program “Mothur” employs molecular distance matrices in order to calculate

various ecological parameters and coverage estimates, and has been successfully used in

microbial ecological studies (Schloss et al., 2009). The Mothur software version used in this

study did not provide a sub-program to calculate distances between amino acid sequences, so after

aligning sequences with “Muscle” and confirming alignment quality against known NifH

reference sequences, the Probability Matrix from Blocks (PMB, Veerassamy et al., 2003) as

implemented in “PHYLIP Protdist”, version 3.67 (Felsenstein, 2007), was used for that purpose.

PMB is derived from the popular BLOSUM matrices for amino acid substitutions and from the

Blocks database (Henikoff et al., 1999; Henikoff et al., 2000). This matrix takes into

consideration aligned ungapped conserved regions and adjusts amino acid substitution scores

based on evolutionary assumptions (e.g. evolutionary distances are additive in a linear fashion).

The resulting model is firmly grounded in empirical data, as it included the NifH/BchL/ChlL

family, and was suitable for use with NifH sequences which have several conserved blocks in

the sequence.

Statistical analyses of the clone libraries from 1996 and 2004 stromatolites are presented in

tables 5-7, as well as collector curves for coverage of all libraries (figure 10). These curves

represent the frequency data for each distance level (0.01-0.12, figure 10) plotted against the

number of unique sequences or species observed. In other words, the data is based on the

number of observed OTUs as a function of distance between sequences and the number of

sequences sampled. Therefore, when a curve reaches an asymptote, no more unique

sequences are being observed at that distance level, full species coverage has been attained, and there is no

need to sample the clone library any further (Schloss et al., 2004).

The estimated clone library coverage was 73% for 99% phylotype cutoff and up to 97%

coverage for 87% phylotype cutoff, regardless of the sampling year (table 5). Phylotype cutoff

of 99% meant sequences were at a maximum distance of 1% from one another. Therefore, the

coverage of potential diazotrophs was comprehensive and the clone libraries were representative

of the diazotrophic diversity in our samples. At 100% phylotype cutoff (unique sequences), 34

OTUs were identified and the number of species estimated by the Chao1 non-parametric richness

estimator was 121.75 (63.88-291.73, 95% CI), indicating that when sampled to completion

there would be between 29 and 257 additional NifH species. The Shannon-Wiener index of diversity

(H’) was 2.72 (2.37-3.07, 95% CI).
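The coverage and richness figures quoted above can be reproduced from OTU frequency counts. The following is a minimal sketch, assuming Good's coverage is computed as 1 - F1/N and Chao1 in its bias-corrected form S_obs + F1(F1 - 1) / (2(F2 + 1)), where F1 and F2 are the numbers of singleton and doubleton OTUs. The values F1 = 27 and F2 = 3 are not reported in the text; they are back-calculated here from the 64.00% coverage and the Chao1 value of 121.75 for the 75 pooled clones, and should be read as an assumption.

```python
def goods_coverage(n_singletons, n_sequences):
    """Good's coverage: fraction of sequences that belong to non-singleton OTUs."""
    return 1.0 - n_singletons / n_sequences


def chao1_bias_corrected(s_obs, f1, f2):
    """Bias-corrected Chao1 richness from observed OTUs, singletons and doubletons."""
    return s_obs + f1 * (f1 - 1) / (2.0 * (f2 + 1))


# Pooled libraries at the 100% phylotype cutoff.
N_SEQUENCES = 75   # 37 clones (1996) + 38 clones (2004)
S_OBS = 34         # observed OTUs (table 5)
F1, F2 = 27, 3     # assumed: back-calculated, not given in the source

coverage_pct = 100.0 * goods_coverage(F1, N_SEQUENCES)
chao1 = chao1_bias_corrected(S_OBS, F1, F2)

print(round(coverage_pct, 2), chao1)
```

With the lower and upper confidence limits of 63.88 and 291.73, subtracting the 34 observed OTUs gives the range of roughly 29 to 257 additional NifH species quoted in the text.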

Between the 100% and 87% phylotype cutoffs, 8.82% to 55.55% of the OTUs, respectively (table 5),

were shared between the clone libraries, indicating common OTUs of NifH sequences in the

2004 and 1996 clone libraries. At the 87% phylotype cutoff, Yue and Clayton’s non-parametric

estimator for similarity (θ) was 0.85, indicating a high proportional similarity between the clone

libraries. At the 100% phylotype cutoff, similarity (θ) between the libraries was lower, at 0.30.
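To illustrate what the similarity estimator measures, the sketch below computes Yue and Clayton's θ from the relative abundance of each OTU in two libraries, using the form θ = Σ a_i b_i / (Σ a_i² + Σ b_i² - Σ a_i b_i). The OTU counts in the example are hypothetical and are not the counts from this study; they only show that θ rises towards 1 as the two libraries share OTUs in similar proportions and falls to 0 when no OTU is shared.

```python
def theta_yc(counts_a, counts_b):
    """Yue and Clayton similarity between two libraries.

    counts_a and counts_b list the number of sequences per OTU, in the
    same OTU order; an OTU absent from a library has a count of 0.
    """
    total_a = sum(counts_a)
    total_b = sum(counts_b)
    rel_a = [c / total_a for c in counts_a]
    rel_b = [c / total_b for c in counts_b]
    shared = sum(a * b for a, b in zip(rel_a, rel_b))
    denom = sum(a * a for a in rel_a) + sum(b * b for b in rel_b) - shared
    return shared / denom


# Hypothetical OTU tables (three OTUs, two libraries).
library_a = [10, 5, 0]
library_b = [8, 0, 4]

print(round(theta_yc(library_a, library_b), 3))
```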

Table 5 Shared coverage, observed richness, diversity & similarity estimators, based on NifH translated amino acid sequences from both clone libraries.

Phylotype    OTUs   Shared OTUs   Coverage   Community        Richness index           Diversity index
cutoff (%)          (%) (d)       (%) (a)    similarity       Chao1 (95% CI) (b)       Shannon–Wiener (95% CI) (c)
                                             (θ) (e)
100          34     8.82          64.00      0.30             121.75 (63.88-291.73)    2.72 (2.37-3.07)
99           28     17.85         73.33      0.46             75.50 (42.89-179.48)     2.51 (2.18-2.84)
98           23     30.43         82.67      0.58             38.60 (27.23-80.53)      2.38 (2.07-2.68)
96           19     36.84         85.33      0.69             37.33 (23.48-94.08)      2.16 (1.87-2.45)
95           15     46.66         92.00      0.75             18.75 (15.64-37.02)      2.00 (1.73-2.27)
90           10     50.00         96.00      0.82             10.75 (10.07-18.45)      1.52 (1.25-1.78)
87           9      55.55         97.33      0.85             9.33 (9.02-14.96)        1.49 (1.24-1.74)

Abbreviations: CI, confidence interval; OTUs, operational taxonomic units. (a) The coverage index was calculated by the method of Good (1953). (b) The richness index was calculated by the method of Chao et al. (1993). (c) The diversity index was calculated by the method of Shannon–Wiener (Krebs, 1989). (d) Number of shared OTUs between libraries. (e) Yue and Clayton’s community overlap measure based on shared OTU proportions (Yue and Clayton, 2005).

Statistical analysis by Libshuff (Singleton et al., 2001) confirmed that the clone libraries were

not significantly different from one another (significance >0.025, table 6). The marginal

significance (0.01-0.05) given by the parsimony method (P-test) and weighted UniFrac test

(Martin, 2002; Lozupone and Knight, 2005) indicated that the structural similarity between the

communities might not occur by chance, which can be interpreted to mean that the communities

were not significantly different from one another (table 6).
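The 0.025 threshold used for the Libshuff comparison above is consistent with a Bonferroni correction of a 0.05 significance level over the two directional comparisons. The sketch below makes that reasoning explicit using the two significance values reported in table 6; the 0.05 starting level is an assumption, as the text only states the corrected threshold.

```python
# Libshuff significance values for the two directional comparisons (table 6).
libshuff_p = {
    "Strom2004-Strom1996": 0.2839,
    "Strom1996-Strom2004": 0.0912,
}

ALPHA = 0.05  # assumed experiment-wide significance level

# Bonferroni correction: divide alpha by the number of comparisons.
threshold = ALPHA / len(libshuff_p)

# A comparison is significant only if its value falls below the threshold.
significant = {name: p < threshold for name, p in libshuff_p.items()}

# The libraries are considered different if either direction is significant.
libraries_differ = any(significant.values())

print(threshold, significant, libraries_differ)
```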

Table 6 Community structure comparisons based on NifH translated amino acid sequences.

Test            Comparison             dCXYScore     Significance
Libshuff (a)    Strom2004-Strom1996    0.00059148    0.2839
Libshuff (a)    Strom1996-Strom2004    0.00106589    0.0912

Test            Comparison             Corrected P-value    Significance
Parsimony (b)   Strom1996-Strom2004    0.0300               Marginally significant*
UniFrac (c)     Strom1996-Strom2004    0.0200               Marginally significant

(a) Libshuff analysis calculated using the Cramer-von Mises test statistic with 10,000 randomisations by the method of Schloss et al. (2004). (b) Parsimony statistical test (P-test) with 100 permutations by the method of Martin (2002), corrected for multiple comparisons using the Bonferroni correction. (c) UniFrac statistical test (P-test) with 100 permutations by the method of Lozupone and Knight (2005), corrected for multiple comparisons using the Bonferroni correction. * Marginal significance 0.01-0.05 as calculated by UniFrac (Lozupone et al., 2006).

According to the collector’s curve analysis of OTUs and based on the furthest-neighbour

algorithm and a distance precision of 0.01 (Schloss and Handelsman, 2005), the number of

unique NifH amino acid sequences began to stabilize and reach an asymptote at the 99%

phylotype cutoff, in each clone library (table 7, figure 10). This meant full species coverage was

attained at that cutoff. The 99% phylotype cutoff meant that all sequences were at a maximum

distance of 1% from one another. At the 99% phylotype cutoff, the 2004 clone library sequences

were grouped into 6 OTUs only, and grouped into one OTU at 93% phylotype cutoff (0.07

distance), indicating the relatively low diversity of the 2004 clone library. The 1996 stromatolite

sequences grouped into 20 OTUs at the 99% phylotype cutoff, and even at the 88% cutoff (0.12

distance) were still not grouped into one collective OTU, as occurred with the 2004 stromatolite

sequences. This indicated a higher diversity and potential richness of the NifH sequences from

the 1996 clone library compared with the 2004 clone library.
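The grouping of sequences into OTUs described above can be sketched as furthest-neighbour (complete-linkage) clustering: two groups are merged only while the largest distance between any pair of their members stays within the cutoff. The function and the small distance matrix below are hypothetical and written only for illustration; they are not Mothur's implementation and the distances are not from this study. The example shows how the number of OTUs shrinks as the distance cutoff is relaxed from 0.01 to 0.12.

```python
def furthest_neighbour_otus(dist, cutoff):
    """Group sequences into OTUs by complete-linkage clustering.

    dist is a symmetric matrix of pairwise distances; two clusters are
    merged only if the largest distance between their members is <= cutoff.
    """
    clusters = [[i] for i in range(len(dist))]
    while True:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Furthest-neighbour distance between clusters i and j.
                d = max(dist[a][b] for a in clusters[i] for b in clusters[j])
                if d <= cutoff and (best is None or d < best[0]):
                    best = (d, i, j)
        if best is None:
            return clusters
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]


# Hypothetical pairwise distances between four sequences.
DIST = [
    [0.000, 0.005, 0.030, 0.100],
    [0.005, 0.000, 0.040, 0.110],
    [0.030, 0.040, 0.000, 0.090],
    [0.100, 0.110, 0.090, 0.000],
]

for cutoff in (0.01, 0.07, 0.12):
    otus = furthest_neighbour_otus(DIST, cutoff)
    print(cutoff, len(otus), otus)
```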

Figure 10 Collector’s curves for taxa (defined here as OTUs), with phylotype cut-offs of 99% (0.01) - 88% (0.12), based on NifH translated amino acid sequences. (A) 1996 stromatolite clone library (B) 2004 stromatolite clone library.

Table 7 Coverage, observed phylotype richness and diversity indices for each clone library, based on NifH translated amino acid sequences.

Phylotype    Sequence length   Coverage       OTUs   Richness index           Diversity index
cutoff (%)   analysed          Good (%) (a)          Chao1 (95% CI) (b)       Shannon–Wiener (95% CI) (c)

1996 Stromatolites
100          105 AA            43.24          25     130 (56.46-375.48)       2.97 (2.65-3.28)
99                             51.35          22     98.5 (43.99-288.10)      2.76 (2.42-3.09)
98                             67.57          18     40 (23.58-104.73)        2.55 (2.23-2.86)
96                             72.97          16     31 (19.50-80.24)         2.40 (2.09-2.72)
95                             83.78          13     16.75 (13.64-35.02)      2.21 (1.91-2.50)
90                             91.89          10     10.75 (10.07-18.45)      1.96 (1.69-2.23)
87                             94.59          9      9.33 (9.02-14.96)        1.91 (1.66-2.16)

2004 Stromatolites
100          104 AA            84.21          9      14.00 (9.86-37.91)       1.11 (0.67-1.55)
99                             94.74          6      6.33 (6.02-11.96)        0.91 (0.53-1.29)
98                             97.37          5      5.00 (5-5.00)            0.85 (0.50-1.19)
96                             97.37          3      3.00 (3-3.00)            0.55 (0.30-0.81)
95                             100.00         2      2.00 (2-2.00)            0.44 (0.24-0.63)

Abbreviations: CI, confidence interval; OTUs, operational taxonomic units. (a) The coverage index was calculated by the method of Good (1953). (b) The richness index was calculated by the method of Chao et al. (1993). (c) The diversity index was calculated by the method of Shannon–Wiener (Krebs, 1989).

The 1996 columnar stromatolite diazotrophic community at 98% phylotype cutoff grouped into

18 OTUs, and the number of species estimated by the Chao1 non-parametric richness estimator

was 40 (23.58-104.73, 95% CI, table 7), indicating that when sampled to completion there

would be between 5 and 86 more NifH species obtained. The 2004 clone library included 5

OTUs at 98% phylotype cutoff, and the number of species estimated by the Chao1 richness

estimator was 5 (5-5, 95% CI), which indicated that at this specific cutoff, all NifH species

were sampled to completion. The Shannon-Wiener index of diversity (H’) was 2.55 and 0.85 for the

1996 and 2004 clone libraries, respectively. With an estimated coverage of >67% for both

libraries at the 98% phylotype cutoff, it was clear that the 2004 clone library was far less diverse

and less rich in NifH species compared with the 1996 clone library.
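The Shannon-Wiener values compared above follow H’ = -Σ p_i ln p_i, where p_i is the proportion of clones in OTU i, so a library dominated by one OTU scores low even when it contains several OTUs. The sketch below is illustrative only: the per-OTU clone counts are not reported in the text, and the 32/6 split of the 38 clones of the 2004 library into two OTUs is an assumed example that happens to give a value close to the 0.44 listed in table 7 for the 95% cutoff.

```python
import math


def shannon_index(otu_counts):
    """Shannon-Wiener diversity H' using natural logarithms."""
    total = sum(otu_counts)
    return -sum((c / total) * math.log(c / total) for c in otu_counts if c > 0)


# Assumed split of 38 clones into two OTUs (not given in the source).
uneven_library = [32, 6]

# The same number of clones spread evenly over two OTUs, for comparison.
even_library = [19, 19]

print(round(shannon_index(uneven_library), 2))
print(round(shannon_index(even_library), 2))
```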

A possible explanation for the differences in diversity and richness estimators between libraries

might originate from different environmental conditions at the time of sampling. Mean annual rainfall

from 1990 to 2010 in Hamelin Pool was 199.7 mm y-1, and in the month of May alone

was 29.8 mm (Bureau of Meteorology, 2011). In 2004 and 1996 there was no substantial

deviation from this mean in May, yet 1996 had much higher rainfall occurring throughout the

year. Hamelin Pool experienced far more rainfall in February, June, July, August and October

1996, culminating in a total of 299.2 mm rainfall (50% increase).
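The 50% figure above is the relative increase of the 1996 total over the 1990-2010 mean, using the two rainfall values quoted in the text. A short arithmetic check:

```python
MEAN_ANNUAL_RAINFALL_MM = 199.7   # 1990-2010 mean, as quoted in the text
RAINFALL_1996_MM = 299.2          # 1996 total, as quoted in the text

# Relative increase of 1996 over the long-term mean, as a percentage.
increase_pct = 100.0 * (RAINFALL_1996_MM - MEAN_ANNUAL_RAINFALL_MM) / MEAN_ANNUAL_RAINFALL_MM

print(round(increase_pct, 1))
```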

This would have changed the local water budget, usually dominated by evaporation of

freshwater and influx of saline oceanic waters (Smith and Atkinson, 1983). Increase of fresh

water would probably have lowered salinity levels, washed additional nutrients into the bay and

changed Hamelin Pool’s water chemistry. These conditions would further influence microbial

community composition, allowing proliferation of new phylogenetic groups to participate in

new biochemical processes and niches, as has been evident in other hypersaline microbial mats

under similar conditions (Yannarell et al., 2006). As expected from arid and dry conditions,

2004 library NifH sequences were affiliated with Cyanobacteria, a resilient group of

microorganisms which flourish (sometimes exclusively) under various stressful conditions

(Paerl et al., 2000; Pandey et al., 2004; Yannarell et al., 2007). The 1996 library included far

more non-cyanobacterial nitrogenase sequences, which was also the case for a sample, taken

during the wet season, of a hypersaline microbial mat from Salt Pond, San Salvador Island,

Bahamas (Yannarell et al., 2006). Currently we do not have additional environmental data to

further support the above suggestion or offer an alternative explanation.

3.3.7 Nitrogen fixation potential in Shark Bay

Past studies of the stromatolite bacterial communities in Hamelin Pool, Shark Bay, have

suggested the presence of several possible diazotrophs based on 16S rDNA molecular analyses

and culturing efforts (table 1). Bacterial matches between those studies and this study, included

uncultured clones of the sulphate reducer Desulfatibacillum alkenivorans and clones with less

than 90% sequence similarity to Desulfovibrio africanus and P. carbinolicus DSM 2380,

sampled from smooth and pustular mats in the same locality (Allen, 2006; Allen et al., 2009).

In addition, cyanobacterial matches to this study included Xenococcus, Oscillatoria and

Cyanothece isolates at 92% - 93% sequence similarity, below the acceptable threshold of 95%

sequence similarity for a positive genus identification (Everett et al., 1999; Clarridge, 2004).

Xenococcus spp. were isolated from pustular and smooth mats and Cyanothece and Oscillatoria

spp. were isolated from columnar stromatolites in the past (Burns et al., 2004; Goh et al., 2008).

Since a few stromatolite NifH clones were affiliated in the phylogenetic analysis with

Marichromatium purpuratum, it is worth mentioning that an obligate halophilic

diazotrophic strain of Chromatium vinosum (also known as Allochromatium vinosum, from the

same family - Chromatiaceae), was isolated from surface deposits of columnar stromatolites, in

the intertidal zone of Hamelin Pool (Bauld et al., 1986).

While taking into consideration all the possible diazotrophic genera in Hamelin Pool

stromatolites (table 1, underlined names), this study has confirmed δ-Proteobacteria and

Cyanobacteria representatives were present in columnar stromatolites. More specifically,

Desulfatibacillum and members of the Chroococcales, Pleurocapsales and Oscillatoriales

(Cyanothece, Xenococcus and Oscillatoria, respectively) were identified. Additional potential diazotrophs

were a novel discovery and had not been identified before: Cyanobacterium UCYN-A and

γ-Proteobacteria members Teredinibacter and Halorhodospira in cluster I, δ-Proteobacteria

representatives Desulfovibrio and Desulfonatronospira in cluster III and Pelobacter in the

out-group to cluster I.

Because the nifH gene is present in a relatively limited number of Eubacteria genomes, in

comparison to 16S rDNA, we would not expect diversity based on nifH to exceed

diversity estimates based on 16S rDNA analysis. Diversity estimates would be lower also

because they were based on amino acid sequences, not nucleotides. A 4% distance within a

group of amino acid sequences might underestimate a more diverse population of nucleotide

sequences. However, for OTU based analysis purposes, we can go forward, bearing this

assumption in mind while discussing molecular diversity based on NifH translated sequences

vs. 16S rDNA.

Previous molecular analyses of 16S rDNA, from smooth and pustular stromatolite mats,

generated bacterial clone libraries that were fairly similar to one another in terms of their

richness and diversity (Allen et al., 2009), yet columnar intertidal stromatolite clone libraries

had lower estimates of diversity and OTUs richness (Allen et al., 2009). At the 98% phylotype

cutoff, bacterial smooth mat sequences were grouped into 111 OTUs, with a Chao1 richness

estimator of 6216, and 4.71 for Shannon-Wiener index of diversity (H’). At the same cutoff

(98%), pustular mat sequences were grouped into 110 OTUs, Chao1 = 3053, H’ = 4.7.

Columnar intertidal stromatolite sequences, on the other hand, grouped into 34 OTUs, Chao1 =

45.2 and H’ = 2.89 at the same cutoff level, which indicated a substantial drop in richness and

species diversity. Additional 16S rDNA-based studies confirmed relatively low richness and

diversity estimators for the bacteria within columnar stromatolites from Shark Bay (Papineau et

al., 2005; Goh et al., 2008). This consistent finding can be attributed to the fact that columnar

stromatolites contain lower biomass in general and higher net carbon precipitation, and

therefore undergo lithification, producing less space and volume in which microorganisms can

live (Dupraz and Visscher, 2005).
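For readers unfamiliar with the estimators quoted above, the following is a minimal sketch of how Chao1 and the Shannon-Wiener index (H’) can be computed from per-OTU clone counts. It uses the textbook formulas and is not necessarily the implementation used in the cited studies or in this thesis. The counts in `otu_counts` are hypothetical.

```python
import math


def chao1(counts):
    """Chao1 richness estimate from a list of per-OTU clone counts."""
    counts = [c for c in counts if c > 0]
    observed = len(counts)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    if doubletons == 0:
        # Bias-corrected form, used here because the classic form
        # would divide by zero when there are no doubletons.
        return observed + singletons * (singletons - 1) / 2.0
    # Classic form: S_obs + F1^2 / (2 * F2)
    return observed + singletons ** 2 / (2.0 * doubletons)


def shannon(counts):
    """Shannon-Wiener index H' using the natural logarithm."""
    counts = [c for c in counts if c > 0]
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts)


# Hypothetical clone library: 9 OTUs, 4 singletons, 2 doubletons.
otu_counts = [10, 6, 4, 2, 2, 1, 1, 1, 1]
richness_estimate = chao1(otu_counts)      # 9 + 16 / 4
diversity_estimate = shannon(otu_counts)
```

A library dominated by singletons yields a Chao1 value far above the observed OTU count, which is how 111 observed OTUs can correspond to an estimate in the thousands.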

The shared estimators for richness and diversity from both NifH clone libraries were slightly

lower compared to the above mentioned 16S rDNA-based analysis of bacterial communities

(Burns et al., 2004; Allen et al., 2009). At a 98% phylotype threshold, NifH sequences were

grouped into 23 OTUs, with a Chao1 richness estimator of 38.6 and H’ = 2.38 (table 5).

Because our analysis was based on amino acid sequences, it underestimated, to a certain degree, the true diazotrophic diversity within stromatolites. However, our NifH estimates were on the same scale as the 16S rDNA-based richness and diversity estimates for columnar stromatolites, and it is possible that the nifH DNA fragments were as diverse and abundant as the 16S rDNA fragments. A cautious conclusion, based solely on diversity and richness calculations, would be that the bacterial community in columnar stromatolites specifically is composed mostly of diazotrophic species, which may exhibit spatial and temporal differentiation with regard to nitrogen fixation.

Uncultured nifH clones from Guerrero Negro (GN) salt ponds were a common finding in our

BLAST analysis (see tables 3 & 4). Microbial mats from the Guerrero Negro in Baja California,

Mexico, provide a well-studied system which is similar, in certain characteristics, to the Shark

Bay system. Furthermore, in order to provide a likely depiction of the active and potential

nitrogen fixers in columnar stromatolites, we reviewed findings from our study, the GN studies

and former 16S rDNA-based analyses of the Hamelin Pool stromatolites (table 8).

The GN study site is set in a hyperarid climate (sporadic rainfall of 35 mm yr-1), with mean

monthly maximum high temperature of 29°C (Summers et al.) and high evaporation rates (1500

mm yr-1) (Jørgensen and Des Marais, 1990). A gentle tide of 0.5 – 1 m floods onto narrow,

shallow trenches and creates a natural large marsh land with shallow pools and hypersaline

evaporitic ponds (80‰ - 108‰ salinity), in which cyanobacterial mats prosper (Fryberger et al.,

1990; Jørgensen and Des Marais, 1990). While the environmental characteristics are similar in

general to those of Shark Bay’s Hamelin Pool, the mat morphologies differ, as columnar

stromatolites (also known as ‘stromatolite heads’) are not present in the Guerrero Negro study

site (Javor and Castenholz, 1981; Hoehler et al., 2001).

Generally, bacterial communities were found to be similar between Hamelin Pool (HP) and

Guerrero Negro (GN) in terms of taxonomy based on 16S rDNA analysis, but they were not

identical. Some of the most abundant bacterial divisions in HP mats were also abundant in GN

mats – mainly α-Proteobacteria, Bacteroidetes, Planctomycetes and -Proteobacteria (Ley et al.,

2006; Goh et al., 2008; Allen et al., 2009).


Table 8 Potential diazotrophs in Hamelin Pool (HP) and Guerrero Negro (GN), based on molecular analysis of 16S rDNA or nifH genes.

Common potential diazotrophs in GN and HP

Based on 16S rDNA (a,b): Chroococcidiopsis, Cyanothece*, Gloeocapsa*, Halothece, Leptolyngbya, Lyngbya, Microcoleus, Oscillatoria, Phormidium*, Synechocystis

Based on nifH gene (c,d): Cyanothece*, Desulfovibrio*, Myxosarcina*

Unique potential diazotrophs in GN or HP

HP, based on 16S rDNA (a): Bacillus, Desulfatibacillum, Gloeothece*, Halomonas, Methanosarcina*, Myxosarcina*, Pleurocapsa*, Pseudoalteromonas, Pseudomonas, Rhodobacter, Rhodopseudomonas, Rhodospirillum*, Stanieria*, Symploca, Synechococcus*, Vibrio, Xenococcus*

GN, based on 16S rDNA (b): Chlorobium, Desulfobacter, Desulfobacterium, Desulfococcus, Desulfovibrio, Pseudanabaena

HP, based on nifH gene (c): Cyanobacterium UCYN-A*, Desulfatibacillum*, Desulfonatronospira, Halorhodospira, Marichromatium, Oscillatoria*, Pelobacter, Teredinibacter, Xenococcus

GN, based on nifH gene (d): Anabaena, Azotobacter*, Burkholderia*, Clostridium*, Dermocarpa, Desulfonema*, Halothece, Klebsiella*, Plectonema, Synechocystis*

* Sequence similarity of the 16S rDNA or nifH gene to the designated genus was less than 95%.

(a) Data collected from the following references: Burns et al., 2004; Papineau et al., 2005; Goh et al., 2006; Allen et al., 2008; Allen et al., 2009.

(b) Data collected from the following references: Risatti et al., 1994; López-Cortés et al., 2001; Ley et al., 2006.

(c) Data from this study.

(d) Data collected from the following references: Omoregie et al., 2004a; Omoregie et al., 2004c. It does not include results from greenhouse experiments.

The majority of potential diazotrophs in GN mats were affiliated with cluster III representatives (δ-Proteobacteria and Firmicutes), yet also included cluster I representatives such as Cyanobacteria and β- and γ-Proteobacteria (table 8 and references within). There were no representatives of Pelobacter spp. or associations with cluster II or cluster IV. Common potential diazotrophs in HP and GN, based on 16S rDNA, included 10 cyanobacterial representatives from cluster I, belonging to the Chroococcales, Oscillatoriales and Pleurocapsales groups, while unique GN potential diazotrophs included six genera, mainly from the δ-Proteobacteria (cluster III).

There were fewer common diazotrophs in HP and GN, based on nifH gene studies. These

included Cyanothece, Myxosarcina (cluster I) and Desulfovibrio genera (cluster III). The GN

site had 10 unique potential diazotrophs, and our study has identified 9 unique potential

diazotrophs in HP, all of which were affiliated with cluster I or III.
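The shared and unique counts quoted above can be reproduced with simple set operations. The sketch below uses the nifH-based genus lists from table 8 (similarity asterisks omitted). The variable names are illustrative only, and the Jaccard index is an added summary statistic, not one reported in this study.

```python
# nifH-based genera from table 8.
common_nifh = {"Cyanothece", "Desulfovibrio", "Myxosarcina"}

hp_unique_nifh = {
    "Cyanobacterium UCYN-A", "Desulfatibacillum", "Desulfonatronospira",
    "Halorhodospira", "Marichromatium", "Oscillatoria", "Pelobacter",
    "Teredinibacter", "Xenococcus",
}

gn_unique_nifh = {
    "Anabaena", "Azotobacter", "Burkholderia", "Clostridium", "Dermocarpa",
    "Desulfonema", "Halothece", "Klebsiella", "Plectonema", "Synechocystis",
}

# Full genus lists per site.
hp_genera = common_nifh | hp_unique_nifh
gn_genera = common_nifh | gn_unique_nifh

shared = hp_genera & gn_genera      # genera detected at both sites
hp_only = hp_genera - gn_genera     # genera detected only in Hamelin Pool
gn_only = gn_genera - hp_genera     # genera detected only in Guerrero Negro

# Jaccard similarity: shared genera over all genera seen at either site.
jaccard = len(shared) / len(hp_genera | gn_genera)
```

With these lists, three genera are shared, nine are unique to HP and ten are unique to GN, giving a Jaccard similarity of 3/22 at genus level.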

Following reverse transcriptase PCR analysis in the GN mats, it was concluded that actual

nitrogen fixers during night time were Halothece sp. strain MPI96P605, Myxosarcina strain

ATCC 29377, Synechocystis sp. strain WH8501, Plectonema boryanum, Phormidium sp.

strain ATCC 29409 and NifH2 of Anabaena variabilis ATCC 29413 from cluster I. Only one

genus from cluster III was identified as an active nitrogen fixer - Desulfovibrio (Omoregie et al.,

2004a). Halothece, Synechocystis, and Phormidium were detected in HP based on past 16S

rDNA analysis, and Myxosarcina and Desulfovibrio were detected in HP columnar

stromatolites based on this study using nifH gene analysis. This would point to a potentially

similar pattern of nitrogen fixation.

In regards to community diversity and richness, the GN system, based on 16S rDNA, was

estimated to harbour almost twice the number of bacterial species - 10,000 vs. 6216 in HP

smooth or pustular mats (Ley et al., 2006; Goh et al., 2008; Allen et al., 2009). However,

diazotroph-related estimators of richness and diversity, based on nifH gene, were not available

for the GN mats and we therefore cannot compare this specific aspect. Though nitrogenase

activity was not measured in columnar stromatolites, in GN mats nitrogenase activity was

restricted mostly to the upper 5 mm and peaked during night time (9-37 μmol C2H4 m-2 h-1,

0:00-6:00), with almost no activity during the day time (Omoregie et al., 2004b).

In summary, based on the available data, Hamelin Pool columnar stromatolites and GN mats

harbour similar diazotrophic species. These include δ-Proteobacteria and Cyanobacteria

representatives from cluster I and cluster III of the nifH phylogeny tree. It is plausible that the

nitrogenase activity in columnar stromatolites in HP would peak during night time in the upper

layers of the mat, and that actual nitrogen fixers would be Desulfovibrio, Myxosarcina,

Xenococcus spp. and perhaps also Halothece, Synechocystis, and Phormidium, as these were

previously identified in Hamelin Pool (table 8) and were active nitrogen fixers in the GN mats.

It remains to be seen if future samples from columnar stromatolites, under different

environmental conditions, would reveal additional diazotrophs and their activity pattern.


3.4 Concluding remarks

Columnar stromatolites are one of five well known morphologies of modern stromatolites in

Shark Bay, usually found in shallow hypersaline waters. In order to assess this complex

microbial mat community, this study used DNA-based, culture independent, molecular

techniques and provided a novel view of the microbial diazotrophic communities within

columnar stromatolites.

Sequence analysis has provided statistically significant taxonomical identification and an

evolutionary representation of the nifH genes in this community. Our analysis indicated

columnar stromatolites, sampled from different years, included a common persisting

cyanobacterial diazotroph, of the genus Cyanothece or Xenococcus (tables 3 & 4, figure 8).

The diazotrophic community structure did not vary significantly between the temporal samples

according to our statistical tests (table 6). Diversity and richness did vary between the samples,

probably due to environmental shifts which affected seawater salinity levels and allowed for

diverse microbial groups to proliferate in 1996 (table 7). Both samples contained novel nifH

gene nucleotide sequences with low similarity scores to uncultured nifH clones from saline to

hypersaline environments, and translated NifH sequences with high similarity to unicellular,

non-heterocystous Cyanobacteria and γ- and δ-Proteobacteria NifH sequences.

NifH clone sequences were mainly affiliated with cluster I and, to a lesser extent, with cluster III, suggesting that aerobic and anaerobic bacteria with conventional Mo nitrogenase might be involved in the nitrogen fixation process. Not a single clone was affiliated with cluster II or cluster IV, while several clones were affiliated with a δ-Proteobacteria out-group to cluster I, represented by P. carbinolicus DSM 2380. Taking into consideration past studies of this community and of similar microbial mats in hypersaline environments, such as those present in the Guerrero Negro (GN) salt ponds, we suggest that columnar intertidal stromatolites are less diverse and less rich in microbial species relative to other mat morphologies, and that most of these species retain nitrogen fixation capabilities. Additionally, it would seem that marine-based diazotrophic bacteria are capable of enduring hypersaline conditions, and it remains to be seen what their adaptive mechanisms are.

In conclusion, Shark Bay, a UNESCO World Heritage site, continuously provides researchers

with fascinating endemic microbiological subjects that bridge our current era with Archaean

fossil records of early organic life on planet Earth. This furthers our understanding of how life

began, evolved and survived dynamic environmental conditions, on a geological scale.


Chapter 4 The bacterial diazotrophic community in a radon hot spring, South Australia

4.1 Introduction

Paralana Hot Springs (PHS) are situated in Mt. Painter, near the town Arkaroola, on the north

eastern side of the Flinders Ranges, South Australia (30°10’35”S, 139°26’26”E, figure 1). The

climate is arid, with an average annual rainfall of 20.3 mm, with an extreme of 1270 mm in

1974 (Sprigg, 1984). The maximum temperatures at Arkaroola can exceed 30°C during the

summer months and minimum temperatures can fall below 10°C during May to September

(Bureau of Meteorology, 2011). There are several water sources in the Paralana fault area; PHS

is the only radioactive spring in the area, and includes two connected oval shaped pools and a

draining creek. Pool 1 is the hot source pool and, depending on the time of year and flooding

events, tends to vary in size, depth and temperature (2 - 9 m2, 30 cm – 80 cm deep,

48°C – 63°C; Mawson, 1927; Grant, 1938; Long et al., 2001; Anitori et al., 2002). Pool 2 is the

larger of the two, deeper and cooler (50-80 m2, 1 to 4.5 m deep, 40.2°C - 48°C)

with neutral pH (7). Both pools include microbial components, which manifest as floating

microbial mats, of emerald-green colour, as well as dark benthic mats, mainly in pool 2 (Anitori

et al., 2002).

The PHS system, though unique in its characteristics, is not an isolated or a closed ecosystem. It

is subject to external inputs from its surrounding fauna and flora due to floods and long-standing

human interest in the springs for cultural and medicinal values (Sprigg, 1984).

Underground water circulates through the hot, radioactive rocks underlying the Mt. Painter

Domain, and then flows near a localized radiogenic source, relatively close to the surface

(Brugger et al., 2005). Hence, water is discharged at the hot source pool, at relatively high

temperatures (56°C - 63°C), and very high radon levels (29,000 Bq/L, in the gas bubbles), with

traces of radiogenic helium.


Figure 1: Arkaroola and Paralana Hot Springs locations in South Australia. Main satellite image by Google Earth, inset map source: Australian Bureau of Meteorology.

In general, most of the PHS studies have focused on their geology, mineralization processes,

hydrothermal activity and geochemistry characteristics, while rarely analyzing the springs’

unique biological and ecological attributes (Mawson, 1927; Grant, 1938; Blight, 1977; Smith,

1992; Long et al., 2001; Anitori et al., 2002; Thomas and Walter, 2002; Brugger et al., 2005).

PHS water chemistry has not changed significantly over the past 81 years, and its rather stable conditions can therefore support a steady, endemic microbial community within the pools (Mawson, 1927; Grant, 1938; Long et al., 2001). The PHS waters contain fluorine, cesium, rubidium, tungsten and molybdenum at ppm concentrations, and uranium at ppb concentrations (Brugger et al., 2005). It has been suggested that the differences in concentration between the hot source pool and pool 2 for several trace elements, such as Al, Cu, Pb, Y and Zn, as well as Mn and Fe, were caused by the microbial community present in the pools, though the exact mechanisms were not proposed (Brugger et al., 2005).

Furthermore, there seems to be an active microbial uptake of N2 and CO2, based on the gas bubble analysis. The dry gas concentration in the hot source pool ranged from 79 to 80% for N2(g) and from 3.8 to 5.2% for CO2(g), whereas the usual atmospheric concentrations are 78% and 0.03%, respectively (Brugger et al., 2005). Identifying the functional microbial groups in the hot source pool would be of great interest, especially in light of its current temperature and radiation regime.

A single microbial study has been published to date about PHS (Anitori et al., 2002). The 16S

rDNA molecular analysis detected a rich and diverse bacterial community with representatives

from nine taxonomical groups - the β-Proteobacteria, δ-Proteobacteria, Cyanobacteria,

Firmicutes, Bacteroidetes/Chlorobi group, Nitrospira, Chloroflexi and two candidate divisions –

OP8 and OP12 (Anitori et al., 2002). The 16S rDNA clone library was composed mainly of

cyanobacterial and β-Proteobacteria sequences, and a few δ-Proteobacteria representatives.

Thermophilic bacteria were mainly affiliated with the Cyanobacteria group - a thermophilic

Oscillatoria amphigranulata was a dominant sequence and additional Oscillatoriales,

Chroococcales, and Nostocales at low sequence similarities were found in the temperature range

of 48°C to 60°C. Mastigocladus laminosus was the prevalent sequence in a sample taken at

53°C in the hot source pool. Mesophilic heterotrophic bacteria were affiliated with the β-Proteobacteria

and Nitrospira groups. The high bacterial diversity of PHS was demonstrated by the 180

different RFLP patterns that were detected during 16S rDNA molecular analysis of the PHS hot

source pool (Anitori et al., 2002).

It would be of interest to see whether the diversity of the diazotrophic community complements the 16S rDNA study, and whether our findings suggest that nitrogen fixation dynamics in PHS resemble those in the Yellowstone National Park hot springs (see introduction chapter, section 1.7). Based on the 16S rDNA study, and on past studies of hot springs such as those in Yellowstone National Park, one would expect to find evidence of diazotrophic cyanobacterial representatives (heterocystous or unicellular), as well as sulphate reducers and heterotrophic diazotrophs from the Nitrospira and β-Proteobacteria groups.

Thermal radioactive springs are unique habitats for exploring microbiological agents with

unique adaptations to this environment, as well as potential novel solutions for DNA

stabilization and repair mechanisms, chaperone proteins and internal modifications to important

proteins, and other molecular mechanisms. The main aim of this study was to explore the

diazotrophic community in PHS hot source pool and compare the results to the former

microbiological study of this unique site, as well as to other findings relating to thermophilic

diazotrophs. Nothing is known of the potential and active nitrogen fixers in the PHS bacterial

community, and it is of interest to compare this diazotrophic community characteristic to other

thermal microbial systems.


4.2 Materials and methods

4.2.1 Sample collection

Water levels were relatively low, and white evaporitic lines were evident on the rocks surrounding the pools at the time of sampling, the 11th of July 2009. The hot source pool (pool 1) was ~30 cm deep, and pool 2 was 1-2 m deep (figures 2 and 3). The hot source pool measured ~1 × 2 m in a roughly oval shape, with water and gas bubbles (radon) emerging through the sediment and gravel. Two samples, each in a 50 ml Falcon tube, were collected from several localities in PHS: the hot source pool (pool 1), the subsequent cooler pool (pool 2) and a point ~43 m downstream in the Hot Spring Creek. Samples taken from sites other than pool 1 were not analysed and are therefore not discussed here.

At the time of sampling, 15:00-18:00, the hot source pool temperature was 55.6°C and the pH was

neutral (pH 7). The temperature at the sampling sites was measured with a digital pocket

thermometer while the pH was recorded using pH test strips (range 0–14; Sigma, Australia).

Fifty ml sterile falcon tubes were used to bore into 2 different sections of the source pool

separated ~ 0.5 m from one another. Site one was right above the emergence of gas bubbles and

the sediment was composed of finely grained, soft, grey particles, entangled with brown-beige

coloured organic matter. The second site was further away from the gas source with no gas

bubbles present and included a layered mat, spongy in structure and texture, coloured yellow,

brown and dark green with less sediment particles than at site 1. The source pond was blocked

from two sides by big boulders and its water trickled around them, via the sediment into the

cooler pond (pool 2). After collection, all samples were placed in sterile specimen bags and kept

in the dark, at 4°C, during transport back to Arkaroola lodge for further processing. Five hours

after sampling, roughly 2 ml of the original samples were transferred, under flame, to an equal

volume of RNALater buffer (Ambion, Austin, TX) and kept in the dark, at 4°C, until further

processing.


Figure 2 PHS hot source pool on the 11th of July 2009. Water and gas bubbles (radon) emerge through the sediment just behind the right boulder.

Figure 3 PHS pool 2 on the 11th of July 2009, with floating microbial mats. (Image labels: Creek; Pool 1.)

4.2.2 DNA isolation and PCR amplification of nifH genes

Two different genomic DNA extraction methods were employed, each with three replicates

from the hot source pool samples. The total genomic DNA from three replicates was extracted

using the PowerPlant™ DNA Isolation Kit (MO BIO Laboratories, Inc., Carlsbad, CA)

following the manufacturer’s instructions.


Three replicates were from the low organic content locality with gas bubbles present, and three

replicates were from the distant locality, rich in organic content and without gas bubbles. Total genomic

DNA was also extracted using the XS DNA extraction method (Neilan, 1995; Tillett and

Neilan, 2000). Approximately 100 mg of homogenised environmental sample was transferred to

1.5 ml eppendorf tubes, which contained 500 μL XS extraction buffer (1% potassium-

methylxanthogenate; 800 mM ammonium acetate; 20 mM EDTA; 1% SDS; 100 mM Tris-HCl,

pH 7.4). Then 0.2 g of silicate beads (0.1 mm) was added to each tube, and lysis performed by

bead beating (BIO101/Savant FastPrep FP120, Qbiogene, Inc.), three times at the highest speed

(45 seconds at 6.5 m sec-1), followed by 2 hours of 65ºC incubation with intermittent vortexing.

Once properly lysed, the sample was placed on ice for 30 min or left overnight at -20ºC, after

which it was centrifuged at 14000 g for 20 min.

An equal volume of phenol: chloroform: isoamyl alcohol (25:24:1) was added to the

supernatant, then centrifuged at 14000 g for 5 min at 4ºC. The top layer of the supernatant was

again transferred to a fresh tube and DNA precipitated using 1 volume of isopropanol (or two

volumes of ice-cold absolute ethanol) and 0.1 volume of 4 M potassium acetate. The samples were

incubated at -20ºC for 2 hours, or overnight, and then centrifuged at 14000 g, at 4ºC, for 20 min

to pellet the DNA. The supernatant was discarded and the pellet washed with ice cold 70%

ethanol, followed by centrifugation at 14000 g. The DNA pellet was dried and resuspended in

30 μL of sterile TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0). The DNA concentration

was measured using a NanoDrop® ND-1000 spectrophotometer (Thermo Fisher Scientific,

Wilmington, DE).

The pooled genomic DNA extracted by each method (1 μL of a 5 ng μL-1 solution) was used separately as template in a nested PCR to amplify the nitrogenase gene nifH (Omoregie et al., 2004c), as described previously in chapter 3, section 3.2.2. All PCR experiments included a negative control reaction without DNA template and a positive control using DNA extracted from the cyanobacterial reference strain Nostoc PCC 7120.
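Bringing an extract to the 5 ng μL-1 template concentration is a standard C1V1 = C2V2 dilution. The sketch below is illustrative only: the function name, the stock concentrations and the final volume are assumptions, not values reported in this study.

```python
def dilution_volumes(stock_conc, target_conc, final_volume):
    """Return (stock volume, diluent volume) for a C1*V1 = C2*V2 dilution.

    Concentrations must share one unit (e.g. ng/uL) and the returned
    volumes are in the same unit as final_volume (e.g. uL).
    """
    if stock_conc <= 0 or target_conc <= 0 or final_volume <= 0:
        raise ValueError("concentrations and volume must be positive")
    if target_conc > stock_conc:
        raise ValueError("stock is more dilute than the target concentration")
    stock_volume = target_conc * final_volume / stock_conc
    return stock_volume, final_volume - stock_volume


# Hypothetical spectrophotometer reading of 42 ng/uL, diluted to 5 ng/uL
# in a final volume of 20 uL.
stock_ul, diluent_ul = dilution_volumes(42.0, 5.0, 20.0)
```

The function refuses to "dilute" an extract that is already below the target concentration, since such a sample would need concentrating instead.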

4.2.3 Clone library and Restriction Fragment Length Polymorphism (RFLP)

Freshly amplified PCR products of the nifH gene, containing an A-overhang at the 3’ end, were ligated into the pCR2.1 vector of the TOPO TA Cloning kit (Invitrogen Corporation, Carlsbad, CA) and transformed according to the manufacturer’s instructions. From each clone library, at least 50 positive (white) clones with the correct insert size (350 bp) were selected and their inserts amplified using the vector-specific primers MpF and MpR. PCR products of the correct size from positive clones were cleaned and visualized as described previously, in section 4.2.2.

Each clone was subjected to Restriction Fragment Length Polymorphism (RFLP) analysis and

was screened twice, using restriction enzymes ScrFI and MspI (New England Biolabs, Ipswich,

MA) separately. Each digest reaction contained 1.5 μL PCR products, 2 μL of the appropriate

enzyme buffer, 1 U of restriction enzyme and sterile MilliQ water to a total volume of 20 μL

and was incubated at 37°C overnight. The RFLP patterns were analysed manually after

electrophoresis on 2% and 3% agarose gels (molecular biology grade, Progen Pharmaceuticals,

QLD, Australia) with 1x TAE buffer, stained with ethidium bromide (1 μg ml-1) for 10-15 min

and visualized as described previously in section 4.2.2.
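The RFLP patterns in this study were scored manually from gels. As an illustration of what the two enzymes do, the following sketch predicts fragment lengths in silico, using the standard recognition sites of MspI (C^CGG) and ScrFI (CC^NGG). The test sequence is a hypothetical toy fragment, not a nifH insert, and the function names are illustrative.

```python
import re

# Recognition pattern and cut offset (bases from the start of the site).
ENZYMES = {
    "MspI": ("CCGG", 1),         # cuts C^CGG
    "ScrFI": ("CC[ACGT]GG", 2),  # cuts CC^NGG
}


def cut_positions(sequence, enzyme):
    """Return cut positions (0-based, top strand) for one enzyme."""
    pattern, offset = ENZYMES[enzyme]
    # A lookahead is used so that overlapping sites are all reported.
    return [
        match.start() + offset
        for match in re.finditer("(?=" + pattern + ")", sequence.upper())
    ]


def fragment_lengths(sequence, enzyme):
    """Return fragment lengths, 5' to 3', after a complete digest."""
    bounds = [0] + cut_positions(sequence, enzyme) + [len(sequence)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical 24 bp fragment with one MspI site and one ScrFI site.
toy_insert = "AAAACCGGTTTTTTCCAGGAAAAA"
mspi_fragments = fragment_lengths(toy_insert, "MspI")
scrfi_fragments = fragment_lengths(toy_insert, "ScrFI")
```

Clones whose sorted fragment lengths agree for both enzymes would fall into the same RFLP group. On a real gel, fragments of similar size are not resolved, so predicted and observed groupings may differ.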

4.2.4 DNA sequencing

Sequencing of selected clones was carried out using the PRISM Big Dye cycle sequencing

system with MPF or MPR primers (3.2 μM) and 3-60 ng of cleaned PCR product. After

sequencing reactions had been performed, the reaction was cleaned up and analysed as

described previously in chapter 3, section 3.2.4.

4.2.5 Phylogenetic analysis

Phylogenetic analysis was carried out as described previously in chapter 3, section 3.2.5.

4.2.6 Diversity, richness and coverage analysis

Translated nifH nucleotide sequences, 137 bp in length on average, were aligned using the

computer package “Muscle” version 3.8.31 and the clone library sampling coverage, diversity

and richness were calculated as described previously in chapter 3, section 3.2.6.

4.2.7 Accession numbers

Sequences of the nifH clones are available under GenBank accession numbers KC295666-

KC295692.


4.3 Results and discussion

4.3.1 BLAST & BLASTX comparative analysis

NifH genes were present and were amplified from Paralana hot source pool samples (pool 1,

figure 4).

Figure 4 Products obtained after the second step of the PCR amplification of nifH from PHS hot source pool DNA extractions using nested primers. Lane 1: PHS hot source pool; 2: negative control sterile MilliQ H2O; 3: positive control Nostoc PCC 7120; M: 0.5 μg μL-1 GeneRuler™ DNA Ladder Mix (Fermentas, Ontario).

Seventy-six clones containing the correctly sized insert (350 bp) were obtained and analysed.

RFLP analysis was performed on 64 random positive clones with the nifH nucleotide insert,

which grouped them into 7 groups (figure 5). Initially, three representatives of each RFLP

pattern were selected for sequencing, and due to high sequence variation and diversity,

additional positive clones were sequenced directly without further restriction enzyme treatment,

until clone library coverage was deemed sufficient for diversity and richness analysis.
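One common way to judge whether coverage is sufficient is Good's coverage estimator, C = 1 - n1/N, where n1 is the number of groups (OTUs or RFLP patterns) represented by a single clone and N is the total number of clones. Whether this exact estimator is the one described in chapter 3, section 3.2.6 is an assumption here, and the group sizes below are hypothetical.

```python
def goods_coverage(group_sizes):
    """Good's coverage: 1 - (groups seen exactly once / total clones)."""
    total_clones = sum(group_sizes)
    if total_clones == 0:
        raise ValueError("at least one clone is required")
    singletons = sum(1 for size in group_sizes if size == 1)
    return 1.0 - singletons / total_clones


# Hypothetical library: four groups containing 5, 3, 1 and 1 clones.
coverage = goods_coverage([5, 3, 1, 1])
```

Coverage approaches 1 as fewer newly sequenced clones fall into previously unseen groups, which gives a simple stopping rule for further sequencing.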

Figure 5 A 3% agarose gel showing RFLP patterns obtained with the ScrFI restriction enzyme on 9 positive clones from the PHS library. M: 0.5 μg/μL GeneRuler™ DNA Ladder Mix (Fermentas, Ontario).


BLAST and BLASTX results for all clones passed statistical significance thresholds, with BLAST expect (E) values ranging from e-33 to e-180 and BLASTX E values from e-47 to e-64.

Two representative PHS nifH clones were related to Geobacter lovleyi strain SZ, with high

sequence similarity in the BLAST results (accession ID CP001089, 99% sequence

similarity, table 2). At 98% sequence similarity there were a few clones related to Mastigocladus

laminosus CCMEE 5198 (EF570547, 98%) and uncultured nifH clones (EF568492, EF568489,

98%) from the Mediterranean Sea (Man-Aharonovich et al., 2007). These matches represented

less than a fifth of the clone library composition. The largest share of the hot source pool clone

library (43%, left pane, figure 8) was matched in BLAST with nifH genes remotely related to

G. lovleyi strain SZ (88%), Desulfovibrio magneticus RS-1 (AP010904, 85%) and uncultured

nifH clones from thermal, cold and saline environments, covering sediments, soil and coastal

samples (Zhang et al., 2008; Farnelid et al., 2009; Brown et al., 2010; Severin and Stal, 2010;

Singh et al., 2010).

Inferred NifH amino acid sequences clustered into eight different phyla and 15 different

diazotrophic genera based on the BLASTX analysis. The majority of sequences in our clone

library were affiliated with the δ-Proteobacteria (33%, figure 9) and Cyanobacteria (27%),

followed by β-Proteobacteria (13%), Bacteroidetes (Cytophaga-Flexibacter-Bacteroides group)

and Nitrospirae (9%). The γ-Proteobacteria, α-Proteobacteria and Firmicutes (low G+C Gram-positive

bacteria) were the smallest represented groups, at 4%, 2% and 2%, respectively.

The highest amino acid sequence similarity matches (98%-100%, table 2) included Azotobacter

vinelandii DJ (YP_002797378, 98%), Dechloromonas aromatica RCB (YP_284634, 100%),

Burkholderia sp. Ch1-1 (ZP_06839018, 100%) and Geobacter lovleyi SZ (YP_001951460,

100%). These sequences constituted slightly more than a fifth of the clone library composition.

The largest section in the clone library, 37%, included translated NifH sequences at 85%-89%

similarity levels to Desulfobulbus propionicus DSM 2033, Desulfatibacillum alkenivorans AK-

01, Desulfovibrio fructosovorans JJ and D. magneticus RS-1 (δ-Proteobacteria), Paludibacter

propionicigenes WB4 (Bacteroidetes) and Thermodesulfovibrio yellowstonii DSM 11347

(Nitrospirae, see figure 9, right pane). Cyanobacterial matches included members of the

Nostocales (Nostoc sp. PCC 7120, Anabaena variabilis ATCC 29413), Chroococcales

(Cyanothece sp. CCY0110) and Oscillatoriales (Oscillatoria sp. PCC 6506) at an average

sequence similarity of 95%.


Figure 8: BLASTN (left) and BLASTX (right) sequences similarity levels, within the PHS hot source pool clone library.

Figure 9 Phyla distribution within PHS hot source pool NifH clone library, based on the BLASTX analysis.


Table 2 Paralana hot source pool BLAST and BLASTX results, presenting only the highest match for each sequence. A total of 46 clones were sequenced and blasted. Highest BLAST and BLASTX sequence similarities (98-100%) were marked bold.

Sequence file ID | Clone ID | Nearest relative in GenBank | BLAST match accession ID | Sequence similarity (%) | BLASTX match accession ID | Nearest bacterial Fe protein match in GenBank | Sequence similarity (%) | Phylum
RSA158 | HS1 | Uncultured nifH clone | EU916413 | 97 | YP_004196476 | Desulfobulbus propionicus DSM 2033 | 88 | δ-Proteobacteria
RSA159 | HS2 | " | EF178501 | 88 | YP_002797378 | Azotobacter vinelandii DJ | 98 | γ-Proteobacteria
RSA160 | HS3 | " | AY763451 | 78 | YP_002249508 | Thermodesulfovibrio yellowstonii DSM 11347 | 88 | Nitrospirae
RSA162 | HS5 | " | AY763455 | 84 | " | " | 83 | Nitrospirae
RSA163 | HS6 | Mastigocladus laminosus CCMEE 5198 | EF570547 | 98 | NP_485497 | Nostoc sp. PCC 7120 | 97 | Cyanobacteria
RSA164 | HS7 | Uncultured nifH clone | AY763451 | 85 | YP_002249508 | Thermodesulfovibrio yellowstonii DSM 11347 | 85 | Nitrospirae
RSA165 | HS8 | " | EU916413 | 89 | YP_002430688 | Desulfatibacillum alkenivorans AK-01 | 89 | δ-Proteobacteria
RSA166 | HS9 | " | " | 89 | " | " | 89 | δ-Proteobacteria
RSA167 | HS10 | " | " | 86 | " | " | 86 | δ-Proteobacteria
RSA168 | HS11 | " | " | 87 | YP_004196476 | Desulfobulbus propionicus DSM 2033 | 87 | δ-Proteobacteria
RSA169 | HS12 | " | " | 86 | " | " | 86 | δ-Proteobacteria
RSA170 | HS13 | Oscillatoriales cyanobacterium JSC-1 nifH | FJ797416 | 95 | YP_324741 | Anabaena variabilis ATCC 29413 | 95 | Cyanobacteria
RSA171 | HS14 | " | " | 95 | | | 96 | Cyanobacteria
RSA172 | HS15 | " | " | 97 | ZP_07112556 | Oscillatoria sp. PCC 6506 | 95 | Cyanobacteria
RSA173 | HS16 | Uncultured nifH clone | EF178501 | 89 | YP_002797378 | Azotobacter vinelandii DJ | 98 | γ-Proteobacteria
RSA174 | HS17 | " | AY763451 | 86 | YP_002249508 | Thermodesulfovibrio yellowstonii DSM 11347 | 86 | Nitrospirae
RSA191 | HS10 | Oscillatoriales cyanobacterium JSC-1 nifH | FJ797416 | 97 | ZP_07112556 | Oscillatoria sp. PCC 6506 | 94 | Cyanobacteria
RSA192 | HS18 | Uncultured nifH clone | AY196418 | 89 | YP_284634 | Dechloromonas aromatica RCB | 95 | β-Proteobacteria
RSA193 | HS30 | " | EF568492 | 89 | YP_003447953 | Azospirillum sp. B510 | 92 | α-Proteobacteria
RSA194 | HS31 | " | GU117600 | 94 | YP_284634 | Dechloromonas aromatica RCB | 100 | β-Proteobacteria
RSA195 | HS34 | " | EU622627 | 86 | YP_004042554 | Paludibacter propionicigenes WB4 | 87 | Bacteroidetes
RSA198 | HS15 | " | EF568492 | 98 | ZP_06839018 | Burkholderia sp. Ch1-1 | 100 | β-Proteobacteria
RSA203 | HS32 | Oscillatoriales cyanobacterium JSC-1 nifH | FJ797416 | 96 | ZP_07112556 | Oscillatoria sp. PCC 6506 | 95 | Cyanobacteria
RSA204 | HS41 | " | " | 94 | " | " | 95 | Cyanobacteria
RSA205 | HS20 | Uncultured nifH clone | EF568489 | 95 | YP_004042554 | Paludibacter propionicigenes WB4 | 79 | Bacteroidetes
RSA206 | HS22 | Oscillatoriales cyanobacterium JSC-1 nifH | FJ797416 | 94 | ZP_07112556 | Oscillatoria sp. PCC 6506 | 95 | Cyanobacteria
RSA207 | HS35 | Uncultured nifH clone | EU915063 | 86 | ZP_07333883 | Desulfovibrio fructosovorans JJ | 86 | δ-Proteobacteria

R

SA20

8 H

S42

" D

Q39

8449

91

Y

P_28

4634

D

echl

orom

onas

aro

mat

ica

RCB

95

β-

Prot

eoba

cter

ia

RSA

209

HS4

4 "

EF5

6849

2 98

ZP

_068

3901

8 Bu

rkho

lder

ia sp

. Ch1

-1

100

β-Pr

oteo

bact

eria

R

SA21

2 H

S29

Osc

illat

oria

les c

yano

bact

eriu

m

JSC

-1 n

ifH

FJ79

7416

94

ZP

_071

1255

6 O

scill

ator

ia sp

. PC

C 6

506

94

Cya

noba

cter

ia

RSA

213

HS4

6 U

ncul

ture

d n

ifH c

lone

EF

5684

89

98

ZP_0

6839

018

Burk

hold

eria

sp. C

h1-1

10

0 β-

Prot

eoba

cter

ia

RSA

214

HS4

8 G

eoba

cter

lovl

eyi S

Z C

P001

089

88

YP_

0019

5146

0 G

eoba

cter

lovl

eyi S

Z 95

δ-

Prot

eoba

cter

ia

RSA

215

HS4

9 U

ncul

ture

d n

ifH c

lone

EU

9150

63

85

ZP_0

7333

883

Des

ulfo

vibr

io fr

ucto

sovo

rans

JJ

87

δ-Pr

oteo

bact

eria

R

SA21

7 H

S8

Geo

bact

er lo

vley

i SZ

CP0

0108

9 99

Y

P_00

1951

460

Geo

bact

er lo

vley

i SZ

100

δ-Pr

oteo

bact

eria

R

SA21

8 H

S11

" “

99

“ “

100

δ-Pr

oteo

bact

eria

R

SA21

9 H

S2.1

D

esul

fovi

brio

mag

netic

us R

S-1

AP0

1090

4 85

Y

P_00

2953

433

Des

ulfo

vibr

io m

agne

ticus

RS-

1 88

δ-

Prot

eoba

cter

ia

RSA

220

HS2

.4

Osc

illat

oria

les

cyan

obac

teriu

m JS

C-1

nifH

FJ

7974

16

94

ZP_0

1727

765

Cya

noth

ece

sp. C

CY

0110

94

C

yano

bact

eria

Page 104: Nitrogen fixing potential in extreme environments - UNSWorks

95

Sequ

ence

fil

e ID

C

lone

ID

N

eare

st re

lativ

e in

Gen

Ban

k

BLA

ST m

atch

ac

cess

ion

ID

Sequ

ence

Si

mila

rity

(%)

BLA

STX

mat

ch

acce

ssio

n ID

Nea

rest

bac

teria

l Fe

prot

ein

mat

ch in

Gen

Ban

k

Sequ

ence

si

mila

rity

(%)

Phyl

um

R

SA22

0 H

S2.4

O

scill

ator

iale

s cya

noba

cter

ium

JS

C-1

nifH

FJ

7974

16

94

ZP_0

1727

765

Cya

noth

ece

sp. C

CY

0110

94

C

yano

bact

eria

RSA

221

HS2

.19

" “

96

ZP_0

7112

556

Osc

illat

oria

sp. P

CC

650

6 95

C

yano

bact

eria

R

SA22

2 H

S2.2

3 G

eoba

cter

lovl

eyi S

Z C

P001

089

99

YP_

0019

5146

0 G

eoba

cter

lovl

eyi S

Z

100

δ-Pr

oteo

bact

eria

R

SA22

3 H

S2.4

3 "

“ 99

Y

P_00

1950

896

“ 10

0 δ-

Prot

eoba

cter

ia

RSA

224

HS2

.16

Unc

ultu

red

nifH

clo

ne

GU

1934

72

88

YP_

0040

4255

4 Pa

ludi

bact

er p

ropi

onic

igen

es

WB

4 88

B

acte

roid

etes

RSA

225

HS2

.27

Osc

illat

oria

les c

yano

bact

eriu

m

JSC

-1 n

ifH

FJ79

7416

94

ZP

_017

2776

5 C

yano

thec

e sp

. CC

Y01

10

95

Cya

noba

cter

ia

RSA

226

HS2

.34

Unc

ultu

red

nifH

clo

ne

AY

2240

41

87

YP_

0029

5343

3 D

esul

fovi

brio

mag

netic

us R

S-1

89

δ-Pr

oteo

bact

eria

R

SA22

7 H

S2.4

6 "

GU

1938

22

77

YP_

0036

3945

8 Th

erm

inco

la sp

. JR

95

Fi

rmic

utes

R

SA22

8 H

S2.4

7 "

GU

1934

72

89

ZP_0

7333

883

Des

ulfo

vibr

io fr

ucto

sovo

rans

JJ

88

δ-Pr

oteo

bact

eria

R

SA22

9 H

S2.5

6 "

“ 89

Y

P_00

4042

554

Palu

diba

cter

pro

pion

icig

enes

W

B4

88

Bac

tero

idet

es

Page 105: Nitrogen fixing potential in extreme environments - UNSWorks

96

4.3.2 Phylogenetic analysis

A total of 256 NifH amino acid sequences (137 AA length) were subjected to a maximum

likelihood analysis and produced a phylogenetic tree with four major clusters, corresponding to

NifH designated clusters I-IV, previously described in chapter 3, section 3.3.5 (Chien and

Zinder, 1996; Zehr et al., 2003a; Raymond et al., 2004a).

Cluster I, which comprises the conventional Mo-containing NifH sequences, contained eight sub clusters and had a support value of 90 within the entire NifH tree (figure 10). All of its sub clusters had support values above 74, indicating a well-supported tree topology. Two distinct clusters were outgrouped to cluster I, at branch support values of 59 (Nitrospirae) and 98 (Desulfuromonadales, δ-Proteobacteria). PHS clones were not affiliated with cluster II or IV. NifH cluster III had a support value of 79 within the entire NifH tree and included anaerobic diazotrophs within five sub clusters, all with branch support values above 72.

A total of 20 NifH PHS clones were affiliated with cluster I, with the Cyanobacteria and the α-, β- and γ-Proteobacteria (figure 11). Ten clones formed their own tight group related to cyanobacterial genera, namely Oscillatoria sp. PCC 6506 and Cyanothece sp. CCY0110 (sub cluster 1-Cyan-B). One clone clustered closely with the Nostocales and Mastigocladus laminosus (Q47917 accession ID), in sub cluster 1-Cyan-C. Three clones were closely related to the Burkholderia spp. (β-Proteobacteria, 1-Prot-αβ sub cluster). A single clone, RSA193-HSP09, nestled individually between the 1-Prot-αβ sub cluster and Paenibacillus azotofixans (Firmicutes, Q9AKT8). The BLASTX analysis indicated this clone was related to Azospirillum sp. B510, α-Proteobacteria, at 92% sequence similarity (table 2). In the sub cluster 1-Prot-β -A, two clones were affiliated with Azoarcus communis, and a single clone with Dechloromonas aromatica strain RCB (β-Proteobacteria, Q79AX4 and Q47G67, respectively). In the sub cluster 1-Prot-γ-C, two clones were closely related to the Azotobacter spp.


Figure 10: Cluster I and Cluster III positions within the three main clusters of the nifH phylogenetic tree. Topology was based on Maximum-likelihood analysis of nifH amino acid sequences. Cluster I was outgrouped by Nitrospirae, and Cluster III was outgrouped by Roseiflexus spp. (Chloroflexi).

During the BLASTX analysis, several PHS clones had 85-100% sequence similarity to unverified NifH sequences (table 2). These included, for instance, Thermodesulfovibrio yellowstonii DSM 11347 and NifH sequences from the δ-Proteobacteria Pelobacter and Geobacter spp. Though these sequences were not manually annotated or verified in the Swiss-Prot database, they were nevertheless integrated into the phylogenetic analysis to provide an unbiased view (figure 11). Four NifH clones clustered with T. yellowstonii, a thermophilic sulphate-reducing organism isolated from a thermal vent in Yellowstone Lake in Wyoming, USA (Henry et al., 1994). An additional five NifH clones clustered with the Desulfuromonadales order, as these clones were matched to Geobacter lovleyi strain SZ NifH sequences at 99-100% sequence similarity (CP001089, YP_001951460, YP_001950896, table 2). The T. yellowstonii and Geobacter spp. NifH sequences, together with their affiliated clones, clustered separately from one another, forming distinct groups outside of cluster I (figure 11).
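The percentage values quoted from the BLAST and BLASTX results can be illustrated with a small sketch. It assumes that "sequence similarity" is read as percent identity over aligned residue pairs, and it skips gap columns, which may differ from how BLAST itself normalises the value. The function name and both sequence fragments are hypothetical and are not sequences from this study.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over aligned columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are ignored in this sketch
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


# Hypothetical aligned amino acid fragments (illustration only).
query = "AIYGKGGIGKSTTSQNTL-AAL"
subject = "AIYGKGGIGKSTTTQNLLVAAL"
print(f"identity: {percent_identity(query, subject):.1f}%")
```

A clone and its nearest database match would each be reduced to one such number per alignment, which is why a BLASTX assignment and a tree-based assignment can disagree: the tree uses the whole set of sequences at once rather than a single best pairwise score.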

NifH cluster III, which included the anaerobic diazotrophs (figure 12), contained five sub clusters with support values >72. A total of 16 NifH PHS clones were affiliated with δ-Proteobacteria, Spirochaetes, Firmicutes and Bacteroidetes. Six clones formed a tight group within sub cluster 3-Prot-δ-B (δ-Proteobacteria), remotely related to NifH sequences from Desulfobulbus propionicus DSM 2033 and Desulfatibacillum alkenivorans AK-01 (88% BLASTX sequence similarity, table 2). An additional three clones in this sub cluster were closely related to Desulfovibrio gigas (P71156), while in sub cluster 3-Spir-A, five clones were affiliated with Treponema and Spirochaeta spp. (Spirochaetes).

Figure 11: Phylogenetic distribution of cluster I based on Maximum-likelihood analysis of partial NifH amino acid sequences. Sequences determined in this study were given an alphanumeric prefix RSAX-HSP09 and are marked bold; number of clones for each sequence is in parenthesis. Branch support values are shown for key branches; only values > 50 were considered significant. Text boxes contain designation of clusters and in parenthesis is the closest sub cluster nomination as per (Zehr et al., 2003a). Prot=Proteobacteria, Cyan=Cyanobacteria, Firm=Firmicutes. The scale bar represents the number of substitutions per 100 bases. Outgroup was Desulfuromonadales (δ-Proteobacteria) nifH sequences from Geobacter and Pelobacter genera.

A single clone (RSA205-HSP09) nestled individually between two sub clusters, 3-Firm-Arch and 3-Prot-δ-B. The BLASTX analysis suggested it was only remotely related to Paludibacter propionicigenes WB4 (Bacteroidetes), with 79% NifH sequence similarity. An additional single clone was affiliated with Thermincola sp. JR (YP_003639458) and the family Peptococcaceae (Firmicutes) in the sub cluster 3-Firm-Arch.

Overall, Cyanobacteria and δ-Proteobacteria contributed the main NifH sequences to the clone library (figure 13). The Spirochaetes-affiliated clones were detected only in the phylogenetic analysis, whereas in the BLASTX analysis those sequences were matched to δ-Proteobacteria and Bacteroidetes representatives, at 88% sequence similarity. Other shifts

occurred within the assignments to Firmicutes and α- and -Proteobacteria, as would be

expected, since the phylogenetic analysis employs different assumptions and algorithms in its

calculation, in comparison to the BLASTX analysis (see chapter 3, section 3.3.5, for further

discussion regarding this point).

[Figure 11 image. Text-box labels: 1-Cyan-B & C (1B); 1-Firm-A (1D); 1-Prot-β -A (1P); 1-Prot-γ-C (1H, 1T, 1L, 1U, 1M); 1-Prot-αβ (1J, 1K); 1-Cyan-A (1B)]

[Figure 12 image. Text-box labels: 3-Firm-Arch (3C, 3D, 3A); 3-Prot-δ-GS (3L, 3T); 3-Prot-δ-B (3B, 3E, 3L); 3-Prot-δ-A (3P,; 3-Spir-A (3P, 3L)]

Figure 12: Phylogenetic distribution of PHS hot source clones in cluster III based on Maximum-likelihood analysis of NifH partial amino acid sequences. Sequences from this study (alphanumeric prefix RSAX-HSP09) are marked bold; branch support values are shown for key branches; only values > 50 were considered significant. Spir=Spirochaetes, Arch=Archaea. The scale bar represents the number of substitutions per 100 bases.


4.3.3 Coverage, diversity and community richness

The “Mothur” program was employed to calculate the ecological parameters and coverage estimators, as detailed in the previous chapter. As evident from the collector's curve (figure 14), the number of unique NifH amino acid sequences did not reach a plateau at the 99% phylotype cutoff (based on the furthest neighbour algorithm and a distance precision of 0.01; Schloss and Handelsman, 2005), but did so at the 93% phylotype cutoff. The estimated coverage by the method of Good (1953) was above 75% at the 98% phylotype cutoff. Therefore, the coverage of potential diazotrophs in the PHS hot source pool was sufficient but not complete, and the clone library was mostly representative of the diazotrophic diversity. Coverage and collector's curves suggested high diazotrophic diversity and potential richness of the NifH species present in the hot source pool.
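A collector's curve of the kind discussed here can be sketched as a running count of distinct OTUs as sequences are added in sampling order; a curve that flattens indicates that further sampling adds few new OTUs. This is a minimal illustration and not Mothur's implementation. The function name and the OTU labels are hypothetical.

```python
def collectors_curve(otu_labels):
    """Return the cumulative number of distinct OTUs after each sampled sequence."""
    seen = set()
    curve = []
    for label in otu_labels:
        seen.add(label)
        curve.append(len(seen))
    return curve


# Hypothetical sampling order: each entry is the OTU assigned to one clone.
sampling_order = ["A", "B", "A", "C", "A", "B", "D", "A", "C", "A"]
for n_sampled, n_otus in enumerate(collectors_curve(sampling_order), start=1):
    print(n_sampled, n_otus)
```

Clustering at a looser phylotype cutoff merges labels into fewer OTUs, so the same sampling order reaches its plateau sooner, which matches the behaviour described for the 93% versus 99% cutoffs.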

Figure 13: Phyla percentile representation from PHS hot source pool clone library between NifH cluster I (clear slices) and cluster III (shaded slices). Pane A) Phyla distribution based on the BLASTX results. Pane B) Phyla distribution based on the phylogenetic analysis. In bold – The two main phyla per cluster.


Figure 14: Collector’s curves for taxa (OTUs) with minimum thresholds of 99(0.01), 98(0.02), 97(0.03), 96(0.04), 95% phylotype cutoff and lower, based on NifH partial amino acid sequences.

The PHS hot source pool diazotrophic community included 20 OTUs (Operational Taxonomic Units) at a 98% phylotype threshold. The species richness estimated by the Chao1 non-parametric estimator was 33.75 (23.40-75.55, 95% CI), indicating that, when sampled to completion, between 14 and 56 more NifH species would be obtained (table 3, at 98% phylotype cutoff). The Shannon-Wiener index of diversity (D) ranged from 2.35 to 3.12 between the 91% and 100% phylotype cutoffs.

Table 3: Coverage, observed phylotype richness and diversity indices for PHS hot source pool clone libraries, based on NifH partial amino acid sequences

|  | Phylotype cutoff (%) | Sequence length analysed | Coverage, Good (%) (a) | OTUs | Richness index, Chao1 (b) (95% CI) | Diversity index, Shannon–Wiener (c) (95% CI) |
|---|---|---|---|---|---|---|
| PHS 2009 | 100 | 123 AA | 53.33 | 28 | 133.00 (59.46-378.48) | 3.12 (2.86-3.38) |
|  | 99 |  | 64.44 | 24 | 54.00 (32.72-127.19) | 2.88 (2.60-3.16) |
|  | 98 |  | 75.56 | 20 | 33.75 (23.40-75.55) | 2.68 (2.41-2.96) |
|  | 95 |  | 82.22 | 17 | 24.00 (18.45-50.75) | 2.53 (2.27-2.78) |
|  | 93 |  | 86.67 | 15 | 20.00 (15.86-43.91) | 2.42 (2.18-2.66) |
|  | 91 |  | 88.89 | 14 | 17.33 (14.50-36.07) | 2.35 (2.11-2.59) |

Abbreviations: CI, confidence interval; OTUs, operational taxonomic units. (a) The coverage index was calculated by the method of Good (1953). (b) The richness index was calculated by the method of Chao et al. (1993). (c) The diversity index was calculated by the method of Shannon–Wiener (Krebs, 1989).
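The three estimators in Table 3 can be sketched from a list of per-OTU clone counts. The sketch assumes the bias-corrected form of Chao1 and the natural logarithm for the Shannon index; the text does not state which variants were used. The OTU abundances below are hypothetical, since the actual abundances are not given. They were chosen only so that the numbers of singletons and doubletons give the coverage and Chao1 point values of the 98% row, which is consistent with, but does not establish, the bias-corrected form.

```python
import math


def goods_coverage(counts):
    """Good's coverage: 1 - (singleton OTUs / total sequences)."""
    total = sum(counts)
    singletons = sum(1 for c in counts if c == 1)
    return 1.0 - singletons / total


def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + f1*(f1-1) / (2*(f2+1))."""
    s_obs = sum(1 for c in counts if c > 0)
    f1 = sum(1 for c in counts if c == 1)  # OTUs seen exactly once
    f2 = sum(1 for c in counts if c == 2)  # OTUs seen exactly twice
    return s_obs + f1 * (f1 - 1) / (2.0 * (f2 + 1))


def shannon(counts):
    """Shannon index with natural logarithm: -sum(p_i * ln p_i)."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical abundances: 11 singletons, 3 doubletons and 6 larger OTUs.
otu_counts = [1] * 11 + [2] * 3 + [3, 3, 4, 5, 6, 7]

print("observed OTUs:", len(otu_counts))
print("coverage (%):", round(100 * goods_coverage(otu_counts), 2))
print("Chao1:", chao1(otu_counts))
print("Shannon:", round(shannon(otu_counts), 2))
```

The difference between the Chao1 estimate and the observed OTU count is the number of additional species expected under exhaustive sampling, which is the quantity the text refers to when it discusses how many more NifH species would be obtained.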

4.3.4 Nitrogen fixation in Paralana Hot Springs

The diversity analysis demonstrated high diazotrophic diversity and richness in the PHS hot source pool. The number of NifH clones analysed and sequenced in this study (76) represents the highest number of NifH clones from a single hot spring to be analysed to date (Hamilton et al., 2011a). There is a strong potential for active nitrogen fixers, yet our attempts to identify actively transcribing species were unsuccessful, owing mainly to limited material availability (see section 4.2.1).

The DNA extraction and nifH gene amplification were successful. In summary, the best matches in the BLAST analysis were to the G. lovleyi strain SZ nifH gene, M. laminosus CCMEE 5198 and a few uncultured nifH clones from a sea water sample in the Mediterranean Sea (Man-Aharonovich et al., 2007). The heterocystous cyanobacterium M. laminosus CCMEE 5198 is a moderately thermophilic bacterium that is found in many hot springs worldwide (Miller et al., 2007) and was previously reported from PHS (Anitori et al., 2002). However, the BLASTX analysis recognised this specific clone as the heterocystous Nostoc sp. PCC 7120, at 97% sequence similarity. Even at lower similarities in the BLASTX results, the M. laminosus nifH sequence was not suggested as a possible match (data not shown), and we concluded that this clone was most probably a NifH sequence from Nostoc sp. PCC 7120. Additionally, a common match in the BLAST and BLASTX analyses was the mesophilic, strictly anaerobic G. lovleyi, with high sequence similarity scores. G. lovleyi is a known metal reducer and dechlorinating agent that has been studied extensively for its capabilities in bioremediation of pollutants (Sung et al., 2006). This finding in the hot source pool of PHS is of interest, especially if future work can verify that it is an active nitrogen fixer. Cyanobacteria and δ-Proteobacteria were the main diazotrophic taxa present in the PHS hot source pool, according to the BLASTX analysis.

Phylogenetic analysis provided additional interesting results, mainly in relation to the cluster I vs. cluster III affiliations. The overall tree topology included high likelihood branches, which were the result of choosing verified reference NifH sequences to work with, as well as a highly optimised amino acid substitution matrix and phylogeny algorithms. The tree topology in general was similar to previously reported NifH phylogeny trees (Zehr et al., 1997; Zehr et al., 2003a), and the PHS hot source pool sequences were divided between two main clusters, I and III, and additional outgroups.

The PHS α- and β-Proteobacteria NifH representatives were closely related to Burkholderia and Azospirillum in cluster I. These genera are considered mesophilic and are routinely found to fix nitrogen in rhizosphere and soil environments (Okon, 1985; Garrity et al., 2005). Finding such traces in the hot source pool is not surprising, as it is an open pool subject to various influences from nearby soil areas. The DNA fragments may represent adjacent bacteria that landed in the sampling area but do not actively fix nitrogen. The β-Proteobacteria representatives were related to Azoarcus and Dechloromonas from cluster I, and BLAST analysis indicated that the matching sequences were originally isolated from an estuary, an Antarctic microbial mat (Moisander et al., 2007; Jungblut and Neilan, 2010) and a Yellowstone National Park hot spring (Hall et al., 2008). This could point to potential active nitrogen fixers that have the capability to adapt to various temperatures. The γ-Proteobacteria NifH clones were affiliated with the Mo-dependent Azotobacter, a well-studied mesophilic soil bacterium that fixes nitrogen under microaerobic conditions, with an optimal pH for diazotrophic growth of 7.0-7.5 (Dixon and Kahn, 2004; Garrity et al., 2005), again pointing to the possible introduction of this genus from nearby soil or rhizosphere areas. Half of the hot source pool cluster I NifH sequences were affiliated with the Cyanobacteria, a well-known resilient group of microorganisms reported from virtually every extreme environment on Earth, which are the best candidates to be the active nitrogen fixers in this unique ecosystem (Whitton and Potts, 2000; Pandey et al., 2004; Thomas, 2005; Kaštovský and Johansen, 2008).

Sulphate reducers were another prominent finding in the hot source pool and were affiliated with cluster III. The PHS δ-Proteobacteria NifH representatives were closely related to the anaerobic Desulfobulbus, Desulfatibacillum and Desulfovibrio genera. Desulfobulbus spp. are found in diverse environments, including deep sea methane vents and arsenic-rich, ferruginous shallow marine hydrothermal sediments (Pernthaler et al., 2008; Handley et al., 2010). Desulfatibacillum spp. were recently found in oil deposits and wellheads from high-temperature oil wells (74 °C), and nifH fragments were also found in acidic, low-temperature peat bogs (Zadorina et al., 2009; Yamane et al., 2011). D. gigas has rarely been reported from thermal environments, yet it is known to fix nitrogen (Gall, 1963; Riederer-Henderson and Wilson, 1970; Steppe and Paerl, 2002).

PHS NifH clones were also affiliated with bacteria from the Spirochaetes group. Treponema and Spirochaeta spp., which are obligate anaerobes, are commonly found in hot and thermal environments, with optimum growth temperatures of up to 60°C in certain species (Patel et al., 1985; Paster et al., 1991; Weller et al., 1992). They are known contributors to the global nitrogen cycle, with high N2 fixation rates of up to 5 ng of N2 per hour (Lilburn et al., 2001). A single clone was affiliated with an anaerobic Thermincola member of the Firmicutes phylum. The closest phylogenetic relatives of this genus, Desulfosporosinus and Desulfotomaculum, were found to fix nitrogen in soil and termite guts (Postgate, 1982; Roesch et al., 2010). It is of interest to note that this thermophilic, alkali-tolerant genus was isolated from a hot spring in the Baikal Lake region (Sokolova et al., 2005), and its nitrogen fixation capabilities under various temperature conditions are currently unknown.

Two outgroups to cluster I were the T. yellowstonii and Geobacter spp. NifH sequences (figure 11). Certain strains of Geobacter were shown to fix atmospheric nitrogen under anaerobic conditions (Bazylinski et al., 2000; Methé et al., 2005); however, this is the first report of Geobacter nifH gene fragments from a hot environment. Potential nifH sequences were identified in a few other Geobacter spp. genomes, and they are at different stages of verification in the databases, hence most were not included in the tree (these were: G. sulfurreducens, G. bemidjiensis, G. metallireducens, G. sp. M21, G. sp. FRC-32, G. uraniireducens, G. sp. M18 and G. daltonii; NCBI nucleotide database, 2012). Furthermore, a thermophilic isolate of Geothermobacter ehrlichii, of the same family (Geobacteraceae), was isolated from hydrothermal vents and grew between 35°C and 65°C, with an optimum growth temperature of 55°C, suggesting that thermophilic adaptations are quite possible within members of the Geobacteraceae (Kashefi et al., 2003). The above data suggest that a thermophilic nitrogen fixer of the Geobacter genus might be active in the PHS hot source pool.

The thermophilic T. yellowstonii DSM 11347 NifH sequence was obtained from a complete genome sequencing project submitted directly to the NCBI databases (GenBank ID CP001147.1, BioProject ID PRJNA30733) and is unverified by any other source. To our knowledge, this is the first report of NifH from this species from a hot environment. Nothing is known of its true nitrogen fixation capabilities. It is also interesting that this thermophilic group of sulphate reducers (Geobacter, Thermodesulfovibrio) does not cluster within cluster III with other sulphate reducers. We estimate that, as additional genome sequencing projects are completed, a thermophilic NifH cluster will further establish itself separately from other NifH clusters. This is mainly because the temperature regime would impose changes on the nitrogenase characteristics in order for it to remain functional under high temperatures. These changes will probably be reflected in the amino acid sequence and the nifH gene sequence, effectively producing a new cluster in the tree topology.

In the past, PHS system was found to harbour high bacterial diversity based on a 16S rDNA

molecular analysis (Anitori et al., 2002). In the same study, 180 different RFLP patterns were

detected across all samples, and the Shannon-Wiener diversity estimator ranged from 0.57-3.85,

with most samples showing values higher than 2.5. Our study echoed that diversity with a high

Shannon-Wiener diversity range of 2.35 - 3.12. Only one study has provided this specific

estimator for thermophilic diazotrophs, and at the moment, these are the highest values reported

from a thermophilic environment. Hydrothermal vents, at 20°C to 78°C temperature range,

reported diversity estimators of 1.8 to 2.2 (Mehta et al., 2003), while non thermophilic studies

produced diversity estimators as high as 2.92 and as low as 1.02 in comparison (Izquierdo and

Nüsslein, 2006; Roesch et al., 2010). Diversity studies in the geothermal springs of Yellowstone

National Park (Hamilton et al., 2011a), have not provided this specific diversity estimator for

the NifH clones, yet 13 hot springs were found to harbour 2-12 unique phylotypes at a sequence identity threshold of 99%. In this study, at the same sequence identity threshold, the number of unique OTUs was 24 (table 3), pointing to a potentially higher diazotrophic diversity.

A substantial increase in -Proteobacteria sequences was evident in our study, in comparison to

the published 16S rDNA analysis (Anitori et al., 2002). Also, we have identified

representatives of the Spirochaetes for the first time, and did not find any NifH Chloroflexi

related clones, though that group of bacteria was reported previously (Anitori et al., 2002).

There were no exact taxonomical matches in the -, -Proteobacteria groups between the

studies. However, there were several 16S rDNA sequences affiliated at 95% similarity, to a T.

islandicus, originally isolated from Icelandic hot springs (Sonne-Hansen and Ahring, 1999), and

of the same genus as T. yellowstonii strain DSM 11347, which was detected in our study

(Anitori et al., 2002). In a similar fashion, a few 16S rDNA sequences were also remotely related to Pelobacter carbinolicus DSM 2380 and to Desulfuromonas spp. (87% and 84%, respectively), from the Desulfuromonadales order. Our study has reported NifH clones affiliated with P. carbinolicus DSM 2380 and Geobacter spp. from the same order.

We did not measure nitrogen fixing rates in this study, and we were unable to confirm active nitrogen fixers. However, the literature points to common species that are repeatedly detected in thermal springs around the world, some of which are known to actively fix nitrogen. For instance, our analysis and the previous 16S rDNA analysis (Anitori et al., 2002) suggested that heterocystous diazotrophic Nostocales, specifically Nostoc PCC 7120 and Anabaena variabilis ATCC 29413, were present in the hot source pool. Both species are aerobic nitrogen fixers, usually during light periods (Stewart, 1973), and they were also detected in hot springs in Japan at 70°C, though it was not mentioned whether they actively fixed nitrogen (Watanabe and Yamamoto, 1971). These facts make them likely candidates to be active nitrogen fixers in the PHS system. In a

similar fashion, unicellular, filamentous and non-heterocystous Cyanobacteria found in our

study, such as Oscillatoria sp. PCC 6506 and Cyanothece sp. CCY0110, tend to fix nitrogen

aerobically during dark periods in order to avoid potential oxygen damage to the nitrogenase

complex (Stal and Krumbein, 1987; Reddy et al., 1993; Schneegurt et al., 1994; Berman-Frank

et al., 2003). A large thermophilic Oscillatoriales group was present at the Zerka Ma’in hot

springs at 59°C - 63°C (Ionescu et al., 2010). An interesting finding in a sulphide-rich hot

spring microbial mat (54°C), included a thermophilic Oscillatoria terebriformis, which was

found to move vertically along the sulphide gradients in the mat, from oxic to anoxic conditions

(Richardson and Castenholz, 1987). In addition, some strains of Oscillatoria exhibited reduced nitrogenase activity during light periods (in vivo) when grown heterotrophically, yet when grown anaerobically, they were able to fix nitrogen during the light period as well (Stal and Heyer, 1987; Gallon et al., 1991). The nitrogen fixing and motility capabilities of the Oscillatoria group, and their repeated presence in hot environments, suggest that the thermophilic Oscillatoria spp. present in the PHS hot source pool would be likely candidates to be active nitrogen fixers. We would also suggest that anaerobic sulphate-reducing δ-Proteobacteria, Desulfovibrio spp., would potentially be active nitrogen fixers in the hot source pool. Evidence for anaerobic nitrogen fixation has been reported from various hot sources: a 63°C hot spring in Jordan (Steppe and Paerl, 2002) and 50°C to 60°C alkaline springs in Yellowstone National Park (Wickstrom, 1984; Oren et al., 2009).

Though few α-, β-, γ- and δ-Proteobacteria have been detected in other thermophilic environments (Ward et al., 1998; Ferris et al., 2001; Miller et al., 2009), the PHS Proteobacteria NifH clones were mainly associated with the plant rhizosphere, and the extent of their active nitrogen contribution to the PHS system remains to be seen.

4.4 Concluding remarks

In summary, PHS hot source pool NifH clones partially matched a past study based on 16S rDNA (Anitori et al., 2002). NifH clones were affiliated with the Oscillatoriales, Chroococcales and Nostocales (Cyanobacteria), as well as with P. carbinolicus and Thermodesulfovibrio spp. (δ-Proteobacteria and Nitrospirae, respectively). Cluster III NifH clones were related to members of the δ-Proteobacteria (mostly SRB), Spirochaetes, Bacteroidetes and Firmicutes, none of which were identified in the original 16S rDNA bacterial community study (Anitori et al., 2002).

BLAST and BLASTX identified diazotrophs that might be active in nitrogen fixation in this system. We would suggest that a thermophilic Oscillatoria sp. and an anaerobic sulphate-reducing δ-Proteobacterium of the genus Desulfovibrio would potentially be the active nitrogen fixers in the hot source pool.

As with other culture independent studies, we assumed that not all bacteria which have the nifH

genes actually express them and fix N2. Nevertheless, we would like to suggest nitrogen fixation

does occur in the hot source pool, mainly because N2 levels in the spring waters were higher,

compared to the local atmospheric composition (Brugger et al., 2005).

In summary, the hot source pool in Paralana Hot Springs supports a diverse and rich diazotrophic community. Our study has not only identified potential nitrogen fixers, it has also expanded our basic knowledge of the microbial community composition and its potential nitrogen fixation dynamics.


Chapter 5 Structural and evolutionary adaptations in the Fe protein component of the nitrogenase

_______________________________________________

5.1 Introduction

The background question propelling our efforts throughout this chapter was whether there were changes in the inferred NifH sequences obtained from hypersaline and thermal environments (chapters 3 & 4) that reflect adaptations of the Fe protein to these environments.

In order to remain active and functional under various physical conditions, it is essential for any

protein to adapt to its immediate surroundings (Jaenicke and Böhm, 1998; Somero, 2003;

Bolhuis et al., 2008). There are several possible pathways for adaptation; a protein may be

protected from inactivation by “external” factors, such as being enclosed within a cell or

organelle (a heterocyst for example). Micro-conditions surrounding the protein can also be

controlled, either with heat/cold shock proteins, or by organic compatible solutes or by a

heterotrophic existence, thus preventing exposure to unfavourable conditions and inactivation

(Des Marais, 1995; Fields, 2001; Pikuta et al., 2007). The amino acid composition within a

protein was found to change under stressful conditions such as high salinity, pressure, extreme

temperatures and pH (Madern et al., 1995; Jaenicke, 1996; Groudieva et al., 2004; Siddiqui and

Thomas, 2008; Greaves and Warwicker, 2009). It is therefore of interest to look into the

potential adaptations in the Fe protein in response to stressful environmental conditions, and

gain better understanding of the mechanistic solutions originating from genetic code

permutations.

The Fe protein, encoded by the nifH gene, has been phylogenetically classified within the

family of the Mrp/MinD proteins, as part of the SIMIBI class within the GTPase super class

group of proteins, which includes translation factors, signal recognition particle (SRP) GTPases, and several families of ATPases (Leipe et al., 2002). GTPase proteins include several conserved elements - a repetitive α/β secondary structure, an N-terminal Walker A motif, also

known as a P-loop, which structurally forms a loop and binds the -phosphate of a nucleotide to

facilitate hydrolysis (Walker et al., 1982). In addition, GTPases also include the Walker B


Figure 1. Amplified regions of NifH in the Fe protein (highlighted in blue and red). MoFe chains A - D are shown with minimal backbone atom display, the Fe protein chains E and F are shown in grey ribbons, except for the amplified NifH regions, residues 37-155. Space filled atoms are displayed for the Calcium ions, Fe7MoNS9 and Fe8S7 clusters in the MoFe protein, and the Fe4S4 cluster in the Fe protein. Image based on 2AFH PDB file (Tezcan et al., 2005).

motif, which binds via a water molecule to the MgATP, and includes conserved Asp and Gly residues, preceded by four hydrophobic residues (Peters et al., 1995). Two switch regions, known as Switch I and Switch II, were named by analogy to the homologous regions in ras P21 proteins (Lanzilotta et al., 1996; Jang et al., 2000; Jang et al., 2004) and are vital to the conformational change upon nucleotide binding. The Fe protein is a dimer, structurally composed of eight beta sheets and alpha helices (Schlessman et al., 1998; Tezcan et al., 2005), with a 4Fe:4S metallo cluster nestled in between (see also introduction chapter, section 1.3.1, for further details).

The Fe protein structure has been studied quite extensively due to its role in dinitrogen fixation

(Howard and Rees, 1996; Peters and Szilagyi, 2006). Molecular phylogenetic studies utilising

the nifH gene primers (Zehr and McReynolds, 1989; Omoregie et al., 2004b), amplify only part

of the gene, corresponding to residues 37-155 (residue numbering according to P00456 Swiss-

Prot ID sequence, see figure 1). This part contains information on switches I and II, the Walker

B motif and residues which coordinate the metallo cluster and interact with the second

component of the nitrogenase, the MoFe protein (see figure 2). The amplified section does not

cover the nucleotide binding fold, Walker A motif (Walker et al., 1982). Within the amplified

region, there are known loops which can undergo conformational variations, plus several

conserved residues which form multiple hydrogen bonds via interaction with conserved water

molecules, and also NH-S bonds between the amide groups and sulfur atoms, specifically

around the 4Fe:4S cluster (Georgiadis et al., 1992; Schlessman et al., 1998; Chiu et al., 2001).


Figure 2. Known functional regions in the amplified regions of NifH in the Fe protein. Switch I region is highlighted in orange, switch II in forest green, Walker B motif in blue, and residues which interact with the MoFe protein are coloured red. Q54, part of the Q-loop motif (see main text in section 5.4.2), is in pink. For visualization purposes, MoFe chains A and B are presented in minimal wire, and the image was cropped. Fe protein chain E was omitted from the image. Image based on 2AFH PDB file (Tezcan et al., 2005).

In order to elucidate structural deviations relating to potential environmental adaptation, it was

imperative to obtain a known Fe protein structure, which would represent each of cluster I and

III individually. Since 1992 the crystallographic structures of the Fe protein provided new

insights on its mechanism and structure (Georgiadis et al., 1992; Kim et al., 1993; Peters and

Szilagyi, 2006). Twenty Fe proteins have been resolved in the range of 2.1 - 3.2 Å from

Azotobacter vinelandii, phylogenetically affiliated with cluster I (P00459 Swiss-Prot ID, H.M.

Berman, 2003). The best refined model, 2AFH at 2.1 Å (P00459 Swiss-Prot ID), was chosen as

the reference structure for clones affiliated with cluster I (Tezcan et al., 2005). However, only

two resolved structures have emerged from bacteria affiliated with cluster III -

Clostridium pasteurianum, and these structures were determined at 1.93 and 3.00 Å resolution

(Kim et al., 1993; Schlessman et al., 1998). The more refined structure, designated 1CP2

(P00456 Swiss-Prot ID), was chosen as the reference structure for this study, for clones

affiliated with cluster III. These two Fe protein models, 2AFH and 1CP2, are from mesophilic bacteria and share 69% overall amino acid sequence identity, and 73.5% sequence identity in the amplified region of nifH specifically (Burgess et al., 1980; Zehr and McReynolds, 1989;

Schlessman et al., 1998; Omoregie et al., 2004a).

In order to detect amino acid substitutions in a sequence and changes in the Fe protein structure,

two different bioinformatic tools were employed with 1CP2, 2AFH and NifH clones from this

study and existing databases. ConSurf is a bioinformatic tool which identifies functional regions

in proteins, by taking into consideration their phylogenetic background and similarities between

amino acids (Glaser et al., 2003; Landau et al., 2005). After estimating the level of conservation

of each amino acid in a set of sequences, a representative colour scheme is projected onto a

protein 3D visualized structure, thus helping researchers to identify areas highly conserved and

functionally important, but also areas of medium to high variability (Pupko et al., 2002;

Goldenberg et al., 2008). ConSurf is ranked among the best bioinformatic tools available for identifying important functional sections in proteins (Chung et al., 2005;

Ashkenazy et al., 2010; Mooney et al., 2011). ConSurf has been employed in the past in the

analysis of various proteins which included an iron-sulfur cluster, or supervised the biogenesis

of such clusters, for instance - the cytosolic iron-sulfur assembly protein (Cia1, Srinivasan et al.,

2007), the nitrogenase molybdenum-iron protein (Chung et al., 2006), the Iron–Sulfur Cluster

Assembly proteins (IscU, IscS, Ramelot et al., 2004; Shi et al., 2010), reverse-acting

Dissimilatory sulphite reductase (DsrAB, Grimm et al., 2010), and an ATPase component in the

biosynthesis of Fe–S clusters (SufC, SufE, Goldsmith-Fischman et al., 2004).

In most cases, these studies used ConSurf in addition to an analysis of the protein's resolved

structure, to highlight regions of strict conservation and point out or confirm their specific

functionality (Ramelot et al., 2004; Li et al., 2009). At times, ConSurf has been used without

any accompanying biochemical analysis, being used as a prediction tool, to help researchers

find, among other things, protein-protein interaction sites, ligand binding sites, provide data for

future mutational or structural studies, and assigning domain functions to the ever increasing

number of hypothetical proteins (Bell and Ben Tal, 2003; Chung et al., 2005; Ashkenazy et al.,

2010). Thus, ConSurf analysis can be used to distinguish and illuminate conserved important

functional zones in families of proteins, and help in deducing lineage specific adaptations

(Glaser et al., 2005), even when a known 3D crystallographic structure is unavailable (Razia et

al., 2010; Kumar et al., 2012).

In our analysis, multiple alignments of each NifH cluster were compiled from reviewed NifH

sequences obtained from the Swiss-Prot database (Boeckmann et al., 2003), 58 and 32 reference

sequences, of cluster I and cluster III, respectively. Using reference sequences provided less

background noise to the data, as there are many NifH sequences available in the databases,


isolated from various sources under various conditions. Furthermore, it was assumed that

genetic changes would manifest mainly in the non conserved regions of the protein, and

therefore each multiple alignment was further split into two - a set of multiple sequences with

conserved residues only, and another set with variable residues only. The distinction between

variable and conserved residues was based on the ConSurf analysis, detailed in section 5.2.1.

The dichotomy between conserved and non-conserved enabled us to analyse shifts in the amino

acids composition for each segment.

Currently, the ConSurf web server requires a 3D structure of a protein, written as a PDB file, for

visualising the end result (Glaser et al., 2003). This and our aim to detect structural shifts in the

clones, directed us to use predicted structures, that were modelled by the iterative threading

assembly refinement (I-TASSER) server, which predicts 3D protein models (Zhang, 2008,

2009). Briefly, the server first assesses the possible secondary structure of a given sequence

against a representative PDB template library, using a 70% cutoff criterion for the pair-wise

comparison, and a combination of alignment programs, such as Needleman-Wunsch

(Needleman and Wunsch, 1970), Smith-Waterman (Pearson, 1991), etc, to propose a potential

secondary structure for the sequence (Zhang, 2008, 2009). The potential structure is then

divided into continuous segments of good quality structural alignments, and unaligned

fragmented sections, usually loop regions, which require a different method for structural

refinement. After additional spatial characteristics are calculated and averaged across a cluster

of potential structures, the modelling process is repeated to produce the best structural

candidate. In the second round, additional algorithms are used, such as TM-align (Zhang and Skolnick, 2005) for structural alignment, and other software to add backbone atoms and side chain rotamers, eventually producing a PDB file for downstream applications (Roy et al., 2010).

The I-TASSER server provides different scores to evaluate the quality of its models. ‘C-score’

is a confidence score for estimating the quality of predicted models by the I-TASSER server.

The C-score is calculated based on the significance of threading template alignments and the

convergence parameters of the structure assembly simulations. It is typically in the range of [-5, 2], where a higher C-score signifies a model with higher confidence and vice-versa

(Zhang and Skolnick, 2007; Roy et al., 2010). A template modelling score (TM-score) is a scale

for measuring the topological similarity between two structures (Zhang and Skolnick, 2004), a

TM-score >0.5 would indicate that a model had correct topology and a TM-score below 0.17

would indicate random similarity. Root mean square deviation (RMSD) is another score

provided by the server, which is a well known standard for measuring the accuracy of structure

modelling, when the native structure is known (Kabsch, 1976, 1978; Carugo, 2003). The lower

the RMSD score, the better is the match between structures (i.e., smaller deviations).


I-TASSER has consistently ranked as the best method in the Critical Assessment of Structure Prediction (CASP) experiments for predicting protein structures (Zhang, 2007, 2009; Roy et al., 2010).
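As a rough illustration of how these scores relate to atomic coordinates, the following minimal Python sketch computes per-residue deviations, the overall RMSD and a TM-score-like value for two structures that are assumed to be already superimposed and matched residue by residue. The function names and inputs are hypothetical and are not part of I-TASSER. The d0 scaling term follows Zhang and Skolnick (2004); the lower bound of 0.5 Å on d0 is an assumption made here to keep the sketch defined for short chains. The real TM-score also searches for the superposition that maximises the score, which this sketch does not attempt.

```python
import math


def per_residue_deviation(model, reference):
    """Distance (in Å) between matched atoms of two pre-superimposed structures.

    Both arguments are lists of (x, y, z) tuples, one per residue (e.g. C-alpha atoms).
    """
    if len(model) != len(reference):
        raise ValueError("coordinate lists must have the same length")
    return [math.dist(a, b) for a, b in zip(model, reference)]


def rmsd(deviations):
    """Root mean square deviation over all matched residues."""
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


def flag_deviating(deviations, cutoff=1.0):
    """Indices of residues whose deviation exceeds the cutoff (in Å)."""
    return [i for i, d in enumerate(deviations) if d > cutoff]


def tm_score(deviations, target_length):
    """TM-score-like value for a fixed superposition (no optimisation step)."""
    if target_length > 15:
        d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    else:
        d0 = 0.5
    d0 = max(d0, 0.5)  # assumed lower bound, see the note above
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in deviations) / target_length
```

With a 1 Å cutoff, `flag_deviating` mirrors the way poorly positioned sections can be singled out after superimposing a predicted model on its crystallographic counterpart.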

Either accompanied with a functional or biochemical analysis, or without, these two

bioinformatics tools can provide powerful insight and novel information on protein structure

and conservation. In one study on the membrane associated thioredoxins of the Arabidopsis

thaliana plant, ConSurf was used to highlight two conserved amino acids, Gly and Cys, in the

N-terminal extension of the protein. These were then mutated to Ala, for further functional and

structural analysis (Meng et al., 2010). Subsequently, I-TASSER was used to predict the protein

and the mutant variants’ 3D structures, which enabled the researchers to show structural

modifications due to the changes in those specific amino acids. Their mutational and

biochemical study supported the ConSurf and I-TASSER results. In another study, a large scale

ConSurf analysis of the NS1 and NS2 amino acid sequences, from influenza A virus, was

projected onto I-TASSER models of NS1 and NS2, to highlight novel potential binding sites for

drugs (Darapaneni et al., 2009). There are few additional studies which employed both tools,

and we expect more studies will emerge using ConSurf and I-TASSER (Jimenez-Lopez et al.,

2010; Meng and Feldman, 2010; Aluri and Terli, 2012; Bhat et al., 2012).

To our knowledge, this is the first time these tools have been used in the analysis of the Fe

protein component of the nitrogenase protein. This required us first and foremost to analyse the

novel methodology, followed by the later analysis of our data with the established protocol.

Specific aims:

1. The two main clusters in the NifH phylogeny tree are cluster I and cluster III. Most of our previous findings (detailed in chapters 3 and 4) were affiliated with these clusters. Therefore

our aim in this chapter was firstly to characterise conservation patterns and amino acid

distribution in NifH sequences from cluster I and cluster III.

2. Evaluate novel methodology in regards to known structural and functional regions of the Fe

protein.


5.2 Material and methods

5.2.1 Evolutionary conservation

Evolutionarily conserved and non-conserved residues for clusters I & III and affiliated clones

were calculated by the “ConSurf” program (Pupko et al., 2002; Glaser et al., 2003; Landau et

al., 2005; Goldenberg et al., 2008). Pre-compiled multiple alignments were built using

MUSCLE (Edgar, 2004), manually checked, and submitted to ConSurf online web server for

analysis (http://ConSurf.tau.ac.il/). Specific parameters were chosen - homologues were

collected from Swiss-Prot (Boeckmann et al., 2003), PSI-BLAST E-value: 0.0001, no. of PSI-

BLAST iterations: 1, maximal % ID Between Sequences: 100, minimal % ID for homologs: 72.

A phylogenetic tree was constructed with the method of Neighbour Joining and ML distance,

and the method of maximum likelihood and the LG protein substitution model were chosen for

the conservation scores (Le and Gascuel, 2008; Posada et al., 2009). After the initial collection

of homologues, only cluster I or cluster III specific sequences from known organisms were

chosen (58 and 32 sequences, respectively) and submitted for further analysis with ConSeq

(Berezin et al., 2004).

5.2.2 Residue composition

The aligned NifH sequences from known organisms affiliated with cluster I or cluster III and

the inferred NifH sequences from clones were subjected to residue composition calculation. The

average ratio of 20 amino acids in the partial NifH sequences was calculated for each set of

multiple alignment using MEGA 5 software (Tamura et al., 2007; Kumar et al., 2008; Tamura

et al., 2011). In each multiple alignment, the average ratio of the amino acids was calculated

separately for the conserved and variable sections of the NifH sequence (conserved residues were those scored ‘9’ by the ConSurf evolutionary scoring matrix, variable residues those scored ‘1-8’).
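The arithmetic behind this step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the compositions in this study were calculated with MEGA 5, and the function names, the gap handling (gap characters are skipped) and the toy inputs below are hypothetical. Alignment columns are split by their ConSurf score, and the percentage of each amino acid is averaged over all sequences of the alignment.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def split_columns(scores, conserved_score=9):
    """Split alignment column indices into conserved (score 9) and variable ones."""
    conserved = [i for i, s in enumerate(scores) if s == conserved_score]
    variable = [i for i, s in enumerate(scores) if s != conserved_score]
    return conserved, variable


def composition(sequence, columns):
    """Percentage of each amino acid at the given columns of one aligned sequence."""
    # Gaps and non-standard symbols are ignored (an assumption of this sketch).
    residues = [sequence[i] for i in columns if sequence[i] in AMINO_ACIDS]
    counts = Counter(residues)
    total = len(residues)
    return {aa: (100.0 * counts[aa] / total if total else 0.0) for aa in AMINO_ACIDS}


def mean_composition(alignment, columns):
    """Average the per-sequence percentages over all sequences in the alignment."""
    per_sequence = [composition(seq, columns) for seq in alignment]
    return {aa: sum(c[aa] for c in per_sequence) / len(per_sequence)
            for aa in AMINO_ACIDS}
```

For example, `split_columns(scores)` followed by `mean_composition(alignment, conserved)` and `mean_composition(alignment, variable)` gives the two composition profiles that are compared for each cluster.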

5.2.3 Statistical analysis

All statistical analyses were calculated using GraphPad Prism version 5.04 for Windows

(GraphPad Software, San Diego, California, USA). D'Agostino & Pearson “omnibus K2”

normality tests (D'Agostino, 1986) were performed on each amino acid in the partial NifH

sequences of cluster I and III. Amino acids were then subjected to frequency distribution

analysis, followed by a non linear regression analysis using the Gaussian equation and “robust

fit” as fitting method, with ‘Q’ parameter set to 1% in order to exclude possible outliers in the


data set (Motulsky and Brown, 2006). Two tailed unpaired t-tests with Welch’s correction,

allowing for different variances, were performed (Welch, 1947) and the mean composition was

denoted significantly different when P < 0.05.
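For readers unfamiliar with Welch's correction, the statistic can be written out explicitly. The sketch below is an illustration only, since the tests in this study were run in GraphPad Prism. It computes the t statistic and the Welch-Satterthwaite degrees of freedom for two samples with unequal variances, using only the Python standard library; the function name and the sample values in the usage note are hypothetical. The P value itself requires the t distribution and is not computed here.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Each sample is a list of per-sequence composition values (%) for one amino acid.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each mean, using the unbiased sample variance.
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df
```

Calling `welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])` returns a t statistic of about -1.90 with about 5.88 degrees of freedom; unlike Student's t-test, the degrees of freedom are generally not an integer.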

5.2.4 Structural characteristics

3D crystallographic representatives of the Fe protein from mesophilic Azotobacter vinelandii

PDB file ID 2AFH (Burgess et al., 1980; Tezcan et al., 2005) and Clostridium pasteurianum,

PDB file ID 1CP2 (Schlessman et al., 1998), were chosen in order to assess potential structural

changes in relation to cluster I and cluster III, respectively.

Secondary structures based on 3D coordinates were analysed by DSSP (Define Secondary

Structure of Proteins) as implemented in The Protein Data Bank (Kabsch and Sander, 1983;

H.M. Berman, 2003) and in WHAT IF program (Vriend, 1990), and were also predicted by I-

TASSER online server (Zhang, 2008; Roy et al., 2010). Solvent accessibility was calculated by

the ConSeq program (Sridharan et al., 1992; Pollastri et al., 2002; Berezin et al., 2004), WHAT

IF program and I-TASSER on line server. Images were created using the Chimera UCSF

program, version 1.6.2 (Pettersen et al., 2004). Salt bridges were defined by the WHAT IF web

server (Vriend, 1990), version 10.1a (http://swift.cmbi.ru.nl/servers/html/index.html).

Salt bridges were restricted to an interatomic distance of less than 4.0 Å between a negative

atom, at the side chain oxygen atoms of an Asp or Glu residue, and a positive atom at the side

chain nitrogen of an Arg, Lys or His residue (Rodriguez et al., 1998).
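The distance criterion above can be expressed as a short Python sketch. It is not the WHAT IF implementation; the function name and the input format are hypothetical, and the side chain atom names (standard PDB nomenclature) are an assumption about which atoms count as charged. Any Asp/Glu side chain oxygen within the cutoff of an Arg/Lys/His side chain nitrogen is reported as a salt bridge between the two residues.

```python
import math

# Side chain atoms treated as charged (assumed, standard PDB atom names).
NEGATIVE_ATOMS = {"ASP": {"OD1", "OD2"}, "GLU": {"OE1", "OE2"}}
POSITIVE_ATOMS = {"ARG": {"NE", "NH1", "NH2"}, "LYS": {"NZ"}, "HIS": {"ND1", "NE2"}}


def find_salt_bridges(atoms, cutoff=4.0):
    """Return residue pairs forming a salt bridge.

    atoms: list of (residue_name, residue_number, atom_name, (x, y, z)) tuples.
    A pair is reported when any negative atom lies less than `cutoff` Å
    from any positive atom.
    """
    negatives = [a for a in atoms if a[2] in NEGATIVE_ATOMS.get(a[0], ())]
    positives = [a for a in atoms if a[2] in POSITIVE_ATOMS.get(a[0], ())]
    bridges = set()
    for neg in negatives:
        for pos in positives:
            if math.dist(neg[3], pos[3]) < cutoff:
                bridges.add((neg[0], neg[1], pos[0], pos[1]))
    return sorted(bridges)
```

Each bridge is reported once per residue pair, even when several atom pairs of the same two residues fall within the cutoff.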


5.3 Results

5.3.1 Evolution, composition and structure of the Cluster III Fe protein

The evolutionary analysis was based on the alignment of 32 complete NifH reference sequences

from known organisms affiliated with cluster III (figure 3). The assigned function for the

residues presented here was taken from Schlessman et al., 1998, unless otherwise specified. The

completely conserved residues were scored 9 by ConSurf and coloured in maroon, and included

residues with important function. The residues coordinating the 4Fe:4S cluster and creating NH-

S bonds between the amide groups and sulphur atoms were at positions 91, 93 to 97, 127 and

129 to 132 (numbering as presented in figure 3, score 9). Many positions involved in the

binding of the Fe protein to the MoFe protein were completely conserved - 59-62, 90-103, 133-

141, 171, though several were not. Positions 66-67 and 142 included an Asp or Glu, while

position 106 usually included Asn (score 6) and sometimes Gly or Asp. Position 110 included

almost always a polar uncharged amino acid (Q/N/S, score 2) and positions 172-174 were an

interplay between a hydrophobic Y/F followed by a highly conserved Gly or Ala (173, score 8)

and a charged amino acid (E/D/K at position 174, score 1).

All the residues in the two switch regions, known as Switch I and Switch II, were completely conserved (38-43 and 126-136, respectively). The Walker A motif, the phosphate binding loop

region, at residues 7-17, was completely conserved except for position 13 (Gly, score 8), in

which Ala replaced the Gly in one sequence. Most of the residues implicated in the chains

interaction within the Fe protein were completely conserved, but several were not. The

completely conserved positions were - 9, 41, 43, 46, 91-92, 94, 97, 128, 130-133, 136-137, 155,

157, 160-161, 164-165, 167, 171, 188, 190 and 266. There were 16 less conserved positions,

according to the ConSurf analysis. Positions 52 and 156, included an exchange between two

hydrophobic amino acids (L/M, score 7), position 172 was an exchange between large

hydrophobic amino acids (Y/F, score 1), position 189 was highly variable (score 1), position

191 was mostly an exchange between A/D (score 7).

Position 214 was mostly Arg (score 8) and position 216 was mostly an exchange between P/N

(score 6) while positions 219-220 were an interplay between polar uncharged amino acids T/Q

(score 8) followed by a positively charged amino acid, K/R (score 5). Positions 222-225 scored

6-8, as position 222 was mainly a negatively charged Glu, while position 225 was always a

positively charged amino acid, R/K. 223-224 were uncharged amino acids, mainly Ile or Asn

(scores 6/7, respectively). Position 262 was highly variable (score 1), while positions 269-270

always contained a pair of hydrophobic amino acids, I/L/M/V (scores 1-4).


In addition, hydrogen bonding partners, with water molecules, were highly conserved as well.

Completely conserved residues were at positions 11-17, 39, 44, 86, 109, 128, 130-132, 144, 187

and 205. Several other hydrogen bonding positions were not conserved. These positions were -

3 (exchanges between Q/K, score 8), 13 (mainly Gly, score 8), 55 (mainly Lys, score 6), 113

(mainly Ala, 8), 114 (Tyr, 8), 125 (Tyr, 8), 178 (Val, 8), 201 (Ala, 8), 253 (L/M/K, score 7) and

261 (highly variable, score 1).

Figure 3 Multiple alignment of NifH complete sequences (N=32) from known organisms affiliated with cluster III coloured by ConSurf. Scale bar colours represent scores 1-9, variable residues in turquoise and completely conserved residues in maroon. The first line shows the residue number, the second line shows consensus, and third line shows the evolutionary score. The first reference sequence, input-pdb-seqres_A is NifH chain A from 1CP2 pdb file, the rest are NifH sequences affiliated with cluster III, see section 5.2.1. Thin line marks the amplified region. Figure continues in the next pages.


Because the nifH gene primers, used throughout this study (Zehr and McReynolds, 1989;

Omoregie et al., 2004b), amplify only part of the gene, the rest of our analyses referred only to

the amplified section in order to obtain specifics to compare against the clones’ inferred

sequences.

With regard to the average residue composition of the cluster III alignment, it was important to clarify whether the composition of each amino acid followed a Gaussian (normal) distribution before performing a comparative analysis, such as a t-test or ANOVA, on the residue compositions (Smith, 1966; D'Agostino, 1986; Motulsky and Christopoulos, 2004). Table 1

summarises our analysis of amino acid composition in cluster III sequences.

In cluster III, 12 amino acids were found to pass the normality test, and eight amino acids did

not pass. Trp rarely appeared and did not pass the normality test because there was no

distribution to observe. Asp, Glu and Gly did converge to a bell shape curve yet the bell shape

curve was not the best fit (figure 4). The amino acids Phe, Val, and Ser did not pass the

normality test mainly because the distribution revolved around a few discrete values and did not

exhibit normal distribution in our data set. Gly was the most common amino acid in cluster III

sequences (mean value 11.58), while Glu composition varied the most, and had the highest

standard deviation value within the group - 1.22.

Table 1. Amino acids composition in the amplified region of NifH, cluster III sequences from known organisms (N=32).

Amino Acid   Mean (%)   Std. Deviation   Passed normality test (alpha=0.05)? (a)   P-value Summary (b)
Ala           7.59      0.67             Yes
Cys           2.37      0.31             Yes
Asp           6.01      0.92             No                                        ****
Glu           8.41      1.22             No                                        *
Phe           2.40      0.56             No                                        **
Gly          11.58      0.57             No                                        *
His           0.88      0.41             No                                        **
Ile           7.48      0.63             Yes
Lys           6.34      0.58             Yes
Leu           8.07      1.05             Yes
Met           3.96      0.53             Yes
Asn           3.92      0.75             Yes
Pro           3.19      0.21             Yes
Gln           3.23      0.49             Yes
Arg           4.98      0.84             Yes
Ser           3.51      0.87             No                                        **
Thr           4.93      0.40             Yes
Val           7.50      0.69             No                                        **
Trp           0.14      0.21             No                                        *
Tyr           3.50      0.52             Yes

(a) D'Agostino & Pearson omnibus normality test (D'Agostino, 1986).
(b) **** P < 0.0001, extremely significant; *** 0.0001 < P < 0.001, very significant; ** 0.001 < P < 0.01 and * 0.01 < P < 0.05, significant. N - not significant. n/a - not applicable.


Figure 4 Upper pane: The distribution shape of three amino acids based on their composition pattern in the NifH sequence: Asp, Glu and Gly. Goodness of fit (Robust sum of square):Asp-6.22, Glu-6.04, Gly-8.03. Lower pane: The distribution shape of three amino acids based on their composition pattern in the NifH sequence: Ser, Val and Phe. Goodness of fit (Robust sum of square): Ser-6.7, Val-6.83, Phe- curve did not converge. Y axis represents the relative frequency of the X axis values. X axis represents the range of the composition values of each amino acid in the NifH amplified region.

Completely conserved Gly and Ala, in the amplified section of NifH, had interesting structural

characteristics (Table 2). Their secondary structure, based on 1CP2 resolved structure, was

characterised by high curvature sections (bends, ‘S’), H-bonded turns (‘T’, just before or after a

helix, usually) and 3-helix turns (‘G’ - three residues per turn), and few were present in

unidentified or coil regions (Table 2, G or A score 9). Conserved Gly or Ala residues were

adjacent to important functional domains, such as the nucleotide or MoFe binding sites, the Fe

protein inter-subunits interaction region and the switch I & II regions. Solvent accessibility

analysis suggested that some of the Gly and Ala were accessible to the solvent at positions - 41,

50, 64, 76, 86-87, 91, 110 and 139. However, other conserved Gly or Ala residues were buried,

usually within conserved structural motifs.


Table 2 1CP2 Fe protein partial NifH sequence, conservation scores, secondary structure and solvent accessibility. Conserved Ala (A) and Gly (G) residues are highlighted.

1CP2 residue number                40                   60                   80
                                   |                    |                    |            (^)
Partial NifH Sequence (a)      CDP KADSTRLLLGGLAQKSVLDT LREEGEDVELDSILKEGYGG IRCVESGGPEPGVGCAGRGI
Conservation Score (b)         999 99999984919719688699 99189757161131419111 41889999999999999999
Secondary Structure (c) *      E-T TS-SSHHHHTS-----HHHH HHHHGGG--HHHH-EE-GGG -EEEE------TTSS-HHHH
Secondary Structure (c) ***    --- ---HHHHHH-------HHHH HHHH-----HHHHHHH---- EEEEE--------------H
Solvent Accessibility (d)      bee eeebbebbbeebeeeebbee beeeeeebebeebbeeeeee beeeeeeeeeeeebbbbebb
Solvent Accessibility (d) ††   --e --e-e---e-ee-e--ee-- eeee-ee--eeee--ee--e -e-e----e-e--ee--e--
Solvent Accessibility (d) †††  b-b ---bb--b--------bb-- b-----b-b--bb------- b-bbbb---bbbbb----bb
Binding to MoFe Protein (e)    --- -----------------LD- LR---ED------------- --------P---V-C--R--
Nucleotide Binding Site (e)    -D- K-DS---------------- -------------------- --------------------
Chains Interface (e)           --- K-D--R-----L-------- -------------------- ---------EP-V--A----
Structural Motifs              SWITCH1----------------- -------------------- --------------------

(a) Corresponding partial NifH amino acid sequence, P00456 accession ID, chain A.
(b) Residue conservation scores, calculated by ConSurf with maximum likelihood and the LG protein substitution model (Pupko et al., 2002; Glaser et al., 2003; Landau et al., 2005; Goldenberg et al., 2008).
(c) Secondary structure: ‘H’ - helix; ‘T’ - hydrogen bonded turn; ‘S’ - bend; ‘E’ - extended beta sheet; ‘G’ - 3-helix (three residues per turn); ‘-’ unknown/random coil. * Based on 3D coordinates, calculated by DSSP as implemented in The Protein Data Bank (Kabsch and Sander, 1983; H.M. Berman, 2003). *** Predicted secondary structure by I-TASSER online server, based on P00459 NifH sequence (Zhang, 2008; Roy et al., 2010). ‘H’ - helix, ‘E’ - extended beta sheet, ‘-’ unknown.
(d) Solvent accessibility: buried (b) or exposed (e) residue; calculated by ConSeq online server (Sridharan et al., 1992; Pollastri et al., 2002; Berezin et al., 2004). †† Solvent accessibility calculated by WHAT IF program, ‘-’ unknown, (e) exposed - a residue that is clearly solvent accessible, more exposed than 102 Angstrom, or more than 33% of its accessibility in the unfolded state (Vriend, 1990). ††† Predicted solvent accessibility calculated by I-TASSER server. ‘b’ - buried residue (0); ‘e’ - highly exposed residue (7-9); ‘-’ varying degrees of exposure (Chen and Zhou, 2005; Wu and Zhang, 2008).
(e) Residues interacting with structural components in nitrogenase (Schlessman et al., 1998).
(^) Cysteine residues which coordinate the metallo cluster are marked C.


Table 2 (1CP2 sequence continued). Conserved Ala (A) and Gly (G) residues are highlighted.

1CP2 residue number            100                  120                  140           154
                               |                    |       (^)          |             |
Partial NifH Sequence          ITSINMLEQLGAYTDDLDYV FYDVLGDVVCGGFAMPIREG KAQEIYIVASGEMMAL
Conservation Score             99886379279861119967 88999999999999999949 9919799959797983
Secondary Structure *          HHHHHHHHTT----TT-SEE EEEEE-SS-STTTTHHHHTT S--EEEEEE-SSHHHH
Secondary Structure ***        HHHHHHHHHHHH------EE EEE-----EEE---EE---- ---EEEEEE--HHHHH
Solvent Accessibility          bbbbbbbeebeebeeebebb bbbbbbbbbbbbbbbebeee ebeebbbbbbbebbbb
Solvent Accessibility ††       -e-ee-eeee-eeee-e--- -----------------ee- e-e--------eee--
Solvent Accessibility †††      bbb--b-e------e-b-bb bbbb-b--bbbb-bb--b-- -b--bbbbb----bbb
Binding to MoFe protein        IT--N---Q----------- ---------C----M--RE- ----------------
Nucleotide Binding Site        -------------------- --D---D------------- -----------E-MA-
Chains Interface               -------------------- ----L-DVVC--FA------ -----------EMM--
Structural Motifs              -------------------- -S W I T C H2------- ----------------


Figure 5 Superimposition of the I-TASSER model of the amplified NifH sequence based on P00456, and the crystallographic structures of 1CP2. Sections with RMSD > 1Å are highlighted with colour.


The I-TASSER analysis of the 1CP2 chain A sequence produced one 3D model, with an estimated
accuracy of RMSD 1.9±1.5 Å, a C-score of 2.13 and a TM-score of 0.99±0.04. The PDB templates
identified by the various server software modules in the threading stage were 1CP2 chain A and
2AFH chains E and A, the latter two with lower sequence identity percentages. The top ranking
predicted EC number was 1.18.6.1 (nitrogenase), with a TM-score of 0.9782, an RMSD of 0.66 Å,
100% sequence identity to the query sequence, an EC-score of 4.5881 and a PDB hit to 1CP2
chain B (the second chain of the Fe protein dimer). The protein most structurally similar to
the I-TASSER model was, however, identified as 2AFH chain E, with a TM-score of 0.9929, an
RMSD of 0.55 Å and 69% sequence identity.

Superimposing the I-TASSER model over the known x-ray crystallographic structure of 1CP2

(see section 5.2.3 for specific parameters), highlighted which amino acids were positioned

imprecisely by the I-TASSER server. The overall RMSD of superimposing both structures,

predicted and known, was 0.529 Å. Four sections had RMSD values higher than 1 Å (figure 5):

Gly51-Leu52 (2.982 Å), Glu91 (1.811 Å), Gly93-Val94 (2.214 Å) and Thr115-Asp116 (1.374

Å).
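The per-residue comparison described above can be illustrated with a short sketch. It assumes the two structures have already been superimposed and that matched Cα coordinates are available as (x, y, z) tuples. The function names and the toy coordinates are hypothetical, and the values reported in the text were obtained with the parameters given in section 5.2.3, not with this code.

```python
import math

def overall_rmsd(coords_a, coords_b):
    """RMSD (in Å) over matched atom pairs of two pre-superimposed structures."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = [math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b)]
    return math.sqrt(sum(squared) / len(squared))

def deviating_residues(labels, coords_a, coords_b, cutoff=1.0):
    """Return (label, distance) for residues whose Cα atoms differ by more than cutoff Å."""
    flagged = []
    for label, a, b in zip(labels, coords_a, coords_b):
        distance = math.dist(a, b)
        if distance > cutoff:
            flagged.append((label, round(distance, 3)))
    return flagged

# Toy, made-up coordinates for three residues (not taken from 1CP2).
labels = ["Gly1", "Leu2", "Glu3"]
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
crystal = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 2.0, 0.0)]

print(round(overall_rmsd(model, crystal), 3))
print(deviating_residues(labels, model, crystal))
```

In this toy case only the third residue exceeds the 1 Å cutoff, which mirrors how a low overall RMSD can coexist with a few locally misplaced residues.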

5.3.2 Evolution, composition and structure of the Cluster I Fe protein

The evolutionary analysis was based on 58 complete reference NifH sequences, from known

organisms affiliated with cluster I. The ConSurf evolutionary analysis results for cluster I

(figure 6) were similar to the results from cluster III analysis. All the residues coordinating the

4Fe:4S cluster in the Fe protein were completely conserved (positions 96-100, 130, 132-135,

numbering based on figure 6). All the residues in the Walker A motif and in the regions known
as Switch I and Switch II were completely conserved as well (positions 7-17, 38-43 and
125-135). Most of the residues involved in the binding of the Fe protein to the MoFe protein
(61-62, 67-69, 91, 95, 97, 100, 103-104, 137, 132, 140 and 170-171) were also completely
conserved, though some were not: position 58 was a hydrophobic Met or Leu (score 7), position
59 was highly variable (score 3), position 66 included exchanges between T/S/A (score 5),
position 107 was mainly Asn (score 8), position 141 included an exchange between E/Q (score 6),
and positions 173-174 were highly variable (scores 1-4).

Aside from one position, all the residues involved in nucleotide binding were completely
conserved. This single position included either Ile or Val (score 7). Not all the residues
involved in intersubunit interactions were completely conserved. The completely conserved
positions were 9, 41, 43, 46, 52, 92-93, 95, 98, 127, 129-132, 135-136, 154-156, 163-164,
170-171, 187, 213, 223, 262 and 266. The less conserved positions were 159 (mainly Tyr,
score 8), 166 (an exchange between the positively charged K/R, score 6), 215 (mainly Asn,
score 8) and 219 (another exchange, R/H, score 6). Additionally, positions 187-190 included a
motif which started with a positively charged amino acid and ended with a negatively charged
amino acid, with hydrophobic or uncharged residues in between (R, 9; N/Q/K, 1; T/V, 6; D, 8).

A similar motif was present in positions 221-224, starting with a negatively charged amino acid,

followed by a hydrophobic residue and ending with a positively charged amino acid (E, 9; L/I,

8; R, 9; R/K, 7). In addition, 26 residues that participated as hydrogen bonding partners with

water molecules (Schlessman et al., 1998), were completely conserved, with only two residues

scoring 8 - position 143 which was mainly Lys and position 169 that had an exchange between

Val and Leu.


Figure 6 Multiple alignment of 58 NifH complete sequences (N=58) from known organisms affiliated with cluster I, coloured by ConSurf. Colours represent scores 1-9, variable residues in turquoise, average conservation in white and completely conserved residues in maroon. The first line shows the residue number, the second line shows consensus, and third line shows the evolutionary score by ConSurf. The first sequence in the alignment, Input_pdb_ATOM_E, is the complete NifH sequence of chain E from 2AFH PDB file. The rest are NifH sequences affiliated with cluster I, see section 5.2.1. Thin black line marks the amplified region by the nifH gene PCR primers. Figure continues in the next pages.


In a similar fashion to the residue composition analysis previously done on the cluster III
alignment, table 3 summarises our analysis of amino acid composition and distribution in
cluster I sequences. In cluster I, 13 amino acids passed the “omnibus K2” normality test
(alpha=0.05) and seven did not. These seven failed the normality test for three different
reasons: (a) the amino acid appeared very rarely in a sequence, hence there was no
distribution to observe (Trp); (b) the distribution revolved around a few discrete values and
was not normal (Cys, Asp, Phe, Asn, Pro); and (c) two distributions were observed in the data
set instead of one (Arg). Figure 7 shows examples of a Gaussian nonlinear regression analysis
for points (b) and (c), for Arg, Cys and Phe. Gly was the most common amino acid in cluster I
sequences (mean value 10.05%), while Leu composition varied the most, with the highest
standard deviation in the group (1.12).

Table 3. Amino acids composition in the amplified region of NifH sequences from known organisms affiliated with cluster I (N=58).

Amino Acid   Mean (%)   Std. Deviation   Passed normality test (alpha=0.05)? (a)   P-value Summary (b)
Ala          9.56       0.78             Yes
Cys          2.09       0.49             No                                        ****
Asp          5.33       0.63             No                                        **
Glu          9.13       0.86             Yes
Phe          1.99       0.36             No                                        ***
Gly          10.05      0.47             Yes
His          1.51       0.52             Yes
Ile          7.94       0.62             Yes
Lys          5.13       0.84             Yes
Leu          8.59       1.12             Yes
Met          4.00       0.75             Yes
Asn          4.25       0.55             No                                        ***
Pro          2.89       0.15             No                                        ***
Gln          3.62       0.76             Yes
Arg          4.77       0.73             No                                        ***
Ser          4.07       0.72             Yes
Thr          4.76       0.67             Yes
Val          6.99       0.90             Yes
Trp          0.04       0.11             No                                        ****
Tyr          3.32       0.29             Yes

(a) D'Agostino & Pearson omnibus normality test (D'Agostino, 1986). (b) **** P< 0.0001 extremely significant, *** 0.0001 <P< 0.001 very significant, ** 0.001 <P< 0.01, * 0.01 <P< 0.05, significant. N - not significant. n/a - not applicable.
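As a rough illustration of how per-sequence composition values such as those summarised in Table 3 can be derived, the sketch below computes the percentage of each amino acid in a sequence and then the mean and standard deviation across sequences. The function names and the toy sequences are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from collections import Counter
from statistics import mean, stdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition_percent(sequence):
    """Percentage of each of the 20 standard amino acids in one sequence."""
    residues = [r for r in sequence.upper() if r in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    counts = Counter(residues)
    return {aa: 100.0 * counts[aa] / len(residues) for aa in AMINO_ACIDS}

def summarise(sequences):
    """Mean and sample standard deviation of the composition across sequences."""
    per_sequence = [composition_percent(s) for s in sequences]
    return {
        aa: (mean(c[aa] for c in per_sequence), stdev(c[aa] for c in per_sequence))
        for aa in AMINO_ACIDS
    }

# Hypothetical toy sequences, only to show the call pattern.
toy_sequences = ["GGAVLG", "GAAVLC", "GGAVLW"]
summary = summarise(toy_sequences)
print(summary["G"])
```

A residue that is absent from most sequences, such as Trp in the toy set, yields a column of mostly zeros, which is the situation described under reason (a) above.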


Figure 7 The distribution shape of three amino acids based on their composition pattern in the NifH sequence: Arg, Cys and Phe. Goodness of fit (Robust sum of square): Arg-7.21, Cys-6.01, Phe-5.54. Y axis represents the relative frequency of the X axis values. X axis represents the range of the composition values of each amino acid in the NifH amplified region.

In a similar fashion to the 1CP2 structural analysis, the conserved Gly and Ala residues in
the amplified region of NifH in 2AFH appeared in alpha helices (‘H’), regions with high
curvature (bends, ‘S’), H-bonded turns (‘T’, usually just before or after a helix) and
3-helix turns (‘G’, three residues per turn), with only a few present in unidentified or coil
regions (G or A with conservation score 9, table 4). Conserved Gly or Ala residues in 2AFH
were adjacent to important functional domains, similar to the 1CP2 findings. Solvent
accessibility analysis suggested that some of the Gly or Ala residues were accessible to the
solvent (note positions 42, 65, 89-90 and 114), while others remained buried.

The I-TASSER server provided five potential models for the 2AFH chain E sequence, with
C-scores ranging from 2.12 to -5. ‘Model1’ had the best C-score (2.12), with an RMSD of
2.0±1.6 Å and a TM-score of 0.99±0.04. The templates identified in the threading stage were
2AFH chains E and A, 1CP2 chain A, 1DE0 chain A and 2NIP chain A (additional Fe proteins from
A. vinelandii). Only chains E and A from 2AFH had 100% sequence identity, while the rest had
varying degrees of sequence identity. The top ranking predicted EC number was 1.18.6.1
(nitrogenase), with a TM-score of 0.8922, an RMSD of 1.7 Å, 98% sequence identity to the query
sequence, an EC-score of 4.0401 and a PDB hit to 1N2C chain E (a nitrogenase complex from
A. vinelandii). The protein most structurally similar to the first I-TASSER model was
identified as 2AFH chain E, according to its TM-score of 0.9897 (RMSD 0.54 Å, 100% sequence
identity).


Superimposing the I-TASSER model with the known x-ray crystallographic coordinates of

2AFH (see section 5.2.3 for specific parameters), highlighted only two residues that were not

precisely positioned (figure 8). These were G96 and E116 (numbering according to P00459

sequence), at RMSD values of 1.311 and 1.168 Å, respectively, while the overall RMSD was

0.349 Å.

Figure 8 Superimposition of the I-TASSER model based on P00459 sequence, amplified section, and the crystallographic structures of 2AFH chain E. Sections with RMSD >1 Å are highlighted with colour.


Table 4 2AFH Fe protein partial NifH section, conservation scores, secondary structure and solvent accessibility. Conserved Ala (A) and Gly (G) residues are highlighted.

2AFH position markers         40                       60                     80 (^)
Partial NifH Sequence (a)     CDPKADSTRLILHSKAQNTIM    EMAAEAGTVEDLELEDVLKA   GYGGVKCVESGGPEPGVGCA
Conservation Score (b)        999999999979619793947    38991195999582139111   91115197989999979999
Secondary Structure (e) *     E-S-SSSSHHHH--SS--HHH    HHHHTTSSGGG--HHHH-EE   -GGG-EEEE-----TTT--H
Secondary Structure ***       ---EEEEEE--------HHHH    HHHHHHHH---EEEEE----   --HHHHHHH------HHHHH
Solvent Accessibility (c)     beeeeebbebbbebebeebbb    ebbbeeeebeebebeebbee   beeebebeeeeeeeeeebbb
Solvent Accessibility ††      --ee----e---e-ee-e---    e--ee----ee-eeee--ee   -eee-e-------e--e---
Solvent Accessibility †††     b-----bb-bbb-------bb    ----e------b-b--bb--   ----b-bb---------bbb
Binding to MoFe Protein (d)   --------------------M    E-AA---TVED----------  ------------P---V-C-
Nucleotide Binding Site (d)   -D-K-DS--------------    ---------------------  --------------------
Chains Interface (d)          ---K-D--R-----K------    ---------------------  -------------EP-V--A
Structural Motifs (d)         SWITCH1----------------  ---------------------  --------------------

(a) Partial NifH amino acid sequence from 2AFH crystallographic 3D structure and P00459 sequence accession ID, chain E. (b) Residue conservation scores, calculated by ConSurf with Maximum likelihood and LG protein substitution model (Pupko et al., 2002; Glaser et al., 2003; Landau et al., 2005; Goldenberg et al., 2008). (e) Secondary structure: ‘H’ - helix; ‘T’ - hydrogen bonded turn; ‘S’ - bend; ‘E’ - extended beta sheet; ‘G’ - 3-helix (three residues per turn); ‘-’ unknown/random coil. * Based on 3D coordinates calculated by DSSP as implemented in The Protein Data Bank (Kabsch and Sander, 1983; H.M. Berman, 2003). *** Predicted secondary structure by I-TASSER online server, based on P00459 NifH sequence (Zhang, 2008; Roy et al., 2010). ‘H’ - helix, ‘E’ - extended beta sheet, ‘-’ unknown. (c) Solvent accessibility: Buried (b) or exposed (e) residue; calculated by ConSeq online server (Sridharan et al., 1992; Pollastri et al., 2002; Berezin et al., 2004). †† Solvent accessibility calculated by WHAT IF program, ‘-’ unknown, (e) exposed - a residue that is clearly solvent accessible, more exposed than 102 Angstrom, or more than 33% of its accessibility in the unfolded state (Vriend, 1990). ††† Predicted solvent accessibility calculated by I-TASSER server. ‘b’ - buried residue (0); ‘e’ highly exposed residue (7-9); ‘-’ varying degrees of exposure (Chen and Zhou, 2005; Wu and Zhang, 2008). (d) Residues interacting with structural components in nitrogenase (Schlessman et al., 1998). (^) Cysteine residues which coordinate the metallo cluster are marked C.


Table 4 2AFH sequence continued. Conserved Ala (A) and Gly (G) residues are highlighted.

Position markers             100                   120 (^)               140, 158
Partial NifH Sequence        GRGVITAINFLEEEGAYEDD  LDFVFYDVLGDVVCGGFAMP  IRENKAQEIYIVCSGEMMAM
Conservation Score           99989969889992799113  18897999999999999999  99668999999939999999
Secondary Structure *        HHHHHHHHHHHHHTT-SSTT  -SEEEEEEE-SS--TTTTHH  HHTT---EEEEEE-SSHHHH
Secondary Structure ***      HHH--------HHHHHHH--  --EEEEE-------------  -HHHHHHHHHH---------
Solvent Accessibility        bebebbbbbbbeeeeeeeee  bebbbbbbbbbbbbbbbbbe  beeeebeebbbbbbbebbbb
Solvent Accessibility ††     ------------e----eee  -e------------------  -eee--e---------ee--
Solvent Accessibility †††    --bbbbbb-bb---------  b-bbbbbb-bbbbbb-bb--  b----b--bbbbbb---bbb
Binding to MoFe protein      -R--IT--N---E-------  -------------C----M-  -RE-----------------
Nucleotide Binding Site      --------------------  ------D---D---------  ---------------E-MA-
Chains Interface             --------------------  --------L-DVVC--FA--  ---------------EMM--
Structural Motifs            --------------------  -S W I T C H 2 -      --------------------


5.3.3 Comparative analysis of cluster I and cluster III Fe proteins

We projected our results from the evolutionary analysis onto the relevant Fe protein structures,

2AFH chain E for cluster I and 1CP2 chain A for cluster III (figure 9). Big blocks of completely

conserved residues were evident in the interior of the Fe protein, in the vicinity of the metallo

cluster, as expected. Completely conserved residues were found throughout the structure, also

toward its exterior, in a more fragmented fashion, alongside less conserved residues.

The secondary structure was similar between 2AFH and 1CP2 (figure 10). The overall RMSD when
superimposing the 2AFH and 1CP2 crystallographic structures was 0.67 Å (without the 13
residues at the C-terminus of 2AFH), which meant that most Cα atoms were positioned fairly
similarly in both proteins. However, there were regions where the RMSD was higher than 1 Å, as
can be seen in figure 11. Six sections with RMSD values higher than 1 Å were present in the
amplified region (Table 5).

Table 5 2AFH & 1CP2 regions of RMSD >1 Å. In bold, residues in the amplified region of NifH.

Residue      2AFH            Conservation    1CP2          Conservation   Main secondary   RMSD (Å)
positions*   sequence (a)    score (b)       sequence      score          structure (c)
26-28        AEM             841             HAM           111            helix&turn       1.513
50-53        SKAQ            1979            GLAQ          9719           coil             2.143
58-68        EMAAEAGTVEDLE   3899119599958   DTLREEGEDVE   99991897571    helix&turn       2.341
87-90        GPEP            9999            GPEP          9999           coil             1.295
93           G               9               G             9              bend             1.998
96-97        GR              99              GR            99             helix            1.001
108-116      EEGAYEDDL       927991131       QLGAYTDDL     279861119      coil&turn        1.379
184-188      RNTDR           91684           RKVAN         91971          coil&bend        3.473
198          N               1               K             1              helix            1.073
262-269      EELLMEFG        91695149        EEILMQYG      91641119       helix&turn       3.083

* Position number according to 1CP2, chain A sequence P00456. (a) The amino acid in each position in the respective Fe protein, 2AFH or 1CP2. (b) ConSurf conservation scores, 1-9, non conserved to completely conserved, respectively. (c) Secondary structure calculated by DSSP as implemented in The Protein Data Bank.

These six regions included coil, turns and parts of alpha helices as their secondary structure

(tables 2 & 4). They included residues involved in intersubunit interactions (51, 89-90), binding

to the MoFe protein (58-68, 88, 97, 108-116) and coordination of the metallo cluster (87-90, 93,

96-97).


Figure 9 Conservation pattern of the Fe protein. Top image: Superimposed 1CP2 and 2AFH Fe proteins at opposite angles, composed from completely conserved residues only (score 9). The metallo cluster is represented with space filled atoms, yellow for Sulphur atoms, orange for the Fe atoms. Bottom image: Conservation scores projected onto individual Fe protein structures. Left Fe protein is 2AFH chain E, and right Fe protein is 1CP2 chain A. Coloured ribbons represent less conserved residues in the protein (scores 1-8, turquoise - pink), while the wire is composed only from completely conserved residues (score 9, maroon) in each cluster.


For the amino acid composition analysis, we performed two-tailed unpaired t-tests to identify
significant differences in amino acid composition between cluster I and cluster III, in the
variable and conserved sections of the NifH sequences. Most amino acids in both clusters
passed the normality tests (sections 5.3.1), which made them suitable for t-tests (Heeren and
D'Agostino, 1987).
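The split into conserved and variable sections can be pictured with a small sketch. It assumes, purely for illustration, that a position counts as conserved when its ConSurf score is 9 and as variable otherwise; this threshold and the function name are assumptions rather than the procedure documented in the text. The example input is the first 20 residues of the 1CP2 partial NifH sequence and their conservation scores as printed in the 1CP2 table earlier in this chapter.

```python
from collections import Counter

def split_by_conservation(sequence, scores, conserved_score=9):
    """Split residues into conserved and variable strings using per-position scores."""
    if len(sequence) != len(scores):
        raise ValueError("sequence and scores must have the same length")
    conserved = "".join(r for r, s in zip(sequence, scores) if int(s) >= conserved_score)
    variable = "".join(r for r, s in zip(sequence, scores) if int(s) < conserved_score)
    return conserved, variable

def percent(residues):
    """Percentage composition of a residue string, rounded to one decimal place."""
    counts = Counter(residues)
    return {aa: round(100.0 * n / len(residues), 1) for aa, n in sorted(counts.items())}

# Residues and ConSurf scores copied from the 1CP2 table (first block of the continuation).
sequence = "ITSINMLEQLGAYTDDLDYV"
scores = "99886379279861119967"

conserved, variable = split_by_conservation(sequence, scores)
print(conserved, percent(conserved))
print(variable, percent(variable))
```

In the analysis itself the compositions were compared across all sequences of each cluster, so a per-sequence version of this split would be applied to every aligned sequence before the t-tests.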

The composition analysis in the conserved vs. variable sections in the partial NifH sequence,

revealed some interesting similarities and differences between cluster I (C1) and cluster III (C3,

figure 12) sequences. Under the conserved section, Cys, Leu, Pro and Thr compositions were


similar in both clusters, while His, Asn and Trp were absent (Table 6). However, Ala and Gly

differed substantially in the conserved region, as evident from their relatively high SD values

(3.5 and 4.9, respectively, Table 6). In the variable sections, the composition of six amino acids

- Ala, Asp, Glu, Ile, Arg and Val, showed no statistically significant differences between the

clusters (Table 7). In addition, Pro was nonexistent in cluster I, and rarely present in cluster III,

while Leu was the most common amino acid in both clusters (composition mean 11 and 15, C1

and C3, respectively, Table 7), followed by Glu (11, 12). Phe, Gly, His, Lys, Asn and Ser

content decreased significantly in cluster III (Table 7), while Cys, Leu, Met, Gln, Thr and Tyr

compositions increased significantly.

Table 6 Amino acid mean composition in the conserved sections of partial NifH sequences from known organisms affiliated with cluster I (C1, N=58) and cluster III (C3, N=32). Shaded cells denote highest standard deviation (SD) values.

Amino Acid   C1-Conserved (%)   C3-Conserved (%)   Mean (SD)*
Ala          11                 6.1                8.6(3.5)
Cys          5.3                4.6                5(0.49)
Asp          6.6                9.1                7.9(1.8)
Glu          9.2                7.6                8.4(1.1)
Phe          1.3                1.5                1.4(0.14)
Gly          14                 21                 18(4.9)
His          0.0                0.0                0(0)
Ile          6.6                5.9                6.3(0.49)
Lys          2.7                3.0                2.9(0.21)
Leu          5.6                6.1                5.9(0.35)
Met          4.9                3.1                4(1.3)
Asn          0.0                0.0                0(0)
Pro          5.3                6.1                5.7(0.57)
Gln          2.6                1.5                2.1(0.78)
Arg          3.9                6.1                5(1.6)
Ser          2.7                4.5                3.6(1.3)
Thr          3.9                4.5                4.2(0.42)
Val          11                 7.7                9.4(2.3)
Trp          0.0                0.0                0(0)
Tyr          3.9                1.5                2.7(1.7)

* Mean and Standard Deviation of C1 and C3 conserved values.
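The Mean (SD) column of Table 6 summarises just two numbers per amino acid, the C1 and C3 conserved values. The short check below recomputes it for three amino acids. It assumes the SD is the sample standard deviation (n-1 in the denominator), which appears consistent with the printed values but is not stated in the text.

```python
from statistics import mean, stdev

# Conserved-section compositions (%) copied from Table 6 as (C1, C3) pairs.
table6 = {"Ala": (11, 6.1), "Gly": (14, 21), "Cys": (5.3, 4.6)}

for amino_acid, pair in table6.items():
    # With only two values, the sample SD equals |C1 - C3| divided by the square root of 2.
    print(amino_acid, round(mean(pair), 2), round(stdev(pair), 2))
```

For Ala and Gly the recomputed SD values are close to 3.5 and 4.9, the two highest in the table, which simply reflects the large gap between the C1 and C3 percentages for these residues.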

Table 7 Amino acid mean composition in variable sections of partial NifH sequences from known organisms affiliated with cluster I (C1, N=58) and cluster III (C3, N=32).

Amino Acid   C1-Variable (a)   C3-Variable    t-test P value summary (b)
Ala          6.9(1.8)          6.3(2.3)       N
Cys          0.81(1.5)         2.5(0.86)      ****
Asp          8.5(2.7)          7.7(2.2)       N
Glu          11(3)             12(3.5)        N
Phe          4.6(2.3)          3(1.2)         ****
Gly          7.5(1.8)          5.7(2)         ****
His          2.8(1.7)          0.29(0.68)     ****
Ile          5.8(2.2)          6.3(2.1)       N
Lys          5.7(2.6)          4.6(1.5)       **
Leu          11(2.3)           15(3)          ****
Met          3.3(1.6)          4.2(1.5)       *
Asn          6.7(2.3)          3.2(1.6)       ****
Pro          0(0)              0.35(0.73)     n/a
Gln          1.4(1.8)          2.9(1.3)       ****
Arg          2.2(2.5)          2(1.3)         N
Ser          7.7(3.3)          3.6(1.9)       ****
Thr          3.2(2.9)          4.9(2)         **
Val          8(2.3)            8.5(2.1)       N
Trp          0.16(0.58)        0.57(0.86)     *
Tyr          2.6(1.9)          6.2(1.4)       ****

(a) The mean composition of each amino acid and its standard deviation. (b) Indicates if the means were significantly different (P<0.05) according to unpaired t-tests with Welch’s correction for unequal variances. **** P< 0.0001 extremely significant, *** 0.0001 <P< 0.001 very significant, ** 0.001 <P< 0.01, * 0.01 <P< 0.05, significant. N - not significant. n/a - not applicable.
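To make the Welch-corrected comparison in Table 7 more concrete, the sketch below computes the t statistic and the Welch-Satterthwaite degrees of freedom from summary statistics alone. The function name is hypothetical, and the inputs are the rounded Cys values printed in the table, so the result only approximates the original test, which was run on the per-sequence data. Converting the statistic to a P value needs the t distribution, which is left out here to keep the example dependency-free.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom from summary statistics."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Cys in the variable sections: C1 = 0.81(1.5), N=58; C3 = 2.5(0.86), N=32 (Table 7).
t_stat, dof = welch_t(0.81, 1.5, 58, 2.5, 0.86, 32)
print(round(t_stat, 2), round(dof, 1))
```

The statistic comes out at roughly -6.8 with about 88 degrees of freedom, a magnitude that is in line with the "****" entry reported for Cys.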


Figure 10 Superimposed structures of 1CP2 chain A and 2AFH chain E crystallographic structures highlighting similarities in their secondary structure in the NifH amplified region. Overall RMSD 0.67 Å. Helices in blue, coil in light grey and beta sheets in red. The metallo cluster position is based on 2AFH PDB file atom coordinates (orange for Fe, and yellow for S atoms).

Figure 11 Superimposed structures of 1CP2 chain A and 2AFH chain E crystallographic structures highlighting specific sections with RMSD values >1 Å. Left: Conserved and non conserved regions in which RMSD values >1 Å, highlighted with the ConSurf colour scheme. Therefore, completely conserved regions are in maroon (score ‘9’), and non-conserved are coloured turquoise to pink (scores ‘1-8’). Right: The same regions in which RMSD >1 Å, highlighted per protein structure, hence, 1CP2 is in dark blue and 2AFH is highlighted in dark green.


Figure 12 The amino acids mean composition in the partially amplified NifH sequence, from known organisms affiliated with cluster I (C1) and cluster III (C3), divided to variable vs. conserved regions. Error bars are SD.

Salt bridges analysis, based on the crystallographic structures of 1CP2 and 2AFH, revealed that most

of the bridges included highly conserved residues mainly in coil regions, with few residues in helices

or beta sheets (Table 8).

While the residues within the beta sheets were buried, almost all the other residues were
exposed to the solvent according to our previous analysis (tables 2 & 4). Four common bridges
were completely conserved in both structures, yet two unique salt bridges in 2AFH and three in
1CP2 had low conservation scores, suggesting these specific bridges were not present in all
the sequences from cluster I or cluster III. Three unique salt bridges were highly conserved
and connected the two subunits of 2AFH, yet similar intersubunit bridges were not found in
1CP2, even when our distance criterion was extended to allow a distance of 7 Å between
participating atoms.


Table 8 Potential salt bridges, with maximum interatomic distance of 4 Å, in the amplified NifH region of the Fe protein 1CP2 or 2AFH. Shaded rows represent common salt bridges.

Structure   Subunits   Residue   Position (b)   Residue   Position   Distance (Å)   Conservation scores (c)
1CP2 (a)               ASP       38             LYS       14         3.07           9,9
                       GLU       62*            LYS       54         2.77           1,6
                       GLU       75             ARG       81         3.84           1,1
                       GLU       107            LYS       140        3.96           9,9
                       ASP       115            ARG       81         3.22           1,1
                       ASP       122*           LYS       14         2.95           9,9
                       GLU       143            ARG       2          3.58           9,9
2AFH (d)               ASP       39             LYS       15         3.09           9,9
            E↔F        GLU       92             LYS       170        2.8            9,9
                       GLU       110            LYS       143        3.24           9,8
                       ASP       118            LYS       32         3.38           3,7
                       ASP       125            LYS       15         3.14           9,9
                       ASP       129            LYS       41         2.74           9,9
                       GLU       141            ARG       140        2.92           6,9
                       GLU       146            ARG       3          2.88           9,9
                       GLU       154            LYS       10         3.97           9,9
                       GLU       229            HIS       50         2.53           2,6
            E→F        GLU       265            LYS       52         3.68           9,9
            F→E        GLU       277            LYS       52         2.86           -,9

(a) 1CP2 analysis by WHAT IF, salt bridges were not detected between the Fe protein subunits A & B. (b) Positioning was manually corrected for minor shifts per alignment. (c) Conservation score was based on the individual analysis of ConSurf on cluster III, Stromatolite affiliated with cluster III (S3), Cluster I and its affiliated stromatolite clones (S1). Scores ranged from 1 to 9, non-conserved to completely conserved, respectively. “-” score was not calculated. (d) 2AFH analysis by WHAT IF, salt bridges were detected between subunits E & F, and are designated where relevant. * Yellow background denotes a residue in an α-helix structure, and green denotes a residue within a β-sheet. No background colour means random coil or unknown structure.
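The distance criterion behind Table 8 can be sketched as a simple search for oppositely charged side-chain atoms that lie within a cutoff. This is a minimal illustration and not the WHAT IF implementation: the atom-name sets, the function name and the toy coordinates are all assumptions made for the example.

```python
import math

# Assumed charged side-chain atoms (standard PDB atom names).
ACIDIC = {"ASP": {"OD1", "OD2"}, "GLU": {"OE1", "OE2"}}
BASIC = {"LYS": {"NZ"}, "ARG": {"NE", "NH1", "NH2"}, "HIS": {"ND1", "NE2"}}

def find_salt_bridges(atoms, cutoff=4.0):
    """Return (acid, base, distance) for residue pairs with charged atoms within cutoff Å.

    atoms is a list of (residue_name, residue_number, atom_name, (x, y, z)) tuples.
    """
    acids = [a for a in atoms if a[2] in ACIDIC.get(a[0], ())]
    bases = [a for a in atoms if a[2] in BASIC.get(a[0], ())]
    closest = {}
    for acid in acids:
        for base in bases:
            distance = math.dist(acid[3], base[3])
            if distance <= cutoff:
                key = ((acid[0], acid[1]), (base[0], base[1]))
                # Keep only the shortest atom-atom distance per residue pair.
                if key not in closest or distance < closest[key]:
                    closest[key] = distance
    return sorted((k[0], k[1], round(d, 2)) for k, d in closest.items())

# Toy, made-up coordinates (not taken from 1CP2 or 2AFH).
atoms = [
    ("ASP", 38, "OD1", (0.0, 0.0, 0.0)),
    ("ASP", 38, "OD2", (2.2, 0.0, 0.0)),
    ("LYS", 14, "NZ", (0.0, 3.0, 0.0)),
    ("GLU", 75, "OE1", (20.0, 0.0, 0.0)),
    ("ARG", 81, "NH1", (20.0, 0.0, 6.5)),
]

print(find_salt_bridges(atoms))              # 4 Å criterion
print(find_salt_bridges(atoms, cutoff=7.0))  # relaxed 7 Å criterion
```

Relaxing the cutoff to 7 Å picks up the second toy pair, which is the kind of extended search described in the text for intersubunit bridges in 1CP2.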


5.4 Discussion

Cluster I and cluster III NifH sequences from known organisms were subjected to analyses of

their conservation patterns, amino acids composition and structural shifts, in representative Fe

proteins. Two new bioinformatic tools were evaluated, ConSurf and the I-TASSER web server

for structural prediction.

5.4.1 Methodology

Our methods included a statistical t-test analysis of the amino acid composition, a ConSurf

evolutionary analysis, and using I-TASSER for modelling the Fe proteins. In general, these

methods performed well, with the following limitations and restrictions.

Amino acid composition analysis is usually used to ascertain unique patterns and characterise
a designated group of sequences. The number of sequences and their length may vary, from a few
dozen sequences to hundreds, as may the statistical tools employed, such as Chi square tests,
a significance (‘R’) formula based on standard deviation, cluster analysis and more (Bohm and
Jaenicke, 1994; Fukuchi and Nishikawa, 2001; Fukuchi et al., 2003; Paul et al., 2008). In our
analysis, the amino acid composition was compared between defined NifH clusters derived from
known reference NifH sequences, and our specific NifH clone sets obtained from the Shark Bay
stromatolites and the Paralana Hot Springs. While most amino acids passed the normality tests,
Trp, Phe and Asp from both cluster I and cluster III did not (tables 1 & 3).

However, we continued with the statistical analysis, with all the amino acids, because of the

robustness of the t-test to violations of normal distribution (Heeren and D'Agostino, 1987). The

distribution shape of amino acids composition in proteins is a complex matter. The normal type

of distribution has been suggested in the past (Smith, 1966; Nishikawa et al., 1983; Gerstein,

1998), but it remains inconclusive. However, we would have liked to see more amino acids pass

the normality test. Once elongated, the amplified region of the nifH gene will provide longer

sequences, with more appearances of each amino acid, and eventually the distribution shape will

become clearer. However, because some of these amino acids would appear mostly in the

conserved regions of the sequence, they would always attain the same values, regardless of the

data set size, and therefore perhaps a different statistical approach should be used in those cases.

The limitation of the ConSurf analysis was tightly related to the multiple alignment quality,

more so than its size. The multiple alignment quality is at the core of the ConSurf analysis


(Glaser et al., 2005). In our study we used the Muscle alignment software, and visually checked

the alignments. As reflected in independent benchmark testing of multiple alignment tools,

MAFFT and Muscle produce similar quality outputs and both are better than ClustalX software

(Nuin et al., 2006). Large blocks of conserved motifs throughout our alignments were always
correctly aligned; however, whenever there was a single insertion, Muscle tended to position
it somewhat arbitrarily. Thus an insertion near two identical residues would create three
different forms (xx, x-x, xx-) and affect how ConSurf computes conservation for these specific
positions. If the gap is placed inconsistently next to two identical residues, these highly
conserved residues will ‘lose’ their specific positioning within the multiple alignment and
will be marked as variable, though they are not. In a highly conserved region of the Fe
protein, any minute modification to the sequence could represent an adaptation. On a large
scale, these mini-modifications might get lost or overlooked, yet in our alignments they were
observed and corrected.
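The effect of inconsistent gap placement can be shown with a toy alignment. The sketch below uses a deliberately simple per-column measure, the fraction of sequences sharing the most common symbol, as a stand-in for conservation. It is not the rate-based scoring used by ConSurf, and the sequences and function name are hypothetical.

```python
from collections import Counter

def column_identity(alignment):
    """Fraction of rows sharing the most common symbol in each alignment column."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        top_count = Counter(column).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores

# One sequence carries an extra residue (S) after a conserved GG pair.
consistent = ["AGG-T", "AGG-T", "AGG-T", "AGGST"]    # gap always placed after GG
inconsistent = ["AGG-T", "AG-GT", "A-GGT", "AGGST"]  # gap placed in three different ways

print(column_identity(consistent))
print(column_identity(inconsistent))
```

With consistent gap placement the two Gly columns stay fully identical, whereas the inconsistent placement spreads the same residues over three columns and lowers their apparent conservation, which is why such positions had to be checked and corrected by eye.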

In addition, positions with functionally similar amino acids, e.g. Glu or Asp, will exhibit
higher rates of change compared with positions which require the function and the structure of
the amino acid to be exactly the same in order for the protein to function at all. Hence,
positions which include functionally similar but not structurally identical amino acids will
alternate between those optional amino acids. As ConSurf uses the “rate4site” algorithm, which
scores positions based on their mutational rates, such alternating positions will be scored as
relatively variable rather than conserved, and will be given lower scores. Across the NifH
sequences from known organisms in cluster III, positions with alternating Glu or Asp received
a range of scores: 1, 4, 5 and 7 (figure 2, table 3). The exact mechanism by which a specific
score was given to these positions requires an in-depth analysis and inspection of the
algorithm, factoring in the maximum likelihood method, the Le and Gascuel substitution matrix
and the effect of the total number of sequences in the alignment on the calculation.

The limitation of the I-TASSER modelling method was identified after we have employed the

RMSD calculation method (Kabsch and Sander, 1983), as implemented in the Chimera UCSF

software (Meng et al., 2006). We performed RMSD analysis on the resolved structures of 1CP2

chain A and 2AFH chain E, and gained an independent RMSD analysis of their structural

differences (Table 5). This base analysis was later reviewed against our analysis of each

resolved structure against its predicted model by the I-TASSER server (figures 5 & 8). The base

analysis indicated where authentic structural changes actually occur between 1CP2 and 2AFH

(Table 5). Six of those sections were in the amplified region of NifH, and relatively exposed to

the solvent. They are known to undergo conformational changes upon nucleotide binding, or

upon forming a docking complex with the MoFe protein (Georgiadis et al., 1992; Tezcan et al.,

2005).
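The RMSD comparison described above can be illustrated with a minimal Python sketch. It assumes the two structures have already been superimposed (for example in UCSF Chimera) and that the matched Cα coordinates are available as lists of (x, y, z) tuples. The function names, the coordinates and the 1 Å threshold are hypothetical illustration values, not data or code from this study.

```python
import math

def ca_deviations(coords_a, coords_b):
    """Distance (in Å) between each pair of matched Cα atoms."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("both structures need the same, non-zero number of matched atoms")
    return [math.dist(a, b) for a, b in zip(coords_a, coords_b)]

def rmsd(coords_a, coords_b):
    """Root mean square deviation over all matched Cα atoms."""
    deviations = ca_deviations(coords_a, coords_b)
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))

# Hypothetical, already superimposed Cα coordinates (not taken from 1CP2 or 2AFH).
structure_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 3.0)]

per_residue = ca_deviations(structure_a, structure_b)
# Indices of residues whose Cα atoms deviate by more than 1 Å (assumed threshold).
shifted = [i for i, d in enumerate(per_residue) if d > 1.0]
overall = rmsd(structure_a, structure_b)
```

Keeping the per-residue deviations alongside the overall value is what allows individual sections of a structure to be singled out, as was done for the sections listed in Table 5.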

The RMSD analysis of the 1CP2 and 2AFH structures against their predicted I-TASSER models

allowed us to isolate any differences introduced by the I-TASSER process. The four sections in

1CP2 which were positioned imprecisely correlated entirely with our base analysis between the

two Fe proteins. We therefore assumed the misplacement was due to the TM-align

procedure on the 1CP2/P00456 sequence, which identified 2AFH chain E as the best structure

to model after even though sequence identity to P00456 was only 69%. Hence, in the resulting

predicted model for 1CP2/ P00456, the Cα atoms of seven amino acids were placed at a distance

from the resolved structure, and were mainly based on 2AFH chain E structural alignment. The

only two residues that were slightly off in the 2AFH chain E model reflect the ab initio

modelling procedure I-TASSER employs for loop regions, which has a lower success rate than

comparative modelling against a known sequence and structural template (Zhang et al.,

2003; Moult, 2005).

5.4.2 Evolution, composition and structure in cluster I & III

The ConSurf analysis of cluster I and cluster III provided some interesting points. The results

for both clusters confirmed that most residues with important functional roles were completely or

highly conserved. The scoring system itself, based on the maximum likelihood and the LG

amino acid substitution matrix, produced a colour scheme for the multiple alignment of each

cluster (figures 3 & 6).

In general, score 8 was assigned by ConSurf to positions in which only one sequence in the

entire multiple alignment included a change of a residue, while scores 6-7 were usually assigned

to exchanges between a positively charged Arg/Lys and a negatively charged Asp/Glu, or an

exchange between Gln/Thr residues with polar uncharged side chains. The latter type of

exchange points to preservation of function over structure. Lower scores were given to

exchanges of hydrophobic residues, mostly of the bulkier type - Leu/Met/Tyr/Phe. In general,

60% of cluster I residues scored 8-9, and ~11% scored 6-7, altogether suggesting that both

function and structure were highly conserved throughout cluster I. Cluster III residues scored

differently (54% scored 8-9 and 14.5% scored 6-7), suggesting that cluster III, while still

quite conserved in function and structure, is more prone to changes in its amino acid

composition, relative to cluster I.
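The percentages quoted above are simple tallies over the per-position ConSurf grades. A minimal Python sketch of such a tally follows; it assumes the grades are available as a string of digits 1-9, one per alignment position. The function name and the example grade string are hypothetical and are not taken from the cluster I or cluster III alignments.

```python
def score_fractions(grades):
    """Return the percentage of positions graded 8-9 and 6-7.

    grades: string with one ConSurf grade (1-9) per position; any
    non-digit characters such as spaces are ignored.
    """
    values = [int(c) for c in grades if c.isdigit()]
    if not values:
        raise ValueError("no grades found")
    high = sum(1 for g in values if g >= 8)      # completely or highly conserved
    mid = sum(1 for g in values if g in (6, 7))  # moderately conserved
    return 100.0 * high / len(values), 100.0 * mid / len(values)

# Hypothetical grade string for ten positions.
high_pct, mid_pct = score_fractions("99887 76541")
```

For this toy string, four of the ten positions fall in the 8-9 band and three in the 6-7 band.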

Completely conserved residues in both clusters were found in the switch I & II regions, the

4Fe:4S metallocluster, and the Walker A motif (GKGGIGGKST), as expected. In addition,

regions that were involved in at least two functional roles scored 9. For example, positions 86-99,

which were involved in MoFe binding and in the intersubunit interface, included

completely conserved blocks of residues (residue numbering according to 1CP2; Schlessman et

al., 1998). Another conserved feature, in both clusters, was the complete conservation of Q54, as

part of a Q-loop motif (Yang et al., 2011). This motif is usually found within the ATP-binding

cassette (ABC) transporters, which also include the multidrug resistance protein MRP. The Q-

loop motif is integral to the binding of the nucleotide via its metal cofactor (Yang et al., 2011),

and its presence is not surprising considering that NifH has been affiliated phylogenetically with

the Mrp/MinD protein family, within the SIMIBI class of the P-loop GTPases (Leipe et al.,

2002). Other motifs were not completely conserved in both clusters. The Walker B hhhhDxxG

motif was partially conserved (‘h’ denotes a hydrophobic residue; residues 122-129 in

1CP2). The DxxG motif was DVLG in both clusters; however, in cluster I, the preceding ‘hhhh’

included Ser/Cys/Phe (position 123), while the remaining positions were hydrophobic residues, as

expected. In cluster III, the ‘hhhh’ motif was conserved and the residue exchanges were solely

hydrophobic in nature (Val/Phe/Tyr/Ile). These structural and functional motifs are involved in

the nucleotide binding as well.

The Fe protein, in general, has retained the Asn residue which is part of an Nxxx motif, a

variation on the original NKXD in P-loop GTPases (Leipe et al., 2002). This Asn is thought to

stabilize the guanine nucleotide binding site, and confer specificity for GTP binding (Bourne

et al., 1991). It was not completely conserved in cluster III, in which position N106 scored only

6, as some sequences also included an Asp or Gly variation for this position. In cluster I, N107

scored 8, as almost the entire alignment maintained this specific residue at this position.

The salt bridges, in the amplified NifH region, were mainly composed of exposed and

completely conserved residues, located chiefly in the coil regions. 2AFH included a larger

number of completely conserved bridges, in contrast to the 1CP2 structure. The unique bridge in

2AFH of Glu92-Lys170 was observed previously (Schlessman et al., 1998), however, our

analysis indicated two additional bridges might be present - Glu265E-Lys52F from chain E to

chain F, and Glu277F-Lys52E, from chain F to E, both of which were highly conserved (Table

8). Our analysis did not detect Asp129-Lys41 as a bridge between the two subunits in 2AFH,

but as a bridge within subunit E, yet it was reported as an intersubunit bridge in another Fe

protein structure, 1NIP (Schlessman et al., 1998), hence this pair of residues may be

multifunctional, which would explain their complete conservation. Some studies have indicated

that the highly conserved Arg100 (cluster I numbering) is part of a salt bridge with Glu120 in

the alpha chain of the MoFe protein (Georgiadis et al., 1992; Burgess and Lowe, 1996), and

replacing this residue produced salt sensitivity and partial functionality of the Fe protein (Peters

et al., 1995).

Our analysis indicated this residue may also interact with Glu156 of the beta chain in the MoFe

protein. In both bridges, the distance calculated by the WHAT IF program for the participating

atoms was larger than 4.1 Å, suggesting that using 4.0 Å as the maximum distance criterion

between atoms for an established salt bridge was not ideal. Glu110, Arg140 and Lys143 have

also been implicated in mutational studies as important for the protein to function under

saline conditions, as replacing them caused salt sensitivity and various degrees of uncoupling of

the MgATP hydrolysis (Peters et al., 1995). The exact role of these residues is not yet

determined, though some have suggested they play a crucial role during the docking procedure

with the MoFe protein (Tezcan et al., 2005). The highly conserved salt bridge between Asp125

and Lys15 (2AFH numbering, Table 8) has been confirmed and is known to connect

the Walker A motif (Lys15) and the switch II region, and is crucial for the conformational

changes the Fe protein undergoes once a nucleotide is bound (Georgiadis et al., 1992; Lanzilotta

et al., 1995). However, Asp39 and Lys15 represent a potential additional important salt bridge,

between the Walker A motif and switch I region, which requires further studies. The other salt

bridges suggested in our analysis of 1CP2 and 2AFH should be further characterised,

particularly those involving highly conserved residues such as Asp129-Lys41, Glu146-Arg3,

Glu154-Lys10 and the intersubunit salt bridges (Table 8). Because most salt bridges were

exposed to the solvent, it is possible they switch between possible partners within the Fe protein

and partners in the MoFe protein upon docking.

The ConSurf analysis not only confirmed the functional and structural elements in cluster I &

III, it also provided an additional layer of information, mainly with regard to residues which

maintained functionality but not structure. This was complemented by the amino acid

composition analysis.

The compositions of most amino acids in cluster I and cluster III sequences passed

robust normality tests (tables 1 and 3). They were therefore considered to follow

a Gaussian distribution and to be suitable for parametric statistical tests. A two-tailed unpaired

t-test analysis of the amplified region of NifH, divided into conserved vs. variable segments,

found compositional shifts in both segments (tables 7 & 8). In the conserved

region, only Ala and Gly showed any considerable shifts in composition between the two

clusters, while other amino acids remained very similar in composition, as would be expected

from conserved regions.

The variation between Ala and Gly residues, in the conserved region, might be an indication of

the relative effect of Ala versus Gly on helix stability within the Fe protein. Other studies

indicated these amino acids stabilised helices, but at different locations along the helix: Gly at

the N- and C-termini, with Ala in internal positions (Chakrabartty et al., 1991; Serrano et al.,

1992b; Serrano et al., 1992a). It is thought this specific exchange of Gly & Ala impacts solvent

accessibility, and influences the exposure of hydrophobic surfaces to the solvent. Conserved

Gly or Ala residues in 2AFH or 1CP2 were positioned adjacent to important functional

domains, and most were accessible to the solvent, making them suitable to provide minor

adjustments for the functional residues, with regard to solvent accessibility (tables 2 & 4).

Our composition analysis of the variable sections (scores 1-8), indicated that cluster III

increased its hydrophobic content and thus reduced the overall accessible surface area to the

solvent (Moret and Zebende, 2007), producing a more compact Fe protein in general (Table 7).

In addition, although our analysis suggested that the variable sections differ substantially in

their amino acid compositions between cluster I and III, there were underlying common trends.

These included Leu as the highest occurring amino acid, and that charged amino acids, such as

Asp, Glu and Arg, did not differ significantly in their composition. This was true also for several

hydrophobic amino acids, mainly Ala, Val and Ile. The fact these groups had no significant

change in composition, although located in the variable section of the NifH sequence, points to

their involvement in a functional role via their side chains, perhaps in a similar fashion to

Arg100, Arg140, and Lys143 (2AFH sequence numbering). Mutational studies revealed these

conserved amino acids provided essential ionic support during the complex formation with the

MoFe protein (Peters et al., 1995); Asp, Glu and Arg in the variable segments may provide

similar support as well.
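The split into conserved (score 9) and variable (scores 1-8) positions used in the composition analysis can be sketched in Python as follows. This is a simplified illustration for a single ungapped sequence with one ConSurf grade per residue; the analysis in this chapter reports means over all sequences in each alignment. The function name and the example sequence and grades are hypothetical.

```python
from collections import Counter

def composition_by_conservation(sequence, grades, conserved_grade=9):
    """Percent amino acid composition of conserved vs. variable positions.

    sequence: one-letter amino acid string without gaps.
    grades:   string of ConSurf grades (1-9), same length as sequence.
    """
    if len(sequence) != len(grades):
        raise ValueError("sequence and grades must have the same length")

    def percent(residues):
        total = len(residues)
        counts = Counter(residues)
        return {aa: 100.0 * n / total for aa, n in counts.items()}

    conserved = [aa for aa, g in zip(sequence, grades) if int(g) >= conserved_grade]
    variable = [aa for aa, g in zip(sequence, grades) if int(g) < conserved_grade]
    return percent(conserved), percent(variable)

# Hypothetical six-residue example: four conserved and two variable positions.
conserved_pct, variable_pct = composition_by_conservation("GKGGAL", "999915")
```

Repeating this for every sequence in an alignment and averaging per amino acid would give mean and SD values of the kind compared between clusters.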

Prior to analysing a clone-based Fe protein structure, it was of interest to check how the I-

TASSER server would perform on 1CP2 and 2AFH sequences vs. their known X-ray

crystallographic structures, in order to independently gauge the server performance. Overall, the

I-TASSER models were in good agreement with the crystallographic structures of

2AFH and 1CP2 Fe proteins (figures 5 & 8). The performance of the I-TASSER server was

rather accurate, taking into consideration known server limitations, such as an average error of

0.08 for the TM-score and 2 Å for RMSD (Zhang, 2008). All models used in this study had TM-

scores higher than 0.5, C-scores in the range of 1.34-2.14 and RMSD values of 1.7-2.3 Å.

Lower RMSD scores for the 2AFH than for the 1CP2 I-TASSER models were most probably due to the

fact that there are more resolved Fe protein structures from A. vinelandii (20 in total) than C.

pasteurianum (2). Therefore the resulting 2AFH model was more accurate than the 1CP2 I-

TASSER model, with an RMSD value of 0.689 Å for 1CP2 and its I-TASSER model, and 0.529 Å

for 2AFH and its I-TASSER model. The TM-align software (Zhang and Skolnick, 2005),

ranked 2AFH chain E as the best structural match to the sequence of 1CP2 chain A, even

though the sequence identity was only 0.69, and we suspected this introduced bias in the

predicted models of the cluster III clones.
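For reference, the TM-score quoted for the models can be computed from the distances between aligned residue pairs with the published formula of Zhang and Skolnick, TM = (1/L) Σ 1/(1 + (d_i/d0)²), with d0 = 1.24 (L − 15)^(1/3) − 1.8 and L the length of the target protein. The Python sketch below is a minimal illustration of that formula only; it does not perform the structural alignment that TM-align carries out, the function names are hypothetical, and it is restricted to targets longer than 21 residues so that d0 stays positive.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (in Å) of the TM-score."""
    if target_length <= 21:
        raise ValueError("this sketch only covers targets longer than 21 residues")
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8

def tm_score(distances, target_length):
    """TM-score from distances (Å) of aligned residue pairs.

    Unaligned target residues contribute nothing, so the list of
    distances may be shorter than target_length.
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length

# Hypothetical 120-residue target: a perfect match, and a model in which
# every aligned pair sits exactly d0 apart.
perfect = tm_score([0.0] * 120, 120)
halfway = tm_score([tm_d0(120)] * 120, 120)
```

Because each term is bounded by 1 and the sum is divided by the target length, the score lies between 0 and 1 and, unlike RMSD, is not dominated by a few badly placed residues.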

In total there were seven mismatched positions in the 1CP2 model, which meant that in the

predicted model, Cα atoms of seven amino acids were placed at a distance from the resolved

structure. According to our previous analyses (Table 2) these residues were exposed to the

solvent, and participated in two coil sections, one hydrogen bond turn and one beta sheet,

respectively, and were completely conserved, except for Leu52 and Thr115-Asp116 (ConSurf

score 7, and 1-1, respectively). As expected, there were only two minor mismatched positions in

the predicted model for 2AFH chain E. According to our previous analyses, Gly96 is completely

conserved in cluster I, present at the end of an α-helix, buried and close to a loop region (Table

4), and Glu116 is a highly variable position (score 1; E/D/V/S, sometimes absent), exposed to

the solvent, and part of a bend near the end of an α-helix. Coil and loop regions are notoriously

hard to model accurately (Moult, 2005), and these results were not surprising.

When comparing 1CP2 and 2AFH actual crystallographic structures and superimposing them,

the mismatched residues, in the amplified region of NifH, included most of the positions we

reported as a mismatch between 1CP2 and its I-TASSER model (Table 5, figure 5). This

suggested again that the lack of additional structures affiliated with cluster III, in the PDB

template library has most probably caused a slight bias in the I-TASSER process. However, as

our analysis identified those positions, we will be able to inspect them carefully in

future analyses.

Projecting the ConSurf evolutionary scheme onto the resolved structures of 1CP2 chain A and

2AFH chain E demonstrated that in the amplified section of the NifH sequence, the most

conserved regions were switch I and II and residues coordinating the 4Fe:4S metallocluster (see

figure 9). Combining the RMSD analysis and the ConSurf evolutionary scheme, suggested that

structural shifts occur within specific regions, which included highly conserved and also non

conserved residues (figure 11). The region 113-118 was particularly prone to insertions and

structural shifts, and this region is chiefly involved in the docking procedure between the Fe

protein and the MoFe protein of the nitrogenase (Peters et al., 1995; Tezcan et al., 2005). In

general, our RMSD analysis was in agreement with the RMSD analysis as presented by

Schlessman et al., 1998, on 1CP2 Fe protein structure, and A. vinelandii Fe-protein denoted

Av2, at 2.13 Å resolution.

5.5 Concluding remarks

NifH sequences from known organisms affiliated with cluster I or cluster III were analysed in

terms of their conservation patterns, amino acid composition and existing and potential

structural attributes. Our methods included a statistical t-test analysis of the amino acid

composition, a novel ConSurf evolutionary analysis, and the use of the I-TASSER web server.

These methods performed well in general, with few limitations, and provided interesting results.

The results of the analyses suggested cluster III was slightly less conserved than cluster I, and

contained more hydrophobic residues. A possible role for the Ala and Gly residues as

interchangeable stabilisers of the alpha helices in the Fe protein was suggested as well.

The main known difference between cluster I and cluster III is that the latter includes strictly

anaerobic species, while cluster I includes both aerobic and anaerobic species (see section 1.4,

chapter 1). Our analysis highlights the underlying changes which facilitate this

specialisation in cluster III diazotrophs.

Chapter 6 Halophilic and thermophilic adaptations in the Fe protein

_______________________________________________

6.1 Introduction

Clones obtained from columnar stromatolites (chapter 3) and Paralana Hot Springs (chapter 4),

were phylogenetically affiliated mainly with cluster I and cluster III, of the nifH phylogenetic

tree (Zehr et al., 2003a; Raymond et al., 2004a). We expected that halophilic adaptations would

manifest themselves to some extent in the nifH genes from columnar stromatolites of Shark

Bay, because representatives of halophilic Halobacteriales have been previously detected in

stromatolites (Goh et al., 2006; Allen et al., 2008; Allen et al., 2009) as well as

Haloanaerobiales in Guerrero Negro microbial mats (Ley et al., 2006). The

archaeon Halococcus hamelinensis, isolated from Shark Bay stromatolite mats, has been found

to employ mainly glycine betaine as an osmolyte (Goh et al., 2011), while 18 Cyanobacteria

isolates from the Oscillatoriales, Chroococcales and Pleurocapsales orders, have been found to

accumulate predominantly various saccharides, glycine betaine, and trimethylamine-N-oxide

(Goh et al., 2010). While halophilic archaeal diazotrophs were not detected in our

analysis (chapter 3), we did detect cyanobacterial representatives. Thus, we have potential

nitrogen fixers with known halophilic adaptive strategies in Shark Bay.

Halophilic adaptations may include an increase in acidic residues (Asp, Glu), a decrease in large

hydrophobic residues and their replacement with small hydrophobic residues such as Ala, Gly

and Val, and a lower Lys content, alongside an increase in salt bridges, within monomers and

between subunits (Lanyi, 1974; Rao and Argos, 1981; Madern et al., 1995; Madern et al., 2000;

Fukuchi et al., 2003). The main ‘threat’ to a protein under saline conditions is the excess of salt

ions in the solvent, which prevents proper bonding with the water molecules and promotes

aggregation (Bolhuis et al., 2008). The increase in negative charges in a protein, by the increase

in the acidic residues, acts as a charged screen against the salt ions and attracts water molecules

to the protein (Bolhuis et al., 2008). Other studies suggested that the salt bridges were stabilized

at times by the solvent salt ions, thus harnessing the solvent to preserve the protein structure and

function (Eisenberg, 1995; Madern et al., 2000). The change in hydrophobicity helps the protein

to remain flexible under saline conditions and prevents aggregation (Jaenicke and Böhm, 1998;

Madern et al., 2000). These changes provide different mechanisms which enable a protein to

function under extreme saline conditions, such as those surrounding Shark Bay stromatolites.
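The compositional signals listed above can be summarised per sequence with a short Python sketch. The acidic residues, Lys and the small hydrophobic residues (Ala, Gly, Val) follow the text; the set of “large” hydrophobic residues (Leu, Ile, Met, Phe, Trp) is an assumption made for illustration, as are the function name and the example sequence.

```python
ACIDIC = set("DE")
SMALL_HYDROPHOBIC = set("AGV")    # as listed in the text
LARGE_HYDROPHOBIC = set("LIMFW")  # assumed grouping, for illustration only

def halophilic_indicators(sequence):
    """Percent content of residue groups associated with halophilic adaptation."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")

    def percent(group):
        return 100.0 * sum(1 for aa in seq if aa in group) / len(seq)

    return {
        "acidic": percent(ACIDIC),
        "lysine": percent({"K"}),
        "small_hydrophobic": percent(SMALL_HYDROPHOBIC),
        "large_hydrophobic": percent(LARGE_HYDROPHOBIC),
    }

# Hypothetical ten-residue sequence.
indicators = halophilic_indicators("DEKAGVLIMF")
```

Comparing such values between clone sequences and their reference cluster is only a first screen; the statistical comparison in this chapter relies on the t-tests described in the methods.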

Similar information about known thermophilic diazotrophs in Paralana Hot Springs (PHS) is

scarce. However, reports of active diazotrophs in hot springs and hydrothermal vents (Mehta and

Baross, 2006; Hamilton et al., 2011b) and recent analyses of thermophilic proteins (Siddiqui

and Thomas, 2008), suggest that a thermophilic diazotroph might acquire unique adaptations,

and reside in PHS. Thermophilic adaptations usually include an increase in charged amino acids

and some hydrophobic amino acids (Ile, Met, Val, Tyr), as well as an increase in Pro and a

decrease in Gly content (Kumar and Nussinov, 2001; Somero, 2003). A decrease in uncharged

polar amino acids such as Ser, Thr, Asn and Gln was also observed in various thermophilic

proteins (Georlette et al., 2003; Daniel et al., 2008). Structural adaptations may involve an

increase in salt bridges within monomers and between subunits, and a decrease in the protein

size, usually by removing loop regions and sections in the N- and C-terminals (Fields, 2001;

Daniel et al., 2008). The increase in charged residues and salt bridges increases ionic networks

which stabilize the protein at higher temperatures and prevent unfolding. Removal of Asn and

Gln stabilizes the protein in general as these amino acids tend to deamidate at higher

temperatures (Kumar and Nussinov, 2001). The increase in hydrophobic residues, specifically at

the core of the protein, enhances hydrophobic interactions and increases its thermostability

overall. In general, thermophilic proteins increase their hydrophobic, electrostatic, van der

Waals and hydrogen-bonding interactions to prevent unfolding at higher temperatures and in the process

become compact and rigid, relative to mesophilic and psychrophilic proteins (Siddiqui and

Cavicchioli, 2006; Daniel et al., 2008).

Our aim was to assess halophilic and thermophilic adaptations in the inferred NifH sequences

from columnar stromatolites of Shark Bay and from the microbial communities at Paralana Hot

Springs, respectively.

6.2 Materials and methods

6.2.1 Evolutionary conservation

Analysed as described in section 5.2.1, chapter 5.

6.2.2 Residue composition

Analysed as described in section 5.2.2, chapter 5.

6.2.3 Statistical analysis

Analysed as described in section 5.2.3, chapter 5.

6.2.4 Distance matrices

Subsets of the individual multiple alignments were converted to phylip format using Readseq

(Gilbert, 2003) on the EMBL-EBI web server (EMBL-European Bioinformatics Institute) and

submitted to “PHYLIP Protdist” version 3.67 (Felsenstein, 2007), available via the Mobyle web

portal (http://mobyle.pasteur.fr/cgi-bin/portal.py#forms::protdist), to create distance matrices as

described previously (chapter 2, section 2.2.6). 1CP2 and 2AFH NifH sequences were extracted

from their respective PDB files, by the WHAT IF web server version 8.0 (Vriend, 1990) and

trimmed to include only one copy of NifH for this calculation.
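As an illustration of the conversion step, the following Python sketch writes an alignment in the strict sequential PHYLIP layout: a header line with the number of sequences and the alignment length, followed by one line per sequence in which the name occupies exactly ten characters. It is not the Readseq implementation, and the function name and the example sequences are hypothetical.

```python
def to_phylip(alignment):
    """Format aligned sequences as strict sequential PHYLIP text.

    alignment: list of (name, aligned_sequence) tuples; all sequences
    must have the same length. Names are cut or padded to 10 characters.
    """
    lengths = {len(seq) for _, seq in alignment}
    if len(lengths) != 1:
        raise ValueError("alignment must be non-empty with equal-length sequences")
    n_chars = lengths.pop()
    lines = [f" {len(alignment)} {n_chars}"]
    for name, seq in alignment:
        lines.append(f"{name[:10]:<10}{seq}")
    return "\n".join(lines) + "\n"

# Hypothetical five-column alignment of two sequences.
phylip_text = to_phylip([("1CP2", "MRQVA"), ("RSA9396", "MRKVA")])
```

Names longer than ten characters are truncated in this layout, so they need to remain unique within their first ten characters before the file is passed to a distance program.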

6.2.5 Structural characteristics

3D crystallographic representatives of the Fe protein from mesophilic Azotobacter vinelandii

PDB file ID 2AFH (Burgess et al., 1980; Tezcan et al., 2005) and Clostridium pasteurianum,

PDB file ID 1CP2 (Schlessman et al., 1998), were chosen in order to assess potential structural

changes in the clone libraries, in relation to cluster I and cluster III, respectively. Structural

characteristics were calculated as described in section 5.2.4, chapter 5. Protein images were

created using the UCSF Chimera program, version 1.6.2 (Pettersen et al., 2004). The I-TASSER

online server provided 3D models for chosen clone sequences, which were superimposed on

the 1CP2 or 2AFH structures, using the “MatchMaker” option in the UCSF Chimera software (Pettersen et

al., 2004; Meng et al., 2006) with the Smith-Waterman (Pearson, 1991) alignment algorithm and

other options left at their defaults (BLOSUM 62 substitution matrix and 30% weighting of the

secondary structure term). Salt bridges were defined by the WHAT IF web server (Vriend,

1990), version 10.1a (http://swift.cmbi.ru.nl/servers/html/index.html) and restricted to an

interatomic distance of less than 4.0 Å as described in section 5.2.4, chapter 5.
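The distance criterion can be illustrated with a minimal Python sketch. It treats the side-chain oxygens of Asp and Glu as negatively charged atoms and the side-chain nitrogens of Lys, Arg and His as positively charged atoms, and reports residue pairs with at least one such atom pair closer than the cutoff. This atom selection, the function names and the coordinates are assumptions for illustration; the sketch is not the WHAT IF implementation.

```python
import math

# Charged side-chain atoms by standard PDB atom name (assumed selection).
ACIDIC_ATOMS = {"ASP": ("OD1", "OD2"), "GLU": ("OE1", "OE2")}
BASIC_ATOMS = {"LYS": ("NZ",), "ARG": ("NE", "NH1", "NH2"), "HIS": ("ND1", "NE2")}

def is_charged(atom, table):
    return atom["atom"] in table.get(atom["resname"], ())

def find_salt_bridges(atoms, cutoff=4.0):
    """Map (acidic residue, basic residue) to their shortest charged-atom distance."""
    acids = [a for a in atoms if is_charged(a, ACIDIC_ATOMS)]
    bases = [a for a in atoms if is_charged(a, BASIC_ATOMS)]
    bridges = {}
    for acid in acids:
        for base in bases:
            distance = math.dist(acid["xyz"], base["xyz"])
            if distance < cutoff:
                pair = (acid["residue"], base["residue"])
                bridges[pair] = min(distance, bridges.get(pair, distance))
    return bridges

# Hypothetical coordinates: one pair 3.0 Å apart, another about 4.1 Å apart.
atoms = [
    {"residue": "Asp125", "resname": "ASP", "atom": "OD1", "xyz": (0.0, 0.0, 0.0)},
    {"residue": "Lys15", "resname": "LYS", "atom": "NZ", "xyz": (3.0, 0.0, 0.0)},
    {"residue": "Glu92", "resname": "GLU", "atom": "OE1", "xyz": (20.0, 0.0, 0.0)},
    {"residue": "Lys170", "resname": "LYS", "atom": "NZ", "xyz": (24.1, 0.0, 0.0)},
]
strict = find_salt_bridges(atoms)               # 4.0 Å cutoff
relaxed = find_salt_bridges(atoms, cutoff=4.5)  # looser, assumed cutoff
```

In this toy input the second pair is missed at 4.0 Å and found at 4.5 Å, which shows how sensitive the list of bridges is to the chosen cutoff.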

6.3 Results

6.3.1 Potential halophilic adaptations in the Fe protein

Evolutionary conservation analysis of the stromatolite clones revealed that 50% of the amplified

region of NifH scored 8 & 9, while 9% scored 6 & 7. Important functional areas such as the

switch I & II, the nucleotide binding site, intersubunit interface within the Fe protein, as well as

the MoFe binding and the metallocluster coordinating residues, were completely conserved in

the stromatolites, in a similar fashion to cluster I & III (figure 1). Four positions had unique

attributes, in comparison to cluster I & III (figure 1, highlighted residues in bold). Position 70

(residue number according to figure 1) in the stromatolite alignment, was a completely

conserved Leu, though in cluster I several other hydrophobic amino acids were also present,

such as Ile/Val, and in cluster III Ala/Thr were present. Position 79 was highly variable, but

included an Asn residue in many of the stromatolites. Asn was absent from cluster I and III

alignments for this position. The residue variants at positions 118-119 also included

Glu and Leu in the stromatolites, while in the clusters, position 118 included mainly

Asp, and position 119 included mainly Phe/Tyr. No other unique variants were found.

We continued with an amino acid compositional analysis, as previously done for cluster I and

III. Residue shifts in the amplified section of the NifH sequences, in the affiliated stromatolites

of cluster III, are depicted in figure 2. Fifteen amino acids did not change in their composition in the

conserved segments, and had similar composition values (red dots in figure 2, table 1).

Nevertheless, Asp, Gly and Arg decreased in stromatolites, while Leu and Tyr increased in their

respective composition (table 1). Leu had the highest standard deviation value, suggesting

variation in its composition within the stromatolites (SD 7.7). In the variable segments for the

clones affiliated with cluster III, there were significant ratio changes in 14 amino acids (table

2). A significant increase was observed in the composition of Asp, Glu, Phe, Gly, Lys, Pro, Gln,

Arg, Val and a significant decrease was observed with Cys, Ile, Leu, Met, and Tyr. Ala, His,

Asn, Ser and Thr composition did not change significantly.

37(*) - DPKADSTRLM LHAKAQNTIM EMAAEAGTVE DLELDEVLKVG YNDVKCVES GGPEPGVGCA GRGVITAINF LEEEGAYDDD - 116

5799999694 9115191976 1256521485 66691171114 211119199 8889889899 8687977975 9611898112

117 - LDFVFYDVLG DVVCGGFAMP IRENKAQEIY IVVSGEMMA - 156

2224381994 6499889678 9624996975 731985987

Row labels: Consensus (a); Cluster I (b); Cluster III.

Figure 1 Conservation of the partial region of NifH in stromatolites (N=61). (*) ConSurf conservation scores for the stromatolite alignment. A representative clone sequence was chosen, with no gaps. Residues in bold and grey background colour are unique variants, see text for details. (a) The consensus line of the stromatolites, red = completely conserved residue, purple = highly conserved residues (80% or greater), non conserved residues are shown in black. (b) Cluster I sequence and conservation, based on P00459 sequence, and cluster III sequence and conservation based on P00456 sequence.

Figure 2 The amino acids mean composition in the partial NifH sequence of cluster III (C3) and affiliated stromatolite clones (S3). Divided into variable vs. conserved regions of NifH. Error bars are SD.

Figure 3 Amino acids mean composition in the partial region of NifH from cluster I (C1) and affiliated stromatolites clones (S1), divided into variable vs. conserved regions of NifH. Error bars are SD.

Table 1 The amino acid mean composition in the conserved sections in cluster I (C1, N=58), cluster III (C3, N=32) and affiliated clones (S1=44, S3=18). Shaded cells denote high Standard Deviation values (SD).

Amino Acid    Ala    Cys    Asp    Glu    Phe    Gly    His    Ile    Lys    Leu
C1-Conserved  11     5.3    6.6    9.2    1.3    14     0.0    6.6    2.7    5.6
S1-Conserved  12     3.6    5.9    9.5    2.4    14     1.2    4.8    3.6    8.4
SD            0.71   1.2    0.49   0.21   0.78   0.0    0.85   1.3    0.64   2.0
C3-Conserved  6.1    4.6    9.1    7.6    1.5    21     0.0    5.9    3.0    6.1
S3-Conserved  4.1    4.2    5.5    8.0    1.4    16     0.0    6.9    1.3    17
SD            1.4    0.28   2.5    0.28   0.071  3.5    0.0    0.71   1.2    7.7

Amino Acid    Met    Asn    Pro    Gln    Arg    Ser    Thr    Val    Trp    Tyr
C1-Conserved  4.9    0.0    5.3    2.6    3.9    2.7    3.9    11     0.0    3.9
S1-Conserved  3.50   0.0    4.8    2.4    2.4    3.6    2.4    11     0.0    4.8
SD            0.99   0.0    0.35   0.14   1.1    0.64   1.1    0.0    0.0    0.64
C3-Conserved  3.1    0.0    6.1    1.5    6.1    4.5    4.5    7.7    0.0    1.5
S3-Conserved  2.4    1.4    5.5    1.4    2.8    5.3    4.2    7.0    0.0    5.5
SD            0.49   0.99   0.42   0.071  2.3    0.57   0.21   0.49   0.0    2.8

Table 2 Amino acid compositions in cluster I (C1, N=58), cluster III (C3, N=32) and affiliated clones (S1=44, S3=18) in the variable portions of the amplified section in NifH.

Amino Acid  Ala          Cys          Asp          Glu          Phe          Gly          His          Ile          Lys          Leu
C3(a)       6.3(2.3)     2.5(0.86)    7.7(2.2)     12(3.5)      3(1.2)       5.7(2)       0.29(0.68)   6.3(2.1)     4.6(1.5)     15(3)
S3          5.5(1.6)     0.14(0.59)   11(1.9)      15(1.6)      4.4(0.77)    12(1.2)      0.41(0.91)   2.2(1)       6.2(2)       3.5(1.7)
C1          6.9(1.8)     0.81(1.5)    8.5(2.7)     11(3)        4.6(2.3)     7.5(1.8)     2.8(1.7)     5.8(2.2)     5.7(2.6)     11(2.3)
S1          0.71(2.1)    0.26(0.83)   6.4(3.1)     17(1.4)      0.71(1.7)    5.4(0.94)    0.72(1.3)    9.2(1.9)     1.1(2.2)     15(3.4)
C3:C1(b)    N            ****         N            N            ****         ****         ****         N            **           ****
S3:C3       N            ****         ****         **           ****         ****         N            ****         **           ****
S1:C1       ****         *            ***          ****         ****         ****         ****         ****         ****         ****

Amino Acid  Met          Asn          Pro          Gln          Arg          Ser          Thr          Val          Trp          Tyr
C3          4.2(1.5)     3.2(1.6)     0.35(0.73)   2.9(1.3)     2(1.3)       3.6(1.9)     4.9(2)       8.5(2.1)     0.57(0.86)   6.2(1.4)
S3          1.5(1.4)     2.9(3.1)     1.4(1.5)     4.7(1.2)     7.9(3.9)     4.2(1.6)     5.9(1.7)     11(1.9)      0(0)         0.14(0.56)
C1          3.3(1.6)     6.7(2.3)     0(0)         1.4(1.8)     2.2(2.5)     7.7(3.3)     3.2(2.9)     8(2.3)       0.16(0.58)   2.6(1.9)
S1          3.1(1.2)     8.4(1.5)     0(0)         0.065(0.43)  8.1(1.1)     7.6(2.3)     13(2.9)      3.6(2.1)     0(0)         0.067(0.44)
C3:C1       *            ****         n/a          ****         N            ****         **           N            *            ****
S3:C3       ****         N            *            ****         ****         N            N            ***          n/a          ****
S1:C1       N            ****         n/a          ****         ****         N            ****         ****         n/a          ****

(a) Mean ratios and standard deviation in parenthesis. C1-Cluster I, C3-Cluster III, S1-Stromatolites affiliated with cluster I and S3-Stromatolites clones affiliated with cluster III. (b) Two tailed t-test P value summary. Means were significantly different (P<0.05) according to unpaired t-tests with Welch’s correction for unequal variances. **** P< 0.0001 extremely significant, *** 0.0001 <P< 0.001 very significant, ** 0.001 <P< 0.01, *0.01 <P< 0.05, significant. N - not significant. n/a - not applicable. ‘ ’ - an increased amount of amino acid, ‘ ’ - decreased amount of amino acid.
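The unpaired t-test with Welch’s correction named in the Table 2 footnote can be computed directly from the summary statistics in the table. The Python sketch below returns the t statistic and the Welch-Satterthwaite degrees of freedom; the two-tailed P value would then be read from a t distribution with that many degrees of freedom, for example in a statistics package. The function name is hypothetical, and the sketch is not the software used in this study.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's unpaired t statistic and degrees of freedom from summary statistics."""
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    if v1 + v2 == 0:
        raise ValueError("both groups have zero variance")
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Asp in the variable segments, values as listed in Table 2:
# S3 = 11 (SD 1.9, N=18) against C3 = 7.7 (SD 2.2, N=32).
t_asp, df_asp = welch_t(11, 1.9, 18, 7.7, 2.2, 32)

# Synthetic check case with equal group sizes and variances.
t_check, df_check = welch_t(2.0, 1.0, 10, 0.0, 1.0, 10)
```

With equal group sizes and equal variances the degrees of freedom reduce to those of the ordinary pooled test, which the synthetic case demonstrates; with unequal variances they are smaller, making the test more conservative.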

An increase in the Leu composition within the conserved segments was also evident in the affiliated

stromatolite clones of cluster I (figure 3, table 1). All other amino acids in the conserved segments had

low standard deviation values and did not vary in their composition, relative to the reference cluster. In

the variable segments, significant ratio changes were found with 16 amino acids. A significant

increase of Glu, Ile, Leu, Asn, Arg and Thr (table 2) and a significant decrease of Ala, Cys, Asp, Phe,

Gly, His, Lys, Gln, Val and Tyr were observed. Pro and Trp were absent, and the Met and Ser

compositions did not vary significantly in the clones affiliated with cluster I.

Potential structural changes to the Fe protein were assessed by submitting 20 stromatolite NifH

clones to the I-TASSER server for model prediction. The clones formed two groups of ten,

each at a maximum distance of 0.23 from the 1CP2 or 2AFH sequences (see table 3). The RMSD

values and evolutionary conservation of the I-TASSER models were analysed, relative to 1CP2 or

2AFH structures.

The 10 stromatolite clones affiliated with cluster III were at a distance of 0.14-0.23 from the 1CP2

sequence and their model RMSD values, according to I-TASSER, ranged from 0.56 (RSA13796)

to 1.05 Å (table 3). There were seven sections, in the clones’ Fe protein models, that showed an

average RMSD value higher than 1 Å, and positions 50-51, and 113-115 presented average RMSD

values higher than 2 Å, indicating a larger shift in the structural alignment compared to the 1CP2 chain

A structure (table 4, figure 5). These two sections were composed of conserved and non-conserved

residues and were exposed to the solvent; their predicted secondary structures included coils, bends and

turns. The 10 stromatolite clones affiliated with cluster I were at a distance of 0.08-0.22, from the

2AFH sequence (table 3), and their RMSD values ranged from 0.36 (RSA13396) to 0.61 Å. There

were three sections which showed RMSD values above 1 Å, and none had values above 2 Å, unlike

in the previous analysis of cluster III affiliated clones (figure 6). According to the 2AFH and affiliated

clones analysis, two Gly residues showed minor structural shifts, one Gly was denoted buried and the

other exposed to the solvent, while the third section, 115-118, was mostly exposed (table 5). The

secondary structure was characterised as coil, turns and bends for these sections and they included

conserved and non conserved residues, according to the ConSurf analysis.
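
The per-position comparison described above can be illustrated with a minimal sketch. It assumes that the clone models have already been superimposed on the reference structure (for example by TM-align) and that each residue is represented by a single Cα coordinate; the mean Cα distance per position stands in here for the average per-position RMSD reported in tables 4 and 5. The function name, the data layout and the toy coordinates are hypothetical and are not taken from the thesis workflow.

```python
import math

def flag_shifted_positions(ref, models, cutoff=1.0):
    """Return {position: mean deviation in Å} for positions above the cutoff.

    ref    -- dict mapping residue position to an (x, y, z) C-alpha coordinate
    models -- list of dicts of the same shape, already superimposed on ref
    """
    flagged = {}
    for pos, ref_xyz in ref.items():
        # Use only the models that contain this position.
        dists = [math.dist(ref_xyz, m[pos]) for m in models if pos in m]
        if dists:
            mean_dev = sum(dists) / len(dists)
            if mean_dev > cutoff:
                flagged[pos] = mean_dev
    return flagged

# Hypothetical toy coordinates, not taken from the I-TASSER models.
reference = {50: (0.0, 0.0, 0.0), 51: (1.0, 0.0, 0.0)}
clone_models = [
    {50: (3.0, 0.0, 0.0), 51: (1.0, 0.5, 0.0)},
    {50: (0.0, 2.0, 0.0), 51: (1.0, 0.0, 0.5)},
]
shifted = flag_shifted_positions(reference, clone_models)
print(shifted)
```

Raising `cutoff` to 2.0 would keep only the larger shifts, in the same way that the text separates sections above 1 Å from those above 2 Å.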

Table 3 Stromatolite clones chosen for structural analysis.

Sequence ID   Distance(*)   C-score(a)   TM(b)     RMSD (Å)(c)   IDEN(d)
1CP2          0             2.2          0.972     0.84          1
RSA9396       0.14          1.93         0.962     0.92          0.941
RSA9696       0.14          1.93         0.963     0.87          0.952
RSA9096       0.15          1.81         0.966     0.9           0.915
RSA13796      0.16          1.93         0.974     0.56          0.929
RSA9196       0.16          1.34         0.9279    1.05          0.7
RSA98963      0.16          1.93         0.964     0.85          0.926
RSA10796      0.18          2.12         0.9804    0.73          0.68
RSA14196      0.19          1.89         0.964     0.85          0.926
RSA11996      0.21          1.91         0.963     0.89          0.918
RSA15296      0.23          1.93         0.962     0.9           0.914

2AFH          0             2.12         0.9897    0.54          1
RSA13596      0.08          2.14         0.9926    0.56          0.96
RSA13396      0.09          1.82         0.996     0.36          0.949
RSA10296      0.12          1.81         0.994     0.41          0.935
RSA11596      0.13          2.15         0.9953    0.43          0.95
RSA11496      0.17          1.79         0.989     0.61          0.916
RSA10196      0.18          1.76         0.989     0.44          0.913
RSA7904       0.19          1.85         0.99      0.43          0.916
RSA6104       0.2           1.57         0.986     0.42          0.867
RSA6704       0.2           1.85         0.989     0.48          0.912
RSA6904       0.22          1.76         0.987     0.5           0.889

(*) Distances as calculated by PHYLIP Protdist version 3.67 (Felsenstein, 2007).
(a) Confidence score for estimating the quality of the predicted top model by I-TASSER (Roy et al., 2010).
(b) TM-score of the structural alignment between the query structure and known structures in the PDB library (Zhang and Skolnick, 2005).
(c) The overall RMSD between residues that were structurally aligned by TM-align (Zhang and Skolnick, 2005).
(d) The percentage sequence identity in the structurally aligned region (Roy et al., 2010).

Table 4 Residue characteristics of stromatolite clones affiliated with cluster III with regions of RMSD >1 Å.

Residue positions*            41-42      50-51      62-63   65-68          89      91-93      113-115
Amino acid(a)                 AD         GL         EE      EDVE           E       GVG        TDD
Conservation score(b)         99         97         18      7571           9       999        111
Main secondary structure(c)   bend/coil  bend/coil  helix   3-helix turn   coil    turn/bend  coil/turn
Solvent accessibility(d)      ee         ee         ee      ee--           e       ---        eee
Average RMSD (Å)(e)           1.245      2.252      1.316   1.142          1.053   1.412      2.328

* Position number according to 1CP2, chain A sequence, P00456 accession ID.
(a) The amino acid in each position in the respective 1CP2 sequence.
(b) ConSurf conservation scores for cluster III, 1-9, non-conserved to completely conserved, respectively.
(c) The secondary structure based on the crystallographic structure of 1CP2.
(d) Solvent accessibility - Buried (b) or exposed (e) residue, (-) varying degrees of exposure. See Table 2 for full details.
(e) Average RMSD values for specific positions, in which the clones presented RMSD values >1 Å, relative to the 1CP2 structure.

Figure 5 Two opposite angles of 1CP2 Fe protein superimposed with ten stromatolite clones I-TASSER models. Magenta highlights areas where RMSD >1 Å. The largest structural shifts, where RMSD >2 Å and the site of the two residue insertion are in red, see table 12 for further details.

Figure 6 Two different angles of 2AFH Fe protein superimposed with its closest ten stromatolite clones I-TASSER models. Magenta highlights areas where RMSD >1 Å, as per table 13.

Table 5 Residue characteristics of stromatolite clones affiliated with cluster I with regions of RMSD >1 Å.

Residue positions*            94      96      115-118
Amino acid(a)                 G       G       YEDD
Conservation score(b)         7       9       9113
Main secondary structure(c)   turn    coil    bend & turn
Solvent accessibility(d)      e       b       eeee
Average RMSD (Å)(e)           1.379   1.028   1.016

* Position number according to 2AFH, P00459 sequence accession ID, chain E.
(a) The amino acid in each position in the respective sequence.
(b) ConSurf conservation scores for cluster I, 1-9, non-conserved to completely conserved, respectively.
(c) The secondary structure based on the crystallographic structure of 2AFH.
(d) Solvent accessibility - Buried (b) or exposed (e) residue, (-) varying degrees of exposure. See Table 4 for full details.
(e) Average RMSD values for specific positions, in which the clones presented RMSD values >1 Å, relative to the 2AFH structure.

According to the analysis we performed on six stromatolite clones (three for each cluster), there was a

total of 11 common salt bridges for the stromatolite clones and 1CP2 or 2AFH (table 14), and 15

unique salt bridges which were not detected in 1CP2 or 2AFH, under the enforced 4 Å interatomic

distance limit, between the side chain oxygen atoms in Asp or Glu, to the side chain nitrogen atoms in

Arg, Lys or His (table 6). Two unique salt bridges were highly conserved in S3 and S1, and were at

positions Asp42-Arg45, and Asp128-Lys9 (residue numbering according to S3, underlined in table 6).

The additional negative residues that sometimes appear in the region of 113-115 in stromatolites

(tables 4 & 5) seem to strengthen ionic bonds, mainly but not only with Lys32 and Lys84. Salt

bridges in S3, corresponding to the above mentioned region, were detected in our analysis but the

interatomic distances ranged from 4.5 to 6.99 Å, and were therefore not specified in table 6. These

salt bridges included Asp residues which interacted mainly with Lys30 and Met1, residues which

scored 1 and 9 for conservation, respectively, in cluster III. The salt bridges also included Lys residues

in this region that interacted with Glu113, but at a distance of 6.52 Å, and therefore were not specified

in table 6.
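
The distance criterion used above (side chain oxygen atoms of Asp or Glu within 4 Å of side chain nitrogen atoms of Arg, Lys or His) can be sketched as follows. This is only an illustration of the criterion, not the WHAT IF implementation used in the analysis: the atom tuples are a hypothetical simplified layout, the atom names are the standard PDB side chain names, and the toy coordinates are invented.

```python
import math
from itertools import product

# Standard PDB names of the charged side chain atoms.
ACID_O = {"ASP": {"OD1", "OD2"}, "GLU": {"OE1", "OE2"}}
BASE_N = {"ARG": {"NE", "NH1", "NH2"}, "LYS": {"NZ"}, "HIS": {"ND1", "NE2"}}

def salt_bridges(atoms, cutoff=4.0):
    """Return {((acid, pos), (base, pos)): shortest O-N distance in Å}.

    atoms -- iterable of (res_name, res_num, atom_name, x, y, z) tuples
    """
    acids = [a for a in atoms if a[2] in ACID_O.get(a[0], ())]
    bases = [a for a in atoms if a[2] in BASE_N.get(a[0], ())]
    found = {}
    for a, b in product(acids, bases):
        d = math.dist(a[3:], b[3:])
        if d <= cutoff:
            key = ((a[0], a[1]), (b[0], b[1]))
            # Keep the shortest oxygen-nitrogen distance for each residue pair.
            found[key] = min(d, found.get(key, d))
    return found

# Hypothetical toy atoms: one pair at 3.0 Å and one pair at 6.5 Å.
toy_atoms = [
    ("ASP", 38, "OD1", 0.0, 0.0, 0.0),
    ("ASP", 38, "CA", 1.0, 1.0, 0.0),   # backbone atom, ignored
    ("LYS", 14, "NZ", 3.0, 0.0, 0.0),
    ("GLU", 113, "OE1", 20.0, 0.0, 0.0),
    ("LYS", 30, "NZ", 26.5, 0.0, 0.0),
]
bridges = salt_bridges(toy_atoms)
print(bridges)
```

With the default cutoff the 6.5 Å pair is excluded, which mirrors the way the longer-range interactions in the 113-115 region were left out of table 6.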

Table 6 Potential salt bridges, with maximum interatomic distance of 4 Å, in the amplified NifH region of the Fe protein. Shaded rows represent common salt bridges present in the representative clones and the selected structure, 1CP2 or 2AFH.

Set       Residue  Position(b)  Residue  Position  Distance (Å)  Conservation scores(c)
1CP2(a)   ASP      38           LYS      14        3.07          9,9
          GLU      62*          LYS      54        2.77          1,6
          GLU      75           ARG      81        3.84          1,1
          GLU      107          LYS      140       3.96          9,9
          ASP      115          ARG      81        3.22          1,1
          ASP      122*         LYS      14        2.95          9,9
          GLU      143          ARG      2         3.58          9,9
S3(d)     ASP      42           ARG      45        3.66          9,8
          GLU      62           ARG      54        3.61          9,1
          ASP      71           ARG      54        3.89          7,1
          GLU      89           ARG      61        3.65          9,9
          GLU      107          LYS      142       2.71          8,8
          ASP      128          LYS      9         3.97          8,-
          GLU      145          ARG      2         3.32          8,-
2AFH(e)   ASP      39           LYS      15        3.09          9,9
E↔F       GLU      92           LYS      170       2.8           9,9
          GLU      110          LYS      143       3.24          9,8
          ASP      118          LYS      32        3.38          3,7
          ASP      125          LYS      15        3.14          9,9
          ASP      129          LYS      41        2.74          9,9
          GLU      141          ARG      140       2.92          6,9
          GLU      146          ARG      3         2.88          9,9
          GLU      154          LYS      10        3.97          9,9
          GLU      229          HIS      50        2.53          2,6
E→F       GLU      265          LYS      52        3.68          9,9
F→E       GLU      277          LYS      52        2.86          -,9
S1(f)     GLU      28           ARG      81        2.7           -,1
          ASP      39           LYS      15        2.73          9,-
          ASP      44           ARG      47        2.71          9,9
          ASP      70           ARG      65        2.66          8,1
          GLU      74           LYS      77        2.68          7,1
          GLU      92           ARG      100       2.71          9,8
          GLU      110          LYS      143       2.73          9,9
          ASP      116          LYS      32        2.74          7,-
          ASP      116          LYS      84        2.62          1,7
          ASP      118          LYS      32        2.68          7,-
          ASP      120          LYS      31        2.73          7,-
          ASP      125          LYS      15        3.77          8,-
          ASP      129          LYS      10        2.73          9,-
          ASP      129          LYS      41        2.71          9,9
          GLU      141          ARG      140       2.7           8,9
          GLU      146          ARG      3         2.65          9,-
          GLU      154          ARG      187       2.74          8,-
          GLU      221          ARG      46        2.69          -,9
          GLU      229          HIS      50        2.75          -,9

(a) 1CP2 analysis by WHAT IF, salt bridges were not detected between the Fe protein subunits A & B.
(b) Positioning was manually corrected for minor shifts per alignment.
(c) Conservation score was based on the individual analysis of ConSurf on cluster I, affiliated stromatolite clones (S1), cluster III and stromatolite clones affiliated with cluster III (S3). Scores ranged from 1 to 9, non-conserved to completely conserved, respectively. "-": score was not calculated.
(d) Based on the WHAT IF analysis on the I-TASSER PDB files of cluster III stromatolite clones: RSA14196, RSA11996 and RSA98963.
(e) 2AFH analysis by WHAT IF, salt bridges were detected between subunits E & F, and are designated where relevant.
(f) Based on the WHAT IF analysis on the I-TASSER PDB files of cluster I stromatolite clones: RSA13596, RSA7904 and RSA6904.
* A yellow background colour denotes a residue in an α-helix structure, and a green colour denotes a residue within a β-sheet. No background colour means random coil or unknown structure.

6.3.2 Potential thermophilic adaptations in the Fe protein

The conservation analysis of the Paralana Hot Springs (PHS) clones revealed that 54% of the

amplified region sequence scored 8 & 9, while 13% scored 6 & 7. Several sections were completely

conserved: the nucleotide binding site, the intersubunit interface within the Fe protein, the MoFe binding

residues, the metallocluster and the two switch regions. Positions 80 and 116 had additional variants,

in comparison to cluster I & III (figure 7, highlighted residues in bold). These positions were highly

variable in PHS and the original clusters; however, in the PHS clones several sequences included a

Cys at position 80, and at position 116 several sequences included a Lys. Neither variant was

present at these positions in the original clusters. Position N106 was completely conserved throughout

the PHS alignment, though not so in the original clusters.
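
The percentages quoted above summarise per-position ConSurf scores by class. A minimal sketch of that tally is given below, assuming the scores are available as a list of integers from 1 to 9, one per alignment position; the function name and the toy score list are hypothetical and do not reproduce the PHS alignment.

```python
def conservation_summary(scores):
    """Percentage of positions scoring 8-9, 6-7 and below 6."""
    n = len(scores)
    high = sum(1 for s in scores if s >= 8)
    mid = sum(1 for s in scores if s in (6, 7))
    low = sum(1 for s in scores if s < 6)
    return {"8-9": 100 * high / n, "6-7": 100 * mid / n, "<6": 100 * low / n}

# Hypothetical toy scores for ten alignment positions.
toy_scores = [9, 9, 8, 7, 6, 3, 1, 9, 2, 8]
summary = conservation_summary(toy_scores)
print(summary)
```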

Following a statistical analysis of the amino acid composition, significant shifts were discovered

in the variable segments of the NifH sequences in Paralana Hot Springs (PHS) clones affiliated with

cluster III. There were significant ratio changes in 10 amino acids: a significant increase in Asp, Phe,

Pro, Arg and Val, and a significant decrease in Ala, Glu, Leu, Ser and Tyr (figure 8, table 7). The

composition of Cys, Gly, His, Ile, Lys, Met, Asn, Gln and Thr, did not change significantly. In its

conserved segment, Gly, Leu and Tyr content varied, according to their SD values, 2.1, 2.8 and 2.3,

respectively (table 8), while SD values for the other amino acids ranged from 0.0 to 1.2. PHS clones

affiliated with cluster I showed a significant increase in the content of Cys, Ile, Leu, and Arg in the

variable section (figure 9, table 7, P1:C1). Ala, Glu, Gly, Lys, Asn, and Gln content decreased

significantly, while eight other amino acids did not change significantly. In the conserved segment,

Glu content increased in P1 clones (SD = 2.7, table 8), compared to cluster I, while SD for the other

amino acids remained low and ranged from 0 to 1.4.
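
The composition values compared above are percentages of each amino acid per sequence, summarised as a mean and standard deviation per group. A minimal sketch of that calculation follows; the function names and the two short toy sequences are hypothetical, and the split into variable and conserved segments is assumed to have been done beforehand.

```python
from statistics import mean, stdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """Percentage of each standard amino acid in one sequence segment."""
    seq = seq.upper()
    total = sum(seq.count(a) for a in AMINO_ACIDS)
    return {a: 100 * seq.count(a) / total for a in AMINO_ACIDS}

def group_stats(seqs):
    """Mean and sample SD of the composition across a group of sequences."""
    comps = [composition(s) for s in seqs]
    return {a: (mean(c[a] for c in comps), stdev(c[a] for c in comps))
            for a in AMINO_ACIDS}

# Hypothetical toy segments, not clone sequences from the thesis.
toy_group = ["AAGL", "AGGL"]
stats = group_stats(toy_group)
print(stats["A"], stats["L"])
```

At least two sequences per group are needed, because the sample standard deviation is undefined for a single value.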

Figure 7 Conservation of the amplified region of NifH in PHS clones (N=36). (*) ConSurf conservation score for the PHS alignment. A representative clone sequence was chosen, with no gaps. (a) The consensus line of the stromatolites, red = completely conserved residue, purple = highly conserved residues (80% or greater), non conserved residues are shown in black. (b) Cluster I sequence and conservation, based on P00459 sequence, and cluster III sequence and conservation based on P00456 sequence.

38(*) - 9469999994 9136691936 1466339199 94663331119 111119599 9989899999 9993795991 991199911--116

DPKADSTRLI LHSKAQNTIM EMAAEAGTVE DLELEDVLKVG YGGIKCVES GGPEPGVGCA GRGVITAINF LEEEGAYED-

117- -11915196 9699979999 9989611991 9799719999999-157

-DLDFVFYD VLGDVVCGGF AMPIRENKAQ EIYIVCSGEMMAL

Row labels in the figure: Consensus (a), Cluster I (b), Cluster III.

Figure 8 Mean amino acid composition in the amplified NifH amino acid sequence of cluster III (C3) and affiliated PHS clones (P3), divided into variable vs. conserved regions of NifH. Error bars are SD.

Figure 9 Mean amino acid composition in the amplified NifH amino acid sequence of cluster I (C1) and affiliated PHS clones (P1), divided into variable vs. conserved regions of NifH. Error bars are SD.

Table 7 Amino acid compositions in cluster I (C1, N=58), cluster III (C3, N=32), and affiliated clones (P1=20, P3=16) in the variable sections of NifH amino acid sequences.

Amino Acid  Ala          Cys          Asp          Glu          Phe          Gly          His          Ile          Lys          Leu
C3(a)       6.3(2.3)     2.5(0.86)    7.7(2.2)     12(3.5)      3(1.2)       5.7(2)       0.29(0.68)   6.3(2.1)     4.6(1.5)     15(3)
P3          4.91(1.66)   1.74(1.40)   10.62(2.70)  8.44(3.68)   4.51(1.31)   7.30(4.20)   0.17(0.69)   6.60(1.92)   6.38(3.80)   12.74(2.28)
C1          6.9(1.8)     0.81(1.5)    8.5(2.7)     11(3)        4.6(2.3)     7.5(1.8)     2.8(1.7)     5.8(2.2)     5.7(2.6)     11(2.3)
P1          5.88(1.03)   2.72(1.11)   8.45(3.58)   6.16(3.52)   3.71(2.56)   5.62(1.53)   2.17(1.29)   10.93(2.72)  4.30(2.12)   13.42(4.31)
C3:C1(b)    N            ****         N            N            ****         ****         ****         N            **           ****
P3:C3       *            N            **           **           ***          N            N            N            N            **
P1:C1       **           ****         N            ****         N            ****         N            ****         *            *

Amino Acid  Met          Asn          Pro          Gln          Arg          Ser          Thr          Val          Trp          Tyr
C3          4.2(1.5)     3.2(1.6)     0.35(0.73)   2.9(1.3)     2(1.3)       3.6(1.9)     4.9(2)       8.5(2.1)     0.57(0.86)   6.2(1.4)
P3          4.73(1.40)   2.21(3.40)   3.95(2.20)   3.25(2.17)   3.40(2.44)   1.52(2.21)   4.50(1.60)   11.94(3.62)  0(0)         1.09(1.46)
C1          3.3(1.6)     6.7(2.3)     0(0)         1.4(1.8)     2.2(2.5)     7.7(3.3)     3.2(2.9)     8(2.3)       0.16(0.58)   2.6(1.9)
P1          4.02(1.41)   3.90(1.45)   0.14(0.62)   0.15(0.68)   6.95(3.19)   6.96(3.19)   4.03(2.39)   7.74(3.42)   0(0)         2.75(1.75)
C3:C1       *            ****         n/a          ****         N            ****         **           N            *            **** +
P3:C3       N            N            ****         N            *            **           N            **           n/a          ****
P1:C1       N            ****         n/a          ****         ****         N            N            N            n/a          N

(a) Mean ratios and standard deviation in parenthesis. C1-Cluster I, C3-Cluster III, P1-PHS clones affiliated with cluster I, P3-PHS clones affiliated with cluster III.
(b) Two tailed t-test P value summary. Means are significantly different (P<0.05) according to unpaired t-tests with Welch’s correction for unequal variances. **** P< 0.0001 extremely significant, *** 0.0001 <P< 0.001 very significant, ** 0.001 <P< 0.01, * 0.01 <P< 0.05, significant. N - not significant. n/a - not applicable. ‘ ’ - increased amount of amino acid, ‘ ’ - decreased amount of amino acid.

Table 8 The amino acid mean composition in the conserved sections of cluster I (C1, N=58), cluster III (C3, N=32) and affiliated clones (P1=20, P3=16). Shaded cells denote high Standard Deviation values (SD).

Amino Acid     Ala    Cys    Asp    Glu    Phe    Gly    His    Ile    Lys    Leu
C1-Conserved   11     5.3    6.6    9.2    1.3    14     0.0    6.6    2.7    5.6
P1-Conserved   11     3.6    5.9    13     2.4    16     1.2    4.7    2.4    6.4
SD             0.0    1.2    0.49   2.7    0.78   1.4    0.85   1.3    0.21   0.57
C3-Conserved   6.1    4.6    9.1    7.6    1.5    21     0.0    5.9    3.0    6.1
P3-Conserved   7.2    4.9    8.2    8.5    1.3    18     0.0    4.7    3.6    10
SD             0.78   0.21   0.64   0.64   0.14   2.1    0.0    0.85   0.42   2.8

Amino Acid     Met    Asn    Pro    Gln    Arg    Ser    Thr    Val    Trp    Tyr
C1-Conserved   4.9    0.0    5.3    2.6    3.9    2.7    3.9    11     0.0    3.9
P1-Conserved   3.2    1.2    4.7    2.4    2.4    3.4    3.6    9.7    0.0    3.5
SD             1.2    0.85   0.42   0.14   1.1    0.49   0.21   0.92   0.0    0.28
C3-Conserved   3.1    0.0    6.1    1.5    6.1    4.5    4.5    7.7    0.0    1.5
P3-Conserved   2.4    1.3    3.8    1.3    4.7    4.6    4.8    6.0    0.0    4.8
SD             0.49   0.92   1.6    0.14   0.99   0.071  0.21   1.2    0.0    2.3
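
The significance summaries in Table 7 rest on unpaired t-tests with Welch's correction. As an illustration only, the sketch below computes the Welch t statistic and the Welch-Satterthwaite degrees of freedom from summary statistics, using the Cys values listed in Table 7 for P1 (2.72, SD 1.11, N=20) and C1 (0.81, SD 1.5, N=58). The function name is hypothetical, and the P value itself is not computed, since that requires the t distribution from a statistics package.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    v1 = sd1 ** 2 / n1          # squared standard error of group 1
    v2 = sd2 ** 2 / n2          # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Cys in the variable sections: P1 versus C1, values as listed in Table 7.
t_stat, dof = welch_t(2.72, 1.11, 20, 0.81, 1.5, 58)
print(round(t_stat, 2), round(dof, 1))
```

A t statistic of about 6 on roughly 45 degrees of freedom is consistent with the **** entry for Cys in the P1:C1 row.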

To find out whether changes in the amino acid content might have influenced the Fe

protein structure, 18 PHS clones, nine from each cluster, were submitted to I-TASSER

to create structural models (table 9). The distance of the nine PHS clones

affiliated with cluster III from the 1CP2 sequence ranged from 0.13 (RSA207) to 0.16, and

their model RMSD values ranged from 0.45 (RSA227) to 0.89 Å, according to the I-TASSER

results.

There were seven sections in the Fe proteins of the P3 clones with an average RMSD value higher

than 1 Å, and positions 50-51 and 112-115 presented average RMSD values higher than 2 Å

(figure 10, table 10). These two sections were exposed to the solvent and composed of

conserved and non-conserved residues. Their predicted secondary structures included coils,

bends and turns. The distance of the nine PHS clones affiliated with cluster I ranged from 0.02

(RSA173) to 0.21 from the 2AFH sequence, and the RMSD values of their I-TASSER models

ranged from 0.34 (RSA159) to 0.56 Å (table 9). The only section which varied structurally,

108-113, included conserved and non conserved residues, most of which were exposed to the

solvent (figure 11, table 11). The predicted structure included parts of a helix and a turn.

Table 9 PHS clones chosen for structural analysis.

Sequence ID    Distance(e)   C-score(a)   TM(b)     RMSD (Å)(c)   IDEN(d)
1CP2           0             1.91         0.972     0.84          1
RSA207Par09    0.13          1.85         0.983     0.51          0.938
RSA165Par09    0.14          1.84         0.969     0.88          0.938
RSA215Par09    0.14          1.85         0.968     0.9           0.933
RSA228Par09    0.14          2.14         0.9815    0.7           0.67
RSA195Par09    0.15          1.85         0.967     0.9           0.929
RSA158Par09    0.15          1.84         0.976     0.89          0.933
RSA226Par09    0.15          2.13         0.9833    0.63          0.69
RSA219Par09    0.16          1.85         0.985     0.46          0.924
RSA227Par09    0.16          1.82         0.989     0.45          0.915

2AFH           0             2.12         0.9897    0.54          1
RSA173Par0     0.02          1.84         0.994     0.43          0.841
RSA159Par0     0.03          1.86         0.992     0.34          0.987
RSA194Par0     0.09          1.86         0.989     0.48          0.954
RSA208Par0     0.11          1.87         0.946     0.41          0.996
RSA192Par0     0.12          1.87         0.99      0.44          0.946
RSA203Par0     0.17          1.83         0.988     0.52          0.916
RSA221Par0     0.17          1.85         0.989     0.48          0.924
RSA213Par0     0.18          1.85         0.992     0.38          0.916
RSA163Par0     0.21          1.85         0.987     0.56          0.895

(a) Confidence score for estimating the quality of the predicted top model by I-TASSER (Roy et al., 2010).
(b) TM-score of the structural alignment between the query structure and known structures in the PDB library (Zhang and Skolnick, 2005).
(c) The overall RMSD between residues that were structurally aligned by TM-align (Zhang and Skolnick, 2005).
(d) The percentage sequence identity in the structurally aligned region (Roy et al., 2010).
(e) Distances as calculated by PHYLIP Protdist version 3.67 (Felsenstein, 2007).

Table 10 Residue characteristics of PHS clones affiliated with cluster III with regions of RMSD >1 Å.

Residue positions*            50-51      62-63   65-67              89     91-93      112-115    152-153
Amino acid(a)                 GL         EE      EDV                E      GVG        YTDD       MM
Conservation score(b)         97         18      757                9      999        6111       79
Main secondary structure(c)   bend/coil  helix   3-helix-turn/coil  coil   turn/bend  coil/turn  helix
Solvent accessibility(d)      ee         ee      ee-                e      ---        eeee       eb
Average RMSD (Å)(e)           2.32       1.23    1.28               1.24   1.37       2.3        1.58

* Position number according to 1CP2, chain A sequence, P00456 accession ID.
(a) The amino acid in each position in the respective 1CP2 sequence.
(b) ConSurf conservation scores for cluster III, 1-9, non-conserved to completely conserved, respectively.
(c) The secondary structure based on the crystallographic structure of 1CP2.
(d) Solvent accessibility - Buried (b) or exposed (e) residue, (-) varying degrees of exposure. See Table 2 for full details.
(e) Average RMSD values for specific positions, in which the clones presented RMSD values >1 Å, relative to the 1CP2 structure.

Table 11 Residue characteristics of PHS clones affiliated with cluster I with regions of RMSD >1 Å.

Residue positions*            108-113
Amino acid(a)                 FLEEEG
Conservation score(b)         899927
Main secondary structure(c)   helix & turn
Solvent accessibility(d)      bbeeee
Average RMSD (Å)(e)           1.7

* Position number according to 2AFH, P00459 sequence accession ID, chain E.
(a) The amino acid in each position in the respective sequence.
(b) ConSurf conservation scores for cluster I, 1-9, non-conserved to completely conserved, respectively.
(c) The secondary structure based on the crystallographic structure of 2AFH.
(d) Solvent accessibility - Buried (b) or exposed (e) residue, (-) varying degrees of exposure. See Table 4 for full details.
(e) Average RMSD values for specific positions, in which the clones presented RMSD values >1 Å, relative to the 2AFH structure.

Figure 10 Two opposite angles of 1CP2 Fe protein superimposed with PHS clones I-TASSER models. Magenta highlights areas where RMSD >1 Å. The largest structural shifts, where RMSD >2 Å and the site of the two residue insertion are in red, see table 18.

Figure 11 2AFH Fe protein superimposed with its closest PHS clones I-TASSER models. Magenta highlights areas where RMSD >1 Å, as per table 19.

According to the analysis we performed on six PHS clones (three for each cluster), there was a

total of seven common salt bridges for PHS clones and 1CP2 or 2AFH and 12 unique salt

bridges which were not detected in 1CP2 or 2AFH (Table 12). Two unique salt bridges were

highly conserved in P3 and P1, and were at positions Asp42-Arg45 and Glu151-Arg184 (P3

residue numbering respectively, underlined in Table 12). The Asp or Glu residues and most of

their positive partners were highly conserved in the unique salt bridges. The additional negative

residues in the region of 112-115 (cluster III, table 10) seem to strengthen ionic bonds with

Lys33 and Met1 mainly, but not only. Salt bridges in P3, corresponding to the above mentioned

region, were detected in our analysis but their interatomic distances ranged from 4.1 to 6.88

Å and were therefore not specified in Table 12. The negative residues interacted mainly with

Arg81, Lys113 and His26, all of which scored 1 for conservation in cluster III.

Table 12 Potential salt bridges, with maximum interatomic distance of 4 Å, in the amplified NifH region of the Fe protein. Shaded rows represent common salt bridges present in the representative clones and the selected structure, 1CP2 or 2AFH.

Set       Residue  Position(b)  Residue  Position  Distance (Å)  Conservation scores(c)
1CP2(a)   ASP      38           LYS      14        3.07          9,9
          GLU      62*          LYS      54        2.77          1,6
          GLU      75           ARG      81        3.84          1,1
          GLU      107          LYS      140       3.96          9,9
          ASP      115          ARG      81        3.22          1,1
          ASP      122*         LYS      14        2.95          9,9
          GLU      143          ARG      2         3.58          9,9
P3(d)     ASP      42           ARG      45        2.65          9,9
          ASP      58           LYS      54        3.3           9,9
          ASP      58           ARG      61        3.25          9,9
          ASP      62           LYS      54        3.61          7,9
          GLU      89           ARG      61        3.19          9,9
          GLU      107          LYS      140       2.73          9,9
          ASP      126          LYS      40        2.84          9,9
          GLU      145          ARG      2         3.22          9,-
          GLU      151          ARG      184       3.04          9,-
2AFH(e)   ASP      39           LYS      15        3.09          9,9
E↔F       GLU      92           LYS      170       2.8           9,9
          GLU      110          LYS      143       3.24          9,8
          ASP      118          LYS      32        3.38          3,7
          ASP      125          LYS      15        3.14          9,9
          ASP      129          LYS      41        2.74          9,9
          GLU      141          ARG      140       2.92          6,9
          GLU      146          ARG      3         2.88          9,9
          GLU      154          LYS      10        3.97          9,9
          GLU      229          HIS      50        2.53          2,6
E→F       GLU      265          LYS      52        3.68          9,9
F→E       GLU      277          LYS      52        2.86          -,9
P1(f)     GLU      29           ARG      82        3.7           -,1
          ASP      44           ARG      47        2.84          9,9
          ASP      70           ARG      65        2.65          9,2
          GLU      111          LYS      144       2.82          9,9
          ASP      117          LYS      33        3.97          7,-
          GLU      118          MET      1         2.77          7,-
          ASP      119          LYS      33        3.25          7,-
          ASP      126          LYS      16        3.82          8,-
          GLU      147          ARG      4         2.66          9,-
          GLU      155          ARG      188       2.96          9,-

(a) 1CP2 analysis by WHAT IF, salt bridges were not detected between the Fe protein subunits A & B.
(b) Positioning was manually corrected for minor shifts per alignment.
(c) Conservation score was based on the individual analysis of ConSurf on cluster I and its affiliated clones (P1), cluster III, PHS clones affiliated with cluster III (P3). Scores ranged from 1 to 9, non-conserved to completely conserved, respectively. "-": score was not calculated.
(d) Based on the WHAT IF analysis on the I-TASSER PDB files of cluster III PHS clones: RSA227Par09, RSA158Par09 and RSA207Par09.
(e) 2AFH analysis by WHAT IF, salt bridges were detected between subunits E & F, and are designated where relevant.
(f) Based on the WHAT IF analysis on the I-TASSER PDB files of cluster I PHS clones: RSA163Par09, RSA208Par09 and RSA194Par09.
* Yellow background colour denotes a residue in an α-helix structure, and green colour denotes a residue within a β-sheet. No background colour means random coil or unknown structure.

6.4 Discussion

6.4.1 Halophilic adaptations

The stromatolite inferred NifH sequences were subjected to analyses of conservation patterns,

amino acids composition and structural shifts, in comparison to cluster I and cluster III.

According to our analyses, most of the amplified region was conserved in the stromatolite

clones. A smaller portion of the alignment was completely conserved than in the

original clusters: 70% and 66% of cluster I and cluster III positions in the same region,

respectively, scored 8 & 9, compared with 50% in the stromatolite alignment. Highly

variable positions, with scores lower than 6, also appeared to be more prevalent in the

sequence alignment of columnar stromatolites (41%) than in clusters I and III

(21% each). This was expected as the stromatolite multiple alignment was combined from

clones affiliated with both clusters, hence their multiple alignment included more variable

residues per position.

Two characteristics were checked in order to find whether the stromatolite sequences, as a

group, had a pattern regardless of their cluster affiliation. Firstly, whether the completely

conserved sections matched the correlating segments in cluster I or III and secondly, whether

the variability of residues, per position, was within the known variants we found for cluster I &

III previously. Completely conserved sections did match the correlating segments in cluster I

and cluster III and included the important functional regions, and in this regard they did not

provide any new information. With regard to the second point, 16 positions out of the 119

residues of the partial gene region (on average) matched our second criterion, and included

residue variants in the stromatolite clones which were different than those present in cluster I or

III. However, 13 variants were present in only one sequence (RSA152) out of the whole set, and

therefore were discarded from further analysis and discussion. Four positions were found to

include patterns unique to the stromatolites, as a group. They suggested bias towards Leu and

Asn, and conservation of function over structure.

The amino acids composition analysis enabled us to detect shifts relative to cluster I or cluster

III. In the clones affiliated with cluster III, the slight decrease in the charged amino acids in the

conserved section, Arg and Asp, could be interpreted as an adaptive strategy to minimise

interference from salt ions within the core of the protein, and the addition of hydrophobic

elements such as Tyr and Leu would minimise accessible surface area to the solvent, within the

important functional buried sites of the protein (Moret and Zebende, 2007). Additionally, in the

variable section there was an increase in positively and negatively charged amino acids (Asp,

Glu, Arg, Lys), an increase in small amino acids (Gly, Pro) and small hydrophobic amino acid

(Val), as well as a decrease in bulkier hydrophobic amino acids (Ile, Leu, Met, Tyr, Cys).

Additional findings included an increase of Leu in the conserved section of stromatolites

regardless of cluster affiliation (table 1). The common finding in the variable sections included

five amino acids whose composition varied significantly in the stromatolites, but not so in the

reference clusters (table 2). Interestingly, Glu and Arg content increased in stromatolite clones

affiliated with cluster I (S1) and cluster III (S3), yet Asp, Ile and Val did not have a joint trend

for S1 and S3, and displayed different shifts (table 2). Asp and Val decreased in S1 and

increased in S3, while Ile increased in S1 and decreased in S3. Tyr and Cys decreased in S1 and

S3, and the most prevalent amino acid was Glu (17/15 for S1/S3, respectively, table 2). The

interplay previously observed in cluster I & III, between Ala and Gly amino acids, was not

present in the stromatolites.

Metagenomic studies of halophilic bacteria, Archaea and also of total bacterial DNA from saline

and hypersaline environmental samples, revealed several re-occurring genomic themes, though

not all of them were absolute for all protein families across all halophiles (Fukuchi et al., 2003;

Paul et al., 2008; Rhodes et al., 2010).

The shift in hydrophobic residues has been partially attributed to the GC-rich DNA that

halophiles possess, which affects codon usage (Paul et al., 2008). Rao and Argos (1981)

reported that in chloroplast-type 2Fe-2S ferredoxins from two halophiles, large hydrophobic

and aliphatic residues such as Ile, Leu, Phe and Met were replaced by smaller residues (Ala,

Gly and Val) to reduce overall protein bulkiness and promote a tight configuration which is less

accessible to the solvent, while the overall hydrophobicity remained relatively unchanged in

comparison to the non-halophilic proteins (Rao and Argos, 1981). The increase in charged

amino acid frequency has been reported as a halophilic mechanism, for instance, to produce

excess of negative charges which act as a charged screen against salt ions, attract water

molecules and enable the protein to remain active in saline conditions up to 4M NaCl.

According to Lanyi (1974), there is also an excess of small amino acids with short side chain -

Gly, Ala. Fukuchi et al. (2003) performed a statistical analysis on 126 proteins from

Halobacterium sp. NRC-1 and three other halophiles and found an abundance of acidic residues

on the external surface of halophilic proteins vs. non halophiles, while the internal composition

did not change significantly.

Paul et al. (2008) presented data from which it was concluded that halophilic proteins, in

general, are less hydrophobic than non halophilic proteins. Similar findings were reported from

a statistical review of 26 halophilic enzymes by Madern et al. (1995) with the additional finding

of lower Lys content (a feature which was mentioned by Eisenberg (1995) as well). However, it

should be noted that some analysis has shown that the composition of Arg and Lys is dictated

solely by the G+C content in the DNA, and has nothing to do with their charge or other

biochemical properties (Cambillau and Claverie, 2000).

In order to achieve a reliable structural analysis, we chose the 10 stromatolite clone sequences

that were relatively close, distance-wise, to the 1CP2/P00456 or 2AFH/P00459 sequences. The

minimum distance for a stromatolite clone to 1CP2 was 0.14 in our dataset (table 3). The

maximum distance was 0.23 and we therefore expected that some structural changes would be

evident. The highest RMSD value for a clone model with cluster I was 0.61 Å (RSA114) while

the highest value for a clone model of cluster III was 1.05 Å (RSA9196), indicative of the

uncertainties in modelling a cluster III Fe protein, when the I-TASSER does not have a robust

number of cluster III resolved structures to rely on. Our analysis suggested two sections

participated in structural shifts in the cluster III affiliated clones. These two sections included a

residue involved in the Fe protein dimer interface (Leu51), and a hydrogen bonding partner with

water molecules (Thr113; Schlessman et al., 1998).

The region of 112-115 in the S3 clones was always elongated by two charged residues. The two

main forms in this region were AESEE or EEDKK in S3 clones, which formed unfavourable

salt bridges (interatomic distance >4 Å, table 6). A quick glance at the cluster III alignment

(figure 2, section 5.3.1, chapter 5) revealed that an insertion of two residues tends to occur in

these positions, and the specific KK or EE type of insertion was also present in the NifH

sequences of Desulfovibrio magneticus strain ATCC 700980 (NIFH_DESMR_1_271 sequence

ID), Desulfovibrio gigas (NIFH_DESGI_1_271) and Desulfatibacillum alkenivorans strain AK-

01 (NIFH_DESAA_1_271). Therefore while the insertion of the charged residues was not

unique or endemic to the stromatolite group, it was definitely a stabilising adaptation for saline

conditions, as these specific species are known to withstand saline conditions (Cravo-Laureau et

al., 2004; Garrity et al., 2005). An increase of salt bridges was reported within monomers and at

the inter-subunit interfaces of halophilic proteins, as a stabilising mechanism (Eisenberg, 1995;

Madern et al., 2000). These studies suggested that the salt bridges were formed by either an

Arg residue which interacted with the acidic residues, or by the solvent ions such as chloride

and sodium to which the salt bridges were bound. Our analysis of potential salt bridges revealed

that while the Asp or Glu residues were always highly conserved in S3 or S1 in general, their

positive partners were sometimes highly variable. The mechanism described by Madern et al.

(2000) would therefore fit the conservation we see of acidic residues in S3 or S1 and would

allow for flexible interactions with positive ions from the solvent.
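The 4 Å criterion used above to separate favourable from unfavourable salt bridges can be illustrated with a simple distance check. This is only a sketch under stated assumptions: the thesis calculated salt bridges with WHAT IF, whereas the function below, its name and the coordinates are hypothetical and apply nothing more than the distance cutoff.

```python
import math

SALT_BRIDGE_CUTOFF = 4.0  # angstroms; the cutoff quoted in the text


def classify_ion_pair(acidic_oxygen, basic_nitrogen, cutoff=SALT_BRIDGE_CUTOFF):
    """Return (distance, label) for one acidic/basic side-chain atom pair."""
    distance = math.dist(acidic_oxygen, basic_nitrogen)
    label = "favourable" if distance <= cutoff else "unfavourable"
    return distance, label


# Hypothetical atom coordinates (angstroms), not taken from any clone model.
glu_oe1 = (10.0, 4.0, 2.0)
lys_nz_near = (12.0, 5.0, 4.0)
lys_nz_far = (14.0, 7.0, 2.0)

for partner in (lys_nz_near, lys_nz_far):
    distance, label = classify_ion_pair(glu_oe1, partner)
    print(f"{distance:.1f} angstroms: {label}")
```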


Our analysis suggested three sections were involved in structural shifts in the affiliated clones of

cluster I. The Gly residues, in positions 94 and 96, support a functionally important Val residue

in between (V95, table 4, section 5.3.2, chapter 5), which participates both in binding to the

MoFe protein and in the dimer interaction within the protein. In addition C97, right after G96,

coordinates the metallo-cluster by hydrogen bonding between the main chain amide, the sulfur

atoms in the cluster and the thiol group of the Cys (NH-S bond). This loop region has been

found to exhibit variation in conformation previously and though one residue was denoted

buried and the other exposed to the solvent, the entire cluster area is considered accessible to the

solvent in general (Schlessman et al., 1998). The region of YEDD (115-118) is mostly non-

conserved and exposed to the solvent, and position 117 sometimes included Asp or Asn in the

S1 clones. The Asn is a rather unique choice for some of the stromatolites, since in 13

sequences in cluster I alignment, an insertion of D/E/S/V was evident as well (figure 5, section

5.3.2, chapter 5), but Asn was never present. These 13 sequences were of NifH from

Magnetococcus strain MC-1, Dechloromonas aromatica, Tolumonas auensis, Pectobacterium

atrosepticum, Klebsiella pneumoniae, Teredinibacter turnerae T7901, Alcaligenes faecalis,

Pseudomonas stutzeri, Azotobacter chroococcum and A. vinelandii. Most of the sequences in

cluster I did not have an additional residue in position 117 (figure 5, section 5.3.2, chapter 5).

According to our previous analysis with 1CP2 and 2AFH resolved crystallographic structures -

the residues corresponding to positions 51-52, 89, 93 and 113-114 also presented high RMSD

values (table 5, section 5.3.3, chapter 5). Hence, we believe a plausible explanation for the

structural shifts in positions 51-52, 89, 91-93, and 113-114 in cluster III clones lies in

the methodology used by I-TASSER. The process relies on available protein structures, which

at the moment are mostly Fe proteins from A. vinelandii, and therefore may not authentically

reflect potential shifts in the structure of Fe proteins from the clones. In a similar fashion,

two out of the three sections observed when analysing 2AFH and related stromatolite clones

(table 5, figure 6) were also detected previously and can be attributed to the I-TASSER process,

and may not represent authentic structural shifts. On the other hand, positions 41-42, 62-63, 65-

68, and 115 (with the insertion of two additional residues), may authentically represent

structural shifts in stromatolite clones affiliated with cluster III. Altogether these findings suggest that

structural shifts occur in the stromatolite Fe proteins, in addition to the possible bias introduced

by the I-TASSER procedure.

In summary, based on amino acid composition and structural analysis, the overall results

suggest halophilic adaptations were present in the inferred NifH sequences of the stromatolites.


6.4.2 Thermophilic adaptations

The Paralana Hot Springs NifH clones (PHS) were subjected to analyses of conservation

patterns, amino acid composition and structural shifts, in comparison to cluster I and cluster III.

Our analysis demonstrated that most of the amplified region was conserved in the PHS clones,

yet relative to cluster I and cluster III, they were less conserved. Seventy percent and 66% of

cluster I and cluster III positions in the same region, respectively, scored 8 & 9. In addition, the

sequence alignment of PHS clones included more highly variable positions with scores below 6.

This was expected as the PHS multiple alignment was combined from clones affiliated with

both clusters; hence their multiple alignment included more variable residues per position. We

also looked for unique residue variants within the PHS multiple alignment that differed from the

variants of the original clusters.
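The conservation comparison above rests on per-position scores, where 8 and 9 denote highly conserved positions and scores below 6 denote variable ones. A minimal sketch of the two summary figures, assuming one integer grade per alignment position on a 1 to 9 scale (the function names and the grades are hypothetical):

```python
def fraction_highly_conserved(scores, threshold=8):
    """Fraction of alignment positions whose conservation grade is >= threshold."""
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(1 for s in scores if s >= threshold) / len(scores)


def count_variable(scores, cutoff=6):
    """Number of positions whose conservation grade falls below the cutoff."""
    return sum(1 for s in scores if s < cutoff)


# Hypothetical per-position grades, for illustration only.
grades = [9, 9, 8, 7, 5, 9, 3, 8, 9, 6]
print(f"highly conserved: {fraction_highly_conserved(grades):.0%}")
print(f"variable positions: {count_variable(grades)}")
```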

In the variable region of PHS alignment, 13 positions were found to include amino acid variants

which were not present in cluster I or III. However, except for positions 80 and 116, the variants

were present in only one sequence out of the whole set, and therefore were discarded from

further analysis and discussion. N106 was completely conserved in PHS, in contrast to the

original clusters (figure 7). It is unknown at the moment, how these changes would affect the Fe

protein function in PHS clones.

Hyperthermophilic proteins usually display an increase in charged (Arg, Lys, Glu, Asp) and

some hydrophobic amino acids (Ile, Met, Val, Tyr), accompanied by a decrease in uncharged

polar residues such as Ser, Thr, Asn and Gln, with no significant variation for His, Pro, Gly or

Cys (Cambillau and Claverie, 2000; Daniel et al., 2008). Other studies reported slightly

different results: an increase in Glu, Ile, Val, Tyr, accompanied by decreases in Ala, His, Gln

and Thr (Fukuchi and Nishikawa, 2001; Singer and Hickey, 2003).
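The residue classes listed above can be tallied directly from a protein sequence. The following is a minimal sketch, assuming one-letter amino acid codes and using only the class memberships named in the paragraph (charged: Arg, Lys, Glu, Asp; hydrophobic: Ile, Met, Val, Tyr; uncharged polar: Ser, Thr, Asn, Gln). The function name and the toy sequence are hypothetical.

```python
from collections import Counter

# Class memberships as named in the text, in one-letter codes.
RESIDUE_CLASSES = {
    "charged": set("RKED"),
    "hydrophobic": set("IMVY"),
    "uncharged_polar": set("STNQ"),
}


def class_fractions(sequence):
    """Fraction of residues in a sequence that fall into each residue class."""
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence must not be empty")
    return {
        name: sum(counts[aa] for aa in members) / total
        for name, members in RESIDUE_CLASSES.items()
    }


# Toy sequence, not a real NifH fragment.
toy = "MKREDIVYSTNQGAGA"
for name, fraction in class_fractions(toy).items():
    print(f"{name}: {fraction:.2f}")
```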

In general, the increase in charged amino acids results in chains of ion pairs, which enhance

stability at high temperatures. Asn and Gln are sensitive to temperature fluctuations, due to the

increased rate of deamidation at high temperatures, hence decreasing their presence promotes

stability overall, at high temperatures. Hydrophobic interactions in the protein affect its stability;

increasing the core hydrophobicity produces a small and compact core, which stabilises the

protein at higher temperatures (Siddiqui and Cavicchioli, 2006; Siddiqui and Thomas, 2008).

The amino acid composition analysis enabled us to detect composition shifts relative to cluster

I or cluster III. Clones affiliated with cluster III (P3) had an increase in positively and negatively

charged amino acids (Asp, Arg), and an increase in small or hydrophobic amino acids (Phe, Val,


Pro) but also a decrease in hydrophobic residues such as Tyr, Leu and Ala, as well as the

negatively charged Glu (table 7, figure 8).

The fluctuations in the conserved sections, relative to cluster III, point to an increase in the Leu

& Tyr content, and a decrease in the Gly content. Therefore there might be interplay between

the external, variable sections and the conserved interior. In the interior, a slight increase in large

hydrophobic residues, would help to minimise accessible surface area to the solvent, within the

important functional buried sites of the protein (Jaenicke and Böhm, 1998; Haney et al., 1999).

In common with the P3 clones, the variable section of the P1 clones included an increase in Arg, and a decrease

in Ala and Glu amino acids (table 7, figure 9). P1 clones also decreased in other charged amino

acids and uncharged polar residues. Glu increased in the conserved sections of P1 but this was

not observed in the P3 clones (table 8). The interplay previously observed in cluster I & III,

between Ala and Gly amino acids in the conserved regions, was not detected in PHS clones.

In order to achieve a reliable structural analysis, we chose 18 PHS clones at a maximum

distance of 0.21 from the 1CP2/P00456 or 2AFH/P00459 sequences (table 9). The highest RMSD

value for a clone model with cluster I was 0.56 Å (RSA163) while the highest value for a clone

model of cluster III was 0.9 Å (RSA215, RSA195), indicative of the uncertainties in modelling

a cluster III Fe protein, with the current low number of available resolved structures of Fe

proteins from this cluster. According to our previous analysis with 1CP2 and 2AFH resolved

crystallographic structures (table 5, section 5.3.3, chapter 5), the residues corresponding to

positions 50-51, 62-63, 65-67, 89, 91-93 and 112-115, presented high RMSD values and

therefore some of the reported shifts in P3 result from the I-TASSER process and may not

represent authentic shifts.

These results were similar to our findings with the partial NifH sequences of the stromatolite clones.

In the P3 clones, the region of 112-115 was sometimes elongated by two residues, and our

analysis suggested a salt bridge might be established at times (table 12, figure 10). Three main

alternatives for this section were observed in the clone sequences - KMD/EESQE/DADKK. For

some of the PHS clones affiliated with cluster I, no insertion was evident at all, and one of the

Asp residues would change to Gly, while another negative residue would be omitted at times

(table 11, figure 11). P1 and P3 did not share the exact same modification, but they did share the

same region in which this modification occurred.

In summary, based on amino acid composition and structural analysis, the analysis suggested

thermophilic adaptations were not present in full, in the inferred NifH sequences of PHS clones.


6.5 Concluding remarks

NifH sequences from the Shark Bay hypersaline environment and Paralana Hot Springs were

analysed in terms of their conservation patterns, amino acid composition and existing and

potential structural attributes. Our methods included a statistical t-test analysis of the amino acid

composition, a novel ConSurf evolutionary analysis, and the use of the I-TASSER web server

for 3D modelling of the amplified region of NifH.
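The t-test on amino acid composition mentioned above compares, for one amino acid, its percentage in a set of clone sequences with its percentage in a set of cluster reference sequences. As a minimal sketch, assuming Welch's unequal-variance form of the test (this section does not state which variant was used) and entirely hypothetical percentages:

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    na, nb = len(sample_a), len(sample_b)
    if na < 2 or nb < 2:
        raise ValueError("each sample needs at least two values")
    # Squared standard errors of the two sample means.
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    if va + vb == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical Glu percentages per sequence, for illustration only.
clone_glu = [10.0, 12.0, 14.0]
cluster_glu = [6.0, 8.0, 10.0]
t_stat, dof = welch_t(clone_glu, cluster_glu)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

Converting the statistic into a p-value requires the t distribution, which is left out here to keep the sketch within the standard library.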

Our results were explained in light of the methodology limitations as discussed previously, in

section 5.4.1, chapter 5.

The results suggested that to a certain degree, halophilic adaptations, with an increase in salt

bridges, charged residues and a decrease in bulkier hydrophobic amino acids, did occur. The

changes were less apparent in the clones affiliated with cluster I than in the clones affiliated

with cluster III, which may be an indication of some measure of protection of the protein from

the environment in the cluster I affiliated clones (see table 13).

The NifH protein sequences from Paralana Hot Springs were subjected to a similar analysis.

The results suggested that to a limited degree, some of the known thermophilic adaptations - an

increase in salt bridges, charged residues and Pro, were present in the sequences; however other

known features were not detected, including an increase in several hydrophobic amino acids and

a decrease in uncharged polar residues. These conflicting results may be indicative of a

changing temperature regime in the hot spring, as different temperatures were reported in the

past (Mawson, 1927; Grant, 1938; Long et al., 2001; Anitori et al., 2002), or of additional

environmental factors such as salinity, coming into play (see table 13). These factors require

further confirmation.

Some of our findings can only be confirmed once an Fe protein structure has been determined

for representative microorganisms isolated from the investigated environments.


Table 13 Summarising halophilic and thermophilic findings from this study.

Halophilic adaptations*                          Stromatolites NifH clones
More Asp or Glu, Ala, Gly or Val(a)              S3: Glu, Asp, Gly, Val; S1: Glu
Less Lys, Ile or Leu or Phe or Met(a)            S3: Ile, Leu, Phe, Met; S1: Lys, Phe
More salt bridges(b)                             +

Thermophilic adaptations**                       PHS NifH clones
More Ile or Tyr, Arg or Glu, Pro or Lys(a)       P3: Arg, Pro; P1: Ile, Arg, Pro
Less Gly or Met or Gln or Thr or Asn or Ser(a)   P3: Ser; P1: Gly, Gln, Asn, Ser
More salt bridges(b)                             +
High Arg/Lys ratio (>1)(a)                       P3: 0.53; P1: 1.61

*Specific halophilic adaptations (Eisenberg, 1995; Madern et al., 2000; Bolhuis et al., 2008).

** Specific thermophilic adaptations (Haney et al., 1999; Daniel et al., 2008).

(a) Specific changes in the amino acids composition and whether they appeared in the variable sections of the NifH sequence of the stromatolite clones (S1, S3), or the PHS clones (P1, P3). Changes were in comparison to cluster I (S1, P1 vs. C1) or cluster III (S3, P3 vs. C3) values.

(b) See tables 14 & 20 - salt bridges calculated by WHAT IF, version 10.1a, (Rodriguez et al., 1998).
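The Arg/Lys ratio reported in table 13 is a simple count ratio, with values above 1 taken as a thermophilic indicator. A minimal sketch, assuming one-letter amino acid codes; the function name and the toy sequence are hypothetical, and how sequences without Lys were handled in the thesis is not stated, so the sketch returns None in that case:

```python
def arg_lys_ratio(sequence):
    """Ratio of Arg (R) to Lys (K) counts, or None when the sequence has no Lys."""
    seq = sequence.upper()
    arg, lys = seq.count("R"), seq.count("K")
    if lys == 0:
        return None  # ratio is undefined without any Lys
    return arg / lys


# Toy sequence, not a real NifH fragment.
toy = "MRRKAGRK"
ratio = arg_lys_ratio(toy)
print(f"Arg/Lys = {ratio:.2f}; above 1: {ratio > 1}")
```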


Chapter 7 Conclusions & future work

_______________________________________________

“Nothing in biology makes sense except in the light of evolution” is a statement that still stands

true, throughout the decades (Dobzhansky, 1973). Genetic studies have revealed that the nifH

gene is present in numerous bacteria and Archaea and is relatively common in a vast number of

genomes (Gary Stacey, 1992; Berman-Frank et al., 2003; Raymond et al., 2004a). This in turn

suggests that the gene has been present in the genetic code, for a long time, perhaps even since

the Last Universal Common Ancestor (LUCA) (Fani et al., 2000; Leipe et al., 2002; Latysheva

et al., 2012).

As stated in the beginning of this thesis - nitrogen fixation is one of the most important

biochemical processes. Our main aim in this work was to study microbial communities involved

in this process, which reside in unique, sometimes extreme, environments. We then analysed the

modifications in the NifH sequences we obtained from the molecular work, and assessed

whether unique adaptations of the Fe protein were evident. Our non-molecular methods

included a statistical t-test analysis of amino acid compositions, and a novel combination of an

evolutionary analysis and protein 3D models.

It would appear then, that from the early beginning of life on Earth, the nifH gene had been

translated into a functional protein, under various environmental conditions (Leigh, 2000).

According to some recent studies, it would seem that phylogenetic trees based on functional

genes, such as the nifH gene, represent microbial communities better than taxonomy-based

phylogenetic trees, as they reflect the immediate environment in which the micro-organisms live

(Burke et al., 2011; Hamilton et al., 2011a). It is reasonable to assume that proteins would be

optimised to ensure survival in a specific environmental setting, and that micro-evolution would

match specific ecological niches (Taroncher-Oldenburg et al., 2003). Findings of this nature

suggest that the different clusters in the phylogenetic tree would represent past

adaptations to environmental changes, regardless of taxonomical relations (Burke et al., 2011;

Hamilton et al., 2011a), as well as conditions currently influencing the

composition of the genetic code.

In other words - phylogenetic affiliations would correlate best with specific physical and

chemical influences, during a specific time frame, and not necessarily with taxonomical groups,


and in addition, functional genes such as nifH would not be identical in the same species, if

its members reside in different environments. Altogether this suggests that a linear story for the

evolution of the nifH gene (or other functional gene), is highly unlikely. Published phylogenetic

analyses of nifH, and also related nif operon genes, seem to support this avenue of thought

(Gary Stacey, 1992; Fani et al., 2000; Leipe et al., 2002; Berman-Frank et al., 2003; Raymond

et al., 2004a; Latysheva et al., 2012).

A possible interpretation of the current known topology of the nifH tree (four clusters, cluster I

and III as the main clades, see chapters 1-3) would be that the main clusters most probably

represent an adaptation to the presence, or lack of, oxygen. In turn, this would set the clusters’

time of branching at around 2.22-2.45 billion years ago, at the great oxidation event (Brocks

et al., 1999; Anbar et al., 2007). The current tree topology may thus represent not only a

specific and dramatic change that happened in the past, at some point in time, but also an

ongoing global setting - still affecting genomes across a wide range of geographical locations.

We would argue that any functional gene phylogenetic tree should be searched for a similar

topology, and if found, one could assume that the ‘great divide’ would have been set around the

time of the great oxidation event.

We have assessed in this study, for the first time, bacterial profiles from two Antarctic sites, in

the Terra Nova Bay area (Abramovich et al., 2012). In order to gather evidence for the bacterial

communities in these glacial zones, we carried out a terminal-restriction fragment length

polymorphism (T-RFLP) analysis on 16S rDNA using a universal bacterial amplification

protocol on two permafrost cores (Marsh, 1999). Bray-Curtis cluster analysis suggested Boulder

Clay bacterial profiles were similar to each other, but clustered separately from the Amorphous

Glacier bacterial profile (Hammer et al., 2001). Amorphous Glacier was potentially rich in

microbial species and the two sites differed in their microbial diversity. Permafrost and icy

environments are difficult to work with (Miteva, 2008), but they are present on Mars and other

objects in the solar system (McKay et al., 1991; Friedmann, 1993; Ostroumov and Siegert,

1996). Icy environments on Earth are therefore important analogue sites for astrobiological

research, if we aim to learn and adapt technologies to find life elsewhere in the universe (Soina

et al., 1995).
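The Bray-Curtis comparison of T-RFLP profiles mentioned above reduces each pair of profiles to a single dissimilarity value between 0 (identical) and 1 (no shared fragments). A minimal sketch of that pairwise measure, assuming each profile is a list of non-negative peak abundances over the same ordered set of terminal fragments (the function name and the profiles are hypothetical; the clustering step built on top of these values is not shown):

```python
def bray_curtis(profile_a, profile_b):
    """Bray-Curtis dissimilarity between two abundance profiles of equal length."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same fragments")
    total = sum(profile_a) + sum(profile_b)
    if total == 0:
        raise ValueError("at least one profile must contain a non-zero abundance")
    return sum(abs(a - b) for a, b in zip(profile_a, profile_b)) / total


# Hypothetical peak abundances for four terminal fragments.
site_1 = [10, 0, 5, 5]
site_2 = [5, 5, 0, 10]
print(f"dissimilarity = {bray_curtis(site_1, site_2):.2f}")
```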

Our study is the first to confirm the presence of nifH genes in columnar stromatolites from Shark

Bay, Western Australia (chapter 3). Shark Bay, a UNESCO World Heritage site, provides

researchers with fascinating endemic microbiological subjects, which bridge our current era

with Archaean fossil records from the beginning of life on Earth. These “living fossils” are

important to our understanding of the origin of life on Earth, as their remnants are consistently


being found in the Earth’s geological records, the oldest to date found in 3.49 Ga Archaean rocks

(Walter, 1976; Schopf, 2006).

Our findings partially matched former taxonomical findings on the stromatolites based on

studies which utilised mainly 16S rDNA and culturing analyses. Common potential diazotrophs

included cyanobacterial species and Desulfatibacillum of the δ-Proteobacteria (Goh et al., 2008;

Allen et al., 2009; Burns et al., 2009). The two stromatolite samples, from different years,

differed in their species diversity and richness, and we suggested this was related to the

environmental events that occurred at the time of sampling. Our results indicated that columnar

stromatolites and the salt ponds of Guerrero Negro, Mexico, harbour similar diazotrophic

species, mainly from the δ-Proteobacteria and Cyanobacteria groups. However, the stromatolites

included unique species, such as non-heterocystous Cyanobacteria and γ- and δ-Proteobacteria NifH

sequences, which were not present in the Guerrero Negro salt ponds. A new clade was an out-

group to cluster I, and centred on the δ-proteobacterium, Pelobacter carbinolicus DSM 2380

and affiliated NifH clones.

In a different part of Australia, the diazotrophic community of a hot and slightly radioactive

spring was investigated for the first time (chapter 4). Our findings included diazotrophs from the

Cyanobacteria, Nitrospirae, Spirochaetes, Bacteroidetes, Firmicutes and δ-Proteobacteria

groups, few of which were reported by a former taxonomical study utilising a 16S rDNA

analysis (Anitori et al., 2002). These diazotrophs were mainly affiliated with cluster I and

cluster III of the NifH phylogeny tree; however, two new clades were established as out groups

to cluster I. These clades included NifH clones closely related to Thermodesulfovibrio

yellowstonii DSM 11347 (Nitrospirae), several Geobacter spp. and P. carbinolicus DSM 2380

(δ-Proteobacteria).

The number of NifH clones analysed and sequenced in this study (76), represents the highest

number of NifH clones from a single hot spring to be analysed to date (Hamilton et al., 2011).

According to our richness and diversity analysis, the diazotrophic community was more diverse

and included more NifH species than Shark Bay columnar stromatolites and should be further

investigated and sampled. Hydrothermal systems in general produce habitable

microenvironments (Jannasch and Wirsen, 1981; Sogin et al., 2006), and there is evidence to

suggest their existence on Mars, Europa, Enceladus and other solar bodies (McCollom, 1999;

Vance et al., 2007; Glein et al., 2008; Skok et al., 2010), making Paralana’s active

amagmatic hydrothermal system an interesting analogue site for astrobiology research.
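The richness and diversity comparison above can be illustrated with two common summary figures. This is a sketch under stated assumptions: this chapter does not name the indices used, so the Shannon index is assumed here, and the function name and the clone assignments are hypothetical.

```python
import math
from collections import Counter


def richness_and_shannon(assignments):
    """Return (richness, Shannon index H) for a list of per-clone group labels."""
    counts = Counter(assignments)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("assignments must not be empty")
    # H = -sum(p * ln p) over the relative abundance p of each group.
    shannon = -sum((n / total) * math.log(n / total) for n in counts.values())
    return len(counts), shannon


# Hypothetical grouping of six clones into sequence types.
clones = ["type_A", "type_A", "type_B", "type_B", "type_C", "type_C"]
richness, shannon = richness_and_shannon(clones)
print(f"richness = {richness}, H = {shannon:.3f}")
```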

Our bioinformatics approach paved the way for future research to use the nifH gene as a

reference point for analysis of genomic and protein modifications (chapters 5 & 6). While our


data sets were small, they allowed for an in-depth analysis of our methodology and its limitations.

The results were limited by the nature of our datasets, and yet showed great promise as specific

adaptations were detected in the NifH sequences from Shark Bay and Paralana Hot Springs,

supporting the notion of dynamic evolution in their respective environments.

Future work

Molecular and bioinformatics tools were our main methodologies in this study. Future

researchers may want to focus not only on potential diazotrophs but also on identifying the

actual nitrogen fixers in these unique environments. The new out groups of the NifH

phylogenetic tree reported in this study, represent adaptation to high temperatures and high

salinity, but it is unclear if they are active agents in fixing atmospheric nitrogen.

Assessment of actual nitrogenase activity can be achieved with reverse transcriptase PCR and

quantitative reverse transcriptase PCR, and with acetylene reduction assays. These

methodologies would shed light on the key players in the N2 fixation cycle.

Whole genome amplification could also be used to increase the DNA concentrations recovered

from the environment for downstream PCR analysis. Such research will confirm the presence

and viability of psychrophilic, thermophilic and halophilic bacterial phyla, and correlate the

community composition with the geological and habitat characteristics. Proteomics studies

would link nitrogen fixation key enzymes and genes to other biochemical processes, such as

photosynthesis (oxygenic and anoxic) or sulphate reduction and oxidation, and would provide

comparable data with other microbial systems. Measurements of 15N uptake on a micron scale,

within for example, the stromatolite mats’ upper layers (down to 5-8 mm depth), would provide

a reliable portrait of the nitrogen budget within the layered microbial mats and within the

different types of stromatolite mats.

Additionally, as most of what is currently known about the nitrogenase activity is derived from

studies based on Cyanobacteria, nitrogenase activity should be explored and characterised in

diazotrophic sulphate reducing bacteria (SRB) and other anaerobic bacteria.

Future work may also include analysing the new phylogenetic out groups presented in this study

(chapters 3 and 4). Our methodology can be applied to these sequences, and the results compared

with our current body of work and with distinct thermophilic or halophilic NifH

sequences, or perhaps GTPases from thermophilic and halophilic genomes. This, in turn, will not

only clarify the evolutionary steps which bring forth thermophilic or halophilic


adaptations, across taxonomical groups and across protein families, but will also clarify whether

taxonomy trumps functionality for this type of gene (see the opening paragraphs in this chapter).

In addition, comparing these out group sequences to cluster I and cluster III affiliated clones

might reveal a gradient of adaptations in the protein composition and structure, thus

illuminating the entire range of adaptations possible to diazotrophs in a specific environment.

Elongation of the amplified region of the nifH gene via the PCR process would be very

beneficial to our analysis, and will enable researchers to confirm or reject our current analysis,

mainly with regard to the amino acid compositions and content in the conserved vs. non-conserved

regions of the Fe protein. Additional characteristics that can be assessed in regard to

potential adaptations include (briefly): aromatic interactions, hydrogen bonds, disulfide bridges,

surface accessibility of certain amino acids, electrostatic interactions in the core vs. protein

surface and thermodynamic and protein activity properties.

In summary, this study has enhanced our knowledge of microbiological agents which survive

successfully in extreme environments. These environments are worthy of our attention as they

provide analogous sites for research intended to find evidence for life elsewhere in the solar

system. Given enough time to adapt, these successful micro-organisms could survive rigorous

conditions outside of Earth’s protective shell, promoting an optimistic view of finding micro-

organisms elsewhere in the solar system.


References

Abascal, F., Zardoya, R., and Posada, D. (2005) ProtTest: selection of best-fit models of protein evolution. Bioinformatics 21: 2104-2105.
Abramovich, R.S., Pomati, F., Jungblut, A.D., Guglielmin, M., and Neilan, B.A. (2012) T-RFLP Fingerprinting Analysis of Bacterial Communities in Debris Cones, Northern Victoria Land, Antarctica. Permafrost Periglac 23: 244-248.
Abyzov, S.S., Filippova, S.N., and Kuznetsov, V.D. (1983) Nocardiopsis antarcticus-A new species of actinomyces isolated from the ice sheet of the Central Antarctica glacier. Izv Akad Nauk Ser Biol 4: 559-568.
Adams, D.G. (2000) Heterocyst formation in cyanobacteria. Curr Opin Microbiol 3: 618-624.
Affourtit, J., Zehr, J., and Paerl, H. (2001) Distribution of nitrogen-fixing microorganisms along the Neuse River Estuary, North Carolina. Microb Ecol 41: 114-123.
Aislabie, J., Jordan, S., Ayton, J., Klassen, J.L., Barker, G.M., and Turner, S. (2009) Bacterial diversity associated with ornithogenic soil of the Ross Sea region, Antarctica. Can J Microbiol 55: 21-36.
Aislabie, J.M., Chhour, K.L., Saul, D.J., Miyauchi, S., Ayton, J., Paetzold, R.F., and Balks, M.R. (2006) Dominant bacteria in soils of Marble Point and Wright Valley, Victoria Land, Antarctica. Soil Biol Biochem 38: 3041-3056.
Akaike, H. (2002) A new look at the statistical model identification. Automatic Control, IEEE Transactions on 19: 716-723.
Allen, M., Goh, F., Burns, B., and Neilan, B. (2009) Bacterial, archaeal and eukaryotic diversity of smooth and pustular microbial mat communities in the hypersaline lagoon of Shark Bay. Geobiology 7: 82-96.
Allen, M.A. (2006) An Astrobiology-Focused Analysis of Microbial Mat Communities from Hamelin Pool, Shark Bay, Western Australia. In School of Biotechnology and Biomolecular Sciences. Sydney: The University of New South Wales, p. 243.
Allen, M.A., Goh, F., Leuko, S., Igo, A.E., Mizuki, T., Usami, R. et al. (2008) Haloferax elongans sp nov and Haloferax mucosum sp nov., isolated from microbial mats from Hamelin Pool, Shark Bay, Australia. Int J Syst Evol Microbiol 58: 798-802.
Altschul, S., Madden, T., Schaffer, A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389.
Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. (1990) Basic local alignment search tool. J. Mol. Biol 215: 403-410.


Aluri, S., and Terli, R. (2012) Three dimensional modelling of beta endorphin and its interaction with three opioid receptors. Journal of Computational Biology and Bioinformatics Research 4: 51-57.
Amann, R.I., Ludwig, W., and Schleifer, K.H. (1995) Phylogenetic identification and in situ detection of individual microbial cells without cultivation. Microbiol Rev 59: 143-169.
Amato, P., Hennebelle, R., Magand, O., Sancelme, M., Delort, A.M., Barbante, C. et al. (2007) Bacterial characterization of the snow cover at Spitzberg, Svalbard. FEMS Microbiol Ecol 59: 255-264.
Anbar, A.D., Duan, Y., Lyons, T.W., Arnold, G.L., Kendall, B., Creaser, R.A. et al. (2007) A whiff of oxygen before the great oxidation event? Science 317: 1903-1906.
Andres, M.S., and Pamela Reid, R. (2006) Growth morphologies of modern marine stromatolites: A case study from Highborne Cay, Bahamas. Sediment Geol 185: 319-328.
Anisimova, M., and Gascuel, O. (2006) Approximate likelihood-ratio test for branches: A fast, accurate, and powerful alternative. Syst Biol 55: 539.
Anitori, R.P., Trott, C., Saul, D.J., Bergquist, P.L., and Walter, M.R. (2002) A culture-independent survey of the bacterial community in a radon hot spring. Astrobiology 2: 255-270.
Apweiler, R., Martin, M., O’Donovan, C., Magrane, M., Alam-Faruque, Y., Antunes, R. et al. (2010) The universal protein resource (UniProt) in 2010. Nucleic Acids Res 38: D142-D148.
Argandoña, M., Fernández Carazo, R., Llamas, I., Martínez Checa, F., Caba, J.M., Quesada, E., and Moral, A. (2005) The moderately halophilic bacterium Halomonas maura is a free living diazotroph. FEMS Microbiol Lett 244: 69-74.
Ashkenazy, H., Erez, E., Martz, E., Pupko, T., and Ben-Tal, N. (2010) ConSurf 2010: calculating evolutionary conservation in sequence and structure of proteins and nucleic acids. Nucleic Acids Res 38: W529-W533.
Bai, Y., Yang, D., Wang, J., Xu, S., Wang, X., and An, L. (2006) Phylogenetic diversity of culturable bacteria from alpine permafrost in the Tianshan Mountains, northwestern China. Res Microbiol 157: 741-751.
Bakermans, C., Tsapin, A.I., Souza-Egipsy, V., Gilichinsky, D.A., and Nealson, K.H. (2003) Reproduction and metabolism at -10°C of bacteria isolated from Siberian permafrost. Environ Microbiol 5: 321-326.
Bardavid, R., Ionescu, D., Oren, A., Rainey, F., Hollen, B., Bagaley, D. et al. (2007) Selective enrichment, isolation and molecular detection of Salinibacter and related extremely halophilic Bacteria from hypersaline environments. Hydrobiologia 576: 3-13.
Bargagli, R., Skotnicki, M.L., Marri, L., Pepi, M., Mackenzie, A., and Agnorelli, C. (2004) New record of moss and thermophilic bacteria species and physico-chemical properties of geothermal soils on the northwest slope of Mt. Melbourne (Antarctica). Polar Biol 27: 423-431.

Barrett, J.E., Virginia, R.A., Wall, D.H., Cary, S.C., Adams, B.J., Hacker, A.L., and Aislabie, J.M. (2006) Co-variation in soil biodiversity and biogeochemistry in northern and southern Victoria Land, Antarctica. Antarct Sci 18: 535-548. Bauer, K., Díez, B., Lugomela, C., Seppälä, S., Borg, A., and Bergman, B. (2008) Variability in benthic diazotrophy and cyanobacterial diversity in a tropical intertidal lagoon. FEMS Microbiol Ecol 63: 205-221. Bauld, J., Favinger, J.L., Madigan, M.T., and Gest, H. (1986) Obligately halophilic Chromatium vinosum from Hamelin Pool, Shark Bay, Australia. Curr Microbiol 14: 335-339. Bazylinski, D.A., Dean, A.J., Schüler, D., Phillips, E.J.P., and Lovley, D.R. (2000) N2 dependent growth and nitrogenase activity in the metal metabolizing bacteria, Geobacter and Magnetospirillum species. Environ Microbiol 2: 266-273. Belay, N., Sparling, R., and Daniels, L. (1984) Dinitrogen fixation by a thermophilic methanogenic bacterium. Bell, R.E., and Ben-Tal, N. (2003) In silico identification of functional protein interfaces. Comp Funct Genomics 4: 420-423. Berezin, C., Glaser, F., Rosenberg, J., Paz, I., Pupko, T., Fariselli, P. et al. (2004) ConSeq: the identification of functionally and structurally important residues in protein sequences. Bioinformatics 20: 1322. Berg, J.M., Tymoczko, J.L., and Stryer, L. (2002) Biochemistry. New York: W. H. Freeman and Co. Bergman, B., Gallon, J.R., Rai, A.N., and Stal, L.J. (1997) N2 Fixation by non-heterocystous cyanobacteria. In, pp. 139-185. Berman-Frank, I., Lundgren, P., and Falkowski, P. (2003) Nitrogen fixation and photosynthetic oxygen evolution in cyanobacteria. Res Microbiol 154: 157-164. Bertics, V., Sohm, J., Treude, T., Chow, C., Capone, D., Fuhrman, J., and Ziebis, W. (2010) Burrowing deeper into benthic nitrogen cycling: the impact of bioturbation on nitrogen fixation coupled to sulfate reduction. Mar Ecol Prog Ser 409: 1-15. Bertrand-Sarfati, J., and Walter, M.R. 
(1976) Chapter 5.2 An Attempt to Classify Late Precambrian Stromatolite Microstructures. In Developments in Sedimentology: Elsevier, pp. 251-259. Bhat, W.W., Lattoo, S.K., Razdan, S., Dhar, N., Rana, S., Dhar, R.S. et al. (2012) Molecular cloning, bacterial expression and promoter analysis of squalene synthase from Withania somnifera (L.) Dunal. Gene. Bhatia, M., Sharp, M., and Foght, J. (2006) Distinct Bacterial Communities Exist beneath a High Arctic Polythermal Glacier. Appl Environ Microbiol 72: 5838-5845. Blackwood, C.B., Marsh, T., Kim, S.-H., and Paul, E.A. (2003) Terminal Restriction Fragment Length Polymorphism Data Analysis for Quantitative Comparison of Microbial Communities. Appl Environ Microbiol 69: 926-932. Blight, P.G. (1977) Uraniferous Metamorphics and "younger" Granites of the Paralana Area, Mount Painter Province, South Australia: A Petrographical and Geochemical Study: Department of Geology, University of Adelaide.

Boeckmann, B., Bairoch, A., Apweiler, R., Blatter, M.C., Estreicher, A., Gasteiger, E. et al. (2003) The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res 31: 365-370. Bohm, G., and Jaenicke, R. (1994) Relevance of sequence statistics for the properties of extremophilic proteins. International Journal of Peptide and Protein Research 43: 97-106. Bohme, H. (1998) Regulation of nitrogen fixation in heterocyst-forming cyanobacteria. Trends Plant Sci 3: 346-351. Bolhuis, A., Kwan, D., and Thomas, J. (2008) Halophilic Adaptations of Proteins. In Protein adaptation in extremophiles. Siddiqui, K.S., and Thomas, T. (eds): Nova Science Publishers, Inc., pp. 71-104. Bonin, P., and Michotey, V. (2006) Nitrogen budget in a microbial mat in the Camargue (southern France). Mar Ecol Prog Ser 322: 75. Bothe, H., Tripp, H., and Zehr, J. (2010) Unicellular cyanobacteria with a new mode of life: the lack of photosynthetic oxygen evolution allows nitrogen fixation to proceed. Arch Microbiol: 1-8. Bourne, H.R., Sanders, D.A., and McCormick, F. (1991) The GTPase superfamily: conserved structure and molecular mechanism. Bowman, J.P., and McCuaig, R.D. (2003) Biodiversity, Community Structural Shifts, and Biogeography of Prokaryotes within Antarctic Continental Shelf Sediment. Appl Environ Microbiol 69: 2463-2483. Bowman, J.P., McCammon, S.A., Brown, M.V., Nichols, D.S., and McMeekin, T.A. (1997) Diversity and association of psychrophilic bacteria in Antarctic sea ice. Appl Environ Microbiol 63: 3068-3078. Bowman, J.P., McCammon, S.A., Gibson, J.A.E., Robertson, L., and Nichols, P.D. (2003) Prokaryotic Metabolic Activity and Community Structure in Antarctic Continental Shelf Sediments. Appl Environ Microbiol 69: 2448-2462. Brambilla, E., Hippe, H., Hagelstein, A., Tindall, B.J., and Stackebrandt, E. (2001) 16S rDNA diversity of cultured and uncultured prokaryotes of a mat sample from Lake Fryxell, McMurdo Dry Valleys, Antarctica. 
Extremophiles 5: 23-33. Brewer, W. (1866) Note on the organisms of the geysers of California. Am. J. Sci 92: 429. Brinkmeyer, R., Knittel, K., Jurgens, J., Weyland, H., Amann, R., and Helmke, E. (2003) Diversity and Structure of Bacterial Communities in Arctic versus Antarctic Pack Ice. Appl Environ Microbiol 69: 6610-6619. Brocks, J., Logan, G., Buick, R., and Summons, R. (1999) Archean molecular fossils and the early rise of eukaryotes. Science 285: 1033. Brown, I.I., Bryant, D.A., Casamatta, D., Thomas-Keprta, K.L., Sarkisova, S.A., Shen, G. et al. (2010) Polyphasic Characterization of a Thermotolerant Siderophilic Filamentous Cyanobacterium That Produces Intracellular Iron Deposits. Appl Environ Microbiol 76: 6664.

Brown, M., Friez, M., and Lovell, C. (2003) Expression of nifH genes by diazotrophic bacteria in the rhizosphere of short form Spartina alterniflora. FEMS Microbiol Ecol 43: 411-417. Brugger, J., Long, N., McPhail, D.C., and Plimer, I. (2005) An active amagmatic hydrothermal system: The Paralana hot springs, Northern Flinders Ranges, South Australia. Chemical Geology 222: 35-64. Bureau of Meteorology, C.o.A. (2011). Climate Data Online [WWW document]. URL http://www.bom.gov.au/climate/data/. Burgess, B.K., and Lowe, D.J. (1996) Mechanism of Molybdenum Nitrogenase. Chem Rev 96: 2983-3012. Burgess, B.K., Jacobs, D.B., and Stiefel, E.I. (1980) Large-scale purification of high activity Azotobacter vinelandii nitrogenase. Biochimica et Biophysica Acta (BBA)-Enzymology 614: 196-209. Burke, C., Steinberg, P., Rusch, D., Kjelleberg, S., and Thomas, T. (2011) Bacterial community assembly based on functional genes rather than species. Proceedings of the National Academy of Sciences 108: 14288-14293. Burling, M., Pattiaratchi, C., and Ivey, G. (2003) The tidal regime of Shark Bay, Western Australia. Estuarine, Coastal and Shelf Science 57: 725-735. Burns, B., Goh, F., Allen, M., and Neilan, B. (2004) Microbial diversity of extant stromatolites in the hypersaline marine environment of Shark Bay, Australia. Environ Microbiol 6: 1096-1101. Burns, B., Anitori, R., Butterworth, P., Henneberger, R., Goh, F., Allen, M. et al. (2009) Modern analogues and the early history of microbial life. Precambrian Res 173: 10-18. Burns, R.C., Hardy, R.W.F., and Anthony San, P. (1972) Purification of nitrogenase and crystallization of its Mo-Fe protein. In Methods in Enzymology: Academic Press, pp. 480-496. Cambillau, C., and Claverie, J.-M. (2000) Structural and genomic correlates of hyperthermostability. J Biol Chem 275: 32383-32386. Cannone, N., Wagner, D., Hubberten, H., and Guglielmin, M. 
(2008) Biotic and abiotic factors influencing soil properties across a latitudinal gradient in Victoria Land, Antarctica. Geoderma 144: 50-65. Carpenter, E.J., Lin, S., and Capone, D.G. (2000) Bacterial Activity in South Pole Snow. Appl Environ Microbiol 66: 4514-4517. Carugo, O. (2003) How root-mean-square distance (rmsd) values depend on the resolution of protein structures that are compared. Journal of applied crystallography 36: 125-128. Caspi, R., and Karp, P.D. (2002) Using the MetaCyc Pathway Database and the BioCyc Database Collection: John Wiley & Sons, Inc. Cavicchioli, R. (2002) Extremophiles and the search for extraterrestrial life. Astrobiology 2: 281-292. Chakrabartty, A., Schellman, J.A., and Baldwin, R.L. (1991) Large differences in the helix propensities of alanine and glycine.

Chao, A., and Yang, M.C.K. (1993) Stopping Rules and Estimation for Recapture Debugging with Unequal Failure Rates. In: Biometrika Trust, pp. 193-201. Chen, H., and Zhou, H.X. (2005) Prediction of solvent accessibility and sites of deleterious mutations from protein sequence. Nucleic Acids Res 33: 3193. Chevenet, F., Brun, C., Bañuls, A., Jacq, B., and Christen, R. (2006) TreeDyn: towards dynamic graphics and annotations for analyses of trees. BMC Bioinformatics 7: 439. Chien, Y., and Zinder, S. (1996) Cloning, functional organization, transcript studies, and phylogenetic analysis of the complete nitrogenase structural genes (nifHDK2) and associated genes in the archaeon Methanosarcina barkeri 227. J Bacteriol 178: 143. Chiu, H.J., Peters, J.W., Lanzilotta, W.N., Ryle, M.J., Seefeldt, L.C., Howard, J.B., and Rees, D.C. (2001) MgATP-bound and nucleotide-free structures of a nitrogenase protein complex between the Leu 127Δ-Fe-protein and the MoFe-protein. Biochemistry 40: 641-650. Christner, B.C., Mosley-Thompson, E., Thompson, L.G., and Reeve, J.N. (2005) Classification of Bacteria from Polar and Nonpolar Glacial Ice. In Life in Ancient Ice. Castello, J.D., and Rogers, S.O. (eds). Princeton, New Jersey: Princeton University Press, pp. 227-239. Christner, B.C., Mosley-Thompson, E., Thompson, L.G., Zagorodnov, V., Sandman, K., and Reeve, J.N. (2000) Recovery and Identification of Viable Bacteria Immured in Glacial Ice. Icarus 144: 479-485. Chung, J., Wang, W., and Bourne, P. (2006) Exploiting sequence and structure homologs to identify protein-protein binding sites. Proteins 62: 630. Chung, J.L., Wang, W., and Bourne, P.E. (2005) Exploiting sequence and structure homologs to identify protein–protein binding sites. Proteins: Structure, Function, and Bioinformatics 62: 630-640. Clarridge, J.E., III (2004) Impact of 16S rRNA Gene Sequence Analysis for Identification of Bacteria on Clinical Microbiology and Infectious Diseases. Clin. Microbiol. Rev. 17: 840-862. 
Clement, B.G., Kehl, L.E., DeBord, K.L., and Kitts, C.L. (1998) Terminal restriction fragment patterns (TRFPs), a rapid, PCR-based method for the comparison of complex bacterial communities. J Microbiol Methods 31: 135-142. Cole, J., Wang, Q., Cardenas, E., Fish, J., Chai, B., Farris, R. et al. (2009) The Ribosomal Database Project: improved alignments and new tools for rRNA analysis. Nucleic Acids Res 37: D141. Cole, J.R., Chai, B., Farris, R.J., Wang, Q., McGarrell, D.M., Bandela, A.M. et al. (2007) The ribosomal database project (RDP-II): introducing myRDP space and quality controlled public data. Nucleic Acids Res 35: D169. Cole, J.R., Chai, B., Marsh, T.L., Farris, R.J., Wang, Q., Kulam, S.A. et al. (2003) The Ribosomal Database Project (RDP-II): previewing a new autoaligner that allows regular updates and the new prokaryotic taxonomy. Nucleic Acids Res 31: 442-443. Colon-Lopez, M., Sherman, D., and Sherman, L. (1997) Transcriptional and translational regulation of nitrogenase in light-dark-and continuous-light-grown cultures

of the unicellular cyanobacterium Cyanothece sp. strain ATCC 51142. J Bacteriol 179: 4319. Costello, E., Halloy, S., Reed, S., Sowell, P., and Schmidt, S. (2009) Fumarole-Supported Islands of Biodiversity within a Hyperarid, High-Elevation Landscape on Socompa Volcano, Puna de Atacama, Andes. Appl Environ Microbiol 75: 735. Cravo-Laureau, C., Matheron, R., Joulian, C., Cayol, J.-L., and Hirschler-Rea, A. (2004) Desulfatibacillum alkenivorans sp. nov., a novel n-alkene-degrading, sulfate-reducing bacterium, and emended description of the genus Desulfatibacillum. Int J Syst Evol Microbiol 54: 1639-1642. CRBIP, T. (2007) Centre de Ressources Biologiques de l'Institut Pasteur. In: Institut Pasteur. D'Agostino, R.B. (1986) Tests for Normal Distribution. In Goodness-of-fit techniques. D'Agostino, R.B., and Stephens, M.A. (eds). New York, NY, USA: Marcel Dekker, Inc. Daniel, R.M., Danson, M.J., Hough, D.W., Lee, C.K., Peterson, M.E., and Cowan, D.A. (2008) Enzyme stability and activity at high temperatures. In Protein Adaptation in Extremophiles. Siddiqui, K.S., and Thomas, T. (eds). New York, NY: Nova Science Publishers, Inc, pp. 1-34. Darapaneni, V., Prabhaker, V.K., and Kukol, A. (2009) Large-scale analysis of influenza A virus sequences reveals potential drug target sites of non-structural proteins. J Gen Virol 90: 2124-2133. DasSarma, S., and Arora, P. (2006) Halophiles. eLS. Davey, A., and Marchant, H.J. (1983) Seasonal Variation in Nitrogen Fixation by Nostoc-Commune Vaucher at the Vestfold Hills Antarctica. Phycologia 22: 377-386. Davila, A.F., Gómez-Silva, B., De los Rios, A., Ascaso, C., Olivares, H., McKay, C.P., and Wierzchos, J. (2008) Facilitation of endolithic microbial survival in the hyperarid core of the Atacama Desert by mineral deliquescence. Journal of Geophysical Research 113: G01028. Davila, A.F., Duport, L.G., Melchiorri, R., Jänchen, J., Valea, S., de los Rios, A. et al. (2010) Hygroscopic Salts and the Potential for Life on Mars. 
Astrobiology 10: 617-628. Deming, J.W. (2002) Psychrophiles and polar regions. Curr Opin Microbiol 5: 301-309. Derakshani, M., Lukow, T., and Liesack, W. (2001) Novel bacterial lineages at the (sub) division level as detected by signature nucleotide-targeted recovery of 16S rRNA genes from bulk soil and rice roots of flooded rice microcosms. Appl Environ Microbiol 67: 623-631. Des Marais, D. (1995) The biogeochemistry of hypersaline microbial mats. Adv Microb Ecol 14: 251. Des Marais, D.J. (2003) Biogeochemistry of Hypersaline Microbial Mats Illustrates the Dynamics of Modern Microbial Ecosystems and the Early Evolution of the Biosphere. Biol Bull 204: 160-167. Desnues, C., Michotey, V., Wieland, A., Zhizang, C., Fourçans, A., Duran, R., and Bonin, P. (2007) Seasonal and diel distributions of denitrifying and bacterial

communities in a hypersaline microbial mat (Camargue, France). Water Res 41: 3407-3419. Diallo, M.D., Reinhold-Hurek, B., and Hurek, T. (2008) Evaluation of PCR primers for universal nifH gene targeting and for assessment of transcribed nifH pools in roots of Oryza longistaminata with and without low nitrogen input. FEMS Microbiol Ecol 65: 220-228. Dilworth, M.J., Eldridge, M.E., and Eady, R.R. (1993) The molybdenum and vanadium nitrogenases of Azotobacter chroococcum: effect of elevated temperature on N2 reduction. Biochem J 289: 395. Distel, D.L., Morrill, W., MacLaren-Toussaint, N., Franks, D., and Waterbury, J. (2002) Teredinibacter turnerae gen. nov., sp. nov., a dinitrogen-fixing, cellulolytic, endosymbiotic gamma-proteobacterium isolated from the gills of wood-boring molluscs (Bivalvia: Teredinidae). Int J Syst Evol Microbiol 52: 2261-2269. Dixon, R., and Kahn, D. (2004) Genetic regulation of biological nitrogen fixation. Nature Reviews Microbiology 2: 621-631. Dobzhansky, T. (1973) Nothing in biology makes sense except in the light of evolution. American Biology Teacher 35: 125-129. Dunbar, J., Ticknor, L.O., and Kuske, C.R. (2000) Assessment of Microbial Diversity in Four Southwestern United States Soils by 16S rRNA Gene Terminal Restriction Fragment Analysis. Appl Environ Microbiol 66: 2943-2950. Dunbar, J., Ticknor, L.O., and Kuske, C.R. (2001) Phylogenetic Specificity and Reproducibility and New Method for Analysis of Terminal Restriction Fragment Profiles of 16S rRNA Genes from Bacterial Communities. Appl Environ Microbiol 67: 190-197. Dupraz, C., and Visscher, P. (2005) Microbial lithification in marine stromatolites and hypersaline mats. Trends Microbiol 13: 429-438. Dupraz, C., Reid, R., Braissant, O., Decho, A., Norman, R., and Visscher, P. (2009) Processes of carbonate precipitation in modern microbial mats. Earth-Sci Rev 96: 141-162. Eder, W., and Huber, R. 
(2002) New isolates and physiological properties of the Aquificales and description of Thermocrinis albus sp. nov. Extremophiles 6: 309-318. Edgar, R.C. (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucl. Acids Res. 32: 1792-1797. Edwards, A.M. (1868) Original Communications: On the Occurrence of Living Forms in the Hot Waters of California. Quarterly Journal of Microscopical Science 2: 247-250. Edwards, D., Stajich, J.E., and Hansen, D. (2009) Bioinformatics: Tools and Applications: Springer. Eisenberg, H. (1995) Life in unusual environments: progress in understanding the structure and function of enzymes from extreme halophilic bacteria. Archives of Biochemistry and Biophysics 318: 1-5. Eisenberg, H., Mevarech, M., and Zaccai, G. (1992) Biochemical, structural, and molecular genetic aspects of halophilism. Advances in protein chemistry 43: 1-62.

Empadinhas, N., and da Costa, M.S. (2010) Diversity and biosynthesis of compatible solutes in hyper/thermophiles. Int Microbiol 9: 199-206. Everett, K.D.E., Bush, R.M., and Andersen, A.A. (1999) Emended description of the order Chlamydiales, proposal of Parachlamydiaceae fam. nov. and Simkaniaceae fam. nov., each containing one monotypic genus, revised taxonomy of the family Chlamydiaceae, including a new genus and five new species, and standards for the identification of organisms. Int J Syst Bacteriol 49: 415-440. Falcón, L., Cerritos, R., Eguiarte, L., and Souza, V. (2007) Nitrogen fixation in microbial mat and stromatolite communities from Cuatro Cienegas, Mexico. Microb Ecol 54: 363-373. Fani, R., Gallo, R., and Liò, P. (2000) Molecular Evolution of Nitrogen Fixation: The Evolutionary History of the nifD, nifK, nifE, and nifN Genes. J Mol Evol 51: 1-11. Farnelid, H., Öberg, T., and Riemann, L. (2009) Identity and dynamics of putative N2 fixing picoplankton in the Baltic Sea proper suggest complex patterns of regulation. Environmental Microbiology Reports 1: 145-154. Fay, P. (1992) Oxygen relations of nitrogen fixation in cyanobacteria. Microbiology and Molecular Biology Reviews 56: 340. Feller, G., and Gerday, C. (2003) Psychrophilic enzymes: hot topics in cold adaptation. Nature Reviews Microbiology 1: 200-208. Feller, G., Lonhienne, T., Deroanne, C., Libioulle, C., Van Beeumen, J., and Gerday, C. (1992) Purification, characterization, and nucleotide sequence of the thermolabile alpha-amylase from the antarctic psychrotroph Alteromonas haloplanctis A23. J Biol Chem 267: 5217-5221. Felsenstein, J. (1981) Evolutionary trees from DNA sequences: A maximum likelihood approach. J Mol Evol 17: 368-376. Felsenstein, J. (2007) PHYLIP (phylogeny inference package) version 3.67. Distributed by the author. Department of Genome Sciences, University of Washington, Seattle, USA. Fernandez-Valiente, E., Quesada, A., Howard-Williams, C., and Hawes, I. 
(2001) N2-Fixation in Cyanobacterial Mats from Ponds on the McMurdo Ice Shelf, Antarctica. Microb Ecol 42: 338-349. Fernandez-Valiente, E., Camacho, A., Rochera, C., Rico, E., Vincent, W.F., and Quesada, A. (2007) Community structure and physiological characterization of microbial mats in Byers Peninsula, Livingston Island (South Shetland Islands, Antarctica). In, pp. 377-385. Ferris, M., Nold, S., Santegoeds, C., and Ward, D. (2001) Examining bacterial population diversity within the Octopus Spring microbial mat community. Thermophiles: Biodiversity, Ecology and Evolution: 51–64. Fields, P.A. (2001) Review: Protein function at thermal extremes: balancing stability and flexibility. Comparative Biochemistry and Physiology-Part A: Molecular & Integrative Physiology 129: 417-431. Fiore, C.L., Jarett, J.K., Olson, N.D., and Lesser, M.P. (2010) Nitrogen fixation and nitrogen transformations in marine symbioses. Trends Microbiol 18: 455-463.

Fleming, H., and Haselkorn, R. (1973) Differentiation in Nostoc muscorum: Nitrogenase is Synthesized in Heterocysts. In, pp. 2727-2731. Flint, D.J., and Abeysinghe, P.B. (2000/07) Geology and mineral resources of the Gascoyne Region: Western Australia Geological Survey. In. Perth: Western Australia Geological Survey, p. 29. Fourçans, A., Oteyza, T., Wieland, A., Solé, A., Diestra, E., Bleijswijk, J. et al. (2004) Characterization of functional bacterial groups in a hypersaline microbial mat community (Salins de Giraud, Camargue, France). FEMS Microbiol Ecol 51: 55-70. Francis, C.A., Beman, J.M., and Kuypers, M.M.M. (2007) New processes and players in the nitrogen cycle: the microbial ecology of anaerobic and archaeal ammonia oxidation. The ISME Journal 1: 19-27. Franzmann, P.D., and Dobson, S.J. (1992) Cell wall-less, free-living spirochetes in Antarctica. FEMS Microbiol Lett 97: 289-292. French, H., and Guglielmin, M. (1999a) Observations on the ice-marginal, periglacial geomorphology of Terra Nova Bay, northern Victoria Land, Antarctica. Permafrost Periglac 10: 331-347. French, H.M., and Guglielmin, M. (1999b) Observations on the Ice-Marginal, Periglacial Geomorphology of Terra Nova Bay, Northern Victoria Land, Antarctica. Permafrost Periglac 10: 331-347. French, H.M., and Guglielmin, M. (2000) Frozen Ground Phenomena in the Vicinity of Terra Nova Bay, Northern Victoria land, Antarctica: A Preliminary Report. Geografiska Annaler: Series A, Physical Geography 82: 513-526. Frezzotti, M., Salvatore, M., Vittuari, L., Grigioni, P., and De Silvestri, L. (2001) Satellite Image Map - Northern Foothills and Inexpressible Island Area (Victoria Land, Antarctica). Ter Ant Rep 6: 1-8. Friedmann, E.I. (1993) Extreme environments and exobiology. G Bot Ital 127: 369-376. Friedmann, E.I., Kappen, L., Meyer, M.A., and Nienow, J.A. (1993) Long-term productivity in the cryptoendolithic microbial community of the Ross Desert, Antarctica. Microb Ecol 25: 51-69. 
Frostegard, A., Courtois, S., Ramisse, V., Clerc, S., Bernillon, D., Le Gall, F. et al. (1999) Quantification of bias related to the extraction of DNA directly from soils. Appl Environ Microbiol 65: 5409. Fryberger, S., Krystinik, L., and Schenk, C. (1990) Tidally flooded back-barrier dunefield, Guerrero Negro area, Baja California, Mexico. Sedimentology 37: 23-43. Fukuchi, S., and Nishikawa, K. (2001) Protein surface amino acid compositions distinctively differ between thermophilic and mesophilic bacteria. J Mol Biol 309: 835-843. Fukuchi, S., Yoshimune, K., Wakayama, M., Moriguchi, M., and Nishikawa, K. (2003) Unique amino acid composition of proteins in halophilic bacteria. J Mol Biol 327: 347-357.

Gaidos, E., Lanoil, B., Thorsteinsson, T., Graham, A., Skidmore, M., Han, S.K. et al. (2004) A Viable Microbial Community in a Subglacial Volcanic Crater Lake, Iceland. Astrobiology 4: 327-344. Galinski, E.A. (1993) Compatible solutes of halophilic eubacteria: molecular principles, water-solute interaction, stress protection. Cell Mol Life Sci 49: 487-496. Galinski, E.A., and Trüper, H.G. (1994) Microbial behaviour in salt-stressed ecosystems. FEMS Microbiol Rev 15: 95-108. Gall, J.L. (1963) A new species of Desulfovibrio. J Bacteriol 86: 1120. Gallon, J.R. (2001) N2 fixation in phototrophs: adaptation to a specialized way of life. Plant and Soil 230: 39-48. Gallon, J.R., Hashem, M.A., and Chaplin, A.E. (1991) Nitrogen fixation by Oscillatoria spp. under autotrophic and photoheterotrophic conditions. Microbiology 137: 31. Gambacorta, A., Gliozzi, A., and Rosa, M. (1995) Archaeal lipids and their biotechnological applications. World Journal of Microbiology and Biotechnology 11: 115-131. Garcia-Pichel, F., Nübel, U., and Muyzer, G. (1998) The phylogeny of unicellular, extremely halotolerant cyanobacteria. Arch Microbiol 169: 469-482. Garrity, G.M., Brenner, D.J., Krieg, N.R., and Staley, J.R. (2005) Bergey's Manual of Systematic Bacteriology, Volume Two: The Proteobacteria, Parts A-C: Springer-Verlag. Stacey, G., Burris, R.H., and Evans, H.J. (1992) Biological nitrogen fixation. New York: Chapman & Hall. Gauthier, G., Gauthier, M., and Christen, R. (1995) Phylogenetic analysis of the genera Alteromonas, Shewanella, and Moritella using genes coding for small-subunit rRNA sequences and division of the genus Alteromonas into two genera, Alteromonas (emended) and Pseudoalteromonas gen. nov., and proposal of twelve new species combinations. Int J Syst Bacteriol 45: 755-761. Georgiadis, M., Komiya, H., Chakrabarti, P., Woo, D., Kornuc, J., and Rees, D. (1992) Crystallographic structure of the nitrogenase iron protein from Azotobacter vinelandii. Science 257: 1653. 
Georlette, D., Damien, B., Blaise, V., Depiereux, E., Uversky, V.N., Gerday, C., and Feller, G. (2003) Structural and Functional Adaptations to Extreme Temperatures in Psychrophilic, Mesophilic, and Thermophilic DNA Ligases. J Biol Chem 278: 37015-37023. Gerstein, M. (1998) How representative are the known structures of the proteins in a complete genome? A comprehensive structural census. Folding and Design 3: 497-512. Gilbert, D. (2003) Sequence File Format Conversion with Command Line Readseq. Current Protocols in Bioinformatics. Gilichinsky, D., Vishnivetskaya, T., Petrova, M., Spirina, E., Mamykin, V., and Rivkina, E. (2008) Bacteria in Permafrost. In Psychrophiles: from Biodiversity to Biotechnology. Margesin, R., Schinner, F., Marx, J.-C., and Gerday, C. (eds). Berlin, Germany: Springer, pp. 83-102.

Gilichinsky, D., Rivkina, E., Bakermans, C., Shcherbakova, V., Petrovskaya, L., Ozerskaya, S. et al. (2005) Biodiversity of cryopegs in permafrost. FEMS Microbiol Ecol 53: 117-128. Gilichinsky, D.A., Wilson, G.S., Friedmann, E.I., McKay, C.P., Sletten, R.S., Rivkina, E.M. et al. (2007) Microbial Populations in Antarctic Permafrost: Biodiversity, State, Age, and Implication for Astrobiology. Astrobiology 7: 275-311. Glaser, F., Rosenberg, Y., Kessel, A., Pupko, T., and Ben-Tal, N. (2005) The ConSurf-HSSP database: the mapping of evolutionary conservation among homologs onto PDB structures. Proteins 58: 610–617. Glaser, F., Pupko, T., Paz, I., Bell, R.E., Bechor-Shental, D., Martz, E., and Ben-Tal, N. (2003) ConSurf: identification of functional regions in proteins by surface-mapping of phylogenetic information. Bioinformatics 19: 163. Gliozzi, A., Relini, A., and Chong, P.L.G. (2002) Structure and permeability properties of biomimetic membranes of bolaform archaeal tetraether lipids. J Membr Sci 206: 131-147. Goh, F., Barrow, K.D., Burns, B.P., and Neilan, B.A. (2010) Identification and regulation of novel compatible solutes from hypersaline stromatolite-associated cyanobacteria. Arch Microbiol: 1-8. Goh, F., Jeon, Y.J., Barrow, K., Neilan, B.A., and Burns, B.P. (2011) Osmoadaptive Strategies of the Archaeon Halococcus hamelinensis Isolated from a Hypersaline Stromatolite Environment. Astrobiology 11: 529-536. Goh, F., Leuko, S., Allen, M., Bowman, J., Kamekura, M., Neilan, B., and Burns, B. (2006) Halococcus hamelinensis sp. nov., a novel halophilic archaeon isolated from stromatolites in Shark Bay, Australia. Int J Syst Evol Microbiol 56: 1323. Goh, F., Allen, M., Leuko, S., Kawaguchi, T., Decho, A., Burns, B., and Neilan, B. (2008) Determining the specific microbial populations and their spatial distribution within the stromatolite ecosystem of Shark Bay. The ISME Journal 3: 383-396. Goldenberg, O., Erez, E., Nimrod, G., and Ben-Tal, N. 
(2008) The ConSurf-DB: pre-calculated evolutionary conservation profiles of protein structures. Nucleic Acids Res. Goldsmith-Fischman, S., Kuzin, A., Edstrom, W.C., Benach, J., Shastry, R., Xiao, R. et al. (2004) The SufE Sulfur-acceptor Protein Contains a Conserved Core Structure that Mediates Interdomain Interactions in a Variety of Redox Protein Complexes. J Mol Biol 344: 549-565. Golubic, S., and Walter, M.R. (1976) Chapter 4.1 Organisms that Build Stromatolites. In Developments in Sedimentology: Elsevier, pp. 113-126. Good, I.J. (1953) THE POPULATION FREQUENCIES OF SPECIES AND THE ESTIMATION OF POPULATION PARAMETERS. In, pp. 237-264. Goto, M., Ando, S., Hachisuka, Y., and Yoneyama, T. (2005) Contamination of diverse nifH and nifH-like DNA into commercial PCR primers. FEMS Microbiol Lett 246: 33-38. Gragnani R, Guglielmin M, Stenni B, Longinelli A, Smiraglia C, and L, C. (1998) Origins of the ground ice in the ice-free lands of the Northern Foothills (Northern

Victoria Land, Antarctica). Lewkowicz, A.G., and Allard, M. (eds). Yellowknife, Canada: Collection Nordicana, pp. 335-340. Grant, K. (1938) The Radio-activity and Composition of the Water and Gases of the Paralana Hot Spring. Trans. Roy. Soc. SA 62: 2. Greaves, R.B., and Warwicker, J. (2009) Stability and solubility of proteins from extremophiles. Biochem Biophys Res Commun 380: 581-585. Grimm, F., Cort, J.R., and Dahl, C. (2010) DsrR, a novel IscA-like protein lacking iron-and Fe-S-binding functions, involved in the regulation of sulfur oxidation in Allochromatium vinosum. J Bacteriol 192: 1652-1661. Groudieva, T., Kambourova, M., Yusef, H., Royter, M., Grote, R., Trinks, H., and Antranikian, G. (2004) Diversity and cold-active hydrolytic enzymes of culturable bacteria associated with Arctic sea ice, Spitzbergen. Extremophiles 8: 475-488. Guglielmin, M., and French, H.M. (2004) Ground ice in the Northern Foothills, northern Victoria Land, Antarctica. Ann Glaciol 39: 495-500. Guglielmin, M., and Cannone, N. (2012) A permafrost warming in a cooling Antarctica? Clim Change 111: 177-195. Guglielmin, M., Biasini, A., and Smiraglia, C. (1997) The Contribution of Geoelectrical Investigations in the Analysis of Periglacial and Glacial Landforms in Ice Free Areas of the Northern Foothills (Northern Victoria Land, Antarctica). Geogr Ann Ser A PhyGeogr: 17-24. Guglielmin, M., Camusso, M., Polesello, S., Valsecchi, S., and Teruzzi, M. (2002) A Note on the Ice Crystallography and Geochemistry of a Debris Cone, Northern Foothills, Antarctica. Permafrost Periglac 13: 77-82. Guindon, S., and Gascuel, O. (2003) A Simple, Fast, and Accurate Algorithm to Estimate Large Phylogenies by Maximum Likelihood. Syst Biol 52: 696-704. Guindon, S., Dufayard, J., Lefort, V., Anisimova, M., Hordijk, W., and Gascuel, O. (2010) New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst Biol 59: 307.
Berman, H.M., Henrick, K., and Nakamura, H. (2003) Announcing the worldwide Protein Data Bank. Nature Structural Biology 10: 980. Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., and Bourne, P.E. (2000) The Protein Data Bank. Nucleic Acids Res 28: 235-242. Hall, J.R., Mitchell, K.R., Jackson-Weaver, O., Kooser, A.S., Cron, B.R., Crossey, L.J., and Takacs-Vesbach, C.D. (2008) Molecular Characterization of the Diversity and Distribution of a Thermal Spring Microbial Community using rRNA and Metabolic Genes. Appl Environ Microbiol. Hall, T.A. (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucl Acid Symp Ser 41: 95-98. Hamilton, T.L., Boyd, E.S., and Peters, J.W. (2011a) Environmental constraints underpin the distribution and phylogenetic diversity of nifH in the Yellowstone geothermal complex. Microb Ecol 61: 860-870.

Hamilton, T.L., Lange, R.K., Boyd, E.S., and Peters, J.W. (2011b) Biological nitrogen fixation in acidic high-temperature geothermal springs in Yellowstone National Park, Wyoming. Environ Microbiol 13: 2204-2215. Hammer, Ø., Harper, D.A.T., and Ryan, P.D. (2001) PAST: paleontological statistics software package for education and data analysis. Palaeontologia electronica 4: 9. Handley, K.M., Boothman, C., Mills, R.A., Pancost, R.D., and Lloyd, J.R. (2010) Functional diversity of bacteria in a ferruginous hydrothermal sediment. The ISME Journal. Haney, P.J., Badger, J.H., Buldak, G.L., Reich, C.I., Woese, C.R., and Olsen, G.J. (1999) Thermal adaptation analyzed by comparison of protein sequences from mesophilic and extremely thermophilic Methanococcus species. Proceedings of the National Academy of Sciences 96: 3578. Hartmann, L.S., and Barnum, S.R. (2010) Inferring the evolutionary history of Mo-dependent nitrogen fixation from phylogenetic studies of nifK and nifDK. J Mol Evol: 1-16. Hawkes, T.R., McLean, P.A., and Smith, B.E. (1984) Nitrogenase from nifV mutants of Klebsiella pneumoniae contains an altered form of the iron-molybdenum cofactor. Biochem J 217: 317. Head, I., Saunders, J., and Pickup, R. (1998) Microbial evolution, diversity, and ecology: a decade of ribosomal RNA analysis of uncultivated microorganisms. Microb Ecol 35: 1-21. Heeren, T., and D'Agostino, R. (1987) Robustness of the two independent samples t test when applied to ordinal scaled data. Stat Med 6: 79-90. Henikoff, J.G., Greene, E.A., Pietrokovski, S., and Henikoff, S. (2000) Increased coverage of protein families with the blocks database servers. Nucleic Acids Res 28: 228. Henikoff, S., Henikoff, J.G., and Pietrokovski, S. (1999) Blocks+: a non-redundant database of protein alignment blocks derived from multiple compilations. Bioinformatics 15: 471. Henry, E., Devereux, R., Maki, J., Gilmour, C., Woese, C., Mandelco, L. et al.
(1994) Characterization of a new thermophilic sulfate-reducing bacterium. Arch Microbiol 161: 62-69. Herbert, R.A., and Sharp, R. (1992) Molecular biology and biotechnology of extremophiles: Blackie and Son Ltd. Hewson, I., Moisander, P.H., Morrison, A.E., and Zehr, J.P. (2007) Diazotrophic bacterioplankton in a coral reef lagoon: phylogeny, diel nitrogenase expression and response to phosphate enrichment. ISME J 1: 78-91. Hirsch, P., Ludwig, W., Hethke, C., Sittig, M., Hoffmann, B., and Gallikowski, C.A. (1998) Hymenobacter roseosalivarius gen. nov., sp. nov. from continental Antarctic soils and sandstone: bacteria of the Cytophaga/Flavobacterium/Bacteroides line of phylogenetic descent. Syst Appl Microbiol 21: 374-383. Hirschler-Réa, A., Matheron, R., Riffaud, C., Mouné, S., Eatock, C., Herbert, R.A. et al. (2003) Isolation and characterization of spirilloid purple phototrophic bacteria

forming red layers in microbial mats of Mediterranean salterns: description of Halorhodospira neutriphila sp. nov. and emendation of the genus Halorhodospira. Int J Syst Evol Microbiol 53: 153-163. Hobohm, U., Scharf, M., Schneider, R., and Sander, C. (1992) Selection of representative protein data sets. Protein Sci 1: 409-417. Hoehler, T.M., Bebout, B.M., and Des Marais, D.J. (2001) The role of microbial mats in the production of reduced gases on the early Earth. Nature 412: 324-327. Hoffman, P. (1976) Stromatolite Morphogenesis in Shark Bay, Western Australia. Developments in Sedimentology 20: 261-271. Hoffman, P., and Walter, M.R. (1976) Chapter 6.1 Stromatolite Morphogenesis in Shark Bay, Western Australia. In Developments in Sedimentology: Elsevier, pp. 261-271. Howard, J.B., and Rees, D.C. (1996) Structural Basis of Biological Nitrogen Fixation. Chem Rev 96: 2965-2982. Huber, R., Eder, W., Heldwein, S., Wanner, G., Huber, H., Rachel, R., and Stetter, K.O. (1998) Thermocrinis ruber gen. nov., sp. nov., a pink-filament-forming hyperthermophilic bacterium isolated from Yellowstone National Park. Appl Environ Microbiol 64: 3576-3583. Imhoff, J., Suling, J., and Petri, R. (1998) Phylogenetic relationships among the Chromatiaceae, their taxonomic reclassification and description of the new genera Allochromatium, Halochromatium, Isochromatium, Marichromatium, Thiococcus, Thiohalocapsa and Thermochromatium. Int J Syst Evol Microbiol 48: 1129. Imhoff, J.F. (2006) The family Ectothiorhodospiraceae. The Prokaryotes: 874-886. Imshenetsky, A.A., Abyzov, S.S., Voronov, G.T., Kuzjurina, L.A., Lysenko, S.V., Sotnikov, G.G., and Fedorova, R.I. (1967) Exobiology and the effect of physical factors on micro-organisms. Life Sci Space Res 5: 250-260. Ionescu, D., Hindiyeh, M., Malkawi, H., and Oren, A. (2010) Biogeography of thermophilic cyanobacteria: insights from the Zerka Ma'in hot springs (Jordan). FEMS Microbiol Ecol 72: 103-113. 
Israel, G., Cabane, M., Coll, P., Coscia, D., Raulin, F., and Niemann, H. (1999) The Cassini-Huygens ACP experiment and exobiological implications. Adv Space Res 23: 319-331. Izquierdo, J.A., and Nüsslein, K. (2006) Distribution of extensive nifH gene diversity across physical soil microenvironments. Microb Ecol 51: 441-452. Jaenicke, R. (1996) How Do Proteins Acquire Their Three-Dimensional Structure and Stability? Naturwissenschaften 83: 544-554. Jaenicke, R., and Böhm, G. (1998) The stability of proteins in extreme environments. Curr Opin Struct Biol 8: 738-748. Jahnert, R.J., and Collins, L.B. (2011) Significance of subtidal microbial deposits in Shark Bay, Australia. Mar Geol. Jang, S.B., Seefeldt, L.C., and Peters, J.W. (2000) Insights into nucleotide signal transduction in nitrogenase: structure of an iron protein with MgADP bound. Biochemistry 39: 14745-14752.

Jang, S.B., Jeong, M.S., Seefeldt, L.C., and Peters, J.W. (2004) Structural and biochemical implications of single amino acid substitutions in the nucleotide-dependent switch regions of the nitrogenase Fe protein from Azotobacter vinelandii. Journal of Biological Inorganic Chemistry 9: 1028-1033. Jannasch, H.W., and Wirsen, C.O. (1981) Morphological survey of microbial mats near deep-sea thermal vents. Appl Environ Microbiol 41: 528-538. Javor, B.J., and Castenholz, R.W. (1981) Laminated microbial mats, laguna Guerrero Negro, Mexico. Geomicrobiol J 2: 237 - 273. Jeffrey O. Dawson, and Gibson, A.H. (1987) Sensitivity of selected Frankia isolates from Casuarina, Allocasuarina and North American host plants to sodium chloride. Physiol Plant 70: 272-278. Jenkins, B.D., Steward, G.F., Short, S.M., Ward, B.B., and Zehr, J.P. (2004) Fingerprinting diazotroph communities in the Chesapeake Bay by using a DNA macroarray. Appl Environ Microbiol 70: 1767-1776. Jimenez-Lopez, J.C., Gachomo, E.W., Seufferheld, M.J., and Kotchoni, S.O. (2010) The maize ALDH protein superfamily: linking structural features to functional specificities. BMC Struct Biol 10: 43. Jørgensen, B., and Des Marais, D. (1990) The diffusive boundary layer of sediments: Oxygen microgradients over a microbial mat. Limnol Oceanogr 35: 1343-1355. Jungblut, A.D., and Neilan, B.A. (2010) NifH gene diversity and expression in a microbial mat community on the McMurdo Ice Shelf, Antarctica. Antarct Sci 22: 117-122. Jungblut, A.D., Hawes, I., Mountfort, D., Hitzfeld, B., Dietrich, D.R., Burns, B.P., and Neilan, B.A. (2005) Diversity within cyanobacterial mat communities in variable salinity meltwater ponds of McMurdo Ice Shelf, Antarctica. Environ Microbiol 7: 519-529. Kabsch, W. (1976) A solution for the best rotation to relate two sets of vectors. Acta Crystallographica Section A: Crystal Physics, Diffraction, Theoretical and General Crystallography 32: 922-923. Kabsch, W. 
(1978) A discussion of the solution for the best rotation to relate two sets of vectors. Acta Crystallographica Section A: Crystal Physics, Diffraction, Theoretical and General Crystallography 34: 827-828. Kabsch, W., and Sander, C. (1983) Dictionary of protein secondary structure: pattern recognition of hydrogen bonded and geometrical features. Biopolymers 22: 2577-2637. Kashefi, K., Holmes, D.E., Baross, J.A., and Lovley, D.R. (2003) Thermophily in the Geobacteraceae: Geothermobacter ehrlichii gen. nov., sp. nov., a Novel Thermophilic Member of the Geobacteraceae from the "Bag City" Hydrothermal Vent. Appl Environ Microbiol 69: 2985. Kaštovský, J., and Johansen, J.R. (2008) Mastigocladus laminosus (Stigonematales, Cyanobacteria): phylogenetic relationship of strains from thermal springs to soil-inhabiting genera of the order and taxonomic implications for the genus. Phycologia 47: 307-320.

Katoh, K., and Toh, H. (2008) Recent developments in the MAFFT multiple sequence alignment program. Briefings in Bioinformatics 9: 286-298. Katoh, K., Misawa, K., Kuma, K., and Miyata, T. (2002) MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res 30: 3059. Kawasumi, T., Igarashi, Y., Kodama, T., and Minoda, Y. (1984) Hydrogenobacter thermophilus gen. nov., sp. nov., an Extremely Thermophilic, Aerobic, Hydrogen-Oxidizing Bacterium. Int J Syst Bacteriol 34: 5-10. Kent, H., Buck, M., and Evans, D. (1989) Cloning and sequencing of the nifH gene of Desulfovibrio gigas. FEMS Microbiol Lett 61: 73-78. Kim, J., and Rees, D.C. (1994) Nitrogenase and biological nitrogen fixation. Biochemistry 33: 389-397. Kim, J., Woo, D., and Rees, D. (1993) X-ray crystal structure of the nitrogenase molybdenum-iron protein from Clostridium pasteurianum at 3.0-Å resolution. Biochemistry 32: 7104-7115. Klatt, C.G., Wood, J.M., Rusch, D.B., Bateson, M.M., Hamamura, N., Heidelberg, J.F. et al. (2011) Community ecology of hot spring cyanobacterial mats: predominant populations and their functional potential. The ISME Journal 5: 1262-1278. Klopprogge, K., Grabbe, R., Hoppert, M., and Schmitz, R.A. (2002) Membrane association of Klebsiella pneumoniae NifL is affected by molecular oxygen and combined nitrogen. Arch Microbiol 177: 223-234. Kochkina, G.A., Ivanushkina, N.E., Karasev, S.G., Gavrish, E.Y., Gurina, L.V., Evtushenko, L.I. et al. (2001) Survival of Micromycetes and Actinobacteria under Conditions of Long-Term Natural Cryopreservation. Microbiology 70: 356-364. Krebs, C. (1989) Ecological methodology: Harper & Row New York. Krylov, I.N., Semikhatov, M.A., and Walter, M.R. (1976) Appendix II Table of Time-Ranges of the Principal Groups of Precambrian Stromatolites. In Developments in Sedimentology: Elsevier, pp. 693-694. Kumar, M., Ahmad, S., Ahmad, E., Saifi, M.A., and Khan, R.H.
(2012) In Silico Prediction and Analysis of Caenorhabditis EF-hand Containing Proteins. PloS one 7: e36770. Kumar, S., and Nussinov, R. (2001) How do thermophilic proteins deal with heat? Cell Mol Life Sci 58: 1216-1233. Kumar, S., Nei, M., Dudley, J., and Tamura, K. (2008) MEGA: a biologist-centric software for evolutionary analysis of DNA and protein sequences. Briefings in bioinformatics 9: 299. Ladunga, I. (2002a) Finding Similar Nucleotide Sequences Using Network BLAST Searches: John Wiley & Sons, Inc. Ladunga, I. (2002b) Finding Homologs in Amino Acid Sequences Using Network BLAST Searches: John Wiley & Sons, Inc. Landau, M., Mayrose, I., Rosenberg, Y., Glaser, F., Martz, E., Pupko, T., and Ben-Tal, N. (2005) ConSurf 2005: the projection of evolutionary conservation scores of residues on protein structures. Nucleic Acids Res 33: W299.

Lanyi, J. (1974) Salt-dependent properties of proteins from extremely halophilic bacteria. Microbiology and Molecular Biology Reviews 38: 272. Lanzilotta, W.N., Ryle, M.J., and Seefeldt, L.C. (1995) Nucleotide Hydrolysis and Protein Conformational Changes in Azotobacter vinelandii Nitrogenase Iron Protein: Defining the Function of Aspartate 129. Biochemistry 34: 10713-10723. Lanzilotta, W.N., Fisher, K., and Seefeldt, L.C. (1996) Evidence for electron transfer from the nitrogenase iron protein to the molybdenum-iron protein without MgATP hydrolysis: characterization of a tight protein-protein complex. Biochemistry 35: 7188-7196. Larkin, M.A., Blackshields, G., Brown, N.P., Chenna, R., McGettigan, P.A., McWilliam, H. et al. (2007) Clustal W and Clustal X version 2.0. Bioinformatics 23: 2947. Latysheva, N., Junker, V.L., Palmer, W.J., Codd, G.A., and Barker, D. (2012) The evolution of nitrogen fixation in cyanobacteria. Bioinformatics 28: 603-606. Le, S.Q., and Gascuel, O. (2008) An Improved General Amino Acid Replacement Matrix. Mol Biol Evol 25: 1307-1320. Leigh, J.A. (2000) Nitrogen fixation in methanogens: the archaeal perspective. Curr Issues Mol Biol 2: 125-131. Leipe, D.D., Wolf, Y.I., Koonin, E.V., and Aravind, L. (2002) Classification and evolution of P-loop GTPases and related ATPases. J Mol Biol 317: 41-72. Leuko, S., Goh, F., Allen, M., Burns, B., Walter, M., and Neilan, B. (2007) Analysis of intergenic spacer region length polymorphisms to investigate the halophilic archaeal diversity of stromatolites and microbial mats. Extremophiles 11: 203-210. Ley, R., Harris, J., Wilcox, J., Spear, J., Miller, S., Bebout, B. et al. (2006) Unexpected diversity and complexity of the Guerrero Negro hypersaline microbial mat. Appl Environ Microbiol 72: 3685. Li, X.-D., Huergo, L.F., Gasperina, A., Pedrosa, F.O., Merrick, M., and Winkler, F.K. 
(2009) Crystal Structure of Dinitrogenase Reductase-activating Glycohydrolase (DRAG) Reveals Conservation in the ADP-Ribosylhydrolase Fold and Specific Features in the ADP-Ribose-binding Pocket. J Mol Biol 390: 737-746. Liesack, W., and Dunfield, P.F. (2004) T-RFLP Analysis: A Rapid Fingerprinting Method for Studying Diversity, Structure, and Dynamics of Microbial Communities. In Environmental Microbiology: Methods and Protocols. Spencer, J.F.T., and Ragout de Spencer, A.L. (eds). Totowa, New Jersey: Springer, pp. 23-38. Lilburn, T., Kim, K., Ostrom, N., Byzek, K., Leadbetter, J., and Breznak, J. (2001) Nitrogen fixation by symbiotic and free-living spirochetes. Science 292: 2495. Liu, W.T., Marsh, T.L., Cheng, H., and Forney, L.J. (1997) Characterization of microbial diversity by determining terminal restriction fragment length polymorphisms of genes encoding 16S rRNA. Appl Environ Microbiol 63: 4516-4522. Liu, Y., Yao, T., Jiao, N., Kang, S., Zeng, Y., and Huang, S. (2006) Microbial community structure in moraine lakes and glacial meltwaters, Mount Everest. FEMS Microbiol Lett 265: 98-105.

Lo Giudice, A., Brilli, M., Bruni, V., De Domenico, M., Fani, R., and Michaud, L. (2007) Bacterium-bacterium inhibitory interactions among psychrotrophic bacteria isolated from Antarctic seawater (Terra Nova Bay, Ross Sea). FEMS Microbiol Ecol 60: 383-396. Logan, B. (1961) Cryptozoon and associate stromatolites from the recent, Shark Bay, Western Australia. The Journal of Geology 69: 517-533. Logan, B., and Cebulski, D. (1970) Sedimentary environments of Shark Bay, Western Australia. Am. Assoc. Pet. Geol. Mem 13: 1-37. Logan, B., Rezak, R., and Ginsburg, R. (1964) Classification and environmental significance of algal stromatolites. The Journal of Geology 72: 68-83. Logan, B., Hoffman, P., and Gebelein, C. (1974) Algal mats, cryptalgal fabrics and structures. Hamelin Pool, Western Australia: American Association of Petroleum Geologists Memoir 22: 140-194. Logan, B., Davies, G., Read, J., and Cebulski, D. (1970) Carbonate sedimentation and environments, Shark Bay, Western Australia: AAPG. Long, N., McPhail, D., Brugger, J., and Plimer, I. (2001) Geochemical and thermal characterisation of the Paralana Hot Springs, northern Flinders Ranges, South Australia: Geological Society of Australia; 1999, pp. 35-35. López-Cortés, A., García-Pichel, F., Nübel, U., and Vázquez-Juárez, R. (2001) Cyanobacterial diversity in extreme environments in Baja California, Mexico: a polyphasic study. Int Microbiol 4: 227-236. Lozupone, C., and Knight, R. (2005) UniFrac: a new phylogenetic method for comparing microbial communities. Appl Environ Microbiol 71: 8228. Lozupone, C., Hamady, M., and Knight, R. (2006) UniFrac – An online tool for comparing microbial community diversity in a phylogenetic context. BMC Bioinformatics 7: 371. Lysnes, K., Thorseth, I.H., Steinsbu, B.O., Ovreas, L., Torsvik, T., and Pedersen, R.B. (2004) Microbial community diversity in seafloor basalt from the Arctic spreading ridges. FEMS Microbiol Ecol 50: 213-230.
Ma, Y., Galinski, E.A., Grant, W.D., Oren, A., and Ventosa, A. (2010) Halophiles 2010: Life in Saline Environments. Appl Environ Microbiol 76: 6971. Mack, E.E., Mandelco, L., Woese, C.R., and Madigan, M.T. (1993) Rhodospirillum sodomense, sp. nov., a Dead Sea Rhodospirillum species. Arch Microbiol 160: 363-371. Madern, D., Pfister, C., and Zaccai, G. (1995) Mutation at a single acidic amino acid enhances the halophilic behaviour of malate dehydrogenase from Haloarcula marismortui in physiological salts. Eur J Biochem 230: 1088-1095. Madern, D., Ebel, C., and Zaccai, G. (2000) Halophilic adaptation of enzymes. Extremophiles 4: 91-98. Madigan, M., Cox, S.S., and Stegeman, R.A. (1984) Nitrogen fixation and nitrogenase activities in members of the family Rhodospirillaceae. J Bacteriol 157: 73-78.

Man-Aharonovich, D., Kress, N., Zeev, E.B., Berman-Frank, I., and Beja, O. (2007) Molecular ecology of nifH genes and transcripts in the eastern Mediterranean Sea. In, pp. 2354-2363. Mannisto, M.K., and Haggblom, M.M. (2006) Characterization of psychrotolerant heterotrophic bacteria from Finnish Lapland. Syst Appl Microbiol 29: 229-243. Marchesi, J.R., Sato, T., Weightman, A.J., Martin, T.A., Fry, J.C., Hiom, S.J., and Wade, W.G. (1998) Design and evaluation of useful bacterium-specific PCR primers that amplify genes coding for bacterial 16S rRNA. Appl Environ Microbiol 64: 795. Marsh, T.L. (1999) Terminal restriction fragment length polymorphism (T-RFLP): An emerging method for characterizing diversity among homologous populations of amplification products. Curr Opin Microbiol 2: 323-327. Marsh, T.L. (2005) Culture-independent microbial community analysis with terminal restriction fragment length polymorphism. Methods Enzymol 397: 308-329. Marsh, T.L., Saxman, P., Cole, J., and Tiedje, J. (2000) Terminal Restriction Fragment Length Polymorphism Analysis Program, a Web-Based Research Tool for Microbial Community Analysis. Appl Environ Microbiol 66: 3616-3620. Marteinsson, V.T., Birrien, J.-L., Reysenbach, A.-L., Vernet, M., Marie, D., Gambacorta, A. et al. (1999) Thermococcus barophilus sp. nov., a new barophilic and hyperthermophilic archaeon isolated under high hydrostatic pressure from a deep-sea hydrothermal vent. Int J Syst Bacteriol 49: 351-359. Martin, A.P. (2002) Phylogenetic approaches for describing and comparing the diversity of microbial communities. Appl Environ Microbiol 68: 3673. Mawson, D. (1927) The Paralana hot spring. Trans R Soc S Aust 20: 391–397. McGinnis, S., and Madden, T.L. (2004) BLAST: at the core of a powerful and diverse set of sequence analysis tools. Nucleic Acids Res 32: W20. McGuinness, L.M., Salganik, M., Vega, L., Pickering, K.D., and Kerkhof, L.J. 
(2006) Replicability of Bacterial Communities in Denitrifying Bioreactors as Measured by PCR/T-RFLP Analysis. Environ Sci Technol 40: 509-515. McKay, C.P., Friedmann, E.I., and Meyer, M.A. (1991) From Siberia to Mars. Planet Rep Mar-Apr: 8-11. Mehta, M.P., and Baross, J.A. (2006) Nitrogen fixation at 92 C by a hydrothermal vent archaeon. Science 314: 1783-1786. Mehta, M.P., Butterfield, D.A., and Baross, J.A. (2003) Phylogenetic diversity of nitrogenase (nifH) genes in deep-sea and hydrothermal vent environments of the Juan de Fuca Ridge. Appl Environ Microbiol 69: 960. Meng, E., Pettersen, E., Couch, G., Huang, C., and Ferrin, T. (2006) Tools for integrated sequence-structure analysis with UCSF Chimera. BMC Bioinformatics 7: 339. Meng, L., and Feldman, L.J. (2010) CLE14/CLE20 peptides may interact with CLAVATA2/CORYNE receptor-like kinases to irreversibly inhibit cell division in the root meristem of Arabidopsis. Planta 232: 1061-1074. Meng, L., Wong, J.H., Feldman, L.J., Lemaux, P.G., and Buchanan, B.B. (2010) A membrane-associated thioredoxin required for plant growth moves from cell to cell,

suggestive of a role in intercellular communication. Proceedings of the National Academy of Sciences 107: 3900-3905. Methé, B.A., Webster, J., Nevin, K., Butler, J., and Lovley, D.R. (2005) DNA microarray analysis of nitrogen fixation and Fe (III) reduction in Geobacter sulfurreducens. Appl Environ Microbiol 71: 2530. Michaud, L., Cello, F., Brilli, M., Fani, R., Giudice, A., and Bruni, V. (2004) Biodiversity of cultivable psychrotrophic marine bacteria isolated from Terra Nova Bay (Ross Sea, Antarctica). FEMS Microbiol Lett 230: 63-71. Miller, S.R., Castenholz, R.W., and Pedersen, D. (2007) Phylogeography of the thermophilic cyanobacterium Mastigocladus laminosus. Appl Environ Microbiol 73: 4751. Miller, S.R., Strong, A.L., Jones, K.L., and Ungerer, M.C. (2009) Bar-coded pyrosequencing reveals shared bacterial community properties along the temperature gradients of two alkaline hot springs in Yellowstone National Park. Appl Environ Microbiol 75: 4565. Mindlin, S., Soina, V., Petrova, M., and Gorlenko, Z. (2008) Isolation of antibiotic resistance bacterial strains from Eastern Siberia permafrost sediments. Russ J Genet 44: 27-34. Mishustin, E.N., and Shilnikova, V.K. (1971) Biological fixation of atmospheric nitrogen. London: Macmillan.420. Miteva, V. (2008) Bacteria in Snow and Glacier Ice. In Psychrophiles: from Biodiversity to Biotechnology. Margesin, R., Schinner, F., Marx, J.-C., and Gerday, C. (eds). Berlin, Germany: Springer pp. 31-50. Miteva, V.I., and Brenchley, J.E. (2005) Detection and Isolation of Ultrasmall Microorganisms from a 120,000-Year-Old Greenland Glacier Ice Core. Appl Environ Microbiol 71: 7806-7818. Miteva, V.I., Sheridan, P.P., and Brenchley, J.E. (2004) Phylogenetic and Physiological Diversity of Microorganisms Isolated from a Deep Greenland Glacier Ice Core. Appl Environ Microbiol 70: 202-213. Miyamoto, K., Hallenbeck, P.C., and Benemann, J.R. 
(1979) Nitrogen fixation by thermophilic blue-green algae (cyanobacteria): temperature characteristics and potential use in biophotolysis. Appl Environ Microbiol 37: 454. Moeseneder, M.M., Arrieta, J.M., Muyzer, G., Winter, C., and Herndl, G.J. (1999) Optimization of Terminal-Restriction Fragment Length Polymorphism Analysis for Complex Marine Bacterioplankton Communities and Comparison with Denaturing Gradient Gel Electrophoresis. In, pp. 3518-3525. Mohamed, N., Colman, A., Tal, Y., and Hill, R. (2008a) Diversity and expression of nitrogen fixation genes in bacterial symbionts of marine sponges. Environ Microbiol 10: 2910-2921. Mohamed, N.M., Colman, A.S., Tal, Y., and Hill, R.T. (2008b) Diversity and expression of nitrogen fixation genes in bacterial symbionts of marine sponges. Environ Microbiol 10: 2910-2921.

Moisander, P.H., Morrison, A.E., Ward, B.B., Jenkins, B.D., and Zehr, J.P. (2007) Spatial-temporal variability in diazotroph assemblages in Chesapeake Bay using an oligonucleotide nifH microarray. Environ Microbiol 9: 1823-1835. Moisander, P.H., Shiue, L., Steward, G.F., Jenkins, B.D., Bebout, B.M., and Zehr, J.P. (2006) Application of a nifH oligonucleotide microarray for profiling diversity of N2-fixing microorganisms in marine microbial mats. Environ Microbiol 8: 1721-1735. Mooney, C., Davey, N., Martin, A., Walsh, I., Shields, D.C., and Pollastri, G. (2011) In silico protein motif discovery and structural analysis. In Methods in molecular biology. Yu, B., and Hinchcliffe, M. (eds). Clifton, NJ: Springer Science+Business Media, pp. 341-353. Moret, M., and Zebende, G. (2007) Amino acid hydrophobicity and accessible surface area. Physical Review E 75: 011920. Moses, J., Fouchet, T., Bézard, B., Gladstone, G., Lellouch, E., and Feuchtgruber, H. (2005) Photochemistry and diffusion in Jupiter's stratosphere: Constraints from ISO observations and comparisons with other giant planets. J. Geophys. Res 110: E08001. Motulsky, H., and Christopoulos, A. (2004) Fitting models to biological data using linear and nonlinear regression: a practical guide to curve fitting: Oxford University Press, USA. Motulsky, H.J., and Brown, R.E. (2006) Detecting outliers when fitting data with nonlinear regression–a new method based on robust nonlinear regression and the false discovery rate. BMC Bioinformatics 7: 123. Moult, J. (2005) A decade of CASP: progress, bottlenecks and prognosis in protein structure prediction. Curr Opin Struct Biol 15: 285-289. Muller, S.W. (1947) Permafrost or, Permanently frozen ground and related engineering problems (Strategic engineering study). Ann Arbor, Michigan: Edwards Brothers.231. Mullis, K., and Erlich, H. (1988) Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239: 487–491. 
Musat, F., Harder, J., and Widdel, F. (2006) Study of nitrogen fixation in microbial communities of oil-contaminated marine sediment microcosms. Environ Microbiol 8: 1834-1843. Muyzer, G., de Waal, E.C., and Uitterlinden, A.G. (1993) Profiling of complex microbial populations by denaturing gradient gel electrophoresis analysis of polymerase chain reaction-amplified genes coding for 16S rRNA. Appl Environ Microbiol 59: 695-700. Nakazawa, H., Arakaki, A., Narita-Yamada, S., Yashiro, I., Jinno, K., Aoki, N. et al. (2009) Whole genome sequence of Desulfovibrio magneticus strain RS-1 revealed common gene clusters in magnetotactic bacteria. Genome Res 19: 1801. NASA (2012). Missions. URL http://science.nasa.gov/earth-science/missions/ Needleman, S.B., and Wunsch, C.D. (1970) A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol 48: 443-453.

Neilan, B.A. (1995) Identification and Phylogenetic Analysis of Toxigenic Cyanobacteria by Multiplex Randomly Amplified Polymorphic DNA PCR. In, pp. 2286-2291. Néron, B., Ménager, H., Maufrais, C., Joly, N., Maupetit, J., Letort, S. et al. (2009) Mobyle: a new full web bioinformatics framework. Bioinformatics 25: 3005. Nicolaus, B., Lama, L., Esposito, E., Manca, M., Gambacorta, A., and Prisco, G. (1996) “Bacillus thermoantarcticus” sp. nov., from Mount Melbourne, Antarctica: a novel thermophilic species. Polar Biol 16: 101-104. Nicolaus, B., Improta, R., Manca, M.C., Lama, L., Esposito, E., and Gambacorta, A. (1998) Alicyclobacilli from an unexplored geothermal soil in Antarctica: Mount Rittmann. Polar Biol 19: 133-141. Nicolaus, B., Marsiglia, F., Esposito, E., Trincone, A., Lama, L., Sharp, R. et al. (1991) Isolation of five strains of thermophilic eubacteria in Antarctica. Polar Biol 11: 425-429. Niederberger, T.D., McDonald, I.R., Hacker, A.L., Soo, R.M., Barrett, J.E., Wall, D.H., and Cary, S.C. (2008) Microbial community composition in soils of Northern Victoria Land, Antarctica. Environ Microbiol 10: 1713-1724. Nishikawa, K., Kubota, Y., and Tatsuo, O. (1983) Classification of proteins into groups based on amino acid composition and other characters. I. Angular distribution. Journal of biochemistry 94: 981-995. Nuin, P.A.S., Wang, Z., and Tillier, E.R.M. (2006) The accuracy of several multiple sequence alignment programs for proteins. BMC Bioinformatics 7: 471. Néron, B., Tufféry, P., and Letondal, C. (2005) Mobyle: a Web portal framework for bioinformatics analyses. Network Tools and Applications in Biology (poster), Naples, Italy. O'Leary, M.J., Hearty, P.J., and McCulloch, M.T. (2008) U-series evidence for widespread reef development in Shark Bay during the last interglacial. Palaeogeography, Palaeoclimatology, Palaeoecology 259: 424-435. Okon, Y. (1985) Azospirillum as a potential inoculant for agriculture. Trends Biotechnol 3: 223-228. Oliveros, J.
(2007) VENNY. An interactive tool for comparing lists with Venn Diagrams. In: BioinfoGP, CNB-CSIC. URL http://bioinfogp.cnb.csic.es/tools/venny/index.html [accessed on 30 April 2009]. Olson, N., Ainsworth, T., Gates, R., and Takabayashi, M. (2009) Diazotrophic bacteria associated with Hawaiian Montipora corals: diversity and abundance in correlation with symbiotic dinoflagellates. J Exp Mar Biol Ecol 371: 140-146. Omoregie, E., Crumbliss, L., Bebout, B., and Zehr, J. (2004a) Determination of nitrogen-fixing phylotypes in Lyngbya sp. and Microcoleus chthonoplastes cyanobacterial mats from Guerrero Negro, Baja California, Mexico. Appl Environ Microbiol 70: 2119. Omoregie, E.O., Crumbliss, L.L., Bebout, B.M., and Zehr, J.P. (2004b) Comparison of diazotroph community structure in Lyngbya sp. and Microcoleus chthonoplastes

dominated microbial mats from Guerrero Negro, Baja, Mexico. FEMS Microbiol Ecol 47: 305-318. Omoregie, E.O., Crumbliss, L.L., Bebout, B.M., and Zehr, J.P. (2004c) Determination of nitrogen-fixing phylotypes in Lyngbya sp. and Microcoleus chthonoplastes cyanobacterial mats from Guerrero Negro, Baja California, Mexico. Appl Environ Microbiol 70: 2119-2128. Oren, A. (1986) Intracellular salt concentrations of the anaerobic halophilic eubacteria Haloanaerobium praevalens and Halobacteroides halobius. Can J Microbiol 32: 4-9. Oren, A. (1999) Bioenergetic aspects of halophilism. Microbiology and Molecular Biology Reviews 63: 334. Oren, A. (2002) Diversity of halophilic microorganisms: environments, phylogeny, physiology, and applications. J Ind Microbiol Biotechnol 28: 56-63. Oren, A., Kessel, M., and Stackebrandt, E. (1989) Ectothiorhodospira marismortui sp. nov., an obligately anaerobic, moderately halophilic purple sulfur bacterium from a hypersaline sulfur spring on the shore of the Dead Sea. Arch Microbiol 151: 524-529. Oren, A., Ionescu, D., Hindiyeh, M., and Malkawi, H. (2009) Morphological, phylogenetic and physiological diversity of cyanobacteria in the hot springs of Zerka Ma. BioRisk 3: 69. Orombelli, G., Baroni, C., and Denton, G. (1991) Late Cenozoic glacial history of the Terra Nova Bay region, northern Victoria Land, Antarctica. Geogr Fis Din Quat 13: 139-163. Osborn, A.M., Moore, E.R.B., and Timmis, K.N. (2000) An evaluation of terminal restriction fragment length polymorphism (T-RFLP) analysis for the study of microbial community structure and dynamics. Environ Microbiol 2: 39-50. Ostroumov, V., and Siegert, C. (1996) Exobiological aspects of mass transfer in microzones of permafrost deposits. Adv Space Res 18: 79-86. Paerl, H.W., Pinckney, J.L., and Steppe, T.F. (2000) Cyanobacterial-bacterial mat consortia: examining the functional unit of microbial survival and growth in extreme environments. Environ Microbiol 2: 11-26. 
Paerl, H.W., Steppe, T.F., Buchan, K.C., and Potts, M. (2003) Hypersaline cyanobacterial mats as indicators of elevated tropical hurricane activity and associated climate change. AMBIO: A Journal of the Human Environment 32: 87-90. Pandey, K.D., Shukla, S.P., Shukla, P.N., Giri, D.D., Singh, J.S., Singh, P., and Kashyap, A.K. (2004) Cyanobacteria in Antarctica: ecology, physiology and cold adaptation. Cell Mol Biol (Noisy-le-grand) 50: 575-584. Papineau, D., Walker, J., Mojzsis, S., and Pace, N. (2005) Composition and structure of microbial communities from stromatolites of Hamelin Pool in Shark Bay, Western Australia. Appl Environ Microbiol 71: 4822. Paster, B., Dewhirst, F., Weisburg, W., Tordoff, L., Fraser, G., Hespell, R. et al. (1991) Phylogenetic analysis of the spirochetes. J Bacteriol 173: 6101. Patel, B., Morgan, H., and Daniel, R. (1985) Thermophilic anaerobic spirochetes in New Zealand hot springs. FEMS Microbiol Lett 26: 101-106.


Pätzold, M., Häusler, B., Bird, M., Tellmann, S., Mattei, R., Asmar, S. et al. (2007) The structure of Venus’ middle atmosphere and ionosphere. Nature 450: 657-660. Paul, S., Bag, S.K., Das, S., Harvill, E.T., and Dutta, C. (2008) Molecular signature of hypersaline adaptation: insights from genome and proteome composition of halophilic prokaryotes. Genome Biol 9: R70. Pearson, W.R. (1991) Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms. Genomics 11: 635-650. Pennisi, E. (1997) Biotechnology: in industry, extremophiles begin to make their mark. Science 276: 705. Pepi, M., Agnorelli, C., and Bargagli, R. (2005) Iron Demand by Thermophilic and Mesophilic Bacteria Isolated from an Antarctic Geothermal Soil. BioMetals 18: 529-536. Pernthaler, A., Dekas, A.E., Brown, C.T., Goffredi, S.K., Embaye, T., and Orphan, V.J. (2008) Diverse syntrophic partnerships from deep-sea methane vents revealed by direct cell capture and metagenomics. Proceedings of the National Academy of Sciences 105: 7052. Perreault, N.N., Andersen, D.T., Pollard, W.H., Greer, C.W., and Whyte, L.G. (2007) Characterization of the Prokaryotic Diversity in Cold Saline Perennial Springs of the Canadian High Arctic. Appl Environ Microbiol 73: 1532-1543. Peters, J., Fisher, K., and Dean, D. (1995) Nitrogenase structure and function: a biochemical-genetic perspective. Annual Reviews in Microbiology 49: 335-366. Peters, J.W., and Szilagyi, R.K. (2006) Exploring new frontiers of nitrogenase structure and mechanism. Curr Opin Chem Biol 10: 101-108. Petrova, M.A., Mindlin, S.Z., Gorlenko, Z.M., Kalyaeva, E.S., Soina, V.S., and Bogdanova, E.S. (2002) Mercury-Resistant Bacteria from Permafrost Sediments and Prospects for their Use in Comparative Studies of Mercury Resistance Determinants. Russ J Genet 38: 1330-1334. Pettersen, E.F., Goddard, T.D., Huang, C.C., Couch, G.S., Greenblatt, D.M., Meng, E.C., and Ferrin, T.E. 
(2004) UCSF Chimera - A Visualization System for Exploratory Research and Analysis. J Comput Chem 25: 1605-1612. Piccardi, G., Udisti, R., and Casella, F. (1994) Seasonal trends and chemical composition of snow at Terra Nova Bay (Antarctica). Int J Environ Anal Chem 55: 219-234. Pierrehumbert, R.T. (2011) Infrared radiation and planetary temperature. Physics Today 64: 33. Pikuta, E.V., Hoover, R.B., and Tang, J. (2007) Microbial extremophiles at the limits of life. Crit Rev Microbiol 33: 183-209. Pinckney, J., Paerl, H.W., and Bebout, B.M. (1995) Salinity control of benthic microbial mat community production in a Bahamian hypersaline lagoon. J Exp Mar Biol Ecol 187: 223-237.


Pinckney, J.L., and Paerl, H.W. (1997) Anoxygenic Photosynthesis and Nitrogen Fixation by a Microbial Mat Community in a Bahamian Hypersaline Lagoon. Appl Environ Microbiol 63: 420-426. Playford, P.E., Cockbain, A.E., and Walter, M.R. (1976) Chapter 8.2 Modern Algal Stromatolites at Hamelin Pool, A Hypersaline Barred Basin in Shark Bay, Western Australia. In Developments in Sedimentology: Elsevier, pp. 389-411. Pointing, S.B., Chan, Y., Lacap, D.C., Lau, M.C.Y., Jurgens, J.A., and Farrell, R.L. (2009) Highly specialized microbial diversity in hyper-arid polar desert. Proceedings of the National Academy of Sciences 106: 19964-19969. Polański, A., and Kimmel, M. (2007) Bioinformatics: Springer. Pollastri, G., Baldi, P., Fariselli, P., and Casadio, R. (2002) Prediction of coordination number and relative solvent accessibility in proteins. Proteins: Structure, Function, and Bioinformatics 47: 142-153. Polz, M.F., and Cavanaugh, C.M. (1998) Bias in template-to-product ratios in multitemplate PCR. Appl Environ Microbiol 64: 3724-3730. Posada, D., Guindon, S., Delsuc, F., Dufayard, J.-F., and Gascuel, O. (2009) Estimating Maximum Likelihood Phylogenies with PhyML. In Bioinformatics for DNA Sequence Analysis. Posada, D. (ed): Humana Press, pp. 113-137. Postgate, J., Kent, H., and Robson, R. (1988) Nitrogen fixation by Desulfovibrio. The Nitrogen and Sulphur Cycles: 457–471. Postgate, J.R. (1982) The fundamentals of nitrogen fixation: Cambridge Univ Pr. Postgate, J.R. (1987) Nitrogen Fixation: Cambridge University Press. Priscu, J.C., Fritsen, C.H., Adams, E.E., Giovannoni, S.J., Paerl, H.W., McKay, C.P. et al. (1998) Perennial Antarctic lake ice: an oasis for life in a polar desert. Science 280: 2095-2098. Priscu, J.C., Adams, E.E., Lyons, W.B., Voytek, M.A., Mogk, D.W., Brown, R.L. et al. (1999) Geomicrobiology of Subglacial Ice Above Lake Vostok, Antarctica. Science 286: 2141. Proctor, L.M. 
(1997) Nitrogen-fixing, photosynthetic, anaerobic bacteria associated with pelagic copepods. Aquat Microb Ecol 12: 105-113. Pumbwe, L., Skilbeck, C.A., and Wexler, H.M. (2007) Impact of Anatomic Site on Growth, Efflux-Pump Expression, Cell Structure, and Stress Responsiveness of Bacteroides fragilis. Curr Microbiol 55: 362-365. Pupko, T., Bell, R.E., Mayrose, I., Glaser, F., and Ben-Tal, N. (2002) Rate4Site: an algorithmic tool for the identification of functional regions in proteins by surface mapping of evolutionary determinants within their homologues. Bioinformatics 18: S71-77. Qiu, X., Wu, L., Huang, H., McDonel, P.E., Palumbo, A.V., Tiedje, J.M., and Zhou, J. (2001) Evaluation of PCR-Generated Chimeras, Mutations, and Heteroduplexes with 16S rRNA Gene-Based Cloning. Appl Environ Microbiol 67: 880-887.


Quaiser, A., Zivanovic, Y., Moreira, D., and López-García, P. (2010) Comparative metagenomics of bathypelagic plankton and bottom sediment from the Sea of Marmara. The ISME Journal. Ramelot, T.A., Cort, J.R., Goldsmith-Fischman, S., Kornhaber, G.J., Xiao, R., Shastry, R. et al. (2004) Solution NMR structure of the iron–sulfur cluster assembly protein U (IscU) with zinc bound at the active site. J Mol Biol 344: 567-583. Ramsden, J. (2009) Bioinformatics: an introduction: Springer. Rao, J., and Argos, P. (1981) Structural stability of halophilic proteins. Biochemistry 20: 6536-6543. Rasche, M.E., and Seefeldt, L.C. (1997) Reduction of Thiocyanate, Cyanate, and Carbon Disulfide by Nitrogenase: Kinetic Characterization and EPR Spectroscopic Analysis. Biochemistry 36: 8574-8585. Ravenschlag, K., Sahm, K., Pernthaler, J., and Amann, R. (1999) High Bacterial Diversity in Permanently Cold Marine Sediments. Appl Environ Microbiol 65: 3982-3989. Raymond, J., Siefert, J.L., Staples, C.R., and Blankenship, R.E. (2004a) The Natural History of Nitrogen Fixation. Mol Biol Evol 21: 541-554. Raymond, J., Siefert, J.L., Staples, C.R., and Blankenship, R.E. (2004b) The Natural History of Nitrogen Fixation. In, pp. 541-554. Razia, M., Raja, K., Padmanaban, K., Sivaramakrishnan, S., and Chellapandi, P. (2010) A Phylogenetic Approach for Assigning Function of Hypothetical Proteins in Photorhabdus luminescens Subsp. laumondii TT01 Genome. J Comput Sci Syst Biol 3: 21-29. Reddy, K., Haskell, J., Sherman, D., and Sherman, L. (1993) Unicellular, aerobic nitrogen-fixing cyanobacteria of the genus Cyanothece. J Bacteriol 175: 1284. Rengpipat, S., Lowe, S., and Zeikus, J. (1988) Effect of extreme salt concentrations on the physiology and biochemistry of Halobacteroides acetoethylicus. J Bacteriol 170: 3065. Rhodes, M.E., Fitz-Gibbon, S.T., Oren, A., and House, C.H. (2010) Amino acid signatures of salinity on an environmental scale with a focus on the Dead Sea. Environ Microbiol 12: 2613-2623.
Richardson, L.L., and Castenholz, R.W. (1987) Diel vertical movements of the cyanobacterium Oscillatoria terebriformis in a sulfide-rich hot spring microbial mat. Appl Environ Microbiol 53: 2142. Riding, R. (1999) The term stromatolite: towards an essential definition. Lethaia 32: 321-330. Riederer-Henderson, M.A., and Wilson, P. (1970) Nitrogen fixation by sulphate-reducing bacteria. Microbiology 61: 27. Ríos, A., Valera, S., Ascaso, C., Davila, A., Kastovsky, J., McKay, C.P. et al. (2010) Comparative analysis of the microbial communities inhabiting halite evaporites of the Atacama Desert. International microbiology: official journal of the Spanish Society for Microbiology 13: 79-89.


Risatti, J., Capman, W., and Stahl, D. (1994) Community structure of a microbial mat: the phylogenetic dimension. Proceedings of the National Academy of Sciences of the United States of America 91: 10173. Rodriguez, R., Chinea, G., Lopez, N., Pons, T., and Vriend, G. (1998) Homology modeling, model and software evaluation: three related resources. Bioinformatics 14: 523-528. Roesch, L.F.W., Fulthorpe, R.R., Jacques, R.J.S., Bento, F.M., and de Oliveira Camargo, F.A. (2010) Biogeography of diazotrophic bacteria in soils. World Journal of Microbiology and Biotechnology: 1-6. Rothschild, L.J., and Mancinelli, R.L. (2001) Life in extreme environments. Nature 409: 1092-1101. Roy, A., Kucukural, A., and Zhang, Y. (2010) I-TASSER: a unified platform for automated protein structure and function prediction. Nature protocols 5: 725-738. Rychlewski, L., Li, W., Jaroszewski, L., and Godzik, A. (2000) Comparison of sequence profiles. Strategies for structural predictions using sequence information. Protein Sci 9: 232-241. Sakaguchi, T., Arakaki, A., and Matsunaga, T. (2002) Desulfovibrio magneticus sp. nov., a novel sulfate-reducing bacterium that produces intracellular single-domain-sized magnetite particles. Int J Syst Evol Microbiol 52: 215. Sander, C., and Schneider, R. (1991) Database of homology derived protein structures and the structural meaning of sequence alignment. Proteins: Structure, Function, and Bioinformatics 9: 56-68. Schaller, R.R. (1997) Moore's law: past, present and future. Spectrum, IEEE 34: 52-59. Schink, B. (1992) The genus Pelobacter. The Prokaryotes: 3393–3399. Schleifer, K.-H. (2004) Microbial Diversity: Facts, Problems and Prospects. Syst Appl Microbiol 27: 3-9. Schlessman, J.L., Woo, D., Joshua-Tor, L., Howard, J.B., and Rees, D.C. (1998) Conformational variability in structures of the nitrogenase iron proteins from Azotobacter vinelandii and Clostridium pasteurianum. J Mol Biol 280: 669-685. Schloss, P., and Handelsman, J.
(2006a) Introducing TreeClimber, a test to compare microbial community structures. Appl Environ Microbiol 72: 2379. Schloss, P.D., and Handelsman, J. (2005) Introducing DOTUR, a Computer Program for Defining Operational Taxonomic Units and Estimating Species Richness. Appl Environ Microbiol 71: 1501-1506. Schloss, P.D., and Handelsman, J. (2006b) Introducing SONS, a Tool for Operational Taxonomic Unit-Based Comparisons of Microbial Community Memberships and Structures. Appl Environ Microbiol 72: 6773-6779. Schloss, P.D., Larget, B.R., and Handelsman, J. (2004) Integration of Microbial Ecology and Statistics: a Test To Compare Gene Libraries. Appl Environ Microbiol 70: 5485-5492. Schloss, P.D., Westcott, S.L., Ryabin, T., Hall, J.R., Hartmann, M., Hollister, E.B. et al. (2009) Introducing mothur: Open-Source, Platform-Independent, Community-


Supported Software for Describing and Comparing Microbial Communities. Appl Environ Microbiol 75: 7537-7541. Schneegurt, M.A., Sherman, D.M., Nayar, S., and Sherman, L.A. (1994) Oscillating behavior of carbohydrate granule formation and dinitrogen fixation in the cyanobacterium Cyanothece sp. strain ATCC 51142. J Bacteriol 176: 1586. Schopf, J.W. (2006) Fossil evidence of Archaean life. Philosophical Transactions of the Royal Society B: Biological Sciences 361: 869-885. Segawa, T., Miyamoto, K., Ushida, K., Agata, K., Okada, N., and Kohshima, S. (2005) Seasonal Change in Bacterial Flora and Biomass in Mountain Snow from the Tateyama Mountains, Japan, Analyzed by 16S rRNA Gene Sequencing and Real-Time PCR. Appl Environ Microbiol 71: 123-130. Serebryakov, S.N., and Walter, M.R. (1976a) Chapter 10.8 Distribution of Stromatolites in Riphean Deposits of the Uchur-Maya Region of Siberia. In Developments in Sedimentology: Elsevier, pp. 613-614, 615-620, 621-633. Serebryakov, S.N., and Walter, M.R. (1976b) Chapter 6.4 Biotic and Abiotic Factors Controlling the Morphology of Riphean Stromatolites. In Developments in Sedimentology: Elsevier, pp. 321-336. Serrano, L., Sancho, J., Hirshberg, M., and Fersht, A.R. (1992a) α-Helix stability in proteins: I. Empirical correlations concerning substitution of side-chains at the N and C-caps and the replacement of alanine by glycine or serine at solvent-exposed surfaces. J Mol Biol 227: 544-559. Serrano, L., Neira, J.L., Sancho, J., and Fersht, A.R. (1992b) Effect of alanine versus glycine in α-helices on protein stability. Severin, I., and Stal, L.J. (2010) NifH expression by five groups of phototrophs compared with nitrogenase activity in coastal microbial mats. FEMS Microbiol Ecol 73: 55-67. Severin, J., Wohlfarth, A., and Galinski, E.A. (1992) The predominant role of recently discovered tetrahydropyrimidines for the osmoadaptation of halophilic eubacteria. Journal of general microbiology 138: 1629.
Sheridan, P.P., Miteva, V.I., and Brenchley, J.E. (2003) Phylogenetic Analysis of Anaerobic Psychrophilic Enrichment Cultures Obtained from a Greenland Glacier Ice Core. Appl Environ Microbiol 69: 2153-2160. Shi, R., Proteau, A., Villarroya, M., Moukadiri, I., Zhang, L., Trempe, J.F. et al. (2010) Structural basis for Fe–S cluster assembly and tRNA thiolation mediated by IscS protein–protein interactions. PLoS Biol 8: e1000354. Shi, T., Reeves, R.H., Gilichinsky, D.A., and Friedmann, E.I. (1997) Characterization of Viable Bacteria from Siberian Permafrost by 16S rDNA Sequencing. Microb Ecol 33: 169-179. Short, S.M., and Zehr, J.P. (2005) Quantitative analysis of nifH genes and transcripts from aquatic environments. Methods Enzymol 397: 380-394. Siddiqui, K.S., and Cavicchioli, R. (2006) Cold-adapted enzymes. Annu. Rev. Biochem. 75: 403-433.


Siddiqui, K.S., and Thomas, T. (2008) Protein adaptation in extremophiles: Nova Biomedical. Singer, G., and Hickey, D.A. (2003) Thermophilic prokaryotes have characteristic patterns of codon usage, amino acid composition and nucleotide content. Gene 317: 39. Singh, C., Soni, R., Jain, S., Roy, S., and Goel, R. (2010) Diversification of nitrogen fixing bacterial community using nifH gene as a biomarker in different geographical soils of Western Indian Himalayas. J Environ Biol. Singleton, D.R., Furlong, M.A., Rathbun, S.L., and Whitman, W.B. (2001) Quantitative Comparisons of 16S rRNA Gene Sequence Libraries from Environmental Samples. Appl Environ Microbiol 67: 4374-4376. Sjöling, S., and Cowan, D.A. (2003) High 16S rDNA bacterial diversity in glacial meltwater lake sediment, Bratina Island, Antarctica. Extremophiles 7: 275-282. Skidmore, M., Anderson, S.P., Sharp, M., Foght, J., and Lanoil, B.D. (2005) Comparison of Microbial Community Compositions of Two Subglacial Environments Reveals a Possible Role for Microbes in Chemical Weathering Processes. Appl Environ Microbiol 71: 6986-6997. Smith, A.B. (1992) Geology of the Yudnamutana Gorge, Paralana Hot Springs Area and Genesis of Mineralization at the Hodgkinson Prospect, Mount Painter Province, South Australia: University of Adelaide, Dept. of Geology and Geophysics. Smith, M.H. (1966) The amino acid composition of proteins. J Theor Biol 13: 261-282. Smith, S., and Atkinson, M. (1983) Mass balance of carbon and phosphorus in Shark Bay, Western Australia. Limnol Oceanogr 28: 625-639. Smith, V.R., and Russell, S. (1982) Acetylene reduction by bryophyte-cyanobacteria associations on a Subantarctic island. Polar Biol V1: 153-157. Sogin, M.L., Morrison, H.G., Huber, J.A., Welch, D.M., Huse, S.M., Neal, P.R. et al. (2006) Microbial diversity in the deep sea and the underexplored “rare biosphere”. Proceedings of the National Academy of Sciences 103: 12115-12120. 
Soina, V.S., Vorobiova, E.A., Zvyagintsev, D.G., and Gilichinsky, D.A. (1995) Preservation of cell structures in permafrost: A model for exobiology. Adv Space Res 15: 237-242. Sokolova, T.G., Kostrikina, N.A., Chernyh, N.A., Kolganova, T.V., Tourova, T.P., and Bonch-Osmolovskaya, E.A. (2005) Thermincola carboxydiphila gen. nov., sp. nov., a novel anaerobic, carboxydotrophic, hydrogenogenic bacterium from a hot spring of the Lake Baikal area. Int J Syst Evol Microbiol 55: 2069. Somero, G. (2003) Protein adaptations to temperature and pressure: complementary roles of adaptive changes in amino acid sequence and internal milieu. Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology 136: 577-591. Sonne-Hansen, J., and Ahring, B. (1999) Thermodesulfobacterium hveragerdense sp. nov., and Thermodesulfovibrio islandicus sp. nov., two thermophilic sulfate reducing bacteria isolated from a Icelandic hot spring. Syst Appl Microbiol 22: 559-564. Sorokin, D.Y., Tourova, T.P., Henstra, A.M., Stams, A.J.M., Galinski, E.A., and Muyzer, G. (2008) Sulfidogenesis under extremely haloalkaline conditions by


Desulfonatronospira thiodismutans gen. nov., sp. nov., and Desulfonatronospira delicata sp. nov. - a novel lineage of Deltaproteobacteria from hypersaline soda lakes. Microbiology 154: 1444-1453. Spirina, E., Cole, J., Chai, B., Gilichinsky, D., and Tiedje, J. (2003) New high throughput approach to study ancient microbial phylogenetic diversity in permafrost. In Geophysical Research Abstracts. Nice, France: Copernicus Publications. Sprigg, R.C. (1984) Arkaroola-Mount Painter in the northern Flinders Ranges, SA: the last billion years: Arkaroola. Sridharan, S., Nicholls, A., and Honig, B. (1992) A new vertex algorithm to calculate solvent accessible surface areas. Biophys. J 61: A174. Srinivasan, V., Netz, D.J.A., Webert, H., Mascarenhas, J., Pierik, A.J., Michel, H., and Lill, R. (2007) Structure of the yeast WD40 domain protein Cia1, a component acting late in iron-sulfur protein biogenesis. Structure 15: 1246-1257. Stal, L., and Krumbein, W. (1987) Temporal separation of nitrogen fixation and photosynthesis in the filamentous, non-heterocystous cyanobacterium Oscillatoria sp. Arch Microbiol 149: 76-80. Stal, L.J., and Heyer, H. (1987) Dark anaerobic nitrogen fixation (acetylene reduction) in the cyanobacterium Oscillatoria sp. FEMS Microbiol Lett 45: 227-232. States, D.J., and Botstein, D. (1991) Molecular Sequence Accuracy and the Analysis of Protein Coding Regions. Proceedings of the National Academy of Sciences of the United States of America 88: 5518-5522. Steppe, T., and Paerl, H. (2002) Potential N2 fixation by sulfate-reducing bacteria in a marine intertidal microbial mat. Aquat Microb Ecol 28: 1-12. Steppe, T.F., Pinckney, J.L., Dyble, J., and Paerl, H.W. (2001) Diazotrophy in Modern Marine Bahamian Stromatolites. Microb Ecol 41: 36-44. Steunou, A.S., Bhaya, D., Bateson, M.M., Melendrez, M.C., Ward, D.M., Brecht, E. et al.
(2006) In situ analysis of nitrogen fixation and metabolic switching in unicellular thermophilic cyanobacteria inhabiting hot spring microbial mats. Proceedings of the National Academy of Sciences of the United States of America 103: 2398-2403. Steunou, A.S., Jensen, S.I., Brecht, E., Becraft, E.D., Bateson, M.M., Kilian, O. et al. (2008) Regulation of nif gene expression and the energetics of N2 fixation over the diel cycle in a hot spring microbial mat. The ISME Journal 2: 364-378. Steven, B., Briggs, G., McKay, C.P., Pollard, W.H., Greer, C.W., and Whyte, L.G. (2007) Characterization of the microbial diversity in a permafrost sample from the Canadian high Arctic using culture-dependent and culture-independent methods. FEMS Microbiol Ecol 59: 513-523. Stewart, W. (1970a) Nitrogen fixation by blue-green algae in Yellowstone thermal areas. Stewart, W. (1973) Nitrogen fixation by photosynthetic microorganisms. Annual Reviews in Microbiology 27: 283-316. Stewart, W.D.P. (1967) Nitrogen Turnover in Marine and Brackish Habitats II. Use of 15N in Measuring Nitrogen Fixation in the Field. In, pp. 385-407.


Stewart, W.D.P. (1970b) Algal fixation of atmospheric nitrogen. Plant and Soil 32: 555-588. Stormo, G.D. (2002) An Introduction to Sequence Similarity (“Homology”) Searching: John Wiley & Sons, Inc. Stöver, B., and Müller, K. (2010) TreeGraph 2: Combining and visualizing evidence from different phylogenetic analyses. BMC Bioinformatics 11: 7. Sullivan, J., and Joyce, P. (2005) Model selection in phylogenetics. Annual Review of Ecology, Evolution, and Systematics 36: 445. Summers, M.L., Wallis, J.G., Campbell, E.L., and Meeks, J.C. (1995) Genetic evidence of a major role for glucose-6-phosphate dehydrogenase in nitrogen fixation and dark growth of the cyanobacterium Nostoc sp. strain ATCC 29133. J Bacteriol 177: 6184. Sundset, M., Præsteng, K., Cann, I., Mathiesen, S., and Mackie, R. (2007) Novel Rumen Bacterial Diversity in Two Geographically Separated Sub-Species of Reindeer. Microb Ecol 54: 424-438. Sung, Y., Fletcher, K.E., Ritalahti, K.M., Apkarian, R.P., Ramos-Hernández, N., Sanford, R.A. et al. (2006) Geobacter lovleyi sp. nov. strain SZ, a novel metal-reducing and tetrachloroethene-dechlorinating bacterium. Appl Environ Microbiol 72: 2775. Suzuki, M.T., and Giovannoni, S.J. (1996) Bias caused by template annealing in the amplification of mixtures of 16S rRNA genes by PCR. Appl Environ Microbiol 62: 625-630. Tamura, K., Dudley, J., Nei, M., and Kumar, S. (2007) MEGA4: molecular evolutionary genetics analysis (MEGA) software version 4.0. Mol Biol Evol 24: 1596. Tamura, K., Peterson, D., Peterson, N., Stecher, G., Nei, M., and Kumar, S. (2011) MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. Taroncher-Oldenburg, G., Griner, E.M., Francis, C.A., and Ward, B.B. (2003) Oligonucleotide microarray for the study of functional gene diversity in the nitrogen cycle in the environment. Appl Environ Microbiol 69: 1159-1171. 
Tezcan, F.A., Kaiser, J.T., Mustafi, D., Walton, M.Y., Howard, J.B., and Rees, D.C. (2005) Nitrogenase complexes: multiple docking sites for a nucleotide switch protein. Science 309: 1377-1380. The UniProt, C. (2008) The Universal Protein Resource (UniProt). Nucl. Acids Res. 36: D190-195. Thomas, D.N. (2005) Photosynthetic microbes in freezing deserts. Trends Microbiol 13: 87-88. Thomas, M., and Walter, M.R. (2002) Application of hyperspectral infrared analysis of hydrothermal alteration on Earth and Mars. Astrobiology 2: 335-351. Tillett, D., and Neilan, B.A. (2000) Xanthogenate nucleic acid isolation from cultured and environmental cyanobacteria. Journal of Phycology 36: 251-258. Tiquia, S.M., Lloyd, J., Herms, D.A., Hoitink, H.A.J., and Michel, F.C. (2002) Effects of mulching and fertilization on soil nutrients, microbial activity and


rhizosphere bacterial community structure determined by analysis of TRFLPs of PCR-amplified 16S rRNA genes. Appl Soil Ecol 21: 31-48. Tourova, T.P., Spiridonova, E.M., Berg, I.A., Slobodova, N.V., Boulygina, E.S., and Sorokin, D.Y. (2007) Phylogeny and evolution of the family Ectothiorhodospiraceae based on comparison of 16S rRNA, cbbL and nifH gene sequences. Int J Syst Evol Microbiol 57: 2387. Tripp, H., Bench, S., Turk, K., Foster, R., Desany, B., Niazi, F. et al. (2010) Metabolic streamlining in an open-ocean nitrogen-fixing cyanobacterium. Nature 464: 90-94. Tsuihiji, H., Yamazaki, Y., Kamikubo, H., Imamoto, Y., and Kataoka, M. (2006) Cloning and characterization of nif structural and regulatory genes in the purple sulfur bacterium, Halorhodospira halophila. J Biosci Bioeng 101: 263-270. UNESCO (1991) World heritage nomination - IUCN summary, 578: Shark Bay (Australia). In. van de Vossenberg, J.L.C.M., Driessen, A.J.M., and Konings, W.N. (1998) The essence of being extremophilic: the role of the unique archaeal membrane lipids. Extremophiles 2: 163-170. van de Vossenberg, J.L.C.M., Driessen, A.J.M., Grant, D., and Konings, W.N. (1999) Lipid membranes from halophilic and alkali-halophilic Archaea have a low H+ and Na+ permeability at high salt concentration. Extremophiles 3: 253-257. van den Burg, B. (2003) Extremophiles as a source for novel enzymes. Curr Opin Microbiol 6: 213-218. Van Trappen, S., Vandecandelaere, I., Mergaert, J., and Swings, J. (2004) Algoriphagus antarcticus sp. nov., a novel psychrophile from microbial mats in Antarctic lakes. Int J Syst Evol Microbiol 54: 1969-1973. Veerassamy, S., Smith, A., and Tillier, E. (2003) A transition probability model for amino acid substitutions from blocks. J Comput Biol 10: 997-1010. Vincent, W., Castenholz, R., Downes, M., and H-Williams, C. (1993) Antarctic cyanobacteria: Light, nutrients, and photosynthesis in the microbial mat environment. Journal of Phycology 29: 745-755. 
Vishnivetskaya, T.A., Petrova, M.A., Urbance, J., Ponder, M., Moyer, C.L., Gilichinsky, D.A., and Tiedje, J.M. (2006) Bacterial Community in Ancient Siberian Permafrost as Characterized by Culture and Culture-Independent Methods. Astrobiology 6: 400-414. Vriend, G. (1990) WHAT IF: a molecular modeling and drug design program. J Mol Graphics 8: 52-56. Wagner, D., Kobabe, S., and Liebner, S. (2009) Bacterial community structure and carbon turnover in permafrost-affected soils of the Lena Delta, northeastern Siberia. Can J Microbiol 55: 73-83. Walker, J.E., Saraste, M., Runswick, M.J., and Gay, N.J. (1982) Distantly related sequences in the alpha-and beta-subunits of ATP synthase, myosin, kinases and other ATP-requiring enzymes and a common nucleotide binding fold. The EMBO journal 1: 945.


Walter, M. (1976) Stromatolites. New York: Elsevier.20.1-790. Ward, D.M., Ferris, M.J., Nold, S.C., and Bateson, M.M. (1998) A natural view of microbial biodiversity within hot spring cyanobacterial mat communities. Microbiology and Molecular Biology Reviews 62: 1353. Watanabe, A., and Yamamoto, Y. (1971) Algal nitrogen fixation in the tropics. Plant and Soil 35: 403-413. Welch, B.L. (1947) The generalization of 'Student's' problem when several different population variances are involved. Biometrika 34: 28-35. Weller, R., Bateson, M.M., Heimbuch, B.K., Kopczynski, E.D., and Ward, D.M. (1992) Uncultivated cyanobacteria, Chloroflexus-like inhabitants, and spirochete-like inhabitants of a hot spring microbial mat. Appl Environ Microbiol 58: 3964. Whitton, B.A., and Potts, M. (2000) The Ecology of Cyanobacteria Their Diversity in Time and Space: Kluwer Academic Publishers. Wickstrom, C.E. (1984) Discovery and evidence of nitrogen fixation by thermophilic heterotrophs in hot springs. Curr Microbiol 10: 275-280. Wilson, K. (2001) Preparation of genomic DNA from bacteria. In Current Protocols in Molecular Biology. F. M. Ausubel, R.B., R. E. Kingston, D. D. Moore, J.G. Seidman, J. A. Smith, K. Struhl (ed). New York: John Wiley & Sons Inc, p. Unit 2.4. Wilson, K.H., and Blitchington, R.B. (1996) Human colonic biota studied by ribosomal DNA sequence analysis. Appl Environ Microbiol 62: 2273. Wu, S., and Zhang, Y. (2008) MUSTER: improving protein sequence profile–profile alignments by using multiple sources of structure information. Proteins: Structure, Function, and Bioinformatics 72: 547-556. Xiang, S., Yao, T., An, L., Xu, B., and Wang, J. (2005) 16S rRNA Sequences and Differences in Bacteria Isolated from the Muztag Ata Glacier at Increasing Depths. Appl Environ Microbiol 71: 4619-4627. Xiang, S.R., Yao, T.D., An, L.Z., Xu, B.Q., Li, Z., Wu, G.J. et al. (2004) Bacterial diversity in Malan ice core from the Tibetan Plateau. Folia Microbiol 49: 269-275.
Xiao, X., Li, M., You, Z., and Wang, F. (2007) Bacterial communities inside and in the vicinity of the Chinese Great Wall Station, King George Island, Antarctica. Antarct Sci 19: 11-16. Yakimov, M.M., Giuliano, L., Chernikova, T.N., Gentile, G., Abraham, W.R., Timmis, K., and Golyshin, P. (2001) Alcalilimnicola halodurans gen. nov., sp. nov., an alkaliphilic, moderately halophilic and extremely halotolerant bacterium, isolated from sediments of soda-depositing Lake Natron, East Africa Rift Valley. Int J Syst Evol Microbiol 51: 2133-2143. Yakimov, M.M., Gentile, G., Bruni, V., Cappello, S., D'Auria, G., Golyshin, P.N., and Giuliano, L. (2004) Crude oil-induced structural shift of coastal bacterial communities of rod bay (Terra Nova Bay, Ross Sea, Antarctica) and characterization of cultured cold-adapted hydrocarbonoclastic bacteria. FEMS Microbiol Ecol 49: 419-432. Yamane, K., Hattori, Y., Ohtagaki, H., and Fujiwara, K. (2011) Microbial diversity with dominance of 16S rRNA gene sequences with high GC contents at 74 and 98° C subsurface crude oil deposits in Japan. FEMS Microbiol Ecol.


Yamauchi, K., Doi, K., Kinoshita, M., Kii, F., and Fukuda, H. (1992) Archaebacterial lipid models: highly salt-tolerant membranes from 1, 2-diphytanylglycero-3-phosphocholine. Biochimica et Biophysica Acta (BBA)-Biomembranes 1110: 171-177. Yang, R., Hou, Y., Campbell, C.A., Palaniyandi, K., Zhao, Q., Bordner, A.J., and Chang, X. (2011) Glutamine residues in Q-loops of multidrug resistance protein MRP1 contribute to ATP binding via interaction with metal cofactor. Biochimica et Biophysica Acta (BBA)-Biomembranes 1808: 1790-1796. Yannarell, A.C., Steppe, T.F., and Paerl, H.W. (2006) Genetic Variance in the Composition of Two Functional Groups (Diazotrophs and Cyanobacteria) from a Hypersaline Microbial Mat. Appl Environ Microbiol 72: 1207-1217. Yannarell, A.C., Steppe, T.F., and Paerl, H.W. (2007) Disturbance and recovery of microbial community structure and function following Hurricane Frances. Environ Microbiol 9: 576-583. Yue, J., and Clayton, M. (2005) A similarity measure based on species proportions. Communications in Statistics-Theory and Methods 34: 2123-2131. Zadorina, E., Slobodova, N., Boulygina, E., Kolganova, T., Kravchenko, I., and Kuznetsov, B. (2009) Analysis of the diversity of diazotrophic bacteria in peat soil by cloning of the nifH gene. Microbiology 78: 218-226. Zani, S., Mellon, M.T., Collier, J.L., and Zehr, J.P. (2000) Expression of nifH Genes in Natural Microbial Assemblages in Lake George, New York, Detected by Reverse Transcriptase PCR. Appl Environ Microbiol 66: 3119-3124. Zehr, J., Bench, S., Carter, B., Hewson, I., Niazi, F., Shi, T. et al. (2008) Globally distributed uncultivated oceanic N2-fixing cyanobacteria lack oxygenic photosystem II. Science 322: 1110. Zehr, J.P., and McReynolds, L.A. (1989) Use of degenerate oligonucleotides for amplification of the nifH gene from the marine cyanobacterium Trichodesmium thiebautii. Appl Environ Microbiol 55: 2522-2526. Zehr, J.P., Mellon, M.T., and Hiorns, W.D. 
(1997) Phylogeny of cyanobacterial nifH genes: evolutionary implications and potential applications to natural assemblages. Microbiology 143: 1443-1450. Zehr, J.P., Mellon, M.T., and Zani, S. (1998) New Nitrogen-Fixing Microorganisms Detected in Oligotrophic Oceans by Amplification of Nitrogenase (nifH) Genes. Appl Environ Microbiol 64: 5067. Zehr, J.P., Jenkins, B.D., Short, S.M., and Steward, G.F. (2003a) Nitrogenase gene diversity and microbial community structure: a cross-system comparison. Environ Microbiol 5: 539-554. Zehr, J.P., Crumbliss, L.L., Church, M.J., Omoregie, E.O., and Jenkins, B.D. (2003b) Nitrogenase genes in PCR and RT-PCR reagents: implications for studies of diversity of functional genes. BioTechniques 35: 996-1002, 1004-1005. Zehr, J.P., Mellon, M., Braun, S., Litaker, W., Steppe, T., and Paerl, H.W. (1995) Diversity of Heterotrophic Nitrogen Fixation Genes in a Marine Cyanobacterial Mat. Appl Environ Microbiol 61: 2527-2532.


Zeitlin, C., Cleghorn, T., Cucinotta, F., Saganti, P., Andersen, V., Lee, K. et al. (2004) Overview of the Martian radiation environment experiment. Adv Space Res 33: 2204-2210.

Zhang, L., Hurek, T., and Reinhold-Hurek, B. (2007a) A nifH-based oligonucleotide microarray for functional diagnostics of nitrogen-fixing microorganisms. Microb Ecol 53: 456-470.

Zhang, S., Hou, S., Ma, X., Qin, D., and Chen, T. (2007b) Culturable bacteria in Himalayan ice in response to atmospheric circulation. Biogeosci Disc 3: 765-778.

Zhang, X., Yao, T., Ma, X., and Wang, N. (2002) Microorganisms in a high altitude Glacier Ice in Tibet. Folia Microbiol 47: 241-245.

Zhang, Y. (2007) Template-based modeling and free modeling by I-TASSER in CASP7. Proteins: Structure, Function, and Bioinformatics 69: 108-117.

Zhang, Y. (2008) I-TASSER server for protein 3D structure prediction. BMC Bioinformatics 9: 40.

Zhang, Y. (2009) I-TASSER: fully automated protein structure prediction in CASP8. Proteins: Structure, Function, and Bioinformatics 77: 100-113.

Zhang, Y., and Skolnick, J. (2004) Scoring function for automated assessment of protein structure template quality. Proteins: Structure, Function, and Bioinformatics 57: 702-710.

Zhang, Y., and Skolnick, J. (2005) TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res 33: 2302-2309.

Zhang, Y., and Skolnick, J. (2007) Scoring function for automated assessment of protein structure template quality. Proteins 68: 1020.

Zhang, Y., Kolinski, A., and Skolnick, J. (2003) TOUCHSTONE II: a new approach to ab initio protein structure prediction. Biophys J 85: 1145.

Zhang, Y., Dong, J., Yang, Z., Zhang, S., and Wang, Y. (2008) Phylogenetic diversity of nitrogen-fixing bacteria in mangrove sediments assessed by PCR–denaturing gradient gel electrophoresis. Arch Microbiol 190: 19-28.

Zhou, J., Davey, M.E., Figueras, J.B., Rivkina, E., Gilichinsky, D., and Tiedje, J.M. (1997) Phylogenetic diversity of a bacterial community determined from Siberian tundra soil DNA. Microbiology 143: 3913-3919.


Appendix A

1996 and 2004 Stromatolite nifH BLAST and BLASTX matches

Table A-1: Stromatolite 2004 clones BLAST results, presenting the highest sequence similarity match for each clone. 38 clones of 2004 stromatolites were analysed.

| Sequence file ID | Clone ID | Nearest relative in GenBank* | BLAST match accession ID | Sequence similarity (%) |
|---|---|---|---|---|
| RSA46_04 | C1 | Uncultured bacterium nifH clone | DQ140596 | 98 |
| RSA47_04 | C2 | “ | EU594145 | 86 |
| RSA48_04 | C5 | “ | DQ338103 | 87 |
| RSA49_04 | C7 | “ | EU594212 | 88 |
| RSA50_04 | C10 | “ | AY450628 | 87 |
| RSA51_04 | C12 | “ | DQ338103 | 87 |
| RSA52_04 | C13 | “ | DQ338103 | 87 |
| RSA53_04 | C14 | “ | DQ338103 | 87 |
| RSA54_04 | C16 | “ | EU594141 | 87 |
| RSA55_04 | C17 | “ | EU594141 | 87 |
| RSA56_04 | C18 | “ | DQ338103 | 86 |
| RSA59_04 | C20 | “ | DQ338103 | 87 |
| RSA61_04 | C21 | “ | EF174812 | 88 |
| RSA62_04 | C22 | “ | EU594146 | 87 |
| RSA63_04 | C23 | “ | DQ338103 | 87 |
| RSA64_04 | C24 | “ | DQ338103 | 87 |
| RSA65_04 | C25 | “ | DQ338040 | 85 |
| RSA66_04 | C26 | “ | EU594145 | 86 |
| RSA67_04 | C27 | “ | DQ338103 | 86 |
| RSA68_04 | C28 | “ | DQ338103 | 87 |
| RSA69_04 | C29 | “ | EU594145 | 85 |
| RSA70_04 | C30 | “ | EU594145 | 87 |
| RSA71_04 | C31 | “ | DQ338103 | 87 |
| RSA74_04 | C34 | Myxosarcina sp. dinitrogenase reductase (nifH) gene | U73133 | 89 |
| RSA75_04 | C35 | Uncultured bacterium nifH clone | DQ338103 | 86 |
| RSA76_04 | C36 | “ | DQ140596 | 100 |
| RSA77_04 | C37 | “ | DQ338103 | 87 |
| RSA78_04 | C39 | “ | EU594188 | 85 |
| RSA79_04 | C40 | “ | DQ338103 | 87 |
| RSA80_04 | C41 | “ | EF174826 | 93 |
| RSA81_04 | C42 | “ | EF174812 | 89 |
| RSA82_04 | C43 | “ | DQ338103 | 87 |
| RSA83_04 | C44 | “ | DQ338103 | 87 |
| RSA84_04 | C45 | “ | EU594188 | 84 |
| RSA85_04 | C47 | “ | EF174812 | 89 |
| RSA86_04 | C48 | “ | DQ338103 | 86 |
| RSA87_04 | C49 | “ | EU594145 | 87 |
| RSA88_04 | C50 | “ | DQ338103 | 87 |

*Only a single match is shown. There could be two or more identical high scores for each record here.


Table A-2: Stromatolite 1996 clones BLAST results, presenting only the highest sequence similarity match for each clone. 37 clones of 1996 stromatolites were analysed.

| Sequence file ID | Clone ID | Nearest relative in GenBank* | BLAST match accession ID | Sequence similarity (%) |
|---|---|---|---|---|
| RSA90_96 | GC15 | Uncultured bacterium nifH clone | DQ338027 | 85 |
| RSA91_96 | GC16 | “ | DQ338027 | 87 |
| RSA93_96 | GC20 | “ | DQ078021 | 85 |
| RSA94_96 | GC21 | “ | DQ338071 | 85 |
| RSA95_96 | GC22 | “ | DQ338014 | 86 |
| RSA96_96 | GC23 | “ | AM286438 | 95 |
| RSA97_96 | GC25 | “ | DQ338071 | 85 |
| RSA98_96 | GC26 | “ | DQ338027 | 87 |
| RSA99_96 | GC27 | “ | DQ078042 | 89 |
| RSA101_96 | GC29 | “ | DQ821946 | 98 |
| RSA102_96 | GC30 | “ | HM750588 | 92 |
| RSA107_96 | GC36 | “ | DQ338071 | 85 |
| RSA112_96 | GC41 | “ | GU193054 | 82 |
| RSA114_96 | GC43 | “ | DQ821946 | 98 |
| RSA115_96 | GC45 | “ | HM750588 | 92 |
| RSA119_96 | GC17 | “ | HM750603 | 83 |
| RSA121_96 | GC31 | “ | DQ338014 | 87 |
| RSA122_96 | GC32 | “ | DQ078042 | 89 |
| RSA124_96 | GC36 | “ | DQ338071 | 85 |
| RSA126_96 | GC38 | “ | HM750759 | 89 |
| RSA127_96 | GC39 | “ | DQ078042 | 88 |
| RSA128_96 | GC35 | “ | DQ078042 | 87 |
| RSA129_96 | GC40 | “ | AM286438 | 96 |
| RSA130_96 | GC42 | “ | DQ078042 | 89 |
| RSA132_96 | GC47 | “ | DQ078042 | 88 |
| RSA133_96 | GC48 | “ | HM750443 | 87 |
| RSA134_96 | GC49 | “ | DQ821946 | 97 |
| RSA135_96 | GC50 | “ | GU193021 | 86 |
| RSA137_96 | GC52 | Desulfovibrio magneticus RS-1 DNA | AP010904 | 85 |
| RSA138_96 | GC53 | Uncultured bacterium nifH clone | DQ338014 | 87 |
| RSA139_96 | GC54 | “ | DQ078042 | 89 |
| RSA141_96 | GC56 | “ | DQ338071 | 85 |
| RSA143_96 | GC58 | “ | DQ078042 | 89 |
| RSA147_96 | GC57 | “ | DQ338014 | 88 |
| RSA148_96 | GC1 | “ | DQ078042 | 89 |
| RSA150_96 | GC3 | “ | DQ078042 | 89 |
| RSA152_96 | GC6 | “ | DQ338014 | 86 |

*Only a single match is shown. There could be two or more identical high scores for each record here.


Table A-3: Stromatolite 2004 clones BLASTX results, presenting only the highest sequence similarity match for each clone.

| Sequence file ID | Clone ID | Nearest bacterial nitrogenase iron protein match in GenBank | Phylum | BLASTX match accession ID | Sequence similarity (%) |
|---|---|---|---|---|---|
| RSA46_04 | C1 | Oscillatoria PCC 6506 | Cyanobacteria | ZP_07112556 | 94 |
| RSA47_04 | C2 | Cyanothece sp. CCY0110 | Cyanobacteria | ZP_01727765 | 94 |
| RSA48_04 | C5 | “ | Cyanobacteria | ZP_01727765 | 94 |
| RSA49_04 | C7 | “ | Cyanobacteria | ZP_01727765 | 94 |
| RSA50_04 | C10 | “ | Cyanobacteria | ZP_01727765 | 92 |
| RSA51_04 | C12 | “ | Cyanobacteria | ZP_01727765 | 94 |
| RSA52_04 | C13 | “ | Cyanobacteria | ZP_01727765 | 94 |
| RSA53_04 | C14 | “ | Cyanobacteria | ZP_01727765 | 93 |
| RSA54_04 | C16 | “ | Cyanobacteria | ZP_01727765 | 93 |
| RSA55_04 | C17 | “ | Cyanobacteria | ZP_01727765 | 93 |
| RSA56_04 | C18 | “ | Cyanobacteria | ZP_01727765 | 93 |
| RSA59_04 | C20 | “ | Cyanobacteria | ZP_01727765 | 93 |
| RSA61_04 | C21 | Cyanobacterium UCYN-A | Cyanobacteria | YP_003421696 | 90 |
| RSA62_04 | C22 | Cyanothece ATCC 51142 | Cyanobacteria | YP_001801976 | 93 |
| RSA63_04 | C23 | “ | Cyanobacteria | YP_001801976 | 93 |
| RSA64_04 | C24 | “ | Cyanobacteria | YP_001801976 | 94 |
| RSA65_04 | C25 | “ | Cyanobacteria | YP_001801976 | 91 |
| RSA66_04 | C26 | “ | Cyanobacteria | YP_001801976 | 93 |
| RSA67_04 | C27 | Cyanothece PCC 7425 | Cyanobacteria | YP_002483083 | 93 |
| RSA68_04 | C28 | Cyanothece ATCC 51142 | Cyanobacteria | YP_001801976 | 93 |
| RSA69_04 | C29 | Cyanothece PCC 7425 | Cyanobacteria | YP_002483083 | 91 |
| RSA70_04 | C30 | Cyanothece ATCC 51142 | Cyanobacteria | YP_001801976 | 93 |
| RSA71_04 | C31 | “ | Cyanobacteria | YP_001801976 | 93 |
| RSA74_04 | C34 | “ | Cyanobacteria | YP_001801976 | 93 |
| RSA75_04 | C35 | “ | Cyanobacteria | YP_001801976 | 93 |
| RSA76_04 | C36 | Cyanothece PCC 7425 | Cyanobacteria | YP_002483083 | 95 |
| RSA77_04 | C37 | Cyanothece CCY0110 | Cyanobacteria | ZP_01727765 | 93 |
| RSA78_04 | C39 | “ | Cyanobacteria | ZP_01727765 | 93 |
| RSA79_04 | C40 | “ | Cyanobacteria | ZP_01727765 | 93 |
| RSA80_04 | C41 | Cyanothece PCC 7425 | Cyanobacteria | YP_002483083 | 93 |
| RSA81_04 | C42 | Cyanobacterium UCYN-A | Cyanobacteria | YP_003421696 | 91 |
| RSA82_04 | C43 | Cyanothece sp. CCY0110 | Cyanobacteria | ZP_01727765 | 93 |
| RSA83_04 | C44 | “ | Cyanobacteria | ZP_01727765 | 93 |
| RSA84_04 | C45 | “ | Cyanobacteria | ZP_01727765 | 93 |
| RSA85_04 | C47 | Cyanothece PCC 7425 | Cyanobacteria | YP_002483083 | 92 |
| RSA86_04 | C48 | Cyanothece sp. CCY0110 | Cyanobacteria | ZP_01727765 | 93 |
| RSA87_04 | C49 | “ | Cyanobacteria | ZP_01727765 | 93 |
| RSA88_04 | C50 | “ | Cyanobacteria | ZP_01727765 | 93 |


Table A-4: Stromatolite 1996 clones BLASTX results, presenting only the highest sequence similarity match for each clone.

| Sequence file ID | Clone ID | Nearest bacterial nitrogenase iron protein match in GenBank | Phylum | BLASTX match accession ID | Sequence similarity (%) |
|---|---|---|---|---|---|
| RSA90_96 | GC15 | Desulfonatronospira thiodismutans ASO3-1 | δ-Proteobacteria | ZP_07015343 | 89 |
| RSA91_96 | GC16 | “ | δ-Proteobacteria | ZP_07015343 | 97 |
| RSA93_96 | GC20 | Desulfatibacillum alkenivorans AK-01 | δ-Proteobacteria | YP_002430688 | 95 |
| RSA94_96 | GC21 | “ | δ-Proteobacteria | YP_002430688 | 93 |
| RSA95_96 | GC22 | Desulfonatronospira thiodismutans ASO3-1 | δ-Proteobacteria | ZP_07015343 | 94 |
| RSA96_96 | GC23 | Desulfatibacillum alkenivorans AK-01 | δ-Proteobacteria | YP_002430688 | 84 |
| RSA97_96 | GC25 | “ | δ-Proteobacteria | YP_002430688 | 95 |
| RSA98_96 | GC26 | Desulfonatronospira thiodismutans ASO3-1 | δ-Proteobacteria | ZP_07015343 | 93 |
| RSA99_96 | GC27 | Pelobacter carbinolicus DSM 2380 | δ-Proteobacteria | YP_357508 | 90 |
| RSA101_96 | GC29 | Cyanothece sp. CCY0110 | Cyanobacteria | ZP_01727765 | 91 |
| RSA102_96 | GC30 | Halorhodospira halophila SL1 | γ-Proteobacteria | YP_001001870 | 97 |
| RSA107_96 | GC36 | Desulfatibacillum alkenivorans AK-01 | δ-Proteobacteria | YP_002430688 | 94 |
| RSA112_96 | GC41 | Teredinibacter turnerae T7901 | γ-Proteobacteria | YP_003073074 | 96 |
| RSA114_96 | GC43 | Cyanothece sp. CCY0110 | Cyanobacteria | ZP_01727765 | 96 |
| RSA115_96 | GC45 | Halorhodospira halophila SL1 | γ-Proteobacteria | YP_001001870 | 94 |
| RSA119_96 | GC17 | Desulfatibacillum alkenivorans AK-01 | δ-Proteobacteria | YP_002430688 | 89 |
| RSA121_96 | GC31 | Desulfonatronospira thiodismutans ASO3-1 | δ-Proteobacteria | ZP_07015343 | 97 |
| RSA122_96 | GC32 | Pelobacter carbinolicus DSM 2380 | δ-Proteobacteria | YP_357508 | 94 |
| RSA124_96 | GC36 | Desulfatibacillum alkenivorans AK-01 | δ-Proteobacteria | YP_002430688 | 95 |
| RSA126_96 | GC38 | Pelobacter carbinolicus DSM 2380 | δ-Proteobacteria | YP_357508 | 95 |
| RSA127_96 | GC39 | “ | δ-Proteobacteria | YP_357508 | 96 |
| RSA128_96 | GC35 | “ | δ-Proteobacteria | YP_357508 | 92 |
| RSA129_96 | GC40 | Desulfatibacillum alkenivorans AK-01 | δ-Proteobacteria | YP_002430688 | 91 |
| RSA130_96 | GC42 | Pelobacter carbinolicus DSM 2380 | δ-Proteobacteria | YP_357508 | 97 |
| RSA132_96 | GC47 | “ | δ-Proteobacteria | YP_357508 | 93 |
| RSA133_96 | GC48 | Teredinibacter turnerae T7901 | γ-Proteobacteria | YP_003073074 | 97 |
| RSA134_96 | GC49 | Cyanothece sp. CCY0110 | Cyanobacteria | ZP_01727765 | 89 |
| RSA135_96 | GC50 | Teredinibacter turnerae T7901 | γ-Proteobacteria | YP_003073074 | 97 |
| RSA137_96 | GC52 | Desulfovibrio magneticus RS-1 | δ-Proteobacteria | YP_002953433 | 97 |
| RSA138_96 | GC53 | Desulfonatronospira thiodismutans ASO3-1 | δ-Proteobacteria | ZP_07015343 | 85 |
| RSA139_96 | GC54 | Pelobacter carbinolicus DSM 2380 | δ-Proteobacteria | YP_357508 | 89 |
| RSA141_96 | GC56 | Desulfatibacillum alkenivorans AK-01 | δ-Proteobacteria | YP_002430688 | 97 |
| RSA143_96 | GC58 | Pelobacter carbinolicus DSM 2380 | δ-Proteobacteria | YP_357508 | 95 |
| RSA147_96 | GC57 | Desulfonatronospira thiodismutans ASO3-1 | δ-Proteobacteria | ZP_07015343 | 93 |
| RSA148_96 | GC1 | Pelobacter carbinolicus DSM 2380 | δ-Proteobacteria | YP_357508 | 94 |
| RSA150_96 | GC3 | “ | δ-Proteobacteria | YP_357508 | 84 |
| RSA152_96 | GC6 | Desulfonatronospira thiodismutans ASO3-1 | δ-Proteobacteria | ZP_07015343 | 95 |
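As a minimal sketch of how the BLASTX assignments in Table A-4 could be summarised by phylum, the Python snippet below tallies a handful of rows copied from the table. The tuple layout and the helper name `tally_by_phylum` are illustrative assumptions and are not part of the original analysis.

```python
from collections import Counter

# A few rows copied from Table A-4: (clone ID, phylum, similarity %).
# This is an illustrative subset, not the full 37-clone table.
ROWS = [
    ("GC15", "δ-Proteobacteria", 89),
    ("GC27", "δ-Proteobacteria", 90),
    ("GC29", "Cyanobacteria", 91),
    ("GC30", "γ-Proteobacteria", 97),
    ("GC41", "γ-Proteobacteria", 96),
    ("GC52", "δ-Proteobacteria", 97),
]


def tally_by_phylum(rows):
    """Count clones per phylum (hypothetical helper for illustration)."""
    return Counter(phylum for _clone, phylum, _similarity in rows)


counts = tally_by_phylum(ROWS)
for phylum, n in counts.most_common():
    print(f"{phylum}: {n}")
```

Applying the same tally to every row of the table would give the per-phylum clone counts for the 1996 library.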

NifH phylogeny reference sequences

Table A-5: 186 imported nifH amino acid sequences from The Universal Protein Resource (UniProtKB) database.

| Organism | Accession ID | Entry name | Length (AA) | Status(a) |
|---|---|---|---|---|
| Acidithiobacillus ferrooxidans strain ATCC 23270 | B7JA99 | NIFH_ACIF2 | 296 | reviewed |
| Acidithiobacillus ferrooxidans strain ATCC 53993 | B5ER76 | NIFH_ACIF5 | 296 | reviewed |
| Alcaligenes faecalis | Q44044 | NIFH_ALCFA | 296 | reviewed |
| Alkaliphilus metalliredigens | A6TTY3 | NIFH_ALKMQ | 272 | reviewed |
| Anabaena azollae | P0A3S2 | NIFH_ANAAZ | 295 | reviewed |
| Anabaena sp. strain L31 | P33178 | NIFH_ANASL | 294 | reviewed |
| Anabaena variabilis strain ATCC 29413 | P0A3S1 | NIFH1_ANAVT | 295 | reviewed |
| Anabaena variabilis strain ATCC 29413 | Q44484 | NIFH2_ANAVT | 296 | reviewed |
| Arcobacter nitrofigilis | Q6XJ51 | Q6XJ51_9PROT | 121 | unreviewed |
| Arcobacter nitrofigilis | Q7WUN7 | Q7WUN7_9PROT | 121 | unreviewed |
| Arcobacter nitrofigilis DSM 7299 | D5V3K8 | D5V3K8_ARCNC | 304 | unreviewed |
| Azoarcus communis | Q79AX4 | Q79AX4_9RHOO | 137 | unreviewed |
| Azoarcus sp. | O31255 | O31255_AZOSP | 113 | unreviewed |
| Azoarcus sp. strain BH72 | Q9F0V9 | Q9F0V9_AZOSB | 297 | unreviewed |
| Azorhizobium caulinodans strain ATCC 43989 | P26251 | NIFH1_AZOC5 | 296 | reviewed |
| Azorhizobium caulinodans strain ATCC 43989 | P26252 | NIFH2_AZOC5 | 296 | reviewed |
| Azospirillum brasilense | P17303 | NIFH_AZOBR | 293 | reviewed |
| Azotobacter chroococcum strain mcd 1 | P06118 | NIFH2_AZOCH | 290 | reviewed |
| Azotobacter chroococcum strain mcd 1 | P26248 | NIFH1_AZOCH | 291 | reviewed |
| Azotobacter vinelandii | P00459 | NIFH1_AZOVI | 290 | reviewed |
| Azotobacter vinelandii | P15335 | NIFH2_AZOVI | 290 | reviewed |
| Azotobacter vinelandii | P16269 | NIFH3_AZOVI | 275 | reviewed |
| Beijerinckia indica subsp. indica strain ATCC 9039 | B2IET2 | B2IET2_BEII9 | 290 | unreviewed |
| Bradyrhizobium japonicum | P06117 | NIFH_BRAJA | 294 | reviewed |
| Bradyrhizobium strain ANU 289 | P00463 | NIFH_BRASP | 294 | reviewed |
| Burkholderia sp. WSM3938 | A9YL99 | A9YL99_9BURK | 204 | unreviewed |
| Burkholderia tropica | A6Y960 | A6Y960_9BURK | 284 | unreviewed |
| Burkholderia vietnamiensis | A4JRN7 | A4JRN7_BURVG | 293 | unreviewed |
| Candidatus Azobacteroides pseudotrichonymphae genomovar. CFP2 | B6YRI6 | NIFH_AZOPC | 274 | reviewed |
| Chlorobaculum parvum strain NCIB 8327 | B3QQ12 | NIFH_CHLP8 | 274 | reviewed |
| Chlorobium chlorochromatii strain CaD3 | Q3AR70 | NIFH_CHLCH | 274 | reviewed |
| Chlorobium limicola strain DSM 245 | B3EH88 | NIFH_CHLL2 | 274 | reviewed |
| Chlorobium phaeobacteroides strain BS1 | B3EL81 | NIFH_CHLPB | 274 | reviewed |
| Chlorobium phaeobacteroides strain DSM 266 | A1BEH0 | NIFH_CHLPD | 274 | reviewed |
| Chlorobium tepidum | Q8KC92 | NIFH_CHLTE | 274 | reviewed |
| Chlorogloeopsis sp. | P94673 | P94673_9CYAN | 108 | unreviewed |
| Clostridium acetobutylicum | Q97ME5 | NIFH_CLOAB | 272 | reviewed |
| Clostridium cellobioparum | Q59270 | NIFH_CLOCB | 271 | reviewed |
| Clostridium pasteurianum | P00456 | NIFH1_CLOPA | 273 | reviewed |
| Clostridium pasteurianum | P09552 | NIFH2_CLOPA | 272 | reviewed |
| Clostridium pasteurianum | P09553 | NIFH3_CLOPA | 275 | reviewed |
| Clostridium pasteurianum | P09554 | NIFH5_CLOPA | 273 | reviewed |
| Clostridium pasteurianum | P09555 | NIFH6_CLOPA | 272 | reviewed |
| Clostridium pasteurianum | P22548 | NIFH4_CLOPA | 273 | reviewed |
| Cupriavidus sp. SWF66167 | C1JID2 | C1JID2_9BURK | 268 | unreviewed |
| Cyanobacterium UCYN-A | B7TB36 | UCYN_06140 | 287 | unreviewed |
| Cyanothece strain ATCC 51142 | O07641 | NIFH_CYAA5 | 327 | reviewed |
| Cyanothece sp. CCY0110 | A3IL28 | A3IL28_9CHRO | 290 | unreviewed |
| Cyanothece sp. strain PCC 7424 | B7KG76 | NIFH_CYAP7 | 299 | reviewed |
| Cyanothece sp. strain PCC 7425 | B8HWE3 | NIFH_CYAP4 | 298 | reviewed |
| Cyanothece sp. strain PCC 8801 | Q55028 | NIFH_CYAP8 | 296 | reviewed |
| Dechloromonas aromatica strain RCB | Q47G67 | NIFH_DECAR | 296 | reviewed |
| Dehalococcoides ethenogenes strain 195 | Q3Z7C7 | NIFH_DEHE1 | 274 | reviewed |
| Desulfatibacillum alkenivorans strain AK-01 | B8FAC4 | NIFH_DESAA | 274 | reviewed |
| Desulfobacter curvatus | Q9X3A2 | Q9X3A2_9DELT | 109 | unreviewed |
| Desulfobacter latus | Q7WUN9 | Q7WUN9_9DELT | 109 | unreviewed |
| Desulfobacterium autotrophicum strain ATCC 43914 | C0QKK8 | NIFH_DESAH | 273 | reviewed |
| Desulfobacterium autotrophicum strain ATCC 43914 | C0QMP6 | C0QMP6_DESAH | 99 | unreviewed |
| Desulfomicrobium baculatum strain DSM 4028 | C7LPA9 | C7LPA9_DESBD | 276 | unreviewed |
| Desulfonatronospira thiodismutans ASO3-1 | D6SLD2 | D6SLD2_9DELT | 276 | unreviewed |
| Desulfonema limicola | Q9X3A1 | Q9X3A1_9DELT | 107 | unreviewed |
| Desulforudis audaxviator MP104C | B1I0Y8 | NIFH_DESAP | 280 | reviewed |
| Desulfosporosinus orientis | Q9F8Z7 | Q9F8Z7_9FIRM | 107 | unreviewed |
| Desulfotomaculum reducens MI-1 | A4J8C2 | NIFH_DESRM | 272 | reviewed |
| Desulfovibrio baculatus | Q8VPI5 | Q8VPI5_DESBA | 109 | unreviewed |
| Desulfovibrio gigas | P71156 | NIFH_DESGI | 274 | reviewed |
| Desulfovibrio magneticus strain ATCC 700980 | C4XRK2 | NIFH_DESMR | 274 | reviewed |
| Desulfovibrio salexigens strain ATCC 14822 | C6BX34 | NIFH_DESAD | 275 | reviewed |
| Frankia alni | P08925 | NIFH_FRAAL | 287 | reviewed |
| Frankia sp. strain CcI3 | Q2J4F8 | NIFH_FRASC | 287 | reviewed |
| Frankia strain EAN1pec | A8L2C4 | NIFH_FRASN | 289 | reviewed |
| Frankia sp. strain EuIK1 | Q47922 | NIFH_FRASE | 287 | reviewed |
| Frankia sp. strain FaC1 | P46034 | NIFH_FRASP | 287 | reviewed |
| Gloeothece sp. KO11DG | A4PC92 | A4PC92_9CHRO | 119 | unreviewed |
| Gloeothece sp. KO68DGA | A2V899 | A2V899_9CHRO | 290 | unreviewed |
| Gluconacetobacter diazotrophicus strain ATCC 49037 | Q9ZIE4 | NIFH_GLUDA | 298 | reviewed |
| Halorhodospira halophila SL1 | A1WTQ6 | A1WTQ6_HALHL | 291 | unreviewed |
| Herbaspirillum seropedicae | P77873 | NIFH_HERSE | 292 | reviewed |
| Klebsiella pneumoniae | P00458 | NIFH_KLEPN | 293 | reviewed |
| Klebsiella pneumoniae strain 342 | B5XPH2 | NIFH_KLEP3 | 293 | reviewed |
| Magnetococcus strain MC-1 | A0L6X0 | NIFH_MAGSM | 296 | reviewed |
| Mastigocladus laminosus | Q47917 | NIFH1_MASLA | 295 | reviewed |
| Mastigocladus laminosus | Q47921 | NIFH2_MASLA | 307 | reviewed |
| Methanobacterium ivanovii | P08624 | NIFH2_METIV | 263 | reviewed |
| Methanobacterium ivanovii | P51602 | NIFH1_METIV | 275 | reviewed |
| Methanobacterium thermoautotrophicum | O26739 | NIFH2_METTH | 265 | reviewed |
| Methanobacterium thermoautotrophicum | O27602 | NIFH1_METTH | 275 | reviewed |
| Methanobacterium thermoautotrophicum strain DSM 2133 | Q50785 | NIFH1_METTM | 275 | reviewed |
| Methanobrevibacter arboriphilus | Q48890 | Q48890_9EURY | 109 | unreviewed |
| Methanobrevibacter ruminantium | O93628 | O93628_9EURY | 139 | unreviewed |
| Methanobrevibacter smithii | O93629 | O93629_METSM | 139 | unreviewed |
| Methanococcus jannaschii | Q58289 | NIFH_METJA | 279 | reviewed |
| Methanococcus maripaludis | Q50218 | NIFH_METMP | 275 | reviewed |
| Methanococcus thermolithotrophicus | P08625 | NIFH2_METTL | 292 | reviewed |
| Methanococcus thermolithotrophicus | P25767 | NIFH1_METTL | 284 | reviewed |
| Methanococcus voltae | P06119 | NIFH_METVO | 278 | reviewed |
| Methanoculleus marisnigri strain ATCC 35101 | A3CWW3 | NIFH_METMJ | 272 | reviewed |
| Methanopyrus kandleri | Q8TVH3 | Q8TVH3_METKA | 266 | unreviewed |
| Methanosarcina acetivorans | Q8TJ93 | Q8TJ93_METAC | 273 | unreviewed |
| Methanosarcina acetivorans | Q8TJZ9 | Q8TJZ9_METAC | 265 | unreviewed |
| Methanosarcina barkeri | O93630 | O93630_METBA | 142 | unreviewed |
| Methanosarcina barkeri | P54799 | NIFH1_METBA | 275 | reviewed |
| Methanosarcina barkeri | P54800 | NIFH2_METBA | 273 | reviewed |
| Methanosarcina lacustris | Q977F4 | Q977F4_9EURY | 134 | unreviewed |
| Methanosarcina lacustris | Q977P9 | Q977P9_9EURY | 134 | unreviewed |
| Methanosarcina mazei | Q8PYY0 | NIFH_METMA | 273 | reviewed |
| Methanosarcina mazei | Q8PZH9 | Q8PZH9_METMA | 265 | unreviewed |
| Methanospirillum hungatei strain DSM 864 | Q2FUB7 | NIFH_METHJ | 280 | reviewed |
| Methylobacter luteus | Q6KEW2 | Q6KEW2_9GAMM | 148 | unreviewed |
| Methylobacter marinus | Q93DU4 | Q93DU4_METMR | 120 | unreviewed |
| Methylobacterium nodulans strain ORS 2060 | B8ITG7 | NIFH_METNO | 299 | reviewed |
| Methylobacterium strain 4-46 | B0UAK2 | NIFH_METS4 | 299 | reviewed |
| Methylocella silvestris strain DSM 15510 | B8EJ25 | B8EJ25_METSB | 293 | unreviewed |
| Methylocella tundrae | Q6KEX4 | Q6KEX4_METTU | 147 | unreviewed |
| Methylomonas methanica | Q93DU1 | Q93DU1_9GAMM | 121 | unreviewed |
| Methylomonas rubra | Q83TP5 | Q83TP5_METRU | 150 | unreviewed |
| Methylosinus trichosporium OB3b | D5QKI5 | D5QKI5_METTR | 295 | unreviewed |
| Nostoc commune | P26250 | NIFH_NOSCO | 297 | reviewed |
| Nostoc muscorum | Q09158 | NIFH_NOSMU | 108 | reviewed |
| Nostoc sp. strain PCC 6720 | Q51296 | NIFH_NOSS6 | 295 | reviewed |
| Nostoc sp. strain PCC 7120 | O30577 | NIFH2_NOSS1 | 297 | reviewed |
| Nostoc sp. strain PCC 7120 | P00457 | NIFH1_NOSS1 | 295 | reviewed |
| Oscillatoria sp. PCC 6506 | D8G542 | D8G542_9CYAN | 300 | unreviewed |
| Paenibacillus azotofixans | Q9AKT4 | NIFH2_PAEAZ | 292 | reviewed |
| Paenibacillus azotofixans | Q9AKT8 | NIFH1_PAEAZ | 292 | reviewed |
| Pectobacterium atrosepticum | Q6D2Y8 | NIFH_ERWCT | 293 | reviewed |
| Pelobacter carbinolicus DSM 2380 | Q3A2R9 | Q3A2R9_PELCD | 292 | unreviewed |
| Pelodictyon luteolum DSM 273 | Q3B2P6 | NIFH_PELLD | 274 | reviewed |
| Pelodictyon phaeoclathratiforme DSM 5477 | B4SC59 | NIFH_PELPB | 274 | reviewed |
| Phormidium sp. AD1 | Q9F8Z5 | Q9F8Z5_9CYAN | 108 | unreviewed |
| Plectonema boryanum | Q00240 | NIFH_PLEBO | 296 | reviewed |
| Prosthecochloris aestuarii strain DSM 271 | B4S9H5 | NIFH_PROA2 | 274 | reviewed |
| Prosthecochloris vibrioformis strain DSM 265 | A4SFU6 | NIFH_PROVI | 274 | reviewed |
| Pseudomonas stutzeri strain A1501 | A4VJ70 | NIFH_PSEU5 | 293 | reviewed |
| Rhizobium etli | P00462 | NIFH_RHIET | 297 | reviewed |
| Rhizobium leguminosarum bv. trifolii | P00461 | NIFH_RHILT | 297 | reviewed |
| Mesorhizobium loti | Q98AP7 | NIFH_RHILO | 297 | reviewed |
| Rhizobium meliloti | P00460 | NIFH_RHIME | 297 | reviewed |
| Sinorhizobium fredii | P19068 | NIFH_RHISN | 296 | reviewed |
| Rhodobacter azotoformans | Q8L0U5 | Q8L0U5_9RHOB | 154 | unreviewed |
| Rhodobacter capsulatus | P08718 | NIFH1_RHOCA | 295 | reviewed |
| Rhodobacter capsulatus | Q07942 | NIFH2_RHOCA | 275 | reviewed |
| Rhodobacter capsulatus strain ATCC BAA-309 | D5AKX6 | D5AKX6_RHOCB | 290 | unreviewed |
| Rhodobacter capsulatus strain ATCC BAA-309 | D5ANI3 | D5ANI3_RHOCB | 295 | unreviewed |
| Rhodobacter sp. AP-10 | Q8L0U4 | Q8L0U4_9RHOB | 153 | unreviewed |
| Rhodobacter sp. SW2 | C8S0T4 | C8S0T4_9RHOB | 295 | unreviewed |
| Rhodobacter sphaeroides | O31183 | NIFH_RHOSH | 291 | reviewed |
| Rhodobacter sphaeroides strain ATCC 17023 | Q3J0H1 | NIFH_RHOS4 | 291 | reviewed |
| Rhodobacter sphaeroides strain ATCC 17029 | A3PLS9 | NIFH_RHOS1 | 291 | reviewed |
| Rhodopseudomonas blastica | Q8L0T4 | Q8L0T4_RHOBL | 154 | unreviewed |
| Rhodopseudomonas palustris strain BisB5 | Q13C78 | NIFH_RHOPS | 299 | reviewed |
| Rhodopseudomonas palustris strain HaA2 | Q2J1I1 | NIFH_RHOP2 | 299 | reviewed |
| Rhodospirillum rubrum | P22921 | NIFH_RHORU | 295 | reviewed |
| Rhodovulum sulfidophilum | Q8L0T6 | Q8L0T6_RHOSU | 153 | unreviewed |
| Roseiflexus castenholzii strain DSM 13941 | A7NR80 | NIFH_ROSCS | 273 | reviewed |
| Roseiflexus strain RS-1 | A5USK5 | NIFH_ROSS1 | 273 | reviewed |
| Scytonema sp. NCC-4B | Q19AR3 | Q19AR3_9CYAN | 103 | unreviewed |
| Sinorhizobium medicae WSM419 | A6UME9 | A6UME9_SINMW | 297 | unreviewed |
| Spirochaeta aurantia | Q9AMD9 | Q9AMD9_SPIAU | 109 | unreviewed |
| Spirochaeta aurantia | Q9AME0 | Q9AME0_SPIAU | 109 | unreviewed |
| Spirochaeta stenostrepta | Q9AMD7 | Q9AMD7_9SPIO | 134 | unreviewed |
| Spirochaeta stenostrepta | Q9AMD8 | Q9AMD8_9SPIO | 143 | unreviewed |
| Spirochaeta zuelzerae | Q9AMD6 | Q9AMD6_9SPIO | 143 | unreviewed |
| Symploca atlantica PCC 8002 | Q7WUP5 | Q7WUP5_9CYAN | 108 | unreviewed |
| Synechococcus strain JA-2-3B | Q2JP78 | NIFH_SYNJB | 292 | reviewed |
| Synechococcus strain JA-3-3Ab | Q2JTL7 | NIFH_SYNJA | 292 | reviewed |
| Synechocystis sp. WH 002 | Q7WUP2 | Q7WUP2_9SYNC | 108 | unreviewed |
| Syntrophobacter fumaroxidans strain DSM 10017 | A0LH11 | NIFH_SYNFM | 274 | reviewed |
| Teredinibacter turnerae T7901 | C5BTB0 | NIFH_TERTT | 292 | reviewed |
| Acidithiobacillus ferrooxidans | P06661 | NIFH_THIFE | 296 | reviewed |
| Tolumonas auensis DSM 9187 | C4LAS5 | NIFH_TOLAT | 295 | reviewed |
| Tolypothrix sp. PCC 7101 | Q7WUP4 | Q7WUP4_9CYAN | 108 | unreviewed |
| Tolypothrix sp. PCC 7601 | Q3L168 | Q3L168_9CYAN | 136 | unreviewed |
| Treponema azotonutricium | Q9AMC8 | Q9AMC8_9SPIO | 144 | unreviewed |
| Treponema denticola | Q9AMD4 | Q9AMD4_TREDE | 143 | unreviewed |
| Treponema primitia | Q9AMD0 | Q9AMD0_9SPIO | 143 | unreviewed |
| Trichodesmium erythraeum IMS101 | O34106 | NIFH_TRIEI | 296 | reviewed |
| Trichodesmium thiebautii | P26254 | NIFH_TRITH | 294 | reviewed |
| Uncultured methanogenic archaeon RC-I | Q0W443 | NIFH_UNCMA | 274 | reviewed |
| Vibrio cincinnatii | Q9LAI5 | Q9LAI5_VIBCI | 229 | unreviewed |
| Vibrio natriegens | Q9RQR0 | Q9RQR0_VIBNA | 198 | unreviewed |
| Vibrio parahaemolyticus | A7X7L2 | A7X7L2_VIBPA | 139 | unreviewed |
| Wolinella succinogenes | Q7M8U8 | NIFH_WOLSU | 303 | reviewed |
| Xanthobacter autotrophicus | Q93G65 | Q93G65_XANAU | 134 | unreviewed |
| Xenococcus sp. | O08262 | O08262_9CYAN | 108 | unreviewed |
| Zymomonas mobilis | Q5NLG3 | NIFH_ZYMMO | 295 | reviewed |

(a) ‘Reviewed’ status indicates sequences that were manually annotated and reviewed in the Swiss-Prot database. Reviewed sequences are reliable as they were inferred from homology studies, and there is evidence at transcript and protein level of their existence. ‘Unreviewed’ status indicates sequences that were automatically annotated and were not reviewed (TrEMBL database). They are mostly derived from prediction studies and have far less verification at the protein and transcript levels.
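The distinction between reviewed and unreviewed entries can be made concrete with a small Python sketch that splits a few rows copied from Table A-5 by annotation status and by sequence length. The `Entry` type, the helper names and the 250 AA cut-off are illustrative assumptions; the thesis does not state that such a filter was applied.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    organism: str
    accession: str
    length_aa: int
    status: str  # "reviewed" (Swiss-Prot) or "unreviewed" (TrEMBL)


# Rows copied from Table A-5 (illustrative subset only).
ENTRIES = [
    Entry("Alcaligenes faecalis", "Q44044", 296, "reviewed"),
    Entry("Arcobacter nitrofigilis", "Q6XJ51", 121, "unreviewed"),
    Entry("Chlorogloeopsis sp.", "P94673", 108, "unreviewed"),
    Entry("Nostoc muscorum", "Q09158", 108, "reviewed"),
    Entry("Halorhodospira halophila SL1", "A1WTQ6", 291, "unreviewed"),
    Entry("Zymomonas mobilis", "Q5NLG3", 295, "reviewed"),
]


def by_status(entries, status):
    """Return accession IDs whose annotation status matches `status`."""
    return [e.accession for e in entries if e.status == status]


def at_least(entries, min_length_aa):
    """Return accession IDs of sequences at least `min_length_aa` long."""
    return [e.accession for e in entries if e.length_aa >= min_length_aa]


reviewed = by_status(ENTRIES, "reviewed")
unreviewed = by_status(ENTRIES, "unreviewed")
# 250 AA is an arbitrary illustrative cut-off, not a threshold from the thesis.
near_full_length = at_least(ENTRIES, 250)

print("reviewed:", reviewed)
print("unreviewed:", unreviewed)
print("length >= 250 AA:", near_full_length)
```

In this subset, status and length are independent: a reviewed entry can be a short fragment (Nostoc muscorum, 108 AA) and an unreviewed entry can be near full length (Halorhodospira halophila SL1, 291 AA).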