
The chloroplast genome of the diatom Seminavis robusta: New features introduced through multiple mechanisms of horizontal gene transfer


Marine Genomics xxx (2013) xxx–xxx

Journal homepage: www.elsevier.com/locate/margen

Tore Brembu a, Per Winge a, Ave Tooming-Klunderud b, Alexander J. Nederbragt b, Kjetill S. Jakobsen b, Atle M. Bones a,⁎

a Department of Biology, Norwegian University of Science and Technology, Trondheim, Norway
b Centre for Ecological and Evolutionary Synthesis (CEES), Department of Biology, University of Oslo, Oslo, Norway

☆ This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-No Derivative Works License, which permits non-commercial use, distribution, and reproduction in any medium, provided the original author and source are credited.

⁎ Corresponding author at: Department of Biology, Norwegian University of Science and Technology, Realfagbygget, Høgskoleringen 5, N-7491 Trondheim, Norway. Tel.: +47 73 59 86 92; fax: +47 73 59 61 00.

E-mail address: [email protected] (A.M. Bones).

1874-7787/$ – see front matter © 2013 The Authors. Published by Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.margen.2013.12.002

Article info

Article history:
Received 30 September 2013
Received in revised form 29 November 2013
Accepted 4 December 2013
Available online xxxx

Keywords: Plastid; Plasmid; HGT; Evolution

Abstract

The chloroplasts of heterokont algae such as diatoms are the result of a secondary endosymbiosis event, in which a red alga was engulfed by a non-photosynthetic eukaryote. The diatom chloroplast genomes sequenced to date show a high degree of similarity, but some examples of gene replacement or introduction of genes through horizontal gene transfer are known. The evolutionary origin of the gene transfers is unclear. We have sequenced and characterised the complete chloroplast genome and a putatively chloroplast-associated plasmid of the pennate diatom Seminavis robusta. The chloroplast genome contains two introns, a feature that has not previously been found in diatoms. The group II intron of atpB appears to have been recently transferred from a Volvox-like green alga. The S. robusta chloroplast genome (150,905 bp) is the largest diatom chloroplast genome characterised to date, mainly due to the presence of four large gene-poor regions. Open reading frames (ORFs) encoded by the gene-poor regions show similarity to putative proteins encoded by the chloroplast genomes of different heterokonts, as well as the plasmids pCf1 and pCf2 found in the diatom Cylindrotheca fusiformis. A tyrosine recombinase and a serine recombinase are encoded by the S. robusta chloroplast genome, indicating a possible mechanism for the introduction of novel genes. A plasmid with similarity to pCf2 was also identified. Phylogenetic analyses of three ORFs identified on pCf2 suggest that two of them are part of an operon-like gene cluster conserved in bacteria. Several genetic elements have moved through horizontal gene transfer between the chloroplast genomes of different heterokonts. Two recombinases are likely to promote such gene insertion events, and the plasmid identified may act as a vector in this process. The copy number of the plasmid was similar to that of the plastid genome, indicating a plastid localization.

© 2013 The Authors. Published by Elsevier B.V. All rights reserved.


1. Introduction

The primary origin of the chloroplast organelle (plastid) in all eukaryotic photosynthetic organisms lies in the ancient engulfment of a photosynthetic cyanobacterium by a heterotrophic eukaryote in a process termed primary endosymbiosis. Over time, most genes of the primary endosymbiont were lost or transferred to the host genome, resulting in a highly reduced chloroplast genome encoding core elements of the photosynthetic machinery. Descendants of the primary endosymbiosis evolved into three lineages: the “red” lineage consisting of the red algae (Rhodophyta), the “green” lineage consisting of green algae (Chlorophyta) and land plants (Streptophyta), and the glaucophytes. In secondary endosymbiosis, a red alga or a green alga was engulfed by a non-photosynthetic protist (Green, 2011; Reyes-Prieto et al., 2007). Chloroplasts of algae belonging to the heterokonts, which include diatoms, brown algae, raphidophytes and heterotrophic oomycetes, arose from a secondary endosymbiosis event involving a red alga. Recent results indicate that the red algal endosymbiont succeeded a green algal endosymbiont related to prasinophytes, as a large number of nuclear genes in diatom genomes have a green algal origin (Jiroutová et al., 2010; Moustafa et al., 2009). However, this finding is controversial, and has been criticised for taxonomic sampling bias (Burki et al., 2012; Deschamps and Moreira, 2012).

In addition to the large contribution of genetic material to algal genomes through endosymbiosis (endosymbiotic gene transfer, EGT), several genes have been introduced to nuclear and organelle genomes independently through horizontal gene transfer (HGT) events. The nuclear genomes of the diatoms Thalassiosira pseudonana and Phaeodactylum tricornutum contain several hundred genes that appear to have been acquired from a wide range of bacteria through HGT (Armbrust et al., 2004; Bowler et al., 2008).


Diatoms (Bacillariophyta) constitute one of the most abundant groups of marine phytoplankton, with an estimated diversity of around 100,000 species (Round et al., 1990; Van den Hoek et al., 1995). The evolutionary success of diatoms is also reflected in their ecological importance; this group contributes approximately 40% to primary net production in the oceans (Field et al., 1998). This success is suggested to be caused at least in part by the ability of diatoms to respond and adapt to large fluctuations in light irradiance, thereby maintaining high photosynthetic efficiency over a wide range of environmental conditions (Depauw et al., 2012).

Thus far, the chloroplast genome has been sequenced in five diatoms: the centrics Odontella sinensis and T. pseudonana, and the pennates P. tricornutum, Fistulifera sp. JPCC DA0580 and Synedra acus (Galachyants et al., 2012; Kowallik et al., 1995; Oudot-Le Secq et al., 2007; Tanaka et al., 2011). In addition, the chloroplast genomes of the diatom endosymbionts of two dinoflagellates, Durinskia baltica and Kryptoperidinium foliaceum, have been characterised (Imanian et al., 2010). These genomes share a highly similar gene set, of which a core set of 86 genes is found in all chromalveolates (Green, 2011). Two plasmids identified in the pennate diatom Cylindrotheca fusiformis may be associated with chloroplasts, as they hybridise with chloroplast DNA (Hildebrand et al., 1992; Jacobs et al., 1992). In support of this view, genes encoding putative proteins with similarity to ORFs found in the C. fusiformis plasmids have been found in the chloroplast genomes of Fistulifera sp. and K. foliaceum (Imanian et al., 2010; Tanaka et al., 2011).

Seminavis robusta is a marine pennate diatom belonging to the large Naviculaceae family (Danielidis and Mann, 2002). In contrast to P. tricornutum and T. pseudonana, S. robusta is dioecious and exhibits a size reduction–restitution life cycle, where sexual reproduction is size dependent and results in restoration of cell size (Chepurnov et al., 2002). Recently, diproline was identified as a pheromone involved in sensing of mature partners for reproduction in S. robusta (Gillard et al., 2013). S. robusta is easy to cultivate and tolerant to inbreeding, making it a good candidate for molecular and genetic studies. Furthermore, its relatively large cell size (up to 80 μm long) is an advantage with regard to bioimaging studies (Chepurnov et al., 2008). S. robusta has two large chloroplasts which divide transversely and relocate to the valves during the S/G2 phase of the cell cycle (Chepurnov et al., 2002; Gillard et al., 2008). Due to its large size and well-characterised development, the chloroplast of S. robusta is promising as a model system for studies of chloroplast morphology and development in diatoms.

Here, we report the complete sequence of the chloroplast genome and a plasmid of S. robusta. The plasmid sequence has similarity to the C. fusiformis plasmid pCf2. The S. robusta chloroplast genome is the largest identified in diatoms. The increase in size is mostly due to the presence of four gene-poor regions containing ORFs that are not part of the conserved gene set of diatom chloroplast genomes. Phylogenetic analyses indicate that these ORFs are the result of several lateral gene transfer events between different heterokont chloroplast genomes.

2. Results and discussion

2.1. Structure and gene content of the S. robusta chloroplast genome

As a part of ongoing genome sequencing of the pennate, benthic diatom S. robusta, its chloroplast genome sequence was characterised. Shotgun and paired-end sequencing resulted in the identification of twelve contigs with read depth coverage between 463 and 1858, on average 64 times higher than the general read depth. Eleven of these contigs showed similarity to chloroplast genomes from other diatoms and together yielded a complete circular sequence with a length of 150,905 bp (Fig. 1).
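The way high-copy organellar contigs stand out from nuclear contigs by read depth can be illustrated with a minimal Python sketch. The contig names, the toy depth values and the ten-fold cut-off are all assumptions made for illustration; the source does not state which threshold or software was used to select the twelve contigs.

```python
from statistics import median

def high_depth_contigs(depths, fold=10.0):
    """Return names of contigs whose depth is at least `fold` times the median depth.

    `fold` is a hypothetical cut-off chosen for this sketch only.
    """
    baseline = median(depths.values())
    return sorted(name for name, depth in depths.items() if depth >= fold * baseline)

# Toy data: five low-depth contigs and two high-depth contigs.
# 463 and 1858 are the range limits quoted in the text; the rest is invented.
toy_depths = {"c1": 18, "c2": 22, "c3": 20, "c4": 19, "c5": 21,
              "cp1": 463, "cp2": 1858}

print(high_depth_contigs(toy_depths))
```

Using the median as the baseline keeps the few very deep contigs from inflating the reference depth, which a mean would not.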

Table 1 shows the general properties of the chloroplast genome of S. robusta and four other diatoms (Kowallik et al., 1995; Oudot-Le Secq et al., 2007; Tanaka et al., 2011) as well as the diatom endosymbionts of the dinoflagellates K. foliaceum and D. baltica (Imanian et al., 2010). The S. robusta chloroplast genome has a quadripartite organisation similar to that found in other diatoms, being divided into a large single-copy (LSC) and a small single-copy (SSC) region by two inverted repeats (IRs). It is larger than any of the other characterised diatom chloroplast genomes; this is not due to the size of the IRs (9434 bp), which is intermediate compared to other diatoms. The increase in size is mainly caused by the presence of four gene-poor regions (marked with Roman numerals I to IV in Fig. 1), which in total contribute about 24,000 bp (16%) to the genome size. Another consequence of the gene-poor regions is a large average size of the intergenic spacers (214.0 bp). Excluding these regions, the average intergenic spacer size is reduced to 134.8 bp.
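The size contribution quoted above can be checked with a few lines of arithmetic. The sketch below uses only the two figures given in the text (150,905 bp and about 24,000 bp); the variable and function names are illustrative.

```python
GENOME_BP = 150_905     # total chloroplast genome size from the text
GENE_POOR_BP = 24_000   # approximate combined size of regions I to IV from the text

def percent_of_genome(region_bp, genome_bp=GENOME_BP):
    """Express a region size as a percentage of the genome size."""
    return 100.0 * region_bp / genome_bp

share = percent_of_genome(GENE_POOR_BP)
remaining = GENOME_BP - GENE_POOR_BP
print(f"gene-poor regions: {share:.1f}% of the genome")
print(f"genome size without them: {remaining} bp")
```

Because the 24,000 bp figure is itself approximate, the resulting percentage and remainder should be read as rough estimates.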

2.2. Plastid introns

An interesting feature of the S. robusta chloroplast genome is the presence of introns in two of the genes: the rnl gene encoding the 23S ribosomal RNA in the IR and the atpB gene encoding the ATP synthase beta chain. The other diatom chloroplast genomes analysed so far do not contain any introns. The only intron reported in a heterokont chloroplast genome so far is a group I intron found in the trnL gene of Fucus vesiculosus and a few other brown algae (Le Corguillé et al., 2009).

The S. robusta rnl gene contains a group I intron with a length of 764 bp that falls within the subgroup IA3 (Michel et al., 1990). This type of intron has self-splicing activity, and is mostly found in fungi, plants and red and green algae (Haugen et al., 2005). The rnl intron contains an ORF encoding a putative endonuclease with a single LAGLIDADG domain. Single-LAGLIDADG endonucleases form homodimers that recognise and cleave palindromic or pseudopalindromic DNA target sites (Chan et al., 2011). Phylogenetic analyses (Fig. 2A) indicated that the S. robusta endonuclease ORF (designated I-SroI according to standard nomenclature for the family (Belfort and Roberts, 1997)) is similar to single-LAGLIDADG endonucleases from green algae (chlorophytes) (Heath et al., 1997; Lucas et al., 2001), streptophytes (Turmel et al., 2002b) and the amoeboid protozoan Acanthamoeba castellanii (Lonergan and Gray, 1994). All residues that are conserved within LAGLIDADG endonucleases in green algae are also conserved in I-SroI, with the exception of Asp93 in I-SroI, which corresponds to a highly conserved proline in the other members of the family (Fig. A.1) (Lucas et al., 2001). The conserved proline is part of the hydrophobic core of LAGLIDADG endonucleases (Heath et al., 1997); replacing it with an acidic residue may therefore have deleterious effects on the structure and activity. Homing endonucleases, such as LAGLIDADG endonucleases that reside within self-splicing introns, have evolved to act as opportunistic selfish DNA considered to provide little benefit to their hosts (Stoddard and Belfort, 2010). However, homing endonucleases may also drive important gene conversion events. The HO endonuclease in Saccharomyces cerevisiae, which is of the LAGLIDADG type, is responsible for the mating-type genetic switch (Jin et al., 1997).
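The notion of a palindromic or pseudopalindromic target site can be made concrete with a short sketch: a DNA site is palindromic when it equals its own reverse complement, and pseudopalindromic when it differs at only a few positions. The example sites below are arbitrary toy sequences, not the I-SroI or I-CreI recognition sequences, and the function names are illustrative.

```python
# Translation table mapping each base to its complement
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def palindrome_mismatches(site):
    """Count positions at which a site differs from its own reverse complement."""
    site = site.upper()
    return sum(a != b for a, b in zip(site, reverse_complement(site)))

perfect = "GAATTC"   # toy site identical to its reverse complement
imperfect = "GAATTA" # toy site that deviates at its two outer positions

print(palindrome_mismatches(perfect))
print(palindrome_mismatches(imperfect))
```

A count of zero marks a perfect palindrome; small non-zero counts correspond to the pseudopalindromic case. How many mismatches a given homodimeric enzyme tolerates is an empirical question that this sketch does not address.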

Further evidence for a green algal ancestry of the S. robusta rnl intron was found in the non-coding part of the intron. Both the 5′ and 3′ parts of the intron show significant similarity to the corresponding non-coding parts of the rnl intron of the ulvophycean algae Pseudendoclonium akinetum and Trichosarcina mucosa (Pombert et al., 2005), and, to a lesser extent, to Chlamydomonas reinhardtii (Rochaix et al., 1985). Furthermore, the sequences of the rnl gene flanking the intron show partial similarity to the I-CreI recognition sequence (Thompson et al., 1992).

Fig. 1. Chloroplast genome map of Seminavis robusta. Transcriptional direction is indicated by boxes on the outside (clockwise) or inside (counterclockwise) of the ring. Genes are colour-coded according to their functional categories as indicated at the bottom right. Genes for tRNAs are indicated by the single-letter code of the corresponding amino acid. Roman numerals (I–IV) mark the location of four gene-poor regions distinct for the S. robusta chloroplast genome. Abbreviations: IR, inverted repeat; LSC, large single-copy region; SSC, small single-copy region.

The atpB gene of S. robusta contains a group II intron with a total length of 2394 bp. Similar to group I introns, group II introns have self-splicing activity, and have been identified in bacteria and the chloroplast and mitochondrial genomes of fungi, plants, protists and an annelid worm. Group II introns are believed to be evolutionary ancestors of spliceosomal introns, the spliceosome and retrotransposons in eukaryotes (Lambowitz and Zimmerly, 2011). The S. robusta atpB intron contains an ORF (ORF582) encoding a reverse transcriptase (RT), which is a hallmark of bacterial group II introns, but which is frequently lost in eukaryotic organellar group II introns (Lambowitz and Zimmerly, 2011). Phylogenetic analyses showed that the S. robusta ORF582 RT is closely related to an RT from the chloroplast genome of the green alga Volvox carteri, which is also encoded by an ORF located in the intron of the chloroplast atpB gene (Fig. 2B) (Smith and Lee, 2009). The non-coding parts of the atpB introns of S. robusta and V. carteri also show significant similarity; 13 bp at the 5′ end and 9 bp at the 3′ end of the atpB intron are identical, suggesting similar splicing properties (Fig. A.2). The insertion site of the atpB intron is also conserved; in both species, the intron is inserted after a conserved Met (position 223 in S. robusta, position 239 in V. carteri) in atpB. Finally, the GC content of the atpB intron in S. robusta (37.3%) is higher than that of the surrounding atpB coding sequence (35.3%) and the total chloroplast genome (30.9%). However, it is somewhat lower than the GC content of the V. carteri atpB intron (39.7%). Similarly, an analysis of codon usage shows that the GC content in the third codon position of ORF582 (33.7%) is much higher than that of atpB (12.95%). Thus, the atpB intron in S. robusta appears to have been acquired through HGT from a green alga closely related to V. carteri.
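The two compositional measures used in this comparison, overall GC content and GC content at third codon positions (often abbreviated GC3), can be computed as in the following sketch. The toy sequence is invented for illustration and is not taken from atpB or ORF582; the function names are likewise illustrative.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)

def gc3_percent(cds):
    """GC percentage at third codon positions of an in-frame coding sequence."""
    # Slicing from index 2 in steps of 3 picks the third base of every codon
    return gc_percent(cds.upper()[2::3])

toy_cds = "ATGGCCAAATTT"  # codons: ATG GCC AAA TTT

print(f"GC:  {gc_percent(toy_cds):.2f}%")
print(f"GC3: {gc3_percent(toy_cds):.2f}%")
```

Third codon positions are largely free to vary without changing the encoded protein, so GC3 tends to reflect the compositional bias of the genome in which a gene has resided, which is why a foreign ORF can stand out on this measure.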

2.3. The pSr1 plasmid in S. robusta

The four gene-poor regions found in the S. robusta chloroplast genome all contain one or more ORFs not conserved in diatom chloroplast genomes (Table 2). The ORFs mostly show similarity to ORFs found in the chloroplast genomes of the diatom Fistulifera sp. JPCC DA0580 (Tanaka et al., 2011) and K. foliaceum (Imanian et al., 2010), and the plasmids pCf1 and pCf2 from C. fusiformis (Hildebrand et al., 1992). Both of these plasmids contain four ORFs encoding putative proteins of more than 100 AA; two of the ORFs on each plasmid (ORF218 and ORF482 on pCf1, ORF217 and ORF484 on pCf2) show strong pairwise similarity (Fig. 3A, B). At least one ORF with similarity to each of the six C. fusiformis plasmid ORFs is found in the gene-poor regions of the S. robusta chloroplast genome (Table 2). Some of the S. robusta homologues (e.g. ORF249) appear to be poorly conserved, and are likely not functional. Fragments of the plasmid ORFs are also found in the gene-poor regions. Eight incomplete ORFs with similarity to C. fusiformis ORF482/ORF484, sometimes without a start codon, are interspersed throughout regions III and IV.

Table 1
General properties of diatom and ‘dinotom’ chloroplast genomes.

Property                              S. robusta          P. tricornutum (a)  Fistulifera sp. (b) T. pseudonana (a)   O. sinensis (a)     D. baltica (c)      K. foliaceum (c)
Size (bp)                             150,905             117,369             134,918             128,814             119,704             116,470             140,426
Inverted repeat (IR)                  9,434               6,912               13,330              18,337              7,725               7,067               6,017
Gene content (d)
  Total                               175                 162                 164                 159                 160                 159                 173
  Excluding unique ORFs               160                 162                 159                 159                 160                 159                 160
Protein-coding genes (e)              130                 130                 132                 127                 128                 127                 128
No. of introns                        2                   0                   0                   0                   0                   0                   0
Start codons: ATG (e)                 124                 124                 123                 121                 123                 123                 123
Start codons: GTG                     4                   5                   6                   5                   5                   4                   5
Start codons: other                   2 ATT               1 ATT               2 TTG, 1 ATA        1 ATA               0                   0                   0
Total GC content (%)                  30.92               32.56               32.20               30.66               31.82               32.55               32.40
Average size intergenic spacer (bp)   214.0               88.4                179.5               108.2               115.7               94.3                246.7

(a) Data taken from Oudot-Le Secq et al. (2007).
(b) Fistulifera sp. JPCC DA0580; data taken from Tanaka et al. (2011).
(c) Data taken from Imanian et al. (2010).
(d) Duplicated genes were not taken into account.
(e) Unique ORFs were not taken into account.

One of the contigs with high read depth could not be assembled into the chloroplast genome. Upon closer analysis, this contig was found to constitute a separate circular molecule with a size of 3813 bp with significant similarity to C. fusiformis pCf2, which we designated as pSr1 (Fig. 3C). A previous survey did not identify any plasmids in two other members of the Naviculaceae, Fistulifera pelliculosa and Navicula incerta (Hildebrand et al., 1991), and no plasmid was reported in Fistulifera sp. JPCC DA0580 (Tanaka et al., 2011). Thus, the pSr1 plasmid is the first to be identified in a diatom belonging to Naviculales. Plasmids may not be a common feature in diatoms belonging to this order. Alternatively, plasmids may not have been detected in previous studies due to technical limitations. Purification of chloroplast DNA by cesium chloride or sucrose gradient centrifugation may result in the loss of any associated plasmid DNA.

pSr1 contains three ORFs encoding putative proteins of 494, 317 and 121 AAs, which show significant similarity to pCf2 ORF484, ORF246 and ORF125, respectively (NCBI BlastP expect value < 1e-36). The C-terminal part of pSr1 ORF317 also shows similarity to a small ORF in pCf2 (ORF64) that overlaps with pCf2 ORF246 (Fig. 3B). Introducing a deletion at position 732 of pCf2 ORF246 and an insertion in position 191 of ORF64 results in a continuous ORF encoding a putative protein of 311 AAs (ORF311) showing high similarity to pSr1 ORF317 and S. robusta chloroplast ORF292 (Fig. 3B; Fig. A.3). The two frameshifts in pCf2 may be the result of sequencing errors. Alternatively, they may have occurred as part of an inactivation of the ORF311 locus.
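How a single-base frameshift can truncate an ORF, and how reversing it restores a continuous reading frame, can be illustrated with a toy example. The sequences below are invented and unrelated to pCf2; the helper names are hypothetical, and the stop codons are those of the standard code, assumed here to apply to the plasmid ORFs.

```python
# Stop codons of the standard genetic code (assumed applicable for this sketch)
STOP_CODONS = {"TAA", "TAG", "TGA"}

def codons_before_stop(seq):
    """Count in-frame codons read from position 0 up to the first stop codon."""
    seq = seq.upper()
    count = 0
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            break
        count += 1
    return count

def delete_base(seq, pos):
    """Remove the base at 1-based position `pos`."""
    return seq[:pos - 1] + seq[pos:]

intact = "ATGGCTGAACGTCTTTAA"    # ATG GCT GAA CGT CTT TAA
shifted = "ATGCGCTGAACGTCTTTAA"  # same sequence with an extra C at position 4

print(codons_before_stop(shifted))                  # frame broken by the insertion
print(codons_before_stop(delete_base(shifted, 4)))  # frame restored by deleting it
```

In the toy case the extra base brings a TGA into frame after two codons, and deleting it recovers the full five-codon ORF. Whether a real frameshift reflects a sequencing error or a genuine inactivating mutation cannot be decided from the sequence alone.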

The only C. fusiformis plasmid ORFs with a putative function are ORF217/ORF218, which show similarity to serine recombinases. Homologues of these ORFs are not found in pSr1; however, gene-poor region III in the chloroplast genome encodes a serine recombinase, termed SerC2, with similarity to CfORF217 and CfORF218 as well as K. foliaceum SerC1 and SerC2 and Fistulifera sp. SerC2. Residues found to be critical for the active site of serine recombinases (Arg-8, Ser-10, Asp-67, Arg-68 and Arg-71 in the Escherichia coli γδ resolvase (Grindley et al., 2006)) are conserved in all diatom chloroplast serine recombinases. They also show a similar size and domain structure as γδ resolvase, suggesting that they may act through a similar mechanism.

Although the intracellular localisation of pSr1 is not known, it appears to be closely associated with the chloroplast genome. Cloned pCf2 hybridised to both chloroplast and nuclear DNA from C. fusiformis (Jacobs et al., 1992). However, no fragments with similarity to pSr1 have been identified in our non-chloroplast reads (results not shown). Multiple fragments of the plasmid ORFs are found in the chloroplast genome. Gene-poor region III contains homologues of all three pSr1 ORFs, in addition to SerC2, which is homologous to pCf2 ORF217 (Fig. 4). The gene order is conserved in the chloroplast region; however, two of the ORFs (SerC2 and ORF261) are inverted, and ORF261 is truncated, suggesting that it is a pseudogene. Also, two unrelated ORFs are inserted in the region.

In an attempt to elucidate the evolutionary origin of the various genes in the diatom plasmids, we performed phylogenetic analyses based on protein alignments of the plasmid ORFs with similar ORFs from other organisms. ORF494 shows similarity to ORFs from the chloroplast genomes of K. foliaceum and the raphidophyte Heterosigma akashiwo, an ORF assembled from ESTs and shotgun reads of the centric diatom Attheya sp. (Raymond and Kim, 2012), and ORF482/ORF484 from C. fusiformis plasmids (Fig. 5A). No other proteins with significant similarity were found; this protein family therefore appears to be specific to heterokont chloroplast genomes.

Fig. 2. Phylogeny of genes located in the chloroplast genome introns in Seminavis robusta. A) Neighbour-Joining (NJ) tree based on a protein alignment (Fig. A.1) of S. robusta LAGLIDADG endonuclease encoded in the rnl intron with other endonucleases. B) Neighbour-Joining (NJ) tree based on a protein alignment of S. robusta ORF582 encoded in the atpB intron with related reverse transcriptases. The overall topologies for the NJ and Maximum Likelihood (ML) trees are the same. Bootstrap confidence values above 50% are shown in the tree, NJ (first value) and ML (second value). Thermosynechococcus elongatus endonuclease was used as an outgroup in A), Lactococcus lactis RT was used as an outgroup in B). Different lineages are indicated as follows: Heterokonts, brown; Rhodophyta, red; Viridiplantae, green; Amoebozoa, blue; Euglenozoa, orange; Bacteria, black. Accession numbers for A) are listed in Fig. A.1. Accession numbers for B): Volvox carteri (ACI31231.1), Pycnococcus provasolii (ACK36851.1), Lyngbya sp. PCC 8106 (EAW33662.1), Nostoc sp. PCC 7120 (BAB77479.1), Physcomitrella patens (BAE93088.1), Ktedonobacter racemifer DSM 44963 (EFH82859.1), Arthrospira maxima CS-328 (EDZ96665.1), Chlamydomonas sp. CCMP1619 (AAQ91581.1), Euglena myxocylindracea (AAQ84048.1), Porphyra purpurea (AAD03095.1), Lactococcus lactis (AAD03095.1), Sinorhizobium meliloti 1021 (CAC49024.1). (For interpretation of the references to colour in this figure, the reader is referred to the web version of this article.)

5T. Brembu et al. / Marine Genomics xxx (2013) xxx–xxx

ORF317 in pSr1 and ORF292 in the chloroplast genome showed similarity to C. fusiformis pCf2 ORF311 and the C-terminal part of Fistulifera sp. JP033 (Fig. A.3, red bar). The C-terminal part of these proteins constitutes a previously unidentified motif that can be found in bacterial proteins of various sizes, especially from species belonging to the Firmicutes, Actinobacteria, Bacteroidetes and Proteobacteria (Fig. 5B). A similar relationship with Firmicutes was observed for pSr1 ORF121 and its chloroplast genome homologues ORF123 and ORF132. Although the similarity is low, short conserved motifs are observed (Fig. A.4). ORF317 and ORF121 are part of two divergent and fast evolving groups of proteins, with fewer than 20 proteins showing moderate similarity to each of them in the NCBI databases. Closer analyses of the genomic location of the bacterial homologues showed that gene order and orientation are conserved between diatom plasmids and a number of bacterial genomes (Fig. 4). Gene pairs showing highest similarity to pSr1 ORF317 and ORF121 were found in the genomes of bacteria belonging to the Clostridiales order (Clostridium acetobutylicum, Clostridium hathewayi and Acetivibrio cellulolyticus). Gene pairs with lower similarity were found in bacteria from other phyla, such as Proteobacteria (Moraxella catarrhalis) and Bacteroidetes (Microscilla marina). The ORF317–ORF121 gene pair and the similar gene pairs in bacteria have several properties characteristic of operons (Chuang et al., 2012). Both genes are transcribed in the same direction, and gene order is conserved. Notably, the intergenic spacer between the two ORFs is very short (<32 bp) in the bacterial genomes as well as the diatom plasmids.
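The operon-like properties listed here (shared transcriptional direction and a short intergenic spacer) can be written as a simple test on gene coordinates. The following Python sketch is illustrative only: the `Gene` record, the function name and the coordinates are hypothetical and are not taken from the pSr1 annotation; only the 32 bp spacer limit comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def is_operon_like(first: Gene, second: Gene, max_spacer: int = 32) -> bool:
    """Check two neighbouring genes, given in coordinate order, for
    operon-like organisation: same strand and an intergenic spacer
    shorter than max_spacer bp (overlapping genes count as short)."""
    if first.strand != second.strand:
        return False
    spacer = second.start - first.end - 1
    return spacer < max_spacer


# Hypothetical coordinates, chosen only to illustrate the check.
orf317 = Gene("ORF317", 1600, 2553, "+")
orf121 = Gene("ORF121", 2570, 2935, "+")
print(is_operon_like(orf317, orf121))
```

Conservation of gene order across genomes, the third property mentioned, would need a comparison of several annotated genomes and is not covered by this sketch.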

The three pSr1 ORFs were screened for the presence of transmembrane motifs (TMMs), using four different prediction servers. ORF494 and ORF121 did not contain any predicted TMMs. In contrast, ORF317 was predicted to contain two N-terminal TMMs by DAS (Cserzö et al., 2002) and OCTOPUS (Viklund and Elofsson, 2008), and one TMM by SPLIT (Juretic et al., 2002) (Fig. A.3). The TMHMM (Krogh et al., 2001) server did not predict any TMMs in ORF317. Analysis of pCf2 ORF311 yielded similar results; in addition, TMHMM also predicted one TMM. Given the large surface of the thylakoid membrane and the membrane association of all major photosynthetic protein complexes, it is not surprising that some of these proteins have predicted transmembrane motifs.
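When several predictors disagree, as they do for ORF317, the results can be summarised by counting how many servers support at least one TMM. The Python sketch below uses the ORF317 counts reported above; the summary function and the idea of reporting a supporting fraction are our illustrative assumptions, not a procedure used in the study.

```python
def summarise_tmm(predictions: dict) -> dict:
    """Summarise per-server TMM counts for one protein."""
    supporting = sorted(name for name, count in predictions.items() if count > 0)
    return {
        "supporting": supporting,                        # servers predicting >= 1 TMM
        "fraction": len(supporting) / len(predictions),  # share of servers in support
        "max_tmm": max(predictions.values()),            # largest number of TMMs predicted
    }


# Number of TMMs predicted for pSr1 ORF317 by each server (from the text).
orf317_predictions = {"DAS": 2, "OCTOPUS": 2, "SPLIT": 1, "TMHMM": 0}
print(summarise_tmm(orf317_predictions))
```

Three of the four servers support a membrane association for ORF317, but they do not agree on the number of TMMs, so the prediction should be read with caution.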

2.4. Uncharacterised ORFs

Two previously uncharacterised, yet evolutionarily conserved ORFs were identified in the S. robusta chloroplast genome. An ORF encoding a putative protein of 161 AA was located in gene-poor region III, between SerC2 and ORF188. The new ORF (ORF161) is highly similar to an uncharacterised ORF of 94 AA from K. foliaceum. If a poly(A) stretch in the K. foliaceum ORF is extended with one base, the ORF is extended at the 5′ end to 155 AA. Surprisingly, 150 of the first 151 AA of the two ORFs are identical (Fig. A.5A), suggesting that the HGT event giving rise to these ORFs is recent. No other sequence with similarity to these ORFs was found in GenBank.
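The effect of a single extra base in a poly(A) stretch on ORF length can be illustrated with a toy sequence. In the Python sketch below, the DNA sequence is invented for illustration and is unrelated to the K. foliaceum sequence; it only shows how one inserted A can bring an upstream start codon into frame with a downstream ORF and thereby extend the ORF at its 5′ end.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_length(seq: str, start: int):
    """Codons from the ATG at `start` up to (not including) the first
    in-frame stop codon; None if no in-frame stop is found."""
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return (i - start) // 3
    return None


def longest_orf(seq: str) -> int:
    """Length in codons of the longest ATG-initiated ORF on the forward strand."""
    lengths = [orf_length(seq, i) for i in range(len(seq) - 2) if seq[i:i + 3] == "ATG"]
    return max((n for n in lengths if n is not None), default=0)


# Invented toy sequence: upstream ATG, a poly(A) run, then a short downstream ORF.
upstream, tail = "ATGGCT", "GTGATGCCCGGGTTTTAA"
original = upstream + "A" * 5 + tail  # upstream ATG runs into an early stop
extended = upstream + "A" * 6 + tail  # one extra A puts both ATGs in the same frame

print(longest_orf(original), longest_orf(extended))
```

In the toy example the longest ORF grows from 4 to 9 codons after the single-base extension, analogous in kind (not in scale) to the 94 AA to 155 AA extension described for K. foliaceum.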

Gene-poor region IV contains an ORF encoding a putative protein of 140 AA, which shows high similarity to the product of an uncharacterised ORF found in the chloroplast genomes of two strains of H. akashiwo, CCMP452 (146 AA) and NIES293 (144 AA) (Fig. A.5B) (Cattolico et al., 2008). The C-terminal half of S. robusta ORF140 contains seven cysteine residues that are conserved in both H. akashiwo homologues. These residues may form disulphide bridges that stabilise the tertiary structure of the gene product. Alternatively, the conserved Cys residues could be the targets of redox regulation (Montrichard et al., 2009; Schürmann and Jacquot, 2000).

Table 2
Uncharacterised ORFs and recombinases encoded by the S. robusta chloroplast genome.

Region | ORF name | Highest similarity | Identity (%) | Fistulifera sp. JPCC DA0580 | K. foliaceum | H. akashiwo CCMP 452
I | ORF132 | pCf2 ORF125 [Cylindrotheca fusiformis] | 29 | JC032, JC033 (N-term), JC034 | – | –
II | ORF116 | pCf1 ORF111 [Cylindrotheca fusiformis] | 65 | – | – | –
III | serC2 | Putative serine recombinase [Kryptoperidinium foliaceum] | 77 | serC2 | serC2 | –
III | ORF161 | Unannotated ORF [Kryptoperidinium foliaceum] | 93 | – | Unannotated ORF between rrn5 and ORF157 | –
III | ORF188 | Hypothetical protein KrfoC_p115 [Kryptoperidinium foliaceum] | 31 | Unannotated ORF between ycf35 and psbA | ORF175 | –
III | ORF261 | pCf1 ORF482 [Cylindrotheca fusiformis] | 26 | JC81, JC82 | ORF141 | Heak452_Cp026
III | ORF292 | pCf2 ORF246 [Cylindrotheca fusiformis] | 53 | JC033 (C-term) | – | –
III | ORF123 | pCf2 ORF125 [Cylindrotheca fusiformis] | 73 | JC032, JC033 (N-term), JC034 | – | –
IV | ORF504 | Hypothetical protein Heak293_Cp026 [Heterosigma akashiwo] | 35 | JC81, JC82 | ORF141 | Heak452_Cp026
IV | tyrC | Putative integrase [Kryptoperidinium foliaceum] | 64 | – | tyrC | tyrC
IV | ORF140 | Unannotated ORF [Heterosigma akashiwo CCMP 452] | 78 | – | – | Unannotated ORF between tyrC and trnG
IV | ORF249 | pCf1 ORF311 [Cylindrotheca fusiformis] | 16 | – | – | –
pSr1 | ORF494 | pCf2 ORF484 [Cylindrotheca fusiformis] | 32 | JC81, JC82 | ORF141 | Heak452_Cp026
pSr1 | ORF317 | pCf2 ORF246 [Cylindrotheca fusiformis] | 39 | JC033 (C-term) | – | –
pSr1 | ORF121 | pCf2 ORF125 [Cylindrotheca fusiformis] | 52 | JC032, JC033 (N-term), JC034 | – | –

We investigated the expression levels of the uncharacterised ORFs by quantitative RT-PCR (Fig. 6). As expected, psbA, which is conserved in the chloroplast genome of all photosynthetic organisms (Green, 2011; Janouskovec et al., 2010), was expressed at very high levels. The psbA amplicon was detected after only 16 PCR cycles. None of the uncharacterised ORFs encoded by the S. robusta chloroplast genome were expressed at comparable levels. ORF140, ORF292 and ORF123 were expressed at low levels (Ct values between 25 and 30), whereas ORF161 and ORF500 transcripts were barely detected (Ct values between 30 and 35). ORF188 apparently was not expressed at detectable levels under the conditions used (Ct > 35). In a separate experiment, all three ORFs encoded by the pSr1 plasmid were found to be expressed at low levels. In view of these results, pSr1 appears not to be merely a vector for transport of genetic information, but is also able to confer transcription of its genes.
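The Ct ranges used above to describe expression levels can be expressed as a small classification function. The Python sketch below follows the ranges given in the text; how the exact boundary values (25, 30 and 35) are assigned, the label used for Ct values below 25, and the example Ct values other than the 16 cycles reported for psbA are our assumptions.

```python
def classify_ct(ct: float) -> str:
    """Map a cycle threshold (Ct) value to the expression categories used in the text."""
    if ct > 35:
        return "not detected"     # below the detection limit of the assay
    if ct > 30:
        return "barely detected"  # Ct between 30 and 35
    if ct >= 25:
        return "low"              # Ct between 25 and 30
    return "high"                 # assumption: everything below 25, e.g. psbA


# psbA (Ct 16) is taken from the text; the other values are hypothetical examples.
example_cts = {"psbA": 16.0, "example_low": 27.0, "example_barely": 32.5, "example_none": 36.0}
for gene, ct in example_cts.items():
    print(gene, classify_ct(ct))
```

Because a lower Ct means earlier detection, the categories run from "high" at small Ct values to "not detected" above 35 cycles.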

2.5. Recombinases

A neighbouring gene of ORF140, tyrC, encodes a putative tyrosine recombinase with similarity to TyrC from K. foliaceum and H. akashiwo. All residues that are critical for catalytic activity of tyrosine recombinases are conserved in the S. robusta TyrC, similar to its heterokont homologues (Fig. A.6A). Phylogenetic analyses (Fig. A.6B) showed that heterokont (and dinoflagellate) TyrC forms a clade together with the Int recombinase encoded by the chloroplast genome of the green alga Oedogonium cardiacum (Brouard et al., 2008). Another eukaryotic clade is formed by recombinases encoded by the mitochondrial genomes of two other green algae, Prototheca wickerhamii (Wolff et al., 1994) and Chaetosphaeridium globosum (Turmel et al., 2002a). XerCD family tyrosine recombinases with a lower similarity to TyrC are found in a large number of bacteria, mainly belonging to Firmicutes. A bacterium belonging to this phylum may be the source of the ancestral lateral gene transfer of a tyrosine recombinase to an algal organellar genome. Expression analyses indicated that neither tyrC nor serC2 was expressed (Fig. 6).

Based on the presence of serine recombinases in the pCf1 and pCf2 plasmids (Hildebrand et al., 1992), SerC2 in the S. robusta chloroplast genome has likely also been associated with a plasmid, possibly a predecessor of pSr1. After integration of pSr1 in the chloroplast genome, the serC2 gene may have been lost from the plasmid. One possible role for TyrC could be to act in conversion of multimeric chloroplast molecules to monomers, as has been speculated for the H. akashiwo TyrC (Cattolico et al., 2008). A XerCD family recombinase has been shown to mediate excision of a genomic island from the genome of the bacterial pathogen Helicobacter pylori; conjugative transfer of such genomic islands is believed to contribute to the genetic diversity of H. pylori (Fischer et al., 2010). Whether a similar role can be attributed to TyrC in the chloroplast genomes of S. robusta and other eukaryotes warrants further experimentation.

The occurrence of gene-poor regions containing uncharacterised ORFs appears to coincide with the presence of a serine recombinase (Fistulifera sp.), a tyrosine recombinase (H. akashiwo), or both (S. robusta and K. foliaceum) in the chloroplast genome (Table 2). The chloroplast genomes of P. tricornutum, T. pseudonana and the diatom endosymbiont of D. baltica do not encode any recombinase; none of the ORFs listed in Table 2 are found in these diatoms, and the mean intergenic spacer is smaller (Table 1). Interestingly, an ORF encoding a partial serine recombinase (annotated as Escp117) is found in the chloroplast genome of the brown alga Ectocarpus siliculosus (Le Corguillé et al., 2009). The intergenic regions of the E. siliculosus chloroplast genome are longer than those of another brown alga, F. vesiculosus, where no traces of any recombinase were found. The insertion sites of tyrosine recombinases in the chloroplastic and mitochondrial genomes of green algae (Brouard et al., 2008; Turmel et al., 2002a; Wolff et al., 1994) are also characterised by low gene density. The apparent “junk” DNA associated with the recombinases may have been caught between the recombinase binding sites upon excision from an ancestral donor. Due to the lack of selection pressure, the non-coding parts of the transferred DNA have diverged quickly.

2.6. Conclusions

We report the complete plastid genome and the sequence of a plasmid (pSr1) of the benthic diatom S. robusta. Our study shows that diatom plastid genomes are subject to major changes due to HGT events. The enlarged size of the S. robusta chloroplast genome is due to various HGT events that have occurred through different mechanisms (homing introns, recombinases) and from different sources (the pSr1 plasmid, other heterokonts, green algae). High sequence similarity indicates that two of the HGT events (resulting in the introduction of ORF161 and the atpB intron) may be recent. Diatom plasmids may act as vectors for transfer of genetic material between chloroplasts of different diatom species, and even other heterokonts. The bacterial origin of at least two of the plasmid-localised genes suggests that they are derived from bacteria belonging to the Clostridia. Sequencing of other diatom and heterokont chloroplast genomes will likely lead to a better understanding of HGT between chloroplast genomes and the possible role of diatom plasmids in this process in heterokonts.

Fig. 3. Genetic maps of the known diatom plasmids. A) C. fusiformis pCf1 (Hildebrand et al., 1992), B) C. fusiformis pCf2 (Hildebrand et al., 1992), C) S. robusta pSr1. Arrows indicate transcriptional direction. Colours indicate similar ORFs. The diatom-specific N-terminal encoded by ORF311 (pCf2) and ORF317 (pSr1) is coloured orange, whereas the evolutionarily conserved C-terminal is coloured green. The asterisk in B) indicates that two frameshift modifications of the pCf2 sequence result in the generation of a new ORF (ORF311) that replaces ORF246 and ORF64. (For interpretation of the references to colour in this figure, the reader is referred to the web version of this article.)

[Map labels: pCf1, 4273 bp (ORF218, ORF482, ORF311, ORF111); pCf2, 4079 bp (ORF217, ORF484, ORF311*, ORF125; ORF246 and ORF64 also marked); pSr1, 3813 bp (ORF494, ORF317, ORF121).]

3. Methods

3.1. Culturing

S. robusta strains were obtained from the BCCM/DCG culture collection (http://bccm.belspo.be), accession numbers DCG 0115 and DCG 0230. These were mated, and one of the progeny strains (D6) was used further. The strains were cultivated in f/2 medium based on 0.2 μm filtered and autoclaved seawater supplemented with vitamins and inorganic nutrients (Guillard, 1975). Cells were grown at 22 °C in a 16 hour light:8 hour dark photoperiod at an illumination of approximately 100 μmol m−2 s−1.

3.2. Genomic DNA purification

Isolation of genomic DNA was based on a modified protocol from Bowler et al. (Bowler et al., 2008). Six litres of S. robusta culture in late exponential phase was centrifuged at 2000 g for 10 min at 4 °C. The cell pellet was frozen in liquid nitrogen and resuspended in lysis buffer (50 mM Tris–HCl pH 8.0, 50 mM EDTA pH 8.0, 1% SDS, 10 mM DTT, 10 mg/ml of proteinase K; 10 ml buffer/l of culture) and incubated at 50 °C for 45 min. Three phenol/chloroform extractions were performed to remove proteins. The lysate was treated with RNase (10 mg/ml, 2 μl per ml lysate) at 37 °C for 60 min after the first phenol/chloroform extraction. A subsequent extraction with chloroform:isoamyl alcohol (24:1) was made to completely eliminate the phenol residues. Genomic DNA was precipitated (2 volumes ethanol, 0.1 M NaCl), and the visible DNA was wound up on a glass rod and transferred to a 15 ml tube. 10 ml 70% EtOH was added and the pellet was incubated at 4 °C overnight. The pellet was washed in 70% EtOH once more without centrifugation, air dried and resuspended in TE (10 mM Tris pH 8.0, 1 mM EDTA). DNA concentration was determined by spectrophotometry at 260 nm, by fluorometry (Qubit, Invitrogen), and checked on a 0.8% agarose gel.
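The reagent volumes implied by the per-litre and per-millilitre amounts in this protocol follow from simple scaling. The short Python sketch below works this out for the six-litre culture; treating the lysate volume as equal to the lysis buffer volume is our simplifying assumption, since the protocol does not state the lysate volume at the RNase step.

```python
def lysis_volumes(culture_litres: float,
                  buffer_ml_per_litre: float = 10.0,
                  rnase_ul_per_ml: float = 2.0) -> dict:
    """Scale lysis buffer and RNase volumes to the culture volume."""
    buffer_ml = culture_litres * buffer_ml_per_litre
    # Assumption: the lysate volume is approximated by the lysis buffer volume.
    rnase_ul = buffer_ml * rnase_ul_per_ml
    return {"lysis_buffer_ml": buffer_ml, "rnase_ul": rnase_ul}


print(lysis_volumes(6))
```

Under this assumption, six litres of culture correspond to 60 ml of lysis buffer and about 120 μl of the RNase stock.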

3.3. Sequencing of the S. robusta genome

Genomic DNA from S. robusta was subjected to pyrosequencing of shotgun and paired end libraries with 3 kb and 8 kb jumps. The preparation and sequencing of the DNA libraries were performed according to standard protocols from 454 Life Sciences Corporation (Roche Applied Science). Pyrosequencing was performed on a Genome Sequencer FLX system using Titanium Chemistry (Roche, 454) at the Norwegian Sequencing Centre (http://www.sequencing.uio.no/). In total, 4,321,373 shotgun reads, 93,916 3 kb paired end reads and 180,133 8 kb paired end reads were assembled using the Newbler program v2.5 (Margulies et al., 2005), using default settings. Assembly resulted in scaffolds and contigs with more than 500-fold coverage of the chloroplast genome. Scaffolds and contigs belonging to the chloroplast genome (between 6740 and 30,827 bp) were identified based on similarity to the chloroplast genomes of P. tricornutum (Bowler et al., 2008) and T. pseudonana (Armbrust et al., 2004), and similarity in read depth.

In order to fill the gaps between the resulting contigs, PCR primers flanking the contig ends were designed (Table S1) and PCR was performed on genomic DNA from S. robusta using a high-fidelity DNA polymerase (Ex Taq, TAKARA). The resulting PCR products were subjected to Sanger sequencing (Applied Biosystems) according to the manufacturer's protocol.

3.4. Genome annotation

The S. robusta chloroplast genome was assembled and putative ORFs were identified using Clone Manager 9 (Sci-Ed Software) and refined manually. Chloroplast protein-coding genes were identified using the DOGMA tool (Wyman et al., 2004) and BLAST homology searches (Altschul et al., 1997). Genes encoding ribosomal RNAs and miscellaneous genes were found by comparison with homologues in P. tricornutum and T. pseudonana. Genes for tRNAs and tmRNA were identified using the tRNAscan-SE search server (Schattner et al., 2005). The uncharacterised ORFs were analysed for transmembrane domains using the prediction servers TMHMM (Krogh et al., 2001), DAS (Cserzö et al., 1997), OCTOPUS (Viklund and Elofsson, 2008) and SPLIT (Juretic et al., 2002).

Fig. 4. Conservation of gene order in diatom plasmids. Arrows indicate transcriptional direction. ORFs with similarity to plasmid ORFs are coloured accordingly. The colours for the different ORFs correspond to Fig. 3. Nonconserved domains and genes are coloured white.

[Rows shown: pCf2 (ORF217, ORF484, ORF311, ORF125); pSr1 (ORF494, ORF317, ORF121); S. robusta gene-poor region III (ORF123, ORF292, ORF188, ORF261, ORF161, SerC2); Clostridium acetobutylicum; Clostridium hathewayi; Acetivibrio cellulolyticus; Brevibacillus laterosporus; Microscilla marina; Moraxella catarrhalis. Scale bar: 1 kb.]

The physical map of the chloroplast genome was drawn using the GenomeVx tool (Conant and Wolfe, 2008). The map of the putative chloroplast plasmid was made in Clone Manager. Both maps were refined using Adobe Illustrator CS5.

3.5. Phylogenetic analyses

DNA and protein alignments were generated using Macaw 2.05 (NCBI) and manually refined in GeneDoc 2.7.000 (Nicholas et al., 1997). The ClustalX program (Thompson et al., 1997) was used to create bootstrapped neighbour-joining (N-J) (Saitou and Nei, 1987) trees using the Gonnet 250 score matrix. Bootstrapping of the N-J tree was done with 1000 bootstrap trials. A number of substitution matrices were evaluated and the best one was selected. Maximum likelihood trees were created with the RAxML program (version 7.2.6) using the GAMMA model of rate heterogeneity and the BLOSUM62 substitution matrix (Stamatakis, 2006). A total of 100 non-parametric bootstrap inferences were executed. Trees were visualised using TreeViewX 0.5.0 (Page, 2002) or Dendroscope 2.7.2 (Hudson et al., 2007) and refined using Adobe Illustrator CS5.
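The figure legends report branch support as paired NJ/ML bootstrap values, with a dash where support falls below 50%. The Python sketch below shows how such labels can be derived from bootstrap counts (1000 N-J trials and 100 ML inferences, as stated above). The function names and example counts are hypothetical, and treating exactly 50% as displayed is our assumption.

```python
def support_percent(hits: int, replicates: int) -> int:
    """Bootstrap support as a rounded percentage of replicates recovering a clade."""
    return round(100 * hits / replicates)


def format_support(nj: int, ml: int, cutoff: int = 50) -> str:
    """Format an 'NJ/ML' branch label, replacing values below the cutoff with '-'."""
    def fmt(value: int) -> str:
        return str(value) if value >= cutoff else "-"
    return f"{fmt(nj)}/{fmt(ml)}"


# Hypothetical counts: a clade recovered in 1000 of 1000 N-J trials
# and in 93 of 100 ML bootstrap inferences.
label = format_support(support_percent(1000, 1000), support_percent(93, 100))
print(label)
```

A branch supported by only one method is then labelled with a single value and a dash, for example "83/-".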

3.6. qRT-PCR

For expression analyses of chloroplast genome genes, two biological replicates of S. robusta grown and harvested as previously described were used. For expression analyses of pSr1 genes, three biological replicates of S. robusta grown under continuous light were harvested. Total RNA was isolated from the cultures as described by Nymark et al. (Nymark et al., 2009) and used in a two-step quantitative real-time PCR (qRT-PCR). Reverse transcription of the RNA was performed with the PrimeScript™ 1st strand cDNA Synthesis Kit (TaKaRa), following the recommended protocol for synthesis of real-time PCR template using random primers. 500 ng of total RNA was used in each reaction. qRT-PCR mixtures (20 μl) were prepared containing forward and reverse primers listed in Table S2, with a final concentration of 0.5 μM each, 5 μl cDNA template diluted 1:10 and 2× LightCycler® 480 SYBR Green I Master mix (Roche). The qRT-PCR reactions were run in a LightCycler® 480 Multiwell Plate 96 (Roche) in a LightCycler 480 instrument (Roche). No-template controls, where the cDNA template was replaced with PCR-grade water, were included in each run to ensure that no reagents were contaminated with DNA. To detect the level of genomic DNA still present in the 24 RNA samples after the DNase I treatment, qRT-PCR was performed using 7.5 ng of isolated RNA as template and three different primer pairs listed in Table S2. The PCR parameters were programmed according to the manufacturer's instructions for a LightCycler 480 System PCR run with the LightCycler® 480 SYBR Green I Master: 5 min preincubation at 95 °C, followed by 40 cycles with 10 s at 95 °C, 10 s at 55 °C and 10 s at 72 °C. After 35 cycles the specificity of the amplified PCR products was tested by heating from 65 °C up to 95 °C with a ramp rate of 2.2 °C/s, resulting in melting curves. The Second Derivative Maximum Method of the LightCycler 480 software was used to identify the crossing points (CPs) of the samples. A cycle threshold (Ct) value of 35 represents detection of a single template molecule; therefore, Ct values of >35 were considered to be below the detection limit of the qRT-PCR assay (Guthrie et al., 2008). LinRegPCR software (Ramakers et al., 2003) was used to determine the PCR efficiency for each sample. The primer set efficiency was determined by calculating the mean of the efficiency values obtained from the individual samples.

Fig. 5. Phylogeny of the Seminavis robusta pSr1 plasmid-located genes. Neighbour Joining (NJ) and Maximum Likelihood (ML) trees were constructed based on protein alignments of each of the plasmid-encoded ORFs in S. robusta. The NJ trees are shown. The overall topologies for the NJ and ML trees are the same. When above 50%, bootstrap values are provided for NJ (first value) and ML (second value) analyses. Heterokont and bacterial lineages are indicated by brown and black colour, respectively. A) Unrooted phylogram of the Seminavis robusta chloroplast proteins ORF504 and ORF261 and the plasmid-encoded ORF494 (pSr1) with related proteins from heterokonts. Accession numbers: Cylindrotheca fusiformis ORF482 (CAA45581.1), Cylindrotheca fusiformis ORF484 (CAA45586.1), Heterosigma akashiwo Heak293_Cp026 (ABV65933.1), Kryptoperidinium foliaceum ORF141 (YP_003734646.1). The Attheya sp. ORF was compiled from ESTs JK727780, JK727368 and JK728101 plus reads from Sequence Read Archive, Attheya CCMP212. B) Phylogram of the Seminavis robusta chloroplast protein ORF292 and the plasmid-encoded ORF317 (pSr1) made from an alignment that included bacterial proteins with homology to the conserved C-terminal domain. Accession numbers: Acetivibrio cellulolyticus CD2 (EFL60893.1), Acinetobacter baumannii (YP_001736311), Brevibacillus laterosporus LMG 15441 (EGP36075.1), Butyrivibrio fibrisolvens 16/4 (CBK74180.1), Clostridium acetobutylicum ATCC 824 (NP_347640.1), Clostridium hathewayi DSM 13479 (EFC95524.1), Clostridium thermocellum DSM 1313 (ADU75261.1), Cylindrotheca fusiformis ORF311 (CAA45587.1; CAA45589.1), Fistulifera sp. FispC_p033 (YP_004376598.1), Marine metagenome (EBG29459.1), Microscilla marina ATCC 23134 (EAY24864.1), Moraxella catarrhalis CO72 (EGE22131.1), Streptomyces sp. C (EFL17716.1).

Fig. 6. Gene expression of uncharacterised ORFs in the S. robusta chloroplast genome and the pSr1 plasmid. Expression level is shown as cycle threshold (Ct) value (the number of cycles required for the qPCR fluorescence signal to exceed background level).

[Bar chart. Y-axis: Ct (0 to 40). X-axis, chloroplast genome: psbA, ORF140, ORF161, ORF188, ORF292, ORF123, ORF500, tyrC, serC2; pSr1: ORF494, ORF317, ORF121.]
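The averaging of per-sample efficiencies described in Section 3.6, and the meaning of a Ct difference between two amplicons, can be sketched in a few lines of Python. The efficiency values below are hypothetical, efficiency is expressed as fold amplification per cycle (2.0 meaning perfect doubling), and the fold-difference formula assumes that both amplicons amplify with the same efficiency; none of this is presented as the exact calculation used in the study.

```python
from statistics import mean


def primer_set_efficiency(sample_efficiencies: list) -> float:
    """Primer set efficiency as the mean of the per-sample PCR efficiencies."""
    return mean(sample_efficiencies)


def fold_difference(ct_a: float, ct_b: float, efficiency: float = 2.0) -> float:
    """Approximate starting abundance of template A relative to template B,
    assuming both amplicons share the same amplification efficiency."""
    return efficiency ** (ct_b - ct_a)


# Hypothetical per-sample efficiencies for one primer pair.
print(primer_set_efficiency([1.9, 2.0, 1.8]))

# A Ct of 16 (as reported for psbA) compared with a hypothetical Ct of 27.
print(fold_difference(16, 27))
```

With perfect doubling, an 11-cycle difference corresponds to a 2048-fold difference in starting template, which illustrates how far the uncharacterised ORFs lie below psbA.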

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.margen.2013.12.002.

Acknowledgements

We thank Wim Vyverman, University of Ghent, for useful comments on the manuscript, Mari-Ann Østensen for assisting with cultivation and Torfinn Sparstad for DNA isolation and qPCR analysis. The sequencing service was provided by the Norwegian Sequencing Centre (www.sequencing.uio.no), a national technology platform hosted by the University of Oslo and supported by the “Functional Genomics” (FUGE) and “INFRAstructure” programs of the Research Council of Norway.

References

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., Lipman, D.J., 1997.Gapped BLAST and PSI-BLAST: a new generation of protein database search pro-grams. Nucleic Acids Res. 25, 3389–3402.

Armbrust, E.V., Berges, J.A., Bowler, C., Green, B.R., Martinez, D., Putnam, N.H., Zhou, S.,Allen, A.E., Apt, K.E., Bechner, M., et al., 2004. The genome of the diatom Thalassiosirapseudonana: ecology, evolution, and metabolism. Science 306, 79–86.

Belfort, M., Roberts, R.J., 1997. Homing endonucleases: keeping the house in order.Nucleic Acids Res. 25, 3379–3388.

Bowler, C., Allen, A.E., Badger, J.H., Grimwood, J., Jabbari, K., Kuo, A., Maheswari, U.,Martens, C., Maumus, F., Otillar, R.P., et al., 2008. The Phaeodactylum genome revealsthe evolutionary history of diatom genomes. Nature 456, 239–244.

Brouard, J.S., Otis, C., Lemieux, C., Turmel, M., 2008. Chloroplast DNA sequence of thegreen alga Oedogonium cardiacum (Chlorophyceae): unique genome architecture, de-rived characters shared with the Chaetophorales and novel genes acquired throughhorizontal transfer. BMC Genomics 9, 290.

Burki, F., Flegontov, P., Oborník, M., Cihlár, J., Pain, A., Lukes, J., Keeling, P.J., 2012. Re-evaluating the green versus red signal in eukaryotes with secondary plastid of redalgal origin. Genome Biol. Evol. 4, 626–635.

Cattolico, R.A., Jacobs, M.A., Zhou, Y., Chang, J., Duplessis, M., Lybrand, T., Mckay, J., Ong,H.C., Sims, E., Rocap, G., 2008. Chloroplast genome sequencing analysis ofHeterosigmaakashiwo CCMP452 (West Atlantic) and NIES293 (West Pacific) strains. BMC Geno-mics 9, 211.

Chan, S.H., Stoddard, B.L., Xu, S.Y., 2011. Natural and engineered nicking endonucleases—from cleavage mechanism to engineering of strand-specificity. Nucleic Acids Res. 39,1–18.

Chepurnov, V.A., Mann, D.G., von Dassow, P., Vanormelingen, P., Gillard, J., Inzé, D., Sabbe,K., Vyverman, W., 2008. In search of new tractable diatoms for experimental biology.BioEssays 30, 692–702.

Chepurnov, V.A., Mann, D.G., Vyverman,W., Sabbe, K., Danielidis, D.B., 2002. Sexual repro-duction, mating system, and protoplast dynamics of Seminavis (Bacillariophyceae).J. Phycol. 38, 1004–1019.

Chuang, L.Y., Chang, H.W., Tsai, J.H., Yang, C.H., 2012. Features for computational operon prediction in prokaryotes. Brief. Funct. Genomics 11, 291–299.

Conant, G.C., Wolfe, K.H., 2008. GenomeVx: simple web-based creation of editable circular chromosome maps. Bioinformatics 24, 861–862.

Cserzö, M., Eisenhaber, F., Eisenhaber, B., Simon, I., 2002. On filtering false positive transmembrane protein predictions. Protein Eng. 15, 745–752.

Cserzö, M., Wallin, E., Simon, I., von Heijne, G., Elofsson, A., 1997. Prediction of transmembrane alpha-helices in prokaryotic membrane proteins: the dense alignment surface method. Protein Eng. 10, 673–676.

Danielidis, D.B., Mann, D.G., 2002. The systematics of Seminavis (Bacillariophyta): the lost identities of Amphora angusta, A. ventricosa and A. macilenta. Eur. J. Phycol. 37, 429–448.

Depauw, F.A., Rogato, A., Ribera d'Alcalá, M., Falciatore, A., 2012. Exploring the molecular basis of responses to light in marine diatoms. J. Exp. Bot. 57, 1159–1172.

Deschamps, P., Moreira, D., 2012. Reevaluating the green contribution to diatom genomes. Genome Biol. Evol. 4, 683–688.

Field, C.B., Behrenfeld, M.J., Randerson, J.T., Falkowski, P., 1998. Primary production of the biosphere: integrating terrestrial and oceanic components. Science 281, 237–240.

Fischer, W., Windhager, L., Rohrer, S., Zeiller, M., Karnholz, A., Hoffmann, R., Zimmer, R., Haas, R., 2010. Strain-specific genes of Helicobacter pylori: genome evolution driven by a novel type IV secretion system and genomic island transfer. Nucleic Acids Res. 38, 6089–6101.

Galachyants, Y.P., Morozov, A.A., Mardanov, A.V., Beletsky, A.V., Ravin, N.V., Petrova, D.P., Likhosway, Y.V., 2012. Complete chloroplast genome sequence of freshwater araphid pennate diatom alga Synedra acus from lake Baikal. Int. J. Biol. 4, 27–35.

Gillard, J., Devos, V., Huysman, M.J., De Veylder, L., D'Hondt, S., Martens, C., Vanormelingen, P., Vannerum, K., Sabbe, K., Chepurnov, V.A., et al., 2008. Physiological and transcriptomic evidence for a close coupling between chloroplast ontogeny and cell cycle progression in the pennate diatom Seminavis robusta. Plant Physiol. 148, 1394–1411.

Gillard, J., Frenkel, J., Devos, V., Sabbe, K., Paul, C., Rempt, M., Inzé, D., Pohnert, G., Vuylsteke, M., Vyverman, W., 2013. Metabolomics enables the structure elucidation of a diatom sex pheromone. Angew. Chem. Int. Ed. 52, 854–857.

Green, B.R., 2011. Chloroplast genomes of photosynthetic eukaryotes. Plant J. 66, 34–44.

Grindley, N.D., Whiteson, K.L., Rice, P.A., 2006. Mechanisms of site-specific recombination. Annu. Rev. Biochem. 75, 567–605.

Guillard, R.R.L., 1975. Culture of phytoplankton for feeding marine invertebrates. In: Smith, W.L., Chanley, M.H. (Eds.), Culture of Marine Invertebrate Animals. Plenum Press, New York, pp. 26–60.

Guthrie, J.L., Seah, C., Brown, S., Tang, P., Jamieson, F., Drews, S.J., 2008. Use of Bordetella pertussis BP3385 to establish a cutoff value for an IS481-targeted real-time PCR assay. J. Clin. Microbiol. 46, 3798–3799.

Haugen, P., Simon, D.M., Bhattacharya, D., 2005. The natural history of group I introns. Trends Genet. 21, 111–119.

Heath, P.J., Stephens, K.M., Monnat Jr., R.J., Stoddard, B.L., 1997. The structure of I-CreI, a group I intron-encoded homing endonuclease. Nat. Struct. Biol. 4, 468–476.

Hildebrand, M., Corey, D.K., Ludwig, J.R., Kukel, A., Feng, T.Y., Volcani, B.E., 1991. Plasmids in diatom species. J. Bacteriol. 173, 5924–5927.

Hildebrand, M., Hasegawa, P., Ord, R.W., Thorpe, V.S., Glass, C.A., Volcani, B.E., 1992. Nucleotide sequence of diatom plasmids: identification of open reading frames with similarity to site-specific recombinases. Plant Mol. Biol. 19, 759–770.

Hudson, D.H., Richter, D.C., Rausch, C., Dezulian, T., Franz, M., Rupp, R., 2007. Dendroscope: an interactive viewer for large phylogenetic trees. BMC Bioinforma. 8, 460.

Imanian, B., Pombert, J.F., Keeling, P.J., 2010. The complete plastid genomes of the two ‘dinotoms’ Durinskia baltica and Kryptoperidinium foliaceum. PLoS One 5, e10711.

Jacobs, J.D., Ludwig, J.R., Hildebrand, M., Kukel, A., Feng, T.Y., Ord, R.W., Volcani, B.E., 1992. Characterization of two circular plasmids from the marine diatom Cylindrotheca fusiformis: plasmids hybridize to chloroplast and nuclear DNA. Mol. Gen. Genet. 233, 302–310.

Janouskovec, J., Horák, A., Oborník, M., Lukes, J., Keeling, P.J., 2010. A common red algal origin of the apicomplexan, dinoflagellate, and heterokont plastids. Proc. Natl. Acad. Sci. U. S. A. 107, 10949–10954.

Jin, Y., Binkowski, G., Simon, L.D., Norris, D., 1997. Ho endonuclease cleaves MAT DNA in vitro by an inefficient stoichiometric reaction mechanism. J. Biol. Chem. 272, 7352–7359.

Jiroutová, K., Koreny, L., Bowler, C., Oborník, M., 2010. A gene in the process of endosymbiotic transfer. PLoS One 5, e13234.

Juretic, D., Zoranic, L., Zucic, D., 2002. Basic charge clusters and predictions of membrane protein topology. J. Chem. Inf. Comput. Sci. 42, 620–632.

Kowallik, K.V., Stoebe, B., Schaffran, I., Kroth-Pancic, P., Freier, U., 1995. The chloroplast genome of a chlorophyll a+c-containing alga, Odontella sinensis. Plant Mol. Biol. Report. 13, 336–342.

Krogh, A., Larsson, B., von Heijne, G., Sonnhammer, E.L., 2001. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol. 305, 567–580.

Lambowitz, A.M., Zimmerly, S., 2011. Group II introns: mobile ribozymes that invade DNA. Cold Spring Harb. Perspect. Biol. 3. http://dx.doi.org/10.1101/cshperspect.a003616.

Le Corguillé, G., Pearson, G., Valente, M., Viegas, C., Gschloessl, B., Corre, E., Bailly, X., Peters, A.F., Jubin, C., Vacherie, B., et al., 2009. Plastid genomes of two brown algae, Ectocarpus siliculosus and Fucus vesiculosus: further insights on the evolution of red-algal derived plastids. BMC Evol. Biol. 9, 253.

Lonergan, K.M., Gray, M.W., 1994. The ribosomal RNA gene region in Acanthamoeba castellanii mitochondrial DNA. A case of evolutionary transfer of introns between mitochondria and plastids? J. Mol. Biol. 239, 476–499.

Lucas, P., Otis, C., Mercier, J.P., Turmel, M., Lemieux, C., 2001. Rapid evolution of the DNA-binding site in LAGLIDADG homing endonucleases. Nucleic Acids Res. 29, 960–969.

Margulies, M., Egholm, M., Altman, W.E., Attiya, S., Bader, J.S., Bemben, L.A., Berka, J., Braverman, M.S., Chen, Y.J., Chen, Z., et al., 2005. Genome sequencing in microfabricated high-density picolitre reactors. Nature 437, 376–380.

Michel, F., Netter, P., Xu, M.Q., Shub, D.A., 1990. Mechanism of 3′ splice site selection by the catalytic core of the sunY intron of bacteriophage T4: the role of a novel base-pairing interaction in group I introns. Genes Dev. 4, 777–788.

Montrichard, F., Alkhalfioui, F., Yano, H., Vensel, W.H., Hurkman, W.J., Buchanan, B.B., 2009. Thioredoxin targets in plants: the first 30 years. J. Proteome 72, 452–474.

Moustafa, A., Beszteri, B., Maier, U.G., Bowler, C., Valentin, K., Bhattacharya, D., 2009. Genomic footprints of a cryptic plastid endosymbiosis in diatoms. Science 324, 1724–1726.

Nicholas, K.B., Nicholas, H.B.J., Deerfield, D.W.I., 1997. GeneDoc: analysis and visualization of genetic variation. EMBnetnews 4, 1–4.

Nymark, M., Valle, K.C., Brembu, T., Hancke, K., Winge, P., Andresen, K., Johnsen, G., Bones, A.M., 2009. An integrated analysis of molecular acclimation to high light in the marine diatom Phaeodactylum tricornutum. PLoS One 4, e7743.

Oudot-Le Secq, M.P., Grimwood, J., Shapiro, H., Armbrust, E.V., Bowler, C., Green, B.R., 2007. Chloroplast genomes of the diatoms Phaeodactylum tricornutum and Thalassiosira pseudonana: comparison with other plastid genomes of the red lineage. Mol. Genet. Genomics 277, 427–439.

Page, R.D., 2002. Visualizing phylogenetic trees using TreeView. Curr. Protoc. Bioinformatics (Chapter 6, Unit 6 2).

Pombert, J.F., Otis, C., Lemieux, C., Turmel, M., 2005. Chloroplast genome sequence of the green alga Pseudendoclonium akinetum (Ulvophyceae) reveals unusual structural features and new insights into the branching order of chlorophyte lineages. Mol. Biol. Evol. 22, 1903–1918.

Ramakers, C., Ruijter, J.M., Deprez, R.H., Moorman, A.F., 2003. Assumption-free analysis of quantitative real-time polymerase chain reaction (PCR) data. Neurosci. Lett. 339, 62–66.

Raymond, J.A., Kim, H.J., 2012. Possible role of horizontal gene transfer in the colonization of sea ice by algae. PLoS One 7, e35968.

Reyes-Prieto, A., Weber, A.P., Bhattacharya, D., 2007. The origin and establishment of the plastid in algae and plants. Annu. Rev. Genet. 41, 147–168.

Rochaix, J.D., Rahire, M., Michel, F., 1985. The chloroplast ribosomal intron of Chlamydomonas reinhardii codes for a polypeptide related to mitochondrial maturases. Nucleic Acids Res. 13, 975–984.

Round, F.E., Crawford, R.M., Mann, D.G., 1990. The Diatoms. Cambridge University Press, Cambridge.

Saitou, N., Nei, M., 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425.

Schattner, P., Brooks, A.N., Lowe, T.M., 2005. The tRNAscan-SE, snoscan and snoGPS web servers for the detection of tRNAs and snoRNAs. Nucleic Acids Res. 33, W686–W689.

Schürmann, P., Jacquot, J.P., 2000. Plant thioredoxin systems revisited. Annu. Rev. Plant Physiol. Plant Mol. Biol. 51, 371–400.

Smith, D.R., Lee, R.W., 2009. The mitochondrial and plastid genomes of Volvox carteri: bloated molecules rich in repetitive DNA. BMC Genomics 10, 132.

Stamatakis, A., 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22, 2688–2690.

Stoddard, B., Belfort, M., 2010. Social networking between mobile introns and their host genes. Mol. Microbiol. 78, 1–4.

Tanaka, T., Fukuda, Y., Yoshino, T., Maeda, Y., Muto, M., Matsumoto, M., Mayama, S., Matsunaga, T., 2011. High-throughput pyrosequencing of the chloroplast genome of a highly neutral-lipid-producing marine pennate diatom, Fistulifera sp. strain JPCC DA0580. Photosynth. Res. 109, 223–229.

Thompson, A.J., Yuan, X., Kudlicki, W., Herrin, D.L., 1992. Cleavage and recognition pattern of a double-strand-specific endonuclease (I-creI) encoded by the chloroplast 23S rRNA intron of Chlamydomonas reinhardtii. Gene 119, 247–251.

Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., Higgins, D.G., 1997. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876–4882.

Turmel, M., Otis, C., Lemieux, C., 2002a. The chloroplast and mitochondrial genome sequences of the charophyte Chaetosphaeridium globosum: insights into the timing of the events that restructured organelle DNAs within the green algal lineage that led to land plants. Proc. Natl. Acad. Sci. U. S. A. 99, 11275–11280.

Turmel, M., Otis, C., Lemieux, C., 2002b. The complete mitochondrial DNA sequence of Mesostigma viride identifies this green alga as the earliest green plant divergence and predicts a highly compact mitochondrial genome in the ancestor of all green plants. Mol. Biol. Evol. 19, 24–38.

Van den Hoek, C., Mann, D.G., Jahns, H.M., 1995. Algae: An Introduction to Phycology. Cambridge University Press, Cambridge.

Viklund, H., Elofsson, A., 2008. OCTOPUS: improving topology prediction by two-track ANN-based preference scores and an extended topological grammar. Bioinformatics 24, 1662–1668.

Wolff, G., Plante, I., Lang, B.F., Kück, U., Burger, G., 1994. Complete sequence of the mitochondrial DNA of the chlorophyte alga Prototheca wickerhamii. Gene content and genome organization. J. Mol. Biol. 237, 75–86.

Wyman, S.K., Jansen, R.K., Boore, J.L., 2004. Automatic annotation of organellar genomes with DOGMA. Bioinformatics 20, 3252–3255.
