A simple and effective set of PCR-based molecular markers for the monitoring of the Saccharomyces cerevisiae cell population during bioethanol fermentation
Journal of Biotechnology xxx (2013) xxx– xxx
Osmar V. Carvalho-Netto a,*, Marcelo F. Carazzolle a, Aline Rodrigues a, Welbe O. Bragança a, Gustavo G.L. Costa a, Juan Lucas Argueso b, Gonçalo A.G. Pereira a
a Departamento de Genética, Evolução e Bioagentes, Instituto de Biologia, Universidade Estadual de Campinas, Campinas-SP, Brazil
b Department of Environmental and Radiological Health Sciences, Colorado State University, Fort Collins, CO, USA

Article info
Article history: Received 26 May 2013; Received in revised form 19 August 2013; Accepted 21 August 2013; Available online xxx
Keywords: Saccharomyces cerevisiae; PCR; Genotyping; Bioethanol

Abstract
One of the defining features of the fermentation process used in the production of bioethanol from sugarcane feedstock is the dynamic nature of the yeast population. Minisatellite molecular markers are particularly useful for monitoring yeast communities because they produce polymorphic PCR products that typically display wide size variations. We compared the coding sequences derived from the genome of the sugarcane bioethanol strain JAY270/PE-2 to those of the reference Saccharomyces cerevisiae laboratory strain S288c, and searched for genes containing insertion or deletion polymorphisms larger than 24 bp. We then designed oligonucleotide primers flanking nine of these sites, and used them to amplify differentially sized PCR products. We analyzed the banding patterns in the most widely adopted sugarcane bioethanol strains and in several indigenous yeast contaminants, and found that our marker set had very good discriminatory power. Subsequently, these markers were used to successfully monitor the yeast cell populations in six sugarcane bioethanol distilleries. Additionally, we showed that most of the markers described here are also polymorphic among strains unrelated to bioethanol production, suggesting that they may be applied universally in S. cerevisiae. Because the relatively large polymorphisms are detectable in conventional agarose gels, our method is well suited to modestly equipped on-site laboratories at bioethanol distilleries, therefore providing both cost and time savings.
* Corresponding author at: Departamento de Genética, Evolução e Bioagentes, Instituto de Biologia, Universidade Estadual de Campinas, CP 6109, Campinas-SP 13083-970, Brazil. Tel.: +55 19 3521 6237.
Please cite this article in press as: Carvalho-Netto, O.V., et al., A simple and effective set of PCR-based molecular markers for the monitoring of the Saccharomyces cerevisiae cell population during bioethanol fermentation. J. Biotechnol. (2013), http://dx.doi.org/10.1016/j.jbiotec.2013.08.025

1. Introduction
In Brazil and other tropical countries that grow sugarcane, ethanol biofuel (bioethanol) is produced through fermentation of sucrose by the yeast Saccharomyces cerevisiae, primarily using a fed-batch process (Wheals et al., 1999). This high moisture/high sugar feedstock brings into the industrial plant a substantial microbial load of wild yeasts and bacteria (Amorim et al., 2011). A heat treatment is applied to reduce the number of viable contaminants, but inevitably, some find their way into the fermentation tanks where they have the opportunity to flourish. After the fermentation of each batch of sugarcane juice is completed, the cell population is recycled to ferment the subsequent batches, and this practice is repeated continuously for the duration of the harvest season (8–9 months). The effects of the steady influx of microbial contaminants are exacerbated by the cell recycling regimen widely adopted by the industry to achieve a high rate of feedstock processing: i.e., a dense yeast inoculum (∼10% cells by volume) leads to faster fermentations (∼10 h). Consequently, the microbial community in most distilleries is heterogeneous and highly dynamic. The starter strains inoculated in the beginning of the season (habitually baker's yeast due to their wide market availability and low cost) are rapidly out-competed and replaced by more robust indigenous strains that can endure the stressful conditions encountered during the fermentation (da Silva et al., 2005).
Over the past fifteen years, indigenous strains that combine excellent fermentative characteristics and high adaptation to the industrial environment have been selected from Brazilian bioethanol distilleries and are used today as starter strains. Among those, the PE-2 and CAT-1 strains are the most widely adopted (Basso et al., 2008). However, PE-2 and CAT-1 are exceptions among their indigenous counterparts, the majority of which display undesirable fermentation characteristics and often cause significant productivity losses if they are allowed to establish a foothold in a distillery (Basso et al., 2008; Wheals et al., 1999). Therefore, effective methods are needed to monitor the microbial population and identify the presence of invaders in a timely fashion.
A wide array of methodologies has been previously described for the molecular identification of S. cerevisiae strains, including ribosomal DNA or ITS sequencing (Kurtzman and Robnett, 1998), RAPD (Bovo et al., 2009; Gomes et al., 2000), mDNA RFLP (Querol et al., 1992), PCR-RFLP (Schuller et al., 2004, 2007), real-time PCR (Hierro et al., 2006; Martorell et al., 2005), microsatellite and inter-delta sequence amplification (Legras and Karst, 2003; Legras et al., 2005; Schuller et al., 2007; Vaudano and Garcia-Moruno, 2008), minisatellite intragenic tandem repeats (Marinangeli et al., 2004; Verstrepen et al., 2005), and karyotyping by pulsed-field gel electrophoresis (PFGE) (Basso et al., 2008; Schuller et al., 2004, 2007). All of these methods can reveal polymorphisms between industrial yeast strains and the indigenous contaminants. However, a limitation of several of these approaches is that they sometimes require relatively sophisticated equipment and expertise that are rarely available at bioethanol distilleries. For example, the analysis of microsatellite markers occasionally requires high-resolution capillary gel electrophoresis to resolve small polymorphisms; methods that involve restriction fragmentation or RAPD often generate complex patterns with multiple bands that can be difficult to analyze; and karyotyping requires a dedicated pulsed-field apparatus. In part due to their technical inability to carry out some of these procedures, most bioethanol distilleries opt to outsource the molecular monitoring of the yeast population, even though they often have on-site quality control laboratories that are reasonably capable.
The standard monitoring assay used in Brazil since the mid-1990s has been karyotyping by PFGE, and microsatellite markers have increased in use in recent years. While these have proven useful, they are normally conducted at third party laboratories, and several days may elapse between sample collection, shipment, colony isolation, analysis, and the final read-out. By that point in time, the composition of the yeast population in the distillery may already have changed significantly.
One of the most straightforward methods cited above is the detection of relatively large genetic polymorphisms at minisatellite loci, whose analysis does not require specialized electrophoresis equipment or the expertise to operate it. In addition, the banding patterns tend to be simpler (one or two bands per sample), while still providing substantial discriminatory power between strains (Marinangeli et al., 2004; Verstrepen et al., 2005). As stated above, simplicity is a key factor in the dissemination of new analytical technologies at industrial laboratories, particularly in remote sugarcane production regions. Therefore, this class of molecular markers represents a viable alternative to PFGE and microsatellites because it would allow the yeast population monitoring to be carried out on-site, and has the added advantage of a shorter time interval between sampling and the final result.
We have previously described the genome sequence of a haploid derivative (JAY291) from the JAY270/PE-2 bioethanol strain (Argueso et al., 2009). We took advantage of our sequence data to design PCR markers at loci that show relatively large size polymorphisms in bioethanol strains, primarily at minisatellite loci. We carried out a genome-wide search for length polymorphic regions within genes, by comparing the JAY291 haplotype sequence to those of S288c and other sequenced S. cerevisiae strains. This approach allowed us to develop markers that can positively identify JAY270/PE-2 using a small number of loci. We found that these loci are also polymorphic in other major bioethanol strains, as well as in uncharacterized indigenous yeast contaminants. The markers presented here should be immediately useful to the large number of distilleries that use strain JAY270/PE-2, as well as other S. cerevisiae strains.
2. Materials and methods
2.1. Strains and culture
The S. cerevisiae strains used in this study were predominantly derived from Brazilian bioethanol distilleries or cachaça stills (listed in Supplementary Table 1; Argueso et al., 2009; Basso et al., 2008; Campos et al., 2010; Gomes et al., 2000; Kane and Roth, 1974; Mortimer et al., 1994; Novo et al., 2009; Wei et al., 2007; Winston et al., 1995). The main industrial strains used in sugarcane bioethanol production are of indigenous origin, and were all isolated as high productivity invaders. In addition, fermentation samples were collected from six different Brazilian bioethanol distilleries during the 2011 sugarcane harvest season to monitor the yeast population. Samples collected at bioethanol production plants were streaked onto YPD solid medium [yeast extract 1% (w/v), peptone 2% (w/v), glucose 2% (w/v) and agar 2% (w/v)] for single colony isolation, and incubated at 30 °C for 2 days. Colonies were then cultured overnight in 4 mL YPD liquid medium at 30 °C for genomic DNA extraction.
2.2. Identification and selection of length polymorphic regions within genes
In order to identify polymorphic regions within genes, the coding regions from the S. cerevisiae reference genome (strain S288c) were aligned to the assembled genome sequence of JAY291 using Exonerate (Slater and Birney, 2005), applying stringent parameters to produce accurate sequence alignments (-best 1 -model est2genome -percent 20 -forcegtag TRUE -quality 80 -refine region). A custom Perl script was created to parse the Exonerate output and identify the insertion and deletion polymorphisms it contained. Only genes containing runs of more than 24 consecutive gaps, with gap lengths divisible by three (unaltered reading frames), were selected, resulting in an initial list of 27 candidate genes with usable length polymorphic regions. The sequences of these genes were then analyzed in the genomes of the unrelated S. cerevisiae strains YJM789 (Wei et al., 2007), AWRI1631 (Borneman et al., 2008), and RM11-1a (S. cerevisiae RM11-1a Sequencing Project; http://www.broad.mit.edu/) through nucleotide similarity searches in GenBank using BLASTn. Of the 27 initial sequences analyzed, the nine genes (Table 1) that displayed the highest degree of length variability between strains were selected as the most suitable for use as PCR marker loci.
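The gap-selection rule described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' Perl script: it assumes that the alignment for each gene is available as a pair of equal-length gapped strings, and the function names and toy sequences are hypothetical.

```python
import re

MIN_GAP = 24  # a gap run must be longer than this many positions


def gap_runs(aligned_seq):
    """Return the lengths of all runs of '-' in one gapped sequence."""
    return [len(m.group()) for m in re.finditer(r"-+", aligned_seq)]


def qualifying_indels(ref_aln, query_aln, min_gap=MIN_GAP):
    """Gap runs (in either sequence) longer than min_gap and divisible by three."""
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned sequences must have equal length")
    runs = gap_runs(ref_aln) + gap_runs(query_aln)
    return [n for n in runs if n > min_gap and n % 3 == 0]


# Toy alignment (invented): the query lacks 27 bp in frame and, separately, 4 bp.
ref = "ATG" + "GCT" * 9 + "AAACCC" + "GGGT" + "TAA"
qry = "ATG" + "-" * 27 + "AAACCC" + "----" + "TAA"
print(qualifying_indels(ref, qry))  # [27]
```

Only the 27 bp gap passes: it exceeds the 24 bp cutoff and preserves the reading frame, whereas the 4 bp gap fails both conditions.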
It is important to note that genes and other sequences containing short internal repeats often coincided with the ends of the contigs obtained during the JAY291 genome assembly (Argueso et al., 2009), as is typical for assemblies derived from short next generation sequencing reads. Only fully assembled segments of the JAY291 genome were used in the search for length polymorphisms; therefore, it is likely that additional polymorphic regions exist but were not identified in the present study (see Section 3).
2.3. Amplification of length polymorphic regions
Primer3 software (Rozen and Skaletsky, 2000) was used to design optimal oligonucleotide primers to amplify the selected polymorphic regions. In seven of the nine loci the primers were within the coding sequence. The exceptions were YKL201C, in which the reverse primer was downstream of the coding sequence, and YDR504C, with both primers located outside the coding region. The primer sequences used to amplify the target regions are shown in Table 1.
Yeast genomic DNA was extracted using a phenol-chloroform protocol (Ausubel et al., 1998). PCR was carried out in 10 µL volume reactions containing 20 ng of genomic DNA, 0.25 mM dNTP (each), 1× PCR buffer, 3.5 mM MgCl2, 0.05 U/µL of Taq DNA polymerase, and 1.5 pmol/µL of each primer. The amplification program consisted of 5 min denaturation at 94 °C, followed by 40 cycles of 45 s at 94 °C, 40 s at the primer-specific annealing temperature (Table 1), and 45 s at 72 °C. A 5 min final extension was performed at 72 °C. PCR amplicons were resolved by conventional electrophoresis on 2% agarose gels in 1× TAE buffer (40 mM Tris–acetate, 2 mM Na2EDTA·2H2O, pH 8.5). Gels were stained with ethidium bromide, visualized under UV light, and photographed. The size and number of the detected bands were scored relative to a molecular weight ladder. We have also successfully amplified the marker PCR products from crude yeast DNA template preparations and using shorter cycling programs (i.e. colony PCR; data not shown); therefore, simpler and faster custom DNA preparation and PCR protocols can be employed.

2.4. In silico PCR analysis
The S288c sequences of the nine selected loci, in their respective regions delimited by the primer pairs listed in Table 1, were used as reference for in silico PCR analysis. The comparison of sequence lengths at each locus in multiple S. cerevisiae strains was conducted by BLASTn searches using the complete genomic sequences deposited at the Sanger Institute, SGD and NCBI databases. Experimental validation of the in silico analysis was conducted using the PCR protocol described above.

2.5. Evaluation of discriminatory power
The discriminatory power of each PCR marker locus was measured by the Polymorphic Index Content (PIC) parameter, using the algorithm $\mathrm{PIC} = 1 - \sum_i p_i^2$, where $p_i$ is the frequency of the $i$th allele detected (Smith et al., 1997). A marker with PIC = 0 is monomorphic, while a marker with PIC approaching 1 is very highly discriminative.

2.6. Re-sequencing of specific loci in JAY291
As part of this work, we re-sequenced the YKL163W and YKL201C loci from haploid strain JAY291 because they were found to be mis-assembled in the original JAY291 genome report. The new correct sequences were used to edit the existing JAY291 contigs deposited in GenBank under accession numbers ACFL01000239.2 and ACFL01000006.2, respectively. The YFL024C, YKL201C, and YLL021W loci from the sibling haploid JAY292 were deposited under GenBank accession numbers JX101677, JX101678, and JX101679, respectively. All of these sequences were obtained by conventional Sanger sequencing of PCR products generated with the primers in Table 1.

Table 1 footnotes:
a Re-sequenced PCR product; the predicted PCR product size for YKL163W according to the first JAY291 genomic assembly was 337 bp.
b Re-sequenced PCR product; the predicted PCR product size for YKL201C according to the first JAY291 genomic assembly was 582 bp.
c Three polymorphic loci also identified in the Verstrepen et al. (2005) gene set [17].
d Absolute difference in size between the S288c and JAY291 PCR products.

3. Results and discussion
3.1. Identification of length polymorphic regions within genes
The goal of this study was to develop simple, yet effective, PCR markers for the molecular identification of Brazilian bioethanol yeast strains. We compared coding sequences from the genome of the standard laboratory strain S288c (Goffeau et al., 1996) to those assembled from the genome sequencing of strain JAY291, a haploid derivative of JAY270/PE-2 (Argueso et al., 2009). We then refined the search for length polymorphic loci by examining candidate regions in three other unrelated sequenced strains (YJM789, AWRI1631, and RM11-1a; Section 2). Nine genes were identified that contained sufficiently large continuous length polymorphic regions that could be discriminated by conventional agarose gel electrophoresis, therefore simplifying the readout of the analysis. Interestingly, the list of loci identified through our targeted S288c vs. JAY291 comparison approach (Table 1) had relatively little overlap with similar loci described earlier (Fig. 1). Three of the genes identified in our study (YKL163W, YKL201C, and YLL021W) had been found previously in a systematic survey of the S288c reference genome for genes containing tandem repeats within their coding sequences (Verstrepen et al., 2005), and none were found in common with another study that focused on potentially polymorphic cell wall genes (Marinangeli et al., 2004). Conversely, our approach identified six genes that were not found in these earlier studies.

To better understand the reasons behind these differences we systematically repeated the sequence alignments between S288c and JAY291 for all 44 genes described in Marinangeli et al. (2004) and in Verstrepen et al. (2005). Sixteen of the 44 genes were either not fully assembled or did not have a strong similarity hit in JAY291. Therefore, only 28 of the 44 genes were discoverable using our approach. Seventeen of the 28 discoverable genes either had no size polymorphism, or had polymorphisms that were too small (at or below the 24 bp size cutoff). Of the 11 remaining genes, 3 were successfully identified by our approach (above), but 8 were not. The reasons why some specific genes were missed by our approach are not entirely clear, but they include in a few cases a lack of size polymorphism relative to YJM789, RM11-1a, and AWRI1631 (Section 2). We compiled a list with all 50 genes from the three studies as a source of potentially useful PCR markers for future S. cerevisiae genetic diversity studies (Supplementary Table 2). We focused the rest of our study on the nine genes listed in Table 1.

Fig. 1. Relationships between the polymorphic loci described previously and those identified in this study. Venn diagram showing the number of loci with internal size polymorphisms found in this study (blue circle) and those previously described by Marinangeli et al. (2004) (yellow) and Verstrepen et al. (2005) (green). The three genes identified in both this study and Verstrepen et al. (2005) (white) are YKL163W (PIR3), YKL201C (MNN4), and YLL021W (SPA2). The area of the circles is roughly proportional to the number of genes. The full list of genes is presented in Supplementary Table 2.
Diagram labels: Marinangeli et al. 2004, 4 loci; Verstrepen et al. 2005, 44 loci; this study, 9 loci; region counts 4, 37, 3, and 6.

Fig. 2. Length polymorphisms within coding sequences. Alignments of four of the genomic DNA sequences used as PCR marker loci in this study. Panels A through D show the alignments between the predicted PCR products (from Forward to Reverse primers) from the reference strain S288c and from JAY270/PE-2 sequences. Polymorphisms are shown shaded in gray, and insertion/deletion gaps are shown as dashes. For the loci where JAY270/PE-2 is homozygous (B) only the JAY291 haploid sequence is shown; and for the loci where this strain is heterozygous (A, C, D), the sequences of the two alleles are shown separately.

In most cases, the polymorphic regions identified in our study contained low complexity sequences, often short repetitive minisatellites (Fig. 2 and Supplementary Figure 1). Others, such as YLL021W, had internal repeat units that were larger in size (81 bp). In all cases, the length polymorphic regions were present as in-frame insertions or deletions in the S288c and JAY291 coding sequences, and none of the selected genes contained introns. Three of the genes (YDR299W, YFL024C, and YPR143W) are essential for viability in S288c according to the S. cerevisiae systematic deletion studies. The nine genes participate in assorted cellular pathways, including, but not limited to, membrane or cell wall proteins. The primary goal of this study was to use these sequences as molecular strain identification markers; therefore, we did not investigate any possible association between locus function and the presence of length polymorphisms.
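The locus accounting in this section (the 44 previously described genes and the counts in Fig. 1) can be checked with simple arithmetic. The sketch below only restates numbers given in the text; the variable names are illustrative, and the assignment of the Fig. 1 counts 4, 37, 3 and 6 to specific regions of the diagram is an inference from the text rather than something stated explicitly.

```python
# Numbers taken from the text of Section 3.1.
previously_described = 44    # genes from Marinangeli et al. (2004) and Verstrepen et al. (2005)
not_discoverable = 16        # not fully assembled, or no strong similarity hit in JAY291
no_usable_polymorphism = 17  # no size difference, or at/below the 24 bp cutoff
found_here, missed_here = 3, 8

discoverable = previously_described - not_discoverable
remaining = discoverable - no_usable_polymorphism

# Assumed reading of the Fig. 1 region counts (an interpretation, see above).
shared_by_earlier_studies = 4
verstrepen_only = 37
shared_with_this_study = 3
this_study_only = 6

print(discoverable)                                   # 28
print(remaining)                                      # 11
print(found_here + missed_here == remaining)          # True
print(shared_by_earlier_studies + verstrepen_only
      + shared_with_this_study)                       # 44
print(shared_with_this_study + this_study_only)       # 9
print(previously_described + this_study_only)         # 50
```

Under that reading, the 50 genes compiled in Supplementary Table 2 are the 44 earlier genes plus the six loci found only in this study.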
Once the candidate length polymorphisms were identified, we designed oligonucleotide primer pairs flanking those sites to generate PCR products (Table 1). We used these primers to amplify genomic DNA from FY86 (an S288c isogenic strain; Winston et al., 1995) and from JAY291, and confirmed the expected product size polymorphisms in seven of the nine genes (Fig. 3A). For two of the genes, YKL163W and YKL201C, the resulting JAY291 PCR products had different sizes from those predicted from the published genomic sequence. However, in both cases the observed bands were still polymorphic relative to S288c. The likely cause for this discrepancy was local error in the de novo assembly of the JAY291 genome (Argueso et al., 2009). Repetitive and low complexity regions are particularly challenging to assemble accurately from the short sequencing reads used to determine the JAY291 genomic sequence (mostly 35 bp and 50 bp Illumina GAII reads, with some 200 bp Roche 454). Therefore, we re-sequenced the JAY291 PCR products spanning the YKL163W and YKL201C polymorphic regions using long reads from conventional Sanger sequencing. The Sanger-determined sequences of both genes matched the sizes of the PCR products (Figs. 2 and 3A).

Fig. 3. PCR amplification of polymorphic loci. Panel A shows agarose gels with PCR products amplified from the genomes of the reference strain FY86/S288c and of the JAY291 haploid. Panel B shows the PCR products obtained from the JAY270/PE-2 heterozygous diploid, and from three other widely used bioethanol yeast strains. M indicates the molecular weight ladder, and the arrow indicates the 500 bp marker band. The only exception is for the YLL021W gel, for which the 700 bp band is indicated.
Gel panels: YLL021W, YFL024C, YPR143W, YKL163W, YKL201C, YBL046W, YDR299W, YDR504C, YDR505C. Lanes in panel A: M, FY86/S288c, JAY291. Lanes in panel B: M, JAY270/PE-2, BG-1, CAT-1, SA-1.

3.2. PCR identification of industrial bioethanol strains
Following the identification and validation of the intragenic length polymorphisms between S288c and JAY291, we used the same primer sets to look for polymorphisms between our strain of interest JAY270/PE-2, and strains BG-1, CAT-1 and SA-1 (Fig. 3B). This group represents the four most widely adopted strains in the Brazilian bioethanol industry (Basso et al., 2008). This analysis revealed polymorphisms at seven of the nine loci assayed, confirming that our set of markers was highly informative for the identification of bioethanol strains.

Fig. 4. Analysis of heterozygous length polymorphisms in JAY270/PE-2. Agarose gels showing the PCR products amplified from the genome of the JAY270/PE-2 heterozygous diploid, and from four haploid spores JAY289, JAY290, JAY291, and JAY292. These four spores are derived from a single tetrad of JAY270. The top panel shows polymorphic markers for three heterozygous loci, with 2:2 Mendelian segregation among the haploids. The bottom panel shows the banding patterns for three homozygous loci. M indicates the molecular weight ladder, and the arrow indicates the 500 bp marker band. The only exception is for the YLL021W gel, for which the 700 bp band is indicated.
Gel panels: heterozygous loci YFL024C, YKL201C, YLL021W; homozygous loci YKL163W, YDR299W, YDR505C. Lanes: M, JAY270/PE-2, JAY289, JAY290, JAY291, JAY292.
Table 2 footnotes:
a Discriminatory power of each marker locus was estimated by the Polymorphic Index Content (PIC) parameter (Section 2).
b The numbers identify the specific PCR allele detected in each strain. The allele observed in the FY86/S288c reference genome was arbitrarily assigned as allele "1". The allele observed in JAY291 was arbitrarily assigned as allele "2".

Table 3. PCR analysis of yeast populations in six bioethanol distilleries during the 2011 season.

               Alcídia                        Morro Vermelho              Eldorado
Strain         Inoc. (tons)a  1st b  2nd c    Inoc. (tons)  1st   2nd     Inoc. (tons)  1st
PE-2           0.35           22     18       0             0     0       0.1           12
CAT-1          0.35           0      0        0             0     0       0             0
BG-1           0              0      0        0             0     0       0             0
Angest         0              0      0        4             0     0       1.5           0
FF             0              0      0        0             0     0       0             0
Indigenous d   0              13     2        0             15    15      0             3
Total          0.7            35     20       4             15    15      1.6           15

This analysis only included the molecular markers for loci YFL024C, YKL163W, YKL201C and […] two YFL024C, YKL201C or YLL021W alleles present in JAY270/PE-2 were classified as deriv[…]
a Tons of dry active yeast powder used as starting inoculum (March/April).
b Number of colonies identified at the first sample collection after the beginning of the […]
c Number of colonies identified at the second sample collection (September/October).
d Colonies whose genotype did not match any of the other industrial strains in the […]

In several cases, more than one PCR product was amplified from genomic DNA of the industrial strains. We previously showed that JAY270/PE-2 is a heterozygous diploid, suggesting that this strain, as well as the other industrial strains tested, may be heterozygous at the polymorphic loci. To test this hypothesis, we prepared genomic DNA from JAY270/PE-2 and from all four haploid spore products of a JAY270/PE-2 meiotic tetrad (Fig. 4). We used this material as template in PCR reactions to amplify three candidate homozygous loci (YDR299W, YDR505C, and YKL163W) and three candidate heterozygous loci (YFL024C, YKL201C, and YLL021W). The four haploid spores in this tetrad included JAY291 (the one with a sequenced genome), and its three sibling spores JAY289, JAY290, and JAY292. For the homozygous markers, we observed a single PCR band of the same size in all samples. In contrast, for the YFL024C and YLL021W candidate heterozygous loci we observed the amplification of two bands in the JAY270/PE-2 diploid, and a single band in each of the haploid spores. PCR at the YKL201C locus produced three bands in JAY270/PE-2: the two shorter bands corresponded to the alleles observed in the haploids, and the longer third band was a specific and reproducible chimeric product (characterization and discussion presented in Supplementary Figure 2). The polymorphic PCR products from all three heterozygous loci displayed the expected 2:2 Mendelian segregation among the spores. Finally, we sequenced the second (unknown) alleles from YFL024C, YLL021W, and YKL201C, and compared them to the first alleles already known from the JAY291 genomic sequence (Fig. 2). In the case of YFL024C, the allele found in JAY290 and JAY292 was 385 bp long, and was the same as in S288c. For YLL021W, we observed that the allele present in JAY290 and JAY292 had two units of an 81 bp internal repeat, while the JAY289 and JAY291 allele has three units, and S288c has only one unit. Finally, in the case of YKL201C, the allele found in JAY290 and JAY292 was 583 bp long, and differed from both the […]31 bp allele in JAY291 and JAY289, and also from the 699 bp allele from S288c. This analysis showed that the heterozygosity observed in JAY270/PE-2 and in other industrial yeast strains is a helpful feature that further enhances the discriminatory power of our PCR markers.
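The 2:2 segregation test described above can be expressed as a small check: a locus that is heterozygous in the diploid should yield exactly two alleles among the four spores of a tetrad, each present in two spores. A minimal sketch follows; the function name is hypothetical, and the genotype calls are the YLL021W repeat-unit counts reported in the text (two units in JAY290 and JAY292, three in JAY289 and JAY291).

```python
from collections import Counter


def segregates_2_to_2(spore_alleles):
    """True if a four-spore tetrad carries exactly two alleles, two spores each."""
    if len(spore_alleles) != 4:
        raise ValueError("a tetrad has exactly four spores")
    counts = Counter(spore_alleles.values())
    return sorted(counts.values()) == [2, 2]


# YLL021W alleles, expressed as the number of 81 bp repeat units per spore.
yll021w = {"JAY289": 3, "JAY290": 2, "JAY291": 3, "JAY292": 2}
print(segregates_2_to_2(yll021w))  # True

# A homozygous locus gives the same allele in all four spores.
# The allele label "A" is a placeholder, not a measured band size.
homozygous = dict.fromkeys(["JAY289", "JAY290", "JAY291", "JAY292"], "A")
print(segregates_2_to_2(homozygous))  # False
```

Because each additional repeat unit adds 81 bp, the two- and three-unit YLL021W alleles differ from each other and from the one-unit S288c allele by amounts that are easy to resolve on a conventional agarose gel.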
Fig. 5. PCR amplification of markers in S. cerevisiae strains unrelated to sugarcane fermentation. The panels show the PCR products amplified from the indicated loci resolved in agarose gels. The yeast strains are listed above each lane. M indicates the molecular weight ladder, and the arrow indicates the 700 bp marker band. The letters within the panels indicate for each lane if the size of the observed products was either consistent (Y) with the size predicted from the deposited genome sequence of each […]

To further assess the usefulness of the PCR markers we analyzed a larger panel of yeasts associated with sugarcane fermentation. These included all the strains described above, as well as a cachaça still strain (CA-11), three minor bioethanol strains (IZ1904, Y904, M304-2C), eleven indigenous contaminant yeast strains isolated from Brazilian bioethanol and cachaça fermentations, and a widely used baker's starter strain (FF). The PCR products were visualized in conventional agarose gel electrophoresis (Supplementary Figure 3), and the product sizes for each strain were examined to identify their specific allelic length variants. The summary of this analysis is shown in Table 2. We observed several new PCR alleles that were amplified in the different strain/loci combinations. In two cases, YBL046W and YDR505C, the only size polymorphism observed was the one originally mapped between S288c and JAY291. All other loci produced between three and eleven PCR products of discernible size. Four of the loci, YFL024C, YKL163W, YKL201C, and YLL021W, were particularly polymorphic, and we found that using only those loci provided sufficient discriminatory power to uniquely identify every one of the 21 strains in Table 2. For the remainder of our study, we focused our PCR genotyping analyses on those four core markers.
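The two ideas used here, the per-locus PIC defined in Section 2 and the check that a subset of loci separates every strain, can be sketched as follows. The allele calls below are invented for illustration (they are not the Table 2 data), the strain and locus names are placeholders, and the function names are hypothetical. For simplicity each strain contributes a single allele call per locus, which ignores heterozygous strains.

```python
from collections import Counter


def pic(alleles):
    """PIC = 1 - sum(p_i^2), with p_i the frequency of the i-th allele."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def uniquely_identifies(profiles, loci):
    """True if no two strains share the same allele combination at the given loci."""
    keys = [tuple(calls[locus] for locus in loci) for calls in profiles.values()]
    return len(set(keys)) == len(keys)


# Hypothetical allele calls (allele numbers) for four made-up strains.
profiles = {
    "strain_A": {"locus_1": 1, "locus_2": 1, "locus_3": 1},
    "strain_B": {"locus_1": 1, "locus_2": 2, "locus_3": 1},
    "strain_C": {"locus_1": 2, "locus_2": 2, "locus_3": 1},
    "strain_D": {"locus_1": 2, "locus_2": 3, "locus_3": 1},
}

for locus in ["locus_1", "locus_2", "locus_3"]:
    values = [calls[locus] for calls in profiles.values()]
    print(locus, round(pic(values), 3))
# locus_1 0.5
# locus_2 0.625
# locus_3 0.0

print(uniquely_identifies(profiles, ["locus_1"]))             # False
print(uniquely_identifies(profiles, ["locus_1", "locus_2"]))  # True
```

In this toy panel the monomorphic locus has PIC = 0 and adds nothing, while combining the two polymorphic loci is enough to give every strain a distinct profile, which mirrors the reasoning behind choosing a small set of core markers.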
3.3. Monitoring of yeast populations in industrial fermentations
After completing the design and testing of our PCR markers, we evaluated their practical effectiveness by analyzing the yeast populations in samples collected from six Brazilian distilleries during the 2011 sugarcane harvest season. Each of these distilleries began their fermentations in March or April using known specific mixes of commercial dry active yeast powder as initial inoculum (Table 3). As is typical in Brazil, the starting inoculum was often composed of a large proportion of baker's yeast strains (e.g. FF, Angest), supplemented with smaller amounts of bioethanol-specific strains such as PE-2, CAT-1 and BG-1. During the first few weeks of cell recycling the baker's strains gradually disappear, while the bioethanol-specific strains and/or indigenous contaminants progressively take over the population. This dynamic has been demonstrated previously and it is a common feature of the Brazilian bioethanol production process (Basso et al., 2008; da Silva et al., 2005).
We collected cell samples from each distillery roughly at the end of the first and second trimesters of the harvest season: June/July and September/October, respectively. We prepared genomic DNA from about twenty yeast colonies isolated from each sample, and used them in PCR. The amplification patterns from these colonies were then compared to those of the strains known to have been originally present in the inoculum. It is important to note that the banding patterns in the industrial strain colonies re-isolated from the industry were identical to the patterns previously observed in the laboratory stocks, suggesting that the PCR markers were genetically stable through prolonged cell propagation and fermentation.
The banding patterns for the colonies isolated from the distilleries were analyzed to generate a snapshot of the yeast population at each time (Table 3). Even though the colony sample sizes were small, this initial field test indicated that the markers are quite effective. Using only the four core markers, we were able to quickly classify the colonies as being either derived from one of the initial starter strains, or indigenous when their banding pattern did not match any of those present in the inoculum.

As expected, the starter baker's yeast strains were mostly outcompeted by the time of the first sampling, and were not detected at all in the second. The bioethanol strains had varying degrees of success in establishing themselves as the dominant members of the yeast population. For example, PE-2 was very effective in colonizing the fermentation environment at the Alcídia and Eldorado distilleries, but was replaced by better-adapted indigenous contaminants at Alto Taquari. Our PCR data showed that the indigenous yeasts were represented by multiple strains, ranging from two to eleven unique PCR genotypes depending on the distillery. Interestingly, at the Alto Taquari distillery a specific indigenous strain was detected at a high frequency in the yeast population (∼60%), yet the overall bioethanol yield in the distillery remained satisfactory (data not shown). This suggested that this strain is well adapted to the specific environmental conditions at this distillery, and therefore could be isolated for use as a starter strain in subsequent years.
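The colony classification described above amounts to matching each colony's banding pattern against the patterns of the strains in the inoculum and labelling any non-matching colony as indigenous. A minimal sketch follows, assuming that band sizes have already been scored into discrete values so that exact matching is meaningful. The band sizes and the colony set are invented for illustration and are not taken from Table 3, and the function names are hypothetical.

```python
from collections import Counter

# Order in which sizes are listed in each pattern below.
CORE_LOCI = ("YFL024C", "YKL163W", "YKL201C", "YLL021W")

# Placeholder banding patterns (bp) for the strains in a starter inoculum.
# Values are invented for illustration and are NOT taken from Table 3.
STARTERS = {
    "PE-2": (500, 700, 900, 650),
    "CAT-1": (500, 760, 900, 650),
    "FF": (560, 700, 840, 620),
}


def classify_colony(pattern, starters):
    """Return the starter strain whose pattern matches, else 'indigenous'."""
    for strain, reference in starters.items():
        if tuple(pattern) == reference:
            return strain
    return "indigenous"


def population_snapshot(colony_patterns, starters):
    """Fraction of colonies assigned to each class in one (non-empty) sample."""
    counts = Counter(classify_colony(p, starters) for p in colony_patterns)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Five hypothetical colonies from a single sampling time point.
colonies = [
    (500, 700, 900, 650),
    (500, 700, 900, 650),
    (530, 700, 900, 650),
    (530, 700, 900, 650),
    (530, 700, 900, 650),
]
snapshot = population_snapshot(colonies, STARTERS)
print(snapshot)
```

Note that the "indigenous" class lumps together every non-starter pattern; counting the distinct patterns within that class would give the number of unique indigenous genotypes per distillery.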
3.4. PCR length polymorphisms in non-sugarcane strains
Our analysis of a panel of sugarcane-associated strains (Table 2) revealed a substantial degree of polymorphism for our PCR
strain, or not (N).
markers. In addition, our analysis of yeast samples collected at distilleries identified further allelic variants among indigenous contaminant strains (Table 3; and data not shown). Together these observations suggested that these loci are highly variable in structure, and that their usefulness as molecular markers may not be limited just to sugarcane strains. To assess the universal applicability of these markers, we took advantage of the several fully sequenced S. cerevisiae genomes available in public databases. We created BLASTn alignments using S288c as the reference sequence, and compared the sequence lengths at each locus in 59 strains from different origins (Borneman et al., 2008; Liti et al., 2009; Novo et al., 2009; Wei et al., 2007). This analysis, summarized in
Supplementary Table 3, allowed us to uniquely differentiate all but two of the strains tested. These two strains (UWOPS03.461.4 and UWOPS05.227.2) were isolated from the same geographic region, and are very similar at the genomic level (Liti et al., 2009).
For most loci, the sequence alignments revealed a higher number of length variant alleles than we were able to observe in the sugarcane strain set (Table 2). This number ranged from only two alleles in YBL046W, up to seventeen alleles in YKL201C. We found cases in which the size difference between the predicted PCR products for specific alleles was quite large and easily detectable in conventional agarose electrophoresis gels (e.g. 240 bp, 192 bp, 64 bp), but also observed very small size differences (<15 bp) that would require running the PCR products on higher resolution gels.
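Whether two alleles can be told apart on a routine gel depends on their size difference. The sketch below flags allele pairs that fall under a minimum difference. Using the <15 bp figure from the text as a hard cut-off is an assumption made here for illustration, the allele sizes are placeholders, and the function name is hypothetical.

```python
from itertools import combinations

# Assumed cut-off (bp): differences below this are treated as needing
# a higher-resolution gel. The text mentions "<15 bp" only as an example.
MIN_AGAROSE_DIFF_BP = 15


def pairs_needing_high_resolution(allele_sizes, min_diff=MIN_AGAROSE_DIFF_BP):
    """Return pairs of distinct allele sizes that differ by less than min_diff."""
    unique_sizes = sorted(set(allele_sizes))
    return [
        (a, b)
        for a, b in combinations(unique_sizes, 2)
        if b - a < min_diff
    ]


# Placeholder predicted product sizes (bp) for one locus.
sizes = [612, 624, 804, 852]
close_pairs = pairs_needing_high_resolution(sizes)
print(close_pairs)
```

In practice the usable cut-off depends on gel concentration, run length and product size, so the threshold is best treated as a parameter to be set empirically.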
While the alignments suggested that loci such as YKL201C are highly polymorphic (17 alleles), the interpretation of the results should be taken with caution since the determination of the genomic sequences at these low complexity regions is susceptible to errors associated with the mis-assembly of sequencing reads. We therefore compared the sizes of the PCR products predicted from the deposited genome sequences of S288c, JAY291, and five additional strains, to the actual products observed after amplification at the four core loci (Fig. 5). The correlation between the predicted and observed PCR product sizes was quite poor. S288c, which arguably has the most accurate S. cerevisiae genome assembly, was the only strain that matched the size prediction for all four loci tested. The other six strains combined had only 12 size matches in 24 alleles tested. This result underscored the difficulty of accurately assembling low complexity regions, even in the case of genomes such as YJM789 and RM11-1a that were sequenced primarily with conventional Sanger reads.
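The agreement figures quoted above can be tallied as follows. The counts come from the text (S288c matched at all four loci; the other six strains matched 12 of 24 alleles, consistent with six strains tested at four loci); the variable names are illustrative.

```python
# Counts stated in the text for the four core loci (Fig. 5).
s288c_matches, s288c_tested = 4, 4
other_matches, other_tested = 12, 24  # six strains x four loci

# Fraction of predicted sizes confirmed by PCR.
other_rate = other_matches / other_tested
overall_rate = (s288c_matches + other_matches) / (s288c_tested + other_tested)

print(f"non-reference strains: {other_rate:.0%}")
print(f"all seven strains:     {overall_rate:.0%}")
```

Half of the predictions for the non-reference strains were therefore confirmed, and 16 of 28 (about 57%) when S288c is included.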
Despite the limitations caused by genome mis-assembly, taken together, the in silico PCR analysis and the validation experiment confirmed that the markers are indeed polymorphic in strains outside the sugarcane set, even if the predicted size did not frequently correspond to the observed PCR product. The results also showed that in a reasonable number of cases the size difference between polymorphic PCR products was sufficiently large to be resolved in conventional agarose electrophoresis. Therefore the PCR markers developed in this study should be universally applicable to other fermentation processes that require a simple and fast method for monitoring the yeast population.
4. Conclusions
Previously described PCR-based methods for the identification of yeast strains relied primarily on DNA sequences predicted to be variable in the genome of the laboratory strain S288c. In contrast, the strategy we adopted in this study was to directly identify relatively large length polymorphisms in our industrial strain of interest, and then design PCR markers at those regions. This approach allowed us to create a simple and effective system for positively identifying yeast colonies derived from JAY270/PE-2 in the heterogeneous yeast populations found in bioethanol distilleries. We showed that this marker set was also informative for other strains associated with sugarcane production, and possibly in strains from other origins. The PCR markers described here represent an attractive alternative for bioethanol distilleries, which may elect to carry out this simple and fast procedure in their on-site quality control laboratories, rather than relying on a third party for the analysis of the yeast population.
Competing interests
The authors report no conflicts of interest.
Authors' contributions

All authors contributed to the data analysis, and read and approved the final manuscript. O.V.C.N. designed the concept, performed experiments and wrote the manuscript; M.F.C. and G.G.L.C. performed bioinformatics analysis to identify polymorphic regions; A.R. and W.O.B. performed PCR genotyping analysis; J.L.A. designed the concept, performed experiments and wrote the manuscript; G.A.G.P. wrote the manuscript.
Endnotes
The use of the marker loci described in this work for commercial/for profit yeast genotyping purposes is subject to intellectual property protection through a pending patent application submitted to the Brazilian National Institute of Intellectual Property (Instituto Nacional de Propriedade Intelectual - INPI); patent application number 018100029693, 2010.
Acknowledgements
We would like to thank Odebrecht Agroindustrial and Brazilian funding agencies FAPESP and CNPq for partial funding support to this project and Graça Stupiello, who generously provided yeast strains. Bioenergy research in the Argueso laboratory was supported in part by a seed grant from the Colorado State University Clean Energy Supercluster.
Appendix A. Supplementary data
Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.jbiotec.2013.08.025.
References
Amorim, H.V., Lopes, M.L., de Castro Oliveira, J.V., Buckeridge, M.S., Goldman, G.H., 2011. Scientific challenges of bioethanol production in Brazil. Appl. Microbiol. Biotechnol. 91, 1267–1275.
Ausubel, F.M., Brent, R., Kingston, R.E., Moore, D.D., Seidman, J.G., Smith, A.J., Struhl, K., 1998. Current Protocols in Molecular Biology. John Wiley & Sons, New York.
Basso, L.C., de Amorim, H.V., de Oliveira, A.J., Lopes, M.L., 2008. Yeast selection for fuel ethanol production in Brazil. FEMS Yeast Res. 8, 1155–1163.
Bovo, B., Andrighetto, C., Carlot, M., Corich, V., Lombardi, A., Giacomini, A., 2009. Yeast population dynamics during pilot-scale storage of grape marcs for the production of Grappa, a traditional Italian alcoholic beverage. Int. J. Food Microbiol. 129, 221–228.
Campos, C.R., Silva, C.F., Dias, D.R., Basso, L.C., Amorim, H.V., Schwan, R.F., 2010. Features of Saccharomyces cerevisiae as a culture starter for the production of the distilled sugar cane beverage cachaca in Brazil. J. Appl. Microbiol. 108, 1871–1879.
da Silva, E.A., dos Santos, S.K.B., Resende, A.D.M., de Morais, J.O.F., de Morais, M.A., Simoes, D.A., 2005. Yeast population dynamics of industrial fuel-ethanol fermentation process assessed by PCR-fingerprinting. Anton Leeuw Int. J. G 88, 13–23.
Goffeau, A., Barrell, B.G., Bussey, H., Davis, R.W., Dujon, B., Feldmann, H., Galibert, F., Hoheisel, J.D., Jacq, C., Johnston, M., Louis, E.J., Mewes, H.W., Murakami, Y., Philippsen, P., Tettelin, H., Oliver, S.G., 1996. Life with 6000 genes. Science 274, 546, 547–563.
Methods for yeast characterization from industrial products. Food Microbiol. 17, 217–223.
Hierro, N., Esteve-Zarzoso, B., Gonzalez, A., Mas, A., Guillamon, J.M., 2006. Real-time quantitative PCR (QPCR) and reverse transcription-QPCR for detection and enumeration of total yeasts in wine. Appl. Environ. Microb. 72, 7148–7155.
Kane, S.M., Roth, R., 1974. Carbohydrate metabolism during ascospore development in yeast. J. Bacteriol. 118, 8–14.
Kurtzman, C.P., Robnett, C.J., 1998. Identification and phylogeny of ascomycetous yeasts from analysis of nuclear large subunit (26S) ribosomal DNA partial sequences. Anton Leeuw Int. J. G 73, 331–371.
Legras, J.L., Karst, F., 2003. Optimization of interdelta analysis for Saccharomyces cerevisiae strain characterization. FEMS Microbiol. Lett. 221, 249–255.
Legras, J.L., Ruh, O., Merdinoglu, D., Karst, F., 2005. Selection of hypervariable microsatellite loci for the characterization of Saccharomyces cerevisiae strains. Int. J. Food Microbiol. 102, 73–83.
Liti, G., Carter, D.M., Moses, A.M., Warringer, J., Parts, L., James, S.A., Davey, R.P., Roberts, I.N., Burt, A., Koufopanou, V., Tsai, I.J., Bergman, C.M., Bensasson, D., O'Kelly, M.J.T., van Oudenaarden, A., Barton, D.B.H., Bailes, E., Ba, A.N.N., Jones, M., Quail, M.A., Goodhead, I., Sims, S., Smith, F., Blomberg, A., Durbin, R., Louis, E.J., 2009. Population genomics of domestic and wild yeasts. Nature 458, 337–341.
Marinangeli, P., Angelozzi, D., Ciani, M., Clementi, F., Mannazzu, I., 2004. Minisatellites in Saccharomyces cerevisiae genes encoding cell wall proteins: a new way towards wine strain characterisation. FEMS Yeast Res. 4, 427–435.
Martorell, P., Querol, A., Fernandez-Espinar, M.T., 2005. Rapid identification and enumeration of Saccharomyces cerevisiae cells in wine by real-time PCR. Appl. Environ. Microb. 71, 6823–6830.
Mortimer, R.K., Romano, P., Suzzi, G., Polsinelli, M., 1994. Genome renewal: a new phenomenon revealed from a genetic study of 43 strains of Saccharomyces cerevisiae derived from natural fermentation of grape musts. Yeast 10, 1543–1552.
fer events revealed by the genome sequence of the wine yeast Saccharomyces cerevisiae EC1118. Proc. Natl. Acad. Sci. U.S.A. 106, 16333–16338.
Querol, A., Barrio, E., Huerta, T., Ramon, D., 1992. Molecular monitoring of wine fermentations conducted by active dry yeast strains. Appl. Environ. Microb. 58, 2948–2953.
Rozen, S., Skaletsky, H.J., 2000. Primer3 on the WWW for general users and for biologist programmers. In: Krawetz, S., Misener, S. (Eds.), Bioinformatics Methods and Protocols: Methods in Molecular Biology. Humana Press, Totowa, NJ, pp. 365–386.
Schuller, D., Pereira, L., Alves, H., Cambon, B., Dequin, S., Casal, M., 2007. Genetic characterization of commercial Saccharomyces cerevisiae isolates recovered from vineyard environments. Yeast 24, 625–636.
Schuller, D., Valero, E., Dequin, S., Casal, M., 2004. Survey of molecular methods for the typing of wine yeast strains. FEMS Microbiol. Lett. 231, 19–26.
Slater, G.S., Birney, E., 2005. Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics 6.
Smith, J.S.C., Chin, E.C.L., Shu, H., Smith, O.S., Wall, S.J., Senior, M.L., Mitchell, S.E., Kresovich, S., Ziegle, J., 1997. An evaluation of the utility of SSR loci as molecular markers in maize (Zea mays L): comparisons with data from RFLPS and pedigree. Theor. Appl. Genet. 95, 163–173.
Vaudano, E., Garcia-Moruno, E., 2008. Discrimination of Saccharomyces cerevisiae wine strains using microsatellite multiplex PCR and band pattern analysis. Food Microbiol. 25, 56–64.
Verstrepen, K.J., Jansen, A., Lewitter, F., Fink, G.R., 2005. Intragenic tandem repeats generate functional variability. Nat. Genet. 37, 986–990.
Wei, W., McCusker, J.H., Hyman, R.W., Jones, T., Ning, Y., Cao, Z., Gu, Z., Bruno, D., Miranda, M., Nguyen, M., Wilhelmy, J., Komp, C., Tamse, R., Wang, X., Jia, P., Luedi, P., Oefner, P.J., David, L., Dietrich, F.S., Li, Y., Davis, R.W., Steinmetz, L.M., 2007. Genome sequencing and comparative analysis of Saccharomyces cerevisiae strain YJM789. Proc. Natl. Acad. Sci. U.S.A. 104, 12825–12830.
years. Trends Biotechnol. 17, 482–487.
Winston, F., Dollard, C., Ricupero-Hovasse, S.L., 1995. Construction of a set of convenient Saccharomyces cerevisiae strains that are isogenic to S288c. Yeast 11, 53–55.