RESEARCH ARTICLE
Rapid identification of candidate genes for
resistance to tomato late blight disease using
next-generation sequencing technologies
Ramadan A. Arafa1¤a, Mohamed T. Rakha2¤b, Nour Elden K. Soliman3☯, Olfat M. Moussa3☯,
Said M. Kamel1, Kenta Shirasawa4*
1 Plant Pathology Research Institute, Agricultural Research Center, Giza, Egypt, 2 Department of
Horticulture, Faculty of Agriculture, University of Kafrelsheikh, Kafr El-Sheikh, Egypt, 3 Department of Plant
Pathology, Faculty of Agriculture, Cairo University, Giza, Egypt, 4 Department of Frontier Science, Kazusa
DNA Research Institute, Chiba, Japan
☯ These authors contributed equally to this work.
¤a Current address: Department of Frontier Science, Kazusa DNA Research Institute, Chiba, Japan
¤b Current address: World Vegetable Center, Shanhua, Tainan, Taiwan
Abstract
Tomato late blight caused by Phytophthora infestans (Mont.) de Bary, also known as the
Irish famine pathogen, is one of the most destructive plant diseases. Wild relatives of tomato
possess useful resistance genes against this disease, and could therefore be used in breed-
ing to improve cultivated varieties. In the genome of a wild relative of tomato, Solanum hab-
rochaites accession LA1777, we identified a new quantitative trait locus for resistance
against blight caused by an aggressive Egyptian isolate of P. infestans. Using double-digest
restriction site–associated DNA sequencing (ddRAD-Seq) technology, we determined
6,514 genome-wide SNP genotypes of an F2 population derived from an interspecific cross.
Subsequent association analysis of genotypes and phenotypes of the mapping population
revealed that a 6.8 Mb genome region on chromosome 6 was a candidate locus for disease
resistance. Whole-genome resequencing analysis revealed that 298 genes in this region
potentially had functional differences between the parental lines. Among them, two genes
with missense mutations, Solyc06g071810.1 and Solyc06g083640.3, were considered to
be potential candidates for disease resistance. SNP and SSR markers linked to this region
can be used in marker-assisted selection in future breeding programs for late blight disease,
including introgression of new genetic loci from wild species. In addition, the approach
developed in this study provides a model for identification of other genes for attractive agro-
nomical traits.
Introduction
Plants suffer from many biotic and abiotic stresses [1], which reduce the quantity and quality of
crop production worldwide. Late blight disease is caused by the hemibiotrophic oomycete Phytophthora infestans (Mont.) de Bary, one of the most destructive plant pathogens. Phytophthora
PLOS ONE | https://doi.org/10.1371/journal.pone.0189951 December 18, 2017 1 / 15
Citation: Arafa RA, Rakha MT, Soliman NEK,
Moussa OM, Kamel SM, Shirasawa K (2017) Rapid
identification of candidate genes for resistance to
infestans is well known as the causative agent of the Great Famine in Ireland between 1845 and
1852, which devastated production of potato (Solanum tuberosum) [2]. After potato, tomato (S.
lycopersicum L.) is the second most agriculturally important crop in the Solanaceae family.
Annual global tomato production has increased dramatically, reaching 170 million tons in 2014
[3]. However, tomato can also be damaged by late blight disease, particularly under cool
temperatures, high relative humidity (RH), and rainy or foggy conditions [4], which can result in
economic losses of up to 100% in open fields and greenhouses.
Tomato has been used in molecular genetic and genomic studies as a model for fruiting
plants [5] because of its compact genome (~950 Mb) and the simple diploid genome composi-
tion of family Solanaceae. The genome sequence of tomato [6] has enabled discovery of
genome-wide single-nucleotide polymorphisms (SNPs) and development of advanced molec-
ular markers [7–10]. Although the genetic diversity of the cultivated tomato is limited [11], its
wild relatives S. pennellii, S. habrochaites, S. peruvianum, and S. pimpinellifolium have many
useful traits potentially applicable to improvement of the agricultural varieties. Therefore,
introduction of wild tomato species into tomato breeding programs could facilitate development
of new tomato lines [12–15]. Indeed, five race-specific resistance (R) genes, Ph-1, Ph-2,
Ph-3, Ph-4, and Ph-5, which confer various levels of resistance against P. infestans isolates, have
been identified [16–22] and applied to molecular breeding by marker-assisted selection (MAS)
[20]. However, a serious problem in breeding by interspecific crossing is linkage drag, in
which undesirable traits linked to target traits in the wild relatives are introgressed into elite
cultivars [23, 24].
In the genomics era, advanced molecular markers and genotyping technologies have helped
to solve this problem [25, 26]. Simple sequence repeat (SSR) markers are useful for genomics
and breeding in tomato [27–29]; however, analysis of large numbers of genome-wide SSR
markers across multiple samples, such as breeding materials, is time-consuming and laborious.
By contrast, next-generation sequencing (NGS) technologies, including high-throughput
sequencing and sophisticated bioinformatics techniques, can overcome these limitations.
Restriction site–associated DNA sequencing (RAD-Seq) [30–32] and an alternative technique,
double-digest RAD-Seq (ddRAD-Seq) [33], can skim through the genome at low cost and with
high throughput. These methods have been successfully applied to gene mapping, including
quantitative trait locus (QTL) analysis and genome-wide association studies (GWAS), in a vast
array of crops [32, 34–38]. On the other hand, whole-genome resequencing (WGRS) enables
prediction of the effects of sequence variants on gene function throughout the genome [39–
43]. Therefore, a combination of RAD-Seq and WGRS analysis represents a powerful strategy
for rapidly identifying candidate genes responsible for traits of interest.
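To make the idea of marker-based gene mapping concrete, the following minimal Python sketch scans markers one at a time and computes a one-way ANOVA F statistic for the phenotype grouped by genotype class. This is only an illustration of single-marker association under stated assumptions: the statistical procedure actually used by the authors is not described in this section, and the marker names, genotype calls (A, H, B), and phenotype scores below are hypothetical toy data. The sketch also assumes every marker has at least two genotype classes and some within-class variation; otherwise the F ratio is undefined.

```python
from statistics import mean

def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of phenotype scores."""
    groups = [g for g in groups if g]  # ignore empty genotype classes
    all_values = [v for g in groups for v in g]
    grand = mean(all_values)
    # Variation explained by genotype class vs. variation left within classes.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

def scan(genotypes, phenotypes):
    """Return {marker: F}, grouping phenotype scores by genotype call per marker."""
    result = {}
    for marker, calls in genotypes.items():
        by_class = {}
        for call, score in zip(calls, phenotypes):
            by_class.setdefault(call, []).append(score)
        result[marker] = anova_f(list(by_class.values()))
    return result

# Hypothetical toy data: nine F2 plants scored on a 0-6 disease severity scale.
# A = homozygous wild allele, B = homozygous cultivated allele, H = heterozygous.
phenotypes = [1, 2, 1, 3, 4, 3, 6, 5, 6]
genotypes = {
    "marker_linked":   ["A", "A", "A", "H", "H", "H", "B", "B", "B"],
    "marker_unlinked": ["A", "H", "B", "A", "H", "B", "A", "H", "B"],
}

f_stats = scan(genotypes, phenotypes)
best = max(f_stats, key=f_stats.get)
print(best, f_stats)
```

In this toy example the marker whose genotype classes separate the phenotype scores receives a much larger F value than the marker whose classes are mixed, which is the signal a genome-wide scan looks for.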
Development of new tomato lines with resistance to late blight disease would be a straightforward,
effective, and environmentally safe approach to managing the disease. Therefore,
in this study, we aimed to identify the map positions of genetic loci derived from a wild
tomato relative, S. habrochaites, that control resistance to late blight disease caused by P. infestans. We applied a ddRAD-Seq pipeline that we developed in a previous study [33] to genetic
mapping of the resistance loci, and then we used a WGRS strategy to predict candidate genes
for late blight disease resistance.
Materials and methods
Plant materials
A cultivated tomato (S. lycopersicum), Castlerock, and its wild relative, S. habrochaites (LA1777), were used in this study. Castlerock was chosen because it is susceptible to late blight
disease, and LA1777 was selected because it is resistant to the Egyptian P. infestans population,
NGS-based rapid identification of resistance genes to tomato late blight
Table 1. Genotyping of F2 mapping population with five EST-SSR markers.
SSR marker  Chromosome  Position (bp)¹  Scale²  Allele                        Amplified samples  Total tested samples
                                                LA1777   Castlerock  Hete.
TES0422     SL3.0ch06   44975890        0       17       0           13       317                344
                                        1       12       0           11
                                        2       20       0           18
                                        3       14       0           33
                                        4       26       2           47
                                        5       7        1           22
                                        6       20       8           46
                                        Mean³   3.0431b  5.5455a     3.7895c
TES0014     SL3.0ch06   45297826        0       23       0           12       343                344
                                        1       15       0           11
                                        2       27       1           17
                                        3       19       0           33
                                        4       31       0           48
                                        5       7        1           23
                                        6       22       10          43
                                        Mean³   2.8958b  5.5833a     3.7914c
TES1344     SL3.0ch06   45438555        0       23       0           12       340                344
                                        1       14       1           11
                                        2       27       1           17
                                        3       18       1           32
                                        4       30       0           46
                                        5       7        1           23
                                        6       24       9           43
                                        Mean³   2.9441b  5.0000a     3.7935c
TES0945     SL3.0ch06   47342901        0       25       1           7        328                344
                                        1       15       0           11
                                        2       26       1           16
                                        3       17       2           32
                                        4       34       0           41
                                        5       7        0           22
                                        6       23       8           40
                                        Mean³   2.9048b  4.6667a     3.8639a
TES0213     SL3.0ch06   49713763        0       24       2           9        343                344
                                        1       17       0           9
                                        2       23       1           21
                                        3       20       1           31
                                        4       35       0           44
                                        5       10       2           19
                                        6       23       8           44
                                        Mean³   2.9671b  4.5000a     3.8362a
¹ Position based on the tomato reference genome, version SL3.0.
² Disease severity rating (DSR) used to assess the late blight phenotype of tomato plants.
³ Means followed by the same letter are not significantly different at P < 0.05 (LSD test).
The letters "a", "b", and "c" following the means are codes; means with different letters differ significantly.
https://doi.org/10.1371/journal.pone.0189951.t001
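As a check on how the Mean row of Table 1 can be derived, the following minimal Python sketch recomputes the mean DSR of each genotype class for marker TES0422 as a frequency-weighted average of the scale values 0 to 6. The counts are copied from the table; the function name mean_dsr and the dictionary layout are illustrative assumptions and are not taken from the authors' analysis.

```python
# Plant counts per DSR scale value (index 0..6) for marker TES0422, from Table 1.
counts = {
    "LA1777":     [17, 12, 20, 14, 26, 7, 20],
    "Castlerock": [0, 0, 0, 0, 2, 1, 8],
    "Hete.":      [13, 11, 18, 33, 47, 22, 46],
}

def mean_dsr(per_scale_counts):
    """Frequency-weighted mean of the DSR scale (list index = scale value)."""
    total = sum(per_scale_counts)
    weighted = sum(score * n for score, n in enumerate(per_scale_counts))
    return weighted / total

# Mean DSR per genotype class, rounded to the four decimals used in the table.
means = {genotype: round(mean_dsr(c), 4) for genotype, c in counts.items()}

# Number of plants with a genotype call at this marker.
amplified = sum(sum(c) for c in counts.values())

print(means)      # {'LA1777': 3.0431, 'Castlerock': 5.5455, 'Hete.': 3.7895}
print(amplified)  # 317
```

The recomputed means agree with the Mean row for TES0422, and the counts sum to the 317 amplified samples reported for this marker.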
markers in the candidate regions. Three types of plant material could potentially be used for
validation: 1) an additional biparental population derived from the same cross used in the
genetic analysis (as in this study); 2) near-isogenic lines (NILs) carrying target loci of the donor
(e.g., a wild relative) in the genetic background of the recurrent line (e.g., a cultivated line); and
3) a group of genetically divergent lines, such as natural populations or core collections that
maintain the genetic diversity of gene pools. Among them, NILs would be the most useful materials
for investigating the effects of the candidate locus on the phenotypes, and for identifying the genes
controlling the phenotypes by a map-based cloning strategy. However, developing NILs would
require considerable time and labor because it involves recurrent backcrossing with marker-assisted
selection. At the Tomato Genetics Resource Center, University of California, Davis, a series of NILs
covering the entire genome of LA1777 in the background of S. lycopersicum E6203 has been
registered [59]; unfortunately, NILs for chromosome 6 were not available at the time of
writing. On the other hand, although a group of genetically divergent lines could be useful for
validation, no lines resistant to P. infestans EG_12 have been identified other than S. habrochaites LA1777 [15]. Therefore, this approach might not be suitable for this
study.
It should be possible to breed new varieties with high disease resistance by combining the
new locus with previously reported genes [19, 20]. Such a ‘gene pyramid’ strategy resulting in
durable resistance could contribute to successful management of new populations of P. infestans, which not only overcome well-known R genes but are also resistant to certified fungicides, e.g.,
metalaxyl [60, 61]. Because we have characterized many P. infestans isolates [15, 62], as well as
tomato wild relatives highly resistant to these isolates [15], further novel resistance loci could
be identified from these materials using an approach similar to the one employed in this study.
The genotyping analysis was completed in a short time by taking advantage of two NGS
technologies, ddRAD-Seq and WGRS. In the former type of analysis, the number of detectable
SNPs depends on genetic diversity (i.e., the so-called genetic distance) of the materials [32, 63,
64]. In this study, because the parental lines were genetically divergent, the number of obtained
SNPs was 6,514. This result is consistent with a previous report in which 8,784 SNPs were
obtained from an interspecific cross [65]. In crosses within a species or between closely
related species, the number of SNPs obtained by ddRAD-Seq
might be small [66], but WGRS has the potential to overcome this issue [43]. Therefore, lab work
is no longer a limiting factor in the discovery of new genetic loci.
ddRAD-Seq analysis and WGRS are powerful tools for gene mapping. Previously, it was
common to employ SSR and SNP markers for such analysis [28, 29, 67]. However, because
these methods are time-consuming and laborious, it used to be difficult to analyze multiple
populations at once. Furthermore, even if genetic loci could be narrowed down to small geno-
mic regions, subsequent sequencing of the target regions was necessary for identification of
candidate genes of interest. By contrast, ddRAD-Seq analysis can be performed in parallel
across multiple mapping populations. In addition, WGRS is the most effective and easiest
method for identifying sequence variations in candidate regions. In this study, the alignment
rate of the sequence reads to the reference sequence was lower in LA1777 than in Castlerock, likely
because LA1777 is a wild species belonging to the Eriopersicon subsection, which is distantly
related to cultivated lines such as Castlerock and Heinz 1706 [10].
The distribution of SNPs over the genome was highly biased, with higher density
at the distal ends of chromosomes and lower density in pericentromeric regions. This observa-
tion was consistent with some previous studies [6, 29, 66] but discordant with another [10].
On the other hand, the density of SNPs identified by the WGRS in this study (512.8 SNPs per
100 kb) was higher than that in a previous study using only cultivated lines (11.9–98.9 SNPs
per 100 kb) [33], confirming that wild tomato relatives are genetically distant from cultivated