www.proteomics-journal.com Page 1 Proteomics
Received: 25-Sep-2014; Revised: 09-Nov-2014; Accepted: 17-Dec-2014
This article has been accepted for publication and undergone full peer review but has not been through the copyediting,
typesetting, pagination and proofreading process, which may lead to differences between this version and the Version of
Record. Please cite this article as doi: 10.1002/pmic.201400453.
This article is protected by copyright. All rights reserved.
Title
Comparative investigation of seed coats of brown- versus yellow-colored soybean seeds
using an integrated proteomics and metabolomics approach
Authors
Ravi Gupta1, Chul Woo Min1, So Wun Kim1, Yiming Wang2, Ganesh Kumar Agrawal3,4,
Randeep Rakwal3,4,5,6, Sang Gon Kim7, Byong Won Lee8, Jong Min Ko8, In Yeol Back8,
Dong Won Bae9, and Sun Tae Kim1,*
1Department of Plant Bioscience, College of Natural Resources and Life Sciences, Pusan National
University, Miryang, 627-707, Republic of Korea
2Department of Plant Microbe Interaction, Max Planck Institute for Plant Breeding Research, Cologne,
Germany
3Research Laboratory for Biotechnology and Biochemistry (RLABB), GPO Box 13265, Kathmandu,
Nepal
4GRADE Academy Private Limited, Adarsh Nagar-13, Main Road, Birgunj, Nepal
5Organization for Educational Initiatives, University of Tsukuba, 1-1-1 Tennoudai, Tsukuba, Ibaraki
305-8577, Japan
6Department of Anatomy I, Showa University School of Medicine, 1-5-8 Hatanodai, Shinagawa,
Tokyo 142-8555, Japan
7Plant Molecular Biology and Biotechnology Research Center, Gyeongsang National University, Jinju,
South Korea
8Department of Functional Crops, NICS, RDA, Miryang 627-803, South Korea
9Central Laboratory, Gyeongsang National University, Jinju, South Korea.
Running title: Soybean seed coat proteome and metabolome
Correspondence: Dr. Sun Tae Kim, Department of Plant Bioscience, Pusan National
University, Miryang, 627-706, South Korea.
E-mail: [email protected]
Tel.: +82-55-350-5505
Fax: +82-55-350-5509
SUMMARY
Seed coat color is an important attribute determining consumption of soybean seeds. Soybean
cultivar Mallikong (M) has a yellow seed coat, while its naturally mutated cultivar, Mallikong
mutant (MM), has a brown seed coat. We used an integrated proteomics and
metabolomics approach to investigate the differences between seed coats of M and MM
at three stages of seed development (4, 5 and 6 weeks after flowering). 2DE profiling
of total seed coat proteins from the three stages showed 178 differentially expressed spots
between M and MM, of which 172 were identified by MALDI-TOF/TOF. Of these, 62 were
up-regulated and 105 were down-regulated in MM compared with M, while 5 spots were
detected only in MM. Proteins involved in primary metabolism were down-regulated in
MM, suggesting that energy in MM might be diverted to proanthocyanidin biosynthesis via
secondary metabolic pathways, leading to the development of the brown seed coat color.
In addition, down-regulation of two isoforms of isoflavone reductase indicated reduced
isoflavone levels in the seed coat of MM, which was confirmed by quantitative estimation of total and
individual isoflavones using HPLC. We propose that the low isoflavone level in MM may leave
more substrate available for proanthocyanidin production, which results in the development of the brown
seed coat in MM.
Keywords:
Isoflavone / Mallikong mutant / Phenylpropanoid pathway / Proanthocyanidin/ Seed
storage proteins / Soybean seed coat
Abbreviations: IFR, isoflavone reductase; LEA, late embryogenesis abundant; M,
Mallikong; MM, Mallikong mutant; SSP, seed storage protein; SBP, sucrose binding
protein
Total Words: 6507
1. Introduction
The seed coat, or testa, is a protective outer covering of the ovule and consists of two
integuments, or outer layers of cells [1, 2]. In soybean, the seed coat matures after the
development of the endosperm and embryo [3]. Besides acting as a physical barrier, the seed coat
has several other roles, notably in the metabolic control of seed development and
dormancy [3], disease resistance [4,5] and the metabolism of nutrients from the parent plant [1,3].
The soybean seed coat accounts for 8–10% of total seed mass [6] and comprises cellulose (14–25%),
hemicellulose (14–20%), pectin (10–12%), protein (9–12%), uronic acid (7–11%), ash
(4–5%), lipid (4–5%), and lignin (3–4%) on a dry weight basis [7]. In addition,
soybean seed coats contain large amounts of secondary metabolites, including phenolic
acid derivatives (flavonoids, isoflavonoids, anthocyanidins and proanthocyanidins), alkaloids,
terpenoids and steroids [8].
Seed color in soybean is determined by the levels of anthocyanins and
proanthocyanidins in the seed coat [9]. Yellow soybean seeds
contain neither anthocyanins nor proanthocyanidins, while black and imperfect-black seeds
accumulate both anthocyanins and proanthocyanidins in their seed coats. In
brown and buff seed coats, however, only proanthocyanidins were detected, suggesting that
seed coat color in brown and buff soybean is determined by the level of
proanthocyanidins alone [9]. Many studies have characterized soybean
seed coats at the biochemical, genetic, and genomic levels because of the anti-microbial and
anti-fungal roles of their secondary metabolites [10,11]. In addition, numerous studies have
revealed that the beneficial health effects of colored soybean are due to its diverse
phytochemical content, including isoflavones, saponins, proanthocyanidins and anthocyanins
[12-15]. Previous analyses of soybean seed coat proteins led to the identification of
an array of enzymes including chitinase [16], peroxidase [17], invertase [3], subtilisin-like
serine protease [18], and a BURP domain containing protein [19], suggesting that the soybean
seed coat is metabolically active [20]. However, the overall proteome of the seed coat has
remained largely elusive.
To date, comparative proteomic analyses of total seed proteins involved in seed
development have been conducted to understand evolutionary relationships or metabolic
changes among soybean accessions [20,21,22]. For the soybean seed coat, only one study
has been conducted to date, in which a shotgun proteomic approach was used to identify total
seed coat proteins, resulting in the identification of 1372 proteins mainly involved
in primary and secondary metabolism, cellular structure, stress responses, nucleic acid
metabolism, protein synthesis, folding and targeting, hormones, signaling, and seed storage
proteins (SSPs) [20]. However, there is no report of a comparative quantitative proteome
analysis of seed coats in soybean cultivars differing in seed coat color. Such a study is of
prime interest for understanding the metabolic pathways that result in the development and
accumulation of different pigments in the seed coats.
In the present study, an integrative proteomics and metabolomics analysis was
carried out to identify the differentially expressed proteins and metabolites between
two contrasting yellow and brown soybean seed cultivars. For this purpose, we
selected the Mallikong (M) and Mallikong mutant (MM) cultivars, which have yellow and brown seed
coats, respectively. MM is a naturally derived mutant of M, and it is therefore presumed that
the two cultivars differ little in metabolic pathways
other than those responsible for the development of seed coat color.
2. Materials and Methods
2.1 Plant material and growth
M and MM cultivars were grown in the experimental field of the Department of Functional
Crops, National Institute of Crop Science, RDA, Miryang, South Korea (latitude 35°N) in
June, and seeds were harvested in October 2012 (average temperature 23.5±3.5°C,
average day length 12 h 17 min). Filling seeds of M and MM cultivars were harvested at 4, 5
and 6 weeks after flowering (WAF). Seed coats were dissected from the collected seeds and
stored at -70 °C until used.
2.2 Protein extraction and 2DE analysis
Total proteins from seed coats were isolated as described previously [23] and subjected to
2DE analysis according to previously published protocols [24,25]. Briefly, seed coat
proteins from the three growth stages of M and MM seeds were isolated using Tris-Mg-NP-40
buffer followed by TCA precipitation. Pellets obtained after precipitation were dissolved in
rehydration buffer containing 7 M urea, 2 M thiourea, 4% (v/v) CHAPS, 2 M DTT, and
0.5% (v/v) IPG buffer pH 4–7 (GE Healthcare, Waukesha, WI, USA). Protein concentrations
in each fraction were determined using the 2D-Quant kit (GE Healthcare). Proteins (600 µg)
were loaded on 24 cm IPG strips, pH 4–7, by passive rehydration loading overnight at
20 °C. Isoelectric focusing (IEF) was carried out using the following protocol: 50 V for 4 hr, 100 V for
1 hr, 500 V for 1 hr, 1000 V for 1 hr, 2000 V for 1 hr, 4000 V for 2 hr, 8000 V for 5 hr, 8000
V for 9 hr, and 50 V for 6 hr on an IPGphor II platform (GE Healthcare). After IEF, the strips
were first reduced in equilibration buffer [6 M urea, 30% (v/v) glycerol, 2% (w/v) SDS, 50 mM
Tris-HCl (pH 6.8), and 0.1 mg/ml bromophenol blue] containing 1% DTT and
then alkylated in the same buffer containing 2.5% iodoacetamide. The second dimension was
run on 13% SDS-polyacrylamide gels using an Ettan DALTtwelve system (GE Healthcare), after
which the gels were stained with colloidal Coomassie Brilliant Blue (CBB) [24,25]. Three
biological replicates were performed for each data set.
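As an illustrative aid, the focusing program listed above can be summed into total volt-hours and total run time. The sketch below assumes that each voltage is held constant for the stated duration (step-and-hold); the text does not say whether any step is a gradient, so the totals are indicative only. The names `IEF_STEPS`, `total_volt_hours` and `total_hours` are hypothetical helpers introduced for this example.

```python
# Each step is (voltage in V, duration in hr), copied from the IEF protocol in Section 2.2.
IEF_STEPS = [
    (50, 4), (100, 1), (500, 1), (1000, 1), (2000, 1),
    (4000, 2), (8000, 5), (8000, 9), (50, 6),
]


def total_volt_hours(steps):
    """Sum V x hr over all steps, assuming each voltage is held constant (assumption)."""
    return sum(volts * hours for volts, hours in steps)


def total_hours(steps):
    """Total duration of the focusing program in hours."""
    return sum(hours for _, hours in steps)


print(total_volt_hours(IEF_STEPS))  # 124100
print(total_hours(IEF_STEPS))       # 30
```

Under the step-and-hold assumption the program delivers about 124 kVh over 30 hr; gradient steps would give a lower figure.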
2.3 Image Acquisition and Data Analysis
Images of the colloidal CBB stained 2DE gels were acquired using a transmissive scanner
(PowerLook 1120, UMAX) with a 32 bit pixel depth, 300 dpi resolution, and brightness and
contrast set to default. For the analysis of the 2DE gels, raw TIFF image files were imported into
the ImageMaster 2D Platinum software (ver. 6.0, GE Healthcare), and spots were detected
in three biological replicates and averaged. For quantitative analysis, the volume of
each spot was normalized to the average volume of the spots on the gel, and the normalized spot
volumes were used to determine the relative abundance of proteins in the experimental
samples. Spot volumes were compared by one-way ANOVA, with p ≤ 0.05 considered
statistically significant, using MeV software (Supplementary
Table 1).
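The quantification steps described above can be sketched in a few lines. This is a minimal illustration, not the ImageMaster or MeV implementation: `normalize_by_mean` and `one_way_anova_f` are hypothetical helper names, the normalization is read here as division by the mean spot volume of the gel, and the replicate values are invented for the example. Only the F statistic is computed; a p-value would additionally require the F distribution (for instance via `scipy.stats.f_oneway`).

```python
from statistics import mean


def normalize_by_mean(volumes):
    """Divide each spot volume by the mean spot volume of its gel (assumed reading)."""
    avg = mean(volumes)
    return [v / avg for v in volumes]


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of replicate groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the replicates around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical normalized volumes of one spot in three replicates of M and MM.
spot_m = [1.0, 1.2, 1.1]
spot_mm = [2.0, 2.1, 1.9]
f_stat, df_b, df_w = one_way_anova_f([spot_m, spot_mm])
print(round(f_stat, 1), df_b, df_w)
```

With two groups, as in an M versus MM comparison at a single stage, the one-way ANOVA F statistic equals the square of the two-sample t statistic with pooled variance.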
2.5 MALDI-TOF/TOF MS identification of differential protein spots
Differential protein spots were excised from the 2DE gels and subjected to in-gel
digestion as described previously [24]. Briefly, in-gel reduction was carried out using 10 mM
DTT in 100 mM ammonium bicarbonate at 56 °C for 30 min. Reduced proteins were
alkylated using 50 mM iodoacetamide in 100 mM ammonium bicarbonate for 30 min in the dark.
Gel pieces were washed with a 1:1 solution of ammonium bicarbonate and acetonitrile (ACN) and
dehydrated using 100% ACN for 5 min. Gel pieces were digested with 5 µl of trypsin
solution (20 ng/µl, Gold Mass Spectroscopy Grade, Promega, Madison, USA) in 50 mM
ammonium bicarbonate, pH 7.8, for 16 hr at 37 °C. Tryptic peptides were extracted
twice with 0.1% trifluoroacetic acid (TFA). Each sample was mixed with an equal volume of
matrix (10 mg/ml α-cyano-4-hydroxycinnamic acid, 0.1% TFA, 50% ACN), loaded on a MALDI
target plate and allowed to dry at 25 °C. The prepared tryptic peptide samples were subjected
to MALDI-TOF/TOF MS analysis using an ABI 4800 Plus TOF-TOF Mass Spectrometer
(Applied Biosystems, Framingham, MA, USA) [26]. Spectra were calibrated with the peptide
calibration standard (Mass Standard Kit for the 4700 Proteomics Analyzer; calibration
Mixture 1), prepared in the same way. The ten most and least intense ions per MALDI spot
with signal/noise ratios >25 were selected for subsequent MS/MS analysis in 1 kV mode
using 800–1000 consecutive laser shots. MS/MS spectra were searched against the
UniProt/Swiss-Prot database (14926175 sequences; 5299740401 residues) and the soybean
peptide database obtained from the soybean genome database (Phytozome ver. 8.0,
http://www.phytozome.net/soybean) with ProteinPilot v.3.0 software (AB Sciex, Framingham,
MA, USA) using MASCOT as the search engine (ver. 2.3.0, Matrix Science, London, UK). The
search parameters used for protein identification were as follows: fixed modification,
carbamidomethylation of cysteines; variable modification, methionine oxidation; peptide and
fragment ion mass tolerances, 50 ppm; maximum trypsin missed cleavages, 1; and instrument
type, MALDI-TOF/TOF. Only significant hits, as identified by the MASCOT probability
analysis (p<0.05), were accepted.
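To make the 50 ppm mass tolerance concrete, the sketch below shows how a relative mass error in parts per million is commonly computed and compared with a tolerance. It is an illustration of the arithmetic only and does not reproduce MASCOT's internal matching; `ppm_error` and `within_tolerance` are hypothetical helper names, and the m/z values are invented.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=50.0):
    """True if the absolute ppm error does not exceed the tolerance (50 ppm by default)."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


# Invented example: a peptide ion with a theoretical m/z of 1000.0000.
print(within_tolerance(1000.04, 1000.0))  # True  (about 40 ppm)
print(within_tolerance(1000.06, 1000.0))  # False (about 60 ppm)
```

At m/z 1000, a 50 ppm window corresponds to ±0.05 m/z units; the absolute window widens in proportion to the mass.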
2.6 Data processing and statistical analysis
Functional annotation of the identified proteins was carried out against the Gene Ontology (GO)
database using the UFO web server (http://ufo.gobics.de/), and GO term enrichment analysis was
carried out using the agriGO database (http://bioinfo.cau.edu.cn/agriGO/). For hierarchical
clustering and principal component analysis (PCA), spot intensities of the differential
proteins were log2 transformed; clustering was performed using MultiExperiment Viewer
(MeV) software and PCA using XLSTAT software.
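The log2 transformation mentioned above can be sketched as follows. This is a minimal illustration with invented values, not the MeV or XLSTAT workflow; `log2_transform` and `log2_fold_change` are hypothetical helper names. On the log2 scale a doubling and a halving of abundance are symmetric (+1 and -1), which is the usual motivation for transforming before clustering.

```python
import math


def log2_transform(volumes):
    """Log2-transform a list of strictly positive spot volumes."""
    return [math.log2(v) for v in volumes]


def log2_fold_change(mm_volume, m_volume):
    """Log2 ratio of MM to M; positive values mean higher abundance in MM."""
    return math.log2(mm_volume / m_volume)


print(log2_transform([1, 2, 8]))   # [0.0, 1.0, 3.0]
print(log2_fold_change(4.0, 1.0))  # 2.0
print(log2_fold_change(0.5, 1.0))  # -1.0
```

A volume of zero cannot be log-transformed (`math.log2(0)` raises `ValueError`), so spots absent from one cultivar would need separate handling; how this was handled in the study is not stated.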
2.7 Total Isoflavone and individual isoflavone profiling
For quantitation of isoflavones in seed coats of M and MM, samples (1 g) were pulverized
and incubated with 20 ml of 50% methanol at room temperature on a rotary shaker at 200 rpm
for 24 hr. Extracts were filtered through a 0.45 µm syringe filter, and 20 µL was separated on an
HPLC system (Agilent 1100 series, Agilent Technologies, Inc., USA) equipped with a Lichrospher 100
RP 18e column (5 μm) at 30 °C. The mobile phase was 0.1% acetic acid and acetonitrile at
a flow rate of 1.0 mL min⁻¹. The isoflavones daidzin, glycitin, genistin, mal-glycitin,
mal-daidzin, mal-genistin, and daidzein were detected at 260 nm and quantified by
comparison with the retention times and peak areas of standards [27].
2.8 Total Proteins and amino acid analysis
Total protein content in M and MM seeds was analyzed using a protein analyzer (rapid N cube,
Germany). Samples were powdered using a high-speed vibrating sample mill (CMT T1-100,
Japan). The samples (50 mg) were wrapped in nitrogen-free paper and pressed into pellets with
the forming tool. Default parameters were used for the analysis. Glutamic acid (9.52% N) was
used as the test standard, and a protein factor of 6.25 was applied. All samples were analyzed in
duplicate [27].
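The two numbers quoted above can be checked with a short calculation: the nitrogen content of glutamic acid (C5H9NO4) follows from its elemental composition, and crude protein is obtained by multiplying measured nitrogen by the protein factor. The sketch uses standard atomic masses rounded to three decimals; `nitrogen_percent` and `crude_protein_percent` are hypothetical helper names, and the 5.776% N input is an illustrative value chosen so that the result is 36.1%.

```python
# Standard atomic masses (g/mol), rounded to three decimals.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}
GLUTAMIC_ACID = {"C": 5, "H": 9, "N": 1, "O": 4}  # C5H9NO4


def nitrogen_percent(formula):
    """Mass percentage of nitrogen in a compound given its elemental formula."""
    molar_mass = sum(ATOMIC_MASS[element] * count for element, count in formula.items())
    return 100.0 * ATOMIC_MASS["N"] * formula["N"] / molar_mass


def crude_protein_percent(n_percent, factor=6.25):
    """Convert measured nitrogen (%) to crude protein (%) with a protein factor."""
    return n_percent * factor


print(round(nitrogen_percent(GLUTAMIC_ACID), 2))      # 9.52
print(round(crude_protein_percent(5.776), 1))         # 36.1 (illustrative input)
```

The factor 6.25 corresponds to the conventional assumption that protein contains 16% nitrogen by mass.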
Amino acid profiling was carried out using an amino acid analyzer (Biochrom 30,
Biochrom Ltd., Cambridge, UK). Briefly, the samples were hydrolyzed with 6 N HCl in
sealed glass tubes filled with nitrogen at 110 °C for 24 hr. The HCl was removed from the
hydrolyzed sample on a rotary evaporator, and the residue was brought to 10 ml with 0.2 M sodium citrate buffer,
pH 2.2. Amino acids were determined on the Biochrom 30 amino acid analyzer using a
cation exchange resin column with ninhydrin as the color reagent [27].
2.9 Total oil and Fatty acid profiling
Total lipids were extracted and fatty acid methyl esters (FAMEs) were prepared by acid-catalyzed
transesterification of total lipid as described previously [28], in the following
steps: Soxhlet extraction, saponification, acid-catalyzed transesterification, and
finally extraction of FAMEs in hexane. FAMEs were subsequently analysed by capillary gas
chromatography (Agilent 7890A) on an HP-FFAP capillary column (25 m × 0.32 mm i.d. ×
0.5 μm). The percentage of each fatty acid was calculated from peak areas using
C16:0, C18:0, C18:1, C18:2, C20:0, C20:1 and C22:0 methyl ester standards [28].
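A fatty acid percentage derived from peak areas can be illustrated with simple area normalization. This sketch assumes that every FAME has the same detector response (no response-factor correction), which may differ from the procedure in [28]; `fatty_acid_percentages` is a hypothetical helper name, and the peak areas are invented.

```python
def fatty_acid_percentages(peak_areas):
    """Express each FAME peak area as a percentage of the summed peak area."""
    total = sum(peak_areas.values())
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


# Invented peak areas (arbitrary units) for four of the fatty acids named in the text.
areas = {"C16:0": 10, "C18:0": 5, "C18:1": 25, "C18:2": 60}
print(fatty_acid_percentages(areas))
# {'C16:0': 10.0, 'C18:0': 5.0, 'C18:1': 25.0, 'C18:2': 60.0}
```

If response factors from the standards were available, each area would be divided by its factor before normalization.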
3. Results and Discussion
3.1 Morphological characteristics of M and MM
Phenotypic analysis of M and MM plants showed similar morphological characteristics like
flower color, leaf shape, leaf color, pubescence color and pod color, except for the seed coat
and hilum colors (Supplementary Table 2). Seed coat color of M is light yellow while that
of MM is light brown (Fig. 1). Hilum color of M is yellow while MM has white colored
hilum (Supplementary Table 2).
3.2 Proteome profiling of M and MM seed coats confidently assigns 172 differential
protein spots
To identify differences between the seed coats of M and MM, total seed coat proteins
were isolated from three stages of seed development and resolved on high-resolution 2DE
gels. A total of 18 gels, comprising 3 biological replicates for each dataset, were run
(Supplementary Fig. 1). Analysis of the 2DE gels using ImageMaster 2D Platinum software
showed 867, 850 and 788 reproducible spots at 4, 5 and 6 WAF, respectively (Fig. 2). A total
of 178 protein spots showed differential abundance in MM compared with M, of which 68, 54,
and 56 spots were from 4, 5, and 6 WAF, respectively. MALDI-TOF/TOF MS analysis
successfully identified and confidently assigned 172 of the 178 protein spots. Of these
172 identified spots, the abundance of 62 (broken arrows) was increased and that of 105 (solid arrows) was
decreased in MM relative to M (Fig. 2). In addition, five protein spots
were present only in MM. These five proteins were identified as GroES-like zinc-binding
alcohol dehydrogenase family protein (spot 243), caffeoyl coenzyme A 3-O-methyltransferase 2
(spot 252), Xaa-Pro aminopeptidase 2 (spot 253), proteasome subunit
alpha type-6 (spot 254), and proteasome subunit alpha type (spot 255) (Supplementary
Table 3).
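The spot counts reported in this section can be cross-checked with a short tally. The numbers are taken from the text; `identification_rate` is a hypothetical helper name introduced for the example.

```python
# Spot counts as reported in Section 3.2.
differential_by_stage = {"4 WAF": 68, "5 WAF": 54, "6 WAF": 56}
identified_by_trend = {"increased in MM": 62, "decreased in MM": 105, "MM only": 5}


def identification_rate(identified, total):
    """Percentage of differential spots that were assigned an identity."""
    return 100.0 * identified / total


total_differential = sum(differential_by_stage.values())
total_identified = sum(identified_by_trend.values())
print(total_differential, total_identified)                                  # 178 172
print(round(identification_rate(total_identified, total_differential), 1))  # 96.6
```

The per-stage counts sum to the 178 differential spots and the three abundance categories sum to the 172 identified spots, an identification rate of about 96.6%.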
3.3 Hierarchical clustering and principal component analysis of the identified proteins
The GO annotation of the 172 identified proteins clustered these into 16 functional categories
including primary metabolism (20%), SSPs (20%), stress response (14%), unknown (9%),
ROS detoxification (6%), gene regulation (6%), protein folding (6%), protein synthesis and
regulation (5%), protease inhibitor (3%), cell structure (2%), secondary metabolism (2%),
energy metabolism (2%), photosynthesis (2%), xenobiotic detoxification (2%), signaling
(1%) and transport (1%) (Supplementary Fig. 2).
A comparison of this study with the previously published report on the soybean seed coat
proteome [20] showed that, of the 172 identified proteins, 118 (68.6%) were novel to
this study and only 54 (31.4%) were common to both studies (Supplementary Fig. 3).
The novel proteins were mainly SSPs, different isoforms of sucrose binding
proteins (SBPs) and isoforms of Ran-binding proteins. These 118 novel proteins might be
specific to the soybean cultivar, as the previous study was performed with the Jack cultivar of
soybean [20]. In addition, mass spectrometry results are highly variable; it is
therefore possible that some proteins simply escaped detection in a given run in either the
previous or the current study.
To understand the interdependence of protein profiles across the three seed
developmental stages in M and MM, hierarchical clustering was performed. After spot
normalization, the data sets were log2 transformed to normalize the scale of
abundance and subjected to clustering using MeV software. The identified proteins were
grouped into 6 clusters based on similar abundance profiles (Fig. 3A-C).
Cluster 4 was the largest group, with 39 proteins, while cluster 1 was the smallest,
containing only 11 proteins. Proteins of cluster 1 showed similar
abundance at 4 WAF (stage 1) and 5 WAF (stage 2) and a higher abundance
at 6 WAF (stage 3). The proteins of cluster 1 were associated with primary metabolism,
protein synthesis and regulation, stress response, and SSPs. In addition, 27% of the proteins in this cluster
were either hypothetical or of unknown function. Clusters 2 and 3 contained 33 and 30
proteins, respectively, which showed a gradual decrease in
abundance from 4 to 6 WAF. The proteins of these clusters were mainly associated
with primary metabolism, energy metabolism, photosynthesis, protein synthesis and
regulation, gene regulation and protease inhibitor activity. The down-regulation of these
protein groups during seed development might be associated with seed
maturation, as the mature seed is almost quiescent with little or no metabolic activity.
Proteins of cluster 4 showed almost no change in abundance between 4 and 5 WAF,
followed by a sudden decrease at 6 WAF. The major functional groups in this cluster were
primary metabolism, ROS detoxification, and SSPs. Proteins in clusters 5 and 6 showed
increased abundance at 6 WAF. Proteins of these clusters were mainly associated with the
stress response, SSPs, and primary metabolism. As soybean seeds undergo excessive
dehydration during maturation, an increased accumulation of stress-related proteins and SSPs
at 6 WAF was expected.
Principal component analysis (PCA) was also carried out to identify the protein
clusters responsible for correlated variance. The PCA showed that the major differences in
spot volumes between M and MM seed coats occurred at 4 and 5 WAF, with no significant
differences at 6 WAF (the datasets of M and MM overlapped) (Fig. 3D-E). These
results can be explained in terms of the seed development process. At 4 and 5 WAF, the seed is
metabolically active, as this is the period when seed filling takes place in soybean; differences
between the M and MM proteins are therefore expected at these stages. At 6 WAF, however,
seeds are fully mature and little or no metabolic activity takes place in them,
so no major differences were observed in the PCA at this stage.
GO term enrichment analysis was also carried out on all identified proteins using the
agriGO database [29]. The results showed a decrease in the abundance of proteins involved in
biological processes (such as primary metabolism and catabolic processes) in MM, supporting
the results of hierarchical clustering (Fig. 4A). Proteins with increased accumulation in MM
were mainly associated with reproductive, developmental, and embryonic developmental
processes (Fig. 4B).
3.4 Functional significance of the identified proteins
3.4.1 Seed storage proteins (SSPs)
SSPs accounted for 20% of the total identified proteins (Supplementary Fig. 2 and
Supplementary Table 3). These SSPs were different isoforms of beta-conglycinin α subunit
(spots 2, 3, 10 and 163), beta-conglycinin α’ subunit (spots 8, 12 and 142), beta-conglycinin β
subunit (spots 48, 49, 161, 162, 260 and 262), glycinin (spots 73 and 74), glycinin 1 (spots 37,
41, 148 and 154), glycinin 2 (spots 31, 32, 33, 177, 179 and 181), G5 protein (spots 175 and
182), and mutant glycinin subunit A1aB1b (spots 193 and 194). Of these SSPs, 57% were
more abundant in M than in MM. SSPs are important constituents of soybean seeds
and comprise up to 70–80% of the total proteins [30,31]. In soybean, glycinin consists of
five subunits (G1–G5), which are encoded by five non-allelic genes. In this study, only the G1,
G2 and G5 subunits were identified, suggesting either that the other glycinin subunits are not
present in the seed coat and are specific to seeds, or that the abundance of the other
subunits was similar in M and MM, so that they were not identified here. β-conglycinin
is composed of three subunits: the α-subunit, α’-subunit and β-subunit. The first two subunits are
encoded by the same mRNA group, while the β-subunit is encoded by another
mRNA group [30-32]. In this study, all three subunits of β-conglycinin were detected,
suggesting that, in addition to being highly abundant in seeds, they are also present in the seed
coats of soybean. Surprisingly, no isoforms of either glycinin or β-conglycinin were identified
in the previous study [20]. This difference in identified proteins might be due to differences in
the soybean cultivars used as experimental material in the two studies [20]. It is possible
that SSPs could serve as biomarkers for the classification of soybean cultivars.
3.4.2 Proteins involved in primary metabolism
Soybean seeds are rich in proteins and fatty acids, which accumulate during seed
development. In addition, it is well established that seeds act as a major sink and
accumulate large amounts of carbohydrates during development [3]. Since the current study used
seed coats from different stages of seed development, we identified numerous enzymes
involved in primary metabolism (20% of total identified proteins) (Supplementary Fig. 2).
Sugars are transported in plants in the form of sucrose through the phloem. From the phloem,
sucrose accumulates in the apoplast and finally enters the cytoplasm, where it is catabolized
by the glycolytic pathway [33]. In this study, seven isoforms of SBPs were identified (spots
55, 147, 149, 150, 152, 153 and 160). SBPs are involved in the transport of sucrose from
the apoplast to the cytoplasm, where it is catabolized to produce energy. Interestingly, all the
SBP isoforms showed increased abundance in MM, suggesting a higher
requirement for sucrose or energy in the seed coats of the mutant during seed development.
In addition to SBPs, various enzymes of glycolysis, including enolase, fructose-bisphosphate
aldolase, glyceraldehyde-3-phosphate dehydrogenase and four isoforms of
phosphoglycerate kinase, were also identified; however, the abundance of all these
enzymes was decreased in MM (Fig. 5). These results suggest that although MM seed
coats may accumulate more sucrose in the cytoplasm, it is not catabolized through the glycolytic
pathway. Either the amount of sucrose remains higher in the MM seed coats, or sucrose might
be converted into glucose-6-phosphate and degraded by other sugar catabolizing pathways
such as the pentose phosphate pathway. UDP-glucose pyrophosphorylase is involved in the
conversion of glucose-1-phosphate into UDP-glucose, which ultimately leads to sucrose or
cell wall polysaccharides [34]. In this study, two isoforms of UDP-glucose
pyrophosphorylase were identified and both were down-regulated in MM.
In addition to the sugar metabolizing enzymes, enzymes related to amino acid
metabolism were also identified. Glutamine synthetases are involved in the formation of
glutamine from glutamate and ammonia. Three isoforms of glutamine synthetase were found
to be down-regulated in MM. Isocitrate dehydrogenase is a key enzyme of the citric acid cycle,
involved in the conversion of isocitrate to α-ketoglutaric acid. The abundance of isocitrate
dehydrogenase was also decreased in MM, suggesting a down-regulation of citric acid
cycle in this cultivar. Formate dehydrogenase catalyzes the conversion of formate to carbon
dioxide, yielding energy in the form of NADH. Four isoforms of formate dehydrogenase
were identified, of which three showed increased abundance in MM, suggesting a higher
energy requirement during seed development in MM than in M. Together, these findings
reveal an overall down-regulation of proteins involved in key metabolic pathways in MM
relative to M, indicating the use of alternative pathways for energy production in MM (Fig. 5).
3.4.3 Stress related proteins
During maturation, seeds undergo extensive dehydration; accordingly, a large number of
proteins involved in desiccation stress were identified in this study. Late embryogenesis
abundant (LEA) proteins are a group of hydrophilic proteins that accumulate to high levels
during seed dehydration [35]. LEA isoforms protect proteins from denaturation and
inactivation during dehydration of the seeds [36]. In addition, some LEA isoforms
have a membrane stabilization function. Nine isoforms of LEA and three isoforms of dehydrins
were identified in this study, of which all the dehydrin isoforms and 8 LEA isoforms
were up-regulated in MM compared with M, indicating a higher degree of dehydration of
MM seeds. In the previous study [20], 27 isoforms of LEA were identified in the
seed coat of soybean cultivar Jack, suggesting their high abundance and requirement in
soybean seed coats at the time of maturation.
3.4.4 Proteins of phenylpropanoid pathway
The phenylpropanoid pathway is a source of coumarins, lignin, flavones, isoflavones, flavonols,
anthocyanins and proanthocyanidins, which are important for plant defense [36].
Protein spots 75 and 76 were identified as isoforms of isoflavone reductase (IFR), and
showed a similar trend in abundance during seed development in M and MM. Proteomic
analysis clearly showed decreased abundance of the IFR isoforms at 4 to 6 WAF in MM
relative to M, suggesting that IFR could be one of the key factors involved in color development in the
seed coats of M and MM. The abundance of the IFR homologues was low in MM at 4
and 5 WAF, when major changes in seed physiology take place. At 6 WAF and
the mature stage, the amount of IFR was either equal (spot 76) or higher (spot 75) in MM
compared with M (Fig. 6A). IFRs are involved in the biosynthesis of isoflavones, which
are important secondary metabolites in plants [37]. Isoflavones act as defense
molecules, as they have antimicrobial, antifungal, and feeding deterrent properties
[38]. The lower levels of IFR hinted at low levels of isoflavones in MM.
3.5 Isoflavone profiling of soybean seeds
As decreased abundance of IFR was observed in the MM seed coats during seed development
(Fig. 6A), total isoflavone content and isoflavone profiles were measured in M and MM
using HPLC. Total isoflavone content was significantly lower in MM
(1398 ± 63 µg/g) than in M (1794 ± 85 µg/g)
(Fig. 6B). In addition to total isoflavones, the concentrations of individual isoflavones,
including genistin, mal-genistin, daidzein, daidzin, mal-daidzin, glycitein, glycitin and mal-glycitin,
were assayed. As with the total isoflavone content, the concentrations of all
eight individual isoflavones were lower in MM than in M (Fig. 6B).
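The size of the difference in total isoflavone content can be expressed as a relative reduction. The sketch uses only the two mean values reported above (1794 µg/g in M and 1398 µg/g in MM) and ignores their variability; `relative_reduction` is a hypothetical helper name.

```python
def relative_reduction(reference, value):
    """Percentage by which `value` is lower than `reference`."""
    return 100.0 * (reference - value) / reference


total_isoflavones_m = 1794.0   # µg/g, mean reported for M
total_isoflavones_mm = 1398.0  # µg/g, mean reported for MM
print(round(relative_reduction(total_isoflavones_m, total_isoflavones_mm), 1))  # 22.1
```

On these means, total isoflavone content in MM seed coats is roughly 22% lower than in M.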
To investigate whether any other biochemical differences exist between
M and MM, other biochemical parameters, namely total protein, total oil, individual amino acids
and individual fatty acid concentrations were also measured in the seed coats of M and MM.
M contained 20.3% oil and 36.1% protein, while MM contained
20.5% oil and 36.5% protein, indicating that the protein and oil contents of the M and
MM seed coats are nearly identical (Supplementary Tables 4 and 5). Fatty acid profiling of M and MM
seed coats showed no difference in the amount of α-linolenic acid (C18:3); however, slight
differences in the contents of palmitic acid (C16:0), stearic acid (C18:0), oleic acid (C18:1),
and linoleic acid (C18:2) were observed (Supplementary Table 4). Amino acid profiling did
not show any differences between M and MM except for the cysteine content: the amount of
cysteine was 1.02 mg/g in M and 2.84 mg/g in MM (Supplementary
Table 5). Cysteine is a sulfur-containing amino acid that participates in the formation of
glutathione, an important antioxidant. Interestingly, one protein spot specific to MM
(spot no. 252) was identified as caffeoyl coenzyme A 3-O-methyltransferase 2. This enzyme
uses S-adenosyl-L-methionine as a methyl donor and converts it to S-adenosyl-L-homocysteine, which
is a precursor for both adenosine and cysteine biosynthesis; this may explain the higher level of
cysteine in MM compared with M.
Overall, except for the changes in isoflavone concentrations, no other major changes
in the biochemical composition of the M and MM seed coats were observed.
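The comparison above can be summarized as MM/M ratios. The following sketch uses only the values quoted in the text; the 1.5-fold cut-off (FOLD_THRESHOLD) is a hypothetical choice made for illustration and is not a criterion used in this study.

```python
# Values quoted in the text for seed coats, given as (M, MM).
composition = {
    "total oil (%)": (20.3, 20.5),
    "total protein (%)": (36.1, 36.5),
    "cysteine (mg/g)": (1.02, 2.84),
}

# HYPOTHETICAL cut-off for flagging a notable change; not taken from the paper.
FOLD_THRESHOLD = 1.5


def fold_change(m_value, mm_value):
    """Return the MM/M ratio for one parameter."""
    return mm_value / m_value


ratios = {name: fold_change(m, mm) for name, (m, mm) in composition.items()}

# A parameter is flagged if it changes by at least the cut-off in either direction.
flagged = [
    name
    for name, ratio in ratios.items()
    if ratio >= FOLD_THRESHOLD or ratio <= 1.0 / FOLD_THRESHOLD
]

for name, ratio in ratios.items():
    print(f"{name}: MM/M = {ratio:.2f}")
print("Flagged:", flagged)
```

With this cut-off only cysteine is flagged (about 2.8-fold higher in MM), whereas the oil and protein ratios are close to 1.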
Concluding Remarks
The aim of this study was to unravel the major differences between the M and MM seed coats that
lead to the development of yellow versus brown seed coat colors, respectively, using an
integrated proteomics and metabolomics approach. This study identified 172 differentially
expressed proteins between M and MM, enriching our current knowledge of seed coat
proteomics and opening a door for improving the nutritional value of soybean seeds for better
human life. In this direction, this study proposes that the low level of IFR in MM compared with
M might be one reason for the difference in their seed coat color. Seed coat color in
soybean is determined by the levels of proanthocyanidins [9]. As both isoflavones and
proanthocyanidins are synthesized via the phenylpropanoid pathway in plants, a reduced
concentration of isoflavones may leave more substrate available for proanthocyanidin production
in MM, leading to the development of a brown seed coat.
The pathways for the biosynthesis of isoflavones and proanthocyanidins overlap, and
the flavanone naringenin is a key intermediate, serving as a precursor for both
isoflavone and proanthocyanidin production. Since low levels of isoflavones were observed
in MM, it can be expected that naringenin in MM seeds is instead utilized for the synthesis of
proanthocyanidins, which results in brown-colored seed coats.
Acknowledgements
This work was supported by grants from the Next-Generation BioGreen 21 Program (SSAC,
grant no. PJ009571), the Cooperative Research Program for Agriculture Science & Technology
Development (Project No. PJ007155) of Development of Functional Crop, National Institute
of Crop Science, and the National Agenda Programs for Agricultural R&D (PJ01004602201401),
Rural Development Administration (RDA), Republic of Korea. RG acknowledges financial
support from RDA.
The authors declare no conflict of interest.
References
[1] Moïse, J. A., Han, S., Gudynaite-Savitch, L., Johnson, D. A., Miki, B. L. A., Seed coats:
Structure, development, composition, and biotechnology. In Vitro Cell. Dev. Biol. Plant 2005,
41, 620–644.
[2] Miernyk, J. A., Preťová, A., Olmedilla, A., Klubicová, K. et al., Using proteomics to
study sexual reproduction in angiosperms. Sex. Plant Reprod. 2011, 24, 9–22.
[3] Weber, H., Borisjuk, L., Wobus, U., Molecular physiology of legume seed development.
Annu. Rev. Plant Biol. 2005, 56, 253–279.
[4] McClean, P. E., Lee, R. K., Otto, C., Gepts, P., Bassett, M. J., Molecular and phenotypic
mapping of genes controlling seed coat pattern and color in common bean (Phaseolus
vulgaris L.). J. Hered. 2002, 93, 148–152.
[5] Bellaloui, N., Soybean seed phenol, lignin, and isoflavones partitioning as affected by
seed node position and genotype differences. Food Nutr. Sci. 2012, 3, 447–454.
[6] Sessa, D. J., Wolf, W. J., Bowman-Birk inhibitors in soybean seed coats. Ind. Crops Prod.
2001, 14, 73–83.
[7] Mullin, W. J., Xu, W., Study of soybean seed coat components and their relationship to
water absorption. J. Agric. Food Chem. 2001, 49, 5331–5335.
[8] Kovinich, N., Saleem, A., Arnason, J. T., Miki, B., Combined analysis of transcriptome
and metabolite data reveals extensive differences between black and brown nearly-isogenic
soybean (Glycine max) seed coats enabling the identification of pigment isogenes. BMC
Genomics 2011, 12, 381.
[9] Todd, J. J., Vodkin, L. O., Pigmented soybean (Glycine max) seed coats accumulate
proanthocyanidins during development. Plant Physiol. 1993, 102, 663–670.
[10] Winkel-Shirley, B., Flavonoids in seeds and grains: physiological function, agronomic
importance and the genetics of biosynthesis. Seed Sci. Res. 1996, 8, 415–422.
[11] Lepiniec, L., Debeaujon, I., Routaboul, J-M., Baudry, A. et al., Genetics and
biochemistry of seed flavonoids. Annu. Rev. Plant Biol. 2006, 57, 405–430.
[12] Lee, J. H., Seo, K. I., Kang, N. S., Yang, M. S., Park, K. H., Triterpenoids from roots of
Glycine max (L.) Merr. Agric. Chem. Biotechnol. 2006a, 49, 51–56.
[13] Lee, J. H., Seo, W. D., Jeong, S. H., Jeong, T. S. et al., Human acyl-CoA:cholesterol
acyltransferase inhibitory effect of flavonoids from roots of Glycine max (L.) Merr. Agric.
Chem. Biotechnol. 2006b, 49, 57–61.
[14] Messina, M., Soyfoods and soybean phyto-oestrogens (isoflavones) as possible
alternatives to hormone replacement therapy (HRT). Eur. J. Cancer 2006, 36, s71–s72.
[15] Rao, A. V., Sung, M. K., Saponins as anticarcinogens. J. Nutr. 1995, 125, s717–s724.
[16] Gijzen, M., Van Huystee, R., Buzzell, R. I., Soybean seed coat peroxidase (A
comparison of high-activity and low-activity genotypes). Plant Physiol. 2003, 103, 1061–
1066.
[17] Gijzen, M., Kuflu, K., Qutob, D., Chernys, J. T., A class I chitinase from soybean seed
coat. J. Exp. Bot. 2001, 52, 2283–2289.
[18] Batchelor, A. K., Boutilier, K., Miller, S. S., Labbé, H. et al., The seed coat-specific
expression of a subtilisin-like gene, SCS1, from soybean. Planta 2001, 211, 484–492.
[19] Batchelor, A. K., Boutilier, K., Miller, S. S., Hattori, J. et al., SCB1, a BURP-domain
protein gene, from developing soybean seed coats. Planta 2002, 215, 523–532.
[20] Miernyk, J. A., Johnston, M. L., Proteomic analysis of the testa from developing soybean
seeds. J. Proteomics 2013, 89, 265–272.
[21] Agrawal, G. K., Hajduch, M., Graham, K., Thelen, J. J., In-depth investigation of the
soybean seed-filling proteome and comparison with a parallel study. Plant Physiol. 2008, 148,
504–518.
[22] Xu, C., Caperna, T. J., Garrett, W. M., Cregan, P. B. et al., Proteomic analysis of the
distribution of the major seed allergens in sixteen soybean accessions. J. Sci. Food Agric. 2007,
87, 2511–2518.
[23] Kim, Y. J., Lee, S. J., Lee, H. M., Lee, B. W. et al., Comparative proteomics analysis of
seed coat from two black colored soybean cultivars during seed development. Plant Omics J.
2013a, 6, 456–463.
[24] Kim, Y. J., Lee, H. M., Wang, Y., Wu, J. et al., Depletion of abundant plant RuBisCO
protein using the protamine sulfate precipitation method. Proteomics 2013b, 13, 2176–2179.
[25] Kim, S. T., Kim, S. G., Kang, Y. H., Wang, Y. et al., Proteomics analysis of rice lesion
mimic mutant (spl1) reveals tightly localized probenazole-induced protein (PBZ1) in cells
undergoing programmed cell death. J. Proteome Res. 2008, 7, 1750–1760.
[26] Kwon, Y. S., Ryu, C. M., Lee, S., Park, H. B. et al., Proteome analysis of Arabidopsis
seedlings exposed to bacterial volatiles. Planta 2010, 232, 1355–1370.
[27] Radhakrishnan, R., Pae, S-B., Kang, S-M., Lee, I-J., Baek, I-Y., Parental effects on
nutritional and antioxidants constituents in seeds of peanut cv. Boreom 1. J. Crop Sci.
Biotech. 2014, 17, 35–39.
[28] Christie, W. W., Preparation of ester derivatives of fatty acids for chromatographic
analysis, in Advances in Lipid Methodology-Two (Christie, W.W., ed.), 1993, pp. 69–112, The
Oily Press, Dundee.
[29] Du, Z., Zhou, X., Ling, Y., Zhang, Z., Su, Z., agriGO: a GO analysis toolkit for the
agricultural community. Nucleic Acids Res. 2010, 38, W64–W70.
[30] Natarajan, S. S., Analysis of Soybean Seed Proteins Using Proteomics. J. Data Mining
Genomics Proteomics 2014, 5, 1.
[31] Schuler, M. A., Schmitt, E. S., Beachy, R. N., Closely related families of genes code for
the alpha and alpha’ subunits of the soybean 7S storage protein complex. Nucleic Acids Res.
1982, 10, 8225–8244.
[32] Koshiyama, I., Storage proteins of soybean, in Seed Proteins: Biochemistry, Genetics,
Nutritive Value, 1983, Springer, Netherlands.
[33] Di Carli, M., Zamboni, A., Pè, M. E., Pezzotti, M. et al., Two-dimensional differential in
gel electrophoresis (2D-DIGE) analysis of grape berry proteome during postharvest withering.
J. Proteome Res. 2011, 10, 429–446.
[34] Kleczkowski, L. A., Kunz, S., Wilczynska, M., Mechanisms of UDP-Glucose Synthesis
in Plants. Crit. Rev. Plant Sci. 2010, 29, 191–203.
[35] Battaglia, M., Covarrubias, A. A., Late embryogenesis abundant (LEA) proteins in
legumes. Front. Plant Sci. 2013, 4, 190.
[36] Pawlak-Sprada, S., Arasimowicz-Jelonek, M., Podgórska, M., Deckert, J., Activation of
phenylpropanoid pathway in legume plants exposed to heavy metals. Part I. Effects of
cadmium and lead on phenylalanine ammonia-lyase gene expression, enzyme activity and
lignin content. Acta Biochim. Pol. 2011, 58, 211–216.
[37] Kim, S. G., Kim, S. T., Wang, Y., Kim, S. K. et al., Overexpression of rice isoflavone
reductase-like gene (OsIRL) confers tolerance to reactive oxygen species. Physiol. Plant.
2010, 138, 1–9.
[38] He, X. Z., Dixon, R. A., Genetic manipulation of isoflavone 7-O-methyltransferase
enhances biosynthesis of 4'-O-methylated isoflavonoid phytoalexins and disease resistance in
alfalfa. Plant Cell 2000, 12, 1689–1702.
Figure 1. Graphical representation of the experimental workflow used in the current study.
The upper panel shows the morphological characteristics of Mallikong (M) and Mallikong mutant
(MM) seeds at 4, 5 and 6 weeks after flowering (WAF).
Figure 2. Representative 2D gel maps of seed coats of M and MM from three seed
development stages. Approximately 600 µg of protein was resolved on 24 cm IPG strips
(pH 4–7) in the first dimension and by 12% SDS-PAGE in the second dimension. Spots were
visualized by colloidal CBB staining, and gels were compared using ImageMaster 2D
Platinum software (ver. 6.0). Red arrows indicate spots that were down-regulated and blue
arrows indicate spots that were up-regulated in MM in comparison with M. Black arrows
indicate spots unique to MM.
Figure 3. Hierarchical clustering analysis of the identified proteins based on their
expression profiles. All 178 differentially expressed proteins were grouped into 6 clusters.
(A) Heat map, shown at the top. (B) Expression profiles of the clustered proteins,
shown below for M and MM at the three growth stages (1–3). The x-axis of each graph
represents the seed development stage, while the y-axis represents the log-transformed value of
protein expression. (C) Functional groups in each cluster, depicted by pie charts below the
expression graphs. (D) PCA of the identified proteins.
Figure 4. Gene Ontology term enrichment analysis of the differentially expressed proteins
identified in M and MM seed coats using the agriGO database. Analysis of up-regulated
proteins (A) and down-regulated proteins (B) in MM within the “Biological Process”
category.
Figure 5. A schematic diagram of the differentially modulated proteins involved in primary
and secondary metabolism in the seed coats of M and MM. Numbers indicate the spot
numbers listed in Supplementary Table 1. Numbers in red indicate up-regulated spots, while
numbers in green indicate down-regulated spots in MM compared with M.
Figure 6. (A) Zoom-gel regions of M and MM 2-D gels showing expression pattern of IFR
isoforms during the seed development stages. (B) Measurement of individual and total
isoflavone content in M and MM seed coats using HPLC.