
Breakthrough Technologies

Advanced Data-Mining Strategies for the Analysis of Direct-Infusion Ion Trap Mass Spectrometry Data from the Association of Perennial Ryegrass with Its Endophytic Fungus, Neotyphodium lolii 1[W][OA]

Mingshu Cao 2, Albert Koulman 2, Linda J. Johnson, Geoffrey A. Lane, and Susanne Rasmussen*

AgResearch Limited, Grasslands Research Centre, Palmerston North 4442, New Zealand

Direct-infusion mass spectrometry (MS) was applied to study the metabolic effects of the symbiosis between the endophytic fungus Neotyphodium lolii and its host perennial ryegrass (Lolium perenne) in three different tissues (immature leaf, blade, and sheath). Unbiased direct-infusion MS using a linear ion trap mass spectrometer allowed metabolic effects to be determined free of any preconceptions and in a high-throughput fashion. Not only the full MS1 mass spectra (range 150–1,000 mass-to-charge ratio) were obtained but also MS2 and MS3 product ion spectra were collected on the most intense MS1 ions as described previously (Koulman et al., 2007b). We developed a novel computational methodology to take advantage of the MS2 product ion spectra collected. Several heterogeneous MS1 bins (different MS2 spectra from the same nominal MS1) were identified with this method. Exploratory data analysis approaches were also developed to investigate how the metabolome differs in perennial ryegrass infected with N. lolii in comparison to uninfected perennial ryegrass. As well as some known fungal metabolites like peramine and mannitol, several novel metabolites involved in the symbiosis, including putative cyclic oligopeptides, were identified. Correlation network analysis revealed a group of structurally related oligosaccharides, which differed significantly in concentration in perennial ryegrass sheaths due to endophyte infection. This study demonstrates the potential of the combination of unbiased metabolite profiling using ion trap MS and advanced data-mining strategies for discovering unexpected perturbations of the metabolome, and generating new scientific questions for more detailed investigations in the future.

With the advent of metabolomics, methods for the simultaneous analysis of a large number of small molecules (metabolites) have been developed and improved, providing more details about the metabolism of complex biological systems (Sumner et al., 2003; Dettmer et al., 2007). Metabolite fingerprinting methods provide relatively unbiased and high-throughput information on complex biological systems and have been used for yeast (Saccharomyces cerevisiae) strain classification (Allen et al., 2003) and annotation of gene functions (Raamsdonk et al., 2001). Comprehensive and unbiased metabolite analysis is also an indispensable tool for systems biology together with transcriptomics and proteomics (see commentary by Sauer et al., 2007). Direct-infusion (without prior chromatographic separation) electrospray ionization (ESI) mass spectrometry (MS) was introduced as a tool for the identification of novel fungal metabolites in culture (Smedsgaard and Frisvad, 1996) and is now widely applied in metabolomics (for a recent review, see Dettmer et al., 2007). High resolution MS instrumentation such as time-of-flight MS (Dunn et al., 2005) or Fourier transform ion cyclotron resonance MS (FT-ICR-MS; Aharoni et al., 2002) is often preferred for the specificity provided by resolution of isobaric ions and highly accurate estimates of mass-to-charge (m/z) ratios. We recently applied direct-infusion ESI MS using an ion trap MS (DIMSn) to determine metabolic differences between endophyte-infected and endophyte-free perennial ryegrass (Lolium perenne) seed samples (Koulman et al., 2007b). This study established that DIMSn is a powerful tool for determining metabolic profiles, even though ion trap MS is a low resolution MS technology. Its advantage lies in the capacity of the ion trap to rapidly collect fragmentation data on large numbers of ions (more than 200) selected from the MS1 spectrum using automated data-dependent scanning, thereby facilitating structural classification and identification of metabolites of interest. Direct-infusion ESI MS/MS with an ion trap has been used to discover unknown drug metabolites (e.g. Tozuka et al., 2003), but its potential for metabolome investigations warrants further exploration and development. Here, we have employed DIMSn technology to detect and investigate a wide range of metabolites involved in an association between perennial ryegrass and its endophytic fungus, Neotyphodium lolii.

1 This work was supported by a grant from the New Zealand Foundation for Research Science and Technology (contracts C10X0203 and AGRX0204) and conducted at AgResearch Grasslands, Palmerston North, New Zealand.

2 These authors contributed equally to the article.

* Corresponding author; e-mail susanne.rasmussen@agresearch.co.nz.

The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantphysiol.org) is: Susanne Rasmussen (susanne.rasmussen@agresearch.co.nz).

[W] The online version of this article contains Web-only data.

[OA] Open Access articles can be viewed online without a subscription.

www.plantphysiol.org/cgi/doi/10.1104/pp.107.112458

Plant Physiology, April 2008, Vol. 146, pp. 1501–1514, www.plantphysiol.org. Copyright © 2008 American Society of Plant Biologists. All rights reserved.

Symbiotic associations between fungal endophytes and grasses are widespread and have been estimated to occur in 20% to 30% of all grass species (Leuchtmann, 1992), and are therefore of wide interest to studies on plant-fungal interactions. The two most widely studied associations are the perennial ryegrass-N. lolii symbiosis in Australasia and tall fescue (Lolium arundinaceum)-Neotyphodium coenophialum in North America (Christensen et al., 1993). These two associations are of particular interest to agricultural pastoral systems because the fungal endophytes have been implicated in toxicity to grazing livestock, including ryegrass staggers and fescue foot, but have also been found to confer a range of agronomic benefits to their grass hosts, mainly through toxicity and feeding deterrent activities toward invertebrate herbivores. These antiherbivore activities have been associated with specific alkaloids produced by the fungi within the plant (Bush et al., 1997; Lane et al., 2000; Malinowski and Belesky, 2000; Schardl et al., 2004). The major known alkaloids produced by N. lolii in perennial ryegrass are peramine, lolitrem B, and ergovaline. Peramine, a pyrrolopyrazine alkaloid, protects the grass host from herbivorous insects such as the Argentine stem weevil (Listronotus bonariensis; Rowan and Gaynor, 1986; Fletcher and Easton, 1997) and has been shown to be exuded in the guttation fluid of endophyte-infected perennial ryegrass (Koulman et al., 2007a). The indole-diterpene lolitrem B acts as a neurotoxin and causes ryegrass staggers in grazing livestock (Gallagher et al., 1984). The peptide alkaloid ergovaline has been associated with heat stress and poor liveweight gains in livestock grazing endophyte-infected perennial ryegrass (Easton et al., 1986; Fletcher and Easton, 1997) and with fescue foot, a severe mammalian disorder that can lead to considerable productivity losses in livestock raised on endophyte-infected tall fescue (Lyons et al., 1986; Strickland et al., 1996). Considerable research efforts have been focused on the biosynthesis, accumulation, and ecological consequences of these fungal alkaloids (for review, see Schardl, 2001; Clay and Schardl, 2002; Schardl et al., 2004). Much less is known about the impacts of fungal endophytes on general host plant performance and metabolism, and recent publications indicate that the effects of fungal endophytes on plant metabolism might be of importance to the understanding of ecosystem-wide impacts of the grass-endophyte symbiosis (Hunt et al., 2005; Cheplick, 2007; Krauss et al., 2007; Rasmussen et al., 2007, 2008).

In this article, we present exploratory data analysis approaches to investigate how metabolites analyzed by DIMSn differ between endophyte-infected and uninfected ryegrass plants in three tissue samples corresponding to three developmental stages (immature leaf, blade, and sheath) of the symbiosis of perennial ryegrass and N. lolii. We have also taken advantage of the MS/MS spectral information to aid metabolite identification and determine the homogeneity of the spectra, and hence uniformity of the metabolite species across the samples. Available software for automating the processing of liquid chromatography (LC)-MS data including commercial software such as MassFrontier (http://www.highchem.com/) and freeware and open source software such as MetAlign (http://www.metalign.nl), XCMS (Smith et al., 2006), and MZmine (Katajamaa and Oresic, 2005) is not applicable to infusion profiles of the type generated in our experiments, as these programs are designed to search for peaks in both the time and mass domain. Thus, computational tools for harnessing the raw data from DIMSn experiments have been developed in this study and will be made available upon request.

RESULTS

The approach to DIMSn data analysis in this article is as follows: (1) statistical analysis of MS1 data, with the range of 150 to 1,000 m/z, to select nominal m/z bins that differ significantly in intensity between the groups of four samples from infected and uninfected plants within each of three tissue types (immature leaf, blade, and sheath); (2) analysis of the MS2 spectra deriving from the same parent MS1 m/z bin across all the samples using purpose-built computational tools to determine their similarity and aid identification of components in the fragmentation data; and (3) correlation network analysis of all MS1 bins to identify metabolite relationships not revealed by feature selection based on the statistical ranking.

MS1 Data Analysis and the Selection of Differentiated MS1 m/z Bins

The MS1 spectrum of each sample was obtained from the raw data as described in "Materials and Methods" ("Data Analysis"). To handle the low resolution infusion data we were able to adopt simpler processing procedures (compare with Enot et al., 2006) than those required for handling high resolution LC-MS data (Hansen and Smedsgaard, 2004), where alignment in the time and mass domains can be challenging. The MS1 spectra were collated in nominal m/z bins in keeping with the 1 m/z resolution of the machine (called MS1 bins hereafter). After background subtraction, the median value of the ion abundance in each bin was determined. Each MS1 bin is thus likely to include ion signals from more than one metabolite. On the other hand, each metabolite is likely to contribute to signals for more than one MS1 bin due to the occurrence of isotopologue ions, hydrogen transfers in the source, and the formation of salt adduct ions. Both aspects must be taken into account for the interpretation of a nominal MS1 bin.
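The binning step can be illustrated with a minimal Python sketch. It assumes that each MS1 scan is a list of (m/z, intensity) pairs that has already been background-subtracted, that intensities falling into the same nominal bin are summed within a scan, and that the median is taken across repeated scans of a sample. These choices, the function names, and the example values are illustrative assumptions and not the procedure given in "Materials and Methods."

```python
from collections import defaultdict
from statistics import median


def bin_scan(peaks):
    """Sum the (m/z, intensity) peaks of one scan into nominal (integer) m/z bins."""
    bins = defaultdict(float)
    for mz, intensity in peaks:
        # round() maps e.g. 204.9 and 205.1 to bin 205. Python resolves exact
        # .5 ties to the even integer, so another rule may be preferable.
        bins[round(mz)] += intensity
    return dict(bins)


def median_bin_abundance(scans, mz_min=150, mz_max=1000):
    """Median abundance of each nominal bin across repeated scans of a sample."""
    binned = [bin_scan(scan) for scan in scans]
    profile = {}
    for mz in range(mz_min, mz_max + 1):
        values = [b.get(mz, 0.0) for b in binned]
        if any(values):  # skip bins that are empty in every scan
            profile[mz] = median(values)
    return profile


# Made-up scans of one sample (not data from the study).
scans = [
    [(205.1, 90.0), (248.2, 40.0)],
    [(205.0, 110.0), (248.1, 44.0)],
    [(204.9, 100.0), (248.2, 42.0)],
]
profile = median_bin_abundance(scans)
print(profile)
```

Using the median rather than the mean keeps a single noisy scan from dominating the abundance assigned to a bin.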

Our experiment was designed to analyze how the metabolome of the symbiosis changes upon endophyte infection in different tissues, i.e. immature leaf, blade, and sheath (see Supplemental Table S1 for sample description). Our first data exploration based on principal component analysis revealed that the main variations (73.45% of the total variation) were explained by metabolic differences between tissues. The infected (E+) and uninfected (E−) samples were not resolved in the PC1-PC2 score plot (Supplemental Fig. S1). We therefore used an empirical Bayes moderated t test (Smyth, 2004) to investigate which metabolites were differentially expressed between E+ and E− in immature leaf, blade, and sheath tissue, respectively. This approach was devised to identify differentially expressed genes across specified conditions in designed microarray experiments, and it provides more stable inference when the sample size is small (Smyth, 2004). Significant differential MS1 bins were selected based on an adjusted P value < 0.05 using Benjamini and Hochberg's false discovery rate to control false positives. Ten MS1 bins were identified based on this criterion, of which seven (m/z 230, 248, 205, 231, 335, 189, and 554) were significantly different between E+ and E− in sheath, four in immature leaf (m/z 209, 297, 223, and 230), but none in blade tissue. The distributions of the ion abundance of MS1 bins of m/z 205, 248, 335, and 554 in different treatment groups are shown in Figure 1. See Supplemental Figure S2 for the distribution of the other significant MS1 bins (m/z 230, 231, 189, 209, 223, and 297) and Table I for a summary of all the MS1 bins identified as being significantly different between infected and uninfected perennial ryegrass tissues.
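The false discovery rate adjustment used for bin selection can be sketched in a few lines of Python. The function below implements the standard Benjamini-Hochberg step-up adjustment. It does not reproduce the empirical Bayes moderated t test itself, and the P values in the example are made up for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # indices by ascending P
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values stay monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up raw P values for four hypothetical MS1 bins.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
selected = [i for i, p in enumerate(adjusted) if p < 0.05]
print(adjusted, selected)
```

A bin is retained when its adjusted value falls below the chosen cutoff (0.05 in the text), which bounds the expected proportion of false positives among the selected bins.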

Based on the nominal mass of an MS1 bin only, it is impossible to determine the chemical identity of a selected ion in a single experiment. For the MS1 bins of m/z 230, 231, and 189, no MS2 data were generated and no putative identification can be suggested. We have shown previously that DIMSn with our instrumentation allows MS2 product ion spectra for selected MS1 features to be checked manually to identify unique MS1 ion species by their fragmentation pathway (Koulman et al., 2007b). Here, we developed a computational method to utilize the fragmentation data from all the samples where MS2 data are available and circumvent manual comparison of individual spectra from different samples.

MS2 Data Analysis and the Identification of Selected MS1 Ions

As noted above, MS1 bins of different m/z could arise from the same metabolite due to isotopic ions, hydrogen transfers, or salt adducts. To identify a metabolite we need to first identify the relevant monoisotopic (12C, 1H, 14N, 16O) MS1 bin. This can be partially addressed using correlation analysis (Enot et al., 2006) as bins deriving from the same metabolite should be highly correlated, and in the mass range we are investigating, the monoisotopic MS1 bin will be of highest intensity. Thus, to assist in the identification of ions in the significant MS1 bins, we have considered concurrently correlated MS1 bins of adjacent mass (possible isotopologues) as well as MS1 bins corresponding to salt (e.g. K+ and Na+) adducts. We have also compared their MS2 product ion spectra as an aid to identifying isotopologues, adducts, and binning anomalies.
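The search for correlated bins of adjacent mass can be sketched as follows. The code computes Pearson's correlation coefficient between one MS1 bin and its neighbours across samples and keeps those above a threshold. The table layout (bin m/z mapped to abundances in a fixed sample order), the default offsets, the cutoff of 0.8, and the abundance values are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlated_neighbours(table, mz, offsets=(-2, -1, 1, 2), r_min=0.8):
    """Return {neighbour m/z: r} for bins at mz + offset with r >= r_min."""
    hits = {}
    for offset in offsets:
        other = mz + offset
        if other in table:
            r = pearson(table[mz], table[other])
            if r >= r_min:
                hits[other] = r
    return hits


# Made-up abundances for four samples (not data from the study).
table = {
    333: [100.0, 80.0, 20.0, 10.0],
    334: [37.0, 30.0, 8.0, 4.0],
    335: [7.0, 6.0, 2.0, 1.0],
    336: [5.0, 1.0, 6.0, 2.0],
}
hits = correlated_neighbours(table, 335)
print(hits)
```

The same function could be pointed at salt-adduct candidates by passing larger offsets, for example 22 for the nominal difference between a sodium adduct and the protonated ion, or 38 for a potassium adduct.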

Figure 1. The distribution of MS1 ion abundance in different treatment groups. MS1 bins m/z 205, 248, 335, and 554 differ significantly (false discovery rate adjusted P value < 0.05) between E+ and E− in sheath tissue. The barplot is based on the median and median absolute deviation values of the replicates (n = 4, median ± median absolute deviation). Labels of E+ and E− refer to presence and absence of endophyte, and I, B, and S refer to plant tissue immature leaf, blade, and sheath, respectively. See all sample names in Supplemental Table S1.

Analysis of the MS2 data can also assist in addressing the alternative scenario, namely, that any one MS1 bin (a nominal unit m/z) is also likely to contain signals for a number of different metabolites not resolved by instrument resolution (and also binning artefacts). If the MS1 bin of a nominal m/z contains ions from different metabolites in different samples, or from the same metabolites but in different concentrations, then the MS2 product ion spectra are likely to differ between samples. Thus we have developed a method for automated comparison of MS2 data across a sample set to investigate ion homogeneity within selected MS1 bins.

The method is based on the modified Manhattan distance as a measurement of the similarity of MS2 spectra from ions in a given parent MS1 bin. The procedure is described in detail in "Materials and Methods." In brief, MS2 data for a parent MS1 bin of interest were pairwise compared between samples. Only the 20 most intense fragment ions in each MS2 spectrum were used for the comparison. The sum of the absolute values of the difference in normalized intensities was used as a distance score. From a set of pairwise differences, a distance matrix was obtained for statistical classification (hierarchical clustering or multidimensional scaling [MDS], etc.) to assess the homogeneity of each MS1 bin. For most MS1 bins, no distinct groups were seen, indicating that the MS2 spectra were consistent, and thus these MS1 bins are homogeneous and likely to be dominated by a single ion species. However, there were a number of MS1 bins for which different MS2 spectral patterns were observed for different samples. The differences between MS2 spectra in samples can be visualized with clustering analysis methods such as hierarchical clustering or MDS. MDS preserves the distance metric, and the cluster structures are revealed in different directions in the manner of principal component analysis (Lattin et al., 2003). The m/z 209 MS1 bin, which showed significant differences in ion abundance between E+/E− in the immature tissues (see the plot in Supplemental Fig. S2), also exhibited different MS2 patterns in E+ and E− samples (Fig. 2, premz 209, where premz means precursor m/z in MS2 plots). Inspection of individual MS2 spectra from the MS1 bin m/z 209 (Fig. 3) suggests there are at least two metabolites within the m/z 209 MS1 bin present in different concentrations in the E+ and E− samples (e.g. fragment ions m/z 191 and 192; in comparison with m/z 149, 177, and 181). The MS2 spectra for another significant MS1 bin m/z 554 and the MS1 bin of adjacent mass m/z 555 (see the following discussion) also showed different patterns between E+ and E− samples (Fig. 2, premz 555). See Supplemental Figure S4 for the detailed MS2 spectra of m/z 555.
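The pairwise comparison described in brief above can be sketched in Python. The code keeps the 20 most intense fragment ions of each MS2 spectrum, normalizes their intensities, and sums the absolute differences. The exact modification of the Manhattan distance is given in "Materials and Methods" and is not reproduced here; matching fragments by nominal m/z, scaling intensities to sum to 1, the function names, and the example spectra are assumptions.

```python
def top_n_normalised(spectrum, n=20):
    """Keep the n most intense (m/z, intensity) peaks, binned to nominal m/z
    and scaled so that the retained intensities sum to 1."""
    top = sorted(spectrum, key=lambda peak: peak[1], reverse=True)[:n]
    binned = {}
    for mz, intensity in top:
        binned[round(mz)] = binned.get(round(mz), 0.0) + intensity
    total = sum(binned.values())  # assumes at least one positive intensity
    return {mz: value / total for mz, value in binned.items()}


def manhattan_distance(spec_a, spec_b, n=20):
    """Sum of absolute differences in normalised fragment intensities."""
    a = top_n_normalised(spec_a, n)
    b = top_n_normalised(spec_b, n)
    return sum(abs(a.get(mz, 0.0) - b.get(mz, 0.0)) for mz in set(a) | set(b))


def distance_matrix(spectra, n=20):
    """Pairwise distances between the MS2 spectra of one parent MS1 bin."""
    names = sorted(spectra)
    matrix = [[manhattan_distance(spectra[r], spectra[c], n) for c in names]
              for r in names]
    return names, matrix


# Made-up MS2 spectra for one parent bin in three hypothetical samples.
spectra = {
    "sample_a": [(191.0, 100.0), (192.0, 60.0), (149.0, 5.0)],
    "sample_b": [(191.1, 95.0), (192.0, 62.0), (149.0, 6.0)],
    "sample_c": [(149.0, 100.0), (177.0, 80.0), (181.0, 70.0), (191.0, 10.0)],
}
names, matrix = distance_matrix(spectra)
for name, row in zip(names, matrix):
    print(name, [round(d, 3) for d in row])
```

With this normalization the score ranges from 0 for identical spectra to 2 for spectra with no fragment in common. The resulting matrix could then be passed to hierarchical clustering or MDS in any statistics package.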

Table I. A summary of MS1 bins discussed in this study

| MS1 Bin (m/z) | Isotopologue (m/z) | Ion Adduct | Putative Metabolite (a) | Ion Fragmentation | Tissue (b) |
|---|---|---|---|---|---|
| 189 | | | Unknown (no MS2) | | Sheath |
| 205 | | [182+Na]+ | Mannitol (p, f) | Supplemental Figure S3 | Sheath |
| 209 | | | Unknown (heterogeneous bin) | Figure 3 | Immature |
| 223 | | | Unknown (no MS2) | | Immature |
| 230 | | | Unknown (no MS2) | | Sheath/immature |
| 231 | | | Unknown (no MS2) | | Sheath |
| 248 | | | Peramine (f) | Supplemental Figure S7 | Sheath |
| 297 | 296 | | 5-Hydroxyferulyl sinapine (p) | | Immature |
| 335 | 333, 334 | | Perloline (p) | | Sheath |
| 543 | 544, 545, 546 | [504+K]+ | Trihexoside (p) | | Sheath |
| 554 | 555 | [1,107+H2]++ | Cyclic peptide (f) | Supplemental Figure S4, a and b | Sheath |
| 705 | 706 | [666+K]+ | Tetrahexoside (p) | Supplemental Figure S5a | Sheath |
| 867 | 868, 869, 870 | [828+K]+ | Pentahexoside (p) | Supplemental Figure S5a | Sheath |
| 779 | 781 | [666+K2Cl]+ | Tetrahexoside (p) [K2Cl]+ cluster | | None |
| 941 | 943 | [828+K2Cl]+ | Pentahexoside (p) [K2Cl]+ cluster | | None |

(a) Labels (p) and (f) indicate plant and fungal origin, respectively. (b) Tissue in which the bins show a statistically significant difference between E+ and E− material. MS1 bins m/z 209, 223, 297, and 335 are significantly higher in E− samples and all other MS1 bins are higher in E+ samples.

Figure 2. The clustering of MS2 spectra derived from the parent MS1 bins m/z 209 and m/z 555 by MDS based on modified Manhattan distances, suggesting there are different MS2 spectral patterns in E+ and E− groups.

Figure 3. Plots of MS2 spectra derived from the parent MS1 bin m/z 209 in E− immature tissue (spectra A and B) and E+ blade tissue (spectra C and D) showing different relative intensities of product ions indicative of different metabolite compositions within this bin in the two classes of samples. MS2 data were not obtained for other samples.

Many distance metrics have been proposed to measure the similarity of spectra such as the dot product (Stein and Scott, 1994) and correlation-based distance metrics such as 1 − correlation coefficient or 1 − cosine (Tabb et al., 2003). We observed that fragment masses in MS2 spectra often showed mismatches between samples, and major fragment masses needed to be treated individually. The top 20 ions in each MS2 spectrum were retained and compared in this study, but this number could be extended or decreased to any arbitrary number. However, McLafferty et al. (1999) reported that 18 peaks were 97% as effective as 150 peaks for searching and comparing electron impact mass spectra. In standard practice of chemical analysis, only a few major MS2 product ions are used for confirmation of the identity of the parent MS1 species (Allwood et al., 2006; Koulman et al., 2007b).
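For comparison, one of the alternative metrics mentioned above, 1 − cosine, can be sketched in Python. Binning fragments to nominal m/z, the top-20 cutoff as a default argument, and the function names are assumptions for illustration. The last line shows how a small shift in a measured fragment mass across a bin boundary turns a match into a mismatch, which is one way the mass mismatches noted in the text can affect any bin-based score.

```python
from math import sqrt


def to_vector(spectrum, n=20):
    """Keep the n most intense (m/z, intensity) peaks, binned to nominal m/z."""
    top = sorted(spectrum, key=lambda peak: peak[1], reverse=True)[:n]
    vector = {}
    for mz, intensity in top:
        vector[round(mz)] = vector.get(round(mz), 0.0) + intensity
    return vector


def cosine_distance(spec_a, spec_b, n=20):
    """1 - cosine of the angle between two binned fragment-intensity vectors."""
    a, b = to_vector(spec_a, n), to_vector(spec_b, n)
    dot = sum(value * b.get(mz, 0.0) for mz, value in a.items())
    norm_a = sqrt(sum(value * value for value in a.values()))
    norm_b = sqrt(sum(value * value for value in b.values()))
    return 1.0 - dot / (norm_a * norm_b)


# Made-up spectrum; a spectrum compared with itself has distance ~0.
spectrum = [(191.0, 100.0), (192.0, 60.0), (149.0, 5.0)]
print(cosine_distance(spectrum, spectrum))

# The same fragment measured at 191.4 and 191.6 lands in bins 191 and 192.
print(cosine_distance([(191.4, 1.0)], [(191.6, 1.0)]))
```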

The predominant component of the m/z 248 MS1 bin in the endophyte-infected samples is peramine, which is a known fungal alkaloid. Peramine has a guanidinium moiety that undergoes distinctive neutral losses of 17 and 42. Its MS2 product ion spectrum is in accordance with previous findings (Koulman et al., 2007a); see also Supplemental Figure S7 in which m/z 248 is used as an example for MS2 data handling.

The MS2 fragmentation pattern of the m/z 205 MS1 bin showed a clear water loss. Weakly basic metabolites such as alcohols are prone to form sodium adducts rather than [M+H]+ ions (Jemal et al., 1997). We propose the actual nominal mass of the metabolite to be 182 (i.e. m/z 205 is [M+Na]+). A logical candidate is mannitol or a related sugar alcohol. To test this hypothesis, we infused a water solution of mannitol into the mass spectrometer and observed a sodiated ion of m/z 205 with a highly similar MS2 spectrum to that of the m/z 205 MS1 bin in the endophyte-infected samples (Supplemental Fig. S3). However, as there are several naturally occurring hexitols (10 possible stereoisomers) with similar MS2 product ion spectra (based on data from triple quadrupole MS; see www.hmdb.ca and www.massbank.jp), the MS2 spectrum may be derived from a combination of several unresolved sugar alcohols in the m/z 205 MS1 bin.
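The nominal-mass arithmetic behind this assignment can be checked with a short Python sketch. It lists the neutral masses implied by a nominal m/z under a few common singly charged adducts and compares them with the nominal mass of a hexitol, C6H14O6. The adduct list and helper names are illustrative assumptions; the nominal masses of the elements and adduct ions are standard values.

```python
# Nominal (integer) masses of the most abundant isotopes.
NOMINAL_MASS = {"C": 12, "H": 1, "N": 14, "O": 16}

# Nominal mass added to a neutral molecule M by each singly charged adduct.
ADDUCT_SHIFT = {"[M+H]+": 1, "[M+Na]+": 23, "[M+K]+": 39}


def nominal_mass(formula):
    """Nominal mass of a formula given as {element: count}."""
    return sum(NOMINAL_MASS[element] * count for element, count in formula.items())


def neutral_mass_candidates(mz):
    """Neutral nominal mass implied by a nominal m/z under each adduct."""
    return {adduct: mz - shift for adduct, shift in ADDUCT_SHIFT.items()}


hexitol = {"C": 6, "H": 14, "O": 6}  # C6H14O6, shared by mannitol and other hexitols
candidates = neutral_mass_candidates(205)
matches = [adduct for adduct, mass in candidates.items()
           if mass == nominal_mass(hexitol)]
print(nominal_mass(hexitol), candidates, matches)
```

Only the sodium adduct reconciles m/z 205 with a hexitol of nominal mass 182, which is consistent with the proposal in the text. Nominal mass alone cannot distinguish between the hexitol stereoisomers.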

No direct fragmentation data are available for the m/z 335 MS1 bin that was detected at higher abundance in endophyte-free tissue (Fig. 1, m/z 335). However, consideration of MS1 bins of adjacent masses suggested the ions of m/z 335 are mainly isotopologues of the ryegrass alkaloid perloline. The m/z 335 MS1 bin is highly correlated with the MS1 bins of m/z 333 (r = 0.86; r is Pearson's correlation coefficient hereafter) and 334 (r = 0.93). The major ion detected at m/z 333 in positive ESI MS is assigned as the anhydrocation perloline (C20H17N2O3+), with predicted isotopologue ions at m/z 334 (13C1 and 15N1) and m/z 335 (13C1, 15N1, 13C2, 15N2, and 18O1). The expected relative intensities of m/z 333, 334, and 335 are 1:0.24:0.03. The MS1 bins of m/z 333, 334, and 335 were observed to have a mean relative intensity of 1:0.37:0.07. The measured ratios show higher abundances for the higher mass ions than the theoretical prediction, which suggests ions from additional compounds have also been detected in the m/z 335 MS1 bin. Thus, although modeling isotopic distribution has been attempted for high resolution MS data (Bocker et al., 2006), for low resolution infusion MS data this may be confounded by interfering components isobaric with isotopologue peaks.
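The expected isotopologue ratios can be approximated with a short Python sketch that convolves the natural isotope distributions of the atoms in C20H17N2O3 and truncates the result at two mass units above the monoisotopic peak. The abundance table uses rounded literature values and is an assumption of this sketch, so the output should be close to, but not necessarily identical with, the 1:0.24:0.03 quoted in the text.

```python
# Natural isotope abundances as {nominal mass shift: fraction}; rounded values.
ABUNDANCE = {
    "C": {0: 0.9893, 1: 0.0107},
    "H": {0: 0.999885, 1: 0.000115},
    "N": {0: 0.99636, 1: 0.00364},
    "O": {0: 0.99757, 1: 0.00038, 2: 0.00205},
}


def convolve(p, q, max_shift):
    """Combine two mass-shift distributions, dropping shifts above max_shift."""
    out = [0.0] * (max_shift + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= max_shift:
                out[i + j] += a * b
    return out


def isotopologue_ratios(formula, max_shift=2):
    """Intensities of M, M+1, ..., M+max_shift relative to the monoisotopic peak."""
    dist = [1.0] + [0.0] * max_shift
    for element, count in formula.items():
        atom = [ABUNDANCE[element].get(shift, 0.0) for shift in range(max_shift + 1)]
        for _ in range(count):
            dist = convolve(dist, atom, max_shift)
    return [value / dist[0] for value in dist]


perloline_cation = {"C": 20, "H": 17, "N": 2, "O": 3}
expected = isotopologue_ratios(perloline_cation)
observed = [1.0, 0.37, 0.07]  # mean relative intensities reported in the text
excess = [o - e for o, e in zip(observed, expected)]
print([round(v, 3) for v in expected], [round(v, 3) for v in excess])
```

A positive excess at M+1 and M+2 is the pattern the text interprets as a contribution from additional compounds sharing those nominal bins.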

The m/z 554 MS1 bin is correlated with the m/z 555 MS1 bin with r = 0.72. MS2 product ion spectra of m/z 554 were available only from four E+ samples and were very similar to the MS2 product ion spectra of m/z 555 in E+ samples (m/z 555 represents a different compound in E− samples as noted above; see Fig. 2; Supplemental Fig. S4). For both MS1 bins, the MS2 spectra in E+ samples show a series of product ions with a higher m/z than the parent ion (Supplemental Fig. S4), suggesting this is a doubly charged ion. Manual examination of these ions showed that the parent ion occurred at m/z 554.5 and its monoisotopologue at m/z 555.1. Due to the limited precision of the mass spectrometer, the measured m/z varied between 554.22 and 554.55 and therefore caused binning problems with bins of unit m/z. The doubly charged state was confirmed by the occurrence of high mass product ions in the MS2 and MS3 data. There was a significant product ion of m/z 904.3 and several other high mass ions in the MS2 spectrum from m/z 554.5 and in the MS3 spectrum from its major MS2 product of m/z 516.7 (Supplemental Fig. S4, a and b). The exact structure of the compound remains to be elucidated, but the complex pattern of product ions suggests that it is a cyclic oligomer of amino acids.
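The charge-state reasoning can be expressed as a small Python sketch. A product ion observed well above the precursor m/z implies that the precursor carries more than one charge, and for an assumed doubly protonated ion the neutral mass follows from the precursor m/z. The tolerance, the assumption that both charges come from protons, and the function names are illustrative; the m/z values are those quoted in the text.

```python
PROTON_MASS = 1.00728  # Da


def has_products_above_precursor(precursor_mz, product_mzs, tolerance=1.0):
    """True if any product ion lies clearly above the precursor m/z, which
    implies a precursor charge of at least 2."""
    return any(mz > precursor_mz + tolerance for mz in product_mzs)


def neutral_mass(precursor_mz, charge):
    """Neutral mass M of an [M + zH]z+ ion observed at precursor_mz."""
    return charge * (precursor_mz - PROTON_MASS)


def isotopologue_spacing(charge):
    """Expected m/z spacing between adjacent isotopologue peaks."""
    return 1.0 / charge


precursor = 554.5
products = [516.7, 904.3]  # product ions quoted in the text
multiply_charged = has_products_above_precursor(precursor, products)
mass = neutral_mass(precursor, charge=2)
print(multiply_charged, round(mass), isotopologue_spacing(2))
```

Under these assumptions the neutral mass rounds to 1,107, in line with the doubly protonated adduct listed for this bin in Table I. The expected isotopologue spacing of 0.5 m/z for a doubly charged ion also explains why such an ion straddles unit-width bins.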

Differentially expressed ions in E+ versus E− immature leaves were observed in MS1 bins of m/z 209, 297, 223, and 230 (230 is also different in E+/E− sheaths). MS2 spectra for the m/z 297 MS1 bin were observed in only two samples. The dominant product ions were of m/z 104, 105, 237, and 238. This occurrence of pairs of fragment ions in the MS2 spectrum suggested that the m/z 297 MS1 bin comprised isotopologues of ions in the m/z 296 bin, and the m/z 297 MS1 bin was highly correlated with the m/z 296 MS1 bin with r = 0.82. The m/z 296 MS1 bin abundance is higher in endophyte-free samples and its MS2 spectrum showed a dominant m/z 104 ion as well as a clear m/z 59 loss. The MS3 spectrum of the m/z 104 ion showed a major fragment of m/z 60. These fragmentations are all highly indicative of a choline group (http://metlin.scripps.edu). This appears to be a novel compound, as no plant or fungal compounds with a corresponding mass and a choline group have been reported. One possibility is a 5-hydroxyferulyl analog of sinapine. We also remain uncertain about the chemical identity of the major metabolites detected in the m/z 209 and 223 MS1 bins, although MS2 product ion spectra (data not shown) were obtained in this study.

Correlation Network Analysis of MS1 Bins

Feature selection based on statistical ranking or machine learning algorithms is an important step in high-throughput data analysis. However, no golden rules exist for choosing a cutoff of P values or ranking scores and alternative approaches other than statistical ranking may be of use. Correlations among variables are sometimes considered as redundancy and often one feature (variable) is selected from a correlative group for further analysis (Zou and Hastie, 2005). However, correlations between MS1 bins deriving from different metabolites may provide insights into the functional dependency of these metabolites.


With the aid of network analysis tools (Carey et al., 2005), correlation (or relevance) networks (Butte et al., 2000) among all the measured MS1 bins in all the samples were investigated. The correlation networks constructed in this way should reveal when a group of MS1 bins exhibit the same pattern of relative concentration (ion abundance) across samples. Based on the criterion of r > 0.9, many isolated correlation units were found to be composed of MS1 bins of adjacent mass, which are likely to be due to natural isotopologues. A large highly connected subgraph (clique) in the network was identified with 12 nodes (Fig. 4). In this subgraph component, some correlations between these MS1 bins are due to naturally occurring isotopes. The m/z 543, 544, 545, and 546 MS1 bins belong to one group, the m/z 705 and 706 MS1 bins to another, and m/z 867, 868, 869, and 870 MS1 bins to a third group. That these subgroupings (highlighted by different gray scale in Fig. 4) are due to isotopologues was also supported by MS2 spectral information, as within each group the MS2 spectra for the higher mass MS1 bins were similar to that from the lowest mass (monoisotopic) MS1 bin but with additional isotopologous fragments. The core correlation is between the three most intense MS1 bins of the lowest mass in each group, corresponding to the monoisotopic ions (m/z: 543, 705, and 867; see Table I).
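The construction of a correlation network with a threshold of r > 0.9 can be sketched as follows. This is an illustration only, assuming a dictionary that maps each MS1 bin to its abundances across samples; the functions `pearson`, `correlation_network`, and `connected_components` and the toy abundance values are hypothetical and stand in for the network analysis tools cited above. Connected components are used here as a simpler substitute for clique detection.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def correlation_network(bins, threshold=0.9):
    """Link every pair of MS1 bins whose abundance profiles have r > threshold."""
    graph = {mz: set() for mz in bins}
    for a, b in combinations(bins, 2):
        if pearson(bins[a], bins[b]) > threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Group bins that are linked directly or through intermediate bins."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Made-up abundances across six samples, for illustration only.
toy = {
    543: [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    544: [0.2, 0.4, 0.7, 0.8, 1.0, 1.2],
    705: [2.1, 3.9, 6.2, 8.0, 9.8, 12.1],
    248: [5.0, 1.0, 4.0, 2.0, 6.0, 3.0],
}
components = connected_components(correlation_network(toy))
print(components)
```

In this toy example the m/z 543, 544, and 705 bins fall into one component and m/z 248 remains isolated, mirroring the way isotopologue and oligomer bins cluster in the network described above.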

These three MS1 bins differ by a mass of 162, and this corresponds to the mass of a hexose (180) − H2O. The MS2 and MS3 data showed consecutive losses of 162 from the high mass ions (Supplemental Fig. S5a). Ions of these m/z ratios have been reported by Enot et al. (2006) for potassiated fructan oligomers detected in DIMS of genetically modified potatoes (Solanum tuberosum; [504K]+, [666K]+, and [828K]+; potassiated trihexose, tetrahexose, and pentahexose, respectively). Perennial ryegrass is known to produce a range of fructan oligomers from degree of polymerization (DP) 3 to >8 (Pavis et al., 2001), and the identification of these MS1 bins as deriving from fructan oligomers was supported by comparison of MS2 and MS3 spectra with those obtained by infusion and MS/MS analysis of aqueous KCl solutions of 1-kestose, 1,1-tetrakestose, and 1,1,1-pentakestose (Supplemental Fig. S5b).

We considered the possibility that the two other MS1 bins m/z 779 and 941 in the correlation network might derive from glycerol adducts of the oligosaccharides, as they differed in mass from the m/z 705 and 867 species by 74 units. Glycosylglycerides have recently been reported by Yamamoto et al. (2006) as synthetic products of kojibiose phosphorylase from Thermoanaerobacter brockii. However, the MS2 spectrum of the m/z 779 and 941 species in the plant extracts showed in each case only a neutral loss of 74, while synthetic glucosyl-, maltosyl-, and maltotriosylglycerol showed neutral losses of 92 and 162 on MS/MS analysis. Further, m/z 779 and 941 ions were also observed in the DIMSn profiles of 1,1-tetrakestose and 1,1,1-pentakestose, respectively, in aqueous KCl (above), and were accompanied by ions of lower intensity of m/z 781 and 943, respectively, in the DIMSn profiles of both the standard solutions and plant extracts. These higher mass ions fragmented with a sole neutral loss of 76. In each case the product ion underwent subsequent MS fragmentation as for the potassiated oligosaccharide. As neutral losses of 74 and 76 correspond to the two chlorine isotopologues of KCl, we conclude the m/z 779 and 941 species are KCl cluster adducts of the potassiated oligosaccharides of m/z 705 and 867. Similar adduct ions were present for the potassiated trihexose, although they were not detected as part of the correlation network.
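The nominal mass arithmetic behind these assignments can be checked with a short sketch. It uses integer nominal masses only (39K, 35Cl, 37Cl, hexose 180, water 18); the function `oligohexose_mass` and the variable names are hypothetical and introduced for illustration.

```python
# Nominal (integer) masses; a simplification suited to unit m/z data.
K, CL35, CL37 = 39, 35, 37
HEXOSE, WATER = 180, 18

# Each added residue contributes a hexose minus one water: 180 - 18 = 162.
ANHYDROHEXOSE = HEXOSE - WATER


def oligohexose_mass(dp):
    """Nominal mass of an oligohexose with dp residues."""
    return dp * ANHYDROHEXOSE + WATER


# Potassiated tri-, tetra-, and pentahexose: [M + K]+.
potassiated = {dp: oligohexose_mass(dp) + K for dp in (3, 4, 5)}

# Neutral KCl with either chlorine isotope.
kcl_losses = (K + CL35, K + CL37)

# KCl cluster adducts of each potassiated oligohexose.
kcl_adducts = {
    dp: (mz + kcl_losses[0], mz + kcl_losses[1])
    for dp, mz in potassiated.items()
}

print(potassiated)   # {3: 543, 4: 705, 5: 867}
print(kcl_losses)    # (74, 76)
print(kcl_adducts)   # {3: (617, 619), 4: (779, 781), 5: (941, 943)}
```

The pairs 779/781 and 941/943 match the ions reported above, and the 74/76 spacing distinguishes a KCl adduct from a glycerol adduct, which would be expected to show a neutral loss of 92.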

The levels of the oligohexoses were low in the blades and high in immature tissue and present at intermediate levels in sheath tissue (Fig. 5). When considering the endophyte effect, concentrations of the oligosaccharides were significantly higher in the endophyte-infected sheaths by a simple t test, with P values of 0.0082, 0.0048, and 0.0075 for the m/z 543, 705, and 867 MS1 bins, respectively. However, there were no significant (P values > 0.05) endophyte infection effects in immature leaf and blade.

DISCUSSION

Metabolite Identification and Measurement

The identification of metabolites from raw signals detected by mass spectrometers is a challenging task in metabolomics that still demands considerable effort (Schauer and Fernie, 2006). Two types of MS technologies are in general use for high-throughput metabolomics analyses. One type is high-accuracy MS using ICR-MS or time-of-flight MS (Dettmer et al., 2007). Highly accurate m/z measurements narrow down the search space by providing a short list of chemical formulae (elemental compositions), although without additional isotope abundance data this list may remain extensive (Kind and Fiehn, 2006). The other type exploited in metabolomics is tandem MS (MS/MS or MSn) using an ion trap that can provide fragmentation data on the initial MS1 ions, although in practice this capacity is often foregone (Enot et al., 2006). As demonstrated here, these fragmentation data are useful for the elucidation and classification of chemical structures. Technologies combining these two features, e.g. ion trap with FT-ICR-MS, are also available to provide high mass resolution MS1 profiles and information on fragmentation patterns. However, applying this combination to obtain high resolution data on both MS1 and MS2 ions for large numbers of samples is impractical due to the slow scan speeds required by the FT-ICR-MS. All of these technologies generate complex data sets and there are many steps in translating raw signals into chemical entities. With access to the raw data now readily available in standard formats such as mzXML (Pedrioli et al., 2004), novel or improved algorithms can be employed in many steps of data analysis, such as data preprocessing (including baseline detection and removal, peak detection, peak or retention time alignment, etc.) and inference of the identity of components (e.g. Listgarten and Emili, 2005; for a recent review of LC-MS data processing for metabolomics and currently available software, see Katajamaa and Oresic, 2007).

Figure 4. A group of highly correlated MS1 ions identified by correlation network analysis. The ions with the same gray scale are the same metabolites with different m/z due to the presence of natural isotopologues or salt adducts.

We recently applied direct-infusion ESI MS using DIMSn to determine metabolic differences between endophyte-infected and endophyte-free ryegrass seed samples (Koulman et al., 2007b). In this study, we have extended this approach by developing tools to analyze the raw MS data from DIMSn and to compare MS2 spectra derived from the same parent MS1 bin from different samples. This has enabled us to handle the collected data appropriately. Rather than seeking alignment in the time domain as addressed by programs such as XCMS and MZmine, our software bins the data into unit m/z bins and finds the median intensity in each m/z bin over the course of the infusion. The challenges of mass alignment and binning of data collected at high mass resolution (e.g. Hansen and Smedsgaard, 2004) are also much less for data collected at unit m/z resolution. Developing methods for automating the handling of MS2 data has enabled us to determine if the assignment of MS1 signal variations to treatment effects on a specific metabolite is justified, as it has allowed us to distinguish whether these MS1 signals were derived from the same metabolite(s) in all the samples or whether there were different metabolites in different samples detected within the same MS1 bin. For chemical identification, an MS2 product ion spectrum derived from an isotopically and chemically homogeneous MS1 bin is desirable. An MS2 spectrum derived from an MS1 bin comprising a mixture of isotopologue ions will show an anomalous pattern, as noted above for the m/z 297 MS1 bin. Thus prior to investigating the MS2 data, it is useful to screen candidate MS1 bins for the existence of highly correlated MS1 bins of adjacent mass and higher intensity that are likely candidates for the monoisotopic species. Correlation analysis can also reveal the presence of ESI adducts such as the [K2Cl]+ cluster adducts reported here. The development of software to automate the discovery of isotopologues and adducts is an area of current active research (Tautenhahn et al., 2007). While facilities for comparison of MSn data (ion trees) and construction of libraries are available in commercial software (e.g. MassFrontier), further refinement and extension of the tool developed here to automate the construction of MSn libraries to facilitate metabolite identification would be valuable for the analysis of DIMSn data.
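The screening step suggested above, checking whether a candidate MS1 bin has a more intense and highly correlated neighbor one unit lower in mass, might be sketched as follows. The function `likely_isotopologue_of`, the threshold `r_min`, and the toy values are hypothetical; the sketch assumes singly charged ions, so that isotopologues are spaced by one nominal m/z unit.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def likely_isotopologue_of(bins, mz, r_min=0.8):
    """Return the adjacent lower bin if it is more intense and well correlated.

    bins maps nominal m/z to abundances across samples. Returns None when no
    such neighbor exists. r_min is an illustrative threshold, not a value
    prescribed by the study.
    """
    lower = mz - 1
    if lower not in bins:
        return None
    if mean(bins[lower]) <= mean(bins[mz]):
        return None
    if pearson(bins[lower], bins[mz]) < r_min:
        return None
    return lower


# Made-up abundances across four samples, for illustration only.
toy = {
    296: [10.0, 20.0, 30.0, 40.0],
    297: [2.1, 3.8, 6.2, 7.9],
    222: [50.0, 10.0, 40.0, 20.0],
    223: [1.0, 2.0, 3.0, 4.0],
}
flags = {mz: likely_isotopologue_of(toy, mz) for mz in toy}
print(flags)
```

Bins flagged in this way would be deprioritized for MS2 interpretation in favor of the putative monoisotopic bin. A doubly charged species such as the m/z 554.5 ion would need a different spacing and is not handled by this sketch.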

Metabolites have diverse physical and chemical properties and a wide range of concentrations in a biological system, so any single analytical technique cannot detect all the metabolites of biological relevance. Using DIMSn, we have identified or classified a number of metabolites known to be present in endophyte-infected grass samples such as peramine and a sugar alcohol putatively annotated as mannitol (but we cannot exclude other sugar alcohols). Other well-known metabolites, e.g. the alkaloids lolitrem B and ergovaline, although of high biological significance were not detected in this experiment. Indolediterpenes like lolitrem B are lipophilic and not sufficiently extracted by the extraction solvents used in this study. Ergopeptides (like ergovaline) are usually present at very low concentrations in the symbiotum and their MS signals are within the noise range of DIMSn data. Therefore, the analysis of these classes of metabolites requires the deployment of dedicated approaches (Lehner et al., 2005; Spiering et al., 2005). In this study we only used positive ionization and only one type of extraction procedure. We believe that alternative extraction procedures and ionization methods would deliver additional information on other classes of metabolites.

Figure 5. A plot of raw ion abundance of five correlated MS1 bins (represented by m/z 543, 705, 867, 779, and 941) showing that these metabolites accumulate to higher levels in immature leaves, lower levels in blade, and intermediate levels in sheath tissue.

Quantitation in infusion ESI MS is subject to signal suppression or enhancement in the source (see Dettmer et al., 2007), and ionization from an infused mixture is likely to be selective and dependent on the ability of a molecule to capture a charge in the source. Thus the detection of species such as the peramine and perloline cations and the doubly charged putative peptide ion is not unexpected, and other species less prone to forming cations are likely to have been underrepresented in the profile. Developments in nanoscale ESI may provide improved performance in this regard. The experiments described here were carried out with a standard capillary and flow rates of 5 μL/min. Nanospray technology utilizing very low flow rates (<20 nL/min) can reduce and perhaps eliminate analyte suppression (Schmidt et al., 2003) and may provide a less biased profile. An implementation in a microchip-mounted microfluidic device has shown promise for drug metabolite discovery (Trunzer et al., 2007) and may provide advantages for metabolomics.

The other factor confounding quantitation in ion trap DIMSn is the presence of multiple components within each 1 m/z bin. Thus, while the endophyte effect on the m/z 248 MS1 bin can be attributed to peramine, the differences in intensity between tissue types (Fig. 1) appear to derive from the unknown plant components also detected in this bin. Concentrations of peramine in these samples estimated by HPLC with photodiode array detection (L. Johnson, unpublished data) were similar in the three tissue types as reported by Spiering et al. (2005).

Although we have clearly demonstrated the usefulness of fragmentation data for the classification and structural elucidation of metabolites, the method is of limited use for characterizing metabolites that do not fragment (e.g. m/z 230, 189). For such metabolites the MS1 data provide a lead to further investigation using other methods. Indeed for all putative novel metabolites, additional data such as accurate mass MS, and targeted isolation and structure elucidation by, for example, NMR spectroscopy are necessary for their complete chemical characterization.

Biological Implications of Identified Metabolites

Several metabolites identified in this study have interesting implications for the metabolic regulation of the perennial ryegrass-endophyte symbiotum. As discussed in detail above, the hexitol (m/z 205) present in endophyte-infected plants only is probably mannitol. Mannitol appears to be a very common polyol in fungi (Lewis and Smith, 1967) and has been reported to accumulate in endophyte-infected tall fescue (Richardson et al., 1992) and perennial ryegrass plants (Harwood, 1954; Johnson et al., 2006; Rasmussen et al., 2008). Although mannitol has been implicated as an osmoprotectant in the resurrection plant Myrothamnus flabellifolia (Bianchi et al., 1993) and in transgenic Arabidopsis (Arabidopsis thaliana) expressing a celery (Apium graveolens) Man-6-P reductase (Sickler et al., 2007) as well as in Nicotiana tabacum expressing a mannitol-1-P dehydrogenase (Karakas et al., 1997), a study in tall fescue indicates that mannitol levels in endophyte-infected plants are not increased under drought stress (Richardson et al., 1992). A recent review (Solomon et al., 2007) also questions this role for mannitol as well as other claimed functions like fungal carbohydrate storage (Voegele et al., 2005) or NADPH regeneration (Hult and Gatenbeck, 1978; Hult et al., 1980). Solomon et al. (2007) conclude that the role and requirements for mannitol seem to differ depending on the species of fungus. In Aspergillus niger, mannitol is involved in conidial oxidative and high temperature stress protection (Ruijter et al., 2003), and in the wheat (Triticum aestivum) pathogen Stagonospora nodorum it is required for asexual sporulation (Solomon et al., 2006). Clearly, more studies are needed to understand the function of mannitol in endophyte-infected grasses.

Peramine (m/z 248) has been shown to be the likely agent conferring improved insect resistance on endophyte-infected plants, affecting Argentine stem weevil and a range of other insects (Rowan and Gaynor, 1986; Latch, 1993; Rowan, 1993; Rowan and Latch, 1994). Recently, it was shown that peramine is produced by an endophyte-specific two-module nonribosomal peptide synthetase (perA) and that an Epichloë festucae mutant deleted for perA lacks detectable levels of peramine (Tanaka et al., 2005). It was also shown that plant material containing this mutant endophyte was as susceptible to Argentine stem weevil feeding as endophyte-free plants, demonstrating unambiguously that peramine confers resistance to this insect. Peramine is also the most abundant alkaloid produced by this endophyte in infected plants; its concentration is usually an order of magnitude higher than for the other endophyte-specific alkaloids (Spiering et al., 2005) and it is detectable in plant fluids from cut leaf and in guttation fluid (Koulman et al., 2007a).

Perloline (m/z 333), a diazaphenanthrene alkaloid, is produced by the grass plant and has been isolated from both ryegrass and tall fescue plants (Grimmett and Waters, 1943; Jeffreys, 1964; Bush and Jeffreys, 1975). Our results suggest that perloline concentrations are reduced in endophyte-infected mature blades and sheaths; the mechanism for this effect remains to be elucidated. Not much is known about the biosynthesis or function of perloline, although it has been implicated in effects on fall armyworm (Spodoptera frugiperda JE Smith) performance (Salminen et al., 2005) and in stimulating prolactin secretion in rats (Strickland et al., 1992). Earlier reports on the function of perloline, such as causing ryegrass staggers, are questionable due to the possible presence of endophytes, unknown at the time, in the studied material (Fairbourn, 1962; Aasen et al., 1969).


The putative oligopeptide (m/z 554.5) identified in this study accumulates exclusively in endophyte-infected tissues and is therefore most probably an endophyte-produced metabolite. Recently, a novel cyclic peptide, epichlicin, inhibiting spore germination of Cladosporium phlei, a pathogenic fungus of timothy grass (Phleum pratense), was isolated from timothy grass infected with Epichloë typhina (Seto et al., 2007), a relative of N. lolii investigated in this study. Oligopeptides have been isolated from fungal endophytes previously, e.g. leucinostatin A, a phytotoxic, anticancer, and antifungal peptide (Arai et al., 1973), and from Acremonium sp., a fungus infecting Taxus baccata (Strobel et al., 1997). This mycotoxin causes necrotic symptoms in nonhost plants, presumably because these plants, unlike T. baccata, are not able to transform it into the less toxic leucinostatin A-β-di-O-glucoside (Strobel and Hess, 1997; Tan and Zou, 2001). The antimicrobial cyclic echinocandin peptides have been isolated from endophytic Cryptosporiopsis sp. and Pezicula sp. in Pinus sylvestris and Fagus sylvatica (Noble et al., 1991) and the antifungal cyclopeptide cryptocandin from the endophytic Cryptosporiopsis cf. quercina of redwood (Strobel et al., 1999). We are currently isolating the oligopeptide identified in this study from endophyte-infected perennial ryegrass to elucidate its structure and to test its potential antimicrobial activity.

Correlation Network Analysis and Its Biological Implications

Correlation analysis of metabolites has been used previously to explore the functional dependency of metabolites, and it was shown that this type of analysis allowed, for example, the reconstruction of the metabolic pathway leading to the biosynthesis of glucosinolates in Arabidopsis (Keurentjes et al., 2006). It has been proposed that the construction of correlation networks based on metabolic fingerprinting might help to uncover underlying enzymatic reaction networks (Steuer et al., 2003), although other origins of correlations between metabolites within a physiological state have also been discussed (Camacho et al., 2005). However, in this study comparing different plant tissues and endophyte infection status, physiological differences are likely to be the dominant factor.

In this study we have identified three MS1 bins representing monoisotopic ions of different metabolites that correlate significantly in our sample set. Mass fragmentation indicates that these metabolites are potassiated tri-, tetra-, and pentahexosides and their identification as fructans of DP 3, 4, and 5 was supported by the comparison with DIMSn of solutions of standards in aqueous KCl. Many cool-season C3 grasses accumulate fructans (Suc derived Fru polymers) as storage carbohydrates in their vegetative tissue, especially in mature sheaths (Pollock and Cairns, 1991). The genera Lolium and Festuca accumulate appreciable amounts of low DP fructans belonging to the inulin series, inulin neoseries, and levan neoseries, which differ in the position of Glc (terminal or internal) and the linkage type of Fru residues (β2,1 or β2,6; Pollock, 1982; Pavis et al., 2001). It was also shown that the proportion of low DP fructans (DP < 6) was more prominent in bases of elongating leaves than in leaf sheaths, and that mature leaf blades accumulate predominantly 6G-kestotriose and 1- and 6G-kestotetraose. The limited MSn data obtained here do not provide direct information on linkage and branching type of these Fru-containing polymers, but differences in the relative intensity of product ions in the MS2 and MS3 spectra between extracts and standards (Supplemental Fig. S5, a and b) may reflect the mixed isomer composition of the plant fructans. Recent developments in the elucidation of carbohydrate structures by MSn without chemical derivatization (e.g. Fang and Bendiak, 2007) suggest more extensive MSn analysis may provide additional structural information. As was shown previously (Rasmussen et al., 2007, 2008), endophyte infection resulted in higher levels of some of the sugars, which might indicate that the increased sink strength in the infected tissue results in a higher turnover of high DP fructans with a concomitant increase in low DP oligosaccharides. Although the exact mechanism for this pattern of accumulation remains to be elucidated, it has been documented (for review, see Chalmers et al., 2005) that the base of the youngest leaves and the sheath of the more mature leaves represent the organs where fructosyltransferase activities, fructan accumulation, and remobilization in perennial ryegrass are most active, and where several fructan metabolism genes are expressed.

CONCLUSION

Our results extend our current knowledge on the metabolites involved in the symbiosis of the fungus N. lolii and its host perennial ryegrass. Using unbiased metabolite profiling (DIMSn) and advanced data-mining strategies, we have been able to uncover a number of unexpected perturbations of the metabolome upon endophyte infection. Based on the MS1 spectra we have found several metabolites that were significantly different between endophyte-infected and endophyte-free samples. New methods for automated processing of MS2 data have proved useful in detecting whether ions in a unit m/z MS1 bin represent a single major component across a sample set, or a heterogeneous mixture. With the aid of the MS2 product ion spectra we could readily identify some MS1 ions at the top of the list such as the known metabolites peramine and mannitol. The analysis has also revealed some new metabolites that are present in endophyte-infected plants, such as putative cyclic oligopeptides, and plant compounds present at reduced levels in infected plants such as a novel putative choline derivative. The identification of unknown MS1 bins as being statistically significantly different in uninfected compared to infected tissues also provides justification for their further characterization using more targeted approaches.


Linear correlation network analysis revealed the effect of the endophyte on a range of oligosaccharides, giving us new clues on how the endophyte utilizes plant carbohydrates. The methodology has proved to be a powerful tool for discovering leads to novel chemistry associated with the symbiosis, and these demand further chemical and biological investigation.

MATERIALS AND METHODS

Experimental Design and Sampling

Clonal perennial ryegrass plants (Lolium perenne 'Nui'), either infected with the fungal endophyte Neotyphodium lolii (strain Lp19) or endophyte free, were used in this study. Endophyte-free perennial ryegrass was obtained as described by Tanaka et al. (2005). A 2 × 3 factorial design was applied with endophyte-infected (E+) and endophyte-free (E−) plants, and three tissue types, namely, immature leaf, blade, and sheath. Four individual tillers, either all E+ or all E− perennial ryegrass, were planted in pots, to give four replicate pots of E+ and E− material. The three tissue types were dissected from each of the plants in each pot and pooled for analysis (see also Supplemental Table S1 for sample description). Thus, 24 ryegrass samples in total were examined in this study. These plants were grown in a controlled environment chamber with 14-h daylength (653 μmol m−2 s−1 of light intensity), a temperature of 20°C day/10°C night, and supplied with a modified Hoagland nutrient solution. The tissue samples were harvested and immediately frozen in liquid nitrogen, and stored at −80°C for subsequent analysis.

Direct-Infusion MS

Plant tissue samples were ground using pestle and mortar in liquid nitrogen and stored at −80°C. Fifty milligrams of ground samples were extracted with 1.5 mL of MeOH. The extract was partitioned between water and dichloromethane. The aqueous phase was lyophilized and redissolved in 1.5 mL of MeOH. The infusion solvent (MeOH) was pumped at 20 μL min−1 flow to a T junction just in front of the ESI source where 5 μL min−1 MeOH with 2% formic acid was added. A 100-μL aliquot of each sample was injected using an autosampler. After 10 min a MeOH blank was injected and run at 200 μL min−1 flow rate for 3 min.

A linear ion trap mass spectrometer (Thermo LTQ) coupled to a Thermo Finnigan Surveyor HPLC system was used. Thermo Finnigan Xcalibur software (version 1.4) was used for data acquisition. The mass spectrometer was set for ESI in positive mode. Samples were infused through a polyimide-coated glass capillary (0.1 mm i.d., 0.19 mm o.d.) at a flow rate of 5 μL/min. The spray voltage was 5.0 kV and the capillary temperature 275°C. The ion optics were tuned using paxilline. The flow rates of sheath gas, auxiliary gas, and sweep gas were set (in arbitrary units/min) to 20, 5, and 12, respectively. For the first 0.9 min after injection only MS1 spectra were recorded; for the period from 0.9 to 10 min the mass spectrometer was set up in data-dependent mode to collect one MS1 spectrum, followed by the isolation (2 m/z) and fragmentation (35% CE; relative collision energy) of the most intense ion from the MS1 spectrum, followed by the isolation (2 m/z) and fragmentation (35% CE) of the most intense ion from the MS2 spectrum. A new MS1 spectrum was then recorded, followed by the repetitive isolation (2 m/z) and fragmentation (35% CE) of the most intense ions from that MS1 spectrum and the most intense MS2 product ion. When an MS1 ion with a specific mass had been isolated and fragmented for the second time, it was placed on an exclusion list for the duration of the run. In total up to approximately 200 MS2 spectra were recorded in an average run.

Samples of glucosyl-, maltosyl-, and maltotriosylglycerol (provided by H. Nakano, Osaka Municipal Technical Research Institute, Osaka) and samples of standard 1-kestose, 1,1-tetrakestose, and 1,1,1-pentakestose (Megazyme International Ireland Ltd.; 4 mg mL−1) in aqueous KCl (50 mM) were infused and analyzed under similar conditions.

Data Analysis

The raw data (Xcalibur raw file, in centroid mode) were converted into mzXML data format (Pedrioli et al., 2004) using the software ReAdW available from http://sashimi.sourceforge.net/software_glossolalia.html. mzXML is a standard data format for tandem mass spectrometric data and relatively easy to manipulate.

MS1 Data Analysis

All the MS1 scans were retrieved from mzXML and the original data could also be checked manually using the Thermo proprietary software Xcalibur. Given the 1 m/z resolution of the machine, we binned the data to unit nominal m/z. Thus in each sample, for a given ion, for example, m/z 248, all the measured m/z values (e.g. 247.98, 248.12, 248.35) were rounded to an integer value of 248 (equivalent to binning with 1 m/z width), and the median (rather than the average) of the abundance of corresponding ions within a sample was taken as a robust statistical estimate to reduce potential rounding (or binning) artefacts. The resulting bins are described here as MS1 bins. For each MS1 bin, the first 300 scans (see Supplemental Fig. S6) were removed because they were background signals from solvent (noise). For each MS1 bin, the median value of these first 300 scans was then used as a baseline value, and subtracted from the ion abundance values for subsequent scans. Any negative values (below the background noise) and weak signals (less than the 10% quantile) were removed. The median value of the ion abundance of all the remaining scans in each sample was used as the representative MS1 bin abundance for that sample. The ion abundance values for MS1 bins over the range m/z 150 to 1,000 were determined for each sample to generate an MS1 data matrix of 24 × 851 for statistical analysis.
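The MS1 preprocessing steps described above can be summarized in a minimal sketch. It assumes each sample is available as a list of scans in acquisition order, each scan being a list of (m/z, intensity) peaks; the functions `quantile` and `ms1_bin_abundance` and the toy data are hypothetical and are not the software developed for this study. In particular, applying the 10% quantile to the baseline-corrected positive values is one reading of the text, and Python's `round` resolves exact .5 values to the nearest even integer.

```python
from collections import defaultdict
from statistics import median


def quantile(values, q):
    """Quantile with linear interpolation between order statistics."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def ms1_bin_abundance(scans, n_background=300, weak_quantile=0.10):
    """Return {nominal m/z: representative abundance} for one sample."""
    background = defaultdict(list)
    signal = defaultdict(list)
    for index, peaks in enumerate(scans):
        # The first n_background scans are treated as solvent background.
        target = background if index < n_background else signal
        for mz, intensity in peaks:
            target[round(mz)].append(intensity)  # bin to unit nominal m/z

    result = {}
    for mz_bin, values in signal.items():
        noise = background.get(mz_bin, [])
        baseline = median(noise) if noise else 0.0
        # Subtract the baseline and drop values at or below the background.
        corrected = [v - baseline for v in values if v - baseline > 0]
        if not corrected:
            continue
        # Drop weak signals below the chosen quantile, then take the median.
        cutoff = quantile(corrected, weak_quantile)
        kept = [v for v in corrected if v >= cutoff]
        result[mz_bin] = median(kept)
    return result


# Toy sample: two background scans followed by five signal scans.
toy_scans = [
    [(247.98, 10.0)], [(248.12, 12.0)],
    [(248.35, 111.0)], [(248.12, 61.0)], [(247.98, 211.0)],
    [(248.20, 5.0)], [(248.05, 161.0)],
]
abundance = ms1_bin_abundance(toy_scans, n_background=2)
print(abundance)
```

Repeating this for each of the 24 samples over m/z 150 to 1,000 would give the rows of a sample-by-bin matrix like the one described above.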

The abundance of each MS1 bin was normalized against the median of the

observations in all the samples, using log2[x(i)/median(x)], where vector x

comprises the abundance measurements of each MS1 bin, and x(i) is the

abundance in each individual sample, with i from 1 to 24.
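A minimal Python sketch of this normalization for a single MS1 bin is given below. It is not the authors' R code. The function name normalize_bin is hypothetical, and the sketch assumes that every abundance is strictly positive; how missing or zero values were handled is not stated in the text.

```python
from math import log2
from statistics import median


def normalize_bin(abundances):
    """Return log2(x_i / median(x)) for one MS1 bin across all samples.

    Assumption: all abundances are strictly positive.
    """
    centre = median(abundances)
    return [log2(x / centre) for x in abundances]
```

After this step a value of 0 marks a sample at the median of the bin, and a value of 1 marks a sample with twice the median abundance.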

Empirical Bayes moderated t statistics (Smyth, 2004) were applied to

identify MS1 ions that were differentially expressed between E+ and E−

across the three developmental stages (immature leaf, blade, and sheath). The

algorithms are implemented and available as the R package limma (Smyth, 2004).

MS1 data have been provided as Supplemental Data Set S1.
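For reference, the moderated t statistic of Smyth (2004) can be written as below. The notation is chosen here for illustration: g indexes an MS1 bin, \hat{\beta}_g is the estimated E+ versus E− contrast, s_g^2 is the residual variance with d_g degrees of freedom, v_g is the unscaled variance factor of the contrast, and s_0^2 and d_0 are the prior variance and prior degrees of freedom estimated from all bins.

```latex
\tilde{s}_g^{2} = \frac{d_0\, s_0^{2} + d_g\, s_g^{2}}{d_0 + d_g},
\qquad
\tilde{t}_g = \frac{\hat{\beta}_g}{\tilde{s}_g \sqrt{v_g}}
```

Under the null hypothesis the statistic follows a t distribution with d_0 + d_g degrees of freedom, so bins measured in few samples borrow variance information from the whole data set.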

MS2 Data Analysis

All the MS2 data were retrieved by querying the mzXML data. MS2 spectra

derived from an MS1 bin in one sample (e.g. 248.35, 248.39; see Supplemental

Fig. S7) were merged by rounding to nominal m/z MS2 bins and assigning the

median abundance value for each bin. For each sample, only the top 20 most

abundant ions (MS2 bins) derived from an MS1 bin were retained for compar-

isons in this report. For a given parent MS1 bin, all derived MS2 spectra in each

sample, if available, were pairwise compared based on a customized distance

metric. The measurement of similarity of MS2 spectra is complicated not only by

variation in ion abundance, but also by the occurrence of different fragment

ions in spectra from the same MS1 bin in different samples and by carry over

from adjacent MS1 bins of high intensity (as for isotopologues) as the isolation

width of the ion trap was set to a range of 2 m/z for efficient capture and

fragmentation of ions. The procedure for the distance metric of two MS2 spectra

is as follows: (1) normalize each spectrum to its maximum abundance value

and sort each spectrum based on its intensity, up to 20 most intense fragment

ions are retained; (2) if the number of MS2 bins is still different, low intensity

ions in the longer spectrum are truncated; (3) for the matched nominal ions

(MS2 bins of same nominal m/z) occurring in both spectra, the Manhattan

distance $d = \sum_{i=1}^{k} |x_i - y_i|$ is calculated, where the two spectra are denoted as

$x = (x_1, \ldots, x_m)$ and $y = (y_1, \ldots, y_n)$, and $k$ is the number of shared MS2 bins [with $m$,

$n \le 20$; $k < \min(m, n)$]. For any unmatched MS2 bins, the normalized intensity is

added to the distance.
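The three steps of the distance calculation can be sketched in Python as follows. This is one possible reading of the procedure, not the authors' R implementation. Spectra are assumed to be dictionaries that map a nominal MS2 m/z to its abundance, and the function names (prepare, ms2_distance) are hypothetical.

```python
def prepare(spectrum, top_n=20):
    """Step 1: scale to the most abundant ion and keep the top_n fragments.

    Returns (m/z, relative intensity) pairs, most intense first.
    """
    base = max(spectrum.values())
    ranked = sorted(((mz, a / base) for mz, a in spectrum.items()),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:top_n]


def ms2_distance(spec_x, spec_y, top_n=20):
    """Manhattan distance on shared bins plus intensities of unmatched bins."""
    x, y = prepare(spec_x, top_n), prepare(spec_y, top_n)
    size = min(len(x), len(y))           # step 2: truncate the longer spectrum
    x, y = dict(x[:size]), dict(y[:size])
    shared = x.keys() & y.keys()         # step 3: bins with the same nominal m/z
    distance = sum(abs(x[mz] - y[mz]) for mz in shared)
    distance += sum(v for mz, v in x.items() if mz not in shared)
    distance += sum(v for mz, v in y.items() if mz not in shared)
    return distance
```

Two identical spectra give a distance of 0, and each fragment ion present in only one spectrum adds its full normalized intensity, so spectra from a heterogeneous MS1 bin end up far apart.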

Step 3 provides a simplified way for m/z alignment. A similar idea was also

employed by Zhang et al. (2005). The calculated pairwise distances for all

samples were used for clustering analysis.

All the software functions for handling and analysis of MS1 and MS2 data,

and for correlation network analysis, were written in R 2.5 (R Development Core

Team, 2007) based on a number of R packages.

Supplemental Data

The following materials are available in the online version of this article.

Supplemental Figure S1. PCA analysis of normalized MS1 data.

Supplemental Figure S2. Abundance of MS1 bins in treatment groups.

Plant Physiol. Vol. 146, 2008 1511 www.plantphysiol.org

Copyright © 2008 American Society of Plant Biologists. All rights reserved.

Supplemental Figure S3. MS2 spectra of parent MS1 m/z 205 [182+Na]+ and

mannitol standard.

Supplemental Figure S4. MS2 and MS3 spectra of parent m/z 555.

Supplemental Figure S5. MS2 and MS3 spectra of potassium adducts of

oligosaccharide and oligokestose standards.

Supplemental Figure S6. Intensity plot of the nominal ion m/z 248 across

all scans.

Supplemental Figure S7. Spectral processing of MS2 using nominal MS1

ion m/z 248.

Supplemental Table S1. Designation and description of experimental

material.

Supplemental Data Set S1. MS1 data matrix.

ACKNOWLEDGMENTS

We acknowledge Karl Fraser for the operation and maintenance of the mass

spectrometer and the DIMSn analysis of fructan standards; Mike Christensen

and Catherine Tootil for the maintenance of plant materials; and Hirofumi

Nakano at Osaka Municipal Technical Research Institute, Osaka, for providing

synthetic glucosyl-, maltosyl-, and maltotriosylglycerol. We appreciate Drs.

Brian Tapper and Silas Villas-Boas for reviewing the manuscript and providing

useful aspects for discussion.

Received November 4, 2007; accepted February 18, 2008; published February

20, 2008.

LITERATURE CITED

Aasen AJ, Culvenor CCJ, Finnie EP, Kellock AW, Smith LW (1969)

Alkaloids as a possible cause of ryegrass staggers in grazing livestock.

Aust J Agric Res 20: 71–86

Aharoni A, de Vos R, Verhoeven H, Maliepaard C, Kruppa G, Bino R,

Goodenowe D (2002) Non-targeted metabolic profiling using fourier

transform ion cyclotron mass spectrometry (FTMS). OMICS 6: 217–234

Allen J, Davey HM, Broadhurst D, Heald JK, Rowland JJ, Oliver SG, Kell

DB (2003) High-throughput classification of yeast mutants for functional

genomics using metabolic footprinting. Nat Biotechnol 21: 692–696

Allwood JW, Ellis DI, Heald JK, Goodacre R, Mur LAJ (2006) Metabolo-

mic approaches reveal that phosphatidic and phosphatidyl glycerol

phospholipids are major discriminatory non-polar metabolites in re-

sponses by Brachypodium distachyon to challenge by Magnaporthe grisea.

Plant J 46: 351–368

Arai T, Mikami Y, Fukushima K, Utsumi T, Yazawa K (1973) A new

antibiotic, leucinostatin, derived from Penicillium lilacinum. J Antibiot

(Tokyo) 26: 157–161

Bianchi G, Gamba A, Limiroli R, Pozzi N, Elster R, Salamini F, Bartels D

(1993) The unusual sugar composition in leaves of the resurrection plant

Myrothamnus flabellifolia. Physiol Plant 87: 223–226

Bocker S, Letzel MC, Liptak Z, Pervukhin A (2006) Decomposing

metabolomic isotope patterns. In P Bucher, BME Moret, eds, Proceedings

of the Workshop on Algorithms in Bioinformatics, Vol 4175. Springer,

Berlin, pp 12–23

Bush LP, Jeffreys JAD (1975) Isolation and separation of tall fescue and

ryegrass alkaloids. J Chromatogr A 111: 165–170

Bush LP, Wilkinson HH, Schardl CL (1997) Bioprotective alkaloids of

grass-fungal endophyte symbioses. Plant Physiol 114: 1–7

Butte AJ, Tamayo P, Slonim D, Golub TR, Kohane IS (2000) Discovering

functional relationships between RNA expression and chemotherapeu-

tic susceptibility using relevance networks. Proc Natl Acad Sci USA 97:

12182–12186

Camacho D, de la Fuente A, Mendes P (2005) The origin of correlations in

metabolomics data. Metabolomics 1: 53–63

Carey VJ, Gentry J, Whalen E, Gentleman R (2005) Network structures

and algorithms in Bioconductor. Bioinformatics 21: 135–136

Chalmers J, Lidgett A, Cummings N, Cao Y, Forster J, Spangenberg G

(2005) Molecular genetics of fructan metabolism in perennial ryegrass.

Plant Biotechnol J 3: 459–474

Cheplick GP (2007) Costs of fungal endophyte infection in Lolium perenne

genotypes from Eurasia and North Africa under extreme resource

limitation. Environ Exp Bot 60: 202–210

Christensen MJ, Leuchtmann A, Rowan DD, Tapper BA (1993) Taxonomy

of Acremonium endophytes of tall fescue (Festuca arundinacea), meadow

fescue (Festuca pratensis) and perennial ryegrass (Lolium perenne). Mycol

Res 97: 1083–1092

Clay K, Schardl C (2002) Evolutionary origins and ecological consequences

of endophyte symbiosis with grasses. Am Nat 160: S99–S127

Dettmer K, Aronov PA, Hammock BD (2007) Mass spectrometry-based

metabolomics. Mass Spectrom Rev 26: 51–78

Dunn WB, Overy S, Quick WP (2005) Evaluation of automated electrospray-

TOF mass spectrometry for metabolic fingerprinting of the plant metabo-

lome. Metabolomics 1: 137–148

Easton HS, Lane GA, Tapper BA, Keogh RG, Cooper BM, Blackwell M,

Anderson MR, Fletcher L (1986) Ryegrass endophyte-related heat stress in

cattle. In Proceedings of the New Zealand Grassland Association, Vol 57.

New Zealand Grassland Association, Dunedin, New Zealand, pp 37–41

Enot DP, Beckmann M, Overy D, Draper J (2006) Predicting interpretabil-

ity of metabolome models based on behavior, putative identity, and

biological relevance of explanatory signals. Proc Natl Acad Sci USA 103:

14865–14870

Fairbourn ML (1962) Alkaloid affects in vitro dry matter digestibility of

Festuca and Bromus species. J Range Manage 35: 503–504

Fang TT, Bendiak B (2007) The stereochemical dependence of unimolec-

ular dissociation of monosaccharide-glycolaldehyde anions in the gas

phase: a basis for assignment of the stereochemistry and anomeric con-

figuration of monosaccharides in oligosaccharides by mass spectrome-

try via a key discriminatory product ion of disaccharide fragmentation,

m/z 221. J Am Chem Soc 129: 9721–9736

Fletcher LR, Easton HS (1997) The evaluation of use of endophytes for

pasture improvement. In CW Bacon, NS Hill, eds, Neotyphodium/Grass

Interactions. Plenum Press, New York, pp 209–228

Gallagher RT, Hawkes AD, Steyn PS, Vleggaar R (1984) Tremorgenic

neurotoxins from perennial ryegrass causing ryegrass staggers disorder

of livestock: structure elucidation of Lolitrem B. J Chem Soc Chem

Commun 9: 614–616

Grimmett RE, Waters DF (1943) A fluorescent alkaloid in ryegrass (Lolium

perenne L.). II. Extraction from fresh ryegrass and separation from other

bases. NZ J Sci Tech 24: 151B

Hansen ME, Smedsgaard J (2004) A new matching algorithm for high

resolution mass spectra. J Am Soc Mass Spectrom 15: 1173–1180

Harwood VD (1954) Analytical studies on the carbohydrates of grasses and

clovers. VII. The isolation of D-mannitol from perennial ryegrass

(Lolium perenne L.). J Sci Food Agric 5: 453–455

Hult K, Gatenbeck S (1978) Production of NADPH in the mannitol cycle

and its relation to polyketide formation in Alternaria alternata. Eur J

Biochem 88: 607–612

Hult K, Veide A, Gatenbeck S (1980) The distribution of the NADPH

regenerating mannitol cycle among fungal species. Arch Microbiol 128:

253–255

Hunt MG, Rasmussen S, Newton PCD, Parsons AJ, Newman JA (2005)

Near-term impacts of elevated CO2, nitrogen and fungal endophyte-

infection on Lolium perenne L. growth, chemical composition and alka-

loid production. Plant Cell Environ 28: 1345–1354

Jeffreys JAD (1964) 859. The alkaloids of perennial rye-grass (Lolium

perenne L.). Part I. Perloline. J Chem Soc 4504–4512

Jemal M, Almond RB, Teitz DS (1997) Quantitative bioanalysis utilizing

high-performance liquid chromatography/electrospray mass spectrometry

via selected-ion monitoring of the sodium ion adduct [M + Na]+.

Rapid Commun Mass Spectrom 11: 1083–1088

Johnson RD, Bassett S, Cao M, Christensen MJ, Gaborit C, Johnson LJ,

Koulman A, Rasmussen S, Voisey C, Bryan G (2006) A multidisciplin-

ary approach to dissect the molecular basis of the Neotyphodium lolii/

ryegrass symbiosis. In CF Mercer, ed, Advances in Pasture Plant

Breeding, Grassland Research and Practice Series No. 12. New Zealand

Grassland Association, Dunedin, New Zealand, pp 107–114

Karakas B, Ozias-Akins P, Stushnoff C, Suefferheld M, Rieger M (1997)

Salinity and drought tolerance of mannitol-accumulating transgenic

tobacco. Plant Cell Environ 20: 609–616

Katajamaa M, Oresic M (2005) Processing methods for differential analysis

of LC/MS profile data. BMC Bioinformatics 6: 179

Katajamaa M, Oresic M (2007) Data processing for mass spectrometry-

based metabolomics. J Chromatogr A 1158: 318–328

Keurentjes JJB, Fu J, de Vos CHR, Lommen A, Hall RD, Bino RJ, van der

Plas LHW, Jansen RC, Vreugdenhil D, Koornneef M (2006) The

genetics of plant metabolism. Nat Genet 38: 842–849

Kind T, Fiehn O (2006) Metabolomic database annotations via query of

elemental compositions: mass accuracy is insufficient even at less than

1 ppm. BMC Bioinformatics 7: 234

Koulman A, Lane GA, Christensen MJ, Fraser K, Tapper BA (2007a)

Peramine and other fungal alkaloids are exuded in the guttation fluid of

endophyte-infected grasses. Phytochemistry 68: 355–360

Koulman A, Tapper BA, Fraser K, Cao M, Lane GA, Rasmussen S (2007b)

High-throughput direct-infusion ion trap mass spectrometry: a new

method for metabolomics. Rapid Commun Mass Spectrom 21: 421–428

Krauss J, Harri SA, Bush L, Husi R, Bigler L, Power SA, Muller CB (2007)

Effects of fertilizer, fungal endophytes and plant cultivar on the perfor-

mance of insect herbivores and their natural enemies. Funct Ecol 21:

107–116

Lane GA, Christensen MJ, Miles CO (2000) Coevolution of fungal endo-

phytes with grasses: the significance of secondary metabolites. In CW

Bacon, JF White, eds, Microbial Endophytes. Marcel Dekker, New York,

pp 341–388

Latch GCM (1993) Physiological interactions of endophytic fungi and their

hosts: biotic stress tolerance imparted to grasses by endophytes. Agric

Ecosyst Environ 44: 143–156

Lattin J, Carroll JD, Green PE (2003) Analyzing Multivariate Data. Dux-

bury Applied Series, Thomson Learning, Pacific Grove, CA

Lehner AF, Craig M, Fannin N, Bush L, Tobin T (2005) Electrospray[+]

tandem quadrupole mass spectrometry in the elucidation of ergot

alkaloids chromatographed by HPLC: screening of grass or forage

samples for novel toxic compounds. J Mass Spectrom 40: 1484–1502

Leuchtmann A (1992) Systematics, distribution, and host specificity of

grass endophytes. Nat Toxins 1: 150–162

Lewis DH, Smith DC (1967) Sugar alcohols (polyols) in fungi and green

plants. I. Distribution, physiology and metabolism. New Phytol 66:

143–184

Listgarten J, Emili A (2005) Statistical and computational methods for

comparative proteomics profiling using liquid chromatography-tandem

mass spectrometry. Mol Cell Proteomics 4: 419–434

Lyons PC, Plattner RD, Bacon CW (1986) Occurrence of peptide and

clavine ergot alkaloids in tall fescue grass. Science 232: 487–489

Malinowski DP, Belesky DP (2000) Adaptations of endophyte-infected

cool-season grasses to environmental stresses: mechanisms of drought

and mineral stress tolerance. Crop Sci 40: 923–940

McLafferty FW, Stauffer DA, Loh SY, Wesdemiotis C (1999) Unknown

identification using reference mass spectra: quality evaluation of data-

bases. J Am Soc Mass Spectrom 10: 1229–1240

Noble HM, Langley D, Sidebottom PJ, Lane SJ, Fisher PJ (1991) An

echinocandin from an endophytic Cryptosporiopsis sp. and Pezicula sp. in

Pinus sylvestris and Fagus sylvatica. Mycol Res 95: 1439–1440

Pavis N, Chatterton NJ, Harrison PA, Baumgartner S, Praznik W, Boucaud

J, Prud’homme MP (2001) Structure of fructans in roots and leaf tissues

of Lolium perenne. New Phytol 150: 83–95

Pedrioli PGA, Eng JK, Hubley R, Vogelzang M, Deutsch EW, Raught B,

Pratt B, Nilsson E, Angeletti R, Apweiler R, et al (2004) A common

open representation of mass spectrometry data and its application in a

proteomics research environment. Nat Biotechnol 22: 1459–1466

Pollock CJ (1982) Oligosaccharide intermediates of fructan synthesis in

Lolium temulentum. Phytochemistry 21: 2461–2465

Pollock CJ, Cairns AJ (1991) Fructan metabolism in grasses and cereals.

Annu Rev Plant Physiol Plant Mol Biol 42: 77–101

Raamsdonk LM, Teusink B, Broadhurst D, Zhang N, Hayes A, Walsh MC,

Berden JA, Brindle KM, Kell DB, Rowland JJ, et al (2001) A functional

genomics strategy that uses metabolome data to reveal the phenotype of

silent mutations. Nat Biotechnol 19: 45–50

Rasmussen S, Parsons AJ, Bassett S, Christensen MJ, Hume DE, Johnson

LJ, Johnson RD, Simpson WR, Stacke C, Voisey CR, et al (2007) High

nitrogen supply and carbohydrate content reduce fungal endophyte and

alkaloid concentration in Lolium perenne. New Phytol 173: 787–797

Rasmussen S, Parsons AJ, Fraser K, Xue H, Newman JA (2008) Metabolic

profiles of Lolium perenne are differentially affected by nitrogen supply,

carbohydrate content, and fungal endophyte infection. Plant Physiol

146: 1–14

Richardson MD, Chapman GW, Hoveland CS, Bacon CW (1992) Sugar

alcohols in endophyte-infected tall fescue under drought. Crop Sci 32:

1060–1061

Rowan DD (1993) Lolitrems, peramine and paxilline: mycotoxins of the

ryegrass/endophyte interaction. Agric Ecosyst Environ 44: 103–122

Rowan DD, Gaynor DL (1986) Isolation of feeding deterrents against

Argentine stem weevil from ryegrass infected with the endophyte

Acremonium loliae. J Chem Ecol 12: 647–658

Rowan DD, Latch GCM (1994) Utilization of endophyte-infected perennial

ryegrasses for increased insect resistance. In CW Bacon, JF White, eds,

Biotechnology of Endophytic Fungi in Grasses. CRC Press, Boca Raton,

FL, pp 169–183

Ruijter GJG, Bax M, Patel H, Flitter SJ, van de Vondervoort PJI, de Vries

RP, van Kuyk PA, Visser J (2003) Mannitol is required for stress

tolerance in Aspergillus niger conidiospores. Eukaryot Cell 2: 690–698

Salminen SO, Richmond DS, Grewal SK, Grewal PS (2005) Influence of

temperature on alkaloid levels and fall armyworm performance in

endophytic tall fescue and perennial ryegrass. Entomol Exp Appl 115:

417–426

Sauer U, Heinemann M, Zamboni N (2007) Getting closer to the whole

picture. Science 316: 550–551

Schardl CL (2001) Epichloe festucae and related mutualistic symbionts of

grasses. Fungal Genet Biol 33: 69–82

Schardl CL, Leuchtmann A, Spiering MJ (2004) Symbioses of grasses with

seedborne fungal endophytes. Annu Rev Plant Biol 55: 315–340

Schauer N, Fernie AR (2006) Plant metabolomics: towards biological

function and mechanism. Trends Plant Sci 11: 508–516

Schmidt A, Karas M, Dulcks T (2003) Effect of different solution flow rates

on analyte ion signals in nano-ESI MS, or: when does ESI turn into nano-

ESI? J Am Soc Mass Spectrom 14: 492–500

Seto Y, Takahashi K, Matsuura H, Kogami Y, Yada H, Yoshihara T,

Nabeta K (2007) Novel cyclic peptide, epichlicin, from the endophytic

fungus, Epichloe typhina. Biosci Biotechnol Biochem 71: 1470–1475

Sickler CM, Edwards GE, Kiirats O, Gao Z, Loscher W (2007) Response of

mannitol-producing Arabidopsis thaliana to abiotic stress. Funct Plant

Biol 34: 382–391

Smedsgaard J, Frisvad JC (1996) Using direct electrospray mass spectrom-

etry in taxonomy and secondary metabolite profiling of crude fungal

extracts. J Microbiol Methods 25: 5–17

Smith CA, Want EJ, O’Maille G, Abagyan R, Siuzdak G (2006) XCMS:

processing mass spectrometry data for metabolite profiling using

nonlinear peak alignment, matching and identification. Anal Chem 78:

779–787

Smyth GK (2004) Linear models and empirical Bayes methods for assess-

ing differential expression in microarray experiments. Stat Appl Genet

Mol Biol 1: Article 3

Solomon PS, Waters ODC, Jorgens CI, Lowe RGT, Rechberger J, Trengove

RD, Oliver RP (2006) Mannitol is required for asexual sporulation in the

wheat pathogen Stagonospora nodorum (glume blotch). Biochem J 399:

231–239

Solomon PS, Waters ODC, Oliver RP (2007) Decoding the mannitol

enigma in filamentous fungi. Trends Microbiol 15: 257–262

Spiering MJ, Lane GA, Christensen MJ, Schmid J (2005) Distribution of

the fungal endophyte Neotyphodium lolii is not a major determinant of

the distribution of fungal alkaloids in Lolium perenne plants. Phyto-

chemistry 66: 195–202

Stein SE, Scott DR (1994) Optimization and testing of mass spectral library

search algorithms for compound identification. J Am Soc Mass Spectrom

5: 859–866

Steuer R, Kurths J, Fiehn O, Weckwerth W (2003) Observing and

interpreting correlations in metabolomic networks. Bioinformatics 19:

1019–1026

Strickland JR, Bailey EM, Abney LK, Oliver JW (1996) Assessment of the

mitogenic potential of the alkaloids produced by endophyte (Acremo-

nium coenophialum)-infected tall fescue (Festuca arundinacea) on bovine

vascular smooth muscle in vitro. J Anim Sci 74: 1664–1671

Strickland JR, Cross DL, Jenkins TC, Petroski RJ, Powell RG (1992) The

effect of alkaloids and seed extracts of endophyte-infected tall fescue on

prolactin secretion in an in vitro rat pituitary perfusion system. J Anim

Sci 70: 2779–2786

Strobel GA, Hess WM (1997) Glucosylation of the peptide leucinostatin A,

produced by an endophytic fungus of European yew, may protect the

host from leucinostatin toxicity. Chem Biol 4: 529–536

Strobel GA, Miller RV, Martinez-Miller C, Condron MM, Teplow

DB, Hess WM (1999) Cryptocandin, a potent antimycotic from the

endophytic fungus Cryptosporiopsis cf. quercina. Microbiology 145:

1919–1926

Strobel GA, Torczynski R, Bollon A (1997) Acremonium sp.—a leucinos-

tatin A producing endophyte of European yew (Taxus baccata). Plant Sci

128: 97–108

Sumner L, Mendes P, Dixon R (2003) Plant metabolomics: large-scale

phytochemistry in the functional genomics era. Phytochemistry 62:

807–836

Tabb DL, MacCoss MJ, Wu CC, Anderson SD, Yates JR III (2003) Similarity

among tandem mass spectra from proteomic experiments: detection,

significance, and utility. Anal Chem 75: 2470–2477

Tan RX, Zou WX (2001) Endophytes: a rich source of functional metabo-

lites. Nat Prod Rep 18: 448–459

Tanaka A, Tapper BA, Popay A, Parker EJ, Scott B (2005) A symbiosis

expressed non-ribosomal peptide synthetase from a mutualistic fungal

endophyte of perennial ryegrass confers protection to the symbiotum

from insect herbivory. Mol Microbiol 57: 1036–1050

Tautenhahn R, Bottcher C, Neumann S (2007) Annotation of LC/ESI-MS

mass signals. In S Hochreiter, R Wagner, eds, Lecture Notes in Computer

Science, Vol 4414. Springer, Berlin, pp 371–380

Tozuka Z, Kaneko H, Shiraga T, Mitani Y, Beppu M, Terashita S,

Kawamura A, Kagayama A (2003) Strategy for structural elucidation

of drugs and drug metabolites using (MS)n fragmentation in an electro-

spray ion trap. J Mass Spectrom 38: 793–808

Trunzer M, Graf D, Kiffe M (2007) Comparison of a two-dimensional

liquid chromatography/mass spectrometry approach with a chip-based

nanoelectrospray device for structural elucidation of metabolites in a

human ADME study using a quadrupole time-of-flight mass spectrom-

eter. Rapid Commun Mass Spectrom 21: 937–944

Voegele RT, Hahn M, Lohaus G, Link T, Heiser I, Mendgen K (2005)

Possible roles for mannitol and mannitol dehydrogenase in the biotro-

phic plant pathogen Uromyces fabae. Plant Physiol 137: 190–198

Yamamoto T, Watanabe H, Nishimoto T, Aga H, Kubota M, Chaen H,

Fukuda S (2006) Acceptor recognition of kojibiose phosphorylase from

Thermoaerobacter brockii: syntheses of glycosyl glycerol and myo-inositol.

J Biosci Bioeng 101: 427–433

Zhang X, Asara JM, Adamec J, Ouzzani M, Elmagarmid AK (2005) Data

pre-processing in liquid chromatography-mass spectrometry based

proteomics. Bioinformatics 21: 4054–4059

Zou H, Hastie T (2005) Regularization and variable selection via the elastic

net. J R Statist Soc B 67: 301–320
