Computers and Electronics in Agriculture 130 (2016) 13–19

Contents lists available at ScienceDirect

Computers and Electronics in Agriculture

journal homepage: www.elsevier.com/locate/compag

Original papers

A computational environment to support research in sugarcane agriculture

http://dx.doi.org/10.1016/j.compag.2016.10.002
0168-1699/© 2016 Elsevier B.V. All rights reserved.

⇑ Corresponding author at: Brazilian Bioethanol Science and Technology Laboratory (CTBE), National Center for Research in Energy and Materials (CNPEM), Caixa Postal 6192, CEP 13083-970 Campinas, São Paulo, Brazil.

E-mail address: [email protected] (Paulo S. Graziano Magalhães).

Carlos Driemeier a, Liu Yi Ling a, Guilherme M. Sanches a,b, Angélica O. Pontes a, Paulo S. Graziano Magalhães a,b,⇑, João E. Ferreira c

a Brazilian Bioethanol Science and Technology Laboratory (CTBE), National Center for Research in Energy and Materials (CNPEM), Caixa Postal 6192, CEP 13083-970 Campinas, São Paulo, Brazil
b School of Agriculture Engineering – FEAGRI, University of Campinas – UNICAMP, Campinas, SP, Brazil
c Institute of Mathematics and Statistics – IME, University of São Paulo – USP, São Paulo, SP, Brazil

Article info

Article history:
Received 26 October 2015
Received in revised form 23 August 2016
Accepted 4 October 2016

Keywords:
Precision agriculture
Sugarcane
Database
Workflow

Abstract

Sugarcane is an important crop for tropical and sub-tropical countries. Like other crops, sugarcane agricultural research and practice is becoming increasingly data intensive, with several modeling frameworks developed to simulate biophysical processes in farming systems, all dependent on databases for accurate predictions of crop production. We developed a computational environment to support experiments in sugarcane agriculture and this article describes data acquisition, formatting, storage, and analysis. The potential to support creation of new agricultural knowledge is demonstrated through joint analysis of three experiments in sugarcane precision agriculture. Analysis of these case studies emphasizes spatial and temporal variations in soil attributes, sugarcane quality, and sugarcane yield. The developed computational framework will aid data-driven advances in sugarcane agricultural research.

© 2016 Elsevier B.V. All rights reserved.

1. Introduction

Sugarcane is an important crop, mainly in tropical and sub-tropical countries. Brazil is the largest sugarcane producer, with 9 Mha cultivated to produce 659 million Mg of sugarcane in the 2015/2016 season, resulting in 34,600 Mg of sugar and 29 billion L of ethanol (CONAB, 2015). In addition to sugar and ethanol, Brazil is today the country with the largest installed capacity of biomass-based electricity generation (IRENA, 2015). In 2015, the supply of electricity from biomass had an estimated growth of 7%, with a total generation of over 22 TW h, of which sugarcane accounts for 80%.

Several modeling frameworks such as AUSCANE, QCANE, APSIM, MOSICAS and CANEGRO (Marin et al., 2011) are increasingly being employed to simulate biophysical processes in sugarcane farming systems. They are all dependent on databases, exemplifying the many ways in which agriculture is moving towards intensive data acquisition and processing. In addition, agriculture worldwide is witnessing a growing adoption of the so-called Precision Agriculture (PA), which comprises a set of tools to help farmers understand and manage the inherent spatial and temporal variability of soils and crops. PA relies on collection, analysis, processing, and synthesis of voluminous georeferenced data, which can be collected with a number of different technologies (Zamykal and Everingham, 2009). Research and technology on PA have advanced considerably in the past 20 years (Bramley, 2009). Due to its intense use of information, PA has grown and evolved to incorporate the best of multidisciplinary science and technology (Zamykal and Everingham, 2009), requiring farmers to look at their business from different perspectives (Srinivasan, 2006).

The sugarcane production system, however, differs substantially from that of major staple crops, affecting development and adoption of agricultural technologies. Comparison between a major cereal (e.g., wheat) and sugarcane highlights some key differences. Wheat area worldwide is 215 Mha, primarily in temperate zones, compared to 26 Mha of sugarcane, primarily in tropical developing countries, especially in Brazil. Furthermore, the harvested part of cereal crops is the grain, with mean yields of 3.2 Mg ha⁻¹ for wheat, compared to 71 Mg ha⁻¹ for harvesting the stalks of sugarcane in 2013 (FAO, 2015). Differences in area and location make sugarcane a small fraction of the global market for agricultural technology. In addition, the high tonnage of sugarcane requires dedicated technologies, such as tailored yield monitors (Magalhães and Cerri, 2007). Due to specificities of the sugarcane system, and despite rapid adoption of auto-steer in tractors and harvesters (Bramley and Trengove, 2013; Silva et al., 2011), PA is not yet as widely adopted by the sugarcane-based sugar-ethanol industry as it is in other agricultural systems (Gebbers and Adamchuk, 2010). According to surveys conducted in Brazil (Anselmi et al., 2014; Avanzi et al., 2014; Silva et al., 2011) and Australia (Bramley and Trengove, 2013), low PA adoption can be explained by four factors: relative advantage (usefulness), compatibility, trialability and observability. For sugarcane production, perceived usefulness is correlated with increased crop yield, reduced costs, and improved management. On the other hand, high costs of equipment, lack of qualified staff and lack of information on PA technologies were pointed out by sugarcane farmers as the main barriers (Silva et al., 2011).

Fig. 1. Tasks for handling of raw data prior to insertion into the database.

In this context, efforts have been primarily dedicated to experiments aimed at establishing the scientific grounds and demonstrating the advantages of PA techniques applied to sugarcane (Portz et al., 2011; Rodrigues et al., 2012). Because of these goals, characterization of soil and plant attributes in experiments is much more comprehensive than that expected for large-scale PA practice. Furthermore, testing data acquisition technologies and contextualizing their outputs are important goals of the experimentation stage. Considering the above, handling the diversity of measurable attributes is a critical point in experimentation for sugarcane PA.

The data-driven character of PA has attracted the attention of the research community from many different areas. For instance, there are studies on clustering algorithms to delineate management zones (Tagarakis et al., 2012), data acquisition techniques with remote sensing (Mulla, 2013; Song et al., 2009), and software architecture for data analysis and integration of sensor-based PA monitoring (Chen et al., 2015).

In this work, we present a computational environment created to support sugarcane agricultural research, including but not limited to research in PA. Data acquisition, formatting, verification, storage, and analysis are discussed. To demonstrate the applicability of the computational environment, data of soil chemistry, sugarcane quality, and sugarcane yield from three experiments are jointly analyzed and discussed.

2. Computational environment

2.1. Handling of raw data

Sugarcane agricultural experiments may include several sources of raw data, including data acquired by different analytical laboratories and by various types of sensors. The current version of the computational environment is able to process data in matrix formats. Processing of images (e.g., from unmanned aerial vehicles and satellites) is foreseen as a future upgrade for the system.

The database has an expandable set of allowed matrix formats, essentially one matrix format for each type of measurement. To ensure that matrixes are properly recorded in the database, we routinely handle raw data following the tasks presented in Fig. 1.

Using spreadsheets, raw data from sensors and laboratory files are converted into data matrixes consistent with the predefined database formats. Such data matrixes are verified and then inserted into the database. Importantly, data acquisition and formatting are performed by agricultural field scientists, while verification and insertion are performed by computational workers. Among other advantages, this division of tasks ensures an independent verification of data veracity. Verifications cover matrix formats, measurement units, and the typical range of values acceptable for a given measured attribute. Once verified, data matrixes are inserted into the database using Python-generated SQL scripts.
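The verification and insertion tasks described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the BDAgro code: the table name `soil_attribute`, the column names, and the acceptance ranges are hypothetical placeholders, and the `%s` placeholders assume a PostgreSQL driver that uses that parameter style.

```python
# Hypothetical acceptance ranges per attribute (illustrative values only).
RANGES = {"ph": (3.0, 9.0), "om_g_dm3": (0.0, 100.0)}
# Hypothetical matrix format for one type of measurement.
REQUIRED_COLUMNS = ("x", "y", "ph", "om_g_dm3")


def verify_rows(rows):
    """Return a list of (row_index, message) for every problem found."""
    problems = []
    for i, row in enumerate(rows):
        missing = [c for c in REQUIRED_COLUMNS if c not in row]
        if missing:
            problems.append((i, f"missing columns: {missing}"))
            continue
        for name, (low, high) in RANGES.items():
            value = row[name]
            if value is not None and not (low <= value <= high):
                problems.append((i, f"{name}={value} outside [{low}, {high}]"))
    return problems


def insert_statement(table, row):
    """Build a parameterized INSERT statement and the tuple of values for it."""
    columns = list(REQUIRED_COLUMNS)
    placeholders = ", ".join(["%s"] * len(columns))
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders});"
    return sql, tuple(row[c] for c in columns)
```

Keeping values out of the SQL string and passing them as parameters is a design choice of this sketch; the article only states that the scripts are generated with Python.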

2.2. Database

A relational database for sugarcane agricultural experiments was created and named BDAgro – CTBE Database of Agricultural Experiments, as detailed in a Technical Memorandum (Pontes et al., 2014). BDAgro was built with PostgreSQL as the relational database management system and pgAdmin as the database administration and development platform.

The BDAgro conceptual model, i.e., its entity-relationship model (Elmasri and Navathe, 2010), includes entities associated with management and responsibilities (e.g., records of projects and responsible persons). More relevant for the analytical purposes of the computational environment, however, BDAgro represents agricultural experiments through the following entities:

• Experiment is defined by a certain land area during a certain period of time. The land area is most often an open agricultural field, but may also be inside enclosed environments such as greenhouses.

• Event is one important fact within one experiment. Events may be of three types: (i) intervention, associated with a change in the experimental land area (e.g., harvest, soil fertilization); (ii) characterization, associated with data acquisition without change in the land area (e.g., characterization of soil attributes); and (iii) planning, representing a record associated with neither physical change in the land area nor new data acquisition (e.g., nutrient prescription).

• Static data is data generated by events. It is termed static because each event is defined at a specific moment within one experiment. Static data has x and y spatial coordinates as attributes. Additional attributes depend on the type of static data (i.e., on the type of measurement). Soil attributes, sugarcane quality, and sugarcane yield are examples of types of static data.

• Dynamic data is data acquired continuously during the course of one experiment. Meteorological information is one example of dynamic data.

We refer to these entities throughout the remainder of the article. However, the analysis of the case studies does not yet include any dynamic data because of the complexity of agricultural analysis using fine temporal granularity.
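To make the relationships among experiments, events, and static data concrete, the sketch below mirrors them with Python dataclasses. It is only an illustration of the conceptual model as described in the text: the class and field names are hypothetical and do not reproduce the actual BDAgro schema, and dynamic data is left out.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum


class EventType(Enum):
    INTERVENTION = "intervention"          # changes the land area (harvest, fertilization)
    CHARACTERIZATION = "characterization"  # acquires data without changing the land area
    PLANNING = "planning"                  # neither changes the land area nor acquires data


@dataclass
class StaticDatum:
    x: float
    y: float
    attributes: dict  # keys depend on the type of static data, e.g. {"ph": 5.4}


@dataclass
class Event:
    event_type: EventType
    when: date
    static_data: list = field(default_factory=list)


@dataclass
class Experiment:
    name: str
    start: date
    end: date
    events: list = field(default_factory=list)
```

In a relational database, the nested lists would correspond to foreign keys from static data to events and from events to experiments.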

2.3. Data analysis

We adopted the Work-Event-Data-flow (WED-flow) approach (Ferreira et al., 2010) as the methodology for modeling analysis workflows. WED-flow addresses the integration of three main flow paradigms: workflow, event-flow and data-flow. These three flow paradigms combine the concepts of workflow composition, transactions (i.e., activities), events, and data states. At an abstract level, the WED-flow is modeled as a SAGA (Garcia-Molina and Salem, 1987), composed of SAGA steps, each of which is enclosed in a transaction. The main advantage of the WED-flow approach is that this integration supports the modeling of analysis workflows. More concretely, in this paper, we assume that: the flow of work is a set of analysis steps; the flow of analysis events is a set of constraints (that, once satisfied, trigger the following analysis step); and the flow of data is a set of data states generated by each analysis step. An analytical module was created within BDAgro by constructing a set of tables containing the data states resulting from analysis workflows. The statistical computing integrated with the database is performed with the R programming language.
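The three flows assumed above (analysis steps, triggering constraints, and data states) can be illustrated with a toy Python sketch. It is not the WED-flow implementation and omits transactions and SAGA compensation entirely; the function name `run_workflow` and the step tuple layout are hypothetical.

```python
def run_workflow(state, steps):
    """Run analysis steps whose trigger condition holds until none applies.

    `state` maps data-state names to values. Each step is a tuple
    (name, condition, action): `condition(state)` is the constraint that
    triggers the step, and `action(state)` returns the new data states
    that the step produces. Every step fires at most once.
    """
    history = []
    done = set()
    progressed = True
    while progressed:
        progressed = False
        for name, condition, action in steps:
            if name not in done and condition(state):
                state = {**state, **action(state)}
                done.add(name)
                history.append(name)
                progressed = True
    return state, history
```

Because each step is triggered by the data states it needs rather than by its position in a list, steps can be declared in any order and still run in a valid sequence.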

3. Case studies

Data presented in this paper were collected from 2007 to 2014 in different sugarcane fields. Sugarcane is a semi-perennial crop, which typically grows in cycles of four to six years, being harvested and fertilized annually.

3.1. Three experiments

The first and second field experiments were conducted from 2010 to 2014 in two adjacent 50 ha areas in a commercial sugarcane field in Serra Azul – SP, Brazil (21° 16′ 41″ S and 47° 32′ 10″ W), which belongs to Pedra Mill. Before sugarcane planting, a survey of the area was carried out in November 2010 to establish the soil chemical and physical conditions and the nutrient needs for crop implementation. In the first field experiment (Fig. 2a), the area was divided into a regular 50 m grid with 204 sample points located using a differential global positioning system (DGPS) (Ag114™, Trimble, Navigation Ltd, Sunnyvale, CA, USA). Additional refinement points were taken when necessary (Fig. 2a). Soil samples were taken at two depths (0–0.20 and 0.20–0.40 m) at each grid point and a wet-chemical analysis was done to determine soil physical and chemical attributes (macro and micronutrients). In the second experiment, the area was divided into a regular 150 m grid with 24 sample points (Fig. 2b) located using the same technology. Soil samples were taken at the same depths and sent to a laboratory for wet-chemical analysis. In both areas, sugarcane was planted in April 2011. Annually the area was fertilized with K, P and N and soil samples were taken at the same spots.

Fig. 2. Photographs of the land areas of experiments 1–3 (A–C). Grid points (black dots), grid refinement points (red crosses, A and C), and the track of the sugarcane harvester (yellow lines in the rectangular area zoomed at the inset of (A)) are shown. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

The third experiment was conducted in a 10-ha commercial sugarcane field (Fig. 2c) that belongs to GRUPO USJ, a sugar and ethanol plant located in Araras, São Paulo State, in the southeast region of Brazil (22° 23′ 38″ S, 47° 18′ 04″ W). Sugarcane variety SP80-3280 was planted in 2007 and was mechanically green harvested throughout the cropping seasons. Liming and fertilization were performed according to usual recommendations for sugarcane crop planting (Raij et al., 1997) at a fixed rate. For the subsequent ratoon crops (2009–2011), no fertilizers were applied. A total of 117 sample points were established on a 30 m regular grid to sample the sugarcane quality parameters and the physical and chemical soil attributes. An additional 14 refinement points were also taken. Plant sampling occurred just before the respective harvests, at the peak of sugarcane maturity. The three sites (Fig. 2a–c) had been under continuous sugarcane cultivation for more than 30 years.

Annually the areas were harvested using sugarcane harvesters equipped with an auto-guidance system with an RTK signal (Trimble, Navigation Ltd, Sunnyvale, CA, USA) and a yield monitor (Simprocana, Enalta, São Carlos, Brazil). Annually, after harvest, soil samples were taken again at the same grid to diagnose deficiencies and recommend fertilizer application using variable-rate technology (VRT) when applicable (fields 1 and 2). In each field, the experiment covered the whole sugarcane cycle, i.e., from planting through the consecutive ratoons, for 4 cycles.

3.2. Data acquisition techniques

Soil and crop data have been acquired with two types of techniques: (i) sampling at grid points, for physical and chemical soil attributes, and (ii) scanning the area with "on-the-go" soil sensors (soil apparent electrical conductivity) and crop sensors (crop spectral reflectance), which are mounted on vehicles equipped with a differential global positioning system (DGPS). Data acquisition with on-the-go sensors has relatively lower costs and is promising for large-scale PA practice. On the other hand, sampling at grid points is labor intensive, requiring manual collection of samples for offline laboratory analysis. Such a sampling approach is likely of limited applicability for large-scale PA. However, for the purpose of experimentation, sampling-based techniques are employed because they allow measurement of much more diverse sets of soil and plant attributes, thus expanding the scope of the experiments.

Sample collection at grid points is recorded in the database as events of characterization, which generate static data. Measured attributes of soil chemistry include pH, contents of organic matter (OM), and concentrations of macronutrients (P and K) and micronutrients (B, Mg, S, Ca, Mn, Fe, Cu, and Zn). Soil samples were collected from two soil layers: 0–0.20 and 0.20–0.40 m deep in experiments 1 and 2; 0–0.20 and 0.20–0.50 m deep in experiment 3. Sugarcane quality parameters (Fiber, Brix, and Pol) were measured from sugarcane samples collected immediately before harvest. Sugarcane yield was measured by an on-the-go yield monitor that determines sugarcane yield during the harvesting operation (Magalhães and Cerri, 2007). In the database, yield is a type of static data associated with a characterization event. Each event of yield characterization is simultaneous with the intervention event of harvest (Section 2.2). In this article, we analyze sugarcane yield, whereas we omit data from other on-the-go sensors (of apparent electrical conductivity and crop vegetation indexes).

3.3. Analysis workflow

We developed a data analysis workflow to provide synthetic views of (i) spatiotemporal variability in each measured attribute and (ii) correlations among soil attributes that are presumably associated with soil pH. The model of the analysis workflow is sketched in Fig. 3, representing flow of analysis (represented by boxes), flow of events (represented by arrows), and flow of data states (represented by databases). Note that the results of analysis (Fig. 3) are correlations, spatial autocorrelations, and outputs from principal component analysis (PCA). In essence, all these results are based on correlations, which would be detrimentally affected by nonlinearities and outliers.

Fig. 3. Data analysis workflow.

In order to avoid such issues, linearization and filtering steps are applied beforehand in the workflow. Linearization takes the logarithm of concentrations of soil components (OM, macronutrients and micronutrients), keeping other attributes unchanged. The logarithmic scale reduces the positive skewness of concentration distributions and is additionally justifiable due to the ubiquity of linear relations with log(concentration) found in physical chemistry (Atkins and Paula, 2010). Outliers in data sets are removed by a filtering step. Any entry deviating from the mean by more than three standard deviations (for a given attribute) was treated as an outlier.
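The linearization and filtering steps can be sketched with NumPy as below. This is an illustrative sketch rather than the authors' R code; the base of the logarithm (natural), the use of the sample standard deviation (`ddof=1`), and marking removed entries as NaN are assumptions, since the text does not specify them.

```python
import numpy as np


def linearize(values):
    """Natural logarithm of strictly positive concentrations; others become NaN."""
    values = np.asarray(values, dtype=float)
    out = np.full(values.shape, np.nan)
    positive = values > 0
    out[positive] = np.log(values[positive])
    return out


def filter_outliers(values, n_sigma=3.0):
    """Replace entries farther than n_sigma standard deviations from the mean by NaN."""
    values = np.asarray(values, dtype=float).copy()
    mean = np.nanmean(values)
    std = np.nanstd(values, ddof=1)  # assumption: sample standard deviation
    values[np.abs(values - mean) > n_sigma * std] = np.nan
    return values
```

The choice of logarithm base does not affect the correlation-based results, because changing the base only rescales the transformed attribute by a constant factor.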

Following linearization and filtering, the next steps aim at representing all measurements of one experiment as a matrix of entries $U = \{u_{ij}\}$, where $i$ is the index for grid points and $j$ the index for measured attributes. Sampling at grid points straightforwardly generates data in this matrix format. Nevertheless, in the case of proximal measurements, such as those performed at grid refinement points (Fig. 2), $u_{ij}$ is calculated as the average of the proximal points. On the other hand, each attribute from an on-the-go sensor is a function $u_j(x, y)$, where $(x, y)$ are the spatial coordinates of the sensor track on the field. The value of $u_{ij}$ is estimated by employing linear regression to fit a plane to the $u_j(x, y)$ points within a circle of 50 m diameter centered at grid point $i$.
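The plane fit for on-the-go sensor readings can be sketched as follows. The function name `estimate_at_grid_point`, the minimum of four readings, and the parameterization in offsets from the grid point are assumptions of this sketch; with that parameterization, the fitted intercept is the plane's value at the grid point, and its standard error gives one possible measure of precision.

```python
import numpy as np


def estimate_at_grid_point(x, y, u, grid_x, grid_y, diameter=50.0):
    """Fit u ~ a + b*dx + c*dy to readings inside the circle; return (a, stderr of a).

    dx and dy are offsets from the grid point, so the intercept `a` is the
    fitted value at the grid point itself. Returns (nan, nan) when fewer than
    four readings fall inside the circle or the fit is degenerate.
    """
    x, y, u = (np.asarray(v, dtype=float) for v in (x, y, u))
    dx, dy = x - grid_x, y - grid_y
    inside = dx**2 + dy**2 <= (diameter / 2.0) ** 2
    count = int(inside.sum())
    if count < 4:
        return np.nan, np.nan
    design = np.column_stack([np.ones(count), dx[inside], dy[inside]])
    coef, _, rank, _ = np.linalg.lstsq(design, u[inside], rcond=None)
    if rank < 3:
        return np.nan, np.nan
    residuals = u[inside] - design @ coef
    sigma2 = residuals @ residuals / (count - 3)  # residual variance
    covariance = sigma2 * np.linalg.inv(design.T @ design)
    return float(coef[0]), float(np.sqrt(covariance[0, 0]))
```

Centering the coordinates on the grid point avoids a separate evaluation of the plane and makes the intercept's standard error directly interpretable as the uncertainty of the estimate at that point.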

With attributes estimated at grid points, Moran's I spatial autocorrelation, $I_j$, is calculated by considering the connection matrix $M$, whose terms equal one for neighboring grid points and zero otherwise. The matrix $L$ is obtained by normalizing $M$ so that each row sums to unity. Then $I_j = z_j^{T} L z_j$, where $z_j$ is the vector of the mean-centered, normalized attribute estimated at grid points, $z_{ij} = (u_{ij} - \bar{u}_j)/\sigma_j$ (Cliff and Ord, 1973).
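A compact NumPy sketch of this calculation is given below. Two points are assumptions not fixed by the text: neighbors are taken to be grid points exactly one grid spacing apart (rook adjacency), and the quadratic form is divided by the number of points so that the result lies on the conventional Moran's I scale, which is equivalent to scaling $z_j$ to unit length. The sketch also assumes no missing values and a non-constant attribute.

```python
import numpy as np


def morans_i(values, coords, spacing, tol=1e-6):
    """Moran's I with a row-normalized neighbor matrix on a regular grid."""
    values = np.asarray(values, dtype=float)
    coords = np.asarray(coords, dtype=float)
    n = len(values)
    # Connection matrix M: 1 for points one grid spacing apart, 0 otherwise.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    M = (np.abs(dist - spacing) < tol).astype(float)
    # Row-normalize M into L; rows of isolated points stay zero.
    row_sums = M.sum(axis=1, keepdims=True)
    L = np.divide(M, row_sums, out=np.zeros_like(M), where=row_sums > 0)
    # Mean-centered, normalized attribute.
    z = (values - values.mean()) / values.std()
    return float(z @ L @ z) / n
```

As a sanity check, a checkerboard pattern on a regular grid yields exactly −1, while a smooth gradient across the field yields a positive value.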

Measured attributes vary widely in their range of values as well as in their data acquisition technologies. Therefore, precision in attribute measurement also varies widely. Typical attribute precisions were estimated as "noise" levels $s_j$. For attributes determined by sampling at grid points, $s_j$ was estimated as the standard deviation of measurements at proximal points. For attributes obtained from on-the-go sensors, $s_j$ was estimated from the standard error of the linear regression coefficient that estimates $u_{ij}$.

Principal Component Analysis (PCA) was employed to reduce the dimensionality of the attribute space. Prior to PCA, missing data in the matrix $\{u_{ij}\}$ were imputed by the Expectation-Maximization (EM) algorithm associated with a multivariate normal model (Johnson and Wichern, 2007), stopping the EM iterations when the imputed $u_{ij}$ changed by much less than $s_j$. This imputation approach preserves the data covariance structure, thus being well suited as a data preparation method for PCA. PCA was performed from correlation matrixes, which means that each attribute contributes with normalized, unit variance.
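The imputation and PCA steps can be sketched in NumPy as follows. This is a generic EM scheme for a multivariate normal model, written for illustration and not taken from the authors' R code. The parameter `tol` is a hypothetical per-attribute stopping threshold (for instance a small fraction of $s_j$), and computing the correlation matrix directly from the imputed matrix is an assumption of this sketch.

```python
import numpy as np


def em_impute(data, tol, max_iter=500):
    """Fill NaN entries by EM under a multivariate normal model (>= 2 columns)."""
    x = np.array(data, dtype=float)
    missing = np.isnan(x)
    n, p = x.shape
    # Start from column means of the observed values.
    x[missing] = np.take(np.nanmean(x, axis=0), np.where(missing)[1])
    mu = x.mean(axis=0)
    sigma = np.cov(x, rowvar=False, bias=True)
    tol = np.broadcast_to(np.asarray(tol, dtype=float), (p,))
    for _ in range(max_iter):
        previous = x.copy()
        correction = np.zeros((p, p))  # conditional covariance of imputed entries
        for i in np.where(missing.any(axis=1))[0]:
            m, o = missing[i], ~missing[i]
            if not o.any():  # nothing observed in this row
                x[i] = mu
                correction += sigma
                continue
            # gain = S_mo S_oo^{-1}
            gain = np.linalg.solve(sigma[np.ix_(o, o)], sigma[np.ix_(o, m)]).T
            x[i, m] = mu[m] + gain @ (x[i, o] - mu[o])
            correction[np.ix_(m, m)] += sigma[np.ix_(m, m)] - gain @ sigma[np.ix_(o, m)]
        mu = x.mean(axis=0)
        centered = x - mu
        sigma = (centered.T @ centered + correction) / n
        if np.all(np.abs(x - previous) <= tol):
            break
    return x


def pca_from_correlation(x):
    """Eigenvalues (descending) and loadings of the correlation matrix of x."""
    corr = np.corrcoef(x, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
    loadings = eigenvectors * np.sqrt(np.clip(eigenvalues, 0.0, None))
    return eigenvalues, loadings
```

The correction term in the covariance update is what distinguishes EM from simple regression imputation: it accounts for the uncertainty of the imputed entries, so the estimated covariance is not artificially shrunk.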

4. Results and discussion

4.1. Spatial autocorrelations

The "null hypothesis" of PA is uniform management of the field (Whelan and McBratney, 2000). A corollary is that, to be relevant for PA, a measured attribute should present positive spatial autocorrelation. That is, an attribute $j$ having spatial autocorrelation $I_j \approx 0$ is consistent with random spatial distribution and, therefore, cannot be used to justify any site-specific action on the field. On the other hand, greater values of $I_j$ might justify site-specific interventions.

Considering the diversity of attributes characterized in the three experiments, spatial autocorrelation $I_j$ is distributed between approximately −0.2 and 0.8 (Fig. 4). Negative $I_j$ is observed mainly in experiment 2 (Fig. 4B). Most likely, negative $I_j$ can be interpreted as $I_j \approx 0$ perturbed by statistical noise, which is greater in experiment 2 due to its fewer grid points (24) (Fig. 2). Notably, Webster and Oliver (2007) recommend no fewer than 100 sampling points to estimate the variability of a given field. Furthermore, it is important to note that, for the three experiments, many attributes are close to $I_j \approx 0$, so that such measured attributes should not be employed to support site-specific interventions on the field.


Fig. 4. Inter-year correlation against Moran's I spatial autocorrelation for attributes of soil chemistry (black; OM, P, K, B, Mg, S, Ca, Mn, Fe, Cu, and Zn), sugarcane quality (blue; Fbr, Brix, Pol), and sugarcane yield (red; Yi). Results from experiments 1–3 are shown in plots A–C. Year of measurement is represented by the inclination of the labels, as indicated in the legends. Distinct soil layers are represented by Bold (0–0.20 m) and Italic (either 0.20–0.40 m or 0.20–0.50 m) labels. Diagonal dashed lines (y = x) and horizontal dotted line (zero inter-year correlation) are shown to guide the eyes. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)


4.2. Spatiotemporal variability in soil attributes

Statistical noise tends to reduce the magnitude of all types of correlations, including spatial autocorrelation and correlation between sequential measurements of a given attribute. Fig. 4 plots inter-year correlation against spatial autocorrelation $I_j$. The inter-year correlation is an index for the temporal stability of the spatial variability of a given attribute. For attributes of soil chemistry (black labels), a clear trend is observed in experiment 1 (Fig. 4A), with attributes clustered near the diagonal dashed line (y = x). Indeed, inter-year correlation is positively correlated (r = 0.73) with spatial autocorrelation. This trend is consistent with inter-year correlation and spatial autocorrelation being both substantially reduced by statistical noise. The correlation between inter-year correlation and spatial autocorrelation is weaker but still positive in experiments 2 and 3 (r = 0.54 and r = 0.32, respectively). This result indicates that statistical noise is also a major factor reducing inter-year correlations and spatial autocorrelations in experiments 2 and 3. In these two experiments, most soil chemistry attributes are above the diagonal dashed line (Fig. 4B and C), i.e., for most attributes inter-year correlation is greater than spatial autocorrelation. This observation might result from inherently lower spatial autocorrelations in experiments 2 and 3, due to the greater spacing between grid points in experiment 2, and perhaps due to the more homogeneous, flat terrain of experiment 3.
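The inter-year correlation of one attribute can be sketched as a correlation across grid points between its values in two years. The use of the Pearson coefficient, the pairwise exclusion of missing grid points, and the minimum of three valid points are assumptions of this sketch, since the text does not specify them.

```python
import numpy as np


def inter_year_correlation(values_year_a, values_year_b):
    """Pearson correlation across grid points, skipping points missing in either year."""
    a = np.asarray(values_year_a, dtype=float)
    b = np.asarray(values_year_b, dtype=float)
    valid = ~(np.isnan(a) | np.isnan(b))
    if valid.sum() < 3:
        return np.nan
    return float(np.corrcoef(a[valid], b[valid])[0, 1])
```

Both arrays must list the grid points in the same order, so that each pair of entries refers to the same location in the field.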

4.3. Spatiotemporal variability in sugarcane quality

In experiments 1 and 2, attributes of sugarcane quality have spatial autocorrelation and inter-year correlation of approximately zero (blue labels in Fig. 4A and B, respectively). This observation indicates that sugarcane quality is consistent, at least approximately, with random spatiotemporal variations. Experiment 3 shows a different behavior. In year 2011, Brix and Pol (quality attributes associated with sugar content) have spatial autocorrelation $I_j \approx 0.3$ and correlation with the previous year of approximately 0.2 (Fig. 4C). These are significant correlations, demonstrating that within-field spatial variability and temporal stability of Brix and Pol are possible issues to be managed by PA techniques. Correlations of even higher magnitudes are observed in sugarcane fiber contents ('Fbr' labels in Fig. 4C), reinforcing that spatiotemporal management of sugarcane quality is possible, at least in principle. Nevertheless, the fact that appreciable correlations and spatial autocorrelations are only occasionally observed in one (experiment 3) out of three experiments suggests that significant spatiotemporal variability in sugarcane quality may often remain undetected.

4.4. Spatiotemporal variability in sugarcane yield

Yield is certainly a major aim in crop management. In other crops, yield spatial variations are commonly found to be rather stable across several harvests (Godwin et al., 2003). In sugarcane, however, we have observed a lack of temporal stability in yield spatial variability. In experiment 1, yield in 2014 has a positive (approximately 0.2) correlation with yield in 2013. This level of correlation places yield close to the diagonal trend observed in Fig. 4A, which would be consistent with temporal stability reduced mainly by statistical noise. On the other hand, yield in 2013 has a near-zero correlation with yield in 2012 (Fig. 4A). This yield data point is well below the diagonal trend of Fig. 4A, and the lack of inter-year correlation cannot be attributed to statistical noise. An even more extreme case is observed in experiment 2, where the correlation between 2012 and 2011 yields is negative (approximately −0.6).

A more detailed investigation indicated two major causes for the zero or negative inter-year correlations in yield. One cause is occasional sensor failure. Readings from the yield monitor are very noisy (Fig. 5a), even after removal of outliers (Fig. 5b). The example of Fig. 5b has a mean yield of 93 Mg ha⁻¹ and a standard deviation of 47 Mg ha⁻¹. Typical precision (sj) in yield is reduced to about 1–4 Mg ha⁻¹ because multiple readings are used to estimate uij by regression (Fig. 3). However, with such a noisy signal, some sensor failures remain unfiltered, detrimentally affecting the precision of yield estimates and reducing the associated correlations. The second cause of low inter-year correlations originates from real changes in yield, especially due to damage inflicted on the sugarcane crop. The harvest operation may damage the roots of the sugarcane plants, occasionally pulling roots out of the soil. The result of such damage is a localized yield reduction in subsequent harvests, thus reducing inter-year correlation. Such damage to roots may become visible as gaps in the sugarcane field. Indeed, field observations by agronomists suggest that crop damage during harvest is the major cause of declining yield across the multiyear crop cycles (Zamykal and Everingham, 2009).
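The relation between noisy individual readings and the much smaller uncertainty of a local yield estimate can be sketched as follows. This is a simplified illustration, not the workflow of this work: it assumes a median/MAD outlier rule (the actual filter is not specified here) and replaces the regression estimate with a plain mean, whose standard error is the standard deviation divided by the square root of the number of readings. All function names are hypothetical.

```python
import numpy as np

def filter_outliers(readings, k=3.0):
    """Keep readings within k robust standard deviations of the median (assumed rule)."""
    r = np.asarray(readings, dtype=float)
    med = np.median(r)
    scale = 1.4826 * np.median(np.abs(r - med))  # MAD rescaled to a std-like unit
    if scale == 0:
        return r.copy()  # no spread information; leave the data untouched
    return r[np.abs(r - med) <= k * scale]

def standard_error(readings):
    """Standard error of the mean of independent readings."""
    r = np.asarray(readings, dtype=float)
    return float(r.std(ddof=1) / np.sqrt(r.size))

def readings_needed(sd, target_se):
    """Number of independent readings for which sd / sqrt(n) <= target_se."""
    return int(np.ceil((sd / target_se) ** 2))
```

Under this simplification, a standard deviation of 47 Mg ha⁻¹ would require on the order of 140 readings to reach a precision of 4 Mg ha⁻¹ and about 2200 readings to reach 1 Mg ha⁻¹. Readings from a failed sensor that pass the filter would bias the estimate without being reflected in this standard error.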


Fig. 6. Loadings plot from principal component analysis of soil attributes presumably associated with soil pH (OM, Ca, Mg, and pH). Results from experiments 1–3 are shown in plots A–C. Year of measurement is represented by the inclination of the labels, as indicated in the legends. Distinct soil layers are represented by bold (0–0.20 m) and italic (either 0.20–0.40 m or 0.20–0.50 m) labels. Dotted lines are a pair of orthogonal axes rotated to have one axis aligned with the clusters of pH loadings (A and C).

Fig. 5. Readings from the yield monitor before (a) and after (b) the filter step of the analysis workflow. Data from the 2013 harvest of experiment 1.

18 C. Driemeier et al. / Computers and Electronics in Agriculture 130 (2016) 13–19

4.5. PCA of attributes associated with soil pH

The numerous soil and plant attributes exist in an abstract space of high dimensionality where direct visualization of the information is impossible. Hence, reduction of dimensionality is often desirable, and PCA is a statistical technique to achieve it. Fig. 6 presents the PC1 and PC2 loadings from PCA applied to attributes judged to be potentially associated with soil pH. The attributes are pH itself, the main elements of lime (Mg and Ca, applied to soils to increase their pH), and organic matter (OM, whose decomposition is thought to decrease soil pH). PCA was applied separately to each experiment. Analyzed attributes span sequential years as well as the two soil layers that were characterized. Our interest in soil pH is due to the possibility of pH control using lime broadcast on the soil surface with variable rate technology, which could reduce costs by applying lime prescribed according to site-specific demands.
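A minimal sketch of how PC1 and PC2 loadings of this kind can be obtained is given below. It assumes that each column of the input holds one attribute (for example, pH of one soil layer in one year) sampled at the same grid points, that attributes are standardized before PCA, and that loadings are defined as eigenvectors scaled by the square root of their eigenvalues. These conventions are assumptions and may differ from those used to produce Fig. 6. The function name is hypothetical.

```python
import numpy as np

def pca_loadings(data, n_components=2):
    """Loadings of standardized attributes on the leading principal components.

    data: array of shape (n_grid_points, n_attributes).
    Returns (loadings, eigenvalues); loadings has shape
    (n_attributes, n_components).
    """
    x = np.asarray(data, dtype=float)
    z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)  # standardize each attribute
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    eigvals = s ** 2 / (x.shape[0] - 1)  # eigenvalues of the correlation matrix
    loadings = vt.T * np.sqrt(eigvals)   # scale each eigenvector column
    return loadings[:, :n_components], eigvals[:n_components]
```

With this scaling, each loading equals the correlation between an attribute and a principal component, so attributes that are highly correlated across grid points appear as nearby points in the loadings plot.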

As a first observation, experiments 1 and 3 show loadings that are clustered (Fig. 6A and C). In each experiment, there is one cluster for pH, one for lime elements (Mg and Ca), and one for OM. Such clustering indicates high correlation between the top and bottom soil layers. The clustering also indicates that spatial variations of pH, Ca, Mg, and OM are mostly preserved over the years. This is instructive and perhaps surprising, considering that experiments 1 and 3 employed distinct strategies to control soil pH (Section 3.1). Loadings of OM are comparatively more spread out than loadings of pH, Ca, and Mg (Fig. 6A and C). This observation is consistent with noisier OM measurements as well as with significant dynamics in the spatial variability of soil OM. Experiment 2 does not present such clustering of loadings (Fig. 6B), which might be due to poorer statistics arising from fewer grid points (Fig. 2).

Experiments 1 and 3 also share the relative directions of loading clusters. The centroids of the pH clusters form an angle slightly greater than 90° with the centroids of the OM clusters (Fig. 6A and C). This relative direction indicates a tendency to have slightly negative correlations between pH and OM, consistent with decreasing pH due to higher levels of decomposing OM. Furthermore, the loading clusters of Ca and Mg are observed in between those of pH and OM (Fig. 6A and C). Proximity to the loadings of pH is consistent with Ca and Mg causing increases in soil pH. On the other hand, the proximity between loadings of OM and of lime elements (Mg and Ca) is consistent with OM acting as a storage medium for Mg and Ca available in soils. These relative directions in loadings plots demonstrate the possibility of finding common behavior across multiple sugarcane agricultural experiments.
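The angle between cluster centroids can be computed from the loadings as sketched below. The function name is hypothetical, and the input is assumed to be the (PC1, PC2) loadings of the attributes belonging to each cluster. The cosine of this angle approximates the sign and strength of the correlation between the two groups only to the extent that PC1 and PC2 capture most of the variance.

```python
import numpy as np

def centroid_angle_deg(cluster_a, cluster_b):
    """Angle in degrees between the centroids of two clusters of 2D loadings."""
    ca = np.mean(np.asarray(cluster_a, dtype=float), axis=0)
    cb = np.mean(np.asarray(cluster_b, dtype=float), axis=0)
    cos = np.dot(ca, cb) / (np.linalg.norm(ca) * np.linalg.norm(cb))
    # Clip guards against rounding slightly outside [-1, 1]
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))
```

An angle slightly above 90° gives a small negative cosine, in line with the slightly negative pH and OM correlation described above.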

5. Conclusions

This paper described a computational environment to support research in sugarcane agriculture. Data acquisition, formatting, and verification steps are performed prior to data insertion into a dedicated database named BDAgro. Data analysis is integrated with the database by recording data states generated through data analysis workflows. Such workflows can be tailored to the specific aims of each study.

The computational environment was employed to jointly analyze three experiments in sugarcane precision agriculture (PA). These experiments comprised characterization of soil attributes, sugarcane quality, and sugarcane yield. Our analysis showed low spatial autocorrelations and low inter-year correlations for most of the sugarcane quality and yield attributes. This finding revealed important limitations in measurements of sugarcane quality, while yield seems to be primarily impaired by crop damage inflicted by the mechanical harvest processes, such as ratoon damage and soil compaction. Furthermore, our multivariate analysis of features associated with soil pH revealed common behavior in two experiments, evidencing the possibility of common underlying principles to be identified across multiple sugarcane agricultural environments. Overall, these case studies demonstrate the usefulness of the computational environment for supporting research in sugarcane agriculture, which will positively impact the sustainability and competitiveness of the sugar-ethanol industry.

Acknowledgements

Project funded by CNPq and Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP grants 2011/01817-9 and 2015/01587-0).

References

Anselmi, A.A., Bredemeier, C., Federizzi, L.C., Molin, J.P., 2014. Factors related to adoption of precision agriculture technologies in southern Brazil. In: ISPA (Ed.), Proc. of the 12th International Conference on Precision Agriculture. ISPA, Sacramento, California, p. 11.

Atkins, P., Paula, J. de, 2010. Physical Chemistry. Oxford University Press, Oxford.

Avanzi, J.C., Borghi, E., Bortolon, L., Bortolon, E.S.O., Luchiari Jr., A., Inamasu, R.Y., Bernardi, A.C.C., 2014. Adoption Level of Precision Agriculture for Brazilian Farmers - 2011/2012 crop year. In: ISPA (Ed.), Proc. of the 12th International Conference on Precision Agriculture. ISPA, Sacramento, California, pp. 1–2.

Bramley, R., Trengove, S.A.M., 2013. Precision agriculture in Australia: present status and recent development. Eng. Agric. 33, 575–588.

Bramley, R.G.V., 2009. Lessons from nearly 20 years of Precision Agriculture research, development, and adoption as a guide to its appropriate application. Crop Pasture Sci. 60, 197–217. http://dx.doi.org/10.1071/CP08304.

Chen, N., Zhang, X., Wang, C., 2015. Integrated open geospatial web service enabled cyber-physical information infrastructure for precision agriculture monitoring. Comput. Electron. Agric. 111, 78–91.

Cliff, A.D., Ord, J.K., 1973. Spatial Autocorrelation. Pion, London.

CONAB, Ministério da Agricultura, Abastecimento, P., 2015. SAFRA 2015/16 - Terceiro levantamento.

Elmasri, R., Navathe, S.B., 2010. Fundamentals of Database Systems. Addison-Wesley.

FAO, Food and Agriculture Organization of the United Nations, 2015. Statistics Division – FAOSTAT [WWW Document]. URL http://faostat3.fao.org (accessed 10.1.15).

Ferreira, J., Takai, O., Malkoviski, S., Pu, C., 2010. Reducing Exception Handling Complexity in Business Process Modeling and Implementation: The WED-Flow Approach. In: Meersman, R., Dillon, T., Herrero, P. (Eds.), On the Move to Meaningful Internet Systems: OTM 2010, Part I, Hersonissos, Greece, pp. 150–167.

Garcia-Molina, H., Salem, K., 1987. Sagas. In: ACM SIGMOD International Conference on Management of Data. San Francisco, pp. 249–259.

Gebbers, R., Adamchuk, V.I., 2010. Precision agriculture and food security. Science 327, 828–831. http://dx.doi.org/10.1126/science.1183899.

Godwin, R.J., Wood, G.A.A., Taylor, J.C., Knight, S.M., Welsh, J.P., 2003. Precision farming of cereal crops: a review of a six year experiment to develop management guidelines. Biosyst. Eng. 84, 375–391. http://dx.doi.org/10.1016/S1537-5110(03)00031-X.

IRENA, 2015. Renewable energy capacity statistics 2015. Irena 44. http://dx.doi.org/10.1016/j.renene.2014.09.059.

Johnson, R.A., Wichern, D.W., 2007. Statistical Analysis. Pearson - Prentice Hall, Upper Saddle River.

Magalhães, P.S.G., Cerri, D.G.P., 2007. Yield monitoring of sugar cane. Biosyst. Eng. 96, 1–6. http://dx.doi.org/10.1016/j.biosystemseng.2006.10.002.

Marin, F.R., Jones, J.W., Royce, F., Suguitani, C., Donzeli, J.L., Filho, W.J.P., Nassif, D.S.P., 2011. Parameterization and evaluation of predictions of DSSAT/CANEGRO for Brazilian sugarcane. Agron. J. 103, 304. http://dx.doi.org/10.2134/agronj2010.0302.

Mulla, D.J., 2013. Twenty five years of remote sensing in precision agriculture: key advances and remaining knowledge gaps. Biosyst. Eng. 114, 358–371. http://dx.doi.org/10.1016/j.biosystemseng.2012.08.009.

Pontes, A.O., Ling, L.Y., Sanches, G.M., Magalhães, P.S.G., Ferreira, J.E., Driemeier, C.E., 2014. BDAgro – CTBE Database of Agricultural Experiments, Technical Memorandum, MeT 11. Campinas. Available at <https://forms.cnpem.br/formularios/prodbiblio/DB/2338/MeT%20112014.pdf>.

Portz, G., Molin, J.P., Jasper, J., 2011. Active crop sensor to detect variability of nitrogen supply and biomass on sugarcane fields. Precis. Agric. 13, 33–44. http://dx.doi.org/10.1007/s11119-011-9243-4.

van Raij, B., Cantarella, H., Quaggio, J.A., Furlani, A.M.C., 1997. Recomendações de adubação e calagem para o Estado de São Paulo. Instituto Agronômico, Fundação IAC, Campinas. 285p.

Rodrigues Jr., F.A., Magalhães, P.S.G., Franco, H.C.J., 2012. Soil attributes and leaf nitrogen estimating sugar cane quality parameters: brix, pol and fibre. Precis. Agric. 14, 270–289. http://dx.doi.org/10.1007/s11119-012-9294-1.

Silva, C.B., de Moraes, M.A.F.D., Molin, J.P., 2011. Adoption and use of precision agriculture technologies in the sugarcane industry of São Paulo state, Brazil. Precis. Agric. 12, 67–81. http://dx.doi.org/10.1007/s11119-009-9155-8.

Song, X., Wang, J., Huang, W., Liu, L., Yan, G., Pu, R., 2009. The delineation of agricultural management zones with high resolution remotely sensed data. Precis. Agric. 10, 471–487. http://dx.doi.org/10.1007/s11119-009-9108-2.

Srinivasan, A., 2006. Precision Agriculture: An overview. Handbook of Precision Agriculture, pp. 3–18.

Tagarakis, A., Liakos, V., Fountas, S., Koundouras, S., Gemtos, T.A., 2012. Management zones delineation using fuzzy clustering techniques in grapevines. Precis. Agric. 14, 18–39. http://dx.doi.org/10.1007/s11119-012-9275-4.

Webster, R., Oliver, M.A., 2007. Geostatistics for Environmental Sciences. John Wiley & Sons, Ltd. http://dx.doi.org/10.1002/9780470517277.

Whelan, B.M., McBratney, A.B., 2000. The “null hypothesis” of precision agriculture management. Precis. Agric. 2, 265–279.

Zamykal, D., Everingham, Y.L., 2009. Sugarcane and Precision Agriculture: Quantifying Variability Is Only Half the Story – A Review. In: Lichtfouse, E. (Ed.), Climate Change, Intercropping, Pest Control and Beneficial Microorganisms. Springer, Netherlands, Dordrecht, pp. 189–218. http://dx.doi.org/10.1007/978-90-481-2716-0.