Discrimination of tomato plants under different irrigation regimes: analysis of hyperspectral sensor data

M. Rinaldi a,*, A. Castrignanò b, D. De Benedetto b, D. Sollitto b, S. Ruggieri b, P. Garofalo b, F. Santoro c, B. Figorito c, S. Gualano c and R. Tamborrino b

The development and implementation of both economically and environmentally sustainable precision crop management systems can be greatly enhanced through the use of hyperspectral sensing. In this study, the potential of narrow-waveband hyperspectral observations for the discrimination of water-stressed tomato plants (Solanum lycopersicum L.) was investigated in a field experiment conducted in southern Italy. The tomato crop was grown in a 1.8-ha test field that was split into two plots with different irrigation treatments: optimal and deficit water supplies, with the deficit supply using half of the water of the optimal supply in the second half of the crop growing cycle. Hyperspectral measurements were taken with a field spectroradiometer. To reduce the number of variables, principal component analysis was applied to each of six wavelength band sub-intervals across the whole wavelength interval from 400 to 1000 nm. The retained principal components were then submitted to canonical discriminant analysis. Finally, the principal components and the canonical component were interpolated using multivariate and univariate geostatistical techniques, respectively, and then mapped. The two irrigation treatments produced different plant biomass and leaf area indices, which were higher under optimal than deficit water conditions, as was the plant water potential. These data show that the correlation between the individual bands varied during the crop cycle, so it was not feasible to choose a specific band to discriminate between the water treatments. However, we show that only a combination of all of the bands that uses the full spectral information with differential weighting leads to clear discrimination of the two differently irrigated areas, with a mean accuracy of 75% to 77%. The processing of hyperspectral reflectance data using canonical discriminant analysis can thus provide valuable information for the agricultural producer for the identification of within-field areas of plant stress, so as to implement site-specific irrigation strategies. Copyright © 2014 John Wiley & Sons, Ltd.

Keywords: hyperspectral sensor; irrigation; principal component analysis; discriminant analysis; plant water potential

1. INTRODUCTION

Water is the greatest agricultural input in many cropping systems in Mediterranean areas, where water availability is one of the main limiting factors for crop yield. The application of the correct amount of water in the right place and at the right time is a challenging target for site-specific management. The matching of the placement and time of the water supply to the actual demand of the crop is then crucial for the achievement of the optimal crop response, while also minimising the environmental impact that arises from decreasing natural water resources.

This matching of supply and demand requires adequate assessment of the water status in the agricultural field, and especially at the critical growth stages, when such management decisions can have an appreciable impact upon yield. Estimating the water content of the plants in the laboratory is complex, costly and time consuming, thus making it difficult to manage the water needs of the crop during the growing season. However, remote or proximal sensing techniques that use spectral approaches can provide for rapid identification of water stress (Strachan et al., 2002; Dobrowski et al., 2005; Tilling et al., 2007; Zygielbaum et al., 2009), which can then be related to the irrigation management at the field scale.

Spectroradiometric methods have been developed for assessing biophysical and biochemical crop parameters in agricultural monitoring and management. Spectral reflectance measures provide information on the canopy structure, quantity of biomass, chlorophyll concentrations, water contents and overall vegetative health (Asner, 1998; Zarco-Tejada et al., 2002; Clevers et al., 2005; Malenovsky et al., 2005; Stimson et al., 2005; Gualano et al., 2010). Water molecules absorb radiation in the near and middle infrared wavelengths, at about 970, 1240, 1400 and 1900 nm, and because the amount of absorption is related to the water content of the plant, these wavelengths are traditionally used as an indication of water stress (Hunt and Rock, 1989; Ceccato et al., 2001). Well-hydrated spongy mesophyll cells strongly reflect infrared wavelengths (Gates et al., 1965), and several studies have documented increased reflectance in the visible spectrum when plants are stressed (Yu et al., 2000; Ceccato et al., 2001). However, this effect has not been investigated in enough detail to allow its use in the estimation of vegetation water status. Zygielbaum et al. (2009) studied the optical properties for the detection of water-stressed and non-water-stressed maize leaves, and showed that there is a statistically significant increase in the visible spectrum reflectance measurements of the stressed leaves.

* Correspondence to: Dott. Michele Rinaldi, Consiglio per la Ricerca e la Sperimentazione in Agricoltura, Centro di Ricerca per la Cerealicoltura (CRA-CER), 71122 Foggia, Italy. E-mail: [email protected]

a Consiglio per la Ricerca e la Sperimentazione in Agricoltura, Centro di Ricerca per la Cerealicoltura (CRA-CER), Foggia, Italy

b Consiglio per la Ricerca e la Sperimentazione in Agricoltura, Unità di Ricerca per i Sistemi Colturali degli ambienti caldo-aridi (CRA-SCA), Bari, Italy

c CIHEAM MAIB, Mediterranean Agronomic Institute of Bari, Valenzano, BA, Italy

Environmetrics (2014) Copyright © 2014 John Wiley & Sons, Ltd.

Research Article Environmetrics

Received: 23 July 2013, Revised: 17 May 2014, Accepted: 29 June 2014, Published online in Wiley Online Library: (wileyonlinelibrary.com) DOI: 10.1002/env.2297

Hyperspectral data can offer tremendous advantages for the identification of the biophysical characteristics of a plant when compared with the capabilities of broadband remote sensing systems (Kumar et al., 2001). The reflectance and absorption features in narrow bands are related to specific physico-chemical characteristics of plants, such as their biochemical composition, physical structure and water content (Strachan et al., 2002). Indeed, some studies have shown that the use of hyperspectral data can provide significant improvements to the detection of plant stress (Carter, 1998), can identify small differences in percentages of green vegetation cover (McGwire et al., 2000) and crop moisture (Penuelas et al., 1995), and can assess the water contents in plant leaves (Bauer et al., 1981).

However, it must be noted that using hyperspectral data is much more complex than using multispectral data. The analysis of the large number of narrow bands from hyperspectral sensors is complex and time consuming, and it needs new algorithms for data processing, to select an optimum sub-set of bands for any required study (Ray et al., 2006). Hyperspectral sensors also collect large volumes of data in short times, which can cause several technical problems. Therefore, a way to overcome these challenges might be the determination of the optimal wavebands, to define those which better characterise and discriminate plants under different water conditions.

Discriminant analysis offers great potential for separating out different water application rates and eventually the identification of areas of the plant canopy that are under ecophysiological stress from various sources, at both the leaf and canopy levels (Strachan et al., 2002).

The objectives of this study were thus to apply canonical discriminant analysis to hyperspectral data to detect and define the effects of different water application programmes on a tomato crop, and to determine if these data can be effectively used to discriminate between irrigation treatments throughout the growing season.

2. MATERIALS AND METHODS

2.1. Site and soil characterisation

The experimental site was located in an agricultural area of approximately 700 km² (Capitanata Plain) in southern Italy (41°30′N, 15°33′E; 102.9 m a.s.l.). The area has a flat topography and is mainly devoted to wheat, tomato and sugar beet cultivation, as arable crops.

The climate of the experimental site is defined as ‘accentuated thermo-Mediterranean’ (FAO-UNESCO, 1963), with minimum temperatures below 0 °C in winter and maximum temperatures above 40 °C in summer. The annual rainfall is mostly concentrated during the winter months, with a mean over a historical 50-year dataset of 550 mm. The mean class ‘A’ pan evaporation exceeds 10 mm day⁻¹ in summer, based on the means of the daily maximum values recorded in July and August.

The experimental field comprised 1.8 ha, as approximately 360 m × 50 m. The soil was silty, and the field surface included a slight depression (maximum, 2 m) to the south-east (De Benedetto et al., 2013).

2.2. Crop irrigation treatments and measurements

The experimental field at the ‘Antonio Forte’ commercial farm was planted with processing tomatoes, according to the usual crop management of the area. The crop was transplanted on 15 May, 2010, with a density of 3 plants m⁻² in a twin-row arrangement (1.80 m apart); the crop was harvested on 7 September, 2010. The previous crop was winter cabbage. The soil was ploughed to a depth of 35 cm, and then shallow tillage was carried out, to prepare for the bed-sowing.

Irrigation management was carried out with a fixed water supply (30 mm) every 5–6 days during the first month, and every 2–3 days in the second and third months of the crop cycle, according to the usual irrigation criteria used by farmers. To differentiate the irrigation treatments, the field was split into two similar blocks (180 m × 50 m) for the optimal and deficit irrigation: from 15 July, 2010, to 1 week before harvest, the irrigation in the deficit block was scheduled for the same days as the optimal water treatment but with half the amount of water. A drip irrigation method was used with 2 L h⁻¹ emitters spaced 0.4 m apart along the irrigation line. Water meters were positioned at the head of the irrigation system to record the water supplied.

Plant measurements were carried out in the field every 2 weeks. In particular, in each block, the following variables were measured:

- Phenology: times of the main phenological stages, according to the Biologische Bundesanstalt, Bundessortenamt und Chemische Industrie (BBCH) scale (BBA, 2001);

- Leaf area index (LAI): non-destructive measurements were made at six georeferenced locations in each block using a LiCOR LAI 2000 plant canopy analyser, which measures the blue light (320–490 nm) in five concentric cones (with a 148° field of view), and were then averaged;

- Plant biomass: the fresh biomass and dry matter (oven at 72 °C until constant weight) of three individual plants per block were measured separately for fruits and plants;

- Plant water potential: a Scholander pressure bomb (Scholander et al., 1965) was used to measure the ‘pre-dawn’ plant water potential (expressed in bar) of six randomly selected plants per block.

For the LAI and plant biomass measurements, only the sampling dates of 16 and 24 August, 2010, were considered in this analysis, which were those closest to the hyperspectral measurement dates.


2.3. Hyperspectral measurements

A Fieldspec hand-held spectrometer (Analytical Spectral Devices, Inc., Boulder, CO, USA) was used to measure the spectral reflectance of the tomato plants. This spectrometer has a wavelength interval of 325 to 1075 nm, with a sampling interval of 1.6 nm, and a spectral resolution of approximately 3 nm at around 700 nm. The monitored plants in the two blocks under the optimal and deficit irrigation treatments were located using a global positioning system (Leica 1250 RTK-DGPS system) before the acquisition of the spectral signatures, to georeference the target points with 1-cm (planimetric and altimetric) accuracy. Signal stakes were placed at the nodes of a regular sampling grid (5 m × 10 m unit cell) for the identification of the plants to be measured. A total of 310 locations were surveyed that were evenly distributed in the two optimal and deficit irrigation blocks (Figure 1).

During the summer of 2010, the reflectance measurements were carried out three times (9, 13 and 19 August), with the acquisition of 310 spectral signatures at each recording. These three recording dates were chosen because the plants were well differentiated under the two water conditions. All 930 reflectance curves of the tomato plants were acquired at the nadir, with a field of view of 25° and at about 1.3 m in height from the ground. Under these conditions, the footprint of the spectrometer is a circle of 0.25 m², and this was sufficient to cover the vegetation canopy and to reduce background effects (e.g., from ground and water). All of the measurements were acquired between 09:30 and 13:30 (local time), with clear sky conditions and with an optimal integration time of the spectrometer (68 ms). Before the acquisition of the spectral signatures, a calibration panel with around 100% reflectance (50 cm side; Zenith Ultrawhite reflectance targets on aluminium sandwich) was used to balance any variations in the atmospheric and solar irradiation during the measurements. Despite the optimal instrumental, geometric and lighting conditions (Milton et al., 2009), some small instrumental noise was observed at the lowest and highest wavelengths of the reflectance curves (<400 and >1070 nm, respectively).
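The quoted footprint can be checked from the viewing geometry alone. The sketch below is illustrative and not part of the original study: it assumes a conical field of view looking straight down and takes the 1.3 m height as the sensor-to-target distance. The function name `footprint_area` is hypothetical.

```python
import math

def footprint_area(height_m: float, fov_deg: float) -> float:
    """Area (m^2) of the circular nadir footprint of a sensor with a conical field of view."""
    # Radius of the viewed circle: height times the tangent of the half-angle.
    radius = height_m * math.tan(math.radians(fov_deg / 2.0))
    return math.pi * radius ** 2

# Values taken from the text: 25 degree field of view, about 1.3 m above the ground.
area = footprint_area(1.3, 25.0)
print(f"footprint area: {area:.2f} m^2")
```

Under these assumptions the result is close to the 0.25 m² reported in the text. If the height is measured from the ground and the canopy top sits above it, the footprint at canopy level would be somewhat smaller.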

2.4. Data analysis

To reduce the noise, the reflectance data analysis was restricted to 600 spectral wavelengths from 400 to 1000 nm. The hyperspectral data were then aggregated into six band intervals of the electromagnetic spectrum, which were chosen on the basis of their sensitivities to a particular feature of the ground: coastal-blue (400–510 nm), green (510–580 nm), yellow (580–630 nm), red (630–690 nm), red-edge (705–770 nm) and near infrared (NIR; 770–1000 nm) (Carter, 1993; Filella and Peñuelas, 1994; Sims and Gamon, 2002).

As previous studies have shown that neighbouring wavebands can frequently provide similar information, and hence be redundant (Broge and Leblanc, 2000; Thenkabail et al., 2004), a multivariate approach was used to reduce the spectral narrow bands to a few new components. At each of the three recording dates, principal component analysis (PCA) was performed on the Fieldspec hyperspectral data, separately for each of the six band intervals, to linearly transform the original set of reflectance data into a new set of uncorrelated factors, for a total of 18 factors (six per date). The PCA approach was implemented using the FACTOR procedure of the SAS/STAT software package (SAS® 9.2, 2009). Only the principal components with eigenvalues >1 were retained for further analysis.
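The per-interval reduction can be sketched as follows. This is a minimal NumPy illustration, not the SAS FACTOR procedure used in the study; it assumes the PCA is run on the correlation matrix of the wavelengths within one band interval, and the reflectance values are synthetic. The function name `retained_components` is hypothetical.

```python
import numpy as np

def retained_components(reflectance: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """PCA on the correlation matrix of one band interval; keep eigenvalues > 1.

    reflectance: array of shape (n_samples, n_wavelengths).
    Returns the scores of the retained components and the fraction of
    variance each retained component explains.
    """
    # Standardise each wavelength so that the analysis uses correlations.
    z = (reflectance - reflectance.mean(axis=0)) / reflectance.std(axis=0, ddof=1)
    corr = np.corrcoef(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]            # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                         # eigenvalue-greater-than-one rule
    return z @ eigvecs[:, keep], eigvals[keep] / eigvals.sum()

# Synthetic stand-in for one interval: 310 spectra, 70 strongly correlated wavelengths.
rng = np.random.default_rng(0)
common = rng.normal(size=(310, 1))
green = 0.10 + 0.02 * common + 0.001 * rng.normal(size=(310, 70))
scores, explained = retained_components(green)
print(scores.shape, explained)
```

Because neighbouring wavelengths are almost collinear in this synthetic example, a single component passes the threshold, which mirrors the situation described for real narrow-band data.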

2.4.1. Geostatistical analysis

The PCA data were analysed using multivariate geostatistical methods, which require the modelling of the multivariate spatial covariance structure of the dataset through a linear model of co-regionalisation (LMC).

The LMC was developed by Journel and Huijbregts (1978), and it considers all of the study variables as a function of the same independent physical processes that act at different spatial scales $u$. The $n(n+1)/2$ simple and cross-semivariograms of the $n$ variables are modelled through a linear combination of $N_S$ standardised semivariograms of unit sill, $g_u(h)$. Using matrix notation, the LMC can be written as follows:

\[
\Gamma(h) = \sum_{u=1}^{N_S} B_u \, g_u(h) \tag{1}
\]

where $\Gamma(h) = [\gamma_{ij}(h)]$ is a symmetric matrix of order $n \times n$, the diagonal and non-diagonal elements of which represent simple and cross-semivariograms, respectively; and $B_u = [b^u_{ij}]$ is called the co-regionalisation matrix and is a symmetric positive semi-definite matrix of order $n \times n$ with real elements $b^u_{ij}$ at a specific spatial scale $u$, which represent the sills of the variograms. The functions $g_u(h)$ must be authorised semivariogram models, that is, mathematical functions that ensure non-negative variances (Castrignanò et al., 2000; Webster and Oliver, 2007).

In practice, a set of normalised variograms $g_u(h)$ was selected in terms of the type of mathematical model and range, taking care to keep $N_S$ reasonably small, on the basis of visual inspection of the experimental variograms. Then, the co-regionalisation matrices $B_u$ were fitted using a constrained weighted least-squares routine (Lajaunie and Behaxeteguy, 1989), which integrated the constraint that each $B_u$ be positive semi-definite into an automated fit for the set of predefined spatial structures ($N_S$).
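To make the structure of the LMC concrete, the following sketch evaluates the matrix of simple and cross-semivariograms for two variables and two basic structures (a nugget effect and a spherical model). The co-regionalisation matrices are invented for illustration and are not the fitted values of the study; only the form of the model is taken from the text.

```python
import numpy as np

def spherical(h, a):
    """Unit-sill spherical semivariogram with range a."""
    r = np.minimum(np.asarray(h, dtype=float) / a, 1.0)
    return 1.5 * r - 0.5 * r ** 3

def nugget(h):
    """Unit-sill nugget effect: 0 at lag zero, 1 at any positive lag."""
    return (np.asarray(h, dtype=float) > 0).astype(float)

def lmc(h, B_list, g_list):
    """Sum over structures of B_u * g_u(h); returns an array of shape (len(h), n, n)."""
    h = np.atleast_1d(np.asarray(h, dtype=float))
    return sum(g(h)[:, None, None] * B[None, :, :] for B, g in zip(B_list, g_list))

# Hypothetical co-regionalisation matrices (sills) for two variables.
B_nug = np.array([[0.20, 0.05], [0.05, 0.30]])
B_sph = np.array([[0.80, 0.50], [0.50, 0.70]])

# Each matrix must be positive semi-definite for the model to be admissible.
psd = [bool(np.all(np.linalg.eigvalsh(B) >= -1e-12)) for B in (B_nug, B_sph)]

lags = [0.0, 16.0, 32.0, 60.0]
gamma = lmc(lags, [B_nug, B_sph], [nugget, lambda h: spherical(h, 32.0)])
print(psd)
print(gamma[2])   # at and beyond the range, the total sill matrix B_nug + B_sph
```

The diagonal entries of each `gamma[k]` are the simple semivariograms and the off-diagonal entries are the cross-semivariograms at lag `lags[k]`. Checking the eigenvalues of each matrix is the condition that the constrained fitting routine enforces automatically.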

Figure 1. Grid of sampling points of hyperspectral (HS) measurements


2.4.2. Interpolation procedures

The individual variables were interpolated at the nodes of a 0.5 m × 0.5 m cell grid, using ordinary co-kriging, which assumes the local mean of each variable to be a constant but unknown value. This estimates the unknown value of each variable at the unsampled location of an interpolation grid as a linear combination of neighbouring observations of the variables (Wackernagel, 2003).
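The idea of predicting at an unsampled node as a weighted sum of neighbouring observations can be illustrated with the univariate case. The sketch below implements ordinary kriging for a single variable, which is a simplification of the co-kriging used here: co-kriging enlarges the same linear system with the cross-semivariograms of the LMC. The coordinates, values and variogram parameters are invented for illustration.

```python
import numpy as np

def spherical(h, sill, a, nugget=0.0):
    """Spherical semivariogram; zero at lag zero."""
    h = np.asarray(h, dtype=float)
    r = np.minimum(h / a, 1.0)
    return np.where(h > 0, nugget + sill * (1.5 * r - 0.5 * r ** 3), 0.0)

def ordinary_kriging(xy, z, target, variogram):
    """Return (prediction, kriging variance, weights) at a single target point."""
    n = len(z)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    # Kriging system in semivariogram form, with a Lagrange multiplier
    # that forces the weights to sum to one (unknown constant mean).
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(d)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(xy - target, axis=1))
    sol = np.linalg.solve(A, b)
    weights, lagrange = sol[:n], sol[n]
    return weights @ z, weights @ b[:n] + lagrange, weights

# Four hypothetical observations at the corners of one 5 m x 10 m grid cell.
xy = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 10.0], [5.0, 10.0]])
z = np.array([1.0, 2.0, 1.5, 2.5])
model = lambda h: spherical(h, sill=1.0, a=32.0)
pred, var, weights = ordinary_kriging(xy, z, np.array([2.5, 5.0]), model)
print(pred, var, weights)
```

At the cell centre the four corners are equidistant, so they receive equal weights and the prediction is their average. At a sampled location with no nugget effect, the predictor returns the observed value with zero kriging variance.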

2.4.3. Discriminant analysis

Discriminant analysis is a multivariate statistical technique that is commonly used as a powerful classification approach for data mining, as it uses multiple quantitative attributes to discriminate single classification variables. A discriminant model, also known as a classification criterion, is determined by a measure of generalised squared distance (Rao, 1973). The classification criterion can be based on either the individual within-group covariance matrices or the pooled covariance matrix; it also accounts for the prior probabilities of the groups. Each observation is placed in the class from which it has the smallest generalised squared distance (D), calculated for each group (j) according to the following formula:

\[
D_j^2(X) = (X - \bar{X}_j)' \, \mathrm{COV}^{-1} \, (X - \bar{X}_j) \tag{2}
\]

where $X$ is the vector of multivariate observations for a given pixel of the map, $\bar{X}_j$ is the vector of the means of the variates for treatment $j$ and COV is the covariance matrix.
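As a small worked illustration of Equation (2), the sketch below assigns an observation to the group with the smallest squared distance. It assumes a pooled covariance matrix and equal prior probabilities, so only the quadratic form of Equation (2) is evaluated; the group means, covariance and observation are invented values, not data from the study.

```python
import numpy as np

def squared_distance(x, mean, cov):
    """Quadratic form (x - mean)' cov^{-1} (x - mean) of Equation (2)."""
    diff = np.asarray(x, dtype=float) - np.asarray(mean, dtype=float)
    # Solve the linear system instead of inverting the covariance explicitly.
    return float(diff @ np.linalg.solve(cov, diff))

def classify(x, means, cov):
    """Return the label of the nearest group and all squared distances."""
    d2 = {label: squared_distance(x, m, cov) for label, m in means.items()}
    return min(d2, key=d2.get), d2

# Hypothetical group means of two principal-component scores.
means = {"optimal": [1.0, 0.0], "deficit": [-1.0, 0.0]}
pooled_cov = np.array([[1.0, 0.5], [0.5, 2.0]])

label, distances = classify([0.8, 0.1], means, pooled_cov)
print(label, distances)
```

When separate within-group covariance matrices or unequal priors are used, additional terms enter the criterion; those are left out of this sketch.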

In the present study, discriminant analysis was used as follows: (i) to investigate the radiometric differences between the two irrigation treatments; (ii) to discriminate the treatments effectively; and (iii) to identify important discriminating variable(s) to be mapped, so as to delineate the vegetation areas under water stress.

The discriminant analysis consisted of the following six steps:

1. The assumption of multivariate normality within each treatment was checked by estimating the skewness and kurtosis, and graphically by drawing the quantile–quantile plots of the expected standardised Gaussian and observed distributions of the multi-attribute residuals.

2. Bartlett’s modification of the likelihood ratio test (Anderson, 1984) was performed to test the homogeneity of the within-group covariance matrices for using the pooled covariance matrix.

3. Preliminary analyses assuming independence using ANOVA and MANOVA were carried out on the data. The former was used to test the hypothesis that the class means for each variable are equal, while the latter was used to compare multivariate means across several variables. Four multivariate statistical tests were used: Wilks’ lambda, Pillai’s trace, Hotelling–Lawley trace and Roy’s maximum root (Morrison, 1976). Further confirmation is required from a geostatistical analysis.

4. Canonical discriminant analysis was performed to extract one (number of groups [2] minus 1) linear combination of the quantitative variables (reflectances) that best revealed the differences between the groups. The extracted canonical variable has the highest possible multiple correlation with the groups and is called the first canonical variable or canonical component. The standardised discriminant function coefficients indicate the partial contribution of each variable to the discriminant function, and these structural loadings are commonly used to interpret the meaning of the canonical variable.

5. Classification accuracy was assessed using the error matrix calculated in a cross-validation test (Lachenbruch and Mickey, 1968). In cross-validation, (n − 1) out of n training observations in the calibration sample are treated as the training set. The discriminant function is determined on the (n − 1) observations and then applied to classify the one observation left out. This is repeated for each of the n training observations. Error matrices were generated for the different dates, with each cell reporting the number of observations and the respective percentage over the whole population of observations. The overall accuracies were calculated based on the correctly classified pixels along the diagonal of the error matrix. The group-specific error-count estimate was also calculated as the proportion of misclassified observations in the group.

6. The extracted canonical variable scores were interpolated using the univariate geostatistical technique of kriging, and the canonical variable was mapped over the field to aid the visual interpretation of the radiometric differences between the groups.
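Steps 4 and 5 above can be sketched together for the two-group case. The code below is an illustration under stated assumptions rather than the procedure run in the study: it uses the pooled covariance matrix and equal priors (a linear rule), whereas separate within-group covariances may be required when homogeneity is rejected. The six input variables are synthetic stand-ins for the principal components, and the function names are hypothetical.

```python
import numpy as np

def fisher_direction(X, y):
    """Canonical (Fisher) discriminant direction and cut-off for two groups."""
    X0, X1 = X[y == 0], X[y == 1]
    pooled = ((len(X0) - 1) * np.cov(X0, rowvar=False)
              + (len(X1) - 1) * np.cov(X1, rowvar=False)) / (len(X) - 2)
    w = np.linalg.solve(pooled, X1.mean(axis=0) - X0.mean(axis=0))
    # Cut-off halfway between the projected group means (equal priors).
    midpoint = 0.5 * (X0.mean(axis=0) + X1.mean(axis=0)) @ w
    return w, midpoint

def loo_error_matrix(X, y):
    """Leave-one-out error matrix: rows are true groups, columns predicted groups."""
    matrix = np.zeros((2, 2), dtype=int)
    for i in range(len(y)):
        mask = np.arange(len(y)) != i          # train on the other n - 1 observations
        w, mid = fisher_direction(X[mask], y[mask])
        predicted = int(X[i] @ w > mid)        # classify the observation left out
        matrix[y[i], predicted] += 1
    return matrix

# Synthetic data: two groups of 60 observations on six variables.
rng = np.random.default_rng(1)
n = 60
X = np.vstack([rng.normal(0.0, 1.0, size=(n, 6)),
               rng.normal(0.6, 1.0, size=(n, 6))])
y = np.repeat([0, 1], n)

errors = loo_error_matrix(X, y)
accuracy = np.trace(errors) / errors.sum()     # correctly classified share on the diagonal
group_error = 1.0 - np.diag(errors) / errors.sum(axis=1)
print(errors, accuracy, group_error)
```

The canonical scores `X @ w` obtained from the full dataset are the values that would then be kriged and mapped in step 6.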

3. RESULTS AND DISCUSSION

3.1. Agronomic traits

The irrigation water supply was uniform for the two blocks until 14 July, 2010 (as days after transplanting: DAT 75); rainfall (25 mm) occurred at the end of July (DAT 82–83; Figure 2), which also influenced the tomato growth and resulted in the two treatments being very similar at that stage. Subsequently, the differences became clearer, according to the experimental irrigation schedules. At harvest, the seasonal irrigation volumes were 539 mm for the optimal irrigation and 429 mm for the deficit irrigation. The evapotranspiration deficit (i.e., sum of the daily potential evapotranspiration [ET0] minus [irrigation + rainfall]) was very similar for the two blocks until DAT 83, and then it was very different afterwards, with positive values with the optimal irrigation treatment and negative values with the deficit irrigation treatment (Figure 2). At the three Fieldspec measurement dates (9, 13 and 19 August, 2010), the gap between the optimal and deficit irrigation treatments in terms of evapotranspiration deficit ranged from 75 mm (DAT 91) to 92 mm (DAT 101).

The plant growth data at the two sampling dates closest to the Fieldspec measurement dates showed that the shortage in the water supply reduced the plant growth (fresh biomass) and dry matter. In general, the plants under deficit irrigation were slightly smaller and less hydrated than the optimal irrigation plants. The plant dry matter content indicated a better water status of the optimal irrigation plants compared with those under deficit irrigation (Table 1).


The six measurements of the ‘pre-dawn’ plant water potential clearly highlighted the differences between the two irrigation treatments (plant water potential: optimal irrigation, −3 to −2 bar; deficit irrigation, −4.5 to −3 bar). These differences increased over the irrigation period. The plant growth during the period of the crop cycle when the irrigation treatments were different is shown in Figure 3. The superiority of the optimal irrigation treatment over the deficit irrigation treatment is evident for both the aboveground dry matter yield and the LAI, despite the high variability of the samples.

Figure 2. Cumulated evapotranspiration (ETc), rainfall and irrigation supplies as a function of the two irrigation treatments (Optimal, OP and Deficit, DE) on tomatoes

Table 1. Mean plant characteristics as a function of the two irrigation regimes, from the pooling together of the 6 and 24 August, 2010, sampling data

| Irrigation treatment | Total fresh plant biomass (kg m⁻²) | Total dry plant biomass (kg m⁻²) | Total plant dry matter (%) | Leaf area index (m² m⁻²) |
|---|---|---|---|---|
| Optimal | 110.433 ± 21.159 | 9.152 ± 1.792 | 8.13 ± 0.51 | 4.18 ± 0.65 |
| Deficit | 63.212 ± 9.492 | 6.590 ± 1.082 | 10.41 ± 0.18 | 3.02 ± 0.72 |

Data are means ± standard deviations.

Figure 3. Aboveground dry matter (top) and leaf area index (bottom) as a function of the two irrigation treatments (Optimal, continuous line and Deficit, dotted line) on tomatoes. The vertical bars indicate the standard deviations (n = 6)


3.2. Radiometric data analysis

As only the principal components with eigenvalues >1 were retained for further analysis, only the first component was used for each of the six band intervals, and this explained >95% of the corresponding interval variance, on average. Before proceeding to the multivariate analysis, because of the different spatial structure properties shown by the principal components, the multivariate principal component dataset was split into two groups. One group included the coastal-blue, yellow and red principal components, the variogram maps for which were isotropic (not shown), and the other group included the green, red-edge and NIR principal components, the variogram maps for which showed clear zonal anisotropy, which was characterised by longer continuity along the longitudinal axis of the field (97° from the North; N97°).

An isotropic LMC was fitted to all of the variograms of the first group of principal components, including a nugget effect and a spherical model with range = 32.00 m as the basic spatial structures. A zonal anisotropic LMC was fitted to the variograms of the second group of principal components, including a nugget effect, an isotropic spherical model with range = 40.00 m, a directional spherical model with range = 53.00 m along the N7° direction, and a directional Bessel K model with range = 150.00 m and shape parameter = 1.00 along the N97° direction.

The ANOVA showed that the two treatments were significantly different on all of the three recording dates for the green, red-edge and NIR principal components, whereas they were never differentiated by the coastal-blue and red principal components at the probability level p < 0.01 (Table 2). On the last recording date only, the yellow principal component did not differentiate the two groups of plants.

These results describe the overall behaviour of the differently watered plants across the various spectral intervals, and they rest on the assumption that the observations are independent. However, if the test statistic does not allow for autocorrelation, it might be too large and the probability values too small, so that the null hypothesis (of no difference) might be rejected more often than it should be. Type I errors would therefore tend to increase with spatial dependence (Schabenberger and Gotway, 2005). What site-specific management requires is local information, such as that provided by the co-kriged maps of the principal components (Figures 4–7). These maps show that the separation between the two blocks is not always clear, and that the individual plant responses vary with band and time. The individual bands therefore cannot fully discriminate the plants under the different water conditions over the crop season.

The MANOVA was performed with different test statistics, all of which showed high statistical significance on all three recording dates, indicating that the two blocks had different overall radiometric behaviours (Table 3). The multivariate approach is thus to be preferred for the discrimination between the optimal and deficit irrigation treatments. However, all of the previous considerations regarding the spatial dependence of the observations also hold here. A multivariate geostatistical approach is therefore expected to discriminate between the two treatments more effectively and locally.

The multivariate normality of the principal component dataset for each of these treatments was assumed on the basis of the values of skewness and kurtosis, and on the quantile–quantile plots for all of the principal components on the three recording dates.

At a probability level of p < 0.05, Bartlett's test confirmed the homogeneity of the within-group variances only for the last measurement date (Table 4), so the pooled variance was used in the corresponding discriminant analysis. For the other dates, the discriminant analysis was performed using the separately calculated within-group variances. The correlations of the single canonical variable on the three recording dates (Table 5) are significantly different from zero and much higher than those of the individual bands (Table 2), which confirms the benefits of a multivariate approach.
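
The practical difference between the two settings can be sketched as follows. This is a generic normal-theory classification rule written for illustration, not the statistical software used for the analysis: with a pooled covariance matrix the rule is linear, and with separate within-group matrices it is quadratic. Equal prior probabilities, the function name and the synthetic training data are assumptions.

```python
import numpy as np

def classify(x, groups, pooled):
    """Assign observation x to the group with the highest normal log-density.

    groups: dict mapping a label to an (n_i, p) array of training observations.
    pooled=True uses one pooled within-group covariance matrix (linear rule);
    pooled=False uses a separate covariance matrix per group (quadratic rule).
    Equal prior probabilities are assumed.
    """
    covs = {g: np.cov(obs, rowvar=False) for g, obs in groups.items()}
    if pooled:
        dof = sum(len(obs) - 1 for obs in groups.values())
        pooled_cov = sum((len(obs) - 1) * covs[g] for g, obs in groups.items()) / dof
        covs = {g: pooled_cov for g in groups}
    best_label, best_score = None, -np.inf
    for g, obs in groups.items():
        diff = x - obs.mean(axis=0)
        _, logdet = np.linalg.slogdet(covs[g])
        # Log-density up to a constant; the log-determinant cancels when pooled.
        score = -0.5 * (logdet + diff @ np.linalg.solve(covs[g], diff))
        if score > best_score:
            best_label, best_score = g, score
    return best_label

# Synthetic stand-in for six principal components in two blocks of plants.
rng = np.random.default_rng(1)
training = {
    "optimal": rng.normal(0.0, 1.0, size=(150, 6)),
    "deficit": rng.normal(3.0, 1.0, size=(160, 6)),
}
print(classify(np.zeros(6), training, pooled=True))
print(classify(np.full(6, 3.0), training, pooled=False))
```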

Table 2. The ANOVA statistics for the three recording dates

Recording date   Electromagnetic spectrum band interval   R squared   F value   Probability

9 August         Coastal-blue                             0.0208       6.56     0.0109
                 Green                                    0.1024      35.15     <0.0001
                 Yellow                                   0.0690      22.81     <0.0001
                 Red                                      0.0190       5.96     0.0152
                 Red-edge                                 0.0663      21.85     <0.0001
                 Near infrared (NIR)                      0.0248       7.84     0.0054
                 Mean R squared                           0.0503

13 August        Coastal-blue                             0.0013       0.41     0.5218
                 Green                                    0.0769      25.65     <0.0001
                 Yellow                                   0.0233       7.34     0.0071
                 Red                                      0.0002       0.06     0.8146
                 Red-edge                                 0.1377      49.18     <0.0001
                 NIR                                      0.0766      25.57     <0.0001
                 Mean R squared                           0.0526

19 August        Coastal-blue                             0.0004       0.13     0.7236
                 Green                                    0.0569      18.59     <0.0001
                 Yellow                                   0.0113       3.52     0.0616
                 Red                                      0.0008       0.26     0.6128
                 Red-edge                                 0.1078      37.21     <0.0001
                 NIR                                      0.0533      17.35     <0.0001
                 Mean R squared                           0.0384
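
The R squared and F values in Table 2 are linked by the standard one-way ANOVA identity, which the sketch below checks. It assumes that each ANOVA used the 310 observations (150 optimal and 160 deficit) that appear as row totals in Table 7; the function name is hypothetical.

```python
def f_from_r_squared(r_squared, n_obs, n_groups=2):
    """F statistic of a one-way ANOVA recovered from its R squared."""
    df_between = n_groups - 1
    df_within = n_obs - n_groups
    return (r_squared / df_between) / ((1.0 - r_squared) / df_within)

N_OBS = 150 + 160  # assumed sample size, taken from the row totals of Table 7

# R squared values for 9 August from Table 2.
for band, r2 in [("Coastal-blue", 0.0208), ("Green", 0.1024), ("Red-edge", 0.0663)]:
    print(band, round(f_from_r_squared(r2, N_OBS), 2))
```

The recovered values agree with the tabulated F values to within the rounding of R squared, which also shows how small an R squared can still be highly significant with about 300 observations.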

Figure 4. Maps of co-kriged estimate values of yellow principal components (PCs) on the three dates (9 August, top left; 13 August, top right; 19 August, bottom)

Figure 5. Maps of co-kriged estimate values of green principal components (PCs) on the three dates (9 August, top left; 13 August, top right; 19 August, bottom)

Figure 6. Maps of co-kriged estimate values of red-edge principal components (PCs) on the three dates (9 August, top left; 13 August, top right; 19 August, bottom)

Table 6 shows the standardised coefficients of the canonical variable on the three recording dates. On 9 August, the highest weight is that of the green principal component, which is contrasted by the negative coefficients of the coastal-blue, red and NIR principal components. On 13 August, the yellow and red-edge principal components have the highest weights and are contrasted by the negative coefficients of the red, NIR and green principal components. Finally, on 19 August, the green and red-edge principal component weights are large and negative.

Figure 7. Maps of co-kriged estimate values of near infrared (NIR) principal components (PCs) on the three dates (9 August, top left; 13 August, top right; 19 August, bottom)

Table 3. The MANOVA results for the three recording dates

Recording date   Statistic                Value    F value   Pr > F

9 August         Wilks’ lambda            0.6814   23.61     <0.0001
                 Pillai’s trace           0.3186   23.61     <0.0001
                 Hotelling–Lawley trace   0.4675   23.61     <0.0001
                 Roy’s greatest root      0.4675   23.61     <0.0001

13 August        Wilks’ lambda            0.6279   29.93     <0.0001
                 Pillai’s trace           0.3721   29.93     <0.0001
                 Hotelling–Lawley trace   0.5927   29.93     <0.0001
                 Roy’s greatest root      0.5927   29.93     <0.0001

19 August        Wilks’ lambda            0.6899   22.70     <0.0001
                 Pillai’s trace           0.3101   22.70     <0.0001
                 Hotelling–Lawley trace   0.4495   22.70     <0.0001
                 Roy’s greatest root      0.4495   22.70     <0.0001
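
With only two groups there is a single non-zero eigenvalue, so the four MANOVA statistics are functions of one another and lead to the same exact F value. The sketch below reproduces the entries of Table 3 from Wilks' lambda alone, assuming 310 observations (the row totals of Table 7) and six principal components; the function name is hypothetical.

```python
def two_group_manova(wilks_lambda, n_obs, n_vars):
    """Remaining MANOVA statistics for two groups, derived from Wilks' lambda."""
    eigenvalue = (1.0 - wilks_lambda) / wilks_lambda  # the single non-zero root
    return {
        "pillai": eigenvalue / (1.0 + eigenvalue),
        "hotelling_lawley": eigenvalue,
        "roy": eigenvalue,
        # Exact F with n_vars and (n_obs - n_vars - 1) degrees of freedom.
        "f_value": eigenvalue * (n_obs - n_vars - 1) / n_vars,
    }

N_OBS, N_VARS = 310, 6  # assumed: 150 + 160 observations, six principal components
for date, wilks in [("9 August", 0.6814), ("13 August", 0.6279), ("19 August", 0.6899)]:
    stats = two_group_manova(wilks, N_OBS, N_VARS)
    print(date, {name: round(value, 4) for name, value in stats.items()})
```

This is why the four tests in Table 3 report identical F values on each date.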

Table 4. Bartlett’s test results for the three recording dates

Recording date   Chi-squared   Probability

9 August         53.747343     0.0001
13 August        35.520905     0.0247
19 August        20.389355     0.4967

Table 5. Canonical correlation coefficients and approximate standard errors for the three recording dates

Recording date   Canonical correlation   Approx. standard error

9 August         0.564                   0.039
13 August        0.610                   0.036
19 August        0.557                   0.039
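
For two groups, the squared canonical correlation equals one minus Wilks' lambda (Table 3). The sketch below recovers the entries of Table 5 under that identity. The standard error uses a common large-sample approximation, (1 − r²)/√(n − 1), which is an assumption on our part, as is the sample size of 310.

```python
import math

def canonical_correlation(wilks_lambda):
    """Canonical correlation of the single canonical variable (two groups)."""
    return math.sqrt(1.0 - wilks_lambda)

def approx_standard_error(r, n_obs):
    """Large-sample approximation to the standard error of r (assumed formula)."""
    return (1.0 - r ** 2) / math.sqrt(n_obs - 1)

for date, wilks in [("9 August", 0.6814), ("13 August", 0.6279), ("19 August", 0.6899)]:
    r = canonical_correlation(wilks)
    print(date, round(r, 3), round(approx_standard_error(r, 310), 3))
```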

The previous data showed that the relationships among the different bands are not consistent over time, although the green band and, in the last two measurements, the red-edge band are generally more effective for discrimination between the optimal and deficit irrigation treatments. The error matrix (Table 7) shows that the discrimination between the optimal and deficit irrigation treatments is satisfactory: the overall accuracy ranges from 75% to 79%, so at least 75% of the data were correctly classified on every date. However, the discrimination capability of the equipment decreased over the crop season, with a tendency for the deficit irrigation error to increase, which is likely to be due to the larger variability in the water-stressed (deficit) block.

At this point, it needs to be noted that the whole discriminant analysis was performed under the assumption of spatial independence of the observations, and Equation (2) gives the multivariate covariance matrix of the studied variables (COV). The question is as follows: can this covariance structure be considered independent of the spatial correlation? The answer is positive only when the multivariate correlation is ‘intrinsic’ (Wackernagel, 2003), which means that the multivariate covariance model can be described by the traditional variance–covariance matrix multiplied by a spatial correlation function that is the same for all of the variables. If we consider the LMCs fitted to the variograms of the first and second groups of principal components, at least the isotropic spatial component is modelled by the same spherical structure, with a range of 32 to 40 m. This leads us to assert that the method used earlier can be assumed to be reliable even in the presence of spatial correlation. In any case, the analysis might be improved by replacing the variance–covariance matrix in Equation (2) with a full multivariate covariance model that describes both the relationships between the variables and the relationships between the points in space. This can be accomplished by using a fitted LMC of the set of spectral variables under study.

However, what is needed from the perspective of precision irrigation is not to discriminate the two blocks of plants as a whole, but rather to differentiate the plants that are under different water conditions at a local scale. Therefore, to map the canonical variable in geographical space, an anisotropic LMC was fitted to the variograms of the scores of the three canonical variables that correspond to the three recording dates. The structures of the LMC included the following: a nugget effect, an isotropic spherical model with range = 32.00 m, a directional Bessel K model with range = 50.00 m along the N7° direction, and a directional Bessel K model with range = 150.00 m and shape parameter = 1.00 along the N97° direction.
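
The intrinsic correlation condition discussed above, and its relation to the LMC, can be written compactly. The notation is ours and follows the general form given by Wackernagel (2003): V is the variance–covariance matrix of the variables, ρ(h) a correlation function of the lag h, Γ(h) the matrix of direct and cross variograms, B_u the coregionalisation matrix of the u-th basic structure g_u(h), and N_S the number of basic structures. These symbols are assumptions and are not meant to reproduce the notation of Equation (2).

```latex
% Intrinsic correlation: one spatial correlation function shared by all variables
\mathbf{C}(\mathbf{h}) = \mathbf{V}\,\rho(\mathbf{h})

% Linear model of coregionalisation with N_S basic structures
\boldsymbol{\Gamma}(\mathbf{h}) = \sum_{u=1}^{N_S} \mathbf{B}_u \, g_u(\mathbf{h})

% The LMC is intrinsic when every coregionalisation matrix is proportional to the same matrix
\mathbf{B}_u = a_u \,\mathbf{B}, \quad a_u \ge 0
\quad \Longrightarrow \quad
\boldsymbol{\Gamma}(\mathbf{h}) = \mathbf{B} \sum_{u=1}^{N_S} a_u \, g_u(\mathbf{h})
```

When the coregionalisation matrices are not proportional, the correlation between variables changes with spatial scale, and a single variance–covariance matrix no longer summarises the multivariate structure.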

Table 6. Total-sample standardised coefficients of the canonical variables for the three recording dates

Recording date   Electromagnetic spectrum band interval   Standardised canonical coefficient

9 August         Coastal-blue                             −1.1506
                 Green                                     3.3532
                 Yellow                                    0.1408
                 Red                                      −1.5618
                 Red-edge                                  0.5403
                 Near infrared (NIR)                      −1.1763

13 August        Coastal-blue                              0.4095
                 Green                                    −2.2809
                 Yellow                                    4.4106
                 Red                                      −3.7489
                 Red-edge                                  4.3533
                 NIR                                      −3.4404

19 August        Coastal-blue                              0.7922
                 Green                                    −3.1283
                 Yellow                                    2.8567
                 Red                                      −0.2224
                 Red-edge                                 −2.4831
                 NIR                                       2.4520
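
To make the role of these weights concrete, the sketch below computes a canonical score as the weighted sum of the six principal components, using the 9 August coefficients from Table 6. It assumes that the coefficients are applied to principal components standardised over the whole sample; the example observations and the function name are hypothetical.

```python
# Standardised canonical coefficients for 9 August (Table 6).
COEFFICIENTS_9_AUGUST = {
    "coastal_blue": -1.1506,
    "green": 3.3532,
    "yellow": 0.1408,
    "red": -1.5618,
    "red_edge": 0.5403,
    "nir": -1.1763,
}

def canonical_score(standardised_pcs, coefficients):
    """Weighted sum of standardised principal components (one value per band interval)."""
    return sum(coefficients[band] * standardised_pcs[band] for band in coefficients)

# Hypothetical observation: one standard deviation above the mean in green only.
green_only = {band: 0.0 for band in COEFFICIENTS_9_AUGUST}
green_only["green"] = 1.0
print(canonical_score(green_only, COEFFICIENTS_9_AUGUST))

# Hypothetical observation: one standard deviation above the mean in every band.
all_bands = {band: 1.0 for band in COEFFICIENTS_9_AUGUST}
print(round(canonical_score(all_bands, COEFFICIENTS_9_AUGUST), 4))
```

The second case shows how coefficients of opposite sign largely cancel when all bands increase together, so the canonical variable responds mainly to contrasts between bands.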

Table 7. Error matrices for the three recording dates, showing the measured classes (rows) and the estimated classes (columns)

Recording date   Observed   Predicted optimal, n (%)   Predicted deficit, n (%)   Overall, n (%)   Group error

9 August         Optimal    118 (78.67)                32 (21.35)                 150 (100)        0.21
                 Deficit    34 (21.25)                 126 (78.75)                160 (100)        0.21
                 Overall accuracy = 0.79

13 August        Optimal    115 (76.67)                35 (23.33)                 150 (100)        0.23
                 Deficit    35 (21.88)                 125 (78.13)                160 (100)        0.22
                 Overall accuracy = 0.77

19 August        Optimal    114 (76.00)                36 (24.00)                 150 (100)        0.24
                 Deficit    41 (25.63)                 119 (74.38)                160 (100)        0.27
                 Overall accuracy = 0.75
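
The overall accuracy in Table 7 is the proportion of observations on the diagonal of each error matrix, as the short sketch below shows. The counts are those of Table 7; the function and variable names are hypothetical.

```python
# Rows: observed class (optimal, deficit); columns: predicted class (optimal, deficit).
ERROR_MATRICES = {
    "9 August": [[118, 32], [34, 126]],
    "13 August": [[115, 35], [35, 125]],
    "19 August": [[114, 36], [41, 119]],
}

def overall_accuracy(matrix):
    """Fraction of observations assigned to their observed class."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def group_error(matrix, group):
    """Fraction of the observed group that was assigned to another class."""
    row = matrix[group]
    return 1.0 - row[group] / sum(row)

for date, matrix in ERROR_MATRICES.items():
    print(date, round(overall_accuracy(matrix), 2))
```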

The co-kriged maps of the canonical variables, estimated on the same interpolation grid as that used for the principal components, clearly show the power of the canonical variable in distinguishing between the two water treatments (Figure 8). For the earliest date (9 August), the highest estimates of the canonical variable can be attributed to a greater reflectance in the green region of the visible spectrum, which appears to have been due to the more luxuriant vegetation in the well-watered (optimal irrigation) block. For the second date (13 August), the map mostly reproduces the reflectance in the yellow and red-edge principal components, and again, the highest values occur in the well-watered block. For the last date (19 August), the map looks quite similar to the previous maps, but it is reversed, owing to the negative loading of the green and red-edge principal components on the canonical variable.

To sum up, the (geo)statistical analysis allowed us to distinguish effectively between the two blocks that were irrigated with different amounts of water. The bands best suited to discriminating plants with different water status are the green and the red-edge, because of their greater sensitivity to plant health, and thus to water content.

The variable influence of each of the bands on the discrimination between these two (optimal and deficit) water treatments can be explained on the basis of several considerations. The plant water stress for the deficit irrigation treatment increased from 9 to 19 August; this caused leaf rolling and a lower water content than for the optimal irrigation treatment, which represents a hardening of the plants under the reduced water availability. Within the 10-day monitoring period, the tomato phenology changed: fruit setting was also reached for the upper cluster, and the percentage of red-coloured fruit increased from 20% to 60%. The increased fruit weight opened the plant canopy towards the outside of the crop row, and hence made the stems and the soil more visible from above. Over time, this resulted in the Fieldspec spectrometer reading a mixture of items with an increasing proportion of fruit (green and red), stems and soil, and a decreasing proportion of leaves. The complex tomato plant structure (several fruit clusters, semi-prostrate habit and variable canopy height that decreases at maturity) and the incomplete soil cover between the two rows by the canopy, which led us to take the measurements when the soil cover was at least 90%, are further issues that have to be taken into account in the dynamic selection of the bands that are deemed more effective in the discrimination between irrigation treatments at any particular crop stage.

4. CONCLUSIONS

This experimental design and the statistical procedure applied to the collected hyperspectral data allowed us to discriminate effectively between the tomato plants subjected to these two different water regimes (optimal vs deficit). The combined approach to the data analysis included different techniques: PCA to reduce the number of bands, discriminant analysis on the retained principal components and, finally, geostatistical interpolation of the principal components and canonical components using multivariate techniques.

However, the proposed approach might be improved by a more complete integration of the traditional statistical techniques with geostatistics, through replacement of the variance–covariance matrix with a full spatial multivariate model of covariance. Taking spatial dependence explicitly into account would also avoid the need for the blocking, randomisation and replication required to make reliable assertions.

New multivariate indices of hyperspectral reflectance are necessary to better assess the evolution of differently water-stressed plants, as no single band exhaustively described the status of these crops.

This study has shown that a multivariate approach such as canonical discriminant analysis has good potential to identify different water treatments. When the variables were analysed individually with a univariate approach, the irrigation treatments were not always successfully classified, and considerable improvement was achieved with the discriminant analysis. Moreover, the correlation between the individual bands varied during the crop cycle, and therefore it is not effective to choose only specific band(s) to discriminate between these water treatments. We believe that only a weighted combination of all of the bands can lead to clear discrimination between these two field areas that were differently irrigated.

Figure 8. Maps of co-kriged canonical variables on the three dates (9 August, top left; 13 August, top right; 19 August, bottom)

In summary, hyperspectral reflectance data processing using a combined approach of canonical discriminant analysis and multivariate geostatistics can provide valuable support to farmers in identifying within-field plant-stressed areas, allowing them to implement site-specific irrigation strategies. In the future, to confirm these results, the study should be extended to monitoring over the crop season and to an investigation into the spectral contribution of the tomato fruit.

Acknowledgements

This work has been supported by the Italian Ministry of Agriculture and Forestry Policies under contracts 209/7393/05 and 8373/7303/09 (AQUATER Project, Coordinator: Dr. M. Rinaldi).

REFERENCES

Anderson TW. 1984. An Introduction to Multivariate Analysis (2nd ed.). Wiley: New York.
Asner GP. 1998. Biophysical and biochemical sources of variability in canopy reflectance. Remote Sensing of Environment 64: 134–153.
Bauer ME, Daughtry CST, Vanderbilt VC. 1981. Spectral–agronomic relationships of corn, soybean, and wheat canopies. Report SR-P1-04187. West Lafayette, IN: Laboratory for Applications of Remote Sensing, Purdue University, 17 pp.
BBA. 2001. Growth stages of mono- and dicotyledonous plants. In BCH Monograph, (2nd ed.), U Meier (ed.), BBA, Federal Biological Research Centre for Agriculture and Forestry: Germany; 158.
Broge NH, Leblanc E. 2000. Comparing prediction power and stability of broadband and hyperspectral vegetation indices for estimation of green leaf area index and canopy chlorophyll density. Remote Sensing of Environment 76: 156–172.
Carter GA. 1993. Responses of leaf spectral reflectance to plant stress. American Journal of Botany 80(3): 239–243.
Carter GA. 1998. Reflectance wavebands and indices for remote estimation of photosynthesis and stomatal conductance in pine canopies. Remote Sensing of Environment 63: 61–72.
Castrignanò A, Giugliarini L, Risaliti R, Martinelli N. 2000. Study of spatial relationships among some soil physico-chemical properties of a field in central Italy using multivariate geostatistics. Geoderma 97: 39–60.
Ceccato P, Flasse S, Tarantola S, Jacquemoud S, Gregoire J. 2001. Detecting vegetation leaf water content using reflectance in the optical domain. Remote Sensing of Environment 77: 22–33.
Clevers JGPW, van der Heijden GWAM, Verzakov S, Schaepman ME. 2005. Estimating spatial patterns of biomass and nitrogen status in grasslands through imaging spectrometry. In 9th International Symposium on Physical Measurements and Signatures in Remote Sensing (ISPMSRS), Beijing, 17–19 October 2005. Beijing: ISPRS WG VII/1; 56–59.
De Benedetto D, Castrignanò A, Rinaldi M, Ruggieri S, Santoro F, Figorito B, Gualano S, Diacono M, Tamborrino R. 2013. An approach for delineating homogeneous zones by using multi-sensor data. Geoderma 199: 117–127.
Dobrowski SZ, Pushnik JC, Zarco-Tejada PJ, Ustin SL. 2005. Simple reflectance indices track heat and water stress-induced changes in steady-state chlorophyll fluorescence at the canopy scale. Remote Sensing of Environment 97: 403–414.
FAO-UNESCO. 1963. Bioclimatic Map of the Mediterranean Zone, Explanatory Notes. Published by the United Nations Educational, Scientific and Cultural Organization, Place de Fontenoy: Paris, France.
Filella I, Peñuelas L. 1994. The red edge position and shape as indicators of plant chlorophyll content, biomass and hydric status. International Journal of Remote Sensing 15–7: 1459–1470.
Gates DM, Keegan HJ, Schleter JC, Weidner VR. 1965. Spectral properties of plants. Applied Optics 4: 11–20.
Gualano S, Santoro F, Djelouah K, D’Onghia AM. 2010. Proximal and remote sensing in the monitoring of Citrus tristeza virus (CTV) infected trees: preliminary results. Acta Horticulturae 940: 641–646.
Hunt ER, Rock BN. 1989. Detection of changes in leaf water content using near- and middle-infrared reflectances. Remote Sensing of Environment 30: 43–52.
Journel AG, Huijbregts CJ. 1978. Mining Geostatistics. Academic Press: London.
Kumar L, Schmidt KS, Dury S, Skidmore AK. 2001. Review of hyperspectral remote sensing and vegetation science. In Hyperspectral Remote Sensing, F Van Der Meer (ed.), Kluwer Academic Press: Dordrecht; 111–155.
Lachenbruch PA, Mickey MR. 1968. Estimation of error rates in discriminant analysis. Technometrics 10: 1–11.
Lajaunie C, Behaxeteguy JP. 1989. Elaboration d’un programme d’ajustement semi-automatique d’un modele de coregionalisation. Theorie. Technical Report N21/89/G. ENSMP: Paris.
Malenovsky Z, Ufer C, Lhotakova Z, Clevers JGPW, Schaepman ME, Cudlin P, Albrechtova J. 2005. A new optical index for chlorophyll estimation of a forest canopy from hyperspectral images. In Imaging Spectroscopy—New Quality in Environmental Studies, Vol. 1, B Zagajewski, M Sobczak (eds.), EARSeL: Warsaw (Pl); 651–659.
McGwire K, Minor T, Fenstermaker L. 2000. Hyperspectral mixture modeling for quantifying sparse vegetation cover in arid environments. Remote Sensing of Environment 72(3): 360–374.
Milton EJ, Schaepman ME, Anderson K, Kneubühler M, Fox N. 2009. Progress in field spectroscopy. Remote Sensing of Environment 113: S92–S109.
Morrison DF. 1976. Discriminant analysis and clustering. Statistical Science 4: 34–69.
Penuelas J, Filella I, Lloret P, Munoz F, Vilajeliu M. 1995. Reflectance assessment of mite effects on apple trees. International Journal of Remote Sensing 16: 2727–2733.
Rao CR. 1973. Linear Statistical Inference and Its Applications. Wiley: New York.
Ray SS, Das G, Singh JP, Panigrahy S. 2006. Evaluation of hyperspectral indices for LAI estimation and discrimination of potato crop under different irrigation treatments. International Journal of Remote Sensing 27: 5373–5387.
SAS Institute Inc. 2009. SAS/STAT ® 9.2 User’s Guide, Second Edition. Cary, NC: SAS Institute Inc.
Schabenberger O, Gotway C. 2005. Statistical Methods for Spatial Data Analysis. CRC Press: Boca Raton, FL.
Scholander P, Bradstreet E, Hemmingsen E, Hammel H. 1965. Sap pressure in vascular plants: negative hydrostatic pressure can be measured in plants. Science 148(3668): 339–346.
Sims DA, Gamon JA. 2002. Relationships between leaf pigment content and spectral reflectance across a wide range of species, leaf structures and developmental stages. Remote Sensing of Environment 81: 337–354.
Stimson HC, Breshears DD, Ustin SL, Kefauver SC. 2005. Spectral sensing of foliar water conditions in two co-occurring conifer species: Pinus edulis and Juniperus monosperma. Remote Sensing of Environment 96: 108–118.
Strachan IB, Pattey E, Boisvert JB. 2002. Impact of nitrogen and environmental conditions on corn as detected by hyperspectral reflectance. Remote Sensing of Environment 80: 213–214.
Thenkabail PS, Enclona EA, Ashton MS, Van Der Meer B. 2004. Accuracy assessments of hyperspectral waveband performance for vegetation analysis applications. Remote Sensing of Environment 91: 354–376.

Tilling AK, O’Leary GJ, Ferwerda JG, Jones SD, Fitzgerald GJ, Rodriguez D, Belford R. 2007. Remote sensing of nitrogen and water stress in wheat. Field Crops Research 104: 77–85.
Wackernagel H. 2003. Multivariate Geostatistics: An Introduction with Applications (3rd ed.). Springer Verlag: Berlin.
Webster R, Oliver MA. 2007. Geostatistics for Environmental Scientists. Wiley: Chichester, England.
Yu G, Miwa T, Nakayama K, Matsuoka N, Kon H. 2000. A proposal for universal formulas for estimating leaf water status of herbaceous and woody plants based on spectral reflectance properties. Plant & Soil 227: 47–58.
Zarco-Tejada PJ, Miller JR, Mohammed GH, Noland TL, Sampson PH. 2002. Vegetation stress detection through chlorophyll a + b estimation and fluorescence effects on hyperspectral imagery. Journal of Environmental Quality 31: 1433–1441.
Zygielbaum AI, Gitelson AA, Arkebauer TJ, Rundquist DC. 2009. Non-destructive detection of water stress and estimation of relative water content in maize. Geophysical Research Letters 36: L12403.
