Comparative efficiency and accuracy of variable area transects versus square plots for sampling tree diversity and density
Cheryl D. Nath • Raphael Pelissier • Claude Garcia
Received: 7 April 2009 / Accepted: 22 September 2009
© Springer Science+Business Media B.V. 2009
Abstract Agroforestry systems have been recog-
nized as areas with high conservation potential, and
there is a need to quickly assess the biodiversity and
tree stocking density available in these systems.
However, it is not clear if the commonly used fixed
area plot is most efficient for sampling such land-
scapes, or if a different method could provide
equivalent data with less effort. Thus, a field and
simulation-based study was carried out to compare
the efficiency and accuracy of a variable area transect
versus the fixed area square plot. Field efficiency tests
were carried out in three habitat types, robusta coffee
plantations, arabica coffee plantations and a privately
owned forest fragment, in Kodagu, southern India. A
simulation study of bias, precision and accuracy of
the two methods for tree density estimation also was
carried out using various spatial distribution patterns
and densities. The variable area transect was signifi-
cantly more efficient per unit effort in the field than
the fixed area square plot. In the simulation tests both
methods performed equally well under random
spatial distribution. However, under simulated aggre-
gated distribution both methods were positively
biased (square plot up to 12% at low density, variable
area transect 9–12% at all densities), and under
simulated regular distribution the variable area tran-
sect was slightly negatively biased (-5 to -7% at
medium to high density). The variable area transect
thus can be recommended over the square plot for
rapid assessment of tree diversity and density, when
the vegetation is expected to be randomly dispersed.
Keywords Man-hours · Bias · Precision · Spatial dispersion · Coffee agroforestry · India
Introduction
Landscapes dominated by coffee agroforestry have
been identified as potential areas for future biodiver-
sity conservation in the tropics (Perfecto et al. 1996;
McNeeley and Schroth 2006). The small district of
Kodagu (4,104 km²), situated in the Western Ghats
of southern India, is an interesting location for such
non-formal conservation efforts, as its landscape is
dominated by protected forests, community-owned
or private forest patches and shade coffee estates
C. D. Nath (✉) · R. Pelissier · C. Garcia
French Institute of Pondicherry, 11 St Louis Street,
PB 33, Pondicherry 605001, India
e-mail: [email protected]
R. Pelissier
UMR AMAP, TA-A51/PS2, Boulevard de la Lironde,
34398 Montpellier Cedex 5, France
C. Garcia
CIRAD—UPR 36, TA 10/D, Campus de Baillarguet,
34398 Montpellier Cedex 5, France
Agroforest Syst
DOI 10.1007/s10457-009-9255-5
(Elouard 2000; Garcia et al. in press). The coffee
agroforestry practiced in this region utilises medium
to high density shade from native and exotic trees
located between coffee bushes. Tree stocking density
varies greatly between estates, depending on factors
such as the original vegetation type, species of coffee
grown and management practices (Elouard et al.
2000; Moppert 2000). The landscape matrix is known
to harbour a high proportion of native biodiversity
(Bhagwat et al. 2005) as the existing bioclimatic
regime and topography support vegetation types
ranging from wet evergreen to dry deciduous forests
(Pascal 1988; Elouard 2000). In order to sample such
a varied landscape for biodiversity, we would have to
ensure wide geographic coverage of the region to
include rare species and habitat types (Gimaret-
Carpentier et al. 1998).
An appropriate sampling method was sought that
would be quick and easy to implement across a wide
range of habitat types. The best sampling technique
should provide accurate and representative informa-
tion about the population studied, while also being
geometrically compact and requiring the least amount
of field effort (Parker 1979; Laycock and Batcheler
1975; Scott and Gove 2002). The fixed area square
quadrat has traditionally been used for vegetation
sampling (Clapham 1932). However, other shapes for
fixed area plots such as rectangular and circular plots
or belt transects also have been used due to their
improved habitat coverage or ease of implementation
(Barbour et al. 1987). Plotless density estimators that
utilise distance measurements from random points to
nearest trees or from trees to their nearest neighbours
have been popularized on the basis of their greater
speed, ease of implementation, and economy of effort
in dense habitats (Cottam and Curtis 1956). More
recently, variable area plots and transects also were
introduced as a means to collect relatively fixed
amounts of field data irrespective of the local habitat
density (Parker 1979; Sheil et al. 2003).
We initially carried out a pilot test to compare
the relative field performances of four different
sampling methods: fixed area square plot, belt
transect, cluster sampling with point-centred quarter
(PCQ), and a new variable area transect method
developed by Sheil et al. (2003). The pilot survey
indicated that the square plot and variable area
transect were most efficient in the field in terms of
total time spent per sample, as well as in terms of
numbers of individuals and species recorded per
unit time (unpublished results). Therefore we
decided to test these two methods further, while
eliminating the belt transect and cluster sampling
with PCQ from further testing. The use of PCQ also
has been discouraged by other studies as it produces
biased results under nonrandom spatial distributions
(Lyon 1968; Risser and Zedler 1968; Mark and
Esler 1970; Good and Good 1971; Laycock and
Batcheler 1975; Bryant et al. 2004). Thus, the
square plot (hereafter referred to as ‘‘QUAD’’) and
the variable area transect (‘‘VAT’’) developed by
Sheil et al. (2003) were selected for further field
tests and computational analyses, which are the
subject of this paper.
Field and laboratory-based comparisons of the
efficiency and accuracy associated with different
vegetation sampling techniques have been carried out
before. However, only a few studies have docu-
mented the time required for data acquisition in the
field (Lindsey et al. 1958; Laycock and Batcheler
1975; Batcheler and Craib 1985; Kenkel and Podani
1991), which is an important component of total
sampling effort. Most often studies were focused on
assessing efficiency in terms of the precision of
sample estimates (Clapham 1932; Bormann 1953;
Cottam and Curtis 1956; Lyon 1968; Parker 1979;
Bryant et al. 2004), and on assessing biases with
spatially explicit datasets (Engeman et al. 1994;
White et al. 2008). Thus, in previous studies where a
QUAD was compared with a VAT, the VAT was
expected or assumed to be quicker and more efficient
to implement in the field (Parker 1979; Engeman
et al. 1994; White et al. 2008), but no relevant field
data were presented. With the exception of one study
(Batcheler and Craib 1985), data on comparative field
efficiencies of a fixed area square plot versus a
variable area transect method are lacking.
For the purpose of sampling broad swathes of the
landscape quickly, it is important to establish any
practical advantages gained in the field from the
sampling method to be used. In addition, it is critical
to show that the more efficient method does not suffer
from any major biases in estimating tree density or
diversity. Our study provides a unique comparison of
two methods for sampling trees, by carrying out tests
of efficiency in terms of field effort per replicate as
well as computer simulations to test their accuracy
under diverse habitat conditions.
The main objectives of the study were:
(1) To establish which of two sampling methods,
the QUAD or the VAT, is more efficient in
terms of field effort for sampling individuals
and species of trees across a human-modified
landscape.
(2) To characterise the bias, precision and accuracy
of these two methods for estimating density,
under various spatial arrangements and density
distributions of trees.
The results of this study should be applicable to
other human-transformed landscapes that are charac-
terised by heterogeneous spatial distributions and
densities of trees.
Methods
Methods compared
The following two sampling methods were tested in
this study:
• Fixed area square plots or quadrats (‘‘QUAD’’):
QUADs have been recorded in use for vegetation
quantifications for at least 100 years (Clapham
1932). They are popular due to the ease with
which plots can be demarcated and enumerated
by minimally trained field crews, as well as the
generally low bias associated with tree density
estimates (Engeman et al. 1994). In this study we
tested square plots of 40 m length.
• Variable area transect (‘‘VAT’’) of Sheil et al.
(2002, 2003): Variable area plot or transect
methods are generally expected to improve the
sampling efficiency over fixed area plots,
although they are expected to be associated with
biases under nonrandom conditions (Parker 1979;
Engeman et al. 1994). Early VAT methods were
simple and included sampling a small fixed
number of individuals from a point or line up to
a fixed maximum search distance (Parker 1979;
Batcheler and Craib 1985; Engeman et al. 1994).
The number of individuals sampled per transect
generally was small (<5) in order to keep the
method practical and efficient. A more complex,
yet versatile, VAT method has been developed
recently for rapid sampling of landscapes (Sheil
et al. 2002, 2003). This method allows larger
samples to be collected per replicate, while
remaining compact and easy to apply under
different field conditions. According to this
method a baseline transect of 40 m is established
initially, and on either side of the baseline four
consecutive rectangular cells (of 10 m width
along the baseline, and up to 20 m length
perpendicular to the baseline) are searched for
five trees each. The length of each cell is
determined by the position of the fifth tree, counted
in order of increasing distance from the baseline. If five trees are
identified within 20 m of the baseline (i.e., the
maximum search distance) the cell length is the
distance to the fifth tree, whereas if fewer than five
trees are encountered, the cell length is taken as
20 m. Thus, the eight cells per transect together
provide data on up to 40 trees, and the maximum
area sampled is 40 m by 40 m.
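The cell rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the density estimate shown is a naive "trees recorded divided by area searched" calculation, which is not necessarily the estimator used by Sheil et al. (2003).

```python
from typing import List, Tuple

CELL_WIDTH_M = 10.0   # cell width along the baseline (from the text)
MAX_SEARCH_M = 20.0   # maximum search distance from the baseline (from the text)
TREES_PER_CELL = 5    # number of trees sought per cell (from the text)


def vat_cell(distances_m: List[float]) -> Tuple[int, float]:
    """Return (trees recorded, cell length in m) for one VAT cell.

    distances_m holds the perpendicular distances from the baseline of all
    trees standing in the 10 m wide strip on one side of the baseline.
    """
    in_range = sorted(d for d in distances_m if d <= MAX_SEARCH_M)
    if len(in_range) >= TREES_PER_CELL:
        # The cell ends at the fifth tree from the baseline.
        return TREES_PER_CELL, in_range[TREES_PER_CELL - 1]
    # Fewer than five trees found: the cell extends to the maximum distance.
    return len(in_range), MAX_SEARCH_M


def vat_naive_density(cells: List[List[float]]) -> float:
    """Naive density in trees per hectare.

    ASSUMPTION: total trees recorded / total area searched; this is an
    illustrative choice, not the published estimator.
    """
    results = [vat_cell(c) for c in cells]
    trees = sum(n for n, _ in results)
    area_m2 = sum(CELL_WIDTH_M * length for _, length in results)
    return trees / area_m2 * 10_000.0
```

A full transect would pass eight such cells (four on each side of the 40 m baseline) to `vat_naive_density`.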
Field-based study
Field sampling
The first phase of the study involved collection of
data for field tests of the two sampling methods.
Coffee estates in the Kodagu district of Karnataka
state, southern India, were sampled during October
and November, 2007. Coffee estates in Kodagu grow
robusta (Coffea canephora) and arabica (Coffea
arabica) coffee. These two species usually are grown
in separate blocks because robusta requires less shade
than arabica. Our study included the following three
habitat types to represent an increasing gradient of
tree stocking density in human modified landscapes:
robusta coffee blocks, arabica coffee blocks and a
relatively undisturbed forest fragment.
Seven coffee estates were sampled in eastern
Kodagu (12°14′57.8″N to 12°19′48.4″N; and
75°53′11.7″E to 75°56′41.7″E). The estates were
2–10 km distant from each other, and presented a
range of different elevations and slopes (Table 1). In
addition, a large privately owned forest fragment on
relatively flat land was sampled, approximately 15 km
from the sampled coffee estates (12°05′55.2″N,
75°52′47.8″E). Six of the estates, as well as the forest
fragment, were each associated with only one of the
three habitat categories (corresponding to sites 2–4
and 6–9 in Table 1). Only one estate was sampled for
both robusta and arabica habitats (corresponding to
sites 1 and 5, respectively), as the two kinds of habitat
were well differentiated from each other within the
estate.
Robusta coffee habitat was sampled at sites 1–4,
where average tree densities ranged from 72 to
200 trees ha-1, while arabica coffee habitat was
sampled at sites 5–8, where average tree densities
ranged from 184 to 273 trees ha-1 (Table 1). In the
forest site average tree density was relatively high
(536 trees ha-1) compared to coffee estates; how-
ever, there was moderate weed occurrence (mainly
Strobilanthes kunthianus) in the undergrowth, indi-
cating human disturbance.
In total 21 replicates each of QUAD and VAT
were obtained, of which 20 replicates occurred in
robusta, 16 replicates in arabica and 6 replicates in
the forest. Each site had 2–6 replicates with equal
numbers of QUAD and VAT replicates per site
(Table 1). Replicates were situated at least 100 m
apart. The starting point of each replicate was
randomly located by utilizing random numbers to
select an estate management block, as well as the
number of steps and direction to follow within each
block. Whenever possible a replicate each of QUAD
and VAT were randomly located within the same
large block, to reduce variability due to management
effects.
Field crews consisted of 4–6 people at the estate
sites and 6–7 people at the forest site. These small
differences in crew size were due to day-to-day
variations in the availability of temporarily hired field
assistants; a slightly larger crew was required only
in the forest habitat, for clearing
undergrowth. However, daily variations in field crew
size did not bias the efficiency of data collection for
either sampling method, as approximately two
QUADs and two VATs were completed per day
and the two methods were always sampled alternately (a QUAD followed by a VAT,
or vice versa). There was no significant difference in
crew size between the two sampling methods (Stu-
dent’s two-tailed t-test, N = 21, t = 0.33, P = 0.74),
and the same team leader and botanical specialist
were present during all field collection trips for this
study. Data collected per replicate included the
number of people involved with plot set up and data
collection, the starting and ending time, and any
breaks taken by workers before completing the data
collection.
Tree species identities were recorded whenever
possible in the field. For most species at least one
botanical sample was obtained for identity confirma-
tion, and all samples were deposited at the herbarium
of the French Institute of Pondicherry (HIFP). Only
trees whose main stems were ≥30 cm gbh (girth
at breast height, 1.3 m above the ground) were
recorded. In total, 1,284 trees belonging to 98 species
Table 1 Main features of the nine sites (eight coffee agroforests and a natural forest fragment) sampled in Kodagu district, Western Ghats of Karnataka, India

Habitat  Site    Elevation range  Slope range (°)  Number of   Number    Number      Avg. density (trees ha-1)
         number  (m asl)          min, max         replicates  of trees  of species  (min, max)
                 min, max
Robusta  1       775, 825         2.75, 7.00       6           69        16          72 (56, 94)
         2       775, 825         0.63, 15.00      6           144       29          151 (125, 200)
         3       790, 867         2.00, 5.33       4           58        12          91 (56, 138)
         4       820, 845         2.75, 5.75       4           115       27          200 (106, 282)
Arabica  5       845, 855         1.25, 3.18       2           88        12          273 (176, 369)
         6       835, 910         3.40, 12.00      6           204       9           241 (88, 475)
         7       820, 860         Not rec., 5.50   4           134       27          214 (198, 232)
         8       850, 940         5.00, 22.75      4           111       26          184 (150, 222)
Forest   9       690, 730         Not rec.         6           361       48          536 (463, 653)

The range of values obtained across all replicates per site is shown, and includes trees more than 30 cm in girth at breast height. Slope
value for each replicate is the average of readings recorded in different directions (‘‘Not rec.’’ = flat areas where slope measurements
were not recorded). The number of replicates includes equal numbers of QUAD and VAT samples at each site
were sampled during this study across all sites. The
robusta and arabica habitats together contained 67
species, while the forest habitat contained 48 species,
of which 17 species were common to both (see
Table 1 for details per site).
Analysis of field data
From the field datasets, three parameters were
calculated to represent efficiency in sampling trees
and species. These ‘Efficiency’ parameters were:
(i) Total number of man-hours per replicate (mh).
(ii) Total number of trees sampled per man-hour in
the field (trees mh-1).
(iii) Total number of species sampled per man-hour
in the field (species mh-1).
Man-hours was used as the standard unit of time or
effort as this took into account the variations in field
crew sizes across different replicates. The calculation
of total man-hours per replicate was achieved by
totalling the time spent (in hours) by each worker on
that replicate.
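As a minimal illustration of these three efficiency parameters, the sketch below sums worker time into man-hours and derives the two rates. The function and key names are hypothetical, and the example figures are invented for illustration rather than taken from the field data.

```python
from typing import Dict, List


def efficiency(hours_per_worker: List[float], n_trees: int, n_species: int) -> Dict[str, float]:
    """Efficiency parameters for one replicate.

    hours_per_worker: time in hours spent on the replicate by each crew
    member (breaks already excluded).
    """
    mh = sum(hours_per_worker)  # total man-hours for the replicate
    return {
        "mh": mh,
        "trees_per_mh": n_trees / mh,
        "species_per_mh": n_species / mh,
    }
```

For example, a crew of four each working 1.5 h on a replicate with 30 trees of 9 species gives 6 mh, 5 trees mh-1 and 1.5 species mh-1.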
Three additional parameters were calculated to
evaluate differences, if any, between the two methods
for estimating tree density and diversity. These
‘Vegetation’ parameters were:
(i) Tree density (trees ha-1).
(ii) Species richness, or the total number of species
per replicate.
(iii) Simpson Index of diversity.
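The three vegetation parameters can be computed from a replicate's species list as sketched below. The form of the Simpson index is an assumption: the text does not state it, and the sketch uses the complement form, 1 minus the sum of squared species proportions, which lies between 0 and 1. Function names are hypothetical.

```python
from collections import Counter
from typing import List


def trees_per_ha(n_trees: int, area_m2: float) -> float:
    """Tree density in trees per hectare for a sampled area in square metres."""
    return n_trees / area_m2 * 10_000.0


def species_richness(species: List[str]) -> int:
    """Number of distinct species recorded in the replicate."""
    return len(set(species))


def simpson_index(species: List[str]) -> float:
    """Simpson diversity, ASSUMED here to be 1 - sum(p_i ** 2)."""
    counts = Counter(species)
    n = sum(counts.values())
    return 1.0 - sum((c / n) ** 2 for c in counts.values())
```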
In order to assess differences between sampling
methods or habitats on each of the 6 parameters
above, linear mixed-effects models (LME models,
Pinheiro and Bates 2000) were used. In these models
the main fixed effects were ‘methods’ and ‘habitats’.
In order to address the possibility of non-indepen-
dence of replicates per site or of sites per habitat, the
additional random factor, ‘sites’, was nested in
‘habitats’. This has the effect of correctly partitioning
the variance and leading to more powerful tests for
the main effects. Interactions between methods and
habitats were included in the initial models, but were
non-significant for all parameters, and thus the
interaction term was excluded from the final models.
For efficiency parameters we also tested for differ-
ences between methods after adding the covariate,
‘‘tree density’’, in the models. This parameter was
correlated with species richness (Spearman rank
correlation coefficient = 0.72) and was expected to
account for some of the unexplained variation in the
models as it was highly variable across sites. LME
was implemented using the package ‘‘nlme’’ (Pinheiro
et al. 2008) with R statistical software (R Development
Core Team 2008).
Model residuals were subjected to several tests
(Shapiro–Wilk, Kolmogorov–Smirnov, histograms
and normal quantile plots for normality; Bartlett,
Fligner-Killeen, variance test and standardized resid-
ual plots for homoscedasticity; standardized residual
plots for linearity) and, where necessary, transforma-
tions were carried out in incremental steps until all the
required assumptions were met (Sokal and Rohlf
1995; Grafen and Hails 2002). Based on these tests
the parameters ‘man-hours’ and ‘trees mh-1’ required
log-transformation, ‘trees ha-1’ required square-root
transformation and ‘Simpson Index’ required square
transformation in order to conform to the assumptions.
For non-significant results, we used power analysis
to judge if statistical significance could be achieved
by increasing the sample size. Power analysis gen-
erally is recommended for use prior to data collection
for determining the minimum sample size required to
obtain a significant effect (Thomas and Juanes 1996;
Steidl et al. 1997). However, it also can be used
retrospectively to determine the power associated
with a given effect and sample size (Thomas 1997).
An 80% power level is conventionally considered
adequate. Thus, we used a two sample t-test of means
and tested if increased sample sizes of 30 or 50 per
method would be sufficient to obtain 80% power to
detect the existing effect sizes with statistical signifi-
cance. The variance value used was as observed in
the field. Power calculations were carried out using
the statistical package ‘‘pwr’’ (Champely 2007) with
R statistical software (R Development Core Team
2008).
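The logic of such a power calculation can be illustrated with a normal approximation to the two-sample t-test. This is a sketch only: the study used the R package "pwr", which works with the noncentral t distribution, so the values below will differ slightly, especially at small sample sizes. The function name and the standardized effect size argument (difference between means divided by the common standard deviation) are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_sample(effect_size_d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample test of means.

    Uses the normal approximation, with equal group sizes assumed.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    shift = effect_size_d * sqrt(n_per_group / 2.0)
    # Probability of rejecting in either tail under the alternative.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)
```

Evaluating the function at 30 and 50 replicates per method for an observed effect size shows whether the conventional 80% level would be reached.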
The above analyses relate to the statistical signifi-
cance of differences between methods; however, in
order to evaluate differences in terms of the potential
savings produced by a faster method (in the case of
efficiency parameters) or conservation losses incurred
by a biased method (in the case of vegetation
parameters) it is important to consider their economic
or biological significance also (Steidl et al. 1997).
Thus, 95% confidence intervals for the difference
between means, obtained from the t-test of the
QUAD and VAT methods, were interpreted in
relation to predetermined values (Steidl et al. 1997;
Gerard et al. 1998; Di Stefano 2004). The value used
for minimum economic significance (or importance)
in the case of efficiency parameters was 10% of the
QUAD mean (i.e., the two methods should differ by
more than 10% of the QUAD mean in order for the
benefit to be considered as economically significant);
while the value used for minimum biological signifi-
cance (or importance) in the case of vegetation
parameters was a conservative 5% of the QUAD
mean (i.e., the two methods should not differ by more
than 5% of the QUAD mean in order to be assured of
unbiased vegetation assessment). We also calculated
the sample size required for 80% power to detect
these minimal economic or biological significance
values. Variables were transformed as in the previous
tests.
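The interpretation of a confidence interval against a predetermined importance limit can be summarised as a three-way decision. The sketch below assumes a symmetric limit on the (transformed) scale of the difference and uses hypothetical names and labels; the limits used in this study were slightly asymmetric for some parameters after transformation.

```python
def interpret_interval(ci_low: float, ci_high: float, limit: float) -> str:
    """Classify a 95% CI for a difference between means against +/- limit.

    limit is the positive minimum difference considered economically or
    biologically important (ASSUMED symmetric about zero in this sketch).
    """
    if ci_low > limit or ci_high < -limit:
        # The whole interval lies beyond the importance limit.
        return "important difference"
    if -limit < ci_low and ci_high < limit:
        # The whole interval lies inside the importance limits.
        return "no important difference"
    # The interval straddles a limit: more data would be needed.
    return "inconclusive"
```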
Simulation-based study
Creation of artificial datasets
Intensive testing of the accuracy of sampling methods
was carried out with a computer-based simulation
study. Our methodology, described below, is largely
based on that used by Engeman et al. (1994) and
White et al. (2008). Three types of common spatial
distributions, random (also known as Poisson distri-
bution or complete spatial randomness), aggregated
(also known as clumped, clustered or contagious) and
regular (also known as uniform, even or dispersed),
were artificially generated for estimation of tree
density. These three types of distribution may be
observed in natural populations at different scales.
While random distribution of trees may be considered
fairly common at the community level (Sheil et al.
2003), aggregated distributions may be more com-
mon at the species level in tropical and temperate
forests (Condit et al. 2000; Armesto et al. 1986).
Regular distributions may occur less frequently
(Hubbell 1979) and are more likely under condi-
tions of low density or high inter-tree competition
(Armesto et al. 1986). Human manipulations of trees
for silviculture also could result in regular spatial
distributions.
The random distribution was generated for our
study by randomly and independently selecting from
within the range of available x and y coordinate
values. The aggregated distribution was generated by
randomly selecting ‘‘parents’’ to signify the centre of
each cluster and then randomly selecting 30 ‘‘off-
spring’’ from a bivariate normal distribution around
each parent. Thus offspring were located with
increasing probability closer to parents. The number
of parents and cluster sizes were determined by the
predetermined density (see below). For the regular
distribution the entire area was gridded into equal
squares (according to the required density), and in
each of these a single individual was randomly
located.
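The three generation procedures can be sketched as follows. This is not the simulation code used in the study: the function names are hypothetical, the spread of offspring around each parent (`sigma`) is an assumed parameter that the text does not specify, offspring falling outside the area are simply redrawn, and the 0.5 m minimum inter-tree distance is not enforced.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]


def random_pattern(n: int, side: float, rng: random.Random) -> List[Point]:
    """Independent uniform coordinates (complete spatial randomness)."""
    return [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]


def aggregated_pattern(n_parents: int, offspring: int, sigma: float,
                       side: float, rng: random.Random) -> List[Point]:
    """Clusters of offspring drawn from a bivariate normal around each parent."""
    points: List[Point] = []
    for _ in range(n_parents):
        px, py = rng.uniform(0, side), rng.uniform(0, side)
        placed = 0
        while placed < offspring:
            x, y = rng.gauss(px, sigma), rng.gauss(py, sigma)
            if 0 <= x <= side and 0 <= y <= side:  # redraw points outside the area
                points.append((x, y))
                placed += 1
    return points


def regular_pattern(cells_per_side: int, side: float, rng: random.Random) -> List[Point]:
    """One individual placed at random inside each square of a grid."""
    cell = side / cells_per_side
    return [((i + rng.random()) * cell, (j + rng.random()) * cell)
            for i in range(cells_per_side) for j in range(cells_per_side)]
```

Passing a seeded `random.Random` instance makes each generated dataset reproducible.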
The total simulated area was a square of length
2,000 m and the minimum inter-tree distance was
0.5 m. For each type of spatial distribution the
following 15 different tree densities were generated:
10 trees ha-1, and 50–700 trees ha-1 at intervals of
50 (densities of coffee agroforestry systems visited
during this study generally were between 50 and 400
trees ha-1). Thus, in total 45 different combinations
of tree distribution and density were tested. For each
distribution-density combination, 1,000 tree datasets
were artificially generated, within each of which two
sample sizes, 30 and 100, were used to estimate the
known population density by both sampling methods.
Starting points for QUADs or VATs were randomly
located with 0.1 m precision at least 60 m from the
edges of the simulated area, and the direction of
replicates was randomly selected. QUAD and VAT
replicates were paired for each starting point and
sampling direction.
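Estimating density from a simulated QUAD amounts to counting the points inside a 40 m square. The sketch below simplifies the procedure described above by keeping the plot axis-aligned (the random sampling direction is ignored); the function name and the half-open boundary convention are assumptions.

```python
from typing import List, Tuple


def quad_density(points: List[Tuple[float, float]], x0: float, y0: float,
                 side_m: float = 40.0) -> float:
    """Trees per hectare inside the square with lower-left corner (x0, y0).

    ASSUMPTION: the plot is axis-aligned; points on the upper and right
    edges are treated as outside the plot.
    """
    count = sum(1 for x, y in points
                if x0 <= x < x0 + side_m and y0 <= y < y0 + side_m)
    return count / (side_m * side_m) * 10_000.0
```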
Spatial contagion or nonrandomness of the arti-
ficial datasets was tested with the R index, as described
by Clark and Evans (1954) and White et al. (2008).
For this test each spatial pattern was generated 1,000
times for each of the following densities: 10, 50, 100,
300, 500, and 600 trees ha-1. From each simulated
population the average of all observed nearest
neighbour distances, Ro, was calculated after deleting
the values of individuals that were closer to the edge
than to their nearest neighbours (Crawley 2007). The
expected average nearest neighbour distance, Re, was
calculated as $1/(2\sqrt{A})$, where A is the true population
density. The ratio R was then calculated as
Ro/Re, which is equal to 1 for random distributions,
<1 for aggregated distributions and >1 for regular
distributions. Significance of R also was tested with a
z-test (Clark and Evans 1954; White et al. 2008).
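The R index calculation can be sketched as below. The nearest-neighbour search is a brute-force loop chosen for clarity, which would be slow for the large populations simulated here. The standard error of the expected distance, 0.26136/sqrt(N·A), follows Clark and Evans (1954); taking N as the number of distances retained after the edge correction is an assumption of this sketch, as is the function name.

```python
from math import hypot, sqrt
from typing import List, Tuple


def clark_evans(points: List[Tuple[float, float]], side: float) -> Tuple[float, float]:
    """Return (R, z) for points in a square of the given side length."""
    density = len(points) / (side * side)
    kept = []
    for i, (x, y) in enumerate(points):
        nn = min(hypot(x - u, y - v)
                 for j, (u, v) in enumerate(points) if j != i)
        edge = min(x, y, side - x, side - y)
        if edge >= nn:  # drop individuals closer to the edge than to a neighbour
            kept.append(nn)
    r_obs = sum(kept) / len(kept)
    r_exp = 1.0 / (2.0 * sqrt(density))
    se = 0.26136 / sqrt(len(kept) * density)
    return r_obs / r_exp, (r_obs - r_exp) / se
```

On a perfectly even unit grid the observed nearest neighbour distance is 1 and the expected distance is 0.5, so R = 2, well above the value of 1 expected under randomness.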
Analysis of simulation data
The bias, precision and accuracy of density estima-
tion by each sampling method were assessed for each
spatial distribution-density combination and both
sample sizes. We used scaled performance measures
to facilitate easy interpretation and comparisons
(Walther and Moore 2005). These were similar to
the measures used by Engeman et al. (1994) and
White et al. (2008). Calculations were as follows:
(1) Bias: a measure of relative bias was used to
assess positive or negative departures from the
true value of density (similar to RBIAS of
Engeman et al. 1994 and White et al. 2008,
SME in Walther and Moore 2005), calculated as
$\sum (E - A) / (A n)$, where the summation is over the
n = 100 randomized tree distributions; E and A
are the sample estimates (based on 30 or 100
replicates) and the true (simulated) population
densities, respectively.
(2) Precision: the dispersion of density estimates
obtained with each sampling method was mea-
sured by the coefficient of variation (CV),
calculated as $SD/\bar{E}$, where $\bar{E}$ is the mean across
replicates and SD its standard deviation; preci-
sion increases when CV decreases.
(3) Accuracy: the relative root mean squared error
(RRMSE in Engeman et al. 1994; White et al.
2008, SRMSE in Walther and Moore 2005) was
used to assess overall accuracy associated with
the two methods, as it combines aspects of bias
and precision, calculated as $\frac{1}{A}\sqrt{\sum (E - A)^2 / n}$,
where symbols are as defined above.
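The three scaled measures defined above translate directly into code. In this sketch `estimates` is the list of density estimates E and `true_density` is A; the function names are hypothetical, and the use of the sample (n − 1) standard deviation for the CV is an assumption, as the text does not specify it.

```python
from math import sqrt
from statistics import mean, stdev
from typing import List


def relative_bias(estimates: List[float], true_density: float) -> float:
    """Sum of (E - A) divided by (A * n)."""
    return sum(e - true_density for e in estimates) / (true_density * len(estimates))


def coefficient_of_variation(estimates: List[float]) -> float:
    """SD divided by the mean of the estimates (sample SD ASSUMED)."""
    return stdev(estimates) / mean(estimates)


def rrmse(estimates: List[float], true_density: float) -> float:
    """Relative root mean squared error: (1/A) * sqrt(sum((E - A)^2) / n)."""
    mse = sum((e - true_density) ** 2 for e in estimates) / len(estimates)
    return sqrt(mse) / true_density
```

For two estimates of 90 and 110 trees ha-1 around a true density of 100, the relative bias is 0 while the RRMSE is 0.10, showing how RRMSE captures imprecision even when bias cancels out.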
Bias, precision and accuracy values were consid-
ered to be good if they were within ±5% (i.e., from
-0.05 to 0.05). In addition, the mean difference
between QUAD and VAT estimations of density for
different sample sizes also was calculated, with 95%
confidence intervals. These were interpreted in rela-
tion to the minimum biological significance limits, as
described for the field efficiency tests. For all analyses
an alpha level of 0.05 was considered as statistically
significant. Analyses were carried out using R soft-
ware (ver. 2.9.1, R Development Core Team 2008).
Differences between the two sampling methods in
assessing tree species richness and diversity also
could be assessed by a similar simulation study.
However, given the large number of variations to be
considered by such a study we did not attempt to do
so in this paper.
Results
Field test results
Differences based on efficiency parameters
For all three efficiency parameters the VAT per-
formed better on average than the QUAD, as it
required fewer man-hours to complete (6.17 mh vs.
8.81 for QUAD), and produced higher numbers of
trees mh-1 (4.29 vs. 3.88) and species mh-1 (1.61 vs.
1.14). These differences were statistically highly
significant (P ≪ 0.01) in the case of man-hours and
species mh-1, when tested with LME (Table 2).
Addition of the covariate, tree density, further
improved the F-value and significance of results for
the efficiency parameter man-hours. In the case of the
efficiency parameter, trees mh-1, the addition of
square root-transformed tree density as a covariate
resulted in the difference between sampling methods
becoming significant at the level of P < 0.10 only
(i.e., new P = 0.098, as compared to P = 0.15
without the covariate in the model).
Across habitat categories also there were differ-
ences between the mean values of efficiency param-
eters (Table 2), but for all three efficiency parameters
the habitat effects were non-significant when modelled
with LME (P ≥ 0.18), as the habitat effect was
considerably reduced by the significant random site
effects. Addition of the covariate, tree density, did not
alter the significance of habitat effects for any of the three
efficiency parameters.
Differences based on vegetation parameters
There were no significant differences between the
QUAD and VAT sampling methods for any of the
three vegetation parameters estimated (P ≥ 0.31;
Table 2). Differences between habitat categories for
the three vegetation parameters were non-significant
when modelled with LME (P ≥ 0.08), meaning that
the habitat effect again included a significant random
site effect.
Power of tests and economic or biological
significance
For parameters with non-significant differences
between the QUAD and VAT sampling methods
(all but man-hours and species mh-1), the power to
detect statistical significance appeared to be low even
if sampling effort was increased to 30 or 50 replicates
(Table 3).
For all the efficiency parameters the difference
between means was greater than the minimum value
required for economic significance (Fig. 1). However,
only for species mh-1 did the 95%
confidence interval of this difference also completely
exclude the minimum economic significance value.
This was nearly achieved for man-hours also (Fig. 1).
It follows that the VAT is statistically as well
as economically more efficient than the QUAD
Table 2 Mean values obtained in the field for six parameters, using two sampling techniques (QUAD and VAT; see text) in three
habitats (robusta, arabica and forest)

                        Explanatory variable: methods   Explanatory variable: habitat types
Parameter               VAT      QUAD     F-value       Robusta   Arabica   Forest    F-value
Efficiency parameters
  Man-hours             6.17     8.81     17.81***      6.60      6.43      13.27     0.49 ns
  Trees mh-1            4.29     3.88     2.18 ns       2.95      5.30      4.65      2.35 ns
  Species mh-1          1.61     1.14     23.25***      1.28      1.32      1.83      0.17 ns
Vegetation parameters
  Trees ha-1            225.49   217.26   0.17 ns       125.13    223.63    536.18    3.88 ns
  Species richness      10.43    10.19    0.10 ns       8.20      8.56      22.00     1.29 ns
  Simpson index         0.77     0.74     1.09 ns       0.75      0.71      0.91      0.36 ns

The F-value and statistical significance of parameters obtained with linear mixed effects models are also provided. Interactions
between methods and habitats were not significant in any of the models. ‘‘Efficiency’’ parameters are related to data collection
efficiency in the field; ‘‘vegetation’’ parameters are related to the estimation of tree density and diversity
ns non-significant
*** P < 0.001
Table 3 Details of the power to detect significant observed
effects (i.e., differences between means of the two sampling
methods, QUAD and VAT; see text) that was associated with
higher sampling efforts (N = 30 and N = 50). Also shown are
predefined economic (10% of QUAD mean) or biological (5%
of QUAD mean) significance limits (transformed as detailed
below), and sample sizes required to detect the predefined
significance values with a power of 80%

Parameter                   Power (%)         Significance limits   Sample size required
                            N = 30   N = 50
Efficiency parameters                         Economic
  1. Man-hours              89       99       -0.11 (a)             51
  2. Trees mh-1             20       31       0.10 (a)              273
  3. Species mh-1           93       99       0.11                  341
Vegetation parameters                         Biological
  4. Density (trees ha-1)   4        5        -0.37, +0.36 (b)      801
  5. Species richness       4        4        ±0.51                 2,064
  6. Simpson index          13       19       -0.05, +0.06 (c)      586

In all cases power was calculated for the t-test of means
(a) Natural log transformed values
(b) Square-root transformed values
(c) Squared values
Agroforest Syst
123
regarding species mh-1. In the case of man-hours and
trees mh-1, and although for the former the VAT was
statistically more efficient than the QUAD, the
current results are inconclusive regarding economic
significance and greater sampling effort would be
required to resolve the issue.
In the case of vegetation parameters the difference
between means obtained by the QUAD and VAT
sampling methods was within the limits of minimum
biological significance for all three parameters.
However, for all three vegetation parameters the
95% confidence intervals were wide and clearly
exceeded the biological significance limits (Fig. 1).
Thus it is not clear if there is a biologically important
difference between these two sampling methods with
regard to vegetation parameters.
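
The confidence-interval reasoning used above can be sketched in Python. This is an illustration under stated assumptions, not the authors' code: the function name `classify_effect` and its three outcome labels are hypothetical, and it covers only the two-sided case of biological significance limits.

```python
def classify_effect(ci_low, ci_high, limit_low, limit_high):
    """Compare a 95% CI for a difference between means with significance limits.

    The function name and the returned labels are hypothetical, chosen
    only to illustrate the reasoning in the text.
    """
    if ci_low > ci_high or limit_low > limit_high:
        raise ValueError("bounds must be given as (low, high)")
    # Whole interval inside the limits: any difference is too small to matter.
    if limit_low <= ci_low and ci_high <= limit_high:
        return "negligible"
    # Whole interval outside the limits: the difference is important.
    if ci_high < limit_low or ci_low > limit_high:
        return "important"
    # Interval straddles a limit: the data cannot settle the question.
    return "inconclusive"
```

With the species richness limits of ±0.51 from Table 3 and a made-up interval of -3.0 to 3.5, `classify_effect(-3.0, 3.5, -0.51, 0.51)` returns `"inconclusive"`, which is the situation described for all three vegetation parameters.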
Additional field sampling appeared to be imprac-
tical for resolving the inconclusive results above, as
the sample size required for 80% power to detect
minimum economically or biologically significant
effects was prohibitively large for all parameters
except man-hours (Table 3). Increasing the sample
size would improve the chance of detecting signifi-
cant economic or biological effects by reducing the
width of confidence intervals. This is attempted in the
next section by using simulations to greatly increase
the sample sizes for tree density estimation.
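
To show why the required sample sizes grow so quickly for small effects, the following Python sketch approximates the sample size needed for 80% power. It is a rough illustration only: it assumes a two-sided, two-sample comparison with equal group sizes and uses the normal approximation rather than the t-based calculation of the R package pwr cited in the references, so its results run slightly low. The function name `sample_size_per_group` is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    effect_size is the standardized difference (difference / standard deviation).
    Normal approximation (an assumption of this sketch), not an exact t-test
    power calculation.
    """
    if effect_size == 0:
        raise ValueError("effect_size must be non-zero")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)
```

Because the effect size enters as an inverse square, halving the effect to be detected roughly quadruples the required sample size, which is consistent with the very large values in the last column of Table 3.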
Simulation results
Spatial contagion of artificial datasets
The artificially generated datasets conformed to
expectations regarding spatial contagion, as they
had the following R index values: random distri-
butions in the range of 0.98–1.00, aggregated distri-
butions in the range of 0.60–0.63 and regular
distributions in the range of 1.23–1.25. These values
were similar to the published values of other studies
(Clark and Evans 1954; White et al. 2008). In
addition, 95% confidence limits obtained from 1,000
randomisations were within 5% of the average value
for random and regular distributions, and within 10%
for aggregated distributions. Z-tests confirmed that
these values were significant only for the non-random
distributions.
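
The R index of Clark and Evans (1954) is the ratio of the observed mean nearest-neighbour distance to the value expected under a random pattern of the same density, so values near 1 indicate randomness, values below 1 aggregation and values above 1 regularity. A minimal Python sketch follows; the function name `clark_evans_r` is hypothetical, and the sketch applies no edge correction, which may differ from the procedure used in this study.

```python
import math


def clark_evans_r(points, area):
    """Clark and Evans R: observed / expected mean nearest-neighbour distance.

    points is a list of (x, y) tuples; area is the area of the study region
    in the squared units of the coordinates. No edge correction is applied
    (an assumption of this sketch).
    """
    n = len(points)
    if n < 2 or area <= 0:
        raise ValueError("need at least two points and a positive area")
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        # Distance from point i to its nearest neighbour (brute force).
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    observed = total / n
    expected = 1.0 / (2.0 * math.sqrt(n / area))  # expectation for a random pattern
    return observed / expected


# A perfect square lattice is an extreme case of a regular pattern.
lattice = [(x + 0.5, y + 0.5) for x in range(10) for y in range(10)]
print(clark_evans_r(lattice, 100.0))
```

The lattice is far more regular than the simulated regular patterns reported here (R of 1.23–1.25), so it yields a correspondingly higher index.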
Bias, precision and accuracy under different
spatial distributions
Fig. 1 Plots of differences between means of the two sampling methods (filled diamond, VAT–QUAD; see text), with 95% confidence intervals, in relation to minimum economic significance value ("Econ. significance", single dashed line) or biological significance limits ("Biol. significance", two dashed lines) for six parameters measured in the field. Panels: efficiency parameters (Ln (man-hours), Ln (trees mh-1), Species mh-1) and vegetation parameters (Sqrt (trees ha-1), Species richness, Squared Simpson index); y-axis: difference between means. ln Natural logarithm, mh man-hours, sqrt square root, ha hectare

Random distribution Both sampling methods showed very little bias (<2%) when estimating the densities of random distributions (Table 4). This was true for both sample sizes tested (30 and 100). The precision of estimates (CV) also was very similar for both methods, and the CV decreased as density increased (Table 4). At higher densities QUAD estimates were more precise (lower CV) than VAT estimates, although this may not be important as both methods had precision <5% for most densities (except the lowest) with both sample sizes. Thus, the overall accuracy of estimation by both methods was within 5% for almost the entire density range, except for density of 10 trees ha-1 for which neither method produced accuracy within 5%, even with a sample size of 100 (Table 4).
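
The random-distribution test for the square plot can be illustrated with a small Python simulation. This is a simplified sketch, not the simulation code used in the study: the 25 ha region, the 20 m plot side and the function names are assumptions made for illustration, and only the QUAD is simulated because the VAT design is not reproduced here.

```python
import random


def simulate_random_trees(density_per_ha, side_m, rng):
    """Scatter trees uniformly at random over a square region (side in metres)."""
    n_trees = round(density_per_ha * side_m ** 2 / 10_000)
    return [(rng.uniform(0, side_m), rng.uniform(0, side_m)) for _ in range(n_trees)]


def quad_density_estimate(trees, side_m, plot_m, n_plots, rng):
    """Mean density (trees per ha) from n_plots randomly located square plots."""
    plot_ha = plot_m ** 2 / 10_000
    estimates = []
    for _ in range(n_plots):
        # Keep each plot fully inside the region so no plot is truncated.
        x0 = rng.uniform(0, side_m - plot_m)
        y0 = rng.uniform(0, side_m - plot_m)
        count = sum(
            1 for x, y in trees
            if x0 <= x < x0 + plot_m and y0 <= y < y0 + plot_m
        )
        estimates.append(count / plot_ha)
    return sum(estimates) / n_plots


rng = random.Random(1)  # fixed seed so the run is repeatable
trees = simulate_random_trees(density_per_ha=200, side_m=500, rng=rng)
estimate = quad_density_estimate(trees, side_m=500, plot_m=20, n_plots=100, rng=rng)
print(f"true density 200 trees/ha, QUAD estimate {estimate:.1f} trees/ha")
```

Repeating the last three lines many times with different seeds, and comparing the resulting estimates with the known density, gives the kind of bias and precision figures summarized in Table 4.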
Aggregated distribution For aggregated distributions both methods had positive biases up to 12% for tree densities from 10 to 100 trees ha-1. At higher densities, however, the QUAD had consistently low bias that was <5% and decreased progressively with density for both sample sizes, whereas the VAT continued to have relatively unchanged bias levels of 10–11%, even with a sample size of 100 (Table 4). Precision was poor for both methods, at >5% for all densities, thus resulting in generally poor accuracy (Table 4). Nevertheless, because the QUAD had little bias at high densities, it was generally more accurate than the VAT at high densities.
Regular distribution The QUAD showed almost no bias (≪1%) at all densities with both sample sizes. The VAT, however, showed a weak negative bias between -5 and -7% at medium to high densities. Thus, although both methods had similarly low values for precision (except at 10 trees ha-1), the accuracy of QUAD estimates was generally <5% for most densities whereas that of the VAT was more often >5%, especially at medium to high densities (>250 trees ha-1).
Overall, the QUAD estimated tree densities fairly
accurately for regular and random distributions,
whereas the VAT had generally good accuracy only
under random distribution. For both methods, preci-
sion improved with increasing sample size and density.
However, the bias associated with the VAT under
regular distribution was magnified at higher densities.
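
The three performance measures discussed in this section can be computed from a set of simulated density estimates as in the Python sketch below. The formulas are common textbook definitions (relative bias, coefficient of variation, relative root mean square error) and are an assumption here; the exact definitions used in this study are given in its methods section and may differ in detail. The function name `performance_metrics` is hypothetical.

```python
import math
from statistics import mean, stdev


def performance_metrics(estimates, true_value):
    """Return (relative bias, precision as CV, accuracy as RRMSE).

    estimates: density estimates from repeated sampling of one population.
    true_value: the known population density.
    Definitions are assumed, not taken from the paper.
    """
    if len(estimates) < 2 or true_value == 0:
        raise ValueError("need at least two estimates and a non-zero true value")
    m = mean(estimates)
    bias = (m - true_value) / true_value   # systematic departure from the truth
    cv = stdev(estimates) / m              # dispersion around the estimates' own mean
    squared_errors = [(e - true_value) ** 2 for e in estimates]
    rrmse = math.sqrt(mean(squared_errors)) / true_value  # combines bias and dispersion
    return bias, cv, rrmse
```

For example, estimates that are all 10% too high give a bias and RRMSE of 0.10 with a CV of 0, whereas unbiased but scattered estimates give a bias of 0 with a non-zero CV and RRMSE, which is why a method can be precise yet inaccurate.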
Difference between QUAD and VAT
The mean and 95% confidence intervals of the
difference between QUAD and VAT means were
within biologically significant limits for all densities
simulated under random distribution (Fig. 2). How-
ever, largely due to biases associated with the VAT
the difference between these two methods exceeded
biologically significant limits at medium to high
densities for aggregated and regular distributions.
Table 4 Results of the simulation study to assess bias, precision and accuracy of the two sampling methods, QUAD and VAT (see text), in estimating tree density (trees ha-1) under three different spatial distributions (aggregated, random and regular)

| Pattern | Density | Bias QUAD | Bias VAT | Precision QUAD | Precision VAT | Accuracy QUAD | Accuracy VAT |
|---|---|---|---|---|---|---|---|
| Aggregated | 10 | 0.12 | 0.12 | 0.17 | 0.17 | 0.22 | 0.22 |
| Aggregated | 50 | 0.08 | 0.09 | 0.12 | 0.13 | 0.16 | 0.17 |
| Aggregated | 100 | 0.06 | 0.09 | 0.10 | 0.11 | 0.12 | 0.16 |
| Aggregated | 200 | 0.04 | 0.10 | 0.08 | 0.10 | 0.09 | 0.15 |
| Aggregated | 300 | 0.03 | 0.11 | 0.07 | 0.09 | 0.08 | 0.15 |
| Aggregated | 500 | 0.03 | 0.11 | 0.05 | 0.08 | 0.06 | 0.14 |
| Aggregated | 700 | 0.02 | 0.10 | 0.05 | 0.15 | 0.05 | 0.19 |
| Random | 10 | 0.00 | 0.00 | 0.08 | 0.08 | 0.08 | 0.08 |
| Random | 50 | 0.00 | 0.00 | 0.04 | 0.04 | 0.04 | 0.04 |
| Random | 100 | 0.00 | 0.00 | 0.03 | 0.03 | 0.03 | 0.03 |
| Random | 200 | 0.00 | 0.00 | 0.02 | 0.02 | 0.02 | 0.02 |
| Random | 300 | 0.00 | 0.00 | 0.01 | 0.02 | 0.01 | 0.02 |
| Random | 500 | 0.00 | 0.00 | 0.01 | 0.02 | 0.01 | 0.02 |
| Random | 700 | 0.00 | 0.00 | 0.01 | 0.02 | 0.01 | 0.02 |
| Regular | 10 | 0.00 | 0.00 | 0.05 | 0.05 | 0.05 | 0.05 |
| Regular | 50 | 0.00 | 0.00 | 0.02 | 0.02 | 0.02 | 0.02 |
| Regular | 100 | 0.00 | 0.00 | 0.01 | 0.01 | 0.01 | 0.01 |
| Regular | 200 | 0.00 | -0.04 | 0.01 | 0.01 | 0.01 | 0.04 |
| Regular | 300 | 0.00 | -0.07 | 0.00 | 0.01 | 0.00 | 0.07 |
| Regular | 500 | 0.00 | -0.07 | 0.00 | 0.01 | 0.00 | 0.07 |
| Regular | 700 | 0.00 | -0.06 | 0.00 | 0.01 | 0.00 | 0.06 |

Spatial distributions were generated 1,000 times each, and sampled with 100 randomly located QUADs and VATs. "Bias" measures the systematic departures of density estimates from expected values. "Precision" measures the dispersion of estimates around their mean in terms of the coefficient of variation. "Accuracy" incorporates aspects of bias and precision in terms of the relative root mean square error (RRMSE, see text)
Discussion
Efficiency and accuracy of VAT under random
tree distribution
For two of the three efficiency parameters the VAT
was significantly more efficient than the QUAD in
terms of sampling effort in the field. The VAT required
significantly less effort to complete each replicate (in
terms of man-hours) while simultaneously increasing
the species information collected per unit effort (in
terms of species mh-1). This advantage could result in
significant economic savings during large-scale bio-
diversity inventories. Reduced sampling effort per
replicate also could promote greater representation of
geographic variation by allowing many small repli-
cates to be widely distributed across a region rather
than being limited to a few large replicates (Batcheler
and Craib 1985; Gimaret-Carpentier et al. 1998). This
is of relevance in tropical areas, where environmental
gradients, dispersal limitation and other ecological
processes often produce spatial sorting of species
(Condit et al. 2000). Although the QUAD recorded
higher numbers of trees per replicate than the VAT (35
trees on average for QUAD, vs. 26 for VAT), it had
lower numbers of species per tree (per replicate, 0.35
for QUAD vs. 0.41 for VAT) suggesting that QUADs
(at the scale used and the conditions prevailing in this
study) may be more vulnerable to species clumping,
perhaps as a result of their relatively compact,
as opposed to elongated, shape (Clapham 1932;
Bormann 1953). Thus, based on our findings the
VAT presents clear advantages over the QUAD in
terms of optimizing the field sampling effort.
The two sampling methods did not differ signifi-
cantly in their estimation of vegetation parameters,
such as tree density, species richness and diversity, in the
field, although there was considerable variation
across the sites sampled. The simulation study also
showed no significant difference between the two
sampling methods for estimating tree density under
random spatial distribution, indicating that the VAT
provides accurate assessments of tree density and
diversity when trees are located randomly in the
habitat.
Many previous studies have used statistical preci-
sion as the basis for comparing efficiency across
different sampling methods (Clapham 1932; Bor-
mann 1953; Lindsey et al. 1958; Batcheler and Craib
1985; Kenkel and Podani 1991), with the implicit
assumption that differences in field effort between
different methods would be negligible. However, our
detailed recording of field efficiency data shows that
this assumption is not true in the case of the QUAD
and the VAT. In the current study we are in a position
to use statistical precision as well as field perfor-
mance for comparing efficiencies. Based on these two
kinds of efficiency evaluations, we conclude that the
VAT compares favourably with and even exceeds the
performance of the QUAD in terms of field effi-
ciency. Thus the VAT can be considered a reliable
alternative to the QUAD for efficient sampling of
tree density and diversity across varied habitats and
topographic conditions, subject to the condition of
random spatial distribution.
Biases under non-random tree distributions
Fig. 2 Differences between the mean estimations of tree density (i.e., "difference between means") by the two sampling methods (VAT–QUAD; see text) with 95% confidence limits, when sampling trees distributed spatially in a random (open diamond), aggregated (open triangle) or regular (open square) manner, under different densities ("trees ha-1"). Density estimations were obtained using 100 sampling replicates per spatial distribution-density combination, each of which was simulated 1,000 times. The divergent (dashed) lines delineate the biologically significant limits of 5% deviation from the mean QUAD estimate. Axes: x, density (trees ha-1), 10 to 700; y, difference between means

As we were unable to spatially map our field study
sites for assessing the accuracy of the two sampling
methods in the field, the simulation study provided an
appropriate testing environment in which to compare
known population parameters against estimates provided
by the two sampling methods. Thus, under the
simulated aggregated and regular distributions, biases
were found to be associated with tree density
estimation by the VAT. Parker (1979) had previously
noted the possibility of bias when sampling with a
VAT under aggregated distributions and recom-
mended using fixed area quadrat methods instead,
under such conditions. On the other hand, the degree
of bias observed here could be considered unimpor-
tant in a different context, such as if the variability
observed in the field (i.e., precision) is also relatively
high (Sheil et al. 2003). Also, it may be noted that the
economic and biological significance limits used in
this study are conservative in comparison with those
used by other studies, where bias and precision values
up to 10% or 20% were considered acceptable
(Cottam and Curtis 1956; Lindsey et al. 1958;
Laycock and Batcheler 1975).
The QUAD was less biased, more precise and thus
more accurate than the VAT for estimating tree
densities under simulated nonrandom spatial distri-
butions. This is similar to the findings of other
simulation-based studies (Engeman et al. 1994;
White et al. 2008). Comparison of our simulation
results with those of other studies showed that the
RRMSE values obtained by us were better (lower)
than those reported by Engeman et al. (1994), and
White et al. (2008), especially under random and
regular spatial distributions. White et al. (2008)
reported a negative bias similar to ours, when
sampling regular distributions with a VAT; however,
the corresponding value reported by Engeman et al.
(1994) was positive. Both of those studies reported
negative biases for the VAT under aggregated
distributions. It is possible that the difference in
direction of bias detected by the different simulation
studies is related to differences in the VAT methods
used (the VAT tested by us is larger and more
complex than in the other studies). Further analysis is
required to understand these differences and find
ways to reduce the biases.
Engeman et al. (1994) found no bias associated
with the QUAD under any simulated spatial distri-
bution or density, whereas in our study the QUAD
was positively biased at low densities. Also, both
methods generally performed poorly in terms of
precision at very low densities, which affected the
accuracy. Thus, neither the QUAD nor the VAT
appears to be appropriate for estimating very low tree
densities.
Applications of the study
Landscapes with moderate to high tree stocking
density and variable densities of undergrowth, as in
this study, are likely to be common across agro-
silvicultural landscapes of the tropics. It should be
kept in mind that the optimal strategy for assessing
tree diversity would involve choice of a sampling
method as well as an appropriate estimator (Gimaret-
Carpentier et al. 1998), and that the optimal strategy
for efficient estimation of diversity may differ from
that for density. Nevertheless, in the interest of
inventorying human-modified landscapes quickly, the
variable area transect method developed by Sheil
et al. (2003) appears more suitable than the fixed area
square plot if the trees are distributed randomly. Our
recommendation is based on the greater efficacy of
the VAT with regard to utilization of time, as
demonstrated in the field, plus the absence of bias
in estimating tree density, as demonstrated under
controlled simulation conditions where the popula-
tion density was known. On the other hand, under
non-random spatial distributions the time advantage
gained by using the VAT is to be traded off against the
disadvantage of obtaining slightly biased estimations
of tree density. Given that both types of sampling
methods examined in this study showed biases under
nonrandom tree distributions, bias corrections gener-
ally should be applied if exact density values are
required under these conditions.
Finally, there are situations under which the
traditional QUAD would be preferred regardless of
its field efficiency. For example, if large numbers of
trees are to be censused per replicate without regard
for time, or if repeated long-term sampling of a fixed
location is called for, a single or few large square
plots may be most appropriate.
Acknowledgements Funding was provided by the CAFNET
project of the EuropAid program of the European Union
(Connecting, enhancing and sustaining environmental services
and market values of coffee agroforestry in Central America,
East Africa and India, CAFNET—Europaid/ENV/2006/114-
382/TPS). We are grateful to the farmers and estate managers
who permitted us to use their properties for data collection. We
thank N. Barathan for his assistance with species identification
and specimen collection, S. Aravajy for species confirmation,
and the technicians, students and field assistants of the French
Institute of Pondicherry and Forestry College, Ponnampet,
Kodagu, for assistance in the field. We also thank Douglas
Sheil for helpful discussions during fieldwork and critical
comments on the manuscript, and two anonymous reviewers
for their valuable comments.
References
Armesto JJ, Mitchell JD, Villagran C (1986) A comparison of
spatial patterns in some tropical and temperate forests.
Biotropica 18:1–11
Barbour MG, Burk JH, Pitts WD (1987) Terrestrial plant
ecology, 2nd edn. Benjamin/Cummings, Menlo Park
Batcheler CL, Craib DG (1985) A variable area plot method of
assessment of forest condition and trend. NZ J Ecol 8:83–
95
Bhagwat SA, Kushalappa CG, Williams PH, Brown ND (2005)
A landscape approach to biodiversity conservation of
sacred groves in the Western Ghats of India. Conserv Biol
19:1853–1862
Bormann FH (1953) The statistical efficiency of sample plot
size and shape in forest ecology. Ecology 34:474–487
Bryant DM, Ducey MJ, Innes JC et al (2004) Forest commu-
nity analysis and the point-centered quarter method. Plant
Ecol 175:193–203
Champely S (2007) pwr: Basic functions for power analysis.
R package version 1.1
Clapham AR (1932) The form of the observational unit in
quantitative ecology. J Ecol 20:192–197
Clark PJ, Evans FC (1954) Distance to nearest neighbor as a
measure of spatial relationships in populations. Ecology
35:445–453
Condit R, Ashton PS, Baker B et al (2000) Spatial patterns in
the distribution of tropical tree species. Science 288:
1414–1418
Cottam G, Curtis JT (1956) The use of distance measures in
phytosociological sampling. Ecology 37:451–460
Crawley MJ (2007) The R book. Wiley, New York
Di Stefano J (2004) A confidence interval approach to data
analysis. For Ecol Manag 187:173–183
Elouard C (2000) Vegetation features in relation to biogeography.
In: Ramakrishnan PS, Chandrashekara UM, Elouard C et al
(eds) Mountain biodiversity, land use
dynamics, and traditional ecological knowledge. Oxford/
IBH, New Delhi, pp 25–42
Elouard C, Chaumette M, de Pommery H (2000) The role of
coffee plantations in biodiversity conservation. In:
Ramakrishnan PS, Chandrashekara UM, Elouard C et al
(eds) Mountain biodiversity, land use dynamics, and tra-
ditional ecological knowledge. Oxford/IBH, New Delhi,
pp 120–144
Engeman RM, Sugihara RT, Pank LF, Dusenberry WE (1994)
A comparison of plotless density estimators using Monte
Carlo simulation. Ecology 75:1769–1779
Garcia CA, Bhagwat SA, Ghazoul J et al (in press) Biodiversity
conservation in agricultural landscapes: challenges and
opportunities of coffee agroforestry in the Western Ghats,
India. Conserv Biol
Gerard PD, Smith DR, Weerakkody G (1998) Limits of ret-
rospective power analysis. J Wildl Manag 62:801–807
Gimaret-Carpentier C, Pelissier R, Pascal J-P, Houllier F
(1998) Sampling strategies for the assessment of tree
species diversity. J Veg Sci 9:161–172
Good RE, Good NF (1971) Vegetation of a Minnesota prairie
and a comparison of methods. Am Midl Nat 85:228–231
Grafen A, Hails R (2002) Modern statistics for the life sci-
ences. Oxford University Press, Oxford
Hubbell SP (1979) Tree dispersion, abundance, and diversity in
a tropical dry forest. Science 203:1299–1309
Kenkel NC, Podani J (1991) Plot size and estimation efficiency
in plant community studies. J Veg Sci 2:539–544
Laycock WA, Batcheler CL (1975) Comparison of distance-
measurement techniques for sampling tussock grassland
species in New Zealand. J Range Manag 28:235–239
Lindsey AA, Barton JD Jr, Miles SR (1958) Field efficiencies
of forest sampling methods. Ecology 39:428–444
Lyon LJ (1968) An evaluation of density sampling methods in
a shrub community. J Range Manag 21:16–20
Mark AF, Esler AE (1970) An assessment of the point-centred
quarter method of plotless sampling in some New Zealand
forests. Proc NZ Ecol Soc 17:106–110
McNeeley JA, Schroth G (2006) Agroforestry and biodiversity
conservation—traditional practices, present dynamics,
and lessons for the future. Biodivers Conserv 15:549–554
Moppert B (2000) Expansion of coffee plantations and land-
scape changes. In: Ramakrishnan PS, Chandrashekara
UM, Elouard C et al (eds) Mountain biodiversity, land use
dynamics, and traditional ecological knowledge. Oxford/
IBH, New Delhi, pp 88–98
Parker KR (1979) Density estimation by variable area transect.
J Wildl Manag 43:484–492
Pascal J-P (1988) Wet evergreen forests of the Western Ghats
of India; ecology, structure, floristic composition and
succession. Institut Francais de Pondicherry, Pondicherry
Perfecto I, Rice RA, Greenberg R, van der Voort ME (1996)
Shade coffee: a disappearing refuge for biodiversity.
Bioscience 46:598–608
Pinheiro JC, Bates DM (2000) Mixed-effects models in S and
S-PLUS. Springer, New York
Pinheiro J, Bates D, DebRoy S et al (2008) nlme: Linear and
nonlinear mixed effects models. R package version 3.1-89
R Development Core Team (2008) R: a language and environ-
ment for statistical computing. R Foundation for Statistical
Computing, Vienna, Austria. ISBN: 3-900051-07-0. http://
www.R-project.org
Risser PG, Zedler PH (1968) An evaluation of the grassland
quarter method. Ecology 49:1006–1009
Scott CT, Gove JH (2002) Forest inventory. In: El-Shaarawi
AH, Piegorsch WW (eds) Encyclopedia of environmet-
rics. Wiley, Chichester, pp 814–820
Sheil D, Puri RD, Basuki I et al (2002) Exploring biological
diversity, environment and local people’s perspectives in
forest landscapes. Methods for a multidisciplinary land-
scape assessment. Center for International Forestry
Research, Bogor
Sheil D, Ducey MJ, Sidiyasa K, Samsoedin I (2003) A new
type of sample unit for the efficient assessment of diverse
tree communities in complex forest landscapes. J Trop For
Sci 15:117–135
Sokal RR, Rohlf FJ (1995) Biometry: the principles and
practice of statistics in biological research, 3rd edn. W. H.
Freeman and Co, New York
Steidl RJ, Hayes JP, Schauber E (1997) Statistical power
analysis in wildlife research. J Wildl Manag 61:270–279
Thomas L (1997) Retrospective power analysis. Conserv Biol
11:276–280
Thomas L, Juanes F (1996) The importance of statistical power
analysis: an example from animal behaviour. Anim Behav
52:856–859
Walther BA, Moore JL (2005) The concepts of bias, precision
and accuracy, and their use in testing the performance of
species richness estimators, with a literature review of
estimator performance. Ecography 28:815–829
White NA, Engeman RM, Sugihara RT, Krupa HW (2008) A
comparison of plotless density estimators using Monte
Carlo simulation on totally enumerated field data sets.
BMC Ecol 8:6