The use of most predictable surfaces for the classification and mapping of taxon assemblages


Vegetatio 74: 125-135, 1988 © Kluwer Academic Publishers, Dordrecht - Printed in the Netherlands

G. M. MacDonald¹ & N. M. Waters²

¹Department of Geography, McMaster University, Hamilton, Ontario, Canada, L8S 4K1; ²Department of Geography, University of Calgary, Calgary, Alberta, Canada, T2N 1N4

Accepted 12.1.1988

Keywords: Alberta - Canada, Most Predictable Surface, Multivariate spatial analysis, Pollen, Vegetation classification, Vegetation mapping

Abstract

Most Predictable Surface (MPS) analysis provides a spatially explicit, multivariate technique for the classification and contour mapping of taxon assemblages. In this paper, the technique of producing Most Predictable Surfaces is outlined and the application of MPS for the classification and mapping of taxon assemblages is demonstrated using modern pollen spectra from western Canada. The MPS maps are compared with maps of scores from principal components analysis. The strength of MPS is that it provides a classification of sites, a local mapped surface of assemblage distribution, and a global model of the relationship between taxon assemblages and geographic coordinates. The global model relating taxon assemblages to geographic coordinates may be used for indirect gradient analysis if the geographic coordinates can be related to specific environmental factors. Alternatively, independent environmental variables may be used directly in place of geographic coordinates. Potential limitations of MPS include (1) the assumption that the distribution of sites with similar assemblages can be approximated by a polynomial, (2) the assumption that only two major taxon assemblages are present in the study area and that further subdivision of the assemblages is hierarchical, (3) the assumption of a linear relationship between the taxa, and (4) the requirement of a relatively high ratio of sample sites to taxa. However, the results presented here indicate that MPS can have wide application in the analysis of vegetation or any other types of taxon assemblages.

Abbreviations: MPS: Most Predictable Surface

Introduction

The recognition of correspondence in the spatial distributions of plant species is an important component of plant ecology. At the heart of vegetation classification is the implicit assumption that, at the correct scale, the spatial segregation of individual taxa can be recognized to form discrete taxon assemblages or communities. The underlying control on the spatial distributions is provided by environmental gradients. A variety of quantitative techniques are available for generating contour maps of the abundances of single taxa (MacDonald & Waters 1987). However, analysis of the distributions of communities or larger vegetation units requires a multivariate approach for classifying samples on the basis of the spatial correlation between plant taxa and mapping the resulting assemblages.


Palaeoecologists have long been interested in classifying and mapping the spatial distributions of pollen assemblages (e.g. Webb 1974; Birks & Saarnisto 1975; Birks et al. 1975; Webb & McAndrews 1976; Huntley & Birks 1983; Prentice 1978, 1986). Many palynologists have used R-mode principal components analysis (PCA) to reduce taxa to a few compound variables in the form of components. The taxon loadings on each component are examined and the components are related to past or present vegetation types. Contour maps of the component scores are then constructed. Similar applications of PCA have been used to map modern plant distribution (e.g. Mucina & Polacik 1982). Pollen contour maps have also been constructed using non-metric, multidimensional scaling co-ordinates rather than PCA components (Prentice 1978). A potential drawback of using PCA or similar approaches to classify samples and construct maps of taxon assemblages is that the spatial relationships between the samples are not taken into account in the construction of the components or co-ordinate axes. In this paper we examine the use of Most Predictable Surface (MPS) analysis (Lee 1981) as a spatially explicit alternative for the classification and mapping of assemblages of taxa. The application of MPS to classification and mapping is demonstrated using pollen surface samples from western Canada (MacDonald & Ritchie 1986). The strengths and weaknesses of MPS for analysing vegetation and other assemblages of taxa are discussed.

Most predictable surface analysis

MPS analysis was developed by Lee (1981) and is an extension of his earlier work on canonical trend surface analysis (Lee 1969). MPS uses an approach similar to canonical correlation analysis (Hotelling 1936) to derive the correlation between a set of random variables and a set of geographic coordinates. The specific aims of MPS are to (1) classify samples into spatially contiguous groups that are internally homogeneous in terms of random variable composition, and (2) provide a local contour surface and a global model of the spatial distributions of the two groups. In plant ecological applications, the random variables represent plant taxon abundances and the geographic coordinates represent the spatial location of the sample sites.

The first step in canonical correlation is the computation of a variate for each variable set so that the pair of variates has a stronger linear relationship than any other pair. The variates are linear combinations of the original variables and may be considered analogous to a component in PCA. The most highly correlated pair of variates is identified, then the second highest and so on. The variates are determined subject to the constraint that higher order variates are uncorrelated with the variates from the preceding steps. The maximum number of variates which may be identified is equal to the number of variables in the smaller of the two variable sets. Canonical correlation analysis uses an eigenvector approach employing the within-set variance-covariance matrices for each variable group, the between-set covariance matrix, and the identity matrix to determine the coefficients of the variates. Detailed introductory accounts of the mathematics and assumptions of canonical correlation analysis are available from Clark (1975) and Briggs & Leonard (1977). Discussions of the application of canonical correlation to ecological and palynological data are provided by Gauch & Wentworth (1976) and Webb & Clark (1977).

Although MPS is based on canonical correlation analysis, there are important differences between the two approaches. The variates in MPS are referred to as roots. The MPS equation for the random variable set has a form similar to the canonical variate equation (notation follows Lee 1981):

$$U = a_1 z_1 + a_2 z_2 + a_3 z_3 + \ldots \qquad (1)$$

where U is the composite variable, $z_1, z_2, z_3, \ldots$ are the taxon variables and $a_1, a_2, a_3, \ldots$ are the MPS coefficients. The coefficients are determined by MPS so that the magnitude of $a_j$ relates to the strength of the spatial variation of the j-th taxon. These MPS coefficients can be negative or positive, and taxa which are positively spatially correlated will have coefficients with the same sign. Samples with similar taxon composition will produce similar values of U, and the samples are classified into two groups on the basis of the sign of U. The equation for the geographic coordinates is expressed as a polynomial such as:

$$V = b_1 x + b_2 y + b_3 x^2 + b_4 xy + \ldots \qquad (2)$$

where V is the composite variable for the second variable group, $b_1, b_2, b_3, \ldots$ are coefficients determined by MPS, and x and y are the geographical coordinates. Equation 2 can be expanded to any suitable polynomial and has the form of a trend surface. The sign and magnitude of the V value for each sample should ideally be the same as those of its U value. This implies that geographic coordinates alone may be used to predict the taxa values and classify the samples. To calculate the U and V terms, MPS attempts to identify the linear combination with the maximum possible variance of U while increasing the covariance of U and V by adding higher power and cross-product terms of the geographic coordinates. Thus, MPS progressively increases the order of the trend surface equation. This procedure is repeated until the trend surface order is reached at which the variance of U and the covariance of U and V are at a maximum, and the Most Predictable Surface (sensu Lee 1981) is determined. An important difference between MPS and canonical correlation analysis is that MPS does not use the within-set correlation matrix for the taxa variables when assessing the correlation between U and V. Multicollinearity within this variable set led to instabilities in the coefficients $a_1, a_2, \ldots, a_p$ (Lee 1981).
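The canonical correlation step on which MPS builds can be illustrated with a short sketch. The code below is not Lee's (1981) MPS algorithm: it retains the within-set covariance matrix of the taxa, which MPS omits, and is therefore closer to a canonical trend surface in the sense of Lee (1969). The function names (`poly_terms`, `first_canonical_pair`), the synthetic site coordinates and the three artificial "taxa" are assumptions made purely for illustration.

```python
import numpy as np

def poly_terms(x, y, order):
    """Polynomial terms x, y, x^2, xy, y^2, ... up to `order` (no constant)."""
    cols = [x ** (d - j) * y ** j
            for d in range(1, order + 1) for j in range(d + 1)]
    return np.column_stack(cols)

def first_canonical_pair(Z, G):
    """First pair of canonical variates (U, V) and their correlation r."""
    Zc = Z - Z.mean(axis=0)
    Gc = G - G.mean(axis=0)
    n = Z.shape[0]
    Szz = Zc.T @ Zc / (n - 1)   # within-set covariance, taxa
    Sgg = Gc.T @ Gc / (n - 1)   # within-set covariance, coordinate terms
    Szg = Zc.T @ Gc / (n - 1)   # between-set covariance
    # Eigenproblem: Szz^-1 Szg Sgg^-1 Sgz a = r^2 a
    M = np.linalg.solve(Szz, Szg) @ np.linalg.solve(Sgg, Szg.T)
    vals, vecs = np.linalg.eig(M)
    k = int(np.argmax(vals.real))
    a = vecs[:, k].real                    # taxon coefficients (equation 1)
    b = np.linalg.solve(Sgg, Szg.T @ a)    # coordinate coefficients (equation 2)
    return Zc @ a, Gc @ b, float(np.sqrt(vals[k].real))

# Synthetic example (hypothetical data, not the Alberta pollen spectra).
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 60)
y = rng.uniform(0, 1, 60)
Z = np.column_stack([
    x + 0.1 * rng.normal(size=60),    # taxon increasing eastward
    -x + 0.1 * rng.normal(size=60),   # taxon decreasing eastward
    rng.normal(size=60),              # taxon with no spatial pattern
])
G = poly_terms(x, y, order=2)
U, V, r = first_canonical_pair(Z, G)
print(f"first canonical correlation: {r:.3f}")
```

In this sketch the two spatially structured taxa receive coefficients of opposite sign, so sites can be split by the sign of U, and the order of `poly_terms` can be raised to mimic the stepwise expansion of the trend surface.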

For mapping purposes, three distinct surfaces are derived from MPS. The first is the U surface, which is the product of mapping the values produced by equation 1. The second is the V trend surface produced by polynomial equation 2. Finally, a residual surface may be constructed to examine the fit of the V trend surface. Residuals are calculated as the arithmetic difference between corresponding U and V values. It is possible to calculate subsequent sets of MPS equations for U and V so that the associated roots are uncorrelated with previous roots and have decreasing correlation coefficients. The number of roots which can be produced is equal to the number of random variables in the data set. The actual maps of the MPS surfaces may be constructed using distance-weighted averaging or similar algorithms (MacDonald & Waters 1987) to fit contours to the U, V and U-V (residual) surface values. In this paper we have used the distance-weighted averaging and simulated 3-d view subroutines of the Surface II package (Sampson 1978). Significance tests for MPS surfaces have not been developed. Full details on the mathematical derivations of MPS are available from Lee (1981).
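The gridding step can be sketched as follows. This is a generic inverse-distance-weighted average, offered as an assumption about what a distance-weighted averaging routine does in outline; it is not the Surface II implementation, whose search rules and weighting function are not described here. The function name `idw_grid`, the four site locations and the U and V values are hypothetical.

```python
import numpy as np

def idw_grid(xs, ys, values, grid_x, grid_y, power=2.0):
    """Inverse-distance-weighted average of site values at each grid node."""
    gx, gy = np.meshgrid(grid_x, grid_y)
    # Distance from every grid node to every site: shape (ny, nx, n_sites).
    d = np.hypot(gx[..., None] - xs, gy[..., None] - ys)
    d = np.maximum(d, 1e-12)          # avoid division by zero at a site
    w = 1.0 / d ** power
    return (w * values).sum(axis=-1) / w.sum(axis=-1)

# Hypothetical U and V values at four sites.
xs = np.array([0.0, 1.0, 0.0, 1.0])
ys = np.array([0.0, 0.0, 1.0, 1.0])
U = np.array([1.0, -0.5, 0.8, -1.2])
V = np.array([0.7, -0.4, 0.9, -1.0])
resid = U - V                          # residual = U minus V at each site

grid = np.linspace(0.0, 1.0, 5)
surface_U = idw_grid(xs, ys, U, grid, grid)
surface_resid = idw_grid(xs, ys, resid, grid, grid)
```

The zero contour of `surface_U` would then be drawn with any contouring routine; because the estimate is a weighted average, gridded values never exceed the range of the site values.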

Classification and mapping taxon assemblages using MPS

The data requirements for MPS are similar to the requirements for other multivariate techniques used in plant ecology (see Gauch 1982). If a study area contains two distinct taxon assemblages (vegetation types in the current application), each of which is relatively homogeneous and distributed spatially in a manner that can be approximated by a polynomial, the first root equations for U and V should produce positive values for sites from one vegetation type and negative values for sites in the other type. The relationship between taxa and site classification may be examined by inspecting the U equation coefficients. Taxa with large negative or positive coefficients are important in determining the ultimate classification of a site. The zero contour on the U and V surface maps provides an approximation of the spatial boundary between the two vegetation types. Sharp boundaries between vegetation types will be displayed as steep contours in the vicinity of the zero contour when the MPS results are mapped. Diffuse boundaries will result in gentle contour gradients around the zero contour interval. The U-V residual surface identifies regions where the V surface provides a poor fit to the spatial distribution of the taxon assemblages classified by MPS. Large residuals indicate that explanatory variables besides spatial location must be considered to account for these departures from the global, spatial model provided by the V surface.

The mapping of the first root from MPS portrays the main two vegetation units in a given study area. However, vegetation mapping is often concerned with situations where more than two distinct vegetation types are represented. Two approaches are available if further subdivision and mapping of vegetation units is desired. First, a separate MPS analysis may be run on the data from either of the first two identified vegetation units. Second, as the MPS roots are orthogonal, values of second and subsequent roots should represent the next most important subdivisions of the vegetation in successive order. Scores and coefficients from the second and subsequent roots may be examined and maps produced in the same manner as for the first root. Again, the zero contour on the MPS map approximates the spatial boundary between the two vegetation types as differentiated by the second and subsequent roots. However, it is by no means guaranteed that these subsequent subdivisions will have substantive meaning.

Mapping modern pollen assemblages from western Canada

To demonstrate the application of MPS for mapping taxon assemblages, we have analysed modern pollen assemblages from Alberta and eastern British Columbia. The modern pollen spectra from Alberta closely reflect contributing vegetation (MacDonald & Ritchie 1986; MacDonald & Waters 1987). In broad terms, the vegetation of the province consists of grassland in the southeast and forest elsewhere. Within this context six major vegetation zones can be recognized (Fig. 1a). The grasslands of the southeast are dominated by Gramineae and Carex spp., but support significant numbers of diverse herbs and shrubs including species of Artemisia, Chenopodiaceae, Compositae, and Selaginella. The parkland is a transitional zone between the grassland and the forests to the north and northwest. The parkland is characterised by stands of Populus tremuloides and Picea glauca alternating with tracts of grassland. The boreal forest occupies the northeastern half of Alberta. Typical boreal forest trees include Picea glauca, P. mariana, Pinus banksiana, Larix laricina, Populus tremuloides, P. balsamifera and Betula papyrifera. The subalpine forest occupies most of the Rocky Mountains along the southwestern edge of the province. Typical subalpine trees include Picea glauca, P. engelmannii, Abies lasiocarpa, and Pinus contorta. Pinus flexilis, P. albicaulis and Larix lyallii are locally important near treeline. Alpine tundra is dominant above 2500 m elevation. The subalpine-boreal transition region is characterised by a mixture of subalpine and boreal species. Common trees include Picea glauca, P. mariana, Abies balsamea, A. lasiocarpa, Pinus contorta, Populus tremuloides, Populus balsamifera, and Betula papyrifera.

Fig. 1. a. General vegetation zones of Alberta (after National Atlas of Canada, 1974); legend: grassland, parkland, boreal forest, subalpine forest, subalpine-boreal transition. b. Position map of pollen surface sample sites.

The data used in this example are a network of 56 modern pollen surface samples (Table 1; Fig. 1b). The samples are part of a larger data-set of modern pollen spectra reported in detail by MacDonald & Ritchie (1986). The samples include all of the major vegetation types except alpine tundra. Samples were obtained from the surface sediments of small to moderate sized lakes. Pollen percentages for the nine most common taxa are used here.

The main features of modern pollen deposition include high values of Pinus (ca. 70%) in the subalpine forest with a gradual decline to the northeast and a rapid decline to the southeast. However, long-distance dispersal produces values of Pinus as high as 42% in the grasslands. Picea reaches maximum values (ca. 40%) in the northeastern portion of the boreal forest and is at a minimum in the grassland. Alnus and Betula also achieve maximum values in the boreal forest region and minimum values in the grassland. Cheno-Am. (Chenopodiaceae and Amaranthaceae), Artemisia and Gramineae all achieve their highest frequencies (ca. 6% to 40%) in the grasslands.

Table 1. Location of the surface sample sites, observed vegetation setting and pollen percentages, and terrestrial pollen sum.

Site  Vegetation zone    Picea  Pinus  Betula  Alnus  Salix  Artemisia  Cheno-Am.  Compositae  Gramineae
1     Grassland          03.0   40.0   00.3    01.5   02.0   09.0       25.0       01.6        13.0
2     Grassland          02.0   31.0   00.0    00.0   00.5   09.0       23.0       02.3        08.0
3     Grassland          04.3   36.0   00.0    01.4   02.0   08.0       22.0       01.6        13.0
4     Grassland          01.7   19.0   00.0    00.0   02.0   22.0       39.0       05.0        15.0
5     Grassland          03.0   34.0   04.0    01.6   03.0   08.0       23.0       03.2        16.0
6     Grassland          04.0   38.0   01.8    01.8   03.0   09.0       16.0       01.4        15.0
7     Grassland          02.0   21.0   00.6    01.5   12.0   08.0       06.0       03.7        39.0
8     Grassland          05.0   40.0   01.5    00.2   02.0   10.0       09.0       01.6        14.0
9     Grassland          03.0   42.0   01.0    01.2   03.0   26.0       06.0       02.2        11.0
10    Grassland          01.8   18.0   01.0    02.7   04.0   20.0       06.5       13.0        20.0
11    Grassland          04.0   23.0   03.0    01.5   03.2   40.0       06.0       02.0        10.0
12    Subalpine for.     10.0   72.0   00.0    02.0   01.0   00.0       00.0       00.0        00.5
13    Subalpine for.     10.0   68.0   00.0    05.0   00.0   00.0       00.0       00.0        00.5
14    Subalpine for.     10.0   76.0   05.0    01.8   00.6   02.6       00.8       01.0        01.5
15    Subalpine for.     15.0   66.0   00.5    09.0   02.0   00.0       00.0       00.0        00.0
16    Subalpine for.     16.0   75.0   03.0    04.0   00.9   00.0       00.9       00.0        00.0
17    Subalpine for.     15.0   73.0   01.5    02.0   00.0   01.7       00.5       00.0        00.6
18    Subalpine for.     12.0   74.0   02.0    02.0   00.0   01.5       00.4       00.0        00.7
19    Subalpine for.     11.0   84.0   01.4    00.4   00.4   00.2       00.2       00.0        00.3
20    Subalpine for.     14.0   78.0   01.7    01.5   00.7   00.7       00.0       00.0        00.4
21    Subalpine for.     13.0   82.0   01.9    01.7   00.4   01.0       00.4       00.0        00.2
22    Subalpine for.     10.0   85.0   00.6    01.2   00.2   00.5       00.0       00.0        00.2
23    Parkland           21.0   51.0   06.0    04.0   06.0   00.0       00.6       00.6        04.0
24    Parkland           18.0   39.0   09.0    03.6   03.0   08.0       03.0       00.6        06.0
25    Parkland           19.0   39.0   09.0    04.0   04.0   10.0       01.7       00.3        04.0
26    Parkland           07.0   22.0   17.0    07.0   07.0   05.0       03.0       00.0        04.0
27    Parkland           20.0   40.0   25.0    04.0   01.2   02.3       01.0       00.4        01.0
28    Parkland           18.0   24.0   20.5    06.7   03.1   02.6       00.0       00.5        07.0
29    Parkland           13.0   30.0   20.0    20.0   06.0   00.5       00.0       00.9        11.0
30    Parkland           21.0   36.0   11.0    03.0   03.6   13.0       00.9       01.3        04.5
31    Sub.-Boreal for.   22.0   61.0   05.4    01.3   01.5   00.0       00.0       03.0        01.7
32    Sub.-Boreal for.   19.0   71.0   03.0    02.0   00.0   01.0       00.6       00.0        00.0
33    Sub.-Boreal for.   29.0   61.0   04.0    05.0   00.0   00.4       00.0       00.0        00.2
34    Sub.-Boreal for.   25.0   65.0   06.0    01.5   01.0   00.2       00.0       00.0        00.6
35    Sub.-Boreal for.   26.0   66.0   04.0    01.0   00.6   00.0       00.0       00.0        00.4
36    Sub.-Boreal for.   20.0   59.0   09.0    04.4   01.6   01.0       00.0       00.2        00.6
37    Sub.-Boreal for.   23.0   56.0   09.0    04.0   0.16   02.0       00.0       00.5        00.7
38    Sub.-Boreal for.   19.0   51.0   10.0    09.0   03.0   01.2       00.0       00.6        01.0
39    Sub.-Boreal for.   17.0   61.0   05.0    06.4   01.2   01.0       00.0       00.6        01.2
40    Sub.-Boreal for.   50.0   20.0   10.0    09.0   02.0   00.5       00.2       00.0        05.0
41    Sub.-Boreal for.   24.0   58.0   06.0    06.7   01.0   00.2       00.0       00.0        00.3
42    Sub.-Boreal for.   22.0   48.0   12.0    10.0   03.0   00.6       00.0       00.0        00.5
43    Sub.-Boreal for.   29.0   48.0   08.0    05.6   00.8   00.4       00.0       00.2        00.4
44    Boreal forest      30.0   40.0   05.0    06.0   06.0   00.0       00.5       00.0        05.0
45    Boreal forest      47.0   22.0   17.0    05.0   02.0   03.0       00.3       00.2        00.5
46    Boreal forest      20.0   24.0   33.0    15.0   07.0   03.0       00.0       00.3        07.0
47    Boreal forest      36.0   23.0   24.0    06.0   02.2   01.2       00.0       00.3        01.6
48    Boreal forest      23.0   20.0   35.0    10.0   03.0   03.0       00.0       00.0        03.0
49    Boreal forest      44.0   27.0   12.0    09.0   02.0   01.2       00.0       00.0        00.0
50    Boreal forest      36.0   23.0   23.0    11.4   0.25   01.5       00.0       00.4        01.7
51    Boreal forest      39.0   23.0   19.0    07.0   09.0   03.0       00.0       00.0        03.0
52    Boreal forest      26.0   43.0   09.0    12.0   01.0   01.0       00.0       00.3        00.8
53    Boreal forest      24.0   36.0   13.0    11.0   07.0   01.0       00.0       00.0        01.0
54    Boreal forest      33.0   19.0   20.0    16.4   04.0   00.8       00.1       00.0        01.5
55    Boreal forest      25.3   23.3   18.1    13.6   04.0   00.9       00.1       00.1        02.0
56    Boreal forest      24.7   18.2   24.0    17.0   02.7   00.6       00.2       00.0        02.1

The vegetation and associated pollen rain of Alberta provide ideal conditions for evaluating the use of MPS for mapping taxon assemblages. The province has two major vegetation types, grassland and forest. The forest may be subdivided into two major components, subalpine and boreal forests. The subalpine-boreal forest and the parkland serve as transitional regions between the major vegetation types. The boundaries between the subalpine and boreal forests and between the grassland and boreal forest are relatively diffuse across these transition regions. In contrast, the boundary between the grassland and subalpine forest in the southwest of the province is relatively sharp. The conditions outlined above allow for the appraisal of MPS for spatial classification and mapping of taxon assemblages and portrayal of diffuse and discrete boundaries.

The U coefficients and V equation for the first MPS root are presented in Table 2. The highest covariance between the U and V terms was obtained when V had the form of a 3rd order polynomial. The amount of variance explained is 37%. The U surface equation has positive coefficients for typically grassland taxa and negative coefficients for forest types. The U surface coefficients and map (Table 2; Figs. 2a, b) indicate that MPS classified the pollen spectra into two groups representing grassland and forest. The sites in the grassland have positive values while sites from the forested regions have negative values. The zero contour on both the U and V surfaces corresponds with the actual limits of the grassland (Fig. 1a). The sharp boundary between the grassland and the subalpine forest in southwestern Alberta is well represented by relatively steep contours on the U surface map. The more diffuse boundary in the parkland zone to the north is portrayed by gentler contours. The V surface maps present a similar portrayal of grassland and forest distribution (Fig. 2b). However, the V surface polynomial does not provide as sensitive a portrayal of the grassland-boundary conditions as the U surface. The largest U-V residuals occur along the subalpine-grassland boundary region and in the central boreal forest (Fig. 2c). The high residuals in the central boreal region may reflect the sparsity of data points there.
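The sign-based classification can be illustrated with a small worked calculation using equation 1, the first-root coefficients of Table 2 and three rows of Table 1. A strong assumption is made here: the coefficients are applied directly to the raw pollen percentages. The text does not state whether the percentages are centred or scaled before the coefficients are applied, so the magnitudes, and possibly the zero point, of these values need not match the mapped U values of Fig. 2a. The variable and function names are hypothetical.

```python
import numpy as np

# First-root U coefficients from Table 2, in the taxon order of Table 1:
# Picea, Pinus, Betula, Alnus, Salix, Artemisia, Cheno-Am., Compositae, Gramineae
a = np.array([-0.8207, -0.0114, -0.6569, -0.7069, -0.0915,
              0.7094, 0.8232, 0.7011, 0.5982])

# Pollen percentages copied from Table 1 (sites 1, 12 and 45).
sites = {
    1:  ("Grassland",      [3.0, 40.0, 0.3, 1.5, 2.0, 9.0, 25.0, 1.6, 13.0]),
    12: ("Subalpine for.", [10.0, 72.0, 0.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.5]),
    45: ("Boreal forest",  [47.0, 22.0, 17.0, 5.0, 2.0, 3.0, 0.3, 0.2, 0.5]),
}

def u_value(percentages):
    """Equation 1 applied to raw percentages (assumption: no centring or scaling)."""
    return float(np.dot(a, percentages))

u = {site: u_value(p) for site, (_, p) in sites.items()}
for site, (zone, _) in sites.items():
    group = "positive group" if u[site] > 0 else "negative group"
    print(f"site {site:2d} ({zone}): U = {u[site]:7.2f} -> {group}")
```

Under this assumption the grassland site falls in the positive group and the two forest sites in the negative group, in line with the pattern described for the first root. The near-zero Pinus coefficient means the high pine percentages contribute little to either value.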

Table 2. U surface coefficients and V surface equation from 1st MPS root.

Picea    -0.8207    Artemisia     0.7094
Pinus    -0.0114    Cheno-Am.     0.8232
Betula   -0.6569    Compositae    0.7011
Alnus    -0.7069    Gramineae     0.5982
Salix    -0.0915

$$V = 34.31X + 53.61Y - 21.69X^2 - 82.36XY - 73.84Y^2 + 9.72X^3 + 13.77X^2Y + 49.12XY^2 + 31.60Y^3$$

The 2nd to 9th roots were examined to determine if further meaningful classification of the samples could be obtained by producing surfaces from these roots. The second root accounted for 28% of the total variance. Subsequent roots each accounted for 10% or less of the variance. The U and V surface equations for the second root are provided in Table 3. The U surface equation has a large negative coefficient for Pinus and low to moderate positive coefficients for the other eight taxa. The U surface coefficients and map indicate that the 2nd root differentiates between subalpine forest and the boreal forest and grassland (Table 3; Figs. 3a, b). On the U and V surface maps high negative values are found in the subalpine region (Figs. 3a, b). However, the zero contour is located further to the east than the actual limits of the subalpine forest (Fig. 1a) on the U surface and is not present within the study area on the V surface. It is probable that this poor portrayal of the vegetation boundary on the U and V surfaces reflects a poor relation between pine pollen deposition and vegetation boundaries due to the long-distance dispersal characteristics of Pinus pollen in Alberta (MacDonald & Waters 1987). In addition, some of the variance associated with this boundary may have been incorporated in the first root and will be poorly represented here due to the orthogonal nature of the roots. The sharp boundary between the subalpine forest and grasslands is well portrayed by close contour intervals on the U surface. The gradual transition between the subalpine and boreal forest is portrayed by gentler contour intervals. The V surface map also presents relatively close contours along the subalpine forest - grassland boundary and widely spaced contours in the subalpine-boreal transition region. The U-V residual map (Fig. 3c) does not provide any indication of strong spatial autocorrelation of the residuals.

Table 3. U surface coefficients and V surface equation from 2nd MPS root.

Picea     0.2036    Artemisia     0.4340
Pinus    -1.0532    Cheno-Am.     0.2836
Betula    0.6391    Compositae    0.3021
Alnus     0.5239    Gramineae     0.3850
Salix     0.3947

$$V = 3.37X + 18.98Y + 29.36X^2 - 10.36XY - 17.05Y^2 - 22.99X^3 - 1.06X^2Y - 4.59XY^2 + 11.35Y^3$$

Comparison of MPS with mapped scores from principal components analysis

For purposes of comparison, R-mode principal components analysis of the correlation matrix followed by varimax rotation (Nie et al. 1975) was performed on the data set (Table 4). Since only the first two components had eigenvalues greater than one, only these two components were entered into the varimax rotation. The scores for the two components were mapped in the same manner as the MPS U surface (Figs. 4a, b).

Table 4. Summary results of the principal components analysis on the 56 sites.

Component   Eigenvalue   % Variation explained
1           3.60         40.1
2           2.71         30.1
3           0.76          8.5
4           0.60          6.7
5           0.50          5.5

Varimax rotated component loadings (leading two components)

             Component 1   Component 2
Picea        -0.71          0.42
Pinus        -0.39         -0.81
Betula       -0.33          0.84
Alnus        -0.39          0.79
Salix         0.33          0.69
Artemisia     0.82          0.002
Cheno-Am.     0.74         -0.17
Compositae    0.72          0.12
Gramineae     0.88          0.22
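The comparison procedure can be sketched in outline as follows. For a correlation matrix of nine taxa the eigenvalues sum to nine, so an eigenvalue of 2.71 corresponds to 2.71/9, or about 30.1%, of the variation, as in Table 4. The code is a generic implementation of PCA with a commonly used SVD-based varimax iteration, run on synthetic data; it is not the routine of Nie et al. (1975), and all names (`varimax`, `X`, `loadings`) are assumptions made for illustration.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a (variables x components) loading matrix."""
    p, k = loadings.shape
    R = np.eye(k)
    last = 0.0
    for _ in range(max_iter):
        L = loadings @ R
        # Gradient of the varimax criterion, solved via SVD.
        u, s, vt = np.linalg.svd(
            loadings.T @ (L ** 3 - L * (L ** 2).sum(axis=0) / p))
        R = u @ vt
        if s.sum() < last * (1 + tol):
            break
        last = s.sum()
    return loadings @ R

# Synthetic data: 56 "sites" by 9 "taxa" driven by two latent gradients.
rng = np.random.default_rng(0)
F = rng.normal(size=(56, 2))
W = rng.normal(size=(2, 9))
X = F @ W + 0.5 * rng.normal(size=(56, 9))

corr = np.corrcoef(X, rowvar=False)            # R-mode: correlations among taxa
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]              # largest eigenvalue first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

pct_explained = 100.0 * eigvals / eigvals.sum()
keep = eigvals > 1.0                           # retain eigenvalues greater than one
loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
rotated = varimax(loadings)
```

Rotation redistributes variance among the retained components but leaves each taxon's communality (its row sum of squared loadings) unchanged, which gives a simple check on the rotation.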

The variation explained by the first and second PCA components was only 2% to 3% greater than the variance explained by the first two MPS roots. The U surface coefficients and the PCA loadings have a similar pattern. However, there are differences in the signs of some of the minor taxa and the relative magnitude of the coefficients and loadings. Most notably, Pinus has a relatively low magnitude coefficient for the first MPS root compared with its loading for the first component. Due to long distance transport, Pinus pollen percentages are high throughout the province. Thus, Pinus provides little information for the spatial classification of the sites using the MPS first root. The result is a low magnitude coefficient for the taxon. Inspection of the PCA loadings and comparison of the mapped PCA scores with the Alberta vegetation indicates that positive scores on the first component represent grassland and parkland sites and negative scores represent sites in the forested portions of the province. Negative scores on the second component appear to represent subalpine forest sites. However, as was the case with MPS, the high amounts of Pinus pollen throughout the province serve to obscure the boundary between the subalpine forest and other vegetation types.

Fig. 4. a. PCA scores from the first component. b. PCA scores from the second component. Block diagrams provide simulated 3-d surfaces as viewed from the southeast.

Discussion

MPS offers a powerful, spatially explicit technique for the classification and mapping of vegetation or other types of taxon assemblages. The technique classifies the data under the implicit assumption that sites with similar assemblages are spatially correlated. The approach is most useful when the main objective is to classify the taxon assemblages hierarchically and spatially. The U surfaces provide an easy to interpret portrayal of the spatial distribution of two taxon assemblages. By virtue of their polynomial form, the V surfaces provide a more generalized representation of the distribution of the taxon assemblages. The V equation is a global model of the spatial distribution of the assemblages. However, spurious results may be obtained when V values are calculated for sites near the edge or beyond the boundary of the study area. As with traditional gradient analysis and ordination (Gauch 1982), and especially canonical correspondence analysis (e.g., ter Braak 1986, 1987), attempts could be made to correlate quantitatively the site values derived from the U and V equations with environmental gradients. If the relationship between environmental factors and latitude and longitude can be ascertained, the V equation can be viewed as a model of the relationship between environmental factors and taxon assemblage distributions. In addition, MPS could be used to directly analyse the distribution of taxon assemblages along environmental gradients if independent environmental variables were used in place of longitude and latitude.

MPS and the mapping of PCA scores provided similar results. However, PCA does not provide a predictive equation similar to the V surface equation. This equation allows the worker to classify sites in the study area that lack observations and to calculate residuals for sites where observations are available. The only way to address this deficiency using PCA is through the cumbersome process of fitting a trend surface to the component scores. In addition, the zero contour of the mapped PCA scores does not explicitly provide a boundary between vegetation types. Depending on the loadings, the boundary may fall to either side of the zero contour.
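As an illustration of how a predictive surface equation of this kind might be applied, the following minimal Python sketch evaluates a second-order polynomial at arbitrary coordinates, assigns a class from the sign of the predicted value, and computes a U - V residual for a site where an observed U score exists. The polynomial order, the coefficient values, the class labels and the use of the zero contour as the class boundary are all assumptions made for illustration; they are not the fitted Alberta surface.

```python
def v_surface(x, y, coef):
    """Evaluate a second-order polynomial surface at coordinates (x, y)."""
    b0, b1, b2, b3, b4, b5 = coef
    return b0 + b1 * x + b2 * y + b3 * x * x + b4 * x * y + b5 * y * y


def classify(v):
    """Assign a class from the sign of V (assumed zero-contour boundary)."""
    return "assemblage A" if v >= 0 else "assemblage B"


def residual(u, x, y, coef):
    """Observed U score minus the V value predicted at the same site."""
    return u - v_surface(x, y, coef)


# Hypothetical coefficients and coordinates, for illustration only.
coef = (1.0, 2.0, -1.0, 0.5, 0.0, 0.25)
v_unsampled = v_surface(2.0, 2.0, coef)        # site with no observation
label = classify(v_unsampled)
res_sampled = residual(5.0, 2.0, 2.0, coef)    # site with an observed U of 5.0
```

Sites far outside the sampled area should not be passed to such a function, since polynomial predictions near or beyond the edge of the data may be spurious.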

Several caveats should be borne in mind when using MPS for the classification and mapping of taxon assemblages. The MPS roots represent the linear combination of the random variables which produces the joint maxima of variance for the random variables and covariance between the random and geographic variables. Thus, the primary assumption of MPS-based classification is that the major assemblages are spatially correlated. The technique will not produce meaningful classifications or maps when the assemblages are scattered through the study area in a distribution that cannot be modelled by a polynomial. Under these circumstances it would be more appropriate to use classification techniques which are not spatially explicit. MPS classifies and maps assemblages by a dichotomous, divisive approach and assumes a hierarchical structure to the data. The resulting maps portray the distribution of only two assemblages even if there are more than two distinctive assemblage types in the study area. The classifications and maps will therefore contain potentially misleading inaccuracies. However, this problem is inherent in interpreting single vector scores from any ordination procedure. Examination of the residuals and the percentage of variance explained by different roots provides a means of checking the results of MPS. If the first root does not explain a high proportion of the variance, it is likely that the distribution of taxa in the study area cannot be classified into just two spatially correlated assemblages. In these circumstances it may be useful to examine the equations and maps from the second and subsequent roots.
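The check on the first root can be sketched in a few lines of Python. The sketch assumes that each root's share of the explained variance is its value divided by the sum of all roots, as is conventional for eigenvalues in PCA; an MPS program may report this quantity differently. The root values and the 0.5 cut-off are hypothetical and are not a published criterion.

```python
def explained_share(roots):
    """Return each root's share of the summed roots, largest first."""
    total = sum(roots)
    if total <= 0:
        raise ValueError("roots must sum to a positive value")
    return [r / total for r in sorted(roots, reverse=True)]


def first_root_adequate(roots, threshold=0.5):
    """True if the largest root reaches the (arbitrary) threshold share."""
    return explained_share(roots)[0] >= threshold


# Hypothetical roots, for illustration only.
roots = [3.0, 0.75, 0.25]
shares = explained_share(roots)
adequate = first_root_adequate(roots)
```

When the first share is low, the same list shows how much the second and subsequent roots contribute and therefore which additional maps are worth inspecting.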

The clear similarities between MPS and canonical correlation suggest that some of the limitations applicable to the latter are applicable to MPS. Canonical correlation is extremely sensitive to sampling error (Weiss 1972). It has been suggested that the ratio of the number of samples to the number of variables should ideally be at least 20:1 (Weiss 1972; Lindeman et al. 1980). In plant ecology it is often impossible to obtain this ratio of sites to taxa. Gauch & Wentworth (1976) suggested that the linearity requirements of canonical correlation are difficult to meet in ecological applications and often render the technique ineffective. However, in MPS one of the linear vectors is replaced by the polynomial equation for the geographic coordinates. The results presented here suggest that MPS may provide meaningful results despite the above concerns.

As the U equation has the form of a trend surface, MPS is prone to the same problems and limitations as traditional trend surface analysis (Davis 1973; Unwin 1975; Whitten 1975; Mather 1976; MacDonald & Waters 1987). The spatial dispersion of the data points may have a significant impact on the resulting surface. A fairly even distribution of data points over the entire map area is required. Uneven distribution of data points produces an indeterminable loss of degrees of freedom for the analysis. Clumped or linear distributions of sampling points may produce spurious contour patterns. A buffer zone of data points surrounding the map area should be included in the analysis to mitigate estimation errors by the polynomials at the edge of the data distribution. Finally, polynomials produce surfaces with smooth undulations, concave bowls and convex hills. Many natural spatial distributions simply do not have this type of shape. Examination of the U-V residual surface provides a rapid check of this last problem. High residuals may indicate regions where local vegetation conditions dominate over the regional trend or the form of the polynomial is unable to model the regional trend.
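One simple way to screen a sampling design for the uneven or clumped distributions described above is a quadrat count. The following Python sketch is an illustration only and is not part of the MPS procedure: it overlays a regular grid on the map area, counts the sample sites in each cell, and lists the cells that contain no sites. The grid size, the map bounds and the site coordinates are hypothetical, and all sites are assumed to lie within the stated bounds.

```python
from collections import Counter


def quadrat_counts(sites, bounds, nx, ny):
    """Count sample sites in each cell of an nx-by-ny grid over the map area."""
    xmin, ymin, xmax, ymax = bounds
    counts = Counter()
    for x, y in sites:
        # Convert each coordinate to a cell index; sites lying exactly on the
        # upper edge are placed in the last cell.
        i = min(int((x - xmin) / (xmax - xmin) * nx), nx - 1)
        j = min(int((y - ymin) / (ymax - ymin) * ny), ny - 1)
        counts[(i, j)] += 1
    return counts


def empty_cells(counts, nx, ny):
    """List the grid cells that contain no sample sites."""
    return [(i, j) for i in range(nx) for j in range(ny) if counts[(i, j)] == 0]


# Hypothetical site coordinates on a unit square, for illustration only.
sites = [(0.1, 0.1), (0.2, 0.3), (0.9, 0.9)]
counts = quadrat_counts(sites, (0.0, 0.0, 1.0, 1.0), 2, 2)
gaps = empty_cells(counts, 2, 2)
```

Many empty cells, or a few cells holding most of the sites, would suggest that the fitted surface should be interpreted with caution in the poorly sampled parts of the map.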

Acknowledgements

This research was supported by an NSERC Operating Grant to MacDonald. We wish to thank Dr P. J. Lee, Geological Survey of Canada, for providing a copy of the MPS program and for many useful discussions on the MPS technique. Drs K. Gajewski and P. J. Lee provided useful comments on an earlier draft of this paper. The suggestions of the three anonymous reviewers were greatly appreciated.

References

Birks, H. J. B. & Saarnisto, M. 1975. Isopollen maps and principal components analysis of Finnish pollen data for 4000, 6000 and 8000 years ago. Boreas 4: 77-96.

Birks, H. J. B., Webb III, T. & Berti, A. A. 1975. Numerical analysis of surface samples from central Canada: a comparison of methods. Rev. Palaeobot. Palynol. 20: 133-169.

Briggs, R. & Leonard, W. A. 1977. Empirical implications of advances in canonical theory. Can. Geog. 21: 133-147.

Clark, D. 1975. Understanding canonical correlation analysis. Concepts and techniques in modern geography, No. 3, Geo Abstracts, Norwich.

Davis, J. C. 1973. Statistics and data analysis in geology. Wiley, New York.

Gauch Jr, H. G. 1982. Multivariate analysis in community ecology. Cambridge University Press, Cambridge.

Gauch Jr, H. G. & Wentworth, T. R. 1976. Canonical correlation analysis as an ordination technique. Vegetatio 28: 17-22.

Hotelling, H. 1936. Relations between two sets of variates. Biometrika 28: 321-377.

Huntley, B. & Birks, H. J. B. 1983. An atlas of past and present pollen maps for Europe: 0-13000 years ago. Cambridge University Press, Cambridge.

Lee, P. J. 1969. Theory and application of canonical trend surface. J. Geol. 77: 303-318.

Lee, P. J. 1981. The most predictable surface mapping method in petroleum exploration. Bull. Can. Pet. Geol. 29: 224-240.

Lindeman, R. H., Merenda, P. F. & Gold, R. Z. 1980. Introduction to bivariate and multivariate analysis. Scott, Foresman & Company, Glenview, Ill.

MacDonald, G. M. & Ritchie, J. C. 1986. Modern pollen spectra from the western interior of Canada and the interpretation of late Quaternary vegetation development. New Phytol. 103: 245-268.

MacDonald, G. M. & Waters, N. M. 1987. An evaluation of automated mapping algorithms for the analysis of Quaternary pollen data. Rev. Palaeobot. Palynol. 51: 289-307.

Mather, P. M. 1976. Computational methods of multivariate analysis in physical geography. Wiley, New York.

Mucina, L. & Polacik, S. 1982. Principal components analysis and trend surface analysis of a small-scale pattern of a transition mire. Vegetatio 48: 165-173.

National Atlas of Canada, 1974. Government of Canada. Ottawa.

Nie, N. H., Hull, C. H., Jenkins, J. G., Steinbrenner, K. & Bent, D. H. 1975. Statistical package for the social sciences (2nd ed.). McGraw-Hill, New York.

Prentice, I. C. 1978. Modern pollen spectra from lake sediments in Finland and Finnmark, north Norway. Boreas 7: 131-153.

Prentice, I. C. 1986. Vegetation responses to past climatic variation. Vegetatio 67: 131-141.

Sampson, R. J. 1978. Surface II graphics system (revision one). Kans. Geol. Surv. Spatial Anal. Monogr. 1.

ter Braak, C. J. F. 1986. Canonical correspondence analysis: a new eigenvector technique for multivariate direct gradient analysis. Ecology 67: 1167-1179.

ter Braak, C. J. F. 1987. The analysis of vegetation-environment relationships by canonical correspondence analysis. Vegetatio 69: 69-77.

Unwin, D. 1975. An introduction to trend surface analysis. Concepts and techniques in modern geography, No. 3, Geo Abstracts, Norwich.

Webb III, T. 1974. Corresponding patterns of pollen and vegetation in lower Michigan: a comparison of quantitative data. Ecology 55: 17-28.

Webb III, T. & Clark, D. R. 1977. Calibrating micropaleontological data in climatic terms: a critical review. Ann. New York Acad. Sci. 288: 93-118.

Webb III, T. & McAndrews, J. H. 1976. Corresponding patterns of contemporary pollen and vegetation in central North America. Geol. Soc. Am. Mem. 145: 267-297.

Weiss, D. J. 1972. Canonical correlation analysis in counseling psychology. J. Couns. Psychol. 19: 241-252.

Whitten, E. H. T. 1975. The practical use of trend surface analysis in the geological sciences. In: Davis, J. C. & McCullagh, M. J. (eds), Display and analysis of spatial data. Wiley, New York.