The relationships of modern pollen spectra to vegetation and climate along a steppe-forest-tundra...

Barbora Pelánková, Petr Kuneš, Milan Chytrý, Vlasta Jankovská, Nikolai Ermakov and Helena Svobodová-Svitavská. The relationships of modern pollen spectra to vegetation and climate along a steppe–forest–tundra transition in southern Siberia, explored by decision trees. The Holocene 2008; 18; 1259. DOI: 10.1177/0959683608096600

The online version of this article can be found at: http://hol.sagepub.com/cgi/content/abstract/18/8/1259

Published by: http://www.sagepublications.com

Downloaded from http://hol.sagepub.com at Charles Univ/Univ Karlova v Praze on November 28, 2008


The relationships of modern pollen spectra to vegetation and climate along a steppe–forest–tundra transition in southern Siberia, explored by decision trees

Barbora Pelánková,1,2,* Petr Kuneš,3 Milan Chytrý,1 Vlasta Jankovská,2 Nikolai Ermakov4 and Helena Svobodová-Svitavská5

(1Department of Botany and Zoology, Masaryk University, Kotlářská 2, CZ-611 37 Brno, Czech Republic; 2Institute of Botany, Academy of Sciences of the Czech Republic, Poříčí 3B, CZ-603 00 Brno, Czech Republic; 3Department of Botany, Charles University, Benátská 2, CZ-128 01 Praha 2, Czech Republic; 4Central Siberian Botanical Garden, Russian Academy of Sciences, Zolotodolinskaya 101, Novosibirsk 630090, Russia; 5Institute of Botany, Academy of Sciences of the Czech Republic, CZ-25243 Průhonice, Czech Republic)

Received 25 August 2007; revised manuscript accepted 19 May 2008

Abstract: We studied the relationships between surface pollen spectra, vegetation and selected climate characteristics along a strong gradient of climate continentality across the Western Sayan Mountains, southern Siberia. Representation of 111 pollen taxa in 81 surface samples from steppe, forest and tundra was related to the vegetation composition at various distances from the sampling point and to mean annual precipitation and mean July and January temperatures. These relationships were assessed by decision tree models. The results show (1) which vegetation types are well distinguished by their pollen spectra; (2) which vegetation types are strongly similar in their pollen spectra and therefore their interpretation from fossil pollen spectra should be carefully considered; (3) a tight relationship between surface pollen spectra and selected climate characteristics, which suggests that the past climatic conditions can be reasonably predicted from pollen spectra; and (4) an important role of weak pollen producers for assignment of pollen spectra to vegetation types or particular values of temperature and precipitation. We found the decision trees suitable for analysis of pollen/vegetation relationship because they (1) formally and precisely assign the pollen spectra to vegetation/landscape types or climatic variables by means of easy-to-interpret graphs; (2) identify pollen taxa that are best indicators of a particular vegetation type, landscape or climate characteristics; and (3) utilize the pollen signal of both strong and weak pollen producers. We compare the decision tree models with ordination and cluster analysis and suggest further applications.

Key words: Classification and regression trees, landscape, pollen/vegetation relationship, surface pollen samples, vegetation type, Russia.

The Holocene 18,8 (2008) pp. 1259–1271

© 2008 SAGE Publications 10.1177/0959683608096600

*Author for correspondence (e-mail: [email protected])

Introduction

For as long as palynology has been used for reconstructions of past landscapes and climate, palynologists have been trying to assess the relationship between vegetation types and their pollen deposition. Understanding this relationship is crucial for reliable reconstructions of past landscapes from fossil pollen assemblages (von Post, 1916; Sugita, 1994). One of the difficulties is that pollen spectra of structurally distinct vegetation types (for example open forest and steppic grassland) can be similar, which is a strong reason for caution when interpreting fossil pollen assemblages. Therefore, for a reliable interpretation of past vegetation and environments, we need either a macrofossil record (Birks and Birks, 2000) or a thorough knowledge of the modern analogues and their pollen spectra.

The common approach to the calibration of the pollen/vegetation relationship relates pollen deposition to vegetation, land cover, land use and other environmental features, in order to search for common patterns. Frequently used statistical methods include correlation (Liu et al., 1999), regression (Webb et al., 1981; Bradshaw and Webb, 1985), cluster analysis (Hoyt, 2000; Stutz and Prieto, 2003) and ordination (eg, Gaillard et al., 1992, 1994; Odgaard and Rasmussen, 2000; Brayshay et al., 2000; Fontana, 2005). All of these methods provide useful insights into the pollen/vegetation relationship, but face the problem of zero and close-to-zero values in percentage pollen data. Weak pollen producers have low weight in analyses, despite their potentially strong indicative meaning. Difficulties arise especially when indicator species of studied vegetation types are weak pollen producers. Therefore the statistical tools for assigning pollen spectra to certain vegetation or landscape types should ideally work with the information carried by the entire pollen assemblage, both strong and weak pollen producers. At the same time, they should differentiate between signal and noise in the low percentage values of a pollen assemblage. They should also clearly indicate which pollen taxa at which pollen quantities are most responsible for the discrimination between different vegetation or landscape types. Particularly the last requirement is hardly met by the statistical methods used in palynology so far.

In this study, we investigate and visualize the relationship between surface pollen spectra and vegetation, landscape and environment using the exploratory data analysis technique of decision trees (Breiman et al., 1984). The main advantages of decision trees are: (1) applicability to complex and unbalanced data, with high-order interactions and non-linear relationships between variables (De’ath and Fabricius, 2000); (2) ability to detect a meaningful signal (if there is any) in the pollen distribution of both strong and weak pollen producers; (3) evaluation of the relative discriminatory capacity of the pollen taxa entered into analysis; (4) easy-to-interpret output, which illustrates under what criteria certain pollen samples were assigned to a certain vegetation type or environmental characteristic.

For our study we chose the Western Sayan Mountains and adjacent areas in southern Siberia. This region, together with the adjacent Altai Mountains, probably contains the closest modern analogue of landscapes and vegetation of Central Europe in the full glacial and Lateglacial period. Local climates of different parts of these mountains are analogous to the palaeoclimates of Central Europe in different periods of the Pleistocene or early Holocene (Frenzel et al., 1992). The flora of these mountains includes many species with Euro-Siberian distribution ranges and historical links to Central Europe (Meusel et al., 1965–1992). Three major biomes, which supposedly occurred widely in the Pleistocene landscapes of Central Europe (Lang, 1994; Willis et al., 2000; Jankovská et al., 2002), meet in the study area: taiga, steppe and tundra. These form mosaics depending on local topography, altitude and a sharp gradient of climate continentality, running from the northern windward slopes to the southern intermountain valleys (Polikarpov et al., 1986). Therefore the study of the modern pollen/vegetation/environment relationships in this landscape provides an opportunity to improve our understanding of the Pleistocene landscape history of Central Europe and to refine its interpretations based on fossil pollen data.

We address the following questions: (1) To what degree of precision is it possible to discern between studied vegetation or landscape types on the basis of surface pollen spectra? (2) Which pollen taxa contribute most significantly to the differences? (3) How well do the modern pollen spectra reflect the present climate characteristics in a dry and winter-cold continental area? (4) What are the advantages of decision trees in palynology with regard to other methods?

Study area

The study area is situated in southern Siberia (Russia) near the Mongolian border (Figure 1; 50°43′–53°33′N, 91°06′–93°28′E). It includes the Western Sayan Mountains and adjacent areas of the Minusinskaya Basin, Central Tuvinian Basin and the Tannu-Ola Range. The mountains are situated at 350–2860 m a.s.l., basins at 300–600 m (Minusinskaya) and 550–1100 m a.s.l. (Central Tuvinian).

At lower and middle altitudes of the Western Sayan, mean January temperature is −11 to −22°C, mean July temperature 16–19°C and annual precipitation 500–900 mm (Gidrometeoizdat 1966–1970). At the north-facing, windward slopes of the main ridge of the Western Sayan, annual precipitation is approximately 1600 mm. Minusinskaya Basin has a mean January temperature of −21°C, mean July temperature 20°C and annual precipitation 300–350 mm. The southern part of the Western Sayan, Central Tuvinian Basin and the Tannu-Ola Range are in the area of a rain shadow. Their climate is arid continental, with mean January temperatures −27 to −34°C, mean July temperature 16–18°C and annual precipitation below 400 mm.

Minusinskaya Basin is covered by a meadow steppe with Euro-Siberian species and patches of Betula pendula or Populus tremula woodlands or Caragana-Spiraea steppic scrub in slightly humid places. The central Tuvinian Basin is covered with dry steppe consisting mainly of Mongolian (central Asian) species, and patches of low-productive rocky steppe on rock outcrops (also with Mongolian species), as well as patches of Larix sibirica woodland or Caragana-Spiraea scrub at relatively humid sites. In transitional areas to more humid mountain ranges, steppe occurs on south-facing slopes and forest on north-facing slopes, forming a forest-steppe with Betula pendula and/or Pinus sylvestris in the north and Larix sibirica in the south of the study area (Chytrý et al., 2008). A forest zone occurs in humid areas at middle and higher altitudes. It includes hemiboreal forests at drier and summer-warm sites, and taiga at wetter, summer-cool sites. Hemiboreal forests include Betula pendula-Pinus sylvestris mesic forest in the north of the study area, Larix sibirica dry forest in the south and Pinus sylvestris dry forest on south-facing slopes in the north. Taiga includes Abies sibirica-Betula pendula wet forest on valley bottoms and footslopes in the north, Abies sibirica-Pinus sibirica mesic forest on slopes in the north and Pinus sibirica-Picea obovata continental forest in cool and dry places throughout the study area (Chytrý et al., 2008). Areas above the timberline support dwarf-shrub tundra with Betula rotundifolia (dwarf birch from the B. nana group), Vaccinium myrtillus and V. vitis-idaea, and tall-forb vegetation along the mountain streams.

Figure 1 Location of 81 sampling plots where the pollen samples were collected

There are scattered villages in the basins, where the steppe or forest-steppe is used for livestock grazing. However, mountain areas of the Western Sayan are almost without permanent settlements; they harbour primeval vegetation, although forest fires occur frequently and various stages of post-fire succession are common.

Materials and methods

Pollen samples and vegetation types

Vegetation of the study area was surveyed in 2003–2004 as a part of a broader ecological study of the southern Siberian mountains (Chytrý et al., 2007, 2008). Sampling units were 307 square plots of 100 m², from which complete lists of plant species with their covers were recorded. Plots were classified, based on their species composition, by the TWINSPAN program (Hill, 1979). Separate analyses of forest and treeless plots resulted in six vegetation types of the former (described in Chytrý et al., 2008) and eight types of the latter.

Within each of the 100 m² sampling plots we collected surface pollen samples as five subsamples per plot, subsequently merged into one. The area of each subsample was c. 10 cm × 10 cm. We collected either up to 3 cm of humus and topsoil (in dry steppe and xeric scrub) or polsters of ground-dwelling bryophytes (in forests, alpine tundra, alpine scrub and meadow steppe). In order to cover all main vegetation types, we refrained from restricting our samples only to places with moss polsters available (cf. Gaillard et al., 1994; Brayshay et al., 2000), even though sampling in two trapping media may result in a slight bias.

From the 307 surface pollen samples, we selected 83 for further analysis (Figure 1, Table 4, e-appendix available online). This selection was stratified so that each of the 14 vegetation types was represented by 5–11 pollen samples. However, pollen samples from subalpine tall-forb vegetation were excluded because of insufficient pollen content. Low pollen content was also found in Vaccinium myrtillus tundra and Betula rotundifolia-Vaccinium vitis-idaea tundra; therefore both tundra types were merged. Thus, for the statistical analysis we used 81 pollen samples from 12 vegetation types (6 forest and 6 treeless), each represented by 5–11 pollen samples (Table 1). The samples were dried at room temperature and prepared for analysis according to Faegri and Iversen (1989).

Pollen grains were identified with the help of a reference collection

and literature (Reille, 1995–1999; Beug, 2004). Altogether, we

identified 111 pollen taxa and counted a minimum of 500 grains

in 88% of samples. The minimal accepted pollen sum was 290

grains in one steppe sample. Pollen counts were converted into

percentages. Pollen spectra were deposited in the European Pollen

Database (www.europeanpollendatabase.net).

Table 2 lists the pollen taxa that are mentioned further in the

results together with the corresponding plant taxa recorded in the

sampling plots, and Table 5 (e-appendix, available online) shows

an approximate representation of the pollen taxa in the vegetation.

For the latter, all recorded plant species were assigned to pollen

taxa and their percentage covers averaged over all sampled plots

of each vegetation type.

Landscape types and climatic characteristics

To assess the correspondence between surface pollen samples and the broader landscape, we utilized the land-cover data prepared by expert interpretation of Landsat-7 satellite images with a resolution of 30 m. Their interpretation was assisted by the ERDAS IMAGINE software (http://gi.leica-geosystems.com/) and ground-truthed during the fieldwork. We defined 13 land-cover classes (eg, Larix forest, Pinus sibirica forest, xeric scrub, etc). The percentage area of each land-cover class within 300 and 5000 m around each of the 81 sites was calculated, using the ArcGIS 8.3 software (http://www.esri.com/). The land-cover data were further classified by cluster analysis, using Euclidean distance and Ward’s clustering method in the STATISTICA 8 software (http://www.statsoft.com/), separately for 300 m and 5000 m circles. Four clusters at the highest hierarchical level of each classification were interpreted as landscape types (Table 1).
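The buffer calculation described above can be illustrated with a minimal sketch. It assumes a classified raster stored as a list of rows with square 30 m cells and counts the cells whose centres fall inside the circle; the function name, the toy raster and the class labels are hypothetical, and this is not the ArcGIS procedure used in the study.

```python
from collections import Counter

def land_cover_percentages(grid, row, col, radius_m, cell_m=30.0):
    """Percentage of each land-cover class among cells whose centres lie
    within radius_m of the centre of cell (row, col)."""
    reach = int(radius_m // cell_m)  # number of cells the radius can span
    counts = Counter()
    for r in range(max(0, row - reach), min(len(grid), row + reach + 1)):
        for c in range(max(0, col - reach), min(len(grid[r]), col + reach + 1)):
            dist = cell_m * ((r - row) ** 2 + (c - col) ** 2) ** 0.5
            if dist <= radius_m:
                counts[grid[r][c]] += 1
    total = sum(counts.values())
    return {cls: 100.0 * n / total for cls, n in counts.items()}

# Toy 5 x 5 raster (hypothetical): left three columns forest, right two steppe.
toy = [["Larix forest"] * 3 + ["dry steppe"] * 2 for _ in range(5)]
shares = land_cover_percentages(toy, row=2, col=2, radius_m=30.0)
```

With a 30 m radius only the centre cell and its four edge neighbours are counted; the real analysis used 300 m and 5000 m radii, which would cover many more cells.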

Mean July and January temperatures and annual precipitation for sampling sites were obtained from a climatic model based on the interpolation of measured data from climatic stations combined with standard precipitation–altitude charts and adiabatic lapse rate estimation (Chytrý et al., 2007, 2008).

Data analysis by decision trees

We used general classification/regression tree (CART) models included in the STATISTICA 8 software (http://www.statsoft.com/) to analyse how well the composition of surface pollen samples reflects the surrounding vegetation type, landscape type and climate. Further methodical description follows Breiman et al. (1984), De’ath and Fabricius (2000) and STATISTICA 8 Electronic Manual (http://www.statsoft.com/). Decision trees represent a method of exploratory data analysis that relates several explanatory variables to a response variable, which can be either categorical (classification tree) or continuous (regression tree). A tree is grown by hierarchical splitting of the individual cases into two mutually exclusive groups, called nodes. The split is carried out according to a simple splitting rule based on a single explanatory variable. The splitting rule is set to maximize the homogeneity of the groups and minimize the within-group variation in the response variable at the same time. Nodes that do not split any further are called terminal nodes.
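The splitting rule can be illustrated with a minimal sketch for one explanatory variable and a categorical response. It assumes Gini impurity as the measure of node homogeneity, which the text does not specify; the function names and the data values are hypothetical and do not come from the study.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0 = perfectly homogeneous)."""
    n = len(labels)
    return 1.0 - sum((k / n) ** 2 for k in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, weighted_impurity) of the best rule
    'value <= threshold' for one explanatory variable."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold can separate identical values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        cost = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if cost < best[1]:
            best = ((lo + hi) / 2.0, cost)
    return best

# Hypothetical arboreal-pollen percentages and vegetation labels.
ap = [12.0, 25.0, 40.0, 62.0, 85.0, 91.0]
veg = ["steppe", "steppe", "steppe", "forest", "forest", "forest"]
threshold, impurity = best_split(ap, veg)
```

A full tree repeats this search over all explanatory variables at every node and then recurses into the two resulting groups.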

The percentage proportions of 111 pollen taxa and AP sum (summed over 26 pollen taxa) in surface pollen samples were used as explanatory variables in all models. Each of the 81 pollen samples was assigned to the vegetation type where it was collected, and to landscape type within 300 m and 5000 m around the sampling point. The vegetation and landscape types were categorical response variables, leading to classification trees, whereas the climate characteristics were continuous response variables, leading to regression trees. We ran the models with square-root and log-transformed pollen percentages, in order to reduce the differences in abundance of common and rare pollen taxa. As the results were unaffected by the transformation, we present the models based on untransformed data.
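A likely reason for this insensitivity is that single-variable splits depend only on the rank order of the explanatory values, which a monotonic transformation preserves. The sketch below illustrates this under the assumption of Gini-based splitting; the function names and data are hypothetical, and the log transform is taken as log(x + 1) purely for illustration.

```python
import math
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((k / n) ** 2 for k in Counter(labels).values())

def best_partition(values, labels):
    """Indices of the samples sent to the left branch by the best split."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    best_cost, best_left = float("inf"), None
    for cut in range(1, len(order)):
        if values[order[cut - 1]] == values[order[cut]]:
            continue  # identical values cannot be separated
        left, right = order[:cut], order[cut:]
        cost = (len(left) * gini([labels[i] for i in left])
                + len(right) * gini([labels[i] for i in right])) / len(order)
        if cost < best_cost:
            best_cost, best_left = cost, frozenset(left)
    return best_left

# Hypothetical Artemisia percentages and vegetation labels.
artemisia = [0.5, 2.0, 4.0, 18.0, 30.0, 45.0]
veg = ["forest", "forest", "forest", "steppe", "steppe", "steppe"]

raw = best_partition(artemisia, veg)
sqrt_t = best_partition([math.sqrt(v) for v in artemisia], veg)
log_t = best_partition([math.log(v + 1.0) for v in artemisia], veg)
```

The three partitions contain the same samples; only the numeric split value changes with the scale of the variable.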

The process of splitting should stop at an optimal balance between explained variation in the response variable and the complexity of the tree. As a stopping parameter we set the minimum number of cases per terminal node to five. To avoid overfitting of the tree, we used the V-fold cross-validation (Breiman et al., 1984), with V = 10. The value of V determines the number of random subsamples, as equal in size as possible, that are formed from the data set. Then the decision tree of specified size is computed V times. Each time one of the subsamples is left out and used as a test subsample for

Barbora Pelánková et al.: Modern pollen along a steppe–forest–tundra transition 1261

Table 1 Overview of vegetation and landscape types used in this study. The number of pollen samples analysed per each vegetation/landscape type is shown. Only land-cover classes covering >10% of the area are mentioned for each landscape type

Type | No. of pollen samples | Characteristic location

Vegetation types in the sample plots of 100 m²
Betula pendula-Pinus sylvestris mesic hemiboreal forest | 5 | Forest-steppe zone, N part (more oceanic)
Larix sibirica dry hemiboreal forest | 7 | Forest-steppe zone, S part (more continental)
Pinus sylvestris dry hemiboreal forest | 8 | Forest-steppe zone, dry slopes in N part
Abies sibirica-Betula pendula wet taiga forest | 11 | Forest zone, valley bottoms in N part
Abies sibirica-Pinus sibirica mesic taiga forest | 5 | Forest zone, slopes in N part
Pinus sibirica-Picea obovata continental taiga forest | 10 | Forest zone, cool and dry areas
Alpine tundra with Vaccinium myrtillus or Betula rotundifolia-Vaccinium vitis-idaea | 5 | Tundra zone above the timberline
Xeric scrub with Caragana spp. and Spiraea spp. | 5 | N-facing slopes in the forest-steppe zone
Species-rich meadow steppe (Festuco-Brometea) | 7 | Steppe and forest-steppe zone, N part
Dry Euro-Siberian steppe (Festuco-Brometea) | 7 | Steppe zone, N part
Dry Mongolian steppe (Cleistogenetea squarrosae) | 6 | Steppe and forest steppe zone, S part
Dry rocky Mongolian steppe (Cleistogenetea squarrosae) | 5 | Steppe and forest-steppe zone, S part

Landscape types within 300 m from the sampling point
Mosaic of Larix sibirica forest (46%), Pinus sibirica forest (22%) and Betula pendula forest (13%) | 18 | Forest and forest-steppe zone, mainly S part
Mosaic of xeric scrub (32%), dry steppe (30%) and Larix sibirica forest (22%) | 22 | Forest-steppe and steppe zone, S part
Mosaic of species-rich meadow steppe (31%), Pinus sylvestris forest (25%), Betula pendula forest (24%) and xeric scrub (16%) | 22 | Forest-steppe and steppe zone, N part
Mosaic of Abies sibirica forest (46%), Betula pendula forest (24%) and Pinus sibirica forest (10%) | 19 | Forest zone, N part

Landscape types within 5000 m from the sampling point
Mosaic of Larix sibirica forest (49%), Pinus sibirica forest (17%) and xeric scrub (12%) | 17 | Forest and forest-steppe zone, mainly S part
Mosaic of Larix sibirica forest (30%), dry steppe (28%), xeric scrub (25%) and alpine scrub (10%) | 20 | Forest-steppe and steppe zone, S part
Mosaic of species-rich meadow steppe (28%), Pinus sylvestris forest (27%), Betula pendula forest (23%) and xeric scrub (15%) | 20 | Forest-steppe and steppe zone, N part
Mosaic of Abies sibirica taiga (35%), Betula pendula forest (21%), Pinus sibirica forest (14%) and Larix sibirica forest (11%) | 24 | Forest zone, N part

Table 2 Pollen taxa mentioned in the results and related plant genera/species recorded in the sampling plots

Pollen taxon | Related genera/species recorded in sampling plots

Abies | Abies sibirica
Aconitum/Delphinium | Aconitum (6), Delphinium (3)
Allium | Allium (18)
Alnus viridis type | Alnus fruticosa
Artemisia | Artemisia (17)
Astragalus/Oxytropis | Astragalus (10), Oxytropis (9)
Betula nana type | Betula rotundifolia
Betula pendula | Betula pendula
Bupleurum | Bupleurum (6)
Caryophyllaceae (excl. Cerastium, Dianthus, Gypsophila, Silene, Stellaria) | Eremogone meyeri, Lychnis sibirica, Minuartia (2), Moehringia lateriflora, Sagina saginoides
Cerealia | –
Chenopodiaceae | Chenopodium album, Corispermum spp., Kochia prostrata, Krascheninnikovia ceratoides, Nanophyton grubovii, Salsola collina, Teloxys aristata
Compositae subfam. Asteroideae (excl. Carduus/Cirsium type) | Achillea millefolium s.l., Antennaria dioica, Arctogeron gramineum, Aster alpinus, Cacalia hastata, Doronicum altaicum, Erigeron acris s.l., Galatella (2), Heteropappus (2), Inula (2), Leibnitzia anandria, Leontopodium (2), Ligularia sibirica, Omalotheca norvegica, Ptarmica impatiens, Pyrethrum pulchrum, Rhinactinidia eremophila, Saussurea (8), Senecio nemorensis, Serratula (2), Solidago (2), Stemmacantha carthamoides, Tanacetum boreale, Tephroseris (2)
Compositae subfam. Cichorioideae | Cicerbita azurea, Crepis (3), Hieracium (7), Lactuca sibirica, Picris davurica, Scorzonera (3), Sonchus arvensis, Taraxacum spp., Tragopogon orientalis, Trommsdorfia maculata, Youngia (2)
Cyperaceae | Carex (22), Eriophorum brachyantherum, Kobresia myosuroides
Ephedra distachya type | Ephedra (2)
Filipendula | Filipendula ulmaria
Larix | Larix sibirica
Mentha type | Clinopodium vulgare, Origanum vulgare, Thymus serpyllum s.l.
Peucedanum | Peucedanum vaginatum
Picea | Picea obovata
Pinus cembra type | Pinus sibirica
Pinus sylvestris type | Pinus sylvestris
Plantago major/media | Plantago media
Poaceae | Achnatherum (2), Agropyron cristatum, Agrostis (2), Alopecurus pratensis, Anthoxanthum alpinum, Brachypodium (2), Bromopsis spp., Calamagrostis (5), Cinna latifolia, Cleistogenes (2), Dactylis glomerata, Deschampsia cespitosa, Elymus (6), Elytrigia (4), Festuca (10), Helictotrichon (4), Hierochloe (2), Koeleria (2), Leymus dasystachys, Melica (2), Milium effusum, Phleum (2), Poa (3), Psathyrostachys juncea, Schizachne callosa, Setaria viridis, Stipa (7), Trisetum (2)
Potentilla type | Coluria geoides, Fragaria (2), Potentilla (16), Sibbaldia procumbens, Sibbaldianthe adpressa
Rosaceae (excl. Filipendula, Potentilla type, Spiraea, Sanguisorba officinalis) | Agrimonia pilosa, Alchemilla spp., Chamaerhodos (2), Cotoneaster (2), Dryas oxyodonta, Geum aleppicum, Prunus (2), Rosa (2), Rubus (3), Sorbus sibirica
Rubiaceae | Cruciata krylovii, Galium (6)
Salix | Salix (11)
Sanguisorba officinalis | Sanguisorba officinalis
Spiraea | Spiraea (5)
Thalictrum | Thalictrum (4)
Trollius | Trollius asiaticus
Urtica | Urtica dioica
Vacciniaceae/Ericaceae (incl. Pyrolaceae, Empetraceae) | Arctous erythrocarpa, Empetrum nigrum, Ledum palustre, Pyrola rotundifolia, Rhododendron (3), Vaccinium (3)

When there are more than one species in a genus, the number next to genus name indicates how many species in that genus were recorded. Pinus sylvestris type was always considered as Pinus sylvestris, since Pinus mugo does not occur in the region. Pinus cembra type and Pinus sylvestris type are denoted as Pinus sibirica and Pinus sylvestris, respectively, in the text and figures. Plant names follow Cherepanov (1995).

cross-validation. Classification error rate is computed on the test subsamples. This error rate usually decreases with the increasing tree size until it reaches a minimum at some specific size beyond which it remains constant or increases. The error rate is averaged over all V subsamples, giving the cross validation cost (CV cost). In order to choose the ‘right-sized’ tree, the standard error rule (SE rule) is applied. The value of the SE rule is user-set and can be ≥0. Then the ‘right-sized’ tree is the one with fewest terminal nodes possible and with the CV cost ≤ (minimal CV cost + SE rule × SE of the CV cost). If SE rule = 0, the ‘right-sized’ tree has the same number of terminal nodes as the tree with minimal CV cost. If SE rule far exceeds zero, then the ‘right-sized’ tree is much simpler, with fewer terminal nodes. We set SE rule = 0 for all decision trees in this study, because higher SE rule-values lead to very simple trees.
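The fold assignment and the SE rule described above can be sketched as follows. This is a minimal illustration, assuming that the CV cost and its standard error have already been computed for each candidate tree size; the function names and the numbers are hypothetical and do not reproduce the STATISTICA implementation.

```python
import random

def assign_folds(n_samples, v=10, seed=0):
    """Randomly assign each sample to one of v folds of near-equal size."""
    folds = [i % v for i in range(n_samples)]
    random.Random(seed).shuffle(folds)
    return folds

def right_sized_tree(cv_results, se_rule=0.0):
    """Pick the smallest tree whose CV cost is within se_rule standard
    errors of the minimum CV cost.
    cv_results: list of (n_terminal_nodes, cv_cost, cv_se) tuples."""
    best = min(cv_results, key=lambda t: t[1])
    limit = best[1] + se_rule * best[2]
    eligible = [t for t in cv_results if t[1] <= limit]
    return min(eligible, key=lambda t: t[0])

# Hypothetical cross-validation results (not from the study).
results = [(2, 0.70, 0.05), (4, 0.55, 0.05), (7, 0.50, 0.06), (10, 0.52, 0.06)]
folds = assign_folds(81, v=10)
chosen_se0 = right_sized_tree(results, se_rule=0.0)
chosen_se1 = right_sized_tree(results, se_rule=1.0)
```

With these illustrative numbers, SE rule = 0 keeps the seven-node tree with the minimal CV cost, whereas SE rule = 1 accepts the simpler four-node tree, which matches the behaviour described in the text.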

After choosing the ‘right-sized’ trees, we calculated the variation in the response variable explained by each tree from the sum of squares in the entire data set relative to the residual sum of squares.
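For a continuous response, this calculation corresponds to the familiar ratio 1 − SS_res / SS_tot. The sketch below assumes that a regression tree predicts the mean of its terminal node for every sample; how the software treats categorical responses is not specified in the text, and the function name and values are hypothetical.

```python
def explained_variation(observed, fitted):
    """Share of the total sum of squares removed by the model:
    1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_tot = sum((y - mean) ** 2 for y in observed)
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    return 1.0 - ss_res / ss_tot

# Hypothetical July temperatures (deg C) and the node means assigned to them.
observed = [12.0, 14.0, 18.0, 20.0]
fitted = [13.0, 13.0, 19.0, 19.0]
share = explained_variation(observed, fitted)
```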

The predictive accuracy, ie, the ability of decision trees to classify new samples correctly (Breiman et al., 1984), could not be tested on a separate test data set because there were a small number of pollen samples. Therefore we tested accuracy of models by deployment, ie, on the same data as they were built on. Accuracy was computed as a rate of correctly classified samples in the deployment, relative to their original classification during the tree building process.
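Accuracy by deployment amounts to a simple proportion of matching labels, as in the minimal sketch below. The function name and labels are hypothetical; because the model is evaluated on the data it was built on, such an estimate generally tends to be optimistic for new samples.

```python
def deployment_accuracy(observed, predicted):
    """Fraction of samples whose predicted class equals the observed class."""
    hits = sum(1 for o, p in zip(observed, predicted) if o == p)
    return hits / len(observed)

# Hypothetical labels, for illustration only.
observed = ["steppe", "steppe", "taiga", "taiga", "tundra"]
predicted = ["steppe", "taiga", "taiga", "taiga", "taiga"]
accuracy = deployment_accuracy(observed, predicted)
```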

At every split, the main splitting variable (main splitter) and its splitting value responsible for classification of samples into nodes are identified, such as to minimize the variation of the response variable within each node. At every split, the program also identifies surrogate variables (surrogates) which allocate most cases similarly as the main splitter. Each surrogate is assigned a degree of association ranging from 0 to 1, which is proportional to its ability to classify samples in the same way as the main splitter. Surrogates with the association value > 0.6 were reported and interpreted in this study.
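The text does not give the formula for the degree of association. A minimal sketch, assuming the predictive measure of association of Breiman et al. (1984), compares the disagreement between the two splits with the error of the trivial rule that sends every sample to the larger branch. The function name and data are hypothetical, and the exact computation in STATISTICA may differ.

```python
def association(main_left, surrogate_left):
    """Predictive measure of association between two binary splits.
    Each argument is a list of booleans: True if the sample goes left."""
    n = len(main_left)
    p_left = sum(main_left) / n
    # Error of always guessing the larger branch of the main split.
    baseline_error = min(p_left, 1.0 - p_left)
    if baseline_error == 0.0:
        raise ValueError("main split sends all samples to one branch")
    disagreement = sum(m != s for m, s in zip(main_left, surrogate_left)) / n
    return (baseline_error - disagreement) / baseline_error

# Hypothetical allocation of ten samples by a main splitter and a surrogate.
main = [True] * 4 + [False] * 6
surrogate = [True] * 3 + [False] * 7
value = association(main, surrogate)
```

Here the surrogate misplaces one of ten samples, giving an association of about 0.75, which would pass the 0.6 reporting threshold used in the study. Under this definition a poor surrogate can score below zero; such candidates would simply not be reported.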

Relative contribution of each explanatory variable to the prediction of response variable in a tree model is quantified as the predictor’s importance. Its value ranges from 0 to 1 and stems both from the variable’s role as a main splitter and as a surrogate across all nodes of the tree. The importance of each explanatory variable is calculated as a sum of the drop in node impurity over all nodes in the tree when this particular explanatory variable is used in the tree. Node impurity corresponds to the variation of the response variable in a particular node. The sums are then expressed relatively to the largest sum found in all explanatory variables. Explanatory variables with importance > 0.55 are mentioned in the results.
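The scaling step can be sketched as follows, assuming the impurity drops attributable to each variable (as main splitter or surrogate) are already available per node. The function name, variable names and numbers are hypothetical.

```python
def relative_importance(impurity_drops):
    """Sum impurity drops per variable over all nodes, then scale the sums
    by the largest one so that the top variable has importance 1.
    impurity_drops: list of (variable_name, drop) pairs."""
    totals = {}
    for name, drop in impurity_drops:
        totals[name] = totals.get(name, 0.0) + drop
    largest = max(totals.values())
    return {name: total / largest for name, total in totals.items()}

# Hypothetical impurity drops collected over the nodes of a tree.
drops = [("AP", 0.30), ("Artemisia", 0.24), ("AP", 0.10),
         ("Larix", 0.12), ("Artemisia", 0.06)]
importance = relative_importance(drops)
# Variables passing the reporting threshold of 0.55 used in the study.
reported = sorted(name for name, value in importance.items() if value > 0.55)
```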

Ordination of pollen samples

To illustrate the relationship between decision trees and ordination, pollen samples with log-transformed pollen percentages were subject to detrended correspondence analysis (DCA), using the CANOCO 4.5 program (ter Braak and Šmilauer, 2002). Assignments of pollen samples to vegetation types and to the terminal nodes of the classification tree with vegetation types were passively projected on the ordination diagram.

Results

Local vegetation types at sampling sites

The classification tree for vegetation types had seven terminal nodes (Figure 2). The first division separated samples from dry steppes and xeric scrub (terminal nodes I and II) on one side and samples from meadow steppe, all types of hemiboreal forests and taiga on the other side. Further, samples from meadow steppe and Larix dry hemiboreal forests were separated (terminal nodes III and IV), then samples from Pinus sibirica-Picea obovata continental taiga (terminal node V), and samples from Pinus sylvestris dry hemiboreal forest (VI) and Abies-Betula wet taiga (VII). The tree explained 50% of the variation in the vegetation types by the

Figure 2 Classification tree showing the relationship between the composition of surface pollen spectra (explanatory variable) and the vegetation type (response variable) at the studied sites. At each split, the main splitting variable (boldfaced) with its split value (pollen percentages) and surrogate splitting variables with associated value of > 0.6 are shown above the tree branches. Note that only vegetation types represented by at least two pollen samples are indicated at each terminal node, and the type represented by most pollen samples in each terminal node is boldfaced. For detailed description of vegetation types see Table 1. The inset tree scheme in the upper right corner illustrates the sequence of splitting. Abies, Abies sibirica; AP, arboreal pollen sum; Arte, Artemisia; Betula, Betula pendula; Cheno, Chenopodiaceae; Cich, Compositae subfam. Cichorioideae; Cyper, Cyperaceae; Larix, Larix sibirica; Picea, Picea obovata; Pin sib, Pinus sibirica; Pin syl, Pinus sylvestris; Pla m/m, Plantago major/media; Poac, Poaceae; V/E, Vacciniaceae/Ericaceae; E-S steppe, Euro-Siberian steppe; Mong. steppe, Mongolian steppe

[Tree diagram not reproduced. Node sizes: root, 81 samples; internal nodes, 24, 57, 16, 41 and 23 samples. Terminal-node contents: 8 samples (6 dry E-S steppe); 16 samples (6 dry Mong. steppe, 4 dry rocky Mong. steppe, 4 xeric scrub); 10 samples (6 meadow steppe, 2 Pin syl forest); 6 samples (5 Larix forest); 18 samples (10 Pin sib-Picea taiga, 4 alpine tundra, 2 Abies-Betula taiga); 10 samples (5 Pin syl forest, 2 Betula-Pin syl forest); 13 samples (9 Abies-Betula taiga, 3 Abies-Pin sib taiga). Split labels: AP > 50%, Arte < 17%, Poac < 6%, Cheno < 1.3%; Cich > 0.14%, Betula > 16%; Larix > 4%, Pla m/m < 0.06%; AP > 80%, Arte < 7%; Pin syl > 6%, Pin sib < 46%; Abies > 2%, Cyper < 0.5%, Pin syl < 16%; V/E < 0.06%.]

composition of surface pollen spectra, and had predictive

accuracy of 57%.

DCA ordination, used for the comparison with the classification tree, explained 9.9% and 5.9% of variation on the first two ordination axes (Figure 3a). The ordination diagram indicated a considerable overlap among pollen samples from different vegetation types, even between some forest and treeless types. When the assignment of samples to the terminal nodes of the tree was plotted on the same ordination diagram (Figure 3b), there was less overlap, indicating that groups of pollen samples at the terminal nodes of the tree are rather coherent in their pollen spectra, although some of them contain pollen samples from different vegetation types.

Landscape types around the sampling sites

The classification tree assigning pollen samples to landscape types within 300 m from the sampling point (Figure 4a) had five terminal nodes. The left branch resulted in three terminal nodes, representing a mosaic of Abies, Betula and Pinus sibirica forests (terminal node I), a mosaic of meadow steppe, Pinus sylvestris and Betula forests and xeric scrub (II), and samples surrounded by a mosaic of Larix forest, dry steppe and xeric scrub (III). The right branch produced two terminal nodes, corresponding to a mosaic of Larix forest, dry steppe and xeric scrub (IV), and of Larix, Pinus sibirica and Betula forest (V). This tree explained 76% of the variation in the landscape types and had a predictive accuracy of 83%.

[Figure 3: two DCA ordination diagrams. (a) DCA of pollen samples with passively projected vegetation types; (b) DCA of pollen samples with passively projected terminal nodes of the classification tree (terminal nodes I to VII). Vegetation types in the legend of (a): Betula pendula-Pinus sylvestris mesic hemiboreal forest; Larix sibirica dry hemiboreal forest; Pinus sylvestris dry hemiboreal forest; Abies sibirica-Betula pendula wet taiga; Abies sibirica-Pinus sibirica mesic taiga; Pinus sibirica-Picea obovata continental taiga; dry rocky Mongolian steppe; dry Mongolian steppe; dry Euro-Siberian steppe; species-rich meadow steppe; alpine tundra with Vaccinium myrtillus or Betula rotundifolia; xeric scrub with Caragana spp. and Spiraea spp. Pollen taxa representing the main or surrogate splitting variables in the classification tree of vegetation types are plotted in (b): Abies, Pinus sylvestris, Betula pendula, AP, Vacciniaceae/Ericaceae, Pinus sibirica, Chenopodiaceae, Artemisia, Cyperaceae, Poaceae, Larix, Plantago major/media, Cichorioideae.]

Figure 3 Ordination diagrams (DCA) of pollen samples with symbols indicating (a) assignments of pollen samples to vegetation types, (b) assignments of pollen samples to the terminal nodes of the classification tree of vegetation types (cf. Figure 2), with pollen taxa that acted as the main splitting variables in the tree model

The corresponding tree for areas 5000 m around the sampling point also had five terminal nodes (Figure 4b). Samples from the mosaic of Abies taiga, Betula, Pinus sibirica and Larix forest prevailed in terminal nodes I and V. Other terminal nodes included mostly samples from the mosaic of Larix forest, dry steppe, xeric and alpine scrub (II), samples surrounded by Larix and Pinus sibirica forest and xeric scrub (III) and finally samples surrounded by meadow steppe, Pinus sylvestris and Betula forest in admixture with xeric scrub (IV). The model explained 70% of variation in landscape types and predicted with accuracy of 79%.

Climate characteristics

Regression tree models for mean annual precipitation and mean July and January temperature (Figure 5) explained in turn 86%, 69% and 57% of variation in these variables using pollen data. Samples assigned to each terminal node were homogeneous in the values of climatic variables and clearly corresponded to particular vegetation types.
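The percentages of explained variation quoted above can be illustrated with a small calculation. The sketch below assumes one common definition for a regression tree (one minus the within-node sum of squares divided by the total sum of squares); the text does not state which formula was used, and the precipitation values are invented for illustration only.

```python
from statistics import mean


def explained_variation(groups):
    """Share of the total sum of squares removed by grouping values into terminal nodes.

    Assumed definition: 1 - (within-node SS / total SS). This is not confirmed by the paper.
    """
    values = [v for g in groups for v in g]
    grand_mean = mean(values)
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    within_ss = sum(sum((v - mean(g)) ** 2 for v in g) for g in groups)
    return 1.0 - within_ss / total_ss


# Hypothetical annual precipitation (mm) for samples in three terminal nodes.
nodes = [[210, 240, 260], [640, 700, 660], [1350, 1400, 1450]]
r2 = explained_variation(nodes)
print(f"explained variation: {r2:.1%}")
```

Under this definition, a tree whose terminal nodes are homogeneous in the climatic variable leaves little within-node variation and therefore scores close to 100%.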

[Figure 4: two classification tree diagrams, each with 81 samples at the root. (a) Landscape types within 300 m around the sampling point. Pollen thresholds labelled on the branches: Cheno < 1.5%, Betula > 14%; Pin sib < 23%, AP < 42%, Arte > 16%, Tha > 1%; Eph dis > 1%, AP < 40%; AP > 87%, Arte < 5%, Poac < 2%, Abies > 1%; Larix < 0.9%. Terminal-node contents: 23 samples (18 mosaic of Abies & Betula & Pin sib forest, 3 mosaic of Larix, Pin sib & Betula forest, 2 mosaic of meadow steppe, Pin syl & Betula forest & xeric scrub); 17 samples (13 mosaic of Larix, Pin sib & Betula forest, 4 mosaic of Larix forest, dry steppe & xeric scrub); 20 samples (18 mosaic of meadow steppe, Pin syl & Betula forest & xeric scrub); 8 samples (7 mosaic of xeric scrub, dry steppe & Larix forest); 13 samples (11 mosaic of xeric scrub, dry steppe & Larix forest). (b) Landscape types within 5000 m around the sampling point. Pollen thresholds labelled on the branches: Betula < 16%, Larix > 0.3%; Larix < 4%; Picea < 4%; AP > 90%, Arte < 5%, Aln vir > 1%. Terminal-node contents: 7 samples (4 mosaic of Abies, Betula, Pin sib & Larix forest, 2 mosaic of Larix forest, dry steppe, xeric & alpine scrub); 24 samples (19 mosaic of Abies, Betula, Pin sib & Larix forest, 5 mosaic of Larix & Pin sib forest & xeric scrub); 20 samples (15 mosaic of Larix forest, dry steppe, xeric scrub & alpine scrub, 4 mosaic of Larix & Pin sib forest & xeric scrub); 8 samples (7 mosaic of Larix & Pin sib forest & xeric scrub); 22 samples (19 mosaic of meadow steppe, Pin syl & Betula forest & xeric scrub, 2 mosaic of Larix forest, dry steppe, xeric & alpine scrub).]

Figure 4 Classification trees showing the relationship between the surface pollen spectra and the landscape types within (a) 300 m and (b) 5000 m around the sampling site. See Figure 2 for detailed explanation. The landscape types are defined as mosaics of certain vegetation types (see Table 1 for details). Only landscape types represented by at least two pollen samples are indicated at each terminal node, and the type represented by most pollen samples in each terminal node is boldfaced. Abies, Abies sibirica; Aln vir, Alnus viridis type; AP, arboreal pollen sum; Arte, Artemisia; Betula, Betula pendula; Cheno, Chenopodiaceae; Eph dis, Ephedra distachya type; Larix, Larix sibirica; Picea, Picea obovata; Pin sib, Pinus sibirica; Pin syl, Pinus sylvestris; Pla m/m, Plantago major/media; Poac, Poaceae; Tha, Thalictrum

[Figure 5: three regression tree diagrams, each with 81 samples at the root. (a) Annual precipitation (mm). Pollen thresholds labelled on the branches: Rosac > 0.2%, Aln vir > 6.3%; Arte < 10%, AP > 74%, Poac < 5%; Larix < 1%, Betula > 9%; Pin sib > 33%, V/E > 0.11%; Betula < 25%; Larix < 0.65%, Cheno < 1.5%. Terminal nodes (mean ± SD, number of samples, prevailing vegetation types): 234 ± 48, 16 samples, dry Mong. steppe, dry rocky Mong. steppe & xeric scrub; 333 ± 115, 9 samples, Larix forest & dry E-S steppe; 668 ± 150, 11 samples, meadow steppe & dry E-S steppe; 629 ± 205, 14 samples, Pin syl forest & Pin sib-Picea taiga; 819 ± 67, 14 samples, Betula-Pin syl forest & Abies-Betula taiga; 1123 ± 161, 10 samples, Pin sib-Picea taiga, Abies-Pin sib taiga & alpine tundra; 1396 ± 193, 7 samples, Abies-Pin sib taiga. (b) Mean July temperature (°C). Pollen thresholds labelled on the branches: Aln vir < 0.45%, AP < 89%; Arte > 4.4%; Pla m/m > 0.14%, Betula > 19%, Filip > 0.14%; Pin sib < 24%, Poac > 2.6%; AP < 75%, Arte > 5.4%; Arte > 32%; Mentha < 0.02%, Trollius < 0.06%. Terminal nodes: 14.1 ± 2.0, 12 samples, dry rocky steppe, dry Mong. steppe & xeric scrub; 15.1 ± 0.8, 13 samples, xeric scrub, dry E-S steppe & dry Mong. steppe; 15.3 ± 1.5, 10 samples, dry rocky steppe, xeric scrub & dry Mong. steppe; 17.8 ± 1.0, 5 samples, dry E-S steppe; 8.8 ± 0.9, 5 samples, alpine tundra & Pin sib-Picea taiga; 16.6 ± 0.8, 13 samples, Betula-Pin syl forest & meadow steppe; 12.6 ± 1.7, 23 samples, Pin sib-Picea taiga & Abies-Pin sib taiga. (c) Mean January temperature (°C). Pollen thresholds labelled on the branches: Pla m/m > 0.15%; Larix < 0.26%. Terminal nodes: −21.5 ± 4.7, 36 samples, Abies-Betula taiga, Abies-Pin sib taiga, Betula-Pin syl forest, meadow steppe, dry E-S steppe; −31.4 ± 3.2, 38 samples, Pin syl forest, Larix forest, dry E-S steppe, dry Mong. steppe, Pin sib-Picea taiga & xeric scrub; −22.4 ± 5.6, 7 samples, meadow steppe & Abies-Betula taiga.]

Figure 5 Regression trees showing the relationship between the composition of surface pollen spectra and climatic variables. At each node, mean value of particular response variable ± standard deviation is given. The main splitting variables with their split value (pollen percentages) are boldfaced, and surrogate variables with associated value of > 0.6 are given below them. Terminal nodes show the vegetation type(s) represented by most samples in each node. Abies, Abies sibirica; Aln vir, Alnus viridis type; AP, arboreal pollen sum; Arte, Artemisia; Betula, Betula pendula; Cheno, Chenopodiaceae; Cich, Compositae subfam. Cichorioideae; Filip, Filipendula; Larix, Larix sibirica; Mentha, Mentha type; Picea, Picea obovata; Pin sib, Pinus sibirica; Pin syl, Pinus sylvestris; Pla m/m, Plantago major/media; Poac, Poaceae; Trollius, Trollius asiaticus; Rosac, Rosaceae; V/E, Vacciniaceae/Ericaceae; E-S steppe, Euro-Siberian steppe; Mong. steppe, Mongolian steppe

Relative importance of pollen taxa

Table 3 shows the pollen taxa with importance value > 0.55 for prediction of vegetation composition, landscape types and climate characteristics. Highly important predictors in all models were both strong pollen producers (eg, Artemisia, Poaceae, Betula pendula, Pinus sibirica and P. sylvestris) and relatively weak pollen producers or taxa with poorly dispersed pollen (eg, Compositae subfam. Cichorioideae, Larix, Trollius, Plantago major/media, Ephedra distachya type and Mentha type). The pollen taxa shown in Table 3 that do not appear in the tree graphs are mostly surrogates of low association value for any specific division. They may have contributed to several divisions, but could not distinctively predict any of them.

In the classification of vegetation and landscape types, AP sum usually separated forests and meadow steppe from dry scrub, dry steppe and rocky steppe; it also separated Larix forests from other forest types. High values of Artemisia, Poaceae and Chenopodiaceae indicated dry non-forest vegetation. High content of Artemisia also differentiated open hemiboreal forests from other forest types. In contrast, varying content of Pinus sibirica, P. sylvestris and Abies pollen separated samples from various types of hemiboreal forests and taiga. Higher content of Betula pendula pollen indicated mesic forest types in contrast to drier forest types. Picea appeared only once as the main splitter and separated landscape type with prevailing taiga from the mosaic of hemiboreal forests and steppe. Cyperaceae pollen increased towards hemiboreal forests, either with Pinus sylvestris or Larix sibirica, where various species of Carex, particularly C. pediformis s. lat., are common (Chytrý et al., 2008).

Of the poorly represented pollen taxa, higher values of Alnus viridis type indicated wet taiga, where Alnus fruticosa occurs in the shrub layer (Table 5, e-appendix available online). Cichorioideae pollen separated Euro-Siberian dry steppe from Mongolian types of dry steppe and xeric scrub. Higher values of Plantago major/media pollen indicated meadow steppe or meadow steppe/hemiboreal forest mosaic, since P. media often grows in meadow steppe (Table 5, e-appendix available online). Larix pollen separated Larix hemiboreal forests from meadow steppe as well as the mosaic of Larix forest-steppe from other more forested landscape types. Higher representation of Vacciniaceae/Ericaceae pollen differentiated between open hemiboreal forests and meadow steppe, owing to Vaccinium vitis-idaea occurring in open Larix forests. Similarly, higher values of Thalictrum and Ephedra distachya pollen type indicated landscape with Larix hemiboreal forest, dry steppe and xeric scrub.

In the regression tree models of climate characteristics, pollen taxa whose high values indicated low precipitation (Artemisia, Chenopodiaceae, Poaceae) or high mean July temperatures (Artemisia) were the same as indicators of dry non-forest vegetation. Larix pollen indicated drier conditions and lower January temperatures. In contrast to continental or very moist conditions, mesic conditions were usually indicated by high values of Betula pendula, Plantago major/media or Filipendula pollen. Higher values of Pinus sibirica and AP sum indicated more moisture and lower July temperatures. Samples from precipitation-richest and winter-cool sites were separated on the basis of Alnus viridis type pollen, originating from Alnus fruticosa. Rosaceae pollen also predicted high precipitation.

Samples associated with the lowest July temperatures were separated by Trollius pollen. These originated in mountain tundra or continental taiga with Pinus sibirica and Picea, where Trollius asiaticus occurs in wet places.

Table 3 Importance of individual pollen taxa as predictors in the classification (Figures 2 and 4) and regression (Figure 5) tree models. The most important predictor in each model is given an arbitrary importance value of 1.00. Only predictors with the importance value > 0.55 are shown

Prediction of vegetation types (pollen taxon, importance): Compositae subfam. Cichorioideae 1.00; Betula pendula 0.99; Abies sibirica 0.94; Larix sibirica 0.93; Plantago major/media 0.91; Pinus sylvestris 0.89; Cyperaceae 0.85; Pinus sibirica 0.85; Thalictrum 0.81; Artemisia 0.77; Chenopodiaceae 0.76; Compositae subfam. Asteroideae 0.76; Poaceae 0.76; Alnus viridis type 0.75; Salix 0.72; Potentilla type 0.71; Bupleurum 0.70; Vacciniaceae/Ericaceae 0.68; Rubiaceae 0.65; Picea obovata 0.65; Spiraea 0.65; Cerealia 0.64; Sanguisorba officinalis 0.60; Allium 0.59; Aconitum/Delphinium 0.58; AP sum 0.57; Caryophyllaceae 0.56; Astragalus/Oxytropis 0.55; Peucedanum 0.55

Prediction of landscape types, within 300 m: AP sum 1.00; Artemisia 0.88; Chenopodiaceae 0.86; Poaceae 0.80; Betula pendula 0.77; Picea obovata 0.75; Plantago major/media 0.74; Alnus viridis type 0.73; Pinus sibirica 0.64; Compositae subfam. Cichorioideae 0.60; Abies sibirica 0.60

Prediction of landscape types, within 5000 m: Larix sibirica 1.00; Picea obovata 0.80; AP sum 0.78; Alnus viridis type 0.78; Plantago major/media 0.75; Artemisia 0.72; Betula pendula 0.67; Pinus sibirica 0.63; Poaceae 0.58; Compositae subfam. Cichorioideae 0.56; Abies sibirica 0.56

Prediction of climate characteristics, mean annual precipitation: AP sum 1.00; Poaceae 0.98; Artemisia 0.93; Larix sibirica 0.90; Abies sibirica 0.85; Chenopodiaceae 0.81; Pinus sibirica 0.76; Alnus viridis type 0.63; Thalictrum 0.63; Betula pendula 0.62; Bupleurum 0.61; Cyperaceae 0.61; Rosaceae 0.59

Prediction of climate characteristics, mean July temperature: Alnus viridis type 1.00; Poaceae 0.87; Vacciniaceae/Ericaceae 0.84; Pinus sibirica 0.83; Urtica dioica 0.71; AP sum 0.67; Plantago major/media 0.65; Trollius 0.59; Compositae subfam. Cichorioideae 0.57; Artemisia 0.55; Betula pendula 0.55

Prediction of climate characteristics, mean January temperature: Larix sibirica 1.00; Betula pendula 0.98; Plantago major/media 0.96; Pinus sylvestris 0.63
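The scaling described in the caption of Table 3 (the best predictor set to 1.00, only values above 0.55 reported) can be sketched as follows. The raw scores are hypothetical numbers chosen for illustration; how the underlying importance scores were computed by the tree software is not described here and is not modelled.

```python
def relative_importance(raw, threshold=0.55):
    """Scale raw importance scores so the best predictor equals 1.00, then keep those above the threshold."""
    top = max(raw.values())
    scaled = {taxon: score / top for taxon, score in raw.items()}
    kept = {taxon: round(s, 2) for taxon, s in scaled.items() if s > threshold}
    # Report in descending order of importance, as in the table.
    return dict(sorted(kept.items(), key=lambda kv: kv[1], reverse=True))


# Hypothetical raw scores (arbitrary units); only their ratios matter.
raw_scores = {
    "Larix sibirica": 40.0,
    "Betula pendula": 39.2,
    "Plantago major/media": 38.4,
    "Pinus sylvestris": 25.2,
    "Salix": 10.0,
}
table = relative_importance(raw_scores)
for taxon, value in table.items():
    print(f"{taxon} {value:.2f}")
```

Because only ratios to the top predictor are kept, values are comparable within one model but not between models.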

Discussion

Pollen indication of vegetation and landscape types

The classification tree model for vegetation types recorded in 100 m² plots and distinguished on the basis of species composition (Figure 2) generally performed well, but in some cases it could not discriminate the samples from forest and non-forest vegetation. For example, the samples from meadow steppe in the forest-steppe zone and those from Pinus sylvestris dry hemiboreal forest fell into the same terminal node (Figure 2, node III). Similarly, pollen samples from Pinus sibirica-Picea obovata continental taiga were not distinguished from alpine tundra with Vaccinium spp. and Betula rotundifolia (Figure 2, node V). In the former case pollen spectra of meadow steppe with scattered patches of pine or birch and the spectra of open dry hemiboreal forest were quite similar, because the herbs typical of meadow steppe often occur in the open hemiboreal forests, and the pollen of dominant Pinus sylvestris and Betula pendula is well dispersed and abundant. In the latter case, pollen spectra of alpine tundra and continental taiga were similar because the dominant tundra species (Vaccinium spp., Betula rotundifolia) often occur in the taiga undergrowth. Moreover, because of valley breezes in the local systems of air circulation, Pinus sibirica and Picea pollen can be easily carried to the higher altitudes of alpine tundra.

In most other cases, forest and non-forest types were successfully separated, but the distinction of some types within the formation of steppe or forest was problematic. For example, some samples of the xeric scrub or dry rocky Mongolian steppe were misclassified as dry Mongolian steppe. This happened because the two types of Mongolian steppe differ mainly in species that cannot be distinguished in pollen, eg, grasses. As for xeric scrub, its dominant insect-pollinated shrubs Caragana and Spiraea have low pollen production, and the undergrowth is similar in species composition to dry Mongolian steppe.

In contrast to the above cases of misclassification, the classification tree distinguished the samples from dry Euro-Siberian steppe from all other steppe types with high precision. Pollen samples from this steppe had lower AP sums than those from meadow steppe, but higher content of Cichorioideae and Betula pendula pollen than dry Mongolian steppe. Another successfully distinguished vegetation type was Larix hemiboreal forest. The undergrowth in this forest is similar to species-rich meadow steppe; however, even the low pollen signal of Larix was sufficient to separate these two vegetation types. Pinus sibirica-Picea continental taiga, Pinus sylvestris dry hemiboreal forest and Abies-Betula wet taiga were also satisfactorily classified into separate groups, on the basis of Abies pollen and the changing ratio of Pinus sibirica/P. sylvestris pollen.

Consequently, there is probably no universal answer to how to distinguish between vegetation types using surface pollen spectra. It depends on many factors: species composition, physiognomy, patchiness, pollen productivity and dispersal ability, major wind directions, and the characteristics and distance of surrounding communities (Prentice, 1985; Odgaard, 1999; Brayshay et al., 2000; Bunting et al., 2004; Fontana, 2005). Our model for landscape types at 300 m distance around the sampling point explained most variation (76%) and had the best predictive accuracy (83%). The model for landscape types at 5000 m distance yielded similar results (70% explained variation, 79% accuracy). The model for vegetation types in the 100 m² plots gave only 50% explained variation and 57% accuracy. There may be two reasons for these results. First, splitting the pollen samples into four broader categories (landscape types) can be done with fewer misclassifications than splitting the same samples into 12 categories (vegetation types). Second, surface pollen deposition also reflects a broader landscape context besides the vegetation type at the sampling site. This indicates that the vegetation/pollen deposition relationship might be more reliably related to mosaics of different vegetation types than to narrow vegetation units such as those used in phytosociology.

Moreover, the characteristics of dominant species matter. If the dominant species is a poor pollen producer such as Larix, the presence of a small amount of its pollen will be a much better indicator of vegetation type than would be large amounts of Pinus pollen. Dominant species over-represented in pollen usually distort the community representation in its pollen signal, especially if there are no codominants (Brayshay et al., 2000). However, a changing ratio in the pollen representation of two over-represented dominants can act as a useful diagnostic criterion, such as Pinus sibirica and P. sylvestris in our case (Figure 2; see also Liu et al., 1999). The presence in the pollen record of herb species diagnostic for individual vegetation types is also strongly indicative, even if these are weak pollen producers with poor dispersal ability, provided that they are ecologically bound to a specific vegetation type.

Classification tree models indicate that, besides Larix, other pollen taxa with low representation in the studied pollen samples (eg, Cichorioideae, Ephedra distachya type, Alnus viridis type, Vacciniaceae/Ericaceae) are also important indicators of vegetation or landscape types. These taxa are diagnostic of particular vegetation types and occur abundantly in the studied vegetation (Table 5, e-appendix available online). If we omitted them from the analysis because of some arbitrarily chosen threshold value, we would risk losing a significant piece of information for correct assignment of pollen spectra to vegetation types. In order to extract the maximum information about the vegetation type from the pollen samples, we should include as many pollen taxa as possible and use both quantitative and qualitative criteria for assigning the pollen samples to the vegetation types.

Pollen indication of climate

As for the pollen/climate relationship, the relative composition of surface samples explained 57–86% of variation in annual precipitation and mean July and January temperatures. This indicates that the relationship between pollen spectra and climate characteristics is rather tight, and features of past climates can be reasonably predicted from the fossil pollen spectra. Regression trees (Figure 5a–c) highlighted the importance of two tree species, Larix sibirica and Pinus sibirica, as climate indicators. The former is a weak pollen producer, but its occurrence clearly indicated low winter temperatures and low precipitation, ie, a high degree of climatic continentality. The latter indicated low summer temperatures and higher precipitation. These results based on pollen spectra are in accordance with the models of actual distribution of these species in the study area (Chytrý et al., 2008). They also correspond to the conclusions of Tinner and Kaltenrieder (2005) about the responses of high-mountain vegetation to the early-Holocene environmental changes in the Swiss Alps. Unfortunately, these two species are often overlooked in the fossil pollen spectra. Larix might go unnoticed because of its low pollen abundance if small total numbers of pollen grains are counted, while Pinus sibirica (= P. cembra s. lat.) is often not distinguished from P. sylvestris. In such a way important information on the past climate can be lost.

Decision trees versus other analytical methods

To assess the relationship between vegetation and pollen spectra, palynologists have commonly used multivariate methods, such as ordination and cluster analysis. Like cluster analysis, decision trees also create a classification, but the fundamental difference is that the former is an unsupervised and the latter is a supervised method of classification (Ripley, 1996). Unsupervised classification methods search for patterns in the matrix of pollen taxa and samples, and produce classes based on the sample similarity, without considering any other information. In our case, cluster analysis of pollen samples would create classes that might not fit the externally distinguished vegetation types. In contrast, supervised classification methods match individual samples to a priori established classes (eg, pollen samples to vegetation types) and develop a classifier that is able to assign new samples to these classes. In classification trees, the classifier is based on the main splitting variables and their split values, eg, critical percentage values of particular pollen taxa. Any new pollen sample can be compared with the decision criteria at the nodes of the tree and assigned to one of the classes (terminal nodes). By contrast, unsupervised classification methods cannot assign new samples to the established classes.
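How a fitted tree assigns a new sample can be sketched in a few lines. The tree below is a hypothetical two-split example: its structure, thresholds and class labels are illustrative assumptions and do not reproduce the trees in Figures 2 and 4.

```python
def classify(sample, node):
    """Walk from the root to a terminal node, applying one threshold test per split."""
    while "label" not in node:
        # Taxa absent from a sample are treated as 0 %.
        value = sample.get(node["taxon"], 0.0)
        node = node["above"] if value > node["threshold"] else node["below"]
    return node["label"]


# Hypothetical tree: pollen percentages and labels are for illustration only.
tree = {
    "taxon": "AP", "threshold": 50.0,
    "above": {
        "taxon": "Larix", "threshold": 4.0,
        "above": {"label": "Larix forest"},
        "below": {"label": "other forest"},
    },
    "below": {"label": "steppe or scrub"},
}

new_sample = {"AP": 72.0, "Larix": 5.5, "Artemisia": 6.0}
print(classify(new_sample, tree))
```

Each split consults a single pollen taxon, so the rule set stays readable as a sequence of percentage thresholds.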

Ordination has limitations similar to those of unsupervised classification. Ordination scatter diagrams represent similarity among pollen samples, which is entirely based on the matrix of pollen taxa and samples and does not consider any external information, eg, on vegetation types and landscape types. This is illustrated in the DCA ordination diagram (Figure 3a), where individual vegetation types strongly overlap. Like unsupervised classification, ordination is appropriate for summarizing the internal structure of similarities among pollen samples, but it is of limited use for establishing links between pollen spectra and vegetation or landscape types. Assignments of pollen samples to terminal nodes of the classification tree passively projected on the ordination diagram show less overlap than the projected vegetation types (Figure 3b). This indicates that decision trees also strongly reflect the similarities among pollen samples; unlike ordination or cluster analysis, however, they additionally establish clear links between pollen spectra and external variables, such as vegetation types.

A further advantage of decision trees is that, while working with one explanatory variable at a time, they give equal weight to strong and weak pollen producers. If there is a taxon that can successfully split the samples into desired groups, the tree finds it, no matter whether its pollen is represented in tens or tenths of a percent. This is also reflected by the fact that all of our models gave the same results for both transformed and non-transformed pollen counts. Tree models make full use of the pollen signal from weak pollen producers, even in the presence of strong producers. This is in contrast with unsupervised classification or ordination, where strong pollen producers may override the signal carried by the weak pollen producers in spite of data transformation.
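The insensitivity to data transformation can be illustrated for a single split. The sketch below assumes Gini impurity as the splitting criterion and a square-root transformation, neither of which is stated in the text; the Larix percentages are invented. Because a split on one variable depends only on the rank order of its values, any monotone transformation yields the same partition of samples.

```python
from math import sqrt


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_partition(values, labels):
    """Return the sample indices on the low side of the split minimising weighted Gini impurity."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    best = None
    for k in range(1, len(order)):
        if values[order[k - 1]] == values[order[k]]:
            continue  # no threshold can separate tied values
        left, right = order[:k], order[k:]
        score = (len(left) * gini([labels[i] for i in left])
                 + len(right) * gini([labels[i] for i in right])) / len(order)
        if best is None or score < best[0]:
            best = (score, frozenset(left))
    return best[1]


# Hypothetical Larix percentages: low values throughout, yet fully diagnostic.
larix = [0.0, 0.1, 0.2, 0.3, 4.5, 5.0, 6.2, 8.0]
veg = ["steppe"] * 4 + ["Larix forest"] * 4

raw_split = best_partition(larix, veg)
sqrt_split = best_partition([sqrt(v) for v in larix], veg)
print(raw_split == sqrt_split)
```

The same argument does not hold for distance-based methods, where a transformation changes the relative contribution of each taxon to the sample similarities.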

Perspectives for decision trees in palynology

We found the decision tree models suitable for assessing the relationships of pollen spectra to vegetation or landscape types and climate. Their main advantages are: (1) formal, relatively precise and easy-to-interpret assignment of pollen spectra to vegetation or landscape types or particular values of climatic variables; (2) identification of pollen taxa and their threshold percentage values, which can distinguish a particular environment; (3) ability to find a meaningful signal in both strong and weak pollen producers.

A prospective use of decision trees might be to develop a tree classifier on the basis of the relationship between modern pollen spectra and modern vegetation, which would then be used to classify fossil pollen samples. However, such a classifier requires considerably more data on modern pollen deposition and a careful consideration of the size of the area interpreted from the fossil samples.

Acknowledgements

We thank Jacqueline van Leeuwen for her help with pollen determination, Brigitta Ammann for support, Denis Popov for climatic model, Ondrej Hájek for ArcGIS processing of the land-cover data, Jirí Danihelka, Michal Hájek, Petra Hájková, Martin Kocí, Svatava Kubešová, Pavel Lustyk, Zdenka Otýpková, Petr Pokorný, Jan Rolecek, Marcela Reznícková, Petr Šmarda and Milan Valachovic for sampling vegetation and collecting the surface pollen samples, and Eric Grimm and an anonymous referee for the helpful comments on the previous version of this paper. The research was supported by grants GAAVCR IAA6163303, GACR 524/05/H536, MSM0021622416, MSM0021620828 and AVO Z 60050516.

References

Beug, H.-J. 2004: Leitfaden der Pollenbestimmung für Mitteleuropaund angrenzende Gebiete. Verlag Dr. Friedrich Pfeil.Birks, H.H. and Birks, H.J.B. 2000: Future uses of pollen analysis

must include plant macrofossils. Journal of Biogeography 27, 31–35.Bradshaw, R.H.W. and Webb, T., III 1985: Relationships between

contemporary pollen and vegetation data from Wisconsin and

Michigan, USA. Ecology 66, 721–37.Brayshay, B.A., Gilbertson, D.D., Kent, M., Edwards, K.J.,

Wathern, P. andWeaver, R.E. 2000: Surface pollen–vegetation rela-

tionships on the Atlantic seaboard: South Uist, Scotland. Journal ofBiogeography 27, 359–78.Breiman, L., Friedman, J.H., Olshen, R.A. and Stone, C.G. 1984:

Classification and regression trees. Chapman and Hall.Bunting, M.J., Gaillard, M.-J., Sugita, S., Middleton, R. and

Broström, A. 2004: Vegetation structure and pollen source area. TheHolocene 14, 651–60.Cherepanov, S.K. 1995: Sosudistye rasteniya Rossii i sopredel’nykhgosudarstv (Vascular plants of Russia and adjacent countries). Mir isem’ya-95.

Chytrý, M., Danihelka, J., Ermakov, N., Hájek, M., Hájková, P.,

Kocí, M., Kubešová, S., Lustyk, P., Otýpková, Z., Popov, D.,Rolecek, J., Reznícková, M., Šmarda, P. and Valachovic, M. 2007:Plant species richness in continental southern Siberia: effects of pH

and climate in the context of the species pool hypothesis. GlobalEcology and Biogeography 16, 668–78.Chytrý, M., Danihelka, J., Kubešová, S., Lustyk, P., Ermakov, N.,

Hájek, M., Hájková, P., Kocí, M., Otýpková, Z., Rolecek, J.,Reznícková, M., Šmarda, P., Valachovic, M., Popov, D. and Pišút,I. 2008: Diversity of forest vegetation across a strong gradient of cli-

matic continentality: Western Sayan Mountains, southern Siberia.

Plant Ecology 196, 61–83.De’ath, G. and Fabricius, K.E. 2000: Classification and regression

trees: a powerful yet simple technique for ecological data analysis.

Ecology 81, 3178–92.Faegri, K. and Iversen, J. 1989: Textbook of pollen analysis. JohnWiley & Sons.

Fontana, S.L. 2005: Coastal dune vegetation and pollen representa-

tion in south Buenos Aires Province, Argentina. Journal ofBiogeography 32, 719–35.Frenzel, B., Pécsi, M. and Velichko, A.A., editors 1992: Atlas ofpaleoclimates and paleoenvironments of the Northern Hemisphere.Geographical Research Institute and Gustav Fischer Verlag.

Gaillard, M.-J., Birks, H.J.B., Emanuelsson, U. and Berglund,

B.E. 1992: Modern pollen/land-use relationships as an aid in the

reconstruction of past land-uses and cultural landscapes: an example

from south Sweden. Vegetation History and Archaeobotany 1, 3–17.

1270 The Holocene 18,8 (2008)

at Charles Univ/Univ Karlova v Praze on November 28, 2008 http://hol.sagepub.comDownloaded from

Gaillard, M.-J., Birks, H.J.B., Emanuelsson, U., Karlsson, S.,

Lagerås, P. and Olausson, D. 1994: Application of modern pollen/

land-use relationships to the interpretation of pollen diagrams – recon-

structions of land-use history in south Sweden, 3000–0 BP. Review ofPalaeobotany and Palynology 82, 47–73.Gidrometeoizdat 1966–1970: Spravochnik po klimatu SSSR

(Reference books on the climate of the USSR). Gidrometeoizdat.

Hill, M.O. 1979: TWINSPAN – a FORTRAN program for arranging multivariate data in an ordered two-way table by classification of the individuals and attributes. Cornell University Press.

Hoyt, C.A. 2000: Pollen signatures of the arid to humid grasslands of North America. Journal of Biogeography 27, 687–96.

Jankovská, V., Chromý, P. and Niznianská, M. 2002: Šafárka – first palaeobotanical data of the character of Last Glacial vegetation and landscape in the West Carpathians (Slovakia). Acta Palaeobotanica 42, 39–50.

Lang, G. 1994: Quartäre Vegetationsgeschichte Europas. Methoden und Ergebnisse. Gustav Fischer Verlag.

Liu, H., Cui, H., Pott, R. and Speier, M. 1999: The surface pollen of the woodland-steppe ecotone in southeastern Inner Mongolia, China. Review of Palaeobotany and Palynology 105, 237–50.

Meusel, H., Jäger, E.J., Weinert, E. and Rauschert, S. 1965–1992: Vergleichende Chorologie der zentraleuropäischen Flora I–III. Gustav Fischer Verlag.

Odgaard, B.V. 1999: Fossil pollen as a record of past biodiversity. Journal of Biogeography 26, 7–17.

Odgaard, B.V. and Rasmussen, P. 2000: Origin and temporal development of macro-scale vegetation patterns in the cultural landscape of Denmark. Journal of Ecology 88, 733–48.

Polikarpov, N.P., Chebakova, N.M. and Nazimova, D.I. 1986: Klimat i gornye lesa Sibiri (Climate and mountain forests of Siberia). Nauka.

Prentice, I.C. 1985: Pollen representation, source area, and basin size: toward a unified theory of pollen analysis. Quaternary Research 23, 76–86.

Reille, M. 1995–1999: Pollen et spores d’Europe et d’Afrique du nord. Laboratoire de Botanique Historique et Palynologie.

Ripley, B.D. 1996: Pattern recognition and neural networks. Cambridge University Press.

Stutz, S. and Prieto, A.R. 2003: Modern pollen and vegetation relationships in Mar Chiquita coastal lagoon area, southeastern Pampa grasslands, Argentina. Review of Palaeobotany and Palynology 123, 183–95.

Sugita, S. 1994: Pollen representation of vegetation in Quaternary sediments: theory and method in patchy vegetation. Journal of Ecology 82, 881–97.

ter Braak, C.J.F. and Šmilauer, P. 2002: CANOCO reference manual and CanoDraw for Windows user’s guide. Software for Canonical Community Ordination (version 4.5). Biometris.

Tinner, W. and Kaltenrieder, P. 2005: Rapid responses of high-mountain vegetation to early Holocene environmental changes in the Swiss Alps. Journal of Ecology 93, 936–47.

von Post, L. 1916: Om Skogsträdpollen i Sydsvenska Torfmosselagerföljder. Geologiska Foreningens i Stockholm Forhandlingar 38, 384–90.

Webb, T., III, Howe, S.E., Bradshaw, R.H.W. and Heide, K.M. 1981: Estimating plant abundances from pollen percentages: the use of regression analysis. Review of Palaeobotany and Palynology 34, 269–300.

Willis, K.J., Rudner, E. and Sümegi, P. 2000: The full-glacial forests of central and southeastern Europe. Quaternary Research 53, 203–13.
