Modeling a Poisson Forest in Variable Elevations: A Nonparametric Bayesian Approach


Juha Heikkinen
Finnish Forest Research Institute
Unioninkatu 40 A, FIN-00170 Helsinki, Finland
Juha.Heikkinen@metla.fi

and

Elja Arjas
Rolf Nevanlinna Institute
P.O. Box 4, FIN-00014 University of Helsinki, [email protected]

September 25, 1998

To appear in Biometrics.

Summary

A nonparametric Bayesian formulation is given to the problem of modeling non-homogeneous spatial point patterns influenced by concomitant variables. Only incomplete information on the concomitant variables is assumed, consisting of a relatively small number of point measurements. Residual variation, caused by other unmeasured influential factors, is modeled in terms of a spatially varying baseline intensity function. A Markov chain Monte Carlo scheme is proposed for the simultaneous nonparametric estimation of each unknown function in the model. The suggested method is illustrated by reanalysing a data set in Rathbun (1996, Biometrics 52, 226–242), and the estimated models are compared with those obtained by Rathbun.

Key words: Ecological response curves; Nonparametric Bayesian inference; Reversible jump MCMC; Spatial interpolation; Spatial point process.

1. Introduction

In plant ecology, it is a natural idea to relate the spatial variation in abundance of plants to the values of one or more locally measured concomitant variables, such as ground elevation or soil acidity. Information on such concomitant variables is in practice often incomplete, being restricted to a relatively small number of point measurements. Furthermore, other unmeasured influential factors are likely to be present, giving rise to greater variability in abundance than could be expected purely on grounds of the measured concomitant variables. Such residual variation is expressed naturally in terms of a spatially varying baseline intensity function. Both the dependence on measured concomitant variables and the baseline are unlikely to have a known functional form, however. Such considerations lead to a nonparametric model formulation, where one faces the additional task of estimating the true values of the considered concomitant variable(s). Here this problem is given a nonparametric Bayesian formulation, and a Markov chain Monte Carlo (MCMC) scheme is proposed for the numerical estimation.

Poisson forest in variable elevations 2

[Figure 1: map of tree locations (marked +) in the study region; horizontal axis 0 to 250, vertical axis 0 to 200.]

Figure 1. Locations of ironwood in the Titi Hammock study region; data kindly provided by Stephen Rathbun.

As an illustration of the method, we reanalyse a data set described in Rathbun (1996). It contains the mapped locations of trees of several species in a 250 × 200 m region of Titi Hammock, a beech-magnolia forest in southern Georgia, USA. The stand patterns show clear dependence on the ground elevation, of which measurements were taken on a regular square lattice of 11 × 9 sample points 25 m apart. Consider, for example, the locations of individual ironwood Carpinus caroliniana trees shown in Figure 1. Comparison with the elevations (Table 1) reveals that ironwood is clearly most abundant in the lowest elevations.

Table 1. Relative elevations (in meters) at sample points in Titi Hammock study area; published in Rathbun (1996). Columns are indexed by the X coordinate and rows by the Y coordinate of the sample point.

  Y \ X     0    25    50    75   100   125   150   175   200   225   250
    200   7.1   8.4   7.4   5.2   5.3   7.7   9.9  12.4  13.3  14.3  14.6
    175   6.8   6.9   4.6   5.4   7.3   8.6   9.8  10.5  11.4  11.3  11.4
    150   4.3   4.2   4.5   6.4   7.8   8.6   8.6   9.0   8.8   8.8   8.8
    125   3.5   5.1   4.9   6.3   7.7   7.5   7.1   7.2   6.6   6.5   6.5
    100   4.2   4.6   5.1   5.6   6.6   5.8   4.7   4.3   4.1   4.1   4.2
     75   3.3   4.2   4.6   4.6   4.6   4.1   3.1   2.8   2.6   2.2   2.1
     50   2.6   2.2   2.4   3.7   2.2   2.2   2.2   1.2   0.1   0.4  -0.4
     25   2.3   1.7   1.5   1.2   0.8   0.8   0.9   0.5   0.4   0.0  -0.4
      0   7.3   6.3   4.9   5.1   5.0   3.8   1.0   2.6   0.5   1.0   2.9
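For readers who wish to experiment with the elevation data, the following is a minimal Python sketch of how Table 1 could be held in memory as lattice points with coordinates. The names `ELEVATION`, `sample_points` and `lowest_points` are hypothetical and do not come from the authors' C code; the values are copied from Table 1.

```python
# Relative elevations (m) copied from Table 1.
# Rows correspond to Y = 200, 175, ..., 0; columns to X = 0, 25, ..., 250.
ELEVATION = [
    [7.1, 8.4, 7.4, 5.2, 5.3, 7.7, 9.9, 12.4, 13.3, 14.3, 14.6],
    [6.8, 6.9, 4.6, 5.4, 7.3, 8.6, 9.8, 10.5, 11.4, 11.3, 11.4],
    [4.3, 4.2, 4.5, 6.4, 7.8, 8.6, 8.6, 9.0, 8.8, 8.8, 8.8],
    [3.5, 5.1, 4.9, 6.3, 7.7, 7.5, 7.1, 7.2, 6.6, 6.5, 6.5],
    [4.2, 4.6, 5.1, 5.6, 6.6, 5.8, 4.7, 4.3, 4.1, 4.1, 4.2],
    [3.3, 4.2, 4.6, 4.6, 4.6, 4.1, 3.1, 2.8, 2.6, 2.2, 2.1],
    [2.6, 2.2, 2.4, 3.7, 2.2, 2.2, 2.2, 1.2, 0.1, 0.4, -0.4],
    [2.3, 1.7, 1.5, 1.2, 0.8, 0.8, 0.9, 0.5, 0.4, 0.0, -0.4],
    [7.3, 6.3, 4.9, 5.1, 5.0, 3.8, 1.0, 2.6, 0.5, 1.0, 2.9],
]
X_COORDS = list(range(0, 251, 25))     # 11 columns, 25 m apart
Y_COORDS = list(range(200, -1, -25))   # 9 rows, listed from Y = 200 down to 0


def sample_points():
    """Return (x, y, z) triples for all lattice points, in row-major order."""
    return [
        (x, y, ELEVATION[row][col])
        for row, y in enumerate(Y_COORDS)
        for col, x in enumerate(X_COORDS)
    ]


def lowest_points(n):
    """Return the n sample points with the smallest elevation readings."""
    return sorted(sample_points(), key=lambda p: p[2])[:n]
```

Calling `lowest_points(5)` lists the lowest readings, which can then be compared by eye with the tree locations in Figure 1.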

Methodologically, this work stems from Arjas and Heikkinen (1997) and Heikkinen and Arjas (1998), where we developed a new method for the nonparametric Bayesian estimation of the intensity of a non-homogeneous Poisson process. The method is based on mixing variable dimensional piecewise constant approximations, whose 'smoothness' is regulated by a conditional autoregressive prior distribution. It was noted in the discussion of the latter paper that the same approach is applicable much more generally to the estimation of curves and surfaces, and that these in turn can be parts of a larger, more realistic model. The current paper gives a practical illustration of such analysis.

Our model has three unknown functions to be estimated: First, the ground elevation surface must be interpolated between the sample points; here we neglect the elevation measurement error mainly for the purpose of illustrating how interpolation can be tackled by our approach. Secondly, the response of the tree intensity to changes in elevation is an example of the usual regression curve, and finally, the baseline intensity surface is similar to the functions estimated in Heikkinen and Arjas (1998). Hence this application provides a good opportunity to illustrate how basically a single estimation approach can be applied to a variety of problems, usually attacked by quite different frequentist methods.

Some minor changes are introduced to the method of Heikkinen and Arjas (1998). Most notably the proper multinormal prior is replaced by an improper pairwise difference prior (e.g., Besag et al., 1995), and its precision parameter is treated as unknown.

The outline of this paper is as follows. In Section 2 we specify the statistical model to be considered. Some non-standard aspects of the suggested MCMC estimation scheme are briefly discussed in Section 3 with more details given in the Appendix.
In Section 4 we illustrate some of the estimation results and compare them with those of Rathbun (1996). Finally, we discuss the applied modeling and inferential ideas more generally in Section 5.

2. Model

In Rathbun (1996) the forest stand patterns were modeled using the modulated Poisson process with the intensity taking a simple parametric form as a function of the elevation. The method proposed there for the maximum likelihood estimation was based on replacing the unobserved true elevations by their kriging predictors. Here we extend this model by allowing for a nonparametric response function and an additional unmeasured baseline intensity surface.

To introduce our model in more detail, let us denote the Titi Hammock study region $[0, 250] \times [0, 200]$ by $E$. Following Rathbun (1996) we model the observed tree pattern $x = (x_1, \ldots, x_N) \subset E$ as a realisation of a non-homogeneous Poisson process on $E$ with an unknown intensity function $\lambda : E \to [0, \infty)$. Hence the likelihood $p(x \mid \lambda)$ of data $x$ given an intensity function $\lambda$ is proportional to

$$\exp\Bigl\{-\int_E \lambda(s)\,\nu(ds)\Bigr\} \prod_{n=1}^{N} \lambda(x_n), \tag{2.1}$$

where $\nu$ is the (two-dimensional) Lebesgue measure.

The intensity $\lambda$ in turn is modeled as

$$\lambda(s) = \lambda_0(s)\,\rho\{\eta(s)\}, \tag{2.2}$$

where $\lambda_0 : E \to [0, \infty)$ is an unknown baseline intensity surface, $\eta : E \to \mathbb{R}$ is the partially observed elevation surface, and $\rho : \mathbb{R} \to [0, \infty)$ is an unknown multiplicative response of the intensity to changes in the elevation. The elevation measurements $z = (z_1, \ldots, z_M)$ at the sample points $s = (s_1, \ldots, s_M) \subset E$ are assumed to have negligible error so that $\eta(s_m) = z_m$, or

$$p(z \mid \eta) \propto \prod_{m=1}^{M} 1\{z_m = \eta(s_m)\}. \tag{2.3}$$

Following Arjas and Heikkinen (1997) and Heikkinen and Arjas (1998), the prior distribution of each unknown curve or surface $f$, $f \in \{\lambda_0, \eta, \rho\}$, is defined over a set of piecewise constant functions on a random partition of the function domain ($E$ for surfaces $\lambda_0$ and $\eta$, and $\mathbb{R}$ for curve $\rho$). These partitions are obtained as Voronoi tessellations $\mathcal{E}(\xi_f) = \{E_1(\xi_f), \ldots, E_{K_f}(\xi_f)\}$ of $E$ or $\mathbb{R}$, generated by random point patterns $\xi_f = (\xi_{f,1}, \ldots, \xi_{f,K_f})$. A realisation from the prior distribution is then parametrised by the generating point pattern $\xi_f$ and the function values $\theta_f = (\theta_{f,1}, \ldots, \theta_{f,K_f})$ in each Voronoi tile; the intensities are modeled in log-scale, whereby $\theta_{\lambda_0,k}$ denotes the value of $\log \lambda_0$ in $E_k(\xi_{\lambda_0})$, and $\theta_{\rho,k}$ the value of $\log \rho$ in $E_k(\xi_\rho)$. Note that, owing to the randomness of the partitions, the pointwise posterior means do not need to form a piecewise constant function, and indeed the posterior mean curves and surfaces are typically smooth continuous functions. For further discussion on dynamic step function approximations in Bayesian inference see, e.g., Arjas and Andreev (1996) or Heikkinen (1998).

The prior distribution of $f$ is now determined by specifying the joint prior of $\xi_f$ and $\theta_f$. This is done via the chain rule decomposition $p(\xi_f, \theta_f) = p(\xi_f)\,p(\theta_f \mid \xi_f)$. The prior of $\xi_f$ is taken to be the homogeneous Poisson process with intensity $\lambda_{\xi,f}$, constrained to have at least two points in order for a pairwise difference prior to be sensibly defined for $\theta_f \mid \xi_f$. Hence the density $p(\xi_f)$ is proportional to $\lambda_{\xi,f}^{K_f}\,1(K_f > 1)$.

To build a smoothing prior for $\theta_f$ given $\xi_f$ we define a (realisation dependent) neighbourhood relation $\sim_{\xi_f}$ based on the partition $\mathcal{E}(\xi_f)$. In one dimension adjacent generating points are defined to be neighbours, and in two dimensions two generating points are neighbours if the corresponding Voronoi tiles are contiguous, sharing a common edge. We then set up a Gaussian pairwise difference prior

$$p(\theta_f \mid \xi_f, \tau_f) \propto \Bigl\{\prod_{k=1}^{K_f} \Bigl(\frac{\tau_f w_{k+}}{2\pi}\Bigr)^{1/2}\Bigr\} \exp\Bigl\{-\frac{\tau_f}{2} \sum_{k<j} w_{kj}(\theta_{f,k} - \theta_{f,j})^2\Bigr\}, \tag{2.4}$$

where the precision parameter $\tau_f$ determines the degree of smoothness, the weights $w_{kj} = w_{kj}(\xi_f)$ are non-zero for neighbouring sites $\xi_{f,k} \sim_{\xi_f} \xi_{f,j}$, and $w_{k+} = \sum_j w_{kj}$. This prior is improper, because the density is invariant under shifts of every coordinate by the same amount, but the posterior is proper in the presence of any informative data. Also, the full conditionals

$$p(\theta_{f,k} \mid \theta_{f,-k}, \xi_f, \tau_f) \propto \exp\Bigl\{-\frac{\tau_f w_{k+}}{2} \Bigl(\theta_{f,k} - \sum_j \frac{w_{kj}}{w_{k+}}\,\theta_{f,j}\Bigr)^2\Bigr\} \tag{2.5}$$

are proper Gaussian distributions with the expected values given by weighted averages of the neighbouring levels.

The precision parameter $\tau_f$ is treated as unknown with an exponential prior $p(\tau_f) \propto \exp(-\alpha_f \tau_f)$, and the generating point intensity $\lambda_{\xi,f}$ is used as a control variable to adjust the degree of smoothing. In the examples we have applied distance dependent weights $w_{kj}(\xi) = \|\xi_k - \xi_j\|^{-1}$, both in one and two dimensions; some alternative schemes are discussed and applied in Arjas and Heikkinen (1997) and in Heikkinen and Arjas (1998).

It should be noted that without any additional constraints the multiplicative intensity components $\lambda_0$ and $\rho$ are unidentifiable. The pairwise difference priors applied here have the convenient property (as compared to proper multinormal priors) that the posterior is invariant under transformations of the form $(\lambda_0 \to C\lambda_0,\ \rho \to C^{-1}\rho)$ for any positive constant $C$. Hence realisations drawn from the posterior can be scaled afterwards in a way that best supports the desired interpretation.

In our notation the model of Rathbun (1996) assumes $\lambda_0$ to be identically equal to 1, and $\log(\rho)$ to have a polynomial form (both linear and quadratic functions are applied). To obtain comparable response function estimates we can, for example, scale the integrated baseline intensity per unit area in region $E$, given by $\int_E \lambda_0(s)\,\nu(ds)/\nu(E)$, to be equal to 1 in all realisations of $\lambda_0$. The intuition behind this choice is that, in the hypothetical case where the ground elevation in the entire region $E$ were a constant $z_0$, the total number of trees in that region would be a Poisson random variable with mean $\rho(z_0)\,\nu(E)$, that is, $\rho(z_0)$ would appear as a proportionality constant multiplying the area of $E$.

Another possibility to choose the scaling is to decide on a reference level of elevation, say $z_0$, and set $\rho(z_0) = 1$. Now, if the elevation had the constant value $z_0$ throughout the region $E$, the number $N(A)$ of trees in a subregion $A \subset E$ would be Poisson with parameter $\int_A \lambda_0(s)\,\nu(ds)$, and if this value were changed to the level $z$, the parameter would have to be multiplied by $\rho(z)$. Thus the interpretation of $\rho$ would be similar to that of the relative risk function in the proportional hazards model for survival.

3. Posterior Sampling

Our inferences are based on reversible jump Markov chain Monte Carlo sampling (Green, 1995) from the posterior distribution.
In each basic update step either a new value for a randomly selected $\theta_{f,k}$ or $\tau_f$, the birth of a new marked generating point $(\xi_f, \theta_f)$, or the death of a randomly chosen generating point is proposed. With some more details given in the Appendix, we mention here a few implementation issues specific to the current application.

The ground elevation surface $\eta$ should retain its observed data values $z_m$ at the sample points $s_m$, $m = 1, \ldots, M$. This is implemented by initialising $\xi_\eta$ to $s$ and $\theta_\eta$ to $z$, by never accepting a proposal where one tile contains two or more sample points $s_m$, and by always proposing level $z_m$ to the tile that contains $s_m$. Some care is needed here to ensure reversibility (see Appendix for details). Since there are some adjacent sample points with identical elevation readings, one could alternatively reject only those proposals where one tile contains sample points with different observations.

The response function $\rho$ is, in principle, defined over the entire real line, which is somewhat problematic in our approach. For simplicity, we constrain the generating points $\xi_{\rho,k}$ to lie inside the interval $\Xi_\rho = [-1, 15]$ containing all observed elevations. Restrictions of the subintervals $E_k(\xi_\rho)$ to $\Xi_\rho$ are applied whenever their lengths are needed in the dimension changing moves (see Appendix). Note, however, that we do not restrict the sampled ground elevation values $\theta_{\eta,k}$ to lie inside $\Xi_\rho$.

In order to facilitate the comparison to Rathbun's estimates, we used the scaling option which assumes that $\int_E \lambda_0(s)\,\nu(ds)/\nu(E) = 1$. This re-scaling is performed periodically during the sampling (rather than afterwards), which helps to avoid drifts among unidentifiable parameters that could lead to numerical problems (cf. Besag et al., 1995, Sec. 4.1).
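The re-scaling step can be illustrated with a small Python sketch. It is not taken from the authors' C code: the function names are hypothetical, and it assumes that the baseline is stored as log-levels with known tile areas and the response as log-levels on its own partition. Shifting all baseline log-levels by a constant and all response log-levels by the opposite constant leaves every product of a baseline level and a response level unchanged, which is the invariance noted in Section 2.

```python
import math


def mean_baseline(theta_base, tile_areas):
    """Area-weighted mean of the baseline intensity exp(theta) over the region."""
    total_area = sum(tile_areas)
    weighted = sum(math.exp(t) * a for t, a in zip(theta_base, tile_areas))
    return weighted / total_area


def rescale(theta_base, tile_areas, theta_resp):
    """Shift log-levels so that the mean baseline intensity per unit area is 1.

    The baseline log-levels are lowered by log(C) and the response log-levels
    raised by log(C), where C is the current mean baseline intensity, so the
    total intensity (baseline times response) is unchanged.
    """
    shift = math.log(mean_baseline(theta_base, tile_areas))
    new_base = [t - shift for t in theta_base]
    new_resp = [t + shift for t in theta_resp]
    return new_base, new_resp
```

For example, with two baseline tiles of areas 20000 and 30000 and intensities 2 and 0.5, the mean baseline intensity is 1.1, and `rescale` subtracts log 1.1 from both baseline log-levels while adding it to every response log-level.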

[Figure 2: four panels titled Elevation (contour map, top left), Intensity against Elevation (top right), Baseline intensity (contour map, bottom left) and Total intensity (bottom right).]

Figure 2. Pointwise posterior mean estimates for ironwood. For the response function $\rho$ (top right) also Rathbun's log-quadratic estimate (dashed line), and our pointwise 80% credible intervals (dotted lines) are shown. The observed point pattern is overlaid on the plot of total intensity.

The results presented in Section 4 are based on samples of 10,000 realisations, collected by saving the current state after every 2,000th basic update step after a burn-in period of 1,000,000 steps. This makes a total of 21,000,000 steps, which took about 10 hours on our workstation based on a 500 MHz DEC Alpha 21164 processor. The acceptance rates varied between 20% and 70%. Convergence was assessed by monitoring the likelihood $p(x \mid \lambda)$, the sufficient statistics in the prior of each parameter function: $K$, $\sum \log w_{k+}$ and $\sum w_{kj}(\theta_k - \theta_j)^2$, and the values of the response curve at 8 control points equally spaced on $[-1, 15]$. The Monte Carlo standard errors for the posterior means of these statistics were assessed via the initial monotone sequence estimator of Geyer (1992); they were all less than 13% of the corresponding posterior standard deviations.

The C-code of our sampler and some S-Plus functions for graphical display of the results are available over the Internet at http://www.stat.jyu.fi/~jmhe/pub/Steps.html.

4. Results

For an illustration of our method we have selected two representative species, ironwood and blue beech Ostrya virginiana. Ironwood serves as an example of an 'easy' species, for which the parametric model of Rathbun (1996) gives good fit. Blue beech, on the other hand, is the species for which Rathbun reports worst fit. Figures 2 (ironwood) and 3 (blue beech) display curves and surfaces formed by our pointwise posterior mean estimates for elevation $\eta$, response curve $\rho$, baseline intensity $\lambda_0$, and total intensity $\lambda$. For the response curve also the pointwise symmetric 80% credible intervals, and the corresponding parametric estimate of Rathbun are shown.

[Figure 3: four panels as in Figure 2: Elevation, Intensity against Elevation, Baseline intensity and Total intensity.]

Figure 3. Pointwise posterior mean estimates for blue beech (as Figure 2).

Consider first the topographic contour maps in the top left displays of Figures 2 and 3. Since we have treated ironwood and blue beech data separately, by repeating the same estimation procedure for both species, we have ended up producing two topographic contour maps for the Titi Hammock study region. This could be changed easily into a combined analysis, by replacing the two point pattern likelihood expressions arising from single species by their product, and using the same elevation values in both. However, already the way the estimation has been done here, the two contour maps resemble each other closely, and are also very similar to the map produced in Rathbun (1996, Fig. 3).

For a comparison, Figure 4 shows yet another estimate of the elevation surface; this time without using the tree data at all. That is, we have only used likelihood (2.3) with the same prior for $\eta$ as in the other experiments. The estimate with blue beech data is almost identical to this 'prior' interpolation. The northern parts of 2 and 3 m contours at about 'latitude' $Y = 50$ are somewhat more wiggly, following apparent change curves in the blue beech intensity. The effect of ironwood data on elevation estimates is more visible, especially in the southern valley, where ironwood is most abundant. Again the topographic contours adjust to sudden changes in the tree intensity.

Consider then the estimated influence of elevation on ironwood and blue beech stands.

[Figure 4: contour map of elevation over the study region; horizontal axis 0 to 250, vertical axis 0 to 200.]

Figure 4. Posterior means of elevations without using any tree data.

Again, as is shown in the top right displays of Figures 2 and 3, our pointwise posterior means of the function $\rho$ resemble rather well the corresponding parametric point estimates obtained by Rathbun. Apparently the most interesting observation here concerns blue beech: As already noted by Rathbun, his estimate for $\rho$ appears to be too high for elevations below 2 meters. This does not happen in our nonparametric method, where the functional form of $\rho$ has not been specified in advance and where it is, in essence, data driven. We actually suspect that the bell-shaped exponential of a quadratic curve considered by Rathbun, resembling the normal density, does not give a particularly good description of the abundance of blue beech at higher elevations either.

The role of the baseline intensity $\lambda_0$ in the description of the Titi Hammock data is to explain the apparent extra-Poisson variability which would be present if the intensity were considered as a function of only the corresponding elevation. This element was missing from Rathbun's model completely. In the bottom left display of Figure 2, for example, in the western part of the study area, the posterior mean of $\lambda_0$ assumes values as large as 2 or higher, and these are then balanced by much smaller values towards the east. Corresponding clusters of ironwood can be seen in the western part of the map, where more trees seem to be growing than could be expected purely on grounds of the corresponding elevation readings. Similar comments apply to blue beech, where the baseline intensity peaks close to the centre of the study region (bottom left display of Figure 3).

For each unknown function $f$ there are two hyperparameters, $\lambda_{\xi,f}$ and $\alpha_f$, whose values must be specified. The generating point intensity $\lambda_{\xi,f}$ has a direct interpretation as the prior mean number of generating points per unit length or area.
In other words, it determines the typical tile size in the individual realisations, and can be used to control how fine details of the function are shown. Its choice also has a considerable effect on the required computational effort, which increases with the number of tiles. In the experiments reported above, the values were chosen as follows:

• $\lambda_{\xi,\lambda_0} = 0.0005$, corresponding to prior mean number of 25 generating points for the baseline intensity surface in the 250 × 200 m region;
• $\lambda_{\xi,\eta} = 0.004$, corresponding to prior mean number of 200 generating points for the elevation surface; and
• $\lambda_{\xi,\rho} = 0.625$, corresponding to prior mean number of 10 generating points for the response curve.

A rough idea on the effect of $\alpha_f$ can be obtained by looking at the full conditionals (2.5) of the function value prior. Consider, for example, the conditional prior distribution of $\theta_{\rho,k}$, the value of $\log \rho$ in $E_k(\xi_\rho)$, given all other values and assuming, for simplicity, that $w_{k+} = 1$, which is a rather typical value with the current choice of $\lambda_{\xi,\rho}$. Given $\tau_\rho$, the standard deviation of this distribution is $\sigma_\rho = \tau_\rho^{-1/2}$. The prior densities of $\sigma_\rho$ corresponding to three distinct values of $\alpha_\rho$ are shown in the left hand display of Figure 5. Note that a difference of $\log 2 \approx 0.7$ in $\theta_\rho$-values corresponds to halving or doubling the intensity. The value of $\alpha_\rho$ applied in the reported experiments was 0.05. The right hand display of Figure 5 indicates that the results are not very sensitive to this choice. For $f \in \{\lambda_0, \eta\}$, $\alpha_f$ was set to 0.01.

[Figure 5: two panels. Left: Prior density against Standard deviation (0.0 to 1.0). Right: Intensity against Elevation (0 to 15).]

Figure 5. Left: prior densities of $\tau_\rho^{-1/2}$ (standard deviation of the conditional distribution (2.5) of $\theta_{\rho,k}$, when $w_{k+} = 1$) with $\alpha_\rho = 0.005$ (dashed line), $\alpha_\rho = 0.05$ (solid line) and $\alpha_\rho = 0.5$ (dotted line). Right: corresponding posterior mean estimates of response function $\rho$ for ironwood.

5. Discussion

In this paper we have proposed a nonparametric intensity model for describing spatial point pattern data, and for relating such a description to the values of a concomitant variable from which only point measurements are assumed to be available.
The suggested model structure involves two nonparametrically defined spatial (bivariate) functions, one representing the true values of the considered concomitant variable and the other serving as a baseline intensity, describing observed extra-Poisson residual variation. A third (univariate) nonparametric function is used as a link between the concomitant variable and the observed point pattern response.

The suggested statistical inference from this model follows a fully Bayesian estimation scheme, where all three nonparametric functions are estimated jointly. In practice, the numerical work involves the application of computationally intensive algorithmic Markov chain Monte Carlo methods. Their implementation is not a routine task, and rigorous convergence assessment is inevitably difficult. Serious non-convergence, however, is usually detected by simple diagnostics.

We believe that these general modeling and inferential ideas, involving a combination of several nonparametrically defined functions to form a likelihood expression and their joint Bayesian estimation from observed data, have a much greater potential in applications than we have been able to illustrate here in our concrete example. Response surfaces, other than the Poisson intensity, could be modeled simply by choosing a different likelihood expression; restoration of a grey level image can be viewed as a particular case of such a task. Image classification is an example of the case where the function of interest is truly piecewise constant and its values restricted to a (small) finite set of labels. In such a case we could replace the Gaussian function value prior by, for example, the Potts model.

Acknowledgements

We are extremely grateful to Steve Rathbun for kindly providing us with the Titi Hammock data. This work was partly supported by a research grant from the Academy of Finland. Computing facilities of the Department of Statistics, University of Jyväskylä were used. Rolf Turner's ratfor-routines from StatLib-archive (http://lib.stat.cmu.edu/general/delaunay) were modified to perform Voronoi tessellations. The insightful comments by the Editor and two referees have helped us improve the manuscript.

References

Arjas, E. and Andreev, A. (1996). A note on histogram approximation in Bayesian density estimation, in J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith (eds), Bayesian Statistics 5, Oxford University Press, pp. 487–490.

Arjas, E. and Heikkinen, J. (1997). An algorithm for nonparametric Bayesian estimation of a Poisson intensity. Computational Statistics 12, 385–402.

Besag, J., Green, P. J., Higdon, D. and Mengersen, K. (1995). Bayesian computation and stochastic systems (with discussion). Statistical Science 10, 3–66.

Geyer, C. J. (1992). Practical Markov chain Monte Carlo (with discussion). Statistical Science 7, 473–511.

Green, P. J. (1995). Reversible jump MCMC and Bayesian model determination. Biometrika 82, 711–732.

Heikkinen, J. (1998). Curve and surface estimation using dynamic step functions, in D. Dey, P. Müller and D. Sinha (eds), Practical Nonparametric and Semiparametric Bayesian Statistics, number 133 in Lecture Notes in Statistics, Springer-Verlag, New York, chapter 14, pp. 255–272.

Heikkinen, J. and Arjas, E. (1998). Non-parametric Bayesian estimation of a spatial Poisson intensity. Scandinavian Journal of Statistics 25. In press.

Rathbun, S. L. (1996). Estimation of Poisson intensity using partially observed concomitant variables. Biometrics 52, 226–242.

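Appendix

Formulae for the posterior sampler

This appendix gives the formulae applied in the reversible jump MCMC sampling from the posterior distribution. For more details on their derivation the reader is referred to Heikkinen (1998).

Let $\phi = (\xi_{\lambda_0}, \theta_{\lambda_0}, \tau_{\lambda_0}, \xi_\eta, \theta_\eta, \tau_\eta, \xi_\rho, \theta_\rho, \tau_\rho)$ denote the complete set of our model parameters. The joint posterior $p(\phi \mid x, z)$ of $\phi$ is proportional to the expression

$$\Bigl\{\prod_{f \in \{\lambda_0, \eta, \rho\}} p(\xi_f)\,p(\tau_f)\,p(\theta_f \mid \xi_f, \tau_f)\Bigr\}\,p(x \mid \phi)\,p(z \mid \phi),$$

where the Poisson likelihood function can be written as

$$p(x \mid \phi) \propto \exp\Bigl[\sum_{k=1}^{K_{\lambda_0}} \theta_{\lambda_0,k}\,N_x\{E_k(\xi_{\lambda_0})\} + \sum_{k=1}^{K_\eta} \theta_{\rho,i(\theta_{\eta,k})}\,N_x\{E_k(\xi_\eta)\} - \sum_{k=1}^{K_{\lambda_0}} \sum_{j=1}^{K_\eta} \exp(\theta_{\lambda_0,k} + \theta_{\rho,i(\theta_{\eta,j})})\,\nu\{E_k(\xi_{\lambda_0}) \cap E_j(\xi_\eta)\}\Bigr].$$

Here $N_x(A)$ is the number of individual points in pattern $x$ inside domain $A$, $\nu(A)$ is the size (Lebesgue measure) of $A$ (i.e., the length in one dimension and the area in two), and $i(z)$ is the index of the interval in $\mathcal{E}(\xi_\rho)$ which contains $z$. For future reference, note the two alternative ways of rewriting the double sum

$$\begin{aligned}
\sum_{k=1}^{K_{\lambda_0}} \sum_{j=1}^{K_\eta} \exp(\theta_{\lambda_0,k} + \theta_{\rho,i(\theta_{\eta,j})})\,\nu\{E_k(\xi_{\lambda_0}) \cap E_j(\xi_\eta)\}
&= \sum_{k=1}^{K_{\lambda_0}} \Bigl[\exp \theta_{\lambda_0,k} \sum_j \exp(\theta_{\rho,i(\theta_{\eta,j})})\,\nu\{E_j(\xi_\eta) \cap E_k(\xi_{\lambda_0})\}\Bigr] \\
&= \sum_{k=1}^{K_\eta} \Bigl[\exp \theta_{\rho,i(\theta_{\eta,k})} \sum_j \exp(\theta_{\lambda_0,j})\,\nu\{E_j(\xi_{\lambda_0}) \cap E_k(\xi_\eta)\}\Bigr].
\end{aligned}$$

The inner sums typically contain only few terms, since most intersections are empty.

In each iteration of the posterior sampler one of the following moves is proposed for a randomly selected parameter function $f \in \{\lambda_0, \eta, \rho\}$:

1. Change the precision parameter value $\tau_f$.
2. Change one randomly chosen function value $\theta_{f,k}$, $k \in \{1, \ldots, K_f\}$.
3. Add a new (marked) generating point $(\xi'_f, \theta'_f)$.
4. Remove one randomly chosen generating point $\xi_{f,k}$, $k \in \{1, \ldots, K_f\}$.

To simplify notation the subscript $f$ will be omitted below whenever there is no danger of confusion. The proposed moves from $\phi$ to $\phi'$ are accepted with probability

$$\min\Bigl\{1,\ \frac{q(\phi' \to \phi)}{q(\phi \to \phi')}\,\frac{p(\phi' \mid x, z)}{p(\phi \mid x, z)}\Bigr\}, \tag{A.1}$$

where $q$ is the density of the proposal probability kernel. Our dimension changing moves are such that the Jacobian (Green, 1995) is always equal to 1, and therefore it is omitted in expression (A.1).

For type 1 move we create the proposal $\tau'$ by drawing $\log \tau'$ from the uniform distribution on $[\log \tau - C_\tau, \log \tau + C_\tau]$, where $C_\tau$ (as well as other $C$'s appearing later) is a sampler parameter that can be tuned to improve mixing. The corresponding proposal ratio is $q(\tau' \to \tau)/q(\tau \to \tau') = \tau'/\tau$ due to the log-transform. The posterior ratio is

$$\frac{p(\tau')\,p(\theta \mid \xi, \tau')}{p(\tau)\,p(\theta \mid \xi, \tau)} = (\tau'/\tau)^{K/2} \exp\Bigl[-\Bigl\{\alpha + \frac{1}{2} \sum_{k<j} w_{kj}(\theta_k - \theta_j)^2\Bigr\}(\tau' - \tau)\Bigr].$$

In a type 2 move an index $k$ is sampled from the uniform distribution on integers $1, \ldots, K_f$. If $f = \eta$ and $E_k(\xi_\eta) \cap s \neq \emptyset$, we refuse this proposal without further consideration, since any change to $\theta_{\eta,k}$ would then lead to $p(z \mid \phi) = 0$. Otherwise, we draw a proposal $\theta'_k$ from the uniform distribution on the interval $[\theta_k - C_\theta, \theta_k + C_\theta]$. The proposal ratio cancels due to the symmetry, and the posterior ratio is

$$\frac{p(\theta'_k \mid \theta_{-k}, \xi, \tau)\,p(x \mid \phi')}{p(\theta_k \mid \theta_{-k}, \xi, \tau)\,p(x \mid \phi)} = \exp\Bigl[-\tau(\theta'_k - \theta_k)\Bigl\{w_{k+}(\theta'_k + \theta_k)/2 - \sum_j w_{kj}\theta_j\Bigr\} + D_{\mathrm{like},1}\Bigr],$$

where $D_{\mathrm{like},1} = D_{\mathrm{like},f,1} = \log p(x \mid \phi') - \log p(x \mid \phi)$ is the log-likelihood difference:

$$\begin{aligned}
D_{\mathrm{like},\lambda_0,1} &= (\theta'_{\lambda_0,k} - \theta_{\lambda_0,k})\,N_x\{E_k(\xi_{\lambda_0})\} - (\exp \theta'_{\lambda_0,k} - \exp \theta_{\lambda_0,k}) \sum_j \exp(\theta_{\rho,i(\theta_{\eta,j})})\,\nu\{E_j(\xi_\eta) \cap E_k(\xi_{\lambda_0})\}, \\
D_{\mathrm{like},\eta,1} &= \{\theta_{\rho,i(\theta'_{\eta,k})} - \theta_{\rho,i(\theta_{\eta,k})}\}\,N_x\{E_k(\xi_\eta)\} - (\exp \theta_{\rho,i(\theta'_{\eta,k})} - \exp \theta_{\rho,i(\theta_{\eta,k})}) \sum_j \exp(\theta_{\lambda_0,j})\,\nu\{E_j(\xi_{\lambda_0}) \cap E_k(\xi_\eta)\}, \\
D_{\mathrm{like},\rho,1} &= \{\theta'_{\rho,k} - \theta_{\rho,k}\} \sum_{j:\,i(\theta_{\eta,j})=k} N_x\{E_j(\xi_\eta)\} - (\exp \theta'_{\rho,k} - \exp \theta_{\rho,k}) \sum_{j:\,i(\theta_{\eta,j})=k} \sum_i \exp(\theta_{\lambda_0,i})\,\nu\{E_i(\xi_{\lambda_0}) \cap E_j(\xi_\eta)\}.
\end{aligned}$$

Move types 3 and 4 are designed to form pairs of reversible jumps between different dimensions. To fix the notation let $\xi = (\xi_1, \ldots, \xi_K)$ and $\xi' = (\xi'_1, \ldots, \xi'_{K'})$, where $K' = K + 1$ and $\xi'_k = \xi_k$ for $k = 1, \ldots, K$.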
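That is, $\xi'$ is obtained by adding one point to $\xi$ and indexing it as the last one or, reversely, $\xi$ is obtained by deleting $\xi'_{K'}$ from $\xi'$. Define $\theta$ and $\theta'$ analogously, and consider a pair $\phi$, $\phi'$ of states, identical except for these differences in the currently updated function $f$.

In a birth proposal from $\phi$ to $\phi'$ the proposed location of a new generating point $\xi'_{K'}$ is drawn from the uniform distribution on $\Xi_f$, where $\Xi_f = E$ for $f \in \{\lambda_0, \eta\}$. Let $\mathcal{K}'$ denote the set $\{k : \xi'_k \sim_{\xi'} \xi'_{K'}\}$ of indices in the neighbourhood of the new point. If $f \in \{\lambda_0, \rho\}$, or $f = \eta$ and the new tile $E_{K'}(\xi'_\eta)$ does not contain any of the sample points $s$, then we propose mark $\theta'_{K'} = \tilde\theta + \varepsilon$, where

$$\tilde\theta = \sum_{k \in \mathcal{K}'} \frac{\nu\{E_k(\xi)\} - \nu\{E_k(\xi')\}}{\nu\{E_{K'}(\xi')\}}\,\theta_k$$

is a weighted average of the current function values in the neighbouring tiles, and perturbation $\varepsilon \in \mathbb{R}$ is drawn from the density $g(\varepsilon) = C_\varepsilon \exp(C_\varepsilon \varepsilon)/\{1 + \exp(C_\varepsilon \varepsilon)\}^2$. (For $f = \rho$ the intervals $E_k(\xi_\rho)$ are restricted to $\Xi_\rho$ when measuring their lengths $\nu$.) The proposal ratio for this pair of states is

$$\frac{q(\phi' \to \phi)}{q(\phi \to \phi')} = \frac{d_{K'}/K'}{b_K\,g(\theta'_{K'} - \tilde\theta)/\nu(\Xi)},$$

where $b_k$ and $d_k$ denote the probabilities with which birth and death moves, respectively, are proposed when the current number of generating points is $k$. We have chosen them so as to satisfy equations $d_{k+1}/b_k = (k + 1)/\{\lambda_\xi\,\nu(\Xi)\}$, whereby the proposal ratio simplifies to $\{\lambda_\xi\,g(\theta'_{K'} - \tilde\theta)\}^{-1}$. Letting $w'$ denote a weight derived from pattern $\xi'$, the posterior ratio can be expressed as

$$\frac{p(\xi')\,p(\theta' \mid \xi', \tau)\,p(x \mid \phi')}{p(\xi)\,p(\theta \mid \xi, \tau)\,p(x \mid \phi)} = \lambda_\xi \Bigl(\frac{\tau\,w'_{K'+}}{2\pi} \prod_{k \in \mathcal{K}'} \frac{w'_{k+}}{w_{k+}}\Bigr)^{1/2} \exp\Bigl\{-\frac{\tau}{2} D_{\mathrm{prior}} + D_{\mathrm{like},2}\Bigr\},$$

where

$$\begin{aligned}
D_{\mathrm{prior}} &= \sum_{k \in \mathcal{K}'} \Bigl[w'_{K'k}(\theta'_{K'} - \theta_k)^2 + \frac{1}{2} \sum_{j \in \mathcal{K}'} \{(w'_{kj} - w_{kj})(\theta_k - \theta_j)^2\}\Bigr], \\
D_{\mathrm{like},\lambda_0,2} &= \sum_{k \in \mathcal{K}'} \Bigl[(\theta'_{\lambda_0,K'} - \theta_{\lambda_0,k})\,N_x\{E_{K'}(\xi'_{\lambda_0}) \cap E_k(\xi_{\lambda_0})\} \\
&\qquad - (\exp \theta'_{\lambda_0,K'} - \exp \theta_{\lambda_0,k}) \sum_j \exp(\theta_{\rho,i(\theta_{\eta,j})})\,\nu\{E_j(\xi_\eta) \cap E_{K'}(\xi'_{\lambda_0}) \cap E_k(\xi_{\lambda_0})\}\Bigr], \\
D_{\mathrm{like},\eta,2} &= \sum_{k \in \mathcal{K}'} \Bigl[(\theta_{\rho,i(\theta'_{\eta,K'})} - \theta_{\rho,i(\theta_{\eta,k})})\,N_x\{E_{K'}(\xi'_\eta) \cap E_k(\xi_\eta)\} \\
&\qquad - (\exp \theta_{\rho,i(\theta'_{\eta,K'})} - \exp \theta_{\rho,i(\theta_{\eta,k})}) \sum_j \exp(\theta_{\lambda_0,j})\,\nu\{E_j(\xi_{\lambda_0}) \cap E_{K'}(\xi'_\eta) \cap E_k(\xi_\eta)\}\Bigr], \\
D_{\mathrm{like},\rho,2} &= \sum_{k:\,i(\theta_{\eta,k})=K'} \Bigl[(\theta'_{\rho,K'} - \theta_{\rho,i(\theta_{\eta,k})})\,N_x\{E_k(\xi_\eta)\} \\
&\qquad - (\exp \theta'_{\rho,K'} - \exp \theta_{\rho,i(\theta_{\eta,k})}) \sum_j \exp(\theta_{\lambda_0,j})\,\nu\{E_j(\xi_{\lambda_0}) \cap E_k(\xi_\eta)\}\Bigr].
\end{aligned}$$

If $f = \eta$ and $s \cap E_{K'}(\xi'_\eta) = \{s_m\}$, then we propose $\theta'_{\eta,K'} = z_m$ so that $\eta(s_m)$ remains equal to $z_m$ and the likelihood $p(z \mid \phi)$ remains positive. In order to obtain pairs of reversible jumps we also propose a perturbation $\theta'_{\eta,k} = \theta_{\eta,k} + \varepsilon\ (= z_m + \varepsilon)$ to the tile $E_k(\xi_\eta)$ that contains $s_m$ in the current tessellation; as earlier, $\varepsilon$ is drawn from density $g(\varepsilon)$. This birth move is reversed by removing $\xi'_{K'}$ and setting $\theta_{\eta,k}$ to $z_m$. The proposal ratio is as above, but with $g$ evaluated at $\theta'_{\eta,k} - \theta_{\eta,k}$.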
The expressions for $D_{\mathrm{prior}}$ and $D_{\mathrm{like},\eta,2}$ become a bit more complicated since $\theta'_{\eta,k} \neq \theta_{\eta,k}$.

Finally, if $f = \eta$ and the proposed new tile $E_{K'}(\xi'_\eta)$ would contain more than one of the sample points $s$, then we automatically refuse that proposal. Also death proposals with two sample points in one tile $E_k(\xi_\eta)$ are automatically refused.
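To illustrate how the acceptance probability (A.1) is evaluated for the simplest of the moves, the type 1 update of a precision parameter can be sketched in Python as follows. This is a minimal sketch and not the authors' C implementation: the function names, the dictionary representation of the neighbour weights and the use of the standard `random` module are all assumptions made for illustration. The acceptance ratio is computed on the log scale as the sum of the log proposal ratio and the log posterior ratio given above.

```python
import math
import random


def pairwise_sum(theta, weights):
    """Sum of w_kj * (theta_k - theta_j)^2 over neighbour pairs k < j.

    `weights` maps index pairs (k, j), k < j, to the weight w_kj.
    """
    return sum(w * (theta[k] - theta[j]) ** 2 for (k, j), w in weights.items())


def log_accept_precision(tau, tau_new, theta, weights, alpha):
    """Log of the ratio inside (A.1) for a type 1 move from tau to tau_new."""
    k_tiles = len(theta)
    log_proposal_ratio = math.log(tau_new / tau)  # from the log-transform
    log_posterior_ratio = (
        0.5 * k_tiles * math.log(tau_new / tau)
        - (alpha + 0.5 * pairwise_sum(theta, weights)) * (tau_new - tau)
    )
    return log_proposal_ratio + log_posterior_ratio


def update_precision(tau, theta, weights, alpha, c_tau, rng):
    """One type 1 move: propose log(tau') uniformly around log(tau), then accept or reject."""
    tau_new = math.exp(math.log(tau) + rng.uniform(-c_tau, c_tau))
    log_ratio = log_accept_precision(tau, tau_new, theta, weights, alpha)
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return tau_new
    return tau
```

A call such as `update_precision(1.0, [0.0, 1.0], {(0, 1): 2.0}, 0.05, 0.5, random.Random(1))` performs a single update for a two-tile function; the tuning constant `c_tau` plays the role of $C_\tau$.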