Choice of Distribution Functions for Hydrologic Design


VOL. 14, NO. 4 WATER RESOURCES RESEARCH AUGUST 1978

Choice of Distribution Functions for Hydrologic Design

EUGENIO CASTANO AND LUCIEN DUCKSTEIN¹

Systems and Industrial Engineering Department, University of Arizona, Tucson, Arizona 85721

ISTVAN BOGARDI

Water Resources Centre, Budapest H1054, Hungary

The problem of selecting a bivariate probability density function (pdf) for the simultaneous water stages at both ends of a confluence reach is considered. This pdf is used to compute the expected flood losses in the design of a levee with minimum expected yearly cost. A case study in Hungary illustrates the methodology throughout the paper. Two model selection procedures are compared: ranking the candidate pdf's by the likelihood of the $\chi^2$ statistic and ranking them by their sample likelihoods. A composite model consisting of a linear combination of candidate pdf's weighted proportionally to their sample likelihoods is also considered. It is found that the two selection procedures lead to different choices, which in the example represent a significant cost variation. Although the ranking of the distributions reduces the uncertainty by imposing an ordering within the candidate set, a unique pdf does not fully account for the model uncertainty. In this sense the composite model seems a more reasonable choice, especially if the decisions are based on expected values.

INTRODUCTION

In this paper an approach to the model choice problem in hydrology is presented within the context of the optimum design of a flood protection levee along a confluence reach. Specifically, the model to be chosen is the bivariate probability density function (pdf) of flood magnitudes which is necessary to compute the expected damages. In hydrologic design the natural uncertainty inherent in some design parameters may be encoded in the form of a given pdf whose parameters are estimated from historical records. This approach is fraught with two more types of uncertainty [Bogardi, 1975; Kisiel and Duckstein, 1972]: parameter or sample uncertainty, which is associated with the randomness of the sampling process, and model uncertainty, which arises from ignorance about the type of pdf to be used.

Rodríguez-Iturbe and Vicens [1974] lump these two types of uncertainty under the label of operational uncertainty. As they point out, the selection of the proper model (pdf) may prove to be very important when decisions are conditioned by extreme events, as is the case with flood protection. This is because models which seem to fit very well around the mean may differ considerably toward the tails.

In the following section the confluence levee problem is briefly reviewed in terms of the goal function and model components. Next, the probabilistic submodel is examined in detail, and the model choice problem is defined; two approaches for model selection are compared, and a composite model, in a Bayesian sense, is also considered. Finally, the economic consequences of choosing different pdf's are explored.

THE CONFLUENCE LEVEE PROBLEM

The elements of the confluence levee problem are now considered. The river stretch where the levee will be built ends at a confluence, and the river profile along this stretch has a very small slope. Changes of stage in the main river will propagate long distances along the tributary owing to the small slope.

¹Also with Hydrology and Water Resources Department, University of Arizona, Tucson, Arizona 85721.

Copyright © 1978 by the American Geophysical Union. Paper number 8W0256. 0043-1397/78/048W-0256$01.00

The water surface profile along the reach depends, then, on the stages at its extremes and also on the hydraulic characteristics of the channel and the floodplain. The stages are hydrologic variables that cannot be predicted with certainty and will be treated as random variables. The characteristics of the channel and floodplain are assumed to be known and constant.

Associated with each levee design are two basic costs: construction and operation costs and flood losses. The former depend on the shape of the levee, and the latter include the cost of rebuilding the levee in addition to the flood damages in the protected area. Flood losses cannot be predicted with certainty, but if the behavior of the river can be modeled, it is possible to compute their expected value for any levee design. Several modes of levee failure may be considered, including overtopping, sand boiling and subsoil failure, wave effects, and structural failure due to seepage and wetting [Szidarovszky et al., 1976].

The confluence levee problem can now be stated as follows. For a confluence reach where the backwater effects must be taken into account, find the levee profile that minimizes the total expected cost, that is, the sum of construction and maintenance costs plus expected flood losses.

The solution method developed here will be applied illustratively to determine the optimum levee profile for the confluence reach of the Zagyva River, an important tributary of the Tisza River, both located in Hungarian territory. The lowlands to the west of the Zagyva River are protected by a 60.4-km levee reach between the Jásztelek gaging station and the mouth of the Tisza River at Szolnok. The average slope between the ends of the reach is about 0.01%, and the altitude difference between the two gaging stations is 6.5 m. Figure 1 depicts the region under consideration.

Goal Function

A general form of the goal function for the minimization of total expected cost is the following:

$$\min \left\{ C(G) + \int_{D} L(G, H)\, f(H)\, dH \right\} \tag{1}$$

where G is the vector of decision variables; C(G) represents all those costs that are considered deterministic, including construction and maintenance; H is the vector of random variables that directly affect the system being designed; and L(G*, H*) is the loss function giving the loss incurred if the decision is G* and the realization of the random vector H is H*.

Fig. 1. Geographic sketch of the confluence levee reach investigated.

A very detailed model for the confluence levee problem should include among the decision variables all the design parameters having to do with levee strength and degree of protection, such as levee heights, widths and slopes along the reach, and materials used. Such a model should also take into account all modes of levee failure, and the set of random inputs to the system, H, should include the variables that determine the water surface profile and all other parameters of the flood wave related to the various modes of levee failure, such as duration of exposure to the flood and wind and wave effects.

A model so detailed would turn out to be rather complex. A simpler model can be defined at the expense of the following simplifying assumptions [Bogardi et al., 1976]:

1. Steady flow conditions exist while the flood flows down the tributary. This assumption is especially justifiable for flat rivers where the duration of the peak flow may be as much as several days [Szidarovszky et al., 1975].

2. The only mode of levee failure to be considered is overtopping. This assumption greatly simplifies the model because with it a failure is determined by comparing the levee profile with the water profile. The most important design parameter is the levee height, and the most important flood parameters are the stages along the reach.

3. The costs are a function of levee height only.

With these assumptions the variables in (1) are interpreted more specifically as follows. G is the profile of the levee, H is a random vector whose elements are the water stages at successive points along the river reach, D is the domain of all possible values of H, f(H) is the pdf of H, L(G, H) is the damage caused by a flood H when the levee profile is G, and C(G) is the yearly construction and maintenance cost of the levee. For computational purposes the following assumptions are added.

4. H is a function of the stages (x, y) at both ends of the reach. For a given (x, y) a unique steady state water surface profile can be computed by solving the nonuniform steady state flow equation with x and y as boundary values [Szidarovszky et al., 1975].

5. Given the levee heights $g_m$ and $g_t$ at the ends of the reach, the levee is assumed to follow the steady state water surface profile with the same end stages [Bogardi et al., 1975]. In other words, the levee profile is a function of the end levee heights $g_m$ and $g_t$.

6. Along the reach, n cross sections are selected so that H and G can be treated as finite dimensional vectors: $(h_1, h_2, \ldots, h_n)$ and $(g_1, g_2, \ldots, g_n)$, where $h_i$ is the water level and $g_i$ is the levee height at cross section i. Also, $h_1 = x$, $h_n = y$, $g_1 = g_m$, and $g_n = g_t$.

With assumptions 4-6 the goal function (1) can be expressed as

$$\min_{(g_m, g_t)} \left\{ C[G(g_m, g_t)] + \iint L[G(g_m, g_t), H(x, y)]\, f(x, y)\, dx\, dy \right\} \tag{2}$$

where f(x, y) is the joint pdf of water stages (x, y). The use of joint nonindependent flood stages is justified by the fact that the steady state water surface profiles depend on the simultaneous stages at both ends of the reach and because unless the basins are too far from each other, some dependence is to be expected between the stages owing to climatic similarities and precipitation phenomena that may affect both basins within a short time interval.

COMPONENTS OF THE MODEL

Equation (2) consists of three submodels: the hydraulic submodel, which generates water profiles for given pairs of water stages (x, y) at the reach extremities; the economic submodel, which consists of the structure and coefficients of both cost and damage functions; and the hydrologic or probabilistic submodel, which determines the probability of occurrence of a pair of water stages (x, y).

The hydraulic and economic submodels are now briefly reviewed; the hydrologic submodel, whose choice constitutes the core of this paper, is presented in the subsequent section.

The hydraulic submodel is developed after Szidarovszky et al. [1975]. It is based on the nonuniform steady state flow equation [Kuiper, 1965]

$$\frac{dz}{dw} + \frac{\bar{v}}{g}\,\frac{d\bar{v}}{dw} = -\frac{q^2}{k^2(z, w)} \tag{3}$$

where w is the horizontal distance (from the confluence), z is the water level at w, $\bar{v}$ is the average velocity through a cross section at w, q is the discharge, and k(z, w) is the conveyance of the cross section at w. To determine the steady state water surface profile, this equation must be solved with the boundary conditions z = x = stage at main river for w = 0 and z = y = stage at tributary for w = 60.4 km. Any pair of stages (x, y) uniquely determines a steady state water surface profile. The profile is called type $M_1$ (concave upward) if the difference between the water stages is less than the altitude difference between the gaging stations and type $M_2$ (concave downward) if it is greater than or equal to it. The point of failure along the reach will be given by the indicator function $t_i(G, H)$, which is defined as follows. For curves of type $M_1$, $t_i(G, H) = 1$ if cross section i is the first overtopped cross section counting up from downstream; $t_i(G, H) = 0$ otherwise. For curves of type $M_2$, $t_i(G, H) = 1$ if cross section i is the first overtopped cross section counting down from upstream; $t_i(G, H) = 0$ otherwise.

Fig. 2. Sample used. Events A1 and A2 are also shown. [Scatter plot of the tributary stage y (= $h_t$) against the main river stage x (= $h_m$), with the base levels $b_m = x_L = 85.95$ and $b_t = y_L = 91.04$ marked.]

Since (1) must be solved many times for the computation of expected losses of levee designs, it is necessary to have a fast computational procedure to determine water profiles. The method used here is that of Szidarovszky et al. [1975]. For each of the two types of profiles, $M_1$ and $M_2$, they defined a set of n linear regressions to express the water stage at each cross section as a function of the stages (x, y) at both ends of the reach:

$$h_i = a_i x + b_i y + c_i \qquad i = 1, \ldots, n \tag{4}$$

where n is the number of cross sections defined along the reach. The regression coefficients were estimated by least squares from a large number of steady state water surface profiles generated by numerical solution of (3) by the step method. The altitude difference between the gaging stations at the ends of the reach is 6.5 m for the Zagyva River. If (y - x) < 6.5 m, an $M_1$ profile must be used; if (y - x) ≥ 6.5 m, an $M_2$ profile must be used.
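The regression shortcut in (4) can be sketched in a few lines. The snippet below is only an illustration, not the authors' program: the function names and the three-section coefficient sets are hypothetical placeholders (the fitted coefficients are not given in this paper), and only the 6.5 m altitude difference is taken from the text.

```python
from typing import List, Sequence, Tuple

# Altitude difference between the two gaging stations (m), from the text.
ALTITUDE_DIFFERENCE_M = 6.5

# One (a_i, b_i, c_i) triple per cross section, as in equation (4).
Coeffs = Tuple[float, float, float]


def profile_type(x: float, y: float) -> str:
    """Return 'M1' when (y - x) is below the altitude difference, else 'M2'."""
    return "M1" if (y - x) < ALTITUDE_DIFFERENCE_M else "M2"


def water_profile(
    x: float,
    y: float,
    coeffs_m1: Sequence[Coeffs],
    coeffs_m2: Sequence[Coeffs],
) -> List[float]:
    """Evaluate h_i = a_i*x + b_i*y + c_i with the regression set of the profile type."""
    coeffs = coeffs_m1 if profile_type(x, y) == "M1" else coeffs_m2
    return [a * x + b * y + c for a, b, c in coeffs]


# Hypothetical coefficients for n = 3 cross sections, chosen only so that
# h_1 = x and h_n = y; they are NOT the fitted Zagyva River values.
EXAMPLE_M1 = [(1.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.0, 1.0, 0.0)]
EXAMPLE_M2 = [(1.0, 0.0, 0.0), (0.4, 0.6, 0.0), (0.0, 1.0, 0.0)]

profile = water_profile(86.0, 91.0, EXAMPLE_M1, EXAMPLE_M2)
```

Each evaluation costs only a few multiplications per cross section, which is what makes repeated evaluation inside the expected-loss integral affordable compared with re-solving (3) each time.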

The economic submodel consists of the construction and maintenance cost functions C(G) and the flood damage functions L(G, H). The cost of a levee design alternative is assumed to be

$$C(G) = \sum_{i=1}^{n} C_i(g_i) \tag{5}$$

where $C_i(g_i)$ is the cost of constructing the levee up to a height $g_i$ at cross section i. Given a levee profile G and a backwater curve H, the damages are defined as [Bogardi et al., 1975]

$$L(G, H) = \sum_{i=1}^{n} l_i(g_i, h_i)\, t_i(G, H) \tag{6}$$

where $l_i(g_i, h_i)$ is the value of the losses if the levee fails at cross section i, with $l_i(g_i, h_i) = 0$ for $h_i \le g_i$.
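Equations (5) and (6), together with the first-overtopped-section rule for the two profile types, can be sketched as follows. This is a minimal illustration under stated assumptions: index 0 is taken to be the confluence (downstream) end, and the cost and loss functions in the example are hypothetical stand-ins, not the fitted Zagyva River data.

```python
from typing import Callable, Optional, Sequence

CostFn = Callable[[float], float]          # C_i(g_i)
LossFn = Callable[[float, float], float]   # l_i(g_i, h_i)


def failure_section(g: Sequence[float], h: Sequence[float], profile: str) -> Optional[int]:
    """Index of the first overtopped cross section, or None if the levee holds.

    Assumption: index 0 is the confluence (downstream) end. M1 profiles are
    scanned upward from downstream, M2 profiles downward from upstream.
    """
    order = range(len(g)) if profile == "M1" else range(len(g) - 1, -1, -1)
    for i in order:
        if h[i] > g[i]:  # overtopping: water level above levee crest
            return i
    return None


def total_cost(g: Sequence[float], cost_fns: Sequence[CostFn]) -> float:
    """Equation (5): sum of the section costs C_i(g_i)."""
    return sum(c(gi) for c, gi in zip(cost_fns, g))


def flood_loss(g: Sequence[float], h: Sequence[float],
               loss_fns: Sequence[LossFn], profile: str) -> float:
    """Equation (6): loss at the single failing section, zero without failure."""
    i = failure_section(g, h, profile)
    return 0.0 if i is None else loss_fns[i](g[i], h[i])


# Hypothetical data for three cross sections (arbitrary money units).
levee = [88.0, 90.0, 92.0]
stages = [87.5, 90.4, 92.3]
cost_fns = [lambda g: 2.0 * (g - 85.0) ** 2] * 3
loss_fns = [lambda g, h, k=k: 100.0 * (k + 1) for k in range(3)]

cost = total_cost(levee, cost_fns)
loss_m1 = flood_loss(levee, stages, loss_fns, "M1")
loss_m2 = flood_loss(levee, stages, loss_fns, "M2")
```

Because the indicator selects exactly one cross section, the sum in (6) reduces to a single term, which is why the sketch returns one loss value rather than accumulating over sections.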

Two modifications of the economic data used by Bogardi et al. [1975] have been introduced: (1) instead of piecewise linear functions, quadratic functions have been fitted to the cost and damage data, and (2) the damage functions have been extrapolated beyond existing data by fitting them to a square root function [Castano, 1976].

THE PROBABILISTIC SUBMODEL

To each pair of stages (x, y) to occur during the life-span of the project there corresponds an optimum levee profile determined by trading off costs and expected flood losses. For the latter quantity to be calculated, the bivariate pdf of (x, y) must be estimated, as is presented next.

Bivariate Partial Duration Series

In the case of a confluence it is natural to expect some degree of correlation between the discharges (and also the stages) of the two rivers. Consequently, simultaneous records are required in order to estimate the joint pdf of flood stages. Bivariate partial duration series consist of all pairs (x, y) such that either stage or both stages exceed the basic levels. The sample space of the bivariate partial duration series can be defined as follows:

$$S = \{(x, y) : x > b_m \ \text{or}\ y > b_t\} \tag{7}$$

and it can be divided into mutually exclusive and collectively exhaustive events, say,

$$A1 = \{(x, y) : x > b_m,\ y < b_t\}$$

$$A2 = \{(x, y) : y > b_t\} \tag{8}$$

where $b_m$ and $b_t$ are the basic stage levels in the main river and the tributary, respectively. The generally used criteria for the selection of the base levels are either to select the lowest yearly flood in the series or, for long records, to select a value such that an average of three to four floods per year are included. It is also important to select the base level so that the flood peaks are separated by substantial recession in stage and discharge [Langbein, 1949]. The conditional pdf's $f_1(x, y \mid A1)$ and $f_2(x, y \mid A2)$ can be estimated from the subsamples corresponding to events A1 and A2, respectively. Those pdf's completely synthesize the hydrology of the problem for the purposes of this study.

The joint partial duration series is available for the Zagyva River for a period of 36 years. This series consists of 68 pairs (x, y), 21 of which correspond to event A1 and 47 of which correspond to event A2, as is illustrated in Figure 2. The base levels are $b_m = 85.95$ m and $b_t = 91.04$ m. This sample space is, in fact, a subset of the set of all possible values of (x, y), and it could be considered as an 'L shaped' tail of a bivariate distribution. To our knowledge, no method exists to estimate the underlying ('complete') distribution from such an L shaped truncated sample. As an alternative, the conditional pdf's $f(x, y \mid A1)$ and $f(x, y \mid A2)$ are estimated.
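The split of the partial duration series into the two events can be sketched as below. The base levels are those quoted in the text; the stage pairs are hypothetical values made up for the example, not the Zagyva record, and assigning a pair with y exactly equal to the tributary base level to A1 is an assumption, since the text leaves that boundary case open.

```python
from typing import List, Optional, Sequence, Tuple

B_M = 85.95  # base level in the main river (m), from the text
B_T = 91.04  # base level in the tributary (m), from the text

Pair = Tuple[float, float]  # (x, y) = (main river stage, tributary stage)


def classify_event(x: float, y: float, b_m: float = B_M, b_t: float = B_T) -> Optional[str]:
    """Return 'A2' if y exceeds b_t, 'A1' if only x exceeds b_m, else None.

    Assumption: a pair with y == b_t and x > b_m is counted as A1.
    """
    if y > b_t:
        return "A2"
    if x > b_m:
        return "A1"
    return None  # the pair lies outside the sample space S


def split_sample(pairs: Sequence[Pair]) -> Tuple[List[Pair], List[Pair]]:
    """Separate the pairs into the A1 and A2 subsamples; other pairs are dropped."""
    a1 = [p for p in pairs if classify_event(*p) == "A1"]
    a2 = [p for p in pairs if classify_event(*p) == "A2"]
    return a1, a2


# Hypothetical stage pairs in meters (illustration only).
pairs = [(87.2, 90.1), (84.0, 92.3), (88.5, 93.0), (83.9, 89.7)]
a1, a2 = split_sample(pairs)
```

The last pair exceeds neither base level, so it belongs to neither subsample; this is the part of the plane that the 'L shaped' sample space leaves out.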

Evaluation of the Expected Damages

Let $N_i$ be the number of events $A_i$ that occur in 1 year and $P_i(N_i = n_i)$, i = 1, 2, be the probability that event $A_i$ occurs $n_i$ times in 1 year. Also, let $S_i$ be the set of pairs (x, y) corresponding to event $A_i$. The following assumptions are made.

1. The numbers of yearly occurrences of the two events are independent; i.e., $P(N_1, N_2) = P_1(N_1) \cdot P_2(N_2)$.

2. The losses are time independent functions of the water stages (x, y) and do not depend on the number of events per year or the time between floods.

3. The flood stages (x, y) are assumed to be independent sample elements from the same family for each type of event, A1 or A2.

With these assumptions it can be proved that the expected value of the total yearly losses TYL is [Castano, 1976]

$$E[TYL] = E_1[L]\, E[N_1] + E_2[L]\, E[N_2] \tag{9}$$
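As a small numerical illustration of (9), the sketch below combines conditional expected losses with mean yearly event counts. Estimating $E[N_i]$ as the number of recorded events divided by the 36 years of record is an assumption made here for the example (the paper does not state its estimator at this point), and the two conditional expected losses are hypothetical figures in arbitrary money units.

```python
YEARS_OF_RECORD = 36  # length of the joint series, from the text
COUNT_A1 = 21         # recorded A1 events, from the text
COUNT_A2 = 47         # recorded A2 events, from the text


def mean_events_per_year(count: int, years: int) -> float:
    """Sample-mean estimate of E[N_i] (an assumed estimator)."""
    return count / years


def expected_total_yearly_loss(
    mean_loss_a1: float, mean_loss_a2: float, rate_a1: float, rate_a2: float
) -> float:
    """Equation (9): E[TYL] = E_1[L] E[N_1] + E_2[L] E[N_2]."""
    return mean_loss_a1 * rate_a1 + mean_loss_a2 * rate_a2


rate_a1 = mean_events_per_year(COUNT_A1, YEARS_OF_RECORD)
rate_a2 = mean_events_per_year(COUNT_A2, YEARS_OF_RECORD)

# Hypothetical conditional expected losses per event.
e_tyl = expected_total_yearly_loss(36.0, 72.0, rate_a1, rate_a2)
```

With these made-up losses the two terms contribute 21 and 94 units, so the more frequent event A2 dominates the yearly total even before the size of its losses is considered.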

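Thus the expected damages can be computed by means of the conditional pdf's only, and the goal function (2) becomes

$$\min_{(g_m, g_t)} \left\{ \sum_{k=1}^{n} C_k[g_k(g_m, g_t)] + \sum_{i=1}^{2} E[N_i] \iint L[G(g_m, g_t), H(x, y)]\, f_i(x, y \mid A_i)\, dx\, dy \right\} \tag{10}$$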

The above assumptions are made for computational convenience. Assumption 1, although it is restrictive, is necessary to be able to separate the expected losses due to each event, as is shown in (9). A more generalized model should use the joint pdf $P(N_1, N_2)$ without assuming independence. In the next section the problem of choosing an appropriate pdf $f_i(x, y)$ is considered.

TABLE 1. Results of Bivariate $\chi^2$ Tests for Candidate pdf's

pdf                             Abbreviation   $\chi^2$ Statistic   Degrees of Freedom   Reject

Product of Independent Marginals, Event A1: $f_x(x \mid A1) \cdot f_y(y \mid A1)$
Exponential-log normal          E-L             6.622               4                    no
Exponential-exponential         E-E             6.518               5                    no
Exponential-gamma               E-G             4.715               4                    no
Exponential-beta                E-B             4.069               4                    no
Beta-log normal                 B-L             5.296               3                    no
Beta-exponential                B-E             5.253               4                    no
Beta-gamma                      B-G             3.389               3                    no
Beta-beta                       B-B             2.872               3                    no

Bivariate pdf's, Event A1: $f(x, y \mid A1)$
Bivariate log normal            BVL             5.715               2                    no
Double gamma                    DBG             4.272               2                    no

Product of Independent Marginals, Event A2: $f_x(x \mid A2) \cdot f_y(y \mid A2)$
Normal-log normal               N-L            13.752               9                    no
Normal-truncated log normal     N-TL            9.214               9                    no
Normal-gamma                    N-G            10.374               9                    no
Normal-truncated normal         N-TN           10.064               9                    no

Bivariate pdf's, Event A2: $f(x, y \mid A2)$
Bivariate normal                BVN             8.736               8                    no
Bivariate log normal            BVL            14.293               8                    no

SELECTION OF THE HYDROLOGIC SUBMODEL

Since extreme events are of interest, the model uncertainty is large. Furthermore, the difference between estimates of extreme values given by two different pdf's is amplified because it is multiplied by large losses associated with high flood levels.

Model Selection From Sample Information

Given a sample of flood events, the problem is to determine the pdf which models as well as possible the phenomenon under study, that is, the pdf which 'best' predicts the relevant aspects of the process in order that decisions based on such predictions may be made.

A common approach is to assume that the 'true' model belongs to one of the well-known families of distributions, such as the normal, log normal, or gamma. Once a set of candidate families has been decided upon, the parameters of one pdf of each family can be obtained by estimation methods such as the maximum likelihood or the method of moments.

Then, the selection of the best pdf can be based, for example, on how well it fits the sample. In recent articles, Slack et al. [1975] and Wallis et al. [1976] use what is called the 'optimal assumed distribution.' It is the distribution that minimizes the expected design losses.

In regard to model choice, Wood et al. [1974, p. 27] raise an important point: 'Most hydrologic processes are so complex that no model yet devised may be the true model or that no hydrologic events follow one particular model.' Consequently, it could be reasonably expected that a combination of models, for example, a weighted sum of individual pdf's from different families, would better 'explain' the hydrologic process than does a unique pdf. A composite model similar to the one used by Wood et al. [1974] will also be used here.
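A weighted sum of pdf's of this kind can be sketched as follows. The helper name and the two toy densities on the unit square are hypothetical and serve only to show the mechanics; they are not candidate models from this study.

```python
from typing import Callable, Sequence

Pdf = Callable[[float, float], float]  # bivariate density f(x, y)


def composite_pdf(pdfs: Sequence[Pdf], weights: Sequence[float]) -> Pdf:
    """Return the mixture density sum_j w_j * f_j(x, y).

    The weights are expected to be nonnegative and to sum to one.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to one")

    def pdf(x: float, y: float) -> float:
        return sum(w * f(x, y) for w, f in zip(weights, pdfs))

    return pdf


# Two toy densities on the unit square (illustration only).
def uniform(x: float, y: float) -> float:
    return 1.0


def product_xy(x: float, y: float) -> float:
    return 4.0 * x * y


mixture = composite_pdf([uniform, product_xy], [0.25, 0.75])
value = mixture(1.0, 1.0)  # 0.25 * 1 + 0.75 * 4
```

Because integration is linear, the expected loss under the composite pdf equals the same weighted sum of the expected losses under the individual pdf's, so a design evaluation already carried out for each candidate can be reused without a new integration.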

Candidate pdf's

After consideration of the marginal sample histograms it was decided to fit a number of pdf's to the sample at hand. Since the sample correlation coefficients are small (less than 0.10), either bivariate pdf's or products of independent marginal pdf's can be used. The pdf's used and the equations for the estimation of parameters are presented in Appendix 1. The following bivariate pdf's were estimated: for $f(x, y \mid A1)$, double gamma and bivariate log normal; for $f(x, y \mid A2)$, bivariate normal and bivariate log normal. Several univariate marginals were estimated, but only those that cannot be rejected at the 5% level by using the $\chi^2$ test are mentioned here: for $f_x(x \mid A1)$, log normal, exponential, gamma, and beta; for $f_y(y \mid A1)$, log normal, exponential, gamma, and beta; for $f_x(x \mid A2)$, log normal, gamma, and normal; for $f_y(y \mid A2)$, log normal, truncated log normal, gamma, and truncated normal. Details of estimation and goodness-of-fit tests can be found in the report by Castano [1976].

The results of bivariate $\chi^2$ tests for bivariate pdf's and products of univariate marginals are shown in Table 1 along with the abbreviated distribution names to be used subsequently, such as N-G for 'normal-gamma.' None of the distributions can be rejected at the 5% level. The sample size is too small to make any valid inference, so that only an illustration of the methodology can be provided.

Ranking the Candidate Distributions

Let us seek a ranking of the pdf's whose parameters have been estimated from a common historical sample. Each pdf belongs to a different family, and when its parameters have been estimated by the method of maximum likelihood, it is the most likely source of the data in the sample, given that the model space is restricted to that particular family. When they are confronted with the data, some of the candidates show such poor fits that they may be discarded at once. More powerful tools are required, however, to discriminate between those models that fit the sample equally well. Two selection criteria are used here, and their performances are compared. These methods are the likelihood of the goodness-of-fit statistic and the sample likelihoods.

TABLE 2. Probability of the $\chi^2$ Statistic for Different pdf's

pdf      $\chi^2$ Statistic   Probability

$f(x, y \mid A1)$*
E-L       6.482               0.0397
E-E       5.793               0.0530
E-G       4.721               0.0818
E-B       3.918               0.1113
B-L       5.300               0.0649
B-E       4.637               0.0846
B-G       3.389               0.1349
B-B       2.872               0.1608
BVL       7.011               0.0317
DBG       4.714               0.0820

$f(x, y \mid A2)$†
N-L      12.79                0.0364
N-TL      7.979               0.0979
N-G       9.625               0.0755
N-TN      8.941               0.0852
BVN       8.736               0.0880
BVL      14.293               0.0240

*Three degrees of freedom.
†Eight degrees of freedom.

Most likely value of the goodness-of-fit statistic. Benjamin and Cornell [1970] suggest the use of the $\chi^2$ statistic as a tool for model choice by selecting the model 'for which the likelihood of the observed value of the corresponding closeness-of-fit statistic is largest.' Such a value is more related to the mode of the $\chi^2$ pdf than to the minimum value of the statistic. This criterion is based on the fact that for 3 or more degrees of freedom the minimum value of the $\chi^2$ statistic is not the most likely outcome. In order to compare models by using this criterion it is necessary to have the same number of degrees of freedom in the goodness-of-fit test for all models.

The $\chi^2$ goodness-of-fit test for bivariate distributions is similar to the univariate test. Given a sample of N pairs $(x_i, y_i)$, we test the hypothesis

$$H_0: f(x, y) = f_0(x, y)$$

$$H_1: f(x, y) \neq f_0(x, y)$$

as follows. Divide the sample space into k cells such that no cell has fewer than one sample point and no more than 20% of the cells have fewer than five sample points. If $p_i$ is the number of observed sample points in cell i and $e_i$ is the expected number of points in cell i given that $H_0$ is true, then the statistic

$$Q = \sum_{i=1}^{k} \frac{(p_i - e_i)^2}{e_i} \tag{11}$$

is approximately distributed as $\chi^2$ with (k - 1) degrees of freedom. If parameters of $f_0(x, y)$ are estimated from the sample, 1 degree of freedom is lost for each estimated parameter.
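The statistic, the degrees-of-freedom count, and the cell-occupancy rule described above can be sketched as follows. The function names and the cell counts in the example are hypothetical; the counts were chosen only so that they add up to the 21 points of the A1 subsample.

```python
from typing import Sequence


def chi_square_statistic(observed: Sequence[float], expected: Sequence[float]) -> float:
    """Q = sum over cells of (observed - expected)^2 / expected, as in (11)."""
    return sum((p - e) ** 2 / e for p, e in zip(observed, expected))


def degrees_of_freedom(n_cells: int, n_estimated_params: int) -> int:
    """(k - 1) minus one degree of freedom per parameter estimated from the sample."""
    return n_cells - 1 - n_estimated_params


def cells_acceptable(observed: Sequence[float]) -> bool:
    """No empty cell, and at most 20% of the cells with fewer than five points."""
    if any(p < 1 for p in observed):
        return False
    sparse = sum(1 for p in observed if p < 5)
    return sparse <= 0.2 * len(observed)


# Hypothetical cell counts for four cells and 21 sample points.
observed = [6, 5, 5, 5]
expected = [5.25, 5.25, 5.25, 5.25]

q = chi_square_statistic(observed, expected)
dof = degrees_of_freedom(n_cells=6, n_estimated_params=2)
```

The occupancy check makes the small-sample difficulty concrete: with 21 points, adding cells to gain degrees of freedom quickly leaves too many cells with fewer than five points.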

Sample likelihoods. Another intuitively appealing approach is to rank the models by the likelihood of the observed sample under each model. For a sample of N pairs $(x_i, y_i)$ the sample likelihood given $f_0(x, y)$ is

$$f_0(x_1, y_1)\, f_0(x_2, y_2) \cdots f_0(x_N, y_N)$$

For a set of candidate pdf's in which each belongs to a different family and for which the parameters have been estimated by the method of maximum likelihood, the pdf chosen by using this criterion would be the pdf with the maximum sample likelihood among all possible pdf's (different values of the parameters) in all the families considered. As will be shown later, the pdf so chosen has the maximum posterior probability of being the true model when the prior distribution for the true model is uniform.
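Weights proportional to the sample likelihoods can be computed as sketched below. Working with log-likelihoods and subtracting the largest one before exponentiating is a numerical choice made here (it avoids underflow when many small density values are multiplied) and is not described in the paper; the two toy densities and the two-point sample are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Pdf = Callable[[float, float], float]
Pair = Tuple[float, float]


def log_likelihood(sample: Sequence[Pair], pdf: Pdf) -> float:
    """Logarithm of the product f(x_1, y_1) * ... * f(x_N, y_N)."""
    return sum(math.log(pdf(x, y)) for x, y in sample)


def likelihood_weights(log_liks: Sequence[float]) -> List[float]:
    """Weights proportional to the likelihoods, normalized to sum to one."""
    largest = max(log_liks)
    scaled = [math.exp(v - largest) for v in log_liks]  # largest term becomes 1
    total = sum(scaled)
    return [s / total for s in scaled]


# Toy candidates on the unit square and a toy sample (illustration only).
def uniform(x: float, y: float) -> float:
    return 1.0


def product_xy(x: float, y: float) -> float:
    return 4.0 * x * y


sample = [(0.5, 0.5), (0.8, 0.9)]
log_liks = [log_likelihood(sample, f) for f in (uniform, product_xy)]
weights = likelihood_weights(log_liks)
```

Here the likelihoods are 1 and 2.88, so the second toy model receives a weight of 2.88/3.88; the same function still returns sensible weights when the raw likelihoods are far too small to be represented directly.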

Application to the Zagyva River

Bivariate $\chi^2$ statistics were recomputed for the nonrejected pdf's in order to have the same number of degrees of freedom in each group. The $\chi^2$ statistics for the candidates of $f(x, y \mid A1)$ have 3 degrees of freedom, and those for the candidates of $f(x, y \mid A2)$ have 8 degrees of freedom. Table 2 presents the $\chi^2$ statistic for each candidate and its corresponding value of the ordinate in the $\chi^2$ pdf with 3 and 8 degrees of freedom for $f(x, y \mid A1)$ and $f(x, y \mid A2)$, respectively. According to Table 2 the models beta-beta for $f(x, y \mid A1)$ and normal-truncated log normal for $f(x, y \mid A2)$ should be selected as the best models.
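The ordinates in Table 2 can be checked with the standard $\chi^2$ density, as in the sketch below. The statistics are those listed in Table 2; the function and variable names are illustrative choices, and only the Python standard library is used.

```python
import math


def chi2_pdf(q: float, dof: int) -> float:
    """Ordinate of the chi-square pdf with `dof` degrees of freedom at q > 0."""
    half = dof / 2.0
    return q ** (half - 1.0) * math.exp(-q / 2.0) / (2.0 ** half * math.gamma(half))


# Chi-square statistics from Table 2.
STATISTICS_A1 = {
    "E-L": 6.482, "E-E": 5.793, "E-G": 4.721, "E-B": 3.918, "B-L": 5.300,
    "B-E": 4.637, "B-G": 3.389, "B-B": 2.872, "BVL": 7.011, "DBG": 4.714,
}
STATISTICS_A2 = {
    "N-L": 12.79, "N-TL": 7.979, "N-G": 9.625,
    "N-TN": 8.941, "BVN": 8.736, "BVL": 14.293,
}

ordinates_a1 = {name: chi2_pdf(q, 3) for name, q in STATISTICS_A1.items()}
ordinates_a2 = {name: chi2_pdf(q, 8) for name, q in STATISTICS_A2.items()}

# The criterion selects the candidate whose statistic has the largest ordinate.
best_a1 = max(ordinates_a1, key=ordinates_a1.get)
best_a2 = max(ordinates_a2, key=ordinates_a2.get)
```

The $\chi^2$ pdf with $\nu \ge 2$ degrees of freedom peaks at $\nu - 2$, that is, at 1 and 6 here. Every statistic in Table 2 lies above the mode of its group, so in this example the largest ordinate coincides with the smallest statistic; the two rankings would differ only for statistics falling below the mode.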

Weights proportional to their sample likelihood were computed for each model and are presented in Table 3. According to this table the models beta-log normal for $f(x, y \mid A1)$ and normal-gamma for $f(x, y \mid A2)$ should be selected as the best models.

Comparison of the Two Methods

Figure 3 presents the relative ranking of the pdf's for the two choice criteria. The ratings shown were obtained in the follow- ing manner. For each set of pdf's and for each choice criterion a rating between 0 and I was obtained for each element of the set by dividing its index (x 2 probability of sample likelihood weight) by the greatest index in the set for the criterion consid- ered. Figure 3 shows that the two choice criteria disagree

,

TABLE 3. Weights for Candidate pdf's Based on Sample Likelihoods

pdf                                 Weight

$f(x, y \mid A_1)$*
  Exponential-log normal            0.0167
  Exponential-exponential           0.0025
  Exponential-gamma                 0.0123
  Exponential-beta                  0.0084
  Beta-log normal                   0.3721
  Beta-exponential                  0.0536
  Beta-gamma                        0.2657
  Beta-beta                         0.1769
  Bivariate log normal              0.0709
  Double gamma                      0.0210

$f(x, y \mid A_2)$†
  Normal-log normal                 0.1433
  Normal-truncated log normal       0.1296
  Normal-gamma                      0.4586
  Normal-truncated normal           0.0173
  Bivariate normal                  0.2496
  Bivariate log normal              0.0018

*Normalizing factor is $\sum L_i$ = 4.37551 × 10^-•4.
†Normalizing factor is $\sum L_i$ = 9.80246 × 10^-5•.


Fig. 3. Relative ranking of candidate pdf's. [Two panels, $f(x, y \mid A_1)$ and $f(x, y \mid A_2)$; each shows the relative ratings (0 to 1) of the candidates under the $\chi^2$ criterion and under the sample likelihood criterion.]

substantially. Two factors seem to have contributed to this result, the first being the small sample size used, especially for event $A_1$. Since 1 degree of freedom is lost for each parameter estimated from the sample, to compute the $\chi^2$ statistic for a five-parameter distribution would require at least six cells to be defined in the sample space. For a sample of 21 points the requirement of having at least five sample points per cell cannot be met, since six such cells would need at least 30 points. This situation makes the results of the $\chi^2$ tests rather unreliable. Again, the example is for illustrative purposes only. A second factor that may bias the results of the $\chi^2$ test is the arbitrariness in the definition of the cells. It could happen that a given arrangement of cells might favor a particular pdf or group of pdf's and disfavor others. Also, the sample likelihoods appear to be extremely sensitive to errors in the normalizing constants (commonly used in the truncated models). A 1% error in such constants, making the computed probabilities 1% larger, will double the sample likelihood in a sample of 70 points. This will erroneously increase the degree of importance of a model.
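The sensitivity to the normalizing constant follows from the fact that the sample likelihood is a product of one density value per sample point, so a common relative error is raised to the power of the sample size. The short sketch below checks the figure quoted above; the function name is introduced here for illustration only.

```python
def likelihood_inflation(relative_error: float, n_points: int) -> float:
    """Factor by which a sample likelihood is multiplied when every
    density value is overstated by `relative_error` (0.01 means 1%)."""
    return (1.0 + relative_error) ** n_points


# Sample sizes mentioned in the text: 70 points and 21 points.
factor_70 = likelihood_inflation(0.01, 70)
factor_21 = likelihood_inflation(0.01, 21)

print(f"n = 70: likelihood multiplied by {factor_70:.3f}")
print(f"n = 21: likelihood multiplied by {factor_21:.3f}")
```

For 70 points the factor is very close to 2, in agreement with the statement above; for the smaller sample of 21 points the same 1% error inflates the likelihood by roughly a quarter.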

It also appears from the results that the sample likelihoods have a better ability to discriminate between models. This is well illustrated in Table 3, where for $f(x, y \mid A_1)$, only three models out of 10 have a weight greater than 0.10. This ability to reduce the choice set in a very definite manner is a desirable feature in a model selection procedure.

When models are ranked to make a selection, it is possible that two or more models rank very close together. When the sample likelihoods are used, the weights so obtained are actually the posterior probabilities that each model is the true one. In such a case a composite model could be used rather than a more or less arbitrary choice of a unique pdf.

A COMPOSITE MODEL

Wood et al. [1974] have used the sample likelihoods to account for model uncertainty within a Bayesian framework. They formulated the following composite model:

$$f_c(q) = \frac{\sum_i K_i \, P'(\theta_i = 1) \, f_i^*(q)}{\sum_j K_j \, P'(\theta_j = 1)} \tag{12}$$

where $f_i^*(q)$ is the 'Bayesian distribution' given model $i$, $\theta_i = 1$ if $f_i$ is the true model and $\theta_i = 0$ otherwise, $P'(\theta_i = 1)$ is the prior probability that $f_i$ is the true model, and $K_i$ is the marginal likelihood function of the observations for model $i$. For model parameters $a$ with prior pdf $f'(a)$ and a sample $q = (q_1, \ldots, q_N)$,

$$K_i = \int \prod_{j=1}^{N} f_i(q_j \mid a) \, f'(a) \, da \tag{13}$$

A complete Bayesian approach (i.e., taking into account the sample uncertainty) has not been attempted here because of the complexity of the Bayesian distribution in a bivariate case. Instead, a composite model is developed as it was by Wood et al. [1974] but without the parameter uncertainty being taken into account. Let

$$f(x \mid \theta) = \sum_{n=1}^{m} \theta_n f_n(x) \tag{14}$$

so that the composite model has the form

$$f_c(x) = \int f(x \mid \theta) \, f(\theta) \, d\theta \tag{15}$$

where $\theta$ is an $m$ vector such that $\theta_n = 1$ if $f_n(x)$ is the true model and $\theta_n = 0$ otherwise and

$$\sum_{n=1}^{m} \theta_n = 1$$

Notice that this definition reduces the possible values of $\theta$ to 100...0, 010...0, 001...0, ..., 000...1. The likelihood function given a sample $x$ is

$$L(x \mid \theta) = \prod_{j=1}^{N} f(x_j \mid \theta) = \sum_{n=1}^{m} \theta_n L_n(x) \tag{16}$$

In (16) there are no cross products because of the restrictions imposed on $\theta$. Also, $L_n(x)$ is the sample likelihood for model $n$.

Let the prior pdf of $\theta$ be

$$f'(\theta) = \sum_{i=1}^{m} \theta_i \, P'(\theta_i = 1) \tag{17}$$

where $P'(\theta_i = 1)$ is the prior probability of model $i$. Given a sample $x = (x_1, x_2, \ldots, x_N)$, the prior pdf of $\theta$ can be updated by using Bayes' theorem:

$$f''(\theta \mid x) = \frac{f(x \mid \theta) \, f'(\theta)}{f(x)} = \frac{L(x \mid \theta) \, f'(\theta)}{\int f(x \mid \theta) \, f'(\theta) \, d\theta} = \frac{\sum_i \theta_i L_i(x) \, P'(\theta_i = 1)}{\sum_i L_i(x) \, P'(\theta_i = 1)} = \sum_i \theta_i \, P''(\theta_i = 1)$$

where

$$P''(\theta_i = 1) = \frac{L_i(x) \, P'(\theta_i = 1)}{\sum_j L_j(x) \, P'(\theta_j = 1)} \tag{18}$$

are the posterior model probabilities. The composite model can now be updated by substituting the posterior pdf of $\theta$ into (15):

$$f_c(x) = \int f(x \mid \theta) \, f''(\theta \mid x) \, d\theta = \int \Big[\sum_i \theta_i f_i(x)\Big] \Big[\sum_i \theta_i \, P''(\theta_i = 1)\Big] d\theta = \sum_{i=1}^{m} f_i(x) \, P''(\theta_i = 1) \tag{19}$$

Fig. 4. Optimum design alternatives for all models. The quantity in parentheses is the total yearly cost in million forints. [Axes: $g_m$ (meters), 90 to 93, horizontal; $g_t$ (meters), 94 to 97, vertical. Plotted optima: N-L, composite (18.46), N-G (15.24), BVN, N-TL (14.37).]

Thus the composite model is a linear combination of the candidate models, weighted by their posterior probabilities. In the case of equal prior model probabilities $P'(\theta_i = 1) = P'$ (reflecting perhaps a state of total ignorance about the true model) the posterior model probabilities become

$$P''(\theta_i = 1) = \frac{L_i(x)}{\sum_j L_j(x)} \tag{20}$$

that is, proportional to the sample likelihood given that each model is the true model.
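The posterior model probabilities in (18) and (20) can be computed as in the following minimal sketch. Because sample likelihoods are products of many small density values (the normalizing factors in Table 3 are extremely small), the sketch works with log-likelihoods and rescales before exponentiating. The function name and the log-likelihood values are hypothetical and are not taken from the case study.

```python
import math


def posterior_model_probabilities(log_likelihoods, priors=None):
    """Posterior probability of each candidate model, as in (18).

    log_likelihoods: log of the sample likelihood L_i(x) of each model.
    priors: prior model probabilities P'(theta_i = 1); equal priors
            are assumed when omitted, which reduces (18) to (20).
    """
    m = len(log_likelihoods)
    if priors is None:
        priors = [1.0 / m] * m
    # log of L_i(x) * P'(theta_i = 1); a zero prior excludes the model.
    log_terms = [
        ll + math.log(p) if p > 0 else -math.inf
        for ll, p in zip(log_likelihoods, priors)
    ]
    shift = max(log_terms)  # rescale to avoid underflow in exp()
    unnormalized = [math.exp(t - shift) for t in log_terms]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical log-likelihoods for three candidate pdf's.
hypothetical_loglik = [-120.3, -121.0, -119.5]
posterior = posterior_model_probabilities(hypothetical_loglik)
print([round(p, 4) for p in posterior])
```

With equal priors the result depends only on the differences between log-likelihoods, which is why the common rescaling by the largest term leaves the weights unchanged.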

By disregarding the models with weights of less than 0.05 in Table 3 and readjusting the weights so that they add up to 1.00 the following composite models are obtained:

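$$f_c(x, y \mid A_1) = 0.42 f_{\text{B-L}} + 0.30 f_{\text{B-G}} + 0.20 f_{\text{B-B}} + 0.08 f_{\text{BVL}}$$

$$f_c(x, y \mid A_2) = 0.15 f_{\text{N-L}} + 0.47 f_{\text{N-G}} + 0.25 f_{\text{BVN}} + 0.13 f_{\text{N-TL}} \tag{21}$$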

where the subindices are the abbreviations used to label the candidate models as defined in Table 2.
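The readjustment of the weights can be reproduced from Table 3. The sketch below does so for the candidates of the pdf conditional on event A2 and shows how a composite pdf would then be evaluated as a weighted sum. The function names are introduced here for illustration, and the label "N-TN" for the normal-truncated normal model is an assumed abbreviation.

```python
# Table 3 weights for the candidates conditional on event A2.
TABLE3_A2 = {
    "N-L": 0.1433,   # normal-log normal
    "N-TL": 0.1296,  # normal-truncated log normal
    "N-G": 0.4586,   # normal-gamma
    "N-TN": 0.0173,  # normal-truncated normal (assumed label)
    "BVN": 0.2496,   # bivariate normal
    "BVL": 0.0018,   # bivariate log normal
}


def composite_weights(weights, threshold=0.05):
    """Drop models below `threshold` and rescale the rest to sum to 1."""
    kept = {name: w for name, w in weights.items() if w >= threshold}
    total = sum(kept.values())
    return {name: w / total for name, w in kept.items()}


def composite_pdf(x, y, pdfs, weights):
    """Weighted sum of candidate pdf's; `pdfs` maps a label to a
    callable pdf(x, y) supplied by the user."""
    return sum(w * pdfs[name](x, y) for name, w in weights.items())


weights_a2 = composite_weights(TABLE3_A2)
print({name: round(w, 2) for name, w in weights_a2.items()})
```

Rounded to two decimals, the result matches the coefficients of the second line of (21). Applied to the candidates conditional on event A1, a strict 0.05 cut-off would also retain the beta-exponential model (weight 0.0536), which (21) omits, so only the A2 case is reproduced here.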

OPTIMAL SOLUTION AND SENSITIVITY TO PDF

For this example it was found that the minimum total yearly cost was not affected by the type of pdf used for $f(x, y \mid A_1)$ but varied considerably with $f(x, y \mid A_2)$. The main reason for this is that the contribution of $f(x, y \mid A_1)$ to the expected losses is very small in relation to that of $f(x, y \mid A_2)$ [Castano, 1976, p. 77]. The model was then run for fixed $f(x, y \mid A_1)$, for which B-B was used, combined with four cases of $f(x, y \mid A_2)$, for which N-G, BVN, N-L, and N-TL were used. The optimal solutions for these four combinations and also for the composite model are shown in Figure 4.

The high variability of $g_t$, observed in Figure 4, can be explained by further breaking down the sensitivity to $f(x, y \mid A_2)$. As a result of the selection procedure, it turned out that the marginal pdf of main stages, $f(x \mid A_2)$, is always normal, whereas $f(y \mid A_2)$, the marginal pdf for tributary stages, is in turn gamma (G), bivariate normal (BVN), log normal (L), and truncated log normal (TL). A change in the marginal pdf affects more directly its corresponding end of the reach; this is mainly the reason why $g_m$ stays around 91.5 while $g_t$ varies between 96.3 and 96.6 (cf. Figure 4).

The long tail for the y marginal in the N-L model appears to be the reason why the optimum for this model is so separated from the other models. As a matter of illustration, the probability of stages exceeding 95 m in the tributary (y > 95 m) is 0.0006 according to the N-G model and 0.0118 according to the N-L model, that is, about 20 times larger. When these small probabilities are multiplied by the damages, a considerable difference in the expected losses is obtained.

At a first glance it may appear that the only floods to be protected against are due to M• water profiles (gt -gm < 6.5 m for all cases in Figure 4). However, the optimum levee profile is high enough to protect against most M•. and M• floods. The shape of the levee is the result of the interaction among all the components of the problem, especially construction costs and flood losses at each cross section and the pdf's of flood stages. In this example the high levee profile toward the confluence (hence the type M• profile) seems to be due to high flood losses associated with levee failure near the confluence [Castano, 1976, p. 52].

DISCUSSION AND CONCLUSIONS

The selection of a bivariate pdf of water stages based on the partial duration series has been attempted here in the context of the minimization of the expected flood losses. Several points are worth mentioning now.

The use of water stages instead of discharges presented some problems for the distribution fitting and may have affected the final decision because the cross-sectional area increases more than linearly as the water rises. Hence even if the pdf of the discharges has a long tail, the pdf of the stages may have a shorter tail (if any). The fitting of exponential type distributions to water stages may, consequently, erroneously increase the expected damages. It seems that a better approach would be, when possible, to obtain the distribution of the discharges and then to use the stage-discharge relationship to determine the water levels.
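The shortening of the tail when passing from discharges to stages can be illustrated with a small sketch. It rests on two assumptions that are not part of the case study: discharges Q are log normal, and the stage-discharge relationship is a power law Q = c h^b with exponent b > 1 (h measured above a base level). Under these assumptions log h is normal with standard deviation reduced by the factor b. All numerical values are hypothetical.

```python
import math
from statistics import NormalDist


def tail_ratio(log_sd: float, p: float = 0.999) -> float:
    """Ratio of the p-quantile to the median of a log normal variable
    whose logarithm has standard deviation `log_sd`."""
    z = NormalDist().inv_cdf(p)
    return math.exp(z * log_sd)


log_sd_discharge = 0.8  # hypothetical spread of log discharge
rating_exponent = 1.6   # hypothetical exponent b in Q = c * h**b

ratio_q = tail_ratio(log_sd_discharge)                    # discharges
ratio_h = tail_ratio(log_sd_discharge / rating_exponent)  # stages

print(f"discharge: 99.9% quantile / median = {ratio_q:.2f}")
print(f"stage:     99.9% quantile / median = {ratio_h:.2f}")
```

The extreme quantile lies much closer to the median for stages than for discharges, which is the effect described above: a long-tailed pdf fitted directly to stages may overstate the probability of very high water levels.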

It was initially attempted to estimate a single bivariate pdf from an L shaped tail (cf. Figure 2). This proved to be a difficult estimation problem; only the treatment of very simple cases (linear truncations, univariate pdf's, etc.) could be found in the literature examined. The division of the sample space and the use of conditional distributions constitute, however, another valid approach and satisfy completely the require- ments of the present problem.

The location parameters of some distributions were assumed to be known in order to simplify the parameter estimation in the example of the Zagyva River. For example, the base levels $x_L = 85.95$ and $y_L = 91.04$ were used as the location parameters for the log normal, exponential, gamma, and beta distributions.

The deviation of the optimal decision and the discounted total yearly costs obtained for the log normal model in the case of the Zagyva River clearly illustrate the importance of model selection for decision making based on extreme hydrologic events.

Although the model selection procedures used here were relatively successful in reducing the choice set, the discrepancy between the two methods suggests the need for additional study (for example, using simulation) of the performance and accuracy of such methods. It also suggests the need for extreme caution and awareness of the possible pitfalls in their application. In the example considered, such discrepancy leads to optimum designs whose initial investment differs by 6 million forints [Castano, 1976] (one forint is approximately US $0.05). This represents a substantial proportion of the total yearly costs shown in Figure 4.


The ranking of the candidate pdf's may be said to reduce the model uncertainty in the sense that it imposes some preference structure on the candidate set. However, the ranking of the models and the subsequent selection of the best pdf neither eliminate nor take into account the model uncertainty. When decision making is based on expected value criteria, as is the case here, the use of the composite model seems a more reasonable alternative because it does take into account the model uncertainty. In the composite model the influence of each candidate pdf on the optimal decision is proportional to the degree of certainty (likelihood) that it is the true model.
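The way each candidate enters an expected value criterion can be written out explicitly. As a sketch, assume the loss $L(G, H)$ for a fixed levee profile $G$ is integrated over the domain $S$ of water stages, and let $E_i[\cdot]$ (a symbol introduced here for illustration) denote the expectation under candidate $f_i$. The linearity of (19) then gives:

```latex
\begin{aligned}
E_c[L(G, H)] &= \int_S L(G, H)\, f_c(x, y)\, dx\, dy \\
             &= \sum_{i=1}^{m} P''(\theta_i = 1) \int_S L(G, H)\, f_i(x, y)\, dx\, dy \\
             &= \sum_{i=1}^{m} P''(\theta_i = 1)\, E_i[L(G, H)]
\end{aligned}
```

For any fixed design the composite expected loss is therefore the posterior-weighted average of the candidates' expected losses. The optimum design under the composite model is still obtained by minimizing this weighted sum together with the construction cost and need not coincide with the optimum of any single candidate.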

The following points can be concluded from this work.

1. When expected damages of extreme events are sought, the problem of estimating a bivariate pdf from an L shaped truncated sample may be overcome by proper subdivision of the sample space and the use of conditional pdf's.

2. The two model selection procedures presented here can be successfully applied to reduce the set of candidate pdf's. However, the ordering of the distributions is different for the two criteria.

3. Model selection by sample likelihoods is very sensitive to numerical errors in the normalizing constant of a distribution.

4. In the example considered, the best-ranking distributions are $f(x, y \mid A_1)$ = beta-beta and $f(x, y \mid A_2)$ = normal-truncated log normal for the likelihood of the $\chi^2$ statistic criterion and $f(x, y \mid A_1)$ = beta-log normal and $f(x, y \mid A_2)$ = normal-gamma under the sample likelihoods criterion. The composite model, which is a linear combination of the candidate pdf's, is given in (21).

5. For decisions based on expected values the composite model represents an appealing alternative because it takes into account the model uncertainty. In the case of flood protection for the Zagyva River the decision reached with the composite model lies within the range of variation of decisions obtained with the component models.

6. When the composite model is used, the optimum levee design has a height (above sea level) of 95.0 m at the tributary and 91.5 m at the main river. Its initial construction cost has been computed as 145.26 million forints, and its expected total yearly cost is 18.46 million forints.

7. According to the model formulated here the total yearly cost of the levee system for the Zagyva River is sensitive to the type of distribution used for the joint probability of flood stages conditional upon event $A_2$ and is not sensitive to the distribution of flood stages given event $A_1$.

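APPENDIX: DISTRIBUTIONS USED

The functional form and the equations for the estimation of parameters of the distributions used are now presented. Unless otherwise indicated, the information has been taken from Johnson and Kotz [1970a, b, 1972]. Also, n is the sample size. The superscript # is used to denote the estimated parameters.

Log Normal Distribution

$$f(x; \theta, \mu, \sigma) = [(2\pi)^{0.5} (x - \theta)\, \sigma]^{-1} \exp\{-0.5\, [\log (x - \theta) - \mu]^2 / \sigma^2\} \qquad x > \theta \tag{A1}$$

Estimation with $\theta$ given

$$\mu^{\#} = n^{-1} \sum \log (x_i - \theta) \qquad \sigma^{\#} = \Big(n^{-1} \sum [\log (x_i - \theta) - \mu^{\#}]^2\Big)^{0.5} \tag{A2}$$

Exponential Distribution

$$f(x; \lambda, \theta) = \lambda^{-1} \exp[-(x - \theta)/\lambda] \qquad x > \theta \quad \lambda > 0 \tag{A3}$$

Estimation with $\theta$ given

$$\lambda^{\#} = n^{-1} \sum (x_i - \theta) \tag{A4}$$

Gamma Distribution

$$f(x; \alpha, \beta, \gamma) = \frac{(x - \gamma)^{\alpha - 1}}{\beta^{\alpha}\, \Gamma(\alpha)} \exp[-(x - \gamma)/\beta] \qquad x > \gamma \quad \alpha > 0 \quad \beta > 0 \tag{A5}$$

Estimation with $\gamma$ given

$$n^{-1} \sum \log (x_i - \gamma) = \log \beta^{\#} + \psi(\alpha^{\#})$$

$$\log \Big[n^{-1} \sum (x_i - \gamma)\Big] = \log (\alpha^{\#}) + \log (\beta^{\#}) \tag{A6}$$

where

$$\psi(\alpha) = \frac{d[\log \Gamma(\alpha)]}{d\alpha}$$

(the 'digamma' function).

Beta Distribution

$$f(x; a, b, p, q) = [B(p, q)\,(b - a)^{p+q-1}]^{-1} (x - a)^{p-1} (b - x)^{q-1} \tag{A7}$$

$$a < x < b \qquad p > 0 \qquad q > 0 \qquad B(p, q) = \frac{\Gamma(p)\,\Gamma(q)}{\Gamma(p + q)}$$

Estimation with a and b known

$$\psi(p^{\#}) - \psi(p^{\#} + q^{\#}) = n^{-1} \sum \log \Big(\frac{x_i - a}{b - a}\Big)$$

$$\psi(q^{\#}) - \psi(p^{\#} + q^{\#}) = n^{-1} \sum \log \Big(\frac{b - x_i}{b - a}\Big) \tag{A8}$$

where $\psi(\,)$ represents the 'psi' or digamma function.

Normal Distribution

$$f(x; \mu, \sigma) = [(2\pi)^{0.5} \sigma]^{-1} \exp[-0.5\,(x - \mu)^2 / \sigma^2] \tag{A9}$$

Estimation

$$\mu^{\#} = n^{-1} \sum x_i \qquad \sigma^{\#} = \Big[n^{-1} \sum (x_i - \mu^{\#})^2\Big]^{0.5} \tag{A10}$$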
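The gamma estimation equations (A6) have no closed-form solution for the shape parameter, so a numerical step is needed. Subtracting the first equation from the second gives log α − ψ(α) = log(mean) − mean(log), which is decreasing in α and can be solved by bisection; β then follows from the second equation. The sketch below is one possible implementation under these assumptions, not the procedure used in the study. The digamma function is approximated by a standard recurrence and asymptotic series, and the data are simulated with hypothetical parameters (only the base level 85.95 is taken from the text).

```python
import math
import random


def digamma(x: float) -> float:
    """psi(x) for x > 0, using psi(x) = psi(x + 1) - 1/x to reach
    x >= 6 and then an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def fit_gamma(sample, location):
    """Solve (A6) for the shape alpha and scale beta, location known."""
    shifted = [x - location for x in sample]
    n = len(shifted)
    mean = sum(shifted) / n
    mean_log = sum(math.log(v) for v in shifted) / n
    target = math.log(mean) - mean_log  # positive for non-constant data
    lo, hi = 1e-3, 1e3                  # assumed search bracket for alpha
    for _ in range(100):
        mid = math.sqrt(lo * hi)        # bisection on a log scale
        if math.log(mid) - digamma(mid) > target:
            lo = mid                    # alpha too small
        else:
            hi = mid
    alpha = math.sqrt(lo * hi)
    beta = mean / alpha                 # second equation of (A6)
    return alpha, beta


# Simulated stages above a base level; shape 3 and scale 2 are hypothetical.
random.seed(0)
base_level = 85.95
sample = [base_level + random.gammavariate(3.0, 2.0) for _ in range(20000)]
alpha_hat, beta_hat = fit_gamma(sample, base_level)
print(f"alpha = {alpha_hat:.3f}, beta = {beta_hat:.3f}")
```

With a large simulated sample the estimates should fall close to the generating values, which gives a quick check that the two equations of (A6) have been combined correctly.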

Truncated Normal Distribution

$$f(x; x_L, \mu, \sigma) = [(2\pi)^{0.5} K \sigma]^{-1} \exp[-0.5\,(x - \mu)^2 / \sigma^2] \qquad x_L < x < \infty$$

$$f(x; x_L, \mu, \sigma) = 0 \qquad -\infty < x \le x_L \tag{A11}$$

where

$$K = \int_{z_L}^{\infty} (2\pi)^{-0.5} \exp(-0.5 u^2)\, du \qquad z_L = (x_L - \mu)/\sigma$$

The following equations, relating the first two moments of the truncated distribution to the mean and variance of the complete distribution, can be solved for $\mu$ and $\sigma$ when $x_L$ is known:

$$\mu_1' = \sigma^2\, [f(x_L, \mu, \sigma)/F(x_L, \mu, \sigma)] + \mu$$

$$\mu_2' = \sigma^2 (x_L + \mu)\, [f(x_L, \mu, \sigma)/F(x_L, \mu, \sigma)] + \sigma^2 + \mu^2 \tag{A12}$$

where

$$f(x_L, \mu, \sigma) = \exp[-0.5\,(x_L - \mu)^2/\sigma^2] / [(2\pi)^{0.5} \sigma]$$

$$F(x_L, \mu, \sigma) = \int_{x_L}^{\infty} f(u, \mu, \sigma)\, du$$

and $\mu_1'$, $\mu_2'$ are estimated by $\sum x_i / n$ and $\sum x_i^2 / n$, respectively.

Bivariate Gamma Distribution

In the literature, several forms of three-parameter bivariate gamma distributions can be found (see, for example, Mardia [1970] or Johnson and Kotz [1972]). Ghirtis [1967] presents a method for estimation of parameters of the five-parameter form of the 'double gamma' distribution introduced by David and Fix [1961]. The five-parameter double gamma distribution [Ghirtis, 1967] has the form

$$f(x, y; a, b, c, \lambda, \mu) = k^{-1} \exp\Big[-\frac{x}{\lambda} - \frac{y}{\mu}\Big] \int_0^{m} u^{a-1} (x - \lambda u)^{b-1} (y - \mu u)^{c-1} e^{u}\, du \tag{A13}$$

where

$$k = \lambda^{b} \mu^{c}\, \Gamma(a)\, \Gamma(b)\, \Gamma(c) \qquad m = \min [x/\lambda,\; y/\mu]$$

$$a, b, c, \lambda, \mu > 0 \qquad x > 0 \qquad y > 0$$

The estimates of the parameters $a$, $b$, $c$, $\lambda$, and $\mu$ can be computed by the equations [Ghirtis, 1967]

$$\lambda^{\#} = k_{20}/k_{10} \qquad \mu^{\#} = k_{02}/k_{01} \qquad a^{\#} = k_{11} k_{10} k_{01} / (k_{20} k_{02})$$

$$b^{\#} = k_{10}^2/k_{20} - a^{\#} \qquad c^{\#} = k_{01}^2/k_{02} - a^{\#} \tag{A14}$$

where $k_{ij}$ is the $(i, j)$th cumulant of the population and can be estimated by Fisher's bivariate k statistics:

$$k_{10} = S_{10}/n$$

$$k_{20} = (n - 1)^{-1} (S_{20} - S_{10}^2/n) \tag{A15}$$

$$k_{11} = (n - 1)^{-1} (S_{11} - S_{10} S_{01}/n) \qquad S_{ij} = \sum x^{i} y^{j}$$
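The moment estimators (A14) and the k statistics (A15) are simple enough to sketch directly. To check them, the example below simulates data under the assumption that the double gamma pair can be generated as x = λ(U + V) and y = μ(U + W), with U, V, and W independent unit-scale gamma variables of shapes a, b, and c; this construction is an assumption adopted here because it leads to a density of the form (A13), and the parameter values are hypothetical. The statistics for y (k01, k02) are taken to be the analogues of those listed for x.

```python
import random


def k_statistics(xs, ys):
    """Fisher's bivariate k statistics up to second order, as in (A15)."""
    n = len(xs)
    s10, s01 = sum(xs), sum(ys)
    s20 = sum(x * x for x in xs)
    s02 = sum(y * y for y in ys)
    s11 = sum(x * y for x, y in zip(xs, ys))
    k10, k01 = s10 / n, s01 / n
    k20 = (s20 - s10 * s10 / n) / (n - 1)
    k02 = (s02 - s01 * s01 / n) / (n - 1)
    k11 = (s11 - s10 * s01 / n) / (n - 1)
    return k10, k01, k20, k02, k11


def fit_double_gamma(xs, ys):
    """Moment estimates (a, b, c, lam, mu) following (A14)."""
    k10, k01, k20, k02, k11 = k_statistics(xs, ys)
    lam = k20 / k10
    mu = k02 / k01
    a = k11 * k10 * k01 / (k20 * k02)
    b = k10 ** 2 / k20 - a
    c = k01 ** 2 / k02 - a
    return a, b, c, lam, mu


def simulate_double_gamma(n, a, b, c, lam, mu, rng):
    """Assumed construction: a shared gamma component U induces the
    dependence between x and y."""
    xs, ys = [], []
    for _ in range(n):
        u = rng.gammavariate(a, 1.0)
        xs.append(lam * (u + rng.gammavariate(b, 1.0)))
        ys.append(mu * (u + rng.gammavariate(c, 1.0)))
    return xs, ys


rng = random.Random(1)
xs, ys = simulate_double_gamma(50000, 2.0, 3.0, 1.5, 0.5, 2.0, rng)
a_hat, b_hat, c_hat, lam_hat, mu_hat = fit_double_gamma(xs, ys)
print(a_hat, b_hat, c_hat, lam_hat, mu_hat)
```

Because the estimators rely on second-order sample moments, they can be quite variable for samples as small as those of the case study; the large simulated sample is used here only to confirm that the formulas recover the generating parameters.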

Truncated Bivariate Normal Distribution

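The truncated bivariate normal distribution for y truncated at $y_L$ has the form

$$f(x, y; \mu_x, \mu_y, \sigma_x, \sigma_y, \rho, y_L) = \frac{\exp[-0.5\,(z^2 - 2\rho z w + w^2)/(1 - \rho^2)]}{2\pi k\, \sigma_x \sigma_y (1 - \rho^2)^{0.5}} \tag{A16}$$

where

$$k = \int_{w_L}^{\infty} (2\pi)^{-0.5} \exp(-0.5 u^2)\, du$$

$$z = (x - \mu_x)/\sigma_x \qquad w = (y - \mu_y)/\sigma_y \qquad w_L = (y_L - \mu_y)/\sigma_y$$

Des Raj [1952] presents the following equations to compute the maximum likelihood estimate of the parameters $\mu_x$, $\mu_y$, $\sigma_x$, $\sigma_y$, and $\rho$ when the truncation point $y_L$ is known:

$$v_{02} = \sigma_y^2\, [1 - k'(z_1 - k')] \qquad v_{01} = \sigma_y (z_1 - k')$$

$$\mu_y = y_L - k' \sigma_y \qquad \mu_x + z_1 \rho \sigma_x - v_{10} = 0 \tag{A17}$$

$$\sigma_y (z_1 - k')\, \mu_x + \sigma_y \rho \sigma_x - v_{11} = 0$$

$$\sigma_x^2 = v_{20} - \mu_x (\mu_x + 2 \rho \sigma_x z_1) - (\rho \sigma_x)^2 k' z_1$$

where $v_{ij}$ is the $ij$th sample moment about the truncation,

$$k' = (y_L - \mu_y)/\sigma_y$$

$$z_1 = \phi(k') \Big/ \Big[1 - \int_{-\infty}^{k'} \phi(t)\, dt\Big] \qquad \phi(t) = (2\pi)^{-0.5} \exp[-0.5 t^2] \tag{A18}$$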

Truncated Bivariate Log Normal Distribution

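The truncated bivariate log normal distribution for y truncated at $y_L$ has the form

$$f(x, y; \theta_x, \theta_y, \mu_x, \mu_y, \sigma_x, \sigma_y, \rho, y_L) = \frac{\exp[-0.5\,(z^2 - 2\rho z w + w^2)/(1 - \rho^2)]}{(x - \theta_x)(y - \theta_y)\, \sigma_x \sigma_y\, 2\pi k\, (1 - \rho^2)^{0.5}} \qquad \theta_x < x < \infty \tag{A19}$$

where

$$z = [\log (x - \theta_x) - \mu_x]/\sigma_x \qquad w = [\log (y - \theta_y) - \mu_y]/\sigma_y$$

$$\mu_x = E[\log (x - \theta_x)] \qquad \mu_y = E[\log (y - \theta_y)]$$

$$\sigma_x = \sigma[\log (x - \theta_x)] \qquad \sigma_y = \sigma[\log (y - \theta_y)]$$

$$k = \int_{z_L}^{\infty} (2\pi)^{-0.5} \exp(-u^2/2)\, du \qquad z_L = [\log (y_L - \theta_y) - \mu_y]/\sigma_y$$

For known $y_L$ and location parameters $\theta_x$ and $\theta_y$ the parameters $\mu_x$, $\mu_y$, $\sigma_x$, $\sigma_y$, and $\rho$ can be computed as they are in the truncated bivariate normal distribution after making the transformation

$$x' = \log (x - \theta_x) \qquad y' = \log (y - \theta_y)$$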

NOTATION

$A_1$, $A_2$  conditioning flood events.
$a_i$, $b_i$, $c_i$  regression coefficients.
$b_m$, $b_t$  basic flow levels for the main river and the tributary.
$C(G)$  yearly construction cost of a levee with profile G.
$C_i(\,)$  local cost function at cross section i.
$D$  domain of H.
$E[\,]$  expected value.
$f(H)$  pdf of H.
$f(x, y)$  joint pdf of water stages.
$f_c(x, y)$  composite pdf.
$f_i(x, y)$  joint pdf of water stages, conditional upon event $A_i$, i = 1, 2.
$f_x(x)$, $f_y(y)$  marginal pdf's.
$G$  levee profile (vector of levee heights $g_i$ at n cross sections).
$g_m$  levee height at confluence.
$g_t$  levee height at upstream end of reach.
$H$  backwater curve (vector of water stages $h_i$ at n cross sections along the reach).
$h_m$  water stage at confluence.
$h_t$  water stage at upstream end of reach.
$L(G, H)$  loss function.
$l_i(g_i, h_i)$  local loss function at cross section i.
$S$  domain of water stages (x, y); also, sample space.
TYL  total yearly losses.
$x$  water stage at confluence.
$y$  water stage at upstream end of reach.

Acknowledgments. This work was supported in part by National Science Foundation grants ENG 74-20462, 'Sensitivity of Decisions in Resource Engineering to Assumptions of Multivariate Models,' and GF 381833, 'Cooperative Research on Decision Making Under Uncertainty in Hydrologic and Other Resource Systems.' The invaluable cooperation of Jean E. Weber and William Metier is greatly appreciated.

REFERENCES

Benjamin, J. R., and C. A. Cornell, Probability, Statistics and Decision for Civil Engineers, McGraw-Hill, New York, 1970.

Bogardi, I., Uncertainty in water resources decision making, in Proceedings of the UN Interregional Seminar on River Basin Development, Budapest, Hungary, Center for Natural Resources, Energy, and Transport, United Nations, New York, 1975.

Bogardi, I., L. Duckstein, and F. Szidarovszky, Hydrologic system reliability at confluence of rivers, paper presented at Symposium on Mathematical Models in Hydrology and Water Resources Systems, Int. Ass. of Hydrol. Sci., Bratislava, Czechoslovakia, 1975.

Bogardi, I., L. Duckstein, and E. Castano, Effect of stochastic model choice on hydraulic design, paper presented at 2nd International Symposium on Stochastic Hydraulics, Int. Ass. of Hydraul. Res., Lund, Sweden, Aug. 2, 1976.

Castano, E., Model uncertainty in the design of a flood protection levee, Technical Reports on Natural Resource Systems, Rep. 28, Dep. of Syst. and Ind. Eng., Univ. of Ariz., Tucson, June 1976.

David, F. N., and E. Fix, Rank correlation and regression in a non-normal surface, in Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, edited by J. Neyman, p. 177, University of California Press, Berkeley, 1961.

Des Raj, On estimating the parameters of a bivariate normal population from double or singly linearly truncated samples, Sankhyā, 12, 277-290, 1952.

Ghirtis, G. C., Some problems of statistical inference relating to the double-gamma distribution, Trab. Estadist., 18, 67-87, 1967.

Johnson, N. L., and S. Kotz, Distributions in Statistics: Continuous Univariate Distributions, vol. 1, Houghton Mifflin, Boston, Mass., 1970a.

Johnson, N. L., and S. Kotz, Distributions in Statistics: Continuous Univariate Distributions, vol. 2, Houghton Mifflin, Boston, Mass., 1970b.

Johnson, N., and S. Kotz, Distributions in Statistics: Continuous Multivariate Distributions, John Wiley, New York, 1972.

Kisiel, C., and L. Duckstein, General report on model choice and validation, in Proceedings of the International Symposium on Uncertainties in Hydrologic and Water Resource Systems, edited by C. C. Kisiel and L. Duckstein, University of Arizona, Tucson, 1972.

Kuiper, E., Water Resources Development, Butterworths, London, 1965.

Langbein, W. B., Annual floods and the partial duration series, Eos Trans. AGU, 30(6), 879-881, 1949.

Mardia, K. V., Families of bivariate distributions, in Griffin's Statistical Monographs and Courses, no. 27, Griffin, London, 1970.

Rodriguez-Iturbe, I., and J. G. Vicens, On the combined use of regional and historical hydrologic information, paper presented at Symposium on Hydrologic Systems Analysis Aspects on Water Resources Planning, AGU, San Francisco, Calif., Dec. 1974.

Slack, J. R., J. R. Wallis, and N. C. Matalas, On the value of information to flood frequency analysis, Water Resour. Res., 11(5), 629-647, 1975.

Szidarovszky, F., L. Duckstein, and I. Bogardi, Levee system reliability along a confluence reach, J. Eng. Mech. Div. Amer. Soc. Civil Eng., 101(EM5), 609-622, 1975.

Szidarovszky, F., L. Duckstein, and I. Bogardi, A stochastic model of levee failure, in Mathematical Models for Environmental Problems, edited by C. A. Brebbia, pp. 129-141, Pentech, London, 1976.

Wallis, J. R., N. C. Matalas, and J. R. Slack, Effect of sequence length n on the choice of assumed distribution of floods, Water Resour. Res., 12(3), 457-471, 1976.

Wood, E. F., I. Rodriguez-Iturbe, and J. C. Schaake, Jr., The methodology of Bayesian inference and decision making applied to extreme hydrologic events, Tech. Rep. 178, Ralph M. Parsons Lab. for Water Resour. and Hydrodyn., Dep. of Civil Eng., Mass. Inst. of Technol., Cambridge, 1974.

(Received September 30, 1976; revised August 1, 1977; accepted January 24, 1978.)