Assessment of the integrated urban water quality model complexity through identifiability analysis

Gabriele Freni a, Giorgio Mannina b,*, Gaspare Viviani b

a Facoltà di Ingegneria ed Architettura, Università degli Studi di Enna "Kore", Cittadella Universitaria, 94100 Enna, Italy
b Dipartimento di Ingegneria Idraulica ed Applicazioni Ambientali, Università di Palermo, Viale delle Scienze, 90128 Palermo, Italy
* Corresponding author. Tel.: +39 091 665 7756; fax: +39 091 665 7749. E-mail address: [email protected] (G. Mannina).

water research 45 (2011) 37–50. Available at www.sciencedirect.com. Journal homepage: www.elsevier.com/locate/watres
0043-1354/$ - see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.watres.2010.08.004

Article history: Received 8 February 2010. Received in revised form 29 May 2010. Accepted 3 August 2010. Available online 11 August 2010.

Keywords: Uncertainty assessment; River water-quality modelling; Identifiability analysis; Integrated urban drainage modelling

Abstract

Urban sources of water pollution have often been cited as the primary cause of poor water quality in receiving water bodies (RWB), and recently many studies have been conducted to investigate both continuous sources, such as wastewater-treatment plant (WWTP) effluents, and intermittent sources, such as combined sewer overflows (CSOs). An urban drainage system must be considered jointly, i.e., by means of an integrated approach. However, although the benefits of an integrated approach have been widely demonstrated, several aspects have prevented its wide application, such as the scarcity of field data for not only the input and output variables but also parameters that govern intermediate stages of the system, which are useful for robust calibration. These factors, along with the high complexity level of the currently adopted approaches, introduce uncertainties in the modelling process that are not always identifiable. In this study, the identifiability analysis was applied to a complex integrated catchment: the Nocella basin (Italy). This system is characterised by two main urban areas served by two WWTPs and has a small river as the RWB. The system was simulated by employing an integrated model developed in previous studies. The main goal of the study was to assess the right number of parameters that can be estimated on the basis of data-source availability. A preliminary sensitivity analysis was undertaken to reduce the model parameters to the most sensitive ones. Subsequently, the identifiability analysis was carried out by progressively considering new data sources and assessing the added value provided by each of them. In the process, several identifiability methods were compared and some new techniques were proposed for reducing subjectivity of the analysis. The study showed the potential of the identifiability analysis for selecting the most relevant parameters in the model, thus allowing for model simplification, and in assessing the impact of data sources for model reliability, thus guiding the analyst in the design of future monitoring campaigns. Further, the analysis showed some critical points in integrated urban drainage modelling, such as the interaction between water quality processes on the catchment and in the sewer, that can prevent the identifiability of some of the related parameters.

1. Introduction

Integrated modelling of urban wastewater systems is of growing interest, mainly as a result of the recent adoption of the EU Water Framework Directive (WFD) (European Commission, 2000). An integrated modelling approach is also required because of the growing awareness that optimal management of the individual components of urban wastewater systems (i.e., sewer systems, wastewater-treatment plants and receiving water bodies) does not lead to optimum performance of the entire system (Rauch et al., 2002). The main bottlenecks preventing the application of integrated modelling approaches are the complexity of the overall system and the lack of field data required for reliable model application. Indeed, in urban drainage water quality assessment, data availability issues are common in both research and practical applications, primarily because the required data-gathering campaigns can be technically complex and economically demanding. When complex modelling approaches are applied with insufficient field data, classical calibration approaches may lead to several equally consistent parameter sets, and it may thus prove difficult to arrive at sufficient confidence in the obtained results (Kuczera and Parent, 1998; Beven and Binley, 1992). An obvious remedy is model reduction, in the sense of restricting the model description to what the observed data can support (Jakeman and Hornberger, 1993). In practice, this principle is difficult to apply because it requires an objective procedure for determining the correct model complexity for a specific application. Identifiability analysis addresses this issue: it comprises several mathematical approaches for investigating which model parameters can be reliably assessed in a specific modelling application and case study.

Model identifiability analysis basically consists of two problems: the problem of model-structure selection and the problem of parameter identification. The model structure is often imposed by physical considerations, especially with large environmental systems involving several processes. For this reason, studies to date have mainly addressed parameter identifiability and the evaluation of related uncertainty (Brun et al., 2001; Campolongo et al., 2007). A distinction has to be made between structural and practical identifiability (De Pauw et al., 2004). The former provides information about the theoretical possibility of obtaining unique values for the parameters once the model structure and the system to be modelled have been established. In contrast, the practical identifiability of parameters is dependent on both model structure and experimental conditions together with the quality and quantity of the measurements.

In the past, parameter identifiability issues, although referring to simple models, have been successfully tackled by detailed analysis of sensitivity functions (Holmberg, 1982; Reichert and Vanrolleghem, 2001; Saltelli et al., 2006; Wagener and Kollat, 2007; Campolongo et al., 2007; Gatelli et al., 2009). Holmberg (1982) suggested the use of graphical approaches for sensitivity analysis to enable the evaluation of parameter identifiability. Such approaches are well suited for small models. Conversely, for large models, such as activated-sludge models (ASMs), this approach fails because it is no longer possible to efficiently analyse the extensive graphical output that is produced (Brun et al., 2002). To cope with such problems, several analytical approaches based on detailed analyses of the sensitivity indices were presented in the literature. Morris (1991) and Campolongo et al. (2007) proposed the analysis of Elementary Effects (EEs) of parameters on modelling output based on the statistical analysis of model sensitivities to parameter variations. In such studies, the average value of the EEs is used to rank the parameters in terms of sensitivity. Saltelli et al. (2009) suggested some improvements to the method, introducing the concept of Elementary Interaction in order to highlight the interaction among parameters in terms of their impact on modelling outputs. Weijers and Vanrolleghem (1997) and De Pauw (2005), transferring knowledge from the field of control theory, demonstrated the effectiveness and power of methods based on the Fisher Information Matrix (FIM). The main advantage of such methods is the objectivity of their identifiability criteria, which do not depend on user decisions such as the definition of a threshold in the sensitivity indices to highlight identifiable parameters. In another approach, Brun et al. (2001), adapting methods used in linear regression diagnostics (Belsley, 1991), focused on the analysis of parameter interdependencies and on the exploration of the effects of fixed parameter values on parameter estimates. Both studies showed that the different proposed methods are of variable effectiveness depending on the structure and number of parameters involved in the model; such approaches also have very different computational costs and they are often dependent on user assumptions (Brun et al., 2001). Another study, carried out by Malve et al. (2007), demonstrated that an identifiability analysis based on Bayes' paradigm could be used for better fitting in environmental modelling and for selecting potential measurements. Malve et al. (2007) suggested using environmental modelling as a tool for guiding data-gathering campaigns.

The methods based on EE demonstrated high computational efficiency, especially after the modifications and the improvements produced in the last decade. The methods based on FIM analysis have the advantage of being less affected by subjective choices of the operator (Freni et al., 2009a; Machado et al., 2009).

Finally, Malve et al. (2007) pointed out that Bayesian methods are more data demanding than other identifiability methods and for this reason they are often not readily applicable.

For these reasons, methods based on FIM analysis have frequently been adopted in integrated urban drainage water-quality modelling, both for their ease of use and for the low impact of subjective choices. Moreover, Freni et al. (2009a) investigated the reduction of overall modelling uncertainty that can be obtained by fixing some (non-identifiable) parameters at constant values according to the results of the identifiability analysis. Despite the useful insights gained by Freni et al. (2009a), the effects of the data contributed by the different parts of the integrated system were not investigated; the investigation of these effects represents one of the aims of the present study.

The Freni et al. (2009a) study was based solely on river flow and water quality data, not including the information coming from the other parts of the integrated system (i.e., the sewer system and wastewater-treatment plant). However, in the case of integrated models the analysis of the identifiable parameters on the basis of the whole body of information coming from the different parts of the integrated system is of paramount interest and deserves investigation. Indeed, the uncertainties in parameters and input data propagate through a chain of interacting models running parallel simulations. More control of information transfer between time steps allows for an improved analysis of model-system dynamics.

Bearing in mind the considerations discussed above, identifiability analysis is applied to a complex case study in which several data sources are present (i.e., sewer systems, wastewater-treatment plants and a receiving water body) and the related model is characterised by numerous parameters, thus increasing response uncertainty. This study attempts to assess the right number of parameters that can be estimated on the basis of data source availability. During the process, several previously published indicators are employed and a novel one is proposed for reducing the subjectivity of the identifiability analysis.

2. Materials and methods

2.1. Description of the case study

The analysis was applied to a complex integrated catchment, the Nocella catchment (Fig. 1), which is an urbanised natural catchment located near Palermo in the northwestern part of Sicily (Italy). The entire natural basin is characterised by a surface area of 9970 ha and has two main branches that flow primarily east to west. The two main branches join together 3 km upstream of the river estuary. The southern branch is characterised by a smaller elongated basin and receives water from a large urban area characterised by industrial activities partially served by a WWTP and partially connected directly to the RWB. The northern branch was monitored in the present study. The basin closure is located 9 km upstream of the river mouth; the catchment area is 6660 ha. The catchment end is equipped with a hydrometeorological station (Nocella a Zucco).

Fig. 1 – Nocella catchment.

The northern river reach receives wastewater and stormwater from two urban areas (Montelepre, with a catchment surface of 70 ha, and Giardinello, with a surface of 45 ha) drained by combined sewers. The Montelepre sewer consists of circular and egg-shaped pipes with maximum dimensions of 100 cm × 150 cm. The sewer system serves 7000 inhabitants and has an average dry-weather flow of 12.5 L/s and an average dry-weather biological oxygen demand (BOD) of 223 mg/L. The Giardinello sewer consists of circular pipes with a maximum diameter of 80 cm. The served population is 2000 inhabitants and the system has an average dry-weather flow of 2.5 L/s and an average dry-weather BOD concentration of 420 mg/L. Each sewer system is connected to a WWTP protected by combined sewer overflow (CSO) devices. The WWTPs utilise a simplified activated-sludge process for biological organic carbon removal, with preliminary mechanical treatment units, an activated-sludge tank, and a final circular settler.

Rainfall was monitored by four rain gauges distributed over the basin area: the Montelepre rain gauge is operated by Palermo University and is characterised by a 0.1-mm tipping bucket and a temporal resolution of 1 min; the other three rain gauges are operated by the Regional Hydrological Service and are characterised by a 0.2-mm tipping bucket and a temporal resolution of 15 min. The hydrometeorological station located at the end of the catchment ("Nocella a Zucco", operated by the Regional Hydrological Service) is characterised by an ultrasonic level gauge and has a temporal resolution of 15 min. For the quantity data, Palermo University technicians supplemented the station instruments with a submerged area-velocity probe that provides water level and velocity data with a 1-min temporal resolution. An ultrasonic external probe was used to provide a second water-level measurement for validation and as a backup in case the submerged probe failed; an automatic 24-bottle water-quality sampler was used for water-quality data collection. Monitoring combined permanent measurements (from measuring stations already present) and temporary measurements (from stations installed for the purpose) (Fig. 2). Flow measurements were carried out using area-velocity probes with a 1-min temporal resolution, which allow the inflow and outflow volumes for each element in the system to be defined. Water-quality sampling was performed using automatic 24-bottle samplers, and grab sampling was used for defining pollutant loads and treatment efficiencies. The water-quality parameters monitored were total suspended solids (TSS), BOD, chemical oxygen demand (COD), ammonia (NH4), total Kjeldahl nitrogen (TKN), and phosphorus (P); dissolved oxygen (DO) level was monitored in the river only. All analyses were carried out according to Standard Methods (APHA, 1995).

The monitoring campaign began in December 2006 and is still in progress. Rainfall and discharge were monitored continuously, while water quality was measured during specific periods. Further details concerning the case study and monitoring campaign can be found in Freni et al. (2010a) and Candela et al. (2009).

Fig. 2 – Schematic of the urban drainage system monitoring methodology performed on the Montelepre and Giardinello urban areas.

2.2. The integrated urban drainage model

In the present study, an integrated model developed in previous studies was applied (Mannina et al., 2004; Mannina, 2005). A brief description of the structure of the adopted model follows; the interested reader may refer to the cited literature for a more detailed description of the selected algorithms. The model enables estimation of both the interactions among the three components of the system (sewer system (SS), WWTP and RWB) and the effects, in terms of quality, that urban stormwater causes inside the RWB (Fig. 3). The integrated model is chiefly composed of three sub-models for the simulation of the components; each sub-model is divided into a quantity and a quality module for the simulation of the hydrographs and pollutographs. The modelling structure can be adapted to the specific application by removing or adding sub-models, such as the stormwater tank (SWT) or CSO (Freni et al., 2010b).

Fig. 3 – Schematic overview of the different submodels, analysed processes, and interconnections.

The SS sub-model calculates the net rainfall from the measured rainfall by a loss function taking into account both initial and continuous losses (W0 and Φ, respectively). From the net rainfall, the model simulates the net rainfall-runoff transformation process and the flow propagation with a cascade of one linear reservoir and a channel, representing the catchment, and a linear reservoir representing the sewer network (characterised by the parameters K1, λ and K2, respectively). An exponential function is used to simulate pollutant buildup on catchment surfaces (Alley and Smith, 1981). This equation depends on two parameters, the buildup rate (Accu) and the decay rate (Disp), that control the accumulation of pollutants on the catchment surface. The solid wash-off caused by overland flow during a storm event is simulated using the formulation proposed by Jewell and Adrian (1978), where the wash-off coefficient (Arra) and wash-off factor (Wh) are the two parameters used to calculate the mass of pollutants washed from the catchment by a rainfall event. The solid deposits in the sewers during dry weather are calculated by using an exponential function. Regarding the erosion and transport of sewer sediments, particular care is taken to represent sediment transformations in sewers, which arise from the semi-cohesive behaviour caused by the presence of organic substances and from the physical-chemical changes during sewer transport. Specifically, the eroded mass from the sewer bottom is calculated according to the approach of Parchure and Mehta (1985), where M is the key parameter for assessing such a mass.
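The quantity part of the SS sub-model described above (a loss function followed by a reservoir, a channel and a second reservoir) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the treatment of the continuous loss as a constant fraction, the representation of the channel as a pure delay, and the implicit Euler step with a unit time step are all assumptions made for illustration.

```python
def net_rainfall(rain, initial_loss, loss_fraction):
    """Remove an initial loss (same unit as rain), then a constant fraction of each value."""
    remaining = initial_loss
    net = []
    for r in rain:
        absorbed = min(r, remaining)      # the initial loss is filled first
        remaining -= absorbed
        net.append((r - absorbed) * (1.0 - loss_fraction))
    return net


def linear_reservoir(inflow, k):
    """Route a series through a linear reservoir (storage = k * outflow), k in time steps."""
    outflow, q = [], 0.0
    for i in inflow:
        q += (i - q) / (k + 1.0)          # implicit Euler step with a unit time step
        outflow.append(q)
    return outflow


def translate(series, lag):
    """Pure delay of `lag` whole time steps, standing in for the channel element (assumption)."""
    lag = min(lag, len(series))
    return [0.0] * lag + series[:len(series) - lag]


def sewer_outflow(rain, initial_loss, loss_fraction, k1, lag, k2):
    """Loss function, catchment reservoir, channel delay, then sewer reservoir."""
    net = net_rainfall(rain, initial_loss, loss_fraction)
    return linear_reservoir(translate(linear_reservoir(net, k1), lag), k2)
```

With this discretisation each reservoir conserves volume: the difference between cumulative inflow and cumulative outflow equals the water still stored (k times the last outflow).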

The pollutographs at the outlet of the sewer system are calculated by modelling the complex catchment sewer network as a reservoir and singling out the different types of sewer sediment transport (i.e. suspended and bed load transport). The two types of sediment transport are propagated considering two coefficients: the sewer suspended load linear reservoir constant (Ksusp) and the sewer bed load linear reservoir constant (Kbed). Further, the different types of sewer sediment transport are calculated taking into account the transport capacity of the flow (see Mannina and Viviani, 2010a). Finally, the WWTP inflow is computed by taking into account the presence of a CSO device; its behaviour is simulated by a rating curve, where CSO efficiency is taken into account by the introduction of two dilution coefficients (rd1 and rd2) (Mannina and Viviani, 2009).

The WWTP sub-model simulates the behaviour of the part of the plant that is most sensitive to storm events; accordingly, the model simulates a plant composed of an activated-sludge tank and a secondary sedimentation tank. In the activated-sludge-tank model, the equations derived from Monod's theory (Metcalf and Eddy, 2003) are used to describe the removal of BOD and NH4. Specifically, BOD removal is controlled by the maximum growth rate of heterotrophs (μmax,H), the BOD half-saturation constant (Ks), the heterotrophic yield coefficient (YH), and the decay rate of heterotrophs. NH4 removal, on the other hand, is related to the autotrophic biomass and is accordingly controlled by the maximum growth rate of autotrophs (μmax,A), the autotrophic yield coefficient (YN) and the decay rate of autotrophs (bA). The sedimentation tank is simulated using the modelling approach of Takacs et al. (1991). In particular, the model predicts the solids concentration profile in the settler by dividing the settler into a number of layers of constant thickness and performing a solids balance for each layer.

The third sub-model assesses RWB discharges and water quality. More specifically, the modelling approach is focused on rivers characterised by scarce field data and ephemeral characteristics (i.e., rivers characterised by a long dry season and intense flows for short periods following precipitation). This latter aspect is relevant because the phenomena generally involved in the evaluation of the RWB quality state play different roles than in the perennial streams commonly presented in the literature (Freni et al., 2009b; Mannina and Viviani, 2010b). Such rivers are also frequently found in Mediterranean areas characterised by semi-arid climates. Due to the highly non-stationary conditions typical of these ephemeral streams, a dynamic model is employed for the propagation of the river flow. Specifically, the simplified form of the Saint-Venant equations (kinematic wave) is used to propagate the flow along the river, with the river bed roughness (ks) as the sole parameter. For the quality aspects, the advection-dispersion equation was implemented to address the water-quality phenomena (Mannina and Viviani, 2010c; Chapra, 1997; Brown and Barnwell, 1987). Specifically, the BOD and DO propagation was assessed considering a longitudinal dispersion coefficient (Kdisp) and kinetic constants for the transformation of the BOD (kd and ksod) and oxygen reaeration (ka).
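To show how the kinetic constants of the river quality module act, the sketch below integrates only the reaction terms for a single water parcel: first-order BOD decay at rate kd and reaeration at rate ka towards a saturation value. It is a simplified illustration and not the model used in the study: advection, dispersion (Kdisp) and the ksod term are left out, and the function name, the explicit Euler scheme and all numerical values in the usage note are assumptions.

```python
def bod_do_batch(bod0, do0, do_sat, kd, ka, dt, n_steps):
    """Explicit Euler integration of BOD decay and reaeration in one water parcel.

    Returns a list of (time, BOD, DO) tuples; units must be consistent
    (for example mg/L for concentrations and 1/day for kd and ka).
    """
    bod, do = bod0, do0
    history = [(0.0, bod, do)]
    for step in range(1, n_steps + 1):
        decay = kd * bod                   # oxygen consumed by BOD decay
        reaeration = ka * (do_sat - do)    # oxygen gained from the atmosphere
        bod += dt * (-decay)
        do = max(0.0, do + dt * (reaeration - decay))  # DO cannot become negative
        history.append((step * dt, bod, do))
    return history
```

For example, `bod_do_batch(10.0, 8.0, 9.0, 0.3, 0.5, 0.001, 1000)` follows a parcel for one time unit with hypothetical values; the BOD curve approaches the analytical exponential decay as the time step is reduced.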

2.3. Model identifiability analysis

Most of the techniques designed to find practically identifiable subsets of model parameters are based on an investigation of sensitivity functions. The present study concentrates on numeric criteria based on correlation studies of sensitivity functions (Weijers and Vanrolleghem, 1997; Checchi and Marsili-Libelli, 2005; Saltelli et al., 2006, 2009; Campolongo et al., 2007; Marsili-Libelli and Giusti, 2008; Freni et al., 2009a; Gatelli et al., 2009). Many of the methods briefly discussed in the introduction rely on subjective hypotheses (such as the definition of a sensitivity threshold for defining identifiable parameters). In the present study, the analysis was carried out by investigating the FIM determinant and eigenvalues, because this approach is less prone to subjectivity and has been successfully applied in the same modelling field in the literature. In this section, a brief description of the sensitivity indices and identifiability analysis is presented. We begin with the assumption that a deterministic model can be described by a general set of equations $y = f(\theta)$, where the vector $y = (y_1, y_2, \ldots, y_n)$ represents the n modelling output variables corresponding to the available measurements $y^{*} = (y^{*}_1, y^{*}_2, \ldots, y^{*}_n)$ and the vector $\theta = (\theta_1, \theta_2, \ldots, \theta_m)$ represents the m model parameters. Independent of the nature of the modelling equations, sensitivity functions can be defined stating the relevance of the dependencies between modelling outputs $y$ and parameters $\theta$:

$$s_{i,j} = \frac{\Delta\theta_j}{y_{si}} \, \frac{\partial y_i}{\partial \theta_j} \qquad (1)$$

where $\Delta\theta_j$ is the variability range of parameter $\theta_j$ (which depends on prior knowledge) and $y_{si}$ is a reference (or scaling) value for the modelling output variable $y_i$, used for preserving the dimensionless nature of the sensitivity function. The function $s_{i,j}$ is useful because it provides information on the raw dependency of the modelling output on the parameters. The parameters $\Delta\theta_j$ and $y_{si}$, the magnitude and the scaling parameter, respectively, of the sensitivity function, can each have a great influence on the results of the sensitivity analysis (Reichert and Vanrolleghem, 2001). In the present study, $y_{si}$ is defined as the average measured value of the ith model output variable, and $\Delta\theta_j$ is taken as the variation range of the jth model parameter obtained from single-event model calibrations based on each available rainfall event in the calibration dataset (Beven and Binley, 1992; Freni et al., 2009a–c). With multiple modelling outputs, the analysis of the functions $s_{i,j}$ may be only slightly informative and a more aggregated index may be useful. For this reason, a weighted average sensitivity was used for initial parameter evaluation:

$$s_j = \frac{1}{n} \sum_{i=1}^{n} \frac{s_{i,j}}{\max(s_{i,j})} \qquad (2)$$

where $\max(s_{i,j})$ is the maximum of the n sensitivities derived for the jth model parameter. Scarcely identifiable model parameters may act in two different ways: (i) they can generate small weighted sensitivity function values; or (ii) they can show an approximately linear dependence of sensitivity functions on the parameters. In the first case (the first non-identifiability criterion), the model parameter does not greatly affect the modelling output and thus calibration cannot really assess its value; in the latter case (the second non-identifiability criterion), the model-parameter variability does not clearly affect the modelling output and it can be considered a sort of underlying noise which increases the uncertainty transferred to the model output variable without providing relevant additional information to the model.
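The scaled sensitivities of Eq. (1) and the aggregated index of Eq. (2) can be approximated numerically. The sketch below is one possible way to do so and is not taken from the source: the central finite difference, the relative step size, the use of absolute values in the aggregation and the function names are assumptions. The `model` argument is any callable that maps a list of parameter values to a list of n outputs.

```python
def sensitivity_matrix(model, theta, delta_theta, y_scale, rel_step=1e-6):
    """Scaled sensitivities s[i][j] = (delta_theta[j] / y_scale[i]) * dy_i/dtheta_j, as in Eq. (1)."""
    n, m = len(y_scale), len(theta)
    s = [[0.0] * m for _ in range(n)]
    for j in range(m):
        h = rel_step * max(abs(theta[j]), 1.0)
        up = list(theta)
        up[j] += h
        down = list(theta)
        down[j] -= h
        y_up, y_down = model(up), model(down)
        for i in range(n):
            dy_dtheta = (y_up[i] - y_down[i]) / (2.0 * h)  # central difference
            s[i][j] = delta_theta[j] / y_scale[i] * dy_dtheta
    return s


def mean_weighted_sensitivity(s):
    """Eq. (2): mean over outputs of s[i][j] divided by the largest s[i][j] of parameter j."""
    n, m = len(s), len(s[0])
    result = []
    for j in range(m):
        column = [abs(s[i][j]) for i in range(n)]  # absolute values are an assumption
        peak = max(column)
        result.append(sum(column) / (n * peak) if peak > 0.0 else 0.0)
    return result
```

For a hypothetical two-output model `lambda th: [2.0 * th[0] + th[1], th[0]]` with unit ranges and unit scaling values, the sensitivity matrix is approximately [[2, 1], [1, 0]] and the aggregated indices are 0.75 and 0.5.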

The identification technique employed here was originally proposed and applied to WWTP models by Weijers and Vanrolleghem (1997) and is based on the elaboration of sensitivity matrices.

The technique consists of two phases for the analysis of the two previously discussed causes of non-identifiability. In the preliminary phase, a sensitivity ranking of parameters is accomplished by averaging the sensitivity of different modelling outputs to the parameter (Eq. (2)). The preliminary analysis allows for the reduction of model parameters to the most sensitive ones, i.e., those characterised by model sensitivities higher than a user-defined threshold; model parameters with sensitivities lower than this threshold can be considered non-identifiable according to the first criterion defined above. This subjective choice is used only for ranking the parameters and for simplifying the following step of the analysis by reducing the number of parameters to be investigated. An inappropriate choice of the threshold may lead to the following consequences:

• The use of a low threshold leads to the elimination of few parameters, thus increasing the complexity and the computational demands of the following part of the analysis;

• The use of a high threshold leads to the initial elimination of a high number of parameters; in this way, the following phase of the analysis may lead to the identification of all remaining parameters without reaching a non-identifiability condition. In this case, the analyst can run the analysis again with a reduced threshold.

The parameters retained in this first elimination phase are passed to the second phase of the identifiability analysis, which is based on elaborations of the Fisher Information Matrix (FIM):

$$\mathrm{FIM} = S^{T} \cdot Q_{mes}^{-1} \cdot S \qquad (3)$$

where $S$ is a matrix of n rows and m columns containing the sensitivity indices obtained by Eq. (1) and $Q_{mes}$ is the [n × n] covariance matrix of the measurement noise. In the cases where measurement noise sources are uncorrelated, the $Q_{mes}$ matrix is diagonal and has a determinant equal to one. Considering a model with m parameters, the FIM is an [m × m] matrix.
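As a small numerical illustration of how the FIM is assembled, the sketch below treats the uncorrelated-noise case, in which the measurement covariance matrix is diagonal and can be passed as a list of n variances. With the sensitivity matrix stored as n rows by m columns, the product that yields an [m × m] matrix is the transpose of S times the inverse covariance times S, which is what the function computes. The function name and the example numbers are hypothetical.

```python
def fisher_information(s, noise_var):
    """FIM[j][k] = sum_i s[i][j] * s[i][k] / noise_var[i] for a diagonal noise covariance.

    s is a list of n rows (outputs) by m columns (parameters);
    noise_var holds the n diagonal entries of the measurement covariance matrix.
    """
    n, m = len(s), len(s[0])
    fim = [[0.0] * m for _ in range(m)]
    for j in range(m):
        for k in range(m):
            fim[j][k] = sum(s[i][j] * s[i][k] / noise_var[i] for i in range(n))
    return fim
```

With `s = [[2.0, 1.0], [1.0, 0.0], [0.0, 3.0]]` and unit variances, the result is `[[5.0, 2.0], [2.0, 10.0]]`; raising the variance of the third output lowers the information carried by the second parameter, which only that output constrains strongly.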

The FIM summarises the importance of each model

parameter with respect to the outputs (Dochain and

Vanrolleghem, 2001). The FIM provides a lower bound for the

parameter error-covariance matrix and its characteristics

may then provide information on the shapes and dimensions

of the model-confidence regions around the calibration values

of the model parameters (Soderstrom and Stoica, 1989). More

specifically, as each column of the matrix represents a model

parameter, the determinant and the condition number (i.e.,

the ratio between the highest and lowest matrix eigenvalues)

of the FIM provide a reasonable measure of the correlation of a set of model parameters (Weijers and

Vanrolleghem, 1997). The FIM determinant D (the identifi-

ability criterion) is a representation of the importance of the

model parameters with respect to model outputs: a higher

determinant indicates that the model outputs are more

sensitive to the parameters. Conversely, the presence of one

insensitive parameter causes a drastic reduction of the FIM

determinant, towards zero. As the D criterion is dependent on the

magnitude of the parameters involved, this criterion was

normalised (normD) according to Eq. (4):

$$\mathrm{normD} = \max\left( D \cdot \lVert \theta \rVert_{2} \right) \tag{4}$$

where ‖θ‖₂ is the Euclidean norm of the parameter vector evaluated at the mean point of the parameter-variation range. Such normalisation acts as a scaling factor and allows for comparisons among subsets of the same size but with different model parameters.

The condition number E (the identifiability criterion) is

a representation of the shape of the confidence region

(Weijers and Vanrolleghem, 1997; Checchi and Marsili-Libelli,

2005): a value near unity indicates that all parameters are

equally important to the model; higher values are obtained in the presence of a dominant or insensitive model parameter:

$$\mathrm{modE} = \min\left( \sqrt{\frac{\max(EV[\mathrm{FIM}])}{\min(EV[\mathrm{FIM}])}} \right) \tag{5}$$

where max(EV[FIM]) and min(EV[FIM]) are the maximum and minimum eigenvalues (EV) of the FIM, respectively. From the

systems-engineering point of view, it is important to include

in the parameter subset those parameters that maximise the

D criterion and minimise the modE criterion. Both identifica-

tion criteria have advantages and disadvantages (Freni et al.,

2009a): the D criterion represents the size of the confidence

region and thus the aggregated impact of parameters can be

evaluated but the comparison between parameters in terms of

identifiability may be difficult in complex models; the E

criterion enables the easy comparison of the impact of each

parameter on the model, but an objective approach for eval-

uating the number of identifiable parameters is missing (the

maximum number of identifiable parameters can be detected

by a rapid increase in the index value once a new parameter is

added to an identifiable parameter subset).

For this reason, in the present study, similarly to the

method of Machado et al. (2009), a combination of the two

criteria was considered. Hence, the ratio between the normD

and the modE criteria (the DE criterion) is an interesting index

to define subsets of identifiable parameters combining the

advantages of both approaches:

$$\mathrm{DE} = \frac{\max\left( D \cdot \lVert \theta \rVert_{2} \right)}{\min\left( \sqrt{\dfrac{\max(EV[\mathrm{FIM}])}{\min(EV[\mathrm{FIM}])}} \right)} \tag{6}$$
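For a single candidate subset, the D, normD, modE and DE values can be computed as in the sketch below (NumPy assumed). The function name `identifiability_criteria` and the example matrix are hypothetical. The max and min operators of Eqs. (4)-(6) act across candidate subsets, so the function only returns the values for one subset; the scaling uses the Euclidean norm of the parameter vector, as stated in the text.

```python
import numpy as np


def identifiability_criteria(fim, theta):
    """Return (D, normD, modE, DE) for one candidate parameter subset.

    fim: symmetric FIM of the subset.
    theta: parameter values at the mid-point of their variation ranges.
    """
    fim = np.asarray(fim, dtype=float)
    ev = np.linalg.eigvalsh(fim)                 # eigenvalues in ascending order
    d = float(np.linalg.det(fim))                # D criterion
    norm_d = d * float(np.linalg.norm(theta))    # D scaled by the Euclidean norm
    mod_e = float(np.sqrt(ev[-1] / ev[0]))       # square root of the eigenvalue ratio
    return d, norm_d, mod_e, norm_d / mod_e


# Hypothetical two-parameter subset.
fim = [[4.0, 0.0],
       [0.0, 1.0]]
theta = [3.0, 4.0]
d, norm_d, mod_e, de = identifiability_criteria(fim, theta)
print(d, norm_d, mod_e, de)
```

Comparing `norm_d` (to be maximised) and `mod_e` (to be minimised) across candidate subsets of the same size reproduces the ranking logic of the criteria.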

Another opportunity can be based on considerations similar

to those that generated the modE criterion in an attempt to

improve its objectivity. Such an aim can be achieved by

comparing the maximum and minimum FIM eigenvalues at

different steps of the identifiability process:

$$\mathrm{gradE} = \max\left( \sqrt{ \frac{\max(EV[\mathrm{FIM}_{p+1}])}{\max(EV[\mathrm{FIM}_{p}])} \cdot \frac{\min(EV[\mathrm{FIM}_{p}])}{\min(EV[\mathrm{FIM}_{p+1}])} } \right) \tag{7}$$

where p is the number of parameters in each step of the identifiability analysis, FIMp is the Fisher Information Matrix of dimension p × p and the remaining variables are as defined above.

At each step of the identifiability analysis, gradE can reach

a peak either if a highly sensitive parameter (the first fraction

has a peak) or an insensitive parameter (the second fraction

has a peak) is included. The number of identifiable parameters

is identified by the absolute maximum of the gradE function.
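The behaviour of gradE at one step can be illustrated with the short sketch below. It assumes that Eq. (7) is read as the square root of the product of the two eigenvalue ratios; the function names and eigenvalues are hypothetical. Under that reading, the value at step p is algebraically equal to the ratio between the modE values of the subsets with p + 1 and p parameters, which the example checks numerically.

```python
import math


def grad_e(ev_max_p, ev_min_p, ev_max_p1, ev_min_p1):
    """gradE between a subset of p parameters and one of p + 1 parameters."""
    first = ev_max_p1 / ev_max_p    # peaks when a highly sensitive parameter is added
    second = ev_min_p / ev_min_p1   # peaks when an insensitive parameter is added
    return math.sqrt(first * second)


def mod_e(ev_max, ev_min):
    """Square root of the ratio between the extreme FIM eigenvalues."""
    return math.sqrt(ev_max / ev_min)


# Hypothetical extreme eigenvalues before and after adding an insensitive parameter:
# the largest eigenvalue is unchanged while the smallest one collapses.
g = grad_e(ev_max_p=100.0, ev_min_p=4.0, ev_max_p1=100.0, ev_min_p1=0.25)
ratio = mod_e(100.0, 0.25) / mod_e(100.0, 4.0)
print(g, ratio)
```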

Practical identifiability approaches use the discussed

criteria for ranking model parameter subsets to find the best

combination that can be assessed according to the available

data. The identification process is iterative and consists of adding one model parameter at a time to an initial identifiable subset that is usually selected among the most sensitive model parameters. In the subsequent iterative steps, all possible combinations are obtained by adding one parameter to the identifiable subset and evaluating the identification criteria. The combination providing the highest values of the identification criteria is retained and the iteration is repeated until the global maximum of the identification criteria is reached.
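The iterative procedure can be sketched as a greedy forward selection. The example below (NumPy assumed) uses normD as the only criterion and an identity Qmes; both are simplifying assumptions of the sketch, and the names `norm_d`, `forward_selection` and the sensitivity values are hypothetical.

```python
import numpy as np


def norm_d(S, theta, subset):
    """normD of a subset: det(FIM) times the Euclidean norm of its parameters."""
    cols = list(subset)
    s_sub = np.asarray(S, dtype=float)[:, cols]
    fim = s_sub.T @ s_sub                       # Q_mes assumed to be the identity
    scale = np.linalg.norm(np.asarray(theta, dtype=float)[cols])
    return float(np.linalg.det(fim) * scale)


def forward_selection(S, theta, initial):
    """Add one parameter at a time, keeping the candidate with the highest normD.

    Returns the (subset, normD) pairs retained at each subset size.
    """
    m = np.asarray(S).shape[1]
    subset = list(initial)
    history = [(tuple(subset), norm_d(S, theta, subset))]
    while len(subset) < m:
        candidates = [j for j in range(m) if j not in subset]
        scores = {j: norm_d(S, theta, subset + [j]) for j in candidates}
        best = max(scores, key=scores.get)
        subset.append(best)
        history.append((tuple(subset), scores[best]))
    return history


# Hypothetical sensitivities: 4 outputs, 3 parameters; the last one is almost insensitive.
S = [[2.0, 0.0, 0.01],
     [0.0, 3.0, 0.00],
     [1.0, 1.0, 0.02],
     [0.0, 2.0, 0.00]]
theta = [1.0, 1.0, 1.0]
history = forward_selection(S, theta, initial=[0])
best_subset, best_value = max(history, key=lambda item: item[1])
print(history)
print(best_subset)
```

The subset at the global maximum of the recorded criterion is taken as the identifiable set; adding the nearly insensitive parameter makes the determinant collapse, as described for the D criterion.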

2.4. Methodology application

According to the steps discussed in the previous section, an

initial local sensitivity analysis was performed to identify

the most sensitive model parameters among the fifty-one

characterising the integrated model: twenty-three for each

urban drainage system and five for the RWB. Table 1 shows, for

each sub-model and each parameter, the symbol, the

measurement unit, the variation range and the weighted

sensitivity index.

Similarly to Beven and Binley (1992), parameter-variation

ranges were taken as the intervals strictly including the calibrated values obtained by means of the seven available events.

In the present study, sensitivity indices were evaluated by

means of 1000 Monte Carlo (MC) simulation runs obtained by

varying all parameters simultaneously and assuming

a uniform distribution. Sensitivity indices were calculated for

thirty modelling outputs for which data were available

(Table 2) and neglected parameters were characterised by

a sensitivity index lower than 0.015 (highlighted in Table 1).
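The sampling scheme can be sketched as follows, using only the Python standard library. The three variation ranges are the Montelepre ranges of K2, rd2 and YH reported in Table 1; the function name `sample_parameter_sets` and the fixed seed are assumptions of the sketch, and the model runs and the sensitivity indices of Eqs. (1)-(2) are not reproduced here.

```python
import random

# Variation ranges of three Montelepre parameters, as reported in Table 1.
ranges = {"K2": (15.0, 35.0), "rd2": (2.0, 4.0), "YH": (0.38, 0.75)}


def sample_parameter_sets(ranges, n_runs, seed=0):
    """Draw n_runs parameter sets, varying all parameters simultaneously.

    Each parameter is sampled from a uniform distribution over its range.
    """
    rng = random.Random(seed)  # fixed seed so that the sketch is repeatable
    return [
        {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}
        for _ in range(n_runs)
    ]


samples = sample_parameter_sets(ranges, n_runs=1000)
print(len(samples), samples[0])
```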

After the first elimination phase based on weighted

sensitivity ranking, the analysis of the Montelepre and Giar-

dinello urban drainage systems was performed in three steps

(SS, CSO and WWTP) separately and then the RWB. Such an

approach was necessary to avoid the construction of FIMs in

which model outputs and parameters are not linked by

a cause-and-effect relationship. This approach, as further

discussed below, also allowed us to understand the contri-

bution of each data source to the identification process.

Regarding the quantity and quality sub-modules, for the sake of simplicity we did not adopt a step-wise procedure such as the one described above for the three sub-models (i.e., first the quantity and thereafter the quality modules).

For each urban drainage system, the analysis started from

the initial subset consisting of the three most sensitive

parameters. All the possible combinations of four parameters

were considered by adding one model parameter to the initial

identifiable set. The FIM was calculated for all the candidate

parameter sets and the identifiability indicators were

computed. The Qmes matrix was assumed to be diagonal and

with determinant equal to one considering that measurement

noise sources are uncorrelated. The best set was selected as

the one providing the highest value of normD, DE and gradE or

the minimum of modE. Therefore, the process was continued

considering all possible combinations of parameters obtained

by adding one additional parameter to the identifiable set; the

parameter providing the best values of the identifiability

indicators was added to the identifiable set and the analysis

was continued adding a parameter at a time until one of the non-identifiability conditions was reached.

The selection of an improper level of complexity in inte-

grated modelling can have significant consequences on model

output uncertainty, and non-identifiable parameters contribute

to such uncertainty without providing any additional contri-

butions in the representation of real processes. Once such

parameters are known, they should be fixed to a default value

(for instance the average of the expected variation range) thus

neutralising the related uncertainty. To assess the impact of

non-identifiable parameters on modelling uncertainty, the

Generalised Likelihood Uncertainty Estimation (GLUE; Beven and Binley, 1992) was applied to the model in two scenarios:

• Considering the variation of all parameters (identifiable and non-identifiable), thus obtaining the total uncertainty related to the model;

• Considering only the identifiable parameters and fixing the others to the average value of the ranges presented in Table 1. In this way, the unavoidable uncertainty can be estimated, i.e., the uncertainty connected to the parameters that can be reliably calibrated.

Table 1 – Variation range of model parameters (Δθj) and average model sensitivities (sj); sensitivities of parameters neglected after the initial sensitivity analysis are marked with *.

Urban drainage systems

| Parameter | Symbol | Unit | Montelepre Δθj | Montelepre sj | Giardinello Δθj | Giardinello sj |
|---|---|---|---|---|---|---|
| Catchment linear channel constant | l | min | 8–30 | 0.188 | 1–10 | 0.221 |
| Initial hydrological abstraction | W0 | mm | 0.1–04 | 0.524 | 0.6–1 | 0.598 |
| Catchment runoff coefficient | F | – | 0.8–09 | 0.540 | 0.6–0.9 | 0.462 |
| Catchment linear reservoir constant | K1 | min | 14–40 | 0.191 | 0.1–65 | 0.197 |
| Sewer linear reservoir constant | K2 | min | 15–35 | 0.472 | 0.1–55 | 0.474 |
| Build-up rate in the Alley-Smith model | Accu | kg/(ha·d) | 0.1–20 | 0.307 | 0.1–20 | 0.284 |
| Decay rate in the Alley-Smith model | Disp | d⁻¹ | 0.01–10 | 0.300 | 0.01–1 | 0.225 |
| Wash-off coefficient in the Alley-Smith model | Arra | mm^(−Wh) h^(Wh−1) | 0.01–0.8 | 0.335 | 0.01–1 | 0.050 |
| Wash-off factor in the Alley-Smith model | Wh | – | 0.3–1 | 0.240 | 0.1–3.5 | 0.437 |
| Sewer erosion factor | M | kg | 0.1–3 | 0.225 | 0.1–3 | 0.341 |
| Sewer suspended load linear reservoir constant | Ksusp | min | 0.2–0.8 | 0.251 | 0.01–0.6 | 0.217 |
| Sewer bed load linear reservoir constant | Kbed | min | 0.04–0.4 | 0.002* | 0.01–1 | 0.004* |
| CSO first dilution factor | rd1 | – | 1.2–1.5 | 0.384 | 1.1–1.9 | 0.013* |
| CSO second dilution factor | rd2 | – | 2–4 | 0.433 | 2–2.5 | 0.441 |
| Max yield coefficient of heterotrophs | μmax,H | h⁻¹ | 0.6–13.2 | 0.081 | 0.6–13.2 | 0.003* |
| BOD semi saturation constant | Ks | g/L | 0.005–0.15 | 0.167 | 0.005–0.15 | 0.029 |
| Yield coefficient heterotrophic | YH | – | 0.38–0.75 | 0.225 | 0.38–0.75 | 0.032 |
| Temperature | T | °C | 5–30 | 0.130 | 5–30 | 0.014* |
| Max yield coefficient of autotrophs | μmax,A | h⁻¹ | 0.2–0.4 | 0.118 | 0.2–0.4 | 0.042 |
| Oxygen half saturation constant | ko | g/L | 0.1–0.3 | 0.001* | 0.1–0.3 | 0.002* |
| Yield coefficient autotrophic | YN | – | 0.16–0.18 | 0.428 | 0.16–0.18 | 0.226 |
| Decay velocity of heterotrophs | bH | d⁻¹ | 0.2–0.8 | 0.030 | 0.2–0.8 | 0.002* |
| Decay velocity of autotrophs | bA | d⁻¹ | 0.2–0.8 | 0.011* | 0.2–0.8 | 0.012* |

RWB

| Parameter | Symbol | Unit | Δθj | sj |
|---|---|---|---|---|
| River bed roughness (Gauckler–Strickler) | ks | m^(1/3)/s | 10–70 | 0.566 |
| Longitudinal dispersion coefficient | Kdisp | m²/s | 1–500 | 0.001* |
| De-oxygenation coefficient | kd | s⁻¹ | 1–100 | 0.047 |
| Sediment oxygen demand coefficient | ksod | s⁻¹ | 1–100 | 0.351 |
| Re-aeration coefficient | ka | s⁻¹ | 1–1000 | 0.894 |

Table 2 – Monitored system variables available for the identifiability analysis with the number of data points available for each of them.

| System | Location | Q | TSS | BOD | COD | NH4 | DO |
|---|---|---|---|---|---|---|---|
| Montelepre | SS | 130 | 24 | 24 | 24 | 24 | a |
| Montelepre | CSO | 316 | 19 | 19 | 19 | 19 | a |
| Montelepre | WWTP | a | 14 | 14 | a | 14 | a |
| Giardinello | SS | 314 | 20 | 20 | 20 | 20 | a |
| Giardinello | CSO | 314 | 15 | 15 | 15 | 15 | a |
| Giardinello | WWTP | a | 15 | 15 | 15 | 15 | a |
| RWB | | 118 | a | 22 | a | a | 22 |

a Data not used in the present model application.

In both cases, the uncertainty bands were obtained by running 10,000 behavioural MC simulations, assuming that variable model parameters were uniformly distributed in the ranges presented in Table 1. According to the classical application of GLUE, the Nash–Sutcliffe criterion (Nash and Sutcliffe, 1970) was used as the likelihood measure, with an acceptability threshold equal to zero for the selection of behavioural and non-behavioural simulation runs. The uncertainty bands were computed as the 5% and 95% percentiles of the likelihood distribution. For brevity’s sake, the application details of the uncertainty analysis are not reported in the present paper; they can be found in previous literature (Freni et al., 2009b,c, 2008b).
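The GLUE steps summarised above can be sketched with the standard library only. This is an illustrative sketch, not the implementation used in the study: weighting each behavioural run by its Nash–Sutcliffe efficiency is one common GLUE choice and is an assumption here, as are the function names and the numerical values.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency of one simulation run."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def weighted_quantile(values, weights, q):
    """Value at which the cumulative normalised weight first reaches q."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight / total
        if cumulative >= q:
            return value
    return pairs[-1][0]


def glue_bands(observed, runs, threshold=0.0, lower=0.05, upper=0.95):
    """Return (lower, upper) band values per time step from behavioural runs."""
    scored = [(nash_sutcliffe(observed, r), r) for r in runs]
    # Runs with efficiency not above the threshold are non-behavioural.
    behavioural = [(e, r) for e, r in scored if e > threshold]
    weights = [e for e, _ in behavioural]
    bands = []
    for t in range(len(observed)):
        values = [r[t] for _, r in behavioural]
        bands.append((weighted_quantile(values, weights, lower),
                      weighted_quantile(values, weights, upper)))
    return bands


# Hypothetical observations and three simulation runs (the last one is poor).
observed = [1.0, 2.0, 3.0, 4.0]
runs = [[1.1, 2.1, 2.9, 4.2], [0.8, 1.9, 3.2, 3.9], [4.0, 1.0, 0.5, 0.0]]
bands = glue_bands(observed, runs)
print(bands)
```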

3. Analysis of results

The results of the initial weighted sensitivity analysis are

presented in Table 1: eleven parameters (all regarding water

quality aspects) demonstrated sensitivity indices lower than

the threshold and so were neglected in the following part of

the study (being non-identifiable by the first non-identifi-

ability criterion). They were mainly related to WWTP

processes and to the Giardinello urban area. This fact could be

due to several factors such as the lower quality of the Giar-

dinello data, higher uncertainty in the identification of

parameter values, or the lower relevance of the Giardinello

catchment in determining the quality state of the RWB, thus

reducing the related sensitivity indices.

According to the initial analysis, six parameters provided

higher weighted sensitivities and they were used as initial

parameter subsets for the identifiability analysis. More


specifically, the most sensitive parameters were the initial

hydrological abstraction (W0), the catchment runoff coeffi-

cient (F) and the sewer linear-reservoir constant (K2) for both

urban drainage systems. Starting from these parameters

the identifiability analysis was carried out according to the

procedure described above. All possible combinations of four

parameters including the initial identifiable set were analysed

selecting the one providing the best values of the identifiability

indices. The process was continued considering sets with

progressively increasing number of parameters, always add-

ing the parameter that provided the best value of the indices to

the previously identifiable set. The best parameter combina-

tions obtained in all performed iterations (increasing the

number of parameters in the set) along with computed identifiability indices are reported in Tables 3 and 4. The normD and DE criteria showed similar results in assessing the same number of identifiable parameters (Fig. 4): eleven parameters for the Montelepre urban drainage system and nine for Giardinello. The DE criterion showed a flat area near the maximum, making its assessment quite difficult. This was due to the rapid increase of modE that may have masked the increase of normD; thus, even if the position of the maximum was preserved in the analysed case, this condition may lead to an incorrect estimation of the identifiable parameter set.

The modE criterion showed some limitations because it grew constantly with the dimensions of the parameter subsets. The modE criterion is characterised by a jump once the number of identifiable parameters is reached, but this jump is hardly visible in Figures 4a and c and an objective criterion is not easily assessable, thus making this criterion difficult to apply for inexperienced analysts. The gradE index was consistent with the others in the determination of the identifiable number of parameters and it eliminated the subjectivity of the modE criterion. All criteria agree in the composition of the identifiable parameter subsets, which are presented in bold type in Tables 3 and 4. According to the simulation results, the following conclusions may be drawn:

• The first identifiable parameters (i.e., W0, F, and K2) are all connected with water-quantity modules, demonstrating the greater importance of such parameters, which affect both water-quantity and water-quality modelling outputs; these parameters deeply influence the volume and the shape of the sewer hydrograph, thus affecting the behaviour of all the downstream sub-models; this effect is also due to the higher availability of water quantity data with respect to the water quality ones;

• A group of seven parameters (mostly connected with water quantity sub-models) is identifiable in both urban areas, demonstrating their importance in the integrated model; the water quality parameters in this group mainly affect the accumulation of pollutants in the sewer and on the catchment, thus indicating that such a process significantly affects water quality in all model sub-systems;

• Conversely, parameters related to water quality processes in the sewers are scarcely identifiable, thus showing that either they are not relevant or their impact cannot be separated from that of other water quality parameters according to the available field data; the second possibility is probably the most reliable because water quality at the end of the sewer pipe (where the monitoring station is located) is surely affected by two accumulation/wash-off processes (one taking place on the catchment and the other in the sewer pipe) that are not separable unless a specific campaign is carried out for monitoring water quality at the sewer inlets;

• Most of the WWTP parameters were non-identifiable (by the second non-identifiability criterion); this behaviour can be explained by their lower variability and by the lower number of affected modelling outputs; many model parameters interact in the same equations so that the variation of one of them may be compensated by the others;

• From a practical point of view, the previous comment should probably lead to a simplification of the WWTP sub-model because it is too complex with respect to the available data; more interestingly, the analysis should lead to a deeper field investigation of the WWTP by including additional intermediate monitoring stations in order to identify more parameters;

• The number of identifiable parameters in the Giardinello urban drainage system remained lower than in the Montelepre system, confirming the initial differences obtained in the preliminary sensitivity analysis; this difference may be related to the different dimensions and characteristics of the two urban areas (with different ratios between dry and wet-weather flows), thus leading to a different relevance of stormwater polluting processes. Giardinello is in fact characterised by lower dry-weather flows and higher polluting concentrations, making the first flush phenomenon less evident than in the Montelepre catchment, thus reducing the sensitivity of wet-weather related parameters and their identifiability;

• The Nash–Sutcliffe calibration efficiencies (Nash and Sutcliffe, 1970) were approximately 0.85 in the Montelepre urban drainage system and lower than 0.6 in the Giardinello system (Freni et al., 2010a, 2008a), thus demonstrating that less information can be derived from the available data.

Table 3 – Best identifiable model parameter subsets for Montelepre urban drainage systems (SS, CSO and WWTP): the largest identifiable parameter set is indicated in italic; the parameter added at each analysis step is underlined.

| N | Parameters | normD | modE | DE | gradE |
|---|---|---|---|---|---|
| 3 | W0, F, K2 | 5.99E+07 | 6.1054 | 9.81E+06 | 1.6 |
| 4 | W0, F, K2, rd2 | 1.31E+11 | 9.76847 | 1.34E+10 | 1.369 |
| 5 | W0, F, K2, rd2, Accu | 2.3E+14 | 13.3731 | 1.70E+13 | 1.365 |
| 6 | W0, F, K2, rd2, Accu, l | 4.7E+15 | 18.2585 | 2.57E+14 | 1.603 |
| 7 | W0, F, K2, rd2, Accu, l, K1 | 4.3E+18 | 29.2693 | 1.47E+17 | 1.261 |
| 8 | W0, F, K2, rd2, Accu, l, K1, M | 3.6E+19 | 36.9032 | 9.83E+17 | 1.763 |
| 9 | W0, F, K2, rd2, Accu, l, K1, M, YN | 3.80E+20 | 65.0466 | 5.83E+18 | 2.111 |
| 10 | W0, F, K2, rd2, Accu, l, K1, M, YN, μmax,H | 1.08E+21 | 137.291 | 7.9E+18 | 2.388 |
| 11 | W0, F, K2, rd2, Accu, l, K1, M, YN, μmax,H, Disp | 2.18E+21 | 327.847 | 6.7E+18 | 3.182 |
| 12 | W0, F, K2, rd2, Accu, l, K1, M, YN, μmax,H, Disp, μmax,A | 1.36E+21 | 1043.14 | 1.31E+18 | 1.363 |
| 13 | W0, F, K2, rd2, Accu, l, K1, M, YN, μmax,H, Disp, μmax,A, T | 1.77E+20 | 1421.82 | 1.24E+17 | – |

Table 4 – Best identifiable model parameter subsets for Giardinello urban drainage systems (SS, CSO and WWTP): the largest identifiable parameter set is indicated in italic; the parameter added at each analysis step is underlined.

| N | Parameters | normD | modE | DE | gradE |
|---|---|---|---|---|---|
| 3 | W0, F, K2 | 2.31E+09 | 6.94 | 3.33E+08 | 1.23 |
| 4 | W0, F, K2, rd2 | 3.99E+12 | 8.52 | 4.69E+11 | 1.25 |
| 5 | W0, F, K2, rd2, K1 | 4.61E+15 | 10.66 | 4.33E+14 | 1.38 |
| 6 | W0, F, K2, rd2, K1, Ksusp | 8.23E+17 | 14.73 | 5.58E+16 | 1.42 |
| 7 | W0, F, K2, rd2, K1, Ksusp, YN | 2.14E+20 | 20.95 | 1.02E+19 | 1.62 |
| 8 | W0, F, K2, rd2, K1, Ksusp, YN, Wh | 1.05E+22 | 33.84 | 3.11E+20 | 1.76 |
| 9 | W0, F, K2, rd2, K1, Ksusp, YN, Wh, YH | 6.96E+22 | 59.56 | 1.17E+21 | 1.90 |
| 10 | W0, F, K2, rd2, K1, Ksusp, YN, Wh, YH, Accu | 1.17E+22 | 113.45 | 1.03E+20 | 1.28 |
| 11 | W0, F, K2, rd2, K1, Ksusp, YN, Wh, YH, Accu, l | 1.88E+21 | 145.03 | 1.3E+19 | – |
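The Montelepre values reported in Table 3 can be cross-checked with a few lines of Python. The sketch below transcribes the normD, modE and gradE columns and assumes that Eq. (7) is read as the square root of the product of the two eigenvalue ratios, in which case gradE at step p reduces to modE(p + 1) / modE(p); the variable names are illustrative.

```python
# Values transcribed from Table 3 (Montelepre), N = 3 ... 13.
n_params = list(range(3, 14))
norm_d = [5.99e7, 1.31e11, 2.3e14, 4.7e15, 4.3e18, 3.6e19,
          3.80e20, 1.08e21, 2.18e21, 1.36e21, 1.77e20]
mod_e = [6.1054, 9.76847, 13.3731, 18.2585, 29.2693, 36.9032,
         65.0466, 137.291, 327.847, 1043.14, 1421.82]
grad_e = [1.6, 1.369, 1.365, 1.603, 1.261, 1.763,
          2.111, 2.388, 3.182, 1.363]  # no value is reported for N = 13

# gradE at step p recomputed as the ratio of successive modE values.
recomputed = [mod_e[i + 1] / mod_e[i] for i in range(len(mod_e) - 1)]

# Subset size at the maximum of normD and at the maximum of gradE.
n_best_norm_d = n_params[norm_d.index(max(norm_d))]
n_best_grad_e = n_params[grad_e.index(max(grad_e))]
print(n_best_norm_d, n_best_grad_e)
print([round(r, 3) for r in recomputed])
```

The recomputed ratios agree with the tabulated gradE column to the reported precision, and both normD and gradE peak at the eleven-parameter subset discussed in the text.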

Fig. 4 – Identifiability criteria for the Montelepre urban drainage system (a–b) and Giardinello urban drainage system (c–d). Panels (a) and (c) plot Log(normD) and modE against the number of parameters; panels (b) and (d) plot Log(DE) and gradE against the number of parameters.

Twenty parameters were assessed as identifiable by means of data collected in the SS, CSO and WWTP. To evaluate the impact of additional data sources, the remaining parameters (three parameters for the RWB and fourteen non-identifiable in the previous stage for the two urban drainage systems) were passed through an additional identification step based on available RWB data. The analysis was intended to assess the identifiability of the RWB parameters and to verify if this additional data source would allow for the identification of additional parameters in the upstream submodels. As shown in Table 5, five parameters were assessed as identifiable using the additional data from the RWB. Besides the easily justifiable identification of the initial three RWB parameters (i.e., ks, ka, and ksod), the analysis of these additional data allowed for the identification of two more parameters that were not identifiable by means of the SS, CSO and WWTP data (i.e., rd1 for Montelepre SS and Accu for Giardinello SS). The non-identifiability of one parameter among ksod and kd was to be expected, as they both act on the RWB BOD concentration, once again showing that some processes need specific monitoring campaigns to be assessable. The identification of additional parameters that were not initially identified in the urban drainage system analysis stresses the importance of interactions in the integrated system, which cannot be analysed as the sum of separated compartments.

Table 5 – Additional identifiable model parameter subsets according to RWB data: the largest identifiable parameter set is indicated in italic; the parameter added at each analysis step is underlined. The identifiable parameters.

| N | Montelepre urban drainage system | Giardinello urban drainage system | RWB | normD | modE | DE | gradE |
|---|---|---|---|---|---|---|---|
| 20 (initial condition) | 11 identifiable parameters (Table 3) | 9 identifiable parameters (Table 4) | – | 1.31E+09 | 9.41 | 1.43E+08 | 1.23 |
| 21 | – | – | ks | 2.89E+10 | 13.02 | 2.22E+09 | 1.23 |
| 22 | – | – | ks, ka | 8.08E+10 | 28.84 | 2.80E+09 | 1.25 |
| 23 | – | – | ks, ka, ksod | 1.28E+11 | 32.35 | 3.95E+09 | 1.34 |
| 24 | rd1 | – | ks, ka, ksod | 2.69E+11 | 37.18 | 7.23E+09 | 1.58 |
| 25 | rd1 | Accu | ks, ka, ksod | 4.00E+11 | 42.35 | 9.44E+09 | 1.80 |
| 26 | rd1, Wh | Accu | ks, ka, ksod | 9.12E+10 | 123.10 | 7.41E+08 | 1.12 |
| 27 | rd1, Wh | Accu, Disp | ks, ka, ksod | 7.31E+09 | 137.33 | 5.32E+07 | |

The analysis of RWB identifiability criteria confirmed the

good agreement of all adopted indices and the limitations due

to the flatness of the DE and the subjective identification of

jumps in modE (Fig. 5).

This additional step in the identifiability analysis showed

the impact that a coordinated monitoring campaign can have

on the robustness of the model application. From a qualitative

point of view, it would be expected that a larger dataset may

satisfy more complex models; the identifiability analysis

provides a quantitative response to this consideration by

providing the number of parameters (i.e., indirectly providing

the proper model complexity) that can be identified with the

available dataset and it can suggest an appropriate increase of

the number of model parameters effectively assessable when

new data become available.

Fig. 5 – Identifiability criteria for the RWB. Panel (a) plots Log(normD) and modE against the number of parameters; panel (b) plots Log(DE) and gradE against the number of parameters.

Once the non-identifiable parameters were found, the application of uncertainty analysis allowed us to assess the impact of these parameters. The uncertainty bands obtained by varying all the model parameters (i.e., identifiable and non-identifiable parameters) according to the GLUE methodology are displayed in Fig. 6a–c, while Fig. 6d–f shows the uncertainty bands obtained by varying only the twenty-five identifiable parameters (Table 5) and fixing the others to the averages of their initial variation ranges (Table 1). A comparison of the uncertainty bands in Fig. 6 shows that the uncertainty-band width was significantly reduced by neglecting non-identifiable parameters; specifically, the following can be noted:

• Discharge uncertainty bands were reduced by an average of 40% while the impact on water-quality variables was over 60%;

• The higher impact on water-quality uncertainty was connected with the higher number of non-identifiable water-quality parameters that introduced background noise into the uncertainty analysis; and

• These reductions were obtained without losing the validity of the modelling hypotheses, as over 90% of the data points remained within the uncertainty bands.

Fig. 6 – RWB 5th percentile and 95th percentile in terms of discharge, BOD concentration and DO concentration for the total uncertainty [(a), (b), (c)] and for the unavoidable uncertainty [(d), (e), (f)].

The results of the uncertainty analysis are dependent on the specific case study and on the subjective hypotheses adopted in the GLUE application. Nevertheless, the reduction of uncertainty obtained by fixing the non-identifiable parameters demonstrates the importance of identifiability analysis in the application of complex environmental models.
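The two quantities used in this comparison, the reduction of the band width and the share of observations falling inside the bands, can be computed as in the sketch below. The function names and all numerical values are hypothetical and serve only to illustrate the calculation; they are not the data of Fig. 6.

```python
def mean_band_width(bands):
    """Average distance between the upper and lower band values."""
    return sum(upper - lower for lower, upper in bands) / len(bands)


def coverage(observed, bands):
    """Fraction of observations lying within the uncertainty band."""
    inside = sum(1 for obs, (lower, upper) in zip(observed, bands)
                 if lower <= obs <= upper)
    return inside / len(observed)


# Hypothetical (lower, upper) bands for four time steps.
observed = [1.0, 2.0, 3.0, 4.0]
total_bands = [(0.0, 2.0), (1.0, 3.0), (1.5, 4.5), (2.0, 6.0)]      # all parameters varied
reduced_bands = [(0.5, 1.5), (1.5, 2.5), (2.5, 3.7), (3.2, 4.8)]    # identifiable ones only

reduction = 1.0 - mean_band_width(reduced_bands) / mean_band_width(total_bands)
print(round(reduction, 3), coverage(observed, reduced_bands))
```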

4. Conclusions

The present study applied a parameter identifiability analysis

to a complex integrated urban drainage model. We proposed

the use of identifiability analysis as a tool for assessing the

appropriate model complexity to employ for a specific appli-

cation. In the process, several published identifiability criteria

were applied and a new one was proposed for integrating the

simplicity of the indices based on FIM eigenvalues and the

objectivity of those based on the FIM determinant.

The results led to several interesting observations:

• The normD and DE criteria were unambiguous in the definition of identifiable parameters but DE was characterised by flatness near the maximum, making the assessment of the number of identifiable parameters quite difficult;

• The modE criterion showed some limitations in the definition of identifiable parameters due to its subjectivity; in the presented applications, modE was always consistent with the criteria based on the FIM determinant but inexperienced analysts may misinterpret secondary modE jumps as the consequence of the introduction of a non-identifiable parameter in the analysis; and

• The gradE criterion solved such subjectivity problems because the number of identifiable parameters is given by the absolute maximum of the function and it maintained the simplicity of identifiability criteria based on eigenvalues estimation.

The analysis showed some critical points in integrated

urban drainage modelling, such as the interaction between

water quality processes on the catchment and in the sewer,

that can prevent the identifiability of some of the related

parameters. Similar cases may be found in the WWTPs, considering the different processes affecting pollutant concentrations, or in the RWB, considering, as an example, sediment

oxygen demand and the de-oxygenation coefficient. These

identifiability issues may be solved either by simplifying the

model or by carrying out specific field campaigns including

intermediate monitoring stations.

Uncertainty analysis carried out according to the GLUE

methodology confirmed the effectiveness of the identifiability

analysis in selecting the correct model complexity. Indeed,

a reduction of the uncertainty in terms of uncertainty band-

width was shown by fixing the non-identifiable model

parameters.

As a general conclusion, practical identifiability can be used for guiding the analyst in the selection of the right modelling detail level for a specific application and it is flexible enough to be reapplied each time new data sources become available, allowing for modular model complexity adaptable to data availability and minimising “avoidable uncertainty” (i.e., the uncertainty due to the unnecessary complexity of the applied models).

The results obtained herein are obviously dependent on

the specific case study employed here. Considerations of the

advantages provided by identifiability analysis may be

generalised, especially with respect to integrated modelling

simplification and results reliability. Further research may

involve the effect of data availability with respect to param-

eter identification and the improvements provided by the

introduction of new measuring stations in the system.

Acknowledgements

The authors wish to thank Mrs R. D’Addelfio and Dr. A. P. Lanza for their valuable assistance during fieldwork. The authors would also like to thank the Editor and the two anonymous reviewers for very helpful and constructive comments that resulted in a much improved manuscript.

References

Alley, W.M., Smith, P.E., 1981. Estimation of accumulation parameters for urban runoff quality modelling. Water Resources Research 17 (6), 1657–1664.

APHA, 1995. Standard Methods for Examination of Water and Wastewater. APHA, AWWA and WPCF, Washington DC, USA.

Belsley, D.A., 1991. Conditioning Diagnostics – Collinearity and Weak Data in Regression. Wiley, New York.

Beven, K.J., Binley, A.M., 1992. The future of distributed models - model calibration and uncertainty prediction. Hydrological Processes 6 (3), 279–298.

Brown, L.C., Barnwell, T.O., 1987. The Enhanced Stream Water Quality Models QUAL2E and QUAL2E-UNCAS: Documentation and User Manual. USEPA/6003–87/007. USEPA, USA.

Brun, R., Kuhni, M., Siegrist, H., Gujer, W., Reichert, P., 2002. Practical identifiability of ASM2d parameters and systematic selection and tuning of parameter subsets. Water Research 36 (16), 4113–4127.

Brun, R., Reichert, P., Kunsch, H.R., 2001. Practical identifiability analysis of large environmental simulation models. Water Resources Research 37 (4), 1015–1030.

Campolongo, F., Cariboni, J., Saltelli, A., 2007. An effective screening design for sensitivity analysis of large models. Environmental Modelling & Software 22 (10), 1509–1518.

Candela, A., Freni, G., Mannina, G., Viviani, G., 2009. Quantification of diffuse and concentrated pollutant loads at the watershed-scale: an Italian case study. Water Science & Technology 59 (11), 2125–2135.

Chapra, S.C., 1997. Surface Water-Quality Modelling. McGraw-Hill Science/Engineering/Math.

Checchi, N., Marsili-Libelli, S., 2005. Reliability of parameter estimation in respirometric models. Water Research 39 (15), 3686–3696.

De Pauw, D.J.W., 2005. Optimal Experimental Design for Calibration of Bio-process Models: A Validated Software Toolbox. PhD thesis in Applied Biological Sciences, BIOMATH, University of Gent. Available from: <http://biomath.ugent.be/publications/download/>.

De Pauw, D.J.W., Sin, G., Insel, G., Van Hulle, S.W.H., Vandenberghe, V., Vanrolleghem, P., 2004. Discussion of: assessing parameter identifiability of activated sludge model number 1. Journal of Environmental Engineering 130 (1), 111–112.

Dochain, D., Vanrolleghem, P.A., 2001. Dynamical Modelling and Estimation in Wastewater Treatment Processes. IWA Publishing, London.

European Commission, 2000. Directive 2000/60/EC of the European Parliament and of the Council establishing a framework for the Community action in the field of water policy.

Freni, G., Mannina, G., Viviani, G., 2008a. Catchment-scale modelling approach for a holistic urban water quality management. Proc. 11 ICUD conference, Edinburgh (Scotland – UK), 31 August – 5 September 2008.

Freni, G., Mannina, G., Viviani, G., 2008b. Uncertainty in urban stormwater quality modelling: the effect of acceptability threshold in the GLUE methodology. Water Research 42 (8–9), 2061–2072.

Freni, G., Mannina, G., Viviani, G., 2009a. Identifiability analysis for receiving water body quality modelling. Environmental Modelling & Software 24 (1), 54–62.

Freni, G., Mannina, G., Viviani, G., 2009b. Uncertainty assessment of an integrated urban drainage model. Journal of Hydrology 373 (3–4), 392–404.

Freni, G., Mannina, G., Viviani, G., 2009c. Urban runoff modelling uncertainty: comparison among Bayesian and pseudo-Bayesian methods. Environmental Modelling & Software 24 (9), 1100–1111.

Freni, G., Mannina, G., Viviani, G., 2010a. Urban water quality modelling: a parsimonious holistic approach for a complex real case study. Water Science & Technology 61 (2), 521–536.

Freni, G., Mannina, G., Viviani, G., 2010b. Urban stormwaterquality management: centralized versus source control.Journal of Water Resources Planning and Management e Asce136 (2), 268e278.

Gatelli, D., Kucherenko, S., Ratto, M., Tarantola, S., 2009.Calculating first-order sensitivity measures: a benchmark ofsome recent methodologies. Reliability Engineering & SystemSafety 94 (4), 1212e1219. 2009.

Holmberg, A., 1982. On the practical identifiability of microbialgrowth models incorporating MichaeliseMenten typenonlinearities. Mathematical Biosciences 62 (1), 23e43.

Jakeman, A.J., Hornberger, G.M., 1993. How much complexity iswarranted in a rainfall-runoff model? Water ResourceResearch 29 (8), 2637e2649.

Jewell, T.K., Adrian, D.D., 1978. SWMM storm water pollutantwashoff function. Journal of the Environmental EngineeringDivision 104 (5), 1036e1040.

Kuczera, G., Parent, E., 1998. Monte Carlo assessment ofparameter inference in catchments models: the Metropolisalgorithm. Journal of Hydrology 211 (1e4), 69e85.

Machado, V.C., Tapia, G., Gabriel, D., Lafuente, J., Baeza, J.A., 2009.Systematic identifiability study based on the FisherInformation Matrix for reducing the number of parameterscalibration of an activated sludge model. EnvironmentalModelling and Software 24 (11), 1274e1284.

Malve, O., Laine, M., Haario, H., Kirkkala, T., Sarvala, J., 2007.Bayesian modelling of algal mass occurrences using adaptiveMCMC methods with a lake water quality model.Environmental Modelling and Software 22 (7), 966e977.

Mannina, G., Viviani, G., 2010a. An urban drainage stormwaterquality model: model development and uncertaintyquantification. Journal of Hydrology 381 (3e4), 248e265.

Mannina, G., Viviani, G., 2010b. A parsimonious dynamic modelfor river water quality assessment. Water Science & Techology61 (3), 607e618.

Mannina, G., Viviani, G., 2010c. A hydrodynamic water qualitymodel for propagation of pollutants in rivers. Water Science &Technology 62 (2), 288e299.

Mannina, G., Viviani, G., 2009. Separate and combined sewersystems: a long-term modelling approach. Water Science &Technology 60 (3), 555e565.

Mannina, G. (2005). Integrated urban drainage modelling withuncertainty for stormwater pollution management. PhDthesis, Universita di Catania, (Italy).

Mannina, G., Freni, G., Viviani, G., 2004. Modelling the integratedurban drainage systems. In: Bertrand-Krajewski, L.,Almeida, M., Matos, J., Abdul-Talib, S. (Eds.), Sewer Networksand Processes within Urban Water Systems (WEMSno.).IWA Publishing, London, UK, pp. 3e12.

Marsili-Libelli, S., Giusti, E., 2008. Water quality modelling forsmall river basins. Environmental Modelling and Software 23(4), 451e463.

Metcalf and Eddy, Inc, 2003. Wastewater Engineering: Treatmentand Reuse, fourth ed. McGraw Hill, New York.

Morris, M.D., 1991. Factorial sampling plans for preliminarycomputational experiments. Technometrics 33, 161e174.

Nash, J.E., Sutcliffe, J.V., 1970. River flow forecasting throughconceptual models. Journal of Hydrology 10 (3), 282e290.

Parchure, T.M., Mehta, A.J., 1985. Erosion of soft cohesivesediment deposits. Journal of Hydrology 111 (10), 1308e1326.

Rauch, W., Bertrand-Krajewski, J.-L., Krebs, P., Mark, O.,Schilling, W., Schuetze, M., Vanrolleghem, P.A., 2002.Deterministic modelling of integrated urban drainagesystems. Water Science & Technology 45 (3), 81e94.

Reichert, P., Vanrolleghem, P.A., 2001. Identifiability anduncertainty analysis of the River water quality model No. 1(RWQM1). Water Science & Technology 43 (7), 329e338.

Saltelli, A., Ratto,M., Tarantola, S., Campolongo, F., 2006. Sensitivityanalysis practices: strategies for model-based inference.Reliability Engineering & System Safety 91 (10e11), 1109e1125.

Saltelli, A., Campolongo, F., Cariboni, A., 2009. Screeningimportant inputs in models with strong interaction properties.Reliability Engineering & System Safety 94 (7), 1149e1155. 2009.

Soderstrom, T., Stoica, P., 1989. System Identification. Prentice-Hall, Englewood Cliffs: New Jersey.

Takacs, I., Patry, G.G., Nolasco, D., 1991. A dynamic model of theclarification-thickening process. Water Resource 25 (10),1263e1271.

Wagener, T., Kollat, J., 2007. Numerical and visual evaluation ofhydrological and environmental models using the MonteCarlo analysis toolbox. Environmental Modelling and Software22 (7), 1021e1033.

Weijers, S.R., Vanrolleghem, P.A., 1997. A procedure for selectingbest identifiable parameters in calibrating activated sludgemodel no. 1 to full-scale plant data. Water Science &Technology 36 (5), 69e79.