ANN-based interval forecasting of streamflow discharges using the LUBE method and MOFIPS



Riccardo Taormina, Kwok-Wing Chau*

Department of Civil and Environmental Engineering, Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong

Article info

Article history: Received 5 September 2014; Received in revised form 28 April 2015; Accepted 26 July 2015

Keywords: MOFIPS; PSO; Prediction interval; LUBE; Neural networks; Streamflow prediction

Abstract

The estimation of prediction intervals (PIs) is a major issue limiting the use of Artificial Neural Network (ANN) solutions for operational streamflow forecasting. Recently, a Lower Upper Bound Estimation (LUBE) method has been proposed that outperforms traditional techniques for ANN-based PI estimation. This method constructs ANNs with two output neurons that directly approximate the lower and upper bounds of the PIs. The training is performed by minimizing a coverage width-based criterion (CWC), which is a compound, highly nonlinear and discontinuous function. In this work, we test the suitability of the LUBE approach in producing PIs at different confidence levels (CL) for the 6 h ahead streamflow discharges of the Susquehanna and Nehalem Rivers, US. Due to the success of Particle Swarm Optimization (PSO) in LUBE applications, variants of this algorithm have been employed for CWC minimization. The results obtained are found to vary substantially depending on the chosen PSO paradigm. While the returned PIs are poor when single-objective swarm optimization is employed, substantial improvements are recorded when a multi-objective framework is considered for ANN development. In particular, the Multi-Objective Fully Informed Particle Swarm (MOFIPS) optimization algorithm is found to return valid PIs for both rivers and for the three CLs considered of 90%, 95% and 99%. With average PI widths ranging from a minimum of 7% to a maximum of 15% of the range of the streamflow data in the test datasets, MOFIPS-based LUBE represents a viable option for the straightforward design of more reliable interval-based streamflow forecasting models.

© 2015 Elsevier Ltd. All rights reserved.

1. Introduction

Artificial Neural Networks (ANNs) have been widely used as a nonlinear regression tool for the prediction of water resource variables in different hydrological contexts (Daliakopoulos et al., 2005; Kişi, 2013; Maier et al., 2010; Nourani et al., 2009; Taormina et al., 2012; Wu and Chau, 2013). Most of the studies concerning the use of ANNs for hydrological modeling pertain to a field now known as Neural Network River Forecasting (NNRF), which includes applications in rainfall-runoff modeling and the prediction of future streamflow discharges and water levels (Abrahart et al., 2012). Despite the number of successful applications reported in the scientific literature, NNRF solutions still struggle to move from research grade to operational grade due to a number of unresolved issues. A major obstacle is represented by the difficulties in estimating the uncertainty of the predicted values produced by an NNRF model (Maier and Dandy, 2000). Indeed, the vast majority of NNRF applications so far have been concerned only with providing point forecasts of the modeled hydrological variable, without estimating the degree of confidence associated with them. This is peculiar, since hydrological forecasts can be employed for water resource management and natural hazard prevention only if a measure of their reliability is attached to each predicted value (Krzysztofowicz, 2001). The uncertainty characterizing NNRF point forecasts can be addressed by resorting to interval forecasts, or prediction intervals (PIs). A PI is a range of values in which the realization of a predicted random variable is expected to fall with a predefined coverage probability, known as the confidence level (CL). The width of the interval conveys information regarding the uncertainty of the forecast, so that for a given coverage probability, narrower widths entail higher accuracy. PIs are similar to confidence intervals (CIs), with the distinction that CIs are associated with the uncertainty in the prediction of an unknown but fixed value, whereas PIs are assigned to a random variable yet to be observed (De Gooijer and Hyndman, 2006). Since they also account for model misspecification and noise variance, by definition PIs enclose CIs of corresponding CLs. Although deterministic forecasts dominate the field of NNRF, there have been a few noteworthy applications of PI-based streamflow forecasting. The preferred methods involve the use of Bayesian neural networks (BNNs), resampling and ensemble techniques, as well as experiments that involve fuzzy theory. BNNs (Khan and Coulibaly, 2006; Kingston et al., 2005; Zhang et al., 2009) are able to return error


http://dx.doi.org/10.1016/j.engappai.2015.07.019
0952-1976/© 2015 Elsevier Ltd. All rights reserved.

* Corresponding author. E-mail address: [email protected] (K.-W. Chau).

Engineering Applications of Artificial Intelligence 45 (2015) 429–440

bars along with their predictions, and have strong probabilistic theory backing them up. However, they require the computation of the Hessian matrix at each iteration, which in turn causes singularity problems that may harm the quality of the PIs, as well as heavy computational costs. Bootstrapping and ensemble modeling (Dawson et al., 2002; Sharma and Tiwari, 2009; Tiwari and Chatterjee, 2010) are also very time consuming, as the number of ANN models to be trained has to be large in order to avoid biased estimates of the total error variance. Computational costs can be abated with the use of fuzzy-based techniques such as the Local Uncertainty Estimation Model (LUEM) proposed by Shrestha and Solomatine (2006), which is based on fuzzy c-means clustering, or the approach of Alvisi and Franchini (2011), which employs fuzzy ANNs. Despite their differences, the methods used in hydrology to estimate ANN-based PIs usually require complex implementations, which may have prevented their widespread application. Therefore, it is likely that resorting to less complicated, faster and yet effective techniques may favor a shift from deterministic NNRF models to more reliable solutions based on prediction intervals. Most importantly, all the aforementioned techniques share a common methodological weakness, since they build the PIs indirectly from deterministic point predictions. Indeed, it would be more appropriate to generate the PIs directly through a mechanism that considers both coverage probability and interval width criteria. This is the main premise that led to the development of the Lower Upper Bound Estimation (LUBE) method proposed by Khosravi et al. (2011b) to generate ANN-based PIs. This technique was found to outperform classic PI construction techniques such as the delta method, Bayesian methods, and bootstrapping on both synthetic experiments and real-world regression problems (Khosravi et al., 2011b; Quan et al., 2014a, 2014b, 2014c).
The LUBE method constructs an ANN with two output neurons that directly approximate the lower and upper bounds of the PIs. The training is performed by minimizing a coverage width-based criterion (CWC), which is a compound PI-based objective function accounting for both coverage probability and interval width. The CWC is a highly nonlinear and discontinuous function that requires global optimization techniques for its minimization. In particular, the Particle Swarm Optimization (PSO) algorithm has proven very efficient in generating high quality PIs (Quan et al., 2014b).

The main objective of this study is to test the suitability of PSO-based LUBE as a straightforward approach to predict streamflow discharges with uncertainty. This is done by producing 90%, 95% and 99% CL PIs for the 6 h ahead streamflow discharge of the Susquehanna and Nehalem rivers, US. In addition, this work will assess whether PSO-based LUBE can benefit from a multi-objective formulation, as done here using the Multi-Objective Fully Informed Particle Swarm (MOFIPS) optimization algorithm (Taormina and Chau, 2015). In the study introducing the algorithm, it was demonstrated that MOFIPS-trained ANNs substantially outperform ANNs developed using single-objective swarm optimization. The authors expect that similar improvements can also be obtained for NNRF models producing interval-based predictions, and this hypothesis will be assessed by comparing the performances of MOFIPS-based LUBE models against those obtained with single-objective PSO and single-objective Fully Informed Particle Swarm (FIPS) optimization. The paper is structured as follows: Section 2 will discuss the LUBE method as well as the swarm optimization algorithms employed in this study. Section 3 will introduce the case studies and the datasets employed for the application. Section 4 will discuss the results of these experiments and present a thorough comparison of the LUBE PIs obtained with each of the considered swarm optimization techniques. Conclusions are given in the last section of the manuscript.

2. Methods

2.1. Prediction intervals

Prediction intervals (PIs) are random intervals constructed from historical observations that enclose a future observation within a certain range with a given probability, known as the confidence level (CL). If l and u are the lower and upper bounds delimiting the PI, and (1 - \alpha)\% is the CL attached to it, for a future unknown observation y_{t+1} of the predicted variable Y we can write PI = [l, u] such that \Pr(l < y_{t+1} < u) = 1 - \alpha. Unlike deterministic forecasts, PIs carry information about the accuracy of the prediction, a fundamental requirement for planning, risk assessment and decision making. For valid PIs at a given confidence level, narrower intervals should of course be preferred, since they entail less uncertainty and convey more information. Despite the superiority of PIs over point predictions, the latter constitute by far the most common approach employed when forecasting water resource variables, especially when neural networks are chosen as the modeling tool for the hydrological process.

Fig. 1. LUBE Neural Network model.

2.2. The Lower Upper Bound Estimation method and PI evaluation indices

The Lower Upper Bound Estimation (LUBE) method is a straightforward and efficient technique to produce high quality PIs for ANNs. LUBE was found to outperform classic PI construction techniques such as the delta method, Bayesian methods, and bootstrapping on both synthetic experiments and real-world regression problems (Khosravi et al., 2011b; Quan et al., 2014a, 2014b, 2014c). The LUBE method constructs an ANN with two output neurons that directly approximate the lower and upper bounds of the PIs, as shown in Fig. 1. Each actual value of the predicted variable will be enclosed with a (1 - \alpha)\% probability in the interval between the two ANN outputs. This ANN is trained using historical observations of both the predicted variable and a set of relevant inputs. The training is performed by minimizing a Coverage Width-based Criterion (CWC), which is a compound PI-based objective function that accounts for both coverage probability and interval width (Khosravi et al., 2011a). The CWC is defined as a combination of two indices, namely the Prediction Interval Coverage Probability (PICP) (Khosravi et al., 2010) and the Prediction Interval Normalized Root-mean-square Width (PINRW) (Quan et al., 2014b). The values of these indices, including the CWC, can also be expressed as percentages. PICP measures the percentage of target observations which are enclosed in the intervals, and is defined as follows:

PICP = \frac{1}{n} \sum_{i=1}^{n} c_i    (1)

where n is the number of observations and c_i is equal to one if the observation y_i \in [l_i, u_i] and zero otherwise. The interval [l_i, u_i] is the range of values delimited by the two outputs produced by the ANN (Fig. 1). PINRW provides an estimate of the overall PI width, and is defined as the 2-norm of the width of the PIs:

PINRW = \frac{1}{R} \sqrt{\frac{1}{n} \sum_{i=1}^{n} (u_i - l_i)^2}    (2)
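The two indices translate directly into a few lines of code. The sketch below assumes NumPy arrays of observations and bounds, with `data_range` standing for the range R of the observed variable:

```python
import numpy as np

def picp(y, lower, upper):
    """Prediction Interval Coverage Probability, Eq. (1): the fraction
    of observations enclosed by their prediction intervals."""
    return np.mean((y >= lower) & (y <= upper))

def pinrw(lower, upper, data_range):
    """Prediction Interval Normalized Root-mean-square Width, Eq. (2)."""
    return np.sqrt(np.mean((upper - lower) ** 2)) / data_range

# Toy check: four observations, the first falls outside its interval
y = np.array([1.0, 2.0, 3.0, 4.0])
lo, hi = y - 0.5, y + 0.5
lo[0] = 1.5
assert picp(y, lo, hi) == 0.75
```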

The CWC cost function is a nonlinear combination of (1) and (2) defined as (Quan et al., 2014b):

CWC = PINRW \left(1 + \gamma(PICP)\, e^{-\eta (PICP - \mu)}\right)    (3)

where \gamma(PICP) is set equal to 1 during model calibration, while it becomes a step function of PICP on the test dataset:

\gamma(PICP) = \begin{cases} 0, & \text{if } PICP \geq \mu \\ 1, & \text{if } PICP < \mu \end{cases}    (4)

The value of \gamma(PICP) is forced to 1 during training in order to allow for the construction of more conservative PIs and reduce the risk of violating the CL constraints during testing (Khosravi et al., 2011b). The parameters \eta and \mu are two constants used to define the penalty term controlling the balance between coverage probability and width of the developed PIs. The penalty term is needed to synthesize these two conflicting objectives in a single-objective cost function. In particular, the CWC is designed so that the optimization process will first search for valid PIs for which PICP \geq (1 - \alpha) holds; then the search is refined by gradually giving more importance to the PINRW term in (3), so that narrower PIs are constructed. To ensure the calibration is carried out in this way, \mu is set to (1 - \alpha), while the literature suggests 80 as an appropriate value for \eta (Quan et al., 2014b). For a simpler assessment of the width of the developed PIs, the Prediction Interval Normalized Average Width (PINAW) is usually employed instead of PINRW. PINAW is defined as the ratio of the average width of the prediction intervals to the range R of the observed variable:

PINAW = \frac{1}{nR} \sum_{i=1}^{n} (u_i - l_i)
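Eqs. (3)-(4) and PINAW can be sketched as follows; the \gamma switch between calibration and testing is expressed here through a `training` flag:

```python
import numpy as np

def cwc(pinrw, picp, mu, eta=80.0, training=True):
    """Coverage width-based criterion, Eqs. (3)-(4). During training
    gamma is forced to 1; on test data it is a step function that
    switches the exponential penalty off once PICP >= mu."""
    gamma = 1.0 if training else (0.0 if picp >= mu else 1.0)
    return pinrw * (1.0 + gamma * np.exp(-eta * (picp - mu)))

def pinaw(lower, upper, data_range):
    """Prediction Interval Normalized Average Width."""
    return np.mean(upper - lower) / data_range

# For mu = 0.90 (a 90% CL): a valid test-set PI incurs no penalty,
# while an invalid one is penalized heavily by the exponential term
assert cwc(0.1, picp=0.95, mu=0.90, training=False) == 0.1
assert cwc(0.1, picp=0.80, mu=0.90, training=False) > 10.0
```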

2.3. Swarm optimization

The CWC cost function defined in (3) is a function of the lower and upper bounds produced by the ANN, which in turn are functions of the ANN weights and biases. The optimization is thus performed by finding an optimal set of parameters using a calibration dataset, in total similarity with the training of deterministic ANNs. Unfortunately, the CWC function is highly nonlinear and complex, and thus unsuitable for the fast gradient-based search techniques commonly employed to optimize deterministic ANNs (Daliakopoulos et al., 2005). LUBE solutions must be developed using global optimization methods, and the Particle Swarm Optimization (PSO) algorithm has proven a very successful candidate for this task (Quan et al., 2014b). The PSO technique performs a population-based search which is inspired by natural phenomena such as bird flocking and fish schooling (Kennedy and Eberhart, 1995). Each particle's position in the search space represents a different hypothesis on the solution of the problem being optimized. In the case of PSO-based ANN training, each component of the position array represents a different ANN model parameter. Fitness values are assigned to each position depending on the corresponding value taken by the problem objective function, and better solutions are identified by higher fitness. Particles move from one position to another by adjusting their flight based on the information they share with the swarm, which is usually arranged to form regular topologies such as spheres, rings, pyramids, and multi-dimensional lattices (Mendes et al., 2003). In the original PSO, a particle's velocity is only a function of its historical best position as well as the historical best position found in its neighborhood. A simpler version of the algorithm, in which a particle is informed by the best positions of all its neighbors, has also been proposed (Mendes et al., 2004). This Fully Informed Particle Swarm (FIPS) optimization algorithm was found to outperform canonical PSO on benchmark trials, and it will also be employed in this study as an alternative to the original algorithm for the construction of ANN PIs. Although this work represents the first example of PSO-based interval prediction of streamflow discharge, there have been a few significant applications of PSO as a training algorithm for deterministic NNRF models (Chau, 2007, 2006; Piotrowski and Napiorkowski, 2011). The reader is referred to these works for further information regarding the use of swarm optimization for ANN development in hydrology.
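A minimal sketch of the FIPS velocity update described above follows. The constriction coefficient `chi` and the total acceleration `phi_max` take values common in the PSO literature and are assumptions, not settings reported in this paper:

```python
import numpy as np

def fips_velocity(v, x, neighbor_bests, chi=0.7298, phi_max=4.1, rng=None):
    """One FIPS velocity update: the particle is attracted towards the
    best positions of ALL its neighbors rather than a single best."""
    rng = rng or np.random.default_rng()
    k = len(neighbor_bests)
    acc = np.zeros_like(x)
    for p in neighbor_bests:
        phi = rng.uniform(0.0, phi_max / k, size=x.shape)  # per-neighbor weight
        acc += phi * (p - x)
    return chi * (v + acc)

# A particle at the origin pulled by two hypothetical neighbor bests
rng = np.random.default_rng(1)
v_new = fips_velocity(np.zeros(4), np.zeros(4),
                      [np.ones(4), 2.0 * np.ones(4)], rng=rng)
assert v_new.shape == (4,) and np.all(v_new >= 0.0)
```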

2.4. PSO-based and FIPS-based LUBE for constructing streamflow PIs

In the following paragraphs, the major steps needed for constructing PSO-based LUBE streamflow forecasting models will be discussed according to the general version of the method proposed by Quan et al. (2014b). Although the following steps refer to the PSO algorithm, they are implemented likewise when FIPS is employed.


2.4.1. Dataset creation

Provided the set of hydrological and meteorological inputs has already been identified, all the input-output patterns are first normalized in the [-0.9, +0.9] range to facilitate model training and avoid saturation of the ANN activation functions. The whole dataset is then divided to form a training and a test dataset.
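The normalization step can be implemented as a plain column-wise min-max rescaling (the exact mapping is not spelled out in the text, so treat this as one reasonable realization):

```python
import numpy as np

def normalize(data, lo=-0.9, hi=0.9):
    """Column-wise min-max rescaling of the input-output patterns into
    [-0.9, +0.9], returning the per-column statistics needed to invert
    the mapping afterwards."""
    dmin, dmax = data.min(axis=0), data.max(axis=0)
    scaled = lo + (hi - lo) * (data - dmin) / (dmax - dmin)
    return scaled, (dmin, dmax)

x = np.array([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]])
xs, stats = normalize(x)
assert np.isclose(xs.min(), -0.9) and np.isclose(xs.max(), 0.9)
```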

2.4.2. Finding the optimal model structure and ensuring generalization with k-fold cross-validation

The training dataset is further divided into k subsets to perform k-fold cross-validation. This procedure is carried out in order to find an optimal ANN structure that will prevent over-fitting and maximize performance. ANN models of increasing complexity, i.e. increasing size of the hidden layer, are trained k times, using at each repetition a different subset for validation and the remaining k-1 subsets for model training. The median of the k CWC indices computed for the validation dataset is later used to select the optimal LUBE model.
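The fold bookkeeping above can be sketched as follows; `train_fn` is a hypothetical callable standing in for one full LUBE training run (it receives a train/validation index split and returns the validation CWC):

```python
import numpy as np

def kfold_median_cwc(n_patterns, k, train_fn):
    """Skeleton of the k-fold model-selection loop: the median of the
    k validation CWC values scores one candidate model structure."""
    folds = np.array_split(np.arange(n_patterns), k)
    cwcs = []
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        cwcs.append(train_fn(train, val))
    return np.median(cwcs)

# Dummy training function that just returns the validation fraction
score = kfold_median_cwc(100, k=5, train_fn=lambda tr, va: len(va) / 100)
assert score == 0.2
```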

2.4.3. Training with swarm optimization

The PSO method is used to train an ANN model for each considered model structure and for each repetition entailed by the k-fold cross-validation procedure. The algorithm is first initialized by setting the parameters governing its search, selecting the number of particles in the swarm, and choosing the swarm topology. Particle position and velocity arrays are also initialized before the optimization is started. Each position occupied by the particles during the search process represents a different configuration of synaptic weights and biases of the LUBE ANN model. Particle fitness values are computed based on the CWC of the corresponding LUBE ANN model on the training dataset (CWC_TRAIN). The PSO looks for the position in the search space for which CWC_TRAIN of the corresponding ANN configuration is the overall lowest.

2.4.4. Mutation operator

Due to the complexity of the CWC cost function, it is beneficial to include a mutation operator to boost the PSO search and facilitate local minima escape. In PSO-based LUBE, a Gaussian mutation (Higashi and Iba, 2003; Quan et al., 2014b) is applied after each particle has moved to a new position. If x_{i,j} identifies the i-th component of the j-th particle position array at time t, its new value after mutation will be

x_{i,j} = x_{i,j} + N(0, \sigma(x_{i,j}))

where N(0, \sigma(x_{i,j})) is a random number from a Gaussian distribution with zero mean and a standard deviation which in the original PSO-based LUBE is set to 10% of x_{i,j}. The effects of the mutation operator decrease exponentially with time.
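The mutation step can be sketched as below. The exponential decay schedule is an assumption; the source only states that the mutation's effect decreases exponentially with time:

```python
import numpy as np

def gaussian_mutation(position, t, t_max, rng=None):
    """Gaussian mutation of a particle position: each component is
    perturbed by a zero-mean Gaussian whose standard deviation is 10%
    of the component's value, damped over the iterations."""
    rng = rng or np.random.default_rng()
    decay = np.exp(-5.0 * t / t_max)       # assumed decay schedule
    sigma = 0.1 * np.abs(position)         # 10% of each component
    return position + decay * sigma * rng.standard_normal(position.shape)

rng = np.random.default_rng(2)
x = np.array([1.0, -2.0, 0.5])
x_new = gaussian_mutation(x, t=0, t_max=100, rng=rng)
assert x_new.shape == x.shape and np.all(np.isfinite(x_new))
```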

2.4.5. Training termination and model evaluation

The PSO training is stopped when a maximum number of iterations has been reached, or when the algorithm stalls for a preset number of iterations. When the algorithm exits, a LUBE ANN model is built from the particle with the highest fitness, PIs are estimated for the validation dataset and the CWC (CWC_VAL) is computed. When the CWC_VAL values of all the k subsets of the cross-validation procedure have been obtained, the median value is evaluated and stored for comparison with those of other model structures.

2.4.6. Selection of the optimal model structure and PI construction on the test dataset

When the cross-validated training procedure has been performed for all the considered model structures, the optimal model structure is selected as the one with the lowest median CWC_VAL. The k trained instances of the optimal structure are then used as an ensemble to build the final PIs for the test dataset by averaging their lower bound and upper bound outputs.
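The final averaging step might look like the following (the function name is hypothetical; rows index the k trained instances, columns the test patterns):

```python
import numpy as np

def ensemble_pi(lowers, uppers):
    """Final test-set PI from the k trained instances of the optimal
    structure: the k lower bounds and the k upper bounds are averaged
    point by point."""
    return np.mean(lowers, axis=0), np.mean(uppers, axis=0)

# Two hypothetical trained instances, three test points
lo, hi = ensemble_pi(np.array([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]),
                     np.array([[5.0, 6.0, 7.0], [7.0, 8.0, 9.0]]))
assert np.allclose(lo, [2.0, 3.0, 4.0]) and np.allclose(hi, [6.0, 7.0, 8.0])
```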

2.5. MOFIPS-based LUBE method

2.5.1. The MOFIPS algorithm

Taormina and Chau (2015) have recently shown the advantages of employing a multi-objective approach to deterministic NNRF model development with swarm optimization. They proposed a methodology that treats cross-validated ANN training as a bi-objective problem in which the training dataset is divided into two subsets of similar length. The residual sums of squares of the streamflow predictions on each dataset are then concurrently minimized using the Multi-Objective Fully Informed Particle Swarm (MOFIPS) optimization algorithm. The results reported by the authors showed that MOFIPS-trained ANNs substantially outperformed ANNs developed with single-objective PSO algorithms. Part of this work is therefore dedicated to investigating whether the use of MOFIPS might benefit the development of PI-based ANN forecasting models. MOFIPS is a Pareto-based swarm optimizer that performs its search by including instances of the non-dominated, or Pareto-efficient, solutions in the neighborhood of each particle. These non-dominated solutions act as guides for the entire swarm and form a Pareto-front which is updated at each iteration. When a maximum number of non-dominated solutions is reached, only those with larger crowding distances are retained. The crowding distance (Deb et al., 2002) is a measure of the density of non-dominated solutions in a certain area of the search space, so discarding solutions with small crowding distance will foster diversity in the frontier and improve the search process. A polynomial mutation operator is also included in the algorithm to prevent local minima entrapment and speed up the optimization (Deb, 2009).
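The crowding distance of Deb et al. (2002) referred to above can be computed as follows for a set of non-dominated points (rows = solutions, columns = objectives):

```python
import numpy as np

def crowding_distance(objs):
    """Crowding distance: boundary solutions get infinite distance,
    interior ones the sum over objectives of the normalized gap
    between their two neighbors along that objective."""
    n, m = objs.shape
    dist = np.zeros(n)
    for j in range(m):
        order = np.argsort(objs[:, j])
        span = objs[order[-1], j] - objs[order[0], j]
        dist[order[0]] = dist[order[-1]] = np.inf
        if span == 0:
            continue
        for k in range(1, n - 1):
            dist[order[k]] += (objs[order[k + 1], j] - objs[order[k - 1], j]) / span
    return dist

front = np.array([[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]])  # (CWC1, CWC2) points
d = crowding_distance(front)
assert np.isinf(d[0]) and np.isinf(d[2]) and d[1] == 2.0
```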

2.5.2. Building LUBE models with MOFIPS optimization

A LUBE technique based on MOFIPS is devised in similarity with the bi-objective procedure employed for deterministic ANNs, as shown in Fig. 2. The reader is referred to Taormina and Chau (2015) for further clarifications on the MOFIPS implementation. The whole dataset of candidate inputs and observed output variables is first normalized and then split to form a training and a test dataset. The training dataset is in turn divided into two complementary parts of similar length to form the two subsets of the bi-objective problem. The MOFIPS algorithm is then initialized by selecting its parameters, defining the topology to use, and assigning random starting values to particle positions and velocities. Initial PIs are then estimated by using the starting particle positions to build the LUBE models, CWC values are computed, and an initial Pareto-front is generated. If CWC1 and CWC2 denote the current values of the CWC on the two training subsets, an example of the MOFIPS Pareto-front can be seen in Fig. 3. After the initialization, MOFIPS starts its iterative process to search for better LUBE solutions. The non-dominated positions of the Pareto-front are included in the particles' neighborhoods, and particle velocities are computed based on this augmented set of neighbors. Particles then fly to new positions according to their velocities, and the mutation operator is applied after they have landed. The Pareto-front is then updated by including those mutated positions which are found to be Pareto-efficient with respect to the swarm and the existing frontier. Consequently, previously Pareto-efficient solutions which are now dominated by the newly added solutions are removed from the frontier. If the updated frontier is larger than a predefined maximum size, the set of solutions is trimmed down to this maximum size by discarding those solutions with the lowest


crowding distance. The iterative process is terminated either when a maximum number of iterations has been reached, or when a desired fitness value has been obtained for both objective functions, or when the Pareto-front has not been updated in a given number of iterations.
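The Pareto-front bookkeeping in this loop (dominance test, insertion of a new point, removal of newly dominated members) can be sketched as below; this is the front update only, not the full MOFIPS iteration:

```python
import numpy as np

def dominates(a, b):
    """Pareto dominance for minimization: a dominates b if it is no
    worse on every objective and strictly better on at least one."""
    return bool(np.all(a <= b) and np.any(a < b))

def update_front(front, candidate):
    """Insert a candidate (CWC1, CWC2) point into the Pareto-front:
    reject it if dominated, otherwise add it and drop the members it
    dominates."""
    if any(dominates(p, candidate) for p in front):
        return front
    kept = [p for p in front if not dominates(candidate, p)]
    kept.append(candidate)
    return kept

front = [np.array([1.0, 3.0]), np.array([3.0, 1.0])]
front = update_front(front, np.array([0.5, 0.5]))  # dominates both members
assert len(front) == 1 and np.allclose(front[0], [0.5, 0.5])
```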

Fig. 2. Flowchart of the MOFIPS-based LUBE method.

2.5.3. Selecting optimal MOFIPS-based LUBE solutions

After the optimization process has been completed, an optimal solution should be extracted from the Pareto-front and employed to build streamflow PIs on the test dataset. Theoretically, all the non-dominated solutions forming the final Pareto-front are equally good with respect to the chosen CWC objective functions, meaning that none of them outperforms the others on both parts of the training dataset (Fig. 3). However, practical optimal solutions can be identified based on the final goal of the application, that is, obtaining high quality PIs for the test dataset. A first requirement is that the selected solutions do not overfit either part of the training dataset. This can be reasonably ensured by considering only those solutions which provide valid PIs on both training subsets, thus respecting the constraint PICP \geq (1 - \alpha) for a given confidence level. On top of this, two different criteria based on the PICP and PINRW indices are proposed to select, respectively, a Most Precautionary (MP) and a Narrowest Interval (NI) solution. The MP solution is the point on the CWC1 vs CWC2 Pareto-front characterized by the highest PICP on the training datasets with respect to the medians. This solution is most precautionary in the sense that having the largest coverage probability will increase the odds of future unknown streamflow observations falling within the PIs built by the LUBE model (Khosravi et al., 2011b). We therefore expect the MP solution to have higher chances of producing valid, although wider, PIs for the test dataset. If PICP_{1,i} and PICP_{2,i} are the PICP values of the i-th solution on the training subsets, the MP solution is chosen as:

MP¼ argmax PICP1;i�Median PICP1ð Þ� ��þ PICP2;i�Median PICP2ð Þ� ��

for i¼ 1;2;…;n ð5Þ

where n is the total number of points forming the Pareto-frontier.On the other hand, the NI solution is the one generating valid PIson the training datasets having narrowest widths with respect tothe medians. Similarly to (5), the NI selecting criterion can bewritten as:

NI = argmin_i { [PINRW_{1,i} − Median(PINRW_1)] + [PINRW_{2,i} − Median(PINRW_2)] },  for i = 1, 2, …, n   (6)

where PINRW_{1,i} and PINRW_{2,i} are the PINRW values of the i-th solution on the training subsets. Compared to the MP solution, the NI solution should likely generate narrower PIs for the test dataset, although they might not satisfy the validity condition for the given confidence level. The differences in (5) and (6) should be normalized to the respective medians if the distributions of the indices vary substantially between the two subsets.
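The two selection criteria can be illustrated with a short sketch. The function and variable names below are our own, the index definitions follow their acronyms (PICP as the coverage fraction, PINRW as the root-mean-square interval width normalized by the data range), and the code assumes at least one solution is valid on both subsets.

```python
import numpy as np

def picp(y, lower, upper):
    """Prediction interval coverage probability: fraction of
    observations falling inside [lower, upper]."""
    return np.mean((y >= lower) & (y <= upper))

def pinrw(y, lower, upper):
    """PI normalized root-mean-square width: RMS interval width
    divided by the range of the target data."""
    return np.sqrt(np.mean((upper - lower) ** 2)) / (y.max() - y.min())

def select_mp_ni(picp1, picp2, pinrw1, pinrw2, target_cl):
    """Pick the Most Precautionary (MP) and Narrowest Interval (NI)
    solutions from per-solution indices on the two training subsets,
    following Eqs. (5) and (6); only solutions that are valid
    (PICP >= target coverage) on BOTH subsets are eligible."""
    picp1, picp2 = np.asarray(picp1), np.asarray(picp2)
    pinrw1, pinrw2 = np.asarray(pinrw1), np.asarray(pinrw2)
    idx = np.flatnonzero((picp1 >= target_cl) & (picp2 >= target_cl))
    # Eq. (5): largest summed deviation of PICP above the medians
    mp = idx[np.argmax((picp1[idx] - np.median(picp1))
                       + (picp2[idx] - np.median(picp2)))]
    # Eq. (6): smallest summed deviation of PINRW from the medians
    ni = idx[np.argmin((pinrw1[idx] - np.median(pinrw1))
                       + (pinrw2[idx] - np.median(pinrw2)))]
    return mp, ni
```

The median-relative form makes the two criteria insensitive to a constant shift of the indices across the Pareto-front, which is what permits the normalization mentioned above when the two subsets behave differently.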

3. Case studies

The suitability of LUBE models to generate PIs for streamflow forecasting will be tested on the hourly datasets of two flood-prone rivers in the United States, namely the Susquehanna River and the Nehalem River.

3.1. The Susquehanna River

At around 750 km long and with a watershed of over 70,000 km2, the Susquehanna River is one of the longest rivers in the United States, draining southern New York State (NY), half of Pennsylvania State (PA) and emptying into the Chesapeake Bay in Maryland. For this case study, streamflow discharge PIs with a lead time of 6 h are produced for the US Geological Survey (USGS) gauging station in Meshoppen, PA, which monitors a catchment area of 22,585 km2. The PIs at Meshoppen are developed using a set of input variables observed at 5 other stations located elsewhere in the area. In particular, hourly rainfall near the cities of Milan, Towanda, Dushore and Montrose, as well as previous hourly streamflow discharges in Towanda, are employed. Details regarding these stations are shown in Table 1 for reference. Five additional aggregated time series were added to the working dataset by including 6-h cumulated precipitation (SUM6) at each rainfall station as well as 6-h moving average streamflow discharges (AVG6) at Towanda. Lag times from a minimum of 6 to a maximum of 11 h were considered for each raw and aggregated time series. In other words, if t indicates the actual time of the prediction, lagged time series from t−6 up to t−11 are employed as inputs, providing a forecasting lead time of 6 h. The working dataset for the Susquehanna River thus contains 60 inputs, and after removal of invalid observations, a total of 26,555 input/output patterns spanning from January 2004 to April 2008.
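The candidate input construction described above (SUM6 and AVG6 aggregates plus lags from t−6 to t−11) can be sketched roughly as follows. This is a minimal pandas-based illustration with hypothetical column names, not the authors' actual preprocessing code.

```python
import pandas as pd

def build_candidates(df, rain_cols, flow_cols, lags=range(6, 12), window=6):
    """Build the candidate input table: each raw series plus its 6-h
    aggregate (SUM6 for rainfall, AVG6 for flow), then lagged copies
    at t-6 ... t-11 of both raw and aggregated series."""
    series = {}
    for c in rain_cols:
        series[c] = df[c]
        series[c + "_SUM6"] = df[c].rolling(window).sum()   # 6-h cumulated rainfall
    for c in flow_cols:
        series[c] = df[c]
        series[c + "_AVG6"] = df[c].rolling(window).mean()  # 6-h moving-average flow
    out = {}
    for name, s in series.items():
        for lag in lags:  # shift so that row t holds the value observed at t-lag
            out[f"{name}_t-{lag}"] = s.shift(lag)
    return pd.DataFrame(out).dropna()
```

For the Susquehanna configuration (4 rainfall and 1 streamflow raw series, hence 10 series after aggregation) the 6 lags yield the 60 candidate inputs quoted in the text.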

3.2. The Nehalem River

The Nehalem River originates in the Northern Oregon Coast Range near the city of Portland, and it ends its 192 km course in the Pacific Ocean. For this river, PIs with a forecasting lead time of 6 h are produced for the USGS gauging station near Foss in Tillamook County, OR. The drainage area of this station is 1728 km2, around 80% of the overall watershed area of 2210 km2. The input time series for this case study were chosen as the previous streamflow values measured at Foss, plus the hourly rainfall recorded in the meteorological stations of Nehalem, Jewell Wildlife Meadows, and Vernonia (Table 1). As done for the first dataset, AVG6 and SUM6 aggregated time series were computed for the streamflow and rainfall inputs, respectively, and lag times of 6–11 h were considered. The Nehalem River dataset consists of

Fig. 3. Example of MOFIPS Pareto-front for LUBE model development.

Table 1. Details of gauging and meteorological stations.

Dataset      Name                     Observed variable  WGS84 latitude  WGS84 longitude  Distance from streamflow gauge (km)
Susquehanna  Meshoppen                Flow discharge     41.61           −76.05           –
             Towanda                  Flow discharge     41.77           −76.44           35
             Towanda                  Rainfall           41.75           −76.42           35
             Milan                    Rainfall           41.93           −76.52           53
             Dushore                  Rainfall           41.53           −76.40           31
             Montrose                 Rainfall           41.83           −75.87           29
Nehalem      Foss                     Flow discharge     45.70           −123.75          –
             Nehalem                  Rainfall           45.71           −123.90          53
             Jewell Meadows Wildlife  Rainfall           45.94           −123.53          31
             Vernonia                 Rainfall           45.87           −123.19          29


48 inputs, with 20,577 input/output samples recorded between October 2007 and December 2013.

4. Results and discussion

4.1. Input selection

An optimal set of input features has to be selected from the available candidates of both datasets described in the previous section. In the absence of a standard methodology for input selection to develop ANN-based PIs, we resort to a wrapper technique devised for deterministic ANN models. Wrapper techniques are model-based input selection schemes where the performance of the learning machine of choice is employed to evaluate the explanatory power of different subsets of candidate variables (Guyon and Elisseeff, 2003). In particular, we employ a Constructive Forward Selection (CFS) technique (Maier et al., 2010; May et al., 2011) to determine an optimal set of inputs for ANNs that perform 6 h ahead point predictions of streamflow discharges for both case studies. These input features are then used to develop LUBE models under the hypothesis that they represent a good approximation of the optimal input set required to develop ANN-based PIs. This assumption is legitimate if one considers that good quality PIs of hydrological variables have been obtained by bootstrapping and ensembling deterministic ANN models (Dawson et al., 2002; Sharma and Tiwari, 2009; Tiwari and Chatterjee, 2010). The CFS algorithm returns an optimal ANN via an incremental search strategy where an initial model architecture with minimal complexity is trained using one input at a time. The input resulting in the best model performance is permanently annexed to the initial model, and the search then continues accordingly to find the next input among the remaining candidates. This iterative process is stopped when the inclusion of other inputs does not result in further improvements, even after the ANN is augmented with additional hidden neurons. Before launching the CFS algorithm, the datasets are first normalized in the [−0.9, 0.9] range, and divided into training (40%), validation (40%) and test (20%) subsets as shown in Table 2. The Levenberg–Marquardt algorithm with early stopping (Daliakopoulos et al., 2005) was employed as the ANN training algorithm, using 50 restarts to prevent local minima entrapment. The CFS method returned an optimal model with 5 hidden neurons and 7 inputs for the Susquehanna dataset, while the best ANN for the Nehalem River was found to have 3 hidden neurons and 5 inputs. The details of the selected inputs for both case studies are shown in Table 3, along with the performances on each dataset expressed in terms of Root Mean Square Error (RMSE) and Nash–Sutcliffe Coefficient Of Efficiency (COE) (Wu and Chau, 2013). The high performances shown by the CFS optimal architectures suggest that the selected features are suitable for modeling future streamflow discharges. They will be employed for LUBE model development under the premise made earlier in this section.
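The incremental CFS search can be sketched as follows. This is a simplified illustration in which an ordinary least-squares fit stands in for the Levenberg–Marquardt-trained ANN, and the hidden-neuron augmentation step is omitted; function names are hypothetical.

```python
import numpy as np

def forward_selection(X, y, X_val, y_val, tol=1e-6):
    """Greedy forward selection sketch: repeatedly add the single
    candidate column that most reduces validation RMSE, stopping when
    no addition improves the score by more than `tol`."""
    def rmse(cols):
        # Least-squares fit with intercept as a stand-in for the ANN
        A = np.column_stack([X[:, cols], np.ones(len(X))])
        w, *_ = np.linalg.lstsq(A, y, rcond=None)
        Av = np.column_stack([X_val[:, cols], np.ones(len(X_val))])
        return np.sqrt(np.mean((Av @ w - y_val) ** 2))

    selected, best = [], np.inf
    while True:
        scores = {j: rmse(selected + [j])
                  for j in range(X.shape[1]) if j not in selected}
        j_best = min(scores, key=scores.get)
        if scores[j_best] >= best - tol:  # no further improvement: stop
            return selected, best
        selected.append(j_best)
        best = scores[j_best]
```

In the paper's setting the scoring function would retrain the ANN (with restarts and early stopping) at each trial, and a failed round would first be retried with extra hidden neurons before terminating.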

4.2. Development of swarm optimization-based LUBE models

After having identified the model inputs, LUBE neural networks can be developed to generate PIs for the 6-h ahead streamflow discharge of both rivers. For the PSO- and FIPS-based LUBE, a 10-fold cross-validation procedure is carried out after joining the training and validation subsets in Table 2, and dividing the resulting dataset into 10 equal chunks. The two original subsets are left separated when employing the MOFIPS algorithm, where they are regarded as Part 1 and Part 2 of the overall training dataset used for the optimization. PIs with CL of 90%, 95% and 99% are considered for LUBE model development; therefore the value of μ in (3) was set to 0.9, 0.95 and 0.99, respectively. The value of η was set to 80 for each confidence level. In order to reduce the computational burden of the experiments, the search for optimal model complexity and swarm topology was performed only for the 90% case, with the results being extended to the other two cases. The search for optimal model structure was carried out by trying LUBE models with 3 to 10 hidden neurons for each algorithm employed. For the Susquehanna River the best performances were obtained using ANNs with 6 hidden neurons (62 model parameters), irrespective of the training algorithm. On the other hand, all algorithms provided best results when 4 hidden neurons (34 model parameters) were used for the Nehalem River dataset. For the sake of brevity, full details on algorithm setup are shown in Table 4, along with the appropriate references for their implementation. Contrary to PSO- and FIPS-based LUBE, which require 10 runs to implement the k-fold cross-validation, MOFIPS is able to produce PIs at the end of a single run. Due to the similarity in the workings of the three algorithms, MOFIPS might therefore require only 1/10 of the computational time of its single-objective counterparts. However, since MOFIPS entails augmented swarm topologies and the additional calculation of the Pareto-fronts at each iteration, this speed-up could be partially reduced. After these considerations, 5 restarts are performed for the MOFIPS technique for fairer comparison. The overall Pareto-frontier obtained from these restarts is considered

Table 2. Dataset subdivision.

Dataset      Subset      Initial datetime     Ending datetime      Total number     Flow statistics [m3 s−1]
                         [mm/dd/yyyy HH:MM]   [mm/dd/yyyy HH:MM]   of observations  Min    Max    Average
Susquehanna  Training    1/1/2004 11:00       11/23/2005 2:00      10,653           42     5324   604
             Validation  11/23/2005 14:00     4/21/2007 6:00       10,582           78     4616   585
             Test        4/21/2007 18:00      5/1/2008 0:00        5320             135    3313   670
Nehalem      Training    10/1/2007 17:00      3/22/2010 18:00      8230             2      1481   108
             Validation  3/22/2010 19:00      6/19/2012 8:00       8231             4      784    121
             Test        6/19/2012 9:00       12/2/2013 17:00      4116             3      580    108

Table 3. Deterministic ANN inputs and performances.

Dataset      Station   Input type     Lag         Subset      RMSE [m3 s−1]  COE
Susquehanna  Towanda   Flow RAW       t−6, t−7    Training    50.7           0.994
             Towanda   Flow AVG6      t−8, t−11   Validation  69.7           0.985
             Montrose  Rainfall RAW   t−6         Test        70.3           0.983
             Montrose  Rainfall SUM6  t−8, t−9
Nehalem      Foss      Flow RAW       t−6, t−7    Training    11.1           0.993
             Vernonia  Rainfall RAW   t−6         Validation  12.5           0.990
             Jewell    Rainfall RAW   t−6         Test        13.8           0.985
             Jewell    Rainfall SUM6  t−6


for the extraction of optimal LUBE solutions according to the criteria in Section 2.5.3.
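For reference, the CWC cost minimized by all three algorithms can be sketched as below, assuming the formulation of Khosravi et al. (2011b) with the parameters reported above (η = 80 and μ equal to the nominal coverage level); during training the exponential penalty term is always active.

```python
import numpy as np

def cwc(y, lower, upper, mu, eta=80.0, training=True):
    """Coverage width-based criterion (after Khosravi et al., 2011b):
    the normalized average interval width, inflated exponentially
    whenever the coverage probability PICP drops below the nominal
    level mu."""
    picp = np.mean((y >= lower) & (y <= upper))
    pinaw = np.mean(upper - lower) / (y.max() - y.min())
    # gamma = 1 during training; in testing it is 0 when PICP >= mu
    gamma = 1.0 if training or picp < mu else 0.0
    return pinaw * (1.0 + gamma * np.exp(-eta * (picp - mu)))
```

The compound, discontinuous shape of this function (a width term plus a steep coverage penalty) is what makes gradient-free swarm optimization attractive for LUBE training, and what the two MOFIPS objectives decompose across the two training subsets.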

4.3. Comparison of generated LUBE PIs

Due to the different dataset arrangements employed for the single-objective and multi-objective training, a direct comparison of the PIs generated with the three algorithms can be done only for the test dataset, which is the same in each case. Furthermore, while the optimal MOFIPS-based LUBE models can be univocally identified on the Pareto-front using the MP and NI criteria, PSO- and FIPS-based LUBE require the construction of an ensemble from 10 different models. Averaging is thus necessary to produce the final PIs of these ensembles, while the indices employed for the comparison against the MOFIPS solutions are given in terms of medians, as suggested in the literature (Khosravi et al., 2011b; Quan et al., 2014b). Table 5 shows the comparison of the 90%, 95% and 99% PIs constructed for the 6 h ahead streamflow discharge of the Susquehanna River at the Meshoppen gauging station, while the results for the Nehalem River at Foss are shown in Table 6. All the indices described in Section 2.2 are shown for comparison, but the discussion is better carried out in terms of PICP and PINAW. From a cursory analysis of the results, it clearly emerges that the MOFIPS solutions substantially outperform the LUBE models built with single-objective swarm optimization. For both study cases, the best performers at each confidence level, i.e. the LUBE models providing the narrowest yet valid PIs, are those obtained from the MOFIPS Pareto-fronts. It can be seen that the NI models are unsurprisingly those returning the narrowest PIs, with PINAW of around 7.01%, 8.00%, and 12.50% for the Susquehanna River and 8.76%, 11.20% and 14.91% for the Nehalem River. However, the corresponding PICP values of 91.32%, 92.59% and 97.80% for the Susquehanna and of 91.21%, 94.87% and 98.98% for the Nehalem imply that NI solutions return strictly valid PIs only for the 90% case. On the other hand, the models identified by the MP criterion are able to generate valid PIs for all the examined cases. PINAW of 7.9%, 10.7%, and 13% of the streamflow discharge range are only slightly greater than those of the NI models, indicating that the MP solutions are the overall best for the Susquehanna case. Similar conclusions cannot be drawn as easily for the Nehalem case study. Indeed, while the NI solutions fail to ensure the required coverage probability for the 95% and 99% cases, they fall short by only 0.13% and 0.02% with respect to the target CL, and could be regarded as valid. On the other hand, the MP solutions provide PIs which are likely too precautionary for this case, with PICPs considerably larger than the corresponding target CL. Consequently, the PIs of MP solutions are 18–45% wider than the NI ones, as shown by the PINAW values in Table 6. Although the PSO-based LUBE provides valid PIs at each confidence level for both case studies, the produced intervals are too wide for any real practical application. The FIPS-based LUBE usually shows better performances, but it still compares badly against the optimal models returned by the MOFIPS algorithm. This is particularly true for the 99% CL case, where the median PICP of the FIPS-based LUBE models is below the requested target for both cases. The results in the last

Table 4. Details of the swarm optimization algorithms employed.

Algorithm formulation: PSO – Type 1'' constriction (Mendes et al., 2004); FIPS – (Mendes et al., 2004); MOFIPS – (Taormina and Chau, 2015).
Number of particles (maximum size of the Pareto-front): 30 for all algorithms (30 for MOFIPS).
Topology: Von Neumann with "self" included for PSO; Von Neumann with "self" excluded for FIPS and MOFIPS (Mendes et al., 2004).
Minimum and maximum position: −3, +3.
Minimum and maximum velocity: −0.5, +0.5.
Termination criteria: 1000 iterations.
Mutation type: Gaussian mutation with exponential decay for PSO and FIPS (Quan et al., 2014b); polynomial mutation for MOFIPS (Taormina and Chau, 2015; Deb, 2009).
Mutation details: for Gaussian mutation, the mean is set to 0 and the standard deviation to 10% of the ANN parameter absolute value; for polynomial mutation, the distribution index is set to 30 and the percentage of particles subjected to mutation to 1/(num. of ANN parameters).

Table 5. Performances of LUBE models on the test dataset for the Susquehanna River and required computation time.

                                      PICP      CWC       PINRW     PINAW     Total computational time [s]
90% Confidence level
  MOFIPS-based LUBE (NI)              0.9132    0.0719    0.0719    0.0701    3105.2
  MOFIPS-based LUBE (MP)              0.9175    0.0803    0.0803    0.0790
  PSO-based LUBE                      0.9143    0.2275    0.2275    0.1992    6132.7
  FIPS-based LUBE                     0.9130    0.1324    0.1173    0.1129    5952.4
95% Confidence level
  MOFIPS-based LUBE (NI)              0.9259    0.6410    0.0816    0.0800    3164.6
  MOFIPS-based LUBE (MP)              0.9686    0.1078    0.1078    0.1073
  PSO-based LUBE                      0.9644    0.3757    0.3215    0.2896    6145.1
  FIPS-based LUBE                     0.9523    0.1953    0.1500    0.1434    5914.1
99% Confidence level
  MOFIPS-based LUBE (NI)              0.9780    0.4836    0.1340    0.1250    3197.3
  MOFIPS-based LUBE (MP)              0.9901    0.1445    0.1445    0.1298
  PSO-based LUBE                      0.9903    0.6947    0.4200    0.3628    6181.9
  FIPS-based LUBE                     0.9791    0.7377    0.2312    0.2050    5879.5


columns of Tables 5 and 6 show that the MOFIPS-based LUBE generates better PIs while at the same time providing remarkable speedups. Indeed, MOFIPS computational times are 40% to 50% smaller than those required by single-objective swarm optimization, indicating that the overhead associated with using augmented swarm topologies and Pareto-front calculation is fairly negligible (see Section 4.2). The superiority of MOFIPS solutions can also be verified from a visual comparison of the PIs generated for the three confidence levels considered, as shown in Figs. 4–6 for the last part of the test dataset of the Susquehanna River. It appears that most of the improvements provided by MOFIPS are reflected in a better positioning of the lower bound of the PIs, and a more accurate bracketing of peak streamflow discharges. In this regard, of particular interest is Fig. 5, showing the PIs at the 95% confidence level. For this case, all the algorithms return valid PIs which are also found to comprise the peak discharges for the major storm event occurring on the night of the 21st of March 2008, with a peak flow of 2483 m3 s−1. The interval generated by the MP solution for the peak observation is [2426, 2724] m3 s−1, which has a width of 298 m3 s−1 corresponding to around 12% of the peak discharge itself. On the other hand, the PI of the FIPS-based LUBE is over three times larger, including discharges anywhere in the [2005, 2957] m3 s−1 range. The accuracy of the PSO-based LUBE models is even worse, with a PI of [772, 3365] m3 s−1 which is almost nine times wider than that of the optimal MOFIPS solution, and represents 104% of the peak flow rate.

4.4. Wet season vs dry season performances

Further insights on the performances of the MOFIPS-based LUBE methodology can be obtained by decomposing the overall performance indices into those relative to the wet and dry seasons. From an analysis of historical records, it emerges that for both case studies the wet season goes from November to May, while the dry season comprises the remaining 5 months. However, due to the lack of sufficient data samples in the dry season for the test dataset of the Susquehanna River, the analysis will be carried out only for the Nehalem River. One third of the test dataset of the Nehalem River consists of samples recorded during the dry months, while the remaining two thirds pertain to the wet season. Fig. 7 shows the decomposition of the overall test PICP (left) and PINAW (right) indices of the NI solutions for the three CLs. The results shown for the PICP suggest that there are no major differences in the coverage probability across the two seasons. On the other hand, the PINAW values are generally lower during the

Table 6. Performances of LUBE models on the test dataset for the Nehalem River and required computation time.

                                      PICP      CWC       PINRW     PINAW     Total computational time [s]
90% Confidence level
  MOFIPS-based LUBE (NI)              0.9121    0.0901    0.0901    0.0876    2126.5
  MOFIPS-based LUBE (MP)              0.9208    0.1127    0.1127    0.1033
  PSO-based LUBE                      0.9187    0.2509    0.2509    0.2209    3775.8
  FIPS-based LUBE                     0.9175    0.2333    0.2333    0.2106    3743.4
95% Confidence level
  MOFIPS-based LUBE (NI)              0.9487    0.2490    0.1182    0.1120    2136.7
  MOFIPS-based LUBE (MP)              0.9823    0.1667    0.1667    0.1581
  PSO-based LUBE                      0.9666    0.3189    0.3189    0.2458    4214.8
  FIPS-based LUBE                     0.9580    0.2399    0.2356    0.2213    3731.1
99% Confidence level
  MOFIPS-based LUBE (NI)              0.9898    0.3444    0.1708    0.1491    2097.0
  MOFIPS-based LUBE (MP)              0.9947    0.2477    0.2477    0.2158
  PSO-based LUBE                      0.9934    0.4733    0.4088    0.3645    3871.0
  FIPS-based LUBE                     0.9883    0.5783    0.3006    0.2628    3543.1

Fig. 4. LUBE generated prediction intervals at 90% confidence level for the Susquehanna River (observed streamflow discharge in m3 s−1, with the MOFIPS-based LUBE (NI), FIPS-based LUBE, and PSO-based LUBE intervals).


dry season for all the three considered CLs. These results are better examined by taking into account the average flows, which are 52.1 and 134.2 m3 s−1 for the dry and wet season, respectively. It is interesting to note that, although the average flow during the dry season is 61.2% smaller than that of the wet season, the PINAW values for the dry season are at most 16.8% smaller than those of the wet season. This happens for the 99% CL case, where the overall PINAW of 14.91% is decomposed into a wet season component of 15.77% and a dry season component equal to 13.13%. The contrast between the difference of these components and that of the seasonal average flows suggests that the width of the PIs is mostly determined by the higher variability of the wet season, which is driven by more frequent and intense rainfall events. This higher variability forces the LUBE method to widen the PIs in order to increase the coverage probability and meet the validity requirement set by the CL. However, since the same ANN model is used for both seasons, the PIs produced during the dry season, while narrower than those of the wet season, are wider than what could be expected by analyzing the average flows.
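The seasonal decomposition used in this analysis can be sketched as follows. The month sets and the normalization by the full test range are taken from the text, while the function itself is a hypothetical illustration.

```python
import numpy as np

WET_MONTHS = {11, 12, 1, 2, 3, 4, 5}  # November through May

def seasonal_indices(months, y, lower, upper):
    """Split the overall PICP and PINAW into wet- and dry-season
    components. `months` holds the calendar month of each sample;
    PINAW is normalized by the range of the WHOLE test series, so the
    components remain directly comparable with the overall index."""
    full_range = y.max() - y.min()
    out = {}
    for season, keep in (("wet", True), ("dry", False)):
        m = np.isin(months, list(WET_MONTHS)) == keep
        out[season] = {
            "PICP": np.mean((y[m] >= lower[m]) & (y[m] <= upper[m])),
            "PINAW": np.mean(upper[m] - lower[m]) / full_range,
        }
    return out
```

Normalizing both components by the same full-series range is what allows the wet and dry PINAW values (15.77% and 13.13% at the 99% CL) to be compared directly against the overall 14.91%.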

5. Conclusions

This paper dealt with the application of the LUBE method for the construction of ANN-based PIs of streamflow discharges at 90%, 95%

Fig. 5. LUBE generated prediction intervals at 95% confidence level for the Susquehanna River (observed streamflow discharge in m3 s−1, with the MOFIPS-based LUBE (MP), FIPS-based LUBE, and PSO-based LUBE intervals).

Fig. 6. LUBE generated prediction intervals at 99% confidence level for the Susquehanna River (observed streamflow discharge in m3 s−1, with the MOFIPS-based LUBE (MP), FIPS-based LUBE, and PSO-based LUBE intervals).


and 99% confidence levels. Single-objective and multi-objective swarm optimization has been employed to develop LUBE models for the prediction of 6 h ahead streamflow discharges of the Susquehanna and Nehalem Rivers, US. A novel methodology involving the MOFIPS algorithm was found to provide valid PIs that are substantially narrower than those obtained with single-objective swarm optimization. With average PI widths ranging from a minimum of 7% to a maximum of 15% of the range of the streamflow data in the test datasets, MOFIPS-based LUBE could be employed for the straightforward design of more reliable interval-based streamflow forecasting models. Although the quality of the PIs was found to be significantly affected by the algorithm employed for model development, future studies should focus on finding more appropriate input selection techniques for interval-based hydrological prediction models. In addition, the seasonal decomposition of the overall performance indices encourages seeking further improvements, which could be obtained, for instance, by resorting to a modular approach where the PIs are produced separately for the dry and wet seasons and joined subsequently.

Acknowledgments

This study was funded by the Research Grants Council of Hong Kong through its Ph.D. Fellowship Scheme and the Central Research Grant of Hong Kong Polytechnic University (G-U833).

References

Abrahart, R.J., Anctil, F., Coulibaly, P., Dawson, C.W., Mount, N.J., See, L.M., Shamseldin, A.Y., Solomatine, D.P., Toth, E., Wilby, R.L., 2012. Two decades of anarchy? Emerging themes and outstanding challenges for neural network river forecasting. Prog. Phys. Geogr. 36, 480–513. http://dx.doi.org/10.1177/0309133312444943.

Alvisi, S., Franchini, M., 2011. Fuzzy neural networks for water level and discharge forecasting with uncertainty. Environ. Model. Softw. 26, 523–537. http://dx.doi.org/10.1016/j.envsoft.2010.10.016.

Chau, K.W., 2006. Particle swarm optimization training algorithm for ANNs in stage prediction of Shing Mun River. J. Hydrol. 329, 363–367. http://dx.doi.org/10.1016/j.jhydrol.2006.02.025.

Chau, K.W., 2007. A split-step particle swarm optimization algorithm in river stage forecasting. J. Hydrol. 346, 131–135. http://dx.doi.org/10.1016/j.jhydrol.2007.09.004.

Daliakopoulos, I.N., Coulibaly, P., Tsanis, I.K., 2005. Groundwater level forecasting using artificial neural networks. J. Hydrol. 309, 229–240. http://dx.doi.org/10.1016/j.jhydrol.2004.12.001.

Dawson, C.W., Harpham, C., Wilby, R.L., Chen, Y., 2002. Evaluation of artificial neural network techniques for flow forecasting in the River Yangtze, China. Hydrol. Earth Syst. Sci. 6, 619–626. http://dx.doi.org/10.5194/hess-6-619-2002.

De Gooijer, J.G., Hyndman, R.J., 2006. 25 years of time series forecasting. Int. J. Forecast. 22, 443–473. http://dx.doi.org/10.1016/j.ijforecast.2006.01.001.

Deb, K., 2009. Multi-objective Optimization Using Evolutionary Algorithms. John Wiley & Sons, New York.

Deb, K., Pratap, A., Agarwal, S., Meyarivan, T., 2002. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6, 182–197.

Guyon, I., Elisseeff, A., 2003. An introduction to variable and feature selection. J. Mach. Learn. Res. 3, 1157–1182.

Higashi, N., Iba, H., 2003. Particle swarm optimization with Gaussian mutation. In: Proceedings of the 2003 IEEE Swarm Intelligence Symposium – SIS'03, pp. 72–79.

Kennedy, J., Eberhart, R., 1995. Particle swarm optimization. In: Proceedings of the International Conference on Neural Networks (ICNN'95), vol. 4, pp. 1942–1948. http://dx.doi.org/10.1109/ICNN.1995.488968.

Khan, M.S., Coulibaly, P., 2006. Bayesian neural network for rainfall-runoff modeling. Water Resour. Res. 42, W07409. http://dx.doi.org/10.1029/2005WR003971.

Khosravi, A., Nahavandi, S., Creighton, D., 2010. Construction of optimal prediction intervals for load forecasting problems. IEEE Trans. Power Syst. 25, 1496–1503.

Khosravi, A., Nahavandi, S., Creighton, D., 2011a. Prediction interval construction and optimization for adaptive neurofuzzy inference systems. IEEE Trans. Fuzzy Syst. 19, 983–988.

Khosravi, A., Nahavandi, S., Creighton, D., Atiya, A.F., 2011b. Lower upper bound estimation method for construction of neural network-based prediction intervals. IEEE Trans. Neural Netw. 22, 337–346. http://dx.doi.org/10.1109/TNN.2010.2096824.

Kingston, G.B., Lambert, M.F., Maier, H.R., 2005. Bayesian training of artificial neural networks used for water resources modeling. Water Resour. Res. 41, W12409. http://dx.doi.org/10.1029/2005WR004152.

Kişi, Ö., 2013. Evolutionary neural networks for monthly pan evaporation modeling. J. Hydrol. 498, 36–45. http://dx.doi.org/10.1016/j.jhydrol.2013.06.011.

Krzysztofowicz, R., 2001. The case for probabilistic forecasting in hydrology. J. Hydrol. 249, 2–9. http://dx.doi.org/10.1016/S0022-1694(01)00420-6.

Maier, H.R., Dandy, G.C., 2000. Neural networks for the prediction and forecasting of water resources variables: a review of modelling issues and applications. Environ. Model. Softw. 15, 101–124. http://dx.doi.org/10.1016/S1364-8152(99)00007-9.

Maier, H.R., Jain, A., Dandy, G.C., Sudheer, K.P., 2010. Methods used for the development of neural networks for the prediction of water resource variables in river systems: current status and future directions. Environ. Model. Softw. 25, 891–909. http://dx.doi.org/10.1016/j.envsoft.2010.02.003.

May, R., Dandy, G., Maier, H., 2011. Review of input variable selection methods for artificial neural networks. In: Suzuki, K. (Ed.), Artificial Neural Networks – Methodological Advances and Biomedical Applications. INTECH Open Access Publisher.

Mendes, R., Kennedy, J., Neves, J., 2003. Watch thy neighbor or how the swarm can learn from its environment. In: Proceedings of the 2003 IEEE Swarm Intelligence Symposium – SIS'03, pp. 88–94. http://dx.doi.org/10.1109/SIS.2003.1202252.

Mendes, R., Kennedy, J., Neves, J., 2004. The fully informed particle swarm: simpler, maybe better. IEEE Trans. Evol. Comput. 8, 204–210.

Nourani, V., Alami, M.T., Aminfar, M.H., 2009. A combined neural-wavelet model for prediction of Ligvanchai watershed precipitation. Eng. Appl. Artif. Intell. 22, 466–472. http://dx.doi.org/10.1016/j.engappai.2008.09.003.

Piotrowski, A.P., Napiorkowski, J.J., 2011. Optimizing neural networks for river flow forecasting – evolutionary computation methods versus the Levenberg–Marquardt approach. J. Hydrol. 407, 12–27. http://dx.doi.org/10.1016/j.jhydrol.2011.06.019.

Quan, H., Srinivasan, D., Khosravi, A., 2014a. Short-term load and wind power forecasting using neural network-based prediction intervals. IEEE Trans. Neural Netw. Learn. Syst. 25, 303–315. http://dx.doi.org/10.1109/TNNLS.2013.2276053.

Quan, H., Srinivasan, D., Khosravi, A., 2014b. Particle swarm optimization for construction of neural network-based prediction intervals. Neurocomputing 127, 172–180. http://dx.doi.org/10.1016/j.neucom.2013.08.020.

Fig. 7. Decomposition of test PICP and PINAW for the Nehalem River.


Quan, H., Srinivasan, D., Khosravi, A., 2014c. Uncertainty handling using neural network-based prediction intervals for electrical load forecasting. Energy 73, 916–925. http://dx.doi.org/10.1016/j.energy.2014.06.104.

Sharma, S.K., Tiwari, K.N., 2009. Bootstrap based artificial neural network (BANN) analysis for hierarchical prediction of monthly runoff in Upper Damodar Valley Catchment. J. Hydrol. 374, 209–222. http://dx.doi.org/10.1016/j.jhydrol.2009.06.003.

Shrestha, D.L., Solomatine, D.P., 2006. Machine learning approaches for estimation of prediction interval for the model output. Neural Netw. 19, 225–235. http://dx.doi.org/10.1016/j.neunet.2006.01.012.

Taormina, R., Chau, K., 2015. Neural network river forecasting with multi-objective fully informed particle swarm optimization. J. Hydroinformatics 17, 99–112. http://dx.doi.org/10.2166/hydro.2014.116.

Taormina, R., Chau, K., Sethi, R., 2012. Artificial neural network simulation of hourly groundwater levels in a coastal aquifer system of the Venice lagoon. Eng. Appl. Artif. Intell. 25, 1670–1676. http://dx.doi.org/10.1016/j.engappai.2012.02.009.

Tiwari, M.K., Chatterjee, C., 2010. Uncertainty assessment and ensemble flood forecasting using bootstrap based artificial neural networks (BANNs). J. Hydrol. 382, 20–33. http://dx.doi.org/10.1016/j.jhydrol.2009.12.013.

Wu, C.L., Chau, K.W., 2013. Prediction of rainfall time series using modular soft computing methods. Eng. Appl. Artif. Intell. 26, 997–1007. http://dx.doi.org/10.1016/j.engappai.2012.05.023.

Zhang, X., Liang, F., Srinivasan, R., Van Liew, M., 2009. Estimating uncertainty of streamflow simulation using Bayesian neural networks. Water Resour. Res. 45, W02403. http://dx.doi.org/10.1029/2008WR007030.
