Development of an accurate and reliable hourly flood forecasting model using...

Article in Journal of Hydrology, November 2010. DOI: 10.1016/j.jhydrol.2010.10.001. Authors: Mukesh Kumar Tiwari; Chandranath Chatterjee (Indian Institute of Technology Kharagpur).


Journal of Hydrology 394 (2010) 458–470

Contents lists available at ScienceDirect

Journal of Hydrology

journal homepage: www.elsevier.com/locate/jhydrol

Development of an accurate and reliable hourly flood forecasting model using wavelet–bootstrap–ANN (WBANN) hybrid approach

Mukesh K. Tiwari, Chandranath Chatterjee *
Agricultural and Food Engineering Department, Indian Institute of Technology, Kharagpur, West Bengal 721 302, India

* Corresponding author. Tel.: +91 3222 283158. E-mail addresses: [email protected] (M.K. Tiwari), cchatterjee@agfe.iitkgp.ernet.in (C. Chatterjee).

Article info

Article history: Received 24 March 2010; received in revised form 20 September 2010; accepted 2 October 2010.

This manuscript was handled by Dr. A. Bardossy, Editor-in-Chief, with the assistance of Vazken Andréassian, Associate Editor.

Keywords: Flood forecasting; Wavelet; Bootstrap; ANNs

0022-1694/$ - see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.jhydrol.2010.10.001

Summary

A hybrid wavelet–bootstrap–ANN (WBANN) model is developed in this study to explore the potential of wavelet and bootstrapping techniques for developing an accurate and reliable ANN model for hourly flood forecasting. The wavelet technique is used to decompose the time series data into different components which capture useful information at various resolution levels. Five years of hourly water level data for the monsoon season from five gauging stations in the Mahanadi River basin, India, are used in this study. The observed water level time series of a particular gauging station is decomposed into sub-series by discrete wavelet transformation, and then appropriate sub-series are added up to develop a new time series. The bootstrap resampling method is used to generate different realizations of the datasets newly constructed by discrete wavelet transformation, creating a set of bootstrap samples that are finally used as input to develop the WBANN model. The performance of the WBANN model is also compared with three different ANN models: traditional ANNs, wavelet based ANNs (WANNs) and bootstrap based ANNs (BANNs). The results showed that the hybrid models WBANN and BANN produced better results than the traditional ANN and WANN models. The WBANN model simulated the peak water level better than the ANN, WANN and BANN models and, in general, the overall performance of the WBANN model is more accurate and reliable than that of the other three models. This study reveals that whereas wavelet decomposition improves the performance of ANN models, the bootstrap resampling technique produces more consistent and stable solutions. The WBANN model is also used to assess predictive uncertainty, in the form of confidence intervals (CI), for 1–10 h lead time forecasts. The results indicate that a WBANN forecasting model with confidence intervals can improve the reliability of flood forecasts.

© 2010 Elsevier B.V. All rights reserved.

1. Introduction

Hourly flood forecasting with sufficient lead time is desirable for taking appropriate flood prevention measures and for planning evacuation and rehabilitation actions. A wide variety of rainfall–runoff models have been developed and applied for flood forecasting, based either on a mechanistic approach or on a systems theoretic approach. Spatially distributed modeling is a typical example of the mechanistic approach, which constructs a model that explicitly accounts for as much of the small-scale physics and the natural heterogeneity as computationally possible (Loague and VanderKwaak, 2004). This approach has been criticized for resulting in models that are overly complex, leading to problems of over-parameterization and equifinality (Beven, 2006), which may manifest themselves in large prediction uncertainty (Uhlenbrook et al., 1999). The system theoretic approach, in contrast, places more emphasis on system operation than on the nature of the system or the physical laws governing its operation. Hydrological ANN models are simplifications of a more complex system in which the natural processes are simulated with mathematical equations and the corresponding parameters are derived from observations and experience, which leads to uncertainty (Srivastav et al., 2007; Han et al., 2007a).

System theoretic approaches in the form of artificial neural networks (ANNs) have gained great attention from researchers in the last few decades for river flow forecasting. The ability of ANNs to map complex nonlinear input–output relationships has increased the number of applications in rainfall–runoff modeling and river discharge forecasting (Jain and Srinivasulu, 2004; Altunkaynak, 2007; El-Shafie et al., 2007). There are several types of ANNs, but the major advantage of the feed forward backpropagation artificial neural network (FFBP ANN) is that it is less complex than alternatives such as radial basis function (RBF) networks and support vector machines (SVMs) while having similar nonlinear input–output mapping capability (Sudheer and Jain, 2003; Coulibaly and Evora, 2007; Bray and Han, 2004; Han et al., 2007b). Another type of ANN, the generalized regression neural network (GRNN), has the advantages of being less sensitive to initial weights and of not producing negative values compared to the FFBP ANN (Cigizoglu, 2005a,b; Cigizoglu and Alp, 2006; Kisi and Cigizoglu, 2007; Sertel et al., 2008), but some studies show mixed performance of FFBP and GRNN (Toprak and Cigizoglu, 2008; Sertel et al., 2008; Tanrikulu, 2009; Cigizoglu, 2005a). Even though fuzzy inference systems have been successfully applied in river flow forecasting (Nayak et al., 2004; Mukerji et al., 2009), their application is rather limited in comparison to neural network models, and several unresolved issues require further attention before clearer guidelines for the application of fuzzy inference systems can be given (Jacquin and Shamseldin, 2009). A hybrid GA-based ANN algorithm has been found to avoid over-fitting and to produce better accuracy, but at the expense of additional modeling parameters and longer computation time (Wu and Chau, 2006). FFBP ANNs are known to have several dozen successful applications in river basin management and related problems (Solomatine and Ostfeld, 2008). They are still widely applied and remain a very popular tool compared to other data driven techniques in river basin management. Therefore, in this study we have used the FFBP ANN (hereafter referred to as ANN) model for hourly water level forecasting. Substantial literature on ANNs is reported in ASCE (2000a,b). Despite the good performance of ANN models, the outcome is highly dependent on the arrangement of the training data, and undesirable uncertainties are involved (Han et al., 2007a; Srivastav et al., 2007). The reliability of the model estimated discharge is affected by three major sources of uncertainty (Bates and Townley, 1988): data uncertainty (quality and representativeness of data), model structure uncertainty (ability of the model to describe the catchment's response), and parameter uncertainty (adequate values of model parameters). Han et al. (2007a) studied the uncertainties involved in real time forecasting using an ANN model. They concluded that for long term predictions the ANN showed superior performance, but only probabilistically, depending on how the calibration and test events were arranged. Srivastav et al. (2007) proposed a method of uncertainty analysis for ANN hydrological models and showed that the ANN predictions contain a significant amount of uncertainty. In order to overcome the limitations inherent in the conventional treatment of uncertainty in ANN model predictions, the recent trend has been to combine the outputs of several member bootstrap ANN models, reducing the uncertainty by controlling the generalization of the final predictive model and producing more reliable and consistent predictions (Cannon and Whitfield, 2002; Jeong and Kim, 2005).

The bootstrap is a computational procedure that uses intensive resampling with replacement in order to reduce uncertainty (Efron and Tibshirani, 1993). It is also the simplest approach, since it does not require the complex computation of derivatives and Hessian-matrix inversion involved in linear methods, or the Monte Carlo solution of the integrals involved in the Bayesian approach (Dybowski and Roberts, 2000). The bootstrap technique has a wide variety of applications, ranging from the estimation of means, CIs and parameter uncertainties to network design techniques (Lall and Sharma, 1996; Sharma et al., 1997; Tasker and Dunne, 1997). Bootstrap based ANNs have been successfully introduced in hydrological modeling. Abrahart (2003) employed the bootstrap technique to continuously sample the input space in the context of rainfall–runoff modeling and reported that it offered marginal improvement in terms of greater accuracy and better global generalization. Jeong and Kim (2005) used an ensemble neural network (ENN) based on the bootstrap technique to simulate monthly rainfall–runoff. They concluded that the ENN is less sensitive to the input variable selection and the number of hidden nodes than the single neural network (SNN). Jia and Culver (2006) used the bootstrap technique to estimate the generalization errors of neural networks with different structures and to construct the CIs for synthetic flow prediction with a small data sample. Han et al. (2007a) studied the uncertainties involved in real time forecasting using an ANN model. They proposed a method to understand the uncertainty in ANN hydrologic models with the heuristic that the distance between the input vector at prediction and all the training data provides a valuable indication of how good the prediction will be. They concluded that for long term predictions the ANN showed superior performance, but only probabilistically, depending on how the calibration and test events were arranged. However, their method did not quantify the uncertainty of the model parameters or the predictions. Srivastav et al. (2007) proposed a method of uncertainty analysis for ANN hydrological models which was based on the bootstrap technique. They developed an ANN model for forecasting the river flow at 1-h lead time, and the results revealed that the proposed method of uncertainty analysis is very efficient and can be applied to an ANN based hydrological model. Tiwari and Chatterjee (2010) applied the bootstrap technique to hourly flood forecasting and showed that it is capable of quantifying the uncertainty in hourly flood forecasts; the ensemble predictions were found to be more stable and accurate.

ANN models are limited in that they do not consider the physics of the hydrologic processes in a catchment (Aksoy et al., 2007; Koutsoyiannis, 2007). Wavelet analysis provides a time–frequency representation of a signal at many different periods in the time domain (Daubechies, 1990) and gives considerable information about the physical structure of the data. Recently, variations, periodicities and trends in time series have been analyzed using wavelet transformation (Xingang et al., 2003; Yueqing et al., 2004; Partal and Kucuk, 2006). Wavelet based ANN (WANN) models have recently been employed successfully in hydrological modeling. Wang and Ding (2003) proposed a wavelet network model combining the wavelet transform and the ANN, and decomposed the original time series into periodic components by wavelet transform. The sub-time series were then used as the inputs for ANNs, and the resulting model was applied to forecast the original time series. This approach was used for monthly groundwater level and daily discharge forecasting. Kim and Valdes (2003) presented a hybrid neural network model combined with dyadic wavelet transforms to improve the forecasting of regional droughts. These researchers used the neural network model in two phases. First, neural networks were employed to forecast the signals decomposed by wavelet transform at various resolution levels, and then the forecasted decomposed signals were reconstructed into the original time series. The researchers applied this model to monthly and annual inflow and rainfall data and showed that the model significantly improved the neural network's forecasting performance. Anctil and Tape (2004) used a wavelet–neural network model for 1 day lead rainfall–runoff forecasting. The time series was decomposed by wavelet transform into three sub-series: short, intermediate, and long wavelet periods. Multiple-layer ANN forecasting models were then trained for each wavelet-decomposed sub-series, and the forecasted decomposed signals were later reconstructed into the original time series. Partal and Cigizoglu (2008) predicted the suspended sediment load in rivers by a combined wavelet–ANN method. Measured data were decomposed into wavelet components via discrete wavelet transform, and the new wavelet series, consisting of the sum of selected wavelet components, was used as input for the ANN model. The wavelet–ANN model provided a good fit to observed data for the testing period. Partal and Cigizoglu (2009) predicted daily precipitation from meteorological data from Turkey using the wavelet–neural network method. The new approach showed a noticeably high positive effect on the performance evaluation criteria in estimating the peak values. Kisi (2009) used a wavelet–ANN conjunction model for daily intermittent streamflow forecasting. The results revealed that the proposed hybrid model could significantly increase the forecast accuracy of a single ANN in forecasting daily intermittent streamflows.

Fig. 1. DWT decomposition of a time series. The time series passes through a high-pass filter, giving the details, and a low-pass filter, giving the approximation.


In an earlier study, we (Tiwari and Chatterjee, 2010) developed BANN models for hourly flood forecasting. In this study, wavelet and bootstrap based ANN (WBANN) models are developed to evaluate their effectiveness for hourly flood forecasting for 1–10 h lead time forecasts in the Mahanadi River basin. We have also compared the performance of the WBANN model with traditional ANN, BANN (Tiwari and Chatterjee, 2010), WANN and MLR models to evaluate the effectiveness of the different models.

2. Methodology

2.1. Artificial neural networks

Artificial neural networks are information processing systems composed of simple processing elements (nodes) linked by weighted synaptic connections (Muller and Reinhardt, 1991). ANNs develop complex nonlinear input–output mappings by analogy with the functioning of the human brain. ANNs are fast and efficient in complex and noisy environments and are capable of solving a wide variety of problems, which has led to numerous real-world applications such as time series prediction, rule-based control, and rainfall–runoff modeling (Jain et al., 1999). The multilayer ANN consists of an input layer, one or more hidden layers and an output layer. The input signal propagates through the network in a forward direction, layer by layer, through the computational nodes or neurons in each layer. The most common learning rule for multilayer perceptrons is the backpropagation algorithm (BPA). It involves a feed-forward phase, in which the external input information at the input nodes is propagated forward to compute the output information signal at the output unit using randomly assigned connection strengths or weights, and a backward phase, in which the weights are updated based on the error between the computed and observed values at the output units (Alp and Cigizoglu, 2007). A detailed explanation of the different properties of ANNs is beyond the scope of this paper. Interested readers are referred to texts such as Bishop (1995) and Haykin (1999) for a discussion of the general properties of ANNs and to Maier and Dandy (2000) for an overview of different applications of ANNs in water resources.
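The two phases of backpropagation described above can be illustrated with a minimal sketch. It assumes a single hidden layer, sigmoid activations, full-batch gradient descent and inputs and targets already scaled to (0, 1). The function names `train_ffbp` and `predict`, the initial weight range and the default settings are hypothetical choices made for illustration; this is not the training setup used in the study.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_ffbp(X, y, n_hidden=7, lr=0.3, epochs=500, seed=0):
    """Train a one-hidden-layer feed-forward network by plain gradient descent.

    X has shape (n_samples, n_inputs) and y has shape (n_samples, 1).
    Returns the weights and the mean squared error recorded at each epoch.
    """
    rng = np.random.default_rng(seed)
    # Randomly assigned initial connection weights (illustrative range).
    W1 = rng.uniform(-0.5, 0.5, size=(X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.uniform(-0.5, 0.5, size=(n_hidden, 1))
    b2 = np.zeros(1)
    n = X.shape[0]
    history = []
    for _ in range(epochs):
        # Feed-forward phase: propagate the inputs to the output unit.
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)
        err = out - y
        history.append(float(np.mean(err ** 2)))
        # Backward phase: propagate the output error back to each layer.
        d_out = err * out * (1.0 - out)
        d_h = (d_out @ W2.T) * h * (1.0 - h)
        W2 -= lr * (h.T @ d_out) / n
        b2 -= lr * d_out.mean(axis=0)
        W1 -= lr * (X.T @ d_h) / n
        b1 -= lr * d_h.mean(axis=0)
    return (W1, b1, W2, b2), history

def predict(params, X):
    W1, b1, W2, b2 = params
    return sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
```

Calling `train_ffbp(X, y)` returns the fitted weights together with the per-epoch error, so the decrease of the training error can be inspected directly.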

2.2. Wavelet analysis

The basic objective of the wavelet transform is to achieve a complete time-scale representation of localized and transient phenomena occurring at different time scales (Labat et al., 2000). Time series data are decomposed into different components at different resolution levels using a wavelet function. The wavelet function $\psi(t)$, called the mother wavelet, has finite energy and is mathematically defined as:

$$\int_{-\infty}^{+\infty} \psi(t)\,dt = 0 \tag{1}$$

where $\psi_{a,b}(t)$ can be obtained as:

$$\psi_{a,b}(t) = |a|^{-1/2}\,\psi\!\left(\frac{t-b}{a}\right) \tag{2}$$

where $a$ and $b$ are real numbers; $\psi_{a,b}(t)$ = wavelet function; $a$ = scale or frequency parameter; $b$ = translation parameter. The wavelet transformation is a function of the two variables $a$ and $b$. The parameter $a$ is interpreted as a dilation ($a > 1$) or contraction ($a < 1$) factor of the wavelet function $\psi(t)$ corresponding to different scales. The parameter $b$ can be interpreted as a temporal translation or shift of the function $\psi(t)$.

For a time series $f(t) \in L^2(R)$, or finite energy signal (Rosso et al., 2004), the continuous wavelet transform (CWT) of the time series $f(t)$ is defined as:

$$W_f(a,b) = |a|^{-1/2}\int_{-\infty}^{+\infty} f(t)\,\psi^{*}\!\left(\frac{t-b}{a}\right)dt \tag{3}$$

where $W_f(a,b)$ is the wavelet coefficient and "$*$" denotes the complex conjugate.

The wavelet transformation seeks the level of similarity between the time series data and the wavelet function at different scales and translations, and generates a contour map of the wavelet coefficients $W_f(a,b)$, also known as a scalogram. The CWT generates a large amount of data for all $a$ and $b$. However, if the scale and translation are chosen based on powers of two (dyadic scales and translations), then the amount of data can be reduced considerably, resulting in more efficient data analysis. This transform is called the discrete wavelet transform (DWT) and can be defined as (Mallat, 1989):

$$\psi_{m,n}\!\left(\frac{t-b}{a}\right) = a_0^{-m/2}\,\psi^{*}\!\left(\frac{t - n b_0 a_0^{m}}{a_0^{m}}\right) \tag{4}$$

where $m$ and $n$ are integers that control the wavelet scale/dilation and translation, respectively; $a_0$ is a specified fine scale step greater than 1; and $b_0$ is the location parameter and must be greater than zero. The most common and simplest choice for the parameters is $a_0 = 2$ and $b_0 = 1$.

This power-of-two logarithmic scaling of the dilations and translations is known as the dyadic grid arrangement and is the simplest and most efficient case for practical purposes (Mallat, 1989). For a discrete time series $f(t)$, where $f(t)$ occurs at discrete times $t$ (i.e. integer time steps are used here), the discrete wavelet transform becomes:

$$W_f(m,n) = 2^{-m/2}\sum_{t=0}^{N-1} f(t)\,\psi^{*}\!\left(2^{-m}t - n\right) \tag{5}$$

where $W_f(m,n)$ is the wavelet coefficient for the discrete wavelet of scale $a = 2^m$ and location $b = 2^m n$. $f(t)$ is a finite time series ($t = 0, 1, 2, \ldots, N-1$), and $N$ is an integer power of 2 ($N = 2^M$); $n$ is the time translation parameter, which changes in the range $0 < n < 2^{M-m} - 1$, where $1 < m < M$.

The DWT operates two sets of functions, viewed as high-pass and low-pass filters (Fig. 1). The original time series is passed through the high-pass and low-pass filters and separated at different scales. The time series is decomposed into one component comprising the low frequencies and the trend (the approximation) and one comprising the high frequencies and the fast events (the detail). The detail signals can capture small features of interpretational value in the data; the approximation represents the background information of the data.
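The filter-bank view of the DWT can be sketched in a few lines. The example below uses the Haar wavelet, the simplest member of the Daubechies family, purely for illustration; it is an assumption made here and not necessarily the wavelet used in this study. The function names are hypothetical, and the series length is assumed to be divisible by two at every level.

```python
import numpy as np

def haar_dwt_level(x):
    """One DWT level: low-pass gives the approximation, high-pass the detail."""
    x = np.asarray(x, dtype=float)
    if x.size % 2:
        raise ValueError("series length must be even")
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)   # low-pass filter + downsampling
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)   # high-pass filter + downsampling
    return approx, detail

def haar_idwt_level(approx, detail):
    """Invert one level, recovering the series it was computed from."""
    x = np.empty(2 * approx.size)
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x

def haar_decompose(x, levels):
    """Repeatedly split the approximation; returns (approximation, [details])."""
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        approx, d = haar_dwt_level(approx)
        details.append(d)
    return approx, details
```

Each level halves the length of the approximation, which is the dyadic arrangement described above; applying `haar_idwt_level` from the coarsest level back to the finest recovers the original series.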

2.3. Bootstrapped artificial neural networks (BANNs)

The bootstrap is a data driven simulation method that uses intensive resampling, with replacement, to reduce uncertainties (Efron, 1979; Efron and Tibshirani, 1993). The bootstrap method generates different realizations of a dataset to create bootstrap samples, and the estimates obtained from them provide the average and the variability of the estimates. Consider data consisting of a random sample $T_n = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$ of size $n$ from a population of unknown probability distribution $F$, where $t_i = (x_i, y_i)$ is a realization drawn independently and identically distributed (i.i.d.) from $F$ and consists of an input vector $x_i$ and the corresponding output vector $y_i$. The empirical distribution function $\hat{F}$ is defined to be the discrete distribution that puts probability $1/n$ on each value $t_i = (x_i, y_i)$. $T_n^{*}$ denotes a bootstrap sample of size $n$ drawn i.i.d. with replacement from $\hat{F}$. The set of $B$ bootstrap samples can be represented as $T_1, T_2, \ldots, T_b, \ldots, T_B$, in which $B$ is the total number of bootstrap samples and usually ranges from 50 to 200 (Efron, 1979). For each $T_b$, an ANN prediction model is constructed and the output is represented as $f_{ANN}(x_i; w_b)$, built using all $n$ observations, where $w_b$ is the weight vector of the ANN developed from the bootstrap sample $T_b$. The performance of the trained ANN model is evaluated using the observation pairs that are not included in a bootstrap sample, and the average performance of these ANNs on their corresponding testing datasets is used as an estimate of the generalization error of the ANN model developed on $T_b$. The generalization error of an ANN model can be estimated by its "E0" estimate (Twomey and Smith, 1998; Jia and Culver, 2006).

$$E_0 = \frac{\sum_{b=1}^{B}\sum_{i\in A_b}\left(y_i - f_{ANN}(x_i; w_b)\right)^2}{\sum_{b=1}^{B}\#(A_b)} \tag{6}$$

The output of the ANN for a particular input vector $x_i$ is represented as $f_{ANN}(x_i; w_b)$, where $A_b$ is the set of observation vectors not included in the bootstrap sample $T_b$, and $\#(A_b)$ is the number of observation vectors in $A_b$.
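The E0 estimate of Eq. (6) can be sketched as follows. To keep the example short and self-contained, an ordinary least-squares fit (`fit_linear`) stands in for the ANN; that substitution, the function names and the default `B` are assumptions made for illustration only.

```python
import numpy as np

def fit_linear(X, y):
    """Stand-in learner (least squares), used here in place of an ANN."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Xn: np.column_stack([np.ones(len(Xn)), Xn]) @ coef

def bootstrap_e0(X, y, fit, B=50, seed=0):
    """Return the E0 estimate and the B models fitted on bootstrap samples."""
    rng = np.random.default_rng(seed)
    n = len(y)
    sq_err_sum, n_left_out = 0.0, 0
    models = []
    for _ in range(B):
        idx = rng.integers(0, n, size=n)         # T_b: n draws with replacement
        out = np.setdiff1d(np.arange(n), idx)    # A_b: observations not in T_b
        model = fit(X[idx], y[idx])
        models.append(model)
        if out.size:
            sq_err_sum += float(np.sum((y[out] - model(X[out])) ** 2))
            n_left_out += out.size
    if n_left_out == 0:
        raise ValueError("no left-out observations; increase n or B")
    return sq_err_sum / n_left_out, models
```

Any learner with the same `fit(X, y) -> predict` shape can be passed in, so an ANN training routine could replace `fit_linear` without changing the resampling logic.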

For a new input $x$, the bootstrapped neural network estimate $\hat{y}(x)$ is given by the average of the $B$ bootstrapped estimates:

$$\hat{y}(x) = \frac{1}{B}\sum_{b=1}^{B} f_{ANN}(x; w_b) \tag{7}$$

and the variance is given by:

$$\sigma^2(x) = \frac{\sum_{b=1}^{B}\sum_{i\in A_b}\left[y_i - f_{ANN}(x_i; w_b)\right]^2}{B-1} \tag{8}$$

The confidence interval (CI) at the $\alpha$ significance level indicates that, in repeated application of the technique, the frequency with which the CI would contain the true value is $100 \times (1-\alpha)\%$. A typical value of $\alpha$ is 0.05, which corresponds to $(1 - 0.05) \times 100\% = 95\%$ confidence limits. A $100 \times (1-\alpha)\%$ CI covering the mean/ensemble discharge $\hat{y}(x)$ can be estimated with the following equation (Efron and Tibshirani, 1993):

$$CI = [UB, LB] = \left[\hat{y}(x) + t^{\alpha/2}_{n-p}\,\sigma(x),\ \hat{y}(x) - t^{\alpha/2}_{n-p}\,\sigma(x)\right] \tag{9}$$

where UB is the upper band, LB is the lower band, $\sigma(x)$ is the standard deviation of the $B$ bootstrapped estimates, $t^{\alpha/2}_{n-p}$ is the $\alpha/2$ percentile of the Student t distribution with $n - p$ degrees of freedom; $n$ is the total number of discharge observations; and $p$ is the total number of parameters in the ANN model.
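As a rough illustration of Eqs. (7) and (9), the sketch below averages the predictions of the B ensemble members and builds a symmetric band around the mean. It takes the spread to be the sample standard deviation across the members, following the verbal definition given with Eq. (9); that reading is an assumption. The critical value is passed in as `t_crit`, and its default of 1.96 is the large-sample limit of the Student t quantile for a 95% interval. The function name is hypothetical.

```python
import numpy as np

def ensemble_interval(member_preds, t_crit=1.96):
    """Ensemble mean and confidence band from B member predictions.

    member_preds has shape (B, n_points): one row per bootstrap model.
    Returns (mean, lower_band, upper_band), each of shape (n_points,).
    """
    P = np.asarray(member_preds, dtype=float)
    mean = P.mean(axis=0)               # Eq. (7): average of the B estimates
    sd = P.std(axis=0, ddof=1)          # spread across the B members
    return mean, mean - t_crit * sd, mean + t_crit * sd
```

For small values of n − p, the appropriate Student t quantile should be supplied through `t_crit` instead of the default.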

2.4. Multiple linear regression (MLR) model

In order to benchmark the performance of the different ANN models developed, multiple linear regression (MLR) models are also developed for hourly water level forecasting. Multiple linear regression attempts to model the relationship between two or more independent variables and a dependent variable by fitting a linear equation to the data points. A multiple linear regression equation takes the following form:

$$y = a + b_1x_1 + b_2x_2 + \cdots + b_nx_n \tag{10}$$

where $y$ is the dependent variable, $a$ is a constant and $b_1$ to $b_n$ are multipliers for the independent variables $x_1$ to $x_n$. The constant and the multipliers are estimated by minimizing the sum of squares of the deviations between each data point and the regression line. MLR has been the traditional approach utilized in water resources hydrology for several decades since the last century. Some recent applications appear in Leclerca and Ouarda (2007) and Sahoo et al. (2009).
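A least-squares fit of Eq. (10) can be sketched with NumPy as shown below. The helper names `fit_mlr` and `predict_mlr` are hypothetical and the example is illustrative rather than the implementation used in the study.

```python
import numpy as np

def fit_mlr(X, y):
    """Estimate the constant a and the multipliers b_1..b_n by least squares."""
    X = np.asarray(X, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])   # leading column of ones for a
    coef, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return coef[0], coef[1:]

def predict_mlr(a, b, X):
    """Evaluate y = a + b_1 x_1 + ... + b_n x_n for each row of X."""
    return a + np.asarray(X, dtype=float) @ b
```

In a forecasting setting, the columns of `X` would hold the lagged water levels selected as inputs and `y` the water level at the forecast lead time.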

2.5. Performance indices

The Nash–Sutcliffe coefficient (E), root mean square error (RMSE), mean absolute error (MAE) and persistence index (PERS) are used as performance measures to evaluate the accuracy of the developed models. The Nash–Sutcliffe coefficient (E), introduced by Nash and Sutcliffe (1970), is still one of the most widely used criteria for the assessment of model performance. E provides a measure of the ability of a model to predict values that are different from the mean. RMSE and MAE provide different types of information about the predictive capabilities of the model. The RMSE measures the goodness-of-fit relevant to high flow values, whereas the MAE yields a more balanced perspective of the goodness-of-fit at moderate flows. The persistence forecast substitutes the last known value for the current prediction and represents a good benchmark against which other predictions can be measured (Cannas et al., 2006; Kitanidis and Bras, 1980). These statistical terms can be expressed as follows:

(i) Nash–Sutcliffe coefficient (E): It is expressed as:

$$E = 1 - \frac{\sum_{i=1}^{n}(O_i - P_i)^2}{\sum_{i=1}^{n}(O_i - \bar{O})^2} \tag{11}$$

where $O_i$ and $P_i$ are the observed and predicted water levels; $\bar{O}$ is the mean of the observed water level; $n$ is the number of observations. The value of the Nash–Sutcliffe coefficient varies between $-\infty$ and 1. The closer the value is to 1, the better the model performance.

(ii) Root mean square error (RMSE): It is expressed as:

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(O_i - P_i)^2} \tag{12}$$

(iii) Mean absolute error (MAE): It is expressed as:

$$MAE = \frac{1}{n}\sum_{i=1}^{n}|O_i - P_i| \tag{13}$$

(iv) Persistence index (PERS): It is expressed as:

$$PERS = 1 - \frac{SSE}{SSE_{naive}} \tag{14}$$

where

$$SSE = \sum_{i=1}^{n}(O_i - P_i)^2 \tag{15}$$

and

$$SSE_{naive} = \sum_{i=1}^{n}(O_i - O_{i-L})^2 \tag{16}$$

in which the SSE terms are sums of squared errors. $O_{i-L}$ is the discharge estimate from a persistence model (or naïve model) that takes the last discharge observation (at time $i$ minus the lead time $L$) as a prediction. Persistence consists of a comparison between the model under study and the naïve model. A value of PERS smaller than or equal to 0 indicates that the model under study performs worse than, or no better than, the easy to implement naïve model. A PERS value of 1 is obtained when the model under study provides exact estimates of the observed discharge.
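The four indices in Eqs. (11)–(16) translate directly into a few NumPy functions. The sketch below is illustrative and the function names are hypothetical. For PERS it assumes that both sums are evaluated only on the points for which an observation L steps earlier exists, since the naïve forecast is undefined for the first L points.

```python
import numpy as np

def nash_sutcliffe(obs, pred):
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)

def rmse(obs, pred):
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))

def mae(obs, pred):
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.mean(np.abs(obs - pred)))

def persistence_index(obs, pred, lead):
    """PERS for a forecast lead of `lead` time steps (lead >= 1)."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    sse = np.sum((obs[lead:] - pred[lead:]) ** 2)
    sse_naive = np.sum((obs[lead:] - obs[:-lead]) ** 2)   # O_i - O_{i-L}
    return 1.0 - sse / sse_naive
```

A perfect forecast gives E = 1 and PERS = 1, while a forecast that simply repeats the observation from `lead` steps earlier gives PERS = 0.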


3. Study area and data used

The Mahanadi River basin, which is one of the largest river basins in India, is selected for this study. The Mahanadi River flows to the Bay of Bengal in east-central India, draining an area of 141,589 km² over a length of 851 km. The Mahanadi River basin lies between 80°30′E–86°50′E longitude and 19°21′N–23°35′N latitude. About 53% of the basin is in the state of Chhattisgarh, 46% is in the coastal state of Orissa, and the remainder of the basin is in the states of Jharkhand and Maharashtra. Numerous dams, irrigation projects, and barrages are present in the Mahanadi River basin, the most prominent of which is Hirakud Dam. The middle reaches of the lower Mahanadi River basin, located in Orissa between 82°E 19°N and 86°E 22°N and encompassing a geographical area of 47,558.6 km², form the study area (Fig. 2). The main river reach extends from Hirakud Dam to Naraj, with a total length of 358.4 km. The main soil types found in the study area are red and yellow soils. The normal annual rainfall is 1458 mm, and the temperature in this region varies from 14 °C to 40 °C. The average monthly pan evaporation of the area varies from 2.4 mm to 14.6 mm. Most of the rainfall and river flow occur during the monsoon season, between June and September. In the delta region of the Mahanadi River basin, flooding is a serious problem during the monsoon season almost every year. Naraj, which is situated at the mouth of the delta, is selected for water level forecasting.

The location of the different gauging stations is shown in Fig. 2. The data used for the study consist of the hourly water levels of five gauging stations (Kesinga, Salebhata, Kantamal, Tikarpara and Naraj) during the monsoon period (23 June at 9 am to 29 September at 11 pm) from 2001 to 2005, yielding 11,835 vectors of data. Some of the statistical properties of the water level data are presented in Table 1.
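The record count quoted above can be checked with a few lines of date arithmetic. The sketch assumes exactly one reading per hour with no gaps, counting both end points of each monsoon window; the helper name `hourly_count` is hypothetical.

```python
from datetime import datetime, timedelta

def hourly_count(start, end):
    """Number of hourly time stamps from start to end, both inclusive."""
    return int((end - start) / timedelta(hours=1)) + 1

# One monsoon window per year: 23 June 09:00 to 29 September 23:00.
per_season = [
    hourly_count(datetime(year, 6, 23, 9), datetime(year, 9, 29, 23))
    for year in range(2001, 2006)
]
total = sum(per_season)

print(per_season)  # [2367, 2367, 2367, 2367, 2367]
print(total)       # 11835
```

Under that assumption, each season contributes 2367 hourly records and the five seasons together give the 11,835 data vectors.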

Fig. 2. Index map of the middle reaches of Mahanadi Rive

4. Model development

One of the most important steps in the ANN hydrologic modeldevelopment process is the determination of significant input vari-ables. In our earlier study on hourly water level forecasting (Tiwariand Chatterjee, 2010), for the same datasets in Mahanadi River ba-sin we determined the significant inputs for BANN model usingcross-correlation statistics (Sudheer et al., 2002) and these areshown in Table 2. We also determined the optimal number of hid-den neurons using generalization error as seven with learningcoefficient 0.3 and momentum 0.8 by trial and error. The numberof inputs as well as neural network parameters for all the modelsin this study is taken to be same so that their performance canbe compared. Five years of hourly water level data of five water le-vel gauging stations are divided into three parts. Hourly water levelof years 2001–2003 (7101 patterns) are taken for training, 2004(2367 patterns) for cross-validation and 2005 (2367 patterns) fortesting. The training dataset is used to train the ANN models andthe testing dataset is used to evaluate the performances of models,whereas cross-validation dataset is used to apply an early stoppingapproach (Bishop, 1995) in order to avoid overtraining or over-fit-ting of the training datasets. In early stopping technique, the objec-tive function at each iteration of training and cross-validation ismonitored and the training is stopped when the cross-validationerror is minimum.

Three ANN models developed in this study are; traditional ANN,Wavelet based ANN (WANN) and wavelet–bootstrap based ANNconjunction model (WBANN). The performance of these modelsis compared with the Bootstrap based ANN (BANN) model devel-oped earlier (Tiwari and Chatterjee, 2010). At first, a multi layerperception (MLP) feedforward ANN model is developed. The ANNmodels are developed using the most significant inputs which

r basin showing location of different gauging stations.

Table 1. Statistics of the data set for hourly water level forecasting.

Year  Statistics              Kesinga  Salebhata  Kantamal  Tikarpara  Naraj
2001  Mean (m)                170.71   131.76     123.31    60.71      23.88
      Standard deviation (m)  1.38     0.83       1.86      3.99       1.36
      Maximum (m)             176.70   135.45     129.95    72.25      27.22
      Minimum (m)             169.18   130.71     120.93    55.63      22.14
2002  Mean (m)                169.30   131.15     121.21    56.79      22.31
      Standard deviation (m)  0.53     0.61       0.85      2.11       0.84
      Maximum (m)             171.53   133.54     124.07    65.16      25.26
      Minimum (m)             168.46   130.40     120.27    54.45      20.89
2003  Mean (m)                170.48   131.84     122.93    60.29      23.54
      Standard deviation (m)  1.13     1.32       1.89      4.58       1.59
      Maximum (m)             176.05   139.90     130.37    73.20      27.05
      Minimum (m)             168.84   130.12     120.56    55.21      21.16
2004  Mean (m)                170.15   131.24     122.13    58.14      22.88
      Standard deviation (m)  0.81     0.63       1.34      2.71       0.98
      Maximum (m)             173.37   134.41     129.34    67.29      25.93
      Minimum (m)             169.16   130.43     120.27    55.21      21.61
2005  Mean (m)                169.95   131.04     122.04    58.99      23.14
      Standard deviation (m)  1.12     0.68       1.57      3.52       1.42
      Maximum (m)             177.41   135.04     130.10    69.12      26.14
      Minimum (m)             168.62   130.14     120.21    54.23      20.57

Table 2. Most significant input vectors selected using cross-correlation statistics (CCF, ACF, PACF).

Stations   Input variables (lags)
Naraj      1–8 h
Tikarpara  15–20 h
Kantamal   37–40 h
Salebhata  31–36 h
Kesinga    43–49 h
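As a rough illustration of how the lag ranges in Table 2 translate into a single input pattern, the sketch below collects the lagged water levels of the five stations into one vector. The names `LAG_RANGES` and `build_input_vector` are hypothetical, and the convention that a lag of k hours means the value k steps before index `t` is an assumption; the paper does not spell out the indexing.

```python
# Lag ranges in hours, copied from Table 2 (both ends inclusive).
LAG_RANGES = {
    "Naraj": (1, 8),
    "Tikarpara": (15, 20),
    "Kantamal": (37, 40),
    "Salebhata": (31, 36),
    "Kesinga": (43, 49),
}


def build_input_vector(series, t):
    """Collect the lagged values for time index t.

    series: dict mapping station name -> list of hourly water levels.
    t: index of the reference hour (assumed convention: lag k reads index t - k).
    """
    max_lag = max(last for _, last in LAG_RANGES.values())
    if t < max_lag:
        raise ValueError("t must be at least %d to cover the longest lag" % max_lag)
    vector = []
    for station, (first, last) in LAG_RANGES.items():
        for lag in range(first, last + 1):
            vector.append(series[station][t - lag])
    return vector
```

With these ranges each pattern has 8 + 6 + 4 + 6 + 7 = 31 inputs.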

M.K. Tiwari, C. Chatterjee / Journal of Hydrology 394 (2010) 458–470 463

are first log-transformed and then linearly scaled to the range (0, 1) for ANN modeling (e.g. Campolo et al., 1999). A computationally efficient second-order training method, the Levenberg–Marquardt method, is used to minimize the mean squared error between the forecast and observed water levels. In the next step, the DWCs are used as inputs to the ANN model to develop the WANN model. The DWT decomposes an original time series into several components (i.e., DWCs) at different scales (or frequencies). Each component has a distinct role in the original water level time series. The low-frequency component generally reflects the identity (periodicity and trend) of the signal, whereas the high-frequency component uncovers details (Kucuk and Agiralloglu, 2006). The wavelet function is derived from the family of Daubechies wavelets (Wu et al., 2009; Nourani et al., 2009). The number of decomposition levels, and hence of DWCs, is determined using the following formula (Nourani et al., 2008):

$$L = \mathrm{int}[\log(N)] \qquad (17)$$

where L and N are the decomposition level and the number of time series data, respectively. This study uses N = 7101 for training, which gives L = 3 (with the logarithm taken to base 10, log(7101) ≈ 3.85). The WANN models are developed using the sub-series DWCs obtained by DWT. For this purpose, each original time series is first decomposed into three levels of DWCs by DWT. The three levels of detail and the approximation for the water level data of Kantamal and Tikarpara are shown in Fig. 3. The effective wavelet components are determined using the correlation coefficients between each wavelet component and the observed water level at Naraj; the correlation between a periodic component and the original data reveals the effectiveness of that component. Table 3 shows that the only significant components are the approximations of the time series at all five gauging stations, i.e. the A3 components of the Kesinga, Salebhata, Kantamal, Tikarpara and Naraj time series constitute the new wavelet water level series, which are used as the inputs of the WANN model for hourly water level forecasting.

The BANN model is developed as an ensemble of several ANNs built using bootstrap resamples of the raw datasets, whereas the WBANN model is developed as an ensemble of several ANNs built using bootstrap resamples of the DWCs instead of the raw datasets. In this way the WBANN model uses the capabilities of both bootstrap resampling and wavelet transformation. As with the BANN model, the WBANN model is developed using 50 resamples to maintain consistency. Bootstrap.xla, an Excel add-in (Barreto and Howland, 2006), is used to generate the bootstrap resamples of the raw datasets and DWCs for BANN and WBANN model development, respectively. A simple flowchart depicting the development of the ANN, BANN, WANN and WBANN models is shown in Fig. 4. The number of inputs is kept the same for all four models to maintain consistency. As for the ANN, the WANN, BANN and WBANN model structures are tested for 1–15 hidden neurons, and for all three models the structure with seven hidden neurons, for which the generalization error is minimum, is chosen as optimal. Therefore, the number of hidden neurons is seven for all four models.
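The decomposition level of Eq. (17) and the correlation-based screening of wavelet components can be sketched as below. This is only an illustration under stated assumptions: the logarithm is taken to base 10 (which reproduces L = 3 for N = 7101), the function names are hypothetical, and the `threshold` cutoff is invented here for demonstration, since the paper selects components by inspecting the correlations in Table 3 rather than by a stated numeric cutoff.

```python
import math


def decomposition_level(n):
    """Eq. (17): L = int[log(N)], assuming a base-10 logarithm."""
    return int(math.log10(n))


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def select_components(components, target, threshold=0.5):
    """Keep the components whose correlation with the target series reaches
    the (hypothetical) threshold.

    components: dict mapping a component name such as "A3" or "D1" to its series.
    target: observed series at the forecast station.
    """
    return [name for name, values in components.items()
            if pearson(values, target) >= threshold]
```

Applied to correlations like those in Table 3, any cutoff between roughly 0.02 and 0.6 would retain only the A3 components.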

5. Results and discussion

5.1. Hourly water level forecasting

The results of the WBANN model in terms of the E, RMSE and MAE performance indices for 1–10 h lead time forecasts for the testing period are shown in Table 4. The WBANN model performs best for 1 h lead time forecasts, with the lowest RMSE and MAE and the highest E; performance deteriorates gradually at longer lead times. The results in terms of E, RMSE and MAE for the three remaining models, traditional ANN, WANN and BANN, are shown in Table 5. It is clear from Tables 4 and 5 that for 1 h and 2 h lead time forecasts the WBANN and BANN models perform better than the traditional ANN and WANN models. Compared to the traditional ANN model, the performance of all three other models (WBANN, WANN and BANN) is better in most cases. For longer lead times, the WBANN model performs better than the remaining three models in most cases. It can be observed that the performance of the WANN model for 3 h lead time is better than for 2 h lead time, and that the performance of the traditional ANN model for 4 h lead time is better than for 3 h and for 6 h is better than for 5 h. This reveals that the traditional ANN and WANN model predictions are not consistent and exhibit uncertainty, whereas the BANN and WBANN model predictions are more consistent and can be considered more reliable. The better performance of the BANN and WBANN models can be attributed to the bootstrap resampling technique, which generates different realizations of the datasets; the different bootstrap based ANN models may play complementary roles in approximating the process and so reduce the variance. Therefore, an ensemble model is generally better than a single model. Another reason for the better performance of the WBANN model may be that the wavelet transformation reduces the noise in the water level time series, making the forecasts more reliable and accurate. The capability of the wavelet technique to reduce noise is evident from Table 3: only the approximation series of the water level series at the different gauge stations have good correlation coefficients with the observed water level at the Naraj gauging station, whereas the detail wavelet components show no correlation with it, which implies that the detail sub-series are the noisy components separated out by the wavelet transformation.

Fig. 3. Discrete wavelet components of water levels of Kantamal and Tikarpara for the year 2005. [Panels: Kantamal; Tikarpara]

Table 3. The correlation coefficients between the discrete wavelet components and the observed water level data at Naraj.

Discrete wavelet   Gauge stations
components         Salebhata  Kesinga  Kantamal  Tikarpara  Naraj
A3                 0.7046     0.6405   0.7247    0.9564     0.9999
D1                 0.0020     0.0037   0.0008    0.0051     0.0069
D2                 0.0008     0.0014   0.0006    0.0027     0.0057
D3                 0.0010     0.0016   0.0001    0.0026     0.0104
Original           0.7039     0.6403   0.7226    0.9564     1.0000

Fig. 4. Flowchart showing the development of ANN, BANN, WANN and WBANN models. [Flowchart boxes: Hourly water level series; Significant input; DWT; DWCs; Bootstrap resampling; ANN development; BANN development; WANN development; WBANN development]

Table 4. Performance indices for 1–10 h lead time forecasts for WBANN model.

Model  Lead time (h)  E       RMSE (m)  MAE (m)
WBANN  1              0.9997  0.026     0.021
       2              0.9995  0.030     0.021
       3              0.9991  0.042     0.032
       4              0.9986  0.051     0.033
       5              0.9979  0.063     0.043
       6              0.9970  0.076     0.052
       7              0.9961  0.086     0.057
       8              0.9948  0.100     0.066
       9              0.9934  0.113     0.075
       10             0.9918  0.126     0.083

Table 5. Performance indices for 1–10 h lead time forecasts for ANN, WANN and BANN models.

Model  Lead time (h)  E       RMSE (m)  MAE (m)
ANN    1              0.9993  0.039     0.026
       2              0.999   0.044     0.030
       3              0.9968  0.079     0.059
       4              0.9976  0.069     0.047
       5              0.9962  0.086     0.061
       6              0.9963  0.085     0.059
       7              0.9949  0.100     0.065
       8              0.9925  0.120     0.078
       9              0.992   0.123     0.082
       10             0.9906  0.135     0.093
WANN   1              0.9994  0.034     0.019
       2              0.9991  0.041     0.033
       3              0.9993  0.038     0.026
       4              0.9986  0.051     0.036
       5              0.9976  0.069     0.049
       6              0.997   0.076     0.052
       7              0.9948  0.100     0.067
       8              0.9941  0.106     0.069
       9              0.9919  0.125     0.083
       10             0.9916  0.127     0.088
BANN   1              0.9997  0.024     0.018
       2              0.9995  0.032     0.023
       3              0.999   0.043     0.031
       4              0.9987  0.050     0.033
       5              0.9978  0.065     0.045
       6              0.9972  0.072     0.049
       7              0.9958  0.090     0.058
       8              0.9945  0.103     0.067
       9              0.9931  0.116     0.075
       10             0.9912  0.130     0.085

Table 6. Number of data points in low-, medium-, and high-water level categories.

Category                  Number of points  Percentage of total data
Low (x < μ)               1429              60.37
Medium (μ ≤ x ≤ μ + 2σ)   914               38.61
High (x > μ + 2σ)         24                1.01
Total                     2367              100

μ = 23.14; μ + 2σ = 25.98. μ is the mean and σ is the standard deviation.
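For reference, the three indices reported in Tables 4 and 5 can be computed as in the sketch below. The definitions are given earlier in the paper and are not repeated in this section, so the forms used here are assumptions: E is taken to be the Nash–Sutcliffe efficiency (Nash and Sutcliffe, 1970), and RMSE and MAE follow their usual definitions. Function names are illustrative.

```python
import math


def nash_sutcliffe(observed, predicted):
    """E = 1 - sum((O - P)^2) / sum((O - mean(O))^2); assumed definition of E."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def rmse(observed, predicted):
    """Root mean squared error, in the units of the data (m here)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error, in the units of the data (m here)."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n
```

A perfect forecast gives E = 1 and RMSE = MAE = 0, which is the direction in which the 1 h results in Table 4 point.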

Fig. 5. Distribution of forecast error across different error thresholds for (a) 3, (b) 7, and (c) 10 h lead time forecasts for high water level profiles using BANN, WBANN, WANN and ANN models. [Panel title: High Water Level]

Table 7. Performance of MLR model for hourly water level forecasting.

Lead time (h)  E       RMSE (m)  MAE (m)
1              0.9544  0.31      0.31
2              0.9815  0.20      0.20
3              0.9987  0.05      0.05
4              0.9985  0.05      0.03
5              0.986   0.16      0.15
6              0.9886  0.15      0.13
7              0.9604  0.28      0.26
8              0.9931  0.11      0.07
9              0.9916  0.13      0.08
10             0.9899  0.14      0.09

Table 8. Persistence index (PERS) for WBANN model for 1–10 h lead time forecasts.

Lead time (h)  Persistence index
1              0.08
2              0.58
3              0.65
4              0.71
5              0.70
6              0.70
7              0.69
8              0.67
9              0.67
10             0.65

Fig. 6. Observed water levels with predicted 95% confidence bands using WBANN model for (a) low (b) medium and (c) high water level profiles for 3 h lead time forecast.

For flood forecasting, it is necessary to know how well the different ANN models predict high water levels, since prediction accuracy for high water levels is of utmost importance compared to that for low or medium water level profiles. The performance of the ANN models in forecasting high magnitude water levels at different lead times is assessed using a "partitioning analysis" (Jain and Srinivasulu, 2004), which is carried out by dividing the water level values into low, medium and high magnitude categories. Table 6 presents the partitioning of the water level values for the testing period based on their relative spread from the mean. The absolute relative error (ARE) and threshold statistics (TS), which give a performance index for the predictions as well as the distribution of the prediction errors (Nayak et al., 2005), are chosen for analyzing the performance of the different ANN models for high water level prediction.
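The partitioning rule of Table 6 can be written as a small classifier. This is a sketch with illustrative function names; the mean and standard deviation are passed in explicitly (23.14 m and 1.42 m for Naraj in 2005, from Table 1) rather than recomputed, because the paper does not state whether a sample or population standard deviation was used.

```python
def classify_level(x, mean, std):
    """Assign a water level to the low, medium or high category of Table 6."""
    if x < mean:
        return "low"
    if x <= mean + 2 * std:
        return "medium"
    return "high"


def partition_counts(levels, mean, std):
    """Count how many values fall in each category."""
    counts = {"low": 0, "medium": 0, "high": 0}
    for x in levels:
        counts[classify_level(x, mean, std)] += 1
    return counts
```

With mean 23.14 m and standard deviation 1.42 m the upper threshold is 23.14 + 2 × 1.42 = 25.98 m, matching the note under Table 6.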

The threshold statistic for a level of x% is a measure of consistency in the forecasting errors from a particular model. The threshold statistic is represented as TSx and expressed as a percentage. This criterion can be expressed for different levels of absolute relative error (ARE) from the model. ARE is given as:

$$\mathrm{ARE}_i = \left| \frac{O_i - P_i}{O_i} \right| \times 100 \qquad (18)$$

TSx is computed for x% level as:

$$\mathrm{TS}_x = \frac{Y_x}{n} \times 100 \qquad (19)$$

where Yx is the number of computed values (out of n in total) for which the absolute relative error from the model is less than x%.
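Eqs. (18) and (19) translate directly into code. The sketch below assumes that O and P denote the observed and predicted values, as is conventional; the function names are illustrative.

```python
def absolute_relative_errors(observed, predicted):
    """Eq. (18): ARE_i = |(O_i - P_i) / O_i| * 100, one value per data point."""
    return [abs((o - p) / o) * 100.0 for o, p in zip(observed, predicted)]


def threshold_statistic(observed, predicted, x):
    """Eq. (19): percentage of points whose ARE is less than x percent."""
    are = absolute_relative_errors(observed, predicted)
    y_x = sum(1 for e in are if e < x)
    return 100.0 * y_x / len(are)
```

Evaluating `threshold_statistic` over a range of x values gives an error distribution curve of the kind plotted in Fig. 5.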

The distribution of forecast error across different error thresholds for 3 h, 7 h and 10 h lead time forecasts for high water level profiles is shown in Fig. 5. It is evident from the figure that the WBANN model performs better for high water level profiles than the remaining three models for 3, 7 and 10 h lead time forecasts.

Fig. 7. Observed water levels with predicted 95% confidence bands using WBANN model for (a) low (b) medium and (c) high water level profiles for 7 h lead time forecast.

5.2. Comparison of developed ANN models with MLR and a simple naïve persistence model

Fig. 8. Observed water levels with predicted 95% confidence bands using WBANN model for (a) low (b) medium and (c) high water level profiles for 10 h lead time forecast.

Fig. 9. Percentage of observed values included in WBANN predicted confidence band of the testing set for different lead times. [Axes: Confidence level (%) versus Observed values included (%)]

The goal of multiple regression analysis is to evaluate the relationship between several independent or predictor variables and a dependent or criterion variable. The water level at Naraj station is selected as the dependent variable, and the same input variables as for the ANNs are selected as independent variables for MLR model development. The MLR model is first fitted (to determine the regression coefficients) using the data in the training set (2000–2004) and then tested using the testing dataset (2005). The SPSS software package (version 10, SPSS Inc., Chicago, Illinois) is used for the regression calculations. Performance indices for 1–10 h lead time flood forecasts using the MLR models are shown in Table 7. Tables 4, 5 and 7 indicate that the WBANN, BANN, WANN and ANN models perform better than the MLR model for all lead time forecasts. Table 8 shows that the PERS values for the WBANN model are greater than 0 for 1–10 h lead time forecasts, and therefore the WBANN model is better than a simple naïve model. These findings again show the strength and capability of WBANN modeling for hourly water level forecasting.
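The comparison against the naïve model can be sketched as follows. The exact PERS formula is given earlier in the paper and is not reproduced in this section, so the form below is an assumption: it follows the persistence coefficient commonly attributed to Kitanidis and Bras (1980), in which the naïve forecast for hour t is the value observed `lead` hours earlier. The function name is illustrative.

```python
def persistence_index(observed, predicted, lead):
    """PERS = 1 - SSE(model) / SSE(naive), assumed form.

    The naive forecast for index t is observed[t - lead]. Values above 0
    mean the model beats the naive forecast; 1 is a perfect forecast.
    """
    if lead < 1 or lead >= len(observed):
        raise ValueError("lead must be between 1 and len(observed) - 1")
    indices = range(lead, len(observed))
    sse_model = sum((observed[t] - predicted[t]) ** 2 for t in indices)
    sse_naive = sum((observed[t] - observed[t - lead]) ** 2 for t in indices)
    # Note: sse_naive is zero for a constant series, in which case the
    # index is undefined and this division would fail.
    return 1.0 - sse_model / sse_naive
```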

5.3. Predictive uncertainty in hourly flood forecasting

The WBANN model, which utilizes the capabilities of both the wavelet and bootstrap resampling techniques, performed better than the remaining three ANN models and the MLR model. The capacity of the bootstrap technique is extended to assess the predictive uncertainty in hourly flood forecasting for 1–10 h lead time forecasts. In the WBANN model, 50 WANN models are developed, one for each bootstrap resample dataset (i.e. resample of DWCs); the resamples are drawn with replacement from the original dataset and have the same length as the original. Thus, there is a set of 50 WANN models for each of the 1–10 h lead times, i.e. 50 sets of weights instead of one. These 50 sets of weights are applied to the testing dataset, so that for each data point in the testing dataset there are 50 forecasted values. These values are used to build a confidence interval (CI), as shown in Eq. (9), to assess the predictive uncertainty. Figs. 6–8 show the 95% confidence bands for hydrographs from low, medium and high water level profiles for 3 h, 7 h and 10 h lead time forecasts. The upper band (UB) and lower band (LB) of the CIs show the uncertainty involved in the predictions. It is clear from the figures that the confidence band widens as lead time increases.

Peak water level values of hydrographs that fall in the low, medium and high water level categories are selected for further analysis of the capability of the WBANN model for uncertainty assessment in hourly flood forecasting. Table 9 gives some statistics of the empirical distribution of the water levels predicted by the 50 bootstrapped models for 1–10 h lead times, for actual water levels of 23.17 m, 24.82 m and 26.14 m, the peak values of the low, medium and high water level profile hydrographs, respectively. The table shows that the confidence band widens for the peak values of medium and high water level forecasts, whereas for a particular water level profile (low, medium or high) the width is almost consistent across 1–10 h lead time forecasts. The results show that the peaks of the hydrographs are overestimated for almost all the low and high water levels and underestimated for the medium water levels.

Table 9. Ninety-five percent confidence bands using the WBANN model for low, medium, and high peak water level forecast results.

Water level  Lead time (h)  Observed water  Mean forecasted  95% upper  95% lower  Standard
                            level (m)       water level (m)  bound (m)  bound (m)  deviation (m)
Low          1              23.17           23.17            23.25      23.10      0.04
             2              23.17           23.17            23.24      23.11      0.03
             3              23.17           23.18            23.23      23.12      0.03
             4              23.17           23.17            23.24      23.11      0.03
             5              23.17           23.19            23.26      23.13      0.03
             6              23.17           23.22            23.27      23.16      0.03
             7              23.17           23.21            23.26      23.15      0.03
             8              23.17           23.22            23.28      23.16      0.03
             9              23.17           23.20            23.27      23.14      0.03
             10             23.17           23.22            23.30      23.13      0.04
Medium       1              24.82           24.79            24.90      24.67      0.06
             2              24.82           24.80            24.90      24.69      0.05
             3              24.82           24.79            24.87      24.71      0.04
             4              24.82           24.77            24.87      24.67      0.05
             5              24.82           24.78            24.88      24.68      0.05
             6              24.82           24.78            24.90      24.67      0.06
             7              24.82           24.82            24.90      24.74      0.04
             8              24.82           24.83            24.97      24.68      0.07
             9              24.82           24.76            24.94      24.57      0.09
             10             24.82           24.69            24.90      24.47      0.11
High         1              26.14           26.15            26.33      25.97      0.09
             2              26.14           26.15            26.31      26.00      0.08
             3              26.14           26.18            26.34      26.01      0.08
             4              26.14           26.19            26.39      25.99      0.10
             5              26.14           26.24            26.41      26.07      0.09
             6              26.14           26.26            26.47      26.05      0.11
             7              26.14           26.26            26.49      26.02      0.12
             8              26.14           26.28            26.52      26.04      0.12
             9              26.14           26.34            26.65      26.04      0.15
             10             26.14           26.34            26.67      26.02      0.17
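The two bootstrap steps described in this section, resampling with replacement and turning the member forecasts into a band, can be sketched as below. Eq. (9) is not reproduced in this part of the paper, so the band here uses a normal approximation (mean ± 1.96 standard deviations of the member forecasts) as an assumption; the study's exact interval may differ. The function names and the use of Python's `random.Random` are illustrative, and are not a reimplementation of the Bootstrap.xla add-in.

```python
import random
import statistics


def bootstrap_resample(patterns, rng):
    """Draw len(patterns) items with replacement from the original patterns."""
    n = len(patterns)
    return [patterns[rng.randrange(n)] for _ in range(n)]


def confidence_band(member_forecasts, z=1.96):
    """Return (lower, mean, upper) for the forecasts of all ensemble members
    at one time step, using an assumed normal approximation."""
    mean = statistics.mean(member_forecasts)
    sd = statistics.stdev(member_forecasts)
    return mean - z * sd, mean, mean + z * sd
```

As a plausibility check only, the 10 h high-peak row of Table 9 (mean 26.34 m, standard deviation 0.17 m) gives 26.34 ± 1.96 × 0.17, i.e. roughly 26.01 m to 26.67 m, close to the tabulated bounds of 26.02 m and 26.67 m; because the tabulated values are rounded, this does not confirm which interval formula was used.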


The number of observed values actually included in the confidence band, compared with the confidence level (1 − α) assumed to draw the band, can be a very important indicator of the accuracy of the confidence limits (Tamea et al., 2005). Fig. 9 shows the results of this analysis for the testing dataset for 3, 7 and 10 h lead time forecasts. For 3 h lead time forecasts, more observed values are included in the confidence band than theoretically expected. This is because, owing to very high autocorrelation, the model is not able to map the nonlinear relationship between inputs and outputs during training for short lead time forecasting. For 7 and 10 h lead times, the observed values included in the confidence band are slightly fewer than theoretically expected. This discrepancy arises because the confidence bands are computed considering only the predictive uncertainty arising from sampling variability, while other sources of uncertainty are not considered in this study. The slight discrepancy in the observed points included between the training and testing datasets is due to the lack of representativeness of the training dataset with respect to the testing dataset. Although the means and standard deviations in Table 1 do not indicate a lack of representativeness in the testing dataset, it should be noted that this study uses only 5 years of data and the training dataset includes a limited number of past flood events.
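The coverage check plotted in Fig. 9 amounts to counting how many observations fall inside the band. A minimal sketch, with an illustrative function name:

```python
def coverage_percentage(observed, lower, upper):
    """Percentage of observed values lying within [lower_i, upper_i]."""
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper) if lo <= o <= hi)
    return 100.0 * inside / len(observed)
```

For a well-calibrated band drawn at confidence level (1 − α), this percentage should be close to 100 × (1 − α); values above it correspond to the 3 h case discussed above, and values below it to the 7 and 10 h cases.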

6. Summary and conclusions

This research work presents a WBANN model based on wavelet and bootstrap techniques for hourly flood forecasting. The time series of the observed data are decomposed into different periodic components, considering that each wavelet component makes a distinct contribution to the original time series. The significant wavelet components are selected based on the correlation between the particular component series and the observed water level at the Naraj gauging station, and a new series is composed by adding the significant components for each time series. In this study only the approximation component is found to be significant; the detail components are found to be noise. The bootstrap resampling method is used to generate different realizations of the datasets newly constructed by discrete wavelet transformation; these bootstrap samples are used to develop different ANN models, whose estimates provide a better understanding of the average and variability of the original unknown distribution or process. Fifty bootstrapped ANN models are generated to reduce the uncertainty involved in ANN predictions. The inclusion of the dominant wavelet components in the ANN input layer reduced the noise in the time series data, whereas the bootstrapping technique reduced the uncertainty inherent in conventional ANN modeling by reducing the variance. WBANN models are developed to combine the individual capabilities of the wavelet and bootstrap techniques. The WBANN model is found to be superior to the traditional ANN, BANN and WANN models in terms of the selected performance criteria for 1–10 h lead time forecasts. The ARE and TS statistics showed that the performance of the WBANN model for high water level forecasts is better than that of the ANN, BANN and WANN models. The reliability of the confidence intervals is judged by computing the percentage of observed values actually included in the confidence bands for different significance levels. Considering that only the predictive uncertainty arising from data resampling is assessed, the percentages of observed values included in the confidence bands are satisfactory and thus show the reliability of the confidence bands. It is found that the reliability of forecasts can be increased by constructing confidence intervals and ensemble predictions. This study shows that the WBANN method is an appropriate tool for hourly water level forecasting.

References

Abrahart, R.J., 2003. Neural network rainfall–runoff forecasting based on continuous resampling. Journal of Hydroinformatics 5 (1), 51–61.

Aksoy, H., Guven, A., Aytek, A., Yuce, M.I., Unal, N.E., 2007. Discussion of generalized regression neural networks for evapotranspiration modelling by O. Kisi (2006). Hydrological Sciences Journal 52 (4), 825–828.

Alp, M., Cigizoglu, H.K., 2007. Suspended sediment load simulation by two artificial neural network methods using hydrometeorological data. Environmental Modelling and Software 22 (1), 2–13.

Altunkaynak, A., 2007. Forecasting surface water level fluctuations of Lake Van by artificial neural networks. Water Resources Management 21 (2), 399–408.

Anctil, F., Tape, D.G., 2004. An exploration of artificial neural network rainfall–runoff forecasting combined with wavelet decomposition. Journal of Environmental Engineering and Sciences 3, 121–128.

Barreto, H., Howland, F.M., 2006. Introductory Econometrics: Using Monte Carlo Simulation with Microsoft Excel. Cambridge University Press, New York.

Bates, B.C., Townley, L.R., 1988. Nonlinear, discrete flood event models. 3: analysis of prediction uncertainty. Journal of Hydrology 99, 91–101.

Beven, K., 2006. A manifesto for the equifinality thesis. Journal of Hydrology 320, 18–36.

Bishop, C.M., 1995. Neural Networks for Pattern Recognition. Clarendon Press, Oxford, UK.

Bray, M., Han, D., 2004. Identification of support vector machines for runoff modelling. Journal of Hydroinformatics 6, 265–280.

Campolo, M., Andreussi, P., Soldati, A., 1999. River flood forecasting with a neural network model. Water Resources Research 35 (4), 1191–1197.

Cannas, B., Fanni, A., See, L., Sias, G., 2006. Data preprocessing for river flow forecasting using neural networks: wavelet transforms and data partitioning. Physics and Chemistry of the Earth 31 (18), 1164–1171.

Cannon, A.J., Whitfield, P.H., 2002. Downscaling recent streamflow conditions in British Columbia, Canada using ensemble neural network models. Journal of Hydrology 259, 136–151.

Cigizoglu, H.K., 2005a. Application of the generalized regression neural networks to intermittent flow forecasting and estimation. Journal of Hydrologic Engineering 10 (4), 336–341.

Cigizoglu, H.K., 2005b. Generalized regression neural network in monthly flow forecasting. Civil Engineering and Environmental Systems 22 (2), 71–81.

Cigizoglu, H.K., Alp, M., 2006. Generalized regression neural network in modelling river sediment yield. Advances in Engineering Software 37 (2), 63–68.

Coulibaly, P., Evora, N.D., 2007. Comparison of neural network methods for infilling missing daily weather records. Journal of Hydrology 341, 27–41.

Daubechies, I., 1990. The wavelet transform, time–frequency localization and signal analysis. IEEE Transactions on Information Theory 36 (5), 6–7.

Dybowski, R., Roberts, S.J., 2000. Confidence and prediction intervals for feedforward neural networks. In: Dybowski, R., Gant, V. (Eds.), Clinical Applications of Artificial Neural Networks. Cambridge University Press.

Efron, B., 1979. Bootstrap methods: another look at the jackknife. Annals of Statistics 7, 1–26.

Efron, B., Tibshirani, R.J., 1993. An Introduction to the Bootstrap. Chapman and Hall, London, UK.

El-Shafie, A., Taha, M.R., Noureldin, A., 2007. A neuro-fuzzy model for inflow forecasting of the Nile river at Aswan high dam. Water Resources Management 21, 533–556.

Han, D., Chan, L., Zhu, N., 2007a. Flood forecasting using support vector machines. Journal of Hydroinformatics 9 (4), 267–276.

Han, D., Kwong, T., Li, S., 2007b. Uncertainties in real-time flood forecasting with neural networks. Hydrological Processes 21 (2), 223–228.

Haykin, S., 1999. Neural Networks: A Comprehensive Foundation, second ed. Prentice Hall, Englewood Cliffs, NJ.

Jacquin, A.P., Shamseldin, A.Y., 2009. Review of the application of fuzzy inference systems in river flow forecasting. Journal of Hydroinformatics 11 (3–4), 202–210.

Jain, A., Srinivasulu, S., 2004. Development of effective and efficient rainfall–runoff models using integration of deterministic, real-coded genetic algorithms and artificial neural network techniques. Water Resources Research 40, W04302.

Jain, S.K., Das, D., Srivastava, D.K., 1999. Application of ANN for reservoir inflow prediction and operation. Journal of Water Resources Planning and Management 125 (5), 263–271.

Jeong, D., Kim, Y.O., 2005. Rainfall–runoff models using artificial neural networks for ensemble streamflow prediction. Hydrological Processes 19 (19), 3819–3835.

Jia, Y., Culver, T.B., 2006. Bootstrapped artificial neural networks for synthetic flow generation with a small data sample. Journal of Hydrology 331, 580–590.

Kim, T.W., Valdes, J.B., 2003. Nonlinear model for drought forecasting based on a conjunction of wavelet transforms and neural networks. Journal of Hydrologic Engineering 6, 319–328.

Kisi, O., 2009. Neural networks and wavelet conjunction model for intermittent streamflow forecasting. Journal of Hydrologic Engineering 14 (8), 773–782.

Kisi, O., Cigizoglu, H.K., 2007. Comparison of different ANN techniques in river flow prediction. Civil Engineering and Environmental Systems 24 (3), 211–231.

Kitanidis, P.K., Bras, R.L., 1980. Real-time forecasting with a conceptual hydrologic model: 2. Applications and results. Water Resources Research 16, 1034–1044.

Koutsoyiannis, D., 2007. Discussion of "generalized regression neural networks for evapotranspiration modelling" by O. Kisi (2006). Hydrological Sciences Journal 52 (4), 832–835.


Kucuk, M., Agiralloglu, N., 2006. Wavelet regression technique for stream flow prediction. Journal of Applied Statistics 33 (9), 943–960.

Labat, D., Ababou, R., Mangin, A., 2000. Rainfall–runoff relationships for karstic springs. Part II: continuous wavelet and discrete orthogonal multiresolution analyses. Journal of Hydrology 238, 149–178.

Lall, U., Sharma, A., 1996. A nearest neighbor bootstrap for resampling hydrologic time series. Water Resources Research 32 (3), 679–693.

Leclerca, M., Ouarda, T.B.M.J., 2007. Non-stationary regional flood frequency analysis at ungauged sites. Journal of Hydrology 343, 254–265.

Loague, L., VanderKwaak, J.E., 2004. Physics-based hydrologic response simulation: platinum bridge, 1958 Edsel, or useful tool. Hydrological Processes 18, 2949–2956.

Maier, H.R., Dandy, G.C., 2000. Neural networks for the prediction and forecasting of water resources variables: a review of modelling issues and applications. Environmental Modelling and Software 15, 101–124.

Mallat, S.G., 1989. A theory for multi resolution signal decomposition: the wavelet representation. IEEE Transactions on Pattern Analysis and Machine Intelligence 11 (7), 674–693.

Mukerji, A., Chatterjee, C., Raghuwanshi, N.S., 2009. Flood forecasting using ANN, neuro-fuzzy, and neuro-GA models. Journal of Hydrologic Engineering 14 (6), 647–652.

Muller, B., Reinhardt, J., 1991. Neural Networks – An Introduction. Springer-Verlag, Berlin.

Nash, J.E., Sutcliffe, J.V., 1970. River flow forecasting through conceptual models. I. Journal of Hydrology 10, 282–290.

Nayak, P.C., Sudheer, K.P., Rangan, D.M., Ramasastri, K.S., 2004. A neuro-fuzzy computing technique for modeling hydrological time series. Journal of Hydrology 291 (1–2), 52–66.

Nayak, P.C., Sudheer, K.P., Rangan, D.M., Ramasastri, K.S., 2005. Short-term flood forecasting with a neurofuzzy model. Water Resources Research 41, W04004.

Nourani, V., Alami, M.T., Aminfar, M.H., 2008. A combined neural–wavelet model for prediction of Ligvanchai watershed precipitation. Engineering Applications of Artificial Intelligence 16, 1–12.

Nourani, V., Komasi, M., Mano, A., 2009. A multivariate ANN–wavelet approach for rainfall–runoff modeling. Water Resources Management 23 (14), 2877–2894.

Partal, T., Cigizoglu, H.K., 2008. Estimation and forecasting of daily suspended sediment data using wavelet–neural networks. Journal of Hydrology 358, 317–331.

Partal, T., Cigizoglu, H.K., 2009. Prediction of daily precipitation using wavelet–neural networks. Hydrological Sciences Journal 54 (2), 234–246.

Partal, T., Kucuk, M., 2006. Long-term trend analysis using discrete wavelet components of annual precipitations measurements in Marmara region (Turkey). Physics and Chemistry of the Earth 31, 1189–1200.

Rosso, O.A., Figliola, A., Blanco, S., Jacovkis, P.M., 2004. Signal separation with almost periodic components: a wavelets based method. Revista Mexicana de Fisica 50, 179–186.

Sahoo, G.B., Schladow, S.G., Reuter, J.E., 2009. Forecasting stream water temperature using regression analysis, artificial neural network, and chaotic non-linear dynamic models. Journal of Hydrology 378, 325–342.

Sertel, E., Cigizoglu, H.K., Sanli, D.U., 2008. Estimating daily mean sea level heights using artificial neural networks. Journal of Coastal Research 24 (3), 727–734.

Sharma, A., Tarboton, D.G., Lall, U., 1997. Streamflow simulation: a nonparametric approach. Water Resources Research 33 (3), 291–308.

Solomatine, D.P., Ostfeld, A., 2008. Data-driven modelling: some past experiences and new approaches. Journal of Hydroinformatics 10 (1), 3–22.

Srivastav, R.K., Sudheer, K.P., Chaubey, I., 2007. A simplified approach to quantifying predictive and parametric uncertainty in artificial neural network hydrologic models. Water Resources Research 43, W10407.

Sudheer, K.P., Jain, S.K., 2003. Radial basis function neural network for modeling rating curves. Journal of Hydrological Engineering 8 (3), 161–164.

Sudheer, K.P., Gosain, A.K., Ramasastri, K.S., 2002. A data-driven algorithm for constructing artificial neural network rainfall–runoff models. Hydrological Processes 16, 1325–1330.

Tamea, S., Laio, F., Ridolfi, L., 2005. Probabilistic nonlinear prediction of river flows. Water Resources Research 41, W09421.

Tanrikulu, A.H., 2009. Application of ANN techniques for estimating modal damping of impact-damped flexible beams. Advances in Engineering Software 40 (10), 986–990.

ASCE Task Committee on Application of Artificial Neural Networks in Hydrology, 2000a. Artificial neural networks in hydrology I: preliminary concepts. Journal of Hydrological Engineering, ASCE 5 (2), 115–123.

ASCE Task Committee on Application of Artificial Neural Networks in Hydrology, 2000b. Artificial neural networks in hydrology II: hydrologic applications. Journal of Hydrological Engineering, ASCE 5 (2), 124–137.

Tasker, G.D., Dunne, P., 1997. Bootstrap position analysis for forecasting low flow frequency. Journal of Water Resources Planning and Management 123 (6), 359–367.

Tiwari, M.K., Chatterjee, C., 2010. Uncertainty assessment and ensemble flood forecasting using bootstrap based artificial neural networks (BANNs). Journal of Hydrology 382 (1–4), 20–33.

Toprak, F., Cigizoglu, H.K., 2008. Predicting longitudinal dispersion coefficient in natural streams by artificial neural networks. Hydrological Processes 22 (20), 4106–4129.

Twomey, J.M., Smith, A.E., 1998. Bias and variance of validation methods for function approximation neural networks under conditions of sparse data. IEEE Transactions on Systems, Man, and Cybernetics C: Applications and Reviews 28 (3), 417–430.

Uhlenbrook, S., Seibert, J., Leibundgut, C., Rodhe, A., 1999. Prediction uncertainty of conceptual rainfall–runoff models caused by problems to identify model parameters and structure. Hydrological Sciences Journal 44, 779–798.

Wang, D., Ding, J., 2003. Wavelet network model and its application to the prediction of hydrology. Nature and Science 1, 67–71.

Wu, C.L., Chau, K.W., 2006. A flood forecasting neural network model with genetic algorithm. International Journal of Environment and Pollution 28 (3–4), 261–273.

Wu, C.L., Chau, K.W., Li, Y.S., 2009. Methods to improve neural network performance in daily flows prediction. Journal of Hydrology 372, 80–93.

Xingang, D., Ping, W., Jifan, C., 2003. Multiscale characteristics of the rainy season rainfall and interdecadal decaying of summer monsoon in North China. Chinese Science Bulletin 48, 2730–2734.

Yueqing, X., Shuangcheng, L., Yunlong, C., 2004. Wavelet analysis of rainfall variation in the Hebei Plain. Science in China Series D Earth Science 48, 2241–2250.