Post on 16-Jan-2023
Seediscussions,stats,andauthorprofilesforthispublicationat:https://www.researchgate.net/publication/236657120
Anewwavelet―bootstrap―ANNhybridmodelfordailydischargeforecasting
ArticleinJournalofHydroinformatics·July2011
DOI:10.2166/hydro.2010.142
CITATIONS
36
READS
216
2authors:
MukeshKumarTiwari
CollegeofAgriculturalEngineeringandTechno…
26PUBLICATIONS338CITATIONS
SEEPROFILE
ChandranathChatterjee
IndianInstituteofTechnologyKharagpur
136PUBLICATIONS901CITATIONS
SEEPROFILE
AllcontentfollowingthispagewasuploadedbyChandranathChatterjeeon01December2016.
Theuserhasrequestedenhancementofthedownloadedfile.Allin-textreferencesunderlinedinbluearelinkedtopublicationsonResearchGate,lettingyouaccessandreadthemimmediately.
A new wavelet–bootstrap–ANN hybrid model
for daily discharge forecasting
Mukesh K. Tiwari and Chandranath Chatterjee
ABSTRACT
A new hybrid model, the wavelet–bootstrap–ANN (WBANN), for daily discharge forecasting is
proposed in this study. The study explores the potential of wavelet and bootstrapping techniques
to develop an accurate and reliable ANN model. The performance of the WBANN model is also
compared with three more models: traditional ANN, wavelet-based ANN (WANN) and bootstrap-
based ANN (BANN). Input vectors are decomposed into discrete wavelet components (DWCs)
using discrete wavelet transformation (DWT) and then appropriate DWCs sub-series are used as
inputs to the ANN model to develop the WANN model. The BANN model is an ensemble of several
ANNs built using bootstrap resamples of raw datasets, whereas the WBANN model is an
ensemble of several ANNs built using bootstrap resamples of DWCs instead of raw datasets. The
results showed that the hybrid models WBANN and WANN produced significantly better results
than the traditional ANN and BANN, whereas the BANN model is found to be more reliable and
consistent. The WBANN and WANN models simulated the peak discharges better than the ANN
and BANN models, whereas the overall performance of WBANN, which uses the capabilities of
both bootstrap and wavelet techniques, is found to be more accurate and reliable than the
remaining three models.
Key words 9999 bootstrap, discharge, forecasting, resampling, wavelet
INTRODUCTION
Daily discharge forecasting is desirable for water resource
planning and management. A wide variety of rainfall–runoff
models have been developed and applied for discharge fore-
casting. Most of the rainfall–runoff models have been devel-
oped either based on a mechanistic approach or on a systems
theoretic approach. Hydrological prediction usually relies on
incomplete and uncertain process descriptions that have been
deduced from sparse and noisy datasets. Spatially distributed
modeling is a typical example of the mechanistic approach
to construct a model that explicitly accounts for as much of
the small-scale physics and the natural heterogeneity as
computationally possible (Loague & VanderKwaak 2004).
The approach has been criticized for resulting in models
that are overly complex, leading to problems of over-
parameterization and equifinality (Beven 2006), which may
manifest itself in large prediction uncertainty (Uhlenbrook
et al. 1999), while in the system theoretic approach the
concern is with the system operation, not the nature of the
system by itself or the physical laws governing its operation.
Black box models in the form of artificial neural networks
(ANNs) have gained momentum in the last few decades for
river flow forecasting. The FFBP (Feed Forward Back Pro-
pagation) is the most popular ANN training method in water
resources literature. The major advantage of the FFBP ANN
is that it is less complex than other ANNs such as Radial
Basis Function (RBF) and has the same nonlinear input–
output mapping capability (Sudheer & Jain 2003; Coulibaly &
Evora 2007). Some researchers have successfully applied
fuzzy inference systems in river flow forecasting (Nayak
et al. 2004, 2005a; Mukerji et al. (2009); Pramanik & Panda
Mukesh K. TiwariChandranath Chatterjee (corresponding author)Agricultural and Food Engineering Department,Indian Institute of Technology,Kharagpur 721 302,IndiaE-mail: cchatterjee@agfe.iitkgp.ernet.in
doi: 10.2166/hydro.2010.142
& IWA Publishing 2011500 Journal of Hydroinformatics 9999 9999 201113.3
2009). Jacquin & Shamseldin (2009) reviewed the existing
studies dealing with the use of fuzzy inference systems in river
flow forecasting. This review shows that fuzzy inference
systems can be used as effective tools for river flow forecast-
ing, even though their application is rather limited in com-
parison to the popularity of neural network models. They
found that there are several unresolved issues requiring
further attention before more clear guidelines for the applica-
tion of fuzzy inference systems can be given. Wu & Chau
(2006) observed that the hybrid GA-based ANN algorithm,
under cautious treatment to avoid over-fitting, is able to
produce better accuracy in performance, although at the
expense of additional modeling parameters and longer com-
putation time. ANNs are known to have several dozens of
successful applications in river basin management and related
problems (Solomatine & Ostfeld 2008). The ANNs are still
widely applied and are a very popular tool compared to other
data-driven techniques in river basin management. Therefore
in this study we have used the FFBP ANN model as it has
been increasingly used in rainfall–runoff modeling and river
discharge forecasting due to their ability to map complex
nonlinear rainfall–runoff relationships (Baratti et al. 2003;
Cigizoglu 2003a, b; Jain & Srinivasulu 2004; Altunkaynak
2007; Demirel et al. 2009). Substantial literatures on ANN
have been reported in ASCE (2000a, b).
The reliability of the hydrological predictions is affected
by three major sources of uncertainties (Bates & Townley
1988): data uncertainty (quality and representativeness of
data), model structure uncertainty (ability of the model to
describe the catchment’s response) and parameter uncer-
tainty (adequate values of model parameters). Han et al.
(2007) studied the uncertainties involved in real-time predic-
tion in using an ANN model. It was concluded that, for long-
term predictions, the ANN showed superior performance but
that was only probabilistic, depending on how the calibration
and test events were arranged. Srivastav et al. (2007) pro-
posed a method of uncertainty analysis for ANN hydrological
models and showed that the ANN predictions contain a
significant amount of uncertainty. Boucher et al. (2009)
analyzed one-day-ahead hydrological ensemble forecasts
obtained by stacked neural networks and found that ensem-
ble forecasts outperform point forecasts. In order to over-
come the limitations inherent in the conventional treatment
of uncertainty in ANN model predictions, the recent trend
has been to combine the outputs of several member bootstrap
ANN models to reduce the uncertainty involved by control-
ling the generalization of the final predictive model and to
produce more reliable and consistent predictions (Cannon &
Whitfield 2002; Jeong & Kim 2005).
The bootstrap is a computational procedure that uses
intensive resampling with replacement, in order to reduce
uncertainty (Efron & Tibshirani 1993). In addition, it is the
simplest approach since it does not require complex compu-
tations of derivatives and Hessian matrix inversion involved
in linear methods or the Monte Carlo solutions of the
integrals involved in the Bayesian approach (Dybowski &
Roberts (2001)). Applications of the bootstrap method range
from estimating means, confidence intervals and parameter
uncertainties to network design techniques (Lall & Sharma
1996; Sharma et al. 1997; Tasker & Dunne 1997). The boot-
strap technique has also been used in artificial neural network
model development. Abrahart (2003) employed the bootstrap
technique to continuously sample the input space in the
context of rainfall–runoff modeling and reported that it
offered marginal improvement in terms of greater accuracies
and better global generalizations. Anctil & Lauzon (2004)
recommended the joint use of stop training or Bayesian
regularization with either bagging or boosting for improve-
ment and stability in modeling performance. Jeong & Kim
(2005) used ensemble neural networks (ENN) using the boot-
strap technique to simulate monthly rainfall–runoff. They
concluded that ENN is less sensitive to the input variable
selection and the number of hidden nodes than the single
neural network (SNN). Jia & Culver (2006) used the boot-
strap technique to estimate the generalization errors of neural
networks with different structures and to construct the con-
fidence intervals for synthetic flow prediction with a small
data sample. Sharma & Tiwari (2009) applied the bootstrap
technique for monthly runoff prediction and showed that the
BANN model consistently outperformed the ANN models.
Tiwari & Chatterjee (2010) applied the bootstrap technique
for hourly flood forecasting and showed that the bootstrap
technique is capable of quantifying uncertainty in flood
forecasts and ensemble predictions are more stable and
accurate.
A major criticism of ANN models is their limited ability to
account for any physics of the hydrologic processes in a
catchment (Aksoy et al. 2007; Koutsoyiannis (2007)). Daily
501 M. K. Tiwari & C. Chatterjee 9999 A new wavelet–bootstrap–ANN hybrid model Journal of Hydroinformatics 9999 9999 201113.3
discharge is widely perceived as being nonlinear and nonsta-
tionary (Rao et al. 2003; Wang et al. 2006). Nonstationarity in
the data such as trends and seasonal variations influences the
simulation of daily discharge greatly and often causes poor
predictability in practical applications. The pronounced sea-
sonality is driven by different underlying physical processes of
streamflow generation for different periods. For instance, the
low flows are mainly sustained by base flows, while high
flows are affected by intensive rainfalls.
Wavelet transforms, which provide information in both
time and frequency domains of the signal, give considerable
information about the physical structure of the data, and
wavelet analysis provides a time–frequency representation
of a signal at many different periods in the time domain
(Daubechies 1990). In spite of the suitable flexibility of ANN
in modeling hydrologic time series such as runoff (Hsu et al.
1995; Zhang & Dong 2001) non-stationarity in the signal limits
the use of ANN. In such a situation, ANNs will not produce
good results with nonstationary data if pre-processing of the
input and/or output data is not performed (Cannas et al.
2006). The wavelet transformation technique decomposes the
original time series data into sub-time series of different
resolution levels, and these wavelet-transformed data
improve the performance of a forecasting model by capturing
different information on various resolution levels. The sub-
time series obtained using wavelet decomposition of the
signal of different resolution levels provides an interpretation
of the series structure and extracts the useful information.
This is the reason that this technique is largely applied to time
series analysis of nonstationary signals (Nason & Von Sachs
1999). In the recent past, wavelet transforms have become a
useful method for analyzing variations, periodicities and
trends in time series (Xingang et al. 2003; Yueqing et al.
2004; Partal & Kucuk 2006). Recently, the WANN model
has been successfully employed in some hydrology and water
resource studies. For example, Wang & Ding (2003) proposed
a wavelet network model with a combination of the wavelet
transform and the ANN, and decomposed the original time
series into periodic components by wavelet transform. Later,
sub-time series were used as the inputs for ANNs and the
resulting model was applied to forecast the original time
series. This approach was used for monthly groundwater
level and daily discharge forecasting. Kim & Valdes (2003)
presented a hybrid neural network model combined with
dyadic wavelet transforms to improve the forecasting of
regional droughts. These researchers used the neural network
model in two phases. First, neural networks were employed
to forecast the signals decomposed by wavelet transform at
various resolution levels and then the forecasted decomposed
signals were reconstructed into the original time series. The
researchers applied this model to the monthly and annual
inflow and rainfall data and showed that the model signifi-
cantly improved the neural network’s forecasting perfor-
mance. Anctil & Tape (2004) used a wavelet–neural
network model for one-day-ahead rainfall–runoff forecasting.
The time series was decomposed by wavelet transform into
three sub-series: short, intermediate and long wavelet periods.
Then, multiple-layer ANN forecasting models were trained
for each wavelet-decomposed sub-series, and later forecasted
decomposed signals were reconstructed into the original time
series. Partal & Cigizoglu (2008) predicted the suspended
sediment load in rivers by a combined wavelet–ANN method.
Measured data were decomposed into wavelet components
via DWT, and the new wavelet series, consisting of the sum of
selected wavelet components, was used as input for the ANN
model. The wavelet–ANN model provided a good fit to
observed data for the testing period. Zhou et al. (2008)
developed a wavelet predictor–corrector model for prediction
of monthly discharge time series and showed that the model
has higher prediction accuracy than ARIMA and seasonal
ARIMA. Hence a hybrid ANN–wavelet model which uses
multi-scale signals as input data may present more reliable
forecasting rather than a single pattern input. Adamowski
(2008a, b) developed a new method of standalone short-term
river flood forecasting based on wavelet and cross-wavelet
analysis. He found that the proposed wavelet-based forecast-
ing method can be used with great accuracy as a standalone
forecasting method for short-term river flood forecasting. It
was also shown that forecasting models based on wavelet and
cross-wavelet constituent components for forecasting river
floods were not accurate for longer lead time forecasting with
the artificial neural network models providing more accurate
results. Kisi (2008) investigated the accuracy of the neuro-
wavelet (NW) technique for modeling monthly streamflows.
The NW and ANN models were tested by applying them to
various input combinations of monthly streamflow data from
Gerdelli and Isakoy Stations in the Eastern Black Sea region
of Turkey. The test results indicated that the DWT could
502 M. K. Tiwari & C. Chatterjee 9999 A new wavelet–bootstrap–ANN hybrid model Journal of Hydroinformatics 9999 9999 201113.3
increase the accuracy of ANN models in modeling monthly
streamflows. The comparison results showed that the NW
model performed much better than the ANN, multi-linear
regression (MLR) and autoregressive (AR) models. He
recommended further studies using more data from different
areas to strengthen these conclusions. Kisi (2009) proposed
the application of a conjunction model (neurowavelet) for
forecasting daily intermittent streamflow. The comparison
results revealed that the suggested model could significantly
increase the forecast accuracy of single ANN in forecasting
daily intermittent streamflows. Partal (2009) employed the
wavelet–neural network structure that combines wavelet
transform and artificial neural networks to forecast the river
flows of Turkey. He studied the performance comparison of
generalized neural networks and radial basis neural networks
with feed-forward back-propagation methods. Six different
models were studied for forecasting of monthly river flows. It
was seen that the wavelet and feed-forward back-propagation
model was superior to the other models in terms of selected
performance criteria. Satyajirao & Krishna (2009) proposed a
wavelet–neural network (WNN) model, based on a combina-
tion of wavelet analysis and ANN, in order to improve the
precision of daily streamflow and monthly groundwater level
forecasts where there is a scarcity of other hydrological time
series data. The results of daily streamflow and monthly
groundwater level series modeling indicated that the perfor-
mances of WNN models are more effective than the ANN
models. The study also highlights the capability of WNN
models in estimating low and high values in the hydrological
time series data. Wang & Meng (2007) analyzed character-
istics, abruptions, tendencies and causes of annual runoff
variations in the Heihe River drainage basin by the wave-
let–neural network model, and detected the basic variation
characteristics and tendencies of the Heihe River surface
hydrological system. Wang et al. (2009) developed a new
hybrid model, the wavelet–network model (WNM), based on
reasonable combination of wavelet analysis with ANN. The
results from a given case study showed that the WNM has
some significant advantages for short- and long-term predic-
tion of runoff in hydrology. The WNM was compared with
the threshold auto-regressive model (TAR). The accuracy of
prediction using the WNM was better than the TAR. They
recommended further researches on investigating the method
for reasonable selection of resolution level, which is an
important parameter in WNM. Partal & Cigizoglu (2009)
predict the daily precipitation from meteorological data
from Turkey using the wavelet–neural network method. The
new approach in estimating the peak values showed a notice-
ably high positive effect on the performance evaluation
criteria.
To the best of our knowledge, no study has been reported
in the hydrologic literature that has used the combined
strength of wavelet and bootstrap-based ANNs for hydrolo-
gical modeling. The present study is the first application
where ANN models are coupled with DWT and the bootstrap
technique to utilize the individual strength of each approach.
An attempt is made to forecast river discharge for 1 to 5 days
using the novel approach that uses wavelet and bootstrap-
based ANNs in a large river basin where flood forecasting
and water resources planning are critical issues.
METHODOLOGY
Artificial neural networks
Artificial neural networks are information processing systems
composed of simple processing elements (nodes) linked by
weighted synaptic connections (Muller & Reinhardt 1991).
Neural network models are developed by training the net-
work to represent the relationships and processes that are
inherent within the data. Being essentially nonlinear regres-
sion models, they perform an input–output mapping using a
set of interconnected simple processing nodes or neurons.
They reconstruct the complex nonlinear input/output rela-
tions by combining multiple simple functions, by analogy with
the functioning of the human brain. This approach is fast and
robust in noisy environments, flexible in the range of pro-
blems it can solve, and highly adaptive to newer environ-
ments. Owing to these established advantages, ANNs
currently have numerous real-world applications, such as
time series prediction, rule-based control and rainfall–runoff
modeling (Jain et al. 1999). The multilayer feed-forward
neural network consists of a set of sensory units that con-
stitute the input layer, one or more hidden layers of computa-
tion nodes and an output layer of computation nodes. The
input signal propagates through the network in a forward
direction, layer by layer. These neural networks are commonly
503 M. K. Tiwari & C. Chatterjee 9999 A new wavelet–bootstrap–ANN hybrid model Journal of Hydroinformatics 9999 9999 201113.3
referred to as multilayer perceptrons. A detailed explanation
of different properties of ANN are beyond the scope of this
paper. Interested readers are directed to refer to texts such as
Bishop (1995) and Haykin (1999) for discussion on the general
properties of ANN and Maier & Dandy (2000) for an over-
view of different applications of ANN in water resources.
The wavelet analysis
Wavelet analysis is a multiresolution analysis in the time and
frequency domains and is the important derivative of the
Fourier transform. The wavelet function cðtÞ, called the
mother wavelet, has finite energy and is mathematically
defined asRþN�N cðtÞdt ¼ 0.
ca;bðtÞ can be obtained as
ca;bðtÞ ¼ aj j�12c
t� ba
� �ð1Þ
where a and b are real numbers, ca;bðtÞ ¼ successive wavelet,
a¼ scale or frequency factor and b¼ time factor. Thus, the
wavelet transform is a function of two variables, a and b. The
parameter a can be interpreted as a dilation (a41) or con-
traction (ao1) factor of the wavelet function cðtÞ corre-
sponding to different scales of observation. The parameter b
can be interpreted as a temporal translation or shift of the
function cðtÞ.For the time series fðtÞeL2ðRÞ or finite energy signal
(Rosso et al. 2004) the continuous wavelet transform of the
time series f(t) is defined as
Wfða; bÞ ¼ aj j�12
Z þN�N
fðtÞc� t� ba
� �dt ð2Þ
where Wf (a, b) is the wavelet coefficient and * corresponds
to the complex conjugate.
The wavelet transform searches for correlations between
the signal and wavelet function. This calculation is done at
different scales of a and locally around the time of b. The
result is a wavelet coefficient Wf(a, b) contour map known as
a scalogram. However, computing the wavelet coefficients at
every possible scale (resolution level) causes the generation of
a large amount of data. If one chooses the scales and the
positions based on the powers of two (dyadic scales and
positions), then the analysis will be much more efficient as
well as more accurate. This transform is called the discrete
wavelet transform (DWT) and can be defined as (Mallat 1989)
cm;nt� b
a
� �¼ a�m=2
o c�t� nboam
o
amo
� �ð3Þ
where m and n are integers that control the wavelet dilation
and translation, respectively, a0 is a specified fined dilation
step greater than 1 and b0 is the location parameter which
must be greater than zero. The most common and simplest
choices for parameters are a0¼ 2 and b0¼ 1.
This power-of-two logarithmic scaling of the translations
and dilations is known as a dyadic grid arrangement and is
the simplest and most efficient case for practical purposes
(Mallat 1989). For a discrete time series f(t), when it occurs at
a different time t (i.e. here integer time steps are used), the
discrete wavelet transform becomes
Wfðm;nÞ ¼ 2�m=2XN�1
t¼0
fðtÞc�ð2�mi� nÞ ð4Þ
where Wf(m, n) is the wavelet coefficient for the discrete
wavelet of scale a¼ 2m and location b¼ 2mn. f(t) is a finite
time series (t¼ 0, 1, 2,y, N�1), N is an integer power of 2
(N¼ 2M) and n is the time translation parameter, which
changes in the range 0ono2M�m�1, where 1omoM. At the
largest wavelet scale (i.e. 2m where m¼M) only one wavelet is
required to cover the time interval and only one coefficient is
produced. At the next scale (2m�1), two wavelets cover the time
interval, hence two coefficients are produced, and so on down
to m¼ 1. At m¼ 1, the a scale is 21, i.e. 2M�1 or N/2 coefficients
are required to describe the signal at this scale. The total
number of wavelet coefficients for a discrete time series of
length N¼ 2M is then 1þ 2þ 4þ 8 þ y þ 2M�1 ¼ N�1.
In addition to this, a signal smoothed component, W , is
left, which is the signal mean. Thus, a time series of length N
is broken into N components, i.e. with zero redundancy. The
inverse discrete transform is given by
fðtÞ ¼W þXMm¼1
X2M�m�1
n¼0
Wfðm;nÞ2�m2c�ð2�mi� nÞ ð5Þ
or in a simple format as
fðtÞ ¼WðtÞ þXMm¼1
WmðtÞ ð6Þ
504 M. K. Tiwari & C. Chatterjee 9999 A new wavelet–bootstrap–ANN hybrid model Journal of Hydroinformatics 9999 9999 201113.3
where WðtÞ is called the approximation sub-signal at level M
and Wm(t) are detailed subsignals at levels m¼ 1, 2, y,M.
DWT operates two sets of functions viewed as high-pass
and low-pass filters (Figure 1). The original time series is
passed through high-pass and low-pass filters and separated
at different scales. The time series is decomposed into one
comprising low frequencies or its trend (the approximation)
and one comprising the high frequencies or the fast events
(the detail). The wavelet coefficients, Wm(t) (m ¼ 1, 2, y, M),
provide the detail signals, which can capture small features of
interpretational value in the data; the residual term, WðtÞ,represents the background information (approximation) of
data. Because of the simplicity of W1(t), W2(t), y, WM(t),
WðtÞ, some interesting characteristics, such as period, hidden
period, dependence and jump can be diagnosed easily
through these discrete wavelet components (DWCs).
Bootstrapped artificial neural networks (BANNs)
The bootstrap is a data-driven simulation method that uses
intensive resampling, with replacement, to reduce uncertain-
ties (Efron 1979; Efron & Tibishirani, 1993). The technique is
based on resampling with replacement of the available data-
set and training an individual network on each resampled
instance of the original dataset. The bootstrap method can be
used to expand upon a single realization of a distribution or
process to create a set of bootstrap samples that can provide a
better understanding of the average and variability of the
original unknown distribution or process.
Assume that the data consists of a random sample
Tn¼ {(x1, y1), (x2, y2), y., (xn, yn)} of size n drawn from a
population of unknown probability distribution F, where ti (xi,
yi) is a realization drawn independently and identically dis-
tributed (i.i.d.) from F and consists of a predictor vector xi and
the corresponding output variable yi. Let F be the empirical
distribution function for Tn with mass 1/n on t1, t2, y, tn; and
let T�
be a random sample of size n taken from i.i.d. with
replacement from F. The set of B bootstrap samples can be
represented as T1, T2,y, Tb,y, TB, in which B is the total
number of bootstrap samples and ranges usually from 50–200
(Efron 1979). For each Tb, an ANN prediction model is
constructed and the output is represented as
fANNðxi;wb=TbÞ, built using all n observations. The perfor-
mance of the trained ANN model is evaluated using the
observation pairs that are not included in a bootstrap sample
and the average performance of these ANNs on their corre-
sponding validation sets is used as an estimate of the general-
ization error of the ANN model developed on Tn. The
generalization error of an ANN model can be estimated by
its ‘E0’ estimate (Twomey & Smith 1998):
E0 ¼
PBb¼1
PieAbðyi � fANNðxi;wb=TbÞÞ2
PBb¼1ðAbÞ
ð7Þ
The output of the ANN developed based on the bootstrap
sample Tb is represented as fANN(x, wb/Tb), where x is a
particular input vector, wb is the weight vector, Ab is the set of
indices for the observation pairs not included in the bootstrap
sample Tb and #(Ab) is the number of observation pair indices
in Ab. The BANN estimate y xð Þ is given by the average of the
B bootstrapped estimates:
y xð Þ ¼ 1B
XBb¼1
fANN x;wb=Tb
� �ð8Þ
Multiple linear regression (MLR)
Multiple linear regression attempts to model the relationship
between two or more independent variables and a dependent
variable by fitting a linear equation to the data points. A
multiple linear regression equation takes the following form:
y ¼ aþ b1�1 þ b2�2 þyþ bnxn ð9Þ
where y is the dependent variable, a is a constant and b1 to bn
are multipliers for x1 to xn independent variables. Constant
and multipliers are estimated through minimizing the sums of
Time series
High-pass filter
Low-pass filter
Details
Approximation
Figure 1 9999 DWT decomposition of a time series.
505 M. K. Tiwari & C. Chatterjee 9999 A new wavelet–bootstrap–ANN hybrid model Journal of Hydroinformatics 9999 9999 201113.3
squares of deviations between each data point and the
regression line. MLR has been the traditional approach
utilized in water resources hydrology for several decades
since the last century. Some recent applications appear in
Leclerca & Ouarda (2007) and Sahoo et al. (2009).
Performance indices
Dawson et al. (2007) discussed 20 performance measures
generally used in hydrological forecasting. In this study, we
have selected four performance measures to evaluate model
performance. The Nash–Sutcliffe efficiency (E), root mean
square error (RMSE), mean absolute error (MAE) and per-
sistence index (PERS) performance measures are used to
evaluate the accuracy of the developed models. The Nash–
Sutcliffe efficiency (E), introduced by Nash & Sutcliffe (1970),
is still one of the most widely used criteria for the assessment
of model performance. E provides a measure of the ability of
a model to predict values that are different from the mean. It
records as a ratio the level of overall agreement between the
observed and modelled datasets and, as it is sensitive to
differences in the observed and modelled means and var-
iances, it has been strongly recommended (e.g. NERC 1975;
ASCE 1993). RMSE and MAE provide different types of
information about the predictive capabilities of the model.
The RMSE measures the goodness-of-fit relevant to high flow
values whereas the MAE is not weighted towards high(er)
magnitude or low(er) magnitude events, but instead evaluates
all deviations from the observed values, in both an equal
manner and regardless of sign. PERS is the substitution of the
last known value as the current prediction and represents a
good benchmark against which other predictions can be
measured (Cannas et al. 2006; Kitanidis & Bras 1980).
(i) Nash–Sutcliffe coefficient (E). It is expressed as
E ¼ 1�
Pni¼1ðOi � PiÞ2
Pni¼1ðOi �OiÞ2
ð10Þ
where Oi and Pi are the observed and predicted flow, Oi is the
mean of the observed flow and n is the number of data points.
The value of the Nash–Sutcliffe coefficient varies between
�N to 1. The closer the value to 1, the better is the model
performance.
(ii) Root mean square error (RMSE). It is expressed as
RMSE ¼
ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi1n
Xn
i¼1
ðOi � PiÞ2vuut ð11Þ
(iii) Mean absolute error (MAE). It is expressed as
MAE ¼ 1n
Xn
i¼1
Oi � Pij j ð12Þ
(iv) Persistence index (PERS). It is expressed as
PERS ¼ 1� SSESSEnaive
ð13Þ
where
SSE ¼Xn
i¼1
ðOi � PiÞ2 ð14Þ
and
SSEnaive ¼Xn
i¼1
ðOi �Oi�LÞ2 ð15Þ
in which the SSE terms are the sum of square errors, Oi�L is
the observed discharge at time i minus the lead time L.
Persistence consists of a comparison between the model
under study and the naive model where a value of PERS
smaller than or equal to 0 indicates that the model under
study performs worse or no better than the easy-to-implement
naive model. A PERS value of 1 is obtained when the model
under study provides exact estimates of observed discharge.
STUDY AREA AND DATA USED
The Mahanadi river basin, which is the fourth largest river
basin in India, was selected for this study. The Mahanadi
River flows to the Bay of Bengal in east-central India draining
an area of 141 589 km2 and has a length of 851 km. It lies
between east longitudes 801300 to 861500 and north latitudes
191210 to 231350. About 53% of the basin is in the state of
Chhatisgarh, 46% is in the coastal state of Orissa and the
remainder of the basin is in the states of Jharkhand and
Maharastra. Numerous dams, irrigation projects and barrages
506 M. K. Tiwari & C. Chatterjee 9999 A new wavelet–bootstrap–ANN hybrid model Journal of Hydroinformatics 9999 9999 201113.3
are present in the Mahanadi river basin, the most prominent
of which is the Hirakud reservoir. The middle reaches of the
lower Mahanadi river basin, located in Orissa between 821E
191N and 861E 221N and encompassing a geographical area
of 47 558.6 km2, forms the study area (Figure 2). The main
river reach extends from Hirakud Dam to Naraj, having a
total length of 358.4 km. The main soil types found in the
study area are red and yellow soils. The normal annual
rainfall is 1458 mm and temperature in this region varies
from 14 1C to 40 1C. The average monthly pan evaporation of
the area varies from 2.4 mm to 14.6 mm. Most of the rainfall
and river flow occur during the monsoon season, between
June and September. In the delta region of the Mahanadi
river basin almost every year flooding is a serious problem
during monsoon seasons. Naraj, which is situated at the
mouth of the delta, was selected for daily discharge forecasting.
The data used for this study consists of daily discharge
data of seven upstream gauging stations during the period
2000–2006. Some of the statistical properties of the discharge
data are presented in Table 1. The locations of different
gauging stations are shown in Figure 2. Seven years of daily
discharge data of seven discharge gauging stations for
the monsoon period (17 July to 25 September) from years
2000–2006 (497 sets) are divided into three datasets. Daily
discharge of years 2000–2004 (355 patterns) are taken for
training, 2005 (71 patterns) for cross-validation and 2006 (71
patterns) for testing. The training dataset serves for model
training and the testing dataset is used to evaluate the
performances of the models. The cross-validation set helps
to implement an early stopping approach in order to avoid
overfitting of the training data.
MODEL DEVELOPMENT
One of the most important steps in the ANN hydrologic
model development process is the determination of signifi-
cant input variables. The current study used a statistical
approach suggested by Sudheer et al. (2002) to identify the
appropriate input vectors. The method is based on the heur-
istic that the potential influencing variables corresponding to
different time lags can be identified through statistical analy-
sis of the data series that uses cross-correlation (CCF), auto-
correlation (ACF) and partial autocorrelation functions
(PACF) between the variables. This process is applied to
select significant inputs from the discharge data of seven
Figure 2 9999 Index map of the middle reaches of Mahanadi river basin showing location of different gauging stations.
507 M. K. Tiwari & C. Chatterjee 9999 A new wavelet–bootstrap–ANN hybrid model Journal of Hydroinformatics 9999 9999 201113.3
discharge gauging stations for daily river flow forecasting. The
total input vectors identified are presented in Table 2.
The identification of the optimal network geometry is one
of the major tasks in developing an ANN model. While the
number of input output nodes is problem-dependent, there is
no direct and precise way of determining the optimal number
of hidden nodes. The model architecture of an ANN is
generally selected through a trial-and-error procedure as
currently there is no established methodology for selecting
the appropriate network architecture prior to training. The
optimum numbers of hidden neurons are calculated using the
generalization error. The generalization errors (Equation (7))
of various ANN structures are representative of the ANN
performance (Jia & Culver 2006). The ANN structures are
tested for 1–6 hidden neurons and the structure with 4 hidden
neurons for which the generalization error is lowest is chosen
as the optimal structure.
Four ANN models developed in this study are traditional
ANN, wavelet-based ANN (WANN), bootstrap-based ANN
(BANN) and bootstrap–wavelet ANN conjunction model
(BWANN). At first, a multi-layer perceptron (MLP) feed-
forward ANN model is developed. The ANN models are
Table 1 9999 Statistics of the datasets for daily river flow forecasting
Year Statistics Khairmal Hirakud Release Salebhata Kesinga Kantamal Tikarpara Naraj
2000 Mean (m3/s) 1792.9 784.9 25.0 359.8 569.5 2034.8 2296.9
Standard deviation (m3/s) 897.7 701.6 19.2 367.0 645.9 834.5 641.6
Maximum (m3/s) 5720.0 4601.3 101.2 2452.0 3661.0 4774.8 5050.0
Minimum (m3/s) 509.7 132.9 4.3 113.5 105.0 600.0 1090.0
2001 Mean (m3/s) 6983.8 4057.8 269.2 1049.5 1627.8 6035.1 9932.2
Standard deviation (m3/s) 6383.9 4401.3 431.6 940.6 1187.3 5568.9 9031.4
Maximum (m3/s) 30,468.9 21,567.3 3225.0 5293.0 6500.0 26,700.0 36,714.7
Minimum (m3/s) 894.8 474.5 8.6 116.5 336.7 1369.6 1842.1
2002 Mean (m3/s) 2230.1 1187.6 141.1 253.0 470.0 2206.6 3111.9
Standard deviation (m3/s) 2801.1 1964.7 243.9 327.1 579.5 2412.0 3583.7
Maximum (m3/s) 14,724.8 10,329.4 1545.4 1990.9 2900.0 12,305.6 16,630.3
Minimum (m3/s) 305.8 43.2 1.8 10.0 20.2 436.8 215.1
2003 Mean (m3/s) 9038.2 4887.2 664.5 1256.2 1645.7 7533.6 11,701.5
Standard deviation (m3/s) 8322.0 5306.2 1154.0 1327.7 2061.4 6445.3 10,014.6
Maximum (m3/s) 34,150.1 24,267.2 7916.0 8908.4 12,915.1 25,062.0 35,253.2
Minimum (m3/s) 1257.3 471.8 4.5 308.6 385.2 794.4 2620.3
2004 Mean (m3/s) 4581.7 2613.9 190.5 725.0 946.6 4208.4 5513.7
Standard deviation (m3/s) 4620.5 3240.1 236.2 633.9 687.2 3844.7 5452.1
Maximum (m3/s) 20,331.5 14,952.0 1004.0 3240.6 4014.0 17,744.4 22,465.4
Minimum (m3/s) 1121.3 538.5 13.6 182.6 345.5 1139.1 1320.7
2005 Mean (m3/s) 5570.8 3323.0 150.8 665.1 1001.2 4950.3 8229.0
Standard deviation (m3/s) 5565.3 3712.7 284.9 1210.2 1783.1 4141.0 7006.9
Maximum (m3/s) 26,249.7 13,196.6 1924.0 8120.8 11,030.1 19,000.0 24,399.1
Minimum (m3/s) 849.5 393.9 1.0 113.4 131.8 1130.0 1274.9
2006 Mean (m3/s) 7608.3 3343.9 573.8 1274.3 2138.1 7127.6 11,979.8
Standard deviation (m3/s) 5745.6 3036.7 836.8 1307.5 2985.1 5449.5 8737.6
Maximum (m3/s) 27,099.2 11,870.3 4681.4 6483.1 15,276.2 29,000.0 33,979.2
Minimum (m3/s) 1659.4 240.7 9.3 167.9 305.6 923.8 696.7
508 M. K. Tiwari & C. Chatterjee 9999 A new wavelet–bootstrap–ANN hybrid model Journal of Hydroinformatics 9999 9999 201113.3
developed using the most significant inputs which are first
log-transformed and then linearly scaled to the range (0, 1)
for ANN modeling (Campolo et al. 1999). The computational
efficiency of the training process is an important considera-
tion for ANN modeling. A computationally efficient second-
order training method, the Levenberg–Marquardt method, is
used to minimize the mean squared error between the fore-
cast and observed river flows. In the next step, the DWC data
are used as inputs to the ANN model to develop the WANN
model. The DWT decomposes an original discharge time
series into many components (i.e. DWCs) at different scales
(or frequencies). Each component plays a distinct role in the
original flow series. The low-frequency component generally
reflects the identity (periodicity and trend) of the signal
whereas the high-frequency component uncovers details
(Kucuk & Agiralloglu 2006). The wavelet function is derived
from the family of Daubechies wavelets (Nourani et al. 2009;
Wu et al. 2009). To choose the number of decomposition level
or DWCs the following formula is used to determine the
decomposition levels (Aussem et al. 1998; Nourani et al.
2008):
L ¼ int ½logðNÞ� ð16Þ
where L and N are decomposition level and number of time
series data, respectively. This study uses N¼ 497, which
produces L¼ 3 approximately. The WANN models are devel-
oped employing subseries DWCs obtained using DWT. For
this purpose, firstly, the original time series is decomposed
into three levels of DWCs by DWT. Three levels of decom-
position (i.e., D1, D2 and D3) and approximation (A3) for
discharge data of Khairmal and Hirakud releases are shown
in Figure 3. The effective DWCs are determined using the
correlation coefficients between each wavelet components
and the observed discharge at Naraj. Correlation between the
periodic component and the original discharge data reveals
the effectiveness of the component. Table 3 shows that the
significant periodic components are A3 and D3 for Khairmal,
Hirakud release data and Kantamal; A3, D2 and D3 for
Tikarpara; A3, D1, D2 and D3 for Naraj; and only approx-
imation A3 for Salebhata and Kesinga. These constitute the
new wavelet discharge series. The significant wavelet compo-
nents of a particular gauging station are added and employed
to constitute the new inputs of the WANN for daily discharge
forecasting. The newly constructed time series is used as
WANN model inputs. The BANN model is developed as an
ensemble of several ANNs built using bootstrap resamples of
raw datasets, whereas the WBANN model is developed as an
ensemble of several ANNs built using bootstrap resamples of
DWCs instead of raw datasets. In this way the WBANN
model uses the capabilities of both bootstrap resampling and
wavelet transformation technique. Similar to the BANN
model, the WBANN model is also developed using 200
resamples to maintain consistency. Bootstrap.xla, an Excel
add-in (Barreto & Howland 2006), is used to generate boot-
strap resamples of raw datasets and DWCs for the BANN and
WBANN model development, respectively. A simple flow-
chart depicting the development of ANN, BANN, WANN
and WBANN models is shown in Figure 4. The number of
inputs for all four models is taken as the same to maintain
consistency. Similar to ANN the WANN, BANN and
WBANN model structures are tested for 1–6 hidden neurons
and all three models with 4 hidden neurons for which the
generalization errors are minimum is chosen as the optimal
structure. Therefore, for all four models the number of hidden
neurons is 4.
DAILY DISCHARGE FORECASTING
Figure 5–7 show the hydrograph and scatter plot of observed
and predicted discharges using WBANN for 1-, 3-, and 5-day
lead time forecasts for the testing period. The figures show
that WBANN forecasts are in good agreement with the
observed data. For 1-day lead time forecasts the observed
and predicted values are in good agreement as the predicted
Table 2 9999 Most significant input variables selected using cross-correlation statistics (CCF,
ACF, PACF)
Stations Lags (d)
Naraj Qt�1, Qt�2
Tikarpara Qt�1, Qt�2
Khairmal Qt�1, Qt�2, Qt�3
Kantamal Qt�2, Qt�3
Hirakud Release Qt�1, Qt�2, Qt�3
Salebhata Qt�1, Qt�2, Qt�3
Kesinga Qt�2, Qt�3
509+ M. K. Tiwari & C. Chatterjee 9999 A new wavelet–bootstrap–ANN hybrid model Journal of Hydroinformatics 9999 9999 201113.3
0
8000
16000
24000
32000
Dis
char
ge
(m/s
)
Time (day)
Original
0
4000
8000
12000
16000
20000
Dis
char
ge
(m/s
)
Time (day)
A3
−16000
−8000
0
8000
16000
Dis
char
ge
(m/s
)
Time (day)
D1
−12000
−8000
−4000
0
4000
8000
12000
Dis
char
ge
(m/s
)
Time (day)
D2
−4000
0
4000
Dis
char
ge
(m/s
)
Time (day)
D3
(a) (b)
Khairmal
0
4000
8000
12000
16000
Dis
char
ge
(m/s
)Time (days)
Original
−4000
0
4000
8000
Dis
char
ge
(m/s
)
Time (days)
A3
−2000
0
2000
Dis
char
ge
(m/s
)
Time (days)
D1
−8000
−4000
0
4000
8000
Dis
char
ge
(m/s
)
Time (days)
D2
−8000
−4000
0
4000
8000
Dis
char
ge
(m/s
)
Time (days)
D3
Hirakud release data Figure 3 9999 The discrete wavelet components of the discharge series of (a) Khairmal and (b) Hirakud release data for the year 2006.
510 M. K. Tiwari & C. Chatterjee 9999 A new wavelet–bootstrap–ANN hybrid model Journal of Hydroinformatics 9999 9999 201113.3
values show the general behavior of the observed discharge.
For longer lead times, model predictions gradually deterio-
rate. This is because, for longer lead time, the river flow values
contain less information than for the shorter lead times for
modeling discharge values at the longer time horizon. The
results of the WBANN model in terms of different perfor-
mance indices for the testing period is shown in Table 4. The
RMSE statistic is a measure of residual variance and is
indicative of the model’s ability to predict high flows. Con-
sidering a high value of 33 979.2 m3/s and a very low value of
696.7 m3/s at the Naraj gauging station, the WBANN model
with RMSE value of 4981.3 m3/s performed satisfactorily up
to 5-day lead forecasts.
Table 5 shows the performance of ANN, WANN and
BANN models in terms of E, RMSE and MAE. It is clear from
the Tables 4 and 5 that for 1–5-day lead time forecasts the
WBANN and WANN models perform better than the tradi-
tional ANN and BANN. The time of concentration of the
basin is about 3 day. Therefore, the performance of the model
is not very good for longer lead time forecasts (i.e. 4- and
5-day lead). Performance of the WANN model for 4-day lead
time and performance of the WBANN model for 4- and 5-day
lead times are still significantly better than ANN and BANN,
as these models simulate the antecedent information for lead
time equal to the time of concentration by utilizing the
capability of wavelet and bootstrap techniques. Figures 8–10
show the hydrographs and scatter plots of observed and
predicted discharges using ANN, WANN, BANN and
WBANN for 1-, 3-, and 5-day lead time forecasts for the
testing period. It is obvious from the figures that for longer
lead time (41 day) the traditional ANN and BANN models
underestimate the observed data, whereas the WANN and
WBANN models are consistent and do not significantly
underestimate or overestimate the observed values. It is
observed from the figures that for 1-, 3- and 5-day lead time
forecasts the WANN and WBANN models forecast higher
discharge values better than the ANN and BANN models.
Moreover, performance of the WANN and WBANN models
in predicting peak values for 1- and 3-day lead time forecasts
is almost similar, even though the 5-day lead time forecast
Table 3 9999 Correlation coefficients between the discrete wavelet components and the observed discharge data at Naraj
Discrete wavelet components
Gauge stations
Khairmal Hirakud Release Salebhata Kesinga Kantamal Tikarpara Naraj
A3 0.91 0.87 0.75 0.60 0.66 0.92 0.93
D1 �0.04 �0.04 �0.01 �0.04 �0.02 0.01 0.14
D2 0.06 0.03 0.00 �0.01 0.00 0.13 0.19
D3 0.21 0.13 0.07 0.09 0.11 0.23 0.27
Original 0.88 0.81 0.52 0.39 0.50 0.93 1.00
Daily discharge time series
Significant input
WANN development
DWT
Bootstrapresampling
Bootstrapresampling
WBANNdevelopment
BANN development
ANNdevelopment
DWCs
Figure 4 9999 Flowchart showing the development of ANN, BANN, WANN and WBANN models.
511 M. K. Tiwari & C. Chatterjee 9999 A new wavelet–bootstrap–ANN hybrid model Journal of Hydroinformatics 9999 9999 201113.3
performance of the WANN model is significantly better
compared to the WBANN model. The reason may be that
the WBANN model is the ensemble of several (in this study
200) ANNs built using resamples of DWCs. There may be a
chance that the high discharge values, which are very less, are
not included in any of the 200 bootstrap resamples for 5-day
lead time forecasts. The WANN model developed with newly
constructed time series using DWT is found to be better
compared to the simple ANN and BANN models for daily
discharge forecasting. This shows the capabilities of DWT in
0
10000
20000
30000
40000
Dis
char
ge
(m/s
)
Time (day)
ObservedWBANN
(a)
0
10000
20000
30000
40000
400003000020000100000
Pre
dic
ted
dis
char
ge
(m/s
)
Observed discharge (m /s)
−
WBANN
1:1 Line
(b)
Figure 5 9999 Hydrograph (a) and scatter plot (b) of observed and predicted discharge of
testing dataset for Naraj using the WBANN model for 1-day lead time.
0
10000
20000
30000
40000
Dis
char
ge
(m/s
)
Time (day)
ObservedWBANN
(a)
(b)
0
10000
20000
30000
40000
0 40000300002000010000
Pre
dic
ted
dis
char
ge
(m/s
)
Observed discharge (m /s)
WBANN
1:1 Line
Figure 6 9999 Hydrograph (a) and scatter plot (b) of observed and predicted discharge of
testing dataset for Naraj using the WBANN model for 3-day lead time.
0
10000
20000
30000
40000
Dis
char
ge
(m/s
)
Time (day)
ObservedWBANN
(a)
0
10000
20000
30000
40000
0 40000300002000010000
Pre
dic
ted
dis
char
ge
(m/s
)
Observed discharge (m /s)
WBANN
1:1 Line
(b)
Figure 7 9999 Hydrograph (a) and scatter plot (b) of observed and predicted discharge of
testing dataset for Naraj using the WBANN model for 5-day lead time.
Table 4 9999 Performance indices for 1–5-day lead time forecasts for the WBANN model
WBANN
Lead time (d) E (%) RMSE (m3/s) MAE (m3/s)
1 93.5 2247.9 1628.5
2 89.0 2921.6 2312.4
3 85.4 3367.2 2569.8
4 69.8 4850.9 3747.5
5 68.2 4981.3 3832.1
512 M. K. Tiwari & C. Chatterjee 9999 A new wavelet–bootstrap–ANN hybrid model Journal of Hydroinformatics 9999 9999 201113.3
daily discharge forecasting. WANN forecasts are found to be
more accurate due to the fact that the features (such as
periodicity) of the subseries are obvious (Ning & Yunping
1998). It is to be clarified that, even though WANN for all the
lead times and ANN for 3- and 5-day lead times perform
better than BANN, the BANN and WBANN models are
more reliable and consistent. This is because the ANN
model is uncertain and is not reproducible. Even a change
in the arrangement of training, cross-validation and testing
datasets lead to a different performance of ANN and WANN
models and it cannot be predicted. On the other hand, the
BANN and WBANN models are consistent and more reliable
even if the arrangement of datasets for training, cross-
validation and testing changes. This is the reason why
ensemble forecasting, by averaging all the 200 models, are
taken instead of selecting a best model. The BANN model is
the aggregation of several (200 in this study) ANN models,
each developed on resampled original time series. As the
BANN model is developed on different realizations of the
actual dataset using the bootstrap method, the ensemble pre-
dictions of these competing models lead to comprehensive,
unbiased and realistic predictions. For longer lead time, the
performance of the WBANN model is significantly better
than the ANN, BANN and WANN models. Overall perfor-
mance of the WBANN model is found to be superior due to
the reason that it is developed by generating 200 bootstrap
resamples of DWCs instead of a raw dataset as for BANN. In
this way the WBANN model uses the capabilities of both
bootstrap and wavelet techniques by taking inputs which are
generated by resampling the newly constructed time series
using DWCs.
In order to assess the performance of models to predict
discharge values for different lead time forecasts for different
magnitude flows, a ‘‘partitioning analysis’’ (Jain & Srinivasulu
2004) is carried out by dividing the total discharge values into
low-, medium- and high- magnitude discharges. Table 6
presents the partitioning of discharge values for testing per-
iods based on the relative spread of the discharge from the
mean. It has been reported that the coefficient of efficiency
can be high (80 or 90%) even for poor models, and the best
models do not produce values which, on first examination,
are impressively higher (Legates & McCabe 1999;
Table 5 9999 Performance indices for 1–5-day lead time forecasts for ANN, WANN and BANN
models
Lead time (d) E (%) RMSE (m3/s) MAE (m3/s)
ANN
1 89.4 2873.8 1824.6
2 73.9 4510 2861.9
3 60.9 5517.9 3513.8
4 39.2 6883.4 4784.3
5 25.7 7607.3 5429.7
WANN
1 92.5 2421.3 1678.8
2 81.7 3771 2604.4
3 72.1 4665.4 3207.1
4 66.8 5082.5 3705.5
5 39.8 6847 5293.8
BANN
1 91.2 2619.8 1714.4
2 72.3 4647.4 2943.2
3 57.5 5753.5 3515.8
4 38.0 6951.6 4662.7
5 22.1 7791.9 5481.6
0
10000
20000
30000
40000
Dis
char
ge
(m/s
)
Time (day)
ObservedANNWANNBANNWBANN
(a)
0
10000
20000
30000
40000
0 40000300002000010000
Pre
dic
ted
dis
char
ge
(m/s
)
Observed discharge (m /s)
ANN
WANN
BANN
WBANN
1:1 Line
(b)
Figure 8 9999 Hydrograph (a) and scatter plot (b) of observed and predicted discharge of
testing dataset for Naraj using the ANN, WANN, BANN and WBANN models for
1-day lead time.
513 M. K. Tiwari & C. Chatterjee 9999 A new wavelet–bootstrap–ANN hybrid model Journal of Hydroinformatics 9999 9999 201113.3
Krause et al. 2005). The RMSE statistic indicates only the
model’s ability to predict a value away from the mean (Hsu
et al. 1995). Therefore, it is important to test the model using
some other performance evaluation criteria such as threshold
statistics (Nayak et al. 2005 b). The threshold statistics (TS)
not only give the performance index in terms of predicting
discharges but also the distribution of the prediction errors.
The threshold statistic for a level of x% is a measure of
consistency in forecasting errors from a particular model. The
threshold statistics is represented as TSx and expressed as
a percentage. This criterion can be expressed for different
levels of absolute relative error (ARE) from the model. ARE is
given as
AREi ¼Oi � Pi
Oi
��������� 100 ð17Þ
TSx is computed for x% level as
TSx ¼Yx
n� 100 ð18Þ
where Yx is the number of computed discharges (out of n total
computed) for which the absolute relative error is less than
x% from the model.
Table 7 shows the distribution of forecast error predicted
from different models across different error thresholds for
1–5-day lead time forecasts for low, medium and high discharge
profiles. Even though there are only two high discharge
values, which constitute only 2.8% of the total testing dataset
(Table 6), it is observed that the performance of the WBANN
and WANN models for high flow forecasts is better than for
ANN and BANN. It may be due to the reason that wavelets
are reducing the noise in the discharge time series, making it
easier to predict the discharge whereas bootstrapping is
reducing the variance. Noise in high discharge may be due
0
10000
20000
30000
40000
Dis
char
ge
(m/s
)
Time (day)
ObservedANNWANNBANNWBANN
(a)
0
10000
20000
30000
40000
0 40000300002000010000
Pre
dic
ted
dis
char
ge
(m/s
)
Observed discharge (m /s)
ANN
WANN
BANN
WBANN
1:1 Line
(b)
Figure 9 9999 Hydrograph (a) and scatter plot (b) of observed and predicted discharge of
testing dataset for Naraj using the ANN, WANN, BANN and WBANN models for
3-day lead time.
–10000
0
10000
20000
30000
40000
Dis
char
ge
(m/s
)
Time (day)
ObservedANNWANNBANNWBANN
(a)
0
10000
20000
30000
40000
0 40000300002000010000
Pre
dic
ted
dis
char
ge
(m/s
)
Observed discharge (m /s)
ANN
WANN
BANN
WBANN
1:1 Line
(b)
Figure 10 9999 Hydrograph (a) and scatter plot (b) of observed and predicted discharge of
testing dataset for Naraj using the ANN, WANN, BANN and WBANN models for
5-day lead time.
Table 6 9999 Number of data points in low-, medium- and high-discharge categories
Category Number of pointsPercentage of thetotal data
Low (xom) 38 53.5
Medium (mrxrmþ 2s) 31 43.7
High (x4mþ 2s) 2 2.8
Total 71 100
m is the mean and s is the standard deviation.
514 M. K. Tiwari & C. Chatterjee 9999 A new wavelet–bootstrap–ANN hybrid model Journal of Hydroinformatics 9999 9999 201113.3
to the reason that the discharge values are computed using
the rating curves; however, since rating curves are developed
with limited stage-discharge measurements and since mea-
surements of high flows are rare, there could be significant
errors in rating curves at high levels (Sahoo & Ray 2006). This
error or noise in high discharge is well approximated using
discrete wavelet decomposition. Performance of the
WBANN model for medium flow forecasts is significantly
better compared to the remaining three models as the
WBANN model can forecast 93.5%, 80.6%, 74.2%, 67.5%
and 64.5% of discharge values within 25% relative error for
1–5-day lead time forecast, respectively, whereas the WANN
model can forecast 77.4%, 64.5%, 71.0%, 58.1% and 35.5% of
discharge values within 25% relative error. All four models
perform more or less similarly during low discharge. It is
observed that medium discharges are simulated more satis-
factorily compared to low and high discharges profiles. The
reason may be that the low discharge data are influenced by
the unsaturated condition of the basin and the high discharge
values do not have proper representation in training the
models, whereas medium discharge values are not influenced
much by the unsaturated condition of the basin and have
proper representation in training datasets.
In order to compare ANNs with an empirical model,
we also developed a multiple linear regression (MLR)
model for daily discharge forecasting. The discharge at
Table 7 9999 Threshold statistics for all four models for testing period
Low flow Medium flow High flow
TS(%)
1-dlead
2-dlead
3-dlead
4-dlead
5-dlead
1-dlead
2-dlead
3-dlead
4-dlead
5-dlead
1-dlead
2-dlead
3-dlead
4-dlead
5-dlead
WBANN
5 18.4 5.3 0.0 2.6 5.3 29.0 19.4 19.4 12.9 12.9 0.0 0.0 0.0 0.0 0.0
10 26.3 7.9 0.0 2.6 10.5 51.6 38.7 35.5 29.0 35.5 100.0 50.0 0.0 0.0 0.0
20 50.0 28.4 21.1 13.2 13.2 77.4 71.0 61.3 41.9 54.8 100.0 100.0 50.0 50.0 50.0
25 65.8 41.1 26.3 22.4 18.4 93.5 80.6 74.2 67.5 64.5 100.0 100.0 100.0 50.0 50.0
50 86.8 71.1 68.4 47.4 39.5 100.0 100.0 100.0 93.5 90.3 100.0 100.0 100.0 100.0 100.0
BANN
5 10.5 13.2 5.3 7.9 2.6 22.6 16.1 6.5 6.5 3.2 0.0 0.0 0.0 0.0 0.0
10 21.1 13.2 26.3 21.1 18.4 61.3 29.0 29.0 6.5 6.5 0.0 0.0 0.0 0.0 0.0
20 55.3 34.2 39.5 34.2 31.6 80.6 54.8 51.6 19.4 16.1 50.0 50.0 50.0 0.0 0.0
25 63.2 54.7 44.7 41.8 39.5 93.5 67.7 58.1 32.3 19.4 50.0 50.0 50.0 0.0 0.0
50 92.1 78.9 68.4 52.6 55.3 100.0 93.5 90.3 77.4 54.8 100.0 100.0 50.0 0.0 0.0
WANN
5 7.9 5.3 2.6 10.5 0.0 35.5 16.1 16.1 9.7 12.9 50.0 0.0 0.0 0.0 100.0
10 28.9 15.8 7.9 15.8 5.3 58.1 41.9 35.5 25.8 12.9 100.0 100.0 0.0 0.0 100.0
20 34.2 34.2 23.7 31.6 15.8 71.0 61.3 58.1 48.4 25.8 100.0 100.0 50.0 100.0 100.0
25 52.6 44.7 31.6 42.1 23.7 77.4 64.5 71.0 58.1 35.5 100.0 100.0 100.0 100.0 100.0
50 86.8 68.4 47.4 63.2 34.2 100.0 90.3 90.3 83.9 90.3 100.0 100.0 100.0 100.0 100.0
ANN
5 18.4 7.9 7.9 10.5 5.3 25.8 16.1 9.7 6.5 9.7 50.0 0.0 0.0 0.0 0.0
10 23.7 23.7 18.4 13.2 13.2 58.1 38.7 22.6 12.9 12.9 50.0 0.0 0.0 0.0 0.0
20 63.2 44.7 36.8 26.3 23.7 77.4 58.1 45.2 25.8 22.6 50.0 50.0 50.0 0.0 0.0
25 71.1 47.4 57.9 31.6 26.3 83.9 64.5 54.8 35.5 29.0 50.0 50.0 50.0 0.0 0.0
50 90.7 74.9 78.9 57.9 55.3 100.0 93.5 93.5 87.1 61.3 100.0 100.0 50.0 0.0 0.0
515 M. K. Tiwari & C. Chatterjee 9999 A new wavelet–bootstrap–ANN hybrid model Journal of Hydroinformatics 9999 9999 201113.3
Naraj station is selected as the dependent variable and the
input variables as of ANNs are selected as independent
variables. The best-fit model is estimated and selected
based on the highest coefficient of determination (R2). All
of the MLR models are first trained (to determine the
regression coefficients) using the data in the training set
(2000–2005) and then tested using the testing dataset
(2006). The SPSS software package (version 10, SPSS
Inc., Chicago, IL) is used for regression calculations. The
performance comparison of the ANN, WANN, BANN,
WBANN and MLR models is carried out with a simple
naive model in terms of persistence index, PERS. It is
obvious from Table 8 that the MLR model is better com-
pared to a simple naive model only for 1-day lead time
forecasts. All four ANN models (i.e. ANN, WANN, BANN
and WBANN) perform better compared to the MLR model
for 1–5-day lead time forecasts. It is clear that the PERS of
the WANN and WBANN models are greater than 0 and
therefore both the models are better compared to a simple
naive model for 1–5-day lead time forecasts. Performance
of ANN and BANN are better up to 4-day lead forecasts,
and negative PERS for 5-day lead time forecast indicate
that it is better to implement a simple naive model instead
of ANN and BANN. The performance of the WBANN
model is better than all the remaining models for all the
lead times forecast; these findings again show the strength
and capability of WBANN modeling for daily discharge
forecasting. Negative values of PERS for different models
show that for these lead times the models are not able to
extract any linear or nonlinear relationship and it is better
to implement a simple naive model. It is clear from these
findings that WBANN model performance is very good up
to 5-day lead time forecasts.
CONCLUSIONS
The aim of this paper is to introduce a conjunction model
WBANN based on wavelet and bootstrap techniques for
daily discharge forecasting. The decomposed time series of
the observed data present different periodic components.
Each of the wavelet components makes a distinct contribu-
tion to the original time series. The significant wavelet
components are selected based on the correlation between
the particular component series and the observed discharge
at Naraj and a new series is composed by adding
the significant components for each time series. These
series are employed as inputs to constitute the WANN
model for daily discharge forecasting. Use of wavelet com-
ponents supports the recent studies about the physics
involved in the ANNs (Jain et al. 2004; Sudheer & Jain
2004; Sudheer 2005). The significant inputs to the ANN
model are selected using the cross-correlation statistics.
The bootstrap resampling method is used to generate
different realizations of the datasets to create a set of
bootstrap samples that provide a better understanding of
the average and variability of the original unknown dis-
tribution or process. Two hundred bootstrapped ANN
models are generated to reduce the uncertainty involved
in ANN predictions. The inclusion of the DWCs in the
ANN input layer enabled consideration of the effective
periodic components in the discharge forecasting whereas
bootstrapping techniques are used to reduce the uncer-
tainty inherent in the conventional ANN modeling.
WBANN models are developed to incorporate the indivi-
dual capabilities of wavelet and bootstrap techniques. The
WBANN model is found to be superior to the traditional
ANN, BANN and WANN in terms of the selected perfor-
mance criteria for 1–5-day lead forecasts. The peak dis-
charge forecasts obtained by the WANN model are
noticeably closer to the observations compared to all
three remaining models. ARE and TS statistics show that
performance of the WBANN and WANN models for
high flow forecasts are better than ANN and BANN, the
performance of the WBANN model for medium flow fore-
casts is significantly better compared to the remaining three
models, whereas all four models perform more or less
similar during low discharge forecasts. This study shows
that the WBANN method is quite an appropriate tool for
Table 8 9999 Persistence index (PERS) for three different models for 1–5-day
lead time forecasts
Lead time (d) ANN WANN BANN WBANN MLR
1 0.40 0.58 0.50 0.63 0.3786
2 0.25 0.48 0.21 0.69 �0.0185
3 0.21 0.43 0.14 0.70 �0.2864
4 0.03 0.47 0.01 0.49 �0.4608
5 �0.02 0.17 �0.07 0.59 �0.3678
516 M. K. Tiwari & C. Chatterjee 9999 A new wavelet–bootstrap–ANN hybrid model Journal of Hydroinformatics 9999 9999 201113.3
modeling the discharge for short as well as long lead time
forecasts.
REFERENCES
Abrahart, R. J. 2003 Neural network rainfall-runoff forecasting based oncontinuous resampling. J. Hydroinf. 5(1), 51–61.
Adamowski, J. F. 2008a River flow forecasting using wavelet and cross-wavelet transform models. Hydrol. Processes 22(25), 4877–4891.
Adamowski, J. F. 2008b Development of a short-term river floodforecasting method for snowmelt driven floods based on waveletand cross-wavelet analysis. J. Hydrol. 353(3–4), 247–266.
Aksoy, H., Guven, A., Aytek, A., Yuce, M. I. & Unal, N. E. 2007Discussion of generalized regression neural networks forevapotranspiration modelling by O Kisi (2006). Hydrol. Sci. J.52(4), 825–828.
Altunkaynak, A. 2007 Forecasting surface water level fluctuations ofLake Van by artificial neural networks. Wat. Res. Mngmnt. 21(2),399–408.
Anctil, F & Lauzon, N. 2004 Generalisation for neural networksthrough data sampling and training procedures, with applicationsto streamflow predictions. Hydrol. Earth Syst. Sci. 8(5), 940–958.
Anctil, F. & Tape, D. G. 2004 An exploration of artificial neuralnetwork rainfall-runoff forecasting combined with waveletdecomposition. J. Environ. Engng. Sci. 3(1), 121–128.
ASCE 1993 Criteria for evaluation of watershed models. J. Irrig. Drain. E119(3), 429–442.
ASCE Task Committee on Application of Artificial Neural Networks inHydrology 2000a Artificial neural networks in hydrology I:preliminary concepts. J. Hydrol. Engng. 5(2), 115–123.
ASCE Task Committee on Application of Artificial Neural Networks inHydrology 2000b Artificial neural networks in hydrology II:hydrologic applications. J. Hydrol. Engng. 5(2), 124–137.
Aussem, A., Campbell, J. & Murtagh, F. 1998 Wavelet-based featureextraction and decomposition strategies for financial forecasting.J. Comput. Intel. Finance 6(2), 5–12.
Baratti, R., Cannas, B., Fanni, A., Pintus, M., Sechi, G.M. & Toreno, N.2003 River flow forecast for reservoir management through neuralnetworks. NeuroComputing 55(3–4), 421–437.
Barreto, H. & Howland, F.M. 2006 Introductory Econometrics: UsingMonte Carlo Simulation with Microsoft Excel. CambridgeUniversity Press, Cambridge.
Bates, B.C. & Townley L.R. 1988 Nonlinear, discrete flood eventmodels. 3: Analysis of prediction uncertainty. J. Hydrol. 99(1�2),91–101.
Beven, K. 2006 A manifesto for the equifinality thesis. J. Hydrol.320(1�2), 18–36.
Bishop, C.M., 1995 Neural Networks for Pattern Recognition.Clarendon, Oxford.
Boucher, M.A., Perreault, L. & Anctil, F. 2009 Tools for the assessmentof hydrological ensemble forecasts obtained by neural networks.J. Hydroinf. 11(3–4), 297–307.
Campolo, M., Andreussi, P. & Soldati, A. 1999 River flood forecastingwith a neural network model. Wat. Res. Res. 35(4), 1191–1197.
Cannas, B., Fanni, A., See, L. & Sias, G. 2006 Data preprocessing forriver flow forecasting using neural networks: wavelet transformsand data partitioning. Phys. Chem. Earth 31(18), 1164–1171.
Cannon, A. J. & Whitfield, P. H. 2002 Downscaling recent streamflowconditions in British Columbia, Canada using ensemble neuralnetwork models. J. Hydrol. 259(1-4), 136–151.
Cigizoglu, H. K. 2003a Estimation, forecasting and extrapolation offlow data by artificial neural networks. Hydrol. Sci. J. 48(3),349–361.
Cigizoglu, H. K. 2003b Incorporation of ARMA models into flowforecasting by artificial neural networks. Environmetrics 14(4),417–427.
Coulibaly, P. & Evora, N. D. 2007 Comparison of neural networkmethods for infilling missing daily weather records. J. Hydrol.341(1�2), 27–41.
Daubechies, I. 1990 The wavelet transform, time-frequency localizationand signal analysis. IEEE Trans. Inf. Theory 36(5), 6–7.
Dawson, C. W., Abrahart, R. J. & See, L. M. 2007 HydroTest: aweb-based toolbox of evaluation metrics for the standardisedassessment of hydrological forecasts. Environ. Modell. Software22(7), 1034–1052.
Demirel, M. C., Venancio, A. & Kahya, E. 2009 Flow forecast bySWAT model and ANN in Pracana basin, Portugal. J. Hydrol.40(7), 467–473.
Dybowski, R. & Roberts, S. J. 2001 Confidence intervals and predictionintervals for feed forward neural networks. In: ClinicalApplications of Artificial Neural Networks (Dybowski, R. & Gant,V.). Cambridge University Press, Cambridge, pp. 298–326.
Efron, B. 1979 Bootstrap methods: another look at the jackknife. Ann.Statist. 7(1), 1–26.
Efron, B. & Tibshirani, R. J. 1993 An Introduction to the Bootstrap.Chapman and Hall, London.
Han, D., Kwong, T. & Li, S. 2007a Uncertainties in real-timeflood forecasting with neural networks. Hydrol. Processes 21(2),223–228.
Haykin, S. 1999 Neural Networks: A Comprehensive Foundation 2ndedn. Prentice Hall, Englewood Cliffs, NJ.
Hsu, K., Gupta, H. V. & Sorooshian, S. 1995 Artificial neural networkmodeling of the rainfall-runoff process. Wat. Res. Res. 31(10),2517–2530.
Jacquin, A. P. & Shamseldin, A. Y. 2009 Review of the application offuzzy inference systems in river flow forecasting. J. Hydroinf.11(3–4), 202–210.
Jain, A. & Srinivasulu, S. 2004 Development of effective and efficientrainfall–runoff models using integration of deterministic, real-coded genetic algorithms and artificial neural network techniques.Wat. Res. Res. 40, W04302.
Jain, A., Sudheer, K. P. & Srinivasulu, S. 2004 Identification of physicalprocesses inherent in artificial neural network rainfall runoffmodels. Hydrol. Processes 18(3), 571–581.
Jain, S. K., Das, D. & Srivastava, D. K. 1999 Application of ANN forreservoir inflow prediction and operation. J. Wat. Res. Plann.Mngmnt. 125(5), 263–271.
Jeong, D. & Kim, Y. O. 2005 Rainfall-runoff models using artificialneural networks for ensemble streamflow prediction. Hydrol.Processes 19(19), 3819–3835.
517 M. K. Tiwari & C. Chatterjee 9999 A new wavelet–bootstrap–ANN hybrid model Journal of Hydroinformatics 9999 9999 201113.3
Jia, Y. & Culver, T. B. 2006 Bootstrapped artificial neural networks forsynthetic flow generation with a small data sample. J. Hydrol.331(3-4), 580–590.
Kim, T. W. & Valdes, J. B. 2003 Nonlinear model for droughtforecasting based on a conjunction of wavelet transforms andneural networks. J. Hydrol. Engng. 8(6), 319–328.
Kisi, O. 2008 Stream flow forecasting using neuro-wavelet technique.Hydrol. Processes 22(20), 4142–4152.
Kisi, O. 2009 Neural networks and wavelet conjunction model forintermittent streamflow forecasting. J. Hydrol. Engng. 14(8), 773–782.
Kitanidis, P. K. & Bras, R. L. 1980 Real-time forecasting with aconceptual hydrologic model: 2. Applications and results. Wat.Res. Res. 16(6), 1034–1044.
Koutsoyiannis, D. 2007 Discussion of ‘‘generalized regression neuralnetworks for evapotranspiration modelling’’ by O Kisi (2006).Hydrol. Sci. J. 52(4), 832–835.
Krause, P., Boyle, D. P. & Base, F. 2005 Comparison of differentefficiency criteria for hydrological model assessment. In: Proc. 8thWorkshop for Large Scale Hydrological Modelling, Oppurg 2004(Krause, P., Bongartz, K. & Flugel, W.-A. (Eds.)). Adv. Geosci. 5,89–97.
Kucuk, M. & Agiralloglu, N. 2006 Regression technique for stream flowprediction. J. Appl. Stat. 33(9), 943–960.
Lall, U. & Sharma, A. 1996 A nearest neighbor bootstrap for resamplinghydrologic time series. Wat. Res. Res. 32(3), 679–693.
Leclerca, M. & Ouarda, T. B. M. J. 2007 Non-stationary regional floodfrequency analysis at ungauged sites. J. Hydrol. 343(3-4), 254–265.
Legates, D. R. & McCabe, G. J. 1999 Evaluating the use the goodness-of-fit measure in hydrologic and hydroclimatic model validation.Wat. Res. Res. 35(1), 233–241.
Loague, L. & VanderKwaak, J. E. 2004 Physics-based hydrologicresponse simulation: platinum bridge, 1958 Edsel, or useful tool.Hydrol. Processes 18(15), 2949–2956.
Maier, H. R. & Dandy, G. C. 2000 Neural networks for the predictionand forecasting of water resources variables: a review of modellingissues and applications. Environ. Modell. Software 15(1),101–124.
Mallat, S. G. 1989 A theory for multi resolution signal decomposition:the wavelet representation. IEEE Trans. Pattern Anal. MachineIntel. 11(7), 674–693.
Mukerji, A., Chatterjee, C. & Raghuwanshi, N. S. 2009 Floodforecasting using ann, neuro-fuzzy, and neuro-GA models.J. Hydrol. Engng. 14(6), 647–652.
Muller, B. & Reinhardt, J. 1991 Neural Networks – An Introduction.Springer-Verlag, Berlin.
Nash, J. E. & Sutcliffe, J. V. 1970 River flow forcasting throughconceptual models. I. J. Hydrol. 10(3), 282–290.
Nason, G. P. & Von Sachs, R. 1999 Wavelets in time series analysis.Phil. Trans. R. Soc. Ser. A 357, 2511–2526.
Nayak, P. C., Sudheer, K. P. & Ramasastri, K. S. 2005a Fuzzy computingbased rainfall–runoff model for real time flood forecasting. Hydrol.Processes 19(4), 955–968.
Nayak, P. C., Sudheer, K. P., Rangan, D. M. & Ramasastri, K. S. 2004 Aneuro-fuzzy computing technique for modeling hydrological timeseries. J. Hydrol. 291(1–2), 52–66.
Nayak, P. C., Sudheer, K. P., Rangan, D. M. & Ramasastri, K. S. 2005bShort-term flood forecasting with a neurofuzzy model. Wat. Res.Res. 41(4), W04004.
NERC 1975 Flood Studies Report. Hydrological Studies vol. 1. NaturalEnvironment Research Council, London.
Ning, M., & Yunping, C. 1998 An ANN and wavelet transformationbased method for short term load forecast. Energy managementand power delivery. Int. Conf. 2, 405–410.
Nourani, V, Alami, M.T. & Aminfar, M.H. 2008 A combined neural-wavelet model for prediction of watershed precipitation,Ligvanchai, Iran. J. Environ. Hydrol. 16(2), 1–12.
Nourani, V., Komasi, M. & Mano, A. 2009 A multivariate ANN-waveletapproach for rainfall–runoff modeling. Wat. Res. Mngmnt. 23(14),2877-2894.
Partal, T. 2009 River flow forecasting using different artificial neuralnetwork algorithms and wavelet transform. Can. J. Civil Engng.36(1), 26–38.
Partal, T. & Cigizoglu, H. K. 2008 Estimation and forecasting of dailysuspended sediment data using wavelet–neural networks.J. Hydrol. 358(3-4), 317–331.
Partal, T. & Cigizoglu, H. K. 2009 Prediction of daily precipitation usingwavelet–neural networks. Hydrol. Sci. J. 54(2), 234–246.
Partal, T. & Kucuk, M. 2006 Long-term trend analysis using discretewavelet components of annual precipitations measurements inMarmara region (Turkey). Phys. Chem. Earth 31(18), 1189–1200.
Pramanik, N. & Panda, R. K. 2009 Application of neural network andadaptive neuro-fuzzy inference systems for river flow prediction.Hydrol. Sci. J. 54(2), 247–260.
Rao, A. R., Hamed, K. H. & Chen, H. L. 2003 Nonstationarities inHydrologic and Environmental Time Series. Kluwer, Dordrecht.
Rosso, O. A., Figliola, A., Blanco, S. & Jacovkis, P. M. 2004 Signalseparation with almost periodic components: a wavelets basedmethod. Revista Mexicana de Fisica 50(2), 179–186.
Sahoo, G. B. & Ray, C. 2006 Flow forecasting for a Hawaii stream usingrating curves and neural networks. J. Hydrol. 317(1), 63–80.
Sahoo, G. B., Schladow, S. G. & Reuter, J. E. 2009 Forecasting streamwater temperature using regression analysis, artificial neuralnetwork, and chaotic non-linear dynamic models. J. Hydrol.378(3-4), 325–342.
Satyajirao, Y. R. & Krishna, B. 2009 Modelling hydrological timeseries data using wavelet neural network analysis. IAHS Publ.333, 101–111.
Sharma, A., Tarboton, D. G. & Lall. U. 1997 Streamflow simulation: anonparametric approach. Wat. Res. Res. 33(3), 291–308.
Sharma, S. K. & Tiwari, K. N. 2009 Bootstrap based artificial neuralnetwork (BANN) analysis for hierarchical prediction of monthlyrunoff in Upper Damodar Valley Catchment. J. Hydrol. 374(3-4),209–222.
Solomatine, D. P. & Ostfeld, A. 2008 Data-driven modelling: some pastexperiences and new approaches. J. Hydroinf. 10(1), 3–22.
Srivastav, R. K., Sudheer, K. P. & Chaubey, I. 2007 A simplifiedapproach to quantifying predictive and parametric uncertainty inartificial neural network hydrologic models. Wat. Res. Res. 43(10),W10407.
Sudheer, K. P. 2005 Knowledge extraction from trained neural networkriver flow models. J. Hydrol. Engng. 10(4), 264–269.
518 M. K. Tiwari & C. Chatterjee 9999 A new wavelet–bootstrap–ANN hybrid model Journal of Hydroinformatics 9999 9999 201113.3
Sudheer, K. P., Gosain, A.K. & Ramasastri, K.S. 2002 A data-drivenalgorithm for constructing artificial neural network rainfall-runoffmodels. Hydrol. Processes 16(6), 1325–1330.
Sudheer, K. P. & Jain, A. 2004 Explaining the internal behaviour ofartificial neural network river flow models. Hydrol. Processes18(4), 833–844.
Sudheer, K. P. & Jain, S.K. 2003 Radial basis function neural networkfor modeling rating curves. J. Hydrol. Engng. 8(3), 161–164.
Tasker, G. D. & Dunne, P. 1997 Bootstrap position analysis forforecasting low flow frequency. J. Wat. Res. Plann. Mngmnt.123(6), 359–367.
Tiwari, M. K. & Chatterjee, C. 2010 Uncertainty assessment andensemble flood forecasting using bootstrap based artificial neuralnetworks (BANNs). J. Hydrol. 382(1–4), 20–33.
Twomey, J. M. & Smith, A. E. 1998 Bias and variance of validationmethods for function approximation neural networks underconditions of sparse data. IEEE Trans. Syst. Man Cybernet. Part C:Appl. Rev. 28(3), 417–430.
Uhlenbrook, S., Seibert, J., Leibundgut, C. & Rodhe, A. 1999 Predictionuncertainty of conceptual rainfall-runoff models caused byproblems to identify model parameters and structure. Hydrol. Sci.J. 44(5), 779–798.
Wang, D. & Ding, J. 2003 Wavelet network model and its application tothe prediction of hydrology. Nat. Sci. 1(1), 67–71.
Wang, J. & Meng, J. 2007 Research on runoff variations based onwavelet analysis and wavelet neural network model: a case study
of the Heihe River drainage basin (1944-2005). J. Geog. Sci. 17(3),327–338.
Wang, W., Jin, J. & Li, Y. 2009 Prediction of inflow at Three Gorgesdam in Yangtze River with wavelet network model. Wat. Res.Mngmnt. 23(13), 2791–2803.
Wang, W., Vrijling, J. K., Van Gelder, P. H. A. J. M. & Ma, J. 2006Testing for nonlinearity of streamflow processes at differenttimescales. J. Hydrol. 322(1–4), 247–268.
Wu, C. L. & Chau, K. W. 2006 A flood forecasting neural networkmodel with genetic algorithm. Int. J. Environ. Pollut. 28(3–4),261–273.
Wu, C. L., Chau, K. W. & Li, Y. S. 2009 Methods to improve neuralnetwork performance in daily flows prediction. J. Hydrol. 372(1-4), 80–93.
Xingang, D., Ping, W. & Jifan, C. 2003 Multiscale characteristics of therainy season rainfall and interdecadal decaying of summermonsoon in North China. Chin. Sci. Bull. 48(24), 2730– 2734.
Yueqing, X., Shuangcheng, L. & Yunlong, C. 2004 Wavelet analysis ofrainfall variation in the Hebei Plain. Sci. China Ser. D. 48(12),2241–2250.
Zhang, B. L. & Dong, Z. Y. 2001 An adaptive neural-wavelet modelfor short term load forecasting. Elec. Power Syst. Res. 59(2),121–129.
Zhou, H. C., Peng, Y. & Liang, G.-H. 2008 The research of monthlydischarge predictor-corrector model based on waveletdecomposition. Wat. Res. Mngmnt. 22(1), 217–227.
First received 1 July 2009; accepted in revised form 1 February 2010. Available online 1 October 2010
519 M. K. Tiwari & C. Chatterjee 9999 A new wavelet–bootstrap–ANN hybrid model Journal of Hydroinformatics 9999 9999 201113.3